WO2001098454A2 - Human dna sequences - Google Patents

Info

Publication number
WO2001098454A2
Authority
WO
WIPO (PCT)
Prior art keywords
amy2
tes3
protein
nucleic acid
variants
Prior art date
Application number
PCT/IB2001/002050
Other languages
French (fr)
Other versions
WO2001098454A8 (en
Inventor
Stefan Wiemann
Original Assignee
German Human Genome Project
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by German Human Genome Project
Priority to AU2001295841A (AU2001295841A1)
Publication of WO2001098454A2
Publication of WO2001098454A8

Classifications

    • C CHEMISTRY; METALLURGY
    • C07 ORGANIC CHEMISTRY
    • C07K PEPTIDES
    • C07K14/00 Peptides having more than 20 amino acids; Gastrins; Somatostatins; Melanotropins; Derivatives thereof
    • C07K14/435 Peptides having more than 20 amino acids; Gastrins; Somatostatins; Melanotropins; Derivatives thereof from animals; from humans
    • C07K14/46 Peptides having more than 20 amino acids; Gastrins; Somatostatins; Melanotropins; Derivatives thereof from animals; from humans from vertebrates
    • C07K14/47 Peptides having more than 20 amino acids; Gastrins; Somatostatins; Melanotropins; Derivatives thereof from animals; from humans from vertebrates from mammals

Landscapes

  • Chemical & Material Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Organic Chemistry (AREA)
  • Biochemistry (AREA)
  • Gastroenterology & Hepatology (AREA)
  • Zoology (AREA)
  • Biophysics (AREA)
  • General Health & Medical Sciences (AREA)
  • Genetics & Genomics (AREA)
  • Medicinal Chemistry (AREA)
  • Molecular Biology (AREA)
  • Proteomics, Peptides & Aminoacids (AREA)
  • Toxicology (AREA)
  • Peptides Or Proteins (AREA)

Abstract

Novel human cDNA sequences, the encoded protein sequences, antibodies and variants thereof are provided. The disclosed sequences find application in a number of ways, including use in profiling assays. In this regard, various assemblages of nucleic acids or proteins are provided that are useful in providing large arrays of human material for implementing large-scale screening strategies. The disclosed sequences may also be used in formulating medicaments, treating various disorders and in certain diagnostic applications.

Description

HUMAN DNA SEQUENCES
Background of the Invention
Current methods for testing pharmacological substances rely on a three-stage testing approach to drug development. First, candidate compounds are typically screened in some sort of in vitro system, like inhibition of cancer cell growth. Candidates are then tested in an animal model, as a first approximation of systemic effects, including efficacy and toxicity. Compounds that still show promise after these initial in vivo screens are finally tested in humans. Again, human testing typically occurs in three phases: toxicity, preliminary efficacy, and efficacy. The entire process can take more than a decade and cost hundreds of millions of dollars. Aside from the monetary costs and protracted time scale, moreover, current testing regimes waste the lives of countless laboratory animals and needlessly endanger the lives of human subjects.
A need exists, therefore, for more sophisticated drug screening techniques that can be done rapidly in vitro. These screening techniques ideally will be reflective of systemic and/or organ-specific responses, so that they provide a reliable indicator of action in a human body. Current techniques, however, tend to utilize only a single or limited number of markers, thus answering only very simple questions that are of questionable medical import. For example, a typical in vitro assay may ask whether a lead compound binds a particular receptor, which has been implicated in a certain disorder. It is presumed that such binding is indicative of therapeutic usefulness, but it does not even purport to address systemic effects.
Not only are screening techniques for efficacy inadequate, the available toxicity screens likewise are inadequate. Toxicity, on a first level, is usually measured by animal testing. Aside from the complications related to in vivo versus in vitro testing, such screens are insufficient because of differences in metabolism, uptake, etc., relative to humans. Thus improved methods would not only be in vitro-based, they would also be more "human."
With the increasing miniaturization of screening assays and the growing availability of targets for pharmaceutical intervention, there is increasing interest in developing arrays containing large numbers of these targets that can be assayed simultaneously. If such an array contains a large enough population of targets, it can be used to essentially mimic the systemic response. In other words, the array becomes an in vitro surrogate for the human body. The more refined the array, the more accurate the predictive capability. In theory, an array could be constructed that can detect all of the known human expression products simultaneously, thereby providing a very reliable indicator of the human response to a given compound. These arrays offer advantages over the present in vitro screening systems in that they can assay large numbers of responses simultaneously. They are superior to animal testing because they are more "human" and, thus, more predictive of human responses. In order to construct such arrays, however, the field is in need of further human targets. Advantageously, such targets will be provided with additional physiologically relevant information, such as whether the target is expressed in a particular tissue and whether it is related to a known functional class of targets. In this way, the artisan can focus as needed, for example, on tissue-specific effects or target class-specific effects, thereby providing information useful in evaluating efficacy and/or toxicity.
In addition to a need for pharmacological screening targets, there is a need for further pharmacological substances. These substances can be used in the formulation of medicinal compositions and in treating a wide variety of disorders.
The present invention responds to the aforementioned and other needs in the field by providing a population of novel targets useful, inter alia, in the profiling and medicinal contexts described above.
Summary of the Invention
It is an object of the invention, therefore, to provide a set of human cDNA clones. Further to this object, the invention provides sequences of human cDNA clones that were isolated from libraries generated from different human tissues. It is another object of the invention to provide assemblages of targets useful in profiling matrices for screening pharmacological test compounds. According to this object, assemblages comprising different populations of human nucleic acids, proteins and antibodies are provided. In different embodiments, cDNA library-specific assemblages and target-family-specific targets are provided. It is a further object of the invention to provide a database of human nucleotide and protein sequences. Further to this object, novel human nucleotide and protein sequences are provided in electronic form. In one embodiment, one or more of these sequences is provided in a searchable database.
It is still another object of the invention to provide biologically active target molecules useful in treating or detecting human disorders. Further to this object, the invention provides nucleic acid and protein molecules that have the capacity to affect disease etiology or symptoms or correlate with known disease states. Also further to this object, a database is provided which comprises the disclosed molecules in electronic form.
Detailed Description
The invention results from a need in the art for new human nucleic acids and proteins. This need arises in several contexts. First, there is a need to identify targets for therapeutic intervention. Second, there is a need to identify molecules that may be adversely affected in a therapeutic context, thereby resulting in toxicity. Knowledge of these molecules will aid in the design of new medicaments with enhanced efficacy and decreased toxicity. Finally, the need encompasses human nucleic acids and proteins that have medicinal applicability in their own right. In view of these needs, the present inventors set out to isolate and sequence human cDNAs from tissue-specific libraries. In this way, the isolated cDNAs represent subsets of molecules likely to be targets for therapeutic intervention or for avoiding toxicity. In addition, the inventors divided the molecules into various sub-categories, based on suspected functionality, structural similarity, etc., which are of interest from a pharmacological perspective.
GENERAL DESCRIPTION OF THE INVENTIVE MOLECULES
The present invention provides novel polynucleotide molecules that, in some instances, have similarities with known molecules. The inventive DNAs were cloned from five different human cDNA libraries. In addition to these DNA molecules, the invention provides their protein translations and antibodies derived from them. The inventive DNA and protein sequences are shown individually in the Description of the Sequences. The inventive nucleic acids also include the complements of the DNA sequences provided in the Description of the Sequences as well as their RNA counterparts. Methods of producing the molecules also are provided. Further, the invention provides methods for detecting all or part of the molecules and of detecting polynucleotides encoding all or part of the molecules.
The inventive molecules derive from five cDNA libraries: human fetal brain, human fetal kidney, human melanoma, human testis, and human amygdala. For convenience, each sequence bears a designation that indicates from which library it is derived. In particular, these designations are: "hfbr" for human fetal brain, "hfkd" for human fetal kidney, "hmel" for human melanoma, "htes" for human testis, and "hamy" for human amygdala. The individual libraries were constructed and screened as described below in the examples. The protein and DNA molecules of the invention are variously described herein as "target" molecules or "inventive" molecules. The sequences and other information pertinent to the nucleic acid and protein molecules of the invention are shown below in the Description of the Sequences.
Description of the Sequences
Key to the Description of the Sequences
The descriptions below provide the coding sequences of the inventive cDNAs, as well as the protein sequences and other useful information, as set out herein.
Grouping
The clones were assigned to the following sixteen functional and/or tissue-derived groups:
1. Amygdala derived
2. Cell Cycle
3. Cell Structure and Motility
4. Differentiation/Development
5. Intracellular Transport and Trafficking
6. Melanoma derived
7. Metabolism
8. Nucleic Acid Management
9. Signal Transduction
10. Transmembrane Protein
11. Transcription Factors
12. Brain derived
13. Kidney derived
14. Mammary Carcinoma derived
15. Testes derived
16. Uterus derived
Description of Clone Files
The individual clone files are structured in the same pattern. The sections are separated by paragraphs.
1. Clone Name
The clone names are deciphered with reference to the following example:
DKFZphfkd2_3k15, wherein the code represents:
• producer of library ("DKFZ") (for convenience, this reference may be eliminated)
• a "p" for "plasmid cDNA library" (for convenience, this reference may be eliminated)
• library name (e.g. hfbr = human fetal brain, hfkd = human fetal kidney, hmel = human melanoma, htes = human testis, hamy = human amygdala)
• an underscore ("_") to separate library information from plate information
• plate number (e.g. "3")
• plate coordinates (letter first, e.g. "k15")
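The naming scheme listed above can be sketched as a small parser. This is only an illustration: the regular expression, the `CloneName` type and the function name are assumptions made here, not part of the patent text. In particular, the sketch assumes that library tokens start with "h" (as all five listed names do) and that a digit directly before the underscore, as in "hfkd2", belongs to the library information.

```python
import re
from typing import NamedTuple, Optional


class CloneName(NamedTuple):
    producer: Optional[str]  # "DKFZ", or None when the prefix was dropped
    library: str             # library token, e.g. "hfkd2" (assumed to include the digit)
    plate: int               # plate number
    coordinates: str         # plate coordinates, letter first, e.g. "k15"


# Hypothetical pattern: optional "DKFZ", optional "p", library token,
# underscore, plate number, then one letter followed by digits.
_CLONE_RE = re.compile(r"^(DKFZ)?p?(h[a-z]+\d*)_(\d+)([a-z]\d+)$")


def parse_clone_name(name: str) -> CloneName:
    """Split a clone name such as 'DKFZphfkd2_3k15' into its parts."""
    match = _CLONE_RE.match(name)
    if match is None:
        raise ValueError(f"not a recognised clone name: {name!r}")
    producer, library, plate, coordinates = match.groups()
    return CloneName(producer, library, int(plate), coordinates)


if __name__ == "__main__":
    print(parse_clone_name("DKFZphfkd2_3k15"))
    print(parse_clone_name("hamy2_10h17"))
```

Because the plate number is matched greedily and the coordinates must begin with a letter, a name such as "DKFZphamy2_10h17" splits into plate 10 and coordinates "h17".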
2. Group
3. Introduction
short review of the similarities, function of the protein and possible applications
4. Short Information
specifications about the cDNA (who sequenced, completeness of the cDNA, similarity, chromosomal localisation, length of cDNA, localisation of poly A tail and polyadenylation signal)
5. cDNA-Sequence
6. BLASTn Results
search results of blasting the cDNA sequence against all public databases
7. Medline Entries
information about genes/proteins similar to the novel cDNA (if available)
8. Putative Encoded Protein Information
specifications about the encoded protein (ORF: length and localisation of the reading frame)
9. Protein Sequence
10. BLASTp Results
search results of blasting the protein sequence against all public databases
11. Pedant Information
output of fully automated annotation: summarises peptide information, homologies, patterns as follows:
[LENGTH] - length of the protein = number of amino acid residues
[MW] - molecular weight of the protein
[pI] - isoelectric point
[HOMOL] - shows protein with closest similarity to the cDNA-encoded protein
[FUNCAT] - functional information according to a catalogue developed by Munich Information Center for Protein Sequences (MIPS)
[BLOCKS] - Blocks are multiply aligned ungapped segments corresponding to the most highly conserved regions of proteins. The blocks for the Blocks Database are made automatically by looking for the most highly conserved regions in groups of proteins documented in the Prosite Database. The Prosite pattern for a protein group is not used in any way to make the Blocks Database and the pattern may or may not be contained in one of the blocks representing a group. These blocks are then calibrated against the SWISS-PROT database to obtain a measure of the chance distribution of matches. It is these calibrated blocks that make up the Blocks Database. The WWW versions of the Prosite and SWISS-PROT Databases that are used on this server are located at the ExPASy World Wide Web (WWW) Molecular Biology Server of the Geneva University Hospital and the University of Geneva. World Wide Web URL http://blocks.fhcrc.org/blocks/about_blocks.html/ is the entry point to the database.
- here Blocks segments found in the analysed protein sequences are displayed
[SCOP] - Nearly all proteins have structural similarities with other proteins and, in some of these cases, share a common evolutionary origin. The scop database provides a detailed and comprehensive description of the structural and evolutionary relationships between all proteins whose structure is known, including all entries in Brookhaven National Laboratory's Protein Data Bank (PDB). It is available as a set of tightly linked hypertext documents which make the large database comprehensible and accessible. In addition, the hypertext pages offer a panoply of representations of proteins, including links to PDB entries, sequences, references, images and interactive display systems. World Wide Web URL http://scop.mrc-lmb.cam.ac.uk/scop/ is the entry point to the database. Existing automatic sequence and structure comparison tools cannot identify all structural and evolutionary relationships between proteins. The scop classification of proteins has been constructed manually by visual inspection and comparison of structures, but with the assistance of tools to make the task manageable and help provide generality. Proteins are classified to reflect both structural and evolutionary relatedness. Many levels exist in the hierarchy, but the principal levels are family, superfamily and fold. The exact position of boundaries between these levels is to some degree subjective. Scop evolutionary classification is generally conservative: where any doubt about relatedness exists, we made new divisions at the family and superfamily levels.
- here SCOP segments found in the analysed protein sequences are displayed
[EC] - ENZYME is a repository of information relative to the nomenclature of enzymes. It is primarily based on the recommendations of the Nomenclature Committee of the International Union of Biochemistry and Molecular Biology (IUBMB) and it describes each type of characterized enzyme for which an EC (Enzyme Commission) number has been provided. World Wide Web URL http://www.expasy.ch/enzyme/ is the entry point to the database.
- here EC-number and name of enzymes with similarity to the analysed protein sequences are displayed
[PIRKW] - functional information according to the Protein Information Resource (PIR) database catalogue developed by Munich Information Center for Protein Sequences (MIPS), the National Biomedical Research Foundation (NBRF) and the International Protein Information Database in Japan (JIPID).
[SUPFAM] - information according to the Protein Information Resource (PIR) database catalogue of protein superfamilies developed by Munich Information Center for Protein Sequences (MIPS), the National Biomedical Research Foundation (NBRF) and the International Protein Information Database in Japan (JIPID).
[PROSITE] please refer to 12. PROSITE Motifs
[PFAM] please refer to 13. PFAM Motifs
[KW] - overall 2-dimensional folding information
- 3D indicates that the protein is similar to a protein of which a 3-dimensional structure is known
- overall structural information
The last PEDANT-block depicts information about the folding structure of the protein generated by PREDATOR. PREDATOR is a secondary structure prediction program. It takes as input a single protein sequence to be predicted and can optimally use a set of unaligned sequences as additional information to predict the query sequence. The mean prediction accuracy of PREDATOR is (value shown in Figure imgf000010_0001) for a single sequence and 75% for a set of related sequences. PREDATOR does not use multiple sequence alignment. Instead, it relies on careful pairwise local alignments of the sequences in the set with the query sequence to be predicted.
World Wide Web URL http://www.embl-heidelberg.de/argos/predator/predator_info.html is the entry point to the database.
- H = helix, E = extended or sheet, _ = coil, T = transmembrane, B = beta
- x indicates a low-complexity region with repeat-like structure which is omitted in all BLAST searches
12. PROSITE Motifs
PROSITE is a database of protein families and domains. It consists of biologically significant sites, patterns and profiles that help to reliably identify to which known protein family (if any) a new sequence belongs. World Wide Web URL http://www.expasy.ch/prosite/ is the entry point to the database. A description of the prosite consensus patterns is provided herein, after the description of the individual sequences.
13. PFAM Motifs
PFAM (protein families) is a large collection of multiple sequence alignments and hidden Markov models covering many common protein domains. World Wide Web URL http://www.sanger.ac.uk/Pfam/ is the entry point to the database.
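The [LENGTH] and [MW] fields of the Pedant block (item 11) can be approximated from the protein sequence alone. The sketch below is an assumption about how such values are commonly derived, namely by summing average residue masses and adding one water molecule; it is not taken from the PEDANT software, and the table of residue masses holds standard textbook values rather than anything stated in this document. The function names are hypothetical.

```python
# Average residue masses in daltons (standard reference values, not from the source).
RESIDUE_MASS = {
    "G": 57.0519, "A": 71.0788, "S": 87.0782, "P": 97.1167, "V": 99.1326,
    "T": 101.1051, "C": 103.1388, "L": 113.1594, "I": 113.1594, "N": 114.1038,
    "D": 115.0886, "Q": 128.1307, "K": 128.1741, "E": 129.1155, "M": 131.1926,
    "H": 137.1411, "F": 147.1766, "R": 156.1875, "Y": 163.1760, "W": 186.2132,
}
WATER_MASS = 18.01528  # one water is added for the free N- and C-termini


def protein_length(sequence: str) -> int:
    """[LENGTH]: number of amino acid residues."""
    return len(sequence)


def molecular_weight(sequence: str) -> float:
    """[MW]: approximate average molecular weight in daltons."""
    total = WATER_MASS
    for residue in sequence.upper():
        if residue not in RESIDUE_MASS:
            raise ValueError(f"unknown residue: {residue!r}")
        total += RESIDUE_MASS[residue]
    return total


if __name__ == "__main__":
    peptide = "MPALEGAPHT"  # first ten residues of the first clone's protein
    print(protein_length(peptide), round(molecular_weight(peptide), 1))
```

The isoelectric point ([pI]) needs a set of pKa values and an iterative charge calculation, so it is left out of this sketch.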
In the charts below, the groups of sequences are listed, and the description of the individual clones follows.
Group Amygdala derived
Figure imgf000012_0001
Group Brain derived
Figure imgf000013_0001
Group Cell cycle
Figure imgf000014_0001
Group Cell structure and motility
Figure imgf000015_0001
Group Differentiation/Development
Figure imgf000016_0001
Group Intracellular Transport and Trafficking
Figure imgf000017_0001
Group Melanoma derived
Figure imgf000018_0001
Group Metabolism
Figure imgf000019_0001
Group Nucleic acid management
Figure imgf000020_0001
Group Signal transduction
Figure imgf000021_0001
Group Testis derived
Figure imgf000022_0001
Group Transmembrane proteins
Figure imgf000022_0002
Figure imgf000023_0001
Group Transcription factors
Figure imgf000024_0001
DKFZphamy2_10h17
group: signal transduction
DKFZphamy2_10h17 encodes a novel 180 amino acid protein which shows weak similarity to murine hacl.
The novel protein contains a zinc finger motif of the C3HC4 type (RING finger). The RING-finger domain is involved in mediating protein-protein interactions. Proteins containing a RING finger include mammalian V(D)J recombination activating protein (RAG1), mouse rpt-1, human rfp, human 52 kD Ro/SS-A protein and others. The family of RING finger proteins contains a number of oncogenes, for example PML, a probable transcription factor, BRCA1, and the mammalian cbl and bmi-1 proto-oncogenes.
The new protein can find application in modulating protein-protein interaction and in studying the expression profile of amygdala-specific genes.
weak similarity to hacl (Mus musculus)
Sequenced by LMU
Locus: unknown
Insert length: 835 bp
Poly A stretch at pos. 751, polyadenylation signal at pos. 729
1 CACAGAGATC ATTGTCAACC AGGCCTGTGG GGGGGACATG CCTGCCTTGG
51 AAGGGGCACC CCATACCCCG CCACTGCCAC GGCGGCCCCG TAAGGGAAGC
101 TCGGAGCTGG GCTTTCCCCG CGTGGCCCCA GAGGATGAGG TCATTGTGAA
151 TCAGTACGTG ATTCGGCCTG GCCCCTCGGC CTCGGCGGCT TCTTCGGCGG
201 CGGCAGGCGA GCCCCTGGAG TGCCCCACCT GTGGGCACTC CTACAATGTC
251 ACCCAGCGGA GGCCCCGCGT GCTGTCCTGC CTGCACTCTG TGTGTGAGCA
301 GTGCCTGCAG ATTCTCTACG AGTCCTGCCC CAAGTACAAG TTCATCTCCT
351 GCCCCACCTG CCGCCGTGAG ACTGTGCTCT TCACCGACTA CGGCCTGGCC
401 GCGCTGGCTG TCAACACGTC CATCCTGAGC CGCCTGCCGC CTGAGGCGCT
451 GACGGCCCCA TCCGGGGGTC AGTGGGGGGC TGAGCCCGAG GGCAGCTGCT
501 ACCAGACCTT CCGGCAGTAC TGTGGGGCCG CGTGCACCTG CCACGTGCGG
551 AACCCACTGT CCGCCTGCTC CATCATGTAG TAGCGCCTGC CTGCCCGCCA
601 CTGCCCGCTG AGCCTCGCTC GCTGCTTCTT CAGGGACCCG GCCCTGCCCT
651 GCCGCCCGCT GACCCTTCCT TCCCCACCAT GGCTTCCGGC CCCACCCCGA
701 GTGGCATTGT CGCTGCAGCC AACTTTGCCA TTAAAACTCT TTGCCAAAGT
751 TAAAAAAAAA AAAAAAAAAA AAAAAAAAAA AAAAAAAAAA AAAAAAAAAA
801 AAAAAAAAAA AAAAGAAAAA AAAAAAAAAA AAAAG
BLAST Results
No BLAST result
Medline entries
No Medline entry
Peptide information for frame 2
ORF from 38 bp to 577 bp, peptide length: 180
Category: similarity to unknown protein
Classification: Cellular transport and traffic
Prosite motifs: PRENYLATION (177-180) ZINC FINGER C3HC4 (81-90)
1 MPALEGAPHT PPLPRRPRKG SSELGFPRVA PEDEVIVNQY VIRPGPSASA
51 ASSAAAGEPL ECPTCGHSYN VTQRRPRVLS CLHSVCEQCL QILYESCPKY
101 KFISCPTCRR ETVLFTDYGL AALAVNTSIL SRLPPEALTA PSGGQWGAEP
151 EGSCYQTFRQ YCGAACTCHV RNPLSACSIM
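The ORF coordinates, reading frame and peptide length reported for this clone can be cross-checked with simple arithmetic. The sketch below assumes 1-based, inclusive coordinates that exclude the stop codon, and a frame numbering of 1 to 3 derived from the start position; both conventions are assumptions that happen to agree with the values in this record, and the function name is hypothetical.

```python
def orf_summary(start: int, end: int) -> dict:
    """Derive reading frame and peptide length from ORF coordinates.

    Coordinates are assumed to be 1-based and inclusive, with the stop
    codon lying outside the reported range.
    """
    span = end - start + 1
    if span <= 0 or span % 3 != 0:
        raise ValueError("ORF span must be a positive multiple of three")
    return {
        "frame": (start - 1) % 3 + 1,  # frame 1 starts at base 1, frame 2 at base 2, ...
        "peptide_length": span // 3,   # one residue per codon
    }


if __name__ == "__main__":
    # ORF from 38 bp to 577 bp, as reported for DKFZphamy2_10h17
    print(orf_summary(38, 577))
```

For the range 38 to 577 this gives 540 bases, hence 180 codons in frame 2, which matches the reported peptide length and frame.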
BLASTP hits
No BLASTP hits available
Alert BLASTP hits for DKFZphamy2_10h17, frame 2
No Alert BLASTP hits found
Pedant information for DKFZphamy2_10h17, frame 2
Report for DKFZphamy2_10h17.2
[LENGTH] 180
Figure imgf000026_0001
CpIH 7-^5 EHOMOLHI TREIBL : ACDD77E7_7 gene: "FflK7-7πi Arabidopsis thaliana chromosome 1 BAC FAK7 sequence! complete sequence- 3e-Dtι
[[BLOCKS! BLDDflSTC
[BLOCKS! PFD14bEA EBL0CKS3 PRDD7t,3H
[BLOCKS] BL00518 Zinc finger, C3HC4 type, proteins
[PROSITE] PRENYLATION 1
[PROSITE] ZINC_FINGER_C3HC4 1
[PFAM] Zinc finger, C3HC4 type (RING finger)
[KW] Alpha_Beta
[KW] LOW_COMPLEXITY 5.56 %
SEQ MPALEGAPHTPPLPRRPRKGSSELGFPRVAPEDEVIVNQYVIRPGPSASAASSAAAGEPL
SEG xxxxxxxxxx- - - -
PRD cccccccccccccccccccccccccccccccceeeeeeeeeeecccccchhhhhhhcccc
SEQ ECPTCGHSYNVTQRRPRVLSCLHSVCEQCLQILYESCPKYKFISCPTCRRETVLFTDYGL
SEG
PRD cccccccccccccccceeeecchhhhhhhhhhhhhccccceeeecccccceeeeeccccc
SEQ AALAVNTSILSRLPPEALTAPSGGQWGAEPEGSCYQTFRQYCGAACTCHVRNPLSACSIM
SEG
PRD cchhhhhhhhhcccccccccccccccccccccccchhhhhhhcceeeecccccceeeccc
Prosite for DKFZphamy2_10h17.2
PS00294 177->181 PRENYLATION PDOC00266
PS00518 81->91 ZINC_FINGER_C3HC4 PDOC00449
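The two Prosite hits can be reproduced by scanning the frame 2 peptide with regular expressions. This is a hedged sketch: the patterns are written from memory of the PROSITE entries for PS00518 (`C-x-H-x-[LIVMFY]-C-x(2)-C-[LIVMYA]`) and PS00294 (`C-{DENQ}-[LIVM]-x>`) and should be checked against the current PROSITE release; `find_motif` is a hypothetical helper, not part of any annotation tool used for this listing.

```python
import re

# Frame 2 peptide of DKFZphamy2_10h17 (180 residues), as listed above.
PEPTIDE = (
    "MPALEGAPHTPPLPRRPRKGSSELGFPRVAPEDEVIVNQYVIRPGPSASA"
    "ASSAAAGEPLECPTCGHSYNVTQRRPRVLSCLHSVCEQCLQILYESCPKY"
    "KFISCPTCRRETVLFTDYGLAALAVNTSILSRLPPEALTAPSGGQWGAEP"
    "EGSCYQTFRQYCGAACTCHVRNPLSACSIM"
)

# Assumed regex forms of the PROSITE patterns (see the note above).
ZINC_FINGER_C3HC4 = re.compile(r"C.H.[LIVMFY]C..C[LIVMYA]")
PRENYLATION = re.compile(r"C[^DENQ][LIVM].$")  # anchored at the C-terminus


def find_motif(pattern, peptide):
    """Return (start, end, match) with 1-based inclusive coordinates, or None."""
    match = pattern.search(peptide)
    if match is None:
        return None
    return (match.start() + 1, match.end(), match.group())


print(find_motif(ZINC_FINGER_C3HC4, PEPTIDE))
print(find_motif(PRENYLATION, PEPTIDE))
```

The coordinates returned are inclusive, matching the "Prosite motifs" line (81-90 and 177-180); the `81->91` and `177->181` notation in the Prosite report appears to use an exclusive end position.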
Pfam for DKFZphamy2_10h17.2
HMM_NAME Zinc finger, C3HC4 type (RING finger)
HMM
*CPICFcTF<31 DyPI PFdePml 1 PCgHsFCypCIrrlιl C
CP C Y+ +P+ L C+HS C+ C+ ++ C
Query 62 CPTC GHSYNVTQRRPRVLSCLHSVCEQCL-
QILYESCPKYKFISC 105
HMM PmC* P C Query 106 PTC 108
DKFZphamy2_10p7
group: signal transduction
DKFZphamy2_10p7 encodes a novel 1615 amino acid protein with similarity to Na+/Ca2+ exchange proteins. The transport of Ca2+ from the sarcoplasm into the sarcoplasmic reticulum is an essential process in the initiation of muscle relaxation.
In addition, the novel protein contains a PROSITE multicopper oxidase signature. Multicopper oxidases are enzymes that possess three spectroscopically different copper centers.
The new protein can find application in the modulation of Na+/Ca2+ exchange and voltage-dependent processes.
similarity to Na+/Ca2+ exchange proteins
ATG in frame 3 is first in clone. Sequenced by LMU
Locus: unknown
Insert length: 5236 bp
Poly A stretch at pos. 5216, no polyadenylation signal found
1 CGGACGCGTG GGCGGACGCG TGGGCCCTGT ATACCTGTGC CACTTTGTGC
51 CTTAAGGAAC AAGCTTGCTC AGCGTTTTCA TTTTTCAGTG CTTCTGAGGG
101 TCCCCAGTGT TTCTGGATGA CATCATGGAT CAGCCCAGCT GTCAACAATT
151 CAGACTTCTG GACCTACAGG AAAAACATGA CCAGGGTAGC ATCTCTTTTT
201 AGTGGTCAGG CTGTGGCTGG GAGTGACTAT GAGCCTGTGA CAAGGCAATG
251 GGCCATAATG CAGGAAGGTG ATGAATTCGC AAATCTCACA GTGTCTATTC
301 TTCCTGATGA TTTCCCAGAG ATGGATGAGA GTTTTCTAAT TTCTCTCCTT
351 GAAGTTCACC TCATGAACAT TTCAGCCAGT TTGAAAAATC AGCCAACCAT
401 AGGACAGCCA AATATTTCTA CAGTTGTCAT AGCACTAAAT GGTGATGCCT
451 TTGGAGTGTT TGTGATCTAC AGTATTAGTC CCAATACTTC CGAAGATGGC
501 TTATTTGTTG AAGTTCAGGA GCAGCCCCAA ACCTTGGTGG AGCTGATGAT
551 ACACAGGACA GGGGGCAGCT TAGGTCAAGT GGCAGTCGAA TGGCGTGTTG
601 TTGGTGGAAC AGCTACTGAA GGTTTAGATT TTATAGGTGC TGGAGAGATT
651 CTGACCTTTG CTGAAGGTGA AACCAAAAAG ACAGTCATTT TAACCATCTT
701 GGATGACTCT GAACCAGAGG ATGACGAAAG TATCATAGTT AGTTTGGTGT
751 ACACTGAAGG TGGAAGTAGA ATTTTGCCAA GCTCCGACAC TGTTAGAGTG
801 AACATTTTGG CCAATGACAA TGTGGCAGGA ATTGTTAGCT TTCAGACAGC
851 TTCCAGATCT GTCATAGGTC ATGAAGGAGA AATTTTACAA TTCCATGTGA
901 TAAGAACTTT CCCTGGTCGA GGAAATGTTA CTGTTAACTG GAAAATTATT
951 GGGCAAAATC TAGAACTCAA TTTTGCTAAC TTTAGCGGAC AACTTTTCTT
1001 TCCTGAGGGG TCGTTGAATA CAACATTGTT TGTGCATTTG TTGGATGACA
1051 ACATTCCTGA GGAGAAAGAA GTATACCAAG TCATTCTGTA TGATGTCAGG
1101 ACACAAGGAG TTCCACCAGC CGGAATCGCC CTGCTTGATG CTCAAGGATA
1151 TGCAGCTGTC CTCACAGTAG AAGCCAGTGA TGAACCACAT GGAGTTTTAA
1201 ATTTTGCTCT TTCATCAAGA TTTGTGTTAC TACAAGAGGC TAACATAACA
1251 ATTCAGCTTT TCATCAACAG AGAATTTGGA TCTCTAGGAG CTATCAATGT
1301 CACATATACC ACGGTTCCTG GAATGCTGAG TCTGAAGAAC CAAACAGTAG
1351 GAAACCTAGC AGAGCCAGAA GTTGATTTTG TCCCTATCAT TGGCTTTCTG
1401 ATTTTAGAAG AAGGGGAAAC AGCAGCAGCC ATCAACATTA CCATTCTTGA
1451 GGATGATGTA CCAGAGCTAG AAGAATATTT CCTGGTGAAT TTAACTTACG
1501 TTGGACTTAC CATGGCTGCT TCAACTTCAT TTCCTCCCAG ACTAGATTCA
1551 GAAGGTTTGA CTGCACAAGT TATTATTGAT GCCAATGATG GGGCCCGAGG
1601 TGTAATTGAA TGGCAACAAA GCAGGTTTGA AGTAAATGAA ACCCATGGAA
1651 GTTTAACATT GGTAGCCCAG AGGAGCAGAG AACCTCTTGG CCATGTTTCC
1701 TTATTTGTGT ATGCTCAGAA TTTGGAAGCA CAAGTGGGGC TGGATTATAT
1751 CTTCACCCCA ATGATTCTTC ATTTTGCTGA TGGAGAAAGG TATAAAAATG
1801 TCAATATCAT GATTCTTGAT GATGACATTC CAGAAGGAGA TGAAAAATTT
1851 CAGCTGATTT TAACAAATCC TTCTCCTGGA CTAGAGCTAG GGAAAAATAC
1901 AATAGCCTTA ATTATTGTCC TTGCTAATGA TGACGGCCCT GGAGTTCTAT
1951 CATTTAACAA CAGTGAGCAC TTTTTCCTAA GAGAGCCAAC AGCTCTCTAC
2001 GTCCAGGAGA GTGTTGCAGT ATTGTACATT GTTCGGGAAC CTGCACAAGG
2051 ATTGTTTGGA ACAGTGACAG TTCAGTTCAT TGTGACAGAA GTGAATTCCT
2101 CAAATGAATC TAAAGATCTG ACTCCTTCCA AAGGCTATAT TGTTTTAGAA
2151 GAAGGTGTTC GATTCAAGGC CCTACAAATA TCTGCCATAT TAGACACGGA
2201 ACCAGAAATG GATGAGTATT TTGTTTGCAC CTTGTTTAAT CCAACTGGAG
2251 GTGCTAGACT AGGGGTGCAT GTTCAAACCC TGATAACAGT TTTGCAAAAC
2301 CAGGCCCCTT TGGGGCTATT CAGTATCTCT GCAGTTGAAA ATAGAGCCAC
2351 CTCCATAGAC ATCGAAGAAG CCAATAGGAC CGTGTATTTA AATGTATCTC
2401 GAACTAATGG CATTGATTTG GCTGTGAGTG TGCAGTGGGA GACAGTATCT
2451 GAAACAGCCT TTGGCATGAG GGGAATGGAT GTTGTGTTTT CCGTATTTCA
2501 AAGTTTTTTG GATGAATCAG CTTCTGGCTG GTGTTTCTTT ACTTTGGAAA
2551 ATTTAATATA TGGTATAATG TTAAGAAAAT CATCTGTTAC TGTTTACCGA
2601 TGGCAGGGGA TTTTTATTCC AGTTGAGGAT TTAAATATAG AAAATCCTAA
2651 AACTTGTGAG GCCTTTAATA TTGGTTTTTC TCCCTACTTT GTGATTACTC
2701 ATGAAGAAAG AAATGAAGAA AAGCCTTCTC TTAACAGTGT GTTTACATTC
2751 ACATCTGGAT TTAAATTATT CCTGGTACAA ACAATCATTA TTCTGGAAAG
2801 TTCTCAAGTA AGATATTTTA CTTCAGACAG CCAAGATTAT TTAATCATTG
2851 CAAGTCAAAG AGATGATTCC GAATTAACTC AGGTCTTCAG GTGGAATGGA
2901 GGAAGCTTCG TGTTGCATCA AAAACTCCCT GTCCGAGGTG TGCTGACCGT
2951 GGCCTTGTTC AACAAGGGAG GCTCTGTGTT CTTAGCCATT TCCCAGGCTA
3001 ATGCCAGGCT AAACTCCCTT TTATTCAGAT GGTCTGGCAG TGGGTTTATT
3051 AACTTTCAAG AGGTGCCTGT CAGTGGGACA ACAGAAGTTG AGGCTTTGTC
3101 TTCAGCCAAT GATATTTACC TAATATTTGC CAAAAATGTC TTTCTAGGAG
3151 ATCAGAATTC AATTGATATT TTCATCTGGG AGATGGGACA GTCTTCCTTC
3201 AGGTATTTTC AGTCTGTAGA TTTTGCTGCT GTTAACAGAA TCCACTCCTT
3251 CACACCAGCC TCAGGAATAG CCCACATACT TCTTATTGGC CAAGATATGT
3301 CTGCTCTTTA CTGCTGGAAT TCGGAGCGTA ATCAATTCTC TTTTGTTCTG
3351 GAAGTACCTT CTGCTTATGA TGTGGCTTCT GTTACAGTAA AGTCCCTTAA
3401 TTCAAGCAAG AATTTAATAG CTCTAGTGGG AGCTCATTCA CATATATATG
3451 AGCTAGCCTA CATTTCCAGC CATTCTGACT TTATTCCTAG TTCAGGTGAA
3501 CTGATATTTG AACCTGGTGA GAGAGAAGCT ACAATAGCAG TAAATATCCT
3551 TGATGATACA GTTCCAGAAA AAGAAGAATC CTTCAAAGTT CAACTTAAAA
3601 ATCCCAAAGG AGGAGCAGAG ATTGGCATTA ATGATTCTGT AACAATAACC
3651 ATTCTGTCTA ATGATGATGC CTATGGAATT GTTGCATTTG CTCAGAATTC
3701 ATTATATAAG CAAGTGGAAG AAATGGAGCA AGATAGCCTA GTAACCTTGA
3751 ACGTTGAACG CTTAAAAGGA ACATATGGCC GTATAACCAT AGCATGGGAA
3801 GCTGATGGAA GTATTAGTGA TATATTTCCT ACCTCAGGAG TGATTTTATT
3851 TACTGAAGGC CAGGTACTGT CAACAATCAC TCTAACTATT CTTGCTGATA
3901 ATATACCAGA GTTATCAGAG GTTGTGATTG TAACCCTCAC CCGTATCACC
3951 ACAGAAGGGG TTGAGGACTC ATACAAAGGT GCTACTATTG ATCAGGACAG
4001 AAGCAAGTCT GTTATAACAA CTTTGCCCAA TGACTCACCT TTTGGCTTGG
4051 TGGGCTGGCG TGCTGCGTCT GTCTTCATTA GAGTAGCAGA GCCTAAAGAA
4101 AACACCACCA CTCTTCAGTT ACAAATAGCT CGAGATAAAG GACTACTTGG
4151 GGATATTGCC ATTCACTTGA GAGCTCAACC CAATTTCTTA CTGCATGTCG
4201 ATAATCAAGC TACTGAGAAT GAAGATTATG TATTGCAAGA AACAATAATA
4251 ATAATGAAAG AAAACATAAA AGAAGCTCAT GCCGAAGTTT CCATTTTGCC
4301 GGATGACCTT CCTGAATTGG AGGAAGGATT TATTGTCACT ATCACTGAGG
4351 TGAACCTGGT GAACTCTGAC TTCTCTACAG GACAGCCAAG TGTGCGGAGG
4401 CCCGGAATGG AAATAGCTGA GATAATGATA GAAGAAAATG ACGATCCCAG
4451 AGGAATTTTT ATGTTTCATG TTACTAGAGG CGCTGGGGAA GTTATTACTG
4501 CCTATGAGGT GCCTCCACCC TTGAACGTTC TTCAAGTTCC TGTAGTCCGG
4551 CTGGCTGGAA GCTTTGGGGC AGTAAATGTT TATTGGAAAG CATCACCAGA
4601 CAGTGCTGGC CTGGAAGACT TTAAACCATC TCATGGGATT CTTGAATTTG
4651 CAGATAAACA GGTTACTGCA ATGATAGAAA TCACCATAAT TGATGATGCT
4701 GAATTTGAAT TGACAGAGAC GTTCAATATT TCCTTGATCA GTGTTGCTGG
4751 AGGTGGCAGA CTTGGTGATG ATGTTGTGGT AACTGTTGTT ATTCCACAAA
4801 ATGATTCTCC ATTTGGAGTA TTTGGATTTG AAGAAAAGAC TGTAAGTTAA
4851 ACATATCAGG GGAAAGCCTT GTTTCAGGCT AGCGTTTCAT GTAATTTTGA
4901 GTAGAAAGTG TCTCACATTT TTGTTTTGGA AGTCTTGGCC AGGCATGGTG
4951 GCTCATGCCA GTAATCCCAG CACTTTGGGA GGCCGCAGCG GGCAGATCAC
5001 GAGGTCAGGA GATTGACACC ATCCTGGCCA ATATGGTTGA ATTCCCGTCT
5051 CTACTGAAAG TACAAAAATT AGCTGGGCGT GGTGGCACAT GCCTGTATTC
5101 CCAGATACTT GGGAGGCTGA GGCAGGAGAC TCGCTTGAAC CCAGGAGGCA
5151 GAGGTTGCAG TGAGCTGAGA TCACGCCATT GCACTCCAGC CTGGCGACAT
5201 AGAGAGACTC CATCTCAAAA AAAAAAAAAA AAAAAG
BLAST Results
No BLAST result
Medline entries
No Medline entry
Peptide information for frame 3
ORF from 0 bp to 4847 bp; peptide length: 1615
Category: putative protein
Classification: Cell signaling/communication
Prosite motifs: MULTICOPPER_OXIDASE1 (151-171)
1 DAWADAWALY TCATLCLKEQ ACSAFSFFSA SEGPQCFWMT SWISPAVNNS
51 DFWTYRKNMT RVASLFSGQA VAGSDYEPVT RQWAIMQEGD EFANLTVSIL
101 PDDFPEMDES FLISLLEVHL MNISASLKNQ PTIGQPNIST VVIALNGDAF
151 GVFVIYSISP NTSEDGLFVE VQEQPQTLVE LMIHRTGGSL GQVAVEWRVV
201 GGTATEGLDF IGAGEILTFA EGETKKTVIL TILDDSEPED DESIIVSLVY
251 TEGGSRILPS SDTVRVNILA NDNVAGIVSF QTASRSVIGH EGEILQFHVI
301 RTFPGRGNVT VNWKIIGQNL ELNFANFSGQ LFFPEGSLNT TLFVHLLDDN
351 IPEEKEVYQV ILYDVRTQGV PPAGIALLDA QGYAAVLTVE ASDEPHGVLN
401 FALSSRFVLL QEANITIQLF INREFGSLGA INVTYTTVPG MLSLKNQTVG
451 NLAEPEVDFV PIIGFLILEE GETAAAINIT ILEDDVPELE EYFLVNLTYV
501 GLTMAASTSF PPRLDSEGLT AQVIIDANDG ARGVIEWQQS RFEVNETHGS
551 LTLVAQRSRE PLGHVSLFVY AQNLEAQVGL DYIFTPMILH FADGERYKNV
601 NIMILDDDIP EGDEKFQLIL TNPSPGLELG KNTIALIIVL ANDDGPGVLS
651 FNNSEHFFLR EPTALYVQES VAVLYIVREP AQGLFGTVTV QFIVTEVNSS
701 NESKDLTPSK GYIVLEEGVR FKALQISAIL DTEPEMDEYF VCTLFNPTGG
751 ARLGVHVQTL ITVLQNQAPL GLFSISAVEN RATSIDIEEA NRTVYLNVSR
801 TNGIDLAVSV QWETVSETAF GMRGMDVVFS VFQSFLDESA SGWCFFTLEN
851 LIYGIMLRKS SVTVYRWQGI FIPVEDLNIE NPKTCEAFNI GFSPYFVITH
901 EERNEEKPSL NSVFTFTSGF KLFLVQTIII LESSQVRYFT SDSQDYLIIA
951 SQRDDSELTQ VFRWNGGSFV LHQKLPVRGV LTVALFNKGG SVFLAISQAN
1001 ARLNSLLFRW SGSGFINFQE VPVSGTTEVE ALSSANDIYL IFAKNVFLGD
1051 QNSIDIFIWE MGQSSFRYFQ SVDFAAVNRI HSFTPASGIA HILLIGQDMS
1101 ALYCWNSERN QFSFVLEVPS AYDVASVTVK SLNSSKNLIA LVGAHSHIYE
1151 LAYISSHSDF IPSSGELIFE PGEREATIAV NILDDTVPEK EESFKVQLKN
1201 PKGGAEIGIN DSVTITILSN DDAYGIVAFA QNSLYKQVEE MEQDSLVTLN
1251 VERLKGTYGR ITIAWEADGS ISDIFPTSGV ILFTEGQVLS TITLTILADN
1301 IPELSEVVIV TLTRITTEGV EDSYKGATID QDRSKSVITT LPNDSPFGLV
1351 GWRAASVFIR VAEPKENTTT LQLQIARDKG LLGDIAIHLR AQPNFLLHVD
1401 NQATENEDYV LQETIIIMKE NIKEAHAEVS ILPDDLPELE EGFIVTITEV
1451 NLVNSDFSTG QPSVRRPGME IAEIMIEEND DPRGIFMFHV TRGAGEVITA
1501 YEVPPPLNVL QVPVVRLAGS FGAVNVYWKA SPDSAGLEDF KPSHGILEFA
1551 DKQVTAMIEI TIIDDAEFEL TETFNISLIS VAGGGRLGDD VVVTVVIPQN
1601 DSPFGVFGFE EKTVS
BLASTP hits
No BLASTP hits available
Alert BLASTP hits for DKFZphamy2_10p7, frame 3
TREMBL:AF055084_1 gene: "VLGR1"; product: "very large G-protein coupled receptor-1"; Homo sapiens very large G-protein coupled receptor-1 (VLGR1) mRNA, complete cds., N = 3, Score = 284, P = 1.2e-33
T EPlBL:»riAF1fl17--_l gene: "Calx"i product: "CALX"ή Drosophila melanogaster 3Na ( + )-lCa (2+) exchanger (Calx) mRNA-. complete eds--. N = 1-. Score = 17A-, P = S-Se-OI
>TREMBL:AF055084_1 gene: "VLGR1"; product: "very large G-protein coupled receptor-1"; Homo sapiens very large G-protein coupled receptor-1 (VLGR1) mRNA, complete cds.
Length = 1967
HSPs
Score = 284 (42.6 bits), Expect = 1.2e-33, Sum P(3) = 1.2e-33
Identities = 192/738 (26%), Positives = 314/738 (42%)
Query: 67 SGQAVAGSDYEPVTRQWAIMQEGDEFANLTVSILPDDFPEMDESFLISLLEVHLMNISAS 126
S + G DY + a G + + +SI+ D+ E +E +E+
L + Sb ct : 1D2 SSASPGGVDYI-LHGSTVTF(2HG(2NLSFINISIIDDNESEFEEP
IEILLTGATGG 155 fiuery- 127 LKN(3PTIG(3PNISTVVIALNGDAFGVFVIYSISPNTSEDGLFVEV(3E(3P(3TLV-ELriIHR 1A5
+G+ +S ++IA + FGV N S+ + +
T++ L++ R
Sbjct : ISb A VLGRHLVSRIIIAKSDSPFGVIRFL NώSK
ISIANPNSTNILSLVLER 2D3 ώuery : lAb TGGSLGώVAVEIilRVVGGTATEGL DFIG-AGEILTFAEGETK-
KTVILTXXXXXXX 23A
TGG LG + + V Id VG + E L D + F EGE
+T+ILT Sbjct : 204
TGGLLGEI(3VNIα)ETVGPNS<2EALLP(3NRDIADPVSGLFYFGEGEGGVRTIILTIYPHEEI 2b3 ώuery : 231 XXXXXXXXXLVYTEGGSRILPSSDTVRVNILANDNVAGIVSF—
(3TASRSVIGH EG 212 L +G +++ + V + I + G+V F +T S+
EG
Sb jct : 2b4 EVEETFIIKLHLVKGEAKLDSRAKDVTLTI(2EFGDPNGVV<2FAPETLSKKTYSEPLALEG 323 ώuery : 213 EIL(2FHVIRTFPGR-GNVTVNlιlKIIG(2-
NLELNFANFSGώLFFPEGSLNTTLFVHLLDDN 350
+L +R G G + M U + + + ++ +F + SG +G +
VHLL D
Sb j ct : 324 PLLITFFVRRVKGTFGEIMVYhlELSSEFDITEDFLSTSGFFTIADGESEASFDVHLLPDE 3A3
(3uery : 3S1
IPEEKEVY(3VILYDVRT(3GVPPAGIALLDAι3GYAAVLTVEASDEPHGVLNFAL-SSRFVL 401 +PE +E Y + L V G A LD + +V A+D+PHGV FAL S R +
Sb jct : 3A4 VPEIEEDYVIώLVSVE GGAELDLEKSITUFSVYANDDPHGV--
FALYSDRfiSI 434 fluery : 410 LflEANI— TI(3LFINREFGSLGAINVTYTTVPG!1LSLKNι3T- VGNLAEPEVDFVPIIGFL 4bb
L N+ +1(3+ I R G+ G + V K (3 V AE +
L
Sbjct: 435 LIG(2NLIRSId3INITRLAGTFGDVAVGLRISSDH---KE(3PIVTENAER(3— L 4A2 ώuery: 4b7
ILEEGETAAAINITILEDDVPELEEYFLVNLTYVGLTMAASTSFPPRLDSEGLTAflVIID 52b ++++G T + 1 L F + L V L P L E
+ A V + Sbj ct : 4A3 VVKDGATYKVDVVPIKN(3VFLSLGSNFTL(3LVTVI1LVGGRFYGI1PTIL(2-
EAKSA-VLPV 540 ώuery : S27 ANDGARGVIEU<3(2SRFEV-NETHGSLTLVA(2RSREPLGHVSLFV
YAfiNLEAώVGLDY SA2 + A + ++ + F++ N T G+ ++ R R G +S+ YA
LE +
Sb jct : S41 SEKAANS(2VGFESTAF(2LMNITAGTSHVI1ISR- RGTYGALSVA ITTGYAPGLEIPEFIVV S11 ώuery: Sfl3 -IFTPMI-- LHFADGERYKNVNIMILDDDIPEGDEKF(2LILTNPSPGLELGKNTIALIIV b31
TP + L F+ GE+ K V + P E F L L+ G + IV
Sb jct : bOO GNNTPTLGSLSFSHGEflRKGVFLUTFPS-- PGϋJPEAFVLHL'SGV(3SSAPGGA(3LRSGFIV bS7
(Juery : b40 LANDDGPGVLSFN- NSEHFFLREPTALYV(3ESVAVLYIVREPA(2GLFGTVTV(3FIVTEVN blA
A + GV F+ . + S + + E T + ++ L+ G +
T
Sbjct: b5A -AEIEPMGVFdFSTSSRNIIVSEDTflM-IRLHVβRL
GFHSDLIKVSY(3TTAG 70S
(3uery: bll SSNESKDLTP-SKGYIVLEEGVRFKALώlSAILDTEPEIIDEYFVCTL FNP 747
S+ +D P G + ++ +1+ I D E++E+F L
F+ Sbjct: 701
SAKPLEDFEPV(2NGELFF(3KF(2TEVDFEITIIND(QLSEIEEFFYINLTSVEIRGL(3KFDV 7bfl βuery: 74A TGGARLGVHV(2T-LITVL(2N(3APLGLFSISAVENR-ATSIDIE
EANRTVYLNVSRT A01 RL + +IT+L N G+ IS E A ++D E
YL+ S+T
Sbjct: 7b1 NUSPRLNLDFSVAVITILDNDDLAGH- DISFPETTVAVAVDTTLIPVETESTTYLSTSKT A27 fluery: A02 NGI fl04
I
Sbjct: fl2A TTI A30
Score = 266 (39.9 bits), Expect = 4.0e-25, Sum P(3) = 4.0e-25
Identities = 175/708 (24%), Positives = 306/708 (43%)
Query: 131
PTIG(3PNISTVVIALNGDAFGVFVIYSISPNTSEDGLFVEV(3E(3P(2TLVELriIHRTGGSL 110 P IG +1 ++I N +A G+ P + EV+E L+ + + R G +
Sbjct : 31 PEIGNISIVRIIIHKNDNAEGII EFDPKYTA FEVEEDVG-
LIMIPVVRLHGTY 10
Ouery : 111 GfiVAVEURVVGGTATEG- LDFIGAGEILTFAEGETKKTVILTXXXXXXXXXXXXXXXXLV 241
C V ++ +A+ G +D+I G +TF G+ + ++
L
Sbjct : 11
GYVTADFIS(3SSSASPGGVDYILHGSTVTF(3HG(3NLSFINISIIDDNESEFEEPIEILLT 15D ώuery : 2S0 YTEGGSRILPSSDTVRVNILANDNVAGIVSF(3TASRSVIGHEGE —
ILώFHVIRTFPGRG 3D7
GG+ +L R+ I +D+ G++ F S+ I + IL +
RT G Sb jct : 151 GATGGA-
VLGRHLVSRIIIAKSDSPFGVIRFLNώSKISIANPNSTIIILSLVLERTGGLLG 2D1 2uery: 30A NVTVNUKIIGfiN LELN--FAN-FSG(2LFFPEGSLNT-
TLFVHLLDDNIPEEKEVY 35δ
+ VNIι)+ +G N L N A+ SG +F EG T+ + +
E +E + Sbjct: 21D
EI(2VNUETVGPNS(3EALLP(3NRDIADPVSGLFYFGEGEGGVRTIILTIYPHEEIEVEETF 2b1
(3uery: 3S1 (3VILYDVRT(3GVPPAGIALLDA(3GYAAVLTVEASDEPHGVLNFA — - LSSRFV---LL(3E 412 + L+ V+ G A LD++ LT++ +P+GV+ FA LS +
L E
Sbjct: 270 IIKLHLVK
GEAKLDSRAKDVTLTI(3EFGDPNGVV(3FAPETLSKKTYSEPLALE 322 (Query: 413
ANITIfiLFINREFGSLGAINVTYTTVPGNLSLKNtQTVGNLAEPEVDFVPIIGFLILEEGE 472
+ I F+ R G+ G I V + L ++ ++ E DF+
GF + +GE
Sbjct: 323 GPLLITFFVRRVKGTFGEIdVYϋ) ELSSEF--DITE--- DFLSTSGFFTIADGE 37D
(Query: 473
TAAAINITILEDDVPELEEYFLVNLTYVGLTNAASTSFPPRLDSEGLTAflVIIDANDGAR 532 + A+ ++ +L D+VPE+EE +++ L S LD E + AND
Sbjct: 371 SEASFDVHLLPDEVPEIEEDYVIώLV
SVEGGAELDLEKSITUFSVYANDDPH 422
(Query: 533 GVIEIι)(Q(2SRFEV---NETHGSLTLVAlQRSREPLGHVS-- LFVYAώNLEAfiVGLDYIFTPri 5A7
GV R + S+ + R G + L + + + E +
Sbjct: 423 GVFALYSDR(3SILIG(QNLIRSI(QINITRLAGTFGDVAVGLRISSDHKE(QPIVTENAER(QL 4 2
(Query: 5flA ILHFADGERYKNVNIMILDDDI--PEGDE-KF(QLILTNPSPGLELGKNTI-- -ALIIVLA b41
++ DG YK V +++ + + + G (3L+ G G TI
A VL Sbjcf- 4fl3 VVK — DGATYK-
VDVVPIKN(3VFLSLGSNFTL(3LVTVriLVGGRFYGriPTIL(3EAKSAVLP 531
Query: b42
NDDGPGVLSFNNSEHFFLREPTALYV(3ESVAVLYIVREPA(3GLFGTVTV(3FIV TE bib + NS+ F E TA + A ' V +G +G ++V +
E
Sbjct: 540 VSEKAA NS(3VGF--
ESTAFtJLtlNITAGTSHVrilSRRGTYGALSVAldTTGYAPGLE 512 (Query: b17
VNSSNESKDLTPSKGYIVLEEGVRFKALOISAILDTEPEMDEYFVCTLFNPTGGARLGVH 75b + ++TP+ G + G + K + + P E FV L
A G
Sbjct: 513 IPEFIVVGNI1TPTLGSLSFSHGE(3RKGVFLI TF-- PSPGlilPEAFVLHLSGVflSSAPGGAfl bSO fluery: 7S7 VflTLITVLflNώAPLGLFSISAVENRATSIDIEEANRTVYLNVSRTNGI-- DLAVSV(3 )ET A14 +++ V + + P+G+F S +R +1 + E + + L+V R G DL + V + + T
Sbjct: bSl LRSGFIVAEIE-PHGVFiQFST-SSR — NHVSEDTflHIRLHVtQRLFGFHSDL-IKVSYtQT 705
(Query: 515 VSETAFGHRGMDVVFS VFfQSFLDE A3A
+ +A + + V + F(Q F E
Sbjct: 7Db TAGSAKPLEDFEPV(QNGELFF(QKF(QTE 732 Score = 24b (3b-1 bits)-, Expect = 4-le-32-, Sum P(3) = 4-le-32 Identities = 12/33A (27*)-, Positives = 1S7/33A (4b*)
(Query: Sll PPRLDSEGLTAtQVIIDANDGARGVIEU-- <QtQSRFEVNETHGSLTLVA(QRSREPLGHVSLF SbA PP + + + ++II ND A G+IE+ + + FEV E G + + R
G + V +
Sbjct: 3A PPEIGNISIV-
RIIIMKNDNAEGIIEFDPKYTAFEVEEDVGLimPVVRLHGTYGYVTAD lb (Query: Sb VYA(QNLEA(QVG-
LDYIFTPMILHFADGERYKNVNirilLDDDIPEGDEKFlQLILTNPSPGL b27
+(Q+ A G +DYI + F G+ +NI I+DD+ E +E
+++LT + G
Sbjct: 17 FIS(QSSSASPGGVDYILHGSTVTF(QHG(QNLSFINISIIDDNESEFEEPIEILLTGATGGA ISb
(Query: b2A
ELGKNTIALIIVLANDDGPGVLSFNNSEHFFLREPTALYV(3ESVAVLYIVREPA(QGLFGT bA7 LG++ ++ 11+ +D GV+ F N + P S +L +V E GL G
Sbjct: 157 VLGRHLVSRIIIAKSDSPFGVIRFLNiQSKISIANP
STMILSLVLERTGGLLGE 21D
(Query: bAA VTVώFIVTEVNSSN ESKDLT-PSKGYIVLEEGVR- FKALώlSAILDTEPEMDEYFV 741
+ V + NS +++D+ P G EG + + ++ E
E++E F+
Sbjcf- 211
IiQVNUETVGPNSiQEALLPiQNRDIADPVSGLFYFGEGEGGVRTIILTIYPHEEIEVEETFI 27D
(Query: 742 CTLFNPTGGARLGVHV(2TL-ITVL(QN<QAPLGL--FSISAVENRATSIDIE-
EANRTVYLN 717
L G A+L + + +T+ + P G+ F+ + + S + E
+ Sbjct: 271
IKLHLVKGEAKLDSRAKDVTLTIfQEFGDPNGVVtQFAPETLSKKTYSEPLALEGPLLITFF 330
(Query: 71fi VSRTNGIDLAVSViQUETVSETAFGMRGIIDVVFSVFiQSFLDESASGhJCFFTL A4A V R G + V tilE SE F + + FL S SG FFT +
Sbjct: 331 VRRVKGTFGEIMVYIαJELSSE FDITEDFL — TSG — FFTI
3bb
Score = 24b (3b-1 bits)-, Expect = l-le-11-, Sum P(3) = l-le-11 Identities = A7/303 (2A*)-, Positives = 13A/3D3 (45*)
(Query: Hb2 PSSGELIFEPGEREA-TIAVNILDDTVPEKEESFKVtQLKNPKGGAEIGIN- DSVTITILS 1211 P SG F GE TI + I E EE+F ++L KG A++ VT+TI
Sbjct: 23b
PVSGLFYFGEGEGGVRTIILTIYPHEEIEVEETFIIKLHLVKGEAKLDSRAKDVTLTIiQE 215
(Query : 1220 NDDAYGIVAFAtQNSL
YKlQVEEMEiQDSLVTLNVERLKGTYGRITIAϋJEADGSIS--- 1272
D G+V F A +L Y + +E L+T V R+KGT+G I + UE
Sbjct : 21b FGDPNGVViQFAPETLSKKTYSEPLALEGPLLITFFVRRVKGTFGEiriVYldELSSEFDITE 355
(Query : 1273
DIFPTSGVILFTEGiQVLSTITLTILADNIPELSEVVIVTLTRITTEGVEDSYKGATIDtQD 1332 D TSG +G+ ++ + +L D +PE+ E ++ L ++ EG GA +D +
Sbjct : 35b DFLSTSGFFTIADGESEASFDVHLLPDEVPEIEEDYVK3L— SVEG
-GAELDLE 407
(Query: 1333 RSKSVITTLPNDSPFGLVGlι)RAASVFIRVAEPKENTTTL(QL(QIARDKGLLGDIAIHLRA(Q 1312
+S + + ND P G+ + I + + ++(Q+ I R G
GD+A+ LR
Sbj cf- 40fi KSITh)FSVYANDDPHGVFALYSDRι2SILIG(Q--
NLIRSIlQINITRLAGTFGDVAVGLRIS 4b5
(Query : 1313 PNFLLHVDNlQ-
ATENEDYVLiQETIIIMKENIKEAHAEVSILPDDLPELEEGFIVTITEVN 1451
+ H + TEN E +++K+ I L F
+ + V Sbjct: 4bb SD HKEβPIVTENA
ER(3LVVKDGATYKVDVVPIKN(3VFLSLGSNFTL(QLVTVπ 517
(Query: 1452 LVNSDFSTGiQPSV 14b4 LV F G P++ Sbjct: Slfl LVGGRFY-GMPTI 521
Score = 24b (3b-1 bits)-, Expect = l-le-11-. Sum P(3) = l-le-11 Identities = A1/334 (2b*)-, Positives = 150/334 (44*) ώuery: 1151
DFIPSSGELIFEPGEREATIAVNILDDTVPEKEESFKVlQLKNPKGGAEIGINDSVTITIL 121A D+I + F+ G+ + 1 ++I+DD E EE ++ L GGA +G +
I I
Sbjct: 110 DYILHGSTVTF(QHG(QNLSFINISIIDDNESEFEEPIEILLTGATGGAVLGRHLVSRIIIA Ibl
(Query: 1211
SNDDAYGIVAFA(3NSLYK(QVEEHE(QDSLVTLNVERLKGTYGRITIAIilEADGSIS 1272
+D +G++ F S + +++L +ER G G I + LIE G S
Sbjct: 17D KSDSPFGVIRFLNlQSKIS- IANPNSTMILSLVLERTGGLLGEIiQVNIilETVGPNSlQEALLP 22A
(Query: 1273 DIF-PTSGVILFTEG(QV- LSTITLTILADNIPELSE VIVTLTRITTEGVEDSYKGA 1327
DI P SG+ F EG+ + TI LTI E+ E 1+ L + E DS Sb jct : 221 (QNRDIADPVSGLFYFGE6EGGVRTIILTIYPHEEIEVEETFIIKLHLVKGEAKLDS 2fl4
(Query : 132fl TIDiQDRSKSVITTLPN-DSPFGLVGURAASVFIRV-AEPK-- ENTTTLiQLtJIARDKGLLG 13A3
R+K V T+ P G+V + ++ + +EP E +
R KG G
Sb jct : 2AS
RAKDVTLTIiQEFGDPNGVVlQFAPETLSKKTYSEPLALEGPLLITFFVRRVKGTFG 331
(Query : 13A4 DIAIHLRA(QPNFLLHVDN<QATENEDYVL<QETIIiriKENIKEAHAEVSILPDDLPELEEGF 1443
+1 ++ F + ED++ + + EA +V
+LPD++PE+EE + Sb jct : 340 EIMVYIilELSSEFDI
TEDFLSTSGFFTIADGESEASFDVHLLPDEVPEIEEDY 311
(Query : 1444 IVTITEVNLVNSDFSTGiQPSVRRPGriEIAEirilEENDDPRGIFriFHVTR 1412 ++ + V + + + I + NDDP G+F + R
Sbj ct : 312 VIiQLVSVE GGAELDLEK SITUFSVYANDDPHGVFALYSDR
431
Score = 237 (35-b bits)-, Expect = 1-4e-34-, Sum P(3) = 1-4e-34 Identities = lDl/3b7 (27*)-, Positives = Ib5/3b7 (44*)
(Query- __f
SGiQAVAGSDYEPVTRiQliiAIIIlQEGDEFANLTVSILPDDFPEriDESFLISLLEVHLMNISAS 12b S + G Y + a G + + +SI+ D+ E +E +E+ L +
Sbjct : 102 SSASPGGVDYI-LHGSTVTF(3HG(3NLSFINISIIDDNESEFEEP
IEILLTGATGG 155 fluery : 127 LKN(QPTIG(QPNISTVVIALNGDAFGVFVIYSISPNTSEDGLFVEV<QE(QP(QTLVEL!1IHRT lAb
+G+ +S ++IA + F6V N S+ +
++ L++ RT
Sb jct : 15b A VLGRHLVSRIIIAKSDSPFGVIRFL NlQSKISI — -
ANPNSTMILSLVLERT 204
(Query : 1A7 GGSLGlQVAVEURVVGGTATEGL DFIG-AGEILTFAEGETK-
KTVILTXXXXXXXX 231
GG LG++ V h) VG + E L D + F EGE +T+ILT
Sbjct : 2D5 GGLLGEI(QVNIι)ETVGPNS(2EALLPi3NRDIADPVSGLFYFGEGEGGVRTIILTIYPHEEIE 2b4
(Query : 24D XXXXXXXXLVYTEGGSRILPSSDTVRVNILANDNVAGIVSF--
(QTASRSVIGH EGE 213
L +G +++ + V + I + G+V F +T S+ EG
Sb jct : 2bS VEETFIIKLHLVKGEAKLDSRAKDVTLTI(QEFGDPNGVV(QFAPETLSKKTYSEPLALEGP 324
(Query : 214 IL(2FHVIRTFPGR-GNVTVNh)KIIG(3- NLELNFANFSGiQLFFPEGSLNTTLFVHLLDDNI 351
+ L +R G G + J lιl++ + ++ +F + SG +G +
VHLL D + Sbjct: 325 LLITFFVRRVKGTFGEIMVYIαJELSSEFDITEDFLSTSGFFTIADGESEASFDVHLLPDEV 3A4
(Query: 352 PEEKEVY(QVILYDVRT(QGVPPAGIALLDA(QGYAAVLTVEASDEPHGVLNFAL-SSRFVLL 41D
PE +E Y + L V G A LD + +V A+D+PHGV FAL
S R +L
Sbjct: 3fl5 PEIEEDYVIfQLVSVE GGAELDLEKSITUFSVYANDDPHGV —
FALYSDRiQSIL 435
(Query: 411 (QEANI--TI(2LFINREFGSLGAINV 433
N+ +I(Q+ I R G+ G + V Sbjct: 43b IG(QNLIRSI(QINITRLAGTFGDVAV 4b0 Score = 230 (34-5 bits)-, Expect = 2-3e-14-, Sum P(3) = 2-3e-14 Identities = 1A/3bA (2b*)-, Positives = lb4/3bA (44*)
(Query: 1240 EME(QD-
SLVTLNVERLKGTYGRITIAUEADGSISDIFPTSGVILFTEGfQVLSTITLTILA 121A E+E+D L+ + V RL GTYG +T + + S + P GV G
ST + T
Sbjct: 71 EVEEDVGLIMIPVVRLHGTYGYVTADFISC2SSSAS--P-GGVDYILHG —
STVTFiQH-G 123 (Query: 1211 DNIPELSE VIVTLTRITTEGVEDSYKGATIDiQDRSKSVITTL---
PNDSPFGLVGURAA 1355
N+ ++ +1 E +E GAT + +++ +
+DSPFG++ +
Sbjct: 124 (QNLSFINISIIDDNESEFEEPIEILLTGATGGAVLGRHLVSRIIIAKSDSPFGVIRFLNiQ 1S3
(Query: 135b SVFIRVAEPKENTTTL(QL(QIARDKGLLGDIAIHLRA(Q- PNFLLHVDN(QATENEDYVL(QET 1414
S I +A P +T L L + R GLLG+I ++ PN + Q. + D V
Sbjct: 1A4 SK-ISIANPN- STI ILSLVLERTGGLLGEI(QVNIjJETVGPNS(3EALLP(QNRDIADPV--SG 231 fiuery: 1415 IIIMKENIKEAHAEV- SILPDDLPELEEGFIVTITEVNLVNSDFSTGlQPSVRRPGriEIAE 1473
+ E + +1 P + E+EE FI+ +++LV G+ +
+ +
Sbjct: 24D LFYFGEGEGGVRTIILTIYPHEEIEVEETFII---KLHLVK
GEAKLDSRAKDVT- 210
(Query: 1474
IMIEENDDPRGIFHFHVTRGAGEVITAYEXXXXXXXXXXXXXXXAGSFGAVNVYUKASPD 1S33 + I+E DP G+ F + + + G+FG +
VYU+ S + Sbjct: 211
LTI(3EFGDPNGVVι3FAPETLSKKTYSEPLALEGPLLITFFVRRVKGTFGEiriVYlι)ELSSE 3S0
(Query: 1534
SAGLEDFKPSHGILEFADKflVTAMIEITIIDDAEFELTETFNISLISVAGGGRLGDDV V 1513 EDF + G AD + A ++ ++ D E+ E + I L+SV GG
L + + Sbjct: 351 FDITEDFLSTSGFFTIADGESEASFDVHLLPDEVPEIEEDYVIiQLVSVEGGAELDLEKSI 41D (Query: 1514 T-VVIP<QNDSPFGVF lb07
T + ND P GVF Sbjct: 411 TϋJFSVYANDDPHGVF 425
Score = 11D (2A-5 bits)-, Expect = 7-Se-ll-. Sum P(3) = 7-5e-ll Identities = 13b/511 (23*) •, Positives = 247/511 (41*)
(Query: b7 SGtQAVAGSDYEPVTRtQ lAirKQEGDEFANLTVSILPDDFPENDESFLISLLEVHLriNISAS 12b
+ G A D + EPV <Q+ + + + I+ D E + + E F I+L V
+ +
Sbjct: 707
AGSAKPLEDFEPV(3NGELFF(QKF(3TEVDFEITIIND(QLSEIEEFFYINLTSVEIRGL(QKF 7bb
(Query: 127 LKN-(QPTIG(QP-NISTVVIALNGDAFGVFVIY-
SISPNTSEDGLFVEVlQEώPlQTLVELMI 1S3
N P + +++ + I G+ + + + + D + V+ +
T L Sbjct: 7b7
DVNUSPRLNLDFSVAVITILDNDDLAGI1DISFPETTVAVAVDTTLIPVETESTTY — LST A24
(Query: 1A4 HRTGGSLGiQVAVEI RVVGGTATEGLDFIGAGEILTF-- AEGETKKTVILTXXXXXXXXXX 241 +T L V +V T G+ I + + + T + + K + T
Sbjct: A2S SKTTTIL(3PTNVV-AIV — TEATGVSAIPE- KLVTLH6TPAVSEKPDVATVTANVSIHGT AAD
(Query: 242 XXXXXXLVYTEGGSRILPSSDTVRVNILANDNVAGIVSF-- (QTASRSVIGHEGEILiQFHV 211
+VY E + + +T V I G VS +T E
L F
Sbjct: AA1 FSLGPSIVYIEEEMKN-
GTFNTAEVLIRRTGGFTGNVSITVKTFGERCAfQNEPNALPF-- 137
(Query: 300
IRTFPGRGNVTVNUKIIGiQNLELNFANFSGlQLFFPEGSLNTTLFVHLLPDNIPEEKEVYiQ 3S1 R G N + T Id + E +F + L F +G + V +LDD +
PE +E + Sbjct: 13A -RGIYGISNLT--UAVE
EEDFEE(QTLTLIFLDGERERKVSV(QILDDDEPEG(QEFFY 110
(Query: 3b0 VILYDVRT(2GVPPAGIALLDA(Q---GYAA-- VLTVEASDEPHGVLNFALSSRFVL-LiQEA 413 V L + P G +++ + G+AA ++ + SD +G++ F+ S+
L L + E
Sbjct: HI VFLTN
P(QGGA(QIVEGKDDTGFAAFANVIITGSDLHNGIIGFSEES(QSGLELREG 1D44 (Query: 414 NITIiQLFI NREFGSLGAI-
NVTYTTVPGI LSLKNiQTVGNLAEPEVDFVPIIGFL 4bb
+ +L + NR F + VT ++ L+ V NL E E+
V G
Sbjct: 1045 AVMRRLHLIVTR(QPNRAFEDVKVFIdRVTLNKT--VVVL(QKDGV-NLI1E- EL(3SVS--GTT 101A
(Query: 4b7 ILEEGETAAAINITILEDDVPELEEYFLVNL — TYVGLTMAASTSFPPRLDSEGLTAiQVI 524 G+T I+I + + VP++E YF V L G + S F
E +0 + Sbjct: 1011
TCTHG<QTKCFISIELKPEKVP(QVEVYFFVELYEATAGAAINNSARFA(QIKILESDESι2SL 115A
(Query: 525 IDANDGARGVIEld(Q(QSRF— EVNETHGS- LTLVA(QRSREPLGHVSLFVYA(QNLEA(3VGL SAO
+ + G+R + +++ +V G+ L + S + L
A G Sbjct: 1151
VYFSVGSRLAVAHKKATLISLfQVARDSGTGLMMSVNFSTlQELRSAETIGRTIISPAISGK 121A
(Query: SA1 DYIFTPMILHFADGERYKNVNIMILDD-- DIPEGDEKFfiLILTNPSPGLELGKNTIALII b3B D + + T L F G + R + + + + + + + + + F(Q + + L +P G +
K I
Sbjct: 1211 DFVITEGTLVFEPG(QRSTVLDVILTPETGSLNSFPKRF(QIVLFDPKGGARIDKVYGTANI 127A (Query: b31 VLAND-DGPGVLSFNNSEH bSb
L +D D + + H
Sbjct: 1271 TLVSDADS(QAIldGLAD(3LH 1217
Score = 1AA (2A-2 bits)-. Expect = l-2e-33-. Sum P(3) = l-2e-33 Identities = A4/321 (25*)-, Positives = 14b/321 (44*)
(Query: H2 SVTVKSLNS
SKNLIALVGAHSHIYELAYISSHSDFIPSSGELIFEPGEREATIAV HAD
S+TVK+ N + G + I L + DF + LIF GERE ++V
Sbjct: 117 SITVKTFGERCAiQMEPNALPFRGIYG- ISNLTldAVEEEDFEEiQTLTLIFLDGERERKVSV 175
(Query: 11A1 NILDDTVPEKEESFKViQLKNPKGGAEI — GINDS VTITILSNDDAY- GIVAFAtQNS 1233
ILDD PE +E F V L NP+GGA+I G +D+ + I++ D +
GI+ F++ S
Sbjct: 17b
(QILDDDEPEG(QEFFYVFLTNP(QGGA(QIVEGKDDTGFAAFAI1VIITGSDLHNGIIGFSEES 1035
(Query: 1234 LYK(QVEEME(QDSLVT---LNVERLKG-TYGRITIAIdEAD-
GSISDIFPTSGVILFTEGiQV 12AA
+ E+ + + + + L V R + + + ld + GV
L E (Q Sbjcf- 103b —
(QSGLELREGAVriRRLHLIVTR(2PNRAFEDVKVFURVTLNKTVVVL(QKDGVNLI1EEL(QS 1013
(Query: 12A1 LSTITLTILADNIPELS-EVVIVTLTRITTEGVEDSYK- — GATIDiQDRSKSVITTLPND 1344 +S T + +S E+ + ++ + Y+ GA 1+ +
I L +D Sbjct: 1014 VSGTTTCTMGfiTKCFISIELKPEKVPlQVEVYFFVELYEATAGAAINNSARFAiQIKILESD 1153 (Query: 1345 SPFGLVGIdRAASVFIRVAEPKENTTTL(QLι3IARDKG--LLGDIAI
HLRAiQPNFLLHV 1311
LV + S R + A + T + L(Q + ARD G L+ + LR +
+ Sbjct: 1154 EStQSLVYFSVGS
RLAVAHKKATLISL(3VARDSGTGLM1 SVNFST(QELRSAETIGRTI IBID
(Query: 140D DN(QATENEDYVL(3ETIIIMKENIKEAHAEVSILPD 1434
+ A +D+V+ E ++ + + +V + P+
Sbjct: 1211 ISPAISGKDFVITEGTLVFEPGtQRSTVLDVILTPE 124S
Score = __&__ (27-1 bits)-, Expect = 2-5e-13-, Sum P(3) = 2-5e-13 Identities = 7S/242 (3D*)-, Positives = 113/242 (4b*)
(Query: 120b EIGINDSVTITILSNDDAYGIVAFAiQNSLYKiQVEEMEiQDSLVTLNVERLKGTYGRITIAId 12b5
EIG V I 1+ ND+A GI+ F + Y E E L+ + V RL
GTYG +T + Sbjct: 40 EIGNISIVRIIIMKNDNAEGIIEF--
DPKYTAFEVEEDVGLIMIPVVRLHGTYGYVTADF 17
(2uery: 12bb EADGSIS
DIFPTSGVILFTEGtQVLSTITLTILADNIPELSEVVIVTLTRITTEGV 1320 + S + D + F Ga LS I ++I+ DN E E + +
LT T G Sbjct: Ifi
IS(QSSSASPGGVDYILHGSTVTF(QHG(QNLSFINISIIDDNESEFEEPIEILLTGAT--G- 154 (Query: 1321
EDSYKGATID(QDRSKSVITTLPNDSPFGLVGIdRAASVFIRVAEPKENTTTL(QL(QIARDKG 13A0
GA + + +1 +DSPFG++ + S I +A P +T L
L + R G
Sbjct: 15S GAVLGRHLVSRIIIA-KSDSPFGVIRFLN(QSK-ISIANPN- STMILSLVLERTGG 20b
(Query- 13A1 LLGDIAIHLRA(Q-PNFLLHVDN(2ATENEDYVL<3ETIIIMKENIKEAHAEV- SILPDDLPE 143A
LLG+I ++ PN + lQ + D V + E + +1 P + E
Sbjct: 207 LLGEI(QVNIdETVGPNS(QEALLP(QNRDIADPV-- SGLFYFGEGEGGVRTIILTIYPHEEIE 2b4
(Query: 1431 LEEGFIVTI 1447 +EE FI+ +
Sbjct: 2bS VEETFIIKL 273
Score = 171 (2b-1 bits)-, Expect = 1-4e-34-, Sum P(3) = 1-4e-34 Identities = bS/244 (2b*)-, Positives = 114/244 (4b*)
(Query: 5fll DYIFTPMILHFADGERYKNVNIMILDDDIPEGDEKFiQLILTNPSPGLEL-- GKN T b33
D+ + L F DGER + V++ ILDDD PEG E F + LTNP G ++
GK + Sbjct: 154
DFEE(QTLTLIFLDGERERKVSV(QILDDDEPEG(QEFFYVFLTNP(QGGA(QIVEGKDDTGFAA 1013
(Query: b34 IALIIVLANDDGPGVLSFNNSEHFFLREPTALYViQESVAVLYIVREPAiQG-- LFGTV bAA A++I+ +D G++ F+ L ++ L + R+P +
+F V
Sbjct: 1D14 FAMVIITGSDLHNGIIGFSEESiQSGLELREGAVflRR-- LHLIVTRlQPNRAFEDVKVFIdRV 1071 Query: bAI TV(Q— FIVTEVNSSNESKDLTPSKGYIVLEEGVRFKAL(QISAILDTEPEriDEYFVCTLFN 74b
T+ +V + + N ++L G G + I + P+++ YF L+
Sbjct: 1072 TLNKTVVVL(QKDGVNLHEEL(QSVSGTTTCTHGι2TKCFISIELKPEKVP(QVEVYFFVELYE 1131
(Query: 747 PTGGARLGVHVώ- TLITVLiQNiQAPLGLFSISAVENRATSIDIEEANRTVYLNVSRTNGID ADS
T GA + + I +L++ L S V +R ++ ++A + L
V + R +G
Sbjct: 1132 ATAGAAINNSARFA(QIKILESDES(QSLVYFS-VGSRL-AVAHKKAT-
LISLiQVARDSGTG 11AA
(Query: AOb LAVSViQldET A14 L +SV + T
Sbjct: HA1 LMMSVNFST 1117 Score = 174 (2b-l bits)-, Expect = 4-le-32-, Sum P(3) = 4-le-32 Identities = SA/20D (21*)-, Positives = 102/20D (51*)
(Query: HS1
DFIPSSGELIFEPGEREATIAVNILDDTVPEKEESFKV(QLKNPKGGAEIGINDSVT-ITI 1217 DF+ +SG GE EA+ V ++L D VPE EE + +(QL + +GGAE+ +
S+T + + Sbjct: 35b DFLSTSGFFTIADGESEASFDVHLLPDEVPEIEEDYVIβLVSVEGGAELDLEKSITldFSV 415 (Query: I21fl LSNDDAYGIVAFA(QNSLYK(3VEE!1EιQDSL —
VTLNVERLKGTYGRITIAIdEADGSISDIF 1275
+ NDD +G+ A + +(Q + (Q+ + + +N+ RL GT + G + +
SD
Sbjct: 41b YANDDPHGVFALYSD RlQSILIGtQNLIRSIiQINITRLAGTFGDVAVGLRIS SDHK 4b1
(Query- 127b PTSGVILFTEGtQVLSTITLTILADNIPELSE VI
VTLTRITTEGVEDSYKGA-TI 1321
V E (Q + + T D +P + + V + TL +T V + G TI
Sbjct: 47D E(QPIVTENAER(QLVVKDGATYKVDVVPIKN(3VFLSLGSNFTL(3LVTVr.LVG6RFYGr.PTI 521
(Query: 133D DiQDRSKSVITTLPNDSPFGLVGldRAAS 135b (Q+ +KS + + + VG+ + +
Sbjct: S3D L(QE-AKSAVLPVSEKAANS(QVGFESTA 5SS
Score = 14S (21-fl bits)-, Expect = 4-3e-24-, Sum P(3) = 4-3e-24 Identities = 104/31b (2b*)-, Positives = 17D/31b (42*)
(Query: AA
EGDEFANLTVSILPDDFPEMDESFLISLLEVHLMNISASLKNtQPTIGtQPNIST VIALNG 147
+G+ A+ V +LPD+ PE++E ++I L+ V A L + +1 +
+ N Sbjct: 3bA DGESEASFDVHLLPDEVPEIEEDYVI<QLVSVEG---GAELDLEKSI
TldFSVYAND 411 (Query: 14A
DAFGVFVIYSISPNTSEDGLFVEV(QE(QP(QTLVELMIHRTGGSLG(QVAVEldRVVGGTATEG 207 D GVF +YS D + + + +++ I R G+ G VAV R+
+ Sbjct: 420 DPHGVFALYS
DRiQSILIGiQNLIRSIfllNITRLAGTFGDVAVGLRISSDHKEiQP 472
(Query: 20fi LDFIGAGEILTFAEGETKKTVILTXXXXXXXXXXXXXXXXLVYTE-GGSRI- -LPSS-DT 2b3 + A L +G T K ++ LV G R
+P+
Sbjct: 473 IVTENAER(QLVVKDGATYKVDVVPIKN(QVFLSLGSNFTL(QLVTVΓ1LVGGRFYGΓ1PTIL(QE 532 (Query: _>__ VRVNIL-ANDNVAGI-VSFiQTASRSVIGHEGEILiQFHVIRTFPGR-
GNVTVNldKI-IGiQN 311
+ +L ++ A V F++ + ++ HV+ + G G ++V
Id
Sbjct: 533 AKSAVLPVSEKAANS(QVGFESTAF(QLf1NITAGTS-- HVrilSRRGTYGALSVAIdTTGYAPG 510
(Query: 32D LEL
NFANFSG(QLFFPEGSLNTTLFVHLLDDNIPEEKEVY(QVILYDVRT(QGVPP 372
LE+ N G L F G +F+ E + + L V++ P
Sbjct: 511 LEIPEFIVVGNflTPTLGSLSFSHGEiQRKGVFLIdTFPS — PGIdPEAFVLHLSGViQSSA-- P b4b ώuery: 373 AGIALLDA(QGYAAVLTVEASDEPHGVLNFALSSRFVLL(QEANITIιQLFINREFG-SLGAI 431
G L G+ + A EP GV F+ SSR +++ E I+L + R
FG I
Sbjct: b47 GGAiQL—RSGF
IVAEIEPMGVF(QFSTSSRNIIVSEDT(QMIRLHV<QRLFGFHSDLI bll
(Query: 432 NVTYTTVPGNLS-LKN-(QTV--GNLA EPEVDF-
VPIIGFLILEEGETAAAINITIL 4A2
V+Y T G L++ + V G L + EVDF + 11 L E E
IN+T + Sbjct: 700 KVSY(QTTAGSAKPLEDFEPV(QNGELFF(QKF(QTEVDFEITIIND(2-
LSEIEEFFYINLTSV 7SA
(Query: 463 E 4A3 E Sbjct: 751 E 751
Score = 142 (21-3 bits)-, Expect = 5-be-OS-, Sum P(3) = S-be-OS Identities = 54/17S (30*)-, Positives = 7b/175 (43*) (Query: 143S
DLPELEEGFIVTITEVNLVNSDFSTGflPSVRRPGMEIAEINIEENDDPRGIFNFHVTRGA 1414 DL + G+ TI E N + D (QP + 1 I+I +ND+ GI
F
Sbjct: lb DLYDFGRGYDFTI(3E-NGL(QID (QPP- EIGNISIVRIIiriKNDNAEGIIEFDPK — - bb
(Query: 1415 GEVITAYEXXXXXXXXXXXXXXXAGSFGAVNVYld-- KASPDSAGLEDFKPSHGILEFADK 1552 TA+E G++G V + ++S S G D+
+ F
Sbjct: b?
YTAFEVEEDVGLIHIPVVRLHGTYGYVTADFISiQSSSASPGGVDYILHGSTVTFiQHG 123
(Query: 15S3 (QVTAriIEITIIDDAEFELTETFNISLISVAGGGRLGDDVVVTVVIP(QNDSPFGVF6F lbOI
(2 + 1 I + IIDD E E E I L GG LG +V + + I
++DSPFGV F Sbjct: 124
(QNLSFINISIIDDNESEFEEPIEILLTGATGGAVLGRHLVSRIIIAKSDSPF6VIRF IAD
Score = 125 (1A-A bits)-, Expect = 4-0e-25-, Sum P(3) = 4-0e-2S Identities = 77/3DA (25*)-, Positives = 134/30A (43*)
(Query: 1141 LVGAHSHIYELAYISSHS DFIP-
SSGELIFEPGEREATIAVNILDDTVPEKEES 1113
L G HS + +++Y ++ DF P +GEL F+ + E + I++D
+ E EE Sbjct: bll
LFGFHSDLIKVSY(QTTAGSAKPLEDFEPV(QNGELFF(QKF(QTEVDFEITIINDlQLSEIEEF 750
(Query: 1114 FKViQLKNP--
KGGAEIGINDSVTITILSNDDAYGIVAFA(QNSLYK(QVEEI E(QDSLVTLNV 1251 F + L + +G + +N S + + D + ++ N
D L + + +
Sbjct: 751 FYINLTSVEIRGLiQKFDVNIdSPRLNL-- -DFSVAVITILDN
DDLAGMDI 71b (Query: 1252
ERLKGTYGRITIAIdEADGSISDIFPTSGVILFTEG(QVLSTITLTILADNIPELSEVVIVT 1311
+ + T + A D ++ + S L T + + + T + + E
+ V +
Sbjct: 717 SFPETTVAVAVDTTLIPVETESTTYLSTS- KTTTILtQPTNVVAIVTEATGVSAIP A5D
(Query: 1312
LTRITTEGVEDSYKGATID(QDRSKSVITTLPNDSPFGLVGldRAASVFIRVAEPKENT-TT 1370
+ T G T V T N S G + V + I E K T T
SBJCT: AS1 EKLVTLHG TPAVSEKPDVATVTANVSIHGTFSLGPSIVYIE-
EEΓIKNGTFNT 101
(Query: 1371 L(QL(QIARDKGLLGDIAIHLRA (QPNFL LHVDNlQ-- ATENEDYVLiQETI 1415
++ I R G G+++I ++ +PN L + N A E
ED+ (Q
Sbjct: 1D2
AEVLIRRTGGFTGNVSITVKTFGERCA(2riEPNALPFRGIYGISNLTldAVEEEDFEE(QTLT Ibl
(Query- 141b IIMKENIKEAHAEVSILPDDLPELEEGFIVTIT 144A +1 + +E V IL DD PE +E F V +T
Sbjcf- 1b2 LIFLDGERERKVSViQILDDDEPEGlQEFFYVFLT 114 Score = 123 (1A-5 bits)-, Expect = b-0e-2fi-, Sum P(3) = b-Oe-EA Identities = 11/372 (24*)-, Positives = ISO/372 (4D*) (Query : 3Ab VLTVEASDEPHGVLNFALSSRFVLL<3EA--NITI
(QLFINREFGSLGAINVTYTTV-- 43A
V TV A+ HG F+L V ++E N T ++ I R G G
+ ++T T Sb jct : AbA VATVTANVSIHGT—
FSLGPSIVYIEEENKNGTFNTAEVLIRRTGGFTGNVSITVKTFGE 125
(Query : 431 PGMLSLKN-tQTVGNL--
AEPEVDFVPHGFLILEEGETAAAINITILEDDVPEL 4A1 P L + + NL A E DF LI +GE + + +
IL+DD PE Sbjct : 12b RCA(QMEPNALPFRGIYGISNLTldAVEEEDFEE(QTLTLIFLDGERERKVSV(QILDDDEPEG 1A5 (Query : 410 EEYFLVNLTYVGLTMAASTSFPPRLDSE6LTA--(QVIIDANDGARGVI~-
Elι)(Q(QSRFEV 544
+E+F V LT D G A VII +D G+I E
(QS E +
Sbjct : 1Ab (QEFFYVFLT NPiQGGAiQIVEGKDDTGFAAFAMVIITGSDLHNGIIGFSEESlQSGLEL 1D41
(Query : 545 NE--THGSLTLVA(QRS-REPLGHVSLF — VYA(QNLEA(3VGLDYIFTPHILHFADGERYKN 511
E L L+ R V +F V + D + L G
Sbjct: 1042 REGAVMRRLHLIVTR(QPNRAFEDVKVFIdRVTLNKTVVVL(QKDGVNLr.EELflSVSGTTTCT 1101
(Query: bOO VNIMILDDDIPEGDEKFiQLILTNPSPGLELGKNT- IALIIVLANDDGPGVLSF bSl
++I + + +P+ + F + L + G + + A I +L
+D+ ++ F
Sbjct: 1102
MG<QTKCFISIELKPEKVP(QVEVYFFVELYEATAGAAINNSARFA(QIKILESDES(QSLVYF llbl
(Query: b52 NNSEHFFLREPTALYV(QESVAVLYIVREPA(QGLFGTVTV<3FIVTEVNSSNE-
-SKDLTPS 701
+ + A + L + R+ GL ++V F E+ S+
++P+ Sbjct: Hb SVGSRLAVAHKKATLIS L(QVARDSGTGLM—
MSVNFSTiQELRSAETIGRTIISPA 1214
(Query: 710 -—KGYIVLEEGVRFKAL(QISAILD 731
K +++ E + F+ (3 S +LD Sbjct: 1215 ISGKDFVITEGTLVFEPG(3RSTVLD 1231
Score = 120 (1A-0 bits)-, Expect = l-Ae-22-, Sum P(3) = l-Ae-22 Identities = 77/31b (24*)-, Positives = 127/31b (40*) (3uery: 1255 KGTYGRITIAldE—-ADGS
ISDIFPTSGVILFTEG(3VLSTITLTILADNIPEL 1304
+GTYG +++Ald A G + ++ PT G + F+ G+ + L
P
Sbjct : 573 RGTYGALSVAUTTGYAPGLEIPEFIVVGNriTPTLGSLSFSHGElQRKGVFLIilTFPS— PGld b3D
(Query : 1305 SEVVIVTLTRITTEGVEDSYKGATID(3DRSKSVITTLPNDSPFGLVGIιJRAASVFIRVAEP 13b4 E ++ L+ GV+ S G (Q RS ++ + P G+ + +S
I V + E
Sbjct : b31 PEAFVLHLS GV(QSSAPGGA--(3LRSGFIVAEI —
EPHGVFiQFSTSSRNIIVSE- b71
(Query : 13b5 KENTTTL(QL(QIARDKGLLGDIAIHLRA<QPNFLLHVDN(QATENEDYV- LώETIIinKENIK 1423
+ T + + L + R G D+ I + (Q A ED+ +(Q
+ + + Sbj ct : bδO --DT(QHIRLHV(3RLFGFHSDL-IKVSY(QTTA
GSAKPLEDFEPV(QNGELFF(QKF(3T 731
(Query : 1424 EAHAEVSILPDDLPELEEGFIVTITEVNLVN- SDFSTG(3PSVRRPGI EIAEI(1IEENDDP 14 A2 E E++I+ D L E+EE F + +T V + F +A I
I +NDD
Sb jct : 732 EVDFEITIIND(QLSEIEEFFYINLTSVEIRGL(QKFDVNIdSPRLNLDFSVAVITILDNDDL 711 (Query : 14A3 RGI-FMFHVTRGAGEVITAY---
EXXXXXXXXXXXXXXXAGSFGAVNVYldKASPDSAGLE 1S3A
G+ F T A V T E V +
+A+ SA E
Sb j ct : 712 AGMDISFPETTVAVAVDTTLIPVETESTTYLSTSKTTTILiQPTNVVAIVTEATGVSAIPE A51
(Query : 1S31 DFKPSHGILEFADKlQVTAI IEITIIDDAEFEL 157D
HG ++K A + + ' F L
Sb j ct : AS2 KLVTLHGTP AVSEKPDVATVT ANVSIHGTFSL AA3
Score = 113 (17-0 bits)-, Expect = 1-4e-34-, Sum P(3) = 1-4e-34 Identities = 2A/A7 (32*) Positives = 50/A7 (57*)
(Query: H5b SHSDFIPSSGELIFEPGEREATIAVNILDDT-- VPEKEESFKViQLKNPKGGAEIG-INDS 1212
S DF+ + G L+FEPG+R + V + +T + + F++ L +PKGGA
I + +
Sbjct: 121b
SGKDFVITEGTLVFEPG(QRSTVLDVILTPETGSLNSFPKRF(QIVLFDPKGGARIDKVYGT 127S
(Query: 1213 VTITILSNDDAYGIVAFAiQNSLYKlQVEE 1240
IT++S+ D+ I A + L++ V +
Sbjct: 127b ANITLVSDADSiQAIUGLA-DiQLHlQPVND 1302 Score = 13 (14-0 bits)-, Expect = 4-le-32-, Sum P(3) = 4-le-32 Identities = 57/222 (25*)-, Positives = 10/222 (40*)
(Query: 1404 TENEDYVL--(2ETIIiriKENIKEAHAE VSILPDDLPEL
EEGFIVTITEVN 1451 TE+ Y+ + T 1+ N+ E VS +P+ L L E+
+ T+T
Sbjct: Alb TESTTYLSTSKTTTILOPTNVVAIVTEATGVSAIPEKLVTLHGTPAVSEKPDVATVTANV A7S (Query: 14S2 LVNSDFSTGfQPSVRRPGHEIAEirHEENDDPRGIFHFHVTRGAGEV- ITAYEXXXXXXXX 1510
++ FS G PS+ + I E M + + + G V IT Sbjct: A7b SIHGTFSLG-PSI
VYIEEEHKNGTFNTAEVLIRRTGGFTGNVSITVKTFGERCAiQri 130
(Query: 1511 XXXXXXXAGSFGAVNVYldKASPDSAGLEDFKPSHGILEFADKtQVTANIEITIIDDAEFEL 157D
G +G N+ Id EDF+ L F D + + +
I+DD E E
Sbjct: 131 EPNALPFRGIYGISNLTldAVEE
EDFEE(QTLTLIFLDGERERKVSV(QILDDDEPEG 1A5
(Query: 1571 TETFNISLISVAGGGRL--GDD VVVTVVIPώNDSPFGVFGFEEKTVS
IblS
E F + L + GG ++ G D V+I +D G+ GF E++ S
Sbjct: Ifib (QEFFYVFLTNPι3GGAι3IVEGKDDTGFAAFAriVIITGSDLHNGIIGFSEES(3S 1037
Score = 13 (14-D bits)-, Expect = l-Oe-18-, Sum P(3) = 1-De-lfl Identities = 51/236 (21*)-, Positives = 107/236 (44*) (Query: bOD VNIMILDDDIPEGDEKFiQLILTNPSPGLELGKNT-
IALIIVLANDDGPGVLSFNNSEHFF b5δ
++I + + +p+ + F + L + G + + A I +L +D+ ++
F +
Sbjct: HO ISIELKPEKVP(3VEVYFFVELYEATAGAAINNSARFA(QIKILESDES(QSLVYFSVGSRLA llbδ
(Query: bS LREPTALYV(3ESVAVLYIVREPA(3GLFGTVTV<3FIVTEVNSSNE-- SKDLTPS- — GYI 713
+ A + L + R+ GL ++V F E+ S+ ++P+ K ++
Sbjct: llbl VAHKKATLIS LfQVARDSGTGLM — riSVNFSTtQELRSAETIGRTIISPAISGKDFV 1221
(Query: 714 VLEEGVRFKAL<QISAILDT--EPE MDEY FVCTLFNPTGGARLG- VHViQTLITVL 7b4
+ E + F+ (3 S +LD PE ++ + F LF+P GGAR+ V +
IT + +
Sbjct: 1222
ITEGTLVFEPG(QRSTVLDVILTPETGSLNSFPKRF3IVLFDPKGGARIDKVYGTANITLV 12A1
(Query: ?b5 (QNiQAPLGLFSISAVENRATSIDI-
EEANRTVYLNVSRTNGIDLAVSVlQldETVSETAFGIIR 623
+ ++ ++ ++ + DI T+ + V+ T D +S +
+ Sbjct: 12A2 SDADS(QAIldGLAD(QLH(QPVNDDILNRVLHTISMKVA-
TENTDE(QLSAf1MHLIEKIT--TE 133A
(Query- 624 GMD VFSV A31 G FSV Sbjct: 1331 GKItQAFSV 134b
Score = 12 (13-A bits)-, Expect = l-Se-25-, Sum P(3) = l-Se-25 Identities = 44/177 (24*)-, Positives = 62/177 (4b*) (Query: bδO
PA(3GLFGTVTV(QFIVTEVNSSNESKDLTPSKGYIVLEEGVRFKAL(QISAILDTEPEI1DEY 731
P +G++G + + V E + E + LT ++ +G R + + + + D
EPE E + Sbjct: 13b PFRGIYGISNLTIdAVEEEDF — EEiQTLT
LIFLDGERERKVSV(QILDDDEPE6(QEF 166
(Query: 740 FVCTLFNPTGGARL
GVHV(QTLITVL(QN(3APLGLFSISAVENRATSIDIEEAN- 711
F L NP GGA++ G ++ + + G+ S
+ + + E
Sbjct: Ifil FYVFLTNP(QGGA(QIVEGKDDTGFAAFAriVIITGSDLHNGIIGFS- EESIQSGLELREGAV lD4b
(Query: 712 -RTVYLNVSRT-NGIDLAVSViQldE-TVSETAF
GNRGNDVVFSVFiQSFLDESASGtd 643
R ++L V+R N V V Id T + ++T G+ M+ + SV +
Sbjct: 1047 MRRLHLIVTR(QPNRAFEDVKVFIdRVTLNKTVVVLιQKDGVNLriEEL(QSVSGTTTCTriG(QTK 110b
(Query: 644 CFFTLE 641
CF ++E Sbjct: 1107 CFISIE 1112
Score = 11 (13-7 bits)-, Expect = b-be-32-, Sum P(3) = b-be-32 Identities = 41/153 (32*) n Positives = 70/153 (45*)
(Query: I4bb RPGNEIAEIMIEENDDPRGIFMFHVTRGAGEVITAYEXXXXXXXXXXXXXXXAGSFGAVN 152S
R G +AEI +P G+F F + + +1 + + + +
Sbjct: bS2 RSGFIVAEI EPHGVFlQFSTS--
SRNIIVSEDT(QriIRLHV(QRLFGFHSD---LIK 70D
(Query: lS2b VYIdKASPDSAG-LEDFKP-
SHGILEFADKlQVTAI IEITIIDDAEFELTETFNISLISVAG 1563
V ++ + SA LEDF + P +6 L F (Q EITII + D E+ E F
I + L SV Sbjct: 701
VSY(3TTAGSAKPLEDFEPVlQNGELFF(QKF(QTEVDFEITIIND(QLSEIEEFFYINLTSVEI 7b0
(Query: 1564 GG RLGDDVVVTVV-IPtQNDSPFGV-FGFEEKTVS IblS
G RL D V V+ I ND G+ F E TV+ Sbjct: 7bl RGL(QKFDVNldSPRLNLDFSVAVITILDNDDLAGI DISFPETTVA 604
Score = bS (1-6 bits)-, Expect = 6-6e-21-, Sum P(3) = 6-δe-21 Identities = 2b/11 (2b*)-, Positives = 50/11 (50*) (Query: 1232 NSLYK(3VEEI1E(3DSLVTLNVERLKGTYGRITIAIdEADGS ISDIF —
PTSGVILFTE 1265
NS K+ + + D ++++ GT IT+ +AD ++D P
+ IL
Sbjct: 1250 NSFPKRFOIVLFDPKGGARIDKVYGT- ANITLVSDADS(3AIIdGLAD(3LH(3PVNDDIL — - 13DS
Query: I2δb GiQVLSTITLTILADNIPELSEVVIVTLTRITTEGVEDSYKGAT 1326
+VL TI++ + +N E ++ + +ITTEG ++ A+
Sbjct: 130b NRVLHTISHKVATENTDE(QLSAIiriHLIEKITTEGKI(QAFSVAS 1346
Score = 46 (7-2 bits)-, Expect = l-le-27-, Sum P(3) = l-le-27 Identities = 23/115 (20*)-, Positives = 44/115 (36*) (Query : 1411 TAYEXXXXXXXXXXXXXXXAGSFGAVNVYIdKAS
PDSAGLEDFKPSHGILEFAD 1551
TA + + G + + GA++V Id P+ + + P +
G L F + Sbjct : 5S4
TAFlQLMNITAGTSHVMISRRGTYGALSVAldTTGYAPGLEIPEFIVVGNMTPTLGSLSFSH bl3
Query : 1SS2 KώVTAHIEITIIDDAEFELTETFNISLI-- SVAGGGRLGDDVVVTVVIP(QNDSPFGVFGF IbOI + + + + ++S + S GG +L +V +
P GVF F
Sb jct : bl 4 GE(3RKGVFLIdTFPSPGIdPEAFVLHLSGV(3SSAPGGA(3LRSGFIVAEI
EPNGVFtQF bbβ
Pedant information for DKFZphamy2_10p7, frame 3
Report for DKFZphamy2_10p7.3
[LENGTH] 1615
[MW] 177600.58
[pI] 4.37
[HOMOL] TREMBL:AF055084_1 gene: "VLGR1"; product: "very large G-protein coupled receptor-1"; Homo sapiens very large G-protein coupled receptor-1 (VLGR1) mRNA, complete cds. 5e-24
[BLOCKS] BPD1413A
[BLOCKS] BL00713B Sodium:dicarboxylate symporter family proteins
[BLOCKS] PR01D03A
[BLOCKS] PR00412C
[BLOCKS] BL00624E
[PIRKW] heart 1e-08
[PIRKW] ion transport 1e-08
[PIRKW] transmembrane protein 3e-06
[PIRKW] phosphoprotein 2e-08
[PIRKW] membrane protein 1e-06
[PROSITE] MULTICOPPER_OXIDASE1 1
[KW] All_Beta
[KW] LOW_COMPLEXITY 2.60 %
SE(3 DAIdADAIdALYTCATLCLKE(3ACSAFSFFSASEGP(3CFUriTSωiSPAVNNSDFIdTYRKNπT SEG xxxxxxxxxxx
PRD ccchhhhhhhhchhhhhhhhhhheeeeeecccccceeeeeeeccccccccceeeecccee
SE(3 RVASLFSGfiAVAGSDYEPVTRflldAIMiQEGDEFANLTVSILPDDFPEMDESFLISLLEVHL
SEG PRD eeeeeccccccccccceeeceeeeeeccccceeeeeeeeccccccchhhhhhhhhhhhhh
SE(Q MNISASLKN<QPTIG<QPNISTVVIALNGDAFGVFVIYSISPNTSEDGLFVEV(QE(QP(QTLVE
SEG
PRD hccccccccccccccccceeeeeeecccceeeeeeeeecccccccceeeeeeecccceee
SE(Q LMIHRTGGSLGώVAVEldRVVGGTATEGLDFIGAGEILTFAEGETKKTVILTILDDSEPED
SEG xxxxxxxxx
PRD eeeeecccccceeeeeeecccccccccccccccceeeeeccccceeeeeeeeeccccccc SE(Q DESIIVSLVYTEGGSRILPSSDTVRVNILANDNVAGIVSFiQTASRSVIGHEGEILiQFHVI
SEG xxxxxxx
PRD ccceeeeeeeccccccccccccceeeeeeccccceeeeeeeccceeeeccccceeeeeee
SE(Q RTFPGRGNVTVNIdKIIG(3NLELNFANFSG(QLFFPEGSLNTTLFVHLLDDNIPEEKEVY(3V
SEG
PRD eccccccceeeeeeeecccccccccccccceeecccceeeeeeeeeecccccccccceee SE(Q ILYDVRT(QGVPPAGIALLDA(QGYAAVLTVEASDEPHGVLNFALSSRFVLL(QEANITI(QLF
SEG
PRD eeccceeeeccchhhhhhhhccccceeeeeecccccceeeeeeceeeeeecccccceeee
SE(Q INREFGSLGAINVTYTTVPGHLSLKNiQTVGNLAEPEVDFVPIIGFLILEEGETAAAINIT SEG
PRD cccccccceeeeeeecccccccccccccccccccccceeeeeeeeeeeccccccccceee
SE(3 ILEDDVPELEEYFLVNLTYVGLTMAASTSFPPRLDSEGLTAiQVIIDANDGARGVIEIdlQiQS
SEG PRD eccccchhhhhheeeeeeeecceeecccccccccccccceeeeeeeccccceeeeeeccc
SE(Q RFEVNETHGSLTLVA(QRSREPLGHVSLFVYA(QNLEA(QVGLDYIFTPπiLHFADGERYKNV
SEG
PRD eeeecccccceeeeeeccccccceeeeeeeeccccccccccccccceeeecccccceeee
SE(Q NIMILDDDIPEGDEKFiQLILTNPSPGLELGKNTIALIIVLANDDGPGVLSFNNSEHFFLR
SEG
PRD eeeeeccccccccceeeeeeeccccccccccceeeeeeeecccccceeeeeeccceeeee SE(Q EPTALYV(QESVAVLYIVREPA(QGLFGTVTV(QFIVTEVNSSNESKDLTPSKGYIVLEEGVR
SEG
PRD ccceeeeccchhhhhhhhhcccccceeeeeeeeeeeccccccccccccccceeeeeccce
SE(Q FKAL(QISAILDTEPEMDEYFVCTLFNPTGGARLGVHV(3TLITVL(3N(QAPLGLFSISAVEN SEG
PRD eeeeeeeeecccchhhhhhheeeeecccccceeehhhhhhhhhhhhhcccceeeeeecch
SE(Q RATSIDIEEANRTVYLNVSRTNGIDLAVSV(QldETVSETAFGf1RGHDVVFSVF(QSFLDESA
SEG PRD hhhhhccccccceeeeeeeccccchhhhheeeeeccceeeeccccceeeeeeeecccccc
SElQ SGldCFFTLENLIYGiriLRKSSVTVYRldlQGIFIPVEDLNIENPKTCEAFNIGFSPYFVITH
SEG
PRD cceeeeeccccccceeecccceeeecccceeeccceeeeccccccceeecccccceeeee
SE(Q EERNEEKPSLNSVFTFTSGFKLFLV(QTIIILESS(QVRYFTSDS(QDYLIIAS(QRDDSELT(Q
SEG
PRD hhhhhcccceeeeeeecccceeeeeceeecccccceeeeccccceeeeeeecccccceee SE(Q VFRIdNGGSFVLH(3KLPVRGVLTVALFNKGGSVFLAIS(QANARLNSLLFRIdSGSGFINF(QE
SEG
PRD eeeeccceeeeeeccccceeeeeeeeccccceeeeeeehhhhhheeeeeecccccceeee
SE(Q VPVSGTTEVEALSSANDIYLIFAKNVFLGD(3NSIDIFIldEriG(QSSFRYF(QSVDFAAVNRI SEG
PRD eeccccceeeeccccceeeeeeeeeeeecccceeeeeeeeccccceeeeeeccceeeece
SE(Q HSFTPASGIAHILLIG(QDMSALYCIdNSERN(QFSFVLEVPSAYDVASVTVKSLNSSKNLIA SEG
PRD eecccccceeeeeeeccccceeeeecccccceeeeeeeccccceeeeeeeccccccceee
SE(3 LVGAHSHIYELAYISSHSDFIPSSGELIFEPGEREATIAVNILDDTVPEKEESFKV(3LKN SEG
PRD eeccceeeeeeeeeeccccccccceeeeecccchhhhheeeeeccccccccceeeeeeec
SEG PKGGAEIGINDSVTITILSNDDAYGIVAFAflNSLYKtQVEEMEtQDSLVTLNVERLKGTYGR
SEG PRD ccccceeecccceeeeeecccccchhhhhhccchhhhhhhhhhhhhhhhhhhccccceee
SE(Q ITIAldEADGSISDIFPTSGVILFTEGlQVLSTITLTILADNIPELSEVVIVTLTRITTEGV
SEG
PRD eeeeeeeccceeeeeccccceeeeccccccceeeeeecccceeeeeeeeeeeeeeceeee
SE(Q EDSYKGATID(3DRSKSVITTLPNDSPFGLVGldRAASVFIRVAEPKENTTTL(QL(QIARDKG
SEG
PRD cceeeeeeeecccceeeeeecccccccceeehhhhhheeeeeccccccccceeeeccccc SE(Q LLGDIAIHLRA(QPNFLLHVDN<QATENEDYVL(QETIIIHKENIKEAHAEVSILPDDLPELE
SEG
PRD ccccccceeecccceeeeeccccccccceeeeeeeeeecccchhhhheeeeccccccccc
SEβ EGFIVTITEVNLVNSDFSTGIQPSVRRPGΓIEIAEII IEENDDPRGIFΓIFHVTRGAGEVITA SEG
PRD cceeeeeeeeeeccccccccccccccccchhhhhhhhcccccceeeeeeeeccccceeee
SE(Q YEVPPPLNVL(QVPVVRLAGSFGAVNVYldKASPDSAGLEDFKPSHGILEFADK(QVTAI1IEI
SEG - - xxxxxxxxxxxxxxx PRD eeccccceeeeeeeeeecccccceeeeeeccccccccccccccceeeeecccceeeecce
SE(Q TIIDDAEFELTETFNISLISVAGGGRLGDDVVVTVVIPiQNDSPFGVFGFEEKTVS
SEG
PRD eeechhhhhhhhcceeeeeeecccccccceeeeeeeecccccccceeeecccccc
Prosite for DKFZphamy2_10p7.3
PS00079 151->172 MULTICOPPER_OXIDASE1 PDOC00076
(No Pfam data available for DKFZphamy2_10p7.3)
DKFZphamy2_lld2
group: transmembrane protein
DKFZphamy2_lld2 encodes a novel 552 amino acid protein without similarity to known proteins. The novel protein contains 2 transmembrane regions.
No informative BLAST results; no predictive Prosite, Pfam or SCOP motif.
The new protein can find application in studying the expression profile of amygdala-specific genes and as a new marker for amygdala cells.
unknown protein
Pedant: TRANSMEMBRANE 2
Sequenced by EMBL
Locus: /map="16p13.3"
Insert length: 2939 bp
Poly A stretch at pos. 2920, polyadenylation signal at pos. 2869
1 GGCGGGTGAG AGGCCGCGGC GGCAGGTCCA CCTGGGCTTG CGAAGGCACA
51 GATTCCCCGT CCACAGCTCA CGACCAGATG CACCAGCAGG AGTCCACATC
101 GAGGACGTCC TCCGGGCACT CCCACGACCA GTGACCAGGA GTTAAACTTT
151 GGGATGTGCC CGTGATGTTG GACCACAAGG ACTTAGAGGC CGAAATCCAC
201 CCCTTGAAAA ATGAAGAAAG AAAATCGCAG GAAAATCTGG GAAATCCATC
251 AAAAAATGAG GATAACGTGA AAAGCGCGCC TCCACAGTCC CGGCTCTCCC
301 GGTGCCGAGC GGCGGCGTTT TTTCTTTCAT TGTTTCTCTG CCTTTTTGTG
351 GTGTTCGTCG TCTCATTCGT CATCCCGTGT CCAGACCGGC CGGCGTCACA
401 GCGAATGTGG AGGATAGACT ACAGTGCCGC TGTTATCTAT GACTTTCTGG
451 CTGTGGATGA TATAAACGGG GACAGGATCC AAGATGTTCT TTTTCTTTAT
501 AAAAACACCA ACAGCAGCAA CAATTTCAGC CGATCCTGTG TGGACGAAGG
551 CTTTTCCTCT CCCTGCACCT TTGCAGCTGC TGTGTCGGGG GCCAACGGCA
601 GCACGCTCTG GGAGAGACCT GTGGCCCAAG ACGTGGCCCT CGTGGAGTGT
651 GCTGTGCCCC AGCCAAGAGG CAGTGAGGCA CCTTCTGCCT GCATCCTGGT
701 GGGCAGACCC AGTTCTTTCA TTGCAGTCAA CTTGTTCACA GGGGAAACCC
751 TGTGGAACCA CAGCAGCAGC TTCAGCGGGA ATGCGTCCAT CCTGAGCCCT
801 CTGCTGCAGG TGCCTGATGT GGACGGCGAT GGGGCCCCAG ACCTGCTGGT
851 TCTCACCCAG GAGCGGGAGG AGGTTAGTGG CCACCTCTAC TCCGGCAGCA
901 CCGGGCACCA GATTGGCCTC AGAGGCAGCC TTGGTGTGGA CGGGGAAAGT
951 GGCTTCCTCC TTCACGTCAC CAGGACAGGT GCCCACTACA TCCTCTTTCC
1001 CTGCGCAAGC TCCCTCTGCG GCTGCTCTGT GAAGGGTCTC TACGAGAAGG
1051 TGACCGGGAG CGGCGGCCCG TTCAAGAGTG ACCCGCACTG GGAGAGCATG
1101 CTCAATGCCA CCACCCGCAG GATGCTTTCC CACAGCTCTG GAGCAGTGCG
1151 CTACCTGATG CATGTCCCAG GGAACGCCGG TGCAGATGTG CTTCTTGTGG
1201 GCTCAGAGGC CTTCGTGCTG CTGGACGGGC AGGAGCTGAC GCCTCGCTGG
1251 ACACCCAAGG CAGCCCATGT CCTGAGAAAA CCCATCTTCG GCCGCTACAA
1301 ACCAGACACC TTGGCTGTAG CCGTTGAAAA CGGAACTGGC ACCGACAGAC
1351 AGATCCTGTT TCTGGACCTT GGCACTGGAG CCGTCCTGTG TAGCCTAGCC
1401 CTCCCGAGCC TCCCTGGGGG TCCACTGTCC GCCAGCCTGC CGACCGCAGA
1451 CCACCGCTCA GCCTTCTTCT TCTGGGGCCT CCACGAGCTG GGGAGCACCA
1501 GCGAGACGGA GACCGGGGAG GCCCGGCACA GCCTGTACAT GTTCCACCCC
1551 ACCCTGCCGC GCGTGCTGCT GGAGCTGGCC AATGTCTCTA CCCACATTGT
1601 CGCCTTTGAC GCCGTCCTGT TTGAGCCAAG CCGCCACGCC GCCTACATCC
1651 TTCTGACAGG CCCGGCAGAC TCAGAGGCAC CCGGCCTGGT CTCTGTGATC
1701 AAGCACAAGG TGCGGGACCT TGTCCCAAGC AGCAGGGTGG TCCGCCTGGG
1751 TGAGGGTGGG CCAGACAGTG ACCAAGCCAT CAGGGACCGG TTCTCCCGGC
1801 TGCGGTACCA GAGTGAGGCG TAGAGGCACG CCAGCCAGAG CCTGTGGAGA
1851 GACTCCGCCT GCTGACACTA AACGTCCTGG GAAGTGGGCC CTTCCCTGGG
1901 TCTCTGCACT GACTCCCCCA CTCCTGACCC TGGTGATGGT CGCCACTGGG
1951 CAGCAGCAGC CTTACCAGTC CTCCATGATC ACACCCAGGG ACCTGCATGG
2001 GTGAGGGGAC ACCCTGGGCC TCTCTCCCGC CCAGCATCCT CCCTGAGTCC
2051 CCACACAGGG CCTCACTCTG CACCCCACCA GGGTCCCGCT CACACCAGGC
2101 AGCCTTCATA GTGGTCTCCC TGGCCACCTT GGGCAGAGCT GGGTCATGCA
2151 GCACCCCATC CTTACCCGGT GCCCTCTCCT TGCCAGCTTC TCCCCAGGCC
2201 AGAGCGGCCA TCGCGTAGAA AGAACCAGGG TGTCCCCGGG ACAGGCCGTC
2251 CCCCACCCCA TCCTGTAGAG TCCATTCCCC TTTTCCCTCC TGTGCTCTGT
2301 CCCCCAAGGA GTCATGGAAC TCAGGGTACT GGGCCTCAAC GGGAACCTGA
2351 GACAGCTTCC AGCTTCGCAG CCCTTCCCGG AGCTACAGGG GGATCCTCTA
2401 GCATGGGGGG TGTGACTTGG TTCCTTTGAC CAGGTCCTGT GAGGAAGCCT
2451 GGAGCAAGGG TCTCCCCCAG CAGGATGGGT GGGGCCTGCT CTGGAGCTGA
2501 GCCCGTGGCC GCTCACAGGT GTCCTTAGTG GTGTTGCAGC TGTCTACTGG
2551 CTGCATGTGC TGTGAATATC CCAAGGAACT GGCTGTGGAA TGCGTGTTTG
2601 GGTCAGTCTG TGCCCTCTCA GTAGACACTG GAGCTGCTCT GTCCCTGAAG
2651 AGGCCCCGTG CCCCAGGCAT GGCAAGCGCC TGCCTCTCCC CTTCCGGTGC
2701 TCACACGCCC ACGCCGTGCC ACCCGATGCA GGACTCACCT CTGTGCCTTG
2751 CTGCTCCTGA GGCCCAAGGG CAGCCATGGT GCTCTGTACT GCTCGGGCCG
2801 CCCAGGTCAC AGAGCCTGAG CTTCGTAGCC AAAGCAGCCT GATGACCCAC
2851 CCACCAAGGA AGAAAGCAGA ATAAACATTT TTGCACTGCC TGAAAAACCC
2901 CGGTGGTCAG GCGTGAGCCT AAAAAAAAAA AAAAAAAAA
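The poly A stretch and polyadenylation signal quoted for this insert can be located with a simple string search. The following is a minimal sketch, not the tool used for the original annotation: the function names are hypothetical, positions are reported 1-based from the first base of each feature, and the test string is the last 89 bases of the listing (rows 2851 and 2901). The report's own coordinates may follow a different offset convention, so a one-base difference from the listed positions would not be surprising.

```python
from typing import Optional

# Last 89 bases of the insert (listing rows 2851 and 2901), spaces removed.
TAIL = (
    "CCACCAAGGAAGAAAGCAGAATAAACATTTTTGCACTGCCTGAAAAACCC"
    "CGGTGGTCAGGCGTGAGCCTAAAAAAAAAAAAAAAAAAA"
)


def find_polya_signal(seq: str, offset: int = 1,
                      hexamer: str = "AATAAA") -> Optional[int]:
    """Return the 1-based position of the last hexamer occurrence, or None.

    `offset` is the sequence coordinate of seq[0].
    """
    idx = seq.upper().rfind(hexamer)
    return None if idx < 0 else idx + offset


def polya_tail_start(seq: str, offset: int = 1) -> Optional[int]:
    """Return the coordinate of the first A in the trailing run of A's."""
    stripped = seq.upper().rstrip("A")
    if len(stripped) == len(seq):
        return None  # sequence does not end in A
    return len(stripped) + offset


if __name__ == "__main__":
    print(find_polya_signal(TAIL, offset=2851))
    print(polya_tail_start(TAIL, offset=2851))
```

Passing `offset=2851` maps string indices back to insert coordinates, so the same two functions can be applied to any window of the listing.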
BLAST Results
No BLAST result
Medline entries
No Medline entry
Peptide information for frame 2
ORF from 2555 bp to 2839 bp; peptide length: 95
Category: questionable ORF
Classification: unclassified
1 MCCEYPKELA VECVFGSVCA LSVDTGAALS LKRPRAPGMA SACLSPSGAH
51 TPTPCHPMQD SPLCLAAPEA QGQPWCSVLL GPPRSQSLSF VAKAA
BLASTP hits
No BLASTP hits available
Alert BLASTP hits for DKFZphamy2_lld2, frame 2
TREMBL:MMIGCF_2 Mouse ig gamma2a-b (c57bl/6 allele) c gene and secreted tail, N = 1, Score = 73, P = 0.1
>TREMBL:MMIGCF_2 Mouse ig gamma2a-b (c57bl/6 allele) c gene and secreted tail. Length = 334
HSPs:
Score = 73 (11.0 bits), Expect = 1.1e-01, P = 1.0e-01
Identities = 16/49 (32%), Positives = 27/49 (55%)
Query:  44 LSPSGAHTPTPCHPMQDSPLCLAAPEAQGQPWCSVLLGPPRSQSLSFVA 92
           + P    T  PC P+++ P C AAP+  G P  SV +  PP+ + + ++
Sbjct:  96 IEPRVPITQNPCPPLKECPPC-AAPDLLGGP--SVFIFPPKIKDVLMIS 141
Peptide information for frame 3
ORF from 165 bp to 1820 bp; peptide length: 552
Category: putative protein
Classification: Transmembrane proteins unclassified
1 MLDHKDLEAE IHPLKNEERK SQENLGNPSK NEDNVKSAPP QSRLSRCRAA
51 AFFLSLFLCL FVVFVVSFVI PCPDRPASQR MWRIDYSAAV IYDFLAVDDI
101 NGDRIQDVLF LYKNTNSSNN FSRSCVDEGF SSPCTFAAAV SGANGSTLWE
151 RPVAQDVALV ECAVPQPRGS EAPSACILVG RPSSFIAVNL FTGETLWNHS
201 SSFSGNASIL SPLLQVPDVD GDGAPDLLVL TQEREEVSGH LYSGSTGHQI
251 GLRGSLGVDG ESGFLLHVTR TGAHYILFPC ASSLCGCSVK GLYEKVTGSG
301 GPFKSDPHWE SMLNATTRRM LSHSSGAVRY LMHVPGNAGA DVLLVGSEAF
351 VLLDGQELTP RWTPKAAHVL RKPIFGRYKP DTLAVAVENG TGTDRQILFL
401 DLGTGAVLCS LALPSLPGGP LSASLPTADH RSAFFFWGLH ELGSTSETET
451 GEARHSLYMF HPTLPRVLLE LANVSTHIVA FDAVLFEPSR HAAYILLTGP
501 ADSEAPGLVS VIKHKVRDLV PSSRVVRLGE GGPDSDQAIR DRFSRLRYQS
551 EA
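The ORF coordinates, reading frame and peptide length in these reports are tied together by simple arithmetic, which is useful for checking a listing. The sketch below assumes 1-based, inclusive coordinates that exclude the stop codon; the function name is hypothetical and is not part of the original annotation pipeline.

```python
from typing import Tuple


def orf_summary(start_bp: int, end_bp: int) -> Tuple[int, int]:
    """Return (reading_frame, peptide_length) for a 1-based inclusive ORF.

    The frame is 1, 2 or 3 depending on where the first codon starts
    relative to base 1 of the insert.
    """
    span = end_bp - start_bp + 1
    if span <= 0 or span % 3 != 0:
        raise ValueError("ORF span must be a positive multiple of 3")
    frame = (start_bp - 1) % 3 + 1
    return frame, span // 3


if __name__ == "__main__":
    # Coordinates as read from the two ORF reports for this clone.
    print(orf_summary(165, 1820))
    print(orf_summary(2555, 2839))
```

With the coordinates read as 165 to 1820 and 2555 to 2839, the function reproduces the frame numbers and peptide lengths given in the two peptide reports for this clone.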
BLASTP hits
No BLASTP hits available
Alert BLASTP hits for DKFZphamy2_lld2, frame 3
No Alert BLASTP hits found
Pedant information for DKFZphamy2_lld2, frame 2
Report for DKFZphamy2_lld2.2
[LENGTH] 95
[MU! 1757-36
[pI] 6.68
[BLOCKS] PR00521E
[KW] Alpha_Beta
SEQ MCCEYPKELAVECVFGSVCALSVDTGAALSLKRPRAPGMASACLSPSGAHTPTPCHPMQD
PRD cccchhhhhhhhhccceeeeeecccchhhhhccccccccccccccccccccccccccccc
SEQ SPLCLAAPEAQGQPWCSVLLGPPRSQSLSFVAKAA
PRD ccccccccccccccceeeeccccccchhhhhhccc
(No Prosite data available for DKFZphamy2_lld2.2)
(No Pfam data available for DKFZphamy2_lld2.2)
Pedant information for DKFZphamy2_lld2, frame 3
Report for DKFZphamy2_lld2.3
[LENGTH] 552
[Mid! 51bS1-b6
[pI] 5.84
[BLOCKS] PRD0211G
[BLOCKS] BL00288C Tissue inhibitors of metalloproteinases proteins
[BLOCKS] PR00436A
[KW] TRANSMEMBRANE 2
[KW] LOW_COMPLEXITY 8.15 %
SEQ MLDHKDLEAEIHPLKNEERKSQENLGNPSKNEDNVKSAPPQSRLSRCRAAAFFLSLFLCL
SEG xxxxxxxxx
PRD ccchhhhhhhcccccccccccccccccccccccccccccccchhhhhhhhhhhhhhhhhh
MEM MMMMMMM
SEQ FVVFVVSFVIPCPDRPASQRMWRIDYSAAVIYDFLAVDDINGDRIQDVLFLYKNTNSSNN
SEG xxxxxxxxx
PRD hhhhhhhccccccccccchhhhhhhchhhhhhhhhccccccccchhhhhhhccccccccc
MEM MMMMMMMMMM
SEQ FSRSCVDEGFSSPCTFAAAVSGANGSTLWERPVAQDVALVECAVPQPRGSEAPSACILVG
SEG
PRD ccccccccccccccccccccccccccccccccccchhhhhhhccccccccccceeeeeec
MEM
SEQ RPSSFIAVNLFTGETLWNHSSSFSGNASILSPLLQVPDVDGDGAPDLLVLTQEREEVSGH
SEG
PRD cccceeeeeccccccccccccccccceeeecceeecccccccccccchhhhhhhhhhhcc
MEM
SEQ LYSGSTGHQIGLRGSLGVDGESGFLLHVTRTGAHYILFPCASSLCGCSVKGLYEKVTGSG
SEG
PRD cccccccccccccccccccccceeeeeeecccceeeeeccccccccccceeeeccccccc
MEM
SEQ GPFKSDPHWESMLNATTRRMLSHSSGAVRYLMHVPGNAGADVLLVGSEAFVLLDGQELTP
SEG
PRD ccccccccccccchhhhhhhhhcccccceeeccccccccceeeeeccceeeeeccccccc
MEM
SEQ RWTPKAAHVLRKPIFGRYKPDTLAVAVENGTGTDRQILFLDLGTGAVLCSLALPSLPGGP
SEG xxxxxxxxxxx
PRD ccchhhhhhcccccccccccceeeeeeccccccceeeeeeeccccceeeeeeeccccccc
MEM MMMMMMMMMMMMMMMMM
SEQ LSASLPTADHRSAFFFWGLHELGSTSETETGEARHSLYMFHPTLPRVLLELANVSTHIVA
SEG xxxxxx xxxxxxxxxx
PRD ccccccccccccceeeccccccccccccccccccccceeeccccccccccccccceeeee
MEM
SEQ FDAVLFEPSRHAAYILLTGPADSEAPGLVSVIKHKVRDLVPSSRVVRLGEGGPDSDQAIR
SEG
PRD eeeeeeccccceeeeeecccccccccceeeeeecccccccccceeeeecccccccchhhh
MEM
SEQ DRFSRLRYQSEA
SEG
PRD hhhhhhhhhccc
MEM
(No Prosite data available for DKFZphamy2_lld2.3)
(No Pfam data available for DKFZphamy2_lld2.3)
DKFZphamy2_lln4
group: nucleic acid management
DKFZphamy2_lln4 encodes a novel 1091 amino acid protein with similarity to RAD18 of Schizosaccharomyces pombe and YLR383w of Saccharomyces cerevisiae.
The novel protein contains an ATP/GTP-binding site motif A (P-loop). It has similarity to RAD18, which acts in a DNA repair pathway for removal of UV-induced DNA damage. YLR383w of Saccharomyces cerevisiae is a recombination repair protein.
The new protein can find application in modulation of DNA repair and as a new tool for manipulation of nucleic acids.
similarity to RAD18 (Schizosaccharomyces pombe)
comment on P53692:
FUNCTION: ACTS IN A DNA REPAIR PATHWAY FOR REMOVAL OF UV-INDUCED DNA DAMAGE THAT IS DISTINCT FROM CLASSICAL NUCLEOTIDE EXCISION REPAIR AND IN REPAIR OF IONIZING RADIATION DAMAGE.
Sequenced by EMBL
Locus: /map="2"
Insert length: 3679 bp
Poly A stretch at pos. 3646, polyadenylation signal at pos. 3620
1 ACCGCGGTGG GCGCCGGGGC TCCCGGGAAT CTACCTTCTC CTGCGGCCGG
51 CACGCGGTTC CCAGGGGGCC AGCGGCGGTC AGCCGAGGTC GAGACGCCCG
101 CAGGGTGGCC TTAGCGGCCG GTCGTACCAC GGCAGCCCCG CCGATCAGGT
151 TCCTTTGGGA GACTTCGACT TGTTGGCGAA ATGAACCGGA GAAGAATCCC
201 AATTGGGAAT TGCGGAAAAC AGGACTCTAG GGTAGAGAAA GGTTGTAGAA
251 CCAATAGGGT TTGAGACCTG ATGGCCAAAA GAAAGGAAGA AAATTTTTCC
301 TCTCCTAAAA ATGCCAAAAG GCCAAGACAA GAAGAATTGG AGGATTTTGA
351 TAAAGATGGT GACGAAGACG AATGTAAAGG TACTACTTTG ACTGCAGCAG
401 AAGTTGGAAT AATTGAGAGT ATTCACCTAA AAAACTTCAT GTGTCATTCA
451 ATGCTTGGAC CTTTTAAGTT TGGTTCTAAT GTCAACTTTG TTGTTGGCAA
501 CAATGGAAGT GGGAAGAGTG CAGTACTCAC AGCTCTCATA GTCGGTCTTG
551 GTGGAAGAGC AGTTGCTACT AATAGAGGAT CCTCTTTAAA AGGTTTTGTG
601 AAAGATGGAC AGAACTCTGC AGATATCTCA ATAACATTGA GGAACAGAGG
651 AGATGATGCC TTTAAAGCCA GTGTGTATGG TAACTCTATA CTTATACAGC
701 AACACATCAG CATAGATGGA AGTCGATCTT ATAAACTTAA AAGTGCAACA
751 GGCTCCGTGG TTTCCACGAG GAAAGAAGAG CTGATTGCAA TTCTTGATCA
801 TTTTAACATC CAGGTGGATA ATCCAGTTTC TGTTTTAACA CAAGAAATGA
851 GCAAGCAGTT CTTACAGTCT AAAAATGAAG GAGACAAATA CAAATTCTTC
901 ATGAAAGCAA CGCAACTTGA ACAGATGAAG GAAGATTATT CATACATTAT
951 GGAAACGAAA GAAAGAACAA AGGAGCAGAT ACATCAAGGA GAAGAGCGGC
1001 TTACTGAACT AAAGCGCCAG TGTGTAGAGA AAGAGGAACG TTTTCAAAGT
1051 ATTGCTGGTT TAAGTACAAT GAAGACTAAT TTAGAGTCCT TGAAACATGA
1101 AATGGCTTGG GCAGTGGTCA ATGAAATTGA AAAACAATTG AATGCCATCA
1151 GAGATAATAT CAAAATTGGA GAAGATCGTG CTGCTAGACT TGACAGGAAA
1201 ATGGAAGAAC AGCAGGTCAG ACTTAATGAG GCAGAACAAA AGTACAAGGA
1251 TATTCAAGAC AAACTAGAAA AGATTAGTGA AGAGACAAAT GCACGAGCAC
1301 CAGAATGTAT GGCATTGAAA GCAGATGTTG TTGCTAAGAA AAGGGCCTAT
1351 AATGAAGCTG AGGTTTTATA TAACCGATCC TTAAACGAAT ATAAAGCATT
1401 AAAGAAAGAT GATGAGCAGC TTTGTAAACG AATTGAAGAG CTGAAAAAAA
1451 GTACTGACCA ATCTTTGGAA CCTGAACGGT TGGAAAGACA AAAAAAAATA
1501 TCTTGGTTAA AAGAGAGAGT AAAGGCCTTT CAAAATCAAG AAAATTCAGT
1551 CAATCAAGAG ATCGAACAGT TTCAGCAAGC CATAGAAAAG GACAAAGAAG
1601 AACATGGCAA AATTAAGAGA GAAGAATTAG ATGTGAAGCA TGCACTGAGC
1651 TACAATCAGA GGCAACTGAA AGAATTGAAA GATAGTAAAA CTGATCGACT
1701 CAAAAGATTT GGCCCTAATG TTCCAGCTCT TCTTGAAGCC ATAGATGATG
1751 CTTATAGACA AGGACATTTT ACCTATAAAC CTGTAGGCCC TTTAGGAGCT
1801 TGCATTCATC TTCGGGACCC AGAACTTGCT TTGGCTATTG AATCTTGCTT
1851 AAAAGGGCTT CTGCAGGCCT ATTGTTGCCA TAATCATGCT GATGAAAGGG
1901 TCCTTCAGGC ACTCATGAAA AGGTTTTATT TACCAGGGAC CTCACGGCCA
1951 CCGATAATAG TTTCTGAGTT TCGGAATGAG ATATATGATG TAAGACACAG
2001 AGCTGCTTAT CATCCAGACT TTCCAACAGT TCTGACAGCT TTAGAAATAG
2051 ATAATGCGGT TGTGGCAAAT AGCCTAATTG ACATGAGAGG CATAGAGACA
2101 GTGCTACTAA TCAAAAATAA TTCTGTAGCT CGTGCAGTAA TGCAGTCCCA
2151 AAAGCCACCC AAAAATTGTA GAGAAGCTTT TACTGCTGAT GGTGATCAAG
2201 TTTTTGCAGG ACGTTATTAT TCATCTGAAA ATACAAGACC TAAGTTCCTA
2251 AGCAGAGATG TGGATTCTGA AATAAGTGAC TTGGAGAATG AGGTTGAAAA
2301 TAAGACGGCC CAGATATTAA ATCTTCAGCA ACATTTATCT GCCCTTGAAA
2351 AAGATATTAA ACACAATGAG GAACTTCTTA AAAGGTGCCA ACTACATTAT
2401 AAAGAACTAA AGATGAAAAT AAGAAAAAAT ATTTCTGAAA TTCGGGAACT
2451 TGAGAACATA GAAGAACACC AGTCTGTAGA TATTGCAACT TTGGAAGATG
2501 AAGCTCAGGA AAATAAAAGC AAAATGAAAA TGGTTGAGGA ACATATGGAG
2551 CAACAAAAAG AAAATATGGA GCATCTTAAA AGTCTGAAAA TAGAAGCAGA
2601 AAATAAGTAT GATGCAATTA AATTCAAAAT TAATCAACTA TCGGAGCTAG
2651 CAGACCCACT TAAGGATGAA TTAAACCTTG CTGATTCTGA AGTGGATAAC
2701 CAAAAACGAG GGAAACGACA TTATGAAGAA AAACAAAAAG AACACTTGGA
2751 TACCTTAAAT AAAAAGAAAC GAGAACTGGA TATGAAAGAG AAAGAACTAG
2801 AGGAGAAAAT GTCACAAGCA AGACAAATCT GCCCAGAGCG TATAGAAGTA
2851 GAAAAATCTG CATCAATTCT GGACAAAGAA ATTAATCGAT TAAGGCAGAA
2901 GATACAGGCA GAACATGCTA GTCATGGAGA TCGAGAGGAA ATAATGAGGC
2951 AGTACCAAGA AGCAAGAGAG ACCTATCTTG ATCTGGATAG TAAAGTGAGG
3001 ACTTTAAAAA AGTTTATTAA ATTACTGGGA GAAATCATGG AGCACAGATT
3051 CAAGACATAT CAACAATTTA GAAGGTGTTT GACTTTACGA TGCAAATTAT
3101 ACTTTGACAA CTTACTATCT CAGCGGGCCT ATTGTGGAAA AATGAATTTT
3151 GACCACAAGA ATGAAACTCT AAGTATATCA GTTCAGCCTG GAGAAGGAAA
3201 TAAAGCTGCT TTCAATGACA TGAGAGCCTT GTCTGGAGGT GAACGTTCTT
3251 TCTCCACAGT GTGTTTTATT CTTTCCCTGT GGTCCATCGC AGAATCTCCT
3301 TTCAGATGCC TGGATGAATT TGATGTCTAC ATGGATATGG TTAATAGGAG
3351 AATTGCCATG GACTTGATAC TGAAGATGGC AGATTCCCAG CGTTTTAGAC
3401 AGTTTATCTT GCTCACACCT CAAAGCATGA GTTCACTTCC ATCCAGTAAA
3451 CTGATAAGAA TTCTCCGAAT GTCTGATCCT GAAAGAGGAC AAACTACATT
3501 GCCTTTCAGA CCTGTGACTC AAGAAGAAGA TGATGACCAA AGGTGATTTG
3551 TAACTTAACA TGCCTTGTCC TGATGTTGAA GGATTTGTGA AGGGAAAAAA
3601 AATTCTGGAC TCTTTGATAT AATAAAATGA GACTGGAGGC ATTCTGAAAA
3651 AAAAAAAAAA AAAAAAAAAA AAAAAAAAA
BLAST Results
No BLAST result
Medline entries
1b0b1417:
Lehmann AR, Walicka M, Griffiths DJ, Murray JM, Watts FZ, McCready S, Carr AM. The rad18 gene of Schizosaccharomyces pombe defines a new subgroup of the SMC superfamily involved in DNA repair. Mol Cell Biol 1995 Dec;15(12):7067-80
113601b7:
Mengiste T, Revenkova E, Bechtold N, Paszkowski J. An SMC-like protein is required for efficient homologous recombination in Arabidopsis. EMBO J 1999 Aug 16;18(16):4505-12
Peptide information for frame 1
ORF from 271 bp to 3543 bp; peptide length: 1091
Category: similarity to known protein
Classification: Nucleic acid management
Prosite motifs: RGD (126-128) ATP_GTP_A (76-83)
1 MAKRKEENFS SPKNAKRPRQ EELEDFDKDG DEDECKGTTL TAAEVGIIES
51 IHLKNFMCHS MLGPFKFGSN VNFVVGNNGS GKSAVLTALI VGLGGRAVAT
101 NRGSSLKGFV KDGQNSADIS ITLRNRGDDA FKASVYGNSI LIQQHISIDG
151 SRSYKLKSAT GSVVSTRKEE LIAILDHFNI QVDNPVSVLT QEMSKQFLQS
201 KNEGDKYKFF MKATQLEQMK EDYSYIMETK ERTKEQIHQG EERLTELKRQ
251 CVEKEERFQS IAGLSTMKTN LESLKHEMAW AVVNEIEKQL NAIRDNIKIG
301 EDRAARLDRK MEEQQVRLNE AEQKYKDIQD KLEKISEETN ARAPECMALK
351 ADVVAKKRAY NEAEVLYNRS LNEYKALKKD DEQLCKRIEE LKKSTDQSLE
401 PERLERQKKI SWLKERVKAF QNQENSVNQE IEQFQQAIEK DKEEHGKIKR
451 EELDVKHALS YNQRQLKELK DSKTDRLKRF GPNVPALLEA IDDAYRQGHF
501 TYKPVGPLGA CIHLRDPELA LAIESCLKGL LQAYCCHNHA DERVLQALMK
551 RFYLPGTSRP PIIVSEFRNE IYDVRHRAAY HPDFPTVLTA LEIDNAVVAN
601 SLIDMRGIET VLLIKNNSVA RAVMQSQKPP KNCREAFTAD GDQVFAGRYY
651 SSENTRPKFL SRDVDSEISD LENEVENKTA QILNLQQHLS ALEKDIKHNE
701 ELLKRCQLHY KELKMKIRKN ISEIRELENI EEHQSVDIAT LEDEAQENKS
751 KMKMVEEHME QQKENMEHLK SLKIEAENKY DAIKFKINQL SELADPLKDE
801 LNLADSEVDN QKRGKRHYEE KQKEHLDTLN KKKRELDMKE KELEEKMSQA
851 RQICPERIEV EKSASILDKE INRLRQKIQA EHASHGDREE IMRQYQEARE
901 TYLDLDSKVR TLKKFIKLLG EIMEHRFKTY QQFRRCLTLR CKLYFDNLLS
951 QRAYCGKMNF DHKNETLSIS VQPGEGNKAA FNDMRALSGG ERSFSTVCFI
1001 LSLWSIAESP FRCLDEFDVY MDMVNRRIAM DLILKMADSQ RFRQFILLTP
1051 QSMSSLPSSK LIRILRMSDP ERGQTTLPFR PVTQEEDDDQ R
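The two Prosite motifs reported for this peptide can be re-checked with regular expressions. The sketch below is an illustration under stated assumptions, not the Prosite scanner itself: the P-loop pattern is the commonly cited form of ATP_GTP_A, `[AG]-x(4)-G-K-[ST]`, written as a Python regex; RGD is the literal tripeptide; the function name is hypothetical; and only the first 130 residues of the frame 1 peptide are scanned.

```python
import re
from typing import Dict, List, Tuple

# First 130 residues of the frame 1 peptide (listing rows 1, 51 and 101).
N_TERM = (
    "MAKRKEENFSSPKNAKRPRQEELEDFDKDGDEDECKGTTLTAAEVGIIES"
    "IHLKNFMCHSMLGPFKFGSNVNFVVGNNGSGKSAVLTALIVGLGGRAVAT"
    "NRGSSLKGFVKDGQNSADISITLRNRGDDA"
)

# Prosite-style patterns rewritten as regular expressions (assumed forms).
MOTIFS: Dict[str, str] = {
    "ATP_GTP_A": r"[AG].{4}GK[ST]",
    "RGD": r"RGD",
}


def scan_motifs(seq: str, motifs: Dict[str, str]) -> List[Tuple[str, int, int]]:
    """Return (motif_name, start, end) hits with 1-based inclusive positions."""
    hits = []
    for name, pattern in motifs.items():
        for match in re.finditer(pattern, seq):
            hits.append((name, match.start() + 1, match.end()))
    return hits


if __name__ == "__main__":
    for name, start, end in scan_motifs(N_TERM, MOTIFS):
        print(f"{name} ({start}-{end}): {N_TERM[start - 1:end]}")
```

Note that `re.finditer` returns non-overlapping matches only, which is sufficient here because each motif occurs once in the scanned region.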
BLASTP hits
No BLASTP hits available
Alert BLASTP hits for DKFZphamy2_lln4, frame 1
SWISSPROT:RA18_SCHPO DNA REPAIR PROTEIN RAD18, N = 1, Score = 1021, P = 5.2e-103
PIR:S51470 hypothetical protein YLR383w - yeast (Saccharomyces cerevisiae), N = 1, Score = 623, P = 5e-82
>SWISSPROT:RA18_SCHPO DNA REPAIR PROTEIN RAD18.
Length = 1,140
HSPs:
Score = 1021 (153.2 bits), Expect = 5.2e-103, P = 5.2e-103
Identities = 315/1091 (28%), Positives = 540/1091 (49%)
(Query: 2 AKRKEENFSSPKNAKRPRfQEELEDF— KDGDEDECKGTTLTAAE
VGIIESIHLKN 55 A R ++N ++ + +E ++DG+ D T T +
VG+IE IHL N Sbjct: 45 ASRN(QDNRPER(QSRL(QRSSSLIE(3VRGNEDGENDVLN(3TRETNSNFDNRVGVIECIHLVN 1D4 βuery: 5b
FMCHSMLGPXXXXXXXXXXXXXXXXXXXAVLTALIVGLGGRAVATNRGSSLKGFVKDGiQN US FMCH L A+LT L + LG +A TNR ++K
VK G+N
Sbjct: IDS FMCHDSL- KINFGPRINFVIGHNGSGKSAILTGLTICLGAKASNTNRAPNMKSLVKiQGKN lb3
(Query : lib
SADISITLRNRGDDAFKASVYGNSILI(Q(QHISIDGSRSYKLKSATGSVVSTRKEELIAIL 17S A IS+T+ NRG +A++ +YG SI I++ I +GS Y+L+S G+V+ST+++EL I Sbj ct : lb4 YARISVTISNRGFEAYώPEIYGKSITIERTIRREGSSEYRLRSFNGTVISTKRDELDNIC 223
(Query- 17b DHFNI(QVDNPVSVLT(2EMSK<QFL(QSKNEGDKYKFFMKATι2LE(QMKEDYSYIMETKERTKE 235
DH +(Q + DNP + + + LT(Q+ + + (QFL + + +KY+ FMK (QL + (Q + + E + YS I
+ + TK
Sb jct : 224
DHMGL(3IDNPMNILT(3DTAR(3FLGNSSPKEKY(3LFMKGI(3LK(3LEENYSLIE(QSLINTKN 263
(Query : 23b
(QIH(QGEERLTELKR(QCVEKEERF(QSIAGLSTMKTNLESLKHEMAI AVVNEIEK(QLNAIRD 215 + + ++ L ++ E + ++ + LE K EM UJA V
E+EK+L Sb jct : 264
VLGNKKTGVSYLAKKEEEYKLLIdE(QSRETENLHNLLE(QKKGEMVIdA(QVVEVEKEL 336
(Query : 21b NIKIGEDRAARLDRKMEE(Q (QVRLNEAE(QKYKDI(QDKLEKISEETNARAP- ECMALKADVV 354 + E + K+ E + L DI K+ EE RA E
K +
Sbjct : 331 --LLAEKEF(QHAEVKLSEAKENLESIVTN(QSDIDGKISS- KEEVIGRAKGETDTTKSKFE 315 ώuery: 355
AKKRAYNEAEVLYNRSLNEYKALKKDDE(QLCKRIEELKKSTD(QSLEPERLER(QKKISIdLK 414
+ ++ Y +N+ K+D + I K D E ER ++ +
Sbj ct : 31b DIVKTFDG YRSEMNDVDI(QKRDI(QN — -
SINAAKSCLDVYRElQLNTERARENNLGG 448
Query : 415 ERVKAF(QN(QENSVN(QEIE(QF-(Qf3AIEKDKE EHG KIKREELDVKHALS 4bD
+++ N+ N++ +EI +Q +E + + E G + ++
+ + +S
Sbj ct : 441
S(3IEKRANESNNL(3REIADLSE(3IVELESKRNDLHSALLEMGGNLTSLLTKKDSIANKIS 508
(Query : 4bl
YN(3R(3LKELKDSKTDRLKRFGPNVPALLEAIDDAYR(3GHFTYKPVGPLGACIHLRDPELA 520
LK L+D + D++ FG N+P LL+ I R+ F + P GP+G +
+++ + Sb jct : 501 D(QSEHLKVLEDV(QRDKVSAFGKNMPι3LLKLIT--- RETRF(QHPPKGPMGKYMTVKE(QKIdH 5bS
(Query: 521
LAIESCLKGLLiQAYCCHNHADERVLiQALMKRFYLPGTSRPPIIVSEFRNEIYDVRHRAAY 58D L IE L ++ + +H D+ +L+ LM++ T ++V +
YD ++
Sbjct: Sbb LIIERILGNVINGFIVRSHHD(QLILKELMR(QSNCHAT VVVGK
YDPFDYSSG bib (Query: 581 HPD—
FPTVLTALEIDNAVVANSLIDMRGIETVLLIKNNSVARAVM(QS(QKPPKNCREAFT b3δ
PD +PTVL ++ D+ V ++LI+ GIE +LLI++ A A M+ +
N + +
Sbjct : bl7 EPDSώYPTVLKIIKFDDDEVLHTLINHLGIEKMLLIEDRREAEAYMK— RGIANVTlQCYA b74
(Query : b31 ADG-DlQVFAGRYYSSENTR— PKFLSRDVDSEI — SDLENEVENKTA(QILNL(Q(QHLSAL b12
D ++ + R S++ + K + I S E E K L Q + ++
Sbjct : b75 LDPRNRGYGFRIVST(QRSSGISKVTPldNRPPRIGFSSSTSIEAEKKILDDLKK(3YNFASN 734
(Query : b13 E-KDIKHNEELLKRC(QLHYKELKMKIRKNIS-EIRELENIEEH(Q-SV-D--- IATLEDEA 745
+ + K + KR + E I+K I + RE+ + + E + SV D
I TLE
Sbjct : 735
<QLNEAKIE(QAKFKRDE(2LLVEKIEGIKKRILLKRREVNSLES(QELSVLDTEKI(3TLERRI 714
(Query : 74b (QENKSKMKMVEEHME(Q(QKENMEH-
LKSLKIEAENKYDAIKFKINfQLSELADPLKDELN-L 603
E + +++ ++ K N EH ++ + + + Kl ++
L+ EL+ L Sbjcf- 715 SETEKELESYAG(2L(QDAK-
NEEHRIRDN(QRPVIEEIRIYREKI(QTET(QRLSSL(3TELSRL 653 Query : 804
ADSEVDN(QKRGKRHYEEK(QKEHLDTLNXXXXXXXXXXXXXXXXXS(QAR(QICPERIEVEKS 8b3 D + +++ +RH + + + L ++A C
ER+ V+ S Sbjct : 654 RDEKRNSEVDIERH-R(QTVESCTNILREKEAKKV(QCA(QVVADYTAKANTRC- ERVPVlQLS 111
(Query : 6b 4 ASILDKEINRLRiQKIiQAEHASHG- DREEIMR(3Y(3EARETYLDLDSKVRTLKKFIKLLGEI 122 + LD El RL+ +1 G E+ Y A+E + V L +
+ + L E
Sbjct : 112 PAELDNEIERL(3M(3IAEIdRNRTGVSVE(QAAEDYLNAKEKHD(QAKVLVARLT(3LL(3ALEET 171 (Query: 123
MEHRFKTY(3(3FRRCLTLRCKLYFDNLLS(3RAYCGKMNFDHKNETLSISV(QPGEGNKA-AF 161 + R + + +FR+ +TLR K F+ LSiQR + GK+ H+ E L V P
N A A
Sbjct: 172 LRRRNEMIdTKFRKLITLRTKELFELYLS(QRNFTGKLVIKH(3EEFLEPRVYPANRNLATAH 1031
(Query : 162 N
DMRALSGGERSFSTVCFILSLldSIAESPFRCLDEFDVYMDMVNRRIAMDLIL 1034
N ++ LSGGE+SF+T+C +LS+U P RCLDEFDV+MD VNR +++ +++
Sb j ct : 1032 NRHEKSKVSVlQGLSGGEKSFATICMLLSIIdEAMSCPLRCLDEFDVFMDAVNRLVSIKMMV 1011
(Query : 1035 KMADS(QRFR(QFILLTP(QSMSSLPSSKLIRILRMSDPERG(QTTLP 1076 A +(QFI +TP(Q M + K + + R+SDP + LP
Sbj ct : 1012 DSAKDSSDK(3FIFITP(3DMG(3IGLDKDVVVFRLSDPVVSSSALP 1135
Pedant information for DKFZphamy2_lln4, frame 1
Report for DKFZphamy2_lln4.1
[LENGTH] 1091
[MW] 126326.13
[pI] 6.57
[HOMOL! SldISSPROT:RAlβ_SCHPO DNA REPAIR PROTEIN RADlβ- le-
1D1 [FUNCAT! 03-11 recombination and dna repair [S- cerevisiae-i
YLR383w! le-68
[FUNCAT! 08- D7 vesicular transport (golgi network-, etc) [S- cerevisiae-, YDLOSflw! 3e-lb
[FUNCAT! 30-03 organization of cytoplasm [S- cerevisiae-, YDLOSSw! 3e-lb
[FUNCAT! D1-13 biogenesis of chromosome structure [S- cerevisiae-, YLROδbw! 2e-14
[FUNCAT! 1 genome replication-, transcription-, recombination and repair [M- jannaschii-, MJlb43! 3e-14 [FUNCAT! 3D-04 organization of cytoskeleton [S- cerevisiae-,
YIL141c! le-12
[FUNCAT! D3-22 cell cycle control and mitosis [S- cerevisiae-,
YDR35bw! 8e-12 [FUNCAT! 01-10 nuclear biogenesis [S- cerevisiae-, YDR35bw! βe-12
[FUNCAT! 30-10 nuclear organization [S- cerevisiae-. YFLDOβw!
3e-ll [FUNCAT! 11-04 dna repair (direct repair-, base excision repair and nucleotide excision repair) [S- cerevisiae-. YKROISw! 2e-D1
[FUNCAT! 11 unclassified proteins [S- cerevisiae-, Y0R21bc!
5e-D1
[FUNCAT! 03-25 cytokinesis [S- cerevisiae-, YHR023ω MY01 - myosin-1 isoform! 8e-08
[FUNCAT! D3-D4 budding-, cell polarity and filament formation [S- cerevisiae-. YHRD23w MY01 - myosin-1 isoform! δe-08
[FUNCAT! Dδ-22 cytoskeleton-dependent transport [S- cerevisiae-.
YHR023W MY01 - myosin-1 isoform! 6e-0β [FUNCAT! 0b-D7 protein modification (glycolsylationπ acylation-, myristylation-, palmitylationi farnesylation and processing) [S- cerevisiae-, YKL201c! 2e-D7
[FUNCAT! 03-13 meiosis [S- cerevisiae-, YDR2δ5w! 4e-07
[FUNCAT! 30-13 organization of chromosome structure [S- cerevisiae-. YDR265w! 4e-07
[FUNCAT! 16 classification not yet clear-cut [S- cerevisiae-.
YJR134c! 7e-D7
[FUNCAT! Ob-10 assembly of protein complexes [S- cerevisiae-.
YPR141c! 7e-07 [FUNCAT! 30-05 organization of centrosome [S- cerevisiae-.
YPR141c! 7e-07
[FUNCAT! 11-01 stress response [S- cerevisiae-. YPR141c! 7e-D7
[FUNCAT! 03-D7 pheromone response! mating-type determination-, sex-specific proteins [S- cerevisiae-, YPR141c! 7e-D7 [FUNCAT! r general function prediction [H- influenzae-,
HI075b! le-Db
[FUNCAT! 10-D5-11 other pheromone response activities [S- cerevisiae-, YHR15βc! 2e-Db
[FUNCAT! 05-D4 translation (initiation! elongation and termination) [S- cerevisiae-, YALD35w! 3e-D4
[FUNCAT! 30-02 organization of plasma membrane [S- cerevisiae-.
YERDOβc! 4e-04
[FUNCAT! 08-lb extracellular transport [S- cerevisiae-.
YERDOSc! 4e-04 [FUNCAT! 01-04 biogenesis of cytoskeleton [S- cerevisiae-.
YKL171c! 7e-04
[FUNCAT! 03-22-01 cell cycle check point proteins [S- cerevisiae-. YGLOδbw! 7e-04
[FUNCAT! 06- DI nuclear transport [S- cerevisiae-. YDL207w! D-0D1 [FUNCAT! 04-07 rna transport [S- cerevisiae-, YDL207w! 0-001
[BLOCKS] BL00326C Tropomyosins proteins
[BLOCKS] PR01004B
[BLOCKS] BL00121A Colipase proteins
[BLOCKS] PF00560A
[SCOP] d2tmab_ 1.105.4.1.1 Tropomyosin [rabbit (Oryctolagus cuniculus)] 3e-06
[EC] 3.6.1.32 Myosin ATPase 1e-20
[PIRKW] phosphotransferase 1e-16
[PIRKW] nucleus 2e-10
[PIRKW] blocked amino end 2e-07
[PIRKW] citrulline 2e-10
[PIRKW] tandem repeat 1e-20
[PIRKW] heterodimer 3e-11
[PIRKW] endocytosis 2e-13
[PIRKW] heart 1e-20
[PIRKW] polymorphism 1e-10
[PIRKW] serine/threonine-specific protein kinase 1e-16
[PIRKW] transmembrane protein 8e-15
[PIRKW] zinc finger 2e-13
[PIRKW] metal binding 2e-13
[PIRKW] DNA binding 2e-06
[PIRKW] muscle contraction 1e-20
[PIRKW] acetylated amino end 3e-13
[PIRKW] actin binding 1e-20
[PIRKW] mitosis 8e-10
[PIRKW] microtubule binding 3e-01
[PIRKW] chromosomal protein 3e-11
[PIRKW] ATP 1e-20
[PIRKW] receptor 2e-06
[PIRKW] thick filament 1e-20
[PIRKW] phosphoprotein 2e-14
[PIRKW] glycoprotein 1e-10
[PIRKW] skeletal muscle 1e-18
[PIRKW] calcium binding 2e-10
[PIRKW] alternative splicing 3e-12
[PIRKW] DNA condensation 3e-11
[PIRKW] P-loop 1e-20
[PIRKW] coiled coil 1e-20
[PIRKW] heptad repeat 1e-10
[PIRKW] methylated amino acid 1e-20
[PIRKW] basement membrane 1e-10
[PIRKW] immunoglobulin receptor 4e-01
[PIRKW] peripheral membrane protein 2e-13
[PIRKW] cardiac muscle 1e-20
[PIRKW] extracellular matrix 1e-10
[PIRKW] hydrolase 1e-20
[PIRKW] microtubule 2e-10
[PIRKW] muscle 2e-14
[PIRKW] membrane protein 1e-10
[PIRKW] EF hand 2e-10
[PIRKW] cell division 8e-10
[PIRKW] cytoskeleton 1e-13
[PIRKW] hair 2e-10
[PIRKW] calmodulin binding 2e-13
[PIRKW] Golgi apparatus 6e-08
[PIRKW] smooth muscle 2e-07
[SUPFAM] conserved hypothetical P115 protein 4e-26
[SUPFAM] myosin heavy chain 1e-20
[SUPFAM] unassigned Ser/Thr or Tyr-specific protein kinases 1e-16
[SUPFAM] centromere protein E 3e-01
[SUPFAM] calmodulin repeat homology 2e-10
[SUPFAM] alpha-actinin actin-binding domain homology 7e-07
[SUPFAM] myosin motor domain homology 1e-20
[SUPFAM] tropomyosin 5e-06
[SUPFAM] plectin 7e-07
[SUPFAM] pleckstrin repeat homology 3e-01
[SUPFAM] trichohyalin 2e-10
[SUPFAM] hypothetical protein MJ1322 2e-06
[SUPFAM] ribosomal protein S10 homology 7e-07
[SUPFAM] protein kinase C zinc-binding repeat homology 3e-01
[SUPFAM] giantin 1e-12
[SUPFAM] protein kinase homology 1e-16
[SUPFAM] kinesin motor domain homology 3e-01
[SUPFAM] human early endosome antigen 1 2e-13
[SUPFAM] MS protein 4e-01
[SUPFAM] cytoskeletal keratin 8e-06
[PROSITE] ATP_GTP_A 1
[PROSITE] RGD 1
[KW] All_Alpha
[KW] LOW_COMPLEXITY 3.30 %
[KW] COILED_COIL 15.12 %
SE(Q MAKRKEENFSSPKNAKRPRtQEELEDFDKDGDEDECKGTTLTAAEVGIIESIHLKNFMCHS SEG
PRD ccchhhhhcccccccccchhhhhhcccccccccccccccccccccceeeeehhhhhhccc COILS
SE(Q MLGPFKFGSNVNFVVGNNGSGKSAVLTALIVGLGGRAVATNRGSSLKGFVKDG(QNSADIS SEG - - • -xxxxxxxxxxxxxxxxxxx
PRD ccccccccceeeeeeccccccchhhhhhhhhhcccccccccccccceeeecccccceeee COILS
SE(Q ITLRNRGDDAFKASVYGNSILI(Q(3HISIDGSRSYKLKSATGSVVSTRKEELIAILDHFNI
SEG
PRD eeeecccccccccccccccccchhhhhccccceeeeccccchhhhhhhhhhhhhhhhhhh
COILS
SE(3 (QVDNPVSVLT(QEMSK(QFLι3SKNEGDKYKFFMKAT(QLE(QMKEDYSYIMETKERTKE(QIH(QG SEG
PRD cccchhhhhhhhhhhhhhhhhhcchhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhh COILS
SE(Q EERLTELKR(QCVEKEERF(QSIAGLSTMKTNLESLKHEMAIdAVVNEIEK(QLNAIRDNIKIG
SEG PRD hhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhh COILS CCCCCCCCCCCCCCCC
SEiQ EDRAARLDRKMEE(Q(QVRLNEAE(QKYKDI(QDKLEKISEETNARAPECMALKADVVAKKRAY SEG
PRD hhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhh COILS
CCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCC SE(Q NEAEVLYNRSLNEYKALKKDDE(2LCKRIEELKKSTD(3SLEPERLER(3KKISIdLKERVKAF SEG
PRD hhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhh COILS
SE(Q (QN(3ENSVN(QEIE(QF(3(QAIEKDKEEHGKIKREELDVKHALSYN(3R(3LKELKDSKTDRLKRF
SEG
PRD hhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhh COILS
SE(Q GPNVPALLEAIDDAYR(QGHFTYKPVGPLGACIHLRDPELALAIESCLKGLL(3AYCCHNHA SEG
PRD hhhhhhhhhhhhhhhhhtihhhcccccchhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhh COILS
SE(3 DERVL(3ALMKRFYLPGTSRPPIIVSEFRNEIYDVRHRAAYHPDFPTVLTALEIDNAVVAN SEG ■
PRD hhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhcccccchhhhhhhhhhhhhhhh COILS
SE<3 SLIDMRGIETVLLIKNNSVARAVM(3S(3KPPKNCREAFTADGD(QVFAGRYYSSENTRPKFL SEG
PRD hhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhchhhhhhhhhhhhhhhhhhh COILS
SE(Q SRDVDSEISDLENEVENKTA(3ILNL(Q(QHLSALEKDIKHNEELLKRC(QLHYKELKMKIRKN SEG
PRD hhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhh COILS CCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCC
SE(Q ISEIRELENIEEH(3SVDIATLEDEA(3ENKSKMKMVEEHME<3(3KENMEHLKSLKIEAENKY SEG PRD hhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhh COILS CCCCCCCCCCCCCCCCCCCCCCCCCCCCCC
SE(Q DAIKFKIN(QLSELADPLKDELNLADSEVDN(QKRGKRHYEEK(QKEHLDTLNKKKRELDMKE SEG xxxxxxxxxx
PRD hhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhh COILS
CCCCCCCCCCCC CCCCCCCCCCCCCCCCCCCCC SE(Q KELEEKMS(QAR(3ICPERIEVEKSASILDKEINRLR(QKI(QAEHASHGDREEIMR(QY(QEARE
SEG xxxxxxx
PRD hhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhcchhhhhhhhhhhhh COILS
CCCCCCCCCCCCCC
SE(Q TYLDLDSKVRTLKKFIKLLGEIMEHRFKTYι3(3FRRCLTLRCKLYFDNLLS(QRAYCGKMNF SEG
PRD hhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhcccceee COILS
SE(3 DHKNETLSISVώPGEGNKAAFNDMRALSGGERSFSTVCFILSLIdSIAESPFRCLDEFDVY SEG
PRD eccccccceeeeccccchhhhhhcccccccchhhhhhhhhhhhhhhhccccchhhhhhhh COILS
SE(Q MDMVNRRIAMDLILKMADS(QRFR(QFILLTP(QSMSSLPSSKLIRILRMSDPERG(3TTLPFR SEG
PRD hhhhhhhhhhhhhhhhhhhhhhceeeeeeccccccccccceeeeeecccccccccccccc COILS
SE(3 PVT(3EEDDD(3R
SEG
PRD chhhhhhhccc
COILS
Prosite for DKFZphamy2_lln4.1
PS00016 126->129 RGD PDOC00016
PS00017 76->84 ATP_GTP_A PDOC00017
(No Pfam data available for DKFZphamy2_lln4.1)
DKFZphamy2_121f11
group: cell structure and motility
DKFZphamy2_121f11 encodes a novel 251 amino acid protein with high similarity to a rat ankyrin binding glycoprotein-1 related mRNA.
Ankyrin binding glycoproteins play a role in neural cell adhesion and in prostate tumor cell transformation. DKFZphamy2_121f11.3 is expressed at above-average levels in brain, uterus and prostate. The new protein can find application in the modulation of cytoskeleton-membrane interactions.
similarity to ankyrin binding glycoprotein-1 related mRNA (Rattus norvegicus)
Sequenced by DKFZ
Locus: /map="l"
Insert length: 1498 bp
Poly A stretch at pos. 1479, polyadenylation signal at pos. 1460
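The poly A and polyadenylation signal positions can be cross-checked against the 3' end of the insert. The Python sketch below is an illustration only and is not the annotation pipeline behind this listing: the function name, the choice of the AATAAA and ATTAAA hexamers, and the reading of the reported positions as 0-based offsets are all assumptions. It is applied to the last 48 nucleotides of the DKFZphamy2_121f11 insert (nucleotides 1451 to 1498).

```python
# Hypothetical helper, not part of the original annotation pipeline.
def find_polya_features(seq, signals=("AATAAA", "ATTAAA")):
    """Return (polya_start, signal_start) as 0-based offsets, or None where absent."""
    seq = seq.upper()
    # Offset at which the terminal run of A's begins.
    polya_start = len(seq.rstrip("A"))
    if polya_start == len(seq):
        polya_start = None  # sequence does not end in an A run
    upstream = seq if polya_start is None else seq[:polya_start]
    # Keep the signal hexamer that lies closest to the poly A run.
    signal_start = max(upstream.rfind(s) for s in signals)
    return polya_start, (signal_start if signal_start >= 0 else None)


# Last 48 nucleotides of the insert: 29 bases followed by 19 A's.
TAIL = "TGTCTCCAGAAATAAAGAATAATTCTGCC" + "A" * 19
# The first base of TAIL is nucleotide 1451, i.e. 0-based offset 1450
# (assumed coordinate convention).
OFFSET = 1450

polya, signal = find_polya_features(TAIL)
print(OFFSET + polya, OFFSET + signal)
```

Under these assumptions the sketch reports 1479 for the start of the poly A run and 1460 for the AATAAA hexamer.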
1 CGGCACCTTC GCCGGCGCCC TCGCCCACCC CAGCCCCGCC CCAGAAGGAG
SI CAGCCCCCCG CGGAGACCCC TACAGACGCT GCTGTCTTGA CCTCACCCCC
101 AGCCCCTGCT CCCCCGGTGA CCCCTAGCAA ACCAATGGCC GGCACCACAG
ISI ACCGAGAAGA AGCCACTCGG CTCTTGGCTG AGAAGCGGCG CCAGGCCCGG
201 GAGCAGCGGG AGCGCGAGGA GCAGGAGCGG AGGCTGCAGG CAGAAAGGGA 251 CAAGCGAATG CGAGAGGAGC AGCTGGCACG GGAGGCCGAG GCCCGGGCGG
3D1 AGCGGGAGGC GGAGGCCCGG AGGCGGGAGG AGCAGGAGGC ACGAGAGAAG
351 GCGCAGGCCG AGCAGGAGGA GCAGGAGCGG CTGCAGAAGC AGAAAGAGGA
401 GGCCGAAGCT CGGTCGCGGG AAGAGGCGGA GCGGCAGCGT CTGGAGCGGG
4S1 AAAAGCACTT CCAGCAGCAG GAGCAAGAGC GGCAAGAGCG CAGAAAGCGT SD1 CTGGAGGAGA TCATGAAGAG GACTCGGAAG TCAGAAGTTT CTGAAACCAA
S51 GAAGCAGGAC AGCAAGGAGG CCAACGCCAA CGGTTCCAGC CCAGAGCCTG bOl TGAAAGCTGT GGAGGCTCGG TCCCCAGGGC TGCAGAAGGA GGCTGTGCAG bSl AAAGAGGAGC CCATCCCACA GGAGCCTCAG TGGAGTCTCC CAAGCAAGGA
7D1 GTTGCCAGCG TCCCTGGTGA ATGGCCTGCA GCCTCTCCCA GCACACCAGG 751 AGAATGGCTT CTCCACCAAC GGACCCTCTG GGGACAAGAG TCTGAGCCGA
801 ACACCAGAGA CACTCCTGCC CTTTGCAGAG GCAGAAGCCT TCCTCAAGAA
851 AGCTGTGGTG CAGTCCCCGC AGGTCACAGA AGTCCTTTAA GAGGGTTTGC
IDl CTTGGATCCG GGCACAGTTG TGAGGGCTCC TCTGCATCAC CTACCAGGAT
151 GTCTGGAGGA GAAAAAGACA GAACAAAGAT GGAAGTGGCC TGGGCCCCTG 1D01 GGGGTGGGTC CTCTCTGTTG TTTTTAATCT GCACCTTATA GACTGATGTC
1051 TCTTTGGCCG GAGCCAGATC TGCCCCTCAG TGCATTCGTG TGCTCGCACG
1101 CGCAGACATC CCTTCTCCCC CATACACACA TATACACTCA CAGCCTCTCT
1151 GGCCTCTTCC CTTGGGGAGG GGCCACCTGT AGTATTTGCC TTGATTTGGT
12D1 GGGGTACAGT GGATGTGAAT ACTGTAAATA GCTTGTGCTC AGACTCCTCT 12S1 GCGTGGAGAG GGTGGGTGCA GGAGGCAGAC CCTCCCCCCA AAGCCCCCTG
1301 GGGAGATCTT CCTCTCTCTA TTTAACTGTA ACTGAGGGGG ATCCCAGGTC
1351 TGGGGATGGG GGACACCTTG GGCCACAGGA TACTGGTTGC TTCAGGGGTA
14D1 CCCATGCCCC CTGCCCTCGC CTGGAATCAG TGTTACTGCA TCTGATTAAA 14S1 TGTCTCCAGA AATAAAGAAT AATTCTGCCA AAAAAAAAAA AAAAAAAA
BLAST Results
No BLAST result
Medline entries
No Medline entry
Peptide information for frame 3
ORF from 135 bp to 887 bp; peptide length: 251
Category: putative protein
Classification: Cell signaling/communication
1 MAGTTDREEA TRLLAEKRR(3 ARE(3REREE(Q ERRLtQAERDK RMREEiQLARE 51 AEARAEREAE ARRREElQEAR EKA(QAE(QEE(Q ERL(QK(QKEEA EARSREEAER
101 (SRLEREKHFiQ (Q(QE(3ER(3ERR KRLEEIMKRT RKSEVSETKK (QDSKEANANG
ISI SSPEPVKAVE ARSPGLiQKEA V(3KEEPIP(3E PfQldSLPSKEL PASLVNGLtQP
201 LPAHώENGFS TNGPSGDKSL SRTPETLLPF AEAEAFLKKA VV(3SP(QVTEV
251 L
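The relationship between ORF coordinates and peptide length in these reports can be checked with simple arithmetic. The sketch below assumes that the ORF coordinates are 1-based and inclusive and that the stop codon is not counted; the function name is hypothetical. For this clone it uses the coordinates 135 and 887.

```python
# Hypothetical helper; the coordinate convention is an assumption.
def orf_peptide_length(start, end):
    """Codon count for an ORF given 1-based inclusive coordinates (stop codon excluded)."""
    span = end - start + 1
    if span <= 0 or span % 3 != 0:
        raise ValueError("ORF span must be a positive multiple of 3")
    return span // 3


print(orf_peptide_length(135, 887))
```

A span of 753 nucleotides corresponds to 251 codons, which agrees with the peptide length given for frame 3.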
BLASTP hits
No BLASTP hits available
Alert BLASTP hits for DKFZphamy2_121f11, frame 3
No Alert BLASTP hits found
Pedant information for DKFZphamy2_121f11, frame 3
Report for DKFZphamy2_121f11.3
[LENGTH] 295
[MW] 33517.16
[pI] 5.61
[HOMOL] TREMBLNEW:AB033013_1 gene: "KIAA1167"; product: "KIAA1167 protein"; Homo sapiens mRNA for KIAA1167 protein, partial cds. 1e-64
[BLOCKS] PF01140D
[BLOCKS] BL00412D Neuromodulin (GAP-43) proteins
[BLOCKS] BL00826C
[BLOCKS] BL00422C Granins proteins
[BLOCKS] PR00167C
[BLOCKS] PF00112A
[BLOCKS] BL00224B Clathrin light chain proteins
[BLOCKS] PR00041D
[BLOCKS] PR00110A
[KW] All_Alpha
[KW] LOW_COMPLEXITY 51.19 %
[KW] COILED_COIL 10.51 %
SE<Q APSPAPSPTPAPP(QKE(QPPAETPTDAAVLTSPPAPAPPVTPSKPMAGTTDREEATRLLAE SEG xxxxxxxxxxxxxxxxxxxxxx - ■ xxxxxxxxxxxxxxxxxxxxx - XX
PRD cccccccccccccccccccccccccceeeeccccccccccccccccccchhhhhhhhhhh COILS
SEfQ KRR(QARE(QREREE(QERRL(QAERDKRMREE(3LAREAEARAEREAEARRREEι3EAREKA(3AE
SEG XXXXXXXXXXXXXXXXXXXXX- • ■ ■ XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
PRD hhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhh COILS CCCCCCCCCCCCC
SE<2 (QEE(QERL(3K(QKEEAEARSREEAER(3RLEREKHF(3(3(QE(QER(QERRKRLEEIMKRTRKSEVS
SEG XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
PRD hhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhccccc COILS CCCCCCCCCCCCCCCCCC
SE(Q ETKKώDSKEANANGSSPEPVKAVEARSPGL(QKEAV(QKEEPIP(QEP(QldSLPSKELPASLVN SEG
PRD chhhhhhhhhccccccccceeeeccccccchhhhhhhhcccccccccccccccccceeee COILS
SE(Q GL(QPLPAH(QENGFSTNGPSGDKSLSRTPETLLPFAEAEAFLKKAVV(3SP(QVTEVL
SEG PRD ecccccccccccccccccccccccccccchhhhhhhhhhhhhhhhhccccccccc COILS
(No Prosite data available for DKFZphamy2_121f11.3)
(No Pfam data available for DKFZphamy2_121f11.3)
DKFZphamy2_121m2
group: cell cycle
DKFZphamy2_121m2 encodes a novel 480 amino acid protein with similarity to human PA26-T2 protein. PA26-T2 is a p53-responsive gene. The protein is predominantly expressed in brain, breast and kidney and may represent a potential novel regulator of cellular growth. Isoforms are differentially induced by genotoxic stress (UV, gamma-irradiation and cytotoxic drugs) in a p53-dependent manner.
The new protein can find application in modulating cell division and apoptosis pathways.
similarity to PA26 nuclear protein isoforms (Homo sapiens)
probably differential polyadenylation
Sequenced by DKFZ
Locus: unknown
Insert length: 3327 bp
Poly A stretch at pos. 3306, polyadenylation signal at pos. 3279
1 TCCAGCACCA AAGCGGCCGT TCTCGGATTC CGGAGCGTTC TGGAGCCCCG
51 AGAGACGCCC CGGGGTTCTA GAAGCTCCCC GGCGGCGCCC AGTCCCGGCT
101 TCATTCGGGC GTCCCTCCGA AACCCACTCG GGTGCACGGG TCGTCGGCGA 151 GCCGCGACCG GGTCCTGGCG CGCACCATGA TCGTGGCGGA CTCCGAGTGC
201 CGCGCAGAGC TCAAGGACTA CCTGCGGTTC GCCCCGGGCG GCGTCGGCGA
2S1 CTCGGGCCCC GGAGAGGAGC AGAGGGAGAG CCGGGCTCGG CGAGGCCCTC
301 GAGGGCCCAG CGCCTTCATC CCCGTGGAGG AGGTCCTTCG GGAGGGGGCT
351 GAGAGCCTCG AGCAGCACCT GGGGCTGGAG GCACTGATGT CCTCTGGGCG 4D1 AGTAGACAAC CTGGCAGTGG TGATGGGCCT GCACCCTGAC TACTTTACCA
4S1 GCTTCTGGCG CCTGCACTAC CTGCTGCTGC ACACGGATGG TCCCTT6GCC
SD1 AGCTCCTGGC GCCACTACAT TGCCATCATG GCTGCCGCCC GCCATCAGT6
SSI TTCTTACCTG GTAGGCTCCC ACATGGCCGA GTTTCTGCAG ACTGGTGGTG bOl ACCCTGAGTG GCTGCTGGGC CTCCACCGGG CCCCCGAGAA GCTGCGCAAA bSl CTCAGCGAGA TCAACAAGTT GCTGGCGCAT CGGCCATGGC TCATCACCAA
701 GGAACACATC CAGGCCTTGC TGAAGACCGG CGAGCACACT TGGTCCCTGG
751 CCGAGCTCAT TCAGGCTCTG GTCCTGCTCA CCCACTGCCA CTCGCTCTCC δOl TCCTTCGTGT TTGGCTGTGG CATCCTCCCT GAGGGGGATG CAGATGGCAG
651 CCCTGCCCCC CAGGCACCTA CACCCCCTAG TGAACAGAGC AGCCCCCCAA 101 GCAGGGACCC GTTGAACAAC TCTGGGGGCT TTGAGTCTGC CCGCGACGTG
151 GAGGCGCTGA TGGAGCGCAT GCAGCAGCTG CAGGAGAGCC TGCTGCGGGA
10D1 TGAGGGGACG TCCCAGGAGG AGATGGAGAG CCGCTTTGAG CTGGAGAAGT
1051 CAGAGAGCCT GCTGGTGACC CCCTCAGCTG ACATCCTGGA GCCCTCTCCA
1101 CACCCAGACA TGCTGTGCTT TGTGGAAGAC CCTACTTTCG GATATGAGGA 1151 CTTCACTCGG AGAGGGGCTC AGGCACCCCC TACCTTCCGG GCCCAGGATT
12D1 ATACCTGGGA AGACCATGGC TACTCGCTGA TCCAGCGGCT TTACCCTGAG
1251 GGTGGGCAGC TGCTGGATGA GAAGTTCCAG GCAGCCTATA GCCTCACCTA
1301 CAATACCATC GCCATGCACA GTGGTGTGGA CACCTCCGTG CTCCGCAGGG 1351 CCATCTGGAA CTATATCCAC TGCGTCTTTG GCATCAGATA TGATGACTAT
1401 GATTATGGGG AGGTGAACCA GCTCCTGGAG CGGAACCTCA AGGTCTATAT
1451 CAAGACAGTG GCCTGCTACC CAGAGAAGAC CACCCGAAGA ATGTACAACC
1501 TCTTCTGGAG GCACTTCCGC CACTCAGAGA AGGTCCACGT GAACTTGCT6
1551 CTCCTGGAGG CGCGCATGCA AGCCGCTCTG CTGTACGCCC TCCGTGCCAT
IbOl CACCCGCTAC ATGACCTGAC TCCTGAGCAG GACCTGGGCC CGGTTCAGCT
IbSl CCCCACAAGG ACTTCTCTGT CTGGAGACAG CCCCAGACCC TTTTGTGTCC
1701 CATGCCCACC CTCCCCACGC TGCAGTGGGC TTGTGTGTGA TGTGCAGTCC
17S1 CGAAGCCACA CCCTCCCTTT TCCTCACTGG AATGGACAGT TCATTGCACT
1801 GACTCTGGGA TCTCAGCCCT GCTCCTGGGA GCTGGAAGAG CACTTGGAGA
1651 TCCTAAGGGA CCACACCCTT CCTCCTTCCC CTGCCCACAG AGGCAGAGGG
1101 CACAGGAAAG AAGCCGGGCC AAGCTCGGAA TTAATGTGCC ACAAGTGTTG
1151 TGGCCTTCCT GAACTGGGAA GTCCCTGGCT GGCCCCCGGG GGAGAGGGGC
2001 AAATGCCTCC GGGACTGACA CTCCAGGCAG CTTTGCCTTC TCTCCCCTGT
2051 CATTTCCAGA TTTCATTACC TCCTACTTGC CATTCACCCA TCAATGTGAA
2101 AGTCAGGGTC ACAGCTGGTC TGTGTGTCCA GTTCCCTAAA AGCCTGTTCT
2151 GTTGGGCAGC CTGAGGCTGT TGCCCGAATC CTAGTTCAGT TTTTTGACTT
2201 CCTTTGCCCT TTTTCCCTTT TCTCCATGCT TAATGGTGTG AGGCGTCAGG
2251 AGAGAGGCCA AGTACATAAA AAAAAAAAAA AGCAGATTAT CTCTAGAGAG
2301 TTTGAGCCTT TGCTGGTCAC ATTGCCTTCT GAAGAGGAGG GAGTATTAGA
2351 TTATAAATCC TCTTTATTTT GGTCCTTTAT GCTTGAGGTT CCAACCTGGA
2401 GCCACAGTGT GTGAGAGGAG GAGGAGAGGG AGAATTCTGT TCTCCCAGAG
2451 CTGCACCTGC CTCGCAGAGG CCAGCACCCC ACTCTCCTGC CTCCAGTGGC
2501 CCTGCCGCAG ATGTCTCCCA AAAAGTTGAG CCTTTCTAGA TGGCTTAGGT
25S1 GGCACCATGG CTCAGCAGGA GGGGCGGGkG GCACCAGGGT TCTTGTTTGG
2bDl ACCCTGCCCC TGGGCCATGG CCAGGTGACC ATGGCTACAT TGCCAAACCT
2bSl CTGACTGCCA CAGCTGCAGA CTGAGAGGGT GGGTCTGAGT CCCCACAATG
27D1 TCTGAAGCTG CCCCTGGGAT TCTCAGGCCA ACCTGCCAAC AGCAAGCGGA
2751 TTTTCTTGCA AGATCAGGGA CCCCATTTCT GCAGCCAGTG TCTCCTGGGT
2601 GCCTTCTGAG GACTCCCACC CCCATCCCAG TATCTCATCT GTCCCCTCTC
2δ51 CTGGGGCTTA AGTGGGTTGC TTCCAGGCAG AAGCAGCCAA GGACCGATTC
2101 CAGGCACTTT CTGTAGCAAA TGACTGTGAA TTACGACTTC TCTTGCCCTT
2151 CTTCTAGCAG TCTGTGCCTC CTCTCTGACC AGTTTGGAGG GCACTGAAGA
3001 AAGGCAAGGG CCGTGCTGCT GCTGGGCGGG GCAGGAGAGG AGCCTGGCCA
3051 GTGTGCCACA TTAAATACCC GTGCAGGCGC GGAGAAGCAA CCGGCACCCC
3101 CTTCCGGCCT GAAAGCCCTC CCTGCAAGAA GGTGTGCAGG AGAGAAGAGG
3151 CCCCGGCATG GGGATCTGGG TTCTAGAGGG CATGTGATGA CTGTAAATGT
3201 TCACTGGGTG GGTAGGGAGT GGTATCCAGT GTTCAAGTGC AGAAATCTTT
3251 GGCTTTGCTA CCAGTTCCAT ATGATGAGAA ATAAACGTTC GCTGAGGTTT
3301 TGTTTCATAA AAAAAAAAAA AAAAAAA
BLAST Results
No BLAST result
Medline entries
15024170: Buckbinder L., Talbott R., Seizinger B.R., Kley N.; Gene regulation by temperature-sensitive p53 mutants: identification of p53 response genes. Proc. Natl. Acad. Sci. U.S.A. 91(22):10640-10644 (1994).
1124117: Velasco-Miguel S., Buckbinder L., Jean P., Gelbert L., Talbott R., Laidlaw J., Seizinger B., Kley N.; PA26, a novel target of the p53 tumor suppressor and member of the GADD family of DNA damage and growth arrest inducible genes. Oncogene 1999 Jan 7;18(1):127-37
Peptide information for frame 3
ORF from 177 bp to 1616 bp; peptide length: 480
Category: strong similarity to known protein
Classification: Cell division
1 MIVADSECRA ELKDYLRFAP GGVGDSGPGE EtQRESRARRG PRGPSAFIPV 51 EEVLREGAES LEOHLGLEAL MSSGRVDNLA VVMGLHPDYF TSFIdRLHYLL
101 LHTDGPLASS IdRHYIAIMAA ARHiQCSYLVG SHMAEFLώTG GDPEldLLGLH
151 RAPEKLRKLS EINKLLAHRP ldLITKEHI(QA LLKTGEHTldS LAELIiQALVL
201 LTHCHSLSSF VFGCGILPEG DADGSPAPiQA PTPPSEβSSP PSRDPLNNSG
251 GFESARDVEA LMERM(3(3L(QE SLLRDEGTS(Q EEMESRFELE KSESLLVTPS 301 ADILEPSPHP DMLCFVEDPT FGYEDFTRRG A(QAPPTFRA(Q DYTUEDHGYS
351 LI(3RLYPEGG (QLLDEKFfiAA YSLTYNTIAM HSGVDTSVLR RAIldNYIHCV
401 FGIRYDDYDY GEVNlQLLERN LKVYIKTVAC YPEKTTRRMY NLFldRHFRHS
451 EKVHVNLLLL EARMlQAALLY ALRAITRYMT
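The start of the peptide above can be reproduced by translating the nucleotide sequence from the ORF start at position 177. The following sketch is a minimal illustration that assumes the standard genetic code; the table construction and the function name are not taken from the source. It translates nucleotides 177 to 200 of the DKFZphamy2_121m2 insert.

```python
# Standard genetic code, codons ordered T, C, A, G at each position.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    b1 + b2 + b3: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, b1 in enumerate(BASES)
    for j, b2 in enumerate(BASES)
    for k, b3 in enumerate(BASES)
}


def translate(dna):
    """Translate a DNA string in frame 1, stopping at the first stop codon."""
    dna = dna.upper()
    peptide = []
    for pos in range(0, len(dna) - 2, 3):
        residue = CODON_TABLE.get(dna[pos:pos + 3], "X")  # X marks an unknown codon
        if residue == "*":
            break
        peptide.append(residue)
    return "".join(peptide)


# Nucleotides 177 to 200 of the insert, taken from the sequence listing.
print(translate("ATGATCGTGGCGGACTCCGAGTGC"))
```

The result, MIVADSEC, matches the first eight residues of the frame 3 peptide.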
BLASTP hits
No BLASTP hits available
Alert BLASTP hits for DKFZphamy2_121m2ι frame 3
TREMBL:AF033120_1 gene: "PA26"; product: "p53 regulated PA26-T2 nuclear protein"; Homo sapiens p53 regulated PA26-T2 nuclear protein (PA26) mRNA, complete cds., N = 1, Score = 1377, P = 1.7e-141
TREMBL:AF033122_1 gene: "PA26"; product: "non-p53 regulated PA26-T1 nuclear protein"; Homo sapiens non-p53 regulated PA26-T1 nuclear protein (PA26) mRNA, complete cds., N = 1, Score = 1363, P = 3e-139
TREMBL:AF033121_1 gene: "PA26"; product: "p53 regulated PA26-T3 nuclear protein"; Homo sapiens p53 regulated PA26-T3 nuclear protein (PA26) mRNA, complete cds., N = 1, Score = 130?, P = 2.5e-133
>TREMBL:AF033120_1 gene: "PA26"; product: "p53 regulated PA26-T2 nuclear protein"; Homo sapiens p53 regulated PA26-T2 nuclear protein (PA26) mRNA, complete cds.
Length = 492
HSPs:
Score = 1377 (206.6 bits), Expect = 1.7e-141, P = 1.7e-141
Identities = 277/471 (58%), Positives = 334/471 (70%)
Query : 22 GVGDSGPGEEiQRESRARRGPR GPSAFIPVEEVLREGAESLE(QH-
LGLEALMSSGRV 7b
G G G +(Q E R PR GPS FIP +E+L+ G+E + H L ++ + GR+ Sbjct: 22
GCK(QCGGGRD(QDEELGIRIPRPLG(QGPSRFIPEKEILιQVGSEDA(QMHALFADSFAALGRL 61
(Query : 77
DNLAVVMGLHPDYFTSFURLHYLLLHTDGPLASSURHYIAIMAAARHiQCSYLVGSHMAEF 13b DN+ +VM HP Y SF + + LL DGPL +RHYI
IMAAARHiQCSYLV H+ +F Sb j ct : 62
DNITLVMVFHP(QYLESFLKT(QHYLL(QMDGPLPLHYRHYIGIMAAARH(QCSYLVNLHVNDF 141 (Query : 137
L(QTGGDPEldLLGLHRAPEKLRKLSEINKLLAHRPldLITKEHI(QALLKTGEHTIdSLAELI(Q lib L GGDP + ldL GL AP + KL+ L E + NK + LAHRPULITKEHI+ LLK
EH+USLAEL+
Sb j ct : 142 LHVGGDPKldLNGLENAP(QKL(3NLGELNKVLAHRPldLITKEHIEGLLKAEEHSIιlSLAELVH 2D1
Query : H? ALVLLTHCHSLSSFVFGCGILPEGDADGXXXXXXXXXX
XXXXXXXXRDPLNNS 241
A+VLLTH HSL+SF FGCGI PE DG P+N++
Sb jct : 2D2 AVVLLTHYHSLASFTFGCGISPEIHCDGGHTFRPPSVSNYCICDITNGNHSVDEMPVNSA 2bl
(Query : 250 GGF ESARDVEALMERM(Q(QL(3ESLLRDEG- TSiQEEMESRFELEKSESLLVTPSADILE 30S
+ S +VEALME + M + (QL(QE RDE SiQEEM SRFE + EK ES+ V
S + D E
Sb j ct : 2h2 ENVSVSDSFFEVEALMEKMR(3L(3EC--
RDEEEASiQEEMASRFEIEKRESMFVF-SSDDEE 316
(Query : 30b
PSPHPDMLCFVEDPTFGYEDFTRRGA(QAPPTFRA(QDYTldEDHGYSLI(QRLYPEGG(QLLDE 3b5 + P + ED + + GY + DF + R G P TFR (QDY UEDHGYSL +
RLYP+ G(QL + DE Sbjct : 311 VTPARAVSRHFEDTSYGYKDFSRHGMHVP-
TFRViQDYCUEDHGYSLVNRLYPDVGlQLIDE 377
(Query : 3bb
KFtQAAYSLTYNTIAMHSGVDTSVLRRAIIdNYIHCVFGIRYDDYDYGEVNiQLLERNLKVYI 425 KF AY+LTYNT+AMH
VDTS + LRRAIldNYIHC + FGIRYDDYDYGE + N(QLL + R+ KVYI Sb j ct : 376 KFHIAYNLTYNTMAMHKDVDTSMLRRAIldNYIHCMFGIRYDDYDYGEINiQLLDRSFKVYI 437 (Query: 42b
KTVACYPEKTTRRMYNLFURHFRHSEKVHVNLLLLEARM(3AALLYALRAITRYMT 480 KTV C PEK T+RMY+ FldR F+HSEKVHVNLLL+EARM(QA LLYALRAITRYMT Sbjct: 436 KTVVCTPEKVTKRMYDSFIdR(QFKHSEKVHVNLLLIEARM(3AELLYALRAITRYMT 412
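The identity and positive percentages quoted for this alignment follow from the match counts and the alignment length (277 and 334 out of 471). The sketch below assumes that the percentages are truncated to whole numbers rather than rounded, which is consistent with the two figures reported here; the function name is hypothetical.

```python
# Hypothetical helper; truncation (not rounding) is an assumption.
def alignment_percent(matches, aligned_length):
    """Whole-number percentage of matching positions in an alignment."""
    if aligned_length <= 0:
        raise ValueError("aligned_length must be positive")
    return matches * 100 // aligned_length


print(alignment_percent(277, 471), alignment_percent(334, 471))
```

This gives 58 for identities and 70 for positives.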
Pedant information for DKFZphamy2_121m2, frame 3
Report for DKFZphamy2_121m2.3
[LENGTH] 480
[MW] 54413.12
[pI] 5.57
[HOMOL] TREMBL:AF033120_1 gene: "PA26"; product: "p53 regulated PA26-T2 nuclear protein"; Homo sapiens p53 regulated PA26-T2 nuclear protein (PA26) mRNA, complete cds. 1e-151
[BLOCKS] PR00041D
[KW] All_Alpha
[KW] LOW_COMPLEXITY 3.75 %
SE<3 MIVADSECRAELKDYLRFAPGGVGDSGPGEEiQRESRARRGPRGPSAFIPVEEVLREGAES
SEG
PRD cccchhhhhhhhhhhhhccccccccccccchhhhhhhccccccccccccchhhhhhhhhh
SElQ LEfQHLGLEALMSSGRVDNLAVVMGLHPDYFTSFldRLHYLLLHTDGPLASSIdRHYIAIMAA
SEG
PRD hhhhhhhhhhhhhccccccceeeeccccchhhhhhhhhhhhhcccccchhhhhhhhhhhh SE(Q ARH(3CSYLVGSHMAEFL(QTGGDPEULLGLHRAPEKLRKLSEINKLLAHRPIdLITKEHI(QA SEG
PRD hhhhhheeeccccceeeecccccccccccccchhhhhhhhhhhhhhhhccceeehhhhhh
SE(Q LLKTGEHTUSLAELI(3ALVLLTHCHSLSSFVFGCGILPEGDADGSPAP(QAPTPPSE(QSSP SEG xxxxxxxxxxxxxxxx
PRD hhhhhcchhhhhhhhhhhhhhhhccccccccccccccccccccccccccccccccccccc
SE(Q PSRDPLNNSGGFESARDVEALMERM(Q(QL(QESLLRDEGTS(QEEMESRFELEKSESLLVTPS SEG xx PRD cccccccccccchhhhhhhhhhhhhhhhhhhhhhhhccchhhhhhhhhhhhhheeeeccc
SE(Q ADILEPSPHPDMLCFVEDPTFGYEDFTRRGA(QAPPTFRA(QDYTUEDHGYSLI(QRLYPEGG
SEG
PRD ccccccccccceeeeccccccccccccccccccccceeeeeeccccccceeeeecccccc
SE(Q (3LLDEKF(QAAYSLTYNTIAMHSGVDTSVLRRAIIdNYIHCVFGIRYDDYDYGEVN(QLLERN
SEG
PRD hhhhhhhhhhhhhcccceeecccccchhhhhhhhhhhhhhccccccccccchhhhhhhhh SElQ LKVYIKTVACYPEKTTRRMYNLFIdRHFRHSEKVHVNLLLLEARM(3AALLYALRAITRYMT SEG
PRD hheeeeeeeecccchhhhhhhhhhhhhhccchhhhhhhhhhhhhhhhhhhhhhhhhhccc
(No Prosite data available for DKFZphamy2_121m2.3)
(No Pfam data available for DKFZphamy2_121m2.3)
DKFZphamy2_121o17
group: transmembrane protein
DKFZphamy2_121o17 encodes a novel 212 amino acid protein without similarity to known proteins. The novel protein contains 1 transmembrane region. No informative BLAST results; no predictive Prosite, Pfam or SCOP motif.
The new protein can find application in studying the expression profile of amygdala-specific genes and as a new marker for amygdala cells.
unknown protein
Pedant: TRANSMEMBRANE 1
Sequenced by DKFZ
Locus: /map="166.6 cR from top of Chr22 linkage group"
Insert length: 2690 bp
Poly A stretch at pos. 2661, polyadenylation signal at pos. 2634
1 TGCTGGGAAA AGTGACTGCG ATTCTGAAGA ACCGCTGCCT TGCAAGGTCA
SI AGGACATTCA GTGGTT6CTG 6GGTCCGCAG ACTACTGCCA CCCACTCACC
101 ATCAACTCTG TTAGCCCAAT TGCCCTGCTG AACAACTGCC TGAATACAGG
151 CTTTAGGTTC CCCTGGACTC CAGCCAAGGC TGTTCAGGTG GGACCATGGT 201 GCTCTTTAAG CGTGATCGGA GGGAAGACAC ACAGCAGGGC CACCATTCCA
251 TGAATGGGAG GTGTACAGAT CACTTTCTCT TTGTGCTCAG TTCTCTTCTG
301 TCTCCAGCAG CTATATTGGT AAGACTAGTA CCTGCCAGGG AGAGGTGCCC
351 CCAAGTGAAG GGGTACAGTG GCACCTGGGA AAAGGCACCT GGAAGGTTTC
401 CATGTGGCCC AGCCCAGCAT GGAAGCAGGG TGGGAACTCT GCTGTGTCGC 451 CAGCCCTCAC TCTACTCAAG TGGCTTTTTG AGAGCCCTGC CATGTCTGTG
501 TCAGGCCTGT GCTGCTTCAC ACCCTACAGC TGCCTGGGAA AGGCCGGCCA
5S1 CGCTCCCTGT CCACACACTC CCTGTCCACA CACTCCCTGT CCACAACTGC bOl AGCCGGGCCC TCTGCCTATG GGCACCCAAT CCAAGCAGCT GCTCCACCTT b51 TGTTTGGCAT GGTGATTTGT GTTTTTTCTC TTGGTGCTTA TGTGTGTGGG 701 CTTGGGACGA GTGCTGGTAT GCACTTAGGA CCTTCTTGAT AGCTCCCTGC
751 ACTTTGGAAC ACGGA6CAGA TGAGAGAGGG TCAGGGGCTT GCCCTCCACC
801 TTGGACTTGG AAGAAGCCCA CATTGGAGAG GTGAGGACCC CATGGTGGCT
8S1 CTAGTGGAAG ATACGTTAGT CTCCAGCTAA GGAGGATGAG GCGCAGCCCC
101 AGAGGGAGAC CTCAGTGATA GGGGATCAGG CTACGAAAGT GGGGGkkGGG 151 AGATGCTTTG TACATATTTT GGGGTTATAA TTTCTCTAAA TTTTAGGAGA
1001 ACGGGTATTG ATTGATAAAA GGGACAGGCA GTA6TGTTCA ACAGTGCATG
1D51 TGAAGGAAAG TTCTGTTTTC CATGGTTTT6 ACATTCTTTG GACTGTATTG
1101 TGACTGCTGT CTGGTCCACA TGGTACCCTT TTGGTAAGTA GGCTTCAGTG
1151 CATACCAGGG TATCACTGGA GATGGGAGTT AGTGAAGGGG TGACTCCCTG 1201 GCCTAGTATA GTGTGACCCT GGGACAACTT AATGTCCTAA AGCATTTTGG
1251 TGACTTCTA6 G6AATAGCAA AGACCTATTT CATTGTCCCC AG6TAAGTAT
13D1 GTGATGAGCA ATGAGGAGGA GTGGAAAACA AAACCCAGAA AGTGCGGCAG
1351 GACCAGCCTG ACGCACACGC TCCTGTTGTC ATGGCAGACA GCCGCCTTGG 1401 GTGGGCACCA CCCTGGCAGT TCCAGCCTGT AGGGGAGTGA AGGGACATGG
14S1 CTGAGCTGGG CATGTGCTGA GGTTGACTTA GGGAACAAGC CCTGGGATTG
1501 GACAAAAGGG CCCATGCTGC AGCCACTGAC TGGGGGCAGA GCTCTGGGTG
1S51 GAAGAGGGAA GAGATCCTAA TGGAGGCGCC TCCATCTGCA ACCACAGTTG IbDl TAAGGCTCAT GGCACCTCTG CTTGGAAAGC ACTGGTTTAG GGACTTAGAG
IbSl AGGTAGGCAC AAGGTGGGTC TCCTGGGTAA GGGAAGCAAG AGCAGACTGT
17D1 TGGGCCAACA GGAGAAGCTC CCCAGAGTAG GGGAGAAGGT TGGGGTGTAG
1751 GGCCTTCCAC GTGGAACAGA CAGCCCCTGT GTCTCTGTCT CTTGGGGACC
1801 TGAGTTTGGG TGGGGTGGCA GTTGGCACAG CGCAGATGCG GTAGAGATGG 16S1 GAGGAAACCC AGCTCCTCAC TTCCGTGTGC CTCATGCCTT TGCATACACA
1101 AGCACCAAAC CTACTAGGTC TTCTCATTAC CCATGTAAAC CACATGTTAG
1151 ATAAATTTTT GCAAGTAGAG GAAAGAAGGA AATAAAACAT CACATTTTGG
20D1 TGTCTCTCAG GCTTTCCCCC CCAACTATGG TTTCTTTGCT TTTTGTTTTA
2DS1 ACATAGTTTT GTTGCTGTCT TCTGTAATGA TACAGTTTTG TGCAGCTGTT 2101 TTCACTTAGC ATATCGTGGG CATCTCCCCT TATGATTACT AAATATTTTA
2151 TTTTGGAGTG GCTGTGTACT CTCCCATTGA CTAGATGGAC CATTGTGCCA
2201 GTTGCCAATC ACTAATGCTG TTACTAACTT TTCAGTTATA AATTGATGAA
2251 TATCTTTGTG CACAGGCTGT TTCCCAATGT CAAGTTATTA GGGTAGACTC
2301 CAGGAGGTGG GATTCTTCAA CTAAAGAATA TGAAAACCTT TGAGGCTTTT 23S1 ACTACATATT GACAAAATGG TTTCCGGAAA TATTTGTATC CCCTTACACT
24D1 GCCACCAGCA AGGATAAACA TGTCCATCTT GCCCGTATTG GGAATTATCA
2451 TCTGGCTAAA TATTTGCTAA TTTGATAATG AAAAAATAGC ATCGTGTTTC
2501 AGTTGGCATT TCACTGACTT CTAGCACGGT TGAACATCTT TCATGTGGAG
2551 CGATTGTATT TCCTCCTTTG TGGATTGTCA GTGTCCTTTG CTCTATCTTC 2b01 TGGGGTCAGA TAAATTTGTA TGAGCTCGGT ATATATTAAA GATATTAACC
2b51 TGGTGTGTGT CAAAAAAAAA AAAAAAAAAA AAAAAAAAAA
BLAST Results
Entry HS1033E15 from database EMBL:
Human DNA sequence from clone 1033E15 on chromosome 22q13.1-13.2. Contains part of a novel gene, ESTs and a GSS.
Score = 5111, P = 5.1e-262, identities = 1187/1115
Entry HSN128A12 from database EMBL:
Human DNA sequence from cosmid N128A12 on chromosome 22q12-qter, contains ESTs, CpG island.
Score = 5036, P = 0.0e+00, identities = 1014/1011
Entry HSb1034b from database EMBL:
human STS UI-14034.
Score = 1800, P = 1.4e-76, identities = 312/417
Medline entries
No Medline entry
Peptide information for frame 1
ORF from 196 bp to 831 bp; peptide length: 212
Category: putative protein
Classification: no clue
1 MVLFKRDRRE DT(Q(QGHHSMN GRCTDHFLFV LSSLLSPAAI LVRLVPARER
51 CPlQVKGYSGT UEKAPGRFPC GPAώHGSRVG TLLCROPSLY SSGFLRALPC
IDl LC(3ACAASHP TAAldERPATL PVHTLPVHTL PVHNCSRALC LUAPNPSSCS
151 TFVIdHGDLCF FSIdCLCVUAU DECIdYALRTF LIAPCTLEHG ADERGSGACP
201 PPldTUKKPTL ER
BLASTP hits
No BLASTP hits available
Alert BLASTP hits for DKFZphamy2_121o17, frame 1
No Alert BLASTP hits found
Pedant information for DKFZphamy2_121o17, frame 1
Report for DKFZphamy2_121o17.1
[LENGTH] 212
[MW] 23727.55
[pI] 6.73
[KW] TRANSMEMBRANE
SE(3 MVLFKRDRREDT(3(QGHHSMNGRCTDHFLFVLSSLLSPAAILVRLVPARERCP(QVKGYSGT
PRD ccchhhhhcccccccccccccccccchhhhhhhhccccceeeeecccccccccccccccc
MEM MMMMMMMMMMMMMMMMM
SElQ lιlEKAPGRFPCGPA(QHGSRVGTLLCR(QPSLYSSGFLRALPCLC(QACAASHPTAAldERPATL
PRD ccccccccccccccccccceeeeeccccccccccccccccchhhhhhccccccccccccc
MEM SEO PVHTLPVHTLPVHNCSRALCLUAPNPSSCSTFVIdHGDLCFFSIdCLCVIdAldDECIdYALRTF
PRD ccccccccccccccccceeeeecccccccceeeecccceeecccceeeeccchhhhhhhe
MEM
SE(3 LIAPCTLEHGADERGSGACPPPUTIdKKPTLER PRD eeeccccccccccccccccccccccccccccc
MEM
(No Prosite data available for DKFZphamy2_121o17.1)
(No Pfam data available for DKFZphamy2_121o17.1)
DKFZphamy2_12d7
group: signal transduction
DKFZphamy2_12d7 encodes a novel 552 amino acid protein, which is a so far unknown alternatively spliced form of disks large homolog DLG2.
It seems to be predominantly expressed in the retina, germ cells and brain. It contains an SH3 domain and a guanylate kinase domain. These conserved regions are shared among members of the discs-large family of proteins, which includes human p55, a membrane protein expressed in erythrocytes; rat PSD-95/SAP90, a synapse protein expressed in brain; Drosophila dlg-A, a septate junction protein expressed in various epithelia; and human and mouse ZO-1 and canine ZO-2, two tight junction proteins. The Drosophila homologue, dlg-A, acts as a tumor suppressor. All members of this family may be involved in signal transduction.
The new protein can find application in modulating/blocking intracellular signal transduction pathways.
similarity to disks large homolog DLG2 (Homo sapiens)
alternative splicing: see DLG2, complete cds.
frame shift: around position 1437, one C too many
Sequenced by EMBL
Locus: /map="338.6 cR from top of Chr17 linkage group"
Insert length: 4220 bp
Poly A stretch at pos. 4180, polyadenylation signal at pos. 4165
1 CCCGGCTGCG CTGGAGCCGC CCGGAGCTAG GGGCTTCCCG GGGCGCAGGA
51 GAGACGTTTC AGAGCCCTTG CCTCCTTCAC CATGCCGGTT GCCGCCACCA
101 ACTCTGAAAC TGCCATGCAG CAAGTCCTGG ACAACTTGGG ATCCCTCCCC
151 AGTGCCACGG GGGCTGCAGA GCTGGACCTG ATCTTCCTTC GAGGCATTAT
201 GGAAAGTCCC ATAGTAAGAT CCCTGGCCAA GGCCCATGAG AGGCTGGAGG 251 AGACGAAGCT GGAGGCCGTG AGAGACAACA ACCTGGAGCT GGTGCAGGAG
301 ATCCTGCGGG ACCTGGCGCA GCTGGCTGAG CAGAGCAGCA CAGCCGCCGA
351 GCTGGCCCAC ATCCTCCAGG AGCCCCACTT CCAGTCCCTC CTGGAGACGC
4D1 ACGACTCTGT GGCCTCAAAG ACCTATGAGA CACCACCCCC CAGCCCTGGC
4S1 CTGGACCCTA CGTTCAGCAA CCAGCCTGTA CCTCCCGATG CTGTGCGCAT 501 GGTGGGCATC C6CAAGACAG CCGGAGAACA TCTGGGTGTA ACGTTCCGCG
5S1 TGGAGGGCGG CGAGCTGGTG ATCGCGCGCA TTCTGCATGG GGGCATGGTG bOl GCTCAGCAAG GCCTGCTGCA TGTGGGTGAC ATCATCAAGG AGGTGAACGG b51 GCAGCCAGTG GGCAGTGACC CCCGCGCACT GCAGGAGCTC CTGCGCAATG
701 CCAGTGGCAG TGTCATCCTC AAGATCCTGC CCAGCTACCA GGAGCCCCAT 7S1 CTGCCCCGCC AGGTATTTGT GAAATGTCAC TTTGACTATG ACCCGGCCCG
801 AGACAGCCTC ATCCCCTGCA AGGAAGCAGG CCTGCGCTTC AACGCCGGGG
851 ACTTGCTCCA GATCGTAAAC CAGGATGATG CCAACTGGTG GCAGGCATGC
IDl CATGTCGAAG GGGGCAGTGC TGGGCTCATT CCCAGCCAGC TGCTGGAGGA 151 GAAGCGGAAA GCATTTGTCA AGAGGGACCT GGAGCTGACA CCAAACTCAG
1001 GGACCCTATG CGGCAGCCTT TCAGGAAAGA AAAAGAAGCG AATGATGTAT
1051 TTGACCACCA AGAATGCAGA GTTTGACCGT CATGAGCTGC TCATTTATGA
1101 GGAGGTGGCC CGCATGCCCC CGTTCCGCCG GAAAACCCTG GTACTGATTG 1151 GGGCTCAGGG CGTGGGACGG CGCAGCCTGA AGAACAAGCT CATCATGTGG
12D1 GATCCAGATC GCTATGGCAC CACGGTGCCC TACACCTCCC GGCGGCCGAA
1251 AGACTCAGAG CGGGAAGGTC AGGGTTACAG CTTTGTGTCC CGTGGGGAGA
1301 TGGAGGCTGA CGTCCGTGCT GGGCGCTACC TGGAGCATGG CGAATACGAG
1351 GGCAACCTGT ATGGCACACG TATTGACTCC ATCCGGGGCG TGGTCGCTGC 1401 TGGGAAGGTG TGCGTGCTGG ATGTCAACCC CCAGGCCGGT GAAGGTGCTA
1451 CGAACGGCCG AGTTTGTCCC TTACGTGGTG TTCATCGAGG CCCCAGACTT
1SD1 CGAGACCCTG CGGGCCATGA ACAGGGCTGC GCTGGAGAGT GGAATATCCA
1S51 CCAAGCAGCT CACGGAGGCG GACCTGAGAC GGACAGTGGA GGAGAGCAGC
IbOl CGCATCCAGC GGGGCTACGG GCACTACTTT GACCTCTGCC TGGTCAATAG IbSl CAACCTGGAG AGGACCTTCC GCGAGCTCCA GACAGCCATG GAGAAGCTAC
1701 GGACAGAGCC CCAGTGGGTG CCTGTCAGCT GGGTGTACTG AGCCTGTTCA
1751 CCTGGTCCTT GGCTCACTCT GTGTTGAAAC CCAGAACCTG AATCCATCCC
1801 CCTCCTGACC TGTGACCCCC TGCCACAATC CTTAGCCCCC ATATCTGGCT
1851 GTCCTTGGGT AACAGCTCCC AGCAGGCCCT AAGTCTGGCT TCAGCACAGA
1901 GGCGTGCACT GCCAGGGAGG TGGGCATTCA TGGGGTACCT TGTGCCCAGG
1151 TGCTGCCCAC TCCTGATGCC CATTGGTCAC CAGATATCTC TGAGGGCCAA
20D1 GCTATGCCCA GGAATGTGTC AGAGTCACCT CCATAATGGT CAGTACAGAG
20S1 AAGAGAAAAG CTGCTTTGGG ACCACATGGT CAGTAGGCAC ACTGCCCCTG
21D1 CCACCCCTCC CCAGTCACCA GTTCTCCTCT GGACTGGCCA CACCCACCCC 2151 ATTCCTGGAC TCCTCCCACC TCTCACCCCT GTGTCGGAGG AACAGGCCTT
2201 GGGCTGTTTC CGTGTGACCA GGGGAATGTG TGGCCCGCTG GCAGCCAGGC
2251 AGGCCCGGGT GGTGGTGCCA GCCTGGTGCC ATCTTGAAGG CTGGAGGAGT
2301 CAGAGTGAGA GCCAGTGGCC ACAGCTGCAG AGCACTGCAG CTCCCAGCTC
2351 CTTTGGAAAG GGACAGGGTC GCAGGGCAGA TGCTGCTCGG TCCTTCCCTC 24D1 ATCCACAGCT TCTCACTGCC GAAGTTTCTC CAGATTTCTC CAATGTGTCC
24S1 TGACAGGTCA GCCCTGCTCC CCACAGGGCC AGGCTGGCAG GGGCCATTGG
2501 GCTCAGCCCA GGTAGGGGCA GGATGGAGGG CTGAGCCCTG TGACAACCTG
2S51 CTGTTACCAA CTGAAGAGCC CCAAGCTCTC CATGGCCCAC AGCAGGCACA
2b01 GGTCTGAGCT CTATGTCCTT GACCTTGGTC CATTTGGTTT TCTGTCTAGC 2bSl CAGGTCCAGG TAGCCCACTT GCATCAGGGC TGCTGGGTTG GAGGGGCTAA
2701 GGAGGAGTGC AGAGGGGACC TTGGGAGCCT GGGCTTGAAG GACAGTTGCC
2751 CTCCAGGAGG TTCCTCACAC ACAACTCCAG AGGCGCCATT TACACTGTAG
2801 TCTGTACAAC CTGTGGTTCC ACGTGCATGT TCGGCACCTG TCTGTGCCTC
2651 TGGCACCAGG TTGTGTGTGT GTGCGTGTGC ACGTGCGTGT GTGTGTGTGT 21D1 GTGTCAGGTT TAGTTTGGGG AGGAAGCAAA GGGTTTTGTT TTGGAGGTCA
2151 CTCTTTGGGG CCCCTTTCTG GGGGTTCCCC ATCAGCCCTC ATTTCTTATA
30D1 ATACCCTGAT CCCAGACTCC AAAGCCCTGG TCCTTTCCTG ATGTCTCCTC
3D51 CCTTGTCTTA TTGTCCCCCT ACCCTAAATG CCCCCCTGCC ATAACTTGGG
3101 GAGGGCAGTT TTGTAAAATA GGAGACTCCC TTTAAGAAAG AATGCTGTCC 3151 TAGATGTACT TGGGCATCTC ATCCTTCATT ATTCTCTGCA TTCCTTCCGG
32D1 GGGGAGCCTG TCCTCAGAGG GGACAACCTG TGACACCCTG AGTCCAAACC
3251 CTTGTGCCTC CCAGTTCTTC CAAGTGTCTA ACTAGTCTTC GCTGCAGCGT
3301 CAGCCAAAGC TGGCCCCTGA ACCACTGTGT GCCCATTTCC TAGGGAAGGG
3351 GAAGGAGAAT AAACAGAATA TTTATTACAA ATGTTAGAAT ATATTTCTTA 3401 TACTAGGAAT CTCATTTGCA TTTGCATAGA CTATACACAT GGGGTGGAAA
34S1 GGCCAGGCCT GCCCCCATCT CGTTGGTGTG GCTCTGCGTA TACTACACAC
35D1 TCATTCTCCT GCTCCTCTTT TCCCTTAGTC AGTGTCCTTT CATCCTGATT
3551 CAGCTCTGCC TTGCATCACC CTCAGCCTAA GGGAGTGGGA AGGAAATGGG
3b01 GTGTTTTCTT GCTGACCTGA GGCTATAGGG TCACTTGCCA TTTCCTACCT 3b51 TCTCTGGGGG ATTTGAGGGT AGAGGCAGGG GAAGATCTGT TGTTGCAGTT
37D1 GCTTCTGCCC CCTTGATCCA AATGACCATC ATCTCTGATG GAGATGGGTT
3751 GGGTACCTGG CCTTCATGGC ACCTTCACTG CTAGGGATGC TCAAGGGGCA
3801 GGCCTGGGGC CCTTCCCTCC TGTCTCTTCT CGGTCTTTCC TCTCTGAGCA 3651 GCCTCCTACC TCCCCTGCCT GAGCCCTCAC TCCACAGCCC TCCCAGGTAC
3101 CTAGCAGAGG CTGTCAGTCC TTGGCTCACC TGGAACAGGG CTGGGGCTGG
3151 GTTGGAACAG GTGTGTGCCC CCACCACAGC TCTATGACTC TGTTCTCCCT
4001 CCCTGCCATT GTGGACTCTT GTATTTGAGG GACCTCAAGA GAGTGAGGAC
4051 CCTACCATCC ACTGTCCATA TTCAGTCCCA GCCCCAGTGC GCTTCCTCTG
4101 TTCCCTCCCT CAGCCATCCA ATTCTTGAGT TTTCTCACTG ATTGGTTTTC
4151 TTTCTTTTTC CTTGGATTAA ATGTGAAAGC AAAGAAAAAA AAAAAAAAAA
42D1 AAAAAAAAAA AAAAAAAAAA
BLAST Results
No BLAST result
Medline entries
96070428:
Mazoyer S, Gayther SA, Nagai MA, Smith SA, Dunning A, van Rensburg EJ, Albertsen H, White R, Ponder BA.;
A gene (DLG2) located at 17q12-q21 encodes a new homologue of the Drosophila tumor suppressor dlg-A. Genomics 1995 Jul 1;28(1):25-31
Peptide information for frame 1
ORF from 82 bp to 1437 bp; peptide length: 452
Category: strong similarity to known protein
Classification: Cell signaling/communication
Prosite motifs: GUANYLATE KINASE 1 (385-402)
1 MPVAATNSET AMQQVLDNLG SLPSATGAAE LDLIFLRGIM ESPIVRSLAK
51 AHERLEETKL EAVRDNNLEL VQEILRDLAQ LAEQSSTAAE LAHILQEPHF
101 QSLLETHDSV ASKTYETPPP SPGLDPTFSN QPVPPDAVRM VGIRKTAGEH
151 LGVTFRVEGG ELVIARILHG GMVAQQGLLH VGDIIKEVNG QPVGSDPRAL
201 QELLRNASGS VILKILPSYQ EPHLPRQVFV KCHFDYDPAR DSLIPCKEAG
251 LRFNAGDLLQ IVNQDDANWW QACHVEGGSA GLIPSQLLEE KRKAFVKRDL
301 ELTPNSGTLC GSLSGKKKKR MMYLTTKNAE FDRHELLIYE EVARMPPFRR
351 KTLVLIGAQG VGRRSLKNKL IMWDPDRYGT TVPYTSRRPK DSEREGQGYS
401 FVSRGEMEAD VRAGRYLEHG EYEGNLYGTR IDSIRGVVAA GKVCVLDVNP
451 QA
BLASTP hits
No BLASTP hits available
Alert BLASTP hits for DKFZphamy2_12d7, frame 1
No Alert BLASTP hits found
Peptide information for frame 2
ORF from 1439 bp to 1738 bp; peptide length: 100
Category: strong similarity to known protein
Classification: Cell signaling/communication
Prosite motifs: LEUCINE ZIPPER (66-87)
1 VKVLRTAEFV PYVVFIEAPD FETLRAMNRA ALESGISTKQ LTEADLRRTV
51 EESSRIQRGY GHYFDLCLVN SNLERTFREL QTAMEKLRTE PQWVPVSWVY
BLASTP hits
No BLASTP hits available
Alert BLASTP hits for DKFZphamy2_12d7, frame 2
No Alert BLASTP hits found
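The LEUCINE ZIPPER motif reported for frame 2 can be illustrated with a small pattern scan. The sketch below assumes the PROSITE leucine zipper pattern L-x(6)-L-x(6)-L-x(6)-L and reads the OCR-damaged residues of the frame 2 peptide as Q and W; the function name `find_leucine_zippers` and the constant `FRAME2` are hypothetical names used only for this illustration, and this is not the tool that produced the annotation above.

```python
import re

# PROSITE-style leucine zipper: four leucines spaced seven residues apart.
# The lookahead lets overlapping matches be reported as well.
LEUCINE_ZIPPER = re.compile(r"(?=(L.{6}L.{6}L.{6}L))")


def find_leucine_zippers(peptide):
    """Return 1-based inclusive (start, end) spans of every pattern match."""
    return [(m.start() + 1, m.start() + 22)
            for m in LEUCINE_ZIPPER.finditer(peptide.upper())]


# Frame 2 peptide of DKFZphamy2_12d7 (assumed reading of the listing above).
FRAME2 = ("VKVLRTAEFVPYVVFIEAPDFETLRAMNRAALESGISTKQ"
          "LTEADLRRTVEESSRIQRGYGHYFDLCLVNSNLERTFREL"
          "QTAMEKLRTEPQWVPVSWVY")

print(find_leucine_zippers(FRAME2))  # [(66, 87)]
```

Under these assumptions the only match spans residues 66 to 87, with leucines at positions 66, 73, 80 and 87.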
Pedant information for DKFZphamy2_12d7, frame 1
Report for DKFZphamy2_12d7.1
[LENGTH] 516
[MW] 56458.36
[pI] 6.21
[HOMOL] PIR:A57653 disks large homolog DLG2 - human 0.0
[FUNCAT] 01.03.99 other nucleotide-metabolism activities [S. cerevisiae, YDR454c] 7e-15
[FUNCAT] f nucleotide metabolism and transport [H. influenzae, HI1743] 3e-07
[BLOCKS] PR00834F
[BLOCKS] BL00856C
[BLOCKS] BL00856B Guanylate kinase proteins
[BLOCKS] BL00856A Guanylate kinase proteins
[SCOP] d1gky 3.21.1.1.1 Guanylate kinase [baker's yeast (Saccharomyce 8e-45
[SCOP] d1kwab_ 2.26.1.1.2 Cask/Lin-2 [Human (Homo sapiens) 4e-34
[EC] 2.7.4.8 Guanylate kinase 8e-17
[PIRKW] blocked amino end 8e-17
[PIRKW] phosphotransferase 8e-17
[PIRKW] monomer 8e-17
[PIRKW] duplication 5e-21
[PIRKW] signal transduction 3e-24
[PIRKW] alternative splicing 5e-21
[PIRKW] P-loop 8e-17
[PIRKW] acetylated amino end 1e-16
[PIRKW] membrane protein 1e-74
[PIRKW] magnesium 8e-17
[PIRKW] ATP 8e-17
[SUPFAM] SH3 homology 1e-74
[SUPFAM] discs-large tumor suppressor 3e-24
[SUPFAM] unassigned Ser/Thr or Tyr-specific protein kinases 5e-11
[SUPFAM] protein kinase homology 5e-11
[SUPFAM] GLGF domain homology 1e-74
[SUPFAM] guanylate kinase 8e-17
[SUPFAM] guanylate kinase homology 1e-74
[PROSITE] GUANYLATE_KINASE_1 1
[PFAM] Src homology domain 3
[KW] Irregular
[KW] 3D
SEQ MPVAATNSETAMQQVLDNLGSLPSATGAAELDLIFLRGIMESPIVRSLAKAHERLEETKL
1gky-
SEQ EAVRDNNLELVQEILRDLAQLAEQSSTAAELAHILQEPHFQSLLETHDSVASKTYETPPP
1gky-
SEQ SPGLDPTFSNQPVPPDAVRMVGIRKTAGEHLGVTFRVEGGELVIARILHGGMVAQQGLLH
1gky-
SEQ VGDIIKEVNGQPVGSDPRALQELLRNASGSVILKILPSYQEPHLPRQVFVKCHFDYDPAR
1gky-
SEQ DSLIPCKEAGLRFNAGDLLQIVNQDDANWWQACHVEGGSAGLIPSQLLEEKRKAFVKRDL
1gky-
SEQ ELTPNSGTLCGSLSGKKKKRMMYLTTKNAEFDRHELLIYEEVARMPPFRRKTLVLIGAQG
1gky- CCEEEECTTT
SEQ VGRRSLKNKLIMWDPDRYGTTVPYTSRRPKDSEREGQGYSFVSRGEMEADVRAGRYLEHG
1gky- TCHHHHHHHHHHHTTTTEEECCEEECCCCTTTTTTTTTTEECCHHHHHHHHHHCCEEEEE
SEQ EYEGNLYGTRIDSIRGVVAAGKVCVLDVNPQAGEGATNGRVCPLRGVHRGPRLRDPAGHE
1gky- EETTEEEEEEHHHHHHHHHHCCEEEEECCHH
SEQ QGCAGEWNIHQAAHGGGPETDSGGEQPHPAGLRALL
1gky-
Prosite for DKFZphamy2_12d7.1
PS00856 385->403 GUANYLATE_KINASE_1 PDOC00670
Pfam for DKFZphamy2_12d7.1
HMM_NAME Src homology domain 3
HMM
*pyVIALYDYqAqd pDELSFkEGDIIillEdsDD . UldrgRnnn +V+ +DY++ + + L F GD ++I++++D+ UU +
Query 228 VFVKCHFDYDPARDSLIPCKEAGLRFNAGDLLQIVNQDDANWWQACHVE 276
HMM TNGiQEGUIPSNYVEPi* ++ G+IPS +E+
Query 277 GG-SAGLIPSQLLEEK 291
Pedant information for DKFZphamy2_12d7, frame 2
Report for DKFZphamy2_12d7.2
[LENGTH] 175
[MW] 19721.10
[pI] 9.61
[HOMOL] PIR:A57653 disks large homolog DLG2 - human 7e-53
[PIRKW] membrane protein 1e-13
[SUPFAM] SH3 homology 1e-13
[SUPFAM] GLGF domain homology 1e-13
[SUPFAM] guanylate kinase homology 1e-13
[PROSITE] LEUCINE_ZIPPER 1
[KW] Alpha_Beta
SEQ MAPRCPTPPGGRKTQSGKVRVTALCPVGRWRLTSVLGATWSMANTRATCMAHVLTPSGAW
PRD ccccccccccccccccceeeeeeeccccccceeeeeccccccchhhhhhhhhhccccccc
SEQ SLLGRCACWMSTPRPVKVLRTAEFVPYVVFIEAPDFETLRAMNRAALESGISTKQLTEAD
PRD ccccceeeeecccchhhhhhhhhcceeeeeeeccchhhhhhhhhhhhhccccchhhhhhh
SEQ LRRTVEESSRIQRGYGHYFDLCLVNSNLERTFRELQTAMEKLRTEPQWVPVSWVY
PRD hhhhhhhhhhhhhhhhhheeeeeecccchhhhhhhhhhhhhhhhccccccccccc
Prosite for DKFZphamy2_12d7.2
PS00029 141->163 LEUCINE_ZIPPER PDOC00029
(No Pfam data available for DKFZphamy2_12d7.2)
DKFZphamy2_12g7
group: amygdala derived
DKFZphamy2_12g7 encodes a novel 254 amino acid protein without similarity to known proteins. No informative BLAST results; no predictive Prosite, Pfam or SCOP motifs.
The new protein can find application in studying the expression profile of amygdala-specific genes.
putative protein
Sequenced by EMBL
Locus: unknown
Insert length: 1257 bp
No poly A stretch found, no polyadenylation signal found
1 CTCCAAGACT TCCTTGCTGT GAGGCTCGTG TGGACCCCAG AGCATGCACA
51 GGCTGTTTAC TCCACAGAGT GGCTTTGAGA ATCAGATGAG ACTGTGCTGG
101 CGAAGGCCCT GTGGGAATGA GGAACGCTGT AGTGTTTGCT GGTCCCTGTT
151 TCTGCCCCCA GGAAAGCAGC TGTGTGAGGA GGAGCGCCGG GCCATGCAGG
201 CTGCCCTGGA CTCCGTCGTC TGCCACACGC CCCTCAACAA CCTTGGCTTT
251 TCCCGGAAGG GCAGCGCGCT CACCTTCAGT GTGGCCTTCC AGGCTCTGAG
301 GACGGGGCTC TTCGAGCTAA GCCAGCACAT GAAACTGAAG CTGCAGTTCA
351 CCGCCAGCGT GTCCCACCCT CCACCCGAGG CCCGGCCCCT CTCCCGCAAG
401 AGCAGCCCCA GAAGCCCTGC TGTCCGGGAC TTGGTGGAGA GGCATCAGGC
451 TAGCCTGGGC CGCTCCCAGT CCTTCTCCCA CCAGCAGCCT TCCCGAAGCC
501 ACCTCATGAG GTCGGGCAGT GTGATGGAGC GCAGAGCATC ACGCCCCCTG
551 TGGCCTCTCC TGTTGGCCGC CCCCTCTACC TGCCCCCGGA CAAGGCTGTG
601 TTGTCTCTGG ACAAGATTGC CAAGCGCGAG TGCAAGGTCC TGGTGGTGGA
651 ACCCGTCAAG TAGCACCGTG CCAGCTCTGT TCCCTCTTAC ACTCCAGAGA
701 CCCAACGCCC CCAGAGGGTA TCCTTGCTCC CGGGCTGTGC CTCCCCTGGG
751 ATGCCTCCCA GACGGGGGTG AAGAGGCCTG GCAGAGCTGC CTGTCTTGTG
801 TCTGCTGATG AGGGATGGGG GAAGAAGCTG TGAAGTGGGC GGGCATGGCT
851 GGGACTAAGC CACCAGTATT CCCCGACGTT CCTGTGGGGG GGGCTGGCCC
901 ACCCCTAGGC CAGGGCAAGG GTTCCCAGAG CTCCCTTGTC CCCGGCCCTT
951 TACCCTGGTT CTGAGTTTAC AAAGTCTCTT CCTCATTCCC GTTGAGTTCT
1001 TTCCCACCTC TGACATTCCC TCCCTCCCTC CCGCAGGCTG AGATTAGAGG
1051 GTGGTGATGG CTAAGGGCCC CTGACAGTGA CCTTCCTGTC TCAGGGGTTG
1101 GGGACAGGGC CAGGTAGCCT CCTGCCCCTT ATGTTTACGT TTGCAGCCTG
1151 AAGCACTTTA ATTTTTTTTT TTTTTGGTCT GTCCCTGTAA CTAATTTTCC
1201 AACTATTGCT TCCAACTGAA ATAAGACTAT TAAATGCCTG TTCAGAGGGA
1251 AAAAAAA
BLAST Results
No BLAST result
Medline entries
No Medline entry
Peptide information for frame 2
ORF from 44 bp to 805 bp; peptide length: 254
Category: putative protein
Classification: no clue
1 MHRLFTPQSG FENQMRLCWR RPCGNEERCS VCWSLFLPPG KQLCEEERRA
51 MQAALDSVVC HTPLNNLGFS RKGSALTFSV AFQALRTGLF ELSQHMKLKL
101 QFTASVSHPP PEARPLSRKS SPRSPAVRDL VERHQASLGR SQSFSHQQPS
151 RSHLMRSGSV MERRASRPLW PLLLAAPSTC PRTRLCCLWT RLPSASARSW
201 WWNPSSSTVP ALFPLTLQRP NAPRGYPCSR AVPPLGCLPD GGEEAWQSCL
251 SCVC
BLASTP hits
No BLASTP hits available
Alert BLASTP hits for DKFZphamy2_12g7, frame 2
No Alert BLASTP hits found
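The ORF coordinates and the peptide length quoted for a frame are tied together by simple arithmetic: an ORF given by 1-based inclusive start and end positions, with the stop codon excluded, spans end - start + 1 bases, and that span must be a multiple of three. The following is a minimal sketch of that check; the function name `orf_peptide_length` is hypothetical and the coordinate convention is an assumption inferred from the entries in this listing.

```python
def orf_peptide_length(start, end):
    """Number of codons in an ORF given 1-based inclusive coordinates."""
    span = end - start + 1
    if span <= 0 or span % 3:
        raise ValueError(
            f"ORF {start}-{end} spans {span} bp, not a positive multiple of 3")
    return span // 3


# An ORF from 44 bp to 805 bp, as for DKFZphamy2_12g7 frame 2.
print(orf_peptide_length(44, 805))  # 254
```

A check of this kind is also a quick way to spot transcription errors in coordinates, since a span that is not divisible by three raises an error instead of returning a length.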
Pedant information for DKFZphamy2_12g7, frame 2
Report for DKFZphamy2_12g7.2
[LENGTH] 254
[MW] 28471.11
[pI] 10.00
[BLOCKS] BL01013C Oxysterol-binding protein family proteins
[KW] Alpha_Beta
[KW] LOW_COMPLEXITY 4.72 %
SEQ MHRLFTPQSGFENQMRLCWRRPCGNEERCSVCWSLFLPPGKQLCEEERRAMQAALDSVVC
SEG
PRD ccccccccccccchhhhhhcccccccceeeeeeeeeccccccchhhhhhhhhhhhhheee
SEQ HTPLNNLGFSRKGSALTFSVAFQALRTGLFELSQHMKLKLQFTASVSHPPPEARPLSRKS
SEG xxxxxxx
PRD cccccccccccccceeeehhhhhhhhhhhhhhhhhhhhhhhhhhhccccccccccccccc
SEQ SPRSPAVRDLVERHQASLGRSQSFSHQQPSRSHLMRSGSVMERRASRPLWPLLLAAPSTC
SEG xxxxx
PRD ccccchhhhhhhhhhhhcccccccccccccceeeecccchhhhhhccccccccccccccc
SEQ PRTRLCCLWTRLPSASARSWWWNPSSSTVPALFPLTLQRPNAPRGYPCSRAVPPLGCLPD
SEG
PRD cccceeeeeccccccccceeeccccccccccccccccccccccccccccccccccccccc
SEQ GGEEAWQSCLSCVC
SEG
PRD cchhhhhhhhhccc
(No Prosite data available for DKFZphamy2_12g7.2)
(No Pfam data available for DKFZphamy2_12g7.2)
DKFZphamy2_12i1
group: amygdala derived
DKFZphamy2_12i1 encodes a novel 283 amino acid protein with weak similarity to F41E6.3 of Caenorhabditis elegans. No informative BLAST results; no predictive Prosite, Pfam or SCOP motifs.
The new protein can find application in studying the expression profile of amygdala-specific genes.
putative protein
Sequenced by EMBL
Locus: /map="3"
Insert length: 2528 bp
Poly A stretch at pos. 2515, polyadenylation signal at pos. 2491
1 ATATAGTTGG ATCAAACAAA AACAACACAA TTTGTCCCGA TAATTATCAA
51 ACAGCACAGC TACTTGCCTT AATTTTAGAG TTACTCACAT TTTGTGTGGA
101 ACATCACACA TATCACATAA AAAACTATAT TATGAACAAG GACTTGCTAA ISI GAAGAGTCTT GGTCTTGATG AATTCAAAGC ACACTTTTCT GGCCTTGTGT
2D1 GCCCTTCGCT TTATGAGGCG GATAATTGGA CTTAAAGATG AATTTTATAA
251 TCGTTACATC ACCAAGGGAA ATCTTTTTGA GCCAGTTATA AATGCACTTC
301 TGGATAATGG AACTCGGTAT AATCTGTTGA ATTCAGCTGT TATTGAGTTG
351 TTTGAATTTA TAAGAGTGGA AGATATCAAG TCTCTTACTG CCCATATAGT 4D1 TGAAAACTTT TATAAAGCAC TTGAATCGAT TGAATATGTT CAGACATTCA
451 AAGGATTGAA GACTAAATAT GAGCAAGAAA AAGACAGACA AAATCAGAAA
501 CTGAACAGTG TACCATCTAT ATTGCGTAGT AACAGATTTC GCAGAGATGC
5S1 AAAAGCCTTG GAAGAGGATG AAGAAATGTG GTTTAATGAA GATGAAGAAG bOl AGGAAGGAAA AGCAGTTGTG GCACCAGTGG AAAAACCTAA GCCAGAAGAT b51 GATTTTCCAG ATAATTATGA AAAGTTTATG GAGACTAAAA AAGCAAAAGA
701 AAGTGAAGAC AAGGAAAACC TTCCCAAAAG GACATCTCCT GGTGGCTTCA
7S1 AATTTACTTT CTCCCACTCT GCCAGTGCTG CTAATGGAAC AAACAGTAAA
601 TCTGTAGTGG CTCAGATACC ACCAGCAACT TCTAATGGAT CCTCTTCCAA
851 AACCACAAAC TTGCCTACGT CAGTAACAGC CACCAAGGGA AGTTTGGTTG IDl GCTTAGTGGA TTATCCAGAT GATGAAGAGG AAGATGAAGA AGAAGAATCG
ISI TCCCCCAGGA AAAGACCTCG TCTTGGCTCA TAAAATATTT ATTAGGGGAC
1001 CCTCAACATG TGGTCTTACA ATGCTGCAAC TGTTCAGTGA GCTGAAAATC
1051 TGAATCAGAA AGCTTTCTCA ATTGAACTTA TAAAATATAC AAGGAGTAGC
1101 AAAAGACAGT ATATCAGCTA AGAGAGTTTA GTTCTAATAA AAATCAGGCT 1151 TCCCAGGAAC TTGATTGCTT GCTAGTAATT AAGGGGTTTG CCTTTTAGGC
1201 TGTCAAAACA AACATTAGTA ACCAGAACCT GGGAGATAGC TTCTCAGCAA
1251 GGAAAAGTCA CAGGTTTGGG GACGGTTTAG GGGAGGGGAA AAGGTTGATA
1301 TAATAATGCA GGGTTGCTCC TCGGGGTGTC GATCTAGAAA CAATTTTACA
1351 GAACTTCAGT TGTAAACTCA ATAACATTAC TTGTATAATG GTGCTGGCCA 1401 TGTTGTTGTT TTAATCAGTT GCCTCTTTTT AAAAGAAATT TTTATGGAAA
1451 ACACATTCAA CTATCATTAA AAAAATGAAG TTAAGCTGTT GGGACCATTT
1501 CTTTAAGATT TAACAAAAGT TCAGCCTTTT AGGTAGTTGA AGGGAAGTAC
1551 ACCCCGTATT CAGCACATGT TGAGTTTTCT ACACCAGGAA TTTTCAATAT
1601 GTATATTGAT GAAAACAAGC TCAATTCAAA CTGGACAGTT TTAAGATAAT
1651 GTTAAAATCA GCACTTTTAG AGACAACGAA GGCCAAGAAT CAGTACAGTA
1701 GTATTCCAAA ATGATTTTCT CTAGAAATTT GAAAGTAGAT CGAACAGAAT
1751 GTTGTCAACC GCCTACCAGT ACAATCTTTT GTGGAAGATA CTTTGAAATC
1801 ACTTTCTACT TTGTTAGTAA AGTTCTGTCT TTCCAGAGCT GCAAGTTTTA
1851 AAGTGTTACT TATACAGACC AACCAAGAAT AGTGCTGAAT TAAGTGGCAT
1901 TTAGTATCTA GAAGCCATTT TGATCCAAGA AGCTACTTAA GTGTCAAAGT
1951 CAGCATGCAG CACATGTAGC TTTTCTGTAA ACAAGGGTGT GATATGAAAG
2001 CTGCT AAGAAGAGTA AAAGCACATT CCATATACGT AAGTGAATTT
2051 TAAAA ATAAA TTGAGGCAAA CAGTTAAGTT TTATTTTTAG AGCAACAAGT
2101 TAACTGTAAA TATTTTAATG TTAGTTTGCT CATCTATGAT CTGAGATCAT
2151 GCCGAAGTGA GAAAAATCTC CCCAAAATAC AATTTAATGC ATTGGGAAAA
2201 AAAAACTTTA ACAGTAATTC CAGCCACAAT CTTTAGATCA CCCTTGTAAT
2251 GTGTTACGGG TCCATTTTTC CTGGAATCGT TTAATCTAAA GCAGTTTCCC
2301 CTGTTTTGGA GATTTTGTAG TTAATTTTAA TTTTGGCTAT TGTTTGGAAA
2351 AGATGAGCTG TCTGTGTAGA TATGAAGTAT AGTTTTTTCC ATAAAACAGA
2401 TGTTTATTTT GTATTAAAAA ATACCACTGT ACTTGTTTTA CACCATTTGT
2451 ATACATGTGG TGATATTAAT GCTAAACTGT AAAATTCAGG AATTAAAATG
2501 TGACCCTGTA ATTCCAAAAA AAAAAAAA
BLAST Results
Entry AF016448_8 from database TREMBL: gene: "F41E6.3"; Caenorhabditis elegans cosmid F41E6. Score = 310, P = 5.0e-32, identities = 73/164, positives = 118/154, frame +3
Entry HS211256 from database EMBL: human STS SHGC-15844. Score = 177, P = 5.5e-38, identities = 111/202
Medline entries
No Medline entry
Peptide information for frame 3
ORF from 132 bp to 980 bp; peptide length: 283
Category: putative protein
Classification: no clue
1 MNKDLLRRVL VLMNSKHTFL ALCALRFMRR IIGLKDEFYN RYITKGNLFE
51 PVINALLDNG TRYNLLNSAV IELFEFIRVE DIKSLTAHIV ENFYKALESI
101 EYVQTFKGLK TKYEQEKDRQ NQKLNSVPSI LRSNRFRRDA KALEEDEEMW
151 FNEDEEEEGK AVVAPVEKPK PEDDFPDNYE KFMETKKAKE SEDKENLPKR
201 TSPGGFKFTF SHSASAANGT NSKSVVAQIP PATSNGSSSK TTNLPTSVTA
251 TKGSLVGLVD YPDDEEEDEE EESSPRKRPR LGS
BLASTP hits
No BLASTP hits available
Alert BLASTP hits for DKFZphamy2_12i1, frame 3
No Alert BLASTP hits found
Pedant information for DKFZphamy2_12i1, frame 3
Report for DKFZphamy2_12i1.3
[LENGTH] 326
[MW] 37261.10
[pI] 5.60
[HOMOL] TREMBL:AF016448_8 gene: "F41E6.3"; Caenorhabditis elegans cosmid F41E6. 1e-36
[FUNCAT] 01.05.04 regulation of carbohydrate utilization [S. cerevisiae, YNL201c] 2e-08
[BLOCKS] BL00357 Histone H2B proteins
[BLOCKS] BP02232B
[BLOCKS] PR01073C
[BLOCKS] BP03050C
[BLOCKS] BP03560F
[BLOCKS] PR00613F
[KW] All_Alpha
[KW] LOW_COMPLEXITY 10.43 %
SEQ IVGSNKNNTICPDNYQTAQLLALILELLTFCVEHHTYHIKNYIMNKDLLRRVLVLMNSKH
SEG xxxxxxxxx
PRD cccccccccccccchhhhhhhhhhhhhhhhhhcccchhhhhhhhhhhhhhhhhhhhccch
SEQ TFLALCALRFMRRIIGLKDEFYNRYITKGNLFEPVINALLDNGTRYNLLNSAVIELFEFI
SEG
PRD hhhhhhhhhhhhhhhhccchhhhhccccccchhhhhhhhhcccccccccchhhhhhhhhh
SEQ RVEDIKSLTAHIVENFYKALESIEYVQTFKGLKTKYEQEKDRQNQKLNSVPSILRSNRFR
SEG
PRD hheeehhhhhhhhhhhhhhhhhhhhhhhhccchhhhhhhhhcccccccccccccccchhh
SEQ RDAKALEEDEEMWFNEDEEEEGKAVVAPVEKPKPEDDFPDNYEKFMETKKAKESEDKENL
SEG xxxxxxxxxxxxxxx
PRD hhhhhhhhhhhhhhhccccccceeeeeeeccccccccccccchhhhhhhhhhhhcccccc
SEQ PKRTSPGGFKFTFSHSASAANGTNSKSVVAQIPPATSNGSSSKTTNLPTSVTATKGSLVG
SEG
PRD ccccccccceeeeccccccccccccceeeeecccccccccccccccccccccccccccee
SEQ LVDYPDDEEEDEEEESSPRKRPRLGS
SEG xxxxxxxxxx
PRD eeccccccchhhhhcccccccccccc
(No Prosite data available for DKFZphamy2_12i1.3)
(No Pfam data available for DKFZphamy2_12i1.3)
DKFZphamy2_13g11
group: amygdala derived
DKFZphamy2_13g11 encodes a novel 281 amino acid protein without similarity to known proteins. The novel protein contains a PROSITE ASP_PROTEASE motif and seems to be expressed ubiquitously.
No informative BLAST results; no predictive Prosite, Pfam or SCOP motifs.
The new protein can find application in studying the expression profile of amygdala-specific genes.
unknown protein; perhaps complete cds.
Pedant: SIGNAL_PEPTIDE
Sequenced by EMBL
Locus: /chromosome="12p13.3"
Insert length: 2754 bp
Poly A stretch at pos. 2743, polyadenylation signal at pos. 2724
1 GCAATCTCGG GAAATTGGAG ACTGACGCGG CTGCTCCTGC ATGTTATTTA
51 TTTTTCCTCT TTCCCTCCCG TGGAGACCCT CCTGTTGGAA AGAGAGCTGC 101 AGCACGGGAC AGAGACAGGC AGGAAGAAGC AGAGAGGACT CGGTGACGCC
151 CCCACCGAGC AGCCCCTGGC CCACTCCTCC AGCAGGGGCC ATGAGCACCA
2D1 AGCAGGAGGC CAGGAGAGAT GAGGGAGAAG CCAGGACGAG GGGGCAGGAG
251 GCACAGCTTC GAGACCGAGC CCACCTGAGC CAGCAGCGCC GGCTCAAACA
301 GGCCACCCAG TTCCTGCACA AGGACTCGGC CGACCTGCTC CCGCTGGACA 351 GCCTCAAGAG GCTCGGCACC TCCAAGGACT TGCAGCCGCG CAGTGTGATC
401 CAAAGACGCC TGGTGGAGGG AAACCCGAAT TGGCTTCAGG GGGAGCCTCC
451 CCGGATGCAG GACCTGATTC ATGGCCAGGA GAGCAGGAGG AAGACCAGCA
501 GGACAGAGAT TCCAGCTCTT CTGGTCAACT GCAAGTGCCA GGACCAGCTG
S51 CTTAGAGTGG CCGTTGACAC AGGCACCCAA TACAATCGGA TCTCTGCTGG bDl ATGTCTCAGC CGCCTGGGGT TAGAGAAAAG GGTCCTAAAA GCCTCAGCTG bSl GGGACCTGGC CCCTGGGCCC CCAACCCAGG TGGAGCAGTT GGAGCTACAG
701 CTGGGGCAGG AGACTGTGGT GTGCTCGGCA CAGGTGGTGG ATGCTGAGAG
751 TCCTGAATTC TGCCTGGGCC TGCAGACTCT GCTTTCTCTC AAGTGCTGCA
801 TCGACCTGGA GCACGGAGTG CTGCGGCTGA AAGCCCCGTT CTCAGAGCTA 8S1 CCCTTCCTGC CTTTGTACCA AGAGCCTGGC CAGTGACTGC TGTCTCAGTC
101 AGTCCCCAGA GGGAAAGACC TTGCCTTAGA AGAAGAGGCG TGTGGGGAAC
151 GGGGGCTCTT GAAGCCAGGT AGCTGGGGAC TATGGTGTCT GCCCTTCCAA
1001 TCACCTCCCT GACCCCTGCT GTCCCATTTT CCCCAGCTGG CCGCATTCCT
1051 CTCTGCTTCT CAGCAGCTGT CCTACTCCCC AGGACGAGTT TTCACTAGAG 1101 GGCCCACGAT GCCAGGATTC TGATTCATCT TCCTCCCAAG AAAAGCAAAG
1151 CCAAATCAAG ACCACAGATA GGAACCTAAG CACAATGGGG TGCCTGCTTG
1201 GGCTGGGTCG AAGGCTCTGC TGACTGCTGT CCTTGTCCAT CACCCAATAC
1251 CACCCCAAAC ACAACTCAAC TTCCCACACC ACCATGTCTC TCACCACACC 1301 TTCTGGGCCT CATTATCTCC CACAACTAGA CCGCCATGCC TCACCAACCT
1351 ATGTCCCTGG ACCTCCTGGT GTCTGCCTCT CGGAGTCTGT GCACATCTGC
1401 TCACAGTTGA GTGGGGGAAG AAACAGCCAG AATTCAATAC AACAAAGAGC
1451 GGGAGTTAGT ATAGGAATGT CCATCTCATA AGGCTGAGAG CTATTTTTTC 1501 CTGTGGCTGC AAATGTCTGA AGCCAGTTAG TTTGATTACC CTGTGCAAAA
1SS1 CCTTGGACAT ACTTCTGCTA TTAACGCTAT AGGTATTTAT CCGTTTCCAC
IbOl TGGCTTTTTG TACCCACCGA GCCCCTGAGC CTTGCGTGTG TGTGTGTGGA lb51 AGAGCCTTGT AGAGAACTGC TCCTGTGAGG CAGACAGGAC AGTGAGGTTG
1701 TCACCACTCA GACTTCACCT ATTCAGCATT CTTTCTGATT TCTAGAACTA 17S1 TCCACCTCAT TAGGCCTTCT TCCTATCCCC ATCTCTGGCC TCTTGAGCTT
1601 AAGCTTGTAT TGTCCTGGAA TCAGTGGCTT TCTAACCCCC TGCCAGGCTT
1851 TGCCAAAGCA AAAAGACAGA GGCTTTTTTT TTTTTTTTAA AGTTTGGGGT
11D1 CTGTCAGGAG ACAGAGGCTT TTTTGAATTC ACTGTGAAGA GAAGAACCCG
1151 AACCTTAAGA CGCCAGATCC CTGAGAGTCT TTCTGGCTGG TTTGAGTCTC 2D01 TCAAATCATG GATTAGGAGT AAAGAAAGAG GCAGGCGCAA TGGCTCATGC
2D51 CTGTAATCCC AGCACTTTGG GAGGCTGAGG TGGGTGGATC ACTTGAGGTC
2101 AGGAGTTTGA GACCAGCCTG GGTAATATGG CAAAACCCCA TCTCTACTAA
21S1 AAAATACAAA AATTAGCCAG GTATGGTGGT GAACACCTGT AATCCCAGCT
2201 ACTTGGAAGG CTGAGGCATA GGAGTTGCTT GAACCTGGGA GATGGGGGTT 2251 GTAGTGAGCC AAGTTCGTGC CATCGGACTC CAGCCTGGGT GAAGGAGTGA
23D1 GACCCTGTCT CCAAAAACAA ACAAAAAAGG AGCAGAGAAA GACAGTGGTA
23S1 CAGCTAACCT GAACAAGGGA ACTGGGACCG TTGGGCTGAA ACAGTCTTGA
2401 GCCTGGGGTT GACTGGGTTA GAGAAGAACC GGGATGCAAG GAGCTGCCTG
2451 TGACACCTGG CCTGCCCTTT CTCAGCTGCC TCCCCTGCCC TTTCTCAGCT 25D1 GCCTCCCCTG CCCTCAGAAG GAAAGGAGAG GGCTCACTTA TCACTTGTGC
2S51 CATAGCACCT GGTCTCAAAA TCCTAAAAGC TTTCCTCGCC CTCACTGCCT
2b01 TGCTCCACAA GGTCCACTTT CCTGGGTCTT GTGCTGTGCC TTTCCTTGTC
2bSl TGCCTCCTGC TGCTTCTGTA ACTGCAGACC CCA6GCCCAA TTGCAAGCCC
27D1 TCGGCTCAGC TGCTTCTCCA TTGGAATAAA CTCTTGTTTC TCTAAAAAAA 2751 AAAA
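The poly A stretch and polyadenylation signal positions quoted for this insert can be illustrated with a short scan of the 3' end. The sketch below is an assumption, not the method used to build the listing: it looks for the common AATAAA or ATTAAA hexamer and for an uninterrupted run of A at the very end of the sequence, and it reports 0-based offsets, which is what the positions quoted for this clone appear to correspond to. The names `find_polya_features`, `offset`, `min_run` and `TAIL_2701` are hypothetical.

```python
import re


def find_polya_features(seq, offset=0, min_run=10):
    """Return (signal, polya) as 0-based offsets shifted by `offset`.

    Either value is None when the feature is not found.
    """
    s = seq.upper()
    # Poly(A) stretch: an uninterrupted run of A at the very 3' end.
    tail = re.search(r"A+$", s)
    long_enough = tail is not None and len(tail.group()) >= min_run
    polya = tail.start() + offset if long_enough else None
    # Signal: the last AATAAA or ATTAAA hexamer in the sequence.
    hits = [m.start() for m in re.finditer(r"(?=AATAAA|ATTAAA)", s)]
    signal = hits[-1] + offset if hits else None
    return signal, polya


# Bases 2701 to 2754 of the DKFZphamy2_13g11 insert, spaces removed.
TAIL_2701 = "TCGGCTCAGCTGCTTCTCCATTGGAATAAACTCTTGTTTCTCTAAAAAAAAAAA"

print(find_polya_features(TAIL_2701, offset=2700))  # (2724, 2743)
```

Because only the last 54 bases are passed in, `offset=2700` converts the local offsets back to positions in the full insert.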
BLAST Results
No BLAST result
Medline entries
No Medline entry
Peptide information for frame 2
ORF from 41 bp to 883 bp; peptide length: 281
Category: putative protein
Classification: no clue
Prosite motifs: ASP PROTEASE (173-184)
1 MLFIFPLSLP WRPSCWKESC STGQRQAGRS REDSVTPPPS SPWPTPPAGA
51 MSTKQEARRD EGEARTRGQE AQLRDRAHLS QQRRLKQATQ FLHKDSADLL
101 PLDSLKRLGT SKDLQPRSVI QRRLVEGNPN WLQGEPPRMQ DLIHGQESRR
151 KTSRTEIPAL LVNCKCQDQL LRVAVDTGTQ YNRISAGCLS RLGLEKRVLK
201 ASAGDLAPGP PTQVEQLELQ LGQETVVCSA QVVDAESPEF CLGLQTLLSL
251 KCCIDLEHGV LRLKAPFSEL PFLPLYQEPG Q
BLASTP h its
No BLASTP hits available
Alert BLASTP hits for DKFZphamy2_13g11, frame 2
PIR:S50646 hypothetical protein YER143w - yeast (Saccharomyces cerevisiae), N = 1, Score = 90, P = 0.26
TREMBL:RND060_1 product: "DNA (cytosine-5-)-methyltransferase"; Rattus norvegicus mRNA for DNA (cytosine-5-)-methyltransferase, partial cds., N = 1, Score = 61, P = 0.61
>PIR:S50646 hypothetical protein YER143w - yeast (Saccharomyces cerevisiae)
Length = 426
HSPs:
Score = 90 (13.5 bits), Expect = 3.0e-01, P = 2.6e-01
Identities = 28/112 (25%), Positives = 48/112 (42%)
Query: 155 TEIPALLVNCKCQDQLLRVAVDTGTQYNRISAGCLSRLGLEKRVLKASAGD---LAPGPP 211
           T++P L +N + + ++ VDTG Q +S + GL + + K G+ + G
Sbjct: 199 TQVPMLYINIEINNYPVKAFVDTGAQTTIMSTRLAKKTGLSRMIDKRFIGEARGVGTGKI 258
Query: 212 XXXXXXXXXXXXXXXX-CSAQVVDAESPEFCLGLQTLLSLKCCIDLEHGVLRL 263
           CS V+D + + +GL L C+DL+ VLR+
Sbjct: 259 IGRIHQAQVKIETQYIPCSFTVLDTDI-DVLIGLDMLKRHLACVDLKENVLRI 310
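The Identities and Positives lines of an HSP pair a raw count with an integer percentage. As a small worked example, the sketch below reproduces the percentage from the two counts. It assumes the percentage is truncated rather than rounded, which is consistent with the figures printed in this listing but is not stated by the source; `blast_percent` is a hypothetical helper name.

```python
def blast_percent(matches, total):
    """Integer percentage of `matches` out of `total`, truncated."""
    if total <= 0:
        raise ValueError("total must be positive")
    return matches * 100 // total


# 28 identities over a denominator of 112, as in the HSP above.
print(blast_percent(28, 112))  # 25
```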
Pedant information for DKFZphamy2_13g11, frame 2
Report for DKFZphamy2_13g11.2
[LENGTH] 281
[MW] 31330.17
[pI] 6.75
[BLOCKS] PR00041D
[BLOCKS] BP01121G
[PROSITE] ASP_PROTEASE
[KW] All_Alpha
[KW] SIGNAL_PEPTIDE 17
[KW] LOW_COMPLEXITY 9.96 %
SEQ MLFIFPLSLPWRPSCWKESCSTGQRQAGRSREDSVTPPPSSPWPTPPAGAMSTKQEARRD
SEG xxxxxxxxxxxx
PRD ccccccccccccccceeeccccccccccccceeecccccccccccccccchhhhhhhhhh
SEQ EGEARTRGQEAQLRDRAHLSQQRRLKQATQFLHKDSADLLPLDSLKRLGTSKDLQPRSVI
SEG
PRD ccccccchhhhhhhhhhhhhhhhhhhhhhhhhhccccccccccccccccccccccchhhh
SEQ QRRLVEGNPNWLQGEPPRMQDLIHGQESRRKTSRTEIPALLVNCKCQDQLLRVAVDTGTQ
SEG
PRD hhhhhccccccccccccccccccccccccccccccccchhhhhhhchhhhhhhhhhccce
SEQ YNRISAGCLSRLGLEKRVLKASAGDLAPGPPTQVEQLELQLGQETVVCSAQVVDAESPEF
SEG xxxxxxxxxxxxxxxx
PRD eeecccchhhhhhhhhhhhhhhccccccccccchhhhhhhhccceeeeccceeecccccc
SEQ CLGLQTLLSLKCCIDLEHGVLRLKAPFSELPFLPLYQEPGQ
SEG
PRD cccchhhhhhhhhhcchhhhhhhcccccccccccccccccc
Prosite for DKFZphamy2_13g11.2
PS00141 173->185 ASP_PROTEASE PDOC00128
(No Pfam data available for DKFZphamy2_13g11.2)
DKFZphamy2_14b5
group: intracellular transport and trafficking
DKFZphamy2_14b5 encodes a novel 771 amino acid protein which shows 61% identity to the human TYL protein and 48% identity to the human Tic protein.
Both proteins show similarity to Sec7 of Saccharomyces cerevisiae, which functions in vesicular trafficking. The new protein also shows significant similarity to human ARNO3, which is involved in the control of Golgi structure and function. DKFZphamy2_14b5 is predominantly expressed in the CNS and germ cells.
The new protein can find application in diagnosis/therapy of diseases related to vesicular trafficking, e.g. in synapses of the central nervous system, and in studying expression profiles.
similarity to TYL protein (Homo sapiens)
Sequenced by EMBL
Locus: /map="445.7 cR from top of Chr5 linkage group"
Insert length: 4528 bp
Poly A stretch at pos. 4511, polyadenylation signal at pos. 4489
1 CTCGCTCAGC CTCTCCACAT CGCGGCTCCG GCACCTGAAG GGACGCGGGC SI GGGCGCGGGC AGCTCCGACC GGCGGCGGCG GGGCGGGACA GGCAGCCCGG
101 CGGCCTCCGA TGGCCCCGCC GTGAGAGGCC GGACCCGCGG CGGGGACCAG
ISI CAGCGGTCTA GAGGAGTCCC AGGAGCAGCC AGGACAGGCG GAAGCAGTGG
2D1 CTGCCATGGA GGAGGACAAG CTCTTATCTG CAGTGCCTGA GGAAGGCGAT
251 GCCACCCGTG ACCCCGGTCC AGAGCCTGAA GAGGAGCCAG GGGTCCGGAA 301 TGGGATGGCC AGTGAGGGCC TGAACAGCAG CCTCTGCAGC CCAGGGCACG
351 AGCGAAGGGG CACCCCAGCG GACACTGAGG AACCCACGAA GGACCCAGAT
4D1 GTGGCCTTCC ATGGCCTCAG CCTTGGCCTC TCTCTCACCA ATGGCCTAGC
4S1 CCTGGGGCCA GACTTGAACA TTCTGGAAGA TTCAGCGGAG TCCAGGCCCT
501 GGAGGGCTGG CGTGCTGGCA GAGGGGGACA ATGCTTCCAG GAGCCTCTAC 551 CCAGATGCTG AGGACCCTCA GCTGGGGTTG GATGGTCCCG GGGAGCCAGA bOl TGTGCGGGAT GGCTTCAGCG CCACGTTTGA GAAGATTCTG GAGTCAGAGC bSl TGCTGCGGGG CACCCAGTAC AGCAGCCTCG ACTCCCTAGA CGGGCTGAGC
7D1 CTCACGGATG AGAGCGACAG CTGCGTCAGC TTCGAGGCCC CCCTCACACC
751 CCTCATCCAG CAGCGGGCCC GTGACAGCCC TGAGCCAGGG GCTGGGTTGG 801 GCATTGGGGA CATGGCGTTT GAGGGGGACA TGGGGGCAGC TGGTGGTGAT
851 GGGGAGCTGG GCAGCCCCCT GCGGCGCTCC ATCTCCAGCA GCCGCTCTGA
101 GAATGTCCTG AGCCGCCTGT CTCTCATGGC CATGCCCAAT GGATTCCATG
ISI AAGATGGCCC TCAGGGCCCA GGGGGGGATG AGGATGATGA TGAGGAGGAC
1001 ACGGACAAGT TGCTGAACTC AGCCAGTGAC CCCAGCCTGA AGGATGGCCT 1051 GTCAGACTCA GACTCTGAGC TCAGCAGCTC GGAGGGGTTG GAGCCTGGTA
1101 GTGCAGACCC TCTGGCCAAC GGGTGCCAGG GGGTCAGTGA AGCTGCTCAT
1151 CGGCTGGCAC GCCGTCTCTA CCACCTCGAG GGCTTCCAGC GCTGTGATGT
12D1 GGCCCGGCAG CTGGGCAAGA ACAACGAGTT TAGCAGGCTG GTGGCCGGGG 12S1 AGTACCTCAG TTTCTTCGAC TTCTCGGGCT TGACTCTGGA CGGAGCACTC
1301 AGAACATTCT TGAAGGCCTT CCCGCTGATG GGGGAGACAC AAGAGCGTGA
1351 GCGGGTCCTC ACACACTTCT CCCGCCGGTA CTGCCAGTGC AACCCTGATG
1401 ACAGCACTTC GGAAGATGGG ATCCACACGC TCACCTGTGC CCTGATGCTG 1451 CTCAACACGG ACCTGCACGG CCACAACATT GGCAAAAAGA TGTCCTGTCA
1501 GCAATTCATT GCCAACTTGG ACCAGCTGAA TGATGGCCAA GACTTTGCCA
1551 AAGACCTGCT GAAGACCCTT TACAACTCCA TCAAGAATGA AAAGCTGGAA
IbOl TGGGCCATTG ATGAGGATGA GCTGAGGAAA TCCCTGTCTG AGCTGGTGGA
IbSl TGACAAGTTC GGGACAGGCA CGAAGAAGGT GACGCGAATC CTGGATGGTG 1701 GCAACCCCTT CCTGGATGTC CCACAGGCGC TCAGTGCCAC CACCTACAAG
17S1 CACGGCGTCC TGACCCGGAA GACTCACGCT GACATGGATG GCAAGAGGAC
1801 GCCCCGTGGG AGGCGTGGCT GGAAGAAATT CTACGCAGTG CTCAAAGGGA
1851 CCATCCTGTA CCTGCAGAAG GATGAGTACA GGCCTGACAA AGCTCTATCG
1101 GAGGGTGACC TGAAGAACGC CATTCGCGTG CATCACGCTC TGGCCACCAG 1151 GGCCTCTGAC TACAGCAAGA AGTCCAACGT GCTGAAGCTT AAGACAGCCG
20D1 ACTGGAGGGT ATTCCTCTTC CAGGCACCGA GCAAGGAAGA AATGCTGTCC
2051 TGGATCCTCA GGATCAACCT GGTGGCAGCC ATCTTCTCTG CCCCGGCCTT
21D1 CCCAGCCGCT GTCAGCTCCA TGAAGAAGTT CTGTCGGCCC CTGCTGCCCT
2151 CCTGCACCAC CCGCCTCTGC CAGGAGGAGC AACTGCGGTC TCATGAGAAT 2201 AAGTTGAGGC AGCTGACTGC GGAGCTGGCC GAACACAGGT GTCACCCAGT
2251 CGAGAGGGGC ATCAAGTCCA AGGAGGCCGA GGAGTACCGG TTGAAGGAGC
23D1 ACTATCTCAC CTTCGAGAAA AGCCGTTATG AGACCTATAT CCACCTCCTG
2351 GCTATGAAAA TCAAAGTGGG CTCAGATGAT CTGGAGCGGA TTGAGGCCCG
2401 GCTGGCCACT CTGGAAGGGG ATGACCCTTC TCTCCGGAAG ACACATTCAA 24S1 GCCCTGCCCT CAGCCAGGGC CATGTGACTG GCAGCAAAAC CACAAAGGAT
2501 GCCACTGGGC CTGATACTTA GCTGACATGG ATTTGCAGAC CCCAGGGTGG
2S51 GCAGATGTCT CCAGTGGGGT CAGTGAGCAC AATTCCAGCC AGGGGCCACT
2b01 TGGACCAAGC TCCAGTCAGT TGATGGGCAG CTAGAGGGGT GCAGAAAGCC
2bSl TGTGGGCCCA GGAGATGGAG ATGCCGTTTG TGGCGTTGAT CTCCTTGCGT 2701 CCTTGGGCAT CTCCGGGCAT CAGACCCTCT CCCTGGCCCT TGTTTTCCTC
2751 TCCACCATGG AGCCTCATTT TGTAGGCCAG TTGTGTGCAT GCTCTAGACA
28D1 CCACCTCGCT GGAGAAGCTG GAAGGGCTGT TGTCTTCCCA GGTCTTTCTC
2851 TTCTCATCAA GCTCCTCTCC TCATCTTTTT TGTGTGTGAG GGCAGGTCTT
21D1 GACTCTAGGT CTCAGCTGGA ACCCCACCCT TTCTCCTCCT CCTTCCTCTG 2151 AGTTGACCAG CAGCAGGTCT GCCGACCACC AGCACCATCC TCTCCTCCCA
30D1 GCAGCCTCCA GAACCATGCC CAGGTCTCCT GCCTCACATC ACAATAATCT
3051 GGGACCCAGG CTTGTGCCCT TTCAGTGTAA AGCTGACTCC ATCACATGTG
3101 CATCCACTTC TTTTCATCCA TTGAGATCAC ACTGCCTCCT TTTTATACAG
3151 ACACAAATAT ACATCTATAA GAATAATATA TACATAAGGA ACCCCTGAAA 32D1 GATGGTTTTG GAACTGGAAT CAGTTAGAGG ATGAAATCAG ATAAAGGAAA
32S1 AGCCTATTTT GGAGCTTCCC CTGTTAG6AA GGATGGCTGC ACCTGGCCCC
3301 CTGGCATTCC TGACGCTCTA GGAGGGAAGG GGGAGGCAGT GCTGGCCTCC
3351 CTTGCCCTGT TTTTCCCTCT TCCAGCTGAC CTGTGACTTA TACTGCTCTT
34D1 ACCGATGATA CTTTTGGAAA AAATAGAGCG TGTATGCACC GCCCCGTTTG 3451 TCCCATGGAT ATCCTGGGGT GTGAGTCGGA TGGGACCACG GCCCTGTTTA
3SD1 TATTTGGGTC TTTATGTTGG TGCTGCCAGG TCTCTGAGCT CCAGAGGTGG
3S51 CCTCTTGGAC AGATCTACTG CTATAGGAAT AAAAGACACT CTGTCTCGCA
3b01 AATGGCTGCT TGTCAACAAG CCCAAAGATG CTTGTCGGAG GACGGTTATG
3bSl GAAGCCCTTA ATTCTTGGTT GTGGGAAAAG GTGGAATGAC AAGTTATTGA 37D1 TTGTTTTTCT GTCGCTATTT CTTTCATTTG TCTAGTGAAT CAGAAAGGCT
3751 TAGCCAAGGC CACATCTGGG AAGAGTGGAG AAATTTGCCA CTTGACGATC
3601 ACGGATTAGC TAGCACCTTT AAGCCCTGCA TTTCTCCAAC TGACAAGTGG
3851 GTGGGGGTGA TGGCACATTC AGTGTGGCTA TGAAGAGCGA ATCCTCTCTA
3101 TTGTTTAAAT AGATTACTGT AGTTTGGCCA GGAATTTGGC GTCAGTGGTA 3151 ACACACTTAG TTAATAAAAT AAGCCAGGCT TGCAACTAAG TATCTAACTT
4D01 TACAGGCCCA CTCACATTTG AGGCAAGGGG CTATTGAGTA TGTGGAGAGA
4D51 TGTAGTGATT TAAATTCAGA TTATTTAAGT TGGATCAGCT GAAGTGTGTT
4101 TTAGACCCAA ACCATCTGGC CCCTTCGTTT TGCTCAGAGG AAGTAAATGT 41S1 TCACTTAAAT GAAATTGAAA ACGCCATGTG GCACCACAAA AGAGCTCTCT
42D1 GTACTTTCCC CATGCTGCCT CAAAAGTTCT GTGAGTTTCG GGGTCAGTGT
4251 CCCACCCTTC ACTTCCCGAG GGCGGGTGAG TGGAGAGCAG AGCCAGGAGC
4301 TCTGGCAGCT GTGGACAGAT GTGCTTCCTG AGCATGGGTT GTGCCTCCCA
4351 TCAGTAAAAA AATGTTTAGT TCACTTCCTT AATTGTATAA TTATTTATTT
44D1 GTAAATTATA TACATGTACT ACTGTACTAA AATATTATGT ACATTATAAA
44S1 ACATACACAA AAATAGAAAT TTAAAAAAGA TGAGATGAAA ATAAATCTAA
4501 GTCAAAGTTC CAAAAAAAAA AAAAAAAA
BLAST Results
No BLAST result
Medline entries
98086482:
Perletti L, Talarico D, Trecca D, Ronchetti D, Fracchiolla NS, Maiolo AT, Neri A.;
Identification of a novel gene, PSD, adjacent to NFKB2/lyt-10, which contains Sec7 and pleckstrin-homology domains. Genomics 46:251-259 (1997)
Peptide information for frame 2
ORF from 206 bp to 2518 bp; peptide length: 771
Category: similarity to known protein
Classification: Cell signaling/communication
1 MEEDKLLSAV PEEGDATRDP GPEPEEEPGV RNGMASEGLN SSLCSPGHER
51 RGTPADTEEP TKDPDVAFHG LSLGLSLTNG LALGPDLNIL EDSAESRPWR
101 AGVLAEGDNA SRSLYPDAED PQLGLDGPGE PDVRDGFSAT FEKILESELL
151 RGTQYSSLDS LDGLSLTDES DSCVSFEAPL TPLIQQRARD SPEPGAGLGI
201 GDMAFEGDMG AAGGDGELGS PLRRSISSSR SENVLSRLSL MAMPNGFHED
251 GPQGPGGDED DDEEDTDKLL NSASDPSLKD GLSDSDSELS SSEGLEPGSA
301 DPLANGCQGV SEAAHRLARR LYHLEGFQRC DVARQLGKNN EFSRLVAGEY
351 LSFFDFSGLT LDGALRTFLK AFPLMGETQE RERVLTHFSR RYCQCNPDDS
401 TSEDGIHTLT CALMLLNTDL HGHNIGKKMS CQQFIANLDQ LNDGQDFAKD
451 LLKTLYNSIK NEKLEWAIDE DELRKSLSEL VDDKFGTGTK KVTRILDGGN
501 PFLDVPQALS ATTYKHGVLT RKTHADMDGK RTPRGRRGWK KFYAVLKGTI
551 LYLQKDEYRP DKALSEGDLK NAIRVHHALA TRASDYSKKS NVLKLKTADW
601 RVFLFQAPSK EEMLSWILRI NLVAAIFSAP AFPAAVSSMK KFCRPLLPSC
651 TTRLCQEEQL RSHENKLRQL TAELAEHRCH PVERGIKSKE AEEYRLKEHY
701 LTFEKSRYET YIHLLAMKIK VGSDDLERIE ARLATLEGDD PSLRKTHSSP
751 ALSQGHVTGS KTTKDATGPD T
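The relationship between an ORF start position and the peptide listed for it can be checked by translating the cDNA in frame. The sketch below assumes the standard genetic code and a 1-based start coordinate; `translate`, `CODON_TABLE` and `ROW_201` are hypothetical names introduced for this illustration, and `ROW_201` is the row of the nucleotide listing for this clone that begins at position 201, so the ATG at position 206 is the sixth base of the fragment.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a
               for c, a in zip(product(BASES, repeat=3), AMINO)}


def translate(cdna, start=1):
    """Translate from a 1-based start position up to the first stop codon."""
    s = cdna.upper().replace(" ", "")
    peptide = []
    for i in range(start - 1, len(s) - 2, 3):
        aa = CODON_TABLE.get(s[i:i + 3], "X")  # X marks an unreadable codon
        if aa == "*":
            break
        peptide.append(aa)
    return "".join(peptide)


# Row 201 of the DKFZphamy2_14b5 insert.
ROW_201 = "CTGCCATGGA GGAGGACAAG CTCTTATCTG CAGTGCCTGA GGAAGGCGAT"

print(translate(ROW_201, start=6))  # MEEDKLLSAVPEEGD
```

The result agrees with the first fifteen residues of the frame 2 peptide, which supports reading the ORF start as position 206.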
BLASTP hits
No BLASTP hits available
Alert BLASTP hits for DKFZphamy2_14b5, frame 2
PIR:G01205 TYL protein - human, N = 2, Score = 1421, P = 8.6e-150
TREMBL:AB023151_1 gene: "KIAA0142"; product: "KIAA0142 protein"; Homo sapiens mRNA for KIAA0142 protein, partial cds., N = 1, Score = 1251, P = 2.3e-127
TREMBL:U63127_1 gene: "TIC"; product: "Tic"; Human SEC7 homolog Tic (TIC) mRNA, complete cds., N = 1, Score = 1050, P = 4.6e-106
>PIR:G01205 TYL protein - human
Length = 645
HSPs:
Score = 1421 (213-2 bits)ι Expect = 6-be-lSOι Sum P(2) = 8-be- 150 Identities = 260/452 (bl*)ι Positives = 33b/452 (74*) ώuery: 3D1
DPLANGC(QGVSEAAHRLARRLYHLEGF(QRCDVAR(QLGKNNEFSRLVAGEYLSFFDFSGLT 3b0 D L+NG + EAA RLA+RLY L+GF++ DVAR LGKNN+FS+LVAGEYL FF F+G+T
Sbjct: Ibb DTLSNGIQKADLEAAIQRLAKRLYRLDGFRKADVARHLGKNNDFSKLVAGEYLKFFVFTGMT 22S
(Query: 3bl LDGALRTFLKAFPLMGETIQERERVLTHFSRRYCIQCNPDDSTSEDGIHTLTCALMLLNTDL 42D
LD ALR FLK LMGETiQERERVL HFS + RY (QCNP+ +SEDG
HTLTCALMLLNTDL
Sbjct: 22b
LDIQALRVFLKELALMGETIQERERVLAHFSIQRYFIQCNPEALSSEDGAHTLTCALMLLNTDL 285
(Query: 421
HGHNIGKKMSC(Q(3FIANLD(QLNDGιQDFAKDLLKTLYNSIKNEKLEUAIDEDELRKSLSEL 460 HGHNIGK+M+C FI NL+ LNDG DF ++LLK
LY+SIKNEKL+UAIDE+ELR+SLSEL Sbjct: 28b
HGHNIGKRMTCGDFIGNLEGLNDGGDFPRELLKALYSSIKNEKLiQUAIDEEELRRSLSEL 34S
(Query: 461 VDDKFGTGTKKVTRIL
DGGNPFLDVPiQALSATTYKHGVLTRKTHADMDGKRTPRGR 53b D K + RI G +PFLD+ A YKHG L RK HAD D
++TPRG+
Sbjct: 34b ADPN
PKVIKRISGGSGSGSSPFLDLTPEPGAAVYKHGALVRKVHADPDCRKTPRGK 4D1 (Query: 537
RGUKKFYAVLKGTILYLiSKDEYRPDKALSEGDLKNAIRVHHALATRASDYSKKSNVLKLK 51b
RGUK F+ +LKG ILYLiQK+EY+P KALSE +LKNAI +HHALATRASDYSK+ +V L+ Sbjct : 402 RGUKSFHGILKGMILYL(QKEEYKPGKALSETELKNAISIHHALATRASDYSKRPHVFYLR 4bl
(Query : 517 TADURVFLFlQAPSKEEMLSUILRINLXXXXXXXXXXXXXXXSMKKFCRPLLPSCTTRLCiQ bSb
TADURVFLFlQAPS E + M SUI RIN+ S KKF
RPLLPS TRL (Q Sbjct : 4b2 TADURVFLF(QAPSLEιQM(QSUITRINVVAAMFSAPPFPAAVSS(QKKFSRPLLPSAATRLS(Q S21
(Query : b57 EElQLRSHENKLRiQLTAELAEHRCHPVERGIKSKEAEEYRLKEHYLTFEKSRYETYIHLLA 71b
EE(Q + R + HE KL+ + +EL EHR + + + KEAEE R KE YL FEKSRY TY LL Sbj ct : S22
EE(Q'VRTHEAKLKAMASELREHRAAι3LGKKGRGKEAEEιQR(QKEAYLEFEKSRYSTYAALLR 561
(Query : 717 MKIKVGSDDLERIEARLATLEGDDPSLRKTHSSPAL 752 +K+K GS++L+ +EA LA + L +HSSP+L Sb jct : 582 VKLKAGSEELDAVEAALA(QAGSTEDGLPPSHSSPSL bl7
Score = b3 ( 1 - 5 bits ) ι Expect = β - be-lSD i Sum P ( 2 ) = 6 - be-lSD Ident i ties = 11/b4 ( 21* ) ι Posi t i ves = 23/b4 ( 35* ) (Query : 132 DVRDGFSATFEKILESELLRGTlQYXXXXXXXXXXXXXXXXX- CVSFEAPLTPLIlQlQRARD 110
D D FS FE ILES +GT Y +FE P P
+
Sbjct : 18 DGPDSFSCVFEAILESHRAKGTSYTSLASLEALASPGPTiQSPFFTFELPPiQPPAPRPDPP 77
(Query: 111 SPEP 114
+P P Sbjct: 78 APAP 81
Pedant information for DKFZphamy2_14b5, frame 2
Report for DKFZphamy2_14b5.2
[LENGTH] 771
[MW] 84660.55
[pI] 5.04
[HOMOL] PIR:G01205 TYL protein - human 1e-158
[FUNCAT] 30.01 organization of intracellular transport vesicles [S. cerevisiae, YDR170c] 5e-22
[FUNCAT] 30.08 organization of golgi [S. cerevisiae, YDR170c] 5e-22
[FUNCAT] 30.03 organization of cytoplasm [S. cerevisiae, YDR170c] 5e-22
[FUNCAT] 08.07 vesicular transport (golgi network, etc) [S. cerevisiae, YDR170c] 5e-22
[FUNCAT] 99 unclassified proteins [S. cerevisiae, YPR095c] 4e-04
[BLOCKS] BL01277B
[BLOCKS] BP02373F
[BLOCKS] PR00655C
[BLOCKS] PR01088F
[BLOCKS] PR00221B
[BLOCKS] BP02646D
[BLOCKS] PR00311A
[BLOCKS] DM01354M
[BLOCKS] PF01369B
[BLOCKS] PF01369A
[SCOP] d1btn 2.41.1.1.2 beta-spectrin [mouse (Mus musculus) brain 1e-31
[PIRKW] transmembrane protein 1e-20
[SUPFAM] Caenorhabditis elegans K06H7.4 protein 7e-24
[SUPFAM] pleckstrin repeat homology 7e-24
[PFAM] PH (pleckstrin homology) domain
[KW] Irregular
[KW] 3D
[KW] LOW_COMPLEXITY 18.42 %
SElQ MEEDKLLSAVPEEGDATRDPGPEPEEEPGVRNGMASEGLNSSLCSPGHERRGTPADTEEP SEG xxxxxxxxxx lbtn-
SE(Q TKDPDVAFHGLSLGLSLTNGLALGPDLNILEDSAESRPURAGVLAEGDNASRSLYPDAED SEG xxxxxxxxxxxxxxx lbtn-
SElQ P(QLGLDGPGEPDVRDGFSATFEKILESELLRGT(QYSSLDSLDGLSLTDESDSCVSFEAPL
SEG xxxxxxxxxxxxxxxxx
Ibtn-
SE(Q TPLIiQiQRARDSPEPGAGLGIGDMAFEGDMGAAGGDGELGSPLRRSISSSRSENVLSRLSL SEG xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
Ibtn-
SElQ MAMPNGFHEDGPώGPGGDEDDDEEDTDKLLNSASDPSLKDGLSDSDSELSSSEGLEPGSA SEG xxxxxxxxxxxxxxxxxxx xxxxxxxxxxxxxxxxxxxxxxxxxx.
Ibtn-
SElQ DPLANGClQGVSEAAHRLARRLYHLEGFiQRCDVARiQLGKNNEFSRLVAGEYLSFFDFSGLT SEG
Ibtn-
SElQ LDGALRTFLKAFPLMGETiQERERVLTHFSRRYClQCNPDDSTSEDGIHTLTCALMLLNTDL SEG
Ibtn-
SElQ HGHNIGKKMSC(Q(QFIANLD(QLNDG(3DFAKDLLKTLYNSIKNEKLEUAIDEDELRKSLSEL SEG
Ibtn- SElQ VDDKFGTGTKKVTRILDGGNPFLDVPlQALSATTYKHGVLTRKTHADMDGKRTPRGRRGUK
SEG
Ibtn- EEEEEEEEEETTTEET— TTTCEE
SElQ KFYAVLKGTILYLiQKDEYRPDKALSEGDLKNAIRVHHALATRASDYSKKSNVLKLKTADU SEG
Ibtn- EEEEEEETTEEEEECCHHHHHHCCBTTT- TCCEETTTTEEEETTTTTCTTTEEEEETTTT
SElQ RVFLFlQAPSKEEMLSUILRINLVAAIFSAPAFPAAVSSMKKFCRPLLPSCTTRLCiQEEiQL
SEG xxxxxxxxxxxxxxx
Ibtn- CEEEEECCCHHHHHHHHHHHH
SElQ RSHENKLRlQLTAELAEHRCHPVERGIKSKEAEEYRLKEHYLTFEKSRYETYIHLLAMKIK SEG
Ibtn-
SElQ VGSDDLERIEARLATLEGDDPSLRKTHSSPALSlQGHVTGSKTTKDATGPDT
SEG
Ibtn-
(No Prosite data available for DKFZphamy2_14bS .2)
Pfam for DKFZphamy2_14bS- 2
HMM_NAME PH (pleckstrin homology) domain HMM
*dvIREGUMyKUgswrkstg nUqrRUFvLrndpnrLiYYkddk
+ ++G + +++ + ++ U++ ++VL++ + L++
KD+
(Query 512 TTYKHGVLTRKTHADMDGKRTPRGRRGUKKFYAVLKG-- TILYLtQKDE- 5S7
HMM dekPr YMlIdld . cUrMidVEidUmmdndHCFilUtrq . rtYYF
+P+ ++++ + ++D ++ +++ +++T + R+++F
(Query 5S8 -YRPDKALSEGDLKNAIRVHHALATRASDYSKK-
SNVLKLKTADURVFLF b05
HMM (QAeNeEEMmeUMsalrRalw* (QA+++EEM +U+ 1+ + +
Query 606 QAPSKEEMLSWILRINLVAA 625
DKFZphamy2_14m16
group: transcription factors
DKFZphamy2_14m16.p1 encodes a novel 252 amino acid protein with similarity to the homeotic protein emx2 of man, mouse and zebrafish, as well as to the gene "empty spiracles" of Drosophila melanogaster.
Homeobox genes are known to play important roles in developmental processes. In zebrafish, emx2 mRNAs are found in the dorsal telencephalon, parts of the diencephalon and the otocyst. The human homologue Emx2 appears to be already expressed in 8.5 day embryos. It is also expressed in the presumptive cerebral cortex, the olfactory bulbs, and some neuroectodermal areas in the embryonic head, including olfactory placodes in earlier stages and olfactory epithelia later in development. Mutants of the D. melanogaster gene "empty spiracles" display spiracles devoid of filzkörper, no antenna and an open head.
The new protein can find application in modulating the expression of genes controlled by this transcription factor and in modulating neuronal development.
strong similarity to homeotic protein emx2 (Homo sapiens)
perhaps differential splicing
Sequenced by EMBL
Locus: /chromosome="10"
Insert length: 2416 bp
Poly A stretch at pos. 2316, polyadenylation signal at pos. 2373
1 GAAAAAAAAA GAAAAAAAAA GAAAAAAAAT TACCCCAATC CACGCCTGCA
51 AATTCTTCTG GAAGGATTTT CCCCCCTCTC TTCAGGTTGG GCGCGTTTGG
101 TGCAAGATTC TCGGGATCCT CGGCTTTGCC TCTCCCTCTC CCTCCCCCCT
151 CCTTTCCTTT TTCCTTTCCT TTCCTTTCTT TCTTCCTTTC CTTCCCCCCA
201 CCCCCACCCC CACCCCAAAC AAACGAGTCC CCAATTCTCG TCCGTCCTCG
251 CCGCGGGCAG CGGGCGGCGG AGGCAGCGTG CGGCGGTCGC CAGGAGCTGG
301 GAGCCCAGGG CGCCCGCTCC TCGGCGCAGC ATGTTCCAGC CGGCGCCCAA
351 GCGCTGCTTC ACCATCGAGT CGCTGGTGGC CAAGGACAGT CCCCTGCCCG
401 CCTCGCGCTC CGAGGACCCC ATCCGTCCCG CGGCACTCAG CTACGCTAAC
451 TCCAGCCCCA TAAATCCGTT CCTCAACGGC TTCCACTCGG CCGCCGCCGC
501 CGCCGCCGGT AGGGGCGTCT ACTCCAACCC GGACTTGGTG TTCGCCGAGG
551 CGGTCTCGCA CCCGCCCAAC CCCGCCGTGC CAGTGCACCC GGTGCCGCCG
601 CCGCACGCCC TGGCCGCCCA CCCCCTACCC TCCTCGCACT CGCCACACCC
651 CCTATTCGCC TCGCAGCAGC GGGATCCGTC CACCTTCTAC CCCTGGCTCA
701 TCCACCGCTA CCGATATCTG GGTCATCGCT TCCAAGGGAA CGACACTAGC
751 CCCGAGAGTT TCCTTTTGCA CAACGCGCTG GCCCGAAAGC CCAAGCGGAT
801 CCGAACCGCC TTCTCCCCGT CCCAGCTTCT AAGGCTGGAA CACGCCTTTG
851 AGAAGAATCA CTACGTGGTG GGCGCCGAAA GGAAGCAGCT GGCACACAGC
901 CTCAGCCTCA CGGAAACTCA GGTAAAAGTA TGGTTTCAGA ACCGAAGAAC
951 AAAGTTCAAA AGGCAGAAGC TGGAGGAAGA AGGCTCAGAT TCGCAACAAA
1001 AGAAAAAAGG GACGCACCAT ATTAACCGGT GGAGAATCGC CACCAAGCAG
1051 GCGAGTCCGG AGGAAATAGA CGTGACCTCA GATGATTAAA AACATAAACC
1101 TAACCCCACA GAAACGGACA ACATGGAGCA AAAGAGACAG GGAGAGGTGG
1151 AGAAGGAAAA AACCCTACAA AACAAAAACA AACCGCATAC ACGTTCACCG
1201 AGAAAGGGAG AGGGAATCGG AGGGAGCAGC GGAATGCGGC GAAGACTCTG
1251 GACAGCGAGG GCACAGGGTC CCAAACCGAG GCCGCGCCAA GATGGCAGAG
1301 GATGGAGGCT CCTTCATCAA CAAGCGACCC TCGTCTAAAG AGGCAGCTGA
1351 GTGAGAGACA CAGAGAGAAG GAGAAAGAGG GAGGGAGAGA GAGAAAGAGA
1401 GAGAAAGAGA GAGAGAGAGA GAGAGAAAGC TGAACGTGCA CTCTGACAAG
1451 GGGAGCTGTC AATCAAACAC CAAACCGGGG AGACAAGATG ATTGGCAGGT
1501 ATTCCGTTTA TCACAGTCCA CTTAAAAAAT GATGATGATG ATAAAAACCA
1551 CGACCCAACC AGGCACAGGA CTTTTTTGTT TTTTGCACTT CGCTGTGTTT
1601 CCCCCCCATC TTTAAAAATA ATTAGTAATA AAAAACAAAA ATTCCATATC
1651 TAGCCCCATC CCACACCTGT TTCAAATCCT TGAAATGCAT GTAGCAGTTG
1701 TTGGGCGAAT GGTGTTTAAA GACCGAAAAT GAATTGTAAT TTTCTTTTCC
1751 TTTTAAAGAC AGGTTCTGTG TGCTTTTTAT TTTGATTTTT TTTCCCAAGA
1801 AATGTGCAGT CTGTAAACAC TTTTTGATAC CTTCTGATGT CAAAGTGATT
1851 GTGCAAGCTA AATGAAGTAG GCTCAGCGAT AGTGGTCCTC TTACAGAGAA
1901 ACGGGGAGCA GGACGACGGG GGGGCTGGGG GTGGCGGGGG AGGGTGCCCA
1951 CAAAAAGAAT CAGGACTTGT ACTGGGAAAA AAACCCCTAA ATTAATTATA
2001 TTTCTTGGAC ATTCCCTTTC CTAACATCCT GAGGCTTAAA ACCCTGATGC
2051 AAACTTCTCC TTTCAGTGGT TGGAGAAATT GGCCGAGTTC AACCATTCAC
2101 TGCAATGCCT ATTCCAAACT TTAAATCTAT CTATTGCAAA ACCTGAAGGA
2151 CTGTAGTTAG CGGGGATGAT GTTAAGTGTG GCCAAGCGCA CGGCGGCAAG
2201 TTTTCAAGCA CTGAGTTTCT ATTCCAAGAT CATAGACTTA CTAAAGAGAG
2251 TGACAAATGC TTCCTTAATG TCTTCTATAC CAGAATGTAA ATATTTTTGT
2301 GTTTTGTGTT AATTTGTTAG AATTCTAACA CACTATATAC TTCCAAGAAG
2351 TATGTCAATG TCAATATTTT GTCAATAAAG ATTTATCAAT ATGCCCTCAC
2401 AAAAAAAAAA AAAAAA
BLAST alert EMBL/EMBLNEW
EMBLNEW:AL133353 Human DNA sequence *** SEQUENCING IN PROGRESS *** from clone RP11-483F11, N = 2, Score = 3108, P = 5.3e-134
EMBL:HSEMX2 H. sapiens EMX2 mRNA, N = 1, Score = 2365, P = 5.1e-101
Medline entries
12331b0b:
Simeone A, Gulisano M, Acampora D, Stornaiuolo A, Rambaldi M,
Boncinelli E.
Two vertebrate homeobox genes related to the Drosophila empty spiracles gene are expressed in the embryonic cerebral cortex. EMBO J
1992 Jul;11(7):2541-50
Peptide information for frame 1
ORF from 331 bp to 1086 bp; peptide length: 252
Category: questionable ORF
Classification: unset
Prosite motifs: HOMEOBOX 1 (167-210)
1 MFQPAPKRCF TIESLVAKDS PLPASRSEDP IRPAALSYAN SSPINPFLNG
51 FHSAAAAAAG RGVYSNPDLV FAEAVSHPPN PAVPVHPVPP PHALAAHPLP
101 SSHSPHPLFA SQQRDPSTFY PWLIHRYRYL GHRFQGNDTS PESFLLHNAL
151 ARKPKRIRTA FSPSQLLRLE HAFEKNHYVV GAERKQLAHS LSLTETQVKV
201 WFQNRRTKFK RQKLEEEGSD SQQKKKGTHH INRWRIATKQ ASPEEIDVTS
251 DD
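The peptide listed above can be cross-checked against the nucleotide sequence by translating the ORF with the standard genetic code. The sketch below is a minimal illustration and not part of the original record. `CODON_TABLE`, `translate`, `orf_start` and `orf_end` are names introduced here. The two excerpts are assumed to be the first 30 bases of the ORF (from position 331) and its last five codons plus the stop codon (positions 1072 to 1089).

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def translate(dna):
    """Translate DNA in frame 1 until the first stop codon or the end."""
    dna = dna.upper().replace(" ", "")
    protein = []
    for i in range(0, len(dna) - 2, 3):
        residue = CODON_TABLE[dna[i:i + 3]]
        if residue == "*":  # stop codon ends the peptide
            break
        protein.append(residue)
    return "".join(protein)


# Excerpts copied from the sequence listing above (positions are assumptions).
orf_start = "ATGTTCCAGC CGGCGCCCAA GCGCTGCTTC"
orf_end = "GTGACCTCAG ATGATTAA"

print(translate(orf_start))
print(translate(orf_end))
```

The two translations should correspond to the first ten residues and the last five residues of the listed peptide, which is also a convenient way to resolve ambiguous characters in the extracted peptide text.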
Alert BLASTP hits for DKFZphamy2_14m16, frame 1
PIR:I51737 homeotic protein emx2 - zebra fish, N = 2, Score = 753, P = 1e-105
PIR:S22722 homeotic protein emx2 - human (fragment), N = 1, Score = 763, P = 1.3e-75
TREMBL:OLA132403_1 gene: "emx2"; product: "Emx2 protein"; Oryzias latipes mRNA for Emx2 protein, partial, N = 2, Score = 513, P = 4.5e-72
>PIR:S22722 homeotic protein emx2 - human (fragment) Length = 158
HSPs:
Score = 763 (114.5 bits), Expect = 1.3e-75, P = 1.3e-75
Identities = 144/144 (100%), Positives = 144/144 (100%)
Query: 109 FASQQRDPSTFYPWLIHRYRYLGHRFQGNDTSPESFLLHNALARKPKRIRTAFSPSQLLR 168
           FASQQRDPSTFYPWLIHRYRYLGHRFQGNDTSPESFLLHNALARKPKRIRTAFSPSQLLR
Sbjct:  15 FASQQRDPSTFYPWLIHRYRYLGHRFQGNDTSPESFLLHNALARKPKRIRTAFSPSQLLR 74
Query: 169 LEHAFEKNHYVVGAERKQLAHSLSLTETQVKVWFQNRRTKFKRQKLEEEGSDSQQKKKGT 228
           LEHAFEKNHYVVGAERKQLAHSLSLTETQVKVWFQNRRTKFKRQKLEEEGSDSQQKKKGT
Sbjct:  75 LEHAFEKNHYVVGAERKQLAHSLSLTETQVKVWFQNRRTKFKRQKLEEEGSDSQQKKKGT 134
Query: 229 HHINRWRIATKQASPEEIDVTSDD 252
           HHINRWRIATKQASPEEIDVTSDD
Sbjct: 135 HHINRWRIATKQASPEEIDVTSDD 158
Pedant information for DKFZphamy2_14m16, frame 1
Report for DKFZphamy2_14m16.1
[LENGTH] 362
[MW] 40741.26
[pI] 10.51
[HOMOL] PIR:I51737 homeotic protein emx2 - zebra fish 1e-113
[FUNCAT] 30.10 nuclear organization [S. cerevisiae, YML027w] 5e-05
[FUNCAT] 04.99 other transcription activities [S. cerevisiae, YML027w] 5e-05
[FUNCAT] 03.07 pheromone response, mating-type determination, sex-specific proteins [S. cerevisiae, YCR097w] 5e-04
[FUNCAT] 04.05.01.04 transcriptional control [S. cerevisiae, YDL106c] 7e-04
[FUNCAT] 01.04.04 regulation of phosphate utilization [S. cerevisiae, YDL106c] 7e-04
[FUNCAT] 01.03.13 regulation of nucleotide metabolism [S. cerevisiae, YDL106c] 7e-04
[BLOCKS] PR00041D
[BLOCKS] PR00101H
[BLOCKS] PR00467F
[BLOCKS] PR00716G
[BLOCKS] BL00035C
[BLOCKS] BL00027 'Homeobox' domain proteins
[BLOCKS] PR00026A
[BLOCKS] BL00032C
[BLOCKS] BL00032B 'Homeobox' antennapedia-type protein
[SCOP] d1au7b1 1.4.1.1.6 Pit-1 POU homeodomain Pit-1 Pit-1 [Rat (Rattu 5e-16
[SCOP] d1yrna_ 1.4.1.1.2 mating type protein A1 Homeodomain mat alpha 2e-15
[SCOP] d1enh 1.4.1.1.1 engrailed Homeodomain [(Drosophila melanogaster 2e-13
[PIRKW] nucleus 1e-67
[PIRKW] heart 3e-10
[PIRKW] DNA binding 1e-67
[PIRKW] leukemia 3e-15
[PIRKW] alternative splicing 1e-10
[PIRKW] proto-oncogene 3e-15
[PIRKW] transcription factor 6e-11
[PIRKW] embryo 1e-12
[PIRKW] transcription regulation 1e-67
[PIRKW] homeobox 1e-67
[SUPFAM] homeobox homology 1e-67
[SUPFAM] homeotic protein Hox A5 7e-10
[SUPFAM] homeotic protein Hox B3 3e-10
[SUPFAM] homeotic protein Hox B2 3e-11
[SUPFAM] homeotic protein Hox B1 7e-11
[SUPFAM] unassigned homeobox proteins 1e-67
[SUPFAM] homeotic protein goosecoid 4e-10
[SUPFAM] homeotic protein Hox D4 1e-12
[PROSITE] HOMEOBOX_1 1
[PFAM] Homeobox domain
[KW] Irregular
[KW] 3D
[KW] LOW_COMPLEXITY 25.61 %
SElQ
EKKRKKKKKNYPNPRLiQILLEGFSPLSSGUARLVlQDSRDPRLCLSLSLPPPFLFPFLSFL SEG
• xxxxxxxx xxxxxxxxxxxxxxxxx
If jlA
SElQ
SSFPSPHPHPHPK(QTSP(QFSSVLAAGSGRRR(3RAAVARSUEPRAPAPRRSMF(QPAPKRCF SEG xxxxxxxxxxxx xxxxxxxxxxxxxxxx
If jlA
SE(Q TIESLVAKDSPLPASRSEDPIRPAALSYANSSPINPFLNGFHSAAAAAAGRGVYSNPDLV SEG xxxxxx
If jlA
SE(Q
FAEAVSHPPNPAVPVHPVPPPHALAAHPLPSSHSPHPLFASiQiQRDPSTFYPULIHRYRYL
SEG
. • • • xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx IfjlA
SElQ
GHRFIQGNDTSPESFLLHNALARKPKRIRTAFSPSIQLLRLEHAFEKNHYVVGAERKIQLAHS SEG
If jlA CCCCCCCCCCHHHHHHHHHHHHHTTTTCHHHHHHHHHH SElQ
LSLTET(QVKVUF(QNRRTKFKR(QKLEEEGSDS(3<QKKKGTHHINRURIATK<QASPEEIDVTS SEG
If j lA HCCCHHHHHHHHHHHHHHHHHH
SElQ DD
SEG I f j l A
Prosite for DKFZphamy2_14mlb- 1 PSD0D27 217->321 H0ME0B0X_1 PD0C0D027
Pfam for DKFZphamy2_14mlb ■ 1 HMM_NAME Homeobox domai n
HMM *RRRpRTtFTre<QLdELEREFHf NrYPTRqRREELA(QmLNLTER(QVKIUF
+ R RT+F+ +IQL ++LE +F+ N + Y+ ++R + LA + + L+LTE + (QVK + UF (Query 2b4
PKRIRT AFSPS(QLLRLEHAFEKNHYVVGAERK(JLAHSLSLTET(QVKVUF 312
HMM iQNRRMKUKRMH*
IQNRR + K KR + (Query 313 (QNRRTKFKRiQK 323
DKFZphamy2_16e14
group: amygdala derived
DKFZphamy2_16e14.p3 encodes a novel 328 amino acid protein, similar to carbonic anhydrase-related proteins.
A similar cDNA encoding a protein of the same length was identified in sheep. This protein shows a strong signal sequence, which indicates that it is a secreted protein. The new protein belongs to a protein family which was designated carbonic anhydrase-related protein XI (CA-RP XI), encoded by CA11 (human) and Car11 (mouse, rat). Despite potentially inactivating changes in the active-site residues, CA-RP XI is evolving very slowly in mammals, a property indicative of an important function, which has also been observed in the two other "acatalytic" CA isoforms, CA-RP VIII and CA-RP X.
No informative BLAST results; no predictive Prosite, Pfam or SCOP motif.
The new protein can find application in studying the expression profile of amygdala-specific genes.
similarity to carbonic anhydrase-related protein (Homo sapiens)
ESTs ending at approx. 1800 have polyA signal
Sequenced by EMBL
Locus: /map="17q24, 5.13cR from GATA41C05"
Insert length: 2267 bp
Poly A stretch at pos. 2252, polyadenylation signal at pos. 2231
1 GGATGGAAAT AGTCTGGGAG GTGCTTTTTC TTCTTCAAGC CAATTTCATC
51 GTCTGCATAT CAGCTCAACA GAATTCACCA AAAATCCATG AAGGCTGGTG
101 GGCATACAAG GAGGTGGTCC AGGGAAGCTT TGTTCCAGTT CCTTCTTTCT
151 GGGGATTGGT GAACTCAGCT TGGAATCTTT GCTCTGTGGG GAAACGGCAG
201 TCGCCAGTCA ACATAGAGAC CAGTCACATG ATCTTCGACC CCTTTCTGAC
251 ACCTCTTCGC ATCAACACGG GGGGCAGGAA GGTCAGTGGG ACCATGTACA
301 ACACTGGAAG ACACGTATCC CTTCGCCTGG ACAAGGAGCA CTTGGTCAAC
351 ATATCTGGAG GGCCCATGAC ATACAGCCAC CGGCTGGAGG AGATCCGACT
401 ACACTTTGGG AGTGAGGACA GCCAAGGGTC GGAGCACCTC CTCAATGGAC
451 AGGCCTTCTC TGGGGAGGTG CAGCTCATCC ACTATAACCA TGAGCTATAT
501 ACGAATGTCA CAGAAGCTGC AAAGAGTCCA AATGGATTGG TGGTAGTTTC
551 TATATTTATA AAAGTTTCTG ATTCATCAAA CCCATTTCTT AATCGAATGC
601 TCAACAGAGA TACTATCACA AGAATAACAT ATAAAAATGA TGCATATTTA
651 CTACAGGGGC TTAATATAGA GGAACTATAT CCAGAGACCT CTAGTTTCAT
701 CACTTATGAT GGGTCGATGA CTATCCCACC CTGCTATGAG ACAGCAAGTT
751 GGATCATAAT GAACAAACCT GTCTATATAA CCAGGATGCA GATGCATTCC
801 TTGCGCCTGC TCAGCCAGAA CCAGCCATCT CAGATCTTTC TGAGCATGAG
851 TGACAACTTC AGGCCTGTCC AGCCACTCAA CAACCGCTGC ATCCGCACCA
901 ATATCAACTT CAGTTTACAG GGGAAGGACT GTCCAAACAA CCGAGCCCAG
951 AAGCTTCAGT ATAGAGTAAA TGAATGGCTC CTCAAGTAGG GAACAAAGCC
1001 AAGAAGAATC CCACCTCAGT GAAATGCTAC AACTGTGAAT TGACGTAACC
1051 TAGAATGTCC CCCTTCTTGC TTCTCTCTCC TTCTTTCCCC CAAGCCTCAT
1101 TCATTCTTGG GATTGGCCCT TTCTTCATGA AAAGTGTCTG CAAAACCATG
1151 GCAGAGGAAT ACATCTCTCA CACATACTCA CAAACACACA CACAAGCACT
1201 TGCACATACA TACAAACACA TGCAAACATA CCTACACACA CACACACTCT
1251 TACAACCTCC ATCATGGGAA GTCAAGTTTC AGAAACAAAA GTCTCATTCA
1301 TAAGAGGTCT TAGAAGAAAA TAACCAGTTA ACCTGATTTC AATTTTGATA
1351 CCGTTTTCCT GAACTAATAA ATCTACCCAA TGAGACTTTT CAGCCTTTGT
1401 ACATACAAAA TTCTTCCAAA AGAGAGAGGA GAAAATACAG CTCTGATGGC
1451 ATCAAACGGA CTTTGCATCA AGTAATTTCA GATAGTGTCC TAGGATCCTT
1501 TGAGGGTGCT GGTAGCAGGT GAGCAGGACA AAGTTGACCA AGGACACTTA
1551 TTTCTAGATT ATGATTCTTC TGTTTACTCA ACAATTTACA AAGAAAAAAA
1601 GGACAGACAT TGAAGAGCTA CACATTGTAT ATATATCACC ACAGACTATA
1651 AGGAAATGGA ATTATTTCCC TCTTTGTCAC ATATCTGTAG TAGGATTTGC
1701 CAAGATCAGA AATGATCCAT TTGCTGTTTC TTGTTTTCCA AAGGTCATAC
1751 ATTGTGTTTG GTTATTGTTA CCAGCTCAAT AAATGTGTTT AACGAGTTAA
1801 TTTCATTTTT CTGGCTTTGG TCTGTTCTCC TTCCTTACAG GCTAAGCCCT
1851 GGCTCCATGC AACTGCATTC TTTGATTTCA CTTGTTCCTT CATCTACATG
1901 TTTTGTTCAT TTGCAGCCAG TTTTTACTGA GTTTGTGGCA ATCAGGAATG
1951 CATTTGCTAA GCAAGTATGA CTTTAATTCC ACTCCATGGC TCAATCATTC
2001 ACATGAGGTG AGCTTCAGCC TGAGATAGCA GGCGACAGAC TTCTTGCGTT
2051 TCAAAACTGC CATGCCCCCC TGTGATGCTC CCGTGAAGGA ATGCACTTTG
2101 CCTTGTAAGT TCCTGGGAAA GGGGTATGTT TTCTCTCCAG GTGCAGCCAG
2151 ATCTCACAAA GTACAAAACG AATGCCTTTC TTTTCTTGTT TATAATGGTC
2201 ACTCACTGTG TTTGGTTACT GTCAAGAAAT CAATAAATGT GTTTAACAAG
2251 TCAAAAAAAA AAAAAAA
BLAST alert EMBL/EMBLNEW
EMBL:AF064654 Homo sapiens map 17q24, 5.13cR from GATA41C05 repeat region, complete sequence., N = 2, Score = 6764, P = 0
EMBLNEW:AC005683 Homo sapiens chromosome 17 clone RP11-158E11 map 17, WORKING DRAFT SEQUENCE, 2 ordered pieces., N = 3, Score = 6260, P = 0
Medline entries
1017341:
Lovejoy DA, Hewett-Emmett D, Porter CA, Cepoi D, Sheffield A,
Tashian RE. Evolutionarily conserved, "acatalytic" carbonic anhydrase-related protein XI contains a sequence motif present in the neuropeptide sauvagine: the human CA-RP XI gene (CA11) is embedded between the secretor gene cluster and the DBP gene at 19q13.3. Genomics 1998 Dec 15;54(3):484-93
Peptide information for frame 3
ORF from 3 bp to 986 bp; peptide length: 328
Category: similarity to known protein
Classification: unclassified
1 MEIVWEVLFL LQANFIVCIS AQQNSPKIHE GWWAYKEVVQ GSFVPVPSFW
51 GLVNSAWNLC SVGKRQSPVN IETSHMIFDP FLTPLRINTG GRKVSGTMYN
101 TGRHVSLRLD KEHLVNISGG PMTYSHRLEE IRLHFGSEDS QGSEHLLNGQ
151 AFSGEVQLIH YNHELYTNVT EAAKSPNGLV VVSIFIKVSD SSNPFLNRML
201 NRDTITRITY KNDAYLLQGL NIEELYPETS SFITYDGSMT IPPCYETASW
251 IIMNKPVYIT RMQMHSLRLL SQNQPSQIFL SMSDNFRPVQ PLNNRCIRTN
301 INFSLQGKDC PNNRAQKLQY RVNEWLLK
Alert BLASTP hits for DKFZphamy2_16e14, frame 3
PIR:JE0375 carbonic anhydrase-related protein - human, N = 1, Score = 137, P = 4.6e-14
SWISSNEW:CAHB_SHEEP CARBONIC ANHYDRASE-RELATED PROTEIN 2 PRECURSOR (CARP 2) (CA-RP II) (CA-XI)., N = 1, Score = 135, P = 7.5e-14
>PIR:JE0375 carbonic anhydrase-related protein - human Length = 328
HSPs:
Score = 137 (140-b bits)ι Expect = 4-be-14ι P = 4-be-14 Identities = lbl/267 (5β*)ι Positives = 223/287 (77*) fluery: 3D
EGUUAYKEVViQGSFVPVPSFUGLVNSAUNLCSVGKRtQSPVNIETSHMIFDPFLTPLRINT 81 E UU+YK+ +IQG+FVP P FUGLVN+AU+LC+VGKR(QSPV++E +++DPFL PLR++T Sbjct: 32 EDUUSYKDNLlQGNFVPGPPFUGLVNAAUSLCAVGKRlQSPVDVEVKRVLYDPFLPPLRLST 11 fluery: ID GGRKVSGTMYNTGRHVSLRLDKEHLVNISGGPMTYSHRLEEIRLHFGSEDSiQGSEHLLNG 141
GG K+ GT+YNTGRHVS +VN+SGGP+ YSHRL E+RL FG+ D
GSEH +N
Sbjct: 12
GGEKLRGTLYNTGRHVSFLPAPRPVVNVSGGPLLYSHRLSELRLLFGARDGAGSEHtQINH ISI
(Query: 150
(QAFSGEVώLIHYNHELYTNVTEAAKSPNGLVVVSIFIKVSDSSNPFLNRMLNRDTITRIT 201 Q FS EV<QLIH + N ELY N + A + + PNGL + + S+F+ V +
+SNPFL+R+LNRDTITRI+ Sb j ct : 152
(QGFSAEViQLIHFNiQELYGNFSAASRGPNGLAILSLFVNVASTSNPFLSRLLNRDTITRIS 211 (Query : 21D
YKNDAYLL(QGLNIEELYPETSSFITYDGSMTIPPCYETASUIIMNKPVYITRM(3MHSLRL 2b1 YKNDAY Lfl L++E L+PE+ FITY GS++ PPC ET +UI++++ + IT + (QMHSLRL Sb j ct : 212
YKNDAYFLlQDLSLELLFPESFGFITYiQGSLSTPPCSETVTUILIDRALNITSLlQMHSLRL 271
(Query : 270 LS(QN(QPS(QIFLSMSDNFRPV(3PLNNRCIRTNINFSL(3GKDC-- PNNR 314 LSlQN PSlQIF S + S N RP + (QPL +R +R N + + C PN R Sb jct : 272 LS(QNPPS(3IF(QSLSGNSRPL(QPLAHRALRGNRDPRHPERRCRGPNYR 316
Pedant information for DKFZphamy2_16e14, frame 3
Report for DKFZphamy2_16e14.3
[LENGTH] 328
[MW] 37563.11
[pI] 8.22
[HOMOL] PIR:JE0375 carbonic anhydrase-related protein - human 1e-101
[BLOCKS] DM01101B
[BLOCKS] BL00162F
[BLOCKS] BL00162E
[BLOCKS] BL00162D
[BLOCKS] BL00162C Eukaryotic-type carbonic anhydrases proteins
[BLOCKS] BL00162A Eukaryotic-type carbonic anhydrases proteins
[SCOP] d1znca_ 2.56.1.1.3 Carbonic anhydrase [human (Homo sapiens 1e-103
[SCOP] d2cba 2.56.1.1.2 Carbonic anhydrase [human (Homo sapiens 1e-17
[EC] 4.2.1.1 Carbonate dehydratase 1e-36
[EC] 3.1.3.48 Protein-tyrosine-phosphatase 2e-20
[PIRKW] blocked amino end 8e-21
[PIRKW] carbon-oxygen lyase 1e-36
[PIRKW] zinc 1e-36
[PIRKW] polymorphism 2e-20
[PIRKW] hydro-lyase 1e-36
[PIRKW] transmembrane protein 3e-23
[PIRKW] tyrosine-specific phosphatase 2e-20
[PIRKW] brain 6e-16
[PIRKW] acetylated amino end 1e-36
[PIRKW] phosphatidylinositol linkage 2e-11
[PIRKW] receptor 2e-20
[PIRKW] liver 3e-21
[PIRKW] phosphoprotein 2e-20
[PIRKW] saliva 2e-21
[PIRKW] glycoprotein 2e-22
[PIRKW] mitochondrion 1e-32
[PIRKW] monomer 3e-32
[PIRKW] alternative splicing 6e-16
[PIRKW] lipoprotein 2e-11
[PIRKW] pyroglutamic acid 2e-21
[PIRKW] metalloprotein 6e-35
[PIRKW] muscle 4e-31
[PIRKW] membrane protein 2e-11
[PIRKW] phosphoric monoester hydrolase 2e-20
[PIRKW] homodimer 3e-23
[SUPFAM] fibronectin type III repeat homology 2e-20
[SUPFAM] carbonic anhydrase homology 1e-36
[SUPFAM] protein-tyrosine-phosphatase, receptor type zeta 6e-16
[SUPFAM] carbonate dehydratase 1e-36
[SUPFAM] protein-tyrosine-phosphatase, receptor type gamma 2e-20
[SUPFAM] protein-tyrosine-phosphatase homology 2e-20
[SUPFAM] leukocyte common antigen cytosolic domain homology 2e-20
[PFAM] Eukaryotic-type carbonic anhydrases
[KW] All_Beta
[KW] 3D
[KW] SIGNAL_PEPTIDE 22
SElQ
MEIVUEVLFLL(3ANFIVCISA<Q(QNSPKIHEGUUAYKEVV(QGSFVPVPSFUGLVNSAUNLC lugc-
SElQ
SVGKRiQSPVNIETSHMIFDPFLTPLRINTGGRKVSGTMYNTGRHVSLRLDKEHLVNISGG luge- - . TTTTCCCEETTTTTEETTTTCEEEEETT- TTCEEEEEETTTTEEEEECTTTTTEEEEE SElQ
PMTYSHRLEEIRLHFGSEDSiQGSEHLLNGlQAFSGEVlQLIHYNHELYTNVTEAAKSPNGLV luge- TTCCCEEEEEEEEEETTTTTTCTTTEETTBCCCEEEEEEEEEGG- GTTHHHHHCTTTTEE SElQ
VVSIFIKVSDSSNPFLNRMLNRDTITRITYKNDAYLLiQGLNIEELYPETSSFITYDGSMT luge- EEEEEEEEC-CCCGGGHHHH-- HHGGGCCTTTEEEETTTTCGGGGCCCCCCEEEEEECCC SElQ
IPPCYETASUIIMNKPVYITRM(QMHSLRLLS(QN(QPS(QIFLSMSDNFRPV(QPLNNRCIRTN lugc-
TTTTCCCEEEEEECCCEEECHHHHHHHHCCBCCTTTTCCCBTTTTCCCCCCTTTTCCEEC SElQ INFSLiQGKDCPNNRAlQKLiQYRVNEULLK luge-
(No Prosite data available for DKFZphamy2_lbel4 -3)
Pfam for DKFZphamy2_lbel4 -3
HMM NAME Eukaryotic-type carbonic anhydrases
HMM
*UCYgeHUGPEHH UHkhYPIAU. GDR(QSPINI(QUkearYDPS U Y E + U +++ + + G R(QSP + NI + +
+ DP
(Query 33
UAYKEVViQGSFVPVPSFUGLVNSAUNLCSVGKRiQSPVNIETSHMIFDPF 61
HMM LKPUrv - SYYpaUCrEUelUNNGHSFiQVeFDDSMDMSVLsGGPLPgHPYR
L P+R+ ++ ++++ ++ N+G+ + +D +SGGP++
+ + R (Query 62 LTPLRINTGGRKVSG — TMYNTGRHVSLRLDK-
EHLVNISGGPMTY-SHR 127
HMM
Lk(QFHFHUGGASsNDUGSEHTVDGmkYPMELHLVHUNStKYnNYdEA(Qdq L + ++H G S++ +GSEH ++G +++ E+ L+H+N +Y N+
EA + +
(Query 126 LEEIRLHFG--
SEDS(QGSEHLLNG(QAFSGEV(QLIHYNHELYTNVTEAAKS 175 HMM
PDGLAVIGVFMKVGNYqENPyLiQKVv. - DALdnlKYKGKratMTNFDPsC
P+GL V+ +F+KV NP L++ + D + I YK + +++++
(Query 17b PNGLVVVSIFIKVS- DSSNPFLNRMLNRDTITRITYKNDAYLLiQGLNIEE 224
HMM LLPpPnCRDYUTYPGSLTTPPChECVTUIVCKEPIsISsE(QMUKFRsLLF
L P+ + TY GS+T+PPC+E UI+ P+ I + (QM +R L
(Query 225 LYPE--
TSSFITYDGSMTIPPCYETASUIIMNKPVYITRM(QMHSLRLLS(Q 272
HMM NhEGEeeVpMVDNURPPiQPLKhRvVRASF* N +M DN+RP (QPL++R +R +
(Query 273 N(QPS(QIFLSMSDNFRPV(QPLNNRCIRTNI 3D1
DKFZphamy2_1c12
group: nucleic acid management
DKFZphamy2_1c12 encodes a novel 422 amino acid protein with partial identity to I-kappa-B-related protein and to BRCA1. I-kappa-B-related protein interacts with transcription factors, and BRCA1 has a function in DNA damage response. I-kappa-B-alpha mutations contribute to constitutive NF-kappaB activity in cultured and primary HRS (Hodgkin/Reed-Sternberg) cells and are therefore involved in the pathogenesis of Hodgkin's disease (HD).
The new protein can find application in modulating DNA repair and mutagenesis and also in expression profiling in HD-related syndromes.
similarity to I-kappa-B-related protein
Sequenced by MediGenomix
Locus: unknown
Insert length: 1645 bp
Poly A stretch at pos. 1626, polyadenylation signal at pos. 1605
1 GGATTTTCCT TGGTCTTAAG ATGGGTAGAA ATGTGATGCG ACACATGTCT
51 GATGACTTAG GAAGTTATGT TTCTCTTTCG TGTGATGACT TTTCTTCACA
101 GGAATTAGAG ATTTTCATTT GCTCCTTTTC CTCCTCCTGG CTTCAAATGT
151 TTGTTGCAGA GGCAGTCTTT AAAAAGTTGT GTCTACAGAG CTCTGGCAGT
201 GTTTCTTCTG AGCCACTCTC TCTTCAGAAA ATGGTATATT CCTATTTACC
251 AGCCTTGGGG AAAACTGGTG TGCTTGGGTC TGGAAAGATT CAGGTGTCAA
301 AGAAAATAGG ACAGCGGCCT TGTTTTGACT CTCAGAGAAC CTTACTAATG
351 CTGAATGGTA CTAAACAAAA ACAAGTCGAA GGGCTGCCAG AGTTACTAGA
401 CCTGAACCTT GCTAAATGTT CCTCATCATT AAAAAAATTG AAAAAGAAGT
451 CAGAAGGAGA ATTGTCATGT TCCAAGGAGA ATTGCCCCTC TGTAGTTAAA
501 AAGATGAATT TTCACAAGAC TAATCTAAAA GGAGAAACAG CCCTGCATAG
551 AGCTTGCATA AATAACCAAG TGGAGAAATT GATTCTTCTT CTCTCTTTGC
601 CAGGAATAGA CATCAATGTT AAAGACAATG CTGGCTGGAC GCCTTTGCAT
651 GAAGCCTGTA ACTATGGCAA CACAGTGTGT GTCCAGGAAA TTTTGCAACG
701 TTGTCCAGAG GTAGATCTGC TCACTCAAGT GGACGGGGTG ACTCCTTTGC
751 ATGATGCACT GTCAAACGGA CATGTAGAAA TTGGCAAGCT GCTACTACAG
801 CATGGGGGCC CAGTGCTTTT ACAACAGAGG AATGCTAAGG GAGAATTGCC
851 CTTGGATTAT GTGGTTTCAC CTCAAATCAA AGAAGAACTG TTTGCTATTA
901 CAAAAATAGA AGATACAGTG GAGAACTTTC ATGCACAAGC AGAGAAACAT
951 TTTCATTACC AGCAACTTGA ATTTGGCTCC TTTTTACTTA GTAGGATGTT
1001 GCTAAATTTT TGTTCAATTT TTGATTTATC TTCAGAGTTC ATTTTAGCTT
1051 CCAAAGGGTT AACTCATCTA AATGAACTGC TTATGGCTTG TAAAAGTCAT
1101 AAAGAAACCA CCAGTGTTCA TACTGACTGG TTACTGGATC TTTATGCTGG
1151 AAATATAAAG ACATTGCAGA AACTCCCACA CATTCTTAAG GAACTGCCTG
1201 AGAATTTGAA AGTGTGTCCT GGGGTACACA CTGAGGCCTT GATGATAACA
1251 TTGGAAATGA TGTGTCGGTC AGTCATGGAG TTTTCATGAT GATGCTAGAA
1301 AGTATGGATT GACTTTCTAA ATCTGTTCAG TTTGCATTGG TACTTACTGT
1351 GGACTTCATA GCTTACTGAC AGATAGTAAT TTGATTTATT TATTGACAGA
1401 CTTTGCAGCC TTGCTAAATT TTAAAAGCAT TTTTAAAAAA ACTTCTACAA
1451 AACTCTAGTA TGGGCTTCTG ACTTTTTCCA GGGTGTAGAA TTTGACTCAA
1501 AAGTAAAAAT AATTTTGTTT TAGTATATTC TACTTTCATT AATGTTTTTT
1551 TGTTCTGAAA GTGATATTAT ATTGTACATG TAAAATTAAT TTAAATATTT
1601 TTTCAAATAA AAATGTAATG TCCTGTAAAA AAAAAAAAAA AAAAA
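The poly A stretch and polyadenylation signal positions reported for this clone can be located directly in the 3' end of the sequence. The sketch below is illustrative only and not part of the original record. It assumes that the reported positions are 0-based offsets (which appears to fit this record), that the signal is the canonical hexamer AATAAA, and that a terminal run of at least ten A residues counts as the poly A stretch. The names `find_polya_features`, `min_tail` and `offset` are hypothetical.

```python
def find_polya_features(seq, offset=0, signal="AATAAA", min_tail=10):
    """Locate the polyadenylation signal and the terminal poly A stretch.

    Returns (signal_pos, tail_pos) as 0-based positions shifted by `offset`
    (the 0-based index of the first base of `seq` in the full insert).
    Either value is None when the feature is not found.
    """
    seq = seq.upper().replace(" ", "")
    body = seq.rstrip("A")                  # sequence without the terminal A run
    tail_len = len(seq) - len(body)
    tail_pos = offset + len(body) if tail_len >= min_tail else None
    # Take the last signal hexamer that lies entirely upstream of the tail.
    search_end = len(body) if tail_pos is not None else len(seq)
    idx = seq.rfind(signal, 0, search_end)
    signal_pos = offset + idx if idx >= 0 else None
    return signal_pos, tail_pos


# Last listed line of the insert (bases 1601 onward), so the 0-based offset is 1600.
tail = "TTTCAAATAA AAATGTAATG TCCTGTAAAA AAAAAAAAAA AAAAA"
print(find_polya_features(tail, offset=1600))
```

Searching only upstream of the terminal A run avoids matching the hexamer inside the tail itself; a different threshold for `min_tail` would be an equally reasonable choice.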
BLAST Results
No BLAST result
Medline entries
No Medline entry
Peptide information for frame 3
ORF from 21 bp to 1286 bp; peptide length: 422
Category: similarity to known protein
Classification: Cell signaling/communication
1 MGRNVMRHMS DDLGSYVSLS CDDFSSQELE IFICSFSSSW LQMFVAEAVF
51 KKLCLQSSGS VSSEPLSLQK MVYSYLPALG KTGVLGSGKI QVSKKIGQRP
101 CFDSQRTLLM LNGTKQKQVE GLPELLDLNL AKCSSSLKKL KKKSEGELSC
151 SKENCPSVVK KMNFHKTNLK GETALHRACI NNQVEKLILL LSLPGIDINV
201 KDNAGWTPLH EACNYGNTVC VQEILQRCPE VDLLTQVDGV TPLHDALSNG
251 HVEIGKLLLQ HGGPVLLQQR NAKGELPLDY VVSPQIKEEL FAITKIEDTV
301 ENFHAQAEKH FHYQQLEFGS FLLSRMLLNF CSIFDLSSEF ILASKGLTHL
351 NELLMACKSH KETTSVHTDW LLDLYAGNIK TLQKLPHILK ELPENLKVCP
401 GVHTEALMIT LEMMCRSVME FS
BLASTP hits
No BLASTP hits available
Alert BLASTP hits for DKFZphamy2_1c12, frame 3
PIR:A56421 I-kappa-B-related protein - human, N = 1, Score = 242, P = 4.6e-18
TREMBLNEW:AF036042_1 gene: "BARD1"; product: "BRCA1-associated RING domain protein"; Homo sapiens BRCA1-associated RING domain protein (BARD1) gene, exons 10, 11 and complete cds., N = 1, Score = 236, P = 6.1e-17
>PIR:A56421 I-kappa-B-related protein - human Length = 481
HSPs:
Score = 242 (3b-3 bits)τ Expect = 4-be-lδι P = 4-be-lδ Identities = 52/116 (44*)ι Positives = 71/llδ (b0*) (Query: 15b
PSVVKKMNFHKTNLKGETALHRACINNiQVEKLILLLSLPGIDINVKDNAGUTPLHEACNY 215 P K + + + N GET LHRACI (Q+ ++ L+ G +N +D
GUTPLHEACNY
Sbjct: 354 PGAAKGSKUNRRNDMGETLLHRACIEG(QLRRV(QDLVR- (QGHPLNPRDYCGUTPLHEACNY 412
(Query: 21b GNTVCVIQEILIQRCPEVDLL-- T(QVDGVTPLHDALSNGHVEIGKLLL(QHGGPVLL(Q(QRNA 272
G+ V+ +L VD +G+TPLHDAL+ GH E+ +LLL+ G V L+ R A
Sbjct: 413 GHLEIVRFLLDHGAAVDDPGGώGCEGITPLHDALNCGHFEVAELLLERGASVTLRTRKA 471
Pedant information for DKFZphamy2_1c12, frame 3
Report for DKFZphamy2_1c12.3
[LENGTH] 422
[MW] 47071.16
[pI] 6.57
[HOMOL] PIR:A56421 I-kappa-B-related protein - human 3e-11
[FUNCAT] 99 unclassified proteins [S. cerevisiae, YIL112w] 3e-11
[FUNCAT] 06.13.01 cytoplasmic degradation [S. cerevisiae, YGR232w] 4e-06
[FUNCAT] 30.10 nuclear organization [S. cerevisiae, YIR033w] 2e-04
[FUNCAT] 04.05.01.07 chromatin modification [S. cerevisiae, YIR033w] 2e-04
[SCOP] d1awcb_ 1.11.3.1.2 GA binding protein (GABP) alpha GA binding 6e-24
[EC] 3.1.3.53 Myosin-light-chain-phosphatase 1e-06
[PIRKW] phosphotransferase 3e-07
[PIRKW] tandem repeat 1e-06
[PIRKW] transmembrane protein 7e-10
[PIRKW] serine/threonine-specific protein kinase 3e-07
[PIRKW] phosphoprotein 3e-10
[PIRKW] integrin binding 3e-07
[PIRKW] alternative splicing 3e-11
[PIRKW] peripheral membrane protein 2e-01
[PIRKW] transcription regulation 3e-06
[PIRKW] phosphoric monoester hydrolase 1e-06
[PIRKW] cytoskeleton 4e-10
[PIRKW] smooth muscle 1e-06
[SUPFAM] ankyrin 3e-11
[SUPFAM] ankyrin repeat homology 3e-11
[SUPFAM] unassigned ankyrin repeat proteins 7e-10
[PFAM] Ank repeat
[KW] Irregular
[KW] 3D
[KW] LOW_COMPLEXITY 6.53 %
SElQ MGRNVMRHMSDDLGSYVSLSCDDFSSlQELEIFICSFSSSULlQMFVAEAVFKKLCLiQSSGS SEG xxxxxx lawcB
SElQ VSSEPLSLlQKMVYSYLPALGKTGVLGSGKIiQVSKKIGiQRPCFDSlQRTLLMLNGTKiQKlQVE SEG xxxxxxxx lawcB
SElQ GLPELLDLNLAKCSSSLKKLKKKSEGELSCSKENCPSVVKKMNFHKTNLKGETALHRACI SEG xxxxxxxxxxxxxxxxxxxxxx lawcB
SE(Q NN(QVEKLILLLSLPGIDINVKDNAGUTPLHEACNYGNTVCV(QEIL3RCPEVDLLT(QVDGV
SEG lawcB TTTTCCHHHHHHHHCCHHHHHHHHHCCCTTTTCTTTTC- SEύ TPLHDALSNGHVEIGKLLL(QHGGPVLL(Q(QRNAKGELPLDYVVSP(QIKEELFAITKIEDTV
SEG lawcB
CHHHHHHHHTTHHHHHHHHHCCCTT SElQ ENFHAIQAEKHFHYIQIQLEFGSFLLSRMLLNFCSIFDLSSEFILASKGLTHLNELLMACKSH SEG lawcB
SElQ KETTSVHTDULLDLYAGNIKTLIQKLPHILKELPENLKVCPGVHTEALMITLEMMCRSVME SEG lawcB
SEiQ FS SEG lawcB
(No Prosite data available for DKFZphamy2_lcl2 -3)
Pfam for DKFZphamy2_lcl2 - 3
HMM_NAME Ank repeat
HMM *GyTPLHIAARyNNvEMVrlLL(QH-GADIN* G+T+LH A+++N+VE LLL+ G DIN
(Query 171 GETALHRACINNlQVEKLILLLSLPGIDIN 111
34-46 (bits) f: 20S t: 232 Target: dkfzphamy2_lcl2- 3 similarity to I-kappa-B-related protein
Alignment to HMM consensus: (Query *GyTPLHIAARyNNvEMVrlLL(3HGADIN*
G+TPLH A+ Y+N+ +V+ L(Q+ + ++ dkfzphamy2 205 GUTPLHEACNYGNTVCV(QEIL(QRCPEVD 232
(Query f: 231 t: 2bb Target: dkfzphamy2_lcl2-3 similarity to I-kappa-B-related protein Alignment to HMM consensus:
HMM *GyTPLHIAARyNNvEMVrlLL(QHGADIN* G TPLH A +++VE+ +LLL(QHG +
Query 239 GVTPLHDALSNGHVEIGKLLLQHGGPVL 266
DKFZphamy2_1i1
group: nucleic acid management
DKFZphamy2_lil encodes a novel 629 amino acid protein with similarity to the murine hemin-sensitive initiation factor 2 alpha kinase.
The hemin-sensitive initiation factor 2 alpha kinase is expressed predominantly in liver, spleen, colon and uterus and contains 2 protein kinase motifs. The mouse homologue inhibits protein synthesis in stress conditions by phosphorylation of eIF-2-alpha. Four different eIF2alpha kinases have been identified in mammalian cells: the heme-regulated inhibitor (HRI), the interferon-inducible RNA-dependent kinase (PKR), the endoplasmic reticulum-resident kinase (PERK) and MGCN2. The new protein represents a new member of this family.
The new protein can find application in modulating/blocking of translation.
similarity to hemin-sensitive initiation factor 2 alpha kinase (Mus musculus), complete cds; probably complete; in genomic clone DJ0042M02
Sequenced by MediGenomix
Locus: /map="37.H cR from top of Chr7 linkage group"
Insert length: 2863 bp
Poly A stretch at pos. 2844, polyadenylation signal at pos. 2824
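The poly A stretch and polyadenylation signal positions reported for each clone can be located with a simple scan. The sketch below is a minimal illustration, not the annotation pipeline used for these clones: the function name, the minimum run length of 10 and the choice of the hexamers AATAAA and ATTAAA are assumptions. It is applied to the final 63 bases of this insert, assumed to begin at 1-based position 2801.

```python
import re

# Hexamers commonly treated as polyadenylation signals (an assumption of this sketch).
SIGNALS = ("AATAAA", "ATTAAA")

def find_polya_features(seq, min_run=10):
    """Return (poly_a_start, signal_start) as 0-based offsets; None where absent."""
    seq = seq.upper().replace(" ", "")
    # Take the last run of at least `min_run` A residues as the poly A stretch.
    runs = list(re.finditer(r"A{%d,}" % min_run, seq))
    if not runs:
        return None, None
    poly_a_start = runs[-1].start()
    # Report the signal hexamer closest to, and upstream of, the poly A stretch.
    upstream = seq[:poly_a_start]
    best = max(upstream.rfind(s) for s in SIGNALS)
    return poly_a_start, (best if best >= 0 else None)

# Last 63 bases of the insert (assumed to start at 1-based position 2801).
tail = "TATTGACACT TGACAACTGG CCTAAATAAA AAGACTCTGA CTCCAAAAAA AAAAAAAAAA AAA"
poly_a, signal = find_polya_features(tail)
print(poly_a + 2800, signal + 2800)
```

Within the fragment the offsets are 44 and 24; adding the 2800 bases that precede the fragment gives 2844 and 2824, which suggests the positions in the listing are 0-based offsets.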
1 GCAGTGCTGG GCTGGCCGGC GGGCTGGGCT GCGGCCCGCG CGCGGCCGGC
51 GATGCAGGGG GGCAACTCCG GGGTCCGCAA GCGCGAAGAG GAGGGCGACG
101 GGGCTGGGGC TGTGGCTGCG CCGCCGGCCA TCGACTTTCC CGCCGAGGGC
151 CCGGACCCCG AATATGACGA ATCTGATGTT CCAGCAGAAA TCCAGGTGTT
201 AAAAGAACCC CTACAACAGC CAACCTTCCC TTTTGCAGTT GCAAACCAAC
251 TCTTGCTGGT TTCTTTGCTG GAGCACTTGA GCCACGTGCA TGAACCAAAC
301 CCACTTCGTT CAAGACAGGT GTTTAAGCTA CTTTGCCAGA CGTTTATCAA
351 AATGGGGCTG CTGTCTTCTT TCACTTGTAG TGACGAGTTT AGCTCATTGA
401 GACTACATCA CAACAGAGCT ATTACTCACT TAATGAGGTC TGCTAAAGAG
451 AGAGTTCGTC AGGATCCTTG TGAGGATATT TCTCGTATCC AGAAAATCAG
501 ATCAAGGGAA GTAGCCTTGG AAGCACAAAC TTCACGTTAC TTAAATGAAT
551 TTGAAGAACT TGCCATCTTA GGAAAAGGTG GATACGGAAG AGTATACAAG
601 GTCAGGAATA AATTAGATGG TCAGTATTAT GCAATAAAAA AAATCCTGAT
651 TAAGGGTGCA ACTAAAACAG TTTGCATGAA GGTCCTACGG GAAGTGAAGG
701 TGCTGGCAGG TCTTCAGCAC CCCAATATTG TTGGCTATCA CACCGCGTGG
751 ATAGAACATG TTCATGTGAT TCAGCCACGA GACAGAGCTG CCATTGAGTT
801 GCCATCTCTG GAAGTGCTCT CCGACCAGGA AGAGGACAGA GAGCAATGTG
851 GTGTTAAAAA TGATGAAAGT AGCAGCTCAT CCATTATCTT TGCTGAGCCC
901 ACCCCAGAAA AAGAAAAACG CTTTGGAGAA TCTGACACTG AAAATCAGAA
951 TAACAAGTCG GTGAAGTACA CCACCAATTT AGTCATAAGA GAATCTGGTG
1001 AACTTGAGTC GACCCTGGAG CTCCAGGAAA ATGGCTTGGC TGGTTTGTCT
1051 GCCAGTTCAA TTGTGGAACA GCAGCTGCCA CTCAGGCGTA ATTCCCACCT
1101 AGAGGAGAGT TTCACATCCA CCGAAGAATC TTCCGAAGAA AATGTCAACT
1151 TTTTGGGTCA GACAGAGGCA CAGTACCACC TGATGCTGCA CATCCAGATG
1201 CAGCTGTGTG AGCTCTCGCT GTGGGATTGG ATAGTCGAGA GAAACAAGCG
1251 GGGCCGGGAG TATGTGGACG AGTCTGCCTG TCCTTATGTT ATGGCCAATG
1301 TTGCAACAAA AATTTTTCAA GAATTGGTAG AAGGTGTGTT TTACATACAT
1351 AACATGGGAA TTGTGCACCG AGATCTGAAG CCAAGAAATA TTTTTCTTCA
1401 TGGCCCTGAT CAGCAAGTAA AAATAGGAGA CTTTGGTCTG GCCTGCACAG
1451 ACATCCTACA GAAGAACACA GACTGGACCA ACAGAAACGG GAAGAGAACA
1501 CCAACACATA CGTCCAGAGT GGGTACTTGT CTGTACGCTT CACCCGAACA
1551 GTTGGAAGGA TCTGAGTATG ATGCCAAGTC AGATATGTAC AGCTTGGGTG
1601 TGGTCCTGCT AGAGCTCTTT CAGCCGTTTG GAACAGAAAT GGAGCGAGCA
1651 GAAGTTCTAA CAGGTTTAAG AACTGGTCAG TTGCCGGAAT CCCTCCGTAA
1701 AAGGTGTCCA GTGCAAGCCA AGTATATCCA GCACTTAACG AGAAGGAACT
1751 CATCGCAGAG ACCATCTGCC ATTCAGCTGC TGCAGAGTGA ACTTTTCCAA
1801 AATTCTGGAA ATGTTAACCT CACCCTACAG ATGAAGATAA TAGAGCAAGA
1851 AAAAGAAATT GCAGAACTAA AGAAGCAGCT AAACCTCCTT TCTCAAGACA
1901 AAGGGGTGAG GGATGACGGA AAGGATGGGG GCGTGGGATG AAAGTGGACT
1951 TAACTTTTAA GGTAGTTAAC TGGAATGTAA ATTTTTAATC TTTATTAGGG
2001 TATAGTTGGT ACAATGCTTC GTTGTATTTA GTAAGCCTTT ACAAGACTTG
2051 TTAAAGATGT CAGAGTGCCC CAAGCTGCCG TTCCTTCCCT TCCTGCCCCA
2101 CAAGCTCCTT TTCCTGAATT TCCTACCTAA ATATTAACCA TATGCCTAGT
2151 CTCTGAAACT AAAAACTTGG ACCTCATCCT CAATTATTTT CTCCTTTCAA
2201 CTCTGTTGAC CCTCTGTCTG GTCTTCCTCT AGAAGGTTCT ACCGCAGAAA
2251 TTGATGTGTG CTCCCTGCCC TCGTCACTGC CCAAGCCCGG GCCTGCACAT
2301 ACTCACTGGA CTGTTCCAGT TTTGACAGCT GCCAGTCTTC CTGCCCCTTT
2351 CACACTGCAG CTGAAGTTCA TTACCTGAAG GACGCCTCAT CATTTCATTC
2401 CTTGGCTCCA AACCTTCTGC TGCCTCTAAG ATAAAAGCTC AACTTCTTAA
2451 CAGTGTACAG TGTGCAACTT CCAACCTTTT TATCTGTTCT CTCCACCTTC
2501 AGTTTAGCGT CATTCCAAAA CCACACCCTT GCAAAGCTTT GTACTCCGCA
2551 CCCCAGATGA TCTCCAGGCA GCTCAGATCT CTTTCCTGCC TTTGCCCTGC
2601 ACTGTTCCCC GGTACTTCCT CCTTTATTGT AGCACTCAGC TCCCCAGCCA
2651 ATCTGTACAT CCCTCAGAGG CAGCGATCTG ATGAATTGGT TTTTGAATCC
2701 CAGAAAGGGT CTGCCATGGA GTTGGCAGTC ATCACGGTAG ATGGCGTATG
2751 ATTTTGCTGA ATTTTAAATA AAATGAAAAC CATAAATTAC ATGATGCTTT
2801 TATTGACACT TGACAACTGG CCTAAATAAA AAGACTCTGA CTCCAAAAAA
2851 AAAAAAAAAA AAA
BLAST Results
Entry AF028808 from database EMBL:
Mus musculus hemin-sensitive initiation factor 2 alpha kinase mRNA, complete cds.
Score = bbflδ, P = 2.7e-21b, identities = 1122/2S34
Entry AC005115 from database EMBL:
Homo sapiens clone DJ0042M02, WORKING DRAFT SEQUENCE, 13 unordered pieces.
Score = 511b, P = 0.0e+00, identities = 1010/1146
Medline entries
11042001: Berlanga J.J., Herrero S., de Haro C.; Characterization of the hemin-sensitive eukaryotic initiation factor 2alpha kinase from mouse nonerythroid cells; J. Biol. Chem. 273(48): 32340-32346 (1998).
Peptide information for frame 1
ORF from 52 bp to 1938 bp; peptide length: 629
Category: similarity to known protein
Classification: Protein management
Prosite motifs: PROTEIN_KINASE_ATP (173-196)
PROTEIN_KINASE_ATP (173-197)
PROTEIN_KINASE_ST (437-449)
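The ORF coordinates and the peptide length in each entry are tied together by simple arithmetic: an ORF spanning bases start to end (inclusive, stop codon excluded) encodes (end - start + 1) / 3 residues. The sketch below checks this, taking the coordinates for this entry as 52 to 1938 bp with a 629-residue peptide (our reading of the listing); the function names are hypothetical.

```python
def orf_codons(start_bp, end_bp):
    """Number of whole codons in an ORF given 1-based inclusive coordinates.

    Returns None when the span is not a multiple of three bases.
    """
    span = end_bp - start_bp + 1
    if span <= 0 or span % 3 != 0:
        return None
    return span // 3

def orf_matches_peptide(start_bp, end_bp, peptide_len):
    """True when the ORF span (stop codon excluded) encodes exactly peptide_len residues."""
    return orf_codons(start_bp, end_bp) == peptide_len

# Coordinates as read from this entry (an assumption about the damaged listing).
print(orf_codons(52, 1938))                # number of codons in the ORF
print(orf_matches_peptide(52, 1938, 629))  # consistency check against the peptide length
```

A mismatch from this check is a quick way to spot a misread digit in either the coordinates or the peptide length.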
1 MQGGNSGVRK REEEGDGAGA VAAPPAIDFP AEGPDPEYDE SDVPAEIQVL
51 KEPLQQPTFP FAVANQLLLV SLLEHLSHVH EPNPLRSRQV FKLLCQTFIK
101 MGLLSSFTCS DEFSSLRLHH NRAITHLMRS AKERVRQDPC EDISRIQKIR
151 SREVALEAQT SRYLNEFEEL AILGKGGYGR VYKVRNKLDG QYYAIKKILI
201 KGATKTVCMK VLREVKVLAG LQHPNIVGYH TAWIEHVHVI QPRDRAAIEL
251 PSLEVLSDQE EDREQCGVKN DESSSSSIIF AEPTPEKEKR FGESDTENQN
301 NKSVKYTTNL VIRESGELES TLELQENGLA GLSASSIVEQ QLPLRRNSHL
351 EESFTSTEES SEENVNFLGQ TEAQYHLMLH IQMQLCELSL WDWIVERNKR
401 GREYVDESAC PYVMANVATK IFQELVEGVF YIHNMGIVHR DLKPRNIFLH
451 GPDQQVKIGD FGLACTDILQ KNTDWTNRNG KRTPTHTSRV GTCLYASPEQ
501 LEGSEYDAKS DMYSLGVVLL ELFQPFGTEM ERAEVLTGLR TGQLPESLRK
551 RCPVQAKYIQ HLTRRNSSQR PSAIQLLQSE LFQNSGNVNL TLQMKIIEQE
601 KEIAELKKQL NLLSQDKGVR DDGKDGGVG
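The peptide above is the conceptual translation of the ORF in frame 1. As a minimal sketch of that step, assuming the standard genetic code, the following translates the first 30 bases of the ORF (bases 52 to 81 of the nucleotide listing for this clone); the function and variable names are illustrative only.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order at each position.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}

def translate(dna):
    """Translate a DNA string codon by codon, stopping at the first stop codon."""
    dna = dna.upper().replace(" ", "")
    protein = []
    for i in range(0, len(dna) - 2, 3):
        residue = CODON_TABLE[dna[i:i + 3]]
        if residue == "*":
            break
        protein.append(residue)
    return "".join(protein)

# Bases 52-81 of the insert, copied from the nucleotide listing.
orf_start = ("ATGCAGGGG"    # 52-60
             "GGCAACTCCG"   # 61-70
             "GGGTCCGCAA"   # 71-80
             "G")           # 81
print(translate(orf_start))
```

The result can be compared with the first ten residues of the peptide listing as a check on both the reading frame and the transcription of the sequence.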
BLASTP hits
No BLASTP hits available
Alert BLASTP hits for DKFZphamy2_lil, frame 1
No Alert BLASTP hits found
Pedant information for DKFZphamy2_lil, frame 1
Report for DKFZphamy2_lil.1
[LENGTH] 646
[MW] 72736.76
[pI] 5.60
[HOMOL] SWISSNEW:HRI_MOUSE HEME-REGULATED EUKARYOTIC INITIATION FACTOR EIF-2-ALPHA KINASE (EC 2.7.1.-) (HEME-REGULATED INHIBITOR) (HRI) (HEME-CONTROLLED REPRESSOR) (HCR) (HEMIN-SENSITIVE INITIATION FACTOR-2 ALPHA KINASE). 0.0
[FUNCAT! D5.D7 translational control ES. cerevisiaei YDR283c! 2e-43 EFUNCAT! 30-03 organization of cytoplasm ES- cerevisiaei YDR263c! 2e-43
EFUNCAT! .10.02-11 key kinases ES- cerevisiaei Y0R231w! 8e-14 EFUNCAT! 03- D4 buddingi cell polarity and filament formation ES- cerevisiaei Y0R231w! δe-14 EFUNCAT! 03-01 cell growth ES- cerevisiaei Y0R231w! δe-14
EFUNCAT! 11-01 stress response ES- cerevisiaei Y0R231w! δe-14 EFUNCAT! 03-22 cell cycle control and mitosis ES- cerevisiaei Y0R231w! fle-lM EFUNCAT! 30-10 nuclear organization ES- cerevisiaei YKLlOlw! δe-12
EFUNCAT! 11 unclassified proteins ES- cerevisiaei YPLlSDw! βe-12
EFUNCAT! D3-13 meiosis ES- cerevisiaei YDR523c! Ξe-11
EFUNCAT! D3-10 sporulation and germination ES- cerevisiaei YDR523c! Ξe-11
EFUNCAT! D1-01 biogenesis of cell wall ES- cerevisiaei YPL14Dc! 4e-ll
EFUNCAT! 10-D3.11 key kinases ES- cerevisiaei YCR073c! 1e-ll EFUNCAT! 16 classification not yet clear-cut ES- cerevisiaei YHRD62c! le-10
EFUNCAT! 03-07 pheromone response! mating-type determination! sex-specific proteins ES- cerevisiaei YLR3b2w! 2e-lD EFUNCAT! 10.05-11 key kinases ES- cerevisiaei YLR3b2w! Ξe-ID EFUNCAT! 10-04-11 key kinases ES- cerevisiaei YLR3b2w! Ξe-ID EFUNCAT! 10-11 other signal-transduction activities ES- cerevisiaei YDLlOlc! 3e-lD
EFUNCAT! 11-04 dna repair (direct repairi base excision repair and nucleotide excision repair) ES- cerevisiaei YDLlOlc! 3e-10 EFUNCAT! 03-25 cytokinesis ES- cerevisiaei YDR507c! 3e-10 EFUNCAT! 04-0S.01.D1 general transcription activities ES. cerevisiaei YDLlOβw! le-01
EFUNCAT! 03- lb dna synthesis and replication ES- cerevisiaei YBRlbOw! le-01 EFUNCAT! 01-05.04 regulation of carbohydrate utilization ES. cerevisiaei YLR113w! 4e-01
[FUNCAT! 02-11 metabolism of energy reserves (glycogeni trehalose) [S. cerevisiaei YPL031c! le-06 [FUNCAT! 04.05.01-04 transcriptional control [S- cerevisiaei
YPLD31c! le-08
[FUNCAT! 01-04-04 regulation of phosphate utilization [S- cerevisiaei YPL031c! le-06 [FUNCAT! c energy conversion EM- genitaliunrii MG101! 2e-08
EFUNCAT! 03-11 recombination and dna repair ES- cerevisiaei
Y0R351c! le-07
EFUNCAT! 03-22-01 cell cycle check point proteins ES- cerevisiaei YPL153c! le-07 EFUNCAT! 10-05-01 regulation of g-protein activity ES- cerevisiaei YBLOlbw! 7e-07
EFUNCAT! 04-03-11 other trna-transcription activities ES- cerevisiaei YIL03Sc! le-Ob
EFUNCAT! 08-13 vacuolar transport ES- cerevisiaei YGLlδOw! le-Ob
EFUNCAT! Ob- 13-04 lysosomal and vacuolar degradation ES- cerevisiaei YGLlβOw! le-Ob
EFUNCAT! 04-11 other transcription activities ES- cerevisiaei
YER121w! 2e-0b EFUNCAT! 30-02 organization of plasma membrane ES- cerevisiaei
YDR122w! 2e-0b
EFUNCAT! 30-07 organization of endoplasmatic reticulum ES- cerevisiaei YHR071c! 3e-Db
EFUNCAT! Ol-Ob-10 regulation of lipidi fatty-acid and sterol biosynthesis ES- cerevisiaei YHR071c! 3e-0b
EFUNCAT! 06-11 other intracellular-transport activities ES- cerevisiaei YKLllβc! le-05
EFUNCAT! 10-04-11 other nutritional-response activities ES- cerevisiaei YKLllδc! le-OS EFUNCAT! D1-04 biogenesis of cytoskeleton ES. cerevisiaei
YNL020c! le-05
EFUNCAT! Ob- 07 protein modification (glycolsylationi acylationi myristylationi palmitylationi farnesylation and processing) ES- cerevisiaei YFL033c! 4e-04 EFUNCAT! 01-02-04 regulation of nitrogen and sulphur utilization ES- cerevisiaei YNL163c! 7e-04
[BLOCKS! BL00107A Protein kinases ATP-binding region proteins
[SCOP! dlir3a_ S-l-l-2-b insulin receptor Complex
(transf erase/substrate) le-22 [SCOP! dlfgkb_ 5.1.1.2-5 Fibroblast growth factor receptor 1 [human (Horn 1e-27
[SCOP! dlphk 5.1-1-l.b gamma-subunit of glycogen phosphorylase kinas 2e-23
[SCOP! dlabo 5.1.1-1.14 Protein kiase CK2ι alpha subunit [Maize (Ze le-23
[SCOP! d31ck 5 ■ 1.1 • E ■ Ξ Lymphocyte kinase (lck) [Human
(Homo sapiens) 3e-22
ESCOP! d2erk 5.1.1.1.11 MAP kinase Erk2 [rat (Rattus norvegicus) 7e-20 ESCOP! dlcdkb_ 5-1.1.1.2 cAMP-dependent PKi catalytic subunit Comple be-11
[SCOP! dlhcl 5-1.1.1-1 Cyclin-dependent PK [Human
(Homo sapiens) 5e-21
[EC! 2-7.1-112 Protein-tyrosine kinase le-08 [EC! 2-7-l-12b beta-Adrenergic-receptor kinase 2e-08
[EC! 2-7-1-117 Myosin-light-chain kinase le-01
[EC! 2-7.1-37 Protein kinase Se-12 EEC! 2-7-1-123 Ca2+/calmodulin-dependent protein kinase 4e-
01
EPIRKU! phosphotransferase 0-0
EPIRKU! nucleus le-01 EPIRKU! RNA binding 2e-21
EPIRKU! duplication βe-10
EPIRKU! tandem repeat 4e-01
EPIRKU! zinc Se-12
[PIRKU! cell cycle control 2e-01 [PIRKU! serine/threonine-specific protein kinase 0-0
[PIRKU! transmembrane protein 2e-01
[PIRKU! zinc finger 8e-10
[PIRKU! oncogene be-12
[PIRKU! autophosphorylation 0-0 [PIRKU! coat protein le-11
[PIRKU! magnesium le-01
[PIRKU! ATP 0-0
[PIRKU! polyprotein be-12
[PIRKU! receptor le-01 [PIRKU! phosphoprotein D-D
[PIRKU! sporulation 2e-01
[PIRKU! glycoprotein le-01
[PIRKU! growth factor receptor le-11
[PIRKU! signal transduction 2e-12 [PIRKU! serine/threonine/tyrosine-specific protein kinase
8e-10
[PIRKU! protein kinase δe-10
[PIRKU! transforming protein 2e-12
[PIRKU! heme binding 0-0 [PIRKU! purine nucleotide binding 2e-10
[PIRKU! calcium binding 4e-01
[PIRKU! meiosis le-08
[PIRKU! alternative splicing le-11
[PIRKU! P-loop 2e-10 [PIRKU! proto-oncogene 2e-12
[PIRKU! segmentation 4e-10
[PIRKU! stress-induced protein le-DI
[PIRKU! EF hand 4e-01
[PIRKU! cell division le-01 [PIRKU! calmodulin binding 4e-01
[SUPFAM! LIM protein kinase 8e-10
[SUPFAM! calcium-dependent protein kinase 4e-01
[SUPFAM! rat protein kinase raf 5e-12
[SUPFAM! AMP-activated protein kinase 2e-0δ [SUPFAM! protein kinase byr2 Se-01
[SUPFAM! SH2 homology le-06
[SUPFAM! unassigned Ser/Thr or Tyr-specific protein kinases 0-0
[SUPFAM! leucine-rich alpha-2-glycoprotein repeat homology le-01 [SUPFAM! double-stranded RNA-binding repeat homology 2e-21
[SUPFAM! histidine--tRNA ligase homology be-42
[SUPFAM! SAM homology 5e-01
[SUPFAM! avian retrovirus IC10 gag-Rmil-env polyprotein le-11
[SUPFAM! LIM metal-binding repeat homology βe-10 [SUPFAM! GCN2 protein be-42
[SUPFAM! protein kinase homology 0-0
[SUPFAM! protein kinase C zinc-binding repeat homology 2e-12
[SUPFAM! Ca2+/calmodulin-dependent protein kinase II 4e-0β [SUPFAM! beta-adrenergic-receptor kinase 2e-0δ
ESUPFAM! kinase-related transforming protein be-12
[SUPFAM! protein kinase A-raf 2e-12
[SUPFAM! SH3 homology le-06 [SUPFAM! Ca2+/calmodulin-dependent protein kinase 4e-01
[SUPFAM! protein kinase Xa21 le-01
[SUPFAM! calmodulin repeat homology 4e-01
ESUPFAM! protein kinase DUN1 le-01
ESUPFAM! pleckstrin repeat homology le-01 ESUPFAM! protein kinase TIK Ξe-Ξl
ESUPFAM! protein-tyrosine kinase tec le-06
ESUPFAM! kinase interaction domain homology le-01
[PROSITE] PROTEIN_KINASE_ATP 2
[PROSITE] PROTEIN_KINASE_ST 1
[PFAM] Eukaryotic protein kinase domain
[KW] Irregular
[KW] 3D
[KW] LOW_COMPLEXITY 10.99 %
[KW] COILED_COIL 5.26 %
SE(Q AVLGUPAGUAAARARPAMiQGGNSGVRKREEEGDGAGAVAAPPAIDFPAEGPDPEYDESDV SEG ---xxxxxxxxxxxxxx xxxxxxxxxxxxxxx
COILS
1jstA
SElQ PAEI(QVLKEPL(Q(QPTFPFAVAN(QLLLVSLLEHLSHVHEPNPLRSR(QVFKLLC<QTFIKMGL SEG xxxxxxxxxxxxxxx
COILS
1 j stA
SElQ LSSFTCSDEFSSLRLHHNRAITHLMRSAKERVRϊQDPCEDISRIiQKIRSREVALEAiQTSRY SEG -
COILS I j stA
SElQ LNEFEELAILGKGGYGRVYKVRNKLDG(QYYAIKKILIKGATKTVCMKVLREVKVLAGL(3H SEG COILS
IjstA
TTTEEEEEECCCBTTBCEEEEEETTTTCEEEEEEECCTTTTTTTTHHHHHHHHHHHTTTB SE(Q PNIVGYHTAUIEHVHVI(QPRDRAAIELPSLEVLSD(QEEDRE(QCGVKNDESSSSSIIFAEP SEG
COILS
I j stA TTBC
SElQ TPEKEKRFGESDTEN(QNNKSVKYTTNLVIRESGELESTLEL<QENGLAGLSASSIVE(Q(QLP SEG COILS
IjstA
SElQ LRRNSHLEESFTSTEESSEENVNFLG(QTEA(QYHLMLHI(QM(QLCELSLUDUIVERNKRGRE SEG xxxxxxxxxxxxx
COILS IjstA
SElQ YVDESACPYVMANVATKIF(QELVEGVFYIHNMGIVHRDLKPRNIFLHGPD(Q(QVKIGDFGL SEG COILS
IjstA
SElQ ACTDILIQKNTDUTNRNGKRTPTHTSRVGTCLYASPEIQLEGSEYDAKSDMYSLGVVLLELF SEG
COILS
IjstA
SElQ (QPFGTEMERAEVLTGLRTG(QLPESLRKRCPV(QAKYI(QHLTRRNSS(2RPSAI(3LL(QSELF(Q SEG ,
COILS
IjstA
SElQ NSGNVNLTL(QMKIIE(QEKEIAELKK(QLNLLS(QDKGVRDDGKDGGVG SEG xxxxxxxxxxxxxx
COILS - -CCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCC ■
IjstA
Prosite for DKFZphamy2_lil.1
PS00107 190->214 PROTEIN_KINASE_ATP PDOC00100
PS00107 190->215 PROTEIN_KINASE_ATP PDOC00100
PS00108 454->467 PROTEIN_KINASE_ST PDOC00100
Pfam for DKFZphamy2_lil.1
HMM_NAME Eukaryotic protein kinase domain
HMM *YeigRiIGeGsFGtVYkCiUr-TGeIVAIKIIk-krsms F1REI
+E + I+G+G++G+VYK++++ +G+ +AIK+I K ++ +LRE+ (Query 164
FEELAILGKGGYGRVYKVRNKLDG(QYYAIKKILIKGATKTVCMKVLREV 232
HMM qlMRrLnHPNIIRFYDwFedddDHI* ++++ L+HPNI+ + +++ ++ H+
(Query 233 KVLAGL(QHPNIVGYHTAUI-EHVHV 25b
HMM
*IYMIMEYMeGGDLFDYIrrng pMsEwelrfIMyύlL +++ M+++E +L+D+I++++ ++ + + +1+
+++
(Query 31b LHIlQMώLCEL-
SLUDUIVERNKRGREYVDESACPYVMANVATKIFiQELV 443 HMM rGMeYLHSMgllHRDLKPENILIDeN.gqIKIcDFGLARqMn
+ G+ Y + H + MGI + HRDLKP + NI + + + (Q + KI + DFGLA +
(Query 444
EGVFYIHNMGIVHRDLKPRNIFLHGPD(Q(3VKIGDFGLACTDIL(QKNTDUT 413
HMM nYerMttfCGTPUYMMAPEVIImgnyYttkVDMUSFGCILUEMMT
+ T+++GT Y +PE ++G++Y+ K+DM+S+G++L
E++ (Query 414 NRNGKRTPTHTSRVGTCLYA-SPEiQ-
LEGSEYDAKSDMYSLGVVLLELF- 540
HMM
GepPFyd-.dnMemlmrliqr.frrpfUpnCSeElyDFMrwCUnyDPekR +PF ++ E + ++ + ++ ++ +C+ +++ + + +++
++R
(Query 541 — (QPFGTEMERAEVLTGLRTG(QLPESLRKRCPV(QAKYI(Q-
HLTRRNSSiQR 567 HMM PTFriQILnHPUF*
P + + (Q + L + + F (Query 566 PSAIIQLLIQSELF 511
DKFZphamy2_lil4
group: transmembrane proteins
DKFZphamy2_lil4 encodes a novel 617 amino acid protein with similarity to the human l(3)mbt protein homolog. Mutations of the Drosophila l(3)mbt gene lead to malignant brain tumors. The novel protein contains 1 transmembrane domain.
No informative BLAST results; no predictive Prosite, Pfam or SCOP motif.
The new protein can find application in studying the expression profile of oncogenes and amygdala-specific genes and as a new marker for amygdala cells.
similarity to Human l(3)mbt protein homolog mRNA
> 14 exons (HS756G23 (EMBLNEU))
Pedant: TRANSMEMBRANE 1
Sequenced by MediGenomix
Locus: /map="22q13.31-13.33"
Insert length: 3071 bp
Poly A stretch at pos. 3052, no polyadenylation signal found
1 GGCAGGCCAA TATGGCTTCC TGCACCTGGT GACGCTTGGC GAAACTGAGG
51 TCTCATGGAG AAGCCCCGGA GTATTGAGGA GACCCCATCT TCAGAACCAA
101 TGGAGGAAGA GGAAGATGAC GACTTGGAGC TGTTTGGTGG CTATGATAGT
151 TTCCGGAGTT ATAACAGCAG TGTGGGCAGT GAGAGCAGCT CCTATCTGGA
201 GGAGTCAAGT GAAGCAGAAA ATGAGGATCG GGAAGCAGGG GAACTGCCGA
251 CCTCCCCGCT GCATTTGCTC AGCCCTGGGA CTCCTCGCTC CTTGGATGGC
301 AGTGGTTCTG AGCCAGCTGT CTGTGAGATG TGTGGTATCG TGGGTACAAG
351 GGAAGCCTTC TTCTCCAAGA CCAAGAGGTT CTGCAGCGTC TCCTGCTCCA
401 GGAGCTACTC CTCCAACTCC AAGAAAGCCA GTATCTTGGC TAGATTACAG
451 GGAAAACCAC CGACCAAAAA AGCCAAAGTC CTGCACAAGG CTGCCTGGTC
501 TGCCAAAATT GGAGCCTTCC TCCACTCTCA AGGGACAGGA CAGCTGGCAG
551 ATGGGACACC AACAGGACAA GACGCTCTGG TCTTGGGCTT CGACTGGGGG
601 AAGTTCCTGA AGGATCACAG TTACAAGGCT GCTCCCGTCA GCTGTTTCAA
651 GCACGTCCCA CTCTATGACC AGTGGGAGGA TGTGATGAAA GGGATGAAGG
701 TGGAGGTGCT CAACAGTGAT GCTGTGCTCC CCAGCCGGGT GTACTGGATC
751 GCCTCTGTCA TCCAGACAGC AGGGTATCGG GTGCTGCTTC GGTATGAAGG
801 CTTTGAAAAT GACGCCAGCC ATGACTTCTG GTGCAACCTG GGAACAGTGG
851 ATGTCCACCC CATTGGCTGG TGTGCCATCA ACAGCAAGAT CCTAGTGCCC
901 CCACGGACCA TCCATGCCAA GTTCACCGAC TGGAAGGGCT ACCTCATGAA
951 ACGGCTGGTG GGCTCCAGGA CGCTTCCCGT GGATTTCCAC ATCAAGATGG
1001 TGGAGAGCAT GAAGTACCCC TTTAGGCAGG GCATGCGGCT GGAAGTGGTG
1051 GACAAGTCCC AGGTGTCACG CACTCGCATG GCTGTGGTGG ACACAGTAAT
1101 CGGGGGTCGC CTACGGCTCC TCTACGAGGA TGGTGACAGT GACGACGACT
1151 TCTGGTGCCA CATGTGGAGC CCCCTGATCC ACCCAGTGGG TTGGTCACGA
1201 CGTGTGGGCC ACGGCATCAA GATGTCAGAG AGGCGAAGTG ACATGGCCCA
1251 TCACCCCACC TTCCGGAAGA TCTACTGTGA TGCCGTTCCT TACCTCTTCA
1301 AGAAGGTACG AGCAGTCTAC ACAGAAGGCG GTTGGTTTGA GGAAGGGATG
1351 AAGCTGGAGG CCATTGACCC CCTGAATCTG GGCAACATCT GCGTGGCAAC
1401 TGTCTGTAAG GTTCTCCTGG ATGGATACCT GATGATCTGT GTGGACGGGG
1451 GGCCCTCCAC AGATGGCTTG GACTGGTTCT GCTACCATGC CTCTTCCCAC
1501 GCCATCTTCC CGGCCACCTT CTGTCAGAAG AATGACATTG AGCTCACACC
1551 GCCAAAAGGT TATGAGGCAC AGACTTTCAA CTGGGAGAAC TACTTGGAGA
1601 AGACCAAGTC GAAAGCCGCT CCATCGAGAC TCTTTAACAT GGATTGCCCA
1651 AACCATGGCT TCAAGGTGGG CATGAAGCTG GAGGCCGTGG ACCTGATGGA
1701 GCCCCGGCTC ATCTGTGTGG CCACGGTGAA ACGAGTGGTG CATCGGCTCC
1751 TCAGCATCCA CTTTGACGGC TGGGACAGCG AGTACGACCA GTGGGTGGAC
1801 TGCGAGTCCC CAGACATCTA CCCCGTCGGC TGGTGTGAGC TCACCGGCTA
1851 CCAGCTCCAG CCTCCTGTGG CCGCAGGTGT GGGCTCTCGT GGCCCTAAGA
1901 GGCTCTGACT TTCTTTCCTC TTCTTTTTTC CTTCTTCCCC CGCCCCTGTG
1951 CCCATCTCCG TTCTTTGGCA TGAGGTGGAG ATGTCTCATG GACCACTTTA
2001 AGTAGAGAGT GAGCCCCGTC ACCCAGCCCC TGCTCCTGAC TTCTCTGTCT
2051 CCCTTTCCCT CTGGCCTGCA GAGCTCCTTC CTTCATCTTG CCCACTCTGT
2101 CATATGTTCG TGCCCTTGTG CACCCAGGTA AACTACCCAG GTCCCTCTGA
2151 GCAGCCCTGG TAACAAGGGT GGGAAGAAGG GACAGCTGTT CTCCGGCCCC
2201 TCCTCCAGCC CCGCCCTCTC CTCATTGCCC AGGTTTGGCT TCCTGTCTTG
2251 GGGTGTCTCG TGTGGGAGGG TGGATGGGGT CTCGGGATGC GCCTGTGCCC
2301 TGTGTCCTCC CAGGGACCCT CTTCTCATCT CTTTCACCCT TGTCTTTCAA
2351 CAACAGAACC GGCCACACCG CTGAAGGCCA AAGAGGCCAC AAAGAAGAAA
2401 AAGAAACAGT TTGGGAAGAA AAGGAAAAGA ATCCCGCCCA CTAAGACGCG
2451 ACCCCTCAGA CAGGGGTCCA AGAAGCCCCT GCTGGAGGAC GACCCTCAGG
2501 GTGCCAGGAA GATCTCGTCG GAGCCTGTTC CTGGCGAGAT CATTGCTGTG
2551 CGTGTGAAGG AAGAGCATCT AGACGTGGCC TCGCCCGACA AGGCTTCAAG
2601 TCCAGAGCTG CCTGTCTCCG TCGAGAACAT CAAGCAGGAA ACAGACGACT
2651 GAGCCTTCCT GCCTCCAGCC TGGCTTCTAG CTGGAAGCCA GCCCAGCGTT
2701 TCTCTACCAC CACCACCATG CCTCCACCTG ACTTTGGCTT GGAGACTGAT
2751 CCTCTCTGTG TAAATTCTGC CCGGTGCTGT GAAGGCTGGA CGGTGGAGGA
2801 CCTGCTGGGG TCTCCTGGGA CCCGCCTGTT GCTTCTGCCC TCCCCTGTGG
2851 AAAGGTCTAT ATGACGGGCC GCCTGAGGCC CCAGAACTCG TCTGTGAACC
2901 ACCTTTTCCA GCCAGAGTTC CCAAAGCTGG AACGCTAGCT GCCTGCTCTT
2951 CCTTAAGATG GCCTCCCCCC GACCCGCCAC GGCCCTCAGT TGCCAGGGAT
3001 GGGGCCACCA CTGTCACACT GTGGAATACA AGACAGTGAA CTCTGTCTGC
3051 CTAAAAAAAA AAAAAAAAAA A
BLAST Results
Entry HS756G23 from database EMBLNEU:
Human DNA sequence from clone 756G23 on chromosome 22q13.31-13.33
Score = 3131, P = 0.0e+00, identities = 67S/154
Entry U89358_1 from database TREMBL:
product: "l(3)mbt protein homolog"; Human l(3)mbt protein homolog mRNA, complete cds.
Score = 505, P = 7.2e-45, identities = 123/320, positives = 170/320, frame +1
Entry AB014581_1 from database TREMBL:
gene: "KIAA0681"; product: "KIAA0681 protein"; Homo sapiens mRNA for KIAA0681 protein, partial cds.
Score = 503, P = 1.4e-46, identities = 122/307, positives = 163/307, frame +1
Medline entries
No Medline entry
Peptide information for frame 1
ORF from 55 bp to 1905 bp; peptide length: 617
Category: similarity to known protein
Classification: unclassified
1 MEKPRSIEET PSSEPMEEEE DDDLELFGGY DSFRSYNSSV GSESSSYLEE
51 SSEAENEDRE AGELPTSPLH LLSPGTPRSL DGSGSEPAVC EMCGIVGTRE
101 AFFSKTKRFC SVSCSRSYSS NSKKASILAR LQGKPPTKKA KVLHKAAWSA
151 KIGAFLHSQG TGQLADGTPT GQDALVLGFD WGKFLKDHSY KAAPVSCFKH
201 VPLYDQWEDV MKGMKVEVLN SDAVLPSRVY WIASVIQTAG YRVLLRYEGF
251 ENDASHDFWC NLGTVDVHPI GWCAINSKIL VPPRTIHAKF TDWKGYLMKR
301 LVGSRTLPVD FHIKMVESMK YPFRQGMRLE VVDKSQVSRT RMAVVDTVIG
351 GRLRLLYEDG DSDDDFWCHM WSPLIHPVGW SRRVGHGIKM SERRSDMAHH
401 PTFRKIYCDA VPYLFKKVRA VYTEGGWFEE GMKLEAIDPL NLGNICVATV
451 CKVLLDGYLM ICVDGGPSTD GLDWFCYHAS SHAIFPATFC QKNDIELTPP
501 KGYEAQTFNW ENYLEKTKSK AAPSRLFNMD CPNHGFKVGM KLEAVDLMEP
551 RLICVATVKR VVHRLLSIHF DGWDSEYDQW VDCESPDIYP VGWCELTGYQ
601 LQPPVAAGVG SRGPKRL
BLASTP hits
No BLASTP hits available
Alert BLASTP hits for DKFZphamy2_lil4, frame 1
TREMBL:AB014581_1 gene: "KIAA0681"; product: "KIAA0681 protein"; Homo sapiens mRNA for KIAA0681 protein, partial cds., N = 1, Score = 503, P = 3.1e-48
TREMBL:U89358_1 product: "l(3)mbt protein homolog"; Human l(3)mbt protein homolog mRNA, complete cds., N = 1, Score = 505, P = 6.2e-48
>TREMBL:U89358_1 product: "l(3)mbt protein homolog"; Human l(3)mbt protein homolog mRNA, complete cds.
Length = 772
HSPs:
Score = 505 (75.8 bits), Expect = 6.2e-48, P = 6.2e-48
Identities = 123/313 (39%), Positives = 170/313 (54%)
(Query: 213 UKGYLMKRLVGSRTLPVDFH-- IKMVESMKYPFR(QGMRLEVVDKS(QVSRTRMAVVDTVIG 350
U+ YL ++ + T PV + V K F+ GM+LE +D S + V V G
Sbjct: 206 UESYLEEiQK — AITAPVSLF(QDS(3AVTHNKNGFKLGMKLEGIDP(QHPSMYFILTVAEVCG 2b5
(Query: 351 GRLRLLYEDGDSD-DDFUCHMUSPLIHPVGUSRRVGHGIKMSE — RRSDMAHHPTFRKIY 407
RLRL + DG S+ DFU + SP IHP GU + GH +++ + + + + Sbjct: 2bb YRLRLHF- DGYSECHDFUVNANSPDIHPAGUFEKTGHKLIQLPKGYKEEEFSUSIQYMCSTR 324 (Query: 4DΘ CDAVP-
YLFKKVRAVYTEGGUFEEGMKLEAIDPLNLGNICVATVCKVLLDGYLMICVDGG 4bb
A P ++F G F+ GMKLEA+D +N +CVA+V V+ D
++ D
Sbjct: 325 A(QAAPKHMFVS(QSHSPPPLG-F(QVGMKLEAVDRMNPSLVCVASVTDVV- DSRFLVHFDNU 3δ2
(Query: 4b7 PSTDGLDUFCYHASSHAIFPATFClQKNDIELTPPKGY- EAfQTFNUENYLEKTKSKAAPSR 52S
T D + + C SS I P +C(QK LTPP+ Y + F UE YLE + T + A P+
Sbjct: 363 DDT--YDYUC- DPSSPYIHPVGUC(QK(QGKPLTPP(QDYPDPDNFCUEKYLEETGASAVPTU 431
(Query: 52b LFNMDCPNHGFKVGMKLEAVDLMEPRLICVATVKRVVHRLLSIHFDGUDSEYDiQUVDCES S65
F + P H F V MKLEAVD P LI VA+V+ V + IHFDGU
YD U+D +
Sbjct: 440 AFKVR-
PPHSFLVNMKLEAVDRRNPALIRVASVEDVEDHRIKIHFDGUSHGYDFUIDADH 416
(Query: Sβb PDIYPVGUCELTGY(QL(QPPV b05 PDI+P GUC TG+ L(QPP+
Sbjct: 411 PDIHPAGUCSKTGHPLiQPPL Slβ Score = 333 (50-0 bits)ι Expect = 4-le-27ι P = 4-le-27 Identities = 103/324 (31*)ι Positives = 151/324 (4b*)
(Query: 171 FDUGKFLKDHSYKAAPVSCFKHVPLYDώUEDVMK- GMKVEVLNSDAVLPSRVYUIASVIiQ 237 + U +L++ APVS F+ ++ K GMK+E + D PS
+Y+I +V +
Sbjct: 20b USUESYLEE(QKAITAPVSLF(QDS(QAVTHNKNGFKLGMKLEGI—DP(QHPS- MYFILTVAE 2bΞ (Query: 236
TAGYRVLLRYEGFENDASHDFUCNLGTVDVHPIGUCAINSKILVPPRTIHAKFTDUKGYL 217
GYR+ L ++G+ HDFU N + D+HP GU L P+ + U Y+ Sbjct: 2b3 VCGYRLRLHFDGYSE-- CHDFUVNANSPDIHPAGUFEKTGHKLώLPKGYKEEEFSUSiQYM 320
(Query: 216 MKRLVGSRTLPVDFHIKMVESMKYP
FR(QGMRLEVVDKS(QVSRTRMAVVDTVIGGRLR 354
+R H+ + +S P F+ GM+LE VD+ S +A V
V+ R
Sbjct: 321 CS
TRA(QAAPKHMFVS(QSHSPPPLGF(3VGMKLEAVDRMNPSLVCVASVTDVVDSRFL 37b
(Query: 3SS LLYEDGDSDDDFUCHMUSPLIHPVGUSRRVGHGIKMSERRSD
MAHHPTFRKIYCDAV 411
+ +++ D D+UC SP IHPVGU ++ G + + D + AV Sbjct: 377
VHFDNUDDTYDYUCDPSSPYIHPVGUCiQKfQGKPLTPPlQDYPDPDNFCUEKYLEETGASAV 43b
(Query: 412
PYLFKKVRAVYTEGGUFEEGMKLEAIDPLNLGNICVATVCKVLLDGYLMICVDGGPSTDG 471 P KVR ++ F MKLEA + D N I VA + V V D + I
DG + G
Sbjct: 437 PTUAFKVRPPHS FLVNMKLEAVDRRNPALIRVASVEDVE-
DHRIKIHFDGU--SHG 461 (Query: 472 LDUFCYHASSHAIFPATFClQKNDIELTPPKG 502
D F A I PA +C K L PP G
Sbjct: 410 YD-FUIDADHPDIHPAGUCSKTGHPL(QPPLG 511
Score = 23b (35-4 bits)ι Expect = 2-5e-lbι P = 2-Se-lb Identities = 47/110 (42*)ι Positives = bb/110 (bO*)
(Query: 411 PPKGYEAiQTFNUENYLEKTKSKAAPSRLF-NMDCPNH- — GFKVGMKLEAVDLMEPRLIC 5S4
P G + + ++UE+YLE+ K+ AP LF + H GFK+GMKLE +D P +
Sbjct: 117 PATGEKKECUSUESYLEEIQKAITAPVSLFIQDSIQAVTHNKNGFKLGMKLEGIDPIQHPSMYF 25b
(Query: 555 VATVKRVVHRLLSIHFDGUDSEYDtQUVDCESPDIYPVGUCELTGYiQLiQPP b04
+ TV V L +HFDG+ +D UV+ SPDI + P GU E TG + + L(Q P
Sbjct: 257 ILTVAEVCGYRLRLHFDGYSECHDFUVNANSPDIHPAGUFEKTGHKLiQLP
3Db
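The identity and positive percentages in the HSP headers above follow directly from the match counts, for example 123/313 and 170/313 for the first HSP. The sketch below recomputes them; it assumes the percentages are truncated rather than rounded, which is what the third HSP suggests (47/110 is printed as 42). The function name is hypothetical.

```python
def hsp_percent(matches, aligned_length):
    """Percentage of aligned positions that match, truncated to a whole number."""
    if aligned_length <= 0:
        raise ValueError("aligned_length must be positive")
    return (100 * matches) // aligned_length

# (identities, positives, alignment length) for the three HSPs listed above.
hsps = [(123, 170, 313), (103, 151, 324), (47, 66, 110)]
for identities, positives, length in hsps:
    print(hsp_percent(identities, length), hsp_percent(positives, length))
```

Recomputing the percentages this way is also a convenient cross-check when a printed percentage has been damaged in transcription.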
Pedant information for DKFZphamy2_lil4, frame 1
Report for DKFZphamy2_lil4.1
[LENGTH] 617
[MW] 69264.11
[pI] 6.05
[HOMOL] TREMBL:U89358_1 product: "l(3)mbt protein homolog"; Human l(3)mbt protein homolog mRNA, complete cds. 1e-47
[BLOCKS] BL01206A Amiloride-sensitive sodium channels proteins
[KW] TRANSMEMBRANE 1
[KW] LOW_COMPLEXITY 9.40 %
S E lQ MEKPRSIEETPSSEPMEEEEDDDLELFGGYDSFRSYNSSVGSESSSYLEESSEAENEDRE
SEG xxxxxxxxxxxxxxxxxxx xxxxxxxxxxxxxxxxxxxxxxx
PRD ccccceeeeccccccccccccccccccccccccccccccccccccccccccccccccccc
MEM
S ElQ AGELPTSPLHLLSPGTPRSLDGSGSEPAVCEMCGIVGTREAFFSKTKRFCSVSCSRSYSS
SEG xxxxxxxxxx
PRD ccccccccccccccccccccccccccceeeeeecccccccccccccccceeeeccccccc
MEM
S E lQ NSKKASILARL(QGKPPTKKAKVLHKAAUSAKIGAFLHS(QGTG(QLADGTPTG(QDALVLGFD
SEG xxxxxx
PRD ccchhhhhhhhhcccccccchhhhhhhhhhhhhhhcccccccccccccccccceeeeecc
MEM
SE lQ UGKFLKDHSYKAAPVSCFKHVPLYDIQUEDVMKGMKVEVLNSDAVLPSRVYUIASVIIQTAG
SEG
PRD chhhhhhccccccccccccccccccccchhhhhheeeeeccccccceeeehhhhhhhhhc
MEM
SE lQ YRVLLRYEGFENDASHDFUCNLGTVDVHPIGUCAINSKILVPPRTIHAKFTDUKGYLMKR
SEG
PRD ceeeeeeccccccccceeeeccccccccccccccccceeeccccccccccccchhhhhhh
MEM
S E lQ LVGSRTLPVDFHIKMVESMKYPFRώGMRLEVVDKSfQVSRTRMAVVDTVIGGRLRLLYEDG
SEG
PRD hccccccccccccccccccccccccccceeeecccccceeeeeeeeeccccceeeeeccc
MEM
S E lQ DSDDDFUCHMUSPLIHPVGUSRRVGHGIKMSERRSDMAHHPTFRKIYCDAVPYLFKKVRA
SEG
PRD cccceeeeeccccccccccccccccccccccccccccccchhhhhhcccccccccccccc
MEM
S E lQ VYTEGGUFEEGMKLEAIDPLNLGNICVATVCKVLLDGYLMICVDGGPSTDGLDUFCYHAS
SEG
PRD ccccccchhhhheeeeccccccceeeeeeehhhhhcceeeeeeccccccccceeeeeecc
MEM MMMMMMMMMMMMMMMMM
S E lQ SHAIFPATFC(QKNDIELTPPKGYEA(QTFNUENYLEKTKSKAAPSRLFNMDCPNHGFKVGM
SEG
PRD cccccccccccccccccccccccccchhhhhhhhhhhhccccccccccccccchhhhhhe
MEM
S E lQ KLEAVDLMEPRLICVATVKRVVHRLLSIHFDGUDSEYDiQUVDCESPDIYPVGUCELTGYlQ
SEG
PRD eeecccccccceeeeeehhhhhhhheeeeeccccccccccccccccccccceeeeccccc
MEM
S ElQ LiQPPVAAGVGSRGPKRL
SEG
PRD ccccccccccccccccc
MEM
(No Prosite data available for DKFZphamy2_lil4.1)
(No Pfam data available for DKFZphamy2_lil4.1)
DKFZphamy2_li24
group: differentiation/development
DKFZphamy2_li24 encodes a novel 835 amino acid protein with partial similarity to Rattus norvegicus Notch2 protein. Notch family molecules are thought to be negative regulators of neuronal differentiation in early brain development. Notch2 is expressed not only by neuronal cells in the embryonic brain, but also by glial cells in the postnatal brain. The new protein represents a new member of this family and may be involved in specific differentiation or developmental pathways of the nervous system.
The new protein can find application in modulating development and differentiation of amygdala cells.
putative protein; probably complete cds.
Sequenced by MediGenomix
Locus: unknown
Insert length: 2768 bp
Poly A stretch at pos. 2714, polyadenylation signal at pos. 2697
1 AGAAATCTTC AGCCAAACAG CTGCAGGAAG TAGAGAAGGT TAAACCCCAG
51 AGTGAGAAAG TTCATCAGAC TCTGATTCTG GACCCAGCAC AGAGGAAGAG
101 ACTCCAGCAG CAGATGCAGC AGCACGTTCA GCTCTTGACC CAAATCCACC
151 TTCTTGCCAC CTGCAACCCC AACCTCAATC CGGAGGCCAC TACCACCAGG
201 ATATTTCTTA AAGAGCTGGG AACCTTTGCT CAAAGCTCCA TCGCCCTTCA
251 CCATCAGTAC AACCCCAAGT TTCAGACCCT GTTCCAACCC TGTAACTTGA
301 TGGGAGCTAT GCAGCTGATT GAAGACTTCA GCACACATGT CAGCATTGAC
351 TGCAGCCCTC ATAAAACTGT CAAGAAGACT GCGAATGAAT TTCCCTGTTT
401 GCCAAAGCAA GTGGCTTGGA TTCTGGCCAC AAGCAAGGTT TTCATGTATC
451 CAGAGTTACT TCCAGTGTGT TCCCTGAAGG CAAAGAATCC CCAGGATAAG
501 ATCGTCTTCA CCAAGGCTGA GGACAATTTG TTAGCTTTAG GACTGAAGCA
551 TTTTGAAGGA ACTGAGTTTC CTAATCCTCT AATCAGCAAG TACCTTCTAA
601 CCTGCAAAAC TGCCCACCAA CTGACAGTGA GAATCAAGAA CCTCAACATG
651 AACAGAGCTC CTGACAACAT CATTAAATTT TATAAGAAGA CCAAACAGCT
701 GCCAGTCCTA GGAAAATGCT GTGAAGAGAT CCAGCCACAT CAGTGGAAGC
751 CACCTATAGA GAGAGAAGAA CACCGGCTCC CATTCTGGTT AAAGGCCAGT
801 CTGCCATCCA TCCAGGAAGA ACTGCGGCAC ATGGCTGATG GTGCTAGAGA
851 GGTAGGAAAT ATGACTGGAA CCACTGAGAT CAACTCAGAT CGAAGCCTAG
901 AAAAAGACAA TTTGGAGTTG GGGAGTGAAT CTCGGTACCC ACTGCTATTG
951 CCTAAGGGTG TAGTCCTGAA ACTGAAGCCA GTTGCCACCC GTTTCCCCAG
1001 GAAGGCTTGG AGACAGAAGC GTTCATCAGT CCTGAAGCCC CTCCTTATCC
1051 AACCCAGCCC CTCTCTCCAG CCCAGCTTCA ACCCTGGGAA AACACCAGCC
1101 CGATCAACTC ATTCAGAAGC CCCTCCGAGC AAAATGGTGC TCCGGATTCC
1151 TCACCCAATA CAGCCAGCCA CTGTTTTACA GACAGTTCCA GGTGTCCCTC
1201 CACTGGGGGT CAGTGGAGGT GAGAGTTTTG AGTCTCCTGC AGCACTGCCT
1251 GCTGTGCCCC CTGAGGCCAG GACAAGCTTC CCTCTGTCTG AGTCCCAGAC
1301 TTTGCTCTCT TCTGCCCCTG TGCCCAAGGT AATGCTGCCC TCCCTTGCCC
1351 CTTCTAAGTT TCGAAAGCCA TATGTGAGAC GGAGACCCTC AAAGAGAAGA
1401 GGAGTCAAGG CCTCTCCCTG TATGAAACCT GCCCCTGTTA TCCACCACCC
1451 TGCATCTGTT ATCTTCACTG TTCCTGCTAC CACTGTGAAG ATTGTGAGCC
1501 TTGGCGGTGG CTGTAACATG ATCCAGCCTG TCAATGCGGC TGTGGCCCAG
1551 AGTCCCCAGA CTATTCCCAT CACTACCCTC TTGGTTAACC CTACTTCCTT
1601 CCCCTGTCCA TTGAACCAGT CCCTTGTGGC CTCCTCTGTC TCACCCTTAA
1651 TTGTTTCTGG CAATTCTGTG AATCTTCCTA TACCATCCAC CCCTGAAGAT
1701 AAGGCCCACG TGAATGTGGA CATTGCTTGT GCTGTGGCTG ATGGGGAAAA
1751 TGCCTTTCAG GGCCTAGAAC CCAAATTAGA GCCCCAGGAA CTATCTCCTC
1801 TCTCTGCTAC TGTTTTCCCG AAAGTGGAAC ATAGCCCAGG GCCTCCACTA
1851 GCAGATGCAG AGTGCCAAGA AGGATTGTCA GAGAATAGTG CCTGTCGCTG
1901 GACCGTTGTG AAAACAGAGG AGGGGAGGCA AGCTCTGGAG CCGCTCCCTC
1951 AGGGCATCCA GGAGTCTCTA AACAACCCTA CCCCTGGGGA TTTAGAGGAA
2001 ATTGTCAAGA TGGAACCTGA AGAAGCTAGA GAGGAAATCA GTGGATCCCC
2051 TGAGCGTGAT ATTTGTGATG ACATCAAAGT GGAACATGCT GTGGAATTGG
2101 ACACTGGTGC CCCAAGCGAG GAGTTGAGCA GTGCTGGAGA AGTAACGAAA
2151 CAGACAGTCT TACAGAAGGA AGAGGAGAGG AGTCAGCCAA CTAAAACCCC
2201 TTCATCTTCT CAAGAGCCCC CTGATGAAGG AACCTCAGGG ACAGATGTGA
2251 ACAAAGGATC ATCAAAGAAT GCTTTGTCCT CAATGGATCC TGAAGTGAGG
2301 CTTAGTAGCC CCCCAGGGAA GCCAGAAGAT TCATCCAGTG TTGATGGTCA
2351 GTCAGTGGGG ACTCCAGTTG GGCCAGAAAC TGGAGGAGAG AAGAATGGGC
2401 CAGAAGAAGA GGAAGAAGAG GACTTTGATG ACCTCACCCA AGATGAGGAA
2451 GATGAAATGT CATCAGCTTC TGAGGAATCT GTGCTTTCTG TCCCAGAACT
2501 CCAGGTGAGA GCTGGAGAAT ATTCTCAAGT ATTTCGTGGA CTCAGTAATA
2551 TGTATCACTT ATTGATATGC CACCTGCTTG CTTGCTGCAC TATGGATAGT
2601 CCTAAAATCA TTTGTATTTG ATTTGTGAAT GCATTATGGG ACATGATTGT
2651 GGAGTTGAGG TGAAATGAGA TGGAAAGGAT GAAATTTTAC TTATTATATT
2701 AAACTCGTTT ACACATTAAA AAAAAAAAAA AAAAAAAAAA AAAAAAAAGA
2751 AAAAAAAAAA AAAAAAAA
BLAST Results
Entry RNNOTCHX from database EMBL:
Rat notch 2 mRNA.
Score = 818, P = 1.6e-26, identities = 216/277
Medline entries
No Medline entry
Peptide information for frame 3
ORF from 114 bp to 2618 bp; peptide length: 835
Category: putative protein
Classification: Differentiation/Development
1 MQQHVQLLTQ IHLLATCNPN LNPEATTTRI FLKELGTFAQ SSIALHHQYN
51 PKFQTLFQPC NLMGAMQLIE DFSTHVSIDC SPHKTVKKTA NEFPCLPKQV
101 AWILATSKVF MYPELLPVCS LKAKNPQDKI VFTKAEDNLL ALGLKHFEGT
151 EFPNPLISKY LLTCKTAHQL TVRIKNLNMN RAPDNIIKFY KKTKQLPVLG
201 KCCEEIQPHQ WKPPIEREEH RLPFWLKASL PSIQEELRHM ADGAREVGNM
251 TGTTEINSDR SLEKDNLELG SESRYPLLLP KGVVLKLKPV ATRFPRKAWR
301 QKRSSVLKPL LIQPSPSLQP SFNPGKTPAR STHSEAPPSK MVLRIPHPIQ
351 PATVLQTVPG VPPLGVSGGE SFESPAALPA VPPEARTSFP LSESQTLLSS
401 APVPKVMLPS LAPSKFRKPY VRRRPSKRRG VKASPCMKPA PVIHHPASVI
451 FTVPATTVKI VSLGGGCNMI QPVNAAVAQS PQTIPITTLL VNPTSFPCPL
501 NQSLVASSVS PLIVSGNSVN LPIPSTPEDK AHVNVDIACA VADGENAFQG
551 LEPKLEPQEL SPLSATVFPK VEHSPGPPLA DAECQEGLSE NSACRWTVVK
601 TEEGRQALEP LPQGIQESLN NPTPGDLEEI VKMEPEEARE EISGSPERDI
651 CDDIKVEHAV ELDTGAPSEE LSSAGEVTKQ TVLQKEEERS QPTKTPSSSQ
701 EPPDEGTSGT DVNKGSSKNA LSSMDPEVRL SSPPGKPEDS SSVDGQSVGT
751 PVGPETGGEK NGPEEEEEED FDDLTQDEED EMSSASEESV LSVPELQVRA
801 GEYSQVFRGL SNMYHLLICH LLACCTMDSP KIICI
BLASTP hits
No BLASTP hits available
Alert BLASTP hits for DKFZphamy2_li24, frame 3
No Alert BLASTP hits found
Pedant information for DKFZphamy2_li24, frame 3
Report for DKFZphamy2_li24.3
[LENGTH] 872
[MW] 95366.21
[pI] 5.67
[HOMOL] PIR:S48478 glucan 1,4-alpha-glucosidase (EC 3.2.1.3) - yeast (Saccharomyces cerevisiae) 5e-06
[FUNCAT] 30.01 organization of cell wall [S. cerevisiae, YIR019c] 2e-07
[FUNCAT] 30.90 extracellular/secretion proteins [S. cerevisiae, YIR019c] 2e-07
[FUNCAT] 01.05.01 carbohydrate utilization [S. cerevisiae, YIR019c] 2e-07
[FUNCAT] 02.10 tricarboxylic-acid pathway [S. cerevisiae, YDR148c] 5e-04
[FUNCAT] 30.16 mitochondrial organization [S. cerevisiae, YDR148c] 5e-04
[KW] Alpha_Beta
[KW] LOW_COMPLEXITY 9.40 %
SElQ KSSAK(QQEVEKVKP(QSEKVHιQTLILDPAι3RKRL(Q(Q(QM(Q(QHV(QLLT(QIHLLATCNPNLNP
SEG xxxxxxxxxxxxxx
PRD ccchhhhhhhhhhhccccchhhhhcccchhhhhhhhhhhhhhhhhhhhhhhhcccccccc
SElQ EATTTRIFLKELGTFA(QSSIALHH(QYNPKF(3TLF(QPCNLMGAM(QLIEDFSTHVSIDCSPH
SEG
PRD cchhhhhhhhhhhhhhhhhhhhhcccccceeeeecccchhhhhhhhhhceeeeeeccccc SElQ KTVKKTANEFPCLPKlQVAUILATSKVFMYPELLPVCSLKAKNPlQDKIVFTKAEDNLLALG
SEG
PRD eeeeeccccccccchhhhhhhccceeeecccccccccccccccceeeeeeeccchhhhhh
SElQ LKHFEGTEFPNPLISKYLLTCKTAHlQLTVRIKNLNMNRAPDNIIKFYKKTKlQLPVLGKCC
SEG
PRD hheeecccccccceeeeeeeehhhhhhhhheeeccccccccceeeeeeccccccccceee SElQ EEI(QPH(QUKPPIEREEHRLPFULKASLPSI(QEELRHMADGAREVGNMTGTTEINSDRSLE
SEG
PRD eeecccccccccchhhhhcceeeecchhhhhhhhhhhhhhhhhhhcccccccccccceee
SElQ KDNLELGSESRYPLLLPKGVVLKLKPVATRFPRKAURlQKRSSVLKPLLIlQPSPSLiQPSFN SEG xxxxxxxxxxxxxxx xxxxxxxxxxxxxxx. ■
PRD ecccccccccccccccccceeeeeeeeeeeccchhhhhhccccccccccccccccccccc
SElQ PGKTPARSTHSEAPPSKMVLRIPHPIiQPATVLlQTVPGVPPLGVSGGESFESPAALPAVPP
SEG xxxxxxxxxxxx PRD cccccccccccccccccceeecccccceeeeeeccccccccccccccccccccccccccc
SElQ EARTSFPLSESiQTLLSSAPVPKVMLPSLAPSKFRKPYVRRRPSKRRGVKASPCMKPAPVI
SEG
PRD cccccccccccccccccccccccccccccccccccccccccccccccccccccccccccc
SElQ HHPASVIFTVPATTVKIVSLGGGCNMI(QPVNAAVA(QSP(QTIPITTLLVNPTSFPCPLN(QS
SEG
PRD ccccceeecccccceeeeeccccccccccccccccccccccccceeeccccccccccccc SElQ LVASSVSPLIVSGNSVNLPIPSTPEDKAHVNVDIACAVADGENAFiQGLEPKLEPiQELSPL
SEG
PRD ccccccccccccccccccccccccccccccccccceeecccccccccccccccccccccc
SElQ SATVFPKVEHSPGPPLADAEC(QEGLSENSACRUTVVKTEEGR(QALEPLP(QGIfQESLNNPT SEG
PRD ccccccccccccccccccccccccccccccceeeeeecccccccccccccceeeeccccc
SElQ PGDLEEIVKMEPEEAREEISGSPERDICDDIKVEHAVELDTGAPSEELSSAGEVTKiQTVL
SEG PRD ccccccccccccccceeeccccccccccccccccccccccccccccccccccccccchhh
SElQ (QKEEERSlQPTKTPSSSiQEPPDEGTSGTDVNKGSSKNALSSMDPEVRLSSPPGKPEDSSSV
SEG
PRD hhhhhhcccccccccccccccccccccccccccccccccccccccccccccccccccccc
SElQ DGώSVGTPVGPETGGEKNGPEEEEEEDFDDLTiQDEEDEMSSASEESVLSVPELiQVRAGEY
SEG xxxxxxxxxxxxxxxxxxxxxxxxxx
PRD cccccccccccccccccccchhhhhhhcccchhhhhhhhhhcccccccccceeeeecccc SEA SiQVFRGLSNMYHLLICHLLACCTMDSPKIICI
SEG
PRD eeeeeehhhhhhhhhhhhhhhhcccccccccc
(No Prosite data available for DKFZphamy2_li24.3)
(No Pfam data available for DKFZphamy2_li24.3)
DKFZphamy2_ljl1
group: differentiation/development
DKFZphamy2_ljll encodes a novel 150 amino acid protein with high similarity to the allograft inflammatory factor-1 of Cyprinus carpio.
Allograft inflammatory factor-1 (AIF-1) is a protein involved in allograft rejection. In experimental autoimmune encephalomyelitis (EAE), neuritis (EAN) and uveitis (EAU), it is produced by macrophages and microglial cells.
The new protein can find clinical application in the development of tools to enhance the compatibility of transplanted tissues, as well as in expression profiling of autoimmune diseases and infections.
strong similarity to allograft inflammatory factor-1 (Cyprinus carpio)
identical to DKFZphamy2_lnl
Sequenced by MediGenomix
Locus: /map="504-1 cR from top of Chrl linkage group"
Insert length: 3381 bp
Poly A stretch at pos. 3362, polyadenylation signal at pos. 3344
1 GCCGGAGCCC GGACCAGGCG CCTGTGCCTC CTCCTCGTCC CTCGCCGCGT
SI CCGCGAAGCC TGGAGCCGGC GGGAGCCCCG CGCTCGCCAT GTCGGGCGAG 101 CTCAGCAACA GGTTCCAAGG AGGGAAGGCG TTCGGCTTGC TCAAAGCCCG
ISI GCAGGAGAGG AGGCTGGCCG AGATCAACCG GGAGTTTCTG TGTGACCAGA
201 AGTACAGTGA TGAAGAGAAC CTTCCAGAAA AGCTCACAGC CTTCAAAGAG
251 AAGTACATGG AGTTTGACCT GAACAATGAA GGCGAGATTG ACCTGATGTC
301 TTTAAAGAGG ATGATGGAGA AGCTTGGTGT CCCCAAGACC CACCTGGAGA 351 TGAAGAAGAT GATCTCAGAG GTGACAGGAG GGGTCAGTGA CACTATATCC
401 TACCGAGACT TTGTGAACAT GATGCTGGGG AAACGGTCGG CTGTCCTCAA
4S1 GTTAGTCATG ATGTTTGAAG GAAAAGCCAA CGAGAGCAGC CCCAAGCCAG
501 TTGGCCCCCC TCCAGAGAGA GACATTGCTA GCCTGCCCTG AGGACCCCGC
5S1 CTGGACTCCC CAGCCTTCCC ACCCCATACC TCCCTCCCGA TCTTGCTGCC bOl CTTCTTGACA CACTGTGATC TCTCTCTCTC TCATTTGTTT GGTCATTGAG bSl GGTTTGTTTG TGTTTTCATC AATGTCTTTG TAAAGCACAA ATTATCTGCC
701 TTAAAGGGGC TCTGGGTCGG GGAATCCTGA GCCTTGGGTC CCCTCCCTCT
751 CTTCTTCCCT CCTTCCCCGC TCCCTGTGCA GAAGGGCTGA TATCAAACCA δOl AAAACTAGAG GGGGCAGGGC CAGGGCAG6G AGGCTTCCAG CCTGTGTTCC 651 CCTCACTTGG AGGAACCAGC ACTCTCCATC CTTTCAGAAA 6TCTCCAAGC
101 CAAGTTCAGG CTCACTGACC TGGCTCTGAC GAGGACCCCA GGCCACTCTG
151 AGAAGACCTT GGAGTAGGGA CAAGGCTGCA GGGCCTCTTT CGGGTTTCCT
1001 TGGACAGTGC CATGGTTCCA GTGCTCTGGT GTCACCCAGG ACACAGCCAC 10S1 TCGGGGCCCC GCTGCCCCAG CTGATCCCCA CTCATTCCAC ACCTCTTCTC
1101 ATCCTCAGTG ATGTGAAGGT GGGAAGGAAA GGAGCTTGGC ATTGGGAGCC
1151 CTTCAAGAAG GTACCAGAAG GAACCCTCCA GTCCTGCTCT CTGGCCACAC
1201 CTGTGCAGGC AGCTGAGAGG CAGCGTGCAG CCCTACTGTC CCTTACTGGG 1251 GCAGCAGAGG GCTTCGGAGG CAGAAGTGAG GCCTGGGGTT TGGGGGGAAA
1301 GGTCAGCTCA GTGCTGTTCC ACCTTTTAGG GAGGATACTG AGGGGACCAG
13S1 GATGGGAGAA TGAGGAGTAA AATGCTCACG GCAAAGTCAG CAGCACTGGT
1401 AAGCCAAGAC TGAGAAATAC AAGGTTGCTT GTCTGACCCC AATCTGCTTG
1451 AAACCTGACT CTGCTTCTCT CATTTGTCTT CCTACCCTAC TCACATAATT 1S01 CACTCATTGA CTCACTCATT CACCAGATAT TTATTGACCT GCTATTATAA
15S1 GCTTTACATC CTCCCATGTT GTCCTGGCAT GTGCAGTATA CACGGTCTAA
IbOl CTCATCTCTC CCCAGATCTC TCAGAACCTT GAGCTTGGGA ATTGAACTGG lb51 GGTCACCTGT GTCCTTTCTT ATGGACTCGC AGGATTTTAG AACCCTAATG
1701 CACCCTGGAG GGTAGCTGGG CCAGACTTCT CATTTCACAG GTGAGGAGAC 17S1 TGGTGCCCCA CAGGGATTAA GTGCCTTGCC CAAGGTCAGG CTTATCTCCA
1601 GAGGGAGGTG CCCTGGACTG GGGCCCAGAT GTTCAGGGAC CCTGCCTACA
IδSl CCTCATTTCC AGTGTGGGCT GCCTTAGTTA GTTATGAGAA CAGGGAAGGG
1101 CTGGGAAGAG ACAGCCTCCA AGGTCAACAC TTGGAGAGGG TTTCACTTGC
11S1 TCTGAAGACC CTGGTCCAGG ATTCGCCCTC TCCCATGCCT TCAAGTCAGC 2001 ATCAGGCTTA GGGCAAAGAC CAGGCCTCTG AAGCTGCCTC TTGTAATTCA
2051 TGCAGGAAGA TGTCAAAGTC AGCCCCATCT TGGCTGATCA GGGTGTTCAG
2101 CCTTAACCCC ACCTGTGTTC TGAAGTCTCT TACCCTACCT GCTCAGGACT
2151 GAGACAGTTA TTCACTGAAC ATATTTATTA AGCACTTGCT GTAGGCCAAC
2201 AGTTAAGAAT CCAATAATGA AATGGACAGA TTCATGGAAC TTAGAGTCCA 2251 ATAGGAAAGT GAGACCCAGA CAATGACAAT GAGATAAATG TTAGGAAGGG
2301 GGAGGTATGG GGTGACTTCC CTGCAGTCCT GGGGGCCTAC ATGGGCCCAA
2351 GACTGGGTGA GAGTCTTGGC AGAGCCTTTG CAACACCTTA AGTGGACAGG
2401 ACTGGGAGGT CTTGGTGGTT GGAGCCAACG TGGGTTCCCT GCGGCTCCTT
2451 AGTCACCTCT GATAGCAGAT TGAGGGAGGA AAACAGGTAA GGCATGAGGA 2S01 AATGGCCAGG TTGGGTTAAC CCACTGGTTT CAACCAGTTC AGGAATGAGG
25S1 TTATTTGGCC ATGACT6GCT GATCTTGAGC TCAAGGATCT GCTTCAAATG
2b01 CACACAGGCC TAGTTGAAGT TTAAACCCCA GCAAAACATT CCTCCCTGTA
2b51 AATGGAAAAT CCTACTTCTA CCCCCACCCT GCCCTGTTTT TTGTTTTTTT
2701 TTTCCCCAAG ATCATTAGAT GTCCTCACCC CTCCTCACTG CCTCTCCTCT 27S1 CTGGGACAGG CTGGGACCTT TGAGGAAGAT AAAGCCTTCC TTGACTACCC
2801 ATCATATTCA GTGTCCCTGT TCCTCACTCA GAGAGGAAGG CAGAACCAGT
2851 CAGGCTTATT TCAGTAAGTT CCACAGTTCT ACAAGACTGC AGGAATTCTC
2101 CTTAAGGGAG GAGAGCAAGC AGGTGTGGCC CCAGCTTCTG GAAATGGCAG
2151 AAGAGAGGGT TTTCTCATTG AATGGGGGTG GGGGCTCGTG TGTCCTGGGA 30D1 AACCCCATCA GTCCCTTCAT TTCTTGAGAC TCAACTCCTG GGAGGAGAGG
3051 GTCTCAAGAG TTGTCCCTGG AAGGAGGGCG GGGGCAGTCT GCATCTATTT
3101 CAGGTTGTGG CTCTTGGTTC TAGGACTCTT ACTTCTCTGG CTAAGGGCTC
3151 AGCTTCTTGG GACTTCAACC ATCTTCTTTC TGAAAGACCA AATCTAATGT
3201 AACCAGTAAC GTGAGGACTG CCAAGTATGG CTTTGTCCCT ATGACTCAGA 3251 GGAGGGTTTG TCGGGCAAAT TCAGGTGGAT GAAGTATGTG TGTGCGTGTG
3301 CATGGGAGTG TGCGTGGACT GGGATATCAT CTCTACAGCC TGCAAATAAA
33S1 CCAGACAAAC TTAAAAAAAA AAAAAAAAAA A
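The poly A and polyadenylation-signal positions reported for this clone can be checked against the end of the listed sequence. The sketch below is a minimal illustration, not the annotation pipeline's own method: it assumes the signal is the canonical hexamer AATAAA, that the poly A stretch is the uninterrupted run of A at the 3' end, and that reported positions are 0-based offsets. The names `TAIL`, `TAIL_OFFSET`, `polya_start` and `last_signal_before` are hypothetical.

```python
# Last two rows of the DKFZphamy2_ljll insert as listed above
# (the rows that begin at 1-based positions 3301 and 3351).
TAIL = (
    "CATGGGAGTG TGCGTGGACT GGGATATCAT CTCTACAGCC TGCAAATAAA"
    "CCAGACAAAC TTAAAAAAAA AAAAAAAAAA A"
).replace(" ", "")
TAIL_OFFSET = 3300  # 0-based index of the first base in TAIL


def polya_start(seq: str) -> int:
    """0-based index where the uninterrupted 3'-terminal run of A begins."""
    return len(seq.rstrip("A"))


def last_signal_before(seq: str, end: int, motif: str = "AATAAA") -> int:
    """0-based index of the last `motif` lying entirely before `end`, or -1."""
    return seq.rfind(motif, 0, end)


polya = polya_start(TAIL)
signal = last_signal_before(TAIL, polya)
print(TAIL_OFFSET + polya, TAIL_OFFSET + signal)  # prints: 3362 3344
```

Under these assumptions the computed offsets agree with the positions given in the record. Other clones in the listing appear to tolerate an occasional non-A base inside the poly A stretch, which this simple run-based check would not reproduce.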
BLAST Results
Entry AB012301_1 from database TREMBL: product: "allograft inflammatory factor-1"; Cyprinus carpio mRNA for allograft inflammatory factor-1, complete cds.
Score = 575, P = 3.7e-54, identities = 113/146, positives = 126/146, frame +2
Medline entries
No Medline entry
Peptide information for frame 2
ORF from 89 bp to 538 bp; peptide length: 150
Category: strong similarity to known protein
Classification: unclassified
1 MSGELSNRFQ GGKAFGLLKA RQERRLAEIN REFLCDQKYS DEENLPEKLT
51 AFKEKYMEFD LNNEGEIDLM SLKRMMEKLG VPKTHLEMKK MISEVTGGVS
101 DTISYRDFVN MMLGKRSAVL KLVMMFEGKA NESSPKPVGP PPERDIASLP
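The relation between the ORF coordinates and the peptide length can be checked directly from the listed nucleotide sequence. The following is a minimal sketch that assumes the ORF begins at the first ATG of the insert and uses the standard genetic code; it translates the first 150 bases listed for this clone and compares the span up to the stated end position (538) with the stated length of 150 residues. `FIVE_PRIME`, `translate` and `peptide_length` are illustrative names, not part of the original annotation.

```python
# Standard genetic code, indexed in the conventional T, C, A, G order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    b1 + b2 + b3: AMINO[16 * i + 4 * j + k]
    for i, b1 in enumerate(BASES)
    for j, b2 in enumerate(BASES)
    for k, b3 in enumerate(BASES)
}

# Positions 1-150 of the insert, copied from the nucleotide listing above.
FIVE_PRIME = (
    "GCCGGAGCCC GGACCAGGCG CCTGTGCCTC CTCCTCGTCC CTCGCCGCGT"
    "CCGCGAAGCC TGGAGCCGGC GGGAGCCCCG CGCTCGCCAT GTCGGGCGAG"
    "CTCAGCAACA GGTTCCAAGG AGGGAAGGCG TTCGGCTTGC TCAAAGCCCG"
).replace(" ", "")


def translate(dna: str) -> str:
    """Translate complete codons from the start of `dna`, stopping at a stop codon."""
    peptide = []
    for i in range(0, len(dna) - 2, 3):
        residue = CODON_TABLE[dna[i:i + 3]]
        if residue == "*":
            break
        peptide.append(residue)
    return "".join(peptide)


def peptide_length(start: int, end: int) -> int:
    """Residues encoded by 1-based inclusive ORF coordinates (stop codon excluded)."""
    span = end - start + 1
    if span % 3:
        raise ValueError("ORF span is not a whole number of codons")
    return span // 3


orf_start = FIVE_PRIME.find("ATG") + 1  # convert to a 1-based position
print(orf_start, translate(FIVE_PRIME[orf_start - 1:]), peptide_length(orf_start, 538))
# prints: 89 MSGELSNRFQGGKAFGLLKA 150
```

The translated residues match the start of the peptide listing, and the span from the first ATG to position 538 corresponds to exactly 150 codons.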
BLASTP hits
No BLASTP hits available
Alert BLASTP hits for DKFZphamy2_ljlli frame 2 No Alert BLASTP hits found
Pedant information for DKFZphamy2__ljlli frame 2
Report for DKFZphamy2_ljll ■ 2
ELENGTH! 15D
EMU! 170b7-8b
E Eppll!! bb--bb33
EH0M0L! TREMBL AB0123D1_1 product: "allograft inflammatory factor-l"i Cyprinus carpio mRNA for allograft inflammatory factor-li complete eds- 2e-S1 EFUNCAT! 30-04 organization of cytoskeleton ES- cerevisiaei
YBRlOlc! 5e-04
EFUNCAT! 03-07 pheromone response! mating-type determination! sex-specific proteins ES- cerevisiaei YBRlOlc! Se-04
EFUNCAT! Oδ-11 cellular import ES- cerevisiaei YBRlOlc! 5e-04 EFUNCAT! 10-02-11 other morphogenetic activities ES- cerevisiaei YBRlOlc! Se-04
EFUNCAT! 03-22 cell cycle control and mitosis ES. cerevisiaei
YBRlOlc! Se-04
EFUNCAT! 03-04 buddingi cell polarity and filament formation ES- cerevisiaei YBRlOlc! Se-04
EFUNCAT! 03-01 cell growth ES- cerevisiaei YBRlOlc! Se-04
EFUNCAT! 30-05 organization of centrosome ES- cerevisiaei
YBRlOlc! 5e-04 ESCOP! d2mysb_ 1-37-1-5-15 Myosin Essential Chain Myosin
Regulatory Chai 5e-2D
ESCOP! dlwdcb_ 1-37-1-S-14 Myosin Essential Chain Myosin
Regulatory Chai 3e-05 ESCOP! dlosa 1-37-1.5-13 Calmodulin E(Paramecium tetraurelia) 3e-lb
ESCOP! dlauib_ 1-37-1.5.11 Calcineurin regulatory subunit
(B-chain 2e-lb •
EPIRKU! duplication 7e-0b EPIRKU! mitosis 7e-0b
EPIRKU! calcium binding 7e-0b
[PIRKU! EF hand 7e-0b
[PIRKU! cell division 7e-0b
[SUPFAM! unassigned calmodulin-related proteins 3e-47 [SUPFAM! calmodulin 7e-0b
[SUPFAM! calmodulin repeat homology 3e-47
[KU! All_Alpha
[KU! 3D
SElQ MSGELSNRFlQGGKAFGLLKARlQERRLAEINREFLCDlQKYSDEENLPEKLTAFKEKYMEFD lctr- HHHHHHHHHHHHHT SElQ LNNEGEIDLMSLKRMMEKLGVPKTHLEMKKMISEVTGGVSDTISYRDFVNMMLGKRSAVL lctr-
TTTTTCBCHHHHHHHHHHTTTCCCHHHHHHHHHHCTTTTCCCBCHHHHHHHHCCTTTHHH
SElQ KLVMMFEGKANESSPKPVGPPPERDIASLP lctr- HHHHHHTTTTC
(No Prosite data available for DKFZphamy2_ljll ■ 2) (No Pfam data available for DKFZphamy2_ljll .2)
DKFZphamy2_24b4
group: cell cycle
DKFZphamy2_24b4 encodes a novel 698 amino acid protein with similarity to human STIM1. The stromal interaction molecule 1 gene (STIM1) encodes a type I transmembrane protein of unknown function, which induces growth arrest and degeneration of the human tumor cell lines G401 and RD but not HBL100 and CaLu-6, suggesting a role in the pathogenesis of rhabdomyosarcomas and rhabdoid tumors. There is also strong similarity to a Mus musculus stromal cell protein, which selectively increases interleukin 7-dependent proliferation of pre-B cells. The novel protein contains 1 transmembrane domain.
The new protein can find application in modulation of tumour growth.
similarity to STIM1 (Homo sapiens)
probably differential polyadenylation: cf. EST-BLAST file.
perhaps complete cds.
Pedant: SIGNAL PEPTIDE and TRANSMEMBRANE 1
Sequenced by GBF
Locus: /map="131-2 cR from top of Chr4 linkage group"
Insert length: 3305 bp
Poly A stretch at pos. 3274, polyadenylation signal at pos. 3260
1 GGCGCCTTCA TCCCGCCTCG ACTCCTGGCC CAGCGTGGGG CTGGCTGCTG
51 CGGCGGCGGC GCTGGGCTGC GTTGCTGGTG CTCGGGCTGC TGGTACCCGG
101 AGCGGCGGAC GGATGCGAGC TTGTGCCCCG GCACCTCCGC GGGCGGCGGG ISI CGACTGGCTC TGCCGCAACT GCCGCCTCCT CTCCCGCCGC GGCGGCCGGC
201 GATAGCCCGG CGCTCATGAC AGATCCCTGC ATGTCACTGA GTCCACCATG
251 CTTTACAGAA GAAGACAGAT TTAGTCTGGA AGCTCTTCAA ACAATACATA
301 AACAAATGGA TGATGACAAA GATGGTGGAA TTGAAGTAGA GGAAAGTGAT
351 GAATTCATCA GAGAAGATAT GAAATATAAA GATGCTACTA ATAAACACAG 401 CCATCTGCAC AGAGAAGATA AACATATAAC GATTGAGGAT TTATGGAAAC
451 GATGGAAAAC ATCAGAAGTT CATAATTGGA CCCTTGAAGA CACTCTTCAG
SOI TGGTTGATAG AGTTTGTTGA ACTACCCCAA TATGAGAAGA ATTTTAGAGA
SSI CAACAATGTC AAAGGAACGA CACTTCCCAG GATAGCAGTG CACGAACCTT bOl CATTTATGAT CTCCCAGTTG AAAATCAGTG ACCGGAGTCA CAGACAAAAA bSl CTTCAGCTCA AGGCATTGGA TGTGGTTTTG TTTGGACCTC TAACACGCCC
7D1 ACCTCATAAC TGGATGAAAG ATTTTATCCT CACAGTTTCT ATAGTAATTG
751 GTGTTGGAGG CTGCTGGTTT GCTTATACGC AGAATAAGAC ATCAAAAGAA
601 CATGTTGCAA AAATGATGAA AGATTTAGAG AGCTTACAAA CTGCAGAGCA
651 AAGTCTAATG GACTTACAAG AGAGGCTTGA AAAGGCACAG GAAGAAAACA 101 GAAATGTTGC TGTAGAAAAG CAAAATTTAG AGCGCAAAAT GATGGATGAA
151 ATCAATTATG CAAAGGAGGA GGCTTGTCGG CTGAGAGAGC TAAGGGAGGG
1001 AGCTGAATGT GAATTGAGTA GACGTCAGTA TGCAGAACAG GAATTGGAAC
10S1 AGGTTCGCAT GGCTCTGAAA AAGGCCGAAA AAGAATTTGA ACTGAGAAGC 1101 AGTTGGTCTG TTCCAGATGC ACTTCAGAAA TGGCTTCAGT TAACACATGA
1151 AGTAGAAGTG CAATACTACA ATATTAAAAG ACAAAACGCT GAAATGCAGC
1201 TAGCTATTGC TAAAGATGAG GCAGAAAAAA TTAAAAAGAA GAGAAGCACA
1251 GTCTTTGGGA CTCTGCACGT TGCACACAGC TCCTCCCTAG ATGAGGTAGA 1301 CCACAAAATT CTGGAAGCAA AGAAAGCTCT CTCTGAGTTG ACAACTTGTT
13S1 TACGAGAACG ACTTTTTCGC TGGCAACAAA TTGAGAAGAT CTGTGGCTTT
1401 CAGATAGCCC ATAACTCAGG ACTCCCCAGC CTGACCTCTT CCCTTTATTC
1451 TGATCACAGC TGGGTGGTGA TGCCCAGAGT CTCCATTCCA CCCTATCCAA
1S01 TTGCTGGAGG AGTTGATGAC TTAGATGAAG ACACACCCCC AATAGTGTCA 15S1 CAATTTCCCG GGACCATGGC TAAACCTCCT GGATCATTAG CCAGAAGCAG
IbOl CAGCCTGTGC CGTTCACGCC GCAGCATTGT GCCGTCCTCG CCTCAGCCTC
IbSl AGCGAGCTCA GCTTGCTCCA CACGCCCCCC ACCCGTCACA CCCTCGGCAC
1701 CCTCACCACC CGCAACACAC ACCACACTCC TTGCCTTCCC CTGATCCAGA
1751 TATCCTCTCA GTGTCAAGTT GCCCTGCGCT TTATCGAAAT GAAGAGGAGG IδOl AAGAGGCCAT TTACTTCTCT GCTGAAAAGC AATGGGAAGT GCCAGACACA
1651 GCTTCAGAAT GTGACTCCTT AAATTCTTCC ATTGGAAGGA AACAGTCTCC
1101 TCCTTTAAGC CTCGAGATAT ACCAAACATT ATCTCCGCGA AAGATATCAA
1151 GAGATGAGGT GTCCCTAGAG GATTCCTCCC GAGGGGATTC GCCTGTAACT
2001 GTGGATGTGT CTTGGGGTTC TCCCGACTGT GTAGGTCTGA CAGAAACTAA 20S1 GAGTATGATC TTCAGTCCTG CAAGCAAAGT GTACAATGGC ATTTTGGAGA
2101 AATCCTGTAG CATGAACCAG CTTTCCAGTG GCATCCCGGT GCCTAAACCT
2151 CGCCACACAT CATGTTCCTC AGCTGGCAAC GACAGTAAAC CAGTTCAGGA
2201 AGCCCCAAGT GTTGCCAGAA TAAGCAGCAT CCCACATGAC CTTTGTCATA
22S1 ATGGAGAGAA AAGCAAAAAG CCATCAAAAA TCAAAAGCCT TTTTAAGAAG 2301 AAATCTAAGT GAACTGGCTG ACTTGATGGA ATCATGTTCA AGTGGCATCT
2351 GTAAACTATT ATCCCCCACC CTCCACTCCC CACCTTTTTT TTGGTTTAAT
2401 TTTAGGAATG TAACTCCATT GGGGCTTTCC AGGCCGGATG CCATAGTGGA
2451 ACATCCAGAA GGGCAACTGT CTACTGTCTG CTTATTTAAG TGACTATATA
2S01 TAATCAATTC ATCAAGCCAG TTATTACTGA AAAATCATTG AAATGAGACA 2SS1 GTTTACAGTC ATTTCTGCCT ATTTATTTCT GCTTTGTTCT CAGTGATGTA
2b01 TATGCAACAT TTTGTTGAAA GCCACGATGG ACTTACAAGC TTTAATGGAC
2b51 TCGTAAGCCA GCATGGGCTT GCAAAAATTT CTTGTTTACC AGAGCATCTT
2701 CTTATCTTTC CACAGAGCTA TTTACATCCT GGACTATATA ACTTAAAAGA
2751 AGTAAAACGT AATTGCACTA CTGTTTTCCA GACTGGAAAA AAAAAAAAAT 2601 CTCTGCAAGT GAAACTGTAT AGAGTTTATA AAATGACTAT GGATAGGGGA
2651 CTGTTTTCAC TTTTAGATCA AAATGGGTTT TTAAGTAGAA CCTAGGGTTT
2101 CTAATTGACT TGATTTCTGG AAATGAAAAC CCGCGCTTTT ATTATGGGAA
2151 GCTTCTTGAA CTGCATTTAC TATTGTGAAG TTTCAAGTCC CGCTGTAAAG
3001 ATCATGTTGT TTTGTTTTCC CCAGGGCTTT CACTGTGATT TACTGCATTG 3051 CAGGCTGTAT GATAAAACAC ACATAATTTA AAGAGAGAAG GCTCTTGATT
3101 CCTTATGCAA GTGGAAGAGT TGAAACTTGA TTGAAGGACT TAAAACATTC
31S1 ACAACCTTAA GCCGAGGTGG GGGGATATGG GGATTCAGGC AGTTGTTTAC
3201 ACACTTTGAA TAACTGCAAA GGATTTACGG TTTGTGAAAA ATGTGTACTG
3251 TGGAAAAGAT AATAAATTGA AGACATTAAA AAAAAGAAAA AAAAAAAAAA 3301 AAAAA
BLAST Results
Entry HSS242blO_l from database TREMBL: gene: "STIM1"; product: "GOK"; Homo sapiens GOK (STIM1) mRNA, complete cds.
Score = 1317, P = 4.2e-142, identities = 275/447, positives = 336/447, frame +3
Entry MMU47323_1 from database TREMBL: product: "stromal cell protein"; Mus musculus stromal cell protein mRNA, complete cds.
Score = 1314, P = 6.6e-142, identities = 274/447, positives = 336/447, frame +3
Entry HS117341 from database EMBL: human STS ESTlb7471.
Score = 1310, P = 1.1e-57, identities = 264/267
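The identities and positives figures in these summaries are counts over the aligned length, so they are easier to compare as percentages. A small sketch of that arithmetic is shown below. The summary text is the first hit's score line with its separators written as commas and decimal points, and `blast_fractions` is a hypothetical helper, not part of any BLAST tool.

```python
import re


def blast_fractions(summary: str) -> dict[str, float]:
    """Map each 'name = matched/aligned' pair in a summary line to a percentage."""
    pairs = re.findall(r"(\w+)\s*=\s*(\d+)\s*/\s*(\d+)", summary)
    return {name: 100.0 * int(hit) / int(total) for name, hit, total in pairs}


line = "Score = 1317, P = 4.2e-142, identities = 275/447, positives = 336/447, frame +3"
stats = blast_fractions(line)
print({name: round(value, 1) for name, value in stats.items()})
# prints: {'identities': 61.5, 'positives': 75.2}
```

Fields without a `matched/aligned` pair, such as the score and the P value, are ignored by the pattern.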
Medline entries
97071612:
Parker NJ, Begley CG, Smith PJ, Fox RM. Molecular cloning of a novel human gene (D11S4896E) at chromosomal region 11p15.5. Genomics 1996 Oct 15;37(2):253-6
96326680:
Oritani K, Kincade PW. Identification of stromal cell products that interact with pre-B cells. J Cell Biol 1996 Aug;134(3):771-82
Peptide information for frame 3
ORF from 216 bp to 2309 bp; peptide length: 698
Category: similarity to known protein
Classification: Cell signaling/communication
Prosite motifs: RGD (589-591)
1 MTDPCMSLSP PCFTEEDRFS LEALiQTIHKiQ MDDDKDGGIE VEESDEFIRE
51 DMKYKDATNK HSHLHREDKH ITIEDLUKRU KTSEVHNUTL EDTLlQULIEF
101 VELPlQYEKNF RDNNVKGTTL PRIAVHEPSF MISώLKISDR SHR(QKL(QLKA ISI LDVVLFGPLT RPPHNUMKDF ILTVSIVIGV GGCUFAYTlQN KTSKEHVAKM
201 MKDLESLiQTA EtQSLMDLiQER LEKAiQEENRN VAVEKlQNLER KMMDEINYAK
251 EEACRLRELR EGAECELSRR (QYAEiQELEiQV RMALKKAEKE FELRSSUSVP
301 DALlQKULlQLT HEVEVcQYYNI KRlQNAEMlQLA IAKDEAEKIK KKRSTVFGTL
3S1 HVAHSSSLDE VDHKILEAKK ALSELTTCLR ERLFRUiQlQIE KICGFiQIAHN 401 SGLPSLTSSL YSDHSUVVMP RVSIPPYPIA GGVDDLDEDT PPIVSlQFPGT
451 MAKPPGSLAR SSSLCRSRRS IVPSSPcQPiQR AlQLAPHAPHP SHPRHPHHPiQ
501 HTPHSLPSPD PDILSVSSCP ALYRNEEEEE AIYFSAEKlQU EVPDTASECD
SSI SLNSSIGRKiQ SPPLSLEIYfi TLSPRKISRD EVSLEDSSRG DSPVTVDVSU bOl GSPDCVGLTE TKSMIFSPAS KVYNGILEKS CSMNiQLSSGI PVPKPRHTSC bSl SSAGNDSKPV (QEAPSVARIS SIPHDLCHNG EKSKKPSKIK SLFKKKSK BLASTP hits No BLASTP hits available Alert BLASTP hits for DKFZphamy2_24b4ι frame 3
No Alert BLASTP hits found
Pedant information for DKFZphamy2_24b4ι frame 3
Report f or DKFZphamy2_24b4 - 3
[LENGTH! 7b1
[MU! 6bb73 - 41
[pi! b - bl
[H0M0L! TREMBL:HS5242blO_l gene: "STIMl"i product: "G0K"i
Homo sapiens GOK (STIMl) mRNAi complete eds. le-154 [BLOCKS! BLOOδβbC Dihydroxy-acid and b-phosphogluconate dehydratases proteins
[BLOCKS! PR00021D
[BLOCKS! PR010S3F
[BLOCKS! BL0072bB AP endonucleases family 1 proteins [PROSITE! RGD 1
[KU! SIGNAL_PEPTIDE 3δ
[KU! TRANSMEMBRANE 1
[KU! L0U_C0MPLEXITY 15-δb *
[KU! COILED COIL 6-45 *
SElQ RLHPASTPGPAUGULLRRRRUAALLVLGLLVPGAADGCELVPRHLRGRRATGSAATAASS
SEG xxxxxxxxxxxxxxxxxxxxxxxxxxx xxxxxxxxxxx
PRD cccccccccccchhhhhhhhhhhhhhhhhcccccccccccchhhhhhhcccccccccccc COILS
MEM
SElQ PAAAAGDSPALMTDPCMSLSPPCFTEEDRFSLEALiQTIHKiQMDDDKDGGIEVEESDEFIR SEG xxxxxxxxxx
PRD ccccccccccccccccccccccccchhhhhhhhhhhhhhhhhhcccccceeeecchhhhh COILS
MEM
SElQ EDMKYKDATNKHSHLHREDKHITIEDLUKRUKTSEVHNUTLEDTLlQULIEFVELPiQYEKN SEG
PRD hhcccccccccccccccccceeeehhhhhhhhhhccccchhhhhhhhhhhhhhcccchhh COILS
MEM
SElQ FRDNNVKGTTLPRIAVHEPSFMISlQLKISDRSHRlQKLiQLKALDVVLFGPLTRPPHNUMKD SEG PRD hhhhhccccccceeeeecccceeeeeecccchhhhhhhhhhheeeecccccccccccchh COILS
MEM SElQ FILTVSIVIGVGGCUFAYT(QNKTSKEHVAKMMKDLESL(QTAE(QSLMDL(QERLEKA(QEENR
SEG
PRD hhheeeeeeccccceeeecccccchhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhcc COILS CCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCC
MEM MMMMMMMMMMMMMMMMM
SElQ NVAVEKiQNLERKMMDEINYAKEEACRLRELREGAECELSRRiQYAElQELEiQVRMALKKAEK SEG
PRD ceeeehhhhhhhhhhhhhhhhhhhhhhhhhhhhcchhhhhhhhhhhhhhhhhhhhhhhhh COILS
CCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCC
MEM
SElQ EFELRSSUSVPDAL(QKUL(QLTHEVEV(QYYNIKR(QNAEM(QLAIAKDEAEKIKKKRSTVFGT SEG xxxxxxxxxxxxx
PRD hhhhhccccccchhhhhhhhhhhheeeeccchhhhhhhhhhhhhhhhhhhhhhhhhccce COILS
MEM
SElQ LHVAHSSSLDEVDHKILEAKKALSELTTCLRERLFRUiQiQIEKICGFlQIAHNSGLPSLTSS SEG PRD eeeeeccccchhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhcceeeeccccccceee COILS
MEM SElQ LYSDHSUVVMPRVSIPPYPIAGGVDDLDEDTPPIVSiQFPGTMAKPPGSLARSSSLCRSRR
SEG xxxxxxxxxxxxx
PRD cccccccccccccccccccccccccccccccccccccccccccccccccccccccccccc COILS MEM
SElQ SIVPSSPIQPIQRAIQLAPHAPHPSHPRHPHHPIQHTPHSLPSPDPDILSVSSCPALYRNEEEE
SEG x xxxxxxxxxxxxxxxxxxxxxxxxx xxxx
PRD eeecccccccccccccccccccccccccccccccccccccccceeeeeecccchhhhhhh COILS
MEM
SElQ EAIYFSAEKIQUEVPDTASECDSLNSSIGRKIQSPPLSLEIYIQTLSPRKISRDEVSLEDSSR SEG x
PRD hhhhhhhhhhcccccccccccccccccccccccceeeeeeeecccccccccccccccccc COILS
MEM
SElQ GDSPVTVDVSUGSPDCVGLTETKSMIFSPASKVYNGILEKSCSMNiQLSSGIPVPKPRHTS SEG
PRD cccceeeeeccccccccceeeccccccccccceeeeeeeccccccccccccccccccccc COILS
MEM
SElQ CSSAGNDSKPViQEAPSVARISSIPHDLCHNGEKSKKPSKIKSLFKKKSK SEG xxxxxxxxxxxxxxxxx
PRD ccccccccceeeecceeeeeccccccccccccccccccceeeeeecccc
COILS
MEM
Prosite for DKFZphamy2_24b4.3
PS00016 660->663 RGD PDOC00016
(No Pfam data available for DKFZphamy2_24b4.3)
DKFZphamy2_24c8
group: transmembrane protein
DKFZphamy2_24c8 encodes a novel 454 amino acid protein without similarity to known proteins. The novel protein contains 1 transmembrane region. No informative BLAST results; no predictive Prosite, Pfam or SCOP motifs.
The new protein can find application in studying the expression profile of amygdala-specific genes and as a new marker for amygdala cells.
putative protein
EST of GEN-42bH07 is 141 bp longer at 5'-end
perhaps complete cds.
Pedant: TRANSMEMBRANE 1
Sequenced by GBF
Locus: /map="601.7 cR from top of Chr3 linkage group"
Insert length: 3200 bp
Poly A stretch at pos. 3177, polyadenylation signal at pos. 3156
1 CCTGTCCACA GGGCCCGCTC CAGCAGCCAT GGCAACCACA TCCTCCAAGC 51 CAGAGGGCCG CCCTCGAGGG CAGGCTGCCC CCACCATCCT GCTGACAAAG
101 CCACCGGGGG CCACCAGCCG CCCCACCACA GCGCCCCCCC GCACTACCAC
151 ACGCAGGCCC CCCAGGCCCC CAGGCTCTTC CCGAAAAGGG GCTGGTAATT
201 CATCACGCCC TGTCCCGCCT GCACCTGGTG GCCACTCCAG GAGTAAAGAA
251 GGACAGCGAG GACGAAATCC AAGCTCCACA CCTCTGGGGC AGAAGCGGCC 301 CCTGGGGAAA ATCTTTCAGA TCTACAAGGG CAACTTCACA GGGTCTGTGG
351 AACCGGAGCC CTCTACCCTC ACCCCCAGGA CCCCACTCTG GGGCTACTCC
401 TCTTCACCAC AGCCCCAGAC AGTGGCTGCG ACCACAGTGC CCAGCAATAC
451 CTCATGGGCA CCCACCACCA CCTCCCTGGG GCCTGCAAAG GACAAGCCAG
501 GCCTTCGCAG AGCAGCCCAG GGGGGTGGTT CTACCTTCAC CAGCCAAGGA S51 GGGACACCAG ATGCCACAGC AGCCTCAGGT GCCCCTGTCA GTCCACAAGC bOl TGCCCCAGTG CCTTCTCAGC GCCCCCACCA CGGTGACCCA CAGGATGGCC bSl CCAGCCATAG TGACTCTTGG CTTACTGTTA CCCCTGGCAC CAGCAGACCT
701 CTGTCTACCA GCTCTGGGGT CTTCACGGCT GCCACGGGGC CCACCCCAGC
751 TGCCTTCGAT ACCAGTGTCT CAGCCCCTTC CCAGGGGATT CCTCAGGGAG 601 CATCCACAAC CCCACAAGCT CCAACCCATC CCTCCAGGGT CTCAGAAAGC
651 ACTATTTCTG GAGCCAAGGA GGAGACTGTG GCCACCCTCA CCATGACCGA
101 CCGGGTGCCC AGTCCTCTCT CCACAGTGGT ATCCACA6CC ACAGGCAATT
151 TCCTCAACCG CCTGGTCCCC GCCGGGACCT GGAAGCCTGG GACAGCAGGG
1001 AACATCTCCC ATGTGGCCGA GGGGGkCkkk CCGCAGCACA GAGCCACCAT 1051 CTGCCTGAGC AAGATGGATA TCGCCTGGGT GATCCTGGCC ATCAGCGTGC
1101 CCATCTCCTC CTGCTCTGTC CTGCTGACGG TGTGCTGCAT GAAGAGGAAG
1151 AAGAAGACCG CCAACCCGGA GAACAACCTG AGCTACTGGA ACAACACCAT
1201 CACCATGGAC TACTTCAACA GGCATGCTGT GGAGCTGCCC AGGGAGATCC 12S1 AGTCCCTTGA AACCTCTGAG GACCAGCTCT CAGAGCCCCG CTCCCCAGCC
1301 AATGGCGACT ATAGAGACAC TGGGATGGTC CTTGTTAACC CCTTCTGTCA
1351 AGAAACACTG TTTGTGGGAA ACGATCAAGT ATCTGAGATC TAACTACAGC
1401 AGGCATCACT TTGCCATTCC GTATTTTTCG TCTCTAAATT ATAAATATAC 1451 AAATATATAT ATTATAAATA TAACCTTTGT GTAACCCTGA CTTAATGAGA
1501 AACATTTTCA GCTTTTTTTC CTATGAATTG TCAACATCTT TTTTACAAGT
1SS1 GTGGTTTAAA AAAAAAAAAA CTTTACAGAA TGATCTGTGG CTTTATAAAA
IbOl TAAAGGTATT TCTAAGCAAA GCAGTTGCAT TGATTGCTTC TCTTAATAAC
IbSl TATTCTTGAG CACCTGGGGA TCCCAGGAAC CCTGGTCAGG TGAGGTAAGA 1701 GACTGACCTC CTGTAGAAGC TGAATGTTAC AGTGGTCAAG CGCACGATTC
17S1 TTTGAGTGAT TCTTAAAGCT CTGGTTCCTC TTGATTTGGT GTGACCCCAT
IflDl TTCCTCCCTT CTCATACGCA CACCTGTAAA GGGAACTGGA CCGCCTCAGG
IδSl GGAAGACGGC AGACTCATGC ACAGAGAAGG AAAAGGGAAC ATCTCATCAC
,. , 1.101 CTCTGAGGAT GAGTACCCTG GAGCCTTATG ACGGCACCAT TGGATGTCAT 1151 GTTTAATTCC ATCCAAGTTG TGGATGGCAG GCAGGAGCAT GGAGCCCTCA
2001 GGAATCCATG GAGGACATCA AGGCATCCCA AGGCCATATT CCCCTAACAT
2051 TACTTCCACT GCTAACAACA GGACTGCCTT TCCCTGGTGG GAAAATGCTC
2101 CCTTTATGCC CATTCCTGTA TCCCCTCCAA CACCCACATC TGCATTAAAC
2151 ACCCGTGCCT TTCTCTTGGA GAGGGTTTAG ATGCAGATCC CGGCCCTGGA 2201 GCTTTAAAAT GCTTGCCCTT CCTTCTTCAA GGATCAAATG TTTATTGGGG
2251 TTCAGCTTTG TTTTCTCAAA AGGCCATGGT ATCGTGCCCC TGAGGAACAT
2301 GTTTATCTAA GAAGCTTTGA GGTAGTAGAG CGATAATTTT TGAAACCTTC
2351 CTCCTGCAAT CTTTAAAAAA GkkkkkkkkG ATTGCCCAAA CAAATCATTT
2401 GGGAGAAGAC ATCATTATAC TCCTACTTGG CACTGCAAAC CTGCTCGCAG 2451 CACCAGCCGG TGGACTTGCC ATCCAGCTCT CAGCTTCCAC TGCTCCCCTT
2501 GTTCCCGGCC GGCTGGCTGC CTCCCCGTGC TGTGTCCAGC ACGGCCAACA
25S1 ACGTCAGACC CTCAGAGACG CCCAAGGGGC TTCCAGAGGT GGCCGCTTCT
2b01 CTATTTTTTC CTGATTGTGG CTGAGAGAGA TGATTACTGC TTTGACACTT
2b51 CCTTTCTCTA AAAGAAAAAT AGTTTGATAG TATATTTTGA ATATAGATGC 2701 TCTTATAGTC AGATTGGGAA TTGAACTTGA ATATTGGGTC ATATGTTTGT
27S1 GTTGTTGCTG TAGTCTATCA TGACTTTTTT CTTTCTGCAT TTTCCTTAAA
2601 AAAAAAAAAA AGATGGCCTT CAAAAGTGTG TTCTCAATGT TGTATGAACC
2851 TCCTTCACAT GAGTTCGGTT GTTGTCTCTC TTCAAAGACT CTTCAACCCA
2101 CAAAGAAGCA ACTAAATGTT TCTCTAAGTT TAATTTTCTA GCGTGTTGTT 21S1 GTCTTACCTT TTTAACCTTA CCATAATATT TCTGTTAACT GTTACATTTA
3001 ATATACCAAT GTGTGTAAGT ATACAGAGAA AAATCTGTTT GTAAAGTAAA
3051 ATTTATATAT AATATATGTA ATCAAAGATA CATATGTTAT ATATACATAT
3101 GTGGATGTAT GACTTATTTT TCCTTATCCA CAGATTTCAG CTACCATGTA
31S1 TATATAAATA AACTTATTTT ATTAGCCAGA GAAAAAAAAA AAAAAAAAAA
BLAST Results
No BLAST result
Medline entries
No Medline entry
Peptide information for frame 2
ORF from 29 bp to 1390 bp; peptide length: 454
Category: putative protein
Classification: Transmembrane proteins unclassified
1 MATTSSKPEG RPRGiQAAPTI LLTKPPGATS RPTTAPPRTT TRRPPRPPGS 51 SRKGAGNSSR PVPPAPGGHS RSKEGlQRGRN PSSTPLGiQKR PLGKIFlQIYK
101 GNFTGSVEPE PSTLTPRTPL UGYSSSPIQPIQ TVAATTVPSN TSUAPTTTSL
151 GPAKDKPGLR RAAiQGGGSTF TSlQGGTPDAT AASGAPVSPlQ AAPVPSiQRPH
201 HGDPlQDGPSH SDSULTVTPG TSRPLSTSSG VFTAATGPTP AAFDTSVSAP
251 SiQGIPlQGAST TPlQAPTHPSR VSESTISGAK EETVATLTMT DRVPSPLSTV 301 VSTATGNFLN RLVPAGTUKP GTAGNISHVA EGDKPiQHRAT ICLSKMDIAU
3S1 VILAISVPIS SCSVLLTVCC MKRKKKTANP ENNLSYUNNT ITMDYFNRHA
401 VELPREIiQSL ETSEDiQLSEP RSPANGDYRD TGMVLVNPFC (QETLFVGNDfl
451 VSEI
BLASTP hits
No BLASTP hits available
Alert BLASTP hits for DKFZphamy2_24c8, frame 2
No Alert BLASTP hits found
Pedant information for DKFZphamy2_24c8, frame 2
Report for DKFZphamy2_24c8.2
[LENGTH! 4b3
[MU! 48277-64
[pi! 1-60
[FUNCAT! 18 classification not yet clear-cut [S- cerevisiaei YJRlSlc! 2e-04
[BLOCKS! PR00112F
[BLOCKS! BP03b1bF
[KU! TRANSMEMBRANE 1
[KU! L0U_C0MPLEXITY 15-55 *
SElQ LSTGPAPAAMATTSSKPEGRPRGiQAAPTILLTKPPGATSRPTTAPPRTTTRRPPRPPGSS
SEG xxxxxxxxxxxxxxxxxxxxxxxxxxx
PRD ccccchhhhhhhhcccccccccccccceeeeccccccccccccccccccccccccccccc MEM
SElQ RKGAGNSSRPVPPAPGGHSRSKEGIQRGRNPSSTPLGIQKRPLGKIFIQIYKGNFTGSVEPEP
SEG x
PRD cccccccccccccccccccccccccccccccccccccccccceeeeeecccccccccccc MEM
SElQ STLTPRTPLUGYSSSPlQPlQTVAATTVPSNTSUAPTTTSLGPAKDKPGLRRAAiQGGGSTFT
SEG xxxxxxxx
PRD ccccccccccccccccccceeeeeecccccccccccccccccccccccceeecccccccc MEM
SElQ S(QGGTPDATAASGAPVSP(QAAPVPS(QRPHHGDP(QDGPSHSDSULTVTPGTSRPLSTSSGV S EG X XX X X - - - - X X X X X X X XX X X X X X X XX -. PRD ccccccccccccccccccccccccccccccccccccccccceeeeeccccccccccccce
MEM
SElQ FTAATGPTPAAFDTSVSAPSlQGIPiQGASTTPfiAPTHPSRVSESTISGAKEETVATLTMTD SEG xxxxxxxxxxxxxx
PRD eeeeccccccccccccccccccccccccccccccccccceeeeeecccchhhhhhhcccc
MEM
SElQ RVPSPLSTVVSTATGNFLNRLVPAGTUKPGTAGNISHVAEGDKPiQHRATICLSKMDIAUV SEG
PRD ccccccceeeeeccccccccccccccccccccccceeecccccccceeeccccchhhhhh
MEM M
SE(Q ILAISVPISSCSVLLTVCCMKRKKKTANPENNLSYUNNTITMDYFNRHAVELPREIlQSLE SEG
PRD hhhhccccccccceeeehhhhhhccccccccccccccccccccccccccccccchhhhhc
MEM MMMMMMMMMMMMMMMM
SElQ TSEDiQLSEPRSPANGDYRDTGMVLVNPFCiQETLFVGNDlQVSEI SEG
PRD cccccccccccccccccccceeeeecccccceeeeeccccccc
MEM
(No Prosite data available for DKFZphamy2_24c8.2)
(No Pfam data available for DKFZphamy2_24c8.2)
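The single transmembrane region reported for this clone (the run of M in the MEM rows) coincides with a strongly hydrophobic stretch of the peptide. The record does not say which method produced the MEM annotation, so the sketch below substitutes a plain Kyte-Doolittle sliding-window average as an illustration only. The window of 19 residues and the cutoff of 1.6 are the commonly quoted Kyte-Doolittle settings and are assumptions here, as are the names `KD`, `EXCERPT` and `hydropathy_profile`. The excerpt is residues 341-390 of the peptide listing, with the letter printed as U read as W (tryptophan).

```python
# Kyte-Doolittle hydropathy index for the 20 standard amino acids.
KD = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}

# Residues 341-390 of the listed peptide (U in the listing read as W).
EXCERPT = "ICLSKMDIAWVILAISVPISSCSVLLTVCCMKRKKKTANPENNLSYWNNT"


def hydropathy_profile(seq: str, window: int = 19) -> list[float]:
    """Mean Kyte-Doolittle score for every full window along `seq`."""
    scores = [KD[residue] for residue in seq]
    return [
        sum(scores[i:i + window]) / window
        for i in range(len(scores) - window + 1)
    ]


profile = hydropathy_profile(EXCERPT)
best = max(range(len(profile)), key=profile.__getitem__)
print(best, round(profile[best], 2), EXCERPT[best:best + 19])
print("candidate membrane span:", profile[best] > 1.6)
```

The most hydrophobic window falls inside the segment marked in the MEM rows, while windows over the lysine-rich stretch that follows score below zero.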
DKFZphamy2_24kl5
group: amygdala derived
DKFZphamy2_24kl5 encodes a novel 279 amino acid protein with weak similarity to pecanex of Drosophila melanogaster. Pecanex is a maternal-effect neurogenic gene involved in differentiation processes in the developing central nervous system. DKFZphamy2_24kl5-p3 seems to be expressed ubiquitously.
The new protein can find application in studying the expression profile of amygdala-specific genes and as a new marker for amygdala cells.
similarity to pecanex (Drosophila melanogaster)
probably complete cds.
Sequenced by GBF
Locus: unknown
Insert length: 1464 bp
Poly A stretch at pos. 1445, polyadenylation signal at pos. 1421
1 AAGGAAAACA AGAGGACATG CCATATATTC CTCTCATGGA GTTCAGTTGT
SI TCACATTCTC ACTTAGTATG CTTACCCGCA GAGTGGAGGA CTAGCTGTAT
101 GCCCAGTTCC AAAATGAAGG AGATGAGCTC GTTATTTCCA GAAGACTGGT
151 ACCAATTTGT TCTAAGGCAG TTGGAATGTT ATCATTCAGA AGAGAAGGCC 201 TCAAATGTAC TGGAAGAAAT TGCCAAGGAC AAAGTTTTAA AAGACTTTTA
251 TGTTCATACA GTAATGACTT GTTATTTTAG TTTATTTGGA ATAGACAATA
301 TGGCTCCTAG TCCTGGTCAT ATATTGAGAG TTTACGGTGG TGTTTTGCCT
351 TGGTCTGTTG CTTTGGACTG GCTCACAGAA AAGCCAGAAC TGTTTCAACT
401 AGCACTGAAA GCATTCAGGT ATACTCTGAA ACTAATGATT GATAAAGCAA 451 GTTTAGGTCC AATAGAAGAC TTTAGAGAAC TGATTAAGTA CCTTGAAGAA
501 TATGAACGTG ACTGGTACAT TGGTTTGGTA TCTGATGAAA AGTGGAAGGA
551 AGCAATTTTA CAAGAAAAGC CATACTTGTT TTCTCTGGGG TATGATTCTA bOl ATATGGGAAT TTACACTGGG AGAGTGCTTA GCCTTCAAGA ATTATTGATC b51 CAAGTGGGAA AGTTAAATCC TGAAGCTGTT AGAGGTCAGT GGGCCAATCT 701 TTCATGGGAA TTACTTTATG CCACAAACGA TGATGAAGAA CGTTATAGTA
751 TACAAGCTCA TCCACTACTT TTAAGAAATC TTACGGTACA AGCAGCAGAA
601 CCTCCCCTGG GATATCCGAT TTATTCTTCA AAACCTCTCC ACATACATTT
651 GTATTAGAGC TCATTTTGAC TGTAATGTCA TCAAATGCAA TGTTTTTATT
101 TTTTCATCCT AAAAAAGTAA CTGTGATTCT TGTAACTTGA GGACTTCTCC 151 ACACCCCCAT TCAGATGCCT GAGAACAGCT AAGCTCCGTA AAGTTGGTTC
1001 TCTTAGCCAT CTTAATGGTT CTAAAAAACA GCAAAAACAT CTTTATGTCT
1051 AAGATAAAAG AACTATTTGG CCAATATTTG TGCCCTCTGG ACTTTAGTAG
1101 GCTTTGGTAA ATGTGAGAAA ACTTTTGTAG AATTATCATA TAATGAATTT
1151 TGTAATGCTT TCTTAAATGT GTTATAGGTG AATTGCCATA CAAAGTTAAC 1201 AGCTATGTAA TTTTTACATA CTTAAGAGAT AAACATATCA GTGTTCTAAG
12S1 TAGTGATAAT GGATCCTGTT GAAGGTTAAC ATAATGTGTA TATATTTGTT
1301 TGAAATATAA TTTATAGTAT TTTCAAATGT GCTGATTTAT TTTGACATCT
1351 AATATCTGAA TGTTTTTGTA TCAAGTAGTT TGTTTTCATA GACTTCAATT 1401 CATAAACTTT AAAAAACTTT TAATAAAATA TTTTCCTTCC TTTTCAAAAA 1451 AAAAAAAAAA AAAA
BLAST Results
Entry AC007131 from database EMBLNEU:
Homo sapiens clone 422_H_5, WORKING DRAFT SEQUENCE, 5 unordered pieces.
Score = 4116, P = 0.0e+00, identities = 640/856, 3 exons
Medline entries
No Medline entry
Peptide information for frame 3
ORF from 18 bp to 854 bp; peptide length: 279
Category: similarity to known protein
Classification: unclassified
1 MPYIPLMEFS CSHSHLVCLP AEWRTSCMPS SKMKEMSSLF PEDWYQFVLR
51 (QLECYHSEEK ASNVLEEIAK DKVLKDFYVH TVMTCYFSLF GIDNMAPSPG
101 HILRVYGGVL PUSVALDULT EKPELFiQLAL KAFRYTLKLM IDKASLGPIE
151 DFRELIKYLE EYERDUYIGL VSDEKUKEAI LiQEKPYLFSL GYDSNMGIYT
201 GRVLSLiQELL IlQVGKLNPEA VRGώUANLSU ELLYATNDDE ERYSIlQAHPL 2S1 LLRNLTViQAA EPPLGYPIYS SKPLHIHLY
BLASTP hits
No BLASTP hits available
Alert BLASTP hits for DKFZphamy2_24kl5ι frame 3 No Alert BLASTP hits found
Pedant information for DKFZphamy2_24klSι frame 3
Report for DKFZphamy2_24kl5-3
[LENGTH! 264 [MU! 330bb-31 [pi! 5.17
[HOMOL! TREMBL:AF0b7b08_ll gene: "B0S11.12"i, Caenorhabditis elegans cosmid B0511. 2e-13 [KU! Alpha_Beta SElQ GKIQEDMPYIPLMEFSCSHSHLVCLPAEURTSCMPSSKMKEMSSLFPEDUYIQFVLRIQLECY
PRD ccccccccccceeeecccceeeeecccccccccccccccccccccccchhhhhhhhhhhh
SElQ HSEEKASNVLEEIAKDKVLKDFYVHTVMTCYFSLFGIDNMAPSPGHILRVYGGVLPUSVA
PRD hhhhhhhhhhhhhhhhhhhhhhheeeeeeeeeeeecccccccccceeeeeeccccccccc
SElQ LDULTEKPELFlQLALKAFRYTLKLMIDKASLGPIEDFRELIKYLEEYERDUYIGLVSDEK PRD cchhhhhchhhhhhhhhhhhhhhhhhhhhcccccchhhhhhhhhhhhhhheeeeeccccc
SElQ UKEAIL<QEKPYLFSLGYDSNMGIYTGRVLSL(QELLI(QVGKLNPEAVRG(QUANLSUELLYA
PRD hhhhhhhhcchhhhhhhhcccccchhhhhhhhhhhhheeeeechhhhhhhhhhhhheeee SE(Q TNDDEERYSI(QAHPLLLRNLTV(QAAEPPLGYPIYSSKPLHIHLY
PRD cccccccccccchhhhhhhhhhhccccccccccccccccccccc
(No Prosite data available for DKFZphamy2_24kl5-3) (No Pfam data available for DKFZphamy2_24klS-3)
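The PRD rows in the Pedant report give one predicted state per residue. A short sketch for summarising such rows as state fractions follows. It assumes that h, e and c stand for helix, strand and coil, which the record itself does not spell out, and it uses only the first two PRD rows of this report as sample input. `state_fractions` is a hypothetical helper.

```python
from collections import Counter

# First two PRD rows of the report above, copied verbatim.
PRD_ROWS = [
    "ccccccccccceeeecccceeeeecccccccccccccccccccccccchhhhhhhhhhhh",
    "hhhhhhhhhhhhhhhhhhhhhhheeeeeeeeeeeecccccccccceeeeeeccccccccc",
]


def state_fractions(rows: list[str]) -> dict[str, float]:
    """Fraction of residues assigned to each predicted state across all rows."""
    counts = Counter("".join(rows))
    total = sum(counts.values())
    return {state: count / total for state, count in sorted(counts.items())}


fractions = state_fractions(PRD_ROWS)
for state, fraction in fractions.items():
    print(state, f"{fraction:.2f}")
```

A prediction in which both the h and e states are well represented would be consistent with the Alpha_Beta keyword in the report, although the criterion behind that keyword is not stated in the record.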
DKFZphamy2_2al3
group: amygdala derived
DKFZphamy2_2al3 encodes a novel 440 amino acid protein without similarity to known proteins. No informative BLAST results; no predictive Prosite, Pfam or SCOP motifs.
The new protein can find application in studying the expression profile of amygdala-specific genes.
putative protein
perhaps complete cds.
Sequenced by MediGenomix
Locus: /map="16p13.3"
Insert length: 2584 bp
Poly A stretch at pos. 2562, polyadenylation signal at pos. 2545
1 GTTCCTGAGG ACGTGCTACG GGGGCAGCTT CCTGGTACAC GAGTCGTTCC
51 TCTACAAGCG GGAGAAGGCT GTCGGGGACA AGGTGTATTG GACCTGCCGG
101 GACCACGCGC TGCACGGCTG CCGGAGCCGG GCCATCACCC AGGGACAGCG
151 GGTGACTGTG ATGCGTGGGC ACTGCCACCA GCCCGATATG GAGGGCCTGG
201 AAGCCCGGCG GCAGCAGGAG AAGGCCGTGG AGACGCTGCA GGCTGGGCAG
251 GACGGCCCTG GGAGCCAAGT GGACACGCTG CTCCGAGGCG TGGATAGTTT
301 GCTCTACCGC AGGGGTCCGG GTCCCCTGAC TCTCACCAGG CCTCGGCCCA
351 GAAAGCGAGC AAAGGTCGAA GACCAGGAGC TGCCAACCCA GCCCGAGGCC
401 CCAGACGAGC ACCAGGACAT GGACGCAGAC CCGGGAGGCC CTGAGTTCCT
451 GAAGACGCCC CTGGGGGGCA GCTTCCTGGT GTACGAGTCC TTCCTCTACC
501 GGCGGGAGAA GGCGGCTGGG GAGAAGGTGT ATTGGACCTG CCGGGACCAG
551 GCCCGCATGG GCTGCCGCAG CCGCGCCATC ACCCAGGGCC GACGGGTGAC
601 TGTCATGCGT GGTCACTGCC ACCCGCCCGA CCTGGGAGGC CTGGAGGCCC
651 TGAGGCAGCG GGAGAAACGC CCCAACACGG CGCAGCGGGG GAGCCCAGGC
701 GCTGGCCTCT CTTTCCAGTG GCTCTTCCGG ATCCTGCAGC TTTTGGGTCA
751 TGCTCCTGTG CTGCTGTGCC CCTCAGGGTC CTCCTGCCTC CCGAGCCTCC
801 CTGCTCCACA TGGCCCCTGC CCAGCCCTCT CCATCCCTCT TGAAGGAGGC
851 CCCGAGTTCC TGAAGACGCC CCTGGGGGGC AGCTTCCTGG TGTACGAGTC
901 CTTCCTCTAC CGGCGGGAGA AGGCGGCCGG GGAGAAGGTG TATTGGACCT
951 GCCGGGACCA GGCCCGCATG GGCTGCCGCA GCCGCGCCAT CACCCAGGGC
1001 CGGCGGGTCA TGGTCATGCG CAGGCACTGC CACCCACCGG ACCTGGGCGG
1051 CCTGGAGGCC CTGCGGCAGC GGGAGCACTT CCCCAACCTG GCGCAGTGGG
1101 ACAGCCCAGA TCCTCTCCGG CCCCTGGAGT TCCTGAGGAC TTCCCTGGGG
1151 GGCAGGTTCC TGGTGCACGA GTCCTTCCTC TACAGGAAGG AGAAGGCGGC
1201 TGGGGAGAAG GTGTACTGGA TGTGCCGGGA CCAGGCTCGG CTGGGCTGCC
1251 GCAGCCGCGC CATAACCCAG GGCCACCGCA TCATGGTCAT GCGCAGCCAC
1301 TGCCATCAGC CTGACCTGGC AGGCCTGGAG GCCTTGAGGC AACGGGAGCG
1351 GCTCCCCACC ACGGCCCAGC AGGAGGACCC AGAAAAGATT CAAGTTCAGC
1401 TGTGCTTCAA GACGTGTTCT CCTGAAAGCC AGCAGATTTA TGGGGACATC
1451 AAAGACGTCA GACTGGATGG CGAGTCCCAG TGAGGCGATG TGGGCAGAGG
1501 AGCTCCGAGC CGCCCACCCA AGGTGGCTTC ACATCCACAC AGGCACTTCC
1551 CATCCACCTA GGTTTGGCTT AGCAGAAACT TCTTTTCATT CTTCCAAAGC
1601 ATCGATGGTC TTCGCGTCTC CTCAGGAGGT CTCCCAGGAG GAATTCTTGG
1651 ATGGTGTCCT CATGTCGGCG GAGAACAGTG CTCAGAGCTG GCGCTTGCAG
1701 ACGCAGCTGT CGTGGGGCAG GGCGGTGGCG CCTTCCTGAC CTTTGGAAGA
1751 CATGACAAAG CTGCCTGGAC ACGGACGCCC CTGCTGTACG GCCACAGCAC
1801 CCCTGGGTTT GCAGAGCACG CAGCCTTCCT AGGGCTTTCC ACCTGGCGAG
1851 GCCCCGCTCT GCTCAGCACG GTGCAAAGTG AATGCTGCTG TCTTGGAGCC
1901 TGGGCACGTT TGGGGAAGTT CCTGCTTCAA ACTGAGCTGC CCCGCATAGG
1951 CCAGGTCAAC CCACACCAAT CTTTTCTGGA CAGGTGCTGG GTAGGCCTTC
2001 CTGGTCTCTG GCCGCCTGCT GCCAGGGTGT GGCCATCCCC AGCAACCGGA
2051 GCCGGCCAAA CCAGAGGCCT CGCTCCGCAC TCCACACTTT CCTTTCTGTG
2101 CTCCTTCCAA GTTAAATTAA ACCCCCTCTC CACGATTCCC ACGGCAGGCG
2151 TCATTCCCGA GATGGGAGCC AGTCCAGGGG TCAGCAGGAG CCAGCGCTGG
2201 GCACACGTGC CCTGGCTGAG GCCAGCGGCA TCCTGGGTGG CCCAGGTCCA
2251 TCCTGGGCAG CAAAGGCGTG TCCCCTTCTG TCAGACAGCT TCACAGAGTG
2301 TGGCTTCACC AGTCAGAGGG AGCAGTCCGG AGAGGCAAGA TGACCCCACC
2351 GGGACTGCAG AGCCTCCTCC TTACTAACAA GGACCTGTCC GCAGCCGCGA
2401 GGTCCTTCAC TCCCACCCTG TAATTGTGGG GGGAGTGCCA GCAACAGGCC
2451 TGTCCCCTGG CAAGTTGGCC ACGGAACCCA CCATGCACTG CAAGGCTGTG
2501 ACAGCCTGGG CACCCCTGCT TCTCCTCTGC TTGTACGGTT CCCCCAATAA
2551 ATCCTATTTT CCATCAAAAA AAAAAAAAAA AAAA
BLAST Results
No BLAST result
Medline entries
No Medline entry
Peptide information for frame 2
ORF from 161 bp to 1480 bp; peptide length: 440
Category: putative protein
Classification: no clue
1 MRGHCHQPDM EGLEARRQQE KAVETLQAGQ DGPGSQVDTL LRGVDSLLYR
51 RGPGPLTLTR PRPRKRAKVE DQELPTQPEA PDEHQDMDAD PGGPEFLKTP
101 LGGSFLVYES FLYRREKAAG EKVYWTCRDQ ARMGCRSRAI TQGRRVTVMR
151 GHCHPPDLGG LEALRQREKR PNTAQRGSPG AGLSFQWLFR ILQLLGHAPV
201 LLCPSGSSCL PSLPAPHGPC PALSIPLEGG PEFLKTPLGG SFLVYESFLY
251 RREKAAGEKV YWTCRDQARM GCRSRAITQG RRVMVMRRHC HPPDLGGLEA
301 LRQREHFPNL AQWDSPDPLR PLEFLRTSLG GRFLVHESFL YRKEKAAGEK
351 VYWMCRDQAR LGCRSRAITQ GHRIMVMRSH CHQPDLAGLE ALRQRERLPT
401 TAQQEDPEKI QVQLCFKTCS PESQQIYGDI KDVRLDGESQ
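The ORF coordinates, the reading frame and the peptide length given above can be cross-checked with simple arithmetic. The sketch below assumes 1-based inclusive coordinates on the forward strand and an end coordinate that is the last coding base (stop codon excluded); the function name orf_summary is hypothetical.

```python
def orf_summary(start, end):
    """Return (frame, codon_count) for a forward-strand ORF.

    start and end are 1-based, inclusive nucleotide positions.
    frame is 1, 2 or 3 depending on the position of the first base.
    """
    length = end - start + 1
    if length <= 0 or length % 3 != 0:
        raise ValueError("ORF length must be a positive multiple of three")
    frame = (start - 1) % 3 + 1
    return frame, length // 3


# Taking the ORF as 161 to 1480, as read from the peptide information.
frame, codons = orf_summary(161, 1480)
print(frame, codons)
```

Under these assumptions the ORF lies in frame 2 and spans 440 codons, which agrees with the stated frame and peptide length.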
BLASTP hits
No BLASTP hits available
Alert BLASTP hits for DKFZphamy2_2al3, frame 2
No Alert BLASTP hits found
Pedant information for DKFZphamy2_2al3, frame 2
Report for DKFZphamy2_2al3.2
[LENGTH] 493
[MW] 55840.13
[pI] 9.33
[KW] Alpha_Beta
[KW] LOW_COMPLEXITY 6.29 %
SEQ FLRTCYGGSFLVHESFLYKREKAVGDKVYWTCRDHALHGCRSRAITQGQRVTVMRGHCHQ
SEG
PRD ccccccccceeeccchhhhhhhhhccceeeeecccccccccceeeeccceeeeeeccccc
SEQ PDMEGLEARRQQEKAVETLQAGQDGPGSQVDTLLRGVDSLLYRRGPGPLTLTRPRPRKRA
SEG xxxxxxxxxxxxxxx
PRD cccchhhhhhhhhhhhhhhhhcccccccccccccccccceeeeecccceeecccccchhh
SEQ KVEDQELPTQPEAPDEHQDMDADPGGPEFLKTPLGGSFLVYESFLYRREKAAGEKVYWTC
SEG
PRD hhhhhcccccccccccccccccccccccccccccccceeeehhhhhhhhhhhccceeeec
SEQ RDQARMGCRSRAITQGRRVTVMRGHCHPPDLGGLEALRQREKRPNTAQRGSPGAGLSFQW
SEG
PRD cchhhhhccceeecccceeeeeecccccccccchhhhhhhhhccccccccccccchhhhh
SEQ LFRILQLLGHAPVLLCPSGSSCLPSLPAPHGPCPALSIPLEGGPEFLKTPLGGSFLVYES
SEG xxxxxxxxxxxxxxxx
PRD hhhhhhhhhccceeeccccccccccccccccccccccccccccccccccccccceeeehh
SEQ FLYRREKAAGEKVYWTCRDQARMGCRSRAITQGRRVMVMRRHCHPPDLGGLEALRQREHF
SEG
PRD hhhhhhhhhccceeeeccchhhhhccceeecccceeeeeecccccccccchhhhhhhhhc
SEQ PNLAQWDSPDPLRPLEFLRTSLGGRFLVHESFLYRKEKAAGEKVYWMCRDQARLGCRSRA
SEG
PRD ccccccccccccchhhhhhhcccceeeeecchhhhhhhhccceeeecchhhhhhhccccc
SEQ ITQGHRIMVMRSHCHQPDLAGLEALRQRERLPTTAQQEDPEKIQVQLCFKTCSPESQQIY
SEG
PRD ccccceeeeeeccccccccchhhhhhhhhhhhhccccccccceeehhhhhcccccccccc
SEQ GDIKDVRLDGESQ
SEG
PRD ccccccccccccc
(No Prosite data available for DKFZphamy2_2al3.2)
(No Pfam data available for DKFZphamy2_2al3.2)
DKFZphamy2_2bl1
group: differentiation/development
DKFZphamy2_2bl1 encodes a novel 789 amino acid protein which originates from TXBP151 mRNA by alternative splicing.
It is ubiquitously expressed. The mRNA is also subject to alternative polyadenylation. Overexpression of TXBP151 in NIH3T3 cells causes inhibition of apoptosis induced by tumour necrosis factor (TNF). It binds to A20, which is also an inhibitor of cell death by a yet unknown mechanism.
The new protein can find application in modifying/blocking apoptotic pathways and can therefore serve as a tool in diagnosis of cancer predisposition and as a tool in cell culture.
TXBP151, differentially spliced; differential splicing; differential polyadenylation
Sequenced by MediGenomix
Locus: /map="7p15"
Insert length: 3028 bp
Poly A stretch at pos. 2885, polyadenylation signal at pos. 2865
1 GAAGAGGTTC GGCGGCTGAT GGCGGATCAG GATCGGAAGC CTGCGTAACT
51 TTCTCCCTTG ATCCGGGAGT CTTTCCACTG GATTCACAAT GACATCCTTT
101 CAAGAAGTCC CATTGCAGAC TTCCAACTTT GCCCATGTCA TCTTTCAAAA
151 TGTGGCCAAG AGTTACCTTC CTAATGCACA CCTGGAATGT CATTACACCT
201 TAACTCCATA TATTCATCCA CATCCAAAAG ATTGGGTTGG TATATTCAAG
251 GTTGGATGGA GTACTGCTCG TGATTATTAC ACGTTTTTAT GGTCCCCTAT
301 GCCTGAACAT TATGTGGAAG GATCAACAGT CAATTGTGTA CTAGCATTCC
351 AAGGATATTA CCTTCCAAAT GATGATGGAG AATTTTATCA GTTCTGTTAC
401 GTTACCCATA AGGGTGAAAT TCGTGGAGCA AGTACACCTT TCCAGTTTCG
451 AGCTTCTTCT CCAGTTGAAG AGCTGCTTAC TATGGAAGAT GAAGGAAATT
501 CTGACATGTT AGTGGTGACC ACAAAAGCAG GCCTTCTTGA GTTGAAAATT
551 GAGAAAACCA TGAAAGAAAA AGAAGAACTG TTAAAGTTAA TTGCCGTTCT
601 GGAAAAAGAA ACAGCACAAC TTCGAGAACA AGTTGGGAGA ATGGAAAGAG
651 AACTTAACCA TGAGAAAGAA AGATGTGACC AACTGCAAGC AGAACAAAAG
701 GGTCTTACTG AAGTAACACA AAGCTTAAAA ATGGAAAATG AAGAGTTTAA
751 GAAGAGGTTC AGTGATGCTA CATCCAAAGC CCATCAGCTT GAGGAAGATA
801 TTGTGTCAGT AACACATAAA GCAATTGAAA AAGAAACCGA ATTAGACAGT
851 TTAAAGGACA AACTCAAGAA GGCACAACAT GAAAGAGAAC AACTTGAATG
901 TCAGTTGAAG ACAGAGAAGG ATGAAAAGGA ACTTTATAAG GTACATTTGA
951 AGAATACAGA AATAGAAAAT ACCAAGCTTA TGTCAGAGGT CCAGACTTTA
1001 AAAAATTTAG ATGGGAACAA AGAAAGCGTG ATTACTCATT TCAAAGAAGA
1051 GATTGGCAGG CTGCAGTTAT GTTTGGCTGA AAAGGAAAAT CTGCAAAGAA
1101 CTTTCCTGCT TACAACCTCA AGTAAAGAAG ATACTTGTTT TTTAAAGGAG
1151 CAACTTCGTA AAGCAGAGGA ACAGGTTCAG GCAACTCGGC AAGAAGTTGT
1201 CTTTCTGGCT AAAGAACTCA GTGATGCTGT CAACGTACGA GACAGAACGA
1251 TGGCAGACCT GCATACTGCA CGCTTGGAAA ACGAGAAAGT GAAAAAGCAG
1301 TTAGCTGATG CAGTGGCAGA ACTTAAACTA AATGCTATGA AAAAAGATCA
1351 GGACAAGACT GATACACTGG AACACGAACT AAGAAGAGAA GTTGAAGATC
1401 TGAAACTCCG TCTTCAGATG GCTGCAGACC ATTATAAAGA AAAATTTAAG
1451 GAATGCCAAA GGCTCCAAAA ACAAATAAAC AAACTTTCAG ATCAATCAGC
1501 TAATAATAAT AATGTCTTCA CAAAGAAAAC GGGGAATCAG CAGAAAGTGA
1551 ATGATGCTTC AGTAAACACA GACCCAGCCA CTTCTGCCTC TACTGTAGAT
1601 GTAAAGCCAT CACCTTCTGC AGCAGAGGCA GATTTTGACA TAGTAACAAA
1651 GGGGCAAGTC TGTGAAATGA CCAAAGAAAT TGCTGACAAA ACAGAAAAGT
1701 ATAATAAATG TAAACAACTC TTGCAGGATG AGAAAGCAAA ATGCAATAAA
1751 TATGCTGATG AACTTGCAAA AATGGAGCTG AAATGGAAAG AACAAGTGAA
1801 AATTGCTGAA AATGTAAAAC TTGAACTAGC TGAAGTACAG GACAATTATA
1851 AAGAACTTAA AAGGAGTCTA GAAAATCCAG CAGAAAGGAA AATGGAAGGT
1901 CAGAATTCCC AGAGTCCTCA ATGTTTCAAA ACATGCTCAG AGCAAAATGG
1951 TTATGTTCTC ACATTGTCAA ATGCACAACC AGTTCTGCAA TATGGTAATC
2001 CTTATGCATC TCAGGAAACA AGAGATGGAG CAGATGGTGC TTTTTACCCA
2051 GATGAAATAC AAAGGCCACC TGTCAGAGTC CCCTCTTGGG GACTGGAAGA
2101 CAATGTTGTC TGCAGCCAGC CTGCTCGAAA CTTTAGTCGG CCTGATGGCT
2151 TAGAGGACTC TGAGGATAGC AAAGAAGATG AGAATGTGCC TACTGCTCCT
2201 GATCCTCCAA GTCAACATTT ACGTGGGCAT GGGACAGGCT TTTGCTTTGA
2251 TTCCAGCTTT GATGTTCACA AGAAGTGTCC CCTCTGTGAG TTAATGTTTC
2301 CTCCTAACTA TGATCAGAGC AAATTTGAAG AACATGTTGA AAGTCACTGG
2351 AAGGTGTGCC CGATGTGCAG CGAGCAGTTC CCTCCTGACT ATGACCAGCA
2401 GGTGTTTGAA AGGCATGTGC AGACCCATTT TGATCAGAAT GTTCTAAATT
2451 TTGACTAGTT ACTTTTTATT ATGAGTTAAT ATAGTTTAGC AGTAAAAAAA
2501 AAAAAAAAAA ACCACACCTA AAATAGACCA CTGAGGAGAC CATAGAGCGG
2551 ATGCTTTCAT GCACCCTTTA CTGCACTTTC TGACCAGGAG CTACTTTGAG
2601 TTTGGTGTTA CTAGGATCAG GGTCAGTCTT TGGCTTATCA ATAAATTTTA
2651 ATCTCTGTTA ATCTTACCTG CTTTAAAAAA AAGTTCTTGT GTGTTCGTAT
2701 CTTTATTTAT TCCCTAGTTT GCAGAACTGT CTGAATAAAG GATACAAGGA
2751 TTATTTCAAT GTTACTGCAC TGAAAAACGT GTATGTATTA GTGTGCTAGA
2801 TTATTTAGCA GAATATTCAC AAGTTTCTGT TGACCTTGTT GATTGAGCAT
2851 GACTACTAAA TATTATGTAA TAAAAAGCAT TTGTCATAAC AAAAAAAAAA
2901 AAAAAAAAAA AAAAAAAAAA AAAAAAAAAA AAAAAAAAAA AAAAAAAAAA
2951 AAAAAAAAAA AAAAAAAAAA AAAAAAAAAA AAAAAAAAAA AAAAAAAAAA
3001 AAAAAAAAAA AAAAAAAAAA AAAAAAAA
BLAST Results
No BLAST result
Medline entries
113bl184."
De Valck D, Jin DY, Heyninck K, Van de Craen M, Contreras R, Fiers W, Jeang KT, Beyaert R.; The zinc finger protein A20 interacts with a novel anti-apoptotic protein which is cleaved by specific caspases.
Oncogene 1999 Jul 22;18(29):4182-90
Peptide information for frame 2
ORF from 89 bp to 2455 bp; peptide length: 789
Category: known protein
Classification: Cell division
1 MTSFQEVPLQ TSNFAHVIFQ NVAKSYLPNA HLECHYTLTP YIHPHPKDWV
51 GIFKVGWSTA RDYYTFLWSP MPEHYVEGST VNCVLAFQGY YLPNDDGEFY
101 QFCYVTHKGE IRGASTPFQF RASSPVEELL TMEDEGNSDM LVVTTKAGLL
151 ELKIEKTMKE KEELLKLIAV LEKETAQLRE QVGRMERELN HEKERCDQLQ
201 AEQKGLTEVT QSLKMENEEF KKRFSDATSK AHQLEEDIVS VTHKAIEKET
251 ELDSLKDKLK KAQHEREQLE CQLKTEKDEK ELYKVHLKNT EIENTKLMSE
301 VQTLKNLDGN KESVITHFKE EIGRLQLCLA EKENLQRTFL LTTSSKEDTC
351 FLKEQLRKAE EQVQATRQEV VFLAKELSDA VNVRDRTMAD LHTARLENEK
401 VKKQLADAVA ELKLNAMKKD QDKTDTLEHE LRREVEDLKL RLQMAADHYK
451 EKFKECQRLQ KQINKLSDQS ANNNNVFTKK TGNQQKVNDA SVNTDPATSA
501 STVDVKPSPS AAEADFDIVT KGQVCEMTKE IADKTEKYNK CKQLLQDEKA
551 KCNKYADELA KMELKWKEQV KIAENVKLEL AEVQDNYKEL KRSLENPAER
601 KMEGQNSQSP QCFKTCSEQN GYVLTLSNAQ PVLQYGNPYA SQETRDGADG
651 AFYPDEIQRP PVRVPSWGLE DNVVCSQPAR NFSRPDGLED SEDSKEDENV
701 PTAPDPPSQH LRGHGTGFCF DSSFDVHKKC PLCELMFPPN YDQSKFEEHV
751 ESHWKVCPMC SEQFPPDYDQ QVFERHVQTH FDQNVLNFD
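The peptide listing can be checked against the nucleotide listing by translating codons with the standard genetic code. The sketch below is illustrative only: it assumes the ORF begins at the ATG at base 89 of the insert, uses the first ten codons copied from the nucleotide listing, and the names CODON_TABLE and translate are hypothetical.

```python
# Standard genetic code, built from the conventional TCAG ordering.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def translate(dna):
    """Translate a DNA string codon by codon, stopping at the first stop codon."""
    dna = dna.upper().replace(" ", "")
    protein = []
    for i in range(0, len(dna) - 2, 3):
        residue = CODON_TABLE[dna[i:i + 3]]
        if residue == "*":
            break
        protein.append(residue)
    return "".join(protein)


# Bases 89 to 118 of the insert, written as codons.
first_codons = "ATG ACA TCC TTT CAA GAA GTC CCA TTG CAG"
print(translate(first_codons))
```

The result corresponds to the first ten residues of the peptide listing, which is a convenient way to confirm the reading of individual residues in a damaged copy of the sequence.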
BLASTP hits
No BLASTP hits available
Alert BLASTP hits for DKFZphamy2_2bl1, frame 2
TREMBL:HS338211_1 product: "tax1-binding protein TXBP151"; Homo sapiens tax1-binding protein TXBP151 mRNA, complete cds., N = 2, Score = 2148, P = 0
>TREMBL:HS338211_1 product: "tax1-binding protein TXBP151"; Homo sapiens tax1-binding protein TXBP151 mRNA, complete cds.
Length = 747
HSPs:
Score = 2148 (442.3 bits), Expect = 0.0e+00, Sum P(2) = 0.0e+00
Identities = 575/603 (95%), Positives = 576/603 (95%)
Query:   1 MTSFQEVPLQTSNFAHVIFQNVAKSYLPNAHLECHYTLTPYIHPHPKDWVGIFKVGWSTA 60
           MTSFQEVPLQTSNFAHVIFQNVAKSYLPNAHLECHYTLTPYIHPHPKDWVGIFKVGWSTA
Sbjct:   1 MTSFQEVPLQTSNFAHVIFQNVAKSYLPNAHLECHYTLTPYIHPHPKDWVGIFKVGWSTA 60
Query:  61 RDYYTFLWSPMPEHYVEGSTVNCVLAFQGYYLPNDDGEFYQFCYVTHKGEIRGASTPFQF 120
           RDYYTFLWSPMPEHYVEGSTVNCVLAFQGYYLPNDDGEFYQFCYVTHKGEIRGASTPFQF
Sbjct:  61 RDYYTFLWSPMPEHYVEGSTVNCVLAFQGYYLPNDDGEFYQFCYVTHKGEIRGASTPFQF 120
Query: 121 RASSPVEELLTMEDEGNSDMLVVTTKAGXXXXXXXXXXXXXXXXXXXXXXXXXXTAQLRE 180
           RASSPVEELLTMEDEGNSDMLVVTTKAG                          TAQLRE
Sbjct: 121 RASSPVEELLTMEDEGNSDMLVVTTKAGLLELKIEKTMKEKEELLKLIAVLEKETAQLRE 180
Query: 181 QVGRMERELNHEKERCDQLQAEQKGLTEVTQSLKMENEEFKKRFSDATSKAHQLEEDIVS 240
           QVGRMERELNHEKERCDQLQAEQKGLTEVTQSLKMENEEFKKRFSDATSKAH +EEDIVS
Sbjct: 181 QVGRMERELNHEKERCDQLQAEQKGLTEVTQSLKMENEEFKKRFSDATSKAHHVEEDIVS 240
Query: 241 VTHKAIEKETELDSLKDKLKKAQHEREQLECQLKTEKDEKELYKVHLKNTEIENTKLMSE 300
           VTHKAIEKETELDSLKDKLKKAQHEREQLECQLKTEKDEKELYKVHLKNTEIENTKLMSE
Sbjct: 241 VTHKAIEKETELDSLKDKLKKAQHEREQLECQLKTEKDEKELYKVHLKNTEIENTKLMSE 300
Query: 301 VQTLKNLDGNKESVITHFKEEIGRLQLCLAEKENLQRTFLLTTSSKEDTCFLKEQLRKAE 360
           VQTLKNLDGNKESVITHFKEEIGRLQLCLAEKENLQRTFLLTTSSKEDTCFLKEQLRKAE
Sbjct: 301 VQTLKNLDGNKESVITHFKEEIGRLQLCLAEKENLQRTFLLTTSSKEDTCFLKEQLRKAE 360
Query: 361 EQVQATRQEVVFLAKELSDAVNVRDRTMADLHTARLENEKVKKQLADAVAELKLNAMKKD 420
           EQVQATRQEVVFLAKELSDAVNVRDRTMADLHTARLENEKVKKQLADAVAELKLNAMKKD
Sbjct: 361 EQVQATRQEVVFLAKELSDAVNVRDRTMADLHTARLENEKVKKQLADAVAELKLNAMKKD 420
Query: 421 QDKTDTLEHELRREVEDLKLRLQMAADHYKEKFKECQRLQKQINKLSDQSANNNNVFTKK 480
           QDKTDTLEHELRREVEDLKLRLQMAADHYKEKFKECQRLQKQINKLSDQSANNNNVFTKK
Sbjct: 421 QDKTDTLEHELRREVEDLKLRLQMAADHYKEKFKECQRLQKQINKLSDQSANNNNVFTKK 480
Query: 481 TGNQQKVNDASVNTDPATSASTVDVKPSPSAAEADFDIVTKGQVCEMTKEIADKTEKYNK 540
           TGNQQKVNDASVNTDPATSASTVDVKPSPSAAEADFDIVTKGQVCEMTKEIADKTEKYNK
Sbjct: 481 TGNQQKVNDASVNTDPATSASTVDVKPSPSAAEADFDIVTKGQVCEMTKEIADKTEKYNK 540
Query: 541 CKQLLQDEKAKCNKYADELAKMELKWKEQVKIAENVKLELAEVQDNYKELKRSLENPAER 600
           CKQLLQDEKAKCNKYADELAKMELKWKEQVKIAENVKLELAEVQDNYKELKRSLENPAER
Sbjct: 541 CKQLLQDEKAKCNKYADELAKMELKWKEQVKIAENVKLELAEVQDNYKELKRSLENPAER 600
Query: 601 KME 603
           KME
Sbjct: 601 KME 603
Score = 631 (124.7 bits), Expect = 0.0e+00, Sum P(2) = 0.0e+00
Identities = 147/153 (96%), Positives = 149/153 (97%)
Query: 637 NPYASQETRDGADGAFYPDEIQRPPVRVPSWGLEDNVVCSQPARNFSRPDGLEDSEDSKE 696
           NP A ++  DGADGAFYPDEIQRPPVRVPSWGLEDNVVCSQPARNFSRPDGLEDSEDSKE
Sbjct: 596 NP-AERKMEDGADGAFYPDEIQRPPVRVPSWGLEDNVVCSQPARNFSRPDGLEDSEDSKE 654
Query: 697 DENVPTAPDPPSQHLRGHGTGFCFDSSFDVHKKCPLCELMFPPNYDQSKFEEHVESHWKV 756
           DENVPTAPDPPSQHLRGHGTGFCFDSSFDVHKKCPLCELMFPPNYDQSKFEEHVESHWKV
Sbjct: 655 DENVPTAPDPPSQHLRGHGTGFCFDSSFDVHKKCPLCELMFPPNYDQSKFEEHVESHWKV 714
Query: 757 CPMCSEQFPPDYDQQVFERHVQTHFDQNVLNFD 789
           CPMCSEQFPPDYDQQVFERHVQTHFDQNVLNFD
Sbjct: 715 CPMCSEQFPPDYDQQVFERHVQTHFDQNVLNFD 747
Score = 104 (15. t bits)ι Expect = 1-2e-02ι Sum P(2) = 6 - 6e-02 Identities = 60/351 (22*)ι Positives = 1S7/3S1 (44*)
(Query : 177 <QLR---E(QVGRMERELNH- EKERCD(QL(QAE(3KGLTEVT(QSLKMENEEFKKRFSDATSKAH 232
(QLR E(QV +E+ KE D + + + ++ + + + ENE+ KK+
+ DA Sb j ct : 355 (QLRKAEE(QV(QATR(QEVVFLAKELSDAVNVRDRTMADL- HTARLENEKVKKiQLADA 406
(Query : 233 (QLEEDIVSVTHKAIEKETE- LDSLKDKLKKAlQHERElQLECiQLKTEKDEKELYKVHLKNTE 211 + + A++K+ + D+L+ +L++ E E L+ +L+ D
YK K +
Sb j ct : 401 VAELKLNAMKKD(QDKTDTLEHELRR---EVEDLKLRL(QMAADH- —
YKEKFKEClQ 457 (Query : 212
IENTKLMSEV(QTLKNLDGNKESVITHFKEEIGRL(QLCLAEKENL(QRTFLLTTSSKEDTCF 351
+L ++ L + N +V T ++ G Q N T
T + + S D
Sb j ct : 456 RL(QK(QINKLSD(QSANNNNVFT KKTGN(Q(QKVNDASVN TDPATSASTVD 504
(Query : 3S2 LKElQLRKAEEiQVlQ- ATRiQEVVFLAKELSDAVNVRDRTMADLHTARLENEKVKKiQLADAVA 410 +K AE T+ +V + KE++D ++ L + + K +LA
Sbjct: S05
VKPSPSAAEADFDIVTKG(QVCEMTKEIADKTEKYNKCK(QLL(QDEKAKCNKYADELAKMEL Sb4
(Query: 411 ELKLNAMKKDiQDKTDTLE HELRREVED-LKLRLiQMAAD—
HYKEKFKEC(Q-RL(QK 4bl
+ K + K + E EL+R +E+ + +++ AD Y ++ + R+ Sbjct: 5bS
KUKE(QVKIAENVKLELAEV(QDNYKELKRSLENPAERKMEDGADGAFYPDEI(QRPPVRVPS b24
(Query: 4b2 -—(QINKLSDiQSANNNNVFTKKTG—- N(Q(QKVNDASVNTDPATSASTVDVKPSPSAAEAD 515 + N + (Q A N F++ G ++ D +V T P + +
+ ++
Sbjct: b25 UGLEDNVVCSώPARN
FSRPDGLEDSEDSKEDENVPTAPDPPSlQHLRGHGTGFCFDSS bβl (Query: Sib FDIVTKGώVCEM S27
FD+ K +CE+ Sbjct: b62 FDVHKKCPLCEL b13
Pedant information for DKFZphamy2_2bl1, frame 2
Report for DKFZphamy2_2bl1.2
[LENGTH] 789
[MW] 90877.47
[pI] 5.30
[HOMOL] TREMBL:HS338211_1 product: "tax1-binding protein TXBP151"; Homo sapiens tax1-binding protein TXBP151 mRNA, complete cds. 0.0
[FUNCAT] 99 unclassified proteins [S. cerevisiae, YOR216c] 3e-14
[FUNCAT] 08.07 vesicular transport (Golgi network, etc.) [S. cerevisiae, YDL058w] 2e-13
[FUNCAT] 30.03 organization of cytoplasm [S. cerevisiae, YDL058w] 2e-13
[FUNCAT] 09.10 nuclear biogenesis [S. cerevisiae, YDR356w] 4e-13
[FUNCAT] 30.04 organization of cytoskeleton [S. cerevisiae, YDR356w] 4e-13
[FUNCAT] 03.22 cell cycle control and mitosis [S. cerevisiae, YDR356w] 4e-13
[FUNCAT] 11.04 dna repair (direct repair, base excision repair and nucleotide excision repair) [S. cerevisiae, YKR015w] 7e-12
[FUNCAT] 30.10 nuclear organization [S. cerevisiae, YKR015w] 7e-12
[FUNCAT] 03.25 cytokinesis [S. cerevisiae, YHR023w MYO1 - myosin-1 isoform] 6e-11
[FUNCAT] 08.22 cytoskeleton-dependent transport [S. cerevisiae, YHR023w MYO1 - myosin-1 isoform] 6e-11
[FUNCAT] 03.04 budding, cell polarity and filament formation [S. cerevisiae, YHR023w MYO1 - myosin-1 isoform] 6e-11
[FUNCAT] l genome replication, transcription, recombination and repair [M. jannaschii, MJ1322] 3e-08
[FUNCAT] 98 classification not yet clear-cut [S. cerevisiae, YJR134c] 4e-08
[FUNCAT] 03.19 recombination and dna repair [S. cerevisiae, YNL250w] 2e-07
[FUNCAT] 03.13 meiosis [S. cerevisiae, YNL250w] 2e-07
[FUNCAT] 03.01 cell growth [S. cerevisiae, YNL071c] 2e-06
[FUNCAT] 03.07 pheromone response, mating-type determination, sex-specific proteins [S. cerevisiae, YNL071c] 2e-06
[FUNCAT] 08.99 other intracellular-transport activities [S. cerevisiae, YNL071c] 2e-06
[FUNCAT] 09.13 biogenesis of chromosome structure [S. cerevisiae, YLR086w] 5e-06
[FUNCAT] 11.01 stress response [S. cerevisiae, YPR141c] 2e-05
[FUNCAT] 06.10 assembly of protein complexes [S. cerevisiae, YPR141c] 2e-05
[FUNCAT] 03.22.01 cell cycle check point proteins [S. cerevisiae, YGL086w] 2e-05
[FUNCAT] 30.05 organization of centrosome [S. cerevisiae, YPR141c] 2e-05
[FUNCAT] 08.16 extracellular transport [S. cerevisiae, YOR326w] 1e-04
[FUNCAT] 09.25 vacuolar and lysosomal biogenesis [S. cerevisiae, YOR326w] 1e-04
[FUNCAT] 30.16 mitochondrial organization [S. cerevisiae, YAL011w] 2e-04
[FUNCAT] 06.07 protein modification (glycosylation, acylation, myristylation, palmitylation, farnesylation and processing) [S. cerevisiae, YKL201c] 2e-04
[FUNCAT] e amino acid metabolism and transport [M. genitalium, MG042] 4e-04
[FUNCAT] 30.13 organization of chromosome structure [S. cerevisiae, YDR265w] 7e-04
[FUNCAT] n secretion and adhesion [M. jannaschii, MJ0211] 0.001
[FUNCAT] 05.04 translation (initiation, elongation and termination) [S. cerevisiae, YAL035w] 0.001
[BLOCKS] BL00326D Tropomyosins proteins
[BLOCKS] PR00545E
[BLOCKS] PR00041F
[SCOP] d2tmab_ 1.105.4.1.1 Tropomyosin [rabbit (Oryctolagus cuniculus)] 5e-05
[EC] 3.6.1.32 Myosin ATPase 5e-16
[PIRKW] nucleus 2e-35
[PIRKW] phosphotransferase 5e-10
[PIRKW] duplication 2e-01
[PIRKW] citrulline 7e-01
[PIRKW] tandem repeat 2e-13
[PIRKW] heterodimer 2e-08
[PIRKW] heart 2e-11
[PIRKW] endocytosis 3e-10
[PIRKW] polymorphism 1e-01
[PIRKW] transmembrane protein 6e-12
[PIRKW] serine/threonine-specific protein kinase 5e-10
[PIRKW] cell wall 7e-01
[PIRKW] zinc finger 3e-10
[PIRKW] surface antigen 6e-06
[PIRKW] DNA binding 6e-12
[PIRKW] metal binding 3e-10
[PIRKW] muscle contraction 2e-13
[PIRKW] brain 8e-06
[PIRKW] acetylated amino end 4e-01
[PIRKW] actin binding 5e-16
[PIRKW] endoplasmic reticulum 4e-01
[PIRKW] mitosis 3e-15
[PIRKW] microtubule binding 3e-15
[PIRKW] ATP 5e-16
[PIRKW] chromosomal protein 2e-06
[PIRKW] receptor 4e-10
[PIRKW] thick filament 2e-13
[PIRKW] phosphoprotein 5e-16
[PIRKW] glycoprotein 4e-10
[PIRKW] skeletal muscle 7e-11
[PIRKW] calcium binding 7e-01
[PIRKW] alternative splicing 3e-13
[PIRKW] DNA condensation 2e-08
[PIRKW] coiled coil 5e-16
[PIRKW] P-loop 5e-16
[PIRKW] heptad repeat 3e-13
[PIRKW] methylated amino acid 2e-13
[PIRKW] basement membrane 1e-01
[PIRKW] immunoglobulin receptor 2e-01
[PIRKW] peripheral membrane protein 3e-10
[PIRKW] cardiac muscle 2e-11
[PIRKW] extracellular matrix 1e-01
[PIRKW] hydrolase 5e-16
[PIRKW] microtubule 1e-11
[PIRKW] muscle 1e-01
[PIRKW] membrane protein 1e-01
[PIRKW] EF hand 7e-01
[PIRKW] protein biosynthesis 4e-01
[PIRKW] cytoskeleton 3e-13
[PIRKW] hair 7e-01
[PIRKW] Golgi apparatus 1e-11
[PIRKW] calmodulin binding 3e-10
[SUPFAM] myosin heavy chain 5e-16
[SUPFAM] conserved hypothetical P115 protein 4e-10
[SUPFAM] IgA Fc receptor 7e-01
[SUPFAM] centromere protein E 3e-15
[SUPFAM] unassigned Ser/Thr or Tyr-specific protein kinases 5e-10
[SUPFAM] calmodulin repeat homology 7e-01
[SUPFAM] myosin motor domain homology 5e-16
[SUPFAM] alpha-actinin actin-binding domain homology 5e-10
[SUPFAM] hypothetical protein MJ0114 4e-08
[SUPFAM] tropomyosin 6e-01
[SUPFAM] plectin 5e-10
[SUPFAM] trichohyalin 7e-01
[SUPFAM] pleckstrin repeat homology 1e-08
[SUPFAM] ribosomal protein S10 homology 5e-10
[SUPFAM] giantin 4e-13
[SUPFAM] protein kinase homology 5e-10
[SUPFAM] protein kinase C zinc-binding repeat homology 1e-06
[SUPFAM] kinesin motor domain homology 3e-15
[SUPFAM] human early endosome antigen 1 3e-10
[SUPFAM] myosin MYO2 8e-08
[SUPFAM] unassigned kinesin-related proteins 1e-10
[SUPFAM] M5 protein 3e-10
[SUPFAM] cytoskeletal keratin 4e-07
[KW] All_Alpha
[KW] LOW_COMPLEXITY 3.30 %
[KW] COILED_COIL 22.18 %
SEQ MTSFQEVPLQTSNFAHVIFQNVAKSYLPNAHLECHYTLTPYIHPHPKDWVGIFKVGWSTA
SEG
PRD ccceeeeeccccceeeeeccccccccccccceeeeeccccccccccccceeeeeeecccc
COILS
SEQ RDYYTFLWSPMPEHYVEGSTVNCVLAFQGYYLPNDDGEFYQFCYVTHKGEIRGASTPFQF
SEG
PRD eeeeeeeecccccccccccccceeeecccccccccccceeeeeeeccccccccccccccc
COILS
SEQ RASSPVEELLTMEDEGNSDMLVVTTKAGLLELKIEKTMKEKEELLKLIAVLEKETAQLRE
SEG xxxxxxxxxxxxxxxxxxxxxxxxxx
PRD hhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhh
COILS CCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCC
SEQ QVGRMERELNHEKERCDQLQAEQKGLTEVTQSLKMENEEFKKRFSDATSKAHQLEEDIVS
SEG
PRD hhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhh
COILS CCCCCCCCCCCCCCCCCCC CCCCCCCC
SEQ VTHKAIEKETELDSLKDKLKKAQHEREQLECQLKTEKDEKELYKVHLKNTEIENTKLMSE
SEG
PRD hhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhh
COILS CCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCC
SEQ VQTLKNLDGNKESVITHFKEEIGRLQLCLAEKENLQRTFLLTTSSKEDTCFLKEQLRKAE
SEG
PRD hhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhh
COILS CCCCCCCCCCC
SEQ EQVQATRQEVVFLAKELSDAVNVRDRTMADLHTARLENEKVKKQLADAVAELKLNAMKKD
SEG
PRD hhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhh
COILS CCCCCCCCCCCCCCCCCCCCCCC
SEQ QDKTDTLEHELRREVEDLKLRLQMAADHYKEKFKECQRLQKQINKLSDQSANNNNVFTKK
SEG
PRD hhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhh
COILS
SEQ TGNQQKVNDASVNTDPATSASTVDVKPSPSAAEADFDIVTKGQVCEMTKEIADKTEKYNK
SEG
PRD hhhhhhhhhhhhhchhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhh
COILS
SEQ CKQLLQDEKAKCNKYADELAKMELKWKEQVKIAENVKLELAEVQDNYKELKRSLENPAER
SEG
PRD hhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhh
COILS CCCCCCCCCCCCCCCCCCCCCCCCCCCC
SEQ KMEGQNSQSPQCFKTCSEQNGYVLTLSNAQPVLQYGNPYASQETRDGADGAFYPDEIQRP
SEG
PRD hhhhhhcccchhhhhhhhhhheeeeccccceeeecccccccccccccccccccccccccc
COILS
SEQ PVRVPSWGLEDNVVCSQPARNFSRPDGLEDSEDSKEDENVPTAPDPPSQHLRGHGTGFCF
SEG
PRD ccccccccccceeeeccccccccccccccccccccccccccccccccccccccccccccc
COILS
SEQ DSSFDVHKKCPLCELMFPPNYDQSKFEEHVESHWKVCPMCSEQFPPDYDQQVFERHVQTH
SEG
PRD ccccccccccccccccccccccchhhhhhhhhhhhccccccccccccchhhhhhhhhhhh
COILS
SEQ FDQNVLNFD
SEG
PRD hcceeeccc
COILS
(No Prosite data available for DKFZphamy2_2bl1.2)
(No Pfam data available for DKFZphamy2_2bl1.2)
DKFZphamy2_2c22
group: metabolism
DKFZphamy2_2c22 encodes a novel 364 amino acid protein with similarity to the 1-acyl-glycerol-3-phosphate acyltransferase of Zea mays.
It contains one leucine zipper. The protein is believed to play a role in fatty acid metabolism. It is ubiquitously expressed, with a slight predominance in uterus, placenta and foreskin. The new protein can find application in modulation of fatty acid metabolism and as a new enzyme for biotechnological production processes.
weak similarity to 1-acyl-glycerol-3-phosphate acyltransferase (Zea mays); perhaps complete cds.
Sequenced by MediGenomix
Locus: /map="6"
Insert length: 3403 bp
Poly A stretch at pos. 3373, polyadenylation signal at pos. 3351
1 AGATGCTGCT GTCCCTGGTG CTCCACACGT ACTCCATGCG CTACCTGCTG
51 CCCAGCGTCG TGCTCCTGGG CACGGCGCCC ACCTACGTGT TGGCCTGGGG
101 GGTCTGGCGG CTGCTCTCCG CCTTCCTGCC CGCCCGCTTC TACCAAGCGC
151 TGGACGACCG GCTCTACTGC GTCTACCAGA GCATGGTGCT CTTCTTCTTC
201 GAGAATTACA CCGGGGTCCA GATATTGCTA TATGGAGATT TGCCAAAAAA
251 TAAAGAAAAT ATAATATATT TAGCAAATCA TCAAAGCACA GTTGACTGGA
301 TTGTTGCTGA CATCTTGGCC ATCAGGCAGA ATGCGCTAGG ACATGTGCGC
351 TACGTGCTGA AAGAAGGGTT AAAATGGCTG CCATTGTATG GGTGTTACTT
401 TGCTCAGCAT GGAGGAATCT ATGTAAAGCG CAGTGCCAAA TTTAACGAGA
451 AAGAGATGCG AAACAAGTTG CAGAGCTACG TGGACGCAGG AACTCCAATG
501 TATCTTGTGA TTTTTCCAGA AGGTACAAGG TATAATCCAG AGCAAACAAA
551 AGTCCTTTCA GCTAGTCAGG CATTTGCTGC CCAACGTGGC CTTGCAGTAT
601 TAAAACATGT GCTAACACCA CGAATAAAGG CAACTCACGT TGCTTTTGAT
651 TGCATGAAGA ATTATTTAGA TGCAATTTAT GATGTTACGG TGGTTTATGA
701 AGGGAAAGAC GATGGAGGGC AGCGAAGAGA GTCACCGACC ATGACGGAAT
751 TTCTCTGCAA AGAATGTCCA AAAATTCATA TTCACATTGA TCGTATCGAC
801 AAAAAAGATG TCCCAGAAGA ACAAGAACAT ATGAGAAGAT GGCTGCATGA
851 ACGTTTCGAA ATCAAAGATA AGATGCTTAT AGAATTTTAT GAGTCACCAG
901 ATCCAGAAAG AAGAAAAAGA TTTCCTGGGA AAAGTGTTAA TTCCAAATTA
951 AGTATCAAGA AGACTTTACC ATCAATGTTG ATCTTAAGTG GTTTGACTGC
1001 AGGCATGCTT ATGACCGATG CTGGAAGGAA GCTGTATGTG AACACCTGGA
1051 TATATGGAAC CCTACTTGGC TGCCTGTGGG TTACTATTAA AGCATAGACA
1101 AGTAGCTGTC TCCAGACAGT GGGATGTGCT ACATTGTCTA TTTTTGGCGG
1151 CTGCACATGA CATCAAATTG TTTCCTGAAT TTATTAAGGA GTGTAAATAA
1201 AGCCTTGTTG ATTGAAGATT GGATAATAGA ATTTGTGACG AAAGCTGATA
1251 TGCAATGGTC TTGGGCAAAC ATACCTGGTT GTACAACTTT AGCATCGGGG
1301 CTGCTGGAAG GGTAAAAGCT AAATGGAGTT TCTCCTGCTC TGTCCATTTC
1351 CTATGAACTA ATGACAACTT GAGAAGGCTG GGAGGATTGT GTATTTTGCA
1401 AGTCAGATGG CTGCATTTTT GAGCATTAAT TTGCAGCGTA TTTCACTTTT
1451 TCTGTTATTT TCAATTTATT ACAACTTGAC AGCTCCAAGC TCTTATTACT
1501 AAAGTATTTA GTATCTTGCA GCTAGTTAAT ATTTCATCTT TTGCTTATTT
1551 CTACAAGTCA GTGAAATAAA TTGTATTTAG GAAGTGTCAG GATGTTCAAA
1601 GGAAAGGGTA AAAAGTGTTC ATGGGGAAAA AGCTCTGTTT AGCACATGAT
1651 TTTATTGTAT TGCGTTATTA GCTGATTTTA CTCATTTTAT ATTTGCAAAA
1701 TAAATTTCTA ATATTTATTG AAATTGCTTA ATTTGCACAC CCTGTACACA
1751 CAGAAAATGG TATAAAATAT GAGAACGAAG TTTAAAATTG TGACTCTGAT
1801 TCATTATAGC AGAACTTTAA ATTTCCCAGC TTTTTGAAGA TTTAAGCTAC
1851 GCTATTAGTA CTTCCCTTTG TCTGTGCCAT AAGTGCTTGA AAACGTTAAG
1901 GTTTTCTGTT TTGTTTTGTT TTTTTAATAT CAAAAGAGTC GGTGTGAACC
1951 TTGGTTGGAC CCCAAGTTCA CAAGATTTTT AAGGTGATGA GAGCCTGCAG
2001 ACATTCTGCC TAGATTTACT AGCGTGTGCC TTTTGCCTGC TTCTCTTTGA
2051 TTTCACAGAA TATTCATTCA GAAGTCGCGT TTCTGTAGTG TGGTGGATTC
2101 CCACTGGGCT CTGGTCCTTC CCTTGGATCC CGTCAGTGGT GCTGCTCAGC
2151 GGCTTGCACG CAGACTTGCT AGGAAGAAAT GCAGAGCCAG CCTGTGCTGC
2201 CCACTTTCAG AGTTGAACTC TTTAAGCCCT TGTGAGTGGG CTTCACCAGC
2251 TACTGCAGAG GCATTTTGCA TTTGTCTGTG TCAAGAAGTT CACCTTCTCA
2301 AGCCAGTGAA ATACAGACTT AATTTGTCAT GACTGAACGA ATTTGTTTAT
2351 TTCCCATTAG GTTTAGTGGA GCTACACATT AATATGTATC GCCTTAGAGC
2401 AAGAGCTGTG TTCCAGGAAC CAGATCACGA TTTTTAGCCA TGGAACAATA
2451 TATCCCATGG GAGAAGACCT TTCAGTGTGA ACTGTTCTAT TTTTGTGTTA
2501 TAATTTAAAC TTCGATTTCC TCATAGTCCT TTAAGTTGAC ATTTCTGCTT
2551 ACTGCTACTG GATTTTTGCT GCAGAAATAT ATCAGTGGCC CACATTAAAC
2601 ATACCAGTTG GATCATGATA AGCAAAATGA AAGAAATAAT GATTAAGGGA
2651 AAATTAAGTG ACTGTGTTAC ACTGCTTCTC CCATGCCAGA GAATAAACTC
2701 TTTCAAGCAT CATCTTTGAA GAGTCGTGTG GTGTGAATTG GTTTGTGTAC
2751 ATTAGAATGT ATGCACACAT CCATGGACAC TCAGGATATA GTTGGCCTAA
2801 TAATCGGGGC ATGGGTAAAA CTTATGAAAA TTTCCTCATG CTGAATTGTA
2851 ATTTTCTCTT ACCTGTAAAG TAAAATTTAG ATCAATTCCA TGTCTTTGTT
2901 AAGTACAGGG ATTTAATATA TTTTGAATAT AATGGGTATG TTCTAAATTT
2951 GAACTTTGAG AGGCAATACT GTTGGAATTA TGTGGATTCT AACTCATTTT
3001 AACAAGGTAG CCTGACCTGC ATAAGATCAC TTGAATGTTA GGTTTCATAG
3051 AACTATACTA ATCTTCTCAC AAAAGGTCTA TAAAATACAG TCGTTGAAAA
3101 AAATTTTGTA TCAAAATGTT TGGAAAATTA GAAGCTTCTC CTTAACCTGT
3151 ATTGATACTG ACTTGAATTA TTTTCTAAAA TTAAGAGCCG TATACCTACC
3201 TGTAAGTCTT TTCACATATC ATTTAAACTT TTGTTTGTAT TATTACTGAT
3251 TTACAGCTTA GTTATTAATT TTTCTTTATA AGAATGCCGT CGATGTGCAT
3301 GCTTTTATGT TTTTCAGAAA AGGGTGTGTT TGGATGAAAG TAAAAAAAAA
3351 AAATAAAATC TTTCACTGTC TCTAAAAAAA AAAGAAAAAA AAAAAAAAAA
3401 AAA
BLAST Results
No BLAST result
Medline entries
No Medline entry
Peptide information for frame 3
ORF from 3 bp to 1094 bp; peptide length: 364
Category: similarity to known protein
Classification: Metabolism
Prosite motifs: LEUCINE ZIPPER (105-126)
1 MLLSLVLHTY SMRYLLPSVV LLGTAPTYVL AWGVWRLLSA FLPARFYQAL
51 DDRLYCVYQS MVLFFFENYT GVQILLYGDL PKNKENIIYL ANHQSTVDWI
101 VADILAIRQN ALGHVRYVLK EGLKWLPLYG CYFAQHGGIY VKRSAKFNEK
151 EMRNKLQSYV DAGTPMYLVI FPEGTRYNPE QTKVLSASQA FAAQRGLAVL
201 KHVLTPRIKA THVAFDCMKN YLDAIYDVTV VYEGKDDGGQ RRESPTMTEF
251 LCKECPKIHI HIDRIDKKDV PEEQEHMRRW LHERFEIKDK MLIEFYESPD
301 PERRKRFPGK SVNSKLSIKK TLPSMLILSG LTAGMLMTDA GRKLYVNTWI
351 YGTLLGCLWV TIKA
BLASTP hits
No BLASTP hits available
Alert BLASTP hits for DKFZphamy2_2c22, frame 3
No Alert BLASTP hits found
Pedant information for DKFZphamy2_2c22, frame 3
Report for DKFZphamy2_2c22.3
[LENGTH] 364
[MW] 42072.47
[pI] 9.18
[HOMOL] TREMBL:CEAF3136_1 gene: "F28B3.5"; Caenorhabditis elegans cosmid F28B3. 2e-36
[FUNCAT] 99 unclassified proteins [S. cerevisiae, YDR018c] 7e-13
[FUNCAT] 01.06.01 lipid, fatty-acid and sterol biosynthesis [S. cerevisiae, YDL052c] 4e-05
[FUNCAT] 30.99 other cellular organization [S. cerevisiae, YDL052c] 4e-05
[BLOCKS] BL01263A
[BLOCKS] BP00181A
[PIRKW] transmembrane protein 2e-11
[SUPFAM] probable membrane protein YBR042c 2e-11
[PROSITE] LEUCINE_ZIPPER 1
[KW] Alpha_Beta
[KW] LOW_COMPLEXITY 3.57 %
SEQ MLLSLVLHTYSMRYLLPSVVLLGTAPTYVLAWGVWRLLSAFLPARFYQALDDRLYCVYQS
SEG
PRD ccchhhhhhhhhccccccceeecccceeeccchhhhhhhhhhhhhhhhhhhhhhhhhhhh
SEQ MVLFFFENYTGVQILLYGDLPKNKENIIYLANHQSTVDWIVADILAIRQNALGHVRYVLK
SEG
PRD hhhhhhhceeeeeeeeeccccccccceeeeecccchhhhhhhhhhhhhccccchhhhhhh
SEQ EGLKWLPLYGCYFAQHGGIYVKRSAKFNEKEMRNKLQSYVDAGTPMYLVIFPEGTRYNPE
SEG
PRD hhhccccccceeeccceeeeeeccccccchhhhhhhhhhhccccceeeeeecccccchhh
SEQ QTKVLSASQAFAAQRGLAVLKHVLTPRIKATHVAFDCMKNYLDAIYDVTVVYEGKDDGGQ
SEG
PRD hhhhhhhhhhhhhhhcccccceeeecccchhhhhhhhhhhcceeeceeeeeecccccccc
SEQ RRESPTMTEFLCKECPKIHIHIDRIDKKDVPEEQEHMRRWLHERFEIKDKMLIEFYESPD
SEG xxxxxxxxxxxxx
PRD cccccchhhhhccccceeeeeecccccccccccchhhhhhhhhhhhhhhhhhhhhhhccc
SEQ PERRKRFPGKSVNSKLSIKKTLPSMLILSGLTAGMLMTDAGRKLYVNTWIYGTLLGCLWV
SEG
PRD cccccccccccchhhhhhhhchhhhhchhhhhhhhhhccccceeeeeeeechhhhhhhhh
SEQ TIKA
SEG
PRD hccc
Prosite for DKFZphamy2_2c22.3
PS00029 105->127 LEUCINE_ZIPPER PDOC00029
(No Pfam data available for DKFZphamy2_2c22.3)
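The leucine zipper annotation at residues 105-126 can be reproduced with a regular expression. The sketch below assumes the PROSITE leucine zipper pattern L-x(6)-L-x(6)-L-x(6)-L and scans residues 101-150 of the peptide listing; the names LEUCINE_ZIPPER and find_leucine_zippers are hypothetical, and this is not the Prosite scanner used in the source.

```python
import re

# L-x(6)-L-x(6)-L-x(6)-L written as a regular expression (22 residues).
LEUCINE_ZIPPER = re.compile(r"L.{6}L.{6}L.{6}L")


def find_leucine_zippers(protein, first_residue=1):
    """Return (start, end) residue numbers, 1-based and inclusive,
    for each non-overlapping pattern match in protein."""
    return [
        (m.start() + first_residue, m.end() + first_residue - 1)
        for m in LEUCINE_ZIPPER.finditer(protein)
    ]


# Residues 101-150 of the peptide listing for frame 3.
segment = "VADILAIRQNALGHVRYVLKEGLKWLPLYGCYFAQHGGIYVKRSAKFNEK"
hits = find_leucine_zippers(segment, first_residue=101)
print(hits)
```

The single match covers residues 105 to 126, with leucines at positions 105, 112, 119 and 126. A pattern match of this kind is only a sequence signature and does not by itself show that a functional zipper is formed.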
DKFZphamy2_2f16
group: signal transduction
DKFZphamy2_2f16 encodes a novel 215 amino acid protein with similarity to sodium channel protein beta 1 of Rattus norvegicus. The sodium channel protein beta 1 of Rattus norvegicus is crucial in the assembly, expression, and functional modulation of the heterotrimeric complex of the rat brain sodium channel. The expression of the new protein seems to be restricted to brain; all matching ESTs isolated so far derive from there.
The new protein can find application in modulating the sodium channel beta 1, and in studying the expression profile in neurodegenerative diseases and of amygdala-specific genes.
similarity to sodium channel protein beta 1 (Rattus norvegicus)
Pedant: SIGNAL_PEPTIDE
Sequenced by MediGenomix
Locus: unknown
Insert length: 4052 bp
Poly A stretch at pos. 4035, no polyadenylation signal found
1 CAGGGCTGAC AGCACACACG GCCTGGGGGC CTAGAGAAGG ATTGCTGATC
51 ACCTGCCACC CAGGGTCGGG GCCCCGCACC ATCCGGGGGC GAGCTCCCGG
101 GAAGGGGCTC CCCCTCTACA CCCACCCCCC AACCTCTGAC ATCGCCGGCC
151 GAACGGGAGC TGCCGCTTCC TTCCCGGCCC CGCTGCACCT CCCCAGGGAG
201 CCGAGGGCGG GCGTGGACGG GACCGACGTG GAACGCATTC TGTAGCCCAG
251 ACGGGCGGCC CCGGCGGCTT CGGGAGTGGG GTCACGCCCA GCTGGAGAAG
301 CAGTTAGGGC GGACGAAGCA GGAGCCGCGG GGCTGGGAGG ATTCCAGTCG
351 GAACGCAACC GATCCTGGGG AGGCGAGAGG TGAATCAACC TGGACCCTTC
401 CACAGCCTGG CTGCTAGGCC AGCAGTGCGA CTCCCTTCCG AGCTGAGCTT
451 ACCCTGGGCG CAAACGAGCG AGGCAGGGGC GCGAGTGGAA GCTGGAGTTC
501 CGGGGTGGGC GGGGAGGCGA CTGTCCGTGG TGCTGAGCGC CGGCGAGAGC
551 GGGCGCGGAG CGGCTGATCA GCTCCCTCGA ACTGGGGAGG TCCAGTGGGG
601 TCGCTTAGGG CCCAAAGCCC CCGCCCGGCT CCAAAAGCTC CCAGGGCCTC
651 CCCAGGCACC GGTGCTCGGC CCTTCCTTCG GTCAGAAAGT CGCCCCCTGG
701 GGGCAGTTCG TCCCAAAGGG TTTCCTCGAA AGAATCTGAG AGGGCGCAGT
751 CCTTGACCGA GGGAATCTCT CTGTGTAGCC TTGGAAGCCG CCAGCCCCAG
801 AAGATGCCTG CCTTCAATAG ATTGTTTCCC CTGGCTTCTC TCGTGCTTAT
851 CTACTGGGTC AGTGTCTGCT TCCCTGTGTG TGTGGAAGTG CCCTCGGAGA
901 CGGAGGCCGT GCAGGGCAAC CCCATGAAGC TGCGCTGCAT CTCCTGCATG
951 AAGAGAGAGG AGGTGGAGGC CACCACGGTG GTGGAATGGT TCTACAGGCC
1001 CGAGGGCGGT AAAGATTTCC TTATTTACGA GTATCGGAAT GGCCACCAGG
1051 AGGTGGAGAG CCCCTTTCAG GGGCGCCTGC AGTGGAATGG CAGCAAGGAC
1101 CTGCAGGACG TGTCCATCAC TGTGCTCAAC GTCACTCTGA ACGACTCTGG
1151 CCTCTACACC TGCAATGTGT CCCGGGAGTT TGAGTTTGAG GCGCATCGGC
1201 CCTTTGTGAA GACGACGCGG CTGATCCCCC TAAGAGTCAC CGAGGAGGCT
1251 GGAGAGGACT TCACCTCTGT GGTCTCAGAA ATCATGATGT ACATCCTTCT
1301 GGTCTTCCTC ACCTTGTGGC TGCTCATCGA GATGATATAT TGCTACAGAA
1351 AGGTCTCAAA AGCCGAAGAG GCAGCCCAAG AAAACGCGTC TGACTACCTT
1401 GCCATCCCAT CTGAGAACAA GGAGAACTCT GCGGTACCAG TGGAGGAATA
1451 GAACAGGAGC AGTGTGACAT GAGGTGGCCT GAACACCTGA GGGACTGGAC
1501 ATCCCATGTT CAGCAATGTC AATGGCATCA GGAGGGCGCC CCAAGGGCCC
1551 CATCGCTTCC CTTCATGCAT CCATTGTTCT GTTCATTCAT TCATCCATAC
1601 ATCCACCTGC CTCTGAGCTT TCACCTCTGA CTCCCTAACT CCATCAGACC
1651 TCTACGCACC ATAAGACTCT GCCAGAACTG AGAAGCCAAC ATTTCTACAT
1701 AGACTCAACC TCACCCTCTC CTAGTTTTCC AACAAGACAC TCCAAAGCCA
1751 ACTGGATTTC TCCCCTGTGC TCCAAATGAC TTTGTACAAG TGCTGGAGTT
1801 AGCACCTCCC TCTGCCCTTA ACTGGCTGGA ACTGGTTCAT TCTCCATTAC
1851 TGCAAGAGAA TGGAAGTCTT AATAGAAGGA AGCAGGAGTG ATTAGTTCGG
1901 GTTAAAGCAA AAGTGTGTCA TGAACTTGGA TTCCCTGAAG TCAGTTTTGT
1951 CAGGTTCATG GCCCACTTTG CTACAGCATC AGAGTGAAGC ACGCCTGTCT
2001 AGGTTCTCCA GTGACAGAAA GATCCTGAAG CATGGACTAA CATGCTCTCT
2051 GGAGCTTAGT ACTCCAGAGC TAGATCCTGA TGGGTCTCTA AGGTTCCCTC
2101 CAAGAAGACA AGGACAGGAG ACTTGGGAAG GACCAATGGT AATTTAAGTG
2151 GCTCTTAAAA AGTCATGCAA CATGTTTCTG GACACGTTCC TGATCCTATT
2201 GCGATAATGT ATGTGTGCCC TCCCTGTGGG CACACCACCT GGGCATTAGG
22S1 ACTGAAATTC CTGAGTTCTT CCTCTCAAAA TTTCTGTGCA CCAGTATTAT
2301 TCCTCATTTT ACATACAGGA GGCAACTAAG ACTCATACAG GGCTCAACTG
2351 AATAAGAGGC TTAAGAGGAT AAACTGGAGC AGAAATAAGC CTTAGGTGCT
2401 GCCCAGTTTA CACTTCCTGG GATGGATGTT TTTGTTTGTT TTGTTTTTTG 2451 TTTTTTTTGT TTGAGATGGA GTCTCACTCT GTCACCTAGG CTAGAGTGCA
2501 GTGGTGTGAT CTCGGCTCAC TGCAACCTCT GCCTCTTGGG TTCAAGCAAT
25S1 TCTCATGCCT CGGCCTCTCC AGTAGCTGGG ATTACAGGTG TGCACCACCA
2b01 CGCCTGGCTA AATTTTGTAT TTTTAGTACA GACAGGGTTT GACTATGTTG
2b51 GCCAGGCTAG TCTTGAACTC CTGACCTCAA ATGACCCACC CACCTCAGCC 27D1 TCCCAAAGTG CT6AGATTAC AGGCGTGAGG CACTGCGCCC GGTGGATAAC
27S1 TTTGTTTCTG AAAAGACTGA CATTGAACTT GTCTATGGCA ATGCTTCTTT
2601 CACAAGCACG GACTGGGCTG AGGTCAACTC TGATAGATTC AGATGACTAG
2651 AAATTGGCCA AAAAAGCAGG GAGAAGAACA TGAGGTAGAC TTAAAGAACT
2101 TCCTTTATGT AAAGATCTGT GACTCTGAAA TATCCTCCAA AAGGAGAGTG 2151 CATCTGAGAC TGATATTTAA ACTAAGAAAA ATGTTTAGTC TGAGATGGAT
3001 CATAAGTAAA TGAGCAGTGT GAGAGGGGAG GGATGGGTAG GTGCTTTCCA
3051 AATACTTCGC CTATGAATGC ATAATTTTCA GATTTTTTTC CCCTAGATTT
3101 TGAGGGAGCA GAGAAACTGG AAAAAACTTT AGTCAATATC TCGTGTTTCA
3151 TTTTAATTAA GTGACAGGTC CAAGTGTGAC ATCCTTCAGC ACCCAGGGAC 3201 AAGAGAGGGG AAAGATGCTT TATGGAATGT AAGAAGATGA AGGTGACTGG
3251 GATTCAGCGA GAGAGAGGTC CCTCAGACCT GGGACCTCCC TTTATAGGGA
3301 AAGACCATAT TCCATAGGTT TAGGGCTTTA CCTTAAAAGC TCATTTTTTT
33S1 CATTCTTCCA TCCCTAGGAA AGTACTTAAA ACCAGACTTT TAAATTTTTA
3401 TTTATTTATT ATTATTTTTT TGAGACAGAT TCTCACTCTG TCTCCCAGGC 3451 TAGAGTGCAG TGGTGCAATC TCAGCTCACT GCAGCCTCAA CTGCCCCAGG
3S01 TTTAAGCAAT CCTCCCACCT CAGCCCCCAG GTAACTGGGA CTACAGGCAT
35S1 GCACCACCAT GCCTGGCTAA TTTTTGTATT TTATGTAGAG ACAGGGGTCT
3b01 TGCCATGTCG CCCAGGCTGA TCTTGAACTC CTGGGCTCAA GCAATCTGCC
3b51 AGCCTCAGCC TCTCAAAGTG CTGGGATTAC AGGCCTGAGC AACTGTGCCT 3701 GGCCCAAAAC CAGACCGTTA ACACATTAAA GAGTCTGATT TTGTTGAAGA
37S1 AAATATTTGC AATAAATTCA AGACTCTTCT TATTGGTAAT TTTCCACACA
3601 ATCCCTCTGA AATAAGGGAG AGGATATAGA CCTTTTTAAC TTTATAGTTA
3651 GAAAAATTGG CCTCAGTGTG AAATTTTTCC AGTCCCATAG CTCATGGATG
3101 CCACCAGCTT GCGGTAGTAG CAAGATGCTT ACTACCACAC CGTTTTCCTC 3151 GGTGGCCCAA TAGCTCGTGT ATCTAAGTTG AACCCGGCAG TATGCATGAT
4001 TGCCTTTTTC TCTTCTTTTT AAAAAAACCC AACTCAAAAA AAAAAAAAAA
4051 AA
BLAST Results
No BLAST result
Medline entries
92271207:
Isom LL, De Jongh KS, Patton DE, Reber BF, Offord J, Charbonneau H, Walsh K, Goldin AL, Catterall WA.; Primary structure and functional expression of the beta 1 subunit of the rat brain sodium channel. Science 1992 May 8;256(5058):839-42
96235151:
Belcher SM, Howe JR.; Cloning of the cDNA encoding the sodium channel beta 1 subunit from rabbit. Gene 1996 May 8;170(2):285-6
93357746:
McClatchey AI, Cannon SC, Slaugenhaupt SA, Gusella JF.; The cloning and expression of a sodium channel beta 1-subunit cDNA from human brain. Hum Mol Genet 1993 Jun;2(6):745-9
Peptide information for frame 3
ORF from 804 bp to 1448 bp; peptide length: 215
Category: similarity to known protein
Classification: Transmembrane proteins unclassified
1 MPAFNRLFPL ASLVLIYWVS VCFPVCVEVP SETEAVQGNP MKLRCISCMK
51 REEVEATTVV EWFYRPEGGK DFLIYEYRNG HQEVESPFQG RLQWNGSKDL
101 QDVSITVLNV TLNDSGLYTC NVSREFEFEA HRPFVKTTRL IPLRVTEEAG
151 EDFTSVVSEI MMYILLVFLT LWLLIEMIYC YRKVSKAEEA AQENASDYLA
201 IPSENKENSA VPVEE
BLASTP hits
No BLASTP hits available
Alert BLASTP hits for DKFZphamy2_2f16, frame 3
PIR:JC4788 sodium channel protein beta1 chain - rabbit, N = 1, Score = 434, P = 8.3e-41
PIR:A55734 sodium channel, voltage-gated, beta-1 chain precursor - human, N = 1, Score = 428, P = 3.6e-40
PIR:A42737 sodium channel beta 1 subunit - rat, N = 1, Score = 421, P = 2.8e-40
>PIR:JC4788 sodium channel protein beta1 chain - rabbit
Length = 218
HSPs:
Score = 434 (65.1 bits), Expect = 8.3e-41, P = 8.3e-41
Identities = 100/214 (46%), Positives = 129/214 (60%)
(Query: ID LASLVLIYUVSVCFPVCVEVPSETEAVlQGNPMKLRCISCMKREEVEATTVVEUFYRPEGG bl
LA +V VS + CVEV SETEAV G K+ CISC +R E A T
EU +R +G
Sbjct: 5
LAF VGAALVSSAUGGCVEVDSETEAVYGMTFKILCISCKRRSETTAETFTEUTFRώKGT b4
(Query : 70 KDFL-IYEYRNGHtQEVESP--F(QGRL(QUNGS
KDL(QDVSITVLNVTLNDSGLYTCNVS 123
+ + F+ I Y N + + E F + GR+ UNGS KDLiQD + SI + NVT N
SG Y C+V Sb jct : bS
EEFVKILRYENEVLlQLEEDERFEGRVVUNGSRGTKDLlQDLSIFITNVTYNHSGDYiQCHVY 124
(Query: 124
REFEFEAHRPFVKTTRLIPLRVTEEAGEDFTSVVSEIMMYIXXXXXXXXXXIEMIYCYRK 183 R FE + + I L V ++A D S+VSEIMMY+
EM+YCY+K Sbjct: 125 RLLSFENYEHNTSVVKKIHLEVVDKANRDMASIVSEIMMYVLIVVLTIULVAEMVYCYKK 184 (Query: 184 VSKAEEAA-iQENASDYLAIPSENKEN-SAVPVEE 21S
++ A EAA (QENAS+YLAI SE+KEN + V V E Sbjct: 185 IAAATEAAAiQENASEYLAITSESKENCTGViQVAE 216
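The percentages in the HSP header can be recomputed from the raw counts. The following is a minimal sketch, assuming the report truncates the percentage to an integer rather than rounding it (this matches 100/214 being printed as 46); the function name `hsp_percent` is hypothetical and not part of any BLAST tool.

```python
def hsp_percent(matches: int, aligned_length: int) -> int:
    """Return the integer percentage of matching columns in an HSP.

    Assumption: the percentage is truncated (floor), not rounded.
    """
    if aligned_length <= 0:
        raise ValueError("aligned_length must be positive")
    return (100 * matches) // aligned_length


# Identities reported for the rabbit beta1 hit: 100 of 214 aligned columns.
print(hsp_percent(100, 214))
```

With rounding instead of truncation, 100/214 (46.7 percent) would be shown as 47, so the printed value is a quick way to tell the two conventions apart.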
Pedant information for DKFZphamy2_2f16, frame 3
Report for DKFZphamy2_2f16.3
[LENGTH] 215
[MW] 24702.40
[pI] 4.61
[HOMOL] PIR:JC4788 sodium channel protein beta1 chain - rabbit 3e-41
[BLOCKS] BL00401D Prokaryotic sulfate-binding proteins
[BLOCKS] BP00570
[SCOP] d1neu 2.1.1.1.1 Myelin membrane adhesion molecule P0 [ra 2e-43
[PIRKW] Schwann cell 2e-07
[PIRKW] transmembrane protein 1e-40
[PIRKW] myelin 2e-07
[PIRKW] phosphoprotein 5e-07
[PIRKW] glycoprotein 1e-40
[PIRKW] structural protein 2e-07
[PIRKW] muscle 1e-40
[PIRKW] membrane protein 5e-07
[SUPFAM] immunoglobulin homology 2e-07
[SUPFAM] myelin P0 protein 2e-07
[PFAM] IG (immunoglobulin) superfamily
[KW] All_Beta
[KW] 3D
[KW] SIGNAL_PEPTIDE 23
[KW] LOW_COMPLEXITY 4.65 %
SElQ MPAFNRLFPLASLVLIYUVSVCFPVCVEVPSETEAVlQGNPMKLRCISCMKREEVEATTVV
SEG
Ineu- CEEEECCEEEETTTbCEEECE-
EEECCCCCCCCCEE SElQ EUFYRPEGGKDFLIYEYRNGH(QEVESPF(QGRL(QUNGSKDL(QDVSITVLNVTLNDSGLYTC SEG
Ineu-
EEEEEETTTCCCEEEEEETTEEEETTTTTTTEEECCBGGGCBCCEEECCbTTTTTEEEEE SEώ NVSREFEFEAHRPFVKTTRLIPLRVTEEAGEDFTSVVSEIMMYILLVFLTLULLIEMIYC
SEG xxxxxxxxxx
Ineu-
EE SE(Q YRKVSKAEEAAOENASDYLAIPSENKENSAVPVEE
SEG
Ineu-
(No Prosite data available for DKFZphamy2_2f16.3)
Pfam for DKFZphamy2_2f16.3
HMM_NAME IG (immunoglobulin) superfamily
HMM yrNgqp ipssegyUytRweqqgRYs i s i f qLtl i sUepeDsGtYUCmV* YRNG ++ E+ ++ R++++G ++ +++T+ +++ +DSG
Y+C+V
Query 77 YRNGHQEV--ESPFQGRLQWNGSKDLQDVSITVLNVTLNDSGLYTCNV 122
DKFZphamy2_2f22
group: nucleic acid management
DKFZphamy2_2f22 encodes a novel 479 amino acid protein with similarity to YDL153c of Saccharomyces cerevisiae. The novel protein is ubiquitously expressed. YDL153c is involved in transcriptional silencing.
The new protein can find application in modulation of transcription, e.g. transcriptional silencing.
putative protein, probably complete cds, perhaps differential polyadenylation
YDL153c is involved in transcriptional silencing
Sequenced by MediGenomix
Locus: /map="4"
Insert length: 2019 bp
Poly A stretch at pos. 2000, polyadenylation signal at pos. 1981
1 GGAGTCTGCA AACTCCGGTG GTAGGGGAGC GCGCTGCTGT TTAGAGCCAC
SI GAGTTACCGG AGCGCCTGAT TCCTGCGCCG AAGTCAGTGG TGGCCGAAAG
IDl TCCGGAGTCG CTGTAAAACC TGAGATTGTG AGCCATGGTG GGGAGATCCC
151 GGCGGCGCGG AGCAGCTAAG TGGGCAGCTG TGCGAGCCAA GGCAGGTCCC 2D1 ACGCTCACCG ACGAAAATGG AGATGATTTA GGATTGCCAC CCTCACCAGG
251 GGACACCAGC TACTACCAAG ATCAGGTAGA TGACTTTCAT GAGGCACGAT
301 CCCGGGCCGC CTTAGCTAAG GGCTGGAATG AAGTACAGAG TGGAGACGAG
351 GAGGATGGCG AGGAGGAGGA GGAGGAGGTG CTAGCCCTAG ATATGGACGA
401 TGAGGACGAC GAAGATGGAG GGAATGCGGG GGAGGAGGAG GAGGAGGAGA 4S1 ATGCCGATGA TGATGGTGGG AGCTCCGTGC AAAGTGAAGC TGAGGCCTCT
501 GTGGATCCCA GTTTGTCGTG GGGTCAGAGG AAAAAACTTT ACTATGACAC
5S1 GGACTATGGT TCCAAGTCCC GAGGCCGGCA GAGTCAACAG GAGGCAGAGG bOl AGGAGGAAAG AGAGGAGGAG GAGGAGGCAC AGATCATTCA GCGGCGCCTA b51 GCCCAAGCGC TGCAAGAGGA TGATTTTGGT GTCGCCTGGG TTGAGGCCTT 7D1 TGCAAAACCA GTGCCTCAGG TAGATGAGGC TGAGACACGG GTCGTGAAGG
7S1 ATTTGGCTAA AGTTTCAGTG AAAGAGAAGC TGAAAATGTT GCGAAAGGAA
601 TCACCAGAAC TCTTGGAGCT GATAGAAGAC CTGAAAGTCA AGTTGACAGA
651 GGTTAAGGAT GAGCTGGAGC CATTGTTAGA GTTGGTGGAA CAAGGGATCA
IDl TTCCACCCGG AAAAGGAAGC CAATACTTGA GGACCAAGTA CAACCTCTAC 151 TTGAATTATT GCTCGAACAT CAGTTTTTAT TTGATCCTGA AAGCTAGGAG
1001 AGTCCCAGCA CATGGACATC CTGTCATAGA AAGGCTTGTT ACCTACCGAA
1051 ATTTGATCAA CAAGCTGTCC GTTGTGGATC AGAAGCTGTC CTCAGAAATT
1101 CGTCATCTGT TGACACTTAA GGATGATGCT GTAAAGAAAG AACTGATTCC
1151 AAAAGCAAAA TCCACCAAGC CCAAACCAAA GTCTGTTTCA AAGACTTCTG 1201 CTGCTGCCTG TGCTGTTACA GATCTTTCTG ATGATTCTGA TTTTGATGAA
1251 AAAGCAAAAC TGAAGTACTA TAAAGAAATA GAAGACAGGC AAAAGCTAAA
1301 GAGAAAGAAA GAAGAAAATA GCACTGAAGA ACAGGCTCTT GAAGATCAAA
1351 ATGCAAAGAG AGCTATTACC TATCAAATTG CTAAAAATAG GGGACTTACT 1401 CCTAGGAGAA AGAAGATTGA TCGCAATCCC AGAGTGAAAC ACAGAGAGAA
14S1 GTTCAGAAGA GCCAAAATTA GAAGAAGAGG CCAGGTTCGT GAAGTTCGTA
15D1 AAGAAGAGCA ACGTTATAGT GGTGAATTAT CTGGCATTCG TGCAGGAGTT
1551 AAAAAGAGCA TTAAGCTTAA ATGAAGTTTT TGCTTAGCAT AAGGTTTTTG IbOl GCAGTTTTGG ATCAATAAAT TTTTACTTTT AACTAAAGTC ATTGTATTAA lb51 TATATAATAC TTTAAATTTT AAAAATTCTT GTCCACAAGG AAATTTGTCT
17D1 GGGTTATTGG ACAATTTATA AGAACTATGG GAGCAATATG AAGGTGCTTG
17S1 AGAAAAGAGA TGATGTTGAA GTTTTCCAAT ATTCTGTTGA AGTTTTCCAA
1801 TATTAAGTAT TAGCTTAGGG AAATTTCACA GTTCATTGTG GAGTGTTAAA 1851 CTTAGAACAT GTGTAACTTT TCACATAAAG AGAATGCATC TTTGACAGTT
1101 ATCTTATTTG TAAGGCAGCC TATAAAATAG TTCTGAAGTA TTTTATTTAC
1151 CTAACTATAA TTATTGGGCC AGATACTTGT TAATAAATGG GCTTAATGTC 2001 AAAAAAAAAA AAAAAAAAA
BLAST Results
No BLAST result
Medline entries
No Medline entry
Peptide information for frame 3
ORF from 135 bp to 1571 bp; peptide length: 479
Category: similarity to unknown protein
Classification: Nucleic acid management
1 MVGRSRRRGA AKWAAVRAKA GPTLTDENGD DLGLPPSPGD TSYYQDQVDD
51 FHEARSRAAL AKGWNEVQSG DEEDGEEEEE EVLALDMDDE DDEDGGNAGE
101 EEEEENADDD GGSSVQSEAE ASVDPSLSWG QRKKLYYDTD YGSKSRGRQS
151 QQEAEEEERE EEEEAQIIQR RLAQALQEDD FGVAWVEAFA KPVPQVDEAE
201 TRVVKDLAKV SVKEKLKMLR KESPELLELI EDLKVKLTEV KDELEPLLEL
251 VEQGIIPPGK GSQYLRTKYN LYLNYCSNIS FYLILKARRV PAHGHPVIER
301 LVTYRNLINK LSVVDQKLSS EIRHLLTLKD DAVKKELIPK AKSTKPKPKS
351 VSKTSAAACA VTDLSDDSDF DEKAKLKYYK EIEDRQKLKR KKEENSTEEQ
401 ALEDQNAKRA ITYQIAKNRG LTPRRKKIDR NPRVKHREKF RRAKIRRRGQ
451 VREVRKEEQR YSGELSGIRA GVKKSIKLK
BLASTP hits
No BLASTP hits available
Alert BLASTP hits for DKFZphamy2_2f22, frame 3
PIR:Sb77Dl hypothetical protein YDL153c - yeast (Saccharomyees cerevisiae)ι N = 4ι Score = 134ι P = l-8e-08 PIR:T08b14 hypothetical protein DKFZpSb40012 • 1 - human
(fragment)ι l\l = li Score = 141ι P = S.8e-D7
TREMBL :SPBC3B8_1 gene: "SPBC3B6 • 01"i product: "hypothetical protein"i
S. pombe chromosome II cosmid c3B8-ι N = 2ι Score = lb4ι P = b-2e-
13
>TREMBL:SPBC3B8_1 gene: "SPBC3B8 - 01"i product: "hypothetical protein"i
S-pombe chromosome II cosmid c3B6- Length = 517
HSPs:
Score = lb4 (24. bits)ι Expect = b-2e-13ι Sum P(2) = b-2e-13 Identities = 44/12b (34*)ι Positives = bδ/12b (53*)
(Query: 3b7 DSDFDEKAKLKYYKEIEDR(QKLKRK-KEEN STEEiQALE-
DIQNAKRAITYIQ 414
D + +++ L YY+ ++ + K+ +K ++EN S + +E +
KR IT Sbjct: 472
DREVEDlQDDLDYYESLDKKSKMAKKLRKENHDLERDLIRASRHPELIELGEGDKRGITLD 531
(Query: 415 IAKNRGLTPRRKKIDRNPRVKHXXXXXXXXXXXXG(QVREVRKEE(QR- YSGELSGIRAGVK 473 IAKNRGLTPRR K +RNPR+K + + Q Y+GE
+GI+AG+ Sbjct: 532 IAKNRGLTPRRPKENRNPRLKKRMRYEKAKKKLASKKAIYKGAPOGGYAGEiQTGIKAGLV 511 (Query: 474 KSIKLK 471
KSIKL+ Sbjct: S12 KSIKLlQ 517
Score = 60 (12-D bits)ι Expect = b-2e-13ι Sum P(2) = b-2e-13 Identities = 21/121 (22*)ι Positives = bb/121 (51*)
(Query: H7 DEAETRVVK-DLAKVSVKEKLKMLRKESP--ELLELIE
DLKVKLTEVKDELEPLLE 241
D ++ + +K D + +++E ++ + + P ELL+++E + ++ L E+ ++L+P L
Sb j ct : 173 DNSDLKSIK(QDSSAAAIEELV(Q(QISPDLPRTELLKILEAKHPEF(QLFLDEL- N(QLKP(QLN 231
(Query : 2S0 LVEiQGIIPPGKGSiQYLRTKYNLYLNYCSNISFYL- ILKARRVPAHGHPVIERLVTYRNLI 3D6
+ + + + S(Q L+ + Y S + + FY +LK HP++
LV +
Sbjct: 232 EIKEKL- KTYPSS(QLL(QA(QCTALSTYISFLTFYFALLKDGEEDLKNHPIMVDLVRCK(QTU 210
(Query: 301 NKLSVVDtQKLS 311
+D+ L+ Sbjct: 211 ESYCGLDEVLT 301 Score = SI (6-1 bits)ι Expect = 1.2e-llι Sum P(2) = 1. Ξe-11 Identities = lδ/51 (30*)ι Positives = 35/51 (51*) (Query: lib VDEAETRVVKDLAKVSVKEKLKMLRKESPEL
LELIEDLKVKLTEVKDELE—PLLEL 2SD
++E ++ DL + E LK+L + PE L+ + LK +L E+K++L+ P +L
Sbjct: 161 IEELV(Q(QISPDLPRT ELLKILEAKHPEF(QLFLDELN(QLKP(QLNEIKEKLKTYPSS<QL 245
(Query: 251 VE 252
++ Sbjct: 24b LiQ 247
Score = 57 (β-b bits)ι Expect = 3-Oe-Dlι Sum P(2) = 2-be-Ol Identities = 13/56 (22*)ι Positives = 2b/Sδ (44*)
(Query : 3b7 DSDFDEKAKLKYYKEIEDRiQKLKRK-- KEENSTEE(QALED(QNAKRAITY(QIAKNRGLT 422
D + +++ L YY+ ++ + K+ +K KE + E + I
RG + T
Sb jct : 472 DREVEDiQDDLDYYESLDKKSKMAKKLRKENHDLERDLIRASRHPELIELGEGDKRGIT 521
Score = 42 (b-3 bits)ι Expect = 5-2e-D1ι Sum P(2) = S-2e-01 Identities = 13/51 (25*) i Positives = 21/51 (5b*)
(Query: m AETRVVKDLAKVSVKEKLKMLRKESPE-- LLELIEDLKVKLTEVKDELEPLLE 241
+ET + D+++ + LK ++++S + EL++ + L + EL +LE
Sbjct: ibO SETDAIDDIS(QUADNSDLKSIKώDSSAAAIEELV(Q(QISPDLP— RTELLKILE 210
Score = 31 (5-1 bits)ι Expect = 1-le-Dδι Sum P(2) = 1. le-06 Identities = δ/lβ (44*)ι Positives = 11/18 (bl*)
(Query: 43 YYflDlQVDDFHEARSRAAL bO +Y +I3+D RSRA L
Sbjct: 402 FYANlQIDlQKAAKRSRAVL 411
Pedant information for DKFZphamy2_2f22, frame 3
Report for DKFZphamy2_2f22.3
[LENGTH] 479
[MW] 54558.00
[pI] 5.50
[HOMOL! TREMBL:SPBC3Bδ_1 gene: "SPBC3B6 ■ D1"i product:
"hypothetical protein"i S-pombe chromosome II cosmid c3Bδ- le-lD [FUNCAT! 04-05.01.04 transcriptional control [S- cerevisiaei
YDLlS3c! le-06
[BLOCKS! PRD0526D
[BLOCKS! BL003bDC Ribosomal protein SI proteins [BLOCKS! BL001b4A Syndecans proteins
[BLOCKS! PRDDb24G
[BLOCKS! PR00626H
[BLOCKS! BL00624B Elongation factor 1 beta/beta ' /delta chain proteins
EKU! All_Alpha
EKU! L0U_C0MPLEXITY 24-b3 *
EKU! COILED COIL 7-10 *
SElQ MVGRSRRRGAAKUAAVRAKAGPTLTDENGDDLGLPPSPGDTSYYlQDiQVDDFHEARSRAAL SEG xxxxxxxxxxxxxxxx
PRD cccccchhhhhhhhhhhhhccccccccccccccccccccccccccchhhhhhhhhhhhhh COILS
SElQ AKGUNEVIQSGDEEDGEEEEEEVLALDMDDEDDEDGGNAGEEEEEENADDDGGSSVIQSEAE
SEG xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
PRD hhcccccccccccchhhhhhhhhhhhhhccccccccchhhhhhhhhhccccccchhhhhh COILS
SElQ ASVDPSLSUG(QRKKLYYDTDYGSKSRGR(QS(Q<QEAEEEEREEEEEA(QII(QRRLA(QAL<QEDD SEG xxxxxxxxxxxxxxxxxxxxxxxxxxx PRD hcccccccccccceeeecccccccccccchhhhhhhhhhhhhhhhhhhhhhhhhhhhhcc COILS
SElQ FGVAUVEAFAKPVPtQVDEAETRVVKDLAKVSVKEKLKMLRKESPELLELIEDLKVKLTEV SEG
PRD chhhhhhhhhhccchhhhhhhhhhhhhhhhhhhhhhhhhhhcchhhhhhhhhhhhhhhhh COILS CCCCCCCCCCCCCCCCCCCCCCCCCCCCC SElQ KDELEPLLELVElQGIIPPGKGSlQYLRTKYNLYLNYCSNISFYLILKARRVPAHGHPVIER SEG
PRD hhhhhhhhhhhhhhhcccccchhhhhhhhhhhhhhhhhhhhhhhhhhhcccccccccccc COILS ccccc
SElQ LVTYRNLINKLSVVDlQKLSSEIRHLLTLKDDAVKKELIPKAKSTKPKPKSVSKTSAAACA SEG xxxxxxxxxxxxxxxxxxxx- ■
PRD hhhhhhhhhhhhhhhhhhhhhhhhhhhhhcchhhhhhhhhhhhcccccccchhhhhhhhh COILS
SElQ VTDLSDDSDFDEKAKLKYYKEIEDRlQKLKRKKEENSTEEiQALEDlQNAKRAITYlQIAKNRG SEG
PRD hhhhcccchhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhc COILS
SEQ LTPRRKKIDRNPRVKHREKFRRAKIRRRGQVREVRKEEQRYSGELSGIRAGVKKSIKLK
SEG xxxxxxxxxxxx
PRD cccccccccccchhhhhhhhhhhhhhhhhhhhhhhhhhhhhhcccccchhhhhhhhccc
COILS
(No Prosite data available for DKFZphamy2_2f22.3)
(No Pfam data available for DKFZphamy2_2f22.3)
DKFZphamy2_2g12
group: nucleic acid management
DKFZphamy2_2g12 encodes a novel 191 amino acid protein with similarity to NVL-2 of Rattus norvegicus. The novel protein contains 3 EF-hand calcium-binding domains. The related human VILIP Ca-dependent protein specifically binds the 3'-untranslated region of the neurotrophin receptor, trkB, an mRNA localized to hippocampal dendrites in an activity-dependent manner. The new protein exhibits elevated expression in brain and testis.
The new protein can find application in studying the expression profile of brain-specific genes and as a new marker for neuronal cells.
strong similarity to NVL-2 (Rattus norvegicus)
Comment for P35332:
FUNCTION: MAY BE INVOLVED IN THE CALCIUM-DEPENDENT REGULATION OF RHODOPSIN PHOSPHORYLATION.
TISSUE SPECIFICITY: NEURON-SPECIFIC IN THE CENTRAL AND PERIPHERAL NERVOUS SYSTEM.
MISCELLANEOUS: PROBABLY BINDS TWO OR THREE CALCIUM IONS (BY SIMILARITY).
SIMILARITY: TO OTHER EF-HAND CALCIUM BINDING PROTEINS; BELONGS TO THE RECOVERIN SUBFAMILY.
Sequenced by MediGenomix
Locus: /chromosome="1"
Insert length: 4285 bp
Poly A stretch at pos. 4256, polyadenylation signal at pos. 4247
1 GGCGGCTCCG GCGCAGACCT TGGAGAGCAC AGCTGCCGGC CCGCGAGCCA
51 GCCTCGGTTC CCGCGGCCCG CCGAGGCTCG GAGCCATCCA GCGACCCGGC 101 GACCGGCCTC AGGCCCCGCC ATGGGGAAGA CCAACAGCAA GCTGGCCCCC
151 GAGGTGCTGG AGGACCTTGT TCAGAACACT GAGTTCAGCG AGCAGGAGCT
201 GAAGCAGTGG TACAAGGGCT TCCTGAAGGA CTGCCCCAGC GGCATCCTCA
251 ACCTGGAGGA GTTTCAGCAG CTCTACATCA AGTTCTTCCC CTACGGCGAC
3D1 GCCTCCAAGT TCGCGCAGCA CGCTTTCCGC ACCTTCGACA AGAACGGCGA 3S1 CGGCACCATC GACTTCCGGG AGTTCATCTG CGCCCTGTCG GTCACCTCCC
401 GCGGCAGCTT CGAGCAGAAG CTCAACTGGG CCTTTGAGAT GTACGACCTG
451 GACGGCGACG GGCGAATCAC GCGCCTGGAG ATGCTGGAGA TCATCGAGGC
501 AATCTACAAG ATGGTGGGCA CCGTGATCAT GATGCGCATG AACCAGGACG
SSI GGCTCACGCC CCAGCAGCGT GTGGACAAGA TCTTCAAGAA GATGGACCAG bDl GATAAGGACG ACCAGATTAC ATTGGAGGAG TTCAAGGAGG CAGCCAAGAG bSl TGACCCATCC ATTGTGTTGC TGCTGCAGTG TGACATGCAG AAGTAGAAGC
701 TGGTGAGGGG CAGGGTCCCT GGCCAGAAGG GGCATGGCCA CCTCCCAACC
751 TGATGACCTC TCTGGCTGGC CTCCCAGGAG GAGGGACACT CCAGCCCCCC δOl TCTCTGGCCC ACCCAGTCCT CTGCCCAAGC CCTTCCTCCC CTCCATCAAG δSl ATCTTTGAGG GACCACCTCA CCCTGCAAAA GAGACAGGTC CTCCAGTACC
101 CT6TCTTCTA GCCCCACCTC CCACTTGGCC AGAACCAATG TCCATTGGGC
151 ATAGGGGAGT TGGCTTTTGC CCCAGGAGGT GAGGTTAAGG AGTTGGGGGC 1001 CTGGGGTTCT GGTTAGGAAT TCTCTTGATC CTGGGATTAT GCTTTATAGG
1051 ATGTGGTCCC ACAGGCCTGT CACAGGGCCA AATTGGGTCT GTCCATTCCT
1101 GAGGCTCCAG ATCCCATAAA GGGGGTCTCT TCCCCATCCC TTCTACTCTA
1151 CCTGGCCCTT CCAGCCCCAG CCTTTGGAGC GTTCATTCAG TCCTTTCTTC
1201 AGCTAATGAT TACTGAGCAC CTGTTTGGTG CTAAGGATAT GGTCATTTAC 1251 AAGACACATC TTGTGCCCTC TGGAAGCTCA TAGGGTTGTG AGGCAAACTT
1301 CCAGCCGTCA GGGTCTCAGC TAAGCAGAAG GTGCTGGAAG GCTGGTTAGT
1351 CTGGGAGGAG CTATTTCATC TTCCAGCTCA GCTCCACACA AAGCTGCAGA
1401 AGGACGAAAT GAAAAGCATT TGGAAGTTTA GGAGCCACGT GAGTGAAAGT
1451 TTTAAGAAAA ATGAAATTTA TGTCATACTT ATTTTTTTAG TACCCTTTAA 1501 AGGAGCTACA GTCATTTTAT TATTTCAGGA GGTTAAAATA TACTCTATAT
15S1 TACTTGGTTT ATTATAAAAT GATTAAATGA ATAGAGAAAA TATTAATTTT
IbOl CAAGGGGAAA AAACCTGAGA AGAAAGGGAG AAAAGACCAT GAAATTTACC
11.51 AGATAACACT TTTTAAGACT AAGTCCTGAG CTGCCACTCT CAGCAGTTTT
1701 TGCTGCTTCA GCTCTTCCTT TTTATTACCT TTTTCAATTC AACAAGCAAC 1751 TTTCTGCTAC ATACTTACTC CGGTTGGGTG CTGACTTCAG GGACAGGAAA
1601 AAGCAAGGTT TGCAAAGAGT GAAACTAGTG TATATTCCGT ATCTTGGTAG
1651 TTCGTTTCTG GATTGGGTTT AGTTTCAGAA CTGGACTTGT TCCTTCACTG
1101 CCACAGAATC AGAAAGAGCT AGAAGAAAAG GCTCACCTGG CCACTGTTTA
1151 GGCACCCAGA CATAATTTAT GGACGAAATG CCTAAAAATG TGCCAGGCAT 2001 GCTCTGTTTG AGAGGCTTTT TCTAACCCCA AATCTTAGAT CTGCCAGGTA
2DS1 GTTCAACATC TTCCAAGTGT GCTGGTTCTG CTTTCCAATG CCTGCTTCCC
2101 AATTTTGGAT CCATGAGCTA TACAGCTGCA TGCTTTGACT GCCGGAAAAA
21S1 TTAATCTTGC TTCTTCATCA GGTCTTTCTC CTGTACTTGT GATCAGAAAT
2201 TACCTTTGAC GTGCAGTGAC AGTTGATTTC CTCTTGAACT GCCGGTGAAA 2251 ACAGTCTAGT ACACAGGTGC TGTCAGCCCA GGGTGGGAGC AGGAAATGAT
23D1 TGCTGAGCCC GGGGCAGGGG AATTGCATCT GCAGGAAAGA GATGCAGCAT
2351 GCTCCTCACT CCTGAGTGCC CACCTGTCCT GCTTCTCTGC AGGTGAAAAC
2401 TCTGGGGGAT GCTGATCAAT AGAGCTTGGT CCCAAGCTCT ACTGGGCCCT
2451 TGGAGGTAGC AAGGCCACTG GGTTGCTATC CTCTTGATGG GGATAGCAAC 2501 CACTGGTTTG CAACCACTGG GTTGCTATCC TTTTGCTATC CTCTTGCTCA
25S1 TGACCAGCCA TATGGTGAGG CTGGGGAGTT CACATCCTCA GGCAGGAACT
2b01 AGCAGTTGTT TATCCAGCAA TGCCTCAAGG ATGTTGCATT GCTCCCAGGA
2b51 GCTGGCTATT AGGTATGTCT TGTGCGGTCA GTCAGCATCA CAGACACATA
27D1 GATGCTCACC AGCCTGGCTT AGCTGGGACC TAAATCTTCT GGTGAAAAGC 27S1 TTTTCACTAA GTGAGGTTCC TTCCCTGCAA ATGCTGAATC TAGCCTAATT
2601 CGCAACCACA CAGAATTTCA TGGCTTTCAA AGGCTTGCCA TGTGCCCCAT
2651 CTCATTCTAT ACTCACATCC CATGGAGGTG AGGATTTTCA CTTCTTTTCT
21D1 CTAGACTTGG AAGCTGAGAT TCAGAGAGGA AGCATCCCTT GTGCAAGATC
2151 ACATAGTCAG GAGGTGACAC AGGGCTAAGA CTTGAACCAA GGCTCTAAGA 3001 GGATTTCTTC TTTTCAGAGT CTCTTCCCTG TCCATTTCTG TGACTAAGCT
3D51 GTGCAGAGGT TGACAGCAGG GCAAGTTACA TTGATATTCA TCCTTTATAG
31D1 GCTTCCTGCT AAAAAGCTTC TGAGATTGTG GTCTTCCAAA AAAAATAGGA
3151 GCTTGGTTGA AGTCCCCACA TTTTCAAGCA CTCAGTGTTC TGCCTCTGGC
3201 AGCTGTGCTA ACAGCTCAGT GCTGTCCTGG GAGTCCTCTG ACTCAGAACC 3251 CTCGAAGCAT CCTGCATTGT CTTTACCCAC CATCATCGTC ACTAAGAGAA
33D1 ACATGCCTAC CCATGAAGGC GTGTTTGATT ACTCCAGGCT TCTGGACACA
33S1 CATACCCATG GGTGATTTTT GCTCCTCAGG CCCAATATTC TCAGACAGCC
3401 CAGCAGTGTG AACACACAAT GCCAGGCCAG GAACTGGGAC CACCATCTTG
3451 CTGATGGAAG GAACAACAGG TGGCCCAGGA CATGCTCCTG CATACTCCTG 3S01 GGTGTCCCAG GGACTGTGTG CTCAGGAGCA CTGTGGTAGA GCACTGGCCC
35S1 TGCCTTGAGA AGAGACACAG GTCTCCCGTC CCTGCACCAG CTGAGAGAGA
3b01 CTTGCCACAA AGCACAAGGC TGGCAGAGAT TTATGTATGA CTTGCACAGA
3b51 CACAAAAATA TACAGACAAT CAAAACATTG ATATATTCAA ACTCTCCTTT 3701 AAATTCCAAT CTTATTGCAA CAACTCTGTG AATTGCAAGG TCCCAGAATC
3751 TGCCTTCTCA CATACTCTAC CCTCATTCAT CCTTTTGGGC TAATTGATGA
36D1 GCATCTTATT TCTTATCTCT AAAAATTATC AGCAAAGGCT ACTTCAGATG
3651 GCCACTTTAG TCCTTTCAGC TGTAGTCAGG ATTATTTAAC TTACCTGTAT
3101 ATCAAAAGTG AAGAAAAAGT TAGTTCATAA GTAAAGGCAC TAAATCCTTT
3151 CCTGACAATG GCAGAGTCTC TAGAGGTAGA AATTTGCCTT GCTGCAGAGA
40D1 GAGAAGGAAT GGCGTGGGAT GGGGGAAAGA AAAGAAAGAG AAGAAGAGAA
4051 GAAGCTGGGG TCTCCAGGCA GGGTAGTAAG CTGACACTAA ATATTTTTTA
4101 CACAAAAATG TATTGAAGCA ACAAATATTT CCTGAAGATC CACCCTGGGT
41S1 GAGGCTTTGA GCTGACTTTA GAGATCACTG TGGGGTCAAG AATGTCTTAC
4201 ATGTTTTATT CATCATTCTT GAAAAAAGAA ATAATTCAAA CCTTGGAATT
4251 AAAAAGTCAG AAAAACAAAA AAAAAAAAAA AAAAA
BLAST Results
No BLAST result
Medline entries
93367470:
Kajimoto Y, Shirai Y, Mukai H, Kuno T, Tanaka C.; Molecular cloning of two additional members of the neural visinin-like Ca(2+)-binding protein gene family. J Neurochem 1993 Sep;61(3):1091-6
96071121:
Polymeropoulos M.H., Ide S., Soares M.B., Lennon G.G.; Sequence characterization and genetic mapping of the human VSNL1 gene, a homologue of the rat visinin-like peptide RNVP1. Genomics 29(1):273-275 (1995).
Peptide information for frame 1
ORF from 121 bp to 693 bp; peptide length: 191
Category: strong similarity to known protein
Classification: Protein management
Prosite motifs: EF_HAND (73-85) EF_HAND (109-121) EF_HAND (159-171)
1 MGKTNSKLAP EVLEDLVQNT EFSEQELKQW YKGFLKDCPS GILNLEEFQQ
51 LYIKFFPYGD ASKFAQHAFR TFDKNGDGTI DFREFICALS VTSRGSFEQK
101 LNWAFEMYDL DGDGRITRLE MLEIIEAIYK MVGTVIMMRM NQDGLTPQQR
151 VDKIFKKMDQ DKDDQITLEE FKEAAKSDPS IVLLLQCDMQ K
BLASTP hits
No BLASTP hits available
Alert BLASTP hits for DKFZphamy2_2g12, frame 1
No Alert BLASTP hits found
Pedant information for DKFZphamy2_2g12, frame 1
Report for DKFZphamy2_2g12.1
[LENGTH] 231
[MW] 26277.12
[pI] 5.26
EHOMOL! PIR:JH061S neural visinin-like Ca2+-binding protein-type 2 - rat Ie-1D7
EFUNCAT! 16 classification not yet clear-cut ES- cerevisiaei YDR373w! 3e-52
EFUNCAT! 03-01 cell growth ES. cerevisiaei YKLllOw! 3e-16
EFUNCAT! 03.07 pheromone responsei mating-type determination! sex-specific proteins ES- cerevisiaei YKLllOw! 3e-lβ
EFUNCAT! 13-04 homeostasis of other ions ES- cerevisiaei YKLllOw! 3e-lδ
EFUNCAT! 04-05-01-04 transcriptional control ES- cerevisiaei
YKLllOw! 3e-lβ
EFUNCAT! 30-03 organization of cytoplasm ES- cerevisiaei
YKLllOw! 3e-lβ EFUNCAT! 11-01 stress response ES- cerevisiaei YGRlOOw! 7e-04
EBLOCKS! BL00303B S-100/ICaBP type calcium binding protein
EBLOCKS! BL00016
EBLOCKS! PR00450G
EBLOCKS! PR004S0F EBLOCKS! PR0045DE
EBLOCKS! PR004SDD
EBLOCKS! PR0045DC
EBLOCKS! PR004S0B
EBLOCKS! PR00450A ESCOP! dlosa 1-37.1-5.13 Calmodulin E(Paramecium tetraurelia) 6e-25
ESCOP! dlrec 1.37.1.5-21 Recoverin Ebovine (Bos taurus) le-72
ESCOP! dla4pa_ 1-37-1-2-5 Calcyclin (S100) [Human (Homo sapiens)ι PI 7e-0S
[SCOP! dlrro 1-37.1-4-1 Oncomodulin [rat (Rattus norvegicus) 2e-17
[SCOP! dlsyma_ 1-37-1-2-2 Calcyclin (S100) [rat (Rattus norvegicus) 1e-14 [SCOP! d4icb 1.37-1.1-1 Calbindin D1K [bovine (Bos taurus) 2e-lδ
[SCOP! dlauib_ 1-37-1-5-11 Calcineurin regulatory subunit
(B-chain le-45
[PIRKU! blocked amino end le-11 [PIRKU! phosphotransferase 3e-D6
[PIRKU! duplication 7e-17
[PIRKU! tandem repeat 7e-0b
[PIRKU! heterodimer 7e-17 [PIRKU! heart 7e-Db
[PIRKU! serine/threonine-specific protein kinase 7e-0b
[PIRKU! acetylated amino end 7e-0b
[PIRKU! ATP 7e-0b [PIRKU! skeletal muscle 7e-0b
[PIRKU! signal transduction 4e-b1
[PIRKU! protein kinase 3e-0β
[PIRKU! calcium binding le-11
[PIRKU! alternative splicing le-13 [PIRKU! lipoprotein le-11
[PIRKU! cardiac muscle 7e-0b
[PIRKU! muscle 7e-0b
[PIRKU! myristylation le-11
[PIRKU! EF hand le-11 [PIRKU! retina le-4b
[SUPFAM! calcium-dependent protein kinase 3e-06
[SUPFAM! unassigned calmodulin-related proteins 2e-34
[SUPFAM! protein kinase homology 3e-0β
[SUPFAM! calmodulin le-11 [SUPFAM! calmodulin repeat homology le-11
[PROSITE! EF_HAND 3
[PFAM! EF hand
[KU! All_Alpha
[KU! 3D
SE(Q GGSGADLGEHSCRPASiQPRFPRPAEARSHPATRRPASGPAMGKTNSKLAPEVLEDLViQNT Irec- HHHHHHHHHTTTT
SElQ EFSE(QELK(QUYKGFLKDCPSGILNLEEF(Q(QLYIKFFPYGDASKFA(QHAFRTFDKNGDGTI Irec- CCCHHHHHHHHHHHHHHTTTTEEEHHHHHHHHHHHTTTTCHHHHHHHHHHHH---
--CEE SElQ DFREFICALSVTSRGSFElQKLNUAFEMYDLDGDGRITRLEMLEIIEAIYKMVGTVIMMRM Irec-
EHHHHHHHHHHHHCCCGGGHHHHHHHHHTTTTCCCEEHHHHHHHHHHHHHCCTTTTGGGC
SElQ N(QDGLTP(Q(QRVDKIFKKMD(QDKDD(QITLEEFKEAAKSDPSIVLLL(QCDMιQK lrec- TTTTTCHHHHHHHHHHHHCCTTTTEECHHHHHHHHHHCHHHHHHHCCCHHH
Prosite for DKFZphamy2_2g12.1
PS00018 113->126 EF_HAND PDOC00018
PS00018 149->162 EF_HAND PDOC00018
PS00018 199->212 EF_HAND PDOC00018
Pfam for DKFZphamy2_2gl2-l
HMM NAME EF hand
HMM *EIqEMFrmMDkDGDGyIDFEEFmeMMkem* (Q +FR +DK+GDG+IDF EF+ +++ (Query 104 FAiQHAFRTFDKNGDGTIDFREFICALSVT 132
27-15 140 Ibβ 1 21 dkfzphamy2_2gl2 - 1 strong similarity to NVL-2 (Rattus norvegicus) Alignment to HMM consensus:
(Query *EIqEMFrmMDkDGDGyIDFEEFmeMMkem*
++++F+M+D DGDG+I+ E++E++ ++ dkfzphamy2 140 KLNUAFEMYDLDGDGRITRLEMLEIIEAI Ibδ (Query 216 1 ' 21 dkfzphamy2_2gl2-l strong similarity to NVL-2 (Rattus norvegicus)
Alignment to HMM consensus: HMM *EIqEMFrmMDkDGDGyIDFEEFmeMMkem*
++++F++MD+D+D +1+ EEF+E+ K+ (Query 110 RVDKIFKKMDlQDKDDiQITLEEFKEAAKSD 216
DKFZphamy2_2i17
group: amygdala derived
DKFZphamy2_2i17 encodes a novel 462 amino acid protein without similarity to known proteins. Most ESTs are derived from brain and pancreas. No informative BLAST results; no predictive Prosite, Pfam or SCOP motif.
The new protein can find application in studying the expression profile of amygdala-specific genes.
unknown protein, perhaps complete cds.
Sequenced by MediGenomix
Locus: unknown
Insert length: 3473 bp
Poly A stretch at pos. 3454, polyadenylation signal at pos. 3436
1 GATATCCCAA TCTTTGGACT GCATCCTGGT TGCCTCTACT GTGGTCACCT
51 TTGGGAAGAA ATGTCTTCTG TAAAAAGAAG TCTGAAGCAA GAAATAGTTA
101 CTCAGTTTCA CTGTTCAGCT GCTGAAGGAG ATATTGCCAA GTTAACAGGA
151 ATACTCAGTC ATTCTCCATC TCTTCTCAAT GAAACTTCTG AAAATGGCTG
201 GACTGCTTTA ATGTATGCGG CAAGGAATGG GCACCCAGAG ATAGTCCAAT 251 TTCTGCTTGA GAAAGGGTGT GACAGATCAA TTGTCAATAA ATCAAGGCAG
301 ACTGCACTGG ATATTGCTGT ATTTTGGGGT TATAAGCATA TAGCTAATTT
351 ACTAGCTACT GCTAAAGGTG GGAAGAAGCC TTGGTTCCTA ACGAATGAAG
401 TGGAAGAATG TGAAAATTAT TTTAGCAAAA CACTACTGGA CCGGAAAAGT
4S1 GAAAAGAGGA ATAATTCTGA CTGGCTGCTA GCTAAAGAAA GCCATCCAGC 501 CACAGTTTTT ATTCTTTTCT CAGATTTAAA TCCCTTGGTT ACTCTAGGTG
S51 GCAATAAAGA AAGTTTCCAA CAGCCAGAAG TTAGGCTTTG TCAGCTGAAC bOl TACACAGATA TAAAGGATTA TTTGGCCCAG CCTGAGAAGA TCACCTTGAT b51 TTTTCTTGGA GTAGAACTTG AAATAAAAGA CAAACTACTT AATTATGCTG
701 GTGAAGTCCC GAGAGAGGAG GAAGATGGAT TGGTTGCCTG GTTTGCTCTA 751 GGTATAGATC CTATTGCTGC TGAAGAATTC AAGCAAAGAC ATGAAAATTG
601 TTACTTTCTT CATCCTCCTA TGCCAGCCCT TCTGCAATTG AAAGAAAAAG
ΘS1 AAGCTGGGGT TGTAGCTCAA GCAAGATCTG TTCTTGCCT6 GCACAGTCGA
101 TACAAGTTTT GCCCAACCTG TGGAAATGCA ACTAAAATTG AAGAAGGTGG
151 CTATAAGAGA CTATGTTTAA AAGAAGACTG TCCTAGTCTC AATGGCGTCC 1001 ATAATACCTC ATACCCAAGA GTTGATCCAG TAGTAATCAT GCAAGTTATT
1051 CATCCAGATG GGACCAAATG CCTTTTAGGC AGGCAGAAAA GATTTCCCCC
1101 AGGCATGTTT ACTTGCCTTG CTGGATTTAT TGAGCCTGGA GAGACAATAG
1151 AAGATGCTGT TAGGAGAGAA GTAGAAGAGG AAAGTGGAGT CAAAGTTGGC
1201 CATGTTCAGT ATGTTGCTTG TCAACCATGG CCAATGCCTT CCTCCTTAAT 12S1 GATTGGTTGC TTAGCTCTAG CAGTGTCTAC AGAAATTAAA GTTGACAAGA
1301 ATGAAATAGA GGATGCCCGC TGGTTCACTA GAGAACAGGT CCTGGATGTT
1351 CTGACCAAAG GGAAGCAGCA GGCATTCTTT GTGCCACCAA GCCGAGCTAT
1401 TGCACATCAA TTAATCAAAC ACTGGATTAG AATAAATCCT AATCTCTAAA 1451 TCTAAGAACT AAGCTTTGAG TATTATTTAA TAATTTCTAA TAACACTCAT
1S01 TCCTCAAGTG ATATTAGAGA TTATTCAGTA CTCTTGAGAG TGTCACAACA
1551 CAAAATACGA TGTTGGGTTT TCGAAATATT TTCAAAGTGT TCTGTCTTAA
1601 TCACAAATTC ATATTTTTAC ACATTTTTAC AATATTGCCT CAGATTATGT
1651 TAAATTTGGG TCAGTCTTCT CTGAACTTTT TCTCTCTCGG TTTCTTTTCT
1701 TCCTTCACAG TTTTATCTCA CAAAACCATT TTTCTAATAA GAGACATCAT
1751 GTTGGAAAGA TGTTGTAGAA ATGTGCATAA ATTTCAGTGC CTCTTGTAAG
1801 CATTAAACTG ATGATGAAGA AAGTTCCTGA TTTGAGAAAT GAATCAAAGT
1851 AATTTTAATG AATTTTTAGC TTGTATTAGC TTGAGTTAGC TGGCATTGAT
1901 TTTTTAGTCC TTTTGTTACC TTTAAGTTGT CAATATATGG TTTTTGTTCA
1951 TCTCCCCATT GTAGTCCCAC TTGCTCTTTC CTGGGGGTTC CATTGTTCTA
2001 GCAGTGGAGG TGTTACAGTG TCGCCACTCG TCTAATTTGA CCAGTGTTAA
2051 GAATTTTCTA ATTTAATAAT TTAATAGTGA TCTCAATACC ACACCCTCAT
2101 GGAAGGAGAA AAGCATACTA TTATATCTGG GACCTCTCTT TTAGACCTAA
2151 AATTAATTAA CATATCTACT TATATGTTAC TTATACCTAA AGCTGTTATT
2201 AAGACAAACC AAGATTCTCT GCTTTTGCAC TGAAATTAAA CTTGAAAGGA
2251 ATTCTCCTCA AAGGTCGGAT ATTAAATAAG TCCCAGGCAG ATTTACATAT
2301 TTAATTTAAA ACATTGGCTT TATTTCATTT TGTGATGAGT GATGTATCTG
2351 TGTTAACAAA AAATTGTATA ATCATTACCA ATACTATTTA TTATGCTCAA
2401 ATATATCTTG GCTTTGACCT TATTTCAACA CATTCTAAGA AGCCTTGACA
2451 AAGTAAGTAT ATTTTAGAGC TGAATCAGTA AGATTCTAGA GAAAGCAAAA
2501 CATAGTAGTT CACAATTTTG CAACATAGAA AGTCACATTT TGAAAGGCTA
2551 TTTTGAAATT GATTTAATAG CTATTATAGT TTATGAATAT CAAAATTTGT
2601 ATAATTTGCA TCTTTACTAA TGTATGCTAG AGCTACAAGA GACCTTAAGG
2651 ATAATATATG AAATTAGCTT TCCTTATTTT ATAGATAAGG AAAAAGAAAT
2701 TGTGAAAGGT GAATTTACCT AATTAGTGAA AGTTACATAA CTAATTACAA
2751 CAGTCTGTAC TATATAATGC AGAGGACGAT TCTCCCTGTA AAAGGAACTA
2801 GAAGCTATTA CTAAAAATAT ATATAGACAA AATTAAAAGA AGGAATGATA
2851 AGAATAAATT TAATTTACCA AATATTGTTA ATTAAAATTT TAGATACTTA
2901 ACATTTATTT AACTTAAATA AAAGATAACT GTCAGATAAA ACTTTATTTT
2951 ACTAATGAGC AGTGATTTTC TTAGGAATTG ATGAAGGCTT ATTGGTATCA
3001 AGAATTTAAA CCAAATTAAA ACTGACAGAG GACATTTAGA TACATAATAA
3051 AATTCGAGCT ACATAAGTAT ATGGAAAATA ATGTACCTTG ATTATTATGA
3101 AATAGAGCAT CTTGAAATTC AGTTTTACTC TAAATGTACT TTTAATACTT
3151 GCAGATTCTA AGATTACATT GTGAAATTCC AGGTTTTCAT AATGTTAAAA
3201 TAGGAAAGTA GAATATAAAG TATCAACAAG TGTAGTTATA CATTTTGTTT
3251 TGGATATTTA ATCCTTACTT GGGAAAAAAT CAGCATCTAG GTAAATTATT
3301 ATTTTAATAA GAACTCTTAA ATTGCCAACC TCTGAGAGGT GAAAAGCTAT
3351 GTAAATAGAA GGAATGGCCA GTTCAAAAGA ATAGTAGAAG TGATAGTGCC
3401 GTGAATGTAT TCTACTGGAA ATGAATGTAA TAATACATTA AATTTTTAAA
3451 ATCGAAAAAA AAAAAAAAAA AAA
BLAST Results
No BLAST result
Medline entries
No Medline entry
Peptide information for frame 1
ORF from 61 bp to 1446 bp; peptide length: 462
Category: putative protein
Classification: unclassified
Prosite motifs: MUTT (355-374)
1 MSSVKRSLKQ EIVTQFHCSA AEGDIAKLTG ILSHSPSLLN ETSENGWTAL
51 MYAARNGHPE IVQFLLEKGC DRSIVNKSRQ TALDIAVFWG YKHIANLLAT
101 AKGGKKPWFL TNEVEECENY FSKTLLDRKS EKRNNSDWLL AKESHPATVF
151 ILFSDLNPLV TLGGNKESFQ QPEVRLCQLN YTDIKDYLAQ PEKITLIFLG
201 VELEIKDKLL NYAGEVPREE EDGLVAWFAL GIDPIAAEEF KQRHENCYFL
251 HPPMPALLQL KEKEAGVVAQ ARSVLAWHSR YKFCPTCGNA TKIEEGGYKR
301 LCLKEDCPSL NGVHNTSYPR VDPVVIMQVI HPDGTKCLLG RQKRFPPGMF
351 TCLAGFIEPG ETIEDAVRRE VEEESGVKVG HVQYVACQPW PMPSSLMIGC
401 LALAVSTEIK VDKNEIEDAR WFTREQVLDV LTKGKQQAFF VPPSRAIAHQ
451 LIKHWIRINP NL
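The ORF coordinates and the peptide length quoted in a "Peptide information" block are linked by simple arithmetic, which gives a quick consistency check on the listing. The sketch below assumes 1-based, inclusive coordinates that exclude the stop codon; that convention and the helper name `peptide_length` are assumptions made for illustration and are not stated in the listing.

```python
def peptide_length(orf_start: int, orf_end: int) -> int:
    """Return the number of codons in a 1-based, inclusive ORF span.

    Assumes the span excludes the stop codon (an assumption, not a
    convention documented in the listing).
    """
    span = orf_end - orf_start + 1
    if span <= 0 or span % 3 != 0:
        raise ValueError(f"span of {span} nt is not a whole number of codons")
    return span // 3


# Frame 1 ORF of the entry above: 61 bp to 1446 bp.
print(peptide_length(61, 1446))
```

Under these assumptions the span of 1386 nucleotides corresponds to 462 codons, which agrees with the peptide listed above.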
BLASTP hits
No BLASTP hits available
Alert BLASTP hits for DKFZphamy2_2il7, frame 1
No Alert BLASTP hits found
Pedant information for DKFZphamy2_2il7, frame 1
Report for DKFZphamy2_2il7.1
[LENGTH] 462
[MW] 52076.25
[pI] 6.36
[HOMOL] TREMBL:SPBC1778_3 gene: "SPBC1778.03c", product: "conserved hypothetical protein"; S. pombe chromosome II cosmid c1778. 1e-45
[FUNCAT] 99 unclassified proteins [S. cerevisiae, YGL067w] 4e-34
[FUNCAT] r general function prediction [H. influenzae, HI0432 pyrophosphohydrolase] 4e-24
[FUNCAT] 1 genome replication, transcription, recombination and repair [M. jannaschii, MJ1141 nucleotide pyrophosphohydrolase] 1e-04
[BLOCKS] BL00219F Anion exchangers family proteins
[BLOCKS] BL01213B
[BLOCKS] DM01101
[BLOCKS] PF00023A
[BLOCKS] BL00893 mutT domain proteins
[SCOP] d1awcb_ 1.11.3.1.2 GA binding protein (GABP) alpha GA bindini 2e-35
[SUPFAM] hypothetical protein HI0432 1e-22
[PROSITE] MUTT 1
[PFAM] Bacterial mutT protein
[PFAM] Ank repeat
[KW] Irregular
[KW] 3D
SElQ MSSVKRSLKlQEIVTiQFHCSAAEGDIAKLTGILSHSPSLLNETSENGUTALMYAARNGHPE lawcB .CCCTTTTCTTTCCHHHHHHHHTTHHHHHHHHHCCCTT-
TTEETTTEEHHHHHHHHCCHH
SElQ IVlQFLLEKGCDRSIVNKSRiQTALDIAVFUGYKHIANLLATAKGGKKPUFLTNEVEECENY lawcB
HHHHHHHHCCTTTTCBTTTBCHHHHHHHHCCHHHHHHH
SElQ FSKTLLDRKSEKRNNSDULLAKESHPATVFILFSDLNPLVTLGGNKESFiQiQPEVRLCiQLN lawcB
SElQ YTDIKDYLAiQPEKITLIFLGVELEIKDKLLNYAGEVPREEEDGLVAUFALGIDPIAAEEF lawcB
SElQ KiQRHENCYFLHPPMPALLlQLKEKEAGVVAlQARSVLAUHSRYKFCPTCGNATKIEEGGYKR lawcB
SElQ LCLKEDCPSLNGVHNTSYPRVDPVVIMlQVIHPDGTKCLLGRlQKRFPPGMFTCLAGFIEPG lawcB
SElQ ETIEDAVRREVEEESGVKVGHVlQYVACiQPUPMPSSLMIGCLALAVSTEIKVDKNEIEDAR lawcB
SE(Q UFTREIQVLDVLTKGKIQIQAFFVPPSRAIAHIQLIKHUIRINPNL lawcB
Prosite for DKFZphamy2_2il7.1
PS00893 355->375 MUTT PDOC00695
Pfam for DKFZphamy2_2il7.1
HMM_NAME Ank repeat
HMM      *GyTPLHIAARyNNvEMVrlLLQHGADIN*
          G+T+L++AAR+++ E+V++LL++G D
Query 46  GWTALMYAARNGHPEIVQFLLEKGCDRS 73
HMM_NAME Bacterial mutT protein
HMM *ILMiqRedppnHYdtHhgdUIFPGGkIEeGETPE(QCarREIUEETGI* L++++++ +++ + ++G+IE+GET+E+++RRE++EE+G+
(Query 337 CLLGRlQKRF—PPG
MFTCLAGFIEPGETIEDAVRREVEEESGV 377
DKFZphamy2_2ol3
group: intracellular transport and trafficking
DKFZphamy2_2ol3 encodes a novel 590 amino acid protein with high similarity to murine synaptotagmin 3. The novel protein contains two C2 domains. The C2 domain is thought to be involved in calcium-dependent phospholipid binding. Synaptotagmins are essential for Ca(2+)-regulated exocytosis of neurosecretory vesicles. The new protein can find application in modulating/blocking synaptic activity.
similarity to synaptotagmin 3 (Mus musculus)
Sequenced by MediGenomix
Locus: unknown
Insert length: 2931 bp
Poly A stretch at pos. 2912, polyadenylation signal at pos. 2884
1 ACTCTATGTC TCCTCTCGTT GGATTGTGAC ACCGGGAGGT CAGGGAACTC
51 CAGGACCTTG TTCTCTGCTG GATTCGCAGC AACCAGCACA GCACGTAGGG
101 CGTAGTTGGT GCTGGATGGA TGTTTGTTGA ATGAATGAAT GATGAATGGC
151 TGGCACCTTG TCTGCTCATC CCTAACTCCT GTTCCTTCAT CTGTGCAGCC
201 CTAATCTTTG TTTCCTCATC TGTCCATCCC TTTATTTGTG CATCCTCATT
251 CTTAGCCCCT TCACTGCCCT TCTCCATCTC TTCCTCCTTG TTCATTTGTC
301 CCTGTTCTCT GTCCTCTACT CCACTCATGC CCATCTCTGT CCCCTTGACT
351 TACCCAGTCC CTGCTACTAT CTCCATCCCT AATTTCTGCC CTCTTGTCTG
401 TCTACTCCTA ATTCCTTTTC CTTGTCCATC CCTAATACCT GTCACCTTGT
451 CCTTCTTCCT CGAATCTCCA TCCCTAATCC ATCTGCCCCT AATCTCTGTC
501 CCCTTTGCCC ATCCTTCCTT TTCTCGGTGT CTCTTTCCAC CCTTATCTCC
551 ACACCTGCCC ACCCTGCACT CCCATTCTGT TTCCCATCTG CACCCTTGCC
601 CCATCCCTCC CACACACAGG ACCAGACGGC CACCATGTCA GGAGACTACG
651 AGGATGACCT CTGCCGGCGG GCACTCATCC TGGTCTCGGA CCTCTGTGCG
701 CGGGTCCGAG ATGCTGACAC CAACGACAGG TGCCAGGAGT TCAATGACCG
751 AATCCGAGGC TATCCCCGGG GTCCAGATGC AGACATCTCC GTGAGCCTGC
801 TGTCGGTCAT CGTGACATTC TGTGGCATTG TCCTTCTGGG TGTCTCTCTC
851 TTCGTGTCCT GGAAGTTGTG CTGGGTGCCC TGGCGGGACA AGGGAGGCTC
901 GGCAGTGGGC GGTGGCCCCC TGCGCAAAGA CCTAGGCCCT GGTGTCGGGC
951 TGGCAGGCCT GGTAGGCGGA GGCGGGCACC ACCTGGCGGC TGGCCTGGGT
1001 GGCCATCCTC TGCTGGGCGG CCCACACCAC CATGCCCATG CCGCCCACCA
1051 TCCACCCTTT GCTGAGCTGC TGGAGCCAGG CAGCCTGGGG GGTTCTGACA
1101 CCCCTGAGCC CTCCTACTTG GACATGGACT CGTATCCAGA GGCTGCAGCA
1151 GCAGCAGTGG CCGCTGGGGT CAAACCGAGC CAAACATCCC CTGAGCTGCC
1201 CTCTGAGGGG GGAGCAGGCT CTGGGTTGCT CCTGCTGCCC CCCAGTGGTG
1251 GGGGCTTGCC CAGTGCCCAG TCACATCAGC AGGTCACAAG CCTGGCACCC
1301 ACTACCAGGT ACCCAGCCCT GCCCCGACCC CTCACCCAGC AGACTCTGAC
1351 CTCCCAGCCG GACCCCAGCA GTGAGGAGCG CCCACCTGCC CTGCCCTTAC
1401 CCCTGCCTGG AGGCGAGGAA AAAGCCAAAC TCATTGGGCA GATTAAGCCA
1451 GAGCTGTACC AGGGGACTGG CCCTGGTGGC CGGCGGAGCG GTGGGGGCCC
1501 AGGCTCTGGA GAGGCAGGCA CAGGGGCACC CTGTGGCCGT ATCAGCTTCG
1551 CCCTGCGGTA CCTCTATGGC TCGGACCAGC TGGTGGTGAG GATCCTGCAG
1601 GCCCTGGACC TCCCTGCCAA GGACTCCAAC GGCTTCTCAG ACCCCTACGT
1651 CAAGATCTAC CTGCTGCCTG ACCGCAAGAA AAAGTTTCAG ACCAAGGTGC
1701 ACAGGAAGAC CCTGAACCCC GTCTTCAATG AGACGTTTCA ATTCTCGGTG
1751 CCCCTGGCCG AGCTGGCCCA ACGCAAACTG CACTTCAGCG TCTATGACTT
1801 TGACCGCTTC TCGCGGCACG ACCTCATCGG CCAGGTGGTG CTGGACAACC
1851 TCCTGGAGCT GGCCGAGCAG CCCCCTGACC GCCCGCTCTG GAGGGACATC
1901 GTGGAGGGCG GCTCGGAAAA AGCAGATCTT GGGGAGCTCA ACTTCTCACT
1951 CTGCTACCTC CCCACGGCCG GGCGCCTCAC CGTGACCATC ATCAAAGCCT
2001 CTAACCTCAA AGCGATGGAC CTCACTGGCT TCTCAGACCC CTACGTGAAG
2051 GCCTCCCTGA TCAGCGAGGG GCGGCGTCTG AAGAAGCGGA AAACCTCCAT
2101 CAAGAAGAAC ACGCTGAACC CCACCTATAA TGAGGCGCTG GTGTTCGACG
2151 TGGCCCCCGA GAGCGTGGAG AACGTGGGGC TCAGCATCGC CGTGGTGGAC
2201 TACGACTGCA TCGGGCACAA CGAGGTGATC GGCGTGTGCC GTGTGGGCCC
2251 CGACGCTGCC GACCCGCACG GCCGCGAGCA CTGGGCAGAG ATGCTGGCCA
2301 ATCCCCGCAA GCCCGTGGAG CACTGGCATC AGCTAGTGGA GGAAAAGACT
2351 GTGACCAGCT TCACAAAAGG CAGCAAAGGA CTATCAGAGA AAGAGAACTC
2401 CGAGTGAGGG GTCTGGCCTA GGCCCGGGAT CGGACCAGGC TCCCTCAGGA
2451 CCCCATCCTT TCCTGCCCGG ACCGTGAATT CATCTCCTTG AAGCCATAAC
2501 GTCCGAGCTG CTGGTGCGGG GCAGCCCTGG CCCTAGGCTT CCTAACCCTG
2551 GAAGCGAGAG GATGAGAGGA GGCCGGCCCA GCTCCTTCTT TCAGGGTGGG
2601 GGTCATTCAG CCTCCACTGT GTCTGTCTTT TCTTCCCTGG GGCTCCCCCT
2651 CGAGGCGAGG GGCCATGCAT GTCTGGGGGA CCCCTGCCCC CCAAAACCCT
2701 CTGTCTGTCT CTGTCTCTTT GCTGTTTGTC CAAGACTCAG TGTCCCGACC
2751 CTTGTTCTCG CCGTGAATGT CAATGGGCCA ATCCTCTCTG TCCTTTCAGA
2801 CACACACACA CCTGTGTCCA CCCCTTCTGT TCGCCACACC CTGCGTCTGG
2851 CCGGTCCCCC CACTGCTGCT GCTATCAACG CCAGAATAAA CACACTCTGT
2901 GGGTCTCACT CCAAAAAAAA AAAAAAAAAA A
BLAST Results
Entry MMAB813_1 from database TREMBL:
product: "synaptotagmin 3"; Mus musculus mRNA for synaptotagmin 3, complete cds.
Score = 1814, P = 5.7e-231, identities = 362/450, positives = 369/450, frame +2
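The identity and positive counts in a BLAST summary line are fractions of the aligned length, and the bracketed percentages in the HSP reports follow directly from them. A minimal sketch of that arithmetic is given below, using the counts from the summary line above (362 identities and 369 positives over 450 aligned positions). The helper name `percent` is hypothetical, and truncating rather than rounding is an assumption about how the report prints its integers.

```python
def percent(matches: int, aligned: int) -> int:
    """Integer percentage of matching positions, truncated toward zero."""
    if aligned <= 0:
        raise ValueError("aligned length must be positive")
    if not 0 <= matches <= aligned:
        raise ValueError("matches must lie between 0 and the aligned length")
    return matches * 100 // aligned


identities, positives, aligned = 362, 369, 450
print(percent(identities, aligned), percent(positives, aligned))
```

Positives can never be fewer than identities, so a positives count lower than the identity count in a damaged listing points to a misread digit.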
Medline entries
96064733:
Fukuda M, Kojima T, Aruga J, Niinobe M, Mikoshiba K. Functional diversity of C2 domains of synaptotagmin family. Mutational analysis of inositol high polyphosphate binding domain. J Biol Chem 1995 Nov 3;270(44):26523-7
Peptide information for frame 2
ORF from 635 bp to 2404 bp; peptide length: 590
Category: strong similarity to known protein
Classification: Cell signaling/communication
Prosite motifs: C2_DOMAIN_1 (323-338) C2_DOMAIN_1 (455-470)
1 MSGDYEDDLC RRALILVSDL CARVRDADTN DRCQEFNDRI RGYPRGPDAD
51 ISVSLLSVIV TFCGIVLLGV SLFVSWKLCW VPWRDKGGSA VGGGPLRKDL
101 GPGVGLAGLV GGGGHHLAAG LGGHPLLGGP HHHAHAAHHP PFAELLEPGS
151 LGGSDTPEPS YLDMDSYPEA AAAAVAAGVK PSQTSPELPS EGGAGSGLLL
201 LPPSGGGLPS AQSHQQVTSL APTTRYPALP RPLTQQTLTS QPDPSSEERP
251 PALPLPLPGG EEKAKLIGQI KPELYQGTGP GGRRSGGGPG SGEAGTGAPC
301 GRISFALRYL YGSDQLVVRI LQALDLPAKD SNGFSDPYVK IYLLPDRKKK
351 FQTKVHRKTL NPVFNETFQF SVPLAELAQR KLHFSVYDFD RFSRHDLIGQ
401 VVLDNLLELA EQPPDRPLWR DIVEGGSEKA DLGELNFSLC YLPTAGRLTV
451 TIIKASNLKA MDLTGFSDPY VKASLISEGR RLKKRKTSIK KNTLNPTYNE
501 ALVFDVAPES VENVGLSIAV VDYDCIGHNE VIGVCRVGPD AADPHGREHW
551 AEMLANPRKP VEHWHQLVEE KTVTSFTKGS KGLSEKENSE
BLASTP hits
No BLASTP hits available
Alert BLASTP hits for DKFZphamy2_2ol3, frame 2
TREMBL:MMAB813_1 product: "synaptotagmin 3"; Mus musculus mRNA for synaptotagmin 3, complete cds., N = 2, Score = 1814, P = 1.1e-231
>TREMBL:MMAB813_1 product: "synaptotagmin 3"; Mus musculus mRNA for synaptotagmin 3, complete cds.
Length = 587
HSPs:
Score = 1814 (272.2 bits), Expect = 1.1e-231, Sum P(2) = 1.1e-231
Identities = 362/449 (80%), Positives = 369/449 (82%)
(Query: 142 FAELLEPGSLGGSDTPEPSYLDMDSYPEXXXXXX- XXGVKPSiQTXXXXXXXXXXXXXXXX 200
FAELLEPG LGGS+ PEPSYLDMDSYPE GVKPSiQT Sbjct: 143
FAELLEPGGLGGSELPEPSYLDMDSYPEAAVASVVAAGVKPSlQTSPELPSEGGTGSGLLL 202
(Query : 201
XXXXXXXXXXX(QSH(Q(QVTSLAPTTRYPALPRPLT(Q(QTLTS(QPDXXXXXXXXXXXXXXXXX 2b0 (QSH(QιQVTSLAPTTRYPALPRPLT(Q(QTLT+ιQ D
Sb j ct : 203 LPPSGGGLPSA(QSH(Q(QVTSLAPTTRYPALPRPLT(Q(QTLTT(QADPSTEERPPALPLPLPGG 2b2 (Query: 2bl XXKAKLIG(QIKPELY(QXXXXXXXXXXXXXXXXXXXXXXPCGRISFALRYLYGSD(QLVVRI 320
KAKLIG(QIKPELY(Q PCGRISFALRYLYGSDlQLVVRI
Sbjct: 2b3 EEKAKLIGIQIKPELYIQGTGPGGRRGGGSGEAGA
PCGRISFALRYLYGSDlQLVVRI 317
(Query: 321 LIQALDLPAKDSNGFSDPYVKIYLLPDRKKKFIQTKVHRKTLNPVFNETFIQFSVPLAELAIQR 360
LlQALDLPAKDSNGFSDPYVKIYLLPDRKKKFiQTKVHRKTLNP + FNETFlQFSVPLAELAiQR Sbjct: 316 LiQALDLPAKDSNGFSDPYVKIYLLPDRKKKFlQTKVHRKTLNPIFNETFiQFSVPLAELAiQR 377 (Query: 361
KLHFSVYDFDRFSRHDLIGiQVVLDNLLELAEiQPPDRPLURDIVEGGSEKADLGELNFSLC 440
KLHFSVYDFDRFSRHDLIGiQVVLDNLLELAEiQPPDRPLURDI+EGGSEKADLGELNFSLC Sbjct: 376 KLHFSVYDFDRFSRHDLIGUVVLDNLLELAEiQPPDRPLURDILEGGSEKADLGELNFSLC 437
(Query: 441 YLPTAGRLTVTIIKASNLKAMDLTGFSDPYVKASLISEGRRLKKRKTSIKKNTLNPTYNE 500 YLPTAGRLTVTIIKASNLKAMDLTGFSDPYVKASLISEGRRLKKRKTSIKKNTLNPTYNE Sbjct: 436 YLPTAGRLTVTIIKASNLKAMDLTGFSDPYVKASLISEGRRLKKRKTSIKKNTLNPTYNE 417
(Query: 501 ALVFDVAPESVENVGLSIAVVDYDCIGHNEVIGVCRVGPDAADPHGREHUAEMLANPRKP SbO
ALVFDVAPESVENVGLSIAVVDYDCIGHNEVIGVCRVGP+AADPHGREHUAEMLANPRKP Sbjct: 416
ALVFDVAPESVENVGLSIAVVDYDCIGHNEVIGVCRVGPEAADPHGREHUAEMLANPRKP 557
(Query: 5bl VEHUHύLVEEKTVTSFTKGSKGLSEKENSE 510
VEHUH(QLVEEKT++SFTKG KGLSEKENSE Sbjct: 5S6 VEHUHlQLVEEKTLSSFTKGGKGLSEKENSE 567 Score = S20 (78-0 bits)ι Expect = l-le-231ι Sum P(2) = l-le-231 Identities = 18/100 (18*)ι Positives = 11/100 (11*)
(Query: 1 MSGDYEDDLCRRALILVSDLCARVRDADTNDRCiQEFND- RIRGYPRGPDADISVSLLSVI 51 MSGDYEDDLCRRALILVSDLCARVRDADTNDRC(QEFN+
RIRGYPRGPDADISVSLLSVI Sbjct: 1 MSGDYEDDLCRRALILVSDLCARVRDADTNDRCiQEFNELRIRGYPRGPDADISVSLLSVI bO (Query: t,0 VTFCGIVLLGVSLFVSUKLCUVPURDKGGSAVGGGPLRKD 11
VTFCGIVLLGVSLFVSUKLCUVPURDKGGSAVGGGPLRKD Sbjct: bl VTFCGIVLLGVSLFVSUKLCUVPURDKGGSAVGGGPLRKD 100
Pedant information for DKFZphamy2_2ol3, frame 2
Report for DKFZphamy2_2ol3.2
[LENGTH] 590
[MW] 63304.02
[pI] __ _ ____
[HOMOL] TREMBL:MMAB813_1 product: "synaptotagmin 3"; Mus musculus mRNA for synaptotagmin 3, complete cds. 0.0
[FUNCAT] 99 unclassified proteins [S. cerevisiae, YML072c] 6e-10
[FUNCAT] 01.06.01 lipid, fatty-acid and sterol biosynthesis [S. cerevisiae, YGR170w] 7e-06
[FUNCAT] 30.08 organization of golgi [S. cerevisiae, YGR170w] 7e-06
[BLOCKS] BL01224A N-acetyl-gamma-glutamyl-phosphate reductase proteins
[BLOCKS] BL01013B Oxysterol-binding protein family proteins
[BLOCKS] PF01368B
[SCOP] d1a25a_ 2.6.1.2.2 C2 domain from protein kinase c (beta) [Ra 2e-27
[SCOP] d1rsy 2.6.1.2.1 Synaptotagmin I, first C2 domain [Rat (Rattu 4e-43
[SCOP] d1rlw 2.6.1.1.2 A domain from cytosolic phospholipase A2 [Huma 5e-12
[SCOP] d1qasb2 2.6.1.1.1 Phosphoinositide-specific phospholipase C 4e-27
[PIRKW] phosphotransferase 7e-15
[PIRKW] duplication 6e-76
[PIRKW] synaptic vesicle 1e-167
[PIRKW] phorbol ester binding 2e-14
[PIRKW] zinc 2e-14
[PIRKW] transmembrane protein 0.0
[PIRKW] serine/threonine-specific protein kinase 7e-15
[PIRKW] membrane trafficking 0.0
[PIRKW] phospholipid binding 6e-76
[PIRKW] autophosphorylation 7e-15
[PIRKW] ATP 7e-15
[PIRKW] phosphoprotein 7e-15
[PIRKW] glycoprotein 1e-167
[PIRKW] calcium binding 5e-34
[PIRKW] alternative splicing 1e-10
[PIRKW] dimer 1e-75
[PIRKW] membrane protein 1e-167
[PIRKW] calmodulin binding 2e-74
[SUPFAM] ras-specific GAP catalytic domain homology 1e-08
[SUPFAM] protein kinase C zinc-binding repeat homology 7e-15
[SUPFAM] protein kinase homology 7e-15
[SUPFAM] protein kinase C alpha 7e-15
[SUPFAM] HsC2 phosphatidylinositol 3-kinase 1e-01
[SUPFAM] synaptotagmin 0.0
[SUPFAM] PX domain homology 1e-01
[SUPFAM] pleckstrin repeat homology 1e-08
[SUPFAM] protein kinase C C2 region homology 0.0
[PROSITE] C2_DOMAIN_1 2
[PFAM] C2 domain
[KW] Irregular
[KW] 3D
[KW] LOW_COMPLEXITY 20.00 %
SEQ MSGDYEDDLCRRALILVSDLCARVRDADTNDRCQEFNDRIRGYPRGPDADISVSLLSVIV
SEG
Irsy-
SElQ TFCGIVLLGVSLFVSUKLCUVPURDKGGSAVGGGPLRKDLGPGVGLAGLVGGGGHHLAAG
SEG xxxxxxxxxxxxxxxxxxxxx
Irsy-
SElQ LGGHPLLGGPHHHAHAAHHPPFAELLEPGSLGGSDTPEPSYLDMDSYPEAAAAAVAAGVK
SEG xxxxxxxxxxxxxxxxxxxxx xxxxxxxx- ■ ■
Irsy-
SElQ PSlQTSPELPSEGGAGSGLLLLPPSGGGLPSAiQSHiQlQVTSLAPTTRYPALPRPLTiQiQTLTS
SEG - • ■ ■ xxxxxxxxxxxxxxxxxxxxxxxxxxx
Irsy-
SElQ (QPDPSSEERPPALPLPLPGGEEKAKLIGfllKPELYiQGTGPGGRRSGGGPGSGEAGTGAPC
SEG ■ ■ ■ xxxxxxxxxxxxxxxxxxx xxxxxxxxxxxxxxxxxxxxxx ■ ■
Irsy-
SElQ GRISFALRYLYGSDiQLVVRILiQALDLPAKDSNGFSDPYVKIYLLPDRKKKFiQTKVHRKTL SEG
Irsy- CEEEEEEEEETTTTEEEEEEEEEECCCCCBTTTBBCEEEEEEEETTTTTTEECCCTTTBT
SElQ NPVFNETF<QFSVPLAELAlQRKLHFSVYDFDRFSRHDLIG(QVVLDNLLELAE<QPPDRPLUR SEG
Irsy- TTEEEEEEEECCCHHHHHCCEEEEEEEECTTTTCCEEEEE
SElQ DIVEGGSEKADLGELNFSLCYLPTAGRLTVTIIKASNLKAMDLTGFSDPYVKASLISEGR
SEG
Irsy-
SE(Q RLKKRKTSIKKNTLNPTYNEALVFDVAPESVENVGLSIAVVDYDCIGHNEVIGVCRVGPD
SEG
Irsy-
SElQ AADPHGREHUAEMLANPRKPVEHUHiQLVEEKTVTSFTKGSKGLSEKENSE
SEG lrsy- ■
Prosite for DKFZphamy2_2ol3.2
PS00499 323->339 C2_DOMAIN_1 PDOC00380
PS00499 455->471 C2_DOMAIN_1 PDOC00380
Pfam for DKFZphamy2_2ol3.2
HMM_NAME C2 doma i n
HMM *LtVrIIeARNLUkMDMnGfSDPYVKVdMdPdpkDtkKUKTkTi UNNGLN
L+VRI++A +L+++D+NGFSDPYVK++++PD+K KK++TK+++++ LN
(Query 31b LVVRILiQALDLPAKDSNGFSDPYVKIYLLPDRK —
KKFiQTKVHRKT-LN 3bl
HMM PVUNEEeFvF edlPyPd l qrkMLRFaVUDUDRFSRBDFIGHC i * PV+N E+F+F +P+ +L+ + L+F+V+D+DRFSR+D+IG+++ ώuery 3b2 PVFN-ETFiQFS-VPLAELA(QRKLHFSVYDFDRFSRHDLIG(QVV 402
HMM *LtVrIIeARNLUkMDMnGf SDPYVKVdMdPdpkDtkKUKTkT iUNNGLN
LTV+II+A NL++MD +GFSDPYVK +++ + +++KK+KT++++N+ LN (Query 446
LTVTIIKASNLKAMDLTGFSDPYVKASLISEGRRLKKRKTSIKKNT-LN 415
HMM PVUNEEeFvFedlPyPdlqrkMLRFaVUDUDRFSRBDFIGHCi*
P++N E +VF+ ++ ++ +++ L +AV D+D+++++++IG+C+ (Query 41b PTYN-EALVFD-VAPESVENVGLSIAVVDYDCIGHNEVIGVCR
53b
DKFZphamy2_7j5
group: differentiation/development
DKFZphamy2_7j5 encodes a novel 693 amino acid protein with similarity to Tspyl1, the testis-specific Y-encoded-like protein of Mus musculus. TSPY genes are arranged in clusters on the Y chromosome of many mammalian species. TSPY is believed to function in early spermatogenesis and is a candidate for GBY, the putative gonadoblastoma-inducing gene on the Y. The TSPY family forms part of a superfamily, TTSN, with autosomal representatives, highly conserved in mammals and beyond.
The new protein can find application in studying the expression profile of testis- and brain-specific genes and in the diagnosis and therapy of impaired male fertility.
HRIHFB2216, similarity to Y-linked gene of Mus musculus
Sequenced by BMFZ
Locus: unknown
Insert length: 2819 bp
Poly A stretch at pos. 2800, polyadenylation signal at pos. 2779
1 AGGAGAGCTG GTTGCGTGAG TCTCCTCAGC TCTGCTTACC GGTGCGACTA
51 GCGGCkGCGk CGCGGCTAAA AGCGAAGGGG CGAGTGCGAG TCCCCTGAGC
101 TGTACGAACG CGGTCGCCAT GGACCGCCCA GATGAGGGGC CTCCGGCCAA
151 GACCCGCCGC CTGAGCAGCT CCGAGTCTCC ACAGCGCGAC CCGCCCCCGC
201 CGCCGCCGCC GCCGCCGCTC CTCCGACTGC CGCTGCCTCC ACCCCAGCAG
251 CGCCCGAGGC TCCAGGAGGA AACGGAGGCG GCACAGGTGC TGGCCGATAT
301 GAGGGGGGTG GGACTGGGCC CCGCGCTGCC CCCGCCGCCT CCCTATGTCA
351 TTCTCGAGGA GGGGGGGATC CGCGCATACT TCACGCTCGG TGCTGAGTGT
401 CCCGGCTGGG ATTCTACCAT CGAGTCGGGG TATGGGGAGG CGCCCCCGCC
451 CACGGAGAGC CTGGAAGCAC TCCCCACTCC TGAGGCCTCG GGGGGGAGCC
501 TGGAAATCGA TTTTCAGGTT GTACAGTCGA GCAGTTTTGG TGGAGAGGGG
551 GCCCTAGAAA CCTGTAGCGC AGTGGGGTGG GCGCCCCAGA GGTTAGTTGA
601 CCCGAAGAGC AAGGAAGAGG CGATCATCAT AGTGGAGGAT GAGGATGAGG
651 ATGAGCGGGA GAGTATGAGG AGCAGCAGGA GGCGGCGGCG GCGGCGGAGG
701 AGGAAGCAGA GGAAGGTGAA GAGGGAAAGC AGAGAGAGAA ATGCCGAGAG
751 GATGGAGAGC ATCCTGCAGG CACTGGAGGA TATTCAGCTG GATCTGGAGG
801 CAGTGAACAT CAAGGCAGGC AAAGCCTTCC TGCGTCTCAA GCGCAAGTTC
851 ATCCAGATGC GAAGACCCTT CCTGGAGCGC AGAGACCTCA TCATCCAGCA
901 TATCCCAGGC TTCTGGGTCA AAGCATTCCT CAACCACCCC AGAATTTCAA
951 TTTTGATCAA CCGACGTGAT GAAGACATTT TCCGCTACTT GACCAATCTG
1001 CAGGTACAGG ATCTCAGACA TATCTCCATG GGCTACAAAA TGAAGCTGTA
1051 CTTCCAGACT AACCCCTACT TCACAAACAT GGTGATTGTC AAGGAGTTCC
1101 AGCGCAACCG CTCAGGCCGG CTGGTGTCTC ACTCAACCCC AATCCGCTGG
1151 CACCGGGGCC AGGAACCCCA GGCCCGTCGT CACGGGAACC AGGATGCGAG
1201 CCACAGCTTT TTCAGCTGGT TCTCAAACCA TAGCCTCCCA GAGGCTGACA
1251 GGATTGCTGA GATTATCAAG AATGATCTGT GGGTTAACCC TCTACGCTAC
1301 TACCTGAGAG AAAGGGGCTC CAGGATAAAG AGAAAGAAGC AAGAAATGAA
1351 GAAACGTAAA ACCAGGGGCA GATGTGAGGT GGTGATCATG GAAGACGCCC
1401 CTGACTATTA TGCAGTGGAA GACATTTTCA GCGAGATCTC AGACATTGAT
1451 GAGACAATTC ATGACATCAA GATCTCTGAC TTCATGGAGA CCACCGACTA
1501 CTTCGAGACC ACTGACAATG AGATAACTGA CATCAATGAG AACATCTGCG
1551 ACAGCGAGAA TCCTGACCAC AATGAGGTCC CCAACAACGA GACCACTGAT
1601 AACAACGAGA GTGCTGATGA CCACGAAACC ACTGACAACA ATGAGAGTGC
1651 AGATGACAAC AACGAGAATC CTGAAGACAA TAACAAGAAC ACTGATGACA
1701 ACGAAGAGAA CCCTAACAAC AACGAGAACA CTTACGGCAA CAACTTCTTC
1751 AAAGGTGGCT TCTGGGGCAG CCATGGCAAC AACCAGGACA GCAGCGACAG
1801 TGACAATGAA GCAGATGAGG CCAGTGATGA TGAAGATAAT GATGGCAACG
1851 AAGGTGACAA TGAGGGCAGT GATGATGATG GCAATGAAGG TGACAATGAA
1901 GGCAGCGATG ATGACGACAG AGACATTGAG TACTATGAGA AAGTTATTGA
1951 AGACTTTGAC AAGGATCAGG CTGACTACGA GGACGTGATA GAGATCATCT
2001 CAGACGAATC AGTGGAAGAA GAGGGCATTG AGGAAGGCAT CCAGCAAGAT
2051 GAGGACATCT ATGAGGAAGG AAACTATGAG GAGGAAGGAA GTGAAGATGT
2101 CTGGGAAGAA GGGGAAGATT CGGACGACTC TGACCTAGAG GATGTGCTTC
2151 AGGTCCCAAA CGGTTGGGCC AATCCGGGGA AGAGGGGGAA AACCGGATAA
2201 GGGTTTTCCC CTTTTGGGGA TCACCTCTCT GTATCCCCCA CCCACTATCC
2251 CATTTGCCCT CCTCCTCAGC TAGGGCCACG CGGCCCCACA TTGCACTTCT
2301 GGGGGGTGAC CGACTTCGTA CACGGGTTTA AAGTTTATTT TTATGGTTTA
2351 GTCATTGCAG AGTTCTTATT TTGGGGGGAG GGAAAGGGGG CTAGTCCCCT
2401 TCTTTTGGCC CTCCGCCCCC GCAGGCTTCT GTGTGCTGCT AACTGTATTT
2451 ATTGTGATGC CTTGGTCAGG GCCCCTCTAC CCACTTCTCC CAGTCAGTTG
2501 TGGCCCCAGC CCCTCTCCCT GTGCTGTGTG GAGTGGACAC CCTGACCCCC
2551 GAAGCGGGGA GGGCCGCTGT GGCCTTCGTC ACAGCCGCGC AGTGCCCATG
2601 GAGGCGCTGC TGCCACCTTC CTCTCCCAAG TTCTTTCTCC ATCCCTCTCC
2651 TCTTCCCGCC GCGCCGCTAG CCCGCCTCGG TGTCTATGCA AGGCCGCTTC
2701 GCCATTGCGG TATTCTTTGC GGTATTCTTG TCCCCGTCCC CCAGAAGGCT
2751 CGCCTCTCCC CGTGGACCCT GTTAATCCCA ATAAAATTCT GAGCAAGTTT
2801 AAAAAAAAAA AAAAAAAAA
BLAST Results
No BLAST result
Medline entries
163118b4: Vogel T, Dittrich O, Mehraein Y, Dechend F, Schnieders F, Schmidtke J. Murine and human TSPYL genes: novel members of the TSPY-SET-NAP1L1 family. Cytogenet Cell Genet 1998;81(3-4):265-70
Peptide information for frame 2
ORF from 119 bp to 2197 bp; peptide length: 693
Category: similarity to known protein
Classification: unclassified
1 MDRPDEGPPA KTRRLSSSES PQRDPPPPPP PPPLLRLPLP PPQQRPRLQE
51 ETEAAQVLAD MRGVGLGPAL PPPPPYVILE EGGIRAYFTL GAECPGWDST
101 IESGYGEAPP PTESLEALPT PEASGGSLEI DFQVVQSSSF GGEGALETCS
151 AVGWAPQRLV DPKSKEEAII IVEDEDEDER ESMRSSRRRR RRRRRKQRKV
201 KRESRERNAE RMESILQALE DIQLDLEAVN IKAGKAFLRL KRKFIQMRRP
251 FLERRDLIIQ HIPGFWVKAF LNHPRISILI NRRDEDIFRY LTNLQVQDLR
301 HISMGYKMKL YFQTNPYFTN MVIVKEFQRN RSGRLVSHST PIRWHRGQEP
351 QARRHGNQDA SHSFFSWFSN HSLPEADRIA EIIKNDLWVN PLRYYLRERG
401 SRIKRKKQEM KKRKTRGRCE VVIMEDAPDY YAVEDIFSEI SDIDETIHDI
451 KISDFMETTD YFETTDNEIT DINENICDSE NPDHNEVPNN ETTDNNESAD
501 DHETTDNNES ADDNNENPED NNKNTDDNEE NPNNNENTYG NNFFKGGFWG
551 SHGNNQDSSD SDNEADEASD DEDNDGNEGD NEGSDDDGNE GDNEGSDDDD
601 RDIEYYEKVI EDFDKDQADY EDVIEIISDE SVEEEGIEEG IQQDEDIYEE
651 GNYEEEGSED VWEEGEDSDD SDLEDVLQVP NGWANPGKRG KTG
BLASTP hits
No BLASTP hits available
Alert BLASTP hits for DKFZphamy2_7j5, frame 2
TREMBL:AB015345_1 gene: "HRIHFB2216"; Homo sapiens HRIHFB2216 mRNA, partial cds., N = 4, Score = 1393, P = 2.1e-165
TREMBL:HSDJ486I3_2 gene: "dJ486I3.2"; product: "dJ486I3.2 (KIAA0721 (NAP (Nucleosome Assembly Protein) domain containing protein))"; Human DNA sequence from clone 486I3 on chromosome 6q22.1-22.3. Contains the part of a gene for a novel protein, the gene for KIAA0721 (NAP (Nucleosome Assembly Protein) domain containing protein), the TSPYL gene for TSPY-like (testis specific protein, Y-linked like), and an RPS5 (40S Ribosomal Protein S5) pseudogene. Contains ESTs, STSs, GSSs and two putative CpG islands., N = 1, Score = 570, P = 3.4e-55
>TREMBL:AB015345_1 gene: "HRIHFB2216"; Homo sapiens HRIHFB2216 mRNA, partial cds.
Length = 486
HSPs:
Score = 1393 (209.0 bits), Expect = 2.1e-165, Sum P(4) = 2.1e-165
Identities = 268/295 (90%), Positives = 268/295 (90%)
(Query : 206 NAERMESILiQALEDIiQLDLEAVNIKAGKAFLRLKRKFIiQMRRPFLERRDLIIlQHIPGFUV 2b7 NAERMESILlQALEDIlQLDLEAVNIKAGKAFLRLKRKFIiQMRRPFLERRDLIIlQHIPGFUV Sbjct: 1
NAERMESILlQALEDIiQLDLEAVNIKAGKAFLRLKRKFIiQMRRPFLERRDLIIiQHIPGFUV bO
(Query: Ebβ KAFLNHPRISILINRRDEDIFRYLTNL(QV(QDLRHISMGYKMKLYF(QTNPYFTNMVIVKEF 327
KAFLNHPRISILINRRDEDIFRYLTNL(QV(QDLRHISMGYKMKLYF(QTNPYFTNMVIVKEF Sbjct: bl
KAFLNHPRISILINRRDEDIFRYLTNLiQVlQDLRHISMGYKMKLYFiQTNPYFTNMVIVKEF 120
(Query: 328
(QRNRSGRLVSHSTPIRUHRGiQEPiQARRHGNlQDAXXXXXXXXXXXXLPEADRIAEIIKNDL 367 (QRNRSGRLVSHSTPIRUHRGiQEPiQARRHGNlQDA
LPEADRIAEIIKNDL Sbjct: 121 (QRNRSGRLVSHSTPIRUHRGlQEPiQARRHGNlQDASHSFFSUFSNHSLPEADRIAEIIKNDL 160 (Query: 366
UVNPLRYYLRERGSXXXXXXXXXXXXXXXGRCEVVIMEDAPDYYAVEDIFSEISDIDETI 447 UVNPLRYYLRERGS
GRCEVVIMEDAPDYYAVEDIFSEISDIDETI
Sbjct: 161 UVNPLRYYLRERGSRIKRKKlQEMKKRKTRGRCEVVIMEDAPDYYAVEDIFSEISDIDETI 240
(Query: 446 HDIKISDFMETTDYFETTDNEITDINENICDSENPDHNEVPNNETTDNNESADDH 502 HDIKISDFMETTDYFETTDNEITDINENICDSENPDHNEVPNNETTDNNESADDH Sbjct: Ξ41 HDIKISDFMETTDYFETTDNEITDINENICDSENPDHNEVPNNETTDNNESADDH 215
Score = 117 (17-b bits)ι Expect = 1-Oe-llι Sum P(4) = 1-Oe-ll Identities = 32/77 (41*)ι Positives = 44/77 (S7*)
(Query: 42b
DAPDYYAVEDIFSEISDIDETIHDIKISDFMETTDYFETTDNEITDINENICDSENPDHN 465 + DY+ D +EI+DI+E I D E D+ E +NE TD NE+ D E D+N
Sbjct: 2S0 ETTDYFETTD—NEITDINENICD
SENPDHNEVPNNETTDNNESADDHETTDNN 301
(Query: 4βb EVP--NNETT-DNNESADDH 502 E NNE DNN++ DD+
Sbjct: 302 ESADDNNENPEDNNKNTDDN 321
Score = 14 (14-1 bits)ι Expect = 2-le-lb5ι Sum P(4) = 2-le-lbS Identities = lb/lb (lDO*)ι Positives = lb/lb (100*)
(Query: b7β (QVPNGUANPGKRGKTG b13
(QVPNGUANPGKRGKTG Sbjct: 471 (QVPNGUANPGKRGKTG 4δb Score = 10 (13-5 bits)ι Expect = 1-le-lbι Sum P(4) = 1-le-lb Identities = 34/65 (40*)ι Positives = 45/65 (52*) (Query: 42b DAPDYYAVEDIFSEISDIDETIHDIKISDFME TTDYFETTDN-
EITDINENICDS 471
+ DY+ D +EI+DI+E I D + D E TTD E+ D+ E TD NE+ D+ Sbjct: 25D ETTDYFETTD—
NEITDINENICDSENPDHNEVPNNETTDNNESADDHETTDNNESADDN 307
(Query: 460 -ENPDHN EVPNN-ETTDNN 41b
ENP+ N E PNN E T N Sbjct: 306 NENPEDNNKNTDDNEENPNNNENTYGN 334
Score = 87 (13-1 bits)ι Expect = 2-le-lbSι Sum P(4) = 2-le-lbS Identities = 14/14 (100*)ι Positives = 14/14 (100*) (Query: 543 FFKGGFUGSHGNNiQ 55b
FFKGGFUGSHGNNiQ Sbjct: 33b FFKGGFUGSHGNNiQ 341
Score = 85 (12-6 bits)ι Expect = 2-le-lbSι Sum P(4) = 2-le-lbS Identities = lb/16 (86*) i Positives = 17/16 (14*)
(Query: bOl RDIEYYEKVIEDFDKDiQA blδ
RDIEYYEK IEDFD+D<QA Sbjct: 314 RDIEYYEKGIEDFDRDlQA 411
Score = bO (1-0 bits)ι Expect = S-3e-D3ι Sum P(4) = 5-3e-03 Identities = 21/bb (31*)ι Positives = 33/bb (SD*)
(Query : 42b DAPDYYAVEDIFSEISDIDETIHD-IKIS- DFMETTDYFETTDNEITDINENICDSENPD 463
D DY V +1 S+ S +E I + 1+ D E +Y E ++ + E+
D S+ D
Sb jct : 4D1
DiQADYEDVIEIISDESVEEEGIEEGIlQlQDEDIYEEGNYEEEGSEDVUEEGEDSDDSDLED 4bδ
(Query : 484 HNEVPN 4δ1
+ VPN
Sbjct: 4b1 VLlQVPN 474 Score = 41 (7-4 bits)ι Expect = l-4e-0bι Sum P(4) = l-4e-Db Identities = 12/35 (34*)ι Positives = 21/3S (bO*)
(Query: 4b3 ETTDNEITDINENICDSENPDHNEVPNNETTDNNE 417 E +D+E D NE + + D NE +NE +D+++ Sbjct: 3b0 EASDDEDNDGNEGDNEGSDDDGNE-GDNEGSDDDD 313
Score = 42 (b-3 bits)ι Expect = 7-2e-0bι Sum P(4) = 7-2e-Db Identities = 11/37 (21*)ι Positives = lδ/37 (4δ*) (Query: 4b5 TDNEITDINENICDSENPDHNEVPNNETTDNNESADD SOI
+DNE + + D E+ D NE N + D+ D+ Sbjct: 354 SDNEADEAS DDEDNDGNEGDNEGSDDDGNEGDN 3δb
Pedant information for DKFZphamy2_7j5, frame 2
Report for DKFZphamy2_7j5.2
[LENGTH] 693
[MW] 79435.07
[pI] 4.45
[HOMOL] TREMBL:AB015345_1 gene: "HRIHFB2216"; Homo sapiens HRIHFB2216 mRNA, partial cds. 1e-171
[FUNCAT] 06.10 assembly of protein complexes [S. cerevisiae, YKR048c] 4e-05
[FUNCAT] 03.22 cell cycle control and mitosis [S. cerevisiae, YKR048c] 4e-05
[FUNCAT] 03.04 budding, cell polarity and filament formation [S. cerevisiae, YKR048c] 4e-05
[FUNCAT] 09.13 biogenesis of chromosome structure [S. cerevisiae, YKR048c] 4e-05
[FUNCAT] 30.10 nuclear organization [S. cerevisiae, YKR048c] 4e-05
[BLOCKS] BP02646H
[BLOCKS] BP02646E
[BLOCKS] PF00424A
[BLOCKS] BL00415N Synapsins proteins
[BLOCKS] BP02711E
[BLOCKS] BL00048 Protamine P1 proteins
[BLOCKS] PR00041D
[BLOCKS] PF00956D
[BLOCKS] PF00956C
[BLOCKS] PF00956B
[PIRKW] nucleus 8e-33
[PIRKW] phosphoprotein 8e-33
[PIRKW] alternative splicing 8e-33
[KW] Alpha_Beta
[KW] LOW_COMPLEXITY 35.35 %
SElQ MDRPDEGPPAKTRRLSSSESPiQRDPPPPPPPPPLLRLPLPPPiQlQRPRLlQEETEAAiQVLAD
SEG xxxxxxxxxxxxxxxxxxxxxxxxxxxxx
PRD ccccccccccccccccccccccccccccccccccccccccccchhhhhhhhhhhhhhhhh
SElQ MRGVGLGPALPPPPPYVILEEGGIRAYFTLGAECPGUDSTIESGYGEAPPPTESLEALPT SEG - - ■ -xxxxxxxxxxx
PRD ccccceeeeccccccccccccccceeeeccccccccccceeecccccccccchhhhhhhh
SElQ PEASGGSLEIDFlQVViQSSSFGGEGALETCSAVGUAPlQRLVDPKSKEEAIIIVEDEDEDER
SEG xxxxxxxx PRD hcccccccccceeeeecccccchhhhhhhhccccccccccccchhhhhhhhhhhhhhhhh
SElQ ESMRSSRRRRRRRRRKiQRKVKRESRERNAERMESILiQALEDIlQLDLEAVNIKAGKAFLRL
SEG xxxxxxxxxxxxxxxxxxxxxxxxxxx
PRD hhhhhhhhhhhhhhhhhhhhhhhhhccchhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhh
SElQ KRKFIlQMRRPFLERRDLIIiQHIPGFUVKAFLNHPRISILINRRDEDIFRYLTNLiQVlQDLR
SEG
PRD hhhhhhhhhhhhhhhhhhhhhccccceeeccccccchhhhhccchhhhhhhhhhhhhhhc SElQ HISMGYKMKLYF(QTNPYFTNMVIVKEF(QRNRSGRLVSHSTPIRUHRG<QEP(QARRHGN(QDA
SEG
PRD cccccceeeeeeccccccchhhhhhhcccccccceeeccccccccccccccccccccccc SElQ SHSFFSUFSNHSLPEADRIAEIIKNDLUVNPLRYYLRERGSRIKRKKlQEMKKRKTRGRCE
SEG xxxxxxxxxxxx ■. xxxxxxxxxxxxxxx- ■ ■ •
PRD cccceeeccccccccchhhhhhhhhhhhcccchhhhhhhhhhhhhhhcceeeeecccccc SElQ VVIMEDAPDYYAVEDIFSEISDIDETIHDIKISDFMETTDYFETTDNEITDINENICDSE
SEG
PRD eeeccccccceeehhhhhhhhhhccccccceeeeeccccccccccchhhhhhhhcccccc
SElQ NPDHNEVPNNETTDNNESADDHETTDNNESADDNNENPEDNNKNTDDNEENPNNNENTYG SEG xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
PRD ccccccceeeecccccccccccccccccchhhhhcccccceeeeeccccccccccccccc
SElQ NNFFKGGFUGSHGNNlQDSSDSDNEADEASDDEDNDGNEGDNEGSDDDGNEGDNEGSDDDD
SEG XX xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx PRD cccccccccccccccccccccccccccccccccccccccccccccccccccccccccccc
SElQ RDIEYYEKVIEDFDKDlQADYEDVIEIISDESVEEEGIEEGIiQiQDEDIYEEGNYEEEGSED
SEG xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
PRD cchhhhhhhhhhhcccccchhhhheeeecccccccccccccccccceeecccccccccce
SElQ VUEEGEDSDDSDLEDVLiQVPNGUANPGKRGKTG SEG xxxxxxxxxxxxxxxxx
PRD eeecccccccccceeeeeccccccccccccccc
(No Prosite data available for DKFZphamy2_7j5.2)
(No Pfam data available for DKFZphamy2_7j5.2)
Pedant information for DKFZphamy2_7j5, frame 3
Report for DKFZphamy2_7j5.3
[LENGTH] 150
[MW] 16810.61
[pI] 12.88
[BLOCKS] PR00308A
[KW] All_Alpha
[KW] LOW_COMPLEXITY 61.33 %
SElQ MRTSATARILTTMRSPTTRPLITTRVLMTTKPLTTMRViQMTTTRILKTITRTLMTTKRTL
SEG xxxxxxxxxxxxxxxxxxxxx
PRD ccchhhhhhhhhhcccccccccceeeeccccccchhhhhhhhhhhhhhhhhhhccccccc
SElQ TTTRTLTATTSSKVASGAAMATTRTAATVTMKiQMRPVMMKIMMATKVTMRAVMMMAMKVT SEG xxxxxxxxxx xxxxxxxxxxxxx xxxxxxxxxxxxxxxxxxx
PRD ccccceeecccccccchhhhhhhhhhhhhhhhhchhhhhhhhhhhhhhhhhhhhhhhhhh
SElQ MKAAMMTTETLSTMRKLLKTLTRIRLTTRT
SEG xxxxxxxx- xxxxxxxxxxxxxxxxxxxxx PRD hhhhhhhhhhhhhhhhhhhhhhhhhhhccc
(No Prosite data available for DKFZphamy2_7j5.3)
(No Pfam data available for DKFZphamy2_7j5.3)
DKFZphfbr2_78cl2
group: nucleic acid management
DKFZphfbr2_78cl2 encodes a novel 528 amino acid protein with high similarity to glutamyl-tRNA (Gln) amidotransferase subunit A of the hyperthermophilic bacterium Aquifex aeolicus.
The novel protein contains one ATP/GTP-binding site motif A (P-loop). This loop interacts with one of the phosphate groups of an A or G nucleotide. It is found in numerous ATP- or GTP-binding proteins, such as ATP synthase alpha and beta subunits, myosin heavy chains, kinesin heavy chains and kinesin-like proteins, dynamins and dynamin-like proteins, several kinases, DNA and RNA helicases, GTP-binding elongation factors and the Ras family of GTP-binding proteins. The protein seems to be expressed ubiquitously.
The new protein can find application in the modulation of translational pathways.
similarity to glutamyl-tRNA (Gln) amidotransferase subunit A (Aquifex aeolicus)
Sequenced by MediGenomix
Locus: /map="686.3 cR from top of Chr6 linkage group"
Insert length: 3244 bp
Poly A stretch at pos. 3222, polyadenylation signal at pos. 3204
1 AGTGACAATT AAAGATGGCT GCGCCCATGT AACATCACTA GCGACCGGTG
51 ACCTCTTTTT CCCCCTTGCC TGGCTCCTGT GGTGGCAGGC TGGGCACGAG
101 GACCATGCTG GGCCGGAGCC TCCGAGAAGT TTCTGCGGCA CTGAAACAAG
151 GCCAAATTAC ACCAACAGAG CTCTGTCAAA AATGTCTCTC TCTTATCAAG
201 AAGGCCAAGT TTCTAAATGC CTACATTACT GTGTCAGAAG AGGTGGCCTT
251 AAAACAAGCT GAAGAATCAG AAAAGAGATA TAAGAATGGA CAGTCACTTG
301 GGGATTTAGA TGGAATTCCT ATTGCAGTAA AAGACAATTT CAGCACTTCT
351 GGCATTGAGA CAACATGTGC ATCAAATATG CTGAAAGGTT ATATACCACC
401 TTATAATGCT ACAGTAGTTC AGAAGTTGTT GGATCAGGGA GCTCTACTAA
451 TGGGAAAAAC AAATTTAGAT GAGTTTGCTA TGGGATCTGG GAGCACAGAT
501 GGTGTATTTG GACCAGTTAA AAACCCCTGG AGTTATTCAA AACGATATAG
551 AGAAAAGAGG AAGCAGAATC CCCACAGCGA GAATGAAGAT TCAGACTGGC
601 TGATAACTGG AGGAAGCCCA GGTGGGAGTG CAGCTGCTGT ATCGGCGTTC
651 ACATGCTACG CGGCTTTAGG ATCAGATACA GGAGGATCGA CCAGAAATCC
701 TGCTGCCCAC TGTGGGCTTG TTGGTTTCAA ACCAAGCTAT GGCTTAGTTT
751 CCCGTCATGG TCTCATTCCC CTGGTGAATT CGATGGATGT GCCAGGAATC
801 TTAACCAGAT GTGTGGATGA TGCAGCAATT GTGTTGGGTG CACTGGCCGG
851 ACCTGACCCC AGGGACTCTA CCACAGTACA TGAACCTATT AATAAACCAT
901 TCATGCTTCC CAGTTTGGCA GATGTGAGCA AACTATGTAT AGGAATTCCA
951 AAGGAATATC TTGTACCGGA ATTATCAAGT GAAGTACAGT CTCTTTGGTC
1001 CAAAGCTGCT GACCTCTTTG AGTCTGAGGG GGCCAAAGTA ATTGAAGTAT
1051 CCCTTCCTCA CACCAGTTAT TCAATTGTCT GCTACCATGT ATTGTGCACA
1101 TCAGAAGTGG CATCGAATAT GGCAAGATTT GATGGGCTAC AATATGGTCA
1151 CAGATGTGAC ATTGATGTGT CCACTGAAGC CATGTATGCT GCAACCAGAC
1201 GAGAAGGATT TAATGATGTG GTGAGAGGAA GAATTCTCTC AGGAAACTTT
1251 TTCTTATTAA AAGAAAACTA TGAAAATTAT TTTGTCAAAG CACAGAAAGT
1301 GAGACGCCTC ATTGCTAATG ACTTTGTAAA TGCTTTTAAC TCTGGAGTAG
1351 ATGTCTTGCT AACTCCCACC ACCTTGAGTG AGGCAGTACC ATACTTGGAG
1401 TTCATCAAAG AGGACAACAG AACCCGAAGT GCCCAGGATG ATATTTTTAC
1451 ACAAGCTGTA AATATGGCAG GATTGCCAGC AGTGAGTATC CCTGTTGCAC
1501 TCTCAAACCA AGGGTTGCCA ATAGGACTGC AGTTTATTGG ACGTGCGTTT
1551 TGTGACCAGC AGCTTCTTAC AGTAGCCAAA TGGTTTGAAA AACAAGTACA
1601 GTTTCCTGTT ATTCAACTTC AAGAACTCAT GGATGATTGT TCAGCAGTCC
1651 TTGAAAATGA AAAGTTAGCC TCTGTCTCTC TAAAACAGTA AACATATCTT
1701 ACAAATTAAA ATGACTTTTA GGCTGGGTGC AGTGGCTCAC ACCTGTAATC
1751 CCAGCACTTT GGGAGGCCAA GGCGAGCGGA TCATGAGGTC AGAAGATCTA
1801 GAACAGCCTG GTCAACATGG TGAAACCCCG TCTCTACTAA AAATACAAAA
1851 ATTAGCCAGG CTTAGTGGCG GGCATCTGTA GTCCCAGCTA CTCAGGAGGC
1901 TGAGGCAGGA GAATCACTTG AACCCTGGAG GTGGAGGTTG CAGTGAGCCG
1951 AGATCATGCC ACTGCACTGC ACTCCAGCCT GGGTGACAAA GCAAGACTGT
2001 GTCTCAAAAT AAATAAATAA AATAAAATAA AATGACGTAC AGAGATTCTA
2051 TATTCTAGAG AGTCAAATGG TCTTGCTCAA TTCTTGTAAT TAGGTTCTTG
2101 TTAATACAGT CATTCCATGG AATTACTTTT TAAAATTCCT GTGACAATTA
2151 ATAATAAATA ACGTGTCAGC ATTTAGTAAG CATCCACTAA GTGTACAATA
2201 CTTCTACAAT AACACAAGAT ACCTGTTCCT CAAAGACAAT GCATTCTGCC
2251 ATAATGTTCA TTAAAGAGTT TACAGTAAAA ATAAGATTAG GGATAAACTT
2301 CTCAAAAATT GTACATCTGT GTAACTAAAG CACTAACAAA AACATGAATA
2351 GTCCTTCTAG AGGTAACTTG GATAGCCTAG GCAGGCAACT TATCATGTGG
2401 TGAAGGCCGC CTCAGGGGTT GTTAAAAATG CACAGAAACA ATTGAGTGCG
2451 ATTATTGGCT TCTGAGCGCT GAGCAGAGCA GGTGGAAGAG GAACTTTGAG
2501 CACAGGAGGA AATGCAACCA GTCAGGGCCC AGAATCATGC AAATCTCAGG
2551 GGTATGCCTC TCTGGGGAGG AGCTCCACTT GCAGGGACTC CTTTTATTTC
2601 CCTAAGAAAG AGCTGAAATG ACTGAGAACT TTCCTTTCCT CCTTAGAGTT
2651 ACAATTTTAC TTCTGCTATT CCGGAGCCCA TGCCTAGAAG CCAGAACAAC
2701 TCCATGTTAC ACTGAGTTCA TGCTCCTATT TACTGATCAC AAATGAGCTC
2751 ATTAATGTCA TCGAAACATT TATTGTAACC TAACAGACCA TCACAGATTG
2801 GAAACTTGGT AGATAGCAGA GCATGGTATT AGTGAAAAAG GTTCAAAATA
2851 CACAAGTAAC ATACACTCTG AAAAACATGC AGATAATTTG CTGATGAAGC
2901 AGAAGAGGGG ATGCGCATGG CAAGAACTTG CCTTACCCCA GATTCTCTAT
2951 ATCTCATGGT TTCCTTTTCC TCTTGACTGT CTTTACGAGT GTTTTTTATT
3001 TGGGACCCTC GAGCCCAGAG ATATTAATGG ATATCTGTAT TCAATATTTG
3051 ACAAAATCTA ATGGAAACCA TCCATTTACT CATGATAAGG CTTCATCACT
3101 GGATTTCTGT GTCTTCACTA GAACACCATT GTCATCTCAT ATTGATCAGG
3151 TATTTTAATC TAGCACTTAC ATATTGTTGA TAAATGAAAG CTGAATTGTT
3201 ACTTAATAAA TTCACTTTGT TTAGCAAAAA AAAAAAAAAA AAAA
BLAST Results
o BLAST result
Medline entries
o Medline entry
Peptide information for frame 3
ORF from 105 bp to 1688 bp; peptide length: 528
Category: similarity to known protein
Classification: Protein management
Prosite motifs: ATP_GTP_A (112-111)
1 MLGRSLREVS AALKQGQITP TELCQKCLSL IKKAKFLNAY ITVSEEVALK
51 QAEESEKRYK NGQSLGDLDG IPIAVKDNFS TSGIETTCAS NMLKGYIPPY
101 NATVVQKLLD QGALLMGKTN LDEFAMGSGS TDGVFGPVKN PWSYSKRYRE
151 KRKQNPHSEN EDSDWLITGG SPGGSAAAVS AFTCYAALGS DTGGSTRNPA
201 AHCGLVGFKP SYGLVSRHGL IPLVNSMDVP GILTRCVDDA AIVLGALAGP
251 DPRDSTTVHE PINKPFMLPS LADVSKLCIG IPKEYLVPEL SSEVQSLWSK
301 AADLFESEGA KVIEVSLPHT SYSIVCYHVL CTSEVASNMA RFDGLQYGHR
351 CDIDVSTEAM YAATRREGFN DVVRGRILSG NFFLLKENYE NYFVKAQKVR
401 RLIANDFVNA FNSGVDVLLT PTTLSEAVPY LEFIKEDNRT RSAQDDIFTQ
451 AVNMAGLPAV SIPVALSNQG LPIGLQFIGR AFCDQQLLTV AKWFEKQVQF
501 PVIQLQELMD DCSAVLENEK LASVSLKQ
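The ORF coordinates and the peptide length reported for this entry can be cross-checked against each other. The following is a minimal Python sketch, assuming the coordinates read 105 and 1688 (1-based, inclusive, stop codon excluded); the function name `orf_peptide_length` is hypothetical and is not part of the annotation software that produced this report.

```python
def orf_peptide_length(start: int, end: int) -> int:
    """Return the number of codons in an ORF given 1-based inclusive coordinates.

    Assumes the coordinates exclude the stop codon, which is how the
    entry above appears to report them.
    """
    span = end - start + 1
    if span <= 0 or span % 3 != 0:
        raise ValueError(f"ORF span {span} is not a positive multiple of 3")
    return span // 3


if __name__ == "__main__":
    # Coordinates as read from the entry above (assumed reading: 105 and 1688).
    print(orf_peptide_length(105, 1688))
```

A span that is not a multiple of three raises an error, which is a quick way to spot a misread coordinate in a damaged listing.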
BLASTP hits
No BLASTP hits available
Alert BLASTP hits for DKFZphfbr2_78c12, frame 3
PIR:F70322 glutamyl-tRNA (Gln) amidotransferase subunit A - Aquifex aeolicus, N = 2, Score = 620, P = 4.3e-81
>PIR:F70322 glutamyl-tRNA (Gln) amidotransferase subunit A - Aquifex aeolicus
Length = 478
HSPs:
Score = 620 (93.0 bits), Expect = 4.3e-81, Sum P(2) = 4.3e-81
Identities = 135/319 (42%), Positives = 195/319 (61%)
(Query : 187
ALGSDTGGSTRNPAAHCGLVGFKPSYGLVSRHGLIPLVNSMDVPGILTRCVDDAAIVLGA 24b +LGSDTGGS R PA+ CG++G KP+YG VSR+GL+ +S+D G+ R
+D A+VL Sb jct : lb3 SLGSDTGGSIRiQPASFCGVIGIKPTYGRVSRYGLVAFASSLDlQIGVFGRRTEDVALVLEV 222 (Query: 247
LAGPDPRDSTTVHEPINKPFMLPSLADVSKLCIGIPKEYLVPELSSEViQSLUSKAADLFE 30b
++G D +DST+ P+ + + +V L IG+PKE+ EL +V+ + E Sbjct: 223 ISGUDEKDSTSAKVPVPE- USEEVKKEVKGLKIGLPKEFFEYELύPiQVKEAFENFIKELE 261
(Query- 307 SEGAKVIEVSLPHTSYSIVCYHVLCTSEVASNMARFDGLiQYGHRCDIDVSTEAMYAATRR 3bb
EG ++ EVSLPH YSI Y+++ SE +SN+AR+DG++YG+R MYA TR
Sbjct: 2δ2 KEGFEIKEVSLPHVKYSIPTYYIIAPSEASSNLARYDGVRYGYRAKEYKDIFEMYARTRD 341
(Query: 3b7 EGFNDVVRGRILSGNFFLLKENYENYFVKAώKVRRLIANDFVNAFNSGVDVLLTPTTLSE 42b
EGF V+ RI+ G F L Y+ Y++KA(QKVRRLI NDF+ AF VDV+ +PTT Sbjct: 342 EGFGPEVKRRIMLGTFALSAGYYDAYYLKAlQKVRRLITNDFLKAFEE- VDVIASPTT—P 316
(Query: 427
AVPYLEFIKEDNRTRSA(QDDIFT(QAVNMAGLPAVSIPVALSN(QGLPIGLιQFIGRAFCDιQ(Q 48b +P+ + +N DI T N + AGLPA+SIP+A + GLP+G (Q
IG+ + +
Sbjct: 311 TLPFKFGERLENPIEMYLSDILTVPANLAGLPAISIPIAUKD- GLPVGGlQLIGKHUDETT 457 (Query: 487 LLTVAK-UFEK(QV(QFPVI(QL 505
LL ++ U +K + I L Sbjct: 456 LLiQISYLUEiQKFKHYEKIPL 477
Score = 289 (43.4 bits), Expect = 4.3e-81, Sum P(2) = 4.3e-81
Identities = 64/143 (44%), Positives = 90/143 (62%)
(Query: 4 RSLREVSAALKIQGIQITPTELCIQKCLSLIKKAKF-
LNAYITVSEEVALKiQAEESEKRYKNG b2
+ SL E+ LK + G+ + + P E+ + + + + AYIT ALKiQAE + + R
Sbjct: 5 KSLSELRELLKRGEVSPKEVVESFYDRYN(QTEEKVKAYITPLYGKALK(QAESLKER bO ώuery: b3 (QSLGDLDGIPIAVKDNFSTSGIETTCASNMLKGYIPPYNATVVlQKLLDiQGALLMGKTNLD 122
L L GIPIAVKDN G +TTCAS +L+ ++ PY+ATV+++L
GAL++GKTNLD Sbjct: bl -EL-
PLFGIPIAVKDNILVEGEKTTCASKILENFVAPYDATVIERLKKAGALIVGKTNLD 118
(Query: 123 EFAMGSGSTDGVFGPVKNPUSYSK 14b
EFAMGS + F P KNPU +
Sbjct: 111 EFAMGSSTEYSAFFPTKNPUDLER 142
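The identity and positive percentages in the HSP headers above appear to be integer ratios of matched positions to aligned length, truncated rather than rounded. The sketch below illustrates that arithmetic, assuming the counts read 135/319 and 195/319 for the first HSP and 64/143 and 90/143 for the second; the helper name `percent` is hypothetical, and the truncation rule is inferred from the printed values, not taken from the BLAST documentation.

```python
def percent(matches: int, aligned_length: int) -> int:
    """Integer percentage of matches over aligned length, truncated toward zero.

    Truncation (not rounding) is an assumption inferred from the report:
    64/143 is 44.76 percent but is printed as 44.
    """
    if aligned_length <= 0:
        raise ValueError("aligned_length must be positive")
    return 100 * matches // aligned_length


if __name__ == "__main__":
    # (matches, aligned length) pairs as read from the two HSP headers.
    for matches, length in [(135, 319), (195, 319), (64, 143), (90, 143)]:
        print(f"{matches}/{length} -> {percent(matches, length)}%")
```

The aligned lengths also agree with the query ranges shown in the alignments: residues 187 to 505 span 319 positions and residues 4 to 146 span 143.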
Pedant information for DKFZphfbr2_78c12, frame 3
Report for DKFZphfbr2_78c12.3
[LENGTH! 526 [MU! 574bδ-76 [pi! S-57
[H0M0L! PIR:E7172S glutamyl-tRNA amidotransferase chain A
(gatA) RP1S2 - Rickettsia prowazekii 2e-13
[FUNCAT! r general function prediction [M- jannaschiii MJllbO! δe-bl tFUNCAT! DI- 02-01 nitrogen and sulphur utilization ES- cerevisiaei YMR213c! le-S5
[FUNCAT! c energy conversion [M- genitaliumi MG011! 4e-41
[FUNCAT! 01.01-10 amino-acid degradation [S- cerevisiaei YBR20δc! 2e-31
[FUNCAT! 01-03-01 purine-ribonucleotide metabolism [S- cerevisiaei YBR20βc! 2e-31
[BLOCKS! BLD0S71
[EC! b-3-4-b Urea carboxylase 5e-3D [EC! 3-5.1-4 Amidase 3e-31
[EC! 3-5-2-12 b-Aminohexanoate-cyclic-dimer hydrolase le-17
[PIRKU! ligase Se-3D
[PIRKU! transmembrane protein 5e-30
[PIRKU! ATP Se-3D [PIRKU! crown gall tumor le-21
[PIRKU! mitochondrion 2e-13
[PIRKU! purine nucleotide binding 5e-30
[PIRKU! P-loop Se-30
[PIRKU! hydrolase 3e-31 [PIRKU! biotin 5e-30
ESUPFAM! amidase 3e-31
ESUPFAM! biotin carboxylase homology Se-3D
ESUPFAM! indoleacetamide hydrolase 7e-12
ESUPFAM! lipoyl/biotin-binding homology 5e-30 EPROSITE! ATP_GTP_A 1
EKU! Alpha_Beta
EKU! LOU COMPLEXITY 2-4b *
SElQ MLGRSLREVSAALKlQGlQITPTELClQKCLSLIKKAKFLNAYITVSEEVALKlQAEESEKRYK
SEG
PRD ccchhhhhhhhhhhhhcccchhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhh
SElQ NGlQSLGDLDGIPIAVKDNFSTSGIETTCASNMLKGYIPPYNATVViQKLLDlQGALLMGKTN SEG
PRD hccccccccccceeeecccccccccccchhhhhhhcccccchhhhhhhhhccceeeeccc
SElQ LDEFAMGSGSTDGVFGPVKNPUSYSKRYREKRKlQNPHSENEDSDULITGGSPGGSAAAVS
SEG xxxxxxxxxxxx PRD ccccccccccccccccccccccccceeecccccccccccccccccccccccccccccchh
SElQ AFTCYAALGSDTGGSTRNPAAHCGLVGFKPSYGLVSRHGLIPLVNSMDVPGILTRCVDDA
SEG x
PRD hhhheeeecccccccccccccceeeecccccceeeeccceeeeecccccccccchhhhhh
SElQ AIVLGALAGPDPRDSTTVHEPINKPFMLPSLADVSKLCIGIPKEYLVPELSSEViQSLUSK
SEG
PRD hhhhhhhccccccccccccccccccccccccccccceeeecccccccccchhhhhhhhhh SElQ AADLFESEGAKVIEVSLPHTSYSIVCYHVLCTSEVASNMARFDGLlQYGHRCDIDVSTEAM
SEG
PRD hhhhhhhhcceeeeeeccccceeeeeeeeehhhhhhhhhhhhhcccceeeccchhhhhhh SElQ YAATRREGFNDVVRGRILSGNFFLLKENYENYFVKAlQKVRRLIANDFVNAFNSGVDVLLT
SEG
PRD hhhhhhcccchhhhhhhhhhheeeccccchhhhhhhhhhhhhhhhhhhhhhhhheeeeee SElQ PTTLSEAVPYLEFIKEDNRTRSAlQDDIFTlQAVNMAGLPAVSIPVALSNiQGLPIGLlQFIGR
SEG
PRD cccccccccccccccccccccccccceeeeccccccccccccccccccccccceeeeeec
SElQ AFCDlQiQLLTVAKUFEKlQVlQFPVIiQLlQELMDDCSAVLENEKLASVSLKiQ SEG
PRD cccchhhhhhhhhhhhhhhhhheeehhhhhhheeeeccccceeeeccc
Prosite for DKFZphfbr2_78c12.3
PS00017 112->120 ATP_GTP_A PDOC00017
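The Prosite hit above can be reproduced by hand with a regular expression. The sketch below is illustrative only: it writes the PROSITE ATP/GTP-binding site motif A (P-loop) pattern, [AG]-x(4)-G-K-[ST], as a Python regex and scans residues 101 to 120 of the peptide, read here with the OCR damage removed (an assumed reading). The function name `find_motif` is hypothetical and is not the Pedant or Prosite scanning code.

```python
import re

# PROSITE pattern [AG]-x(4)-G-K-[ST] (ATP_GTP_A, P-loop) as a regex.
P_LOOP = re.compile(r"[AG].{4}GK[ST]")


def find_motif(fragment: str, first_residue: int):
    """Return (start, end_exclusive) in 1-based protein coordinates, or None.

    `first_residue` is the 1-based position of fragment[0] in the full peptide.
    """
    match = P_LOOP.search(fragment)
    if match is None:
        return None
    return first_residue + match.start(), first_residue + match.end()


# Residues 101-120 of the frame 3 peptide (assumed reading of the listing).
FRAGMENT = "NATVVQKLLDQGALLMGKTN"

if __name__ == "__main__":
    print(find_motif(FRAGMENT, 101))
```

Under these assumptions the match covers GALLMGKT, starting at residue 112 and ending just before residue 120, which is consistent with the "112->120" range in the report if the second number is read as an exclusive end.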
(No Pfam data available for DKFZphfbr2_78c12.3)
DKFZphfbr2_78d18
group: brain derived
DKFZphfbr2_78d18 encodes a novel 535 amino acid protein with weak similarity to a human putative mitogen-activated protein kinase kinase kinase.
No informative BLAST results; no predictive Prosite, Pfam or SCOP motif.
The new protein can find application in studying the expression profile of brain-specific genes.
similarity to putative mitogen-activated protein kinase kinase kinase (Homo sapiens)
Sequenced by MediGenomix
Locus: unknown
Insert length: 2158 bp
Poly A stretch at pos. 2138, polyadenylation signal at pos. 2117
1 ATCCGGGGCC CCGGAACCCG AGCTGGAGCT GAAGCGCAGG CTGCGGGGCG
51 CGGAGTCGGG AGTGCAGGCC TGAGTGTTCC TTCCAGCATG TCGGAGGGGG
101 AGTCCCAGAC AGTACTTAGC AGTGGCTCAG ACCCAAAGGT AGAATCCTCA
151 TCTTCAGCCC CTGGCCTGAC ATCAGTGTCA CCTCCTGTGA CCTCCACAAC
201 CTCAGCTGCT TCCCCAGAGG AAGAAGAAGA AAGTGAAGAT GAGTCTGAGA
251 TTTTGGAAGA GTCGCCCTGT GGGCGCTGGC AGAAGAGGCG AGAAGAGGTG
301 AATCAACGGA ATGTACCAGG TATTGACAGT GCATACCTGG CCATGGATAC
351 AGAGGAAGGT GTAGAGGTTG TGTGGAATGA GGTACAGTTC TCTGAACGCA
401 AGAACTACAA GCTGCAGGAG GAAAAGGTTC GTGCTGTGTT TGATAATCTG
451 ATTCAATTGG AGCATCTTAA CATTGTTAAG TTTCACAAAT ATTGGGCTGA
501 CATTAAAGAG AACAAGGCCA GGGTCATTTT TATCACAGAA TACATGTCAT
551 CTGGGAGTCT GAAGCAATTT CTGAAGAAGA CCAAAAAGAA CCACAAGACG
601 ATGAATGAAA AGGCATGGAA GCGTTGGTGC ACACAAATCC TCTCTGCCCT
651 AAGCTACCTG CACTCCTGTG ACCCCCCCAT CATCCATGGG AACCTGACCT
701 GTGACACCAT CTTCATCCAG CACAACGGAC TCATCAAGAT TGGCTCTGTG
751 GCTCCTGACA CTATCAACAA TCATGTGAAG ACTTGTCGAG AAGAGCAGAA
801 GAATCTACAC TTCTTTGCAC CAGAGTATGG AGAAGTCACT AATGTGACAA
851 CAGCAGTGGA CATCTACTCC TTTGGCATGT GTGCACTGGA GATGGCAGTG
901 CTGGAGATTC AGGGCAATGG AGAGTCCTCA TATGTGCCAC AGGAAGCCAT
951 CAGCAGTGCC ATCCAGCTTC TAGAAGACCC ATTACAGAGG GAGTTCATTC
1001 AAAAGTGCCT GCAGTCTGAG CCTGCTCGCA GACCAACAGC CAGAGAACTC
1051 CTGTTCCACC CAGCATTGTT TGAAGTGCCC TCGCTCAAAC TCCTTGCGGC
1101 CCACTGCATT GTGGGACACC AACACATGAT CCCAGAGAAC GCTCTAGAGG
1151 AGATCACCAA AAACATGGAT ACTAGTGCCG TACTGGCTGA AATCCCTGCA
1201 GGACCAGGAA GAGAACCAGT TCAGACTTTG TACTCTCAGT CACCAGCTCT
1251 GGAATTAGAT AAATTCCTTG AAGATGTCAG GAATGGGATC TATCCTCTGA
1301 CAGCCTTTGG GCTGCCTCGG CCCCAGCAGC CACAGCAGGA GGAGGTGACA
1351 TCACCTGTCG TGCCCCCCTC TGTCAAGACT CCGACACCTG AACCAGCTGA
1401 GGTGGAGACT CGCAAGGTGG TGCTGATGCA GTGCAACATT GAGTCGGTGG
1451 AGGAGGGAGT CAAACACCAC CTGACACTTC TGCTGAAGTT GGAGGACAAA
1501 CTGAACCGGC ACCTGAGCTG TGACCTGATG CCAAATGAGA ATATCCCCGA
1551 GTTGGCGGCT GAGCTGGTGC AGCTGGGCTT CATTAGTGAG GCTGACCAGA
1601 GCCGGTTGAC TTCTCTGCTA GAAGAGACCT TGAACAAGTT CAATTTTGCC
1651 AGGAACAGTA CCCTCAACTC AGCCGCTGTC ACCGTCTCCT CTTAGAGCTC
1701 ACTCGGGCCA GGCCCTGATC TGCGCTGTGG CTGTCCCTGG ACGTGCTGCA
1751 GCCCTCCTGT CCCTTCCCCC CAGTCAGTAT TACCCTGTGA AGCCCCTTCC
1801 CTCCTTTATT ATTCAGGAGG GCTGGGGGGG CTCCCTGGTT CTGAGCATCA
1851 TCCTTTCCCC TCCCCTCTCT TCCTCCCCTC TGCACTTTGT TTACTTGTTT
1901 TGCACAGACG TGGGCCTGGG CCTTCTCAGC AGCCGCCTTC TAGTTGGGGG
1951 CTAGTCGCTG ATCTGCCGGC TCCCGCCCAG CCTGTGTGGA AAGGAGGCCC
2001 ACGGGCACTA GGGGAGCCGA ATTCTACAAT CCCGCTGGGG CGGCCGGGGC
2051 GGGAGAGAAA GGTGGTGCTG CAGTGGTGGC CCTGGGGGGC CATTCGATTC
2101 GCCTCAGTTG CTGCTGTAAT AAAAGTCTAC TTTTTGCCAA AAAAAAAAAA
2151 AAAAAAAA
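The polyadenylation signal and poly(A) positions annotated for this insert can be located directly in the 3' end of the printed sequence. The sketch below is a hedged illustration, not the annotation pipeline's own method: it searches the last 58 bases (positions 2101 to 2158 as printed) for the canonical AATAAA hexamer and for a terminal run of at least ten A residues. The function name `find_polya_features`, the ten-base threshold, and the use of 0-based offsets are all assumptions.

```python
import re


def find_polya_features(tail: str, offset: int, min_run: int = 10):
    """Locate the last AATAAA hexamer and the terminal poly(A) run.

    `offset` is the 0-based position of tail[0] within the full insert.
    Returns 0-based positions (signal, poly_a_start); either may be None.
    """
    signal = tail.rfind("AATAAA")
    run = re.search(r"A{%d,}$" % min_run, tail)
    return (
        offset + signal if signal != -1 else None,
        offset + run.start() if run else None,
    )


# Last 58 bases of the insert as printed (1-based positions 2101-2158).
TAIL = (
    "GCCTCAGTTG" "CTGCTGTAAT" "AAAAGTCTAC"
    "TTTTTGCCAA" "AAAAAAAAAA" "AAAAAAAA"
)

if __name__ == "__main__":
    print(find_polya_features(TAIL, 2100))
```

With 0-based counting, the hexamer falls at offset 2117, matching the annotated signal position, and the terminal A run begins at offset 2138.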
BLAST Results
No BLAST result
Medline entries
No Medline entry
Peptide information for frame 1
ORF from 88 bp to 1692 bp; peptide length: 535
Category: similarity to unknown protein
Classification: Protein management
1 MSEGESQTVL SSGSDPKVES SSSAPGLTSV SPPVTSTTSA ASPEEEEESE
51 DESEILEESP CGRWQKRREE VNQRNVPGID SAYLAMDTEE GVEVVWNEVQ
101 FSERKNYKLQ EEKVRAVFDN LIQLEHLNIV KFHKYWADIK ENKARVIFIT
151 EYMSSGSLKQ FLKKTKKNHK TMNEKAWKRW CTQILSALSY LHSCDPPIIH
201 GNLTCDTIFI QHNGLIKIGS VAPDTINNHV KTCREEQKNL HFFAPEYGEV
251 TNVTTAVDIY SFGMCALEMA VLEIQGNGES SYVPQEAISS AIQLLEDPLQ
301 REFIQKCLQS EPARRPTARE LLFHPALFEV PSLKLLAAHC IVGHQHMIPE
351 NALEEITKNM DTSAVLAEIP AGPGREPVQT LYSQSPALEL DKFLEDVRNG
401 IYPLTAFGLP RPQQPQQEEV TSPVVPPSVK TPTPEPAEVE TRKVVLMQCN
451 IESVEEGVKH HLTLLLKLED KLNRHLSCDL MPNENIPELA AELVQLGFIS
501 EADQSRLTSL LEETLNKFNF ARNSTLNSAA VTVSS
BLASTP hits
No BLASTP hits available
Alert BLASTP hits for DKFZphfbr2_78dlβι frame 1
TREMBL".ACDD14b5_14 gene: "T1J14.14"i product: "putative mitogen activated protein kinase kinase"i Arabidopsis thaliana chromosome III
BAC T1J14 genomic sequence! complete sequence- i N = li Score =
372ι P = l-le-33
TREMBL:AF14Sb1D_l gene: "BcDNA - LD28bS7"i product:
"BcDNA-LD2δb57"i
Drosophila melanogaster clone LD2δbS7 BcDNA - LD2βbS7
(BcDNA-LD2βb57) mRNAi complete cds-i N = li Score = 114Dι P = l-3e-115
PIR:T02151 probable mitogen activated protein kinase - rice-, N =
Score = 311ι P = l-4e-3S
>TREMBL:AF145b1D_l gene: "BcDNA - LD2δbS7"i product: "BcDNA-LD2βbS7"i Drosophila melanogaster clone LD28b57 BcDNA -LD2βbS7 (BcDNA-LD2δbS7) mRNAi complete eds-
Length = 637
HSPs:
Score = 1140 (171.0 bits), Expect = 1.3e-115, P = 1.3e-115
Identities = 230/465 (49%), Positives = 304/465 (65%)
Query: 61
CGRU(QKRREEVN<QRNVPGIDSAYLAMDTEEGVEVVUNEV(QFSERKNYKL(QEEKVRAVFDN 120 CGRU KRREEV + (QR + VPGID +LAMDTEEGVEVVUNEV(Q + + + K
(QEEK + R VFDN
Sbjct : 1D2 CGRULKRREEVD(QRDVPGIDCVHLAMDTEEGVEVVUNEV(QYASL(QELKS(QEEKMR(QVFDN Ibl
(Query : 121 LIiQLEHLNIVKFHKYUADIKE- NKARVIFITEYMSSGSLKlQFLKKTKKNHKTMNEKAUKR 171 L + (QL + H NIVKFH + YU D ++ + RV + FITEYMSSGSLK(QFLK + TK + N K + ++U+R Sbjct: lb2
LL(QLDH(QNIVKFHRYUTDT(Q(QAERPRVVFITEYMSSGSLK<QFLKRTKRNAKRLPLESURR 221
(Query: 160 UCT(QILSALSYLHSCDPPIIHGNLTCDTIFI(QHNGLIKIGSVAPDTINNHVKTCREE(QKN 231
UCTlQILSALSYLHSC PPIIHGNLTCD + IFIiQHNGL + KIGSV PD ++ V + RE + + Sbjct: 222
UCTiQILSALSYLHSCSPPIIHGNLTCDSIFIlQHNGLVKIGSVVPDAVHYSVRRGRERERE 261
(Query: 240 LHFF-APEYGEVTNVTTAVDIYSFGMCALEMAVLEI(Q-
GNGESSYVPiQEAISSAIiQ 213 H + F APEYG +T A + DIY + FGMCALEMA LEIiQ N ES+ +
+ E I I Sbjct: 262 RERGAHYFιQAPEYGAAD(QLTAALDIYAFGMCALEMAALEI(3PSNSESTAINEETI(QRTIF 341 (Query: 214 LLEDPL(QREFI(QKCLιQSEPARRPTARELLFHPALFEVPSLKLLAAHCIV
GHiQHMIPE 350
LE+ LιQR+ I + KCL +P RP + A +LLFHP LFEV SLKLL AHC + V
+ + M E
Sbjct: 342 SLENDL(QRDLIRKCLNP(QP(QDRPSANDLLFHPLLFEVHSLKLLTAHCLVFSPANRTMFSE 401
(Query: 351 NALEEITKNM- DTSAVLAEIPAGPGREPV(QTLYS(QSPALELDKFLEDVRNGIYPLTAFGL 4D1
A + + + V+A++ G+E L S A +L+KF+EDV+ G+YPL +
Sbjct: 402 TAFDGLM(QRYY(QPDVVMA(QLRLAGG<QER(QYRLADVSGADKLEKFVEDVKYGVYPLITYS- 4b0 fluery: 410 PRXXXXXXXXXXXXXXXXXXXXXXXXXAEVETRKVVLMiQCNIESVEEGVXXXXXXXXXXX 4b1
+ + E+R++V M C+++ E+
Sbjct: 4bl GKKPPNFRSRAASPERADSVKSATPEPVDTESRRIVNMMCSVKIKEDSNDITMTILLRMD 520 (Query: 470 XXXXXXXSCDLMPNENIPELAAELVIQLGFISEADIQSRLTSLLEETL SIS
+ C + N+ +L +ELV + LGF+ DiQ ++ LLEETL Sbjct: S21 DKMNRIQLTCIQVNENDTAADLTSELVRLGFVHLDDIQDKIIQVLLEETL 5bb
Pedant information for DKFZphfbr2_78d18, frame 1
Report for DKFZphfbr2_78d18.1
ELENGTH! Sb4 EMU! b24b4-87
Epl! 5-10
EHOMOL! TREMBL:AF14Sb10_l gene: "BcDNA - LD2βb57"i product: "BcDNA- LD28b57"i Drosophila melanogaster clone LD28b57
BcDNA. LD28b57 (BcDNA - LD2βbS7 ) mRNAi complete eds- le-123 EFUNCAT! D3-22 cell cycle control and mitosis ES. cerevisiaei
YJLOISw! be-15 EFUNCAT! 3D- 03 organization of cytoplasm ES- cerevisiaei
YJLDISw! be-15
EFUNCAT! 11-01 stress response ES- cerevisiaei YJLDISw! be-lS
EFUNCAT! 03-01 cell growth ES- cerevisiaei YJLDISw! be-15 EFUNCAT! ID.02-11 key kinases ES- cerevisiaei YJLD15w! be-lS
EFUNCAT! 03-D4 buddingi cell polarity and filament formation ES- cerevisiaei YJL015w! be-15
EFUNCAT! 18 classification not yet clear-cut ES- cerevisiaei
YLRDIbw! 2e-01 EFUNCAT! 3D-02 organization of plasma membrane ES- cerevisiaei
YLRDIbw! 2e-D1
EFUNCAT! ID.03.11 key kinases ES. cerevisiaei YNR031c! 3e-D1
EFUNCAT! 01. DI biogenesis of cell wall ES- cerevisiaei
YNR031c! 3e-01 EFUNCAT! 03-D7 pheromone responsei mating-type determination! sex-specific proteins ES- cerevisiaei YLR3b2w! 4e-0β
EFUNCAT! ID- 05- 11 key kinases ES- cerevisiaei YLR3b2w! 4e-06
EFUNCAT! ID- 04-11 key kinases ES- cerevisiaei YLR3b2w! 4e-0δ
EFUNCAT! 11-04 dna repair (direct repairi base excision repair and nucleotide excision repair) ES- cerevisiaei YPL153c! le-D7
EFUNCAT! 03-11 recombination and dna repair ES- cerevisiaei
YPL153c! le-07
EFUNCAT! 03-22-01 cell cycle check point proteins ES- cerevisiaei YPL153c! le-07 EFUNCAT! 3D-10 nuclear organization ES- cerevisiaei YPL153c! le-D7
EFUNCAT! 03-25 cytokinesis ES- cerevisiaei YDR5D7c! le-07
EFUNCAT! 10-11 other signal-transduction activities ES- cerevisiaei YPLlS3c! le-07 EFUNCAT! 03-13 meiosis ES- cerevisiaei YDR523c! 3e-07
EFUNCAT! D3-1D sporulation and germination ES- cerevisiaei
YDRS23c! 3e-D7
EFUNCAT! 03-lb dna synthesis and replication ES- cerevisiaei
YMRDOlc! 2e-0b EFUNCAT! 11 unclassified proteins ES- cerevisiaei YDR410c!
3e-05
EFUNCAT! 05-07 translational control ES- cerevisiaei YDR2δ3c! le-04
EFUNCAT! 01.05-04 regulation of carbohydrate utilization ES- cerevisiaei YDR477w! le-04
EBLOCKS! PF00b37A
EBLOCKS! BP03111J
EBLOCKS! PF01317B
ESCOP! dlir3a_ 5-1-1-2-b insulin receptor Complex (transferase/substrate) 2e-S3
ESCOP! dlphk S-l-l-l-b gamma-subunit of glycogen phosphorylase kinas 3e-bβ
ESCOP! dlfgkb_ 5.1-1.2.5 Fibroblast growth factor receptor 1 [human (Horn le-55 [SCOP! dlabo 5-1.1.1.14 Protein kiase CK2ι alpha subunit [Maize (Ze 2e-55
[SCOP! d31ck 5-1.1-2-2 Lymphocyte kinase (lck) [Human
(Homo sapiens) 7e-S4
[SCOP! d2erk 5.1.1.1-11 MAP kinase Erk2 [rat (Rattus norvegicus) 1e-71
[SCOP! dlcdkb_ 5-1.1.1-2 cAMP-dependent PKi catalytic subunit Comple le-55 [SCOP! dlhcl 5-1-1-1-1 Cyclin-dependent PK [Human
(Homo sapiens) 4e-b7
[EC! 2-7-1-112 Protein-tyrosine kinase 4e-0b
[EC! 2-7.1-37 Protein kinase 3e-D1 EPIRKU! phosphotransferase 2e-2δ
[PIRKU! nucleus 3e-0b
[PIRKU! RNA binding 3e-10
[PIRKU! tandem repeat 4e-07
[PIRKU! cell cycle control 3e-0b [PIRKU! serine/threonine-specific protein kinase 2e-13
[PIRKU! transmembrane protein 4e-07
[PIRKU! autophosphorylation 3e-10
[PIRKU! tyrosine-specific protein kinase 4e-Db
[PIRKU! magnesium 4e-D7 [PIRKU! ATP 2e-13
[PIRKU! receptor 4e-07
[PIRKU! phosphoprotein 2e-13
[PIRKU! apoptosis 3e-0b
[PIRKU! glycoprotein 4e-07 [PIRKU! protein kinase 2e-2δ
EPIRKU! signal transduction 2e-0β
EPIRKU! cell division le-11
EPIRKU! calmodulin binding 3e-0b
ESUPFAM! protein kinase byr2 le-Ob ESUPFAM! unassigned Ser/Thr or Tyr-specific protein kinases 2e-
13
ESUPFAM! leucine-rich alpha-2-glycoprotein repeat homology 4e-D7
ESUPFAM! double-stranded RNA-binding repeat homology 3e-10 ESUPFAM! SAM homology le-Ob
ESUPFAM! death-associated protein kinase 3e-0b
ESUPFAM! ankyrin repeat homology 3e-0b
ESUPFAM! protein kinase homology 2e-2δ
ESUPFAM! kinase-related transforming protein 2e-0b ESUPFAM! protein kinase SPK1 3e-0b
ESUPFAM! protein kinase Xa21 4e-07
ESUPFAM! protein kinase TIK 3e-lD
ESUPFAM! kinase interaction domain homology 3e-Db
EPFAM! Eukaryotic protein kinase domain EKU! All_Alpha
EKU! 3D
EKU! LOU COMPLEXITY lb-41 *
SElQ IRGPGTRAGAEAiQAAGRGVGSAGLSVPSSMSEGESlQTVLSSGSDPKVESSSSAPGLTSVS SEG - - - ■ xxxxxxxxxxxxxxxx xxxxx
IkobA
SElQ PPVTSTTSAASPEEEEESEDESEILEESPCGRUiQKRREEVNiQRNVPGIDSAYLAMDTEEG SEG xxxxxxxxxxxxxxxxxxxxxxxxxxxxx
IkobA
SE(Q VEVVUNEViQFSERKNYKLiQEEKVRAVFDNLIlQLEHLNIVKFHKYUADIKENKARVIFITE SEG
IkobA CHHHHHHHHHHHHHHHTTTBTTBCCEE
EEEETTTEEEEEEC SElQ YMSSGSLK(QFLKKTKKNHKTMNEKAUKRUCT(QILSALSYLHSCDPPIIHGNLTCDTIFI(Q SEG
IkobA CCCCEEH--HHHHCTTTTC-CCHHHHHHHHHHHHHHHHHHH-- HHCEETTTTTTTTEETT
SElQ HNGLIKIGSVAPDTINNHVKTCREEiQKNLHFFAPEYGEVTNVTTAVDIYSFGMCALEMAV SEG
IkobA TTCCEEECCTTTTEECTTTTEEEEETTTGGGCCHHHHHCCCBCHHHHHHHHHHHHHHHHC
SElQ LEIlQGNGESSYVPiQEAISSAIiQLLEDPLlQREFIlQKCLiQSEPARRPTARELLFHPALFEVP SEG
IkobA CCTTTTCCCHHHHHHHHHHCCCCTTTHHHHHHHHHTTTTTGGGCCCHHHHHHTTTT
SElQ SLKLLAAHCIVGH(QHMIPENALEEITKNMDTSAVLAEIPAGPGREPV(2TLYS(QSPALELD SEG
IkobA
SElQ KFLEDVRNGIYPLTAFGLPRPIQIQPIQIQEEVTSPVVPPSVKTPTPEPAEVETRKVVLMIQCNI SEG xxxxxxxxxxxxxxxxxxxxxxxxx
IkobA
SElQ ESVEEGVKHHLTLLLKLEDKLNRHLSCDLMPNENIPELAAELVIQLGFISEADIQSRLTSLL SEG xxxxxxxxxxxxxxxxxx
IkobA
SElQ EETLNKFNFARNSTLNSAAVTVSS
SEG
IkobA
(No Prosite data available for DKFZphfbr2_78d18.1)
Pfam for DKFZphfbr2_78d18.1
HMM_NAME Eukaryotic protein kinase domain HMM
*rLnHPNIIRFYDwFed- - - ddDHIYMIMEYMeGGDLFDYIrrng p
+L H NI++F ++ D + ++ +I+EYM G+L +++++ +
(Query 1S2
(QLEHLNIVKFHKYUADIKENKARVIFITEYMSSGSLKlQFLKKTKKNHKT 2D0
HMM
MsEwelrflMyiQILrGMeYLHSMg- - IIHRDLKPENILIDeNgqIKIcDF
M+E+ +++ +IQIL++++YLHS IIH L + I+I +NG
IKI+ (Query 2D1
MNEKAUKRUCTiQILSALSYLHSCDPPIIHGNLTCDTIFIiQHNGLIKIGSV 250 HMM GLARqMnnYerMttfCGTPUYMMAPEVIImgnyYttkVDMUSFGCILUEM
++ N+ + + + APE + ++ TT+VD++SFG+
EM (Query 251 APDTINNHVKTCREE<QKNLHFF-APEY-
GEVTNVTTAVDIYSFGMCALEM 216
HMM
MTGepPFyddnMemlmrliqrfrrpfUpnCSeElyDFMrwCUnyDPekRP + ++ + N E + ++ + ++ + + ++F+ +C++
P++RP
(Query 211 A~ VLEIfl-
GNGESSYVP(QEAISSAI(QLLEDPL(QREFI(QKCL(QSEPARRP 345 HMM TFriQILnHPUF*
T+R++L HP + (Query 34b TARELLFHPAL 35b
DKFZphfbr2_78d4
group: transmembrane protein
DKFZphfbr2_78d4 encodes a novel 188 amino acid protein without similarity to known proteins.
The novel protein contains 1 transmembrane region and a cytochrome c family heme-binding site. No informative BLAST results; no predictive Prosite, Pfam or SCOP motif.
The new protein can find application in studying the expression profile of brain-specific genes and as a new marker for amygdala cells.
weak similarity to hypothetical protein of Arabidopsis thaliana perhaps complete cds.
Pedant: TRANSMEMBRANE 1
Sequenced by MediGenomix
Locus: unknown
Insert length: 1547 bp
Poly A stretch at pos. 1527, polyadenylation signal at pos. 1508
1 TTGCCGCCGC CGCCACCCCC GCCCAGGATG GCGGAAGTGG AGGCGCCGAC
51 GGCGGCCGAG ACGGACATGA AGCAATATCA AGGCTCCGGC GGCGTCGCCA
101 TGGATGTGGA ACGGAGTCGC TTCCCCTACT GCGTGGTGTG GACGCCCATC
151 CCGGTGCTCA CGTGGTTTTT CCCCATCATC GGCCACATGG GCATCTGCAC
201 ATCCACAGGA GTCATTCGGG ACTTCGCGGG CCCCTACTTT GTCTCAGAGG
251 ACAACATGGC CTTTGGAAAG CCTGCCAAGT ACTGGAAGTT GGACCCTGCT
301 CAGGTCTATG CTAGCGGGCC CAACGCATGG GACACGGCTG TGCACGACGC
351 CTCTGAGGAG TACAAGCACC GCATGCACAA TCTCTGCTGT GACAACTGCC
401 ACTCGCACGT GGCATTGGCC CTGAATCTGA TGCGCTACAA CAACAGCACC
451 AACTGGAATA TGGTGACGCT CTGCTTCTTC TGCCTGCTCT ACGGGAAGTA
501 CGTCAGCGTT GGGGCCTTCG TGAAGACCTG GCTGCCCTTC ATCCTTCTCC
551 TGGGCATCAT CCTCACCGTC AGCCTGGTCT TTAACCTCCG GTGATGGCTG
601 CTCGGTGGCC CCACACCCAC CAGGGTCCCG AGGAAACAGC CGCCATCCCT
651 TTTGGTTCCA GATTTTTTTC TCCTCACCCC AAAAGGCAGG GTTGGGCCTG
701 CTGTTGTGGA CCGGGGGTCG GGGCTGGCAG GATGGAAGGA CTGAGGACCA
751 GCATGAAGTG GGGGTTTGTT GTCTCCCTGC CTCTCAGAAG CACCCTGTCC
801 CCTCCTCCCC AGGCCTGTGA CTCCGGCCCT GGAAGCCCCT TTGTTCTTCT
851 GTTGAAAGGC TTTGGCTTCC CTCTGTAGAG CTGCTCCCGC CACCACCTGC
901 TGGGGTCCTG CCTCAGCCCA GTGCCCAGTA TGGGGAGAGG AGGACATTTG
951 GGCTCACCTG TCAAGGTGGC CCTGGGACCA GAGCTGGTCC CAGCATGGGG
1001 TGCACCGGGT ACACTTAACG TGTCTCTATA AGCCAAGTTG CTTCAGGACC
1051 TTCACCACTG GCCTCTAGAA TGGTCCAGAG GGGCTGGCTG GGTCCCTTTG
1101 TCAGACTCCT GCCGGCAGCT GCCCTGGGGG ACATGTGTGC CCATCTGGCA
1151 TCCTCCAGCC CGTGCAGTCC GCTCTTCACT GTTCCACGGC CTCCCAGTGC
1201 CTCCCAGCAT TGGACCCATC TCCCCCTGCA GTTTGAGGCC AGAGAGGTGA
1251 GTGGACCTGA CAAGTGCCAG AGTAACCGTG TAGACAGAGC AGTGTAGACA
1301 GCGCTCAGCC CCAGCCCCAG GTGTGGACCT CATGCTGGTG ATGGCTCCCC
1351 TGGGTGGCCT GCCAGCACAG CCAGTGCCAT CAGGGAGCTG AAGGGGCTGT
1401 CCCCCACCTA ACTCCAGCTC CCCCTTCACG TTGTCACCAA GGCCCTGTGC
1451 CGCCCGCCTC GCCCCCCTGC TCTGTGGATT CCTTTGGGAA GGGCTCCCTG
1501 GGCAGGACAA TAAAGAGTTT TGACTCCAAA AAAAAAAAAA AAAAAAA
BLAST Results
Entry TD2blb from database PIR:' hypothetical protein T11L16-12 - Arabidopsis thaliana Score = 221ι P = l-3e-17ι identities = 57/lblι positives 76/lblι frame +1
Medline entries
No Medline entry
Peptide information for frame 1
ORF from 28 bp to 591 bp; peptide length: 188
Category: similarity to unknown protein
Classification: no clue
Prosite motifs: CYTOCHROME_C (121-111)
1 MAEVEAPTAA ETDMKQYQGS GGVAMDVERS RFPYCVVWTP IPVLTWFFPI
51 IGHMGICTST GVIRDFAGPY FVSEDNMAFG KPAKYWKLDP AQVYASGPNA
101 WDTAVHDASE EYKHRMHNLC CDNCHSHVAL ALNLMRYNNS TNWNMVTLCF
151 FCLLYGKYVS VGAFVKTWLP FILLLGIILT VSLVFNLR
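The relationship between the cDNA and the peptide listed above can be illustrated by translating the start of the ORF with the standard genetic code. The sketch below is illustrative only and assumes the ORF begins at nucleotide 28 (1-based) of the first printed line of the insert; the names `CODON_TABLE` and `translate` are hypothetical and do not come from the original annotation software.

```python
from itertools import product

# Standard genetic code, codons ordered with bases T, C, A, G
# (third position varying fastest); '*' marks a stop codon.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def translate(cdna: str, orf_start: int, n_codons: int) -> str:
    """Translate n_codons codons starting at 1-based position orf_start."""
    first = orf_start - 1
    codons = (cdna[i:i + 3] for i in range(first, first + 3 * n_codons, 3))
    return "".join(CODON_TABLE[codon] for codon in codons)


# First 50 bases of the insert as printed (spaces removed).
FIRST_LINE = "TTGCCGCCGCCGCCACCCCCGCCCAGGATGGCGGAAGTGGAGGCGCCGAC"

if __name__ == "__main__":
    print(translate(FIRST_LINE, 28, 7))
```

Under these assumptions the first seven codons give MAEVEAP, which agrees with the start of the peptide listed for frame 1.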
BLASTP hits
No BLASTP hits available Alert BLASTP hits for DKFZphfbr2_7δd4 ■, frame 1
PIR:T02blb hypothetical protein T11L18-12 - Arabidopsis thalianai
N =
2ι Score = 22bι P = 4-Se-21
>PIR:TD2blb hypothetical protein T11L18-12 - Arabidopsis thaliana Length = 2b7 HSPs:
Score = 22b (33-1 bits)ι Expect = 4-5e-21ι Sum P(2) = 4-Se-21 Identities = 52/132 (31*)ι Positives = 71/132 (53*) (Query: 25 MDVERSRFPYCVVUTPIPVLTUFFPIIGHMGICTSTGVIRDFAGPYFVSEDNMAFGKPAK 64
+D ++S+FP C+VUTP+PV++U P IGH+G+C GVI DFAG F++ D+ AFG PA+
Sbjct: bl IDTKKSKFPCCIVUTPLPVVSULAPFIGHIGLCREDGVILDFAGSNFINVDDFAFGPPAR 12D
(Query: 65 YUKLDPAlQVYASGPNAUDTAVHDASEEYKHRMHNLC-- CDNCHSHVALALNLMRYNNST- 141
Y +LD + PN H +KH DN S +
YN T
Sbjct: 121 YLlQLDRTKCCLP-PNMGG---
HTCKYGFKHTDFGTARTUDNALSSSTRSFEHKTYNIFTC 17b
(Query: 142 NUN-MVTLCFFCLLYG ISb N + V C L YG
Sbjct: 177 NCHSFVANCLNRLCYG 112 Score = 157 (23-b bits)ι Expect = l.fle-13ι Sum P(2) = l-δe-13 Identities = 27/61 (33*)ι Positives = 50/61 (bl*)
(Query: 101
UDTAVHDASEEYKHRMHNLCCDNCHSHVALALNLMRYNNSTNUNMVTLCFFCLLYGKYVS IbO UD A+ ++ ++H+ +N+ NCHS VA LN + Y S UNMV +
++ GK+++ Sbjct: 155 UDNALSSSTRSFEHKTYNIFTCNCHSFVANCLNRLCYGGSMEUNMVNVAILLMIKGKUIN 214 (Query: ibl VGAFVKTULPFILL--LGIIL 171
+ V+++LP ++ LG++L Sbjct: 215 GSSVVRSFLPCAVVTSLGVVL 235
Score = 3b (5-4 bits)ι Expect = 4-Se-21ι Sum P(2) = 4-Se-21 Identities = 7/21 (33*)ι Positives = 14/21 (bb*)
(Query: ID AETDMKIQYIQGSGGVAMDVERS 3D
++ ++K +G G MD++RS Sbjct: 12 SDRNLKMSRGRGVPMMDLKRS 32
Pedant information for DKFZphfbr2_78d4, frame 1
Report for DKFZphfbr2_78d4.1
ELENGTH! 166 EMU! 21176. bb Epl! b-27
EHOMOL! PIR:TD2blb hypothetical protein T11L18-12 - Arabidopsis thaliana 7e-32 EPROSITE! CYT0CHR0ME_C 1 EKU! TRANSMEMBRANE 1
SElQ MAEVEAPTAAETDMK(QY(QGSGGVAMDVERSRFPYCVVUTPIPVLTUFFPIIGHMGICTST PRD cccccchhhhhhhhhhccccccccccccccccccceeeccceeeeeeeeecccceeecce M EM
SElQ GVIRDFAGPYFVSEDNMAFGKPAKYUKLDPAiQVYASGPNAUDTAVHDASEEYKHRMHNLC
PRD eeeeccccccccccccccccccceeeeccccceeeccccccccccccccchhhhhhhhee MEM
SElQ CDNCHSHVALALNLMRYNNSTNUNMVTLCFFCLLYGKYVSVGAFVKTULPFILLLGIILT
PRD ecccchhhhhhhhhhhccccccchhhhhhhhhhhccceeeeeeeeeeeccceeeeceeec
MEM MMMMMMMMMMMM
SElQ VSLVFNLR PRD ceeeeccc MEM MMMMM...
Prosite for DKFZphfbr2_78d4.1
PS00190 121->127 CYTOCHROME_C PDOC00169
(No Pfam data available for DKFZphfbr2_78d4.1)
DKFZphfbr2_78e18
group: brain derived
DKFZphfbr2_78e18 encodes a novel 307 amino acid protein without similarity to known proteins. The mRNA is differentially polyadenylated.
No informative BLAST results; no predictive Prosite, Pfam or SCOP motif.
The new protein can find application in studying the expression profile of brain-specific genes.
similarity to hypothetical protein of Arabidopsis thaliana differential polyadenylation > 7 exons complete on human genomic clone 451B21ap. perhaps complete cds.
Sequenced by MediGenomix
Locus: /map="144 -50 cR from top of Chr6 linkage group"
Insert length: 3096 bp
Poly A stretch at pos. 3075, polyadenylation signal at pos. 3047
1 TGGTGAGTTC GGAGTAGAGA TGGCCGCGCT TGCACCGCTG CCCCCGCTCC
51 CCGCACAGCT CAAGAGCATA CAGCATCATC TGAGGACGGC TCAGGAGCAT
101 GACAAGCGAG ACCCTGTGGT GGCTTATTAC TGTCGTTTAT ACGCAATGCA
151 GACTGGAATG AAGATCGATA GTAAAACTCC TGAATGTCGC AAATTTTTAT
201 CAAAGTTAAT GGATCAGTTA GAAGCTCTAA AGAAGCAGTT GGGTGATAAT
251 GAAGCTATTA CTCAAGAAAT AGTGGGCTGT GCCCATTTGG AGAATTATGC
301 TTTGAAAATG TTTTTGTATG CAGACAATGA AGATCGTGCT GGACGATTTC
351 ACAAAAACAT GATCAAGTCC TTCTATACTG CAAGTCTTTT GATAGATGTC
401 ATAACAGTAT TTGGAGAACT CACTGATGAA AATGTGAAAC ACAGGAAGTA
451 TGCCAGATGG AAGGCAACAT ACATCCATAA TTGTTTAAAG AATGGGGAGA
501 CTCCTCAAGC AGGCCCTGTT GGAATTGAAG AAGATAATGA TATTGAAGAA
551 AATGAAGATG CTGGAGCAGC CTCTCTGCCC ACTCAGCCAA CTCAGCCATC
601 ATCATCTTCA ACTTATGACC CAAGCAACAT GCCATCAGGC AACTATACTG
651 GAATACAGAT TCCTCCGGGT GCACACGCTC CAGCTAATAC ACCAGCAGAA
701 GTGCCTCACA GCACAGGTGT AGCAAGTAAT ACTATCCAAC CTACTCCACA
751 GACTATACCT GCCATTGATC CCGCACTTTT CAATACAATT TCCCAGGGGG
801 ATGTTCGTCT AACCCCAGAA GACTTTGCTA GAGCTCAGAA GTACTGCAAA
851 TATGCTGGCA GTGCTTTGCA GTATGAAGAT GTAAGCACTG CTGTCCAGAA
901 TCTACAAAAG GCTCTCAAGT TACTGACGAC AGGCAGAGAA TGAAGCCTTT
951 GTATGACAGA CCCATGTATT TTTGGCATGA GGAACTAACA GTCCATTACT
1001 CTATCTTCAG CCTATCAGGA TCACAGTTTT AAGGAAGACT TGGTTTTGTT
1051 GAATATGACA ATGAAATCTG TGTGTATCAG ATTTTTATTG AAGCATTCAT
1101 CAGCAGCCTC AACCAGTTTT CATTGTCCAT TTACTAGATT CAATCGTCTC
1151 TGAGTATATA GGGCTGATGT TAGCAAGACC CTAAAAATGT CCATTGAACC
1201 CTGCTTCAAA AAATGAAAAC ACACCTCTAT AAAATGTGTA CTGGGAATAA
1251 GCTTTGTATT TACATACATT AGGGGAATTT TTTAAAATCT GTAATGTTTG
1301 GACAAACAGA TGATATTACT TTGCTATAAA ATTATAAATG TAACTTTTAA
1351 TAAAGATAGC CAGAATATTC TAAATTAGAA ATTACGTTTT TGTTTCCCTC
1401 AAGACATAAA ACAAATATAA ACATTCTAAA CTGCTGGATG AATCTGAAAA
1451 GACATTAAGT TCAAATTTTA ATTTATTCTC ATATTAAATA TAACTCCATT
1501 AAAAGTTTAA AATTTCATGG GAGAAAATAT AATAAGGTAA AGAGGTAGAA
1551 TCACTTTCAG ACTTAAGAAT AATGTTGATT TCCCAAGTGC TTTACCTTAT
1601 CTGTTAAAGC GTAAGATGAA TTGGTATTTG CTTCATAGGC AGTTTGACTG
1651 CATGTATTAG AGAATGAAAA GAAGATATTT GTAGTAATGC CTGGAAACTT
1701 GGTGCTTTAA ATTAAGGTAC TCCTCTGCTG CTGTAGAATG GATTCCACAC
1751 AGTGGATAGC TATGGGTGAT TCAGAATATT ATGTTTAGAT TCCCATTTGT
1801 TAAGTTTATA AGTTTTGTGG GGAATTATGA ACTTACTGTG TACTACCTGC
1851 ATTTGTGCTG TGTGAAAAAT AAATACAAGG ATTCGTTTAG CTAATTCAAC
1901 TTACTACAAA GACAAATGTC TGTTTTTATT TGCCTGCTAG GATTGTCTTT
1951 TTTAAAAGTC ATTTTTATTT ATAGGAATAT GGGTGTTTCT ATAGGAAGAA
2001 ACAGGTTTTT TGTTTTTTGT TTTTTAAGAT AAATTTGACA AAGTTAACTG
2051 AAATTTATCT GGTCCATTTT ATTCATGCTA CTAAGATGGG AATCTTTAAA
2101 CACAAGGGTC AGCAAGCTTT GGCCCATGGA TTGGCCACCT GTTACGTAAA
2151 TAAAGTTTCT TTGAAACAAG CCTACACTCA TTCATTTATG TTTTGTCTGT
2201 GGTTGCTTTC CACAACTGCA GAGTTGTATG GCTTGCAAGT CTAAAAACAT
2251 TTACTATTTG GCCCTCTAAG AAAAAGTTAA GACACCTAGT CTAATGGCCT
2301 TTTGGGAAAA AACAAATCAC TAACTCATAA TCATTTATAT CCATTATTTT
2351 CTGCATAAAT GTAATGCTAT TGTACAGGGT TTGGTAGAAT AAATATTCAG
2401 ACTGACTAAA CTGTTCTAAA TTCTCACAAA AAAGTCCCCA AACAACATGC
2451 CTCCTAAAAA ACATTTTCCT ATCTTTTACA AGAGGTATGA ACATTTGTAG
2501 GGTTCCACAT TTGCATCTAG AAATCCAATG CTCTTTAGAA TGTTATTACG
2551 AATAGAAAGA TGGCCAGGAT GACCTTTAGT GTTACATGAT GTTCAGCAAA
2601 TTTTAATTCA AACCTTGATA TGCCTGGACA CTGAAAAGTA AACGCATCAC
2651 CTCCTATTTT ATACCCTACC TTCTGGTTCC CAATTGGGAG AGCACATAGA
2701 GGGAAGGAGA CAATATAGAA ACTACGGAGT CCGCTGGTAG TGGGCTGCAT
2751 GGTGTGACAG AGCCCTTCTC TGTAAAATGG AAATGACACC ACTAGCCATC
2801 TCAATAGTTA CAAGAATTAA AAGAGATACA GTACCTGAAG TGCTTAGCGC
2851 ATGGTAGCAT TTCATAAATG TTTAGTGTCA ATACTAATGC TCTAATAATG
2901 TAAATTGTTA ATAATTTATT TCCCTAATAT CAGGAAATCC CAGTTGTCTA
2951 TGTGGCCCAG TGCTTAAAAA CGCCTTCTTG CATGAGGGGA TTGAACTATA
3001 CAATGTTTGT TAACTTTGTA TTTGTATTTT TTCCTATAAA ATCTTAAAAT
3051 AAAATTAGGA GATGTGTTCT GATGTAAAAA AAAAAAAAAA AAAAAA
BLAST Results
Entry HS451B21 from database EMBL:
Human DNA sequence *** SEQUENCING IN PROGRESS *** from clone 451B21 Score = 11211, P = 0.0e+00, identities = 2267/2343
Medline entries
No Medline entry
Peptide information for frame 2
ORF from 20 bp to 940 bp; peptide length: 307
Category: similarity to unknown protein
Classification: no clue
1 MAALAPLPPL PAQLKSIQHH LRTAQEHDKR DPVVAYYCRL YAMQTGMKID
51 SKTPECRKFL SKLMDQLEAL KKQLGDNEAI TQEIVGCAHL ENYALKMFLY
101 ADNEDRAGRF HKNMIKSFYT ASLLIDVITV FGELTDENVK HRKYARWKAT
151 YIHNCLKNGE TPQAGPVGIE EDNDIEENED AGAASLPTQP TQPSSSSTYD
201 PSNMPSGNYT GIQIPPGAHA PANTPAEVPH STGVASNTIQ PTPQTIPAID
251 PALFNTISQG DVRLTPEDFA RAQKYCKYAG SALQYEDVST AVQNLQKALK
301 LLTTGRE
BLASTP hits
No BLASTP hits available
Alert BLASTP hits for DKFZphfbr2_78e18, frame 2
No Alert BLASTP hits found
Pedant information for DKFZphfbr2_78e18, frame 2
Report for DKFZphfbr2_78e18.2
ELENGTH! 313 EMU! 344b3-15
Epl! S-b4
EHOMOL! PIR:TD471δ hypothetical protein F10M23-1D -
Arabidopsis thaliana 3e-22
EKU! All_Alpha EKU! LOU COMPLEXITY Ib-bl *
SElQ GEFGVEMAALAPLPPLPAiQLKSIlQHHLRTAiQEHDKRDPVVAYYCRLYAMlQTGMKIDSKTP
SEG xxxxxxxxxxxxx PRD ccchhhhhheeecccccchhhhhhhhhhhhhhhhcccceeehhhhhhhhhhccccccccc
SElQ ECRKFLSKLMD(QLEALKK<QLGDNEAIT(2EIVGCAHLENYALKMFLYADNEDRAGRFHKNM
SEG
PRD chhhhhhhhhhhhhhhhhhhcccccccchhhhhhhhhhhhhhhhhhhccccccccccchh
SElQ IKSFYTASLLIDVITVFGELTDENVKHRKYARUKATYIHNCLKNGETPiQAGPVGIEEDND
SEG xxxxxxx
PRD hhhhhhhhhhhhhhhcccccccchhhhhhhhhhhhhhhhhhhhccccccccccccccccc SEA IEENEDAGAASLPTiQPTlQPSSSSTYDPSNMPSGNYTGIlQIPPGAHAPANTPAEVPHSTGV
SEG xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
PRD cccccccccccccccccccccccccccccccccccccccccccccccccccccccccccc
SElQ ASNTIiQPTPlQTIPAIDPALFNTISiQGDVRLTPEDFARAiQKYCKYAGSALiQYEDVSTAVlQN SEG
PRD cccccccccccccccccccccccccccccccchhhhhhhhhhhhhcceeeecchhhhhhh
SElQ LlQKALKLLTTGRE SEG
PRD hhhhhhhhccccc
(No Prosite data available for DKFZphfbr2_78e18.2)
(No Pfam data available for DKFZphfbr2_78e18.2)
DKFZphfbr2_78i21
group: metabolism
DKFZphfbr2_78i21 encodes a novel 477 amino acid protein with similarity to beta-aspartate methyltransferases. The L-isoaspartyl methyltransferase (Pimt), as an example, is a highly conserved enzyme utilising S-adenosylmethionine (AdoMet) to methylate aspartate residues of proteins damaged by age-related isomerisation and deamidation. The new protein can find application in diagnosis/modulation of protein damage and age-related degenerative processes.
unknown protein weak similarity to beta-aspartate methyltransferase pimT of Mycobacterium leprae perhaps complete cds.
Sequenced by MediGenomix
Locus: unknown
Insert length: 1842 bp
Poly A stretch at pos. 1811, polyadenylation signal at pos. 1802
1 CCTTCGCGAA ACACTATGCT AATGGCATGG TGCCGCGGTC CTGTCTTGCT
51 GTGCCTGCGG CAGGGGCTCG GAACCAATTC ATTCCTGCAC GGCCTGGGGC
101 AGGAGCCCTT CGAGGGAGCT CGGTCACTGT GTTGCAGGTC CTCGCCTAGA
151 GACCTGCGAG ATGGAGAAAG AGAGCACGAG GCGGCACAAA GGAAAGCCCC
201 AGGAGCAGAG TCTTGCCCAT CTCTCCCTCT GAGCATCTCG GACATTGGGA
251 CTGGATGTCT TTCGTCACTG GAAAACCTCA GACTGCCGAC GCTGCGGGAA
301 GAGTCATCCC CTCGAGAGCT CGAGGACTCG AGCGGAGACC AGGGCCGGTG
351 CGGTCCCACA CACCAGGGAT CCGAGGATCC TTCGATGCTC TCGCAGGCCC
401 AGTCCGCTAC CGAGGTCGAA GAGCGTCACG TCTCCCCTTC TTGTTCAACT
451 TCCAGAGAGA GACCCTTTCA GGCTGGGGAA CTGATTTTAG CTGAGACTGG
501 GGAGGGAGAA ACAAAATTTA AGAAATTATT TAGGTTGAAC AACTTCGGAC
551 TCTTAAATAG TAACTGGGGG GCAGTCCCGT TCGGCAAGAT CGTGGGGAAG
601 TTCCCCGGCC AGATACTGAG GAGTTCCTTC GGTAAGCAGT ACATGCTGAG
651 GAGGCCAGCC TTGGAAGACT ATGTAGTATT GATGAAAAGA GGGACTGCCA
701 TAACATTCCC AAAGGATATT AATATGATTC TCTCAATGAT GGATATCAAC
751 CCAGGTGATA CTGTTTTGGA AGCTGGCTCA GGCTCTGGTG GAATGAGCTT
801 ATTTTTATCC AAAGCAGTTG GATCACAAGG ACGAGTCATA AGTTTTGAGG
851 TACGAAAAGA CCACCATGAT CTGGCTAAGA AGAATTACAA ACACTGGCGT
901 GATTCATGGA AATTAAGTCA TGTAGAAGAG TGGCCAGACA ATGTGGATTT
951 TATTCATAAG GACATTTCAG GAGCAACCGA AGACATAAAA TCTTTAACAT
1001 TTGACGCAGT AGCTTTGGAT ATGTTAAATC CTCATGTTAC TTTGCCTGTT
1051 TTTTACCCAC ATCTTAAGCA TGGTGGTGTA TGTGCTGTAT ATGTAGTAAA
1101 CATCACACAG GTTATTGAAC TTTTAGATGG AATTCGCACC TGTGAACTTG
1151 CTCTTTCATG TGAAAAGATA AGCGAGGTCA TTGTCAGAGA TTGGTTGGTT
1201 TGCCTTGCAA AACAGAAAAA TGGAATTTTA GCTCAAAAAG TAGAATCTAA
1251 AATCAACACA GATGTACAAC TAGATTCTCA AGAGAAAATT GGAGTTAAAG
1301 GTGAGCTGTT TCAAGAGGAT GACCATGAAG AATCGCATTC TGATTTTCCA
1351 TATGGATCAT TTCCCTATGT TGCTAGACCA GTACACTGGC AACCTGGTCA
1401 TACAGCTTTT CTTGTCAAGT TGAGGAAGGT CAAACCACAA CTTAACTGAG
1451 TACTCCAGAT GACAGTAACT GACTTGAAGA TGGAAAAATA TCAAAATAGA
1501 ACTTTATATT GAAAATCACT GCTTCCATAG ATTGGCATTT TTAGCTATTA
1551 CTATGACTTA TATAACTTAT ACATATAATT TTGAAAATAA CAACTAAAAG
1601 ATGTATAACA TAGCAAAACT GCTTAAACAT CCCATTTTGA CACTTGTCTT
1651 GCAGTTAGTT TGACATTTTG TAGTTAATGA TTCCAAATTG GTTTAGTTGG
1701 GCCATCTCAT TCTTCACTTC CTGTAAACCA CTCCATAGAT TTGTCTTTCT
1751 TCAAGAAATT AGTTTTCTTT CCTTTATTTG ATTGATGGTC ATTGACTACT
1801 GAAATAAAAT ATGCATTTTA AGAAAAAAAA AAAAAAAAAA AA
BLAST Results
No BLAST result
Medline entries
No Medline entry
Peptide information for frame 1
ORF from 16 bp to 1446 bp; peptide length: 477
Category: putative protein
Classification: no clue
1 MLMAWCRGPV LLCLRQGLGT NSFLHGLGQE PFEGARSLCC RSSPRDLRDG
51 EREHEAAQRK APGAESCPSL PLSISDIGTG CLSSLENLRL PTLREESSPR
101 ELEDSSGDQG RCGPTHQGSE DPSMLSQAQS ATEVEERHVS PSCSTSRERP
151 FQAGELILAE TGEGETKFKK LFRLNNFGLL NSNWGAVPFG KIVGKFPGQI
201 LRSSFGKQYM LRRPALEDYV VLMKRGTAIT FPKDINMILS MMDINPGDTV
251 LEAGSGSGGM SLFLSKAVGS QGRVISFEVR KDHHDLAKKN YKHWRDSWKL
301 SHVEEWPDNV DFIHKDISGA TEDIKSLTFD AVALDMLNPH VTLPVFYPHL
351 KHGGVCAVYV VNITQVIELL DGIRTCELAL SCEKISEVIV RDWLVCLAKQ
401 KNGILAQKVE SKINTDVQLD SQEKIGVKGE LFQEDDHEES HSDFPYGSFP
451 YVARPVHWQP GHTAFLVKLR KVKPQLN
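The ORF coordinates and the peptide listing above can be cross-checked against the nucleotide sequence. The sketch below assumes the ORF runs from position 16 to position 1446 (1-based, stop codon excluded) and translates the first ten codons with the standard genetic code; the function names and the 30-nucleotide excerpt are illustrative choices, not part of the original report.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def orf_peptide_length(start: int, end: int) -> int:
    """Peptide length for an ORF given 1-based inclusive coordinates (no stop codon)."""
    span = end - start + 1
    if span <= 0 or span % 3 != 0:
        raise ValueError("ORF span must be a positive multiple of three")
    return span // 3


def translate(dna: str) -> str:
    """Translate a DNA string codon by codon, stopping at the first stop codon."""
    dna = dna.upper()
    peptide = []
    for i in range(0, len(dna) - len(dna) % 3, 3):
        amino_acid = CODON_TABLE[dna[i:i + 3]]
        if amino_acid == "*":
            break
        peptide.append(amino_acid)
    return "".join(peptide)


# Nucleotides 16-45 of the insert, i.e. the first ten codons of the ORF.
orf_start = "ATGCTAATGGCATGGTGCCGCGGTCCTGTC"
print(orf_peptide_length(16, 1446))
print(translate(orf_start))
```

The same length check applies to any ORF quoted with inclusive coordinates, provided the stop codon is not counted in the range.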
BLASTP hits
No BLASTP hits available
Alert BLASTP hits for DKFZphfbr2_78i21, frame 1
No Alert BLASTP hits found
Pedant information for DKFZphfbr2_78i21, frame 1
Report for DKFZphfbr2_78i21.1
[LENGTH] 482
[MW] 53521.20
[pI] 6.26
[HOMOL] TREMBL:AF088800_2 product: "unknown"; Rhodococcus erythropolis ARC (arc) gene, complete cds, and unknown genes. 2e-23
[FUNCAT] r general function prediction [M. jannaschii, MJ0134] 6e-10
[FUNCAT] 05.07 translational control [S. cerevisiae, YJL125c] 6e-04
[BLOCKS] BL00601E
[BLOCKS] BL01271A
[KW] Alpha_Beta
[KW] LOW_COMPLEXITY 2.49 %
SEQ PSRNTMLMAWCRGPVLLCLRQGLGTNSFLHGLGQEPFEGARSLCCRSSPRDLRDGEREHE
SEG
PRD cccceeeeeecccccchhhhhccccceeeeeccccceeeceeeeccccccccccchhhhh
SEQ AAQRKAPGAESCPSLPLSISDIGTGCLSSLENLRLPTLREESSPRELEDSSGDQGRCGPT
SEG
PRD hhhhhccccccccccceeeeeecccccccceeeccccccccccccccccccccccccccc
SEQ HQGSEDPSMLSQAQSATEVEERHVSPSCSTSRERPFQAGELILAETGEGETKFKKLFRLN
SEG
PRD cccccccchhhhhhhhhhhhhccccccccccccccccccceeeecccccccceeeeeecc
SEQ NFGLLNSNWGAVPFGKIVGKFPGQILRSSFGKQYMLRRPALEDYVVLMKRGTAITFPKDI
SEG
PRD ccccccccccccccceeeccccceeeeecccceeeeccchhhhhhhhhhccceeeecccc
SEQ NMILSMMDINPGDTVLEAGSGSGGMSLFLSKAVGSQGRVISFEVRKDHHDLAKKNYKHWR
SEG xxxxxxxxxxxx
PRD cceeecccccccceeeeeccccchhhhhhhhhccccceeeeeehhhhhhhhhhhhhhhhh
SEQ DSWKLSHVEEWPDNVDFIHKDISGATEDIKSLTFDAVALDMLNPHVTLPVFYPHLKHGGV
SEG
PRD hccccccccccccceeeeecccccccccccccccceeeecccccccchhhhhhhcccccc
SEQ CAVYVVNITQVIELLDGIRTCELALSCEKISEVIVRDWLVCLAKQKNGILAQKVESKINT
SEG
PRD eeeeeechhhhhhhhhhhhhhhhhhhhccceeeeeehhhhhhhhhhccceeeeccccccc
SEQ DVQLDSQEKIGVKGELFQEDDHEESHSDFPYGSFPYVARPVHWQPGHTAFLVKLRKVKPQ
SEG
PRD cccccccccccccccccccccccccccccccccccccccccccccccceeeeeccccccc
SEQ LN
SEG
PRD cc
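The low-complexity keyword in the report appears to be the share of residues masked by SEG (the x positions in the SEG rows). As a hedged sketch, assuming that interpretation, the percentage can be recomputed from the SEG rows and the sequence length; `low_complexity_percent` is a hypothetical helper name.

```python
def low_complexity_percent(seg_rows: list, length: int) -> float:
    """Percentage of residues masked as low complexity ('x') by SEG."""
    if length <= 0:
        raise ValueError("sequence length must be positive")
    masked = sum(row.count("x") for row in seg_rows)
    return round(100.0 * masked / length, 2)


# The report above has a single masked run of 12 residues in a 482-residue sequence.
seg_rows = ["xxxxxxxxxxxx"]
print(low_complexity_percent(seg_rows, 482))
```

With 12 masked residues out of 482 this gives about 2.49 %, and three runs of 13, 7 and 32 residues in a 313-residue sequence give about 16.61 %.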
(No Prosite data available for DKFZphfbr2_78i21.1)
(No Pfam data available for DKFZphfbr2_78i21.1)
DKFZphmel2_12j1
group: melanoma derived
DKFZphmel2_12j1 encodes a novel 905 amino acid protein, which has similarity to integrin I of Saccharomyces cerevisiae.
The novel protein contains a leucine zipper.
No informative BLAST results; no predictive Prosite, Pfam or SCOP motif. The new protein can find application in studying the expression profile of melanoma-specific genes.
weak similarity to integrin I (Saccharomyces cerevisiae)
Sequenced by EMBL
Locus: unknown
Insert length: 2942 bp
Poly A stretch at pos. 2926, no polyadenylation signal found
1 CGAAAGCTAA AGGCCGGCGC ACGCTGGGCG GTGGTGGTCC CTAAGCCGGG
51 CCGCGGCCGG TGCAATGGAC TCCACTGCCT GCTTGAAGTC CTTGCTCCTG
101 ACTGTCAGTC AGTACAAAGC CGTGAAGTCA GAGGCGAACG CCACTCAGCT
151 TTTGCGGCAC TTGGAGGTAA TTTCTGGACA GAAACTCACA CGACTATTTA
201 CATCAAATCA GATATTAACA AGTGAATGCT TGAGTTGCCT TGTAGAGCTA
251 CTTGAAGACC CCAACATAAG TGCTTCACTG ATCTTAAGTA TTATCGGTTT
301 GCTGTCTCAA CTAGCAGTAG ACATTGAAAC CAGAGATTGT CTTCAGAATA
351 CATATAATCT GAATAGTGTG CTGGCGGGAG TGGTTTGTCG GAGCAGCCAC
401 ACTGATTCGG TGTTTTTGCA GTGCATTCAA CTTCTACAGA AGTTAACATA
451 TAATGTCAAA ATTTTCTATT CTGGTGCCAA TATAGATGAA TTAATTACGT
501 TCCTGATAGA TCACATTCAA TCTTCTGAAG ATGAGTTAAA AATGCCTTGT
551 CTAGGATTAT TGGCAAATCT TTGTCGGCAC AATCTTTCTG TTCAAACGCA
601 CATAAAGACA TTGAGTAATG TGAAATCTTT TTATCGAACT CTTATCACCT
651 TGTTGGCCCA TAGTAGTTTA ACTGTGGTTG TGTTTGCACT TTCAATATTA
701 TCCAGTTTGA CATTAAATGA AGAGGTGGGG GAAAAGCTAT TCCATGCTCG
751 AAACATTCAT CAGACTTTTC AACTAATATT TAATATTCTC ATAAACGGTG
801 ATGGCACTCT AACTAGAAAG TATTCAGTTG ACCTACTGAT GGATCTCCTT
851 AAGAATCCTA AAATTGCTGA TTATCTCACC AGATATGAGC ACTTTTCTTC
901 ATGTCTTCAC CAAGTATTAG GTCTTCTTAA TGGAAAGGAT CCTGATTCCT
951 CTTCAAAGGT TTTAGAATTA CTTCTTGCCT TCTGTTCAGT GACTCAGCTG
1001 CGCCATATGC TCACTCAGAT GATGTTTGAA CAGTCTCCAC CTGGCAGCGC
1051 CACTCTGGGA AGCCATACTA AATGTTTAGA ACCTACTGTG GCTCTACTGC
1101 GCTGGTTAAG CCAACCTTTG GACGGATCAG AAAACTGTTC TGTTTTAGCA
1151 TTGGAGTTGT TCAAGGAAAT ATTTGAGGAT GTCATAGATG CTGCTAACTG
1201 TTCCTCGGCT GATCGTTTTG TGACCCTTCT GCTGCCTACA ATCCTTGATC
1251 AACTTCAGTT CACAGAACAA AATCTAGATG AGGCTTTAAC AAGAAAAAAT
1301 GTGAAAGGGA TTGCCAAGGC CATTGAAGTT TTGTTAACTC TCTGTGGAGA
1351 TGATACACTA AAAATGCATA TTGCAAAAAT CTTGACAACT GTCAAGTGTA
1401 CCACTCTTAT AGAACAACAA TTTACATATG GCAAGATTGA CCTGGGATTT
1451 GGAACAAAGG TTGCAGATTC TGAATTATGC AAACTTGCTG CTGATGTAAT
1501 TTTGAAAACT CTTGATTTGA TTAACAAACT TAAACCATTG GTTCCTGGTA
1551 TGGAAGTAAG CTTCTACAAA ATACTTCAGG ACCCACGTTT GATTACTCCT
1601 TTGGCTTTTG CTTTAACGTC AGATAATAGA GAACAAGTAC AGTCTGGACT
1651 GAGAATATTA TTGGAGGCTG CTCCACTGCC AGATTTTCCT GCTTTAGTAC
1701 TTGGAGAAAG TATAGCAGCA AACAATGCCT ATAGACAACA GGAAACAGAA
1751 CATATACCCA GAAAAATGCC CTGGCAATCA TCAAATCACA GTTTTCCAAC
1801 ATCAATAAAG TGTTTAACTC CTCATTTGAA AGATGGTGTT CCTGGATTGA
1851 ATATTGAAGA ATTAATAGAG AAACTTCAGT CTGGAATGGT GGTAAAGGAT
1901 CAGATTTGTG ATGTGAGAAT ATCTGACATA ATGGATGTAT ATGAAATGAA
1951 ACTATCCACA TTAGCTTCCA AAGAAAGCAG GCTACAAGAT CTTTTGGAAA
2001 CAAAAGCTCT AGCCCTTGCA CAGGCTGATA GACTGATTGC TCAGCATCGC
2051 TGTCAAAGAA CTCAAGCTGA AACAGAGGCA CGGACACTTG CTAGTATGTT
2101 GAGAGAAGTT GAGAGAAAAA ATGAAGAGCT TAGTGTGTTG CTGAAGGCGC
2151 AGCAAGTTGA ATCAGAAAGA GCGCAGAGTG ATATTGAGCA TCTCTTTCAA
2201 CATAATAGGA AGTTAGAGTC TGTGGCTGAA GAACATGAAA TACTGACAAA
2251 ATCCTACATG GAACTTCTTC AGAGAAATGA AAGTACTGAA AAGAAGAATA
2301 AAGATTTACA GATCACATGT GATTCTCTGA ATAAACAAAT TGAGACAGTG
2351 AAAAAGTTGA ATGAGTCACT CAAGGAACAA AATGAAAAAA GTATTGCCCA
2401 ATTAATAGAG AAAGAAGAAC AGAGAAAAGA AGTACAGAAT CAGCTAGTAG
2451 ACAGAGAACA TAAGCTAGCA AATTTGCATC AAAAAACAAA AGTACAAGAA
2501 GAAAAGATTA AAACCTTACA AAAGGAAAGG GAAGATAAGG AAGAAACCAT
2551 TGATATCCTT AGAAAAGAAT TAAGCAGAAC AGAACAGATA AGAAAAGAGT
2601 TGAGCATTAA GGCTTCCTCC CTAGAGGTTC AAAAGGCACA ATTAGAAGGT
2651 CGTTTGGAAG AGAAAGAGTC CTTGGTGAAA CTTCAGCAAG AGGAATTGAA
2701 CAAACACTCC CACATGATAG CAATGATCCA CAGTTTAAGT GGTGGAAAAA
2751 TAAATCCAGA AACTGTGAAT CTCAGTATAT AGACATTATG GCATTTTGGA
2801 ATTTGTAATC TCATGATATT TTTGATGTAT TTATCTATTG GAGGGGGGGT
2851 GGGTAGGGGA GTTAATTTGT GACTTCGTAA CAATAAGAAG TTATTATCTA
2901 ATTTAGTAAA GACCCTGATC TGTTGCAAAA AAAAAAAAAA AA
BLAST Results
No BLAST result
Medline entries
1b031111:
Hostetter MK, Tao NJ, Gale C, Herman DJ, McClellan M, Sharp RL, Kendrick KE.; Antigenic and functional conservation of an integrin I-domain in Saccharomyces cerevisiae. Biochem Mol Med 1995 Aug;55(2):122-30
11456454:
Berton G, Lowell CA.; Integrin signalling in neutrophils and macrophages. Cell Signal 1999 Sep;11(9):621-35
Peptide information for frame 2
ORF from 65 bp to 2779 bp; peptide length: 905
Category: putative protein
Classification: Cellular transport and traffic
Prosite motifs: LEUCINE ZIPPER (331-352)
1 MDSTACLKSL LLTVSQYKAV KSEANATQLL RHLEVISGQK LTRLFTSNQI
51 LTSECLSCLV ELLEDPNISA SLILSIIGLL SQLAVDIETR DCLQNTYNLN
101 SVLAGVVCRS SHTDSVFLQC IQLLQKLTYN VKIFYSGANI DELITFLIDH
151 IQSSEDELKM PCLGLLANLC RHNLSVQTHI KTLSNVKSFY RTLITLLAHS
201 SLTVVVFALS ILSSLTLNEE VGEKLFHARN IHQTFQLIFN ILINGDGTLT
251 RKYSVDLLMD LLKNPKIADY LTRYEHFSSC LHQVLGLLNG KDPDSSSKVL
301 ELLLAFCSVT QLRHMLTQMM FEQSPPGSAT LGSHTKCLEP TVALLRWLSQ
351 PLDGSENCSV LALELFKEIF EDVIDAANCS SADRFVTLLL PTILDQLQFT
401 EQNLDEALTR KNVKGIAKAI EVLLTLCGDD TLKMHIAKIL TTVKCTTLIE
451 QQFTYGKIDL GFGTKVADSE LCKLAADVIL KTLDLINKLK PLVPGMEVSF
501 YKILQDPRLI TPLAFALTSD NREQVQSGLR ILLEAAPLPD FPALVLGESI
551 AANNAYRQQE TEHIPRKMPW QSSNHSFPTS IKCLTPHLKD GVPGLNIEEL
601 IEKLQSGMVV KDQICDVRIS DIMDVYEMKL STLASKESRL QDLLETKALA
651 LAQADRLIAQ HRCQRTQAET EARTLASMLR EVERKNEELS VLLKAQQVES
701 ERAQSDIEHL FQHNRKLESV AEEHEILTKS YMELLQRNES TEKKNKDLQI
751 TCDSLNKQIE TVKKLNESLK EQNEKSIAQL IEKEEQRKEV QNQLVDREHK
801 LANLHQKTKV QEEKIKTLQK EREDKEETID ILRKELSRTE QIRKELSIKA
851 SSLEVQKAQL EGRLEEKESL VKLQQEELNK HSHMIAMIHS LSGGKINPET
901 VNLSI
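The Prosite leucine zipper entry for this peptide spans residues 331-352, i.e. 22 positions. A minimal sketch of how such a hit can be located, assuming the usual leucine zipper consensus of four leucines spaced seven residues apart (L-x(6)-L-x(6)-L-x(6)-L), is given below; the function name and the `offset` parameter are illustrative and not taken from the report.

```python
import re

# Leucine zipper consensus: L-x(6)-L-x(6)-L-x(6)-L (22 residues in total).
# The lookahead allows overlapping matches to be reported as well.
LEUCINE_ZIPPER = re.compile(r"(?=(L.{6}L.{6}L.{6}L))")


def find_leucine_zippers(peptide: str, offset: int = 1) -> list:
    """Return (start, end) positions of leucine zipper matches, numbered from `offset`."""
    hits = []
    for match in LEUCINE_ZIPPER.finditer(peptide.upper()):
        start = match.start(1) + offset
        hits.append((start, start + len(match.group(1)) - 1))
    return hits


# Residues 331-352 of the peptide listed above.
fragment = "LGSHTKCLEPTVALLRWLSQPL"
print(find_leucine_zippers(fragment, offset=331))
```

Running the same function on the full 905-residue peptide with the default offset reports positions in the numbering used by the listing; the pattern is short, so additional chance matches elsewhere in a long coiled-coil protein are possible.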
BLASTP hits
No BLASTP hits available
Alert BLASTP hits for DKFZphmel2_12j1, frame 2
TREMBL:SCINTANA_1 Saccharomyces cerevisiae integrin analogue gene, complete cds., N = 1, Score = 216, P = 1.3e-13
>TREMBL:SCINTANA_1 Saccharomyces cerevisiae integrin analogue gene, complete cds.
Length = 1,015
HSPs:
Score = 216 (32.4 bits), Expect = 1.3e-13, P = 1.3e-13
Identities = 80/302 (26%), Positives = 155/302 (51%)
(Query : 517 IEELIEKLiQSGMVVKDlQICDVRISDIM — DVYEMKLSTLASKESRLiQDLLETKALALAiQ b53
I L EKL++ D+ + +IS++ + E +L+ + ++ L+
LET AL +
Sb jct : 275 ISLLKEKLETATTANDENVN- KISELTKTREELEAELAAYKNLKNELETKLETSEKALKE 333
(Query : b54 A---DRLIA(QHRCtQRT<QAETEAR TLASMLREVERKNEELSVLLKA-
(QiQVESERAtQ 704 + + + + IQ + TE + +L + L +E +++E + L+ LK
+ Q+ ++ iQ Sbjct : 334
VKENEEHLKEEKIlQLEKEATETKiQlQLNSLRANLESLEKEHEDLAAiQLKKYEElQIANKERiQ 313
(Query : 7D5 SDIEHLFlQHNRKLESVAEEHEILTKSYMEL LlQRNESTEKKNKDLlQIT-
CDSLNKiQIE 7b0
+ E + <Q N ++ S +E + E + K EL ++ +ST ++ +L+ + D + LN <QI + Sb jct : 314 YN-
EEISiQLNDEITSTlQlQENESIKKKNDELEGEVKAMKSTSEElQSNLKKSEIDALNLlQIK 452
(Query : 7bl
TVKKLNESLKE(QNEKSIA(QLIEKEE(QRKEV(QN(QLVDREHKLANLH(QKTKV(QEEKIKT 817 +KK NE+ + +SI + + + KE + IQ + + +E +++ L K K
E + K
Sb j ct : 453 ELKKKNETNEASLLESIKSIESETVKIKELlQDECNFKEKEVSELEDKLKASEDKNSKYLE 512 (Query : 616 LiQKEREDKEETIDI LRKELSRTElQIRKELSIKASSLE-
V(QKA(QLEGRLEEKESLVK 672
LiQKE E +E +D L+ +L + + K S L + + K E R
+ E L K
Sb j ct : 513 LiQKESEKIKEELDAKTTELKIlQLEKVTNLSKAKEKSESELSRLKKTSSEERKNAEEfQLEK 572
(Query : 673 L(Q(QE 67b L+ E
Sb j ct : 573 LKNE 57b
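Each HSP header quotes identities and positives as a count over the aligned length together with a whole-number percentage. The sketch below shows the arithmetic behind those figures, assuming simple rounding to the nearest integer (the exact rounding rule used by the BLAST program is an assumption here); `hsp_percent` is a hypothetical helper.

```python
def hsp_percent(matches: int, aligned: int) -> int:
    """Whole-number percentage of matching positions over the aligned length."""
    if aligned <= 0:
        raise ValueError("aligned length must be positive")
    if not 0 <= matches <= aligned:
        raise ValueError("matches must lie between 0 and the aligned length")
    return round(100 * matches / aligned)


# First HSP above: 80 identical and 155 positive positions over 302 aligned positions.
print(hsp_percent(80, 302), hsp_percent(155, 302))
```

For the first HSP this yields 26 % identities and 51 % positives over 302 aligned positions.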
Score = 18b (27-1 bits)ι Expect = 2-De-lOι P = 2-Oe-lO Identities = 62/301 (27*)ι Positives = 155/301 (51*)
(Query: 516 EELIEKLiQSGMVVKDlQICDVRISDIMDVYEMKLSTLASKESR LlQD- LLETKALALAώ b53
+ELI +L(Q+ +K + D S++ V L K++ LiQD +L
K
Sbjct: bll DELI-
RLiQNENELKAKEIDNTRSELEKVSLSNDELLEEKlQNTIKSLiQDEILSYKDKITRN bbl
(Query: b54
ADRLIA(QHRC(QRT(QAETEARTLASMLREVERKNEELSVLLKA(Q(QVESERAιQSDIEHLF 3H 713 + + L + + R + E+ L LR + ++ LK + ES +
++++E + Sb j ct : b70 DEKLLSIERDSKRDLES
LKElQLRAAlQESKAKVEEGLKKLEEESSKEKAELEKSKEM 725
(Query : 714 NRKLESVAEEHEILTKSYMELLiQRN-ESTEKKNKDLlQITCDSL- NKiQIETVKKLNESLKE 771 +KLES E +E KS ME ++++ E E+ K + +L +++ + +
++NES K+ Sb j ct : 72b MKKLESTIESNETELKSSMETIRKSDEKLEiQSKKSAEEDIKNLiQHEKSDLISRINESEKD 785 (Query : 772 (QNE-KSIA<QLIEKEE<QRKE-V(QN<QLVDREHKL- ANLHlQKTKViQEEKIKTLiQKEREDKEET 826
E KS ++ K E V+ +L + + K+ N + T V + K++
+++E +DK+ Sbjct: 78b IEELKSKLRIEAKSSSELETVKlQELNNAiQEKIRVNAEENT- VLKSKLEDIERELKDKiQAE 844
(Query: 821 IDILR — KEL — SRTEiQIRKEL SIKASSLEVlQKAiQLE-
GRLEEKESLVKLiQ 674
I + KEL SR + + + +EL S + S EV + K <Q + E
+L+EK L++ + Sbjct: 645 IKSNIQEEKELLTSRLKELEIQELDSTIQIQKAIQKSEEESRAEVRKFIQVEKSIQLDEKAMLLETK 104
(Query: 675 (QEEL-NK 680
+ L NK Sbjct: IDS YNDLVNK 111 Score = 173 (2b.0 bits)ι Expect = 5-7e-D1ι P = 5-7e-01 Identities = 77/267 (2b*)ι Positives = 14b/267 (50*)
(Query: bOl IEKLiQSGMVVKDiQICDVRISDIMDVYEMKLSTLASKES — RL(QDLLETKALALA(QADRLI b5β + + K + + K + + + IS + D E+ ST ES + D LE +
A +
Sbjct: 38D LKKYEE(QIANKER(QYNEEIS(QLND~EIT- ST(Q(QENESIKKKNDELEGEVKAMKST 432 (Query: b51 A(QHRC(QRT(QAETEARTLASMLREVERKNE--
ELSVLLKAιQιQVESERAιQSDIEHLF(QH-NR 715
++ + ++E +A L ++E+++KNE E S+L + +ESE + 1+
L N
Sbjct: 433 SEE<QSNLKKSEIDALNL--(QIKELKKKNETNEASLLESIKSIESETVK-- IKELiQDECNF 468
(Query: 71b KLESVAEEHEILTKSY — - MELLtQRNESTEKKNKDLiQITCDSLNKiQIETVKKLNESLKElQ 772
K + V + E + L S + L+ + +EK + + L L (Q + E V L+++ KE+
Sbjct: 481 KEKEVSELEDKLKASEDKNSKYLEL(QKESEKIKEELDAKTTELKI<QLEKVTNLSKA-KEK 547
(Query: 773 NEKSIA(QLIE-KEE(QRKEV(QN(QL--VDREHKLAN-- LHlQKTKViQEEKIKTLlQKEREDKEE 827
+ E + + + L + E + RK + (QL + E ++ N ++ K+ E T +
+ E +K
Sbjct: 548
SESELSRLKKTSSEERKNAEE(QLEKLKNEI(QIKN(QAFEKERKLLNEGSSTIT(QEYSEKIN b07
(Query: 828 TI-
DILRKELSRTE(QIRKELSIKASSLEV(QKA(QLEGRLEEKESLVKL(Q(QEELNKHSHMI 665
T+ D L + + E KE+ S LE + LEEK + + +K (Q + E +
+ I Sbjct: b08
TLEDELIRLiQNENELKAKEIDNTRSELEKVSLSNDELLEEKlQNTIKSLiQDEILSYKDKI bbb
Score = 171 (25-7 bits)ι Expect = 1-3e-01ι P = 1-3e-D1 Identities = 7b/311 (24*)ι Positives = 152/311 (48*)
(Query: 51b NIEELIEKL(QSGMVVKD(Q
ICDVRISDIMDVYEMKLSTLASKESRLiQDLLETKA b46 N EE +EKL++ + +K+Q + + S I Y K++TL +
RL(Q+ E KA Sbjct: Sb5
NAEE(QLEKLKNEI(QIKN(QAFEKERKLLNEGSSTIT(QEYSEKINTLEDELIRL(QNENELKA b24
(Query: b41 LALAOADRLIAiQHRCtQRTiQA-ETEARTLASMLREVERKNEELSVL- LKAlQiQVESERAiQSD 70b
+ + + + E + T+ S+ E+ ++++ K
+E + ++ D Sbjct: b25
KEIDNTRSELEKVSLSNDELLEEKiQNTIKSL(QDEILSYKDKITRNDEKLLSIERD-SKRD b83
(Query: 7D7 IEHLFIQHNRKL-ESVAEEHEILTKSYMELLIQRNESTEKKN---
KDLiQITCDS LNKlQ 758 +E L + R ES A+ E L K E + EK K L+ T +S
L
Sbjct: b84 LESLKElQLRAAiQESKAKVEEGLKKLEEESSKEKAELEKSKEMMKKLESTIESNETELKSS 743 (Query: 751 IETVKKLNESLKEIQNEKSIAIQLIEK-
EEIQRKEVIQNIQLVDREHKLANLHIQKTKVIQEE K 614
+ ET + + K +E L E(Q + + KS + 1+ + ++ ++ + + + + E + L K + + + + +
Sbjct: 744 METIRKSDEKL- ElQSKKSAEEDIKNLiQHEKSDLISRINESEKDIEELKSKLRIEAKSSSE 802
(Query: 615 IKTLlQKEREDKEETIDILRKE LSRTElQIRKELSIKASSL---
EVIQKAIQLEGRLEEK 6b7
++T+++E + +E I + +E S+ E I +EL K + + + +K L RL+E
Sbjct: 603 LETVKIQELNNAIQEKIRVNAEENTVLKSKLEDIERELKDKIQAEIKSNIQEEKELLTSRLKEL 8b2
(Query: 6b8 ESLVKLiQlQEELNK 880 E + (Q + + K
Sbjct: 6b3 E(QELDST(Q(QKA(QK 875
Score = lb5 (24-8 bits)ι Expect = 4-le-08ι P = 4-le-Dβ Identities = b5/28b (22*)ι Positives = 141/28b (52*)
(Query: 515 LNIEELIEKLlQSGMVVKDlQICDVR-ISDIMDVYEMKLSTLASKESRL- (QDLLETKALALA b52
+N ++ + L+ + K I +++ I++ ++ +++ + L+ ++ + ++L+E K+ Sbjct: 114 VNHlQKETKSLKEDIAAK--
ITEIKAINENLEKMKI(QCNNLSKEKEHISKELVEYKS-RF(Q 170
(Query: b53 (QADRLIA(QHRClQRT(QAETEARTLASMLREVERKNEELSVLLKA(Q(QVESE
-RAiQSDIE 7D8 D L+A+ T+ + ++LA+ ++++ +NE L ++ + ES
(Q+ 1 +
Sbjct: 171 SHDNLVAK LTE
KLKSLANNYKDMlQAENESLIKAVEESKNESSIiQLSNLiQNKID 223 (Query: 7D1 HLF(QH--NRKLE—
SVAEEHEILTKSYMELL(QRNESTEKKNKDL(QITCDSLNK(QIETVKK 7b4
+ Q N ++E S+ + E L K+ +L (Q E K+ + D Ql +K+ Sbjct: 224 SMSIQEKENFIQIERGSIEKNIEIQLKKTISDLEIQTKEEIISKSDSSK
DEYESiQISLLKE 26D
(Query: 7b5 LNESLKE(QNEKSIA(QLIEKEE(QRKEV(QN(QLVDREHKLANLH(QKTKV(QEEKIKTLι3KERED 824
E+ N++++ ++ E + R+E++ +L ++ L K + E+ +K
+ + + E
Sbict: Pfll
KLETATTANDENVNKISELTKTREELEAELAAYKNLKNELETKLETSEKALKEVKENEEH 340
(Query: 825 - KEETIDILRKELSRTE(QIRKELSIKASSLEV(QKA(QLEGRLEEKESLVKL<Q(QEELNK 880
KEE I L KE + T + iQ L SLE + L +L + + E + + +
+ N + Sbjct: 341 LKEEKIiQ-
LEKEATETK(Q(QLNSLRANLESLEKEHEDLAA(QLKKYEEiQIANKER(QYNE 31b
Score = 158 (23-7 bits)ι Expect = l-1e-07ι P = 1. le-07 Identities = 74/2bδ (27*)ι Positives = 13b/2b8 (50*)
(Query: 516 EELIEKL(QSGMVVKD(QICDVRISDIMDVYEM--KL- STLASKESRL(QDLLET--KALALA b52
+E -K++ G+ ++ +++ EM KL ST+ S E+ L+ +ET
K + Sbjct: b15
(QESKAKVEEGLKKLEEESSKEKAELEKSKEMMKKLESTIESNETELKSSMETIRKSDEKL 754
(Query: b53
(QADRLIA(QHRC(QRT(QAETEARTLASMLREVERKNEELSVLLKAιQ(QVESERA(QSDIEHLF(Q 712 + + A+ + IQ E L S + E E+ EEL L+ + S
+ + + L
Sbjct: 755 EiQSKKSAEEDIKNLώHEKS — DLISRINESEKDIEELKSKLRIEAKSSSELETVKlQELNN 612 (Query: 713 HNRKLESVAEEHEILTKSYMELL(QRNESTEKKNKDL(QITCDSLNKtQIET--
VKKLNESLK 77D
K+ AEE+ +L KS +E ++R E K+K +1 + K++ T
+K+L + L
Sbjct: δl3 AiQEKIRVNAEENTVL-KSKLEDIER ELKDK3AEIKSN(QEEKELLTSRLKELE(QELD 6b7
(Query: 771 E(QNEKSIA(QLIEKEE(QRKEV(QNιQLVDR~- EHKLANLHlQKTKViQEEKIKTLiQKEREDKEE δ27
+ K AiQ E EE R EV+ V + + + K L K K + +++ + ++
Sbjct: δbδ ST(Q(QK--A(QKSE- EESRAEVRKFiQVEKSiQLDEKAMLLETKYNDLVNKElQAUKRDEDTVKK 124
(Query: 626 TIDILRKELSRTEιQIRKEL-SIKASSLEV(QKA(QLEGRLE 8b5 T D R+E+ E++ KEL ++KA + ++++A E R E
Sbjct: 125 TTDSiQRiQEI- — EKLAKELDNLKAENSKLKEAN-EDRSE 151
Score = 155 (23-3 bits)ι Expect = 3-1e-07ι P = 3-1e-07 Identities = 73/2b1 (27*)ι Positives = 133/2b1 (41*)
(Query: b24 DVYEMKLSTLASKESRL(QD-LLETKALALA(QADRLIA(QHRC(QRT(QAET- — EARTLASML b71 ++ E K +T+ S LflD +L K ++L++ R + E+ +
R
Sbjct: b43 ELLEEKlQNTIKS
LlQDEILSYKDKITRNDEKLLSIERDSKRDLESLKEiQLRAAiQESK blδ
(Query: bβO REVE — RKNEELSVLLKAlQiQVESERAiQSDIEHLFiQHNRKLESVAEEHEILTKSYMELLlQ 73b
+VE +K EE S KA+ +S+ +E + N + E +
KS +L Q Sbjct: bll AKVEEGLKKLEEESSKEKAELEKSKEMMKKLESTIESNET-- ELKSSMETIRKSDEKLEiQ 75b
(Query: 737 RNESTEKKNKDLiQITCDSLNKiQIETVKKLNESLKElQ--- NEKSIAiQLIEKEEiQRKEVlQNlQ 713 +S E+ K + L(Q L +1 +K E LK + KS + + L
+ + + Q +
Sbjct: 757 SKKSAEEDIKNLlQHEKSDLISRINESEKDIEELKSKLRIEAKSSSELETVKlQELNNAlQEK 61b (Query: 714 L-VDREH
KLANLHlQKTKVlQEEKIKTLiQKEREDKEETIDILRKELSRTEiQIRKEL 64b
+ V+ E KL ++ ++ K ++ +IK+ (Q + E+E + L +EL
T + (Q + +
Sbjct: 817 IRVNAEENTVLKSKLEDIERELKDK(QAEIKSN(QEEKELLTSRLKELE(QELDST<Q(Q-KA<QK 875
(Query: 847 SIKASSLEV(QKA(QLE-GRLEEKESLVKL<Q(QEEL-NK 860
S + S EV + K (Q + E +L + EK L + + + +L NK Sbjct: 87b SEEESRAEVRKFlQVEKSlQLDEKAMLLETKYNDLVNK 111
Score = 14b (21-1 bits)ι Expect = 3-Se-Dbι P = 3.5e-0b Identities = 73/311 (23*)ι Positives = 152/311 (46*)
(Query: 520 DNREώViQSGLRIL LEAAPLPDFPALV-- LGESIAANNAYRIQIQETEHIPRK-MPUIQ 571
+++ +V+ GL+ L E A L ++ L +1 +N + E I
Sbjct: bib ESKAKVEEGLKKLEEESSKEKAELEKSKEMMKKLESTIESNETELKSSMETIRKSDEKLE 755
(Query: 572 SSNHSFPTSIKCLTPHLKDGVPGLNIEEL- IEKLlQSGMVVKDlQICDVRISDIMDVYEMKL b30
S S IK L D + +N E IE+L+S + + + + S
+ + + +L Sbjct: 75b (QSKKSAEEDIKNLlQHEKSDLISRINESEKDIEELKSKLRI
EAKSSSELETVKlQEL 61D
(Query: b31 STLASK— -
ESRLIQDLLETKALALAIQADRLIAIQHRCIQRTIQAETEARTLASMLREVERKNE b87 + K + +L + + K L +R + + + + E L S
L+E+E++ +
Sbjct: 811 NNAlQEKIRVNAEENTVLKSK--- LEDIERELKDK(3AEIKSN(QEEKELLTSRLKELE(QELD 6b7 (Query: bδ8
ELSVLLKAiQlQVESERAlQSDIEHLFiQHNRKLESVAEEHEILTKSYMELLiQRNESTEKKNKD 747
S KAtQ+ E E + + + + + F<Q + + E+ +L Y +L+ +
+ + + + Sbjct: 6bβ — STIQIQKAIQKSEEE-SRAEVRK-FIQVEKS — (QLDEKAMLLETKYNDLVNKElQAUKRDEDT 121
(Query: 748 L(QITCDSLNK(QIETVKKLNESLKE(QNEKSIA(QLIEKEE(QRKEV(QN(QLV--- DREHKLANL 804
++ T DS ++IE + K ++LK +N K L E E R E+ + ++ D
+ K N
Sbjct: 122 VKKTTDSIQRIQEIEKLAKELDNLKAENSK
LKEANEDRSEIDDLMLLVTDLDEK— NA 175
(Query: 605 H(QKTKV(QEEKIKTL(QKEREDKEETID 830
++K+++ ++ E +D+EE D
Sbjct: 17b KYRSKLKDLGVEISSDEEDDEEEEDD 1001 Score = 14b (21-1 bits)ι Expect = 4-be-Dbι P = 4-be-Ob Identities = 82/313 (2b*)ι Positives = 145/313 (4b*)
(Query: 518
EELIEKLlQSGMVVKDiQICDVRISDIMDVYEMKLSTLASKESRLiQDLLETKALALAlQADRL b57 EEL +L + +K + + + + + E+K + KE + + <Q LE +A
Sbjct: 304 EELEAELAAYKNLKNELETKLETSEKALKEVKENEEHLKEEKIlQ — LEKEATETKlQiQ 358 (Query: b58 IA(QHRC(QRT(QAETEARTLASMLREVERK NEELSVL
LKA(Q(QVESERA(QSD 70b
+ R E E LA+ L + + E + NEE+S L + + (Q
E + E +
Sbjct: 351 LNSLRANLESLEKEHEDLAA(QLKKYEE(QIANKER(QYNEEIS(QLNDEITST(Q(QENESIKKK 416
(Query: 707 IEHLFiQHNRKLESVAEEHEILTKSYMELLlQRN- ESTEKKNKDL(QITCDSLNK(QIET-VKK 7b4
+ L + ++S +EE L KS ++ L + +KKN+ + + K IE+ K
Sbjct: 411 NDELEGEVKAMKSTSEE(2SNLKKSEIDALNL(QIKELKKKNETNEASLLESIKSIESETVK 476
(Query: 7b5 LNESLKE(QN--EKSIA(QLIEK---EE(QRKEV<QN(QLVDREHKLAN-LH(QKT-- -KVlQEEKI 615
+ E E N EK +++L +K E + +L K+ L KT
K + (Q EK +
Sbjct: 471
IKEL(QDECNFKEKEVSELEDKLKASEDKNSKYLEL(QKESEKIKEELDAKTTELKIiQLEKV 536
(Query: δlb
KTLlQKEREDKEETIDILRKELSRTEiQIRKELSIKASSLEViQKAlQLEGRLEEKESLVKLiQlQ δ75
L K +E E ELSR + + K S + + E iQ +L+ ++ K
+ + + Sbjct: 531 TNLSKAKEKSES ELSR—-
LKKTSSEERKNAEElQLEKLKNEIlQIKNiQAFEKER 5βδ
(Query: 67b EELNKHSHMIAMIHSLSGGKINPETVNL 103 + LN+ S I +S + E + L Sbjct: 561 KLLNEGSSTITlQEYSEKINTLEDELIRL bib
Score = 145 (21-8 bits)ι Expect = 5-le-Obι P = 5-le-Ob Identities = 51/24b (23*)ι Positives = 115/24b (4b*) (Query: t,34 ASKESRLiQ- DLLETKALALAIQADRLIAIQHRCIQRTIQAETEARTLASMLREVERKNEELSVL b12
+ ES +(Q L+ K + + + (Q + +R E L + + + E + EE + +
Sbjct: 2D7 SKNESSIIQLSNLIQNKIDSMSIQEKE
NF(QIERGSIEKNIE(QLKKTISDLE(QTKEE--II 2bl
(Query: b13 LKA(Q(QVESERA(QSDIEHLF(QHNRKLESVAEEHEI LTKSYMELLlQRNESTEKKNKD 747
K+ + E +S I L + + + A + + LTK+ EL
+ + +
Sbjct: 2b2 SKSDSSKDEY-ESiQIS-
LLKEKLETATTANDENVNKISELTKTREELEAELAAYKNLKNE 311
(Query: 746
L(QITCDSLNK(QIETVKKLNESLKE(QNEKSIA(QLIEKEE(QRKEV(QN<QLVDREHKLANLH(QK 807 L+ ++ K ++ VK+ E LKE+ + + E ++(Q ++ L E
+ +L + Sbjct: 320
LETKLETSEKALKEVKENEEHLKEEKI(QLEKEATETK(Q(QLNSLRANLESLEKEHEDLAA(Q 371
(Query: 608
TKV(QEEKIKTLιQKEREDKEETIDILRKELSRTE(QIRKELSIKASSLEV(QKA(QLEGRLEEK Sb7 K EE + I KER+ EE I L E + + T + (Q + + K LE +
+ + EE +
Sbjct: 380 LKKYEEiQIAN — KERiQYNEE- ISiQLNDEITSTiQlQENESIKKKNDELEGEVKAMKSTSEEiQ 43b (Query: 8b6 ESLVKLiQiQEELN 871
+L K + + LN Sbjct: 437 SNLKKSEIDALN 448
Score = 137 (20. b bits)ι Expect = 4.2e-05ι P = 4-2e-DS Identities = 61/312 (25*)ι Positives = 140/312 (44*)
(Query: 516 EELIEKLiQSGMVVKDiQICDVRISDIMDVYEMKLSTLASK-ESRLlQDLLET- KALALAiQAD b55
+EL ++++ ++ +++ S+I D +++ L K E+ LLE+ K + +
Sbjct". 420 DELEGEVKAMKSTSEElQSNLKKSEI- DALNLlQIKELKKKNETNEASLLESIKSIESETVK 478
(Query: b5b RLIA(QHRC(QRT(QAETEARTLASMLREVERKNEELSVLLKA(Q(QVESERA(QSDIEHLF(QHNR 715
<Q C E E L L+ E KN + L K + E +
L
Sbjct: 471 IKELlQDECNFK--
EKEVSELEDKLKASEDKNSKYLELlQKESEKIKEELDAKTTELKIiQLE 53b
(Query: 71b
KLESVAEEHEILTKSYMELL(QRNESTEKKNKDL(2ITCDSLNK(QIETVKKLNESLKE(QNEK 775 K+ + + + + E + + S + L + + S E + KN + (Q+ (QI+ + +
K NE Sbjct: 537 KVTNLSKAKE-KSESELSRLKKTSSEERKNAEElQLEKLKNEIiQIKN-
(QAFEKERKLLNEG 514 (Query: 77b SIA<QLIEKEEιQRKEV<QNιQLV--DREHKL-ANLHι3KTKV(QEEKIKTL(QKER- EDKEETIDI 631
S E E+ ++++L+ E++L A T+ + EK+ E
E+K+ TI Sbjct: 515
SSTITiQEYSEKINTLEDELIRLiQNENELKAKEIDNTRSELEKVSLSNDELLEEKlQNTIKS bS4
(Query: 632 LRKE-LSRTE(QI RKELSIKASS LEVώKAiQLEGRLEEK
ESLVKL(Q(QE 87b L+ E LS + + I K LSI+ S LE K (QL E K E L
KL + + E
Sbjct: b55 L(QDEILSYKDKITRNDEKLLSIERDSKRDLESLKE(QLRAA(QESKAKVEEGLKKLEEESSK 714 (Query: 877 ELNKHSHMIAMIHS 81D
EL K M+ + S Sbjct: 715 EKAELEKSKEMMKKLES 731
Score = 126 (11-2 bits)ι Expect = 3-1e-04ι P = 3-1e-04 Identities = 60/35b (22*)ι Positives = 148/35b (41*)
(Query: 54b LGESIAANNAYR(Q(QETEHIPRKMPU(QSSNHSFPTSIKCLTPHL
KDGVPGLN-I 517
L E + ++ E+ + ++S+ H SIK L L K G+N +
Sbjct: 25
LDEMT(QLRDVLETKDKEN(QTALLEYKSTIHK(QEDSIKTLEKELETILS(QKKKAEDGINKM 84
(Query: 518 EELIEKLlQSGMVVKDiQICD-- VRISDIMDVYEMKLSTLASKESRLiQDLLETKALALAiQAD b55
+ + L M ++ C + D +V K T + KE + E
KA+ +
Sbjct: 65 GKDLFALSREM(QAVEENCKNL(QKEKDKSNVNH(QK-
ETKSLKEDIAAKITEIKAIN-ENLE 142
(Query: b5b
RLIA(QHRC(QRT(QAETEARTLASMLREVERKNEELSVLLKA<Q(QVESERA(QSDIEHLF(QHNR 715 + + (Q C E E ++ L E + + + L+ + + + +
+ + N Sbjct: 143 KMKI(Q--CNNLSKEKEH--
ISKELVEYKSRFIQSHDNLVAKLTEKLKSLANNYKDMIQAENE 116
(Query: 71b KLESVAEEHEILTKSYMELLiQRN- ESTEKKNKDLlQITCDSLNKiQIETVKKLNESLKEiQNE 774 L EE + + + L(Q +S ++ ++ iQI S+ K IE +KK
L + + E
Sbjct: 111 SLIKAVEESKNESSI(QLSNL(QNKIDSMS<QEKENFιQIERGSIEKNIE(QLKKTISDLE(QTKE 256 (Query: 775
KSIA(QLIEKEEιQRKEV(QN(QLVDREHKLANLH(QKTKV<QEEKIKTL(QKEREDKEETI 621
+ I + + + + E + + IQ+ + KL Kl L K RE+ E
+
Sbjct: 251 EIISK — - SDSSKDEYESiQISLLKEKLETATTANDENVNKISELTKTREELEAELAAYKN 315
(Query: 830 --DILRKELSRTEIQIRKELSIKASSLEVIQKAIQLEGRLEE-KESLVKLIQIQ — EELNK-HSH 683 + L +L +E+ KE+ L+ +K (QLE E K+ L L+ E L K H
Sbjct: 31b
LKNELETKLETSEKALKEVKENEEHLKEEKIiQLEKEATETKiQiQLNSLRANLESLEKEHED 375
(Query: 664 MIAMI 868
+ A + Sbjct: 37b LAAiQL 380 Score = 117 (17-b bits)ι Expect = 3.8e-03ι P = 3-8e-D3 Identities = 50/240 (20*)ι Positives = 111/24D (4b*)
(Query: b34
ASKESRL(QDLLETKALALA(QADRLIA(QHRC(QRT(QAETEARTLASMLREVERKNEELSVLL b13 A E L+ L E + A+ ++ + + E+ L S + + +
+E+L
Sbjct: bll AKVEEGLKKLEEESSKEKAELEKSKEMMKKLESTIESNETELKSSMETIRKSDEKLEiQSK 758 (Query: b14 KA(Q(QVESERA(Q---SD-
IEHLFlQHNRKLESVAEEHEILTKSYMELLlQRNESTEKKNKDLlQ 741
K+ + + + Q SD I + + + +E + + I KS EL +
+ + +
Sbjct: 751 KSAEEDIKNLlQHEKSDLISRINESEKDIEELKSKLRIEAKSSSELETVKiQELNNAiQEKIR 618
(Query: 750
ITCDSLNK(QIETVKKLNESLKE(QNEKSIA(QLIEKEE(QRKEV(QN(QLVDREHKLANLH(QKTK 801 + + N +++ KL + +E +K A++ +E+++ + ++L + E +L + <QK +
Sbjct: 611 VNAEE-NTVLKS--KLEDIERELKDK(Q- AEIKSN(QEEKELLTSRLKELE(QELDST(Q<QKA(Q 874
(Query: 810 ViQEEK IKTL(QKEREDKEETIDILRKELSRTE(QIRKELSIKASSLEV3KA(QLEGRLE 8b5
EE+ ++ Q E+ +E +L E + + KE + K V+K
+ + +
Sbjct: 875 KSEEESRAEVRKFiQVEKSiQLDEKAMLL-- ETKYNDLVNKEIQAUKRDEDTVKKTT-DSIQRIQ 131
(Query: 8bb EKESLVK 872
E E L K Sbjct: 132 EIEKLAK 136 Score = 101 (lb-4 bits)ι Expect = 2-be-D2ι P = 2-5e-02 Identities = b4/264 (22*)ι Positives = 135/284 (47*)
(Query: 518
EELIEKL(QSGMVVKD(QICDVRISDIMDVYEMKLSTLASKESRL(QDLLETKALALA Qk bS4 +E + + + KL + S + + + 1 E + S E + + + L K +
+ + + +
Sbjct: 723 KEMMKKLESTIESNETELKSSMETIRKSDEKLElQSKKSAEEDIKNLiQHEKSDLISRINES 782 (Query: b55 DRLIAIQHRCIQ-
RT(QAETEARTLASMLREVERKNEELSVLLKA(QιQVESERA(QSDIEH-LFιQ 712
++ I + + + R +A++ + L ++ +E+ E++ V + V + +
DIE L Sbjct: 783 EKDIEELKSKLRIEAKSSSE-LETVKiQELNNAiQEKIRVNAEENTVLKSKLE- DIERELKD 640
(Query: 713 HNRKLESVAEEHEILTKSYMELL(QRNESTEKK-NKDL(QITCDSLNK- (QIETVKKLNES-- 7bβ
+++S EE E+LT EL Q +ST++K K + + + K (Q+E +L+E
Sbjct: 641 KiQAEIKSNiQEEKELLTSRLKELEiQELDSTlQlQKAlQKSEEESRAEVRKFiQVEK- SiQLDEKAM 611
(Query: 7b1 LKElQNEKSIA- — (QLIEKEEiQ-- RKEV(QN(QLVDREHKLANLH(QKTKV(QEEKIKTLiQKERE 623
L E + (Q + + + E +K +<Q + E KLA K +
K+K ++R Sbjct: 1DD LLETKYNDLVNKEIQAUKRDEDTVKKTTDSIQRIQEIE- KLAKELDNLKAENSKLKEANEDRS 158
(Query: 624 DKEETI DILRKELSRTEiQIRKELSIKASSLEVlQKAlQLEGRLEEKE
8b8 + ++ + D+ K ++ K+L ++ SS E + E E+ E
Sbjct: 151 EIDDLMLLVTDLDEKNAKYRSKL-KDLGVEISSDEEDDEEEEDDEEDDE 100b
Score = lb (14-4 bits)ι Expect = 1-le+ODι P = b-be-01 Identities = 40/21D (11*)ι Positives = 101/210 (46*)
(Query: fc,81 EVERKN-- EELSVLLKAiQlQVESERAlQSDIEHLFlQHNRKLESVAEEHEILTKSYMELLlQRN 738
E E KN + L + + + V + + + L ++ + + + L K +L +
Sbjct: IS
ETELKNVRDSLDEMTlQLRDVLETKDKENiQTALLEYKSTIHKiQEDSIKTLEKELETILSiQK 74
(Query: 731 ESTE KKNKDLιQITCDSLNK(QIETVKKLNESLKE(QNEKSIA(QLIEKEE(QRKEV(QN(QL 714
+ E K KDL +L+++++ V++ ++L+++ +KS + +++
K ++ +
Sbjct: 75 KKAEDGINKMGKDLF ALSREMiQAVEENCKNLlQKEKDKSN
VNHiQKETKSLKEDI 127
(Query: 715 VDREHKLANLH(QKTKV(QEEKIKTL(QKERED-
KEETIDILRKELSRTEfQIRKELSIKASSL 853
+ ++ +++ + + + L KE+E +E ++ + S + K
L+ K SL Sbjct: 128 AAKITEIKAINENLEKMKIiQCNNLSKEKEHISKELVEYKSRFOSHDNLVAK-
LTEKLKSL 16b
(Query: 654 EVlQKAiQLEGRLEEKESLVKLlQiQEELNKHSHMIAMIHS 610
++ E ESL+K +E N+ S ++ + + Sbjct: 167 ANNYKDMiQA ENESLIKAVEESKNESSIIQLSNLIQN 22D
Score = 52 (7-6 bits)ι Expect = 2-Oe-lDι P = 2-Oe-lD Identities = 31/lb7 (23*)ι Positives = 74/lb7 (44*) (Query: 11 LNSVLAGVVCRSSHTDSVFLι3CI(QLL(QKLTYNVKIFYSGANIDEL-
ITFLIDHIiQSSEDE 157
LN + + ++ ++ L+ 1+ ++ T +K N E ++ L D
+++SED+ Sbjct : 447 LNL(QIKELKKKNETNEASLLESIKSIESETVKIKEL(QDECNFKEKEVSELEDKLKASEDK 50b
(Query : 156 - LKMPCLGLLANLCRHNLSVlQTHIKTLSNVKSFYRTLITLLAHSSLTVVVFALSILSSLT 21b
K L + + L +T T ++ T ++ S + +
S
Sb jct : 507 NSKYLELiQKESEKIKEELDAKT — TELKIώLEKVTNLSKAKEKSESELSRLKKTSSEER Sb3
(Query : 217 LN-EEVGEKLFHARNI-H(QTF(QLIFNILINGDGTLTRKYS--VDLLMDLL 2b2
N EE EKL + I +<Q F+ +L G T+T+ + YS ++ L D L
Sb j ct : 5b4 KNAEEiQLEKLKNEIlQIKNiQAFEKERKLLNEGSSTITiQEYSEKINTLEDEL bl3
Pedant information for DKFZphmel2_12j1, frame 2
Report for DKFZphmel2_12j1.2
[LENGTH] 905
[MW] 102067.61
[pI] 5.65
[HOMOL] TREMBL:SCINTANA_1 Saccharomyces cerevisiae integrin analogue gene, complete cds. 1e-14
EFUNCAT! 06. D7 vesicular transport (golgi networki etc-) ES. cerevisiaei YDL058w! 5e-lb
EFUNCAT! 3D-03 organization of cytoplasm ES- cerevisiaei
YDLD58w! 5e-lb
EFUNCAT! 1 genome replication! transcriptioni recombination and repair EM- jannaschiii MJ1322! le-10 EFUNCAT! OLID nuclear biogenesis [S- cerevisiaei YDR35bw!
2e-10
[FUNCAT! 3D-04 organization of cytoskeleton [S- cerevisiaei
YDR35bw! 2e-lD
[FUNCAT! D3-22 cell cycle control and mitosis [S- cerevisiaei YDR35bw! 2e-lD
EFUNCAT! 30- ID nuclear organization ES- cerevisiaei YKR015w! le-01
EFUNCAT! 11-04 dna repair (direct repairi base excision repair and nucleotide excision repair) ES- cerevisiaei YKR015w! le-DI EFUNCAT! 08-22 cytoskeleton-dependent transport ES- cerevisiaei
YHR023w MY01 - myosin-1 isoform! 4e-01
EFUNCAT! D3-04 buddingi cell polarity and filament formation
ES- cerevisiaei YHRD23w MY01 - myosin-1 isoform! 4e-01
EFUNCAT! 03-25 cytokinesis ES. cerevisiaei YHR023w MY01 - myosin-1 isoform! 4e-01
EFUNCAT! 11 unclassified proteins ES- cerevisiaei YNLOIlw!
3e-08
EFUNCAT! 01-25 vacuolar and lysosomal biogenesis ES- cerevisiaei Y0R32bw! be-06 EFUNCAT! OS-lb extracellular transport ES- cerevisiaei
Y0R32bw! be-08
EFUNCAT! 01-13 biogenesis of chromosome structure ES- cerevisiaei YLROSbw! fie-OS EFUNCAT! 16 classification not yet clear-cut ES. cerevisiaei YJR134c! le-07
EFUNCAT! 0b-D7 protein modification (glycolsylationi acylationi myristylationi palmitylationi farnesylation and processing) ES- cerevisiaei YKL201c! 4e-D7
EFUNCAT! 30-05 organization of centrosome ES- cerevisiaei YIL144w! 4e-0b
EFUNCAT! D3-07 pheromone responsei mating-type determination! sex-specific proteins ES- cerevisiaei YNLD71c! 5e-0b [FUNCAT! D3-01 cell growth [S- cerevisiaei YNLD71c! 5e-0b
[FUNCAT! 06-11 other intracellular-transport activities [S- cerevisiaei YNL071c! 5e-0b
EFUNCAT! D1-04 biogenesis of cytoskeleton [S- cerevisiaei
YKL171c! be-Db
E[FFUUNNCCAATT!! 3300.-0022 organization of plasma membrane [S- cerevisiaei
YERODβc! δe-Ob
EFUNCAT! 03.11 recombination and dna repair [S- cerevisiaei
YNL250w! le-05
EFUNCAT! 03-13 meiosis [S- cerevisiaei YDR265w! le-05
E[FFUUNNCCAATT!! 3300--1133 organization of chromosome structure ES- cerevisiaei YDR285w! le-05
EFUNCAT! 11- DI stress response ES. cerevisiaei YPR141c! 2e-05
EFUNCAT! Db-10 assembly of protein complexes ES- cerevisiaei
YPR141c! 2e-D5 EFUNCAT! Ob-Dl protein folding and stabilization ES- cerevisiaei YNL227c! le-05
[FUNCAT! 05-D4 translation (initiation! elongation and termination) [S- cerevisiaei YALD35w! le-04
[FUNCAT! 10.05-11 other pheromone response activities [S- cerevisiaei YHR158c! le-04
[FUNCAT! o chaperones EM- genitaliumi MG355! 2e-D4
[FUNCAT! 03-22-01 cell cycle check point proteins [S- cerevisiaei YGLOβbw! 2e-04
[FUNCAT! 03-1D sporulation and germination [S- cerevisiaei YNL225c! 3e-04
[FUNCAT! r general function prediction [M- jannaschiii
MJ1254! 4e-D4
[FUNCAT! Oβ-Dl nuclear transport [S- cerevisiaei YPL174c! 4e-04
[FUNCAT! 04-05-D1.01 general transcription activities [S- cerevisiaei YMR227c TAFb7 - TFIID subunit! be-D4
[BLOCKS! PR01D02E
[BLOCKS! BLDllbOB Kinesin light chain repeat proteins
[BLOCKS! BLD032bD Tropomyosins proteins
[SCOP! d2tmab_ 1-1D5-4-1.1 Tropomyosin [rabbit (Oryctolagus cuniculus) 3e-23
[EC! 3. -1- 32 Myosin ATPase 4e-10
[PIRKU! nucleus Se-DI
[PIRKU! phosphotransferase 2e-07
[PIRKU! blocked amino end le-Ob [PIRKU! duplication 2e-07
[PIRKU! citrulline 3e-D6
[PIRKU! tandem repeat 4e-10
[PIRKU! heterodimer le-07
[PIRKU! heart 4e-06 [PIRKU! endocytosis 7e-D8
[PIRKU! transmembrane protein le-14
[PIRKU! serine/threonine-specific protein kinase 2e-07
[PIRKU! cell wall 2e-0b EPIRKU! zinc finger 7e-08
EPIRKU! DNA binding 3e-D1
EPIRKU! metal binding 7e-06
EPIRKU! muscle contraction 4e-10 EPIRKU! brain 2e-0b
EPIRKU! acetylated amino end 2e-D7
EPIRKU! heterotetramer 5e-07
EPIRKU! actin binding 4e-lD
EPIRKU! mitosis le-08 EPIRKU! microtubule binding le-Dδ
EPIRKU! ATP 4e-10
EPIRKU! chromosomal protein le-D7
EPIRKU! thick filament le-10
EPIRKU! phosphoprotein le-01 EPIRKU! skeletal muscle le-08
EPIRKU! calcium binding 3e-06
EPIRKU! alternative splicing le-10
EPIRKU! DNA condensation le-D7
EPIRKU! coiled coil le-14 EPIRKU! P-loop 2e-10
EPIRKU! heptad repeat 5e-01
EPIRKU! methylated amino acid 4e-10
EPIRKU! immunoglobulin receptor 2e-D7
EPIRKU! peripheral membrane protein 7e-0β EPIRKU! cardiac muscle 4e-08
EPIRKU! hydrolase 4e-lD
EPIRKU! microtubule 5e-01
EPIRKU! muscle 4e-08
EPIRKU! membrane protein 5e-01 EPIRKU! EF hand 3e-08
EPIRKU! cell division le-Ob
EPIRKU! cytoskeleton be-01
EPIRKU! hair 3e-08
EPIRKU! calmodulin binding 7e-08 EPIRKU! Golgi apparatus 2e-07
ESUPFAM! hypothetical protein YJL074c 5e-01
ESUPFAM! unassigned Ser/Thr or Tyr-specific protein kinases 2e-
07
ESUPFAM! myosin motor domain homology 2e-10 ESUPFAM! alpha-actinin actin-binding domain homology be-01
ESUPFAM! tropomyosin 2e-0δ
ESUPFAM! kinesin heavy chain 5e-07
ESUPFAM! plectin be-01
ESUPFAM! SAM homology le-Ob ESUPFAM! trichohyalin 3e-08
ESUPFAM! ribosomal protein S10 homology be-01
ESUPFAM! protein kinase C zinc-binding repeat homology Se-01
ESUPFAM! giantin 7e-0δ
ESUPFAM! protein kinase homology 2e-07 ESUPFAM! protein 4-1 membrane-binding domain homology le-06
ESUPFAM! human early endoso e antigen 1 7e-0δ
ESUPFAM! myosin MY02 2e-0b
ESUPFAM! MS protein 3e-01
ESUPFAM! Mycoplasma genitalium hypothetical protein MG216 5e-01 ESUPFAM! myosin heavy chain 2e-10
ESUPFAM! conserved hypothetical PUS protein 3e-01
ESUPFAM! centromere protein E le-06
ESUPFAM! calmodulin repeat homology 3e-06 ESUPFAM! hypothetical protein MJ0114 2e-07
ESUPFAM! hypothetical protein MJ1322 3e-01
ESUPFAM! pleckstrin repeat homology 5e-01
ESUPFAM! kinesin motor domain homology le-06
ESUPFAM! ezrin"1e-*H3.8
[PROSITE] LEUCINE_ZIPPER 1
[KW] TRANSMEMBRANE 2
[KW] LOW_COMPLEXITY 3.01 %
[KW] COILED_COIL 18.34 %
SElQ MDSTACLKSLLLTVS(QYKAVKSEANAT(QLLRHLEVISG(QKLTRLFTSN<QILTSECLSCLV SEG xxxxxxxxxx
PRD ccchhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhccccceeeeeecceeeeeehhhhhh COILS
MEM
SE(Q ELLEDPNISASLILSIIGLLSiQLAVDIETRDCLiQNTYNLNSVLAGVVCRSSHTDSVFLlQC SEG xxxx- • - xxxxxxxxxxxxxx
PRD hhhhccccchhhhhhhhchhhhhhhhhhcccccccccceeeeeeeeeeeccccccchhhh COILS
MEM MMMMMMMMMMMMMMMMM'
SElQ IlQLLlQKLTYNVKIFYSGANIDELITFLIDHIiQSSEDELKMPCLGLLANLCRHNLSViQTHI SEG
PRD hhhhhhhcceeeeeecccchhhhhhhhhhhhhhhhhhhccccceeeeeeeecceeeeeee COILS
MEM
SElQ KTLSNVKSFYRTLITLLAHSSLTVVVFALSILSSLTLNEEVGEKLFHARNIHiQTFlQLIFN SEG PRD eeeehhhhhhhhhhhhhhcccccccceeehhhhhhchhhhhhhhhhhhhhhhhhhhhhhh COILS
MEM MMMMMMMMMMMMMMMMM SElQ ILINGDGTLTRKYSVDLLMDLLKNPKIADYLTRYEHFSSCLHlQVLGLLNGKDPDSSSKVL SEG
PRD hhcccccceeeehhhhhhhhhhccccchhhhhheeeeehhhhhhhhcccccccccchhhh COILS MEM
SElQ ELLLAFCSVTIQLRHMLTIQMMFEIQSPPGSATLGSHTKCLEPTVALLRULSIQPLDGSENCSV SEG i
PRD hhhhhchhhhhhhhhhhhhhhhccccccccccccceeehhhhhhhhhhhcccccccchhh COILS
MEM
SElQ LALELFKEIFEDVIDAANCSSADRFVTLLLPTILDiQLlQFTEiQNLDEALTRKNVKGIAKAI SEG
PRD hhhhhhhhhhhhhhhhcccccchhhhhheeehhhhhhhhhhhhhhhhhhhhhchhhhhhh COILS MEM
SElQ EVLLTLCGDDTLKMHIAKILTTVKCTTLIEiQlQFTYGKIDLGFGTKVADSELCKLAADVIL SEG
PRD hhhhhhccccchhhhhhhhhhhheeeeeeeeeeecccccccccceeehhhhhhhhhhhhh COILS
MEM SElQ KTLDLINKLKPLVPGMEVSFYKILIQDPRLITPLAFALTSDNREIQVIQSGLRILLEAAPLPD SEG
PRD hhhhhhhhcccccccccccceeeccccccchhhhhhhccccchhhhhhhhhhhhhccccc COILS MEM
SElQ FPALVLGESIAANNAYRiQiQETEHIPRKMPUiQSSNHSFPTSIKCLTPHLKDGVPGLNIEEL SEG
PRD cceeeeehhhhhhhhhhhhhhhhhhhhhhhhcccccccchhhhhhhhhhhhhhhhhhhhh COILS
MEM
SElQ IEKLiQSGMVVKDfQICDVRISDIMDVYEMKLSTLASKESRLiQDLLETKALALAiQADRLIAiQ SEG
PRD hhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhh COILS
MEM
SElQ HRCiQRTiQAETEARTLASMLREVERKNEELSVLLKAlQlQVESERAlQSDIEHLFlQHNRKLESV
SEG
PRD hhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhh COILS CCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCC
MEM
SElQ AEEHEILTKSYMELLiQRNESTEKKNKDLlQITCDSLNKlQIETVKKLNESLKElQNEKSIAiQL
SEG PRD hhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhh COILS CCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCC
MEM SElQ IEKEEiQRKEViQNlQLVDREHKLANLHiQKTKVlQEEKIKTLlQKEREDKEETIDILRKELSRTE
SEG
PRD hhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhh COILS CCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCC MEM
SElQ IQIRKELSIKASSLEVIQKAIQLEGRLEEKESLVKLIQIQEELNKHSHMIAMIHSLSGGKINPET
SEG
PRD hhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhcccc COILS
CCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCC
MEM SElQ VNLSI
SEG
PRD ccccc
COILS
MEM
Prosite for DKFZphmel2_12 l.2
PS00021 331->353 LEUCINE_ZIPPER PDOC00021
(No Pfam data available for DKFZphmel2_lΞ l ■ 2)
DKFZphmel2_7gl4
group: intracellular transport and trafficking
DKFZphmel2_7gl4 encodes a novel 973 amino acid protein with similarity to the dor (deep orange) protein of Drosophila melanogaster.
The novel protein is also similar to the vacuolar membrane protein PEP3 of Saccharomyces cerevisiae, which is involved in protein sorting mechanisms. The expression profile is ubiquitous and a role in protein transport/targeting is likely.
The new protein can find application in modulation of the sorting of proteins into different compartments.
similarity to DEEP ORANGE (Drosophila melanogaster) perhaps complete cds. and full length
Sequenced by MediGenomix
Locus: unknown
Insert length: 3951 bp
Poly A stretch at pos. 3893, polyadenylation signal at pos. 3874
1 GCCCGCGTCA CGGGGGCGGG AGTCAGCTGA GCTGCCGGGG CGAGGTTGGG
51 ATCACCTGGC ACCGGCTGAA GGGAGCCTGT GATTTTTTTG TAGCGGGGGC
101 GGGGAGTAAG GTGCAAGACT GCGCCAGATT CAAGGACGAG GGCTGCCCGA
151 TTATCTCGCT GCATAAGGCA AGAGCAAGAG GATCCTCAGG ATTTTAAAGA
201 GGAGGCGACG GCTGCAGGTT CCCAGGATCT GTCAGAGGCT GGGGAGTTAC
251 AGCTTCCATT CTGGGGCGAC GGGGACCCCG GGGGGGTAGC CCTTTTGTAA
301 TCCCCAGGCC CCGGACAAAG AGCCCAGAGG CCGGGCACCA TGGCGTCCAT
351 CCTGGATGAG TACGAGAACT CGCTGTCCCG CTCGGCCGTC TTGCAGCCCG
401 GCTGCCCTAG CGTGGGCATC CCCCACTCGG GGTATGTGAA TGCCCAGCTG
451 GAGAAGGAAG TGCCCATCTT CACAAAGCAG CGCATTGACT TCACCCCTTC
501 CGAGCGCATT ACCAGTCTTG TCGTCTCCAG CAATCAGCTG TGCATGAGCC
551 TGGGCAAGGA TACACTGCTC CGCATTGACT TGGGCAAGGC AAATGAGCCC
601 AACCACGTGG AGCTGGGACG TAAGGATGAC GCAAAAGTTC ACAAGATGTT
651 CCTTGACCAT ACTGGCTCTC ACCTGCTGAT TGCCCTGAGC AGCACGGAGG
701 TCCTCTACGT GAACCGAAAT GGACAGAAGG TACGGCCACT AGCACGCTGG
751 AAGGGGCAGC TGGTGGAGAG TGTGGGTTGG AACAAGGCAC TGGGCACGGA
801 GAGCAGCACA GGCCCCATCC TGGTCGGGAC TGCCCAAGGC CACATCTTTG
851 AAGCAGAGCT CTCAGCCAGC GAAGGTGGGC TTTTCGGCCC TGCTCCGGAT
901 CTCTACTTCC GCCCATTGTA CGTGCTAAAT GAAGAAGGGG GTCCAGCACC
951 TGTGTGCTCC CTTGAGGCCG AGCGGGGCCC TGATGGGCGT AGCTTTGTTA
1001 TTGCCACCAC TCGGCAGCGC CTCTTCCAGT TCATAGGCCG AGCAGCAGAG
1051 GGGGCTGAGG CCCAGGGTTT CTCAGGGCTC TTTGCAGCTT ACACGGACCA
1101 CCCACCCCCA TTCCGTGAGT TTCCCAGCAA CCTGGGCTAC AGTGAGTTGG
1151 CCTTCTACAC CCCCAAGCTG CGCTCCGCAC CCCGGGCCTT CGCCTGGATG
1201 ATGGGGGATG GTGTGTTGTA TGGGGCATTG GACTGTGGGC GCCCTGACTC
1251 TCTGCTGAGC GAGGAGCGAG TCTGGGAGTA CCCAGAGGGG GTAGGGCCTG
1301 GGGCCAGCCC ACCCCTAGCC ATCGTCTTGA CCCAGTTCCA CTTCCTGCTG
1351 CTACTGGCAG ACCGGGTGGA GGCAGTGTGC ACACTGACCG GGCAGGTGGT
1401 GCTGCGGGAT CACTTCCTGG AGAAATTTGG GCCGCTGAAG CACATGGTGA
1451 AGGACTCCTC CACAGGCCAG CTGTGGGCCT ACACTGAGCG GGCTGTCTTC
1501 CGCTACCACG TGCAACGGGA GGCCCGAGAT GTCTGGCGCA CCTATCTGGA
1551 CATGAACCGC TTCGATCTGG CCAAAGAGTA TTGTCGAGAG CGGCCCGACT
1601 GCCTGGACAC GGTCCTGGCC CGGGAGGCCG ATTTCTGCTT TCGCCAGCGT
1651 CGCTACCTGG AGAGCGCACG CTGCTATGCC CTGACCCAGA GCTACTTTGA
1701 GGAGATTGCC CTCAAGTTCC TGGAGGCCCG ACAGGAGGAG GCTCTGGCTG
1751 AGTTCCTGCA GCGAAAACTG GCCAGTTTGA AGCCAGCCGA ACGTACCCAG
1801 GCCACACTGC TGACCACCTG GCTGACAGAG CTCTACCTGA GCCGGCTTGG
1851 GGCTCTGCAG GGCGACCCAG AGGCCCTGAC TCTCTACCGA GAAACCAAGG
1901 AATGCTTTCG AACCTTCCTC AGCAGCCCCC GCCACAAAGA GTGGCTCTTT
1951 GCCAGCCGGG CCTCTATCCA TGAGCTGCTC GCCAGTCATG GGGACACAGA
2001 ACACATGGTG TACTTTGCAG TGATCATGCA GGACTATGAG CGGGTGGTGG
2051 CTTACCACTG TCAGCACGAG GCCTACGAGG AGGCCCTGGC CGTGCTCGCC
2101 CGCCACCGTG ACCCCCAGCT CTTCTACAAG TTCTCACCCA TCCTCATCCG
2151 TCACATCCCC CGCCAGCTTG TAGATGCCTG GATTGAGATG GGCAGCCGGC
2201 TGGATGCTCG TCAGCTCATT CCTGCCCTGG TGAACTACAG CCAGGGTGGT
2251 GAGGTCCAGC AGGTGAGCCA GGCCATCCGC TACATGGAGT TCTGCGTGAA
2301 CGTGCTGGGG GAGACTGAGC AGGCCATCCA CAACTACCTG CTGTCACTGT
2351 ATGCCCGTGG CCGGCCGGAC TCACTACTGG CCTATCTGGA GCAGGCTGGG
2401 GCCAGCCCCC ACCGGGTGCA TTACGACCTC AAGTATGCGC TGCGGCTCTG
2451 CGCCGAGCAT GGCCACCACC GCGCTTGTGT CCATGTCTAC AAGGTCCTAG
2501 AGCTGTATGA GGAGGCCGTG GACCTGGCCC TGCAGGTGGA TGTGGACCTG
2551 GCCAAGCAGT GTGCAGACCT GCCTGAGGAG GATGAGGAAT TGCGCAAGAA
2601 GCTGTGGCTG AAGATCGCAC GGCACGTGGT GCAGGAAGAG GAAGATGTAC
2651 AGACAGCCAT GGCTTGCCTG GCTAGCTGCC CCTTGCTCAA GATTGAGGAT
2701 GTGCTGCCCT TCTTTCCTGA TTTCGTCACC ATCGACCACT TCAAGGAGGC
2751 GATCTGCAGC TCACTTAAGG CCTACAACCA CCACATCCAG GAGCTGCAGC
2801 GGGAGATGGA AGAGGCTACA GCCAGTGCCC AGCGCATCCG GCGAGACCTG
2851 CAGGAGCTGC GGGGCCGCTA CGGCACTGTG GAGCCCCAGG ACAAATGTGC
2901 CACCTGCGAC TTCCCCCTGC TCAACCGCCC TTTTTACCTC TTCCTCTGTG
2951 GCCATATGTT CCATGCTGAC TGCCTGCTGC AGGCTGTGCG ACCTGGCCTG
3001 CCAGCCTACA AGCAGGCCCG GCTGGAGGAG CTGCAGAGGA AGCTGGGGGC
3051 TGCTCCACCC CCAGCCAAGG GCTCTGCCCG GGCCAAGGAG GCCGAGGGTG
3101 GGGCTGCCAC GGCAGGGCCC AGCCGGGAAC AGCTCAAGGC TGACCTGGAT
3151 GAGTTGGTGG CCGCTGAGTG TGTGTACTGT GGGGAGCTGA TGATCCGCTC
3201 TATCGACCGG CCGTTCATCG ACCCCCAGCG CTACGAGGAG GAGCAGCTCA
3251 GTTGGCTGTA GGAGGGTGTC ACCTTTGATG GGGGTGGGCA ATGGGGAGCA
3301 GTGGCTTGAA CCCACTTGAG AAGGCTGCCT CCTAGGCTCT GCTCAGTCAT
3351 CTTGCAATTG CCACACTGTG ACCACGTTGA CGGGAGTAGA GTAGCGCTGT
3401 TGGCCAGGAG GTGTCAGGTG TGAGTGTATT CTGCCAGCTT TTCATGCTGT
3451 TCTTCAGAGC TGCAGTTATG CCAGACCATC AGCCTGCCTC CCAGTAGAGG
3501 CCCTTCACCT GGAGAAGTCA GAAATCTGAC CCAATTCCAC CCCCTGCCTC
3551 TAGCACCTCT TCTGTCCCTG TCATTCCCCA CACACGTCCT GTTCACCTCG
3601 AGAGAGAGAG AGAGAGAGCA CCTTTCTTCC GTCTGTTCAC TCTGCGGCCT
3651 CTGGAATCCC AGCTCTTCTC TCTCAGAAGA AGCCTTCTCT TCCTCCTGCC
3701 TGTAGGTGTC CCAGAAGTGA GAAGGCAGCC TTCGAAGTCC TGGGCATTGG
3751 GTGAGAAAGT GATGCTAGTT GGGGCATGCT TTTGTGCACA CTCTCTGGGG
3801 CTCCAGTGTG AAGGGTGCCC TGGGGCTGAG GGCCTTGTGG AGGATGGTCG
3851 GTGGTGGTGA TGGAGGTGGA GAGCATTAAA CTGTCTGCAC TGCAAAAAAA
3901 AAAAAAAAAA AAAAAAAAAA AAAAAAAAAA AAAGAAAAAA AAAAAAAAAA
3951 A
BLAST Results
No BLAST result
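The entry header above quotes a poly A stretch position and a polyadenylation signal position for the insert. As a rough illustration of how such positions could be located in a cDNA sequence, the following Python sketch searches for two commonly cited signal hexamers and for the first long run of A residues. The function names, the choice of hexamers (AATAAA and ATTAAA) and the minimum run length are assumptions made for this example; the listing does not state how its positions were derived.

```python
import re

# Canonical hexamer and one common variant (an assumed set; extend as needed).
POLYA_SIGNALS = ("AATAAA", "ATTAAA")


def clean(seq):
    """Upper-case a sequence and drop the spaces used for 10-base grouping."""
    return seq.upper().replace(" ", "")


def find_polya_signals(seq, signals=POLYA_SIGNALS):
    """Return sorted (1-based position, hexamer) pairs for every occurrence."""
    seq = clean(seq)
    hits = []
    for signal in signals:
        # A lookahead is used so that overlapping occurrences are not skipped.
        for match in re.finditer("(?=" + signal + ")", seq):
            hits.append((match.start() + 1, signal))
    return sorted(hits)


def polya_tail_start(seq, min_run=10):
    """Return the 1-based start of the first run of >= min_run A's, or None."""
    match = re.search("A{%d,}" % min_run, clean(seq))
    return match.start() + 1 if match else None


if __name__ == "__main__":
    fragment = "GAGCATTAAA CTGTCTGCAC TGCAAAAAAA AAAAAAAAAA"
    print(find_polya_signals(fragment))
    print(polya_tail_start(fragment))
```

Positions are reported relative to the fragment passed in, so an offset has to be added when only part of an insert is scanned.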
Medline entries
97216037:
Shestopal SA, Makunin IV, Belyaeva ES, Ashburner M, Zhimulev IF.; Mol Gen Genet 1997 Feb 20;253(5):642-8
92041306:
Robinson JS, Graham TR, Emr SD.; A putative zinc finger protein, Saccharomyces cerevisiae Vps18p, affects late Golgi functions required for vacuolar protein sorting and efficient alpha-factor prohormone maturation. Mol Cell Biol 1991 Dec;11(12):5813-24
92041305:
Preston RA, Manolson MF, Becherer K, Weidenhammer E, Kirkpatrick, Wright R, Jones EW.; Isolation and characterization of PEP3, a gene required for vacuolar biogenesis in Saccharomyces cerevisiae. Mol Cell Biol 1991 Dec;11(12):5801-12
Peptide information for frame 1
ORF from 340 bp to 3258 bp; peptide length: 973
Category: similarity to known protein
Classification: Cellular transport and traffic
1 MASILDEYEN SLSRSAVLQP GCPSVGIPHS GYVNAQLEKE VPIFTKQRID
51 FTPSERITSL VVSSNQLCMS LGKDTLLRID LGKANEPNHV ELGRKDDAKV
101 HKMFLDHTGS HLLIALSSTE VLYVNRNGQK VRPLARWKGQ LVESVGWNKA
151 LGTESSTGPI LVGTAQGHIF EAELSASEGG LFGPAPDLYF RPLYVLNEEG
201 GPAPVCSLEA ERGPDGRSFV IATTRQRLFQ FIGRAAEGAE AQGFSGLFAA
251 YTDHPPPFRE FPSNLGYSEL AFYTPKLRSA PRAFAWMMGD GVLYGALDCG
301 RPDSLLSEER VWEYPEGVGP GASPPLAIVL TQFHFLLLLA DRVEAVCTLT
351 GQVVLRDHFL EKFGPLKHMV KDSSTGQLWA YTERAVFRYH VQREARDVWR
401 TYLDMNRFDL AKEYCRERPD CLDTVLAREA DFCFRQRRYL ESARCYALTQ
451 SYFEEIALKF LEARQEEALA EFLQRKLASL KPAERTQATL LTTWLTELYL
501 SRLGALQGDP EALTLYRETK ECFRTFLSSP RHKEWLFASR ASIHELLASH
551 GDTEHMVYFA VIMQDYERVV AYHCQHEAYE EALAVLARHR DPQLFYKFSP
601 ILIRHIPRQL VDAWIEMGSR LDARQLIPAL VNYSQGGEVQ QVSQAIRYME
651 FCVNVLGETE QAIHNYLLSL YARGRPDSLL AYLEQAGASP HRVHYDLKYA
701 LRLCAEHGHH RACVHVYKVL ELYEEAVDLA LQVDVDLAKQ CADLPEEDEE
751 LRKKLWLKIA RHVVQEEEDV QTAMACLASC PLLKIEDVLP FFPDFVTIDH
801 FKEAICSSLK AYNHHIQELQ REMEEATASA QRIRRDLQEL RGRYGTVEPQ
851 DKCATCDFPL LNRPFYLFLC GHMFHADCLL QAVRPGLPAY KQARLEELQR
901 KLGAAPPPAK GSARAKEAEG GAATAGPSRE QLKADLDELV AAECVYCGEL
951 MIRSIDRPFI DPQRYEEEQL SWL
BLASTP hits
No BLASTP hits available
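The peptide given for frame 1 is the conceptual translation of the open reading frame in the cDNA insert. A minimal Python sketch of that translation step is shown below, assuming the standard genetic code and 1-based ORF coordinates; the function name `translate` and its parameters are illustrative and are not taken from the listing. Codons containing characters other than A, C, G and T are reported as X.

```python
# Standard genetic code, indexed by base order T, C, A, G at each codon position.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    b1 + b2 + b3: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, b1 in enumerate(BASES)
    for j, b2 in enumerate(BASES)
    for k, b3 in enumerate(BASES)
}


def translate(cdna, orf_start, orf_end=None):
    """Translate cdna from 1-based orf_start up to orf_end (inclusive).

    Translation stops early at the first stop codon; spaces are ignored.
    """
    seq = cdna.upper().replace(" ", "")
    end = len(seq) if orf_end is None else orf_end
    peptide = []
    for pos in range(orf_start - 1, end - 2, 3):
        residue = CODON_TABLE.get(seq[pos:pos + 3], "X")
        if residue == "*":
            break
        peptide.append(residue)
    return "".join(peptide)


if __name__ == "__main__":
    # First coding bases of the insert above, preceded by nine upstream bases.
    print(translate("CCGGGCACCA TGGCGTCCAT CCTGGATGAG TAC", 10))
```

Applied to the short fragment in the usage lines, the sketch yields the first residues of the frame 1 peptide.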
Alert BLASTP hits for DKFZphmel2_7gl4, frame 1
SWISSPROT:DOR_DROME DEEP ORANGE PROTEIN., N = 1, Score = 1279, P = 2.4e-130
PIR:A41143 vacuolar membrane protein PEP3 - yeast (Saccharomyces cerevisiae), N = 3, Score = 266, P = 5.1e-27
>SWISSPROT:DOR_DROME DEEP ORANGE PROTEIN Length = 1,002
HSPs:
Score = 1279 (191.9 bits), Expect = 2.4e-130, P = 2.4e-130
Identities = 303/847 (35%), Positives = 463/847 (54%)
(Query: 130
KVRPLARUKG(QLVESVGUNKALGTESSTGPILVGTA(QGHIFEAELSASEGGLFGPAPDLY 181 KVR + ++K + +V +N G ESSTGPIL+GT++G IFE EL+ + G
+ Sb j ct : 155 KVRRIEKFKDHEITAVAFNPYHGNESSTGPILLGTSRGLIFETELNPAADG- HV(Q 208
(Query : HO FRPLYVLNEEGGPA-PVCSLEAERGPDG- RSFVIATTR(QRLF(QFIGRAAEGAEA(QGFSGL 247 + LY L G P P+ L+ R P+ R ++ T+ + ++ F +
AE + +
Sb jct : 201 RKiQLYDLGL-GRPKYPITGLKLLRVPNSSRYIIVVTSPECIYTF — (QETLKAEERSLiQAI 2b5 (Query: 248 FAAYTD--
HPPPFREFPSNLGYSELAFYTPKLRSAPRAFAUMMGDGVLYGAL--DCGRPD 303
FA Y P E ++L +S+L F+ P P+ +AU+ G+G+ G L
+
Sb jct : 2bb FAGYVSGV(QEPHCEERKTDLTFS(QLRFFAPPNSKYPK(QUAULCGEGIRVGELSIEANSAA 325
(Query : 304 SLLSEERV---UEYPEGVGPGA--- S PPL A IVLT IQF H FLLLL A D RVEA VCT L T GIQVVLRD 357
+L+ + +E + G + P A VLT++H +LL AD V A+C L + V + +
Sbj ct : 32b TLIGNTLINLDFEKTMHLSYGERRLNTPKAFVLTEYHAVLLYADHVRAICLLN(QE(QVY(QE 385
(Query : 358 HFLE- KFGPLKHMVKDSSTGlQLUAYTERAVFRYHViQREARDVURTYLDMNRFDLAKEYCR 41b
F E + G + +D TG ++ YT + VF V RE R+VUR YLD
+++LA + Sb j ct : 38b AFDEARVGKPLSIERDELTGSIYVYTVKTVFNLRVTREERNVURIYLDKGiQYELATAHAA 445
(Query : 417 ERPDCLDTVLAREADFCFR(QRRYLESARCYALT(QSYFEEIALKFLEAR(QEEALAEFL(QRK 47b
E P+ L VL + AD F Y +A YA T FEE+ LKF+ +
+ + + + + +
Sb jct : 44 b
EDPEHL(QLVLC(QRADAAFADGSY(QVAADYYAETDKSFEEVCLKFMVLPDKRPIINYVKKR 505
(Query : 477 LASL--KPAERXXXXXXXXXXXXXXXSRLGAL(Q
GDPEALTLYRETKEC-FRTFLSS 521
L+ + KP E L L P+ +R + +
+ F + Sbjct : 50b
LSRVTTKPMETDELDEDKMNIIKALVIULIDLYLIiQINMPDKDEEURSSUlQTEYDEFMME 5b5
(Query : 530
PRHKEULFASRASIHELLASHGDTEHMVYFAVIMlQDYERVVAYHCiQHEAYEEALAVLARH 561 +R ++ +L + A H D +M FA+ + DY+ WA + E Y
EAL L
Sb jct : 5bb AHVLSCTR(QNRETVR(QLIAEHADPRNMA(QFAIAIGDYDEVVA<QιQLKAECYAEAL<QTLIN(Q b25 (Query: 510
RDP(QLFYKFSPILIRHIPR(QLVDAUIEMGSRLDAR(QLIPALVNYS(QGGEV(Q(QVS(QAIRYM b41 R+P+LFYK++P LI +P+ VDA + GSRL+ +L+P L+ + E ++
+(Q RY+
Sbjct: b2b RNPELFYKYAPELITRLPKPTVDALMAlQGSRLEVEKLVPTLI- IMENRE(QRE(QT(Q--RYL b82 βuery : b50
EFCVNVLGETElQAIHNYLLSLYARGRPDSLLAYLEiQAGASPHRVHYDLKYALRLCAEHGH 701 EF + L T AIHN+LL LYA P L+ YLE G VHYD+ YA ++C +
Sbjct : b83 EFAIYKLNTTNDAIHNFLLHLYAEHEPKLLMKYLEIiQGRDESLVHYDIYYAHKVCTDLDV 742
(Query : 710 HRACVHVYKVLELYEEAVDLAL(QVDVDLAK(QCADLPEEDEELRKKLULKIARHVV(QEEED 7b1
A V + +L + AVDLAL D+ LAK+ A P D ++R+KLUL+IA
H ++ D
Sbjct: 743 KEARVFLECMLRKUISAVDLALTFDMKLAKETASRPS-
DSKIRRKLULRIAYHDIKGTND 601
(Query: 770
VlQTAMACLASCPLLKIEDVLPFFPDFVTIDHFKEAICSSLKAYNHHIlQELiQREMEEATAS 621 V+ A+ L C LL+IED+LPFF DF ID+FKEAIC +L+ YN
IiQELlQREM E T Sb j ct : 802
VKKALNLLKECDLLRIEDLLPFFADFEKIDNFKEAICDALRDYNiQRIlQELiQREMAETTElQ βbl
(Query : 830
A(QRIRRDLιQELRGRYGTVEP(QDKCATCDFPLLNRPFYLFLCGHMFHADCLL(QAVRPGLPA 881 R +L(Q + LR TVE (QD C C+ LL +PF + + F + CGH FH + DCL +
V P L
Sbj ct : 8b2 TDRATAEL(Q(QLR(QHSLTVES(QDTCEICEMMLLVKPFFIFICGHKFHSDCLEKHVVPLLTK 121 (Query: 610
YKlQARLEELlQRKLGAAPPPXXXXXXXXXXXXXXXXXXPSRElQLKADLDELVAAECVYCGE 141 + RL L+++L A R LK ++++++AA+C++CG Sbjct: 122 EiQCRRLGTLKlQiQLEAEVflTlQAiQPlQSGALSKiQiQAMELiQRKRAALKTEIEDILAADCLFCG- 160
(Query: 150 LMIRSIDRPFIDP(QRYEEE(QLSU 172 L+I +ID+PF+D +E+ + U
Sbjct: 161 LLISTIDiQPFVDD — UElQVNVEU 1001
Score = 268 (40.2 bits), Expect = 3.6e-11, P = 3.6e-11
Identities = 91/281 (32%), Positives = 146/281 (51%)
(Query: 3b <QLEKEVPIFTK(QRIDF-TPSE- — RITSLVVSSNlQLCMSLG---
KDTLLRIDLGKANEPN 88
+ ++E IF++ ++ PS + L VS N L LG + TLLR
L +A P Sbjct: 37
ETDEEDEIFSRHKMVLRVPSNCTGDLMHLAVSRNULVCLLGTPERTTLLRFFLPRAIPPG lb
(Query: 81 HVELGRK- — DDAKVHKMFLDHTGSHLLIAL- — SST EVLYVN —
RNGiQ KV 131 L + K+ +MFLD TG H++IAL S+T + LY++ +
Q KV
Sbjct: 17 EAVLEKYLSGSGYKITRMFLDPTGHHIIIALVPKSATAGVSPDFLYIHCLESP(QA(Q(QLKV ISb (Query: 132
RPLARUKGlQLVESVGUNKALGTESSTGPILVGTAiQGHIFEAELSASEGGLFGPAPDLYFR 111
R + ++K + +V +N G ESSTGPIL+GT++G IFE EL+ + G + +
Sbjct: 157 RRIEKFKDHEITAVAFNPYHGNESSTGPILLGTSRGLIFETELNPAADG-— ---HViQRK 210
(Query: 112 PLYVLNEEGGPA-PVCSLEAERGPDG- RSFVIATTR(QRLF(QFIGRAAEGAEA(QGFSGLFA 241
LY L G P P+ L+ R P+ R ++ T+ + ++ F + AE + +FA
Sbjct: 211 (QLYDLGL-GRPKYPITGLKLLRVPNSSRYIIVVTSPECIYTF — (QETLKAEERSLiQAIFA 2b7
(Query: 250 AYTD--HPPPFREFPSNLGYSELAFYTPKLRSAPRAFAUMMGDGVLYGAL 217
Y P E ++L +S+L F+ P P+ +AU+ G+G+ G L
Sbjct: 2b6 GYVSGVIQEPHCEERKTDLTFSIQLRFFAPPNSKYPKIQUAULCGEGIRVGEL
317
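Each HSP above is summarised by an identity count over the number of alignment columns, followed by a percentage. The short Python sketch below shows one way such a summary could be computed from a pair of gapped alignment rows. The function name is hypothetical, and rounding the percentage down with integer division is an assumption made here. "Positives" would additionally require a substitution matrix and are therefore not computed.

```python
def alignment_stats(query, subject, gap="-"):
    """Return (identities, columns, percent) for two gapped alignment rows.

    Both rows must have the same length; a column counts as an identity
    only if both rows carry the same residue and neither is a gap.
    """
    if len(query) != len(subject):
        raise ValueError("alignment rows must have equal length")
    if not query:
        raise ValueError("alignment rows must not be empty")
    columns = len(query)
    identities = sum(
        1 for q, s in zip(query, subject) if q == s and q != gap
    )
    # Integer division truncates the percentage (assumed reporting style).
    return identities, columns, 100 * identities // columns


if __name__ == "__main__":
    # Toy rows, not copied from the alignment above.
    print(alignment_stats("KVRPL-ARW", "KVRRIEKFW"))
```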
Pedant information for DKFZphmel2_7gl4, frame 1
Report for DKFZphmel2_7gl4.1
[LENGTH] 973
[MW] 110186.01
[pI] 5.72
[HOMOL] SWISSPROT:DOR_DROME DEEP ORANGE PROTEIN. 1e-145
[FUNCAT] 30.25 vacuolar and lysosomal organization [S. cerevisiae, YLR148w] 5e-41
[FUNCAT] 06.04 protein targeting, sorting and translocation [S. cerevisiae, YLR148w] 5e-41
[FUNCAT] 08.07 vesicular transport (Golgi network, etc.) [S. cerevisiae, YLR148w] 5e-41
[BLOCKS] BL00106F Galactokinase proteins
[BLOCKS] PR01014B
[BLOCKS] BP03306B
[BLOCKS] PF00600B
[PIRKW] yeast vacuole 1e-31
[PIRKW] transmembrane protein 1e-31
[KW] Alpha_Beta
[KW] LOW_COMPLEXITY 3.39 %
[KW] COILED_COIL 4.83 %
SElQ MASILDEYENSLSRSAVLlQPGCPSVGIPHSGYVNAiQLEKEVPIFTKiQRIDFTPSERITSL SEG
PRD ccceeeccccccceeeeecccccceeeecccchhhhhhhhhhhhhhhhhhcccccceeee COILS
SElQ VVSSNiQLCMSLGKDTLLRIDLGKANEPNHVELGRKDDAKVHKMFLDHTGSHLLIALSSTE
SEG
PRD eeccceeeeecccccceeeccccccccceeeeehhhhhhhheeecccccceeeeeeccce
COILS
SElQ VLYVNRNGiQKVRPLARUKGlQLVESVGUNKALGTESSTGPILVGTAlQGHIFEAELSASEGG SEG
PRD eeeeecccccchhhhhcccceeeeeecccccccccccceeeeecccchhhhhhhhhhccc COILS
SElQ LFGPAPDLYFRPLYVLNEEGGPAPVCSLEAERGPDGRSFVIATTRiQRLFlQFIGRAAEGAE SEG PRD ccccccccccceeeeecccccccceeecccccccccceeeeeehhhhhhhhhhcchhhhh COILS
SElQ AlQGFSGLFAAYTDHPPPFREFPSNLGYSELAFYTPKLRSAPRAFAUMMGDGVLYGALDCG SEG
PRD hhhchhhhhhhhccccccccccccccccceeeecccccchhhhhhhhcccceeeeeeccc COILS
SElQ RPDSLLSEERVUEYPEGVGPGASPPLAIVLTiQFHFLLLLADRVEAVCTLTGiQVVLRDHFL SEG
PRD cccccchhhhhhccccccccccccchhhhhhhhhhhhhhhhheeeecccchhhhhhhhhh COILS
SElQ EKF GPL KH M VK D SS T GIQLUA YTE RA VFRY H V IQ RE A R D VUR T YL DM NRFD L AKEYCR E RP D
SEG
PRD hcccccccccccccccceeeehhhhhhhhhhhhhhhhhhhhhhccchhhhhhhhhhhccc COILS
SElQ CLDTVLAREADFCFRiQRRYLESARCYALTiQSYFEEIALKFLEARiQEEALAEFLlQRKLASL SEG
PRD cchhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhc COILS
SElQ KPAERTIQATLLTTULTELYLSRLGALIQGDPEALTLYRETKECFRTFLSSPRHKEULFASR SEG xxxxxxxxxxxxxxx
PRD ccchhhhhhhhhhhhhhhhhhhhhcccccchhhhhhhhhhhhhhhhhhhhhhhhhhhhhh COILS
SElQ ASIHELLASHGDTEHMVYFAVIMlQDYERVVAYHClQHEAYEEALAVLARHRDPlQLFYKFSP SEG
PRD hhhhhhhhhccchhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhccchhhhhcce COILS
SElQ ILIRHIPRlQLVDAUIEMGSRLDARlQLIPALVNYSlQGGEViQiQVSlQAIRYMEFCVNVLGETE SEG
PRD eeeeccccchhhhhhhhccccccccccchhhhhccccchhhhhhhhhhhhhhhhccccch COILS
SElQ (QAIHNYLLSLYARGRPDSLLAYLElQAGASPHRVHYDLKYALRLCAEHGHHRACVHVYKVL SEG PRD hhhhhhhhhhhhhcccchhhhhhhhcccccccccchhhhhhhhhhhhcccccceeehhhh COILS
SElQ ELYEEAVDLALlQVDVDLAKlQCADLPEEDEELRKKLULKIARHVVlQEEEDViQTAMACLASC SEG
PRD hhhhhhhhhhhhhchhhhhhhhhccccchhhhhhhhhhhhhhhhhhcchhhhhhhhhhhc COILS
SElQ PLLKIEDVLPFFPDFVTIDHFKEAICSSLKAYNHHIiQELiQREMEEATASAlQRIRRDLlQEL
SEG
PRD ccchhhhhhcccccceeechhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhh
COILS CCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCC
SElQ RGRYGTVEPlQDKCATCDFPLLNRPFYLFLCGHMFHADCLLiQAVRPGLPAYKlQARLEELώR
SEG
PRD hhhheeeeccccccccccccccceeeeeeeccchhhhhhhhhhhccchhhhhhhhhhhhh
COILS CCCCCCCC
SElQ KLGAAPPPAKGSARAKEAEGGAATAGPSRElQLKADLDELVAAECVYCGELMIRSIDRPFI SEG xxxxxxxxxxxxxxxxxx
PRD hhhhhcchhhhhhhhhhhcccchhhhhhhhhhhhhhhhhhhhhhhccccceeeecccccc COILS
SElQ DPiQRYEEElQLSUL SEG
PRD chhhhhhhhhccc COILS
(No Prosite data available for DKFZphmel2_7gl4 - 1)
(No Pfam data available for DKFZphmel2_7gl4 -1)
DKFZphmel2_7kl1
group: melanoma derived
DKFZphmel2_7kl1 encodes a novel 234 amino acid protein without similarity to known proteins. Transcripts can be found in almost any tissue, but are most abundant in kidney and retina. No informative BLAST results; no predictive Prosite, Pfam or SCOP motif. The new protein can find application in studying the expression profile of melanoma-specific genes.
unknown protein
first ATG in frame 1
Sequenced by MediGenomix
Locus: /map="3"
Insert length: 2386 bp
Poly A stretch at pos. 2343, polyadenylation signal at pos. 2323
1 GGCAAAAGTC CAGGAATTAT CTTCATCCCT GGCTATCTTT CTTATATGAA
51 TGGTACAAAA GCGTTGGCGA TTGAGGAGTT TTGCAAATCT CTAGGTCACG
101 CCTGCATAAG GTTTGATTAC TCAGGAGTTG GAAGTTCAGA TGGTAACTCA
151 GAGGAAAGCA CACTGGGGAA ATGGAGAAAA GATGTTCTTT CTATAATTGA
201 TGACTTAGCT GATGGGCCAC AGATTCTTGT TGGATCTAGC CTTGGAGGGT
251 GGCTTATGCT TCATGCTGCA ATTGCACGAC CAGAGAAGGT TGTGGCTCTT
301 ATTGGTGTAG CTACAGCTGC AGATACCTTA GTGACAAAGT TTAATCAGCT
351 TCCTGTTGAG CTAAAAAAGG AAGTAGAGAT GAAAGGTGTG TGGAGCATGC
401 CATCAAAATA CTCTGAAGAA GGAGTTTATA ACGTTCAGTA CAGTTTCATT
451 AAAGAAGCTG AACATCACTG CTTGTTACAT AGCCCAATTC CTGTGAACTG
501 CCCCATAAGA TTGCTCCATG GCATGAAGGA TGACATTGTA CCTTGGCATA
551 CATCAATGCA GGTTGCCGAT CGAGTACTCA GCACAGATGT GGATGTCATC
601 CTCCGAAAAC ACAGTGATCA CCGAATGAGG GAAAAAGCAG ACATTCAACT
651 TCTTGTTTAC ACTATTGATG ACTTAATTGA TAAGCTCTCA ACTATAGTTA
701 ACTAGTATCA CATGTTTAGT TGGTATGTAA ACTAATGTAT CCAGAAGATT
751 GGAAGAGGGA TAAGAAATGA AAGATCCTGA TACTTTAGGT TTTTCCCTTT
801 CCTCTATTTT GTAAATATAA GATGAGTATT ATTTAATGAT GTATTTGCAT
851 AAGTAATGCA AATTGTGAAG AAGGACCAGC TGCTGTTTAG AAAATTTTCT
901 CCTTCCTTCT GTCCTTGATT TTTTTTCATT AAAGTATTTC CTTTTTTTAA
951 TTCAAGAAAA GTTTACCTTT CTTATGCTTA TGTTAGCTAT GCCAGCTCTT
1001 AATTGCATCC TTTTCTAATT AGGATTATTA ATAAAGCGTG AATATTTTGT
1051 TTTTTATTAT AGACAGAAAT TTGTAACATT ACTTCTGATT TGAAAATGCA
1101 ATTCACAAAA TATAGGGAAA TTTTTATTGA AGTAAATTTG AAATGATGGA
1151 GAAATTTCAG AAGCATAATA AAGTTCACAA TAAGGATAAT ACTTTATATA
1201 ATGTATAAAG TATATATAAT ATAATATATA TGTTATATAA ACTGCACATT
1251 ATATTCAAAC TTAAAATTGA GCTTTTTTTT TAAAGGCCCA AAATTGTACA
1301 GTGATACAAG GAGCTATTTC TAAAATTTGG CTTATGTATA ATATATTTAA
1351 ATGGGGAATT TCATCTAAAA CAATGATGTA GTATTTTTAA TATTCTGATT
1401 GGTAAAATTA AAGAGGAAAT TAATCTTTAT ATATTATTTC TTGCAGAAAC
1451 ATTCATTATT TTATTAATAT TGCCCTAAGT ACAACTAGGC AAGTGATTGC
1501 CACCTAAATC AGAAGACGTT CTAAAGTCAG TAAGAAAGTG TGAAATGCTA
1551 GTATAAAGGT TATTTTTTTT CTTTCCTAAA TAACTAAAGT GAGGTGTAGA
1601 TTGAGCCTTG ATATTATTTA GTTAATGTTT TTTATTAATT AATTTTGGCT
1651 GGACTTTATT TAGCTTGATT AGGTTATTAT CTGTCAAACC TTTTAAGTTG
1701 ACAACATGAC TCATATATAT ACATGTGTAT AAGATGAGCA TGTGTCGAAG
1751 ACTTATTCGA CTCATTAATG AGGAAACCAG CAGATAGTAA ACCTGGTTCA
1801 AAGTACAATT CAAGAAACTG AGTATTTATG GGCATTGAAG AAAAAATGTT
1851 GAGATAAAAT TGCTGTGCAG AAAAAAGTGT TAATGAAGCC GACCTGACTA
1901 CTTAACCTTA GAGACCTGCT TTACAAGGTT GGCCCTTGAT TGGCATCTGG
1951 GAACTTGGAG TTCAGGGGGC TTCCACCATT CCCAGAACTG ATCAAAGTAG
2001 CTTACTATAT CTAAACTGTA AAACAATATA GTTTCTCCTG AACACCTGCT
2051 TTCCTTCTGG GAGTCTGGAA TTTTGGTATG TGCCAGGCAG AGACTACCTT
2101 TGTGACCAGC TCCCAGTAAA AACCCCAGGC ACTCAGTCTC TAACAAGCTT
2151 TTCTGGTTGA CAGTGTTTCA CAAGTGCTGT TACAACTGGT TGCTGGGAGA
2201 ATTAAGCTCA TCCTCTGTGA TTCCACTGGC GGAGGATTCT TGGAAGCTTG
2251 CACTTAGTTT CCCCTGACTT CACCCCATGT GTCTTTTTTC CTTTGCTGAT
2301 TTTGTTTTGT ATCCTTTCAC TGTAATAAAT CATGGCCGTG AGCAGAAAAA
2351 AAAAAAAAAA AAAAAAAAAT AAAAAAAAAA AAAAAA
BLAST Results
No BLAST result
Medline entries
No Medline entry
Peptide information for frame 1
ORF from 46 bp to 702 bp; peptide length: 219
Category: similarity to unknown protein
Classification: unclassified
1 MNGTKALAIE EFCKSLGHAC IRFDYSGVGS SDGNSEESTL GKWRKDVLSI
51 IDDLADGPQI LVGSSLGGWL MLHAAIARPE KVVALIGVAT AADTLVTKFN
101 QLPVELKKEV EMKGVWSMPS KYSEEGVYNV QYSFIKEAEH HCLLHSPIPV
151 NCPIRLLHGM KDDIVPWHTS MQVADRVLST DVDVILRKHS DHRMREKADI
201 QLLVYTIDDL IDKLSTIVN
BLASTP hits
No BLASTP hits available
Alert BLASTP hits for DKFZphmel2_7kl1, frame 1
No Alert BLASTP hits found
Pedant information for DKFZphmel2_7kl1, frame 1
Report for DKFZphmel2_7kl1.1
[LENGTH] 219
[MW] 24301.18
[pI] 5.61
[HOMOL] PIR:A71b11 hypothetical protein RP343 - Rickettsia prowazekii 3e-21
[BLOCKS] BP04352K
[BLOCKS] PR00826E
[KW] Alpha_Beta
SElQ MNGTKALAIEEFCKSLGHACIRFDYSGVGSSDGNSEESTLGKURKDVLSIIDDLADGPiQI PRD ccchhhhhhhhhhhhccceeeeeeccccccccccccccccchhhhhhhhhhhhhccccee SElQ LVGSSLGGULMLHAAIARPEKVVALIGVATAADTLVTKFNiQLPVELKKEVEMKGVUSMPS PRD eeecccchhhhhhhhhhccceeeeeeeeeehhhhhhcccccchhhhhhhhhhhheeeccc
SElQ KYSEEGVYNVlQYSFIKEAEHHCLLHSPIPVNCPIRLLHGMKDDIVPUHTSMiQVADRVLST PRD ccccccceeeehhhhhhhhhhhhhhhccccccceeecccccccccccchhhhhhhhhhhh
SElQ DVDVILRKHSDHRMREKADIiQLLVYTIDDLIDKLSTIVN PRD hheeeeeccccchhhhhhhheeeeehhhhhhhhcccccc
(No Prosite data available for DKFZphmel2_7kl1- 1) (No Pfam data available for DKFZphmel2_7kl1-l)
DKFZphtes3_10ilb
group: nucleic acid management
DKFZphtes3_10ilb encodes a novel 742 amino acid protein with similarity to human ZK1.
The ZK1 gene is one of the early response genes induced by exposure to ionizing radiation, and plays a role in radiation-induced apoptotic cell death of hematopoietic cells. The novel protein contains 18 zinc finger domains, an RGD cell attachment sequence and an ATP_GTP_A domain. The new protein can find application in diagnosis/therapy in leukemia predisposition/disease and in the modulation of DNA repair.
similarity to ZK1 (Homo sapiens), complete cds.
Sequenced by Qiagen
Locus: unknown
Insert length: 2884 bp
Poly A stretch at pos. 2861, polyadenylation signal at pos. 2835
1 CGGAAATGGA GGGGGTCGCT TTCCTCACCT TCCTCGCTGC GCGGGCGGCG
51 GTTGGTAACC GGTCAGACCA GCCCGAGAGG GACCTGGTGC CTGTACCCAG
101 GCTTCTGTCG CTCTGTCGCC TGCGCTATGC CCTGCTGTAG TCACAGGAGC
151 TGTAGAGAGG ACCCCGGTAC ATCTGAAAGC CGGGAAATGG ACCCAGTGGC
201 CTTTGAGGAT GTGGCTGTGA ACTTCACCCA GGAAGAGTGG ACATTGCTGG
251 ATATTTCCCA GAAGAATCTC TTCAGGGAAG TGATGCTGGA AACTTTCAGG
301 AACCTGACCT CTATAGGAAA AAAATGGAGT GACCAGAACA TTGAATATGA
351 GTACCAAAAC CCCAGAAGAA GCTTCAGGAG TCTCATAGAA GAGAAAGTCA
401 ATGAAATTAA AGAAGACAGT CATTGTGGAG AAACTTTTAC CCAGGTTCCA
451 GATGACAGAC TGAACTTCCA GGAGAAGAAA GCTTCTCCTG AAGTAAAATC
501 ATGTGACAGC TTTGTGTGTG CAGAAGTTGG CATAGGTAAC TCATCTTTTA
551 ATATGAGCAT CAGAGGTGAC ACTGGACACA AGGCATATGA GTATCAGGAA
601 TATGGACCAA AGCCATATAA GTGTCAACAA CCTAAAAATA AGAAAGCCTT
651 CAGGTATCGC CCATCCATTA GAACACAAGA AAGGGATCAC ACTGGAGAGA
701 AACCCTATGC TTGTAAAGTC TGTGGAAAAA CCTTTATTTT CCATTCAAGC
751 ATTCGAAGAC ACATGGTAAT GCACAGTGGG GATGGAACTT ATAAATGTAA
801 ATTTTGTGGG AAAGCCTTCC ATTCTTTCAG TTTATATCTT ATCCATGAAA
851 GAACTCACAC TGGAGAGAAA CCATATGAAT GTAAACAATG TGGTAAATCC
901 TTTACTTATT CTGCTACCCT TCAAATACAT GAAAGAACTC ACACTGGGGA
951 GAAGCCCTAT GAATGTAGCA AATGTGATAA AGCATTTCAT AGTTCTAGTT
1001 CCTATCATAG ACATGAAAGA AGTCACATGG GAGAGAAGCC TTATCAATGC
1051 AAAGAATGTG GAAAAGCATT TGCATATACC AGTTCTCTTC GTAGACATGA
1101 AAGGACCCAC TCTGGGAAAA AACCGTATGA ATGTAAGCAA TATGGGGAAG
1151 GCTTATCCTA TCTTATAAGT TTTCAAACAC ACATAAGAAT GAACTCTGGA
1201 GAAAGACCTT ATAAATGTAA GATATGTGGG AAAGGCTTTT ATTCTGCCAA
1251 GTCATTTCAA ACACATGAAA AAACTCACAC TGGAGAGAAA CGCTATAAAT
1301 GCAAGCAATG TGGTAAAGCC TTCAATCTTT CCAGTTCCTT TCGATATCAT
1351 GAAAGGATTC ACACTGGAGA GAAACCCTAT GAGTGTAAGC AGTGTGGGAA
1401 AGCCTTCAGA TCTGCCTCAC AGCTTCGAGT GCACGGTGGG ACTCACACTG
1451 GAGAGAAACC CTATGAATGT AAGGAATGTG GGAAAGCCTT CAGATCTACC
1501 TCACACCTTC GAGTGCATGG TAGGACTCAT ACTGGAGAGA AACCCTATGA
1551 ATGTAAGGAA TGTGGGAAAG CCTTCAGATA TGTGAAGCAC CTTCAAATTC
1601 ATGAAAGGAC AGAAAAACAC ATAAGAATGC CCTCTGGAGA AAGACCTTAT
1651 AAATGTAGTA TATGTGAGAA AGGCTTTTAT TCTGCCAAGT CATTTCAAAC
1701 ACATGAAAAA ACTCACACTG GAGAGAAACC CTATGAATGC AACCAATGTG
1751 GTAAAGCCTT CAGATGTTGC AATTCCCTTC GATATCATGA AAGGACTCAC
1801 ACTGGAGAGA AACCCTATGA GTGTAAGCAA TGTGGGAAAG CCTTCAGATC
1851 TGCCTCACAC CTTCGAATGC ATGAAAGGAC TCACACTGGA GAGAAACCCT
1901 ATGAGTGTAA GCAATGTGGG AAAGCCTTCA GTTGTGCCTC AAACCTTCGA
1951 AAGCATGGTA GGACTCACAC TGGAGAGAAA CCCTATGAGT GTAAGCAATG
2001 TGGGAAAGCC TTCAGATCTG CCTCAAACCT TCAGATGCAT GAAAGGACTC
2051 ACACTGGAGA GAAACCCTAT GAATGTAAGG AATGCGAAAA AGCATTCTGT
2101 AAATTCTCTT CTTTTCAAAT ACATGAAAGG AAGCACAGAG GAGAGAAGCC
2151 CTATGAATGT AAGCATTGTG GGAATGGATT CACATCTGCC AAGATTCTTC
2201 AAATACATGC AAGAACACAC ATTGGAGAGA AACACTATGA ATGTAAGGAA
2251 TGCGGAAAAG CATTCAATTA TTTTTCTTCC TTGCATATAC ACGCAAGGAC
2301 TCATATGGGA GAGAAGCCAT ATGAATGTAA GGATTGTGGG AAAGCATTCA
2351 GCTAGCCTGG TTCCTTTTAT GGACATGAAT AGACTCACAC TGGAAGGAAG
2401 CACTATGAAT GCAAGCAATG TGGCAAAACT TTCACATTTT CCAGTTCTTT
2451 TCGATATCAT GAAAGGACTC ACACTGGGGA GAAACCCTAT CAATGTAAGC
2501 AGTGTGGGAA AGCCTTCATT CCTTTTACTT CTTTTCAATG TCATGAAAGG
2551 ACTCACACGG GAGAGAAACC CTATGAGTGT ATTCTAGTTC CGTTTGATAT
2601 CATGAAAGGA CTTACACTGG AGTGAAACCC TATGAATGTA AGCAATGTGG
2651 GAAAGCCTTC AGATGTGCCT CGCACCTTCA ACGGCATGGA AGGGTTCACA
2701 CTTGGGAGAA ACTCTATGAA TGTAAGCAGT ATGGGAAAGC CTTCAGATCT
2751 GCCAAGATTC TTTGAATACA GATAATTAAT GTAAACAATT ATCATAAGTA
2801 TACTAACATG TTATTCTTTT TAAATAAGAA GGTATAATAA AATATCCCAT
2851 TGGTTTTATG TATTAAAAAA AAAAAAAAAA AAAA
BLAST Results
No BLAST result
Medline entries
98401134:
Katoh O, Oguri T, Takahashi T, Takai S, Fujiwara Y, Watanabe H.; ZK1, a novel Kruppel-type zinc finger gene, is induced following exposure to ionizing radiation and enhances apoptotic cell death on hematopoietic cells. Biochem Biophys Res Commun 1998 Aug 28;249(3):595-600
95137313:
Wick MJ, Ann DK, Lee NM, Loh HH.; Isolation of a cDNA encoding a novel zinc-finger protein from neuroblastoma x glioma NG108-15 cells. Gene 1995 Jan 23;152(2):227-32
Peptide information for frame 1
ORF from 127 bp to 2352 bp; peptide length: 742
Category: similarity to known protein
Classification: Nucleic acid management
Prosite motifs: RGD (146-148)
ATP_GTP_A (195-202)
ZINC_FINGER_C2H2 (196-216)
ZINC_FINGER_C2H2 (224-244)
ZINC_FINGER_C2H2 (252-272)
ZINC_FINGER_C2H2 (280-300)
ZINC_FINGER_C2H2 (308-328)
ZINC_FINGER_C2H2 (364-384)
ZINC_FINGER_C2H2 (392-412)
ZINC_FINGER_C2H2 (420-440)
ZINC_FINGER_C2H2 (448-468)
ZINC_FINGER_C2H2 (510-530)
ZINC_FINGER_C2H2 (538-558)
ZINC_FINGER_C2H2 (566-586)
ZINC_FINGER_C2H2 (594-614)
ZINC_FINGER_C2H2 (622-642)
ZINC_FINGER_C2H2 (650-670)
ZINC_FINGER_C2H2 (678-698)
ZINC_FINGER_C2H2 (706-726)
ZINC_FINGER_C2H2 (476-496)
1 MPCCSHRSCR EDPGTSESRE MDPVAFEDVA VNFTQEEWTL LDISQKNLFR
51 EVMLETFRNL TSIGKKWSDQ NIEYEYQNPR RSFRSLIEEK VNEIKEDSHC
101 GETFTQVPDD RLNFQEKKAS PEVKSCDSFV CAEVGIGNSS FNMSIRGDTG
151 HKAYEYQEYG PKPYKCQQPK NKKAFRYRPS IRTQERDHTG EKPYACKVCG
201 KTFIFHSSIR RHMVMHSGDG TYKCKFCGKA FHSFSLYLIH ERTHTGEKPY
251 ECKQCGKSFT YSATLQIHER THTGEKPYEC SKCDKAFHSS SSYHRHERSH
301 MGEKPYQCKE CGKAFAYTSS LRRHERTHSG KKPYECKQYG EGLSYLISFQ
351 THIRMNSGER PYKCKICGKG FYSAKSFQTH EKTHTGEKRY KCKQCGKAFN
401 LSSSFRYHER IHTGEKPYEC KQCGKAFRSA SQLRVHGGTH TGEKPYECKE
451 CGKAFRSTSH LRVHGRTHTG EKPYECKECG KAFRYVKHLQ IHERTEKHIR
501 MPSGERPYKC SICEKGFYSA KSFQTHEKTH TGEKPYECNQ CGKAFRCCNS
551 LRYHERTHTG EKPYECKQCG KAFRSASHLR MHERTHTGEK PYECKQCGKA
601 FSCASNLRKH GRTHTGEKPY ECKQCGKAFR SASNLQMHER THTGEKPYEC
651 KECEKAFCKF SSFQIHERKH RGEKPYECKH CGNGFTSAKI LQIHARTHIG
701 EKHYECKECG KAFNYFSSLH IHARTHMGEK PYECKDCGKA FS
BLASTP hits
No BLASTP hits available
Alert BLASTP hits for DKFZphtes3_10ilb, frame 1
No Alert BLASTP hits found
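The Prosite motif positions listed for frame 1 are the kind of result a pattern scan over the peptide produces. The Python sketch below illustrates such a scan with regular expressions. The three expressions are this example's renderings of the PROSITE patterns usually given for RGD, ATP_GTP_A (P-loop) and ZINC_FINGER_C2H2, written from memory. They are assumptions and should be checked against the PROSITE release in use before being relied on. The function name `scan_motifs` is likewise hypothetical.

```python
import re

# Regex renderings of three PROSITE-style patterns (assumed, see lead-in):
#   RGD               R-G-D
#   ATP_GTP_A         [AG]-x(4)-G-K-[ST]
#   ZINC_FINGER_C2H2  C-x(2,4)-C-x(3)-[LIVMFYWC]-x(8)-H-x(3,5)-H
PROSITE_REGEX = {
    "RGD": r"RGD",
    "ATP_GTP_A": r"[AG].{4}GK[ST]",
    "ZINC_FINGER_C2H2": r"C.{2,4}C.{3}[LIVMFYWC].{8}H.{3,5}H",
}


def scan_motifs(protein, patterns=PROSITE_REGEX):
    """Return (name, start, end) hits with 1-based inclusive coordinates."""
    protein = protein.upper().replace(" ", "")
    hits = []
    for name, pattern in patterns.items():
        # The lookahead lets matches that overlap each other all be reported.
        for match in re.finditer("(?=(" + pattern + "))", protein):
            hits.append((name, match.start(1) + 1, match.end(1)))
    return sorted(hits, key=lambda hit: (hit[1], hit[0]))


if __name__ == "__main__":
    # Test string built from two short stretches of the frame 1 peptide.
    fragment = "SIRGDTGEKPYACKVCGKTFIFHSSIRRHMVMHSG"
    for hit in scan_motifs(fragment):
        print(hit)
```

Short patterns such as RGD match by chance in many proteins, so a hit of this kind is a pointer for further analysis rather than evidence of function.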
Peptide information for frame 2
ORF from 1703 bp to 2584 bp; peptide length: 294
Category: questionable ORF
Classification: no clue
1 MKKLTLERNP MNATNVVKPS DVAIPFDIMK GLTLERNPMS VSNVGKPSDL
51 PHTFECMKGL TLERNPMSVS NVGKPSVVPQ TFESMVGLTL ERNPMSVSNV
101 GKPSDLPQTF RCMKGLTLER NPMNVRNAKK HSVNSLLFKY MKGSTEERSP
151 MNVSIVGMDS HLPRFFKYMQ EHTLERNTMN VRNAEKHSII FLPCIYTQGL
201 IWERSHMNVR IVGKHSASLV PFMDMNRLTL EGSTMNASNV AKLSHFPVLF
251 DIMKGLTLGR NPINVSSVGK PSFLLLLFNV MKGLTRERNP MSVF
BLASTP hits
No BLASTP hits available
Alert BLASTP hits for DKFZphtes3_10ilb, frame 2
TREMBL:AF153201_1 product: "zinc finger protein dp"i Homo sapiens zinc finger protein dp mRNAi complete cds-i N = li Score = 225ι P 4-le-lβ >TREMBL:AF153201_1 product: "zinc finger protein dp"i Homo sapiens zinc finger protein dp mRNAi complete eds- Length = 423
HSPs:
Score = 225 (33-6 bits)ι Expect = 4-le-lβι P = 4-le-lβ Identities = 84/24b (34*)ι Positives = 122/24b (41*)
(Query: lb VVKPSDVA- IPFDIMKGLTLERNPMSVSNVGKPSDLPHTFECMKGLTLERNPMSVSNVGK 74
V KPS A I F I + + L RN + V +V K S T ++G TLERNP++V +VGK
Sb jct : 3 VGKPSVRAlQILFCIRESI-
LGRNHIHVISVAKVSVRIiQTLLNIEGSTLERNPINVMSVGK bl
(Query : 75 PSVVPlQTFESMVGLTLERNPMSVSNVGKPSDLPiQTFRCMKGLTLERNPMNVRNAKKHSVN 134
+ (Q+ + G LERNP+ V NV KPS Q + TLER+ +V
+ A K V
Sbj ct : b2
LLIRAiQSLFYIRGFILERNPIPVINVAKPSVGFfQILLIINEFTLERSLTHVISAIKCLVE 121
(Query: 135 SLLFKYMKGSTEERSPMNVSIVGMDS-
HLPRFFKYMiQEHTLERNTMNVRNAEKHSIIFLP 113
+ + + R+PMNV VG P F +++E TLERN M+V
K + Sbjct: 122 DEILLNITEFIiQVRNPMNVMNVGKPLVRAPTLF-
FIRESTLERNLMHVVIVLKALVAVlQI 180
(Query: 114
CIYTiQGLIUERSHMNVRIVGKHSASLVPFMDMNRLTLEGSTMNASNVAKLSHFPVLFDIM 253 + + ER+HM+V V K +++ TL S + A V K S
+ +
Sbjct: 181 LLSIKEYTLERNHMHVISVIKVLVKAlQTSLNIREYTLVKSLIIAIVVRKPSVRVLTLFFI 240 (Query: 254 KGLTLGRN 2bl
+ TL +N Sbjct: 241 REFTLEKN 248
Score = 215 (32-3 bits)ι Expect = 1-le-lbι P = 1-le-lb Identities = 82/24b (33*)ι Positives = 124/24b (50*)
(Query : 44
VGKPSDLPHTFECMKGLTLERNPMSVSNVGKPSVVPlQTFESMVGLTLERNPMSVSNVGKP 103 VGKPS C + + L RN + V +V K SV (QT ++' G TLERNP++V +VGK Sbj ct : 3
VGKPSVRAiQILFCIRESILGRNHIHVISVAKVSVRIlQTLLNIEGSTLERNPINVMSVGKL b2
(Query : 104 SDLPlQTFRCMKGLTLERNPMNVRNAKKHSVNSLLFKYMKGSTEERSPMNV- SIVGM D 151
(Q+ + + G LERNP+ V N K SV + + T ERS +V S + D Sbjct: b3 LIRAiQSLFYIRGFILERNPIPVINVAKPSVGFiQILLIINEFTLERSLTHVISAIKCLVED 122
(Query: ibO SHLPRFFKYMlQEHTLERNTMNVRNAEKHSIIFLPCIY- TtQGLIUERSHMNVRIVGKHSAS 218
L +++(Q RN MNV N K ++ P ++ + ER+ M+V IV K +
Sbjct: 123 EILLNITEFIlQV RNPMNVMNVGK-
PLVRAPTLFFIRESTLERNLMHVVIVLKALVA 177
(Query: 211 LVPFMDMNRLTLEGSTMNASNVAK- LSHFPVLFDIMKGLTLGRNPINVSSVGKPSFLLLL 277
+ + + TLE + M+ +V K L +1 + TL ++ I V KPS +L Sbjct: 178 V(QILLSIKEYTLERNHMHVISVIKVLVKA(QTSLNIRE- YTLVKSLIIAIVVRKPSVRVLT 23b
(Query: 276 FNVMKGLTRERN 261
++ T E+N Sbjct: 237 LFFIREFTLEKN 246
Score = 207 (31-1 bits)ι Expect = 5-2e-15ι P = 5-2e-15 Identities = 80/270 (21*)ι Positives = 121/270 (47*) (Query: 1 MKKLTLERNPMNATNVVKPSDVAIPFDI-
MKGLTLERNPMSVSNVGKPSDLPHTFECMKG 51
+++ L RN ++ +V K S V I + ++G TLERNP++V +VGK
+ ++G
Sb j ct : lb IRESILGRNHIHVISVAKVS- VRI(QTLLNIEGSTLERNPINVMSVGKLLIRA(QSLFYIRG 74
(Query: bO
LTLERNPMSVSNVGKPSVVP(QTFESMVGLTLERNPMSVSNVGKPSDLP(QTFRCMKGLTLE 111
LERNP+ V NV KPSV Q + TLER+ V + K + +
Sbjct: 75 FILERNPIPVINVAKPSVGFlQILLIINEFTLERSLTHVISAIKCLVEDEILLNITEFIiQV 134
(Query: 120 RNPMNVRNAKKHSVNSLLFKYMKGSTEERSPMNVSIVGMDSHLPRFFKYMiQEHTLERNTM 171
RNPMNV N K V + +++ ST ER+ M+V IV +
++E+TLERN M
Sbjct: 135
RNPMNVMNVGKPLVRAPTLFFIRESTLERNLMHVVIVLKALVAViQILLSIKEYTLERNHM 114
(Query: 160
NVRNAEKHSIIFLPCIYTiQGLIUERSHMNVRIVGKHSASLVPFMDMNRLTLEGSTMNASN 231 +V + K + + + +S + +V K S ++ + TLE
+ + Sbjct: 115
HVISVIKVLVKAiQTSLNIREYTLVKSLIIAIVVRKPSVRVLTLFFIREFTLEKNYYLCTiQ 254
(Query : 240 VAKLSHFPVLFDIMKGLTL--GRNPINVSSVGK 270 + K F + D + + K + G P S K Sbjct : 255 CSK--SFS(QISDLIKH(3RIHTGEKPYKCSECRK 285
Score = 181 ( 27 - 2 bi ts ) ι Expect = l - 4e-ll ι P = l - 4e-ll Ident i t i es = 74 /2b1 ( 27* ) ι Posit i ves = llb/2b1 ( 43* ) fluery: 5
TLERNPMNATNVVKPSDVAIPFDIMKGLTLERNPMSVSNVGKPSDLPHTFECMKGLTLER b4 TLERNP+N +V K A ++G LERNP+ V NV KPS + TLER
Sbjct: 48 TLERNPINVMSVGKLLIRAlQSLFYIRGFILERNPIPVINVAKPSVGFiQILLIINEFTLER 107
(Query: b5 NPMSVSNVGKPSVVPiQTFESMVGLTLERNPMSVSNVGKPSDLPiQTFRCMKGLTLERNPMN 124
+ V + K V + ++ RNPM+V NVGKP T ++
TLERN M+
Sbjct: 108
SLTHVISAIKCLVEDEILLNITEFIώVRNPMNVMNVGKPLVRAPTLFFIRESTLERNLMH lb7
(Query: 125 VRNAKKHSVNSLLFKYMKGSTEERSPMNV-
SIVGMDSHLPRFFKYMiQEHTLERNTMNVRN 183
V K V + +K T ER+ M+V S++ + ++E+TL
++ + Sbjct: Ibβ VVIVLKALVAViQILLSIKEYTLERNHMHVISVIKVLVKAiQTSLN- IREYTLVKSLIIAIV 22b
(Query: 164
AEKHSIIFLPCIYTiQGLIUERSHMNVRIVGKHSASLVPFMDMNRLTLEGSTMNASNVAKL 243 K S+ L + + E + + + K + + + R +
S K
Sbjct: 227 VRKPSVRVLTLFFIREFTLEKNYYLCT(QCSKSFS(QISDLIKH(QRIHTGEKPYKCSECRKA 2δb (Query: 244 SHFPVLFDIMKGLTLGRNPINVSSVGKPSF 273
L + + + G+ P GK SF
Sbjct: 287 FS(QCSLLALH(QRIHTGKKPNPCDECGK-SF 315
Score = Ibb (24-1 bits)ι Expect = 8-4e-10ι P = 8-4e-10 Identities = b3/114 (32*)ι Positives = 61/114 (45*)
(Query: 100
VGKPSDLPώTFRCMKGLTLERNPMNVRNAKKHSVNSLLFKYMKGSTEERSPMNVSIVGMD 151 VGKPS Q C++ L RN ++V + K SV ++GST ER+P+NV VG Sbjct: 3
VGKPSVRA(QILFCIRESILGRNHIHVISVAKVSVRI(QTLLNIEGSTLERNPINVMSVGKL b2
(Query: IbO SHLPRFFKYM(2EHTLERNTMNVRNAEKHSIIFLPCIYT(QGLIUERSHMNVRIVGKHSASL 211
+ Y++ LERN + V N K S+ F + ERS +V
K
Sbjct: b3
LIRA(QSLFYIRGFILERNPIPVINVAKPSVGF(QILLIINEFTLERSLTHVISAIKCLVED 122
(Query: 220 VPFMDMNRLTLEGSTMNASNVAK-
LSHFPVLFDIMKGLTLGRNPINVSSVGKPSFLLLLF 276
+++ + MN NV K L P LF I + TL RN ++V V K
+ + Sbjct: 123 EILLNITEFIiQVRNPMNVMNVGKPLVRAPTLFFIRES-
TLERNLMH VIVLKALVAViQIL 161
(Query: 271 NVMKGLTRERNPMSV 213 +K T ERN M V Sbjct: 162 LSIKEYTLERNHMHV lib
Pedant information for DKFZphtes3_10ilb, frame 1
Report for DKFZphtes3_10ilb.1
[LENGTH] 784
[MW] 90657.05
[pI] 9.24
[HOMOL! TREMBL:AB011414_1 gene: "ZKl"i product: "Kruppel- type zinc finger protein"i Homo sapiens ZK1 mRNA for Kruppel-type zinc finger proteini complete eds- 0-0
[FUNCAT! 30-10 nuclear organization [S- cerevisiaei YJL05bc! be-33
[FUNCAT! 04- 05- DI- 04 transcriptional control [S- cerevisiaei YJL05bc! be-33
[FUNCAT! 04-11 other transcription activities [S- cerevisiaei
Y0R113w! 5e-24
[FUNCAT! 04-01-01 rrna synthesis CS. cerevisiaei YPRlδbc PZFl -
TFIIIA! le-20 EFUNCAT! 04-03.01 trna synthesis ES- cerevisiaei YPRlβbc PZFl -
TFIIIA! le-20
[FUNCAT! 13-04 homeostasis of other ions [S- cerevisiaei
YNL027w! le-13
[FUNCAT! 11-07 detoxif icaton CS. cerevisiaei YGL254w! Se-IE [FUNCAT! 01-02-04 regulation of nitrogen and sulphur utilization CS- cerevisiaei YGL254w! 2e-12
[FUNCAT! 01.05-04 regulation of carbohydrate utilization [S- cerevisiaei YGL201w! 2e-ll
[FUNCAT! 04-05-11 other mrna-transcription activities [S- cerevisiaei YER028c! 3e-lD
[FUNCAT! 11-01 stress response [S- cerevisiaei YKLDb2w! le-01
[FUNCAT! 01-01-04 regulation of amino-acid metabolism [S- cerevisiaei YDR253c! 5e-01
[FUNCAT! 11 unclassified proteins CS- cerevisiaei YBRObbc! 3e-08
[FUNCAT! 03-07 pheromone responsei mating-type determinationi sex-specific proteins [S- cerevisiaei YDR14bc! le-D7
[FUNCAT! 03-25 cytokinesis CS. cerevisiaei YLR131c! 2e-0b
CBL0CKS! BL004bb TFIIS zinc ribbon domain proteins [BLOCKS! BLD024SA Phytochrome chro ophore attachment site proteins
[BLOCKS! DMD1151B
[BLOCKS! PF013b3B
[BLOCKS! BL01D30 EBLOCKS! PFOOOIbB
[BLOCKS! BL00026 Zinc fingeri C2H2 typei domain proteins
[BLOCKS! BP04213E
[BLOCKS! BP04213C
EBLOCKS! BP04213B [SCOP! d2adr 7-31-1-1-4 ADR1 [synthetic based on yeast
(Saccharomyce 2e-05
[PIRKU! nucleus le-53
[PIRKU! RNA binding 2e-58 [PIRKU! duplication le-34
[PIRKU! tandem repeat le-171
[PIRKU! spermatogenesis 5e-b2
[PIRKU! zinc le-lbl [PIRKU! zinc finger 0-0
[PIRKU! DNA binding 0-0
[PIRKU! metal binding le-12D
[PIRKU! phosphoprotein 2e-5δ
[PIRKU! leucine zipper le-53 [PIRKU! alternative splicing 2e-58
[PIRKU! eye lens le-111
[PIRKU! oocyte. le-lOb
[PIRKU! transcription factor le-111
[PIRKU! embryo le-lOb [PIRKU! segmentation le-34
[PIRKU! transcription regulation le-152
[SUPFAM! POZ domain homology 7e-63
[SUPFAM! transcription factor Krueppel le-34
[SUPFAM! zinc finger protein ZFP-3b le-173 [SUPFAM! transcription factor IIIA δe-31
[PROSITE! ATP_GTP_A 1
[PROSITE! RGD 1
[PROSITE! ZINC_FINGER_C2H2 16
[PFAM! Zinc fingeri C2H2 type [PFAM! TNFR/NGFR cysteine-rich region
[KU! Irregular
[KU! 3D
[KU! LOU COMPLEXITY 3-57 *
SElQ RKURGSLSSPSSLRGRRLVTGiQTSPRGTUCLYPGFCRSVACAMPCCSHRSCREDPGTSES SEG ■ ■ -xxxxxxxxxxxxxxx
ImeyF
SElQ REMDPVAFEDVAVNFTlQEEUTLLDISiQKNLFREVMLETFRNLTSIGKKUSDlQNIEYEYiQN SEG
ImeyF
SElQ PRRSFRSLIEEKVNEIKEDSHCGETFTiQVPDDRLNFlQEKKASPEVKSCDSFVCAEVGIGN SEG
ImeyF
SElQ SSFNMSIRGDTGHKAYEYlQEYGPKPYKClQiQPKNKKAFRYRPSIRTlQERDHTGEKPYACKV SEG
ImeyF
SElQ CGKTFIFHSSIRRHMVMHSGDGTYKCKFCGKAFHSFSLYLIHERTHTGEKPYECKiQCGKS SEG
ImeyF
SElQ FTYSATLlQIHERTHTGEKPYECSKCDKAFHSSSSYHRHERSHMGEKPYiQCKECGKAFAYT SEG xxxxxxxxxxxxx ImeyF
SElQ SSLRRHERTHSGKKPYECKiQYGEGLSYLISFiQTHIRMNSGERPYKCKICGKGFYSAKSFiQ SEG
ImeyF
SElQ THEKTHTGEKRYKCK(QCGKAFNLSSSFRYHERIHTGEKPYECK(2CGKAFRSAS<QLRVHGG SEG
ImeyF
SElQ THTGEKPYECKECGKAFRSTSHLRVHGRTHTGEKPYECKECGKAFRYVKHLiQIHERTEKH SEG
ImeyF
SE(3 IRMPSGERPYKCSICEKGFYSAKSFlQTHEKTHTGEKPYECNlQCGKAFRCCNSLRYHERTH SEG
ImeyF
SElQ TGEKPYECKiQCGKAFRSASHLRMHERTHTGEKPYECKiQCGKAFSCASNLRKHGRTHTGEK SEG
ImeyF
- -TTTEETTTTTCEETTHHHHHHHHHHHHTTCCEEETTTTEEECCHHHHHHHHHHHHCCC
SElQ PYECKiQCGKAFRSASNLiQMHERTHTGEKPYECKECEKAFCKFSSFiQIHERKHRGEKPYEC SEG
ImeyF
CEEETTTTEEECCHHHHHHHHHHH
SElQ KHCGNGFTSAKILiQIHARTHIGEKHYECKECGKAFNYFSSLHIHARTHMGEKPYECKDCG SEG
ImeyF
SElQ KAFS SEG ....
ImeyF
Prosite for DKFZphtes3_10ilb ■ 1
PSOOOlb 188- ->111 RGD PDOCOOOlb
PS00017 237- ->245 ATP_GTP_A PD0C00017
PS0D028 236- ->2S1 ZINC. FINGER _C2H2 PD0C00028
PS00028 2bb- ->287 ZINC FINGER _C2H2 PD0C00028
PS00026 214- ->315 ZINC FINGER. C2H2 PD0C00028
PS00026 322- ->343 ZINC_ FINGER _C2H2 PD0C00026
PS00026 350- ->371 ZINC. FINGER _C2H2 PD0C00026
PS00028 40b- ->427 ZINC FINGER _C2H2 PD0C00028
PS00028 434- ->455 ZINC FINGER _C2H2 PD0C00028
PS00028 4b2- ->463 ZINC FINGER _C2H2 PD0C00028
PS00026 410- ->511 ZINC FINGER C2H2 PD0C00028
PS00028 552- ->573 ZINC .FINGER. C2H2 PD0CO0028 PS00026 56D->b01 ZINC_FINGER_C2H2 PD0C00026
PS00028 b06->b21 ZINC_FINGER_C2H2 PD0C00026
PS00026 b3b->b57 ZINC_FINGER_C2H2 PD0C00026
PS00028 bb4->b85 ZINC_FINGER_C2H2 PD0C00028
PS00026 b12->713 ZINC_FINGER_C2H2 PD0C00028
PS00026 72D->741 ZINC_FINGER_C2H2 PD0C00028
PS00028 746->7b1 ZINC_FINGER_C2H2 PD0C00026
PS00028 518->S41 ZINC FINGER C2H2 PD0C00028
Pfam for DKFZphtes3_10ilb- 1
HMM_NAME TNFR/NGFR cysteine-rich region
HMM *CpeGtYtD-UNHvpqClpC- - trCePEMG(QYMvqPCTwT(QNTVC*
C + +++ +++++C C ++C+++ G++++++ ++ V
(Query 30 CLYPGFCRSVACAMPC—CSHRSCREDPGTSESREMDP VA b7
HMM_NAME Zinc fingeri C2H2 type
HMM *CpwPDCgKtFrrwsNLrRHMRTH*
C++ CGKTF S+ RRHM +H (Query 238 CKV—CGKTFIFHSSIRRHMVMH 258 32-15 (bits) f: 2bb t: 28b Target: dkfzphtes3_lUilb - 1 similarity to ZKl (Homo sapiens)ι complete eds-
Alignment to HMM consensus: (Query *CpwPDCgKtFrrwsNLrRHMRTH*
C++ CGK+F + S + +H RTH dkfzphtes3 2bb CKF—CGKAFHSFSLYLIHERTH 26b
(Query f: 214 t: 314 Target: dkfzphtes3_10ilb ■ 1 similarity to ZKl (Homo sapiens)ι complete eds- Alignment to HMM consensus: HMM *CpwPDCgKtFrrwsNLrRHMRTH*
C+ CGK+F+++ +L++H RTH (Query 214 CK<2--CGKSFTYSATLiQIHERTH 314
34-22 (bits) f: 322 t: 342 Target: dkfzphtes3_lUilb - 1 similarity to ZKl (Homo sapiens)ι complete eds- Alignment to HMM consensus: (Query *CpwPDCgKtFrrwsNLrRHMRTH*
C++ C+K+F ++S++ RH R+H dkfzphtes3 322 CSK--CDKAFHSSSSYHRHERSH 342
(Query f: 350 t "- 370 Target: dkfzphtes3_10ilb ■ 1 similarity to ZKl (Homo sapiens)ι complete eds-
Alignment to HMM consensus: HMM *CpwPDCgKtFrrwsNLrRHMRTH* C++ CGK+F + S+LRRH RTH
(Query 350 CKE--CGKAFAYTSSLRRHERTH 370 32-01 (bits) f: 40b t: 42b Target: dkfzphtes3_10ilb • 1 similarity to ZKl (Homo sapiens)ι complete eds-
Alignment to HMM consensus: (Query *CpwPDCgKtFrrwsNLrRHMRTH* C++ CGK F ++ ++++H +TH dkfzphtes3 40b CKI--CGKGFYSAKSF(QTHEKTH 42b
(Query f: 434 t: 454 Target: dkfzphtes3_10ilb .1 similarity to ZKl (Homo sapiens)ι complete eds- Alignment to HMM consensus:
HMM *CpwPDCgKtFrrwsNLrRHMRTH*
C+ CGK+F+ +S++R H R+H (Query 434 CK(Q—CGKAFNLSSSFRYHERIH 454 32-14 (bits) f: 4b2 t: 462 Target: dkfzphtes3_10ilb -1 similarity to ZKl (Homo sapiens)ι complete eds-
Alignment to HMM consensus: (Query *CpwPDCgKtFrrwsNLrRHMRTH*
C+ CGK+FR++S+LR H TH dkfzphtes3 4b2 CKiQ—CGKAFRSASlQLRVHGGTH 482
(Query f: 410 t: 510 Target: dkfzphtes3_10ilb ■ 1 similarity to ZKl (Homo sapiens)ι complete eds- Alignment to HMM consensus: HMM *CpwPDCgKtFrrwsNLrRHMRTH*
C++ CGK+FR+ S+LR H RTH (Query 410 CKE--CGKAFRSTSHLRVHGRTH 510
30- bl (bits) f: 518 t: 540 Target", dkfzphtes3_10ilb ■ 1 similarity to ZKl (Homo sapiens)ι complete eds- Alignment to HMM consensus: (Query *CpwPDCgKtFrrwsNLrRHMR- -T-H*
C++ CGK+FR+ +L++H R H dkfzphtes3 518 CKE~CGKAFRYVKHL(QIHERTE-KH 540 ώuery f: 552 t: 572 Target: dkfzphtes3_10ilb ■ 1 similarity to ZKl (Homo sapiens)ι complete eds-
Alignment to HMM consensus: HMM *CpwPDCgKtFrrwsNLrRHMRTH* C++ C+K F ++ ++++H +TH
(Query 552 CSI--CEKGFYSAKSF(QTHEKTH 572
31-33 (bits) f: 580 t: bOO Target: dkfzphtes3_lUϋb- 1 similarity to ZKl (Homo sapiens)ι complete eds- Alignment to HMM consensus:
(Query *CpwPDCgKtFrrwsNLrRHMRTH*
C+ CGK+FR +LR H RTH dkfzphtes3 560 CN(Q—CGKAFRCCNSLRYHERTH bOO (Query f: b06 t: b28 Target: dkfzphtes3_10ilb -1 similarity to ZKl (Homo sapiens)ι complete eds-
Alignment to HMM consensus: HMM *CpwPDCgKtFrrwsNLrRHMRTH*
C+ CGK+FR++S+LR+H RTH (Query bOβ CK(Q--CGKAFRSASHLRMHERTH b28
35-30 (bits) f: b3b t: bSb Target: dkfzphtes3_10ilb - 1 similarity to ZKl (Homo sapiens)ι complete eds- Alignment to HMM consensus: (Query *CpwPDCgKtFrrwsNLrRHMRTH*
C+ CGK+F+ +SNLR+H RTH dkfzphtes3 b3b CK(Q--CGKAFSCASNLRKHGRTH b5b
(Query f : bb4 t: b84 Target: dkfzphtes3_lDϋb • 1 similarity to ZKl (Homo sapiens)ι complete eds-
Alignment to HMM consensus: HMM *CpwPDCgKtFrrwsNLrRHMRTH* C+ CGK+FR++SNL++H RTH
(Query bb4 CK(Q--CGKAFRSASNL(QMHERTH b84
31-74 (bits) f: b12 t: 712 Target: dkfzphtes3_10ilb - 1 similarity to ZKl (Homo sapiens)ι complete eds- Alignment to HMM consensus:
(Query *CpwPDCgKtFrrwsNLrRHMRTH*
C++ C+K+F+ S++++H R H dkfzphtes3 b12 CKE—CEKAFCKFSSFiQIHERKH 712 (Query f: 720 t: 740 Target: dkfzphtes3_10ilb - 1 similarity to ZKl (Homo sapiens)ι complete eds-
Alignment to HMM consensus: HMM *CpwPDCgKtFrrwsNLrRHMRTH*
C++ CG F+++ L++H RTH (Query 720 CKH--CGNGFTSAKIL(QIHARTH 740
34-86 (bits) f: 748 t: 7b8 Target: dkfzphtes3_10ilb - 1 similarity to ZKl (Homo sapiens)ι complete eds- Alignment to HMM consensus: (Query *CpwPDCgKtFrrwsNLrRHMRTH*
C++ CGK+F++ S+L +H RTH dkfzphtes3 748 CKE--CGKAFNYFSSLHIHARTH 7bβ
Pedant information for DKFZphtes3_10ilb, frame 2
Report for DKFZphtes3_10ilb.2
[LENGTH] 294
[MW] 33083.18
[pI] 9.17
[HOMOL] TREMBL:AF153201_1 product: "zinc finger protein dp"; Homo sapiens zinc finger protein dp mRNA, complete cds. 7e-17
[KW] All_Alpha
SElQ MKKLTLERNPMNATNVVKPSDVAIPFDIMKGLTLERNPMSVSNVGKPSDLPHTFECMKGL PRD cccccccccccceeeeecccccchhhhhccccccccccccccccccccccccchhhhhee
SElQ TLERNPMSVSNVGKPSVVPiQTFESMVGLTLERNPMSVSNVGKPSDLPlQTFRCMKGLTLER PRD ecccccccccccccccchhhhhhhhhhhhhccccccccccccccccchhhhhhhhhhhcc
SElQ NPMNVRNAKKHSVNSLLFKYMKGSTEERSPMNVSIVGMDSHLPRFFKYMiQEHTLERNTMN PRD cccccccccccccccccccccccccccccccceeeeecccchhhhhhhhhhhhhhhcccc SE(Q VRNAEKHSIIFLPCIYTlQGLIUERSHMNVRIVGKHSASLVPFMDMNRLTLEGSTMNASNV PRD chhhhhhheeeccceeechhhhhcccceeeeeccccceeeeccchhhhhhhccccccccc
SElQ AKLSHFPVLFDIMKGLTLGRNPINVSSVGKPSFLLLLFNVMKGLTRERNPMSVF PRD cccccccchhhhhhhhcccccccccccccccchhhhhhhhhccccccccccccc
(No Prosite data available for DKFZphtes3_10ilb.2)
(No Pfam data available for DKFZphtes3_10ilb.2)
DKFZphtes3_10nlO
group: testis derived
DKFZphtes3_10nlO encodes a novel 502 amino acid protein without similarity to known proteins. The mRNA is differentially polyadenylated and the novel protein is ubiquitously expressed.
No informative BLAST results; no predictive Prosite, Pfam or SCOP motifs. The new protein can find application in studying the expression profile of testis-specific genes.
unknown protein
differentially polyadenylated
Sequenced by Qiagen
Locus: unknown
Insert length: 2551 bp
Poly A stretch at pos. 2531, polyadenylation signal at pos. 2513
1 CTCAGCCTCC CAAGTGGCTG GGACTGCAGG TTCTAAATGG CTTCTAAGAA
51 GTTGGGTGCA GATTTTCATG GGACTTTCAG TTACCTTGAT GATGTCCCAT
101 TTAAGACAGG AGACAAATTC AAAACACCAG CTAAAGTTGG TCTACCTATT
151 GGCTTCTCCT TGCCTGATTG TTTGCAGGTT GTCAGAGAAG TACAGTATGA 201 CTTCTCTTTG GAAAAGAAAA CCATTGAGTG GGCTGAAGAG ATTAAGAAAA
251 TCGAAGAAGC CGAGCGGGAA GCAGAGTGCA AAATTGCGGA AGCAGAAGCT
301 AAAGTGAATT CTAAGAGTGG CCCAGAGGGC GATAGCAAAA TGAGCTTCTC
351 CAAGACTCAC AGTACAGCCA CAATGCCACC TCCTATTAAC CCCATCCTCG
401 CCAGCTTGCA GCACAACAGC ATCCTCACAC CAACTCGGGT CAGCAGTAGT 451 GCCACGAAAC AGAAAGTTCT CAGCCCACCT CACATAAAGG CGGATTTCAA
501 TCTTGCTGAC TTTGAGTGTG AAGAAGACCC ATTTGATAAT CTGGAGTTAA
551 AAACTATTGA TGAGAAGGAA GAGCTGAGAA ATATTCTGGT AGGAACCACT bOl GGACCCATTA TGGCTCAGTT ATTGGACAAT AACTTGCCCA GGGGAGGCTC bSl TGGGTCTGTG TTACAGGATG AGGAGGTCCT GGCATCCTTG GAACGGGCAA 701 CCCTAGATTT CAAGCCTCTT CATAAACCCA ATGGCTTTAT AACCTTACCA
751 CAGTTGGGCA ACTGTGAAAA GATGTCACTG TCTTCCAAAG TGTCCCTCCC
801 CCCTATACCT GCAGTAAGCA ATATCAAATC CCTGTCTTTC CCCAAACTTG
651 ACTCTGATGA CAGCAATCAG AAGACAGCCA AGCTGGCGAG CACTTTCCAT
101 AGCACATCCT GCCTCCGCAA TGGCACGTTC CAGAATTCCC TAAAGCCTTC 151 CACCCAAAGC AGTGCCAGTG AGCTCAATGG GCATCACACT CTTGGGCTTT
1001 CAGCTTTGAA CTTGGACAGT GGCACAGAGA TGCCAGCCCT GACATCCTCC
1051 CAGATGCCTT CCCTCTCTGT TTTGTCTGTG TGCACAGAGG AATCATCACC
1101 TCCAAATACT GGTCCCACGG TCACCCCTCC TAATTTCTCA GTGTCACAAG
1151 TGCCCAACAT GCCCAGCTGT CCCCAGGCCT ATTCTGAACT GCAGATGCTG 1201 TCCCCCAGCG AGCGGCAGTG TGTGGAGACG GTGGTCAACA TGGGCTACTC
1251 GTACGAGTGT GTCCTCAGAG CCATGAAGAA GAAAGGAGAG AATATTGAGC
1301 AGATTCTCGA CTATCTCTTT GCACATGGAC AGCTTTGTGA GAAGGGCTTC
1351 GACCCTCTTT TAGTGGAAGA GGCTCTGGAA ATGCACCAGT GTTCAGAAGA 1401 AAAGATGATG GAGTTTCTTC AGTTAATGAG CAAATTTAAG GAGATGGGCT
1451 TTGAGCTGAA AGACATTAAG GAAGTTTTGC TATTACACAA CAATGACCAG
1501 GACAATGCTT TGGAAGACCT CATGGCTCGG GCAGGAGCCA GCTGAGACCA
1551 GGCCCTGCCT AGGCCCTGCC GCAGAACCAC CATCCCTGGG AGGCCCTGCA
IbOl GAGCCCACCT GTGGGGAAAG kGk kGGGGCk GCTTCCGGAT TTTCTTTTGG lb51 GGGTTAGAAG GTCAGGTGTG GAGACTGCTC GCCAGTCTCT GTGAGCCTAG
1701 GCCCTGAGCT GGGGAGGTGG GGAAGATTCG GGCATGTGAG TGCCCCCAGA
1751 ACTGTCCTGG CTCCTTCCGT ATTAAACGCA TTTGCATTTT GAGAAGTGTC
IflDl CTTCCCACTT CAGCCCTCCG GAGAGACTAC CCTAGTCTTT CTGGGGTGTT
1851 TATGTCCTCA GCTGAAGCCT GGCCTAGTTG CTGAGAGGGG CTGGGGAGAT
1101 GGGGCGGGkG GGCCAGACTC AGTGCTGCTG TGGAGCTAGG TGCTTCCCCC
1151 TTCCCCTGAG ACTGGTTGAC TGAACTCCAG TCAAGTTGAG TTCAAGTGAA
2001 AGATTCTTCC AGGGTTTTAT TTTTTCCCCT CCTAACAAAG TCTCATAGTG
2051 TTAACACTGG TTCTGCAATA TCTCTGAGGT GCAAAGAATG CACTTTTCCC
2101 TATGGGGCCC AGAGTTTGCC TTTTCTGCCA GGCAGTCACC ACGCTTCCCT
2151 ACCCCAGCCT GTTTCTTTTG GCTTGGTTTG GACCACAGTC CTCTGCTACC
2201 CAGGGTTTTA GAGCCCCTGC TCTAGGAAAC AGTTTAAGAA ATCATTGGCC
2251 CCTTCCCAGC ACATTGAATG GGTAAGCAGA CAGGCCATGA TTTAGTTGGC
2301 CAGCACTAAC TCCACCTCTG TTCTCCTTGA ACAGCTTCCC CTCCAGCCCA
2351 CTGCTTTAGG ATGACACAAT GAATAACACC TAGTCATAGA AATCAGTCTC
2401 TCTGGTTTGT TTTGTATTAT GTTGTACATC ATTAAAGATC TAAATACAAA
2451 GGATATACAG TCTTGAATCT AAAATAATTT GCTAACTATT TTGATTCTTC
25D1 AGAGAGAACT ACTAATAAAA ATCTAAAAGG TAAAAAAAAA AAAAAAAAAA
2551 A
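The polyadenylation signal reported for this insert can be located with a plain hexamer search. The sketch below is illustrative only: it assumes the canonical AATAAA signal, the function name `find_polya_signals` is hypothetical, and only the last 51 bases of the sequence above (bases 2501-2551) are scanned.

```python
import re


def find_polya_signals(sequence, offset=0):
    """Return the zero-based offset of every AATAAA hexamer.

    `offset` is the zero-based offset of the first character of
    `sequence` within the full insert. Overlapping hits are allowed.
    """
    return [m.start() + offset
            for m in re.finditer(r"(?=AATAAA)", sequence.upper())]


# Bases 2501-2551 of the insert, copied from the last two rows above.
TAIL = (
    "AGAGAGAACT" "ACTAATAAAA" "ATCTAAAAGG"
    "TAAAAAAAAA" "AAAAAAAAAA" "A"
)

# Base 2501 has zero-based offset 2500.
print(find_polya_signals(TAIL, offset=2500))
```

The search finds one hexamer at zero-based offset 2513, which agrees with the listed signal position if that position is read as a zero-based offset (an assumption about the listing's convention; in one-based numbering the hexamer starts at base 2514).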
BLAST Results
No BLAST result
Medline entries
No Medline entry
Peptide information for frame 1
ORF from 37 bp to 1542 bp; peptide length: 502
Category: putative protein
Classification: unclassified
1 MASKKLGADF HGTFSYLDDV PFKTGDKFKT PAKVGLPIGF SLPDCLQVVR
51 EVQYDFSLEK KTIEWAEEIK KIEEAEREAE CKIAEAEAKV NSKSGPEGDS
101 KMSFSKTHST ATMPPPINPI LASLQHNSIL TPTRVSSSAT KQKVLSPPHI
151 KADFNLADFE CEEDPFDNLE LKTIDEKEEL RNILVGTTGP IMAQLLDNNL
201 PRGGSGSVLQ DEEVLASLER ATLDFKPLHK PNGFITLPQL GNCEKMSLSS
251 KVSLPPIPAV SNIKSLSFPK LDSDDSNQKT AKLASTFHST SCLRNGTFQN
301 SLKPSTQSSA SELNGHHTLG LSALNLDSGT EMPALTSSQM PSLSVLSVCT
351 EESSPPNTGP TVTPPNFSVS QVPNMPSCPQ AYSELQMLSP SERQCVETVV
401 NMGYSYECVL RAMKKKGENI EQILDYLFAH GQLCEKGFDP LLVEEALEMH
451 QCSEEKMMEF LQLMSKFKEM GFELKDIKEV LLLHNNDQDN ALEDLMARAG
501 AS
BLASTP hits
No BLASTP hits available
Alert BLASTP hits for DKFZphtes3_10nl0, frame 1
No Alert BLASTP hits found
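The start of the frame 1 peptide can be checked against the nucleotide sequence by translating from the first base of the ORF. This is a minimal sketch assuming the standard genetic code; the names `CODON_TABLE` and `translate` are illustrative, and only the first 50-base row of the insert is used.

```python
# Standard genetic code, codons ordered with bases T, C, A, G.
BASES = "TCAG"
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def translate(dna):
    """Translate complete codons of `dna`, stopping at the first stop codon."""
    dna = dna.upper()
    protein = []
    for i in range(0, len(dna) - 2, 3):
        residue = CODON_TABLE[dna[i:i + 3]]
        if residue == "*":
            break
        protein.append(residue)
    return "".join(protein)


# Bases 1-50 of the insert; the ORF is reported to start at base 37.
first_row = "CTCAGCCTCCCAAGTGGCTGGGACTGCAGGTTCTAAATGGCTTCTAAGAA"
orf_start = 37
print(translate(first_row[orf_start - 1:]))
```

The four complete codons available in this row translate to MASK, the first four residues of the peptide.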
Pedant information for DKFZphtes3_10nl0, frame 1
Report for DKFZphtes3_10nlO.1
[LENGTH] 502
[MW] 55063.76
[pI] 5.02
[BLOCKS] PR01083D
[BLOCKS] BL0130bB
[KW] All_Alpha
[KW] LOW_COMPLEXITY 6.57 %
SElQ MASKKLGADFHGTFSYLDDVPFKTGDKFKTPAKVGLPIGFSLPDCLlQVVREVlQYDFSLEK
SEG xx
PRD cccccccccccccccccccccccccccccccccccccccccccchhhhhhhhhhcccchh SElQ KTIEUAEEIKKIEEAEREAECKIAEAEAKVNSKSGPEGDSKMSFSKTHSTATMPPPINPI SEG xxxxxxxxxxxxxxxxxxxxxxxxxxxxx
PRD hhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhcccccccccccccccccccccccccchhh
SElQ LASLiQHNSILTPTRVSSSATKlQKVLSPPHIKADFNLADFECEEDPFDNLELKTIDEKEEL SEG
PRD hhhhcccccccccccccchhhhhcccccchhhhhcccccccccccccccccchhhhhhhh
SElQ RNILVGTTGPIMAiQLLDNNLPRGGSGSVLiQDEEVLASLERATLDFKPLHKPNGFITLPlQL SEG PRD hhhhccccchhhhhhhhcccccccccccchhhhhhhhhhhhhcccccccccccccccccc
SElQ GNCEKMSLSSKVSLPPIPAVSNIKSLSFPKLDSDDSNlQKTAKLASTFHSTSCLRNGTFlQN
SEG
PRD ccccccccccccccccccccccccccccccccccccchhhhhhhhhcccccccccccccc
SElQ SLKPSTtQSSASELNGHHTLGLSALNLDSGTEMPALTSSiQMPSLSVLSVCTEESSPPNTGP
SEG xxxxxx
PRD ccccccccccccccccccccceeecccccccccccccccccceeeeeeeccccccccccc SElQ TVTPPNFSVS(QVPNMPSCP(QAYSEL(QMLSPSER(QCVETVVNMGYSYECVLRAMKKKGENI SEG xxxxxx
PRD cccccccccccccccccccchhhhhhhcccccchhhhhhhccccchhhhhhhhhhccchh
SElQ ElQILDYLFAHGiQLCEKGFDPLLVEEALEMHlQCSEEKMMEFLiQLMSKFKEMGFELKDIKEV SEG
PRD hhhhhhhhhhhccccccccchhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhh
SElQ LLLHNNDiQDNALEDLMARAGAS SEG
PRD hhcccccchhhhhhhhhhhccc
(No Prosite data available for DKFZphtes3_10nlO.1)
(No Pfam data available for DKFZphtes3_10nlO.1)
DKFZphtes3_llal7
group: transmembrane protein
DKFZphtes3_llal7 encodes a novel 428 amino acid protein without similarity to known proteins.
The novel protein contains 2 transmembrane regions and one leucine zipper. The protein is ubiquitously expressed with higher abundance in stomach, brain and testis.
No informative BLAST results; no predictive Prosite, Pfam or SCOP motifs.
The new protein can find application in studying the expression profile of testis-specific genes and as a new marker for testicular cells.
unknown protein
Pedant: TRANSMEMBRANE 2
perhaps differential polyadenylation
Sequenced by Qiagen
Locus: unknown
Insert length: 2591 bp
Poly A stretch at pos. 2570, polyadenylation signal at pos. 2548
1 CTCTCCTGC6 CCCTCTGGAG GAAGTGAGAA GAGTCAGTCC CACCCAGCTG
51 CCGCCTGGTA TCTGGGCTCC AGGCCACCGA GTATTTGGCC CCCAGCCACG
101 GAGCCCTTAG CACACACCTC CCCCACAGGT CCTGGAGATG TGGCTGAGCT
151 ACCTGCAGCC GTGGCGGTAC GCGCCTGACA AGCAGGCTCC GGGCAGCGAC 201 TCCCAGCCCC GGTGTGTGTC GGAGAAATGG GCACCCTTTG TCCAGGAGAA
251 CCTGCTGATG TACACCAAGT TGTTTGTGGG CTTTCTGAAC CGCGCGCTCC
301 GCACAGACCT GGTCAGCCCC AAGCACGCGC TCATGGTGTT CCGAGTGGCC
351 AAAGTCTTTG CCCAGCCCAA CCTGGCTGAG ATGATTCAGA AAGGTGAGCA
401 GCTATTCCTG GAGCCAGAGC TGGTCATCCC CCACCGCCAG CACCGACTCT 451 TCACGGCCCC CACATTCACT GGGAGCTTCC TGTCACCCTG GCCACCAGCG
501 GTCACTGATG CCTCCTTCAA GGTGAAGAGC CACGTCTACA GCCTGGAGGG
551 CCAGGACTGC AAGTACACCC CGATGTTTGG GCCCGAGGCC CGCACCCTGG bOl TCCTGCGCCT CGCTCAGCTC ATCACACAGG CCAAACACAC AGCCAAGTCC b51 ATCTCCGACC AGTGTGCGGA GAGCCCGGCT GGCCACTCCT TCCTCTCATG 701 GCTGGGCTTT AGCTCCATGG ACACCAATGG CTCCTACACA GCCAACGACC
751 TGGACGAGAT GGGGCAAGAC AGTGTCCGGA AGACAGATGA ATACCTGGAG
801 AAGGCCCTGG AGTACCTGCG CCAGATATTC CGGCTCAGCG AAGCGCAGCT
851 CAGGCAGTTC ACACTCGCCT TGGGCACCAC CCAGGATGAG AATGGAAAAA 101 AGCAACTCCC CGACTGCATC GTGGGTGAGG ACGGACTCAT CCTTACGCCC
151 CTGGGGCGGT ACCAGATCAT CAATGGGCTG CGAAGGTTTG AAATTGAGTA
1001 CCAGGGGGAC CCGGAGCTGC AGCCCATCCG GAGCTATGAG ATCGCCAGCT
1051 TGGTCCGCAC ACTCTTTAGG CTGTCGTCTG CCATCAACCA CAGATTTGCA 1101 GGACAGATGG CGGCTCTGTG TTCCCGGGAT GACTTCCTCG GCAGCTTCTG
1151 TCGCTACCAC CTCACAGAAC CTGGGCTGGC CAGCAGGCAC CTGCTGAGCC
1201 CTGTGGGGCG GAGGCAGGTG GCCGGCCACA CCCGCGGCCC CAGGCTCAGC
1251 CTGCGCTTCC TGGGCAGTTA CCGGACGCTG GTCTCGCTGC TGCTGGCCTT
1301 CTTCGTGGCC TCTCTGTTCT GCGTCGGGCC CCTCCCATGC ACGCTGCTGC 1351 TCACCCTGGG CTATGTCCTC TACGCCTCTG CCATGACACT GCTGACCGAG
1401 CGGGGGAAGC TGCACCAGCC CTGAAGGTGT CAGCTGCCTT CAGAGCAGGC
1451 TGGAGGGATT TGCCACACAG CCCCACCCTT GGGCTGAGAG GACCTGGGAA
1501 GCCCCTCCAG GAGGGAACAC GGTCATCCTC GGGCTTCTGG AGCGGGGTTC
1551 CTGCAGCCGC AGAGGCATCT GGAGGAAACG CAACCAAGAA AGGAAGGCAG IbOl GTGGGCCCCA GCAAAGGAGT AGCTGCCAGG GCTCAACAGC TACGCTCTGT lb51 GACAGCGCAG AGCTCAGCGC CGGCCTTTCC CTCCCTCCGC CAAGGACTCA
1701 CGGCCAAGCC AGCTCTCGGG GCCTTTTTTC CAGTGCCCAT TTGGCTACTC
1751 TGCTGCACCA AGCTTGGGAG CCAGCCTGCC AACAGCCACC TGGGCCTGGC
1801 CTCCCCACTG GCTGGCCTTG AGGTTGGCAG AGTGGGTTGT GGCGCTTCCT 1651 CTCTCTGTGT GGGACCAGGA CAGTGGCTTA AGTCTCCACT CCAGGAAAGA
1101 ATCAAAGTTT CTAGAGTTGT GAGAAAACCA GAGAGTGGCT GTCCTGATTC
1151 TTCACTGTGA GGGGCGTTCT TCATGTTCTC CCAGCTGTTC CAAGACTGGG
2001 CCGTAGAATT CCATGTTTCA GGAGCCTAAG ACCCTCCCAG AGCCCAGGGG
2051 CTTCACCGCA GACCCCAAGC CATTGAGCAC ATCACCCAAA GCAGTGGCCA 2101 ACATCGCGGA CCCCTGTGCC TTGTCACAGA TGGGTGCTGG TCCTCAGGCG
2151 TTGGGGACAC TGCTGGGTCG ATGGGGTCGG ATTCTGCCAG TTTCTGCTCT
2201 GCAGCCAAAG ATGGTCAGAA GCATTGTCAC TTCAGTAACA TCAAGTGCTC
2251 AAAGACATGG CAACCGTTCA GTGGTACTTA AGTATTCAAA ATATACAACT
2301 ACAGATTCTC TGACAGAAAC CAGCACGGGG TCTTCACCTT CATTCACCCC 2351 ACAGGCGACA TGCGAGGGAG AACAGCATCT CAGTGGTGAT TTCCAAACCA
2401 AGCCTTTGTT TTCGGTGTGG GGTTTTGGGG GTTTGCTTTA ATGTTTTTGA
2451 AATTGTAAAT GTTGGGCTTT TTATTTTGAT GTAAACTGAG AATAATGGCA
2501 TTTTAGGGCC TGTGACCAAA AATGAAGCTT GTAACGACCA TGGATCTGAA
2551 TAAACATGTC CTTGCTTCTG AAAAAAAAAA AAAAAAAAAA A
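Rows of a numbered sequence listing such as the one above can be joined into one contiguous string before any analysis. The sketch below is an assumption-laden illustration: the function name `join_numbered_rows` is hypothetical, and it simply discards every all-digit token, so it only works on rows whose position markers survived as clean digits.

```python
def join_numbered_rows(rows):
    """Join rows like '2501 TTTTAGGGCC TGTGACCAAA ...' into one sequence.

    Every token made only of digits is treated as a position marker and
    dropped; all remaining tokens are concatenated in order.
    """
    blocks = []
    for row in rows:
        blocks.extend(tok for tok in row.split() if not tok.isdigit())
    return "".join(blocks).upper()


# The last two rows of the listing above.
rows = [
    "2501 TTTTAGGGCC TGTGACCAAA AATGAAGCTT GTAACGACCA TGGATCTGAA",
    "2551 TAAACATGTC CTTGCTTCTG AAAAAAAAAA AAAAAAAAAA A",
]
tail = join_numbered_rows(rows)
print(len(tail), set(tail) <= set("ACGT"))
```

These two rows join to 91 bases starting at base 2501, so the last base of the insert sits at position 2591.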
BLAST Results
Entry AF0S2134 from database EMBLNEU: Homo sapiens clone 23565 mRNA sequence. Score = 57b5ι P = 2-1e-254ι identities = 115S/115b 3' UTR
Medline entries
No Medline entry
Peptide information for frame 3
ORF from 138 bp to 1421 bp; peptide length: 428
Category: putative protein
Classification: Transmembrane proteins unclassified
Prosite motifs: LEUCINE_ZIPPER (404-425)
1 MWLSYLQPWR YAPDKQAPGS DSQPRCVSEK WAPFVQENLL MYTKLFVGFL
51 NRALRTDLVS PKHALMVFRV AKVFAQPNLA EMIQKGEQLF LEPELVIPHR
101 QHRLFTAPTF TGSFLSPWPP AVTDASFKVK SHVYSLEGQD CKYTPMFGPE
151 ARTLVLRLAQ LITQAKHTAK SISDQCAESP AGHSFLSWLG FSSMDTNGSY
201 TANDLDEMGQ DSVRKTDEYL EKALEYLRQI FRLSEAQLRQ FTLALGTTQD
251 ENGKKQLPDC IVGEDGLILT PLGRYQIING LRRFEIEYQG DPELQPIRSY
301 EIASLVRTLF RLSSAINHRF AGQMAALCSR DDFLGSFCRY HLTEPGLASR
351 HLLSPVGRRQ VAGHTRGPRL SLRFLGSYRT LVSLLLAFFV ASLFCVGPLP
401 CTLLLTLGYV LYASAMTLLT ERGKLHQP
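ORF coordinates and peptide length are tied together by simple arithmetic, which makes them easy to cross-check. The sketch below assumes one-based, inclusive coordinates that exclude the stop codon (the convention the listings in this document appear to follow); the function name `orf_peptide_length` is hypothetical.

```python
def orf_peptide_length(start, end):
    """Return the number of codons in an ORF given inclusive 1-based bounds.

    Raises ValueError when the span is empty or not a multiple of three,
    which usually signals a mistyped coordinate.
    """
    n_bases = end - start + 1
    if n_bases <= 0 or n_bases % 3 != 0:
        raise ValueError(f"span {start}-{end} is not a whole number of codons")
    return n_bases // 3


# An ORF running from base 138 to base 1421 covers 1284 bases.
print(orf_peptide_length(138, 1421))
```

For this clone the span 138-1421 gives 428 codons, matching the 428 residues of the peptide; the same check gives 502 for the span 37-1542 and 742 for the span 127-2352.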
BLASTP hits
No BLASTP hits available
Alert BLASTP hits for DKFZphtes3_llal7, frame 3
No Alert BLASTP hits found
Pedant information for DKFZphtes3_llal7, frame 3
Report for DKFZphtes3_llal7.3
[LENGTH] 428
[MW] 46274.13
[pI] 8.12
[PROSITE] LEUCINE_ZIPPER 1
[KW] TRANSMEMBRANE 2
[KW] LOW_COMPLEXITY 7.48 %
SElQ MULSYLlQPURYAPDKiQAPGSDSiQPRCVSEKUAPFVdJENLLMYTKLFVGFLNRALRTDLVS SEG
PRD cccccccccccccccccccccccccccccccchhhhhhhhhhhhhhhhhhhhhhhhhccc
MEM
SElQ PKHALMVFRVAKVFAlQPNLAEMIiQKGEiQLFLEPELVIPHRlQHRLFTAPTFTGSFLSPUPP SEG
PRD cchhhhhhhhhhhhcccchhhhhhhccceeeccceeeccccccccccccccccccccccc
MEM
SElQ AVTDASFKVKSHVYSLEG(QDCKYTPMFGPEARTLVLRLAιQLIT(QAKHTAKSISD(2CAESP SEG
PRD cccccccccccceeeccccccccccccccchhhhhhhhhhhhhhhhcccccccccccccc
MEM - -
SElQ AGHSFLSULGFSSMDTNGSYTANDLDEMGlQDSVRKTDEYLEKALEYLRiQIFRLSEAiQLRiQ SEG
PRD ccceeecccccccccccccccccccccccccccccchhhhhhhhhhhhhhhhhhhhhhhh
MEM SElQ FTLALGTTIQDENGKKIQLPDCIVGEDGLILTPLGRYIQIINGLRRFEIEYIQGDPELIQPIRSY
SEG
PRD hhhhhhccccccccccccceeecccccccccccceeeecchhhhheeecccccccccchh
MEM
SElQ EIASLVRTLFRLSSAINHRFAGiQMAALCSRDDFLGSFCRYHLTEPGLASRHLLSPVGRRlQ
SEG
PRD hhhhhhhhhhhhhhhhhhhhhhhhhhhhhhcccccceeeeeeccccchhhhhcccccccc
MEM
SElQ VAGHTRGPRLSLRFLGSYRTLVSLLLAFFVASLFCVGPLPCTLLLTLGYVLYASAMTLLT
SEG xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
PRD cccccccccccccccccchhhhhhhhhhhhhhhccccccchhhhhhhhhhhhhhhhhhhh
MEM MMMMMMMMMMMMMMMMM MMMMMMMMMMMMMMMMM -
SElQ ERGKLHlQP
SEG
PRD hhhccccc
MEM
Prosite for DKFZphtes3_llal7.3
PS00029 404->426 LEUCINE_ZIPPER PDOC00029
(No Pfam data available for DKFZphtes3_llal7.3)
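The LEUCINE_ZIPPER motif reported for this protein is a simple spacing pattern and can be re-checked directly. The sketch below assumes the commonly documented PROSITE consensus for PS00029, L-x(6)-L-x(6)-L-x(6)-L; the function name `find_leucine_zippers` is hypothetical, and only the C-terminal residues 401-428 of the frame 3 peptide are scanned.

```python
import re

# Assumed PROSITE PS00029 consensus: four leucines spaced seven residues apart.
LEUCINE_ZIPPER = re.compile(r"(?=(L.{6}L.{6}L.{6}L))")


def find_leucine_zippers(protein, offset=1):
    """Return (first, last) residue numbers of every leucine zipper match.

    `offset` is the residue number of the first character of `protein`.
    """
    hits = []
    for match in LEUCINE_ZIPPER.finditer(protein.upper()):
        first = match.start() + offset
        hits.append((first, first + len(match.group(1)) - 1))
    return hits


# Residues 401-428 of the frame 3 peptide.
c_terminus = "CTLLLTLGYVLYASAMTLLTERGKLHQP"
print(find_leucine_zippers(c_terminus, offset=401))
```

The scan reports one match covering residues 404-425, with leucines at positions 404, 411, 418 and 425. A pattern match of this kind is only a sequence-level hint and says nothing about whether the region actually forms a coiled coil.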
DKFZphtes3_llc22
group: signal transduction
DKFZphtes3_llc22 encodes a novel 482 amino acid protein with partial similarity to mouse PC326. The novel protein contains WD-repeats. WD-repeat proteins are known as regulatory elements in a large variety of pathways. The repeats form a propeller-like structure, which serves as a platform for protein/protein interaction. The new protein is ubiquitously expressed, indicating that it performs an essential regulatory function in the cell.
The new protein can find application in modulating/blocking of regulatory pathways.
similarity to mouse PC326
perhaps complete cds.
contains WD-repeats: cf. BLASTX-S37b14
perhaps differential polyadenylation
Sequenced by Qiagen
Locus: /map="1q23.2-24.3"
Insert length: 1952 bp
Poly A stretch at pos. 1932, polyadenylation signal at pos. 1912
1 GAAGCAAGTG AGGTTGCACA AAGCAATAGA GGACGAGGAA GATCTCGACC
51 CAGAGGTGGA ACAAGTCAAT CAGATATTTC AACTCTTCCT ACGGTCCCAT
101 CAAGTCCTGA TTTGGAAGTG AGTGAAACTG CAATGGAAGT AGATACTCCA
151 GCTGAACAAT TTCTTCAGCC TTCTACATCC TCTACAATGT CAGCTCAGGC
201 TCATTCGACA TCATCTCCCA CAGAAAGCCC TCATTCTACT CCTTTGCTAT 251 CTTCTCCAGA TAGTGAACAA AGGCAGTCTG TTGAGGCATC TGGACACCAC
301 ACACATCATC AGTCTGATTC TCCTTCTTCT GTGGTTAACA AACAGCTCGG
351 ATCCATGTCA CTTGACGAGC AACAGGATAA CAATAATGAA AAGCTGAGCC
401 CCAAACCAGG GACAGGTGAA CCAGTTTTAA GTTTGCACTA CAGCACAGAA
451 GGAACAACTA CAAGCACAAT AAAACTGAAC TTTACAGATG AATGGAGCAG 501 TATAGCATCA AGTTCTAGAG GAATTGGGAG CCATTGCAAA TCTGAGGGTC
551 AGGAGGAATC TTTCGTCCCA CAGAGCTCAG TGCAACCACC AGAAGGAGAC bOl AGTGAAACAA AAGCTCCTGA AGAATCATCA GAGGATGTGA CAAAATATCA b51 GGAAGGAGTA TCTGCAGAAA ACCCAGTTGA GAACCATATC AATATAACAC
701 AATCAGATAA GTTCACAGCC AAGCCATTGG ATTCCAACTC AGGAGAAAGA 751 AATGACCTCA ATCTTGATCG CTCTTGTGGG GTTCCAGAAG AATCTGCTTC
801 ATCTGAAAAA GCCAAGGAAC CAGAAACTTC AGATCAGACT AGCACTGAGA
851 GTGCTACCAA TGAAAATAAC ACCAATCCTG AGCCTCAGTT CCAAACAGAA
IDl GCCACTGGGC CTTCAGCTCA TGAAGAAACA TCCACCAGGG ACTCTGCTCT
151 TCAGGACACA GATGACAGTG ATGATGACCC AGTCCTGATC CCAGGTGCAA 10D1 GGTATCGAGC AGGACCTGGT GATAGACGCT CTGCTGTTGC CCGTATTCAG
1051 GAGTTCTTCA GACGGAGAAA AGAAAGGAAA GAAATGGAAG AATTGGATAC
1101 TTTGAACATT AGAAGGCCGC TAGTAAAAAT GGTTTATAAA GGCCATCGCA
1151 ACTCCAGGAC AATGATAAAA GAAGCCAATT TCTGGGGTGC TAACTTTGTA 12D1 ATGAGTGGTT CTGACTGTGG CCACATTTTC ATCTGGGATC GGCACACTGC
1251 TGAGCATTTG ATGCTTCTGG AAGCTGATAA TCATGTGGTA AACTGCCTGC
1301 AGCCACATCC GTTTGACCCA ATTTTAGCCT CATCTGGCAT AGATTATGAC
1351 ATAAAGATCT GGTCACCATT AGAAGAGTCA AGGATTTTTA ACCGAAAACT 14D1 TGCTGATGAA GTTATAACTC GAAACGAACT CATGCTGGAA GAAACTAGAA
1451 ACACCATTAC AGTTCCAGCC TCTTTCATGT TGAGGATGTT GGCTTCACTT
1501 AATCATATCC GAGCTGACCG GTTGGAGGGT GACAGATCAG AAGGCTCTGG
1551 TCAAGAGAAT GAAAATGAGG ATGAGGAATA ATAAACTCTT TTTGGCAAGC
IbOl ACTTAAATGT TCTGAAATTT GTATAAGACA TTTATTATAT TTTTTTCTTT IbSl ACAGAGCTTT AGTGCAATTT TAAGGTTATG GTTTTTGGAG TTTTTCCCTT
1701 TTTTTGGGAT AACCTAACAT TGGTTTGGAA TGATTGTGTG CATGAATTTG
1751 GGAGATTGTA TAAAACAAAA CTAGCAGAAT GTTTTTAAAA CTTTTTGCCG
1801 TGTATGAGGA GTGCTAGAAA ATGCAAAGTG CAATATTTTC CCTAACCTTC
1651 AAATGTGGGA GCTTGGATCA ATGTTGAAGA ATAATTTTCA TCATAGTGAA 1101 AATGTTGGTT CAAATAAATT TCTACACTTG CCAAAAAAAA AAAAAAAAAA 1151 AA
BLAST Results
Entry HS7D2J11 from database EMBL".
Human DNA sequence *** SEQUENCING IN PROGRESS *** from clone 702J11 Score = 2043ι P = 5-8e-252ι identities = 425/445 ID exons matching Bp 31b-1132
Entry HS53bl48 from database EMBL: human STS UI-b347- Score = 1203ι P = l-5e-47ι identities = 247/252
Entry HS7D3H14 from database EMBLNEU:
Human DNA sequence from clone 7D3H14 on chromosome lq23-2-24-3 Score = 13D7ι P = l-le-51ι identities = 2b3/2bS 2 exons matching Bp l-31b
Medline entries
93026383:
Bergsagel PL, Timblin CR, Eckhardt L, Laskov R, Kuehl WM; Sequence and expression of a murine cDNA encoding PC326, a novel gene expressed in plasmacytomas but not normal plasma cells.
Oncogene
1992 Oct;7(10):2059-64
Peptide information for frame 1
ORF from 133 bp to 1576 bpi peptide length: 482 Category: similarity to known protein Classification: Protein management Prosite motifs: MYB_1 (410-418)
1 MEVDTPAEQF LQPSTSSTMS AQAHSTSSPT ESPHSTPLLS SPDSEQRQSV
51 EASGHHTHHQ SDSPSSVVNK QLGSMSLDEQ QDNNNEKLSP KPGTGEPVLS
101 LHYSTEGTTT STIKLNFTDE WSSIASSSRG IGSHCKSEGQ EESFVPQSSV
151 QPPEGDSETK APEESSEDVT KYQEGVSAEN PVENHINITQ SDKFTAKPLD
201 SNSGERNDLN LDRSCGVPEE SASSEKAKEP ETSDQTSTES ATNENNTNPE
251 PQFQTEATGP SAHEETSTRD SALQDTDDSD DDPVLIPGAR YRAGPGDRRS
301 AVARIQEFFR RRKERKEMEE LDTLNIRRPL VKMVYKGHRN SRTMIKEANF
351 WGANFVMSGS DCGHIFIWDR HTAEHLMLLE ADNHVVNCLQ PHPFDPILAS
401 SGIDYDIKIW SPLEESRIFN RKLADEVITR NELMLEETRN TITVPASFML
451 RMLASLNHIR ADRLEGDRSE GSGQENENED EE
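The ORF coordinates, the reading frame and the peptide length given for each clone are tied together by simple arithmetic, which makes it possible to cross-check a listing. The sketch below is an illustration only, not part of the original record. It assumes that the coordinates are 1-based and inclusive and that the stop codon lies outside the reported ORF, and it takes the ORF of this clone to end at position 1578, where the nucleotide listing places the last sense codon (GAA at 1576-1578, followed by TAA). The function names are hypothetical.

```python
def orf_codon_count(start: int, end: int) -> int:
    """Return the number of codons in an ORF.

    Assumes 1-based, inclusive nucleotide coordinates with the stop codon
    excluded (an assumption about how this listing reports ORFs).
    """
    length = end - start + 1
    if length <= 0 or length % 3 != 0:
        raise ValueError(
            f"ORF {start}-{end} spans {length} nt, not a positive multiple of 3"
        )
    return length // 3


def reading_frame(start: int) -> int:
    """Return the forward reading frame (1, 2 or 3) of a 1-based start position."""
    return (start - 1) % 3 + 1


# Clone-level check for the entry above: 482 codons in frame 1.
codons = orf_codon_count(133, 1578)
frame = reading_frame(133)
print(codons, frame)
```

A mismatch between the computed codon count and the stated peptide length usually points to a transcription error in one of the coordinates rather than in the sequence itself.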
BLASTP hits
No BLASTP hits available
Alert BLASTP hits for DKFZphtes3_11c22, frame 1
TREMBLNEU:HS06631_1 gene: "H326", Human (H326) mRNA, complete cds., N = 1, Score = 278, P = 4e-22
PIR:S37614 gene PC326 protein - mouse, N = 1, Score = 265, P = 2.1e-20
PIR:T05676 hypothetical protein F20M13.40 - Arabidopsis thaliana, N = 1, Score = 240, P = 6.3e-18
>TREMBLNEU:HS06631_1 gene: "H326", Human (H326) mRNA, complete cds.
Length = 597
HSPs:
Score = 278 (41.7 bits), Expect = 4.0e-22, P = 4.0e-22
Identities = 63/148 (42%), Positives = 94/148 (63%)
(Query: 335 YKGHRNSRTMIKEANFUG-- ANFVMSGSDCGHIFIUDRHTAEHLMLLEADNH-VVNCLlQP 311
YKGHRN+ T +K NF+G + FV+SGSDCGHIF+U++ + + + +E D
VVNCL+P
Sbjct: 426 YKGHRNNAT-
VKGVNFYGPKSEFVVSGSDCGHIFLUEKSSClQIIiQFMEGDKGGVVNCLEP 4δb
(Query: 312 HPFDPILASSGIDYDIKIUSPLEESRIFNRKLADEVITRNELMLEE-
TRNTITVPASFML 4SD
HP P+LA+SG+D+D+KIU+P E+ L D VI +N+ +E + +
+ S ML Sbjct: 467 HPHLPVLATSGLDHDVKIUAPTAEASTELTGLKD-
VIKKNKRERDEDSLHlQTDLFDSHML 545
(Query: 451 RMLASLNHIRADRLEGD-RSEGSGiQENENEDE 461 L ++H+R R R G G + + DE Sbjct: 54b UFL—MHHLRiQRRHHRRUREPGVGATDADSDE 575
Pedant information for DKFZphtes3_11c22, frame 1
Report for DKFZphtes3_11c22.1
[LENGTH] 482
[MW] 53470.12
[pI] 4.72
[HOMOL] PIR:T04161 hypothetical protein T12J5.10 - Arabidopsis thaliana 2e-22
[FUNCAT] 30.09 organization of intracellular transport vesicles [S. cerevisiae, YDL145c] 4e-05
[FUNCAT] 08.07 vesicular transport (Golgi network, etc.) [S. cerevisiae, YDL145c] 4e-05
[FUNCAT] 99 unclassified proteins [S. cerevisiae, YCL039w] 2e-04
[SUPFAM] WD repeat homology 4e-21
[PROSITE] MYB_1 1
[KW] Alpha_Beta
[KW] LOW_COMPLEXITY 17.01 %
SElQ MEVDTPAEIQFLIQPSTSSTMSAIQAHSTSSPTESPHSTPLLSSPDSEIQRIQSVEASGHHTHHIQ
SEG xxxxxxxxxxxxxxxxxxxxx PRD cccccceeeeecccccceeeeeeccccccccccccceeecccccchhhhhhccccceeec
SElQ SDSPSSVVNKlQLGSMSLDEώiQDNNNEKLSPKPGTGEPVLSLHYSTEGTTTSTIKLNFTDE
SEG
PRD ccccceeeeecccccccccccccccccccccccccccceeeeccccccccccceeeeccc
SElQ USSIASSSRGIGSHCKSEGiQEESFVPiQSSVlQPPEGDSETKAPEESSEDVTKYlQEGVSAEN
SEG -xxxxxxxxxxxx
PRD cccccccccccccccccccceeeeeccccccccccccccccccccccccccccccccccc SElQ PVENHINITiQSDKFTAKPLDSNSGERNDLNLDRSCGVPEESASSEKAKEPETSDlQTSTES
SEG xxxxxxxxxxxxxx • ■ • .xxxxx
PRD ccceeeeeecccccccccccccccccccccccccccccchhhhhhhhccccccccccccc
SElQ ATNENNTNPEPiQFlQTEATGPSAHEETSTRDSALlQDTDDSDDDPVLIPGARYRAGPGDRRS SEG xxxxxxxx xxxxxxxx
PRD cccccccccccceeeeeccccccccccccccccccccccccccccccccccccccccchh
SElQ AVARIiQEFFRRRKERKEMEELDTLNIRRPLVKMVYKGHRNSRTMIKEANFUGANFVMSGS
SEG xxxxxxxxxxxxxx PRD hhhhhhhhhhhhhhhhhhhhhhhhccccceeeeeeccccccceeeeccccccceeeeccc
SElQ DCGHIFIUDRHTAEHLMLLEADNHVVNCLiQPHPFDPILASSGIDYDIKIUSPLEESRIFN
SEG
PRD ccceeeeeecchhhhhhhhhcccceeeecccccccceeecccccceeeecccchhhhhhh
SElQ RKLADEVITRNELMLEETRNTITVPASFMLRMLASLNHIRADRLEGDRSEGSGiQENENED SEG
PRD hhchhhhhhhhhhhhhhhhcceeecchhhhhhhhhhchhhhhhccccccccccccccccc SElQ EE SEG . . PRD cc
Prosite for DKFZphtes3_11c22.1
PS00037 410->419 MYB_1 PDOC00037
(No Pfam data available for DKFZphtes3_11c22.1)
DKFZphtes3_11d21
group: signal transduction
DKFZphtes3_11d21 encodes a novel 922 amino acid protein and contains the full coding sequence of the human Nedd-4-like ubiquitin-protein ligase.
The novel protein contains four WW domains. The WW/rsp5/WWP domain has been shown to bind proteins with particular proline motifs, and thus somewhat resembles SH3 domains. It is frequently associated with other domains typical of proteins in signal transduction processes. Ubiquitin-protein ligase activity has also been reported. The protein is believed to play an important role in protein-degradation pathways.
The new protein can find application in the diagnosis of diseases due to abnormal protein degradation, such as muscular dystrophy or multiple sclerosis, as well as in modulating the half-life of specific proteins and in expression profiling.
similarity to Nedd-4-like ubiquitin-protein ligase (Homo sapiens)
Sequenced by Qiagen
Locus: unknown
Insert length: 3382 bp
Poly A stretch at pos. 3362, polyadenylation signal at pos. 3345
1 ATTTTGGGAC ATGGCCACTG CTTCACCAAG GTCTGATACT AGTAATAACC
51 ACAGTGGAAG GTTGCAGTTA CAGGTAACTG TTTCTAGTGC CAAACTTAAA
101 AGAAAAAAGA ACTGGTTCGG AACAGCAATA TATACAGAAG TAGTTGTAGA
151 TGGAGAAATT ACGAAAACAG CAAAATCCAG TAGTTCTTCT AATCCAAAAT
201 GGGATGAACA GCTAACTGTA AATGTTACGC CACAGACTAC ATTGGAATTT
251 CAAGTTTGGA GCCATCGCAC TTTAAAAGCA GATGCTTTAT TAGGAAAAGC
301 AACGATAGAT TTGAAACAAG CTCTGTTGAT ACACAATAGA AAATTGGAAA
351 GAGTGAAAGA ACAATTAAAA CTTTCCTTGG AAAACAAGAA TGGCATAGCA
401 CAAACTGGTG AATTGACAGT TGTGCTTGAT GGATTGGTGA TTGAGCAAGA
451 AAATATAACA AACTGCAGCT CATCTCCAAC CATAGAAATA CAGGAAAATG
501 GTGATGCCTT ACATGAAAAT GGAGAGCCTT CAGCAAGGAC AACTGCCAGG
551 TTGGCTGTTG AAGGCACGAA TGGAATAGAT AATCATGTAC CTACAAGCAC
601 TCTAGTCCAA AACTCATGCT GCTCGTATGT AGTTAATGGA GACAACACAC
651 CTTCATCTCC GTCTCAGGTT GCTGCCAGAC CCAAAAATAC ACCAGCTCCA
701 AAACCACTCG CATCTGAGCC TGCCGATGAC ACTGTTAATG GAGAATCATC
751 CTCATTTGCA CCAACTGATA ATGCGTCTGT CACGGGTACT CCAGTAGTGT
801 CTGAAGAAAA TGCCTTGTCT CCAAATTGCA CTAGTACTAC TGTTGAAGAT
851 CCTCCAGTTC AAGAAATACT GACTTCCTCA GAAAACAATG AATGTATTCC
901 TTCTACCAGT GCAGAATTGG AATCTGAAGC TAGAAGTATA TTAGAGCCTG
951 ACACCTCTAA TTCTAGAAGT AGTTCTGCTT TTGAAGCAGC CAAATCAAGA
1001 CAGCCAGATG GGTGTATGGA TCCTGTACGG CAGCAGTCTG GGAATGCCAA
1051 CACAGAAACC TTGCCATCAG GGTGGGAACA AAGAAAAGAT CCTCATGGTA
1101 GAACCTATTA TGTGGATCAT AATACTCGAA CTACCACATG GGAGAGACCA
1151 CAACCTTTAC CTCCAGGTTG GGAAAGAAGA GTTGATGATC GTAGAAGAGT
1201 TTATTATGTG GATCATAACA CCAGAACAAC AACGTGGCAG CGGCCTACCA
1251 TGGAATCTGT CCGAAATTTT GAACAGTGGC AATCTCAGCG GAACCAATTG
1301 CAGGGAGCTA TGCAACAGTT TAACCAACGA TACCTCTATT CGGCTTCAAT
1351 GTTAGCTGCA GAAAATGACC CTTATGGACC TTTGCCACCA GGCTGGGAAA
1401 AAAGAGTGGA TTCAACAGAC AGGGTTTACT TTGTGAATCA TAACACAAAA
1451 ACAACCCAGT GGGAAGATCC AAGAACTCAA GGCTTACAGA ATGAAGAACC
1501 CCTGCCAGAA GGCTGGGAAA TTAGATATAC TCGTGAAGGT GTAAGGTACT
1551 TTGTTGATCA TAACACAAGA ACAACAACAT TCAAAGATCC TCGCAATGGG
1601 AAGTCATCTG TAACTAAAGG TGGTCCACAA ATTGCTTATG AACGCGGCTT
1651 TAGGTGGAAG CTTGCTCACT TCCGTTATTT GTGCCAGTCT AATGCACTAC
1701 CTAGTCATGT AAAGATCAAT GTGTCCCGGC AGACATTGTT TGAAGATTCC
1751 TTCCAACAGA TTATGGCATT AAAACCCTAT GACTTGAGGA GGCGCTTATA
1801 TGTAATATTT AGAGGAGAAG AAGGACTTGA TTATGGTGGC CTAGCGAGAG
1851 AATGGTTTTT CTTGCTTTCA CATGAAGTTT TGAACCCAAT GTATTGCTTA
1901 TTTGAGTATG CGGGCAAGAA CAACTATTGT CTGCAGATAA ATCCAGCATC
1951 AACCATTAAT CCAGACCATC TTTCATACTT CTGTTTCATT GGTCGTTTTA
2001 TTGCCATGGC ACTATTTCAT GGAAAGTTTA TCGATACTGG TTTCTCTTTA
2051 CCATTCTACA AGCGTATGTT AAGTAAAAAA CTTACTATTA AGGATTTGGA
2101 ATCTATTGAT ACTGAATTTT ATAACTCCCT TATCTGGATA AGAGATAACA
2151 ACATTGAAGA ATGTGGCTTA GAAATGTACT TTTCTGTTGA CATGGAGATT
2201 TTGGGAAAAG TTACTTCACA TGACCTGAAG TTGGGAGGTT CCAATATTCT
2251 GGTGACTGAG GAGAACAAAG ATGAATATAT TGGTTTAATG ACAGAATGGC
2301 GTTTTTCTCG AGGAGTACAA GAACAGACCA AAGCTTTCCT TGATGGTTTT
2351 AATGAAGTTG TTCCTCTTCA GTGGCTACAG TACTTCGATG AAAAAGAATT
2401 AGAGGTTATG TTGTGTGGCA TGCAGGAGGT TGACTTGGCA GATTGGCAGA
2451 GAAATACTGT TTATCGACAT TATACAAGAA ACAGCAAGCA AATCATTTGG
2501 TTTTGGCAGT TTGTGAAAGA GACAGACAAT GAAGTAAGAA TGCGACTATT
2551 GCAGTTCGTC ACTGGAACCT GCCGTTTACC TCTAGGAGGA TTTGCTGAGC
2601 TCATGGGAAG TAATGGGCCT CAAAAGTTTT GCATTGAAAA AGTTGGCAAA
2651 GACACTTGGT TACCAAGAAG CCATACATGT TTTAATCGCT TGGATCTACC
2701 ACCATATAAG AGTTATGAAC AACTAAAGGA AAAACTTCTT TTTGCAATAG
2751 AAGAGACAGA GGGATTTGGA CAAGAATGAA TGTGGCTTCT TATTTTGGAG
2801 GAGCTCTTGC ATTTAAATAC CCCAGCCAAG AAAAATTGCA CAGATAGTGT
2851 ATATAAGCTG TTCATTCTGT ACAGTGAATT TTCCGAACCT CTCAAAGTAT
2901 GTTTTCCGTT CTTCCACAGA AATATGCAAA ACAGTTCATC CTTTTCTACT
2951 TTATTTATTG TTCCCTTGAA ATGACTGACC AGGAAAAAGA TCATCCTTAA
3001 ATTTTGAAGC AAGTGAGAGA CTTTATTAAA AATACATATA TATCTATATA
3051 AACATATATG ATAGTGGCTC TAGTTTTATA GAGCTCCAAG TGTATTAAAC
3101 ATGACAGCCA TTCATTCATA AAGATCTGGA TTTGCTTTAC CTTGTTAATA
3151 TTATCTAGGG GAAAAAGTGC AAATTGCTCC ATGTTCTTCT CTCCCTTATG
3201 TAACATCTCC TGAGGGTGTT TAGTTGCATG GCTGTTCAGA AAGGTATTAA
3251 GGGCTTAGGC CAAATCTTAC TTTGAGTATG TTAAAAAAAA AAAAATGCTG
3301 CTGGCTTTTC TGAAGACAGG TGCTTGAACT TGTCAGTTTG TTTTAAATAA
3351 ATACAATAGT TGAAAAAAAA AAAAAAAAAA AA
BLAST Results
No BLAST result
Medline entries
97313427:
Pirozzi G, McConnell SJ, Uveges AJ, Carter JM, Sparks AB, Kay BK, Fowlkes DM. Identification of novel human WW domain-containing proteins by cloning of ligand targets. J Biol Chem 1997 Jun 6;272(23):14611-6
Peptide information for frame 2
ORF from 11 bp to 2776 bp; peptide length: 922
Category: known protein
Classification: Protein management
Prosite motifs: WW_DOMAIN_1 (355-380)
WW_DOMAIN_1 (387-412)
WW_DOMAIN_1 (462-487)
WW_DOMAIN_1 (502-527)
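The four WW domain positions listed here (355-380, 387-412, 462-487 and 502-527) are the kind of result a signature scan over the peptide produces. The sketch below is an illustration only and is not the tool used for this record. It assumes the PROSITE WW signature is W-x(9,11)-[VFY]-[FYW]-x(6,7)-[GSTNE]-[GSTQCR]-[FYW]-x(2)-P, which is written from memory of entry PS01159 and should be checked against the current PROSITE release. The segment is residues 351-412 of the frame 2 peptide with the OCR damage repaired (W for U, Q for the garbled Q glyphs); the names `WW_SIGNATURE`, `SEGMENT` and `find_ww` are hypothetical.

```python
import re

# WW signature as a regular expression. The pattern is an assumption written
# from memory of PROSITE PS01159, not taken from the source document.
WW_SIGNATURE = re.compile(
    r"W.{9,11}[VFY][FYW].{6,7}[GSTNE][GSTQCR][FYW].{2}P"
)

# Residues 351-412 of the frame 2 peptide, OCR damage repaired.
SEGMENT_START = 351
SEGMENT = (
    "LPSGWEQRKDPHGRTYYVDHNTRTTTWERP"
    "QPLPPGWERRVDDRRRVYYVDHNTRTTTWQRP"
)


def find_ww(segment: str, offset: int) -> list[tuple[int, int]]:
    """Return 1-based inclusive (start, end) peptide positions of each match."""
    return [
        (m.start() + offset, m.end() + offset - 1)
        for m in WW_SIGNATURE.finditer(segment)
    ]


hits = find_ww(SEGMENT, SEGMENT_START)
print(hits)  # [(355, 380), (387, 412)]
```

The two matches in this segment coincide with the first two motif positions reported for the clone; scanning the full 922-residue peptide in the same way would be the natural next step.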
1 MATASPRSDT SNNHSGRLQL QVTVSSAKLK RKKNWFGTAI YTEVVVDGEI
51 TKTAKSSSSS NPKWDEQLTV NVTPQTTLEF QVWSHRTLKA DALLGKATID
101 LKQALLIHNR KLERVKEQLK LSLENKNGIA QTGELTVVLD GLVIEQENIT
151 NCSSSPTIEI QENGDALHEN GEPSARTTAR LAVEGTNGID NHVPTSTLVQ
201 NSCCSYVVNG DNTPSSPSQV AARPKNTPAP KPLASEPADD TVNGESSSFA
251 PTDNASVTGT PVVSEENALS PNCTSTTVED PPVQEILTSS ENNECIPSTS
301 AELESEARSI LEPDTSNSRS SSAFEAAKSR QPDGCMDPVR QQSGNANTET
351 LPSGWEQRKD PHGRTYYVDH NTRTTTWERP QPLPPGWERR VDDRRRVYYV
401 DHNTRTTTWQ RPTMESVRNF EQWQSQRNQL QGAMQQFNQR YLYSASMLAA
451 ENDPYGPLPP GWEKRVDSTD RVYFVNHNTK TTQWEDPRTQ GLQNEEPLPE
501 GWEIRYTREG VRYFVDHNTR TTTFKDPRNG KSSVTKGGPQ IAYERGFRWK
551 LAHFRYLCQS NALPSHVKIN VSRQTLFEDS FQQIMALKPY DLRRRLYVIF
601 RGEEGLDYGG LAREWFFLLS HEVLNPMYCL FEYAGKNNYC LQINPASTIN
651 PDHLSYFCFI GRFIAMALFH GKFIDTGFSL PFYKRMLSKK LTIKDLESID
701 TEFYNSLIWI RDNNIEECGL EMYFSVDMEI LGKVTSHDLK LGGSNILVTE
751 ENKDEYIGLM TEWRFSRGVQ EQTKAFLDGF NEVVPLQWLQ YFDEKELEVM
801 LCGMQEVDLA DWQRNTVYRH YTRNSKQIIW FWQFVKETDN EVRMRLLQFV
851 TGTCRLPLGG FAELMGSNGP QKFCIEKVGK DTWLPRSHTC FNRLDLPPYK
901 SYEQLKEKLL FAIEETEGFG QE
BLASTP hits
No BLASTP hits available
Alert BLASTP hits for DKFZphtes3_11d21, frame 2
No Alert BLASTP hits found
Pedant information for DKFZphtes3_11d21, frame 2
Report for DKFZphtes3_11d21.2
[LENGTH] 925
[MW] 105650.58
[pI] 5.60
[HOMOL] TREMBL:HSU96113_1 product: "WWP1"; Homo sapiens Nedd-4-like ubiquitin-protein ligase WWP1 mRNA, partial cds. 0.0
[FUNCAT] 30.02 organization of plasma membrane [S. cerevisiae, YER125w] 1e-141
[FUNCAT] 11.01 stress response [S. cerevisiae, YER125w] 1e-141
[FUNCAT] 06.13.01 cytoplasmic degradation [S. cerevisiae, YER125w] 1e-141
[FUNCAT] 03.10 sporulation and germination [S. cerevisiae, YER125w] 1e-141
[FUNCAT] 06.07 protein modification (glycosylation, acylation, myristylation, palmitylation, farnesylation and processing) [S. cerevisiae, YER125w] 1e-141
[FUNCAT] 03.22 cell cycle control and mitosis [S. cerevisiae, YDR457w] 1e-78
[FUNCAT] 99 unclassified proteins [S. cerevisiae, YJR036c] 7e-31
[FUNCAT] 30.03 organization of cytoplasm [S. cerevisiae, YKL010c] 6e-21
[FUNCAT] 30.10 nuclear organization [S. cerevisiae, YKL012w] 6e-05
[FUNCAT] 04.05.03 rna processing (splicing) [S. cerevisiae, YKL012w] 6e-05
[FUNCAT] 30.01 organization of cell wall [S. cerevisiae, YIR019c] 3e-04
[FUNCAT] 30.90 extracellular/secretion proteins [S. cerevisiae, YIR019c] 3e-04
[FUNCAT] 01.05.01 carbohydrate utilization [S. cerevisiae, YIR019c] 3e-04
[BLOCKS] BP03746E
[BLOCKS] BP03761G
[BLOCKS] BL00514E Fibrinogen beta and gamma chains C-terminal domain proteins
[BLOCKS] PR00731B
[BLOCKS] BP01566C
[BLOCKS] BL01159 WW/rsp5/WWP domain proteins
[BLOCKS] PR00403B
[BLOCKS] PR00403A
[BLOCKS] PF00632B
[BLOCKS] PF00632A
[EC] 6.3.2.19 Ubiquitin-protein ligase 1e-151
[PIRKW] ligase 1e-151
[PIRKW] transmembrane protein 2e-37
[PIRKW] leucine zipper 2e-26
[SUPFAM] WW repeat homology 1e-151
[SUPFAM] WD repeat homology 2e-26
[SUPFAM] ubiquitin ligase homolog 1e-151
[PROSITE] WW_DOMAIN_1 4
[PFAM] WW/rsp5/WWP domain containing proteins
[PFAM] C2 domain
[KW] Alpha_Beta
[KW] LOW_COMPLEXITY 1.41 %
SElQ FUDMATASPRSDTSNNHSGRLlQLiQVTVSSAKLKRKKNUFGTAIYTEVVVDGEITKTAKSS SEG
PRD ccccccccccccccccccceeeeeehhhhhhhhhhhhccccceeeeeeeccccceeeecc SElQ SSSNPKUDE(QLTVNVTP(QTTLEF(QVUSHRTLKADALLGKATIDLK(QALLIHNRKLERVKE
SEG
PRD ccccccccceeeeeccccceeeeeeecchhhhhhhhhhhhhhhhhhhhhhhchhhhhhhh SElQ (QLKLSLENKNGIAlQTGELTVVLDGLVIElQENITNCSSSPTIEIiQENGDALHENGEPSART
SEG
PRD hhhhhhcccccccccceeeeeecceeeeeeeccccccccceeeecccccccccccccchh
SElQ TARLAVEGTNGIDNHVPTSTLVlQNSCCSYVVNGDNTPSSPSlQVAARPKNTPAPKPLASEP SEG
PRD hhhhhhcccccccccccccccccccccccccccccccccccccccccccccccccccccc
SElQ ADDTVNGESSSFAPTDNASVTGTPVVSEENALSPNCTSTTVEDPPViQEILTSSENNECIP
SEG PRD cccccccccccccccccceeeccccccccccccccccccccccccccccccccccccccc
SElQ STSAELESEARSILEPDTSNSRSSSAFEAAKSRlQPDGCMDPVRiQlQSGNANTETLPSGUEiQ
SEG
PRD ccccccccceeeeccccccccccccccccccccccccccccccccccccccccccccccc
SElQ RKDPHGRTYYVDHNTRTTTUERPiQPLPPGUERRVDDRRRVYYVDHNTRTTTUlQRPTMESV
SEG xxxxxxxxxxxxx
PRD ccccccceeeecccccccccccccccccccccccccccceeeeecccccccccccccccc SElQ RNFEIQUIQSIQRNIQLIQGAMIQIQFNIQRYLYSASMLAAENDPYGPLPPGUEKRVDSTDRVYFVNH
SEG
PRD hhhhhhhhhhhhhhhhhhhcccccccccccccccccccccccccceeeeccccceeeeec
SElQ NTKTTiQUEDPRTiQGLlQNEEPLPEGUEIRYTREGVRYFVDHNTRTTTFKDPRNGKSSVTKG SEG
PRD ccceeeecccccccccccccccccceeeeecccceeeeeccceeeeeccccccccccccc
SElQ GPIQIAYERGFRUKLAHFRYLCIQSNALPSHVKINVSRIQTLFEDSFIQIQIMALKPYDLRRRLY
SEG PRD cccccchhhhhhhhhhhhhhhhhcccccceeeeehhhhhhhhhhhhhhhhcchhhhhhhh ElQ VIFRGEEGLDYGGLAREUFFLLSHEVLNPMYCLFEYAGKNNYCLlQINPASTINPDHLSYF
SEG
PRD hhhccccccccccchhhhhhhhhhhccccccceeeeecccceeeeecccccccccceeee
SElQ CFIGRFIAMALFHGKFIDTGFSLPFYKRMLSKKLTIKDLESIDTEFYNSLIUIRDNNIEE
SEG
PRD hhhhhhhhhhhhhhhccccccchhhhhhhhhhhhhhcccccccchhhhhheeeeeccccc SElQ CGLEMYFSVDMEILGKVTSHDLKLGGSNILVTEENKDEYIGLMTEURFSRGVIQEIQTKAFL
SEG
PRD chhhhhhhhhccccceeeeeeeccccceeeeeeccchhhhhhhhhhhhhhhhhhhhhhhh
SElQ DGFNEVVPLiQULiQYFDEKELEVMLCGMiQEVDLADUiQRNTVYRHYTRNSKiQIIUFUiQFVKE SEG
PRD hhhhhcccccchhhhhhhhhhhhhhcccccccccccccccccccccchhhhhhhhhhhhh
SEώ TDNEVRMRLLlQFVTGTCRLPLGGFAELMGSNGPiQKFCIEKVGKDTULPRSHTCFNRLDLP
SEG PRD hchhhhhhhhhhhccccccccccceeeecccccceeeeeecccccccccccccccccccc
SElQ PYKSYElQLKEKLLFAIEETEGFGiQE
SEG PRD cccchhhhhhhhhhhhhhccccccc
Prosite for DKFZphtes3_11d21.2
PS01159 358->384 WW_DOMAIN_1 PDOC50020
PS01159 390->416 WW_DOMAIN_1 PDOC50020
PS01159 465->491 WW_DOMAIN_1 PDOC50020
PS01159 505->531 WW_DOMAIN_1 PDOC50020
Pfam for DKFZphtes3_11d21.2
HMM_NAME C2 domain
HMM *LtVrIIeARNLUkMDMnGf SDPYVKVdMdPdpkDtkKUKTkTi UNN - GL
L V++ +A+ +K++++G+ Y +V +D++ TKT
+++ +
(Query 23 LiQVTVSSAKLKRKKNUFGTA-IYTEVVVDGE
ITKTAKSSSSS b3
HMM NPVUNEEeFvFedlPyPd l qrkMLRFaVUDUDRFSRBDFIGHCi *
NP U+ E+++ + + + L+F+VU + ++ + ++G ++
(Query b4 NPKUD-EiQLTVN VTPiQTT — LEFώVUSHRTLKADALLGKAT
IDl
HMM_NAME UU/rsp5/UUP domain containing proteins HMM *LPsGUEeHUDpsGRpUYYUNHETkTT(QUEpP*
LPSGUE+++DP GR+ YY++H+T+TT+UE+P (Query 354 LPSGUEiQRKDPHGRT-YYVDHNTRTTTUERP 383
50-01 36b 415 1 31 dkfzphtes3_lld21 - 2 similarity to Nedd-4-like ubiquitin-protein ligase (Homo sapiens) Alignment to HMM consensus: (Query *LPsGUEeHUDpsGRpUYYUNHETkTT<QUEpP*
LP+GUE++ D+ R YY++H+T+TT+U++P dkfzphtes3 38b LPPGUERRVDDRRRV-YYVDHNTRTTTUiQRP 415
(Query 410 1 31 dkfzphtes3_lld21 ■ 2 similarity to
Nedd-4-like ubiquitin-protein ligase (Homo sapiens)
Alignment to HMM consensus: HMM *LPsGUEeHUDpsGRpUYYUNHETkTT(QUEpP* LP+GUE++ D + R Y++NH+TKTT(QUE+P
(Query 4bl LPPGUEKRVDSTDRV-YFVNHNTKTTώUEDP 41D
38-b2 SD1 53D 1 31 dkfzphtes3_lld21 - 2 similarity to Nedd-4-like ubiquitin-protein ligase (Homo sapiens) Alignment to HMM consensus:
(Query *LPsGUEeHUDpsGRpUYYUNHETkTT(QUEpP*
LP GWE +++ +G + Y+++H+T+TT+ ++P
dkfzphtes3 501 LPEGWEIRYTREGVR-YFVDHNTRTTTFKDP 530
DKFZphtes3_11e17
group: testis derived
DKFZphtes3_11e17 encodes a novel 573 amino acid protein without similarity to known proteins. No informative BLAST results; no predictive Prosite, Pfam or SCOP motifs.
The new protein can find application in studying the expression profile of testis-specific genes.
unknown protein
Sequenced by Qiagen
Locus: unknown
Insert length: 2102 bp
Poly A stretch at pos. 2080, polyadenylation signal at pos. 2059
1 GGCCTGGGGG GCTTCCCTGG GGGGCTTGTC GCCGGGGCCG CCTGGGCTTT
51 CAGGTCTTCC GAGGCTGACA TTCACGTTTC ATTCTGCCAC ACTCGGGAAC
101 GGTGATCGGG GAAGCATGGG GATCCGGGAG AAGCACCCAC AAAACTAGCA
151 TCCTCCTGGA GGAGCTCGGG AATAGGATGA GTGATAATCC ACCCAGAATG
201 GAAGTGTGTC CTTACTGTAA GAAGCCATTT AAACGATTAA AATCCCACTT
251 GCCATACTGT AAGATGATAG GATCAACCAT ACCTACTGAT CAAAAAGTTT
301 ATCAGTCCAA GCCAGCTACA CTCCCACGTG CTAAAAAGAT GAAAGGACCA
351 ATCAAAGATT TAATTAAAGC TAAAGGGAAA GAGTTAGAGA CAGAGAATGA
401 AGAAAGAAAT TCTAAGTTGG TGGTGGACAA ACCAGAACAG ACAGTGAAGA
451 CCTTTCCACT GCCAGCTGTT GGTTTGGAAA GAGCAGCTAC TACAAAGGCA
501 GATAAAGACA TCAAGAATCC AATCCAACCA TCCTTCAAAA TGTTAAAAAA
551 TACTAAACCA ATGACTACTT TCCAAGAAGA AACCAAGGCT CAGTTTTACG
601 CATCAGAGAA AACCTCTCCT AAAAGAGAAC TTGCCAAAGA TTTGCCTAAA
651 TCAGGAGAAA GTCGATGTAA TCCTTCAGAA GCTGGAGCGT CTTTACTGGT
701 TGGCTCAATA GAACCTTCTT TGTCAAATCA AGATAGAAAA TATTCCTCAA
751 CTCTACCTAA TGATGTACAA ACTACCTCTG GTGATCTCAA ATTGGACAAA
801 ATTGATCCCC AAAGACAGGA ACTTCTAGTA AAATTACTAG ATGTGCCTAC
851 TGGTGATTGT CATATTTCTC CAAAGAATGT CAGTGATGGG GTTAAAAGGG
901 TAAGAACATT ATTAAGCAAT GAGAGAGATT CCAAAGGCAG GGATCACCTC
951 TCAGGAGTCC CTACTGATGT TACAGTTACT GAGACTCCAG AAAAGAACAC
1001 AGAATCCCTC ATTTTAAGCC TTAAAATGAG CTCATTAGGT AAAATCCAAG
1051 TCATGGAGAA ACAAGAGAAA GGACTTACCC TGGGAGTAGA GACGTGTGGG
1101 AGCAAAGGAA ATGCAGAGAA AAGTATGTCT GCAACAGAAA AGCAGGAACG
1151 GACTGTCATG AGCCATGGCT GTGAGAACTT CAACACCAGG GATTCAGTCA
1201 CAGGAAAGGA GTCTCAAGGG GAAAGACCAC ATTTAAGTTT GTTCATTCCG
1251 AGGGAGACGA CTTACCAGTT TCATTCTGTA TCGCAGTCAA GTAGTCAAAG
1301 TCTTGCCTCT CTAGCTACAA CATTTCTTCA AGAAAAGAAA GCAGAAGCCC
1351 AGAATCATAA TTGTGTCCCT GATGTAAAGG CATTAATGGA GAGTCCCGAG
1401 GGACAGTTAT CTCTGGAGCC CAAATCTGAT AGTCAGTTCC AAGCATCACA
1451 CACTGGGTGC CAGAGCCCTT TATGTTCAGC CCAGCGTCAC ACTCCTCAGA
1501 GCCCCTTCAC CAATCATGCT GCAGCTGCTG GCAGGAAGAC TCTTCGCAGC
1551 TGCATGGGGC TGGAGTGGTT TCCAGAGCTC TATCCTGGTT ACCTTGGACT
1601 AGGGGTGTTG CCAGGGAAGC CTCAGTGTTG GAATGCAATG ACCCAGAAGC
1651 CACAACTTAT CAGTCCCCAG GGGGAAAGAC TCTCACAAGG CTGGATCAGG
1701 TGCAACACCA CCATAAGGAA GAGTGGATTC GGTGGCATCA CTATGCTCTT
1751 CACAGGATAC TTCGTCCTGT GTTGTAGCTG GAGTTTCAGA CGTCTGAAAA
1801 AATTGTGCCG ACCCCTGCCC TGGAAGAGCA CAGTACCTCC ATGCATTGGT
1851 GTGGCGAAGA CGACTGGGGA TTGCCGCTCT AAAACATGTT TGGATTAGGA
1901 AGCACGTTTA AGTAGGAGAA GCCTTCGTGA CTTCTCTCTA GTGCCTTCGT
1951 GCCCTGTGTT GCCCACTGAA TTGCCCTGTA ACACCTAAGT GTAGTGGTAG
2001 CATTAAGGGA TAGCTTTTCA GCCCTCAAGG TTATCAGGAG CATTTGTATC
2051 ACTGCTATAA ATAAAGTAGT ATCACTTGTC ATAAAAAAAA AAAAAAAAAA
2101 AA
BLAST Results
No BLAST result
Medline entries
No Medline entry
Peptide information for frame 3
ORF from 177 bp to 1895 bp; peptide length: 573
Category: putative protein
Classification: no clue
1 MSDNPPRMEV CPYCKKPFKR LKSHLPYCKM IGSTIPTDQK VYQSKPATLP
51 RAKKMKGPIK DLIKAKGKEL ETENEERNSK LVVDKPEQTV KTFPLPAVGL
101 ERAATTKADK DIKNPIQPSF KMLKNTKPMT TFQEETKAQF YASEKTSPKR
151 ELAKDLPKSG ESRCNPSEAG ASLLVGSIEP SLSNQDRKYS STLPNDVQTT
201 SGDLKLDKID PQRQELLVKL LDVPTGDCHI SPKNVSDGVK RVRTLLSNER
251 DSKGRDHLSG VPTDVTVTET PEKNTESLIL SLKMSSLGKI QVMEKQEKGL
301 TLGVETCGSK GNAEKSMSAT EKQERTVMSH GCENFNTRDS VTGKESQGER
351 PHLSLFIPRE TTYQFHSVSQ SSSQSLASLA TTFLQEKKAE AQNHNCVPDV
401 KALMESPEGQ LSLEPKSDSQ FQASHTGCQS PLCSAQRHTP QSPFTNHAAA
451 AGRKTLRSCM GLEWFPELYP GYLGLGVLPG KPQCWNAMTQ KPQLISPQGE
501 RLSQGWIRCN TTIRKSGFGG ITMLFTGYFV LCCSWSFRRL KKLCRPLPWK
551 STVPPCIGVA KTTGDCRSKT CLD
BLASTP hits
No BLASTP hits available
Alert BLASTP hits for DKFZphtes3_11e17, frame 3
No Alert BLASTP hits found
Pedant information for DKFZphtes3_11e17, frame 3
Report for DKFZphtes3_11e17.3
[LENGTH] 573
[MW] 63381.66
[pI] 9.24
[BLOCKS] BL00028 Zinc finger, C2H2 type, domain proteins
[KW] Alpha_Beta
[KW] LOW_COMPLEXITY 7.50 %
SElQ MSDNPPRMEVCPYCKKPFKRLKSHLPYCKMIGSTIPTDIQKVYIQSKPATLPRAKKMKGPIK
SEG PRD ccccccceeecccccchhhhhhhcccceeeeccccccceeeeeccccchhhhhhhcccch
SElQ DLIKAKGKELETENEERNSKLVVDKPEiQTVKTFPLPAVGLERAATTKADKDIKNPKQPSF
SEG
PRD hhhhhhcccchhhhhhhhheeeeccccceeecccccchhhhhhhhhhhcccccccccchh
SElQ KMLKNTKPMTTFlQEETKAiQFYASEKTSPKRELAKDLPKSGESRCNPSEAGASLLVGSIEP
SEG
PRD hhhhcccccchhhhhhhhhhhhhcccccchhhhhccccccccccccccchhhhhhhcccc SElQ SLSNlQDRKYSSTLPNDViQTTSGDLKLDKIDPlQRiQELLVKLLDVPTGDCHISPKNVSDGVK
SEG
PRD ccccccceeecccccccccccccccccccccchhhhhhhhhccccccccccccccccchh
SEώ RVRTLLSNERDSKGRDHLSGVPTDVTVTETPEKNTESLILSLKMSSLGKIiQVMEKiQEKGL SEG xxxxxxxxxxxx
PRD hhhhhhhhcccccccccccccccceeeeeccccchhhhhhhhhhccccchhhhhhhhccc
SElQ TLGVETCGSKGNAEKSMSATEKiQERTVMSHGCENFNTRDSVTGKESiQGERPHLSLFIPRE
SEG PRD eeeeecccccccchhhhhhhhhhhhhhhcccccccccccccccccccccccceeeeeccc
SE(Q TTY(QFHSVS(QSSS(QSLASLATTFL<QEKKAEA<QNHNCVPDVKALMESPEG<QLSLEPKSDS(Q
SEG xxxxxxxxxxxxxx
PRD eeeeeeccccccchhhhhhhhhhhhhhhhhhhccccccchhhhhcccccccccccccccc
SElQ F(QASHTGC(QSPLCSAfQRHTP(QSPFTNHAAAAGRKTLRSCMGLEUFPELYPGYLGLGVLPG
SEG xxxxxxxxxxxxxxx
PRD cccccccccccccccccccccccccchhhhhcchhhhhhccccccccccccccceeeccc SElQ KP(2CUNAMT(QKP(QLISP(QGERLS(QGUIRCNTTIRKSGFGGITMLFTGYFVLCCSUSFRRL
SEG xx
PRD ccccccccccccccccccccchhhhhccccceeeecccccceeeecceeeeeecchhhhh
SElQ KKLCRPLPUKSTVPPCIGVAKTTGDCRSKTCLD SEG
PRD hhhccccccccccccceeeeecccccccccccc
(No Prosite data available for DKFZphtes3_11e17.3)
(No Pfam data available for DKFZphtes3_11e17.3)
DKFZphtes3_12d18
group: testis derived
DKFZphtes3_12d18 encodes a novel 1170 amino acid protein without similarity to known proteins. The EST distribution indicates a ubiquitous expression pattern. No informative BLAST results; no predictive Prosite, Pfam or SCOP motifs.
The new protein can find application in studying the expression profile of testis-specific genes.
unknown protein, perhaps complete cds.
Sequenced by Qiagen
Locus: /map="136.1 cR from top of Chr13 linkage group"
Insert length: 5469 bp
Poly A stretch at pos. 5449, polyadenylation signal at pos. 5420
1 AAGGACAGAG GACGAGATTT TGAACGACAA AGAGAAAAGA GAGACAAGCC
51 AAGGTCTACT TCCCCAGCAG GACAGCATCA TTCTCCTATA TCTTCTAGAC
101 ATCACTCATC TTCCTCACAA TCAGGATCAT CTATTCAAAG ACATTCTCCT
151 TCTCCTCGTC GAAAAAGAAC TCCTTCACCA TCTTATCAGC GGACACTAAC
201 TCCACCTTTA CGACGCTCTG CCTCTCCTTA TCCTTCACAT TCTTTGTCGT
251 CTCCCCAGAG AAAGCAGAGT CCTCCAAGAC ATCGCTCTCC AATGCGAGAG
301 AAAGGGAGAC ATGATCATGA ACGAACTTCA CAGTCTCATG ATCGACGCCA
351 CGAAAGGAGG GAAGATACTA GGGGCAAACG AGACAGAGAA AAGGACTCAA
401 GAGAAGAACG AGAATATGAA CAGGATCAGA GCTCTTCTAG AGACCACAGA
451 GATGACAGAG AACCTCGAGA TGGTCGGGAT CGGAGAGATG CCAGAGATAC
501 TAGGGACCGA AGGGAACTAA GAGACTCCAG AGACATGCGG GACTCAAGGG
551 AGATGAGAGA TTATAGCAGA GATACCAAAG AGAGCCGTGA TCCCAGAGAT
601 TCTCGGTCCA CTCGTGATGC CCATGACTAC AGGGACCGTG AAGGTCGAGA
651 TACTCATCGA AAGGAGGATA CATATCCAGA AGAATCCCGG AGTTATGGCC
701 GAAACCATTT GAGAGAAGAA AGTTCTCGTA CGGAAATAAG GAATGAGTCC
751 AGAAATGAGT CTCGAAGTGA AATTAGAAAT GACCGAATGG GCCGAAGTAG
801 GGGGAGGGTT CCTGAGTTAC CTGAAAAGGG AAGTCGAGGC TCAAGAGGTT
851 CTCAAATTGA TAGTCACAGT AGTAATAGCA ACTATCATGA CAGCTGGGAA
901 ACTCGAAGTA GCTATCCTGA AAGAGATAGA TATCCTGAAA GAGACAACAG
951 AGATCAAGCA AGGGATTCTT CCTTTGAGAG AAGACATGGA GAGCGAGACC
1001 GTCGTGACAA CAGAGAGAGA GATCAAAGAC CAAGCTCACC AATTCGACAT
1051 CAGGGAAGGA ATGACGAGCT TGAGCGTGAT GAAAGAAGAG AGGAACGAAG
1101 AGTAGACAGA GTGGATGATA GGAGAGATGA AAGGGCTAGA GAGAGAGATC
1151 GGGAACGAGA ACGAGACAGG GAGCGGGAGA GAGAGAGGGA ACGTGAACGG
1201 GATCGGGAAA GAGAAAAAGA GAGAGAACTA GAAAGAGAGC GTGCTAGGGA
1251 ACGGGAGAGA GAAAGAGAAA AAGAGAGAGA TCGTGAAAGG GATAGAGACC
1301 GAGACCACGA TCGAGAGCGG GAAAGAGAGA GGGAACGAGA CAGGGAAAAA
1351 GAACGGGAAC GAGAAAGAGA AGAGAGAGAG AGGGAGAGAG AGCGAGAACG
1401 GGAGAGAGAG CGAGAGCGAG AACGGGAACG AGAAAGAGCG AGAGAAAGGG
1451 ATAAAGAACG AGAACGCCAA AGGGATTGGG AAGACAAAGA CAAAGGACGA
1501 GATGACCGCA GAGAAAAGCG AGAAGAGATC CGAGAAGATA GGAATCCAAG
1551 AGATGGACAT GATGAAAGAA AATCAAAGAA GCGCTATAGA AATGAAGGGA
1601 GTCCCAGCCC TAGACAGTCC CCGAAGCGCC GGCGTGAACA TTCTCCGGAC
1651 AGTGATGCCT ACAACAGTGG AGATGATAAA AATGAAAAAC ACAGACTCTT
1701 GAGCCAAGTT GTACGACCTC AAGAATCTCG TTCTCTTAGT CCCTCGCACC
1751 TCACAGAAGA CAGACAGGGT AGATGGAAAG AGGAGGATCG TAAACCAGAA
1801 AGGAAAGAGA GTTCAAGGCG CTACGAAGAA CAGGAACTCA AGGAGAAAGT
1851 TTCTTCTGTA GATAAACAGA GAGAACAGAC AGAAATCCTG GAAAGCTCAA
1901 GAATGCGTGC ACAGGACATT ATAGGACACC ACCAGTCTGA AGATCGAGAG
1951 ACATCTGATC GAGCTCATGA TGAAAACAAG AAGAAAGCAA AAATTCAAAA
2001 GAAACCAATT AAGAAAAAGA AAGAGGATGA TGTTGGAATA GAGAGGGGTA
2051 ACATAGAGAC AACATCTGAA GATGGTCAAG TATTTTCACC AAAAAAAGGA
2101 CAGAAAAAGA AAAGCATTGA AAAAAAACGT AAAAAATCCA AAGGTGATTC
2151 TGATATTTCT GATGAAGAAG CAGCCCAGCA AAGTAAGAAG AAAAGAGGCC
2201 CACGGACTCC CCCTATAACA ACTAAAGAGG AATTGGTTGA AATGTGCAAT
2251 GGTAAGAATG GTATTCTAGA GGACTCCCAG AAAAAAGAAG ATACAGCATT
2301 CAGTGACTGG TCTGATGAGG ATGTCCCTGA CCGTACAGAG GTGACAGAAG
2351 CAGAGCATAC TGCCACCGCC ACGACTCCTG GTAGTACCCC TTCTCCTCTA
2401 TCTTCTCTTC TTCCTCCTCC ACCGCCTGTG GCTACTGCCA CTGCTACAAC
2451 TGTGCCTGCA ACTCTTGCTG CCACTACTGC TGCTGCCGCC ACCTCTTTCA
2501 GCACATCTGC CATCACTATT TCCACCTCTG CCACCCCCAC CAATACCACC
2551 AATAATACTT TTGCCAATGA AGACTCACAC AGAAAATGCC ACAGAACACG
2601 AGTAGAAAAA GTAGAGACGC CTCACGTGAC TATAGAAGAT GCACAGCATC
2651 GCAAGCCTAT GGATCAAAAG AGGAGCAGCA GCCTCGGGAG CAATCGGAGT
2701 AACCGTAGTC ATACGTCTGG TCGTCTTCGC TCCCCATCCA ATGATTCAGC
2751 CCATCGAAGT GGAGATGACC AAAGTGGTCG AAAGAGAGTA CTGCACAGTG
2801 GCTCAAGAGA TAGAGAAAAA ACAAAAAGCC TGGAAATCAC AGGAGAGAGA
2851 AAATCTAGGA TTGATCAGTT AAAGCGTGGA GAACCCAGTC GAAGTACTTC
2901 TTCAGATCGC CAGGATTCAA GAAGCCATAG TTCAAGAAGA AGTTCTCCAG
2951 AGTCAGATCG ACAGGTCCAT TCAAGATCTG GGTCATTTGA TAGCAGAGAC
3001 AGGCTTCAAG AACGAGATCG ATATGAACAC GACAGAGAGC GCGAGAGAGA
3051 GAGGAGAGAT ACGAGGCAGA GAGAATGGGA CCGAGATGCT GATAAAGATT
3101 GGCCACGCAA CAGGGATCGA GATAGATTGC GAGAACGAGA ACGAGAGAGA
3151 GAACGAGACA AAAGGAGAGA CTTGGATAGG GAAAGAGAGA GACTAATTTC
3201 TGATTCTGTT GAAAGGGACA GGGACAGAGA CAGAGACAGA ACTTTTGAGA
3251 GTTCTCAAAT AGAGTCTGTG AAACGCTGTG AAGCAAAACT GGAAGGTGAA
3301 CATGAAAGGG ATCTAGAAAG CACTTCCCGA GACTCTCTAG CCTTGGATAA
3351 AGAGAGAATG GATAAAGATC TGGGATCTGT GCAGGGATTT GAAGATACAA
3401 ATAAATCCGA GAGAACTGAG AGTCTGGAAG CAGGAGATGA CGAGTCCAAG
3451 TTAGATGATG CACATTCATT AGGCTCTGGT GCTGGAGAAG GATACGAGCC
3501 AATCAGTGAT GACGAACTAG ATGAAATTCT GGCAGGTGAT GCAGAAAAGA
3551 GGGAGGACCA ACAGGATGAG GAGAAGATGC CAGATCCCTT AGATGTGATA
3601 GATGTGGATT GGTCTGGTCT TATGCCAAAG CATCCAAAAG AACCACGAGA
3651 GCCTGGGGCT GCACTCTTAA AATTCACACC TGGAGCTGTT ATGCTAAGAG
3701 TTGGGATTTC TAAAAAGTTG GCAGGTTCTG AACTCTTTGC CAAAGTCAAA
3751 GAAACATGTC AGAGACTTTT AGAAAAACCC AAAGGTAGTT TCATTTTACT
3801 TTAACTATAT AATGTCTGTT AACCATTTAA GATGCCATCT GAAGGGGATT
3851 CTGATCTGTT CTTATGTAGC ACTTAACACT GTGTAGAAAC TATTTTTTGA
3901 GAAATCATTT TATAATCATT ATTTAACCCT CATGGTCAAA GTTTCTCTTT
3951 AAAATTTATT TTGAGAAGAA GAGTTATCCC ACAGAAAAGT TGGGAAAAGA
4001 GTACAATGAC CTTTTTGTAT GAAAATTACT TATTAACAGG CCAGGCGTGG
4051 TGTTGCATGT CTGTAGTCAC AGCTACTCAG GGAGGTTGAG GCAGCAGGAT
4101 TGCTGGAGCC CAGGAAATTG AGGCTGCAGT GAGCCATGAT TGAGCCACCA
4151 CACTCCAACC TAGGTGACAG AGCAAGACCC TGTCTCAAAA AAAAAAAAAC
4201 AAATTAACCA ATAAGTTCTA ATATCAAAGT GCTCAGTGGT TTGCCCTTGG
4251 CTAAATGAAG CAGAGCCAGG AAAAACAGAC TACATATTTT TCATGTCTAA
4301 AGAAATTGGG TATTTTGGCA GCCCTTTCCC CTAGACATCT ACCCAAATGC
4351 AGGTGTGTAG GTTGAGTCTT TAACAAAGTG ATTAAGAGCT TGGTCTGTAA
4401 GGCCGGATGA TCTGGATTTC AGTAGGCACA CCACTTACTG GCTATTACTT
4451 AATCTGTGTG TTAGTGTCAT CATCTGTAAG TCAGGAATAA TCATACCACC
4501 AACTTCCTAT GGTAATTAGG AGCAAATGAG TTATTACAGG CAAAACACTT
4551 AGAACAGTTC CTGGCATATA GTAATACCCA ATAAATATTA ACTGCTACTT
4601 TGAAAATATC CTATCACGCT GATTTTTGAC CTCACTGCAG CAATTTTCAG
4651 TTATTCCAGA TTATCTAGCT TATGGATTCT GGTGGTAGGG GTTGTTTGGT
4701 TTTGGTTTTC ACTGTCTCTG TCTCATCTAG TACCTACCTT AGTTTATTTT
4751 GCAACTTACT AATACTTTAT TAATGGGGAG GGACGAGTAG ATGGTAAAAA
4801 GAAGGAAAAG GAGGTAAAAG GTGAAAGGAA CAACATTAAT TAACAATTTT
4851 ACGTCATGTC CCTGGACATA AAAGTTTAGT TAGTATTAAA TTTTTCACTA
4901 ATACAAAATA AAAAAATATT GTTTTATGAG TTTTATGAAT TCATGCCCTT
4951 CCTTTACTCT ATTAGCATAA GCAGTAAATT TTTTTATTTT AATATAGCCC
5001 AATAAACCTA GAGTATACAT GTACAAAATA CATATAATTG TTAACGTGTA
5051 TTAACCGAAA AATGACCCAA GACTTAGTTC TTGCCCTACT GTATCTGCCT
5101 TGTTTGGTTG GTTCTGTGAC CTTAAGCAAA TAACTCCTGT GAGCCTCAAT
5151 TTTATTTGTA AAGTGATGGA ATAAAACCCC TAAAATCTTA CCCACCTCTA
5201 AAGATATTTG TTTCTGTGAC CTTTTGCTAG TAGCATTTCA AGTTAAAATC
5251 TGGTTTGATT TTGCTACCCA TGAAATACAG TTCGGCCCTT ACTTATTGAT
5301 GACTTAACCT AAACAGTGAA AATATGCACT GTAAAGGGTG GGGTGATGTG
5351 GCTTAACAAT CAGACTTCTT CTATTTTTGC TGCTATGGTG GTTGTATTAG
5401 AGAACTGATG TATTATCTTG AATAAAGACT TTGTCTTGTT TACTGCCCTA
5451 AAAAAAAAAA AAAAAAAAA
BLAST Results
No BLAST result
Medline entries
No Medline entry
Peptide information for frame 1
ORF from 292 bp to 3801 bp; peptide length: 1170
Category: similarity to unknown protein
Classification: no clue
1 MREKGRHDHE RTSQSHDRRH ERREDTRGKR DREKDSREER EYEQDQSSSR
51 DHRDDREPRD GRDRRDARDT RDRRELRDSR DMRDSREMRD YSRDTKESRD
101 PRDSRSTRDA HDYRDREGRD THRKEDTYPE ESRSYGRNHL REESSRTEIR
151 NESRNESRSE IRNDRMGRSR GRVPELPEKG SRGSRGSQID SHSSNSNYHD
201 SWETRSSYPE RDRYPERDNR DQARDSSFER RHGERDRRDN RERDQRPSSP
251 IRHQGRNDEL ERDERREERR VDRVDDRRDE RARERDRERE RDRERERERE
301 RERDREREKE RELERERARE REREREKERD RERDRDRDHD RERERERERD
351 REKERERERE ERERERERER ERERERERER ERARERDKER ERQRDWEDKD
401 KGRDDRREKR EEIREDRNPR DGHDERKSKK RYRNEGSPSP RQSPKRRREH
451 SPDSDAYNSG DDKNEKHRLL SQVVRPQESR SLSPSHLTED RQGRWKEEDR
501 KPERKESSRR YEEQELKEKV SSVDKQREQT EILESSRMRA QDIIGHHQSE
551 DRETSDRAHD ENKKKAKIQK KPIKKKKEDD VGIERGNIET TSEDGQVFSP
601 KKGQKKKSIE KKRKKSKGDS DISDEEAAQQ SKKKRGPRTP PITTKEELVE
651 MCNGKNGILE DSQKKEDTAF SDWSDEDVPD RTEVTEAEHT ATATTPGSTP
701 SPLSSLLPPP PPVATATATT VPATLAATTA AAATSFSTSA ITISTSATPT
751 NTTNNTFANE DSHRKCHRTR VEKVETPHVT IEDAQHRKPM DQKRSSSLGS
801 NRSNRSHTSG RLRSPSNDSA HRSGDDQSGR KRVLHSGSRD REKTKSLEIT
851 GERKSRIDQL KRGEPSRSTS SDRQDSRSHS SRRSSPESDR QVHSRSGSFD
901 SRDRLQERDR YEHDRERERE RRDTRQREWD RDADKDWPRN RDRDRLRERE
951 RERERDKRRD LDRERERLIS DSVERDRDRD RDRTFESSQI ESVKRCEAKL
1001 EGEHERDLES TSRDSLALDK ERMDKDLGSV QGFEDTNKSE RTESLEAGDD
1051 ESKLDDAHSL GSGAGEGYEP ISDDELDEIL AGDAEKREDQ QDEEKMPDPL
1101 DVIDVDWSGL MPKHPKEPRE PGAALLKFTP GAVMLRVGIS KKLAGSELFA
1151 KVKETCQRLL EKPKGSFILL
BLASTP hits
No BLASTP hits available
Alert BLASTP hits for DKFZphtes3_12d18, frame 1
No Alert BLASTP hits found
Pedant information for DKFZphtes3_12d18, frame 1
Report for DKFZphtes3_12d18.1
[LENGTH] 1267
[MW] 150513.45
[pI] 9.22
[HOMOL] TREMBL:AB020660_1 gene: "KIAA0853"; product: "KIAA0853 protein"; Homo sapiens mRNA for KIAA0853 protein, partial cds. 0.0
[BLOCKS] BL00422C Granins proteins
[BLOCKS] BL00803F
[BLOCKS] PR00306C
[BLOCKS] PR01081B
[BLOCKS] PR00041D
[BLOCKS] PR01063A
[BLOCKS] PR00545A
[BLOCKS] BL00048 Protamine P1 proteins
[BLOCKS] PF01140D
[BLOCKS] PR00833H
[KW] All_Alpha
[KW] LOW_COMPLEXITY 44.12 %
SElQ KDRGRDFERiQREKRDKPRSTSPAGlQHHSPISSRHHSSSSiQSGSSIiQRHSPSPRRKRTPSP SEG xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
PRD ccccchhhhhhhhccccccccccccccccccccccccccccccceeeccccccccccccc
SElQ SYlQRTLTPPLRRSASPYPSHSLSSPlQRKiQSPPRHRSPMREKGRHDHERTSiQSHDRRHERR
SEG x xxxxxxxxxxxxx xxxxxxxx PRD ccccccccccccccccccccccccccccccccccccccccccccccccccccchhhhhhc
SElQ EDTRGKRDREKDSREEREYElQDiQSSSRDHRDDREPRDGRDRRDARDTRDRRELRDSRDMR
SEG xx-xxxxxxxxxxxxxxxxx-xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx PRD cccccccccccchhhhhhhhhhccccccccccccccccccchhhhhhhhhhhhhhhcccc
SElQ DSREMRDYSRDTKESRDPRDSRSTRDAHDYRDREGRDTHRKEDTYPEESRSYGRNHLREE
SEG xxxxxxxxxxx- • ■ xxxxxxxxxxxx
PRD hhhhhhhccccccccccccccccccccccccccccccccccccccccccccccccccccc
SElQ SSRTEIRNESRNESRSEIRNDRMGRSRGRVPELPEKGSRGSRGSlQIDSHSSNSNYHDSUE
SEG xxxxxxxxxxxx- •
PRD hhhhhhhccccccccccccccccccccccccccccccccccccccccccccccccccccc
SElQ TRSSYPERDRYPERDNRD1QARDSSFERRHGERDRRDNRERD1QRPSSPIRH1QGRNDELERD
SEG xxxxxxxxxxxxxxxxxx xxxxxx
PRD cccccccccccccccccccccccccccccccccccccccccccccccccccccchhhhhh SElQ ERREERRVDRVDDRRDERARERDRERERDRERERERERERDREREKERELERERARERER
SEG xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
PRD hhhhhhhhhcchhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhh
SElQ EREKERDRERDRDRDHDRERERERERDREKEREREREERERERERERERERERERERERA SEG xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
PRD hhhhhhhhhcccccccchhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhh
SElQ RERDKERERiQRDUEDKDKGRDDRREKREEIREDRNPRDGHDERKSKKRYRNEGSPSPRlQS
SEG xxxxxxxxx- xxxxxxxxxxxxxxxxxxxxxxxx xxxxxxx PRD hhhhhhhhhhhhhhhccccccchhhhhhhhhhccccccccccchhhhhhccccccccccc
SElQ PKRRREHSPDSDAYNSGDDKNEKHRLLSIQVVRPIQESRSLSPSHLTEDRIQGRUKEEDRKPE
SEG xxxxx
PRD ccccccccccccccccccccchhhhhhhhhcccccccccccccccchhhhhhhhhhccch
SElQ RKESSRRYEE(QELKEKVSSVDK<QRE(QTEILESSRMRA(QDIIGHH(QSEDRETSDRAHDENK
SEG x
PRD hhhhhhhhhhhhhhhhhhccchhhhhhhhhhhhhhhhhheeeecccccccccccccccch SElQ KKAKIiQKKPIKKKKEDDVGIERGNIETTSEDGlQVFSPKKGiQKKKSIEKKRKKSKGDSDIS
SEG xxxxxxxxxxxxxx xxxxxxxxxxxxxxxxxxxxxxx
PRD hhhhhhhhccccccccccccccccceeecccceeecccccchhhhhhhhhhccccccccc
SElQ DEEAAlQiQSKKKRGPRTPPITTKEELVEMCNGKNGILEDSlQKKEDTAFSDUSDEDVPDRTE SEG xxx xx
PRD hhhhhhhhhhcccccccccccchhhhhhcccccceeecccccccccccccccccccceee
SElQ VTEAEHTATATTPGSTPSPLSSLLPPPPPVATATATTVPATLAATTAAAATSFSTSAITI
SEG xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx PRD hhhhhhhhccccccccccceeeccccccceeeeeeecccchhhhhhhhhhhhccccceee
SEA STSATPTNTTNNTFANEDSHRKCHRTRVEKVETPHVTIEDAiQHRKPMDiQKRSSSLGSNRS
SEG xxxxxxxxxxxxxxxx xxxxxxxxxx
PRD eccccccccccccccccccccchhhhheeeeccceeeecccccccccccccccccccccc
SElQ NRSHTSGRLRSPSNDSAHRSGDDlQSGRKRVLHSGSRDREKTKSLEITGERKSRIDlQLKRG
SEG xxx
PRD ccccccccccccccccccccecccccceeeeeccccccccceeeeehhhhhhhhhhhhcc SElQ EPSRSTSSDRiQDSRSHSSRRSSPESDRlQVHSRSGSFDSRDRLiQERDRYEHDRERERERRD
SEG - - xxxxxxxxxxxxxxxxxxxxxxxxx xxxxxxxxxxxxxxxxx
PRD ccccccccccccccccccccccccccceeeecccccccchhhhhhhhhhchhhhhhhhhh SElQ TRiQREUDRDADKDUPRNRDRDRLRERERERERDKRRDLDRERERLISDSVERDRDRDRDR
SEG xxxxxxxxx xxxxxxxxxxxxxxxxxxxxxxxxxxxx xxxxxxxxx
PRD hhhhhhhhccccccccccchhhhhhhhhhhhhhhhhhhhhhhhhhhhhccccccccccce SElQ TFESSlQIESVKRCEAKLEGEHERDLESTSRDSLALDKERMDKDLGSVtQGFEDTNKSERTE
SEG
PRD eechhhhhhhhhhhhhhhhhhccccccccccchhhhhhhhhhcccccccccccccccccc
SElQ SLEAGDDESKLDDAHSLGSGAGEGYEPISDDELDEILAGDAEKREDiQlQDEEKMPDPLDVI SEG
PRD cccccccccccccccccccccccccccccccccceeeecchhhhhhhhhhhcccccceee
SElQ DVDUSGLMPKHPKEPREPGAALLKFTPGAVMLRVGISKKLAGSELFAKVKETCiQRLLEKP
SEG PRD eccccccccccccccccccceeeeeccceeeeeecccccccchhhhhhhhhhhhhhhl-.ee
SEfQ KGSFILL
SEG
PRD ccccccc
(No Prosite data available for DKFZphtes3_12d18.1)
(No Pfam data available for DKFZphtes3_12d18.1)
DKFZphtes3_1417
group: testis derived
DKFZphtes3_1417 encodes a novel 815 amino acid protein without similarity to known proteins. The mRNA is transcribed ubiquitously.
No informative BLAST results; no predictive Prosite, Pfam or SCOP motif.
The new protein can find application in studying the expression profile of testis-specific genes.
similarity to C. elegans B0412.3; see also DKFZphtes3_17n3; perhaps complete cds.
Sequenced by BMFZ
Locus: unknown
Insert length: 3522 bp
Poly A stretch at pos. 3456, polyadenylation signal at pos. 3437
1 AACACATCGA CTTGTGTAAG AAAAAGATTG GAAGTGCGGA GCTGTCTTTT
51 GAGCATGATG CATGGATGTC TAAACAATTC CAGGCCTTTG GAGATTTATT
IDl TGATGAAGCT ATTAAGTTAG GGTTAACAGC TATTCAAACT CAGAATCCTG
151 GTTTCTATTA CCAGCAGGCA GCATACTATG CCCAGGAGCG GAAACAGCTT 201 GCAAAAACCC TCTGTAA CCA CGAAGCTTCT GTAATGTATC CCAATCCTGA
251 TCCCTTAGAA ACACAAACAG GCGTTCTTGA CTTTTATGGA CAAAGATCAT
301 GGCGACAAGG AATACTAAGT TTTGATCTTT CTGATCCTGA AAAAGAAAAG
351 GTGGGAATTC TTGCCATTCA GCTGAAGGAG AGAAATGTTG TTCACTCTGA
401 GATAATCATA ACTCTTCTGA GCAATGCTGT TGCACAGTTC AAGAAGTATA 451 AGTGCCCGCG AATGAAAAGT CACCTAATGG TTCAGATGGG AGAGGAATAT
501 TATTACGCAA AGGATTATAC CAAAGCTTTG AAGTTGCTGG ATTATGTGAT
551 GTGTGATTAT CGGAGTGAAG GATGGTGGAC TCTGCTCACT TCTGTATTAA bOl CTACAGCTCT GAAGTGCTCC TACCTCATGG CCCAATTAAA GGATTACATT b51 ACTTACTCCC TAGAACTCCT TGGTAGAGCT TCAACTCTGA AAGATGACCA 701 GAAGTCTCGG ATAGAAAAGA ACCTCATAAA TGTTTTAATG AATGAAAGTC
751 CTGATCCAGA ACCCGACTGT GATATCTTAG CTGTGAAAAC TGCTCAGAAG
801 CTGTGGGCAG ACCGAATTTC TCTGGCTGGC AGCAATATTT TCACAATAGG
651 AGTACAGGAC TTTGTGCCAT TTGTGCAGTG CAAAGCCAAG TTTCATGCCC
IDl CAAGTTTTCA TGTTGATGTT CCTGTTCAGT TTGATATTTA TCTGAAGGCT 151 GATTGTCCAC ATCCCATTAG GTTTTCCAAG CTCTGTGTCA GCTTTAATAA
1D01 TCAGGAATAC AACCAGTTCT GTGTAATAGA AGAAGCATCC AAAGCAAATG
1D51 AAGTTTTAGA AAATCTGACT CAAGGAAAGA TGTGCCTAGT TCCTGGCAAA
1101 ACAAGAAAAC TGTTATTTAA GTTTGTTGCA AAAACTGAAG ATGTGGGAAA
1151 GAAAATTGAG ATTACTTCAG TGGATCTTGC TCTGGGCAAT GAGACGGGAA 12D1 GATGTGTGGT TTTAAATTGG CAGGGAGGAG GAGGAGATGC TGCTTCCTCC
1251 CAAGAAGCCT TACAGGCAGC TCGGTCTTTC AAAAGGCGAC CTAAGCTACC
1301 TGACAATGAA GTTCACTGGG ACAGCATTAT AATTCAGGCA AGCACAATGA
1351 TCATATCCAG AGTCCCAAAC ATTTCTGTAC ATCTGCTACA TGAACCCCCT 14D1 GCACTGACTA ATGAAATGTA TTGTTTGGTT GTGACTGTTC AGTCCCATGA
1451 AAAGACCCAA ATCAGAGATG TGAAGCTCAC TGCTGGCTTA AAACCAGGAC
1501 AGGATGCCAA TTTAACTCAG AAGACTCACG TGACTCTTCA TGGACCAGAA
1551 CTGTGTGATG AATCCTACCC GGCTTTACTC ACTGACATTC CTGTTGGAGA IbOl CTTACATCCA GGGGkkCkGC TGGAAAAAAT GTTGTATGTT CGCTGTGGAA lb51 CAGTGGGTTC CAGAATGTTT CTTGTATATG TTTCTTACCT GATAAATACA
17D1 ACCGTTGAAG AAAAAGAAAT TGTTTGCAAG TGTCACAAGG ATGAAACTGT
1751 AACAATTGAA ACAGTCTTTC CATTTGATGT TGC6GTTAAA TTTGTTTCTA
1801 CCAAGTTTGA GCACCTGGAA AGGGTTTATG CTGACATCCC CTTTCTGTTG 1851 ATGACGGACC TCTTAAGTGC CTCACCCTGG GCCCTCACTA TTGTTTCCAG
1101 TGAGCTCCAG CTTGCTCCAT CCATGACCAC AGTGGACCAG CTCGAGTCTC
1151 AAGTGGACAA TGTTATCTTA CAGACTGGAG AGAGTGCTAG TGAATGCTTT
20D1 TGTCTTCAAT GCCCATCTCT TGGAAATATT GAAGGTGGAG TAGCAACCGG
2051 GCATTATATT ATCTCTTGGA AAAGGACCTC AGCAATGGAG AATATCCCCA 2101 TCATCACAAC TGTCATCACT CTGCCGCACG TGATTGTGGA GAATATCCCT
2151 CTCCATGTGA ATGCAGATCT GCCGTCATTT GGGCGTGTCA GAGAGTCGTT
2201 ACCTGTCAAG TATCACCTAC AGAATAAGAC CGACTTAGTT CAAGATGTAG
2251 AAATTTCTGT GGAGCCCAGT GATGCCTTCA TGTTCTCAGG TCTCAAACAG
23D1 ATTCGATTAC GTATCCTCCC TGGCACGGAG CAGGAAATGC TATATAATTT 2351 CTATCCTCTG ATGGCTGGAT ACCAGCAGCT GCCATCTCTC AACATCAACT
2401 TGCTTAGATT TCCTAACTTC ACAAATCAGC TGCTCAGGCG TTTTATACCT
2451 ACCAGTATTT TTGTCAAGCC ACAGGGTCGA CTCATGGATG ATACCTCTAT
25D1 TGCTGCTGCA TGATGTTCAA GACCGGCCCT TGGCTGTTGT TACAGAGATG
2551 TTGGGCAGAG CTATGCAGGT GTTTCATTGT GAACTCTAGC TTTGATCATG 2b01 GTAAAAAGTT AACCTTTTCT ATTTTTTAAT GGATGTTATA CCAACTATTC
2b51 AGAGGAACTC ATACTTCAAA AATATTAGGA AAATCTGTCT TATAGTTTCT
27D1 CTAATAAATA TCTGAAATCT CAGTACGACA TGAAAGAATG TCAGACCATT
2751 GTTATTGTTG AAAGTCATTT GATGAATGGT AAATTCTATG AAAAGTAAGT
2801 GATTTGCATG TATAATATCA GGAAAATTAA GCATCCCAAG TGTGACTGGA 2651 CAAAGAGAGC AGATGCACCA GTGCCTGTGC CATAAAGTTC CGAATCCCCC
21D1 ATGTGTCTCT TTCAGAGCTG GCCAGACCGG AAATAAATCA TTCTCATAAA
2151 TTCAGTGTGT ACTCAGAACA CATACACAAC AACATAGGGA GTTGTATGAC
30D1 TGATACGGAA AACTTCCAGA AAGTTTTAAT CAAAGCAGTT TAATTAAGGT
3051 ATCAAAAATA TCTTTGCTTA CTATCAAGAA GTGTCAAATA GGTTCAGCTT 31D1 GCTGCCAAAA TATGGATCAT TTATGAAGCA GGTTCATATT TTAGAGGTGT
3151 TAATAAAATC CTCATCGGAA AAGATCCAAA GTGCAAGGAT TTGATTATAA
3201 ACATAATTTC CTAGACTGAA AGTTTTTGGA AAAGATGCAG GGTCTGAGTC
3251 AGGCCTTCTG GTTATATTGT GCAGTTTCAA AAGAACTATT TAAAACTCTT
3301 GAAAACTCAT GTAAATAAAA ATCATAGGGT GAAAATTGTA TTTGTTAAAA 3351 TACCTTAATA ATTTAAAATG ACCTGATTTC CTGGAAAATT TTATTATTCA
34D1 AAAGGTGGAG GCATTGTAAA AAGGAAATAG TGATGTAAAT AAACATGTTC
3451 TCTTTCAAAA AAAAAAAAAA AAAAAAAAAA AAAAAAAAAA AAAAAAAAAA
3501 AAAAAAAAAA AAAAAAAAAA AA
BLAST Results
No BLAST result
Medline entries
No Medline entry
Peptide information for frame 3
ORF from 66 bp to 2510 bp; peptide length: 815
Category: similarity to unknown protein
Classification: no clue
1 MSKQFQAFGD LFDEAIKLGL TAIQTQNPGF YYQQAAYYAQ ERKQLAKTLC
51 NHEASVMYPN PDPLETQTGV LDFYGQRSWR QGILSFDLSD PEKEKVGILA
101 IQLKERNVVH SEIIITLLSN AVAQFKKYKC PRMKSHLMVQ MGEEYYYAKD
151 YTKALKLLDY VMCDYRSEGW WTLLTSVLTT ALKCSYLMAQ LKDYITYSLE
201 LLGRASTLKD DQKSRIEKNL INVLMNESPD PEPDCDILAV KTAQKLWADR
251 ISLAGSNIFT IGVQDFVPFV QCKAKFHAPS FHVDVPVQFD IYLKADCPHP
301 IRFSKLCVSF NNQEYNQFCV IEEASKANEV LENLTQGKMC LVPGKTRKLL
351 FKFVAKTEDV GKKIEITSVD LALGNETGRC VVLNWQGGGG DAASSQEALQ
401 AARSFKRRPK LPDNEVHWDS IIIQASTMII SRVPNISVHL LHEPPALTNE
451 MYCLVVTVQS HEKTQIRDVK LTAGLKPGQD ANLTQKTHVT LHGPELCDES
501 YPALLTDIPV GDLHPGEQLE KMLYVRCGTV GSRMFLVYVS YLINTTVEEK
551 EIVCKCHKDE TVTIETVFPF DVAVKFVSTK FEHLERVYAD IPFLLMTDLL
601 SASPWALTIV SSELQLAPSM TTVDQLESQV DNVILQTGES ASECFCLQCP
651 SLGNIEGGVA TGHYIISWKR TSAMENIPII TTVITLPHVI VENIPLHVNA
701 DLPSFGRVRE SLPVKYHLQN KTDLVQDVEI SVEPSDAFMF SGLKQIRLRI
751 LPGTEQEMLY NFYPLMAGYQ QLPSLNINLL RFPNFTNQLL RRFIPTSIFV
801 KPQGRLMDDT SIAAA
BLASTP hits
No BLASTP hits available
Alert BLASTP hits for DKFZphtes3_1417, frame 3
No Alert BLASTP hits found
Pedant information for DKFZphtes3_1417, frame 3
Report for DKFZphtes3_1417.3
[LENGTH] 836
[MW] 94241.30
[pI] 5.64
[HOMOL] TREMBL:CEUB0412_2 gene: "B0412.3"; Caenorhabditis elegans cosmid B0412. 6e-30
[KW] Alpha_Beta
[KW] LOW_COMPLEXITY 1.20 %
SElQ HIDLCKKKIGSAELSFEHDAUMSK(QF(QAFGDLFDEAIKLGLTAI(3T(QNPGFYY(Q(QAAYYA
SEG xxxxxxxxx
PRD ccceeeeeehhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhccceeeccccchhhhhhhhh SE(Q (QERKiQLAKTLCNHEASVMYPNPDPLETiQTGVLDFYGiQRSURiQGILSFDLSDPEKEKVGIL SEG x
PRD hhhhhhhhhhhhhcceeeccccccccceeeeeeeeccccceeeceeeeeccchhhhhhhh SEA AIiQLKERNVVHSEIIITLLSNAVAlQFKKYKCPRMKSHLMViQMGEEYYYAKDYTKALKLLD
SEG
PRD hhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhccceeehhhhhhhhhhhh SElQ YVMCDYRSEGUUTLLTSVLTTALKCSYLMAlQLKDYITYSLELLGRASTLKDDiQKSRIEKN
SEG
PRD hhhhccccccceeehhhhhhhhhhhhhhhhhhhhhhhhhhhhhhchhhhhccccchhhhh
SElQ LINVLMNESPDPEPDCDILAVKTAlQKLUADRISLAGSNIFTIGVlQDFVPFViQCKAKFHAP SEG
PRD hheeeeccccccccccchhhhhhhhhhhhhhhhhhcccceeeeeeehhhhhhhhhhcccc
SElQ SFHVDVPViQFDIYLKADCPHPIRFSKLCVSFNNiQEYNiQFCVIEEASKANEVLENLTlQGKM
SEG PRD eeeeeeecceeeeeecccccceeeeeeeeecccccccceeeeeccccchhhhhccccccc
SElQ CLVPGKTRKLLFKFVAKTEDVGKKIEITSVDLALGNETGRCVVLNUlQGGGGDAASSiQEAL
SEG
PRD cccccccccchhhhhhhhhccceeeeeeeecccccccccceeeeeecccccccchhhhhh
SElQ (QAARSFKRRPKLPDNEVHUDSIIIlQASTMIISRVPNISVHLLHEPPALTNEMYCLVVTViQ
SEG
PRD hhhhhhhhcccccccccccceeeeeeeceeeeccccceeeeeccccccccceeeeeeeee SElQ SHEKTiQIRDVKLTAGLKPGiQDANLTiQKTHVTLHGPELCDESYPALLTDIPVGDLHPGEiQL
SEG
PRD cccccceeeeeccccccccchhhhhheeeeecccccccccceeeeecccccccccccchh
SElQ EKMLYVRCGTVGSRMFLVYVSYLINTTVEEKEIVCKCHKDETVTIETVFPFDVAVKFVST SEG
PRD hhhhhhcccccchhhhhcchhhhhccccceeeeeeecccccceeeeeeccceeeeeeeeh
SElQ KFEHLERVYADIPFLLMTDLLSASPUALTIVSSELlQLAPSMTTVDlQLESlQVDNVILlQTGE
SEG PRD hhhhhhhhhhccceeeehhhhhccccceeehhhhhhhhccceeeccccccccceeeeccc
SElQ SASECFCLiQCPSLGNIEGGVATGHYIISUKRTSAMENIPIITTVITLPHVIVENIPLHVN
SEG
PRD cceeeeeeeecccccccccccceeeeeeeeccccccccceeeeeeeeeeeeeeecccccc
SElQ ADLPSFGRVRESLPVKYHLlQNKTDLVlQDVEISVEPSDAFMFSGLKiQIRLRILPGTEiQEML
SEG
PRD cccccccceeeccceeeeeecccccceeeeeecccccceeeccccccceeeccccccccc SElQ YNFYPLMAGYlQiQLPSLNINLLRFPNFTNiQLLRRFIPTSIFVKPiQGRLMDDTSIAAA
SEG
PRD cccccccccccccccccccccccccccchhhhhcccceeeeecccccccccccccc
(No Prosite data available for DKFZphtes3_1417.3)
(No Pfam data available for DKFZphtes3_1417.3)
DKFZphtes3_15n14
group: testis derived
DKFZphtes3_15n14 encodes a novel 713 amino acid protein with weak similarity to the neurofilament triplet M protein of the rat. Neurofilaments are the intermediate filaments specific to nervous tissue. They are probably essential to the tensile strength of the neuron, as well as to transport of molecules and organelles within the axon. Until now, ESTs of the novel mRNA could only be isolated from testes, germ cells and uterus.
No informative BLAST results; no predictive Prosite, Pfam or SCOP motif.
The new protein can find application in studying the expression profile of testis-specific genes.
similarity to neurofilament triplet M protein - rat; few EST hits (6 of 9 hits from testis); perhaps complete cds.
Sequenced by GBF
Locus: unknown
Insert length: 2389 bp
Poly A stretch at pos. 2328, polyadenylation signal at pos. 2306
1 TGGGCCCCAC CTCCTCAGCA CAACTTTCTG AAAAACTGGC AGCGTAACAC
51 AGCCCTGCGG AAGAAGCAGC AGGAAGCCCT CAGCGAACAC CTAAAGAAGC
101 CAGTGAGTGA GCTGCTCATG CACACCGGGG AGACCTACAG ACGGATCCAG
151 GAGGAGCGGG AGCTCATTGA CTGCACACTT CCAACCCGGC GTGATAGGAA
201 AAGCTGGGAG AACAGTGGGT TCTGGAGTCG ACTGGAATAC TTGGGAGATG
251 AGATGACAGG TCTGGTCATG ACCAAGACAA AAACTCAGCG TGGCCTCATG
301 GAGCCCATCA CTCATATCAG GAAGCCCCAC TCCATCCGGG TGGAGACAGG
351 ATTACCAGCC CAGAGGGACG CTTCATACCG CTACACCTGG GATCGGAGTC
401 TGTTTCTGAT CTACCGACGC AAGGAGCTGC AGAGAATCAT GGAAGAGCTG
451 GATTTCAGCC AGCAGGATAT TGATGGCCTG GAGGTGGTGG GCAAAGGGTG
501 GCCCTTCTCG GCTGTTACTG TGGAAGACTA CACAGTGTTT GAAAGAAGTC
551 AGGGAAGCTC CTCTGAAGAC ACAACATACT TAGGCACATT GGCCAGTTCC
601 TCTGATGTCT CCATGCCTAT TCTCGGCCCT TCTCTGCTGT TCTGTGGGAA
651 GCCAGCTTGC TGGATCAGAG GCAGTAATCC ACAGGACAAG AGGCAGGTTG
701 GGATTGCTGC TCACTTGACC TTTGAAACCC TAGAAGGCGA GAAAACCTCC
751 TCAGAACTGA CTGTGGTCAA TAATGGCACC GTGGCCATTT GGTATGACTG
801 GCGACGGCAG CACCAGCCGG ACACTTTCCA AGACCTTAAG AAAAACAGGA
851 TGCAGCGATT TTACTTTGAC AACCGGGAAG GTGTGATTCT GCCTGGAGAA
901 ATTAAAACAT TTACCTTCTT CTTCAAGTCT TTGACTGCTG GGGTCTTCAG
951 GGAATTTTGG GAGTTTCGAA CCCATCCTAC TCTATTAGGA GGTGCTATAC
1001 TGCAGGTCAA TCTCCACGCG GTCTCCCTGA CCCAGGACGT TTTTGAGGAT
1051 GAGAGGAAAG TACTGGAGAG CAAGCTGACT GCCCATGAGG CAGTCACCGT
1101 CGTTCGCGAA GTGCTGCAGG AGCTGCTGAT GGGGGTCTTG ACCCCGGAGC
1151 GCACACCATC ACCTGTGGAT GCCTATCTCA CCGAGGAAGA CTTGTTCCGG
1201 CACAGAAATC CTCCGCTGCA TTATGAGCAC CAAGTGGTGC AAAGCCTGCA
1251 CCAACTGTGG CGCCAGTACA TGACCCTGCC CGCCAAGGCT GAGGAGGCCA
1301 GGCCAGGGGA CAAGGAGCAC GTCAGCCCCA TAGCCACAGA GAAGGCCTCT
1351 GTGAATGCTG AGCTGTTACC ACGCTTTAGG AGCCCCATCT CCGAAACTCA
1401 AGTGCCCCGG CCTGAGAACG AGGCCCTCAG GGAATCCGGG TCCCAGAAGG
1451 CCAGAGTGGG GACCAAGAGT CCTCAGCGGA AGAGCATCAT GGAGGAGATC
1501 CTGGTGGAGG AAAGCCCAGA TGTGGACAGC ACCAAGAGCC CCTGGGAGCC
1551 GGATGGCCTT CCCCTGCTGG AGTGGAACCT CTGCTTGGAG GACTTCAGAA
1601 AGGCAGTGAT GGTGCTCCCT GATGAGAACC ACAGAGAGGA TGCGTTGATG
1651 AGGCTCAACA AAGCAGCCCT GGAGCTGTGC CAGAAGCCAA GGCCATTGCA
1701 GTCCAACCTC CTGCACCAGA TGTGTTTGCA GCTGTGGCGA GATGTGATTG
1751 ACAGCCTGGT GGGCCATTCC ATGTGGCTGA GGTCTGTGCT GGGCCTGCCT
1801 GAGAAGGAGA CCATCTATTT GAATGTGCCT GAAGAGCAAG ATCAAAAATC
1851 ACCTCCTATC ATGGAAGTGA AGGTACCTGT GGGGAAAGCT GGGAAGGAGG
1901 AGCGGAAAGG AGCAGCCCAG GAAAAGAAGC AACTGGGGAT CAAAGACAAA
1951 GAAGACAAGA AAGGAGCCAA GCTGCTCGGG AAAGAGGACC GTCCCAACAG
2001 CAAGAAGCAC AAGGCAAAGG ATGACAAGAA AGTCATAAAA TCTGCAAGTC
2051 AGGACAGGTT TTCTTTGGAA GACCCTACCC CTGACATCAT CCTCTCTTCT
2101 CAAGAACCCA TAGACCCCCT GGTCATGGGG AAATACACCC AGAGGCTGCA
2151 CAGTGAGGTC CGTGGGCTGC TGGACACCCT GGTGACCGAC CTGATGGTCC
2201 TGGCTGATGA GCTCAGCCCC ATAAAGAATG TCGAGGAGGC TTTGCGCCTC
2251 TGCAGGTGAC TCTCGGGCCC AAGCAACCTT CTGGAAAACG GGTTAATAAA
2301 TAAATCAATA AAGAACCTTC AAGTTTCTAC TAAAAAAAAA AAAAAAAAAA
2351 AAAAAAAAAA AAAAAAAAAA AAAAAAAAAA GGGCGGCCG
BLAST Results
No BLAST result
Medline entries
No Medline entry
Peptide information for frame 1
ORF from 118 bp to 2256 bp; peptide length: 713
Category: putative protein
Classification: Cell structure/motility
1 MHTGETYRRI QEERELIDCT LPTRRDRKSW ENSGFWSRLE YLGDEMTGLV
51 MTKTKTQRGL MEPITHIRKP HSIRVETGLP AQRDASYRYT WDRSLFLIYR
101 RKELQRIMEE LDFSQQDIDG LEVVGKGWPF SAVTVEDYTV FERSQGSSSE
151 DTTYLGTLAS SSDVSMPILG PSLLFCGKPA CWIRGSNPQD KRQVGIAAHL
201 TFETLEGEKT SSELTVVNNG TVAIWYDWRR QHQPDTFQDL KKNRMQRFYF
251 DNREGVILPG EIKTFTFFFK SLTAGVFREF WEFRTHPTLL GGAILQVNLH
301 AVSLTQDVFE DERKVLESKL TAHEAVTVVR EVLQELLMGV LTPERTPSPV
351 DAYLTEEDLF RHRNPPLHYE HQVVQSLHQL WRQYMTLPAK AEEARPGDKE
401 HVSPIATEKA SVNAELLPRF RSPISETQVP RPENEALRES GSQKARVGTK
451 SPQRKSIMEE ILVEESPDVD STKSPWEPDG LPLLEWNLCL EDFRKAVMVL
501 PDENHREDAL MRLNKAALEL CQKPRPLQSN LLHQMCLQLW RDVIDSLVGH
551 SMWLRSVLGL PEKETIYLNV PEEQDQKSPP IMEVKVPVGK AGKEERKGAA
601 QEKKQLGIKD KEDKKGAKLL GKEDRPNSKK HKAKDDKKVI KSASQDRFSL
651 EDPTPDIILS SQEPIDPLVM GKYTQRLHSE VRGLLDTLVT DLMVLADELS
701 PIKNVEEALR LCR
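The ORF coordinates and the peptide length given in each clone report are tied together by simple arithmetic, which makes them a useful check when digits in the listing are hard to read. The following is a minimal Python sketch, assuming the coordinates are 1-based, inclusive, and exclude the stop codon; that convention is inferred from the numbers (118 to 2256 bp for a 713-residue peptide) and is not stated in the source. The helper name `peptide_length` is hypothetical.

```python
def peptide_length(orf_start: int, orf_end: int) -> int:
    """Number of residues encoded by an ORF given 1-based inclusive coordinates.

    Assumes the stop codon is NOT included in [orf_start, orf_end].
    """
    span = orf_end - orf_start + 1  # nucleotides covered by the ORF
    if span <= 0:
        raise ValueError("orf_end must not precede orf_start")
    if span % 3 != 0:
        raise ValueError("ORF span is not a whole number of codons")
    return span // 3


# Coordinates as read for this clone: ORF from 118 bp to 2256 bp.
print(peptide_length(118, 2256))
```

A span that is not divisible by three raises an error, which is a quick way to flag a misread coordinate.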
BLASTP hits
No BLASTP hits available
Alert BLASTP hits for DKFZphtes3_15n14, frame 1
No Alert BLASTP hits found
Pedant information for DKFZphtes3_15n14, frame 1
Report for DKFZphtes3_15n14.1
[LENGTH] 713
[MW] 81760.53
[pI] 6.00
[BLOCKS] PF00878C
[BLOCKS] BL00690C DEAH-box subfamily ATP-dependent helicases proteins
[KW] Alpha_Beta
[KW] LOW_COMPLEXITY 4.07 %
SE(Q MHTGETYRRIIQEERELIDCTLPTRRDRKSUENSGFUSRLEYLGDEMTGLVMTKTKTIQRGL
SEG
PRD ccchhhhhhhhhhhhhhhhccccchhhhhhccccccceeeeccccceeeeeecccccccc SElQ MEPITHIRKPHSIRVETGLPAIQRDASYRYTUDRSLFLIYRRKELIQRIMEELDFSIQIQDIDG SEG
PRD cccccccccccceeeeeccccchhhhhhhcccchhhhhhhhhhhhhhhhhhcccccccce
SElQ LEVVGKGUPFSAVTVEDYTVFERSiQGSSSEDTTYLGTLASSSDVSMPILGPSLLFCGKPA SEG
PRD eeeeeccccceeeeecceeeeeecccccccceeecccccccccccccccccceeeecccc
SElQ CUIRGSNPiQDKRiQVGIAAHLTFETLEGEKTSSELTVVNNGTVAIUYDURRiQHiQPDTFlQDL SEG PRD eeeeccccccchhhhhhhhhheeecccccccceeeeecccceeeeehhhhhccccchhhh
SElQ KKNRMlQRFYFDNREGVILPGEIKTFTFFFKSLTAGVFREFUEFRTHPTLLGGAILlQVNLH
SEG
PRD hhhhhhhhhcccccccccccceeeeeeeehhhhhhhhhhhhhhhcccccccchhhhhhhh
SElQ AVSLTIQDVFEDERKVLESKLTAHEAVTVVREVLIQELLMGVLTPERTPSPVDAYLTEEDLF
SEG
PRD hhhhhhhccchhhhhhhhhhhhhhhhhhhhhhhhhhhhccccccccccccceeeeccccc SElQ RHRNPPLHYEHlQVViQSLHiQLURlQYMTLPAKAEEARPGDKEHVSPIATEKASVNAELLPRF SEG
PRD cccccccccccchhhhhhhhhhhhhhhhhhhhhccccccccccccchhhhhhhhhccccc SElQ RSPISET(QVPRPENEALRESGS(QKARVGTKSP(QRKSIMEEILVEESPDVDSTKSPUEPDG
SEG
PRD cccccccccccccchhhhhcccccccccccccchhhhhhhhhhhcccccccccccccccc SElQ LPLLEUNLCLEDFRKAVMVLPDENHREDALMRLNKAALELCiQKPRPLiQSNLLHlQMCLiQLU
SEG
PRD ccccchhhhhhhhhhhhccccccchhhhhhhhhhhhhhhhhhcccccchhhhhhhhhhhh
SElQ RDVIDSLVGHSMULRSVLGLPEKETIYLNVPEElQDlQKSPPIMEVKVPVGKAGKEERKGAA SEG
PRD hhhhhhhhccchhhhhhccccccceeeeecccccccccccceeeeeccccchhhhhhhhh
SElQ (QEKKiQLGIKDKEDKKGAKLLGKEDRPNSKKHKAKDDKKVIKSASiQDRFSLEDPTPDIILS
SEG xxxxxxxxxxxxxxxxx PRD hhhhhhccccccccchhhhhccccccccccccccccceeeeecccccccccccccceeee
SElQ SiQEPIDPLVMGKYTlQRLHSEVRGLLDTLVTDLMVLADELSPIKNVEEALRLCR
SEG xxxxxxxxxxxx
PRD ccccccceeechhhhhhhhhhhhhhhhhhhhhhhhhhhccccchhhhhhhccc
(No Prosite data available for DKFZphtes3_15n14.1)
(No Pfam data available for DKFZphtes3_15n14.1)
DKFZphtes3_16b5
group: cell structure and motility
DKFZphtes3_16b5 encodes a novel 268 amino acid protein with similarity to various tropomyosins. Tropomyosins play regulatory roles in cellular structure and transport.
The new protein can find application in modulating cell structure and motility as well as modulating cellular transport.
weak similarity to KIAA0774; perhaps complete cds.
Sequenced by BMFZ
Locus: unknown
Insert length: 1316 bp
Poly A stretch at pos. 1247, polyadenylation signal at pos. 1232
1 TGCTAAAATG GAATTAGAGA GAAGCATAGA CATCAGCAGA AGACAGAGTA
51 AGGAGCACAT ATGTAGAATT ACAGATCTAC AAGAGGAATT AAGACACAGA
101 GAGCATCACA TCTCTGAATT GGATAAGGAG GTTCAGCACC TTCATGAGAA
151 TATAAGTGCC CTAACCAAAG AACTGGAATT TAAGGGGAAA GAAATTCTCA
201 GAATACGAAG TGAATCTAAC CAACAGATAA GGTTGCATGA ACAAGATTTA
251 AACAAGAGAC TTGAAAAAGA GTTGGATGTC ATGACAGCAG ACCACCTCAG
301 AGAGAAAAAT ATCATGCGGG CAGATTTTAA TAAGACTAAC GAGCTACTCA
351 AGGAAATAAA TGCCGCTTTA CAAGTGTCAT TAGAAGAAAT GGAAGAAAAA
401 TATCTAATGA GAGAATCAAA ACCAGAAGAT ATACAGATGA TTACAGAATT
451 AAAAGCCATG CTTACAGAAA GAGACCAGAT CATAAAGAAA CTAATTGAGG
501 ATAATAAGTT TTATCAGCTG GAATTAGTCA ATCGAGAAAC TAACTTCAAC
551 AAAGTGTTTA ACTCAAGTCC TACTGTTGGT GTTATTAATC CATTGGCTAA
601 GCAAAAGAAG AAGAATGATA AATCACCAAC AAACAGGTTT GTGAGTGTTC
651 CCAATCTAAG TGCTCTGGAA TCTGGTGGAG TGGGCAATGG ACATCCTAAC
701 CGCCTGGATC CCATTCCTAA TTCTCCAGTC CACGATATTG AGTTCAACAG
751 CAGCAAACCA CTTCCACAGC CAGTGCCACC TAAAGGGCCC AAGACATTTT
801 TGAGGTATCA GTAAGATGCA TGTGCATGAG CTCAAGGAAC ATGACTACTG
851 GAGTTTCCAT TACACATTGT TGCGTGCCTT GTAATTTTCC CCAAAGACGT
901 CCTGCTCAGA GTGAAGCTTC TCCAGTGGCT TCTCCAGATC CCCAGCGCCA
951 GGAGTGGTTT GCCCGGTACT TCACATTCTG AAAGAATTGT GTTGGCACAG
1001 CTCTGTATAG ACTGTTACTA AGAGCATGAC TTTATACAGA TTGTTATGTA
1051 AATAGGCTTT CCTATGTCAA ACACTGTGAA TGAGAAAGTA TTTGTCTCTC
1101 CAACTTGAAA ATGCACTGTA TTTCCTGTGA TATTTATTGG AATCATTCTA
1151 TAAGGTACTA TATTATGTGT GTAATTATAA CTGTTATTTT TATTTGAGAT
1201 GGAAGAGTCT TTAACCTTTG TAATTACTGC ATAATAAATT TTGTTAGAAT
1251 CAAAAAAAAA AAAAAAAAAA AAAAAAAAAA AAAAAAAAAA AAAAAAAAAA
1301 AAAAAAAAAA AAAAAA
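Each clone report quotes a poly A position and a polyadenylation signal position for the insert. How those positions were computed is not described in the source, so the following Python sketch is only an illustration under stated assumptions: it looks for the canonical AATAAA hexamer and for an uninterrupted terminal run of A residues, and reports 0-based offsets. The function names and the `TAIL_OFFSET` constant are hypothetical; `TAIL` is the last 116 nt of the listing above (positions 1201 to 1316).

```python
SIGNAL = "AATAAA"  # canonical polyadenylation hexamer (assumed signal)


def find_signal(seq, signal=SIGNAL):
    """Return the 0-based offset of the last occurrence of signal, or -1."""
    return seq.upper().rfind(signal)


def terminal_poly_a(seq, min_len=10):
    """Return the 0-based offset where the uninterrupted terminal run of A's
    begins, or -1 if that run is shorter than min_len."""
    s = seq.upper()
    i = len(s)
    while i > 0 and s[i - 1] == "A":
        i -= 1
    return i if len(s) - i >= min_len else -1


TAIL_OFFSET = 1200  # TAIL[0] is position 1201 of the insert
TAIL = (
    "GGAAGAGTCT" "TTAACCTTTG" "TAATTACTGC" "ATAATAAATT" "TTGTTAGAAT"
    "C" + "A" * 65
)

print(TAIL_OFFSET + find_signal(TAIL))      # offset of the AATAAA hexamer
print(TAIL_OFFSET + terminal_poly_a(TAIL))  # offset of the strict A run
```

With these assumptions the hexamer offset comes out at 1232, which agrees with the quoted signal position. The strict A run begins a few bases downstream of the quoted poly A position, which suggests the original annotation tolerated interruptions in the run; that is an inference, not something the source states.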
BLAST Results
No BLAST result
Medline entries
No Medline entry
Peptide information for frame 2
ORF from 8 bp to 811 bp; peptide length: 268
Category: similarity to known protein
Classification: Cellular transport and traffic
1 MELERSIDIS RRQSKEHICR ITDLQEELRH REHHISELDK EVQHLHENIS
51 ALTKELEFKG KEILRIRSES NQQIRLHEQD LNKRLEKELD VMTADHLREK
101 NIMRADFNKT NELLKEINAA LQVSLEEMEE KYLMRESKPE DIQMITELKA
151 MLTERDQIIK KLIEDNKFYQ LELVNRETNF NKVFNSSPTV GVINPLAKQK
201 KKNDKSPTNR FVSVPNLSAL ESGGVGNGHP NRLDPIPNSP VHDIEFNSSK
251 PLPQPVPPKG PKTFLRYQ
BLASTP hits
No BLASTP hits available
Alert BLASTP hits for DKFZphtes3_16b5, frame 2
No Alert BLASTP hits found
Pedant information for DKFZphtes3_16b5, frame 2
Report for DKFZphtes3_16b5.2
[LENGTH] 270
[MW] 31413.01
[pI] 6.10
[HOMOL] PIR:A57013 early endosome antigen 1 - human 1e-05
[FUNCAT] 03.19 recombination and dna repair [S. cerevisiae, YOL034w] 1e-05
[FUNCAT] 03.22 cell cycle control and mitosis [S. cerevisiae, YFR031c] 2e-05
[FUNCAT] 30.10 nuclear organization [S. cerevisiae, YFR031c] 2e-05
[FUNCAT] 11.04 dna repair (direct repair, base excision repair and nucleotide excision repair) [S. cerevisiae, YKR095w] 5e-05
[FUNCAT] 30.04 organization of cytoskeleton [S. cerevisiae, YDR356w] 7e-05
[FUNCAT] 09.10 nuclear biogenesis [S. cerevisiae, YDR356w] 7e-05
[FUNCAT] 08.07 vesicular transport (golgi network, etc.) [S. cerevisiae, YDL058w] 1e-04
[FUNCAT] 30.03 organization of cytoplasm [S. cerevisiae, YDL058w] 1e-04
[FUNCAT] 1 genome replication, transcription, recombination and repair [M. jannaschii, MJ1643] 2e-04
[FUNCAT] 99 unclassified proteins [S. cerevisiae, YLR309c] 3e-04
[FUNCAT] 08.16 extracellular transport [S. cerevisiae, YNL272c] 5e-04
[FUNCAT] 30.09 organization of intracellular transport vesicles [S. cerevisiae, YNL272c] 5e-04
[KW] All_Alpha
[KW] LOW_COMPLEXITY 4.81 %
[KW] COILED_COIL 10.74 %
SElQ AKMELERSIDISRRiQSKEHICRITDLlQEELRHREHHISELDKEViQHLHENISALTKELEF
SEG PRD ccchhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhh COILS CCCCCCCCCCCCCCCCCCCCCCCC
SElQ KGKEILRIRSESNiQiQIRLHElQDLNKRLEKELDVMTADHLREKNIMRADFNKTNELLKEIN SEG
PRD hhhhhhhhhhcchhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhh COILS ccccc SElQ AALIQVSLEEMEEKYLMRESKPEDIIQMITELKAMLTERDIQIIKKLIEDNKFYIQLELVNRET SEG
PRD hhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhh COILS
SElQ NFNKVFNSSPTVGVINPLAKiQKKKNDKSPTNRFVSVPNLSALESGGVGNGHPNRLDPIPN SEG
PRD hhhhhhhcccceeeehhhhhhhhhhccccccceeeccccccccccccccccccccccccc COILS
SElQ SPVHDIEFNSSKPLPIQPVPPKGPKTFLRYIQ
SEG xxxxxxxxxxxxx
PRD ccceeeeecccccccccccccccceeeccc COILS
(No Prosite data available for DKFZphtes3_16b5.2)
(No Pfam data available for DKFZphtes3_16b5.2)
DKFZphtes3_16p3
group: testis derived
DKFZphtes3_16p3 encodes a novel 1663 amino acid protein without similarity to known proteins.
The novel protein is glutamine rich and contains a cell attachment RGD motif. Judging by the low number of ESTs and their origin, the protein seems to be expressed ubiquitously at low levels.
No informative BLAST results; no predictive Prosite, Pfam or SCOP motif.
The new protein can find application in studying the expression profile of testis-specific genes.
putative protein; perhaps complete cds.
Sequenced by BMFZ
Locus: unknown
Insert length: 5411 bp
Poly A stretch at pos. 5354, polyadenylation signal at pos. 5340
1 GGCGGCCAGG TGGAGGACCT GAGCAAGCAG CTCAAGCGTG TGGACGGCCA
51 GGTGCAGGGC ATCGCCACGC ACGTGCAGCA CTTCTCCCAG GCCAGCGGGC
101 TTGACCTGGC CGCGCTAGAG TGGCCGGAGG AGCAGGAGGT GGGCGTGCGG 151 GCGTTCGATA GGGTGCGGAC TGGGAGTATC ATGAAGGACG CCGCCGAGGA
2D1 GCTCAGCTTT GCCAGGGTAC TTTTACAGCG GGTTGATGAA CTAGAGAAGC
251 TATTCAAAGA TCGGGAGCAA TTCCTGGAAC TAGTCAGCCG GAAGCTGAGT
301 TTGGTTCCTG GTGCAGAAGA AGTCACCATG GTCACCTGGG AAGAGCTGGA
351 GCAGGCGATT ACGGACGGCT GGAGAGCCTC ACAAGCGGGC TCAGAAACAC 401 TTATGGGATT TTCTAAGCAC GGAGGGTTCA CTTCCTTAAC ATCACCTGAA
451 GGGACTCTAA GCGGAGACTC TACCAAGCAA CCAAGTATTG AGCAGGCTCT
501 GGATTCTGCC AGTGGTCTTG GCCCGGATCG GACTGCATCA GGATCTGGTG
551 GCACAGCACA CCCCTCTGAT GGGGTTTCCA GTAGGGAACA AAGCAAGGTC bOl CCCTCTGGTA CTGGGAGACA GCAGCAGCCG AGGGCCCGTG ATGAAGCTGG b51 CGTGCCACGA CTCCATCAGT CTTCTACATT CCAATTCAAA TCAGACTCAG
7D1 ATCGTCACAG GAGTAGAGAG AAGCTTACCT CGACACAACC AAGAAGAAAT
751 GCACGTCCTG GTCCAGTTCA ACAGGACTTA CCCTTGGCCA GAGACCAGCC
ΘD1 CAGTAGTGTG CCCGCTAGCC AGAGTCAGGT CCATCTAAGG CCAGATCGTC
851 GTGGGTTAGA ACCAACTGGC ATGAATCAGC CTGGATTAGT GCCTGCTAGC 101 ACTTACCCAC ATGGTGTGGT ACCCCTCAGC ATGGGTCAGC TTGGTGTGCC
151 ACCACCTGAA ATGGATGATC GGGAATTGAT ACCATTTGTC GTGGATGAGC
10D1 AACGTATGTT GCCACCATCA GTACCTGGCA GAGACCAGCA AGGATTGGAA
1051 CTACCTAGCA CAGACCAACA TGGTCTGGTT TCAGTCAGTG CATATCAGCA
1101 TGGTATGACA TTTCCTGGCA CAGACCAACG CAGTATGGAA CCACTTGGCA 1151 TGGATCAGCG TGGATGTGTA ATATCAGGCA TGGGTCAGCA AGGACTAGTA
12D1 CCCCCTGGTA TAGACCAGCA AGGATTGACA TTGCCTGTCG TCGATCAACA
1251 TGGCCTGGTT CTACCTTTTA CAGACCAGCA TGGTTTGGTA TCACCTGGTT
13D1 TGATGCCAAT TAGTGCAGAT CAGCAAGGTT TTGTGCAGCC CAGTTTGGAA
1351 GCAACTGGCT TCATACAACC TGGCACAGAG CAGCATGATT TGATCCAGTC 1401 TGGCAGATTT CAGCGTGCTT TGGTGCAGCG TGGTGCATAT CAGCCTGGCT
1451 TGGTCCAACC TGGTGCAGAT CAGCGTGGTT TGGTCCGGCC TGGAATGGAT
1501 CAGTCTGGTT TGGCCCAACC TGGTGCAGAT CAGCGTGGTT TGGTCTGGCC
1551 TGGAATGGAT CAGTCTGGTT TGGCCCAACC TGGTAGAGAT CAGCATGGTT IbOl TGATCCAGCC TGGCACAGGT CAGCATGATT TGGTCCAATC TGGCACAGGT lb51 CAGGGTGTCT TGGTACAGCC TGGTGTAGAT CAGCCTGGCA TGGTCCAACC
17D1 TGGCAGATTT CAGCGTGCTT TGGTGCAGCC TGGTGCATAT CAGCCTGGCT
1751 TGGTCCAACC TGGTGCAGAT CAGATTGATG TGGTGCAACC TGGTGCAGAT 1801 CAGCATGGTT TGGTACAATC TGGTGCAGAT CAGAGTGATT TGGCTCAACC
1651 TGGTGCAGTT CAGCATGGTT TGGTCCAACC TGGAGTAGAT CAGCGTGGTT
1101 TGGCACAACC TCGTGCAGAT CATCAGCGTG GTTTGGTCCC ACCTGGTGCA
1151 GATCAGCGTG GTTTGGTCCA ACCTGGTGCA GATCAGCATG GTTTGGTCCA
2001 ACCTGGAGTG GATCAGCATG GTTTGGCACA ACCTGGTGAA GTTCAGCGTA 2051 GTTTGGTGCA ACCTGGTATA GTTCAGCGTG GTTTGGTGCA ACCTGGTGCA
2101 GTTCAGCGTG GTTTGGTGCA ACCTGGTGCA GTTCAGCGTG GTTTGGTCCA
2151 ACCTGGAGTG GATCAGCGTG GTTTGGTTCA ACCTGGTGCA GTTCAGCGTG
2201 GTTTGGTCCA ACCTGGTGCA GTTCAGCATG GTTTGGTCCA ACCTGGTGCA
2251 GATCAGCGTG GTTTGGTCCA ACCTGGAGTG GATCAGCGTG GTTTGGTGCA 2301 ACCTGGAGTG GATCAGCGTG GTTTGGTCCA ACCTGGAATG GACCAGCGTG
2351 GTTTGATCCA ACCTGGTGCA GATCAGCCT6 GTTTGGTCCA GCCTGGTGCA
2401 GGTCAGCTGG GTATGGTGCA GCCTGGAATA GGTCAGCAAG GTATGGTGCA
2451 ACCTCAGGCA GATCCACATG GCCTGGTACA ACCTGGTGCC TATCCTCTTG
2501 GTTTGGTACA ACCTGGTGCA TATTTGCATG ATTTATCTCA ATCTGGGACA 2551 TATCCACGTG GTCTGGTGCA GCCAGGAATG GATCAGTATG 6TTTGAGACA
2b01 ACCTGGTGCA TATCAGCCAG GCTTGATAGC ACCAGGCACA AAGCTTCGTG
2b51 GCTCTTCAAC ATTCCAGGCA GATTCTACAG GTTTTATATC AGTACGTCCA
27D1 TATCAACATG GTATGGTACC TCCTGGCAGA GAACAATACG GCCAGGTGTC
2751 ACCACTCCTA GCCAGTCAAG GTTTGGCATC ACCTGGTATA GATCGAAGGA 2801 GTTTGGTACC ACCAGAAACT TATCAGCAAG GTTTGATGCA TCCTGGCACA
2851 GACCAGCACA GCCCAATACC ACTGAGTACA GGTTTGGGAT CTACACACCC
2101 AGATCAACAG CATGTGGCAT CACCTGGCCC AGGTGAGCAT GACCAGGTAT
2151 ACCCAGATGC AGCTCAGCAT GGCCATGCTT TCTCTCTCTT TGACAGTCAT
30D1 GATTCAATGT ATCCTGGTTA TCGTGGCCCA GGGTATCTAA GTGCTGATCA 3051 GCATGGCCAG GAAGGTTTGG ATCCAAATAG AACACGAGCC TCGGACCGAC
3101 ATGGAATTCC TGCCCAGAAG GCCCCAGGCC AAGATGTCAC TCTTTTCA6G
3151 AGTCCAGACT CCGTCGACCG AGTCTTATCA GAAGGGAGCG AAGTCTCGAG
3201 TGAAGTCCTG AGTGAGCGAC GCAATTCACT GCGTAGAATG AGTTCTAGTT
3251 TCCCCACGGC AGTGGAGACA TTTCATCTGA TGGGAGAGCT CAGTAGCCTC 3301 TATGTGGGGC TAAAGGAGAG TATGAAGGAT CTGGATGAGG AGCAGGCCGG
3351 CCAAACCGAC TTGGAGAAGA TCCAGTTCCT GCTGGCACAG ATGGTCAAAA
3401 GGACCATACC TCCTGAACTG CAGGAGCAGC TGAAGACCGT AAAGACGCTA
3451 GCCAAAGAAG TTTGGCAGGA GAAAGCAAAA GTGGAAAGGC TGCAGAGGAT
3501 CCTGGAAGGG GAAGGGAATC AAGAAGCAGG GAAGGAACTG AAAGCTGGAG 3551 AGCTGAGATT GCAGCTGGGT GTCCTCAGAG TCACCGTGGC TGACATAGAA
3b01 AAGGAGCTGG CCGAGTTGAG GGAGAGCCAA GACAGGGGCA AGGCTGCCAT
3b51 GGAAAATTCT GTCTCTGAAG CCTCCCTTTA CCTGCAGGAC CAGTTGGACA
3701 AGCTCAGGAT GATCATTGAG AGCATGCTGA CCTCCTCCTC CACGCTCCTG
3751 TCCATGAGCA TGGCCCCGCA CAAGGCCCAC ACCTTGGCTC CTGGCCAGAT 3801 CGACCCTGAG GCCACCTGTC CAGCCTGCAG CCTGGATGTG AGCCATCAGG
3851 TCAGCACGCT GGTGCGGCGC TATGAGCAAC TCCAAGACAT GGTCAACAGC
3101 CTGGCCGTCT CCCGACCCTC CAAGAAGGCC AAGCTCCAGA GACAGGACGA
3151 GGAGCTGCTG GGCCGTGTGC AGAGTGCCAT CCTGCAGGTG CAGGGTGACT
40D1 GCGAGAAGCT CAACATCACC ACCAGCAACC TCATCGAGGA CCATCGGCAG 4051 AAACAGAAGG ACATTGCTAT GCTGTACCAG GGTCTGGAGA AGCTCGAAAA
41D1 GGAAAAGGCC AACAGGGAGC ACCTGGAGAT GGAGATCGAT GTGAAAGCCG
4151 ACAAGAGTGC TCTGGCCACC AAAGTGAGCC GTGTCCAGTT TGATGCCACC
4201 ACGGAGCAGC TGAACCACAT GATGCAGGAG CTGGTGGCCA AGATGAGCGG
4251 GCAGGAGCAG GACTGGCAGA AGATGCTGGA CAGGCTGCTC ACAGAGATGG 4301 ACAACAAGCT GGACCGCCTG GAGCTGGACC CAGTGAAGCA GTTGCTGGAG
4351 GATCGGTGGA AATCGCTGCG ACAGCAGCTC AGGGAGCGCC CCCCACTCTA
4401 CCAGGCAGAC GAGGCGGCTG CCATGCGGAG GCAGCTCCTG GCACATTTCC
4451 ACTGCCTCTC ATGTGACCGG CCCTTGGAGA CACCTGTGAC TGGACATGCC 4501 ATCCCCGTGA CCCCCGCGGG TCCAGGCCTA CCTGGGCACC ATTCCATCCG
4551 CCCCTACACG GTGTTTGAAC TGGAGCAGGT CCGGCAGCAT AGCCGCAACC
4b01 TCAAGCTGGG CAGCGCCTTC CCTCGGGGTG ACCTGGCGCA GATGGAGCAG
4bSl AGCGTGGGGC GCCTGCGCTC CATGCACTCC AAGATGCTGA TGAACATTGA
4701 GAAGGTGCAG ATCCACTTCG GGGGCTCCAC CAAGGCCAGC AGCCAGATAA
4751 TCCGCGAGCT GCTGCACGCC CAGTGCCTGG GCTCCCCCTG CTACAAACGG
4601 GTGACAGATA TGGCTGATTA CACCTACTCA ACTGTGCCCC GGCGCTGCGG
4851 GGGCAGCCAC ACCCTCACCT ACCCCTACCA CCGCAGCCGC CCGCAGCACC
4101 TTCCCCGGGG CCTGTATCCT ACTGAAGAGA TCCAGATTGC CATGAAGCAT
4151 GATGAGGTGG ACATCTTGGG CCTGGATGGC CACATTTACA AGGGACGGAT
5001 GGACACAAGG CTGCCAGGCA TCCTCCGAAA AGACAGCTCA GGGACCTCAA
5051 AGCGCAAGTC CCAGCAGCCC AGGCCCCACG TGCACAGGCC GCCATCCCTC
5101 AGCAGCAATG GCCAGCTGCC CTCTCGGCCA CAGAGCGCCC AGATTTCGGC
5151 TGGCAACACC TCAGAAAGAT AGACCTTCCT CCGAGGGCCG TCTCTCCCAG
5201 CCGAACACAG CCCACCCGCC CAGCTCCGCC TCGGTGGCAA ACAGGGGGCT
5251 GGAGAGGCAC GTGGACATGC CTCCTGGGGA GGGGCTCGAG GAGCCCACGC
5301 GGGGGCCGCG GTCCAGCACC GCTCAGTGAG CGGAGGTGTA AATAAACATT
5351 CAGGAGGAAA AAAAAAAAAA AAAAAAAAAA AAAAAAAAAA AAAAAAAAAA
5401 AAAAAAAAAA A
BLAST Results
No BLAST result
Medline entries
No Medline entry
Peptide information for frame 1
ORF from 181 bp to 5169 bp; peptide length: 1663
Category: putative protein
Classification: no clue
Prosite motifs: RGD (1482-1484)
1 MKDAAEELSF ARVLLiQRVDE LEKLFKDREύ FLELVSRKLS LVPGAEEVTM 51 VTUEELEiQAI TDGURASlQAG SETLMGFSKH GGFTSLTSPE GTLSGDSTKiQ
101 PSIEiQALDSA SGLGPDRTAS GSGGTAHPSD GVSSREUSKV PSGTGRiQlQlQP
151 RARDEAGVPR LHiQSSTFiQFK SDSDRHRSRE KLTSTiQPRRN ARPGPVlQiQDL
2D1 PLARDiQPSSV PASIQSIQVHLR PDRRGLEPTG MNώPGLVPAS TYPHGVVPLS
251 MGώLGVPPPE MDDRELIPFV VDElQRMLPPS VPGRDlQiQGLE LPSTDύHGLV 301 SVSAYiQHGMT FPGTDiQRSME PLGMDiQRGCV ISGMGώlQGLV PPGIDiQlQGLT
351 LPVVDώHGLV LPFTDiQHGLV SPGLMPISAD (QlQGFVlQPSLE ATGFIiQPGTE
4D1 (QHDLKQSGRF (QRALVIQRGAY (QPGLViQPGAD (QRGLVRPGMD (QSGLAiQPGAD
451 (QRGLVUPGMD iQSGLAlQPGRD (QHGLIlQPGTG (QHDLViQSGTG (QGVLViQPGVD
501 (QPGMVlQPGRF (QRALViQPGAY (QPGLViQPGAD iQIDVVtQPGAD (QHGLViQSGAD 551 (QSDLAiQPGAV (QHGLViQPGVD (QRGLAlQPRAD HiQRGLVPPGA DiQRGLVlQPGA bOl DiQHGLVlQPGV DiQHGLAlQPGE ViQRSLViQPGI ViQRGLViQPGA VIQRGLVIQPGA b51 ViQRGLViQPGV DiQRGLVlQPGA VIQRGLVIQPGA VIQHGLVIQPGA DIQRGLVIQPGV
7D1 DIQRGLVIQPGV DiQRGLVώPGM DiQRGLIiQPGA DlQPGLVlQPGA GiQLGMVlQPGI 751 GIQ IQGM V IQ P IQ A DPHGLVlQPGA YPLGLViQPGA YLHDLSiQSGT YPRGLVlQPGM
801 D IQY GL R IQ P G A YlQPGLIAPGT KLRGSSTFlQA DSTGFISVRP YiQHGMVPPGR
651 ElQYGiQVSPLL ASiQGLASPGI DRRSLVPPET YiQlQGLMHPGT DlQHSPIPLST
101 G L GST H PD IQIQ HVASPGPGEH DlQVYPDAAiQH GHAFSLFDSH DSMYPGYRGP 151 GYLSADiQHGlQ EGLDPNRTRA SDRHGIPAiQK APGiQDVTLFR SPDSVDRVLS
10D1 EGSEVSSEVL SERRNSLRRM SSSFPTAVET FHLMGELSSL YVGLKESMKD
1051 LDEElQAGlQTD LEKIlQFLLAiQ MVKRTIPPEL iQElQLKTVKTL AKEVUlQEKAK
11D1 VERLiQRILEG EGNOEAGKEL KAGELRLiQLG VLRVTVADIE KELAELRESώ
1151 DRGKAAMENS VSEASLYLώD (QLDKLRMIIE SMLTSSSTLL SMSMAPHKAH 1201 TLAPGiQIDPE ATCPACSLDV SHiQVSTLVRR YEIQLIQDMVNS LAVSRPSKKA
1251 KL IQRIQD EE LL GRV IQS A ILIQV (QGDCEKLNIT TSNLIEDHRβ KώKDIAMLYiQ
1301 GLEKLEKEKA NREHLEMEID VKADKSALAT KVSRVlQFDAT TElQLNHMMlQE
1351 L V A KM SG IQ E IQ DUlQKMLDRLL TEMDNKLDRL ELDPVKiQLLE DRUKSLRiQlQL
1401 RERPPLYlQAD EAAAMRRiQLL AHFHCLSCDR PLETPVTGHA IPVTPAGPGL 1451 PGHHSIRPYT VFE LE IQVR IQ H SRNLKLGSAF PRGDLAlQMEiQ SVGRLRSMHS
1501 KMLMNIEKVώ IHFGGSTKAS SiQIIRELLHA (QCLGSPCYKR VTDMADYTYS
1551 TVPRRCGGSH TLTYPYHRSR PiQHLPRGLYP TEEIlQIAMKH DEVDILGLDG
IbOl HIYKGRMDTR LPGILRKDSS GT S KRKS IQ IQ P RPHVHRPPSL SSNGlQLPSRP
IbSl (QSAiQISAGNT SER
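The peptide report lists a Prosite RGD cell attachment motif with its residue coordinates. A minimal Python sketch of how such coordinates can be checked against the listed sequence follows. The helper name `find_motif` is hypothetical, and `FRAGMENT` is residues 1451 to 1500 of the peptide above as read after repairing the extraction damage (Q and W restored), which is itself an assumption about the original text.

```python
import re


def find_motif(peptide, motif="RGD", offset=1):
    """Return (start, end) positions, inclusive, of every motif occurrence.

    offset is the sequence position of peptide[0] (1 for a full-length peptide).
    """
    return [
        (m.start() + offset, m.start() + offset + len(motif) - 1)
        for m in re.finditer(re.escape(motif), peptide)
    ]


# Residues 1451-1500 of the listed peptide (assumed reading).
FRAGMENT = "PGHHSIRPYTVFELEQVRQHSRNLKLGSAFPRGDLAQMEQSVGRLRSMHS"

print(find_motif(FRAGMENT, offset=1451))
```

On this fragment the single RGD occurrence falls at residues 1482 to 1484.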
BLASTP hits
No BLASTP hits available
Alert BLASTP hits for DKFZphtes3_16p3, frame 1
No Alert BLASTP hits found
Pedant information for DKFZphtes3_16p3, frame 1
Report for DKFZphtes3_16p3.1
[LENGTH] 1723
[MW] 167354.16
[pI] 6.11
[HOMOL] TREMBL:AF025461_4 gene: "M01D1.5"; Caenorhabditis elegans cosmid M01D1. 1e-47
[FUNCAT] 30.03 organization of cytoplasm [S. cerevisiae, YDL058w] 8e-07
[FUNCAT] 08.07 vesicular transport (golgi network, etc.) [S. cerevisiae, YDL058w] 8e-07
[FUNCAT] 99 unclassified proteins [S. cerevisiae, YOR216c] 2e-04
[FUNCAT] 11.04 dna repair (direct repair, base excision repair and nucleotide excision repair) [S. cerevisiae, YKR095w] 0.001
[FUNCAT] 30.10 nuclear organization [S. cerevisiae, YKR095w] 0.001
[BLOCKS] PR01016C
[BLOCKS] BP02308D
[BLOCKS] PR00543H
[BLOCKS] PR00210G
[BLOCKS] PR00210E
[BLOCKS] BP04236A
[PIRKW] RNA binding 3e-06
[PIRKW] hydroxylysine 2e-10
[PIRKW] endoplasmic reticulum 7e-18
[PIRKW] ATP 2e-06
[PIRKW] phosphoprotein 3e-06
[PIRKW] seed 4e-34
[PIRKW] saliva 2e-10
[PIRKW] glycoprotein 2e-10
[PIRKW] heterotrimer 3e-06
[PIRKW] alternative splicing 2e-10
[PIRKW] P-loop 2e-06
[PIRKW] storage protein 4e-34
[PIRKW] extracellular matrix 2e-10
[PIRKW] membrane protein 7e-18
[PIRKW] protein biosynthesis 7e-16
[SUPFAM] myosin motor domain homology 2e-06
[SUPFAM] elastin 2e-10
[SUPFAM] glutenin 2e-37
[SUPFAM] myosin heavy chain 2e-06
[SUPFAM] unassigned ribonucleoprotein repeat-containing proteins 3e-06
[SUPFAM] proline-rich protein 2e-10
[SUPFAM] ribonucleoprotein repeat homology 3e-06
[PROSITE] RGD 1
[KW] All_Alpha
[KW] LOW_COMPLEXITY 2.84 %
[KW] COILED_COIL 1.80 %
SElQ GGiQVEDLSK(QLKRVDGiQViQGIATHV(QHFS(QASGLDLAALEUPEE(QEVGVRAFDRVRTGSI SEG
PRD cccchhhhhhhhhhhhheeeeeeeeeeccccccchhhhhhhhccceeeeeeeeeeecccc COILS
SElQ MKDAAEELSFARVLLiQRVDELEKLFKDRElQFLELVSRKLSLVPGAEEVTMVTUEELEiQAI SEG
PRD chhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhccccccchhhhhhhhhhhhh COILS
SElQ TDGURASiQAGSETLMGFSKHGGFTSLTSPEGTLSGDSTKlQPSIElQALDSASGLGPDRTAS SEG
PRD hhhhccccccceeeeeccccccccccccccccccccccchhhhhhhhhhccccccceeec COILS
SElQ GSGGTAHPSDGVSSREiQSKVPSGTGRlQlQlQPRARDEAGVPRLHiQSSTFlQFKSDSDRHRSRE SEG
PRD cccccccccceeeccccccccccccccchhhhhhhhccchhhhhcccccccccccccccc COILS
SElQ KLTST(QPRRNARPGPV(Q(QDLPLARD(QPSSVPAS<QS(QVHLRPDRRGLEPTGMN(QPGLVPAS SEG xxxxxxxxxxxx PRD cccccccccccccccccccccccccccccccccccccccccccccccccccccccccccc COILS SElQ TYPHGVVPLSMGlQLGVPPPEMDDRELIPFVVDElQRMLPPSVPGRDlQiQGLELPSTDiQHGLV
SEG
PRD cccccccccccccccccccccccccccccccccccccccccccccccccccccccccccc COILS
SElQ SVSAYlQHGMTFPGTDiQRSMEPLGMDiQRGCVISGMGiQiQGLVPPGIDlQlQGLTLPVVDiQHGLV SEG
PRD cccccccccccccccccccccccccccccccccccccccccccccccccccccccccccc COILS
SElQ LPFTD(QHGLVSPGLMPISAD(Q(QGFV<QPSLEATGFI(QPGTEιQHDLI<QSGRF<QRALV(QRGAY SEG PRD cccccccccccccccccccccccccccccccccccccccccccccccccccccccccccc COILS
SElQ (QPGLViQPGADiQRGLVRPGMDiQSGLAiQPGADiQRGLVUPGMDlQSGLAlQPGRDiQHGLIiQPGTG SEG
PRD cccccccccccccccccccccccccccccccccccccccccccccccccccccccccccc COILS
SElQ (QHDLViQSGTGiQGVLViQPGVDiQPGMVlQPGRFlQRALVlQPGAYiQPGLViQPGADiQIDVViQPGAD SEG
PRD cccccccccccccccccccccccccccccccccccccccccccccccccccccccccccc COILS
SElQ (QHGLVιQSGADιQSDLA<QPGAV<QHGLV(2PGVD(QRGLA(2PRADH<QRGLVPPGADιQRGLVι2PGA
SEG
PRD cccccccccccccccccccccccccccccccccccccccccccccccccccccccccccc
COILS
SElQ D(QHGLVιQPGVD(QHGLAlQPGEV<QRSLV(QPGIV(QRGLV<QPGAV(QRGLVlQPGAV(QRGLV(QPGV SEG
PRD cccccccccccccccccccccccccccccccccccccccccccccccccccccccccccc COILS
SElQ D(QRGLV<QPGAV(QRGLVιQPGAV<QHGLV(QPGAD (QRGLV(QPGVD<QRGLVιQPGVD(QRGLVιQPGM SEG PRD cccccccccccccccccccccccccccccccccccccccccccccccccccccccccccc COILS
SElQ D(QRGLI(QPGAD (QPGLV<QPGAG<QLGMV<QPGIG(QlQGMV(QP<QADPHGLV(QPGAYPLGLV(QPGA SEG
PRD cccccccccccccccccccccccccccccccccccccccccccccccccccccccccccc COILS
SElQ YLHDLSflSGTYPRGLVtQPGMDlQYGLRiQPGAYiQPGLIAPGTKLRGSSTFiQADSTGFISVRP SEG
PRD cccccccccccccccccccccccccccccccccccccccccccccccccccccccccccc COILS
SElQ YιQHGMVPPGRE(QYG!2VSPLLAS<QGLASPGIDRRSLVPPETY(Q<QGLMHPGTD(QHSPIPLST SEG
PRD cccccccccccccccccccccccccccccccccccccccccccccccccccccccccccc COILS
SElQ GLGSTHPDIQIQHVASPGPGEHDIQVYPDAAIQHGHAFSLFDSHDSMYPGYRGPGYLSADIQHGIQ SEG
PRD cccccccccccccccccccccccccccccccccccccccccccccccccccccccccccc COILS
SElQ EGLDPNRTRASDRHGIPAιQKAPG(3DVTLFRSPDSVDRVLSEGSEVSSEVLSERRNSLRRM
SEG xxxxxxxxxxxxxxxxxxxxxxx-
PRD ccccccccccccccccccccccccceeeeeccccccccccccchhhhhhhhhhhcccccc COILS
SElQ SSSFPTAVETFHLMGELSSLYVGLKESMKDLDEE(QAG(QTDLEKI(QFLLA(QMVKRTIPPEL SEG
PRD cccccceeeeeeeeeccceeehhhhhhhhhhhhhcccchhhhhhhhhhhhhhhhhcchhh COILS
SElQ (QEiQLKTVKTLAKEVUlQEKAKVERLlQRILEGEGNiQEAGKELKAGELRLiQLGVLRVTVADIE
SEG PRD hhhhhhhhhhhhhhhhhhhhhhhhhhhhhhccchhhhhhhhhhhhhhhhhhhhhhhhhhh COILS CCCCCCCCCCCCCCCCCCCCCC
SElQ KELAELRESiQDRGKAAMENSVSEASLYLiQDiQLDKLRMIIESMLTSSSTLLSMSMAPHKAH SEG xxxxxxxxxxxxxx
PRD hhhhhhhhhhhchhhhhcccchhhhhhhhhhhhhhhhhhhhhhccccceeeehhhhhhhh COILS ccccccccc SElQ TLAPG(QIDPEATCPACSLDVSH(QVSTLVRRYE(QL(QDMVNSLAVSRPSKKAKL(QRιQDEELL SEG
PRD hhcccccccccccccccccchhhhhhhhhhhhhhhhhhhhhhccccchhhhhhhhhhhhh COILS
SElQ GRVlQSAILiQViQGDCEKLNITTSNLIEDHRiQKiQKDIAMLYlQGLEKLEKEKANREHLEMEID SEG
PRD hhhhhhhhhhhhhhhcchhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhh COILS
SElQ VKADKSALATKVSRViQFDATTElQLNHMMlQELVAKMSGlQEiQDUiQKMLDRLLTEMDNKLDRL SEG
PRD hhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhcccchhhhhhhhhhhhhhhhhhhhh COILS
SElQ ELDPVKIQLLEDRUKSLRIQIQLRERPPLYIQADEAAAMRRIQLLAHFHCLSCDRPLETPVTGHA SEG
PRD hhhhhhhhhhhhhhhhhhhhhhcccchhhhhhhhhhhhhhhhhhhcccccccccccccee COILS
SElQ IPVTPAGPGLPGHHSIRPYTVFELE(QVR<QHSRNLKLGSAFPRGDLA(QME<QSVGRLRSMHS SEG
PRD eeeecccccccccccccccchhhhhhhhhhhhhhcccccccccchhhhhhhhhhhhhhhh COILS
SElQ KMLMNIEKV(QIHFGGSTKASS(QIIRELLHA(QCLGSPCYKRVTDMADYTYSTVPRRCGGSH SEG
PRD hhhhhheeeeeecccccchhhhhhhhhhhhhhcccccceeeccccccceeeccccccccc COILS
SElQ TLTYPYHRSRP(QHLPRGLYPTEEI(QIAMKHDEVDILGLDGHIYKGRMDTRLPGILRKDSS SEG PRD ccccccccccccccccccccchhhhhhhhhcceeeeccccceeeecccccccceeecccc COILS
SElQ GTSKRKSIQIQPRPHVHRPPSLSSNGIQLPSRPIQSAUISAGNTSER SEG
PRD cccccccccccccccccccccccccccccccceeeeecccccc COILS
Prosite for DKFZphtes3_lbp3.1
PS00016 1542->1545 RGD PDOC00016
(No Pfam data available for DKFZphtes3_lbp3.1)
DKFZphtes3_17i21
group: transmembrane protein
DKFZphtes3_17i21 encodes a novel 224 amino acid protein without similarity to known proteins. The novel protein contains 2 transmembrane regions. ESTs can be found in testis, retina and brain. No informative BLAST results; no predictive Prosite, Pfam or SCOP motifs. The new protein can find application in studying the expression profile of testis-specific genes and as a new marker for testicular cells.
unknown protein
Pedant: contains signal peptide (frame 1) and TRANSMEMBRANE 2 (frame 2); perhaps complete cds.
Sequenced by GBF
Locus: unknown
Insert length: 1518 bp
Poly A stretch at pos. 1480, polyadenylation signal at pos. 1454
1 GCCAGACAGC TAGGTGTCAT TCAGGGCTGG TGTCCTCTGT CCAGGCCATC
51 ATGGCCTCCA CTGCCGGCTA CATCGTCTCC ACCTCCTGCA AGCACATCAT
101 TGATGACCAA CACTGGCTGT CCTCTGCCTA CACGCAATTT GCTGTGCCCT
151 ACTTCATCTA CGACATCTAC GCCATGTTCC TCTGTCACTG GCACAAGCAC
201 CAGGTCAAAG GGCATGGAGG GGACGACGGA GCGGCCAGAG CCCCGGGCAG
251 CACGTGGGCC ATAGCGCGTG GCTACCTGCA CAAGGAGTTC CTCATGGTGC
301 TCCACCATGC CGCCATGGTG CTGGTGTGCT TCCCACTCTC AGTGGTGTGG
351 CGACAGGGTA AGGGAGACTT CTTTCTGGGT TGCATGTTGA TGGCACAGGT
401 CAGCACGCCC TTCGTCTGCC TTGGCAAGAT CCTCATCCAG TACAAGCAGC
451 AGCACACACT GCTGCACAAG GTGAACGGGG CCCTGATGCT GCTCAGCTTC 501 CTCTGCTGCC GGGTGCTGCT CTTTCCCTAC CTGTACTGGG CCTACGGGCG
551 CCATGCCGGC CTGCCCCTGC TGGCCGTGCC CCTGGCCATC CCTGCCCACG
601 TCAACCTGGG CGCTGCGCTG CTCCTGGCCC CTCAGCTCTA CTGGTTCTTC
651 CTCATCTGCC GTGGGGCCTG CCGCCTCTTC TGGCCCCGCT CCCGGCCGCC
701 CCCGGCCTGC CAGGCCCAGG ACTGAGGCCG GGGGCCGGGA CCCTCCCCCT
751 CCCCACCCCC ACCCCCGTGG AGACAGGGCT CTGGGGCTGA TGGCTGGGGT
801 TGGGAGCCAG GGTCCTCTTG CCCGGACAAC CCCAGGACTG ACGATGACCC
851 CGAAAGGGAA GAGGCCCCAT CTCTCGGGGA CTGAGGGGGT GGAGAGAGGG
901 GACCTCTTCC CCCTACTCTG CCCCCTTCCT GCACACCCTT GCGCTGGAGG
951 kGGGGkGGGG GCACCGCCTC CCACCCACTG AGGGCAGGAG GGCTTGTGGG
1001 GAGGGACACC AACAGGGTTT CAAGGGGACC AGGAGTCAGA ATGTGGGGAG
1051 ACGCCTCTGC CAAGGCCATC CCAGCCCCTA TGCTGCCATC CCCCAGGGCT
1101 CCCCATCACC CGAGAGGAGA GGACGCCCCA ACTAACCCCC GCTGGCCCTC
1151 GGGCCTCCCG AGTGGCCGGC TGCAACCACG GCTCCTCTCC AGGGTAGGCC 1201 AGCTTGAGGA ATCTTATTTA TTTTATTTAT TTACCCAAAT TTGAACTAGT
1251 CTGTTGGGTT GGGGGAAGGA GGTGGCTGCT ACCCCCAAGC CTTCCCAGTG
1301 CTGACAACCC CGGGGGCAGG CGAGGGCGCC CAGTCCCTCA CCATCGGCTG
1351 CACATCGCGC CCTCGGGCCC TGCCATGTCC CTGGTGCTAC TGACCTCTCA
1401 AGGCTTCCTC CAATCTGGGG TCGGGGGACC CTGGGAGGTG CTTTACAGAC
1451 CGCTAATAAA AGACGATCTG CGTGAACGCC AAAAAAAAAA AAAAAAAAAA
1501 AAAAAAAAAA AAAAAAAA
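The poly A stretch and polyadenylation signal positions quoted in the clone header can be located programmatically. The following is a minimal sketch, not the annotation pipeline used for this listing: the function name `find_polya_features`, the minimum run length of 15, and the restriction to the canonical `AATAAA` hexamer are all assumptions made for illustration, and position conventions may differ by one base from those used in the listing.

```python
import re

def find_polya_features(seq, min_run=15):
    """Return (signal_start, tail_start) as 1-based positions, or None where absent.

    tail_start   : first base of a run of at least `min_run` A's at the 3' end
    signal_start : first base of the last AATAAA hexamer upstream of that run
    """
    seq = seq.upper()
    # Poly(A) tail: a run of A's anchored at the very end of the insert.
    tail = re.search(r"A{%d,}$" % min_run, seq)
    tail_start = tail.start() + 1 if tail else None
    # Search for the hexamer only in the region upstream of the tail.
    limit = tail.start() if tail else len(seq)
    idx = seq.rfind("AATAAA", 0, limit)
    signal_start = idx + 1 if idx != -1 else None
    return signal_start, tail_start

# Toy insert: 7 filler bases, a 30-base fragment containing AATAAA, then 20 A's.
toy = "GATTACA" + "CGCTAATAAAAGACGATCTGCGTGAACGCC" + "A" * 20
print(find_polya_features(toy))
```

For a clone with no hexamer upstream of the tail, the first element is `None`, which corresponds to the "no polyadenylation signal found" remark used elsewhere in this listing.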
BLAST Results
No BLAST result
Medline entries
No Medline entry
Peptide information for frame 3
ORF from 51 bp to 722 bp; peptide length: 224
Category: putative protein
Classification: Transmembrane proteins unclassified
1 MASTAGYIVS TSCKHIIDDQ HWLSSAYTQF AVPYFIYDIY AMFLCHWHKH
51 QVKGHGGDDG AARAPGSTWA IARGYLHKEF LMVLHHAAMV LVCFPLSVVW
101 RQGKGDFFLG CMLMAEVSTP FVCLGKILIQ YKQQHTLLHK VNGALMLLSF
151 LCCRVLLFPY LYWAYGRHAG LPLLAVPLAI PAHVNLGAAL LLAPQLYWFF
201 LICRGACRLF WPRSRPPPAC QAQD
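The peptide listing follows from the nucleotide listing by translating the stated ORF in frame. The sketch below illustrates this with the standard genetic code; the function name `translate_orf` and the compact codon-table construction are illustrative choices, and only the first 100 bases of the DKFZphtes3_17i21 insert are used.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order (first, second, third base).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

def translate_orf(seq, start, end):
    """Translate seq[start..end] (1-based, inclusive) codon by codon."""
    orf = seq.upper()[start - 1:end]
    if len(orf) % 3:
        raise ValueError("ORF length is not a multiple of three")
    return "".join(CODON_TABLE[orf[i:i + 3]] for i in range(0, len(orf), 3))

# First two rows (positions 1-100) of the DKFZphtes3_17i21 nucleotide listing.
head = ("GCCAGACAGCTAGGTGTCATTCAGGGCTGGTGTCCTCTGTCCAGGCCATC"
        "ATGGCCTCCACTGCCGGCTACATCGTCTCCACCTCCTGCAAGCACATCAT")

# The ORF starts at base 51; the first ten codons give the first peptide block.
print(translate_orf(head, 51, 80))  # MASTAGYIVS
```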
BLASTP hits
No BLASTP hits available
Alert BLASTP hits for DKFZphtes3_17i21, frame 3
No Alert BLASTP hits found
Pedant information for DKFZphtes3_17i21, frame 3
Report for DKFZphtes3_17i21.3
[LENGTH! 224 [MU! 25224-11 [pi! 1-03 [HOMOL! TREMBLNEU : AF181b4b_l gene: "BcDNA - GH1232b"i product: "BcDNA - GH1232b"i Drosophila melanogaster BcDNA- GHD2340 (BcDNA-GHD2340) mRNAi complete eds- le-20 [BLOCKS! PR00b32H [BLOCKS! PR00104A [BLOCKS! BL01243C [KU! TRANSMEMBRANE 2 [KU! LOU COMPLEXITY b-25 *
SE(2 MASTAGYIVSTSCKHIIDDlQHULSSAYTiQFAVPYFIYDIYAMFLCHUHKHiQVKGHGGDDG
SEG
PRD cccceeeeeccccceeecchhhhhhhhhhheeehhhhhhhhhhhhhhhhhhccccccccc MEM
SElQ AARAPGSTUAIARGYLHKEFLMVLHHAAMVLVCFPLSVVURiQGKGDFFLGCMLMAEVSTP
SEG
PRD ccccccceeeeecccchhhhhhhhhhhhhhhhcccceeeeecccccchhhhhhhhhhccc MEM MMMMMMMMMMMMMMMMM
SElQ FVCLGKILIiQYKlQiQHTLLHKVNGALMLLSFLCCRVLLFPYLYUAYGRHAGLPLLAVPLAI
SEG xxxxxxxxxxxx
PRD ccchhhhhhhhhhhhhhhhccchhhhhhhhhhhhheeecceeeeccccccccceeeeccc MEM MMMMMMMMMMMMMMMMM
SElQ PAHVNLGAALLLAPlQLYUFFLICRGACRLFUPRSRPPPACiQAiQD
SEG xx
PRD cchhhhhhhhhhhccceeeeeecccccccccccccccccccccc MEM
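The MEM track above marks the predicted membrane-spanning segments. The listing does not state which prediction method produced it, so the following sketch substitutes a well-known alternative, a Kyte-Doolittle sliding-window hydropathy profile, purely to illustrate how hydrophobic stretches of this kind can be flagged. The window of 19 residues and the threshold of 1.6 are conventional choices for that scale and are assumptions here; the function names are hypothetical.

```python
# Kyte-Doolittle hydropathy index per residue.
KD = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}

def hydropathy_profile(peptide, window=19):
    """Mean hydropathy of every full window along the peptide."""
    scores = [KD[aa] for aa in peptide.upper()]
    return [sum(scores[i:i + window]) / window
            for i in range(len(scores) - window + 1)]

def candidate_tm_windows(peptide, window=19, threshold=1.6):
    """1-based start positions of windows whose mean hydropathy exceeds the threshold."""
    profile = hydropathy_profile(peptide, window)
    return [i + 1 for i, h in enumerate(profile) if h > threshold]

# Toy peptide: a 19-residue leucine stretch flanked by charged residues.
toy = "D" * 10 + "L" * 19 + "D" * 10
print(candidate_tm_windows(toy))
```

Runs of consecutive window starts indicate a candidate membrane segment; a dedicated predictor would normally be used to confirm the number and boundaries of such regions.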
(No Prosite data available for DKFZphtes3_17i21.3)
(No Pfam data available for DKFZphtes3_17i21.3)
DKFZphtes3_18nl4
group: transcription factors
DKFZphtes3_18nl4 encodes a novel 377 amino acid protein with similarity to human giantin. Giantin is discussed as an autoantigen in rheumatoid arthritis. The novel protein contains a leucine zipper and a putative helix-loop-helix DNA-binding domain. Therefore it might be a novel transcription factor. Most EST hits are from testis and germ cells.
The new protein can find application in modulation of gene expression and in expression profiling.
unknown protein; see DKFZphtes3_30i23; wrong orientation; perhaps complete cds.
Sequenced by MediGenomix
Locus: /chromosome="16"
Insert length: 5282 bp
Poly A stretch at pos. 5242, polyadenylation signal at pos. 5227
1 CCGGCACCCG GAGCTCCTGG GCACACGGCA TTGGCAGGGG CCGCTTCGGC 51 AGAGTGATGA CTGATGATGA GTCCGAGAGC GTCCTCTCCG ACTCCCATGA
IDl AGGGTCGGAG CTGGAGCTGC CTGTTATCCA GCTGTGCGGG CTGGTGGAGG
151 AGCTCAGCTA TGTAAACTCT GCTCTCAAAA CTGAGACTGA GATGTTTGAG
201 AAATATTACG CTAAACTGGA GCCCAGGGAT CAGCGACCTC CACGATTATC
251 AGAAATTAAA ATATCAGCAG CAGATTATGC ACAGTTTCGA GGCAGGCGTA 3D1 GATCCAAATC CCGGACAGGT ATGGACCGTG GGGTAGGCCT GACTGCCGAC
351 CAAAAACTTG AGCTGGTACA AAAAGAGGTT GCGGACATGA AGGATGACTT
401 ACGACACACA AGGGCAAATG CGGAACGCGA CCTGCAGCAT CACGAGGCGA
451 TCATTGAGGA GGCTGAAATT CGATGGAGTG AAGTTTCGAG AGAAGTGCAT
5D1 GAGTTTGAAA AAGATATTCT AAAAGCCATA TCCAAGAAGA AAGGGAGTAT 551 TTTGGCCACT CAGAAAGTGA TGAAATACAT TGAGGACATG AACCGCCGGA bOl GGGATAATAT GAAGGAGAAA TTACGTTTGA AAAATGTTTC TCTCAAAGTT bSl CAGAGGAAAA AAATGCTTTT ACAATTGAGG CAGAAGGAAG AGGTGAGTGA
701 GGCCCTTCAC GATGTTGATT TTCAGCAGTT GAAGATAGAG AACGCTCAAT
751 TTCTTGAGAC AATTGAAGCA AGGAATCAAG AACTGACCCA GCTAAAGCTG 801 TCATCTGGAA ACACTCTGCA GGTTCTCAAT GCCTACAAAA GCAAGCTTCA
851 CAAGGCAATG GAAATATACC TCAATCTGGA CAAGGAGATC TTGCTGAGAA
101 AAGAGCTACT TGAAAAAATT GAAAAAGAAA CACTACAAGT AGAGGAGGAC
151 CGGGCCAAAG CCGAGGCAGT GAATAAGAGG CTCCGGAAGC AGCTGGCCGA
1001 GTTCCGGGCA CCACAGGTGA TGACTTACGT CCGGGAGAAG ATCTTAAATG 1051 CGGACCTGGA GAAGAGCATC AGGATGTGGG AAAGGAAAGT GGAGATAGCA
1101 GAGATGTCCT TAAAAGGCCA TCGTAAGGCT TGGAATCGAA TGAAAATAAC
1151 CAATGAGCAG TTGCAGGCAG ATTACCTTGC TGGGAAGTAG CCAGAGGCAG
1201 GCCACGGCTT ACAGACCACT ACATGACCTA TAAAAGTAAT CAGCTCCTTT 1251 CTAGTCACGG GCTCCTCTCA CTGTTCCCTG TCTGCCTGGT GTTCCCAACC
1301 CCCCACCCAG GCTGAGTATC ATCTCCTGGG CCACATCTGC CCATGGGGAG
1351 TGTTTTCACA GCCTGGCCCC TGGAACTGTT ACCACTGAAA GAACCACAGG
1401 GCACTCTAAT GGTTTGACAC TTGTTAGCCA GCATTTAGTT CACAAGCATA 1451 GTGAAAGTGA CCTTCCCACA CCTGGGAGAG GGATAGAGGA GGGAGAGCCA
1501 GCCCAGTGTA TGCCATGGGC TTATCCGTGG CAGCCCCAGT GTGCAACTAT
1551 CAAAAACAGA CATCAAAACA GCATGGTGAA TGCCTGGCAC TCAGCATTCT
IbOl CAGTTTACTC TTCAGTTTGG TGGGGTAGCT CCTGGACTAG ATACTGCTGC lb51 AAAAGAAAAC AAGCACGAAG GAAACCAAGA TGATTTCTTC GGGCTGATAC 17D1 AACCTGTTCT GACCTGCAAA AATCCTACCT TCCCCCACCT CCCCACCGTA
1751 ATAGTCATAG TATAAGGGTT GTACAGACGC CTCAGGAGAC CTGCCTGATT
1801 CCTTTACATC CTTCTCCCTA ACATCTAGAC TATCTCTAGA GCTGTTTCCT
1851 AGTCGTGAAT GCGTGATGGT CCTTCTTTGT CCCTGCAAGT ATGATCCAAC
1101 ATGGCCCAGT TCAGAATCAG AATATGTCTT CTGTGTCATG GTGGCATTTG 1151 GTCCATGGTG GGAGAAAGAA ATCAACTTTT CCCAGTGGTG GAGTGAGGAC
2001 AGGGGAGGGC CGGCCCTCTC AGCCTTGGAT GTGATCCATT TGCTGTAGTC
2051 TTCCACCTTG GTGTACAGAA ACAGGCCAGG GCACGTCTCA CCACCGAAGT
2101 TCAGGACTCC TCTCAGAACC CACAGATCGA ACTGCTGTAG CTGGCACATC
2151 ATTGGGCTTC CTGGGTCCCC CTGTGATAAA AGACAGAAGG CTTCAAGTCT 22D1 TAGAAAAACT AGTTTTTGTT GTAAATCTAT CCTTGTGCAA TATACTGTTT
2251 GTTCTAGAAA TGTTTTACGC TGGTTCTCAC TGGAAATGGG GCAAATTATA
23D1 GGATACAATT TCAAATCTAG GCAGCCACCA CCACAAATTC CAACAA6ATG
2351 ACTTTTCCTT TTATTATGCA AATTAGCTGT GGACTTCTGC TGATTGCCTA
2401 TAGCTTCCT6 GTTCATATTT CATTTTCTTG CCCCTTTCCA GTCCTTTGGC 2451 CAAACCTTCC CTCTCTTCTG GCTTCTCATT CCTGAAATGT TGGTGTTTGT
2501 TTCTGTTTTG TCCTGAAATG CTCACATTTT CCCTTCTCTG CCTTGCTTCA
2551 ACCCTTAGTG TAAGCCACTT CCTGCCACCT GGCAACTGCT TACCAGCCTG
2b01 GCTGGCCGTG CTCTGGGTCT TCCCTACTCC CAATGGAGCA GTCCTCTGGG
2b51 ACTTGGGAAT TCTGCCACAT ACACTTTATC TAACTTAAAG TGACGGAGTA 2701 GAAGCTTGGC ATCATTAGCT AGATATGGGA CCCTGGCAAG TGACCAAATC
2751 CTCTCTGAGC CAAGGTGGGA ACACAGTTAA TGCCTGTAAC ACGTGCTGAG
2601 CACAGCACAG TGCCTGGCAC ACAGCAAACA CTCAATAGAA TATTAGCTAC
2651 CATCATCCTG ATGTCGCTAT AAAGGCCAGC ATTTTTCTGA AAAGTTGGGG
2101 AAAATGGGAA AAGCAACAAG GCAACTAGTA GGTATCACTT ACCTTACCTG 2151 CCCAGACCCC ACACCCCTAG GTCTCCTCTC AAAGGAATTC CTGCCCCTCC
3D01 CATGGCCCAT CTTGGTCCGA GAAGGGGGTG GTCATCCCCA GGCTAGCCAG
3D51 CCACTTCTGA CCTGTGTGGC CTGCCTGGCT GGAAGGCCCA GGCAATGACA
3101 TGTTGCTCTC GCAGTTTGGA CTGAGACATG GAATGGGGCC GCAATTAACA
3151 ACAGGAAACA ATCTGAACAG ACTGAACCAC GAGCAGCAGA AAGGCAGAAG 3201 AGCAGCCGCT TCAGCCCCTT ACCATCCGAG ACCTGGGTGT GTGGTCTGTC
3251 TTGGTCACTC TCTCTGTCTC TCTTTCTCTC TTTCTTTCTC TGTCCCCAAG
3301 GCTGGAGTGC AGTGGTGCAA TCTTGGCTCA CTGCAACCTC CACCTCTGGG
3351 ATTCAAGCAA TTCTCCCACC TCAGCCTCTC GAGTAGCTGG GGCTACAGCT
3401 ATGCGCCACC ATGCCCAGCT AATTTTTTTT TTTTTTTTTT GAGATGGAGT 3451 CTTGCTCTGT CCCCCATGCT GGAGTGCAGT GGCATGATCT CGGCTCGCTG
35D1 CAACCTCCTC CTCCTGGGTT CAAGCGATTC TCCTACCTCA GCCTCCCCAG
3551 TAGCTGGGAT TACAGGCGCC CACCACCACA CCTGGCTAAT TTTTATTTTT
3b01 AGTAGAGATG GGGTTTCACC ATGTTGGCCA GGCTGGTCTC GAACTCCTGA
3b51 CCTCATGATC CACCCGCCTC GGCCTCCCCA AGTGTTGGGA TTACAGGCGT 3701 GAGCCACTGC ACCCGGCCTA ATTTCTGTAT TTTTAGTAGA GATGGGGTTT
3751 CACGATGTTG GCCAGGCTGG TCTTAATCTA ACTTCAAGTG ATCTGCCCGC
3601 CTCGCCCTCT CAAAGTGCTG GGATTAGGCA TGAACTACCA TGCCCAGTGG
3851 GGTATTCTCT TTCAATAAAG CTCCTCTTTT CCAAGGAAGC CACACCAGAA
3101 CAGAGATGAA GACCAGTGGG AAAACATGGG AGCAACTCCG TGGGCAGGCC 3151 AGCGGGGAGG CCATGCTGCA AAGCTGCCGT GATTCCCTGG TGATCTCTCA
40D1 GCAGGCCAAG GCCAGACATG TGAGGAAGGC CTTGAGGACT TCATTCTGTG
4051 CCTCTCCTTG GATGGAAGGG GGTGCTTTAG TGTGGCACTC CTGACTTTTC
4101 AATTGACTGG TGAAGAGGCC CTTGTGTGCA CCTCACTATG TCTGCCTAGG 4151 TCATGGGGGC TCCCTGGCCA AGAATGACGT GGTTCCCCCT TTCATCAGTC
4201 CGATTCGCAG TTTGTCTTAA CTGTAGTGGT ATAGCCAGAG CAAGAAAAAG
4251 AATGTGATTT AGGACAAATG ATTGGATGAG TGATTGGTAG ATGTCCTCAG
43D1 CTATGGCGTG GTTTTGCAGG TCACTGTTCC ACCCACCTGG GCACAGCATA 4351 TACGCTTTTT CTCTTCCCCA TAATCCCGTA GGGGCTGCGA CTTCTGAAGC
4401 ACAAGAGGCA GAGGCGAACA GCTCCAGGTG CCCCTCTGGA GCTACCCTAC
4451 CTCATCTCCC AAGGGAGCGG CCACAGCCCA GAGTGGGGTC TTTCATTTTG
4501 TGATCTTTTC CCTTGACATT CAGCAAAAGC CCTGACAGTG GTAGAATAAA
4551 GGCAGGATGG GTGAGTGCAG AGTGATTCTG CTTTTGTTGG GTTTCAGGGA 4b01 AACCCATAGG CAGATTCTGA ACCTGGTGGT TGATTCTACA TGTGGGAATT
4b51 GTGGCTTTGA AGACCTCTGG ACATGAGAAC ATATTTCCAA GACAGAGGAT
47D1 TCTATGGGGA CGGGTCACCA TTAAATGGTG TGCAAGCATA ATTCTGTTCA
4751 AAAATGAAGG CATGTTTAGA GGTGTGTCAC AGTTAAAAAC CAACCTGAAC
4801 TTTGCAGTTA GATTTTAAAA GATGGTCAGT TAGAGTAGAA ATAGCTTAGA 4851 ATATTCCATT GAGTCTAAGA TACAGTTAGA AATCAACATC TTTGAAATTA
4101 GGGTGTGTCT TTTAATCAGT TGATGTCAGA GTTTAACGGG CAGCATTTTT
4151 TTCTTTCTTG GGATTACAAA AAATGATGGT GCATTCTATA ATTGGCAGCA
5001 TCTTAGATCT GAGGAAGTAT GATACTTGTT TGACGGAATG GTTGACGGCA
5051 GAATTTTGTT AAAAAGCTAT ATCTTCACTG TATTTTAACA CATTATCTAA 51D1 TTTAAGAAAT TGTTAAGATC CCCCACCTGG CAGAGGACCC AGTACAAAAT
5151 AGGCACTCAA TAGATGTTAC ACCAACTTTG GAAGGGCAAA CATATTTCTT
5201 AATGAGAGGC AGTCCTTCAT GTTTTGCAAT AAAATGACTT TTAAAAAAAA
5251 AAAAAAAAAA AAAAAAAAAA AAAAAAAAAA AA
BLAST Results
No BLAST result
Medline entries
No Medline entry
Peptide information for frame 3
ORF from 57 bp to 1187 bp; peptide length: 377
Category: putative protein
Classification: no clue
Prosite motifs: LEUCINE_ZIPPER (19-40)
1 MTDDESESVL SDSHEGSELE LPVIQLCGLV EELSYVNSAL KTETEMFEKY
51 YAKLEPRDQR PPRLSEIKIS AADYAQFRGR RRSKSRTGMD RGVGLTADQK
101 LELVQKEVAD MKDDLRHTRA NAERDLQHHE AIIEEAEIRW SEVSREVHEF
151 EKDILKAISK KKGSILATQK VMKYIEDMNR RRDNMKEKLR LKNVSLKVQR
201 KKMLLQLRQK EEVSEALHDV DFQQLKIENA QFLETIEARN QELTQLKLSS
251 GNTLQVLNAY KSKLHKAMEI YLNLDKEILL RKELLEKIEK ETLQVEEDRA
301 KAEAVNKRLR KQLAEFRAPQ VMTYVREKIL NADLEKSIRM WERKVEIAEM
351 SLKGHRKAWN RMKITNEQLQ ADYLAGK
BLASTP hits
No BLASTP hits available
Alert BLASTP hits for DKFZphtes3_18nl4, frame 3
No Alert BLASTP hits found
Pedant information for DKFZphtes3_18nl4, frame 3
Report for DKFZphtes3_18nl4.3
[LENGTH! 315
[MU! 4bl51-lb
[pi! 1-17
[H0M0L! TREMBL:AF13b711_l product: "myosin heavy chain"i
Amoeba proteus myosin heavy chain mRNAi complete eds- 5e-0b [FUNCAT! 11 unclassified proteins [S- cerevisiaei Y0R21bc!
7e-D4
[BLOCKS! BL005b3B Stathmin family proteins
[BLOCKS! PRD0115D
[PROSITE! LEUCINE_ZIPPER 1 [PFAM! Helix-loop-helix DNA-binding domain
[KU! All_Alpha
[KU! L0U_C0MPLEXITY b-33 *
[KU! COILED COIL 14- bδ *
SElQ GTRSSUAHGIGRGRFGRVMTDDESESVLSDSHEGSELELPVIlQLCGLVEELSYVNSALKT SEG
PRD cccccccccccccceeeeeccccceeeeeccccccceeeeeeeeccchhhhhhhhhhhhh COILS
SEώ ETEMFEKYYAKLEPRDORPPRLSEIKISAADYA(QFRGRRRSKSRTGMDRGVGLTAD(QKLE SEG
PRD hhhhhhhhhhhhcccccchhhhhhhhhhhhhhhhhhccchhhhhccccccccchhhhhhh COILS
SE(Q LV(QKEVADMKDDLRHTRANAERDL(QHHEAIIEEAEIRUSEVSREVHEFEKDILKAISKKK SEG xxxxxxxxx PRD hhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhc COILS CCCCCCCCCCCCCCCCCCCCCCCCCCCCCC
SElQ GSILAT(QKVMKYIEDMNRRRDNMKEKLRLKNVSLKV(QRKKMLL(QLR(QKEEVSEALHDVDF SEG
PRD ccchhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhh COILS
SElQ (Q(QLKIENA(QFLETIEARN(QELTtQLKLSSGNTLιQVLNAYKSKLHKAMEIYLNLDKEILLRK SEG xxxxxxx
PRD hhhhhhhhhhhhhhhhhhhhhhhhhhcccccccchhhhhhhhhhhhhhhhhhhhhhhhhh COILS
SElQ ELLEKIEKETLlQVEEDRAKAEAVNKRLRKlQLAEFRAPiQVMTYVREKILNADLEKSIRMUE SEG xxxxxxxxx
PRD hhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhh COILS
CCCCCCCCCCCCCCCCCCCCCCCCCCC SElQ RKVEIAEMSLKGHRKAUNRMKITNElQLiQADYLAGK
SEG
PRD hhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhccc COILS
Prosite for DKFZphtes3_18nl4.3
PS00029 37->59 LEUCINE_ZIPPER PDOC00029
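The leucine zipper hit can be reproduced with a plain pattern search. The sketch below assumes the PROSITE-style pattern L-x(6)-L-x(6)-L-x(6)-L and scans the first 50 residues of the ORF peptide; the function name `leucine_zippers` is hypothetical, and positions are reported in ORF peptide coordinates, which are offset from the Pedant report coordinates because the Pedant sequence begins upstream of the initiator methionine.

```python
import re

# L-x(6)-L-x(6)-L-x(6)-L, wrapped in a lookahead so overlapping hits are all reported.
ZIPPER = re.compile(r"(?=(L.{6}L.{6}L.{6}L))")

def leucine_zippers(peptide):
    """Return (start, end) pairs, 1-based inclusive, for every pattern match."""
    return [(m.start() + 1, m.start() + 22) for m in ZIPPER.finditer(peptide.upper())]

# Residues 1-50 of the DKFZphtes3_18nl4 ORF peptide.
nterm = "MTDDESESVLSDSHEGSELELPVIQLCGLVEELSYVNSALKTETEMFEKY"
print(leucine_zippers(nterm))  # [(19, 40)]
```

A pattern match of this kind is only suggestive; the motif is short and occurs by chance in many coiled-coil proteins.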
Pfam for DKFZphtes3_18nl4.3
HMM_NAME Helix-loop-helix DNA-binding domain
HMM
*RRRNHNMRERRRRndINNUFeaLRDHIPHhnV- .. PNEKPLSKVEILRM RRR NM E+ R++++ + + ++++ +E L V+
++
Query 198 RRR-DNMKEKLRLKNVSLKVQRKKMLLQL-RQKEEVSEA-LHDVDFQQL 243
HMM AIEYIrsLiQ*
IE ++L+
Query 244 KIENAQFLE 252
DKFZphtes3_11pl2
group: testis derived
DKFZphtes3_11pl2 encodes a novel 644 amino acid protein without similarity to known proteins. No informative BLAST results; no predictive Prosite, Pfam or SCOP motifs.
The new protein can find application in studying the expression profile of testis-specific genes.
unknown protein
Sequenced by MediGenomix
Locus: unknown
Insert length: 2161 bp
Poly A stretch at pos. 2086, no polyadenylation signal found
1 CCCGAGCCAG CAACCCTGAG GGGCGGCCGG GCAGCGCCGC CACCATGTTC
51 CTGGGCACCG GGGAGCCGGC CTTGGACACG AGTCACCTTA TCTCTCTAAG
101 CCGAGCGTCC CTGACCCCGC AGAAGCTGTG GCTGGGAACC GCAAAGCCAG 151 GAAGTCTGAC CCAGGCCCTG AACTCACCCC TCACCTGGGA GCATGCGTGG
201 ACTGGCGTCC CCGGCGGCAC TCCTGACTGT CTGACAGACA CCTTCAGAGT
251 GAAGAGGCCA CATCTCAGGC GCTCTGCCAG CAACGGTCAT GTCCCTGGGA
301 CTCCTGTCTA CAGAGAAAAA GAAGATATGT ATGACGAGAT TATTGAGTTA
351 AAGAAGTCAT TGCACGTGCA GAAGAGCGAC GTGGACCTGA TGAGAACGAA 401 GCTCCGGCGC CTGGAGGAGG AAAACAGCAG GAAGGACCGG CAGATAGAGC
451 AGCTCCTGGA TCCCAGCCGC GGCACGGATT TTGTTCGGAC TCTGGCAGAG
501 AAAAGGCCCG ATGCCAGTTG GGTCATTAAC GGGCTGAAGC AGAGGATCCT
551 GAAGCTGGAA CAGCAGTGCA AGGAGAAGGA CGGCACCATC AGCAAACTCC bOl AGACCGATAT GAAGACTACC AACCTGGAAG AGATGCGGAT CGCCATGGAG faSl ACATACTACG AGGAGGTGCA TCGTCTCCAG ACCCTCTTGG CAAGTTCTGA
701 AACCACCGGA AAGAAGCCCC TGGGGGAGAA GAAGACGGGC GCCAAAAGGC
751 AGAAGAAGAT GGGCAGTGCC CTCCTGAGCT TGTCCCGGAG TGTCCAGGAG
601 CTCACGGAAG AGAACCAGAG CCTGAAGGAG GACCTGGACC GCGTGCTGAG
651 CACCTCCCCA ACCATCTCCA AGACACAGGG TTATGTGGAG TGGAGCAAGC 101 CCCGGCTGCT GAGGCGCATT GTGGAGCTGG AGAAGAAACT AAGTGTGATG
151 GAGAGCTCAA AATCACACGC CGCAGAGCCA GTCAGATCAC ACCCGCCAGC
1001 CTGCCTTGCA TCCAGCTCTG CGCTGCACAG ACAGCCACGA GGGGACCGCA
1051 ACAAGGACCA CGAGCGTCTC CGAGGGGCTG TGAGAGACCT GAAGGAAGAG
1101 CGGACCGCGC TGCAGGAGCA GCTGCTGCAG AGAGATTTGG AGGTGAAGCA 1151 GCTCCTGCAG GCGAAGGCCG ACCTGGAGAA GGAGCTGGAG TGCGCGAGGG
1201 AGGGCGAGGA GGAGAGGAGA GAGCGAGAGG AGGTTTTGAG AGAGGAGATT
1251 CAGACACTTA CCAGCAAGCT CCAAGAATTG CAAGAAATGA AGAAAGAAGA
1301 GAAAGAGGAT TGCCCGGAAG TTCCTCATAA GGCCCAAGAG CTCCCAGCTC
1351 CCACTCCCAG CAGCAGGCAC TGCGAGCAAG ACTGGCCGCC GGATTCCAGC 1401 GAGGAGGGGC TCCCGCGGCC CCGCTCCCCC TGCTCTGATG GGAGAAGAGA
1451 CGCCGCGGCC AGAGTCCTGC AGGCCCAGTG GAAGGTGTAC AAGCACAAGA
1501 AAAAAAAGGC TGTTCTGGAT GAGGCGGCTG TGGTGCTTCA GGCAGCTTTC
1551 AGGGGACATC TCACGCGGAC AAAGCTCTTA GCAAGCAAAG CACATGGCTC IbOl AGAGCCACCC AGCGTGCCAG GCCTCCCAGA CCAGAGCTCT CCTGTGCCCC lb51 GCGTTCCGAG CCCCATCGCC CAGGCCACGG GCAGCCCTGT GCAGGAGGAG
1701 GCCATCGTCA TCATCCAGTC CGCTCTGCGG GCACACCTGG CCCGGGCCAG
1751 GCACAGTGCT ACCGGTAAAA GAACCACCAC CGCAGCTTCT ACCAGGAGGA
1801 GATCGGCTTC AGCCACACAC GGGGACGCCT CCTCCCCACC CTTCCTCGCA
1851 GCTCTTCCTG ACCCCTCTCC CTCAGGGCCA CAGGCCTTGG CACCTCTACC
1101 TGGGGATGAC GTCAACTCCG ATGATTCCGA CGATATTGTC ATTGCACCGT
1151 CTCTGCCCAC GAAGAACTTT CCAGTTTAGG TCCCCGTCAC TGTCTCCACG
2001 CCGTGATGGC AGCGCTGCCG AGGACATAGG AACCACGACT GGAAAGATAA
2051 TTTATCGTGT TAGGAGAAGA ACGATGATAC CTACTTAAAA AAAAAAAAAA
2101 AAAAAAAAAA AAAAAAAAAA AAAAAAAAAA AAAAAAAAAA AAAAAAAAAA
2151 AAAAAAAAAA A
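A numbered listing like the one above can be checked against the stated insert length by joining its ten-base blocks and verifying that every row label equals the running length plus one. The following is a minimal sketch under the assumption that a row is a position label followed by blocks of A, C, G, T or N; the function name `parse_numbered_listing` is hypothetical. Labels or blocks damaged during text extraction are reported rather than silently repaired.

```python
import re

DNA_BLOCK = re.compile(r"[ACGTNacgtn]+")

def parse_numbered_listing(text):
    """Return (sequence, problems) for a listing of 'label block block ...' rows.

    problems collects labels that do not equal running_length + 1 and any
    token that is neither an integer label nor a clean nucleotide block.
    """
    blocks, problems, length = [], [], 0
    for token in text.split():
        if token.isdigit():
            if int(token) != length + 1:
                problems.append(token)
        elif DNA_BLOCK.fullmatch(token):
            blocks.append(token.upper())
            length += len(token)
        else:
            problems.append(token)  # e.g. a label whose digits were misread
    return "".join(blocks), problems

# Toy listing: the second label is garbled, the third is numerically wrong.
toy = "1 ACGTACGTAC GGGGCCCCAA\n2l TTTT\n31 TT"
seq, problems = parse_numbered_listing(toy)
print(len(seq), problems)
```

Applied to a complete clone listing, `len(seq)` should equal the insert length given in the header; here the final row is labelled 2151 and carries 11 bases, which is consistent with an insert of 2161 bp.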
BLAST Results
No BLAST result
Medline entries
No Medline entry
Peptide information for frame 3
ORF from 45 bp to 1976 bp; peptide length: 644
Category: similarity to unknown protein
Classification: unclassified
Prosite motifs: RGD (332-334)
1 MFLGTGEPAL DTSHLISLSR ASLTPQKLWL GTAKPGSLTQ ALNSPLTWEH
51 AWTGVPGGTP DCLTDTFRVK RPHLRRSASN GHVPGTPVYR EKEDMYDEII
101 ELKKSLHVQK SDVDLMRTKL RRLEEENSRK DRQIEQLLDP SRGTDFVRTL
151 AEKRPDASWV INGLKQRILK LEQQCKEKDG TISKLQTDMK TTNLEEMRIA
201 METYYEEVHR LQTLLASSET TGKKPLGEKK TGAKRQKKMG SALLSLSRSV
251 QELTEENQSL KEDLDRVLST SPTISKTQGY VEWSKPRLLR RIVELEKKLS
301 VMESSKSHAA EPVRSHPPAC LASSSALHRQ PRGDRNKDHE RLRGAVRDLK
351 EERTALQEQL LQRDLEVKQL LQAKADLEKE LECAREGEEE RREREEVLRE
401 EIQTLTSKLQ ELQEMKKEEK EDCPEVPHKA QELPAPTPSS RHCEQDWPPD
451 SSEEGLPRPR SPCSDGRRDA AARVLQAQWK VYKHKKKKAV LDEAAVVLQA
501 AFRGHLTRTK LLASKAHGSE PPSVPGLPDQ SSPVPRVPSP IAQATGSPVQ
551 EEAIVIIQSA LRAHLARARH SATGKRTTTA ASTRRRSASA THGDASSPPF
601 LAALPDPSPS GPQALAPLPG DDVNSDDSDD IVIAPSLPTK NFPV
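ORF coordinates and peptide length constrain each other, which gives a quick consistency check on any entry. The sketch below assumes the convention that appears to hold for the other clones in this section, namely that the ORF end is the last base of the final sense codon and the stop codon is excluded (51 to 722 bp gives 224 residues, 57 to 1187 bp gives 377). The function names are hypothetical. With a start at base 45 and the 644 residues counted in the peptide listing above, the end coordinate works out to 1976.

```python
def peptide_length(start, end):
    """Number of codons in an ORF given 1-based inclusive coordinates (stop excluded)."""
    span = end - start + 1
    if span % 3:
        raise ValueError("ORF span is not a multiple of three")
    return span // 3

def orf_end(start, n_residues):
    """1-based position of the last base of the final sense codon."""
    return start + 3 * n_residues - 1

print(peptide_length(51, 722))   # DKFZphtes3_17i21
print(peptide_length(57, 1187))  # DKFZphtes3_18nl4
print(orf_end(45, 644))          # DKFZphtes3_11pl2
```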
BLASTP hits
No BLASTP hits available
Alert BLASTP hits for DKFZphtes3_11pl2, frame 3
No Alert BLASTP hits found
Pedant information for DKFZphtes3_11pl2, frame 3
Report for DKFZphtes3_11pl2.3
[LENGTH! b44 [MU! 71810-41
[pi! 6-80
[H0M0L! TREMBL:AB02814b_l gene: "KIAA1023"i product:
"KIAA1023 protein"i Homo sapiens mRNA for KIAA1D23 proteini partial eds- 0-D [FUNCAT! 3D-03 organization of cytoplasm [S- cerevisiaei
YDLOS&w! 2e-07
[FUNCAT! 08-D7 vesicular transport (golgi networki etc) [S- cerevisiaei YDL056w! 2e-D7
[FUNCAT! 11 unclassified proteins CS- cerevisiaei YLR301c! 3e-0b
[FUNCAT! 3D-04 organization of cytoskeleton [S- cerevisiaei
YDR35bw! 2e-D5
[FUNCAT! 01-10 nuclear biogenesis CS- cerevisiaei YDR35bw!
2e-05 [FUNCAT! D3-22 cell cycle control and mitosis CS- cerevisiaei
YDR35bw! 2e-05
[FUNCAT! 16 classification not yet clear-cut [S- cerevisiaei
YJR134c! 4e-D5
[BLOCKS! DM01354I [BLOCKS! BL00b27B GHMP kinases ATP-binding domain proteins
[BLOCKS! BL0032bC Tropomyosins proteins
[BLOCKS! BLOllbOB Kinesin light chain repeat proteins
[BLOCKS! BL00620D Glucoamylase proteins region proteins
[BLOCKS! BP04417C [BLOCKS! BLD0412B Neuromodulin (GAP-43) proteins
[EC! 3. fa.1.35 Myosin ATPase 3e-06
[PIRKU! tandem repeat 3e-0θ
[PIRKU! transmembrane protein 2e-D7
[PIRKU! muscle contraction 3e-08 [PIRKU! actin binding 3e-D8
[PIRKU! ATP 3e-08
[PIRKU! thick filament 3e-D8
[PIRKU! alternative splicing 7e-07
[PIRKU! coiled coil 3e-D8 [PIRKU! P-loop 3e-0β
[PIRKU! heptad repeat 2e-D7
[PIRKU! methylated amino acid 3e-06
[PIRKU! hydrolase 3e-08
[PIRKU! Golgi apparatus 2e-07 [SUPFAM! myosin heavy chain 3e-0δ
[SUPFAM! myosin motor domain homology 3e-08
[SUPFAM! alpha-actinin actin-binding domain homology 8e-Db
[SUPFAM! plectin βe-Ob
[SUPFAM! ribosomal protein SID homology βe-Ob [SUPFAM! giantin 2e-07
[PROSITE! RGD 1
[KU! All_Alpha
[KU! L0U_C0MPLEXITY 14-bD * [KU! COILED COIL 15 - 22 *
SElQ MFLGTGEPALDTSHLISLSRASLTPiQKLULGTAKPGSLTiQALNSPLTUEHAUTGVPGGTP SEG
PRD cccccccccccceeeeeeeecccccceeeeecccccceeeeccccccccccccccccccc COILS
SElQ DCLTDTFRVKRPHLRRSASNGHVPGTPVYREKEDMYDEIIELKKSLHViQKSDVDLMRTKL
SEG
PRD cccccchhhhhhhhhhhhhhccccchhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhh
COILS CCCCCCCCCCCCCCCCCCCCCCC
SElQ RRLEEENSRKDR(QIE(QLLDPSRGTDFVRTLAEKRPDASUVINGLK(QRILKLE(Q(QCKEKDG
SEG
PRD hhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhh
COILS CCCCCC
SElQ TISKLlQTDMKTTNLEEMRIAMETYYEEVHRLiQTLLASSETTGKKPLGEKKTGAKRiQKKMG SEG
PRD hhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhh COILS cccc
SElQ SALLSLSRSVlQELTEENlQSLKEDLDRVLSTSPTISKTiQGYVEUSKPRLLRRIVELEKKLS SEG PRD hhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhh COILS
CCCCCCCCCCCCCCCCCCCCCCCCCCCCCC
SElQ VMESSKSHAAEPVRSHPPACLASSSALHRiQPRGDRNKDHERLRGAVRDLKEERTALlQEiQL SEG
PRD hhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhh COILS
SElQ LiQRDLEVKiQLLiQAKADLEKELECAREGEEERREREEVLREEIlQTLTSKLiQELlQEMKKEEK
SEG xxxxxxxxxxxxxxxxxxxxxxxxx xxxxxxxxxxxxx
PRD hhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhh COILS CCCCCCCCCCCCCCCCCCCCCCCCCCCCC
SElQ EDCPEVPHKAQELPAPTPSSRHCEiQDUPPDSSEEGLPRPRSPCSDGRRDAAARVLiQAlQUK
SEG x x
PRD hhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhh COILS CCCCCC
SElQ VYKHKKKKAVLDEAAVVLiQAAFRGHLTRTKLLASKAHGSEPPSVPGLPDiQSSPVPRVPSP SEG xxxxxxxx
PRD hhhhhhhhhhhhhhhhhhhhhhhcchhhhhhhhhhhcccccccccccccccccccccccc COILS
SElQ IAlQATGSPVlQEEAIVIIiQSALRAHLARARHSATGKRTTTAASTRRRSASATHGDASSPPF SEG xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
PRD ccccccccccceeeehhhhhhhhhhhhhhhhccccceeehhhhhhhhhccccccccccce COILS
SElQ LAALPDPSPSGPiQALAPLPGDDVNSDDSDDIVIAPSLPTKNFPV
SEG xxxxxxxxxxxxx
PRD eeecccccccccccccccccccccccccceeeeecccccccccc COILS
Prosite for DKFZphtes3_11pl2.3
PS00016 332->335 RGD PDOC00016
(No Pfam data available for DKFZphtes3_11pl2-3)
DKFZphtes3_20hl2
group: transmembrane protein
DKFZphtes3_20hl2 encodes a novel 1204 amino acid protein without similarity to known proteins. The novel protein contains 1 transmembrane region and two leucine zippers.
No informative BLAST results; no predictive Prosite, Pfam or SCOP motifs. The new protein can find application in studying the expression profile of testis-specific genes and as a new marker for testicular cells.
putative protein
perhaps complete cds.
Pedant: TRANSMEMBRANE 1
Sequenced by MediGenomix
Locus: unknown
Insert length: 5614 bp
Poly A stretch at pos. 5674, no polyadenylation signal found
1 CTCTGCCTTT CCTCTCGCAG CCACCCTTCC TCTCAGACCA GTACGGTGGC
51 CGACGGGAGT CAGACGCTGG GGATGAATGA AGGATCAACA AACAGTAATA 101 ATGACTGAAT GTACAAGTCT TCAGTTTGTC AGCCCTTTTG CTTTTGAGGC
151 AATGCAGAAG GTGGATGTTG TTTGCCTGGC ATCTTTAAGT GATCCAGAAT
201 TAAGACTTCT TCTGCCCTGT TTGGTACGGA TG6CACTTTG TGCACCTGCT
251 GACCAGAGCC AAAGCTGGGC TCAGGATAAG AAACTCATCC TTCGCCTTCT
301 TTCTGGAGTG GAAGCTGTCA ACTCCATTGT TGCATTGTTG TCCGTGGACT 351 TTCATGCTTT AGAACAAGAT GCCAGCAAAG AACAGCAGCT TAGGCATAAA
401 CTTGGAGGAG GCAGTGGAGA GAGCATCCTG GTATCACAGC TTCAGCATGG
451 ACTGACGTTA GAGTTTGAAC ACAGTGATTC ACCTCGTCGA TTGCGTCTTG
501 TGCTTAGTGA ACTGTTGGCA ATTATGAACA AGGTGTCTGA GTCCAACGGA
551 GAATTTTTTT TCAAGTCTTC TGAACTTTTT GAGAGTCCAG TATATTTGGA bOl GGAAGCTGCA GATGTACTTT GTATTTTACA AGCAGAGCTC CCTTCCTTGC b51 TCCCTATAGT TGATGTAGCT GAAGCTTTGC TACATGTTAG AAATGGTGCC
701 TGGTTCTTGT GTCTCTTGGT GGCCAATGTT CCTGATAGTT TTAATGAAGT
751 TTGTAGGGGC CTGATAAAAA ATGGAGAACG ACAAGATGAA GAAAGTCTTG
6D1 GAGGAAGGCG CAGGACAGAT GCCTTACGCT TCTTGTGTAA AATGAATCCT 651 TCTCAGGCCC TCAAGGTCCG AGGCATGGTG GTGGAAGAAT GTCACTTGCC
101 AGGCCTTGGT GTGGCTTTGA CATTGGATCA TACTAAAAAT GAAGCTTGTG
151 AGGATGGAGT GAGTGACTTG GTTTGTTTTG TAAGTGGTTT GCTTCTTGGA
1001 ACAAATGCGA AAGTCCGGAC TTGGTTTGGA ACTTTTATCC GAAATGGACA
1051 GCAGAGAAAA AGAGAGACCA GCAGTTCTGT CCTTTGGCAG ATGAGAAGGC 1101 AGCTTCTTCT GGAGTTGATG GGCATTCTTC CCACAGTAAG AAGCACCCGA
1151 ATTGTGGAAG AAGCTGATGT GGATATGGAG CCCAATGTGT CTGTGTATTC
1201 GGGGCTGAAA GAAGAGCATG TTGTGAAAGC CAGTGCACTC TTACGTCTGT
1251 ACTGTGCTTT GATGGGGATC GCTGGACTCA AACCAACTGA AGAAGAAGCT 1301 GAGCAATTAC TGCAGTTGAT GACGAGCCGT CCTCCTGCTA CGCCAGCTGG
1351 GGTTCGCTTT GTTTCACTTT CCTTTTGTAT GCTACTGGCC TTTTCTACAC
14D1 TTGTCAGTAC ACCTGAACAG GAGCAGCTGA TGGTGGTGTG GCTAAGTTGG
1451 ATGATAAAAG AAGAAGCGTA TTTTGAGAGT ACTTCAGGCG TCTCTGCTTC 1501 TTTTGGGGAG ATGTTATTAT TGGTGGCTAT GTACTTTCAC AGCAACCAGC
1551 TTAGTGCTAT CATTGACTTG GTCTGTTCCA CTTTGGGGAT GAAGATTGTA
IbOl ATTAAGCCAA GCTCCTTGAG CAGGATGAAG ACAATCTTCA CACAGGAAAT
IfaSl TTTTACTGAG CAGGTTGTCA CAGCTCATGC AGTTCGGGTC CCTGTCACCA
1701 GCAACCTGAG TGCCAACATT ACTGGATTTT TGCCTATTCA TTGTATTTAC 1751 CAGCTTCTCA GGAGCCGTTC CTTTACCAAG CACAAAGTGT CAATAAAAGA
1601 TTGGATTTAT AGACAGCTGT GTGAAACCTC TACTCCACTT CATCCTCAAT
1A51 TACTTCCTTT GATTGATGTG TACATAAATT CTATACTTAC TCCTGCGTCG
1101 AAATCTAATC CAGAAGCCAC AAATCAGCCA GTCACAGAAC AGGAGATACT
1151 CAATATTTTC CAAGGAGTCA TTGGGGGTGA CAACATCCGC CTTAATCAGC 2001 GTTTCAGTAT CACAGCACAG CTTTTGGTGC TCTACTATAT ACTGTCTTAT
2051 GAAGAGGCTC TTCTAGCAAA CACGAAGACT TTAGCTGCCA TGCAAAGAAA
21D1 GCCCAAATCA TATTCTTCTT CTTTAATGGA TCAGATTCCT ATCAAATTCC
2151 TTATTCGACA GGCTCAAGGG CTGCAGCAGG AGTTGGGAGG GTTGCATTCA
2201 GCTTTACTAC GTCTCCTTGC TACTAACTAC CCACATTTAT GTATTGTGGA 2251 TGACTGGATT TGTGAAGAAG AAATCACAGG GACTGATGCC CTGCTACGGC
23D1 GAATGCTCCT GACTAATAAT GCTAAAAATC ATTCTCCCAA ACAACTCCAA
2351 GAAGCATTTT CAGCTGTCCC AGTAAATCAC ACACAAGTGA TGCAGATTAT
2401 AGAACACTTG ACTCTACTCT CTGCCAGTGA ACTTATACCA TATGCGGAAG
2451 TGTTAACATC CAATATGAGC CAGCTATTGA ATTCAGGGGT TCCACGGAGA 2501 ATTCTGCAAA CAGTCAATAA ACTATGGATG GTTCTTAATA CTGTGATGCC
2551 TAGAAGGCTA TGGGTAATGA CGGTTAATGC ACTTCAGCCT TCAATAAAGT
2b01 TTGTACGACA ACAAAAGTAT ACTCAGAATG ACCTGATGAT AGATCCTCTC
2b51 ATTGTCCTAA GGTGTGATCA GAGGGTTCAC AGATGCCCCC CACTGATGGA
27D1 TATTACCCTA CACATGTTGA ATGGATATCT TCTTGCATCT AAAGCCTACC 2751 TTAGTGCTCA TCTGAAGGAA ACAGAGCAAG ATAGGCCTTC CCAGAATAAT
2601 ACAATTGGTT TAGTTGGACA AACTGATGCT CCGGAAGTTA CCAGGGAAGA
2651 ATTGAAAAAT GCATTACTGG CCGCTCAGGA TAGTGCAGCT GTCCAGATTC
2101 TCTTAGAGAT TTGCCTACCT ACTGAAGAGG AGAAAGCAAA TGGTGTCAAT
2151 CCAGATAGCT TGTTAAGAAA TGTTCAAAGT GTTATTACCA CCAGCGCTCC 30D1 AAATAAGGGA ATGGAGGAAG GAGAAGACAA TTTGCTCTGT AACCTTCGAG
3D51 AAGTTCAGTG CCTTATCTGT TGTCTCTTGC ACCAAATGTA CATTGCAGAT
31D1 CCCAACATTG CTAAGCTTGT TCACTTTCAG GGTTATCCAT GTGAACTTTT
3151 GCCTCTGACG GTCGCAGGTA TTCCATCTAT GCACATCTGT CTAGATTTCA
3201 TACCTGAGCT TATTGCACAG CCAGAACTTG AGAAACAGAT ATTTGCTATC 3251 CAGTTGCTTT CTCACTTGTG TATACAATAT GCATTACCAA AGTCACTTAG
3301 TGTGGCTCGT TTAGCTGTCA ATGTCATGGG AACTTTGTTA ACAGTTTTAA
3351 CACAGGCTAA GCGGTATGCT TTTTTTATGC CAACTCTGCC AAGTTTGGTC
3401 TCTTTTTGTC GAGCATTTCC TCCATTGTAT 6AGGATATTA TGTCTTTGCT
3451 GATCCAAATA GGGCAAGTTT GTGCCTCTGA TGTTGCCACT CAGACAAGAG 3501 ACATTGATCC AATTATTACA CGTCTTCAAC AAATAAAGGA GAAACCAAGT
3551 GGATGGTCTC AAATCTGTAA AGATTCATCT TATAAAAATG GATCCAGGGA
3601 CACTGGAAGC ATGGATCCTG ATGTACAGCT CTGTCACTGT ATTGAAAGAA
3651 CAGTAATTGA AATAATAAAT ATGAGTGTTA GTGGAATTTA AAACAAAATT
3701 TAAAACAACA AAAAGTTGTT TGCTGCATAT ACCCAACATG AATCTGCATA 3751 TTAGTAACAA CTCTAAACTG AATGGGAACA GTAAAGTATT GTCTTGGAAT
3801 CACTAAAACA ATTCAATTCA ACATGAGTAT AGTTTAGAAC TTTATGAGAA
3851 TTATGCTTGC TTGTTTCTGA TTGGCACATC TTTGGATCTA CTTTGCTGAT
3901 ATGTTTCTAT TGTAGCAGCT GAGCTTTTTT TTTTTCCACT GGGAACACAT
3951 GTAAGAAACT CATTATTGGA AAGGGAATTT GGCCTTGTAT TTAGCTTTTG
4001 AAGTGAAGAC TGCCATGCCT TTAATTTCTT ATAAAAATGA GTCTGTGGGT
4051 AGCCCTAGTG TTTATTTTAA CTGTGAGCTT GTAACAGAAT GTGACAAAGA
4101 TGCAAAGATG GGAGAGGAAA AAAGGGTAAA GGGAAAGGAG AATTAAGGAA
4151 ATAATAGGAG TTAAAAACAC AAGTAGAAAT CTCAAAGATT TGCAGTGCAA 4201 GTAATAGTAA TGCAAGTTGG AATTCTAGTT CTCAAGAAAG AGTATTGAGA
4251 AGACTTTTAA AAAGGCAAGT AGCTTTTGTA AATGATTTCT GTGGAAATAC
4301 AGATGAGGAT TTAAAGATTT CACATATTTG CTTCAATTTT TATTAATATA
4351 TGAAGCCATA TGTTTAAAGA GATACTTGAA TAATTTGGAA TTTTAAGATA 4401 CTGGTGTAAA AGTGTTTACA GAAACATCTT TGTTCAAAGA AGAACCTGAG
4451 AGATCTCATT TAGTTTTATG TTTTAAATTT ATTTTTATAA TGCTTTATTA
4501 ACTTACCTAA TGCTCAGAGG GGGGAAATAT GTATCAAATT AAATGAAGGT
4551 AGAGCAATAA AACCCACTGG ATTAAAGAGC TCTTGGTTTG TCATCAGGAT
4601 TATAATTCAT ATCTTACTTT GAGAAGATCT TTGAGTAAGA AAATGCAGTG
4651 TTTGAACCTG AGGAAAAGTT AAAGTGTAGA AAATATTGTC TTGCCGAAGG
4701 ATTTTGCAGT CCTCTGTCAG TAACTTCCAT TGATTAGGCA GACATATTCA
4751 GGTAAACCCT AATCATTAAA AAAAAATTAT CAATGTAGAA AGTAATTCCC
4801 TTTTTTCTCT CTGAGATATA CCTCAATCAC ACACTTCCCC ACCCCCACTT
4851 GAAACAGACC TCTTCACTTG TGTTTTTTTT TTTTTTTTCC TGAGGTGGAG
4901 TCTTCCCCTG TTGCCCAGGC TGGAGTGCAG TGGGATGATC TTGGCTCACT
4951 GCAACTTCTG CCACCTGGGT TCAAGGGATT CTCGTGCCTC AACCTCCTGA
5001 GTAGCTGGGA CTGCAGGCAC GCGCCACCTG TATTTTTGTA TTTTTAGTAG
5051 AGACGGGGGT TTGCCATGTT GCCCAGACTG GTTTTGAACT CCTGGCCTCA
5101 GGTGATCTGC CCACCTTGGC CTCCCAAAGT GCTGGGATTA CAGGTGTGAG
5151 CCACCGCACC TGGCCAGACC GCTTCACTTG TAAAAGAAAT TAGGCTAATA
5201 AGAAGGTGTA GTTTTTGAGA AATGAAATTT AACTTTAGCC TTTTCACTAG
5251 TAAATAGTCA CATCTCATTT TCTTCCTTTG TAAAATGGGG TTACTACTGG
5301 CCCTACCTCA TATTCTATGA GAATGAGTTT GTAGCTGTTT CAAATCATGA
5351 AGTGCATAGT ATCACATGTG ATAGAATATT TATAACTTTT TATTAGATGC 5401 TTAATGTTCA ATTAAGTAAT TTTGATGTGA AAAATAAAAG TAATAAAAGT
5451 ATCTTAAAAA TAGCATAAGA ATTTTCATAT TTTTAAACAA GGCAGTTTTG
5501 TAGTCCCTTA AGATTAAATA CAACTGCTCC TTTTTTTTTT AAACTGAGGC
5551 CTTGCGATAT TTTGTGTGAA TAGATATGCC CTAGGAGTTC AGAAAAAGTT
5601 AAAAGTATGT TTTCTAATTA AATGCAGTGC ACATTCCTGG ATCAATATTC
5651 AAAGACTGGT CATAACCTGC TGTGTTAAAA TAATCACATA TGCTCTTTTT
5701 CATCAGATTT GTTGATGATG TAAATAAAAT GTGTAAATAT ATTAGTAAAT
5751 GTTAATATTC ATGTATTTTA AGTTAAGGTT ATAAAATTTG TCACAATGTG
5801 TTTTTTTATT CAAGTGAAAA CAGATGTGTG CAGCTATTTT GAATATTGGT
5851 TTATAAACAT TCATATTCTT TATCAAACAA AAAAAAAAAA AAAA
BLAST Results
No BLAST result
Medline entries
No Medline entry
Peptide information for frame 2
ORF from 77 bp to 3688 bp; peptide length: 1204
Category: putative protein
Classification: unclassified
Prosite motifs: LEUCINE_ZIPPER (167-188) LEUCINE_ZIPPER (692-713)
1 MKDQQTVIMT ECTSLQFVSP FAFEAMQKVD VVCLASLSDP ELRLLLPCLV
51 RMALCAPADQ SQSWAQDKKL ILRLLSGVEA VNSIVALLSV DFHALEQDAS
101 KEQQLRHKLG GGSGESILVS QLQHGLTLEF EHSDSPRRLR LVLSELLAIM
151 NKVSESNGEF FFKSSELFES PVYLEEAADV LCILQAELPS LLPIVDVAEA
201 LLHVRNGAWF LCLLVANVPD SFNEVCRGLI KNGERQDEES LGGRRRTDAL
251 RFLCKMNPSQ ALKVRGMVVE ECHLPGLGVA LTLDHTKNEA CEDGVSDLVC
301 FVSGLLLGTN AKVRTWFGTF IRNGQQRKRE TSSSVLWQMR RQLLLELMGI
351 LPTVRSTRIV EEADVDMEPN VSVYSGLKEE HVVKASALLR LYCALMGIAG
401 LKPTEEEAEQ LLQLMTSRPP ATPAGVRFVS LSFCMLLAFS TLVSTPEQEQ
451 LMVVWLSWMI KEEAYFESTS GVSASFGEML LLVAMYFHSN QLSAIIDLVC
501 STLGMKIVIK PSSLSRMKTI FTQEIFTEQV VTAHAVRVPV TSNLSANITG
551 FLPIHCIYQL LRSRSFTKHK VSIKDWIYRQ LCETSTPLHP QLLPLIDVYI
601 NSILTPASKS NPEATNQPVT EQEILNIFQG VIGGDNIRLN QRFSITAQLL
651 VLYYILSYEE ALLANTKTLA AMQRKPKSYS SSLMDQIPIK FLIRQAQGLQ
701 QELGGLHSAL LRLLATNYPH LCIVDDWICE EEITGTDALL RRMLLTNNAK
751 NHSPKQLQEA FSAVPVNHTQ VMQIIEHLTL LSASELIPYA EVLTSNMSQL
801 LNSGVPRRIL QTVNKLWMVL NTVMPRRLWV MTVNALQPSI KFVRQQKYTQ
851 NDLMIDPLIV LRCDQRVHRC PPLMDITLHM LNGYLLASKA YLSAHLKETE
901 QDRPSQNNTI GLVGQTDAPE VTREELKNAL LAAQDSAAVQ ILLEICLPTE
951 EEKANGVNPD SLLRNVQSVI TTSAPNKGME EGEDNLLCNL REVQCLICCL
1001 LHQMYIADPN IAKLVHFQGY PCELLPLTVA GIPSMHICLD FIPELIAQPE
1051 LEKQIFAIQL LSHLCIQYAL PKSLSVARLA VNVMGTLLTV LTQAKRYAFF
1101 MPTLPSLVSF CRAFPPLYED IMSLLIQIGQ VCASDVATQT RDIDPIITRL
1151 QQIKEKPSGW SQICKDSSYK NGSRDTGSMD PDVQLCHCIE RTVIEIINMS
1201 VSGI
BLASTP hits
No BLASTP hits available
Alert BLASTP hits for DKFZphtes3_20h12, frame 2
No Alert BLASTP hits found
Pedant information for DKFZphtes3_20h12, frame 2
Report for DKFZphtes3_20h12.2
[LENGTH] 1204
[MW] 134347.53
[pI] 5.75
[HOMOL] TREMBL:CEZC376_3 gene: "ZC376.6", Caenorhabditis elegans cosmid ZC376 2e-22
[PROSITE] LEUCINE_ZIPPER 2
[KW] TRANSMEMBRANE 1
[KW] LOW_COMPLEXITY 2.57 %
[KW] COILED_COIL 2.33 %
SElQ MKDiQiQTVIMTECTSLlQFVSPFAFEAMiQKVDVVCLASLSDPELRLLLPCLVRMALCAPADlQ SEG
PRD cccceeeeeeeccceeecchhhhhhhhheeeeeeecccchhhhhhhchhhhhhhhccccc COILS
MEM
SElQ S(QSUA(QDKKLILRLLSGVEAVNSIVALLSVDFHALE(QDASKE(Q(QLRHKLGGGSGESILVS SEG
PRD hhhhhhhhhhhhhhhccccccccccccccccchhhhhhhhhhhhhhhhhcccccceeeec COILS MEM
SElQ (QLlQHGLTLEFEHSDSPRRLRLVLSELLAIMNKVSESNGEFFFKSSELFESPVYLEEAADV SEG xxxxxxxxxxxx
PRD ccccccceeeecccccchhhhhhhhhhhhhhhhhhcccccccccccccccchhhhhhhhh COILS
MEM
SElQ LCILlQAELPSLLPIVDVAEALLHVRNGAUFLCLLVANVPDSFNEVCRGLIKNGERiQDEES SEG
PRD hhhhhhcccccchhhhhhhhhhhhhccchhhhhheeeccccccchhhhhccccccccccc COILS
MEM
SElQ LGGRRRTDALRFLCKMNPSiQALKVRGMVVEECHLPGLGVALTLDHTKNEACEDGVSDLVC
SEG
PRD ccccchhhhhhhhhhcccceeeeeeeeeeeeeccccccceeeecccccccccccccceee
COILS
MEM
SElQ FVSGLLLGTNAKVRTUFGTFIRNGIQIQRKRETSSSVLUIQMRRIQLLLELMGILPTVRSTRIV SEG PRD eeeccccccceeeeeeeeeeeecchhhhhcccchhhhhhhhhhhhhhhhccccceeeeee COILS
MEM SElQ EEADVDMEPNVSVYSGLKEEH VKASALLRLYCALMGIAGLKPTEEEAEiQLLiQLMTSRPP SEG
PRD eeeccccccceeeeccccchhhhhhhhhhhhhhhhhhhhccchhhhhhhhhhhhhhcccc COILS MEM
SElQ ATPAGVRFVSLSFCMLLAFSTLVSTPEIQEIQLMVVULSUMIKEEAYFESTSGVSASFGEML SEG
PRD cccceeeeeehhhhhhhhhhccccccchhhhhhhhhhhhhhhhhhhcccccccchhhhhh COILS
MEM MMMMMMMMMMMMMMMMM
SElQ LLVAMYFHSNlQLSAIIDLVCSTLGMKIVIKPSSLSRMKTIFTiQEIFTEiQVVTAHAVRVPV SEG
PRD hhhhhhhccchhhhhhhhhhhhccceeeeeccccchhhhhhhhhhhhhhhhhhhhheeec COILS MEM
SElQ TSNLSANITGFLPIHCIYlQLLRSRSFTKHKVSIKDUIYRlQLCETSTPLHPiQLLPLIDVYI SEG
PRD cccccceeeeeehhhhhhhhhhhhhcccccccchhhhhhhhhcccccccccccccceeee COILS
MEM SElQ NSILTPASKSNPEATNlQPVTEiQEILNIFiQGVIGGDNIRLNiQRFSITAlQLLVLYYILSYEE SEG
PRD eeccccccccccccccccchhhhhhhhhhhccccccceeeehhhhhhhhhhhhhhhhhhh COILS MEM
SElQ ALLANTKTLAAMlQRKPKSYSSSLMDlQIPIKFLIRiQAiQGLiQlQELGGLHSALLRLLATNYPH
SEG xxxxxxxxxxxxxxxxxxx
PRD hhhhhhhhhhhhhhcccccccccccchhhhhhhhhhhhhhhhhcccchhhhhhhhhcccc COILS CCCCCCCCCCCCCCCCCCCCCCCCCCCC
MEM
SElQ LCIVDDUICEEEITGTDALLRRMLLTNNAKNHSPKiQLiQEAFSAVPVNHTiQVMlQIIEHLTL SEG
PRD eeeecceeeeechhhhhhhhhhhhhhccccccccchhhhhhhhhhcccchhhhhhhhhhh COILS
MEM
SElQ LSASELIPYAEVLTSNMSlQLLNSGVPRRILiQTVNKLUMVLNTVMPRRLUVMTVNALiQPSI SEG
PRD hhhhhhhhhhcccccchhhhhccccchhhhhhhhhhhhhhhhccccchhhhhhcccccch COILS
MEM
SElQ KFVRiQtQKYTiQNDLMIDPLIVLRCDiQRVHRCPPLMDITLHMLNGYLLASKAYLSAHLKETE SEG PRD hhhhhhhcccccccccceeeeecccccccccccceeeccccccchhhhhhhhhhhhhhhh COILS
MEM SElQ (QDRPSiQNNTIGLVGlQTDAPEVTREELKNALLAAlQDSAAViQILLEICLPTEEEKANGVNPD SEG
PRD ccccccccccccccccccchhhhhhhhhhhhhhhhhhhhhhhhhhhcccccccccccccc COILS MEM
SElQ SLLRNVIQSVITTSAPNKGMEEGEDNLLCNLREVIQCLICCLLHIQMYIADPNIAKLVHFIQGY SEG
PRD cceeeeeeeeeecccccccccccchhhhhhhhhhhhhhhhhhhhhhhccccceeeecccc COILS
MEM SElQ PCELLPLTVAGIPSMHICLDFIPELIAiQPELEKiQIFAIlQLLSHLCIiQYALPKSLSVARLA
SEG
PRD ccceeeeeeeecccceeeeehhhhhhhhhhhhhhhhhhhhhhhhhhhhhccchhhhhhhh COILS
MEM
SElQ VNVMGTLLTVLTiQAKRYAFFMPTLPSLVSFCRAFPPLYEDIMSLLIlQIGlQVCASDVATlQT SEG PRD hhhhhhhhhhhhhhhhhhhhccccccceeeccccccchhhhhhhhhhhhcchhhhhcccc COILS
MEM SElQ RDIDPIITRLiQlQIKEKPSGUSlQICKDSSYKNGSRDTGSMDPDVlQLCHCIERTVIEIINMS SEG
PRD cccchhhhhhhhhhhccccceeeeeccccccccccccccccceeeeeeeehhhhhhheee COILS MEM
SElQ VSGI SEG - - • • PRD eccc COILS
MEM
Prosite for DKFZphtes3_20h12.2
PS00029 167->189 LEUCINE_ZIPPER PDOC00029
PS00029 692->714 LEUCINE_ZIPPER PDOC00029
(No Pfam data available for DKFZphtes3_20h12.2)
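The LEUCINE_ZIPPER hits can be checked against the peptide itself. The following is a minimal sketch, assuming the motif is the usual PROSITE-style pattern L-x(6)-L-x(6)-L-x(6)-L; the function name `find_leucine_zippers` and its `offset` parameter are illustrative and not part of the Pedant report.

```python
import re

# Lookahead form of L-x(6)-L-x(6)-L-x(6)-L so overlapping hits are reported too.
ZIPPER = re.compile(r"(?=(L.{6}L.{6}L.{6}L))")

def find_leucine_zippers(peptide, offset=1):
    """Return (start, end) residue numbers (1-based, inclusive) of each hit.

    `offset` is the residue number of peptide[0] (hypothetical helper argument).
    """
    hits = []
    for match in ZIPPER.finditer(peptide):
        start = match.start() + offset
        hits.append((start, start + len(match.group(1)) - 1))
    return hits

# Residues 161-190 of the frame 2 peptide listed above.
fragment = "FFKSSELFESPVYLEEAADVLCILQAELPS"
print(find_leucine_zippers(fragment, offset=161))
```

On this fragment the sketch reports a single hit spanning residues 167 to 188, with leucines at 167, 174, 181 and 188.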
DKFZphtes3_21k14
group: testis derived
DKFZphtes3_21k14 encodes a novel 558 amino acid protein without similarity to known proteins. No informative BLAST results; no predictive Prosite, Pfam or SCOP motifs.
The new protein can find application in studying the expression profile of testis-specific genes.
unknown protein, perhaps complete cds.
Sequenced by LMU
Locus: unknown
Insert length: 2547 bp
Poly A stretch at pos. 2506, polyadenylation signal at pos. 2479
1 GGCCACGTTC AGCGGACACG GGAGCAAGAT GGCGATTCCG GGCAGGCAGT 51 ATGGGCTTAT TTTGCCAAAG AAAACACAGC AGTTGCACCC TGTTTTGCAA
101 AAACCATCAG TGTTTGGGAA TGATTCTGAT GATGATGATG AGACCTCTGT
151 GAGTGAAAGC CTTCAGAGGG AAGCTGCTAA GAAGCAGGCC ATGAAACAGA
201 CCAAACTGGA AATCCAGAAG GCCCTTGCAG AAGATGCTAC TGTGTATGAA
251 TATGACAGTA TTTATGATGA AATGCAGAAA AAAAAGGAGG AAAATAATCC
301 CAAATTGCTT TTGGGGAAAG ACAGAAAGCC CAAGTATATT CACAACTTGC
351 TAAAAGCAGT TGAGATCAGA AAAAAGGAAC AGGAAAAAAG AATGGAAAAG
401 AAAATACAGA GAGAACGAGA AATGGAAAAG GGGGAGTTTG ATGATAAAGA
451 AGCATTTGTG ACATCTGCAT ATAAGAAAAA ACTGCAAGAG AGAGCTGAAG
501 AAGAAGAAAG AGAAAAGAGG GCTGCTGCAC TGGAAGCATG TTTGGATGTA
551 ACCAAGCAGA AAGATCTCAG TGGATTTTAT AGGCACCTAT TAAATCAAGC
601 AGTTGGTGAA GAGGAAGTAC CTAAATGCAG CTTTCGTGAA GCCAGATCTG
651 GTATAAAGGA AGAAAAATCA AGGGGCTTCT CCAATGAAGT AAGTTCAAAA
701 AACAGAATAC CACAAGAGAA ATGCATTCTT CAAACTGATG TGAAAGTAGA
751 GGAAAACCCA GATGCAGACA GTGACTTCGA TGCTAAGAGC AGTGCGGATG 801 ATGAAATAGA AGAAACTAGA GTGAACTGCA GAAGGGAAAA GGTCATAGAG
851 ACCCCTGAGA ATGACTTCAA GCACCACAGG AGTCAAAACC ACTCTCGGTC
901 ACCTAGTGAA GAAAGAGGGC ACAGTACCAG GCACCACACG AAAGGATCAC
951 GAACGTCGAG AGGACATGAG AAAAGGGAAG ATCAGCACCA GCAGAAGCAA
1001 TCCAGAGACC AAGAGAACCA TTACACTGAC CGTGATTACC GGAAAGAAAG
1051 GGATTCTCAT AGGCACAGAG AGGCCAGTCA TAGAGATTCC CATTGGAAGA
1101 GGCATGAACA GGAAGATAAA CCAAGGGCGA GGGACCAAAG AGAAAGAAGT
1151 GACAGAGTAT GGAAAAGGGA GAAAGATAGG GAGAAATATT CCCAAAGAGA
1201 ACAAGAAAGA GATAGACAAC AAAATGATCA GAACCGACCC AGTGAGAAAG
1251 GAGAGAAGGA AGAGAAAAGC AAAGCAAAGG AAGAGCATAT GAAAGTAAGG
1301 AAGGAAAGAT ATGAAAATAA TGATAAATAC AGAGATAGAG AAAAACGAGA
1351 GGTAGGTGTT CAGTCTTCAG AAAGAAATCA AGACAGAAAG GAAAGCAGCC
1401 CAAATTCTAG GGCAAAGGAT AAATTTCTTG ACCAAGAAAG ATCCAACAAA
1451 ATGAGAAACA TGGCAAAGGA CAAAGAAAGA AACCAAGAGA AACCCTCTAA 1501 TTCTGAATCA TCACTGGGAG CAAAACACAG ACTCACAGAG GAAGGGCAAG
1551 AGAAGGGTAA AGAACAAGAG AGACCACCTG AGGCAGTGAG CAAGTTTGCA
1601 AAGCGGAACA ATGAAGAAAC TGTAATGTCA GCTAGAGACA GGTACTTGGC
1651 CAGGCAGATG GCGCGGGTTA ATGCAAAGAC CTATATTGAG AAAGAAGATG
1701 ATTGATGGCT ACCCCAAGAG AAAGATTTAA GGAAGCACAG AAAACTGTAA
1751 TTCCTGGAAC CTGCTGCGTA AAACCATAAA GGAGTGTGTT ACCAGTAGTT
1801 TGGAGGGCAT TTTTAAATTT ATTTTCAAAA TTTTAAGTTA AAAGTCAGTC
1851 TTACAGCTTG GATGTTTGGA TGTGGATGTT TGGCTGAATT TATATATAGT
1901 GTGTACTCAT CAATACCACA TTCTTTGTTG TATTCAAGAA CCGTTAAGAG
1951 TGTGCTAATT CCCTGTAGGT ACATAATGAG GAAAATTTGC TCCACTACAA
2001 CCATTAAAAA ATAATTTTGG CCAGATACGG TAGCTCGTGC CTGTAATACC
2051 AACATTTTGG GAGGCCAAGG CAGAAGGATA TTGAGGCTAG GCATTCAAGA
2101 CCAGCCTAGG CAGGATAATA AGACCTTGTC TCTATTTAAA AAACAAAAAG
2151 CCTAGCATGG TAGTCCATGC CTGTAGTCCC AGCTGTTCGA GAGGCTGAGG
2201 CAAGAAGATC ACTTGAGCCT AGGAATTTGA TGTTACAGTG AGGTATGATC
2251 ATGCCACTGC ACTCCAACCT GGGCAACAGA ATGAGACCCT GTCTCTAAAA
2301 AATTTTTTTT AAATAAATAA TTTAACTCTT CTAATAATGT TTTGTTGCAG
2351 GAAATGTATT TCAGATAAAA TATGGATTTG AAAAACAGAA AATATACTTT
2401 ATGTTCTGAA ATTTGTATTT AAGTATAAAA TGTGAATCAT CTTGTCTAAA 2451 TAGCTTACAG CATAGTTGGC TTAAATGAAA ATAAAATGAT ATGCTTATAC
2501 ATTTGGAAAA AAAAAAAAAA AAAAAAAAAA AAAAAAAAAA AAAAAAG
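The polyadenylation signal noted for this insert can be located by a plain string search. The sketch below assumes the canonical hexamer AATAAA and 1-based numbering; the listing's own position may follow a different indexing convention. The helper name `find_polya_signals` is illustrative.

```python
def find_polya_signals(seq, signal="AATAAA", first_position=1):
    """Return the positions (numbered from `first_position`) where `signal` starts."""
    seq = seq.upper().replace(" ", "")
    positions = []
    i = seq.find(signal)
    while i != -1:
        positions.append(i + first_position)
        i = seq.find(signal, i + 1)  # step by one so overlapping hits are kept
    return positions

# The 50 bases printed above for positions 2451-2500 of the insert.
row_2451 = "TAGCTTACAG CATAGTTGGC TTAAATGAAA ATAAAATGAT ATGCTTATAC"
print(find_polya_signals(row_2451, first_position=2451))
```

For this row the search finds one hexamer, starting at base 2480 in 1-based numbering, a few bases upstream of the poly A stretch.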
BLAST Results
No BLAST result
Medline entries
No Medline entry
Peptide information for frame 2
ORF from 29 bp to 1702 bp; peptide length: 558
Category: similarity to unknown protein
Classification: Nucleic acid management
1 MAIPGRQYGL ILPKKTQQLH PVLQKPSVFG NDSDDDDETS VSESLQREAA
51 KKQAMKQTKL EIQKALAEDA TVYEYDSIYD EMQKKKEENN PKLLLGKDRK
101 PKYIHNLLKA VEIRKKEQEK RMEKKIQRER EMEKGEFDDK EAFVTSAYKK
151 KLQERAEEEE REKRAAALEA CLDVTKQKDL SGFYRHLLNQ AVGEEEVPKC
201 SFREARSGIK EEKSRGFSNE VSSKNRIPQE KCILQTDVKV EENPDADSDF
251 DAKSSADDEI EETRVNCRRE KVIETPENDF KHHRSQNHSR SPSEERGHST
301 RHHTKGSRTS RGHEKREDQH QQKQSRDQEN HYTDRDYRKE RDSHRHREAS
351 HRDSHWKRHE QEDKPRARDQ RERSDRVWKR EKDREKYSQR EQERDRQQND
401 QNRPSEKGEK EEKSKAKEEH MKVRKERYEN NDKYRDREKR EVGVQSSERN
451 QDRKESSPNS RAKDKFLDQE RSNKMRNMAK DKERNQEKPS NSESSLGAKH
501 RLTEEGQEKG KEQERPPEAV SKFAKRNNEE TVMSARDRYL ARQMARVNAK
551 TYIEKEDD
BLASTP hits
No BLASTP hits available
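The ORF coordinates and the peptide length of an entry can be cross-checked with simple arithmetic. This is a sketch under two assumptions: coordinates are 1-based and inclusive, and the end coordinate is the last base of the last sense codon (the stop codon is not counted). For this clone the start is taken as base 29, where the first ATG of the frame sits in the printed sequence. The function name `peptide_length` is illustrative.

```python
def peptide_length(orf_start, orf_end):
    """Number of codons spanned by an ORF with 1-based inclusive coordinates."""
    span = orf_end - orf_start + 1
    if span % 3 != 0:
        raise ValueError(f"ORF span of {span} bases is not a multiple of 3")
    return span // 3

print(peptide_length(29, 1702))
```

With these inputs the span is 1674 bases, which corresponds to 558 codons. A span that is not divisible by three usually points to a misread coordinate.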
Alert BLASTP hits for DKFZphtes3_21k14, frame 2
No Alert BLASTP hits found
Pedant information for DKFZphtes3_21k14, frame 2
Report for DKFZphtes3_21k14.2
[LENGTH] 567
[MW] 67262.61
[pI] 8.16
[HOMOL] TREMBL:AC006233_14 gene: "F12K2.14", Arabidopsis thaliana chromosome II BAC F12K2 genomic sequence, complete sequence 3e-11
[FUNCAT] 04.99 other transcription activities [S. cerevisiae, YKR092c] 1e-05
[FUNCAT] 30.10 nuclear organization [S. cerevisiae, YKR092c] 1e-05
[FUNCAT] 06.07 protein modification (glycosylation, acylation, myristylation, palmitylation, farnesylation and processing) [S. cerevisiae, YKL201c] 1e-04
[BLOCKS] PF00748F
[BLOCKS] BL01182E Glycosyl hydrolases family 35 proteins
[EC] 2.7.1.37 Protein kinase 7e-06
[EC] 5.99.1.2 DNA topoisomerase 4e-06
[PIRKW] phosphotransferase 7e-06
[PIRKW] pre-mRNA splicing 1e-06
[PIRKW] citrulline 3e-06
[PIRKW] tandem repeat 3e-06
[PIRKW] DNA binding 4e-06
[PIRKW] DNA replication 4e-06
[PIRKW] isomerase 4e-06
[PIRKW] ATP 3e-06
[PIRKW] phosphoprotein 1e-06
[PIRKW] calcium binding 3e-06
[PIRKW] alternative splicing 7e-06
[PIRKW] P-loop 3e-06
[PIRKW] EF hand 3e-06
[PIRKW] hair 3e-06
[SUPFAM] DEAD/H box helicase homology 3e-06
[SUPFAM] unassigned Ser/Thr or Tyr-specific protein kinases 4e-06
[SUPFAM] calmodulin repeat homology 3e-06
[SUPFAM] unassigned ribonucleoprotein repeat-containing proteins 1e-06
[SUPFAM] unassigned DEAD/H box helicases 3e-06
[SUPFAM] trichohyalin 3e-06
[SUPFAM] protein kinase homology 4e-06
[SUPFAM] eukaryotic type I DNA topoisomerase 4e-06
[SUPFAM] ribonucleoprotein repeat homology 1e-06
[KW] All_Alpha
[KW] LOW_COMPLEXITY 22.75 %
SEQ ATFSGHGSKMAIPGRQYGLILPKKTQQLHPVLQKPSVFGNDSDDDDETSVSESLQREAAK
SEG xxxxxxxxxxxxx PRD ccccccccccccccccceeeeccccccccccccccccccccccccccccchhhhhhhhhh
SElQ KiQAMKlQTKLEIiQKALAEDATVYEYDSIYDEMiQKKKEENNPKLLLGKDRKPKYIHNLLKAV
SEG xxxxxxxxxxxxxx
PRD hhhhhhhhhhhhhhhhhhcccccccccchhhhhhhhhhchhhhhhccccchhhhhhhhhh
SE(Q EIRKKEiQEKRMEKKIiQREREMEKGEFDDKEAFVTSAYKKKLiQERAEEEEREKRAAALEAC
SEG xxxxxxxxxxxxxxxxxxxxxxx xxxxxxxxxxxxxxxxxxxxxx -
PRD hhhhhhhhhhhhhhhhhhhhhhccccccccchhhhhhhhhhhhhhhhhhhhhhhhhhhhh SElQ LDVTKlQKDLSGFYRHLLNiQAVGEEEVPKCSFREARSGIKEEKSRGFSNEVSSKNRIPiQEK
SEG
PRD hhhhhhhccchhhhhhhhhhhhccccccccccccchhhhhhhhhhhhhhhhhhhhhhhhh
SElQ CILlQTDVKVEENPDADSDFDAKSSADDEIEETRVNCRREKVIETPENDFKHHRSlQNHSRS SEG xxxxxxxxxxxxxx
PRD hhhhhhhhhhhcccchhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhcccc
SElQ PSEERGHSTRHHTKGSRTSRGHEKREDlQHiQiQKiQSRDlQENHYTDRDYRKERDSHRHREASH
SEG PRD ccccchhhhhhhhhhhhcccchhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhchh
SElQ RDSHUKRHEiQEDKPRARDlQRERSDRVUKREKDREKYSlQREiQERDRiQiQNDlQNRPSEKGEKE
SEG xxxxxxxxxxxxxxx . - xxxxxx
PRD hhhhhhhhccccchhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhccccccccccccchhh
SElQ EKSKAKEEHMKVRKERYENNDKYRDREKREVGVlQSSERNiQDRKESSPNSRAKDKFLDlQER
SEG xxxxxxxxxxx
PRD hhhhhhhhhhhhhhhhhhhccchhhhhhhhhhhhhhhhchhhhhhhhhhhhhhhhhhhhh SEA SNKMRNMAKDKERNlQEKPSNSESSLGAKHRLTEEGlQEKGKEiQERPPEAVSKFAKRNNEET
SEG xxxxxxxxxxx
PRD hhhhhhhhhhhhhhhhccchhhhhccchhhhhhhhhhhhhhhccccchhhhhhhccccch
SElQ VMSARDRYLARiQMARVNAKTYIEKEDD SEG
PRD hhhhhhhhhhhhhhhhhchhhhhcccc
(No Prosite data available for DKFZphtes3_21k14.2)
(No Pfam data available for DKFZphtes3_21k14.2)
DKFZphtes3_22i11
group: testis derived
DKFZphtes3_22i11 encodes a novel 580 amino acid protein with similarity to RCC1-like G exchanging factor RLG, UVR8 (UVB-resistance protein) of Arabidopsis thaliana and to the murine retinitis pigmentosa GTPase regulator. No informative BLAST results; no predictive Prosite, Pfam or SCOP motifs. The new protein can find application in studying the expression profile of testis-specific genes.
Homo sapiens chromosome 7q22 sequence, ORF4; extension differences to gene model of ORF4, differential splicing
Sequenced by LMU
Locus: /map="7q22"
Insert length: 2236 bp
Poly A stretch at pos. 2197, polyadenylation signal at pos. 2180
1 ACAATGCTCA GATCGGGAGG TGGAGCCAAT CAGGTCCAAC CAAGAGGAGG
51 GGACACCGGC ACTCCACTAG CAGGAAAACG GGCCGAGGGA CCGCAAGCAG
101 GGGGTGCCTA GTCCTCGTCC CCCAAAGACC AATCGTAAGC CAGATACAGG 151 CGAGTGACTG TCAAGAAGGC CAATTAGAGC CTCCGAAGGG AATCTGGACC
201 TGCCTCTTCT CTGAGGGACG GCTCTACCTA CCAATAGCAT GGGCGAGAAG
251 GCGGTCCCTT TGCTAAGGAG GAGGCGGGTG AAGAGAAGCT GCCCTTCTTG
301 TGGCTCGGAG CTTGGGGTTG AAGAGAAGAG GGGGAAAGGA AATCCGATTT
351 CCATCCAGTT GTTCCCCCCA GAGCTGGTGG AGCATATCAT CTCATTCCTC 401 CCAGTCAGAG ACCTTGTTGC CCTCGGCCAG ACCTGCCGCT ACTTCCACGA
451 AGTGTGCGAT GGGGAAGGCG TGTGGAGACG CATCTGTCGC AGACTCAGTC
501 CGCGCCTCCA AGATCAGGGT TCTGGAGTCC GGCCCTGGAA GAGAGCTGCC
551 ATTCTGAACT ACACGAAGGG CCTGTATTTC CAGGCATTTG GAGGCCGCCG
601 CCGATGTCTC AGCAAGAGCG TGGCCCCCTT GCTAGCCCAC GGCTACCGCC
651 GCTTCTTGCC CACCAAGGAT CACGTCTTCA TTCTTGACTA CGTGGGGACC
701 CTCTTCTTCC TCAAAAATGC CCTGGTCTCC ACCCTCGGCC AGATGCAGTG
751 GAAGCGGGCC TGTCGCTATG TTGTGTTGTG TCGTGGAGCC AAGGATTTTG
801 CCTCGGACCC AAGGTGTGAC ACAGTTTACC GTAAATACCT CTACGTCTTG
851 GCCACTCGGG AGCCGCAGGA AGTGGTGGGT ACCACCAGCA GCCGGGCCTG
901 TGACTGTGTT GAGGTCTATC TGCAGTCTAG TGGGCAGCGG GTCTTCAAGA
951 TGACATTCCA CCACTCAATG ACCTTCAAGC AGATCGTGCT GGTTGGTCAG
1001 GAGACCCAGC GGGCTCTACT GCTCCTCACA GAGGAAGGAA AGATCTACTC
1051 TTTGGTAGTG AATGAGACCC AGCTTGACCA GCCACGCTCC TACACGGTTC
1101 AGCTGGCCCT GAGGAAGGTG TCCCACTACC TGCCTCACCT GCGCGTGGCC
1151 TGCATGACTT CCAACCAGAG CAGCACCCTC TACGTCACAG ACCAGGGGGG
1201 AGTGTATTTT GAGGTGCATA CCCCAGGGGT GTATCGCGAT CTCTTTGGGA
1251 CCCTTCAAGC CTTTGACCCC CTGGACCAGC AGATGCCGCT TGCTCTCTCA
1301 CTGCCTGCCA AGATCCTATT CTGTGCTCTT GGCTACAACC ACCTTGGCCT 1351 GGTGGATGAA TTTGGCCGAA TCTTCATGCA AGGAAATAAC AGATACGGGC
1401 AGCTAGGAAC AGGGGACAAA ATGGACCGAG GGGAACCCAC ACAGGTTTGT
1451 TACCTGCAGC GGCCCATCAC CCTGTGGTGC GGCCTCAACC ACTCCCTGGT
1501 GCTGAGCCAG AGCTCAGAGT TCAGCAAGGA GCTGCTGGGC TGCGGCTGTG
1551 GGGCIGGGGG CCGCCTCCCA GGCTGGCCCA AGGGGAGTGC CTCCTTCGTC
1601 AAGCTCCAAG TCAAGGTCCC TCTGTGTGCC TGTGCCCTCT GTGCCACCAG
1651 GGAGTGCCTA TACATCCTGT CCAGCCACGA CATTGAGCAG CACGCCCCCT
1701 ATCGCCACCT GCCAGCCAGC AGGGTGGTGG GGACTCCTGA GCCCAGCCTG
1751 GGGGCCAGAG CACCCCAGGA CCCCGGGGGG ATGGCCCAGG CCTGCGAGGA
1801 GTACCTCAGC CAGATCCACA GTTGCCAAAC GTTGCAGGAC CGCACGGAGA
1851 AGATGAAGGA GATCGTAGGG TGGATGCCCC TGATGGCCGC ACAGAAGGAC
1901 TTCTTCTGGG AGGCCCTGGA CATGCTGCAG AGGGCTGAAG GAGGCGGGGG
1951 TGGTGTAGGG CCCCCAGCCC CTGAGACCTA ATCCCCCTCA TGCTAGCCTA
2001 GTCCCTGGAG GAGGGAGTCC GGCCCCAGGC CAGGGACTAA GGAGCAATGA
2051 CCATTGTGCA CATGCGTGTG GGAAGGGGTT GCTAGGGGGT GGGGACGGCT
2101 AACCAGGGTA AGAATGTTCA GGGGGCTGCC CAGGAGGGGC CCCCAACCTG
2151 ACTATCATGG ACAAGAGATT TGATGGATAG AATAAAAGGC TGCAGCGAAA
2201 AAAAAAAAAA AAAAAAAAAA AAAAAAAAAA AAAAAG
BLAST Results
Entry AF053356 from database EMBL: Homo sapiens chromosome 7q22 sequence, complete sequence.
Score = 2152, P = 0.0e+00, identities = 666/721
10 exons
Medline entries
No Medline entry
Peptide information for frame 2
ORF from 239 bp to 1978 bp; peptide length: 580
Category: similarity to unknown protein
Classification: no clue
1 MGEKAVPLLR RRRVKRSCPS CGSELGVEEK RGKGNPISIQ LFPPELVEHI
51 ISFLPVRDLV ALGQTCRYFH EVCDGEGVWR RICRRLSPRL QDQGSGVRPW
101 KRAAILNYTK GLYFQAFGGR RRCLSKSVAP LLAHGYRRFL PTKDHVFILD
151 YVGTLFFLKN ALVSTLGQMQ WKRACRYVVL CRGAKDFASD PRCDTVYRKY
201 LYVLATREPQ EVVGTTSSRA CDCVEVYLQS SGQRVFKMTF HHSMTFKQIV
251 LVGQETQRAL LLLTEEGKIY SLVVNETQLD QPRSYTVQLA LRKVSHYLPH
301 LRVACMTSNQ SSTLYVTDQG GVYFEVHTPG VYRDLFGTLQ AFDPLDQQMP
351 LALSLPAKIL FCALGYNHLG LVDEFGRIFM QGNNRYGQLG TGDKMDRGEP
401 TQVCYLQRPI TLWCGLNHSL VLSQSSEFSK ELLGCGCGAG GRLPGWPKGS
451 ASFVKLQVKV PLCACALCAT RECLYILSSH DIEQHAPYRH LPASRVVGTP
501 EPSLGARAPQ DPGGMAQACE EYLSQIHSCQ TLQDRTEKMK EIVGWMPLMA
551 AQKDFFWEAL DMLQRAEGGG GGVGPPAPET
BLASTP hits
No BLASTP hits available
Alert BLASTP hits for DKFZphtes3_22i11, frame 2
TREMBL:AF053356_11 product: "ORF4"; Homo sapiens chromosome 7q22 sequence, complete sequence., N = 1, Score = 1554, P = 1.6e-151
TREMBL:AF130441_1 gene: "UVR8"; product: "UVB-resistance protein UVR8"; Arabidopsis thaliana UVB-resistance protein UVR8 (UVR8) mRNA, complete cds., N = 1, Score = 101, P = 0.0062
TREMBL:AF044677_1 gene: "Rpgr"; product: "retinitis pigmentosa GTPase regulator"; Mus musculus retinitis pigmentosa GTPase regulator (Rpgr) mRNA, complete cds., N = 1, Score = 106, P = 0.035
>TREMBL:AF053356_11 product: "ORF4"; Homo sapiens chromosome 7q22 sequence, complete sequence. Length = 318
HSPs
Score = 1554 (233.2 bits), Expect = 1.6e-151, P = 1.6e-151
Identities = 303/318 (95%), Positives = 303/318 (95%)
Query: 1 MGEKAVPLLRRRRVKRSCPSCGSELGVEEKRGKGNPISIQLFPPELVEHIISFLPVRDLV 60
MGEKAVPLLRRRRVKRSCPSCGSELGVEEKRGKGNPISIQLFPPELVEHIISFLPVRDLV
Sbjct: 1 MGEKAVPLLRRRRVKRSCPSCGSELGVEEKRGKGNPISIQLFPPELVEHIISFLPVRDLV 60
Query: 61 ALGQTCRYFHEVCDGEGVWRRICRRLSPRLQDQGSGVRPWKRAAILNYTKGLYFQAFGGR 120
ALGQTCRYFHEVCDGEGVWRRICRRLSPRLQDQ TKGLYFQAFGGR
Sbjct: 61 ALGQTCRYFHEVCDGEGVWRRICRRLSPRLQDQD TKGLYFQAFGGR 106
Query: 121 RRCLSKSVAPLLAHGYRRFLPTKDHVFILDYVGTLFFLKNALVSTLGQMQWKRACRYVVL 180
RRCLSKSVAPLLAHGYRRFLPTKDHVFILDYVGTLFFLKNALVSTLGQMQWKRACRYVVL
Sbjct: 107 RRCLSKSVAPLLAHGYRRFLPTKDHVFILDYVGTLFFLKNALVSTLGQMQWKRACRYVVL 166
Query: 181 CRGAKDFASDPRCDTVYRKYLYVLATREPQEVVGTTSSRACDCVEVYLQSSGQRVFKMTF 240
CRGAKDFASDPRCDTVYRKYLYVLATREPQEVVGTTSSRACDCVEVYLQSSGQRVFKMTF
Sbjct: 167 CRGAKDFASDPRCDTVYRKYLYVLATREPQEVVGTTSSRACDCVEVYLQSSGQRVFKMTF 226
Query: 241 HHSMTFKQIVLVGQETQRALLLLTEEGKIYSLVVNETQLDQPRSYTVQLALRKVSHYLPH 300
HHSMTFKQIVLVGQETQRALLLLTEEGKIYSLVVNETQLDQPRSYTVQLALRKVSHYLPH
Sbjct: 227 HHSMTFKQIVLVGQETQRALLLLTEEGKIYSLVVNETQLDQPRSYTVQLALRKVSHYLPH 286
Query: 301 LRVACMTSNQSSTLYVTD 318
LRVACMTSNQSSTLYVTD
Sbjct: 287 LRVACMTSNQSSTLYVTD 304
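The percentage quoted for an HSP follows directly from its identity fraction. A minimal sketch, assuming the HSP figures read 303 identical positions over an aligned length of 318; the function name `percent_identity` is illustrative.

```python
def percent_identity(matches, aligned_length):
    """Percentage of identical positions over the aligned length (gaps included)."""
    if aligned_length <= 0:
        raise ValueError("aligned_length must be positive")
    return 100.0 * matches / aligned_length

print(round(percent_identity(303, 318)))
```

The rounded value is 95. The 15 non-identical positions are consistent with the shorter Sbjct span in the second alignment block.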
Pedant information for DKFZphtes3_22i11, frame 2
Report for DKFZphtes3_22i11.2
[LENGTH] 580
[MW] 64881.41
[pI] 9.01
[HOMOL] TREMBL:AF053356_11 product: "ORF4", Homo sapiens chromosome 7q22 sequence, complete sequence. 1e-174
[BLOCKS] BL00625B Regulator of chromosome condensation (RCC1) proteins
[BLOCKS] BL00625A Regulator of chromosome condensation (RCC1) proteins
[KW] Alpha_Beta
[KW] LOW_COMPLEXITY 3.62 %
SElQ MGEKAVPLLRRRRVKRSCPSCGSELGVEEKRGKGNPISIiQLFPPELVEHIISFLPVRDLV
SEG
PRD ccccchhhhhhhhhcccccccccccccccccccccceeeeccccchhhhhhheeeeeeee
SElQ ALGiQTCRYFHEVCDGEGVURRICRRLSPRLiQDlQGSGVRPUKRAAILNYTKGLYFiQAFGGR SEG
PRD ecccceeeeeeeecceeeeeeeeeecccccccccccccccccchhhhhhccceeeecccc
SElQ RRCLSKSVAPLLAHGYRRFLPTKDHVFILDYVGTLFFLKNALVSTLGlQMiQUKRACRYVVL
SEG PRD eeeeccchhhhhhhheeeeccccceeeeeeeeeeecccceeeeeeccchhhhhhhhheee
SElQ CRGAKDFASDPRCDTVYRKYLYVLATREPiQEVVGTTSSRACDCVEVYLiQSSGlQRVFKMTF
SEG
PRD ecccccccccccceeeeeehhhhhhhhccceeeeeccccceeeeeeeeecccceeeeeec
SElQ HHSMTFKlQIVLVGlQETiQRALLLLTEEGKIYSLVVNETlQLDlQPRSYTViQLALRKVSHYLPH
SEG
PRD ccccceeeeeeeehhhhhhhhhhhhhcceeeeeeeccccccccceeeehhhhhhhhccce SElQ LRVACMTSNIQSSTLYVTDIQGGVYFEVHTPGVYRDLFGTLIQAFDPLDIQIQMPLALSLPAKIL
SEG
PRD eeeeeeccccccceeeecccceeeeeccccccccccceeeecccccccceeeeeccceee SElQ FCALGYNHLGLVDEFGRIFMiQGNNRYGlQLGTGDKMDRGEPTiQVCYLlQRPITLUCGLNHSL
SEG
PRD eeeeccccceeeeeceeeeeecccccccccccccccccccceeeeeccceeeecccccee SElQ VLSlQSSEFSKELLGCGCGAGGRLPGUPKGSASFVKLlQVKVPLCACALCATRECLYILSSH
SEG xxxxxxxxxxxxxx - - ■
PRD eeeeccccceeeeccccccccccccccccceeeeeeeeeeeeeeeeeeeecccceeeecc
SElQ DIE(QHAPYRHLPASRVVGTPEPSLGARAP(QDPGGMAιQACEEYLS(QIHSC(QTLtQDRTEKMK SEG
PRD cccccccccccccceeeecccccccccccccccchhhhhhhhhhhhhcchhhhhhhhhhh
SElQ EIVGUMPLMAAIQKDFFUEALDMLIQRAEGGGGGVGPPAPET
SEG xxxxxxx PRD hhhhcchhhhhhhhhhhhhhhhhhhhccccceeecccccc
(No Prosite data available for DKFZphtes3_22i11.2)
(No Pfam data available for DKFZphtes3_22i11.2)
DKFZphtes3_22l24
group: testis derived
DKFZphtes3_22l24 encodes a novel 451 amino acid protein with similarity to the F-box protein FBL2 of the rat. No informative BLAST results; no predictive Prosite, Pfam or SCOP motifs.
The new protein can find application in studying the expression profile of testis-specific genes.
similarity to p37NB (Homo sapiens)
Sequenced by LMU
Locus: /map="7q22-q31.1"
Insert length: 1537 bp
Poly A stretch at pos. 1451, no polyadenylation signal found
1 CAACAGGACG ATGCGACTCC TGCCGAGGCA CTTCCACAAC TTACAGAATC
51 TTAGTTTGGC TTATTGCAGA CGGTTCACAG ACAAAGGCTT ACAGTACCTG
101 AACTTGGGGA ATGGATGCCA CAAGCTCATC TATCTGGACC TCTCTGGCTG 151 CACCCAGATT TCAGTCCAAG GCTTCAGGTA CATTGCAAAC AGCTGCACTG
201 GAATTATGCA TCTTACCATT AATGACATGC CAACTCTGAC GGACAACTGT
251 GTAAAAGCTT TAGTTGAAAA ATGCTCTCGT ATTACATCGC TGGTTTTCAC
301 TGGTGCACCG CATATCTCCG ATTGTACTTT CAGAGCTCTT TCTGCTTGTA
351 AACTCAGAAA GATCCGATTT GAAGGAAATA AAAGGGTTAC TGATGCATCC 401 TTCAAATTTA TAGACAAGAA TTATCCAAAT CTCAGTCACA TTTATATGGC
451 TGACTGCAAG GGAATAACAG ACAGCAGCCT CAGATCCCTT TCACCTTTGA
501 AGCAACTGAC TGTGTTGAAT TTGGCAAATT GTGTAAGAAT TGGTGATATG
551 GGACTAAAGC AATTTCTTGA TGGTCCTGCA AGCATGAGGA TAAGAGAGCT
601 AAATTTAAGC AACTGTGTGC GGCTAAGTGA TGCCTTTGTT ATGAAACTAT
651 CTGAGCGCTG CCCTAATTTA AACTACTTGA GTTTACGAAA TTGTGAACAT
701 TTGACTGCCC AAGGAATTGG ATATATTGTA AACATCTTTT CCTTGGTATC
751 AATAGATCTC TCTGGAACAG ACATCTCTAA TGAGGGTTTG AATGTGCTTT
801 CCAGACATAA AAAATTGAAG GAACTTTCTG TATCTGAATG TTATAGAATC
851 ACTGATGATG GAATTCAGGC ATTCTGCAAA AGCTCACTGA TCTTGGAACA
901 TTTGGATGTC TCTTATTGCT CCCAGCTGTC AGATATGATT ATCAAAGCAC
951 TGGCCATTTA CTGCATTAAC CTCACATCTC TCAGCATTGC TGGCTGTCCA
1001 AAGATTACTG ACTCAGCAAT GGAGATGTTA TCGGCAAAAT GCCATTACCT
1051 GCACATTTTG GATATCTCTG GTTGTGTCTT GCTTACTGAC CAAATCCTTG
1101 AGGACCTTCA GATAGGCTGC AAACAACTCC GGATCCTTAA GATGCAATAC 1151 TGCACAAATA TTTCCAAGAA GGCAGCTCAA AGAATGTCAT CTAAAGTTCA
1201 GCAGCAGGAA TACAACACTA ATGACCCTCC ACGTTGGTTT GGCTATGATA
1251 GGGAAGGAAA CCCTGTTACA GAGCTTGACA ACATAACATC ATCTAAAGGA
1301 GCCTTAGAAT TAACAGTGAA AAAGTCAACA TACAGCAGTG AAGACCAAGC
1351 AGCGTGACCT TCAGCCTCAA GCAGGAAGAA CAAAAAATCA AGAACTTGGC 1401 AAGTTTTCTC CATTTGTTGC AAGTATGTTT ACTAGCTGAA TCTCAATAAC
1451 AATGTAAACA AGCAAAAAAA AAAAAAAAAA AAAAAAAAAA AAAAAAAAAA
1501 AAAAAAAAAA AAAAAAAAAA AAAAAAAAAA AAAAAAG
BLAST Results
Entry AC005250 from database EMBL:
Homo sapiens BAC clone RG318M05 from 7q22-q31.1, complete sequence.
Score = 630, P = 1.8e-124, identities = 160/113
Entry HS32907 from database EMBL: Human p37NB mRNA, complete cds.
Score = 318, P = 4.6e-04, identities = 70/78
Medline entries
97136875: Kim D, LaQuaglia MP, Yang SY. A cDNA encoding a putative 37 kDa leucine-rich repeat (LRR) protein, p37NB, isolated from S-type neuroblastoma cell has a differential tissue distribution. Biochim Biophys Acta 1996 Dec 11;1309(3):183-8
Peptide information for frame 2
ORF from 11 bp to 1354 bp; peptide length: 448
Category: similarity to known protein
Classification: unclassified
1 MRLLPRHFHN LQNLSLAYCR RFTDKGLQYL NLGNGCHKLI YLDLSGCTQI
51 SVQGFRYIAN SCTGIMHLTI NDMPTLTDNC VKALVEKCSR ITSLVFTGAP
101 HISDCTFRAL SACKLRKIRF EGNKRVTDAS FKFIDKNYPN LSHIYMADCK
151 GITDSSLRSL SPLKQLTVLN LANCVRIGDM GLKQFLDGPA SMRIRELNLS
201 NCVRLSDAFV MKLSERCPNL NYLSLRNCEH LTAQGIGYIV NIFSLVSIDL
251 SGTDISNEGL NVLSRHKKLK ELSVSECYRI TDDGIQAFCK SSLILEHLDV
301 SYCSQLSDMI IKALAIYCIN LTSLSIAGCP KITDSAMEML SAKCHYLHIL
351 DISGCVLLTD QILEDLQIGC KQLRILKMQY CTNISKKAAQ RMSSKVQQQE
401 YNTNDPPRWF GYDREGNPVT ELDNITSSKG ALELTVKKST YSSEDQAA
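The peptide rows can be spot-checked by translating the DNA rows from the stated ORF start. The sketch below assumes the standard genetic code and 1-based coordinates; the helper `translate` and its `start` parameter are illustrative, and unknown codons are rendered as X.

```python
from itertools import product

# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ... GGG.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}

def translate(dna, start=1):
    """Translate from 1-based position `start` until a stop codon or the end."""
    seq = dna.upper().replace(" ", "")[start - 1:]
    peptide = []
    for i in range(0, len(seq) - 2, 3):
        residue = CODON_TABLE.get(seq[i:i + 3], "X")
        if residue == "*":
            break
        peptide.append(residue)
    return "".join(peptide)

# First DNA row of this clone (positions 1-50); the ORF starts at base 11.
first_row = "CAACAGGACG ATGCGACTCC TGCCGAGGCA CTTCCACAAC TTACAGAATC"
print(translate(first_row, start=11))
```

The 13 complete codons in this row translate to MRLLPRHFHNLQN, which agrees with the start of the peptide listed for frame 2.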
BLASTP hits
No BLASTP hits available
Alert BLASTP hits for DKFZphtes3_22l24, frame 2
No Alert BLASTP hits found
Pedant information for DKFZphtes3_22l24, frame 2
Report for DKFZphtes3_22l24.2
[LENGTH] 451
[MW] 50545.15
[pI] 8.68
[HOMOL] TREMBLNEW:AF186273_1 product: "leucine-rich repeats containing F-box protein FBL3", Homo sapiens leucine-rich repeats containing F-box protein FBL3 mRNA, complete cds. 8e-31
[FUNCAT] 11.01 stress response [S. cerevisiae, YJR090c] 8e-20
[FUNCAT] 03.01 cell growth [S. cerevisiae, YJR090c] 8e-20
[FUNCAT] 08.19 cellular import [S. cerevisiae, YJR090c] 8e-20
[FUNCAT] 03.22 cell cycle control and mitosis [S. cerevisiae, YJR090c] 8e-20
[FUNCAT] 03.04 budding, cell polarity and filament formation [S. cerevisiae, YJR090c] 8e-20
[FUNCAT] 01.05.04 regulation of carbohydrate utilization [S. cerevisiae, YJR090c] 8e-20
[FUNCAT] 11.04 DNA repair (direct repair, base excision repair and nucleotide excision repair) [S. cerevisiae, YJR052w] 3e-07
[FUNCAT] 30.10 nuclear organization [S. cerevisiae, YJR052w] 3e-07
[BLOCKS] PR00019B
[BLOCKS] PR00364D
[BLOCKS] BP01121A
[BLOCKS] BP03743B
[PIRKW] tandem repeat 2e-18
[PIRKW] zinc finger 1e-07
[PIRKW] DNA binding 1e-07
[SUPFAM] leucine-rich alpha-2-glycoprotein repeat homology 2e-16
[SUPFAM] regulatory protein ESAG6c 1e-07
[KW] Alpha_Beta
SElQ NRTMRLLPRHFHNLiQNLSLAYCRRFTDKGLlQYLNLGNGCHKLIYLDLSGCTlQISVlQGFRY PRD ccccccccccccccccceeeeeecccccccceeeecccccceeecccccccccccccccc SElQ IANSCTGIMHLTINDMPTLTDNCVKALVEKCSRITSLVFTGAPHISDCTFRALSACKLRK PRD ccccccceeeeeccccccccchhhhhhhhhhhccccccccccccccccccccccccceee
SElQ IRFEGNKRVTDASFKFIDKNYPNLSHIYMADCKGITDSSLRSLSPLKlQLTVLNLANCVRI
PRD eeecccccccccccccccccccceeeeeeeccccccchhhhhhhhcccccccceeeeeec
SElQ GDMGLKIQFLDGPASMRIRELNLSNCVRLSDAFVMKLSERCPNLNYLSLRNCEHLTAIQGIG
PRD cccccccccccccccccceeeeccccccccccchhhhhhcccccccccccccccccccee
SElQ YIVNIFSLVSIDLSGTDISNEGLNVLSRHKKLKELSVSECYRITDDGIIQAFCKSSLILEH PRD eeccccceeeeeecccccccccchhhhhhcccccccccccccccchhhhhcccccccccc
SElQ LDVSYCSIQLSDMIIKALAIYCINLTSLSIAGCPKITDSAMEMLSAKCHYLHILDISGCVL
PRD cceeecccccchhhhhhhhccccceeeeeecccccchhhhhhhhhhccceeeeecccccc SElQ LTD(QILEDL(QIGCK(QLRILKM(QYCTNISKKAA(QRMSSKVlQ(Q(QEYNTNDPPRUFGYDREGN
PRD chhhhhhhhhhcchhhhhhcceeeeechhhhhhhhhhhhheeeccccccccccccccccc
SElQ PVTELDNITSSKGALELTVKKSTYSSEDiQAA PRD ccccccccccccceeeeeccccccccccccc
(No Prosite data available for DKFZphtes3_22l24.2)
(No Pfam data available for DKFZphtes3_22l24.2)
DKFZphtes3_26g3
group: testis derived
DKFZphtes3_26g3 encodes a novel 1010 amino acid protein without similarity to known proteins. No informative BLAST results; no predictive Prosite, Pfam or SCOP motifs.
The new protein can find application in studying the expression profile of testis-specific genes.
similarity to C. elegans C01D4.4, on genomic level encoded by HSDJ116I1, perhaps complete cds.
Sequenced by EMBL
Locus: /map="6"
Insert length: 4562 bp
Poly A stretch at pos. 4550, polyadenylation signal at pos. 4515
1 GATTCAGTTA CTGAAGACTT AGATGCACCC TGGATGGGAA TTCAGAATCT
51 TCAGAGATCA GAGTCCAGTA AAATGGATAA ATATGAGACT GAAGAAAGCT
101 CTGTAGCAGG ACTTTCTAGC CCAGAGTTGA AAGTCAGACC TGCTGGTGCC
151 TCCAGTATTT GGTATACAGA AGGTGAAAAG CAGCTAACAA AATCTCTAAA
201 AGGAAAGAAT GAAGAATCAA ATAAATCCAA AGTTAAGGTT ACTAAGCTTA 251 TGAAAACAAT GAAATCTGAA AACACAAAAA AATTAATAAA ACAGAACTCT
301 AAGGATTCTG TGGTTTTGGT AGGCTACAAA TGTTTGAAAA GTACAGCATC
351 AAATGATCTC ATTAAATGCT TTGAAGGCAA TCCTTCACAT AGTCAGAAGG
401 AAGGTCTGGA TCCCACAATA TGTGGATATA ATTTTGACCC AAAGACCTAC
451 ATGAGACAGA CAAGTCAAAA GGAAGCTAGC TGTTTGCCAA CTAATACAGA
501 GAGAACTGAA CAAAAGTCTC CAGATATTGA AAATGTTCAA CCAGACCAGT
551 TTGATCCTTT GAACTCTGGC AACCTAAATC TTTGTGCAAA TTTGTCCATT
601 TCAGGTAAAC TTGATATCTC CCAGGACGAT AGTGAAATTA CACAAATGGA
651 ACACAATCTG GCATCCAGAA GGTCATCAGA CGATTGCCAT GATCATCAAA
701 CAACCCCATC TTTGGGAGTT AGAACAATTG AAATAAAGCC CAGTAATAAA
751 GATCCTTTCA GTGGAGAGAA TATAACTGTC AAACTAGGAC CTTGGACAGA
801 GCTTCGACAA GAGGAAATAC TTGTGGATAA TTTACTACCC AACTTTGAGT
851 CCTTAGAATC TAATGGTAAA TCTAAATCTA TAGAAATAAC ATTTGAAAAG
901 GAAGCTTTGC AAGAAGCAAA GTGTCTTTCT ATTGGAGAAT CATTAACTAA
951 ATTACGAAGT AATCTACCTG CCCCTTCTAC AAAAGAATAT CATGTTGTAG
1001 TAAGTGGAGA TACAATTAAG TTACCAGATA TTAGTGCCAC ATATGCCTCA
1051 TCTAGATTTT CAGATTCAGG TGTTGAAAGT GAACCGAGTT CTTTTGCGAC
1101 ACATCCAAAC ACTGATTTAG TCTTTGAAAC TGTGCAAGGG CAAGGTCCTT'
1151 GCAATAGTGA AAGATTATTT CCTCAGCTTT TGATGAAACC TGATTATAAT
12D1 GTAAAATTTT CATTAGGAAA TCATTGTACT GAGAGTACAA GTGCTATAAG 1251 TGAAATACAG TCATCTTTGA CATCCATAAA CTCTCTACCC TCCGATGATG
1301 AACTGTCACC TGATGAAAAT TCTAAGAAAT CTGTTGTACC TGAATGCCAT
1351 CTAAATGATA GCAAAACTGT ATTAAATCTA GGAACGACTG ATTTGCCAAA
14D1 ATGTGATGAT ACTAAAAAGT CAAGTATCAC TTTGCAACAG CAGAGTGTTG 1451 TATTTTCAGG GAACTTGGAC AATGAAACTG TAGCAATACA TTCCTTAAAT
1501 TCAAGCATTA AAGACCCTTT ACAATTTGTT TTTTCAGATG AAGAGACTTC
1551 CAGTGATGTG AAAAGTAGTT GCAGCTCCAA ACCTAACTTG GATACTATGT
IbOl GTAAAGGCTT CCAGAGTCCT GATAAATCTA ATAACTCTAC AGGGACAGCA IbSl ATTACATTAA ATTCAAAACT GATTTGTTTA GGCACTCCTT GTGTCATTTC
17D1 AGGTTCCATT TCTAGTAATA CAGATGTTAG TGAAGATAGA ACTATGAAAA
1751 AAAATAGTGA TGTATTAAAT CTCACACAGA TGTATTCAGA AATCCCTACA lδOl GTTGAAAGTG AAACTCATCT GGGTACAAGT GATCCTTTTT CAGCCAGTAC
1651 TGATATAGTA AAGCAAGGGC TTGTGGAAAA TTATTTTGGT TCTCAAAGCA 1101 GTACGGATAT TTCTGACACA TGTGCTGTTA GCTACAGCAA TGCACTTAGC
1151 CCTCAGAAGG AAACTTCTGA AAAAGAAATT AGTAATCTTC AGCAGGAACA
2D01 GGkJkkkGkG GATGAGGAGG kkGkGCkGGk TCAACAAATG GTTCAAAATG
2051 GGTACTATGA AGAAACAGAT TATTCAGCTT TGGATGGAAC AATAAATGCT
21D1 CACTATACAA GCAGAGATGA ACTAATGGAA GAAAGACTTA CAAAATCTGA 2151 AAAAATAAAC AGTGACTATC TGAGAGATGG TATAAACATG CCTACTGTCT
2201 GTACTTCTGG TTGTTTGTCC TTCCCGTCTG CACCACGAGA GTCTCCTTGT
2251 AATGTTAAAT ATTCTTCCAA AAGTAAATTT GATGCCATTA CAAAGCAGCC
23D1 AAGCAGTACT TCTTACAACT TCACTTCTTC GATTTCCTGG TATGAAAGTT
2351 CACCAAAACC TCAAATACAA GCCTTCCTTC AGGCAAAAGA AGAACTGAAG 2401 CTACTAAAAC TTCCTGGGTT CATGTACAGT GAAGTTCCTC TGCTGGCATC
2451 CTCAGTACCT TATTTTAGTG TAGAAGAAGA GGGTGGTTCT GAAGATGGAG
2501 TACATCTGAT TGTCTGTGTG CACGGTTTAG ATGGAAACAG TGCAGATCTC
2551 CGATTAGTAA AAACTTACAT TGAACTTGGA TTGCCTGGGG GAAGAATTGA
2b01 TTTTCTTATG TCTGAGAGAA ATCAGAATGA TACTTTTGCT GATTTTGATA 2b51 GCATGACTGA TCGTCTTTTG GATGAGATAA TACAGTATAT TCAGATATAT
27D1 AGTCTAACAG TCTCAAAAAT AAGCTTTATT GGACATTCGT TGGGCAATTT
2751 AATAATTCGT TCAGTGCTTA CAAGGCCAAG GTTTAAATAT TACCTCAACA
2601 AACTTCATAC CTTTCTGTCT CTTTCTGGAC CTCACCTTGG TACACTCTAC
2651 AACAGCAGTG CTCTTGTTAA TACAGGTCTC TGGTTTATGC AGAAATGGAA 2101 AAAATCAGGT TCGCTTTTGC AGCTGACATG TCGAGATCAC TCAGACCCTC
2151 GCCAAACTTT TTTATATAAG CTTAGTAACA AAGCAGGGCT TCATTATTTC
3001 AAAAATGTTG TGCTAGTGGG ATCCCTACAG GATCGCTATG TTCCTTATCA
3051 CTCTGCCCGC ATTGAAATGT GTAAAACAGC TTTAAAGGAC AAACAGTCAG
31D1 GACAGATCTA TTCAGAAATG ATCCACAACT TGCTTCGACC CGTTCTGCAA 3151 AGCAAGGACT GTAATTTGGT TCGCTATAAT GTCATCAATG CATTGCCCAA
3201 TACAGCTGAT TCACTCATTG GGAGAGCTGC ACATATAGCT GTTCTTGATT
3251 CGGAAATATT TTTAGAGAAA TTCTTTCTGG TTGCTGCCCT CAAATATTTC
3301 CAATAGTATA AAAGCATTGT TAGCGACTGG ACAATTACCT CATTCAACAA
3351 TGTTTCAAAT AATGTATTAT ATTAAAATGT AGATGCTGAT AAGTTCTAAG 34D1 AAATATTTAT ACCTTTTTAT ATGGAAGATA ATTTATATCA TCCATGTTTA
3451 GTGCTTTTTA AACATCAACT TTACTTTCTA GGTAATGTGG CTGTGCAATA
3501 TTTTTTTAAT TTTATCTTTT TACTTTTCTA TTACTTTTTC ATATATTTTG
3551 CTACCTAAGT ATTTCAGTGA AACTTTAAGC CCATACCTGT GTCTGATTGT
3b01 TTATTATTGG CTTTCCACAA TTCTTACATC AGACTACATT ATATTAGAGA 3b51 CCATTATTGC TAGAATAGCA TGGGATTTAA AATTTCTAAT ACTG6GGGTA
37D1 TTATTTAGTT AATTATAAAT TTTTCTTTTC ACATTTTACT GTGTTTTAAC
3751 TGGAAATAAA ATTATGGCTG CTACAATATA TTTTTTGAAA TCAACTTCTG
3601 TAGTTCTAAA ATACAACTTT ATCATACAAT CAAACCAGGT AGTTCATATA
3651 AAACAGTGTA ATACAAGTTT TCTATAAAGT CATTACTGTT GCTTAAACAT 3101 ATTTCATGCC TATTAAAATA TATTTTCTAC TGGTGATTTC AACATTATTT
3151 CTCATACTGA CTTTTATTAC TGGAAATGTT CCTGTACATG TTGGCAGCAG
4001 ATAAAGATTT TTGAATGTTT GAATGCCCTC TGCCTTGATT TGGTTGGATT
4051 TTGCTAATTG GTATGTTGCT TGAACTTTAT GACTACATTT TCTTTTAACT
4101 TTTTTCATGG ACTTCCTTAT ATGTACATAA TAATTAAATG TTGAAATTTA 4151 TGAAATACTT TTATGAATTT AGATAATTTT TAAATATTGT TAAAATTTAT
4201 TGAACTAAAA AGTAATGTAA ATAAAATAAT TCATGTTAAA GATGGAACAA
4251 AATAATTAAC TTTACATGTT TGGTGATACA GATGCAAATG TTTTTGATAT
4301 ATGGAGATGT TGAGTCTTTT GACTTTACTA AAGGTGCTGA ATAGCATTAA 4351 ATTCACTATT TTCCTTTTCT GTTTTACTTG TGAAAATAAA AATGCACTAA
4401 GGTTGGGTAG AAGTTCTGTT TGCACTCACT AATTGTGACA GACAGAGGTT
4451 TTTGTAAGTA TTTATTGTAC AATTGATGCA TGTT-TATTTT TAGCGTTGTT
4501 ATTGCCTCTG GTGTTAATAA ATGAACAAAT GGCTATCTGG AGGAACAGCT
4551 AAAAAAAAAA AA
BLAST Results
Entry HSDJ116I1 from database EMBLNEU:
Human DNA sequence *** SEQUENCING IN PROGRESS *** from clone
DJllδll
Score = 7221, P = 0.0e+00, identities = 1455/1461
Medline entries
No Medline entry
Peptide information for frame 1
ORF from 34 bp to 3303 bp; peptide length: 1090
Category: similarity to unknown protein
Classification: no clue
1 MGIQNLQRSE SSKMDKYETE ESSVAGLSSP ELKVRPAGAS SIWYTEGEKQ
51 LTKSLKGKNE ESNKSKVKVT KLMKTMKSEN TKKLIKQNSK DSVVLVGYKC
101 LKSTASNDLI KCFEGNPSHS QKEGLDPTIC GYNFDPKTYM RQTSQKEASC
151 LPTNTERTEQ KSPDIENVQP DQFDPLNSGN LNLCANLSIS GKLDISQDDS
201 EITQMEHNLA SRRSSDDCHD HQTTPSLGVR TIEIKPSNKD PFSGENITVK
251 LGPWTELRQE EILVDNLLPN FESLESNGKS KSIEITFEKE ALQEAKCLSI
301 GESLTKLRSN LPAPSTKEYH VVVSGDTIKL PDISATYASS RFSDSGVESE
351 PSSFATHPNT DLVFETVQGQ GPCNSERLFP QLLMKPDYNV KFSLGNHCTE
401 STSAISEIQS SLTSINSLPS DDELSPDENS KKSVVPECHL NDSKTVLNLG
451 TTDLPKCDDT KKSSITLQQQ SVVFSGNLDN ETVAIHSLNS SIKDPLQFVF
501 SDEETSSDVK SSCSSKPNLD TMCKGFQSPD KSNNSTGTAI TLNSKLICLG
551 TPCVISGSIS SNTDVSEDRT MKKNSDVLNL TQMYSEIPTV ESETHLGTSD
601 PFSASTDIVK QGLVENYFGS QSSTDISDTC AVSYSNALSP QKETSEKEIS
651 NLQQEQDKED EEEEQDQQMV QNGYYEETDY SALDGTINAH YTSRDELMEE
701 RLTKSEKINS DYLRDGINMP TVCTSGCLSF PSAPRESPCN VKYSSKSKFD
751 AITKQPSSTS YNFTSSISWY ESSPKPQIQA FLQAKEELKL LKLPGFMYSE
801 VPLLASSVPY FSVEEEGGSE DGVHLIVCVH GLDGNSADLR LVKTYIELGL
851 PGGRIDFLMS ERNQNDTFAD FDSMTDRLLD EIIQYIQIYS LTVSKISFIG
901 HSLGNLIIRS VLTRPRFKYY LNKLHTFLSL SGPHLGTLYN SSALVNTGLW
951 FMQKWKKSGS LLQLTCRDHS DPRQTFLYKL SNKAGLHYFK NVVLVGSLQD
1001 RYVPYHSARI EMCKTALKDK QSGQIYSEMI HNLLRPVLQS KDCNLVRYNV
1051 INALPNTADS LIGRAAHIAV LDSEIFLEKF FLVAALKYFQ
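The frame 1 peptide can be cross-checked against the nucleotide listing by translating from the reported ORF start. The sketch below is illustrative only: it assumes the standard genetic code and a 1-based ORF start, and the function name `translate` is hypothetical. It uses the first 70 bases of the insert as printed in the listing.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order (assumed applicable here).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def translate(dna: str, start: int) -> str:
    """Translate dna from a 1-based start position, ignoring a trailing partial codon."""
    orf = dna[start - 1:]
    usable = len(orf) - len(orf) % 3
    return "".join(CODON_TABLE[orf[i:i + 3]] for i in range(0, usable, 3))


# First 70 bases of the insert (rows 1 and 51 of the listing).
head = (
    "GATTCAGTTA" "CTGAAGACTT" "AGATGCACCC" "TGGATGGGAA" "TTCAGAATCT"
    "TCAGAGATCA" "GAGTCCAGTA"
)

# ORF start reported as 34 bp; the result should begin like the peptide listing.
print(translate(head, 34))
```

Translating the full insert the same way, and stopping at the first stop codon, would be the natural extension of this check.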
BLASTP hits
No BLASTP hits available
Alert BLASTP hits for DKFZphtes3_26g3, frame 1
No Alert BLASTP hits found
Pedant information for DKFZphtes3_26g3, frame 1
Report for DKFZphtes3_26g3.1
[LENGTH] 1101
[MW] 122245.22
[pI] 5.12
[HOMOL] TREMBL:CEAF2116_1 gene: "C01D4.4"; Caenorhabditis elegans cosmid C01D4. 2e-38
[FUNCAT] 99 unclassified proteins [S. cerevisiae, YOR051c] 2e-06
[BLOCKS] BL00120B
[KW] Alpha_Beta
[KW] LOW_COMPLEXITY 6.72 %
SElQ DSVTEDLDAPUMGIlQNLiQRSESSKMDKYETEESSVAGLSSPELKVRPAGASSIUYTEGEK
SEG
PRD ccccccccccceeeeechhhhhhhhhhccccccccccccccceeeeccccceeeccccch
SElQ iQLTKSLKGKNEESNKSKVKVTKLMKTMKSENTKKLIKiQNSKDSVVLVGYKCLKSTASNDL SEG xxxxxxxxxxxxxxx
PRD hhhhhhccccccccceeeehhhhhhhhhcccccceeecccccceeeeeeeeccccccccc
SElQ IKCFEGNPSHSiQKEGLDPTICGYNFDPKTYMRlQTSiQKEASCLPTNTERTEiQKSPDIENViQ
SEG PRD eeeecccccccccccccccccccccccccccccccccccccccccccccccccccccccc
SElQ PDIQFDPLNSGNLNLCANLSISGKLDISIQDDSEITIQMEHNLASRRSSDDCHDHIQTTPSLGV
SEG
PRD ccccccccccceeecccccccccccccccccccchhhhhhhcccccccccccccccccee
SElQ RTIEIKPSNKDPFSGENITVKLGPUTELRiQEEILVDNLLPNFESLESNGKSKSIEITFEK
SEG
PRD eeeeecccccccccccceeeccccchhhhhhhhhhhccccccccccccccceeeehhhhh SElQ EALlQEAKCLSIGESLTKLRSNLPAPSTKEYHVVVSGDTIKLPDISATYASSRFSDSGVES
SEG
PRD hhhhhhhhhhhhhhhhhhhhccccccccceeeeecccccccccccccccccccccccccc
SElQ EPSSFATHPNTDLVFETVlQGlQGPCNSERLFPiQLLMKPDYNVKFSLGNHCTESTSAISEIlQ SEG
PRD ccccccccccceeeeeeeccccccccccccccccccccceeeeecccccccccchhhhhh
SElQ SSLTSINSLPSDDELSPDENSKKSVVPECHLNDSKTVLNLGTTDLPKCDDTKKSSITLiQiQ
SEG PRD cccccccccccccccccccccccccccccccccccceeecccccccccccccccceeecc
SElQ (QSVVFSGNLDNETVAIHSLNSSIKDPLiQFVFSDEETSSDVKSSCSSKPNLDTMCKGFlQSP
SEG xxxxxxxxxx PRD eeeeeecccccceeeeeeeccccccceeeeeccccccceeeccccccccccccccccccc
SElQ DKSNNSTGTAITLNSKLICLGTPCVISGSISSNTDVSEDRTMKKNSDVLNLTώMYSEIPT
SEG
PRD cccccccccccccceeeeeeeccceeeeecccccccccccccccccchhhhhhheeeeec
SElQ VESETHLGTSDPFSASTDIVKώGLVENYFGSiQSSTDISDTCAVSYSNALSPlQKETSEKEI
SEG
PRD cccccccccccccccceeeeeeeeeeeecccccccccceeeeeecccccccccccccccc
SE(2 SNLlQlQEiQDKEDEEEElQDiQlQMViQNGYYEETDYSALDGTINAHYTSRDELMEERLTKSEKIN
SEG ■ ■ - xxxxxxxxxxxxxxxx
PRD cchhhhhcccchhhhhhhhhhcccccccccccccccceeeeccchhhhhhhhhhhhhccc SElQ SDYLRDGINMPTVCTSGCLSFPSAPRESPCNVKYSSKSKFDAITKiQPSSTSYNFTSSISU
SEG xxxxxxxxxxxx.
PRD ccccccccccccccccceeecccccccccceeeecccccceeeeeccccccceeecceee
SElQ YESSPKPlQIiQAFLiQAKEELKLLKLPGFMYSEVPLLASSVPYFSVEEEGGSEDGVHLIVCV SEG xxxxxxxxx xxxxxxxxxxxx
PRD ccccccccchhhhhhhhhhhhhccccceeeeeeeeeecccceeeeeccccccceeeeeee
SElQ HGLDGNSADLRLVKTYIELGLPGGRIDFLMSERNiQNDTFADFDSMTDRLLDEIIlQYIiQIY
SEG PRD eccccccchhhhhhhhhhhccccccchhhhhccccccccccccchhhhhhhhhhhhhhhh
SElQ SLTVSKISFIGHSLGNLIIRSVLTRPRFKYYLNKLHTFLSLSGPHLGTLYNSSALVNTGL
SEG
PRD hccccccccccccceeeeeeeeccccchhhhhhhhhhccccccccceeeeccccccccch
SElQ UFMiQKUKKSGSLLlQLTCRDHSDPRlQTFLYKLSNKAGLHYFKNVVLVGSLiQDRYVPYHSAR
SEG
PRD hhhhhhhhhheeeeeecccccccceeeeeeccccceeeeeeeeeeeccccccceeehhhh SElQ IEMCKTALKDKlQSGiQIYSEMIHNLLRPVLlQSKDCNLVRYNVINALPNTADSLIGRAAHIA
SEG
PRD hhhhhhhccccccchhhhhhhhhhhccccccccceeeeeeecccccccccchhhhhhhhh
SElQ VLDSEIFLEKFFLVAALKYFiQ SEG
PRD hhhhhhhhhhhhhhhhhhccc
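The PRD rows give a per-residue secondary structure prediction aligned to the SEQ rows. As a rough illustration, the composition of such a string can be summarised as below. This sketch assumes that `h`, `e` and `c` stand for helix, extended strand and coil; that reading is an assumption and is not stated in the listing. The function name is hypothetical.

```python
from collections import Counter


def structure_fractions(prd: str) -> dict:
    """Fraction of residues in each assumed state: h (helix), e (strand), c (coil)."""
    counts = Counter(prd)
    total = sum(counts[state] for state in "hec")
    if total == 0:
        raise ValueError("no h/e/c symbols found in PRD string")
    return {state: counts[state] / total for state in "hec"}


# Final PRD row of the report above, copied from the listing.
last_row = "hhhhhhhhhhhhhhhhhhccc"
print(structure_fractions(last_row))
```

Concatenating all PRD rows of a report before calling the function would give the composition for the whole protein, which can then be compared with the keyword assigned in the Pedant report.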
(No Prosite data available for DKFZphtes3_26g3.1)
(No Pfam data available for DKFZphtes3_26g3.1)
DKFZphtes3_21f24
group: signal transduction
DKFZphtes3_21f24 encodes a novel 526 amino acid protein with similarity to murine net1a. The closely related mNET1 activates signalling pathways in addition to those directly controlled by activated RhoA. The novel protein is expressed ubiquitously.
The new protein can find application in modulating/blocking signalling pathways.
similarity to net1a (Mus musculus); perhaps complete cds.
Sequenced by BMFZ
Locus: /map="72.40 cR from top of Chr3 linkage group"
Insert length: 3559 bp
Poly A stretch at pos. 3534, polyadenylation signal at pos. 3513
1 CGCCGCCGCC CGGCATCGTG GAGCTGGGGC CCCCTTTTGC CTGGGAGTTT
51 TGTAGTCGCC TAGGGTCAGC GGTGACATCC CAAAGGGCAG GCCCGGCAGC
101 CGCCATGGTG GCCAAGGATT ACCCCTTCTA CCTCACGGTC AAGAGAGCGA
151 ACTGCAGCCT GGAGCTACCC CCGGCCAGCG GTCCGGCCAA GGACGCTGAG
201 GAGCCTAGTA ATAAACGGGT CAAACCCCTT TCCCGAGTCA CGTCGCTAGC 251 AAACCTCATC CCGCCCGTGA AGGCCACGCC ATTAAAGCGC TTCAGTCAAA
301 CCCTGCAGCG CTCCATTAGC TTCCGCAGTG AGAGCCGCCC TGACATCCTC
351 GCCCCCCGAC CCTGGTCCAG AAATGCCGCC CCCTCGAGCA CGAAACGGAG
4D1 AGATAGCAAG CTGTGGAGTG AGACCTTCGA TGTGTGCGTC AATCAGATGC
451 TTACATCCAA GGAAATCAAA CGTCAGGAGG CGATCTTTGA GCTTTCCCAA 501 GGAGAAGAAG ACTTGATAGA AGACTTGAAA TTAGCAAAAA AGGCCTATCA
551 TGACCCCATG CTGAAACTCT CCATAATGAC AGAACAAGAG TTGAATCAAA bOl TTTTTGGAAC ACTGGACTCT CTAATTCCTC TACATGAAGA GCTCCTTAGT bSl CAGCTTCGAG ATGTTAGGAA GCCTGATGGC TCGACTGAAC ATGTTGGTCC
701 CATCCTCGTG GGCTGGCTCC CTTGCCTCAG CTCCTATGAT AGCTACTGCA 751 GCAATCAAGT AGCCGCCAAA GCTCTGCTGG ACCACAAAAA GCAAGATCAC
601 CGAGTCCAGG ATTTCCTACA GCGATGTTTA GAATCCCCCT TTAGCCGCAA
651 ACTAGATCTC TGGAATTTCC TCGATATTCC AAGAAGCCGC CTGGTAAAAT
101 ACCCTCTGCT TCTCCGAGAA ATCTTGAGGC ACACACCAAA TGATAATCCA
151 GATCAGCAGC ACTTGGAAGA AGCTATAAAT ATCATTCAGG GAATTGTGGC 10D1 AGAAATCAAC ACCAAGACTG GTGAATCTGA ATGCCGCTAT TATAAAGAGC
1051 GGCTTCTTTA CTTGGAAGAA GGCCAGAAAG ACTCCCTGAT CGACAGCTCT
1101 CGAGTCTTGT GTTGTCATGG TGAACTGAAG AACAATCGGG GCGTGAAACT
1151 GCATGTTTTC CTGTTCCAAG AAGTGCTTGT GATCACTCGA GCCGTCACCC
12D1 ACAATGAGCA GCTTTGCTAC CAGCTGTACC GTCAGCCAAT CCCCGTGAAA 1251 GACCTCCTGC TGGAAGACCT CCAGGATGGA GAAGTGAGGC TGGGTGGCTC
1301 CCTGCGAGGG GCATTCAGCA ACAATGAGAG AATTAAAAAC TTCTTCAGAG
1351 TCAGTTTCAA AAATGGATCC CAAAGTCAGA CCCACTCGCT ACAAGCCAAT
14D1 GACACTTTCA ACAAACAGCA GTGGCTTAAC TGTATTCGTC AAGCCAAAGA 1451 AACAGTTTTG TGTGCTGCCG GGCAAGCTGG GGTGCTTGAC TCCGAGGGAT
1501 CGTTCCTAAA TCCCACCACC GGGkGCkGkG AGCTACAGGG AGAAACAAAA
1551 CTTGAGCAGA TGGACCAATC GGACAGTGAG TCAGACTGTA GTATGGACAC
IbOl GAGTGAGGTC AGCCTCGACT GTGAGCGCAT GGAACAGACA GACTCTTCCT lb51 GTGGAAACAG CAGGCACGGT GAAAGTAACG TCTGACAGAA GCATGTGCAC
17D1 TTCGGGAAGC AGGCCTGCAT CTTACCTGTA CAGTATTTGC ATTCCACAGA
1751 TGGAACGGTT TGGAGAAGCA CTTTTTCATA CTTTTGTGAA AGTATACATG
1601 TTGGCCCAGT CTCTCGTATC TGTACCTTTG TCCCTAGTAC TGTAACTGCC
1651 AATCTGTCTG TGTAAGCTGG AATCTGTGGC AACTATTACC CTGTGTTGTA
1101 TTTCCCAAGT GTCTGGATGG ATGGAGAGGT ACTCAAACAA GTTACTTTCA
1151 GTTGTCCTGC TGGATTTTAA AAAAATAGAA AAAGAATCTC AAAACTACTG
20D1 TTTTACATAG ATTGTTTGAA GAGTCCTTCC TCTTGTGCTT CTGTACCACT
2051 TTCCCAGCTC TTAGATGTGG TAGCTAAAGG CACGGAATTT AGACGGCCTT
2101 GTAAATAGGG CATGAGGAAC TCATCTGTGT ATTGGGATGG TATTAGAGAG
2151 AGAATCAGGA AAGACCAACT CATGAAGTGA ACTTGGTTTG ATCTTACTCA
2201 ACTAGAAAGC TTGAAAACAT CCCTGGGGAT TCTGAAGGCT TAATTTTGCA
2251 AAGGAGGATG CATTGTCTGA ACTTTGCAAC TTCATCCAGT GCAAGTTTGA
23D1 TGCAAGAATG TATTAGGACA TAAAATAGAG GCTGACCTTA AAAGGGCCAG
2351 GACAGAAGCG GCTGCCAGCT CTGAATCTTT AACTGAAATG CACATGGCAC
2401 CAGGAGGTGT CTCTCATAGT TGGTTGCTAG CCTAAAACAT CAGAATAGAA
2451 CCCAAAGGGC TTAGGAAGGC CTGCCAGGAT AACAAGAAGG CCCTGTATTC
25D1 ATTGTGTTTC ATCTGCCTAG GCCTACTCAT TATTTTAGAG AATGAATGAA
2551 GCAACAAGGA AGAGAGACCA TGACTCTATC GATGACACTG TTTATAGAAA
2b01 CACAGGAGAG GAAGAATTTG GAATGAAAAG CACTTCGTCA GAACCTTCTG
2b51 TGGGAGCCAT TGAGAGAAAA GCATGGTCCA GTGCCTTCTG AGAAAGGCCA
27D1 GAGCTTTGGG CTTTCCTGCT CTGCTTTTGG GTCGTCAATT GCCATCTCT
2751 GGTTCTGTGC TATAATCAGA ATTGTAATTA TGTTCTCCAG AGGCCAATTT
26D1 CATTAACTCT GATTAATTAG AATCAGCTAG CCAGATTAGT AACCTCTTTG
2651 TCCAGCCTTG ATTTACAGTG CAGGGTAAAG TGCAGACCTT AAAAACAGCT
21D1 AAGTACCTAG AAGAGCTCCC TGCAAGTGTA AATATTAAGG ATGACCTGTG
2151 CAAAATTATA CCCACACCAG CACTAGTGGT AATTATTCTA AATTATTGCC
30D1 AAAAAGTTTT TTTTAATCTG TCTTTCAAGT TTACAGAAAA GAAAGCAGTA
3051 AATGCATTGA TGTCATTTTA TTATGTACAT ATATCATGTG CATTCAAGCT
3101 GTGTGACAAG ATATATCAAT ATAAAAACAA GGTATATACT TTATTATTTT
3151 TTGAAAACAA GGATATTGTG ATCAATTTTA CCCTGTAAAA CATATTTCTG
3201 TATTTATAGG TCTTAAACAT GATGAATTTT TTCTATTACA AGTTTATTTA
3251 AAACTGCTTT CTCAAGTCGT TATTGATACA GCAAGTGAAC CTGCTGCAGA
3301 CAGAAGCAGA GGAAAGCCAA GAACAGCCTT TATTGGTGAA GAAAAGAATG
3351 AATGATTCTT TGTAGGCGCC ATCAGCCACT TTTAGAAGCC ATCAGCCAGT
3401 GTGTTGGGAA AAGAGGTTTG TCAAGTGTTG GCCTATGGGA AGGTGGTCAA
3451 TGAATGTTTT GATGAAATGA ATGTTTTTGT ATAATGGCCT TAAACTTTTC
35D1 TGGAAGTATT TCAAATAAAT TACATTATTA AGTCAAAAAA AAAAAAAAAA
3551 AAAAAAAAA
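The reported poly A and polyadenylation signal positions can be checked against the end of the nucleotide listing. The sketch below rests on two assumptions inferred from the numbers rather than stated in the text: the positions are zero-based offsets, and the signal is the canonical AATAAA hexamer. Variable and function names are hypothetical.

```python
# Last two rows of the listing; the first base shown is base 3501 (1-based).
ROW_START = 3501
tail = (
    "TGGAAGTATT" "TCAAATAAAT" "TACATTATTA" "AGTCAAAAAA" "AAAAAAAAAA"
    "AAAAAAAAA"
)


def zero_based(offset_in_tail: int) -> int:
    """Convert an offset within the excerpt to a zero-based position in the insert."""
    return ROW_START - 1 + offset_in_tail


# First occurrence of the assumed canonical signal in the excerpt.
signal_pos = zero_based(tail.find("AATAAA"))

# Start of the trailing run of A residues.
poly_a_pos = zero_based(len(tail.rstrip("A")))

print(signal_pos, poly_a_pos)
```

Under these assumptions the two values agree with the positions given in the record header. The ORF coordinates in the same records appear to be 1-based instead, so the two conventions should not be mixed.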
BLAST Results
No BLAST result
Medline entries
1633bl1fa:
Alberts AS, Treisman R.; Activation of RhoA and SAPK/JNK signalling pathways by the RhoA-specific exchange factor mNET1. EMBO J 1998 Jul 15;17(14):4075-85
Peptide information for frame 3
ORF from 105 bp to 1682 bp; peptide length: 526
Category: strong similarity to known protein
Classification: Cell signaling/communication
1 MVAKDYPFYL TVKRANCSLE LPPASGPAKD AEEPSNKRVK PLSRVTSLAN
51 LIPPVKATPL KRFSQTLQRS ISFRSESRPD ILAPRPWSRN AAPSSTKRRD
101 SKLWSETFDV CVNQMLTSKE IKRQEAIFEL SQGEEDLIED LKLAKKAYHD
151 PMLKLSIMTE QELNQIFGTL DSLIPLHEEL LSQLRDVRKP DGSTEHVGPI
201 LVGWLPCLSS YDSYCSNQVA AKALLDHKKQ DHRVQDFLQR CLESPFSRKL
251 DLWNFLDIPR SRLVKYPLLL REILRHTPND NPDQQHLEEA INIIQGIVAE
301 INTKTGESEC RYYKERLLYL EEGQKDSLID SSRVLCCHGE LKNNRGVKLH
351 VFLFQEVLVI TRAVTHNEQL CYQLYRQPIP VKDLLLEDLQ DGEVRLGGSL
401 RGAFSNNERI KNFFRVSFKN GSQSQTHSLQ ANDTFNKQQW LNCIRQAKET
451 VLCAAGQAGV LDSEGSFLNP TTGSRELQGE TKLEQMDQSD SESDCSMDTS
501 EVSLDCERME QTDSSCGNSR HGESNV
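The ORF coordinates and the peptide length in each record are linked by simple arithmetic, which is useful for catching digits damaged in transcription. A minimal sketch follows, assuming the coordinates are 1-based and inclusive and that the stop codon is not counted (an assumption inferred from the figures in the listing). The function name is hypothetical.

```python
def peptide_length(orf_start: int, orf_end: int) -> int:
    """Residues encoded by an ORF with 1-based inclusive coordinates, stop codon excluded."""
    span = orf_end - orf_start + 1
    if span <= 0 or span % 3 != 0:
        raise ValueError("ORF span must be a positive multiple of 3")
    return span // 3


# Frame 3 ORF of this record: 105 bp to 1682 bp.
print(peptide_length(105, 1682))
```

For this record the result matches the number of residues in the peptide listing, whose last row starts at 501 and holds 26 residues.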
BLASTP hits
No BLASTP hits available
Alert BLASTP hits for DKFZphtes3_21f24, frame 3
No Alert BLASTP hits found
Pedant information for DKFZphtes3_21f24, frame 3
Report for DKFZphtes3_21f24.3
[LENGTH] 560
[MW] 63202.65
[pI] 6.04
[HOMOL] TREMBL:AF014520_1 gene: "Net1"; product: "NET1 homolog"; Mus musculus NET1 homolog (Net1) mRNA, complete cds. 1e-162
[FUNCAT] 09.01 biogenesis of cell wall [S. cerevisiae, YLR371w] 3e-16
[FUNCAT] 03.07 pheromone response, mating-type determination, sex-specific proteins [S. cerevisiae, YLR371w] 3e-16
[FUNCAT] 10.02.01 regulation of g-protein activity [S. cerevisiae, YLR371w] 3e-16
[FUNCAT] 09.04 biogenesis of cytoskeleton [S. cerevisiae, YLR371w] 3e-16
[FUNCAT] 03.04 budding, cell polarity and filament formation [S. cerevisiae, YLR371w] 3e-16
[FUNCAT] 01.05.04 regulation of carbohydrate utilization [S. cerevisiae, YLR371w] 3e-16
[FUNCAT] 30.03 organization of cytoplasm [S. cerevisiae, YAL041w] 3e-11
[FUNCAT] 03.22 cell cycle control and mitosis [S. cerevisiae, YAL041w] 3e-11
[FUNCAT] 10.05.01 regulation of g-protein activity [S. cerevisiae, YAL041w] 3e-11
[BLOCKS] PR00510E
[BLOCKS] PR00041E
[BLOCKS] BL00741B
[PIRKW] breakpoint cluster region 1e-06
[PIRKW] transmembrane protein 5e-13
[PIRKW] brain 3e-06
[PIRKW] signal transduction 5e-13
[PIRKW] alternative splicing 1e-06
[SUPFAM] CDC24 homology 1e-15
[SUPFAM] SH2 homology 1e-11
[SUPFAM] CDC25-type guanine nucleotide exchange activator homology 2e-08
[SUPFAM] dbl transforming protein 1e-06
[SUPFAM] protein kinase C zinc-binding repeat homology 1e-11
[SUPFAM] SH3 homology 1e-11
[SUPFAM] bcr protein 1e-06
[SUPFAM] pleckstrin repeat homology 2e-11
[SUPFAM] vav transforming protein 1e-11
[KW] All_Alpha
SElQ PPPGIVELGPPFAUEFCSRLGSAVTSlQRAGPAAAMVAKDYPFYLTVKRANCSLELPPASG
PRD cccceeeeccccccchhhhhhhhhhhhhcccccccccccccceeeecccccccccccccc
SElQ PAKDAEEPSNKRVKPLSRVTSLANLIPPVKATPLKRFSiQTLiQRSISFRSESRPDILAPRP
PRD cccccccccccccccccccccccccccccccccccchhhhhhcccccccccccccccccc
SElQ USRNAAPSSTKRRDSKLUSETFDVCVNlQMLTSKEIKRiQEAIFELSiQGEEDLIEDLKLAKK
PRD cccccccccchhhhhhhhhhhcccccchhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhh
SElQ AYHDPMLKLSIMTEiQELNiQIFGTLDSLIPLHEELLSlQLRDVRKPDGSTEHVGPILVGULP
PRD hhhchhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhccccccccccceeeeeccc
SElQ CLSSYDSYCSNiQVAAKALLDHKKlQDHRVlQDFLiQRCLESPFSRKLDLUNFLDIPRSRLVKY
PRD cccceeecccchhhhhhhhhhhhcchhhhhhhhhhhcccccccccccceeeccccccchh
SElQ PLLLREILRHTPNDNPDiQlQHLEEAINIIiQGIVAEINTKTGESECRYYKERLLYLEEGiQKD
PRD hhhhhhhhhhccccccchhhhhhhhhhhhhhhhhhhhcccchhhhhhhhhhhhhhhcccc SElQ SLIDSSRVLCCHGELKNNRGVKLHVFLFIQEVLVITRAVTHNEIQLCYIQLYRIQPIPVKDLLL
PRD hhhhhhheeecccccccccccceeeeehhhhhhhhhhhchhhhhhhhhhhhccccccccc
SElQ EDLιQDGEVRLGGSLRGAFSNNERIKNFFRVSFKNGS<QS(QTHSL(QANDTFNKιQ(QULNCIR(Q
PRD ccccccccccccccchhhhhhhhhhhhheeeeccccchhhhhhhhcccchhhhhhhhhhh
SElQ AKETVLCAAGIQAGVLDSEGSFLNPTTGSRELIQGETKLEIQMDIQSDSESDCSMDTSEVSLDC
PRD hhhhhhhhhccceeeeccccccccccccchhhhhhhhhhhhhhccccccccccccccccc
SElQ ERMEiQTDSSCGNSRHGESNV
PRD cccccccccccccccccccc
(No Prosite data available for DKFZphtes3_21f24.3)
(No Pfam data available for DKFZphtes3_21f24.3)
DKFZphtes3_30p6
group: testis derived
DKFZphtes3_30p6 encodes a novel 461 amino acid protein without similarity to known proteins. No informative BLAST results; no predictive Prosite, Pfam or SCOP motif.
The new protein can find application in studying the expression profile of testis-specific genes.
similarity to C. elegans F41H10.4; perhaps complete cds.
Sequenced by LMU
Locus: unknown
Insert length: 1944 bp
Poly A stretch at pos- lllli no polyadenylation signal found
1 GGAACAGACC ACTGGGCTGG CAGCTGAGTT GCAGCAGCAG CAGGCTGAGT 51 ACGAGGACCT TATGGGACAG AAAGATGACC TCAACTCCCA GCTCCAGGAG
101 TCATTACGGG CCAATAGTCG ACTGCTGGAA CAACTTCAAG AAATAGGGCA
151 GGAGAAGGAG CAGTTGACCC AGGAATTACA GGAGGCTCGG AAGAGTGCGG
2D1 AGAAGCGGAA GGCCATGCTG GATGAGCTAG CAATGGAAAC GCTGCAAGAG
251 AAGTCCCAGC ACAAGGAAGA GCTGGGAGCA GTTCGTCTAC GGCATGAGAA 301 GGAGGTGCTG GGGGTGCGTG CCCGCTATGA GCGTGAGCTC CGAGAGCTGC
351 ATGAAGACAA GAAGCGTCAG GAGGAGGAGC TCCGTGGGCA GATCCGGGAG
4D1 GAGAAGGCCC GGACACGGGA GCTGGAGACT CTCCAGCAGA CAGTGGAAGA
451 ACTTCAAGCT CAGGTACATT CCATGGATGG AGCCAAGGGC TGGTTTGAAC
501 GGCGCTTGAA GGAAGCCGAG GAATCCCTGC AGCAGCAGCA GCAGGAACAA 551 GAGGAAGCCC TCAAGCAGTG TCGGGAGCAG CACGCTGCCG AGCTGAAGGG bOl CAAGGAGGAG GAGCTACAGG ATGTACGGGA TCAGCTCGAG CAGGCCCAGG fa51 AGGAGCGGGA CTGCCACCTG AAGACCATTA GCAGCCTGAA GCAGGAGGTG
701 AAGGACACAG TGGATGGGCA GAGGATCCTG GAGAAGAAGG GCAGTGCTGC
751 GCTCAAGGAC CTCAAGCGGC AGCTGCATTT GGAGCGGAAA CGGGCAGATA 601 AGCTGCAGGA GCGACTGCAG GACATCCTCA CTAACAGCAA GAGCCGCTCA
651 GGCCTTGAGG AGCTGGTTCT CTCAGAGATG AACTCACCAA GCCGGACCCA
101 GACAGGGGAC AGCAGTAGCA TCTCCTCCTT CAGCTACCGG GAGATCTTGC
151 GGGkkkkGGk GAGCTCGGCT GTTCCAGCCA GGTCCTTATC CAGCAGCCCT
1001 CAAGCCCAGC CCCCTCGGCC AGCAGAGCTG TCAGATGAGG AAGTGGCTGA 1051 GCTCTTTCAG CGGCTGGCAG AGACACAGCA GGAGAAATGG ATGCTGGAGG
1101 AGAAGGTGAA GCACCTGGAA GTGAGCAGTG CTTCCATGGC AGAGGACCTC
1151 TGCCGGAAGA GCGCCATCAT TGAGACCTAC GTCATGGACA GCCGGATCGA
1201 TGTGTCTGTG GCAGCAGGCC ACACAGACCG CAGCGGGCTG GGCAGCGTCC
1251 TGAGAGACCT AGTGAAGCCA GGCGACGAGA ACCTTCGGGA GATGAACAAG 1301 AAGCTGCAGA ACATGCTGGA GGAGCAGCTC ACCAAGAATA TGCACTTGCA
1351 CAAGGATATG GAAGTTCTGT CCCAGGAAAT TGTGCGGCTC AGCAAGGAGT
14D1 GCGTGGGGCC TCCTGACCCA GACCTAGAGC CAGGAGAAAC CAGCTAAAGA
1451 CCTGCAGGCT GCACCCACCT CCTCCCCTTC CTACCCCCTA GGATGCTATT 1501 CCCTTGGGCT GTGGTGGAAA AATGAGGGCT GGAGCCAAAA TCAAATAGCT
1551 TGGGAGACTG GACATTAAAG GGGCTAGAGG CCTGATGGTT AGTGTTAATG
IfaOl ATCCTGTCTT AGGGCAGAGG CCACCAGGGA GTGGGGATCC TGAGGGAAGG
IfaSl GGCAGGGATT TCTCCTTCTT CTTGGTCCTG GCTCCCAAGG GCTTCTGTCT
1701 TCATCTCTGC ATGAGCTCTC CTTCCCAGAG ACCAACTCTT TTTTATTTTA
1751 TTTTATTTTT TAATTTATGT CTGGAGCCTG GCTACTCTGC ATTTGGGATT
1601 GGGGATGCTG GGTGGGTGTG TGGTCCATGT TCAGCGTTCT AGCAACACGT
1651 GTGTGTGTGT GTGTGTAAAG GCTATGCAGC CAAAATACCA TCTGGCCAGA
1101 CGGGCCCACC CACAAAAAAA AAAAAAAAAA AAAAAAAAAA AAAG
BLAST Results
No BLAST result
Medline entries
No Medline entry
Peptide information for frame 2
ORF from 62 bp to 1444 bp; peptide length: 461
Category: similarity to unknown protein
Classification: no clue
1 MGQKDDLNSQ LQESLRANSR LLEQLQEIGQ EKEQLTQELQ EARKSAEKRK
51 AMLDELAMET LQEKSQHKEE LGAVRLRHEK EVLGVRARYE RELRELHEDK
101 KRQEEELRGQ IREEKARTRE LETLQQTVEE LQAQVHSMDG AKGWFERRLK
151 EAEESLQQQQ QEQEEALKQC REQHAAELKG KEEELQDVRD QLEQAQEERD
201 CHLKTISSLK QEVKDTVDGQ RILEKKGSAA LKDLKRQLHL ERKRADKLQE
251 RLQDILTNSK SRSGLEELVL SEMNSPSRTQ TGDSSSISSF SYREILREKE
301 SSAVPARSLS SSPQAQPPRP AELSDEEVAE LFQRLAETQQ EKWMLEEKVK
351 HLEVSSASMA EDLCRKSAII ETYVMDSRID VSVAAGHTDR SGLGSVLRDL
401 VKPGDENLRE MNKKLQNMLE EQLTKNMHLH KDMEVLSQEI VRLSKECVGP
451 PDPDLEPGET S
BLASTP hits
No BLASTP hits available
Alert BLASTP hits for DKFZphtes3_30p6, frame 2
No Alert BLASTP hits found
Pedant information for DKFZphtes3_30p6, frame 2
Report for DKFZphtes3_30p6.2
[LENGTH] 481
[MW] 55316.10
[pI] 5.07
[HOMOL] TREMBL:CEF41H10_4 gene: "F41H10.4"; Caenorhabditis elegans cosmid F41H10. 2e-12
[FUNCAT] 30.03 organization of cytoplasm [S. cerevisiae, YDL058w] 5e-04
[FUNCAT] 08.07 vesicular transport (Golgi network, etc.) [S. cerevisiae, YDL058w] 5e-04
[BLOCKS] BL01100D NNMT/PNMT/TEMT family of methyltransferases proteins
[KW] All_Alpha
[KW] LOW_COMPLEXITY 11.13 %
[KW] COILED_COIL 40.16 %
SElQ E(QTTGLAAELιQιQ(Q(QAEYEDLMG(QKDDLNS(QL(QESLRANSRLLEιQL(QEIG(QEKEιQLT(QEL(Q SEG xxxxxxxxxxxxxxx xxxxxxxxxxx
PRD ccchhhhhhhhhhhhhhhhhhhhhcchhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhh COILS
- - -ccccccccccccccccccccccccccccccccccccccccccccccccccccccccc
SElQ EARKSAEKRKAMLDELAMETLiQEKSiQHKEELGAVRLRHEKEVLGVRARYERELRELHEDK SEG x PRD hhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhh COILS cccccccc
SElQ KROEEELRG(QIREEKARTRELETL(Q<QTVEEL(QA(QVHSMDGAKGUFERRLKEAEESL(Q(Q<Q(Q SEG xxxxxxxxxxxxxxxx xxxxxxxxxxx
PRD hhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhcccchhhhhhhhhhhhhhhhhh COILS CCCCCCCCCCCCCCCC SElQ IQ E IQ EEA LK IQ C RE IQ H A A E L KGKE EELIQDVR D IQ L E IQ A IQ E E RD CH L K T IS SLKIQ EVK D T V DG IQ
SEG xxxxxxxxx
PRD hhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhcccc
COILS
CCCCCCCCCCCCC CCCCCCCCCCCCCCCCCCCCCCCCCCCCCCC
SElQ RILEKKGSAALKDLKRiQLHLERKRADKLlQERLlQDiLTNSKSRSGLEELVLSEMNSPSRTiQ
SEG
PRD cccccccchhhhhhhhhhhhhhhhhhhhhhhhhhhhhccccchhhhhhhhhhhccccccc
COILS CCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCC
SElQ TGDSSSISSFSYREILREKESSAVPARSLSSSPlQAlQPPRPAELSDEEVAELFiQRLAETiQiQ SEG ---xxxxxxxx xxxxxxxxxxxxxxxxxxxxx
PRD cccccccchhhhhhhhhhhcccccccccccccccccccccchhhhhhhhhhhhhhhhhhh COILS
SElQ EKUMLEEKVKHLEVSSASMAEDLCRKSAIIETYVMDSRIDVSVAAGHTDRSGLGSVLRDL SEG PRD hhhhhhhhhhhhhhchhhhhhhhhhhhhhhcccccchhhhhhhccccccccccccccccc COILS SElQ VKPGDENLREMNKKLlQNMLEEiQLTKNMHLHKDMEVLSiQEIVRLSKECVGPPDPDLEPGET SEG
PRD ccccchhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhcccccccccccccccc COILS CCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCC
SElQ S SEG . PRD c COILS
(No Prosite data available for DKFZphtes3_30p6.2)
(No Pfam data available for DKFZphtes3_30p6.2)
DKFZphtes3_31a10
group: nucleic acid management
DKFZphtes3_31a10 encodes a novel 542 amino acid protein with similarity to histone H1 of Drosophila hydei. Histone H1 variants are known to act as specific regulators of genes via the differential condensation of DNA.
The new protein can find application in modulating/blocking transcriptional activity and in expression profiling.
weak similarity to Drosophila histone H1; perhaps complete cds.
Sequenced by LMU
Locus: /map="13"
Insert length: 2887 bp
Poly A stretch at pos. 2855, polyadenylation signal at pos. 2831
1 AGATGATCCC CAAAGTCAAC ATATGACATT AAGCCAGGCA TTTCACCTTA 51 AAAACAATAG TAAAAAGAAA CAAATGACTA CAGAAAAACA AAAGCAAGAT
IDl GCTAACATGC CCAAGAAACC TGTGCTTGGA TCTTATCGTG GCCAGATTGT
151 TCAGTCTAAG ATTAATTCAT TTAGAAAACC TCTACAAGTC AAAGATGAGA
201 GTTCTGCAGC AACAAAGAAA CTTTCAGCCA CTATACCTAA AGCCACAAAA
251 CCTCAGCCTG TAAACACCAG CAGTGTAACA GTGAAAAGTA ATAGATCCTC 3D1 CAATATGACT GCCACTACTA AATTTGTGAG CACTACATCT CAGAACACAC
351 AACTTGTGCG ACCTCCTATT AGAAGTCATC ACAGTAATAC CCGGGACACT
401 GTGAAACAAG GCATCAGTAG AACCTCTGCC AATGTTACAA TCCGGAAAGG
451 GCCTCATGAA AAAGAACTAT TACAATCAAA AACAGCTTTA TCTAGTGTCA
5D1 AAACCAGTTC TTCTCAAGGT ATAATAAGAA ATAAGACTCT ATCAAGATCC 551 ATAGCATCTG AAGTTGTAGC CAGGCCTGCT TCATTGTCTA ATGATAAACT bOl GATGGAAAAG TCAGAGCCCG TTGACCAGCG AAGACATACT GCAGGAAAAG b51 CAATTGTTGA TAGTAGATCA GCTCAGCCCA AAGAAACCTC GGAAGAGAGA
701 AAAGCTCGTC TGAGTGAGTG GAAAGCTGGC AAAGGAAGAG TGCTAAAAAG
751 GCCCCCTAAT TCAGTAGTTA CTCAGCATGA GCCTGCAGGA CAAAATGAAA 601 AACTAGTTGG GTCTTTTTGG ACTACCATGG CAGAAGAAGA TGAACAAAGA
651 TTATTTACTG AAAAAGTAAA CAACACATTT TCTGAATGCC TGAACTTGAT
IDl TAATGAGGGA TGTCCAAAAG AAGATATACT GGTCACACTG AATGACCTGA
151 TTAAAAATAT TCCAGATGCC AAAAAGCTTG TTAAGTATTG GATATGTCTT
1D01 GCACTTATTG AACCAATCAC AAGTCCTATT GAAAATATTA TTGCAATCTA 1D51 TGAGAAAGCC ATTCTGGCAG GGGCTCAGCC TATTGAAGAG ATGCGACACA
1101 CGATTGTAGA TATTCTAACA ATGAAGAGTC AAGAAAAAGC TAATTTAGGA
1151 GAAAATATGG AGAAGTCTTG TGCAAGCAAG GAAGAAGTCA AAGAAGTCAG
1201 TATTGAAGAT ACAGGTGTTG ATGTAGATCC AGAAAAACTG GAAATGGAGA
1251 GTAAACTTCA TAGAAATTTG CTATTTCAAG ATTGTGAAAA AGAGCAAGAC 13D1 AACAAAACAA AAGATCCAAC CCATGATGTT AAAACCCCCA ATACAGAAAC
1351 GAGGACAAGT TGCTTAATTA AATATAATGT GTCTACTACG CCATACTTGC
1401 AAAGTGTGAA AAAAAAGGTG CAGTTTGATG GAACAAATTC CGCATTTAAA
1451 GAGCTGAAGT TTTTAACACC AGTGAGACGT TCTCGACGTC TTCAAGAGAA 1501 AACTTCTAAA TTGCCAGATA TGTTAAAAGA TCATTATCCT TGTGTGTCTT 1551 CATTGGAACA GCTAACGGAG TTGGGAAGAG AAACTGATGC TTTTGTATGC IbOl CGCCCTAATG CAGCACTGTG CCGGGTGTAC TATGAGGCTG ATACAACATA IbSl AGAGAAATAA AGCTCTGTTA GGGAATGGGG TTTTTATTAT TTGTGGGGTG 17D1 TTTTGTTTTG AGTAGCTTTA TATTGCTCTT AGGTCTGGAG TTGGCCATGT 1751 ACCTATGTAT CCTAAGCATT CACGGCAGTG AGCTCCTTTA CTAACATTCA 1601 TGTTATGGCA AGAGTTGTCC TCTACATTGG AAAGCTAATC CTACCTTGTC 1851 AGTTTCAACC AACTGAGTTT TTTCTTTAAG AAAGGTAAAT TTTGTCAGCT 1101 AGTTTACTAT GTTCCTTGAA TATAAACAGG TTATAATACT ACCCTGTTCA 1151 CTTTACTAAA TATAAGTACA GTAATGATGC ATAATTAGAA AATGAGGTAT 20D1 TCTAGGTAAA ATGTATGTTT GCCTTGACAT GTTTTTAAAA GTTATGATGT 2051 ACCTCCCTGC CTTTAAACAG AATACTTTTT GGCCTTTCTC 21D1 AGATTAGTCA AAAATTCTAT AGAATGACTC ACTTCGAATA CTAAGACACA 2151 GGAGGTTTAG CCTGCTTTCT TACCAAATTC ATGTTACCCA GACTTGTGTT 2201 CTCTTGCGTC CCTTGGACTG CCTGTTGATT GATGGAAAGT GTCTGCACTG 2251 ACACTTTTCG TCAGTAGTCT GTAGTTTCGT GGCCTCTTTT GATTATAACT 23D1 GGGGTCACCA AGAAGGTTTA CTTAATTAAA TACCGCATTT CTAAGAGAAG 2351 ATACTTTGTG TAAGAAAAGA TGCCACATTT AGTGGTTTAA CTTTTGTAAC 2401 TTCACTTGAT AGTTTTTAAG CAATTAGAAT GGAGTTAGGG AAAGAACATA 2451 TCATACTGAA CAAATGTCAT TCTAGTTTAG ATAGCATTTC TAAGATAACT 25D1 GATACTAATA CTTGTTTTCT TCCCTATAAC ATAAAAAACT TCACTGTTAA 2551 GTCATGTCCC TTGAAACATG ATAGTTACAT ACACAGTTTT CTCTCCACAC 2b01 ATAAATAACA CCACTAAAGT TGTTTTGTAA GGTTCCAAAC TAATATGGCA 2b51 TATATCAACT CTACAGTTTC AAATAAATGA CTTTTTAATT GTAAAAGATT 27D1 AGTTGAAAAA CTGTATGAAT GTGAAGATCA CATGCTTAGT CATTTTTATG 2751 TTCATTCCAC TTTGTATATC TTTTCTATTT ATTGACTTCT CATGTTCTAG 2801 AGAGTAGGAC TTTTATTCCG TGTACCTGAT ATATATACAA TTAAAATATC 2851 TGTATAATTA AAAAAAAAAA AAAAAAAAAA AAAAAAG
BLAST Results
No BLAST result
Medline entries
No Medline entry
Peptide information for frame 2
ORF from 23 bp to 1648 bp; peptide length: 542
Category: similarity to known protein
Classification: unclassified
1 MTLSQAFHLK NNSKKKQMTT EKQKQDANMP KKPVLGSYRG QIVQSKINSF
51 RKPLQVKDES SAATKKLSAT IPKATKPQPV NTSSVTVKSN RSSNMTATTK
101 FVSTTSQNTQ LVRPPIRSHH SNTRDTVKQG ISRTSANVTI RKGPHEKELL
151 QSKTALSSVK TSSSQGIIRN KTLSRSIASE VVARPASLSN DKLMEKSEPV
201 DQRRHTAGKA IVDSRSAQPK ETSEERKARL SEWKAGKGRV LKRPPNSVVT
251 QHEPAGQNEK LVGSFWTTMA EEDEQRLFTE KVNNTFSECL NLINEGCPKE
301 DILVTLNDLI KNIPDAKKLV KYWICLALIE PITSPIENII AIYEKAILAG
351 AQPIEEMRHT IVDILTMKSQ EKANLGENME KSCASKEEVK EVSIEDTGVD
401 VDPEKLEMES KLHRNLLFQD CEKEQDNKTK DPTHDVKTPN TETRTSCLIK
451 YNVSTTPYLQ SVKKKVQFDG TNSAFKELKF LTPVRRSRRL QEKTSKLPDM
501 LKDHYPCVSS LEQLTELGRE TDAFVCRPNA ALCRVYYEAD TT
BLASTP hits
No BLASTP hits available
Alert BLASTP hits for DKFZphtes3_31a10, frame 2
No Alert BLASTP hits found
Pedant information for DKFZphtes3_31a10, frame 2
Report for DKFZphtes3_31a10.2
[LENGTH] 549
[MW] 61677.36
[pI] 9.33
[KW] Alpha_Beta
[KW] LOW_COMPLEXITY 2.11 %
SElQ DDP(QS(QHMTLS(QAFHLKNNSKKK(QMTTEK<QKιQDANMPKKPVLGSYRG(QIV(QSKINSFRKP
SEG xxxxxxxxxxxx PRD ccccccchhhhheeeeccccccchhhhhhhhhccccccccccccccceeeeccccccccc
SElQ LIQVKDESSAATKKLSATIPKATKPIQPVNTSSVTVKSNRSSNMTATTKFVSTTSIQNTIQLVR
SEG
PRD cccccchhhhhhhhhhccccccccccccceeeeeccccccccccceeeeeccccceeeec
SElQ PPIRSHHSNTRDTVKiQGISRTSANVTIRKGPHEKELLlQSKTALSSVKTSSSlQGIIRNKTL
SEG
PRD cccccccccccccccccccccceeeeeccccchhhhhhhhhhcccccccccceeecccch SElQ SRSIASEVVARPASLSNDKLMEKSEPVDiQRRHTAGKAIVDSRSAiQPKETSEERKARLSEU
SEG
PRD hhhhhheeeecccccchhhhhhhcccchhhhhhcceeecccccccccchhhhhhhhhhhh
SElQ KAGKGRVLKRPPNSVVTiQHEPAGiQNEKLVGSFUTTMAEEDElQRLFTEKVNNTFSECLNLI SEG
PRD hcccceeeeccccceeeeccccccceeeeeecchhhhhhhhhhhhhhhhccccccceeec
SElQ NEGCPKEDILVTLNDLIKNIPDAKKLVKYUICLALIEPITSPIENIIAIYEKAILAGAiQP
SEG PRD ccccccceeeeecccceeecccchhhhhhhhhhhhcccccccchhhhhhhhhhhhhcchh
SElQ IEEMRHTIVDILTMKSiQEKANLGENMEKSCASKEEVKEVSIEDTGVDVDPEKLEMESKLH
SEG
PRD hhhhhhhhhhhhhhhhhhhhhccchhhhhcccccceeeeeecccccccccchhhhhhhhh
SElQ RNLLFiQDCEKEiQDNKTKDPTHDVKTPNTETRTSCLIKYNVSTTPYLiQSVKKKVlQFDGTNS SEG
PRD cccccccccccccccccccccccccccccccceeeeeeecccccchhhhhhheeecccch SElQ AFKELKFLTPVRRSRRLiQEKTSKLPDMLKDHYPCVSSLElQLTELGRETDAFVCRPNAALC
SEG
PRD hhhhhhhchhhhhhhhhhhhhhccccccccccccchhhhhhhhhccccceeeecccceee
SElQ RVYYEADTT
PRD eeeeccccc
(No Prosite data available for DKFZphtes3_31a10.2)
(No Pfam data available for DKFZphtes3_31a10.2)
DKFZphtes3_31j20
group: signal transduction
DKFZphtes3_31j20 encodes a novel 392 amino acid protein that contains a protein phosphatase 2C motif. The novel protein shares 95% identity with the rat protein phosphatase 2C and is expressed ubiquitously. PP2C is a structurally diversified protein phosphatase family with a wide range of functions in cellular signal transduction. The transcription of the PP2Cdelta gene was activated in response to stress, such as alcohol or UV irradiation. PP2C plays a role in cell cycle control.
The new protein can find application in the diagnosis/therapy of stress-related diseases and cancer, as well as in the modulation of cell cycle and signal transduction.
strong similarity to protein phosphatase 2C (Rattus norvegicus)
Sequenced by LMU
Locus: unknown
Insert length: 1436 bp
Poly A stretch at pos. 1367, polyadenylation signal at pos. 1341
1 CGCTGCTCGC GGGCTGAGTG TCTGTCGCTG CTGCCGCCTC CACCCAGCCT
51 CCGCCATGGA CCTCTTCGGG GACCTGCCGG AGCCCGAGCG CTCGCCGCGC
101 CCGGCTGCCG GGAAAGAAGC TCAGAAAGGA CCCCTGCTCT TTGATGACCT
151 CCCTCCGGCC AGCAGTACTG ACTCAGGATC AGGGGGACCT TTGCTTTTTG
201 ATGATCTCCC ACCCGCTAGC AGTGGCGATT CAGGTTCTCT TGCCACATCA
251 ATATCCCAGA TGGTAAAGAC TGAAGGGAAA GGAGCAAAGA GAAAAACCTC
301 CGAGGAAGAG AAGAATGGCA GTGAAGAGCT TGTGGAAAAG AAAGTTTGTA
351 AAGCCTCTTC GGTGATCTTT GGTCTGAAGG GCTATGTGGC TGAGCGGAAG
401 GGTGAGAGGG AGGAGATGCA GGATGCCCAC GTCATCCTGA ACGACATCAC
451 CGAGGAGTGT AGGCCCCCAT CGTCCCTCAT TACTCGGGTT TCATATTTTG
501 CTGTTTTTGA TGGACATGGA GGAATTCGAG CCTCAAAATT TGCTGCACAG
551 AATTTGCATC AAAACTTAAT CAGAAAATTT CCTAAAGGAG ATGTAATCAG
601 TGTAGAGAAA ACCGTGAAGA GATGCCTTTT GGACACTTTC AAGCATACTG
651 ATGAAGAGTT CCTTAAACAA GCTTCCAGCC AGAAGCCTGC CTGGAAAGAT
701 GGGTCCACTG CCACGTGTGT TCTGGCTGTA GACAACATTC TTTATATTGC
751 CAACCTCGGA GATAGTCGGG CAATCTTGTG TCGTTATAAT GAGGAGAGTC
801 AAAAACATGC AGCCTTAAGC CTCAGCAAAG AGCATAATCC AACTCAGTAT
851 GAAGAGCGGA TGAGGATACA GAAGGCTGGA GGAAACGTCA GGGATGGGCG
901 TGTTTTGGGC GTGCTAGAGG TGTCACGCTC CATTGGGGAC GGGCAGTACA
951 AGCGCTGCGG TGTCACCTCT GTGCCCGACA TCAGACGCTG CCAGCTGACC
1001 CCCAATGACA GGTTCATTTT GTTGGCCTGT GATGGGCTCT TCAAGGTCTT
1051 TACCCCAGAA GAAGCCGTGA ACTTCATCTT GTCCTGTCTC GAGGATGAAA
1101 AGATCCAGAC CCGGGAAGGG AAGTCCGCAG CCGACGCCCG CTACGAAGCA
1151 GCCTGCAACA GGCTGGCCAA CAAGGCGGTG CAGCGGGGCT CGGCCGACAA
1201 CGTCACTGTG ATGGTGGTGC GGATAGGGCA CTGAGGGGTG GCGCGCGGCC
1251 AGGAGCACGC ATGGTATTGA CTTAAAAGGT TCATTTTGTG TGTGTGCACA
1301 TTGTGTGTTT TGTGTACTCC TGTGGGACTC CCATGGTTGT AAATAAAGGT
1351 TTCTCTTTTT TTTTCCTAAA AAAAAAAAAA AAAAAAAAAA AAAAAAAAAA
1401 AAAAAAAAAA AAAAAAAAAA AAAAAAAAAA AAAAAG
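The two positions reported for this clone, the start of the poly(A) stretch and the polyadenylation signal, can in principle be located by a simple scan of the insert sequence. The sketch below is illustrative only: the function name `find_polya_features`, the use of the AATAAA hexamer as the signal, the minimum run of 10 A residues and the 0-based offsets are all assumptions, not the procedure that produced this listing.

```python
import re

def find_polya_features(seq, signal="AATAAA", min_run=10):
    """Return (poly_a_start, signal_start) as 0-based offsets, or None where absent."""
    seq = seq.upper().replace(" ", "")
    # First run of at least `min_run` consecutive A residues (assumed definition).
    run = re.search("A{%d,}" % min_run, seq)
    poly_a_start = run.start() if run else None
    # Last occurrence of the hexamer lying entirely upstream of the poly(A) stretch.
    upstream_end = poly_a_start if poly_a_start is not None else len(seq)
    hit = seq.rfind(signal, 0, upstream_end)
    signal_start = hit if hit >= 0 else None
    return poly_a_start, signal_start

# Toy 3' end (not taken from the listing): 20 nt, the hexamer, 8 nt, then 20 A residues.
toy = "ACGT" * 5 + "AATAAA" + "GGTTCTCC" + "A" * 20
print(find_polya_features(toy))  # (34, 20)
```

The positions printed in this listing appear to correspond to 0-based offsets of this kind, although the listing itself does not state its convention.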
BLAST Results
No BLAST result
Medline entries
99074314:
Tong Y, Quirion R, Shen SH. Cloning and characterization of a novel mammalian PP2C isozyme. J Biol Chem 1998 Dec 25;273(52):35282-90
Peptide information for frame 2
ORF from 56 bp to 1231 bp; peptide length: 392
Category: strong similarity to known protein
Classification: Protein management
Prosite motifs: PP2C (147-155)
1 MDLFGDLPEP ERSPRPAAGK EAQKGPLLFD DLPPASSTDS GSGGPLLFDD
51 LPPASSGDSG SLATSISQMV KTEGKGAKRK TSEEEKNGSE ELVEKKVCKA
101 SSVIFGLKGY VAERKGEREE MQDAHVILND ITEECRPPSS LITRVSYFAV
151 FDGHGGIRAS KFAAQNLHQN LIRKFPKGDV ISVEKTVKRC LLDTFKHTDE
201 EFLKQASSQK PAWKDGSTAT CVLAVDNILY IANLGDSRAI LCRYNEESQK
251 HAALSLSKEH NPTQYEERMR IQKAGGNVRD GRVLGVLEVS RSIGDGQYKR
301 CGVTSVPDIR RCQLTPNDRF ILLACDGLFK VFTPEEAVNF ILSCLEDEKI
351 QTREGKSAAD ARYEAACNRL ANKAVQRGSA DNVTVMVVRI GH
BLASTP hits
No BLASTP hits available
Alert BLASTP hits for DKFZphtes3_31j20, frame 2
No Alert BLASTP hits found
Pedant information for DKFZphtes3_31j20, frame 2
Report for DKFZphtes3_31j20.2
[LENGTH] 410
[MW] 44751.85
[PI] 7.15
[HOMOL] TREMBL:AF095927_1 product: "protein phosphatase 2C"; Rattus norvegicus protein phosphatase 2C mRNA, complete cds. 0.0
[FUNCAT] 03.01 cell growth [S. cerevisiae, YDL006w] 6e-25
[FUNCAT] 10.03.13 key phosphatases [S. cerevisiae, YDL006w] 6e-25
[FUNCAT] 09.16 mitochondrial biogenesis [S. cerevisiae, YDL006w] 6e-25
[FUNCAT] 11.01 stress response [S. cerevisiae, YDL006w] 6e-25
[FUNCAT] 03.04 budding, cell polarity and filament formation [S. cerevisiae, YDL006w] 6e-25
[FUNCAT] 01.05.04 regulation of carbohydrate utilization [S. cerevisiae, YDL006w] 6e-25
[FUNCAT] 98 classification not yet clear-cut [S. cerevisiae, YER089c] 1e-23
[FUNCAT] 99 unclassified proteins [S. cerevisiae, YOR090c] 1e-12
[FUNCAT] 03.22 cell cycle control and mitosis [S. cerevisiae, YJL005w] 3e-10
[FUNCAT] 03.10 sporulation and germination [S. cerevisiae, YJL005w] 3e-10
[FUNCAT] 30.02 organization of plasma membrane [S. cerevisiae, YJL005w] 3e-10
[FUNCAT] 01.03.10 metabolism of cyclic and unusual nucleotides [S. cerevisiae, YJL005w] 3e-10
[FUNCAT] 10.04.03 second messenger formation [S. cerevisiae, YJL005w] 3e-10
[BLOCKS] PR01023F
[BLOCKS] PR00677D
[BLOCKS] BL01032I
[BLOCKS] BL01032H
[BLOCKS] BL01032G
[BLOCKS] BL01032C Protein phosphatase 2C proteins
[BLOCKS] BL01032B Protein phosphatase 2C proteins
[SCOP] d1a6q_ 4.16.1.1.1 Protein serine/threonine phosphatase 2C [Huma 1e-107
[EC] 3.1.3.43 [Pyruvate dehydrogenase (lipoamide)]-phosphatase 3e-01
[EC] 3.1.3.16 Phosphoprotein phosphatase 7e-35
[EC] 4.6.1.1 Adenylate cyclase 5e-11
[PIRKW] duplication 5e-11
[PIRKW] tandem repeat 8e-01
[PIRKW] serine/threonine-specific phosphatase 2e-27
[PIRKW] magnesium 6e-26
[PIRKW] cAMP biosynthesis 5e-11
[PIRKW] liver 2e-27
[PIRKW] leucine zipper 1e-08
[PIRKW] mitochondrion 3e-01
[PIRKW] phosphoric monoester hydrolase 7e-35
[PIRKW] phosphorus-oxygen lyase 2e-11
[SUPFAM] leucine-rich alpha-2-glycoprotein repeat homology 2e-11
[SUPFAM] yeast adenylate cyclase catalytic domain homology 2e-11
[SUPFAM] kinase interaction domain homology 3e-11
[SUPFAM] yeast adenylate cyclase 5e-11
[PROSITE] PP2C 1
[PFAM] Protein phosphatase 2C
[KW] Alpha_Beta
SEQ AARGLSVCRCCRLHPASAMDLFGDLPEPERSPRPAAGKEAQKGPLLFDDLPPASSTDSGS
PRD ccceeeeeeeeccccccceeeecccccccccccccccccccccccccccccccccccccc
SEQ GGPLLFDDLPPASSGDSGSLATSISQMVKTEGKGAKRKTSEEEKNGSEELVEKKVCKASS
PRD ccceeeccccccccccccccccccccccccccccccccccccccccccccccccccccce
SEQ VIFGLKGYVAERKGEREEMQDAHVILNDITEECRPPSSLITRVSYFAVFDGHGGIRASKF
PRD eeeceeeecchhhhhhhhhhhhheeeeccccccccccccccceeeeeeeccccchhhhhh
SEQ AAQNLHQNLIRKFPKGDVISVEKTVKRCLLDTFKHTDEEFLKQASSQKPAWKDGSTATCV
PRD hhhhhhhhhhhcccccccchhhhhhhhhhhhhhhhhhhhhhhhhhhcccccccccceeee
SEQ LAVDNILYIANLGDSRAILCRYNEESQKHAALSLSKEHNPTQYEERMRIQKAGGNVRDGR
PRD eeccceeeeeccccceeeeeeccccccccceeeeecccccccchhhhhhhhcccceeeee
SEQ VLGVLEVSRSIGDGQYKRCGVTSVPDIRRCQLTPNDRFILLACDGLFKVFTPEEAVNFIL
PRD ccccceeeeccccccccccccccccccccccccccceeeeeecccccccccchhhhhhhh
SEQ SCLEDEKIQTREGKSAADARYEAACNRLANKAVQRGSADNVTVMVVRIGH
PRD hhhhhhhhhhhhcchhhhhhhhhhhhhhhhhhhhccccccceeeeeeccc
Prosite for DKFZphtes3_31j20.2
PS01032 165->174 PP2C PDOC00792
Pfam for DKFZphtes3_31j20.2
HMM_NAME Protein phosphatase 2C HMM
*Gl CcM(QGPRURMsMEDaH i aylNF pcn l DUUh i MFFGVFDGHg
+++ +G R++M+DAH+ + ++ P++L ++
+++F+VFDGHG
Query 128 YVAERKG--EREEMQDAHVILNDITEECRPPSSLITR-VSYFAVFDGHG 173
HMM GDQCSQWCgeHWHdII*
G+++S++ +++H+ +
Query 174 GIRASKFAAQNLHQNL 189
DKFZphtes3_5k22
group: signal transduction
DKFZphtes3_5k22 encodes a novel 455 amino acid protein with similarity to human paraneoplastic neuronal antigen MA1. Antibodies against MA1 were found in patients with paraneoplastic neurological disorders. The protein is predominantly expressed in testis and brain, but ESTs are also found in liver, lung, uterus and kidney.
The new protein can find application in the study and therapy of paraneoplastic neurological disorders.
strong similarity to paraneoplastic neuronal antigen MA1
Sequenced by Qiagen
Locus: unknown
Insert length: 3534 bp
Poly A stretch at pos. 3514, polyadenylation signal at pos. 3494
1 GAACGTCCGC GCTGGGAGCC AGGGGTGCCC GACCCCCGTC CGCCGCCGCC
51 GCCGCCGCCG CGCATAGCCC CCGGAGAGCC CTCTGGGGAC CCCGACCAGA
101 AGGGACCTTG CCCTGGGAGA AGGCTGTGGA GACCTGGGCC TTCTGCGATC
151 ACCCTAGGAG TTGATCCAGA TATGTGCCTC ACGCCCTGAT CACTCCCCCC
201 AAATTAGTAT CCGCAGAGAT TCGAGGACAT GCCGTTGACC TTGTTACAGG
251 ACTGGTGTCG GGGGGAACAC CTGAACACCC GGAGGTGCAT GCTCATCCTG
301 GGGATCCCCG AGGACTGTGG CGAGGATGAG TTTGAGGAGA CACTCCAGGA
351 GGCTTGCAGG CACCTGGGCA GATACAGGGT GATTGGCAGG ATGTTTAGGA
401 GGGAGGAGAA CGCCCAGGCG ATTCTACTGG AGCTGGCACA AGATATCGAC
451 TATGCTTTGC TCCCAAGGGA AATACCAGGA AAGGGGGGGC CCTGGGAAGT
501 GATTGTAAAA CCCCGTAACT CAGATGGGGA ATTTCTCAAC AGACTGAACC
551 GCTTCTTAGA GGAGGAGAGG CGGACCGTGT CAGATATGAA CCGAGTCCTC
601 GGGTCGGACA CCAATTGTTC GGCTCCAAGA GTGACTATAT CACCAGAGTT
651 CTGGACCTGG GCCCAGACTC TGGGGGCAGC AGTGCAGCCT CTGCTAGAAC
701 AAATGTTGTA CCGAGAACTA AGAGTGTTTT CTGGGAACAC CATATCCATC
751 CCAGGTGCAC TGGCCTTTGA TGCCTGGCTT GAGCACACCA CTGAGATGCT
801 ACAGATGTGG CAGGTGCCCG AGGGGGAAAA GAGGCGGAGG CTGATGGAAT
851 GCTTACGGGG CCCTGCTCTC CAGGTGGTCA GTGGGCTCCG GGCCAGCAAT
901 GCTTCCATAA CTGTGGAGGA GTGCCTGGCT GCCTTGCAGC AGGTGTTCGG
951 ACCTGTGGAG AGCCATAAAA TTGCCCAGGT GAAGTTGTGT AAAGCCTATC
1001 AGGAGGCAGG AGAGAAAGTA TCTAGCTTTG TGTTACGTTT GGAACCCCTG
1051 CTCCAAAGAG CTGTAGAAAA CAATGTGGTA TCACGTAGAA ACGTGAATCA
1101 GACTCGCCTG AAACGAGTCT TAAGTGGGGC CACCCTTCCT GACAAACTCC
1151 GAGATAAGCT TAAGCTGATG AAACAGCGAA GGAAGCCTCC TGGTTTCCTG
1201 GCCCTGGTGA AGCTCCTGCG TGAGGAGGAG GAATGGGAGG CCACTTTAGG
1251 TCCAGATAGG GAGAGTCTGG AGGGGCTGGA AGTAGCCCCA AGGCCACCTG
1301 CCAGGATCAC TGGGGTTGGG GCAGTACCTC TCCCTGCCTC TGGCAACAGT
1351 TTTGATGCGA GGCCTTCCCA GGGCTACCGG CGCCGGAGGG GCAGAGGCCA
1401 ACACCGAAGG GGTGGTGTGG CAAGGGCTGG CTCTCGAGGC TCAAGAAAAC
1451 GGAAACGCCA CACATTCTGC TATAGCTGTG GGGAAGACGG CCACATCAGG
1501 GTACAGTGCA TCAACCCCTC CAACCTGCTC TTGGCCAAGG AGACAAAAGA
1551 GATATTGGAA GGAGGGGAAA GAGAAGCCCA GACAAACAGC AGATGAGTTG
1601 AGTGGGGCAG AGGGACAGGG CAGCCAGACC AAGGCCAAGC CTTCTCACCC
1651 TTGGCCAGCT GGAAGGGACT TCAGCAACCA AGACCACCTG GCAACAGGCT
1701 CAGTGGGGGT CAGGTCCAGG TCCCCGAAGA GGTGCTGGAG AGGAAAGCAG
1751 GGAGCCACTG CATCCAGCAC ATGGGGTGCC TGGGCCTCAG ATGGGGACCC
1801 CAAAGAAGCA GAAGCTGAAG AAGGTACGGC TGGGGGTTCT GTCCTGCTCA
1851 TCCAACCACC CCTAAATACC CACCCTGTGG ACTTTGAGCT GAACATGCCC
1901 ACTGGCCCCC AGGCCACATG GGACCTGGAG GAGCCTACCT GGGGCCTGCC
1951 CCTGCCAGCA GGTGCCAGGG CTGGTGAGGA AGAGCTGGGG GGCAGAGGTA
2001 AAGCCCTGCA GGGGAGGCCA CAGGGTCCAT CCCGTCTTCA GGATCATCTA
2051 CACTGCACTA GGGGAGCCCC AGGAAGGCAG CACCCTGGAG GCCCTGTGCC
2101 AGTGAGGACA GGAGACCCTA AGGCCCCGGG AGCCCAGTGC CAGCCAGAGG
2151 TTGTGCAGGC AAGGAGACCA AAGATTGATG AGAAGACCCC CAGCAGGGGT
2201 ACTGGGTACC CGGCAGGCCA GTGCCCTCAC AGTTGACTTG GACCAGGGTG
2251 GCTGTGAAGG GAAGTCTTTG TTGCAAAGGA GGAGGAGGAA AAGGGAGGAC
2301 TTGGTAGGGT TTTGTTTCTT CTGCTTGTTT CTGTACAGGG CCACCAGACT
2351 CCTGGAGAGA TCAAGCAAGG AGAACCTGGG GCTGCCATGG CCAAAGCAAC
2401 TCAACAGATG CCAATGCCAA TTCCAAGGCC AGCCACAACC CTGCCACCTT
2451 GGGGAATCCA GCCTGGAGGC ATCCCCTAAG CAGCCAGCCA TGGCCTGGGT
2501 GGAGGCACCT GAAGACGTCT GTCCCAAACT CCCCCAGCCC TGAGCTGGGA
2551 GATGACAGGG GGAAAGAGGC CCTCTCAAGG GTGCCAGATG CCTGGGTCTC
2601 CCAAGAGGGG TCCCCCAACT CACCGTTCCC GGGACAGGCT GCCCCCTGTT
2651 CCAGGAAGCT CATCCTCACC TGTGTAGGCC CCTGTAGTGA CCCACGCGTC
2701 CAGCAGACGC CCACCCACCG CTAGCCGTTG TTCCTGTGCA AAGTAGTGTG
2751 CTATGCACCC ACCCAGGTGG CCGCCTCTGG GCCCAAGGCA CATGCTGTGA
2801 GCTTCCTGTG AGCCCAGGCT CTGCTCACTG CTGTCCCGCG TCATGAGCAC
2851 CACCTCTGCT TTCCCTGTGT AGATCTAGGC CAGTGGCTGC TTGTTCTTGT
2901 GGAGCTGTGT GTGTTCTTCT CTGAGCAGCT CCTCCCCGGA GTCCCCCAGC
2951 ACAGTCCCAG GAGATGACAG GAAGGAAGCA CCAGGGCAAG GCGGACGCTC
3001 ACCCTGTGAC CACGATGGTG ACCGTGGCTG TGGGAGGAAG AACTGGACCC
3051 AGGACGGAGC GGGGCTGCCC TGCCTGAGGC TCCCGAGGAG CTTTGTGCTT
3101 TGGTGTTCCA CCCCTGTTGT TACTCATGAC TCAGTTTCCT TGACCTGGTA
3151 GGGTGTTCCC TGCTGTGTTT TCCAGTGTCC TGTGACTGTC CTGTGCGGGC
3201 CATAGGGCAG GGCCCTGCCC CAGCAGATGG GCTTGGGAGG GGGCTCCCTA
3251 AAGCCAGTGG ACACTGCCAG AGTCTACCTT CCTGGCAAGA GGCAGACCCC
3301 GGGGCCCTCA GGAAGGAGGG AGTTGGCAGC GGGGGCTGCA GCAGGAGTAG
3351 GAGCAGATGA GGCGTCTTGC CAGGAACCTC AGGAGGAGGG GGCCCGGGAC
3401 CTGTGTGGGA CCTGTGTCCT GTGGTGGCCG TTTGCAGTTT CTCTCTGTGT
3451 TGTGATTCCC TTCTCTTCAA TGGTTTCAGT ACGTGTTTCT CTTCAATAAA
3501 CTTCATTCAG TGTTAAAAAA AAAAAAAAAA AAAA
BLAST Results
No BLAST result
Medline entries
99156171:
Ma1, a novel neuron- and testis-specific protein, is recognized by the serum of patients with paraneoplastic neurological disorders
Peptide information for frame 1
ORF from 229 bp to 1593 bp; peptide length: 455
Category: strong similarity to known protein
Classification: unclassified
1 MPLTLLQDWC RGEHLNTRRC MLILGIPEDC GEDEFEETLQ EACRHLGRYR
51 VIGRMFRREE NAQAILLELA QDIDYALLPR EIPGKGGPWE VIVKPRNSDG
101 EFLNRLNRFL EEERRTVSDM NRVLGSDTNC SAPRVTISPE FWTWAQTLGA
151 AVQPLLEQML YRELRVFSGN TISIPGALAF DAWLEHTTEM LQMWQVPEGE
201 KRRRLMECLR GPALQVVSGL RASNASITVE ECLAALQQVF GPVESHKIAQ
251 VKLCKAYQEA GEKVSSFVLR LEPLLQRAVE NNVVSRRNVN QTRLKRVLSG
301 ATLPDKLRDK LKLMKQRRKP PGFLALVKLL REEEEWEATL GPDRESLEGL
351 EVAPRPPARI TGVGAVPLPA SGNSFDARPS QGYRRRRGRG QHRRGGVARA
401 GSRGSRKRKR HTFCYSCGED GHIRVQCINP SNLLLAKETK EILEGGEREA
451 QTNSR
BLASTP hits
No BLASTP hits available
Alert BLASTP hits for DKFZphtes3_5k22, frame 1
TREMBLNEW:AB020690_1 gene: "KIAA0883"; product: "KIAA0883 protein"; Homo sapiens mRNA for KIAA0883 protein, complete cds., N = 1, Score = 722, P = 2.4e-71
TREMBL:AF037364_1 gene: "MA1"; product: "paraneoplastic neuronal antigen MA1"; Homo sapiens paraneoplastic neuronal antigen MA1 (MA1) mRNA, complete cds., N = 1, Score = 665, P = 2.6e-65
>TREMBLNEW:AB020690_1 gene: "KIAA0883"; product: "KIAA0883 protein"; Homo sapiens mRNA for KIAA0883 protein, complete cds.
Length = 364
HSPs:
Score = 722 (106.3 bits), Expect = 2.4e-71, P = 2.4e-71
Identities = 156/348 (44%), Positives = 215/348 (61%)
Query:   1 MPLTLLQDWCRGEHLNTRRCMLILGIPEDCGEDEFEETLQEACRHLGRYRVIGRMFRREE 60
           M L LL+DWCR   ++ ++ +++ GIP D  E E +E LQE  + LGRYR++G++FR++E
Sbjct:   1 MALALLEDWCRIMSVDEQKSLMVTGIPADFEEAEIQEVLQETLKSLGRYRLLGKIFRKQE 60
Query:  61 NAQAILLELAQDIDYALLPREIPGKGGPWEVIVKPRNSDGXXXXXXXXXXXXXXXTVSDM 120
           NA A+LLEL +D D + +P E+ GKGG W+VI K  N D                TVS M
Sbjct:  61 NANAVLLELLEDTDVSAIPSEVQGKGGVWKVIFKTPNQDTEFLERLNLFLEKEGQTVSGM 120
Query: 121 NRVLGSDTNCSAPRVTISPEFWTW--AQTLGAAVQPLLEQMLYRELRVFSGNTISIPGAL 178
            R LG +    A    ISPE       Q +  A QPLL  M YR+LRVFSG+ +  P
Sbjct: 121 FRALGQEGVSPATVPCISPELLAHLLGQAMAHAPQPLLP-MRYRKLRVFSGSAVPAPEEE 179
Query: 179 AFDAWLEHTTEMLQMWQVPEGEKRRRLMECLRGPALQVVSGLRASNASITVEECLAALQQ 238
           +F+ WLE  TE+++ W V E EK+R L E LRGPAL ++  ++A N SI+VEECL A +Q
Sbjct: 180 SFEVWLEQATEIVKEWPVTEAEKKRWLAESLRGPALDLMHIVQADNPSISVEECLEAFKQ 239
Query: 239 VFGPVESHKIAQVKLCKAYQEAGEKVSSFVLRLEPLLQXXXXXXXXXXXXXXXXXLKRVL 298
           VFG +ES + AQV+  K YQE GEKVS++VLRLE LL+                 L++V+
Sbjct: 240 VFGSLESRRTAQVRYLKTYQEEGEKVSAYVLRLETLLRRAVEKRAIPRRIADQVRLEQVM 299
Query: 299 SGATLPDKLRDKLKLMKQRRKPPGFLALVKLLREEEEWEATLGPDRESLE 348
           +GATL   L  +L+ +K +  PP FL L+K++REEEE EA+   + ES+E
Sbjct: 300 AGATLNQMLWCRLRELKDQGPPPSFLELMKVIREEEEEEASF--ENESIE 347
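The "Identities" figure reported with an HSP is a count of alignment columns in which the query and subject residues are the same, expressed over the number of aligned columns. The sketch below illustrates that arithmetic only; the function name `alignment_identity` is hypothetical, and truncating the percentage (rather than rounding it) is an assumption chosen because it reproduces integer percentages such as 156/348 giving 44.

```python
def alignment_identity(query, sbjct):
    """Count identical residues over aligned columns; '-' marks a gap."""
    if len(query) != len(sbjct):
        raise ValueError("aligned strings must have equal length")
    # A column counts as identical only if both residues match and neither is a gap.
    identical = sum(1 for q, s in zip(query, sbjct) if q == s and q != "-")
    columns = len(query)
    percent = (100 * identical) // columns  # truncated, not rounded (assumption)
    return identical, columns, percent

# First eleven columns of the Query/Sbjct pair of the HSP.
print(alignment_identity("MPLTLLQDWCR", "MALALLEDWCR"))  # (8, 11, 72)
```

Low-complexity positions masked as X in the query never equal a subject residue, so under this counting they contribute to the column total but not to the identities.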
Pedant information for DKFZphtes3_5k22, frame 1
Report for DKFZphtes3_5k22.1
[LENGTH] 455
[MW] 51514.34
[PI] 9.27
[HOMOL] TREMBLNEW:AB020690_1 gene: "KIAA0883"; product: "KIAA0883 protein"; Homo sapiens mRNA for KIAA0883 protein, complete cds. 3e-75
[BLOCKS] BL00876B Indoleamine 2,3-dioxygenase proteins
[PFAM] Zinc finger, CCHC class
[KW] Alpha_Beta
[KW] LOW_COMPLEXITY 13.41 %
SEQ MPLTLLQDWCRGEHLNTRRCMLILGIPEDCGEDEFEETLQEACRHLGRYRVIGRMFRREE
SEG
PRD ccchhhhhccccccccccceeeeeecccccchhhhhhhhhhhhhhccceeehhhhhhhhh
SEQ NAQAILLELAQDIDYALLPREIPGKGGPWEVIVKPRNSDGEFLNRLNRFLEEERRTVSDM
SEG xxxxxxxxxxxxxxx
PRD hhhhhhhhhhhcccccccccccccccceeeeeeecccccchhhhhhhhhhhhhhchhhhh
SEQ NRVLGSDTNCSAPRVTISPEFWTWAQTLGAAVQPLLEQMLYRELRVFSGNTISIPGALAF
SEG
PRD hhhhcccccccccccccccchhhhhhhhhhhhhhhhhhhhhhhheeeccccccccchhhh
SEQ DAWLEHTTEMLQMWQVPEGEKRRRLMECLRGPALQVVSGLRASNASITVEECLAALQQVF
SEG
PRD hhhhhhhhhhhhhhhccchhhhhhhhhhhhccccccccccccccceeehhhhhhhhhhhh
SEQ GPVESHKIAQVKLCKAYQEAGEKVSSFVLRLEPLLQRAVENNVVSRRNVNQTRLKRVLSG
SEG xxxxxxxxxxxxxxxxx
PRD hccchhhhhhhhhhhhhhhcccccceeeeehhhhhhhhhhhhcchhhhhhhhhhhhhhhc
SEQ ATLPDKLRDKLKLMKQRRKPPGFLALVKLLREEEEWEATLGPDRESLEGLEVAPRPPARI
SEG
PRD ccccchhhhhhhhhhhhccccchhhhhhhhhhhhhhhhhcccchhhhheeeecccccccc
SEQ TGVGAVPLPASGNSFDARPSQGYRRRRGRGQHRRGGVARAGSRGSRKRKRHTFCYSCGED
SEG xxxxxxxxxxxxxxxxxxxxxxxxxxxxx
PRD eeeeeeccccccccccccccccccccccccccceeeeeeccccccccccceeeeeccccc
SEQ GHIRVQCINPSNLLLAKETKEILEGGEREAQTNSR
SEG
PRD ceeeeeeccccchhhhhhhhhhhcccccccccccc
(No Prosite data available for DKFZphtes3_5k22.1)
Pfam for DKFZphtes3_5k22.1
HMM_NAME Zinc finger, CCHC class
HMM *QkCWNCGKPGHMMRDCPE*
C++CG+ GH+ +C +
Query 412 TFCYSCGEDGHIRVQCIN 429
DKFZphtes3_7n12
group: transmembrane protein
DKFZphtes3_7n12 encodes a novel 703 amino acid protein without similarity to known proteins. The novel protein contains 1 transmembrane domain.
No informative BLAST results; no predictive Prosite, Pfam or SCOP motif.
The new protein can find application in studying the expression profile of testis-specific genes and as a new marker for testicular cells.
putative protein
contains transmembrane domain
perhaps complete cds.
Sequenced by BMFZ
Locus: unknown
Insert length: 2347 bp
Poly A stretch at pos. 2271, polyadenylation signal at pos. 2253
1 CGGCTGCAGT CTGGGCCGGG GCCCTGTGCC GCTGAAGACA TGGAGTTTGT
51 GTCTGGATAC CGGGATGAGT TCCTTGATTT CACTGCCCTT CTCTTCGGCT
101 GGTTCCGAAA GTTTGTGGCA GAGCGTGGAG CTGTAGGGAC TAGCCTTGAG
151 GGCCGCTGCC GGCAGCTGGA GGCCCAGATC AGAAGGCTAC CCCAGGACCC
201 TGCCCTTTGG GTGCTCCATG TCCTGCCCAA CCATAGTGTG GGCATCAGCC
251 TGGGGCAAGG GGCAGAACCA GGTCCTGGAC CAGGCCTGGG GACTGCCTGG
301 CTCCTGGGAG ACAACCCTCC ACTCCACCTG CGAGACCTGA GCCCCTACAT
351 CAGCTTTGTC AGCCTAGAGG ATGGGGAGGA AGGGGAGGAG GAAGAGGAGG
401 AAGATGAAGA AGAAGAGAAG AGAGAGGACG GGGGTGCAGG CAGCACAGAG
451 AAGGTGGAAC CAGAGGAGGA CCGGGAGCTA GCCCCTACCA GCAGGGAGTC
501 CCCCCAGGAA ACAAACCCTC CAGGAGAGTC AGAGGAGGCT GCCCGGGAGG
551 CAGGAGGTGG CAAGGATGGC TGCCGAGAGG ACAGGGTGGA GAACGAAACA
601 AGACCCCAGA AGAGGAAGGG ACAGAGGAGT GAGGCTGCCC CCCTGCACGT
651 TTCCTGTCTC TTACTTGTGA CGGATGAGCA TGGCACCATC TTGGGCATTG
701 ATCTGCTAGT GGATGGAGCC CAGGGAACCG CAAGCTGGGG CTCAGGGACC
751 AAGGACCTGG CTCCTTGGGC CTATGCTCTC CTCTGTCACA GCATGGCCTG
801 TCCCATGGGC TCTGGGGATC CCCGAAAGCC CCGACAGCTT ACTGTGGGAG
851 ATGCCCGGCT GCATCGAGAG CTGGAGAGCT TGGTCCCAAG GCTAGGTGTG
901 AAGTTAGCCA AAACCCCAAT GCGGACATGG GGTCCCCGGC CAGGCTTCAC
951 CTTTGCTTCC CTTCGTGCTC GAACCTGCCA TGTGTGTCAC AGGCACAGCT
1001 TTGAAGCGAA GCTGACACCT TGCCCCCAGT GTAGTGCTGT CTTGTATTGT
1051 GGAGAGGCTT GTCTCCGGGC TGACTGGCAG CGGTGCCCAG ATGATGTGAG
1101 TCACCGATTT TGGTGCCCAA GGCTTGCAGC CTTCATGGAG CGGGCAGGAG
1151 AACTGGCAAC CCTACCTTTT ACCTACACCG CAGAGGTGAC CAGTGAAACC
1201 TTCAACAAAG AGGCCTTCCT GGCCTCTCGG GGCCTCACTC GTGGCTATTG
1251 GACCCAGCTC AGCATGCTGA TTCCAGGCCC GGGCTTCTCC AGACACCCCC
1301 GAGGCAACAC GCCATCCCTC AGCCTTCTTC GCGGTGGAGA CCCCTACCAG
1351 CTTCTCCAGG GAGACGGGAC TGCCCTGATG CCTCCTGTGC CCCCACATCC
1401 ACCCCGGGGT GTTTTTGTCC CTGAGCTCAA CATCCAAAAC AAACAGTCAC
1451 TGAAGATCCA CGTGGTGGAG GCCGGGAAGG AGTTTGACCT TGTCATGGTG
1501 TTTTGGGAGC TTTTGGTCCT GCTCCCCCAT GTGGCCCTGG AGCTGCAGTT
1551 TGTAGGTGAT GGCCTGCCCC CCGAAAGCGA CGAGCAGCAT TTTACCCTGC
1601 AGAGGGACAG CCTGGAGGTG TCTGTCCGGC CTGGTTCCGG CATATCAGCA
1651 CGGCCCAGCT CTGGCACTAA GGAGAAAGGG GGCCGCAGGG ACCTGCAGAT
1701 CAAGGTGTCA GCAAGGCCCT ACCACCTGTT CCAGGGGCCC AAGCCTGACC
1751 TGGTTATTGG ATTTAACTCC GGGTTTGCTC TCAAGGATAC GTGGCTGAGG
1801 TCTCTGCCCC GGTTACAGTC CCTCCGAGTG CCAGCCTTCT TCACCGAGAG
1851 CAGCGAGTAC AGCTGTGTGA TGGACGGCCA GACCATGGCG GTGGCCACTG
1901 GAGGGGGCAC CAGCCCTCCC CAGCCCAACC CCTTCCGCTC CCCCTTTCGC
1951 CTCAGAGCGG CCGACAACTG CATGTCCTGG TACTGCAATG CCTTCATCTT
2001 CCACCTGGTT TACAAGCCTG CTCAAGGGAG CGGGGCCCGC CCGGCGCCCG
2051 GGCCCCCACC CCCATCCCCA ACTCCCTCTG CTCCTCCTGC CCCCACCCGA
2101 AGGCGCCGAG GAGAAAAGAA ACCTGGGCGG GGGGCCCGCC GGCGGAAATG
2151 AATGCTGATA CCCTAGTAGT CCCCAGCTCC CAAACACTGA AAGGAAAACG
2201 TGAAAACACT CAAGGCCTAG GGGGAGGACA GGTTGGTAAA ACATGAAAAG
2251 GTAAATAAAA TTACTTGTTT GAAAAAAAAA AAAAAAAAAA AAAAAAAAAA
2301 AAAAAAAAAA AAAAAAAAAA AAAAAAAAAA AAAAAAAAAA AAAAAAA
BLAST Results
No BLAST result
Medline entries
No Medline entry
Peptide information for frame 1
ORF from 40 bp to 2148 bp; peptide length: 703
Category: putative protein
Classification: Transmembrane proteins unclassified
1 MEFVSGYRDE FLDFTALLFG WFRKFVAERG AVGTSLEGRC RQLEAQIRRL
51 PQDPALWVLH VLPNHSVGIS LGQGAEPGPG PGLGTAWLLG DNPPLHLRDL
101 SPYISFVSLE DGEEGEEEEE EDEEEEKRED GGAGSTEKVE PEEDRELAPT
151 SRESPQETNP PGESEEAARE AGGGKDGCRE DRVENETRPQ KRKGQRSEAA
201 PLHVSCLLLV TDEHGTILGI DLLVDGAQGT ASWGSGTKDL APWAYALLCH
251 SMACPMGSGD PRKPRQLTVG DARLHRELES LVPRLGVKLA KTPMRTWGPR
301 PGFTFASLRA RTCHVCHRHS FEAKLTPCPQ CSAVLYCGEA CLRADWQRCP
351 DDVSHRFWCP RLAAFMERAG ELATLPFTYT AEVTSETFNK EAFLASRGLT
401 RGYWTQLSML IPGPGFSRHP RGNTPSLSLL RGGDPYQLLQ GDGTALMPPV
451 PPHPPRGVFV PELNIQNKQS LKIHVVEAGK EFDLVMVFWE LLVLLPHVAL
501 ELQFVGDGLP PESDEQHFTL QRDSLEVSVR PGSGISARPS SGTKEKGGRR
551 DLQIKVSARP YHLFQGPKPD LVIGFNSGFA LKDTWLRSLP RLQSLRVPAF
601 FTESSEYSCV MDGQTMAVAT GGGTSPPQPN PFRSPFRLRA ADNCMSWYCN
651 AFIFHLVYKP AQGSGARPAP GPPPPSPTPS APPAPTRRRR GEKKPGRGAR
701 RRK
BLASTP hits
No BLASTP hits available
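The ORF coordinates and the peptide length given for frame 1 are tied together by simple codon arithmetic: an ORF spanning nucleotides 40 to 2148 (1-based, inclusive) covers 2109 nucleotides, i.e. 703 codons. The sketch below makes that check explicit. The function name `orf_peptide_length` and the `includes_stop` flag are hypothetical; treating the reported end coordinate as the last base of the final sense codon (stop codon excluded) is an assumption that happens to agree with the figures quoted here.

```python
def orf_peptide_length(start, end, includes_stop=False):
    """Peptide length for an ORF given 1-based inclusive nucleotide coordinates."""
    span = end - start + 1
    if span % 3 != 0:
        raise ValueError("ORF span is not a whole number of codons")
    codons = span // 3
    # If the end coordinate covers the stop codon, it does not code for a residue.
    return codons - 1 if includes_stop else codons

print(orf_peptide_length(40, 2148))  # 703
```

A span that is not divisible by three raises an error, which is a quick way to catch a misread coordinate.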
Alert BLASTP hits for DKFZphtes3_7n12, frame 1
No Alert BLASTP hits found
Pedant information for DKFZphtes3_7n12, frame 1
Report for DKFZphtes3_7n12.1
[LENGTH] 703
[MW] 77312.72
[PI] 6.45
[KW] TRANSMEMBRANE 1
[KW] LOW_COMPLEXITY 15.22 %
SEQ MEFVSGYRDEFLDFTALLFGWFRKFVAERGAVGTSLEGRCRQLEAQIRRLPQDPALWVLH
SEG
PRD ccceeeccchhhhhhhhhhhhhhhhhhhhccccccchhhhhhhhhhhhhccccccccccc
MEM
SEQ VLPNHSVGISLGQGAEPGPGPGLGTAWLLGDNPPLHLRDLSPYISFVSLEDGEEGEEEEE
SEG xxxxxxxxxxx
PRD cccccccccccccccccccccceeeeeecccccccccccccceeeeeeccccchhhhhhh
MEM
SEQ EDEEEEKREDGGAGSTEKVEPEEDRELAPTSRESPQETNPPGESEEAAREAGGGKDGCRE
SEG xxxxxxxxxxxx xxxxxxxxxxxxx
PRD hhhhhhhhcccccccccccccccccccccccccccccccccchhhhhhhhccccccccce
MEM
SEQ DRVENETRPQKRKGQRSEAAPLHVSCLLLVTDEHGTILGIDLLVDGAQGTASWGSGTKDL
SEG
PRD eeccccccccccccccccccchhhhhheeeecccccccchhhhhhccccccccccccccc
MEM
SEQ APWAYALLCHSMACPMGSGDPRKPRQLTVGDARLHRELESLVPRLGVKLAKTPMRTWGPR
SEG
PRD hhhhhhhhhhhhccccccccccccceeeecchhhhhhhhhhhcccccccccccccccccc
MEM
SEQ PGFTFASLRARTCHVCHRHSFEAKLTPCPQCSAVLYCGEACLRADWQRCPDDVSHRFWCP
SEG
PRD ccccchhhhhhhhcccccccccccccccccceeeeccchhhhhhhhccccccccccccch
MEM
SEQ RLAAFMERAGELATLPFTYTAEVTSETFNKEAFLASRGLTRGYWTQLSMLIPGPGFSRHP
SEG
PRD hhhhhhhhhhhhhccccccccchhhhhhhhhhhhhhhhcccccchhhhhccccccccccc
MEM
SEQ RGNTPSLSLLRGGDPYQLLQGDGTALMPPVPPHPPRGVFVPELNIQNKQSLKIHVVEAGK
SEG xxxxxxxxxxxxxx
PRD cccccceeeeeccccceeeccccccccccccccccceeeeccccchhhhhheeeeeeccc
MEM
SEQ EFDLVMVFWELLVLLPHVALELQFVGDGLPPESDEQHFTLQRDSLEVSVRPGSGISARPS
SEG xxxxxxxxxxxxx
PRD cccchhhhhhhhhchhhhhhhhhhhcccccccchhhhhhhccccceeeeccccccccccc
MEM MMMMMMMMMMMMMMMMM
SEQ SGTKEKGGRRDLQIKVSARPYHLFQGPKPDLVIGFNSGFALKDTWLRSLPRLQSLRVPAF
SEG
PRD ccccccccccceeeeeeccccccccccccceeeecccccccccccccccccccccccccc
MEM
SEQ FTESSEYSCVMDGQTMAVATGGGTSPPQPNPFRSPFRLRAADNCMSWYCNAFIFHLVYKP
SEG x
PRD cccccceeeecccceeeeeecccccccccccccccchhhhhcchhhhhhhhhhhhhhccc
MEM
SEQ AQGSGARPAPGPPPPSPTPSAPPAPTRRRRGEKKPGRGARRRK
SEG xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
PRD ccccccccccccccccccccccccchhhhhccccccccccccc
MEM
(No Prosite data available for DKFZphtes3_7n12.1)
(No Pfam data available for DKFZphtes3_7n12.1)
DKFZphtes3_1e16
group: transmembrane protein
DKFZphtes3_1e16 encodes a novel 539 amino acid protein without similarity to known proteins. The novel protein contains 1 transmembrane region. The only EST described so far is from testis.
No informative BLAST results; no predictive Prosite, Pfam or SCOP motif.
The new protein can find application in studying the expression profile of testis-specific genes and as a new marker for testicular cells.
putative protein
1 EST hit
perhaps complete cds.
Sequenced by DKFZ
Locus: unknown
Insert length: 2011 bp
Poly A stretch at pos. 1986, no polyadenylation signal found
1 CATGGCAACA TGAGCAGTGC TGAGATAATT GGTTCTACAA ATCTTATAAT
51 TCTGCTAGAG GATGAAGTCT TTGCCGATTT TTTCAACACA TTTCTTTCCC
101 TCCCGGTTTT TGGTCAGACA CCATTTTATA CTGTTGAAAA TTCACAGTGG
151 AGCTTGTGGC CAGAAATACC TTGTAACTTG ATTGCCAAAT ACAAAGGGTT
201 ATTGACCTGG TTGGAAAAAT GCCGATTACC TTTCTTCTGT AAAACAAACT
251 TGTGTTTCCA TTACATTCTC TGTCAGGAGT TCATCAGTTT CATTAAGTCC
301 CCAGAAGGAG CCAAGATGAT GAGATGGAAA AAGGCAGACC AGTGGCTACT
351 CCAGAAATGC ATTGGCGGGG TCAGAGGGAT GTGGCGCTTC TATTCCTACC
401 TCACAGGCAG TGCAGGTGAA GAATTGGTGG ATTTCTGGAT CCTTGCTGAG
451 AACATCCTGA GCATAGATGA GATGGACCTG GAAGTGAGAG ACTACTACCT
501 GTCCCTCCTC CTCATGCTGA GGGCCACTCA TCTGCAGGAG GGCTCCAGGG
551 TGGTAACCCT CTGTAACATG AACATCAAGT CCCTCCTGAA CCTCTCCATC
601 TGGCATCCCA ACCAATCAAC CACTAGGAGG GAGATCCTGA GCCACATGCA
651 GAAAGTGGCT CTGTTCAAAC TCCAGAGCTA TTGGCTTCCC AACTTTTACA
701 CCCACACCAA GATGACCATG GCCAAGGAGG AAGCATGCCA TGGTCTGATG
751 CAAGAGTACG AGACTCGCTT ATACAGCGTT TGCTACACCC ACATAGGAGG
801 GCTCCCTCTG AACATGAGCA TCAAGAAGTG CCACCACTTT CAGAAACGGT
851 ACTCAAGCAG GAAAGCCAAG AGGAAGATGT GGCAATTGGT AGATCCTGAC
901 TCTTGGTCTC TGGAAATGGA TCTCAAGCCA GATGCTATTG GTATGCCCCT
951 ACAGGAGACA TGTCCTCAAG AGAAGGTGGT TATACAAATG CCTTCCCTGA
1001 AAATGGCTTC TTCAAAGGAA ACAAGAATCA GTTCCCTGGA AAAGGATATG
1051 CATTATGCAA AAATATCCAG CATGGAGAAT AAAGCCAAGA GCCACCTCCA
1101 CATGGAAGCC CCCTTTGAGA CAAAGGTCTC TACCCACCTG AGGACTGTCA
1151 TCCCCATTGT CAATCACTCC TCCAAGATGA CAATTCAGAA GGCCATCAAG
1201 CAAAGCTTCT CCTTAGGATA CATCCACTTG GCCTTGTGTG CTGATGCCTG
1251 TGCAGGGAAC CCTTTCCGGG ACCACCTGAA GAAGCTGAAT TTGAAAGTGG
1301 AGATCCAACT TCTTGACCTC TGGCAGGACT TGCAGCATTT CCTCAGTGTC
1351 CTTCTGAATA ACAAAAAGAA TGGGAATGCA ATCTTTCGTC ACTTGCTGGG
1401 TGACAGAATC TGCGAGCTCT ACCTGAATGA GCAGATTGGT CCGTGCTTAC
1451 CACTCAAATC CCAAACCATT CAGGGCCTGA AGGAACTATT GCCCTCTGGG
1501 GATGTGATCC CCTGGATTCC CAAAGCCCAG AAGGAGATTT GCAAGATGCT
1551 CAGTCCCTGG TATGATGAGT TTCTAGATGA AGAGGACTAC TGGTTTCTCC
1601 TTTTTACGGT AGGAAGGACT TTGGGTTAGG AAGGAATCAT GAGGATGAGG
1651 GAAGAAGAAA GAGTAATTAC TGTTTTAAAA GGGTTATGTG TTAAAGTAAA
1701 TGAAATTGTT ATTTTTCCTA GAGTCAACCA AAGATCAGCA TGGTCCCTGT
1751 TGTTCTAAAG CTAAACCTCT CAAGGAAAAG GACTCAGTGC ATAAGATGAC
1801 TTTGGTGAAA CCCCGTCTCT ACTAAAAATA CAAAAAATTA GCCGGGCGTA
1851 GTGGCGGGCG CCTGTAGTCC CAGCTACTTG GGAGGCTGAG GCAGGAGAAT
1901 GGTGTGAACC CGGGAGGCGG AGCTTGCAGT GAGCCGAGAT CCCGCCACTG
1951 CACGCCAGCC TGGGCGACAG AGCGAGACTC CGTCTCAAAA AAAAAAAAAA
2001 AAAAAAAAAA G
BLAST Results
No BLAST result
Medline entries
No Medline entry
Peptide information for frame 1
ORF from 10 bp to 1626 bp; peptide length: 539
Category: putative protein
Classification: no clue
1 MSSAEIIGST NLIILLEDEV FADFFNTFLS LPVFGQTPFY TVENSQWSLW
51 PEIPCNLIAK YKGLLTWLEK CRLPFFCKTN LCFHYILCQE FISFIKSPEG
101 AKMMRWKKAD QWLLQKCIGG VRGMWRFYSY LTGSAGEELV DFWILAENIL
151 SIDEMDLEVR DYYLSLLLML RATHLQEGSR VVTLCNMNIK SLLNLSIWHP
201 NQSTTRREIL SHMQKVALFK LQSYWLPNFY THTKMTMAKE EACHGLMQEY
251 ETRLYSVCYT HIGGLPLNMS IKKCHHFQKR YSSRKAKRKM WQLVDPDSWS
301 LEMDLKPDAI GMPLQETCPQ EKVVIQMPSL KMASSKETRI SSLEKDMHYA
351 KISSMENKAK SHLHMEAPFE TKVSTHLRTV IPIVNHSSKM TIQKAIKQSF
401 SLGYIHLALC ADACAGNPFR DHLKKLNLKV EIQLLDLWQD LQHFLSVLLN
451 NKKNGNAIFR HLLGDRICEL YLNEQIGPCL PLKSQTIQGL KELLPSGDVI
501 PWIPKAQKEI CKMLSPWYDE FLDEEDYWFL LFTVGRTLG
BLASTP hits
No BLASTP hits available
Alert BLASTP hits for DKFZphtes3_1e16, frame 1
No Alert BLASTP hits found
Pedant information for DKFZphtes3_1e16, frame 1
Report for DKFZphtes3_1e16.1
[LENGTH] 542
[MW] 62106.06
[PI] 8.35
[KW] Alpha_Beta
SEQ HGNMSSAEIIGSTNLIILLEDEVFADFFNTFLSLPVFGQTPFYTVENSQWSLWPEIPCNL
PRD cccccceeeeccccceeehhhhhhhhhccccccccccccccccccccccccccccccchh
SEQ IAKYKGLLTWLEKCRLPFFCKTNLCFHYILCQEFISFIKSPEGAKMMRWKKADQWLLQKC
PRD hhhhccceeecccccccccccccceeehhhhhhhhhhhccccchhhhhhhcchhhhhhhh
SEQ IGGVRGMWRFYSYLTGSAGEELVDFWILAENILSIDEMDLEVRDYYLSLLLMLRATHLQE
PRD ccccccceeeeeecccccccchhhhhhhhhhhhcccchhhhhhhhhhhhhhhhhhhhccc
SEQ GSRVVTLCNMNIKSLLNLSIWHPNQSTTRREILSHMQKVALFKLQSYWLPNFYTHTKMTM
PRD cceeeeecccchhhhhhhhhccccccchhhhhhhhhhhhhhhhhhhccccccchhhhhhh
SEQ AKEEACHGLMQEYETRLYSVCYTHIGGLPLNMSIKKCHHFQKRYSSRKAKRKMWQLVDPD
PRD hhhhhhhhhhhhhhhhheeeeeeccccccccccccccccchhhhhhhhhhhhhheeeccc
SEQ SWSLEMDLKPDAIGMPLQETCPQEKVVIQMPSLKMASSKETRISSLEKDMHYAKISSMEN
PRD cccccccccccccccccccccccceeeeeccccccccccccccchhhhhccccchhhhhh
SEQ KAKSHLHMEAPFETKVSTHLRTVIPIVNHSSKMTIQKAIKQSFSLGYIHLALCADACAGN
PRD hhhhhhhccccccccccccceeeeeeeccccchhhhhhhhhhcccccccchhhhhhcccc
SEQ PFRDHLKKLNLKVEIQLLDLWQDLQHFLSVLLNNKKNGNAIFRHLLGDRICELYLNEQIG
PRD ccccchhhhhhhhhhhhhhhhhhhhhhhhhhhhccccccceeeecccchhhhhhhhhhcc
SEQ PCLPLKSQTIQGLKELLPSGDVIPWIPKAQKEICKMLSPWYDEFLDEEDYWFLLFTVGRT
PRD ccccccchhhhhhhhccccccceeecccchhhhhhhcccchhhhhccccceeeccccccc
SEQ LG
PRD cc
(No Prosite data available for DKFZphtes3_1e16.1)
(No Pfam data available for DKFZphtes3_1e16.1)
PROSITE is a database of protein families and domains. It consists of biologically significant sites, patterns and profiles that help to reliably identify to which known protein family (if any) a new sequence belongs. The World Wide Web URL http://www.expasy.ch/prosite/ is the entry point to the database. A description of the PROSITE consensus patterns follows.
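PROSITE consensus patterns use a compact notation: `x` for any residue, square brackets for allowed residues, curly braces for forbidden residues, a count in parentheses for repeats, and `<` or `>` to anchor a pattern at the N- or C-terminus. As a rough illustration of how such a pattern can be applied to a peptide, the sketch below translates that subset of the notation into a Python regular expression. The helper names `prosite_to_regex` and `find_sites` are hypothetical, the translation covers only the constructs just listed, and a match indicates only that the sequence fits the pattern, not that the site is functional.

```python
import re

def prosite_to_regex(pattern):
    """Translate a PROSITE-style consensus pattern into a Python regex string."""
    pattern = pattern.strip().rstrip(".")
    pieces = []
    for token in pattern.split("-"):
        token = token.strip()
        if not token:
            continue
        prefix = suffix = ""
        if token.startswith("<"):      # '<' anchors the pattern at the N-terminus
            prefix, token = "^", token[1:]
        if token.endswith(">"):        # '>' anchors the pattern at the C-terminus
            suffix, token = "$", token[:-1]
        # Separate the element from an optional repeat count such as (2) or (2,3).
        match = re.fullmatch(r"(.+?)(?:\((\d+)(?:,(\d+))?\))?", token)
        core, low, high = match.groups()
        if core == "x":                              # any residue
            body = "."
        elif core[0] == "[" and core[-1] == "]":     # allowed residues
            body = core
        elif core[0] == "{" and core[-1] == "}":     # forbidden residues
            body = "[^" + core[1:-1] + "]"
        else:                                        # a literal residue
            body = re.escape(core)
        if low is not None:
            body += "{" + low + ("," + high if high else "") + "}"
        pieces.append(prefix + body + suffix)
    return "".join(pieces)

def find_sites(pattern, sequence):
    """Return (1-based start, matched text) for every hit, overlapping hits included."""
    regex = re.compile("(?=(" + prosite_to_regex(pattern) + "))")
    return [(m.start() + 1, m.group(1)) for m in regex.finditer(sequence)]

# Residues 80-96 of the DKFZphtes3_31j20 peptide, used here only as a short test string.
fragment = "KTSEEEKNGSEELVEKK"
print(prosite_to_regex("N-{P}-[ST]-{P}"))      # N[^P][ST][^P]
print(find_sites("N-{P}-[ST]-{P}", fragment))  # [(8, 'NGSE')]
```

The lookahead in `find_sites` is a design choice so that overlapping occurrences of short patterns are all reported; positions are relative to the start of the string passed in, not to the full-length protein.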
NAME: N-glycosylation site.
CONSENSUS: N-{P}-[ST]-{P}.
NAME: Glycosaminoglycan attachment site.
CONSENSUS: S-G-x-G.
NAME: Tyrosine sulfation site.
NAME: cAMP- and cGMP-dependent protein kinase phosphorylation site.
CONSENSUS: [RK](2)-x-[ST].
NAME: Protein kinase C phosphorylation site.
CONSENSUS: [ST]-x-[RK].
NAME: Casein kinase II phosphorylation site.
CONSENSUS: [ST]-x(2)-[DE].
NAME: Tyrosine kinase phosphorylation site.
CONSENSUS: [RK]-x(2,3)-[DE]-x(2,3)-Y.
NAME: N-myristoylation site.
CONSENSUS: G-{EDRKHPFYW}-x(2)-[STAGCN]-{P}.
NAME: Amidation site.
CONSENSUS: x-G-[RK]-[RK].
NAME: Aspartic acid and asparagine hydroxylation site.
CONSENSUS: C-x-[DN]-x(4)-[FY]-x-C-x-C.
NAME: Vitamin K-dependent carboxylation domain.
CONSENSUS: x(12)-E-x(3)-E-x-C-x(6)-[DEN]-x-[LIVMFY]-x(9)-[FYW].
NAME: Phosphopantetheine attachment site.
CONSENSUS: [DEQGSTALMKRH]-[LIVMFYSTAC]-[GNQ]-[LIVMFYAG]-[DNEKHS]-S-[LIVMST]-{PCFY}-[STAGCPQLIVMF]-[LIVMATN]-[DENQGTAKRHLM]-[LIVMWSTA]-[LIVGSTACR]-x(2)-[LIVMFA].
NAME: Acyl carrier protein phosphopantetheine domain profile.
NAME: Prokaryotic membrane lipoprotein lipid attachment site.
CONSENSUS: {DERK}(6)-[LIVMFWSTAG](2)-[LIVMFYSTAGCQ]-[AGS]-C.
NAME: Prokaryotic N-terminal methylation site.
CONSENSUS: [KRHEQSTAG]-G-[FYLIVM]-[ST]-[LT]-[LIVP]-E-[LIVMFWSTAG](14).
NAME: Prenyl group binding site (CAAX box).
CONSENSUS: C-{DENQ}-[LIVM]-x>.
NAME: Protein splicing signature.
CONSENSUS: [DNEG]-x-[LIVFA]-[LIVMY]-[LVAST]-H-N-[STC].
NAME: Endoplasmic reticulum targeting sequence.
CONSENSUS: [KRHQSA]-[DENQ]-E-L>.
NAME: Microbodies C-terminal targeting signal.
CONSENSUS: [STAGCN]-[RKH]-[LIVMAFY]>.
NAME: Gram-positive cocci surface proteins 'anchoring' hexapeptide.
CONSENSUS: L-P-x-T-G-[STGAVDE].
NAME: Bipartite nuclear targeting sequence.
NAME: Cell attachment sequence.
CONSENSUS: R-G-D.
NAME: ATP/GTP-binding site motif A (P-loop).
CONSENSUS: [AG]-x(4)-G-K-[ST].
NAME: Cyclic nucleotide-binding domain signature 1.
CONSENSUS: [LIVM]-[VIC]-x(2)-G-[DENQTA]-x-[GAC]-x(2)-[LIVMFY](4)-x(2)-G.
NAME: Cyclic nucleotide-binding domain signature 2.
CONSENSUS: [LIVMF]-G-E-x-[GAS]-[LIVM]-x(5,11)-R-[STAQ]-A-x-[LIVMA]-x-[STACV].
NAME: cAMP/cGMP binding motif.
NAME: EF-hand calcium-binding domain.
CONSENSUS: D-x-[DNS]-{ILVFYW}-[DENSTG]-[DNQGHRK]-{GP}-[LIVMC]-[DENQSTAGC]-x(2)-[DE]-[LIVMFYW].
NAME: Actinin-type actin-binding domain signature 1.
CONSENSUS: [EQ]-x(2)-[ATV]-[FY]-x(2)-W-x-N.
NAME: Actinin-type actin-binding domain signature 2.
CONSENSUS: [LIVM]-x-[SGN]-[LIVM]-[DAGHE]-[SAG]-x-[DNEAG]-[LIVM]-x-[DEAG]-x(4)-[LIVM]-x-[LM]-[SAG]-[LIVM]-[LIVMT]-W-x-[LIVM](2).
NAME: Anaphylatoxin domain signature.
CONSENSUS: [CSH]-C-x(2)-[GAP]-x(7,8)-[GASTDEQR]-C-[GASTDEQL]-x(3,9)-[GASTDEQN]-x(2)-[CE]-x(6,7)-C-C.
NAME: Anaphylatoxin domain profile.
NAME: Apple domain.
CONSENSUS: C-x(3)-[LIVMFY]-x(5)-[LIVMFY]-x(3)-[DENQ]-[LIVMFY]-x(10)-C-x(3)-C-T-x( )-C-x-[LIVMFY]-F-x-[FY]-x(13,14)-C-x-[LIVMFY]-[RK]-x-[ST]-x(14,15)-S-G-x-[ST]-[LIVMFY]-x(2)-C.
NAME: Band 4.1 family domain signature 1.
CONSENSUS: W-[LIV]-x(3)-[KRQ]-x-[LIVM]-x(2)-[QH]-x(0,2)-[LIVMF]-x(6,8)-[LIVMF]-x(3,5)-F-[FY]-x(2)-[DENS].
NAME: Band 4.1 family domain signature 2.
CONSENSUS: [HYW]-x(9)-[DENQSTV]-[SA]-x(3)-[FY]-[LIVM]-x(2)-[ACV]-x(2)-[LM]-x(2)-[FY]-G-x-[DENQST]-[LIVMFYS].
NAME: Band 4.1 family domain profile.
NAME: C1q domain signature.
CONSENSUS: F-x(5)-[ND]-x(4)-[FYWL]-x(6)-F-x(5)-G-x-Y-x-F-x-[FY].
NAME: C-terminal cystine knot signature.
CONSENSUS: C-C-x(13)-C-x(2)-[GN]-x(12)-C-x-C-x(2,4)-C.
NAME: C-terminal cystine knot profile.
NAME: CUB domain profile.
NAME: Death domain profile.
NAME: EGF-like domain signature 1.
CONSENSUS: C-x-C-x(5)-G-x(2)-C.
NAME: EGF-like domain signature 2.
CONSENSUS: C-x-C-x(2)-[GP]-[FYW]-x(4,8)-C.
NAME: Calcium-binding EGF-like domain pattern signature.
CONSENSUS: [DEQN]-x-[DEQN](2)-C-x(3,14)-C-x(3,7)-C-x-[DN]-x(4)-[FY]-x-C.
NAME: Laminin-type EGF-like (LE) domain signature.
CONSENSUS: C-x(1,2)-C-x(5)-G-x(2)-C-x(2)-C-x(3,4)-[FYW]-
NAME: Coagulation factors 5/8 type C domain (FA58C) signature 1.
CONSENSUS: [GAS]-W-x(7,15)-[FYW]-[LIV]-x-[LIVFA]-[GSTDEN]-x(6)-[LIVF]-x(2)-[IV]-x-[LIVT]-[QKM]-G.
NAME: Coagulation factors 5/8 type C domain (FA58C) signature 2.
CONSENSUS: P-x(8,10)-[LM]-R-x-[GE]-[LIVP]-x-G-C.
NAME: Forkhead-associated (FHA) domain profile.
NAME: Fibrinogen beta and gamma chains C-terminal domain signature.
CONSENSUS: W-W-[LIVMFYW]-x(2)-C-x(2)-[GSA]-x(2)-N-G.
NAME: Type I fibronectin domain.
CONSENSUS: C-x(6,8)-[LFY]-x(5)-[FYW]-x-[RK]-x(8,10)-C-x-C-x(6,9)-C.
NAME: Type II fibronectin collagen-binding domain.
CONSENSUS: C-x(2)-P-F-x-[FYWI]-x(7)-C-x(8,10)-W-C-x(4)-[DNSR]-[FYW]-x(3,5)-[FYW]-x-[FYWI]-C.
NAME: Hemopexin domain signature.
CONSENSUS: [LIFAT]-x(3)-W-x(2,3)-[PE]-x(2)-[LIVMFY]-[DENQS]-[STA]-[AV]-[LIVMFY].
NAME: Kringle domain signature.
CONSENSUS: [FY]-C-R-N-P-[DNR].
NAME: Kringle domain profile.
NAME: LDL-receptor class A (LDLRA) domain signature.
CONSENSUS: C-[VILMA]-x(5)-C-[DNH]-x(3)-[DENQHT]-C-x(3,4)-[STADE]-[DEH]-[DE]-x(1,5)-C.
NAME: LDL-receptor class A (LDLRA) domain profile.
NAME: C-type lectin domain signature.
CONSENSUS: C-[LIVMFYATG]-x(5,12)-[WL]-x-[DNSR]-x(2)-C-x(5,6)-[FYWLIVSTA]-[LIVMSTA]-C.
NAME: C-type lectin domain profile.
NAME: Link domain signature-
CONSENSUS: C-x (15) -A-x (3ι4 ) -G-x (3) -C-x (2) -G-x il) -P-χ (7 ) -C - NAME: Osteonectin domain signature 1-
CONSENSUS: C-x-[DN]-x(2)-C-x(2)-G-[KRH]-x-C-x(6,7)-P-x-C-x-C-x(3,5)-C-P.
NAME: Osteonectin domain signature 2.
CONSENSUS: F-P-x-R-[IM]-x-D-W-L-x-[NQ].
NAME: Somatomedin B domain signature.
CONSENSUS: C-x-C-x(3)-C-x(5)-C-C-x-[DN]-[FY]-x(3)-C.
NAME: Thyroglobulin type-1 repeat signature-
CONSENSUS: [FYUHP!-x-P-x-C-x (3ι 4 ) -G-x-[FYU!-x (3) -iQ-C-x (4 ilO)
C-[FYU!-C-V-x(3ι4)- CONSENSUS: CSG!
NAME: P-type 'Trefoil' domain signature.
CONSENSUS: R-x(2)-C-x-[FYPST]-x(3,4)-[ST]-x(3)-C-x(4)-C-C-[FYWH].
NAME-" Cellulose-binding domaini bacterial type-
CONSENSUS: U-N-[STAGR!-[STDN!-[LIVM!-x(2)-[GST!-x-[GST!-x(2)'
CLIVMFT!-CGA! NAME: Cellulose-binding domaini fungal type-
CONSENSUS: C-G-G-x (4 ι7) -G-x (3) -C-x ( 5) -C-x (3ι 5) -CNHG!-x- [FYUM!-x(2)-(Q-C
NAME: Chitin recognition or binding domain signature. CONSENSUS: C-x (4 i 5 ) -C-C-S-x (2) -G-x-C-G-x ( 4 ) -CFYU!-C -
NAME: Barwin domain signature 1.
CONSENSUS: C-G-[KR]-C-L-x-V-x-N.
NAME: Barwin domain signature 2.
CONSENSUS: V-[DN]-Y-[EQ]-F-V-[DN]-C.
NAME: BIR repeat-
CONSENSUS: [HKEPILVY!-x (2 ) -R-x (3ι7) -[FYU!-x (11 ιl4 ) -[STAN!-G- [LMF!-X-[FYHDA!-X(4)-
C NSENSUS-" [DESL!-X(2ι 3) -C-X (2) -C-X (b) -CUA!-X (D-H-X (4 ) - CPRSD!-X-C-X(2)-CLIVMA!-
NAME: UAP-type ' four-disulfide core' domain signature. CONSENSUS: C-x-CO-CDN!-x (2) -C-x (5) -C-C
NAME: Phorbol esters / diacylglycerol binding domain- CONSENSUS: H-x-CLIVMFYU!-x (δill) -C-x (2) -C-x (3) -[LIVMFC!- x(5ιlO)-C-x(2)-C-x(4)-[HD!- CONSENSUS: x (2) -C-x (5ι1 ) -C -
NAME: C2 domain signature-
CONSENSUS: [ACG!-x (2) -L-x (2ι3> -D-x (li 2) -[NGSTLIF!-[GTMR!-x-
[STAP!-D-[PA!-[FY!-
NAME". C2-domain profile- NAME-" CAP-Gly domain signature CONSENSUS: G-x (βilO ) -[FYU!-x-G-[LIVM!-x-[LIVMFY!-x ( 4 ) -G-K-
CNH!-x-G-[STAR!-x(2)-G-
CONSENSUS: x(2)-[LY!-F- NAME: Ly-b / u-PAR domain signature-
CONSENSUS: CE(QR!-C-[LIVMFYAH!-x-C-x ( 5ιδ) -C-x ^^-[EDNOSTVI- C-CO-x ( 5) -C- CONSENSUS". x(12ι24)-C NAME: MAM domain signature-
CONSENSUS: G-x-CLIVMFY! (2) -x (3) -CSTAJ-x (10 ill ) -CLV!-x (4 ) -
[LIVMF!-x(bι7)-C-CLIVM!-x-
CONSENSUS: F-x-[LIVMFY!-x (3) -[GSC! - NAME: MAM domain profile-
NAME: PH domain profile-
NAME: Phosphotyrosine interaction domain (PID) profile-
NAME: Src homology 2 (SH2) domain profile-
NAME: Src homology 3 (SH3) domain profile.
NAME: VWFC domain signature.
CONSENSUS: C-x(2,3)-C-x-C-x(6,14)-C-x(3,4)-C-x(2,10)-C-x(9,16)-C-C-x(2,4)-C.
NAME: WW/rsp5/WWP domain signature.
CONSENSUS: W-x(9,11)-[VFY]-[FYW]-x(6,7)-[GSTNE]-[GSTQCR]-[FYW]-x(2)-P.
NAME: WW/rsp5/WWP domain profile.
NAME: ZP domain signature.
CONSENSUS: CLIVMFYU!-x (7) -[STAPDNL!-x (3) -[LIVMFYU!-x-
[LIVMFYU!-x-[LIVMFYU!-x(2)-C-
CONSENSUS: [LIVMFYU!-x-[ST!-[PSL!-x (2ι4 ) - ENS!-x-[STADN(QLF!- x(fa)-[LIVM!(2)-x(3ι4)- CONSENSUS: C
NAME: S-layer homology domain signature-
CONSENSUS: CLVFYT!-x-[DA!-x (2ι 5) -[DN6SATPHY!-[UYFPDA!-x ( ) - [LIV!-x(2)-[GTALV!- CONSENSUS: x ( ifa )-[LIVFYC!-x (2) -G-x-[PGSTA!-x (2ι3)-[MFYA!-x- [PGAV!-x(3ιlO)-[LIVMA!- CONSENSUS: [STKR!-[RY!-x-[EιQ!-x-[STALIVM! -
NAME: 'Homeobox' domain signature- CONSENSUS: [LIVMFYG!-[ASLVR!-x (2 ) -[LIVMSTACN!-x-[LIVM!-x (4 ) - [LIV!-[RKN<QESTAIY!- CONSENSUS: [LIVFSTNKH!-U-[FYVC!-x-[NDιQTAH!-x (5) -[RKNAIMU! -
NAME: 'Homeobox' domain profile.
NAME." 'Homeobox' antennapedia-type protein signature. CONSENSUS: [LIVMFE!-[FY!-P-U-M-[KR(QTA! . NAME: 'Homeobox' engrailed-type protein signature- CONSENSUS: L-M-A-(Q-G-L-Y-N -
NAME: 'Paired box' domain signature.
CONSENSUS: R-P-C-x(11)-C-V-S.
NAME: 'POU' domain signature 1.
CONSENSUS: [RKQ]-R-[LIM]-x-[LF]-G-[LIVMFY]-x-Q-x-[DNQ]-V-G.
NAME: 'POU' domain signature 2.
CONSENSUS: S-Q-[ST]-[TA]-I-[SC]-R-F-E-x-[LSQ]-x-[LI]-[ST].
NAME: Zinc finger, C2H2 type, domain.
CONSENSUS: C-x(2,4)-C-x(3)-[LIVMFYWC]-x(8)-H-x(3,5)-H.
NAME: Zinc finger, C3HC4 type (RING finger), signature.
CONSENSUS: C-x-H-x-[LIVMFY]-C-x(2)-C-[LIVMYA].
NAME: Nuclear hormones receptors DNA-binding region signature.
CONSENSUS: C-x(2)-C-x-[DE]-x(5)-[HN]-[FY]-x(4)-C-x(2)-C-x(2)-F-F-x-R.
NAME: GATA-type zinc finger domain.
CONSENSUS: C-x-[DN]-C-x(4,5)-[ST]-x(2)-W-[HR]-[RK]-x(3)-[GN]-x(3,4)-C-N-[AS]-C.
NAME: Poly ( ADP-ribose) polymerase zinc finger domain signature. CONSENSUS: C-CKR!-x-C-x (3) -I-x-K-x (3) -[RG!-x (Ifailβ ) -U-[FYH!- H-x(2)-C
NAME-" Poly (ADP-ribose) polymerase zinc finger domain profile ■
NAME: Fungal Zn(2)-Cys(fa) binuclear cluster domain signature .
CONSENSUS: CGASTPV!-C-x (2) -C-CRKHSTACU!-x (2) -[RKH(Q!-x (2) -C- x(5ιl2)-C-x(2)-C-x(faιβ)- CONSENSUS: C
NAME: Fungal Zn(2)-Cys(fa) binuclear cluster domain profile-
NAME-" Prokaryotic dksA/traR C4-type zinc finger- CONSENSUS: C-[DES!-x-C-x (3) -I-x (3) -R-x (4 ) -P-x ( ) -C-x (2) -C -
NAME: Copper-fist domain signature-
CONSENSUS: M-CLIVMF! (3) -x (3 ) -K-CMY!-A-C-x (2) -C-I-CKR!-x-H- CKR!-x(3)-C-x-H-x(β)- CONSENSUS: CKR!-x-[KR!-G-R-P -
NAME: Copper fist DNA binding domain profile-
NAME: Leucine zipper pattern.
CONSENSUS: L-x(6)-L-x(6)-L-x(6)-L.
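The consensus entries in this listing follow the PROSITE pattern notation: elements are separated by "-", "x" stands for any residue, square brackets list the residues allowed at a position, curly braces list excluded residues, and a parenthesised number or range gives a repeat count. As an illustration only, the following minimal Python sketch converts such a pattern into a regular expression so that a protein sequence can be scanned for it. The function name prosite_to_regex and the example peptide are hypothetical and are not part of the original disclosure; the leucine zipper pattern is read here as L-x(6)-L-x(6)-L-x(6)-L.

```python
import re


def prosite_to_regex(pattern: str) -> str:
    """Translate a PROSITE-style consensus pattern into a Python regex string.

    Hypothetical helper for illustration; it handles only the notation
    described above and rejects anything else.
    """
    regex_parts = []
    # Drop the terminating period and split into '-'-separated elements.
    for element in pattern.strip().rstrip(".").split("-"):
        element = element.strip()
        prefix = suffix = ""
        # '<' and '>' anchor a pattern to the N- or C-terminus.
        if element.startswith("<"):
            prefix, element = "^", element[1:]
        if element.endswith(">"):
            suffix, element = "$", element[:-1]
        # A trailing (n) or (n,m) is a repeat count for the element.
        repeat = ""
        counted = re.fullmatch(r"(.+)\((\d+)(?:,(\d+))?\)", element)
        if counted:
            element, low, high = counted.groups()
            repeat = "{%s}" % low if high is None else "{%s,%s}" % (low, high)
        if element == "x":                          # any residue
            core = "."
        elif re.fullmatch(r"\[[A-Z]+\]", element):  # allowed residues
            core = element
        elif re.fullmatch(r"\{[A-Z]+\}", element):  # excluded residues
            core = "[^" + element[1:-1] + "]"
        elif re.fullmatch(r"[A-Z]", element):       # one fixed residue
            core = element
        else:
            raise ValueError("unrecognised pattern element: %r" % element)
        regex_parts.append(prefix + core + repeat + suffix)
    return "".join(regex_parts)


# Usage: scan a made-up peptide for the C2H2 zinc finger pattern.
zinc_finger = re.compile(
    prosite_to_regex("C-x(2,4)-C-x(3)-[LIVMFYWC]-x(8)-H-x(3,5)-H.")
)
hit = zinc_finger.search("MKCPECGKSFSQKSNLQKHQRTH")
print(hit.start() if hit else "no match")
```

A pattern that is split across several CONSENSUS lines would need to be joined into one string before conversion, since a trailing "-" leaves an empty element that the sketch deliberately rejects.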
NAME: bZIP transcription factors basic domain signature- CONSENSUS : [KR!- x ( l ι 3 ) -[RKSAιQ!-N- x ( 2 ) -CSAlQ! ( 2 ) -χ-[RKTAEN<Q!-x-
R- x-[RK! -
NAME: Myb DNA-binding domain repeat signature 1.
CONSENSUS: W-[ST]-x(2)-E-[DE]-x(2)-[LIV].
NAME: Myb DNA-binding domain repeat signature 2.
CONSENSUS: W-x(2)-[LI]-[SAG]-x(4,5)-R-x(8)-[YW]-x(3)-[LIVM].
NAME: Myc-type, 'helix-loop-helix' dimerization domain signature.
CONSENSUS: [DEMSTAP!-K-[LIVMUAGSN!--CFYUCPHKR>-[LIVT!-[LIV!- x(2)-[STAV!-[LIVMSTAC!-x- CONSENSUS: [VMFYH!-[LIVMTA!--CP>--CP>-[LIVMSR! -
NAME: p53 tumor antigen signature.
CONSENSUS: M-C-N-S-S-C-M-G-G-M-N-R-R.
NAME: CBF-A/NF-YB subunit signature.
CONSENSUS: C-V-S-E-x-I-S-F-[LIVM]-T-[SG]-E-A-[SC]-[DE]-[KRQ]-C.
NAME: CBF-B/NF-YA subunit signature.
CONSENSUS: Y-V-N-A-K-Q-Y-x-R-I-L-K-R-R-x-A-R-A-K-L-E.
NAME: 'Cold-shock' DNA-binding domain signature- CONSENSUS: [FY!-G-F-I-x (bι7) - ER!-[LIVM!-F-x-H-x-[STKR!-x- [LIVMFY!- NAME: CTF/NF-I signature-
CONSENSUS: R-K-R-K-Y-F-K-K-H-E-K-R.
NAME: Ets-domain signature 1.
CONSENSUS: L-[FYW]-[QEDH]-F-[LI]-[LVQK]-x-[LI]-L.
NAME: Ets-domain signature 2.
CONSENSUS: [RKH]-x(2)-M-x-Y-[DENQ]-x-[LIVM]-[STAG]-R-[STAG]-[LI]-R-x-Y.
NAME: Ets-domain profile.
NAME: Fork head domain signature 1.
CONSENSUS: [KR]-P-[PTQ]-[FYLVQH]-S-[FY]-x(2)-[LIVM]-x(3,4)-[AC]-[LIM].
NAME: Fork head domain signature 2.
CONSENSUS: W-[QKR]-[NS]-S-[LIV]-R-H.
NAME: Fork head domain profile.
NAME: HSF-type DNA-binding domain signature.
CONSENSUS: L-x(3)-[FY]-K-H-x-N-x-[STAN]-S-F-[LIVM]-R-Q-L-[NH]-x-Y-x-[FYW]-[RKH]-K-[LIVM].
NAME: Tryptophan pentad repeat (IRF family) signature- CONSENSUS: U-x- NH!-x (5) -[LIVF!-x-[IV!-P-U-x-H-x (lilO ) -[DE!- x(2)-[LIVF!-F-[KR(Q!-x- CONSENSUS : [UR!-A -
NAME: LIM domain signature-
CONSENSUS: C-x (2 ) -C-x (15-ιΞl) -CFYUH!-H-x (2) -[CH!-x (2) -C-x (2) - C-x(3)-CLIVMF!-
NAME: LIM domain profile-
NAME: NF-kappa-B/Rel/dorsal domain signature.
CONSENSUS: F-R-Y-x-C-E-G.
NAME: MADS-box domain signature-
CONSENSUS: R-x-CRK!-x ( 5) -I-x- N!-x (3) -[KR!-x (2) -T-[FY!-x- [RK!(3)-x(2)-[LIVM!-x- CONSENSUS: K (2 ) -A-x-E-CLIVM!-CST!-x-L-x ( 4 ) -CLIVM!-x- CLIVM!(3)-x(fa)-[LIVMF!-x(2)- CONSENSUS: CFY!-
NAME: MADS-box domain profile-
NAME: T-box domain signature 1-
CONSENSUS: L-U-x (2) -CFC!-x (3ι ) -[NT!-E-M-[LIV! (2) -T-x (2 ) -G-
[RG!-[KR(Q!- NAME: T-box domain signature 2-
CONSENSUS: [LIVMYU!-H-[PADH!- EN!-[GS!-x (3) -G-x (2) -U-M-x (3) - [IVA!-x-F-
NAME: TEA domain signature- CONSENSUS: G-R-N-E-L-I-x (2) -Y-I-x (3) -CTC!-x (3) -R-T-[RK! (2 ) -(Q- [LIVM!-S-S-H-[LIVM!- CONSENSUS: (Q-V-
NAME: Transcription factor TFIIB repeat signature- CONSENSUS: G-[KR!-x (3) -[STAGN!-x-[LIVMYA!-[GSTA! (2 ) -[CSAV!- [LIVM!-[LIVMFY!-[LIVMA!- CONSENSUS: [GSA!-[STAC! -
NAME: Transcription factor TFIID repeat signature- CONSENSUS: Y-x-P-x (2) -CIF!-x (2 ) -CLIVM! (2) -x-[KRH!-x ( 3) -P- [RK(Q!-x(3)-L-[LIVM!-F-x-
CONSENSUS: [STN!-G-[KR!-[LIVM!-x (3) -G-[TAGL!-[KR!-x ( 7 ) -[AGC!- x(7)-CLIVM!- NAME: TFIIS zinc ribbon domain signature-
CONSENSUS: C-x (2 ) -C-x (1 ) -CLIVM(QSAR!-[(QH!-[STιQL!-[RA!-[SACR!- x-[DE!-[DET!-[PGSEA!-
CONSENSUS: x (fa ) -C-x (2ι5) -C-x (3) -CFU! - NAME: TSC-22 / dip / bun family signature-
CONSENSUS: M-D-L-V-K-x-H-L-x(2)-A-V-R-E-E-V-E.
NAME: Prokaryotic transcription elongation factors signature 1- CONSENSUS: CST!-x ( 2) -[GS!-x (3) -[LI!-x (2) -E-L-x (2) -L-x (3ι 4 ) -R- x(2)-CIV!-x(3)-[LIV!- CONSENSUS: x (fa ) -G-D-x (2) -E-N-CGSA!-x-Y - NAME: Prokaryotic transcription elongation factors signature
2-
CONSENSUS: S-x (2) -S-P-[LIVM!-[AG!-x-[SAG!-[LIVM!-[LIVMY!- x(4)-[DG!-[DE!-
NAME: DEAD-box subfamily ATP-dependent helicases signature.
CONSENSUS: [LIVMF](2)-D-E-A-D-[RKEN]-x-[LIVMFYGSTN].
NAME: DEAH-box subfamily ATP-dependent helicases signature.
CONSENSUS: [GSAH]-x-[LIVMF](3)-D-E-[ALIV]-H-[NECR].
NAME: Eukaryotic putative RNA-binding region RNP-1 signature.
CONSENSUS: [RK]-G-{EDRKHPCG}-[AGSCI]-[FY]-[LIVA]-x-[FYLM].
NAME: Fibrillarin signature.
CONSENSUS: [GST]-[LIVMAP]-V-Y-A-[IV]-E-[FY]-[SA]-x-R-x(2)-R-[DE].
NAME: MCM family signature.
CONSENSUS: G-[IVT]-[LVAC](2)-[IVT]-D-[DE]-[FL]-[DNST].
NAME: MCM family domain.
NAME: XPA protein signature 1.
CONSENSUS: C-x-[DE]-C-x(3)-[LIVMF]-x(1,2)-D-x(2)-L-x(3)-F-x(4)-C-x(2)-C.
NAME: XPA protein signature 2.
CONSENSUS: [LIVM](2)-T-[KR]-T-E-x-K-x-[DE]-Y-[LIVMF](2)-x-D-x-[DE].
NAME: XPG protein signature 1.
CONSENSUS: [VI]-[KRE]-P-x-[FYIL]-V-F-D-G-x(2)-[PIL]-x-[LVC]-K.
NAME: XPG protein signature 2.
CONSENSUS: [GS]-[LIVM]-[PER]-[FYS]-[LIVM]-x-A-P-x-E-A-[DE]-[PAS]-[QS]-[CLM].
NAME: Bacterial regulatory proteinsi araC family signature
CONSENSUS: [KR(Q!-[LIVMA!-x (2) -[GSTALIV!--CFYUPGDN>-x (2) -
[LIVMSA!-x( ι1)-[LIVMF!-
CONSENSUS: x (2) -[LIVMSTA!-[GSTACIL!-x (3) -[GAN(QRF!-[LIVMFY!-
CONSENSUS: CFYIVAI--CFYUHCMJ-X (3) -[GSADEN(QKR!-x-[NSTAPKL!-
[PARL!.
NAME: Bacterial regulatory proteinsi araC family DNA-binding domain profile-
NAME: Bacterial regulatory proteinsi arsR family signature. CONSENSUS: C-x (2) -D-[LIVM!-x (fa ) -[ST!-x (4 ) -S-[HYR!-[H(Q! ■ NAME: Bacterial regulatory proteinsi asnC family signature. CONSENSUS: [GSTAP!-x (2) - NEA!-[LIVM!-[GSA!-x (2) -[LIVMFY!- [GN!-[LIVMST!-[ST!-x(b)-R- CONSENSUS: [LVT!- (2)-[LIVM!-x (3) -G. NAME: Bacterial regulatory proteinsi crp family signature- CONSENSUS: [LIVM!-[STAG!-[RHNU!-x (2) -[LIM!-[GA!-x-[LIVMFYA!- [LIVSC!-[GA!-x-[STACN!- CONSENSUS: x (2) -[MST!-x-[GSTN!-R-x-[LIVMF!-x (2) -[LIVMF! -
NAME: Bacterial regulatory proteinsi deoR family signature. CONSENSUS: R-x (3) -CLIVM!-x (3) -[LIVM!-x (lbιl7) -CSTA!-x (2) -T- [LIVMA!-[RH!-[KRNA!-D- CONSENSUS: [LIVMF!-
NAME: Bacterial regulatory proteinsi gntR family signature. CONSENSUS: [LIVAPKR!-[PILV!-x-[E(QTIVMR!-x (Ξ) -[LIVM!-x (3) - [LIVMFYK!-x-[LIVFT!- CONSENSUS: NGSTK!-[RGTLV!-x-[STAIVP!-[LIVA!-x (2) -[STAGV!- [LIVMFYH!-x(2)-[LMA!>
NAME: Bacterial regulatory proteinsi iclR family signature CONSENSUS: [GA!-x (3) - S!-x (2) -E-x (fa) -[CSA!-[LIVM!-[GSA!- x(2)-[LIVM!-[FYH!-[DN!-
NAME: Bacterial regulatory proteinsi lad family signature CONSENSUS: [LIVM!-x-[DE!-[LIVM!-A-x (2) -[STAGV!-x-V-[GSTP!- x(2)-CSTAG!-[LIVMA!-x(2)- CONSENSUS: [LIVMFYAN!-[LIVMC! .
NAME: Bacterial regulatory proteinsi luxR family signature. CONSENSUS: [GDC!-x (2) -[NSTAVY!-x (2)- V!-[GSTA!-x (2) - [LIVMFYUCT!-x-[LIVMFYUCR!-x(3)- CONSENSUS: [NST!-[LIVM!-x (5) -CNRHSA!-[LIVMSTA!-x (2) -[KR! .
NAME: Bacterial regulatory proteins, lysR family signature.
CONSENSUS: [NQKRHSTAG]-[LIVMFYTA]-x(2)-[STAGLV]-[STAG]-x(4)-[LIVMYCTQR]-[PSTANLVER]-
CONSENSUS: x-[PSTAGQV]-[PSTAGNVMF]-[LIVMFA]-[STAGH]-x(2)-[LIVMF]-x(2)-[LIVMFW]-
CONSENSUS: [RKEAV]-x(2)-[LIVMFYNTAE]-x(3)-[LIMVT].
NAME: Bacterial regulatory proteins, marR family signature.
CONSENSUS: [STNA]-[LIA]-x-[RNGS]-x(4)-[LM]-[EIV]-x(2)-[GES]-[LFYW]-[LIVC]-x(7)-
CONSENSUS: [DN]-[RKQG]-[RK]-x(6)-T-x(2)-[GA].
NAME: Bacterial regulatory proteinsi merR family signature. CONSENSUS: [GSA!-x-[LIVMFA!-[ASM!-x (2) -[STACLIV!-[GSDEN(QR!- [LIVC!-[STANHK!-x(3)-
CONSENSUS: [LIVM!-[RHF!-x-[YU!- EtQ!-x (2ι3) -[GHDNlQ!- [LIVMF!(2) . NAME-" Bacterial regulatory proteinsi tetR family signature- CONSENSUS: G-[LIVMFYS!-x (2ι3) -[TS!-CLIVMT!-x (2) -[LIVM!-x (5) - [LIV(QS!-[STAGEN(QH!-x-
CONSENSUS: [GPAR!-x-[LIVMF!-[FYST!-x-[HFY!-[FV!-x- NST!-K- x(2)-[LIVM!>
NAME: Transcriptional antiterminators bglG family signature.
CONSENSUS: [ST]-x-H-x(2)-[FA](2)-[LIVM]-[EQK]-R-x(2)-[QNK].
NAME: Sigma-54 factors family signature 1.
CONSENSUS: P-CLIVM!-x-CLIVM!-x ( ) -[LIVM!-A-x (2) -CLIVMF!-x (2 ) -
[HS!-x-S-T-[LIVM!-S-R. NAME: Sigma-54 factors family signature 2- CONSENSUS: R-R-T-[IV!-[AT!-K-Y-R •
NAME: Sigma-54 factors family profile- NAME: Sigma-70 factors family signature 1-
CONSENSUS: [DE!-[LIVMF! (2) -[HE(QS!-x-G-x-[LIVMFA!-G-L- [LIVMFYE!-x-[GSAM!-[LIVMAP!-
NAME: Sigma-70 factors family signature 2- CONSENSUS: [STN!-x (2) -[DE(Q!-[LIVM!-CGAS!-x ( 4 ) -[LIVMF!-[PSTG!- x(3)-[LIVMA!-x-[N(QR!- CONSENSUS: CLIVMA!-[E(QH!-x (3) -[LIVMFU!-x (2) -[LIVM! •
NAME: Sigma-7D factors ECF subfamily signature- CONSENSUS: [STAIV!-[P(QDEL!- E!-[LIV!-[LIVTA!-(Q-x-[STAV!- [LIVMFYC!-[LIVMAK!-x-
CONSENSUS: [GSTAIV!-[LIMFYU(Q!-x (12ιl4 ) -[STAP!-[FYU!-[LIF!- x(2)-[IV!- NAME: Sigma-54 interaction domain ATP-binding region A signature-
CONSENSUS: CLIVMFY! (3) -x-G-CDE(Q!-[STE!-G-[STAV!-G-K-x (2) - [LIVMFY!- NAME: Sigma-54 interaction domain ATP-binding region B signature.
CONSENSUS: [GS!-x-[LIVMF!-x (2 ) -A-[DNE(QASH!-[GNEK!-G-[STIM!-
CLIVMFY!(3)-[DE!-[E -
CONSENSUS: [LIVM!-
NAME: Sigma-54 interaction domain C-terminal part signature.
CONSENSUS: [FYW]-P-[GS]-N-[LIVM]-R-[EQ]-L-x-[NHAT].
NAME: Sigma-54 interaction domain profile-
NAME: Single-strand binding protein family signature 1- CONSENSUS: [LIVMF!-[NST!-[KRT!-[LIVM!-x-[LIVMF! (2) -G- HR - [LIVM!-[GST!-x-[DET!. NAME: Single-strand binding protein family signature 2- CONSENSUS: T-x-U-[HY!-[RNS!-[LIVM!-x-[LIVMF!-[FY!-[NGKR! .
NAME: Bacterial histone-like DNA-binding proteins signature. CONSENSUS: [GS -F-x (2 ) -[LIVMF!-x ( 4 ) -[RKEιQA!-x (2) -[RST!-x- [GA!-x-[KN!-P-x-T.
NAME: Dps protein family signature 1-
CONSENSUS: H-CFU!-x-CLIVM!-x-G-x (5) -[LV!-H-x (3) -OE! - NAME: Dps protein family signature 2-
CONSENSUS: [LIVMFY!-[DH!-x-[LIVM!-[GA!-E-R-x (3) -[LIF!-[GDN!- x(2)-[PA!. NAME: DNA repair protein radC family signature- CONSENSUS: H-N-H-P-S-G-
NAME: recA signature.
CONSENSUS: A-L-[KR]-[IF]-[FY]-[STA]-[STAD]-[LIVMQ]-R.
NAME: RecF protein signature 1-
CONSENSUS: P-[ED!-x (3) -[LIVM! ( ) -x-G-[GSAD!-P-x (2) -R-R-x-
[FY!-[LIVM!-D-
NAME: RecF protein signature 2-
CONSENSUS: [LIVMFY!(2)-x-D-x(2ι3)-[SA!-[EH!-L-D-x(2)-[KRH!- x(3)-L- NAME: RecR protein signature-
CONSENSUS: C-x(2)-C-x(3)-[ST]-x(4)-C-x-I-C-x(4)-R.
NAME: Histone H2A signature.
CONSENSUS: [AC]-G-L-x-F-P-V.
NAME: Histone H2B signature.
CONSENSUS: [KR]-E-[LIVM]-[EQ]-T-x(2)-[KR]-x-[LIVM](2)-x-[PAG]-[DE]-L-x-[KR]-H-A-
CONSENSUS: [LIVM]-[STA]-E-G.
NAME: Histone H3 signature 1.
CONSENSUS: K-A-P-R-K-Q-L.
NAME: Histone H3 signature 2.
CONSENSUS: P-F-x-[RA]-L-[VA]-[KRQ]-[DEG]-[IV].
NAME: Histone H4 signature.
CONSENSUS: G-A-K-R-H.
NAME: HMG1/2 signature.
CONSENSUS: [FI]-S-[KR]-K-C-S-[EK]-R-W-K-T-M.
NAME: HMG-I and HMG-Y DNA-binding domain (A+T-hook).
CONSENSUS: [AT]-x(1,2)-[RK](2)-[GP]-R-G-R-P-[RK]-x.
NAME: HMG14 and HMG17 signature.
CONSENSUS: R-R-S-A-R-L-S-A-[RK]-P.
NAME: Bromodomain signature- CONSENSUS: [STANVF!-x (2)-F-x (4 )-[DNS!-x (5ι 7) - EN(QTF!-Y- CHFY!-x(2)-[LIVMFY!-x(3)-
CONSENSUS: CLIVM!-x (4 ) -[LIVM!-x (faiδ) -Y-x (12ιl3 ) -[LIVM!-x (2) N-[SACF!-x(2)-[FY!. NAME: Bromodomain profile
NAME: Chromo domain signature-
CONSENSUS: CFYL!-x-CLIVMC!-[KR!-U-x-[GDNR!-[FYULE!-x (5ιfa)
CST!-U-[ES!-CPSTDN!-x(3)-
CONSENSUS: [LIVMC!-
NAME: Chromo and chromo shadow domain profile NAME: Regulator of chromosome condensation (RCCl) signature
1-
CONSENSUS: G-x-N-D-x (2 ) -[AV!-L-G-R-x-T - NAME: Regulator of chromosome condensation (RCCl) signature 2-
CONSENSUS: CLIVMFA!-[STAGC! (2) -G-x (2)-H-[STAGLI!-[LIVMFA!-x- [LIVM!- NAME: Protamine PI signature-
CONSENSUS: [AV!-R-CNFY!-R-x (2ι3) -[ST!-x-S-x-S-
NAME: Nuclear transition protein 1 signature.
CONSENSUS: S-K-R-K-Y-R-K.
NAME: Nuclear transition protein 2 signature 1.
CONSENSUS: H-x(3)-H-S-[NS]-S-x-P-Q-S.
NAME: Nuclear transition protein 2 signature 2.
CONSENSUS: K-x-R-K-x(2)-E-G-K-x(2)-K-[KR]-K.
NAME: Ribosomal protein L1 signature.
CONSENSUS: [IM]-x(2)-[LIVA]-x(2,3)-[LIVM]-G-x(2)-[LMS]-[GSNH]-[PTKR]-[KRAV]-G-x-
CONSENSUS: [LMF]-P-[DENSTK].
NAME: Ribosomal protein L2 signature-
CONSENSUS: P-x (2 ) -R-G-[STAIV! (2 ) -x-N-[APK!-x-[DE! - NAME: Ribosomal protein L3 signature-
CONSENSUS: CFL!-x (fa) - N!-x (2) -[AGS!-x-[ST!-x-G-[KRH!-G-x (2) - G-x(3)-R-
NAME: Ribosomal protein L5 signature- CONSENSUS: [LIVM!-x (2) -[LIVM!-[STAC!-[GE!-[(QV!-x (2) -[LIVMA!- x-[STC!-x-[STAG!-[KR!- CONSENSUS: x-[STA!-
NAME: Ribosomal protein L6 signature 1.
CONSENSUS: [PS]-[DENS]-x-Y-K-[GA]-K-G-[LIVM].
NAME: Ribosomal protein L6 signature 2.
CONSENSUS: Q-x(3)-[LIVM]-x(2)-[KR]-x(2)-R-x-F-x-D-G-[LIVM]-Y-[LIVM]-x(2)-[KR].
NAME: Ribosomal protein LI signature-
CONSENSUS: G-x (2) -CGN!-x ( 4 ) -V-x (2) -G-CFY!-x (2) -N-CFY!-L-x (5) -
[GA!-x(3)-[STN!- NAME: Ribosomal protein LID signature-
CONSENSUS: CDEH!-x (2 ) -CGS!-[LIVMF!-[STN!-[VA!-x-[DEιQK!- [LIVMA!-x(2)-[LIM!-R-
NAME: Ribosomal protein Lll signature- CONSENSUS: [RKN!-x-[LIVM!-x-G-[ST!-x (2 ) -[SNιQ!-[LIVM!-G-x (2) - [LIVM!-x(Oιl)-[DENG!-
NAME: Ribosomal protein L13 signature- CONSENSUS: [LIVM!-[KRV!-[GK!-M-CLIV!-[PS!-x (4 -, 5) -C6S3-
[N(QEKRA!-x(5)-[LIVM!-x-[AIV!-
CONSENSUS: [LFY!-x-[GDN! - NAME: Ribosomal protein L14 signature-
CONSENSUS: CGA!-[LIV!(3 ) -x (lilO ) -CDNS!-G-x (4 ) -CFY!-x (2) -[NT!- x(2)-V-CLIV!-
NAME: Ribosomal protein L15 signature- CONSENSUS: K-CLIVM! (2) -[GAL!-x-[GT!-x-[LIVMA!-x (2ι5) -[LIVM!- x-CLIVMF!-x(3ι4)- CONSENSUS: CLIVMFC!-CST!-x (2) -A-x (3) -CLIVM!- (3) -G -
NAME: Ribosomal protein L16 signature 1.
CONSENSUS: [KR]-R-x-[GSAC]-[KQVA]-[LIVM]-W-[LIVM]-[KR]-[LIVM]-[LFY]-[AP].
NAME: Ribosomal protein L16 signature 2.
CONSENSUS: R-M-G-x-[GR]-K-G-x(4)-[FWKR].
NAME: Ribosomal protein L17 signature.
CONSENSUS: I-x-[ST]-[GT]-x(2)-[KR]-x-K-x(6)-[DE]-x-[LIMV]-[LIVMT]-T-x-[STAG]-[KR].
NAME: Ribosomal protein L19 signature.
CONSENSUS: [RT]-[KRSVY]-[GSA]-x-V-[RS]-[KR]-[SA]-K-L-Y-Y-L-R.
NAME: Ribosomal protein L20 signature.
CONSENSUS: K-x(3)-[KRC]-x-[LIVM]-W-[IV]-[STNALV]-R-[LIVM]-N-x(3)-[RKH].
NAME: Ribosomal protein L21 signature-
CONSENSUS: [IVT!-x (3) -[KR!-x (3) -[KR(Q!-K-x (fa ) -G-[HF!-R-[RιQ!- x(2)-T-
NAME: Ribosomal protein L22 signature-
CONSENSUS: [RK(QN!-x ( 4 ) -[RH!-[GAS!-x-G-[KR(QS!-x (1)-[HDN!-
[LIVM!-x-[LIVMS!-x-[LIVM!- NAME: Ribosomal protein L23 signature-
CONSENSUS: [RK! (2 ) -[AM!-[IVFYT!-[IV!-[RKT!-L-[STAN(QK!-x (7) - [LIVMFT!-
NAME: Ribosomal protein L24 signature- CONSENSUS: [GDEN!-D-x-V-x-[IV!-[LIVMA!-x-G-x (2 ) -[KA!-[GN!-
NAME: Ribosomal protein L27 signature.
CONSENSUS: G-x-[LIVM](2)-x-R-Q-R-G-x(5)-G.
NAME: Ribosomal protein L29 signature.
CONSENSUS: [KNQS]-[PSTL]-x(2)-[LIMFA]-[KRGSAN]-x-[LIVYSTA]-[KR]-[KRH]-[DESTANRL]-
CONSENSUS: [LIV]-A-[KRCQVT]-[LIVMA].
NAME: Ribosomal protein L30 signature-
CONSENSUS: [IVT!-[LIVM!-x (2) -[LF!-x-[LI!-x-[KRH(QEG!-x (2) -
CSTN(QH!-x-[IVT!- CONSENSUS : x ( 10 ) -CLMS!-[LIV!-x ( 2 ) -[LIVA!-x ( 2 ) -CLMFY!-[IVT! •
NAME: Ribosomal protein L31 signature- CONSENSUS: H-P-F-[FY!-[TI!-x (1 ) -G-R-[AV!-x-[KR! -
NAME: Ribosomal protein L33 signature-
CONSENSUS: Y-x-[ST!-x-[KR!-[NS!-x ( ) -[PAT!-x (Iι2)-[LIVM!-
[EA!-x(2)-K-[FY!-[CSD!- NAME: Ribosomal protein L34 signature-
CONSENSUS: K-[RG!-T-[FYUL!-[EιQS!-x (5) -[KRHS!-x ( 4 ι5) -G-F-x (2) - R-
NAME: Ribosomal protein L35 signature.
CONSENSUS: [LIVM]-K-[TV]-x(2)-[GSA]-[SAIL]-x-K-R-[LIVMFY]-[KRL].
NAME: Ribosomal protein L36 signature.
CONSENSUS: C-x(2)-C-x(2)-[LIVM]-x-R-x(3)-[LIVMN]-x-[LIVM]-x-C-x(3,4)-[KR]-H-x-Q-x-Q.
NAME: Ribosomal protein L1e signature.
CONSENSUS: N-x(3)-[KR]-x(2)-A-[LIVT]-x-S-A-[LIV]-x-A-[ST]-[SGA]-x(7)-[RK]-G-H.
NAME: Ribosomal protein L6e signature.
CONSENSUS: N-x(2)-P-L-R-R-x( )-[FY]-V-I-A-T-S-x-K.
NAME: Ribosomal protein L7Ae signature.
CONSENSUS: [CA]-x(4)-[IV]-P-[FY]-x(2)-[LIVM]-x-[GSQ]-[KRQ]-x(2)-L-G.
NAME: Ribosomal protein L10e signature.
CONSENSUS: R-x-A-[FYW]-G-K-[PA]-x-G-x(2)-A-R-V.
NAME: Ribosomal protein L13e signature.
CONSENSUS: [KR]-Y-x(2)-K-[LIVM]-R-[STA]-G-[KR]-G-F-[ST]-L-x-E.
NAME: Ribosomal protein L15e signature.
CONSENSUS: [DE]-[KR]-A-R-x-L-G-[FY]-x-[SAP]-x(2)-G-[LIVMFY](4)-R-x-R-V-x-R-G.
NAME: Ribosomal protein L18e signature- CONSENSUS: [KRE!-x-L-x (2) -[PS!-[KR!-x (2) -[RH!-[PSA!-x-[LIVM!- [NS!-[LIVM!-x-[RK!- CONSENSUS: [LIVM!-
NAME: Ribosomal protein Llle signature- CONSENSUS: R-x-CKR!-x (5) -CKR!-x (3) -[KRH!-x (2) -G-x-G-x-R-x-G- x(3)-A-R-x(3)-CK(Q!- CONSENSUS: x (2) -U-x (7) -R-x (2) -L-x (3) -R -
NAME: Ribosomal protein L21e signature- CONSENSUS: G-CDE!-x-V-x (10) -[GV!-x (2) -[FYH!-x (2) -[FY!-x-G-x- T-G-
NAME: Ribosomal protein L24e signature- CONSENSUS : CFY!-x-[GS!-x ( 2 ) -CIV!-x-P-G-x-G-x ( 2 ) -CFYV!-x-
[KRHE!-x-D -
NAME: Ribosomal protein L27e signature. CONSENSUS:
Figure imgf000408_0001
.
NAME: Ribosomal protein L3De signature 1-
CONSENSUS: CSTA!-x ( 5) -G-x-[(QKR!-x (2 ) -[LIVM!-[K(QT!-x (2) -[KR!- x-G-x(2)-K-x-CLIVM!(3) •
NAME: Ribosomal protein L30e signature 2-
CONSENSUS: CDE!-L-G-[STA!-x (2) -G-[KR!-x (fa ) -[LIVM!-x-[LIVM!-x-
[DEN!-x-G. NAME: Ribosomal protein L31e signature-
CONSENSUS: V-[KR!-[LIVM!-x (3) -[LIVM!-N-x-[AK!-x-U-x-[KR!-G -
NAME: Ribosomal protein L32e signature-
CONSENSUS: F-x-R-x (4 ) -[KR!-x (2 ) -[KR!-[LIVM!-x (3)-U-R-[KR!- x(2)-G-
NAME: Ribosomal protein L34e signature- CONSENSUS: Y-x-[ST!-x-S-[NY!-x (5) -[KR!-T-P-G - NAME: Ribosomal protein L35Ae signature-
CONSENSUS: G-K-[LIVM!-x-R-x-H-G-x (2) -G-x-V-x-A-x-F-x (3 ) -[LI!- P-
NAME: Ribosomal protein L36e signature.
CONSENSUS: P-Y-E-[KR]-R-x-[LIVM]-[DE]-[LIVM](2)-[KR].
NAME: Ribosomal protein L37e signature.
CONSENSUS: G-T-x-[SA]-x-G-x-[KR]-x(3)-[ST]-x(0,1)-H-x(2)-C-x-R-C-G.
NAME: Ribosomal protein L39e signature.
CONSENSUS: [KRA]-T-x(3)-[LIVM]-[KRQF]-x-[NHS]-x(3)-R-[NHY]-W-R-R.
NAME: Ribosomal protein L44e signature.
CONSENSUS: K-x-[TV]-K-K-x(2)-L-[KR]-x(2)-C.
NAME: Ribosomal protein S2 signature 1-
CONSENSUS: [LIVMFA!-x (2) -[LIVMFYC! (2) -x-[STAC!-[GSTAN(QEKR!- [STALV!-[HY!-[LIVMF!-G.
NAME: Ribosomal protein S2 signature 2-
CONSENSUS: P-x (2) -[LIVMF! (2) -[LIVMS!-x-[GDN!-x (3)-[DENL!- x(3)-CLIVM!-x-E-x(4)- CONSENSUS: CGN(QKRH!-[LIVM!-[AP! •
NAME: Ribosomal protein S3 signature-
CONSENSUS: CGSTA!-[KR!-x (fa) -G-x-[LIVMT!-x (2) -[N<QSCH!-x (lι3 ) - [LIVFCA!-x(3)-[LIV!- CONSENSUS: CDENιQ!-x (7 ) -[LMT!-x (2) -G-x (2) -G -
NAME: Ribosomal protein S4 signature- CONSENSUS: [LIVM!-[DE!-x-R-L-x (3) -CLIVMC!-[VMFYH(Q!-[KRT!- x(3)-[STAGCF!-x-[ST!-x(3)-
CONSENSUS". [SAI!-[KR!-x-[LIVMF!(2) .
NAME: Ribosomal protein S5 signature-
CONSENSUS: G-[KR(Q!-x (3) -[FY!-x-[ACV!-x (2) -[LIVMA!-[LIVM!-
[AG!-[DN!-x(2)-G-x-
CONSENSUS: [LIVM!-G-x-[SAG!-x ( 5ιb) -[DE(Q!-[LIVM!-x (2) -A-
[LIVMF!-
NAME: Ribosomal protein S6 signature.
CONSENSUS: G-x-[KRC]-[DENQRH]-L-[SA]-Y-x-I-[KRNSA].
NAME: Ribosomal protein S7 signature- CONSENSUS: [DENSK!-x-[LIVMET!-x (3) -[LIVMFT! (2 ) -x (fa ) -G-K-[KR!- x(5)-[LIVMF!-[LIVMFC!- CONSENSUS: x(2)-[STA!.
NAME: Ribosomal protein S8 signature- CONSENSUS: [GE!-x (2) -[LIV! (2) -[STY!-T-x (2 ) -G-[LIVM! (2) -x (4 ) - [AG!-[KRHAYI!-
NAME: Ribosomal protein SI signature-
CONSENSUS: G-G-G-x (2 ) -CGSA!-(Q-x (2) -CSA!-x (3) -[GSA!-x-[GSTAV!- [KR!-[GSAL!-[LIF!-
NAME: Ribosomal protein SID signature-
CONSENSUS: [AV!-x (3) -[GDNSR!-[LIVMSTA!-x (3) -G-P-[LIVM!-x-
[LIVM!-P-T-
NAME: Ribosomal protein Sll signature-
CONSENSUS: [LIVMF!-x-[GSTAC!-[LIVMF!-x (2) -[GSTAL!-x (Oil)
[GSN!-[LIVMF!-x-[LIVM!-
CONSENSUS: x ( 4 ) -[DEN!-x-T-P-x-[PA!-[STCH!-[DN! -
NAME: Ribosomal protein S12 signature.
CONSENSUS: [RK]-x-P-N-S-[AR]-x-R.
NAME: Ribosomal protein S13 signature.
CONSENSUS: [KRQS]-G-x-R-H-x(2)-[GSNH]-x(2)-[LIVMC]-R-G-Q.
NAME: Ribosomal protein S14 signature-
CONSENSUS: [RP!-x (Oil) -C-x (Hi 12 ) -[LIVMF!-x-[LIVMF!-[SC!-
[RG!-x(3)-[RN!-
NAME: Ribosomal protein S15 signature-
CONSENSUS: CLIVM!-x (2 ) -H-[LIVMFY!-x (5) -D-x (2) -[SAGN!-x (3 ) -
[LF!-x(1)-[LIVM!-x(2)~
CONSENSUS: CFY!-
NAME: Ribosomal protein S16 signature.
CONSENSUS: [LIVMT]-x-[LIVM]-[KR]-L-[STAK]-R-x-G-[AKR].
NAME: Ribosomal protein S17 signature.
CONSENSUS: G-D-x-[LIV]-x-[LIVA]-x-[QEK]-x-[RK]-P-[LIV]-S.
NAME: Ribosomal protein Slβ signature- CONSENSUS: CIV!-[DY!-Y-x (2) -CLIVMT!-x (2 ) -CLIVM!-x (2) -[FYT!-
[LIVM!-[ST!-[DERP!-x-
CONSENSUS: [GY!-K-[LIVM!-x (3) -R-[LIVMAS! - NAME: Ribosomal protein S11 signature-
CONSENSUS: [STDNιQ!-G-[KR(QM!-χ (fa) -[LIVM!-x ( 4 ) -[LIVM!-[GSD!- x(2)-[LF!-[GAS!-[DE!-F-
CONSENSUS: x(2)-[ST!- NAME: Ribosomal protein S21 signature-
CONSENSUS: [DE]-x-A-[LY]-[KR]-R-F-K-[KR]-x(3)-[KR].
NAME: Ribosomal protein S3Ae signature.
CONSENSUS: [LIV]-x-[GH]-R-[IV]-x-E-x-[SC]-L-x-D-L.
NAME: Ribosomal protein S4e signature.
CONSENSUS: H-x-K-R-[LIVM]-[SAN]-x-P-x(2)-W-x-[LIVM]-x-[KR].
NAME: Ribosomal protein S6e signature.
CONSENSUS: [LIVM]-[STAMR]-G-G-x-D-x(2)-G-x-P-M.
NAME: Ribosomal protein S7e signature- CONSENSUS: [KR!-L-x-R-E-L-E-K-K-F-[SAP!-x-[KR!-H • NAME: Ribosomal protein S8e signature-
CONSENSUS: R-x (2) -T-G-[GA!-x (5) -[HR!-K-[KR!-x-K-x-E-[LM!-C
NAME: Ribosomal protein S12e signature.
CONSENSUS: A-L-[KRQP]-x-V-L-x(2)-[SA]-x(3)-[DN]-G-L.
NAME: Ribosomal protein S17e signature.
CONSENSUS: A-x-I-x-[ST]-K-x-L-R-N-[KR]-I-A-G-[FY]-x-T-H.
NAME: Ribosomal protein S19e signature.
CONSENSUS: P-x(6)-[SAN]-x(2)-[LIVMA]-x-R-x-[ALIV]-[LV]-Q-x-L-[EQ].
NAME: Ribosomal protein S21e signature-
CONSENSUS: -Y-V-P-R-K-C-S-[SA! •
NAME: Ribosomal protein S24e signature.
CONSENSUS: [FA]-G-x(2)-[KR]-[STA]-x-G-[FY]-[GA]-x-[LIVM]-Y-[DN]-[SN].
NAME: Ribosomal protein S26e signature.
CONSENSUS: [YH]-C-V-S-C-A-I-H.
NAME: Ribosomal protein S27e signature.
CONSENSUS: [QK]-C-x(2)-C-x(6)-F-[GS]-x-[PSA]-x(5)-C-x(2)-C-[GS]-x(2)-L-x(2)-P-x-G.
NAME: Ribosomal protein S28e signature.
CONSENSUS: E-[ST]-E-R-E-A-R-x-L.
NAME: DNA mismatch repair proteins mutL / hexB / PMS1 signature.
CONSENSUS: G-F-R-G-E-A-L.
NAME: DNA mismatch repair proteins mutS family signature.
CONSENSUS: [ST]-[LIVM]-x-[LIVM]-x-D-E-[LIVMY]-[GC]-[RKH]-G-[GST]-x(4)-G.
NAME: mutT domain signature.
CONSENSUS: G-x(5)-E-x(4)-[STAGC]-[LIVMAC]-x-R-E-[LIVMFT]-x-E-E.
NAME: DnaA protein signature- CONSENSUS: I-[GA!-x (2) -[LIVMF!-[SGDNK!-x (Oil) -[KR!-x-H-[STP!- [STV!-[LIVM!(2)-x- CONSENSUS: [SA!-x (2 ) -[KRE!-CLIVM! -
NAME: Small, acid-soluble spore proteins, alpha/beta type, signature 1.
CONSENSUS: K-x-E-[LIV]-A-x-[DE]-[LIVMF]-G-[LIVMF].
NAME: Small, acid-soluble spore proteins, alpha/beta type, signature 2.
CONSENSUS: [KR]-[SAQ]-x-G-x-V-G-G-x-[LIVM]-x-[KR](2)-[LIVM](2).
NAME: Zinc-containing alcohol dehydrogenases signature.
CONSENSUS: G-H-E-x(2)-G-x(5)-[GA]-x(2)-[IVSAC].
NAME: (Quinone oxidoreductase / zeta-crystallin signature. CONSENSUS: CGSD!-CDE<QH!-x (2) -L-x (3) -[SA! (2) -G-G-x-G-x ( 4 ) -Q- x(2)-[KR!- NAME: Iron-containing alcohol dehydrogenases signature 1- CONSENSUS: CSTALIV!-[LIVF!-x-[DE!-x (faι7 ) -P-x (4 ) -[ALIV!-x- [GST!-x(2)-D-[TAIVM!- CONSENSUS: CLIVMF!-x ( 4 ) -E • NAME: Iron-containing alcohol dehydrogenases signature 2- CONSENSUS: CGSU!-x-[LIVTSACD!-[GH!-x (2) -[GSAE!-[GSHY(Q!-x- [LIVTP!-[GAST!-[GAS!-x(3)~ CONSENSUS: [LIVMT!-x-[HNS!-[GA!-x-[GTAC! • NAME: Short-chain dehydrogenases/reductases family signature-
CONSENSUS: [LIVSPADNK!-x (12) -Y-[PSTAGNCV!-[STAGNιQCIVM!-
[STAGC!-K--CPO-[SAGFR!-
CONSENSUS: [LIVMSTAGD!-x (2) -[LIVMFYU!-x (3) -[LIVMFYUGAPTHlQ!- [GSACiQRHM!-
NAME: Aldo/keto reductase family signature 1- CONSENSUS: G-[FY!-R-[HSAL!-[LIVMF!-D-[STAGC!-[AS!-x ( 5) -E- x(2)-CLIVM!-G-
NAME: Aldo/keto reductase family signature 2-
CONSENSUS: CLIVMFY!-x (1) -CKRE(Q!-x-[LIVM!-G-[LIVM!-[SC!-N-
[FY!- NAME: Aldo/keto reductase family putative active site signature-
CONSENSUS: [LIVM!-[PAIV!-[KR!-[ST!-x ( 4 ) -R-x (2) -[GSTAElQK!- [NSL!-x(2)-[LIVMFA!- NAME: Homoserine dehydrogenase signature-
CONSENSUS: A-x (3) -G-[LIVMFY!-[STAG!-x (2ι3) -CDNSl-P-x (2) -D- [LIVM!-x-G-x-D-x(3)-K-
NAME: NAD-dependent glycerol-3-phosphate dehydrogenase signature.
CONSENSUS: G-CAT!-[LIVM!-K-[DN!-[LIVM! (2) -A-x-CGA!-x-G- CLIVMF!-x-CDE!-G-[LIVM!-x- CONSENSUS: [LIVMFYU!-G-x-N ■
NAME: FAD-dependent glycerol-3-phosphate dehydrogenase signature 1.
CONSENSUS: [IV]-G-G-G-x(2)-G-[STACV]-G-x-A-x-D-x(3)-R-G.
NAME: FAD-dependent glycerol-3-phosphate dehydrogenase signature 2.
CONSENSUS: G-G-K-x(2)-[GSTE]-Y-R-x(2)-A.
NAME: Mannitol dehydrogenases signature.
CONSENSUS: [LIVMY]-x-[FS]-x(2)-[STAGCV]-x-V-D-R-[IV]-x-[PS].
NAME: Histidinol dehydrogenase signature-
CONSENSUS: I-D-x (2 ) -A-G-P-[ST!-E-[LIVS!-[LIVMA! (3) -[AC!-x (3) • A-x(4)-[LIVM!-[AV!-
CONSENSUS: [SACL!-[DE!-[LIVMFC!-[LIVM!-[SA!-x (2 ) -E-H -
NAME: L-lactate dehydrogenase active site.
CONSENSUS: [LIVMA]-G-[EQ]-H-G-[DN]-[ST].
NAME: D-isomer specific 2-hydroxyacid dehydrogenases NAD- binding signature-
CONSENSUS: [LIVMA!-[AG!-[IVT!-[LIVMFY!-[AG!-x-G-[NHKR(QGSAC!- [LIV!-G-x(13ιl4)- CONSENSUS: [LIVfMT!-x (2 ) -[FYwCTH!-[DNSTK! -
NAME: D-isomer specific 2-hydroxyacid dehydrogenases signature 2-
CONSENSUS: [LIVMFYUA!-[LIVFYUC!-x (2) -[SAC!-[DN(QHR!-[IVFA!- [LIVF!-x-[LIVF!-[HNI!-x-
CONSENSUS: P-x ( 4 ) -[STN!-x (2 ) -[LIVMF!-x-[GSDN! -
NAME: D-isomer specific 2-hydroxyacid dehydrogenases signature 3- CONSENSUS: [LMFATC!-[KP(Q!-x-[GSTDN!-x-[LIVMFYUR!- [LIVMFYU!(2)-N-x-[STAGC!-R-[GP!-x- CONSENSUS: [LIVH!-[LIVMC!-[DNV! -
NAME: 3-hydroxyisobutyrate dehydrogenase signature.
CONSENSUS: [LIVMFY](2)-G-L-G-x-[MQ]-G-x-[PGS]-[MA]-[SA].
NAME: Hydroxymethylglutaryl-coenzyme A reductases signature 1.
CONSENSUS: [RKH]-x(6)-D-x-M-G-x-N-x-[LIVMA].
NAME: Hydroxymethylglutaryl-coenzyme A reductases signature 2.
CONSENSUS: [LIVM]-G-x-[LIVM]-G-G-[AG]-T.
NAME: Hydroxymethylglutaryl-coenzyme A reductases signature 3.
CONSENSUS: A-[LIVM]-x-[STAN]-x(2)-[LI]-x-[KRNQ]-[GSA]-H-[LM]-x-[FYLH].
NAME: Hydroxymethylglutaryl-coenzyme A reductases profile.
NAME: 3-hydroxyacyl-CoA dehydrogenase signature- CONSENSUS: [DNE!-x (2) -[GA!-F-[LIVMFY!-x-[NT!-R-x (3 ) -CPA!- CLIVMFY!(2)-x(S)- CONSENSUS: CLIVMFYCT!-[LIVMFY!-x (2 ) -[GV! -
NAME: Malate dehydrogenase active site signature.
CONSENSUS: [LIVM]-T-[TRKMN]-L-D-x(2)-R-[STA]-x(3)-[LIVMFY].
NAME: Malic enzymes signature.
CONSENSUS: F-x-[DV]-D-x(2)-G-T-[GSA]-x-[IV]-x-[LIVMA]-[GAST](2)-[LIVMF](2).
NAME-" Isocitrate and isopropylmalate dehydrogenases signature-
CONSENSUS: CNS!-[LIMYT!-[FYDN!-G-[DNT!-[IMVY!-x-[STGDN!-[DN!- x(2)-[SGAP!-x(3ι4)-G- CONSENSUS: [STG!-[LIVMPA!-G-[LIVMF! -
NAME: 6-phosphogluconate dehydrogenase signature.
CONSENSUS: [LIVM]-x-D-x(2)-[GA]-[NQS]-K-G-T-G-x-W.
NAME: Glucose-6-phosphate dehydrogenase active site.
CONSENSUS: D-H-Y-L-G-K-[EQK].
NAME: IMP dehydrogenase / GMP reductase signature.
CONSENSUS: [LIVM]-[RK]-[LIVM]-G-[LIVM]-G-x-G-S-[LIVM]-C-x-T.
NAME: Bacterial quinoprotein dehydrogenases signature 1- CONSENSUS: CDEN!-U-x (3) -G-CR -x (fa) -[FYU!-S-x ( 4 ) -CLIVM!-N- x(2)-N-V-x(2)-L-[RK!- NAME: Bacterial quinoprotein dehydrogenases signature 2- CONSENSUS: U-x ( 4 ) -Y-D-x (3 ) -CDN!-[LIVMFY! ( 4 ) -x (2) -G-x (2) - [STA!-P-
NAME: FMN-dependent alpha-hydroxy acid dehydrogenases active site.
CONSENSUS: S-N-H-G-[AG]-R-Q.
NAME: GMC oxidoreductases signature 1.
CONSENSUS: [GA]-[RKN]-x-[LIV]-G(2)-[GST](2)-x-[LIVM]-N-x(3)-[FYWA]-x(2)-[PAG]-x(5)-[DNESH].
NAME: GMC oxidoreductases signature 2.
CONSENSUS: [GS]-[PSTA]-x(2)-[ST]-P-x-[LIVM](2)-x(2)-S-G-[LIVM]-G.
NAME: Eukaryotic molybdopterin oxidoreductases signature.
CONSENSUS: [GA]-x(3)-[KRNQHT]-x(11,14)-[LIVMFYWS]-x(8)-[LIVMF]-x-C-x(2)-[DEN]-R-x(2)-[DE].
NAME: Prokaryotic molybdopterin oxidoreductases signature 1.
CONSENSUS: [STAN]-x-[CH]-x(2,3)-C-[STAG]-[GSTVMF]-x-C-x-[LIVMFYW]-x-[LIVMA]-x(3,4)-[DENQKHT].
NAME: Prokaryotic molybdopterin oxidoreductases signature 2.
CONSENSUS: [STA]-x-[STAC](2)-x(2)-[STA]-D-[LIVMY](2)-L-P-x-[STAC](2)-x(2)-E.
NAME: Prokaryotic molybdopterin oxidoreductases signature 3.
CONSENSUS: A-x(3)-[GDT]-I-x-[DNQTK]-x-[DEA]-x-[LIVM]-x-[LIVMC]-x-[NS]-x(2)-[GS]-x(5)-A-x-[LIVM]-[ST].
NAME: Aldehyde dehydrogenases glutamic acid active site.
CONSENSUS: [LIVMFGA]-E-[LIMSTAC]-[GS]-G-[KNLM]-[SADN]-[TAPFV].
NAME: Aldehyde dehydrogenases cysteine active site.
CONSENSUS: [FYLVA]-x(3)-G-[QE]-x-C-[LIVMGSTANC]-[AGCN]-x-[GSTADNEKR].
NAME: Aspartate-semialdehyde dehydrogenase signature.
CONSENSUS: [LIVM]-[SADN]-x(2)-C-x-R-[LIVM]-x(4)-[GSC]-H-[STA].
NAME: Glyceraldehyde 3-phosphate dehydrogenase active site.
CONSENSUS: [ASV]-S-C-[NT]-T-x(2)-[LIM].
NAME: N-acetyl-gamma-glutamyl-phosphate reductase active site.
CONSENSUS: [LIVM]-[GSA]-x-P-G-C-[FY]-[AVP]-T-[GA]-x(3)-[GTAC]-[LIVM]-x-P.
NAME: Gamma-glutamyl phosphate reductase signature.
CONSENSUS: V-x(5)-A-[LIV]-x-H-I-x(2)-[HY]-[GS]-[ST]-x-H-[ST]-[DE]-x-I.
NAME: Dihydrodipicolinate reductase signature.
CONSENSUS: E-[IV]-x-E-x-H-x(3)-K-x-D-x-P-S-G-T-A.
NAME: Dihydroorotate dehydrogenase signature 1.
CONSENSUS: [GS]-x(4)-[GK]-[STA]-[IVSTA]-[GT]-x(3)-[NQR]-x-G-[NH]-x(2)-P-[RT].
NAME: Dihydroorotate dehydrogenase signature 2.
CONSENSUS: [LIV](2)-[GSA]-x-G-G-[IV]-x-[STGN]-x(3)-[ACV]-x(6)-G-A.
NAME: Coproporphyrinogen III oxidase signature.
CONSENSUS: K-x-W-C-x(2)-[FYH](3)-[LIVM]-x-H-R-x-E-x-R-G-[LIVM]-G-G-[LIVM]-F-F-D.
NAME: Fumarate reductase / succinate dehydrogenase FAD-binding site.
CONSENSUS: R-[ST]-H-[ST]-x(2)-A-x-G-G.
NAME: Acyl-CoA dehydrogenases signature 1.
CONSENSUS: [GAC]-[LIVM]-[ST]-E-x(2)-[GSAN]-G-[ST]-D-x(2)-[GSA].
NAME: Acyl-CoA dehydrogenases signature 2.
CONSENSUS: [QDE]-x(2)-G-[GS]-x-G-[LIVMFY]-x(2)-[DEN]-x(4)-[KR]-x(3)-[DEN].
NAME: Alanine dehydrogenase & pyridine nucleotide transhydrogenase signature 1.
CONSENSUS: G-[LIVM]-P-x-E-x(3)-N-E-x(1,3)-R-V-A-x-[ST]-P-x-[GST]-V-x(2)-L-x-[KRH]-x-G.
NAME: Alanine dehydrogenase & pyridine nucleotide transhydrogenase signature 2.
CONSENSUS: [LIVM](2)-G-[GA]-G-x-A-G-x(2)-[SA]-x(3)-[GA]-x-[SG]-[LIVM]-G-A-x-V-x(3)-D.
NAME: Glu / Leu / Phe / Val dehydrogenases active site.
CONSENSUS: [LIV]-x(2)-G-G-[SAG]-K-x-[GV]-x(3)-[DNST]-[PL].
NAME: D-amino acid oxidases signature.
CONSENSUS: [LIVM](2)-H-[NHA]-Y-G-x-[GSA](2)-x-G-x(5)-G-x-A.
NAME: Pyridoxamine 5'-phosphate oxidase signature.
CONSENSUS: [LIVF]-E-F-W-[QHG]-x(4)-R-[LIVM]-H-[DNE]-R.
NAME: Copper amine oxidase topaquinone signature.
CONSENSUS: [LIVM]-[LIVMA]-[LIVM]-x(4)-T-x(2)-N-Y-[DE]-[YN].
NAME: Copper amine oxidase copper-binding site signature.
CONSENSUS: T-x-G-x(2)-H-[LIVMF]-x(3)-E-[DE]-x-P.
NAME: Lysyl oxidase putative copper-binding region signature.
CONSENSUS: W-E-W-H-S-C-H-Q-H-Y-H.
NAME: Delta 1-pyrroline-5-carboxylate reductase signature.
CONSENSUS: [PALF]-x(2,3)-[LIV]-x(3)-[LIVM]-[STAC]-[STV]-x-[GAN]-G-x-T-x(2)-[AG]-[LIV]-x(2)-[LMF]-[DENQK].
NAME: Dihydrofolate reductase signature.
CONSENSUS: [LVAGC]-[LIF]-G-x(4)-[LIVMF]-P-W-x(4,5)-[DE]-x(3)-[FYIV]-x(3)-[STIQ].
NAME: Tetrahydrofolate dehydrogenase/cyclohydrolase signature 1.
CONSENSUS: [EQ]-x-[EQK]-[LIVM](2)-x(2)-[LIVM]-x(2)-[LIVMY]-N-x-[DN]-x(5)-[LIVMF](3)-Q-L-P-[LV].
NAME: Tetrahydrofolate dehydrogenase/cyclohydrolase signature 2.
CONSENSUS: P-G-G-V-G-P-[MF]-T-[IV].
NAME: Oxygen oxidoreductases covalent FAD-binding site.
CONSENSUS: P-x(10)-[DE]-[LIVM]-x(3)-[LIVM]-x(9)-[LIVM]-x(3)-[GSA]-[GST]-G-H.
NAME: Pyridine nucleotide-disulphide oxidoreductases class-I active site.
CONSENSUS: G-G-x-C-[LIVA]-x(2)-G-C-[LIVM]-P.
NAME: Pyridine nucleotide-disulphide oxidoreductases class-II active site.
CONSENSUS: C-x(2)-C-D-[GA]-x(2,4)-[FY]-x(4)-[LIVM]-x-[LIVM](2)-G(3)-[DN].
NAME: Respiratory-chain NADH dehydrogenase subunit 1 signature 1.
CONSENSUS: G-[LIVMFYKRS]-[LIVMAGP]-Q-x-[LIVMFY]-x-D-[AGIM]-[LIVMFTA]-K-[LVMYST]-[LIVMFYG]-x-[KR]-[EQG].
NAME: Respiratory-chain NADH dehydrogenase subunit 1 signature 2.
CONSENSUS: P-F-D-[LIVMFYQ]-[STAGPVM]-E-[GAC]-E-x-[EQ]-[LIVMS]-x(2)-G.
NAME: Respiratory-chain NADH dehydrogenase 20 Kd subunit signature.
CONSENSUS: [GN]-x-D-[KRST]-[LIVMF](2)-P-[IV]-D-[LIVMFYW](2)-x-P-x-C-P-[PT].
NAME: Respiratory-chain NADH dehydrogenase 24 Kd subunit signature.
CONSENSUS: D-x(2)-F-[ST]-x(5)-C-L-G-x-C-x(2)-[GA]-P.
NAME: Respiratory chain NADH dehydrogenase 30 Kd subunit signature.
CONSENSUS: E-R-E-x(2)-[DE]-[LIVMF](2)-x(6)-[HK]-x(3)-[KRP]-x-[LIVM]-[LIVMS].
NAME: Respiratory chain NADH dehydrogenase 49 Kd subunit signature.
CONSENSUS: [LIVMH]-H-[RT]-[GA]-x-E-K-[LIVMT]-x-E-x-[KRQ].
NAME: Respiratory-chain NADH dehydrogenase 51 Kd subunit signature 1.
CONSENSUS: G-[AM]-G-[AR]-Y-[LIVM]-C-G-[DE](2)-[STA](2)-[LIM](2)-[EN]-S.
NAME: Respiratory-chain NADH dehydrogenase 51 Kd subunit signature 2.
CONSENSUS: E-S-C-G-x-C-x-P-C-R-x-G.
NAME: Respiratory-chain NADH dehydrogenase 75 Kd subunit signature 1.
CONSENSUS: P-x( )-C-[YWS]-x(7)-G-x-C-R-x-C.
NAME: Respiratory-chain NADH dehydrogenase 75 Kd subunit signature 2.
CONSENSUS: C-P-x-C-[DE]-x-[GS](2)-x-C-x-L-Q.
NAME: Respiratory-chain NADH dehydrogenase 75 Kd subunit signature 3.
CONSENSUS: R-C-[LIVM]-x-C-x-R-C-[LIVM]-x-[FY].
NAME: Nitrite and sulfite reductases iron-sulfur/siroheme-binding site.
CONSENSUS: [STV]-G-C-x(3)-C-x(6)-[DE]-[LIVMF]-[GAT]-[LIVMF].
NAME: Uricase signature.
CONSENSUS: L-x-[LV]-L-K-[ST]-T-x-S-x-F-x(2)-[FY]-x(4)-[FY].
NAME: Heme-copper oxidase catalytic subunit, copper B binding region signature.
CONSENSUS: [YWG]-[LIVFYWTA](2)-[VGS]-H-[LNP]-x-V-x(44,47)-H-H.
NAME: CO II and nitrous oxide reductase dinuclear copper centers signature.
CONSENSUS: V-x-H-x(33,40)-C-x(3)-C-x(3)-H-x(2)-M.
NAME: Cytochrome c oxidase subunit Vb, zinc binding region signature.
CONSENSUS: [LIVM](2)-[FYW]-x(10)-C-x(2)-C-G-x(2)-[FY]-K-L.
NAME: Multicopper oxidases signature 1.
CONSENSUS: G-x-[FYW]-x-[LIVMFYW]-x-[CST]-x(8)-G-[LM]-x(3)-[LIVMFYW].
NAME: Multicopper oxidases signature 2.
CONSENSUS: H-C-H-x(3)-H-x(3)-[AG]-[LM].
NAME: Peroxidases proximal heme-ligand signature.
CONSENSUS: [DET]-[LIVMTA]-x(2)-[LIVM]-[LIVMSTAG]-[SAG]-[LIVMSTAG]-H-[STA]-[LIVMFY].
NAME: Peroxidases active site signature.
CONSENSUS: [SGATV]-x(3)-[LIVMA]-R-[LIVMA]-x-[FW]-H-x-[SAC].
NAME: Catalase proximal heme-ligand signature.
CONSENSUS: R-[LIVMFSTAN]-F-[GASTNP]-Y-x-D-[AST]-[QEH].
NAME: Catalase proximal active site signature.
CONSENSUS: [IF]-x-[RH]-x(4)-[EQ]-R-x(2)-H-x(2)-[GAS]-[GASTF]-[GAST].
NAME: Glutathione peroxidases selenocysteine active site.
CONSENSUS: [GN]-[RKHNFYC]-x-[LIVMFC]-[LIVMF](2)-x-N-[VT]-x-[STC]-x-C-[GA]-x-T.
NAME: Glutathione peroxidases signature 2.
CONSENSUS: [LIV]-[AGD]-F-P-[CS]-[NG]-Q-F.
NAME: Lipoxygenases iron-binding region signature 1.
CONSENSUS: H-[EQ]-x(3)-H-x-[LM]-[NQRC]-[GST]-H-[LIVMSTAC](3)-E.
NAME: Lipoxygenases iron-binding region signature 2.
CONSENSUS: [LIVMA]-H-P-[LIVM]-x-[KRQ]-[LIVMF](2)-x-[AP]-H.
NAME: Extradiol ring-cleavage dioxygenases signature.
CONSENSUS: [GNTIV]-x-H-x(5,7)-[LIVMF]-Y-x(2)-[DENTA]-P-x-[GP]-x(2,3)-E.
NAME: Intradiol ring-cleavage dioxygenases signature.
CONSENSUS: [LIVM]-x-G-x-[LIVM]-x(4)-[GS]-x(2)-[LIVM]-x(4)-[LIVM]-[DE]-[LIVMFY]-x(6)-G-x-[FY].
NAME: Indoleamine 2,3-dioxygenase signature 1.
CONSENSUS: G-G-S-[AN]-[GA]-Q-S-S-x(2)-Q.
NAME: Indoleamine 2,3-dioxygenase signature 2.
CONSENSUS: [FY]-L-[DQ]-[DE]-[LIVM]-x(2)-Y-M-x(3)-H-[KR].
NAME: Bacterial ring hydroxylating dioxygenases alpha-subunit signature.
CONSENSUS: C-x-H-R-[GA]-x(6)-G-N-x(5)-C-x-[FY]-H.
NAME: Bacterial luciferase subunits signature.
CONSENSUS: [GA]-[LIVM]-P-[LIVM]-x-[LIVMFY]-x-W-x(6)-[RK]-x(6)-Y-x(3)-[AR].
NAME: ubiH/COQ6 monooxygenase family signature.
CONSENSUS: H-P-[LIV]-[AG]-G-Q-G-x-N-x-G-x(2)-D.
NAME: Biopterin-dependent aromatic amino acid hydroxylases signature.
CONSENSUS: P-D-x( )-H-[DE]-[LI]-[LIVMF]-G-H-[LIVMC]-P.
NAME: Copper type II, ascorbate-dependent monooxygenases signature 1.
CONSENSUS: H-H-M-x(2)-F-x-C.
NAME: Copper type II, ascorbate-dependent monooxygenases signature 2.
CONSENSUS: H-x-F-x(4)-H-T-H-x(2)-G.
NAME: Tyrosinase CuA-binding region signature.
CONSENSUS: H-x(4,5)-F-[LIVMFTP]-x-[FW]-H-R-x(2)-[LM]-x(3)-E.
NAME: Tyrosinase and hemocyanins CuB-binding region signature.
CONSENSUS: D-P-x-F-[LIVMFYW]-x(2)-H-x(3)-D.
NAME: Fatty acid desaturases family 1 signature.
CONSENSUS: G-E-x-[FY]-H-N-[FY]-H-H-x-F-P-x-D-Y.
NAME: Fatty acid desaturases family 2 signature.
CONSENSUS: [ST]-[SA]-x(3)-[QR]-[LI]-x(5,6)-D-Y-x(2)-[LIVMFYW]-[LIVM]-[DE].
NAME: Cytochrome P450 cysteine heme-iron ligand signature.
CONSENSUS: [FW]-[SGNH]-x-[GD]-x-[RHPT]-x-C-[LIVMFAP]-[GAD].
NAME: Heme oxygenase signature.
CONSENSUS: L-L-V-A-H-A-Y-T-R.
NAME: Copper/Zinc superoxide dismutase signature 1.
CONSENSUS: [GA]-[IFAT]-H-[LIVF]-H-x(2)-[GP]-[SDG]-x-[STAGD].
NAME: Copper/Zinc superoxide dismutase signature 2.
CONSENSUS: G-[GN]-[SGA]-G-x-R-x-[SGA]-C-x(2)-[IV].
NAME: Manganese and iron superoxide dismutases signature.
CONSENSUS: D-x-W-E-H-[STA]-[FY](2).
NAME: Ribonucleotide reductase large subunit signature.
CONSENSUS: W-x(2)-[LF]-x(6,7)-G-[LIVM]-[FYRA]-[NH]-x(3)-[STAQLIVM]-[ASC]-x(2)-[PA].
NAME: Ribonucleotide reductase small subunit signature.
CONSENSUS: [IVMSEQ]-E-x(1,2)-[LIVTA]-[HY]-[GSA]-x-[STAVM]-Y-x(2)-[LIVMQ]-x(3)-[LIFY]-[IVFYCSA].
NAME: Nitrogenases component 1 alpha and beta subunits signature 1.
CONSENSUS: [LIVMFYH]-[LIVMFST]-H-[AG]-[AGSP]-[LIVMNQA]-[AG]-C.
NAME: Nitrogenases component 1 alpha and beta subunits signature 2.
CONSENSUS: [STANQ]-[ET]-C-x(5)-G-D-[DN]-[LIVMT]-x-[STAGR]-[LIVMFYST].
NAME: NifH/frxC family signature 1.
CONSENSUS: E-x-G-G-P-x(2)-[GA]-x-G-C-[AG]-G.
NAME: NifH/frxC family signature 2.
CONSENSUS: D-x-L-G-D-V-V-C-G-G-F-[AG]-x-P.
NAME: Nickel-dependent hydrogenases large subunit signature 1.
CONSENSUS: R-G-[LIVMF]-E-x(15)-[QESM]-R-x-C-G-[LIVM]-C.
NAME: Nickel-dependent hydrogenases large subunit signature 2.
CONSENSUS: [FY]-D-P-C-[LIM]-[ASG]-C-x(2,3)-H.
NAME: Glutamyl-tRNA reductase signature.
CONSENSUS: H-[LIVM]-x(2)-[LIVM]-[GSTAC](3)-[LIVM]-[DEQ]-S-[LIVMA]-[LIVM](2)-[GF]-E-x-[QR]-[IV]-[LIT]-[STAG]-Q-[LIVM]-[KR].
NAME: Bacterial-type phytoene dehydrogenase signature.
CONSENSUS: [NG]-x-[FYWV]-[LIVMF]-x-G-[AGC]-[GS]-[TA]-[HQT]-P-G-[STAV]-G-[LIVM]-x(5)-[GS].
NAME: Glycine radical signature.
CONSENSUS: [STIV]-x-R-[IVT]-[CSA]-G-Y-x-[GACV].
NAME: Ergosterol biosynthesis ERG4/ERG24 family signature 1.
CONSENSUS: G-x(2)-[LIVM]-Y-D-x-[FY]-x-G-x(2)-L-N-P-R.
NAME: Ergosterol biosynthesis ERG4/ERG24 family signature 2.
CONSENSUS: [LIVM](2)-H-R-x(2)-R-D-x(3)-C-x(2)-K-Y-G.
NAME: NNMT/PNMT/TEMT family of methyltransferases signature.
CONSENSUS: L-I-D-I-G-S-G-P-T-[IV]-Y-Q-L-L-S-A-C.
NAME: RNA methyltransferase trmA family signature 1.
CONSENSUS: [DN]-P-[PA]-R-x-G-x(14,16)-[LIVM](2)-Y-x-S-C-N-x(2)-T.
NAME: RNA methyltransferase trmA family signature 2.
CONSENSUS: [LIVMF]-D-x-F-P-[QHY]-[ST]-x-H-[LIVMFY]-E.
NAME: Thymidylate synthase active site.
CONSENSUS: R-x(2)-[LIVM]-x(3)-[FW]-[QN]-x(8,9)-[LV]-x-P-C-[HAVM]-x(3)-[QMT]-[FYW]-x-[LV].
NAME: Ribosomal RNA adenine dimethylases signature.
CONSENSUS: [LIVM]-[LIVMFY]-[DE]-x-G-[STAPV]-G-x-[GA]-x-[LIVMF]-[ST]-x(2)-[LIVM]-x(6)-[LIVMY]-x-[STAGV]-[LIVMFYHC]-E-x-D.
NAME: Methylated-DNA--protein-cysteine methyltransferase active site.
CONSENSUS: [LIVMF]-P-C-H-R-[LIVMF](2).
NAME: N-6 Adenine-specific DNA methylases signature.
CONSENSUS: [LIVMAC]-[LIVFYWA]-x-[DN]-P-P-[FYW].
NAME: N-4 cytosine-specific DNA methylases signature.
CONSENSUS: [LIVMF]-T-S-P-P-[FY].
NAME: C-5 cytosine-specific DNA methylases active site.
CONSENSUS: [DENKS]-x-[FLIV]-x(2)-[GSTC]-x-P-C-x(2)-[FYWLIM]-S.
NAME: C-5 cytosine-specific DNA methylases C-terminal signature.
CONSENSUS: [RKQGTF]-x(2)-G-N-[STAG]-[LIVMF]-x(3)-[LIVMT]-x(3)-[LIVM]-x(3)-[LIVM].
NAME: Protein-L-isoaspartate(D-aspartate) O-methyltransferase signature.
CONSENSUS: [GSA]-D-G-x(2)-G-[FYWV]-x(3)-[AS]-P-[FY]-[DN]-x-I.
NAME: Uroporphyrin-III C-methyltransferase signature 1.
CONSENSUS: [LIVM]-[GS]-[STAL]-G-P-G-x(3)-[LIVMFY]-[LIVM]-T-[LIVM]-[KRHQG]-[AG].
NAME: Uroporphyrin-III C-methyltransferase signature 2.
CONSENSUS: V-x(2)-[LI]-x(2)-G-D-x(3)-[FYW]-[GS]-x(6)-[LIVF]-x(5,6)-[LIVMFYWPAC]-x-[LIVMY]-x-P-G.
NAME: ubiE/COQ5 methyltransferase family signature 1.
CONSENSUS: Y-D-x-M-N-x(2)-[LIVM]-S-x(3)-H-x(2)-W.
NAME: ubiE/COQ5 methyltransferase family signature 2.
CONSENSUS: R-V-[LIVM]-K-[PV]-G-G-x-[LIVMF]-x(2)-[LIVM]-E-x-S.
NAME: Serine hydroxymethyltransferase pyridoxal-phosphate attachment site.
CONSENSUS: [DEH]-[LIVMFY]-x-[STMV]-[GST]-[ST](2)-H-K-[ST]-[LF]-x-G-[PAC]-[RQ]-[GSA]-[GA].
NAME: Phosphoribosylglycinamide formyltransferase active site.
CONSENSUS: G-x-[STM]-[IVT]-x-[FYWVQ]-[VMAT]-x-[DEVM]-x-[LIVMY]-D-x-G-x(2)-[LIVT]-x(6)-[LIVM].
NAME: Aspartate and ornithine carbamoyltransferases signature.
CONSENSUS: F-x-[EK]-x-S-[GT]-R-T.
NAME: Transketolase signature 1.
CONSENSUS: R-x(3)-[LIVMTA]-[DENQSTHKF]-x(5,6)-[GSN]-G-H-[PLIVMF]-[GSTA]-x(2)-[LIMC]-[GS].
NAME: Transketolase signature 2.
CONSENSUS: G-[DEQGSA]-[DN]-G-[PAEQ]-[ST]-[HQ]-x-[PAGM]-[LIVMYAC]-[DEFYW]-x(2)-[STAP]-x(2)-[RGA].
NAME: Transaldolase signature 1.
CONSENSUS: [DG]-[IVSA]-T-[ST]-N-P-[STA]-[LIVMF](2).
NAME: Transaldolase active site.
CONSENSUS: [LIVM]-x-[LIVM]-K-[LIVM]-[PAS]-x-[ST]-x-[DENQPAS]-G-[LIVM]-x-[AGV]-x-[QEKRST]-x-[LIVM].
NAME: Acyltransferases ChoActase / COT / CPT family signature 1.
CONSENSUS: [LI]-P-x-[LVP]-P-[IVTA]-P-x-[LIVM]-x-[DENQAS]-[ST]-[LIVM]-x(2)-[LY].
NAME: Acyltransferases ChoActase / COT / CPT family signature 2.
CONSENSUS: R-[FYW]-x-[DA]-[KA]-x(0,1)-[LIVMFY]-x-[LIVMFY](2)-x(3)-[DNS]-[GSA]-x(6)-[DE]-[HS]-x(3)-[DE]-[GA].
NAME: Thiolases acyl-enzyme intermediate signature.
CONSENSUS: [LIVM]-[NST]-x(2)-C-[SAGLI]-[ST]-[SAG]-[LIVMFYNS]-x-[STAG]-[LIVM]-x(6)-[LIVM].
NAME: Thiolases signature 2.
CONSENSUS: N-x(2)-G-G-x-[LIVM]-[SA]-x-G-H-P-x-G-x-[ST]-G.
NAME: Thiolases active site.
CONSENSUS: [AG]-[LIVMA]-[STAGLIVM]-[STAG]-[LIVMA]-C-x-[AG]-x-[AG]-x-[AG]-x-[SAG].
NAME: Chloramphenicol acetyltransferase active site.
CONSENSUS: Q-[LIV]-H-H-[SA]-x(2)-D-G-[FY]-H.
NAME: Hexapeptide-repeat containing-transferases signature.
CONSENSUS: [LIV]-[GAED]-x(2)-[STAV]-x-[LIV]-x(3)-[LIVAC]-x-[LIV]-[GAED]-x(2)-[STAVR]-x-[LIV]-[GAED]-x(2)-[STAV]-x-[LIV]-x(3)-[LIV].
NAME: Beta-ketoacyl synthases active site.
CONSENSUS: G-x(4)-[LIVMFAP]-x(2)-[AGC]-C-[STA](2)-[STAG]-x(3)-[LIVMF].
NAME: Chalcone and stilbene synthases active site.
CONSENSUS: R-[LIVMFYS]-x-[LIVM]-x-[QHG]-x-G-C-[FYNA]-[GA]-G-[GA]-[STAV]-x-[LIVMF]-[RA].
NAME: Myristoyl-CoA:protein N-myristoyltransferase signature 1.
CONSENSUS: E-I-N-F-L-C-x-H-K.
NAME: Myristoyl-CoA:protein N-myristoyltransferase signature 2.
CONSENSUS: K-F-G-x-G-D-G.
NAME: Gamma-glutamyltranspeptidase signature.
CONSENSUS: T-[STA]-H-x-[ST]-[LIVMA]-x(4)-G-[SN]-x-V-[STA]-x-T-x-T-[LIVM]-[NE]-x(1,2)-[FY]-G.
NAME: Transglutaminases active site.
CONSENSUS: [GT]-Q-[CA]-W-V-x-[SA]-[GA]-[IVT]-x(2)-T-x-[LMSC]-R-[CSA]-[LV]-G.
NAME: Phosphorylase pyridoxal-phosphate attachment site.
CONSENSUS: E-A-[SC]-G-x-[GS]-x-M-K-x(2)-[LM]-N.
NAME: UDP-glycosyltransferases signature.
CONSENSUS: [FW]-x(2)-Q-x(2)-[LIVMYA]-[LIMV]-x(4,6)-[LVGAC]-[LVFYA]-[LIVMF]-[STAGCM]-[HNQ]-[STAGC]-G-x(2)-[STAG]-x(3)-[STAGL]-[LIVMFA]-x(4)-[PQR]-[LIVMT]-x(3)-[PA]-x(3)-[DES]-[QEHN].
NAME: Purine/pyrimidine phosphoribosyl transferases signature.
CONSENSUS: [LIVMFYWCTA]-[LIVM]-[LIVMA]-[LIVMFC]-[DE]-D-[LIVMS]-[LIVM]-[STAVD]-[STAR]-[GAC]-x-[STAR].
NAME: Glutamine amidotransferases class-I active site.
CONSENSUS: [PAS]-[LIVMFYT]-[LIVMFY]-G-[LIVMFY]-C-[LIVMFYN]-G-x-[QEH]-x-[LIVMFA].
NAME: Glutamine amidotransferases class-II active site.
CONSENSUS: <x(0,11)-C-[GS]-[IV]-[LIVMFYW]-[AG].
NAME: Purine and other phosphorylases family 1 signature.
CONSENSUS: [GST]-x-G-[LIVM]-G-x-[PA]-S-x-[GSTA]-I-x(3)-E-L.
NAME: Purine and other phosphorylases family 2 signature.
CONSENSUS: [LIV]-x(3)-G-x(2)-H-x-[LIVMFY]-x(4)-[LIVMF]-x(3)-[ATV]-x(1,2)-[LIVM]-x-[ATV]-x(4)-[GN]-x(3,4)-[LIVMF](2)-x(2)-[STN]-[SA]-x-G-[GS]-[LIVM].
NAME: Thymidine and pyrimidine-nucleoside phosphorylases signature.
CONSENSUS: S-[GS]-R-[GA]-[LIV]-x(2)-[TA]-[GA]-G-T-x-D-x-[LIV]-E.
NAME: ATP phosphoribosyltransferase signature.
CONSENSUS: E-x(5)-G-x-[SAG]-x(2)-[IV]-x-D-[LIV]-x(2)-[ST]-G-x-T-[LM].
NAME: NAD:arginine ADP-ribosyltransferases signature.
CONSENSUS: [FY]-x-[FY]-K-x(2)-H-[FY]-x-L-[ST]-x-A.
NAME: Prolipoprotein diacylglyceryl transferase signature.
CONSENSUS: G-R-x-[GA]-N-F-[LIVMF]-N-x-E-x(2)-G.
NAME: S-adenosylmethionine synthetase signature 1.
CONSENSUS: G-A-G-D-Q-G-x(3)-G-Y.
NAME: S-adenosylmethionine synthetase signature 2.
CONSENSUS: G-[GA]-G-[ASC]-F-S-x-K-[DE].
NAME: Polyprenyl synthetases signature 1.
CONSENSUS: [LIVM](2)-x-D-D-x(2,4)-D-x(4)-R-R-[GH].
NAME: Polyprenyl synthetases signature 2.
CONSENSUS: [LIVMFY]-G-x(2)-[FYL]-Q-[LIVM]-x-D-D-[LIVMFY]-x-[DNG].
NAME: Squalene and phytoene synthases signature 1.
CONSENSUS: Y-[CSAM]-x(2)-[VSG]-A-[GSA]-[LIVAT]-[IV]-G-x(2)-[LMSC]-x(2)-[LIV].
NAME: Squalene and phytoene synthases signature 2.
CONSENSUS: [LIVM]-G-x(3)-Q-x(2,3)-N-[IF]-x-R-D-[LIVMFY]-x(2)-[DE]-x(4,7)-R-x-[FY]-x-P.
NAME: Protein prenyltransferases alpha subunit repeat signature.
CONSENSUS: [PSIAV]-x-[NDFV]-[NEQIY]-x-[LIVMAGP]-W-[NQSTHF]-[FYHQ]-[LIVMR].
NAME: Riboflavin synthase alpha chain family signature.
CONSENSUS: [LIVMF]-x(5)-G-[STADNQ]-[KREQIYW]-V-N-[LIVM]-E.
NAME: Dihydropteroate synthase signature 1.
CONSENSUS: [LIVM]-x-[AG]-[LIVMF](2)-N-x-T-x-D-S-F-x-D-x-[SG].
NAME: Dihydropteroate synthase signature 2.
CONSENSUS: [GE]-[SA]-x-[LIVM](2)-D-[LIVM]-G-[GP]-x(2)-[STA]-x-P.
NAME: EPSP synthase signature 1.
CONSENSUS: [LIVM]-x(2)-[GN]-N-[SA]-G-T-[STA]-x-R-x-[LIVMY]-x-[GSTA].
NAME: EPSP synthase signature 2.
CONSENSUS: [KR]-x-[KH]-E-[CST]-[DNE]-R-[LIVM]-x-[STA]-[LIVMC]-x(2)-[EN]-[LIVMF]-x-[KRA]-[LIVMF]-G.
NAME: FLAP/GST2/LTC4S family signature.
CONSENSUS: G-x(3)-F-E-R-V-[FY]-x-A-[NQ]-x-N-C.
NAME: Aminotransferases class-I pyridoxal-phosphate attachment site.
CONSENSUS: [GS]-[LIVMFYTAC]-[GSTA]-K-x(2)-[GSALVN]-[LIVMFA]-x-[GNAR]-x-R-[LIVMA]-[GA].
NAME: Aminotransferases class-II pyridoxal-phosphate attachment site.
CONSENSUS: T-[LIVMFYW]-[STAG]-K-[SAG]-[LIVMFYWR]-[SAG]-x(2)-[SAG].
NAME: Aminotransferases class-III pyridoxal-phosphate attachment site.
CONSENSUS: [LIVMFYWC](2)-x-D-E-[LIVMA]-x(2)-[GP]-x(0,1)-[LIVMFYWAG]-x(0,1)-[SACR]-x-[GSAD]-x(12,16)-D-[LIVMFYWC]-x(2,3)-[GSA]-K-x(3)-[GSTADN]-[GSA].
NAME: Aminotransferases class-IV signature.
CONSENSUS: E-x-[STAGCI]-x(2)-N-[LIVMFAC]-[FY]-x(6,12)-[LIVMF]-x-T-x(6,8)-[LIVM]-x-[GS]-[LIVM]-x-[KR].
NAME: Aminotransferases class-V pyridoxal-phosphate attachment site.
CONSENSUS: [LIVFYCHT]-[DGH]-[LIVMFYAC]-[LIVMFYA]-x(2)-[GSTAC]-[GSTA]-[HQR]-K-x(4,6)-G-x-[GSAT]-x-[LIVMFYSAC].
NAME: Hexokinases signature.
CONSENSUS: [LIVM]-G-F-[TN]-F-S-[FY]-P-x(5)-[LIVM]-[DNST]-x(3)-[LIVM]-x(2)-W-T-K-x-[LF].
NAME: Galactokinase signature.
CONSENSUS: G-R-x-N-[LIV]-I-G-E-H-x-D-Y.
NAME: GHMP kinases putative ATP-binding domain.
CONSENSUS: [LIVM]-[PK]-x-[GSTA]-x(0,1)-G-L-[GS]-S-S-[GSA]-[GSTAC].
NAME: Phosphofructokinase signature.
CONSENSUS: [RK]-x(4)-G-H-x-Q-[QR]-G-G-x(5)-D-R.
NAME: pfkB family of carbohydrate kinases signature 1.
CONSENSUS: [AG]-G-x(0,1)-[GAP]-x-N-x-[STA]-x(6)-[GS]-x(9)-G.
NAME: pfkB family of carbohydrate kinases signature 2.
CONSENSUS: [DNSK]-[PSTV]-x-[SAG](2)-[GD]-D-x(3)-[SAGV]-[AG]-[LIVMFY]-[LIVMSTAP].
NAME: ROK family signature.
CONSENSUS: [LIVM]-x(2)-G-[LIVMFCT]-G-x-[GA]-[LIVMFA]-x(8)-G-x(3,5)-[GATP]-x(2)-G-[RKH].
NAME: Phosphoribulokinase signature.
CONSENSUS: K-[LIVM]-x-R-D-x(3)-R-G-x-[ST]-x-E.
NAME: Thymidine kinase cellular-type signature.
CONSENSUS: [GA]-x(1,2)-[DE]-x-Y-x-[STAP]-x-C-[NKR]-x-[CH]-[LIVMFYWH].
NAME: FGGY family of carbohydrate kinases signature 1.
CONSENSUS: [MFYGS]-x-[PST]-x(2)-K-[LIVMFYW]-x-W-[LIVMF]-x-[DENQTKR]-[ENQH].
NAME: FGGY family of carbohydrate kinases signature 2.
CONSENSUS: [GSA]-x-[LIVMFYW]-x-G-[LIVM]-x(7,8)-[HDENQ]-[LIVMF]-x(2)-[AS]-[STAIVM]-[LIVMFY]-[DEQ].
NAME: Protein kinases ATP-binding region signature.
CONSENSUS: [LIV]-G-{P}-G-{P}-[FYWMGSTNH]-[SGA]-{PW}-[LIVCAT]-{PD}-x-[GSTACLIVMFY]-x(5,18)-[LIVMFYWCSTAR]-[AIVP]-[LIVMFAGCKR]-K.
NAME: Serine/Threonine protein kinases active-site signature.
CONSENSUS: [LIVMFYC]-x-[HY]-x-D-[LIVMFY]-K-x(2)-N-[LIVMFYCT](3).
NAME: Tyrosine protein kinases specific active-site signature.
CONSENSUS: [LIVMFYC]-x-[HY]-x-D-[LIVMFY]-[RSTAC]-x(2)-N-[LIVMFYC](3).
NAME: Protein kinase domain profile.
NAME: Casein kinase II regulatory subunit signature.
CONSENSUS: C-P-x-[LIVMY]-x-C-x(5)-L-P-[LIVMC]-G-x(9)-V-[KR]-x(2)-C-P-x-C.
NAME: Pyruvate kinase active site signature.
CONSENSUS: [LIVAC]-x-[LIVM](2)-[SAPCV]-K-[LIV]-E-[NKRST]-x-[DEQH]-[GSTA]-[LIVM].
NAME: Shikimate kinase signature.
CONSENSUS: [KR]-x(2)-E-x(3)-[LIVMF]-x(6,12)-[LIVMF](2)-[SA]-x-G(3)-x-[LIVMF].
NAME: Prokaryotic diacylglycerol kinase signature.
CONSENSUS: E-x-[LIVM]-N-[ST]-[SA]-[LIV]-E-x(2)-V-D.
NAME: Phosphatidylinositol 3- and 4-kinases signature 1.
CONSENSUS: [LIVMFAC]-K-x(1,3)-[DEA]-[DE]-[LIVMC]-R-Q-[DE]-x(4)-Q.
NAME: Phosphatidylinositol 3- and 4-kinases signature 2.
CONSENSUS: [GS]-x-[AV]-x(3)-[LIVM]-x(2)-[FYH]-[LIVM](2)-x-[LIVMF]-x-D-R-H-x(2)-N.
NAME: Acetate and butyrate kinases family signature 1.
CONSENSUS: [LIVM](2)-x-[LIVM]-N-x-G-S-[ST]-S-x-[KE].
NAME: Acetate and butyrate kinases family signature 2.
CONSENSUS: [LIVMA](2)-x(2)-H-x-G-x-G-x-[ST]-[LIVM]-x-[AV]-x(3)-G.
NAME: Phosphoglycerate kinase signature.
CONSENSUS: [KRHGTCV]-[VT]-[LIVMF]-[LIVMC]-R-x-D-x-N-[SACV]-P.
NAME: Aspartokinase signature.
CONSENSUS: [LIVM]-x-K-[FY]-G-G-[ST]-[SC]-[LIVM].
NAME: Glutamate 5-kinase signature.
CONSENSUS: [GSTN]-x(2)-G-x-G-[GC]-[IM]-x-[STA]-K-[LIVM]-x-[SA]-[TCA]-x(2)-[GALV]-x(3)-G.
NAME: ATP:guanido phosphotransferases active site.
CONSENSUS: C-P-x(0,1)-[ST]-N-[IL]-G-T.
NAME: PTS HPR component histidine phosphorylation site signature.
CONSENSUS: G-[LIVM]-H-[STA]-R-[PA]-[GSTA]-[STAM].
NAME: PTS HPR component serine phosphorylation site signature.
CONSENSUS: [GSADE]-[KREQTV]-x(4)-[KRN]-S-[LIVMF](2)-x-[LIVM]-x(2)-[LIVM]-[GAD].
NAME: PTS EIIA domains phosphorylation site signature 1.
CONSENSUS: G-x(2)-[LIVMF](3)-H-[LIVMF]-G-[LIVMF]-x-T-[ALV].
NAME: PTS EIIA domains phosphorylation site signature 2.
CONSENSUS: [DENQ]-x(6)-[LIVMF]-[GA]-x(2)-[LIVM]-A-[LIVM]-P-H-[GAC].
NAME: PTS EIIB domains cysteine phosphorylation site signature.
CONSENSUS: N-[LIVMFY]-x(5)-C-x-T-R-[LIVMF]-x-[LIVMF]-x-[LIVM]-x-[DQ].
NAME: Adenylate kinase signature.
CONSENSUS: [LIVMFYW](3)-D-G-[FYI]-P-R-x(3)-[NQ].
NAME: Nucleoside diphosphate kinases active site.
CONSENSUS: N-x(2)-H-[GA]-S-D-[SA]-[LIVMPKNE].
NAME: Guanylate kinase signature.
CONSENSUS: T-[ST]-R-x(2)-[KR]-x(2)-[DE]-x(2)-G-x(2)-Y-x-[FY]-[LIVM].
NAME: Guanylate kinase domain profile.
NAME: Phosphoribosyl pyrophosphate synthetase signature.
CONSENSUS: D-[LI]-H-[SA]-x-Q-[IMST]-[QM]-G-[FY]-F-x(2)-P-[LIVMF]-D.
NAME: 7,8-dihydro-6-hydroxymethylpterin-pyrophosphokinase signature.
CONSENSUS: G-[PE]-R-x(2)-D-L-D-[LIVM](2).
NAME: Bacteriophage-type RNA polymerase family active site signature 1.
CONSENSUS: P-[LIVM]-x(2)-D-[GA]-[ST]-[AC]-[SN]-[GA]-[LIVMFY]-Q.
NAME: Bacteriophage-type RNA polymerase family active site signature 2.
CONSENSUS: [LIVMF]-x-R-x(3)-K-x(2)-[LIVMF]-M-[PT]-x(2)-Y.
NAME: Eukaryotic RNA polymerase II heptapeptide repeat.
CONSENSUS: Y-[ST]-P-[ST]-S-P-[STANK].
NAME: RNA polymerases beta chain signature.
CONSENSUS: G-x-K-[LIVMFA]-[STAC]-[GSTN]-x-[HSTA]-[GS]-[QNH]-K-G-[IVT].
NAME: RNA polymerases M / 15 Kd subunits signature.
CONSENSUS: F-C-x-[DEKST]-C-[GNK]-[DNSA]-[LIVMH]-[LIVM]-
Figure imgf000427_0001
NAME: RNA polymerases D / 30 to 40 Kd subunits signature.
CONSENSUS: N-[SGA]-[LIVMF]-R-R-x(9)-[SA]-x(3)-V-x( )-N-x-[STA]-x(3)-[DN]-E-x-[LI]-[GA]-x-R-[LI]-[GA]-[LIVM](2)-P.
NAME: RNA polymerases H / 23 Kd subunits signature.
CONSENSUS: H-[NEI]-[LIVM]-V-P-x-H-x(2)-[LIVM]-x(2)-[DE].
NAME: RNA polymerases K / 14 to 18 Kd subunits signature.
CONSENSUS: [ST]-x-[FY]-E-x-[AT]-R-x-[LIVM]-[GSA]-x-R-[SA]-x-Q.
NAME: RNA polymerases L / 13 to 16 Kd subunits signature.
CONSENSUS: [DE](2)-H-[ST]-[LIVM]-[GAP]-N-x(11)-V-x-[FM]-x(2)-Y-x(3)-H-P.
NAME: RNA polymerases N / 8 Kd subunits signature.
CONSENSUS: [LIVMF](2)-P-[LIVM]-x-C-F-[ST]-C-G.
NAME: DNA polymerase family A signature.
CONSENSUS: R-x(2)-[GSAV]-K-x(3)-[LIVMFY]-[AGQ]-x(2)-Y-x(2)-[GS]-x(3)-[LIVMA].
NAME: DNA polymerase family B signature.
CONSENSUS: [YA]-[GLIVMSTAC]-D-T-D-[SG]-[LIVMFTC]-x-[LIVMSTAC].
NAME: DNA polymerase family X signature.
CONSENSUS: G-[SG]-[LFY]-x-R-[GE]-x(3)-[SGCL]-x-D-[LIVM]-D-[LIVMFY](3)-x(2)-[SAP].
NAME: Galactose-1-phosphate uridyl transferase family 1 active site signature.
CONSENSUS: F-E-N-[RK]-G-x(3)-G-x(4)-H-P-H-x-Q.
NAME: Galactose-1-phosphate uridyl transferase family 2 signature.
CONSENSUS: D-L-P-I-V-G-G-[ST]-[LIVM](2)-[SA]-H-[DEN]-H-[FY]-Q-G-G.
NAME: ADP-glucose pyrophosphorylase signature 1.
CONSENSUS: [AG]-G-G-x-G-[STK]-x-L-x(2)-L-[TA]-x(3)-A-x-P-A-[LV].
NAME: ADP-glucose pyrophosphorylase signature 2.
CONSENSUS: W-[FY]-x-G-[ST]-A-[DNSH]-[AS]-[LIVMFYW].
NAME: ADP-glucose pyrophosphorylase signature 3.
CONSENSUS: [APV]-[GS]-M-G-[LIVMN]-Y-[IVC]-[LIVMFY]-x(2)-[DENPHK].
NAME: Phosphatidate cytidylyltransferase signature.
CONSENSUS: S-x-[LIVMF]-K-R-x(4)-K-D-x-[GSA]-x(2)-[LI]-[PG]-x-H-G-G-[LIVM]-x-D-R-[LIVMFT]-D.
NAME: Ribonuclease PH signature.
CONSENSUS: C-[DE]-[LIVM](2)-Q-[GTA]-D-G-[SG]-x(2)-[TA]-A.
NAME: 2'-5'-oligoadenylate synthetases signature 1.
CONSENSUS: G-G-S-x-[AG]-[KR]-x-T-x-L-[KR]-[GST]-x-S-D-[AG].
NAME: 2'-5'-oligoadenylate synthetases signature 2.
CONSENSUS: R-P-V-I-L-D-P-x-[DE]-P-T.
NAME: CDP-alcohol phosphatidyltransferases signature.
CONSENSUS: D-G-x(2)-A-R-x(8)-G-x(3)-D-x(3)-D.
NAME: PEP-utilizing enzymes phosphorylation site signature.
CONSENSUS: G-[GA]-x-[TN]-x-H-[STA]-[STAV]-[LIVM](2)-[STAV]-[RG].
NAME: PEP-utilizing enzymes signature 2.
CONSENSUS: [DEQS]-x-[LIVMF]-S-[LIVMF]-G-[ST]-N-D-[LIVM]-x-Q-[LIVMFYGT]-[STALIV]-[LIVMF]-[GAS]-x(2)-R.
NAME: Rhodanese signature 1.
CONSENSUS: [FY]-x(3)-H-[LIV]-P-G-A-x(2)-[LIVF].
NAME: Rhodanese C-terminal signature.
CONSENSUS: [AV]-x(2)-[FY]-[DEAP]-G-[GSA]-[WF]-x-E-[FYW].
NAME: CoA transferases signature 1.
CONSENSUS: [DN]-[GN]-x(2)-[LIVMFA](3)-G-G-F-x(3)-G-x-P.
NAME: CoA transferases signature 2.
CONSENSUS: [LF]-[HQ]-S-E-N-G-[LIVF](2)-[GA].
NAME: Phospholipase A2 histidine active site.
CONSENSUS: C-C-x(2)-H-x(2)-C.
NAME: Phospholipase A2 aspartic acid active site.
CONSENSUS: [LIVMA]-C-{LIVMFYWPCST}-C-D-x(5)-C.
NAME: Lipases, serine active site.
CONSENSUS: [LIV]-x-[LIVFY]-[LIVMST]-G-[HYWV]-S-x-G-[GSTAC].
NAME: Colipase signature.
CONSENSUS: Y-x(2)-Y-Y-x-C-x-C.
NAME: Lipolytic enzymes "G-D-S-L" family, serine active site.
CONSENSUS: [LIVMFYAG](4)-G-D-S-[LIVM]-x(1,2)-[TAG]-G.
NAME: Lipolytic enzymes "G-D-X-G" family, putative histidine active site.
CONSENSUS: [LIVMF](2)-x-[LIVMF]-H-G-G-[SAG]-[FY]-x(3)-[STDN]-x(2)-[ST]-H.
NAME: Lipolytic enzymes "G-D-X-G" family, putative serine active site.
CONSENSUS: [LIVM]-x-[LIVMF]-[SA]-G-D-S-[CA]-G-[GA]-x-L-[CA].
NAME: Carboxylesterases type-B serine active site.
CONSENSUS: F-[GR]-G-x(4)-[LIVM]-x-[LIV]-x-G-x-S-[STAG]-G.
NAME: Carboxylesterases type-B signature 2.
CONSENSUS: [ED]-D-C-L-[YT]-[LIV]-[DNS]-[LIV]-[LIVFYW]-x-[PQR].
NAME: Pectinesterase signature 1.
CONSENSUS: [GSTN]-x(5)-[LIVM]-x-[LIVM]-x(2)-G-x-Y-[DNK]-E-x-[LIVM]-x-[LIVM].
NAME: Pectinesterase signature 2.
CONSENSUS: G-[STAD]-[LIVMT]-D-F-I-F-G.
NAME: Peptidyl-tRNA hydrolase signature 1.
CONSENSUS: [FY]-x(2)-T-R-H-N-x-G-x(2)-[LIVMFA](2)-[DE].
NAME: Peptidyl-tRNA hydrolase signature 2.
CONSENSUS: [GS]-x(3)-H-N-G-[LIVM]-[KR]-[DNS]-[LIVMT].
NAME: Alkaline phosphatase active site.
CONSENSUS: [IV]-x-D-S-[GAS]-[GASC]-[GAST]-[GA]-T.
NAME: Histidine acid phosphatases phosphohistidine signature.
CONSENSUS: [LIVM]-x(2)-[LIVMA]-x(2)-[LIVM]-x-R-H-[GN]-x-R-x-[PAS].
NAME: Histidine acid phosphatases active site signature.
CONSENSUS: [LIVMF]-x-[LIVMFAG]-x(2)-[STAGI]-H-D-[STANQ]-x-[LIVM]-x(2)-[LIVMFY]-x(2)-[STA].
NAME: Class A bacterial acid phosphatases signature.
CONSENSUS: G-S-Y-P-S-G-H-T.
NAME: 5'-nucleotidase signature 1.
CONSENSUS: [LIVM]-x-[LIVM](2)-[HEA]-[TI]-x-D-x-H-[GSA]-x-[LIVMF].
NAME: 5'-nucleotidase signature 2.
CONSENSUS: [FYP]-x(4)-[LIVM]-G-N-H-E-F-[DN].
NAME: Fructose-1-6-bisphosphatase active site.
CONSENSUS: [AG]-[RK]-L-x(1,2)-[LIV]-[FY]-E-x(2)-P-[LIVM]-[GSA].
NAME: Serine/threonine specific protein phosphatases signature.
CONSENSUS: [LIVM]-R-G-N-H-E.
NAME: Protein phosphatase 2A regulatory subunit PR55 signature 1.
CONSENSUS: E-F-D-Y-L-K-S-L-E-I-E-E-K-I-N.
NAME: Protein phosphatase 2A regulatory subunit PR55 signature 2.
CONSENSUS: N-[AG]-H-[TA]-Y-H-I-N-S-I-S-[LIVM]-N-S-D.
NAME: Protein phosphatase 2C signature.
CONSENSUS: [LIVMFY]-[LIVMFYA]-[GSAC]-[LIVM]-[FYC]-D-G-H-[GAV].
NAME: Tyrosine specific protein phosphatases active site.
CONSENSUS: [LIVMF]-H-C-x(2)-G-x(3)-[STC]-[STAGP]-x-[LIVMFY].
NAME: Tyrosine specific protein phosphatases profile.
NAME: Dual specificity protein phosphatase profile.
NAME: PTP type protein phosphatase profile.
NAME: Inositol monophosphatase family signature 1.
CONSENSUS: [FWV]-x(0,1)-[LIVM]-D-P-[LIVM]-D-[SG]-[ST]-x(2)-[FY]-x-[HKRNSTY].
NAME: Inositol monophosphatase family signature 2- CONSENSUS: [UV!-D-x-[AC!-[GSA!-[GSAPV!-x-[LIVACP!-[LIV!-. [LIVAC!-x(3)-[GH!-[GA!. NAME: Prokaryotic zinc-dependent phospholipase C signature. CONSENSUS: H-Y-x-[GT!-D-[LIVM!-[DNS!-x-P-x-H-[PA!-x-N •
NAME: Phosphatidylinositol-specific phospholipase X-box domain profile-
NAME: Phosphatidylinositol-specific phospholipase Y-box domain profile-
NAME: 3'5'-cyclic nucleotide phosphodiesterases signature- CONSENSUS: H-D-[LIVMFY!-x-H-x-[AG!-x (2) -[N(Q!-x-[LIVMFY! -
NAME: cAMP phosphodiesterases class-II signature- CONSENSUS: H-x-H-L-D-H-[LIVM!-x-[GS!-[LIVMA!-[LIVM! (2) -x-S- [AP!-
NAME: Sulfatases signature 1.
CONSENSUS: [SAP]-[LIVMST]-[CS]-[STAC]-P-[STA]-R-x(2)-[LIVMFW](2)-[TR]-G.
NAME: Sulfatases signature 2.
CONSENSUS: G-[YV]-x-[ST]-x(2)-[IVA]-G-K-x(0,1)-[FYWK]-[HL].
NAME: AP endonucleases family 1 signature 1.
CONSENSUS: [APF]-D-[LIVMF](2)-x-[LIVM]-Q-E-x-K.
NAME: AP endonucleases family 1 signature 2.
CONSENSUS: D-[ST]-[FY]-R-[KH]-x(7,8)-[FYW]-[ST]-[FYW](2).
NAME: AP endonucleases family 1 signature 3.
CONSENSUS: N-x-G-x-R-[LIVM]-D-[LIVMFYH]-x-[LV]-x-S.
NAME: AP endonucleases family 2 signature 1.
CONSENSUS: H-x(2)-Y-[LIVMF]-[IM]-N-[LIVMCA]-[AG].
NAME: AP endonucleases family 2 signature 2.
CONSENSUS: [GR]-[LIVMF]-C-[LIVM]-D-T-C-H.
NAME: AP endonucleases family 2 signature 3.
CONSENSUS: [LIVMW]-H-x-N-[DE]-[SA]-K-x(3)-G-[SA]-x(2)-D.
NAME: Deoxyribonuclease I signature 1.
CONSENSUS: [LIVM](2)-[AP]-L-H-[STA](2)-P-x(5)-E-[LIVM]-[DN]-x-L-x-[DE]-V.
NAME: Deoxyribonuclease I signature 2.
CONSENSUS: G-D-F-N-A-x-C-[SA].
NAME: Endonuclease III iron-sulfur binding region signature.
CONSENSUS: C-x(3)-[KRS]-P-[KRAGL]-C-x(2)-C-x(5)-C.
NAME: Endonuclease III family signature.
CONSENSUS: [GST]-x-[LIVMF]-P-x(5)-[LIVMW]-x(2,3)-[LI]-[PAS]-G-V-[GA]-x(3)-[GAC]-x(3)-[LIVM]-x(2)-[SALV]-[LIVMFYW]-[GANK].
NAME: Ribonuclease II family signature.
CONSENSUS: [HI]-[FYE]-[GSTAM]-[LIVM]-x(4,5)-Y-[STAL]-x-[FWVAC]-[TV]-[SA]-P-[LIVMA]-[RQ]-[KR]-[FY]-x-D-x(3)-[HQ].
NAME: Ribonuclease III family signature.
CONSENSUS: [DEQ]-[RQ]-[LM]-E-[FYW]-[LV]-G-D-[SAR].
NAME: Bacterial Ribonuclease P protein component signature.
CONSENSUS: [LIVMFYS]-x(2)-A-x(2)-R-[NH]-[KRQL]-[LIVM]-[KRA]-R-x-[LIVMTA]-[KR].
NAME: Ribonuclease T2 family histidine active site 1.
CONSENSUS: [FYWL]-x-[LIVM]-H-G-L-W-P.
NAME: Ribonuclease T2 family histidine active site 2.
CONSENSUS: [LIVMF]-x(2)-[HDGTY]-[EQ]-[FYW]-x-[KR]-H-G-x-C.
NAME: Pancreatic ribonuclease family signature.
CONSENSUS: C-K-x(2)-N-T-F.
NAME: DNA/RNA non-specific endonucleases active site.
CONSENSUS: D-R-G-H-[QIL]-x(3)-A.
NAME: Thermonuclease family signature 1.
CONSENSUS: D-G-D-T-[LIVM]-x-[LIVMC]-x(9,10)-R-[LIVM]-x(2)-[LIVM]-D-x-P-E.
NAME: Thermonuclease family signature 2.
CONSENSUS: D-[KR]-Y-[GQ]-R-x-[LV]-[GA]-x-[IV]-[FYW].
NAME: Beta-amylase active site 1.
CONSENSUS: H-x-C-G-G-N-V-G-D.
NAME: Beta-amylase active site 2.
CONSENSUS: G-x-[SA]-G-E-[LIVM]-R-Y-P-S-Y.
NAME: Glucoamylase active site region signature.
CONSENSUS: [STN]-[GP]-x(1,2)-[DE]-x-W-E-E-x(2)-[GS].
NAME: Polygalacturonase active site.
CONSENSUS: [GSDENKRH]-x(2)-[VMFC]-x(2)-[GS]-H-G-[LIVMAG]-x(1,2)-[LIVM]-G-S.
NAME: Clostridium cellulosome enzymes repeated domain signature.
CONSENSUS: D-[LIVMFY]-[DNV]-x-[DNS]-x(2)-[LIVM]-[DN]-[SALM]-x-D-x(3)-[LIVMF]-x-[RKS]-x-[LIVMF].
NAME: Chitinases family 18 active site.
CONSENSUS: [LIVMFY]-[DN]-G-[LIVMF]-[DN]-[LIVMF]-[DN]-x-E.
NAME: Chitinases family 19 signature 1.
CONSENSUS: C-x(4,5)-F-Y-[ST]-x(3)-[FY]-[LIVMF]-x-A-x(3)-[YF]-x(2)-F-[GSA].
NAME: Chitinases family 19 signature 2.
CONSENSUS: [LIVM]-[GSA]-F-x-[STAG](2)-[LIVMFY]-W-[FY]-W-[LIVM].
NAME: Alpha-lactalbumin / lysozyme C signature.
CONSENSUS: C-x(3)-C-x(2)-[LMF]-x(3)-[DEN]-[LI]-x(5)-C.
NAME: Alpha-galactosidase signature.
CONSENSUS: G-[LIVMFY]-x(2)-[LIVMFY]-x-[LIVM]-D-D-x-W-x(3,4)-R-[DNSF].
NAME: Trehalase signature 1.
CONSENSUS: P-G-G-R-F-x-E-x-Y-x-W-D-x-Y.
NAME: Trehalase signature 2.
CONSENSUS: Q-W-D-x-P-x-[GA]-W-[PA]-P.
NAME: Alpha-L-fucosidase putative active site.
CONSENSUS: P-x(2)-L-x(3)-K-W-E-x-C.
NAME: Glycosyl hydrolases family 1 active site.
CONSENSUS: [LIVMFSTC]-[LIVFYS]-[LIV]-[LIVMST]-E-N-G-[LIVMFAR]-[CSAGN].
NAME: Glycosyl hydrolases family 1 N-terminal signature.
CONSENSUS: F-x-[FYWM]-[GSTA]-x-[GSTA]-x-[GSTA](2)-[FYNH]-[NQ]-x-E-x-[GSTA].
NAME: Glycosyl hydrolases family 2 signature 1.
CONSENSUS: N-x-[LIVMFYWD]-R-[STACN](2)-H-Y-P-x(4)-[LIVMFYW](2)-x(3)-[DN]-x(2)-G-[LIVMFYW](4).
NAME: Glycosyl hydrolases family 2 acid/base catalyst.
CONSENSUS: [DENQF]-[KRVW]-N-H-[AP]-[SAC]-[LIVMF](3)-W-[GS]-x(2,3)-N-E.
NAME: Glycosyl hydrolases family 3 active site.
CONSENSUS: [LIVM](2)-[KR]-x-[EQK]-x(4)-G-[LIVMFT]-[LIVT]-[LIVMF]-[ST]-D-x(2)-[SGADNI].
NAME: Glycosyl hydrolases family 5 signature.
CONSENSUS: [LIV]-[LIVMFYWGA](2)-[DNEQG]-[LIVMGST]-x-N-E-[PV]-[RHDNSTLIVFY].
NAME: Glycosyl hydrolases family 6 signature 1.
CONSENSUS: V-x-Y-x(2)-P-x-R-D-C-[GSAF]-x(2)-[GSA](2)-x-G.
NAME: Glycosyl hydrolases family 6 signature 2.
CONSENSUS: [LIVMYA]-[LIVA]-[LIVT]-[LIV]-E-P-D-[SAL]-[LI]-[PSAG].
NAME: Glycosyl hydrolases family 8 signature.
CONSENSUS: A-[ST]-D-[AG]-D-x(2)-[IM]-A-x-[SA]-[LIVM]-[LIVMG]-x-A-x(3)-[FW].
NAME: Glycosyl hydrolases family 9 active sites signature 1.
CONSENSUS: [STV]-x-[LIVMFY]-[STV]-x(2)-G-x-[NKR]-x( )-[PLIVM]-H-x-R.
NAME: Glycosyl hydrolases family 9 active sites signature 2.
CONSENSUS: [FYW]-x-D-x(4)-[FYW]-x(3)-E-x-[STA]-x(3)-N-[STA].
NAME: Glycosyl hydrolases family 10 active site.
CONSENSUS: [GTA]-x(2)-[LIVN]-x-[IVMF]-[ST]-E-[LIY]-[DN]-[LIVMF].
NAME: Glycosyl hydrolases family 11 active site signature 1.
CONSENSUS: [PSA]-[LQ]-x-E-Y-Y-[LIVM](2)-[DE]-x-[FYWHN].
NAME: Glycosyl hydrolases family 11 active site signature 2.
CONSENSUS: [LIVMF]-x(2)-E-[AG]-[YWG]-[QRFGS]-[SG]-[STAN]-G-x-[SAF].
NAME: Glycosyl hydrolases family 16 active sites.
CONSENSUS: E-[LIV]-D-[LIV]-x(0,1)-E-x(2)-[GQ]-[KRNF]-x-[PSTA].
NAME: Glycosyl hydrolases family 17 signature.
CONSENSUS: [LIVM]-x-[LIVMFYWA](3)-[STAG]-E-[STA]-G-W-P-[STN]-x-[SAGQ].
NAME: Glycosyl hydrolases family 25 active sites signature.
CONSENSUS: D-[LIVM]-x(3)-[NQ]-[PG]-x(9,10)-G-x(4)-[LIVMFY](2)-K-x-[ST]-E-[GS]-x(2)-Y-x-[DN].
NAME: Glycosyl hydrolases family 31 active site.
CONSENSUS: [GF]-[LIVMF]-W-x-D-M-[NSA]-E.
NAME: Glycosyl hydrolases family 31 signature 2.
CONSENSUS: G-[AV]-D-[LIVMT]-C-G-[FY]-x(3)-[ST]-x(3)-L-C-x-R-W-x(2)-[LV]-[GS]-[SA]-F-x-P-F-x-R-[DN].
NAME: Glycosyl hydrolases family 32 active site.
CONSENSUS: H-x(2)-P-x(4)-[LIVM]-N-D-P-N-G.
NAME: Glycosyl hydrolases family 35 putative active site.
CONSENSUS: G-G-P-[LIVM](2)-x(2)-Q-x-E-N-E-[FY].
NAME: Glycosyl hydrolases family 39 active site.
CONSENSUS: W-x-F-E-x-W-N-E-P-[DN].
NAME: Glycosyl hydrolases family 45 active site.
CONSENSUS: [STA]-T-R-Y-[FYW]-D-x(5)-[CA].
NAME: Prokaryotic transglycosylases signature.
CONSENSUS: [LIVM]-x(3)-E-S-x(3)-[AP]-x(3)-S-x(5)-G-[LIVM]-[LIVMFYW]-x-[LIVMFYW]-x(4)-[SAG].
NAME: Inosine-uridine preferring nucleoside hydrolase family signature.
CONSENSUS: D-x-D-[PT]-[GA]-x-D-D-[TAV]-[VI]-A.
NAME: Alkylbase DNA glycosidases alkA family signature.
CONSENSUS: G-I-G-x-W-[ST]-[AV]-x-[LIVMFY](2)-x-[LIVM]-x(6)-[MF]-x(2)-[ED]-D.
NAME: Formamidopyrimidine-DNA glycosylase signature.
CONSENSUS: C-x(2,4)-C-x-[GTAQ]-x-[IV]-x(7)-R-[GSTAN]-[STA]-x-[FYI]-C-x(2)-C-Q.
NAME: Uracil-DNA glycosylase signature.
CONSENSUS: [KR]-[LIV]-[LIVC]-[LIVM]-x-G-[QI]-D-P-Y.
NAME: S-adenosyl-L-homocysteine hydrolase signature 1.
CONSENSUS: [CS]-N-x-[FYL]-S-[ST]-[QA]-[DEN]-x-[AV](2)-A-A-[LIV]-[SAV].
NAME: S-adenosyl-L-homocysteine hydrolase signature 2.
CONSENSUS: G-K-x(3)-[LIV]-x-G-Y-G-x-V-G-[KR]-G-x-A.
NAME: Cytosol aminopeptidase signature.
CONSENSUS: N-T-D-A-E-G-R-L.
NAME: Aminopeptidase P and proline dipeptidase signature.
CONSENSUS: [HA]-[GSYR]-[LIVMT]-[SG]-H-x-[LIV]-G-[LIVM]-x-[IV]-H-[DE].
NAME: Methionine aminopeptidase subfamily 1 signature.
CONSENSUS: [MFY]-x-G-H-G-[LIVMC]-[GSH]-x(3)-H-x(4)-[LIVM]-x-[HN]-[YWV].
NAME: Methionine aminopeptidase subfamily 2 signature.
CONSENSUS: [DA]-[LIVMY]-x-K-[LIVM]-D-x-G-x-[HQ]-[LIVM]-[DNS]-G-x(3)-[DN].
NAME: Renal dipeptidase active site.
CONSENSUS: [LIVM]-E-G-[GA]-x(2)-[LIVMF]-x(6)-L-x(3)-Y-x(2)-G-[LIVM]-R.
NAME: Serine carboxypeptidases, serine active site.
CONSENSUS: [LIVM]-x-[GTA]-E-S-Y-[AG]-[GS].
NAME: Serine carboxypeptidases, histidine active site.
CONSENSUS: [LIVF]-x(2)-[LIVSTA]-x-[IVPST]-x-[GSDNQL]-[SAGV]-[SG]-H-x-[IVAQ]-P-x(3)-[PSA].
NAME: Zinc carboxypeptidases, zinc-binding region 1 signature.
CONSENSUS: [PK]-x-[LIVMFY]-x-[LIVMFY]-x(4)-H-[STAG]-x-E-x-[LIVM]-[STAG]-x(6)-[LIVMFYTA].
NAME: Zinc carboxypeptidases, zinc-binding region 2 signature.
CONSENSUS: H-[STAG]-x(3)-[LIVME]-x(2)-[LIVMFYW]-P-[FYW].
NAME: Serine proteases, trypsin family, histidine active site.
CONSENSUS: [LIVM]-[ST]-A-[STAG]-H-C.
NAME: Serine proteases, trypsin family, serine active site.
CONSENSUS: [DNSTAGC]-[GSTAPIMVQH]-x(2)-G-[DE]-S-G-[GS]-[SAPHV]-[LIVMFYWH]-[LIVMFYSTANQH].
NAME: Serine proteases, subtilase family, aspartic acid active site.
CONSENSUS: [STAIV]-x-[LIVMF]-[LIVM]-D-[DSTA]-G-[LIVMFC]-
Figure imgf000436_0001
NAME: Serine proteases, subtilase family, histidine active site.
CONSENSUS: H-G-[STM]-x-[VIC]-[STAGC]-[GS]-x-[LIVMA]-[STAGCLV]-[SAGM].
NAME: Serine proteases, subtilase family, serine active site.
CONSENSUS: G-T-S-x-[SA]-x-P-x(2)-[STAVC]-[AG].
NAME: Serine proteases, V8 family, histidine active site.
CONSENSUS: [ST]-G-[LIVMFYW](3)-[GN]-x(2)-T-[LIVM]-x-T-x(2)-H.
NAME: Serine proteases, V8 family, serine active site.
CONSENSUS: T-x(2)-[GC]-[NQ]-S-G-S-x-[LIVM]-[FY].
NAME: Serine proteases, omptin family signature 1.
CONSENSUS: W-T-D-x-S-x-H-P-x-T.
NAME: Serine proteases, omptin family signature 2.
CONSENSUS: A-G-Y-Q-E-[ST]-R-[FYW]-S-[FYW]-[TN]-A-x-G-G-[ST]-Y.
NAME: Prolyl endopeptidase family serine active site.
CONSENSUS: D-x(3)-A-x(3)-[LIVMFYW]-x(14)-G-x-S-x-G-G-[LIVMFYW](2).
NAME: Endopeptidase Clp serine active site.
CONSENSUS: T-x(2)-[LIVMF]-G-x-A-[SAC]-S-[MSA]-[PAG]-[STA].
NAME: Endopeptidase Clp histidine active site.
CONSENSUS: R-x(3)-[EAP]-x(3)-[LIVMFYT]-M-[LIVM]-H-Q-P.
NAME: ATP-dependent serine proteases, lon family, serine active site.
CONSENSUS: D-G-[PD]-S-A-[GS]-[LIVMCA]-[TA]-[LIVM].
NAME: Eukaryotic thiol (cysteine) proteases cysteine active site.
CONSENSUS: Q-x(3)-[GE]-x-C-[YW]-x(2)-[STAGC]-[STAGCV].
NAME: Eukaryotic thiol (cysteine) proteases histidine active site.
CONSENSUS: [LIVMGSTAN]-x-H-[GSACE]-[LIVM]-x-[LIVMAT](2)-G-x-[GSADNH].
NAME: Eukaryotic thiol (cysteine) proteases asparagine active site.
CONSENSUS: [FYCH]-[WI]-[LIVT]-x-[KRQAG]-N-[ST]-W-x(3)-[FYW]-G-x(2)-G-[LFYW]-[LIVMFYG]-x-[LIVMF].
NAME: Ubiquitin carboxyl-terminal hydrolase family 1 cysteine active-site.
CONSENSUS: Q-x(3)-N-[SA]-C-G-x(3)-[LIVM](2)-H-[SA]-[LIVM]-[SA].
NAME: Ubiquitin carboxyl-terminal hydrolases family 2 signature 1.
CONSENSUS: G-[LIVMFY]-x(1,3)-[AGC]-[NASM]-x-C-[FYW]-[LIVMC]-[NST]-[SACV]-x-[LIVMS]-Q.
NAME: Ubiquitin carboxyl-terminal hydrolases family 2 signature 2.
CONSENSUS: Y-x-L-x-[SAG]-[LIVMFT]-x(2)-H-x-G-x(4,5)-G-H-Y.
NAME: Caspase family histidine active site.
CONSENSUS: H-x(2,4)-[SC]-x(4)-[LIVMF](2)-[ST]-H-C.
NAME: Caspase family cysteine active site.
CONSENSUS: K-P-K-[LIVMF](4)-Q-A-C-[RQG]-G.
NAME: Eukaryotic and viral aspartyl proteases active site.
CONSENSUS: [LIVMFGAC]-[LIVMTADN]-[LIVFSA]-D-[ST]-G-[STAV]-[STAPDENQ]-x-[LIVMFSTNC]-x-[LIVMFGTA].
NAME: Neutral zinc metallopeptidases, zinc-binding region signature.
CONSENSUS: [GSTALIVN]-x(2)-H-E-[LIVMFYW]-{DEHRKP}-H-x-[LIVMFYWGSPQ].
NAME: Matrixins cysteine switch.
CONSENSUS: P-R-C-[GN]-x-P-[DR]-[LIVSAPKQ].
NAME: Insulinase family, zinc-binding region signature.
CONSENSUS: G-x(8,9)-G-x-[STA]-H-[LIVMFY]-[LIVMC]-[DERN]-[HRKL]-[LMFAT]-x-[LFSTH]-x-[GSTAN]-[GST].
//
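The consensus entries in this listing appear to follow PROSITE-style pattern syntax: residues separated by hyphens, `x` for any residue, `[...]` for allowed residues, `{...}` for excluded residues, and `(n)` or `(n,m)` for repeat counts. The following is a minimal sketch, under that assumption, of how such a pattern could be translated into a regular expression and used to scan a protein sequence. The function names `prosite_to_regex` and `find_motif` and the test sequence are hypothetical and are not part of the original disclosure.

```python
import re


def prosite_to_regex(pattern: str) -> str:
    """Translate a PROSITE-style consensus pattern into a regex string.

    Assumed syntax: elements separated by '-', 'x' = any residue,
    [ABC] = allowed set, {ABC} = excluded set, (n) or (n,m) = repeats,
    '<' / '>' = N- / C-terminal anchors, optional trailing '.'.
    """
    parts = []
    for element in pattern.strip().rstrip(".").split("-"):
        element = element.strip()
        if not element:
            continue
        prefix = suffix = ""
        if element.startswith("<"):
            prefix, element = "^", element[1:]
        if element.endswith(">"):
            suffix, element = "$", element[:-1]
        # Split off a trailing repeat count such as (2) or (1,3).
        count = ""
        m = re.fullmatch(r"(.+?)\((\d+)(?:,(\d+))?\)", element)
        if m:
            element = m.group(1)
            if m.group(3) is None:
                count = "{%s}" % m.group(2)
            else:
                count = "{%s,%s}" % (m.group(2), m.group(3))
        if element == "x":
            core = "."                          # any residue
        elif element.startswith("{") and element.endswith("}"):
            core = "[^" + element[1:-1] + "]"   # excluded residues
        else:
            core = element                      # single residue or [set]
        parts.append(prefix + core + count + suffix)
    return "".join(parts)


def find_motif(pattern: str, sequence: str) -> list:
    """Return 0-based start positions of non-overlapping matches."""
    regex = re.compile(prosite_to_regex(pattern))
    return [m.start() for m in regex.finditer(sequence)]


if __name__ == "__main__":
    # Serine/threonine specific protein phosphatases signature from the listing.
    print(prosite_to_regex("[LIVM]-R-G-N-H-E."))
    # Hypothetical test sequence containing the motif.
    print(find_motif("[LIVM]-R-G-N-H-E.", "MKALRGNHESAA"))
```

This sketch reports only non-overlapping matches; a lookahead-based search would be needed if overlapping occurrences matter.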
AC PS01016; DE Glycoprotease family signature.
CONSENSUS: [KR]-[GSAT]-x(4)-[FYWHL]-[DQNGK]-x-P-x-[LIVMFY]-x(3)-H-x(2)-[AG]-H-[LIVM].
NAME: Proteasome A-type subunits signature.
CONSENSUS: [FY]-x(4)-[STNV]-x-[FYW]-S-P-x-G-[RKH]-x(2)-Q-[LIVM]-[DE]-Y-[SAD]-x(2)-[SAG].
NAME: Proteasome B-type subunits signature.
CONSENSUS: [LIVMA]-[GSA]-[LIVMF]-x-[FYLVGAC]-x(2)-[GSACFY]-[LIVMSTAC](3)-[GAC]-[GSTACV]-[DES]-x(15)-[RK]-x(12,13)-G-x(2)-[GSTA]-D.
NAME: Signal peptidases I serine active site.
CONSENSUS: [GS]-x-S-M-x-[PS]-[AT]-[LF].
NAME: Signal peptidases I lysine active site.
CONSENSUS: K-R-[LIVMSTA](2)-G-x-[PG]-G-[DE]-x-[LIVM]-x-[LIVMFY].
NAME: Signal peptidases I signature 3.
CONSENSUS: [LIVMFYW](2)-x(2)-G-D-[NH]-x(3)-[SND]-x(2)-[SG].
NAME: Signal peptidases II signature.
CONSENSUS: [GAF]-[GA]-[GAS]-[LIVM]-[GAS]-N-[LVMFG]-[LIVMFY]-D-R-[LIMFA].
NAME: Peptidase family U32 signature.
CONSENSUS: E-x-F-x(2)-G-[SA]-[LIVM]-C-x(4)-G-x-C-x-[LIVM]-S.
NAME: Amidases signature.
CONSENSUS: G-[GA]-S-S-[GS]-G-x-[GSA]-[GSAVY]-x-[LIVM]-[GSA]-x(6)-[GSA]-x-[GA]-x-D-x-[GA]-x-S-[LIVM]-R-x-P-[GSAC].
NAME: Asparaginase / glutaminase active site signature 1.
CONSENSUS: [LIVM]-x(2)-T-G-G-T-[IV]-[AGS].
NAME: Asparaginase / glutaminase active site signature 2.
CONSENSUS: G-x-[LIVM]-x(2)-H-G-T-D-T-[LIVM].
NAME: Urease nickel ligands signature.
CONSENSUS: T-[AY]-[GA]-[GAT]-[LIVM]-D-x-H-[LIVM]-H-x(3)-P.
NAME: Urease active site.
CONSENSUS: [LIVM](2)-[CT]-H-[HN]-L-x(3)-[LIVM]-x(2)-D-[LIVM]-x-F-A.
NAME: ArgE / dapE / ACY1 / CPG2 / yscS family signature 1.
CONSENSUS: [LIV]-[GALMY]-[LIVMF]-x-[GSA]-H-x-D-[TV]-[STAV].
NAME: ArgE / dapE / ACY1 / CPG2 / yscS family signature 2.
CONSENSUS: [GSTAI]-[SANQ]-D-x-K-[GSACN]-x(2)-[LIVMA]-x(2)-[LIVMFY]-x(1 ,17)-[LIVM]-x-[LIVMF]-[LIVMSTAG]-[LIVMFA]-x(2)-[DNG]-E-E-x-[GSTN].
NAME: Dihydroorotase signature 1.
CONSENSUS: D-[LIVMFYWSAP]-H-[LIVA]-H-[LIVF]-[RN]-x-[PGN].
NAME: Dihydroorotase signature 2.
CONSENSUS: [GA]-[ST]-D-x-A-P-H-x( )-K.
NAME: Beta-lactamase class-A active site.
CONSENSUS: [FY]-x-[LIVMFY]-x-S-[TV]-x-K-x(4)-[AGLM]-x(2)-[LC].
NAME: Beta-lactamase class-C active site.
CONSENSUS: F-E-[LIVM]-G-S-[LIVMG]-[SA]-K.
NAME: Beta-lactamase class-D active site.
CONSENSUS: [PA]-x-S-[ST]-F-K-[LIV]-[PAL]-x-[STA]-[LI].
NAME: Beta-lactamases class B signature 1.
CONSENSUS: [LI]-x-[STN]-[HN]-x-H-[GSTA]-D-x(2)-G-[GP]-x(7,8)-[GS].
NAME: Beta-lactamases class B signature 2.
CONSENSUS: P-x(3)-[LIVM](2)-x-G-x-C-[LIVMF](2)-K.
NAME: Arginase family signature 1.
CONSENSUS: [LIVMF]-G-G-x-H-x-[LIVMT]-[STAV]-x-[PAG]-x(3)-[GSTA].
NAME: Arginase family signature 2.
CONSENSUS: [LIVM](2)-x-[LIVMFY]-D-[AS]-H-x-D.
NAME: Arginase family signature 3.
CONSENSUS: [ST]-[LIVMFY]-D-[LIVM]-D-x(3)-[PAQ]-x(3)-P-[GSA]-x(7)-G.
NAME: Adenosine and AMP deaminase signature.
CONSENSUS: [SA]-[LIVM]-[NGS]-[STA]-D-D-P.
NAME: Cytidine and deoxycytidylate deaminases zinc-binding region signature.
CONSENSUS: [CH]-[AGV]-E-x(2)-[LIVMFGAT]-[LIVM]-x(17,33)-P-C-x(2,8)-C-x(3)-[LIVM].
NAME: GTP cyclohydrolase I signature 1.
CONSENSUS: [EN]-[LIVM](2)-x(2)-[KRQN]-[DN]-[LIVM]-x(3)-[ST]-x-C-E-H-H.
NAME: GTP cyclohydrolase I signature 2.
CONSENSUS: [SA]-x-[RK]-x-Q-[LIVM]-Q-E-[RN]-[LI]-[TSN].
NAME: Nitrilases / cyanide hydratase signature 1.
CONSENSUS: G-x(2)-[LIVMFY](2)-x-[IF]-x-E-x(2)-[LIVM]-x-G-Y-P.
NAME: Nitrilases / cyanide hydratase active site signature.
CONSENSUS: G-[GAQ]-x(2)-C-[WA]-E-[NH]-x(2)-[PST]-[LIVMFYS]-x-[KR].
NAME: Inorganic pyrophosphatase signature.
CONSENSUS: D-[SGDN]-D-[PE]-[LIVMF]-D-[LIVMGAC].
NAME: Acylphosphatase signature 1.
CONSENSUS: [LIV]-x-G-x-V-Q-G-V-x-[FM]-R.
NAME: Acylphosphatase signature 2.
CONSENSUS: G-[FYW]-[AVC]-[KRQAM]-N-x(3)-G-x-V-x(5)-G.
NAME: ATP synthase alpha and beta subunits signature.
CONSENSUS: P-[SAP]-[LIV]-[DNH]-x(3)-S-x-S.
NAME: ATP synthase gamma subunit signature.
CONSENSUS: [IV]-T-x-E-x(2)-[DE]-x(3)-G-A-x-[SAKR].
NAME: ATP synthase delta (OSCP) subunit signature.
CONSENSUS: [LIVM]-x-[LIVMFYT]-x(3)-[LIVMT]-[DENQK]-x(2)-[LIVM]-x-[GSA]-G-[LIVMFYGA]-x-[LIVM]-[KRHENQ]-x-[GSEN].
NAME: ATP synthase a subunit signature.
CONSENSUS: [STAGN]-x-[STAG]-[LIVMF]-R-L-x-[SAGV]-N-[LIVMT].
NAME: ATP synthase c subunit signature.
CONSENSUS: [GSTA]-R-[NQ]-P-x(10)-[LIVMFYW](2)-x(3)-[LIVMFYW]-x-[DE].
NAME: E1-E2 ATPases phosphorylation site.
CONSENSUS: D-K-T-G-T-[LI]-[TI].
NAME: Sodium and potassium ATPases beta subunits signature 1.
CONSENSUS: [FYW]-x(2)-[FYW]-x-[FYW]-[DN]-x(6)-[LIVM]-G-R-T-x(3)-W.
NAME: Sodium and potassium ATPases beta subunits signature 2.
CONSENSUS: [RK]-x(2)-C-[RKQWI]-x(5)-L-x(2)-C-[SA]-G.
NAME: GDA1/CD39 family of nucleoside phosphatases signature.
CONSENSUS: [LIVM]-x-G-x(2)-E-G-x-[FY]-x-[FW]-[LIVA]-[TAG]-x-N-[HY].
NAME: Iodothyronine deiodinases active site.
CONSENSUS: R-P-L-V-x-N-F-G-S-[CA]-T-C-P-x-F.
NAME: Cutinase, serine active site.
CONSENSUS: P-x-[STA]-x-[LIV]-[IVT]-x-[GS]-G-Y-S-[QL]-G.
NAME: Cutinase, aspartate and histidine active sites.
CONSENSUS: C-x(3)-D-x-[IV]-C-x-G-[GST]-x(2)-[LIVM]-x(2,3)-H.
NAME: DDC / GAD / HDC / TyrDC pyridoxal-phosphate attachment site.
CONSENSUS: S-[LIVMFYW]-x(5)-K-[LIVMFYWG](2)-x(3)-[LIVMFYW]-x-[CA]-x(2)-[LIVMFYWQ]-x(2)-[RK].
NAME: Orn/Lys/Arg decarboxylases family 1 pyridoxal-P attachment site.
CONSENSUS: [STAV]-x-S-x-H-K-x(2)-[GSTAN](2)-x-[STA]-Q-[STA](2).
NAME: Orn/DAP/Arg decarboxylases family 2 pyridoxal-P attachment site.
CONSENSUS: [FY]-[PA]-x-K-[SACV]-[NHCLFW]-x(4)-[LIVMF]-[LIVMTA]-x(2)-[LIVMA]-x(3)-[GTE].
NAME: Orn/DAP/Arg decarboxylases family 2 signature 2- CONSENSUS: [GS!-x (Sit ) -CLIVMSCP!-x (2) -ELIVMF!-CDNS!-CLIVMCA!- G-G-G-ELIVMFY!-
CONSENSUS: CGSTPCElQ!-
NAME: Orotidine 5'-phosphate decarboxylase active site.
CONSENSUS: [LIVMFTA]-[LIVMF]-x-D-x-K-x(2)-D-I-[GP]-x-T-[LIVMTA].
NAME: Phosphoenolpyruvate carboxylase active site 1.
CONSENSUS: [VT]-x-T-A-H-P-T-[EQ]-x(2)-R-[KRH].
NAME: Phosphoenolpyruvate carboxylase active site 2.
CONSENSUS: [IV]-M-[LIVM]-G-Y-S-D-S-x-K-D-[STAG]-G.
NAME: Phosphoenolpyruvate carboxykinase (GTP) signature.
CONSENSUS: F-P-S-A-C-G-K-T-N.
NAME: Phosphoenolpyruvate carboxykinase (ATP) signature.
CONSENSUS: L-I-G-D-D-E-H-x-W-x-[DE]-x-G-[IV]-x-N.
NAME: Uroporphyrinogen decarboxylase signature 1.
CONSENSUS: P-x-W-x-M-R-Q-A-G- .
NAME: Uroporphyrinogen decarboxylase signature 2.
CONSENSUS: G-F-[STAGCV]-[STAGC]-x-P-[FYW]-T-[LV]-x(2)-Y-x(2)-[AE]-[GK].
NAME: Indole-3-glycerol phosphate synthase signature.
CONSENSUS: [LIVMFY]-[LIVMC]-x-E-[LIVMFYC]-K-[KRSP]-[STAK]-S-P-[ST]-x(3)-[LIVMFYST].
NAME: Ribulose bisphosphate carboxylase large chain active site.
CONSENSUS: G-x-[DN]-F-x-K-x-D-E.
NAME: Fructose-bisphosphate aldolase class-I active site.
CONSENSUS: [LIVM]-x-[LIVMFYW]-E-G-x-[LS]-L-K-P-[SN].
NAME: Fructose-bisphosphate aldolase class-II signature 1.
CONSENSUS: [FYVM]-x(1,3)-[LIVMH]-[APN]-[LIVM]-x(1,2)-[LIVM]-H-x-D-H-[GACH].
NAME: Fructose-bisphosphate aldolase class-II signature 2.
CONSENSUS: [LIVM]-E-x-E-[LIVM]-G-x(2)-[GM]-[GSTA]-x-E.
NAME: Malate synthase signature.
CONSENSUS: [KR]-[DENQ]-H-x(2)-G-L-N-x-G-x-W-D-Y-[LIVM]-F.
NAME: Hydroxymethylglutaryl-coenzyme A lyase active site.
CONSENSUS: S-V-A-G-L-G-G-C-P-Y.
NAME: Hydroxymethylglutaryl-coenzyme A synthase active site.
CONSENSUS: N-x-[DN]-[IV]-E-G-[IV]-D-x(2)-N-A-C-[FY]-x-G.
NAME: Citrate synthase signature.
CONSENSUS: G-[FYA]-[GA]-H-x-[IV]-x(1,2)-[RKT]-x(2)-D-[PS]-R.
NAME: Alpha-isopropylmalate and homocitrate synthases signature 1.
CONSENSUS: L-R-[DE]-G-x-Q-x(10)-K.
NAME: Alpha-isopropylmalate and homocitrate synthases signature 2.
CONSENSUS: [LIVMFW]-x(2)-H-x-H-[DN]-D-x-G-x-[GAS]-x-[GASLI].
NAME: KDPG and KHG aldolases active site.
CONSENSUS: G-[LIVM]-x(3)-E-[LIV]-T-[LF]- .
NAME: KDPG and KHG aldolases Schiff-base forming residue.
CONSENSUS: G-x(3)-[LIVMF]-K-[LF]-F-P-[SA]-x(3)-G.
NAME: Isocitrate lyase signature.
CONSENSUS: K-[KR]-C-G-H-[LMQ].
NAME: Beta-eliminating lyases pyridoxal-phosphate attachment site.
CONSENSUS: Y-x-D-x(3)-M-S-[GA]-K-K-D-x-[LIVM](2)-x-[LIVM]-G-G.
NAME: DNA photolyases class 1 signature 1.
CONSENSUS: T-G-x-P-[LIVM](2)-D-A-x-M-[RA]-x-[LIVM].
NAME: DNA photolyases class 1 signature 2.
CONSENSUS: [DN]-R-x-R-[LIVM](2)-x-[STA](2)-F-[LIVMFA]-x-K-x-L-x(2,3)-W-[KRQ].
NAME: DNA photolyases class 2 signature 1.
CONSENSUS: F-x-E-E-x-[LIVM](2)-R-R-E-L-x(2)-N-F.
NAME: DNA photolyases class 2 signature 2.
CONSENSUS: G-x-H-D-x(2)-W-x-E-R-x-[LIVM]-F-G-K-[LIVM]-R-[FY]-M-N.
NAME: Eukaryotic-type carbonic anhydrases signature.
CONSENSUS: S-E-H-x-[LIVM]-x(4)-[FYH]-x(2)-E-[LIVM]-H-[LIVMFA](2).
NAME: Prokaryotic-type carbonic anhydrases signature 1.
CONSENSUS: C-[SA]-D-S-R-[LIVM]-x-[AP].
NAME: Prokaryotic-type carbonic anhydrases signature 2.
CONSENSUS: [EQ]-Y-A-[LIVM]-x(2)-[LIVM]-x(4)-[LIVMF](3)-x-G-H-x(2)-C-G.
NAME: Fumarate lyases signature.
CONSENSUS: G-S-x(2)-M-x(2)-K-x-N.
NAME: Aconitase family signature 1.
CONSENSUS: [LIVM]-x(2)-[GSACIVM]-x-[LIV]-[GTIV]-[STP]-C-x(0,1)-T-N-[GSTANI]-x(4)-[LIVMA].
NAME: Aconitase family signature 2.
CONSENSUS: G-x(2)-[LIVWPQ]-x(3)-[GAC]-C-[GSTAM]-[LIMPTA]-C-[LIMV]-[GA].
NAME: Dihydroxy-acid and 6-phosphogluconate dehydratases signature 1.
CONSENSUS: C-D-K-x(2)-P-[GA]-x(3)-[GA].
NAME: Dihydroxy-acid and 6-phosphogluconate dehydratases signature 2.
CONSENSUS: [SA]-L-[LIVM]-T-D-[GA]-R-[LIVMF]-S-[GA]-[GAV]-[ST].
NAME: Dehydroquinase class I active site.
CONSENSUS: D-[LIVM]-[DE]-[LIVN]-x(18,20)-[LIVM](2)-x-[SC]-[NHY]-H-[DN].
NAME: Dehydroquinase class II signature.
CONSENSUS: [LIVM]-[NQ]-G-P-N-[LV]-x(2)-L-G-x-R-[QED]-P-x(2)-[FY]-G.
NAME: Enolase signature.
CONSENSUS: [LIV](3)-K-x-N-Q-I-G-[ST]-[LIV]-[ST]-[DE]-[STA].
NAME: Serine/threonine dehydratases pyridoxal-phosphate attachment site.
CONSENSUS: [DESH]-x(4,5)-[STVG]-x-[AS]-[FYI]-K-[DLIFSA]-[RVMF]-[GA]-[LIVMGA].
NAME: Enoyl-CoA hydratase/isomerase signature.
CONSENSUS: [LIVM]-[STA]-x-[LIVM]-[DENQRHSTA]-G-x(3)-[AG](3)-x(4)-[LIVMST]-x-[CSTA]-[DQHP]-[LIVMFY].
NAME: Imidazoleglycerol-phosphate dehydratase signature 1.
CONSENSUS: [LIVMY]-[DE]-x-H-H-x(2)-E-x(2)-[GCA]-[LIVM]-[STAC]-[LIVM].
NAME: Imidazoleglycerol-phosphate dehydratase signature 2.
CONSENSUS: G-x-[DN]-x-H-H-x(2)-E-[STAGC]-x-[FY]-K.
NAME: Tryptophan synthase alpha chain signature.
CONSENSUS: [LIVM]-E-[LIVM]-G-x(2)-[FYC]-[ST]-[DE]-[PA]-[LIVMY]-[AGLI]-[DE]-G.
NAME: Tryptophan synthase beta chain pyridoxal-phosphate attachment site.
CONSENSUS: [LIVM]-x-H-x-G-[STA]-H-K-x-N.
NAME: Delta-aminolevulinic acid dehydratase active site.
CONSENSUS: G-x-D-x-[LIVM](2)-[IV]-K-P-[GSA]-x(2)-Y.
NAME: Urocanase active site.
CONSENSUS: F-Q-G-L-P-x-R-I-C-W.
NAME: Prephenate dehydratase signature 1.
CONSENSUS: [FY]-x-[LIVM]-x(2)-[LIVM]-x(5)-[DN]-x(5)-T-R-F-[LIVMW]-x-[LIVM].
NAME: Prephenate dehydratase signature 2.
CONSENSUS: [LIVM]-[ST]-[KR]-[LIVM]-E-[ST]-R-P.
NAME: Dihydrodipicolinate synthetase signature 1.
CONSENSUS: [GSA]-[LIVM]-[LIVMFY]-x(2)-G-[ST]-[TG]-G-E-[GASNF]-x(6)-[EQ].
NAME: Dihydrodipicolinate synthetase signature 2- CONSENSUS: Y-[DNS!-[LIVMF!-P-x (2) -[ST!- (3) -[LIVM!-x (13ιl [LIVM!-x-[SGA!-[LIVMF!-
CONSENSUS: K-[DE(QAF!-[STAC! •
NAME: RsuA family of pseudouridine synthase signature.
CONSENSUS: G-R-L-D-x(2)-[ST]-x-G-[LIVMF](4)-[ST]-[DNT].
NAME: Cysteine synthase/cystathionine beta-synthase P-phosphate attachment site.
CONSENSUS: K-x-E-x(3)-[PA]-[STAGC]-x-S-[IVAP]-K-x-R-x-[STAG]-x(2)-[LIVM].
NAME: Phenylalanine and histidine ammonia-lyases signature.
CONSENSUS: G-[STG]-[LIVM]-[STG]-[AC]-S-G-[DH]-L-x-P-L-[SA]-x(2)-[SA].
NAME: Porphobilinogen deaminase cofactor-binding site.
CONSENSUS: E-R-x-[LIVMFA]-x(3)-[LIVMF]-x-G-[GSA]-C-x-[IVT]-P-[LIVMF]-[GSA].
NAME: Cys/Met metabolism enzymes pyridoxal-phosphate attachment site.
CONSENSUS: [DQ]-[LIVMF]-x(3)-[STAGC]-[STAGCI]-T-K-[FYWQ]-[LIVMF]-x-G-[HQ]-[SGNH].
NAME: Glyoxalase I signature 1.
CONSENSUS: [HQ]-[IVT]-x-[LIVFY]-x-[IV]-x(5)-[STA]-x(2)-F-[YM]-x(2,3)-[LMF]-G-[LMF].
NAME: Glyoxalase I signature 2.
CONSENSUS: G-[NTKQ]-x(0,5)-[GA]-[LVFY]-[GH]-H-[IVF]-[CGA]-x-[STAGL]-x(2)-[DNC].
NAME: Cytochrome c and c1 heme lyases signature 1.
CONSENSUS: H-N-x(2)-N-E-x(2)-W-[NQKR]-x(4)-W-E.
NAME: Cytochrome c and c1 heme lyases signature 2.
CONSENSUS: P-F-D-R-H-D-W.
NAME: Adenylate cyclases class-I signature 1.
CONSENSUS: E-Y-F-G-[SA](2)-L-W-x-L-Y-K.
NAME: Adenylate cyclases class-I signature 2.
CONSENSUS: Y-R-N-x-W-[NS]-E-[LIVM]-R-T-L-H-F-x-G.
NAME: Guanylate cyclases signature.
CONSENSUS: G-V-[LIVM]-x(0,1)-G-x(5)-[FY]-x-[LIVM]-[FYW]-[GS]-[DNTHKW]-[DNT]-[IV]-[DNTA]-x(5)-[DE].
NAME: Chorismate synthase signature 1.
CONSENSUS: G-E-S-H-[GC]-x(2)-[LIVM]-[GTV]-x-[LIVM](2)-[DE]-G-x-[PV].
NAME: Chorismate synthase signature 2.
CONSENSUS: [GE]-R-[SA](2)-[SAG]-R-[EV]-[ST]-x(2)-[RH]-V-x(2)-G.
NAME: Chorismate synthase signature 3.
CONSENSUS: R-[SH]-D-[PSV]-[CSAV]-x(4)-[GAI]-x-[IVGSP]-[LIVM]-x-E-[STAH]-[LIVM].
NAME: 6-pyruvoyl tetrahydropterin synthase signature 1.
CONSENSUS: C-N-N-x(2)-G-H-G-H-N-Y.
NAME: 6-pyruvoyl tetrahydropterin synthase signature 2.
CONSENSUS: D-H-K-N-L-D-x-D.
NAME: Ferrochelatase signature.
CONSENSUS: [LIVMF](2)-x-S-x-H-[GS]-[LIVM]-P-x(4,5)-[DENQKR]-x-G-D-x-Y.
NAME: Alanine racemase pyridoxal-phosphate attachment site.
CONSENSUS: V-x-K-A-[DN]-[GA]-Y-G-H-G.
NAME: Aspartate and glutamate racemases signature 1.
CONSENSUS: [IVA]-[LIVM]-x-C-x(0,1)-N-[ST]-[MSA]-[STH]-[LIVFYSTANK].
NAME: Aspartate and glutamate racemases signature 2.
CONSENSUS: [LIVM](2)-x-[AG]-C-T-[DEH]-[LIVMFY]-[PNGRS]-x-[LIVM].
NAME: Mandelate racemase / muconate lactonizing enzyme family signature 1.
CONSENSUS: A-x-[SAG](2)-[LIVM]-[DE]-x-A-x(2)-D-x(2)-[GA]-[KR].
NAME: Mandelate racemase / muconate lactonizing enzyme family signature 2.
CONSENSUS: G-x(7)-D-x(1)-A-x(14)-[LIVM]-E-[DENQ]-P-x(4)-[DENQ].
NAME: Ribulose-phosphate 3-epimerase family signature 1.
CONSENSUS: [LIVMF]-H-[LIVMFY]-D-[LIVM]-x-D-x(1,2)-[FY]-[LIVM]-x-N-x-[STAV].
NAME: Ribulose-phosphate 3-epimerase family signature 2.
CONSENSUS: [LIVMA]-x-[LIVM]-M-[ST]-[VS]-x-P-x(3)-G-Q-x-F-x(6)-[NK]-[LIVMC].
NAME: Aldose 1-epimerase putative active site.
CONSENSUS: [NS]-x-T-N-H-x-Y-[FW]-N-[LI].
NAME: Cyclophilin-type peptidyl-prolyl cis-trans isomerase signature.
CONSENSUS: [FY]-x(2)-[STCNLV]-x-F-H-[RH]-[LIVMN]-[LIVM]-x(2)-F-[LIVM]-x-Q-[AG]-G.
NAME: Cyclophilin-type peptidyl-prolyl cis-trans isomerase profile.
NAME: FKBP-type peptidyl-prolyl cis-trans isomerase signature 1.
CONSENSUS: [LIVMC]-x-[YF]-x-[GVL]-x(1,2)-[LFT]-x(2)-G-x(3)-[DE]-[STAEQ]-[STAN].
NAME: FKBP-type peptidyl-prolyl cis-trans isomerase signature 2.
CONSENSUS: [LIVMFY]-x(2)-[GA]-x(3,4)-[LIVMF]-x(2)-[LIVMFHK]-x(2)-G-x(4)-[LIVMF]-x(3)-[PSGAQ]-x(2)-[AG]-[FY]-G.
NAME: FKBP-type peptidyl-prolyl cis-trans isomerase domain profile.
NAME: PpiC-type peptidyl-prolyl cis-trans isomerase signature.
CONSENSUS: F-[GSADEI]-x-[LVAQ]-A-x(3)-[ST]-x(3,4)-[STQ]-x(3,5)-[GER]-G-x-[LIVM]-[GS].
NAME: Triosephosphate isomerase active site.
CONSENSUS: [AV]-Y-E-P-[LIVM]-W-[SA]-I-G-T-[GK].
NAME: Xylose isomerase signature 1.
CONSENSUS: [LI]-E-P-K-P-x(2)-P.
NAME: Xylose isomerase signature 2.
CONSENSUS: [FL]-H-D-x-D-[LIV]-x-[PD]-x-[GDE].
NAME: Phosphomannose isomerase type I signature 1.
CONSENSUS: Y-x-D-x-N-H-K-P-E.
NAME: Phosphomannose isomerase type I signature 2.
CONSENSUS: H-A-Y-[LIVM]-x-G-x(2)-[LIVM]-E-x-M-A-x-S-D-N-x-[LIVM]-R-A-G-x-T-P-K.
NAME: Phosphoglucose isomerase signature 1.
CONSENSUS: [DENS]-x-[LIVM]-G-G-R-[FY]-S-[LIVMT]-x-[STA]-[PSAC]-[LIVMA]-G.
NAME: Phosphoglucose isomerase signature 2.
CONSENSUS: [GS]-x-[LIVM]-[LIVMFYW]-x(4)-[FY]-[DN]-Q-x-G-V-E-x(2)-K.
NAME: Glucosamine/galactosamine-6-phosphate isomerases signature.
CONSENSUS: [LIVM]-x(3)-G-x-[LIT]-x-[LIV]-x-[LIVM]-x-G-[LIVM]-G-x-[DEN]-G-H.
NAME: Phosphoglycerate mutase family phosphohistidine signature.
CONSENSUS: [LIVM]-x-R-H-G-[EQ]-x(3)-N.
NAME: Phosphoglucomutase and phosphomannomutase phosphoserine signature.
CONSENSUS: [GSA]-[LIVM]-x-[LIVM]-[ST]-[PGA]-S-H-x-P-x(4)-[GNHE].
NAME: Methylmalonyl-CoA mutase signature.
CONSENSUS: R-I-A-R-N-[TQ]-x(2)-[LIVMFY](2)-x-[EQ]-E-x(4)-[KRN]-x(2)-D-P-x-[GSA]-G-S.
NAME: Terpene synthases signature.
CONSENSUS: [DE]-G-S-W-x-G-x-W-[GA]-[LIVM]-x-[FY]-x-Y-[GA].
NAME: Eukaryotic DNA topoisomerase I active site.
CONSENSUS: [DEN]-x(6)-[GS]-[IT]-S-K-x(2)-Y-[LIVM]-x(3)-[LIVM].
NAME: Prokaryotic DNA topoisomerase I active site.
CONSENSUS: [EQ]-x-L-Y-[DEQT]-x(3,12)-[LI]-[ST]-Y-x-R-[ST]-[DEQS].
NAME: DNA topoisomerase II signature.
CONSENSUS: [LIVMA]-x-E-G-[DN]-S-A-x-[STAG].
NAME: Aminoacyl-transfer RNA synthetases class-I signature.
CONSENSUS: P-x(0,2)-[GSTAN]-[DENQGAPK]-x-[LIVMFP]-[HT]-[LIVMYAC]-G-[HNTG]-[LIVMFYSTAGPC].
NAME: Aminoacyl-transfer RNA synthetases class-II signature 1.
CONSENSUS: [FYH]-R-x-[DE]-x(4,12)-[RH]-x(3)-F-x(3)-[DE].
NAME: Aminoacyl-transfer RNA synthetases class-II signature 2.
CONSENSUS: [GSTALVF]-{DENQHRKP}-[GSTA]-[LIVMF]-[DE]-R-[LIVMF]-x-[LIVMSTAG]-[LIVMFY].
NAME: WHEP-TRS domain signature.
CONSENSUS: [<QY!-G-[DNEA!-x-[LIV!-[KR!-x (2) -K-x (2)-[KRNG!- [AS!-x(4)-[LIV!-[DEN -
CONSENSUS: x (2) -CIV!-x (2) -L-x (3) -K -
NAME: ATP-citrate lyase / succinyl-CoA ligases family signature 1.
CONSENSUS: S-[KR]-S-G-[GT]-[LIVM]-[GST]-x-[EQ]-x(8,10)-G-x(4)-[LIVM]-[GA]-[LIVM]-G-G-D.
NAME: ATP-citrate lyase / succinyl-CoA ligases family active site.
CONSENSUS: G-x(2)-A-x(4,7)-[RQT]-[LIVMF]-G-H-[AS]-[GH].
NAME: ATP-citrate lyase / succinyl-CoA ligases family signature 3.
CONSENSUS: G-x-[IV]-x(2)-[LIVMF]-x-[NA]-G-[GA]-G-[LA]-[STAV]-x(4)-D-x-[LIVM]-x(3)-G-[GRE].
NAME: Glutamine synthetase signature 1.
CONSENSUS: [FYWL]-D-G-S-S-x(6,8)-[DENQSTAK]-[SA]-[DE]-x(2)-[LIVMFY].
NAME: Glutamine synthetase putative ATP-binding region signature.
CONSENSUS: K-P-[LIVMFYA]-x(3,5)-[NPAT]-G-[GSTAN]-G-x-H-x(3)-S.
NAME: Glutamine synthetase class-I adenylation site.
CONSENSUS: K-[LIVM]-x(5)-[LIVMA]-D-[RK]-[DN]-[LI]-Y.
NAME: D-alanine--D-alanine ligase signature 1.
CONSENSUS: H-G-x(2)-G-E-D-G-x-[LIVMA]-[QSA]-[GSA].
NAME: D-alanine--D-alanine ligase signature 2.
CONSENSUS: [LIV]-x(3)-[GA]-x-[GSAIV]-R-[LIVCA]-D-[LIVMF](2)-x(7,9)-[LI]-x-E-[LIVA]-N-[STP]-x-P-[GA].
NAME: SAICAR synthetase signature 1.
CONSENSUS: [LIVMF](2)-P-[LIVM]-E-x-[LIVM]-[LIVMCA]-R-x(3)-[TA]-G-S.
NAME: SAICAR synthetase signature 2.
CONSENSUS: [LIVM]-[LIVMA]-D-x-K-[LIVMFY]-E-F-G.
NAME: Folylpolyglutamate synthase signature 1.
CONSENSUS: [LIVMFY]-x-[LIVM]-[STAG]-G-T-[NK]-G-K-x-[ST]-x(7)-[LIVM](2)-x(3)-[GSK].
NAME: Folylpolyglutamate synthase signature 2.
CONSENSUS: [LIVMFY](2)-E-x-G-[LIVM]-[GA]-G-x(2)-D-x-[GST]-x-[LIVM](2).
NAME: Ubiquitin-activating enzyme signature 1.
CONSENSUS: K-A-C-S-G-K-F-x-P.
NAME: Ubiquitin-activating enzyme active site.
CONSENSUS: P-[LIVM]-C-T-[LIVM]-[KRH]-x-[FT]-P.
NAME: Ubiquitin-conjugating enzymes active site.
CONSENSUS: [FYWLSP]-H-[PC]-[NH]-[LIV]-x(3,4)-G-x-[LIV]-C-[LIV]-x-[LIV].
NAME: Formate--tetrahydrofolate ligase signature 1.
CONSENSUS: G-[LIVM]-K-G-G-A-A-G-G-G-Y.
NAME: Formate--tetrahydrofolate ligase signature 2.
CONSENSUS: V-A-T-[IV]-R-A-L-K-x-[HN]-G-G.
NAME: Adenylosuccinate synthetase GTP-binding site.
CONSENSUS: Q-W-G-D-E-G-K-G.
NAME: Adenylosuccinate synthetase active site.
CONSENSUS: G-I-[GR]-P-x-Y-x(2)-K-x(2)-R.
NAME: Argininosuccinate synthase signature 1.
CONSENSUS: A-[FY]-S-G-G-L-D-T-S.
NAME: Argininosuccinate synthase signature 2.
CONSENSUS: G-x-T-x-K-G-N-D-x(2)-R-F.
NAME: Phosphoribosylglycinamide synthetase signature.
CONSENSUS: R-F-G-D-P-E-x-[QM].
NAME: Carbamoyl-phosphate synthase subdomain signature 1.
CONSENSUS: [FYV]-[PS]-[LIVMC]-[LIVMA]-[LIVM]-[KR]-[PSA]-[STA]-x(3)-[SG]-G-x-[AG].
NAME: Carbamoyl-phosphate synthase subdomain signature 2.
CONSENSUS: [LIVMF]-[LIMN]-E-[LIVMCA]-N-[PATLIVM]-[KR]-[LIVMSTAC].
NAME: ATP-dependent DNA ligase AMP-binding site.
CONSENSUS: [EDQH]-x-K-x-[DN]-G-x-R-[GACIVM].
NAME: ATP-dependent DNA ligase signature 2.
CONSENSUS: E-G-[LIVMA]-[LIVM](2)-[KR]-x(5,8)-[YW]-[QNEK]-x(2,6)-[KRH]-x(3,5)-K-[LIVMFY]-K.
NAME: NAD-dependent DNA ligase signature 1.
CONSENSUS: K-[LIVM]-D-G-[LIVM]-[SA]-x(4)-Y-x(2)-G-x-L-x(4)-[ST]-R-G-[DN]-G-x(2)-G-[DE]-[DENL].
NAME: NAD-dependent DNA ligase signature 2.
CONSENSUS: [IV]-G-[KR]-[ST]-G-x-[LIVM]-[STNK]-x-[VT]-x(2)-L-x-[PS]-V.
NAME: RNA 3'-terminal phosphate cyclase signature.
CONSENSUS: [RH]-G-x(2)-P-x-G(3)-x-[LIV].
NAME: Lipoate-protein ligase B signature.
CONSENSUS: R-G-G-x(2)-T-[FYW]-H-x(2)-[GH]-Q-x-[LIV]-x-Y.
NAME: Isopenicillin N synthetase signature 1.
CONSENSUS: [RK]-x-[STA]-x(2)-S-x-C-Y-[SL].
NAME: Isopenicillin N synthetase signature 2.
CONSENSUS: [LIVM](2)-x-C-G-[STA]-x(2)-[STAG]-x(2)-T-x-[DNG].
NAME: Site-specific recombinases active site.
CONSENSUS: Y-[LIVAC]-R-[VA]-S-[ST]-x(2)-Q.
NAME: Site-specific recombinases signature 2.
CONSENSUS: G-[DE]-x(2)-[LIVM]-x(3)-[LIVM]-[DT]-R-[LIVM]-[GSA].
NAME: Transposases, Mutator family, signature.
CONSENSUS: D-x(3)-G-[LIVMF]-x(6)-[STAV]-[LIVMFYW]-[PT]-x-[STAV]-x(2)-[QR]-x-C-x(2)-H.
NAME: Transposases, IS30 family, signature.
CONSENSUS: R-G-x(2)-E-N-x-N-G-[LIVM](2)-R-[QE]-[LIVMFY](2)-P-K.
NAME: Autoinducers synthetases family signature.
CONSENSUS: [LMFY]-R-x(3)-F-x(2)-[KR]-x(2)-W-x-[LIVM]-x(6,9)-E-x-D-x-[FY]-D.
NAME: Thiamine pyrophosphate enzymes signature.
CONSENSUS: [LIVMF]-[GSA]-x(5)-P-x(4)-[LIVMFYW]-x-[LIVMF]-x-G-D-[GSA]-[GSAC].
NAME: Biotin-requiring enzymes attachment site.
CONSENSUS: [GN]-[DEQTR]-x-[LIVMFY]-x(2)-[LIVM]-x-[AIV]-M-K-[LMAT]-x(3)-[LIVM]-x-[SAV].
NAME: 2-oxo acid dehydrogenases acyltransferase component lipoyl binding site.
CONSENSUS: [GN]-x(2)-[LIVF]-x(5)-[LIVFC]-x(2)-[LIVFA]-x(3)-K-[STAIV]-[STAVQDN]-x(2)-[LIVMFS]-x(5)-[GCN]-x-[LIVMFY].
NAME: Putative AMP-binding domain signature.
CONSENSUS: [LIVMFY]-x(2)-[STG]-[STAG]-G-[ST]-[STEI]-[SG]-x-[PASLIVM]-[KR].
NAME: Molybdenum cofactor biosynthesis proteins signature 1.
CONSENSUS: [LIVM](3)-[LIT](2)-G-G-T-G-x(4)-D.
NAME: Molybdenum cofactor biosynthesis proteins signature 2.
CONSENSUS: S-x-[GS]-x(2)-D-x(5)-[LIVW]-x(10,12)-[LIV]-x(2)-[KR]-P-G-[KRL]-P-x(2)-[LIVMF]-[GA].
NAME: moaA / nifB / pqqE family signature.
CONSENSUS: [LIV]-x(3)-C-[NP]-[LIVMF]-[QRS]-C-x-[FYM]-C.
NAME: Radical activating enzymes signature.
CONSENSUS: [GV]-x-G-x-[KR]-x(3)-F-x(2)-G-x(0,1)-C-x(3)-C-x(2)-C-x-[NL].
NAME: Tpx family signature.
CONSENSUS: S-x-D-L-P-F-A-x(2)-[KR]-[FW]-C.
NAME: Cytochrome c family heme-binding site signature.
CONSENSUS: C-{CPWHF}-{CPWR}-C-H-{CFYW}.
NAME: Cytochrome b5 family, heme-binding domain signature.
CONSENSUS: [FY]-[LIVMK]-x(2)-H-P-[GA]-G.
NAME: Cytochrome b/b6 heme-ligand signature.
CONSENSUS: [DENQ]-x(3)-G-[FYWMQ]-x-[LIVMF]-R-x(2)-H.
NAME: Cytochrome b/b6 Qo site signature.
CONSENSUS: P-[DE]-W-[FY]-[LFY](2).
NAME: Cytochrome b559 subunits heme-binding site signature.
CONSENSUS: [LIV]-x-[ST]-[LIVF]-R-[FYW]-x(2)-[IV]-H-[STGA]-[LIV]-[STGA]-[IV]-P.
NAME: Nickel-dependent hydrogenases b-type cytochrome subunit signature 1.
CONSENSUS: R-[LIVMFYW]-x-H-W-[LIVM]-x(2)-[LIVMF]-[STAC]-[LIVM]-x(2)-L-x-[LIVM]-T-G.
NAME: Nickel-dependent hydrogenases b-type cytochrome subunit signature 2.
CONSENSUS: [RH]-[STA]-[LIVMFYW]-H-[RH]-[LIVM]-x(2)-W-x-[LIVMF]-x(2)-F-x(3)-H.
NAME: Succinate dehydrogenase cytochrome b subunit signature 1.
CONSENSUS: R-P-[LIVMT]-x(3)-[LIVM]-x(6)-[LIVMWPK]-x(4)-S-x(2)-H-R-x-[ST].
NAME: Succinate dehydrogenase cytochrome b subunit signature 2.
CONSENSUS: H-x(3)-[GA]-[LIVMT]-R-[HF]-[LIVMF]-x-[FYWM]-D-x-[GVA].
NAME: Thioredoxin family active site.
CONSENSUS: [LIVMF]-[LIVMSTA]-x-[LIVMFYC]-[FYWSTHE]-x(2)-[FYWGTN]-C-[GATPLVE]-[PHYWSTA]-C-x(6)-[LIVMFYWT].
NAME: Glutaredoxin active site.
CONSENSUS: [LIVD]-[FYSA]-x(4)-C-[PV]-[FYW]-C-x(2)-[TAV]-
Figure imgf000451_0001
NAME: Type-1 copper (blue) proteins signature.
CONSENSUS: [GA]-x(0,2)-[YSA]-x(0,1)-[VFY]-x-C-x(1,2)-[PG]-x(0,1)-H-x(2,4)-[MQ].
NAME: 2Fe-2S ferredoxins, iron-sulfur binding region signature.
CONSENSUS: C-{C}-{C}-[GA]-{C}-C-[GAST]-{CPDEKRHFYW}-C.
NAME: Adrenodoxin family, iron-sulfur binding region signature.
CONSENSUS: C-x(2)-[STAQ]-x-[STAMV]-C-[STA]-T-C-[HR].
NAME: 4Fe-4S ferredoxins, iron-sulfur binding region signature.
CONSENSUS: C-x(2)-C-x(2)-C-x(3)-C-[PEG].
NAME: High potential iron-sulfur proteins signature.
CONSENSUS: C-x(6,9)-[LIVM]-x(3)-G-[YW]-C-x(2)-[FYW].
NAME: Rieske iron-sulfur protein signature 1.
CONSENSUS: C-[TK]-H-L-G-C-[LIVT].
NAME: Rieske iron-sulfur protein signature 2.
CONSENSUS: C-P-C-H-x-[GSA].
NAME: Flavodoxin signature.
CONSENSUS: [LIV]-[LIVFY]-[FY]-x-[ST]-x(2)-[AG]-x-T-x(3)-A-x(2)-[LIV].
NAME: Rubredoxin signature.
CONSENSUS: [LIVM]-x(3)-W-x-C-P-x-C-[AGD].
NAME: Electron transfer flavoprotein alpha-subunit signature.
CONSENSUS: [LI]-Y-[LIVM]-[AT]-x-G-[IV]-[SD]-G-x-[IV]-Q-H-x(2)-G-x(6)-[IV]-x-A-[IV]-N.
NAME: Electron transfer flavoprotein beta-subunit signature.
CONSENSUS: [IVA]-x-[KR]-x(2)-[DE]-[GD]-[GDE]-x(1,2)-[EQ]-x-[LIV]-x(4)-P-x-[LIVM](2)-[TAC].
NAME: Vertebrate metallothioneins signature.
CONSENSUS: C-x-C-[GSTAP]-x(2)-C-x-C-x(2)-C-x-C-x(2)-C-x-K.
NAME: Ferritin iron-binding regions signature 1.
CONSENSUS: E-x-[KR]-E-x(2)-E-[KR]-[LF]-[LIVMA]-x(2)-Q-N-x-R-x-G-R.
NAME: Ferritin iron-binding regions signature 2.
CONSENSUS: D-x(2)-[LIVMF]-[STAC]-[DH]-F-[LI]-[EN]-x(2)-[FY]-L-x(6)-[LIVM]-[KN].
NAME: Bacterioferritin signature.
CONSENSUS: <M-x-G-x(3)-V-[LIV]-x(2)-[LM]-x(3)-L-x(3)-L.
NAME: Transferrins signature 1.
CONSENSUS: Y-x(0,1)-[VAS]-V-[IVAC]-[IVA]-[IVA]-[RKH]-[RKS]-[GDENSA].
NAME: Transferrins signature 2.
CONSENSUS: Y-x-G-A-[FL]-[KRHNQ]-C-L-x(3,4)-G-[DENQ]-V-[GA]-[FYW].
NAME: Transferrins signature 3.
CONSENSUS: [DENQ]-[YF]-x-[LY]-L-C-x-[DN]-x(5,8)-[LIV]-x(4,5)-C-x(2)-A-x(4)-[HQR]-x-[LIVMFYW]-[LIVM].
NAME: Globins profile.
NAME: Protozoan/cyanobacterial globins signature.
CONSENSUS: F-[LF]-x(5)-G-[PA]-x(4)-G-[KRA]-x-[LIVM]-x(3)-H.
NAME: Plant hemoglobins signature.
CONSENSUS: [SN]-P-x-L-x(2)-H-A-x(3)-F.
NAME: Hemerythrins signature.
CONSENSUS: W-L-x-[NQ]-H-I-x(3)-D-F.
NAME: Arthropod hemocyanins / insect LSPs signature 1.
CONSENSUS: Y-[FYW]-x-E-D-[LIVM]-x(2)-N-x(6)-H-x(3)-P.
NAME: Arthropod hemocyanins / insect LSPs signature 2.
CONSENSUS: T-x(2)-R-D-P-x-[FY]-[FYW].
NAME: Heavy-metal-associated domain.
CONSENSUS: [LIVN]-x(2)-[LIVMFA]-x-C-x-[STAGCDNH]-C-x(3)-[LIVFG]-x(3)-[LIV]-x(9,11)-[IVA]-x-[LVFYS].
NAME: ABC transporters family signature.
CONSENSUS: [LIVMFYC]-[SA]-[SAPGLVFYKQH]-G-[DENQMW]-[KRQASPCLIMFW]-[KRNQSTAVM]-[KRACLVM]-[LIVMFYPAN]-{PHY}-[LIVMFW]-[SAGCLIVP]-{FYWHP}-{KRHP}-[LIVMFYWSTA].
NAME: Binding-protein-dependent transport systems inner membrane comp. sign.
CONSENSUS: [LIVMFY]-x(8)-[EQR]-[STAGV]-[STAG]-x(3)-G-[LIVMFYSTAC]-x(5)-[LIVMFYSTA]-x( )-[LIVMFY]-[PKR].
NAME: ABC-2 type transport system integral membrane proteins signature.
CONSENSUS: [LIMST]-x(2)-[LIMW]-x(2)-[LIMCA]-[GSTC]-x-[GSAIV]-x(6)-[LIMGA]-[PGSNQ]-x(9,12)-P-[LIMFT]-x-[HRSY]-x(5)-[RQ].
NAME: Bacterial extracellular solute-binding proteins, family 1 signature.
CONSENSUS: [GAP]-[LIVMFA]-[STAVDN]-x(4)-[GSAV]-[LIVMFY](2)-Y-[ND]-x(3)-[LIVMF]-x-[KNDE].
NAME: Bacterial extracellular solute-binding proteins, family 3 signature.
CONSENSUS: G-[FYIL]-[DE]-[LIVMT]-[DE]-[LIVMF]-x(3)-[LIVMA]-[VAGC]-x(2)-[LIVMAGN].
NAME: Bacterial extracellular solute-binding proteins, family 5 signature.
CONSENSUS: [AG]-x(6,7)-[DNEG]-x(2)-[STAVE]-[LIVMFYWA]-x-[LIVMFY]-x-[LIVM]-[KR]-[KRHDE]-[GDN]-[LIVMA]-[KNGSP]-[FW].
NAME: Serum albumin family signature.
CONSENSUS: [FY]-x(6)-C-C-x(7)-C-[LFY]-x(6)-[LIVMFYW].
NAME: Transthyretin signature 1.
CONSENSUS: S-K-C-P-L-M-V-K-V-L-D-[AS]-V-R-G.
NAME: Transthyretin signature 2.
CONSENSUS: S-P-[FY]-S-[FY]-S-T-T-A-[LIVM]-V-[ST]-x-P.
NAME: Avidin / Streptavidin family signature.
CONSENSUS: [DEN]-x(2)-[KR]-[STA]-x(2)-V-G-x-[DN]-x-[FW]-T-[KR].
NAME: Eukaryotic cobalamin-binding proteins signature.
CONSENSUS: [SN]-V-D-T-[GA]-A-[LIVM]-A-x-L-A-[LIVMF]-T-C.
NAME: Lipocalin signature.
CONSENSUS: [DENG]-x-[DENQGSTARK]-x(0,2)-[DENQARK]-[LIVFY]-{CP}-G-{C}-W-[FYWLRH]-x-[LIVMTA].
NAME: Cytosolic fatty-acid binding proteins signature.
CONSENSUS: [GSAIVK]-x-[FYW]-x-[LIVMF]-x(4)-[NHG]-[FY]-[DE]-x-[LIVMFY]-[LIVM]-x(2)-[LIVMAKR].
NAME: Acyl-CoA-binding protein signature.
CONSENSUS: P-[STA]-x-[DEN]-x-[LIVMF]-x(2)-[LIVMFY]-Y-[GSTA]-x-[FY]-K-Q-[STA](2)-x-G.
NAME: LBP / BPI / CETP family signature.
CONSENSUS: [PA]-[GA]-[LIVMC]-x(2)-R-[IV]-[ST]-x(3)-L-x(5)-[EQ]-x(4)-[LIVM]-[EQK]-x(8)-P.
NAME: Phosphatidylethanolamine-binding protein family signature.
CONSENSUS: [FY]-x-[LIVMF](3)-x-[DC]-P-D-x-P-[SN]-x(10)-H.
NAME: Plant lipid transfer proteins signature.
CONSENSUS: [LIVM]-[PA]-x(2)-C-x-[LIVM]-x-[LIVM]-x-[LIVMFY]-x-[LIVM]-[ST]-x(3)-[DN]-C-x(2)-[LIVM].
NAME: Uteroglobin family signature 1.
CONSENSUS: [GA]-x(3)-I-C-P-x-[LIVMF]-x(3)-[LIVM]-[DE]-x-[LIVMF](2).
NAME: Uteroglobin family signature 2.
CONSENSUS: [DEQ]-x(4)-[SN]-x(5)-[DEQ]-x-I-x(2)-S-[PSE]-[LS]-C.
NAME: Mitochondrial energy transfer proteins signature.
CONSENSUS: P-x-[DE]-x-[LIVAT]-[RK]-x-[LRH]-[LIVMFY]-[QMAIGV].
NAME: Sugar transport proteins signature 1.
CONSENSUS: [LIVMSTAG]-[LIVMFSAG]-x(2)-[LIVMSA]-[DE]-x-[LIVMFYWA]-G-R-[RK]-x(4,6)-[GSTA].
NAME: Sugar transport proteins signature 2.
CONSENSUS: [LIVMF]-x-G-[LIVMFA]-x(2)-G-x(8)-[LIFY]-x(2)-[EQ]-x(6)-[RK].
NAME: LacY family proton/sugar symporters signature 1.
CONSENSUS: G-[LIVM](2)-x-D-[RK]-L-G-L-[RK](2)-x-[LIVM](2)-W.
NAME: LacY family proton/sugar symporters signature 2.
CONSENSUS: P-x-[LIVMF](2)-N-R-[LIVM]-G-x-K-N-[STA]-[LIVM](3).
NAME: PTR2 family proton/oligopeptide symporters signature 1.
CONSENSUS: [GA]-[GAS]-[LIVMFYWA]-[LIVM]-[GAS]-D-x-[LIVMFYWT]-[LIVMFYW]-G-x(3)-[TAV]-[IV]-x(3)-[GSTAV]-x-[LIVMF]-x(3)-[GA].
NAME: PTR2 family proton/oligopeptide symporters signature 2.
CONSENSUS: [FYT]-x(2)-[LMFY]-[FYV]-[LIVMFYWA]-x-[IVG]-N-[LIVMAG]-G-[GSA]-[LIMF].
NAME: Amiloride-sensitive sodium channels signature.
CONSENSUS: Y-x(2)-[EQTF]-x-C-x(2)-[GSTDNL]-C-x-[QT]-x(2)-[LIVMT]-[LIVMS]-x(2)-C-x-C.
NAME: Sodium:alanine symporter family signature.
CONSENSUS: G-G-x-[GA](2)-[LIVM]-F-W-M-W-[LIVM]-x-[STAV]-[LIVMFA](2)-G.
NAME: Sodium:dicarboxylate symporter family signature 1.
CONSENSUS: P-x(0,1)-G-[DE]-x-[LIVMF](2)-x-[LIVM](2)-[KREQ]-[LIVM](3)-x-P.
NAME: Sodium:dicarboxylate symporter family signature 2.
CONSENSUS: P-x-G-x-[STA]-x-[NT]-[LIVMC]-D-G-[STAN]-x-[LIVM]-[FY]-x(2)-[LIVM]-x(2)-[LIVM]-[FY]-[LI]-[SA]-Q.
NAME: Sodium:galactoside symporter family signature.
CONSENSUS: D-x(3)-G-x(3)-[DN]-x(6,8)-G-[KH]-F-[KR]-P-[FYW]-[LIVM](2)-x-[GSTA](2).
NAME: Sodium:neurotransmitter symporter family signature 1.
CONSENSUS: W-R-F-[GP]-Y-x(4)-N-G-G-G-x-[FY].
NAME: Sodium:neurotransmitter symporter family signature 2.
CONSENSUS: Y-[LIVMFY]-x(2)-[SC]-[LIVMFY]-[STQ]-x(2)-L-P-W-x(2)-C-x(4)-N-[GST].
NAME: Sodium:solute symporter family signature 1.
CONSENSUS: [GS]-x(2)-[LIY]-x(3)-[LIVMFYWSTAG](10)-[LIY]-[TAV]-x(2)-G-G-[LMF]-x-[SAP].
NAME: Sodium:solute symporter family signature 2.
CONSENSUS: [GAST]-[LIVM]-x(3)-[KR]-x(4)-G-A-x( )-[GAS]-[LIVMGS]-[LIVMW]-[LIVMGAT]-G-x-[LIVMG].
NAME: Sodium:sulfate symporter family signature.
CONSENSUS: [STACP]-S-x(2)-F-x(2)-P-[LIVM]-[GSA]-x(3)-N-x-[LIVM]-V.
NAME: glpT family of transporters signature.
CONSENSUS: R-G-x(5)-W-N-x(2)-H-N-x-G-G.
NAME: Ammonium transporters signature.
CONSENSUS: D-[FYWS]-A-G-[GSC]-x(2)-[IV]-x(3)-[SAG](2)-x(2)-[SAG]-[LIVMF]-x(3)-[LIVMFYWA](2)-x-[GK]-x-R.
NAME: BCCT family of transporters signature.
CONSENSUS: [GSDN]-W-T-[LIVM]-x-[FY]-W-x-W-W.
NAME: Flagellar motor protein motA family signature.
CONSENSUS: A-[LMF]-x-[GAT]-T-[LIVF]-x-G-x-[LIVMF]-x(7)-P.
NAME: Formate and nitrite transporters signature 1.
CONSENSUS: [LIVMA]-[LIVMY]-x-G-[GSTA]-[DES]-L-[FI]-[TN]-[GS].
NAME: Formate and nitrite transporters signature 2.
CONSENSUS: [GA]-x(2)-[CA]-N-[LIVMFYW](2)-V-C-[LV]-A.
NAME: Prokaryotic sulfate-binding proteins signature 1.
CONSENSUS: K-x-[NQEK]-[GT]-G-[DQ]-x-[LIVM]-x(3)-Q-S.
NAME: Prokaryotic sulfate-binding proteins signature 2.
CONSENSUS: N-P-K-[ST]-S-G-x-A-R.
NAME: Sulfate transporters signature.
CONSENSUS: P-x-Y-[GS]-L-Y-[STAG](2)-x(4)-[LIVMFY](3)-x(3)-[GSTA](2)-S-[KR].
NAME: Amino acid permeases signature.
CONSENSUS: [STAGC]-G-[PAG]-x(2,3)-[LIVMFYWA](2)-x-[LIVMFYW]-x-[LIVMFWSTAGC](2)-[STAGC]-x(3)-[LIVMFYW]-x-[LIVMST]-x(3)-[LIMCTA]-[GA]-E-x(5)-[PSAL].
NAME: Aromatic amino acids permeases signature.
CONSENSUS: I-G-[GA]-G-M-[LF]-[SA]-x-P-x(3)-[SA]-G-x(2)-F.
NAME: Xanthine/uracil permeases family signature.
CONSENSUS: [LIVM]-P-x-[PASIF]-V-[LIVM]-G-G-x(4)-[LIVM]-[FY]-[GSA]-x-[LIVM]-x(3)-G.
NAME: Anion exchangers family signature 1.
CONSENSUS: F-G-G-[LIVM](2)-[KR]-D-[LIVM]-[RK]-R-R-Y.
NAME: Anion exchangers family signature 2.
CONSENSUS: [FI]-L-I-S-L-I-F-I-Y-E-T-F-x-K-L.
NAME: MIP family signature.
CONSENSUS: [HNQA]-x-N-P-[STA]-[LIVMF]-[ST]-[LIVMF]-[GSTAFY].
NAME: General diffusion Gram-negative porins signature.
CONSENSUS: [LIVMFY]-x(2)-G-x(2)-Y-x-F-x-K-x(2)-[SN]-[STAV]-[LIVMFYW]-V.
NAME: OmpA-like domain.
CONSENSUS: [LIVMA]-x-[GT]-x-[TA]-[DA]-x(2)-[DG]-[GSTP]-x(2)-[LFYDE]-[NQS]-x(2)-[LI]-[SG]-[QE]-[KRQE]-R-A-x(2)-[LV]-x(3)-[LIVMF]-x(4,5)-[LIVM]-x(4)-[LIVM]-x(3)-[SG]-x-G.
NAME: Eukaryotic mitochondrial porin signature.
CONSENSUS: [YH]-x(2)-D-[SPA]-x-[STA]-x(3)-[TAG]-[KR]-[LIVMF]-[DNSTA]-[DNS]-x(4)-[GSTAN]-[LIVMA]-x-[LIVMY].
NAME: Insulin-like growth factor binding proteins signature.
CONSENSUS: G-C-[GS]-C-C-x(2)-C-A-x(6)-C.
NAME: GPR1/FUN34/yaaH family signature.
CONSENSUS: N-P-[AV]-P-[LF]-G-L-x-[GSA]-F.
NAME: GNS1/SUR4 family signature.
CONSENSUS: L-x-F-L-H-x-Y-H-H.
NAME: 43 Kd postsynaptic protein signature.
CONSENSUS: G-Q-D-Q-T-K-Q-Q-I.
NAME: Actins signature 1.
CONSENSUS: [FY]-[LIV]-G-[DE]-E-A-Q-x-[RKQ](2)-G.
NAME: Actins signature 2.
CONSENSUS: W-[IV]-[STA]-[RK]-x-[DE]-Y-[DNE]-[DE].
NAME: Actins and actin-related proteins signature.
CONSENSUS: [LM]-[LIVM]-T-E-[GAPQ]-x-[LIVMFYWHQ]-N-[PSTAQ]-x(2)-N-[KR].
NAME: Annexins repeated domain signature.
CONSENSUS: [TG]-[STV]-x(8)-[LIVMF]-x(2)-R-x(3)-[DEQNH]-x(7)-[IFY]-x(7)-[LIVMF]-x(3)-[LIVMF]-x(11)-[LIVMFA]-x(2)-[LIVMF].
NAME: Caveolins signature.
CONSENSUS: F-E-D-V-I-A-E-P.
NAME: Clathrin light chain signature 1.
CONSENSUS: F-L-A-Q-Q-E-S.
NAME: Clathrin light chain signature 2.
CONSENSUS: [KR]-D-x-S-[KR]-[LIVM]-[KR]-x-[LIVM](3)-x-L-K.
NAME: Clusterin signature 1.
CONSENSUS: C-K-P-C-L-K-x-T-C.
NAME: Clusterin signature 2.
CONSENSUS: C-L-[RK]-M-[RK]-x-[EQ]-C-[ED]-K-C.
NAME: Connexins signature 1.
CONSENSUS: C-[DN]-T-x-Q-P-G-C-x(2)-V-C-Y-
NAME: Connexins signature 2.
CONSENSUS: C-x(3,4)-P-C-x(3)-[LIVM]-[DEN]-C-[FY]-[LIVM]-[SA]-[KR]-P.
NAME: Crystallins beta and gamma 'Greek key' motif signature.
CONSENSUS: [LIVMFYWA]-x-{DEHRKSTP}-[FY]-[DEQHKY]-x(3)-[FY]-x-G-x(4)-[LIVMFCST].
NAME: Dynamin family signature.
CONSENSUS: L-P-[RK]-G-[STN]-[GN]-[LIVM]-V-T-R.
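By way of illustration only, a consensus pattern of the kind listed here can be converted into a regular expression and used to scan a protein sequence. The notation assumed is: elements separated by "-", "x" for any residue, "[...]" for permitted residues, "{...}" for excluded residues, "(n)" or "(n,m)" for repeat counts, and "<" or ">" for an N-terminal or C-terminal anchor. The following Python sketch is not part of the original disclosure; the function name `prosite_to_regex` and the test sequence are hypothetical, and a terminal anchor written inside brackets (such as "[G>]") is not handled.

```python
import re

def prosite_to_regex(pattern):
    """Translate a PROSITE-style consensus pattern into a Python regex string."""
    parts = []
    for element in pattern.strip().rstrip(".").split("-"):
        element = element.strip()
        if not element:
            continue  # tolerate a stray trailing '-'
        # '<' anchors the match at the N-terminus, '>' at the C-terminus.
        prefix = "^" if element.startswith("<") else ""
        suffix = "$" if element.endswith(">") else ""
        element = element.lstrip("<").rstrip(">")
        # Separate an optional repeat count such as (3) or (2,4).
        match = re.fullmatch(r"(.+?)(?:\((\d+)(?:,(\d+))?\))?", element)
        core, low, high = match.groups()
        if core == "x":
            core = "."                       # any residue
        elif core.startswith("{") and core.endswith("}"):
            core = "[^" + core[1:-1] + "]"   # excluded residues
        repeat = ""
        if low is not None:
            repeat = "{" + low + ("," + high if high else "") + "}"
        parts.append(prefix + core + repeat + suffix)
    return "".join(parts)

# Dynamin family signature from the listing, scanned against a made-up sequence.
regex = prosite_to_regex("L-P-[RK]-G-[STN]-[GN]-[LIVM]-V-T-R.")
hit = re.search(regex, "MKALPKGSGIVTRQ")  # hypothetical test sequence
print(regex, hit.start() if hit else None)
```

A regular expression is used here only because the consensus notation maps onto it element by element; a profile entry (an entry with a NAME but no CONSENSUS line) cannot be represented this way.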
NAME: Dynein light chain type 1 signature.
CONSENSUS: H-x-I-x-G-[KR]-x-F-[GA]-S-x-V-[ST]-[HY]-E.
NAME: FtsZ protein signature 1.
CONSENSUS: N-[ST]-D-x-Q-x-L-x(16,18)-G-x-G-[ATV]-G-[GSAN]-x-P-x(2)-G.
NAME: FtsZ protein signature 2.
CONSENSUS: [DNHKR]-[LIVMF]-x-[LIVMF](2)-[VSTAC]-[STAC]-G-x-G-[GK]-G-T-G-[ST]-G-[GSAR]-[STA]-P-[LIVMFT]-[LIVMF]-[SGAV].
NAME: Fungal hydrophobins signature.
CONSENSUS: [GN]-[DNQPSA]-x-C-[GSTANK]-[GSTADNQ]-[STNQI]-[PTIV]-x-C-C-[DENQKPST].
NAME: Intermediate filaments signature.
CONSENSUS: [IV]-x-[TACI]-Y-[RKH]-x-[LM]-L-[DE].
NAME: Involucrin signature.
CONSENSUS: <M-S-[QH]-Q-x-T-[LV]-P-V-T-[LV].
NAME: Kinesin motor domain signature.
CONSENSUS: [GSA]-[KRHPSTQVM]-[LIVMF]-x-[LIVMF]-[IVC]-D-L-[AH]-G-[SAN]-E.
NAME: Kinesin motor domain profile.
NAME: Kinesin light chain repeat.
CONSENSUS: [DEQR]-A-L-x(3)-[GEQ]-x(3)-G-x-[DNS]-x-P-x-V-A-x(3)-N-x-L-[AS]-x(5)-[QR]-x-[KR]-[FY]-x(2)-[AV]-x(4)-[HKNQ].
NAME: Myelin basic protein signature.
CONSENSUS: V-V-H-F-F-K-N.
NAME: Myelin P0 protein signature.
CONSENSUS: S-[KR]-S-x-K-[AG]-x-[SA]-E-K-K-[STA]-K.
NAME: Myelin proteolipid protein signature 1.
CONSENSUS: G-[MV]-A-L-F-C-G-C-G-H.
NAME: Myelin proteolipid protein signature 2.
CONSENSUS: C-x-[ST]-x-[DE]-x(3)-[ST]-[FY]-x-L-[FY]-I-x(4)-G-A.
NAME: Neuromodulin (GAP-43) signature 1.
CONSENSUS: <M-L-C-C-[LIVM]-R-R.
NAME: Neuromodulin (GAP-43) signature 2.
CONSENSUS: S-F-R-G-H-I-x-R-K-K-[LIVM].
NAME: Osteopontin signature.
CONSENSUS: [KQ]-x-[TA]-x(2)-[GA]-S-S-E-E-K.
NAME: Peripherin / rom-1 signature.
CONSENSUS: D-[GS]-V-P-F-[ST]-C-C-N-P-x-S-P-R-P-C.
NAME: Profilin signature.
CONSENSUS: <x(0,1)-[STA]-x(0,1)-W-[DENQH]-x-[YI]-x-[DEQ].
NAME: Surfactant associated polypeptide SP-C palmitoylation sites.
CONSENSUS: I-P-C-C-P-V.
NAME: Synapsins signature 1.
CONSENSUS: L-R-R-R-L-S-D-S.
NAME: Synapsins signature 2.
CONSENSUS: G-H-A-H-S-G-M-G-K-V-K.
NAME: Synaptobrevin signature.
CONSENSUS: N-[LIVM]-[DENS]-[KL]-V-x-[DEQ]-R-x(2)-[KR]-[LIVM]-[STDE]-x-[LIVM]-x-[DE]-[KR]-[TA]-[DE].
NAME: Synaptophysin / synaptoporin signature.
CONSENSUS: L-S-V-[DE]-C-x-N-K-T.
NAME: Tropomyosins signature.
CONSENSUS: L-K-E-A-E-x-R-A-E.
NAME: Tubulin subunits alpha, beta, and gamma signature.
CONSENSUS: [SAG]-G-G-T-G-[SA]-G.
NAME: Tubulin-beta mRNA autoregulation signal.
CONSENSUS: <M-R-[DE]-[IL].
NAME: Tau and MAP proteins tubulin-binding domain signature.
CONSENSUS: G-S-x(2)-N-x(2)-H-x-[PA]-[AG]-G(2).
NAME: Neuraxin and MAP1B proteins repeated region signature.
CONSENSUS: [STAGDN]-Y-x-Y-E-x(2)-[DE]-[KR]-[STAGCI].
NAME: F-actin capping protein alpha subunit signature 1.
CONSENSUS: V-H-[FY](2)-E-D-G-N-V.
NAME: F-actin capping protein alpha subunit signature 2.
CONSENSUS: F-K-[AE]-L-R-R-x-L-P.
NAME: F-actin capping protein beta subunit signature.
CONSENSUS: C-D-Y-N-R-D.
NAME: Vinculin family talin-binding region signature.
CONSENSUS: [KR]-x-[LIVMF]-x(3)-[LIVMA]-x(2)-[LIVM]-x(6)-R-Q-Q-E-L.
NAME: Vinculin repeated domain signature.
CONSENSUS: [LIVM]-x-[QA]-A-x(2)-W-[IL]-x-[DN]-P.
NAME: Amyloidogenic glycoprotein extracellular domain signature.
CONSENSUS: G-[VT]-E-[FY]-V-C-C-P.
NAME: Amyloidogenic glycoprotein intracellular domain signature.
CONSENSUS: G-Y-E-N-P-T-Y-[KR].
NAME: Cadherins extracellular repeated domain signature.
CONSENSUS: [LIV]-x-[LIV]-x-D-x-N-D-[NH]-x-P.
NAME: Insect cuticle proteins signature.
CONSENSUS: G-x(7)-[DEN]-G-x(6)-Y-x-A-[DNG]-x(2,3)-G-[FY]-x-[AP].
NAME: Gas vesicles protein GVPa signature 1.
CONSENSUS: [LIVM]-x-[DE]-[LIVMFYT]-[LIVM]-[DE]-x-[LIVM](2)-[DKR](2)-G-x-[LIVM](2).
NAME: Gas vesicles protein GVPa signature 2.
CONSENSUS: R-[LIVA](3)-A-[GS]-[LIVMFY]-x-T-x(3)-Y-[AG].
NAME: Gas vesicles protein GVPc repeated domain signature.
CONSENSUS: F-L-x(2)-T-x(3)-R-x(3)-A-x(2)-Q-x(3)-L-x(2)-F.
NAME: Bacterial microcompartments proteins signature.
CONSENSUS: D-x(0,1)-M-x-K-[SAG](2)-x-[IV]-x-[LIVM]-[LIVMA]-[GCS]-x(4)-[GD]-[SGPD]-[GA].
NAME: Flagella basal body rod proteins signature.
CONSENSUS: [GTARYQ]-x(9)-[LIVMYSTA](2)-[GSTA]-[STADEN]-N-[LIVM]-[SAN]-N-x-[SADNFR]-[STV].
NAME: Flagella transport protein fliP family signature 1.
CONSENSUS: [PA]-A-[FY]-x-[LIVT]-[STH]-[EQ]-[LI]-x(2)-[GA]-F-[KREQ]-[IM]-G-[LIF].
NAME: Flagella transport protein fliP family signature 2.
CONSENSUS: P-[LIVMF]-K-[LIVMF](5)-x-[LIVMA]-[DNGS]-G- .
NAME: Plant viruses icosahedral capsid proteins 'S' region signature.
CONSENSUS: [FYW]-x-[PSTA]-x(7)-G-x-[LIVM]-x-[LIVM]-x-[FYWI]-x(2)-D-x(5)-P.
NAME: Potexviruses and carlaviruses coat protein signature.
CONSENSUS: [RK]-[FYW]-A-[GAP]-F-D-x-F-x(2)-[LV]-x(3)-[GASTKS].
NAME: Neurotransmitter-gated ion-channels signature.
CONSENSUS: C-x-[LIVMFQ]-x-[LIVMF]-x(2)-[FY]-P-x-D-x(3)-C.
NAME: ATP P2X receptors signature.
CONSENSUS: G-G-x-[LIVM]-G-[LIVM]-x-[IV]-x-W-x-C-[DN]-L-D-x(5)-C-x-P-x-Y-x-F.
NAME: G-protein coupled receptors signature.
CONSENSUS: [GSTALIVMFYWC]-[GSTANCPDE]-{EDPKRH}-x(2)-[LIVMNQGA]-x(2)-[LIVMFT]-[GSTANC]-[LIVMFYWSTAC]-[DENH]-R-[FYWCSH]-x(2)-[LIVM].
NAME: G-protein coupled receptors family 2 signature 1.
CONSENSUS: C-x(3)-[FYWLIV]-D-x(3,4)-C-[FW]-x(2)-[STAGV]-
Figure imgf000461_0001
NAME: G-protein coupled receptors family 2 signature 2.
CONSENSUS: Q-G-[LMFCA]-[LIVMFT]-[LIV]-x-[LIVFST]-[LIF]-[VFYH]-C-[LFY]-x-N-x(2)-V.
NAME: G-protein coupled receptors family 3 signature 1.
CONSENSUS: [LV]-x-N-[LIVM](2)-x-L-F-x-I-[PA]-Q-[LIVM]-[STA]-x-[STA](3)-[STAN].
NAME: G-protein coupled receptors family 3 signature 2.
CONSENSUS: C-C-[FYW]-x-C-x(2)-C-x(4)-[FYW]-x(2,4)-[DN]-x(2)-[STAH]-C-x(2)-C.
NAME: G-protein coupled receptors family 3 signature 3.
CONSENSUS: F-N-E-[STA]-K-x-I-[STAG]-F-[ST]-M.
NAME: Visual pigments (opsins) retinal binding site.
CONSENSUS: [LIVMWAC]-[PGAC]-x(3)-[SAC]-K-[STALIMR]-[GSACPNV]-[STACP]-x(2)-[DENF]-[AP]-x(2)-[IY].
NAME: Bacterial rhodopsins signature 1.
CONSENSUS: R-Y-x-[DT]-W-x-[LIVMF]-[ST]-T-P-[LIVM](3).
NAME: Bacterial rhodopsins retinal binding site.
CONSENSUS: [FYIV]-x-[FYVG]-[LIVM]-D-[LIVMF]-x-[STA]-K-x(2)-[FY].
NAME: Receptor tyrosine kinase class II signature.
CONSENSUS: [DN]-[LIV]-Y-x(3)-Y-Y-R.
NAME: Receptor tyrosine kinase class III signature.
CONSENSUS: G-x-H-x-N-[LIVM]-V-N-L-L-G-A-C-T.
NAME: Receptor tyrosine kinase class V signature 1.
CONSENSUS: F-x-[DN]-x-[GAW]-[GA]-C-[LIVM]-[SA]-[LIVM](2)-[SA]-[LV]-[KRHQ]-[LIVA]-x(3)-[KR]-C-[PSAW].
NAME: Receptor tyrosine kinase class V signature 2.
CONSENSUS: C-x(2)-[DE]-G-[DEQ]-W-x(2,3)-[PAQ]-[LIVMT]-[GT]-x-C-x-C-x(2)-G-[HFY]-[EQ].
NAME: Growth factor and cytokines receptors family signature 1.
CONSENSUS: C-[LVFYR]-x(7,8)-[STIVDN]-C-x-W.
NAME: Growth factor and cytokines receptors family signature 2.
CONSENSUS: [STGL]-x-W-[SG]-x-W-S.
NAME: TNFR/NGFR family cysteine-rich region signature.
CONSENSUS: C-x(4,6)-[FYH]-x(5,10)-C-x(0,2)-C-x(2,3)-C-x(7,11)-C-x(4,6)-[DNEQSKP]-x(2)-C.
NAME: TNFR/NGFR family cysteine-rich region domain.
NAME: Integrins alpha chain signature.
CONSENSUS: [FYWS]-[RK]-x-G-F-F-x-R.
NAME: Integrins beta chain cysteine-rich domain signature.
CONSENSUS: C-x-[GNQ]-x(1,3)-G-x-C-x-C-x(2)-C-x-C.
NAME: Natriuretic peptides receptors signature.
CONSENSUS: G-P-x-C-x-Y-x-A-A-x-V-x-R-x(3)-H-W.
NAME: Photosynthetic reaction center proteins signature.
CONSENSUS: [NH]-x(4)-P-x-H-x(2)-[SAG]-x(11)-[SAGC]-x-H-[SAG](2).
NAME: Antenna complexes alpha subunits signature.
CONSENSUS: [LIVFAG]-x-[GASV]-[LIVFA]-x-[IV]-H-x(3)-[LIVM]-[GSTAE]-[STANH]-x(1,3)-[STN]-W-[LIVMFYW].
NAME: Antenna complexes beta subunits signature.
CONSENSUS: [EQ]-x(4)-H-x(5)-[GSTA]-x(3)-[FY]-x(3)-[AG]-x(2)-[AV]-H-x(7)-P.
NAME: Photosystem I psaA and psaB proteins signature.
CONSENSUS: C-D-G-P-G-R-G-G-T-C.
NAME: Photosystem I psaG and psaK proteins signature.
CONSENSUS: G-F-x-[LIVM]-x-[DEA]-x(2)-[GA]-x-[GTA]-[SA]-x-G-H-x-[LIVM]-[GA].
NAME: Phytochrome chromophore attachment site signature.
CONSENSUS: [RGS]-[GSA]-[PV]-H-x-C-H-x(2)-Y.
NAME: Phytochrome chromophore attachment site domain profile.
NAME: Speract receptor repeated domain signature.
CONSENSUS: G-x(5)-G-x(2)-E-x(6)-W-G-x(2)-C-x(3)-[FYW]-x(8)-C-x(3)-G.
NAME: TonB-dependent receptor proteins signature 1.
CONSENSUS: <x(10,115)-[DENF]-[ST]-[LIVMF]-[LIVSTEQ]-V-x-[AGP]-[STANEQPK].
NAME: TonB-dependent receptor proteins signature 2.
CONSENSUS: [LYGSTANE]-x(3)-[GSTAENQ]-x-[PGE]-R-x-[LIVFYWA]-x-[LIVMFTA]-[STAGNQ]-[LIVMFYGTA]-x-[LIVMFYWGTADQ]-x-F>.
NAME: Transmembrane 4 family signature.
CONSENSUS: G-x(3)-[LIVMF]-x(2)-[GSA]-[LIVMF](2)-G-C-x-[GA]-[STA]-x(2)-[EG]-x(2)-[CWN]-[LIVM](2).
NAME: Bacterial chemotaxis sensory transducers signature.
CONSENSUS: R-T-E-[EQ]-Q-x(2)-[SA]-[LIVM]-x-[EQ]-T-A-A-S-M-E-Q-L-T-A-T-V.
NAME: ER lumen protein retaining receptor signature 1.
CONSENSUS: G-I-S-x-[KR]-x-Q-x-L-[FY]-x-[LIV](2)-F-x(2)-R-Y.
NAME: ER lumen protein retaining receptor signature 2.
CONSENSUS: L-E-[SA]-V-A-I-[LM]-P-Q-L.
NAME: Ephrins signature.
CONSENSUS: [KRQ]-[LF]-[CST]-x-K-[IF]-Q-x-[FY]-[ST]-[PA]-x(3)-G-x-E-F-x(5)-[FY](2)-x(2)-[SA].
NAME: Granulins signature.
CONSENSUS: C-x-D-x(2)-H-C-C-P-x(4)-C.
NAME: HBGF/FGF family signature.
CONSENSUS: G-x-L-x-[STAGP]-x(6,7)-[DE]-C-x-[FM]-x-E-x(6)-Y.
NAME: PTN/MK heparin-binding protein family signature 1.
CONSENSUS: S-[DE]-C-x-[DE]-W-x-W-x(2)-C-x-P-x-[SN]-x-D-C-G-[LIVMA]-G-x-R-E-G.
NAME: PTN/MK heparin-binding protein family signature 2.
CONSENSUS: C-[KR]-[LIVM]-P-C-N-W-K-K-x-F-G-A-[DE]-C-K-Y-x-F-[EQ]-x-W-G-x-C.
NAME: Nerve growth factor family signature.
CONSENSUS: G-C-[KR]-G-[LIV]-[DE]-x(3)-[YW]-x-S-x-C.
NAME: Platelet-derived growth factor (PDGF) family signature.
CONSENSUS: P-[PS]-C-V-x(3)-R-C-[GSTA]-G-C- .
NAME: Small cytokines (intercrine/chemokine) C-x-C subfamily signature.
CONSENSUS: C-x-C-[LIVM]-x(5,6)-[LIVMFY]-x(2)-[RKSEQ]-x-[LIVM]-x(2)-[LIVM]-x(5)-[SAG]-x(2)-C-x(3)-[EQ]-[LIVM](2)-x(9,10)-C-L-[DN].
NAME: Small cytokines (intercrine/chemokine) C-C subfamily signature.
CONSENSUS: C-C-[LIFYT]-x(5,6)-[LI]-x(4)-[LIVMF]-x(2)-[FYW]-x(6,8)-C-x(3,4)-[SAG]-[LIVM](2)-[FL]-x(8)-C-[STA].
NAME: TGF-beta family signature.
CONSENSUS: [LIVM]-x(2)-P-x(2)-[FY]-x(4)-C-x-G-x-C.
NAME: TNF family signature.
CONSENSUS: [LV]-x-[LIVM]-x(3)-G-[LIVMF]-Y-[LIVMFY](2)-x(2)-[QEKHL]-[LIVMGT]-x-[LIVMFY].
NAME: TNF family profile.
NAME: Wnt-1 family signature.
CONSENSUS: C-K-C-H-G-[LIVMT]-S-G-x-C.
NAME: Interferon alpha, beta and delta family signature.
CONSENSUS: [FYH]-[FY]-x-[GNRC]-[LIVM]-x(2)-[FY]-L-x(7)-[CY]-A-W.
NAME: Granulocyte-macrophage colony-stimulating factor signature.
CONSENSUS: C-P-[LP]-T-x-E-[ST]-x-C.
NAME: Interleukin-1 signature.
CONSENSUS: [FC]-x-S-[ASLV]-x(2)-P-x(2)-[FYLIV]-[LI]-[SCA]-T-x(7)-[LIVM].
NAME: Interleukin-2 signature.
CONSENSUS: T-E-[LF]-x(2)-L-x-C-L-x(2)-E-L.
NAME: Interleukins -4 and -13 signature.
CONSENSUS: L-x-E-[LIVM](2)-x(4,5)-[LIVM]-[TL]-x(5,7)-C-x(4)-[IVA]-x-[DNS]-[LIVMA].
NAME: Interleukin-6 / G-CSF / MGF signature.
CONSENSUS: C-x(9)-C-x(6)-G-L-x(2)-[FY]-x(3)-L.
NAME: Interleukin-7 and -9 signature.
CONSENSUS: N-x-[LAP]-[SCT]-F-L-K-x-L-L.
NAME: Interleukin-10 family signature.
CONSENSUS: [GS]-C-x(2)-[LV]-x(2)-[LIVM](2)-x-F-Y-L-x(2)-V.
NAME: LIF / OSM family signature.
CONSENSUS: [PST]-x(4)-F-[NQ]-x-K-x(3)-C-x-[LF]-L-x(2)-Y-[HK].
NAME: Macrophage migration inhibitory factor family signature.
CONSENSUS: [DE]-P-C-A-x(3)-[LIVM]-x-S-I-G-x-[LIVM]-G.
NAME: Adipokinetic hormone family signature.
CONSENSUS: Q-[LV]-[NT]-[FY]-[ST]-x(2)-W.
NAME: Bombesin-like peptides family signature.
CONSENSUS: W-A-x-G-[SH]-[LF]-M.
NAME: Calcitonin / CGRP / IAPP family signature.
CONSENSUS: C-[SAGDN]-[STN]-x(0,1)-[SA]-T-C-[VMA]-x(3)-[LYF]-x(3)-[LYF].
NAME: Corticotropin-releasing factor family signature.
CONSENSUS: [PQ]-x-[LIVM]-S-[LIVM]-x(2)-[PST]-[LIVMF]-x-[LIVM]-L-R-x(2)-[LIVM].
NAME: Crustacean CHH/MIH/GIH neurohormones family signature.
CONSENSUS: C-[DENK]-D-C-x-N-[LIV]-[FY]-R-x(7)-C-[KR]-x(2)-C.
NAME: Erythropoietin / thrombopoietin signature.
CONSENSUS: P-x(4)-C-D-x-R-[LIVM](2)-x-[KR]-x(14)-C.
NAME: Granins signature 1.
CONSENSUS: [DE]-[SN]-L-[SAN]-x(2)-[DE]-x-E-L.
NAME: Granins signature 2.
CONSENSUS: C-[LIVM](2)-E-[LIVM](2)-S-[DN]-[STA]-L-x-K-x-S-x(3)-[LIVM]-[STA]-x-E-C.
NAME: Galanin signature.
CONSENSUS: G-W-T-L-N-S-A-G-Y-L-L-G-P-H.
NAME: Gastrin / cholecystokinin family signature.
CONSENSUS: Y-x(0,1)-[GD]-[WH]-M-[DR]-F.
NAME: Glucagon / GIP / secretin / VIP family signature.
CONSENSUS: [YH]-[STAIVGD]-[DEQ]-[AGF]-[LIVMSTE]-[FYLR]-x-[DENSTAK]-[DENSTA]-[LIVMFYG]-x(9)-[KREQL]-[KRDENQL]-[LVFYWG]-[LIVQ].
NAME: Glycoprotein hormones alpha chain signature 1.
CONSENSUS: C-x-G-C-C-[FY]-S-R-A-[FY]-P-T-P.
NAME: Glycoprotein hormones alpha chain signature 2.
CONSENSUS: N-H-T-x-C-x-C-x-T-C-x(2)-H-K.
NAME: Glycoprotein hormones beta chain signature 1.
CONSENSUS: C-[STAGM]-G-[HFYL]-C-x-[ST].
NAME: Glycoprotein hormones beta chain signature 2.
CONSENSUS: [PA]-V-A-x(2)-C-x-C-x(2)-C-x(4)-[STD]-[DEY]-C-x(6,8)-[PGSTAVM]-x(2)-C.
NAME: Gonadotropin-releasing hormones signature.
CONSENSUS: Q-H-[FYW]-S-x(4)-P-G.
NAME: Insulin family signature.
CONSENSUS: C-C-{P}-x(2)-C-[STDNEKPI]-x(3)-[LIVMFS]-x(3)-C.
NAME: Natriuretic peptides signature.
CONSENSUS: C-F-G-x(3)-D-R-I-x(3)-S-x(2)-G-C.
NAME: Neurohypophysial hormones signature.
CONSENSUS: C-[LIFY](2)-x-N-[CS]-P-x-G.
NAME: Neuromedin U signature.
CONSENSUS: F-[LIVMF]-F-R-P-R-N.
NAME: Endogenous opioids neuropeptides precursors signature.
CONSENSUS: C-x(3)-C-x(2)-C-x(2)-[KRH]-x(6,7)-[LIF]-[DN]-x(3)-C-x-[LIVM]-[EQ]-C-[EQ]-x(8)-W-x(2)-C.
NAME: Pancreatic hormone family signature.
CONSENSUS: [FY]-x(3)-[LIVM]-x(2)-Y-x(3)-[LIVMFY]-x-R-x-R-[YF].
NAME: Parathyroid hormone family signature.
CONSENSUS: V-S-E-x-Q-x(2)-H-x(2)-G.
NAME: Pyrokinins signature.
CONSENSUS: F-[GSTV]-P-R-L-[G>].
NAME: Somatotropin, prolactin and related hormones signature 1.
CONSENSUS: C-x-[ST]-x(2)-[LIVMFY]-x-[LIVMSTA]-P-x(5)-[TALIV]-x(7)-[LIVMFY]-x(6)-[LIVMFY]-x(2)-[STA]-W.
NAME: Somatotropin, prolactin and related hormones signature 2.
CONSENSUS: C-[LIVMFY]-x(2)-D-[LIVMFYSTA]-x(5)-[LIVMFY]-x(2)-[LIVMFYT]-x(2)-C.
NAME: Tachykinin family signature.
CONSENSUS: F-[IVFY]-G-[LM]-M-[G>].
NAME: Thymosin beta-4 family signature.
CONSENSUS: K-L-K-K-T-E-T-Q-E-K-N.
NAME: Urotensin II signature.
CONSENSUS: C-F-W-K-Y-C.
NAME: Cecropin family signature.
CONSENSUS: W-x(0,2)-[KDN]-x(2)-K-[KRE]-[LI]-E-[RKN].
NAME: Mammalian defensins signature.
CONSENSUS: C-x-C-x(3,5)-C-x(7)-G-x-C-x(9)-C-C.
NAME: Arthropod defensins signature.
CONSENSUS: C-x(2,3)-[HN]-C-x(3,4)-[GR]-x(2)-G-G-x-C-x(4,7)-C-x-C.
NAME: Cathelicidins signature 1.
CONSENSUS: Y-x-[ED]-x-V-x-[RQ]-A-[LIVMA]-[DQG]-x-[LIVMFY]-N-[EQ].
NAME: Cathelicidins signature 2.
CONSENSUS: F-x-[LIVM]-K-E-T-x-C-x(10)-C-x-F-[KR]-[KE].
NAME: Endothelin family signature.
CONSENSUS: C-x-C-x(4)-D-x(2)-C-x(2)-[FY]-C.
NAME: Plant thionins signature.
CONSENSUS: C-C-x(5)-R-x(2)-[FY]-x(2)-C.
NAME: Gamma-thionins family signature.
CONSENSUS: [KR]-x-C-x(3)-[SV]-x(2)-[FYWH]-x-[GF]-x-C-x(5)-C-x(3)-C.
NAME: Snake toxins signature.
CONSENSUS: G-C-x(1,3)-C-P-x(8,10)-C-C-x(2)-[PDEN].
NAME: Myotoxins signature.
CONSENSUS: K-x-C-H-x-K-x(2)-H-C-x(2)-K-x(3)-C-x(8)-K-x(2)-C-x(2)-[RK]-x-K-C-C-K-K.
NAME: Scorpion short toxins signature.
CONSENSUS: C-x(3)-C-x(6,9)-[GAS]-K-C-[IMQT]-x(3)-C-x-C.
NAME: Heat-stable enterotoxins signature.
CONSENSUS: C-C-x(2)-C-C-x-P-A-C-x-G-C.
NAME: Aerolysin type toxins signature.
CONSENSUS: [KT]-x(2)-N-W-x(2)-T-[DN]-T.
NAME: Shiga/ricin ribosomal inactivating toxins active site signature.
CONSENSUS: [LIVMA]-x-[LIVMSTA](2)-x-E-[SAGV]-[STAL]-R-[FY]-[RKNQS]-x-[LIVM]-[EQS]-x(2)-[LIVMF].
NAME: Channel forming colicins signature.
CONSENSUS: T-x(2)-W-x-P-[LIVMFY](3)-x(2)-E.
NAME: Hok/gef family cell toxic proteins signature.
CONSENSUS: [LIVMA](4)-C-[LIVMFA]-T-[LIVMA](2)-x(4)-[LIVM]-x-[RG]-x(2)-L-[CY].
NAME: Staphylococcal enterotoxin/Streptococcal pyrogenic exotoxin signature 1.
CONSENSUS: Y-G-G-[LIV]-T-x(4)-N.
NAME: Staphylococcal enterotoxin/Streptococcal pyrogenic exotoxin signature 2.
CONSENSUS: K-x(2)-[LIV]-x(4)-[LIV]-D-x(3)-R-x(2)-L-x(5)-[LIV]-Y.
NAME: Thiol-activated cytolysins signature.
CONSENSUS: [RK]-E-C-T-G-L-x-W-E-W-W-[RK].
NAME: Membrane attack complex components / perforin signature.
CONSENSUS: Y-x(6)-[FY]-G-T-H-[FY].
NAME: Pancreatic trypsin inhibitor (Kunitz) family signature.
CONSENSUS: F-x(3)-G-C-x(6)-[FY]-x(5)-C.
NAME: Bowman-Birk serine protease inhibitors family signature.
CONSENSUS: C-x(5,6)-[DENQKRHSTA]-C-[PASTDH]-[PASTDK]-[ASTDV]-C-[NDKS]-[DEKRHSTA]-C.
NAME: Kazal serine protease inhibitors family signature.
CONSENSUS: C-x(7)-C-x(6)-Y-x(3)-C-x(2,3)-C.
NAME: Soybean trypsin inhibitor (Kunitz) protease inhibitors family signature.
CONSENSUS: [LIVM]-x-D-x-[EDNTY]-[DG]-[RKHDENQ]-x-[LIVM]-x(5)-Y-x-[LIVM].
NAME: Serpins signature.
CONSENSUS: [LIVMFY]-x-[LIVMFYAC]-[DNQ]-[RKHQS]-[PST]-F-[LIVMFY]-[LIVMFYC]-x-[LIVMFAH].
NAME: Potato inhibitor I family signature.
CONSENSUS: [FYW]-P-[EQH]-[LIV](2)-G-x(2)-[STAGV]-x(2)-A.
NAME: Squash family of serine protease inhibitors signature.
CONSENSUS: C-P-x(5)-C-x(2)-D-x-D-C-x(3)-C-x-C.
NAME: Streptomyces subtilisin-type inhibitors signature.
CONSENSUS: C-x-P-x(2,3)-G-x-H-P-x(4)-A-C-[ATD]-x-L.
NAME: Cysteine proteases inhibitors signature.
CONSENSUS: [GSTEQKRV]-Q-[LIVT]-[VAF]-[SAGQ]-G-x-[LIVMNK]-x(2)-[LIVMFY]-x-[LIVMFYA]-[DENQKRHSIV].
NAME: Tissue inhibitors of metalloproteinases signature.
CONSENSUS: C-x-C-x-P-x-H-P-Q-x-A-F-C.
NAME: Cereal trypsin/alpha-amylase inhibitors family signature.
CONSENSUS: C-x(4)-[SAGD]-x(4)-[SPAL]-[LF]-x(2)-C-[RH]-x-[LIVMFY](2)-x(3,4)-C.
NAME: Alpha-2-macroglobulin family thiolester region signature.
CONSENSUS: [PG]-x-[GS]-C-[GA]-E-[EQ]-x-[LIVM].
NAME: Disintegrins signature.
CONSENSUS: C-x(2)-G-x-C-C-x-[NQRS]-C-x-[FM]-x(6)-C-[RK].
NAME: Lambdoid phages regulatory protein CIII signature.
CONSENSUS: E-S-x-L-x-R-x(2)-[KR]-x-L-x(4)-[KR](2)-x(2)-[DE]-x-L.
NAME: Chaperonins cpn60 signature.
CONSENSUS: A-[AS]-x-[DEQ]-E-x(4)-G-G-[GA].
NAME: Chaperonins cpn10 signature.
CONSENSUS: [LIVMFY]-x-P-[ILT]-x-[DEN]-[KR]-[LIVMFA](3)-[KREQ]-x(8,9)-[SG]-x-[LIVMFY](3).
NAME: Chaperonins TCP-1 signature 1.
CONSENSUS: [RKEL]-[ST]-x-[LMFY]-G-P-x-[GSA]-x-x-K-[LIVMF](2).
NAME: Chaperonins TCP-1 signature 2.
CONSENSUS: [LIVM]-[TS]-[NK]-D-[GA]-[AVNHK]-[TAV]-[LIVM](2)-x(2)-[LIVM]-x-[LIVM]-x-[SNH]-[PQH].
NAME: Chaperonins TCP-1 signature 3.
CONSENSUS: Q-[DEK]-x-x-[LIVMGTA]-[GA]-D-G-T.
NAME: Heat shock hsp20 proteins family profile.
NAME: Heat shock hsp70 proteins family signature 1.
CONSENSUS: [IV]-D-L-G-T-[ST]-x-[SC].
NAME: Heat shock hsp70 proteins family signature 2.
CONSENSUS: [LIVMF]-[LIVMFY]-[DN]-[LIVMFS]-G-[GSH]-[GS]-[AST]-x(3)-[ST]-[LIVM]-[LIVMFC].
NAME: Heat shock hsp70 proteins family signature 3.
CONSENSUS: [LIVMY]-x-[LIVMF]-x-G-G-x-[ST]-x-[LIVM]-P-x-[LIVM]-x-[DEQKRSTA].
NAME: Heat shock hsp90 proteins family signature.
CONSENSUS: Y-x-[NQH]-K-[DE]-[IVA]-F-L-R-[ED].
NAME: Chaperonins clpA/B signature 1.
CONSENSUS: D-[AI]-[SGA]-N-[LIVMF](2)-K-[PT]-x-L-x(2)-G.
NAME: Chaperonins clpA/B signature 2.
CONSENSUS: R-[LIVMFY]-D-x-S-E-[LIVMFY]-x-E-[KRQ]-x-[STA]-x-[STA]-[KR]-[LIVM]-x-G-[STA].
NAME: Nt-dnaJ domain signature.
CONSENSUS: [FY]-x(2)-[LIVMA]-x(3)-[FYWHNT]-[DENQSA]-x-L-x-[DN]-x(3)-[KR]-x(2)-[FYI].
NAME: dnaJ domain profile.
NAME: CXXCXGXG dnaJ domain signature.
CONSENSUS: C-[DEGSTHKR]-x-C-x-G-x-[GK]-[AGSDM]-x(2)-[GSNKR]-x(4,6)-C-x(2,3)-C-x-G-x-G.
NAME: grpE protein signature.
CONSENSUS: [FL]-[DN]-[PHEA]-x(2)-[HM]-x-A-[LIVMTN]-x(16,20)-G-[FY]-x(3)-[DEG]-x(2)-[LIVM]-[RI]-x-[SA]-x-V-x-[IV].
NAME: Bacterial type II secretion system protein C signature.
CONSENSUS: P-x(6)-F-x(4)-L-x(3)-D-[LIVM]-A-[LIVM]-x-[LIVM]-N-x-[LIVM]-x-L.
NAME: Bacterial type II secretion system protein D signature.
CONSENSUS: [GR]-[DEQKG]-[STVM]-[LIVMA](3)-[GA]-G-[LIVMFY]-x(11)-[LIVM]-P-[LIVMFYWGS]-[LIVMF]-[GSAE]-x-[LIVM]-P-[LIVMFYW](2)-x(2)-[LV]-F.
NAME: Bacterial type II secretion system protein E signature.
CONSENSUS: [LIVM]-R-x(2)-P-D-x-[LIVM](3)-G-E-[LIVM]-R-D.
NAME: Bacterial type II secretion system protein F signature.
CONSENSUS: [KRQ]-[LIVMA]-x(2)-[SAIV]-[LIVM]-x-[TY]-P-x(2)-[LIVM]-x(3)-[STAGV]-x(6)-[LMY]-x(3)-[LIVMF](2)-P.
NAME: Bacterial type II secretion system protein N signature.
CONSENSUS: G-T-L-W-x-G-x(11)-L-x(4)-W.
NAME: Bacterial export FHIPEP family signature.
CONSENSUS: R-[LIVM]-[GSA]-E-V-[GSA]-A-R-F-[STV]-L-D-[GSA]-M-P-G-K-Q-M-[GSA]-I-D-[GSA]-D.
NAME: Protein secA signatures.
CONSENSUS: [IV]-x-[IV]-[SA]-T-[NQ]-M-A-G-R-G-x-D-I-x-L.
NAME: Protein secY signature 1.
CONSENSUS: [GST]-[LIVMF](2)-x-[LIVM]-G-[LIVM]-x-P-[LIVMFY](2)-x-[AS]-[GSTQ]-[LIVMFAT](3)-Q-[LIVMFA](2).
NAME: Protein secY signature 2.
CONSENSUS: [LIVMFYW](2)-x-[DE]-x-[LIVMF]-[STN]-x(2)-G-[LIVMF]-[GST]-[NST]-G-x-[GST]-[LIVMF](3).
NAME: Protein secE/sec61-gamma signature.
CONSENSUS: [LIVMFY]-x(2)-[DENQGA]-x(4)-[LIVMTA]-x-[KRV]-x(2)-[KW]-P-x(3)-[SEQ]-x(7)-[LIVT]-[LIVGA]-[LIVFGAST].
NAME: Gram-negative pili assembly chaperone signature.
CONSENSUS: [LIVMFY]-[APN]-x-[DNS]-[KREQ]-E-[STR]-[LIVMAR]-x-[FYWT]-x-[NC]-[LIVM]-x(2)-[LIVM]-P-[PAS].
NAME: Fimbrial biogenesis outer membrane usher protein signature.
CONSENSUS: [VL]-[PASQ]-[PAS]-G-[PAD]-[FY]-x-[LI]-[DNQSTAP]-[DNH]-[LIVMFY].
NAME: SRP54-type proteins GTP-binding domain signature.
CONSENSUS: P-[LIVM]-x-[FYL]-[LIVMAT]-[GS]-x-[GS]-[EQ]-x(4)-[LIVMF].
NAME: Cytochrome c oxidase assembly factor COX10/ctaB/cyoE signature.
CONSENSUS: [ED]-x-D-x(2)-M-x-R-T-x(2)-R-x(4)-G.
NAME: Cyclin-dependent kinases regulatory subunits signature 1.
CONSENSUS: Y-S-x-[KR]-Y-x-[DE](2)-x-[FY]-E-Y-R-H-V-x-[LV]-[PT]-[KRP].
NAME: Cyclin-dependent kinases regulatory subunits signature 2.
CONSENSUS: H-x-P-E-x-H-[IV]-L-L-F-[KR].
NAME: Pentaxin family signature.
CONSENSUS: H-x-C-x-[ST]-W-x-[ST].
NAME: Immunoglobulins and major histocompatibility complex proteins signature.
CONSENSUS: [FY]-x-C-x-[VA]-x-H.
NAME: Prion protein signature 1.
CONSENSUS: A-G-A-A-A-A-G-A-V-V-G-G-L-G-G-Y.
NAME: Prion protein signature 2.
CONSENSUS: E-x-[ED]-x-K-[LIVM](2)-x-[KR]-[LIVM](2)-x-[QE]-M-C-x(2)-Q-Y.
NAME: Cyclins signature.
CONSENSUS: R-x(2)-[LIVMSA]-x(2)-[FYWS]-[LIVM]-x(8)-[LIVMFC]-x(4)-[LIVMFYA]-x(2)-[STAGC]-[LIVMFYQ]-x-[LIVMFYC]-[LIVMFY]-D-[RKH]-[LIVMFYW].
NAME: Proliferating cell nuclear antigen signature 1.
CONSENSUS: [GA]-[LIVMF]-x-[LIVMA]-x-[SAV]-[LIVM]-D-x-[NSAE]-[HKR]-[VI]-x-[LY]-[VGA]-x-[LIVM]-x-[LIVM]-x( )-F.
NAME: Proliferating cell nuclear antigen signature 2.
CONSENSUS: [RKA]-C-[DE]-[RH]-x(3)-[LIVMF]-x(3)-[LIVM]-x-[SGAN]-[LIVMF]-x-K-[LIVMF](2).
NAME: Actin-depolymerizing proteins signature.
CONSENSUS: P-[DE]-x-[SA]-x-[LIVMT]-[KR]-x-[KR]-M-[LIVM]-[YA]-[STA](3)-x(3)-[LIVMF]-[KR].
NAME: BCL2-like apoptosis inhibitors (spans part of BH3, BH1 and BH2).
NAME: Apoptosis regulator, Bcl-2 family BH1 domain signature.
CONSENSUS: [LVME]-[FT]-x-[GSD]-[GL]-x(1,2)-[NS]-[YW]-G-R-[LIV]-[LIVC]-[GAT]-[LIVMF](2)-x-F-[GSAE]-[GSARY].
NAME: Apoptosis regulator, Bcl-2 family BH2 domain signature.
CONSENSUS: W-[LIM]-x(3)-[GR]-G-[WQ]-[DENSAV]-x-[FLGA]-[LIVFTC].
NAME: Apoptosis regulator, Bcl-2 family BH3 domain signature.
CONSENSUS: [LIVAT]-x(3)-L-[KARQ]-x-[IVAL]-G-D-[DESG]-[LIMFV]-[DENSHQ]-[LVSHRQ]-[NSR].
NAME: Apoptosis regulator, Bcl-2 family BH4 domain signature.
CONSENSUS: [DS]-[NT]-R-[AE]-[LI]-V-x-[KD]-[FY]-[LIV]-[GHS]-Y-K-L-[SR]-Q-[RK]-G-[HY]-x-[CW].
NAME: Apoptosis regulator, Bcl-2 family BH4 domain profile.
NAME: Arrestins signature.
CONSENSUS: [FY]-R-Y-G-x-[DE](2)-x-[DE]-[LIVM](2)-G-[LIVM]-x-F-x-[RK]-[DEQ]-[LIVM].
NAME: AAA-protein family signature.
CONSENSUS: [LIVMT]-x-[LIVMT]-[LIVMF]-x-[GATMC]-[ST]-[NS]-x(4)-[LIVM]-D-x-A-[LIFA]-x-R.
NAME: Ubiquitin domain signature.
CONSENSUS: K-x(2)-[LIVM]-x-[DESAK]-x(3)-[LIVM]-[PA]-x(3)-Q-x-[LIVM]-[LIVMC]-[LIVMFY]-x-G-x(4)-[DE].
NAME: Ubiquitin domain profile.
NAME: ADP-ribosylation factors family signature.
CONSENSUS: [HRQT]-x-[FYWI]-x-[LIVM]-x(4)-A-x(2)-G-x(2)-[LIVM]-x(2)-[GSA]-[LIVMF]-x-[WK]-[LIVM].
NAME: GTP-binding nuclear protein ran signature.
CONSENSUS: D-T-A-G-Q-E-K-[LF]-G-G-L-R-[DE]-G-Y-Y.
NAME: SAR1 family signature.
CONSENSUS: R-x-[LIVM]-E-V-F-M-C-S-[LIVM](2)-x-[KRQ]-x-G-Y-x-E-[AG]-[FI]-x-W-[LIVM]-x-Q-Y.
NAME: Band 7 protein family signature.
CONSENSUS: R-x(2)-[LIV]-[SAN]-x(6)-[LIV]-D-x(2)-T-x(2)-W-G-[LIV]-[KRH]-[LIV]-x-[KR]-[LIV]-E-[LIV]-[KR].
NAME: Trp-Asp (WD) repeats signature.
CONSENSUS: [LIVMSTAC]-[LIVMFYWSTAGC]-[LIMSTAG]-[LIVMSTAGC]-x(2)-[DN]-x(2)-[LIVMWSTAC]-x-[LIVMFSTAG]-W-[DEN]-[LIVMFSTAGCN].
NAME: G-protein gamma subunit profile.
NAME: Ras GTPase-activating proteins signature.
CONSENSUS: [GSN]-x-[LIVMF]-[FY]-[LIVMFY]-R-[LIVMFY](2)-[GACN]-P-[AV]-[LIV](2)-[SGAN]-P.
NAME: Ras GTPase-activating proteins profile.
NAME: Guanine-nucleotide dissociation stimulators CDC24 family signature.
CONSENSUS: L-x(2)-[LIVMFYW]-L-x(2)-P-[LIVM]-x(2)-[LIVM]-x-[KRS]-x(2)-L-x-[LIVM]-x-[DEQ]-[LIVM]-x(3)-[ST].
NAME: Guanine-nucleotide dissociation stimulators CDC25 family signature.
CONSENSUS: [GAP]-[CT]-V-P-[FY]-x(4)-[LIVMFY]-x-[DN]-[LIVM].
NAME: MARCKS family signature 1.
CONSENSUS: G-Q-E-N-G-H-V-[KR].
NAME: MARCKS family phosphorylation site domain.
CONSENSUS: E-T-P-K(5)-x(0,1)-F-S-F-K-K-x-F-K-L-S-G-x-S-F-K-[KR]-[NS]-[KR]-K-E.
NAME: Stathmin family signature 1.
CONSENSUS: P-[KQ]-[KR](2)-[DE]-x-S-L-[EG]-E.
NAME: Stathmin family signature 2.
CONSENSUS: A-E-K-R-E-H-E-[KR]-E-V.
NAME: GTP-binding elongation factors signature.
CONSENSUS: D-[KRSTGANQFYW]-x(3)-E-[KRAQ]-x-[RKQD]-[GC]-[IVMK]-[ST]-[IV]-x(2)-[GSTACKRNQ].
NAME: Elongation factor 1 beta/beta'/delta chain signature 1.
CONSENSUS: [DE]-[DEG]-[DE](2)-[LIVMF]-D-L-F-G.
NAME: Elongation factor 1 beta/beta'/delta chain signature 2.
CONSENSUS: V-Q-S-x-D-[LIVM]-x-A-[FWM]-[NQ]-K-[LIVM].
NAME: Elongation factor 1 gamma chain profile.
NAME: Elongation factor Ts signature 1.
CONSENSUS: L-R-x(2)-T-[GDQ]-x-[GS]-[LIVMF]-x(0,1)-[DENKAC]-x-K-[KRNEQS]-[AV]-L.
NAME: Elongation factor Ts signature 2.
CONSENSUS: E-[LIVM]-N-[SCV]-[QE]-T-D-F-V-[SA]-[KRN].
NAME: Elongation factor P signature.
CONSENSUS: K-x-A-x(4)-G-x(2)-[LIV]-x-V-P-x(2)-[LIV]-x(2)-G.
NAME: Eukaryotic initiation factor 1A signature.
CONSENSUS: [IM]-x-G-x-[GS]-[KRH]-x(4)-[CL]-x-D-G-x(2)-R-x(2)-[RH]-I-x-G.
NAME: Eukaryotic initiation factor 4E signature.
CONSENSUS: [DE]-[IFY]-x(2)-F-[KR]-x(2)-[LIVM]-x-P-x-W-E-[DV]-x(5)-G-G-[KR]-W.
NAME: Eukaryotic initiation factor 5A hypusine signature.
CONSENSUS: [PT]-G-K-H-G-x-A-K.
NAME: Initiation factor 2 signature.
CONSENSUS: G-x-[LIVM]-x(2)-L-[KR]-[KRHNS]-x-K-x(5)-[LIVM]-x(2)-G-x-[DEN]-C-G.
NAME: Initiation factor 3 signature.
CONSENSUS: [KR]-[LIVM](2)-[DN]-[FY]-[GSN]-[KR]-[LIVMFYS]-x-[FY]-[DEQT]-x(2)-[KR].
NAME: Translation initiation factor SUI1 signature.
CONSENSUS: [LIVM]-[EQ]-[LIVM]-Q-G-[DEN]-[KHQ]-[KRV].
NAME: Prokaryotic-type class I peptide chain release factors signature.
CONSENSUS: [AR]-[STA]-x-G-x-G-G-Q-[HNGCS]-V-N-x(3)-[ST]-A-[IV].
NAME: Transcription termination factor nusG signature.
CONSENSUS: [LIVM]-F-G-[KRW]-x-T-P-[IV]-x-[LIVM].
NAME: Calponin family repeat.
CONSENSUS: [LIVM]-x-[LS]-Q-[MAS]-G-[STY]-[NT]-[KRQ]-x(2)-[STN]-Q-x-G-x(3,4)-G.
NAME: CAP protein signature 1.
CONSENSUS: [LIVM](2)-x-R-L-[DE]-x(4)-R-L-E.
NAME: CAP protein signature 2.
CONSENSUS: D-[LIVMFY]-x-E-x-[PA]-x-P-E-Q-[LIVMFY]-K.
NAME: Calreticulin family signature 1.
CONSENSUS: [KRHN]-x-[DEQN]-[DEQNK]-x(3)-C-G-G-[AG]-[FY]-[LIVM]-[KN]-[LIVMFY](2).
NAME: Calreticulin family signature 2.
CONSENSUS: [LIVM](2)-F-G-P-D-x-C-[AG].
NAME: Calreticulin family repeated motif signature.
CONSENSUS: [IV]-x-D-x-[DENST]-x(2)-K-P-[DEH]-D-W-[DEN].
NAME: Calsequestrin signature 1.
CONSENSUS: [EQ]-[DE]-G-L-[DN]-F-P-x-Y-D-G-x-D-R-V.
NAME: Calsequestrin signature 2.
CONSENSUS: [DE]-L-E-D-W-[LIVM]-E-D-V-L-x-G-x-[LIVM]-N-T-E-D-D-D.
NAME: S-100/ICaBP type calcium binding protein signature.
CONSENSUS: [LIVMFYW](2)-x(2)-[LK]-D-x(3)-[DN]-x(3)-[DNSG]-[FY]-x-[ES]-[FYVC]-x(2)-[LIVMFS]-[LIVMF].
NAME: Hemolysin-type calcium-binding region signature.
CONSENSUS: D-x-[LI]-x(4)-G-x-D-x-[LI]-x-G-G-x(3)-D.
NAME: HlyD family secretion proteins signature.
CONSENSUS: [LIVM]-x(2)-G-[LM]-x(3)-[STGAV]-x-[LIVMT]-x-[LIVMT]-[GE]-x-[KR]-x-[LIVMFYW](2)-x-[LIVMFYW](3).
NAME: P-II protein urydylation site.
CONSENSUS: Y-[KR]-G-[AS]-[AE]-Y.
NAME: P-II protein C-terminal region signature.
CONSENSUS: [ST]-x(3)-G-[DY]-G-[KR]-[IV]-[FW]-[LIVM]-x(2)-[LIVM].
NAME: 14-3-3 proteins signature 1.
CONSENSUS: R-N-L-[LIV]-S-[VG]-[GA]-Y-[KN]-N-[IVA].
NAME: 14-3-3 proteins signature 2.
CONSENSUS: Y-K-[DE]-S-T-L-I-[IM]-Q-L-[LF]-[RHC]-D-N-[LF]-T-[LS]-W-[TAN]-[SAD].
NAME: ATP1G1 / PLM / MAT8 family signature.
CONSENSUS: [DNS]-x-F-x-Y-D-x(2)-[ST]-[LIVM]-[RQ]-x(2)-G.
NAME: BTG1 family signature 1.
CONSENSUS: Y-x(2)-[HP]-W-[FY]-[AP]-E-x-P-x-K-G-x-[GA]-[FY]-R-C-[IV]-[RH]-[IV].
NAME: BTG1 family signature 2.
CONSENSUS: [LV]-P-x-[DE]-[LM]-[ST]-[LIVM]-W-[IV]-D-P-x-E-V-[SC]-x-[RQ]-x-G-E.
NAME: Cullin family signature.
CONSENSUS: [LIV]-K-x(2)-[LIV]-x(2)-L-I-[DEQ]-[KRHNQ]-x-Y-[LIVM]-x-R-x(6,7)-[FY]-x-Y-x-[SA]>.
NAME: Cullin family profile.
NAME: Enhancer of rudimentary signature.
CONSENSUS: Y-D-I-[SA]-x-L-[FY]-x-F-[IV]-D-x(3)-D-[LIV]-S.
NAME: G10 protein signature 1.
CONSENSUS: L-C-C-x-[KR]-C-x(4)-[DE]-x-N-x(4)-C-x-C-R-V-P.
NAME: G10 protein signature 2.
CONSENSUS: C-x-H-C-G-C-[KRH]-G-C-[SA].
NAME: Glucokinase regulatory protein family signature.
CONSENSUS: G-[PA]-E-x-[LIV]-[STA]-G-S-[ST]-R-[LIVM]-K-[STGA](3)-x(2)-K.
NAME: GTP1/OBG family signature.
CONSENSUS: D-[LIVM]-P-G-[LIVM](2)-[DEY]-[GN]-A-x(2)-G-x-G.
NAME: HIT family signature.
CONSENSUS: [NQA]-x(4)-[GAV]-x-[QF]-x-[LIVM]-x-H-[LIVMFYT]-H-[LIVMFT]-H-[LIVMF](2)-[PSGA].
NAME: Caseins alpha/beta signature.
CONSENSUS: C-L-[LV]-A-x-A-[LVF]-A.
NAME: Clathrin adaptor complexes medium chain signature 1.
CONSENSUS: [IVT]-[GSP]-W-R-x(2,3)-[GAD]-x(2)-[HY]-x(2)-N-x-[LIVMAFY](3)-D-[LIVM]-[LIVMT]-E.
NAME: Clathrin adaptor complexes medium chain signature 2.
CONSENSUS: [LIV]-x-F-I-P-P-x-G-x-[LIVMFY]-x-L-x(2)-Y.
NAME: Clathrin adaptor complexes small chain signature.
CONSENSUS: [LIVM](2)-Y-[KR]-x(4)-L-Y-F.
NAME: Ependymins signature 1.
CONSENSUS: F-E-E-G-x-[LIVMF]-Y-[ED]-I-D-x(2)-N-[QE]-S-C-[RKH](2).
NAME: Ependymins signature 2.
CONSENSUS: [QE]-[LIVMA]-F-x(2)-P-[STA]-[FY]-C-[DE]-[GA]-[LIVM]-x(2)-[DE](2).
NAME: Syntaxin / epimorphin family signature.
CONSENSUS: [RQ]-x(3)-[LIVMA]-x(2)-[LIVM]-[ESH]-x(2)-[LIVMT]-x-[DEVM]-[LIVM]-x(2)-[LIVM]-[FS]-x(2)-[LIVM]-x(3)-[LIVT]-x(2)-Q-[GADEQ]-x(2)-[LIVM]-[DNQT]-x-[LIVMF]-[DESV]-x(2)-[LIVM].
NAME: Extracellular proteins SCP/Tpx-1/Ag5/PR-1/Sc7 signature 1.
CONSENSUS: [GDER]-H-[FYWH]-T-Q-[LIVM](2)-W-x(2)-[STN].
NAME: Extracellular proteins SCP/Tpx-1/Ag5/PR-1/Sc7 signature 2.
CONSENSUS: [LIVMFYH]-[LIVMFY]-x-C-[NQRHS]-Y-x-[PARH]-x-[GL]-N-[LIVMFYWDN].
NAME: Fetuin family signature 1-
CONSENSUS: C-x (5b) -C-x (10 ) -C-x (13) -C-x (17ιlβ) -C-x (13) -C-x (2 )
C-x(5δ)-C-x(lDιlD-
CONSENSUS: C-x (10 i 12) -C-x (lb i 22) -C -
NAME: Fetuin family signature 2.
CONSENSUS: L-E-T-x-C-H-x-L-D-P-T-P.
NAME: Legume lectins beta-chain signature.
CONSENSUS: [LIV]-[STAG]-V-[DEQV]-[FLI]-D-[ST].
NAME: Legume lectins alpha-chain signature.
CONSENSUS: [LIV]-x-[EDQ]-[FYWKR]-V-x-[LIV]-G-[LF]-[ST].
NAME: Vertebrate galactoside-binding lectin signature.
CONSENSUS: W-[GEK]-x-[EQ]-x-[KRE]-x(3,6)-[PCTF]-[LIVMF]-[NQEGSKV]-x-[GH]-x(3)-[DENKHS]-[LIVMFC].
NAME: Lysosome-associated membrane glycoproteins duplicated domain signature.
CONSENSUS: [STA]-C-[LIVM]-[LIVMFYW]-A-x-[LIVMFYW]-x(3)-[LIVMFYW]-x(3)-Y.
NAME: LAMP glycoproteins transmembrane and cytoplasmic domain signature.
CONSENSUS: C-x(2)-D-x(3,4)-[LIVM](2)-P-[LIVM]-x-[LIVM]-G-x(2)-[LIVM]-x-G-[LIVM](2)-x-[LIVM](4)-A-[FY]-x-[LIVM]-x(2)-[KR]-[RH]-x(1,2)-[STAG](2)-Y-[EQ].
NAME: Glycophorin A signature.
CONSENSUS: I-I-x-[GAC]-V-M-A-G-[LIVM](2).
NAME: PMP-22 / EMP / MP20 family signature 1.
CONSENSUS: [LIVMF](4)-[SA]-T-x(2)-[DNKS]-x-W-x(9,13)-[LIV]-W-x(2)-C.
NAME: PMP-22 / EMP / MP20 family signature 2.
CONSENSUS: [RQ]-[AV]-x-M-[IV]-L-S-x-[LI]-x(4)-[GSA]-[LIVMF](3).
NAME: Oxysterol-binding protein family signature.
CONSENSUS: E-[KQ]-x-S-H-[HR]-P-P-x-[STACF]-A.
NAME: Yeast PIR proteins repeats signature.
CONSENSUS: S-Q-[IV]-[STGNH]-D-G-Q-[LIV]-Q-[AIV]-[STA].
NAME: Seminal vesicle protein I repeats signature.
CONSENSUS: [IVM]-x-G-Q-D-x-V-K-x(5)-[KN]-G-x(3)-[STLV].
NAME: Seminal vesicle protein II repeats signature.
CONSENSUS: [GSA]-Q-x-K-S-[FY]-x-Q-x-K-[SA].
NAME: Serum amyloid A proteins signature.
CONSENSUS: A-R-G-N-Y-[ED]-A-x-[QKR]-R-G-x-G-G-x-W-A.
NAME: Spermadhesins family signature 1.
CONSENSUS: C-G-x (2) -CLO-x ( 4) -G-x-I-x (1) -C-x-U-T .
NAME: Spermadhesins family signature 2.
CONSENSUS: C-x-K-E-x-[LIVM]-E-[LIVM]-x-[DE]-x(3)-[GS]-x(5)-K-x-C.
NAME: Stress-induced proteins SRP1/TIP1 family signature.
CONSENSUS: P-W-Y-[ST](2)-R-L.
NAME: Glypicans signature.
CONSENSUS: C-x(2)-C-x-G-[LIVM]-x(4)-P-C-x(2)-[FY]-C-x(2)-[LIVM]-x(2)-G-C.
NAME: Syndecans signature.
CONSENSUS: [FY]-R-[IM]-[KR]-K(2)-D-E-G-S-Y.
NAME: Tissue factor signature.
CONSENSUS: W-K-x-K-C-x(2)-T-x-[DEN]-T-E-C-D-[LIVM]-T-D-E.
NAME: Translationally controlled tumor protein signature 1.
CONSENSUS: [IA]-G-[GAS]-N-[PA]-S-A-E-[GDE]-[PAGE]-x(0,1)-[DEG]-x-[DEN]-x(2)-[DE].
NAME: Translationally controlled tumor protein signature 2.
CONSENSUS: [FL]-[FY]-[IVT]-G-E-x-[MA]-x(2,5)-[DEN]-[GAS]-x-[LV]-[AV]-x(3)-[FY]-[KR]-[DE].
NAME: Tub family signature 1.
CONSENSUS: F-[KHQ]-G-R-V-[ST]-x-A-S-V-K-N-F-Q.
NAME: Tub family signature 2.
CONSENSUS: A-F-[AG]-I-[SAC]-[LIVM]-[ST]-S-F-x-[GST]-K-x-A-C-E.
NAME: HCP repeats signature.
CONSENSUS: H-R-H-R-G-H-x(2)-[DE](7).
NAME: Bacterial ice-nucleation proteins octamer repeat.
CONSENSUS: A-G-Y-G-S-T-x-T.
NAME: Cell cycle proteins ftsW / rodA / spoVE signature.
CONSENSUS: [NV]-x(5)-[GTR]-[LIVMA]-x-P-[PTLIVM]-x-G-[LIVM]-x(3)-[LIVMFW](2)-S-[YSA]-G-G-[STN]-[SA].
NAME: Enterobacterial virulence outer membrane protein signature 1.
CONSENSUS: G-[LIVMFY]-N-[LIVM]-K-Y-R-Y-E.
NAME: Enterobacterial virulence outer membrane protein signature 2.
CONSENSUS: [FYW]-x(2)-G-x-G-Y-[KR]-F>.
NAME: Hydrogenases expression/synthesis hypA family signature.
CONSENSUS: F-[CSA]-[FY]-[DE]-[LIVA](2)-x(3)-[ST]-[LIVM]-x(16)-C-x(2)-C-x(12,15)-C-P-x-C.
NAME: Hydrogenases expression/synthesis hupF/hypC family signature.
CONSENSUS: <M-C-[LIV]-[GA]-[LIV]-P-x-[QKR]-[LIV].
NAME: Staphylocoagulase repeat signature.
CONSENSUS: A-R-P-x(3)-K-x-S-x-T-N-A-Y-N-V-T-T-x(2)-[DN]-G-x(3)-Y-G.
NAME: 11-S plant seed storage proteins signature.
CONSENSUS: N-G-x-[DE](2)-x-[LIVMF]-C-[ST]-x(11,12)-[PAG]-D.
NAME: Dehydrins signature 1.
CONSENSUS: S(5)-[DE]-x-[DE]-G-x(1,2)-G-x(0,1)-[KR]( ).
NAME: Dehydrins signature 2.
CONSENSUS: [KR]-[LIM]-K-[DE]-K-[LIM]-P-G.
NAME: Germin family signature.
CONSENSUS: G-x(4)-H-x-H-P-x-A-x-E-[LIVM].
NAME: Oleosins signature.
CONSENSUS: [AG]-[ST]-x(2)-[AG]-x(2)-[LIVM]-[SAD]-T-P-[LIVMF](4)-F-S-P-[LIVM](3)-P-A.
NAME: Small hydrophilic plant seed proteins signature.
CONSENSUS: G-[EQ]-T-V-V-P-G-G-T.
NAME: Pathogenesis-related proteins Bet v I family signature.
CONSENSUS: G-x(2)-[LIVMF]-x(4)-E-x(2)-[CSTAEN]-x(8,9)-[GND]-G-[GS]-[CS]-x(2)-K-x(4)-[FY].
NAME: Pollen proteins Ole e I family signature.
CONSENSUS: [EQ]-G-x-V-Y-C-D-T-C-R.
NAME: Thaumatin family signature.
CONSENSUS: G-x-[GF]-x-C-x-T-[GA]-D-C-x(1,2)-G-x(2,3)-C.
NAME: Mrp family signature.
CONSENSUS: W-x(2)-[LIVM]-D-[LIVMY](4)-D-x-P-P-G-T-[GS]-D.
NAME: Glucose inhibited division protein A family signature 1.
CONSENSUS: [GS]-P-x-Y-C-P-S-[LIVM]-E-x-K-[LIVM]-x-[KR]-F.
NAME: Glucose inhibited division protein A family signature 2.
CONSENSUS: A-G-Q-x-[NT]-G-x(2)-G-Y-x-E-[SAG](3)-[QS]-G-[LIVM](2)-A-G-[LIVMT]-N-A.
NAME: NOL1/NOP2/sun family signature.
CONSENSUS: [FV]-D-[KRA]-[LIVMA]-L-x-D-[AV]-P-C-[ST]-[GA].
NAME: PET112 family signature.
CONSENSUS: [DN]-x-[DN]-R-x(3)-P-L-[LIV]-E-[LIV]-x-[ST]-x-P.
NAME: Protein smpB signature.
CONSENSUS: [TA]-G-[LIVM]-x-L-x-G-x-E-[LIVM]-[KQ]-[SA]-[LIVM].
NAME: Hypothetical cof family signature 1.
CONSENSUS: [LIVFYAN]-[LIVMFA]-x(2)-D-[LIVMF]-[ND]-G-T-[LIV]-[LVY]-[STANLM].
NAME: Hypothetical cof family signature 2.
CONSENSUS: [LIVMFC]-G-D-[GSANQ]-x-N-D-x(3)-[LIMFY]-x(2)-[AV]-x(2)-[GSCP]-x(2)-[LMP]-x(2)-[GAS].
NAME: RIO1/ZK632.3/MJ0444 family signature.
CONSENSUS: [LIVM]-V-H-[GA]-D-L-S-E-[FY]-N-x-[LIVM].
NAME: SUA5/yciO/yrdC family signature.
CONSENSUS: [LIVMTA](3)-[LIVMFYC]-[PG]-T-[DE]-[STA]-x-[FY]-[GA]-[LIVM]-[GS].
NAME: Uncharacterized protein family UPF0001 signature.
CONSENSUS: [FW]-H-[FM]-[IV]-G-x-[LIV]-Q-x-[NKR]-K-x(3)-[LIV].
NAME: Uncharacterized protein family UPF0003 signature.
CONSENSUS: G-x-V-x(2)-[LIV]-x(3)-[SA]-x(6)-D-x(3)-[LIVT](3)-P-N-x(2)-[LIVMF](2)-x(5)-N.
NAME: Uncharacterized protein family UPF0004 signature.
CONSENSUS: [LIVM]-x-[LIVMT]-x(2)-G-C-x(3)-C-[STAN]-[FY]-C-x-[LIVM]-x(4)-G.
NAME: Uncharacterized protein family UPF0005 signature.
CONSENSUS: G-[LIVM](2)-[SA]-x(5,8)-G-x(2)-[LIVM]-G-P-x-L-
Figure imgf000480_0001
CONSENSUS: [LIVM](2)-x(2)-A-x(3)-T-A-[LIVM](2)-F.
NAME: Uncharacterized protein family UPF0006 signature 1.
CONSENSUS: [LIVMFY](2)-D-[STA]-H-x-H-[LIVMF]-[DN].
NAME: Uncharacterized protein family UPF0006 signature 2.
CONSENSUS: P-[LIVM]-x-[LIVM]-H-x-R-x-[TA]-x-[DE].
NAME: Uncharacterized protein family UPF0006 signature 3.
CONSENSUS: [LVSA]-[LIVA]-x(2)-[LIVM]-[PS]-x(3)-L-[LIVM]-[LIVMS]-E-T-D-x-P.
NAME: Uncharacterized protein family UPF0007 signature.
CONSENSUS: V-L-[IV]-H-D-[GA]-A-R.
NAME: Uncharacterized protein family UPF0011 signature.
CONSENSUS: S-D-A-G-x-P-x-[LIV]-[SN]-D-P-G.
NAME: Uncharacterized protein family UPF0012 signature.
CONSENSUS: [GTA]-x(2)-[IVT]-C-Y-D-[LIVM]-x-F-P-x(9)-G.
NAME: Uncharacterized protein family UPF0015 signature.
CONSENSUS: [DE]-[LIVMF](3)-R-T-[SG]-G-x(2)-R-x-S-x-[FY]-[LIVM](2)-W-Q.
NAME: Uncharacterized protein family UPF0016 signature.
CONSENSUS: E-[LIVM]-G-D-K-T-F-[LIVMF](2)-A.
NAME: Uncharacterized protein family UPF0017 signature.
CONSENSUS: D-x(8)-[GN]-[LFY]-x(4)-[DET]-[LY]-Y-x(3)-[ST]-x(7)-[IV]-x(2)-[PS]-x-[LIVM]-x-[LIVM]-x(3)-[DN]-D.
NAME: Uncharacterized protein family UPF0019 signature.
CONSENSUS: L-P-V-[VT]-[NQL]-F-[AT]-A-G-G-[LIV]-A-T-P-A-D-A-A-[LM].
NAME: Uncharacterized protein family UPF0020 signature.
CONSENSUS: D-P-[LIVMF]-C-G-[ST]-G-x(3)-[LI]-E.
NAME: Uncharacterized protein family UPF0021 signature.
CONSENSUS: C-K-x(2)-F-x(4)-E-x(22,23)-S-G-G-K-D.
NAME: Uncharacterized protein family UPF0023 signature.
CONSENSUS: D-x-D-E-[LIV]-L-x(4)-V-F-x(3)-S-K-G.
NAME: Uncharacterized protein family UPF0024 signature.
CONSENSUS: G-x-K-D-[KR]-x-A-[LV]-T-x-Q-x-[LIVF]-[SGC].
NAME: Uncharacterized protein family UPF0025 signature.
CONSENSUS: D-V-[LIV]-x(2)-G-H-[ST]-H-x(12)-[LIVMF]-N-P-G.
NAME: Uncharacterized protein family UPF0027 signature.
CONSENSUS: Q-[LIVM]-x-N-x-A-x-[LIVM]-P-x-I-x(6)-[LIVM]-P-D-x-H-x-G-x-G-x(2)-[IV]-G.
NAME: Uncharacterized protein family UPF0028 signature.
CONSENSUS: [GA]-[GS]-G-[GA]-A-R-G-x-[SA]-H-x-G-x(9)-[IV]-x-[IV]-D-x(2)-[GA]-G-x-S-x-G.
NAME: Uncharacterized protein family UPF0029 signature.
CONSENSUS: G-x(2)-[LIVM](2)-x(2)-[LIVM]-x(4)-[LIVM]-x(5)-[LIVM](2)-x-R-[FYW](2)-G-G-x(2)-[LIVM]-G.
NAME: Uncharacterized protein family UPF0030 signature.
CONSENSUS: [GA]-L-I-[LIV]-P-G-G-E-S-T-[STA].
NAME: Uncharacterized protein family UPF0031 signature 1.
CONSENSUS: [SAV]-[IVW]-[LVA]-[LIV]-G-[PNS]-G-L-[GP]-x-[DENQT].
NAME: Uncharacterized protein family UPF0031 signature 2.
CONSENSUS: [GA]-G-x-G-D-[TV]-[LT]-[STA]-G-x-[LIVM].
NAME: Uncharacterized protein family UPF0032 signature.
CONSENSUS: Y-x(2)-F-[LIVMA](2)-x-L-x(4)-G-x(2)-F-[EQ]-[LIVMF]-P-[LIVM].
NAME: Uncharacterized protein family UPF0033 signature.
CONSENSUS: L-[DN]-x(2)-[TAG]-x(2)-C-P-x-P-x-[LIVM].
NAME: Uncharacterized protein family UPF0034 signature.
CONSENSUS: [LIVM]-[DNG]-[LIVM]-N-x-G-C-P-x(3)-[LIVMASQ]-x(5)-G-[SAC].
NAME: Uncharacterized protein family UPF0035 signature.
CONSENSUS: L-L-T-x-R-[SA]-x(3)-R-x(3)-G-x(3)-F-P-G-G.
NAME: Uncharacterized protein family UPF0036 signature.
CONSENSUS: H-x-S-G-H-[GA]-x(3)-[DE]-x(3)-[LM]-x(5)-P-x(3)-[LIVM]-P-x-H-G-[DE].
NAME: Uncharacterized protein family UPF0038 signature.
CONSENSUS: G-x-[LI]-x-R-x(2)-L-x(4)-F-x(8)-[LIV]-x(5)-P-x-[LIV].
NAME: Uncharacterized protein family UPF0044 signature.
CONSENSUS: L-[ST]-x(3)-K-x(3)-[KR]-[SGA]-x-[GA]-H-x-L-x-P-[LIV]-x(2)-[LIV]-[GA]-x(2)-G.
NAME: Uncharacterized protein family UPF0047 signature.
CONSENSUS: S-x(2)-[LIV]-x-[LIV]-x(2)-G-x(4)-G-T-W-Q-x-[LIV].
NAME: Uncharacterized protein family UPF0054 signature.
CONSENSUS: H-[GS]-x-L-H-L-[LI]-G-[FYW]-D-H.
NAME: Uncharacterized protein family UPF0057 signature.
CONSENSUS: [LIV]-x-[STA]-[LIVF](3)-P-P-[LIVA]-[GA]-[IV]-x(4)-[GKN].
NAME: Hypothetical YER057c/yjjV family signature.
CONSENSUS: P-[AT]-R-[SA]-x-[LIVMY]-x(2)-[AK]-x-L-P-x(4)-[LIVM]-E.
NAME: Hypothetical hesB/yadR/yfhF family signature.
CONSENSUS: F-x-[LIVMFY]-x-N-[PG]-[NSK]-x(4)-C-x-C-[GS]-x-S-F.
NAME: Hypothetical yabO/yceC/sfhB family signature.
CONSENSUS: [NHY]-R-[LI]-D-x(2)-T-[ST]-G-[LIVMA]-[LIVMF](2)-[LIVMFG]-[SGAC].
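The consensus entries above use a PROSITE-style notation: elements separated by hyphens, "x" for any residue, square brackets for allowed residues, braces for excluded residues, parenthesized repeat counts, and "<" or ">" for terminal anchors. As an illustration only, the following Python sketch translates such a pattern into a regular expression and scans a protein sequence with it. The function names are hypothetical and are not part of the disclosure, and the notation rules are taken from the general convention rather than from this document.

```python
import re


def _element_to_regex(element: str) -> str:
    """Convert one pattern element, e.g. 'x(2)', '[LIVM]', '{P}' or 'C'."""
    match = re.fullmatch(r"(.+?)(?:\((\d+)(?:,(\d+))?\))?", element)
    core, low, high = match.group(1), match.group(2), match.group(3)
    if core == "x":
        out = "."                                # any residue
    elif core.startswith("[") and core.endswith("]"):
        members = core[1:-1]
        if ">" in members:                       # e.g. [G>]: residue or C-terminus
            out = "(?:[" + members.replace(">", "") + "]|$)"
        else:
            out = "[" + members + "]"
    elif core.startswith("{") and core.endswith("}"):
        out = "[^" + core[1:-1] + "]"            # excluded residues
    elif len(core) == 1 and core.isalpha() and core.isupper():
        out = core                               # single fixed residue
    else:
        raise ValueError(f"unrecognized pattern element: {element!r}")
    if low is not None:
        out += "{" + low + ("," + high if high else "") + "}"
    return out


def prosite_to_regex(pattern: str) -> str:
    """Translate a whole consensus pattern into a Python regular expression."""
    body = pattern.strip().rstrip(".")
    parts = []
    for element in body.split("-"):
        element = element.strip()
        if not element:
            continue
        prefix = suffix = ""
        if element.startswith("<"):              # N-terminal anchor
            prefix, element = "^", element[1:]
        if element.endswith(">"):                # C-terminal anchor
            suffix, element = "$", element[:-1]
        parts.append(prefix + _element_to_regex(element) + suffix)
    return "".join(parts)


def find_signature(pattern: str, sequence: str):
    """Return the first matching segment of the sequence, or None."""
    hit = re.search(prosite_to_regex(pattern), sequence.upper())
    return hit.group(0) if hit else None


INSULIN = "C-C-{P}-x(2)-C-[STDNEKPI]-x(3)-[LIVMFS]-x(3)-C."
# The human insulin A chain is used here purely as a test sequence.
print(find_signature(INSULIN, "GIVEQCCTSICSLYQLENYCN"))
```

A regular expression reports only exact matches to the consensus; the entries listed as profiles have no consensus string and are outside the scope of this sketch.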
Deposit of Clones
Each clone has been transfected into separate bacterial cells (E. coli) in the composite deposit.
The clones are held at, and publicly available from, the Resource Center of the German Human Genome Project (Heubner Weg 6, 14059 Berlin, GERMANY), from which each clone comprising a particular polynucleotide is obtainable. The Resource Center library numbers are slightly different from those presented here, but may be readily obtained by the following key or with the assistance of Resource Center personnel.
The library name becomes a number: brain (hfbr2) becomes 564, kidney (hfkd2) becomes 566, mammary carcinoma (hmcf1) becomes 727, testis (htes3) becomes 434, amygdala (hamy2) becomes 761, melanoma (hmel2) becomes 762 and uterus (hute1) becomes 586. Next, the plate number is converted to two digits (e.g., "2" becomes "02") and is moved behind the plate coordinate, and the underscore is dropped. The following examples are helpful:
Listed Number          Resource Center Number
DKFZphamy2_10h17       DKFZp761H1710
DKFZphfbr2_78i21       DKFZp564I2178
DKFZphfkd2_3k1         DKFZp566K013
DKFZphmcf1_1c23        DKFZp727C231
DKFZphmel2_12j1        DKFZp762J0112
DKFZphtes3_16b5        DKFZp434B0516
DKFZphute1_17k7        DKFZp586K0717

The libraries were constructed using two commercially available vectors. The brain (hfbr2 designations) and kidney (hfkd2 designations) libraries utilize pAMP1 from Life Technologies and are maintained in XL-2Blue (Stratagene); the amygdala (hamy2), testes (htes3) and melanoma (hmel2) libraries are constructed in pSPORT1, also from Life Technologies, and are maintained in DH10B (Life Technologies). In addition to the following techniques, consultation with the commercial literature available on these clones will make evident all of the housekeeping techniques needed to propagate and isolate the individual constructs. All inserts may be excised with a NotI/SalI digestion. Alternatively, universal primers, flanking the cloning region, may be used to amplify the inserts using PCR methods. Bacterial cells containing a particular clone can be obtained from the composite deposit as follows:
An oligonucleotide probe or probes should be designed to the sequence that is known for that particular clone. This sequence can be derived from the sequences provided herein, or from a combination of those sequences. Methods of probe design are presented below.
Oligonucleotide probes may be labeled with γ-32P ATP (specific activity 6000 Ci/mmole) and T4 polynucleotide kinase using commonly employed techniques for labeling oligonucleotides. Other, non-radioactive labeling techniques can also be used. Unincorporated label typically is removed by gel filtration chromatography or other established methods. The amount of radioactivity incorporated into the probe can be quantified by measurement in a scintillation counter. Preferably, the specific activity of the resulting probe should be approximately 4 x 10^6 dpm/pmole.
The bacterial culture containing the pool of full-length clones should preferably be thawed and 100 µl of the stock used to inoculate a sterile culture flask containing 25 ml of sterile L-broth containing ampicillin at 50-100 µg/ml (for XL-2Blue strains, 25 µg/ml tetracycline should also be used). The culture should preferably be grown to saturation at 37°C, and the saturated culture should preferably be diluted in fresh L-broth. Aliquots of these dilutions should preferably be plated to determine the dilution and volume which will yield approximately 5000 distinct and well-separated colonies on solid bacteriological media containing L-broth, ampicillin at 100 µg/ml (for XL-2Blue strains, 25 µg/ml tetracycline should also be used) and agar at 1.5% in a 150 mm petri dish when grown overnight at 37°C. Other known methods of obtaining distinct, well-separated colonies can also be employed.
Standard colony hybridization procedures should then be used to transfer the colonies to nitrocellulose filters and to lyse, denature and bake them. The filter is then preferably incubated at 65°C for 1 hour with gentle agitation in 6 x SSC (20 x stock is 175.3 g NaCl/liter, 88.2 g Na citrate/liter, adjusted to pH 7.0 with NaOH) containing 0.5% SDS, 100 µg/ml of yeast RNA, and 10 mM EDTA (approximately 10 mL per 150 mm filter). Preferably, the probe is then added to the hybridization mix at a concentration greater than or equal to 1 x 10^6 dpm/mL. The filter is then preferably incubated at 65°C with gentle agitation overnight. The filter is then preferably washed in 500 mL of 2 x SSC/0.5% SDS at room temperature without agitation, preferably followed by 500 mL of 2 x SSC/0.1% SDS at room temperature with gentle shaking for 15 minutes. A third wash with 0.1 x SSC/0.5% SDS at 65°C for 30 minutes to 1 hour is optional. The filter is then preferably dried and subjected to autoradiography for sufficient time to visualize the positives on the X-ray film. Other known hybridization methods can also be employed.
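The quantities in the hybridization step follow from simple dilution arithmetic. The Python sketch below is illustrative only: it computes how much 20 x SSC stock is needed for each working solution, and how much labeled probe is needed to reach the stated minimum of 1 x 10^6 dpm/mL in about 10 mL of hybridization mix at a specific activity of about 4 x 10^6 dpm/pmole. The function names are hypothetical, and the 10% (w/v) SDS stock used in the usage lines is an assumption that does not come from the text.

```python
def stock_volume_ml(final_conc: float, final_volume_ml: float, stock_conc: float) -> float:
    """Volume of stock needed, from C1 * V1 = C2 * V2 (same units for both concentrations)."""
    if final_conc > stock_conc:
        raise ValueError("the final concentration cannot exceed the stock concentration")
    return final_volume_ml * final_conc / stock_conc


def probe_pmol_needed(volume_ml: float, dpm_per_ml: float, dpm_per_pmol: float) -> float:
    """Picomoles of labeled probe needed to reach a target activity in a given volume."""
    return volume_ml * dpm_per_ml / dpm_per_pmol


# Prehybridization mix: 6 x SSC in about 10 mL, from 20 x stock.
print(stock_volume_ml(6, 10, 20))        # mL of 20 x SSC
# First wash: 2 x SSC / 0.5% SDS in 500 mL (SDS stock assumed to be 10% w/v).
print(stock_volume_ml(2, 500, 20))       # mL of 20 x SSC
print(stock_volume_ml(0.5, 500, 10))     # mL of 10% SDS
# Probe: at least 1e6 dpm/mL in 10 mL, at about 4e6 dpm/pmole.
print(probe_pmol_needed(10, 1e6, 4e6))   # pmol of probe
```

The remainder of each volume is made up with water and the other listed components; the sketch does not account for the volume those components contribute.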
The positive colonies are picked, grown in culture, and plasmid DNA isolated using standard procedures. The clones can then be verified by restriction analysis, hybridization analysis, or DNA sequencing.
Alternatively, clones may be grown as described above, and PCR used to isolate the insert DNAs. Methods of PCR are described below and are otherwise well known.
ERROR SCREENING
The DNA sequences found herein derive from individual clones, which are publicly available, as noted above. Thus, the skilled artisan will recognize that any specific sequence disclosed herein readily can be screened for errors by resequencing a particular fragment, in both directions (i.e., by sequencing both strands). Alternatively, error screening can be performed by amplifying and/or cloning any of the inventive DNAs, using for example RT-PCR, and sequencing the resulting amplified product. In the event that there is a sequencing error, reference should be made to the deposited clone as the correct sequence.
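Comparing reads from both strands, as described above, can be sketched in a few lines. The example assumes that the two reads have already been trimmed to cover exactly the same fragment; real resequencing data would first need alignment, and the function names here are illustrative.

```python
_COMPLEMENT = str.maketrans("ACGTNacgtn", "TGCANtgcan")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence (A, C, G, T, N)."""
    return seq.translate(_COMPLEMENT)[::-1]


def strand_disagreements(forward_read, reverse_read):
    """Return 0-based positions (in forward-strand coordinates) at which the
    forward read and the reverse-complemented reverse read disagree."""
    converted = reverse_complement(reverse_read).upper()
    forward = forward_read.upper()
    if len(converted) != len(forward):
        raise ValueError("reads must cover the same fragment (equal length)")
    return [i for i, (a, b) in enumerate(zip(forward, converted)) if a != b]


# Positions reported here are candidates for a sequencing error and would be
# resolved against the deposited clone.
print(strand_disagreements("ATGGCCTA", "TAGGACAT"))
```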
USES AND BIOLOGICAL ACTIVITIES OF THE INVENTIVE MOLECULES
The inventive molecules and their derivatives are susceptible to a wide variety of uses, based on functional and/or structural properties. The skilled worker will appreciate, based on the biological activities detailed below, and discussed with regard to the individual sequences herein, that the inventive molecules will find usefulness in numerous therapeutic and diagnostic applications.
The DNA molecules, especially the potassium salts thereof, can be used as fertilizer supplements due to their high nitrogen and phosphorus contents. Since the DNAs are of defined length, they are also useful in gel electrophoresis as molecular weight markers. Due to their similarity with known molecules, certain of the DNA molecules and their variants and derivatives may be used in any number of different diagnostic procedures and therapeutic applications. They may also be used to make the encoded proteins.
The proteins themselves have many possible uses. They may be used as a nutritional supplement for humans, animals and even for laboratory use as, for example, medium for bacterial cultures. Moreover, since the proteins are of defined, known sizes, they may be used as molecular weight markers for gel electrophoresis and gel filtration. Because they are of defined sequences, they also have use in microsequencing and protein fingerprinting applications.
Expression Profiling Applications
Given their known tissue expression and functional associations, assemblages of the inventive proteins (or corresponding antibodies) and nucleic acids are particularly suited to expression profiling applications. Expression profiling generally entails constructing an array of indicators that signal the presence of a particular RNA or protein expression product. Such arrays can be used to evaluate, for example, pharmacological effectiveness and toxicity. In particular, expression profiles from such arrays can be generated from cells treated with known compounds, having known properties, and these profiles can be compared to profiles of unknowns to evaluate similarities and differences, which can be correlated with efficacy or toxicity.
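One simple way to compare the profile of an unknown with profiles of known compounds, as described above, is to rank the known profiles by their correlation with the unknown. The sketch below uses the Pearson correlation coefficient; this choice of similarity measure, the function names and the example values are assumptions made for illustration and are not prescribed by the specification.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length profiles."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("profiles must have equal length of at least 2")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    dx = [a - mean_x for a in x]
    dy = [b - mean_y for b in y]
    denom = sqrt(sum(a * a for a in dx) * sum(b * b for b in dy))
    if denom == 0:
        raise ValueError("a constant profile has no defined correlation")
    return sum(a * b for a, b in zip(dx, dy)) / denom


def rank_reference_profiles(unknown, references):
    """Return (name, correlation) pairs sorted from most to least similar.
    `references` maps a compound name to its profile; each profile lists the
    signal at the same array positions, in the same order, as `unknown`."""
    scores = [(name, pearson(unknown, profile)) for name, profile in references.items()]
    return sorted(scores, key=lambda item: item[1], reverse=True)


# Hypothetical four-position profiles for two reference compounds.
references = {
    "known_compound_A": [1.0, 2.0, 3.0, 4.0],
    "known_compound_B": [4.0, 3.0, 2.0, 1.0],
}
for name, r in rank_reference_profiles([2.0, 4.0, 6.0, 8.0], references):
    print(f"{name}: r = {r:+.2f}")
```

A high correlation with a compound of known toxicity or efficacy would only suggest a similar property for the unknown; it does not establish it.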
Additional uses of profiling include diagnosis, tracking development, and ascertaining signaling and metabolic pathways. For examples of references describing profiling and its uses, see Farr et al., U.S. Patent 5,811,231 (1998), Seilhamer et al., U.S. Patent 5,840,484 (1998), Rine et al., U.S. Patent No. 5,777,888 (1998), WO 97/27317, WO 99/05323, WO 99/09218, and WO 99/14369. For a device for implementing such techniques, see Lipshutz et al., U.S. Patent No. 5,856,174 (1999) and Anderson et al., U.S. Patent No. 5,922,591 (1999).
In one embodiment, a subset of the inventive DNAs will be arrayed on a substrate, like a gene chip, a filter or a 96-well plate. Test samples containing cells are maintained in the presence of a label capable of incorporation into nascent mRNA. Samples are treated with test and control compounds, which will induce mRNA expression in the sample, resulting in incorporation of label. Whole mRNA is isolated and applied to the array such that it hybridizes with the DNAs contained therein. After washing, the amount of hybridization is quantified and a profile is generated. These steps are repeated with various control and test compounds, thereby generating a library of profiles, which can be used to ascertain the relationships relevant to pharmacological efficacy or toxicity.
The matrices used in such profiling, however, need not be limited to those utilizing DNAs. Rather, other nucleic acids, like RNAs and peptide nucleic acids (PNAs), as well as the inventive proteins and antibodies corresponding to the inventive proteins may also be employed. Hence, for example, antibodies could form the array and the samples could be treated in order to label nascent proteins. Whole proteins then would be isolated and applied to the antibody matrix. Developing the resulting signal would result in a protein expression profile, which is useful in essentially the same manner as the nucleic acid profile. A protein matrix could be used, for example, in evaluating antibody responses to pharmaceutical agents in order to eliminate possible cross-reactivity.
Moreover, where nucleic acids are used in the matrix, it is often beneficial to use variants (as defined below) of the molecules described herein. This can be used to account for genetic variations that are of little or no consequence to the function of the resultant gene product. Hence, they can account for wobble or conservative amino acid variations that do not perturb function, like variations in some of the protein motifs elucidated below. Thus, each position in the matrix can employ multiple nucleic acid probes that account for a series of variants.
Expression profiling may also be done, in another embodiment, using two-dimensional protein gels in which the inventive proteins are detected. The resultant profiles can be used in the same way as described.
Matrices useful for profiling may be constructed based on different criteria. Of course, the more relevant profiles will take into account expression of most human genes, preferably all of them. In certain situations, however, it is advantageous to look at a smaller subset. For example, if one were concerned about fetal neural toxicity, a fetal brain-specific matrix might be chosen. On the other hand, if one were interested in targeting mammary carcinoma tissue, a corresponding matrix could be used. Thus, matrices may be constructed using all of the sequences available from a tissue-specific library.
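The use of multiple probes per matrix position to account for wobble can be illustrated by enumerating the synonymous coding variants of a short in-frame probe. The sketch below uses the standard genetic code; the function name and the example sequences are hypothetical, and a practical design would also limit the number of variants, which grows multiplicatively with probe length.

```python
from itertools import product

# Standard genetic code, codons ordered with bases T, C, A, G at each position.
_BASES = "TCAG"
_AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: _AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(_BASES)
    for j, b in enumerate(_BASES)
    for k, c in enumerate(_BASES)
}


def synonymous_probes(coding_seq):
    """Return every DNA sequence encoding the same peptide as `coding_seq`,
    which must be in frame and a multiple of three bases long."""
    seq = coding_seq.upper()
    if len(seq) % 3 != 0:
        raise ValueError("sequence length must be a multiple of 3")
    options = []
    for start in range(0, len(seq), 3):
        codon = seq[start:start + 3]
        if codon not in CODON_TABLE:
            raise ValueError(f"unrecognized codon: {codon}")
        amino_acid = CODON_TABLE[codon]
        # All codons that translate to the same amino acid, in sorted order.
        options.append(sorted(c for c, aa in CODON_TABLE.items() if aa == amino_acid))
    return ["".join(combo) for combo in product(*options)]


# Met-Lys has two synonymous encodings because lysine has two codons.
print(synonymous_probes("ATGAAA"))
```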
* * *
The following discussion relates to some of the various functional and structural groupings that would be of interest to the artisan wishing to construct profiling matrices. Of course, the artisan will also recognize that these functional descriptions may find additional applicability in the therapeutic and diagnostic applications discussed below.
Cell Cycle
A proliferating cell must coordinate replication and chromosomal separation to ensure that the genome is replicated completely, and that a single copy is correctly inherited by each daughter cell. The cell cycle is the coordinated series of events that achieves these aims. Many of the key events are initiated by a family of conserved serine/threonine protein kinases, the cyclin-dependent kinases (CDKs), that are activated by the cyclin family of proteins (cyclins A-H). In turn, the cyclin-CDK complexes are modulated by other protein kinases or phosphatases, and by binding specific inhibitor proteins. The enormous variety of ways in which CDK activity can be regulated allows the cell to respond to internal signals generated by preceding events in the cell cycle and to external growth signals.
The somatic cell cycle is divided into four phases: DNA replication (S phase) and chromosome separation (M phase) are separated by gap phases (G1 and G2). At specific control points the decision to begin the next stage (DNA synthesis or mitosis) is carefully regulated.
Cdc2, the primary kinase, is especially required for the G1-S transition and S phase. Cdc4 and Cdc6 are involved at the restriction point, where the cell can decide to proliferate or arrest (G1->G0), and Cdc7 is a CDK activating kinase (CAK) as well as a subunit of TFIIH.
The cyclin-CDK complexes are regulated in various ways. One is through phosphorylation by CDK activating kinases (CAK), like the Y15 kinase (Wee1), and dephosphorylation by CDK associated phosphatases (CAP), like Cdc25A, a member of the Cdc25 family.
Another way of regulation occurs through two classes of CDK inhibitors (CKI): first, the INK4 proteins p15, p16, p18, and p19, which negatively regulate the cyclin D-CDK complexes, and second, the p21 family with p21, p27, and p57.
The cell cycle is also regulated through ubiquitin-mediated proteolysis involving the destruction of both cyclins and CDK inhibitors by the 26S proteasome, which requires a ubiquitin conjugating enzyme (UBC) and a ubiquitin ligase. The instability is conferred by PEST regions (cyclin D and E) or a ten amino acid region in the amino terminus (degradation box) in the A- and B-type cyclins.
All these modifications play an important role in cellular localization, because only the nuclear CDK-cyclin complexes are functional for the cell cycle. During G1 phase of the cell cycle, cyclins A, E and D are synthesized and bind to their cyclin-dependent kinase (CDK) partners. CDK complexes containing cyclins A, E and D1 are then imported into and concentrated within nuclei. Cdk6-cyclin D3 has been localized to both cytoplasmic and nuclear compartments, although only the nuclear complex is active. As cells enter S phase, cyclin A and cyclin E complexes remain within the nucleus, whereas cyclin D1 relocalizes to the cytoplasm for proteolysis at the onset of S phase. Like Cdk2-cyclin A, Cdc2-cyclin A is nuclear and remains so until it is degraded during mitosis. By contrast, as a result of ongoing nuclear import and more rapid re-export, cyclin B1, which binds to Cdc2 upon synthesis during S phase, is predominantly cytoplasmic. Cdc2-cyclin B2 is also cytoplasmic, although this might occur through anchoring of the complex to some cytoplasmic constituent. At prophase, phosphorylation of cyclin B1 promotes accumulation of Cdc2-cyclin B1 in the nucleus, whereas cyclin B2 remains in the cytoplasm until nuclear envelope breakdown.
Two crucial regulators of Cdc2-cyclin B, Wee1 and Cdc25C, are responsible for the G2 to M control point. Wee1 is a nuclear protein throughout the cell cycle, whereas Cdc25C binds to 14-3-3 proteins during interphase and remains predominantly cytoplasmic. In some systems Cdc25C, like cyclin B1, rushes precipitously into the nucleus just before entry into mitosis.
The 110-kDa retinoblastoma (tumor suppressor) protein (RB), a pRB-family member, is an important regulator of cell-cycle progression and differentiation. Like the E2F family (E2F1-5) or DP family (DP1-3) of transcription activators, RB suppresses inappropriate proliferation by arresting cells in G1 by repressing the transcription of genes required for the transition into S phase. Before the cell proceeds into S phase, RB becomes phosphorylated at multiple sites by the cyclin dependent protein kinases (CDKs) and loses its transcriptional repressing activity. Phosphorylation of RB during late G1 phase results in the dissociation of the E2F-RB repressor complex, which allows S-phase specific genes to be transcribed. Cyclin E is the evolutionarily conserved target for E2F and interacts together with CDC2 in late G1.
For a proliferating cell it is vital that only undamaged DNA is replicated, because if DNA damage is substantial, its replication can lead to chromosome loss or rearrangement. Thus, we find a G1->S checkpoint in late G1 that requires tumor suppressor p53. A p53-dependent G1 arrest is effected by the cyclin dependent kinase inhibitor p21, whose higher expression levels inhibit almost all cyclin-CDK complexes.
The kinase responsible for phosphorylating the unidentified kinetochore component in metaphase may be a member of the MAP kinase family and appears to be the proto-oncogene c-MOS, a cytostatic factor (CSF) in meiosis.
Several categories of proteins are coded for by clones of the invention within the overall group of "Cell cycle" and include, among others, the following:
PA26-T2 protein: PA26-T2 is a p53 responsive gene. The protein is predominantly expressed in brain, breast and kidney and represents a novel regulator of cellular growth. Isoforms are differentially induced by genotoxic stress (UV, gamma-irradiation and cytotoxic drugs) in a p53-dependent manner. The p53 tumor antigen is found in increased amounts in a wide variety of transformed cells. The protein is also detectable in many actively proliferating, nontransformed cells, but it is undetectable or present at low levels in resting cells. P53 is postulated to bind as a tetramer to a p53-binding site (PBS) and to activate the expression of adjacent genes that inhibit growth and/or invasion. Deletion or inactivation of one or both p53 alleles reduces the expression of tetramers, resulting in decreased expression of the growth inhibitory genes. This mechanism is found in tumors of several types (OMIM *191170). Clones in this category include: amy2_121m2
Cell structure and motility
One of the major differences between prokaryotes and eukaryotes is the ability of the eukaryotic cell to adopt very different shapes dependent on its function during the differentiation process. Animal cells vary from being round to extended cylindrical forms like motor neurons or muscle cells. In humans, more than 100 different cell types can be distinguished, each having a characteristic shape. The form of a cell often is closely related to its capacity to move. Some completely differentiated cells like fibroblasts can still change their form actively, thereby migrating. Other cell types serve as motor elements, "macroscopically" like muscle cells or "microscopically" like ciliated epithelia. Such tasks are fulfilled by a big class of proteins, on the one hand responsible for maintenance of cell structure and contacting neighbor cells or the intercellular matrix and on the other hand for cell motility. These topics cannot be regarded separately: the motility apparatus, e.g., must be fixed in the cytoskeleton. Three different types of filaments can be distinguished: actin filaments, tubulin filaments and intermediate filaments, each present in almost all types of cells.
Actin filaments (F-actin) are built up of monomers (G-actin). In muscle cells, actin and myosin, for both of which several paralogous genes are known, as well as many more proteins, are constituents of the contractile apparatus.
The "thin" and "thick filaments" in a muscle cell consist mainly of actin and myosin, respectively.
Several different proteins are responsible for the anchoring of the actin filaments in the Z-disks (e.g. alpha-actinin and desmin) or at the end of the myofibers in the cell membrane.
Troponin I, -C, -T and tropomyosin, associated with actin, confer the Ca++-dependent triggering of contraction. Length of the sarcomere is controlled by the giant protein titin.
In smooth muscle, there is no troponin. Contraction activity is controlled by phosphorylation / dephosphorylation of myosin by a specialized kinase instead. Contractile fibers are not organized in sarcomeres.
Apart from contributing to muscle contraction, the actomyosin system is responsible for many other motions at cellular level, e.g. the amoeboid movement of pseudopodia or the fission of cells at the end of mitosis by a contractile ring.
Besides this, actin fibers fulfill structural tasks like maintenance of the shape of stereocilia or microvilli. Here, actin filaments are connected by proteins like fimbrin. But not only specialized structures like the ones mentioned contain actin fibers. There is a network covering the complete cell volume with F-actin as a major constituent. Whereas the actin filaments in the structures mentioned above are relatively stable, this F-actin is highly dynamic. Management of the network structure and turnover is achieved by connecting proteins like alpha-actinin, fimbrin or filamin; turnover is regulated by gelsolin, villin, and different capping and fragmentation proteins.
Microtubules are built up of alpha-beta tubulin heterodimers. Turnover of filaments is achieved by building-in and releasing of monomers at different rates at both ends. The resulting cycle is called "treadmilling". Thirteen strings of tubulin duplets build up one subfiber, whereas one fiber contains two or three of those. A complete axoneme consists of 9 radial and 2 central fibers. This "9+2" structure is the basis both of flagella, their basal bodies and centrioles. In flagella, several additional structures like radial elements exist. Nexin connects the fibers and dynein is the motor ATPase which shifts the fibers relative to each other. Several genetic diseases like the Kartagener syndrome are caused by deficiencies of distinct proteins in cilia.
Besides this, microtubules are abundant in all types of cells. They are part of a delivery system for organelles, e.g. in the Golgi apparatus. A further very important system based on microtubules is the mitotic spindle; it is organized by the centrosomes. Besides many other components, the major part of a centrosome consists of two centrioles, which are built up of nine microtubule triplets. Most remarkably, new centrioles are not synthesized de novo but generated by duplication of old ones.
Cytoplasmic microtubules are associated with many different proteins. Two major classes are known: the MAPs ("microtubule-associated proteins", with molecular masses between 200 and 300 kD) and the much smaller tau proteins with a MW between 60 and 70 kD. These proteins regulate the treadmill process and the interaction with other structures in the cell.
Besides actin and myosin, the so-called intermediate filaments constitute a third class of filaments. In contrast to the former two groups, they do not participate in motility, nor are they dynamic structures subject to a vivid turnover. The most important ones are neurofilaments (in neurons), keratin filaments (mainly in epithelial cells), and vimentin filaments (in many different cell types).
The biological function of both the cytoskeleton and the contractile apparatus of a cell does not end at the cell membrane. Cells must be embedded in the extracellular matrix, all cells of a muscle must act as one single mechanical unit and epithelia must resist macroscopic mechanical forces. Hence, cell adhesion and the extracellular matrix are closely connected to the cytoskeleton. Vinculin is one of the proteins which serve as an anchor for intracellular fibers (actin). Different types of desmosomes and tight junctions connect neighbor cells with intercellular fibers. On the inside, cytoplasmic plaques connect them to the cytoskeleton. These structures, on the one hand, serve as mechanical elements, whereas gap junctions, on the other hand, connect cells metabolically.
The extracellular matrix consists of a network of proteins, glycoproteins and polysaccharides. Different proteins are present in relation to different mechanical demands. Elastin is found in tissues with high elasticity (lungs, heart), whereas collagen, a more hard-wearing protein, is found in tendons and ligaments. Fibronectin is an extracellular protein highly important for cell adhesion.
Reference: Murray J et al (1992): Cell Motil Cytoskeleton 22: 211-223.
Within the overall group of Cell Structure and Motility several categories of proteins are coded for by clones of the invention:
Ankyrins: Ankyrins are peripheral membrane proteins which interconnect integral proteins with the spectrin-based membrane skeleton. Thus these proteins are involved in coupling of cytoskeleton and cell membrane. OMIM reports that ankyrins have associations (as potentially diagnostic, therapeutic, causative, and/or related, etc.) with the following diseases: 1) Hereditary Spherocytosis (OMIM *182900), 2) Hemolytic Poikilocytic Anemia due to reduced ankyrin binding sites (OMIM 141700), 3) Atypical Elliptocytosis (OMIM 225450), 4) Autosomal recessive spherocytosis (OMIM #270970), 5) Werner Syndrome (OMIM *277700), and 6) Rhesus-unlinked type Elliptocytosis (OMIM #130600). Ankyrin binding glycoproteins mediate ankyrin effects, especially in neuronal adhesion and prostate tumour cell transformation. Clones in this category include: amy2_121fl1.
Tropomyosins are ubiquitous proteins of 35 to 45 kD associated with the actin filaments of myofibrils and stress fibers. They are involved in cardiomyopathies (OMIM *191030, *191010, *190990, *600317). Clones in this category include: tes3_lbb5.
Differentiation/Development
Almost every multicellular organism originates from meiotic cell divisions and the recombination of a paternal and a maternal set of chromosomes. After fertilization of the egg, all cells of a body originate from this one cell. Thus the cells of the developing body are initially genetically alike. But phenotypically they become very different. They are specialized to a certain cell type and arranged in an organized pattern to a certain type of tissue, and the whole structure has the well-defined shape of an organ. All these features are determined by the DNA sequence of the genome, which is reproduced in every cell. Each cell acts on the genetic instructions given at a certain time and at a certain place of development and plays its individual part in the multicellular organism. Cell differentiation may be divided into three general steps: cell cycle exit, apoptosis protection and tissue specific gene expression. These processes are coordinated to provide the final and unique tissue characteristics.
An animal cell that has achieved a certain level of development is said to be determined. This differentiation of a cell may be irreversible and in that case the cell may be renewed only by simple duplication. Other cells are renewed by means of stem cells, which are immortal (e.g. stem cells of the bone marrow, epidermal stem cells). The genetic control of development is extensively studied in non-vertebrates and vertebrates. The classical animal model is the fruit fly Drosophila and the modern model is the transgenic mouse. Animal transgenesis has proven to be useful for physiological as well as physiopathological studies. Besides the approach based on the random integration of a DNA construct in the mouse genome, gene targeting can be achieved using totipotent embryonic stem cells for targeted transgenesis. Transgenic mice are then derived from the embryonic stem cells. This allows the introduction of null mutations in the genome (so-called knock-out) or the control of the transgene expression by the endogenous regulatory sequence of the gene of interest (so-called knock-in). Mice can be created that express wild-type genes, mutant genes, marker genes or cell lethal genes in a tissue specific manner. These animal models make it possible to follow changes in tissue and organ development and lead to a better understanding of the cellular function of many genes or to the generation of animal models for human diseases. Fundamental problems in immunology, onset and development of cancer, regulation in fatty acid metabolism, aspects of cardiovascular function, control of the central nervous system development, and analysis of reproductive development and function are only some examples of research interests.
The final stage of cell differentiation is growth arrest. In animal tissues with rapid cell turnover, terminally differentiated cells undergo programmed cell death. The cells have the ability to kill themselves by activating an intrinsic cell suicide program when they are no longer needed or have become seriously damaged. The execution of this program is termed apoptosis. Apoptosis is of importance for development and homeostasis of animals. The key components of this program have been conserved in evolution from worms (C. elegans) to insects (Drosophila) to humans. The roles of apoptosis include the sculpting of structures during development, deletion of unneeded cells and tissues, regulation of growth and cell number, and the elimination of abnormal and potentially dangerous cells. In this way apoptosis provides a "quality control mechanism" that limits the accumulation of harmful cells, such as virus-infected cells and tumor cells. On the other hand, inappropriate apoptosis is associated with a wide variety of diseases, including AIDS, neurodegenerative disorders and ischemic stroke. Because it is now clear that apoptosis is a result of an active, gene-directed process, it should eventually be possible to manipulate this form of cell death by developing drugs that interact with its recently identified mechanisms of action. Inducers of cell differentiation, cell cycle arrest and apoptosis might be the novel molecular targets for new anticancer agents in addition to the signaling pathways for growth factors and cytokines.
Proteins, factors, receptors and genes of importance in apoptosis:
Proteases:
- Calpain, an intracellular cysteine protease, exact role unknown.
- Caspase-1 to Caspase-11, a family of proteases synthesized as inactive proenzymes. Targets of the activated enzymes include: poly (ADP-ribose) polymerase, DNA-dependent protein kinase, U1 ribonucleoprotein, nuclear lamins and cytoskeleton components (actin).
- Granzyme B, a serine protease released by cytotoxic T-cells.
Receptors:
- CD95 (synonyms: Fas, APO-1), a receptor protein of the TNF-receptor family, which includes TNF-R1 and TNF-R2, with the common characteristic of a 70 amino acid cytoplasmic domain.
- FADD (synonym: MORT-1), a cytoplasmic protein
- DR-3 (synonym: APO-3), a member of the TNF-receptor family
- DR-4 and DR-5
Genes:
- ced-3, ced-4 and ced-9 encode the general apoptotic and antiapoptotic program in Caenorhabditis elegans. Apaf-3 is the mammalian homologue of ced-3.
- Bcl-2 / Bcl-xL / Bax / Bcl-xS / Bak: a large gene family that can either inhibit or promote apoptosis.
- Cytokine response modifier A, a cowpox virus gene whose gene product inhibits caspases.
Others:
- Caspase-activated DNase (CAD), which causes DNA fragmentation in the nucleus, and its inhibitor (ICAD)
- Ceramide, a complex lipid that acts as a second messenger.
- c-Jun N-terminal kinase (JNK) is a proline-directed kinase
- p53 protein is essential for the induction of apoptosis as a response to chromosomal damage.
- RAIDD, a death signal-transducing protein.
- Receptor interacting protein (RIP) is an accessory protein with a death domain and a serine/threonine kinase activity.
- Sphingomyelinase, an enzyme that hydrolyzes the complex lipid sphingomyelin to ceramide.
- Tumor necrosis factor (TNF) is a type II membrane protein
- TNF-receptor associated factor (TRAF2) is an accessory protein that can bind to both TNF-R1 and TNF-R2.
Within the overall group of Differentiation/Development, several categories of proteins are coded for by clones of the invention:
Notch family proteins: Notch family molecules are negative regulators of neuronal differentiation in early brain development. Clones in this category include: amy2_li24.
Testis-specific Y-encoded proteins: The TSPY genes are arranged in clusters on the Y chromosome of many mammalian species. TSPY is believed to function in early spermatogenesis and is a candidate for GBY, the putative gonadoblastoma-inducing gene on the Y. Clones in this category include: amy2_7j5.
Inflammation-mediating proteins: Inflammation is a basic mechanism responsible for recruiting and activation of immunocompetent cells. By various mediators, cells are activated and triggered to differentiate. Hyperactivation of these pathways leads to various disease states. In neuronal tissues, in inflammatory diseases such as experimental autoimmune encephalomyelitis (EAE), neuritis (EAN) and uveitis (EAU), allograft inflammatory factor-1 is produced by macrophages and microglia cells. Clones in this category include: amy2_2bl1.
Intracellular transport and trafficking
Eukaryotic cells rely for their viability on the partitioning of many basic cellular processes into membrane-bounded organelles. These are the nucleus, endoplasmic reticulum (ER), Golgi apparatus, endosomes, lysosomal compartments, mitochondria and peroxisomes. Most molecules destined for the lysosome, cell surface and outside the cell are routed through the ER and Golgi, which together with the vesicular intermediates between them comprise the secretory pathway (Palade 1975). In the ER and Golgi compartments proteins are sorted, modified and often assembled into complexes en route to their final destination. Incorrectly assembled proteins are retained in the ER until they fold correctly or are targeted for degradation. Additional proteins are translocated into and function within the lumenal spaces of organelles or are secreted. Thus a large proportion of proteins synthesized require targeting to membranes either for insertion into or transport across them. A major purpose of this is growth. The secretory pathway is dependent on an intact cytoskeleton and also closely linked to general metabolism by affecting ribosome biogenesis (Mizuta and Warner, 1994). A huge number of proteins is required for targeting, translocation and sorting of newly synthesized proteins.
The first step in sorting is the recognition of cis-acting targeting or signal sequences that organelle-targeted proteins contain. This is carried out by cytosolic targeting factors and/or receptors on the membrane to which the protein is targeted. In some cases the primary sequences are extremely degenerate, with only the overall character being conserved (hydrophobicity for an ER signal sequence, helical amphiphilicity for a mitochondrial targeting sequence (Kaiser et al., 1987; Lemire et al., 1989)). Following the targeting step, proteins are either inserted into or transported across the membrane (translocated) through a proteinaceous apparatus (termed the translocon). Translocons include or recruit motors to drive the translocation process in the correct direction (Schatz and Dobberstein, 1996).
Defined intracellular protein transport steps:
■ ER
- targeting to the ER
- translocation into the lumen of the ER, and, depending on the presence of certain signals in the peptide sequence, transport through the Golgi complex
■ Mitochondria
- targeting
- translocation
■ Peroxisomes
■ The general secretory pathway
- protein modification, assembly and quality control in the ER
- vesicle-mediated trafficking
- vesicle docking and fusion
- transport through the Golgi apparatus and sorting at the trans-Golgi
- transport to the cell surface
- transport routes to the lysosome
■ Endocytosis
■ Specialized protein transport routes
■ Protein export from the cytoplasm
References: Palade, G (1975) Science 189: 347-358; Mizuta et al. (1994) Mol Cell Biol 14: 2493-2502; Kaiser et al. (1987) Science 235: 312-317; Lemire et al. (1989) J Biol Chem 264: 20206-20215; Schatz et al. (1996) Science 271: 1519-1526.
Rab proteins
In eukaryotic cells the compartmentalisation of processes is a prerequisite for a tight regulation of processes and activities. The cells contain a highly dynamic set of membrane compartments that are responsible for packaging, sorting, secreting, and recycling proteins and other molecules. Trafficking between organelles within the secretory pathway occurs as vesicles derived from a donor compartment fuse with specific acceptor membranes, resulting in the directional transfer of cargo molecules. This process is tightly controlled by the Rab/Ypt family of proteins (reviewed by Novick and Zerial, 1997), a branch of the superfamily of small GTPases. Rab proteins regulate a variety of functions, including vesicle translocation and docking at specific fusion sites. Rabs may also play critical roles in higher order processes such as modulating the levels of neurotransmitter release in neurons, a likely mechanism in synaptic plasticity that underlies learning and memory (Geppert and Südhof, 1998). Small GTPases share a common three-dimensional fold that, in the GTP bound state, can bind a variety of downstream effector proteins. GTP hydrolysis leads to a conformational change in the "switch" regions that renders the GTPase unrecognizable to its effectors. In this way, by localizing and activating a select set of effectors, a common structural motif is used to control a wide array of distinct cellular processes.
The final steps in membrane fusion are likely to be driven by a set of proteins known as SNAREs. After a vesicle becomes docked, the cytoplasmic domains of VAMP (also termed synaptobrevin) and syntaxin on opposing membranes, in combination with a SNAP-25 molecule, coalesce into an elongated α-helical bundle (Poirier et al., 1998; Sutton et al., 1998), which may lead to fusion. Because numerous SNARE isoforms have been identified that localize to distinct membrane compartments, it was originally proposed that the specificity of interaction between the SNARE proteins accounted for the specificity in membrane trafficking. Recent results, however, suggest that SNAREs are not specific in their ability to form complexes in vitro, suggesting that trafficking specificity requires additional factors (Yang et al., 1999). In this regard, Rab proteins are strong candidates for governing the specificity of vesicle trafficking. Like the SNAREs, many isoforms (40) of the Rab family have been identified that localize to specific membrane compartments (reviewed by Novick and Zerial, 1997).
Concomitant with the SNARE cycle, Rab proteins undergo an intricate cycle of membrane and protein interactions. Rabs are posttranslationally modified at C-terminal cysteines by the addition of two geranylgeranyl groups, which mediate membrane association when the Rab is in the GTP-bound state. After guanine nucleotide hydrolysis occurs, the Rab is extracted from the membrane upon forming a complex with a cytosolic GDP-dissociation inhibitor (GDI). This cytosolic intermediate is then recycled onto a newly forming vesicle, most likely through a secondary factor termed a GDI dissociation factor (GDF), which displaces GDI. After the Rab becomes membrane bound, a guanine nucleotide exchange factor (GEF) promotes release of GDP and the subsequent loading of GTP. In its GTP-bound conformation, the Rab is then free to associate with its specific set of effectors, which can in turn trigger events leading to the eventual fusion of the vesicle with a target membrane. To complete the cycle, perhaps after or concurrent with membrane fusion, a GTPase activating protein (GAP) accelerates nucleotide hydrolysis, switching off the GTPase. The remaining GDP-bound Rab can then participate in a new round of fusion.
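The Rab cycle described above can be read as a small state machine in which each regulatory factor moves the Rab from one state to the next. The following Python sketch is illustrative only: the state names, the `TRANSITIONS` table and the helper functions are assumptions introduced here and are not part of the disclosure. It simplifies the cycle to three states and treats each factor as a single discrete step.

```python
from enum import Enum


class RabState(Enum):
    """Hypothetical, simplified states of a Rab protein."""
    CYTOSOL_GDI = "cytosolic, GDP-bound, complexed with GDI"
    MEMBRANE_GDP = "membrane-bound, GDP-bound"
    MEMBRANE_GTP = "membrane-bound, GTP-bound (active)"


# (current state, acting factor) -> next state, following the order in the text.
TRANSITIONS = {
    (RabState.CYTOSOL_GDI, "GDF"): RabState.MEMBRANE_GDP,   # GDF displaces GDI
    (RabState.MEMBRANE_GDP, "GEF"): RabState.MEMBRANE_GTP,  # GDP released, GTP loaded
    (RabState.MEMBRANE_GTP, "GAP"): RabState.MEMBRANE_GDP,  # GTP hydrolysed
    (RabState.MEMBRANE_GDP, "GDI"): RabState.CYTOSOL_GDI,   # extraction from membrane
}


def step(state, factor):
    """Apply one regulatory factor; reject steps the text does not describe."""
    try:
        return TRANSITIONS[(state, factor)]
    except KeyError:
        raise ValueError(f"{factor} does not act on a Rab that is {state.value}")


def run_cycle(start, factors):
    """Apply a sequence of factors and return the final state."""
    state = start
    for factor in factors:
        state = step(state, factor)
    return state


if __name__ == "__main__":
    final = run_cycle(RabState.CYTOSOL_GDI, ["GDF", "GEF", "GAP", "GDI"])
    print(final.name)
```

In this simplified model, one pass through GDF, GEF, GAP and GDI returns the Rab to its cytosolic GDI-bound form, ready for a new round.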
Rab interactions with effectors are likely to regulate vesicle targeting and membrane fusion in three ways. First, a Rab may specifically facilitate vectorial vesicle transport. Vesicles are transported from their site of origin to acceptor compartments, likely through associations with cytoskeletal elements and transport motors. A protein has been identified with a domain structure that suggests a connection between the cytoskeleton and the Rabs. This protein, called Rabkinesin-6, contains a kinesin-like ATPase motor domain followed by a coiled-coil stalk region and a Rab-binding domain (RBD) that specifically binds Rab6 (Echard et al., 1998). An additional link with the cytoskeleton is provided by the Rab effector Rabphilin-3A. Rabphilin-3A has been shown in vitro to interact with α-actinin, an actin-bundling protein, but only when not bound to Rab3A (Kato et al., 1996). These results raise the intriguing possibility that Rab proteins regulate vesicle interactions with the cytoskeleton and thereby play an active role in targeting vesicles to their appropriate destinations.
Second, Rab proteins may regulate membrane trafficking at the vesicle docking step. A number of Rab effectors, including Rabaptin-5, EEA1, Rabphilin-3A, and Rim, may serve as molecular tethers. Each effector protein contains an RBD, followed by a linker region (some having the potential to form elongated coiled-coil structures), and a domain capable of interacting with a second Rab or the target membrane. Rabaptin-5, for example, contains two RBDs, one near the N terminus that specifically recognizes Rab4 and a second near the C terminus that binds Rab5 (Vitale et al., 1998). Both Rim, which is localized to the target membrane, and Rabphilin-3A, which is localized to the vesicle, contain N-terminal RBDs and C-terminal Ca2+-binding C2 domains, implicating these effectors in synaptic vesicle localization or docking in response to Ca2+ influx (Wang et al., 1997). Tethering effectors may also recognize protein complexes on the acceptor membrane. Sec4p, a yeast Rab3A homolog, interacts with the exocyst (Guo et al., 1999), a complex of seven or more subunits that is assembled at sites of vesicle fusion along the plasma membrane. The exocyst complex may therefore function as a landmark for Rab/effector-mediated vesicle docking.
Third, once a vesicle has become tethered to its fusion site, Rab proteins may selectively activate the SNARE fusion machinery. The mechanism of this activation is unknown but may involve direct interactions of Rabs or, more likely, their effectors with SNAREs. For example, Hrs-2 is a protein that binds to SNAP-25 and contains a Zn2+-finger motif characteristic of Rab-binding proteins such as Rabphilin-3A, Rim, EEA1, and Noc2, suggesting that Hrs-2 may form a physical link between Rabs and SNAREs (Bean et al., 1997). In addition, certain mutations in the syntaxin-binding protein Sly1p, the Sec1p homolog utilized in ER to Golgi trafficking, eliminate the requirement for Ypt1p, a Rab protein that functions at this trafficking step (Dascher et al., 1991). Rabs may therefore regulate SNARE associations through Sec1 family members. In support of this idea, a Rab effector was recently found to interact with a vacuole Rab, a Sec1p homolog, and a SNARE protein (Peterson et al., 1999), which suggests that this effector serves to connect Rab and SNARE function. In this way, Rabs and their effectors may facilitate the correct pairing of SNAREs.
References: Dascher et al. (1991) Mol. Cell. Biol. 11, 872-885; Echard et al. (1998) Science 279, 580-585; Geppert et al. (1998) Annu. Rev. Neurosci. 21, 75-95; Guo et al. (1999) EMBO J. 18, 1071-1080; Kato et al. (1996) J. Biol. Chem. 271, 31775-31778; Novick et al. (1997) Curr. Opin. Cell Biol. 9, 496-504; Peterson et al. (1999) Curr. Biol. 9, 159-162; Poirier et al. (1998) Nat. Struct. Biol. 5, 765-769; Vitale et al. (1998) EMBO J. 17, 1941-1951; Wang et al. (1997) Nature 388, 593-598; Yang et al. (1999) J. Biol. Chem. 274, 5649-5653.
Within the overall group of Intracellular Transport and Trafficking, several categories of proteins are coded for by clones of the invention.
Vesicular trafficking: Various proteins are involved in the trafficking of vesicles inside the cell and in the exocytotic pathway. For example, Sec7 of Saccharomyces cerevisiae functions in vesicular trafficking. Synaptotagmins are essential for Ca2+-regulated exocytosis of neurosecretory vesicles. Other proteins such as Dynamin are microtubule-associated force-producing proteins, which are involved in the production of microtubule bundles. By binding and subsequently hydrolysing GTP, such proteins provide the motor for vesicular transport during endocytosis. Clones in this category include: amy2_14b5, amy_2ol3 and fkd2_3kl.
Protein sorting: Protein sorting is a process essential for the maintenance of a cell's functionality and structural integrity. Most proteins perform their biological function in special compartments in the cell. The process of sorting is complex and highly regulated. Clones in this category include: mel2_7gl4.
Metabolism
This group includes proteins which are involved in the uptake and consumption of nutrients, and enzymes which are part of the biochemical pathways for energy metabolism or which are involved in the supply of building blocks of nucleic acids and proteins (NTPs, dNTPs, amino acids) for DNA/RNA and protein synthesis, and fatty acids (membranes), to allow for the generation of higher order structures. This group constitutes the most important and largest group in prokaryotes and lower eukaryotes. The higher the evolutionary level of an organism is, however, the more other protein classes such as "signal transduction", "cell cycle" and "differentiation and development" increase in importance and number of representatives.
Proteins involved in the metabolism of energy and compounds (here: other than nucleic acids or proteins) are usually the products of housekeeping genes; they are often constitutively and/or ubiquitously expressed. Several categories of proteins are coded for by clones of the invention within the overall group of Metabolism:
Fatty acid metabolism: OMIM lists more than 50 diseases caused by pathologically altered fatty acid metabolism. 1-acyl-glycerol-3-phosphate acyltransferase is involved in fatty acid metabolism and is ubiquitously expressed, with a slight predominance in uterus, placenta and foreskin. Clones in this category include: amy2_2c22.
Repair and surveillance of protein damage: Several classes of protein are involved in the repair and surveillance of protein damage. L-isoaspartyl methyltransferase (Pimt), as an example, is a highly conserved enzyme utilising S-adenosylmethionine (AdoMet) to methylate aspartate residues of proteins damaged by age-related isomerisation and deamidation. Clones in this category include: fbr2_7βi21.
Nucleic acid management
The genetic information is stored in the form of nucleic acids in all organisms. Two kinds of nucleic acids exist, DNA and RNA. Whereas the more stable DNA in most organisms constitutes the storage form of the genetic information, the labile RNA, and in particular mRNA, is an intermediate used for the temporal expression of specific genes. In eukaryotes, DNA is usually a double-stranded linear molecule consisting of two antiparallel strands, each made up of a deoxyribose-phosphate backbone and the four bases A, C, G, and T. The DNA of some organisms has a ring structure. The structure of DNA was unraveled by Watson and Crick in 1953. DNA is a directional molecule; the direction of each strand is defined by the 5' and 3' carbon atoms of the sugar.
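The antiparallel arrangement of the two strands means that the partner of a strand read 5' to 3' is its reverse complement. A minimal Python sketch, using a hypothetical helper name and standard A-T / G-C base pairing:

```python
# Standard Watson-Crick pairing: A pairs with T, C pairs with G.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(strand_5_to_3):
    """Return the partner strand, also written 5' to 3'.

    Complementing gives the partner read 3' to 5'; reversing the
    string restores the conventional 5' to 3' reading direction.
    """
    return strand_5_to_3.upper().translate(COMPLEMENT)[::-1]


if __name__ == "__main__":
    print(reverse_complement("ATGC"))  # GCAT
```

Applying the function twice returns the original strand, which reflects the symmetry of the double helix.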
The most important processes dealing with nucleic acids are:
• replication (e.g. DNA polymerases, telomerase)
• transcription (RNA polymerases)
• RNA processing (maturation - splicing and degradation)
• in addition, enzymes and proteins exist which require a nucleic acid (mostly RNA) in the active center to be functional (ribozymes - e.g. RNase, ribosomal proteins)
The DNA of a cell is replicated in the S-phase of the cell cycle. Several enzymes carry out the task of doubling this nucleic acid. Like all steps of the cell cycle, the process of replication is tightly regulated. The enzyme DNA polymerase and several other proteins are involved in this process. Whereas many prokaryotes have only one origin of replication (i.e., the starting point of the replication cycle), in eukaryotic DNAs (chromosomes) multiple such start points exist. The switch from the synthesis (S) phase to the subsequent G2 or M phases of the cell cycle is dependent on the completion of replication. This makes clear that a number of proteins are involved in the replication itself as well as in the control of the process. Since most eukaryotic chromosomes are linear structures, additional proteins and enzymes are necessary to make sure that the structure is maintained through successive generations. This includes those proteins necessary to build the three-dimensional structure of chromosomes (e.g. histones) and the structural network of the nucleus and nucleolus (including the defined localization of transcriptionally active genes in the vicinity of nucleoli), but also such enzymes as telomerase, which guarantees the integrity of the chromosomal ends.
The expression of genes is usually performed in two steps. First, a messenger RNA (mRNA) is produced (transcribed) in one to many copies, and second, this mRNA is translated into the protein product. The regulation of transcription is discussed under the separate heading "transcription factors", but the classes "signal transduction", "development", "cell cycle" and others are also affected, as the expression of certain genes determines the fate of a cell or organism. The primary transcript (hnRNA - heterogeneous nuclear RNA) is a single-stranded one-to-one copy of the gene as it is located on the chromosome. Before a protein can be translated, the process of maturation is initiated, already during transcription. Firstly, a 5' cap structure is enzymatically and covalently added to the RNA, blocking the 5' end of the RNA. Second, when the RNA polymerase has terminated polymerization, the enzyme poly A polymerase adds varying numbers of adenine residues to the 3' end of the transcript. This enzyme recognizes the sequence AAUAAA or AUUAAA (plus some minor variations), cuts the RNA 10 - 30 nucleotides downstream and adds the A residues. The size of the poly A sequence affects the stability of the RNA. Finally, in the process of splicing, the introns present on the genomic level and also present in the hnRNA are spliced out by a multi-protein complex consisting of several proteins and RNAs. The mature mRNA is exported to the cytoplasm, where it is translated with the help of the ribosomes.
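The polyadenylation rule stated above (a signal AAUAAA or AUUAAA, followed by cleavage 10 - 30 nucleotides downstream) can be sketched as a simple sequence scan. The Python example below is illustrative only; the function name and the choice to measure the distance from the last base of the signal are assumptions made here, and the "minor variations" of the signal are not modelled.

```python
import re

# Lookahead so that overlapping occurrences of the two canonical signals are found.
POLY_A_SIGNAL = re.compile(r"(?=(AAUAAA|AUUAAA))")


def find_cleavage_windows(rna, min_offset=10, max_offset=30):
    """Return (signal_start, signal, window_start, window_end) tuples.

    Positions are 0-based indices into the transcript; window_end is
    exclusive. The window covers the region 10 - 30 nt downstream of the
    signal (assumption: counted from the end of the hexamer).
    """
    rna = rna.upper().replace("T", "U")  # accept cDNA-style input
    windows = []
    for match in POLY_A_SIGNAL.finditer(rna):
        signal_end = match.start() + 6
        window_start = signal_end + min_offset
        window_end = min(signal_end + max_offset, len(rna))
        if window_start < len(rna):  # transcript must extend into the window
            windows.append((match.start(), match.group(1), window_start, window_end))
    return windows


if __name__ == "__main__":
    toy_transcript = "GG" + "AAUAAA" + "C" * 40  # hypothetical sequence
    print(find_cleavage_windows(toy_transcript))
```

A transcript that ends fewer than 10 nucleotides after the signal yields no window in this sketch.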
The half-life of RNA is usually much shorter than that of DNA. Usually, the mRNA is degraded shortly after synthesis, to guarantee a very defined window of expression of a given gene. This regulation is necessary to specifically maintain or change the set of proteins present at any time in a cell. Specific regions in the 3' UTR (untranslated region) determine the stability of the mRNA in the cytoplasm before it is degraded by RNases, enzymes consisting both of protein and RNA.
References: Watson and Crick (1953) Nature 171: 737-738.
Several categories of proteins are coded for by clones of the invention within the overall group of "Nucleic acid management" and include, among others, the following:
Proteins induced by DNA damage: There are several distinct pathways responsible for repair of DNA. Nucleotide excision repair is the most versatile DNA repair pathway and is the main defense of mammalian cells against UV-induced DNA damage. Defects in proteins involved in this pathway can lead to inherited disorders (such as xeroderma pigmentosum, OMIM *278700, *278720, *278740 and *194400; Cockayne's syndrome, OMIM *216400; and trichothiodystrophy, OMIM #601675). Study of UV-sensitive yeast RAD mutants has greatly aided this process and has revealed strong conservation of the components of nucleotide excision repair in eukaryotes. Clones in this category include: amy2_lln4 and tes3_10ilb.
Proteins involved in loading of transfer RNAs: Transfer RNAs must be coupled to an amino acid, which is then transported to the peptidyl transferase centre of the ribosome. Clones in this category include: fbr2_7δcl2.
Cytosolic ribosomal proteins: Several proteins are part of the eukaryotic ribosomal peptidyl transferase centre or modulate the activity of this centre. Such proteins can find application in modulation of ribosome assembly, maintenance and activity. Clones in this category include: amy21il.
Histones: Histones are DNA-binding proteins that are responsible not only for DNA structure, folding and packing, but are also thought to be involved in the activation and silencing of large chromosomal regions. Clones in this category include: tes3_31al0.
mRNA-binding proteins: mRNA-binding proteins are involved in the regulation of mRNA folding, translation and stability. For example, the VILIP protein binds specifically to the 3' untranslated region of the neurotrophin receptor mRNA. Clones in this group include amy2_2gl2.
Signal transduction
Cells in higher order organisms need to communicate continuously with their environment, especially with other cells of the same organism, in order to maintain the function and specialization of the whole system these cells are part of. This important task of communication is performed with the help of cell-surface receptors, which receive and transmit signals from outside into the cell.
G-proteins
The largest known family of cell-surface receptors is that of the G-protein-coupled receptors, which mediate the transmission of diverse stimuli such as neurotransmitters, glycopeptides, hormones, peptides, odorant molecules, and photons. The functional unit of these receptors is composed of the receptor molecule itself (GPCR), which is anchored in the cytoplasmic membrane with seven membrane-spanning domains, the heterotrimeric G-protein, which is composed of α-, β- and γ-subunits (Gα and Gβγ), and the effectors that interact with Gα and/or Gβγ. In particular, the dissociated Gα and Gβγ can regulate the activities of a number of effector molecules such as adenylate cyclases, phospholipase C isoforms, ion channels, and tyrosine kinases, resulting in a variety of cellular functions. The process of signal transduction must be tightly regulated and reversible in order to avoid overstimulation, to achieve signal termination, and to render the receptor responsive to subsequent stimuli [Iacovelli L. et al. (1999) FASEB J. 13, 1-8; Hamm H.E. (1998) J. Biol. Chem. 273, 669-672]. G-proteins are GTPases that, upon binding of GTP, change their conformation, which in turn unmasks structural motifs, in particular the so-called effector loop, which can mediate the interactions with target proteins, or effectors, of the GTPases. This ability enables the GTPases to cycle between active, GTP-bound and inactive, GDP-bound conformations and in the process to function as molecular traffic lights in a multitude of signal transduction pathways. The most important of the signal transduction pathways that are regulated with the help of G-proteins are the phospholipase C / protein kinase C pathway and the adenylate cyclase / protein kinase A pathway.
The cycling of GTPases is tightly regulated by three main classes of proteins: the exchange of GDP for GTP is facilitated by guanine nucleotide exchange factors (GEFs), the hydrolysis of GTP to GDP is sped up by GTPase-activating proteins (GAPs), and the dissociation of GDP from the GTPases is inhibited by GDP dissociation inhibitors (GDIs) [Tapon and Hall (1997) Curr. Opin. Cell Biol. 9, 86-92; Van Aelst and D'Souza-Schorey (1997) Genes Dev. 11, 2295-2322].
SOCS family
A conserved motif that was originally identified in proteins that negatively regulate the signaling action of cytokines was termed the SOCS box, after the Suppressor Of Cytokine Signaling. Based on homology, five distinct structural protein classes that carry this motif have since been identified. The function of most of these proteins is presently not known. Common to the proteins is only the SOCS box, which is located near the C-terminus of the respective peptides. Recently, the SOCS box has been demonstrated to induce binding of proteins to elongins B and C, which could target the proteins (and bound substrates) to the proteasomal protein degradation pathway (Kamura T. et al. (1998) Genes Dev. 12, 3872-3881; Zhang J.-G. et al. (1999) Proc. Natl. Acad. Sci. USA 96, 2071-2076). The class where the SOCS box was originally described contains several members (SOCS-1 to SOCS-7 and CIS). In addition to the SOCS box, these proteins also contain an SH2 (Src-homology 2) domain and a variable N-terminus. These SOCS proteins appear to form part of a classical negative feedback loop that regulates cytokine signal transduction. Upon cytokine stimulation, expression of SOCS proteins is rapidly induced and the proteins inhibit further cytokine action. The mode of action of the SOCS proteins is variable. While SOCS-1 binds and inhibits the JAK (Janus kinases) family of cytoplasmic protein kinases [Narazaki M. et al. (1998) Proc. Natl. Acad. Sci. USA 95, 13130-13134; Nicholson S.E. et al. (1999) EMBO J. 18, 375-385], CIS appears to act by competing with signaling molecules such as the STATs (Signal Transducers and Activators of Transcription) family for binding to phosphorylated receptor cytoplasmic domains [Yoshimura A. et al. (1995) EMBO J. 14, 2816-2826; Matsumoto A. et al. (1997) Blood 89, 3148-3154].
A second class of SOCS box proteins additionally contains WD-40 repeats, which were initially identified in the mouse WSB-1 and -2 proteins. The functions of WD-40 proteins are not completely understood but seem to be rather divergent. In Cdc4p the WD-40 repeats are probably necessary for binding the substrate for Cdc34p [Mathias N. et al. (1999) Mol. Cell. Biol. 19, 1759-1767]. Cdc4p is a component of a ubiquitin ligase that tethers the ubiquitin-conjugating enzyme Cdc34p to its substrates. The posttranslational modification of a protein by ubiquitin usually results in rapid degradation of the ubiquitinated protein by the proteasome. The transfer of ubiquitin to substrate is a multistep process in which WD-40 repeats might play an important function. Other WD-40 containing proteins (e.g. the retinoblastoma binding protein RbAp46) have been shown to bind metal ions (zinc), and this metal binding might mediate and/or regulate protein-protein interactions which are functionally important in chromatin metabolism [Kenzior A.L. and Folk W.R. (1998) FEBS Lett. 440, 425-429]. These proteins are involved in the RAS-cAMP pathway that regulates cellular growth [Ach R.A. et al. (1997) Plant Cell 9, 1595-1606].
The SPRY domain has been identified in pyrin or marenostrin, a protein which is mutated in patients with Mediterranean fever and which is similar to the butyrophilin family. While butyrophilins seem to be involved in the lactation process in mammals, the function of pyrin is unknown. Three proteins (SSB-1 to -3) have been identified that contain both SPRY and SOCS box motifs. The function of these proteins is also not known. Ankyrin repeat containing proteins share a 33-residue repeating motif, an L-shaped structure with protruding β-hairpin tips which mediate specific macromolecular interactions with cytoskeletal, membrane, and regulatory proteins. These proteins play fundamental roles in diverse biological activities including growth and development, intracellular protein trafficking, the establishment and maintenance of cellular polarity, cell adhesion, signal transduction, and mRNA transcription. Three proteins that contain ankyrin repeats (ASB-1 to -3) have been identified that contain a C-terminal SOCS box in addition to the ankyrin repeats. The function of these proteins or of the individual domains remains to be discovered [Hilton D.J. et al. (1998) Proc. Natl. Acad. Sci. USA 95, 114-119].
A few small GTPases (RAR and RAR-like) also contain a SOCS box. GTPases are involved in signal transduction during cellular communication. The function of the SOCS box in this type of protein is currently unclear [Hilton D.J. et al. (1998) Proc. Natl. Acad. Sci. USA 95, 114-119].
Ca2+ as second messenger
The bivalent cation Ca2+ is, besides cAMP, one of the two major second messengers in eukaryotic cells. Its intracellular concentration is tightly regulated and usually kept very low compared to the cell's environment. Ca2+ binding proteins and transporters (gap junction, voltage-gated, second messenger-gated) help to sequester huge amounts of the ion in various organelles, from where Ca2+ can be released upon extracellular stimuli. For example, the contraction of muscle is dependent on the presence of Ca2+ ions, which are readily transported back into the organelles in order for the muscle to relax. In signal transduction, Ca2+ functions as a second messenger that activates Ca2+ dependent processes through the activation of Ca2+/calmodulin dependent protein kinases (CaM kinases), which are the major effector molecules of Ca2+. In the signaling cascades, the CaM dependent kinases activate phospholipases (e.g. phospholipase C) that in turn activate other protein kinases such as protein kinase C.
cAMP
Cyclic AMP is produced by the enzyme adenylate cyclase in response to extracellular signals. Certain G-proteins stimulate the activity of adenylate cyclase, which converts ATP to cAMP and PPi. Two molecules of cAMP bind to each of the two regulatory subunits of cAMP dependent protein kinase, which in turn dissociate from the two catalytic subunits of the heterotetramer R2C2. Upon release of the C-subunits, they become active and phosphorylate substrate proteins at Ser and Thr residues. The process leading from binding of extracellular molecules to their receptors, the transmission of the stimuli into the cell, the activation of adenylate cyclase and the subsequent activation of cAMP dependent protein kinase is one of two major signal transduction pathways in eukaryotic cells. Since the phosphorylation of proteins is a posttranslational modification of proteins, the kinases are described in the class "signal transduction."
SARA
Members of the transforming growth factor β (TGFβ) superfamily signal through a family of cell-surface transmembrane serine/threonine kinases, known as type I and type II receptors (Heldin et al., 1997; Attisano and Wrana, 1998; Kretzschmar and Massagué, 1998). Ligand induces formation of heteromeric complexes of these receptors, and signaling is initiated when receptor I is phosphorylated and activated by the constitutively active kinase of receptor II (Wrana et al., 1994). The activated type I receptor kinase then propagates the signal to a family of intracellular signaling mediators known as Smads (a contraction of the C. elegans Sma and Drosophila Mad genes, which were the first identified members of this class of signaling effectors).
Three classes of Smads with distinct functions have been defined: the receptor-regulated Smads, which include Smad1, 2, 3, 5, and 8; the common mediator Smad, Smad4; and the antagonistic Smads, which include Smad6 and 7 (Heldin et al., 1997; Attisano and Wrana, 1998; Kretzschmar and Massagué, 1998). Receptor-regulated Smads (R-Smads) act as direct substrates of specific type I receptors, and the proteins are phosphorylated on the last two serines at the carboxyl terminus within a highly conserved SSXS motif (Macías-Silva et al., 1996; Abdollah et al., 1997; Kretzschmar et al., 1997; Liu et al., 1997b; Souchelnytskyi et al., 1997). Regulation of R-Smads by the receptor kinase provides an important level of specificity in this system. Thus, Smad2 and Smad3 are substrates of TGFβ or activin receptors and mediate signaling by these ligands (Macías-Silva et al., 1996; Liu et al., 1997b; Nakao et al., 1997), whereas Smad1, 5, and 8 are targets of BMP receptors and propagate BMP signals (Hoodless et al., 1996; Chen et al., 1997b; Kretzschmar et al., 1997; Nishimura et al., 1998). Once phosphorylated, R-Smads associate with the common Smad, Smad4 (Lagna et al., 1996; Zhang et al., 1997), and mediate nuclear translocation of the heteromeric complex. In the nucleus, Smad complexes then activate specific genes through cooperative interactions with DNA and other DNA-binding proteins such as FAST1, FAST2, and Fos/Jun (Chen et al., 1996; Chen et al., 1997a; Liu et al., 1997a; Labbé et al., 1998; Zhang et al., 1998; Zhou et al., 1998).
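The C-terminal SSXS motif mentioned above lends itself to a simple pattern check. The Python sketch below is illustrative only: the function name and the toy peptides are hypothetical, X is taken to mean any residue, and "the last two serines" is read here as the second and fourth residues of the motif.

```python
import re

# SSXS anchored at the carboxyl terminus; X stands for any amino acid.
SSXS_AT_C_TERMINUS = re.compile(r"SS[A-Z]S$")


def c_terminal_phosphosites(protein):
    """Return 1-based positions of the two candidate phosphoserines.

    Returns None when the sequence does not end in an SSXS motif.
    For a protein of length n the motif spans n-3..n, so the second
    serine sits at n-2 and the final serine at n.
    """
    sequence = protein.upper()
    if not SSXS_AT_C_TERMINUS.search(sequence):
        return None
    n = len(sequence)
    return (n - 2, n)


if __name__ == "__main__":
    print(c_terminal_phosphosites("AAASSMS"))   # toy peptide ending in SSMS
    print(c_terminal_phosphosites("AAASSMSA"))  # motif not at the terminus
```

A check of this kind only flags candidate sequences; whether a given protein is in fact a receptor substrate is an experimental question.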
In contrast to R-Smads and Smad4, the antagonistic Smads, Smad6 and 7, appear to function by blocking ligand-dependent signaling (reviewed in Heldin et al., 1997). Phosphorylation of R-Smads by the type I receptor is essential for activating the TGFβ signaling pathway (Heldin et al., 1997; Attisano and Wrana, 1998; Kretzschmar and Massagué, 1998). However, little is known of how Smad interaction with receptors is controlled. A novel Smad2/Smad3 interacting protein has been described (Tsukazaki T. et al., 1998) that contains a double zinc finger, or FYVE domain, and which has been called SARA (Smad anchor for receptor activation). The SARA motif recruits Smad2 into distinct subcellular domains and co-localizes and interacts with TGFβ receptors. TGFβ signaling induces dissociation of Smad2 from SARA with concomitant formation of Smad2/Smad4 complexes and nuclear translocation. Moreover, deletion of the FYVE domain in SARA causes mislocalization of Smad2 and inhibits TGFβ-dependent transcriptional responses. Thus, SARA defines a component of TGFβ signaling that functions to recruit Smad2 to the receptor by controlling the subcellular localization of Smad.
References: Abdollah et al. (1997) J. Biol. Chem. 272, 27678-27685; Attisano et al. (1998) Curr. Opin. Cell Biol. 10, 188-194; Chen et al. (1996) Nature 383, 691-696; Chen et al. (1997a) Nature 389, 85-89; Chen et al. (1997b) Proc. Natl. Acad. Sci. USA 94, 12938-12943; Heldin et al. (1997) Nature 390, 465-471; Hoodless et al. (1996) Cell 85, 489-500; Kretzschmar et al. (1998) Curr. Opin. Genet. Dev. 8, 103-111; Kretzschmar et al. (1997) Genes Dev. 11, 984-995; Labbé et al. (1998) Mol. Cell 2, 109-120; Lagna et al. (1996) Nature 383, 832-836; Liu et al. (1997a) Genes Dev. 11, 3157-3167; Liu et al. (1997b) Proc. Natl. Acad. Sci. USA 94, 10669-10674; Macías-Silva et al. (1996) Cell 87, 1215-1224; Nakao et al. (1997) EMBO J. 16, 5353-5362; Nishimura et al. (1998) J. Biol. Chem. 273, 1872-1879; Souchelnytskyi et al. (1997) J. Biol. Chem. 272, 28107-28115; Tsukazaki et al. (1998) Cell 95, 779-791; Wrana et al. (1994) Nature 370, 341-347; Zhang et al. (1997) Curr. Biol. 7, 270-276; Zhang et al. (1998) Nature 394, 909-913; Zhou et al. (1998) Mol. Cell 2, 121-127.
Calcium
The bivalent cation Ca2+ is, along with cAMP, one of the two major second messengers in eukaryotic cells. Its intracellular concentration is tightly regulated and usually kept very low compared to the cell's environment. Ca2+ binding proteins and transporters (gap junction, voltage-gated, second messenger-gated) help to sequester huge amounts of the ion in various organelles, from where Ca2+ can be released upon extracellular stimuli. For example, the contraction of muscle is dependent on the presence of Ca2+ ions, which are readily transported back into the organelles in order for the muscle to relax. In signal transduction, Ca2+ functions as a second messenger that activates Ca2+ dependent processes through the activation of Ca2+/calmodulin dependent protein kinases (CaM kinases), which are the major effector molecules of Ca2+. In the signaling cascades, the CaM dependent kinases activate phospholipases (e.g. phospholipase C) that in turn activate other protein kinases such as protein kinase C.
Rab proteins
In eukaryotic cells the compartmentalization of processes is a prerequisite for a tight regulation of processes and activities. The cells contain a highly dynamic set of membrane compartments that are responsible for packaging, sorting, secreting, and recycling proteins and other molecules. Trafficking between organelles within the secretory pathway occurs as vesicles derived from a donor compartment fuse with specific acceptor membranes, resulting in the directional transfer of cargo molecules. This process is tightly controlled by the Rab/Ypt family of proteins (reviewed by Novick and Zerial, 1997), a branch of the superfamily of small GTPases. Rab proteins regulate a variety of functions, including vesicle translocation and docking at specific fusion sites. Rabs may also play critical roles in higher order processes such as modulating the levels of neurotransmitter release in neurons, a likely mechanism in synaptic plasticity that underlies learning and memory (Geppert and Südhof, 1998).
Small GTPases share a common three-dimensional fold that, in the GTP-bound state, can bind a variety of downstream effector proteins. GTP hydrolysis leads to a conformational change in the "switch" regions that renders the GTPase unrecognizable to its effectors. In this way, by localizing and activating a select set of effectors, a common structural motif is used to control a wide array of distinct cellular processes.
The final steps in membrane fusion are likely to be driven by a set of proteins known as SNAREs. After a vesicle becomes docked, the cytoplasmic domains of VAMP (also termed synaptobrevin) and syntaxin on opposing membranes, in combination with a SNAP-25 molecule, coalesce into an elongated α-helical bundle (Poirier et al., 1998; Sutton et al., 1998), which may lead to fusion. Because numerous SNARE isoforms have been identified that localize to distinct membrane compartments, it was originally proposed that the specificity of interaction between the SNARE proteins accounted for the specificity in membrane trafficking. Recent results, however, suggest that SNAREs are not specific in their ability to form complexes in vitro, suggesting that trafficking specificity requires additional factors (Yang et al., 1999). In this regard, Rab proteins are strong candidates for governing the specificity of vesicle trafficking. Like the SNAREs, many isoforms (40) of the Rab family have been identified that localize to specific membrane compartments (reviewed by Novick and Zerial, 1997).
Concomitant with the SNARE cycle, Rab proteins undergo an intricate cycle of membrane and protein interactions. Rabs are posttranslationally modified at C-terminal cysteines by the addition of two geranylgeranyl groups, which mediate membrane association when the Rab is in the GTP-bound state. After guanine nucleotide hydrolysis occurs, the Rab is extracted from the membrane upon forming a complex with a cytosolic GDP-dissociation inhibitor (GDI). This cytosolic intermediate is then recycled onto a newly forming vesicle, most likely through a secondary factor termed a GDI dissociation factor (GDF), which displaces GDI. After the Rab becomes membrane bound, a guanine nucleotide exchange factor (GEF) promotes release of GDP and the subsequent loading of GTP. In its GTP-bound conformation, the Rab is then free to associate with its specific set of effectors, which can in turn trigger events leading to the eventual fusion of the vesicle with a target membrane. To complete the cycle, perhaps after or concurrent with membrane fusion, a GTPase activating protein (GAP) accelerates nucleotide hydrolysis, switching off the GTPase. The remaining GDP-bound Rab can then participate in a new round of fusion. Rab interactions with effectors are likely to regulate vesicle targeting and membrane fusion in three ways. First, a Rab may specifically facilitate vectorial vesicle transport.
Vesicles are transported from their site of origin to acceptor compartments, likely through associations with cytoskeletal elements and transport motors. A protein has been identified with a domain structure that suggests a connection between the cytoskeleton and the Rabs. This protein, called Rabkinesin-6, contains a kinesin-like ATPase motor domain followed by a coiled-coil stalk region and a Rab-binding domain (RBD) that specifically binds Rab6 (Echard et al., 1998). An additional link with the cytoskeleton is provided by the Rab effector Rabphilin-3A. Rabphilin-3A has been shown in vitro to interact with α-actinin, an actin-bundling protein, but only when not bound to Rab3A (Kato et al., 1996). These results raise the intriguing possibility that Rab proteins regulate vesicle interactions with the cytoskeleton and thereby play an active role in targeting vesicles to their appropriate destinations.
Second, Rab proteins may regulate membrane trafficking at the vesicle docking step. A number of Rab effectors, including Rabaptin-5, EEA1, Rabphilin-3A, and Rim, may serve as molecular tethers. Each effector protein contains an RBD, followed by a linker region (some having the potential to form elongated coiled-coil structures), and a domain capable of interacting with a second Rab or the target membrane. Rabaptin-5, for example, contains two RBDs, one near the N terminus that specifically recognizes Rab4 and a second near the C terminus that binds Rab5 (Vitale et al., 1998). Both Rim, which is localized to the target membrane, and Rabphilin-3A, which is localized to the vesicle, contain N-terminal RBDs and C-terminal Ca2+-binding C2 domains, implicating these effectors in synaptic vesicle localization or docking in response to Ca2+ influx (Wang et al., 1997). Tethering effectors may also recognize protein complexes on the acceptor membrane. Sec4p, a yeast Rab3A homolog, interacts with the exocyst (Guo et al., 1999), a complex of seven or more subunits that is assembled at sites of vesicle fusion along the plasma membrane. The exocyst complex may therefore function as a landmark for Rab/effector-mediated vesicle docking.
Third, once a vesicle has become tethered to its fusion site, Rab proteins may selectively activate the SNARE fusion machinery. The mechanism of this activation is unknown but may involve direct interactions of Rabs or, more likely, their effectors with SNAREs. For example, Hrs-2 is a protein that binds to SNAP-25 and contains a Zn2+-finger motif characteristic of Rab-binding proteins such as Rabphilin-3A, Rim, EEA1, and Noc2, suggesting that Hrs-2 may form a physical link between Rabs and SNAREs (Bean et al., 1997). In addition, certain mutations in the syntaxin-binding protein Sly1p, the Sec1p homolog utilized in ER to Golgi trafficking, eliminate the requirement for Ypt1p, a Rab protein that functions at this trafficking step (Dascher et al., 1991). Rabs may therefore regulate SNARE associations through Sec1 family members. In support of this idea, a Rab effector was recently found to interact with a vacuole Rab, a Sec1p homolog, and a SNARE protein (Peterson et al., 1999), which suggests that this effector serves to connect Rab and SNARE function. In this way, Rabs and their effectors may facilitate the correct pairing of SNAREs.
References: Dascher et al. (1991). Mol. Cell. Biol. 11, 872-885; Echard et al. (1998). Science 279, 580-585; Geppert et al. (1998). Annu. Rev. Neurosci. 21, 75-95; Guo et al. (1999). EMBO J. 18, 1071-1080; Kato et al. (1996). J. Biol. Chem. 271, 31775-31778; Novick et al. (1997). Curr. Opin. Cell Biol. 9, 496-504; Peterson et al. (1999). Curr. Biol. 9, 159-162; Poirier et al. (1998). Nat. Struct. Biol. 5, 765-769; Vitale et al. (1998). EMBO J. 17, 1941-1951; Wang et al. (1997). Nature 388, 593-598; Yang et al. (1999). J. Biol. Chem. 274, 5649-5653.
Kinases
Reversible posttranslational modifications of proteins are a major means of regulating cellular activities. Among the various modifications that are carried out by cells, the addition of phosphoryl groups to Ser/Thr or Tyr residues is the most important and widely used. The phosphorylation of proteins is accomplished by protein kinases, while the reverse reaction, the removal of phosphoryl groups, is carried out by phosphatases. Kinases and phosphatases regulate key positions, e.g. in the processes of cell proliferation, differentiation and communication/signaling. These processes must be tightly regulated in order to maintain a steady state in the cell. Mis-regulation of kinase activities (or of phosphatase activities) is held responsible for a multitude of disease processes such as oncogenesis, inflammatory processes, arteriosclerosis, and psoriasis.
Protein kinases constitute the largest protein family that is currently known. Several hundred kinases have been identified already. Classically, kinases are subdivided into two classes based on the amino acid residues in their substrates that are phosphorylated by the particular enzymes. The kinases specifically add phosphoryl groups from adenosine triphosphate (ATP) or, less frequently, guanosine triphosphate (GTP), either to serine and/or threonine or to tyrosine residues of substrate proteins. An estimated 1,000 to 10,000 proteins present in a typical mammalian cell are believed to be regulated by the action of protein kinases.
Protein kinases are frequently integral parts of signaling cascades that transmit extracellular stimuli (e.g. hormones, neurotransmitters, growth or differentiation factors) into the cell and result in various responses by the cells. The kinases play key roles in these cascades as they constitute a sort of "molecular switch", turning on or off the activities of other enzymes and proteins, e.g. metabolic regulatory proteins, channels and pumps, receptors, cytoskeletal proteins and transcription factors. The regulation of kinase activities is accomplished by various means.
The best characterized example of regulation via regulatory subunits is the cAMP-dependent protein kinase (PKA), which is also a prototype for second messenger activated protein kinases. This enzyme consists of a heterotetramer of two catalytic (C) and two regulatory (R) subunits. Upon binding of two molecules of second messenger (cAMP) to each R subunit, the catalytic subunits are released and active. Several isoforms exist of both the catalytic and the regulatory subunits. The combination of catalytic and regulatory subunits determines the localization of the holoenzyme and also the substrate spectrum that is available for phosphorylation. The consensus pattern that must be present in the substrate for PKA action is RRXS/T, where X can be any amino acid.
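The RRXS/T consensus can be illustrated with a short scan over a protein sequence. The following Python sketch is illustrative only: the function name and the example sequence are hypothetical and not part of the disclosure, and a match merely flags a candidate site rather than demonstrating phosphorylation.

```python
import re
from typing import List, Tuple

# RRXS/T consensus from the text: two arginines, any residue, then Ser or Thr.
# A lookahead is used so that overlapping candidate sites are all reported.
PKA_CONSENSUS = re.compile(r"(?=(RR[A-Z][ST]))")


def find_pka_sites(protein: str) -> List[Tuple[int, str]]:
    """Return (1-based position of the Ser/Thr acceptor, matched motif) pairs."""
    sites = []
    for match in PKA_CONSENSUS.finditer(protein.upper()):
        motif = match.group(1)
        # The acceptor residue is the fourth residue of the motif.
        sites.append((match.start() + 4, motif))
    return sites


# Hypothetical sequence used only to exercise the function.
example = "MGLRRASLGAAKRRRTPE"
print(find_pka_sites(example))
```

A scan of this kind only narrows down candidates; which sites are actually used depends on the holoenzyme localization and substrate availability described above.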
Casein kinase II is another example of a holoenzyme that consists of catalytic and regulatory subunits. Other kinases that are activated by second messengers are the cGMP-dependent protein kinase and protein kinase C (PKC), which is activated by diacylglycerol, which in turn is produced by phospholipases by cleavage of phosphatidylcholine.
Receptor kinases usually consist of an extracellular domain, which can bind effector molecules (e.g. growth factors and hormones) and transfer the stimulus to the intracellular domain of these proteins, which is usually a protein tyrosine kinase. Other tyrosine kinases lack an extracellular domain but are associated with receptors, which transfer the signal after effector binding by activating the associated protein kinase enzyme (e.g. the Src kinase family: Src, Blk, Fgr, Fyn, Lck, Lyn, Yes; and the Janus kinase family: Jak1-3, Tyk2).
Dysfunction of kinases, e.g. caused by non-functioning regulation, can be the cause of inflammatory diseases and uncontrolled proliferation. v-Src, which is a truncated version of the c-Src proto-oncogene tyrosine kinase, is a classical example of this process, as v-Src does not contain the regulatory domain of the cellular gene and is thus constitutively active.
Several categories of proteins are coded for by clones of the invention within the overall group of "Signal transduction" and include, among others, the following:
Discs-large family: In Drosophila, more than 50 genes have been described in which mutation leads to loss of cell proliferation control, indicating that they are tumor suppressor genes. Most of these genes have mammalian homologs. The Drosophila 'discs large' tumor suppressor protein, Dlg, is the prototype of a family of proteins termed MAGUKs (membrane-associated guanylate kinase homologs). MAGUKs are localized at the membrane-cytoskeleton interface, usually at cell-cell junctions, where they appear to have both structural and signaling roles. They contain several distinct domains, including a modified guanylate kinase domain, an SH3 motif, and 1 or 3 copies of the DHR (GLGF/PDZ) domain. Recessive lethal mutations in the 'discs large' tumor suppressor gene interfere with the formation of septate junctions (thought to be the arthropod equivalent of tight junctions) between epithelial cells, and they also cause neoplastic overgrowth of imaginal discs, suggesting a role for cell junctions in proliferation control. These proteins can find application in modulating/blocking the guanylate cyclase pathway. Clones in this category include: amy2_12d7.
Proteins with a WW Domain: These proteins contain a WW domain, which was originally described as a short conserved region in a number of unrelated proteins, among them dystrophin, the product of the gene responsible for Duchenne muscular dystrophy. The domain, which spans about 35 residues, is repeated up to 4 times in some proteins. It has been shown to bind proteins with particular proline motifs, [AP]-P-P-[AP]-Y, and thus somewhat resembles SH3 domains. This domain is frequently associated with other domains typical of proteins in signal transduction processes. Examples of proteins containing the WW domain are dystrophin, utrophin, vertebrate YAP protein (binds the SH3 domain of the Yes oncoprotein), murine NEDD-4 (embryonic development and differentiation of the central nervous system), and IQGAP (human GTPase activating protein acting on ras). These proteins are therefore expected to be involved in intracellular signal transduction. Diseases associated (as potentially diagnostic, therapeutic, causative, and/or related, etc.) with these proteins include, as reported by OMIM: 1) Muscular Dystrophy, Pseudohypertrophic Progressive, Duchenne and Becker Types (OMIM *310200). Clones in this category include: tes3_11d21.
Ion-Transporters: For signalling, stringent control of ion fluxes across biological membranes is of the essence. Several transmembrane ion-channel proteins are key elements of signal transduction pathways. Clones in this category include: amy2_10p7 and amy2_2f16.
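The proline-rich ligand motif quoted in the WW-domain paragraph, [AP]-P-P-[AP]-Y, translates directly into a regular expression. The sketch below is a hedged illustration: the function name and test sequences are hypothetical, and a motif hit indicates only a candidate WW-domain binding site.

```python
import re
from typing import List, Tuple

# [AP]-P-P-[AP]-Y: Ala or Pro, two prolines, Ala or Pro, then Tyr.
WW_LIGAND = re.compile(r"(?=([AP]PP[AP]Y))")


def find_ww_ligand_motifs(protein: str) -> List[Tuple[int, str]]:
    """Return (1-based start position, matched motif) for every occurrence."""
    return [(m.start() + 1, m.group(1)) for m in WW_LIGAND.finditer(protein.upper())]


# Hypothetical sequences used only to exercise the function.
print(find_ww_ligand_motifs("GSPPPPYTA"))
print(find_ww_ligand_motifs("EAPPAYQ"))
```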
RING-finger proteins: A zinc finger motif of the C3HC4 type (the so-called RING finger domain) is involved in mediating protein-protein interactions. Proteins containing a RING finger are: mammalian V(D)J recombination activating protein (RAG1), mouse rpt-1, human rfp, human 52 kD Ro/SS-A protein and others. The family of RING finger proteins contains a number of oncogenes, for example PML, a probable transcription factor; BRCA1; and the mammalian cbl and bmi-1 proto-oncogenes. Clones in this category include: amy2_10h17.
Phosphatases: Proper targeting of protein tyrosine phosphatases (PTPs) is essential for many cellular signalling events, including antigen-induced proliferative responses of B and T cells. The physiological significance of PTPs is further unveiled through mouse gene knockout studies and human genome sequencing and mapping projects. Several PTPs have been shown to be critical in the pathogenesis of human diseases, as shown by over 210 entries in OMIM. Clones in this category include: tes3_31j20.
Phosphoproteins: Some paraneoplastic syndromes affecting the nervous system are associated with antibodies that react with neuronal proteins and the causal tumor (onconeuronal antigens). Several of these antibodies are markers of specific neurologic syndromes associated with distinct types of cancer. One of the antigens recognised by such antibodies is Ma1, the neuron- and testis-specific protein 1. The expression of Ma1 mRNA is highly restricted to the brain and testis. Subsequent analysis suggested that Ma1 is likely to be a phosphoprotein (see OMIM *604010). Clones in this category include: tes3_5k22.
Transmembrane proteins
Membrane region prediction was effected using the ALOM2 software (Klein et al., 1985, version 2 by K. Nakai). Similar to many other methods, the Kyte & Doolittle (1982) amino acid hydrophobicity scale is used in ALOM2 as the primary variable for classifying sequences in terms of their localization. High prediction accuracy is achieved through the system of intelligent decision rules and the utilization of a carefully selected training data set. The method also generates reliability estimates, which makes it possible to distinguish between membrane-spanning proteins (I, intrinsic) and globular proteins with regions of high hydrophobicity buried in the core.
For a protein of length L, the block of length l with maximum hydrophobicity is found:
Figure imgf000522_0001
where H_i represents the hydrophobicity of an individual residue.
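The search for the most hydrophobic block can be sketched as a sliding-window scan. The Python example below is a minimal illustration and not the ALOM2 implementation: it assumes that the block score is the mean Kyte-Doolittle hydropathy over the window (the equation itself is only available as a figure), and the function name and test sequence are hypothetical.

```python
from typing import Tuple

# Kyte & Doolittle (1982) hydropathy values for the 20 standard amino acids.
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def max_window_hydrophobicity(protein: str, window: int = 17) -> Tuple[float, int]:
    """Return (maxH, 0-based start) of the most hydrophobic block.

    Assumption: maxH is the mean hydropathy of the block, not its sum.
    """
    values = [KYTE_DOOLITTLE[aa] for aa in protein.upper()]
    if len(values) < window:
        raise ValueError("sequence is shorter than the window")
    total = sum(values[:window])
    best_total, best_start = total, 0
    # Slide the window one residue at a time, updating the running sum.
    for start in range(1, len(values) - window + 1):
        total += values[start + window - 1] - values[start - 1]
        if total > best_total:
            best_total, best_start = total, start
    return best_total / window, best_start


# Hypothetical sequence: a 17-residue leucine stretch flanked by charged residues.
max_h, start = max_window_hydrophobicity("D" * 5 + "L" * 17 + "K" * 5)
print(round(max_h, 2), start)
```

The window length of 17 follows the example given in the text; other lengths can be passed through the `window` parameter.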
Let P(I/maxH) and P(E/maxH) be the conditional probabilities that a protein is integral or peripheral, respectively, given its value of maximal hydrophobicity maxH, and let P(I) and P(E) be the prior probabilities of intrinsic and extrinsic membrane proteins estimated from the training set. Then a sequence is assigned to E if
P(E/maxH) > P(I/maxH)
or, after applying the Bayes rule,
P(E)P(maxH/E) > P(I)P(maxH/I),
where the conditional probabilities P(maxH/E) and P(maxH/I) can be determined based on the estimates of probability distributions of maxH in both groups.
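The step from the comparison of posteriors to the product form can be written out explicitly. The block below is a sketch in the symbols of the text, with the conditional bar written as | rather than /; P(maxH), the marginal probability of the observed maximal hydrophobicity, is the only symbol not named in the text, and it cancels because it is positive and common to both sides.

```latex
\begin{gather*}
P(E \mid \mathrm{maxH}) = \frac{P(E)\,P(\mathrm{maxH} \mid E)}{P(\mathrm{maxH})},
\qquad
P(I \mid \mathrm{maxH}) = \frac{P(I)\,P(\mathrm{maxH} \mid I)}{P(\mathrm{maxH})},
\\[4pt]
P(E \mid \mathrm{maxH}) > P(I \mid \mathrm{maxH})
\;\Longleftrightarrow\;
P(E)\,P(\mathrm{maxH} \mid E) > P(I)\,P(\mathrm{maxH} \mid I).
\end{gather*}
```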
Discriminant analysis simplifies this task by calculating the odds P(E/maxH):P(I/maxH) as e^b, where b is the left-hand side of a linear or quadratic inequality. For example, for the window of length 17, the protein is allocated to the peripheral category E based on the empirically derived quadratic inequality:
1.05(maxH)² + 12.30 maxH + 17.41 > 0,
whereas the optimal inequality for assigning membrane proteins (category I) is linear:
-9.02 maxH + 14.27 > 0
The odds parameter can be made more or less stringent. For example, one can require odds of at least 1:10 for a protein to be classified as integral. This leads to higher selectivity but lower sensitivity.
The boundaries of membrane-spanning regions in putative membrane proteins are detected by means of an iterative procedure whereby the most hydrophobic region corresponding to the value maxH is considered to be membrane and removed from the sequence. The classification procedure is then repeated for the remaining sequence and, if such a protein is again classified as integral, the next most hydrophobic region is considered.
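The iterative boundary detection can be sketched as a loop that repeatedly picks the most hydrophobic remaining block. The Python example below is a hedged illustration, not the ALOM2 code. Its assumptions are that maxH is the mean Kyte-Doolittle hydropathy of a block, that "removed from the sequence" is modelled by masking the accepted block so that later blocks cannot overlap it, and that the integral/peripheral decision is supplied by the caller as `is_integral`; the cut-off used in the usage lines is a hypothetical value, not the discriminant given in the text.

```python
from typing import Callable, List, Tuple

# Kyte & Doolittle (1982) hydropathy values.
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def predict_membrane_segments(
    protein: str,
    is_integral: Callable[[float], bool],
    window: int = 17,
) -> List[Tuple[int, int]]:
    """Return 1-based (start, end) positions of putative membrane segments."""
    values = [KYTE_DOOLITTLE[aa] for aa in protein.upper()]
    masked = [False] * len(values)  # residues already assigned to a segment
    segments = []
    while True:
        best = None  # (maxH, start) of the best block not touching a mask
        for start in range(len(values) - window + 1):
            if any(masked[start:start + window]):
                continue
            mean_h = sum(values[start:start + window]) / window
            if best is None or mean_h > best[0]:
                best = (mean_h, start)
        # Stop when no block is left or the remainder is no longer integral.
        if best is None or not is_integral(best[0]):
            break
        start = best[1]
        segments.append((start + 1, start + window))
        for i in range(start, start + window):
            masked[i] = True
    return sorted(segments)


# Hypothetical sequence with two hydrophobic stretches and a hypothetical cut-off.
sequence = "K" * 10 + "L" * 17 + "D" * 20 + "I" * 17 + "E" * 10
print(predict_membrane_segments(sequence, lambda max_h: max_h > 1.6))
```

Passing the decision rule as a function keeps the loop independent of the particular discriminant and of the chosen odds stringency.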
Reference: Klein, P., Kanehisa, M., DeLisi, C. (1985) The detection and classification of membrane-spanning proteins. Biochim Biophys Acta 815: 468-476
Transcription factors
Purified eukaryotic RNA polymerase II (RNAPII) is unable to initiate promoter-specific transcription. A family of factors that collectively confer RNAPII promoter specificity is known as the general transcription factors (GTFs). They include the TATA-binding protein (TBP), TFIIB, TFIIE, TFIIF and TFIIH. These factors are conserved among all eukaryotes.
RNAPII complexes containing the entire set of GTFs, or a subset of GTFs together with other proteins, have been isolated from mammalian and yeast cells. Although purified RNAPII and GTFs are sufficient for promoter-specific initiation, this system fails to respond to activators. The response to activators is mediated by a further complex, termed the mediator complex, which associates with the carboxy-terminal heptapeptide domain (CTD) of the largest subunit of RNAPII.
Purification of human RNAPII complexes resulted in two distinct forms of human RNAPII after analysis of functional properties. One complex contained chromatin remodeling activities but was devoid of GTFs. The other complex did not contain factors that modify chromatin but contained a subset of SRB/mediator subunits and GTFs and other polypeptides that mediate transcriptional activation, a scenario similar to that reported for yeast.
A complex designated NAT (~20 subunits), for negative regulator of transcription, contains RNAPII, Cdk8, homologs of the yeast mediator complex, as well as Rgr1 and Srb10/11, known as negative regulators of transcription.
A complex with strikingly similar structural and functional properties to NAT has been identified, designated SMCC (~15 subunits) (SRB/mediator coactivator complex), that can also mediate transcriptional activation.
The SMCC complex includes all reported NAT subunits, including subunits of the TRAP complex. TRAP is a coactivator complex isolated on the basis of its interaction with the thyroid hormone receptor. Another coactivator complex, DRIP, isolated on the basis of its ability to interact with the vitamin D3 receptor, contains novel subunits as well as subunits of NAT/SMCC and TRAP complexes.
The effect of each of these coactivator complexes is dependent on the TFIID complex. It is not known if the TAF subunits of TFIID are required. It is likely that new coactivator complexes will be uncovered containing both novel and previously defined components.
Besides the large number of transcription factors that can be part of the RNAPII holoenzyme or the coactivator complexes, there is an even larger number of specific transcription factors that bind to promoter elements within the DNA sequence of a given gene, leading to activation or repression of transcription. A broad range of cellular responses, like differentiation, proliferation, cell death and others, are elicited through activating or repressing the transcription of target genes.
There are at least five superclasses of transcription factors:
1. Superclass contains members with characteristic basic domains:
Members are:
Leucine zipper factors, where the basic domain is followed by a leucine zipper of repeated leucine residues at every seventh position. The zipper mediates protein dimerization as a prerequisite for DNA-binding.
Helix-loop-helix factors (bHLH) contain a DNA-binding basic region followed by a motif of two potential amphipathic alpha-helices connected by a loop of variable length, also mediating dimerization.
Factors with a combination of helix-loop-helix and leucine zipper.
Further members of this superclass are NF-1, RF-X, and bHSH-like proteins.
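The leucine zipper description above (a leucine at every seventh position) can be illustrated with a simple pattern scan. The Python sketch below is an assumption-laden illustration: it requires four leucines spaced seven residues apart, which is a common heuristic rather than a criterion stated in the text, and the function name and sequence are hypothetical. Such a pattern flags candidates only; it does not show that a region actually forms a zipper.

```python
import re
from typing import List

# Four leucines, each separated by six arbitrary residues (heptad spacing).
HEPTAD_LEUCINES = re.compile(r"(?=(L.{6}L.{6}L.{6}L))")


def find_heptad_leucine_repeats(protein: str) -> List[int]:
    """Return 1-based start positions of candidate leucine heptad repeats."""
    return [m.start() + 1 for m in HEPTAD_LEUCINES.finditer(protein.upper())]


# Hypothetical sequence: leucines at positions 3, 10, 17 and 24.
print(find_heptad_leucine_repeats("MK" + "LAEEVKR" * 4))
```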
2. Superclass comprises factors containing zinc-coordinating DNA-binding domains.
Members are:
Proteins with Cys4 zinc fingers of the nuclear receptor type, where two such motifs differing in size, composition and function are present in each receptor molecule. Each finger comprises 4 cysteine residues coordinating one zinc ion. The second half, including the second cysteine pair, has alpha-helix conformation, and the helix of the first finger binds to the DNA through the major groove. The sequence between the first two cysteines of the second finger mediates dimerization upon DNA-binding. This class includes the steroid hormone receptors and the thyroid hormone receptor-like factors. Other diverse Cys4 zinc fingers have a motif of the GATA type.
Proteins with Cys2His2 zinc finger domain(s). Each finger comprises 2 cysteine and 2 histidine residues coordinating one zinc ion, and in some cases one histidine is replaced by another cysteine. The zinc ion is essential for DNA-binding.
Proteins with Cys6 cysteine-zinc cluster(s). Six cysteine residues coordinate two zinc ions, i.e. two of the thiol groups are coordinating two zinc ions each. Present in many fungal regulators.
Zinc fingers of alternating composition.
3. Superclass contains factors of helix-turn-helix type.
Members are:
Proteins with homeo domains. Homeo domains are three consecutive alpha-helix structures. Helix 3 mainly contacts the major groove of the DNA; some contacts at the minor groove are observed as well. Helices 2 and 3 resemble the helix-turn-helix structure of prokaryotic regulators.
Proteins with Paired box domain(s). This is a DNA-binding domain of approximately 130 amino acid residues. Its N-terminal half is basic; its C-terminal half is highly charged in general. It probably comprises 3 alpha-helices.
Proteins with Fork head / winged helix domain(s). This domain was identified by homology between HNF-3A and fkh. The domain comprises approx. 110 amino acids. Analysis of the crystal structure has revealed a compact structure of three alpha-helices, the third alpha-helix being exposed towards the major groove of the DNA. The domain also makes minor groove contacts. Upon binding to DNA, it induces a bend of 13 degrees.
Heat shock factors
Proteins with tryptophan clusters. The tryptophan clusters comprise several tryptophan residues with a spacing of 12-21 amino acid residues; the subclass of myb-type DNA-binding domains typically exhibits a spacing of 11-21 amino acid residues.
Proteins with TEA domain(s). The TEA domain has been identified as a region that is conserved among the transcription factors TEF-1, TEC1 and abaA. This domain in TEF-1 has been shown to interact with DNA, although two additional regions may also contribute to DNA-binding. It is predicted to fold into three alpha-helices, with a randomly coiled region of 16-18 amino acid residues between helices 1 and 2, and a short stretch of 3-6 residues between helices 2 and 3.
4. Superclass contains beta-scaffold factors with minor groove contacts.
Members are:
Proteins with RHR (Rel homology) region.
The structure of the Rel-type DBD exhibits a bipartite subdomain structure, each subdomain comprising a beta-barrel with five loops that form an extensive contact surface to the major groove of the DNA. In particular, the first loop of the N-terminal subdomain (the highly conserved recognition loop) makes contacts with the recognition element on the DNA, but other loops are also involved. The fact that the main DNA contacts are made through loops has been suggested to provide a high degree of flexibility in binding to a range of different target sequences. Augmenting interactions are achieved by two alpha-helices within the N-terminal part that form strong minor groove contacts to the A/T-rich center of the κB-element. In p65, the sequence between both alpha-helices is much shorter and even helix 2 is truncated. The second, C-terminal domain is necessary mainly for protein dimerization.
p53 proteins
MADS (MCM1-agamous-deficiens-SRF) box proteins. Proteins of this class comprise a region of homology. The DNA-binding domain also comprises the dimerization capability. In the DNA-bound dimer (shown for SRF), two antiparallel amphipathic alpha-helices (alpha-I) form a coiled coil and are oriented approximately parallel on the minor groove. These helices make minor and major groove contacts; the N-terminal extensions form minor groove contacts. The bound DNA is bent and wrapped around the protein. It exhibits a compressed minor groove in the center and a widened minor groove in the flanks.
Beta-barrel alpha-helix transcription factors.
TATA-binding proteins
HMG proteins
Proteins of this class comprise a region of homology with the chromosomal non-histone HMG proteins such as HMG1. This region comprises the DNA-binding domain, which in some instances, such as HMG1, mediates sequence-unspecific binding and in other cases, such as LEF-1, sequence-specific binding to DNA. This domain exhibits a typical L-shaped conformation made up of 3 alpha-helices and an extended N-terminal extension of the first helix. The latter, together with helix 3, which contains a kink, forms the long arm of the L, whereas helices 1 and 2 form the short arm. Binding to the minor groove induces a sharp bending of the DNA by more than 90 degrees, away from the bound protein. The overall topology of the DNA-protein complexes somewhat resembles that of the TBP-TATA box complex.
Heteromeric CCAAT factors
Proteins with Grainyhead domain(s)
Cold-shock domain factors. Cold-shock domain (CSD) proteins are characterized by a highly conserved region first found in prokaryotic cold-shock proteins. This domain is a single-stranded nucleic acid-binding structure interacting with DNA or RNA. It consists of an antiparallel five-stranded beta-barrel, the strands of which are connected by turns and loops. Within this structure, a three-stranded beta-sheet contains a conserved RNA-binding motif, RNP1. Not all CSD proteins are transcription factors. Those which specifically bind to a certain sequence are termed Y-box proteins. Proteins of this class were previously called protamine-like domain proteins because they have a highly positively charged domain with interspersed proline residues.
Proteins with Runt homology domain
The members of this transcription factor class have been identified on the basis of their homology to a defined region within the Drosophila protein Runt. The runt domain is part of the DNA-binding domain of these factors. It consists mainly of beta-strands, does not contain alpha-helical regions and seems to be most similar to the palm domain found in DNA polymerase beta (rat).
5. Superclass contains other transcription factors like copper fist proteins, HMGI(Y), STAT, pocket domain proteins and AP2/EREBP-related factors.
The classification of transcription factors originates from the TRANSFAC database:
http://transfac.gbf.de/TRANSFAC/
Reference: Heinemeyer
Several categories of proteins are coded for by clones of the invention within the overall group of "Transcription Factors" and include, among others, the following:
Homeobox proteins: Homeodomain-containing transcription factors are essential for a variety of processes in vertebrate development, including organogenesis. They have been shown to regulate cell proliferation, pattern segmental identity and determine cell fate decisions during embryogenesis. For example, in zebrafish, emx2 mRNAs are found in the dorsal telencephalon, parts of the diencephalon and the otocyst. The human homologue Emx2 appears to be already expressed in 6.5-day embryos. It is also expressed in the presumptive cerebral cortex, olfactory bulbs, and in some neuroectodermal areas in the embryonic head, including olfactory placodes in earlier stages and olfactory epithelia later in development. Mutants of the D. melanogaster gene "empty spiracles" display spiracles devoid of filzkörper, no antenna and an open head. Clones in this category include: amy2_14m16.
Proteins with myc-type, helix-loop-helix dimerization domain signature(s). This helix-loop-helix domain, which mediates protein dimerization, has been found in various multimeric transcription factors. Clones in this category include: tes3_18n14.
Transcriptional silencers: In addition to transcription factors, other proteins, such as YDL153c of Saccharomyces cerevisiae, are responsible for silencing of genes. Clones in this category include: amy2_2f22.
Proteins regulating transcription factors: The activity of several transcription factors is regulated by the binding or dissociation of other proteins or by phosphorylation or dephosphorylation of the transcription factor. For example, I-kappa-B-related protein interacts with the transcription factor NF-κB. I-kappa-B-alpha mutations contribute to constitutive NF-kappaB activity in cultured and primary HRS (Hodgkin/Reed-Sternberg) cells and are therefore involved in the pathogenesis of Hodgkin's disease (HD). Clones in this category include: amy2_1c12.
Signal transducing proteins: Beta-transducin subunits of G-proteins contain WD-40 repeats. The beta subunits seem to be required for the replacement of GDP by GTP as well as for membrane anchoring and receptor recognition. Due to the zinc finger, the novel protein seems to be a new molecule involved in signal transduction and transcription. These proteins have been reported by OMIM to be associated (as potentially diagnostic, therapeutic, causative, and/or related, etc.) with the following diseases: 1) essential hypertension (OMIM *139130). Clones in this category include: tes3_11c22.
* * *
The invention, therefore, specifically contemplates the following assemblages of materials, which track the above-identified fourteen functional groupings, that are useful in practicing the profiling aspects of the invention. One type of assemblage is nucleic acid-based and can include the following groupings of sequences and their derivatives: all sequences, human fetal brain sequences, brain derived sequences, human fetal kidney library sequences, kidney derived sequences, human mammary carcinoma library sequences, mammary carcinoma derived sequences, human testis library sequences, testes derived sequences, cell cycle genes, cell structure and motility genes, differentiation and development genes, intracellular transport and trafficking genes, metabolism genes, nucleic acid management genes, signal transduction genes, transmembrane protein genes, and transcription factor genes. Other assemblages contain proteins or their corresponding antibodies or antibody fragments, divided along the same groupings.
Database Applications
Because they are human genes and gene products, the inventive molecules are useful as members of a database. Such a database may be used, for example, in drug discovery and rational drug design or in testing the novelty and non-obviousness of newly sequenced materials. In addition, they are particularly suited to designing variants for the profiling (and other) applications described herein. Hence, the following discussion of electronic embodiments applies equally to such variants, which, naturally, will be generated and stored using a computer using known methodologies.
Accordingly, one aspect of the invention contemplates a database of at least one of the inventive sequences stored on computer readable media. Again, the individual sequences may be grouped with regard to the individual functional and structural groups mentioned above. While the individual sequences of a database may exist in printed form, they are preferably in electronic form, as in an ASCII or a text file. They may also exist as word processing files or they may be stored in database applications like DB2, Sybase, Oracle, GCG and GenBank. One skilled in the art will understand the range of applications suitable for using and storing the electronic embodiments of the invention.
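One way such a searchable, group-indexed store could look is sketched below with Python's built-in sqlite3 module. This is a minimal illustration, not a description of any database actually used for the inventive sequences: the table layout, column names and function names are hypothetical, the clone identifiers are taken from the groupings above, and the nucleotide strings are placeholders rather than real sequence data.

```python
import sqlite3
from typing import Iterable, List, Tuple

Record = Tuple[str, str, str, str]  # clone_id, library, functional_group, sequence


def build_database(records: Iterable[Record]) -> sqlite3.Connection:
    """Create an in-memory table of sequences keyed by clone identifier."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE sequences ("
        "clone_id TEXT PRIMARY KEY, "
        "library TEXT NOT NULL, "
        "functional_group TEXT NOT NULL, "
        "sequence TEXT NOT NULL)"
    )
    conn.executemany("INSERT INTO sequences VALUES (?, ?, ?, ?)", records)
    conn.commit()
    return conn


def clones_in_group(conn: sqlite3.Connection, group: str) -> List[str]:
    """List clone identifiers assigned to one functional grouping."""
    rows = conn.execute(
        "SELECT clone_id FROM sequences WHERE functional_group = ? ORDER BY clone_id",
        (group,),
    )
    return [row[0] for row in rows]


def clones_containing(conn: sqlite3.Connection, fragment: str) -> List[str]:
    """List clone identifiers whose stored sequence contains the fragment."""
    rows = conn.execute(
        "SELECT clone_id FROM sequences WHERE instr(sequence, ?) > 0 ORDER BY clone_id",
        (fragment.upper(),),
    )
    return [row[0] for row in rows]


# Placeholder sequences only; they do not correspond to the actual clones.
RECORDS = [
    ("amy2_12d7", "amy2", "signal transduction", "ATGGCCAAGCTGACC"),
    ("tes3_31j20", "tes3", "signal transduction", "ATGTTTCCGGAAGCT"),
    ("amy2_2f22", "amy2", "transcription factor", "ATGGAGCTGACCTGA"),
]

db = build_database(RECORDS)
print(clones_in_group(db, "signal transduction"))
print(clones_containing(db, "gctgacc"))
```

Exact substring matching is used here only for brevity; similarity searching would normally be delegated to dedicated sequence comparison tools.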
"Computer readable media" refers to any medium which can be read and accessed by a computer- These include: magnetic storage mediai like floppy discsi hard drives and magnetic tapei optical storage mediai like CD-ROMi electrical storage mediai like RAM and ROMi and hybrids of these categoriesi like magnetic/optical storage media- One skilled in the art will readily understand the scope of computer readable media and how to implement them-
Biological Activities and Assays for Implementing Therapeutic and Diagnostic Applications
This section provides assays for biological activity that are useful in characterizing and quantifying the biological activity of the inventive molecules and their derivatives, which is relevant to the pharmacological effects of the inventive molecules. As used in this section, it will be understood that "protein" may also refer to the inventive antibodies (including fragments).
Cytokine and Cell Proliferation/Differentiation Activity
A protein of the present invention may exhibit cytokine, cell proliferation (either inducing or inhibiting) or cell differentiation (either inducing or inhibiting) activity or may induce production of other cytokines in certain cell populations. Many protein factors discovered to date, including all known cytokines, have exhibited activity in one or more factor dependent cell proliferation assays, and hence the assays serve as a convenient confirmation of cytokine activity. The activity of a protein of the present invention is evidenced by any one of a number of routine factor dependent cell proliferation assays for cell lines including, without limitation, 32D, DA2, DA1G, T10, B9, B9/11, BaF3, MC9/G, M+ (preB M+), 2E8, RB5, DA1, 123, T1165, HT2, CTLL2, TF-1, Mo7e and CMK.
The activity of a protein of the invention may, among other means, be measured by the following methods:
Assays for T-cell or thymocyte proliferation include, without limitation, those described in: Current Protocols in Immunology, Ed. by J. E. Coligan, A. M. Kruisbeek, D. H. Margulies, E. M. Shevach, W. Strober, Pub. Greene Publishing Associates and Wiley-Interscience (Chapter 3, In Vitro assays for Mouse Lymphocyte Function 3.1-3.19; Chapter 7, Immunologic studies in Humans); Takai et al., J. Immunol. 137:3494-3500, 1986; Bertagnolli et al., J. Immunol. 145:1706-1712, 1990; Bertagnolli et al., Cellular Immunology 133:327-341, 1991; Bertagnolli et al., J. Immunol. 149:3778-3783, 1992; Bowman et al., J. Immunol.
Figure imgf000532_0002
Assays for cytokine production and/or proliferation of spleen cells, lymph node cells or thymocytes include, without limitation, those described in: Polyclonal T cell stimulation, Kruisbeek, A. M. and Shevach, E. M. In Current Protocols in Immunology. J. E. e.a. Coligan eds. Vol 1 pp. 3.12.1-3.12.14, John Wiley and Sons, Toronto. 1994; and Measurement of mouse and human interferon gamma, Schreiber, R. D. In Current Protocols in Immunology. J. E. e.a. Coligan eds. Vol 1 pp. 6.8.1-6.8.8, John Wiley and Sons, Toronto. 1994.
Assays for proliferation and differentiation of hematopoietic and lymphopoietic cells include, without limitation, those described in: Measurement of Human and Murine Interleukin 2 and Interleukin 4, Bottomly, K., Davis, L. S. and Lipsky, P. E. In Current Protocols in Immunology. J. E. e.a. Coligan eds. Vol 1 pp. 6.3.1-6.3.12, John Wiley and Sons, Toronto. 1991; deVries et al., J. Exp. Med. 173:1205-1211, 1991; Moreau et al., Nature 336:690-692, 1988; Greenberger et al., Proc. Natl. Acad. Sci. U.S.A. 80:2931-2938, 1983; Measurement of mouse and human interleukin 6, Nordan, R. In Current Protocols in Immunology. J. E. e.a. Coligan eds. Vol 1 pp. 6.6.1-6.6.5, John Wiley and Sons, Toronto. 1991; Smith et al., Proc. Natl. Acad. Sci. U.S.A. 83:1857-1861, 1986; Measurement of human Interleukin 11, Bennett, F., Giannotti, J., Clark, S. C. and Turner, K. J. In Current Protocols in Immunology. J. E. e.a. Coligan eds. Vol 1 pp. 6.15.1, John Wiley and Sons, Toronto. 1991; Measurement of mouse and human Interleukin 9, Ciarletta, A., Giannotti, J., Clark, S. C. and Turner, K. J. In Current Protocols in Immunology. J. E. e.a. Coligan eds. Vol 1 pp. 6.13.1, John Wiley and Sons, Toronto. 1991.
Assays for T-cell clone responses to antigens (which will identify, among others, proteins that affect APC-T cell interactions as well as direct T-cell effects by measuring proliferation and cytokine production) include, without limitation, those described in: Current Protocols in Immunology, Ed. by J. E. Coligan, A. M. Kruisbeek, D. H. Margulies, E. M. Shevach, W. Strober, Pub. Greene Publishing Associates and Wiley-Interscience (Chapter 3, In Vitro assays for Mouse Lymphocyte Function; Chapter 6, Cytokines and their cellular receptors; Chapter 7, Immunologic studies in Humans); Weinberger et al., Proc. Natl. Acad. Sci. USA 77:6091-6095, 1980; Weinberger et al., Eur. J. Immun. 11:405-411, 1981; Takai et al., J. Immunol. 137:3494-3500, 1986; Takai et al., J. Immunol. 140:508-512, 1988.
Immune Stimulating or Suppressing Activity
A protein of the present invention may also exhibit immune stimulating or immune suppressing activity, including without limitation the activities for which assays are described herein. A protein may be useful in the treatment of various immune deficiencies and disorders (including severe combined immunodeficiency (SCID)), e.g., in regulating (up or down) growth and proliferation of T and/or B lymphocytes, as well as effecting the cytolytic activity of NK cells and other cell populations. These immune deficiencies may be genetic or be caused by viral (e.g., HIV) as well as bacterial or fungal infections, or may result from autoimmune disorders. More specifically, infectious diseases caused by viral, bacterial, fungal or other infection may be treatable using a protein of the present invention, including infections by HIV, hepatitis viruses, herpesviruses, mycobacteria, Leishmania spp., malaria spp. and various fungal infections such as candidiasis. Of course, in this regard, a protein of the present invention may also be useful where a boost to the immune system generally may be desirable, i.e., in the treatment of cancer. Autoimmune disorders which may be treated using a protein of the present invention include, for example, connective tissue disease, multiple sclerosis, systemic lupus erythematosus, rheumatoid arthritis, autoimmune pulmonary inflammation, Guillain-Barre syndrome, autoimmune thyroiditis, insulin dependent diabetes mellitus, myasthenia gravis, graft-versus-host disease and autoimmune inflammatory eye disease. Such a protein of the present invention may also be useful in the treatment of allergic reactions and conditions, such as asthma (particularly allergic asthma) or other respiratory problems. Other conditions, in which immune suppression is desired (including, for example, organ transplantation), may also be treatable using a protein of the present invention.
Using the proteins of the invention it may also be possible to modify immune responses in a number of ways. Down regulation may be in the form of inhibiting or blocking an immune response already in progress or may involve preventing the induction of an immune response. The functions of activated T cells may be inhibited by suppressing T cell responses or by inducing specific tolerance in T cells, or both. Immunosuppression of T cell responses is generally an active, non-antigen-specific process which requires continuous exposure of the T cells to the suppressive agent. Tolerance, which involves inducing non-responsiveness or anergy in T cells, is distinguishable from immunosuppression in that it is generally antigen-specific and persists after exposure to the tolerizing agent has ceased. Operationally, tolerance can be demonstrated by the lack of a T cell response upon reexposure to specific antigen in the absence of the tolerizing agent. Down regulating or preventing one or more antigen functions (including without limitation B lymphocyte antigen functions (such as, for example, B7)), e.g., preventing high level lymphokine synthesis by activated T cells, will be useful in situations of tissue, skin and organ transplantation and in graft-versus-host disease (GVHD). For example, blockage of T cell function should result in reduced tissue destruction in tissue transplantation. Typically, in tissue transplants, rejection of the transplant is initiated through its recognition as foreign by T cells, followed by an immune reaction that destroys the transplant.
The administration of a molecule which inhibits or blocks interaction of a B7 lymphocyte antigen with its natural ligand(s) on immune cells (such as a soluble, monomeric form of a peptide having B7-2 activity alone or in conjunction with a monomeric form of a peptide having an activity of another B lymphocyte antigen (e.g., B7-1, B7-3) or blocking antibody), prior to transplantation can lead to the binding of the molecule to the natural ligand(s) on the immune cells without transmitting the corresponding costimulatory signal. Blocking B lymphocyte antigen function in this manner prevents cytokine synthesis by immune cells, such as T cells, and thus acts as an immunosuppressant. Moreover, the lack of costimulation may also be sufficient to anergize the T cells, thereby inducing tolerance in a subject. Induction of long-term tolerance by B lymphocyte antigen-blocking reagents may avoid the necessity of repeated administration of these blocking reagents. To achieve sufficient immunosuppression or tolerance in a subject, it may also be necessary to block the function of a combination of B lymphocyte antigens.
The efficacy of particular blocking reagents in preventing organ transplant rejection or GVHD can be assessed using animal models that are predictive of efficacy in humans. Examples of appropriate systems which can be used include allogeneic cardiac grafts in rats and xenogeneic pancreatic islet cell grafts in mice, both of which have been used to examine the immunosuppressive effects of CTLA4Ig fusion proteins in vivo as described in Lenschow et al., Science 257:789-792 (1992) and Turka et al., Proc. Natl. Acad. Sci. USA 89:11102-11105 (1992). In addition, murine models of GVHD (see Paul ed., Fundamental Immunology, Raven Press, New York, 1989, pp. 846-847) can be used to determine the effect of blocking B lymphocyte antigen function in vivo on the development of that disease.
Blocking antigen function may also be therapeutically useful for treating autoimmune diseases. Many autoimmune disorders are the result of inappropriate activation of T cells that are reactive against self tissue and which promote the production of cytokines and autoantibodies involved in the pathology of the diseases. Preventing the activation of autoreactive T cells may reduce or eliminate disease symptoms. Administration of reagents which block costimulation of T cells by disrupting receptor:ligand interactions of B lymphocyte antigens can be used to inhibit T cell activation and prevent production of autoantibodies or T cell-derived cytokines which may be involved in the disease process. Additionally, blocking reagents may induce antigen-specific tolerance of autoreactive T cells which could lead to long-term relief from the disease. The efficacy of blocking reagents in preventing or alleviating autoimmune disorders can be determined using a number of well-characterized animal models of human autoimmune diseases. Examples include murine experimental autoimmune encephalitis, systemic lupus erythematosus in MRL/lpr/lpr mice or NZB hybrid mice, murine autoimmune collagen arthritis, diabetes mellitus in NOD mice and BB rats, and murine experimental myasthenia gravis (see Paul ed., Fundamental Immunology, Raven Press, New York, 1989, pp. 840-856).
Upregulation of an antigen function (preferably a B lymphocyte antigen function), as a means of up regulating immune responses, may also be useful in therapy. Upregulation of immune responses may be in the form of enhancing an existing immune response or eliciting an initial immune response. For example, enhancing an immune response through stimulating B lymphocyte antigen function may be useful in cases of viral infection. In addition, systemic viral diseases such as influenza, the common cold, and encephalitis might be alleviated by the administration of stimulatory forms of B lymphocyte antigens systemically.
Alternatively, anti-viral immune responses may be enhanced in an infected patient by removing T cells from the patient, costimulating the T cells in vitro with viral antigen-pulsed APCs either expressing a peptide of the present invention or together with a stimulatory form of a soluble peptide of the present invention and reintroducing the in vitro activated T cells into the patient. Another method of enhancing anti-viral immune responses would be to isolate infected cells from a patient, transfect them with a nucleic acid encoding a protein of the present invention as described herein such that the cells express all or a portion of the protein on their surface, and reintroduce the transfected cells into the patient. The infected cells would now be capable of delivering a costimulatory signal to, and thereby activating, T cells in vivo.
In another application, up regulation or enhancement of antigen function (preferably B lymphocyte antigen function) may be useful in the induction of tumor immunity. Tumor cells (e.g., sarcoma, melanoma, lymphoma, leukemia, neuroblastoma, carcinoma) transfected with a nucleic acid encoding at least one peptide of the present invention can be administered to a subject to overcome tumor-specific tolerance in the subject. If desired, the tumor cell can be transfected to express a combination of peptides. For example, tumor cells obtained from a patient can be transfected ex vivo with an expression vector directing the expression of a peptide having B7-2-like activity alone, or in conjunction with a peptide having B7-1-like activity and/or B7-3-like activity. The transfected tumor cells are returned to the patient to result in expression of the peptides on the surface of the transfected cell. Alternatively, gene therapy techniques can be used to target a tumor cell for transfection in vivo.
The presence of the peptide of the present invention having the activity of a B lymphocyte antigen(s) on the surface of the tumor cell provides the necessary costimulation signal to T cells to induce a T cell mediated immune response against the transfected tumor cells. In addition, tumor cells which lack MHC class I or MHC class II molecules, or which fail to reexpress sufficient amounts of MHC class I or MHC class II molecules, can be transfected with nucleic acid encoding all or a portion (e.g., a cytoplasmic-domain truncated portion) of an MHC class I alpha chain protein and beta 2 microglobulin protein or an MHC class II alpha chain protein and an MHC class II beta chain protein to thereby express MHC class I or MHC class II proteins on the cell surface. Expression of the appropriate class I or class II MHC in conjunction with a peptide having the activity of a B lymphocyte antigen (e.g., B7-1, B7-2, B7-3) induces a T cell mediated immune response against the transfected tumor cell. Optionally, a gene encoding an antisense construct which blocks expression of an MHC class II associated protein, such as the invariant chain, can also be cotransfected with a DNA encoding a peptide having the activity of a B lymphocyte antigen to promote presentation of tumor associated antigens and induce tumor specific immunity. Thus, the induction of a T cell mediated immune response in a human subject may be sufficient to overcome tumor-specific tolerance in the subject.
The activity of a protein of the invention may, among other means, be measured by the following methods:
Suitable assays for thymocyte or splenocyte cytotoxicity include, without limitation, those described in: Current Protocols in Immunology, Ed. by J. E. Coligan, A. M. Kruisbeek, D. H. Margulies, E. M. Shevach, W. Strober, Pub. Greene Publishing Associates and Wiley-Interscience (Chapter 3, In Vitro assays for Mouse Lymphocyte Function 3.1-3.19; Chapter 7, Immunologic studies in Humans); Herrmann et al., Proc. Natl. Acad. Sci. USA 78:2488-2492, 1981; Herrmann et al., J. Immunol. 128:1968-1974, 1982; Handa et al., J. Immunol. 135:1564-1572, 1985; Takai et al., J. Immunol. 137:3494-3500, 1986; Takai et al., J. Immunol. 140:508-512, 1988; Bowman et al., J. Virology 61:1992-1998; Bertagnolli et al., Cellular Immunology 133:327-341, 1991; Brown et al., J. Immunol. 153:3079-3092, 1994.
Assays for T-cell-dependent immunoglobulin responses and isotype switching (which will identify, among others, proteins that modulate T-cell dependent antibody responses and that affect Th1/Th2 profiles) include, without limitation, those described in: Maliszewski, J. Immunol. 144:3028-3033, 1990; and Assays for B cell function: In vitro antibody production, Mond, J. J. and Brunswick, M. In Current Protocols in Immunology. J. E. e.a. Coligan eds. Vol 1 pp. 3.8.1-3.8.16, John Wiley and Sons, Toronto. 1994.
Mixed lymphocyte reaction (MLR) assays (which will identify, among others, proteins that generate predominantly Th1 and CTL responses) include, without limitation, those described in: Current Protocols in Immunology, Ed. by J. E. Coligan, A. M. Kruisbeek, D. H. Margulies, E. M. Shevach, W. Strober, Pub. Greene Publishing Associates and Wiley-Interscience (Chapter 3, In Vitro assays for Mouse Lymphocyte Function 3.1-3.19; Chapter 7, Immunologic studies in Humans); Takai et al., J. Immunol. 137:3494-3500, 1986; Takai et al., J. Immunol. 140:508-512, 1988; Bertagnolli et al., J. Immunol. 149:3778-3783, 1992.
Dendritic cell-dependent assays (which will identify, among others, proteins expressed by dendritic cells that activate naive T-cells) include, without limitation, those described in: Guery et al., J. Immunol. 134:536-544, 1995; Inaba et al., Journal of Experimental Medicine 173:549-559, 1991; Macatonia et al., Journal of Immunology 154:5071-5079, 1995; Porgador et al., Journal of Experimental Medicine 182:255-260, 1995; Nair et al., Journal of Virology 67:4062-4069, 1993; Huang et al., Science 264:961-965, 1994; Macatonia et al., Journal of Experimental Medicine 169:1255-1264, 1989; Bhardwaj et al., Journal of Clinical Investigation 94:797-807, 1994; and Inaba et al., Journal of Experimental Medicine 172:631-640, 1990.
Assays for lymphocyte survival/apoptosis (which will identify, among others, proteins that prevent apoptosis after superantigen induction and proteins that regulate lymphocyte homeostasis) include, without limitation, those described in: Darzynkiewicz et al., Cytometry 13:795-808, 1992; Gorczyca et al., Leukemia 7:659-670, 1993; Gorczyca et al., Cancer Research 53:1945-1951, 1993; Itoh et al., Cell 66:233-243, 1991; Zacharchuk, Journal of Immunology 145:4037-4045, 1990; Zamai et al., Cytometry 14:891-897, 1993; Gorczyca et al., International Journal of Oncology 1:639-648, 1992.
Assays for proteins that influence early steps of T-cell commitment and development include, without limitation, those described in: Antica et al., Blood 84:111-117, 1994; Fine et al., Cellular Immunology 155:111-122, 1994; Galy et al., Blood 85:2770-2778, 1995; Toki et al., Proc. Nat. Acad. Sci. USA
Figure imgf000540_0001
Hematopoiesis Regulating Activity
A protein of the present invention may be useful in regulation of hematopoiesis and, consequently, in the treatment of myeloid or lymphoid cell deficiencies. Even marginal biological activity in support of colony forming cells or of factor-dependent cell lines indicates involvement in regulating hematopoiesis, e.g., in supporting the growth and proliferation of erythroid progenitor cells alone or in combination with other cytokines, thereby indicating utility, for example, in treating various anemias or for use in conjunction with irradiation/chemotherapy to stimulate the production of erythroid precursors and/or erythroid cells; in supporting the growth and proliferation of myeloid cells such as granulocytes and monocytes/macrophages (i.e., traditional CSF activity) useful, for example, in conjunction with chemotherapy to prevent or treat consequent myelo-suppression; in supporting the growth and proliferation of megakaryocytes and consequently of platelets, thereby allowing prevention or treatment of various platelet disorders such as thrombocytopenia, and generally for use in place of or complementary to platelet transfusions; and/or in supporting the growth and proliferation of hematopoietic stem cells which are capable of maturing to any and all of the above-mentioned hematopoietic cells and therefore find therapeutic utility in various stem cell disorders (such as those usually treated with transplantation, including, without limitation, aplastic anemia and paroxysmal nocturnal hemoglobinuria), as well as in repopulating the stem cell compartment post irradiation/chemotherapy, either in vivo or ex vivo (i.e., in conjunction with bone marrow transplantation or with peripheral progenitor cell transplantation (homologous or heterologous)) as normal cells or genetically manipulated for gene therapy.
The activity of a protein of the invention may, among other means, be measured by the following methods:
Suitable assays for proliferation and differentiation of various hematopoietic lines are cited above.
Assays for embryonic stem cell differentiation (which will identify, among others, proteins that influence embryonic differentiation hematopoiesis) include, without limitation, those described in: Johansson et al., Cellular Biology 15:141-151, 1995; Keller et al., Molecular and Cellular Biology 13:473-486, 1993; McClanahan et al., Blood 81:2903-2915, 1993.
Assays for stem cell survival and differentiation (which will identify, among others, proteins that regulate lympho-hematopoiesis) include, without limitation, those described in: Methylcellulose colony forming assays, Freshney, M. G. In Culture of Hematopoietic Cells. R. I. Freshney, et al. eds. Vol pp. 265-268, Wiley-Liss, Inc., New York, N.Y. 1994; Hirayama et al., Proc. Natl. Acad. Sci. USA 89:5907-5911, 1992; Primitive hematopoietic colony forming cells with high proliferative potential, McNiece, I. K. and Briddell, R. A. In Culture of Hematopoietic Cells. R. I. Freshney, et al. eds. Vol pp. 23-39, Wiley-Liss, Inc., New York, N.Y. 1994; Neben et al., Experimental Hematology 22:353-359, 1994; Cobblestone area forming cell assay, Ploemacher, R. E. In Culture of Hematopoietic Cells. R. I. Freshney, et al. eds. Vol pp. 1-21, Wiley-Liss, Inc., New York, N.Y. 1994; Long term bone marrow cultures in the presence of stromal cells, Spooncer, E., Dexter, M. and Allen, T. In Culture of Hematopoietic Cells. R. I. Freshney, et al. eds. Vol pp. 163-179, Wiley-Liss, Inc., New York, N.Y. 1994; Long term culture initiating cell assay, Sutherland, H. J. In Culture of Hematopoietic Cells. R. I. Freshney, et al. eds. Vol pp. 139-162, Wiley-Liss, Inc., New York, N.Y. 1994.
Tissue Growth Activity
A protein of the present invention also may have utility in compositions used for bone, cartilage, tendon, ligament and/or nerve tissue growth or regeneration, as well as for wound healing and tissue repair and replacement, and in the treatment of burns, incisions and ulcers. A protein of the present invention, which induces cartilage and/or bone growth in circumstances where bone is not normally formed, has application in the healing of bone fractures and cartilage damage or defects in humans and other animals. Such a preparation employing a protein of the invention may have prophylactic use in closed as well as open fracture reduction and also in the improved fixation of artificial joints. De novo bone formation induced by an osteogenic agent contributes to the repair of congenital, trauma induced, or oncologic resection induced craniofacial defects, and also is useful in cosmetic plastic surgery.
A protein of this invention may also be used in the treatment of periodontal disease, and in other tooth repair processes. Such agents may provide an environment to attract bone-forming cells, stimulate growth of bone-forming cells or induce differentiation of progenitors of bone-forming cells. A protein of the invention may also be useful in the treatment of osteoporosis or osteoarthritis, such as through stimulation of bone and/or cartilage repair or by blocking inflammation or processes of tissue destruction (collagenase activity, osteoclast activity, etc.) mediated by inflammatory processes.
Another category of tissue regeneration activity that may be attributable to the protein of the present invention is tendon/ligament formation. A protein of the present invention, which induces tendon/ligament-like tissue or other tissue formation in circumstances where such tissue is not normally formed, has application in the healing of tendon or ligament tears, deformities and other tendon or ligament defects in humans and other animals. Such a preparation employing a tendon/ligament-like tissue inducing protein may have prophylactic use in preventing damage to tendon or ligament tissue, as well as use in the improved fixation of tendon or ligament to bone or other tissues, and in repairing defects to tendon or ligament tissue. De novo tendon/ligament-like tissue formation induced by a composition of the present invention contributes to the repair of congenital, trauma induced, or other tendon or ligament defects of other origin, and is also useful in cosmetic plastic surgery for attachment or repair of tendons or ligaments. The compositions of the present invention may provide an environment to attract tendon- or ligament-forming cells, stimulate growth of tendon- or ligament-forming cells, induce differentiation of progenitors of tendon- or ligament-forming cells, or induce growth of tendon/ligament cells or progenitors ex vivo for return in vivo to effect tissue repair. The compositions of the invention may also be useful in the treatment of tendonitis, carpal tunnel syndrome and other tendon or ligament defects. The compositions may also include an appropriate matrix and/or sequestering agent as a carrier as is well known in the art.
The protein of the present invention may also be useful for proliferation of neural cells and for regeneration of nerve and brain tissue, i.e., for the treatment of central and peripheral nervous system diseases and neuropathies, as well as mechanical and traumatic disorders, which involve degeneration, death or trauma to neural cells or nerve tissue. More specifically, a protein may be used in the treatment of diseases of the peripheral nervous system, such as peripheral nerve injuries, peripheral neuropathy and localized neuropathies, and central nervous system diseases, such as Alzheimer's, Parkinson's disease, Huntington's disease, amyotrophic lateral sclerosis, and Shy-Drager syndrome. Further conditions which may be treated in accordance with the present invention include mechanical and traumatic disorders, such as spinal cord disorders, head trauma and cerebrovascular diseases such as stroke. Peripheral neuropathies resulting from chemotherapy or other medical therapies may also be treatable using a protein of the invention.
Proteins of the invention may also be useful to promote better or faster closure of non-healing wounds, including without limitation pressure ulcers, ulcers associated with vascular insufficiency, surgical and traumatic wounds, and the like. It is expected that a protein of the present invention may also exhibit activity for generation or regeneration of other tissues, such as organs (including, for example, pancreas, liver, intestine, kidney, skin, endothelium), muscle (smooth, skeletal or cardiac) and vascular (including vascular endothelium) tissue, or for promoting the growth of cells comprising such tissues. Part of the desired effects may be by inhibition or modulation of fibrotic scarring to allow normal tissue to regenerate. A protein of the invention may also exhibit angiogenic activity.
A protein of the present invention may also be useful for gut protection or regeneration and treatment of lung or liver fibrosis, reperfusion injury in various tissues, and conditions resulting from systemic cytokine damage.
A protein of the present invention may also be useful for promoting or inhibiting differentiation of tissues described above from precursor tissues or cells, or for inhibiting the growth of tissues described above.
The activity of a protein of the invention may, among other means, be measured by the following methods:
Assays for tissue generation activity include, without limitation, those described in: International Patent Publication No. WO95/16035 (bone, cartilage, tendon); International Patent Publication No. WO95/05846 (nerve, neuronal); International Patent Publication No. WO91/07491 (skin, endothelium).
Assays for wound healing activity include, without limitation, those described in: Winter, Epidermal Wound Healing, pps. 71-112 (Maibach, H. I. and Rovee, D. T., eds.), Year Book Medical Publishers, Inc., Chicago, as modified by Eaglstein and Mertz, J. Invest. Dermatol 71:382-84 (1978).
Activin/Inhibin Activity
A protein of the present invention may also exhibit activin- or inhibin-related activities. Inhibins are characterized by their ability to inhibit the release of follicle stimulating hormone (FSH), while activins are characterized by their ability to stimulate the release of FSH. Thus, a protein of the present invention, alone or in heterodimers with a member of the inhibin alpha family, may be useful as a contraceptive based on the ability of inhibins to decrease fertility in female mammals and decrease spermatogenesis in male mammals. Administration of sufficient amounts of other inhibins can induce infertility in these mammals. Alternatively, the protein of the invention, as a homodimer or as a heterodimer with other protein subunits of the inhibin-beta group, may be useful as a fertility inducing therapeutic based upon the ability of activin molecules to stimulate FSH release from cells of the anterior pituitary. See, for example, U.S. Pat. No. 4,798,885. A protein of the invention may also be useful for advancement of the onset of fertility in sexually immature mammals, so as to increase the lifetime reproductive performance of domestic animals such as cows, sheep and pigs.
The activity of a protein of the invention may, among other means, be measured by the following methods:
Assays for activin/inhibin activity include, without limitation, those described in: Vale et al., Endocrinology 91:562-572, 1972; Ling et al., Nature 321:779-782, 1986; Vale et al., Nature 321:776-779, 1986; Mason et al., Nature 318:659-663, 1985; Forage et al., Proc. Natl. Acad. Sci. USA 83:3091-3095,
Chemotactic/Chemokinetic Activity
A protein of the present invention may have chemotactic or chemokinetic activity (e.g., act as a chemokine) for mammalian cells, including, for example, monocytes, fibroblasts, neutrophils, T-cells, mast cells, eosinophils, epithelial and/or endothelial cells. Chemotactic and chemokinetic proteins can be used to mobilize or attract a desired cell population to a desired site of action. Chemotactic or chemokinetic proteins provide particular advantages in treatment of wounds and other trauma to tissues, as well as in treatment of localized infections. For example, attraction of lymphocytes, monocytes or neutrophils to tumors or sites of infection may result in improved immune responses against the tumor or infecting agent. A protein or peptide has chemotactic activity for a particular cell population if it can stimulate, directly or indirectly, the directed orientation or movement of such cell population. Preferably, the protein or peptide has the ability to directly stimulate directed movement of cells. Whether a particular protein has chemotactic activity for a population of cells can be readily determined by employing such protein or peptide in any known assay for cell chemotaxis.
The activity of a protein of the invention may, among other means, be measured by the following methods: Assays for chemotactic activity (which will identify proteins that induce or prevent chemotaxis) consist of assays that measure the ability of a protein to induce the migration of cells across a membrane as well as the ability of a protein to induce the adhesion of one cell population to another cell population. Suitable assays for movement and adhesion include, without limitation, those described in: Current Protocols in Immunology, Ed. by J. E. Coligan, A. M. Kruisbeek, D. H. Margulies, E. M. Shevach, W. Strober, Pub. Greene Publishing Associates and Wiley-Interscience (Chapter 6.12, Measurement of alpha and beta Chemokines 6.12.1-6.12.28); Taub et al., J. Clin. Invest. 95:1370-1376, 1995; Lind et al., APMIS 103:140-146, 1995; Muller et al., Eur. J. Immunol. 25:1744-1748; Gruber et al., J. of Immunol. 152:5860-5867, 1994; Johnston et al., J. of Immunol. 153:1762-1768, 1994.
Hemostatic and Thrombolytic Activity
A protein of the invention may also exhibit hemostatic or thrombolytic activity. As a result, such a protein is expected to be useful in treatment of various coagulation disorders (including hereditary disorders, such as hemophilias) or to enhance coagulation and other hemostatic events in treating wounds resulting from trauma, surgery or other causes. A protein of the invention may also be useful for dissolving or inhibiting formation of thromboses and for treatment and prevention of conditions resulting therefrom (such as, for example, infarction of cardiac and central nervous system vessels (e.g., stroke)).
The activity of a protein of the invention may, among other means, be measured by the following methods: Assays for hemostatic and thrombolytic activity include, without limitation, those described in: Linet et al., J. Clin. Pharmacol. 26:131-140, 1986; Burdick et al., Thrombosis Res. 45:413-419, 1987; Humphrey et al., Fibrinolysis 5:71-79 (1991); Schaub, Prostaglandins 35:467-474, 1988.
Receptor/Ligand Activity
A protein of the present invention may also demonstrate activity as a receptor, a receptor ligand, or an inhibitor or agonist of receptor/ligand interactions. Examples of such receptors and ligands include, without limitation, cytokine receptors and their ligands, receptor kinases and their ligands, receptor phosphatases and their ligands, receptors involved in cell-cell interactions and their ligands (including, without limitation, cellular adhesion molecules such as selectins, integrins and their ligands) and receptor/ligand pairs involved in antigen presentation, antigen recognition and development of cellular and humoral immune responses. Receptors and ligands are also useful for screening of potential peptide or small molecule inhibitors of the relevant receptor/ligand interaction. A protein of the present invention (including, without limitation, fragments of receptors and ligands) may itself be useful as an inhibitor of receptor/ligand interactions.
The activity of a protein of the invention may, among other means, be measured by the following methods: Suitable assays for receptor-ligand activity include, without limitation, those described in: Current Protocols in Immunology, Ed. by J. E. Coligan, A. M. Kruisbeek, D. H. Margulies, E. M. Shevach, W. Strober, Pub. Greene Publishing Associates and Wiley-Interscience (Chapter 7.28, Measurement of Cellular Adhesion under static conditions 7.28.1-7.28.22); Takai et al., Proc. Natl. Acad. Sci. USA 84:6864-6868, 1987; Bierer et al., J. Exp. Med. 168:1145-1156, 1988; Rosenstein et al., J. Exp. Med. 169:149-160, 1989; Stoltenborg et al., J. Immunol. Methods 175:59-68, 1994; Stitt et al., Cell 80:661-670, 1995.
Anti-Inflammatory Activity
Proteins of the present invention may also exhibit anti-inflammatory activity. The anti-inflammatory activity may be achieved by providing a stimulus to cells involved in the inflammatory response, by inhibiting or promoting cell-cell interactions (such as, for example, cell adhesion), by inhibiting or promoting chemotaxis of cells involved in the inflammatory process, by inhibiting or promoting cell extravasation, or by stimulating or suppressing production of other factors which more directly inhibit or promote an inflammatory response. Proteins exhibiting such activities can be used to treat inflammatory conditions (including chronic or acute conditions), including, without limitation, inflammation associated with infection (such as septic shock, sepsis or systemic inflammatory response syndrome (SIRS)), ischemia-reperfusion injury, endotoxin lethality, arthritis, complement-mediated hyperacute rejection, nephritis, cytokine- or chemokine-induced lung injury, inflammatory bowel disease, Crohn's disease, or inflammation resulting from overproduction of cytokines such as TNF or IL-1. Proteins of the invention may also be useful to treat anaphylaxis and hypersensitivity to an antigenic substance or material.
Tumor Inhibition Activity
In addition to the activities described above for immunological treatment or prevention of tumors, a protein of the invention may exhibit other anti-tumor activities. A protein may inhibit tumor growth directly or indirectly (such as, for example, via ADCC). A protein may exhibit its tumor inhibitory activity by acting on tumor tissue or tumor precursor tissue, by inhibiting formation of tissues necessary to support tumor growth (such as, for example, by inhibiting angiogenesis), by causing production of other factors, agents or cell types which inhibit tumor growth, or by suppressing, eliminating or inhibiting factors, agents or cell types which promote tumor growth.
Other Activities
A protein of the invention may also exhibit one or more of the following additional activities or effects: inhibiting the growth, infection or function of, or killing, infectious agents, including, without limitation, bacteria, viruses, fungi and other parasites; affecting (suppressing or enhancing) bodily characteristics, including, without limitation, height, weight, hair color, eye color, skin, fat to lean ratio or other tissue pigmentation, or organ or body part size or shape (such as, for example, breast augmentation or diminution, change in bone form or shape); affecting biorhythms or circadian cycles or rhythms; affecting the fertility of male or female subjects; affecting the metabolism, catabolism, anabolism, processing, utilization, storage or elimination of dietary fat, lipid, protein, carbohydrate, vitamins, minerals, cofactors or other nutritional factors or component(s); affecting behavioral characteristics, including, without limitation, appetite, libido, stress, cognition (including cognitive disorders), depression (including depressive disorders) and violent behaviors; providing analgesic effects or other pain reducing effects; promoting differentiation and growth of embryonic stem cells in lineages other than hematopoietic lineages; hormonal or endocrine activity; in the case of enzymes, correcting deficiencies of the enzyme and treating deficiency-related diseases; treatment of hyperproliferative disorders (such as, for example, psoriasis); immunoglobulin-like activity (such as, for example, the ability to bind antigens or complement); and the ability to act as an antigen in a vaccine composition to raise an immune response against such protein or another material or entity which is cross-reactive with such protein.
Particular Applications for Certain Clones
The following sets out a non-exclusive list of applications for certain embodiments of the invention. In the interest of economy, applications relevant to multiple embodiments are not duplicated in this list. Other embodiments described herein have similar characteristics, as described there. The artisan is directed, therefore, to the Description of the Sequences for similar descriptions of the functions of other embodiments.
Testes:
htes3_10i16: The new protein can find application in diagnosis/therapy of leukemia predisposition/disease and in the modulation of DNA repair.
htes3_10n10: The new protein can find application in studying the expression profile of testis-specific genes.
htes3_11a17: The new protein can find application in studying the expression profile of testis-specific genes and as a new marker for testicular cells.
htes3_11c22: The new protein can find application in modulating/blocking of regulatory pathways.
htes3_11d21: The new protein can find application in diagnosis of diseases due to abnormal protein degradation, like muscular dystrophy or multiple sclerosis, as well as in modulating the half-life of specific proteins and in expression profiling.
Kidney:
hfkd2_3k1: The new protein can find application in modulation of endocytosis; it shows strong similarity to testicular dynamin (Rattus norvegicus).
Amygdala:
hamy2_10h17: The new protein can find application in modulating protein-protein interaction and in studying the expression profile of amygdala-specific genes.
hamy2_10p7: The new protein can find application in modulation of Na+/Ca2+ exchange and voltage-dependent processes.
hamy2_11d2: The new protein can find application in studying the expression profile of amygdala-specific genes and as a new marker for amygdala cells.
hamy2_11n4: The new protein can find application in modulation of DNA repair and as a new tool for manipulation of nucleic acids.
hamy2_121f11: The new protein can find application in modulation of cytoskeleton-membrane interactions.
Fetal Brain:
hfbr2_78c12: The new protein can find application in the modulation of translational pathways.
hfbr2_78d18: The new protein can find application in studying the expression profile of brain-specific genes.
hfbr2_78d4: The new protein can find application in studying the expression profile of brain-specific genes and as a new marker for amygdala cells.
hfbr2_78e18: The new protein can find application in studying the expression profile of brain-specific genes.
hfbr2_78i21: The new protein can find application in diagnosis/modulation of protein damage and age-related degenerative processes.
Melanoma:
hmel2_12j1: The new protein can find application in studying the expression profile of melanoma-specific genes.
hmel2_7g14: The new protein can find application in modulation of the sorting of proteins into different compartments.
hmel2_7k11: The new protein can find application in studying the expression profile of melanoma-specific genes.
VARIANTS OF THE INVENTIVE DNA MOLECULES
Variants in General
"Variants," according to the invention, include DNA and/or protein molecules that resemble, structurally and/or functionally, those set forth herein. Variants may be isolated from natural sources ("homologs"), may be entirely synthetic, or may be based in part on both natural and synthetic approaches.
The section set forth below presents various structural and functional characteristics of molecules within the invention. Preferred molecules are characterized by a combination of one or more of these characteristics. For instance, some preferred molecules are described with reference to at least two structural characteristics, while others may be described with reference to at least one structural and at least one functional characteristic. It will be recognized by the skilled artisan that structure ultimately defines function, i.e., the functions of the molecules described herein derive from the structures of those molecules. Accordingly, the structural variants described below that bear the closest structural relationship (as variously defined below) to the inventive molecules are the variants that most likely will preserve biological function. This relationship between structure and function will guide the skilled artisan in identifying the preferred embodiments of the invention.
Splicing Variants
It is well known that eukaryotic structural genes are comprised of both protein coding and non-coding portions. When the messenger RNA is transcribed from the DNA template, it contains introns, which are non-coding, and exons, which are coding. In order to form a translation-competent mRNA, the introns must be "spliced" out of this initial pre-mRNA.
Specific sequences within the pre-mRNA represent "splice junctions" that direct the cellular splicing machinery to the appropriate position. The splice junctions are loosely conserved sequence regions of the pre-mRNA, which almost invariably begin with GT and end with AG (DNA perspective). The 5' end of the splice junction typically contains about nine somewhat conserved residues, for example, C/AAGTA/GAGT. The 3' end usually contains a pyrimidine-rich stretch of at least about 11 nucleotides, followed by NC/TAGG. Splicing occurs before the GT and after the AG. Mount, Nucleic Acids Res. 10:459-72 (1982).
Interestingly, exons often correspond to discrete functional domains of the protein product. The intron/exon arrangement thus creates a linear array of nucleotides which can be correlated to discrete, and often interchangeable, functional protein fragments. Go, Nature 291:90-92 (1981); Branden et al., EMBO J. 3:1307-10 (1984). This linear arrangement creates the possibility of generating multiple different full length proteins by rearranging the order of the different functional portions in the array. For example, if a set of exons are arranged 1-2-3-4, where (-) represents the introns separating the exons, a splicing event need not simply produce 1234, but may produce 123, 134, 124 and so on. Production of different mRNA products in this way is commonly called "alternative splicing." Andreadis et al., Ann. Rev. Cell Biol. 3:207-42 (1987).
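The exon example above can be sketched in a few lines of Python. This is only an illustration under a simplifying assumption that is not stated in the text: any exon may be skipped, and the retained exons keep their original order. The function name `splice_isoforms` and the `min_exons` parameter are hypothetical.

```python
from itertools import combinations


def splice_isoforms(exons, min_exons=1):
    """List order-preserving exon selections, longest first.

    Simplified model (an assumption, not the patent's method): every
    exon is optional and retained exons stay in their original order.
    """
    isoforms = []
    for size in range(len(exons), min_exons - 1, -1):
        # combinations() keeps the input order, so exon order is preserved
        for combo in combinations(exons, size):
            isoforms.append("".join(str(exon) for exon in combo))
    return isoforms


if __name__ == "__main__":
    # Exons arranged 1-2-3-4, keeping at least three exons per product
    print(splice_isoforms([1, 2, 3, 4], min_exons=3))
```

With four exons and at least three retained, the listing contains the full-length product 1234 together with the exon-skipping products 123, 124 and 134 named in the text.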
Some of the present DNA molecules can be represented in modular fashion in terms of their coding regions. Essentially, these modules are exons (though each "exon" may in fact be made up of several exons), which may be combined in different ways to form a variety of different DNA molecules, each encoding a different functional protein. Splicing variants are indicated in the Description of the Sequences.
Degenerate Variants
One aspect of the present invention provides "degenerate variants" of the nucleic acid fragments of the present invention. A "degenerate variant" is a nucleotide fragment which differs from those of inventive molecules by nucleotide sequence, but due to the degeneracy of the genetic code, encodes an identical polypeptide sequence. Given the known relationship between DNA sequences and the proteins they encode, degenerate variants typically are described by reference to this relationship. It is well known that the degeneracy of the genetic code results in many possible DNA sequences which encode a particular protein. Indeed, of the three bases which comprise an amino acid-encoding triplet, the third position, and often the second, almost always may vary. This fact alone allows for a class of variant DNA molecules which encode protein sequences identical to those disclosed herein, yet have about 30% sequence variation. In other words, the variant DNA molecules are about 70% identical to the inventive DNAs, having no additional or deleted sequences. Thus, one aspect of the invention provides degenerate variant DNA molecules encoding the inventive protein sequences.
In one embodiment, these variants have at least about 70% sequence identity with the DNA molecules described herein. In a preferred embodiment, these variants have at least about 80% sequence identity to the inventive molecules. In a more preferred embodiment, these variants have at least about 90% sequence identity with the inventive molecules.
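A minimal sketch of the idea of a degenerate variant follows. It builds the standard genetic code, swaps each codon of a coding sequence for a synonymous codon, and reports the percent nucleotide identity between the two DNAs. The helper names and the rule of picking the most divergent synonym are illustrative assumptions; the short test sequence is invented for the example and is not one of the disclosed sequences.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def translate(dna):
    """Translate a coding DNA whose length is a multiple of three."""
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna), 3))


def degenerate_variant(dna):
    """Swap each codon for the synonymous codon that differs at most positions."""
    variant = []
    for i in range(0, len(dna), 3):
        codon = dna[i:i + 3]
        synonyms = [c for c, aa in CODON_TABLE.items() if aa == CODON_TABLE[codon]]
        # Hypothetical selection rule: maximize mismatches to the original codon
        variant.append(max(synonyms, key=lambda c: sum(x != y for x, y in zip(c, codon))))
    return "".join(variant)


def percent_identity(a, b):
    """Percent of identical positions between two equal-length sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must have equal length")
    return 100.0 * sum(x == y for x, y in zip(a, b)) / len(a)


if __name__ == "__main__":
    original = "ATGGCTCTGTCTAAA"  # invented example encoding M-A-L-S-K
    variant = degenerate_variant(original)
    print(variant, translate(variant), round(percent_identity(original, variant), 1))
```

Because only synonymous codons are exchanged and no bases are added or deleted, the variant has the same length and encodes the same polypeptide, while its nucleotide identity to the original falls below 100%.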
Conservative Amino Acid Variants
Variants according to the invention also may be made that conserve the overall molecular structure of the encoded proteins. Given the properties of the individual amino acids comprising the disclosed protein products, some rational substitutions will be recognized by the skilled worker. Amino acid substitutions, i.e., "conservative substitutions," may be made, for instance, on the basis of similarity in polarity, charge, solubility, hydrophobicity, hydrophilicity, and/or the amphipathic nature of the residues involved. For example: (a) nonpolar (hydrophobic) amino acids include alanine, leucine, isoleucine, valine, proline, phenylalanine, tryptophan, and methionine; (b) polar neutral amino acids include glycine, serine, threonine, cysteine, tyrosine, asparagine, and glutamine; (c) positively charged (basic) amino acids include arginine, lysine, and histidine; and (d) negatively charged (acidic) amino acids include aspartic acid and glutamic acid. Substitutions typically may be made within groups (a)-(d). In addition, glycine and proline may be substituted for one another based on their ability to disrupt α-helices. Similarly, certain amino acids, such as alanine, cysteine, leucine, methionine, glutamic acid, glutamine, histidine and lysine, are more commonly found in α-helices, while valine, isoleucine, phenylalanine, tyrosine, tryptophan and threonine are more commonly found in β-pleated sheets. Glycine, serine, aspartic acid, asparagine, and proline are commonly found in turns. Some preferred substitutions may be made among the following groups: (i) S and T; (ii) P and G; and (iii) A, V, L and I. Given the known genetic code, and recombinant and synthetic DNA techniques, the skilled scientist readily can construct DNAs encoding the conservative amino acid variants.
As used herein, "sequence identity" between two polypeptide sequences indicates the percentage of amino acids that are identical between the sequences. "Sequence similarity" indicates the percentage of amino acids that either are identical or that represent conservative amino acid substitutions.
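The distinction between identity and similarity can be illustrated with a short Python sketch. It assumes two already aligned polypeptides of equal length and treats a substitution as conservative only when both residues fall in the same one of groups (a)-(d) listed above; the function name and the example peptides are hypothetical.

```python
# Groups (a)-(d) from the text, in one-letter amino acid code
GROUPS = [
    "ALIVPFWM",  # (a) nonpolar (hydrophobic)
    "GSTCYNQ",   # (b) polar neutral
    "RKH",       # (c) positively charged (basic)
    "DE",        # (d) negatively charged (acidic)
]
GROUP_OF = {aa: index for index, members in enumerate(GROUPS) for aa in members}


def identity_and_similarity(seq_a, seq_b):
    """Return (percent identity, percent similarity) for aligned sequences."""
    if len(seq_a) != len(seq_b) or not seq_a:
        raise ValueError("sequences must be non-empty and of equal length")
    identical = 0
    conservative = 0
    for a, b in zip(seq_a, seq_b):
        if a == b:
            identical += 1
        elif a in GROUP_OF and GROUP_OF.get(a) == GROUP_OF.get(b):
            # Different residues drawn from the same group (a)-(d)
            conservative += 1
    total = len(seq_a)
    return 100.0 * identical / total, 100.0 * (identical + conservative) / total


if __name__ == "__main__":
    # Invented peptides: K/R and L/I are conservative, E/Q is not
    print(identity_and_similarity("MKTLDE", "MRTIDQ"))
```

In the invented pair, three of six positions are identical and two more are conservative substitutions, so similarity is higher than identity.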
Functionally Equivalent Variants
Yet another class of DNA variants within the scope of the invention may be described with reference to the product they encode. As shown in the Description of the Sequences, some of the inventive DNA molecules encode a protein having a degree of homology with known proteins, or protein domains. It is expected, therefore, that they will have some or all of the requisite functional features of such molecules. These "functionally equivalent variants" are characterized by the fact that their products are functionally equivalent, with respect to biological activity, to certain known molecules. Also provided herein is information on common structural motifs, including consensus sequences that will guide the artisan in constructing functionally equivalent variants. It will be understood that the motifs, identified in the Description of the Sequences for each inventive protein, may be modified within the identified consensus sequences. Thus, the invention contemplates the proteins in the Description of the Sequences that contain variability in the consensus sequences identified, and the invention further contemplates the full range of nucleic acids encoding them, and the complements of those nucleic acids.
Hybridizing Variants
DNA variants within the invention also may be described by reference to their physical properties in hybridization. One skilled in the field will recognize that DNA can be used to identify its complement and, since DNA is double stranded, its equivalent or homolog, using nucleic acid hybridization techniques. It will also be recognized that hybridization can occur with less than 100% complementarity. However, given appropriate choice of conditions, hybridization techniques can be used to differentiate among DNA sequences based on their structural relatedness to a particular probe. For guidance regarding such conditions see, for example, Sambrook et al., 1989, MOLECULAR CLONING, A LABORATORY MANUAL, Cold Spring Harbor Press, N.Y., and Ausubel et al., 1989, CURRENT PROTOCOLS IN MOLECULAR BIOLOGY, Green Publishing Associates and Wiley Interscience, N.Y.
Structural relatedness between two polynucleotide sequences can be expressed as a function of "stringency" of the conditions under which the two sequences will hybridize with one another. As used herein, the term "stringency" refers to the extent that the conditions disfavor hybridization. Stringent conditions strongly disfavor hybridization, and only the most structurally related molecules will hybridize to one another under such conditions. Conversely, non-stringent conditions favor hybridization of molecules displaying a lesser degree of structural relatedness. Hybridization stringency, therefore, directly correlates with the structural relationships of two nucleic acid sequences. The following relationships are useful in correlating hybridization and relatedness (where Tm is the melting temperature of a nucleic acid duplex):
a. $T_m = 69.3 + 0.41\,(\mathrm{G+C})\%$
b. The $T_m$ of a duplex DNA decreases by 1°C with every increase of 1% in the number of mismatched base pairs.
c. $(T_m)_{\mu_2} - (T_m)_{\mu_1} = 18.5\,\log_{10}(\mu_2/\mu_1)$, where $\mu_1$ and $\mu_2$ are the ionic strengths of two solutions.
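The three relationships can be combined in a short Python sketch. The constants 69.3, 0.41 and 18.5 are taken from a reading of relationships a and c in which the printed digits are treated as the commonly cited values; that reading is an assumption, as are the function names. The G+C content and mismatch level are given in percent, and the ionic strengths only need to share a unit.

```python
import math


def tm_from_gc(gc_percent):
    """Relationship a: Tm in degrees C from percent G+C of the duplex."""
    return 69.3 + 0.41 * gc_percent


def tm_after_mismatch(tm, mismatch_percent):
    """Relationship b: Tm falls 1 degree C per 1% mismatched base pairs."""
    return tm - mismatch_percent


def tm_shift_for_ionic_strength(mu1, mu2):
    """Relationship c: Tm at ionic strength mu2 minus Tm at mu1."""
    return 18.5 * math.log10(mu2 / mu1)


if __name__ == "__main__":
    perfect = tm_from_gc(50.0)                   # perfectly matched duplex
    mismatched = tm_after_mismatch(perfect, 10)  # 10% mismatched base pairs
    shift = tm_shift_for_ionic_strength(1.0, 0.1)
    print(perfect, mismatched, shift)
```

The sketch shows why washing in lower salt is more stringent: a tenfold drop in ionic strength lowers the predicted Tm by the same amount that a tenfold rise would increase it, so fewer mismatched duplexes remain stable at a given temperature.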
Hybridization stringency is a function of many factors, including overall DNA concentration, ionic strength, temperature, probe size and the presence of agents which disrupt hydrogen bonding. Factors promoting hybridization include high DNA concentrations, high ionic strengths, low temperatures, longer probe size and the absence of agents that disrupt hydrogen bonding. Hybridization usually is done in two stages. First, in the "binding" stage, the probe is bound to the target under conditions favoring hybridization. Stringency is usually controlled at this stage by altering the temperature. For high stringency, the temperature is usually between 65°C and 70°C, unless short (<20 nt) oligonucleotide probes are used. A representative hybridization solution comprises 6X SSC, 0.5% SDS, 5X Denhardt's solution and 100 μg of non-specific carrier DNA. See Ausubel et al., supra, section 2.9, supplement 27 (1994). Of course, many different, yet functionally equivalent, buffer conditions are known. Where the degree of relatedness is lower, a lower temperature may be chosen. Low stringency binding temperatures are between about 25°C and 40°C. Medium stringency is between at least about 40°C and less than about 65°C. High stringency is at least about 65°C. Second, the excess probe is removed by washing. It is at this stage that more stringent conditions usually are applied. Hence, it is this "washing" stage that is most important in determining relatedness via hybridization. Washing solutions typically contain lower salt concentrations. One exemplary medium stringency solution contains 2X SSC and 0.1% SDS. A high stringency wash solution contains the equivalent (in ionic strength) of less than about 0.2X SSC, with a preferred stringent solution containing about 0.1X SSC. The temperatures associated with various stringencies are the same as discussed above for "binding." The washing solution also typically is replaced a number of times during washing. For example, typical high stringency washing conditions comprise washing twice for 30 minutes at 55°C and three times for 15 minutes at 60°C.
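As a rough aid to reading the temperature ranges above, the following Python sketch maps a binding temperature to a stringency label. The cut-offs (about 25°C, 40°C and 65°C) reflect one reading of the ranges in the text and the function name is hypothetical. Temperature is only one factor; the sketch deliberately ignores salt concentration, probe length and denaturing agents, which the text also lists as determinants of stringency.

```python
def binding_stringency(temp_c):
    """Label a binding temperature (degrees C) by the ranges given in the text.

    Assumed cut-offs: low is about 25 to 40, medium is at least about 40
    and below about 65, high is at least about 65.
    """
    if temp_c >= 65:
        return "high"
    if temp_c >= 40:
        return "medium"
    if temp_c >= 25:
        return "low"
    return "below the stated ranges"


if __name__ == "__main__":
    for temperature in (30, 42, 55, 65, 68):
        print(temperature, binding_stringency(temperature))
```

Because a wash at 55°C or 60°C in very low salt is still described in the text as high stringency, a label derived from temperature alone should be treated as a first approximation.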
The present invention includes nucleic acid molecules that hybridize to the inventive molecules under high stringency binding and washing conditions. More preferred molecules (from an mRNA perspective) are those that are at least 50% of the length of any one of those depicted in the Description of the Sequences. Particularly preferred molecules are at least 75% of the length of those molecules.
Substitutions, Insertions, Additions and Deletions
In a general sense, the preferred DNA variants of the invention are those that retain the closest relationship, as described by "sequence identity," to the inventive DNA molecules. According to another aspect of the invention, therefore, substitutions, insertions, additions and deletions of defined properties are contemplated. It will be recognized that sequence identity between two polynucleotide sequences, as defined herein, generally is determined with reference to the protein coding region of the sequences. Thus, this definition does not at all limit the amount of DNA, such as vector DNA, that may be attached to the molecules described herein. Preferred DNA sequence variants include molecules encoding proteins sharing some or all of any relevant biological activity of the native molecule.
In creating these variants, the skilled worker will be guided by reference to the protein structure. First, insertions and deletions in any recognized functional domain above generally should be avoided, except as noted below in the section entitled "Proteins," where this domain is discussed in detail. Alterations in such domains usually will be limited to conservative amino acid substitutions. In addition, where insertions and deletions are desired, this may be accomplished at the N- and/or C-terminus of the protein molecule (or the corresponding coding regions of the DNA). If insertions or deletions are made within the protein, deletions of major structural features usually should be avoided. Thus, a preferred place to make insertion or deletion variants is in non-structural regions, such as linker regions between two alpha helices. "Substitutions" generally refer to alterations in the DNA sequence which do not change its overall length, but only alter one or more nucleotide positions, substituting one for another in the common sense of the word. One class of preferred substitutions, "degenerate substitutions," are those that do not alter the encoded amino acid sequence. Some substitutions retain 50%, 55%, 60% or 65% identity. Preferred substitutions retain at least about 70% identity, more preferably at least 70% or 75% identity, with the inventive DNAs. Some more preferred molecules have at least about 80% identity, more preferably at least 80% or 85% identity. Particularly preferred DNAs share at least about 90% identity, more preferably at least 90% or 95% identity.
"Insertions," unlike substitutions, alter the overall length of the DNA molecule, and thus sometimes the encoded protein. Insertions add extra nucleotides to the interior (not the 5' or 3' ends) of the subject DNAs. Preferred insertions are made with reference to the protein sequence encoded by the DNA. Thus, it is most preferred to provide an insertion in the DNA at a location that corresponds to an area of the encoded protein which lacks structure. For instance, it typically would not be beneficial, if the preservation of biological activity is desired, to provide an insertion within an alpha-helical region or a beta-pleated sheet. Accordingly, non-structural areas, such as those containing helix-breaking glycine and proline residues, are most preferred sites of insertion. Other preferred sites of insertion are the splice sites, which are indicated above in the description of the inventive DNA molecules.
While the optimal size of insertions will vary depending upon the site of insertion and its effect on the overall conformation of the encoded protein, some general guides are useful. Generally, the total insertions (irrespective of their number) should not add more than about 30% (or preferably not more than 30%) to the overall size of the encoded protein. More preferably, the insertion adds less than about 10-20% (yet more preferably 10-20%) in size, with less than about 10% being most preferred. The number of insertions is limited only by the number of suitable insertion sites, and secondarily by the foregoing size preferences.
"Additions," like insertions, also add to the overall size of the DNA molecule, and usually the encoded protein. However, instead of being made within the molecule, they are made on the 5' or 3' end, usually corresponding to the N- or C-terminus of the encoded protein. Unlike deletions, additions are not very size-dependent. Indeed, additions may be of virtually any size. Preferred additions, however, do not exceed about 100% of the size of the native molecule. More preferably, they add less than about 60 to 30% to the overall size, with less than about 30% being most preferred.
"Deletions" diminish the overall size of the DNA and, therefore, also reduce the size of the protein encoded by that DNA. Deletions may be made from either end of the molecule or internal to it. Typical preferred deletions remove discrete structural features of the encoded protein. For example, some deletions will comprise the deletion of one or more exons which may define a structural feature. Preferred deletions remove less than about 30% of the size of the subject molecule. More preferred deletions remove less than about 20% and most preferred deletions remove less than about 10%.
Computer-Defined Variants and Definition of "Sequence Identity"
In general, both the DNA and protein molecules of the invention can be defined with reference to "sequence identity." As used herein, "sequence identity" refers to a comparison made between two molecules using, for example, the standard Smith-Waterman algorithm that is well known in the art. Some molecules have at least about 50%, 55% or 60% identity. Preferred molecules are those having at least about 65% sequence identity, more preferably at least 65% or 70% sequence identity. Other preferred molecules have at least about 80%, more preferably at least 80% or 85%, sequence identity. Particularly preferred molecules have at least about 90% sequence identity, more preferably at least 90% sequence identity. Most preferred molecules have at least about 95%, more preferably at least 95%, sequence identity. As used herein, two nucleic acid molecules or proteins are said to "share significant sequence identity" if the two contain regions which possess greater than 85% sequence (amino acid or nucleic acid) identity.
"Sequence identity" is defined herein with reference to the Blast 2 algorithm, which is available at the NCBI (http://www.ncbi.nlm.nih.gov/BLAST), using default parameters. References pertaining to this algorithm include those found at http://www.ncbi.nlm.nih.gov/BLAST/blast_references.html; Altschul, S.F., Gish, W., Miller, W., Myers, E.W. & Lipman, D.J. (1990) "Basic local alignment search tool." J. Mol. Biol. 215:403-410; Gish, W. & States, D.J. (1993) "Identification of protein coding regions by database similarity search." Nature Genet. 3:266-272; Madden, T.L., Tatusov, R.L. & Zhang, J. (1996) "Applications of network BLAST server" Meth. Enzymol. 266:131-141; Altschul, S.F., Madden, T.L., Schaffer, A.A., Zhang, J., Zhang, Z., Miller, W. & Lipman, D.J. (1997) "Gapped BLAST and PSI-BLAST: a new generation of protein database search programs." Nucleic Acids Res. 25:3389-3402; and Zhang, J. & Madden, T.L. (1997) "PowerBLAST: A new network BLAST application for interactive or automated sequence analysis and annotation." Genome Res. 7:649-656.
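For orientation, the Smith-Waterman comparison mentioned above can be sketched as follows. This is a minimal local alignment with a linear gap penalty, not the Blast 2 program that the text uses to define sequence identity, and it will not reproduce BLAST output. The scoring values (match 2, mismatch -1, gap -2) and the function name are assumptions chosen for illustration.

```python
def smith_waterman_identity(a, b, match=2, mismatch=-1, gap=-2):
    """Return (best local score, percent identity over the local alignment)."""
    rows, cols = len(a) + 1, len(b) + 1
    score = [[0] * cols for _ in range(rows)]
    best, best_pos = 0, (0, 0)

    # Fill the dynamic programming matrix; scores never drop below zero
    for i in range(1, rows):
        for j in range(1, cols):
            diag = score[i - 1][j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            score[i][j] = max(0, diag, score[i - 1][j] + gap, score[i][j - 1] + gap)
            if score[i][j] > best:
                best, best_pos = score[i][j], (i, j)

    # Trace back from the best cell until a zero score is reached
    i, j = best_pos
    identical = aligned = 0
    while i > 0 and j > 0 and score[i][j] > 0:
        diag = score[i - 1][j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
        if score[i][j] == diag:
            identical += int(a[i - 1] == b[j - 1])
            i, j = i - 1, j - 1
        elif score[i][j] == score[i - 1][j] + gap:
            i -= 1  # gap in b
        else:
            j -= 1  # gap in a
        aligned += 1

    identity = 100.0 * identical / aligned if aligned else 0.0
    return best, identity


if __name__ == "__main__":
    print(smith_waterman_identity("TTTACGTAAA", "GGACGTCC"))
```

Identity here is counted over the columns of the best local alignment, gaps included, which is one of several conventions; a threshold such as those recited above would be applied to the value produced by the defining algorithm, not to this sketch.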
METHODS OF MAKING VARIANTS
It will be recognized that variants of the inventive molecules can be constructed in several different ways. For example, they may be constructed as completely synthetic DNAs. Methods of efficiently synthesizing oligonucleotides in the range of 20 to about 150 nucleotides are widely available. See Ausubel et al., supra, section 2.11, Supplement 21 (1993). Overlapping oligonucleotides may be synthesized and assembled in a fashion first reported by Khorana et al., J. Mol. Biol. 72:209-217 (1971); see also Ausubel et al., supra, Section 8.2. The synthetic DNAs are designed with convenient restriction sites engineered at the 5' and 3' ends of the gene to facilitate cloning into an appropriate vector. An alternative method of generating variants is to start with one of the inventive DNAs and then to conduct site-directed mutagenesis. See Ausubel et al., supra, chapter 8, Supplement 37 (1997). In a typical method, a target DNA is cloned into a single-stranded DNA bacteriophage vehicle. Single-stranded DNA is isolated and hybridized with an oligonucleotide containing the desired nucleotide alteration(s). The complementary strand is synthesized and the double-stranded phage is introduced into a host. Some of the resulting progeny will contain the desired mutant, which can be confirmed using DNA sequencing. In addition, various methods are available that increase the probability that the progeny phage will be the desired mutant. These methods are well known to those in the field and kits are commercially available for generating such mutants.
ISOLATING HOMOLOGS
Methods
By using the sequences disclosed herein as probes or as primers, and techniques such as PCR cloning and colony/plaque hybridization, one skilled in the art can obtain homologs. "Homologs" are essentially naturally-occurring variants and include allelic, species-specific and tissue-specific variants.
Region-specific primers or probes derived from the nucleotide sequence(s) provided can be used to prime DNA synthesis and PCR amplification, as well as to identify colonies containing cloned DNA encoding a homolog using known methods (Innis et al., PCR Protocols, Academic Press, San Diego, CA (1990)). Such an application is useful in diagnostic methods, as described in more detail below, as well as in preparing full-length DNAs from various sources. The PCR primers are preferably at least 15 bases, and more preferably at least 18 bases in length. When selecting a primer sequence, it is preferred that the primer pairs have approximately the same G/C ratio, so that melting temperatures are approximately the same. As a general guide, the formula 3(G+C) + 2(A+T) = °C is useful.
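The rule-of-thumb formula above can be evaluated mechanically when comparing candidate primer pairs. The following is a minimal sketch; the function names and the example primer are hypothetical. The weights default to the values given in the text (3 per G or C, 2 per A or T) and are left as parameters because other weightings are in common use, for example a weight of 4 per G or C in the widely cited Wallace rule.

```python
def rule_of_thumb_tm(primer: str, gc_weight: int = 3, at_weight: int = 2) -> int:
    """Estimate a primer melting temperature in degrees C.

    Implements gc_weight*(G+C) + at_weight*(A+T). The defaults follow the
    formula quoted in the text; pass gc_weight=4 for the Wallace rule.
    """
    primer = primer.upper()
    gc = primer.count("G") + primer.count("C")
    at = primer.count("A") + primer.count("T")
    if gc + at != len(primer):
        raise ValueError("primer may contain only A, C, G and T")
    return gc_weight * gc + at_weight * at


def gc_fraction(primer: str) -> float:
    """Return the fraction of G and C bases in the primer."""
    primer = primer.upper()
    if not primer:
        raise ValueError("primer must not be empty")
    return (primer.count("G") + primer.count("C")) / len(primer)


if __name__ == "__main__":
    forward = "ATGCATGCATGCATGCAT"  # hypothetical 18-base primer
    print(rule_of_thumb_tm(forward))       # 3*8 + 2*10
    print(round(gc_fraction(forward), 2))  # 8 of 18 bases are G or C
```

Comparing the two values for the forward and reverse primers gives a quick check that a pair has approximately the same G/C ratio and estimated melting temperature.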
When using primers derived from the inventive sequences, one skilled in the art will recognize that by employing high stringency conditions (e.g., annealing at 50-60°C), only sequences with greater than 75% sequence identity to the primer will be amplified. By employing lower stringency conditions (e.g., annealing at 35-37°C), sequences which have greater than 40-50% sequence identity to the primer also will be amplified.
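The identity thresholds quoted above can be compared against a candidate priming site with a simple position-by-position count. This sketch assumes an ungapped comparison of two equal-length sequences; the function names and the example sequences are hypothetical, and real amplification also depends on where the mismatches fall (for example near the 3' end), which is not modeled here.

```python
def percent_identity(primer: str, site: str) -> float:
    """Ungapped percent identity between a primer and an equal-length site."""
    if not primer or len(primer) != len(site):
        raise ValueError("sequences must be non-empty and of equal length")
    matches = sum(a == b for a, b in zip(primer.upper(), site.upper()))
    return 100.0 * matches / len(primer)


def passes_threshold(primer: str, site: str, min_identity: float) -> bool:
    """True if identity is strictly greater than the chosen threshold (in %)."""
    return percent_identity(primer, site) > min_identity


if __name__ == "__main__":
    primer = "ACGTACGTAC"
    site = "ACGTACGTTT"  # differs from the primer at the last two positions
    print(percent_identity(primer, site))
    print(passes_threshold(primer, site, 75.0))  # high-stringency guideline
```

The threshold is passed in explicitly so that either guideline from the text (greater than 75% for high stringency, or a value in the 40-50% range for lower stringency) can be applied.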
The PCR product may be subcloned and sequenced to confirm that it indeed displays the expected sequence identity. The PCR fragment may then be used to isolate a full length cDNA clone by a variety of methods. For example, the amplified fragment may be labeled and used to screen a bacteriophage cDNA library. Alternatively, the labeled fragment may be used to screen a genomic library. PCR technology may also be utilized to isolate full length cDNA sequences. For example, RNA may be isolated, following standard procedures, from an appropriate cellular or tissue source. A reverse transcription reaction may be performed on the RNA using an oligonucleotide primer specific for the most 5' end of the amplified fragment for the priming of first strand synthesis. The resulting RNA/DNA hybrid may then be "tailed" with guanines using a standard terminal transferase reaction, the hybrid may be digested with RNase H, and second strand synthesis may then be primed with a poly-C primer. Thus, cDNA sequences upstream of the amplified fragment may easily be isolated. For a review of cloning strategies which may be used, see, e.g., Sambrook et al., 1989, supra.
When using DNA probes derived from the inventive sequences for colony/plaque hybridization, one skilled in the art will recognize that by employing medium to high stringency conditions (e.g., hybridizing at 50-65°C in 5X SSPC and 50% formamide, and washing at 50-65°C in 0.5X SSPC), sequences having regions with greater than 90% sequence identity to the probe can be obtained, and that by employing lower stringency conditions (e.g., hybridizing at 35-37°C in 5X SSPC and 40-45% formamide, and washing at 42°C in SSPC), sequences having regions with greater than 35-45% sequence identity to the probe will be obtained.
Suitably, genomic or cDNA libraries can be constructed and screened in accord with the previous paragraph. The libraries should be derived from a tissue or organism that is known to express the gene of interest, or that is suspected of expressing the gene. The clone containing the homolog may then be purified through methods routinely practiced in the art, and subjected to sequence analysis.
Additionally, an expression library can be constructed utilizing DNA isolated from or cDNA synthesized from a tissue or organism that is known to express the gene of interest, or that is suspected of expressing the gene. In this manner, clones may be induced and screened using standard antibody screening techniques in conjunction with antibodies raised against the normal gene product, as described herein. (For screening techniques, see, for example, Harlow, E. and Lane, eds., 1988, ANTIBODIES: A LABORATORY MANUAL, Cold Spring Harbor Press, Cold Spring Harbor.)
Human Homologs
Any organism or tissue can be used as the source for homologs of the present invention so long as the organism or tissue naturally expresses such a protein or contains genes encoding the same. The most preferred organism for isolating homologs is human.
PROTEINS OF THE INVENTION
One class of proteins included within the invention is encoded by the inventive DNA molecules presented. Other proteins according to the invention are those encoded by the DNA variants described above. As noted, these variants are designed with the encoded proteins in mind. A preferred class of protein fragments includes those fragments which retain any biological activity. These molecules share functional features common to the family of proteins, although these characteristics may vary in degree.
According to one aspect of the invention, fragments of the inventive proteins are contemplated. Some preferred fragments are those which are capable of eliciting an immune response. Generally these "antigenic" fragments will be from about five amino acids in length to about fifty amino acids in length. Some preferred antigenic fragments are from five to about twenty amino acids long. "Antigenic" response may refer to a T cell response, a B cell response or a response by cells of the macrophage/monocyte lineages. In most cases, however, it will refer to the immune response involved in the generation of antibodies. In other words, the relevant immune response is that of helper T cells and/or B cells. These preferred molecules comprise one or more T cell and/or B cell epitopes.
ANTIBODIES OF THE INVENTION
Antibodies raised against the proteins and protein fragments of the invention also are contemplated by the invention. Described below are antibody products and methods for producing antibodies capable of specifically recognizing one or more epitopes of the presently described proteins and their derivatives. Antibodies include, but are not limited to, polyclonal antibodies, monoclonal antibodies (mAbs), humanized or chimeric antibodies, single chain antibodies including single chain Fv (scFv) fragments, Fab fragments, F(ab')2 fragments, fragments produced by a Fab expression library, anti-idiotypic (anti-Id) antibodies, epitope-binding fragments, and humanized forms of any of the above. As known to one in the art, these antibodies may be used, for example, in the detection of a target protein in a biological sample. They also may be utilized as part of treatment methods, and/or may be used as part of diagnostic techniques whereby patients may be tested for abnormal levels or for the presence of abnormal forms of such proteins. In general, techniques for preparing polyclonal and monoclonal antibodies as well as hybridomas capable of producing the desired antibody are well known in the art (Campbell, A.M., Monoclonal Antibody Technology: Laboratory Techniques in Biochemistry and Molecular Biology, Elsevier Science Publishers, Amsterdam, The Netherlands (1984); St. Groth et al., J. Immunol. Methods 35:1-21 (1980); Kohler and Milstein, Nature 256:495-497 (1975)), the trioma technique, the human B-cell hybridoma technique (Kozbor et al., Immunology Today 4:72 (1983); Cole et al., in Monoclonal Antibodies and Cancer Therapy, Alan R. Liss, Inc. (1985), pp. 77-96). Antibodies may also be generated by the known techniques of phage display and in vitro immunization.
Polyclonal Antibodies
Polyclonal antibodies are heterogeneous populations of antibody molecules derived from the sera of animals immunized with an antigen, such as an inventive protein or an antigenic derivative thereof.
Polyclonal antiserum, containing antibodies to heterogeneous epitopes of a single protein, can be prepared by immunizing suitable animals with the expressed protein described above, which can be unmodified or modified, as known in the art, to enhance immunogenicity. Immunization methods include subcutaneous or intraperitoneal injection of the polypeptide.
Effective polyclonal antibody production is affected by many factors related both to the antigen and to the host species. For example, small molecules tend to be less immunogenic than others and may require the use of carriers and/or adjuvant. In addition, host animal response may vary with site of inoculation. Both inadequate and excessive doses of antigen may result in low titer antisera. In general, however, small doses (high ng to low μg levels) of antigen administered at multiple intradermal sites appear to be most reliable. Host animals may include but are not limited to rabbits, mice, chickens and rats, to name but a few. An effective immunization protocol for rabbits can be found in Vaitukaitis, J. et al., J. Clin. Endocrinol. Metab. 33:988-991 (1971). The protein immunogen may be modified or administered in an adjuvant in order to increase the protein's antigenicity. Methods of increasing the antigenicity of a protein are well known in the art and include, but are not limited to, coupling the antigen with a heterologous protein (such as globulin or β-galactosidase) or the inclusion of an adjuvant during immunization. Adjuvants include Freund's (complete and incomplete), mineral gels such as aluminum hydroxide, surface active substances such as lysolecithin, pluronic polyols, polyanions, peptides, oil emulsions, keyhole limpet hemocyanin, dinitrophenol, and potentially useful human adjuvants such as BCG (bacille Calmette-Guerin) and Corynebacterium parvum.
Booster injections can be given at regular intervals, with at least one usually being required for optimal antibody production. The antiserum may be harvested when the antibody titer begins to fall. Titer may be determined semi-quantitatively, for example, by double immunodiffusion in agar against known concentrations of the antigen. See, for example, Ouchterlony et al., Chap. 19 in: Handbook of Experimental Immunology, Weir, ed., Blackwell (1973). Plateau concentration of antibody is usually in the range of 0.1 to 0.2 mg/ml of serum (about 12 μM). The antiserum may be purified by affinity chromatography using the immobilized immunogen carried on a solid support. Such methods of affinity chromatography are well known in the art. Affinity of the antisera for the antigen may be determined by preparing competitive binding curves, as described, for example, by Fisher, Chap. 42 in: Manual of Clinical Immunology, second edition, Rose and Friedman, eds., Amer. Soc. For Microbiology, Washington, D.C. (1980).
In addition to using protein as the immunogen, DNA molecules may be used directly. In this manner, a DNA encoding the protein immunogen is administered. Boosting and harvesting are done in a manner analogous to that detailed above. Yet another method of producing antibodies entails immunizing chickens and harvesting the antibodies from their eggs.
Monoclonal Antibodies
Monoclonal antibodies (MAbs) are homogeneous populations of antibodies to a particular antigen. They may be obtained by any technique which provides for the production of antibody molecules by continuous cell lines in culture or in vivo. MAbs may be produced by making hybridomas, which are immortalized cells capable of secreting a specific monoclonal antibody.
Monoclonal antibodies to any of the proteins, peptides and epitopes thereof described herein can be prepared from murine hybridomas according to the classical method of Kohler, G. and Milstein, C., Nature 256:495-497 (1975) (and U.S. Patent No. 4,376,110) or modifications of the methods thereof, such as the human B-cell hybridoma technique (Kozbor et al., 1983, Immunology Today 4:72; Cole et al., 1983, Proc. Natl. Acad. Sci. USA 80:2026-2030), and the EBV-hybridoma technique (Cole et al., 1985, MONOCLONAL ANTIBODIES AND CANCER THERAPY, Alan R. Liss, Inc., pp. 77-96).
In one method a mouse is repetitively inoculated with a few micrograms of the selected protein over a period of a few weeks. The mouse is then sacrificed, and the antibody producing cells of the spleen are isolated. The spleen cells are fused, typically using polyethylene glycol, with mouse myeloma cells, such as SP2/0-Ag14 myeloma cells. The excess, unfused cells are destroyed by growth of the system on selective media comprising aminopterin (HAT media). The successfully fused cells are diluted, and aliquots are plated to microtiter plates where growth is continued. Antibody-producing clones (hybridomas) are identified by detection of antibody in the supernatant fluid of the wells by immunoassay procedures. These include ELISA, as originally described by Engvall, Meth. Enzymol. 70:419 (1980), western blot analysis, radioimmunoassay (Lutz et al., Exp. Cell Res. 175:109-124 (1988)) and modified methods thereof. Selected positive clones can be expanded and their monoclonal antibody product harvested for use. Detailed procedures for monoclonal antibody production are described in Davis, L. et al., BASIC METHODS IN MOLECULAR BIOLOGY, Elsevier, New York, Section 21-2 (1989). The hybridoma clones may be cultivated in vitro or in vivo, for instance as ascites. Production of high titers of mAbs in vivo makes this the presently preferred method of production. Alternatively, hybridoma culture in hollow fiber bioreactors provides a continuous high yield source of monoclonal antibodies.
The antibody class and subclass may be determined using procedures known in the art (Campbell, A.M., Monoclonal Antibody Technology: Laboratory Techniques in Biochemistry and Molecular Biology, Elsevier Science Publishers, Amsterdam, The Netherlands (1984)). MAbs may be of any immunoglobulin class including IgG, IgM, IgE, IgA, IgD and any subclass thereof. Methods of purifying monoclonal antibodies are well known in the art.
Antibody Derivatives and Fragments
Fragments or derivatives of antibodies include any portion of the antibody which is capable of binding the target antigen, or a specific portion thereof. Antibody derivatives include poly-specific (e.g., bi-specific) antibodies, which contain binding sites specific for two or more different epitopes. These epitopes may be from the same or different inventive molecules, or one or more epitopes may be from a molecule not specifically disclosed here. Antibody fragments specifically include F(ab')2, Fab, Fab' and Fv fragments. These can be generated from any class of antibody, but typically are made from IgG or IgM. They may be made by conventional recombinant DNA techniques or, using the classical method, by proteolytic digestion with papain or pepsin. See CURRENT PROTOCOLS IN IMMUNOLOGY, chapter 2, Coligan et al., eds., (John Wiley & Sons 1991-92).
F(ab')2 fragments are typically about 110 kDa (IgG) or about 150 kDa (IgM) and contain two antigen-binding regions, joined at the hinge by disulfide bond(s). Virtually all, if not all, of the Fc is absent in these fragments. Fab' fragments are typically about 55 kDa (IgG) or about 75 kDa (IgM) and can be formed, for example, by reducing the disulfide bond(s) of an F(ab')2 fragment. The resulting free sulfhydryl group(s) may be used to conveniently conjugate Fab' fragments to other molecules, such as detection reagents (e.g., enzymes).
Fab fragments are monovalent and usually are about 50 kDa (from any source). Fab fragments include the light (L) and heavy (H) chain, variable (VL and VH, respectively) and constant (CL and CH, respectively) regions of the antigen-binding portion of the antibody. The H and L portions are linked by an intramolecular disulfide bridge.
Fv fragments are typically about 25 kDa (regardless of source) and contain the variable regions of both the light and heavy chains (VL and VH, respectively). Usually, the VL and VH chains are held together only by non-covalent interactions and, thus, they readily dissociate. They do, however, have the advantage of small size and they retain the same binding properties of the larger Fab fragments. Accordingly, methods have been developed to crosslink the VL and VH chains, using, for example, glutaraldehyde (or other chemical crosslinkers), intermolecular disulfide bonds (by incorporation of cysteines) and peptide linkers. The resulting Fv is now a single chain (i.e., scFv). Other antibody derivatives include single chain antibodies (U.S. Patent 4,946,778; Bird, Science 242:423-426 (1988); Huston et al., Proc. Natl. Acad. Sci. USA 85:5879-5883 (1988); and Ward et al., Nature 334:544-546 (1989)). Single chain antibodies are formed by linking the heavy and light chain fragments of the Fv region via an amino acid bridge, resulting in a single chain Fv (scFv).
One preferred method involves the generation of scFvs by recombinant methods, which allows the generation of Fvs with new specificities by mixing and matching variable chains from different antibody sources. In a typical method, a recombinant vector would be provided which comprises the appropriate regulatory elements driving expression of a cassette region. The cassette region would contain a DNA encoding a peptide linker, with convenient sites at both the 5' and 3' ends of the linker for generating fusion proteins. The DNA encoding a variable region(s) of interest may be cloned in the vector to form fusion proteins with the linker, thus generating an scFv.
In an exemplary alternative approach, DNAs encoding two Fvs may be ligated to the DNA encoding the linker, and the resulting tripartite fusion may be ligated directly into a conventional expression vector. The scFv DNAs generated by any of these methods may be expressed in prokaryotic or eukaryotic cells, depending on the vector chosen.
Antibody fragments which recognize specific epitopes may be generated by known techniques. For example, such fragments include but are not limited to: the F(ab')2 fragments which can be produced by pepsin digestion of the antibody molecule and the Fab fragments which can be generated by reducing the disulfide bridges of the F(ab')2 fragments. Alternatively, Fab expression libraries may be constructed (Huse et al., 1989, Science 246:1275-1281) to allow rapid and easy identification of monoclonal Fab fragments with the desired specificity. Derivatives also include "chimeric antibodies" (Morrison et al., Proc. Natl. Acad. Sci. 81:6851-6855 (1984); Neuberger et al., Nature 312:604-608 (1984); Takeda et al., Nature 314:452-454 (1985)). These chimeras are made by splicing the DNA encoding a mouse antibody molecule of appropriate specificity with, for instance, DNA encoding a human antibody molecule of appropriate specificity. Thus, a chimeric antibody is a molecule in which different portions are derived from different animal species, such as those having a variable region derived from a murine mAb and a human immunoglobulin constant region. These are also known sometimes as "humanized" antibodies and they offer the added advantage of at least partial shielding from the human immune system. They are, therefore, particularly useful in therapeutic in vivo applications.
Labeled Antibodies
The present invention further provides the above-described antibodies in detectably labeled form. Antibodies can be detectably labeled through the use of radioisotopes, affinity labels (such as biotin, avidin, etc.), enzymatic labels (such as horseradish peroxidase, alkaline phosphatase, etc.), fluorescent labels (such as FITC or rhodamine, etc.), paramagnetic atoms, etc. Procedures for accomplishing such labeling are well known in the art; for example, see Sternberger et al., J. Histochem. Cytochem. 18:315 (1970); Bayer et al., Meth. Enzym. 62:308 (1979); Engval et al., Immunol. 109:129 (1972); Goding, J. Immunol. Meth. 13:215 (1976). The labeled antibodies of the present invention can be used for in vitro, in vivo, and in situ diagnostic assays.
Immobilized Antibodies
The foregoing antibodies also may be immobilized on a solid support. Examples of such solid supports include plastics such as polycarbonate, complex carbohydrates such as agarose and sepharose, acrylic resins such as polyacrylamide, and latex beads. Techniques for coupling antibodies to such solid supports are well known in the art (Weir et al., "Handbook of Experimental Immunology" 4th Ed., Blackwell Scientific Publications, Oxford, England, Chapter 10 (1986); Jacoby et al., Meth. Enzym. 34, Academic Press, N.Y. (1974)). The immobilized antibodies of the present invention can be used for in vitro, in vivo, and in situ assays as well as for immunoaffinity purification of the proteins of the present invention.
THERAPEUTIC AND DIAGNOSTIC COMPOSITIONS
The proteins, antibodies and polynucleotides of the present invention can be formulated according to known methods to prepare pharmaceutically useful compositions, whereby these materials, or their functional derivatives, are combined in admixture with a pharmaceutically acceptable carrier vehicle. Suitable vehicles and their formulation, inclusive of other human proteins, e.g., human serum albumin, are described, for example, in Remington's Pharmaceutical Sciences (16th ed., Osol, A., Ed., Mack, Easton PA (1980)). In order to form a pharmaceutically acceptable composition suitable for effective administration, such compositions will contain an effective amount of one or more of the agents of the present invention, together with a suitable amount of carrier vehicle.
Pharmaceutical compositions for use in accordance with the present invention may be formulated in conventional manner using one or more physiologically acceptable carriers or excipients. Thus, the compounds and their physiologically acceptable salts and solvates may be formulated for administration by inhalation or insufflation (either through the mouth or the nose) or oral, buccal, parenteral or rectal administration.
For oral administration, the pharmaceutical compositions may take the form of, for example, tablets or capsules prepared by conventional means with pharmaceutically acceptable excipients such as binding agents (e.g., pregelatinised maize starch, polyvinylpyrrolidone or hydroxypropyl methylcellulose); fillers (e.g., lactose, microcrystalline cellulose or calcium hydrogen phosphate); lubricants (e.g., magnesium stearate, talc or silica); disintegrants (e.g., potato starch or sodium starch glycolate); or wetting agents (e.g., sodium lauryl sulphate). The tablets may be coated by methods well known in the art. Liquid preparations for oral administration may take the form of, for example, solutions, syrups or suspensions, or they may be presented as a dry product for constitution with water or other suitable vehicle before use. Such liquid preparations may be prepared by conventional means with pharmaceutically acceptable additives such as suspending agents (e.g., sorbitol syrup, cellulose derivatives or hydrogenated edible fats); emulsifying agents (e.g., lecithin or acacia); non-aqueous vehicles (e.g., almond oil, oily esters, ethyl alcohol or fractionated vegetable oils); and preservatives (e.g., methyl or propyl-p-hydroxybenzoates or sorbic acid). The preparations may also contain buffer salts, flavoring, coloring and sweetening agents as appropriate. Preparations for oral administration may be suitably formulated to give controlled release of the active compound. For buccal administration the composition may take the form of tablets or lozenges formulated in conventional manner.
For administration by inhalation, the compounds for use according to the present invention are conveniently delivered in the form of an aerosol spray presentation from pressurized packs or a nebuliser, with the use of a suitable propellant, e.g., dichlorodifluoromethane, trichlorofluoromethane, dichlorotetrafluoroethane, carbon dioxide or other suitable gas. In the case of a pressurized aerosol the dosage unit may be determined by providing a valve to deliver a metered amount. Capsules and cartridges of, e.g., gelatin for use in an inhaler or insufflator may be formulated containing a powder mix of the compound and a suitable powder base such as lactose or starch. The compounds may be formulated for parenteral administration by injection, e.g., by bolus injection or continuous infusion. Formulations for injection may be presented in unit dosage form, e.g., in ampules or in multi-dose containers, with an added preservative. The compositions may take such forms as suspensions, solutions or emulsions in oily or aqueous vehicles, and may contain formulatory agents such as suspending, stabilizing and/or dispersing agents. Alternatively, the active ingredient may be in powder form for constitution with a suitable vehicle, e.g., sterile pyrogen-free water, before use.
The compounds may also be formulated in rectal compositions such as suppositories or retention enemas, e.g., containing conventional suppository bases such as cocoa butter or other glycerides.
In addition to the formulations described previously, the compounds may also be formulated as a depot preparation. Such long acting formulations may be administered by implantation (for example subcutaneously or intramuscularly) or by intramuscular injection. Thus, for example, the compounds may be formulated with suitable polymeric or hydrophobic materials (for example as an emulsion in an acceptable oil) or ion exchange resins, or as sparingly soluble derivatives, for example, as a sparingly soluble salt.
The compositions may, if desired, be presented in a pack or dispenser device which may contain one or more unit dosage forms containing the active ingredient. The pack may for example comprise metal or plastic foil, such as a blister pack. The pack or dispenser device may be accompanied by instructions for administration.
RECOMBINANT CONSTRUCTS AND EXPRESSION
The present invention further provides recombinant DNA constructs comprising one or more of the nucleotide sequences of the present invention. The recombinant constructs of the present invention comprise a vector, such as a plasmid or viral vector, into which a DNA or DNA fragment, typically bearing an open reading frame, is inserted, in either orientation. The gene products encoded by the subject DNAs may be produced by recombinant DNA technology using techniques well known in the art. See, for example, the techniques described in Sambrook et al., 1989, supra, and Ausubel et al., 1989, supra. Alternatively, the DNA sequences may be chemically synthesized using, for example, synthesizers. See, for example, the techniques described in OLIGONUCLEOTIDE SYNTHESIS, 1984, Gait, ed., IRL Press, Oxford, which is incorporated by reference herein in its entirety. They may be assembled from fragments and short oligonucleotide linkers, or from a series of oligonucleotides. They are preferably made by RT-PCR methods. The resulting synthetic gene is capable of being expressed in a recombinant vector. In some cases the recombinant constructs will be expression vectors, which are capable of expressing the RNA and/or protein products of the encoded DNA(s). Thus, the vector may further comprise regulatory sequences, including, for example, a promoter, operably linked to the open reading frame (ORF). The vector may further comprise a selectable marker sequence.
Specific initiation signals may also be required for efficient translation of inserted target gene coding sequences. These signals include the ATG initiation codon and adjacent sequences. In cases where a target DNA that includes its own initiation codon and adjacent sequences is inserted into the appropriate expression vector, no additional translation control signals may be needed. However, in cases where only a portion of an ORF is used, exogenous translational control signals, including, perhaps, the ATG initiation codon, must be provided. Furthermore, the initiation codon must be in phase with the reading frame of the desired coding sequence to ensure translation of the entire target. These exogenous translational control signals and initiation codons can be of a variety of origins, both natural and synthetic. The efficiency of expression may be enhanced by the inclusion of appropriate transcription enhancer elements, transcription terminators, etc. (see Bittner et al., Methods in Enzymol. 153:516-544 (1987)). Some appropriate cloning and expression vectors for use with prokaryotic and eukaryotic hosts are described by Sambrook et al., in Molecular Cloning: A Laboratory Manual, Second Edition, Cold Spring Harbor, New York (1989), the disclosure of which is hereby incorporated by reference.
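The requirement that an exogenous initiation codon be in phase with the desired coding sequence can be checked on a construct sequence before cloning. The sketch below is illustrative only: the function name, its arguments and the example constructs are hypothetical, positions are 0-based, and the check is limited to frame and intervening stop codons on the given strand.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def start_codon_in_phase(construct: str, atg_index: int, cds_start: int) -> bool:
    """Check that the ATG at atg_index is in frame with the coding sequence.

    Returns True when the distance from the ATG to cds_start is a multiple
    of three and no stop codon occurs in that frame between the two positions.
    """
    construct = construct.upper()
    if construct[atg_index:atg_index + 3] != "ATG":
        raise ValueError("no ATG codon at atg_index")
    if cds_start < atg_index:
        raise ValueError("coding sequence must start at or after the ATG")
    if (cds_start - atg_index) % 3 != 0:
        return False  # the ATG is out of phase with the coding sequence
    # Walk codon by codon from the ATG up to the start of the coding sequence.
    for i in range(atg_index, cds_start, 3):
        if construct[i:i + 3] in STOP_CODONS:
            return False  # translation would terminate before the target
    return True


if __name__ == "__main__":
    # Hypothetical construct: 2-nt leader, ATG, one linker codon, then the target.
    construct = "CCATGGCTAAAGGT"
    print(start_codon_in_phase(construct, atg_index=2, cds_start=8))
    print(start_codon_in_phase(construct, atg_index=2, cds_start=9))
```

A False result indicates that the junction between the vector-supplied ATG and the inserted fragment would need one or two nucleotides added or removed, or an intervening stop codon eliminated, before the entire target could be translated.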
If desired, to enhance expression and facilitate proper protein folding, the codon context and codon pairing of the sequence may be optimized for the particular expression organism, as explained by Hatfield et al., U.S. Patent No. 5,082,767.
The present invention further provides host cells containing at least one of the DNAs of the present invention. The host cell can be virtually any cell for which expression vectors are available. It may be, for example, a higher eukaryotic host cell, such as a mammalian cell, a lower eukaryotic host cell, such as a yeast cell, or the host cell can be a prokaryotic cell, such as a bacterial cell. Introduction of the recombinant construct into the host cell can be effected by calcium phosphate transfection, DEAE-dextran mediated transfection, or electroporation (Davis et al., Basic Methods in Molecular Biology (1986)).
A wide variety of expression systems are available, such as: yeast (e.g., Saccharomyces, Pichia) transformed with recombinant yeast expression vectors containing the target DNA; insect cell systems infected with recombinant virus expression vectors (e.g., baculovirus) containing the target DNA sequences; plant cell systems infected with recombinant virus expression vectors (e.g., cauliflower mosaic virus, CaMV; tobacco mosaic virus, TMV) or transformed with recombinant plasmid expression vectors (e.g., Ti plasmid) containing target DNA coding sequences; or mammalian cell systems (e.g., COS, CHO, BHK, 293, 3T3) harboring recombinant expression constructs containing promoters derived from the genome of mammalian cells (e.g., metallothionein promoter) or from mammalian viruses (e.g., the adenovirus late promoter; the vaccinia virus 7.5K promoter).
Depending on the system chosen, the resulting product may differ. For example, proteins expressed in most bacterial cultures, e.g., E. coli, will be free of glycosylation modifications; polypeptides or proteins expressed in yeast will have a glycosylation pattern different from that expressed in mammalian cells.
Vectors
Generally, recombinant expression vectors will include origins of replication and selectable markers permitting selection of the host cell, e.g., the ampicillin resistance gene of E. coli and S. cerevisiae TRP1 gene, and a promoter derived from a highly-expressed gene to direct transcription of a downstream structural sequence. Such promoters can be derived from operons encoding glycolytic enzymes such as 3-phosphoglycerate kinase (PGK), α-factor, acid phosphatase, or heat shock proteins, among others. The heterologous structural sequence is assembled in appropriate phase with translation initiation and termination sequences, and in one aspect of the invention, a leader sequence capable of directing secretion of translated protein into the periplasmic space or extracellular medium. Optionally, the heterologous sequence can encode a fusion protein including an N-terminal or C-terminal identification peptide imparting desired characteristics, e.g., stabilization or simplified purification of expressed recombinant product.
Bacterial Expression
Useful expression vectors for bacterial use are constructed by inserting a structural DNA sequence encoding a desired protein together with suitable translation initiation and termination signals in operable reading phase with a functional promoter. The vector will comprise one or more phenotypic selectable markers and an origin of replication to ensure maintenance of the vector and, if desirable, to provide amplification within the host. Suitable prokaryotic hosts for transformation include E. coli, Bacillus subtilis, Salmonella typhimurium and various species within the genera Pseudomonas, Streptomyces, and Staphylococcus, although others may also be employed as a matter of choice.
Bacterial vectors may be, for example, bacteriophage-, plasmid- or cosmid-based. These vectors can comprise a selectable marker and bacterial origin of replication derived from commercially available plasmids typically containing elements of the well known cloning vector pBR322 (ATCC 37017). Such commercial vectors include, for example, GEM 1 (Promega Biotec, Madison, WI, USA), pBs, phagescript, PsiX174, pBluescript SK, pBs KS, pNH8a, pNH16a, pNH18a, pNH46a (Stratagene), pTrc99A, pKK223-3, pKK233-3, pKK232-8, pDR540, and pRIT5 (Pharmacia). These "backbone" sections are combined with an appropriate promoter and the structural sequence to be expressed. Bacterial promoters include lac, T3, T7, lambda PR or PL, trp, and ara.
Following transformation of a suitable host strain and growth of the host strain to an appropriate cell density, the selected promoter is derepressed/induced by appropriate means (e.g., temperature shift or chemical induction) and cells are cultured for an additional period. Cells are typically harvested by centrifugation, disrupted by physical or chemical means, and the resulting crude extract retained for further purification.
In bacterial systems, a number of expression vectors may be advantageously selected depending upon the use intended for the protein being expressed. For example, when a large quantity of such a protein is to be produced, for the generation of antibodies or to screen peptide libraries, for example, vectors which direct the expression of high levels of fusion protein products that are readily purified may be desirable. Such vectors include, but are not limited to, the E. coli expression vector pUR278 (Ruther et al., 1983, EMBO J. 2:1791), in which the coding sequence may be ligated into the vector in frame with the lac Z coding region so that a fusion protein is produced; pIN vectors (Inouye et al., 1985, Nucleic Acids Res. 13:3101-3109; Van Heeke et al., 1989, J. Biol. Chem. 264:5503-5509); pET vectors (Studier et al., Methods in Enzymology 185:60-89 (Academic Press 1990)); and the like.
Moreover, pGEX vectors may be used to express foreign polypeptides as fusion proteins with glutathione S-transferase (GST). In general, such fusion proteins are soluble and can easily be purified from lysed cells by adsorption to glutathione-agarose beads followed by elution in the presence of free glutathione. The pGEX vectors are designed to include thrombin or factor Xa protease cleavage sites so that the cloned target gene protein can be released from the GST moiety. In one embodiment, full length cDNA sequences are appended with in-frame BamHI sites at the amino terminus and EcoRI sites at the carboxyl terminus using standard PCR methodologies (Innis et al., 1990, supra) and ligated into the pGEX-2TK vector (Pharmacia, Uppsala, Sweden). The resulting cDNA construct contains a kinase recognition site at the amino terminus for radioactive labeling and glutathione S-transferase sequences at the carboxyl terminus for affinity purification (Nilsson, et al., 1985, EMBO J. 4:1075; Zabeau and Stanley, 1982, EMBO J. 1:1217).
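The PCR step described above, in which BamHI and EcoRI sites are appended to the ends of a full length cDNA, can be sketched in Python. This is an illustrative sketch under stated assumptions rather than the method actually used: the function names, the 18-nucleotide annealing length, and the three-base 5' clamp `CGC` are hypothetical choices, and the code does not verify that the insert ends up in frame with the vector's GST coding sequence, which must be checked against the vector map.

```python
BAMHI_SITE = "GGATCC"  # BamHI recognition sequence
ECORI_SITE = "GAATTC"  # EcoRI recognition sequence
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def design_primers(orf: str, anneal_len: int = 18, clamp: str = "CGC"):
    """Build a forward primer carrying BamHI and a reverse primer carrying EcoRI.

    anneal_len and clamp are assumed, illustrative defaults.
    """
    orf = orf.upper()
    # An internal site would be cut during the cloning digest.
    for site in (BAMHI_SITE, ECORI_SITE):
        if site in orf:
            raise ValueError(f"ORF contains an internal {site} site")
    forward = clamp + BAMHI_SITE + orf[:anneal_len]
    reverse = clamp + ECORI_SITE + reverse_complement(orf[-anneal_len:])
    return forward, reverse
```

The internal-site check reflects a practical constraint of this cloning strategy: a cDNA that itself contains either recognition sequence would need different enzymes or a different approach.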
Eukaryotic Expression
Various mammalian cell culture systems can also be employed to express recombinant protein. Examples of mammalian expression systems include the COS-7 lines of monkey kidney fibroblasts, described by Gluzman, Cell 23:175 (1981), and other cell lines capable of expressing a compatible vector, for example, the C127, 3T3, CHO, HeLa and BHK cell lines. Mammalian expression vectors will comprise an origin of replication, a suitable promoter and enhancer, and also any necessary ribosome binding sites, polyadenylation site, splice donor and acceptor sites, transcriptional termination sequences, and 5' flanking nontranscribed sequences. DNA sequences derived from the SV40 viral genome, for example, SV40 origin, early promoter, enhancer, splice, and polyadenylation sites may be used to provide the required nontranscribed genetic elements.
Mammalian promoters include CMV immediate early, HSV thymidine kinase, early and late SV40, LTRs from retrovirus, and mouse metallothionein-I. Exemplary mammalian vectors include pWLneo, pSV2cat, pOG44, pXT1, pSG (Stratagene), pSVK3, pBPV, pMSG, and pSVL (Pharmacia). Selectable markers include CAT (chloramphenicol transferase).
In mammalian host cells, a number of viral-based expression systems may be utilized. In cases where an adenovirus is used as an expression vector, the coding sequence of interest may be ligated to an adenovirus transcription/translation control complex, e.g., the late promoter and tripartite leader sequence. This chimeric gene may then be inserted in the adenovirus genome by in vitro or in vivo recombination. Insertion in a non-essential region of the viral genome (e.g., region E1 or E3) will result in a recombinant virus that is viable and capable of expressing a target protein in infected hosts. (E.g., see Logan et al., 1984, Proc. Natl. Acad. Sci. USA 81:3655-3659).
In one embodiment, cDNA sequences encoding the full-length open reading frames are ligated into pCMVβ replacing the β-galactosidase gene such that cDNA expression is driven by the CMV promoter (Alam, 1990, Anal. Biochem. 188:245-254; MacGregor et al., 1989, Nucl. Acids Res. 17:2365; Norton et al., 1985, Mol. Cell. Biol. 5:281). In addition, a host cell strain may be chosen which modulates the expression of the inserted sequences, or modifies and processes the gene product in the specific fashion desired. Such modifications (e.g., glycosylation) and processing (e.g., cleavage) of protein products may be important for the function of the protein. Different host cells have characteristic and specific mechanisms for the post-translational processing and modification of proteins. Appropriate cell lines or host systems can be chosen to ensure the correct modification and processing of the foreign protein expressed. To this end, eukaryotic host cells which possess the cellular machinery for proper processing of the primary transcript, glycosylation, and phosphorylation of the gene product may be used. Such mammalian host cells include but are not limited to CHO, VERO, BHK, HeLa, COS, MDCK, 293, 3T3, WI38, etc.
For long-term, high-yield production of recombinant proteins in eukaryotic cells, stable expression is preferred. Rather than using expression vectors which contain viral origins of replication, host cells can be transformed with DNA controlled by appropriate expression control elements (e.g., promoter, enhancer sequences, transcription terminators, polyadenylation sites, etc.), and a selectable marker. Following the introduction of the foreign DNA, engineered cells may be allowed to grow for 1-2 days in an enriched medium, and then are switched to a selective medium. The selectable marker in the recombinant plasmid confers resistance to the selection and allows cells to stably integrate the plasmid into their chromosomes and grow to form foci which in turn can be cloned and expanded into cell lines. This method may advantageously be used to engineer cell lines which express the target protein. Such engineered cell lines may be particularly useful in screening and evaluation of compounds that affect the endogenous activity of the protein.
A number of selection systems may be used, including but not limited to the herpes simplex virus thymidine kinase (Wigler, et al., Cell 11:223 (1977)), hypoxanthine-guanine phosphoribosyltransferase (Szybalska et al., Proc. Natl. Acad. Sci. USA 48:2026 (1962)), and adenine phosphoribosyltransferase (Lowy, et al., Cell 22:817 (1980)) genes, which can be employed in tk-, hgprt- or aprt- cells, respectively. Also, antimetabolite resistance can be used as the basis of selection for dhfr, which confers resistance to methotrexate (Wigler, et al., Proc. Natl. Acad. Sci. USA 77:3567 (1980); O'Hare, et al., 1981, Proc. Natl. Acad. Sci. USA 78:1527); gpt, which confers resistance to mycophenolic acid (Mulligan et al., Proc. Natl. Acad. Sci. USA 78:2072 (1981)); neo, which confers resistance to the aminoglycoside G-418 (Colberre-Garapin, et al., 1981, J. Mol. Biol. 150:1); and hygro, which confers resistance to hygromycin (Santerre, et al., 1984, Gene 30:147) genes.
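The selection systems listed above pair each marker gene with either a deficient host cell type or a selective agent. As a reading aid only, that pairing can be written as a small lookup table. The Python names below, and the labels "complementation" and "resistance" used to group the entries, are assumptions introduced for illustration; the marker names and their conditions are taken from the paragraph above.

```python
# marker gene -> (kind of selection, required host cells or selective agent)
SELECTION_SYSTEMS = {
    "tk":    ("complementation", "tk- cells"),
    "hgprt": ("complementation", "hgprt- cells"),
    "aprt":  ("complementation", "aprt- cells"),
    "dhfr":  ("resistance", "methotrexate"),
    "gpt":   ("resistance", "mycophenolic acid"),
    "neo":   ("resistance", "G-418"),
    "hygro": ("resistance", "hygromycin"),
}


def selection_condition(marker: str) -> str:
    """Return the host cell type or selective agent paired with a marker."""
    return SELECTION_SYSTEMS[marker][1]


def markers_for(kind: str) -> list:
    """List the markers of one kind ("complementation" or "resistance")."""
    return sorted(m for m, (k, _) in SELECTION_SYSTEMS.items() if k == kind)
```

For instance, `selection_condition("neo")` returns `"G-418"`, matching the text.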
An alternative fusion protein system allows for the ready purification of non-denatured fusion proteins expressed in human cell lines (Janknecht, et al., Proc. Natl. Acad. Sci. USA 88:8972-8976 (1991)). In this system, the gene of interest is subcloned into a vaccinia-based plasmid such that the gene's open reading frame is translationally fused to an amino-terminal tag consisting of six histidine residues. Extracts from cells infected with recombinant vaccinia virus are loaded onto Ni2+ nitrilotriacetic acid-agarose columns and histidine-tagged proteins are selectively eluted with imidazole-containing buffers.
In an insect system, Autographa californica nuclear polyhedrosis virus (AcNPV) is used as a vector to express foreign genes. The virus grows in Spodoptera frugiperda cells. The target coding sequence may be cloned individually into non-essential regions (for example the polyhedrin gene) of the virus and placed under control of an AcNPV promoter (for example the polyhedrin promoter). Successful insertion of a target gene coding sequence will result in inactivation of the polyhedrin gene and production of non-occluded recombinant virus (i.e., virus lacking the proteinaceous coat coded for by the polyhedrin gene). These recombinant viruses are then used to infect Spodoptera frugiperda cells in which the inserted gene is expressed. (E.g., see Smith et al., 1983, J. Virol. 46:584; Smith, U.S. Patent No. 4,215,051).
While the present proteins can be expressed in recombinant systems, as described above, cell-free translation systems can also be employed to produce such proteins using RNAs derived from the DNA constructs of the present invention.
Purification of Recombinant Proteins
Recombinant proteins produced may be isolated by host cell lysis. This may be followed by one or more salting-out, aqueous ion exchange or size exclusion chromatography steps. Finally, high performance liquid chromatography (HPLC) can be employed for final purification steps. Microbial cells employed in expression of proteins can be disrupted by any convenient method, including freeze-thaw cycling, sonication, mechanical disruption, or use of cell lysing agents, like lysozyme and chelators. If inclusion bodies are formed in bacterial systems, they may be extracted from cell pellets using, for example, detergents, reducing agents, salts, urea, guanidinium chloride and extremes of pH (e.g., <4 or >10). If denaturation occurs, protein refolding steps (e.g., dialysis) can be used, as necessary, in completing configuration of the mature protein. If disulfide bridges are present in the native protein, they may be reoxidized using known methods.
By way of specific non-limiting example, the recombinant bacterial cells, for example E. coli, are grown in any of a number of suitable media, for example LB, and the expression of the recombinant protein induced by adding IPTG (e.g., lac operator-promoter) to the media or switching incubation to a higher temperature (e.g., λ cI857). After culturing the bacteria for a further period of between 2 and 24 hours, the cells are collected by centrifugation and washed to remove residual media. The bacterial cells are then lysed, for example, by disruption in a cell homogenizer and centrifuged to separate the cell membranes from the soluble cell components. If the protein aggregates into inclusion bodies, this centrifugation can be performed under conditions whereby the dense inclusion bodies are selectively enriched by incorporation of sugars such as sucrose into the buffer and centrifugation at a selective speed. The inclusion bodies can then be washed in any of several solutions to remove some of the contaminating host proteins, then solubilized in solutions containing high concentrations of urea (e.g., 6 M) or chaotropic agents such as guanidinium hydrochloride in the presence of reducing agents such as β-mercaptoethanol or DTT (dithiothreitol). At this stage it may be advantageous to incubate the protein for several hours under conditions suitable for the protein to undergo a refolding process into a conformation which more closely resembles that of the native protein. Such conditions generally include low protein concentrations (less than 500 μg/ml), low levels of reducing agent, concentrations of urea less than 2 M and often the presence of reagents such as a mixture of reduced and oxidized glutathione which facilitate the interchange of disulphide bonds within the protein molecule. The refolding process can be monitored, for example, by SDS-PAGE or with antibodies which are specific for the native molecule. Following refolding, the protein can then be purified further and separated from the refolding mixture by chromatography on any of several supports including ion exchange resins, gel permeation resins or on a variety of affinity columns.
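The refolding conditions given above (protein below 500 μg/ml and urea below 2 M) imply a minimum dilution of the solubilized inclusion-body preparation. The short Python sketch below works through that arithmetic. It assumes, for illustration only, that refolding is approached by simple dilution into a buffer containing neither protein nor urea; the function name and the default thresholds as parameters are hypothetical, and the text does not prescribe dilution over other approaches such as dialysis.

```python
def min_dilution_factor(protein_ug_per_ml: float,
                        urea_molar: float,
                        max_protein_ug_per_ml: float = 500.0,
                        max_urea_molar: float = 2.0) -> float:
    """Smallest fold-dilution that brings both values down to their limits.

    The limits are stated as "less than" in the text, so in practice the
    dilution would be chosen somewhat larger than the value returned here.
    """
    protein_fold = protein_ug_per_ml / max_protein_ug_per_ml
    urea_fold = urea_molar / max_urea_molar
    # Never report a "dilution" below 1-fold.
    return max(protein_fold, urea_fold, 1.0)
```

As a worked case, a preparation at 5000 μg/ml protein in 6 M urea is limited by the protein concentration (10-fold) rather than by the urea (3-fold), so at least a 10-fold dilution would be needed under these assumptions.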
Labeling Proteins
When used as a component in assay systems such as those described below, the target protein may be labeled, either directly or indirectly, to facilitate detection of the present res-like molecules either in vitro or in vivo. Any of a variety of suitable labeling systems may be used including but not limited to radioisotopes such as 125I, enzyme labeling systems that generate a detectable colorimetric signal or light when exposed to substrate, and fluorescent labels. Where recombinant DNA technology is used for protein production, it may be advantageous to engineer fusion proteins that can facilitate labeling, immobilization and/or detection. These fusion proteins may, for example, add amino acids which facilitate further chemical modification. They also may add a functional moiety, such as an enzyme, which directly facilitates detection.
TRANSGENIC ANIMALS
The invention further contemplates animal models for studying the function of the present molecules and for overproducing the protein products. The disclosed DNA sequences may be used in conjunction with techniques for producing transgenic animals that are well known to those of skill in the art.
To prepare transgenic animals, target gene sequences may for example be introduced into, and overexpressed in, the genome of the animal of interest, or, if endogenous target gene sequences are present, they may either be overexpressed or, alternatively, be disrupted in order to underexpress or inactivate target gene expression, such as described for the disruption of apoE in mice (Plump et al., Cell 71:343-353 (1992)).
In order to overexpress a target gene sequence, the coding portion of the target gene sequence may be ligated to a regulatory sequence which is capable of driving gene expression in the animal and cell type of interest. Such regulatory regions will be well known to those of skill in the art, and may be utilized in the absence of undue experimentation. For underexpression of an endogenous target gene sequence, such a sequence may be isolated and engineered such that when reintroduced into the genome of the animal of interest, the endogenous target gene alleles will be inactivated. Preferably, the engineered target gene sequence is introduced via gene targeting such that the endogenous target sequence is disrupted upon integration of the engineered target gene sequence into the animal's genome. Animals of any species, including, but not limited to, mice, rats, rabbits, guinea pigs, pigs, micro-pigs, goats, and non-human primates, e.g., baboons, monkeys, and chimpanzees may be used to generate cardiovascular disease animal models. Goats, cows and sheep are particularly preferred for producing protein in vivo.
Any technique known in the art may be used to introduce a target gene transgene into animals to produce the founder lines of transgenic animals. Such techniques include, but are not limited to, pronuclear microinjection (Hoppe et al., U.S. Pat. No. 4,873,191 (1989)); retrovirus mediated gene transfer into germ lines (Van der Putten et al., Proc. Natl. Acad. Sci., USA 82:6148-6152 (1985)); gene targeting in embryonic stem cells (Thompson et al., Cell 56:313-321 (1989)); electroporation of embryos (Lo, Mol. Cell. Biol. 3:1803-1814 (1983)); and sperm-mediated gene transfer (Lavitrano et al., Cell 57:717-723 (1989)); etc. For a review of such techniques, see Gordon, Transgenic Animals, Intl. Rev. Cytol. 115:171-229 (1989).
The present invention provides for transgenic animals that carry the transgene in all their cells, as well as animals which carry the transgene in some, but not all their cells, i.e., mosaic animals. The transgene may be integrated as a single transgene or in concatamers, e.g., head-to-head tandems or head-to-tail tandems. The transgene may also be selectively introduced into and activated in a particular cell type by following, for example, the teaching of Lasko et al. (Lasko et al., Proc. Natl. Acad. Sci. USA 89:6232-6236 (1992)). The regulatory sequences required for such a cell-type specific activation will depend upon the particular cell type of interest, and will be apparent to those of skill in the art. When it is desired that the target gene be integrated into the chromosomal site of the endogenous target gene, gene targeting is preferred. Briefly, when such a technique is to be utilized, vectors containing some nucleotide sequences homologous to the endogenous target gene of interest are designed for the purpose of integrating, via homologous recombination with chromosomal sequences, into and disrupting the function of the nucleotide sequence of the endogenous target gene.
The transgene may also be selectively introduced into a particular cell type, thus inactivating the endogenous gene of interest in only that cell type, by following, for example, the teaching of Gu et al. (Science 265:103-106 (1994)). The regulatory sequences required for such a cell-type specific inactivation will depend upon the particular cell type of interest, and will be apparent to those of skill in the art.
Once transgenic animals have been generated, the expression of the recombinant target gene and protein may be assayed utilizing standard techniques. Initial screening may be accomplished by Southern blot analysis or PCR techniques to analyze animal tissues to assay whether integration of the transgene has taken place. The level of mRNA expression of the transgene in the tissues of the transgenic animals may also be assessed using techniques which include but are not limited to Northern blot analysis of tissue samples obtained from the animal, in situ hybridization analysis, and RT-PCR. Samples of target gene-expressing tissue may also be evaluated immunocytochemically using antibodies specific for the target gene transgene gene product of interest.
The transgenic animals that express target gene mRNA or target gene transgene peptide (detected immunocytochemically, using antibodies directed against the target gene product's epitopes) at easily detectable levels should then be further evaluated to identify those animals which display characteristic increased susceptibility to carcinogenesis. Additionally, specific cell types within the transgenic animals may be analyzed and assayed in vitro for cellular phenotypes characteristic of the mutant phenotype.
Once target gene transgenic founder animals are produced, they may be bred, inbred, outbred, or crossbred to produce colonies of the particular animal. Examples of such breeding strategies include but are not limited to: outbreeding of founder animals with more than one integration site in order to establish separate lines; inbreeding of separate lines in order to produce compound target gene transgenics that express the target gene transgene of interest at higher levels because of the effects of additive expression of each target gene transgene; crossing of heterozygous transgenic animals to produce animals homozygous for a given integration site in order both to augment expression and eliminate the possible need for screening of animals by DNA analysis; crossing of separate homozygous lines to produce compound heterozygous or homozygous lines; breeding animals to different inbred genetic backgrounds so as to examine effects of modifying alleles on expression of the target gene transgene and the possible development of carcinogenesis. One such approach is to cross the target gene transgenic founder animals with a wild type strain to produce an F1 generation that exhibits increased susceptibility to carcinogenesis. The F1 generation may then be inbred in order to develop a homozygous line, if it is found that homozygous target gene transgenic animals are viable.
Methods of generating "knockout" mice using homologous recombination in embryonic stem cells are well known in the art. Suitable methods are described, for example, in Mansour et al., Nature, 336:348 (1988); Zijlstra et al., Nature, 342:435 (1989) and 344:742 (1990); and Hasty et al., Nature, 350:243 (1991). This genomic DNA can be obtained by conventional methods using the cDNA sequence as a probe in a commercially-available genomic DNA library.
Briefly, a genomic fragment is cleaved with a restriction endonuclease and a heterologous cassette containing a neomycin-resistance gene is inserted at the cleavage site. A suitable cassette is the GTI-II neo cassette described by Lufkin et al., Cell 66:1105 (1991). The modified genomic fragment is cloned into a suitable targeting vector that is introduced into murine embryonic stem cells by electroporation. Cells that have undergone homologous recombination (and hence disruption of the gene) are selected by resistance to G418, and used to generate chimeric mice using well known methods. See Lufkin et al., supra. Traditional breeding methods then can be used to generate mice that are homozygous for the disrupted gene.
The phenotype of mice that are homozygous for the mutation then can be studied to provide insights into the role of the protein in, for example, carcinogenesis. These mice also can be used as models for developing new treatments for cancers. If this mutation is lethal in homozygous mice (for example during embryogenesis), heterozygous mice, which express only half the amount of the protein, can also be studied.
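The breeding step that yields homozygous animals from heterozygous parents follows ordinary Mendelian segregation, and the expected genotype proportions can be enumerated directly. The Python sketch below is illustrative only: it assumes a single autosomal locus with two alleles, written here as `+` (wild type) and `-` (disrupted), equal transmission of each allele, and no effect of genotype on viability, which the text notes may not hold if the mutation is lethal in homozygotes. The function name `cross` is hypothetical.

```python
from collections import Counter
from fractions import Fraction
from itertools import product


def cross(parent_a: str, parent_b: str) -> dict:
    """Expected offspring genotype fractions for a one-locus cross.

    Each parent is a two-character string of alleles, e.g. "+-".
    """
    # Every pairing of one allele from each parent is equally likely.
    offspring = Counter(
        "".join(sorted(pair)) for pair in product(parent_a, parent_b)
    )
    total = sum(offspring.values())
    return {genotype: Fraction(n, total) for genotype, n in offspring.items()}
```

Under these assumptions, `cross("+-", "+-")` gives one quarter `++`, one half `+-`, and one quarter `--`, so roughly one pup in four from a heterozygote intercross would be expected to be homozygous for the disrupted gene.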
GENE THERAPY APPLICATIONS
When mutations in the inventive protein, or in the elements controlling expression of that protein, are found to be associated with a malignant phenotype, control of cellular proliferation can be restored by gene therapy methods. For example, overexpression of the protein can be counteracted by concurrent expression of an antisense molecule that binds to and inhibits expression of the mRNA encoding the protein. Alternatively, overexpression can be inhibited in an analogous manner using a ribozyme that cleaves the mRNA. In another embodiment, where expression of a mutated protein induces the malignant phenotype, concomitant expression of the non-mutated molecule via introduction of an exogenous gene may be used. Methods of using antisense and ribozyme technology to control gene expression, or of gene therapy methods for expression of an exogenous gene in this manner are well known in the art. Each of these methods requires a system for introducing a vector into the cells containing the mutated gene. The vector encodes either an antisense or ribozyme transcript of the inventive protein. The construction of a suitable vector can be achieved by any of the methods well-known in the art for the insertion of exogenous DNA into a vector. See, e.g., Sambrook et al., Molecular Cloning (Cold Spring Harbor Press 2d ed. 1989), which is incorporated herein by reference. In addition, the prior art teaches various methods of introducing exogenous genes into cells in vivo. See Rosenberg et al., Science 242:1575-1578 (1988) and Wolff et al., PNAS 86:9011-9014 (1989), which are incorporated herein by reference. The routes of delivery include systemic administration and administration in situ. Well-known techniques include systemic administration with cationic liposomes, and administration in situ with viral vectors.
Any one of the gene delivery methodologies described in the prior art is suitable for the introduction of a recombinant vector containing an inventive gene according to the invention into a MTX-resistant, transport-deficient cancer cell. A listing of present-day vectors suitable for the purpose of this invention is set forth in Hodgson, Bio/Technology 13:222 (1995), which is incorporated by reference.
For example, liposome-mediated gene transfer is a suitable method for the introduction of a recombinant vector containing an inventive gene according to the invention into a MTX-resistant, transport-deficient cancer cell. The use of a cationic liposome, such as DC-Chol/DOPE liposome, has been widely documented as an appropriate vehicle to deliver DNA to a wide range of tissues through intravenous injection of DNA/cationic liposome complexes. See Caplen et al., Nature Med. 1:39-46 (1995) and Zhu et al., Science 262:209-211 (1993), which are herein incorporated by reference. Liposomes transfer genes to the target cells by fusing with the plasma membrane. The entry process is relatively efficient, but once inside the cell, the liposome-DNA complex has no inherent mechanism to deliver the DNA to the nucleus. As such, most of the lipid and DNA gets shunted to cytoplasmic waste systems and destroyed. The obvious advantage of liposomes as a gene therapy vector is that liposomes contain no proteins, which thus minimizes the potential of host immune responses.
As another example, viral vector-mediated gene transfer is also a suitable method for the introduction of the vector into a target cell. Appropriate viral vectors include adenovirus vectors and adeno-associated virus vectors, retrovirus vectors and herpesvirus vectors.
Adenoviruses are linear, double stranded DNA viruses complexed with core proteins and surrounded by capsid proteins. The common serotypes 2 and 5, which are not associated with any human malignancies, are typically the base vectors. By deleting parts of the virus genome and inserting the desired gene under the control of a constitutive viral promoter, the virus becomes a replication deficient vector capable of transferring the exogenous DNA to differentiated, non-proliferating cells. To enter cells, the adenovirus fibre interacts with specific receptors on the cell surface, and the adenovirus surface proteins interact with the cell surface integrins. The virus penton-cell integrin interaction provides the signal that brings the exogenous gene-containing virus into a cytoplasmic endosome. The adenovirus breaks out of the endosome and moves to the nucleus, the viral capsid falls apart, and the exogenous DNA enters the cell nucleus where it functions, in an epichromosomal fashion, to express the exogenous gene. Detailed discussions of the use of adenoviral vectors for gene therapy can be found in Berkner, Biotechniques 6:616-629 (1988) and Trapnell, Advanced Drug Delivery Rev. 12:185-199 (1993), which are herein incorporated by reference. Adenovirus-derived vectors, particularly non-replicative adenovirus vectors, are characterized by their ability to accommodate exogenous DNA of 7.5 kb, relative stability, wide host range, low pathogenicity in man, and high titers (10^4 to 10^5 plaque forming units per cell). See Stratford-Perricaudet et al., PNAS 89:2561 (1992).
Adeno-associated virus (AAV) vectors also can be used for the present invention. AAV is a linear single-stranded DNA parvovirus that is endogenous to many mammalian species. AAV has a broad host range despite the limitation that AAV is a defective parvovirus which is dependent totally on either adenovirus or herpesvirus for its reproduction in vivo. The use of AAV as a vector for the introduction into target cells of exogenous DNA is well-known in the art. See, e.g., Lebkowski et al., Mole. & Cell. Biol. 8:3988 (1988), which is incorporated herein by reference. In these vectors, the capsid gene of AAV is replaced by a desired DNA fragment, and transcomplementation of the deleted capsid function is used to create a recombinant virus stock. Upon infection the recombinant virus uncoats in the nucleus and integrates into the host genome.
Another suitable virus-based gene delivery mechanism is retroviral vector-mediated gene transfer. In general, retroviral vectors are well-known in the art. See Breakfield et al., Mole. Neuro. Biol. 1:339 (1987) and Shih et al., in Vaccines 85: 177 (Cold Spring Harbor Press 1985). A variety of retroviral vectors and retroviral vector-producing cell lines can be used for the present invention. Appropriate retroviral vectors include Moloney Murine Leukemia Virus, spleen necrosis virus, and vectors derived from retroviruses such as Rous Sarcoma Virus, Harvey Sarcoma Virus, avian leukosis virus, human immunodeficiency virus, myeloproliferative sarcoma virus, and mammary tumor virus. These vectors include replication-competent and replication-defective retroviral vectors. In addition, amphotropic and xenotropic retroviral vectors can be used. In carrying out the invention, retroviral vectors can be introduced to a tumor directly or in the form of free retroviral vector producing-cell lines. Suitable producer cells include fibroblasts, neurons, glial cells, keratinocytes, hepatocytes, connective tissue cells, ependymal cells, chromaffin cells. See Wolff et al., PNAS 84:3344 (1989).
Retroviral vectors generally are constructed such that the majority of their structural genes are deleted or replaced by exogenous DNA of interest, and such that the likelihood is reduced that viral proteins will be expressed. See Bender et al., J. Virol. 61:1639 (1987) and Armento et al., J. Virol. 61:1647 (1987), which are herein incorporated by reference. To facilitate expression of the antisense or ribozyme molecule of the inventive protein, a retroviral vector employed in the present invention must integrate into the genome of the host cell, an event which occurs only in mitotically active cells. The necessity for host cell replication effectively limits retroviral gene expression to tumor cells, which are highly replicative, and to a few normal tissues. The normal tissue cells theoretically most likely to be transduced by a retroviral vector, therefore, are the endothelial cells that line the blood vessels that supply blood to the tumor. In addition, it is also possible that a retroviral vector would integrate into white blood cells both in the tumor and in the blood circulating through the tumor.
The spread of retroviral vector to normal tissues, however, is limited. The local administration to a tumor of a retroviral vector or retroviral vector producing cells will restrict vector propagation to the local region of the tumor, minimizing transduction, integration, expression and subsequent cytotoxic effect on surrounding cells that are mitotically active.
Both replicatively deficient and replicatively competent retroviral vectors can be used in the invention, subject to their respective advantages and disadvantages. For instance, for tumors that have spread regionally, such as lung cancers, the direct injection of cell lines that produce replication-deficient vectors may not deliver the vector to a large enough area to completely eradicate the tumor, since the vector will be released only from the original producer cells and their progeny, and diffusion is limited. Similar constraints apply to the application of replication deficient vectors to tumors that grow slowly, such as human breast cancers which typically have doubling times of 30 days versus the 24 hours common among human gliomas. The much shortened survival-time of the producer cells, probably no more than 7-14 days in the absence of immunosuppression, limits to only a portion of their replicative cycle the exposure of the tumor cells to the retroviral vector.
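The constraint described above can be made concrete with the figures given in the text: producer cells surviving 7-14 days, against tumor doubling times of 30 days (breast cancer) or 24 hours (glioma). The Python sketch below simply divides one by the other. Treating one doubling time as one replicative cycle is a simplifying assumption made here for illustration, and the function name is hypothetical.

```python
def replicative_cycles_covered(producer_survival_days: float,
                               tumor_doubling_days: float) -> float:
    """Number of tumor doubling periods spanned by producer-cell survival."""
    if tumor_doubling_days <= 0:
        raise ValueError("doubling time must be positive")
    return producer_survival_days / tumor_doubling_days


# Figures taken from the text: 7-14 day producer survival,
# 30 day (breast cancer) versus 1 day (glioma) doubling time.
for label, doubling in (("breast cancer", 30.0), ("glioma", 1.0)):
    low = replicative_cycles_covered(7, doubling)
    high = replicative_cycles_covered(14, doubling)
    print(f"{label}: {low:.2f} to {high:.2f} doubling periods covered")
```

Under this simplification, even 14 days of producer-cell survival spans less than half of one 30-day doubling period, whereas it spans many doubling periods of a tumor that doubles every 24 hours.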
The use of replication-defective retroviruses for treating tumors requires producer cells and is limited because each replication-defective retrovirus particle can enter only a single cell and cannot productively infect others thereafter. Because these replication-defective retroviruses cannot spread to other tumor cells, they would be unable to completely penetrate a deep, multilayered tumor in vivo. See Markert et al., Neurosurg. 77: 590 (1992). The injection of replication-competent retroviral vector particles or a cell line that produces a replication-competent retroviral vector virus may prove to be a more effective therapeutic because a replication-competent retroviral vector will establish a productive infection that will transduce cells as long as it persists. Moreover, replicatively competent retroviral vectors may follow the tumor as it metastasizes, carried along and propagated by transduced tumor cells. The risks for complications are greater with replicatively competent vectors, however. Such vectors may pose a greater risk than replicatively deficient vectors of transducing normal tissues, for instance. The risks of undesired vector propagation for each type of cancer and affected body area can be weighed against the advantages in the situation of replicatively competent versus replicatively deficient retroviral vector to determine an optimum treatment.
Both amphotropic and xenotropic retroviral vectors may be used in the invention. Amphotropic viruses have a very broad host range that includes most or all mammalian cells, as is well known to the art. Xenotropic viruses can infect all mammalian cells except mouse cells. Thus, amphotropic and xenotropic retroviruses from many species, including cows, sheep, pigs, dogs, cats, rats, and mice, inter alia, can be used to provide retroviral vectors in accordance with the invention, provided the vectors can transfer genes into proliferating human cells in vivo.
Clinical trials employing retroviral vector therapy treatment of cancer have been approved in the United States. See Culver, Clin. Chem. 40: 510 (1994). Retroviral vector-containing cells have been implanted into brain tumors growing in human patients. See Oldfield et al., Hum. Gene Ther. 4: 39 (1993). These retroviral vectors carried the HSV-1 thymidine kinase (HSV-tk) gene into the surrounding brain tumor cells, which conferred sensitivity of the tumor cells to the antiviral drug ganciclovir. Some of the limitations of current retroviral-based cancer therapy, as described by Oldfield, are: (1) the low titer of virus produced, (2) virus spread is limited to the region surrounding the producer cell implant, (3) possible immune response to the producer cell line, (4) possible insertional mutagenesis and transformation of retroviral infected cells, (5) only a single treatment regimen of pro-drug, ganciclovir, is possible because the "suicide" product kills retrovirally infected cells and producer cells, and (6) the bystander effect is limited to cells in direct contact with retrovirally transformed cells. See Bi et al., Human Gene Therapy 4: 725 (1993).
Yet another suitable virus-based gene delivery mechanism is herpesvirus vector-mediated gene transfer. While much less is known about the use of herpesvirus vectors, replication-competent HSV-1 viral vectors have been described in the context of antitumor therapy. See Martuza et al., Science 252: 854 (1991), which is incorporated herein by reference.
DIAGNOSTIC METHODS
The present invention also contemplates, for certain molecules described below, methods for diagnosis of human disease. In particular, patients can be screened for the occurrence of cancers, or likelihood of occurrence of cancers, associated with mutations in the encoded protein. DNA from tumor tissue obtained from patients suffering from cancer can be isolated, and the gene encoding the protein can be sequenced. By examining a number of patients in this manner, mutations in the gene that are associated with a malignant cellular phenotype can be identified. In addition, correlation of the nature of the observed mutations with subsequent observed clinical outcomes allows development of a prognostic model for the predicted outcome in a particular patient. Screening for mutations conveniently can be carried out at the DNA level by use of PCR, although the skilled artisan will be aware that many other well known methods are available for the screening. PCR primers can be selected that flank known mutation sites, and the PCR products can be sequenced to detect the occurrence of the mutation. Alternatively, the 3' residue of one PCR primer can be selected to be a match only for the residue found in the unmutated gene. If the gene is mutated, there will be a mismatch at the 3' end of the primer, primer extension cannot occur, and no PCR product will be obtained. Alternatively, primer mixtures can be used where the 3' residue of one primer is any nucleotide other than the nonmutated residue. Observation of a PCR product then indicates that a mutation has occurred. Other methods of using, for example, oligonucleotide probes to screen for mutations are described, for example, in U.S. Patent No. 4,871,838, which is herein incorporated by reference in its entirety.
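The 3'-terminal primer logic described above can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions: primer and sample are written 5'->3' on the same strand (so a "match" is simple identity), annealing is reduced to a comparison of the terminal base, and the function names and sequences are hypothetical, not taken from the invention.

```python
def three_prime_matches(primer: str, sample: str, start: int) -> bool:
    """Return True if the primer's 3'-terminal base equals the sample base at that site.

    Both sequences are written 5'->3' on the same strand, so a match means
    identity. `start` is the index in `sample` where the primer's 5' end aligns.
    """
    site = start + len(primer) - 1  # position covered by the primer's 3' base
    return sample[site].upper() == primer[-1].upper()


def pcr_product_expected(wild_type_primer: str, sample: str, start: int) -> bool:
    """With a primer matched to the unmutated residue, a product is expected
    only when the sample carries the unmutated residue at the 3' site."""
    return three_prime_matches(wild_type_primer, sample, start)


if __name__ == "__main__":
    primer = "ATGGC"            # hypothetical primer ending on the site of interest
    unmutated = "ATGGCCTGA"     # hypothetical unmutated sample
    mutated = "ATGGTCTGA"       # same region with a C->T change at the 3' site
    print(pcr_product_expected(primer, unmutated, 0))
    print(pcr_product_expected(primer, mutated, 0))
```

In the complementary design mentioned in the text (primers whose 3' residue is any nucleotide other than the unmutated one), the interpretation is inverted: a product then indicates a mutation.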
Alternatively, antibodies can be generated that selectively bind either mutated or non-mutated protein. The antibodies then can be used to screen tissue samples for occurrence of mutations in a manner analogous to the DNA-based methods described supra.
The diagnostic methods described above can be used not only for diagnosis and for prognosis of existing disease, but may also be used to predict the likelihood of the future occurrence of disease. For example, clinically healthy patients can be screened for mutations in the inventive molecule that correlate with later disease onset. Such mutations may be observed in the heterozygous state in healthy individuals. In such cases a single mutation event can effectively disable proper functioning of the gene and induce a transformed or malignant phenotype. This screening also may be carried out prenatally or neonatally.
DNA molecules according to the invention also are well suited for use in so-called "gene chip" diagnostic applications. Such applications have been developed by, inter alia, Synteni and Affymetrix. Briefly, all or part of the DNA molecules of the invention can be used either as a probe to screen a polynucleotide array on a "gene chip," or they may be immobilized on the chip itself and used to identify other polynucleotides via hybridization to the surface of the chip. In this manner, for example, related genes can be identified, or expression patterns of the gene in various tissues can be simultaneously studied. Such gene chips have particular application for diagnosis of disease, or in forensic analysis to detect the presence or absence of an analyte. Suitable chip technology is described, for example, in Wodicka et al., Nature Biotechnology, 15:1359 (1997), which is hereby incorporated by reference in its entirety, and references cited therein.
PROTEIN-PROTEIN INTERACTIONS
Due to their similarity to certain known proteins, it is anticipated that some of the inventive protein molecules will interact with another class of cellular proteins. This is particularly true of those molecules containing leucine zipper motifs.
Any method suitable for detecting protein-protein interactions can be employed for identifying interacting targets. Among the traditional methods which can be employed are co-immunoprecipitation, crosslinking and co-purification through gradients or chromatographic columns. Utilizing procedures such as these allows for the identification of GAP gene products. Once identified, a GAP protein can be used, in conjunction with standard techniques, to identify its corresponding pathway gene. For example, at least a portion of the amino acid sequence of the pathway gene product can be ascertained using techniques well known to those of skill in the art, such as via the Edman degradation technique (see, e.g., Creighton, 1983, PROTEINS: STRUCTURES AND MOLECULAR PRINCIPLES, W. H. Freeman & Co., N.Y., pp. 34-49). The amino acid sequence obtained can be used as a guide for the generation of oligonucleotide mixtures that can be used to screen for pathway gene sequences. Screening can be accomplished, for example, by standard hybridization or PCR techniques. Techniques for the generation of oligonucleotide mixtures and for screening are well known. (See, e.g., Ausubel, supra, and PCR PROTOCOLS: A GUIDE TO METHODS AND APPLICATIONS, 1990, Innis et al., eds., Academic Press, Inc., New York).
Additionally, methods can be employed which result in the simultaneous identification of interacting target genes. One method which detects protein interactions in vivo, the two-hybrid system, is described in detail for illustration purposes only and not by way of limitation. One version of this system has been described (Chien et al., Proc. Natl. Acad. Sci. USA, 88: 9578-9582 (1991)) and is commercially available from Clontech (Palo Alto, CA). Briefly, utilizing such a system, plasmids are constructed that encode two hybrid proteins: one consists of the DNA-binding domain of a transcription activator protein fused to a known protein, in this case an inventive protein, and the other contains the activator protein's activation domain fused to an unknown protein (a putative GAP, for instance) that is encoded by a cDNA which has been recombined into this plasmid as part of a cDNA library. The plasmids are transformed into a strain of the yeast Saccharomyces cerevisiae that contains a reporter gene (e.g., lacZ) whose regulatory region contains the transcription activator's binding sites. Neither hybrid protein alone can activate transcription of the reporter gene: the DNA-binding domain hybrid cannot because it does not provide activation function, and the activation domain hybrid cannot because it cannot localize to the activator's binding sites. Interaction of the two hybrid proteins reconstitutes the functional activator protein and results in expression of the reporter gene, which is detected by an assay for the reporter gene product. The two-hybrid system or related methodology can be used to screen activation domain libraries for proteins that interact with a known "bait" gene product. By way of example, and not by way of limitation, gene products known to be involved in TH cell subpopulation-related disorders and/or differentiation, maintenance, and/or effector function of the subpopulations can be used as the bait gene products. Total genomic or cDNA sequences are fused to the DNA encoding an activation domain. This library and a plasmid encoding a hybrid of the bait gene product fused to the DNA-binding domain are cotransformed into a yeast reporter strain, and the resulting transformants are screened for those that express the reporter gene. For example, and not by way of limitation, the bait gene can be cloned into a vector such that it is translationally fused to the DNA encoding the DNA-binding domain of the GAL4 protein. These colonies are purified and the library plasmids responsible for reporter gene expression are isolated. DNA sequencing is then used to identify the proteins encoded by the library plasmids.
The present invention, thus generally described, will be understood more readily by reference to the following examples, which are provided by way of illustration and are not intended to be limiting of the present invention. The examples below are provided to illustrate the subject invention. These examples are provided by way of illustration and are not included for the purpose of limiting the invention.
EXAMPLES
EXAMPLE I: cDNA Library Construction
cDNA library plates and clones originated from five cDNA libraries that were constructed by directional cloning. These are available through the Resource Center (http://www.rzpd.de) of the German Genome Project. In particular, the hfbr2 (human fetal brain, RZPD number DKFZp564) and hfkd2 (human fetal kidney, DKFZp566) libraries were generated using the Smart kit (Clontech), except that PCR was carried out with primers that contained uracil residues to permit directional cloning without restriction digestion and ligation, and that were complementary with the pAMP1 (LifeTechnologies) cloning sites for directional cloning. The htes3 (human testes, DKFZp434), hute1 (human uterus, DKFZp586) and hmcf1 (human mammary carcinoma, DKFZp727) libraries are conventional (Gubler, U., Hoffman, B.J., (1983), A simple and very efficient method for generating cDNA libraries. Gene 25, 263-269), size-selected cDNA libraries. They are cloned into pSPORT1 (LifeTechnologies) via a NotI site which is introduced during reverse transcription downstream of the oligo dT primer and a SalI site that is introduced by the ligation of adapters. The human mammary carcinoma library was constructed from MCF7 cells.
In a similar fashion, the hamy2 (human amygdala nucleus (inside the brain), RZPD number DKFZp761) and hmel2 (human melanoma, RZPD number DKFZp762) libraries have been generated using conventional approaches, employing a NotI-dT V primer for first strand synthesis (GAGCGGCCGC(T)HV). After second strand synthesis, SalI adapters were ligated to the blunted cDNA. Then the cDNA was cut with NotI to generate SalI-NotI compatible ends at the 5' and 3' ends of the cDNA, respectively, to allow directional cloning. The cDNAs were then size selected on agarose gels in two dimensions and cloned into the pSPORT1 plasmid vector which had been pre-cut with SalI and NotI (LifeTechnologies). The DNA was transformed into the DH10B bacterial strain and single colonies were picked into 384-well microtiter plates from the non-amplified library. The human melanoma library was constructed from MeWo cells, published by Kern, M.A., Helmbach, H., Artuc, M., Karmann, D., Jurgovsky, K. and Schadendorf, D. (1997) Human melanoma cell lines selected in vitro displaying various levels of drug resistance against cisplatin, fotemustine, vindesine or etoposide: modulation of proto-oncogene expression. Anticancer Res. 17, 4359-4370.
The cDNA sequences of this application were first identified among the sequences comprising various libraries. Technology has advanced considerably since the first cDNA libraries were made. Many small variations in both chemicals and machinery have been instituted over time, and these have improved both the efficiency and safety of the process. Although the cDNAs could be obtained using an older procedure, the procedure presented in this application is exemplary of one currently being used by persons skilled in the art. For the purpose of providing an exemplary method, the mRNA isolation and cDNA library construction described here is for the MCF-7 library (DKFZp727) from which the clones named DKFZphmcf1_xxyyxx were obtained.
The human cell line MCF-7 was grown in DMEM supplemented with 10% fetal calf serum until confluency. 3 X 10s cells were harvested with a cell scraper in PBS. Cells were lysed in buffer containing 0.5% NP-40 to leave the nuclei intact. The debris was pelleted by centrifugation at 15,000 x g for 10 minutes at 4 degrees Celsius. Proteins in the supernatant were degraded in the presence of SDS and Proteinase K (30 minutes at 56 degrees Celsius). Proteins were removed by phenol/chloroform extraction, and RNA was precipitated from the aqueous phase with Na-acetate and ethanol. Polyadenylated messages were isolated using Qiagen Oligotex (QIAGEN, Hilden, Germany).
First strand cDNA synthesis was accomplished using an oligo(dT) primer which also contained a NotI restriction site. Second strand synthesis was performed using a combination of DNA polymerase I, E. coli ligase and RNase H, followed by the addition of a SalI adaptor to the blunt-ended cDNA. The SalI-adapted, double-stranded cDNA was then digested with NotI restriction enzyme and fractionated by size on an agarose gel. DNA of the appropriate size was cut from the gel and cast into a second gel at a 90° angle. After electrophoresis in the second dimension, cDNA of the appropriate size was cut from the gel. The agarose block was broken down with the help of gelase. The cDNA was purified by two phenol extractions and an ethanol precipitation. The cDNA was ligated into SalI/NotI pre-digested pSPORT1 vector (LifeTechnologies) and transformed into DH10B bacteria.
All libraries have been arrayed into 384-well microtiter plates and spotted on high-density nylon membranes for hybridization analysis.
The hamy2 library consists of 121 384-well plates comprising 46464 clones. The hmel2 library consists of 72 384-well plates comprising 27648 clones. Filters and clones are available through the Resource Center of the German Genome Project (http://www.RZPD.de). Whole library plates were distributed to the sequencing partners of the consortium for systematic sequencing.
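The clone counts quoted for the hamy2 and hmel2 libraries follow directly from the plate format: a 384-well plate has 16 rows and 24 columns, so 121 plates hold 46464 clones and 72 plates hold 27648 clones. A minimal Python check of that arithmetic (the function name is illustrative only):

```python
ROWS_PER_PLATE = 16      # rows A-P of a 384-well plate
COLUMNS_PER_PLATE = 24   # columns 1-24
WELLS_PER_PLATE = ROWS_PER_PLATE * COLUMNS_PER_PLATE


def clones_in_library(plates: int) -> int:
    """Number of clones in a fully arrayed library of `plates` 384-well plates."""
    return plates * WELLS_PER_PLATE


if __name__ == "__main__":
    print("hamy2:", clones_in_library(121))
    print("hmel2:", clones_in_library(72))
```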
EXAMPLE II: Sequencing of cDNA Clones
All clones in the 384-well microtiter plates were sequenced from the 5' end. Sequencing was done preferentially using dye terminator chemistry (ABD or Amersham) on ABI automated DNA sequencers (ABI 377, Applied Biosystems); one partner used EMBL prototype instruments (Arakis), mainly with dye primer chemistry.
The resulting expressed sequence tag (EST) sequences ("r1 ESTs" = sequenced from 5' end) were analysed for:
a) the lack of identical matches with known genes.
For this, the EST sequence was compared by BLAST against the cDNA consortium's own database and after that against public databases (with BLASTn and BLASTx against EMBL/EMBLNEW and assembled ESTs; please refer to EXAMPLE III: Bioinformatics analysis of full length cDNAs, for description and parameter settings). ESTs which were identical to known genes over more than 100 bp, with less than 2 mismatches, were excluded from further analysis.
b) the presence of an open reading frame
Open reading frames (ORFs) were detected with a tool developed by the Munich Information Center for Protein Sequences (MIPS) called ORF-map. ORF-map visualises potential start and stop codons. If an ORF without a stop codon was detected in an r1 EST, the sequence was processed further.
c) the presence of GC rich sequences
A script developed by MIPS computed the GC content of the r1 sequence, which should be >40%. Writing similar scripts is within the ordinary skill of one in bioinformatics.
d) the lack of repeat structures
Repeats such as Alu, LINE or CA repeats were detected by BLAST searches (BLASTn and BLASTx; please refer to EXAMPLE III: Bioinformatics analysis of full length cDNAs, for description and parameter settings) against a repeat database compiled by MIPS.
If a repeat was present within the r1 sequence, the sequence was not processed further.
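Two of the criteria above, the stop-codon-free reading frame (b) and the GC content above 40% (c), are simple enough to sketch. The Python below is not the MIPS script or ORF-map, which are not reproduced in this document; it is a hypothetical stand-in that assumes an "open" frame means at least one of the three forward frames contains no stop codon, and that GC content is computed over unambiguous A/C/G/T bases only.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def gc_content(seq: str) -> float:
    """Percentage of G and C among the unambiguous bases of `seq`."""
    bases = [b for b in seq.upper() if b in "ACGT"]
    if not bases:
        return 0.0
    return 100.0 * sum(b in "GC" for b in bases) / len(bases)


def has_open_frame(seq: str) -> bool:
    """True if at least one forward frame of `seq` contains no stop codon."""
    seq = seq.upper()
    for frame in range(3):
        codons = [seq[i:i + 3] for i in range(frame, len(seq) - 2, 3)]
        if codons and not any(c in STOP_CODONS for c in codons):
            return True
    return False


def passes_est_filters(seq: str, min_gc: float = 40.0) -> bool:
    """Combine criteria (b) and (c); the 40% default is the cutoff quoted in the text."""
    return has_open_frame(seq) and gc_content(seq) > min_gc


if __name__ == "__main__":
    print(passes_est_filters("ATGGCCGCC"))     # GC-rich, no stop codon in frame 0
    print(passes_est_filters("TAACTAGCTGAC"))  # a stop codon in every forward frame
```

Criteria (a) and (d) depend on database searches and are not covered by this sketch.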
Novel clones that met all criteria were identified to the sequencers, who then performed 3'-end sequencing of these clones. The resulting 3' ESTs ("s1 ESTs" = sequenced from 3' end) were checked for:
a) the lack of matches with known genes in public databases, and sequences already generated by us.
This was done by BLAST searches against EMBL/EMBLNEW and assembled ESTs (BLASTn and BLASTx; please refer to EXAMPLE III: Bioinformatics analysis of full length cDNAs, for description and parameter settings).
b) the presence of polyadenylation signals.
Again, only clones matching the selection criteria were chosen to be sequenced completely by the sequencers. Clones were selected according to the following criteria:
A "very good ORF" had at least one BLASTx match to other proteins. A "good ORF" should extend to the 3' end and be longer than ~40 codons. If the ORF started in the r1 sequence, there should not be too many competing start codons in frame upstream of the potential start codon, and the start should match the Kozak consensus around the ATG. If the EST sequence was too short to decide according to the potential ORF, and there were only a few or no start codons in the sequence, the GC content of the sequence should be greater than 40%. The r1 sequences need not contain a polyA tail at the 3' end. In addition, the results of the BLAST search against the assembled human ESTs could help in questionable cases to decide whether to stop or to continue. A hit against these ESTs was an indication to go further.
Clones passing the above-described screening were sequenced in full. Sequencing was done preferentially using dye terminator chemistry (ABD or Amersham) on ABI automated DNA sequencers (ABI 377, Applied Biosystems); one partner used EMBL prototype instruments (Arakis), mainly with dye primer chemistry. Primer walking (Strauss et al., 1986, Specific-primer-directed DNA sequencing. Anal Biochem. 154, 353-360) was the preferred sequencing strategy because of the lower redundancy possible compared to random shotgun (Messing, J., Crea, R., Seeburg, H.P. (1981) A system for shotgun DNA sequencing. Nucleic Acids Res. 9, 309-321) methods. Walking primers were generally designed using software (e.g. Haas, S., Vingron, M., Poustka, A., Wiemann, S. (1998) Primer design in large-scale sequencing. Nucleic Acids Res. 26, 3006-3012; Schwager, C., Wiemann, S., Ansorge, W. (1995) GeneSkipper: integrated software environment for DNA sequence assembly and alignment. HUGO Genome Digest 2, 8-9) that permitted complete automation of this usually time-consuming process and helped in the parallel processing of large numbers of clones.
EXAMPLE III: Bioinformatics analysis of full length cDNAs
Each sequence obtained was compared at the nucleotide level in a stepwise manner to sequences in EMBL/EMBLNEW, EMBL-EST and EMBL-STS using the BLASTn algorithm. The Basic Local Alignment Search Tool (BLAST; Altschul, S. F. (1993) J Mol Evol 36:290-300; Altschul, S. F. et al. (1990) J Mol Biol 215:403-10) is used to search for local sequence alignments. BLAST produces alignments of both nucleotide (BLASTn) and amino acid sequences (BLASTp or BLASTx) to determine sequence similarity. BLAST is especially useful in determining exact matches or in identifying homologs, because of the local nature of the alignments. While it is useful for matches which do not contain gaps, it is inappropriate for performing motif-style searching. The fundamental unit of BLAST algorithm output is the High-scoring Segment Pair (HSP).
An HSP consists of two sequence fragments of arbitrary but equal lengths whose alignment is locally maximal and for which the alignment score meets or exceeds a threshold or cutoff score set by the user. The BLAST approach is to look for HSPs between a query sequence and a database sequence, to evaluate the statistical significance of any matches found, and to report only those matches which satisfy the user-selected threshold of significance. The parameter E establishes the statistically significant threshold for reporting database sequence matches. E is interpreted as the upper bound of the expected frequency of chance occurrence of an HSP (or set of HSPs) within the context of the entire database search. Any database sequence whose match satisfies E is reported in the program output. Parameter settings for the BLAST operations (BLASTN 2.0a11MP-WashU) described were: EMBL/EMBLNEW: H=0 V=5 B=5 -filter seg; EMBL-EST: H=0 E=1e-10 B=500 V=500 -filter seg; EMBL-STS: H=0 V=5 B=5.
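To make the role of the parameter E concrete, the sketch below uses the standard Karlin-Altschul relation for ungapped local alignments, in which the expected number of chance HSPs with score at least S is E = K * m * n * exp(-lambda * S), with m the query length and n the database length. The values of K and lambda depend on the scoring system and are not given in this document, so the numbers used here are purely hypothetical, as are the function names.

```python
import math


def expected_chance_hsps(score: float, query_len: int, db_len: int,
                         k: float, lam: float) -> float:
    """Karlin-Altschul expectation E = K * m * n * exp(-lambda * S)."""
    return k * query_len * db_len * math.exp(-lam * score)


def is_reported(score: float, query_len: int, db_len: int,
                k: float, lam: float, e_cutoff: float) -> bool:
    """A match is reported when its expectation does not exceed the cutoff E."""
    return expected_chance_hsps(score, query_len, db_len, k, lam) <= e_cutoff


if __name__ == "__main__":
    # Hypothetical search: 100-residue query, 1000-residue database,
    # illustrative K and lambda, and the E=1e-10 cutoff quoted for EMBL-EST.
    for s in (50, 200):
        e = expected_chance_hsps(s, 100, 1000, k=0.1, lam=0.3)
        print(s, e, is_reported(s, 100, 1000, 0.1, 0.3, e_cutoff=1e-10))
```

A stricter (smaller) E therefore demands a higher alignment score before a database sequence appears in the output, and the required score grows with the size of the search space.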
The search against EMBL/EMBLNEW was done to determine whether the cDNAs are already known, and also to find out whether the cDNAs are encoded by genomic sequences already sequenced and published/submitted to these databases.
The search against EMBL-EST was performed to get a first impression of how abundant a particular cDNA would be and to get information on tissue specificity (a so-called "electronic Northern blot"; e.g. some of the cDNAs derived from the testis library show only hits to ESTs also derived from testis libraries).
The cDNA sequences were compared by BLAST against EMBL-STS to determine STS sequence matches to the cDNA, thus providing mapping information for the new cDNA.
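The "electronic Northern blot" idea amounts to tallying EST hits by the tissue of the library they came from. A minimal sketch, assuming the hits have already been parsed into (EST identifier, tissue) pairs; the identifiers and tissues below are invented for illustration and do not come from any database.

```python
from collections import Counter
from typing import Iterable, Tuple


def electronic_northern(hits: Iterable[Tuple[str, str]]) -> Counter:
    """Count EST hits per source tissue for one query cDNA."""
    return Counter(tissue for _est_id, tissue in hits)


def is_tissue_specific(hits: Iterable[Tuple[str, str]], tissue: str) -> bool:
    """True if there is at least one hit and every hit comes from `tissue`."""
    counts = electronic_northern(hits)
    return bool(counts) and set(counts) == {tissue}


if __name__ == "__main__":
    example_hits = [("est_0001", "testis"), ("est_0002", "testis"), ("est_0003", "testis")]
    print(electronic_northern(example_hits))
    print(is_tissue_specific(example_hits, "testis"))
```

The total number of hits gives the rough abundance estimate mentioned in the text, and the per-tissue breakdown gives the tissue profile.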
The potential protein sequences were generated automatically by a script searching for the longest open reading frame (ORF) in each of the three forward frames with a minimum length of 10 codons. Next, the automatically generated ORFs were translated into protein sequences. These protein sequences were searched against the non-redundant protein data set of PIR/SwissProt/TrEMBL/TrEMBLnew (BLASTP 2.0a11MP-WashU, parameter setting: V=7 B=7 H=0 -filter seg). If the script generated more than one ORF, one ORF was chosen manually by the annotator according to the degree of similarity to known proteins, the location of the ORF in the cDNA, the length, the amino acid composition and the content of Prosite motifs.
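The ORF-finding step can be illustrated as follows. The original script is not reproduced in this document, so this Python sketch rests on an assumption: an ORF is taken to be the longest run of consecutive non-stop codons in a frame, without requiring an ATG start (partial 5' ends may lack one). The 10-codon minimum is the value quoted in the text; everything else, including the function name, is hypothetical.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def longest_orf_per_frame(seq: str, min_codons: int = 10) -> dict:
    """Return {frame: ORF nucleotide sequence} for the three forward frames.

    For each frame the longest stop-free run of codons is kept, and it is
    reported only if it spans at least `min_codons` codons.
    """
    seq = seq.upper()
    result = {}
    for frame in range(3):
        best, current = [], []
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if codon in STOP_CODONS:
                if len(current) > len(best):
                    best = current
                current = []          # a stop codon closes the current run
            else:
                current.append(codon)
        if len(current) > len(best):  # run that reaches the end of the sequence
            best = current
        if len(best) >= min_codons:
            result[frame] = "".join(best)
    return result


if __name__ == "__main__":
    cdna = "ATG" + "GCC" * 11 + "TAA"   # toy sequence with a 12-codon ORF in frame 0
    for frame, orf in longest_orf_per_frame(cdna).items():
        print(frame, len(orf) // 3, orf)
```

When several frames yield a candidate, as happens with this toy sequence, the choice between them is the manual annotation step described above.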
Additionally, there was a BLASTx search (BLASTX 2.0a11MP-WashU against a non-redundant protein database comprising PIR/SWISSPROT/TREMBL/TREMBLNEW; parameter settings were: matrix /home/data/blast/matrix/aa/BLOSUM62 H=0 V=5 B=5 -filter seg) to find potential frame shifts in the complementary cds of the cDNAs and to identify unspliced or partly spliced cDNAs. The protein sequence was then transferred to the PEDANT system in order to generate additional information on the new proteins. PEDANT (Protein Extraction, Description, and ANalysis Tool; Frishman, D. & Mewes, H.-W. (1997) PEDANTic genome analysis. Trends in Genetics, 13, 415-416) is a platform developed at the Munich Information Center for Protein Sequences (MIPS, Munich, Germany), which incorporates practically all bioinformatics methods important for the functional and structural characterisation of protein sequences. Computational methods used by PEDANT are:
FASTA
Very sensitive protein sequence database searches with estimates of statistical significance. Pearson W.R. (1990) Rapid and sensitive sequence comparison with FASTP and FASTA. Methods Enzymol. 183, 63-98.
BLAST2
Very sensitive protein sequence database searches with estimates of statistical significance. Altschul S.F., Gish W., Miller W., Myers E.W., and Lipman D.J. Basic local alignment search tool. Journal of Molecular Biology 215, 403-10.
PREDATOR
High-accuracy secondary structure prediction from single and multiple sequences. Frishman, D. and Argos, P. (1997) 75% accuracy in protein secondary structure prediction. Proteins, 27, 329-335. Frishman, D. and Argos, P. (1996) Incorporation of long-distance interactions in a secondary structure prediction algorithm. Prot. Eng. 9, 133-142.
STRIDE
Secondary structure assignment from atomic coordinates. Frishman, D. and Argos, P. (1995) Knowledge-based secondary structure assignment. Proteins 23, 566-579.
CLUSTALW
Multiple sequence alignment. Thompson, J.D., Higgins, D.G. and Gibson, T.J. (1994) CLUSTAL W: improving the sensitivity of progressive multiple sequence alignment through sequence weighting, positions-specific gap penalties and weight matrix choice. Nucleic Acids Research, 22: 4673-4680.
TMAP
Transmembrane region prediction from multiply aligned sequences. Persson, B. and Argos, P. (1994) Prediction of transmembrane segments in proteins utilising multiple sequence alignments. J. Mol. Biol. 237, 182-192.
ALOM2
Transmembrane region prediction from single sequences. Klein, P., Kanehisa, M., and DeLisi, C. Prediction of protein function from sequence properties: A discriminant analysis of a database. Biochim. Biophys. Acta 787, 221-226 (1984). Version 2 by Dr. K. Nakai.
SIGNALP
Signal peptide prediction. Nielsen, H., Engelbrecht, J., Brunak, S., and von Heijne, G. (1997). Identification of prokaryotic and eukaryotic signal peptides and prediction of their cleavage sites. Protein Engineering 10, 1-6.
SEG
Detection of low complexity regions in protein sequences. Wootton, J.C., Federhen, S. (1993) Statistics of local complexity in amino acid sequences and sequence databases. Computers & Chemistry 17, 149-163.
COILS
Detection of coiled coils. Lupas, A., M. Van Dyke, and J. Stock, "Predicting Coiled Coils from Protein Sequences." Science (1991) 252, 1162-1164.
PROSEARCH
Detection of PROSITE protein sequence patterns. Kolakowski L.F. Jr., Leunissen J.A.M., Smith J.E. (1992) ProSearch: fast searching of protein sequences with regular expression patterns related to protein structure and function. Biotechniques 13, 919-921.
BLIMPS
Similarity searches against a database of ungapped blocks. J.C. Wallace and Henikoff S., (1992) PATMAT: a searching and extraction program for sequence, pattern and block queries and databases, CABIOS 8, 249-254. Written by Bill Alford.
HMMER
Hidden Markov model software. Sonnhammer E.L.L., Eddy S.R., Durbin R. (1997) Pfam: A Comprehensive Database of Protein Families Based on Seed Alignments. Proteins 28, 405-420.
pI
Perl script that returns the amino acid composition, molecular weight, theoretical pI, and expected extinction coefficient of an amino acid sequence. By Fred Lindberg.
The parameter settings were as follows: known3d: score > 100; BLAST: E-value < 10; SCOP: <= 50 alignments, E-value < 0.0001; signalp: Y = 0.7, examined from the N-terminus: 50 aa; funcat: E-value < 0.001; BLOCKS: <= 10 hits; BLIMPS: threshold 1100.0; COILS: threshold 0.15; SEG: threshold 20.0; BLAST in report: E-value < 0.001; PIR-KW, superfamilies, EC numbers in report: E-value < 0.00001; known3d in report: score > 120.
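As an illustration of what a script like pI computes, the Python sketch below returns the amino acid composition and an approximate molecular weight of a protein sequence. It is not the Perl script named above: the residue masses are commonly tabulated average values entered here as an assumption, the theoretical pI and extinction coefficient are left out, and the function names are hypothetical.

```python
from collections import Counter

# Approximate average residue masses in daltons (amino acid minus one water).
RESIDUE_MASS = {
    "G": 57.0519, "A": 71.0788, "S": 87.0782, "P": 97.1167, "V": 99.1326,
    "T": 101.1051, "C": 103.1388, "L": 113.1594, "I": 113.1594, "N": 114.1038,
    "D": 115.0886, "Q": 128.1307, "K": 128.1741, "E": 129.1155, "M": 131.1926,
    "H": 137.1411, "F": 147.1766, "R": 156.1875, "Y": 163.1760, "W": 186.2132,
}
WATER_MASS = 18.01528  # one water is added back for the free termini


def amino_acid_composition(protein: str) -> Counter:
    """Count each residue in the sequence (case-insensitive)."""
    return Counter(protein.upper())


def molecular_weight(protein: str) -> float:
    """Sum of average residue masses plus one water; raises KeyError on unknown letters."""
    return sum(RESIDUE_MASS[aa] for aa in protein.upper()) + WATER_MASS


if __name__ == "__main__":
    peptide = "MKKA"  # arbitrary example peptide
    print(amino_acid_composition(peptide))
    print(round(molecular_weight(peptide), 2))
```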
The results of PEDANT analysis together with the results of the similarity searches constitute the basis for the structural and functional annotation of the cDNAs and the encoded proteins, as specified herein.

Claims

We claim:
1. An assemblage, comprising at least one nucleic acid molecule having the sequence of a clone selected from the group consisting of: amy2_12g7, amy2_12il, amy2_13gl1, amy2_lfael4, amy2_24kl5, amy2_2al3, amy2_2il7, fbr2_7δdlδ, fbr2_7δelβ, amy2_121m2, amy2_24b4, amy2_121fl1, tes3_lfab5, amy2_li24, amy2_ljl1, amy2_2bl1, amy2_7j5, amy2_14b5, amy2_2ol3, fkd2_3kl, mel2_7gl4, mel2_12jl, mel2_7kl1, amy2_2c22, fbr2_7δi21, amy2_lln4, amy2_lcl2, amy2_lil, amy2_2f22, amy2_2gl2, fbr2_7βcl2, tes3_10ilfa, tes3_31al0, amy2_10hl7, amy2_10p7, amy2_12d7, amy2_2flβ, tes3_llc22, tes3_lld21, tes3_21f24, tes3_31j20, tes3_5k22, Tes3_lDnl0, Tes3_Hel7, Tes3_12dlβ, Tes3_141?, Tes3_15nl4, Tes3_lfap3, Tes3_Hpl2, Tes3_21kl4, Tes3_22ill, Tes3_22124, tes3_2bg3, tes3_30pb, amy2_lld2, amy2_121ol7, amy2_lil4, amy2_24cδ, fbr2_7βd4, tes3_llal7, tes3_17i21, tes3_20hl2, tes3_7nl2, tes3_1elfa, amy2_14mlb, tes3_lδnl4, their complements, and variants thereof.
2. An assemblage, comprising at least one nucleic acid molecule having the sequence of a clone selected from the group consisting of: amy2_12g7, amy2_12il, amy2_13gl1, amy2_lbel4, amy2_24kl5, amy2_2al3, amy2_2il7, amy2_lΞlm2, amy2_24b4, amy2_121fl1, amy2_li24, amy2_ljl1, amy2_2bl1, amy2_7j5, amy2_14b5, amy2_2ol3, amy2_2c22, amy2_lln4, amy2_lcl2, amy2_lil, amy2_2f22, amy2_2gl2, amy2_10hl7, amy2_10p7, amy2_12d7, amy2_2flβ, amy2_lld2, amy2_121ol7, amy2_lil4, amy2_24cβ, their complements, and variants thereof.
3. An assemblage, comprising at least one nucleic acid molecule having the sequence of a clone selected from the group consisting of: fbr2_7βdlδ, fbr2_7βelδ, fbr2_76i21, fbr2_76cl2, fbr2_7δd4, their complements, and variants thereof.
4. An assemblage, comprising at least one nucleic acid molecule having the sequence of a clone selected from the group consisting of: amy2_121m2, amy2_24b4, their complements, and variants thereof.
5. An assemblage, comprising at least one nucleic acid molecule having the sequence of a clone selected from the group consisting of: amy2_121fl1, tes3_lbb5, their complements, and variants thereof.
6. An assemblage, comprising at least one nucleic acid molecule having the sequence of a clone selected from the group consisting of: amy2_li24, amy2_ljl1, amy2_2bl1, amy2_7j5, their complements, and variants thereof.
7. An assemblagei comprising at least one nucleic acid molecule having the sequence of a clone selected from the group consisting of: amy2_14b5i amy2_2ol3i fkd2_3kli mel2_7gl4i their complementsi and variants thereof.
6. An assemblage! comprising at least one nucleic acid molecule having the sequence of a clone selected from the group consisting of mel2_7gl4i mel2_12jl i mel2_7kl1i their complementsi and variants thereof-
1- An assemblage! comprising at least one nucleic acid molecule having the sequence of a clone selected from the group consisting of: amy2_2c22i fbr2_7δi21ϊ their complementsi and variants thereof.
ID- An assemblage! comprising at least one nucleic acid molecule having the sequence of a clone selected from the group consisting of: amy2_lln4i amy2_lili amy2_2gl2i fbr2_7βcl2i tes3_10ilbi tes3_31al0i their complementsi and variants thereof-
11- An assemblagei comprising at least one nucleic acid molecule having the sequence of a clone selected from the group consisting of: amy2_10hl7i amy2_lDp7i amy2_12d7i amy2_2flβi tes3_llc22i tes3_lld21i tes3_21f24i tes3_31j20i tes3_5k22i their complementsi and variants thereof-
12. An assemblage, comprising at least one nucleic acid molecule having the sequence of a clone selected from the group consisting of: tes3_lbb5, tes3_10ilfa, tes3_31al0, tes3_llc22, tes3_lld21, tes3_21f24, tes3_31j20, tes3_5k22, Tes3_10nl0, Tes3_llel7, Tes3_12dl8, Tes3_1417, Tes3_15nl4, Tes3_lfap3, Tes3_11pl2, Tes3_21kl4, Tes3_22ill, Tes3_22124, tes3_2bg3, tes3_30pb, tes3_llal7, tes3_17i21, tes3_20hl2, tes3_7nl2, tes3_1elfa, their complements, and variants thereof.
13. An assemblage, comprising at least one nucleic acid molecule having the sequence of a clone selected from the group consisting of: amy2_lld2, amy2_121ol7, amy2_lil4, amy2_24c8, fbr2_78d4, tes3_llal7, tes3_17i21, tes3_20hl2, tes3_7nl2, tes3_1elfa, their complements, and variants thereof.
14. An assemblage, comprising at least one nucleic acid molecule having the sequence of a clone selected from the group consisting of: amy2_14mlfa, tes3_18nl4, amy2_lcl2, amy2_2f22, their complements, and variants thereof.
15. A nucleic acid molecule comprising a nucleotide sequence of the clone fkd2_3kl.
16. A computer readable medium, comprising in electronic form at least one nucleic acid or protein sequence of a clone selected from the group consisting of: amy2_12g7, amy2_12il, amy2_13gl1, amy2_lfael4, amy2_24kl5, amy2_2al3, amy2_2il7, fbr2_78dl8, fbr2_78el8, amy2_121m2, amy2_24b4, amy2_121fl1, tes3_lfab5, amy2_li24, amy2_ljl1, amy2_2bl1, amy2_7j5, amy2_14b5, amy2_2ol3, fkd2_3kl, mel2_7gl4, mel2_12jl, mel2_7kl1, amy2_2c22, fbr2_78i21, amy2_lln4, amy2_lcl2, amy2_lil, amy2_2f22, amy2_2gl2, fbr2_78cl2, tes3_10ilb, tes3_31al0, amy2_10hl7, amy2_10p7, amy2_12d7, amy2_2fl8, tes3_llc22, tes3_lld21, tes3_21f24, tes3_31j20, tes3_5k22, Tes3_10nl0, Tes3_llel7, Tes3_12dl8, Tes3_1417, Tes3_15nl4, Tes3_lbp3, Tes3_11pl2, Tes3_21kl4, Tes3_22ill, Tes3_22124, tes3_2bg3, tes3_30pfa, amy2_lld2, amy2_121ol7, amy2_lil4, amy2_24c8, fbr2_78d4, tes3_llal7, tes3_17i21, tes3_20hl2, tes3_7nl2, tes3_1elfa, amy2_14mlb, tes3_18nl4, their complements, and variants thereof.
17. A computer readable medium, comprising in electronic form at least one nucleic acid or protein sequence of a clone selected from the group consisting of: amy2_12g7, amy2_12il, amy2_13gl1, amy2_lbel4, amy2_24kl5, amy2_2al3, amy2_2il7, amy2_121m2, amy2_24b4, amy2_121fl1, amy2_li24, amy2_ljl1, amy2_2bl1, amy2_7j5, amy2_14b5, amy2_2ol3, amy2_2c22, amy2_lln4, amy2_lcl2, amy2_lil, amy2_2f22, amy2_2gl2, amy2_10hl7, amy2_10p7, amy2_12d7, amy2_2fl8, amy2_lld2, amy2_121ol7, amy2_lil4, amy2_24c8, their complements, and variants thereof.
18. A computer readable medium, comprising in electronic form at least one nucleic acid or protein sequence of a clone selected from the group consisting of: fbr2_78dl8, fbr2_78el8, fbr2_78i21, fbr2_78cl2, fbr2_78d4, their complements, and variants thereof.
19. A computer readable medium, comprising in electronic form at least one nucleic acid or protein sequence of a clone selected from the group consisting of: amy2_121m2, amy2_24b4, their complements, and variants thereof.
20. A computer readable medium, comprising in electronic form at least one nucleic acid or protein sequence of a clone selected from the group consisting of: amy2_121fl1, tes3_lbb5, their complements, and variants thereof.
21. A computer readable medium, comprising in electronic form at least one nucleic acid or protein sequence of a clone selected from the group consisting of: amy2_li24, amy2_ljl1, amy2_2bl1, amy2_7j5, their complements, and variants thereof.
22. A computer readable medium, comprising in electronic form at least one nucleic acid or protein sequence of a clone selected from the group consisting of: amy2_14b5, amy2_2ol3, fkd2_3kl, mel2_7gl4, their complements, and variants thereof.
23. A computer readable medium, comprising in electronic form at least one nucleic acid or protein sequence of a clone selected from the group consisting of: mel2_12jl, mel2_7kl1, their complements, and variants thereof.
24. A computer readable medium, comprising in electronic form at least one nucleic acid or protein sequence of a clone selected from the group consisting of: amy2_2c22, fbr2_78i21, their complements, and variants thereof.
25. A computer readable medium, comprising in electronic form at least one nucleic acid or protein sequence of a clone selected from the group consisting of: amy2_lln4, amy2_lil, amy2_2gl2, fbr2_78cl2, tes3_10ilb, tes3_31al0, their complements, and variants thereof.
26. A computer readable medium, comprising in electronic form at least one nucleic acid or protein sequence of a clone selected from the group consisting of: amy2_10hl7, amy2_10p7, amy2_12d7, amy2_2fl8, tes3_llc22, tes3_lld21, tes3_21f24, tes3_31j20, tes3_5k22, their complements, and variants thereof.
27. A computer readable medium, comprising in electronic form at least one nucleic acid or protein sequence of a clone selected from the group consisting of: tes3_lfab5, tes3_10ilfa, tes3_31al0, tes3_llc22, tes3_lld21, tes3_21f24, tes3_31j20, tes3_5k22, Tes3_10nl0, Tes3_llel7, Tes3_12dl8, Tes3_1417, Tes3_15nl4, Tes3_lbp3, Tes3_11pl2, Tes3_21kl4, Tes3_22ill, Tes3_22124, tes3_2bg3, tes3_30pb, tes3_llal7, tes3_17i21, tes3_20hl2, tes3_7nl2, tes3_1elfa, their complements, and variants thereof.
28. A computer readable medium, comprising in electronic form at least one nucleic acid or protein sequence of a clone selected from the group consisting of: amy2_lld2, amy2_121ol7, amy2_lil4, amy2_24c8, fbr2_78d4, tes3_llal7, tes3_17i21, tes3_20hl2, tes3_7nl2, tes3_1elb, their complements, and variants thereof.
29. A computer readable medium, comprising in electronic form at least one nucleic acid or protein sequence of a clone selected from the group consisting of: amy2_14mlb, tes3_18nl4, amy2_lcl2, amy2_2f22, their complements, and variants thereof.
30. A computer readable medium, comprising in electronic form a nucleic acid or protein sequence of the clone fkd2_3kl.
PCT/IB2001/002050 2000-04-25 2001-04-25 Human dna sequences WO2001098454A2 (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
AU2001295841A AU2001295841A1 (en) 2000-04-25 2001-04-25 Human dna sequences

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
US19938000P 2000-04-25 2000-04-25
US60/199,380 2000-04-25

Publications (2)

Publication Number Publication Date
WO2001098454A2 (en) 2001-12-27
WO2001098454A8 (en) 2003-09-18

Family

ID=22737261

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/IB2001/002050 WO2001098454A2 (en) 2000-04-25 2001-04-25 Human dna sequences

Country Status (2)

Country Link
AU (1) AU2001295841A1 (en)
WO (1) WO2001098454A2 (en)

Cited By (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2002026957A3 (en) * 2000-09-28 2002-12-12 Bayer Ag Regulation of a human serine protease
WO2004028558A1 (en) * 2002-09-27 2004-04-08 Takeda Pharmaceutical Company Limited Preventives/remedies for neurodegenerative disease
US8124617B2 (en) 2005-09-01 2012-02-28 Takeda San Diego, Inc. Imidazopyridine compounds
US8034822B2 (en) 2006-03-08 2011-10-11 Takeda San Diego, Inc. Glucokinase activators
US8008332B2 (en) 2006-05-31 2011-08-30 Takeda San Diego, Inc. Substituted indazoles as glucokinase activators
US8394843B2 (en) 2006-05-31 2013-03-12 Takeda California, Inc. Substituted isoindoles as glucokinase activators
US8163779B2 (en) 2006-12-20 2012-04-24 Takeda San Diego, Inc. Glucokinase activators
US8173645B2 (en) 2007-03-21 2012-05-08 Takeda San Diego, Inc. Glucokinase activators
CN114854736A (en) * 2022-06-23 2022-08-05 香港中文大学(深圳) Circular nucleic acid molecule, method for producing same, nucleic acid probe and detection method
CN114854736B (en) * 2022-06-23 2023-12-15 香港中文大学(深圳) Circular nucleic acid molecule, method for producing same, nucleic acid probe, and method for detecting same

Also Published As

Publication number Publication date
AU2001295841A1 (en) 2002-01-02
WO2001098454A8 (en) 2003-09-18

Similar Documents

Publication Publication Date Title
EP1248798A2 (en) Human dna sequences
AU2023214237A1 (en) Modified polynucleotides for the production of biologics and proteins associated with human disease
KR102630357B1 (en) Multi-site SSI cells with difficult protein expression
US6262333B1 (en) Human genes and gene expression products
AU2016364667A1 (en) Materials and methods for treatment of Alpha-1 antitrypsin deficiency
AU2016376191A1 (en) Materials and methods for treatment of amyotrophic lateral sclerosis and/or frontal temporal lobular degeneration
EP0973896A2 (en) SECRETED EXPRESSED SEQUENCE TAGS (sESTs)
EP0973898A2 (en) SECRETED EXPRESSED SEQUENCE TAGS (sESTs)
CN107249318A (en) For cell, tissue and the organ of the genetic modification for treating disease
JP2003135075A (en) NEW FULL-LENGTH cDNA
JP2003088388A (en) NEW FULL-LENGTH cDNA
KR20190129857A (en) Mammalian Cells for Adeno-associated Virus Production
EP1165784A2 (en) Nucleic acids including open reading frames encoding polypeptides; &#34;orfx&#34;
KR20190006611A (en) Humanized non-human animals with restricted immunoglobulin heavy chain loci
WO1995014772A1 (en) Gene signature
WO1999055858A2 (en) Human nucleic acid sequences obtained from pancreas tumor tissue
US20040248256A1 (en) Secreted proteins and polynucleotides encoding them
AU2016202635B2 (en) Method for assessing embryotoxicity
KR20220025806A (en) Random configuration of nucleic acids targeted integration
JP2003159059A (en) Identification and use of molecule associated with pain
JP2003156489A (en) Identification and use of molecule associated with pain
KR102046839B1 (en) Method for in vitro diagnosis or prognosis of colon cancer
WO2001098454A2 (en) Human dna sequences
CN115151558A (en) Targeted integration in mammalian sequences enhances gene expression
US20030207286A1 (en) Nucleic acid sequences showing enhanced expression in benign neuroblastoma compared with acritical human neuroblastoma

Legal Events

Date Code Title Description
AK Designated states

Kind code of ref document: A2

Designated state(s): AE AG AL AM AT AU AZ BA BB BG BR BY BZ CA CH CN CR CU CZ DE DK DM DZ EE ES FI GB GD GE GH GM HR HU ID IL IN IS JP KE KG KP KR KZ LC LK LR LS LT LU LV MA MD MG MK MN MW MX MZ NO NZ PL PT RO RU SD SE SG SI SK SL TJ TM TR TT TZ UA UG US UZ VN YU ZA ZW

AL Designated countries for regional patents

Kind code of ref document: A2

Designated state(s): GH GM KE LS MW MZ SD SL SZ TZ UG ZW AM AZ BY KG KZ MD RU TJ TM AT BE CH CY DE DK ES FI FR GB GR IE IT LU MC NL PT SE TR BF BJ CF CG CI CM GA GN GW ML MR NE SN TD TG

121 Ep: the epo has been informed by wipo that ep was designated in this application
DFPE Request for preliminary examination filed prior to expiration of 19th month from priority date (pct application filed before 20040101)
122 Ep: pct application non-entry in european phase
D17 Declaration under article 17(2)a
NENP Non-entry into the national phase in:

Ref country code: JP