GB2381790A - LDL-receptor polypeptides - Google Patents

LDL-receptor polypeptides

Info

Publication number
GB2381790A
Authority
GB
United Kingdom
Prior art keywords
asp
leu
gly
ser
arg
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Withdrawn
Application number
GB0222372A
Other versions
GB0222372D0 (en
Inventor
Filippo Volpe
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Glaxo Group Ltd
Original Assignee
Glaxo Group Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Priority claimed from GB0123124A (GB0123124D0)
Priority claimed from GB0214703A (GB0214703D0)
Application filed by Glaxo Group Ltd
Publication of GB0222372D0
Publication of GB2381790A
Legal status: Withdrawn

Classifications

    • C: CHEMISTRY; METALLURGY
    • C07: ORGANIC CHEMISTRY
    • C07K: PEPTIDES
    • C07K14/00: Peptides having more than 20 amino acids; Gastrins; Somatostatins; Melanotropins; Derivatives thereof
    • C07K14/435: Peptides having more than 20 amino acids; Gastrins; Somatostatins; Melanotropins; Derivatives thereof from animals; from humans
    • C07K14/705: Receptors; Cell surface antigens; Cell surface determinants

Abstract

MEGF7 polypeptides and polynucleotides and methods for producing such polypeptides by recombinant methods are disclosed. The MEGF7 protein contains multiple epidermal growth factor (EGF)-like motifs and a signal peptide at the 5' end. The structure of human MEGF7 suggests that it may act as a type 1 cell surface receptor protein (having the N-terminal external to the cell and the C-terminal located intracellularly), and indicates that it is a member of the LDL receptor family of proteins.

Description

Novel Protein
Field of the Invention
This invention relates to newly identified polypeptides, and polynucleotides encoding such polypeptides, to their use in diagnosis and in identifying compounds that may be agonists or antagonists thereof that are potentially useful in therapy, and to production of such polypeptides and polynucleotides.
Background of the Invention
LDL receptors are known to be involved in the homeostasis of plasma lipids. Mutations in the LDL receptor gene cause familial hypercholesterolemia, marked by high plasma cholesterol levels and coronary heart disease (see e.g., Krieger and Herz, Annu. Rev. Biochem. 63: 601 (1994)). More recently it has been suggested that this receptor family controls the cellular uptake of a large number of different molecules, including lipoproteins, proteases and protease/inhibitor complexes, steroids and retinoids. Furthermore, gene disruption studies have shown the involvement of LDLR family members in embryonic development of the nervous system (Herz et al. Nature Reviews, 1: 51 (2000)). A growing body of evidence suggests a role in signal transduction pathways for at least some family members (LRP6, VLDL and ApoER2) (Pinson et al, Nature, 407: 535 (2000); Cooper and Howell, Cell, 97: 671 (1999)) and, in the case of LRP1, in synaptic transmission in adult brain (Zhuo et al. J. Neurosci. 20: 542 (2000); Bacskai et al. PNAS, 97: 11551 (2000)).
This receptor family shares structural and functional properties, and interacts with a diverse group of ligands. Structurally, members of the LDL receptor family display a short cytoplasmic domain with little conservation, with the exception of one or more NPXY motifs important for receptor internalization and as docking sites for adaptor proteins harbouring phosphotyrosine-binding domains (Gotthardt et al. J. Biol. Chem. 275: 25616 (2000)). A single transmembrane domain connects the carboxyl cytoplasmic region to a large amino terminal extracellular region. This extracellular region is mostly conserved throughout the gene family and is characterized by the presence of complement-type/LDL type A repeats (40 amino acids containing six cross-linked cysteine residues) and epidermal growth factor-like repeats (YWTD pi repeats). The latter repeats appear to be important for ligand dissociation following receptor internalization, while clusters of several complement-type/LDL type A repeats constitute the ligand-binding domain of the receptor (Simmons et al. J. Biol. Chem. 272: 25531 (1997)).
The EGF domain is believed to play a role in various extracellular events, including cell adhesion and receptor-ligand interactions. See e.g., Campbell and Bork, Curr. Opin. Struct. Biol. 3: 385 (1993). A sequence of about thirty to forty amino acid residues found in the sequence of EGF has also been found to be present, in a more or less conserved form, in a large number of other proteins; these domains are termed EGF-like domains. Various proteins that play critical roles in neuronal development contain multiple EGF-like domains in their extracellular portions; mutations in proteins with EGF-like domains have been associated with various animal and human disorders. See e.g., Nakayama et al., Genomics 51: 27 (1998). See also Davis, New Biol. 2: 410 (1990); Blomquist et al., Proc. Natl. Acad. Sci. U.S.A. 81: 7363 (1984); Barker et al., Protein Nucl. Acid Enz. 29: 54 (1986); Doolittle et al., Nature 307: 558 (1984); Appella et al., FEBS Lett. 231: 1 (1988). Known MEGF genes include reelin (mutations in reelin cause disorganization of the cerebellar and cerebral cortex in reeler mice), TAN-1/Notch1, Notch4, jagged protein and nel (see e.g., Ellisen et al., Cell 66: 649 (1991); Uyttendaele et al., Development 122: 2251 (1996); Lindsell et al. Cell 80: 909 (1995)).
Complexes of lipids and apolipoprotein E (APOE) bind to most of the LDL family members (Hussain et al., Annu. Rev. Nutr. 19: 141 (1999)). Three common alleles for APOE have been reported (ε2, ε3, ε4), with ε3/ε3 the most common genotype in Caucasian populations. The APOE4 allele has been associated with late onset of Alzheimer's disease. Variations in binding properties among the different ApoE isoforms have been reported and in some cases correlated to a specific pathology (i.e. ApoE-ε2 and type III hyperlipoproteinemia).
The identification of genes and encoded proteins with multiple EGF-like domains (MEGFs) is important in understanding specific cell-cell and/or ligand-receptor interactions. The availability of such genes and proteins provides methods of screening pharmaceutical compounds to identify those compounds that interact with, or affect the function of, such receptors.
Accordingly, there is a continuing need to identify and characterize additional human MEGF genes and proteins, and to identify their roles in metabolism and disease. Such molecules are useful in determining the pharmaceutical properties of chemical compounds (drug discovery), e.g., detecting whether the compound acts to increase or decrease the activity of the protein of the invention.
Summary of the Invention
The present invention relates to a human gene and encoded protein having Multiple Epidermal Growth Factor-like (EGF-like) domains; this protein is termed herein MEGF7. The present invention particularly relates to human MEGF7 polypeptides and MEGF7 polynucleotides, recombinant materials and methods for their production. In a further aspect, the invention relates to methods for identifying agonists and antagonists (e.g., inhibitors) using the materials provided by the invention, and treating conditions associated with alterations in MEGF7 function (hereinafter referred to as "diseases of the invention") with the identified compounds. In a still further aspect, the invention relates to diagnostic assays for detecting diseases associated with inappropriate MEGF7 activity or levels.
Description of the Invention
Nakayama et al. (Genomics 51: 27 (1998)) identified a number of genes for high molecular weight proteins with multiple Epidermal Growth Factor (EGF)-like motifs, including an unidentified LDL receptor-like protein (termed MEGF7; see GenBank accession number AB011540 dated 22 August 1998).
The present inventors determined the complete sequence of the human MEGF7, including the 5' end of the sequence containing further LDL type A repeats. The present nucleotide sequence includes eight complement-type/LDL type A repeats, followed by a signal peptide at the 5' end. The nucleotide molecule was initially isolated from human tissue. The structure of human MEGF7 and its presence in human tissue suggest that it may act as a type 1 cell surface receptor protein (having the N-terminal external to the cell and the C-terminal located intracellularly), and indicate that it is a member of the LDL receptor family of proteins.
The presence of a cluster of eight complement-type/LDL type A repeats suggests that this receptor shares ligand-binding properties common to other family members (i.e. ApoE-lipid complex binding). The complete MEGF7 structure appears similar to receptors that have recently been described as having a role in signal transduction pathways, for example, the VLDL receptor, which plays a role in regulating microtubule function in neurons.
The nucleotide sequence of human MEGF7 is provided herein as SEQ ID NO: 4; the encoded polypeptide is provided as SEQ ID NO: 5.
In a first aspect, the present invention relates to MEGF7 polypeptides. Such polypeptides include:
(a) an isolated polypeptide encoded by a polynucleotide comprising the sequence of SEQ ID NO: 4;
(b) an isolated polypeptide comprising a polypeptide sequence having at least 95%, 96%, 97%, 98%, or 99% identity to the polypeptide sequence of SEQ ID NO: 5;
(c) an isolated polypeptide comprising the polypeptide sequence of SEQ ID NO: 5;
(d) an isolated polypeptide having at least 95%, 96%, 97%, 98%, or 99% identity to the polypeptide sequence of SEQ ID NO: 5;
(e) the polypeptide sequence of SEQ ID NO: 5;
(f) an isolated polypeptide having or comprising a polypeptide sequence that has an Identity Index of 0.95, 0.96, 0.97, 0.98, or 0.99 compared to the polypeptide sequence of SEQ ID NO: 5; and
(g) fragments and variants of such polypeptides in (a) to (f).
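The identity thresholds in the list above can be illustrated with a small calculation. The following is only a sketch: this section does not say how identity is computed, so the code assumes two sequences that have already been aligned to equal length (gaps written as "-") and takes identity as matching columns divided by alignment length. The function names and the toy sequences are hypothetical and are not taken from SEQ ID NO: 5.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percentage of alignment columns where both sequences carry the same residue.

    Assumption: inputs are pre-aligned, equal in length, and use '-' for gaps.
    A gap never counts as a match.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    if not aligned_a:
        raise ValueError("empty alignment")
    matches = sum(
        1 for x, y in zip(aligned_a, aligned_b) if x == y and x != "-"
    )
    return 100.0 * matches / len(aligned_a)


def meets_identity(aligned_a: str, aligned_b: str, threshold: float = 95.0) -> bool:
    """True if the pair reaches the given percent-identity threshold."""
    return percent_identity(aligned_a, aligned_b) >= threshold


# Toy 20-residue reference and a variant with a single substitution (Y -> F).
reference = "ACDEFGHIKLMNPQRSTVWY"
variant = "ACDEFGHIKLMNPQRSTVWF"
print(percent_identity(reference, variant))
print(meets_identity(reference, variant, 95.0))
```

Other conventions (for example, dividing by the length of the shorter sequence) give different numbers, so the denominator should be fixed before comparing a value against a threshold.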
Polypeptides of the present invention are believed to be cell surface receptor proteins involved in signal transduction, and to be members of the low density lipoprotein receptor family of polypeptides. They are therefore of interest because they are human cell surface receptor proteins.
The biological properties of the human MEGF7 are hereinafter referred to as "biological activity of human MEGF7" or "human MEGF7 activity". Preferably, a polypeptide of the present invention exhibits at least one biological activity of human MEGF7.
Polypeptides of the present invention also include variants of the aforementioned polypeptides, including all allelic forms and splice variants. Such polypeptides vary from the reference polypeptide by insertions, deletions, and substitutions that may be conservative or nonconservative, or any combination thereof. Particularly preferred variants are those in which several, for instance from 50 to 30, from 30 to 20, from 20 to 10, from 10 to 5, from 5 to 3, from 3 to 2, from 2 to 1 or 1 amino acids are inserted, substituted, or deleted, in any combination.
Preferred fragments of polypeptides of the present invention include an isolated polypeptide comprising an amino acid sequence having at least 30, 50 or 100 contiguous amino acids from the amino acid sequence of SEQ ID NO: 5, or an isolated polypeptide comprising an amino acid sequence having at least 30, 50 or 100 contiguous amino acids truncated or deleted from the amino acid sequence of SEQ ID NO: 5. Preferred fragments are biologically active fragments that mediate the biological activity of human MEGF7, including those with a similar activity or an improved activity, or with a decreased undesirable activity. Also preferred are those fragments that are antigenic or immunogenic in an animal, especially in a human.
The invention also includes a polypeptide consisting of or comprising a polypeptide of the formula: (R1)m-(SEQ ID NO: 5)-(R2)n wherein each occurrence of R1 and R2 is independently any amino acid residue or modified amino acid residue, m is zero or is an integer between 1 and 1000, n is zero or is an integer between 1 and 1000, and SEQ ID NO: 5 is an amino acid sequence of the invention. In the formula above, SEQ ID NO: 5 is oriented so that its amino terminus is the amino acid residue at the left, covalently bound to R1, and its carboxy terminus is the amino acid residue at the right, covalently bound to R2. Any stretch of amino acid residues denoted by either R1 or R2, wherein m and/or n is greater than 1, may be either a heteropolymer or a homopolymer, preferably a heteropolymer. Other suitable embodiments of the invention are those wherein m is an integer between 1 and 50, 1 and 100, or 1 and 500, and n is an integer between 1 and 50, 1 and 100, or 1 and 500.
It will be appreciated by those skilled in the art that, in the above identified structure, R1 or R2 or both may represent sequences such as a leader or secretory sequence, a pre-, pro- or prepro-protein sequence or the like as further described below.
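The flanking formula can be read as simple string concatenation: an N-terminal stretch of m residues, the core sequence, then a C-terminal stretch of n residues. The sketch below is illustrative only; the function name, the three-letter placeholder core and the example flanks are hypothetical, and modified residues are not modelled.

```python
MAX_FLANK = 1000  # upper bound on m and n stated for the polypeptide formula


def flanked_polypeptide(core: str, r1: str = "", r2: str = "") -> str:
    """Return r1 + core + r2, i.e. an N-terminal flank, the core, a C-terminal flank.

    len(r1) plays the role of m and len(r2) the role of n; either may be zero.
    """
    if not core:
        raise ValueError("core sequence must not be empty")
    if len(r1) > MAX_FLANK or len(r2) > MAX_FLANK:
        raise ValueError("each flank may contain at most 1000 residues")
    return r1 + core + r2


# Placeholder core (NOT SEQ ID NO: 5) with a one-residue N-terminal flank
# and a six-histidine C-terminal flank of the kind used as a purification tag.
print(flanked_polypeptide("ACD", r1="M", r2="H" * 6))  # MACDHHHHHH
```

The six-histidine flank is a homopolymer; a leader or pro-sequence supplied as `r1` would be a heteropolymer.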
Fragments of the polypeptides of the invention may be employed for producing the corresponding full-length polypeptide by peptide synthesis; therefore, these variants may be employed as intermediates for producing the full-length polypeptides of the invention. The polypeptides of the present invention may be in the form of the "mature" protein or may be a part of a larger protein such as a precursor or a fusion protein. It is often advantageous to include an additional amino acid sequence that contains secretory or leader sequences, pro-sequences, sequences that aid in purification, for instance multiple histidine residues, or an additional sequence for stability during recombinant production.
Polypeptides of the present invention can be prepared in any suitable manner, for instance by isolation from naturally occurring sources, from genetically engineered host cells comprising expression systems (vide infra) or by chemical synthesis, using for instance automated peptide synthesizers, or a combination of such methods. Means for preparing such polypeptides are well understood in the art.
In a further aspect, the present invention relates to human MEGF7 polynucleotides. Such polynucleotides include:
(a) an isolated polynucleotide comprising a polynucleotide sequence having at least 95%, 96%, 97%, 98%, or 99% identity to the polynucleotide sequence of SEQ ID NO: 4;
(b) an isolated polynucleotide comprising the polynucleotide of SEQ ID NO: 4;
(c) an isolated polynucleotide having at least 95%, 96%, 97%, 98%, or 99% identity to the polynucleotide of SEQ ID NO: 4;
(d) the isolated polynucleotide of SEQ ID NO: 4;
(e) an isolated polynucleotide comprising a polynucleotide sequence encoding a polypeptide sequence having at least 95%, 96%, 97%, 98%, or 99% identity to the polypeptide sequence of SEQ ID NO: 5;
(f) an isolated polynucleotide comprising a polynucleotide sequence encoding the polypeptide of SEQ ID NO: 5;
(g) an isolated polynucleotide having a polynucleotide sequence encoding a polypeptide sequence having at least 95%, 96%, 97%, 98%, or 99% identity to the polypeptide sequence of SEQ ID NO: 5;
(h) an isolated polynucleotide encoding the polypeptide of SEQ ID NO: 5;
(i) an isolated polynucleotide having or comprising a polynucleotide sequence that has an Identity Index of 0.95, 0.96, 0.97, 0.98, or 0.99 compared to the polynucleotide sequence of SEQ ID NO: 4;
(j) an isolated polynucleotide having or comprising a polynucleotide sequence encoding a polypeptide sequence that has an Identity Index of 0.95, 0.96, 0.97, 0.98, or 0.99 compared to the polypeptide sequence of SEQ ID NO: 5;
and polynucleotides that are fragments and variants of the above mentioned polynucleotides or that are complementary to the above mentioned polynucleotides, over the entire length thereof.
Preferred fragments of polynucleotides of the present invention include an isolated polynucleotide comprising a nucleotide sequence having at least 15, 30, 50 or 100 contiguous nucleotides from the sequence of SEQ ID NO: 4, or an isolated polynucleotide comprising a sequence having at least 30, 50 or 100 contiguous nucleotides truncated or deleted from the sequence of SEQ ID NO: 4.
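A fragment of "at least N contiguous nucleotides" is any window of length N or more taken from the parent sequence. As a minimal sketch, the hypothetical helper below enumerates all windows of one exact length with a sliding index; the short input is a toy sequence, not SEQ ID NO: 4.

```python
def contiguous_fragments(sequence: str, length: int) -> list[str]:
    """Return every contiguous window of exactly `length` characters, left to right.

    Returns an empty list when the sequence is shorter than `length`.
    """
    if length < 1:
        raise ValueError("length must be a positive integer")
    return [
        sequence[start:start + length]
        for start in range(len(sequence) - length + 1)
    ]


print(contiguous_fragments("ACGTAC", 4))  # ['ACGT', 'CGTA', 'GTAC']
```

A sequence of length L yields L - N + 1 windows of length N, so a 15-nucleotide minimum applied to a long cDNA produces a very large set; in practice one would test membership (`candidate in sequence`) instead of enumerating.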
The invention also includes a polynucleotide consisting of or comprising a polynucleotide of the formula: (R1)m-(SEQ ID NO: 4)-(R2)n wherein each occurrence of R1 and R2 is independently any nucleic acid residue or modified nucleic acid residue, m is zero or an integer between 1 and 3000, n is zero or an integer between 1 and 3000, and SEQ ID NO: 4 is a nucleotide sequence of the invention. In the polynucleotide formula above, SEQ ID NO: 4 is oriented so that its 5' end nucleic acid residue is at the left, bound to R1, and its 3' end nucleic acid residue is at the right, bound to R2. Any stretch of nucleic acid residues denoted by R1 or R2, wherein m or n or both are greater than 1, may be either a heteropolymer or a homopolymer, preferably a heteropolymer. Where R1 and R2 are joined together by a covalent bond, the polynucleotide of the above formula is a closed, circular polynucleotide, that can be a double-stranded polynucleotide wherein the formula shows a first strand to which the second strand is complementary. In another embodiment m or n or both are an integer between 1 and 1000. Other embodiments of the invention include those wherein m is an integer between 1 and 50, 1 and 100 or 1 and 500, and n is an integer between 1 and 50, 1 and 100, or 1 and 500.
Preferred variants of polynucleotides of the present invention include splice variants, allelic variants, and polymorphisms, including polynucleotides having one or more single nucleotide polymorphisms (SNPs).
Polynucleotides of the present invention also include polynucleotides encoding polypeptide variants that comprise the amino acid sequence of SEQ ID NO: 5 and in which several, for instance from 50 to 30, from 30 to 20, from 20 to 10, from 10 to 5, from 5 to 3, from 3 to 2, from 2 to 1 or 1 amino acid residues are substituted, deleted or added, in any combination.
In a further aspect, the present invention provides polynucleotides that are RNA transcripts of the DNA sequences of the present invention. Accordingly, there is provided an RNA polynucleotide that: (a) comprises an RNA transcript of the DNA sequence encoding the polypeptide of SEQ ID NO: 5; (b) is the RNA transcript of the DNA sequence encoding the polypeptide of SEQ ID NO: 5; (c) comprises an RNA transcript of the DNA sequence of SEQ ID NO: 4; or (d) is the RNA transcript of the DNA sequence of SEQ ID NO: 4; and RNA polynucleotides that are complementary thereto.
The polynucleotide sequence of SEQ ID NO: 4 is a cDNA sequence that encodes the polypeptide of SEQ ID NO: 5. The polynucleotide sequence encoding the polypeptide of SEQ ID NO: 5 may be identical to the polypeptide encoding sequence of SEQ ID NO: 4 or it may be a sequence other than SEQ ID NO: 4, which, as a result of the redundancy (degeneracy) of the genetic code, also encodes the polypeptide of SEQ ID NO: 5.
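The degeneracy point can be made concrete: two different coding sequences can translate to the same polypeptide because most amino acids are specified by more than one codon. The sketch below builds the standard genetic code table and translates two toy sequences; the sequences and the `translate` helper are illustrative assumptions and are unrelated to SEQ ID NO: 4.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order at each position.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(coding_dna: str) -> str:
    """Translate a coding DNA sequence codon by codon, stopping at a stop codon."""
    if len(coding_dna) % 3 != 0:
        raise ValueError("sequence length must be a multiple of three")
    protein = []
    for i in range(0, len(coding_dna), 3):
        residue = CODON_TABLE[coding_dna[i:i + 3].upper()]
        if residue == "*":  # stop codon ends translation
            break
        protein.append(residue)
    return "".join(protein)


seq_a = "ATGGATCTGGGCAGCCGT"  # Met-Asp-Leu-Gly-Ser-Arg
seq_b = "ATGGACTTAGGTTCACGC"  # same residues, different codons
print(translate(seq_a), translate(seq_b))  # MDLGSR MDLGSR
print(seq_a == seq_b)                      # False
```

The two inputs differ at five of their six codons yet encode the same six-residue peptide, which is the situation the paragraph above describes for sequences other than SEQ ID NO: 4.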
Preferred polypeptides and polynucleotides of the present invention are expected to have, inter alia, similar biological functions/properties to their homologous polypeptides and polynucleotides. Furthermore, preferred polypeptides and polynucleotides of the present invention have at least one human MEGF7 activity.
Polynucleotides of the present invention may be obtained using standard cloning and screening techniques from a cDNA library derived from mRNA in cells of human tissue, particularly brain tissue (see for instance, Sambrook et al., Molecular Cloning: A Laboratory Manual, 2nd Ed., Cold Spring Harbor Laboratory Press, Cold Spring Harbor, N.Y. (1989)). Polynucleotides of the invention can also be obtained from natural sources such as genomic DNA libraries or can be synthesized using well known and commercially available techniques.
When polynucleotides of the present invention are used for the recombinant production of polypeptides of the present invention, the polynucleotide may include the coding sequence for the mature polypeptide, by itself, or the coding sequence for the mature polypeptide in reading frame with other coding sequences, such as those encoding a leader or secretory sequence, a pre-, pro- or prepro-protein sequence, or other fusion peptide portions. For example, a marker sequence that facilitates purification of the fused polypeptide can be encoded. In certain preferred embodiments of this aspect of the invention, the marker sequence is a hexa-histidine peptide, as provided in the pQE vector (Qiagen, Inc.) and described in Gentz et al., Proc Natl Acad Sci USA (1989) 86: 821-824, or is an HA tag. The polynucleotide may also contain non-coding 5' and 3' sequences, such as transcribed, non-translated sequences, splicing and polyadenylation signals, ribosome binding sites and sequences that stabilize mRNA.
Polynucleotides that are identical, or have sufficient identity to a polynucleotide sequence of SEQ ID NO: 4, may be used as hybridization probes for cDNA and genomic DNA or as primers for a nucleic acid amplification reaction (for instance, PCR). Such probes and primers may be used to isolate full-length cDNAs and genomic clones encoding polypeptides of the present invention and to isolate cDNA and genomic clones of other genes (including genes encoding paralogs from human sources and orthologs and paralogs from species other than human) that have a high sequence similarity to SEQ ID NO: 4, typically at least 95% identity. Preferred probes and primers will generally comprise at least 15 nucleotides, preferably at least 30 nucleotides, and may have at least 50, if not at least 100 nucleotides. Particularly preferred probes will have between 30 and 50 nucleotides. Particularly preferred primers will have between 20 and 25 nucleotides.
A polynucleotide encoding a polypeptide of the present invention, including homologs from species other than human, may be obtained by a process comprising the steps of screening a library under stringent hybridization conditions with a labeled probe having the sequence of SEQ ID NO: 4 or a fragment thereof, preferably of at least 15 nucleotides; and isolating full-length cDNA and genomic clones containing said polynucleotide sequence. Such hybridization techniques are well known to the skilled artisan. Preferred stringent hybridization conditions include overnight incubation at 42°C in a solution comprising: 50% formamide, 5x SSC (150 mM NaCl, 15 mM trisodium citrate), 50 mM sodium phosphate (pH 7.6), 5x Denhardt's solution, 10% dextran sulfate, and 20 microgram/ml denatured, sheared salmon sperm DNA; followed by washing the filters in 0.1x SSC at about 65°C. Thus the present invention also includes isolated polynucleotides, preferably with a nucleotide sequence of at least 100, obtained by screening a library under stringent hybridization conditions with a labeled probe having the sequence of SEQ ID NO: 4 or a fragment thereof, preferably of at least 15 nucleotides.
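The salt concentrations implied by the hybridization and wash steps follow from scaling the SSC stock. The sketch below assumes that the parenthetical composition (150 mM NaCl, 15 mM trisodium citrate) is that of 1x SSC, which is the usual definition; under that assumption the 5x hybridization solution and the 0.1x wash differ 50-fold in salt. The names are hypothetical.

```python
# Assumed composition of 1x SSC in millimolar (the standard definition).
SSC_1X_MM = {"NaCl": 150.0, "trisodium citrate": 15.0}


def ssc_concentrations(fold: float) -> dict[str, float]:
    """Scale the 1x SSC composition to the requested strength (e.g. 5 or 0.1)."""
    if fold <= 0:
        raise ValueError("fold must be positive")
    return {component: mm * fold for component, mm in SSC_1X_MM.items()}


for strength in (5, 0.1):
    scaled = ssc_concentrations(strength)
    summary = ", ".join(f"{name} {mm:.1f} mM" for name, mm in scaled.items())
    print(f"{strength}x SSC: {summary}")
```

Lower salt in the wash, combined with the higher temperature, destabilizes imperfectly matched duplexes, which is why the wash step is the more stringent of the two.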
Recombinant polypeptides of the present invention may be prepared by processes well known in the art from genetically engineered host cells comprising expression systems.
Accordingly, in a further aspect, the present invention relates to expression systems comprising a polynucleotide or polynucleotides of the present invention, to host cells which are genetically engineered with such expression systems and to the production of polypeptides of the invention by recombinant techniques. Cell-free translation systems can also be employed to produce such proteins using RNAs derived from the DNA constructs of the present invention.
For recombinant production, host cells can be genetically engineered to incorporate expression systems or portions thereof for polynucleotides of the present invention. Polynucleotides may be introduced into host cells by methods described in many standard laboratory manuals, such as Davis et al., Basic Methods in Molecular Biology (1986) and Sambrook et al. (ibid). Preferred methods of introducing polynucleotides into host cells include, for instance, calcium phosphate transfection, DEAE-dextran mediated transfection, transvection, micro-injection, cationic lipid-mediated transfection, electroporation, transduction, scrape loading, ballistic introduction or infection.
Representative examples of appropriate hosts include bacterial cells, such as Streptococci, Staphylococci, E. coli, Streptomyces and Bacillus subtilis cells; fungal cells, such as yeast cells and Aspergillus cells; insect cells such as Drosophila S2 and Spodoptera Sf9 cells; animal cells such as CHO, COS, HeLa, C127, 3T3, BHK, HEK 293 and Bowes melanoma cells; and plant cells.
A great variety of expression systems can be used, for instance, chromosomal, episomal and virus-derived systems, e.g., vectors derived from bacterial plasmids, from bacteriophage, from transposons, from yeast episomes, from insertion elements, from yeast chromosomal elements, from viruses such as baculoviruses, papova viruses, such as SV40, vaccinia viruses, adenoviruses, fowl pox viruses, pseudorabies viruses and retroviruses, and vectors derived from combinations thereof, such as those derived from plasmid and bacteriophage genetic elements, such as cosmids and phagemids. The expression systems may contain control regions that regulate as well as engender expression. Generally, any system or vector that is able to maintain, propagate or express the polynucleotide to produce a polypeptide in a host may be used. The appropriate polynucleotide sequence may be inserted into an expression system by any of a variety of well-known and routine techniques, such as, for example, those set forth in Sambrook et al. (ibid). Appropriate secretion signals may be incorporated into the desired polypeptide to allow secretion of the translated protein into the lumen of the endoplasmic reticulum, the periplasmic space or the extracellular environment. These signals may be endogenous to the polypeptide or they may be heterologous signals.
If a polypeptide of the present invention is to be expressed for use in screening assays, it is generally preferred that the polypeptide be produced at the surface of the cell. In this event, the cells may be harvested prior to use in the screening assay. If the polypeptide is secreted into the medium, the medium can be recovered in order to recover and purify the polypeptide. If produced intracellularly, the cells must first be lysed before the polypeptide is recovered.
Polypeptides of the present invention can be recovered and purified from recombinant cell cultures by well-known methods including ammonium sulfate or ethanol precipitation, acid extraction, anion or cation exchange chromatography, phosphocellulose chromatography, hydrophobic interaction chromatography, affinity chromatography, hydroxylapatite chromatography and lectin chromatography. Most preferably, high performance liquid chromatography is employed for purification. Well known techniques for refolding proteins may be employed to regenerate active conformation when the polypeptide is denatured during intracellular synthesis, isolation and/or purification.
Polynucleotides of the present invention may be used as diagnostic reagents, through detecting mutations in the associated gene. Detection of a mutated form of the gene characterized by the polynucleotide of SEQ ID NO: 4 in the cDNA or genomic sequence and which is associated with a dysfunction will provide a diagnostic tool that can add to, or define, a diagnosis of a disease, or susceptibility to a disease, which results from under-expression, over-expression or altered spatial or temporal expression of the gene. Individuals carrying mutations in the gene may be detected at the DNA level by a variety of techniques well known in the art.
Nucleic acids for diagnosis may be obtained from a subject's cells, such as from blood, urine, saliva, tissue biopsy or autopsy material. The genomic DNA may be used directly for detection or it may be amplified enzymatically by using PCR, preferably RT-PCR, or other amplification techniques prior to analysis. RNA or cDNA may also be used in similar fashion. Deletions and insertions can be detected by a change in size of the amplified product in comparison to the normal genotype. Point mutations can be identified by hybridizing amplified DNA to labeled MEGF7 nucleotide sequences. Perfectly matched sequences can be distinguished from mismatched duplexes by RNase digestion or by differences in melting temperatures. DNA sequence differences may also be detected by alterations in the electrophoretic mobility of DNA fragments in gels, with or without denaturing agents, or by direct DNA sequencing (see, for instance, Myers et al., Science (1985) 230: 1242). Sequence changes at specific locations may also be revealed by nuclease protection assays, such as RNase and S1 protection or the chemical cleavage method (see Cotton et al., Proc Natl Acad Sci USA (1985) 85: 4397-4401).
An array of oligonucleotide probes comprising MEGF7 polynucleotide sequence or fragments thereof can be constructed to conduct efficient screening of, e.g., genetic mutations. Such arrays are preferably high density arrays or grids. Array technology methods are well known and have general applicability and can be used to address a variety of questions in molecular genetics including gene expression, genetic linkage, and genetic variability; see, for example, M. Chee et al., Science, 274, 610-613 (1996) and other references cited therein.
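The size-based logic described above (insertions and deletions change amplicon length, while substitutions leave it unchanged) can be sketched in a few lines. This is a simplified illustration working on sequence strings, not a description of any laboratory assay; the function name, the category labels and the toy sequences are hypothetical.

```python
def classify_amplicon(reference: str, sample: str) -> str:
    """Compare a sample amplicon with the reference (normal genotype) amplicon.

    A length difference is reported as a deletion or insertion; equal-length
    amplicons are scanned position by position for substitutions.
    """
    if len(sample) < len(reference):
        return "deletion"
    if len(sample) > len(reference):
        return "insertion"
    mismatches = [
        i for i, (ref_base, smp_base) in enumerate(zip(reference, sample))
        if ref_base != smp_base
    ]
    return "substitution" if mismatches else "no change detected"


reference = "ACGTACGT"
for sample in ("ACGTCGT", "ACGTTACGT", "ACGTAGGT", "ACGTACGT"):
    print(sample, classify_amplicon(reference, sample))
```

Note that a compensating insertion and deletion within one amplicon would leave the length unchanged, so a size comparison alone cannot rule out every rearrangement; the text lists sequencing and mobility-based methods for such cases.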
Detection of abnormally decreased or increased levels of polypeptide or mRNA expression may also be used for diagnosing or determining susceptibility of a subject to a disease of the invention. Decreased or increased expression can be measured at the RNA level using any of the methods well known in the art for the quantitation of polynucleotides, such as, for example, nucleic acid amplification, for instance PCR, RT-PCR, RNase protection, Northern blotting and other hybridization methods. Assay techniques that can be used to determine levels of a protein, such as a polypeptide of the present invention, in a sample derived from a host are well-known to those of skill in the art. Such assay methods include radio-immunoassays, competitive-binding assays, Western Blot analysis and ELISA assays.
Thus in another aspect, the present invention relates to a diagnostic kit comprising: (a) a polynucleotide of the present invention, preferably the nucleotide sequence of SEQ ID NO: 4, or a fragment or an RNA transcript thereof; (b) a nucleotide sequence complementary to that of (a); (c) a polypeptide of the present invention, preferably the polypeptide of SEQ ID NO: 5 or a fragment thereof; or (d) an antibody to a polypeptide of the present invention, preferably to the polypeptide of SEQ ID NO: 5.
It will be appreciated that in any such kit, (a), (b), (c) or (d) may comprise a substantial component. Such a kit will be of use in diagnosing a disease or susceptibility to a disease, particularly diseases of the invention, amongst others.
The polynucleotide sequences of the present invention are valuable for chromosome localisation studies. The sequence is specifically targeted to, and can hybridize with, a particular location on an individual human chromosome. The mapping of relevant sequences to chromosomes according to the present invention is an important first step in correlating those sequences with gene associated disease. Once a sequence has been mapped to a precise chromosomal location, the physical position of the sequence on the chromosome can be correlated with genetic map data. Such data are found in, for example, V. McKusick, Mendelian Inheritance in Man (available on-line through Johns Hopkins University Welch Medical Library).
The relationships between genes and diseases that have been mapped to the same chromosomal region are then identified through linkage analysis (co-inheritance of physically adjacent genes).
Precise human chromosomal localisations for a genomic sequence (gene fragment etc.) can be determined using Radiation Hybrid (RH) Mapping (Walter, M., Spillett, D., Thomas, P., Weissenbach, J., and Goodfellow, P. (1994) A method for constructing radiation hybrid maps of whole genomes, Nature Genetics 7, 22-28). A number of RH panels are available from Research Genetics (Huntsville, AL, USA), e.g. the GeneBridge4 RH panel (Hum Mol Genet 1996 Mar; 5(3): 339-46. A radiation hybrid map of the human genome. Gyapay G, Schmitt K, Fizames C, Jones H, Vega-Czarny N, Spillett D, Muselet D, Prud'Homme JF, Dib C, Auffray C, Morissette J, Weissenbach J, Goodfellow PN). To determine the chromosomal location of a gene using this panel, 93 PCRs are performed on RH DNAs using primers designed from the gene of interest. Each of these DNAs contains random human genomic fragments maintained in a hamster background (human/hamster hybrid cell lines). These PCRs result in 93 scores indicating the presence or absence of the PCR product of the gene of interest. These scores are compared with scores created using PCR products from genomic sequences of known location. This comparison is conducted at http://www.genome.wi.mit.edu/. The gene of the present invention maps to human chromosome location 11p11.2-p12.
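The score comparison described above can be illustrated with a short sketch. It is only an illustration, assuming that each RH score vector is written as a string of '1' (PCR product present), '0' (absent) and '?' (not typed). The function names and the marker names in the usage note are hypothetical, and the mapping server named in the text applies a statistical linkage analysis rather than this simple agreement count.

```python
from typing import Dict


def concordance(scores_a: str, scores_b: str) -> float:
    """Fraction of hybrids on which two presence/absence vectors agree."""
    if len(scores_a) != len(scores_b):
        raise ValueError("score vectors must cover the same hybrid panel")
    # Only hybrids typed unambiguously ('0' or '1') in both vectors are compared.
    typed = [(a, b) for a, b in zip(scores_a, scores_b) if a in "01" and b in "01"]
    if not typed:
        return 0.0
    return sum(a == b for a, b in typed) / len(typed)


def closest_marker(gene_scores: str, marker_scores: Dict[str, str]) -> str:
    """Name of the known-location marker whose scores best match the gene."""
    return max(
        marker_scores,
        key=lambda name: concordance(gene_scores, marker_scores[name]),
    )
```

With a toy panel of seven hybrids instead of 93, `closest_marker("1100101", {"marker_A": "1100100", "marker_B": "0011010"})` picks `marker_A`, which agrees with the gene on six of seven hybrids.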
The polynucleotide sequences of the present invention are also valuable tools for tissue expression studies. Such studies allow the determination of expression patterns of polynucleotides of the present invention which may give an indication as to the expression patterns of the encoded polypeptides in tissues, by detecting the mRNAs that encode them. The techniques used are well known in the art and include in situ hybridization techniques to clones arrayed on a grid, such as cDNA microarray hybridization (Schena et al., Science, 270, 467-470, 1995 and Shalon et al., Genome Res, 6, 639-645, 1996) and nucleotide amplification techniques such as PCR. A preferred method uses the TAQMAN technology available from Perkin Elmer.
Results from these studies can provide an indication of the normal function of the polypeptide in the organism. In addition, comparative studies of the normal expression pattern of mRNAs with that of mRNAs encoded by an alternative form of the same gene (for example, one having an alteration in polypeptide coding potential or a regulatory mutation) can provide valuable insights into the role of the polypeptides of the present invention, or that of inappropriate expression
thereof in disease. Such inappropriate expression may be of a temporal, spatial or simply quantitative nature.
The polypeptides of the present invention are expressed in brain tissue.
A further aspect of the present invention relates to antibodies. The polypeptides of the invention or their fragments, or cells expressing them, can be used as immunogens to produce antibodies that are immunospecific for polypeptides of the present invention. The term "immunospecific" means that the antibodies have substantially greater affinity for the polypeptides of the invention than their affinity for other related polypeptides in the prior art.
Antibodies generated against polypeptides of the present invention may be obtained by administering the polypeptides or epitope-bearing fragments, or cells to an animal, preferably a nonhuman animal, using routine protocols. For preparation of monoclonal antibodies, any technique which provides antibodies produced by continuous cell line cultures can be used. Examples include the hybridoma technique (Kohler, G. and Milstein, C., Nature (1975) 256: 495-497), the trioma technique, the human B-cell hybridoma technique (Kozbor et al., Immunology Today (1983) 4: 72) and the EBV-hybridoma technique (Cole et al., Monoclonal Antibodies and Cancer Therapy, 77-96, Alan R. Liss, Inc., 1985).
Techniques for the production of single chain antibodies, such as those described in U.S. Patent No. 4,946,778, can also be adapted to produce single chain antibodies to polypeptides of this invention. Also, transgenic mice, or other organisms, including other mammals, may be used to express humanized antibodies.
The above-described antibodies may be employed to isolate or to identify clones expressing the polypeptide or to purify the polypeptides by affinity chromatography. Antibodies against polypeptides of the present invention may also be employed to treat diseases of the invention, amongst others.
Polypeptides of the present invention have one or more biological functions that are of relevance in proper physiologic function. It is therefore useful to identify compounds that stimulate or inhibit the function or level of the polypeptide. Accordingly, in a further aspect, the present invention provides for a method of screening compounds to identify those that stimulate or inhibit the function or level of the polypeptide. Such methods identify agonists or antagonists that may be employed for therapeutic and prophylactic purposes for diseases associated with altered MEGF7 function. Compounds may be identified from a variety of sources, for example, cells, cell-free preparations, chemical libraries, collections of chemical compounds, and natural product mixtures.
Such agonists or antagonists so-identified may be natural or modified substrates, ligands, receptors, enzymes, etc., as the case may be, of the polypeptide; a structural or functional mimetic thereof (see Coligan et al., Current Protocols in Immunology 1(2): Chapter 5 (1991)) or a small molecule. Such small molecules preferably have a molecular weight below 2,000 daltons, more preferably between 300 and 1,000 daltons, and most preferably between 400 and 700 daltons. It is preferred that these small molecules are organic molecules.
The screening method may simply measure the binding of a candidate compound to the polypeptide, or to cells or membranes bearing the polypeptide, or a fusion protein thereof, by means of a label directly or indirectly associated with the candidate compound. Alternatively, the screening method may involve measuring or detecting (qualitatively or quantitatively) the competitive binding of a candidate compound to the polypeptide against a labeled competitor (e.g. agonist or antagonist). Further, these screening methods may test whether the candidate compound results in a signal generated by activation or inhibition of the polypeptide, using detection systems appropriate to the cells bearing the polypeptide. Inhibitors of activation are generally assayed in the presence of a known agonist, and the effect of the candidate compound on activation by the agonist is observed. Further, the screening methods may simply comprise the steps of mixing a candidate compound with a solution containing a polypeptide of the present invention, to form a mixture, measuring a MEGF7 activity in the
mixture, and comparing the MEGF7 activity of the mixture to a control mixture which contains no candidate compound.
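The comparison of a test mixture with a compound-free control can be sketched as a simple calculation. This is a minimal illustration only: the function names, the activity units and the 20 % decision threshold are assumptions introduced here and are not taken from the specification.

```python
def percent_change(test_activity: float, control_activity: float) -> float:
    """Change in measured activity relative to the compound-free control, in %."""
    if control_activity == 0:
        raise ValueError("control activity must be non-zero")
    return 100.0 * (test_activity - control_activity) / control_activity


def classify_compound(test_activity: float, control_activity: float,
                      threshold: float = 20.0) -> str:
    """Label a candidate compound; the threshold is a hypothetical cut-off."""
    change = percent_change(test_activity, control_activity)
    if change >= threshold:
        return "stimulates"
    if change <= -threshold:
        return "inhibits"
    return "no clear effect"
```

For example, a mixture showing 50 activity units against a control of 100 gives a change of -50 % and would be labelled as inhibiting under this assumed threshold.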
Polypeptides of the present invention may be employed in conventional low capacity screening methods and also in high-throughput screening (HTS) formats. Such HTS formats include not only the well-established use of 96- and, more recently, 384-well microtiter plates but also emerging methods such as the nanowell method described by Schullek et al., Anal Biochem., 246: 20 (1997).
Fusion proteins, such as those made from Fc portion and MEGF7 polypeptide, as hereinbefore described, can also be used for high-throughput screening assays to identify antagonists for the polypeptide of the present invention (see D. Bennett et al., J Mol Recognition, 8: 52-58 (1995); and K. Johanson et al., J Biol Chem, 270(16): 9459-9471 (1995)).
The polynucleotides, polypeptides and antibodies to the polypeptide of the present invention may also be used to configure screening methods for detecting the effect of added compounds on the production of mRNA and polypeptide in cells. For example, an ELISA assay may be constructed for measuring secreted or cell associated levels of polypeptide using monoclonal and polyclonal antibodies by standard methods known in the art. This can be used to discover agents that may inhibit or enhance the production of polypeptide (also called antagonist or agonist, respectively) from suitably manipulated cells or tissues.
Examples of antagonists of polypeptides of the present invention include antibodies or, in some cases, oligonucleotides or proteins that are closely related to the ligands of the polypeptide, e.g., a fragment of the ligands; or a small molecule that binds to the polypeptide of the present invention but does not elicit a response, so that the activity of the polypeptide is prevented.
Screening methods may also involve the use of transgenic technology and the MEGF7 gene. The art of constructing transgenic animals is well established. For example, the MEGF7 gene may be introduced through microinjection into the male pronucleus of fertilized oocytes, retroviral transfer into pre- or post-implantation embryos, or injection of genetically modified (such as by electroporation) embryonic stem cells into host blastocysts. Particularly useful transgenic animals are so-called "knock-in" animals in which an animal gene is replaced by the human equivalent within the genome of that animal. Knock-in transgenic animals are useful in the drug discovery process, for target validation, where the compound is specific for the human target. Other useful transgenic animals are so-called "knock-out" animals in which the expression of the animal ortholog of a polypeptide of the present invention and encoded by an endogenous DNA sequence in a cell is partially or completely annulled. The gene knock-out may be targeted to specific cells or tissues, may occur only in certain cells or tissues as a consequence of the limitations of the technology, or may occur in all, or substantially all, cells in the animal.
Transgenic animal technology also offers a whole animal expression-cloning system in which introduced genes are expressed to give large amounts of polypeptides of the present invention.
Screening kits for use in the above described methods form a further aspect of the present invention. Such screening kits comprise: (a) a polypeptide of the present invention; (b) a recombinant cell expressing a polypeptide of the present invention; (c) a cell membrane expressing a polypeptide of the present invention; or (d) an antibody to a polypeptide of the present invention; which polypeptide is preferably that of SEQ ID NO: 5.
It will be appreciated that in any such kit, (a), (b), (c) or (d) may comprise a substantial component.
Glossary
The following definitions are provided to facilitate understanding of certain terms used frequently hereinbefore.
"Antibodies" as used herein includes polyclonal and monoclonal antibodies, chimeric, single chain, and humanized antibodies, as well as Fab fragments, including the products of an Fab or other immunoglobulin expression library.
"Isolated" means altered "by the hand of man" such that it is no longer in its natural state, i.e., if it originally occurred in nature, it has been changed or removed from its original environment, or both. For example, a polynucleotide or a polypeptide naturally present in a living organism is not "isolated," but the same polynucleotide or polypeptide separated from the coexisting materials of its natural state is "isolated", as the term is employed herein. Moreover, a polynucleotide or polypeptide that is artificially introduced into an organism by transformation, genetic manipulation or by any other recombinant method is considered "isolated" even when present in said organism.
"Polynucleotide" generally refers to any polyribonucleotide (RNA) or polydeoxyribonucleotide (DNA), which may be unmodified or modified RNA or DNA.
"Polynucleotides" include, without limitation, single- and double-stranded DNA, DNA that is a mixture of single- and double-stranded regions, single- and double-stranded RNA, and RNA that is a mixture of single- and double-stranded regions, hybrid molecules comprising DNA and RNA that may be single-stranded or, more typically, double-stranded or a mixture of single- and double-stranded regions. In addition, "polynucleotide" refers to triple-stranded regions comprising RNA or DNA or both RNA and DNA. The term "polynucleotide" also includes DNAs or RNAs containing one or more modified bases and DNAs or RNAs with backbones modified for stability or for other reasons. "Modified" bases include, for example, tritylated bases and unusual bases such as inosine. A variety of modifications may be made to DNA and RNA; thus, "polynucleotide" embraces chemically, enzymatically or metabolically modified forms of polynucleotides as typically found in nature, as well as the chemical forms of DNA and RNA characteristic of viruses and cells. "Polynucleotide" also embraces relatively short polynucleotides, often referred to as oligonucleotides.
"Polypeptide" refers to any polypeptide comprising two or more amino acids joined to each other by peptide bonds or modified peptide bonds, i.e., peptide isosteres. "Polypeptide" refers to both short chains, commonly referred to as peptides, oligopeptides or oligomers, and to longer chains, generally referred to as proteins. Polypeptides may contain amino acids other than the 20 gene-encoded amino acids. "Polypeptides" include amino acid sequences modified either by natural processes, such as post-translational processing, or by chemical modification techniques that are well known in the art. Such modifications are well described in basic texts and in more detailed monographs, as well as in a voluminous research literature. Modifications may occur anywhere in a polypeptide, including the peptide backbone, the amino acid side-chains and the amino or carboxyl termini. It will be appreciated that the same type of modification may be present to the same or varying degrees at several sites in a given polypeptide. Also, a given polypeptide may contain many types of modifications. Polypeptides may be branched as a result of ubiquitination, and they may be cyclic, with or without branching. Cyclic, branched and branched cyclic polypeptides may result from post-translation natural processes or may be made by synthetic methods.
Modifications include acetylation, acylation, ADP-ribosylation, amidation, biotinylation, covalent attachment of flavin, covalent attachment of a heme moiety, covalent attachment of a nucleotide or nucleotide derivative, covalent attachment of a lipid or lipid derivative, covalent attachment of phosphatidylinositol, cross-linking, cyclization, disulfide bond formation, demethylation, formation of covalent cross-links, formation of cystine, formation of pyroglutamate, formylation, gamma-carboxylation, glycosylation, GPI anchor formation, hydroxylation, iodination, methylation, myristoylation, oxidation, proteolytic processing, phosphorylation, prenylation, racemization, selenoylation, sulfation, transfer-RNA mediated addition of amino acids to proteins such as arginylation, and ubiquitination (see, for instance, Proteins - Structure and Molecular Properties, 2nd Ed., T. E. Creighton, W. H. Freeman and Company, New York, 1993; Wold, F., Post-translational Protein Modifications: Perspectives and
Prospects, 1-12, in Post-translational Covalent Modification of Proteins, B. C. Johnson, Ed., Academic Press, New York, 1983; Seifter et al., "Analysis for protein modifications and nonprotein cofactors", Meth Enzymol, 182, 626-646, 1990, and Rattan et al., "Protein Synthesis: Post-translational Modifications and Aging", Ann NY Acad Sci, 663, 48-62, 1992).
"Fragment" of a polypeptide sequence refers to a polypeptide sequence that is shorter than the reference sequence but that retains essentially the same biological function or activity as the reference polypeptide. "Fragment" of a polynucleotide sequence refers to a polynucleotide sequence that is shorter than the reference sequence of SEQ ID NO: 4.
"Variant" refers to a polynucleotide or polypeptide that differs from a reference polynucleotide or polypeptide, but retains the essential properties thereof. A typical variant of a polynucleotide differs in nucleotide sequence from the reference polynucleotide. Changes in the nucleotide sequence of the variant may or may not alter the amino acid sequence of a polypeptide encoded by the reference polynucleotide. Nucleotide changes may result in amino acid substitutions, additions, deletions, fusions and truncations in the polypeptide encoded by the reference sequence, as discussed below. A typical variant of a polypeptide differs in amino acid sequence from the reference polypeptide. Generally, alterations are limited so that the sequences of the reference polypeptide and the variant are closely similar overall and, in many regions, identical. A variant and reference polypeptide may differ in amino acid sequence by one or more substitutions, insertions or deletions in any combination. A substituted or inserted amino acid residue may or may not be one encoded by the genetic code. Typical conservative substitutions include Gly, Ala; Val, Ile, Leu; Asp, Glu; Asn, Gln; Ser, Thr; Lys, Arg; and Phe and Tyr. A variant of a polynucleotide or polypeptide may be naturally occurring such as an allele, or it may be a variant that is not known to occur naturally. Non-naturally occurring variants of polynucleotides and polypeptides may be made by mutagenesis techniques or by direct synthesis. Also included as variants are polypeptides having one or more post-translational modifications, for instance glycosylation, phosphorylation, methylation, ADP-ribosylation and the like. Embodiments include methylation of the N-terminal amino acid, phosphorylations of serines and threonines and modification of C-terminal glycines.
"Allele"refers to one of two or more alternative forms of a gene occurring at a given locus in the genome.
"Polymorphism"refers to a variation in nucleotide sequence (and encoded polypeptide sequence, if relevant) at a given position in the genome within a population.
"Single Nucleotide Polymorphism" (SNP) refers to the occurrence of nucleotide variability at a single nucleotide position in the genome, within a population. An SNP may occur within a gene or within intergenic regions of the genome. SNPs can be assayed using Allele Specific Amplification (ASA). For the process at least 3 primers are required. A common primer is used in reverse complement to the polymorphism being assayed. This common primer can be between 50 and 1500 bps from the polymorphic base. The other two (or more) primers are identical to each other except that the final 3'base wobbles to match one of the two (or more) alleles that make up the polymorphism. Two (or more) PCR reactions are then conducted on sample DNA, each using the common primer and one of the Allele Specific Primers.
"Splice Variant"as used herein refers to cDNA molecules produced from RNA molecules initially transcribed from the same genomic DNA sequence but which have undergone alternative RNA splicing. Alternative RNA splicing occurs when a primary RNA transcript undergoes splicing, generally for the removal of introns, which results in the production of more than one mRNA molecule each of that may encode different amino acid sequences. The term splice variant also refers to the proteins encoded by the above cDNA molecules.
"Identity"reflects a relationship between two or more polypeptide sequences or two or more polynucleotide sequences, determined by comparing the sequences. In general, identity refers to an exact nucleotide to nucleotide or amino acid to amino acid correspondence of
the two polynucleotide or two polypeptide sequences, respectively, over the length of the sequences being compared.
"Percent (%) Identity" - For sequences where there is not an exact correspondence, a "% identity" may be determined. In general, the two sequences to be compared are aligned to give a maximum correlation between the sequences. This may include inserting "gaps" in either one or both sequences, to enhance the degree of alignment. A % identity may be determined over the whole length of each of the sequences being compared (so-called global alignment), which is particularly suitable for sequences of the same or very similar length, or over shorter, defined lengths (so-called local alignment), which is more suitable for sequences of unequal length.
"Similarity" is a further, more sophisticated measure of the relationship between two polypeptide sequences. In general, "similarity" means a comparison between the amino acids of two polypeptide chains, on a residue by residue basis, taking into account not only exact correspondences between pairs of residues, one from each of the sequences being compared (as for identity) but also, where there is not an exact correspondence, whether, on an evolutionary basis, one residue is a likely substitute for the other. This likelihood has an associated "score" from which the "% similarity" of the two sequences can then be determined.
Methods for comparing the identity and similarity of two or more sequences are well known in the art. Thus for instance, programs available in the Wisconsin Sequence Analysis Package, version 9.1 (Devereux J et al., Nucleic Acids Res, 12, 387-395, 1984, available from Genetics Computer Group, Madison, Wisconsin, USA), for example the programs BESTFIT and GAP, may be used to determine the % identity between two polynucleotides and the % identity and the % similarity between two polypeptide sequences. BESTFIT uses the "local homology" algorithm of Smith and Waterman (J Mol Biol, 147, 195-197, 1981, Advances in Applied Mathematics, 2, 482-489, 1981) and finds the best single region of similarity between two sequences. BESTFIT is more suited to comparing two polynucleotide or two polypeptide sequences that are dissimilar in length, the program assuming that the shorter sequence represents a portion of the longer. In comparison, GAP aligns two sequences, finding a "maximum similarity", according to the algorithm of Needleman and Wunsch (J Mol Biol, 48, 443-453, 1970). GAP is more suited to comparing sequences that are approximately the same length and for which an alignment is expected over the entire length. Preferably, the parameters "Gap Weight" and "Length Weight" used in each program are 50 and 3 for polynucleotide sequences and 12 and 4 for polypeptide sequences, respectively. Preferably, % identities and similarities are determined when the two sequences being compared are optimally aligned.
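How a % identity follows from a global alignment can be shown with a compact sketch in the style of the Needleman and Wunsch algorithm. It is not the GAP program: it assumes a simple match/mismatch score and a single linear gap penalty (the "Gap Weight" and "Length Weight" parameters are not modelled), reports identity over the number of alignment columns, and uses a hypothetical function name.

```python
def global_identity(a: str, b: str, match: int = 1, mismatch: int = -1,
                    gap: int = -2) -> float:
    """% identity over one optimal global alignment of sequences a and b."""
    n, m = len(a), len(b)
    # score[i][j] is the best score for aligning a[:i] with b[:j].
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            pair = match if a[i - 1] == b[j - 1] else mismatch
            score[i][j] = max(score[i - 1][j - 1] + pair,
                              score[i - 1][j] + gap,
                              score[i][j - 1] + gap)
    # Trace back one optimal path, counting identical columns.
    i, j, identical, columns = n, m, 0, 0
    while i > 0 or j > 0:
        pair = match if i > 0 and j > 0 and a[i - 1] == b[j - 1] else mismatch
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + pair:
            identical += a[i - 1] == b[j - 1]
            i, j = i - 1, j - 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            i -= 1  # gap in b
        else:
            j -= 1  # gap in a
        columns += 1
    return 100.0 * identical / columns if columns else 0.0
```

Aligning `ACGT` with `AGT` introduces one gap and gives 3 identical positions over 4 alignment columns, i.e. 75 %.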
Other programs for determining identity and/or similarity between sequences are also known in the art, for instance the BLAST family of programs (Altschul S F et al., J Mol Biol, 215, 403-410, 1990; Altschul S F et al., Nucleic Acids Res., 25: 3389-3402, 1997, available from the National Center for Biotechnology Information (NCBI), Bethesda, Maryland, USA and accessible through the home page of the NCBI at www.ncbi.nlm.nih.gov) and FASTA (Pearson W R, Methods in Enzymology, 183, 63-99, 1990; Pearson W R and Lipman D J, Proc Nat Acad Sci USA, 85, 2444-2448, 1988, available as part of the Wisconsin Sequence Analysis Package).
Preferably, the BLOSUM62 amino acid substitution matrix (Henikoff S and Henikoff J G, Proc. Nat. Acad Sci. USA, 89, 10915-10919, 1992) is used in polypeptide sequence comparisons including where nucleotide sequences are first translated into amino acid sequences before comparison.
Preferably, the program BESTFIT is used to determine the % identity of a query polynucleotide or a polypeptide sequence with respect to a reference polynucleotide or a polypeptide sequence, the query and the reference sequence being optimally aligned and the parameters of the program set at the default value, as hereinbefore described.
"Identity Index" is a measure of sequence relatedness which may be used to compare a candidate sequence (polynucleotide or polypeptide) and a reference sequence. Thus, for instance, a candidate polynucleotide sequence having, for example, an Identity Index of 0.95
compared to a reference polynucleotide sequence is identical to the reference sequence except that the candidate polynucleotide sequence may include on average up to five differences per each 100 nucleotides of the reference sequence. Such differences are selected from the group consisting of at least one nucleotide deletion, substitution, including transition and transversion, or insertion. These differences may occur at the 5' or 3' terminal positions of the reference polynucleotide sequence or anywhere between these terminal positions, interspersed either individually among the nucleotides in the reference sequence or in one or more contiguous groups within the reference sequence. In other words, to obtain a polynucleotide sequence having an Identity Index of 0.95 compared to a reference polynucleotide sequence, an average of up to 5 in every 100 of the nucleotides in the reference sequence may be deleted, substituted or inserted, or any combination thereof, as hereinbefore described. The same applies mutatis mutandis for other values of the Identity Index, for instance 0.96, 0.97, 0.98 and 0.99.
Similarly, for a polypeptide, a candidate polypeptide sequence having, for example, an Identity Index of 0.95 compared to a reference polypeptide sequence is identical to the reference sequence except that the polypeptide sequence may include an average of up to five differences per each 100 amino acids of the reference sequence. Such differences are selected from the group consisting of at least one amino acid deletion, substitution, including conservative and non-conservative substitution, or insertion. These differences may occur at the amino- or carboxy-terminal positions of the reference polypeptide sequence or anywhere between these terminal positions, interspersed either individually among the amino acids in the reference sequence or in one or more contiguous groups within the reference sequence. In other words, to obtain a polypeptide sequence having an Identity Index of 0.95 compared to a reference polypeptide sequence, an average of up to 5 in every 100 of the amino acids in the reference sequence may be deleted, substituted or inserted, or any combination thereof, as hereinbefore described. The same applies mutatis mutandis for other values of the Identity Index, for instance 0.96, 0.97, 0.98 and 0.99.
The relationship between the number of nucleotide or amino acid differences and the Identity Index may be expressed in the following equation:
$n_a \leq x_a - (x_a \cdot I)$,
in which: $n_a$ is the number of nucleotide or amino acid differences, $x_a$ is the total number of nucleotides or amino acids in SEQ ID NO: 4 or SEQ ID NO: 5, respectively, $I$ is the Identity Index, $\cdot$ is the symbol for the multiplication operator, and in which any non-integer product of $x_a$ and $I$ is rounded down to the nearest integer prior to subtracting it from $x_a$.
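The rounding rule in this relationship is easy to get wrong, so a small worked sketch may help. The function name is hypothetical, and the use of exact fractions (to avoid binary floating-point artefacts when multiplying by values such as 0.95) is an implementation choice made here, not something stated in the specification.

```python
from fractions import Fraction


def max_differences(total_length: int, identity_index: float) -> int:
    """Largest number of differences allowed for a given Identity Index.

    Computes x_a minus the product x_a * I, where any non-integer product
    is rounded down to the nearest integer before the subtraction.
    """
    if not 0 <= identity_index <= 1:
        raise ValueError("Identity Index must lie between 0 and 1")
    # Convert via str() so that 0.95 becomes exactly 19/20.
    product = Fraction(str(identity_index)) * total_length
    rounded_down = product.numerator // product.denominator
    return total_length - rounded_down
```

For a reference sequence of 100 residues and an Identity Index of 0.95 the bound is 5 differences; for a length of 3408 the product 3237.6 is rounded down to 3237, giving a bound of 171.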
"Homolog"is a generic term used in the art to indicate a polynucleotide or polypeptide sequence possessing a high degree of sequence relatedness to a reference sequence.
Such relatedness may be quantified by determining the degree of identity and/or similarity between the two sequences as hereinbefore defined. Falling within this generic term are the terms "ortholog" and "paralog". "Ortholog" refers to a polynucleotide or polypeptide that is the functional equivalent of the polynucleotide or polypeptide in another species. "Paralog" refers to a polynucleotide or polypeptide within the same species that is functionally similar.
"Fusion protein" refers to a protein encoded by two, often unrelated, fused genes or fragments thereof. In one example, EP-A-0 464 533 discloses fusion proteins comprising various portions of constant region of immunoglobulin molecules together with another human protein or part thereof. In many cases, employing an immunoglobulin Fc region as a part of a fusion protein is advantageous for use in therapy and diagnosis resulting in, for example,
improved pharmacokinetic properties [see, e.g., EP-A-0 232 262]. On the other hand, for some uses it would be desirable to be able to delete the Fc part after the fusion protein has been expressed, detected and purified.
All publications and references, including but not limited to patents and patent applications, cited in this specification are herein incorporated by reference in their entirety as if each individual publication or reference were specifically and individually indicated to be incorporated by reference herein as being fully set forth. Any patent application to which this application claims priority is also incorporated by reference herein in its entirety in the manner described above for publications and references.
SEQUENCE LISTING
<110> Glaxo Group Limited
<120> Novel Protein
<130> PG4842
<140>
<141>
<160> 7
<170> PatentIn Ver. 2.1
<210> 1
<211> 3408
<212> DNA
<213> Homo sapiens
<220>
<221> CDS
<222> (1)..(3408)
<400> 1
atg agg cgg cag tgg ggc gcg ctg ctg ctt ggc gcc ctg ctc tgc gca  48
Met Arg Arg Gln Trp Gly Ala Leu Leu Leu Gly Ala Leu Leu Cys Ala
1               5                   10                  15
cac ggc ctg gcc agc agc ccc gag tgt gct tgt ggt cgg agc cac ttc  96
His Gly Leu Ala Ser Ser Pro Glu Cys Ala Cys Gly Arg Ser His Phe
            20                  25                  30
aca tgt gca gtg agt gct ctt gga gag tgt acc tgc atc cct gcc cag  144
Thr Cys Ala Val Ser Ala Leu Gly Glu Cys Thr Cys Ile Pro Ala Gln
        35                  40                  45
tgg cag tgt gat gga gac aat gac tgc ggg gac cac agc gat gag gat  192
Trp Gln Cys Asp Gly Asp Asn Asp Cys Gly Asp His Ser Asp Glu Asp
    50                  55                  60
gga tgt ata aaa aaa tgt tcc cct ctt gac ttt cac tgt gac aat ggc  240
Gly Cys Ile Lys Lys Cys Ser Pro Leu Asp Phe His Cys Asp Asn Gly
65                  70                  75                  80
aag tgc atc cgc cgc tcc tgg gtg tgt gac ggg gac aac gac tgt gag  288
Lys Cys Ile Arg Arg Ser Trp Val Cys Asp Gly Asp Asn Asp Cys Glu
                85                  90                  95
gat gac tcg gat gag cag gac tgt ccc ccc cgg gag tgt gag gag gac  336
Asp Asp Ser Asp Glu Gln Asp Cys Pro Pro Arg Glu Cys Glu Glu Asp
            100                 105                 110
gag ttt ccc tgc cag aat ggc tac tgc atc cgg agt ctg tgg cac tgc  384
Glu Phe Pro Cys Gln Asn Gly Tyr Cys Ile Arg Ser Leu Trp His Cys
        115                 120                 125
gat ggt gac aat gac tgt ggc gac aac age gat gag cag gcc tec gtt 432 Asp Gly Asp Asn Asp Cys Gly Asp Asn Ser Asp Glu Gln Ala Ser Val 130 135 140 etc cct gga gag let gca ctg act ctg ggg tgc ctt teg ggg get agg 480 Leu Pro Gly Glu Ser Ala Leu Thr Leu Gly Cys Leu Ser Gly Ala Arg 145 150 155 160 tea gac ccg cca aag cca gtg agg ttg gga gtc aga gag gag ggt gag 528 Ser Asp Pro Pro Lys Pro Val Arg Leu Gly Val Arg Glu Glu Gly Glu 165 170 175 agg gtt gcc tgt ggg gcc ccc tea gaa ctg ctg tec cat cgc ate ccc 576 Arg Val Ala Cys Gly Ala Pro Ser Glu Leu Leu Ser His Arg Ile Pro 180 185 190 cca gac atg cgc aag tgc tec gac aag gag ttc cgc tgt agt gac gga 624 Pro Asp Met Arg Lys Cys Ser Asp Lys Glu Phe Arg Cys Ser Asp Gly 195 200 205 age tgc att get gag cat tgg tac tgc gac ggt gac ace gac tgc aaa 672 Ser Cys lie Ala Glu His Trp Tyr Cys Asp Gly Asp Thr Asp Cys Lys 210 215 220 gat ggc tec gat gag gag aac tgt ctg cca gcg ccc ccc tgc aac ctg 720 Asp Gly Ser Asp Glu Glu Asn Cys Leu Pro Ala Pro Pro Cys Asn Leu 225 230 235 240 gag gag ttc cag tgt gcc tat gga cgc tgc ate etc gac ate tac cac 768 Glu Glu Phe Gln Cys Ala Tyr Gly Arg Cys lie Leu Asp lIe Tyr His 245 250 255 tgc gat ggc gac gat gac tgt gga gac tgg tea gac gag tct gac tgc 816 Cys Asp Gly Asp Asp Asp Cys Gly Asp Trp Ser Asp Glu Ser Asp Cys 260 265 270 tgt gag tac tct ggc cag ctg gga gcc tec cac cag ccc tgc cgc tct 864 Cys Glu Tyr Ser Gly Gln Leu Gly Ala Ser His Gln Pro Cys Arg Ser 275 280 285 ggg gag ttc atg tgt gac agt ggc ctg tgc ate aat gca ggc tgg cgc 912 Gly Glu Phe Met Cys Asp Ser Gly Leu Cys lie Asn Ala Gly Trp Arg 290 295 300 tgc gat ggt gac gcg gac tgt gat gac cag tct gat gag cgc aac tgc 960 Cys Asp Gly Asp Ala Asp Cys Asp Asp Gln Ser Asp Glu Arg Asn Cys 305 310 315 320 ace ace tec atg tgt acg gca gaa cag ttc cgc tgt cac tea ggc cgc 1008 Thr Thr Ser Met Cys Thr Ala Glu Gln Phe Arg Cys His Ser Gly Arg 325 330 335 tgt gtc cgc ctg tec tgg cgc tgt gat ggg gag gac gac tgt gca gac 1056 Cys Val Arg Leu Ser Trp Arg Cys Asp Gly Glu 
Asp Asp Cys Ala Asp 340 345 350 aac agc gat gaa gag aac tgt gag aat aca gga agc ccc caa tgt gcc 1104
Asn Ser Asp Glu Glu Asn Cys Glu Asn Thr Gly Ser Pro Gln Cys Ala 355 360 365 ttg gac cag ttc ctg tgt tgg aat ggg cgc tgc att ggg cag agg aag 1152 Leu Asp Gln Phe Leu Cys Trp Asn Gly Arg Cys He Gly Gln Arg Lys 370 375 380 ctg tgc aac ggg gtc aac gac tgt ggt gac aac age gac gaa age cca 1200 Leu Cys Asn Gly Val Asn Asp Cys Gly Asp Asn Ser Asp Glu Ser Pro 385 390 395 400 cag cag aat tgc egg ccc egg acg ggt gag gag aac tgc aat gtt aac 1248 Gln Gln Asn Cys Arg Pro Arg Thr Gly Glu Glu Asn Cys Asn Val Asn 405 410 415 aac ggt ggc tgt gcc cag aag tgc cag atg gtg egg ggg gca gtg cag 1296 Asn Gly Gly Cys Ala Gln Lys Cys Gln Met Val Arg Gly Ala Val Gln 420 425 430 tgt ace tgc cac aca ggc tac egg etc aca gag gat ggg cac acg tgc 1344 Cys Thr Cys His Thr Gly Tyr Arg Leu Thr Glu Asp Gly His Thr Cys 435 440 445 caa gat gtg aat gaa tgt gcc gag gag ggg tat tgc age cag ggc tgc 1392 Gln Asp Val Asn Glu Cys Ala Glu Glu Gly Tyr Cys Ser Gin Gly Cys 450 455 460 ace aac age gaa ggg get ttc caa tgc tgg tgt gaa aca ggc tat gaa 1440 Thr Asn Ser Glu Gly Ala Phe Gln Cys Trp Cys Glu Thr Gly Tyr Glu 465 470 475 480 cta egg ccc gac egg cgc age tgc aag get ctg ggg cca gag cct gtg 1488 Leu Arg Pro Asp Arg Arg Ser Cys Lys Ala Leu Gly Pro Glu Pro Val 485 490 495 ctg ctg ttc gcc aat cgc ate gac ate egg cag gtg ctg cca cac cgc 1536 Leu Leu Phe Ala Asn Arg He Asp Ile Arg Gln Val Leu Pro His Arg 500 505 510 let gag tac aca ctg ctg ctt aac aac ctg gag aat gcc att gcc ctt 1584 Ser Glu Tyr Thr Leu Leu Leu Asn Asn Leu Glu Asn Ala He Ala Leu 515 520 525 gat ttc cac cac cgc cgc gag ctt gtc ttc tgg tea gat gtc ace ctg 1632 Asp Phe His His Arg Arg Glu Leu Val Phe Trp Ser Asp Val Thr Leu 530 535 540 gac egg ate etc cgt gcc aac etc aac ggc age aac gtg gag gag gtt 1680 Asp Arg lIe Leu Arg Ala Asn Leu Asn Gly Ser Asn Val Glu Glu Val 545 550 555 560 gtg let act ggg ctg gag age cca ggg ggc ctg get gtg gat tgg gtc 1728 Val Ser Thr Gly Leu Glu Ser Pro Gly Gly Leu Ala Val Asp Trp Val 565 570 575 cat gac aaa etc tac tgg ace 
gac tca ggc acc tcg agg att gag gtg 1776 His Asp Lys Leu Tyr Trp Thr Asp Ser Gly Thr Ser Arg Ile Glu Val
580 585 590 gcc aat ctg gat ggg gcc cac egg aaa gtg ttg ctg tgg cag aac ctg 1824 Ala Asn Leu Asp Gly Ala His Arg Lys Val Leu Leu Trp Gln Asn Leu 595 600 605 gag aag ccc egg gcc att gcc ttg cat ccc atg gag ggt ace att tac 1872 Glu Lys Pro Arg Ala He Ala Leu His Pro Met Glu Gly Thr He Tyr 610 615 620 tgg aca gac tgg ggc aac ace ccc cgt att gag gcc tec age atg gat 1920 Trp Thr Asp Trp Gly Asn Thr Pro Arg He Glu Ala Ser Ser Met Asp 625 630 635 640 ggc let gga cgc cgc ate att gcc gat ace cat etc ttc tgg ccc aat 1968 Gly Ser Gly Arg Arg He He Ala Asp Thr His Leu Phe Trp Pro Asn 645 650 655 ggc etc ace ate gac tat gcc ggg cgc cgt atg tac tgg gtg gat get 2016 Gly Leu Thr He Asp Tyr Ala Gly Arg Arg Met Tyr Trp Val Asp Ala 660 665 670 aag cac cat gtc ate gag agg gcc aat ctg gat ggg agt cac cgt aag 2064 Lys His His Val He Glu Arg Ala Asn Leu Asp Gly Ser His Arg Lys 675 680 685 get gtc att age cag ggc etc ccg cat ccc ttc gcc ate aca gtg ttt 2112 Ala Val He Ser Gln Gly Leu Pro His Pro Phe Ala He Thr Val Phe 690 695 700 gaa gac age ctg tac tgg aca gac tgg cac ace aag age ate aat age 2160 Glu Asp Ser Leu Tyr Trp Thr Asp Trp His Thr Lys Ser He Asn Ser 705 710 715 720 get aac aaa ttt acg ggg aag aac cag gaa ate att cgc aac aaa etc 2208 Ala Asn Lys Phe Thr Gly Lys Asn Gln Glu Ile He Arg Asn Lys Leu 725 730 735 cac ttc cct atg gac ate cac ace ttg cac ccc cag cgc caa cct gca 2256 His Phe Pro Met Asp He His Thr Leu His Pro Gln Arg Gln Pro Ala 740 745 750 ggg aaa aac cgc tgt ggg gac aac aac gga ggc tgc acg cac ctg tgt 2304 Gly Lys Asn Arg Cys Gly Asp Asn Asn Gly Gly Cys Thr His Leu Cys 755 760 765 ctg ccc agt ggc cag aac tac ace tgt gcc tgc ccc act ggc ttc cgc 2352 Leu Pro Ser Gly Gln Asn Tyr Thr Cys Ala Cys Pro Thr Gly Phe Arg 770 775 780 aag ate age age cac gcc tgt gcc cag agt ctt gac aag ttc ctg ctt 2400 Lys He Ser Ser His Ala Cys Ala Gln Ser Leu Asp Lys Phe Leu Leu 785 790 795 800 ttt gcc cga agg atg gac ate cgt cga ate age ttt gac aca gag gac 2448 Phe Ala Arg Arg Met Asp He Arg Arg 
Ile Ser Phe Asp Thr Glu Asp 805 810 815
ctg let gat gat gtc ate cca ctg get gac gtg cgc agt get gtg gcc 2496 Leu Ser Asp Asp Val lIe Pro Leu Ala Asp Val Arg Ser Ala Val Ala 820 825 830 ctt gac tgg gac tec egg gat gac cac gtg tac tgg aca gat gtc age 2544 Leu Asp Trp Asp Ser Arg Asp Asp His Val Tyr Trp Thr Asp Val Ser 835 840 845 act gat ace ate age agg gcc aag tgg gat gga aca gga cag gag gtg 2592 Thr Asp Thr He Ser Arg Ala Lys Trp Asp Gly Thr Gly Gln Glu Val 850 855 860 gta gtg gat ace agt ttg gag age cca get ggc ctg gcc att gat tgg 2640 Val Val Asp Thr Ser Leu Glu Ser Pro Ala Gly Leu Ala He Asp Trp 865 870 875 880 gtc ace aac aaa ctg tac tgg aca gat gca ggt aca gac egg att gaa 2688 Val Thr Asn Lys Leu Tyr Trp Thr Asp Ala Gly Thr Asp Arg lie Glu 885 890 895 gta gee aac aca gat ggc age atg aga aca gta etc ate tgg gag aac 2736 Val Ala Asn Thr Asp Gly Ser Met Arg Thr Val Leu He Trp Glu Asn 900 905 910 ctt gat cgt cct egg gac ate gtg gtg gaa ccc atg ggc ggg tac atg 2784 Leu Asp Arg Pro Arg Asp He Val Val Glu Pro Met Gly Gly Tyr Met 915 920 925 tat tgg act gac tgg ggt gcg age ccc aag att gaa cga get ggc atg 2832 Tyr Trp Thr Asp Trp Gly Ala Ser Pro Lys He Glu Arg Ala Gly Met 930 935 940 gat gcc tea ggc cgc caa gtc att ate let tct aat ctg ace tgg cct 2880 Asp Ala Ser Gly Arg Gln Val Ile lIe Ser Ser Asn Leu Thr Trp Pro 945 950 955 960 aat ggg tta get att gat tat ggg tec cag cgt cta tac tgg get gac 2928 Asn Gly Leu Ala He Asp Tyr Gly Ser Gln Arg Leu Tyr Trp Ala Asp 965 970 975 gcc ggc atg aag aca att gaa ttt get gga ctg gat ggc agt aag agg 2976 Ala Gly Met Lys Thr He Glu Phe Ala Gly Leu Asp Gly Ser Lys Arg 980 985 990 aag gtg ctg att gga age cag etc ccc cac cca ttt ggg ctg ace etc 3024 Lys Val Leu He Gly Ser Gln Leu Pro His Pro Phe Gly Leu Thr Leu 995 1000 1005 tat gga gag cgc ate tat tgg act gac tgg cag ace aag age ata cag 3072 Tyr Gly Glu Arg He Tyr Trp Thr Asp Trp Gln Thr Lys Ser He Gln 1010 1015 1020 age get gac egg ctg aca ggg ctg gac egg gag act ctg cag gag aac 3120 Ser Ala Asp Arg Leu Thr Gly Leu Asp Arg 
Glu Thr Leu Gln Glu Asn 1025 1030 1035 1040
ctg gaa aac cta atg gac atc cat gtc ttc cac cgc cgc cgg ccc cca 3168 Leu Glu Asn Leu Met Asp Ile His Val Phe His Arg Arg Arg Pro Pro 1045 1050 1055 gtg tct aca cca tgt gct atg gag aat ggc ggc tgt agc cac ctg tgt 3216 Val Ser Thr Pro Cys Ala Met Glu Asn Gly Gly Cys Ser His Leu Cys 1060 1065 1070 ctt agg tcc cca aat cca agc gga ttc agc tgt acc tgc ccc aca ggc 3264 Leu Arg Ser Pro Asn Pro Ser Gly Phe Ser Cys Thr Cys Pro Thr Gly 1075 1080 1085 atc aac ctg ctg tct gat ggc aag acc tgc tca cca ggc atg aac agt 3312 Ile Asn Leu Leu Ser Asp Gly Lys Thr Cys Ser Pro Gly Met Asn Ser 1090 1095 1100 ttc ctc atc ttc gcc agg agg ata gac att cgc atg gtc tcc ctg gac 3360 Phe Leu Ile Phe Ala Arg Arg Ile Asp Ile Arg Met Val Ser Leu Asp 1105 1110 1115 1120 atc cct tat ttt gct gat gtg gtg gta cca atc aac att acc atg aag 3408 Ile Pro Tyr Phe Ala Asp Val Val Val Pro Ile Asn Ile Thr Met Lys 1125 1130 1135 < 210 > 2 < 211 > 1136 < 212 > PRT < 213 > Homo sapiens < 400 > 2 Met Arg Arg Gln Trp Gly Ala Leu Leu Leu Gly Ala Leu Leu Cys Ala 1 5 10 15 His Gly Leu Ala Ser Ser Pro Glu Cys Ala Cys Gly Arg Ser His Phe 20 25 30 Thr Cys Ala Val Ser Ala Leu Gly Glu Cys Thr Cys Ile Pro Ala Gln 35 40 45 Trp Gln Cys Asp Gly Asp Asn Asp Cys Gly Asp His Ser Asp Glu Asp 50 55 60 Gly Cys Ile Lys Lys Cys Ser Pro Leu Asp Phe His Cys Asp Asn Gly 65 70 75 80 Lys Cys Ile Arg Arg Ser Trp Val Cys Asp Gly Asp Asn Asp Cys Glu 85 90 95 Asp Asp Ser Asp Glu Gln Asp Cys Pro Pro Arg Glu Cys Glu Glu Asp 100 105 110 Glu Phe Pro Cys Gln Asn Gly Tyr Cys Ile Arg Ser Leu Trp His Cys 115 120 125 Asp Gly Asp Asn Asp Cys Gly Asp Asn Ser Asp Glu Gln Ala Ser Val
130 135 140 Leu Pro Gly Glu Ser Ala Leu Thr Leu Gly Cys Leu Ser Gly Ala Arg 145 150 155 160 Ser Asp Pro Pro Lys Pro Val Arg Leu Gly Val Arg Glu Glu Gly Glu 165 170 175 Arg Val Ala Cys Gly Ala Pro Ser Glu Leu Leu Ser His Arg Ile Pro 180 185 190 Pro Asp Met Arg Lys Cys Ser Asp Lys Glu Phe Arg Cys Ser Asp Gly 195 200 205 Ser Cys Ile Ala Glu His Trp Tyr Cys Asp Gly Asp Thr Asp Cys Lys 210 215 220 Asp Gly Ser Asp Glu Glu Asn Cys Leu Pro Ala Pro Pro Cys Asn Leu 225 230 235 240 Glu Glu Phe Gln Cys Ala Tyr Gly Arg Cys Ile Leu Asp Ile Tyr His 245 250 255 Cys Asp Gly Asp Asp Asp Cys Gly Asp Trp Ser Asp Glu Ser Asp Cys 260 265 270 Cys Glu Tyr Ser Gly Gln Leu Gly Ala Ser His Gln Pro Cys Arg Ser 275 280 285 Gly Glu Phe Met Cys Asp Ser Gly Leu Cys Ile Asn Ala Gly Trp Arg 290 295 300 Cys Asp Gly Asp Ala Asp Cys Asp Asp Gln Ser Asp Glu Arg Asn Cys 305 310 315 320 Thr Thr Ser Met Cys Thr Ala Glu Gln Phe Arg Cys His Ser Gly Arg 325 330 335 Cys Val Arg Leu Ser Trp Arg Cys Asp Gly Glu Asp Asp Cys Ala Asp 340 345 350 Asn Ser Asp Glu Glu Asn Cys Glu Asn Thr Gly Ser Pro Gln Cys Ala 355 360 365 Leu Asp Gln Phe Leu Cys Trp Asn Gly Arg Cys Ile Gly Gln Arg Lys 370 375 380 Leu Cys Asn Gly Val Asn Asp Cys Gly Asp Asn Ser Asp Glu Ser Pro 385 390 395 400 Gln Gln Asn Cys Arg Pro Arg Thr Gly Glu Glu Asn Cys Asn Val Asn 405 410 415 Asn Gly Gly Cys Ala Gln Lys Cys Gln Met Val Arg Gly Ala Val Gln 420 425 430 Cys Thr Cys His Thr Gly Tyr Arg Leu Thr Glu Asp Gly His Thr Cys
435 440 445 Gln Asp Val Asn Glu Cys Ala Glu Glu Gly Tyr Cys Ser Gln Gly Cys 450 455 460 Thr Asn Ser Glu Gly Ala Phe Gln Cys Trp Cys Glu Thr Gly Tyr Glu 465 470 475 480 Leu Arg Pro Asp Arg Arg Ser Cys Lys Ala Leu Gly Pro Glu Pro Val 485 490 495 Leu Leu Phe Ala Asn Arg Ile Asp Ile Arg Gln Val Leu Pro His Arg 500 505 510 Ser Glu Tyr Thr Leu Leu Leu Asn Asn Leu Glu Asn Ala Ile Ala Leu 515 520 525 Asp Phe His His Arg Arg Glu Leu Val Phe Trp Ser Asp Val Thr Leu 530 535 540 Asp Arg Ile Leu Arg Ala Asn Leu Asn Gly Ser Asn Val Glu Glu Val 545 550 555 560 Val Ser Thr Gly Leu Glu Ser Pro Gly Gly Leu Ala Val Asp Trp Val 565 570 575 His Asp Lys Leu Tyr Trp Thr Asp Ser Gly Thr Ser Arg Ile Glu Val 580 585 590 Ala Asn Leu Asp Gly Ala His Arg Lys Val Leu Leu Trp Gln Asn Leu 595 600 605 Glu Lys Pro Arg Ala Ile Ala Leu His Pro Met Glu Gly Thr Ile Tyr 610 615 620 Trp Thr Asp Trp Gly Asn Thr Pro Arg Ile Glu Ala Ser Ser Met Asp 625 630 635 640 Gly Ser Gly Arg Arg Ile Ile Ala Asp Thr His Leu Phe Trp Pro Asn 645 650 655 Gly Leu Thr Ile Asp Tyr Ala Gly Arg Arg Met Tyr Trp Val Asp Ala 660 665 670 Lys His His Val Ile Glu Arg Ala Asn Leu Asp Gly Ser His Arg Lys 675 680 685 Ala Val Ile Ser Gln Gly Leu Pro His Pro Phe Ala Ile Thr Val Phe 690 695 700 Glu Asp Ser Leu Tyr Trp Thr Asp Trp His Thr Lys Ser Ile Asn Ser 705 710 715 720 Ala Asn Lys Phe Thr Gly Lys Asn Gln Glu Ile Ile Arg Asn Lys Leu 725 730 735 His Phe Pro Met Asp Ile His Thr Leu His Pro Gln Arg Gln Pro Ala
740 745 750 Gly Lys Asn Arg Cys Gly Asp Asn Asn Gly Gly Cys Thr His Leu Cys 755 760 765 Leu Pro Ser Gly Gln Asn Tyr Thr Cys Ala Cys Pro Thr Gly Phe Arg 770 775 780 Lys Ile Ser Ser His Ala Cys Ala Gln Ser Leu Asp Lys Phe Leu Leu 785 790 795 800 Phe Ala Arg Arg Met Asp Ile Arg Arg Ile Ser Phe Asp Thr Glu Asp 805 810 815 Leu Ser Asp Asp Val Ile Pro Leu Ala Asp Val Arg Ser Ala Val Ala 820 825 830 Leu Asp Trp Asp Ser Arg Asp Asp His Val Tyr Trp Thr Asp Val Ser 835 840 845 Thr Asp Thr Ile Ser Arg Ala Lys Trp Asp Gly Thr Gly Gln Glu Val 850 855 860 Val Val Asp Thr Ser Leu Glu Ser Pro Ala Gly Leu Ala Ile Asp Trp 865 870 875 880 Val Thr Asn Lys Leu Tyr Trp Thr Asp Ala Gly Thr Asp Arg Ile Glu 885 890 895 Val Ala Asn Thr Asp Gly Ser Met Arg Thr Val Leu Ile Trp Glu Asn 900 905 910 Leu Asp Arg Pro Arg Asp Ile Val Val Glu Pro Met Gly Gly Tyr Met 915 920 925 Tyr Trp Thr Asp Trp Gly Ala Ser Pro Lys Ile Glu Arg Ala Gly Met 930 935 940 Asp Ala Ser Gly Arg Gln Val Ile Ile Ser Ser Asn Leu Thr Trp Pro 945 950 955 960 Asn Gly Leu Ala Ile Asp Tyr Gly Ser Gln Arg Leu Tyr Trp Ala Asp 965 970 975 Ala Gly Met Lys Thr Ile Glu Phe Ala Gly Leu Asp Gly Ser Lys Arg 980 985 990 Lys Val Leu Ile Gly Ser Gln Leu Pro His Pro Phe Gly Leu Thr Leu 995 1000 1005 Tyr Gly Glu Arg Ile Tyr Trp Thr Asp Trp Gln Thr Lys Ser Ile Gln 1010 1015 1020 Ser Ala Asp Arg Leu Thr Gly Leu Asp Arg Glu Thr Leu Gln Glu Asn 1025 1030 1035 1040 Leu Glu Asn Leu Met Asp Ile His Val Phe His Arg Arg Arg Pro Pro
1045 1050 1055 Val Ser Thr Pro Cys Ala Met Glu Asn Gly Gly Cys Ser His Leu Cys 1060 1065 1070 Leu Arg Ser Pro Asn Pro Ser Gly Phe Ser Cys Thr Cys Pro Thr Gly 1075 1080 1085 Ile Asn Leu Leu Ser Asp Gly Lys Thr Cys Ser Pro Gly Met Asn Ser 1090 1095 1100 Phe Leu Ile Phe Ala Arg Arg Ile Asp Ile Arg Met Val Ser Leu Asp 1105 1110 1115 1120 Ile Pro Tyr Phe Ala Asp Val Val Val Pro Ile Asn Ile Thr Met Lys 1125 1130 1135 < 210 > 3 < 211 > 3408 < 212 > DNA < 213 > Homo sapiens < 400 > 3 cttcatggta atgttgattg gtaccaccac atcagcaaaa taagggatgt ccagggagac 60
catgcgaatg tctatcctcc tggcgaagat gaggaaactg ttcatgcctg gtgagcaggt 120 cttgccatca gacagcaggt tgatgcctgt ggggcaggta cagctgaatc cgcttggatt 180 tggggaccta agacacaggt ggctacagcc gccattctcc atagcacatg gtgtagacac 240 tgggggccgg cggcggtgga agacatggat gtccattagg ttttccaggt tctcctgcag 300 agtctcccgg tccagccctg tcagccggtc agcgctctgt atgctcttgg tctgccagtc 360 agtccaatag atgcgctctc catagagggt cagcccaaat gggtggggga gctggcttcc 420
aatcagcacc ttcctcttac tgccatccag tccagcaaat tcaattgtct tcatgccggc 480 gtcagcccag tatagacgct gggacccata atcaatagct aacccattag gccaggtcag 540 attagaagag ataatgactt ggcggcctga ggcatccatg ccagctcgtt caatcttggg 600 gctcgcaccc cagtcagtcc aatacatgta cccgcccatg ggttccacca cgatgtcccg 660 aggacgatca aggttctccc agatgagtac tgttctcatg ctgccatctg tgttggctac 720 ttcaatccgg tctgtacctg catctgtcca gtacagtttg ttggtgaccc aatcaatggc 780 caggccagct gggctctcca aactggtatc cactaccacc tcctgtcctg ttccatccca 840 cttggccctg ctgatggtat cagtgctgac atctgtccag tacacgtggt catcccggga 900 gtcccagtca agggccacag cactgcgcac gtcagccagt gggatgacat catcagacag 960 gtcctctgtg tcaaagctga ttcgacggat gtccatcctt cgggcaaaaa gcaggaactt 1020 gtcaagactc tgggcacagg cgtggctgct gatcttgcgg aagccagtgg ggcaggcaca 1080 ggtgtagttc tggccactgg gcagacacag gtgcgtgcag cctccgttgt tgtccccaca 1140 gcggtttttc cctgcaggtt ggcgctgggg gtgcaaggtg tggatgtcca tagggaagtg 1200 gagtttgttg cgaatgattt cctggttctt ccccgtaaat ttgttagcgc tattgatgct 1260 cttggtgtgc cagtctgtcc agtacaggct gtcttcaaac actgtgatgg cgaagggatg 1320 cgggaggccc tggctaatga cagccttacg gtgactccca tccagattgg ccctctcgat 1380
gacatggtgc ttagcatcca cccagtacat acggcgcccg gcatagtcga tggtgaggcc 1440 attgggccag aagagatggg tatcggcaat gatgcggcgt ccagagccat ccatgctgga 1500 ggcctcaata cggggggtgt tgccccagtc tgtccagtaa atggtaccct ccatgggatg 1560 caaggcaatg gcccggggct tctccaggtt ctgccacagc aacactttcc ggtgggcccc 1620 atccagattg gccacctcaa tcctcgaggt gcctgagtcg gtccagtaga gtttgtcatg 1680 gacccaatcc acagccaggc cccctgggct ctccagccca gtagacacaa cctcctccac 1740 gttgctgccg ttgaggttgg cacggaggat ccggtccagg gtgacatctg accagaagac 1800
aagctcgcgg cggtggtgga aatcaagggc aatggcattc tccaggttgt taagcagcag 1860 tgtgtactca gagcggtgtg gcagcacctg ccggatgtcg atgcgattgg cgaacagcag 1920 cacaggctct ggccccagag ccttgcagct gcgccggtcg ggccgtagtt catagcctgt 1980 ttcacaccag cattggaaag ccccttcgct gttggtgcag ccctggctgc aatacccctc 2040 ctcggcacat tcattcacat cttggcacgt gtgcccatcc tctgtgagcc ggtagcctgt 2100 gtggcaggta cactgcactg ccccccgcac catctggcac ttctgggcac agccaccgtt 2160 gttaacattg cagttctcct cacccgtccg gggccggcaa ttctgctgtg ggctttcgtc 2220 gctgttgtca ccacagtcgt tgaccccgtt gcacagcttc ctctgcccaa tgcagcgccc 2280 attccaacac aggaactggt ccaaggcaca ttgggggctt cctgtattct cacagttctc 2340 ttcatcgctg ttgtctgcac agtcgtcctc cccatcacag cgccaggaca ggcggacaca 2400 gcggcctgag tgacagcgga actgttctgc cgtacacatg gaggtggtgc agttgcgctc 2460 atcagactgg tcatcacagt ccgcgtcacc atcgcagcgc cagcctgcat tgatgcacag 2520 gccactgtca cacatgaact ccccagagcg gcagggctgg tgggaggctc ccagctggcc 2580 agagtactca cagcagtcag actcgtctga ccagtctcca cagtcatcgt cgccatcgca 2640 gtggtagatg tcgaggatgc agcgtccata ggcacactgg aactcctcca ggttgcaggg 2700 gggcgctggc agacagttct cctcatcgga gccatctttg cagtcggtgt caccgtcgca 2760 gtaccaatgc tcagcaatgc agcttccgtc actacagcgg aactccttgt cggagcactt 2820 gcgcatgtct ggggggatgc gatgggacag cagttctgag ggggccccac aggcaaccct 2880 ctcaccctcc tctctgactc ccaacctcac tggctttggc gggtctgacc tagcccccga 2940 aaggcacccc agagtcagtg cagactctcc agggagaacg gaggcctgct catcgctgtt 3000 gtcgccacag tcattgtcac catcgcagtg ccacagactc cggatgcagt agccattctg 3060 gcagggaaac tcgtcctcct cacactcccg ggggggacag tcctgctcat ccgagtcatc 3120 ctcacagtcg ttgtccccgt cacacaccca ggagcggcgg atgcacttgc cattgtcaca 3180 gtgaaagtca agaggggaac atttttttat acatccatcc tcatcgctgt ggtccccgca 3240 gtcattgtct ccatcacact gccactgggc agggatgcag gtacactctc caagagcact 3300 cactgcacat gtgaagtggc tccgaccaca agcacactcg gggctgctgg ccaggccgtg 3360 tgcgcagagc agggcgccaa gcagcagcgc gccccactgc cgcctcat 3408 < 210 > 4 < 211 > 5715 < 212 > DNA < 213 > Homo sapiens < 220 > < 221 > CDS < 222 > (1).. 
(5715) < 400 > 4 atg agg cgg cag tgg ggc gcg ctg ctg ctt ggc gcc ctg ctc tgc gca 48 Met Arg Arg Gln Trp Gly Ala Leu Leu Leu Gly Ala Leu Leu Cys Ala 1 5 10 15 cac ggc ctg gcc agc agc ccc gag tgt gct tgt ggt cgg agc cac ttc 96 His Gly Leu Ala Ser Ser Pro Glu Cys Ala Cys Gly Arg Ser His Phe 20 25 30 aca tgt gca gtg agt gct ctt gga gag tgt acc tgc atc cct gcc cag 144 Thr Cys Ala Val Ser Ala Leu Gly Glu Cys Thr Cys Ile Pro Ala Gln 35 40 45 tgg cag tgt gat gga gac aat gac tgc ggg gac cac agc gat gag gat 192 Trp Gln Cys Asp Gly Asp Asn Asp Cys Gly Asp His Ser Asp Glu Asp 50 55 60 gga tgt ata cta cct acc tgt tcc cct ctt gac ttt cac tgt gac aat 240 Gly Cys Ile Leu Pro Thr Cys Ser Pro Leu Asp Phe His Cys Asp Asn
65 70 75 80 ggc aag tgc ate cgc cgc tec tgg gtg tgt gac ggg gac aac gac tgt 288 Gly Lys Cys lie Arg Arg Ser Trp Val Cys Asp Gly Asp Asn Asp Cys 85 90 95 gag gat gac teg gat gag cag gac tgt ccc ccc egg gag tgt gag gag 336 Glu Asp Asp Ser Asp Glu Gln Asp Cys Pro Pro Arg Glu Cys Glu Glu 100 105 110 gac gag ttt ccc tgc cag aat ggc tac tgc ate egg agt ctg tgg cac 384 Asp Glu Phe Pro Cys Gln Asn Gly Tyr Cys He Arg Ser Leu Trp His 115 120 125 tgc gat ggt gac aat gac tgt ggc gac aac age gat gag cag tgt gac 432 Cys Asp Gly Asp Asn Asp Cys Gly Asp Asn Ser Asp Glu Gln Cys Asp 130 135 140 atg cgc aag tgc tec gac aag gag ttc cgc tgt agt gac gga age tgc 480 Met Arg Lys Cys Ser Asp Lys Glu Phe Arg Cys Ser Asp Gly Ser Cys 145 150 155 160 att get gag cat tgg tac tgc gac ggt gac ace gac tgc aaa gat ggc 528 Ile Ala Glu His Trp Tyr Cys Asp Gly Asp Thr Asp Cys Lys Asp Gly 165 170 175 tec gat gag gag aac tgt ccc tea gca gtg cca gcg ccc ccc tgc aac 576 Ser Asp Glu Glu Asn Cys Pro Ser Ala Val Pro Ala Pro Pro Cys Asn 180 185 190 ctg gag gag ttc cag tgt gcc tat gga cgc tgc ate etc gac ate tac 624 Leu Glu Glu Phe Gln Cys Ala Tyr Gly Arg Cys lie Leu Asp Ile Tyr 195 200 205 cac tgc gat ggc gac gat gac tgt gga gac tgg tea gac gag let gac 672 His Cys Asp Gly Asp Asp Asp Cys Gly Asp Trp Ser Asp Glu Ser Asp 210 215 220 tgc tec tec cac cag ccc tgc cgc tct ggg gag ttc atg tgt gac agt 720 Cys Ser Ser His Gin Pro Cys Arg Ser Gly Glu Phe Met Cys Asp Ser 225 230 235 240 ggc ctg tgc ate aat gca ggc tgg cgc tgc gat ggt gac gcg gac tgt 768 Gly Leu Cys He Asn Ala Gly Trp Arg Cys Asp Gly Asp Ala Asp Cys 245 250 255 gat gac cag tct gat gag cgc aac tgc ace ace tec atg tgt acg gca 816 Asp Asp Gln Ser Asp Glu Arg Asn Cys Thr Thr Ser Met Cys Thr Ala 260 265 270 gaa cag ttc cgc tgt cac tea ggc cgc tgt gtc cgc ctg tec tgg cgc 864 Glu Gln Phe Arg Cys His Ser Gly Arg Cys Val Arg Leu Ser Trp Arg 275 280 285 tgt gat ggg gag gac gac tgt gca gac aac age gat gaa gag aac tgt 912 Cys Asp Gly Glu Asp Asp Cys Ala Asp Asn Ser 
Asp Glu Glu Asn Cys 290 295 300
gag aat aca gga age ccc caa tgt gcc ttg gac cag ttc ctg tgt tgg 960 Glu Asn Thr Gly Ser Pro Gln Cys Ala Leu Asp Gln Phe Leu Cys Trp 305 310 315 320 aat ggg cgc tgc att ggg cag agg aag ctg tgc aac ggg gtc aac gac 1008 Asn Gly Arg Cys lIe Gly Gln Arg Lys Leu Cys Asn Gly Val Asn Asp 325 330 335 tgt ggt gac aac age gac gaa age cca cag cag aat tgc egg ccc egg 1056 Cys Gly Asp Asn Ser Asp Glu Ser Pro Gln Gln Asn Cys Arg Pro Arg 340 345 350 acg ggt gag gag aac tgc aat gtt aac aac ggt ggc tgt gcc cag aag 1104 Thr Gly Glu Glu Asn Cys Asn Val Asn Asn Gly Gly Cys Ala Gln Lys 355 360 365 tgc cag atg gtg egg ggg gca gtg cag tgt ace tgc cac aca ggc tac 1152 Cys Gln Met Val Arg Gly Ala Val Gln Cys Thr Cys His Thr Gly Tyr 370 375 380 egg etc aca gag gat ggg cac acg tgc caa gat gtg aat gaa tgt gcc 1200 Arg Leu Thr Glu Asp Gly His Thr Cys Gln Asp Val Asn Glu Cys Ala 385 390 395 400 gag gag ggg tat tgc age cag ggc tgc ace aac age gaa ggg get ttc 1248 Glu Glu Gly Tyr Cys Ser Gln Gly Cys Thr Asn Ser Glu Gly Ala Phe 405 410 415 caa tgc tgg tgt gaa aca ggc tat gaa cta egg ccc gac egg cgc age 1296 Gln Cys Trp Cys Glu Thr Gly Tyr Glu Leu Arg Pro Asp Arg Arg Ser 420 425 430 tgc aag get ctg ggg cca gag cct gtg ctg ctg ttc gcc aat cgc ate 1344 Cys Lys Ala Leu Gly Pro Glu Pro Val Leu Leu Phe Ala Asn Arg lIe 435 440 445 gac ate egg cag gtg ctg cca cac cgc let gag tac aca ctg ctg ctt 1392 Asp 11e Arg Gln Val Leu Pro His Arg Ser Glu Tyr Thr Leu Leu Leu 450 455 460 aac aac ctg gag aat gcc att gcc ctt gat ttc cac cac cgc cgc gag 1440 Asn Asn Leu Glu Asn Ala Ile Ala Leu Asp Phe His His Arg Arg Glu 465 470 475 480 ctt gtc ttc tgg tea gat gtc ace ctg gac egg ate etc cgt gcc aac 1488 Leu Val Phe Trp Ser Asp Val Thr Leu Asp Arg lIe Leu Arg Ala Asn 485 490 495 etc aac ggc age aac gtg gag gag gtt gtg let act ggg ctg gag age 1536 Leu Asn Gly Ser Asn Val Glu Glu Val Val Ser Thr Gly Leu Glu Ser 500 505 510 cca ggg ggc ctg get gtg gat tgg gtc cat gac aaa etc tac tgg ace 1584 Pro Gly Gly Leu Ala Val Asp Trp 
Val His Asp Lys Leu Tyr Trp Thr 515 520 525
gac tea ggc ace teg agg att gag gtg gcc aat ctg gat ggg gcc cac 1632 Asp Ser Gly Thr Ser Arg He Glu Val Ala Asn Leu Asp Gly Ala His 530 535 540 egg aaa gtg ttg ctg tgg cag aac ctg gag aag ccc egg gcc att gcc 1680 Arg Lys Val Leu Leu Trp Gln Asn Leu Glu Lys Pro Arg Ala lie Ala 545 550 555 560 ttg cat ccc atg gag ggt ace att tac tgg aca gac tgg ggc aac ace 1728 Leu His Pro Met Glu Gly Thr He Tyr Trp Thr Asp Trp Gly Asn Thr 565 570 575 ccc cgt att gag gcc tec age atg gat ggc let gga cgc cgc ate att 1776 Pro Arg He Glu Ala Ser Ser Met Asp Gly Ser Gly Arg Arg lie He 580 585 590 gcc gat ace cat etc ttc tgg ccc aat ggc etc ace ate gac tat gcc 1824 Ala Asp Thr His Leu Phe Trp Pro Asn Gly Leu Thr lie Asp Tyr Ala 595 600 605 ggg cgc cgt atg tac tgg gtg gat get aag cac cat gtc ate gag agg 1872 Gly Arg Arg Met Tyr Trp Val Asp Ala Lys His His Val He Glu Arg 610 615 620 gcc aat ctg gat ggg agt cac cgt aag get gtc att age cag ggc etc 1920 Ala Asn Leu Asp Gly Ser His Arg Lys Ala Val He Ser Gln Gly Leu 625 630 635 640 ccg cat ccc ttc gcc ate aca gtg ttt gaa gac age ctg tac tgg aca 1968 Pro His Pro Phe Ala He Thr Val Phe Glu Asp Ser Leu Tyr Trp Thr 645 650 655 gac tgg cac ace aag age ate aat age get aac aaa ttt acg ggg aag 2016 Asp Trp His Thr Lys Ser He Asn Ser Ala Asn Lys Phe Thr Gly Lys 660 665 670 aac cag gaa ate att cgc aac aaa etc cac ttc cct atg gac ate cac 2064 Asn Gln Glu He He Arg Asn Lys Leu His Phe Pro Met Asp He His 675 680 685 ace ttg cac ccc cag cgc caa cct gca ggg aaa aac cgc tgt ggg gac 2112 Thr Leu His Pro Gln Arg Gln Pro Ala Gly Lys Asn Arg Cys Gly Asp 690 695 700 aac aac gga ggc tgc acg cac ctg tgt ctg ccc agt ggc cag aac tac 2160 Asn Asn Gly Gly Cys Thr His Leu Cys Leu Pro Ser Gly Gln Asn Tyr 705 710 715 720 ace tgt gcc tgc ccc act ggc ttc cgc aag ate age age cac gcc tgt 2208 Thr Cys Ala Cys Pro Thr Gly Phe Arg Lys He Ser Ser His Ala Cys 725 730 735 gcc cag agt ctt gac aag ttc ctg ctt ttt gcc cga agg atg gac ate 2256 Ala Gln Ser Leu Asp Lys Phe Leu Leu Phe Ala 
Arg Arg Met Asp Ile 740 745 750 cgt cga atc agc ttt gac aca gag gac ctg tct gat gat gtc atc cca 2304
Arg Arg He Ser Phe Asp Thr Glu Asp Leu Ser Asp Asp Val lie Pro 755 760 765 ctg get gac gtg cgc agt get gtg gcc ctt gac tgg gac tec egg gat 2352 Leu Ala Asp Val Arg Ser Ala Val Ala Leu Asp Trp Asp Ser Arg Asp 770 775 780 gac cac gtg tac tgg aca gat gtc age act gat ace ate age agg gcc 2400 Asp His Val Tyr Trp Thr Asp Val Ser Thr Asp Thr He Ser Arg Ala 785 790 795 800 aag tgg gat gga aca gga cag gag gtg gta gtg gat ace agt ttg gag 2448 Lys Trp Asp Gly Thr Gly Gln Glu Val Val Val Asp Thr Ser Leu Glu 805 810 815 age cca get ggc ctg gcc att gat tgg gtc ace aac aaa ctg tac tgg 2496 Ser Pro Ala Gly Leu Ala lIe Asp Trp Val Thr Asn Lys Leu Tyr Trp 820 825 830 aca gat gca ggt aca gac egg att gaa gta gcc aac aca gat ggc age 2544 Thr Asp Ala Gly Thr Asp Arg He Glu Val Ala Asn Thr Asp Gly Ser 835 840 845 atg aga aca gta etc ate tgg gag aac ctt gat cgt cct egg gac ate 2592 Met Arg Thr Val Leu He Trp Glu Asn Leu Asp Arg Pro Arg Asp He 850 855 860 gtg gtg gaa ccc atg ggc ggg tac atg tat tgg act gac tgg ggt gcg 2640 Val Val Glu Pro Met Gly Gly Tyr Met Tyr Trp Thr Asp Trp Gly Ala 865 870 875 880 age ccc aag att gaa cga get ggc atg gat gcc tea ggc cgc caa gtc 2688 Ser Pro Lys Ile Glu Arg Ala Gly Met Asp Ala Ser Gly Arg Gln Val 885 890 895 att ate let let aat ctg ace tgg cct aat ggg tta get att gat tat 2736 lIe Ile Ser Ser Asn Leu Thr Trp Pro Asn Gly Leu Ala Ile Asp Tyr 900 905 910 ggg tec cag cgt cta tac tgg get gac gcc ggc atg aag aca att gaa 2784 Gly Ser Gln Arg Leu Tyr Trp Ala Asp Ala Gly Met Lys Thr He Glu 915 920 925 ttt get gga ctg gat ggc agt aag agg aag gtg ctg att gga age cag 2832 Phe Ala Gly Leu Asp Gly Ser Lys Arg Lys Val Leu He Gly Ser Gln 930 935 940 etc ccc cac cca ttt ggg ctg ace etc tat gga gag cgc ate tat tgg 2880 Leu Pro His Pro Phe Gly Leu Thr Leu Tyr Gly Glu Arg Ile Tyr Trp 945 950 955 960 act gac tgg cag ace aag age ata cag age get gac egg ctg aca ggg 2928 Thr Asp Trp Gln Thr Lys Ser lie Gln Ser Ala Asp Arg Leu Thr Gly 965 970 975 ctg gac egg gag act ctg cag gag 
aac ctg gaa aac cta atg gac atc 2976 Leu Asp Arg Glu Thr Leu Gln Glu Asn Leu Glu Asn Leu Met Asp Ile
980 985 990 cat gtc ttc cac cgc cgc egg ccc cca gtg let aca cca tgt get atg 3024 His Val Phe His Arg Arg Arg Pro Pro Val Ser Thr Pro Cys Ala Met 995 1000 1005 gag aat ggc ggc tgt age cac ctg tgt ctt agg tec cca aat cca age 3072 Glu Asn Gly Gly Cys Ser His Leu Cys Leu Arg Ser Pro Asn Pro Ser 1010 1015 1020 gga ttc age tgt ace tgc ccc aca ggc ate aac ctg ctg let gat ggc 3120 Gly Phe Ser Cys Thr Cys Pro Thr Gly Ile Asn Leu Leu Ser Asp Gly 1025 1030 1035 1040 aag ace tgc tea cca ggc atg aac agt ttc etc ate ttc gee agg agg 3168 Lys Thr Cys Ser Pro Gly Met Asn Ser Phe Leu Ile Phe Ala Arg Arg 1045 1050 1055 ata gac att cgc atg gtc tec ctg gac ate cct tat ttt get gat gtg 3216 Ile Asp Ile Arg Met Val Ser Leu Asp Ile Pro Tyr Phe Ala Asp Val 1060 1065 1070 gtg gta cca ate aac att ace atg aag aac ace att gcc att gga gta 3264 Val Val Pro Ile Asn Ile Thr Met Lys Asn Thr Ile Ala Ile Gly Val 1075 1080 1085 gac ccc cag gaa gga aag gtg tac tgg tct gac age aca ctg cac agg 3312 Asp Pro Gln Glu Gly Lys Val Tyr Trp Ser Asp Ser Thr Leu His Arg 1090 1095 1100 ate agt cgt gcc aat ctg gat ggc tea cag cat gag gac ate ate ace 3360 Ile Ser Arg Ala Asn Leu Asp Gly Ser Gln His Glu Asp Ile Ile Thr 1105 1110 1115 1120 aca ggg cta cag ace aca gat ggg etc gcg gtt gat gcc att ggc egg 3408 Thr Gly Leu Gln Thr Thr Asp Gly Leu Ala Val Asp Ala Ile Gly Arg 1125 1130 1135 aaa gta tac tgg aca gac acg gga aca aac egg att gaa gtg ggc aac 3456 Lys Val Tyr Trp Thr Asp Thr Gly Thr Asn Arg Ile Glu Val Gly Asn 1140 1145 1150 ctg gac ggg tec atg egg aaa gtg ttg gtg tgg cag aac ctt gac agt 3504 Leu Asp Gly Ser Met Arg Lys Val Leu Val Trp Gln Asn Leu Asp Ser 1155 1160 1165 ccc egg gcc ate gta ctg tac cat gag atg ggg ttt atg tac tgg aca 3552 Pro Arg Ala Ile Val Leu Tyr His Glu Met Gly Phe Met Tyr Trp Thr 1170 1175 1180 gac tgg ggg gag aat gcc aag tta gag egg tec gga atg gat ggc tea 3600 Asp Trp Gly Glu Asn Ala Lys Leu Glu Arg Ser Gly Met Asp Gly Ser 1185 1190 1195 1200 gac cgc gcg gtg etc ate aac aac aac cta gga tgg 
ccc aat gga ctg 3648 Asp Arg Ala Val Leu Ile Asn Asn Asn Leu Gly Trp Pro Asn Gly Leu 1205 1210 1215
act gtg gac aag gcc age tee caa ctg cta tgg gcc gat gcc cac ace 3696 Thr Val Asp Lys Ala Ser Ser Gln Leu Leu Trp Ala Asp Ala His Thr 1220 1225 1230 gag cga att gag get get gac ctg aat ggt gcc aat egg cat aca ttg 3744 Glu Arg Ile Glu Ala Ala Asp Leu Asn Gly Ala Asn Arg His Thr Leu 1235 1240 1245 gtg tea ccg gtg cag cac cca tat ggc etc ace ctg etc gac tec tat 3792 Val Ser Pro Val Gln His Pro Tyr Gly Leu Thr Leu Leu Asp Ser Tyr 1250 1255 1260 ate tac tgg act gac tgg cag act egg age ate cac cgt get gac aag 3840 Ile Tyr Trp Thr Asp Trp Gln Thr Arg Ser Ile His Arg Ala Asp Lys 1265 1270 1275 1280 ggt act ggc age aat gtc ate etc gtg agg tee aac ctg cca ggc etc 3888 Gly Thr Gly Ser Asn Val Ile Leu Val Arg Ser Asn Leu Pro Gly Leu 1285 1290 1295 atg gac atg cag get gtg gac egg gca cag cca cta ggt ttt aac aag 3936 Met Asp Met Gln Ala Val Asp Arg Ala Gln Pro Leu Gly Phe Asn Lys 1300 1305 1310 tge gge teg aga aat ggc ggc tgc tec cac etc tgc ttg cct egg cct 3984 Cys Gly Ser Arg Asn Gly Gly Cys Ser His Leu Cys Leu Pro Arg Pro 1315 1320 1325 let ggc ttc tee tgt gcc tgc eee act ggc ate cag ctg aag gga gat 4032 Ser Gly Phe Ser Cys Ala Cys Pro Thr Gly Ile Gln Leu Lys Gly Asp 1330 1335 1340 ggg aag ace tgt gat ccc let cct gag ace tac ctg etc ttc tec age 4080 Gly Lys Thr Cys Asp Pro Ser Pro Glu Thr Tyr Leu Leu Phe Ser Ser 1345 1350 1355 1360 cgt ggc tee ate egg cgt ate tea ctg gac ace agt gac cac ace gat 4128 Arg Gly Ser Ile Arg Arg He Ser Leu Asp Thr Ser Asp His Thr Asp 1365 1370 1375 gtg cat gtc cct gtt cct gag etc aac aat gtc ate tec ctg gac tat 4176 Val His Val Pro Val Pro Glu Leu Asn Asn Val He Ser Leu Asp Tyr 1380 1385 1390 gac age gtg gat gga aag gtc tat tac aca gat gtg ttc ctg gat gtt 4224 Asp Ser Val Asp Gly Lys Val Tyr Tyr Thr Asp Val Phe Leu Asp Val 1395 1400 1405 ate agg cga gca gac ctg aac ggc age aac atg gag aca gtg ate ggg 4272 Ile Arg Arg Ala Asp Leu Asn Gly Ser Asn Met Glu Thr Val Ile Gly 1410 1415 1420 cga ggg ctg aag ace act gac ggg ctg gca gtg gac tgg gtg gcc agg 
4320 Arg Gly Leu Lys Thr Thr Asp Gly Leu Ala Val Asp Trp Val Ala Arg 1425 1430 1435 1440
aac ctg tac tgg aca gac aca ggt cga aat acc att gag gcg tcc agg 4368
Asn Leu Tyr Trp Thr Asp Thr Gly Arg Asn Thr Ile Glu Ala Ser Arg
1445 1450 1455
ctg gat ggt tcc tgc cgc aaa gta ctg atc aac aat agc ctg gat gag 4416
Leu Asp Gly Ser Cys Arg Lys Val Leu Ile Asn Asn Ser Leu Asp Glu
1460 1465 1470
ccc cgg gcc att gct gtt ttc ccc agg aag ggg tac ctc ttc tgg aca 4464
Pro Arg Ala Ile Ala Val Phe Pro Arg Lys Gly Tyr Leu Phe Trp Thr
1475 1480 1485
gac tgg ggc cac att gcc aag atc gaa cgg gca aac ttg gat ggt tct 4512
Asp Trp Gly His Ile Ala Lys Ile Glu Arg Ala Asn Leu Asp Gly Ser
1490 1495 1500
gag cgg aag gtc ctc atc aac aca gac ctg ggt tgg ccc aat ggc ctt 4560
Glu Arg Lys Val Leu Ile Asn Thr Asp Leu Gly Trp Pro Asn Gly Leu
1505 1510 1515 1520
acc ctg gac tat gat acc cgc agg atc tac tgg gtg gat gcg cat ctg 4608
Thr Leu Asp Tyr Asp Thr Arg Arg Ile Tyr Trp Val Asp Ala His Leu
1525 1530 1535
gac cgg atc gag agt gct gac ctc aat ggg aaa ctg cgg cag gtc ttg 4656
Asp Arg Ile Glu Ser Ala Asp Leu Asn Gly Lys Leu Arg Gln Val Leu
1540 1545 1550
gtc agc cat gtg tcc cac ccc ttt gcc ctc aca cag caa gac agg tgg 4704
Val Ser His Val Ser His Pro Phe Ala Leu Thr Gln Gln Asp Arg Trp
1555 1560 1565
atc tac tgg aca gac tgg cag acc aag tca atc cag cgt gtt gac aaa 4752
Ile Tyr Trp Thr Asp Trp Gln Thr Lys Ser Ile Gln Arg Val Asp Lys
1570 1575 1580
tac tca ggc cgg aac aag gag aca gtg ctg gca aat gtg gaa gga ctc 4800
Tyr Ser Gly Arg Asn Lys Glu Thr Val Leu Ala Asn Val Glu Gly Leu
1585 1590 1595 1600
atg gat atc atc gtg gtt tcc cct cag cgg cag aca ggg acc aat gcc 4848
Met Asp Ile Ile Val Val Ser Pro Gln Arg Gln Thr Gly Thr Asn Ala
1605 1610 1615
tgt ggt gtg aac aat ggt ggc tgc acc cac ctc tgc ttt gcc aga gcc 4896
Cys Gly Val Asn Asn Gly Gly Cys Thr His Leu Cys Phe Ala Arg Ala
1620 1625 1630
tcg gac ttc gta tgt gcc tgt cct gac gaa cct gat agc cgg ccc tgc 4944
Ser Asp Phe Val Cys Ala Cys Pro Asp Glu Pro Asp Ser Arg Pro Cys
1635 1640 1645
tcc ctt gtg cct ggc ctg gta cca cca gct cct agg gct act ggc atg 4992
Ser Leu Val Pro Gly Leu Val Pro Pro Ala Pro Arg Ala Thr Gly Met
1650 1655 1660
agt gaa aag agc cca gtg cta ccc aac aca cca cct acc acc ttg tat 5040
Ser Glu Lys Ser Pro Val Leu Pro Asn Thr Pro Pro Thr Thr Leu Tyr
1665 1670 1675 1680
tct tca acc acc cgg acc cgc acg tct ctg gag gag gtg gaa gga aga 5088
Ser Ser Thr Thr Arg Thr Arg Thr Ser Leu Glu Glu Val Glu Gly Arg
1685 1690 1695
tgc tct gaa agg gat gcc agg ctg ggc ctc tgt gca cgt tcc aat gac 5136
Cys Ser Glu Arg Asp Ala Arg Leu Gly Leu Cys Ala Arg Ser Asn Asp
1700 1705 1710
gct gtt cct gct gct cca ggg gaa gga ctt cat atc agc tac gcc att 5184
Ala Val Pro Ala Ala Pro Gly Glu Gly Leu His Ile Ser Tyr Ala Ile
1715 1720 1725
ggt gga ctc ctc agt att ctg ctg att ttg gtg gtg att gca gct ttg 5232
Gly Gly Leu Leu Ser Ile Leu Leu Ile Leu Val Val Ile Ala Ala Leu
1730 1735 1740
atg ctg tac aga cac aaa aaa tcc aag ttc act gat cct gga atg ggg 5280
Met Leu Tyr Arg His Lys Lys Ser Lys Phe Thr Asp Pro Gly Met Gly
1745 1750 1755 1760
aac ctc acc tac agc aac ccc tcc tac cga aca tcc aca cag gaa gtg 5328
Asn Leu Thr Tyr Ser Asn Pro Ser Tyr Arg Thr Ser Thr Gln Glu Val
1765 1770 1775
aag att gaa gca atc ccc aaa cca gcc atg tac aac cag ctg tgc tat 5376
Lys Ile Glu Ala Ile Pro Lys Pro Ala Met Tyr Asn Gln Leu Cys Tyr
1780 1785 1790
aag aaa gag gga ggg cct gac cat aac tac acc aag gag aag atc aag 5424
Lys Lys Glu Gly Gly Pro Asp His Asn Tyr Thr Lys Glu Lys Ile Lys
1795 1800 1805
atc gta gag gga atc tgc ctc ctg tct ggg gat gat gct gag tgg gat 5472
Ile Val Glu Gly Ile Cys Leu Leu Ser Gly Asp Asp Ala Glu Trp Asp
1810 1815 1820
gac ctc aag caa ctg cga agc tca cgg ggg ggc ctc ctc cgg gat cat 5520
Asp Leu Lys Gln Leu Arg Ser Ser Arg Gly Gly Leu Leu Arg Asp His
1825 1830 1835 1840
gta tgc atg aag aca gac acg gtg tcc atc cag gcc agc tct ggc tcc 5568
Val Cys Met Lys Thr Asp Thr Val Ser Ile Gln Ala Ser Ser Gly Ser
1845 1850 1855
ctg gat gac aca gag acg gag cag ctg tta cag gaa gag cag tct gag 5616
Leu Asp Asp Thr Glu Thr Glu Gln Leu Leu Gln Glu Glu Gln Ser Glu
1860 1865 1870
tgt agc agc gtc cat act gca gcc act cca gaa aga cga ggc tct ctg 5664
Cys Ser Ser Val His Thr Ala Ala Thr Pro Glu Arg Arg Gly Ser Leu
1875 1880 1885
cca gac acg ggc tgg aaa cat gaa cgc aag ctc tcc tca gag agc cag 5712
Pro Asp Thr Gly Trp Lys His Glu Arg Lys Leu Ser Ser Glu Ser Gln
1890 1895 1900
gtc 5715
Val
<210> 5
<211> 1905
<212> PRT
<213> Homo sapiens
<400> 5
Met Arg Arg Gln Trp Gly Ala Leu Leu Leu Gly Ala Leu Leu Cys Ala
1 5 10 15
His Gly Leu Ala Ser Ser Pro Glu Cys Ala Cys Gly Arg Ser His Phe
20 25 30
Thr Cys Ala Val Ser Ala Leu Gly Glu Cys Thr Cys Ile Pro Ala Gln
35 40 45
Trp Gln Cys Asp Gly Asp Asn Asp Cys Gly Asp His Ser Asp Glu Asp
50 55 60
Gly Cys Ile Leu Pro Thr Cys Ser Pro Leu Asp Phe His Cys Asp Asn
65 70 75 80
Gly Lys Cys Ile Arg Arg Ser Trp Val Cys Asp Gly Asp Asn Asp Cys
85 90 95
Glu Asp Asp Ser Asp Glu Gln Asp Cys Pro Pro Arg Glu Cys Glu Glu
100 105 110
Asp Glu Phe Pro Cys Gln Asn Gly Tyr Cys Ile Arg Ser Leu Trp His
115 120 125
Cys Asp Gly Asp Asn Asp Cys Gly Asp Asn Ser Asp Glu Gln Cys Asp
130 135 140
Met Arg Lys Cys Ser Asp Lys Glu Phe Arg Cys Ser Asp Gly Ser Cys
145 150 155 160
Ile Ala Glu His Trp Tyr Cys Asp Gly Asp Thr Asp Cys Lys Asp Gly
165 170 175
Ser Asp Glu Glu Asn Cys Pro Ser Ala Val Pro Ala Pro Pro Cys Asn
180 185 190
Leu Glu Glu Phe Gln Cys Ala Tyr Gly Arg Cys Ile Leu Asp Ile Tyr
195 200 205
His Cys Asp Gly Asp Asp Asp Cys Gly Asp Trp Ser Asp Glu Ser Asp
210 215 220
Cys Ser Ser His Gln Pro Cys Arg Ser Gly Glu Phe Met Cys Asp Ser
225 230 235 240
Gly Leu Cys Ile Asn Ala Gly Trp Arg Cys Asp Gly Asp Ala Asp Cys
245 250 255
Asp Asp Gln Ser Asp Glu Arg Asn Cys Thr Thr Ser Met Cys Thr Ala
260 265 270
Glu Gln Phe Arg Cys His Ser Gly Arg Cys Val Arg Leu Ser Trp Arg
275 280 285
Cys Asp Gly Glu Asp Asp Cys Ala Asp Asn Ser Asp Glu Glu Asn Cys
290 295 300
Glu Asn Thr Gly Ser Pro Gln Cys Ala Leu Asp Gln Phe Leu Cys Trp
305 310 315 320
Asn Gly Arg Cys Ile Gly Gln Arg Lys Leu Cys Asn Gly Val Asn Asp
325 330 335
Cys Gly Asp Asn Ser Asp Glu Ser Pro Gln Gln Asn Cys Arg Pro Arg
340 345 350
Thr Gly Glu Glu Asn Cys Asn Val Asn Asn Gly Gly Cys Ala Gln Lys
355 360 365
Cys Gln Met Val Arg Gly Ala Val Gln Cys Thr Cys His Thr Gly Tyr
370 375 380
Arg Leu Thr Glu Asp Gly His Thr Cys Gln Asp Val Asn Glu Cys Ala
385 390 395 400
Glu Glu Gly Tyr Cys Ser Gln Gly Cys Thr Asn Ser Glu Gly Ala Phe
405 410 415
Gln Cys Trp Cys Glu Thr Gly Tyr Glu Leu Arg Pro Asp Arg Arg Ser
420 425 430
Cys Lys Ala Leu Gly Pro Glu Pro Val Leu Leu Phe Ala Asn Arg Ile
435 440 445
Asp Ile Arg Gln Val Leu Pro His Arg Ser Glu Tyr Thr Leu Leu Leu
450 455 460
Asn Asn Leu Glu Asn Ala Ile Ala Leu Asp Phe His His Arg Arg Glu
465 470 475 480
Leu Val Phe Trp Ser Asp Val Thr Leu Asp Arg Ile Leu Arg Ala Asn
485 490 495
Leu Asn Gly Ser Asn Val Glu Glu Val Val Ser Thr Gly Leu Glu Ser
500 505 510
Pro Gly Gly Leu Ala Val Asp Trp Val His Asp Lys Leu Tyr Trp Thr
515 520 525
Asp Ser Gly Thr Ser Arg Ile Glu Val Ala Asn Leu Asp Gly Ala His
530 535 540
Arg Lys Val Leu Leu Trp Gln Asn Leu Glu Lys Pro Arg Ala Ile Ala
545 550 555 560
Leu His Pro Met Glu Gly Thr Ile Tyr Trp Thr Asp Trp Gly Asn Thr
565 570 575
Pro Arg Ile Glu Ala Ser Ser Met Asp Gly Ser Gly Arg Arg Ile Ile
580 585 590
Ala Asp Thr His Leu Phe Trp Pro Asn Gly Leu Thr Ile Asp Tyr Ala
595 600 605
Gly Arg Arg Met Tyr Trp Val Asp Ala Lys His His Val Ile Glu Arg
610 615 620
Ala Asn Leu Asp Gly Ser His Arg Lys Ala Val Ile Ser Gln Gly Leu
625 630 635 640
Pro His Pro Phe Ala Ile Thr Val Phe Glu Asp Ser Leu Tyr Trp Thr
645 650 655
Asp Trp His Thr Lys Ser Ile Asn Ser Ala Asn Lys Phe Thr Gly Lys
660 665 670
Asn Gln Glu Ile Ile Arg Asn Lys Leu His Phe Pro Met Asp Ile His
675 680 685
Thr Leu His Pro Gln Arg Gln Pro Ala Gly Lys Asn Arg Cys Gly Asp
690 695 700
Asn Asn Gly Gly Cys Thr His Leu Cys Leu Pro Ser Gly Gln Asn Tyr
705 710 715 720
Thr Cys Ala Cys Pro Thr Gly Phe Arg Lys Ile Ser Ser His Ala Cys
725 730 735
Ala Gln Ser Leu Asp Lys Phe Leu Leu Phe Ala Arg Arg Met Asp Ile
740 745 750
Arg Arg Ile Ser Phe Asp Thr Glu Asp Leu Ser Asp Asp Val Ile Pro
755 760 765
Leu Ala Asp Val Arg Ser Ala Val Ala Leu Asp Trp Asp Ser Arg Asp
770 775 780
Asp His Val Tyr Trp Thr Asp Val Ser Thr Asp Thr Ile Ser Arg Ala
785 790 795 800
Lys Trp Asp Gly Thr Gly Gln Glu Val Val Val Asp Thr Ser Leu Glu
805 810 815
Ser Pro Ala Gly Leu Ala Ile Asp Trp Val Thr Asn Lys Leu Tyr Trp
820 825 830
Thr Asp Ala Gly Thr Asp Arg Ile Glu Val Ala Asn Thr Asp Gly Ser
835 840 845
Met Arg Thr Val Leu Ile Trp Glu Asn Leu Asp Arg Pro Arg Asp Ile
850 855 860
Val Val Glu Pro Met Gly Gly Tyr Met Tyr Trp Thr Asp Trp Gly Ala
865 870 875 880
Ser Pro Lys Ile Glu Arg Ala Gly Met Asp Ala Ser Gly Arg Gln Val
885 890 895
Ile Ile Ser Ser Asn Leu Thr Trp Pro Asn Gly Leu Ala Ile Asp Tyr
900 905 910
Gly Ser Gln Arg Leu Tyr Trp Ala Asp Ala Gly Met Lys Thr Ile Glu
915 920 925
Phe Ala Gly Leu Asp Gly Ser Lys Arg Lys Val Leu Ile Gly Ser Gln
930 935 940
Leu Pro His Pro Phe Gly Leu Thr Leu Tyr Gly Glu Arg Ile Tyr Trp
945 950 955 960
Thr Asp Trp Gln Thr Lys Ser Ile Gln Ser Ala Asp Arg Leu Thr Gly
965 970 975
Leu Asp Arg Glu Thr Leu Gln Glu Asn Leu Glu Asn Leu Met Asp Ile
980 985 990
His Val Phe His Arg Arg Arg Pro Pro Val Ser Thr Pro Cys Ala Met
995 1000 1005
Glu Asn Gly Gly Cys Ser His Leu Cys Leu Arg Ser Pro Asn Pro Ser
1010 1015 1020
Gly Phe Ser Cys Thr Cys Pro Thr Gly Ile Asn Leu Leu Ser Asp Gly
1025 1030 1035 1040
Lys Thr Cys Ser Pro Gly Met Asn Ser Phe Leu Ile Phe Ala Arg Arg
1045 1050 1055
Ile Asp Ile Arg Met Val Ser Leu Asp Ile Pro Tyr Phe Ala Asp Val
1060 1065 1070
Val Val Pro Ile Asn Ile Thr Met Lys Asn Thr Ile Ala Ile Gly Val
1075 1080 1085
Asp Pro Gln Glu Gly Lys Val Tyr Trp Ser Asp Ser Thr Leu His Arg
1090 1095 1100
Ile Ser Arg Ala Asn Leu Asp Gly Ser Gln His Glu Asp Ile Ile Thr
1105 1110 1115 1120
Thr Gly Leu Gln Thr Thr Asp Gly Leu Ala Val Asp Ala Ile Gly Arg
1125 1130 1135
Lys Val Tyr Trp Thr Asp Thr Gly Thr Asn Arg Ile Glu Val Gly Asn
1140 1145 1150
Leu Asp Gly Ser Met Arg Lys Val Leu Val Trp Gln Asn Leu Asp Ser
1155 1160 1165
Pro Arg Ala Ile Val Leu Tyr His Glu Met Gly Phe Met Tyr Trp Thr
1170 1175 1180
Asp Trp Gly Glu Asn Ala Lys Leu Glu Arg Ser Gly Met Asp Gly Ser
1185 1190 1195 1200
Asp Arg Ala Val Leu Ile Asn Asn Asn Leu Gly Trp Pro Asn Gly Leu
1205 1210 1215
Thr Val Asp Lys Ala Ser Ser Gln Leu Leu Trp Ala Asp Ala His Thr
1220 1225 1230
Glu Arg Ile Glu Ala Ala Asp Leu Asn Gly Ala Asn Arg His Thr Leu
1235 1240 1245
Val Ser Pro Val Gln His Pro Tyr Gly Leu Thr Leu Leu Asp Ser Tyr
1250 1255 1260
Ile Tyr Trp Thr Asp Trp Gln Thr Arg Ser Ile His Arg Ala Asp Lys
1265 1270 1275 1280
Gly Thr Gly Ser Asn Val Ile Leu Val Arg Ser Asn Leu Pro Gly Leu
1285 1290 1295
Met Asp Met Gln Ala Val Asp Arg Ala Gln Pro Leu Gly Phe Asn Lys
1300 1305 1310
Cys Gly Ser Arg Asn Gly Gly Cys Ser His Leu Cys Leu Pro Arg Pro
1315 1320 1325
Ser Gly Phe Ser Cys Ala Cys Pro Thr Gly Ile Gln Leu Lys Gly Asp
1330 1335 1340
Gly Lys Thr Cys Asp Pro Ser Pro Glu Thr Tyr Leu Leu Phe Ser Ser
1345 1350 1355 1360
Arg Gly Ser Ile Arg Arg Ile Ser Leu Asp Thr Ser Asp His Thr Asp
1365 1370 1375
Val His Val Pro Val Pro Glu Leu Asn Asn Val Ile Ser Leu Asp Tyr
1380 1385 1390
Asp Ser Val Asp Gly Lys Val Tyr Tyr Thr Asp Val Phe Leu Asp Val
1395 1400 1405
Ile Arg Arg Ala Asp Leu Asn Gly Ser Asn Met Glu Thr Val Ile Gly
1410 1415 1420
Arg Gly Leu Lys Thr Thr Asp Gly Leu Ala Val Asp Trp Val Ala Arg
1425 1430 1435 1440
Asn Leu Tyr Trp Thr Asp Thr Gly Arg Asn Thr Ile Glu Ala Ser Arg
1445 1450 1455
Leu Asp Gly Ser Cys Arg Lys Val Leu Ile Asn Asn Ser Leu Asp Glu
1460 1465 1470
Pro Arg Ala Ile Ala Val Phe Pro Arg Lys Gly Tyr Leu Phe Trp Thr
1475 1480 1485
Asp Trp Gly His Ile Ala Lys Ile Glu Arg Ala Asn Leu Asp Gly Ser
1490 1495 1500
Glu Arg Lys Val Leu Ile Asn Thr Asp Leu Gly Trp Pro Asn Gly Leu
1505 1510 1515 1520
Thr Leu Asp Tyr Asp Thr Arg Arg Ile Tyr Trp Val Asp Ala His Leu
1525 1530 1535
Asp Arg Ile Glu Ser Ala Asp Leu Asn Gly Lys Leu Arg Gln Val Leu
1540 1545 1550
Val Ser His Val Ser His Pro Phe Ala Leu Thr Gln Gln Asp Arg Trp
1555 1560 1565
Ile Tyr Trp Thr Asp Trp Gln Thr Lys Ser Ile Gln Arg Val Asp Lys
1570 1575 1580
Tyr Ser Gly Arg Asn Lys Glu Thr Val Leu Ala Asn Val Glu Gly Leu
1585 1590 1595 1600
Met Asp Ile Ile Val Val Ser Pro Gln Arg Gln Thr Gly Thr Asn Ala
1605 1610 1615
Cys Gly Val Asn Asn Gly Gly Cys Thr His Leu Cys Phe Ala Arg Ala
1620 1625 1630
Ser Asp Phe Val Cys Ala Cys Pro Asp Glu Pro Asp Ser Arg Pro Cys
1635 1640 1645
Ser Leu Val Pro Gly Leu Val Pro Pro Ala Pro Arg Ala Thr Gly Met
1650 1655 1660
Ser Glu Lys Ser Pro Val Leu Pro Asn Thr Pro Pro Thr Thr Leu Tyr
1665 1670 1675 1680
Ser Ser Thr Thr Arg Thr Arg Thr Ser Leu Glu Glu Val Glu Gly Arg
1685 1690 1695
Cys Ser Glu Arg Asp Ala Arg Leu Gly Leu Cys Ala Arg Ser Asn Asp
1700 1705 1710
Ala Val Pro Ala Ala Pro Gly Glu Gly Leu His Ile Ser Tyr Ala Ile
1715 1720 1725
Gly Gly Leu Leu Ser Ile Leu Leu Ile Leu Val Val Ile Ala Ala Leu
1730 1735 1740
Met Leu Tyr Arg His Lys Lys Ser Lys Phe Thr Asp Pro Gly Met Gly
1745 1750 1755 1760
Asn Leu Thr Tyr Ser Asn Pro Ser Tyr Arg Thr Ser Thr Gln Glu Val
1765 1770 1775
Lys Ile Glu Ala Ile Pro Lys Pro Ala Met Tyr Asn Gln Leu Cys Tyr
1780 1785 1790
Lys Lys Glu Gly Gly Pro Asp His Asn Tyr Thr Lys Glu Lys Ile Lys
1795 1800 1805
Ile Val Glu Gly Ile Cys Leu Leu Ser Gly Asp Asp Ala Glu Trp Asp
1810 1815 1820
Asp Leu Lys Gln Leu Arg Ser Ser Arg Gly Gly Leu Leu Arg Asp His
1825 1830 1835 1840
Val Cys Met Lys Thr Asp Thr Val Ser Ile Gln Ala Ser Ser Gly Ser
1845 1850 1855
Leu Asp Asp Thr Glu Thr Glu Gln Leu Leu Gln Glu Glu Gln Ser Glu
1860 1865 1870
Cys Ser Ser Val His Thr Ala Ala Thr Pro Glu Arg Arg Gly Ser Leu
1875 1880 1885
Pro Asp Thr Gly Trp Lys His Glu Arg Lys Leu Ser Ser Glu Ser Gln
1890 1895 1900
Val
1905
<210> 6
<211> 4
<212> PRT
<213> Artificial Sequence
<220>
<221> SITE
<222> (3)
<223> Xaa is any amino acid
<220>
<223> Description of Artificial Sequence: Motif
<400> 6
Asn Pro Xaa Tyr
1
<210> 7
<211> 4
<212> PRT
<213> Artificial Sequence
<220>
<223> Description of Artificial Sequence: Epidermal growth factor-like repeat
<400> 7
Tyr Trp Thr Asp
1
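
The two four-residue motifs of SEQ ID NO: 6 (Asn Pro Xaa Tyr, with Xaa being any amino acid) and SEQ ID NO: 7 (Tyr Trp Thr Asp) can be located in a three-letter-code listing such as SEQ ID NO: 5 with a simple pattern search. The following Python sketch is illustrative only and is not part of the patent disclosure; the helper names `to_one_letter` and `find_motif` are hypothetical, and the two test fragments are short runs copied from SEQ ID NO: 5 (residues 1761-1776 and 523-530).

```python
import re

# Standard three-letter to one-letter amino acid codes.
THREE_TO_ONE = {
    "Ala": "A", "Arg": "R", "Asn": "N", "Asp": "D", "Cys": "C",
    "Gln": "Q", "Glu": "E", "Gly": "G", "His": "H", "Ile": "I",
    "Leu": "L", "Lys": "K", "Met": "M", "Phe": "F", "Pro": "P",
    "Ser": "S", "Thr": "T", "Trp": "W", "Tyr": "Y", "Val": "V",
}


def to_one_letter(three_letter_seq):
    """Convert whitespace-separated three-letter codes to a one-letter string.

    Purely numeric tokens (the position markers of a sequence listing)
    are skipped. An unknown code raises KeyError, which helps flag
    transcription errors in the listing.
    """
    return "".join(
        THREE_TO_ONE[tok] for tok in three_letter_seq.split() if not tok.isdigit()
    )


def find_motif(seq, pattern):
    """Return 1-based start positions of all matches, including overlapping ones."""
    # A zero-width lookahead lets overlapping occurrences be reported.
    return [m.start() + 1 for m in re.finditer(f"(?={pattern})", seq)]


# Residues 1761-1776 of SEQ ID NO: 5 (illustrative fragment).
tail_fragment = "Asn Leu Thr Tyr Ser Asn Pro Ser Tyr Arg Thr Ser Thr Gln Glu Val"
tail_one = to_one_letter(tail_fragment)

# SEQ ID NO: 6: Asn Pro Xaa Tyr, where "." stands for any residue.
npxy_hits = find_motif(tail_one, "NP.Y")

# Residues 523-530 of SEQ ID NO: 5 (illustrative fragment).
repeat_fragment = "Asp Lys Leu Tyr Trp Thr Asp Ser"
ywtd_hits = find_motif(to_one_letter(repeat_fragment), "YWTD")

print(tail_one, npxy_hits, ywtd_hits)
```

Positions are reported relative to the fragment passed in; adding the fragment's offset in the full sequence gives the residue number in SEQ ID NO: 5.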

Claims (12)

  1. An isolated polynucleotide comprising the nucleotide sequence set forth in SEQ ID NO: 4.
  2. The isolated polynucleotide of Claim 1 wherein the polynucleotide consists of a nucleotide sequence of the formula (R1)m-SEQ ID NO: 4-(R2)n, wherein R1 and R2 are independently any nucleic acid residue, and m and n are each integers between 1 and 1000.
  3. The isolated polynucleotide of Claim 2 wherein the polynucleotide consists of the nucleotide sequence set forth in SEQ ID NO: 4.
  4. An isolated polynucleotide that encodes a polypeptide comprising the amino acid sequence set forth in SEQ ID NO: 5.
  5. The isolated polynucleotide of Claim 4 wherein the polynucleotide encodes the amino acid sequence set forth in SEQ ID NO: 5.
  6. An isolated polypeptide comprising the amino acid sequence set forth in SEQ ID NO: 5.
  7. The isolated polypeptide of Claim 6 wherein the polypeptide consists of an amino acid sequence of the formula (R1)m-SEQ ID NO: 5-(R2)n wherein, at the amino terminus, R1 and R2 are independently any amino acid residue, and m and n are each integers between 1 and 1000.
  8. The isolated polypeptide of Claim 6 consisting of the amino acid sequence set forth in SEQ ID NO: 5.
  9. An expression vector comprising the isolated polynucleotide of claim 4 when said expression vector is present in a compatible host cell.
  10. An isolated host cell comprising the expression vector of claim 9.
  11. A process for producing a polypeptide comprising the amino acid sequence set forth in SEQ ID NO: 5 comprising culturing the host cell of claim 10 and recovering the polypeptide from the culture.
  12. A membrane of the host cell of claim 10 expressing said polypeptide.
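
The formula recited in claims 2 and 7, a core sequence flanked by m residues of R1 and n residues of R2, can be illustrated as a string check. The sketch below is a reading aid only and carries no legal weight; the function name `matches_flanked_formula` is hypothetical, "between 1 and 1000" is assumed here to be inclusive at both ends, and the short strings stand in for SEQ ID NO: 4 or SEQ ID NO: 5.

```python
def matches_flanked_formula(candidate, core, min_flank=1, max_flank=1000):
    """Return True if candidate can be written as R1 + core + R2.

    R1 and R2 are arbitrary residues whose lengths m and n both lie in
    [min_flank, max_flank]. The inclusive bounds are an assumption about
    how "between 1 and 1000" is to be read.
    """
    start = candidate.find(core)
    while start != -1:
        m = start                                # residues before the core
        n = len(candidate) - start - len(core)   # residues after the core
        if min_flank <= m <= max_flank and min_flank <= n <= max_flank:
            return True
        # The core may occur more than once; try the next occurrence.
        start = candidate.find(core, start + 1)
    return False


# Toy core standing in for a full-length sequence.
core = "ATGC"
with_both_flanks = matches_flanked_formula("G" + core + "T", core)   # m = 1, n = 1
missing_left_flank = matches_flanked_formula(core + "T", core)       # m = 0
overlong_flank = matches_flanked_formula("A" * 1001 + core + "T", core)

print(with_both_flanks, missing_left_flank, overlong_flank)
```

Under this reading, a sequence identical to the core (m = n = 0) does not satisfy the formula.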
GB0222372A 2001-09-26 2002-09-26 LDL-receptor polypeptides Withdrawn GB2381790A (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
GB0123124A GB0123124D0 (en) 2001-09-26 2001-09-26 Novel protein
GB0214703A GB0214703D0 (en) 2002-06-26 2002-06-26 Novel protein

Publications (2)

Publication Number Publication Date
GB0222372D0 GB0222372D0 (en) 2002-11-06
GB2381790A true GB2381790A (en) 2003-05-14

Family

ID=26246579

Family Applications (1)

Application Number Title Priority Date Filing Date
GB0222372A Withdrawn GB2381790A (en) 2001-09-26 2002-09-26 LDL-receptor polypeptides

Country Status (1)

Country Link
GB (1) GB2381790A (en)

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO1998046743A1 (en) * 1997-04-15 1998-10-22 The Wellcome Trust Limited As Trustee To The Wellcome Trust Novel ldl-receptor
WO2002074906A2 (en) * 2001-03-16 2002-09-26 Eli Lilly And Company Lp mammalian proteins; related reagents

Non-Patent Citations (6)

* Cited by examiner, † Cited by third party
Title
Biochem. Biophys. Res. Comm., Vol.248, 1998, Brown, S. D. et al., "Isolation and characterization of...", pp.879-888. *
Biochem. Biophys. Res. Comm., Vol.251, 1998, Dong, Y. et al., "Molecular cloning and characterization...", pp.784-790. *
Gene, Vol.216, 1998, Hey, P. et al., "Cloning of a novel member...", pp.103-111. *
Genomics, Vol.51, 1998, Nakayama, M. et al., "Identification of high-molecular-weight...", pp.27-34 & associated NCBI Accession AB011540. *
Neuron, Vol. 29, 2001, Herz, J. "The LDL receptor gene family...", pp.571-581. *
P.N.A.S., Vol.91, 1994, Saito, A. et al., "Complete cloning and sequencing...", pp.9725-9729. *

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20130108616A1 (en) * 2009-10-21 2013-05-02 Lin Mei Detection and treatment of lrp4-associated neurotransmission disorders
US9244082B2 (en) * 2009-10-21 2016-01-26 Georgia Regents Research Institute, Inc. Detection and treatment of LRP4-associated neurotransmission disorders
US9897614B2 (en) 2009-10-21 2018-02-20 Augusta University Research Institute, Inc. Detection and treatment of LRP4-associated neurotransmission disorders
US10877047B2 (en) 2009-10-21 2020-12-29 Augusta University Research Institute, Inc. Detection and treatment of LRP4-associated neurotransmission disorders

Also Published As

Publication number Publication date
GB0222372D0 (en) 2002-11-06

Similar Documents

Publication Publication Date Title
US20090192294A1 (en) Magi polynucleotides, polypeptides, and antibodies
US20030069398A1 (en) Identification of human gaba transporter
CA2407959A1 (en) Human wingless-like gene
WO1999058667A1 (en) Rhotekin, a putative target for rho
US6274380B1 (en) Cacnglike3 polynucleotides and expression systems
GB2381790A (en) LDL-receptor polypeptides
US6914125B2 (en) Scramblase 2
EP1266008B1 (en) Acute neuronal induced calcium binding protein type 1 ligand
US7034125B2 (en) Identification of new human gaba transporter
CA2400606A1 (en) New phosphodiesterase type 7b
EP1183372A2 (en) Mprot45 metalloprotease
EP1170365A1 (en) Member of the ion channel family of polypeptides; vanilrep4
CA2420257A1 (en) Identification of a cam-kinase ii inhibitor
GB2365010A (en) Human TREK2 polypeptides
US20060205641A1 (en) VANILREP4 polypeptides and VANILREP4 polynucleotides
US20030143654A1 (en) F-box containing protein
WO2000014223A1 (en) Voltage-gated calcium channel
WO2000014225A1 (en) Putative human neuronal voltage-gated calcium channel gamma-2 and gamma-3 subunits, cacnglike2 (calcium channel gamma like 2)
GB2377445A (en) Ion channel vanilloid receptor, VANILREP7
CA2421187A1 (en) Family member of inhibitor of apoptosis proteins
WO2001012645A1 (en) HUMAN sbhPARS2
CA2405985A1 (en) Nk-2 homeobox transcription factor
CA2403434A1 (en) Human gata-5 transcription factor
GB2367296A (en) RGS5-like regulator of G-protein coupled signalling
CA2385633A1 (en) Paralogue of a head trauma induced cytoplasmatic calcium binding protein

Legal Events

Date Code Title Description
WAP Application withdrawn, taken to be withdrawn or refused ** after publication under section 16(1)