AU3390499A - Human calcium channel compositions and methods using them - Google Patents

Publication number
AU3390499A
AU3390499A AU33904/99A AU3390499A AU3390499A AU 3390499 A AU3390499 A AU 3390499A AU 33904/99 A AU33904/99 A AU 33904/99A AU 3390499 A AU3390499 A AU 3390499A AU 3390499 A AU3390499 A AU 3390499A
Authority
AU
Australia
Prior art keywords
subunit
calcium channel
seq
sequence
dna
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Abandoned
Application number
AU33904/99A
Inventor
Steven B Ellis
Alison Gillespie
Michael M Harpold
Ann F McCue
Mark E. Williams
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Merck and Co Inc
Original Assignee
SIBIA Neurosciences Inc
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Priority claimed from US08/149,097 (US5874236A)
Application filed by SIBIA Neurosciences Inc
Priority to AU33904/99A
Publication of AU3390499A
Assigned to MERCK & CO., INC. reassignment MERCK & CO., INC. Alteration of Name(s) of Applicant(s) under S113 Assignors: SIBIA NEUROSCIENCES, INC.

Landscapes

  • Micro-Organisms Or Cultivation Processes Thereof (AREA)
  • Peptides Or Proteins (AREA)

Description

S & F Ref: 324828D1
AUSTRALIA
PATENTS ACT 1990 COMPLETE SPECIFICATION FOR A STANDARD PATENT
ORIGINAL
Name and Address of Applicant: SIBIA Neurosciences, Inc., 505 Coast Boulevard South, Suite 300, La Jolla, California 92037, UNITED STATES OF AMERICA
Actual Inventor(s): Michael M. Harpold, Steven B. Ellis, Mark E. Williams, Ann F. McCue and Alison Gillespie
Address for Service: Spruson & Ferguson, Patent Attorneys, Level 33 St Martins Tower, 31 Market Street, Sydney, New South Wales, 2000, Australia
Invention Title: Human Calcium Channel Compositions and Methods Using Them
The following statement is a full description of this invention, including the best method of performing it known to me/us:-
HUMAN CALCIUM CHANNEL COMPOSITIONS AND METHODS USING THEM
TECHNICAL FIELD
The present invention relates to molecular biology and pharmacology. More particularly, the invention relates to calcium channel compositions and methods of making and using the same.
BACKGROUND OF THE INVENTION
Calcium channels are membrane-spanning, multi-subunit proteins that allow controlled entry of Ca2+ ions into cells from the extracellular fluid. Cells throughout the animal kingdom, and at least some bacterial, fungal and plant cells, possess one or more types of calcium channel.
The most common type of calcium channel is voltage dependent. "Opening" of a voltage-dependent channel to allow an influx of Ca2+ ions into the cells requires a depolarization to a certain level of the potential difference between the inside of the cell bearing the channel and the extracellular medium bathing the cell. The rate of influx of Ca2+ into the cell depends on this potential difference. All "excitable" cells in animals, such as neurons of the central nervous system (CNS), peripheral nerve cells and muscle cells, including those of skeletal muscles, cardiac muscles, and venous and arterial smooth muscles, have voltage-dependent calcium channels.
Multiple types of calcium channels have been identified in mammalian cells from various tissues, including skeletal muscle, cardiac muscle, lung, smooth muscle and brain [see, Bean, B.P. (1989) Ann. Rev. Physiol. 51:367-384 and Hess, P. (1990) Ann. Rev. Neurosci. 56:337]. The different types of calcium channels have been broadly categorized into four classes, L-, T-, N-, and P-type, distinguished by current kinetics, holding potential sensitivity and sensitivity to calcium channel agonists and antagonists.
Calcium channels are multisubunit proteins that contain two large subunits, designated α1 and α2, which have molecular weights between about 130 and about 200 kilodaltons (kD), and one to three different smaller subunits of less than about kD in molecular weight. At least one of the larger subunits and possibly some of the smaller subunits are glycosylated. Some of the subunits are capable of being phosphorylated. The α1 subunit has a molecular weight of about 150 to about 170 kD when analyzed by sodium dodecylsulfate (SDS)-polyacrylamide gel electrophoresis (PAGE) after isolation from mammalian muscle tissue and has specific binding sites for various 1,4-dihydropyridines (DHPs) and phenylalkylamines. Under non-reducing conditions (in the presence of N-ethylmaleimide), the α2 subunit migrates in SDS-PAGE as a band corresponding to a molecular weight of about 160-190 kD. Upon reduction, a large fragment and smaller fragments are released. The β subunit of the rabbit skeletal muscle calcium channel is a phosphorylated protein that has a molecular weight of 52-65 kD as determined by SDS-PAGE analysis. This subunit is insensitive to reducing conditions. The γ subunit of the calcium channel, which is not observed in all purified preparations, appears to be a glycoprotein with an apparent molecular weight of 30-33 kD, as determined by SDS-PAGE analysis.
In order to study calcium channel structure and function, large amounts of pure channel protein are needed. Because of the complex nature of these multisubunit proteins, the varying concentrations of calcium channels in tissue sources of the protein, the presence of mixed populations of calcium channels in tissues, difficulties in obtaining tissues of interest, and the modifications of the native protein that can occur during the isolation procedure, it is extremely difficult to obtain large amounts of highly purified, completely intact calcium channel protein.
Characterization of a particular type of calcium channel by analysis of whole cells is severely restricted by the presence of mixed populations of different types of calcium channels in the majority of cells. Single-channel recording methods that are used to examine individual calcium channels do not reveal any information regarding the molecular structure or biochemical composition of the channel.
Furthermore, in performing this type of analysis, the channel is isolated from other cellular constituents that might be important for natural functions and pharmacological interactions.
Characterization of the gene or genes encoding calcium channels provides another means of characterization of different types of calcium channels. The amino acid sequence determined from a complete nucleotide sequence of the coding region of a gene encoding a calcium channel protein represents the primary structure of the protein. Furthermore, secondary structure of the calcium channel protein and the relationship of the protein to the membrane may be predicted based on analysis of the primary structure. For instance, hydropathy plots of the α1 subunit protein of the rabbit skeletal muscle calcium channel indicate that it contains four internal repeats, each containing six putative transmembrane regions [Tanabe, T. et al. (1987) Nature 328:313].
Because calcium channels are present in various tissues and have a central role in regulating intracellular calcium ion concentrations, they are implicated in a number of vital processes in animals, including neurotransmitter release, muscle contraction, pacemaker activity, and secretion of hormones and other substances. These processes appear to be involved in numerous human disorders, such as CNS and cardiovascular diseases. Calcium channels, thus, are also implicated in numerous disorders. A number of compounds useful for treating various cardiovascular diseases in animals, including humans, are thought to exert their beneficial effects by modulating functions of voltage-dependent calcium channels present in cardiac and/or vascular smooth muscle. Many of these compounds bind to calcium channels and block, or reduce the rate of, influx of Ca2+ into the cells in response to depolarization of the cell membrane.
The results of studies of recombinant expression of rabbit calcium channel α1 subunit-encoding cDNA clones and transcripts of the cDNA clones indicate that the α1 subunit forms the pore through which calcium enters cells. The relevance of the barium currents generated in these recombinant cells to the actual current generated by calcium channels containing as one component the respective α1 subunits in vivo is unclear. In order to completely and accurately characterize and evaluate different calcium channel types, however, it is essential to examine the functional properties of recombinant channels containing all of the subunits as found in vivo.
In order to conduct this examination and to fully understand calcium channel structure and function, it is critical to identify and characterize as many calcium channel subunits as possible. Also, in order to prepare recombinant cells for use in identifying compounds that interact with calcium channels, it is necessary to be able to produce cells that express uniform populations of calcium channels containing defined subunits.
An understanding of the pharmacology of compounds that interact with calcium channels in other organ systems, such as the CNS, may aid in the rational design of compounds that specifically interact with subtypes of human calcium channels to have desired therapeutic effects, such as in the treatment of neurodegenerative and cardiovascular disorders. Such understanding and the ability to rationally design therapeutically effective compounds, however, have been hampered by an inability to independently determine the types of human calcium channels and the molecular nature of individual subtypes, particularly in the CNS, and by the unavailability of pure preparations of specific channel subtypes to use for evaluation of the specificity of calcium channel-affecting compounds. Thus, identification of DNA encoding human calcium channel subunits and the use of such DNA for expression of calcium channel subunits and functional calcium channels would aid in screening and designing therapeutically effective compounds.
Therefore, it is an object herein to provide DNA encoding specific calcium channel subunits and to provide eukaryotic cells bearing recombinant tissue-specific or subtype-specific calcium channels. It is also an object to provide assays for identification of potentially therapeutic compounds that act as calcium channel antagonists and agonists.
SUMMARY OF THE INVENTION
Isolated and purified nucleic acid fragments that encode human calcium channel subunits are provided. DNA encoding α1 subunits of a human calcium channel, and RNA encoding such subunits, made upon transcription of such DNA, are provided. In particular, DNA fragments encoding α1 subunits of voltage-dependent human calcium channels (VDCCs) type A, type B (also referred to as VDCC IV), type C (also referred to as VDCC II), type D (also referred to as VDCC III) and type E are provided.
DNA encoding α1A, α1B, α1C, α1D and α1E subunits is provided. DNA encoding an α1D subunit that includes the amino acids substantially as set forth as residues 10-2161 of SEQ ID No. 1 is provided. DNA encoding an α1D subunit that includes substantially the amino acids set forth as amino acids 1-34 in SEQ ID No. 2 in place of amino acids 373-406 of SEQ ID No. 1 is also provided. DNA encoding an α1C subunit that includes the amino acids substantially as set forth in SEQ ID No. 3 or SEQ ID No. 6 and DNA encoding an α1B subunit that includes an amino acid sequence substantially as set forth in SEQ ID No. 7 or in SEQ ID No. 8 are also provided.
DNA encoding α1A subunits is also provided. Such DNA includes DNA encoding an α1A subunit that has substantially the same sequence of amino acids as encoded by the DNA set forth in SEQ ID No. 22 or No. 23 or other splice variants of α1A that include all or part of the sequence set forth in SEQ ID No. 22 or 23. The sequence set forth in SEQ ID No. 22 is a splice variant designated α1A-1; and the sequence set forth in SEQ ID No. 23 is a splice variant designated α1A-2. DNA encoding α1A subunits also includes DNA encoding subunits that can be isolated using all or a portion of the DNA having SEQ ID No. 21, 22 or 23 or DNA obtained from the phage lysate of an E. coli host containing DNA encoding an α1A subunit that has been deposited in the American Type Culture Collection, 12301 Parklawn Drive, Rockville, Maryland 20852 U.S.A. under Accession No. 75293 in accord with the Budapest Treaty. The DNA in such phage includes a DNA fragment having the sequence set forth in SEQ ID No. 21. This fragment selectively hybridizes under conditions of high stringency to DNA encoding α1A but not to DNA encoding α1B and, thus, can be used to isolate DNA that encodes α1A subunits.
DNA encoding α1E subunits of a human calcium channel is also provided. This DNA includes DNA that encodes an α1E splice variant designated α1E-1, encoded by the DNA set forth in SEQ ID No. 24, and a variant designated α1E-3, encoded by SEQ ID No. 25. This DNA also includes other splice variants thereof that encode sequences of amino acids encoded by all or a portion of the sequences of nucleotides set forth in SEQ ID Nos. 24 and 25 and DNA that hybridizes under conditions of high stringency to the DNA of SEQ ID No. 24 or 25 and that encodes an α1E splice variant.
DNA encoding α2 subunits of a human calcium channel, and RNA encoding such subunits, made upon transcription of such DNA, are provided. DNA encoding splice variants of the α2 subunit, including tissue-specific splice variants, is also provided. In particular, DNA encoding the α2 subunit subtypes is provided. In particularly preferred embodiments, the DNA encoding the α2 subunit that is produced by alternative processing of a primary transcript that includes DNA encoding the amino acids set forth in SEQ ID No. 11 and the DNA of SEQ ID No. 13 inserted between nucleotides 1624 and 1625 of SEQ ID No. 11 is provided. The DNA and amino acid sequences of the α2 subunit subtypes are set forth in SEQ ID Nos. 11 (α2b), 2 and to 32( 2-a 2 respectively.
Isolated and purified DNA fragments encoding human calcium channel β subunits, including DNA encoding β2, β3 and β4 subunits, and splice variants of the β subunits are provided. RNA encoding β subunits, made upon transcription of the DNA, is also provided.
DNA encoding a β1 subunit that is produced by alternative processing of a primary transcript that includes DNA encoding the amino acids set forth in SEQ ID No. 9, but including the DNA set forth in SEQ ID No. 12 inserted in place of nucleotides 615-781 of SEQ ID No. 9, is also provided. DNA encoding β1 subunits that are encoded by transcripts that have the sequence set forth in SEQ ID No. 9 including the DNA set forth in SEQ ID No. 12 inserted in place of nucleotides 615-781 of SEQ ID No. 9, but that lack one or more of the following sequences of nucleotides: nucleotides 14-34 of SEQ ID No. 12, nucleotides 13-34 of SEQ ID No. 12, nucleotides of SEQ ID No. 12, nucleotides 56-190 of SEQ ID No. 12 and nucleotides 191-271 of SEQ ID No. 12, is also provided. In particular, β1 subunit splice variants (see, SEQ ID Nos. 9, 10 and 33-35), described below, are provided.
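The construction described above (substituting one stretch of a transcript with an inserted sequence, then optionally omitting parts of the insert) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names `replace_region` and `drop_segments` are hypothetical, coordinates are taken to be 1-based and inclusive as in the text, and the toy strings stand in for SEQ ID Nos. 9 and 12, which are not reproduced here.

```python
def replace_region(template: str, start: int, end: int, insert: str) -> str:
    """Replace nucleotides start..end (1-based, inclusive) of template with insert."""
    if not (1 <= start <= end <= len(template)):
        raise ValueError("need 1 <= start <= end <= len(template)")
    # Keep everything before `start`, add the insert, keep everything after `end`.
    return template[:start - 1] + insert + template[end:]


def drop_segments(seq: str, segments: list[tuple[int, int]]) -> str:
    """Remove the given 1-based, inclusive segments from seq."""
    dropped = set()
    for first, last in segments:
        if not (1 <= first <= last <= len(seq)):
            raise ValueError("need 1 <= first <= last <= len(seq)")
        dropped.update(range(first, last + 1))
    return "".join(base for pos, base in enumerate(seq, start=1) if pos not in dropped)


# Toy stand-ins (assumptions, not the patent's sequences).
template = "AAAAATTTTTGGGGG"   # plays the role of the primary transcript
insert = "CACACACA"            # plays the role of the inserted sequence

# Variant 1: the full insert replaces positions 6-10 of the template.
full_variant = replace_region(template, 6, 10, insert)

# Variant 2: the same replacement, but the insert lacks its positions 3-5.
short_variant = replace_region(template, 6, 10, drop_segments(insert, [(3, 5)]))
```

With the real sequences, the call would take the form `replace_region(seq9, 615, 781, seq12)`, where `seq9` and `seq12` are placeholder names for the sequences loaded from the sequence listing.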
β2 subunit splice variants β2C-β2E that include all or a portion of SEQ ID Nos. 26, 37 and 38 are provided; β3 subunit splice variants, including β3 subunit splice variants that have the sequences set forth in SEQ ID Nos. 19 and 20, and DNA encoding the β4 subunit that includes DNA having the sequence set forth in SEQ ID No. 27 and the amino acid sequence set forth in SEQ ID No. 28 are provided.
Also, Escherichia coli (E. coli) host cells harboring plasmids containing DNA encoding β3 have been deposited in accord with the Budapest Treaty under Accession No. 69048 at the American Type Culture Collection. The deposited clone encompasses nucleotides 122-457 in SEQ ID No. 19 and 107-443 in SEQ ID No. DNA encoding β subunits that are produced by alternative processing of a primary transcript encoding a β subunit, including a transcript that includes DNA encoding the amino acids set forth in SEQ ID No. 9 or including a primary transcript that encodes β3 as deposited under ATCC Accession No. 69048, but lacking and including alternative exons are provided or may be constructed from the DNA provided herein.
DNA encoding γ subunits of human calcium channels is also provided. RNA encoding γ subunits, made upon transcription of the DNA, is also provided. In particular, DNA containing the sequence of nucleotides set forth in SEQ ID No. 14 is provided.
Full-length DNA clones and corresponding RNA transcripts encoding α1 subunits, including splice variants of α1 subtypes such as α1B, and β subunits, including β2E, of human calcium channels are provided. Also provided are DNA clones encoding substantial portions of certain α1 subtype subunits and γ subunits of voltage-dependent human calcium channels for the preparation of full-length DNA clones encoding the corresponding full-length subunits. Full-length clones may be readily obtained using the disclosed DNA as a probe as described herein.
The am subunit, subunit, a, subunit and splice variants thereof, the P: and 6:E subunits and subunits and nucleic acids encoding these subunits are of particular interest herein.
Eukaryotic cells containing heterologous DNA encoding one or more calcium channel subunits, particularly human calcium channel subunits, or containing RNA transcripts of DNA clones encoding one or more of the subunits are provided. A single α1 subunit can form a channel. The requisite combination of subunits for formation of active channels in selected cells, however, can be determined empirically using the methods herein. For example, if a selected α1 subtype or variant does not form an active channel in a selected cell line, an additional subunit or subunits can be added until an active channel is formed.
In preferred embodiments, the cells contain DNA or RNA encoding a human ca subunit, preferably at least an aa, aA or a 1 E subunit. In more preferred embodiments, the cells contain DNA or RNA encoding additional heterologous subunits, including at least one e, a. or subunit. In such embodiments, eukaryotic cells stably or transiently transfected with any combination of one, two, three or four of the subunit-encoding DNA clones, such as DNA encoding any of a 1 al 3, aC 0 are provided.
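The empirical search over subunit combinations described above can be laid out systematically. The following sketch simply enumerates every combination of one to four subunit-encoding clones, smallest first, so that candidate transfections can be tried in order of increasing complexity. The labels in `CLONES` and the function name `subunit_combinations` are illustrative assumptions, not identifiers from the specification.

```python
from itertools import combinations


def subunit_combinations(clones, max_size=4):
    """Return all combinations of 1..max_size clones, smallest combinations first."""
    combos = []
    for size in range(1, min(max_size, len(clones)) + 1):
        combos.extend(combinations(clones, size))
    return combos


# Hypothetical labels for four subunit-encoding clones.
CLONES = ["alpha1", "alpha2", "beta", "gamma"]

all_combos = subunit_combinations(CLONES)

# Restrict to combinations that include the pore-forming alpha1 subunit.
with_alpha1 = [combo for combo in all_combos if "alpha1" in combo]

for combo in with_alpha1:
    print(" + ".join(combo))
```

Whether a given combination actually yields an active channel remains an experimental readout; the code only organizes the candidates.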
The eukaryotic cells provided herein contain heterologous DNA that encodes an α1 subunit or heterologous DNA that encodes an α1 subunit and heterologous DNA that encodes a β subunit. At least one subunit selected is an α1A-1, α1A-2, α1C-2, α1E-1, α1E-3, β2C, β2D, β2E, a β3-1 subunit or a β4 subunit. In preferred embodiments, the cells express such heterologous calcium channel subunits and include one or more of the subunits in membrane-spanning heterologous calcium channels.
In more preferred embodiments, the eukaryotic cells express functional, heterologous calcium channels that are capable of gating the passage of calcium channel-selective ions and/or binding compounds that, at physiological concentrations, modulate the activity of the heterologous calcium channel. In certain embodiments, the heterologous calcium channels include at least one heterologous calcium channel subunit. In most preferred embodiments, the calcium channels that are expressed on the surface of the eukaryotic cells are composed substantially or entirely of subunits encoded by the heterologous DNA or RNA. In preferred embodiments, the heterologous calcium channels of such cells are distinguishable from any endogenous calcium channels of the host cell. Such cells provide a means to obtain homogeneous populations of calcium channels. Typically, the cells contain the selected calcium channel as the only heterologous ion channel expressed by the cell.
In certain embodiments the recombinant eukaryotic cells that contain the heterologous DNA encoding the calcium channel subunits are produced by transfection with DNA encoding one or more of the subunits or are injected with RNA transcripts of DNA encoding one or more of the calcium channel subunits. The DNA may be introduced as a linear DNA fragment or may be included in an expression vector for stable or transient expression of the subunit-encoding DNA. Vectors containing DNA encoding human calcium channel subunits are also provided.
The eukaryotic cells that express heterologous calcium channels may be used in assays for calcium channel function or, in the case of cells transformed with fewer subunit-encoding nucleic acids than necessary to constitute a functional recombinant human calcium channel, such cells may be used to assess the effects of additional subunits on calcium channel activity. The additional subunits can be provided by subsequently transfecting such a cell with one or more DNA clones or RNA transcripts encoding human calcium channel subunits.
The recombinant eukaryotic cells that express membrane spanning heterologous calcium channels may be used in methods for identifying compounds that modulate calcium channel activity. In particular, the cells are used in assays that identify agonists and antagonists of calcium channel activity in humans and/or assessing the contribution of the various calcium channel subunits to the transport and regulation of transport of calcium ions. Because the cells constitute homogeneous populations of calcium channels, they provide a means to identify agonists or antagonists of calcium channel activity that are specific for each such population.
The assays that use the eukaryotic cells for identifying compounds that modulate calcium channel activity are also provided. In practicing these assays, the eukaryotic cell that expresses a heterologous calcium channel, containing at least one subunit encoded by the DNA provided herein, is in a solution containing a test compound and a calcium channel-selective ion, the cell membrane is depolarized, and current flowing into the cell is detected. If the test compound is one that modulates calcium channel activity, the current that is detected is different from that produced by depolarizing the same or a substantially identical cell in the presence of the same calcium channel-selective ion but in the absence of the compound. In preferred embodiments, prior to the depolarization step, the cell is maintained at a holding potential which substantially inactivates calcium channels which are endogenous to the cell. Also in preferred embodiments, the cells are mammalian cells, most preferably HEK cells, or amphibian oocytes.
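The comparison at the heart of this assay (current with the test compound versus current without it) can be expressed as a small calculation. The sketch below is an assumption-laden illustration: the text only requires that the two currents be "different", so the 10% cutoff, the function names and the example values are all hypothetical choices, and a real analysis would also account for replicate variability.

```python
def relative_change(test_current: float, control_current: float) -> float:
    """Fractional change in current magnitude relative to the no-compound control."""
    if control_current == 0:
        raise ValueError("control current must be non-zero")
    # Inward currents are conventionally negative, so compare magnitudes.
    return (abs(test_current) - abs(control_current)) / abs(control_current)


def is_modulator(test_current: float, control_current: float,
                 threshold: float = 0.10) -> bool:
    """Flag a compound when the current differs from control by at least `threshold`.

    The threshold is an illustrative assumption, not a value from the specification.
    """
    return abs(relative_change(test_current, control_current)) >= threshold


# Hypothetical peak inward currents in nanoamperes.
control = -100.0        # depolarization without compound
with_compound = -50.0   # depolarization with compound

change = relative_change(with_compound, control)
print(f"relative change: {change:+.0%}, modulator: {is_modulator(with_compound, control)}")
```

A negative relative change corresponds to a reduced current, as expected for an antagonist, and a positive one to an enhanced current, as expected for an agonist.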
Nucleic acid probes, typically labeled for detection, containing at least about 14, preferably 16, or, if desired, 30 or more, contiguous nucleotides of α1D, α1C and α1E subunit-encoding DNA, β subunit-encoding DNA, including β1, β2, β3 and β4 splice variants, and γ subunit-encoding DNA are provided. Methods using the probes for the isolation and cloning of calcium channel subunit-encoding DNA, including splice variants within tissues and inter-tissue variants, are also provided.
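The length criterion for probes (a run of contiguous nucleotides taken from subunit-encoding DNA) lends itself to a simple check. The sketch below lists every contiguous window of a chosen length and tests whether a candidate probe shares a run of at least a given length with a target sequence. The function names and the toy sequences are hypothetical, and the check is a plain same-strand string comparison; it does not model complementarity or hybridization stringency.

```python
def candidate_probes(sequence: str, length: int) -> list[str]:
    """Return every contiguous window of `length` nucleotides in `sequence`."""
    if length < 1:
        raise ValueError("length must be positive")
    return [sequence[i:i + length] for i in range(len(sequence) - length + 1)]


def shares_contiguous_run(probe: str, target: str, min_len: int = 14) -> bool:
    """True if some run of `min_len` contiguous nucleotides of probe occurs in target."""
    return any(window in target for window in candidate_probes(probe, min_len))


# Toy example with short sequences (assumptions, not from the sequence listing).
print(candidate_probes("ACGTAC", 4))
print(shares_contiguous_run("AAAACCCC", "GGAACCGG", min_len=4))
```

In practice `min_len` would be set to 14, 16 or 30 to mirror the lengths named in the text.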
Purified human calcium channel subunits and purified human calcium channels are provided. The subunits and channels can be isolated from a eukaryotic cell transfected with DNA that encodes the subunit.
In another embodiment, immunoglobulins or antibodies obtained from the serum of an animal immunized with a substantially pure preparation of a human calcium channel, human calcium channel subunit or epitope-containing fragment of a human calcium channel subunit are provided. Monoclonal antibodies produced using a human calcium channel, human calcium channel subunit or epitope-containing fragment thereof as an immunogen are also provided. E. coli fusion proteins including a fragment of a human calcium channel subunit may also be used as immunogen. Such fusion proteins may contain a bacterial protein or portion thereof, such as the E. coli TrpE protein, fused to a calcium channel subunit peptide. The immunoglobulins that are produced using the calcium channel subunits or purified calcium channels as immunogens have, among other properties, the ability to specifically and preferentially bind to and/or cause the immunoprecipitation of a human calcium channel or a subunit thereof which may be present in a biological sample or a solution derived from such a biological sample. Such antibodies may also be used to selectively isolate cells that express calcium channels that contain the subunit for which the antibodies are specific.
Methods for modulating the activity of ion channels by contacting the calcium channels with an effective amount of the above-described antibodies are also provided.
A diagnostic method for determining the presence of Lambert-Eaton Syndrome (LES) in a human based on immunological reactivity of LES immunoglobulin G (IgG) with a human calcium channel subunit or a eukaryotic cell which expresses a recombinant human calcium channel or a subunit thereof is also provided. In particular, an immunoassay method for diagnosing Lambert-Eaton Syndrome in a person by combining serum or an IgG fraction from the person (test serum) with calcium channel proteins, including the α and β subunits, and ascertaining whether antibodies in the test serum react with one or more of the subunits, or a recombinant cell which expresses one or more of the subunits, to a greater extent than antibodies in control serum, obtained from a person or group of persons known to be free of the Syndrome, is provided. Any immunoassay procedure known in the art for detecting antibodies against a given antigen in serum can be employed in the method.
According to a first embodiment of the invention, there is provided an isolated DNA molecule, comprising a sequence of nucleotides that encodes an α1 subunit of a human calcium channel, wherein the subunit is selected from the group consisting of α1A-1, α1A-2, α1E-1, and α1E-3.
According to a second embodiment of the invention, there is provided an isolated DNA molecule, comprising a sequence of nucleotides that encodes a β2 subunit of a human calcium channel.
According to a third embodiment of the invention, there is provided a DNA molecule that encodes a β3 subunit of a human calcium channel.
According to a fourth embodiment of the invention, there is provided a DNA molecule, comprising a sequence of nucleotides that encodes a β4 subunit of a human calcium channel.
According to a fifth embodiment of the invention, there is provided a eukaryotic cell, comprising heterologous DNA that encodes an α1 subunit selected from the group of subunits consisting of α1A-1, α1A-2, α1C-2, α1E-1, and α1E-3.
According to a sixth embodiment of the invention, there is provided a eukaryotic cell, comprising heterologous DNA that encodes an α1 subunit and heterologous DNA that encodes a β subunit, wherein at least one subunit is selected from the group of subunits consisting of α1A-1, α1A-2, β2C, β2D, β2E, β3-1 and a β4 subunit.
According to a seventh embodiment of the invention, there is provided a eukaryotic cell with a functional, heterologous calcium channel, produced by a process comprising introducing into the cell heterologous nucleic acid that encodes an α1 subunit of a human calcium channel, wherein: the α1 subunit is selected from the group consisting of α1A-2, α1C-2, α1E-1 and α1E-3; the heterologous calcium channel contains at least one subunit encoded by the heterologous nucleic acid; and the only heterologous ion channels are calcium channels.
According to an eighth embodiment of the invention, there is provided a eukaryotic cell with a functional, heterologous calcium channel, produced by a process comprising: introducing into the cell nucleic acid that encodes an α1 subunit of a human calcium channel and introducing into the cell nucleic acid that encodes a β subunit of a human calcium channel, wherein: at least one of the subunits is selected from the group consisting of α1A-1, α1A-2, α1E-1, α1E-3, β2C, β2D, β2E, a β3 and a β4 subunit; the at least one subunit includes the sequence of amino acids set forth in or encoded by the DNA set forth in any of SEQ ID NOs. 19, 22, 23, 24, 25, 26, 27, 28, 29, or 36 or functionally equivalent variants thereof; the heterologous calcium channel contains at least one subunit encoded by the heterologous nucleic acid; and the only heterologous ion channels are calcium channels.
According to a ninth embodiment of the invention, there is provided a eukaryotic cell with a functional, heterologous calcium channel, produced by a process comprising: introducing into the cell nucleic acid that encodes an α1 subunit of a human calcium channel, nucleic acid that encodes an α2 subunit of a human calcium channel and introducing into the cell nucleic acid that encodes a β subunit of a human calcium channel, wherein: the α2 subunit is an α2b subunit of a human calcium channel, the α1 subunit is an α1A, α1C, α1B, α1D or α1E subunit of a human calcium channel and the β subunit is a β3 or β4 subunit of a human calcium channel, wherein the α1 subunit includes a sequence of amino acids set forth in any of SEQ ID NOs. 1, 2, 3, 6, 7, 8, 22, 23, 24, 25, or 36, the α2b subunit includes an amino acid sequence set forth in SEQ ID NO. 11, and the β subunit includes an amino acid sequence set forth in SEQ ID NO. 19 or functionally equivalent variants of the subunits encoded by any of the aforesaid SEQ ID NOs.
According to a tenth embodiment of the invention, there is provided a eukaryotic cell with a functional, heterologous calcium channel, produced by a process comprising: introducing into the cell nucleic acid that encodes an α1 subunit of a human calcium channel, nucleic acid that encodes an α2 subunit of a human calcium channel and introducing into the cell nucleic acid that encodes a β subunit of a human calcium channel, wherein: the calcium channel contains an α2b subunit of a human calcium channel, an α1D or an α1B subunit of a human calcium channel and a β1-2 or β1-3 subunit of a human calcium channel, wherein the α1 subunit includes a sequence of amino acids encoded by SEQ ID NOs. 1, 2, 7 or 8, the α2b subunit includes an amino acid sequence set forth in SEQ ID NO. 11, and the β subunit includes an amino acid sequence set forth in SEQ ID NOs. 9, 10 and 33-35 or functionally equivalent variants of the subunits encoded by any of the aforesaid SEQ ID NOs.
According to an eleventh embodiment of the invention, there is provided a method for identifying a compound that modulates the activity of a calcium channel, comprising: suspending a eukaryotic cell that has a functional, heterologous calcium channel in a solution containing the compound and a calcium channel-selective ion; depolarizing the cell membrane of the cell; and detecting the current flowing into the cell, wherein: the heterologous calcium channel includes at least one human calcium channel subunit encoded by DNA or RNA that is heterologous to the cell; at least one subunit is selected from the group consisting of α1A-1, α1A-2, α1E-1, α1E-3, α1C-2, β2C, β2D, β2E, a β3 subunit and a β4 subunit and includes the sequence of amino acids set forth in or encoded by the sequence of nucleotides set forth in any of SEQ ID NOs. 19, 22, 23, 24, 25, 26, 27, 28, 29, 30 or 36 or functionally equivalent variants thereof; and the current that is detected is different from that produced by depolarizing the same or a substantially identical cell in the presence of the same calcium channel-selective ion but in the absence of the compound.
According to a twelfth embodiment of the invention, there is provided a subunit-specific antibody selected from the group consisting of antibodies that bind to an a subunit type or a subunit subtype of a human calcium channel, wherein the subunit is an a, ac, cIE, a or C1B type a, subunit, wherein the subunit includes a sequence of amino acids set forth in any of SEQ ID NOs. 1, 2, 3, 6, 7, 8, 22, 23, 24, 25, or 36.
According to a thirteenth embodiment of the invention, there is provided an RNA or single-stranded DNA probe of at least 30 bases in length comprising at least 30 substantially contiguous bases from nucleic acids that encode a subunit of a human calcium channel selected from the group of subunits consisting of ca,IA_, 2 (IE- 3 P 3 2C P2D', 3 2E and 34.
According to a fourteenth embodiment of the invention, there is provided an RNA or single-stranded DNA probe of at least 16 bases in length comprising at least 16 substantially contiguous bases from nucleic acids that encode a subunit of a human calcium channel selected from the group of subunits consisting of aE. 1 P2C, 3 2D, 3 2E and P4 subunits.
According to a fifteenth embodiment of the invention, there is provided a method for identifying nucleic acids that encode a human calcium channel subunit, comprising hybridizing under conditions of at least low stringency a probe in accordance with the thirteenth or fourteenth embodiment of the invention, to a library of nucleic acid molecules, and selecting hybridizing molecules.
According to a sixteenth embodiment of the invention, there is provided a method for identifying cells or tissues that express a calcium channel subunit-encoding nucleic acid, comprising hybridizing under conditions of at least low stringency a probe in accordance with the fourteenth embodiment of the invention, with mRNA expressed in the cells or tissues or cDNA produced from the mRNA, and thereby identifying cells or tissue that express mRNA that encodes the subunit.
According to a seventeenth embodiment of the invention, there is provided a substantially pure human calcium channel subunit selected from among subunits that include the sequence of amino acids set forth in any of SEQ ID NOs. 19, 22, 23, 24, 25, 26, 28, 29, 30 or 36 or functionally equivalent variants thereof.
Detailed Description of the Invention
Definitions:
Unless defined otherwise, all technical and scientific terms used herein have the same meaning as is commonly understood by one of skill in the art to which this invention belongs. All patents and publications referred to herein are incorporated by reference herein.
Reference to each of the calcium channel subunits includes the subunits that are specifically disclosed herein and human calcium channel subunits encoded by DNA that can be isolated by using the DNA disclosed as probes and screening an appropriate human cDNA or genomic library under at least low stringency. Such DNA also includes DNA that encodes proteins that have about 40% homology to any of the subunit proteins described herein or DNA that hybridizes under conditions of at least low stringency to the DNA provided herein and the protein encoded by such DNA exhibits additional identifying characteristics, such as function or molecular weight. It is understood that subunits that are encoded by transcripts that represent splice variants of the disclosed subunits or other such subunits may exhibit less than 40% overall homology to any single subunit, but will include regions of such homology to one or more such subunits. It is also understood that 40% homology refers to proteins that share approximately 40% of their amino acids in common or that share somewhat less, but include conservative amino acid substitutions, whereby the activity of the protein is not substantially altered.
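The "about 40% homology" criterion above can be illustrated with a minimal sketch. It assumes the two protein sequences have already been aligned by some other tool (gaps written as "-"), and it counts only identical residues over ungapped columns; the function names and the choice of denominator are illustrative assumptions, not a method stated in this specification.

```python
# Hypothetical helpers: percent identity between two PRE-ALIGNED protein
# sequences. The alignment itself is assumed to come from elsewhere.

def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Return the percentage of ungapped alignment columns that are identical."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    identical = 0
    for x, y in zip(aligned_a.upper(), aligned_b.upper()):
        if x == gap or y == gap:
            continue  # ignore columns where either sequence has a gap
        compared += 1
        if x == y:
            identical += 1
    return 100.0 * identical / compared if compared else 0.0


def meets_homology_threshold(aligned_a: str, aligned_b: str,
                             threshold_pct: float = 40.0) -> bool:
    """True if identity reaches the threshold (40% mirrors the text above)."""
    return percent_identity(aligned_a, aligned_b) >= threshold_pct
```

A fuller treatment would also credit conservative substitutions, as the definition above allows; this sketch deliberately does not.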
As used herein, the α1 subunit types, encoded by different genes, are designated as type α1A, α1B, α1C, α1D and α1E. These types have also been referred to as VDCC IV for α1B, VDCC II for α1C and VDCC III for α1D. Subunit subtypes, which are splice variants, are referred to, for example as a~ 2, cec.- etc.
Thus, as used herein, DNA encoding the α1 subunit refers to DNA that hybridizes to the DNA provided herein under conditions of at least low stringency or encodes a subunit that has at least about 40% homology to protein encoded by DNA disclosed herein that encodes an α1 subunit of a human calcium channel. An α1 subunit may be identified by its ability to form a calcium channel. Typically, α1 subunits have molecular masses greater than at least about 120 kD. Also, hydropathy plots of deduced α1 subunit amino acid sequences indicate that the α1 subunits contain four internal repeats, each containing six putative transmembrane domains.
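A hydropathy plot of the kind mentioned above can be sketched as a sliding-window average over the Kyte and Doolittle hydropathy scale. The window of 19 residues and the cutoff of 1.6 are commonly cited heuristics for flagging candidate membrane-spanning segments and are assumptions here, as are the function names; the specification does not state which parameters were used.

```python
# Kyte-Doolittle hydropathy values per one-letter amino acid code.
KD_SCALE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def hydropathy_profile(protein_seq: str, window: int = 19) -> list:
    """Mean hydropathy of every full window along the sequence."""
    scores = [KD_SCALE[residue] for residue in protein_seq.upper()]
    return [
        sum(scores[i:i + window]) / window
        for i in range(len(scores) - window + 1)
    ]


def candidate_tm_windows(protein_seq: str, window: int = 19,
                         cutoff: float = 1.6) -> list:
    """Start indices (0-based) of windows whose mean hydropathy >= cutoff."""
    profile = hydropathy_profile(protein_seq, window)
    return [i for i, value in enumerate(profile) if value >= cutoff]
```

Runs of adjacent flagged windows would then be merged into putative transmembrane segments and counted per repeat.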
The activity of a calcium channel may be assessed in vitro by methods known to those of skill in the art, including the electrophysiological and other methods described herein.
Typically, α1 subunits include regions to which one or more modulators of calcium channel activity, such as a 1,4-DHP or ω-CgTx, interact directly or indirectly. Types of α1 subunits may be distinguished by any method known to those of skill in the art, including on the basis of binding specificity. For example, it has been found herein that c 3 subunits participate in the formation of channels that have previously been referred to as N-type channels, aC subunits participate in the formation of channels that had previously been referred to as L-type channels, and a, subunits appear to participate in the formation of channels that exhibit characteristics typical of channels that had previously been designated P-type channels.
Thus, for example, the activity of channels that contain the Sa,,s subunit is insensitive to 1,4-DHPs; whereas the activity of channels that contain the subunit is modulated or altered by a 1,4-DHP. It is presently preferable to refer to calcium channels based on pharmacological characteristics and current kinetics and to avoid historical designations. Types and subtypes of α1 subunits may be characterized on the basis of the effects of such modulators on the subunit or a channel containing the subunit as well as differences in currents and current kinetics produced by calcium channels containing the subunit.
As used herein, an α2 subunit is encoded by DNA that hybridizes to the DNA provided herein under conditions of low stringency or encodes a protein that has at least about 40% homology with that disclosed herein. Such DNA encodes a protein that typically has a molecular mass greater than about 120 kD, but does not form a calcium channel in the absence of an α1 subunit, and may alter the activity of a calcium channel that contains an α1 subunit. Subtypes of the α2 subunit that arise as splice variants are designated by a lower case letter, such as α2b. In addition, the α2 subunit and the large fragment produced when the protein is subjected to reducing conditions appear to be glycosylated with at least N-linked sugars and do not specifically bind to the 1,4-DHPs and phenylalkylamines that specifically bind to the α1 subunit. The smaller fragment, the C-terminal fragment, is referred to as the δ subunit and includes amino acids from about 946 (SEQ ID No. 11) through about the C-terminus. This fragment may dissociate from the remaining portion of α2 when the α2 subunit is exposed to reducing conditions.
As used herein, a β subunit is encoded by DNA that hybridizes to the DNA provided herein under conditions of low stringency or encodes a protein that has at least about 40% homology with that disclosed herein and is a protein that typically has a molecular mass lower than the α subunits and on the order of about 50-80 kD, does not form a detectable calcium channel in the absence of an α1 subunit, but may alter the activity of a calcium channel that contains an α1 subunit or that contains an α1 and an α2 subunit.
Types of the β subunit that are encoded by different genes are designated with subscripts, such as 6, 3, and 3.
Subtypes of β subunits that arise as splice variants of a particular type are designated with a numerical subscript referring to the type and to the variant. Such subtypes include, but are not limited to the 3, splice variants, including and 32 variants, including 3 2-123-
As used herein, a γ subunit is a subunit encoded by DNA disclosed herein as encoding the γ subunit and may be isolated and identified using the DNA disclosed herein as a probe by hybridization or other such method known to those of skill in the art, whereby full-length clones encoding a γ subunit may be isolated or constructed. A γ subunit will be encoded by DNA that hybridizes to the DNA provided herein under conditions of low stringency or exhibits sufficient sequence homology to encode a protein that has at least about 40% homology with the γ subunit described herein.
Thus, one of skill in the art, in light of the disclosure herein, can identify DNA encoding 3, 6 and y calcium channel subunits, including types encoded by different genes and subtypes that represent splice variants. For example, DNA probes based on the DNA disclosed herein may be used to screen an appropriate library, including a genomic or cDNA library, for hybridization to the probe and obtain DNA in one or more clones that includes an open reading frame that encodes an entire protein. Subsequent to screening an appropriate library with the DNA disclosed herein, the isolated DNA can be examined for the presence of an open reading frame from which the sequence of the encoded protein may be deduced. Determination of the molecular weight and comparison with the sequences herein should reveal the identity of the subunit as an a, etc. subunit. Functional assays may, if necessary, be used to determine whether the subunit is an a, subunit or S subunit.
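As a rough illustration of using molecular weight to narrow down subunit identity, the sketch below estimates protein mass from sequence length and compares it with the mass ranges given in the definitions above (greater than about 120 kD for the two large subunit classes, about 50-80 kD for the third class). The 110 Da average residue mass is a rule of thumb and an assumption, the class labels and function names are hypothetical, and the estimate ignores glycosylation, so it can only shortlist candidates.

```python
from typing import List

AVERAGE_RESIDUE_MASS_DA = 110.0  # rule-of-thumb average; an assumption


def estimate_mass_kd(protein_seq: str) -> float:
    """Approximate polypeptide mass in kD from the number of residues."""
    return len(protein_seq) * AVERAGE_RESIDUE_MASS_DA / 1000.0


def candidate_subunit_classes(mass_kd: float) -> List[str]:
    """Shortlist subunit classes whose stated mass range contains mass_kd."""
    candidates = []
    if mass_kd > 120.0:
        # Both large subunit classes are described as exceeding ~120 kD,
        # so mass alone cannot separate them.
        candidates += ["alpha1", "alpha2"]
    if 50.0 <= mass_kd <= 80.0:
        candidates.append("beta")
    return candidates
```

Because the two large classes overlap in mass, sequence comparison or a functional assay, as the text notes, is still needed to tell them apart.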
For example, DNA encoding an a, subunit may be isolated by screening an appropriate library with DNA, encoding all or a portion of the human aC subunit. Such DNA includes the DNA in the phage deposited under ATCC Accession No. 75293 that encodes a portion of an a, subunit. DNA encoding an a subunit may be obtained from an appropriate library by screening with an oligonucleotide having all or a portion of the sequence set forth in SEQ ID No. 21, 22 and/or 23 or with the DNA in the deposited phage. Alternatively, such DNA may have a sequence that encodes an a, subunit that is encoded by SEQ ID NO. 22 or 23.
Similarly, DNA encoding 3, may be isolated by screening a human cDNA library with DNA probes prepared from the plasmid 31.42 deposited under ATCC Accession No. 69048 or may be obtained from an appropriate library using probes having sequences prepared according to the sequences set forth in SEQ ID Nos. 19 and/or 20. Also, DNA encoding β4 may be isolated by screening a human cDNA library with DNA probes prepared according to DNA set forth in SEQ ID No. 27, which sets forth the DNA sequence of a clone encoding a β4 subunit. The amino acid sequence is set forth in SEQ ID No. 28. Any method known to those of skill in the art for isolation and identification of DNA and preparation of full-length genomic or cDNA clones, including methods exemplified herein, may be used.
The subunit encoded by isolated DNA may be identified by comparison with the DNA and amino acid sequences of the subunits provided herein. Splice variants share extensive regions of homology, but include non-homologous regions; subunits encoded by different genes share a uniform distribution of non-homologous sequences.
As used herein, a splice variant refers to a variant produced by differential processing of a primary transcript of genomic DNA that results in more than one type of mRNA.
Splice variants may occur within a single tissue type or among tissues (tissue-specific variants). Thus, cDNA clones that encode calcium channel subunit subtypes that have regions of identical amino acids and regions of different amino acid sequences are referred to herein as "splice variants".
As used herein, a "calcium channel-selective ion" is an ion that is capable of flowing through, or being blocked from flowing through, a calcium channel which spans a cellular membrane under conditions which would substantially similarly permit or block the flow of Ca2+. Ba2+ is an example of an ion which is a calcium channel-selective ion.
As used herein, a compound that modulates calcium channel activity is one that affects the ability of the calcium channel to pass calcium channel-selective ions or affects other detectable calcium channel features, such as current kinetics. Such compounds include calcium channel antagonists and agonists and compounds that exert their effect on the activity of the calcium channel directly or indirectly.
As used herein, a "substantially pure" subunit or protein is a subunit or protein that is sufficiently free of other polypeptide contaminants to appear homogeneous by SDS-PAGE or to be unambiguously sequenced.
As used herein, selectively hybridize means that a DNA fragment hybridizes to a second fragment with sufficient specificity to permit the second fragment to be identified or isolated from among a plurality of fragments. In general, selective hybridization occurs at conditions of high stringency.
As used herein, heterologous or foreign DNA and RNA are used interchangeably and refer to DNA or RNA that does not occur naturally as part of the genome in which it is present or which is found in a location or locations in the genome that differ from that in which it occurs in nature. It is DNA or RNA that is not endogenous to the cell and has been artificially introduced into the cell. Examples of heterologous DNA include, but are not limited to, DNA that encodes a calcium channel subunit and DNA that encodes RNA or proteins that mediate or alter expression of endogenous DNA by affecting transcription, translation, or other regulatable biochemical processes. The cell that expresses the heterologous DNA, such as a calcium channel subunit, may contain DNA encoding the same or different calcium channel subunits. The heterologous DNA need not be expressed and may be introduced in a manner such that it is integrated into the host cell genome or is maintained episomally.
As used herein, operative linkage of heterologous DNA to regulatory and effector sequences of nucleotides, such as promoters, enhancers, transcriptional and translational stop sites, and other signal sequences, refers to the functional relationship between such DNA and such sequences of nucleotides. For example, operative linkage of heterologous DNA to a promoter refers to the physical and functional relationship between the DNA and the promoter such that the transcription of such DNA is initiated from the promoter by an RNA polymerase that specifically recognizes, binds to and transcribes the DNA in reading frame.
As used herein, isolated, substantially pure DNA refers to DNA fragments purified according to standard techniques employed by those skilled in the art [see, Maniatis et al. (1982) Molecular Cloning: A Laboratory Manual, Cold Spring Harbor Laboratory Press, Cold Spring Harbor, NY].
As used herein, expression refers to the process by which nucleic acid is transcribed into mRNA and translated into peptides, polypeptides, or proteins. If the nucleic acid is derived from genomic DNA, expression may, if an appropriate eukaryotic host cell or organism is selected, include splicing of the mRNA.
As used herein, vector or plasmid refers to discrete elements that are used to introduce heterologous DNA into cells for either expression of the heterologous DNA or for replication of the cloned heterologous DNA. Selection and use of such vectors and plasmids are well within the level of skill of the art.
As used herein, expression vector includes vectors capable of expressing DNA fragments that are in operative linkage with regulatory sequences, such as promoter regions, that are capable of effecting expression of such DNA fragments. Thus, an expression vector refers to a recombinant DNA or RNA construct, such as a plasmid, a phage, recombinant virus or other vector that, upon introduction into an appropriate host cell, results in expression of the cloned DNA. Appropriate expression vectors are well known to those of skill in the art and include those that are replicable in eukaryotic cells and/or prokaryotic cells and those that remain episomal or may integrate into the host cell genome.
As used herein, a promoter region refers to the portion of DNA of a gene that controls transcription of the DNA to which it is operatively linked. The promoter region includes specific sequences of DNA that are sufficient for RNA polymerase recognition, binding and transcription initiation.
This portion of the promoter region is referred to as the promoter. In addition, the promoter region includes sequences that modulate this recognition, binding and transcription initiation activity of the RNA polymerase. These sequences may be cis acting or may be responsive to trans acting factors. Promoters, depending upon the nature of the regulation, may be constitutive or regulated.
As used herein, a recombinant eukaryotic cell is a eukaryotic cell that contains heterologous DNA or RNA.
As used herein, a recombinant or heterologous calcium channel refers to a calcium channel that contains one or more subunits that are encoded by heterologous DNA that has been introduced into and expressed in a eukaryotic cell that expresses the recombinant calcium channel. A recombinant calcium channel may also include subunits that are produced by DNA endogenous to the cell. In certain embodiments, the recombinant or heterologous calcium channel may contain only subunits that are encoded by heterologous DNA.
As used herein, "functional" with respect to a recombinant or heterologous calcium channel means that the channel is able to provide for and regulate entry of calcium channel-selective ions, including, but not limited to, Ca2+ or Ba2+, in response to a stimulus and/or bind ligands with affinity for the channel. Preferably such calcium channel activity is distinguishable, such as by electrophysiological, pharmacological and other means known to those of skill in the art, from any endogenous calcium channel activity that is in the host cell.
As used herein, a peptide having an amino acid sequence substantially as set forth in a particular SEQ ID No. includes peptides that have the same function but may include minor variations in sequence, such as conservative amino acid changes or minor deletions or insertions that do not alter the activity of the peptide. The activity of a calcium channel receptor subunit peptide refers to its ability to form functional calcium channels with other such subunits.
As used herein, a physiological concentration of a compound is that which is necessary and sufficient for a biological process to occur. For example, a physiological concentration of a calcium channel-selective ion is a concentration of the calcium channel-selective ion necessary and sufficient to provide an inward current when the channel is open.
As used herein, activity of a calcium channel refers to the movement of a calcium channel-selective ion through a calcium channel. Such activity may be measured by any method known to those of skill in the art, including, but not limited to, measurement of the amount of current which flows through the recombinant channel in response to a stimulus.
As used herein, a "functional assay" refers to an assay that identifies functional calcium channels. A functional assay, thus, is an assay to assess function.
As understood by those skilled in the art, assay methods for identifying compounds, such as antagonists and agonists, that modulate calcium channel activity, generally require comparison to a control. One type of a "control" cell or "control" culture is a cell or culture that is treated substantially the same as the cell or culture exposed to the test compound except that the control culture is not exposed to the test compound. Another type of a "control" cell or "control" culture may be a cell or a culture of cells which are identical to the transfected cells except the cells employed for the control culture do not express functional calcium channels. In this situation, the response of the test cell to the test compound is compared to the response (or lack of response) of the calcium channel-negative cell to the test compound, when cells or cultures of each type of cell are exposed to substantially the same reaction conditions in the presence of the compound being assayed. For example, in methods that use patch clamp electrophysiological procedures, the same cell can be tested in the presence and absence of the test compound, by changing the external solution bathing the cell as known in the art.
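The comparison to a control described above can be sketched numerically. The example assumes one peak inward current recorded with the test compound and one without it, both in the same units (for instance pA); the function names and the 20% decision threshold are hypothetical choices for illustration, not values from this specification.

```python
def amplitude_change_pct(test_current: float, control_current: float) -> float:
    """Percent change in current amplitude relative to the control.

    Inward currents are often recorded as negative numbers, so amplitudes
    (absolute values) are compared. Negative result = smaller current.
    """
    control_amplitude = abs(control_current)
    if control_amplitude == 0:
        raise ValueError("control current must be non-zero")
    return 100.0 * (abs(test_current) - control_amplitude) / control_amplitude


def is_candidate_modulator(test_current: float, control_current: float,
                           min_change_pct: float = 20.0) -> bool:
    """Flag a compound whose effect exceeds an arbitrary illustrative threshold."""
    change = amplitude_change_pct(test_current, control_current)
    return abs(change) >= min_change_pct
```

In practice, replicate recordings and a statistical test would replace the single fixed threshold used here.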
It is also understood that each of the subunits disclosed herein may be modified by making conservative amino acid substitutions and the resulting modified subunits are contemplated herein. Suitable conservative substitutions of amino acids are known to those of skill in this art and may be made generally without altering the biological activity of the resulting molecule. Those of skill in this art recognize that, in general, single amino acid substitutions in non-essential regions of a polypeptide do not substantially alter biological activity (see, Watson et al. Molecular Biology of the Gene, 4th Edition, 1987, The Benjamin/Cummings Pub. Co., p.224). Such substitutions are preferably, although not exclusively, made in accordance with those set forth in TABLE 1 as follows:

TABLE 1
Original residue    Conservative substitution
Ala                 Gly; Ser
Arg                 Lys
Asn                 Gln; His
Cys                 Ser
Gln                 Asn
Glu                 Asp
Gly                 Ala; Pro
His                 Asn; Gln
Ile                 Leu; Val
Leu                 Ile; Val
Lys                 Arg; Gln; Glu
Met                 Leu; Tyr; Ile
Phe                 Met; Leu; Tyr
Ser                 Thr
Thr                 Ser
Trp                 Tyr
Tyr                 Trp; Phe
Val                 Ile; Leu

Other substitutions are also permissible and may be determined empirically or in accord with known conservative substitutions. Any such modification of the polypeptide may be effected by any means known to those of skill in this art.
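TABLE 1 lends itself to a simple lookup. The sketch below transcribes the table using standard three-letter amino acid codes and checks whether a given replacement is listed for a given original residue; the dictionary and function names are illustrative. Note that the table is directional as written (for example, Glu is listed as a substitute for Lys, but Lys is not listed for Glu), and the sketch preserves that.

```python
# TABLE 1 as a mapping: original residue -> listed conservative substitutions.
CONSERVATIVE_SUBSTITUTIONS = {
    "Ala": {"Gly", "Ser"},
    "Arg": {"Lys"},
    "Asn": {"Gln", "His"},
    "Cys": {"Ser"},
    "Gln": {"Asn"},
    "Glu": {"Asp"},
    "Gly": {"Ala", "Pro"},
    "His": {"Asn", "Gln"},
    "Ile": {"Leu", "Val"},
    "Leu": {"Ile", "Val"},
    "Lys": {"Arg", "Gln", "Glu"},
    "Met": {"Leu", "Tyr", "Ile"},
    "Phe": {"Met", "Leu", "Tyr"},
    "Ser": {"Thr"},
    "Thr": {"Ser"},
    "Trp": {"Tyr"},
    "Tyr": {"Trp", "Phe"},
    "Val": {"Ile", "Leu"},
}


def is_listed_conservative(original: str, replacement: str) -> bool:
    """True if TABLE 1 lists `replacement` as a substitute for `original`."""
    return replacement in CONSERVATIVE_SUBSTITUTIONS.get(original, set())
```

Residues absent from the table (Asp and Pro have no row) simply return False; as the text says, other substitutions may still be permissible.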
Mutation may be effected by any method known to those of skill in the art, including site-specific or site-directed mutagenesis of DNA encoding the protein and the use of DNA amplification methods using primers to introduce and amplify alterations in the DNA template.
Identification and isolation of DNA encoding human calcium channel subunits
Methods for identifying and isolating DNA encoding α1, α2, β and γ subunits of human calcium channels are provided.
Identification and isolation of such DNA may be accomplished by hybridizing, under appropriate conditions, at least low stringency whereby DNA that encodes the desired subunit is isolated, restriction enzyme-digested human DNA with a labeled probe having at least 14, preferably 16 or more nucleotides and derived from any contiguous portion of DNA having a sequence of nucleotides set forth herein by sequence identification number. Once a hybridizing fragment is identified in the hybridization reaction, it can be cloned employing standard cloning techniques known to those of skill in the art. Full-length clones may be identified by the presence of a complete open reading frame and the identity of the encoded protein verified by sequence comparison with the subunits provided herein and by functional assays to assess calcium channel-forming ability or other function. This method can be used to identify genomic DNA encoding the subunit or cDNA encoding splice variants of human calcium channel subunits generated by alternative splicing of the primary transcript of genomic subunit DNA. For instance, DNA, cDNA or genomic DNA, encoding a calcium channel subunit may be identified by hybridization to a DNA probe and characterized by methods known to those of skill in the art, such as restriction mapping and DNA sequencing, and compared to the DNA provided herein in order to identify heterogeneity or divergence in the sequences of the DNA. Such sequence differences may indicate that the transcripts from which the cDNA was produced result from alternative splicing of a primary transcript, if the non-homologous and homologous regions are clustered, or from a different gene if the non-homologous regions are distributed throughout the cloned DNA.
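The distinction drawn above, clustered differences suggesting a splice variant versus differences distributed throughout suggesting a different gene, can be illustrated with a crude heuristic. The sketch assumes two already-aligned sequences of equal length, splits the alignment into equal bins, and reports what fraction of bins contain at least one mismatch. The bin count, the 0.5 cutoff and all function names are hypothetical; this is not the comparison procedure of the specification.

```python
def mismatch_positions(aligned_a: str, aligned_b: str) -> list:
    """0-based alignment positions at which the two sequences differ."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    return [
        i for i, (x, y) in enumerate(zip(aligned_a.upper(), aligned_b.upper()))
        if x != y
    ]


def fraction_of_bins_with_mismatch(aligned_a: str, aligned_b: str,
                                   n_bins: int = 10) -> float:
    """Fraction of equal-width bins that contain at least one mismatch."""
    positions = mismatch_positions(aligned_a, aligned_b)
    bin_size = max(1, len(aligned_a) // n_bins)
    # Clamp to the last bin so leftover positions at the end are still counted.
    hit_bins = {min(p // bin_size, n_bins - 1) for p in positions}
    return len(hit_bins) / n_bins


def describe_divergence(aligned_a: str, aligned_b: str,
                        n_bins: int = 10, spread_cutoff: float = 0.5) -> str:
    """Label the pattern of differences as identical, clustered or distributed."""
    if not mismatch_positions(aligned_a, aligned_b):
        return "identical"
    spread = fraction_of_bins_with_mismatch(aligned_a, aligned_b, n_bins)
    return "distributed" if spread >= spread_cutoff else "clustered"
```

A "clustered" label would be consistent with alternative splicing of one primary transcript, and "distributed" with a separate gene, but either call would need confirmation by the methods described in the text.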
Any suitable method for isolating genes using the DNA provided herein may be used. For example, oligonucleotides corresponding to regions of sequence differences have been used to isolate, by hybridization, DNA encoding the full-length splice variant and can be used to isolate genomic clones. A probe, based on a nucleotide sequence disclosed herein, which encodes at least a portion of a subunit of a human calcium channel, such as a tissue-specific exon, may be used as a probe to clone related DNA, to clone a full-length cDNA clone or genomic clone encoding the human calcium channel subunit.
Labeled, including, but not limited to, radioactively or enzymatically labeled, RNA or single-stranded DNA of at least 14 substantially contiguous bases, preferably 16 or more, generally at least 30 contiguous bases of a nucleic acid which encodes at least a portion of a human calcium channel subunit, the sequence of which nucleic acid corresponds to a segment of a nucleic acid sequence disclosed herein by reference to a SEQ ID No. are provided. Such nucleic acid segments may be used as probes in the methods provided herein for cloning DNA encoding calcium channel subunits. See, generally, Sambrook et al. (1989) Molecular Cloning: A Laboratory Manual, 2nd Edition, Cold Spring Harbor Laboratory Press.
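The probe lengths stated above (at least 14 contiguous bases, preferably 16 or more) can be illustrated by a small helper that takes a contiguous stretch of a sense-strand DNA sequence and returns the antisense oligonucleotide that would hybridize to it. The function names, the 0-based start coordinate and the enforced 14-base minimum are assumptions made for this sketch.

```python
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(dna: str) -> str:
    """Reverse complement of a DNA string (A<->T, C<->G)."""
    return dna.translate(_COMPLEMENT)[::-1]


def antisense_probe(sense_dna: str, start: int, length: int = 16) -> str:
    """Antisense oligo covering `length` contiguous bases from 0-based `start`."""
    if length < 14:
        raise ValueError("probe should span at least 14 contiguous bases")
    if start < 0 or start + length > len(sense_dna):
        raise ValueError("probe window falls outside the sequence")
    return reverse_complement(sense_dna[start:start + length])
```

Nucleotide positions in published sequence listings are typically 1-based, so a listed position would need converting before being passed as `start`.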
In addition, nucleic acid amplification techniques, which are well known in the art, can be used to locate splice variants of calcium channel subunits by employing oligonucleotides based on DNA sequences surrounding the divergent sequence as primers for amplifying human RNA or genomic DNA. Size and sequence determinations of the amplification products can reveal splice variants. Furthermore, isolation of human genomic DNA sequences by hybridization can yield DNA containing multiple exons, separated by introns, that correspond to different splice variants of transcripts encoding human calcium channel subunits.
DNA encoding types and subtypes of each of the α1, α2, β and γ subunits of voltage-dependent human calcium channels has been cloned herein by nucleic acid amplification of cDNA from selected tissues or by screening human cDNA libraries prepared from isolated poly A+ mRNA from cell lines or tissue of human origin having such calcium channels. Among the sources of such cells or tissue for obtaining mRNA are human brain tissue or a human cell line of neural origin, such as a neuroblastoma cell line, human skeletal muscle or smooth muscle cells, and the like. Methods of preparing cDNA libraries are well known in the art [see generally Ausubel et al. (1987) Current Protocols in Molecular Biology, Wiley-Interscience, New York; and Davis et al. (1986) Basic Methods in Molecular Biology, Elsevier Science Publishing Co., New York].
Preferred regions from which to construct probes include 5' and/or 3' coding sequences, sequences predicted to encode transmembrane domains, sequences predicted to encode cytoplasmic loops, signal sequences, ligand-binding sites, and other functionally significant sequences (see Table 2, below). Either the full-length subunit-encoding DNA or fragments thereof can be used as probes, preferably labeled with suitable label means for ready detection. When fragments are used as probes, preferably the DNA sequences will be typically from the carboxyl-end-encoding portion of the DNA, and most preferably will include predicted transmembrane domain-encoding portions based on hydropathy analysis of the deduced amino acid sequence [see Kyte and Doolittle (1982) J. Mol. Biol. 157:105].
Riboprobes that are specific for human calcium channel subunit types or subtypes have been prepared. These probes are useful for identifying expression of particular subunits in selected tissues and cells. The regions from which the probes were prepared were identified by comparing the DNA and amino acid sequences of all known α or β subunit subtypes. Regions of least homology, preferably human-derived sequences, and generally about 250 to about 600 nucleotides, were selected. Numerous riboprobes for α and β subunits have been prepared; some of these are listed in the following Table.
TABLE 2
SUMMARY OF RNA PROBES

SUBUNIT SPECIFICITY   NUCLEOTIDE POSITION   PROBE NAME        PROBE TYPE   ORIENTATION
α1A generic           3357-3840             pGEM7Zα1A*        riboprobe    n/a
                      761-790               SE700             oligo        antisense
                      3440-3464             SE718             oligo        antisense
                                            SE724             oligo        sense
α1B generic           3091-3463             pGEM7ZaB,,.       riboprobe    n/a
                                            pGEM7Zc1B.0.h     riboprobe    n/a
α1B-1 specific        6490-6676             pCRII α1B-1/187   riboprobe    n/a
α1E generic           3114-3462             pGEM7Zα1E         riboprobe    n/a
α2b                   1321-1603             pCRIIα2b          riboprobe    n/a
β generic(?)          212-236               SE300             oligo        antisense
β1 generic            1267-1291             SE301             oligo        antisense
β1-2 specific         1333-1362             SE17              oligo        antisense
                      1682-1706             SE23              oligo        sense
                      2742-2766             SE43              oligo        antisense
                      27-56                 SE208             oligo        antisense
                      340-364               SE274             oligo        antisense
                                            SE275             oligo        sense
β3 specific           1309-1509                               riboprobe    n/a
/ specific            1228-1560                               riboprobe    n/a

* U.S. Patent No. 4,766,072.
The above-noted nucleotide regions are also useful in selecting regions of the protein for preparation of subunit-specific antibodies, discussed below.
The DNA clones and fragments thereof provided herein thus can be used to isolate genomic clones encoding each subunit and to isolate any splice variants by hybridization screening of libraries prepared from different human tissues. Nucleic acid amplification techniques, which are well known in the art, can also be used to locate DNA encoding splice variants of human calcium channel subunits. This is accomplished by employing oligonucleotides based on DNA sequences surrounding divergent sequence(s) as primers for amplifying human RNA or genomic DNA. Size and sequence determinations of the amplification products can reveal the existence of splice variants. Furthermore, isolation of human genomic DNA sequences by hybridization can yield DNA containing multiple exons, separated by introns, that correspond to different splice variants of transcripts encoding human calcium channel subunits.
Once DNA encoding a calcium channel subunit is isolated, ribonuclease (RNase) protection assays can be employed to determine which tissues express mRNA encoding a particular calcium channel subunit or variant. These assays provide a sensitive means for detecting and quantitating an RNA species in a complex mixture of total cellular RNA. The subunit DNA is labeled and hybridized with cellular RNA. If complementary mRNA is present in the cellular RNA, a DNA-RNA hybrid results. The RNA sample is then treated with RNase, which degrades single-stranded RNA. Any RNA-DNA hybrids are protected from RNase degradation and can be visualized by gel electrophoresis and autoradiography. In situ hybridization techniques can also be used to determine which tissues express mRNA encoding a particular calcium channel subunit. The labeled subunit DNAs are hybridized to different tissue slices to visualize subunit mRNA expression.
With respect to each of the respective subunits (α1, α2, β or γ) of human calcium channels, once the DNA encoding the channel subunit was identified by a nucleic acid screening method, the isolated clone was used for further screening to identify overlapping clones. Some of the cloned DNA fragments can be, and have been, subcloned into an appropriate vector such as pIBI24/25 (IBI, New Haven, CT), M13mp18/19, pGEM4, pGEM3, pGEM7Z, pSP72 and other such vectors known to those of skill in this art, and characterized by DNA sequencing and restriction enzyme mapping. A sequential series of overlapping clones may thus be generated for each of the subunits until a full-length clone can be prepared by methods, known to those of skill in the art, that include identification of translation initiation (start) and translation termination (stop) codons. For expression of the cloned DNA, the 5' noncoding region and other transcriptional and translational control regions of such a clone may be replaced with an efficient ribosome binding site and other regulatory regions as known in the art. Other modifications of the 5' end, known to those of skill in the art, that may be required to optimize translation and/or transcription efficiency may also be effected, if deemed necessary.
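Identifying translation start and stop codons in an assembled clone can be sketched as a search for the longest open reading frame. The example below scans only the three forward reading frames of a DNA string, treats ATG as the start codon and TAA, TAG and TGA as stop codons, and returns half-open 0-based coordinates; the function name and these simplifications are assumptions, and a real analysis would also consider the reverse strand and sequence context.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def longest_orf(dna: str) -> tuple:
    """Return (start, end) of the longest ATG...stop ORF, end exclusive.

    Returns (0, 0) when no complete open reading frame is found.
    Only the forward strand is examined.
    """
    dna = dna.upper()
    best = (0, 0)
    for frame in range(3):
        start = None
        for i in range(frame, len(dna) - 2, 3):
            codon = dna[i:i + 3]
            if start is None and codon == "ATG":
                start = i  # first in-frame start codon opens the ORF
            elif start is not None and codon in STOP_CODONS:
                if (i + 3 - start) > (best[1] - best[0]):
                    best = (start, i + 3)
                start = None  # look for the next ORF in this frame
    return best
```

For instance, `dna[start:end]` on the returned coordinates gives the coding region including its stop codon, which can then be translated and compared with the subunit sequences provided.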
Examples II-VIII below describe in detail the cloning of each of the various subunits of a human calcium channel as well as subtypes and splice variants, including tissue-specific variants thereof. In the few instances in which partial sequences of a subunit are disclosed, it is well within the skill of the art, in view of the teaching herein, to obtain the corresponding full-length clones and sequence thereof encoding the subunit, subtype or splice variant thereof using the methods described above and exemplified below.
Identification and isolation of DNA encoding α1 subunits
A number of voltage-dependent calcium channel α1 subunit genes, which are expressed in the human CNS and in other tissues, have been identified and have been designated as α1A, α1B (or VDCC IV), α1C (or VDCC II), α1D (or VDCC III) and α1E.
DNA that encodes each of the subunit types has been isolated from a human neural cDNA library. DNA encoding subtypes of each of the types, which arise as splice variants, is also provided. Subtypes are herein designated, for example, as α1B-1 and α1B-2.
The α1 subunit types A, B, C, D and E of voltage-dependent calcium channels, and subtypes thereof, differ with respect to sensitivity to known classes of calcium channel agonists and antagonists, such as DHPs, phenylalkylamines, omega conotoxin (ω-CgTx), the funnel web spider toxin ω-Aga-IV, and pyrazonoylguanidines. These subunit types also appear to differ in the holding potential and in the kinetics of currents produced upon depolarization of cell membranes containing calcium channels that include different types of α1 subunits.
DNA that encodes an α1 subunit that binds to at least one compound selected from among dihydropyridines, phenylalkylamines, ω-CgTx, components of funnel web spider toxin, and pyrazonoylguanidines is provided. For example, the α1B subunit provided herein appears to specifically interact with ω-CgTx in N-type channels, and the α1D subunit provided herein specifically interacts with DHPs in L-type channels.
Identification and isolation of DNA encoding the α1D human calcium channel subunit
The α1D subunit cDNA has been isolated using fragments of the rabbit skeletal muscle calcium channel α1 subunit cDNA as a probe to screen a cDNA library of a human neuroblastoma cell line, IMR32, to obtain clone α1.36. This clone was used as a probe to screen additional IMR32 cell cDNA libraries to obtain overlapping clones, which were then employed for screening until a sufficient series of clones to span the length of the nucleotide sequence encoding the human α1D subunit was obtained. Full-length clones encoding α1D were constructed by ligating portions of partial α1D clones as described in Example II. SEQ ID No. 1 shows the 7,635 nucleotide sequence of the cDNA encoding the α1D subunit. There is a 6,483 nucleotide sequence reading frame which encodes a sequence of 2,161 amino acids (as set forth in SEQ ID No. 1). SEQ ID No. 2 provides the sequence of an alternative exon encoding the IS6 transmembrane domain [see Tanabe, et al. (1987) Nature 328:313-318 for a description of transmembrane domain terminology] of the α1D subunit.
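The figures quoted above are mutually consistent, as a short arithmetic check illustrates. The following is only a sketch: the constant and function names are illustrative and do not appear in the specification, and the check assumes that the stated reading-frame length counts sense codons only.

```python
# Figures taken from the text: cDNA length, reading-frame length, residue count.
CDNA_LENGTH_NT = 7635
READING_FRAME_NT = 6483
STATED_RESIDUES = 2161


def residues_encoded(frame_nt: int) -> int:
    """Return the number of codons in a reading frame of frame_nt nucleotides."""
    if frame_nt % 3 != 0:
        raise ValueError("reading frame length is not a multiple of three")
    return frame_nt // 3


encoded = residues_encoded(READING_FRAME_NT)
# Nucleotides of the cDNA lying outside the stated reading frame.
noncoding_nt = CDNA_LENGTH_NT - READING_FRAME_NT

print(encoded == STATED_RESIDUES, noncoding_nt)
```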
SEQ ID No. 1 also shows the 2,161 amino acid sequence deduced from the human neuronal calcium channel α1D subunit DNA. Based on the amino acid sequence, the α1D protein has a calculated Mr of 245,163. The α1 subunit of the calcium channel contains four putative internal repeated sequence regions. Four internally repeated regions represent 24 putative transmembrane segments, and the amino- and carboxyl-termini extend intracellularly.
The α1D subunit has been shown to mediate DHP-sensitive, high-voltage-activated, long-lasting calcium channel activity. This calcium channel activity was detected when oocytes were co-injected with RNA transcripts encoding an α1D and β1 or α1D, α2b and β1 subunits. This activity was distinguished from Ba2+ currents detected when oocytes were injected with RNA transcripts encoding the β1 ± α2b subunits. These currents pharmacologically and biophysically resembled Ca2+ currents reported for uninjected oocytes.
Identification and isolation of DNA encoding the α1A human calcium channel subunit
Biological material containing DNA encoding a portion of the α1A subunit has been deposited in the American Type Culture Collection, 12301 Parklawn Drive, Rockville, Maryland 20852 U.S.A. under the terms of the Budapest Treaty on the International Recognition of Deposits of Microorganisms for Purposes of Patent Procedure and the Regulations promulgated under this Treaty. Samples of the deposited material are and will be available to industrial property offices and other persons legally entitled to receive them under the terms of the Treaty and Regulations and otherwise in compliance with the patent laws and regulations of the United States of America and all other nations or international organizations in which this application, or an application claiming priority of this application, is filed or in which any patent granted on any such application is granted.
A portion of an α1A subunit is encoded by an approximately 3 kb insert in λgt10 phage designated α1.254 in E. coli host strain NM514. A phage lysate of this material has been deposited at the American Type Culture Collection under ATCC Accession No. 75293, as described above. DNA encoding α1A may also be identified by screening with a probe prepared from DNA that has SEQ ID No. 21:
5' CTCAGTACCATCTCTGATACCAGCCCCA 3'
α1A splice variants have been obtained. The sequences of two α1A splice variants, α1A-1 and α1A-2, are set forth in SEQ ID Nos. 22 and 23. Other splice variants may be obtained by screening a human library as described above or using all or a portion of the sequences set forth in SEQ ID Nos. 22 and 23.
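When an oligonucleotide such as the SEQ ID No. 21 probe is used for screening, its length, G+C content and reverse complement are routinely computed beforehand. The following is a minimal sketch of those calculations; the function names are illustrative and not part of the specification, and only the probe sequence itself is taken from the text.

```python
# Probe sequence as printed for SEQ ID No. 21 (5' to 3').
PROBE = "CTCAGTACCATCTCTGATACCAGCCCCA"

_COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence written 5' to 3'."""
    return "".join(_COMPLEMENT[base] for base in reversed(seq.upper()))


def gc_fraction(seq: str) -> float:
    """Return the fraction of G and C bases in the sequence."""
    seq = seq.upper()
    return (seq.count("G") + seq.count("C")) / len(seq)


probe_length = len(PROBE)
probe_gc = gc_fraction(PROBE)
probe_rc = reverse_complement(PROBE)

print(probe_length, round(probe_gc, 3), probe_rc)
```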
Identification and isolation of DNA encoding the α1B human calcium channel subunit
DNA encoding the α1B subunit was isolated by screening a human basal ganglia cDNA library with fragments of the rabbit skeletal muscle calcium channel α1 subunit-encoding cDNA. A portion of one of the positive clones was used to screen an IMR32 cell cDNA library. Clones that hybridized to the basal ganglia DNA probe were used to further screen an IMR32 cell cDNA library to identify overlapping clones that in turn were used to screen a human hippocampus cDNA library. In this way, a sufficient series of clones to span nearly the entire length of the nucleotide sequence encoding the human α1B subunit was obtained. Nucleic acid amplification of specific regions of the IMR32 cell α1B mRNA yielded additional segments of the α1B coding sequence.
A full-length α1B DNA clone was constructed by ligating portions of the partial cDNA clones as described in Example II.C. SEQ ID Nos. 7 and 8 show the nucleotide sequences of DNA clones encoding the α1B subunit as well as the deduced amino acid sequences. The α1B subunit encoded by SEQ ID No. 7 is referred to as the α1B-1 subunit to distinguish it from another α1B subunit, α1B-2, encoded by the nucleotide sequence shown as SEQ ID No. 8, which is derived from alternative splicing of the α1B subunit transcript.
Nucleic acid amplification of IMR32 cell mRNA using oligonucleotide primers designed according to nucleotide sequences within the α1B-encoding DNA has identified variants of the α1B transcript that appear to be splice variants because they contain divergent coding sequences.
Identification and isolation of DNA encoding the α1C human calcium channel subunit
Numerous α1C-specific DNA clones were isolated. Characterization of the sequence revealed the α1C coding sequence, the initiation of translation sequence, and an alternatively spliced region of α1C. Alternatively spliced variants of the α1C subunit have been identified. SEQ ID No. 3 sets forth DNA encoding a substantial portion of an α1C subunit. The DNA sequences set forth in SEQ ID No. 4 and No. 5 encode two possible amino terminal ends of the α1C protein. SEQ ID No. 6 encodes an alternative exon for the IV S3 transmembrane domain. The sequences of substantial portions of two α1C splice variants, designated α1C-1 and α1C-2, are set forth in SEQ ID Nos. 3 and 36, respectively.
The isolation and identification of DNA clones encoding portions of the α1C subunit is described in detail in Example II.
Identification and isolation of DNA encoding the α1E human calcium channel subunit
DNA encoding α1E human calcium channel subunits has been isolated from an oligo dT-primed human hippocampus library. The resulting clones, which are splice variants, were designated α1E-1 and α1E-3. The subunit designated α1E-1 has the amino acid sequence set forth in SEQ ID No. 24, and a subunit designated α1E-3 has the amino acid sequence set forth in SEQ ID No. 25. These splice variants differ by virtue of a 57 base pair insert between nucleotides 2405 and 2406 of SEQ ID No. 24.
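A 57 base pair insert is a multiple of three nucleotides, so, assuming it lies within the coding region and carries no in-frame stop codon, it would add 19 codons without shifting the downstream reading frame. The following sketch illustrates the arithmetic and the 1-based insertion convention; the helper names and the short toy sequence in the comment are hypothetical and are not taken from SEQ ID Nos. 24 or 25.

```python
INSERT_LENGTH_BP = 57    # insert size stated in the text
LEFT_FLANK_POSITION = 2405  # insert lies between nucleotides 2405 and 2406


def preserves_reading_frame(insert_length: int) -> bool:
    """An insertion keeps the downstream frame only if its length is a multiple of 3."""
    return insert_length % 3 == 0


def insert_after(sequence: str, left_position: int, insert: str) -> str:
    """Insert `insert` after the 1-based nucleotide `left_position` of `sequence`."""
    if not 0 <= left_position <= len(sequence):
        raise ValueError("insertion position lies outside the sequence")
    return sequence[:left_position] + insert + sequence[left_position:]


in_frame = preserves_reading_frame(INSERT_LENGTH_BP)
added_codons = INSERT_LENGTH_BP // 3

# Toy illustration of the coordinate convention: insert_after("AAAA", 2, "GG") gives "AAGGAA".
print(in_frame, added_codons)
```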
The α1E subunits provided herein appear to participate in the formation of calcium channels that have properties of high-voltage activated calcium channels and low-voltage activated channels. These channels are rapidly inactivating compared to other high voltage-activated calcium channels. In addition, these channels exhibit pharmacological profiles that are similar to voltage-activated channels, but are also sensitive to DHPs and ω-Aga-IVA, which block certain high voltage-activated channels. Additional details regarding the electrophysiology and pharmacology of channels containing α1E subunits are provided in Example VII.F.
Identification and isolation of DNA encoding additional α1 human calcium channel subunit types and subtypes
DNA encoding additional α1 subunits can be isolated and identified using the DNA provided herein as described for the α1A, α1B, α1C, α1D and α1E subunits or using other methods known to those of skill in the art. In particular, the DNA provided herein may be used to screen appropriate libraries to isolate related DNA. Full-length clones can be constructed using methods, such as those described herein, and the resulting subunits characterized by comparison of their sequences and electrophysiological and pharmacological properties with the subunits exemplified herein.
Identification and isolation of DNA encoding β human calcium channel subunits
DNA encoding β1
To isolate DNA encoding the β1 subunit, a human hippocampus cDNA library was screened by hybridization to a DNA fragment encoding a rabbit skeletal muscle calcium channel β subunit. A hybridizing clone was selected and was in turn used to isolate overlapping clones until the overlapping clones encompassing DNA encoding the entire human calcium channel β subunit were isolated and sequenced.
Five alternatively spliced forms of the human calcium channel β1 subunit have been identified and DNA encoding a number of forms has been isolated. These forms are designated β1-1, expressed in skeletal muscle, β1-2, expressed in the CNS, β1-3, also expressed in the CNS, β1-4, expressed in aorta tissue and HEK 293 cells, and β1-5, expressed in HEK 293 cells. Full-length DNA clones encoding the β1-2 and β1-3 subunits have been constructed. The subunits β1-1, β1-2, β1-4 and β1-5 have been identified by nucleic acid amplification analysis as alternatively spliced forms of the β1 subunit. Sequences of the β1 splice variants are set forth in SEQ ID Nos. 9, 10 and 33-35.
DNA encoding β2
DNA encoding the β2 splice variants has been obtained. These splice variants include β2C-β2E. Splice variants β2C-β2E include all of the sequence set forth in SEQ ID No. 26, except for the portion at the 5' end (up to nucleotide 182), which differs among splice variants. The sequence set forth in SEQ ID No. 26 encodes β2D. Additional splice variants may be isolated using the methods described herein and oligonucleotides including all or portions of the DNA set forth in SEQ ID No. 26 or may be prepared or obtained as described in the Examples. The sequences of β2 splice variants β2C and β2E are set forth in SEQ ID Nos. 37 and 38, respectively.
DNA encoding β3
DNA encoding the β3 subunit and any splice variants thereof may be isolated by screening a library, as described above for the β1 subunit, using DNA probes prepared according to SEQ ID Nos. 19 or 20 or using all or a portion of the deposited β3 clone, plasmid β1.42 (ATCC Accession No. 69048).
The E. coli host containing plasmid β1.42 that includes DNA encoding a β3 subunit has been deposited as ATCC Accession No. 69048 in the American Type Culture Collection, 12301 Parklawn Drive, Rockville, Maryland 20852 U.S.A. under the terms of the Budapest Treaty on the International Recognition of Deposits of Microorganisms for Purposes of Patent Procedure and the Regulations promulgated under this Treaty. Samples of the deposited material are and will be available to industrial property offices and other persons legally entitled to receive them under the terms of the Treaty and Regulations and otherwise in compliance with the patent laws and regulations of the United States of America and all other nations or international organizations in which this application, or an application claiming priority of this application, is filed or in which any patent granted on any such application is granted.
The β3-encoding plasmid is designated β1.42. The plasmid contains a 2.5 kb EcoRI fragment encoding β3 inserted into vector pGem®7zF(+) and has been deposited in E. coli host strain DH5α. The sequences of β3 splice variants, designated β3-1 and β3-2, are set forth in SEQ ID Nos. 19 and 20, respectively.
Identification and isolation of DNA encoding the α2 human calcium channel subunit
DNA encoding a human neuronal calcium channel α2 subunit was isolated in a manner substantially similar to that used for isolating DNA encoding an α1 subunit, except that a human genomic DNA library was probed under low and high stringency conditions with a fragment of DNA encoding the rabbit skeletal muscle calcium channel α2 subunit. The fragment included nucleotides having a sequence corresponding to the nucleotide sequence between nucleotides 43 and 272 inclusive of rabbit back skeletal muscle calcium channel α2 subunit cDNA as disclosed in PCT International Patent Application Publication No. WO 89/09834, which corresponds to U.S. Application Serial No. 07/620,520 (now allowed U.S. Application Serial No. 07/914,231), which is a continuation-in-part of United States Serial No. 176,899, filed April 4, 1988.
Example IV describes the isolation of DNA clones encoding a2 subunits of a human calcium channel from a human DNA library using genomic DNA and cDNA clones, identified by hybridization to the genomic DNA, as probes.
SEQ ID Nos. 11 and 29-32 show the sequence of DNA encoding α2 subunits. As described in Example V, nucleic acid amplification analysis of RNA from human skeletal muscle, brain tissue and aorta using oligonucleotide primers specific for a region of the human neuronal α2 subunit cDNA that diverges from the rabbit skeletal muscle calcium channel α2 subunit cDNA identified splice variants of the human calcium channel α2 subunit transcript.
Identification and isolation of DNA encoding γ human calcium channel subunits
DNA encoding a portion of a human neuronal calcium channel γ subunit has been isolated as described in detail in Example VI. SEQ ID No. 14 shows the nucleotide sequence at the 3'-end of this DNA, which includes a reading frame encoding a sequence of 43 amino acid residues. Since the portion that has been obtained is homologous to the rabbit clone, described in allowed co-owned U.S. Application Serial No. 07/482,384, the remainder of the clone can be obtained using routine methods.
Antibodies
Antibodies, monoclonal or polyclonal, specific for calcium channel subunit subtypes or for calcium channel types can be prepared employing standard techniques, known to those of skill in the art, using the subunit proteins or portions thereof as antigens. Anti-peptide and anti-fusion protein antibodies can be used [see, for example, Bahouth et al. (1991) Trends Pharmacol. Sci. 12:338-343; Current Protocols in Molecular Biology (Ausubel et al., eds.) John Wiley and Sons, New York (1984)]. Factors to consider in selecting portions of the calcium channel subunits for use as immunogens (as either a synthetic peptide or a recombinantly produced bacterial fusion protein) include antigenicity, accessibility (i.e., extracellular and cytoplasmic domains), uniqueness to the particular subunit, and other factors known to those of skill in this art.
The availability of subunit-specific antibodies makes possible the application of the technique of immunohistochemistry to monitor the distribution and expression density of various subunits (e.g., in normal vs diseased brain tissue). Such antibodies could also be employed in diagnostic applications, such as LES diagnosis, and therapeutic applications, such as using antibodies that modulate activities of calcium channels.
The antibodies can be administered to a subject employing standard methods, such as, for example, by intraperitoneal, intramuscular, intravenous, or subcutaneous injection, implant or transdermal modes of administration, and the like. One of skill in the art can empirically determine dose forms, treatment regimens, etc., depending on the mode of administration employed.
Subunit-specific monoclonal antibodies and polyclonal antisera have been prepared. The regions from which the antigens came were identified by comparing the DNA and amino acid sequences of all known α1 or β subunit subtypes. Regions of least homology, preferably human-derived sequences, were selected. The selected regions or fusion proteins containing the selected regions are used as immunogens. Hydrophobicity analyses of residues in selected protein regions and fusion proteins are also performed; regions of high hydrophobicity are avoided. Also, and more importantly, when preparing fusion proteins in bacterial hosts, rare codons are avoided. In particular, inclusion of 3 or more successive rare codons in a selected host is avoided. Numerous antibodies, polyclonal and monoclonal, specific for α1 or β subunit types or subtypes have been prepared; some of these are listed in the following Table. Exemplary antibodies and peptide antigens used to prepare the antibodies are set forth in the following Table:

TABLE 3
SPECIFICITY            AMINO ACID NUMBER   ANTIGEN NAME               ANTIBODY TYPE
α1 generic             112-140             peptide 1A#1               polyclonal
α1 generic             1420-1447           peptide 1A#2               polyclonal
α1A generic            1048-1208           α1A#2(b) GST fusion*       polyclonal, monoclonal
α1B generic            983-1106            α1B2 GST fusion            polyclonal, monoclonal
α1B-1                  2164-2339           α1B-1#3 GST fusion         polyclonal
α1B-2                  2164-2237           α1B-2#4 GST fusion         polyclonal
α1E generic (α1E-3)    985-1004            α1E#2(a) GST fusion        polyclonal

* GST gene fusion system is available from Pharmacia; see also, Smith et al. (1988) Gene 67:31. The system provides pGEX plasmids that are designed for inducible, high-level expression of genes or gene fragments as fusions with Schistosoma japonicum GST. Upon expression in a bacterial host, the resulting fusion proteins are purified from bacterial lysates by affinity chromatography.
The GST fusion proteins are each specific for the cytoplasmic loop region IIS6-IIIS1, which is a region of low subtype homology for all subtypes, including α1C and α1D, for which similar fusions and antisera can be prepared.
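The rare-codon criterion described above for designing bacterial fusion proteins (avoiding 3 or more successive rare codons) can be checked mechanically on a candidate coding sequence. The following is a minimal sketch only: the set of rare codons is an illustrative assumption for an E. coli-like host and is not taken from the specification, and the function name is hypothetical. In practice the set would be derived from a codon usage table for the selected host.

```python
# Illustrative assumption: codons often cited as rare in E. coli; not from the specification.
ILLUSTRATIVE_RARE_CODONS = frozenset({"AGG", "AGA", "CGA", "CTA", "ATA", "CCC"})


def rare_codon_runs(cds, rare=ILLUSTRATIVE_RARE_CODONS, min_run=3):
    """Return (start_codon_index, run_length) for each run of >= min_run rare codons.

    `cds` is read in frame from its first nucleotide; trailing bases that do
    not complete a codon are ignored. Codon indices are 0-based.
    """
    cds = cds.upper()
    usable = len(cds) - len(cds) % 3
    codons = [cds[i:i + 3] for i in range(0, usable, 3)]
    runs = []
    run_start = None
    # A trailing None sentinel closes any run that reaches the end of the sequence.
    for index, codon in enumerate(codons + [None]):
        if codon in rare:
            if run_start is None:
                run_start = index
        else:
            if run_start is not None and index - run_start >= min_run:
                runs.append((run_start, index - run_start))
            run_start = None
    return runs


# Toy sequences (hypothetical, for illustration only).
flagged = rare_codon_runs("ATGAGGAGACGAGCT")   # ATG AGG AGA CGA GCT
accepted = rare_codon_runs("ATGAGGAGAGCT")     # ATG AGG AGA GCT
print(flagged, accepted)
```

A candidate region for which the function returns an empty list satisfies the criterion under the assumed codon set; a non-empty result marks where the region would need to be shifted or trimmed.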
Preparation of recombinant eukaryotic cells containing DNA encoding heterologous calcium channel subunits
DNA encoding one or more of the calcium channel subunits or a portion of a calcium channel subunit may be introduced into a host cell for expression or replication of the DNA.
Such DNA may be introduced using methods described in the following examples or using other procedures well known to those skilled in the art. Incorporation of cloned DNA into a suitable expression vector, transfection of eukaryotic cells with a plasmid vector or a combination of plasmid vectors, each encoding one or more distinct genes or with linear DNA, and selection of transfected cells are also well known in the art [see, Sambrook et al. (1989) Molecular Cloning: A Laboratory Manual, Second Edition, Cold Spring Harbor Laboratory Press]. Cloned full-length DNA encoding any of the subunits of a human calcium channel may be introduced into a plasmid vector for expression in a eukaryotic cell. Such DNA may be genomic DNA or cDNA. Host cells may be transfected with one or a combination of the plasmids, each of which encodes at least one calcium channel subunit. Alternatively, host cells may be transfected with linear DNA using methods well known to those of skill in the art.
While the DNA provided herein may be expressed in any eukaryotic cell, including yeast cells such as P. pastoris [see, Cregg et al. (1987) Bio/Technology 5:479], mammalian expression systems for expression of the DNA encoding the human calcium channel subunits provided herein are preferred.
The heterologous DNA may be introduced by any method known to those of skill in the art, such as transfection with a vector encoding the heterologous DNA. Particularly preferred vectors for transfection of mammalian cells are the pSV2dhfr expression vectors, which contain the SV40 early promoter, mouse dhfr gene, SV40 polyadenylation and splice sites and sequences necessary for maintaining the vector in bacteria; cytomegalovirus (CMV) promoter-based vectors such as pCDNA1 or pcDNA-amp; and MMTV promoter-based vectors. DNA encoding the human calcium channel subunits has been inserted in the vector pCDNA1 at a position immediately following the CMV promoter. The vector pCDNA1 is presently preferred.
Stably or transiently transfected mammalian cells may be prepared by methods known in the art by transfecting cells with an expression vector having a selectable marker gene such as the gene for thymidine kinase, dihydrofolate reductase, neomycin resistance or the like, and, for transient transfection, growing the transfected cells under conditions selective for cells expressing the marker gene. Functional voltage-dependent calcium channels have been produced in HEK 293 cells transfected with a derivative of the vector pCDNA1 that contains DNA encoding a human calcium channel subunit.
The heterologous DNA may be maintained in the cell as an episomal element or may be integrated into chromosomal DNA of the cell. The resulting recombinant cells may then be cultured or subcultured (or passaged, in the case of mammalian cells) from such a culture or a subculture thereof. Methods for transfection, injection and culturing recombinant cells are known to the skilled artisan. Eukaryotic cells in which DNA or RNA may be introduced include any cells that are transfectable by such DNA or RNA or into which such DNA may be injected. Virtually any eukaryotic cell can serve as a vehicle for heterologous DNA. Preferred cells are those that can also express the DNA and RNA and most preferred cells are those that can form recombinant or heterologous calcium channels that include one or more subunits encoded by the heterologous DNA. Such cells may be identified empirically or selected from among those known to be readily transfected or injected. Preferred cells for introducing DNA include those that can be transiently or stably transfected and include, but are not limited to, cells of mammalian origin, such as COS cells, mouse L cells, CHO cells, human embryonic kidney cells, African green monkey cells and other such cells known to those of skill in the art, amphibian cells, such as Xenopus laevis oocytes, or those of yeast such as Saccharomyces cerevisiae or Pichia pastoris. Preferred cells for expressing injected RNA transcripts or cDNA include Xenopus laevis oocytes. Cells that are preferred for transfection of DNA are those that can be readily and efficiently transfected. Such cells are known to those of skill in the art or may be empirically identified.
Preferred cells include DG44 cells and HEK 293 cells, particularly HEK 293 cells that can be frozen in liquid nitrogen and then thawed and regrown. Such HEK 293 cells are described, for example, in U.S. Patent No. 5,024,939 to Gorman [see, also Stillman et al. (1985) Mol. Cell. Biol. 5:2051-2060]. The cells may be used as vehicles for replicating heterologous DNA introduced therein or for expressing the heterologous DNA introduced therein. In certain embodiments, the cells are used as vehicles for expressing the heterologous DNA as a means to produce substantially pure human calcium channel subunits or heterologous calcium channels. Host cells containing the heterologous DNA may be cultured under conditions whereby the calcium channels are expressed. The calcium channel subunits may be purified using protein purification methods known to those of skill in the art. For example, antibodies, such as those provided herein, that specifically bind to one or more of the subunits may be used for affinity purification of the subunit or calcium channels containing the subunits.
Substantially pure subunits of a human calcium channel, including α1 subunits of a human calcium channel, α2 subunits of a human calcium channel, β subunits of a human calcium channel and γ subunits of a human calcium channel, are provided.
Substantially pure isolated calcium channels that contain at least one of the human calcium channel subunits are also provided. Substantially pure calcium channels that contain a mixture of one or more subunits encoded by the host cell and one or more subunits encoded by heterologous DNA or RNA that has been introduced into the cell are also provided.
Substantially pure subtype- or tissue-type specific calcium channels are also provided.
In other embodiments, eukaryotic cells that contain heterologous DNA encoding at least one of an α1 subunit of a human calcium channel, an α2 subunit of a human calcium channel, a β subunit of a human calcium channel and a γ subunit of a human calcium channel are provided. In accordance with one preferred embodiment, the heterologous DNA is expressed in the eukaryotic cell and preferably encodes a human calcium channel α1 subunit.
Expression of heterologous calcium channels: electrophysiology and pharmacology
Electrophysiological methods for measuring calcium channel activity are known to those of skill in the art and are exemplified herein. Any such methods may be used in order to detect the formation of functional calcium channels and to characterize the kinetics and other characteristics of the resulting currents. Pharmacological studies may be combined with the electrophysiological measurements in order to further characterize the calcium channels.
With respect to measurement of the activity of functional heterologous calcium channels, preferably, endogenous ion channel activity and, if desired, heterologous channel activity of channels that do not contain the desired subunits, of a host cell can be inhibited to a significant extent by chemical, pharmacological and electrophysiological means, including the use of differential holding potential, to increase the S/N ratio of the measured heterologous calcium channel activity.
Thus, various combinations of subunits encoded by the DNA provided herein are introduced into eukaryotic cells. The resulting cells can be examined to ascertain whether functional channels are expressed and to determine the properties of the channels. In particularly preferred aspects, the eukaryotic cell which contains the heterologous DNA expresses it and forms a recombinant functional calcium channel activity. In more preferred aspects, the recombinant calcium channel activity is readily detectable because it is a type that is absent from the untransfected host cell or is of a magnitude and/or pharmacological properties or exhibits biophysical properties not exhibited in the untransfected cell.
The eukaryotic cells can be transfected with various combinations of the subunit subtypes provided herein. The resulting cells will provide a uniform population of calcium channels for study of calcium channel activity and for use in the drug screening assays provided herein. Experiments that have been performed have demonstrated the inadequacy of prior classification schemes.
Preferred among transfected cells is a recombinant eukaryotic cell with a functional heterologous calcium channel. The recombinant cell can be produced by introduction of and expression of heterologous DNA or RNA transcripts encoding an α1 subunit of a human calcium channel, more preferably also expressing a heterologous DNA encoding a β subunit of a human calcium channel and/or heterologous DNA encoding an α2 subunit of a human calcium channel. Especially preferred is the expression in such a recombinant cell of each of the α1, β and α2 subunits encoded by such heterologous DNA or RNA transcripts, and optionally expression of heterologous DNA or an RNA transcript encoding a γ subunit of a human calcium channel. The functional calcium channels may preferably include at least an α1 subunit and a β subunit of a human calcium channel. Eukaryotic cells expressing these two subunits, and also cells expressing additional subunits, have been prepared by transfection of DNA and by injection of RNA transcripts. Such cells have exhibited voltage-dependent calcium channel activity attributable to calcium channels that contain one or more of the heterologous human calcium channel subunits. For example, eukaryotic cells expressing heterologous calcium channels containing an α2 subunit in addition to the α1 subunit and a β subunit have been shown to exhibit increased calcium selective ion flow across the cellular membrane in response to depolarization, indicating that the α2 subunit may potentiate calcium channel function.
Cells have been co-transfected with increasing ratios of α2 to α1 and the activity of the resulting calcium channels has been measured. The results indicate that increasing the amount of α2-encoding DNA relative to the other transfected subunits increases calcium channel activity.
Eukaryotic cells which express heterologous calcium channels containing at least a human α1 subunit, a human β subunit and a human α2 subunit are preferred. Eukaryotic cells transformed with a composition containing cDNA or an RNA transcript that encodes an α1 subunit alone or in combination with a β and/or an α2 subunit may be used to produce cells that express functional calcium channels. Since recombinant cells expressing human calcium channels containing all of the human subunits encoded by the heterologous cDNA or RNA are especially preferred, it is desirable to inject or transfect such host cells with a sufficient concentration of the subunit-encoding nucleic acids to form calcium channels that contain the human subunits encoded by heterologous DNA or RNA.
The precise amounts and ratios of DNA or RNA encoding the subunits may be empirically determined and optimized for a particular combination of subunits, cells and assay conditions.
In particular, mammalian cells have been transiently and stably transfected with DNA encoding one or more human calcium channel subunits. Such cells express heterologous calcium channels that exhibit pharmacological and electrophysiological properties that can be ascribed to human calcium channels.
Such cells, however, represent homogeneous populations and the pharmacological and electrophysiological data provide insights into human calcium channel activity heretofore unattainable. For example, HEK cells have been transiently transfected with DNA encoding the α1E-1, α2b and β1-3 subunits. The resulting cells transiently express these subunits, which form calcium channels whose properties appear to define a pharmacologically distinct class of voltage-activated calcium channels, distinct from those of L-, T- and P-type channels. The observed α1E currents were insensitive to drugs and toxins previously used to define other classes of voltage-activated calcium channels.
HEK cells that have been transiently transfected with DNA encoding α1B-1, α2b and β1-2 express heterologous calcium channels that exhibit sensitivity to ω-conotoxin and currents typical of N-type channels. It has been found that alteration of the molar ratios of α1B-1, α2b and β1-2 introduced into the cells to achieve equivalent mRNA levels significantly increased the number of receptors per cell and the current density, and affected the Kd for ω-conotoxin.
The electrophysiological properties of these channels produced from a 2 and P-2 were compared with those of channels produced by transiently transfecting HEK cells with DNA encoding and j,3. The channels exhibited similar voltage dependence of activation, substantially identical voltage dependence, similar kinetics of activation and tail currents that could be fit by a single exponential. The voltage dependence of the kinetics of inactivation was significantly different at all voltages examined.
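As an illustration of what fitting a tail current "by a single exponential" involves, the following is a minimal sketch, not the procedure used to obtain the results above. It assumes a decay of the form I(t) = A·exp(-t/τ) with no steady-state offset, and recovers A and τ by linear regression on the logarithm of the current magnitude. The function name, the time units and the synthetic trace are hypothetical.

```python
import math

def fit_single_exponential(times_ms, currents):
    """Fit |I(t)| = A * exp(-t / tau) by least squares on ln|I|.

    Assumes the current decays to zero (no offset) and never equals zero
    within the fitted window.
    """
    ys = [math.log(abs(i)) for i in currents]
    n = len(times_ms)
    mean_t = sum(times_ms) / n
    mean_y = sum(ys) / n
    sxx = sum((t - mean_t) ** 2 for t in times_ms)
    sxy = sum((t - mean_t) * (y - mean_y) for t, y in zip(times_ms, ys))
    slope = sxy / sxx                    # equals -1 / tau
    intercept = mean_y - slope * mean_t  # equals ln(A)
    return math.exp(intercept), -1.0 / slope

# Synthetic inward tail current (arbitrary units), invented for illustration.
times = [0.0, 0.5, 1.0, 1.5, 2.0]
trace = [-200.0 * math.exp(-t / 0.8) for t in times]
amplitude, tau = fit_single_exponential(times, trace)
```

A trace that is poorly described by this model (for example, one needing two time constants) would show systematic residuals in the log-linear fit.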
In certain embodiments, the eukaryotic cell with a heterologous calcium channel is produced by introducing into the cell a first composition, which contains at least one RNA transcript that is translated in the cell into a subunit of a human calcium channel. In preferred embodiments, the subunits that are translated include an α1 subunit of a human calcium channel. More preferably, the composition that is introduced contains an RNA transcript which encodes an α1 subunit of a human calcium channel and also contains an RNA transcript which encodes a β subunit of a human calcium channel and/or an RNA transcript which encodes an α2 subunit of a human calcium channel. Especially preferred is the introduction of RNA encoding an α1, a β and an α2 human calcium channel subunit, and, optionally, a γ subunit of a human calcium channel. Methods for in vitro transcription of a cloned DNA and injection of the resulting RNA into eukaryotic cells are well known in the art. Transcripts of any of the full-length DNA encoding any of the subunits of a human calcium channel may be injected alone or in combination with other transcripts into eukaryotic cells for expression in the cells.
Amphibian oocytes are particularly preferred for expression of in vitro transcripts of the human calcium channel subunit cDNA clones provided herein. Amphibian oocytes that express functional heterologous calcium channels have been produced by this method.
Assays and Clinical uses of the cells and calcium channels
Assays
Assays for identifying compounds that modulate calcium channel activity
Among the uses for eukaryotic cells which recombinantly express one or more subunits are assays for determining whether a test compound has calcium channel agonist or antagonist activity. These eukaryotic cells may also be used to select from among known calcium channel agonists and antagonists those exhibiting a particular calcium channel subtype specificity and to thereby select compounds that have potential as disease- or tissue-specific therapeutic agents.
In vitro methods for identifying compounds, such as calcium channel agonists and antagonists, that modulate the activity of calcium channels using eukaryotic cells that express heterologous human calcium channels are provided.
In particular, the assays use eukaryotic cells that express heterologous human calcium channel subunits encoded by heterologous DNA provided herein, for screening potential calcium channel agonists and antagonists which are specific for human calcium channels and particularly for screening for compounds that are specific for particular human calcium channel subtypes. Such assays may be used in conjunction with methods of rational drug design to select among agonists and antagonists, which differ slightly in structure, those particularly useful for modulating the activity of human calcium channels, and to design or select compounds that exhibit subtype- or tissue-specific calcium channel antagonist and agonist activities. These assays should accurately predict the relative therapeutic efficacy of a compound for the treatment of certain disorders in humans. In addition, since subtype- and tissue-specific calcium channel subunits are provided, cells with tissue-specific or subtype-specific recombinant calcium channels may be prepared and used in assays for identification of human calcium channel tissue- or subtype-specific drugs.
Desirably, the host cell for the expression of calcium channel subunits does not produce endogenous calcium channel subunits of the type or in an amount that substantially interferes with the detection of heterologous calcium channel subunits in ligand binding assays or detection of heterologous calcium channel function, such as generation of calcium current, in functional assays. Also, the host cells preferably should not produce endogenous calcium channels which detectably interact with compounds having, at physiological concentrations (generally nanomolar or picomolar concentrations), affinity for calcium channels that contain one or all of the human calcium channel subunits provided herein.
With respect to ligand binding assays for identifying a compound which has affinity for calcium channels, cells are employed which express, preferably, at least a heterologous α1 subunit. Transfected eukaryotic cells which express at least an α1 subunit may be used to determine the ability of a test compound to specifically bind to heterologous calcium channels by, for example, evaluating the ability of the test compound to inhibit the interaction of a labeled compound known to specifically interact with calcium channels. Such ligand binding assays may be performed on intact transfected cells or membranes prepared therefrom.
The capacity of a test compound to bind to or otherwise interact with membranes that contain heterologous calcium channels or subunits thereof may be determined by using any appropriate method, such as competitive binding analysis (for example, using Scatchard plots), in which the binding capacity of such membranes is determined in the presence and absence of one or more concentrations of a compound having known affinity for the calcium channel. Where necessary, the results may be compared to a control experiment designed in accordance with methods known to those of skill in the art. For example, as a negative control, the results may be compared to those of assays of an identically treated membrane preparation from host cells which have not been transfected with one or more subunit-encoding nucleic acids.
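For readers unfamiliar with Scatchard analysis, the sketch below shows the underlying arithmetic under a simple one-site binding assumption: plotting bound/free against bound gives a line with slope -1/Kd and x-intercept Bmax. This is an illustrative sketch only; the function name and the synthetic binding data are hypothetical and are not taken from the specification.

```python
def scatchard_fit(bound, free):
    """Estimate Kd and Bmax from a Scatchard plot (bound/free vs. bound).

    Assumes a single class of non-interacting binding sites, so that
    bound/free = (Bmax - bound) / Kd.
    """
    ratios = [b / f for b, f in zip(bound, free)]
    n = len(bound)
    mean_x = sum(bound) / n
    mean_y = sum(ratios) / n
    sxx = sum((x - mean_x) ** 2 for x in bound)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(bound, ratios))
    slope = sxy / sxx                    # equals -1 / Kd
    intercept = mean_y - slope * mean_x  # equals Bmax / Kd
    kd = -1.0 / slope
    bmax = intercept * kd
    return kd, bmax

# Synthetic one-site data (arbitrary units), invented for illustration.
true_kd, true_bmax = 2.0, 100.0
free_ligand = [0.5, 1.0, 2.0, 4.0, 8.0]
bound_ligand = [true_bmax * f / (true_kd + f) for f in free_ligand]
kd_est, bmax_est = scatchard_fit(bound_ligand, free_ligand)
```

With real data, specific binding would first be obtained by subtracting binding measured in a negative control such as untransfected host cell membranes.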
The assays involve contacting the cell membrane of a recombinant eukaryotic cell which expresses at least one subunit of a human calcium channel, preferably at least an α1 subunit of a human calcium channel, with a test compound and measuring the ability of the test compound to specifically bind to the membrane or alter or modulate the activity of a heterologous calcium channel on the membrane.
In preferred embodiments, the assay uses a recombinant cell that has a calcium channel containing an α1 subunit of a human calcium channel in combination with a β subunit of a human calcium channel and/or an α2 subunit of a human calcium channel. Recombinant cells expressing heterologous calcium channels containing each of the α1, β and α2 human subunits and, optionally, a γ subunit of a human calcium channel are especially preferred for use in such assays.
In certain embodiments, the assays for identifying compounds that modulate calcium channel activity are practiced by measuring the calcium channel activity of a eukaryotic cell having a heterologous, functional calcium channel when such cell is exposed to a solution containing the test compound and a calcium channel-selective ion and comparing the measured calcium channel activity to the calcium channel activity of the same cell or a substantially identical control cell in a solution not containing the test compound. The cell is maintained in a solution having a concentration of calcium channel-selective ions sufficient to provide an inward current when the channel is open. Recombinant cells expressing calcium channels that include each of the α1, β and α2 human subunits, and, optionally, a γ subunit of a human calcium channel, are especially preferred for use in such assays. Methods for practicing such assays are known to those of skill in the art.
For example, for similar methods applied with Xenopus laevis oocytes and acetylcholine receptors, see, Mishina et al. [(1985) Nature 313:364] and, with such oocytes and sodium channels [see, Noda et al. (1986) Nature 322:826-828]. For similar studies which have been carried out with the acetylcholine receptor, see, Claudio et al. [(1987) Science 238:1688-1694].
Functional recombinant or heterologous calcium channels may be identified by any method known to those of skill in the art. For example, electrophysiological procedures for measuring the current across an ion-selective membrane of a cell, which are well known, may be used. The amount and duration of the flow of calcium-selective ions through heterologous calcium channels of a recombinant cell containing DNA encoding one or more of the subunits provided herein have been measured by electrophysiological recording using the two-electrode and the whole-cell patch clamp techniques. In order to improve the sensitivity of the assays, known methods can be used to eliminate or reduce non-calcium currents and calcium currents resulting from endogenous calcium channels, when measuring calcium currents through recombinant channels.
For example, the DHP Bay K 8644 specifically enhances L-type calcium channel function by increasing the duration of the open state of the channels [see, Hess, et al. (1984) Nature 311:538-544]. Prolonged opening of the channels results in calcium currents of increased magnitude and duration. Tail currents can be observed upon repolarization of the cell membrane after activation of ion channels by a depolarizing voltage command. The opened channels require a finite time to close or "deactivate" upon repolarization, and the current that flows through the channels during this period is referred to as a tail current. Because Bay K 8644 prolongs opening events in calcium channels, it tends to prolong these tail currents and make them more pronounced.
In practicing these assays, stably or transiently transfected cells or injected cells that express voltage-dependent human calcium channels containing one or more of the subunits of a human calcium channel desirably may be used in assays to identify agents, such as calcium channel agonists and antagonists, that modulate calcium channel activity.
Functionally testing the activity of test compounds, including compounds having unknown activity, for calcium channel agonist or antagonist activity to determine if the test compound potentiates, inhibits or otherwise alters the flow of calcium ions or other ions through a human calcium channel can be accomplished by (a) maintaining a eukaryotic cell which is transfected or injected to express a heterologous functional calcium channel capable of regulating the flow of calcium channel-selective ions into the cell in a medium containing calcium channel-selective ions (i) in the presence of and (ii) in the absence of a test compound; (b) maintaining the cell under conditions such that the heterologous calcium channels are substantially closed and endogenous calcium channels of the cell are substantially inhibited; (c) depolarizing the membrane of the cell maintained in step (b) to an extent and for an amount of time sufficient to cause (preferably, substantially only) the heterologous calcium channels to become permeable to the calcium channel-selective ions; and (d) comparing the amount and duration of current flow into the cell in the presence of the test compound to that of the current flow into the cell, or a substantially similar cell, in the absence of the test compound.
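The final comparison step can be illustrated with a small sketch that expresses the current measured in the presence of a test compound as a percentage change relative to the control current. The function names, the 10% threshold and the example values are hypothetical choices made for illustration; the specification does not prescribe any particular threshold or statistic.

```python
def percent_change(control_peak, test_peak):
    """Percentage change in peak current magnitude relative to control."""
    return 100.0 * (test_peak - control_peak) / control_peak

def classify_effect(control_peak, test_peak, threshold_pct=10.0):
    """Label a compound by its effect on peak current magnitude.

    threshold_pct is an arbitrary illustrative cutoff, not a value taken
    from the specification.
    """
    change = percent_change(control_peak, test_peak)
    if change >= threshold_pct:
        return "potentiates"
    if change <= -threshold_pct:
        return "inhibits"
    return "no clear effect"

# Peak inward current magnitudes (arbitrary units), invented for illustration.
agonist_like = classify_effect(control_peak=100.0, test_peak=150.0)
antagonist_like = classify_effect(control_peak=100.0, test_peak=40.0)
```

In practice the same comparison could be applied to current duration or to replicate measurements, with an appropriate statistical test replacing the fixed cutoff.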
The assays thus use cells, provided herein, that express heterologous functional calcium channels and measure functionally, such as electrophysiologically, the ability of a test compound to potentiate, antagonize or otherwise modulate the magnitude and duration of the flow of calcium channel-selective ions, such as Ca²⁺ or Ba²⁺, through the heterologous functional channel. The amount of current which flows through the recombinant calcium channels of a cell may be determined directly, such as electrophysiologically, or by monitoring an independent reaction which occurs intracellularly and which is directly influenced in a calcium (or other) ion dependent manner. Any method for assessing the activity of a calcium channel may be used in conjunction with the cells and assays provided herein. For example, in one embodiment of the method for testing a compound for its ability to modulate calcium channel activity, the amount of current is measured by its modulation of a reaction which is sensitive to calcium channel-selective ions and uses a eukaryotic cell which expresses a heterologous calcium channel and also contains a transcriptional control element operatively linked for expression to a structural gene that encodes an indicator protein. The transcriptional control element used for transcription of the indicator gene is responsive in the cell to a calcium channel-selective ion, such as Ca²⁺ and Ba²⁺. The details of such transcriptional based assays are described in commonly owned PCT International Patent Application No. PCT/US91/5625, filed August 7, 1991, which claims priority to copending commonly owned allowed U.S. Application Serial No. 07/563,751, filed August 7, 1990; see also, commonly owned published PCT International Patent Application PCT US92/11090, which corresponds to co-pending U.S. Applications Serial Nos. 08/229,150 and 08/244,985.
Assays for diagnosis of LES
LES is an autoimmune disease characterized by an insufficient release of acetylcholine from motor nerve terminals which normally are responsive to nerve impulses. Immunoglobulins (IgG) from LES patients block individual voltage-dependent calcium channels and thus inhibit calcium channel activity [Kim and Neher, Science 239:405-408 (1988)]. A diagnostic assay for Lambert Eaton Syndrome (LES) is provided herein. The diagnostic assay for LES relies on the immunological reactivity of LES IgG with the human calcium channels or particular subunits alone or in combination or expressed on the surface of recombinant cells. For example, such an assay may be based on immunoprecipitation of LES IgG by the human calcium channel subunits and cells that express such subunits provided herein.
Clinical applications
In relation to therapeutic treatment of various disease states, the availability of DNA encoding human calcium channel subunits permits identification of any alterations in such genes (e.g., mutations) which may correlate with the occurrence of certain disease states. In addition, the creation of animal models of such disease states becomes possible, by specifically introducing such mutations into synthetic DNA fragments which can then be introduced into laboratory animals or in vitro assay systems to determine the effects thereof.
Also, genetic screening can be carried out using the nucleotide sequences as probes. Thus, nucleic acid samples from subjects having pathological conditions suspected of involving alteration/modification of any one or more of the calcium channel subunits can be screened with appropriate probes to determine if any abnormalities exist with respect to any of the endogenous calcium channels. Similarly, subjects having a family history of disease states related to calcium channel dysfunction can be screened to determine if they are also predisposed to such disease states.
EXAMPLES
The following examples are included for illustrative purposes only and are not intended to limit the scope of the invention.
EXAMPLE I: PREPARATION OF LIBRARIES USED FOR ISOLATION OF DNA ENCODING HUMAN NEURONAL VOLTAGE-DEPENDENT CALCIUM CHANNEL SUBUNITS
A. RNA Isolation
1. IMR32 cells
IMR32 cells were obtained from the American Type Culture Collection (ATCC Accession No. CCL127, Rockville, MD) and grown in DMEM, 10% fetal bovine serum, 1% penicillin/streptomycin (GIBCO, Grand Island, NY) plus 1.0 mM dibutyryl cAMP (dbcAMP) for ten days. Total RNA was isolated from the cells according to the procedure described by H.C. Birnboim [(1988) Nucleic Acids Research 16:1487-1497]. Poly(A⁺) RNA was selected according to standard procedures [see, Sambrook et al. (1989) Molecular Cloning, A Laboratory Manual, Cold Spring Harbor Laboratory Press; pg. 7.26-7.29].
2. Human thalamus tissue
Human thalamus tissue (2.34 g), obtained from the National Neurological Research Bank, Los Angeles, CA, that had been stored frozen at -70°C was pulverized using a mortar and pestle in the presence of liquid nitrogen and the cells were lysed in 12 ml of lysis buffer (5 M guanidinium isothiocyanate, 50 mM TRIS, pH 7.4, 10 mM EDTA, 5% β-mercaptoethanol). Lysis buffer was added to the lysate to yield a final volume of 17 ml. N-laurylsarcosine and CsCl were added to the mixture to yield final concentrations of 4% and 0.01 g/ml, respectively, in a final volume of 18 ml.
The sample was centrifuged at 9,000 rpm in a Sorvall SS34 rotor for 10 min at room temperature to remove the insoluble material as a pellet. The supernatant was divided into two equal portions and each was layered onto a 2-ml cushion of a solution of 5.7 M CsCl, 0.1 M EDTA contained in separate centrifuge tubes to yield approximately 9 ml per tube. The samples were centrifuged in an SW41 rotor at 37,000 rpm for 24 h at 20°C.
After centrifugation, each RNA pellet was resuspended in 3 ml ETS (10 mM TRIS, pH 7.4, 10 mM EDTA, 0.2% SDS) and combined into a single tube. The RNA was precipitated with 0.25 M NaCl and two volumes of 95% ethanol.
The precipitate was collected by centrifugation and resuspended in 4 ml PK buffer (0.05 M TRIS, pH 8.4, 0.14 M NaCl, 0.01 M EDTA, 1% SDS). Proteinase K was added to the sample to a final concentration of 200 µg/ml. The sample was incubated at 22°C for 1 h, followed by extraction with an equal volume of phenol:chloroform:isoamylalcohol (50:48:2) two times, followed by one extraction with an equal volume of chloroform:isoamylalcohol. The RNA was precipitated with ethanol and NaCl. The precipitate was resuspended in 400 µl of ETS buffer. The yield of total RNA was approximately mg. Poly(A⁺) RNA (30 µg) was isolated from the total RNA according to standard methods as stated in Example I.A.1.
B. Library Construction
Double-stranded cDNA was synthesized according to standard methods [see, Sambrook et al. (1989) IN: Molecular Cloning, A Laboratory Manual, Cold Spring Harbor Laboratory Press, Chapter ]. Each library was prepared in substantially the same manner except for differences in: 1) the oligonucleotide used to prime the first strand cDNA synthesis, 2) the adapters that were attached to the double-stranded cDNA, 3) the method used to remove the free or unused adapters, and 4) the size of the fractionated cDNA ligated into the λ phage vector.
1. IMR32 cDNA library #1
Single-stranded cDNA was synthesized using IMR32 poly(A⁺) RNA (Example ) as a template and was primed using oligo (dT) 8 (Collaborative Research Inc., Bedford, MA). The single-stranded cDNA was converted to double-stranded cDNA and the yield was approximately 2 µg. EcoRI adapters:
5'-AATTCGGTACGTACACTCGAGC-3' 22-mer (SEQ ID )
3'-GCCATGCATGTGAGCTCG-5' 18-mer (SEQ ID No. 16)
also containing SnaBI and XhoI restriction sites were then added to the double-stranded cDNA according to the following procedure.
a. Phosphorylation of 18-mer
The 18-mer was phosphorylated using standard methods [see, Sambrook et al. (1989) IN: Molecular Cloning, A Laboratory Manual, Cold Spring Harbor Laboratory Press, Chapter 8] by combining in a 10 µl total volume the 18-mer (225 pmoles) with [32P]-ATP (7000 Ci/mmole; 1.0 µl) and kinase (2 U) and incubating at 37°C for 15 minutes. After incubation, 1 µl of 10 mM ATP and an additional 2 U of kinase were added and incubated at 37°C for 15 minutes.
Kinase was then inactivated by boiling for 10 minutes.
b. Hybridization of 22-mer
The 22-mer was hybridized to the phosphorylated 18-mer by addition of 225 pmoles of the 22-mer (plus water to bring the volume to 15 µl), and incubation at 65°C for 5 minutes. The reaction was then allowed to cool slowly to room temperature.
The adapters were thus present at a concentration of 15 pmoles/µl (225 pmoles in 15 µl), and were ready for cDNA-adapter ligation.
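The design of the annealed adapter can be checked with a few lines of code. The sketch below, offered as an illustration rather than as part of the disclosed method, confirms that the 18-mer (written 3' to 5' as above) is the exact complement of the last 18 bases of the 22-mer, that the unpaired 5' end of the 22-mer is the AATT overhang compatible with EcoRI-cut vector, and that the SnaBI (TACGTA) and XhoI (CTCGAG) recognition sequences are present. It also reproduces the concentration arithmetic from the stated quantities. Variable names are hypothetical.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")

top_22mer = "AATTCGGTACGTACACTCGAGC"    # written 5' -> 3'
bottom_18mer_3to5 = "GCCATGCATGTGAGCTCG"  # written 3' -> 5', as in the text

# The 22-mer is 4 nt longer; its first 4 bases remain single-stranded.
overhang_len = len(top_22mer) - len(bottom_18mer_3to5)
overhang = top_22mer[:overhang_len]

# Base-by-base complement of the paired region, read in the same direction
# as the 22-mer, should equal the 18-mer as written 3' -> 5'.
paired_complement = top_22mer[overhang_len:].translate(COMPLEMENT)
strands_anneal = paired_complement == bottom_18mer_3to5

# Recognition sequences of the restriction enzymes named in the text.
sites = {"SnaBI": "TACGTA", "XhoI": "CTCGAG"}
sites_present = {name: seq in top_22mer for name, seq in sites.items()}

# Adapter concentration from the stated amounts: 225 pmoles in 15 µl.
adapter_pmol_per_ul = 225 / 15
```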
c. Ligation of adapters to cDNA
After the EcoRI, SnaBI, XhoI adapters were ligated to the double-stranded cDNA using a standard protocol [see, e.g., Sambrook et al. (1989) IN: Molecular Cloning, A Laboratory Manual, Cold Spring Harbor Laboratory Press, Chapter ], the ligase was inactivated by heating the mixture to 72°C for minutes. The following reagents were added to the cDNA ligation reaction and heated at 37°C for 30 minutes: cDNA ligation reaction (20 µl), water (24 µl), 10x kinase buffer (3 µl), 10 mM ATP (1 µl) and kinase (2 µl of 2 U/µl). The reaction was stopped by the addition of 2 µl of 0.5 M EDTA, followed by one phenol/chloroform extraction and one chloroform extraction.
d. Size Selection and Packaging of cDNA
The double-stranded cDNA with the EcoRI, SnaBI, XhoI adapters ligated was purified away from the free or unligated adapters using a 5 ml Sepharose CL-4B column (Sigma, St. Louis, MO). 100 µl fractions were collected and those containing the cDNA, determined by monitoring the radioactivity, were pooled, ethanol precipitated, resuspended in TE buffer and loaded onto a 1% agarose gel. After the electrophoresis, the gel was stained with ethidium bromide and the 1 to 3 kb fraction was cut from the gel. The cDNA embedded in the agarose was eluted using the "Geneluter Electroelution System" (Invitrogen, San Diego, CA). The eluted cDNA was collected by ethanol precipitation and resuspended in TE buffer at 0.10 pmol/µl. The cDNA was ligated to 1 µg of EcoRI digested, dephosphorylated λgt11 in a 5 µl reaction volume at a 2- to 4-fold molar excess ratio of cDNA over the λgt11 vector. The ligated λgt11 containing the cDNA insert was packaged into λ phage virions in vitro using the Gigapack (Stratagene, La Jolla, CA) kit. The packaged phage were plated on an E. coli Y1088 bacterial lawn in preparation for screening.
2. IMR32 cDNA library #2
This library was prepared as described (Example I.B.1.) with the exception that 3 to 9 kb cDNA fragments were ligated into the λgt11 phage vector rather than the 1 to 3 kb fragments.
3. IMR32 cDNA library #3
IMR32 cell poly(A⁺) RNA (Example ) was used as a template to synthesize single-stranded cDNA. The primers for the first strand cDNA synthesis were random primers (hexadeoxy-nucleotides Cat #5020-1, Clontech, Palo Alto, CA). The double-stranded cDNA was synthesized, EcoRI, SnaBI, XhoI adapters were added to the cDNA, the unligated adapters were removed, and the double-stranded cDNA with the ligated adapters was fractionated on an agarose gel, as described in Example I.B.1. The cDNA fraction greater than 1.8 kb was eluted from the agarose, ligated into λgt11, packaged, and plated on a bacterial lawn of Y1088 (as described in Example ).
4. IMR32 cDNA library #4
IMR32 cell poly(A⁺) RNA (Example ) was used as a template to synthesize single-stranded cDNA. The primers for the first strand cDNA synthesis were oligonucleotides: 89-365a specific for the α1D (VDCC III) type α1-subunit (see Example II.A.) coding sequence (the complementary sequence of nt 2927 to 2956, SEQ ID No. ), 89-495 specific for the α1C (VDCC II) type α1-subunit (see Example II.B.) coding sequence (the complementary sequence of nt 852 to 873, SEQ ID No. ), and 90-12 specific for the α1C-subunit coding sequence (the complementary sequence of nt 2496 to 2520, SEQ ID No. ). The cDNA library was then constructed as described (Example ), except that the cDNA size-fraction greater than 1.5 kb was eluted from the agarose rather than the greater than 1.8 kb fraction.
5. IMR32 cDNA library #5
The cDNA library was constructed as described (Example ) with the exception that the size-fraction greater than 1.2 kb was eluted from the agarose rather than the greater than 1.8 kb fraction.
6. Human thalamus cDNA library #6
Human thalamus poly(A⁺) RNA (Example ) was used as a template to synthesize single-stranded cDNA. Oligo (dT) was used to prime the first strand synthesis (Example I.B.1.). The double-stranded cDNA was synthesized (Example ) and EcoRI, KpnI, NcoI adapters of the following sequence:
5'-CCATGGTACCTTCGTTGACG-3' 20-mer (SEQ ID NO. 17)
3'-GGTACCATGGAAGCAACTGCTTAA-5' 24-mer (SEQ ID NO. 18)
were ligated to the double-stranded cDNA as described (Example ) with the 20-mer replacing the 18-mer and the 24-mer replacing the 22-mer. The unligated adapters were removed by passing the cDNA-adapter mixture through a 1 ml Bio Gel (Bio-Rad Laboratories, Richmond, CA) column. Fractions ( µl) were collected and 1 µl of each fraction in the first peak of radioactivity was electrophoresed on a 1% agarose gel. After electrophoresis, the gel was dried on a vacuum gel drier and exposed to x-ray film. The fractions containing cDNA fragments greater than 600 bp were pooled, ethanol precipitated, and ligated into λgt11 (Example ). The construction of the cDNA library was completed as described (Example ).
C. Hybridization and Washing Conditions
Hybridization of radiolabelled nucleic acids to immobilized DNA for the purpose of screening cDNA libraries, DNA Southern transfers, or northern transfers was routinely performed in standard hybridization conditions [hybridization: 50% deionized formamide, 200 µg/ml sonicated herring sperm DNA (Cat #223646, Boehringer Mannheim Biochemicals, Indianapolis, IN), 5 x SSPE, 5 x Denhardt's, 42°C; wash: 0.2 x SSPE, 0.1% SDS, 65°C]. The recipes for SSPE and Denhardt's and the preparation of deionized formamide are described, for example, in Sambrook et al. (1989) Molecular Cloning, A Laboratory Manual, Cold Spring Harbor Laboratory Press, Chapter . In some hybridizations, lower stringency conditions were used in that 10% deionized formamide replaced 50% deionized formamide described for the standard hybridization conditions.
The washing conditions for removing the non-specific probe from the filters were either high, medium, or low stringency as described below:
1) high stringency: 0.1 x SSPE, 0.1% SDS, 65°C
2) medium stringency: 0.2 x SSPE, 0.1% SDS, 50°C
3) low stringency: 1.0 x SSPE, 0.1% SDS,
It is understood that equivalent stringencies may be achieved using alternative buffers, salts and temperatures.
EXAMPLE II: ISOLATION OF DNA ENCODING THE HUMAN NEURONAL CALCIUM CHANNEL α1 SUBUNIT
A. Isolation of DNA encoding the α1D subunit
1. Reference list of partial cDNA clones
Numerous α1D-specific cDNA clones were isolated in order to characterize the complete α1D coding sequence plus portions of the 5' and 3' untranslated sequences. SEQ ID No. 1 shows the complete α1D DNA coding sequence, plus 510 nucleotides of α1D 5' untranslated sequence ending in the guanine nucleotide adjacent to the adenine nucleotide of the proposed initiation of translation as well as 642 nucleotides of 3' untranslated sequence. Also shown in SEQ ID No. 1 is the deduced amino acid sequence. A list of partial cDNA clones used to characterize the α1D sequence and the nucleotide position of each clone relative to the full-length cDNA sequence, which is set forth in SEQ ID No. 1, is shown below. The isolation and characterization of these clones are described below (Example II.A.2.).
IMR32 1.144     nt 1 to 510 of SEQ ID No. 1 (5' untranslated sequence), nt 511 to 2431, SEQ ID No. 1
IMR32* 1.136    nt 1627 to 2988, SEQ ID No. 1; nt 1 to 104 of SEQ ID No. 2 (additional exon)
IMR32@ 1.80     nt 2083 to 6468, SEQ ID No. 1
IMR32# 1.36     nt 2857 to 4281, SEQ ID No. 1
IMR32 1.163     nt 5200 to 7635, SEQ ID No. 1
* 5' of nt 1627, IMR32 1.136 encodes an intron and an additional exon described in Example II.A.2.d.
@ IMR32 1.80 contains two deletions, nt 2984 to 3131 and nt 5303 to 5349 (SEQ ID No. 1). The 148 nt deletion (nt 2984 to 3131) was corrected by performing a polymerase chain reaction described in Example II.A.3.b.
# IMR32 1.36 contains a 132 nt deletion (nt 3081 to 3212).
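The clone coordinates listed above lend themselves to a simple consistency check. The following sketch takes the listed nucleotide ranges and deletions at face value, confirms the stated deletion lengths, and reports any positions of SEQ ID No. 1 that are absent from every partial clone. It is an illustrative calculation only; the data structure and names are hypothetical, and clone sequence identity is not verified.

```python
SEQ_LENGTH = 7635  # last nucleotide listed for IMR32 1.163

# (span in SEQ ID No. 1, list of deleted ranges), all 1-based and inclusive.
clones = {
    "IMR32 1.144": ((1, 2431), []),
    "IMR32 1.136": ((1627, 2988), []),
    "IMR32 1.80": ((2083, 6468), [(2984, 3131), (5303, 5349)]),
    "IMR32 1.36": ((2857, 4281), [(3081, 3212)]),
    "IMR32 1.163": ((5200, 7635), []),
}

def range_length(first, last):
    """Number of nucleotides in an inclusive range."""
    return last - first + 1

def present_positions(span, deletions):
    """Positions actually carried by a clone after removing its deletions."""
    positions = set(range(span[0], span[1] + 1))
    for first, last in deletions:
        positions -= set(range(first, last + 1))
    return positions

covered = set()
for span, deletions in clones.values():
    covered |= present_positions(span, deletions)

# Positions of SEQ ID No. 1 not represented in any listed clone.
missing = sorted(set(range(1, SEQ_LENGTH + 1)) - covered)
missing_range = (missing[0], missing[-1]) if missing else None
```

On these coordinates the deletions in IMR32 1.80 and IMR32 1.36 overlap at nt 3081 to 3131, so that stretch is not supplied by any of the listed clones, which is consistent with the need for a separate amplification step to correct the deletion.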
2. Isolation and characterization of individual clones listed in Example II.A.1.
a. IMR32 1.36
Two million recombinants of the IMR32 cDNA library #1 (Example ) were screened in duplicate at a density of approximately 200,000 plaques per 150 mm plate using a mixture of radiolabelled fragments of the coding region of the rabbit skeletal muscle calcium channel α1 cDNA [for the sequence of the rabbit skeletal muscle calcium channel α1 subunit cDNA, see, Tanabe et al. (1987) Nature 328:313-318]:
Fragment        Nucleotides
KpnI-EcoRI      -78 to 1006
EcoRI-XhoI      1006 to 2653
ApaI-ApaI       3093 to 4182
BglII-SacI      4487 to 5310
The hybridization was performed using low stringency hybridization conditions (Example ) and the filters were washed under low stringency (Example ). Only one specific recombinant (IMR32 1.36) of the 2 x 10⁶ screened was identified. IMR32 1.36 was plaque purified by standard methods [Sambrook et al. (1989) Molecular Cloning, A Laboratory Manual, Cold Spring Harbor Laboratory Press, Chapter 8], subcloned into pGEM3 (Promega, Madison, WI) and characterized by DNA sequencing.
b. IMR32 1.80
Approximately 1 x 10⁶ recombinants of the IMR32 cDNA library #2 (Example ) were screened in duplicate at a density of approximately 100,000 plaques per 150 mm plate using the IMR32 1.36 cDNA fragment (Example II.A.1) as a probe. Standard hybridization conditions were used, and the filters were washed under high stringency (Example ). Three positive plaques were identified, one of which was IMR32 1.80. IMR32 1.80 was plaque purified by standard methods, restriction mapped, subcloned, and characterized by DNA sequencing.
c. IMR32 1.144
Approximately 1 x 10⁶ recombinants of the IMR32 cDNA library #3 (Example I.B.3) were screened with the EcoRI-PvuII fragment (nt 2083 to 2518, SEQ ID No. 1) of IMR32 1.80. The hybridization was performed using standard hybridization conditions (Example ) and the filters were washed under high stringency (Example ). Three positive plaques were identified, one of which was IMR32 1.144. IMR32 1.144 was plaque purified, restriction mapped, and the cDNA insert was subcloned into pGEM7Z (Promega, Madison, WI) and characterized by DNA sequencing. This characterization revealed that IMR32 1.144 has a series of ATG codons encoding seven possible initiating methionines (nt 511 to 531, SEQ ID No. 1). Nucleic acid amplification analysis, and DNA sequencing of cloned nucleic acid amplification analysis products encoding these seven ATG codons, confirmed that this sequence is present in the α1D transcript expressed in dbcAMP-induced IMR32 cells.
d. IMR32 1.136
Approximately 1 x 10⁶ recombinants of the IMR32 cDNA library #4 (Example I.B.4) were screened with the EcoRI-PvuII fragment (nt 2083 to 2518, SEQ ID No. 1) of IMR32 1.80 (Example ). The hybridization was performed using standard hybridization conditions (Example ) and the filters were washed under high stringency (Example ). Six positive plaques were identified, one of which was IMR32 1.136. IMR32 1.136 was plaque purified, restriction mapped, and the cDNA insert was subcloned into a standard plasmid vector, pSP72 (Promega, Madison, WI), and characterized by DNA sequencing. This characterization revealed that IMR32 1.136 encodes an incompletely spliced transcript. The clone contains nucleotides 1627 to 2988 of SEQ ID No. 1 preceded by an approximate 640 bp intron. This intron is then preceded by a 104 nt exon (SEQ ID No. 2) which is an alternative exon encoding the IS6 transmembrane domain [see, Tanabe et al. (1987) Nature 328:313-318 for a description of the IS1 to IVS6 transmembrane terminology] of the α1D subunit and can replace nt 1627 to 1730, SEQ ID No. 1, to produce a completely spliced α1D transcript.
e. IMR32 1.163
Approximately 1 x 10⁶ recombinants of the IMR32 cDNA library #3 (Example ) were screened with the NcoI-XhoI fragment of IMR32 1.80 (Example II.A.1.) containing nt 5811 to 6468 (SEQ ID No. 1). The hybridization was performed using standard hybridization conditions (Example ) and the filters were washed under high stringency (Example ). Three positive plaques were identified, one of which was IMR32 1.163. IMR32 1.163 was plaque purified, restriction mapped, and the cDNA insert was subcloned into a standard plasmid vector, pSP72 (Promega, Madison, WI), and characterized by DNA sequencing. This characterization revealed that IMR32 1.163 contains the α1D termination codon, nt 6994 to 6996 (SEQ ID No. 1).
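As a quick arithmetic check on the coordinates given in this Example, the sketch below verifies that a coding sequence running from nt 511 to nt 6993 is a whole number of codons and that the termination codon at nt 6994 to 6996 follows it directly. The variable names are hypothetical and the calculation uses only the positions quoted in the text.

```python
# Coordinates in SEQ ID No. 1 as quoted in the text (1-based, inclusive).
cds_start, cds_end = 511, 6993
stop_codon = (6994, 6996)

cds_length = cds_end - cds_start + 1       # nucleotides of coding sequence
codons, remainder = divmod(cds_length, 3)  # remainder 0 means in frame

# The stop codon should begin at the nucleotide after the last sense codon.
stop_is_adjacent = stop_codon[0] == cds_end + 1
stop_is_one_codon = stop_codon[1] - stop_codon[0] + 1 == 3
```

On these figures the open reading frame corresponds to 2161 codons, counting the initiating methionine and excluding the termination codon.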
3. Construction of a full-length α1D cDNA [pVDCCIII(A)]
cDNA clones IMR32 1.144, IMR32 1.136, IMR32 1.80, and IMR32 1.163 (Example II.A.2.) overlap and include the entire α1D coding sequence, nt 511 to 6993 (SEQ ID No. with the exception of a 148 bp deletion, nt 2984 to 3131 (SEQ ID No.
Portions of these partial cDNA clones were ligated to generate a full-length α1D cDNA in a eukaryotic expression vector. The resulting vector was called pVDCCIII(A). The construction of pVDCCIII(A) was performed in four steps described in detail below: (1) the construction of pVDCCIII/5' using portions of IMR32 1.144, IMR32 1.136, and IMR32 1.80, (2) the construction of pVDCCIII/5'.3 that corrects the 148 nt deletion in the IMR32 1.80 portion of pVDCCIII/5', (3) the construction of pVDCCIII/3'.1 using portions of IMR32 1.80 and IMR32 1.163, and (4) the ligation of a portion of the pVDCCIII/5'.3 insert, the insert of pVDCCIII/3'.1, and pcDNA1 (Invitrogen, San Diego, CA) to form pVDCCIII(A). The vector pcDNA1 is a eukaryotic expression vector containing a cytomegalovirus (CMV) promoter which is a constitutive promoter recognized by mammalian host cell RNA polymerase II.
Each of the DNA fragments used in preparing the full-length construct was purified by electrophoresis through an agarose gel onto DE81 filter paper (Whatman, Clifton, NJ) and elution from the filter paper using 1.0 M NaCl, 10 mM TRIS, pH 8.0, 1 mM EDTA. The ligations typically were performed in a pA reaction volume with an equal molar ratio of insert fragment and a two-fold molar excess of the total insert relative to the vector. The amount of DNA used was normally about 50 ng to 100 ng.
a. pVDCCIII/5'
To construct pVDCCIII/5', IMR32 1.144 (Example II.A.2.c.) was digested with XhoI and EcoRI and the fragment containing the vector (pGEM7Z), α1D nt 1 to 510 (SEQ ID No. 1), and α1D nt 511 to 1732 (SEQ ID No. 1) was isolated by gel electrophoresis. The EcoRI-ApaI fragment of IMR32 1.136 (Example II.A.2.d.), nucleotides 1733 to 2671 (SEQ ID No. 1), was isolated, and the ApaI-HindIII fragment of IMR32 1.80 (Example nucleotides 2672 to 4492 (SEQ ID No. 1) was isolated. The three DNA fragments were ligated to form pVDCCIII/5', containing nt 1 to 510 (5' untranslated sequence; SEQ ID No. 1) and nt 511 to 4492 (SEQ ID No. 1).
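The coordinates given for pVDCCIII/5' can be checked for gaps or overlaps with a few lines of code. The following is a minimal sketch, not part of the described method; the fragment labels and the helper name `check_contiguous` are illustrative assumptions, and only the nucleotide coordinates are taken from the text above.

```python
from typing import List, Tuple

# (label, first nt, last nt): 1-based inclusive coordinates in SEQ ID No. 1,
# copied from the description of pVDCCIII/5'. The labels are illustrative only.
FRAGMENTS: List[Tuple[str, int, int]] = [
    ("IMR32 1.144, 5' untranslated", 1, 510),
    ("IMR32 1.144, coding", 511, 1732),
    ("IMR32 1.136, EcoRI-ApaI", 1733, 2671),
    ("IMR32 1.80, ApaI-HindIII", 2672, 4492),
]


def check_contiguous(fragments: List[Tuple[str, int, int]]) -> int:
    """Return the total length if every fragment starts right after the previous one."""
    expected_start = fragments[0][1]
    total = 0
    for label, first, last in fragments:
        if first != expected_start:
            raise ValueError(f"{label}: starts at nt {first}, expected nt {expected_start}")
        if last < first:
            raise ValueError(f"{label}: last nt {last} precedes first nt {first}")
        total += last - first + 1
        expected_start = last + 1
    return total


total_nt = check_contiguous(FRAGMENTS)
print(f"pVDCCIII/5' insert covers {total_nt} contiguous nt of SEQ ID No. 1")
```

The same check could be applied to the fragments listed for pVDCCIII/3'.1 and pVDCCIII(A) by swapping in their coordinates.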
b. pVDCCIII/5'.3
Comparison of the IMR32 1.36 and IMR32 1.80 DNA sequences revealed that these two cDNA clones differ through the α1D coding sequence, nucleotides 2984 to 3212. Nucleic acid amplification analysis of IMR32 1.80 and dbcAMP-induced mM, 10 days) IMR32 cytoplasmic RNA (isolated according to Ausubel, F.M. et al. (Eds) (1988) Current Protocols in Molecular Biology, John Wiley and Sons, New York) revealed that IMR32 1.80 had a 148 nt deletion, nt 2984 to 3131 (SEQ ID No. and that IMR32 1.36 had a 132 nt deletion, nt 3081 to 3212. To perform the nucleic acid amplification analysis, the amplification reaction was primed with α1D-specific oligonucleotides 112 (nt 2548 to 2572, SEQ ID No. 1) and 311 (the complementary sequence of nt 3928 to 3957, SEQ ID No. 1). These products were then reamplified using α1D-specific oligonucleotides 310 (nt 2583 to 2608, SEQ ID No. 1) and 312 (the complementary sequence of nt 3883 to 3909). This reamplified product, which contains AccI and BglII restriction sites, was digested with AccI and BglII and the AccI-BglII fragment, nt 2765 to 3890 (SEQ ID No. 1), was cloned into AccI-BglII digested pVDCCIII/5' to replace the AccI-BglII pVDCCIII/5' fragment that had the deletion. This new construct was named pVDCCIII/5'.3. DNA sequence determination of pVDCCIII/5'.3 through the amplified region confirmed the 148 nt deletion in IMR32 1.80.
c. pVDCCIII/3'.1
To construct pVDCCIII/3'.1, the cDNA insert of IMR32 1.163 (Example II.A.2.e.) was subcloned into pBluescript II (Stratagene, La Jolla, CA) as an XhoI fragment. The XhoI sites on the cDNA fragment were furnished by the adapters used to construct the cDNA library (Example The insert was oriented such that the translational orientation of the insert of IMR32 1.163 was opposite to that of the lacZ gene present in the plasmid, as confirmed by analysis of restriction enzyme digests of the resulting plasmid. This was done to preclude the possibility of expression of sequences in DH5α cells transformed with this plasmid due to fusion with the lacZ gene. This plasmid was then digested with HindIII and BglII and the HindIII-BglII fragment (the HindIII site comes from the vector and the BglII site is at nt 6220, SEQ ID No. 1) was eliminated, thus deleting nt 5200 to 6220 (SEQ ID No. 1) of the IMR32 1.163 clone and removing this sequence from the remainder of the plasmid which contained the 3' BglII-XhoI fragment, nt 6221 to 7635 (SEQ ID No. 1).
pVDCCIII/3'.1 was then made by splicing together the HindIII-PvuII fragment from IMR32 1.80 (nucleotides 4493-5296, SEQ ID No. the PvuII-BglII fragment of IMR32 1.163 (nucleotides 5294 to 6220, SEQ ID No. 1) and the HindIII-BglII-digested pBluescript plasmid containing the 3' BglII/XhoI IMR32 1.163 fragment (nt 6221 to 7635, SEQ ID No. 1).
d. pVDCCIII(A): the full-length α1D construct
To construct pVDCCIII(A), the DraI-HindIII fragment (5' untranslated sequence nt 330 to 510, SEQ ID No. 1 and coding sequence nt 511 to 4492, SEQ ID No. 1) of pVDCCIII/5'.3 (Example II.A.3.b.) was isolated; the HindIII-XhoI fragment of pVDCCIII/3'.1 (containing nt 4493 to 7635, SEQ ID No. 1, plus the XhoI site of the adapter) (Example II.A.3.c.) was isolated; and the plasmid vector, pcDNA1, was digested with EcoRV and XhoI and isolated on an agarose gel. The three DNA fragments were ligated and MC1061-P3 (Invitrogen, San Diego, CA) was transformed. Isolated clones were analyzed by restriction mapping and DNA sequencing, and pVDCCIII(A) was identified, which had the fragments correctly ligated together: DraI-HindIII, HindIII-XhoI, XhoI-EcoRV, with the blunt-end DraI and EcoRV sites ligating together to form the circular plasmid.
The amino-terminus of the α1D subunit is encoded by the seven consecutive 5' methionine codons (nt 511 to 531, SEQ ID No. This 5' portion plus nt 532 to 537, encoding two lysine residues, were deleted from pVDCCIII(A) and replaced with an efficient ribosomal binding site (5'-ACCACC-3') to form pVDCCIII.RBS(A). Expression experiments in which transcripts of this construct were injected into Xenopus laevis oocytes did not result in an enhancement in the recombinant voltage-dependent calcium channel expression level relative to the level of expression in oocytes injected with transcripts of pVDCCIII(A).
B. Isolation of DNA encoding the α1C subunit
1. Reference List of Partial α1C cDNA clones
Numerous α1C-specific cDNA clones were isolated in order to characterize the α1C coding sequence, the α1C initiation of translation, and an alternatively spliced region of α1C. SEQ ID No. 3 sets forth one α1C coding sequence (α1C-1) and deduced amino acid sequence; SEQ ID No. 36 sets forth another splice variant designated α1C-2. SEQ ID No. 4 and No. 5 encode two possible amino terminal ends of an α1C splice variant. SEQ ID No. 6 encodes an alternative exon for the IVS3 transmembrane domain. Other α1C variants can be constructed by selecting the alternative amino terminal ends in place of the ends in SEQ ID No. 3 or 36 and/or inserting the alternative exon (SEQ ID No. 6) in the appropriate location, such as in SEQ ID No. 3 in place of nucleotides 3904-3987. In addition, the nucleotide sequence (nucleotides 1391-1465 in SEQ ID No. 3) can be deleted or inserted to produce an alternative α1C splice variant.
Shown below is a list of clones used to characterize the α1C sequence and the nucleotide position of each clone relative to the characterized α1C sequence (SEQ ID No. The isolation and characterization of these cDNA clones are described below (Example II.B.2).
IMR32 1.66     nt 1 to 916, SEQ ID No. 3
               nt 1 to 132, SEQ ID No. 4
IMR32 1.157    nt 1 to 873, SEQ ID No. 3
               nt 1 to 89, SEQ ID No. 5
IMR32 1.67     nt 50 to 1717, SEQ ID No. 3
*IMR32 1.86    nt 1366 to 2583, SEQ ID No. 3
**1.16G        nt 758 to 867, SEQ ID No. 3
IMR32 1.37     nt 2804 to 5904, SEQ ID No. 3
CNS 1.30       nt 2199 to 3903, SEQ ID No. 3
               nt 1 to 84 of alternative exon, SEQ ID No. 6
IMR32 1.38     nt 2448 to 4702, SEQ ID No. 3
               nt 1 to 84 of alternative exon, SEQ ID No. 6
* IMR32 1.86 has a 73 nt deletion compared to the rabbit cardiac muscle calcium channel α1 subunit cDNA sequence.
** 1.16G is an α1C genomic clone.
2. Isolation and characterization of clones described in Example II.B.1.
a. CNS 1.30
Approximately 1 x 10^6 recombinants of the human thalamus cDNA library No. 6 (Example were screened with fragments of the rabbit skeletal muscle calcium channel α1 cDNA described in Example II.A.2.a. The hybridization was performed using standard hybridization conditions (Example and the filters were washed under low stringency (Example Six positive plaques were identified, one of which was CNS 1.30. CNS 1.30 was plaque purified, restriction mapped, subcloned, and characterized by DNA sequencing.
CNS 1.30 encodes α1C-specific sequence nt 2199 to 3903 (SEQ ID No. 3) followed by nt 1 to 84 of one of two identified alternative α1C exons (SEQ ID No. 6). 3' of SEQ ID No. 6, CNS 1.30 contains an intron and, thus, CNS 1.30 encodes a partially spliced α1C transcript.
b. 1.16G
Approximately 1 x 10^6 recombinants of a λEMBL3-based human genomic DNA library (Cat HL1006d Clontech Corp., Palo Alto, CA) were screened using a rabbit skeletal muscle cDNA fragment (nt -78 to 1006, Example II.A.2.a.). The hybridization was performed using standard hybridization conditions (Example and the filters were washed under low stringency (Example Fourteen positive plaques were identified, one of which was 1.16G. Clone 1.16G was plaque purified, restriction mapped, subcloned, and portions were characterized by DNA sequencing. DNA sequencing revealed that 1.16G encodes α1C-specific sequence as described in Example II.B.1.
c. IMR32 1.66 and IMR32 1.67
Approximately 1 x 10^6 recombinants of IMR32 cDNA library (Example were screened with a 151 bp KpnI-SacI fragment of 1.16G (Example II.B.2.b.) encoding α1C sequence (nt 758 to 867, SEQ ID No. 3). The hybridization was performed using standard hybridization conditions (Example The filters were then washed in 0.5 x SSPE at 65°C. Of the positive plaques, IMR32 1.66 and IMR32 1.67 were identified.
The hybridizing plaques were purified, restriction mapped, subcloned, and characterized by DNA sequencing. Two of these cDNA clones, IMR32 1.66 and 1.67, encode α1C subunits as described (Example In addition, IMR32 1.66 encodes a partially spliced α1C transcript marked by a GT splice donor dinucleotide beginning at the nucleotide 3' of nt 916 (SEQ ID No. 3). The intron sequence within 1.66 is 101 nt long.
IMR32 1.66 encodes the α1C initiation of translation, nt 1 to 3 (SEQ ID No. 3), and 132 nt of 5' untranslated sequence (SEQ ID No. 4) precede the start codon in IMR32 1.66.
d. IMR32 1.37 and IMR32 1.38
Approximately 2 x 10^6 recombinants of IMR32 cDNA library #1 (Example were screened with the CNS 1.30 cDNA fragment (Example The hybridization was performed using low stringency hybridization conditions (Example I.C.) and the filters were washed under low stringency (Example Four positive plaques were identified, plaque purified, restriction mapped, subcloned, and characterized by DNA sequencing. Two of the clones, IMR32 1.37 and IMR32 1.38, encode α1C-specific sequences as described in Example II.B.1.
DNA sequence comparison of IMR32 1.37 and IMR32 1.38 revealed that the α1C transcript includes two exons that encode the IVS3 transmembrane domain. IMR32 1.37 has a single exon, nt 3904 to 3987 (SEQ ID No. 3), and IMR32 1.38 appears to be anomalously spliced to contain both exons juxtaposed, nt 3904 to 3987 (SEQ ID No. 3) followed by nt 1 to 84 (SEQ ID No. 6).
The alternative splice of the α1C transcript to contain either of the two exons encoding the IVS3 region was confirmed by comparing the CNS 1.30 sequence to the IMR32 1.37 sequence.
CNS 1.30 contains nt 1 to 84 (SEQ ID No. 6) preceded by the identical sequence contained in IMR32 1.37 for nt 2199 to 3903 (SEQ ID No. 3). As described in Example II.B.2.a., an intron follows nt 1 to 84 (SEQ ID No. 6). Two alternative exons have been spliced adjacent to nt 3903 (SEQ ID No. 3), represented by CNS 1.30 and IMR32 1.37.
e. IMR32 1.86
IMR32 cDNA library #1 (Example was screened in duplicate using oligonucleotide probes 90-9 (nt 1462 to 1491, SEQ ID No. 3) and 90-12 (nt 2496 to 2520, SEQ ID No. 3). These oligonucleotide probes were chosen in order to isolate a clone that encodes the α1C subunit between the 3' end of IMR32 1.67 (nt 1717, SEQ ID No. 3) and the 5' end of CNS 1.30 (nt 2199, SEQ ID No. 3). The hybridization conditions were standard hybridization conditions (Example with the exception that the 50% deionized formamide was reduced to The filters were washed under low stringency (Example Three positive plaques were identified, one of which was IMR32 1.86. IMR32 1.86 was plaque purified, subcloned, and characterized by restriction mapping and DNA sequencing.
IMR32 1.86 encodes α1C sequences as described in Example II.B.1. Characterization by DNA sequencing revealed that IMR32 1.86 contains a 73 nt deletion compared to the DNA encoding rabbit cardiac muscle calcium channel α1 subunit [Mikami et al. (1989) Nature 340:230], nt 2191 to 2263. These missing nucleotides correspond to nt 2176-2248 of SEQ ID No. 3. Because the 5'-end of CNS 1.30 overlaps the 3'-end of IMR32 1.86, some of these missing nucleotides, nt 2205-2248 of SEQ ID No. 3, are accounted for by CNS 1.30. The remaining missing nucleotides of the 73 nucleotide deletion in IMR32 1.86 (nt 2176-2204, SEQ ID No. 3) were determined by nucleic acid amplification analysis of dbcAMP-induced IMR32 cell RNA. The 73 nt deletion is a frame-shift mutation and, thus, needs to be corrected. The exact human sequence through this region (which has been determined by the DNA sequence of CNS 1.30 and nucleic acid amplification analysis of IMR32 cell RNA) can be inserted into IMR32 1.86 by standard methods, such as replacement of a restriction fragment or site-directed mutagenesis.
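The arithmetic behind the 73 nt deletion in IMR32 1.86 can be made explicit. The sketch below is illustrative only (the helper names `span_length` and `is_frameshift` are assumptions, not from the specification); it confirms that the two recovered sub-spans tile the deleted span and that a 73 nt deletion is not a multiple of three, which is why it shifts the reading frame.

```python
def span_length(first: int, last: int) -> int:
    """Length of a 1-based inclusive nucleotide span."""
    return last - first + 1


def is_frameshift(n_nt: int) -> bool:
    """An insertion or deletion shifts the reading frame unless it is a multiple of 3."""
    return n_nt % 3 != 0


# Coordinates in SEQ ID No. 3, as stated in the text.
deletion = (2176, 2248)            # missing from IMR32 1.86
from_cns_1_30 = (2205, 2248)       # covered by the 5' end of CNS 1.30
from_amplification = (2176, 2204)  # determined by amplification of IMR32 cell RNA

deleted_nt = span_length(*deletion)
recovered_nt = span_length(*from_cns_1_30) + span_length(*from_amplification)

# The two sources abut exactly and together cover the whole deletion.
tiles_exactly = (
    from_amplification[0] == deletion[0]
    and from_amplification[1] + 1 == from_cns_1_30[0]
    and from_cns_1_30[1] == deletion[1]
)

print(deleted_nt, recovered_nt, tiles_exactly, is_frameshift(deleted_nt))
```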
f. IMR32 1.157
One million recombinants of IMR32 cDNA library #4 (Example were screened with an XhoI-EcoRI fragment of IMR32 1.67 encoding α1C nt 50 to 774 (SEQ ID No. 3). The hybridization was performed using standard hybridization conditions (Example The filters were washed under high stringency (Example One of the positive plaques identified was IMR32 1.157. This plaque was purified, and the insert was restriction mapped and subcloned into a standard plasmid vector, pGEM7Z (Promega, Madison, WI). The DNA was characterized by sequencing. IMR32 1.157 appears to encode an alternative 5' portion of the α1C sequence beginning with nt 1 to 89 (SEQ ID No. 5) and followed by nt 1 to 873 (SEQ ID No. 3). Analysis of the 1.66 and 1.157 5' sequence is described below (Example II.B.3.).
3. Characterization of the α1C initiation of translation site
Portions of the sequences of IMR32 1.157 (nt 57 to 89, SEQ ID No. 5; nt 1 to 67, SEQ ID No. IMR32 1.66 (nt 100 to 132, SEQ ID No. 4; nt 1 to 67, SEQ ID No. were compared to the rabbit lung CaCB-receptor cDNA sequence, nt -33 to 67 [Biel et al. (1990) FEBS Lett. 269:409]. The human sequences are possible alternative 5' ends of the α1C transcript encoding the region of initiation of translation. IMR32 1.66 closely matches the CaCB receptor cDNA sequence and diverges from the CaCB receptor cDNA sequence in the 5' direction beginning at nt 122 (SEQ ID No. 4). The start codon identified in the CaCB receptor cDNA sequence is the same start codon used to describe the α1C coding sequence, nt 1 to 3 (SEQ ID No. 3). The sequences of α1C splice variants, designated α1C-1 and α1C-2, are set forth in SEQ ID NOs. 3 and 36.
C. Isolation of partial cDNA clones encoding the α1B subunit and construction of a full-length clone
A human basal ganglia cDNA library was screened with the rabbit skeletal muscle α1 subunit cDNA fragments (see Example II.A.2.a for description of fragments) under low stringency conditions. One of the hybridizing clones was used to screen an IMR32 cell cDNA library to obtain additional partial α1B cDNA clones, which were in turn used to further screen an IMR32 cell cDNA library for additional partial cDNA clones.
One of the partial IMR32 α1B clones was used to screen a human hippocampus library to obtain a partial α1B clone encoding the 3' end of the α1B coding sequence. The sequence of some of the regions of the partial cDNA clones was compared to the sequence of products of nucleic acid amplification analysis of IMR32 cell RNA to determine the accuracy of the cDNA sequences.
Nucleic acid amplification analysis of IMR32 cell RNA and genomic DNA using oligonucleotide primers corresponding to sequences located 5' and 3' of the STOP codon of the DNA encoding the α1B subunit revealed an alternatively spliced α1B-encoding mRNA in IMR32 cells. This second mRNA product is the result of differential splicing of the α1B subunit transcript to include another exon that is not present in the mRNA corresponding to the other 3' cDNA sequence that was initially isolated. To distinguish these splice variants of the α1B subunit, the subunit encoded by a DNA sequence corresponding to the form containing the additional exon is referred to as α1B-1 (SEQ ID No. 7), whereas the subunit encoded by a DNA sequence corresponding to the form lacking the additional exon is referred to as α1B-2 (SEQ ID No. 8). The sequence of α1B-1 diverges from that of α1B-2 beginning at nt 6633 (SEQ ID No. 7). Following the sequence of the additional exon in α1B-1 (nt 6633-6819; SEQ ID No. 7), the α1B-1 and α1B-2 sequences are identical (nt 6820-7362 in SEQ ID No. 7 and nt 6633-7175 in SEQ ID No. 8). SEQ ID No. 7 and No. 8 set forth 143 nt of 5' untranslated sequence (nt 1-143) as well as 202 nt of 3' untranslated sequence (nt 7161-7362, SEQ ID No. 7) of the DNA encoding α1B-1 and 321 nt of 3' untranslated sequence (nt 6855-7175, SEQ ID No. 8) of the DNA encoding α1B-2.
Nucleic acid amplification analysis of the IS6 region of the transcript revealed what appear to be additional splice variants based on multiple fragment sizes seen on an ethidium bromide-stained agarose gel containing the products of the amplification reaction.
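The relationship between the coordinates of the two splice variant sequences (SEQ ID No. 7, which includes the additional exon, and SEQ ID No. 8, which lacks it) can be expressed as a simple coordinate mapping. This is a minimal sketch under the coordinates quoted in the text; the function name `seq7_to_seq8` is a hypothetical helper, not something defined in the specification.

```python
from typing import Optional

EXON_FIRST = 6633  # first nt of the additional exon in SEQ ID No. 7
EXON_LAST = 6819   # last nt of the additional exon in SEQ ID No. 7
EXON_LEN = EXON_LAST - EXON_FIRST + 1


def seq7_to_seq8(pos: int) -> Optional[int]:
    """Map a 1-based position in SEQ ID No. 7 to SEQ ID No. 8.

    Returns None for positions inside the additional exon, which has no
    counterpart in the form lacking that exon.
    """
    if pos < EXON_FIRST:
        return pos             # shared 5' sequence: same numbering
    if pos <= EXON_LAST:
        return None            # exon present only in SEQ ID No. 7
    return pos - EXON_LEN      # shared 3' sequence: shifted by the exon length


# The shared 3' region quoted in the text: nt 6820-7362 (No. 7) vs nt 6633-7175 (No. 8).
shared_start = seq7_to_seq8(6820)
shared_end = seq7_to_seq8(7362)
exon_keeps_frame = EXON_LEN % 3 == 0

print(EXON_LEN, shared_start, shared_end, exon_keeps_frame)
```

Because the exon length is not a multiple of three, sequence downstream of the exon is read in a different frame in the two forms, which is consistent with the two variants having 3' untranslated regions that start at different positions.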
A full-length α1B-1 cDNA clone designated pcDNA-α1B-1 was prepared in an eight-step process as follows.
STEP 1: The SacI restriction site of pGEM3 (Promega, Madison, WI) was destroyed by digestion at the SacI site, producing blunt ends by treatment with T4 DNA polymerase, and religation. The new vector was designated pGEM3ΔSac.
STEP 2: Fragment 1 (HindIII/KpnI; nt 2337 to 4303 of SEQ ID No. 7) was ligated into HindIII/KpnI digested pGEM3ΔSac to produce pα1.177HK.
STEP 3: Fragment 1 has a 2 nucleotide deletion (nt 3852 and 3853 of SEQ ID No. The deletion was repaired by inserting an amplified fragment (fragment 2) of IMR32 RNA into pα1.177HK. Thus, fragment 2 (NarI/KpnI; nt 3828 to 4303 of SEQ ID No. 7) was inserted into NarI/KpnI digested pα1.177HK, replacing the NarI/KpnI portion of fragment 1 and producing pα1.177HK/PCR.
STEP 4: Fragment 3 (KpnI/KpnI; nt 4303 to 5663 of SEQ ID No. 7) was ligated into KpnI digested pα1.177HK/PCR to produce
STEP 5: Fragment 4 (EcoRI/HindIII; EcoRI adaptor plus nt 1 to 2337 of SEQ ID No. 7) and fragment 5 (HindIII/XhoI fragment of pα1B5'K; nt 2337 to 5446 of SEQ ID No. 7) were ligated together into EcoRI/XhoI digested pcDNA1 (Invitrogen, San Diego, CA) to produce
STEP 6: Fragment 6 (EcoRI/EcoRI; EcoRI adapters on both ends plus nt 5749 to 7362 of SEQ ID No. 7) was ligated into EcoRI digested pBluescript II KS (Stratagene, La Jolla, CA) with the 5' end of the fragment proximal to the KpnI site in the polylinker to produce pα1.230.
STEP 7: Fragment 7 (KpnI/XhoI; nt 4303 to 5446 of SEQ ID No. and fragment 8 (XhoI/CspI; nt 5446 to 6259 of SEQ ID No. 7) were ligated into KpnI/CspI digested pα1.230 (removes nt 5749 to 6259 of SEQ ID No. 7 that was encoded in pα1.230 and maintains nt 6259 to 7362 of SEQ ID No. 7) to produce pα1B3'.
STEP 8: Fragment 9 (SphI/XhoI; nt 4993 to 5446 of SEQ ID No. 7) and fragment 10 (XhoI/XbaI of pα1B3'; nt 5446 to 7319 of SEQ ID No. 7) were ligated into SphI/XbaI digested pα1B5' (removes nt 4993 to 5446 of SEQ ID No. 7 that were encoded in pα1B5' and maintains nt 1 to 4850 of SEQ ID No. 7) to produce pcDNAα1B-1.
The resulting construct, pcDNAα1B-1, contains, in pcDNA1, a full-length coding region encoding α1B-1 (nt 144-7362, SEQ ID No. plus 5' untranslated sequence (nt 1-143, SEQ ID No. 7) and 3' untranslated sequence (nt 7161-7319, SEQ ID No. 7) under the transcriptional control of the CMV promoter.
D. Isolation of DNA encoding human calcium channel α1A subunits
1. Isolation of partial clones
DNA clones encoding portions of human calcium channel α1A subunits were obtained by hybridization screening of human cerebellum cDNA libraries and nucleic acid amplification of human cerebellum RNA. Clones corresponding to the 3' end of the α1A coding sequence were isolated by screening 1 x 10^6 recombinants of a randomly primed cerebellum cDNA library (size-selected for inserts greater than 2.8 kb in length) under low stringency conditions (6X SSPE, 5X Denhardt's solution, 0.2% SDS, 200 µg/ml sonicated herring sperm DNA, 42°C) with oligonucleotide 704 containing nt 6190-6217 of the rat α1A coding sequence [Starr et al. (1992) Proc. Natl. Acad. Sci. U.S.A. 88:5621-5625]. Washes were performed under low stringency conditions. Several clones that hybridized to the probe (clones α1.251-α1.259 and α1.244) were purified and characterized by restriction enzyme mapping and DNA sequence analysis. At least two of the clones, α1.244 and α1.254, contained a translation termination codon. Although clones α1.244 and α1.254 are different lengths, they both contain a sequence of nucleotides that corresponds to the extreme 3' end of the transcript; that is, the two clones overlap. These two clones are identical in the region of overlap, except that clone α1.244 contains a sequence of 5 nucleotides and a sequence of 12 nucleotides that are not present in α1.254.
To obtain additional α1A-encoding clones, 1 x 10^6 recombinants of a randomly primed cerebellum cDNA library (size-selected for inserts ranging from 1.0 to 2.8 kb in length) were screened for hybridization to three oligonucleotides: oligonucleotide 701 (containing nucleotides 2288-2315 of the rat α1A coding sequence), oligonucleotide 702 (containing nucleotides 3559-3585 of the rat α1A coding sequence) and oligonucleotide 703 (containing nucleotides 4798-4827 of the rat α1A coding sequence). Hybridization and washes were performed using the same conditions as used for the first screening with oligonucleotide 704, except that washes were conducted at 45°C. Twenty clones (clones α1.269-α1.288) hybridized to the probe. Several clones were plaque-purified and characterized by restriction enzyme mapping and DNA sequence analysis. One clone, α1.279, contained a sequence of about 170 nucleotides that is not present in other clones corresponding to the same region of the coding sequence. This region may be present in other splice variants. None of the clones contained a translation initiation codon.
To obtain clones corresponding to the 5' end of the human α1A coding sequence, another cerebellum cDNA library was prepared using oligonucleotide 720 (containing nucleotides 2485-2510 of SEQ ID No. 22) to specifically prime first-strand cDNA synthesis. The library (8 x 10' recombinants) was screened for hybridization to three oligonucleotides: oligonucleotide 701, oligonucleotide 726 (containing nucleotides 2333-2360 of the rat α1A coding sequence) and oligonucleotide 700 (containing nucleotides 767-796 of the rat α1A coding sequence) under low stringency hybridization and washing conditions. Approximately 50 plaques hybridized to the probe. Hybridizing clones α1.381-α1.390 were plaque-purified and characterized by restriction enzyme mapping and DNA sequence analysis. At least one of the clones, α1.381, contained a translation initiation codon.
Alignment of the sequences of the purified clones revealed that the sequences overlapped to comprise the entire α1A coding sequence. However, not all the overlapping sequences of partial clones contained convenient restriction enzyme sites for use in ligating partial clones to construct a full-length α1A coding sequence. To obtain DNA fragments containing convenient restriction enzyme sites that could be used in constructing a full-length α1A DNA, cDNA was synthesized from RNA isolated from human cerebellum tissue and subjected to nucleic acid amplification. The oligonucleotides used as primers corresponded to human α1A coding sequence located 5' and 3' of selected restriction enzyme sites. Thus, in the first amplification reaction, oligonucleotides 753 (containing nucleotides 2368-2391 of SEQ ID No. 22) and 728 (containing nucleotides 3179-3202 of SEQ ID No. 22) were used as the primer pair. To provide a sufficient amount of the desired DNA fragment, the product of this amplification was reamplified using oligonucleotides 753 and 754 (containing nucleotides 3112-3135 of SEQ ID No. 22) as the primer pair.
The resulting product was 768 bp in length. In the second amplification reaction, oligonucleotides 719 (containing nucleotides 4950-4975 of SEQ ID No. 22) and 752 (containing nucleotides 5647-5670 of SEQ ID No. 22) were used as the primer pair. To provide a sufficient amount of the desired second DNA fragment, the product of this amplification was reamplified using oligonucleotides 756 (containing nucleotides 5112-5135 of SEQ ID No. 22) and 752 as the primer pair. The resulting product was 559 bp in length.
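The stated product sizes follow directly from the primer coordinates. The sketch below is an illustration, assuming each product spans from the first nucleotide of the upstream primer to the last nucleotide of the downstream primer in SEQ ID No. 22; the dictionary `PRIMERS` and the function `product_length` are hypothetical names introduced here.

```python
# Primer spans in SEQ ID No. 22 (1-based, inclusive), as listed in the text.
PRIMERS = {
    "753": (2368, 2391),
    "728": (3179, 3202),
    "754": (3112, 3135),
    "719": (4950, 4975),
    "752": (5647, 5670),
    "756": (5112, 5135),
}


def product_length(upstream: str, downstream: str) -> int:
    """Length of a product bounded by the outer ends of the two primers.

    Assumes the upstream primer marks the 5' boundary and the downstream
    primer marks the 3' boundary of the product.
    """
    first = PRIMERS[upstream][0]
    last = PRIMERS[downstream][1]
    if last < first:
        raise ValueError("downstream primer lies 5' of the upstream primer")
    return last - first + 1


first_reamplified = product_length("753", "754")
second_reamplified = product_length("756", "752")

# The first-round products are computed the same way; these sizes are
# derived here and are not stated in the text.
first_round = product_length("753", "728")
second_round = product_length("719", "752")

print(first_reamplified, second_reamplified, first_round, second_round)
```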
2. Construction of full-length α1A coding sequences
Portions of clone α1.381, the 768-bp nucleic acid amplification product, clone α1.278, the 559-bp nucleic acid amplification product, and clone α1.244 were ligated at convenient restriction sites to generate a full-length α1A coding sequence referred to as α1A-1. Comparison of the results of sequence analysis of clones α1.244 and α1.254 indicated that the primary transcript of the α1A subunit gene is alternatively spliced to yield at least two variant mRNAs encoding different forms of the α1A subunit. One form, α1A-1, is encoded by the sequence shown in SEQ ID No. 22.
The sequence encoding a second form, α1A-2, differs from the α1A-1-encoding sequence at the 3' end in that it lacks a sequence found in clone α1.244 (nucleotides 7035-7039 of SEQ ID No. 22). This deletion shifts the reading frame and introduces a translation termination codon, resulting in an α1A-2 coding sequence that encodes a shorter α1A subunit than that encoded by the α1A-1 splice variant. Consequently, a portion of the 3' end of the α1A-1 coding sequence is actually 3' untranslated sequence in the α1A-2 DNA. The complete sequence of α1A-2, which can be constructed by ligating portions of clone α1.381, the 768-bp nucleic acid amplification product, clone α1.278, the 559-bp nucleic acid amplification product and clone α1.254, is set forth in SEQ ID No. 23.
E. Isolation of DNA Encoding the α1E Subunit
DNA encoding α1E subunits of the human calcium channel were isolated from human hippocampus libraries. The selected clones were sequenced. DNA sequence analysis of DNA clones encoding the α1E subunit indicated that at least two alternatively spliced forms of the same α1E subunit primary transcript are expressed. One form has the sequence set forth in SEQ ID No. 24 and was designated α1E-1, and the other was designated α1E-3, which has the sequence obtained by inserting a 57 base pair fragment between nucleotides 2405 and 2406 of SEQ ID No. 24. The resulting sequence is set forth in SEQ ID No. The subunit designated α1E-1 has a calculated molecular weight of 254,836 and the subunit designated α1E-3 has a calculated molecular weight of 257,348. α1E-3 has a 19 amino acid insertion (encoded by SEQ ID No. 25) relative to α1E-1 in the region that appears to be the cytoplasmic loop between transmembrane domains IIS6 and IIIS1.
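The figures quoted for the two α1E forms are mutually consistent, as the following sketch shows. It is illustrative only: the helper `insert_after` and the toy sequence in the usage line are assumptions made for demonstration and are not taken from the sequence listing.

```python
INSERT_BP = 57       # length of the inserted fragment
MW_E1 = 254_836      # calculated molecular weight of the shorter form
MW_E3 = 257_348      # calculated molecular weight of the longer form

# 57 bp is a whole number of codons, so the insertion keeps the reading frame.
codons, leftover = divmod(INSERT_BP, 3)

# Mass added by the insertion, and the implied mean mass per inserted residue.
extra_mass = MW_E3 - MW_E1
mean_residue_mass = extra_mass / codons


def insert_after(sequence: str, position: int, fragment: str) -> str:
    """Insert `fragment` after 1-based `position` of `sequence`."""
    if not 0 <= position <= len(sequence):
        raise ValueError("position outside sequence")
    return sequence[:position] + fragment + sequence[position:]


# Toy usage (placeholder sequence, not real data): insert after nt 4.
toy = insert_after("AAAATTTT", 4, "GGG")

print(codons, leftover, extra_mass, round(mean_residue_mass, 1), toy)
```

With real data, the same helper would be called with position 2405 to place the 57 bp fragment between nucleotides 2405 and 2406.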
EXAMPLE III: ISOLATION OF cDNA CLONES ENCODING THE HUMAN NEURONAL CALCIUM CHANNEL β1 subunit
A. Isolation of partial cDNA clones encoding the β1 subunit and construction of a full-length clone encoding the β1 subunit
A human hippocampus cDNA library was screened with the rabbit skeletal muscle calcium channel β1 subunit cDNA fragment (nt 441 to 1379) [for isolation and sequence of the rabbit skeletal muscle calcium channel β1 subunit cDNA, see U.S. Patent Application Serial No. 482,384 or Ruth et al. (1989) Science 245:1115] using standard hybridization conditions (Example A portion of one of the hybridizing clones was used to rescreen the hippocampus library to obtain additional cDNA clones. The cDNA inserts of hybridizing clones were characterized by restriction mapping and DNA sequencing and compared to the rabbit skeletal muscle calcium channel β1 subunit cDNA sequence.
Portions of the partial β1 subunit cDNA clones were ligated to generate a full-length clone encoding the entire β1 subunit. SEQ ID No. 9 shows the β1 subunit coding sequence (nt 1-1434) as well as a portion of the 3' untranslated sequence (nt 1435-1546). The deduced amino acid sequence is also provided in SEQ ID No. 9. In order to perform expression experiments, full-length β1 subunit cDNA clones were constructed as follows.
Step 1: DNA fragment 1 (~800 bp of 5' untranslated sequence plus nt 1-277 of SEQ ID No. 9) was ligated to DNA fragment 2 (nt 277-1546 of SEQ ID No. 9 plus 448 bp of intron sequence) and cloned into pGEM7Z. The resulting plasmid, pβ1-1.18, contained a full-length β1 subunit clone that included a 448-bp intron.
Step 2: To replace the 5' untranslated sequence of pβ1-1.18 with a ribosome binding site, a double-stranded adapter was synthesized that contains an EcoRI site, sequence encoding a ribosome binding site (5'-ACCACC-3') and nt 1-25 of SEQ ID No. 9. The adapter was ligated to SmaI-digested pβ1-1.18, and the products of the ligation reaction were digested with EcoRI.
Step 3: The EcoRI fragment from step 2 containing the EcoRI adapter, efficient ribosome binding site and nt 1-1546 of SEQ ID No. 9 plus intron sequence was cloned into a plasmid vector and designated pβ1-1.18RBS. The EcoRI fragment of pβ1-1.18RBS was subcloned into EcoRI-digested pcDNA1 with the initiation codon proximal to the CMV promoter to form pHBCaCHβ1aRBS.
Step 4: To generate a full-length clone encoding the β1 subunit lacking intron sequence, DNA fragment 3 (nt 69-1146 of SEQ ID No. 9 plus 448 bp of intron sequence followed by nt 1147-1546 of SEQ ID No. 9) was subjected to site-directed mutagenesis to delete the intron sequence, thereby yielding pβ1(-). The EcoRI-XhoI fragment of pβ1-1.18RBS (containing the ribosome binding site and nt 1-277 of SEQ ID No. 9) was ligated to the XhoI-EcoRI fragment of pβ1(-) (containing nt 277-1546 of SEQ ID No. 9) and cloned into pcDNA1 with the initiation of translation proximal to the CMV promoter. The resulting expression plasmid was designated pHBCaCHβ1bRBS(A).
B. Splice Variant β1-3
DNA sequence analysis of the DNA clones encoding the β1 subunit indicated that in the CNS at least two alternatively spliced forms of the same human β1 subunit primary transcript are expressed. One form is represented by the sequence shown in SEQ ID No. 9 and is referred to as β1-2. The sequences of β1-2 and the alternative form, β1-3, diverge at nt 1334 (SEQ ID No. The complete β1-3 sequence (nt 1-1851), including 3' untranslated sequence (nt 1795-1851), is set forth in SEQ ID No.
EXAMPLE IV: ISOLATION OF cDNA CLONES ENCODING THE HUMAN NEURONAL CALCIUM CHANNEL α2-subunit
A. Isolation of cDNA clones
The complete human neuronal α2 coding sequence (nt 35 to 3310) plus a portion of the 5' untranslated sequence (nt 1 to 34) as well as a portion of the 3' untranslated sequence (nt 3311-3600) is set forth in SEQ ID No. 11.
To isolate DNA encoding the human neuronal α2 subunit, human α2 genomic clones first were isolated by probing human genomic Southern blots using a rabbit skeletal muscle calcium channel α2 subunit cDNA fragment [nt 43 to 272, Ellis et al. (1988) Science 240:1661]. Human genomic DNA was digested with EcoRI, electrophoresed, blotted, and probed with the rabbit skeletal muscle probe using standard hybridization conditions (Example ) and low stringency washing conditions (Example ). Two restriction fragments were identified, 3.5 kb and 3.0 kb. These EcoRI restriction fragments were cloned by preparing a λgt11 library containing human genomic EcoRI fragments ranging from 2.2 kb to 4.3 kb. The library was screened as described above using the rabbit α2 probe, and hybridizing clones were isolated and characterized by DNA sequencing. HGCaCHα2.20 contained the 3.5 kb fragment and HGCaCHα2.9 contained the 3.0 kb fragment.
Restriction mapping and DNA sequencing revealed that HGCaCHα2.20 contains an 82 bp exon (nt 130 to 211 of the human α2 coding sequence, SEQ ID No. 11) on a 650 bp PstI-XbaI restriction fragment and that HGCaCHα2.9 contains 105 bp of an exon (nt 212 to 316 of the coding sequence, SEQ ID No. 11) on a 750 bp XbaI-BglII restriction fragment. These restriction fragments were used to screen the human basal ganglia cDNA library (Example ). HBCaCHα2.1 was isolated (nt 29 to 1163, SEQ ID No. 11) and used to screen a human brain stem cDNA library (ATCC Accession No. 37432) obtained from the American Type Culture Collection, 12301 Parklawn Drive, Rockville, MD 20852. Two clones were isolated, HBCaCHα2.5 (nt 1 to 1162, SEQ ID No. 11) and HBCaCHα2.8 (nt 714 to 1562, SEQ ID No. 11, followed by 1600 nt of intervening sequence). A 2400 bp fragment of HBCaCHα2.8 (beginning at nt 759 of SEQ ID No. 11 and ending at a SmaI site in the intron) was used to rescreen the brain stem library and to isolate HBCaCHα2.11 (nt 879 to 3600, SEQ ID No. 11). Clones HBCaCHα2.5 and HBCaCHα2.11 overlap to encode an entire human brain α2 protein.
B. Construction of pHBCaCHα2A
To construct pHBCaCHα2A, containing DNA encoding a full-length human calcium channel α2 subunit, an (EcoRI)-PvuII fragment of HBCaCHα2.5 (nt 1 to 1061, SEQ ID No. 11; EcoRI adapter, PvuII partial digest) and a PvuII-PstI fragment of HBCaCHα2.11 (nt 1061 to 2424, SEQ ID No. 11; PvuII partial digest) were ligated into EcoRI-PstI-digested pIBI24 (Stratagene, La Jolla, CA). Subsequently, an (EcoRI)-PstI fragment (nt 1 to 2424, SEQ ID No. 11) was isolated and ligated to a PstI-(EcoRI) fragment (nt 2424 to 3600, SEQ ID No. 11) of HBCaCHα2.11 in EcoRI-digested pIBI24 to produce DNA, HBCaCHα2, encoding a full-length human brain α2 subunit. The 3600 bp EcoRI insert of HBCaCHα2 (nt 1 to 3600, SEQ ID No. 11) was subcloned into pcDNA1 (pHBCaCHα2A) with the methionine initiating codon proximal to the CMV promoter. The 3600 bp EcoRI insert of HBCaCHα2 was also subcloned into pSV2dHFR [Subramani et al. (1981) Mol. Cell. Biol. 1:854-864], which contains the SV40 early promoter, mouse dihydrofolate reductase (dhfr) gene, SV40 polyadenylation and splice sites and sequences required for maintenance of the vector in bacteria.
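The assembly described above joins three restriction fragments at shared PvuII and PstI sites. As an illustrative aid only, the following Python sketch checks that the stated fragment coordinates tile the 3600 nt insert and computes the overlap between two clones. The function names are hypothetical, the coordinates are those quoted in the text, and attributing the nt 1 to 1162 span to clone HBCaCHα2.5 is an assumption drawn from the surrounding description.

```python
# Fragment coordinates (nt of SEQ ID No. 11) as quoted in the text.
# Each fragment ends at the restriction site where the next one begins.
FRAGMENTS = [
    ("(EcoRI)-PvuII fragment", 1, 1061),
    ("PvuII-PstI fragment", 1061, 2424),
    ("PstI-(EcoRI) fragment", 2424, 3600),
]


def check_tiling(fragments, full_start, full_end):
    """Return True if the fragments span full_start..full_end and each
    fragment starts at the coordinate where the previous one ends."""
    if fragments[0][1] != full_start or fragments[-1][2] != full_end:
        return False
    return all(prev[2] == nxt[1] for prev, nxt in zip(fragments, fragments[1:]))


def overlap_length(span_a, span_b):
    """Number of nucleotides shared by two inclusive (start, end) spans."""
    start = max(span_a[0], span_b[0])
    end = min(span_a[1], span_b[1])
    return max(0, end - start + 1)


if __name__ == "__main__":
    print("fragments tile nt 1-3600:", check_tiling(FRAGMENTS, 1, 3600))
    # Assumed clone spans: nt 1-1162 and nt 879-3600.
    print("clone overlap (nt):", overlap_length((1, 1162), (879, 3600)))
```

A check of this kind only confirms that the coordinates are mutually consistent; it says nothing about the digests or ligations themselves.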
EXAMPLE V. DIFFERENTIAL PROCESSING OF THE HUMAN β1 TRANSCRIPT AND THE HUMAN α2 TRANSCRIPT
A. Differential processing of the β1 transcript
Nucleic acid amplification analysis of the human β1 transcript present in skeletal muscle, aorta, hippocampus and basal ganglia, and HEK 293 cells revealed differential processing of the region corresponding to nt 615-781 of SEQ ID No. 9 in each of the tissues. Four different sequences that result in five different processed β1 transcripts through this region were identified. The transcripts from the different tissues contained different combinations of the four sequences, except for one of the β1 transcripts expressed in HEK 293 cells (β1-5), which lacked all four sequences.
None of the β1 transcripts contained each of the four sequences; however, for ease of reference, all four sequences are set forth end-to-end as a single long sequence in SEQ ID No. 12. The four sequences that are differentially processed are sequence 1 (nt 14-34 in SEQ ID No. 12), sequence 2 (nt in SEQ ID No. 12), sequence 3 (nt 56-190 in SEQ ID No. 12) and sequence 4 (nt 191-271 in SEQ ID No. 12). The forms of the β1 transcript that have been identified include: a form that lacks sequence 1 called (expressed in skeletal muscle), a form that lacks sequences 2 and 3 called β1-2 (expressed in CNS), a form that lacks sequences 1, 2 and 3 called (expressed in aorta and HEK cells) and a form that lacks sequences 1-4 called (expressed in HEK cells).
Additionally, the and 91- contain a guanine nucleotide (nt 13 in SEQ ID No. 12) that is absent in the 91 and forms.
The sequences of the β1 splice variants are set forth in SEQ ID Nos. 9, 10 and 33-35.
B. Differential processing of transcripts encoding the α2 subunit
The complete human neuronal α2 coding sequence (nt 35-3307) plus a portion of the 5' untranslated sequence (nt 1 to 34) as well as a portion of the 3' untranslated sequence (nt 3308-3600) is set forth as SEQ ID No. 11.
Nucleic acid amplification analysis of the human α2 transcript present in skeletal muscle, aorta, and CNS revealed differential processing of the region corresponding to nt 1595-1942 of SEQ ID No. 11 in each of the tissues.
The analysis indicated that the primary transcript of the genomic DNA that includes the nucleotides corresponding to nt 1595-1942 also includes an additional sequence (SEQ ID No. 13: 5' AAGAAAAAGGAGACCCAATATCCAG 3') inserted between nt 1624 and 1625 of SEQ ID No. 11. Five alternatively spliced variant transcripts that differ in the presence or absence of one to three different portions of the region of the primary transcript that includes the region of nt 1595-1942 of SEQ ID No. 11 plus SEQ ID No. 13 inserted between nt 1624 and 1625 have been identified. The five α2-encoding transcripts from the different tissues include different combinations of the three sequences, except for one of the α2 transcripts expressed in aorta, which lacks all three sequences. None of the α2 transcripts contained each of the three sequences. The sequences of the three regions that are differentially processed are sequence 1 (SEQ ID No. 13), sequence 2 (5' AACCCCAAATCTCAG 3', which is nt 1625-1639 of SEQ ID No. 11), and sequence 3 (5' CAAAAAAGGGCAAAATGAAGG 3', which is nt 1908-1928 of SEQ ID No. 11). The five α2 forms identified are a form that lacks sequence 3 called α2a (expressed in skeletal muscle), a form that lacks sequence 1 called α2b (expressed in CNS), a form that lacks sequences 1 and 2 called α2c (expressed in aorta), a form that lacks sequences 1, 2 and 3 called α2d (expressed in aorta) and a form that lacks sequences 1 and 3 called α2e (expressed in aorta).
The sequences of az-a, are set forth in SEQ ID Nos 29(a,) and to 32 (a,-a 2 respectively.
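The combinatorial description of the α2 forms can be tabulated in a few lines of Python. This is a minimal sketch, not part of the specification: the three differentially processed sequences are copied from the text, while the single-letter form labels "a" to "e" are assumed to follow the order in which the forms are listed, and the function names are hypothetical.

```python
# Differentially processed sequences as quoted in the text.
SEGMENTS = {
    1: "AAGAAAAAGGAGACCCAATATCCAG",  # sequence 1 (SEQ ID No. 13)
    2: "AACCCCAAATCTCAG",            # sequence 2 (nt 1625-1639 of SEQ ID No. 11)
    3: "CAAAAAAGGGCAAAATGAAGG",      # sequence 3 (nt 1908-1928 of SEQ ID No. 11)
}

# Which sequences each form lacks. The labels are assumed from listing order.
LACKING = {
    "a": {3},
    "b": {1},
    "c": {1, 2},
    "d": {1, 2, 3},
    "e": {1, 3},
}


def retained_segments(form):
    """Sequence numbers (1-3) present in the given form."""
    return [n for n in sorted(SEGMENTS) if n not in LACKING[form]]


def retained_length(form):
    """Total length in nt of the differentially processed sequences kept."""
    return sum(len(SEGMENTS[n]) for n in retained_segments(form))


if __name__ == "__main__":
    for form in sorted(LACKING):
        print(form, retained_segments(form), retained_length(form))
```

The table makes it easy to confirm that no form retains all three sequences and that exactly one form lacks all of them, as the text states.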
EXAMPLE VI: ISOLATION OF DNA ENCODING A CALCIUM CHANNEL γ SUBUNIT FROM A HUMAN BRAIN cDNA LIBRARY
A. Isolation of DNA encoding the γ subunit
Approximately 1 x 10^6 recombinants from a λgt11-based human hippocampus cDNA library (Clontech catalog #HL1088b, Palo Alto, CA) were screened by hybridization to a 484 bp sequence of the rabbit skeletal muscle calcium channel γ subunit cDNA (nucleotides 621-626 of the coding sequence plus 438 nucleotides of 3'-untranslated sequence) contained in vector γJ10 [Jay, S. et al. (1990) Science 248:490-492].
Hybridization was performed using moderate stringency conditions (20% deionized formamide, 5x Denhardt's, 6 x SSPE, 0.2% SDS, 20 µg/ml herring sperm DNA, 42°C) and the filters were washed under low stringency (see Example ). A plaque that hybridized to this probe was purified and insert DNA was subcloned into pGEM7Z. This cDNA insert was designated γ1.4.
B. Characterization of γ1.4
γ1.4 was confirmed by DNA hybridization and characterized by DNA sequencing. The 1500 bp SstI fragment of γ1.4 hybridized to the rabbit skeletal muscle calcium channel γ subunit cDNA γJ10 on a Southern blot. Sequence analysis of this fragment revealed that it consists of approximately 500 nt of human DNA sequence and ~1000 nt of λgt11 sequence (included due to apparent destruction of one of the EcoRI cloning sites in λgt11). The human DNA sequence consists of 129 nt of coding sequence followed immediately by a translational STOP codon and 3' untranslated sequence (SEQ ID No. 14).
To isolate the remaining 5' sequence of the human γ subunit cDNA, human CNS cDNA libraries and/or preparations of mRNA from human CNS tissues can first be assayed by nucleic acid amplification analysis methods using oligonucleotide primers based on the γ cDNA-specific sequence of γ1.4.
Additional human neuronal γ subunit-encoding DNA can be isolated from cDNA libraries that, based on the results of the nucleic acid amplification analysis assay, contain γ-specific amplifiable cDNA. Alternatively, cDNA libraries can be constructed from mRNA preparations that, based on the results of the nucleic acid amplification analysis assays, contain γ-specific amplifiable transcripts. Such libraries are constructed by standard methods using oligo dT to prime first-strand cDNA synthesis from poly A+ RNA (see Example I.B.). Alternatively, first-strand cDNA can be specified by priming first-strand cDNA synthesis with a γ cDNA-specific oligonucleotide based on the human DNA sequence in γ1.4. A cDNA library can then be constructed based on this first-strand synthesis and screened with the γ-specific portion of γ1.4.
EXAMPLE VII: ISOLATION OF cDNA CLONES ENCODING THE HUMAN NEURONAL Ca CHANNEL β2 SUBUNIT
Isolation of DNA encoding human calcium channel β2 subunits
Sequencing of clones isolated as described in Example III revealed a clone encoding a human neuronal calcium channel β2 subunit (designated β2D; see SEQ ID No. 26). An oligonucleotide based on the 5' end of this clone was used to prime a human hippocampus cDNA library. The library was screened with this β2 clone under conditions of low to medium stringency (final wash 0.5 X SSPE, 50°C). Several hybridizing clones were isolated and sequenced. Among these clones were those that encode β2 splice variants. For example, the sequence of β2C is set forth in SEQ ID No. 37, and the sequence of β2E is set forth in SEQ ID No. 38.
A randomly primed hippocampus library was then screened using a combination of the clone encoding β2 and a portion of the β3 clone deposited under ATCC Accession No. 69048.
Multiple hybridizing clones were isolated. Among these were clones designated β101, β102 and β104. β101 appears to encode the 5' end of a splice variant of β2, designated β2E. β102 and β104 encode portions of the 3' end of β2.
It appears that the β2 splice variants include nucleotides 182-2294 of SEQ ID No. 26 and differ only between the start codon and nucleotides that correspond to 212 of SEQ ID No. 26.
EXAMPLE VIII: ISOLATION OF cDNA CLONES ENCODING HUMAN CALCIUM CHANNEL β4 AND β3 SUBUNITS
A. Isolation of cDNA Clones Encoding a Human β4 Subunit
A clone containing a translation initiation codon and approximately 60% of the coding sequence was obtained from a human cerebellum cDNA library (see nucleotides 1-894 of Sequence ID No. 27). To obtain DNA encoding the remaining 3' portion of the β4 coding sequence, a human cerebellum cDNA library was screened for hybridization to a nucleic acid amplification product under high stringency hybridization and wash conditions. Hybridizing clones are purified and characterized by restriction enzyme mapping and DNA sequence analysis to identify those that contain sequence corresponding to the 3' end of the β4 subunit coding sequence and a termination codon. Selected clones are ligated to the clone containing the 5' half of the β4 coding sequence at convenient restriction sites to generate a full-length cDNA encoding a β4 subunit. The sequence of a full-length β4 clone is set forth in SEQ ID No. 27; the amino acid sequence is set forth in SEQ ID No. 28.
B. Isolation of cDNA Clones Encoding a Human β3 Subunit
Sequencing of clones isolated as described in Example III also revealed a clone encoding a human neuronal calcium channel β3 subunit. This clone has been deposited as plasmid 11.42 (ATCC Accession No. 69048).
To isolate a full-length cDNA clone encoding a complete β3 subunit, a human hippocampus cDNA library (Stratagene, La Jolla, CA) was screened for hybridization to a 5' EcoRI-PstI fragment of the cDNA encoding β1 using lower stringency hybridization conditions (20% deionized formamide, 200 µg/ml sonicated herring sperm DNA, 5X SSPE, 5X Denhardt's solution, 42°C) and wash conditions. One of the hybridizing clones contained both translation initiation and termination codons and encodes a complete β3 subunit designated β3-1 (Sequence ID No. 19). In vitro transcripts of the cDNA were prepared and injected into Xenopus oocytes along with transcripts of the n-1 and α2b cDNAs using methods similar to those described in Example IX.D. Two-electrode voltage clamp recordings of the oocytes revealed significant voltage-dependent inward Ba2+ currents.
An additional β3 subunit-encoding clone, designated β3-2, was obtained by screening a human cerebellum cDNA library for hybridization to the nucleic acid amplification product referred to in Example VIII.A. under lower stringency (deionized formamide, 200 µg/ml sonicated herring sperm DNA, SSPE, 5X Denhardt's solution, 42°C) hybridization and wash conditions. This clone (Sequence ID No. 20, β3-2) and the first β3 subunit clone, designated β3-1 (Sequence ID No. 19), differ at their 5' ends and are splice variants of the β3 gene.
EXAMPLE IX: RECOMBINANT EXPRESSION OF HUMAN NEURONAL CALCIUM CHANNEL SUBUNIT-ENCODING cDNA AND RNA TRANSCRIPTS IN MAMMALIAN CELLS
A. Recombinant Expression of the Human Neuronal Calcium Channel α2 subunit cDNA in DG44 Cells
1. Stable transfection of DG44 cells
DG44 cells [dhfr- Chinese hamster ovary cells; see, e.g., Urlaub, G. et al. (1986) Som. Cell Molec. Genet. 12:555-566] obtained from Lawrence Chasin at Columbia University were stably transfected by CaPO4 precipitation methods [Wigler et al. (1979) Proc. Natl. Acad. Sci. USA 76:1373-1376] with pSV2dhfr vector containing the human neuronal calcium channel α2-subunit cDNA (see Example IV) for polycistronic expression/selection in transfected cells. Transfectants were grown on 10% DMEM medium without hypoxanthine or thymidine in order to select cells that had incorporated the expression vector. Twelve transfectant cell lines were established as indicated by their ability to survive on this medium.
2. Analysis of α2 subunit cDNA expression in transfected DG44 cells
Total RNA was extracted according to the method of Birnboim [(1988) Nuc. Acids Res. 16:1487-1497] from four of the DG44 cell lines that had been stably transfected with pSV2dhfr containing the human neuronal calcium channel α2 subunit cDNA. RNA (~15 µg per lane) was separated on a 1% agarose formaldehyde gel, transferred to nitrocellulose and hybridized to the random-primed human neuronal calcium channel α2 cDNA (hybridization: 50% formamide, 5 x SSPE, 5 x Denhardt's, 42°C; wash: 0.2 x SSPE, 0.1% SDS, 65°C). Northern blot analysis of total RNA from four of the DG44 cell lines that had been stably transfected with pSV2dhfr containing the human neuronal calcium channel α2 subunit cDNA revealed that one of the four cell lines contained hybridizing mRNA of the size expected for the transcript of the α2 subunit cDNA (5000 nt based on the size of the cDNA) when grown in the presence of 10 mM sodium butyrate for two days. Butyrate nonspecifically induces transcription and is often used for inducing the SV40 early promoter [Gorman, C. and Howard, B. (1983) Nucleic Acids Res. 11:1631]. This cell line, 44α2-9, also produced mRNA species smaller (several species) and larger (6800 nt) than the size expected for the transcript of the α2 cDNA (5000 nt) that hybridized to the α2 cDNA-based probe. The 5000- and 6800-nt transcripts produced by this transfectant should contain the entire α2 subunit coding sequence and therefore should yield a full-length α2 subunit protein. A weakly hybridizing 8000-nucleotide transcript was present in untransfected and transfected DG44 cells.
Apparently, DG44 cells transcribe a calcium channel α2 subunit or similar gene at low levels. The level of expression of this endogenous α2 subunit transcript did not appear to be affected by exposing the cells to butyrate before isolation of RNA for northern analysis.
Total protein was extracted from three of the DG44 cell lines that had been stably transfected with pSV2dhfr containing the human neuronal calcium channel α2 subunit cDNA. Approximately 10^7 cells were sonicated in 300 µl of a solution containing 50 mM HEPES, 1 mM EDTA, 1 mM PMSF. An equal volume of 2x loading dye [Laemmli, U.K. (1970) Nature 227:680] was added to the samples and the protein was subjected to electrophoresis on an 8% polyacrylamide gel and then electrotransferred to nitrocellulose. The nitrocellulose was incubated with polyclonal guinea pig antisera (1:200 dilution) directed against the rabbit skeletal muscle calcium channel α2 subunit (obtained from K. Campbell, University of Iowa) followed by incubation with [125I]-protein A. The blot was exposed to X-ray film at -70°C. Reduced samples of protein from the transfected cells as well as from untransfected DG44 cells contained immunoreactive protein of the size expected for the α2 subunit of the human neuronal calcium channel (130-150 kDa). The level of this immunoreactive protein was higher in 44α2-9 cells that had been grown in the presence of 10 mM sodium butyrate than in 44α2-9 cells that were grown in the absence of sodium butyrate. These data correlate well with those obtained in northern analyses of total RNA from 44α2-9 and untransfected DG44 cells. Cell line 44α2-9 also produced a 110 kD immunoreactive protein that may be either a product of proteolytic degradation of the full-length α2 subunit or a product of translation of one of the shorter (<5000 nt) mRNAs produced in this cell line that hybridized to the α2 subunit cDNA probe.
B. Expression of DNA encoding human neuronal calcium channel α1, α2 and β1 subunits in HEK cells
Human embryonic kidney cells (HEK 293 cells) were transiently and stably transfected with human neuronal DNA encoding calcium channel subunits. Individual transfectants were analyzed electrophysiologically for the presence of voltage-activated barium currents and functional recombinant voltage-dependent calcium channels were.
1. Transfection of HEK 293 cells
Separate expression vectors containing DNA encoding human neuronal calcium channel α1, α2 and β1 subunits, plasmids pVDCCIII(A), pHBCaCHα2A, and pHBCaCHβ1aRBS(A), respectively, were constructed as described in Examples II.A.3, IV.B. and III.B.3., respectively. These three vectors were used to transiently co-transfect HEK 293 cells. For stable transfection of HEK 293 cells, vector pHBCaCHβ1bRBS(A) (Example III.B.3.) was used in place of pHBCaCHβ1aRBS(A) to introduce the DNA encoding the β1 subunit into the cells along with pVDCCIII(A) and pHBCaCHα2A.
a. Transient transfection
Expression vectors pVDCCIII(A), pHBCaCHα2A and pHBCaCHβ1aRBS(A) were used in two sets of transient transfections of HEK 293 cells (ATCC Accession No. CRL1573). In one transfection procedure, HEK 293 cells were transiently cotransfected with the α1 subunit cDNA expression plasmid, the α2 subunit cDNA expression plasmid, the β1 subunit cDNA expression plasmid and plasmid pCMVβgal (Clontech Laboratories, Palo Alto, CA). Plasmid pCMVβgal contains the lacZ gene (encoding E. coli β-galactosidase) fused to the cytomegalovirus (CMV) promoter and was included in this transfection as a marker gene for monitoring the efficiency of transfection. In the other transfection procedure, HEK 293 cells were transiently co-transfected with the α1 subunit cDNA expression plasmid pVDCCIII(A) and pCMVβgal. In both transfections, 2-4 x 10^6 HEK 293 cells in a 10-cm tissue culture plate were transiently co-transfected with 5 µg of each of the plasmids included in the experiment according to standard CaPO4 precipitation transfection procedures [Wigler et al. (1979) Proc. Natl. Acad. Sci. USA 76:1373-1376]. The transfectants were analyzed for β-galactosidase expression by direct staining of the product of a reaction involving β-galactosidase and the X-gal substrate [Jones, J.R. (1986) EMBO 5:3133-3142] and by measurement of β-galactosidase activity [Miller, J.H. (1972) Experiments in Molecular Genetics, pp. 352-355, Cold Spring Harbor Press]. To evaluate subunit cDNA expression in these transfectants, the cells were analyzed for subunit transcript production (northern analysis), subunit protein production (immunoblot analysis of cell lysates) and functional calcium channel expression (electrophysiological analysis).
b. Stable transfection
HEK 293 cells were transfected using the calcium phosphate transfection procedure [Current Protocols in Molecular Biology, Vol. 1, Wiley Inter-Science, Supplement 14, Unit 9.1.1-9.19 (1990)]. Ten-cm plates, each containing one-to-two million HEK 293 cells, were transfected with 1 ml of DNA/calcium phosphate precipitate containing 5 µg pVDCCIII(A), 5 µg pHBCaCHα2A, 5 µg pHBCaCHβ1bRBS(A), 5 µg pCMVβgal and 1 ng pSV2neo (as a selectable marker). After 10-20 days of growth in media containing 500 µg G418, colonies had formed and were isolated using cloning cylinders.
2. Analysis of HEK 293 cells transiently transfected with DNA encoding human neuronal calcium channel subunits
a. Analysis of β-galactosidase expression
Transient transfectants were assayed for β-galactosidase expression by β-galactosidase activity assays (Miller, J.H. (1972) Experiments in Molecular Genetics, pp. 352-355, Cold Spring Harbor Press) of cell lysates (prepared as described in Example VII.A.2) and staining of fixed cells (Jones, J.R. (1986) EMBO 5:3133-3142). The results of these assays indicated that approximately 30% of the HEK 293 cells had been transfected.
b. Northern analysis
PolyA+ RNA was isolated using the Invitrogen Fast Trak Kit (Invitrogen, San Diego, CA) from HEK 293 cells transiently transfected with DNA encoding each of the α1, α2 and β1 subunits and the lacZ gene or the α1 subunit and the lacZ gene. The RNA was subjected to electrophoresis on an agarose gel and transferred to nitrocellulose. The nitrocellulose was then hybridized with one or more of the following radiolabeled probes: the lacZ gene, human neuronal calcium channel α1 subunit-encoding cDNA, human neuronal calcium channel α2 subunit-encoding cDNA or human neuronal calcium channel β1 subunit-encoding cDNA. Two transcripts that hybridized with the α1 subunit-encoding cDNA were detected in HEK 293 cells transfected with the DNA encoding the α1, α2 and β1 subunits and the lacZ gene as well as in HEK 293 cells transfected with the α1 subunit cDNA and the lacZ gene. One mRNA species was the size expected for the transcript of the α1 subunit cDNA (8000 nucleotides). The second RNA species was smaller (4000 nucleotides) than the size expected for this transcript. RNA of the size expected for the transcript of the lacZ gene was detected in cells transfected with the α1, α2 and β1 subunit-encoding cDNA and the lacZ gene and in cells transfected with the α1 subunit cDNA and the lacZ gene by hybridization to the lacZ gene sequence.
RNA from cells transfected with the α1, α2 and β1 subunit-encoding cDNA and the lacZ gene was also hybridized with the α2 and β1 subunit cDNA probes. Two mRNA species hybridized to the α2 subunit cDNA probe. One species was the size expected for the transcript of the α2 subunit cDNA (4000 nucleotides). The other species was larger (6000 nucleotides) than the expected size of this transcript. Multiple RNA species in the cells co-transfected with α1, α2 and β1 subunit-encoding cDNA and the lacZ gene hybridized to the β1 subunit cDNA probe. Multiple β1 subunit transcripts of varying sizes were produced since the β1 subunit cDNA expression vector contains two potential polyA+ addition sites.
c. Electrophysiological analysis
Individual transiently transfected HEK 293 cells were assayed for the presence of voltage-dependent barium currents using the whole-cell variant of the patch clamp technique [Hamill et al. (1981) Pflugers Arch. 391:85-100]. HEK 293 cells transiently transfected with pCMVβgal only were assayed for barium currents as a negative control in these experiments. The cells were placed in a bathing solution that contained barium ions to serve as the current carrier. Choline chloride, instead of NaCl or KCl, was used as the major salt component of the bath solution to eliminate currents through sodium and potassium channels. The bathing solution contained 1 mM MgCl2 and was buffered at pH 7.3 with mM HEPES (pH adjusted with sodium or tetraethylammonium hydroxide). Patch pipettes were filled with a solution containing 135 mM CsCl, 1 mM MgCl2, 10 mM glucose, 10 mM EGTA, 4 mM ATP and 10 mM HEPES (pH adjusted to 7.3 with tetraethylammonium hydroxide). Cesium and tetraethylammonium ions block most types of potassium channels. Pipettes were coated with Sylgard (Dow-Corning, Midland, MI) and had resistances of 1-4 megohm. Currents were measured through a 500 megohm headstage resistor with the Axopatch 1C (Axon Instruments, Foster City, CA) amplifier, interfaced with a Labmaster (Scientific Solutions, Solon, OH) data acquisition board in an IBM-compatible PC. pClamp (Axon Instruments) was used to generate voltage commands and acquire data. Data were analyzed with pClamp or Quattro Professional (Borland International, Scotts Valley, CA) programs.
To apply drugs, "puffer" pipettes positioned within several micrometers of the cell under study were used to apply solutions by pressure application. The drugs used for pharmacological characterization were dissolved in a solution identical to the bathing solution. Samples of a 10 mM stock solution of Bay K 8644 (RBI, Natick, MA), which was prepared in DMSO, were diluted to a final concentration of 1 µM in mM Ba2+-containing bath solution before they were applied.
Twenty-one negative control HEK 293 cells (transiently transfected with the lacZ gene expression vector pCMVβgal only) were analyzed by the whole-cell variant of the patch clamp method for recording currents. Only one cell displayed a discernible inward barium current; this current was not affected by the presence of 1 µM Bay K 8644. In addition, application of Bay K 8644 to four cells that did not display Ba2+ currents did not result in the appearance of any currents.
Two days after transient transfection of HEK 293 cells with α1, α2 and β1 subunit-encoding cDNA and the lacZ gene, individual transfectants were assayed for voltage-dependent barium currents. The currents in nine transfectants were recorded. Because the efficiency of transfection of one cell can vary from the efficiency of transfection of another cell, the degree of expression of heterologous proteins in individual transfectants varies and some cells do not incorporate or express the foreign DNA. Inward barium currents were detected in two of these nine transfectants. In these assays, the holding potential of the membrane was mV. The membrane was depolarized in a series of voltage steps to different test potentials and the current in the presence and absence of 1 µM Bay K 8644 was recorded. The inward barium current was significantly enhanced in magnitude by the addition of Bay K 8644. The largest inward barium current (~160 pA) was recorded when the membrane was depolarized to 0 mV in the presence of 1 µM Bay K 8644. A comparison of the I-V curves, generated by plotting the largest current recorded after each depolarization versus the depolarization voltage, corresponding to recordings conducted in the absence and presence of Bay K 8644 illustrated the enhancement of the voltage-activated current in the presence of Bay K 8644.
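The I-V curve construction described above (largest current after each depolarization plotted against the test potential) can be sketched in Python as follows. This is a hedged illustration rather than the analysis actually used, which the text attributes to pClamp and Quattro Professional. The function names are hypothetical, and the sketch assumes the common sign convention in which inward currents are recorded as negative values, so that the largest inward current is the most negative sample.

```python
def peak_inward_current(trace_pa):
    """Largest inward current in one trace (pA).

    Assumes inward currents are negative, so the peak is the minimum sample.
    """
    return min(trace_pa)


def iv_curve(recordings):
    """Build an I-V curve from step recordings.

    recordings maps each test potential (mV) to the list of current
    samples (pA) recorded during that depolarizing step. Returns a list
    of (test potential, peak inward current) pairs sorted by potential.
    """
    return sorted(
        (voltage, peak_inward_current(trace))
        for voltage, trace in recordings.items()
    )


def peak_of_iv(curve):
    """Return the (potential, current) point with the largest inward current."""
    return min(curve, key=lambda point: point[1])
```

Computing one curve from recordings made without the drug and one from recordings made with it, then comparing the two point by point, corresponds to the comparison the text describes.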
Pronounced tail currents were detected in the tracings of currents generated in the presence of Bay K 8644 in HEK 293 cells transfected with α1, α2 and β1 subunit-encoding cDNA and the lacZ gene, indicating that the recombinant calcium channels responsible for the voltage-activated barium currents recorded in this transfectant appear to be DHP-sensitive.
The second of the two transfected cells that displayed inward barium currents expressed a ~50 pA current when the membrane was depolarized from -90 mV. This current was nearly completely blocked by 200 µM cadmium, an established calcium channel blocker.
Ten cells that were transiently transfected with the DNA encoding the α1 subunit and the lacZ gene were analyzed by whole-cell patch clamp methods two days after transfection. One of these cells displayed a 30 pA inward barium current. This current amplified 2-fold in the presence of 1 µM Bay K 8644. Furthermore, small tail currents were detected in the presence of Bay K 8644. These data indicate that expression of the human neuronal calcium channel α1D subunit-encoding cDNA in HEK 293 yields a functional DHP-sensitive calcium channel.
3. Analysis of HEK 293 cells stably transfected with DNA encoding human neuronal calcium channel subunits
Individual stably transfected HEK 293 cells were assayed electrophysiologically for the presence of voltage-dependent barium currents as described for electrophysiological analysis of transiently transfected HEK 293 cells (see Example VII.B.2.c). In an effort to maximize calcium channel activity via cyclic-AMP-dependent kinase-mediated phosphorylation [Pelzer, et al. (1990) Rev. Physiol. Biochem. Pharmacol. 114:107-207], cAMP (Na salt, 250 µM) was added to the pipet solution and forskolin (10 µM) was added to the bath solution in some of the recordings. Qualitatively similar results were obtained whether these compounds were present or not.
Barium currents were recorded from stably transfected cells in the absence and presence of Bay K 8644 (1 µM). When the cell was depolarized to -10 mV from a holding potential of mV in the absence of Bay K 8644, a current of approximately 35 pA with a rapidly deactivating tail current was recorded. During application of Bay K 8644, an identical depolarizing protocol elicited a current of approximately pA, accompanied by an augmented and prolonged tail current.
The peak magnitude of currents recorded from this same cell as a function of a series of depolarizing voltages was assessed. The responses in the presence of Bay K 8644 not only increased, but the entire current-voltage relation shifted about -10 mV. Thus, three typical hallmarks of Bay K 8644 action, namely increased current magnitude, prolonged tail currents, and negatively shifted activation voltage, were observed, clearly indicating the expression of a DHP-sensitive calcium channel in these stably transfected cells. No such effects of Bay K 8644 were observed in untransfected HEK 293 cells, either with or without cAMP or forskolin.
C. Use of pCMV-based vectors and pcDNA1-based vectors for expression of DNA encoding human neuronal calcium channel subunits
1. Preparation of constructs
Additional expression vectors were constructed using pCMV. The full-length α1D cDNA from pVDCCIII(A) (see Example II.A.3.d), the full-length α2 cDNA, contained on a 3600 bp EcoRI fragment from HBCaCHα2 (see Example IV.B) and a full-length β1 subunit cDNA from pHBCaCHβ1bRBS(A) (see Example III.B.3) were separately subcloned into plasmid pCMVβgal.
Plasmid pCMV/gal was digested with NotI to remove the lacZ gene. The remaining vector portion of the plasmid, referred to as pCMV, was blunt-ended at the NotI sites. The fulllength cr-encoding DNA and 8-encoding DNA, contained on separate EcoRI fragments, were isolated, blunt-ended and separately ligated to the blunt-ended vector fragment of pCMV locating the cDNAs between the CMV promoter and polyadenylation sites in pCMV. To ligate the eID-encoding cDNA with pCMV, the restriction sites in the polylinkers ""immediately 5' of the CMV promoter and immediately 3' of the polyadenylation site were removed from pCMV.
A polylinker was added at the NotI site. The polylinker had the following sequence of restriction enzyme recognition sites:
  GGCCGC | EcoRI | SalI | PstI | EcoRV | HindIII | XbaII | GT
      CG | site  | site | site | site  | site    | site  | CACCGG
  NotI                                                     Destroys NotI
The α1D-encoding DNA, isolated as a BamHI/XhoI fragment from pVDCCIII(A), was then ligated to XbaII/SalI-digested pCMV to place it between the CMV promoter and SV40 polyadenylation site.
Plasmid pCMV contains the CMV promoter as does pcDNA1 but differs from pcDNA1 in the location of splice donor/splice acceptor sites relative to the inserted subunit-encoding DNA. After inserting the subunit-encoding DNA into pCMV, the splice donor/splice acceptor sites are located 3' of the CMV promoter and 5' of the subunit-encoding DNA start codon. After inserting the subunit-encoding DNA into pcDNA1, the splice donor/splice acceptor sites are located 3' of the subunit cDNA stop codon.
2. Transfection of HEK 293 cells
HEK 293 cells were transiently co-transfected with the α1D, α2 and β1 subunit-encoding DNA in pCMV or with the α1D, α2 and β1 subunit-encoding DNA in pcDNA1 (vectors pVDCCIII(A), pHBCaCHα2A and pHBCaCHβ1bRBS(A), respectively), as described in Example VII.B.1.a. Plasmid pCMVβgal was included in each transfection as a measure of transfection efficiency. The results of β-galactosidase assays of the transfectants (see Example VII.B.2.) indicated that HEK 293 cells were transfected equally efficiently with pCMV- and pcDNA1-based plasmids. The pcDNA1-based plasmids, however, are presently preferred for expression of calcium channel receptors.
D. Expression in Xenopus laevis oocytes of RNA encoding human neuronal calcium channel subunits
Various combinations of the transcripts of DNA encoding the human neuronal α1D, α2 and β1 subunits prepared in vitro were injected into Xenopus laevis oocytes. Those injected with combinations that included exhibited voltage-activated barium currents.
1. Preparation of transcripts
Transcripts encoding the human neuronal calcium channel α1D, α2 and β1 subunits were synthesized according to the instructions of the mCAP mRNA CAPPING KIT (Stratagene, La Jolla, CA; catalog #200350). Plasmid pVDCCIII.RBS(A), containing pcDNA1 and the α1D cDNA that begins with a ribosome binding site and the eighth ATG codon of the coding sequence (see Example III.A.3.d), plasmid pHBCaCHα2A, containing pcDNA1 and an α2 subunit cDNA (see Example IV), and plasmid pHBCaCHβ1bRBS(A), containing pcDNA1 and the β1 DNA lacking intron sequence and containing a ribosome binding site (see Example III), were linearized by restriction digestion. The α1D cDNA- and α2 subunit-encoding plasmids were digested with XhoI, and the β1 subunit-encoding plasmid was digested with EcoRV. The DNA insert was transcribed with T7 RNA polymerase.
2. Injection of oocytes
Xenopus laevis oocytes were isolated and defolliculated by collagenase treatment and maintained in 100 mM NaCl, 2 mM KCl, 1.8 mM CaCl2, 1 mM MgCl2, 5 mM HEPES, pH 7.6, 20 µg/ml ampicillin and 25 µg/ml streptomycin at 19-25°C for 2 to days after injection and prior to recording. For each transcript that was injected into the oocyte, 6 ng of the specific mRNA was injected per cell in a total volume of nl.
3. Intracellular voltage recordings
Injected oocytes were examined for voltage-dependent barium currents using two-electrode voltage clamp methods [Dascal, N. (1987) CRC Crit. Rev. Biochem. 22:317]. The pClamp (Axon Instruments) software package was used in conjunction with a Labmaster 125 kHz data acquisition interface to generate voltage commands and to acquire and analyze data. Quattro Professional was also used in this analysis. Current signals were digitized at 1-5 kHz, and filtered appropriately. The bath solution contained the following: 40 mM BaCl2, 36 mM tetraethylammonium chloride (TEA-Cl), 2 mM KCl, 5 mM 4-aminopyridine, 0.15 mM niflumic acid, 5 mM HEPES, pH 7.6.
a. Electrophysiological analysis of oocytes injected with transcripts encoding the human neuronal calcium channel α1D, α2 and β1 subunits
Uninjected oocytes were examined by two-electrode voltage clamp methods and a very small (25 nA) endogenous inward Ba2+ current was detected in only one of seven analyzed cells.
Oocytes coinjected with α1D, α2 and β1 subunit transcripts expressed sustained inward barium currents upon depolarization of the membrane from a holding potential of -90 mV or -50 mV (154 ± 129 nA, n=21). These currents typically showed little inactivation when test pulses ranging from 140 to 700 msec were administered. Depolarization to a series of voltages revealed currents that first appeared at approximately -30 mV and peaked at approximately 0 mV.
Application of the DHP Bay K 8644 increased the magnitude of the currents, prolonged the tail currents present upon repolarization of the cell and induced a hyperpolarizing shift in current activation. Bay K 8644 was prepared fresh from a stock solution in DMSO and introduced as a 10x concentrate directly into the 60 µl bath while the perfusion pump was turned off. The DMSO concentration of the final diluted drug solutions in contact with the cell never exceeded 0.1%. Control experiments showed that 0.1% DMSO had no effect on membrane currents.
Application of the DHP antagonist nifedipine (stock solution prepared in DMSO and applied to the cell as described for application of Bay K 8644) blocked a substantial fraction (91 n=7) of the inward barium current in oocytes coinjected with transcripts of the α1D, α2 and β1 subunits. A residual inactivating component of the inward barium current typically remained after nifedipine application. The inward barium current was blocked completely by 50 µM Cd2+, but only approximately 15% by 100 µM Ni2+.
The effect of ωCgTX on the inward barium currents in oocytes co-injected with transcripts of the α1D, α2 and β1 subunits was investigated. ωCgTX (Bachem, Inc., Torrance, CA) was prepared in the 15 mM BaCl2 bath solution plus 0.1% cytochrome C (Sigma) to serve as a carrier protein. Control experiments showed that cytochrome C had no effect on currents. A series of voltage pulses from a -90 mV holding potential to 0 mV were recorded at 20 msec. intervals. To reduce the inhibition of ωCgTX binding by divalent cations, recordings were made in 15 mM BaCl2, 73.5 mM tetraethylammonium chloride, and the remaining ingredients identical to the 40 mM Ba2+ recording solution. Bay K 8644 was applied to the cell prior to addition of ωCgTX in order to determine the effect of ωCgTX on the DHP-sensitive current component that was distinguished by the prolonged tail currents. The inward barium current was blocked weakly (54 ± 29%, n=7) and reversibly by relatively high concentrations (10-15 µM) of ωCgTX. The test currents and the accompanying tail currents were blocked progressively within two to three minutes after application of ωCgTX, but both recovered partially as the ωCgTX was flushed from the bath.
b. Analysis of oocytes injected with only transcripts encoding the human neuronal calcium channel α1D or transcripts encoding an α1D and other subunits
The contribution of the α2 and β1 subunits to the inward barium current in oocytes injected with transcripts encoding the α1D, α2 and β1 subunits was assessed by expression of the α1D subunit alone or in combination with either the β1 subunit or the α2 subunit. In oocytes injected with only the transcript of an α1D cDNA, no Ba2+ currents were detected (n=3). In oocytes injected with transcripts of α1D and β1 cDNAs, small (108 ± 39 nA) Ba2+ currents were detected upon depolarization of the membrane from a holding potential of -90 mV that resembled the currents observed in cells injected with transcripts of α1D, α2 and β1 cDNAs, although the magnitude of the current was less. In two of the four oocytes injected with transcripts of the α1D-encoding and β1-encoding DNA, the Ba2+ currents exhibited a sensitivity to Bay K 8644 that was similar to the Bay K 8644 sensitivity of Ba2+ currents expressed in oocytes injected with transcripts encoding the α1D, α2 and β1 subunits.
Three of five oocytes injected with transcripts encoding the α1D and α2 subunits exhibited very small Ba2+ currents nA) upon depolarization of the membrane from a holding potential of -90 mV. These barium currents showed little or no response to Bay K 8644.
c. Analysis of oocytes injected with transcripts encoding the human neuronal calcium channel α2 and/or β1 subunit
To evaluate the contribution of the α1D subunit to the inward barium currents detected in oocytes co-injected with transcripts encoding the α1D, α2 and β1 subunits, oocytes injected with transcripts encoding the human neuronal calcium channel α2 and/or β1 subunits were assayed for barium currents.
Oocytes injected with transcripts encoding the α2 subunit displayed no detectable inward barium currents. Oocytes injected with transcripts encoding a β1 subunit displayed measurable (54 ± 23 nA, n=5) inward barium currents upon depolarization, and oocytes injected with transcripts encoding the α2 and β1 subunits displayed inward barium currents that were approximately 50% larger (80 ± 61 nA, n=18) than those detected in oocytes injected with transcripts of the β1-encoding DNA only.
The inward barium currents in oocytes injected with transcripts encoding the β1 subunit or α2 and β1 subunits typically were first observed when the membrane was depolarized to -30 mV from a holding potential of -90 mV and peaked when the membrane was depolarized to 10 to 20 mV.
Macroscopically, the currents in oocytes injected with transcripts encoding the α2 and β1 subunits or with transcripts encoding the β1 subunit were indistinguishable. In contrast to the currents in oocytes co-injected with transcripts of α1D, α2 and β1 subunit cDNAs, these currents showed a significant inactivation during the test pulse and a strong sensitivity to the holding potential. The inward barium currents in oocytes co-injected with transcripts encoding the α2 and β1 subunits usually inactivated to 10-60% of the peak magnitude during a 140-msec pulse and were significantly more sensitive to holding potential than those in oocytes co-injected with transcripts encoding the α1D, α2 and β1 subunits. Changing the holding potential of the membranes of oocytes co-injected with transcripts encoding the α2 and β1 subunits from -90 to -50 mV resulted in an approximately 81% (n=11) reduction in the magnitude of the inward barium current of these cells. In contrast, the inward barium current measured in oocytes coinjected with transcripts encoding the α1D, α2 and β1 subunits was reduced approximately 24% (n=11) when the holding potential was changed from -90 to -50 mV.
The inward barium currents detected in oocytes injected with transcripts encoding the α2 and β1 subunits were pharmacologically distinct from those observed in oocytes coinjected with transcripts encoding the α1D, α2 and β1 subunits.
Oocytes injected with transcripts encoding the α2 and β1 subunits displayed inward barium currents that were insensitive to Bay K 8644. Nifedipine sensitivity was difficult to measure because of the holding potential sensitivity of nifedipine and the current observed in oocytes injected with transcripts encoding the α2 and β1 subunits.
Nevertheless, two oocytes that were co-injected with transcripts encoding the α2 and β1 subunits displayed measurable (25 to 45 nA) inward barium currents when depolarized from a holding potential of -50 mV. These currents were insensitive to nifedipine (5 to 10 µM). The inward barium currents in oocytes injected with transcripts encoding the α2 and β1 subunits showed the same sensitivity to heavy metals as the currents detected in oocytes injected with transcripts encoding the α1D, α2 and β1 subunits.
The inward barium current detected in oocytes injected with transcripts encoding the human neuronal α2 and β1 subunits has pharmacological and biophysical properties that resemble calcium currents in uninjected Xenopus oocytes. Because the amino acids of this human neuronal calcium channel β1 subunit lack hydrophobic segments capable of forming transmembrane domains, it is unlikely that recombinant β1 subunits alone can form an ion channel. It is more probable that a homologous endogenous α1 subunit exists in oocytes and that the activity mediated by such an α1 subunit is enhanced by expression of a human neuronal β1 subunit.
E. Expression of DNA encoding human neuronal calcium channel α1B, α2 and β1-2 subunits in HEK cells
1. Transfection of HEK cells
The transient expression of the human neuronal α1B-1, α2b and β1-2 subunits was studied in HEK293 cells. The HEK293 cells were grown as a monolayer culture in Dulbecco's modified Eagle's medium (Gibco) containing 5% defined-supplemented bovine calf serum (Hyclone) plus penicillin G (100 U/ml) and streptomycin sulfate (100 µg/ml). HEK293 cell transfections were mediated by calcium phosphate as described above.
Transfected cells were examined for inward Ba2+ currents (IBa) mediated by voltage-dependent Ca2+ channels.
Cells were transfected (2 x 10^6 per polylysine-coated plate). Standard transfections (10-cm dish) contained 8 µg of pcDNAα1B-1, 5 µg of pHBCaCHα2A, 2 µg of pHBCaCHβ1bRBS(A) (see Examples II.A.3, IV.B. and III) and 2 µg of CMVβ (Clontech) β-galactosidase expression plasmid, and pUC18 to maintain a constant mass of 20 µg/ml. Cells were analyzed 48 to 72 hours after transfection. Transfection efficiencies, which were determined by in situ histochemical staining for β-galactosidase activity (Sanes et al. (1986) EMBO J., 5:3133), generally were greater than
2. Electrophysiological analysis of transfectant currents
a. Materials and methods
Properties of recombinantly expressed Ca2+ channels were studied by whole cell patch-clamp techniques. Recordings were performed on transfected HEK293 cells 2 to 3 days after transfection. Cells were plated at 100,000 to 300,000 cells per polylysine-coated, 35-mm tissue culture dish (Falcon, Oxnard, CA) 24 hours before recordings. Cells were perfused with 15 mM BaCl2, 125 mM choline chloride, 1 mM MgCl2, and mM Hepes (pH 7.3) adjusted with tetraethylammonium hydroxide (bath solution). Pipettes were filled with 135 mM CsCl, 10 mM EGTA, 10 mM Hepes, 4 mM Mg-adenosine triphosphate (pH 7.5) adjusted with tetraethylammonium hydroxide. Sylgard (Dow-Corning, Midland, MI)-coated, fire-polished, and filled pipettes had resistances of 1 to 2 megohm before gigohm seals were established to cells.
Bay K 8644 and nifedipine (Research Biochemicals, Natick, MA) were prepared from stock solutions (in dimethyl sulfoxide) and diluted into the bath solution. The dimethyl sulfoxide concentration in the final drug solutions in contact with the cells never exceeded 0.1%. Control experiments showed that 0.1% dimethyl sulfoxide had no effect on membrane currents.
ωCgTX (Bachem, Inc., Torrance, CA) was prepared in the 15 mM BaCl2 bath solution plus 0.1% cytochrome C (Sigma, St. Louis, MO) to serve as a carrier protein. Control experiments showed that cytochrome C had no effect on currents. These drugs were dissolved in bath solution, and continuously applied by means of puffer pipettes as required for a given experiment. Recordings were performed at room temperature (22° to 25°C). Series resistance compensation (70 to 85%) was employed to minimize voltage error that resulted from pipette access resistance, typically 2 to 3.5 megohm. Current signals were filtered (dB, 4-pole Bessel) at a frequency of 1/4 of the sampling rate, which ranged from 0.5 to 3 kHz.
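The size of the voltage error that series resistance compensation leaves behind can be estimated with Ohm's law. The following is a minimal sketch, not part of the recorded protocol: the function names are hypothetical, the 1000 pA current is an illustrative value, and it assumes the uncompensated fraction of the access resistance carries the full whole-cell current.

```python
def residual_voltage_error_mv(current_pa, series_resistance_mohm, compensation):
    """Estimate the voltage error (mV) remaining after series resistance compensation.

    Assumption: error = I * Rs * (1 - compensation), i.e. plain Ohm's law.
    """
    if not 0.0 <= compensation <= 1.0:
        raise ValueError("compensation must be a fraction between 0 and 1")
    # pA * megohm = 1e-12 A * 1e6 ohm = 1e-6 V = 1e-3 mV
    uncompensated_mv = current_pa * series_resistance_mohm * 1e-3
    return uncompensated_mv * (1.0 - compensation)


def filter_cutoff_hz(sampling_rate_hz):
    """Cutoff frequency when filtering at 1/4 of the sampling rate."""
    return sampling_rate_hz / 4.0


# Least favorable settings quoted in the text (3.5 megohm, 70% compensation)
# combined with a hypothetical 1000 pA current.
worst_case_mv = residual_voltage_error_mv(1000.0, 3.5, 0.70)
```

Under these assumptions a 1000 pA current leaves an error of roughly 1 mV, and the quoted sampling rates of 0.5 to 3 kHz correspond to filter cutoffs of 125 to 750 Hz.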
Voltage commands were generated and data were acquired with CLAMPEX (pClamp, Axon Instruments, Foster City, CA). All reported data are corrected for linear leak and capacitive components. Exponential fitting of currents was performed with CLAMPFIT (Axon Instruments, Foster City, CA).
b. Results
Transfectants were examined for inward Ba2+ currents (IBa). Cells cotransfected with DNA encoding α1B-1, α2b, and β1-2 subunits expressed high-voltage-activated Ca2+ channels. IBa first appeared when the membrane was depolarized from a holding potential of -90 mV to -20 mV and peaked in magnitude at 10 mV. Thirty-nine of 95 cells (12 independent transfections) had IBa that ranged from 30 to 2700 pA, with a mean of 433 pA. The mean current density was 26 pA/pF, and the highest density was 150 pA/pF. The IBa typically increased by 2- to 20-fold during the first 5 minutes of recording. Repeated depolarizations during long records often revealed rundown of IBa usually not exceeding 20% within 10 min.
IBa typically activated within 10 ms and inactivated with both a fast time constant ranging from 46 to 105 ms and a slow time constant ranging from 291 to 453 ms (n Inactivation showed a complex voltage dependence, such that IBa elicited at mV inactivated more slowly than IBa elicited at lower test voltages, possibly a result of an increase in the magnitude of slow compared to fast inactivation components at higher test voltages.
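Inactivation with a fast and a slow time constant is commonly described as the sum of two exponentials. The sketch below illustrates that form only; the function name and the 50/50 split between components are hypothetical, and the time constants are simply the lower bounds of the ranges reported above, not fitted values.

```python
import math


def biexponential_current(t_ms, peak, fast_fraction, tau_fast_ms, tau_slow_ms):
    """Current remaining at time t for a decay with fast and slow components.

    peak          -- current amplitude at t = 0 (any unit)
    fast_fraction -- share of the peak that decays with tau_fast (0..1)
    """
    fast = peak * fast_fraction * math.exp(-t_ms / tau_fast_ms)
    slow = peak * (1.0 - fast_fraction) * math.exp(-t_ms / tau_slow_ms)
    return fast + slow


# Hypothetical example: normalized peak, equal fast and slow components,
# tau_fast = 46 ms and tau_slow = 291 ms (lower bounds of the reported ranges).
remaining = [biexponential_current(t, 1.0, 0.5, 46.0, 291.0) for t in (0, 50, 100, 200, 400)]
```

Raising the slow fraction in this model slows the overall decay, which is the kind of shift the text proposes to explain the slower inactivation at higher test voltages.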
Recombinant α1B-1α2bβ1-2 channels were sensitive to holding potential. Steady-state inactivation of IBa, measured after a 30- to 60-s conditioning at various holding potentials, was approximately 50% at holding potentials between -60 and -70 mV and approximately 90% at -40 mV. Recovery of IBa from inactivation was usually incomplete, measuring 55 to 75% of the original magnitude within 1 min. after the holding potential was returned to more negative potentials, possibly indicating some rundown or a slow recovery rate.
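Steady-state inactivation data of this kind are often summarized with a Boltzmann function. The source does not state that such a fit was performed, so the following is only a sketch under stated assumptions: a single Boltzmann term, a half-inactivation voltage of -65 mV (the midpoint of the quoted -60 to -70 mV range), and 10% of the current remaining available at -40 mV. The function names are hypothetical.

```python
import math


def boltzmann_slope_mv(v_half_mv, v_mv, available):
    """Slope factor k such that 1 / (1 + exp((v - v_half) / k)) equals `available` at v."""
    return (v_mv - v_half_mv) / math.log(1.0 / available - 1.0)


def available_fraction(v_mv, v_half_mv, k_mv):
    """Fraction of channels not inactivated at holding potential v (Boltzmann model)."""
    return 1.0 / (1.0 + math.exp((v_mv - v_half_mv) / k_mv))


# Assumed anchor points: 50% inactivation at -65 mV, 90% inactivation at -40 mV.
k = boltzmann_slope_mv(-65.0, -40.0, 0.10)
fraction_at_minus_90 = available_fraction(-90.0, -65.0, k)
```

With these two anchor points the slope factor comes out near 11 mV, and the model predicts that about 90% of the channels remain available at the -90 mV holding potential.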
Recombinant α1B-1α2bβ1-2 channels were also blocked irreversibly by ω-CgTx concentrations ranging from 0.5 to M during the time scale of the experiments. Application of 5 µM toxin (n = 7) blocked the activity completely within 2 min., and no recovery of IBa was observed after washing ω-CgTx from the bath for up to 15 min. Cd2+ blockage (50 µM) was rapid, complete, and reversible; the DHPs Bay K 8644 (1 µM; n = 4) or nifedipine (5 µM; n = 3) had no discernible effect.
Cells cotransfected with DNA encoding α1B-1, α2b, and β1-2 subunits predominantly displayed a single class of saturable, high-affinity ω-CgTx binding sites. The determined dissociation constant (Kd) value was 54.6 ± 14.5 pM (n = 4). Cells transfected with the vector containing only β-galactosidase-encoding DNA or α2-encoding DNA showed no specific binding. The binding capacity of the α1B-1α2bβ1-2-transfected cells was 28,710 ± 11,950 sites per cell (n = 4).
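A single class of saturable binding sites follows the one-site binding isotherm, B = Bmax·[L] / (Kd + [L]). The sketch below plugs the reported Kd and binding capacity into that relationship and shows the corresponding Scatchard ratio; the function names and the example toxin concentrations are hypothetical.

```python
def bound_sites_per_cell(ligand_pm, kd_pm, bmax_sites):
    """Occupied sites per cell at a free ligand concentration (one-site model)."""
    return bmax_sites * ligand_pm / (kd_pm + ligand_pm)


def scatchard_ratio(ligand_pm, kd_pm, bmax_sites):
    """Bound/free ratio; in a Scatchard plot it falls linearly with bound, slope -1/Kd."""
    return bound_sites_per_cell(ligand_pm, kd_pm, bmax_sites) / ligand_pm


KD_PM = 54.6          # reported dissociation constant
BMAX_SITES = 28710.0  # reported binding capacity, sites per cell

half_occupancy = bound_sites_per_cell(KD_PM, KD_PM, BMAX_SITES)         # [L] = Kd
near_saturation = bound_sites_per_cell(10 * KD_PM, KD_PM, BMAX_SITES)   # [L] = 10 x Kd
```

At a free toxin concentration equal to Kd, half of the sites are occupied; at ten times Kd, occupancy reaches 10/11 of the binding capacity.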
These results demonstrate that α1B-1α2bβ1-2-transfected cells express high-voltage-activated, inactivating Ca2+ channel activity that is irreversibly blocked by ω-CgTx, insensitive to DHPs, and sensitive to holding potential. The activation and inactivation kinetics and voltage sensitivity of the channel formed in these cells are generally consistent with previous characterizations of neuronal N-type Ca2+ channels.
F. Expression of DNA encoding human neuronal calcium channel α1B-1, α1B-2, α2B, β1-2 and β1-3 subunits in HEK cells
Significant Ba2+ currents were not detected in untransfected HEK293 cells. Furthermore, untransfected HEK293 cells do not express detectable ω-CgTx GVIA binding sites.
In order to approximate the expression of a homogeneous population of trimeric α1B, α2b and β1 protein complexes in transfected HEK293 cells, the α1B, α2b and β1 expression levels were altered. The efficiency of expression and assembly of channel complexes at the cell surface were optimized by adjusting the molar ratio of α1B, α2b and β1 expression plasmids used in the transfections. The transfectants were analyzed for mRNA levels, ω-CgTx GVIA binding and Ca2+ channel current density in order to determine near optimal channel expression in the absence of immunological reagents for evaluating protein expression. Higher molar ratios of α2b appeared to increase calcium channel activity.
1. Transfections
HEK293 cells were maintained in DMEM (Gibco #320-1965AJ), Defined/Supplemented bovine calf serum (Hyclone #A-2151-, 100 U/ml penicillin G and 100 µg/ml streptomycin. Ca2+ phosphate-based transient transfections were performed and analyzed as described above. Cells were co-transfected with either 8 µg pcDNA1α1B-1 (described in Example II.C), 5 µg pHBCaCHα2A (see Example ), 2 µg pHBCaCHβ1bRBS(A) (β1-2 expression plasmid; see Examples III.A. and ), and 2 µg pCMVβ-gal [Clontech, Palo Alto, CA] (2:1.8:1 molar ratio of Ca2+ channel subunit expression plasmids) or with 3 µg pcDNA1α1B-1 or pcDNA1α1B-2, 11.25 µg pHBCaCHα2A, 0.75 or 1.0 µg pHBCaCHβ1bRBS(A) or pcDNA1β1-3, and 2 µg pCMVβ-gal (2:10.9:1 molar ratio of Ca2+ channel subunit expression plasmids). Plasmid pCMVβ-gal, a β-galactosidase expression plasmid, was included in the transfections as a marker to permit transfection efficiency estimates by histochemical staining.
When fewer than three subunits were expressed, pCMVPL2, a pCMV promoter-containing vector that lacks a cDNA insert, was substituted to maintain equal moles of pCMV-based DNA in the transfection. pUC18 DNA was used to maintain the total mass of DNA in the transfection at 20 µg/plate.
RNA from the transfected cells was analyzed by Northern blot analysis for calcium channel subunit mRNA expression using random primed 32P-labeled subunit-specific probes. HEK293 cells co-transfected with α1B-1, α2b and β1-2 expression plasmids (8, 5 and 2 µg, respectively; molar ratio 2:1.8:1) did not express equivalent levels of each Ca2+ channel subunit mRNA. Relatively high levels of α1B-1 and β1-2 mRNAs were expressed, but significantly lower levels of α2b mRNA were expressed. Based on autoradiograph exposures required to produce equivalent signals for all three mRNAs, α2b transcript levels were estimated to be 5 to 10 times lower than α1B-1 and β1-2 transcript levels. Untransfected HEK293 cells did not express detectable levels of α1B-1, α2b, or β1-2 mRNAs.
To achieve equivalent Ca2+ channel subunit mRNA expression levels, a series of transfections was performed with various amounts of α1B-1, α2b and β1-2 expression plasmids. Because the α1B-1 and β1-2 mRNAs were expressed at very high levels compared to α2b mRNA, the mass of α1B-1 and β1-2 plasmids was lowered and the mass of α2b plasmid was increased in the transfection experiments. Co-transfection with 3, 11.25 and 0.75 µg of α1B-1, α2b and β1-2 expression plasmids, respectively (molar ratio 2:10.9:1), approached equivalent expression levels of each Ca2+ channel subunit mRNA. The relative molar quantity of α2b expression plasmid to α1B-1 and β1-2 expression plasmids was increased 6-fold. The mass of α1B-1 and β1-2 plasmids in the transfection was decreased 2.67-fold and the mass of α2b plasmid was increased 2.25-fold. The 6-fold molar increase of α2b relative to α1B-1 and β1-2 required to achieve near equal abundance mRNA levels is consistent with the previous 5- to 10-fold lower estimate of relative α2b mRNA abundance. ω-CgTx GVIA binding to cells transfected with various amounts of expression plasmids indicated that the 3, 11.25 and 0.75 µg of α1B-1, α2b and β1-2 plasmids, respectively, improved the level of cell surface expression of channel complexes. Further increases in the mass of α2b and β1-2 expression plasmids while α1B-1 was held constant, and alterations in the mass of the α1B-1 expression plasmid while α2b and β1-2 were held constant, indicated that the cell surface expression of ω-CgTx GVIA binding sites per cell was nearly optimal. All subsequent transfections were performed with 3, 11.25 and 0.75 or 1.0 µg of α1B-1 or α1B-2, α2b and β1-2 or β1-3 expression plasmids, respectively.
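The fold changes quoted above follow directly from the plasmid masses. Because each plasmid's size is unchanged between the two transfection schemes, the change in a molar ratio equals the change in the corresponding mass ratio. The short sketch below reproduces the arithmetic; the dictionary keys and function names are illustrative labels only.

```python
OLD_MASS_UG = {"alpha1B-1": 8.0, "alpha2b": 5.0, "beta1-2": 2.0}    # 2:1.8:1 molar ratio
NEW_MASS_UG = {"alpha1B-1": 3.0, "alpha2b": 11.25, "beta1-2": 0.75}  # 2:10.9:1 molar ratio


def mass_fold_changes(old, new):
    """New mass divided by old mass for every plasmid."""
    return {name: new[name] / old[name] for name in old}


def relative_molar_change(old, new, subunit, reference):
    """Change in the subunit:reference molar ratio (plasmid sizes cancel out)."""
    changes = mass_fold_changes(old, new)
    return changes[subunit] / changes[reference]


changes = mass_fold_changes(OLD_MASS_UG, NEW_MASS_UG)
alpha2b_vs_alpha1b = relative_molar_change(OLD_MASS_UG, NEW_MASS_UG, "alpha2b", "alpha1B-1")
alpha1b_vs_beta = relative_molar_change(OLD_MASS_UG, NEW_MASS_UG, "alpha1B-1", "beta1-2")
```

The α1B-1 and β1-2 masses both fall by 8/3 (about 2.67-fold), the α2b mass rises 2.25-fold, and the α2b molar quantity relative to the other two plasmids therefore rises 2.25 x 2.67 = 6-fold. Applying that factor to the original 1.8 gives 10.8, close to the 10.9 reported for the new ratio.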
2. 125I-ω-CgTx GVIA binding to transfected cells
Statistical analysis of the Kd and Bmax values was performed using one-way analysis of variance (ANOVA) followed by the Tukey-Kramer test for multiple pairwise comparisons (p≤0.05).
Combinations of human voltage-dependent Ca2+ channel subunits, α1B-1, α1B-2, α2b, β1-2 and β1-3, were analyzed for saturation binding of 125I-ω-CgTx GVIA. About 200,000 cells were used per assay, except for the α1B-1, α1B-2, α1B-1α2b and α1B-2α2b combinations, which were assayed with 1 x 10^6 cells per tube. The transfected cells displayed a single class of saturable, high-affinity binding sites. The values for the dissociation constants (Kd) and binding capacities (Bmax) were determined for the different combinations. The results are summarized as follows:

Subunit Combination    Kd (pM)        Bmax (sites/cell)
α1B-1α2bβ1-2           54.9 ± 11.1    45,324 ± 15,606
α1B-1α2bβ1-3           53.2 ± 3.6     91,004 ± 37,654
α1B-1β1-2              17.9 ± 1.9     5,756 ± 2,163
α1B-1β1-3              17.9 ± 1.6     8,729 ± 2,9[illegible]
α1B-1α2b               84.6 ± 15.3    2,256 ± 35[illegible]
α1B-1                  31.7 ± 4.2     757 ± 128
α1B-2α2bβ1-2           53.0 ± 4.8     19,371 ± 3,798
α1B-2α2bβ1-3           44.3 ± 8.1     37,652 ± 8,129
α1B-2β1-2              16.4 ± 1.2     2,126 ± 412
α1B-2β1-3              22.2 ± 5.8     2,944 ± 1,168
α1B-2α2b               N.D.           N.D.
α1B-2                  N.D.           N.D.
N.D. = not detectable
Cells transfected with subunit combinations lacking either the α1B-1 or the α1B-2 subunit did not exhibit any detectable 125I-ω-CgTx GVIA binding (≤ 600 sites/cell). 125I-ω-CgTx GVIA binding to HEK293 cells transfected with α1B-2 alone or α1B-2α2b was too low for reliable Scatchard analysis of the data. Comparison of the Kd and Bmax values revealed several relationships between specific combinations of subunits and the binding affinities and capacities of the transfected cells. In cells transfected with all three subunits (α1B-1α2bβ1-2-, α1B-1α2bβ1-3-, α1B-2α2bβ1-2- or α1B-2α2bβ1-3-transfectants), the Kd values were indistinguishable, ranging from 44.3 ± 8.1 pM to 54.9 ± 11.1 pM. In cells transfected with two-subunit combinations lacking the α2b subunit (α1B-1β1-2, α1B-1β1-3, α1B-2β1-2 or α1B-2β1-3), the Kd values were significantly lower than the three-subunit combinations, ranging from 16.4 ± 1.2 to 22.2 ± 5.8 pM. Cells transfected with only the α1B-1 subunit had a Kd value of 31.7 ± 4.2 pM, a value that was not different from the two-subunit combinations lacking α2b. As with the comparison between the four α1Bα2bβ1 versus α1Bβ1 combinations, when the α1B-1 was co-expressed with α2b, the Kd increased significantly (p<0.05) from 31.7 ± 4.2 to 84.6 ± 15.3 pM. These data demonstrate that co-expression of the α2b subunit with α1B-1, α1B-1β1-2, α1B-1β1-3, α1B-2β1-2 or α1B-2β1-3 subunit combinations results in lower binding affinity of the cell surface receptors for 125I-ω-CgTx GVIA. The Bmax values of cells transfected with various subunit combinations also differed considerably. Cells transfected with the α1B-1 subunit alone expressed a low but detectable number of binding sites (approximately 750 binding sites/cell). When the α1B-1 subunit was co-expressed with the α2b subunit, the binding capacity increased approximately three-fold, while co-expression of a β1-2 or β1-3 subunit with α1B-1 resulted in 8- to 10-fold higher expression of surface binding.
Cells transfected with all three subunits expressed the highest number of cell surface receptors. The binding capacities of cells transfected with α1B-1α2bβ1-3 or α1B-2α2bβ1-3 combinations were approximately two-fold higher than the corresponding combinations containing the β1-2 subunit. Likewise, cells transfected with α1B-1α2bβ1-2 or α1B-1α2bβ1-3 combinations expressed approximately 2.5-fold more binding sites per cell than the corresponding combinations containing α1B-2. In all cases, co-expression of the α2b subunit with α1B and β1 increased the surface receptor density compared to cells transfected with only the corresponding α1B and β1 combinations: approximately 8-fold for α1B-1α2bβ1-2, 10-fold for α1B-1α2bβ1-3, 9-fold for α1B-2α2bβ1-2 and 13-fold for α1B-2α2bβ1-3. Thus, comparison of the Bmax values suggests that the toxin-binding subunit, α1B-1 or α1B-2, is more efficiently expressed and assembled on the cell surface when co-expressed with either the α2b or the β1-2 or β1-3 subunit, and most efficiently expressed when α2b and β1 subunits are present.
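The fold increases attributed to α2b can be checked against the mean binding capacities in the summary table. The sketch below does only that arithmetic; the dictionary labels are shorthand for the subunit combinations, and standard deviations are ignored.

```python
# Mean binding capacities (sites/cell) taken from the summary table above.
BMAX = {
    "a1B-1/a2b/b1-2": 45324, "a1B-1/b1-2": 5756,
    "a1B-1/a2b/b1-3": 91004, "a1B-1/b1-3": 8729,
    "a1B-2/a2b/b1-2": 19371, "a1B-2/b1-2": 2126,
    "a1B-2/a2b/b1-3": 37652, "a1B-2/b1-3": 2944,
}


def fold_increase(with_alpha2b, without_alpha2b):
    """Ratio of mean binding capacity with alpha2b to that without it."""
    return BMAX[with_alpha2b] / BMAX[without_alpha2b]


alpha2b_effect = {
    "a1B-1, b1-2": fold_increase("a1B-1/a2b/b1-2", "a1B-1/b1-2"),
    "a1B-1, b1-3": fold_increase("a1B-1/a2b/b1-3", "a1B-1/b1-3"),
    "a1B-2, b1-2": fold_increase("a1B-2/a2b/b1-2", "a1B-2/b1-2"),
    "a1B-2, b1-3": fold_increase("a1B-2/a2b/b1-3", "a1B-2/b1-3"),
}
rounded = [round(value) for value in alpha2b_effect.values()]
```

Rounded to whole numbers, the four ratios are 8, 10, 9 and 13, matching the values stated in the text. The same table gives ratios of about 2.0 for β1-3 versus β1-2 in the three-subunit combinations.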
3. Electrophysiology
Functional expression of α1B-1α2bβ1-2 and α1B-1β1-2 subunit combinations was evaluated using the whole-cell recording technique. Transfected cells that had no contacts with surrounding cells and simple morphology were used approximately 48 hours after transfection for recording. The pipette solution was (in mM) 135 CsCl, 10 EGTA, 1 MgCl2, 10 HEPES, and 4 mM Mg-ATP (pH 7.3, adjusted with TEA-OH). The external solution was (in mM) 15 BaCl2, 125 Choline Cl, 1 MgCl2, and HEPES (pH 7.3, adjusted with TEA-OH). ω-CgTx GVIA (Bachem) was prepared in the external solution with 0.1% cytochrome C (Sigma) to serve as a carrier. Control experiments showed that cytochrome C had no effect on the Ba2+ current.
The macroscopic electrophysiological properties of Ba2+ currents in cells transfected with various amounts of the α2b expression plasmid with the relative amounts of α1B-1 and β1-2 plasmids held constant were examined. The amplitudes and densities of the Ba2+ currents (15 mM BaCl2) recorded from whole cells of these transfectants differed dramatically. The average currents from 7 to 11 cells of three types of transfections (no α2b; 2:1.8:1 [α1B-1:α2b:β1-2] molar ratio; and 2:10.9:1 [α1B-1:α2b:β1-2] molar ratio) were determined. The smallest currents (range: 10 to 205 pA) were recorded when α2b was not included in the transfection, and the largest currents (range: 50 to 8300 pA) were recorded with the 2:10.9:1 ratio of α1B-1α2bβ1-2 plasmids, the ratio that resulted in near equivalent mRNA levels for each subunit transcript. When the amount of α2b plasmid was adjusted to yield approximately an equal abundance of subunit mRNAs, the average peak Ba2+ current increased from 433 pA to 1,824 pA (4.2-fold) with a corresponding increase in average current density from 26 pA/pF to 127 pA/pF (4.9-fold). This increase is in the presence of a 2.7-fold decrease in the mass of α1B-1 and β1-2 expression plasmids in the transfections.
In all transfections, the magnitudes of the Ba2+ currents did not follow a normal distribution.
To compare the subunit combinations and determine the effects of α2b, the current-voltage properties of cells transfected with α1B-1β1-2 or with α1B-1α2bβ1-2 in either the 2:1.8:1 (α1B-1:α2b:β1-2) molar ratio or the 2:10.9:1 (α1B-1:α2b:β1-2) molar ratio transfectants were examined. The extreme examples of no α2b and 11.25 µg α2b (2:10.9:1 molar ratio) showed no significant differences in the current voltage plot at test potentials between 0 mV and +40 mV (p<0.05). The slight differences observed at either side of the peak region of the current voltage plot were likely due to normalization. The very small currents observed in the α1B-1β1-2 transfected cells have a substantially higher component of residual leak relative to the barium current that is activated by the test pulse. When the current voltage plots are normalized, this leak is a much greater component than in the α1B-1α2bβ1-2 transfected cells and, as a result, the current-voltage plot appears broader. This is the most likely explanation of the apparent differences in the current voltage plots, especially given the fact that the current-voltage plot for the α1B-1β1-2 transfected cells diverges on both sides of the peak. Typically, when the voltage-dependent activation is shifted, the entire current-voltage plot is shifted, which was not observed. To qualitatively compare the kinetics of each, the average responses of test pulses from -90 mV to 10 mV were normalized and plotted. No significant differences in activation or inactivation kinetics of whole-cell Ba2+ currents were observed with any combination.
G. Expression of DNA encoding human neuronal calcium channel α1E-3α2Bβ1-3 and α1E-1α2Bβ1-3 subunits in HEK cells
Functional expression of the α1E-1α2Bβ1-3 and α1E-3α2Bβ1-3, as well as α1E-3, was evaluated using the whole cell recording technique.
1. Methods
Recordings were performed on transiently transfected HEK 293 cells two days following the transfection, from cells that had no contacts with surrounding cells and which had simple morphology.
The internal solution used to fill pipettes for recording the barium current from the transfected recombinant calcium channels was (in mM) 135 CsCl, 10 EGTA, 1 MgCl2, 10 HEPES, and 4 Mg-ATP (pH 7.4-7.5, adjusted with TEA-OH). The external solution for recording the barium current was (in mM) 15 BaCl2, 150 choline Cl, 1 MgCl2, 10 HEPES, and 5 TEA-OH (pH 7.3, adjusted with TEA-OH). In experiments in which Ca2+ was substituted for Ba2+, a laminar flow chamber was used in order to completely exchange the extracellular solution and prevent any mixing of Ba2+ and Ca2+. ω-CgTx GVIA was prepared in the external solution with 0.1% cytochrome C to serve as a carrier; the toxin was applied by pressurized puffer pipette. Series resistance was compensated 70-85%, and currents were analyzed only if the voltage error from series resistance was less than [illegible] mV. Leak resistance and capacitance were corrected by subtracting the scaled current observed with the P/-4 protocol as implemented by pClamp (Axon Instruments).
2. Electrophysiology Results
Cells transfected with α1E-1α2bβ1-3 or α1E-3α2bβ1-3 showed strong barium currents with whole cell patch clamp recordings. Cells expressing α1E-3α2bβ1-3 had larger peak currents than those expressing α1E-1α2bβ1-3. In addition, the kinetics of activation and inactivation are clearly substantially faster in the cells expressing α1E calcium channels. HEK 293 cells expressing α1E-3 alone have a significant degree of functional calcium channels, with properties similar to those expressing α1Eα2bβ1 but with substantially smaller peak barium currents. Thus, with α1E, the α2 and β subunits are not required for functional expression of α1E-mediated calcium channels, but do substantially increase the number of functional calcium channels.
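The series-resistance criterion described in the Methods can be sketched in Python as follows. This is an illustrative calculation, not the procedure used by the authors or by pClamp: it assumes the usual approximation that the uncompensated fraction of the series resistance, multiplied by the peak current, gives the residual voltage error. The function names are hypothetical, and the acceptance threshold is left as a parameter because its value is not legible in this text.

```python
def series_resistance_error_mv(peak_current_na, series_resistance_mohm, compensation):
    """Residual voltage error in mV after partial series-resistance compensation.

    Units: nA x MOhm = mV. `compensation` is a fraction, e.g. 0.8 for 80%.
    """
    if not 0.0 <= compensation <= 1.0:
        raise ValueError("compensation must be a fraction between 0 and 1")
    return abs(peak_current_na) * series_resistance_mohm * (1.0 - compensation)


def accept_recording(peak_current_na, series_resistance_mohm, compensation, max_error_mv):
    """Keep a recording only if the residual error is below the chosen limit.

    `max_error_mv` is supplied by the caller; no value is assumed here.
    """
    error = series_resistance_error_mv(peak_current_na, series_resistance_mohm, compensation)
    return error < max_error_mv


# Hypothetical example: 2 nA peak current, 5 MOhm series resistance, 80% compensation.
example_error = series_resistance_error_mv(2.0, 5.0, 0.80)
print(f"residual voltage error: {example_error:.1f} mV")
```

With these hypothetical numbers, the uncompensated 20% of 5 MΩ is 1 MΩ, so a 2 nA current leaves an error of about 2 mV.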
Examination of the current-voltage properties of α1E-3α2bβ1-3 expressing cells indicates that α1E-3α2bβ1-3 is a high-voltage activated calcium channel and the peak current is reached at a potential only slightly less positive than other neuronal calcium channels also expressing α2b and β1, and α1B and α1D. Current-voltage properties of α1E-1α2bβ1-3 and α1E-3α2bβ1-3 are statistically different from those of α1B-1α2bβ1-3. Current-voltage curves for α1E-1α2bβ1-3 and α1E-3α2bβ1-3 peak at approximately [illegible], as does the current-voltage curve for α1E-3 alone.
The kinetics and voltage dependence of inactivation using both prepulse (200 ms) and steady-state inactivation were examined. α1E-mediated calcium channels are rapidly inactivated relative to previously cloned calcium channels and other high voltage-activated calcium channels. α1E-3α2bβ1-3 mediated calcium channels are inactivated rapidly and are thus sensitive to relatively brief (200 ms) prepulses as well as long prepulses (>20 s steady-state inactivation), but recover rapidly from steady-state inactivation. The kinetics of the rapid inactivation have two components, one with a time constant of approximately 25 ms and the other approximately 400 ms.
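The two-component inactivation described above can be written as a sum of two exponentials. The following Python sketch uses the two time constants stated in the text (approximately 25 ms and 400 ms); the relative weight of the fast component is not given in this text, so `fast_fraction` is a hypothetical parameter, and the model assumes no non-inactivating component.

```python
import math

TAU_FAST_MS = 25.0   # fast time constant stated in the text (approximate)
TAU_SLOW_MS = 400.0  # slow time constant stated in the text (approximate)


def fraction_remaining(t_ms, fast_fraction=0.5):
    """Fraction of the initial current remaining after t_ms of depolarization.

    Sum of two exponentials; `fast_fraction` is an assumed weight for the
    fast component and is not taken from the specification.
    """
    fast = fast_fraction * math.exp(-t_ms / TAU_FAST_MS)
    slow = (1.0 - fast_fraction) * math.exp(-t_ms / TAU_SLOW_MS)
    return fast + slow


# Remaining fraction at the end of a 200 ms prepulse, for the assumed weights.
after_prepulse = fraction_remaining(200.0)
print(f"fraction remaining after 200 ms: {after_prepulse:.2f}")
```

Under these assumptions the fast component has essentially vanished by 200 ms, so the remaining current is dominated by the slow component, which is consistent with the stated sensitivity to brief prepulses.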
To determine whether α1E-mediated calcium channels have properties of low voltage-activated calcium channels, the details of tail currents activated by a test pulse ranging to +90 mV were measured at -60 mV. Tail currents recorded at [illegible] mV could be well fit by a single exponential of 150 to 300 [unit illegible], at least an order of magnitude faster than those typically observed with low voltage-activated calcium channels.
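A single-exponential fit of the kind mentioned above can be sketched in Python as a straight-line fit to the logarithm of the current magnitude. This is only an illustration, not the fitting routine the authors used: it assumes a decay to a zero baseline with no offset, the function name is hypothetical, and the time axis is in whatever unit the caller supplies because the unit is not legible in this text.

```python
import math


def fit_single_exponential(times, currents):
    """Fit |I(t)| = A * exp(-t / tau) by least squares on ln|I|.

    Returns (A, tau). Assumes every current is non-zero and that the decay
    goes to a zero baseline.
    """
    logs = [math.log(abs(i)) for i in currents]
    n = len(times)
    mean_t = sum(times) / n
    mean_y = sum(logs) / n
    sxy = sum((t - mean_t) * (y - mean_y) for t, y in zip(times, logs))
    sxx = sum((t - mean_t) ** 2 for t in times)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_t
    return math.exp(intercept), -1.0 / slope


# Synthetic tail current with a known time constant (hypothetical values).
true_tau = 0.2
sample_times = [0.05 * k for k in range(10)]
sample_currents = [-100.0 * math.exp(-t / true_tau) for t in sample_times]

amplitude, tau = fit_single_exponential(sample_times, sample_currents)
print(f"fitted amplitude: {amplitude:.1f}, fitted tau: {tau:.3f}")
```

Comparing the fitted time constant with values typical of low voltage-activated channels is then a simple ratio, as in the "order of magnitude faster" statement in the text.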
HEK 293 cells expressing α1E-3α2bβ1-3 flux more current with Ba2+ as the charge carrier, and currents carried by Ba2+ and Ca2+ have different current-voltage properties. Furthermore, the time course of inactivation is slower and the amount of prepulse inactivation less with Ca2+ as the charge carrier.
While the invention has been described with some specificity, modifications apparent to those with ordinary skill in the art may be made without departing from the scope of the invention. Since such modifications will be apparent to those of skill in the art, it is intended that this invention be limited only by the scope of the appended claims.
SEQUENCE LISTING
GENERAL INFORMATION:
APPLICANT:
  NAME: THE SALK INSTITUTE BIOTECHNOLOGY/INDUSTRIAL ASSOCIATES
  STREET: 505 COAST BLVD SOUTH, SUITE 300
  CITY: La Jolla
  STATE: California
  COUNTRY: USA
  POSTAL CODE (ZIP): 92037
(ii) TITLE OF INVENTION: HUMAN CALCIUM CHANNEL COMPOSITIONS AND METHODS
(iii) NUMBER OF SEQUENCES: 38
(iv) COMPUTER READABLE FORM:
  MEDIUM TYPE: Floppy disk
  COMPUTER: IBM PC compatible
  OPERATING SYSTEM: PC-DOS/MS-DOS
  SOFTWARE: PatentIn Release Version #1.25
CURRENT APPLICATION DATA:
  APPLICATION NUMBER:
  FILING DATE:
  CLASSIFICATION:
(vii) PRIOR APPLICATION DATA:
  APPLICATION NUMBER: 08/149,097
  FILING DATE: 5-NOV-1993
(vii) PRIOR APPLICATION DATA:
  APPLICATION NUMBER: 08/105,536
  FILING DATE: 11-AUG-1993
INFORMATION FOR SEQ ID NO:1:
SEQUENCE CHARACTERISTICS:
  LENGTH: 7635 base pairs
  TYPE: nucleic acid
  STRANDEDNESS: double
  TOPOLOGY: linear
(ii) MOLECULE TYPE: DNA (genomic)
(ix) FEATURE:
  NAME/KEY: CDS
  LOCATION: 511..6996
(ix) FEATURE:
  NAME/KEY: 5'UTR
  LOCATION: 1..510
(ix) FEATURE:
  NAME/KEY: 3'UTR
  LOCATION: 6994..7635
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:1:
GGGCGAGCGC CTCCGTCCCC GGATGTGAGC TCCGGCTGCC CGCGGTCCCG AGCCAGCGGC
GCGCGGGCGG CGGCGGCGGG CACCGGGCAC CGCGGCGGGC GGGCAGACGG GCGGGCATGC
GGGGAGCGCC GAGCGGCCCC GGCGGCCGGG CCGGCATCAC CGCGGCGTCT CTCCGCTAGA
GGAGGGGACA AGCCAGTTCT CCTTTGCAGC AAAAAATTAC ATGTATATAT TATTAAGATA
ATATATACAT TGGATTTTAT TTTTTTAJAjA AGTTTATTTT GCTCCATTTT TGAAAAAGAG
AGAGCTTGGG TGGCGAGCGG TTTTTTTTTA AAATCAATTA TCCTTATTTT CTGTTATTTG
TCCCCGTCCC TCCCCACCCC CCTGCTGAAG CGAGAATAAG GGCAGGGACC GCGGCTCCTA
CCTCTTGGTG ATCCCCTTCC CCATTCCGCC CCCGCCCA CGCCCAGCAC AGTGCCCTGC
ACACAGTAGT CGCTCAATAA ATGTTCGTGG ATG ATG ATG ATG ATO ATG ATG AAA Met Met Met Met Met Met Met Lys 1 AAA ATG CAG CAT CAA CGG CAG CAG CAA GCG GAC CAC GCG AAC GAG GCA Lys Met Gin His Gin Arg Gin Gin Gin Ala Asp His Ala Asn Giu Ala 10 15 AAC TAT GCA AGA GGC ACC AGA CTT CCT CTT TCT GOT GAA GGA CCA ACT Asn Tyr Ala Arg Gly Thr Arg Leu Pro Leu Ser Gly Giu Oly Pro Thr 25 30 35 TCT CAG CCG AAT AGC TCC AAG CAA ACT GTC CTG TCT TGG CAA GCT GCA Ser Gin Pro Asn Ser Ser Lys Gin Thr Val Leu Ser Trp Gin Ala Ala 45 50 ATC OAT GCT GCT AGA CAG GCC AAG GCT GCC CAA ACT ATG AGC ACC TCT Ile Asp Ala Ala Arg Gin Ala Lys Ala Ala Gin Thr Met Ser Thr Ser 65 GCA CCC CCA CCT GTA GGA TCT CTC TCC CAA AGA AA CGT CAG CAA TAC Ala Pro Pro Pro Val Gly Ser Leu Ser Gin Arg Lys~ Arg Gin GIn Tyr 80 0CC AAO AGC AAA AAA CAG GGT AAC TCG TCC AAC AOC CGA CCT GCC CC Ala Lys Ser Lys Lys Gin Oly Asn. Ser Ser Asn Ser Arg Pro Ala Ara 95 100 GCC CTT TTC TOT TTA TCA CTC AAT AAC CCC ATC CGA AGA GCC TGC ATT Ala Leu Phe Cys Leu Ser Leu Asn Asn Pro Ile Arg Arg Ala Cys Ile 105 110 115 120 AGT ATA GTG GAA TOG AAA CCA TTT GAC ATA TTT ATA TTA TTG GCT ATT Ser Ile Val Giu Trp Lys Pro Phe Asp Ile Phe Ile Leu Leu Ala Ile 125 130 135 120
ISO
24C 300 36-- 420 480 534 582 6 678 726 774 8 22 870 918 -117- TTT GCC AAT TGT GTC GCC Phe Ala Asri Cys Val Ala 140 TTA CCT ATT TAC ATC Leu Ala Ile Tyr Ile 145 CCA TTC CCT GAA CAT Pro Phe Pro Glu Asp
ISO
GAT TCT AAT Asp Ser Asn 155 TCA ACA ANT CAT AAC TTC GAA AAA CTA CAA TAT GCC TTC Ser Thr Asn His Asn Leu Ciu Lys Val Clu TPyr Ala Phe 160 165 CTG ATT Leu Ile 170 ATT TTT ACA CTC Ile Phe Thr Val
GAC
Clu 175 ACA TTT TTC AAC ATT ATA CC TAT CCA Thr Phe Leu Lys Ile Ile Ala TIyr Cly TTC CTA CAT CCT Leu Leu His Pro
AAT
Asn 190 CCT TAT CTT ACC AAT CCA TCC AAT TTA Ala Tyr Val Arg Asn Cly Trp Asn Leu
CTC
Leu 200 CAT TTT CTT ATA Asp Phe Val Ile
CTA
Val 205 ATA CTA CCA TTC Ile Val Cly Leu
TTT
Phe 210 ACT CTA ATT TTC Ser Val Ile Leu CAA CAA Ciu Cmn 215 TTA ACC AAA Leu Thr Lys CCGC TTT CAT Cly Phe Asp 235
CAA
Clu 220 ACA CAA CCC CCC Thr Clu Cly Cly CAC TCA ACC GC His Ser Ser Cly AAA TCT CCA Lys Ser Cly 230 CCA CCA CTT Arg Pro Leu CTC AAA CCC CTC Val Lys Ala Leu CCC TTT CCA CTC Ala Phe Arg Val CCA CTA Arg Leu 250 CTC TCA CCA CTC Val Ser Cly Val
CCC
Pro 255 ACT TTA CAA CTT Ser Leu Gin Val CTC AAC TCC ATT Leu Asn Ser Ile 1014 1062 1110 115e 1206 1254 1302 1350 1398 1446 1494 1542 1590
ATA
Ile 265 AAA CCC ATC CTT Lys Ala Met Val CTC CTT CAC ATA Leu Leu His Ile CTT TTC CTA TTA Leu Leu Val Leu GTA ATC ATA ATC Val Ile Ile Ile
TAT
TPyr 285 CCT ATT ATA CCA Ala Ile Ile ly GAA CTT TTT ATT Clii Leu Phe Ile CCA AAA Cly Lys 295 ATC CAC AAA Met His Lys
ACA
Thr 300 TCT TTT TTT CCT Cys Phe Phe Ala TCA CAT ATC CTA Ser Asp Ile Val CCT CAA CAC Ala Ciu Clu 310 CAC CCA CCT CCA TCT CC TTC Asp Pro Ala Pro Cys Ala Phe 315 TCA CCC AAT CCA CCC CAC TCT ACT GCC Ser Cly Asn Cly Arg Cmn Cys Thr Ala 320 325 AAT GC Asri Cly 330 ACC CAA TCT ACC Thr Clu Cys Arg
ACT
S er 335 CCC TCC CTT GC Cly Trp Val Cly AAC CCA CCC ATC Asn Cly Cly Ile
ACC
Thr 345 AAC TTT CAT AAC Asn Phe Asp Asn CCC TTT CCC ATC Ala Phe Ala Met ACT CTC TTT CAC Thr Val Phe Cmn -118- ATC AC-- ATO GAG GGC TGG ACA GAC GTG CTC
TAC
Ile Thr Met GiU Gly Ti-p Thr Asp Val Leu Tyr 36537 TGG ATG A-AT GAT GCT ATG GGA
TTT
Met Gly Phe GAA TTG CCC TGG GTG TAT TTT Giu Leu Pro Trp Vai Tyr Phe 380 385 GTC AGT CTC GTC ATC
TTT
Val Ser Leu Val Ile Phe CCC TCA TTT TTC GTA CTA AAT
CTT
Gly Ser Phe Phe Vai Leu Asn Leu 395 400 GTA CTT
CGT
Vai Leu Gly GTA
TTG
Val Leu AGC GGA GA-A b.
TTC
TCA
Phe Ser 410 A-AG GA-A
AGA
Lys Giu Arg GAG A-AG Giu Lys 415 CAG CTG Gin Leu GCA AAA
GCA
A-ia Lys A-ia CGG GGA Arg Giy GAT TTC CAG
AAG
Asp Phe Gin Lys
CTC
Leu 425 CGG GAG AAG
GAG
Arg Giu Lys Gin GAG GAG CAT CTA A-AG Giu CiU Asp Leu Lys GGC TAC TTG
CAT
TGG A-TC ACC CA-A GCT GAG GAG ATC CAT CCC Trp Ile Thr Gin Ala Ciu AspliAsPr 4450 GGA GAG CA-A CCC A-AA CGA AAT ACT A-CC A-TC Giy Giu Giu Giy Lys A-rg A-sn Thr Ser Met 460 465 GAG A-AT
GAG
Giu Asn Giu CCC ACC AGC Pro Thr Ser GGC GAG AAC Giy Giu A-sn GAA CAA GGA Giu Giu Ciy 455 GAG ACT GAG Ciu Thr Gin 470 CCA CCC
TGC
TCT
GTG
Ser Vai TGT GGA Cys Giy 490 A-AC ACA GAG A-AC GTC A-CC
CCT
Asn Thr Giu A-sn Vai Ser Giy 47548
GAA
1638 1686 1734 1782 1830 1878 1926 1974 2022 2070 2118 2166 2214 2262 ACT CTC TGT CAZA CCC ATC TCA AA-A TCC AAA Ser Leu Cys Gin A-ia Ile Ser Lys Ser Lys 495 500
TGG
Ti-p 505 CGT CCC TCC A-AC A-rg A-rg Trp A-sn A-AG TCT
GTC
Lys Ser Vai ACC TTA ACC Thr Leu Thr
A-CC
Thr CGA TTC A-AT CCC A-GA A-GA A-rg Phe Asn A-rg A-rg A-rg 510 515 TA-C TGG CTC CTT ATC GTC Tyr Trp Len Val Ile Vai
TGT
CTC A-CC CGA CC Leu Ser Arg Arg A-CC CCC CCC
GTG
A-rg A-ia A-ia Vai CTG CTG
TTT
CTG A-AC A-TT TCC TCT GA-C CA-C TA-C A-AT CA-C CCA CAT Ile Ser Ser Giu His Tyr A-sn Gin Pro Asp TCG TTG
A-CA
540 Ti- CA-C ATT CA-A CAT A-TT CCC A-A-C A-AA GTC CTC TTG GCT CTG TTC A-CC
TGC
Gin Ile Gin Asp Ile A-ia A-sn Lys Val Leu Leu A-ia Leu Phe Thr Cys 555 560 GAC A-TG CTG CTA AAA ATG
TAC
Giu Met Leu Vai Lys Met Tyr 570 575 A-CC TTC CCC CTC CAA GCA TAT TTC GTC Ser Leu Gly Leu Gin A-ia Tyr Phe Vai 580 -119- TCT CTT TTC A-AC CCC TTT GAT TCC TTC GTC Ser Leu Phe Asn Arg Phe Asp Cys Phe Val.
585 590 GAG A-CC ATC TTG Giu Thr Ile Len GTC TCT CCT GGA ATC ACT Val. Cys Giy Giy Ile Thr 595 600 TCT CCC CTG GGG ATC TCT Ser Pro Leu Gly Ile Ser GTG GA-A CTG GAA ATC ATC Val Giu Leu Giu Ile Met 605 610 GTC TTT CGG Val Phe Arg TGC ACT TCC Trp Thr Ser 635
TGT
Cys 620 GTG CGC CTC TTA Vai Arg Leu Leu
A-GA
Arg 625 ATC TTC AAA CTG Ile Phe Lys Vai A-CC A-GG CAC Thr A-rg His 630 TCC ATG AAG Ser Met Lys CTG ACC A-AC TTA Leu Ser Asn Leu
GTG
Val.
640 GCA TCC TTA TTA Aia Ser Leu Leu TCC ATC Ser Ile 650 GCT TCG CTG TTG Ala Ser Leu ILeu CTC CTT TTT CTC Leu Leu Phe Leu
TTC
Phe A-TT ATC ATC TTT Ile Ile Ile Phe TTG CTT GGG A-TG Leu Leu Gly Met
CAG
Gin 670 CTG TTT GGC GGC Leu Phe Ciy Gly TTT AA-T TTT CAT Phe Asn Phe Asp
GAA
Glu A-CC CAA ACC A-AG Thr Gin Thr Lys AGC ACC TTT GAC Ser Thr Phe Asp
A-AT
A-sn 690 TTC CCT CAA GCA Phe Pro Gin Ala CTT CTC Leu Leu 2310 2358 2406 2454 2502 2550 2598 2646 2694 2742 2790 2838 2886 2934 ACA CTC TTC Thr Vai Phe CAT GGC ATC Asp Giy Ile 71S
CAC
Gin 700 A-TC CTG A-CA GC Ile Leu Thr Giy GAC TGG A-AT GCT Asp Trp, A-sn A-ia GTG A-TC TAC Vai Met Tyr 710 ATG ATC GTC Met Ile Val.
A-TG GCT TA-C GG Met Ala Tyr Giy
GGC
Gly 720 CCA TCC TCT TCA Pro Ser Ser Ser
TGC
Cys A-TC TA-C TTC A-TC Ile Tyr Phe Ile 730 GTC TTC TTG GCC Vai Phe Leu A-ia TTC A-TT TGT GGT Phe Ile Cys Gly
A-AC
A-sn TAT A-TT CTA CTG Tyr Ile Leu Leu A-TC GCT CTA GAC A-AT Ile A-ia Val Asp A-sn GCT CAT CCT GA-A A-ia Asp A-ia Ciu
A-CT
Ser 750 CTG AA-C A-CT CCT Leu A-sn Thr A-ia AA-A GAA GAA CC Lys Giu Giu A-ia
GA-A
Giu GA-A A-AC GAC A-CC Giu Lys Giu A-rg AAA A-G Lys Lys ATT GCC A-CA Ile Ala A-rg CTC AA-C CA-C Vai A-sn Gin 795
AAA
Lys 780 GAG A-CC CTA GAA Ciu Ser Leu Ciu AAA A-AG A-AC A-AC Lys Lys A-sn A-sn AAA CC-A GAA Lys Pro Ciu 790 CAT GA-C TAT Asp Asp Tyr ATA CCC A-A-C A-CT Ile A-ia A-sn Ser A-AC A-AG CTT ACA A-sn Lys Val. Thr
A-TT
Ile 805 -120- AGA GAA GAG GAT CAA GAC AAC CAC CCC TAT CCG CCT Arg Giu Giu Asp Ciu Asp Lys Asp Pro T'yr Pro Pro 810 815 TGC GAT GTC CCA Cys Asp Val Pro GTA CGG GAA GAG Val Gly Clu Glu 825 GCC GCA CCC CGT A-la Cly Pro Arg GAA GAG GAA GAG GAG GAG GAT Clu Giu Clu Giu Glu Glu Asp 830 835 GAA CCT GAG GTT Glii Pro Glu Val CCT CCA AGG ATC TCG GAG TTG AAC ATG AAC Pro A-rg A-rg Ile Ser Glu Leu Asn Met Lys GAA AAA Glu Lys ATT GCC CCC ATC CCT CAA CCC AC Ile Ala pro Ile Pro Clu Cly Ser TTC TTC ATT CTT Phe Phe Ile Leu AAC CCC ATC Asn Pro Ile 875 AGC AAC ACC Ser Lys Thr 870 CAC ATC TTC CCC GTA CCC TGC Arg Val Gly Cys
CAC
His AAC CTC ATC AAC Lys Leu Ile Asn ACC AAC Thr Asn 890 CTC ATC CTT CTC Leu Ile Leu Val ATC ATG CTG AGC Ile Met Leu Ser GCT CC CTG GCC Aa Ala Leu Ala
CCA
Al a 905 GAG GAC CCC ATC Glu Asp Pro Ile ACC CAC TCC TTC Ser His Ser Phe
CGC
Arg AAC ACG ATA CTG Asn Thr Ile Leu 2982 3030 3078 3126 3174 3222 3270 3318 3366 3414 3462 3510 3558 3606 TAC TTT GAC TAT Tyr Phe Asp Tyr TTC ACA CCC ATC Phe Thr Ala Ile ACT GTT GAG ATC Thr Val Clu Ile CTC TTC Leu Leu AAG ATG ACA Lys Met Thr TTT CGA CCT TTC Phe Gly Ala Phe
CTC
Leu CAC AAA CCC CC His Lys Gly Ala AAC TAC TTC AAT TTG CTG Asn Tyr Phe Asn Leu Leu 935TC G TTC TC ArG 950 TCT CTG GTG CAT ATG CTC GTG GTT Asp Met Leu Val Val CCC CTC Cly Val TCA TTT Ser Phe 970 CCC ATT CAA TCC Cly Ile Gin Ser
ACT
Ser 975 CCC ATC TCC GTT Ala Ile Ser Val
CTC
Val AAC ATT CTC AG Lys Ile Leu Arg TTA AGC GTC Leu Arg Val CTG CGT Leu Arg CCC CTC AGG CC Pro Leu A-rg Ala AAC AGA GCA AAA Asn Arg Ala Lys
CGA
Cly CTT AAC CAC CTG CTC GZAG TCC GTC Leu Lys His Val Val Gin Cys Val 1005 TTC CTC CC Phe Val Ala ATC CG ACC 1000G ATC CCC ACC ATC CCC lie rg Tr10i15l AAC ATC ATG ATC GTC ACC ACC CTC CTC CAC TTC ATG TTT CCC TCT ATC Asn Ile Met Ile Val Thr Thr Leu Leu Ginl Phe Met Phe Ala Cys Ile 1020 1025 1030 -121- GCC GTC CAG TTG TTC AAC GGG AAC TTC Giy Val Gin Leu Phe Lys Cly Lys Phe 1035 1040 TAT CGC TCT ACG GAT GAA GCC Tyr Arg Cys Thr Asp Clu Ala 1045 CTT TTC ATC CTC TAC AAG GAT Leu Phe Ile Leo Tyr Lys Asp 1060 AAA ACT AAC Lys Ser Asn 1050 CCT GAA CAA TCC AGG GGA Pro Glu Clu Cys Arg Gly 1055 GGG GAT Gly Asp 1065 CTT CAC AGT CCT CTG GTC CGT Vai ASP Ser Pro Vai Vai Arg 1070 CAA CGG ATC Giu Arg Ile 1075 TCG CAA AAC ACT TrD Gin Asn Ser 1080 CCG CTC TTC ACA Ala Leu Phe Thr 1095 CAT TTC AAC TTC Asp Phe Asn Phe CAC AAC CTC Asp Asn Vai 1085 CTC TCT CCT ATC ATG Leo Ser Ala Met Met 1090 GTC TCC ACC Val Ser Thr TCG AAT GCA Ser Asn Giy ill! TCC ATC TTC Ser Ile Phe 1130 TTT GAG Phe Glu 1100 CCC TGC CCT CC TTC CTG TAT AAA CCC ATC CAC Gly Trp Pro Ala Leu Leu Tyr Lys Ala Ile Asp 1105 1110 GAG AAC ATC Glu Asn Ile TTC ATC ATC Phe Ile Ile CCC CCA ATC TAC Gly Pro Ile Tyr 1120 AAC CAC CCC GTG GAG ATC Asn His Arg Val Giu Ile 1125 TAC ATC Tyr Ile 1135 ATC ATT CT Ile Ile Va A CCT TTC I Ala Phe -1140 TTC ATC ATC Phe Met Met 3654 3702 3750 3798 3846 3894 3942 3990 4038 4086 4134 4182 4230 4278 AAC ATC TTT CTG GC As Ile Phe Vai Gly 1145 TTT GTC Phe Val 1150 ATC CTT ACA Ile Vai Thr TTT CAC Phe Gin 1155; GAA CAA GCA GAA Ciu Gin Giy Giu 1160 AAA GAG TAT AAC Lys Giu Tyr Lys AAC TCT Asn Cys 1165 GAG CTC CAC Glu Leu Asp AAA AAT Lys Asn 1170 CAC CCT CAC Gin Arg Gin TCT CTT Cys Val 1175 GAA TAC CCC TTC AAA GCA CCT CCC Glu Tyr Ala Leu Lys Ala Arg Pro 1180 TTC CCC Leu Arg 1185 AGA TAC ATC CCC AAA AAC Arg Tyr Ile Pro Lys Asn 1190 CCC TAC CAG TAC Pro Tyr Gin Tyr 1195 AAC TTC TG Lys Phe Trp TAC CTG Tyr Vai 1200 GTC AAC TCT Val Asn Ser TCG CCT TTC CAA Ser Pro Phe Giu 1205 TAC ATO ATG Tyr Met Met 1210 TTT GTC CTC Phe 
Vai Leu ATC ATG CTC Ile Met Leo 1215 AAC ACA CTC TGC TTG CCC ATC Asn Thr Leu Cys Leu Ala Met 1220 CAG CAC TAC GAG CAG TCC AAG ATG TTC AAT Gin His Tyr Gio Gin Ser Lys Met Phe Asn 1225 1230 CAT CCC ATC GAC Asp Ala Met Asp 1235 ATT CTC Ile, Leo 1240 AAC ATG GTC TTC ACC CCC GTC TTC ACC CTC GAG ATG GTT TTG AAA CTC Asn Met Val Phe Thr Gly Val Phe Thr Vai Glu Met Val Leo Lys Vai 1245 1250 1255 -122- 9* 9 *9 9 9 9 9 9 ATC GCA TTT AAG CCT AAG GGG TAT TTT AGT GAC GCC TGG AAC ACG TTT Ile Ala Phe Lys Pro Lys Giy Tyr Phe Ser Asp Ala Tro Asn Thr Phe 1260 1265 1270 GAC TCC CTC ATC GTA ATC GGC AGC ATT ATA GAC GTG GCC CTC AGC GAA Asp Ser Leu le Val lie Gly Ser Ile Ile Asp Val Ala Leu Ser Glu 1275 1280 1285 GCA GAC CCA ACT GAA AGT GAA AAT GTC CCT GTC CCA ACT GCT ACA CCT Ala Asp Pro Thr Glu Ser Giu Asn Val Pro Val Pro Thr Ala Thr Pro 1290 1295 1300 GGG AAC TCT GAA GAG AGC AAT AGA ATC TCC ATC ACC TTT TTC CGT CTT Gly Asn Ser Glu Giu Ser Asn Arg Ile Ser Ile Thr Phe Phe Arg Leu 1305 1310 1315 1320 TTC CGA GTG ATG CGA TTG GTG AAG CTT CTC AGC AGG GGG GAA GGC ATC Phe Arg Val Met Arg Leu Vai Lys Leu Leu Ser Arg Gly Giu Giy Ile 1325 1330 1335 CGG ACA TTG CTG TGG ACT TTT ATT AAG TTC TTT CAG GCG CTC CCG TAT Arg Thr Leu Leu Trp Thr Phe Ile Lys Phe Phe Gin Ala Leu Pro Tyr 1340 1345 1350 GTG GCC CTC CTC ATA GCC ATG CTG TTC TTC ATC TAT GCG GTC ATT GGC Val Ala Leu Leu Ile Ala Met Leu Phe Phe Ile Tyr Ala Val Ile Gly 1355 1360 1365 ATG CAG ATG TTT GGG AAA GTT GCC ATG AGA GAT AAC AAC CAG ATC AAT Met Gin Met Phe Gly Lys Val Aa Met Arg Asp Asn Asn Gin Ile Asn 1370 1375 1380 AGG AAC AAT AAC TTC CAG ACG TTT CCC CAG GCG GTG CTG CTG CTC TTC Arg Asn Asn Asn Phe Gin Thr Phe Pro Gin Ala Val Leu Leu Leu Phe 1385 1390 1395 1400 AGG TGT GCA ACA GGT GAG GCC TGG CAG GAG ATC ATG CTG GCC TGT CTC Arg Cys Aia Thr Gly Giu Ala Trp Gin Giu Ile Met Leu Ala Cys Leu 1405 1410 1415 CCA GGG AAG CTC TGT GAC CCT GAG TCA GAT TAC AAC CCC GGG GAG GAG Pro Giy Lys Leu Cys Asp Pro Giu Ser Asp Tyr Asn Pro Gly Giu Glu 1420 1425 1430 CAT 
ACA TGT GGG AGC AAC TTT GCC ATT GTC TAT TTC ATC AGT TTT TAC His Thr Cys Gly Ser Asn Phe Ala Ile Vai Tyr Phe Ile Ser Phe Tvr 1435 1440 1445 ATG CTC TGT GCA TTT CTG ATC ATC AAT CTG TTT GTG GCT GTC ATC ATG Met Leu Cys Ala Phe Leu Ile Ile Asn Leu Phe Val Ala Val Ile Met 1450 1455 1460 GAT AAT TTC GAC TAT CTG ACC CGG GAC TGG TCT ATT TTG GGG CCT CAC Asp Asn Phe Asp Tyr Leu Thr Arg Asp Trp Ser Ile Leu Giy Pro His 1465 1470 1475 1480 4326 4374 4422 4470 4518 4566 4614 4662 4710 4758 4806 4854 4902 4950 -123- CAT TTA GAT His Leu Asp A-AG GGA AG Lys Gly Arg GAA TTC AAA Giu Phe Lys 1485 AGA ATA TGG TCA GAA Arg Ile Trp Ser Glu 1490 TAT GAC CCT Tyr Asp Pro GAG GCA Giu Ala ATA AAA Ile Lys 1500 CAC CTT GAT His Leu Asp GTG GTC Val Val ACT CTG CTT Thr Leu Leu CGA CCC ATC Arg Arcg Ile
S
S. CAG CCT CCC CTG GG Gin Pro Pro Leu Giy AAG AGA TTA GTT GCC Lys Arg Leu Val Ala 1530 ATG TTT A-AT CCA ACC Met Phe Asn Ala Thr 1545 TTT CCC A-AG TTA TGT Phe Gly Lys Leu Cys 1520 ATG AAC ATG CCT CTC Met Asn Met Pro Leu 1535 CTC TTT GCT TTG GTT Leu Phe Ala Leu Vai 1550 AAC ACT CAC Asn Ser Asp 1540 CCA CA-C AGG GTA CC TC Pro His Arg Val Ala Cys 1525 CCC ACA GTC Cly Thr Val CCA ACG A-AG ACC CA-A Lys Thr Clu ATA A-AC AA-A Ile Lys Lys CCC AAC CTG Gly Asn Leu 1565 GAG CAA CCT Ciu Gin Ala A-AT GA-A GA-A A-sn Giu Glu GCT CTT AAC ATC Ala Leu Lys Ile 1560 CTT CCC GCT CTC Leu Arg Ala Val 1575 CTT CAC CA-A CTT Leu Asp Gin Val 1590 AAC TTC TAT CC Lys Phe Tyr Ala ATT TGG Ile Tr-p 1580 A-AG A-AA ACC AGC ATC A-AA TTA Lys Lys Thr Ser Met Lys Leu 1585 CAT CAT GAG CTA A-CC CTC CCC Asp Asp Ciu Val Thr Val Gly 1600 4998 5046 5094 5 14 2 5190 5238 5286 5334 53 82 5430 5478 5526 5574 5622 GTC CCT CCA CCT CCT Val Pro Pro Ala Cly 1595 ACT TTC CTC ATA Thr Phe Leu Ile 1610 CA-C CAC TAC TTT ACC Gin Asp Tyr Phe Arg 1615 A.AA TTC AAC AAA CCC AA-A CAA Lys Phe Lys Lys Arg Lys Glu
S
CAA CGA Gin Cly 162S CTC CTC CCA Leu Val Cly A-AC TAC Lys Tyr 1630 CCT CC A-AC Pro Ala Lys AA-C A-CC A-CA Asn Thr Thr A-TT CCC CTA Ile Ala Leu CAG CC CCA TTA Gin Ala Cly Leu ACG A-CA A-rg Thr 1~4S CTC CAT CAC ATT CCC CCA GA-A ATC Leu His Asp Ile Cly Pro Clu Ile 1650 CCC CCT Arg Arg CCT ATA TCC Ala Ile Ser TGT CAT Cys Asp 1660 TTC CAA CAT Leu Gin Asp CA-C GAG CCT GAG Asp Ciu Pro Giu CAA ACA AAA CCA Ciu Thr Lys Arg CAA GAA CAA CAT CAT Glu Giu Ciu Asp Asp 1675 CAT CTC A-AT CAT CTT His Val Asn His Val 1690 CTC TTC AAA AGA AAT Vai Phe Lys A-rg Asn 1680 A-AT ACT CAT AGC AGA A-sn Ser Asp Arg Arg 1695 GCT CCC CTC CTT CGA A-AC Gly Ala Leu Leu Cly Asn 1685 CAT TCC CTT CAC CAC A-CC Asp Ser Leu Gin Gin Thr 1700 -124- AAT ACC Asn Thr 1705 ACC CAC CCT Thr His Arg CCC CTG CAT CTC CAA AGC CCT TCA ATT CCA CCT Pro Leu His Val Gin Arg Pro Ser Ile Pro Pro 1710 1715 1720 GCA ACT GAT ACT Ala Ser Asp Thr
GAG
Giu 172~ AAA CCC CTG Lys Pro Leu AAC CAT AAT Asn His Asn TTT CCT CCA GCA Phe Pro Pro Aia 1730 CCA AAT TCC CTC GY Asn Ser Vai TCT CAT AAC Cys His Asn CAT CAT His His 1740 TCC ATA Ser Ile 1745 GA AAG CAA Cy Lys Cmn 000* 00 0 0 **0 TCA ACA AAT GCC Ser Thr Asn Ala 1755 CCA AAC CCC CC Cly Lys Arg pro 1770 CAT CAT TCT TCC His His Ser Ser 1785 CTC AAA AGA ACC Val Lys Arg Thr AAT CTC Asn Leu AAT AAT GCC Asn Asn Ala AAT ATG TCC AAA Asn Met Ser Lys 176 GAG CAT GTG TCT Giu His Val Ser CTT CCC ACC Val Pro Thr 1750 CCT CCC CAT Ala Ala His CAA AAT CCC Ciu Asn Gly AGC ATT CCC AAC CTT Ser Ile Cly Asn Leu 1775 CAC AAC CAT His Lys His 1790 CCC TAT TAT Arg Tyr Tyr 1805 CAC CCC GAG Asp Arg Clu CCT CAG Pro Gin ACA ACG TCC Arg Arg Ser
ACT
S er CAT GAA CAG CTC CCA ACT ATT Asp Giu Gin Leu Pro Thr Ile 1820 CAA ACT TAC ATT ACC Giu Thr Tyr Ile Arg 1810 TCC CCC CAA GAC CCA CYs Arg Ciu Asp pro TCC GACT 1800G Ser Asp Ser Cly 1815 GAG ATA CAT GC Cu Ile His ly 5670 5718 5766 5814 5862 5910 5958 6006 6054 6102 6150 6198 6246 6294 TAT TTC ACC CAC CCC Tyr Phe Arg Asp pro 1835 GAG CAA TCC TAC GAG Ciu Ciu Cys Tyr Ciu 1850 CAC TCC TTG CCC CAC His Cys Leu Cly Ciu 1840 CAT CAC ACC TCC CCC Asp Asp Ser Ser Pro 1855 CAC GAC TAT TTC ACT ACT Gln Ciu Tyr Phe Ser Ser 1845 ACC TCC AGC ACC CAA AAC Thr Trp, Ser Arg Gin Asn TAT GC Tyr Cly 1865 TAC TAC AC Tyr Tyr Ser ACA TAC CCA Arg Tyr Pro 1870 CCC ACA AAC ATC Cly Arg As Ile CAC TCT GAG Asp Ser Ciu
ACC
Arg CCC CCA CCC TAC Pro Arg Cly Tyr CCC CTT TCC TAT Pro Vai Cys Tyr 1900 CCC ACC CCA CCA Pro Thr Pro Ala 1915 CAT CAT His His 1885 CAT TCA Asp Ser TCC CAC Ser His CCC CAA CCA Pro Gin Ciy TTC TTG GAG Phe Leu Giu GAC GATG 1880C Asp Asp Asp Ser CCC AGA TCT CCA ACC AGA CCC CTA CTA CCT Arg Arg Ser Pro Arg Arg Arg Leu Leu Pro 1905 1910 CCC AGA TCC TCC TTC AAC TTT GAG TCC CTC Arg Arg Ser Ser Phe Asn Phe Ciu Cys Leu 1920 1925 -125- CGC CGG CAG Arg Arg Gin 1930 CAT CGC ACG His Arg Thr 1945 AGC AGC CAG GAA GAG GTC CCG TCG TCT CCC ATC TTC CCC Ser Ser Gin Glu Glu Val Pro Ser Ser Pro Ile Phe Pro 1935 1940 GCC CTG CCT CTG CAT CTA ATG CAG CAA CAG ATC ATG GCA Ala Leu Pro Leu His Leu Met Gin Gin Gin Ile Met Ala 1950 1955 1960 GTT GCC GGC CTA Val Ala Gly Leu GAT TCA AGT AAA Asp Ser Ser Lys 1965 GCC CAG AAG Ala Gin Lys 1970 TAC TCA CCG Tyr Ser Pro AGT CAC Ser His 1975 *eO.
0* r 4 6 e* 4 4* 4 .e44 4**e .r TCG ACC CGG TCG TGG GCC ACC CCT Ser Thr Arg Ser Trp Ala Thr Pro 1980 CCA GCA Pro Ala 1985 ACC CCT CCC Thr Pro Pro TAC CGG GAC Tyr Arg Asp 1990 TGG ACA CCG TGC TAC ACC Trp Thr Pro Cys Tyr Thr 1995 CCC CTG ATC Pro Leu Ile 2000 AGC CTG CCG Ser Leu Pro 2015 CAA GTG GAG CAG TCA GAG GCC Gin Val Glu Gin Ser Glu Ala 2005 TCC CTG CAC CGC AGC TCC TGG 3er Leu His Arg Ser Ser Trp CTG GAC CAG Leu Asp Gin 2010 GTG AAC GGC Val Asn Gly 2020 TAC ACA Tyr Thr 2025 GAC GAG CCC Asp Glu Pro GAC ATC TCC Asp Ile Ser 2030 TAC CGG ACT TTC Tyr Arg Thr Phe 2035 ACA CCA GCC Thr Pro Ala
AGC
Ser 2040 6342 6390 6438 6486 6534 6582 6630 6678 6726 6774 6822 6870 6918 6966 CTG ACT GTC CCC Leu Thr Val Pro AGC AGC Ser Ser 2045 TTC CGG AAC Phe Arg Asn AAA AAC Lys Asn 2050 AGC GAC AAG Ser Asp Lys CAG AGG Gin Arg 2055 AGT GCG GAC AGC TTG GTG Ser Ala Asp Ser Leu Val 2060 GAG GCA GTC CTG ATA Glu Ala Val Leu Ile 2065 TCC GAA GGC TTG GGA Ser Glu Gly Leu Gly 2070 ACA AAA CAC GAA ATC Thr Lys His Glu Ile 2085 CGC TAT GCA AGG Arg Tyr Ala Arg 2075 GAC CCA AAA TTT GTG TCA GCA Asp Pro Lys Phe Val Ser Ala 2080 GCT GAT GCC Ala Asp Ala 2090 TGT GAC CTC Cys Asp Leu ACC ATC GAC Thr Ile Asp 2095 GAG ATG GAG AGT GCA GCC AGC Glu Met Glu Ser Ala Ala Ser 2100 ACC CTG Thr Leu 2105 CTT AAT GGG Leu Asn Gly AAC GTG CGT Asn Val Arg 2110 CCC CGA GCC AAC GGG GAT Pro Arg Ala Asn Gly Asp 2115 GAG CTA CAG GAC TTT GGT Glu Leu Gin Asp Phe Gly 2130 GTG GGC Val Gly 2120 CCT GGC Pro Gly 2135 CCC CTC TCA CAC CGG CAG GAC TAT Pro Leu Ser His Arg Gin Asp Tyr 2125 TAC AGC GAC GAA GAG CCA GAC CCT Tyr Ser Asp Glu Glu Pro Asp Pro 2140 GGG AGG GAT GAG GAG GAC CTG GCG Gly Arg Asp Glu Glu Asp Leu Ala 2145 2150 -126- GAT GAA ATG ATA TGC ATO ACC ACC TTG TAGCCCCCAG COACGOCAG 7013 Asp Glu Met Ile Cys Ile Thr Thr Leu 2155 21G0 ACTGGCTCTG GCCTCAGGTG GGGCGCAGGA GAGCOAGGG AAAAGTGCCT CATAGTTAGG 7073 AAAGTTTAGG CACTAGTTGG GAGTAATATT CAATTAATTA GACTTTTGTA TAAGAGATGT 7133 CATGCCTCA GAAAGCCATA AACCTGGTAG GAACAGGTCC CAAGCGGTTG AGCCTGGCAG 7193 AGTACCATGC GCTCGGCCCC AGCTGCAGGA AACAGCAGGC CCCGCCCTCT CACAGAGGAT 7253 GGOTGAGOAG GCCAGACCTG CCCTGCCCCA TTGTCCAGAT GGGC.ACTGCT GTGGAGTCTG 7313 CTTCTCCCAT GTACCAGGGC ACCAGGCCCA CCCAACTGAA OGCATGGCGG CGGTGCAG 7373 GGAAAGTTA AAGGTGATGA CGATCATCAC ACCTGTGTCG TTACCTCAGC CATCGGTCTA 7433 GCATATCAGT CACTGGGCCC AACATATCCA TTTTTAA.ACC CTTTCCCCCA AATACACTGC 7493 GTCCTGGTTC CTGTTTAGCT GTTCTGAAAT ACGGTGTGTA AGTAAGTCAG AACCCAGCTA 7553 CCAGTGATTA TTGCGAGGGC AATGGGACCT CATAAATAAG GTTTTCTGTG ATGTOACGCC 7613 AGTTTACATA AGAGAATATC AC 73 INFORMATION FOR SEQ ID NO:2: Wi SEQUENCE
CHARACTERISTICS:
  LENGTH: 104 base pairs
  TYPE: nucleic acid
  STRANDEDNESS: double
  TOPOLOGY: linear
(ii) MOLECULE TYPE: DNA (genomic)
(ix) FEATURE:
  NAME/KEY: CDS
  LOCATION: 1..102
(ix) FEATURE:
  NAME/KEY: misc_feature
  LOCATION: 1..104
  OTHER INFORMATION: /note= "A 104-nucleotide alternative exon of alpha-1D."
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:2:
GTA AAT GAT GCG ATA GGA TGG GAA TGG CCA TGG GTG TAT TTT GTT AGT 48
Val Asn Asp Ala Ile Gly Trp Glu Trp Pro Trp Val Tyr Phe Val Ser
1 5 10
CTG ATC ATC CTT GGC TCA TTT TTC GTC CTT AAC CTG GTT CTT GGT GTC 96
Leu Ile Ile Leu Gly Ser Phe Phe Val Leu Asn Leu Val Leu Gly Val
25
CTT AGT GG
Leu Ser
INFORMATION FOR SEQ ID NO:3:
SEQUENCE CHARACTERISTICS:
  LENGTH: 6575 base pairs
  TYPE: nucleic acid
  STRANDEDNESS: single
  TOPOLOGY: linear
(ii) MOLECULE TYPE: DNA (genomic)
(ix) FEATURE:
  NAME/KEY: CDS
  LOCATION: 1..6492
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:3:
GTC AAT GAG AAT Val Asn Glu Asn ACG AGG ATG TAC Thr Arg Met Tyr
ATT
Ile 10 CCA GAG GAA AAC Pro Glu Glu Asn CAC CAA His Gln GGT TCC AAC Gly Ser Asn AAT GCG GCA Asn Ala Ala 35 GGG AGC CCA CGC Gly Ser Pro Arg GCC CAT GCC AAC Ala His Ala Asn ATG AAT GCC Met Asn Ala GCG GGG CTG GCC Ala Gly Leu Ala GAG CAC ATC CCC ACC CCG GGG GCT Glu His Ile Pro Thr Pro Gly Ala GCC CTG Ala Leu 50 TCG TGG CAG GCG Ser Trp Gin Ala ATC GAC GCA GCC Ile Asp Ala Ala CAG GCT AAG CTG Gln Ala Lys Leu 96 144 192 240 288
ATG
Met GGC AGC GCT GGC Gly Ser Ala Gly GCG ACC ATC TCC Ala Thr Ile Ser GTC AGC TCC ACG Val Ser Ser Thr CGG AAG CGC CAG Arg Lys Arg Gln TAT GGG AAA CCC Tyr Gly Lys Pro AAG CAG GGC AGC Lys Gln Gly Ser ACC ACG Thr Thr GCC ACA CGC Ala Thr Arg ATC CGG AGG Ile Arg Arg 115 CCC CGA GCC CTG Pro Arg Ala Leu
CTC
Leu 105 TGC CTG ACC CTG Cys Leu Thr Leu AAG AAC CCC Lys Asn Pro 110 TTT GAA ATA Phe Glu Ile GCC TGC ATC AGC Ala Cys Ile Ser
ATT
Ile 120 GTC GAA TGG AAA Val Glu Trp Lys ATT ATT Ile Ile 130 TTA CTG ACT ATT Leu Leu Thr Ile TTT GCC AAT Phe Ala Asn 135 TGT GTG GCC TTA GCG ATC TAT Cys Val Ala Leu Ala Ile Tyr 140 ATT CCC TTT CCA GAA GAT GAT TCC AAC GCC ACC AAT TCC AAC CTG GAA -128a a. a a a. Ile Pro Phe 145 CGA GTG GAA Arg Val Giu AAA GTA ATC Lys Val Ile AAC GGC TGG Asn Gay Trp 195 AGT GCA ATT Ser Ala Ile 210 GGA GGG AAA Giy Gay Lys 225 GTG CTG CGC Val Leu Arg GTC CTG AAT Val Leu Asn CTG CTT GTG Leu Leu Val 275 CTC TTC ATG C Leu Phe Met C 290 GCA GAT GTT C Ala Asp Val P 305 GGC CAC GGG- c Gly His GayA GAT GGT CCCA Asp Gly Pro L 3 ATG CTC ACG G Met Leu Thr V 355 CTG TAC TGG G Pr
TA
Ty
GC
Al 18i
AAC
Asi
TTI
Lei.
GGG
Gly
CCC
Pro
TCC
Ser 260
CTG
Aeu
;GG
;iy
:CA
'ro
GG
xg
AG
ys 40
TG
al rC o Gi T CT r Le 16
CTA
-i Let ~GA7
GCC
Ala
CTG
Leu 245
ATC
Ile
TTT
Phe
AAG
Lys
GCA
Ala
CAG
Gin 325
CAC
His
TTC
Phe
AAT
u As is C TT u Ph 5 T' GG r Gi'
~CT)
21 Let
~C-A
1Gir
GGA
Gly 230
CGG
Arg
ATC
Ile
GTC
Val
ATG
Met
GAA
Glu 310
TGC
Cys
GGC
Gly
CAG
Gin
GAT
p Asp Se 0 T CTC AT e Leu 11 A CTC CT y Leu Le ~GAT TT 2Asp Ph 20 GCA ACC IAl a Th 215 TTT GA] Phe Asr CTG GTG Leu Val AAG GCC Lys Ala ATC ATC Ile Ile 280 CAC AAG His Lys 295 GAT GAC Asp Asp CAG AAC Gin Asn ATC ACC Ile Thr T GC ATC Cys Ile 360 GCC GTA r Asn Ala Thr Asn 155 A ATT TTT ACG GTG e Ile Phe Thr Val 170 C TTT CAC CCC AAT u Phe His Pro Asn 185 T' ATA ATT GTG GTT e Ile Ile Vai Val 0 C AAA GCA GAT GGG -Lys Ala Asp Giy 220 GTG AAG GCG CTG Vai Lys A-la Leu 235 TCC GGA GTC CCA *Ser Gly Val Pro 250 ATG GTC CCC CTG Met Val Pro LeuI 265 ATC TAC GCC ATC 7 Ile Tyr Ala Ile I ACC TGC TAC AAC C Thr Cys Tyr Asn G 300 CCT TCC CCT TGT G Pro Ser Pro Cys A 315 GGC ACG GTG TGCA Giy Thr Val Cys L 330 AAC TTT GAC AAC T Asn Phe Asp Asn P.
345 ACC ATG GAG GGC TC Thr met Glu Gly T: GGA AGG GAC TGG CC Se
GA
Gi
GC
Al
GT
Va.
GC
Al~
AGC
AkrS
AGT
Ser
"TG
'eu
TC
:le
AG
in
CG
la
AG
ys
TT
he
G
rp 65 cc ~r As A GC u Al C TA a Ty 19 G GG 1 Gl'
SAA(
a AsI 8GCC
CTC
Leu
CAC
His 270
GGC
Gly
GAG
Giu
CTG
Leu
CCC
Pro
GCC
Al a 350
ACG
Thr
TGG
;n Leu Glu 160 'G TTT TTA a Phe Leu 175 C CTC CGC r Leu Arg 0 G CTT TTT y Leu Phe C GCT CTC i Ala Leu TTC CGC IPhe Arg 240 CAG GTG Gin Vai 255 ATC GCC Ile Ala TTG GAG Leu Giu GGC ATA Giy Ile GAA ACG Giu Thr 320 GGC TGG Giy Trp, 335 TTC GCC Phe Ala GAC GTG Asp Val ATC
TAT
528 576 624 672 720 768 816 a.
*a.a 912 960 1008 1056 1104 1152 -129- Leu Tyr Trp Val Asn Asp Ala Val G1Y Arg Asp Trp Pro Trn Ile Tyr 370 375 380
TTT
Phe 385 GTT A-CA CTA Val Thr Leu ATC ATC ATA GGG TOA TTT TTT GTA OTT A-AC TTG GTT Ile Ile Ile Giy Ser Phe Phe Val Leu A-sn iLeu Val 390 395 400 C2TC GGT GTG OTT AGO GGA GAG TTT TOO Leu Giy Vai Leu Ser Gly Giu Phe Ser
AAA
Lys GAG AGG GAG AAG Giu A-mg Giu Lys GOC A-AG A-ia Lys GOC CGG GGA Ala A-mg Gly GAT OTO A-AA Asp Leu Lys 435 TTO CAG AAG OTG Phe Gin Lys Leu
CGG
A-mg GAG AAG CAG CAG Gu Lys Gn Gn CTA GA-A GAG Leu Giu Giu 430 GAO ATO GAT Asp Ile Asp GGO TAO CTG GAT Giy Tyx- Leu Asp
TGG
Trp A-TO ACT CAG GC Ile Thr Gin Ala OCT GAG Pro Glu 450 A-AT GAG GAO GA-A A-sn Giu Asp Giu
GGO
Giy ATG GAT GAG GAG Met Asp Giu Giu
A-AG
Lys COO OGA A-AC A-GA Pro A-mg A-sn Arg a. a. a.
GGO
Gly 465 A-CT COG GOG GGO Thin Pro Ala Gly
A-TG
Met 470 OTT GAT CAG A-AG Leu Asp Gin L~ys
AAA
Lys GGG AAG TTT GT Gly Lys Phe Ala 1200 1248 1296 1344 1392 1440 1488 1536 1584 1632 1680 TTT AGT CAC TOO Phe Ser His Ser GAG TOO GTO A-AC Glu Ser Vai Asn 500 A-AC TGO GGG GC Asn Cys Gly Ala 515 GA-A A-C OAT GTG Giu Thin His Val ATG COO A-CO AGT Met Pro Thr Ser GAG ACC Giu Thr ACC GAA A-AC GTG Thr Giu Asn Val
GOT
Al a GGA GGT GAO ATO Gly Gly Asp Ile 495GA A Giu Gly Giu 510 A-AG TTO AGO A-GG CTG GC A-rg Leu Ala CGG A-TO TOO A-AG A-mg Ile Ser Lys
TOA
Ser OGO TAO A-mg Tyr 530 TGG OGO CGG TGG Trp, A-mg Aing Trp, CGG TTO TGC AGA A-rg Phe Cys A-rg A-AG TGO OGO GC Lys Cys A-rg Ala
GCA
Al a 545 GTC A-AG TOT A-AT Vai Lys Ser Asn
GTC
Val TTO TAO TGG OTG Phe Tyr Trp Leu
GTG
Val A-TT TTC OTG GTG
TTC
OTO A-AC ACG OTO A-CO ATT GOC TOT GAG CA-C TA-C A-AC CAG 000 A-AC TGG Leu A-sn Thr Leu Thr Ile A-ia Ser Giu His Tyr A-sn Gin Pro A-sn Trp 565 570 575 OTO A-CA GAA Leu Thr Glu GTO CAA GAC AG GOA A-AC Vai Gin Asp Thin A-la A-sn 580 585 A-AG GOC OTO CTG GCC CTG TTC 1728 1776 1824 ACG GCA GAG ATG OTO CTG A-AG ATG TAO AGO OTG GGC OTG CAG GOC TA-C -130- Thr Ala Giu Met Leu Leu Lys MetTySeLeGlLeGiAaTv 595 600 605e euGyLu l l TTO GTG TCO OTC TTC AAC OGO TTT GAO TGO TTC GTC GTG TGT GGO GGC 1872 Phe Val Ser Leu Phe Asn Arg Phe Asp Cys Phe Val Val Cys Gly Giy 610 615 620 ATC CTG GAG ACC ATC CTG GTG GAG ACC AAG ATC ATG TOO COA CTG GGC 1920 Ile Leu Giu Thr Ile Leu Val Giu Thr Lys Ile Met Ser Pro Leu Giy 625 630 635 640 ATO TOO GTG OTO AGA TGO GTO CGG CTG CTG AGG ATT TTC AAG ATC ACO 1968 Ile Ser Val Leu Arg Cys Vai Arg Leu Leu Arg Ile Phe Lys Ile Thr 645 650 655 AGG TAO TGG AAO TCC TTG AGO AAC CTG GTG GOA TOO TTG CTG AAO TOT 2016 A-rg Tyr Trp Asn Ser Leu Ser A-sn Leu Val Ala Ser Leu Leu Asn Ser 660 665 670 GTG OGO TOO ATO GOO TOO OTG OTO OTT OTO OTO TTO OTO TTO ATO ATO 2064 Val Arg Ser Ile Ala Ser Leu Leu Leu Leu Leu Phe Leu Phe Ile Ile 675 680 685 ATO TTO TOO OTO OTG GGG ATG CAG OTO TTT GGA GGA AAG TTO AAO TTT 2112 Ile Phe Ser Leu Leu Gly Met Gin Leu Phe Gly Giy Lys Phe Asn Phe 690 695 700 GAT AG TGOAG ACC OGG AGG AGO AOA TTO GAT AAO TTO 000 OAG TOO 16 Asp Giu Met Gin Thr Arg Arg Ser Thr Phe Asp sPhPrGlSe 7s10715 720 C TO OTO ACT GTG TTT OAG ATO OTG ACC GGG GAG GAO TGG AAT TOG GTG 2208 Leu Leu Thr Val Phe Gin Ile Leu Thr Giy G 1u Asp Trp Asn Ser Val 725 730 735 *ATG TAT GAT GGG ATO ATG GT TAT GGG GGO 000 TT TTT OA GGG ATG 2256 Met Tyr Asp Gly Ile Met Ala Tyr Gly Gly Pro Ser Phe Pro Gly Met *..740 745 750 TTA GTO TGT ATT TAO TTO ATO ATO OTO TTO ATO TGT GGA AAO TAT ATC 2304 *Leu Val Cys Ile Tyr Phe Ile Ile Leu Phe Ile Cys Gly Asn Tyr Ile 755 760 765 Z OTA OTG AAT GTG TTO TTG GOO ATT GOT GTG GAO AAO OTG GOT GAT GOT 2352 Leu Leu Asn Val Phe Leu Ala Ile Ala Val Asp Asn Leu Ala Asp 
Ala 770 775 780 GAG AGO OTO ACA TOT GOO CAA AAG GAG GAG GAA GAG GAG AAG GAG AGA 2400 Glu Ser Leu Thr Ser Ala Gin Lys Giu Giu Giu Glu Giu Lys Giu Arg 785 790 795 800 AAG AAG OTG GOC AGG ACT CO AGO OCA GAG AAG AAA CAA GAG TTG GTG 2448 Lys Lys Leu Ala Arg Thin Ala Ser Pro Giu Lys Lys Gin Giu Leu Val 805 810 815 GAG AAG OOG GOA GTG GGG GAA TOO AAG GAG GAG AAG ATT GAG OTG AAA 249f; -131-
Glu Lys Pro Ala Val Gly Glu Ser Lys Glu Glu Lys Ile Glu Leu Lys
820 825 830
TCC ATC ACG GCT GAC GGA GAG TCT CCA CCC GCC ACC AAG ATC AAC ATG 2544
Ser Ile Thr Ala Asp Gly Glu Ser Pro Pro Ala Thr Lys Ile Asn Met
835 840 845
GAT GAC CTC CAG CCC AAT GAA AAT GAG GAT AAG AGC CCC TAC CCC AAC 2592
Asp Asp Leu Gln Pro Asn Glu Asn Glu Asp Lys Ser Pro Tyr Pro Asn
850 855 860
CCA GAA ACT ACA GGA GAA GAG GAT GAG GAG GAG CCA GAG ATG CCT GTC 2640
Pro Glu Thr Thr Gly Glu Glu Asp Glu Glu Glu Pro Glu Met Pro Val
865 870 875 880
GGC CCT CGC CCA CGA CCA CTC TCT GAG CTT CAC CTT AAG GAA AAG GCA 2688
Gly Pro Arg Pro Arg Pro Leu Ser Glu Leu His Leu Lys Glu Lys Ala
885 890 895
GTG CCC ATG CCA GAA GCC AGC GCG TTT TTC ATC TTC AGC TCT AAC AAC 2736
Val Pro Met Pro Glu Ala Ser Ala Phe Phe Ile Phe Ser Ser Asn Asn
900 905 910
AGG TTT CGC CTC CAG TGC CAC CGC ATT GTC AAT GAC ACG ATC TTC ACC 2784
Arg Phe Arg Leu Gln Cys His Arg Ile Val Asn Asp Thr Ile Phe Thr
915 920 925
AAC CTG ATC CTC TTC TTC ATT CTG CTC AGC AGC ATT TCC CTG GCT GCT 2832
Asn Leu Ile Leu Phe Phe Ile Leu Leu Ser Ser Ile Ser Leu Ala Ala
930 935 940
GAG GAC CCG GTC CAG CAC ACC TCC TTC AGG AAC CAT ATT CTG TTT TAT 2880
Glu Asp Pro Val Gln His Thr Ser Phe Arg Asn His Ile Leu Phe Tyr
945 950 955 960
TTT GAT ATT GTT TTT ACC ACC ATT TTC ACC ATT GAA ATT GCT CTG AAG 2928
Phe Asp Ile Val Phe Thr Thr Ile Phe Thr Ile Glu Ile Ala Leu Lys
965 970 975
ATG ACT GCT TAT GGG GCT TTC TTG CAC AAG GGT TCT TTC TGC CGG AAC 2976
Met Thr Ala Tyr Gly Ala Phe Leu His Lys Gly Ser Phe Cys Arg Asn
980 985 990
TAC TTC AAC ATC CTG GAC CTG CTG GTG GTC AGC GTG TCC CTC ATC TCC 3024
Tyr Phe Asn Ile Leu Asp Leu Leu Val Val Ser Val Ser Leu Ile Ser
995 1000 1005
TTT GGC ATC CAG TCC AGT GCA ATC AAT GTC GTG AAG ATC TTG CGA GTC 3072
Phe Gly Ile Gln Ser Ser Ala Ile Asn Val Val Lys Ile Leu Arg Val
1010 1015 1020
CTG CGA GTA CTC AGG CCC CTG AGG GCC ATC AAC AGG GCC AAG GGG CTA 3120
Leu Arg Val Leu Arg Pro Leu Arg Ala Ile Asn Arg Ala Lys Gly Leu
1025 1030 1035 1040
AAG CAT GTG GTT CAG TGT GTG TTT GTC GCC ATC CGG ACC ATC GGG AAC 3168
Lys His Val Val Gin Cys Val Phe Val Ala Ile Arg Thr Ile Gly Asn. 1045 1050 1055 ATC GTG ATT GTC ACC ACC CTG CTG CAG TTC ATG TTT GCC TGC ATC GGG Ile Val Ile Val Thr Thr Leu Leu Gin Phe Met Phe Ala Cys Ile Gly 1060 1065 1070 GTC CAG CTC TTC AAG GGA AAG CTG TAC ACC TGT TCA GAC AGT TCC AAG Val Gin Leu Phe Lys Gly Lys Leu Tyr Thr Cys Ser Asp Ser Ser Lys 1075 1080 1085 CAG ACA GAG GCG GAA TGC AAG GGC AAC TAC ATC ACG TAC AAA GAC GGG Gin Thr Giu Ala Giu Cys Lys Gly Asn Tyr Ile Thr Tyr Lys Asp Gly 1090 1095 1100 GAG GTT GAC CAC CCC ATC ATC CAA CCC CGC AGC TGG GAG AAC AGC AAG Giu Vai Asp His Pro le Ile Gin Pro Arg Ser Trp Giu Asn Ser Lys 1105 1110 1115 1120 TTT GAC TTT GAC AAT GTT CTG GCA GCC ATG ATG GCC CTC TTC ACC GTC Phe Asp Phe Asp Asn Vai Leu Ala Ala Met Met Ala Leu Phe Thr Val 1125 1130 1135 TCC ACC TTC GAA GGG TGG CCA GAG CTG CTG TAC CGC TCC ATC GAC TCC Ser Thr Phe Giu Gly Trp, Pro Giu Leu Leu Tyr Arg Ser Ile Asp Ser 1140 1145 -1150 CAC ACG GAA GAC AAG GGC CCC ATC TAC AAC TAC CGT GTG GAG ATC TCC His Thr Giu Asp Lys Giy Pro Ile Tyr Asn Tyr Arg Val Giu Ile Ser 1155 1160 1165 ATC TTC TTC ATC ATC TAC ATC ATC ATC ATC GCC TTC TTC ATG ATG AAC Ile Phe Phe Ile Ile Tyr Ile Ile Ile Ile A-la Phe Phe Met Met Asn 1170 1175 1180 ATC TTC GTG GGC TTC GTC ATC GTC ACC TTT CAG GAG CAG GGG GAG CAG Ile Phe Val Gly Phe Val Ile Val Thr Phe Gin Giu Gin Gly Giu Gin 1185 1190 1195 1200 GAG TAC AAG AAC TGT GAG CTG GAC AAG AAC CAG CGA CAG TGC GTG GAA Giu Tyr Lys Asn Cys Giu Leu Asp Lys Asn Gin Arg Gin Cys Vai Giu 1205 1210 1215 TAC GCC CTC AAG GCC CGG CCC CTG CGG AGG TAC ATC CCC AAG AAC CAG Tyr A-la Leu Lys Ala Arg Pro Leu Arg Arg Tyr Ile Pro Lys Asn. 
Gin 1220 1225 1230 CAC CAG TAC AAA GTG TGG TAC GTG G3TC AAC TCC ACC TAC TTC GAG TAC His Gin Tyr Lys Val Trp Tyr Val Val Asn Ser Thr Tyr Phe Giu Tyr 1235 1240 1245 CTG ATG TTC GTC CTC ATC CTG CTC AAC ACC ATC TGC CTG GCC ATG CAG Leu Met Phe Val Leu Ile Leu Leu Asn Thr Ile Cys Leu Ala Met Gin 1250 1255 1260 CAC TAC GGC CAG AGC TGC CTG TTC AAA ATC GCC ATG AAC ATC CTC AAC 3216 3264 3312 3360 3408 3456 3504 3552 3600 3648 3696 3744 3840 -133- His Tyr Gly Gin Ser Cys Leu Phe Lys Ile Aia Met Asn Ile Leu Asn 1265 1270 1275 1280 ATG CTC TTC ACT GGC CTC TTC ACC GTG GAG ATG ATC CTG AAG CTC ATT 3888 Met Leu Phe Thr Gly Leu Phe Thr Val Giu Met Ile Leu Lys Leu Ile 1285 1290 1295 GCC TTC AAA CCC AAG GGT TAC TTT AGT GAT CCC TGG AAT GTT TTT GAC 3936 Ala Phe Lys Pro Lys Gly Tyr Phe Ser Asp Pro Trp Asn Val Phe Asp 1300 1305 1310 TTC CTC ATC GTA ATT GGC AGC ATA ATT GAC GTC ATT CTC AGT GAG ACT 3984 Phe Leu Ile Val Ile Giy Ser lie Ile Asp Val Ile Leu Ser Giu Thr 1315 1320 1325 AAT CCA GCT GAA CAT ACC CAA TGC TCT CCC TCT ATG AAC GCA GAG GAA 4032 Asn Pro Aia Giu His Thr Gin Cys Ser Pro Ser Met Asn Aia Giu Glu 1330 1335 1340 AAC TCC CGC ATC TCC ATC ACC TTC TTC CGC CTG TTC CGG GTC ATG CGT 4080 Asn Ser Arg Ile Ser Ile Thr Phe Phe Arg Leu Phe Arg Vai Met Arg 1345 1350 1355 1360 CTG GTG AAG CTG CTG AGC CGT GGG GAG GGC ATC CGG ACG CTG CTG TGG 4128 Leu Val Lys Leu Leu Ser Arg Gly Giu Gly Ile Arg Thr Leu Leu Trp 1365 1370 1375 99rr ACC TTC ATC AAG TCC TTC CAG GCC CTG CCC TAT GTG GCC CTC CTG ATC 4176 Thr Phe Ile Lys Ser Phe Gin Ala Leu Pro Tyr Val Ala Leu Leu Ile 1380 1385 1390 GTG ATG CTG TTC TTC ATC TAC GCG GTG ATC GGG ATG CAG GTG TTT GGG 4224 Vai Met Leu Phe Phe Ile Tyr Ala Val Ile Giy Met Gin Val Phe Gly 1395 1400 1405 9* 9 9* AAA ATT GCC CTG AAT GAT ACC ACA GAG ATC AAC CGG AAC AAC AAC TTT 4272 Lys Ile Ala Leu Asn Asp Thr Thr Giu Ile Asn Arg Asn Asn Asn Phe 1410 1415 1420 CAG ACC TTC CCC CAG GCC GTG CTG CTC CTC TTC AGG TGT GCC ACC GGG 4320 Gin Thr Phe Pro Gin Ala Val Leu Leu Leu Phe Arg Cys Aia 
Thr Giv 1425 1430 1435 1440 goo* GAG GCC TGG CAG GAC ATC ATG CTG GCC TGC ATG CCA GGC AAG AAG TGT 4368 Glu Ala Trp Gin Asp Ile Met Leu Ala Cys Met Pro Gly Lys Lys Cys **0901445 1450 1455 GCC CCA GAG TCC GAG CCC AGC AAC AGC ACG GAG GGT GAA ACA CCC TGT 4416 Ala Pro Giu Ser Giu Pro Ser Asn Ser Thr Glu Gly Giu Thr Pro Cys 1460 1465 1470 GGT AGC AGC TTT GCT GTC TTC TAC TTC ATC AGC TTC TAC ATG CTC TGT 4464 Gly Ser Ser Phe Ala Val Phe Tyr Phe Ile Ser Phe Tyr Met Leu Cys 1475 1480 1485 GCC TTC CTG ATC ATC AAC CTC TTT GTA GCT GTC ATC ATG GAC AAC TTT 4512 -134- Ala Phe Leu Ile Ile Asn Leu Phe Val Ala Val Ile Met Asp Asn Phe 1490 1495 1500 GAO TAO CTG ACA AGG GAO TGG TCC ATC CTT GGT COO CAC CAC CTG GAT 4560 Asp Tyr Leu Thr Arg Asp Trp, Ser Ile Leu Giy Pro His His Leu Asp 1505 1510 1515 1520 GAG TTT AAA AGA ATC TGG GCA GAG TAT GAC COT GAA GCC AAG GGT CGT 4608 Giu Phe Lys A-rg Ile Trp Ala Glu Tyr Asp Pro Giu Ala Lys Gly Arg 1525 1530 1535 ATO AAA CAC CTG GAT GTG GTG ACC OTO CTC CGG CGG ATT CAG CCG OCA 4656 Ile Lys His Leu Asp Vai Val Thr Leu Leu Arg Arg Ile Gin Pro Pro 1540 1545 1550 OTA GGT TTT GGG AAG CTG TGC COT CAC CGC GTG GCT TGC AAA CGC CTG 4704 Leu Giy Phe Giy Lys Leu Cys Pro His Arg Val Ala Cys Lys Arg Leu 1555 1560 1565 GTC TCC ATG AAC ATG CCT CTG AAC AGC GAC GGG ACA GTC ATG TTC AAT 4752 Vai Ser Met Asn Met Pro Leu Asn Ser Asp Gly Thr Vai Met Phe Asn 1570 1575 1580 GCC ACC CTG TTT GCC CTG GTC AGG ACG GCC CTG AGG ATC AAA ACA GAA 4800 Al a Thr Leu Phe Ala Leu Vai Arg Thr Ala Leu Arg Ile Lys Thr Giu .1585 159015560 **.GGG AAC CTA GAA CAA GCC AAT GAG GAG CTG CGG GCG ATC ATC AAG AAG 4848 :**Giy Asn Leu Giu Gin Aia Asn Giu Giu Leu Arg Aia Ile Ile Lys Lys *1605 1610 1615 ATO TGG AAG CGG ACC AGO ATG AAG OTG CTG GAO CAG GTG GTG CCC OCT 4896 Ile Trp Lys Arg Thr Ser Met Lys Leu Leu Asp Gin Val Val Pro Pro 1620 1625 1630 GGT GAT GAT GAG GTO ACC GTT GGC AAG TTO TAO GOC AOG TTC CTG 4944 Giy As p Asp Giu Vai Thr Val Gly Lys Phe Tyr Ala Thr Phe Leu 1635 1640 1645 ATO CAG GAG TAO TTC 
OGG AAG TTC AAG AAG OGO AAA GAG CAG GGC OTT 4992 I *le Gin Giu Tyr Phe Arg Lys Phe Lys Lys Arg Lys Glu Gin Gly Leu 1650 1655 1660 GTG GGO AAG CCC TOO CAG AGG AAC GOG CTG TOT OTG CAG GOT GGC TTG 5040 Val Gly Lys Pro Ser Gin Arg Asn Ala Leu Ser Leu Gin Ala Gly Leu 1665 1670 1675 1680 CGC ACA CTG CAT GAO ATO GGG COT GAG ATC OGA OGG GOC ATO TCT GGA 5088 Arg Thr Leu His Asp Ile Gly Pro Giu Ile Arg Arg Ala Ile Ser Gly 1685 1690 1695 GAT OTO ACC GOT GAG GAG GAG CTG GAC AAG GCC ATG AAG GAG GOT GTG 5136 Asp Leu Thr Ala Giu Giu Giu Leu Asp Lys Ala Met Lys Giu Ala Val 1700 1705 1710 TOO GOT GOT TOT GAA GAT GAO ATO TTC AGG AGG GCC GGT GGO CTG TTC 5184 -135- Ser Ala Ala Ser Giu Asp Asp Ile Phe Arg Arg A)la Gly Gly Leu Phe 1715 1720 1725 GCC AAC CAC Gly Asn His 1730 GTC AGC TAC TAC CAA Val Ser Tyr Tyr Gin 1735 AGC GAC GGC CGC AGC GCC TTC CCC Ser Asp Cly Arg Ser Ala Phe Pro 1740 CAG ACC Gin Thr 1745 TTC ACC ACT Phe Thr Thr CAG CGC Gin Arg 1750 CCC CTG CAC Pro Leu His ATC AAC AAG Ile Asn Lys 1755 ACC CAC GGC GAC Ser Gin Cly Asp ACT GAG TCG Thr Giu Ser 1765 CCA TCC CAC GAG AAC CTG Pro Ser His Giu Lys Leu 1770 GCG GGC AC Ala Giy Ser 1760 GTG GAC TCC Vai Asp Ser 1775 GCC AAC ATC Ala Asn Ile ACC TTC ACC Thr Phe Thr CCG AGC AGC Pro Ser Ser 1780 TAC TCC TCC Tyr Ser Ser 1785 ACC GCC TCC AAC Thr Gly Ser Asn AAC AAC GCC AAC Asn Asn Ala Asn 1795 TAC CCC AGC ACA Tyr Pro Ser Thr 1810 AAC ACC GCC Asn Thr Ala CTC GGT CC Leu Gly Arg 1800 CTC CCT CCC CCC GCC GCC Leu Pro Arg Pro Ala Gly 1805 CAC CCC CCC CCC TTC TCC His Gly Pro Pro Leu Ser 1820 GTC AGC ACT CTG GAG GC Val Ser Thr Val Ciu Gly 1815 CCT CC Pro Ala 1825 ATC CCC CTC Ile Arg Val CAC GAG GTC CC Gin Ciu Val Ala 1830 TGC AAG CTC AGC Trp Lys Leu Ser TCC AAC ACC Ser Asn Arg 5232 5280 5328 5376 5424 5472 5520 5568 5616 5664 5712 5760 5808 5856 TCC CAC TCC CYs His Ser TCT CAG CAT Ser Gin Asp TCC ACT GAG Cys Ser Ciu 1875 CCC GAG ACC CAC CCA CCC ATG CC CCT CAC GAG GAG ACC Arg Clu Ser Gin Ala Ala Met Ala Arg Gin Ciu Clu Thr 1845 1850 1855 GAG ACC 
Giu Thr 1860 TAT CAA CTG AAC ATG Tyr Giu Val Lys Met 1865 AAC CAT GAC ACC GAG CC Asn His Asp Thr Clu Ala 1870 ATC CTC TCC TAC CAC CAT Met Leu Ser Tyr Cln Asp 18s5 CCC ACC CTC CTC Pro Ser Leu Leu TCC ACA GAG Ser Thr Clii 1880 GAC GAA AAT Asp Ciu Asn 1890 CCC CAA CTC Arg Gin Leu ACG CTC CCA GAG GAG CAC AAC ACC CAC ATC Thr Leu Pro Giu Giu Asp Lys Arg Asp Ile 1895 1900 CCC CAA Arg Gin 1905 TCT CCC AAC Ser Pro Lys ACC CCT TTC CTC Arg Gly Phe Leu 1910 CCC TCT CCC TCA CTA Arg Ser Ala Ser Leu 1915 CdT CCA Cly Arg 1920 AGC CCC TCC TTC A-rg Ala Ser Phe CAC CTC CAA TGT CTC His Leu Giu Cys Leu 1925 AAG CCA CAC AAC Lys Arg Gin Lys 1930 CAC CCA CCC Asp Arg Gly CGA CAC ATC TCT CAC AAC ACA GTC CTC CCC TTC CAT CTC CTT CAT CAT -136- Gly Asp Ile Ser Gin Ls Thr Val Leu Pro Leu His Leu Vai His His 1940 1945 .950 CAG GCA TTG GCA GTG GCA GGC CTG AGC CCC CTC CTC CAG AGA AGC CAT 5904 Gin Ala Leu Ala Vai Ala Gly Leu Ser Pro Leu Leu Gin Arg Ser His 1955 1960 1965 TCC CCT GCC TCA TTC CCT AGG CCT TTT GCC ACC CCA CCA GCC ACA CCT 5952 Ser Pro Ala Ser Phe Pro Arg Pro Phe Ala Thr Pro Pro Ala Thr Pro 1970 1975 1980 GGC AGC CGA GGC TGG CCC CCA CAG CCC GTC CCC ACC CTG CGG CTT GAG 6000 Gly Ser Arg Gly Trp Pro Pro Gin Pro Val Pro Thr Leu Arg Leu Glu 1985 1990995 GGG GTC GAG TCC ACT GAG AAA CTC AAC AGC AGC TTC CCA TCC ATC CAC 6048 Gly, Val Giu Ser Ser Giu Lys Leu Asn Ser Ser Phe Pro Ser Ile His 2005 20 02015 .2 0 1 0 TGC GGC TCC TGG GCT GAG ACC ACC CCC GGT GGC GGG GGC AGC AGC GCC 6096 Cys Gly Ser Trp Ala Glu Thr Thr Pro Gly Gly Gly Gly Ser Ser Ala 2025 2030 A GTC CGG CCC GTC TCC CTC ATG GTG CCC AGC CAG GCT GGG 6144 A la Arg Arg Val Ag Pro Vai Ser Leu Met Val Pro Ser Gin Aa Gly 2035 2040 2045 GCC CCA GGG AGG CAG TTC CAC GGC AGT GCC AGC AGC CTG GTG GAA GCG 6192 Ala Pro Gly Arg Gin Phe His Gly Ser Ala Ser Ser Leu Val Glu Ala 2050 2050 2055 2060 GTC TTG ATT TCA GAA GGA CTG GGG CAG TTT GCT CAA GAT CCC AAG TTC 6240 Val Leu Ile Ser Giu Giy Leu Giy Gin Phe Ala Gin Asp Pro Lys Phe 2070 2075 2080 ATC GAG GTC ACC 
ACC CAG GAG CTG GCC GAC GCC TGC GAC ATG ACC ATA 6288 Ile Glu Val Thr Thr Gin Gu Leu Ala Asp Ala Cys Asp Met Thr Ile 2090 2095 2100 GAG GAG ATG GAG AGC GCG GCC GAC AAC ATC CTC AGC GGG GGC GCC CCA 6336 Glu Gu Met Glu Ser Ala Aa Asp Asn Ile Leu Ser G ly Giy Aia Pro 2!05C211 2115 CAG AGC CCC AAT GGC GCC CTC TTA CCC TTT GTG AAC TGC AGG GAC GCG 384 Gin Ser Pro Asn Giy Ala Leu Leu Pro Phe Val Asn Cys Arg Asp Ala 2120 2125 2130 GGG CAG GAC CGA GCC GGG GGC GAA GAG GAC GCG GGC TGT GTG CGC GCG 6432 Gly Gn Asp Arg Ala Gy Gly Glu Giu Asp Ala Gly Cys Val Arg Ala 22.35 2135 2140 CGG GGT CGA CCG AGT GAG GAG GAG CTC CAG GAC AGC AGG GTC TAC GTC 6480 Arg Gly Arg Pro Ser Glu Giu Gu Leu Gin Asp Ser Arg Val Tyr Val 2145 2150 2155 2160 AGC AGC CTG TAGTGGGCGC TGCCAGATGC GGGCTTTTTT TTATTTGTTT CAATGTTCCT 6539 -137- Ser Ser Leu AATGGGTTCG TTTCAGAAGT GCCTCACTGT
TCTCGT 6575
INFORMATION FOR SEQ ID NO:4:
SEQUENCE CHARACTERISTICS:
LENGTH: 133 base pairs
TYPE: nucleic acid
STRANDEDNESS: double
TOPOLOGY: linear
(ii) MOLECULE TYPE: DNA (genomic)
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:4:
AGACCACGGC TTCCTCGAAT CTTGCGCGAA GCCGCCGGCC ATCGGAGGAG GGATTAATCC 60
AGACCCGCCG GGGGGTGTTT TCACATTTCT TCCTCTTCGT GGCTGCTCCT CCTATTAAAA 120
CCATTTTTGG TCC 133
INFORMATION FOR SEQ ID NO:5:
SEQUENCE CHARACTERISTICS:
LENGTH: 89 base pairs
TYPE: nucleic acid
STRANDEDNESS: double
TOPOLOGY: linear
(ii) MOLECULE TYPE: DNA (genomic)
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:5:
CGCTGAGGGC CTTCCGCGTG CTGCGCCCCC TGCGGCTGGT GTCCGGAGTC CCAAGTCTCC 60
AGGTGGTCCT GAATTCCATC ATCAAGGCC 89
INFORMATION FOR SEQ ID NO:6:
SEQUENCE CHARACTERISTICS:
LENGTH: 84 base pairs
TYPE: nucleic acid
STRANDEDNESS: double
TOPOLOGY: linear
(ii) MOLECULE TYPE: DNA (genomic)
(ix) FEATURE:
NAME/KEY: CDS
LOCATION: 1..84
OTHER INFORMATION: /note= "An alternative exon of alpha-1C."
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:6:
CAC TAT TTC TGT GAT GCA TGG AAT ACA TTT GAC GCC TTG ATT GTT GTC
His Tyr Phe Cys Asp Ala Trp Asn Thr Phe Asp Ala Leu Ile Val Val
GGT ACC ATT GTT GAT ATA GCA ATC ACC GAG GTA AAC
Gly Ser Ile Val Asp Ile Ala Ile Thr Glu Val Asn
INFORMATION FOR SEQ ID NO:7:
SEQUENCE CHARACTERISTICS:
LENGTH: 7362 base pairs
TYPE: nucleic acid
STRANDEDNESS: double
TOPOLOGY: linear
(ii) MOLECULE TYPE: DNA (genomic)
(ix) FEATURE:
NAME/KEY: CDS
LOCATION: 144..7163
(ix) FEATURE:
NAME/KEY: 5'UTR
LOCATION: 1..143
(ix) FEATURE:
NAME/KEY: 3'UTR
LOCATION: 7161..7362
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:7:
GCGGCGCGC CTGCGGCGOT GGGCCCCGC GAGGTCCGTG CGGTCCCGGC
GGCTCCGTGG
CTCCTCCGCT CTGAGCCCT CCGCCCCCCG CGCCCTCCCT GCCGGCCG
CTGGGCCGGC
GATGCACGCG GGCCCCGGCA CCC ATGOGTC CGC TTC GGG CAC GAG CTG GGC Met Val A-rg Phe Gly Asp Glu Leu Gly 1 5 GGC CGC TAT GGA GCC CCC CCC GGC GOA GAG COG CCC CGG GGC GGC GOC Gly Arg Tyr Gly Gly Po~ Cly C'ly Gly Glu Arg Ala Arg Cly Gly Gly 15 20 25 120 170 218 266 314 362 GuC CCC GCG COG CGC CCC GOTCCCC Ala ly ly la ly Cly Pro Cly Pro Oly 35 CCC CTC CTC TAC AAC CAA TCG ATC CC CAC Arg Val Leu TIyr Lys Gin Ser Ile Ala Gin 50 COG CTC CAC CCC Cly Leu Gin pro CCC CAC CCC CC CCC ACC ATG C CTG TAC AAC CCC ATC CCC OTC AAC CAC AAC TOC TTC Leu Tyr Asn Pro Ile Pro Val Lys Gin Asn Cys Phe 65 ACC CTC AAC
CC
Thr Val Asra Arg 0 -139- TCG CTC Ser Leu TTC GTO TTC AGO GAG GAC AAC GTC GTC Phe Val Phe Ser Giu Asp Asn Val Val 80 ATC ACC GAG TGG Ile Thr Giu Trp CGO AAA TAO GCG AAG Arg Lys Tyr Ala Lys ATO CTG GCC ACC ATC Ile Leu Ala Thr Ile
COT
Pro CCA TTC GAG AAT Pro Phe Glu Asn
ATG
Met 100 ATC GCC AAC TGC ATC GTG CTG GCC CTG Ile Ala Asn Cys Ile Val Leu Ala Leu CAG CAC CTC CCT Gln His Leu Pro GAT GGG Asp Gly GAC AAA ACG Asp Lys Thr ATC GGG ATC Ile Gly Ile 140
COO
Pro 125 ATG TCC GAG OGG Met Ser Giu Arg
CTG
Leu 130 GAC GAC AOG GAG COO TA: TTC Asp Asp Thr Giu Pro Tyr- Phe TTT TGC TTC GAG Phe Cys Phe Giu GGG ATC AAA ATC Gly Ile Lys Ile
ATC
Ile 150 GCT OTG GGC Ala Leu Gly 506 554 602 65S0 698 746 TTT GTC Phe Val 155 TTC CAC AAG GGO Phe His Lys Gly
TOT
Ser 160 TAO CTG OGG AAO Tyr Leu Arg Asn TGG AAC GTO ATG Trp Asn Val Met a a
GAO
Asp 170 TTC GTG GTC GTC Phe Val Val Val ACA GGG ATC CTT Thr Giy Ile Leu ACG GCT GGA ACT Thr Ala Gly Thr TTC GAC CTG CGA Phe Asp Leu Arg
ACA
Thr 190 CTG AGG GCT GTG Leu Arg Ala Val
CGT
Arg 195 GTG CTG AGG COO Val Leu Arg Pro CTG AAG Leu Lys CTG GTG TCT Leu Val Ser AAG GOC ATG Lys Ala Met 220 ATT CCA AGT TTG Ile Pro Ser Leu GTG GTG CTC AAG Val Val Leu Lys TCC ATC ATG Ser Ile Met 215 TTC TTT GCC Phe Phe Ala GTT CCA CTC CTG Val Pro Leu Leu
CAG
Gln 225 ATT GGG CTG CTT Ile Gly Leu Leu ATC CTC Ile Leu 235 ATG TTT GCC ATC Met Phe Ala Ile GGC CTG GAG TTC Gly Leu Glu Phe ATG GGC AAG TTC Met Gly Lys Phe
CAC
His 250 AAG GCC TGT TTC Lys Ala Cys Phe
CCC
Pro 255 AAC AGC ACA GAT Asn Ser Thr Asp
GCG
Ala 260 GAG COO GTG GGT Giu Pro Val Gly
GAO
Asp 890 938 986 1034 TTC COO TGT GGC Phe Pro Cys Giy GAG TGC CGG GAG Giu Cys Arg Giu 28S
AAG
Lys 270 GAG GCC OCA GC Giu Ala Pro Ala CTG TGC GAG GGC Leu Cys Giu Gly GAC ACT Asp Thr TAO TGG OCA GGA 000 Tyr Trp Pro Gly Pro 290 AAO TTT GGO ATO Asn Phe Gly Ile ACC AAO TTT Thr Asn Phe 295 e -140- GAC AAT ATC CTG TTT GCC ATC TTG ACG GTG TTC Asp Asn Ile Leu Phe Ala Ile Leu Thr Val Phe 300 305 CAG TGC ATC ACC ATG Gin Cys Ile Thr Met 310 1082 GAG GGC Glu Gly 315 TGG ACT GAC ATC Trp Thr Asp Ile TAT AAT ACA AAC Tyr Asn Thr Asn GAT GCG Asp Ala 325 GCC GGC AAC Ala Gly Asn
ACC
Thr 330 TGG AAC TGG CTC Trp Asn Trp Leu
TAC
Tyr 335 TTC ATC CCT CTC Phe Ile Pro Leu ATC ATC GGC TCC TTC Ile Ile Gly Ser Phe TTC ATG CTC AAC Phe Met Leu Asn GTG CTG GGC GTG CTC TCG GGG GAG TTT Val Leu Gly Val Leu Ser Gly Glu Phe GCC AAG Ala Lys 360 GAG CGA GAG Glu Arg Glu CAG CAG CAG Gln Gln Gln 380
AGG
Arg 365 GTG GAG AAC CGC Val Glu Asn Arg GCC TTC CTG AAG Ala Phe Leu Lys CTG CGC CGG Leu Arg Arg 375 TGG ATC TTC Trp Ile Phe ATC GAG CGA GAG Ile Glu Arg Glu
CTC
Leu 385 AAC GGG TAC CTG Asn Gly Tyr Leu
GAG
Glu 390 AAG GCG Lys Ala 395 GAG GAA GTC ATG Glu Glu Val Met GCC GAG GAG GAC Ala Glu Glu Asp AAT GCA GAG GAG Asn Ala Glu Glu
AAG
Lys 410 TCC CCT TTG GAC GTG CTG AAG AGA GCG Ser Pro Leu Asp Val Leu Lys Arg Ala ACC AAG AAG AGC Thr Lys Lys Ser
AGA
Arg 425 1130 1178 1226 1274 1322 1370 1418 1466 1514 1562 1610 1658 1706 AAT GAC CTG ATC Asn Asp Leu Ile GCA GAG GAG GGA Ala Glu Glu Gly GAC CGG TTT GCA Asp Arg Phe Ala GAT CTC Asp Leu 440 TGT GCT GTT Cys Ala Val ACA GAG AGC Thr Glu Ser 460 TCC CCC TTC GCC Ser Pro Phe Ala
CGC
Arg 450 GCC AGC CTC AAG Ala Ser Leu Lys AGC GGG AAG Ser Gly Lys 455 TTC CGG TTT Phe Arg Phe TCG TCA TAC TTC Ser Ser Tyr Phe AGG AAG GAG AAG Arg Lys Glu Lys
ATG
Met 470 TTT ATC CGG CGC ATG GTG Phe Ile Arg Arg Met Val 47S AAG GCT CAG AGC TTC Lys Ala Gin Ser Phe 480 TGG GTG GTG -TG Trp, Val Val eu GTG GTG GCC Val Val Ala CTG AAC Leu Asn 495 ACA CTG TGT GTG Thr Leu Cys Val ATG GTG CAT TAC Met Val His Tyr CAG CCG CGG CGG Gin Pro Arg Arg CTT ACC ACG ACC CTG TAT Leu Thr Thr Thr Leu Tyr 510 515 TTT GCA GAG TTT Phe Ala Glu Phe GTT TTC Val Phe 520 -141- CTC GGT CTC Leu Gly Leu CCC AGA AGC Pro Arg Ser 540 CTC ACA GAG ATG Leu Thr Ciu Met CTC AAG ATC TAT Leu Lys Met Tyr GCC CTG GCC Cly Leu Gly 535 TTT GGC CTC Phe Gly Val TAC TTC CGC TCC Tyr Phe Arg Ser
TCC
Ser 545 TTC AAC TGc TTC Phe Asn Cys Phe
CAC
Asp ATC GTG Ile Val 555 GGC AGC CTC TTT Gly Ser Val Phe GTG GTC TGC CCG GCC ATC AAC CCG GGA Val Val Trp Ala Ala Ile Lys Pro Gly 0~ S. 5
AGC
Ser 570 TCC TTT GGG ATC Ser Phe Cly Ile
ACT
Ser 575 GTC CTG CGG GCC Val Leu Arg Ala
CTC
Leu 580 CGC CTC CTG ACC Arg Leu Leu Arg TTC AAA GTC ACG Phe Lys Val Thr TAC TGC ACC TCC Tyr Trp Ser Ser CCC AAC CTC CTC Arg Asn Leu Val CTC TCC Val Ser CTC CTC AAC Leu Leu Asn CTC TTC ATT Leu Phe Ile 620
TCC
Ser 605 ATC AAC TCC ATC Met Lys Ser Ile
ATC
Ile 610 ACC CTC CTC TTC Ser Leu Leu Phe TTC CTC TTC Leu Leu Phe 615 TTT CCC CCA Phe Gly Cly 1754 1802 1850 1898 1946 1994 2042 2090 2138 2186 2214 2282 CTC CTC TTC CC Val Val Phe Ala CTC CCC ATC CAC Leu Cly Me-i Gin
CTC
Leu 630 CAC TTC Gin Phe 635 AAC TTC CAG CAT Asn Phe Gin Asp
GAG
Giu 640 ACT CCC ACA ACC Thr Pro Thr Thr TTC CAC ACC TTC Phe Asp Thr Phe CCC CCC ATC CTC Ala Ala Ile Leu CTC TTC CAG ATC Val Phe Gin Ile ACC CCA GAG CAC Thr Cly Glu Asp AAT CCA GTC ATC Asn Ala Val Met CAC CCC ATC CAA His Cly Ile Ciu
TCG
Ser 675 CAA CCC CCC GTC Gin Gly Cly Val ACC AAA Ser Lys 680 CCC ATC TTC Cly Met Phe TAC ACT CTC Tyr Thr Leu 700
TCC
Ser 685 TCC TTT TAC TTC Ser Phe Tyr Phe
ATT
lie 690 CTC CTC ACA CTC Val Leu Thr Leu TTC GCA AAC Phe Gly Asn 695 AAC CTC GCC Asn Leu Ala CTC AAT CTC TTT Leu Asn Val Phe CCC ATC GCT CTC Ala Ile Ala Val
CAC
Asp 710 AAC CC Asn Ala 715 CAA GAG CTC ACC Gin Clu Leu Thr CAT CAA GAG GAG Asp Ciu Giu Clu CAA CAA CCA CC Glu Clu Ala Ala 2320 2378
AAT
Asn 730 CAC AAC CTT CCT Gin Lys Leu Ala CAA AAG CCC AAA Gin Lys Ala Lys CTC GCT CAA CTC Val Ala Clu Val
AC
Ser 745 -142- CCC ATG TCT GCC Pro Met Ser Ala AAC ATC TCC ATC GCC GCC AGG CAG CAG AAC
TCG
Asn Ile Ser Ile Al1a Ala Arg Gin Gin Asn Ser 755 760 GCC AAG GCG Ala Lys Ala CAG AAC CTG Gin Asn Leu 780 TCG GTG TGG
GAG
Ser Val TrP Giu CGG GCC AGC CAG Arg Ala Ser Gin CGG GCC AGC
TGC
Arg Ala Ser Cys CTA CGG CTG Leu Arg Leu 775 ATG GAC cCC
GAG
Giu GCG CTG TAC AGC Ala Leu Tyr Ser
GAG
GAG
GAG
Glu Giu 795 CGG CTG CGC
TTC
Arg Leu Arg Phe
GCC
Al a ACT ACG CGC Thr Thr Arg CAC CTG His Leu CGG CCC GAC ATG
AAG
Lys 810 ACG CAC CTG GAC Thr His Leu Asp
CGG
Arg CCG CTG GTG GTG Pro ILeu Vai Val GAG CTG GGC CGC GAC Giu Leu G1y Ar9 Asp
GGC
S
GCG CGG GGG
CCC
Ala Arg Gly pro GGA GGC AAA Gly Giy Lys GCC CGA Al a Arg CCT GAG GCT
GCG
Pro Glu Ala Ala GAG GCC CCC GAG GGC Pro Glu Gly GAC AAG ACC ASP Lys Thr 860 GAC CT CG CC AG C835 A ACP CCT CCG CC AGG CAC CACs AspPr Pr Ag rg85 sH0 CGG CAC CGC GAC AAG Arg His Arg Asp Lys 2426 2474 2522 25"7 2618 2666 2714 2762 2810 2858 2906 2954 3002 3050 CCC GCG GCG GGG GAC Pro Ala Ala Gly Asp CAG GAC CGA GCA Gin Asp Arg Aia GAG GCC CCG
AAG
GCG GAG Ala Giu 875 AGC GGG
GAG
Ser Giy Glu CCC GGT GCC Pro Gly Ala CGG GAG
GAG
CGG CCG CGG CCG CAC CGC AGC CAC AGC Arg Ser His Ser 890 CGC GGC CGA GGC Arc, Gly Arg Gly AAG GAG GCC GCG GGG CCC CC( Lys Giu Ala Ala Gly pro prc 895 90 CCA GGC CCC GAG GGC GGC
CGC
Pro Giy Pro n. Ci Gly Arc 9 1 0 915 GGC TCC
CCG
Gly Ser Pro CAC CGG CAC His A-g His
GAG
Giu 925 GAG GCG GCC Glu Ala Ala GAG CGG GAG CCC Glu Arg Gu Pro GGAG GCG CGG AGC GAG 0Giu Ala Arg9 Ser Glu 0 905 CGG CAC CAC CGG Crr Arg His His Arg Arg 920 CGA CGC CAC CGC GCG Arg Arg His Arg Ala 935 GGC GCC AAG GGC
GAG
Gly Ala Lys Gly Giu 950 GGG CCC CGG GAG
GCG
Gly Pro Arg Giu Ala 965 CAG GAT CCG AGC Gin Asp Pro Ser GAG TGC
CC
Glu Cys Ala CGG CGC GCG CGG CAC
CGC
Arg Arg Ala Arg His Az-p GGC CCC CGA GCG Gly Pro Az-p Ala -143-
GAG
Glu 970 AGC GGG GAG GAG Ser Gly Glu Glu CCG GCG CGG CGG CAC Pro Ala Arg Arg His 975 CGG GCC CGG CAC AAG GCG Arg Ala Arg His Lys Ala 980 985 CAG CCT GCT CAC Gin Pro Ala His
GAG
Glu 990 GCT GTG GAG Ala Val Glu GAG ATA GTG Glu Ile Val ACG GAG AAG Thr Glu Lys AAG GAG Lys Glu 995 GAA GCC Glu Ala 1010 ACC ACG GAG AAG Thr Thr Glu Lys GAG GCC Glu Ala 1000 GAG GCT Glu Ala 1005 GAC AAG GAA Asp Lys Glu AAG GAG CTC Lys Glu Leu 1015 CGG AAC CAC CAG Arg Asn His Gin 1020 ACT GTG ACT GTG Thr Val Thr Val 1035 CCC CGG GAG Pro Arg Glu CCA CAC Pro His 1025 TGT GAC CTG Cys Asp Leu GAG ACC AGT GGG Glu Thr Ser Gly 1030 GGT CCC ATG CAC ACA CTG CCC Gly Pro Met His Thr Leu Pro 1040
AGC
Ser 104 AAG GTG Lys Val 1050 GAG GAA CAG CCA GAG GAT GCA Glu Glu Gin Pro Glu Asp Ala 1055 GAC AAT CAG Asp Asn Gin 1060 CGC ATG GGC Arg Met Gly AGT CAG CCC CCA Ser Gin Pro Pro 1070 GAC CCG AAC ACT ATT Asp Pro Asn Thr Ile 1075 GGG GAA GCC ACG GTC Gly Glu Ala Thr Val 1090 ACC TGT CTC CAG Thr Cys Leu Gin CGG AAC GTC ACT Arg Asn Val Thr 1065 GTA CAT ATC CCA Val His Ile Pro 1080 GTT CCC AGT GGT Val Pro Ser Gly 1095 GAG GTG GAA GCG Glu Val Glu Ala 1110 3098 3146 3194 3242 3290 3338 3386 3434 3482 3530 GTG ATG CTG ACG GGC CCT CTT Val Met Leu Thr Gly Pro Leu 1085 AAC GTG GAC CTG Asn Val Asp Leu 1100 GAT GAC GTG ATG Asp Asp Val Met 1115
GAA
Glu AGC CAA GCA GAG Ser Gin Ala Glu 1105 GGG AAG AAG Gly Lys Lys AGG AGC GGC CCC CGG CCT ATC GTC CCA TAC AGC TCC Arg Ser Gly Pro Arg Pro Ile Val Pro Tyr Ser Ser 1120 1125 ATG TTC TGT Met Phe Cys 1130 TTA AGC CCC ACC AAC CTG Leu Ser Pro Thr Asn Leu 1135 CTC CGC CGC TTC Leu Arg Arg Phe 1140 TGC CAC TAC Cys His Tyr 1145 Ile Val Thr Met Arg Tyr Phe Glu Val Val Ile Leu Val Val Ile Ala 1150 1155 1160 TTG AGC AGC ATC GCC CTG GCT GCT GAG GAC CCA GTG CGC ACA GAC TCG Leu Ser Ser Ile Ala Leu Ala Ala Glu Asp Pro Val Arg Thr Asp Ser 1165 1170 1175 3578 3626 3674 3722 CCC AGG AAC AAC GCT CTG AAA TAC CTG GAT Pro Arg Asn Asn Ala Leu Lys Tyr Leu Asp 1180 1185 TAC ATT TTC ACT GGT GTC Tyr Ile Phe Thr Gly Val 1190 -144- TTT ACC TTT GAG ATG GTG ATA AAG ATG ATC GAC TTG GGA CTG CTG CTT 3770 Phe Thr Phe Glu Met Val Ile Lys Met Ile Asp Leu Gly Leu Leu Leu 1195 1200 1205 CAC CCT GGA GCC TAT TTC CGG GAC TTG TGG AAC ATT CTG GAC TTC ATT 3818 His Pro Gly Ala Tyr Phe Arg Asp Leu Trp Asn Ile Leu Asp Phe Ile 1210 1215 1220 1225 GTG GTC AGT GGC GCC CTG GTG GCG TTT GCT TTC TCA GGA TCC AAA GGG 3866 Val Val Ser Gly Ala Leu Val Ala Phe Ala Phe Ser Gly Ser Lys Gly 1230 1235 1240 AAA GAC ATC AAT ACC ATC AAG TCT CTG AGA GTC CTT CGT GTC CTG CGG 3.914 Lys Asp Ile Asn Thr Ile Lys Ser Leu Arg Val Leu Arg Vai Leu Arg 1245 1250 1255 CCC CTC AAG ACC ATC AAA CGG CTG CCC AAG CTC AAG GCT GTG TTT GAC 3962 Pro Leu Lys Thr Ile Lys Arg Leu Pro Lys Leu Lys Ala Val Phe Asp 1260 1265 1270 TGT GTG GTG AAC TCC CTG AAG AAT GTC CTC AAC ATC TTG ATT GTC TAC 4010 Cys Val Val Asn Ser Leu Lys Asn Val Leu As Ile Leu Ile Val Tyr 1275 1280 1285 ATG CTC TTC ATG TTC ATA TTT GCC GTC ATT GCG GTG CAG CTC TTC AAA 4058 *Met Leu Phe Met Phe Ile Phe Ala Val Ile Ala Val Gin Leu Phe Lys *1290 1295 1300 1305 GGG AAG TTT TTC TAC TGC ACA GAT GAA TCC AAG GAG CTG GAG AGG GAC 4106 *Gly Lys Phe Phe Tyr Cys Thr Asp Giu Ser Lys Giu Leu Glu Arg Asp 1310 1315 1320 TG AG G CAG TAT TTG GAT TAT GAG AAG GAG GAA GTG GAA GCT CAG 4154 *Cys Arg Gly Gin Tyr Leu 
Asp Tyr Giu Lys Giu Giu Val Giu Ala Gin 1325 1330 1335 CCC AGG CAG TGG AAG AAA TAC GAC TTT CAC TAC GAC AAT GTG CTC TGG 4202 Pro Arg Gin Trp, Lys Lys Tyr Asp Phe His Tyr Asp Asn Vai Leu Trp, 1340 1345 GCT CTG CTG ACG CTG TTC ACA GTG TCC ACG GGA GAA GGC TGG CCC ATG 4250 Ala Leu Leu Thr Leu Phe Thr Val Ser Thr Gly Glu Gly Trp Pro Met 1355 1360 1365 GTGCTGAAACAC TCC GTG GAT GCC ACC TAT GAG GAG CAG GGT CCA AC 49 Val Leu Lys His Ser Val Asp Ala Thr Tyr dlii Giu Gin Gly Pro Ser 1370 1375 1380 1385 CCT GGG TAC CGC ATG GAG CTG TCC ATC TTC TAC GTG GTC TAC TTT GTG 4346 Pro Gly Tyr Arg Met Giu Leu Ser Ile Phe Tyr Val Val Tyr Phe Val 1390 1395 1400 GTC TTT CCC TTC TTC TTC GTC AAC ATC TTT GTG GCT TTG ATC ATC ATC 4394 Val Phe Pro Phe Phe Phe Val Asn Ile Phe Val Ala Leu Ile Ile Ile 1405 1410 1415 -145- ACC TTC CAG Thr Phe Gin 142C AAG AAC GAG Lys Asn Giu 1435 GAG CAG GGG GAC AAG GTG ATG TCT Giu Gin Giy Asp Lys Val Met Ser 1425 AGG GCT TGC ATT GAC TTC GCC ATC Arg Ala Cys Ile Asp Phe Ala Ile GAA TGC AGC CTG GAG Giu Cys Ser Leu Giu 1430 AGC GCC 1440 ACA CGG Thr Arg 1450 TAC ATG CCC CAA AAC CGG CAG TCG Tyr Met Pro Gin Asn Arg Gin Ser 1455 1445 TTC CAG TAT Phe Gin Tyr 1460 AAA CCC CTG Lys Pro Leu AAG ACG TGG Lys Thr T-p, 1465 GCC ATG ATA Ala Met Ile ACA TTT GTG GTC Thr Phe Val Val TCC CCG CCC Ser Pro Pro 1470 TTT GAA TAC TTC ATC ATG Phe Giu Tyr Phe Ile Met 1475 GCC CTC AAC ACT GTG GTG CTG ATG ATG AAG TTC TAT GAT GCA CCC TAT Ala Leu Asn Thr Vai Val Leu Met Met Lys Phe Tyr ASp, Ala Pro Tyr 1485 1490 -1495 a GAG TAC GAG CTG ATG CTG Giu Tyr Giu Leu Met Leu 1500 A.AA TGC Lys Cys 1505 CTG AAC ATC Leu Asn Ile ATC ATC GCC Ile Ile Ala GTG TTC ACA TCC ATG Val Phe Thr Ser Met 1510 TTT GGG GTG CTG AAC Phe Gly Val Leu Asn 4442 4490 4538 4586 4634 4682 4730 4778 4826 4874 4922 4970 TTC TCC ATG GAA Phe Ser Met Glu 1515 TGC GTG CTG AAG Cys Val Leu Lys 1520 TAT TTC Tyr Phe 15:10C AGA GAT GCC TGG AAT GTC TTT GAC TTT GTC ACT GTG TTG Arg Asp Ala Trp Asn Val Phe Asp Phe Val Thr Val Leu 1535 1540
GGA
Gly 1545 AGT ATT ACT GAT Ser Ile Thr Asp ATC AAC CTC AGC Ile Asn LeU Ser 156! CTG CTC CGC CAG Leu Leu Arg Gin 1580 5 ATT TTA Ile Leu 1550 TTC CTC Phe Leu GGC TAC Gly Tyr GTA ACA GAG ATT GCG Vai Thr Giu Ile Ala 1555 CGC CTC TTT CGA GCT Arg Leu Phe Arg Ala 1570 ACC ATC CGC ATC CTG Thr Ile Arg le Leu 1585 CCC TAC GTG TGT CTG Pro Tyr Val Cys Leu 1600 GAA ACG AAC AAT TTC Giu Thr Asn Asn Phe 1560 GCG CGG CTG ATC AAG Ala Arg Leu Ile Lys 1575 CTG TGG ACC TTT GTC Leu Trp, Thr Phe Val 1590 CAG TCC TTC Gin Ser Phe 1595 TTC TTC ATC Phe Phe Ile 1610 AAG GCC CTG Lys Ala Leu CTC ATT Leu Ile 1605 GCC ATG CTG Ala Met Leu AAT ATT GCC Asn Ile Ala 1625 TAC GCC ATC Tyr Ala Ile 1615 ATC GGC ATG CAG Ile Gly Met Gin GTG TTT GGG Vai Phe Gly 5018 5066 CTG GAT GAT GAC ACC AGC ATC AAC CGC CAC AAC AAC TTC CGC Leu Asp Asp Asp Thr Ser Ile Asn Arg His Asn Asn Phe Arg 1630 1635 ACG TTT Thr Phe 1640 -146- TTC CAA GCC CTG ATG CTG CTG TTC AGG AGC GCC ACG GGG GAG CCC TGG 5114 Leu Gin Ala Leu Met Leu Leu Phe Arg Ser Ala Thr Gly GJlu Ala Trp 1645 1650 1655 CAC GAG ATC ATG CTG TCC TGC CTG AGC AAC CAG GCC TGT GAT GAG CAG 5162 His Giu Ile Met Leu Ser Cys Leu Ser Asn Gin Ala Cys Asp Giu Gin 1660 1665 1670 GCC AAT GCC ACC GAG TGT GGA AGT GAC TTT GCC TAC TTC TAC TTC GTC 5210 Al1a Asn Ala Thr Ciu Cys Gly Ser Asp Phe Ala Tyr Phe Tyr Phe Val 1G75 1680 1685 TCC TTC ATC TTC CTG TGC TCC TTT CTG ATG TTG AAC CTC TTT CTG GCT 5258 Ser Phe Ile Phe Leu Cys Ser Phe Leu Met Leu Asn Leu Phe Val Ala 1690 1695 1700 1705 GTG ATC ATG GAC AAT TTT GAG TAC CTC ACG CGG GAC TCT TCC ATC CTA 5306 Val Ile Met Asp Asn Phe Giu Tyr Leu Thr Axg Asp Ser Ser Ile Leu 1710 1715 1720 GGT CCT CAC CAC TTG GAT GAG TTC ATC CGG GTC TGG GCT GAA TAC GAC 5354 Gly Pro His His Leu Asp Giu Phe Ile Arg Val Trp Ala Glu Tyr Asp 1751720 1735 CCC GCT CC TGT GGG CCC ATC AGT TAC AAT CAC ATG TTT GAG ATC CTC 5402 Pro Ala Ala Cys Gly Arg li e Ser Tyr Asn Anp Met Phe Glu met Leu 1740 1745 1750 AAA CAC ATG TCC CCG CCT CTG CCC CTG GGG AAG AAA TGC CCT GCT CCA 5450 Lys 
His Met Ser Pro Pro Leu Cly Leu Gly Lys Lys Cys Pro Ala Arg 1755 1760 1765 GTT GCT TAC AAG CCC CTC GTT CCC ATG AAC ATG CCC ATC TCC AAC GAG 5498 *..Val Aia Tyr Lys Arg Leu Val Arg Met Asn Met Pro Ile Ser Asn Giu *1770 1775 1780 1785 GAC ATG ACT GTT CAC TTC ACG TCC ACG CTC ATC CCC CTC ATC CCC ACC S546 Asp Met Thr Vai His Phe Thr Ser Thr Leu Met Ala Leu Ile Arg Thr 1790 1795 1800 CCA CTC GAG ATC AAG CTG CCC CCA CCT CCC ACA AAC CAG CAT CAG TOT 5594 Ala Leu Glu Ile Lys Leu Ala Pro Ala Gly Thr Lys Gin His Gin Cys 1805 1810 1815 GAC CC GAG TTC ACG AAG GAG ATT TCC CTT CTC TGG CCC AAT CTG CCC 5642 Asp Ala Giu Leu Arg Lys Ciu Ile Ser Val Val Trp, Ala Asn Leu Pro 1820 1825 1830 CAC AAG ACT TTG CAC TTC CTC GTA CCA CCC CAT AAG CCT CAT GAG ATC 5690 Gin Lys Thr Leu Asp Leu Leu Val Pro Pro His Lys Pro Asp Ciu Met 18:35 1840 1845 ACA GTG CCC AAG GTT TAT GCA GCT CTC ATC ATA TTT CAC TTC TAC AAG 5728 Thr Val Gly Lys Val Tyr Ala Ala Leu Met Ile Phe Asp Phe Tyr Lys 1850 1855 1860 1865 -147- CAG AAC AAA ACC ACC ACA CAC CAG ATC CAG CAG Gin Asn Lys Thr Thr Arg Asp Gin Met Gin Gin 1 O7r~ 1875 TCC CAG ATG GGT
CCT
Ser Gin Met Gly pro 1885 GAG CAG ACA CAC CCC Giu Gin Thr Gin Pro 1900 CACG AAG AGT TCC ACC Gin Lys Ser Ser Thr 1915 GAG AGT GGC ATC AAA Glu Ser Cly Ile Lys CCT CCT CCA GCC CTC Aia Pro Cly Cly Leu 1880 CTG AAG CCC ACC CTG Leu Lys Ala Thr Leu (GTG TCC CTG TTC CAC CCT Val Ser Leu Phe His pro 1890 GCT CTG CTC CCA GCA
CC
Ala Val Leu A-rg Cly Ala 1905 TCC CTC AGC AAT CCC
CCC
Ser Leu Ser Asn Gly Cly 1920 COG CTT TTC CTT CGA Arg Val Phe Leu Ara 1910 CCC ATA CAA AAC CAA Ala Ile Gin Asn Gin GAG TCT Giu Ser GTC TCC
TGG
Val Ser Trp 1 -,73 0 GAT GCA CCC
CAT
ASP Ala Pro His 193~ GAG GCC Giu Ala 1950 5 CGC ACT Gly Thr GAG ATC CCT Giu Ile Pro AGG CCA CCC CTG GAG
COT
Arg Pro Pro LeU Ciu Arg 1955 TCA GGA GCA CTG GCT
GTG
Ser Gly Ala Lieu Ala-Val CAA ACO ACC CAG Gin Arg Thr Gin 1945 GCC CAC TCC ACA Cly His Ser Thr 1960 CAC GTT CAG ATG Asp Vai Gin Met 5786 5834 5882 5930 5978 6026 6074 6122 6170 6218 GTG GGG
CGG
Val Cly Arg 1965 CAG AC Gin Ser ATA ACC Ile Thr 1980 COG AGO GCC CCT GAT GCC GAG CCC CAC CCT CCC CTG Arg Arg Gly Pro Asp Giy Giu Pro Gin Pro Giy Leu 1985 1990 GAG AGC CAG Ciu Ser Gin 1995 GCT CGA GCG GCC TCC ATG CCC CGC CTT GCC CCC GAG ACT Gly Arg Ala Ala Ser Met Pro Arg Leu Ala Ala Giu Thr 2000 2005 CAG CCC Gin Pro 2010 GTC ACA CAT Val Thr Asp GCC AGC Ala Ser CCC ATC AAG Pro Met Lys CCC TCC Arg Ser ATC TCC ACC
CTG
GCC CAC
CG
Ala Gin A-rg CCC COT GGG ACT CAT Pro Arg Cly Thr His 2030 CTT TGC ACC ACC C Ys Ser Thr ACC CCC GAC CCC Thir Pro Asp Arg CCA CCC CCT AGC CAC GCC Pro Pro Pro Ser Gin Ala 2045 CCC CGC AGO CAC AGO AAC Arg Arg Arg Asp Arg Lys 2060 TCT GCC CAT ATC GAT GCC Ser A-la A-sp Met Asp Cly 2075 TCG TCG CAC CAC CAC Ser Ser His His His 2050 CAC AGC TCC CTC GAG Gin Arg Ser Leu Ciu CAC CAC CCC TGC CAC His His Arg Cys His 2055 AAC CCC CCC AGC CTC Lys Cly Pro Ser Leu 6314 G410 GCA CCA AGC Ala Pro Ser 2080 ACT OCT CTC CCC CCC CCC CTC Ser Ala Val Gly Pro Cly Leu 2085 -148- CCC CCG GGA GAG GGG CCT ACA Pro Pro Gly Glu Gly Pro Thr 2090 2095 GGC TGO CGG OGG GAA CGA GAG CGO CGG Gly Cys Arg Arg Giu Arg Giu Arg Arg 2100 210s CAG GAG Gin Giu CGG GGO CGG TCC A-rg Gly Arg Ser CAG GAG CGG AGG CAG 000 Gin Giu Arg Arg Gin Pro 2115 TAO TCC TGC GAC OGO TTT Tyr Ser Cys Asp Arg Phe TOG GAG AAG CAG CGC
TTC
Ser Giu Lys Gin Arg Phe 2125 TCA TCC TOO TOO Ser Ser Ser Ser 2120 GGG GGC CGT GAG Gly Gly Arg Gu CCC CCG AAG CCC AAG CCC TCC CTC AGO AGC Pro Pro Lys Pro Lyvs Pro Ser Leu Ser Ser 2140 2145 GCT GGO CAG GAG Ala Giy Gin Giu 2155 CAC COA ACG TCG CCA ACA His Pro Thr Ser Pro Thr 2150 GGO AGT GGT TCC GTG AAT Gly Ser Gly Ser Val Asn COG GGA CCC CAC CCA CAG Pro Gly Pro His Pro Gin 2160 a. a.
GGG AGO Gly Ser 2170 COO TTG CTG Pro Leu Leu TCA ACA TCT GGT Ser Thr Ser Giy 2175 GOT AGO ACC CCC GGO Ala Ser Thr Pro Gly OGO GGT Arg Gly GGG CGG AGG CAG Gly Arg Arg Gin OTO CCC Leu Pro CAG ACG COO OTG ACT CCC OGO Gin Thr Pro Leu Thr Pro Arg 2195 TOO TOA CCC ATO CAC TTO GC Ser Ser Pro Ile His Phe Ala ACC TAO AAG ACG GOC AAO Thr Tyr Lys Thr Ala Asn 2205 CCC AGO ATO Pro Ser Ile 2200 GGG GOT CAG Giy Ala Gin 6458 6506 6554 6602 6650 6698 6746 6794 6842 6890 6938 6986 7082 2210 ACC AGO Thr Ser OTO OCT GOC TTO TOO Leu Pro Ala Phe Ser 2220 OCA GGC CGG OTO Pro Giy Arg Leu 2215T G CT
C
Ser Arg Gly Leu Ser GAA CAC AAO Giu His Asn 2235 GOC CTG CTG Ala Leu Leu
OAG
Gin COT GGC TOT OGA Pro Gly Ser Arg 2250 AGT GAG GOC
TOT
Ser Giu Ala Ser ATT GGO TOT ile Gly Ser 2255 GTC CAC GC Val His Ala 2270 AGA GAO CCC OTO AGO
CAG
Arg Asp Pro Leu Ser Gin 0 2245 GAO COT TAO CTr nnc(C'- Asp Pro Tyr Leu Gly Gl'n 2260 CTG COT GAG GAO ACG OTO Leu Pro Giu Asp Thr Leu COO CTG GC Pro Leu Ala
GTTGGAO
Arg Leu Asp 2265 ACT TTO GAG Thr Phe Giu GAG GOT GTG Giu Ala Val GOC ACC AAc Ala Thr Asn 2285 TOG GGC CO TOO TOO Ser Gly Arg Ser Ser 2290 TOT CAC COT OTO CC Ser His Pro Leu Arg 2305 TOO TOO CTG ACC TOO CAG Ser Ser Leu Thr Ser Gin 2300 AGG ACT TCC TAO GTG Arg Thr Ser Tyr Val 2295 CGC GTG CCC AAO GGT Arg Val Pro Asn Gly 2310 -149- TAC CAC TGC ACC CTG GGA CTC AGC TCG GOT GGC CGA GCA CGG CAC AGC 7130 Tyr His Cys Thr Leu Gly Leu Ser Ser Gly Gly Arg Ala Arg His Ser 2315 2320 2325 TAC CAC CAC CCT GAC CAA GAC CAC TGG TGC TAGCTGCACC GTGACCGCTC 7180 Tyr His His Pro Asp Gln Asp His Trp Cys 2330 2335 234 AGACGCCTGC ATGCAGCAGG CGTGTGTTCC AGTGGATGAG TTTTATCATC CACACGGGGC 7240 AGTCGGCCCT CGGGGGAGGC CTTGCCCACC TTGGTGAGGC TCCTGTGGCC CCTCCCTCCC 7300 CCTCCTCCCC TCTTTTACTC TAGACGACGA ATAAAGCCCT GTTGCTTGAG TGTACGTACC 736C GC 7362 INFORMATION FOR SEQ ID NO:8: SEQUENCE
CHARACTERISTICS:
LENGTH: 7175 base pairs
TYPE: nucleic acid
STRANDEDNESS: double
TOPOLOGY: linear
(ii) MOLECULE TYPE: DNA (genomic)
(ix) FEATURE:
NAME/KEY: CDS
LOCATION: 144..6857
(ix) FEATURE:
NAME/KEY: 5'UTR
LOCATION: 1..143
(ix) FEATURE:
NAME/KEY: 3'UTR
LOCATION: 6855..7175
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:8:
GCGGCGGCGG CTGCGGCGGT GGGGCCGGGC GAGGTCCGTG CGGTCCCGGC GGCTCCGTGG 60
CTGCTCCGCT CTGAGCGCCT GCGCGCCCCG CGCCCTCCCT GCCGGGGCCG CTGGGCCGGG 120
GATGCACGCG GGGCCCGGGA 0CC ATG GTC CGC TTC GGG GAC GAG CTG GGC 170
Met Val Arg Phe Gly Asp Glu Leu Gly 1 5
GGC CGC TAT GGA GGC CCC GGC GGC GGA GAG CGG GCC CGG GGC GGC GGG 218
Gly Arg Tyr Gly Gly Pro Gly Gly Gly Glu Arg Ala Arg Gly Gly Gly 15 20
GCC GGC GGG GCG GGG GGC CCC GGT CCC 000 GGG CTG CAG CCC GGC CAG 266
Ala Gly Gly Ala Gly Gly Pro Gly Pro Gly Gly Leu Gln Pro Gly Gln 35
CGG GTC CTC TAC Arg Val Leu Tyr CTG TAC AAC CCC Leu Tyr Asn Pro TCG CTC TTC GTC T Ser Leu Phe Val P CG ATC ACC GAG T ArgT Ile Thr Glu T.
ATC GCC AAC TGC A~ Ile A-la Asn Cys I 12 GAC AA ACG CCC Al AspD Lys Thr Pro Me 125 ATC GGG ATC TTT TG le Gly Ile Phe Cy 140 TTT GTC TTC CAC AA Phe Val Phe His Ly 1-55 GAC TTC GTG GTC GT Asp Phe Val Val Val 170 TTC GAC CTG CGA ACA Phe Asp Leu Arg Thr 190 CTG GTG TCT GGG ATT Leu Val Ser Gly Ile 205 AAG GCC ATG GTT CCA Lys Ala Met Val Pro 220 ATC CTC ATG TTT GCC Ile Leu Met Phe Ala 235 CAC A-AG GCC TGT TTC His Lys Ala Cys Phe 250 AAG CAA TCG ATC GCG CAG CGC GC CGG Lys Gin Ser Ile Ala Gin Arg Ala Arg 50 ~TC CCG GTC A-AG CAG AAC TGC TTC ACC lie Pro Val Lys Gin A-sn Cys Phe Thr 65 70 'TC AGC GAG GAC AAC GTC GTC CGC AAA he Ser Giu Asp Asn Vai Vai Arg Lys 80 85 GG CCT CCA TTC GAG AAT ATG ATC CTG r-p Pro Pro Phe Giu A-sn Met Ile Leu 95 100 TC GTG CTG GCC CTG GAG CAG CAC CTC Le Val Leu Ala Leu Giu Gin His Leu 0O 115 ~G TCC GAG CGG CTG GAC GAC ACG GAG C .t Ser Giu Arg Leu Asp Asp Thr Giu P 130 1 C TTC GAG GCA GGG ATC AAA ATC ATC G s Phe Giu Ala Gly Ile Lys Ile Ile A 145 150 G GGC TCT TAC CTG CGG AAC GGC TGG A.
s Gly Ser Tyr Leu Arg Asn Gly Trp A 160 165 7CTC A-CA COG ATC CTT GCC ACG GCT GC -Leu Thr Gly Ile Leu Ala Thr Ala GI 175 180 CTG AGG GCT GTG CGT GTG CTG A-GG CC Leu Arg Ala Val Arg Val Leu Arg Pr 195 CCA AGT TTG CAG GTG GTG CTC AAG TC Pro Ser Leu Gin Val Val Leu Lys Se 210 21 CTC CTG GAG ATT GGG CTG CTT CTC TT Leu Leu Gin Ile Gly Leu Leu Leu Ph 225 230 ATC ATT GGC CTG GAG TTC TAC ATO GG Ile Ile Giy Leu Giu Phe Tyr Met Gil 240 245 CCC A-AC A-CC ACA GAT GCG GAG CCC GTC Pro Asn Ser Thr Asp Ala Giu Pro Val 255 260 ACC ATG GC Thr Met Ala GTC AAC CGC Val Asn A-rg TAC GCG A-AG Tyr Ala Lys GCC ACC ATC Ala Thr Ile 105 CCT GAT GGG ?ro Asp Gly 120 ~CC TAT TTC 'ro Tyr Phe CT CTG GOC la Leu Cly AC GTC ATG sn Val Met 3A ACT GAC .y Thr Asp 185 C CTC AAG 0 Leu Lys 200 C ATC A-TO r Ile Met C TTT GCC e Phe Ala CAAG TTC ~Lys Phe GGT GAC Gly Asp 265 314 362 410 458 506 554 602 650 698 746 '7 cq4 842 890 938 -151- TTC CCC TGT GGC Phe Pro Cys Gly
AAG
Lys 270 GAG GCC CCA GCC CGG Glu Ala Pro Ala Arg 275 CTG TGC GAG GGC Leu Cys Glu Gly GAC ACT Asp Thr 280 GAG TGC CGG Glu Cys A-rg GAC AAT ATC Asp Asn Ile 300
GAG
Glu 285 TAC TGG CCA GGA Tyr Trp Pro Gly AAC TTT GGC ATC Asn Phe Gly Ile ACC AAC TTT Thr Asn Phe 295 ATC ACC ATG Ile Thr Met CTG TTT GCC ATC Leu Phe Ala Ile
TTG
Leu 305 ACG GTG TTC CAG Thr Val Phe Gin GAG GGC Glu Gly 315 TGG ACT GAC ATC Trp Thr Asp Ile
CTC
Leu 320 TAT AAT ACA AAC Tyr Asn Thr Asn
GAT
Asp 325 GCG GCC GGC AAC Ala Ala Gly Asn
ACC
Thr 330 TGG AAC TGG CTC Trp Asn Trp Leu
TAC
Tyr 335 TTC ATC CCT CTC ATC ATC ATC GGC TCC Phe Ile Pro Leu Ile Ile Ile Gly Ser TTC ATG CTC AAC Phe Met Leu Asn GTG CTG GGC GTG Val Leu Gly Val
CTC
Leu 355 TCG GGG GAG TTT Ser Gly Giu Phe GCC AAG Ala Lys 360 GAG CGA GAG Glu Arg Glu CAG CAG CAG Gin Gin Gin 380 GTG GAG AAC CGC Val Giu Asn Arg
CGC
Arg 370 GCC TTC CTG AAG Ala Phe Leu Lys CTG CGC CGG Leu Arg Arg 375 TGG ATC TTC Trp Ile Phe 986 1034 1082 1130 1178 1226 1274 1322 1370 1418 1466 1514 1562 ATC GAG CGA GAG Ile Giu Arg Glu AAC GGG TAC CTG Asn Gly Tyr Leu
GAG
Glu 390 AAG GCG Lys Ala 395 GAG GAA GTC ATG Glu Glu Val Met
CTG
Leu 400 GCC GAG GAG GAC Ala Glu Giu Asp AAT GCA GAG GAG Asn Ala Giu Glu
A-AG
Lys 410 TCC CCT TTG GAC Ser Pro Leu Asp
GTG
Val 415 CTG AAG AGA GCG Leu Lys Arg Ala
GCC
Ala 420 ACC AAG AAG AGC Thr Lys Lys Ser
AGA
Arg 425 AAT GAC CTG ATC CAC GCA GAG GAG GGA Asn Asp Leu Ile His A-a Giu Giu Gly GAC CGG TTT GCA Asp A-rg Phe A A GAT CTC Asp Leu 440 TGT GCT GTT GGA TCC CCC TTC GCC Cys Aa Val Gly Ser Pro Phe Ala 445
CGC
Arg 450 GCC AGC CTC AAG Ala Ser Leu Lys ACA GAG AGC Thr Giu Ser 460 TCG TCA TAC TTC Ser Ser Tyr Phe
CGG
Arg 465 AGG AAG GAG AAG Arg Lys Giu Lys
ATG
Met 470 AGC GGG AAG Ser Gly Lys 455 TTC CGG TTT Phe Arg Phe GTG GTG CTG Val Val Leu TTT ATC CGG CGC ATG GTG Phe Ile Arg Arg Met Val 475
AAG
Lys 480 GCT CAG AGC TTC Ala Gin Ser Phe TAC TGG Tyr Trp 485 1610 -152a a a a T1
C
4
C
C
CL
Cc
AT
AG
Se 5.7
TT
PhE
CTC
Lei-
CTG
Leu
CAG
Gin
CCT
Pro 650
AAT
Asn
GGC
Gly
TAC
Tyr GC GTG GTG GCC CTG AAC ACA CTG TGT GTG GCC ATG :ys Val Val Ala Leu Asn Thr Leu Cys Val Ala Met 90 495 500 AG CCG CGG CGG CTT ACC ACG ACC CTG TAT TTT GCA in Pro Arg Arg Leu Thr Thr Thi- Leu Tyr Phe Ala 510 515 TG GGT CTC TTC CTC ACA GAG ATG TCC CTG AAG ATG eu Gly Leu Phe Leu Thr Giu Met Ser Leu Lys Met 525 530 "C AGA AGC TAC TTC CGG TCC TCC TTC AAC TGC TTC 0 Arg Ser Tyr Phe Arg Ser Ser Phe Asn Cys Phe 540 545 'C GTG GGG AGC GTC TTT GAA GTG GTC TGG GCG GCC e Val Gly Ser Val Phe Giu Val Val Ti-p Ala Ala 555 560 565 C TCC TTT GGG ATC AGT GTG CTG CGG GCC CTC CGC C r Ser Phe Gly Ile Ser Val Leu Arg Ala Leu Arg L 0 575 580 CAAA GTC ACG AAG TAC TGG AGC TCC CTG CGG AAC C e Lys Vai Thr Lys Tyr Ti-p Ser Ser Leu Arg Asn L 590 595 CTG AAC TCC ATG AAG TCC ATC ATC AGC CTG CTC T' Leu Asn Ser Met Lys Ser Ile Ile Ser Leu Leu P1 605 610 TTC ATT GTG GTC TTC GCC CTG CTG GGG ATG CAG C' Phe Ile Vai Vai Phe Al1a Leu Leu Giy Met Gin LE 620 625 6 TTC AAC TTC CAG GAT GAG ACT CCC ACA ACC AAC TTI Phe Asn Phe Gin Asp Giu Thi- Pro Thr Thr As n Ph~ 635 640 645 GCC GCC ATC CTC ACT GTC TTC CAG ATC CTG ACG GG Ala A-la Ile Leu Thr Vai Phe Gin Ile Leu Thr Gi 655 660 GCA GTG ATG TAT CAC GGG ATC GAA TCG CAA GGC GG Ala Vai Met Tyr His Giy Ile Giu Ser Gin Gly Gi: 670 675 ATG TTC TCG TCC TTT TAC TTC ATT GTC CTG ACA CT( Met Phe Ser Ser Phe Tyr Phe Ile Vai Leu Thr Let 685 690 ACT CTG CTG AAT GTC TTT CTG GCC ATC GCT GTG GAC Thi- Leu Leu Asn Val Phe Leu Ala Ile Ala Vai Asp 700 705 710 GTG CAT TAC AAC Vai His Tyr As, 505 GAG TTT GTT TTC Giu Phe Val Phe 520 TAT GGC CTG GGG Tyr Giy Leu Gly 535 GAC TTT GGG GTC Asp Phe Gly Val 550 kTC AAG CCG GGA Ile Lys Pro Giy .TG CTG AGG ATC ~eu Leu Arg Ile 585 TG GTG GTG TCC eu Vai Vai Ser 600 TC TTG CTC TTC he Leu Leu Phe 615 P'G TTT GGG GGA ~u Phe Giy Gly C GAC ACC TTC Le Asp Thr Phe A GAG GAC TGG V Giu Asp T-r-p 665 C GTC AGC AAA y Vai Ser Lys 680 3TTC GGA AAC i Phe Gly Asn 695 -AAC CTG GCC Asn Leu A-la 1658 1706 1754 1802 1850 1898 1946 1994 2042 2090 2138 2186 2234 
2282 -153- AAC GCC Asn Ala 715 CAA GAG CTG ACC Gin Glu Leu Thr GAT GAA GAG GAG ATG GAA GAA GCA GCC Asp Giu Giu Giu Met Giu Giu Ala Ala 725 2330
AAT
Asn 730 CAG AAG CTT GCT Gin Lys Leu Ala
CTG
Leu 735 CAA AAG GCC AAA Gin Lys Ala Lys
GAA
Clu GTG GCT GAA Val Ala Glu CCC ATG TCT GCC Pro Met Ser Ala GTC AGC Val Ser 745 AAC TCG Asn Ser AAC ATC TCC ATC Asn Ile Ser Ile
GCC
Al a CCC AGG CAG CAG Ala Arg Gin Gin CCC AAG GCG Ala Lys Ala CAG AAC CTG Gin Asn Leu 780 TCG GTG TGG GAG Ser Val Trp Giu CCG GCC AGC CAG Arg Ala Ser Gin CTA CCC CTG Leu Arg Leu 775 ATG GAC CCC Met Asp Pro CCC CCC AGC TC Arg Ala Ser Cys
GAG
Glu 785 C CTG TAC AGC Ala Leu Tyr Ser
GAG
Ciu
GAG GAG Giu Giu 795 CCC CTG CCC TTC Arg Leu Arg Phe ACT ACG CCC CAC Thr Thr Arg His CCC CCC CAC ATG Arg Pro Asp Met
AAG
Lys 810 ACC CAC CTC GAC Thr His Leu Asp
CCC
Arg 815 CCC CTC GTG CTC Pro Leu Vai Val
GAG
Gltu CTC CCC CCC GAC Leu Cly Arg Asp GCG CCC CCC CCC Ala Arg Gly Pro GCA CCC AAA CC Cly Gly Lys Ala
CGA
Arg 835 CCT GAG GCT CC Pro Giu Ala Ala GAG CC Clu Ala 2378 2426 2474 2522 2570 2618 2666 2714 2762 2810 2858 2906 2954 CCC GAG GC Pro Giu Gly CAC AAG ACC Asp Lys Thr 860 GAC CCT CCC CC Asp Pro Pro Arg CAC CAC CCC CAC His His Arg His CCC CAC AAG Arg Asp Lys 855 CCC CCC AAG Ala Pro Lys CCC CC CC CCC Pro Ala Ala Gly
CAC
Asp 865 CAC GAC CGA CCA Gin Asp Arg Ala
GAG
Clu GCG GAG Ala Ciu 875 AGC CCC GAG CCC Ser Gly Giu Pro CCC CCC GAG GAG Al1a Ar- Glu G lu CCC CCC CCC CAC Pro Arg Pro His ACC CAC AGC AAG Ser His Ser Lys
GAG
Giu 895 CCC CC CCC CCC Ala Ala Gly Pro
CCC
Pro GAG CC CCC AC Ciu Ala Arg Ser CCC CCC CGA GC Arg Cly Arg Cly CCC T'C CCC GAG Gly Ser Pro Giu 925
CCA
Pro 910 CCC CCC GAG Cly Pro Giu CCC GC Gly Gly 915 CCC GAG Arg Glu 930 CCC CCC CAC CAC Arg Arg His His CCC CC Arg Arg GAG CC CCC GAG Giu Ala Al a Glii CCC CGA CC Pro Arg Arg CAC CCC CC His Arg Al1a 935 -154- CAC CGG CAC His Arg His 940 CAG GAT CCG Gin Asp Pro AGC AAG GAG TGC GCC Ser Lys Glu Cys Ala 945 GGC GCC Gly Ala 950 AAG GGC GAG Lys Gly Glu 3002 3050 CGG CGC Arg Arg 955 GCG CGG CAC CGC Ala Arg His Arg GGC GGC CCC CGA GCG GGG CCC CGG GAG GCG Gly Gly Pro Arg Ala Gly Pro Arg Glu Ala 960 965
GAG
Glu 970 AGC GGG GAG GAG Ser Gly Glu Glu
CCG
Pro 975 GCG CGG CGG CAC Ala Arg Arg His GCC CGG CAC Ala Arg His CAG CCT GCT CAC Gin Pro Ala His GCT GTG GAG AAG Ala Val Glu Lys AAG GCG Lys Ala 985 GAG GCC Glu Ala 1000
GAG
Glu 995 ACC ACG GAG AAG Thr Thr Glu Lys ACG GAG AAG GAG GCT Thr Glu Lys Glu Ala 1005 GAG ATA GTG GAA GCC GAC AAG Glu Ile Val Glu Ala Asp Lys 1010 GAA AAG GAG CTC Glu Lys Glu Leu 1015 0*
.4 U 4U U CGG AAC CAC CAG CCC Arg Asn His Gin Pro 1020 ACT GTG ACT GTG GGT Thr Val Thr Val Gly 1035 CGG GAG CCA CAC TGT Arg Glu Pro His Cys 1025 GAC CTG GAG ACC AGT GGG Asp Leu Glu Thr Ser Gly 1030 CCC ATG CAC ACA CTG CCC AGC ACC TGT CTC CAG Pro Met His Thr Leu Pro Ser Thr Cys Leu Gin t A f 1045 AAG GTG Lys Val 1050 GAG GAA CAG Glu Glu Gin CCA GAG GAT GCA GAC Pro Glu Asp Ala Asp 1055 AAT CAG CGG Asn Gin Arg 1060 CGC ATG GGC AGT Arg Met Gly Ser CAG CCC CCA Gin Pro Pro 1070 GAC CCG AAC ACT ATT GTA Asp Pro Asn Thr Ile Val 1075 GGG GAA GCC ACG GTC GTT Gly Glu Ala Thr Val Val 1090 AAC GTC ACT Asn Val Thr 1065 CAT ATC CCA His Ile Pro 1080 CCC AGT GGT Pro Ser Gly 1095 3098 3146 3194 3242 3290 3338 3386 3434 3482 3530 3578 GTG ATG CTG ACG GGC CCT CTT Val Met Leu Thr Gly Pro Leu 1085
AAC GTG GAC CTG GAA AGC CAA GCA GAG GGG AAG AAG GAG GTG GAA GCG Asn Val Asp Leu Glu Ser Gin Ala C11T ly Lys Ly Glu a u Aa S1100YS Glu Va Glu Ala 110017 GAT GAC GTG Asp AspD Val 1115 ATG TTC TGT Met Phe Cys 1130 ATG AGG AGE GGC CCC CGG Met Arg Ser Gly Pro Arg 1120 TTA AGC CCC ACC AAC CTG Leu Ser Pro Thr Asn Leu 1135 1110 CCT ATC GTC CCA TAC AGC TCC Pro Ile Val Pro Tyr Ser Ser 1125 CTC CGC CGC TTC TGC CAC TAC Leu Arg Arg Phe Cys His Tyr 1140 1145 ATC GTG ACC ATG AGG TAC TTC GAG GTG GTC ATT CTC GTG GTC ATC GCC Ile Val Thr Met Arg Tyr Phe Glu Val Val Ile Leu Val Val Ile Ala 1150 1, e 3626 lOD 1160 -155- TTC AGC AGC ATC GCC Leu Ser Ser Ile Ala 1165 CCC AGG AAC AAC GCT Pro Arg Asn Asn Ala 1180 CTG CCT GCT GAG CAC CCA GTG Leu Ala Ala Glu Asp Pro Val 1170 CTG AAA TAC CTG GAT TAC ATT Leu Lys Tyr Leu Asp Tyr Ile 1185 CGC ACA GAC TCG Arg Thr Asp Ser 1175 TTC ACT GGT CTC Phe Thr Gly Val TTT ACC TTT Phe Thr Phe 1195 GAG ATG CTG Glu Met Val ATA AAG ATG ATC GAC TTG GGA CTG CTG CTT Ile Lys Met Ile Asp Leu Gly Leu Leu Leu 1200 1205 C-AC CCT His Pro 1210 GGA GCC TAT Gly Ala Tyr TTC CCC Phe Arg 1215 GTG GTC ACT GC Val Val Ser Cly GAC TTG TGG AAC ATT CTC GAC Asp Leu Trp Asn Ile Leu Asp 1220 CC TTT CCT TTC TCA CGA TCC Ala Phe Ala Phe Ser Cly Ser 1235 CCC CTC CTC Ala Leu Val 1230 TTC ATT Phe Ile 1225 AAA CCC Lys Cly .gt voo 0 a a 000 a. e*a AAA GAC ATC AAT ACC ATC AAC TCT Lys Asp Ile Asn Thr Ile Lys Ser 1245 CTG AGA CTC Leu Arg Val 1250 CTT CGT GTC CTG CCC Leu Arg Val Leu Arg CCC CTC AAG ACC Pro Leu Lys Thr 1260 TCT CTG CTC AAC Cys Val Val Asn 1275 ATC AAA CCC Ile Lys Arg CTG CCC Leu Pro 1265 AAG CTC AAC LYS Leu Lys GCT GTC TTT CAC Ala Val Phe Asp 1270 3674 3722 3770 3818 3866 3914 3962 4010 4058 4106 4154 4202 4250 TCC CTG AAC AAT Ser Leu Lys Asn 1280 CTC CTC AAC Val Leu Asn ATC TTG Ile Leu ATT GTC TAC Ile Val Tyr ATG CTC Met Leu 1290 TTC ATG TTC Phe Met Phe ATA TTT Ile Phe 1295 CCC GTC ATT CC GTG CAG CTC TTC Ala Val Ile Ala Val Gin Leu Phe
AAA
Lys CCC AAC TTT Cly Lys Phe TCC ACC GGT Cys Arg Gly TTC TAC TGC Phe Tyr Cys 1310 CAC TAT TTG Gin Tyr Leu 1325 ACA CAT GAA TCC AAC GAG CTG Thr Asp Giu Ser Lys Clu Leu 1315 CAT TAT GAG AAC GAG CAA GTC Asp Tyr Glu Lys Clii Glu Val GAG ACG CAC Glu Arg Asp 1320 CAA CCT CAC Ciu Ala Gin 1330 CCC ACG CAC TGC AAC AAA TAC Pro AX9 Gin Trp, Lys Lys Tyr 1340 CAC TTT CAC TAC Asp Phe His Tyr 1345 CAC AAT CTC CTC TGC Asp Asn Val Leu Trp CCT CTG CTC ACC CTC Ala Leu Leu Thr Leu 1355 TTC ACA CTC TCC Phe Thr Val Ser 1360 GTG CTC AAA CAC TCC GTG CAT CCC ACC Val Leu Lys His Ser Val Asp Ala Thr 1370 1375 ACC CCA CAA CCC TCC CCC ATC Thr Gly Glu Gly Trp Pro Met 1365 TAT GAG GAG CAC GGT CCA AGC Tyr Glu Clu Gin Cly Pro Ser 1380 1385 4298 -156- CCT GGG TAC CGC ATC GAG CTC TCC ATC TTC TAC GTG GTC TAC TTT GTC Pro GlY Tyr Arg Met Giu Leu Ser Ile Phe Tyr Vai Val. Tr3heVa 1301395 1400 OTC TTT CCC TTC TTC TTC OTC AAC ATC TTT GTG OCT TTG ATC TC ATC 4394 Val Phe Pro Phe Phe Phe Val Asn Ile Phe Val Ala Leu 1ie Ile 1405 1410 1415 ACC TTC CAG GAG CAG GCC GAC AAG OTC ATC TCT GAA TGC AGC CTC GAG 4442 Thr Phe Gin Giu Gin Gly Asp Lys Vai Met Ser Giu Cys Ser Leu Giu 1420 1425 1430 GAG AGGAGO OCT TGC ATT GAC TTC CCC ATC AGC CCC AAA CCC CTG 49 Lys Asn Giu A-gAla C le Asp Phe Ala Ile Ser Ala Lys Pro Leu 1435 1440 14 *ACA CCC TAC ATC CCC CAA AAC CCC CAC TCG TTC CAG TAT AAG ACG TOO 4538 Thr Arg Tyr Met Pro Gin Asn Arg Gin Ser Phe Gin Tyr Lys Thr Trp *1450 1455 14G0 1465 ACA TTT CTC OTC TCC CCC CCC TTT CAA TAC TTC ATC ATG CCACAA4 Thr Phe Val Val Ser Pro Pro Phe Giu Tyr Phe Ile Met Aa Met Ile 1470* 1475 1480 CCC CTC AAC ACT OCG GTG CTC ATG ATC AAC TTC TAT CAT GCA CCC TAT 4634 A-la Leu Asn Thr Val Val Leu Met Met Lys Phe Tyr Asp Ala Pro Tyr *..1485 1490 1495 GAG TAC GAG CTG ATC CTC AAA TGC CTG AAC ATC GTC TTC ACA TCC ATG 4682 Clu Tyr Giu Leu Met Leu Lys Cys Leu Asn Ile Val Phe Thr Ser Met 1500 1505 1510 TTC TCC ATC CAJ\ TCC GTC CTC AAC ATC ATC CCC TTT CCC OTC CTC AAC 4730 Phe Ser Met Ciu Cys Val Leu Lys Ile Ile Ala 
Phe Gly Vai Leu Asn *.1515 1520 1525 TAT TTC AGA AT CC TOO AAT OTC TTT AC TTT TC ACT TG TTG GOA 4778 Tyr Phe Arg Asp A-la Ti-p Asn Val Phe Asp Phe Val Thr Val Leu Cly 1501535 1540 1545 ACT ATT ACT GAT ATT TTA GTA ACA GAG ATT CC GAA ACG AAC A-AT TTC 4826 Ser Ile Thr Asp 11 e Leu Val Thr Giu Ile Ala Ciu Thr Asn Asn Phe 1550 1555 1560 ATC AAC CTC ACC TTC CTC CCC CTC TTT CGA OCT CC COC CTG ATC A-AG 4874 Ile Asn Leu Ser Phe Leu Arg Leu Phe Arg Ala Ala Arg Leu Ile Lys 1565 1570 1575 CTC CTC CC CA-C CCC TAC ACC ATC CCC ATC CTC CTC TGG ACC TTT OTC 4922 Leu Leu Arg Gin Cly Tyr Thr Ile Arg Ile Leu Leu Trp Thr Phe Val 1580 1585 1590 CA-C TCC TTC AAG CC CTC CCC TAC TC TOT CTC CTC ATT GCC ATC CTC 4970 Gin Ser Phe Lys Ala Leu Pro Tyr Val Cys Leu Le u Ile Ala Met Leu 1595 1600 1605 -157- TTC TTC Phe Phe 1610 A-TC TAC GCC ATC Ile Tyr Ala Ile 16.1 CTG GAT GAT GAC ACC AC Leu Asp Asp Asp Thr Ser 1630 TTG CAA GCC CTG ATG
CTG
Leu Gin Ala Leu Met Leu 1645 CA-C GAG ATC ATG CTG TCC His Giu Ile Met Leu Ser 1660 ATC CCC ATG CAG GTG TTT GGG A-AT ATT GCC Ile Gly Met Gin Val Phe Gly Asr Ile Ala 51620 1625 ATC A-AC CCC CAC AAC AAC TTC CGG A-CC TTT Ile Asn Arg His Asn Asri Phe Arg Thr Phe 1635 1640 CTC TTC ACG AC Leu Phe A-rg Ser 1650 TGC CTG A-CC AAC Cys Leu Ser A-sn CC ACG CCC Ala Thr Giy GAG CCC TGG Giu Ala Trp e CCC A-AT CCC ACC Ala Asn Ala Thr 1675 TCC TTC ATC TTC Ser Phe Ile Phe 1690 CTG ATC ATC CAC Vai Ile Met Asp CCT CCT CAC CA-C Gly Pro His His GAG TCT CGA ACT CA-C TTT Giu Cys Cly Ser Asp Phe 1680 CAG CCC TGT CAT GAG CAC Gin Ala Cys Asp Giu Gin 1670 CC TAC TTC TAC TTC CTCC Ala Tyr Phe Tyr Phe Val CTC TCC TCC TTT CTC ATC TTC AAC Leu Cys Ser Phe Leu Met Leu Asn 1695 1700 CTC TTT CTC
CCT
AA~T TTT GAG TAC Asn Phe Giu Tyr 1710 CTC A-CC CCC CAC Leu Thr Arg Asp TCT TCC ATC CTA Ser Ser Ile 1Leu 5018 5066 5114 5162 5210 5258 5306 5354 5402 5450 54.98 5546 5594 5642 I'TG CAT GAG TTC Asp Giu Phe rn,, 1725 CCC GCT CC TCT CCC Pro Ala A-ia Cys Ciy 1740 ATC CCC Ile Arg CTC TGG GCT Vai Trp Ala CAA TAC CAC GC1u Tyr As p CCC ATC ACT TAC Arg Ile Ser Tyr A-AT CAC ATC Asn Asp Met TTT GAG A-TG CTG Phe Giu Met Leu A-AA CAC ATC Lys His Met 1755 TCC CCC CCT Ser Pro Pro CTG CCC CTC Leu Cly Leu CCC A-AG AAA TGC CCT CCT CCA Cly Lys Lys Cys Pro A-ia Arg CTT CCT Val Ala TAC A-AC CC Tyr LYS Arg CTC CTT CC Leu Val Arg 1775 ATG AAC ATG CCC ATC TCC Met A-sn m~t Pro lie Ser CAC ATG ACT GTT Asp Met Thr Vai CAC TTC His Phe 1790 ACC TCC ACC CTG A-TG Thr Ser Thr Leu Met AAC GAG Asn Giu 1785 CCC A-CC A-rg Thr CCC CTC ATC Aia Leu Ile GCA CTC GAG Ala Leu Ciu ATC A-AG CTG Ile Lys Leu 1805 CCC CCA GCT CCC A-CA A-AC CAC A-ia Pro Ala Cly Thr Lys Gin CAT CAC TGT His Gin Cys GA-C CC GAG TTC A-CC A-AC CAC Asp A-ia Clu Leu Arg Lys Ciu 1820 ATT TCC CTT CTG TGG Ile Ser Val Val Trp 1825 CCC A-AT CTG CCC A-ia A-sn Leu Pro 1830 -158- CAG AAG ACT TTG GAC TTG CTG
GTA
Gin Lys Thr Lou Asp Leu Leu Val.
1835 1840 ACA GTG GGG AAG GTT TAT GCA OCT Thr Val Giy Lys Val Tyr Ala Ala 1850 1855 CCA CCC CAT AAG CCT CAT GAG ATG Pro Pro His Lys Pro Asp Glu Met 1845 CTG ATG ATA TTT GAC TTC TAC
AAG
186 CAG AAC AAA ACC ACC AGA GAC CAG ATG CAG CAG GCT CCT GGA GGC
CTC
Gin Asn LYS Thr Thr Arg ASP Gin Met Gin Gin Ala Pro Gly Giy Lou 1 8 7 02 8 58 8
E
TCC CAG 1 uuT GTG TCC CTG TTC CAC CCT Ser Gin Met Gly pro Val Ser Leu Phe His pro 2885 1890 GAG CAG Oiu Gin CAG
AAG
Gin Lys ACA CAG
CCG
Thr Gin Pro OCT OTO CTC CGA OGA 0CC Ala Val Lou Arg Gy Ala CTG AAG 0CC ACC
CTG
Leu Lys Ala Thr Le, 1895 19910 ACT TCC ACC TCC CTC AGC AAT GOC 000 GCC ATA CAA AAC
CAA
Ser Ser Thr Ser Lou Ser Asn Gly Gly Ala Ile Gin Asn Gin GAG ACT GOC ATC AAA GAG TCT OTC 0Th Ser Oly Ile LYS Oiu Ser Val 1930 1935 TCC TOO GOC ACT CAA AGO
ACC
Ser Trp Giy Thr Gin Arg9 Thr 5690 5738 5786 5834 5882 5930 5978 6026 6074 6122 6170
CAG
Gin 1945 OAT OCA CCC ASP Ala Pro GAG ATC
CCT
Giu Ile Pro CAT GA GCC G1940 CAT GAO 0CC AG CCo CCC CTO GAO COT GOC H 1950 l Ag r Pro Lou Oiu Arg Giy 1950 1955 CAC TCC
ACA
OTO 000 COO TCA GOA OCA CTO OCT Vai Giy Arg Ser Oiy Ala Lou Ala 1965 1970 CAG
AGC
Gin Ser ATA ACC
CG
Ile Thr Arg AGO GOC CCT OAT 000
GAG
Arg Gly pro Asp Oly Oiu OTO GAC OTT CAG
ATO
Val Asp Val Gin Met 1975 CCC CAG CCT GO
CTG
GAO AOC CAG
OOT
Giu Ser Gin Oly 1.9.95 COA OCO 0CC TCC ATO CCC COC CTT OCO 0CC GAG
ACT
200o~ l Sr e Pro Arg Leu Ala Ala Giu Thr CAG CCC Gin Pro 2010 GTC ACA OAT Val Thr Asp 0CC AGC CCC ATO AAO COC TCC ATC TCC ACG
CTG
Ala Ser Pro Met Lys Arg Ser Ile Ser Thr Leu 2015 2020 0CC CAG COO Ala Gin Arg CCA CCC CCT Pro Pro Pro
CCC
Pro COT 000 ACT Arg Gly Thr CAT CTT TOC AGC ACC
ACC
His Lu Cys Ser Thr Thr 2025 20300 AOC CAG OCO TCO TCG CAC CAC CAC CAC A G O
A
Ser Gin Ala Ser Ser His His His His HCA Cys HisCA 2045 2050 20is 55 cs i 6266 6314 -159- CGC CGC AGG GAC AGG Arg Arg Arg Asp Arg 2060 AAG CAC AGG TCC CTG GAG AAG GGG CCC AGC CTG Lys Gin Arg Ser Leu Glu LYS Gly Pro Ser Leu 2065 2070 TCT GCC GAT Ser Ala Asp 207S ATG GAT GGC GCA CCA AGC ACT GCT GTG GGG Met Asp Gly Ala Pro Ser Ser Ala Val Gly 2080 2085 CCG GGG CTG Pro Gly Leu CCC CCG GGA GAG Pro Pro Gly Glu 2090 CAG GAG CGG GGC Gin Glu Arg Gly TCG GAG AAG CAG Ser Glu Lys Gin 2125 GGG CCT ACA GGC Gly Pro Thr Gly 2095 CGG TCC CAG GAG Arg Ser Gin Giu 2110 TGC CGG CGG GAA Cys Arg Arg Giu 2100 CGA GAG CGC Arg Giu Arg
CGG
Arg CGG AGG CAG CCC TCA Arg Arg Gin Pro Ser 2115 *5 TOC TCC TCC Ser Ser Ser 2120 GGC CGT GAG Gly Arg Glu CGC TTC Arg Phe TAC TCC TGC GAC Tyr Ser Cys Asp 2130 CGC TTT GGG Arg Phe Gly 6362 6410 6458 6506 6554 6602 6650 6698 6746 6794 CCC CCG AAG CCC Pro Pro Lys Pro 2140 AAG CCC TCC Lys Pro Ser CTC AGC AGC CAC CCA Leu Ser Ser His Pro 2145 ACG TCG CCA ACA Thr Ser Pro Thr GCT GGC CAG Ala Gly Gin 2155 GAG CCG GGA Giu Pro Gly CCC CAC Pro His 2160 CCA CAG GCC GGC TCA GCC GTG GGC Pro Gin Ala Gly Ser Ala Val Gly 2165 TTT CCG Phe Pro 2170 AAC ACA ACG Asn Thr Thr CCC TGC TGC Pro Cys Cys 2175 AGA GAG ACC CCC Arg Giu Thr Pro 2180 TCA GCC AGC CCC Ser Ala Ser Pro 2185 TGG CCC CTG GCT Trp Pro Leu Ala CTC GAA TTG Leu Giu Leu 2190 GCT CTG ACC CTT Ala Leu Thr Leu 2195 ACC TGG GGC Thr Trp, Gly AGC GTC Ser Val TGG ACA GTG AGG CCT CTG Trp Thr Val Arg Pro ILeu 2205 TCC ACG CCC TGC Ser Thr Pro Cys 2210 CTG AGG ACA Leu Arg Thr CGC TCA CTT Arg Ser Leu TCG AGG AGG CTG TGG Ser Arc Arc- T. i, rP 2220 CCA CCA ACT CGG GCC PrFro Thr Arg Ala 2225 GCT CCT CCA GGA CTT CCT Ala Pro Pro Giy Leu Pro ACG TGT CCT CCC TGACCTCCCA GTCTCACCCT CTCCGCCGCG
TGCCCAACGG
Thr Cys Pro Pro 2235 TTACCACTGC ACCCTGGGAC TCAGCTCGGG TGGCCGAGCA CGGCACAGCT
ACCACCACCC
TGACCAAGAC CACTGGTGCT AGCTGCACCG TGACCGCTCA GACGCCTGCA
TGCAGCAGGC
GTGTGTTCCA GTGGATGAGT TTTATCATCC ACACGGGGCA GTCGGCCCTC
GGGGGAGGCC
TTGCCCACCT TGGTGAGGCT CCTGTGGCCC CTCCCTCCCC CTCCTCCCCT
CTTTTACTCT
6842 6894 6954 7014 7074 7134 -160- AGACGACGAA TAAAGCCCTG TTGCTTGAGT GTACGTACCG
C 7175
INFORMATION FOR SEQ ID NO:9:
SEQUENCE CHARACTERISTICS:
LENGTH: 1546 base pairs
TYPE: nucleic acid
STRANDEDNESS: double
TOPOLOGY: linear
(ii) MOLECULE TYPE: DNA (genomic)
(ix) FEATURE:
NAME/KEY: CDS
LOCATION: 1..1437
(ix) FEATURE:
NAME/KEY: 3'UTR
LOCATION: 1435..1546
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:9:
ATG
Met 1 GTC CAG AAG ACC AGC ATG TCC CGG GGC CCT Val Gln Lys Thr Ser Met Ser Arg Gly Pro 5 10 TAC CCA CCC Tyr Pro Pro TCC CAG Ser Gln GAG ATC CCC ATG GAG GTC TTC GAC Glu Ile Pro Met Glu Val Phe Asp CCC AGC CCG CAG Pro Ser Pro Gln TCA GAT GGG AGC Ser Asp Gly Ser GGC AAA TAC AGC Gly Lys Tyr Ser AAG AGG AAA Lys Arg Lys GGG CGA TTC AAA Gly Arg Phe Lys
CGG
Arg 40 TCC TCG GAT Ser Ser Asp ACC ACA Thr Thr TCC AAC AGC TTT Ser Asn Ser Phe CGC CAG GGC Arg Gln Gly TCA GCG Ser Ala 60 GAG TCC TAC ACC Glu Ser Tyr Thr CGT CCA TCA GAC Arg Pro Ser Asp GAT GTA TCT CTG Asp Val Ser Leu GAG GAC CGG GAA Glu Asp Arg Glu TTA AGG AAG GAA GCA GAG CGC CAG GCA Leu Arg Lys Glu Ala Glu Arg Gln Ala TTA GCG Leu Ala CAG CTC GAG Gln Leu Glu AAG GCC Lys Ala AAG ACC AAG Lys Thr Lys
CCA
Pro 100 GTG GCA TTT GCT Val Ala Phe Ala CGG ACA AAT GTT Arg Thr Asn Val GGC TAC AAT Gly Tyr Asn 110 ATC ACC TTC Ile Thr Phe CCG TCT CCA GGG GAT GAG GTG CCT Pro Ser Pro Gly Asp Glu Val Pro 115 120 GTG CAG GGA GTG Val Gln Gly Val
GCC
Ala 125 -161- GAG CCC AAA GAC Giu Pro Lys Asp 130 TTC CTG CAC Phe Leu His 135 ATC AAG GAG AAA Ile Lys Giu Lys TGG ATC GGG CGG CTG GTG AAG GAG GGC TGT GAG GTT Trp, Ile Gly Arg Leu Val Lys Giu Giy Cys Giu Val 145-1-A AAT AAT GAC TGG Asri Asn Asp Trp, GGC TTC ATT CCC Gly Phe Ile Pro AGC CCC GTC AAA Ser Pro Val Lys GAC AGC CTT CGC Asp Ser Leu Arg
CTG
Leu CTG CAG GAA CAG Leu Gin Giu Gin AAG CTG Lys Leu
CGC CAG AAC Arg Gin Asn AGT CTG GGA Ser Leu Gly 195
CGC
Arg 180 CTC GGC TCC AGC Leu Gly Ser Ser
AAA
Lys 185 TCA GGC GAT AAC Ser Gly Asp Asn TCC AGT TCC Ser Ser Ser 190 CCC CCT GCC Pro Pro Ala GAT GTG GTG ACT Asp Val Vai Thr
GGC
Gly 200 ACC CGC CGC CCC Thr Arg Arg Pro
ACA
Thr 432 480 528 576 624 672 720 768 616 864 AGT GCC Ser Ala 210 AAA CAG AAG CAG Lys Gin Lys Gin
AAG
Lys 215 TCG ACA GAG CAT Ser Thr Giu His CCC CCC TAT GAC Pro Pro Tyr Asp GTG CCT TCC ATG Val Pro Ser Met
AGG
Arg 230 CCC ATC ATC CTG Pro Ile Ile Leu GGA CCG TCG CTC Gly Pro Ser Leu
AAG
Lys GGC TAC GAG GTT Gly Tyr Glu Val
ACA
Thr 24S GAC ATG ATG CAG Asp Met Met Gin
AAA
Lys 250 GCT TTA TTT GAC Ala Leu Phe Asp TTC TTG Phe Leu AAG CAT CGG Lys His Arg ATT TCC CTG Ile Ser Leu 275
TTT
Phe 260 GAT GGC AGG ATC Asp Gly Arg Ile ATC ACT CGT GTG Ile Thr Arg Val ACG GCA GAT Thr Ala Asp 270 AAA CAC ATC Lys His Ile GCT AAG CGC TCA Al1a Lys Arg Ser CTC AAC AAC CCC Leu Asn Asn Pro ATC ATT Ile Ile 290 GAG CGC TCC AAC Giu Arg Ser Asn CGC TCC AGC CTG Arg Ser Ser Leu
GCT
Al a GAG GTG CAG AGT Giu Val GinSe
GAA
Glu 305 ATC GAG CGA ATC le Glu Arg Ile
TTC
Phe 310 GAG CTG GCC CGG Giu Leu Ala Arg
ACC
Thr CTT CAG TTG GTC Leu Gin Leu Val CTG GAT GCT GAC Leu Asp Ala Asp
ACC
Thr 325 ATC AAT CAC CCA GCC CAG CTG TCC AAG Ile Asn His Pro Ala Gin Leu Ser Lys 330 ACC TCG Thr Ser 960 1008 1056 CTG GCC CCC Leu Ala Pro
ATC
Ile 340 ATT GTT TAC ATC Ile Val Tyr Ile AAG ATC ACC TCT CCC Lys Ile Thr Ser Pro 345 AAG GTA CTT Lys Val Leu 350 -162- CAA AGG CTC ATO AAG TCC CGA GGA AAG TCT CAG TCC AAA Gin Arg Leu Ile Lys Ser Arg Giv Lys Ser Gin Ser Lys 355360 365 GTC CAA Vai Gin 370 ATA GCG GCC TCG GAA Ile Aia A-la Ser Giu AAG CTG GCA CAG Lys Leu Aia Gin
CCC
Pro CAC CTC AAT His Leu Asn CCT GAA ATG Pro Giu Met GAC ATC ATC CTG Asp Ile Ile Leu GAT GAG Asp Giu 390 AAC CAA TTG Asn Gin Leu GAT GCC TGC GAG ASP Aa Cys Gu
CAT
His CTG GC GAG Leu Ala Giu ACC ACG CCA Ser Thr Pro TAC TTG Tyr Leu 405 CCC AAT Pro Asn 420 a.
GAA CCC TAT TGG AAG Giu Ala Tyr Tro Lys 410 CCG CTG CTG AAC CGC Pro Leu Leu Asn Arg GCC ACA CAC CCG Ala Thr His Pro CCC ACC Pro Ser ACC ATG GCT ACC GCA GCC 1104 1152 1200 1248 1296 1344 1392 1444 CTG GCT GCC AGC CCT GCC CCT GTC TCC AAC CTC CAG GTA CAG GTG CTC Leu Ala Aia Ser Pro Ala Pro Val Ser A-sn Leu Gin Val Gin Val Leu 435 440 445 ACC TCG Thr Ser 450 CTC AGG AGA AAC CTC Leu Arg Arg Asn Leu CGC GGC AGT GTG GTG CCC Arg Gly Ser Val Val Pro GGC TTC TGG GGC Gly Phe Trp Gly CAC GAG CAG GAA Gin Glu Gin Glu CTG GAG TCC TCA CAT GCC ATC TAGTGGGCGC CCTGCCCGTC TTCCCTCCTG CTCTGGGGTC GGAACTGGAG TGCAGGGAAC
ATGGAGGAGG
AAGGGAAGAG CTTTATTTTC TA APATA AGATGACCGG
CA 1546
INFORMATION FOR SEQ ID NO:10:
SEQUENCE CHARACTERISTICS:
LENGTH: 1851 base pairs
TYPE: nucleic acid
STRANDEDNESS: double
TOPOLOGY: linear
(ii) MOLECULE TYPE: DNA (genomic)
(ix) FEATURE:
NAME/KEY: CDS
LOCATION: 1..1797
OTHER INFORMATION: /standard_name= "Beta1-31"
(ix) FEATURE:
NAME/KEY: 3'UTR
LOCATION: 1795..1851
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:10:
ATG GTC CAG AAG ACC Met Val Gin Lys Thr AGO ATG TCO CGG GGO CCT TAO OCA COO TOC 0-AG Ser Met Ser Arg Gly Pro Tyr Pro Pro Ser Gin GAG ATC 000 Giu Ile Pro A-AG AGG AAA Lys Arg Lys 3S GGA GTC TTO GAC Gly Vai Phe Asp AGO CCG CAG GGC Ser Pro Gin Giy AAA TAO AGO Lys Tyr Ser TOO TCG GAT Ser Ser Asp GGG OGA TTC AAA Giy Arg Phe Lys TOA GAT GGG AGC Ser Asp Gly Ser ACO ACA Thr Thr TCC AAC AGC TTT Ser Asn Ser Phe CGC CAG GGO TOA Arg Gin Gly Ser GAG TCC TAC ACC Giu Ser Tyr Thr 9
AGC
S er 65 CGT OCA TCA GAC Arg Pro Ser Asp GAT GTA TCT CTG Asp Vai Ser Leu GAG GAO CGG GAA Giu Asp Arg Glu
GOC
Aila TTA AGG A-AG GAA Leu Arg Lys Giu GAG OGO CAG GCA Giu Arg Gin Ala GOG OAG OTO GAG Ala Gin Leu Glu AAG GOOC Lys Ala AAG ACC AAG Lys Thr Lys OOG TOT OOA Pro Ser Pro 115
OOA
Pro 100 GTG GOA TTT GOT Val Ala Phe Ala
GTG
Val 105 OGG ACA AAT GTT Arg Thr Asn Val GGO TAO AAT Gly Tyr Asn 110 ATO ACC TTO Ile Thr Phe 48 96 144 132 240 288 336 384 43 2 480 528 576 624 GGG GAT GAG GTG Gly Asp Glu Vai
OCT
Pro 120 GTG CAG GGA GTG Val Gin Gly Vai
GOO
Al a GAG 000 Giu Pro 130 AAA GAO TTO OTG Lys Asp Phe Leu ATO A-AG GAG A-AA Ile Lys Glu Lys A-AT AAT GAO TGG A-sn Asn Asp Trp
TGG
Trp 145 A-TO GGG OGG OTG Ile Gly A-rg Leu
GTG
Vai 150 A-AG GAG GGO TGT Lys Giu Gly Cys
GAG
Glu 155 GTT GO TTO ATT Val. Gly Phe Ile AGO 000 GTO AA-A Ser Pro Val Lys GAO AGO OTT OGO Asp Ser Leu A-rg OTG CAG GA-A CAG Leu Gin Giu CIn A-AG OTO Lys1 T.- OGO OAG A-AC Arg Gin Asn AGT OTG GGA Ser Leu Gly 195 OTO GGO TOO AGO Leu Giy Ser Ser TOA GGO GAT AAO Ser Gly Asp Asn TOO AGT TOO Ser Ser Ser 190 000 COT GOO Pro Pro Ala GAT GTG GTG ACT GGO ACC OGO OGO Asp Val Val Thr Gly Thr A-rg Arg 200 000 ACA Pro Thr AGT GOO Ser Ala 210 A-A-A CAG A-AG CAG A-AG TOG A-CA GAG OAT GTG 000 COO TAT GAO Lys Gin Lys Gin Lys Ser Thr Giu His Val Pro Pro Tyr Asp 215 220 -164- GTG GTG CCT TCC ATG AGG CCC ATC ATC
CTG
Val. Val Pro Ser Met Arg pro Ile Ile Leu. 225 230 GGC TAC GAG GTT ACA GAC ATG ATG CAG AAA Gly Tyr Giu Val Thr Asp Met Met Gin Lys 245 250 GTG GGA CCG TCG CTC
AAG
Vai Giy Pro Ser Leu Lvs 2 2 5 2 4 0 GCT TTA TTT GAC TTC
TTG
Aia Leu Phe Asp Phe Leo AAG CAT CGG Lys His Arg a.
ATT
TCC
Ile Ser ATC
ATT
Ile Ile 290
CTG
Leu 275 TTT GAT GGC AGG ATC TCC ATC ACT CGT GTG Phe Asp Gly Arg IeI Serli TrAr Va 26026 GCT AAG CGC TCA GTT CTC AAC AAC CCC
AGC
Aia Lys Arg Ser Vai Leu Asn Asn Pro Ser 280 285 ACG GCA
GAT
Thr Aia Asp 270 AA CAC ATC LYS His -le GTG CAG
AGT
720 768 816 864 912 GAG CGC TCC AAC ACA
CGC
Giu Arg Ser Asn Thr Arg TCC AGC CTG
GCT
Ser Ser Leu Aia
GAG
GAA
Gio 305 ATC GAG CGA
ATC
Ile Giu Arg Ile TTC GAG
CTG
Phe Giu Leo GCC CGG ACC CTT CAG TTG
GTC
Aa Arg Thr Leo GIn Leu Val
GCT
Ala 320 CTG GAT GCT
GAC
Leu Asp Aia Asp
ACC
Thr ATC AAT CAC CCA GCC CAG CTG TCC AAG ACC
TCG
Ile Asn His Pro Aia Gin Leu Ser Lys Thr Ser CTG GCC cCC Leu Aia Pro ATC ATT
GTT
Ile Ile Vai TAC ATC Tyr Ile CAA AGG Gin Arg GTC CAA Vai Gin 370
CTC
Leu 355 ATC AAG TCC CGA
GGA
Ile Lys Ser Arg Giy AAG ATO ACC TCT CCC Lys Ile Thr Ser Pro 345 AAG TCT CAG TCC AAA Lys Ser Gin Ser Lys 365 CTG GCA CAG TGC CCC Leu Aia Gin Cys Pro AAG GTA CTT Lys Vai Leu 350 CAC CTC AAT His Leu Asn CCT GAA ATG 1008 1056 1104 1152 ATA GCG GCC TCG Ile Ala Aia Ser GAA AAG Giu Lys
TTT
Phe 385 GAC ATC ATC
CTG
Asp Ile Tl- L-eu
GAT
Asp GAG AAC CAA TTG
GAG
Gbu Asn Gin Leo Giu ASP AProy GiuMe CTG GCG GAG TAC TTG GAA GCC TAT TGG AAG GCC ACA CAC CCG CCC
AGC
Leu Ala Giu Tyr Leo Glu Ala Tyr Trp Lys Ala Thr His Pro Pro Ser 405 410 4 AGC ACG CCA CCC AAT CCG CTG
CTG
Ser Thr Pro Pro Asn Pro Leu Leu CGC ACC ATG
GCT
CTG GCT GCC AGC CCT GCC CCT GTC TCC AAC CT AG GGA Leo Ala AiaSroAa Pro Val Ser Asn Leo GiC i 425 440 445 ACC GCA GCc Thr Aia Al!a 420 CCC TAC CTT Pro Tyr Leu 1248 1296 1344 -165- GCT TCC GGG Ala Ser Gly 450 GAO CAG CCA OTG Asp Gin Pro Leu 455 GAA OGG GCC ACC GGG GAG CAC GCC AGO Glu Arg Ala Thr Gly Glu His Ala Ser 460 ATG CAC GAG TAC OCA Met His Glu T1yr Pro 465 AGO AGC CAC COA cCA Ser Ser His Pro Pro 485 GGG GAG Gly Glu 470 CTG GGO CAG COO Leu Gay Gin Pro CCA GGO OTT TAC 000 Pro Gly Leu Tyr Pro GGC CGG GOA GGC AOG Gly Arg Ala Gly Thr 490 CTA OGG GOA OTG Leu Arg Ala Leu TCC OGO Ser Arg
CAA GAC Gin Asp ACG GAG Thr Glu GGG CCA Gly pro 530
ACT
Thr GAT GOC GAO ACC Asp Ala Asp Thr GGC AGO CGA AAO Gly Ser Arg Asn TCT GCC TAO Ser Ala Tyr 510 000 TOA GAG OTG GGA GAO TCA TGT Leu Gly Asp Ser Oys 515 GGG OTT GGA GAO OOT Gly Leu Gly Asp Pro 535
GTG
Val 520 GAO ATG GAG ACT Asp Met Glu Thr GOA GGG GGO GGO AOG 000 OOA GOC OGA Ala Gay Gly Gay Thr Pro Pro Ala Arg 1392 1440 1488 15S36 1584 1632 1680 1728 1776 1824 1851 GGA TOO TGG GAG Gly Ser Trp Glu
GAO
Asp 550 GAG GAA GAA GAO Glu Glu Glu Asp GAG GAA GAG OTG Glu Glu Glu Leu
ACC
Thr GAO AAO OGG AAO Asp Asn Arg Asn 0CG Arg 565 GGO OGG AAT AAG GC Gly Arg Asn Lys Ala 570 OGO TAO TGO GOT Arg Tyr Cys Ala GAG GCT Glu Gly GGG GGT OOA Gly Gly Pro TTG GGG OGO AAO AAG AAT GAG OTG GAG GGO TGG GGA Leu Gly Arg Asn Lys Asn Glu Leu Glu Gly Trp Gay 585 590 OGA GGO GTO TAO ATT OGO TGAGAGGCAG GGGOOAOAOG
GCGGGAGGAA
Arg Gly Val Tyr Ile Arg 595 GGGOTOTGAG OOOAGGGGAG
GGGAGGG
INFORMATION FOR SEQ ID NO:11:
SEQUENCE CHARACTERISTICS:
LENGTH: 3600 base pairs
TYPE: nucleic acid
STRANDEDNESS: double
TOPOLOGY: linear
(ii) MOLECULE TYPE: DNA (genomic)
(ix) FEATURE:
NAME/KEY: CDS
LOCATION: 35. .3310 OTHER INFORMATION: /standard name= "Alpha-2' -166- (iX) FEATURE: NAME/KEY. LOCATION: 1. .34 (i)FEATURE: NAME/KEY. 3'UTR LOCATION: 3308..3600 (xSEQUENCE DESCRIPTION. SEQ ID) GCGGGCAGG GGGCATTGAT CTTCCATCOC GAAG ATG OCT OCT CCC TOC CTG 52 Met Ala Ala Gly Cys Leu 6:00AC CTG GCC TTG C CTC ACA OTT TTC CAA TCT TTO CTC ATC CCC CCC TCC 1Cc Leu AaLeu Thr Leu Thr Leu Phe Gin Ser Leu Leu Ile Gly Pro Ser 1 0 1 52 0 TCG GAG GAG CCG TTC CCT TCG CC GTC ACT ATC AAA TCA TOO GTG GAT 148 Ser Glu Giu PoPeroS Ala Val Thr Ile Lys Ser Trp Val Asp 4 S25 35 AAG ATG CAA GAA GAC CTT GTC ACA CTO GCA AAA ACA GCA AGT GGA GTC 196 Lys Met Gin Glu Asp Leu Val Thr Leu Ala Lys Thr Ala Ser Gly Val *54045s AAT CAC CTT GTT GAT ATT TAT GAG AAA TAT CAA OAT TTG TAT ACT GTG 244 Asn Gin Leu Val Asp Ile Tyr Giu Lys Tyr Gin Asp Leu Tyr Thr Val 65 GAA CCA AAT OA CA C CAGCT OTA CAA ATT OCA CC AGO CAT ATT 2.92 0 u Pro Asn Asn Ala Arg Gin Leu Val Glu Ile Aia Ala Arg Asp Ile 80 8 V00GAG AAA CTT CTG AGC AAC AGA TCT AAA CCC CTO GTO AGC CTG GCA TTO 340 Glu Lys Leu Leu Ser Asn Arg Ser Lys Ala Leu Val Ser Leu Ala Leu GAA 0CG GAG AAA OTT CAA OCA OCT CAC CAC TOO ACA GAA GAT TTT OCA 388 Clu Ala Glu Lys Val Gin Ala Ala His Gin Trp Arg Oiu Asp Phe Ala 105 110 115 AGO AAT GAA OTT GTC TAC TAC AAT GCA AAG OAT CAT CTC GAT CCT GAG 436 Ser Asn Ciu Val Val Tyr Tyr Asn Ala Lys Asp Asp Leu Asp Pro Giu 120 125 130 AAA AAT GAO AGT GAG OCA CCC AGC CAG AGO ATA AAA CCT OTT TTO ATT 484 Lys Asn Asp Ser Glu Pro Gly Ser Gin Arg Ile Lys Pro Val Phe Ile 135 140 145 150 OAA OAT OCT AAT TTT OGA CCA CAA ATA TCT TAT CAG CAC OCA CCA OTO 532 Giu Asp Ala Asn Phe Oly Arg Gin Ile Ser Tyr Gin His Ala Ala Val 155 160 165 CAT ATT CCT ACT CAC ATO TAT GAO GC TCA ACA ATT GTO TTA AAT GAA 580 His Ilie Pro Thr Asp Ile Tyr Olu Oly Ser Thr Ile Val Leu Asn Giu -167-
CTC
Leu
GAA
Glu
CCT
Al a 215
AJAT
As n
GGA
Gly
ACT
Ser
ATG
Met A.AC Asn S 2 95 AAT G Asn V
GCC
Ala L CAG C Cmn L ATC C Met L 3 AAA T.
Ly's r 375 CAA c~ Clr. H:
A
As
CC
Ar
AA
Ly.
GC
Al -1fI Va 2 1TT ?60
G.C
~er
;TA
~al ,ys
TC
eu
TA
eu 60 yr i~s ~C TC n Tr 18 .C cc p Pr 0 A TA: g Ty~ G AT' s Ile T C a Ala
*ACT
*Ser 265
GAA
Clu
AAT
Asn
ACA
Arg
CCA
Cly
CTT
Leu 345
TTC
Phe
AAT
Asn
AAT
Asn 170 C ACA ACT C pThr Ser Al 5 T TCA TTA TT c Ser Leu Le, T TAT CCA GC' r yr Pro Ali 22( CAC CTT TAJ Asp Leu Ty2 235 TCT CCT AA; Ser Pro Lys 250 CCA TTC ACA Cly Leu Thr ACC CTC TCA Thr Leu Ser GCT CAC CAT Ala Gin Asp 300 AT AA AAA Asn Lys Lys 315 ATT ACA CAT Ile Thr Asp 330 AAT TAT AAT Asn Tyr Asn ACC CAT CCA Thr Asp Cly AAA CAT AA-A Lys Asp Lys 380 TAT CAC ACA Tyr Clu Arg C TTA CAT CAA a Leu Asp Clu 190 TCC CAC CTT u Trp Cln Val 205 r TCA CCA TCC a Ser Pro Trp 7 CAT CTA CC Asp Val Arg CAC ATC CTT *Asp Met Leu 25S *CTT AAA CTC Leu Lys Leu 270 CAT CAT CAT Asp Asp Asp 285 CTA ACC TCT IJ Val Ser Cys I CTC TTC AAA G Val Leu LysA 3 TAT AAC A-AC C Tyr Lys Lys C 335 CTT TCC ACA C Val Ser Arg A 350 CCA CAA CAC A Cly Clu Clu A 365 AAA CTA CCT G Lys Val Arg V CCA CCT ATT C2 Cly Pro Ile C: Va
TT
Ph
CT
Va
AG~
Ar 24
AT:
IlE
ATC
Ile rTC ?he
'TT
'he
AC
~sp 20
C
ly
CA
la rg
TA
Ln 180 T TTC AAA AAC AAT CC .1 Phe Lys Lys Asn Arg 195 'T CCC ACT CCC ACT GC e Cly Ser Ala Thr Cly 210 T CAT AAT ACT ACA ACT 1 Asp Asn Ser Arg Thr 225 A. ACA CCA TCC TAC ATC g Arg Pro Trp Tyr Ile 0 245 C CTC CTC CAT CTC ACT Leu Val Asp Val Ser 260 *CGA ACA TCT CTC TCCC -Arg Thr Ser Val Ser 275 CTC AAT CTA CCT TCA Val Asn Val Ala Ser 290 CAC CAC CTT CTC CAA G Cmn His Leu Val Cln 305 3 CC CTC AAT AAT ATC A Ala Val Asn Asn Ile T 325 TTT ACT TTT CCT TTT C Phe Ser Phe Ala Phe C 340 AAC TCC AAT AAC ATT A Asn Cys Asn Lys Ile I 355 CCC CAC CAC ATA TTT A- Ala Cln Clu Ile Phe A 370 TTC ACC TTT TCA CTT CG Phe Arg Phe Ser Val C: 3853 TCC ATC CCC TCT CAA A) Trp Met Ala Cys Clu A~
CAC
Cl u
CTA
Leu
CCA
Pro 230
CAA
Cln
GCA
Gly
GAA
Glu rTT ?he
CA
da
.CA
hr
AA
lu
TT
le sn ly 628 676 724 772 820 868 916 964 1012 1060 1108 1156 1204 1252 395 400 405 AA GCT TAT TAT TAT GAA ATT CCT TCC ATT GGT GCA ATA AGA ATC AAT 1300 Lys Gly Tyr Tyr Tyr Giu Ile Pro Ser Ile Gly Ala Ile Ar,, Ile Asn 4 1 04 1 5 4 2 ACT CAC GAA TAT TTG GAT GTT TTG GA AGA CCA ATG CTT TWA CCA CCA 14 Thr Gin Giu Tyr Leu Asp Val Leu Gly Arg Pro Met Val Leu Ala Gly 425 430 435 GAC AAA GCT AAG CAA GTC CAA TCC ACA AAT GTG TAC CTG GAT GCA TTC 1396 Asp Lys Ala Lys Gin Val Gin Trp Thr AsnVaTy eAsAaLe 4400 GAA CTG GCA CTT GTC ATT ACT GGA ACT CTT CCG GTC TTC AAC ATA ACC 1444 Giu Leu Cly Leu Val Ile Thr Gly Thr Leu Pro Val Phe Asn Ile Thr 455 460 465 470 GGC CAA TTT GAA AAT AAG ACA AAC TTA AAG AAC CAG CTIG ATT CTT CGT 1492 Gly Gin Phe Glu Asn Lys Thr Asn Leu Lys Asn Gin Leu Ile Leu Gly 475 480 485 GTG ATC GGA GTA CAT GTC TCT TTC GAA GAT ATT AAA AGA CTC ACA CCA 1540 Val Met Cly VlApVlSrLuG Asp IeLys Arg Leu Thr Pro 490 495 -500 CG TT CACTG TGC CCC AAT GGG TAT TAC TTT GCA ATC GAT CCT AAT 1588 Arg Phe Thr Leu Cys Pro Asn Gly Tyr Tyr Phe Ala Ile Asp Pro Asn ***505 51051 GGT TAT GTT TTA TTA CAT CCA AAT CTT CAG CCA AAG AAC CCC AAA TCT 1636 Gly Tyr Val Leu Leu His Pro Asn Leu Gin Pro Lys Asn Pro Lys Ser *520 525 5 30 C GAG CAGTA ACA TTG GAT TTC CTT GAT GCA GAG TWA GAG AAT GAT 1684 Gin Giu Pro Val Thr Leu Asp Phe Leu Asp Ala Glu Leu Giu Asn Asp 535 540 545 550 AT AAGGGAG ATT CGA AAT AAG ATG ATT CAT CCC GAA ACT GGA CAA 1732 IleLysValGiu Ile A-rg Asn Lys Met Ile Asp Gly Ciu Ser Gly Ciu .555 560 C ;C AAA ACA TTC AGA ACT CTG CTT AAA TCT CAA CAT GAG AGA TAT ATT GAC 1780 Lys Thr Phe Arg Thr Leu Val Lys Ser Gin Asp Giu Arg Tyr Ile Asp .*570 57558 AAA CCA AAC AGC ACA TAC ACA TCG ACA CCT CTC AAT CCC ACA CAT TAC 1828 Lys Cly Asn Arg Thr Tyr Thr Trp Thr Pro Val Asn Cly Thr AspTy 585 590 ACT TTC CCC TTG GTA TTA CCA ACC TAC ACT TTT TAC TAT ATA AAA GCC 1876 Ser Leu A-1a Leu Val Leu Pro Thr Tyr Ser Phe Tyr Tyr Ile Lys Ala 600 605 610 AAA CTA GAA GAG ACA ATA ACT CAG CCC AGA TCA AAA. 
AAC CCC AAA ATC 1924 Lys Leu Ciu Glu Thr Ile Thr Gin Ala Arg Ser Lys Lys Gly Lys Met -169- 615 AAG GAT TCG GAA Lys Asp Ser Glu
ACC
Thr 635 CTG AAG CCA GAT Leu Lys Pro Asp
AAT
Asn TTT GAA GAA Phe Glu Glu ACA TTC ATA Thr Phe Ile AAT AAC ACT Asn Asn Thr 665
GCA
Al a CCA AGA GAT TAC Pro Arg Asp Tyr
TGC
Cys AAT GAC CTG AAA Asn Asp Leu Lvs TCT GGC TAT Ser Gly Tyr 645 ATA TCG GAT Ile Ser Asp 660 GAT AGA AAA Asp Arg Lys GAA TTT CTT TTA Glu Phe Leu Leu
AAT
Asn 670 TTC AAC GAG TTT Phe Asn Glu Phe
ATT
Ile ACT CCA Thr Pro 680 AAC AAC CCA TCA Asn Asn pro Ser AAC GCG GAT TTG Asn Ala Asp Leu
ATT
Ile AAT AGA GTC TTG Asn Arg Val Leu GAT GCA GGC TTT Asp Ala Gly Phe
ACA
Thr 700 AAT GAA CTT GTC Asn Glu Leu Val CAA AAT Gin Asn TAC TGG AGT Tyr Trp, Ser CAG AAA AAT ATC Gin Lys Asn Ile
AAG
Lys 715 GGA GTG AAA GCA Gly Val Lys Ala
CGA
Arg TTT GTT GTG ACT Phe Val Val Tlir GAT GGT Asp Gly GGG ATT Gly Ile AAC CCA Asn Pro GAT AAC Asp Asn 760
ACC
Thr AGA GTT Arg Val 730 TAT CCC AAA Tyr Pro Lys
GAG
Giu GCT GG;A GAA AAT Ala Gly Glu Asn TGG CAA GAA Trp, Gn Glu 740 CTA GAT AAT Leu Asp Asn 1972 2020 2068 2116 2164 2212 2260 2308 2356 2404 2452 2500 2548 GAG ACA TAT GAG GAC Giu Thr Tyr Glu ASP 745 TAT GTT TTC ACT GCT Tyr Val Phe Thr Ala 765 TTC TAT AAA AGG Phe Tyr Lys Arg CCC TAC TTT AAC AAA Pro Tyr Phe Asn Lys 770 AGT GGA CCT GGT Ser Gly Pro Gly TAT GAA TCG GGC Tyr Glu Ser Gly ATG GTA AGC AAA Met Val Ser Lys GTA GAA ATA TAT Val Giu Ile Tyr
ATT
Ile CAA GG.G AAA Gin Gly Lys AAT TCC TGG Asn Ser Trp GCT GGT CCA Ala Gly Pro 825 Leu AAA CC GCA GTT Lys Pro Ala Val GGA ATT AAA Gly Ile Lys ATA GAG AAT TTC ACC AAA ACC TCA ATC AGA Ile Giu Asri Phe Thr Lys Thr Ser Ile Arg 810 815 ATT GAT GTA Ile ASP Val 805 GAT CCG TGT ASP Pro Cys 820 ATG GAT TGT GTT TGT GAC TGC Val Cys Asp Cys
AAA
Lys AGA AAC AGT GAC Arg Asn Ser Asp
GTA
Val1 GTG ATT CTG GAT GAT GGT GGG TTT CTT CTG ATG GCA AAT CAT GAT GAT Val Ile Leu Asp Asp Gly Gly Phe Leu Leu Met Ala Asn His Asp Asp 2596 -170- 840 845 850 TAT ACT AAT CAG ATT GGA AGA TTT TTT GGA GAG ATT GAT CCC AGC TTG 2644 Tyr Thr ASn Gin lie Giy Arg Phe Phe Gly Giu Ile Asp Pro Ser Leu 855 60 865 870 ATG AGA CAC CTG GTT AAT ATA TCA GTT TAT GCT TTT AAC AAA TCT TAT 2692 Met Arg His Leu Val Asn Ile Ser Val Tyr Aia Phe Asn Lys Ser Tyr 885 875 880 885 GAT TAT CAG TCA GTA TGT GAG CCC GGT GCT GCA CCA AAA CAA GGA GCA 2740 Asp Tyr Gin Ser Vai Cys Giu Pro Gly Ala Ala Pro Lys Gin Gly Ala 890 895 900 GGA CAT CGC TCA GCA TAT GTG CCA TCA GTA GCA GAC ATA TTA CAA ATT 2788 Gly His Arg Ser Ala Tyr Val Pro Ser Val Ala Asp Ile Leu Gin Ile 905 910 915 GGC TGG TGG GCC ACT GCT GCT GCC TGG TCT ATT CTA CAG CAG TTT CTC 2836 Gly Trp Trp Ala Thr Ala Ala Ala Trp Ser Ile Leu Gin Gin Phe Leu 920 925 930 :TTG AGT TTG ACC TTT CCA CGA CTC CTT GAG GCA GTT GAG ATG GAG GAT 2884 Leu Ser Leu Thr Phe Pro Arg Leu Leu Giu Ala Val Giu Met Giu Asp *935 940 945 950 GAT GAC TTC ACG GCC TCC CTG TCC AAG CAG C TGC ATT ACT GAA CAA 2932 Asp Asp Phe Thr Ala Ser Leu Ser Lys Gin Ser Cys Ile Thr Giu Gin 955 960 965 965 ACC CAG TAT TTC TTC GAT AAC GAC AGT AAA TCA TTC AGT GGT GTA TTA 2980 Thr Gin Tyr Phe Phe Asp Asn Asp Ser Lys Ser Phe Ser Gly Val Leu 970 975 980 GAC TGT GGA AAC TGT TCC AGA ATC TTT CAT GGA GAA AAG CTT ATG AAC 3028 GACG TGT
GGA
Asp Cys Gly Asn Cys Ser Arg lie Phe His Gly Giu Lys Leu Met Asn 995 985 990 995 ACC AAC TTA ATA TTC ATA ATG GTT GAG AGC AAA GGG ACA TGT CCA TGT 3076 Thr Asn Leu Ile Phe Ile Met Val Gu Ser Lys Gly Thr Cys Pro Cys 1000 1005 1010 GAC ACA CGA CTG CTC ATA CA, GCG GAG CAG ACT TCT GAC GGT CCA AAT 3124 SAsp Thr Arg Leu Leu Ile Gln Ala Gu Gin Thr Ser Asp Gly Pro Asn :....1015 11020 1025 1030 CCT TGT GAC ATG GTT AAG CAA CCT AGA TAC CGA AAA GGG CCT GAT GTC 3172 Pro Cys Asp Met 5Val Lys Gin Pro Arg Tyr Arg Lys Gly Pro Asp Val 1035 1040 1045 TGC TTT GAT AAC AAT GTC TTG GAG GAT TAT ACT GAC TGT GGT GGT GTT 3220 Cys Phe Asp Asn Asn Val Leu Gu Asp Tyr Thr Asp Cys Gly Gly Val 1050 1055 1060 TCT GGA TTA AAT CCC TCC CTG TGG TAT ATC ATT GGA ATC CAG TTT CTA 3268 Ser Gly Leu Asn Pro Ser Leu Trp Tyr Ile Ile Gly Ile Gin Phe Leu -171-
1065 1070 1075
CTA CTT TGG CTG GTA TCT GGC AGC ACA CAC CGG CTG TTA TGACCTTCTA 3317
Leu Leu Trp Leu Val Ser Gly Ser Thr His Arg Leu Leu
1080 1085 1090
AAAACCAAAT CTGCATAGTT AAACTCCAGA CCCTGCCAAA ACATGAGCCC TGCCCTCAAT 3377
TACAGTAACG TAGGGTCAGC TATAAAATCA GACAAACATT AGCTGGGCCT GTTCCATGGC 3437
ATAACACTAA GGCGCAGACT CCTAAGGCAC CCACTGGCTG CATGTCAGGG TGTCAGATCC 3497
TTAAACGTGT GTGAATGCTG CATCATCTAT GTGTAACATC AAAGCAAAAT CCTATACGTG 3557
TCCTCTATTG GAAAATTTGG GCGTTTGTTG TTGCATTGTT GGT 3600
INFORMATION FOR SEQ ID NO:12:
SEQUENCE CHARACTERISTICS:
LENGTH: 323 base pairs
TYPE: nucleic acid
STRANDEDNESS: double
TOPOLOGY: linear
(ii) MOLECULE TYPE: DNA (genomic)
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:12:
CCCCCTGCCA GTGGCCAAAC AGAAGCAGAA GTCGGGTAAT GAAATGACTA ACTTAGCCTT
TGAACTAGAC CCCCTAGAGT TAGAGGAGGA AGAGGCTGAG CTTGGTGAGC AGAGTGGCTC
TGCCAAGACT AGTGTTAGCA GTGTCACCAC CCCGCCACCC CATGGCAAAC GCATCCCCTT
CTTAGAG ACAGAGCATG TGCCCCCCTA TGACGTGGTG CCTTCCATGA GGCCCATCAT
CCTGGTGGGA CCGTCGCTCA AGGGCTACGA GGTTACAGAC ATGATGCAGA AAGCTTTATT
TGACTTCTTG AAGCATCGGT TTG
INFORMATION FOR SEQ ID NO:13:
SEQUENCE CHARACTERISTICS:
LENGTH: 57 base pairs
TYPE: nucleic acid
STRANDEDNESS: double
TOPOLOGY: linear
(ii) MOLECULE TYPE: DNA (genomic)
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:13:
CCTATTGGTG TAGGTATACC AACAATTAAT TTAAGAAAAA GGAGACCCAA TATCCAG
INFORMATION FOR SEQ ID NO:14:
SEQUENCE CHARACTERISTICS:
LENGTH: 180 base pairs
TYPE: nucleic acid
STRANDEDNESS: double
TOPOLOGY: linear
(ii) MOLECULE TYPE: DNA (genomic)
(ix) FEATURE:
NAME/KEY: CDS
LOCATION: 1..132
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:14:
TG TCC TTT GCC TGC GCC TGT CC GCC TTC ATC CTC CTC TTT CTC GGC 48
Trp Ser Phe Ala Cys Ala Cys Ala Ala Phe Ile Leu Leu Phe Leu Gly
1 5 10 15
GGT CTC CCC CTC CTG CTG TTC TCC CTG CCT CGA ATG CCC CGG AAC CCA 96
Gly Leu Ala Leu Leu Leu Phe Ser Leu Pro Arg Met Pro Arg Asn Pro
25
TGC GAG TCC TCC ATG GAT GCT GAG CCC GAG CAC TAACCCTCCT GCGGCCCTAG 149
Trp Glu Ser Cys Met Asp Ala Glu Pro Glu His
35
CGACCCTCAG GCTTCTTCCC AGGAAGCGGG G 180
INFORMATION FOR SEQ ID NO:15:
SEQUENCE CHARACTERISTICS:
LENGTH: 22 base pairs
TYPE: nucleic acid
STRANDEDNESS: single
TOPOLOGY: linear
(ii) MOLECULE TYPE: Other nucleic acid; DESCRIPTION: Oligonucleotide
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:15:
AATTCGGTAC GTACACTCGA GC 22
INFORMATION FOR SEQ ID NO:16:
SEQUENCE CHARACTERISTICS:
LENGTH: 18 base pairs
TYPE: nucleic acid
STRANDEDNESS: single
TOPOLOGY: linear
(ii) MOLECULE TYPE: Other nucleic acid; DESCRIPTION: Oligonucleotide
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:16:
GCTCGAGTGT ACGTACCG 18
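The four oligonucleotides of SEQ ID NOs: 15 to 18 can be checked for pairing directly from their listed sequences. The following is a minimal sketch, not part of the listing: it assumes only the base sequences as printed, and the helper names `reverse_complement` and `five_prime_overhang` are hypothetical. It tests whether the longer oligonucleotide of each pair ends with the reverse complement of the shorter one, and reports any unpaired 5' bases. Any interpretation of those unpaired bases as a cohesive end for a particular restriction site is an inference from the sequences, not a statement made in this listing.

```python
# Oligonucleotide sequences as printed in SEQ ID NOs: 15-18 (spaces removed).
OLIGOS = {
    15: "AATTCGGTACGTACACTCGAGC",
    16: "GCTCGAGTGTACGTACCG",
    17: "CCATGGTACCTTCGTTGACG",
    18: "AATTCGTCAACGAAGGTACCATGG",
}

# Watson-Crick pairing for unambiguous DNA bases.
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return "".join(COMPLEMENT[base] for base in reversed(seq))


def five_prime_overhang(longer: str, shorter: str):
    """If `longer` ends with the reverse complement of `shorter`,
    return the unpaired 5' bases of `longer`; otherwise return None."""
    paired = reverse_complement(shorter)
    if longer.endswith(paired):
        return longer[: len(longer) - len(paired)]
    return None


if __name__ == "__main__":
    # Hypothetical pairing assumed here: 15 with 16, and 18 with 17.
    for long_id, short_id in [(15, 16), (18, 17)]:
        overhang = five_prime_overhang(OLIGOS[long_id], OLIGOS[short_id])
        print(f"SEQ ID NO:{long_id} / NO:{short_id}: 5' overhang = {overhang}")
```

Run as a script, this prints the unpaired 5' bases for each assumed pair; a result of `None` would mean the two strands do not anneal end to end in the way assumed.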
INFORMATION FOR SEQ ID NO:17:
SEQUENCE CHARACTERISTICS:
LENGTH: 20 base pairs
TYPE: nucleic acid
STRANDEDNESS: single
TOPOLOGY: linear
(ii) MOLECULE TYPE: Other nucleic acid; DESCRIPTION: Oligonucleotide
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:17:
CCATGGTACC TTCGTTGACG
INFORMATION FOR SEQ ID NO:18:
SEQUENCE CHARACTERISTICS:
LENGTH: 24 base pairs
TYPE: nucleic acid
STRANDEDNESS: single
TOPOLOGY: linear
(ii) MOLECULE TYPE: Other nucleic acid; DESCRIPTION: Oligonucleotide
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:18:
AATTCGTCAA CGAAGGTACC ATGG 24
INFORMATION FOR SEQ ID NO:19:
SEQUENCE CHARACTERISTICS:
LENGTH: 2153 base pairs
TYPE: nucleic acid
STRANDEDNESS: double
TOPOLOGY: linear
(ii) MOLECULE TYPE: DNA (genomic)
(ix) FEATURE:
NAME/KEY: CDS
LOCATION: 53..1504 OTHER INFORMATION: /standard_name= "Beta-3-1" (xi) SEQUENCE DESCRIPTION: SEQ ID NO:19: CCGCCTCGGA CCCCCTGTCC CGGGGGAGGG GGAGAGCCCG CTACCCTGGT CT ATG Met 1 TCT TTT TCT GAC TCC AGT GCA ACC TTC CTG CTG AAC GAG GGT TCA GCC 103 -174- Ser Phe Ser Asp Ser Ser Ala Thr Phe Leu LeU Asn Glu Gly Ser Al!a 10 1 GAO TCC TAC ACC AGO CGC CCA TOT CTG GAC TOA GAO GTO TC i Ts GG Asp Ser Tyr Thr Ser Ar; pro Ser Leu Asp Ser Asp Val Ser Leu Giu C T A GG25 GGGAC OGG GAG AGT GOC OGG OGT GAA GTA GAG AGO CAG GCT CAG OAG 199 Glu Asp A-rg Glu Ser Ala Ar; Ar; Glu Val Glu Ser Gln AlaGi Gn 40n 45n CAG CTO GAA AGG GCC AAG CAC AAA OCT GTG GOA TTT GOG GTG AGG ACC24 Gln Leu Glu A-r; Ala Lys His Lys Pro Val Al247AaVl r h S h l a r h 5560 AAT GTO AGO TAO TGT GGC GTA CTG GAT GAG GAG TGO OCA GTC CAG GGO 295 Asn Val Ser Tyr Cys Gly Val Leu Asp Giu Giu Cys Pro Val Gin Gly 75 s0 TOT GGA GTC AAC TTT GAG GOC AAA GAT TTT OTG CAC ATT AAA GAG AAG34 Ser Gly Val Asn Phe Giu Ala Lys Asp Phe 1e i i y i y 90 e Hi Il Ly l Ls 95 TCACATGAO TGG TGG ATC GGG CGG CTA GTG AAA GAG GGC GGG GAC39 ~TAC AGC A-A-sp Trp Trp Ilie Gly Ar; Leu Val Lys Giu lGyAs39 100 105 1 110 CAT GCC TC ATO 000 AGC AOO TCA G GC C CCACCG G
A
Gie Ala Phe Lie ProaAg r Ser o Gin Ar Pro i Ser ier Leu L8-s 13G GA GCACCG T Asp 135 140 11512 AG GGCA AA G CGC AGG AG T GG CAAOC CT TOO AAG OTG AGT GAA 485 ine Gu Gin Lys Ala Ar; Ar; Sr Pro s ro Ser Ser L es G er AyspG IS 13016 135 140 145; AT GTG AAO TGA GO TOO CT CG TCT O AAGTTA GAAG Gc ACA G
GAA
Pie Galy Val Ar; Var; Se Po r Pro Ser Leu Al Lys Gin TrG V lyT spi 150 155 190 ATG T C G AA OAT GCTT 00 OGC TT G A GTG GTGA0 TOO GATG GG 589 Gin Al e l Giu His Val Pro ro Tlr Asp VLe Val Pro Ser Met Ar; 210. 165 17022 COT GTG TG OTG GAT G GG 000 TO OG ACCGT TT GAG CG T AC GAO 63 0 -175- 14., *fl.
*S45 Ser Val.
GCC CGC Ala Arg GAG CTG Giu Leu AAC CAG Asn His 275 TTT GTC Phe Val 290 CGG GGG Arg Gly GAT AAG Asp Lys GAG AAGC Giu Asn G GTT TAC TI Val Tyr TI 355 GGT CCT C Gly Pro P 370 GAG CGT G Glu Arg G TCT GAT G~ Ser Asp G: GAG CGT AC Gin Arg S~ 42 GAG CTG T; Asp Leu T 435 AAC GGG CA~ Leu Asn Asn 230 TCC AGC ATT Ser Ser Ile 245 GCC AAA TCC Ala Lys Ser 260 CCA GCA GAG Pro Ala Gin AAA GTG TCC Lys Val Ser A.AG TCA CAG Lys Ser Gin 310 CTG GTT GAG Aeu Val GinC 325 :AG CTG GAG C in Leu Giu 40 'GG CGG GCCA 'rp Arg Ala TI CC AGT GCC A ro Ser Ala I 3 GC GAG GAG C iy Giu Giu H 390 AG GCC AGC G~ lu Ala Ser G: 405 3C TCC CGC C r Ser Arg H; ~C GAG CCT C; ~Gin Pro Hi ~T GAG CCC C-z
GC
C1) Le
CT
Le
TG
Se 29 MIe
TG(
~CG
'hr
.TC
le 75
AG
is lu Ls ro Gly Li G GAA G]2 La Glu Va ~G GAG CT ~u Gin Le 26 'G GGC AA u Ala Ly 280 A GCA AA( r Pro Ly 5 G~ AAG GAC t Lys Hi, GGA CCC 3Pro Prc *GGG TGT Ala Gys 345 GAG GAG His His 360 CCC GGA Pro Gly TGC CCC Ser Pro AGC TCC Ser Ser CTG GAG Leu Giu 425 GGC CAA Arg Gin 440 GAG GG Arg Thr Ile Ile G.
235 7G GAG AGT GAG ATC G; Li Gin Ser Giu Ile G- 250 'A GTA GTG TTG GAG GC u Val Vai Leu Asp A! 5 2-) G ACC TCG CTG GGG CC s Thr Ser Leu Ala Pr 285 3GTA CTG GAG GGT CT 3Val Leu Gin Arg Le 300 -CTG ACC GTA GAG AT( 3Leu Thr Val Gin Mel 315 GAG TCA TTT GAT GTC Giu Ser Phe Asp Va] 330 GAG GAG CTG GCT GAG Giu His Leu Ala Giu 350 GCA GGG CGT GGC CCC Pro Aia Pro Gly Pro 365 CTT GAG AAG GAG GAG Leu Gin Asn Gin Gin 380 CTT GAG CGG GAG AGC Leu Giu Arg Asp Ser 395 CGG CAA GGG TGG ACA A-rg Gin Ala Trp, Thr 410 GAG GAG TAT GCA GAT Glu Asp Tyr Ala Asp 430 GAG ACC TCG GGG GTG His Thr Ser Giy Leu 445 CTT GTA GGC GAG GAG iu Arg Ser Ser 240 'G CGC ATA TTT Lu Arg Ile Phe 255 'T GAG ACC ATC .a Asp Thr Ile '0 'C ATG ATC GTC o Ile Ile Val C ATT GGC TCC u Ile Arg Ser 305 3 ATG GGA TAT Met Ala Tyr 320 ATT GTG GAT Sle Leu Asp 335 TAG GTG GAG LTyr Leu Giu GGA CTT CTG Gly Leu Leu GTG GTG GGG Leu Leu Gly 385 TTG ATG CCC Leu Met Pro 400 GGA TGT TCA Gly Ser Ser 415 GCG TAG GAG Ala Tyr Gin GGT AGT GGT Pro Ser Ala TCA GAA GAG 823 871 9 19 967 1015 1063 1112.
1155 1207 12S5 1303 1351 1399 1447 -176- Asn Gly His Asp Pro Gin Asp Arg Leu Leu Ala Gin Asp Ser Giu His 450 455 460 AAC CAC AGT GAC CGG AAC TGG CAG CGC AAC CGG CCT TGG CCC AAG dAT 1495 As n His Ser Asp Arg Asn Tr-p Gin Arg Asn Arg Pro Trp Pro Lys Asp 470 475 480 AGC TAC TGA CAG C CTCCTGCTGC CCTACCCTGG
CAGGCACAGG
Ser Tyr 14 CGCAGCTGGC TGGGGGGCCC ACTCCAGGCA GGGTGGCGTT AGACTGGCAT 1593 CAGGCTGGCA CTAGGCTCAG CCCCCAAAAC CCCCTGCCCA GCCCCAGCTT CAGGGCTGCC 14 TGTGGTCCCA AGGTTCTGGG AGAAACAGGG GACCCCCTCA CCTCCTGGGC AGTGACCCCT 1708 ACTAGGCTCC CATTCCAGGT ACTAGCTGTG TGTTCTGCAC CCCTGGCACC TTCCTCTCCT 1768 :**CCCACACAGG AAGCTGCCCC ACTGGGCAGT GCCCTCAGGC CAGGATCCCC TTAGCAGGGT 1828 .CCTTCCCACC AGACTCAGGG AAGGGATGCC CCATTAAAGT GACAAAAGGG TGGGTGTGGG 1888 *"*CACCATGGCA GAGA ACAAGGTCCC TGAGCAGGCA CAAGTCCTGA CAGTCAAGGG 14 ACTGCTTTGG CATCCAGGGC CTCCAGTCAC CTCACTGCCA TACATTAGA ATGAGACAT 2008 TCAAAGCCCC CCCAGGGTGG CACACCCATC TGTTGCTGGG GTGTGGCAGC CACATCCAAG 2068 ACTGAGCG CGGCGGC ACCTTGGCCAGAGAGAGC TCACAGCTGA AGCTCTTGGA 22 .GGGAAGGGCT CTCCTCACCC AATCG INFORMATION FOR SEQ ID SEQUJENCE
CHARACTERISTICS:
LENGTH: 2144 base pairs
TYPE: nucleic acid
STRANDEDNESS: single
TOPOLOGY: unknown
(ii) MOLECULE TYPE: DNA (genomic)
(ix) FEATURE:
NAME/KEY: CDS
LOCATION: 51..12492 OTHER INFORMATION: /product= "A Beta3 subunit of human calcium channel,, (ii) MOLECULTE TYPE: cDNA (xi) SEQUENCE DESCRIPTION. SEQ ID CGCCCCCGGC GCCGCTCGTT CCCCCGACCC GGACTCCCCC ATGTATGACG ACTCCTACGT GCCCGGGTTT GAGGACTCGG AGGCGGTTTC AGCCGACTCC TACACCAGCC GCCCATCTCT 120 -177-
GGACTCAG)
GGCTCAGC;
TGTCAGCT;
TGAGGCCA.P
GCTAGTGA
CCGGCTCAA
TGGCAACcG TGTTCCC CC'
GAAAGGTTA'
ATTTGATGGC
TGTGCTcCA
TGCOGAAGTC
AGTGTTGGAC
CATCATCGTC
GGGGAAGTCA
GTGCCC.AccG
GCACCTGGCT
CGGACTTCTG
GCGTGGCGAG
CGAGAGCTCC
GGACTATGCA
GCCTAGTCrCT
CCACAGTGAC
TCCTGCTGCC
GGTGGCGTTA
CCCCAGCTTC
CTCCTGGGC-A
kC GTCTCCTGG G CAGCTCGAAA LC TGTGGCGTAC -A GATTTTCTGC A GAGGGCGGGG A CAGGAGCAGA A CGCTCCCCTC 3 TATGACGTGG r GAGGTCACAG 0AGGATCTCCA 0AATCCGGGCA
CAGAGTGAGA
GCTGACACCA
TTTGTCAAAG
IJ
CAGATGAAGC;
GAGTCATTTGA
GAGTACCTGG
A
GGTCCTCCCA
G
GAGCACTCCC
C
CGCCAAGCCT
G
GATGCCTACC Al A A CC-GCA TG AC CGGAACTGGC
AC
CTACCCTGGC
AC
GACTGGCATC
AC
AGGGCTGCCT
GI
GTGACCCCTA
CI
AGGAGGACCG
GGGCCAAGCA
TGGATGAGGA
ACATTAAAGA
ACATCGCCTT
AGGCCAGGAG
CGC CAT CT CT
TGCCCTCCAT
ACATGATGCA
TCACCCGAGT
AGAGGACCAT
rCGAGCGCAT2
PCAACCACCC
.GTCCTCACC LCCTGACCGT
A
~TGTGATTCT
G
GGTTTACTG
G
TGCCATCCC
C
CCTTGAGCG
G
GACAGGATC
T
~GACCTGTA Ci CCCCCAAGA CC -CGCAACCG
GC
3GCACAGGC
G(
~GCTGGCAC T; .GGTCCCAA
GG
GGAGAGTG
CAAACCTG
GTGCCCAG'
GAAGTACAC
CATCCCCAC
ATCTGGGA7
AGCCAAGCA
GCGGCCTGT
GAAGGCTCT
CACAGCCGA
CATTGAGCG
!LTTTGAGCTC
GCACAGCTC
AAGGTACTC
~CAGATGATC
GATGAGAAC
CGGGCCACG
GGACTTCAG
GACAGCTTG
TCACAGCGT
CAGCCTCAC
:GGCTTCTA
OCTTGGCCC
~AGCTGGCT
~GCCTCAGC
CTTCTGGGA
CC CGGCGTGA TG GCATTTGCC TC CAGGGCTC' GC AATGACTGC "C CCCCAGCGC ~C CCTTCCAGC ~G AAGCAAAAG G GTGCTGGTG C TTCGACTTC C CTCTCCCTG C TCCTCTGCCC G GCCAAATCC( G GCCAAGAcCj 7 CAGCGTCTC;
GCATATGAT.P
CAGCTGGAGG
CACCACCCAG
AACCAGCAGC
ATGCCCTCTG
AGCTCCCGCC
CGCCAACACA
GCCCAGGACT
AAGGATAGCT
GGGGGGCCCA
CCCCAAAACC
GAAACAGGGG
kG TAGAGAGCCA 3G TGAGGACCA VG GAGTCAACTT~ ,T GGAT CCGGCG 7C TGGAGAGCAT -C TGAGTGACAT C AGGCGGAACA G GACCCTCTCT C TCAAACACAG 3CAAAGCGApC 0GCTCCAGCAT
TGCAGCTAGT
CGCTGGCCCC
TTCGCTCCCG
AGCTGGTTCA
ATGCCTGTGA
CCCCTGG CCC
TGCTGGGGGA
ATGAGGCCAG
ACCTGGAGGA
CCTCGGGGCT
CAGAACACAJA
ACTGACAGCC
CTCCAGGCAG
CCCTGCCCAG
ACCCCTCAC
180 240 300 360 420 480 540 G00 660 720 780 840 900 960 1020 1080 1140 1200 1260 1320 1380 1440 1500 1S60 1620 1680 1740 1800 'AGGCTCCC ATTCCAGGTA CCTGCCACCT TCCTCTCCTC CCACACAGGA AGCTGCCCCA CTGGGCACTG
CCCTCAGGCC
AGGATCCCCT TAGCAGGGTC CTTCCCACCA GACTCAGGGA AGGGATGCCC CATTAAAGTG 1860
ACAAAAGGGT GGGTGTGGGC ACCATGGCAT GAGGAAGA CAAGGTCCCT GAGCAGGCAC 1920
AAGTCCTGAC AGTCAAGGGA CTGCTTTGGC ATCCAGGGCC TCCAGTCACC TCACTGCCAT 1980
ACATTAGA TGAGACAATT CAAAGCCCCC CCAGGGTGGC ACACCCATCT GTTGCTGGGG 2040
TGTGGCAGCC ACATCCAAGA CTGGAGCAGC AGGCTGGCCA CGCTTGGGCC AGAGAGAGCT 2100
CACAGCTGAA GCTCTTGGAG GGAAGGGCTC TCCTCACCCA ATCG 2144
INFORMATION FOR SEQ ID NO:21:
SEQUENCE CHARACTERISTICS:
LENGTH: 28 base pairs
TYPE: nucleic acid
STRANDEDNESS: single
TOPOLOGY: linear
(ii) MOLECULE TYPE: Other nucleic acid; DESCRIPTION: Oligonucleotide
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:21:
CTCAGTACCA TCTCTGATAC CAGCCCCA
INFORMATION FOR SEQ ID NO:22:
SEQUENCE CHARACTERISTICS:
LENGTH: 7808 base pairs
TYPE: nucleic acid
STRANDEDNESS: double
TOPOLOGY: linear
(ii) MOLECULE TYPE: DNA (genomic)
(ix) FEATURE:
NAME/KEY: CDS
LOCATION: 237..7769
OTHER INFORMATION: /standard_name= "Alpha-1A-1"
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:22:
GATGTCCCGA GCTGCTATCC CCGGCTCGGC CCGGGCAGCC GCCTTCTGAG CCCCCGACCC
GAGGCGCCGA GCCGCCGCCG CCCGATGGGC TGGGCCGTGG AGCGTCTCCG CAGTCGTAGC
TCCAGCCGCC GCGCTCCCAG CCCCGGCAGC CTCAGCATCA GCGGCGGCGG CGGCGGCGGC
GGCGTCTTCC GCATCGTTCG CCGCAGCGTA ACCCGGAGCC CTTTGCTCTT TGCAGA
ATG GCC CGC TTC GGA GAC GAG ATG CCG GCC CGC TAC GGG GGA GGA GGC 120 180 236 284 -179- Met Ala Arg Phe TCO GGG GCA GCC Ser Gly Ala Ala GOC GGG GGC AGC Ala Gly Gly Ser AAG CAG TOA ATG Lys Gin Ser Met 50 ATO 000 GTO CGA Ile Pro Val Arg 65 TTC AGO GAA GAO Phe Ser Glu Asp TGG CCT CCC TTT Tx-p Pro Pro Phe 100 ATC GTC OTO GO-A Ile Val. Leu Ala 115 ATO TOT GAA CGG C Met Ser Glu Arg I 130 TGT TTO GAG GOT G Oys Phe Glu Ala G 145 AAA GGC TOC TAO T Lys Gly Ser Tyr L 1 GTG OTA AOG GGO A Val Leu Thr Gly 1 180 ACG OTG AGG GOA G Thr Leu Arg Ala V 195 ATO OCA AGT TTA 0 Ile Pro Ser Leu G 210 COT TTG OTG CAG Al Gly Asp GOO GGG Ala Gly CGG OAG Arg Gin GOG CAG Ala Gin OAG AAO Gin Asn 70 AAO GTG Asn Val 85 GAA TAT Giu Tyr 7TG GAG ,eu Gluc TG GATC ~eu Asp CGA ATT ly Ile L 150 TG AGGA eu Arg A 65 TO TTG G le Leu A TT OGA G~ al Arg V~ 2LA GTO G~ Ln Val V~ 2: CO GGO C1J Giu ME GTG GT Val Va
GG
Gi
AG.
Ax 5
TG
Cy
GT(
Val
ATG
M4et 7AG in
;AC
LSp .35 ,ys
AT
Sn
CG
la
TG
ai L 5
'C
O GGG y Gly 40
AGOG
9 Ala 5 C OTO S Leu 3 AGA.
-Arg
ATT
Ile
OAT
HisI 120 ACA G~ Thr G ATO A Ile I GGC T Gly T ACA G Thr V Ii OTG Cc Leu A 200 CTG A; Leu L) OTO
CT~
-t '0 Va 2
CA
Gi
CG
Ar
ACC
Th Lys Zeia
LOS
TG
.eu
;AA
.TT
le
GG
rp
TT
65 3G 15
'A
:o Ala Arg Tyr Gly Gly Gly 10 15 'G GGO AGO GGA GGC GGG OGA .1 Gly Ser Gly Giy Gly Arg 5 30 O 000 GGG GOG CAA AGG ATG n Pro Gly Ala Gin Arg Met 45 G ACC ATG GOA OTO TAO AAO 9 Thr Met Ala Leu Tyr Asn 60 G GTT AAO CGG TOT OTO TTC Val Asn A-rg Ser Leu Phe 75 TAO GOC AAA AAG ATO ACC Tyr Ala Lys Lys Ile Thr 90 95 GOC ACC ATC ATA GOG AAT I Ala Thr Ile Ile Ala AsnC 110 COT GAT GAT GAO AAG ACC C Pro Asp Asp Asp Lys Thr P 125 OCA TAO TTO ATT GGA ATT T Pro Tyr Phe Ile Gly Ile P 140 GOC OTT GGG TTT GOC TTO 0 Aia Leu Gly Phe Ala Phe H 155 1 AAT GTC ATG GAO TTT GTG G' Asn Vai Met Asp Phe Val V 170 175C GGG ACG GAG TTT GAO OTA Cc Gly Thr Giu Phe Asp Leu Aj 190 COG OTO AAG OTG GTG TOT GC Pro Leu Lys Leu Val Ser G] 205 TOG ATO ATG AAG GOG ATG AT] Ser Ile Met Lys Ala Met Il 220 TTT TTT GOA ATO OTT ATT TT Gly
GGA
Gly
TAO
Tyr
COO
Pro
TO
Leu 3AA 31 u
IGC
:ys
COG
ro
TT
he
AC
is
TG
-y
'C
3321 380 428 476 524 5 72 620 668 716 764 812 860 908 956 -180a a a. at a a a a Pro Leu Leu Gil 225 GCA ATO- ATA GG Ala Ile Ile Gi TTT GAA GAG GG Phe Giu Giu Gi 26 GGG ACA GAA GA Gly Thr Glu Gl 275 CC' TAC TGG GAX Pro Tyr Trp Gl1 290 CTG TTT GC.A GTC Leu Phe Ala Vail 305 ACT GAT CTC CTC Thr Asp Leu Let TGG TTG TAC TTC Trp, Leu Tyr Phe 340 AAC CTT GTG CTG Asn Leu Val Leu 355 CGG GTG GAG AAC Arg Val Giu Asn 370 ATT GAA CGT GAG Ile Glu Arg Giu 385 GAG GTG ATC CTC Giu Val Ile Leu TTT GAT GGA GCT Phe Asp Gly Ala 420 TTG CTC AAC CC Leu LeU Asn Pro 435 GTG GGT TCT CCC -n Ile Gly Leu Leu Leu Phe 230 ~G TTA GAA TTT TAT ATG GGA y Leu Giu Phe Tyr Met Giy 245 250 G ACA GAT GAC ATT CAG GGT y Thr Asp Asp Ile Gin Gly 0 265 G CCC GCC CGC ACC TGC CCC u Pro Ala Arg Thr Cys Pro 280 ~GGG CCC AAC AAC GGG ATC ua Gly Pro Asn Asn Giy Ile 295 3CTG ACT GTT TTC CAG TGC -Leu Thr Val Phe Gin Cys 310 TAC AAT AGC AAC GAT GCC TFyr Asn Ser Asn Asp Ala 325 330 ATC CCC CTC ATC ATC ATC C Ile Pro Leu Ile Ile Ile C 345 GGT GTG CTG TCA GGG GAG T Gly Val Leu Ser Giy Giu P 360 CGG CGG GCT TTT CTG AAG C A-rg Arg Ala Phe Leu Lys L 375 CTC AAT GGG TAC ATG GAA T Leu Asn Giy Tyr Met Giu T 390 3 GCC GAG GAT GAA ACT GAC G Ala Giu Asp Giu Thr Asp G~ 405 410 CTG CGG AGA ACC ACC ATA A) Leu Arg Arg Thr Thr Ile L~ 425 GAA GAG GCT GAG GAT CAG C1 Glu Giu Ala Giu Asp Gin LE 440 TTC GCC CGA GCC AGC ATT Az Phe Ala Ile Leu 235 AAA TTT CAT ACC Lys Phe His Thr GAG TCT CCG GCT Giu Ser Pro Ala 270 AAT GGG ACC AAA Asn Gly Thr Lys 285 ACT CAG TTC GAC Thr Gin Phe Asp 300 ATA ACC ATG GAA Ile Thr Met Giu 315 rCA GGG AAC ACT I erGly Asn Thr I GC TCC TTT TTT ;iy Ser Phe Phe Mv 350 ~TT GCC AAA GAA A he Ala Lys GiuA 365 'TG AGG CGG CAAC eu Arg Arg Gin G 380 GG ATC TCA AAA G.
rP Ile Ser Lys A 95 GG GAG CAG AGG C ly Glu Gln Arg H: 4: ~G AAA AGC AAG AC {s Lys Ser Lys T1 430 ~G GCT GAT ATA GC ~u Ala Asp Ile Ail 445 ~AGT GCC AAG CT Ile Phe 240 ACC TGC Thr Cys 255 CCA TGT Pro Cys TGT CAG Cvs Gin ALAC ATC Asn Ile GGG TGG 'iy Trp, 320 ~GG AAC .rp Asn LTG CTG let Leu .GG GAA rg Giu AA CAG ln Gin CA GAA la Glu ~T CCC is Pro 'A GAT ~r Asp C TCT .a Ser G
GAG
1004 1052 12.00 1148 1196 1244 1292 1340 1388 1436 1484 1532 1580 1628 -181- Val
AAC
Asn 465
CGC
Arg
GTA
Val
GAG
Giu Gly Ser 450 TCG ACC Ser Thr CGC ATG Arg met Pro
TTT
Phe
GTC
Val
AAC
Asn 500
TCC
Phe
TTT
Phe
AAA
Lys 485
ACG
Thr
GAC
Al a
CAC
His 470
ACT
Thr
CTG
Leu Arg 455
AAA
Lys
CAG
Gin
TGT
Cys Al a
AAG
Lys
GCC
Al a
GTT
Val
TAC
Tyr 520 Ser
GAG
Glu
TTC
Phe Ile
AGG
Arg
TAC
Tyr 490 Lys Ser Ala Lys Leu Glu 460 AGG ATG CGT TTC TAC ATC Arg Met Arg Phe Tyr Ile 475 480 TGG ACT GTA CTC AGT TTG Trp, Thr Val Leu Ser Leu 495 a a.
a GCT ATT GTT CAC Ala Ile Val His 505 TAT GCA GAA TTC Tyr Ala Giu Phe
TAC
ATT
Ile 525 AAC GAG Asn Gin Si0 TTC TTA Phe Leu Leu Ser A.sp Phe Leu CTC TTT ATG TCC GAz Leu
CCT
Pro 545
GGG
Giy
TTT
Phe
GTC
Val
AAC
Asn
ATT
Ile 625 Phe 530
TAC
Tyr
AGC
Ser
GGA
Giy
ACA
Thr
TCC
Ser 610
GTC
Met
TTC
Phe
ATC
Ile
ATC
Ile
AAG
Lys 595
ATG
Met
GTC
Ser
CAC
His
TTC
Phe
AGC
Ser 580
TAC
Tyr
AAG
Lys
TTC
GiU
TCT
Ser
GAG
Giu 565
GTG
Val
TGG
Trp
TCC
Ser ATG TTT Met Phe 535 TCC TTC Ser Phe 550 GTC ATC Vai Ile TTA CGA Leu Arg GCA TCT Ala Ser ATC ATC Ile Ile 615 CTT TTG Leu Leu 630 ACT CCT Thr Pro TTT GAG Phe Gin ATA AAA ATG TAC Ile Lys Met Tyr
GGG
Gly CTT GGG ACG Leu Giy Thr AAC TGC TTT GAC TGT GGG Asn Cys Phe TGG GCT GTC Trp
GCC
Ala
CTC
Leu 600
AGC
Ser
GGA
Gly
CCC
Pro
ATC
Ile Al a
CTC
Leu 585
AGA
Arg
CTG
Leu
ATG
Met
ACC
Thr Val 570
AGG
Arg
AAC
Asn
TTG
Leu
CAA
Gin
AAC
Asn 650 Asp Cys 555 ATA AAA Ile Lys TTA TTG Leu Leu CTG GTC Leu Val TTT CTC Phe Leu 620 CTC TTC Leu Phe 635 Giy
CCT
Pro
CGT
Arg
GTC
Vai
CTT
Leu
GGC
Gly
ATC
Ile
ACA
Thr 575
TTC
Phe
CCC
Pro
GGA
Giy
CGG
Axg
ATT
Ile 560
TCC
Ser
AAA
Lys 1676 1724 1772 1820 1868 1916 1964 2012 2060 2108 2156 2204 2252 Ser Leu Leu
CTG
Leu
GAG
Gin
TTC
Phe
TTT
Phe 640 Vai Val Phe Ala A-AT TTC GAT GAA A-sn Phe Asp Giu GGA ATA ATG ACG Ala Ile Met Thr 660 TTC GAT ACT CCA GCA Phe Asp Thr Phe Pro Ala GGC GAA GAc 655 Leu Thr Gly Giu Asp 665 TGG AAC GAG Trp, Asn Giu GTC ATG TAC GAC GGG ATC AAG TCT GAG GGG GGC GTG GAG GGC GGC ATG 2300 -182- Val Met Tyr Asp Gly Ile Lys Ser Gin Gly Gly Val Gin Gly Gly Met 685 675 680 685 GTG TTC TC ATc TAT TTC ATT GTA CTG AG CTC TTT GGG AAC TAC ACC 2348 Vai Phe Ser Ile Tyr Phe le Vai Leu hr Leu Phe Giy Asn Tyr Thr 690 695 700 CT CTG AAT GTG TTC TTG GCC AT GT GTG GAC AAT CTG GCC AAC GCC 2396 Leu Leu Asn Val Phe Leu Ala lie Ala Val Asp Asn Leu Ala Asn Ala 705 710 715 720 AG GAG CTC ACC AAG GTG GAG GG GA GAG AA GAG GAA GAA GAA GCA 2444 Gin Gu Leu Thr Lys Vai Gu Ala Asp Giu Gin Gu Gu Giu Gu Aa 725 730 735 5760 76 ACAG GAG CAG CGG ACC AGT GG 2588 Lys Asn Gln ys Pro Ala Ls Ser Val Trp Glu Gin Arg Thr Ser Glu 77 8 0 GCG AAG GAG GTT CTG G AAA GGG AG G G GTG GCA AA GTG 232 Gin0 MA Ls i Asn Lys Leu Ala eu Gin Lys Ai a L u Vai A Gua 785 790 740 7 ATG CA C 278 CGAG GA CCG TGAC GAG CGC TGG AG TCT GC TAC AC G CGG CAC CTG CG GG 268 Met Asp Pro Asp Ala Arg Trp Lys Al i Ala Tyr Thr Arg His Leu ArG 805 81081 CCA GA ~C AAG CG GACG AAG GGC GG GG CGG GG GGG ACC AG 27 AAG AATC GAAC I CLCO Cr E GCGGGCTT T G 2588 PLyo sp n Lys rol Lys Vai Trp GHu Gin Arg r V a sr Gin 770 775 8378 GAG C CGC AAC AC AAGTTG GG GGG AGC CGG GCG GG CC GG TT GAGA 2763 Gl AGi Arg Lu As L AL Ser Arg GiuAla Ale TPr Asn Giu 8 78 790 79 ATG GAG GGG AG G C GGG TGG AAG GGGG GGG GA CT GG AG 284 et Asp Pro As i A Trp L y Gin GAl Ala TrT AS g Has Leu Ary 9**95 CGG GAG GC AC CAC GAG TG GAG GGG G TG GG TG GG GGC 2872 Pro Asp Met Tyrs ThAs Leu Asp Arg ApPro Le Vai Vai APr Gin 820 825 830 88 CGGAAG GGG AG AG AAG AGG AG GA AGG GG GG GC GAG CG AGC 2780 Giu Asp Aa Asn Asn Asn Tr Asn Lys Ser Gn Ala Ala Giu Pro Thr 835 840 845 GAG GGG GGG TAG GAG GAT GGG GGG CGG GAG CGC CAGC GGC TGG GAG GGC 2876 -183- Ar;
AGC
Ser
GCC
Ala
GAG
Glu 945 Glu Gly Pro 900 CTG GAG CAA Leu Glu Gln 915 GGG GAC CCC Gly Asp Pro 930 AGC CGC AGC Ser Arg Ser Tyr
CCC
Pro
CAC
His
GGG
Gly Gly
GGG
Gly
CGG
Arg
TCC
Ser Arg
TTC
Phe
AGG
Arg 935
CCG
Pro Glu
TGG
Trp 920
CAC
His
CGC
Arg Ser 905
GAG
Glu
GTG
Val
ACG
Thr Asp His GGC GAG Gly Glu CAC CGG His Arg GGC GCG Gly Ala His
GCC
Ala
CAG
Gln 940
GAC
Asp Ala
GAG
Glu 925
GGG
Gly
GGG
Gly Arg Glu Gly 910 CGA GGC AAG Arg Gly Lys GGC AGC AGG Gly Ser Arg GAG CAT CGA Glu His Arg 960 3020 3068
950 955 CGT CAT CGC Arg His Arg GCG GAG CGG Ala Glu Arg GGC GAG GGC Gly Glu Gly 995 CAC CGG CAT His Arg His 1010 GAC AAG GAG Asp Lys Glu 1025 GTC CCT GTG Val Pro Val GAC CTG GGC Asp Leu Gly GCG CAC CGC AGG CCC GGG Ala
AGG
Arg 980
GAG
Glu
GGC
Gly
CGG
Arg
TCG
Ser
CGC
Arg His Arg Arg Pro Gly 965 GCG CGG CAC CGC GAG Ala Arg His Arg Glu 985 GGC GAG GGC CCC GAC Gly Glu Gly Pro Asp 1000 GCT CCA GCC ACG TAC Ala Pro Ala Thr Tyr 1015 AGG CAT CGG AGG AGG Arg His Arg Arg Arg 1030 GGC CCC AAC CTG TCA Gly Pro Asn Leu Ser 1045 CAA GAC CCA CCC CTG Gln Asp Pro Pro Leu GAG GAG GGT CCG GAG GAC AAG Glu Glu Gly Pro Glu Asp Lys 970 975 GGC AGC CGG CCC GCC CGG GGC Gly Ser Arg Pro Ala Arg Gly 990 GGG GGC GAG CGC AGG AGA AGG Gly Gly Glu Arg Arg Arg Arg 1005 GAG GGG GAC GCG CGG AGG GAG Glu Gly Asp Ala Arg Arg Glu 1020 AAA GAG AAC CAG GGC TCC GGG Lys Glu Asn Gln Gly Ser Gly 1035 1040 ACC ACC CGG CCA ATC CAG CAC Thr Thr Arg Pro Ile Gln Gln 1050 1055 GCA GAG GAT ATT GAC AAC ATC Ala Glu Asp Ile Asp Asn Met 1070 3164 3212 3260 3308 3356 3404 3452 3500 3548 3596 3644 1060 AAG AAC AAC AAG Lys Asn Asn Lys 1075 CTT GGC CAC GCC Leu Gly His Ala 1090 ACC GAC CCC GGC Thr Asp Pro Gly 1105 CAG AAC GCC GCC GCC ACC GCG GAG TCG Ala Thr Ala Glu Ser 1080 CTG CCC CAG AGC CCA Leu Pro Gln Ser Pro 1095 ATG CTC GCC ATC CCT Met Leu Ala Ile Pro 1110 CGC CGG ACG CCC AAC GCC GCT CCC CAC Ala Ala Pro His 1085 GCC AAG ATG GGA Ala Lys Met Gly 1100 CCC ATG GCC ACC Ala Met Ala Thr 1115 AAC CCC GGG AAC
AGC
Ser
AGC
Ser
CCC
Pro 1120
TCC
-184- G'n Asn Ala Ala Ser Arg Arg Thr Pro Asn Asn Pro Gly Asn Pro Ser 1125 1130o13 AAT CCC GGC CC C CACCCC GAG A-AT AGC CTT ATC GTC ACC AAC 39 AnPro Giy Pro Pro LYS Thr Pro Glu Asn SerLeleVaThAs 11 01145 1150 CCC AGC CCC ACC CAG ACC AAT TCA GCT A-AG ACT GCC AGG AAA CCC GAC 3740 Pro Ser Gly Thr Gin Thr Asn Ser Ala Lys Thr Ala Arg Lys Pro Asp 1155 1160 1165 CAC ACC ACA T GAC ATC CCC CCA GCC TGC CCA CCC CCC CTC AAC CAC 2788 His Thr Thr Val Asp Ile Pro Pro Ala Cys Pro Pro Pro Leu A-sn His 1170 1175 1180 .*ACC GTC GTA CAA GTG A-AC AA-A AAC CCC AAC CCA GAC CCA CTG CCA A-AA 3836 Thr Val Vai Gin Val Asn Lys Asn Ala Asn Pro Asp Pro Leu Pro Lys 11519 1195 1200 AAA GA G A AA- AG GAG GAG GAG GA AC GAC CT G A GAC 3884 Lys Giu Giu Giu Lys Lys Giu Giu Glu Giu Asp Asp Arg Gly Giu Asp *CS1205 1210 11 GG CTAGCCA ATG CCT CCC TAT A-CC TCC ATG TTC ATC CTG TCC ACC 3932 Gly Pro Lys Pro met Pro Pro Tyr Ser Ser Met Phe Ile Leu Ser Thr 1220 1225 -1230 :A-CC AAC CCC CTT CCC CCC CTG TGC CAT TAC ATC CTG AAC CTG CCC TA-C 3980 Thr A-sn Pro Leu A-rg A-rg Leu Cys His Tyr Ile Leu Asra Leu A-rg Tyr 1235 1240 14 TTT GAG ATC TGC ATC CTC ATC CTC ATT CCC ATG ACC A-CC ATC CCC CTC 4028 Phe Giu Met Cys Ile Leu Met Val Ile Ala Met Ser Ser Ile Aia Leu 1201255 1260 CC CC GAG CAC CCT GTG CA-C CCC A-AC CCA CTCGACACGGCG 47 Ala Ala Ciu Asp Pro Val Gin Pro Asn Ala Pro Arg A-sn Asn Val Leu 1265 1270 1275 1280 CCA TAC TTT CAC TAC GTT TTT A-CA CCC GTC TTC ACC TTT GAG ATC GTC 4124 Arg Tyr Phe Asp Tyr Vai Phe Thr Cly Val Phe Thr Phe Ciu Met Val 1285 1290 1295 A-TC AAC ATC ATT CAC CTG CCC CTC CTC CTG CAT CAC CCT CCC TAC TTC 4172 Ile Lys Met Ile Asp Leu Gly Leu Val Leu His Gin Cly Aia Tyr Phe 1300 1305 1310 CCT GAC CTC TGG A-AT ATT CTC GA-C TTC ATA CTC CTC ACT CCC CCC CTG 4220 Arg Asp Leu Trp Asn Ile Leu Asp Phe Ile Vai Vai Ser Gly Ala Leu 1315 1320 1325 CTA CCC TTT CCC TTC ACT CCC AAT ACC AA-A CGA AA-A CAC ATC A-AC ACC 4268 Val Ala Phe Ala Phe Thr Cly Asn Ser Lys Gly Lys Asp Ile A-sn Thr 1330 1335 1340 ATT A-A-A TCC CTC 
CGA GTC CTC CCC GTG CTA CCA CCT CTT AAA ACC ATC 4316 -185- Ile Lys Ser Leu Arg Val Leu Arg Val Leu Arg Pro Leu Lys Thr Ile 134S 1350 1355 1360 AAG CGG CTG CCA AAG CTC AAG GCT GTG TTT GAC TGT GTG GTG AAC TCA 4364 Lys A-rg Leu Pro Lys Leu Lys Ala Val. Phe Asp Cys Val Val Asn Ser 1365 1370 1375 CTT AAA AAC GTC TTC AAC ATC CTC ATC GTC TAC ATG CTA TTC ATG TTC 44!2 Leu Lys Asn Val Phe Asn Ile Leu Ile Val Tyr Met Leu Phe Met Phe 1380 1385 1390 ATC TTC GCC GTG GTG GCT GTG CAG CTC TTC AAG GGG AAA TTC TTC CAC 4460 Ile Phe Ala Val Val Ala Val Gin Leu Phe Lys Gly Lys Phe Phe His 1395 1400 1405 :TGC ACT GAC GAG TCC AA GAG TTT GAG AAA GAT TGT CGA GGC AAA TAC 4508 *Cys Thr Asp Gu Ser Lys Giu Phe Glu Lys Asp Cys Arg Gly Lys Tyr 5.1410 1415 1420 CTC CTC TAC GAG AAG AAT GAG GTG AAG GCG CGA GAC CGG GAG TGG AAG 4556 Leu Leu Tyr Glu Lys Asn Giu Val Lys Ala Arg Asp Arg Glu Trp Lys *1425 1430 1435 1440 **AAG TAT GAA TTC CAT TAC GAC AAT GTG CTG TGG GCT CTG CTG ACC CTC 4604 Lys Tyr Giu Phe His Tyr Asp Asn Val Leu Trp Ala Leu Leu Thr Leu 1445 1450 1455 5TTC ACC GTG TCC ACG GGA GAA GGC TGG CCA CAG GTC CTC AAG CAT TCG 4652 *.SPhe Thr Val Ser Thr Gly Giu Gly Trp Pro Gin Val Leu Lys His Ser 1460 1465 1470 GTG GAC GCC ACC TTT GAG AAC CAG GGC CCC AGC CCC GGG TAC CGC ATG 4700 Val Asp Aia Thr Phe Giu Asn Gin Gly Pro Ser Pro Gly Tyr Arg Met .*1475 1480 1485 *GAG ATG TCC ATT TTC TAC GTC GTC TAC TTT GTG GTG TTC CCC TTC TTC 4748 Giu Met Ser Ile Phe Tyr Val Val Tyr Phe Va1 Val Phe Pro Phe Phe 1490 1495 1500 TTT GTC AAT ATC TTT GTG GCC TTG ATC ATC ATC ACC TTC CAG GAG CAA 4796 Phe Val Asn Ile Phe Val Ala Leu Ile Ile Ile Thr Phe Gin Giu Gin 1500 1515Isi 1520 GGG GAC AAG ATG ATG GAG GAA TAC AGC CTG GAG AAA AAT GAG AGG GCC 4844 Gly Asp Lys Met Met Giu Glu Tyr Ser Leu Giu Lys Asn Glu Arg Al a 1525 1530 1535 TGC ATT GAT TTC GCC ATC AGC GCC AAG CCG CTG ACC CGA CAC ATG CCG 4892 Cys Ile Asp Phe Ala Ile Ser Aia Lys Pro Leu Thr Arg His Met Pro 1540 1545 1550 CAG AAC AAG CAG AGC TTC CAG TAC CGC ATG TGG CAG TTC GTG 
GTG TCT 4940 Gin Asn Lys Gin Ser Phe Gin Tyr Arg Met Trp Gin Phe Val Val Ser 1555 1560 1565 CCG CCT TTC GAG TAC ACG ATC ATG GCC ATG ATC GCC CTC AAC ACC ATC 4988 -186- Pro Pro Phe Glu Tyr Thr Ile Met Ala Met Ile Ala Leu Asn Thr Ile 1501575 1580 GTG CTT ATG ATG AAG TTC TAT G GCT TCT GTT GCT TAT GAA AAT GCC 53 Val Leu Met Met Lys Phe Tyr Gly Ala Ser Val Ala Pyr Glu Asn AlJa 1585 1590 15510 CTG CGG GTG TTC AAC ATC GTC TTC ACC TCC CTC TTC TCT CTG GAA TGT 5084 Leu Arg Val Phe Asn Ile Val Phe Thr Ser Leu Phe Ser Leu Glu Cys 1605 1610 1615 GT TGAAGTC ATG GCT TTT GGG ATT CTG AAT TAT TTC CGC GAT CCC 5132 Val Leu Lys VlMtAaPeGyIeLusnTyr Phe Arg Asp Ala 1620 1625 1630 TGG AAC ATC TTC GAC TTT GTC ACT GTT CTC GGC ACC ATC ACC GTTC5180 Asn IePhe ApPhe Val TrVal Le l le TrAspIl 1635 1640 1645 CTC GTG ACT GAG TTT GGG AAT CCC AAT AAC TTC ATC AAC CTG AGC TTT 5228 Leu Val Thr Glu Phe Gly Asn Pro Asn Asn Phe Ile Asn Leu Ser Phe *1650 1655 1660 CTC CGC CTC TTC CGA GCT GCC CGG CTC ATC AAA CTT CTC CCT CAG GGT 5276 Leu Arg Leu Phe Arg Ala Ala Arg Leu Ile Lys Leu Leu Arg Gin Giy 1665 1670 1-675 1680 .TAC ACC ATC CGC ATT CTT CTC TGG ACC TTT GTG CAG TCC TTC AAG GCC 5324 a..Tyr Thr Ile Arg Ile Leu Leu Trp Thr Phe Val Gin Ser Phe Lys A-la p..1685 1690 19 CTG CCT TAT GTC TGT CTG CTG ATC GCC ATC CTC TTC TTC ATC TAT CCC 5372 :::Leu PoTyr Val Cys Leu Leu Ile Ala Met Leu Phe Phe Ile Tyr Ala 1700 1705 1710 ATC ATT GGC ATC CAG GTG TTT GGT AAC ATT GGC ATC GAC GTG GAG CAC 5420 le Ile Cly Met Gin Val Phe Gly Asn Ile Gly Ile Asp Val Ciu Asp 1715 1720 1725 GAG GAC AGT CAT CAA CAT GAG TTC CAA ATC ACT GAG CAC AAT AAC TTC 5468 Glu Asp Ser Asp Ciu Asp Giu Phe Gin Ile Thr Ciu His Asn Asn Phe 1730 1735 174 CCC ACC TTC TTC CAG CCC CTC ATC CTT CTC TTC CCC ACT CCC ACC CCC 5516 Arg Thr Phe Phe Gin Ala Leu Met Leu Leu Phe Arg Ser Ala Thr Cly 1745 1750 1755 1760 CAA CCT TCC CAC AAC ATC ATC CTT TCC TCC CTC ACC CCC AAA CCC TGT 5564 Clu Ala Trp, His Asn Ile Met Leu Ser Cys Leu Ser Gly Lys Pro Cys 1765 1770 1775 CAT AAC 
AAC TCT CCC ATC CTG ACT CCA GAG TCT CCC AAT CAA TTT GCT 5612 Asp Lys Asn Ser Gly Ile Leu Thr Arg Ciu Cys Gly Asn Clu Phe Ala 1780 1785 1790 TAT TTT TAC TTT CTT TCC TTC ATC TTC CTC TCC TCC TTT CTG ATC CTG 5660 -187- Tyr Phe Tyr Phe Val Ser Phe Ile Phe Leu Cys Ser Phe Leu Met Leu 1795 1800 1805 A-AT CTC TTT GTC GCC CTC ATC ATG GAC AAC TTT GAG TAC CTC ACC CGA S708 Asn Leu Phe Val Ala Val Ile Met Asp Asn Phe Giu Tyr Leu Thr Arg 1810 1815 1820 CAC TCC TCC ATC CTC CCC CCC CAC CAC CTC GAT GAG TAC GTG CCT GTC S556 Asp Ser Ser Ile Leu Gly Pro His His Leu Asp Giu Tyr Val Arg Val 1825 1830 1835 1840 0:9TGG GCC GAG TAT CAC CCC GCA GCT TCG CCC CCC ATC CCT TAC CTG GAC 5804 *00. Trp Ala Giu Tyr Asp Pro Ala Ala Trp Gly Arg Met Pro Tyr Leu Asp CiC1845 1850 1855 *CCATG TAT CAC ATG CTG AGA CAC ATC TCT CCC CCC CTC GCT CTC CCC AAG 5852 Mez Tyr Gin Met Leu Ar; His Met Ser Pro Pro Leu Gly Leu Gly Lys '1860 1865 1870 V0096 *CCAAC TCT CCC CCC AGA CTG CCT TAC AAG CCC CTT CTC CCC ATG CAC CTG 5900 Lys Cys Pro Ala Arg Vai Ala Tyr Lys Arg Leu Leu Arg Met Asp Leu 1875 1880 1885 C.CCCC GTC CCA CAT CAC AAC ACC CTC CAC TTC AAT TCC ACC CTC ATC CCT 5948 *Pro Val A-la Asp Asp Asn Thr Val His Phe Asn Ser Thr Leu Met Ala 1890 1895 1900 CTG ATC CCC ACA CCC CTC CAC ATC AAC ATT CCC AAG GGA GGA CCC CAC 5996 SeLeu Ile Arg Thr Ala Leu Asp Ile Lys Ile Ala Lys Cly Gly Ala Asp se*1905 1910 1915 1920 AAA CAG CAG ATC CAC CCT GAG CTC CCC AAG GAG ATG ATG CC ATT TCG 6044 Lys Gin Gin Met Asp Ala Clu Leu Arg Lys Ciu Met Met Ala Ile Trp 1925 1930 1935 eas CCC AAT CTG TCC CAG AAC ACG CTA CAC CTC CTG CTC ACA CCT CAC AAC 6092 Pro Asn Leu Ser Gin Lys Thr Leu Asp Leu Leu Vai Thr Pro His Lys 1940 1945 1950 TCC ACC CAC CTC ACC GTG CCC AAC ATC TAC CCA CCC ATG ATG ATC ATG 6140 Ser Thr Asp Leu Thr Val Gly Lys Ile Tyr Ala Ala Met Met Ile Met 1955 1960 1965 GAG TA TAC CCC CAG AGC AAG CC AAC AAG CTG CAC CCC ATC CCC GAG 6188 Ciu Tyr Tyr Ar; Gin Ser Lys Ala Lys Lys Leu Gin Ala Met Arg Ciu 1970 1975 1980 GAG CAC GAC CCC ACA CCC CTC ATG 
TTC CAG CCC ATG GAG CCC CCC TCC 6236 Glu Gin Asp Ar; Thr Pro Leu Met Phe Gin Ar; Met Giu Pro Pro Ser 1985 1990 1995 2000 CCA ACC CAC GAA CCC CGA CCT CCC CAG AAC CCC CTC CCC TCC ACC CAG 6284 Pro Thr Gin Giu Cly Cly Pro Gly Gin Asn Ala Leu Pro Ser Thr Gin 2005 2010 2015 CTC GAC CCA CCA GGA CCC CTG ATC GCT CAC GAA AGC CCC CTC AAC GAG 6332 -188- Leu Asp Pro Gly Gly Ala Leu Met Ala His Glu Ser Gly Leu Lys lu 2020 2025 2030 AGC CCG TCC TGG GTG ACC CAG CGT GCC CAG GAG ATG TTC CAG AAG ACG 6380 Ser Pro Ser Trp Val Thr Gin Arg Ala Gin Giu Met Phe Gin Lys Thr 2035 2040 2045 GGC ACA TGG AGT CCG GAA CAA GGC CCC CCT ACC GAC ATG CCC AAC AGC 6428 Gly Thr Trp Ser Pro Glu Gin Gly Pro Pro Thr Asp Met Pro Asn Ser 2050 2055 2060 CAG CCT AAo TCT CAG TCC GTG GAG ATG CGA GAG ATG GGC AGA GAT GGC 6476 Gin Pro Asn Ser Gln Ser Val Giu Met Arg Glu Met Gly Arg Asp Gly 2065 2070 2075 2080 TAC TCC GAC AGC GAG CAC TAC CTC CCC ATG GAA GGC CAG GGC CG GCT 6524 Tyr Ser Asp Ser Giu His Tyr Leu Pro Met Giu Cly Gin Gly Arg Ala 2085 2090 2095 GCC TCC ATC CCC CGC CTC CCT GCA GAG AAC CAG AGC AGA AGG CGC CGG 6572 Ala Ser Met Pro Arg Leu Pro Ala Glu Asn Gin Arg Arg Arg Gy Arg 2100 2105 2110 CCA CGT CGG AAT AAC CTC ACT ACC ATC TCA GAC ACC AGC CCC ATG AAG 6620 Pro Arg ly Asn As20 Leu Ser Thr Ile Ser Asp Thr Ser Pro Met Lys 21i5 2120 -2125 CGT TCA CCC TCC GTG CTC GGC CCC AAG CCC CGA CCC CTG GAC CAT TAC 6668 Arg Ser Ala Ser Val Leu Gly Pro Lys Ala Arg Arg Leu Asp Asp Tyr 2130 2135 2140 TCG CTC GAG CGG GTC CCC CCC GAG GAG AAC CAG CGG CAC CAC CAG CGG 6716 Ser Leu Giu Arg Vai Pro Pro Giu Glu Asn Gin Arg His His Gin Arg *2145 2150 2155 26 2155 2160 CGC CC GAC CC AC CAC CC GCC TCT GAG CC TCC CTG GGC CC TAC 6764 Arg Arg Asp Arg Ser His Arg Ala Ser lu Arg Ser Leu Gly Arg Tyr *2165 2170 2175 2175 6812 ACC AT GTC GAC ACA GGC TTG GGG ACA AC CTG AGC ATG ACC ACC CAA 6812 Thr Asp Val Asp Thr Cly Leu Gly Thr Asp Leu Ser Met Thr Thr Gin 2180 2185 2190 TCC GGG GAC CTG CCG TCG AAG GAG CGG GAC CAC GAG CGG GGC CGG CCC 6860 Ser Gly 
Asp Leu Pro Ser Lys Glu Arg Asp Gin Gu Arg ly Arg Pro 2195 2200 2205 AAG AT CGG AAG CAT CGA CAG CAC CAC CAC CAC CAC CAC CAC CAC CAC 6908 Lys Asp Arg Lys His Arg Gin His His His His His His His His His 2210 2215 2220 CAT CCC CCC CCC CCC AC AAG GAC CC TAT CC CAC GAA CGG CCC GAO 6956 His Pro Pro Pro Pro Asp Lys Asp Arg Tyr Ala Gin Glu Arg Pro Asp 2225 2230 2235 2240 CAC GGC CGG GCA CGG GCT CGG GAC CAC CC TGG TCC CC TG CCC AGO 7004 -189- His Gly Arg Ala Arg Ala Arg 2245 GAG GGC CGA GAG CAC ATG GCG Glu Gly Arg Glu His Met Ala 2260 GGA AGC CCA GCC CCC TCA ACA Gly Ser Pro Ala Pro Ser Thr 2275 CGC CGC CAG CTC CCC CAG ACC Arg Arg Gin Leu Pro Gin Thr 2290 2295 TAT TCC CCT GTG ATC CGT AAG Tyr Ser Pro Val Ile Arg Lys 2305 2310 CAG CAG CAG CAG CAG CAG CAG Gin Gin Gin Gin Gin Gin Gin 2325 GCG GCC ACC AGC GGC CCT CGG Ala Ala Thr Ser Gly Pro Arg 2340 Asp Gin Arg Trp Ser Arg Ser Pro Ser 2250 2255 CAC CGG CAG GGC AGT AGT TCC GTA AGT His Arg Gin Gly Ser Ser Ser Val Ser 2265 2270 TCT GGT ACC AGC ACT CCG CGG CGG GGC Ser Gly Thr Ser Thr Pro Arg Arg Gly 2280 2285 CCC TCC ACC CCC CGG CCA CAC GTG TCC Pro Ser Thr Pro Arg Pro His Val Ser 2300 GCC GGC GGC TCG GGG CCC CCG CAG CAG Ala Gly Gly Ser Gly Pro Pro Gin Gin 2315 2320 CAG CAG GCG GTG GCC AGG CCG GGC CGG Gin Gin Ala Val Ala Arg Pro Gly Arg 2330 2335 AGG TAC CCA GGC CCC ACG GCC GAG CCT Arg Tyr Pro Gly Pro Thr Ala Glu Pro 2345 2350 a CTG GCC GGA GAT CGG CCG CCC ACG GGG GGC CAC AGC AGC GGC CGC TCG Leu Ala Gly Asp Arg Pro Pro Thr Gly Gly His Ser Ser Gly Arg Ser 2355 2360 2365 CCC AGG ATG GAG AGG CGG GTC CCA GGC CCG GCC CGG AGC GAG TCC CCC Pro Arg Met Glu Arg Arg Val Pro Gly Pro Ala Arg Ser Glu Ser Pro 2370 2375 2380 AGG GCC TGT CGA CAC GGC GGG GCC CGG TGG CCG GCA TCT GGC CCG CAC Arg Ala Cys Arg His Gly Gly Ala Arg Trp Pro Ala Ser Gly Pro His 2385 2390 2395 2400 GTG TCC GAG GGG CCC CCG GGT CCC CGG CAC CAT GGC TAC TAC CGG GGC Val Ser Glu Gly Pro Pro Gly Pro Arg His His Gly Tyr Tyr Arg Gly 2405 2410 2415 TCC GAC TAC GAC GAG GCC GAT GGC CCG GGC 
AGC GGG GGC GGC GAG GAG Ser Asp Tyr Asp Glu Ala Asp Gly Pro Gly Ser Gly Gly Gly Glu Glu 2420 2425 2430 GCC ATG GCC GGG GCC TAC GAC GCG CCA CCC CCC GTA CGA CAC GCG TCC Ala Met Ala Gly Ala Tyr Asp Ala Pro Pro Pro Val Arg His Ala Ser 2435 2440 2445 TCG GGC GCC ACC GGG CGC TCG CCC AGG ACT CCC CGG GCC TCG GGC CCG Ser Gly Ala Thr G l y Arg Ser Pro Arg Thr Pro Arg Ala Ser Gly Pro 2450 2455 2460 GCC TGC GCC TCG CCT TCT CGG CAC GGC CGG CGA CTC CCC AAC GGC TAC 7052 7100 7148 7196 7244 7292 7340 7388 7436 7484 7532 7580 7628 7676 -190-
Ala Cys Ala Ser Pro Ser Arg His Gly Arg Arg Leu Pro Asn Gly Tyr 2465 2470 2475 2480 TAC CCG GCC CAC GGA CTG GCC AGC CCC CGC GGG CCG GGC TCC AGG AAC Tyr Pro Ala His Gly Leu Ala Arg Pro Arg Gly Pro Gly Ser Arg Lys 2485 2490 2495 GGC CTG CAC GAA CCC TAC AGC GAG AGT GAC GAT GAT TCC TCC TAACCCCGCC Gly Leu His Glu Pro Tyr Ser Glu Ser Asp Asp Asp Trp Cys 2500 2505 2510 CGAGGTGGCG CCCGCCCGGC CCCCCACGCA
CC
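The feature table for SEQ ID NO:23 gives a sequence length of 7791 base pairs and a CDS location of 237..7037. As a consistency check on those figures, the following minimal Python sketch derives the coding-region length, the codon count and the flanking untranslated lengths. The function name `cds_summary` and its dictionary keys are hypothetical names chosen for illustration only, and the sketch assumes 1-based inclusive coordinates and that the final codon of the CDS is the stop codon.

```python
def cds_summary(start: int, end: int, total_length: int) -> dict:
    """Summarize a CDS feature given 1-based, inclusive coordinates.

    Assumption (not stated in the listing): the CDS range includes the
    terminal stop codon, so the residue count is one less than the codon count.
    """
    if not (1 <= start <= end <= total_length):
        raise ValueError("CDS coordinates fall outside the sequence")

    cds_length = end - start + 1
    if cds_length % 3 != 0:
        raise ValueError("CDS length is not a whole number of codons")

    codons = cds_length // 3
    return {
        "cds_length": cds_length,               # nucleotides in the coding region
        "codons": codons,                       # triplets, stop codon included
        "residues": codons - 1,                 # encoded amino acids
        "five_prime_flank": start - 1,          # nucleotides before the CDS
        "three_prime_flank": total_length - end # nucleotides after the CDS
    }


# Values taken from the SEQ ID NO:23 feature table.
summary = cds_summary(start=237, end=7037, total_length=7791)
for key, value in summary.items():
    print(f"{key}: {value}")
```

Under these assumptions the coding region spans 6801 nucleotides, or 2267 codons, which agrees with the residue numbering of the translation ending at position 2266 followed by a stop codon.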
INFORMATION FOR SEQ ID NO:23:
(i) SEQUENCE CHARACTERISTICS:
LENGTH: 7791 base pairs
TYPE: nucleic acid
STRANDEDNESS: double
TOPOLOGY: linear
(ii) MOLECULE TYPE: DNA (genomic)
(ix) FEATURE:
NAME/KEY: CDS
LOCATION: 237..7037
OTHER INFORMATION: /standard_name= "Alpha-1A-2"
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:23:
GATCTCCCCA CCTGCTATCC CCCGCTCCCC CCCCCCACCC GCCTTCTCAG
CCCCCCACCC
GAGCCCCA CCCCGCCC CCCCATCCCC TCGCCCTCC AGCTCTCCG
CACTCCTAC
TCCAGCCGCC GCCCTCCCAG CCCCCGCAGC CTCAGCATCA GCGCGCCC CGCGC
GC
CGCGTCTTCC CCATCGTTCC CCCCAGCCTA ACCCCCACCC CTTTGCTCTT
TGCACA
ATC CCC CCC TTC CCA CAC GAG ATC CCC CCC CCC TAC CCC CCA GCA GC Met Ala Arg Phe Cly Asp Clu Met Pro Ala Arg Tyr Cly Gly Cly Cly TCC CCC CCA CCC CCC CCC CTC CTC CTC CCC ACC CCA CCC CCC CGA CCA Ser Cly Ala Ala Ala Cly Val Val Val Cly Ser Cly Cly Cly Arg Cly 20 25 CCC CCC CCC ACC CCC CAC CCC CCC CAC CCC CCC CC CAA ACC ATG TAC Ala Cly Cly Ser Arg Cln- Cly Cly Cln Pro Cly Ala Gln Arg Met Tyr 40 AAG CAC TCA ATC CC CAC AGA CC CCC ACC ATC GCA CTC TAC AAC CCC Lys Cln Ser Met A-la Cln Arg Ala Arg Thr Met Ala Leu Tyr Asn Pro 55 7724 7776 7808 120 180 236 284 332 380 428 -191- S. 5.
a 55.5 ATC CCC Ile Pro TTC AGC Phe Ser TGG CCT Trp Pro ATC GTC Ile Vai ATC TCT Met Ser 130 TGT TTC Cys Phe 145 AAA GGC Lys Gly GTG CTA Val Leu ACG CTG A Thr LeuA
I
ATC CCA A Ile Pro S 210 CCT TTG C Pro Leu L GCA ATC A Ala Ile I: TTT GAA G Phe Giu G: GGG ACA G; Giy Thr Gl
G]
GA
Ci Pr
CT'
Lei 11:
GA;
G1i
GAC
r
.CG
~hr
~GG
,rg 95
CT
er
TG
eu
TA
le Lu
-U
~C CGA CAG AAC TGC CTC ACG 1i Arg Gin Asn Cys Leu Thr 70 -A GAC AAC CTC GTG AGA AAA u Asp Asn Vai Vai Arg Lys C TTT GAA TAT ATG ATT TTA o Phe Giu Tyr Met Ile Leu 100 105 C CCA CTG GAG CAG CAT CTG u Ala Leu Giu Gin His Leu 5 120 SCCC CTG CAT CAC ACA CAA I Arg Leu Asp Asp Thr Ciu 135 GCT GGA ATT AAA ATC ATT 1Ala Giy Ile Lys Ile Ile 150 TAC TTC AGG AAT CCC TGC *Tyr Leu A-rg Asn Gly Trp 1 165 1 CCC ATC TTG CC ACA CTT G Cly Ile Leu Aia Thr Val G 180 185 CCA CTT CGA CTG CTC CCC C Ala Vai Arg Vai Leu Arg P 200 TTA CAA CTC GTC CTC AAG T Leu Gin Vai Vai Leu Lys S 215 CAC ATC CCC CTC CTC CTA T Gin Ile Ciy Leu Leu Leu PJ 230 CCC TTA GAA TTT TAT ATG CC Gly Leu Ciu Phe Tyr Met C: 245 2! CCC ACA CAT GAC ATT CAG C Cly Thr Asp Asp Ile Gin CI 260 265 GAG CCC CCC CCC ACC TGC CC Ciu Pro Aia Arg Thr Cys Pr 280
C
Pr 3CCr Psr G7C 1C Ily
'CC
ro
CC
er hie Ly TT AAC CCC TCT al Asn Arg Ser 75 ~C CCC AAA AAG rr Ala Lys Lys 'C ACC ATC ATA a Thr Ile Ile T CAT CAT GAC o Asp Asp Asp 125 STAC TTC ATT Tyr Phe Ile 140 C CTT CCC TTTC I Leu Cly Phe 155 *GTC ATG CAC 1Val Met Asp P ACC GAG TTT C Thr Ciu Phe A 1 CTC AAG CTG C Leu Lys Leu V 205 ATC ATC AAC C Ile Met Lys A 220 TTT GCA ATC C' Phe Ala Ile L~ 235 AAA TTT CAT AC Lys Phe His T1 GAG TCT CCC C Ciu Ser Pro Al 27 AAT CGG ACC AP, Asn Cly Thr Ly 285 CTC TTC Leu Phe ATC ACC Ile Thr CC AAT Aia Asn 110 AAC ACC Lys Thr GGA ATT Giy Ile 3CC TTC Ula Phe -'TT CTG C ~he Vail 175 AC CTA C sp LeuA TC TCT G al Ser C CC ATG A la Met I CT ATT T u Ile P: 2.
7C ACC TC ir Thr C~ 255 T CCA TC .a Pro C '0 ATCT CZ s Cys Cl
CTC
Leu
GAA
Giu
TC
Cys
CCC
Pro
TTT
Phe
CAC
160
TG
Tal
GG
Lrg
GA
ly
TC
le
TT
he
GC
/s 476 524 572 620 668 716 764 812 860 908 956 1004 1052 1100 -192- CCC TAC TGG CAA CGG CCC AAC Pro Tyr Trp Giu Gly Pro Asn 290 295 AAC GCC ATC ACT Asn Gly Ile Thr TTC GAC AAC ATC Phe Asp Asn Ile TTT GCA GTG
CTG
Phe Ala Val Leu
ACT
Thr 310 GTT TTC CAG TGC Val Phe Gln Cys
ATA
Ile ACC ATG GAA CG Thr Met Ciu Gly ACT GAT CTC Thr Asp Leu CTC TAC Leu Tyr AAT ACC AAC GAT GCC TCA GGC AAC Asn Ser Asn Asp Ala Ser Giy Asn TGG TTG TAC TTC ATC CCC CTC Trp Leu Tyr Phe Ile Pro Leu ATC ATC Ile Ile ATC CCC TCC TTT Ile Gly Ser Phe ACT TGG AAC Thr Trp Asn 335 TTT ATG CTG Phe Met Leu 350 GAA AGG GAA AAC CTT GTC Asn Leu Val 355 CTG GGT GTG CTG Leu Gly Vai Leu CCC GAG TTT
CC
Gly Glu Phe Ala
AAA
Lys CCC GTC Arg Val 370 GAG AAC CCC CCC Giu Asn Arg Arg TTT CTC AAC
CTC
Phe Leu Lys Leu CCC CAA CAA CAG Arg Gln Gln Gln
ATT
Ile 385 CAA CGT GAG CTC Glu Arg Glu Leu
AAT
Asn CCC TAC ATG Ciy Tyr Met CAA TCC Cu rp ATC TCA AAA CCA Ile Ser Lys Aia 1148 1196 1244 1292 1340 1388 1436 1484 1532 1580 1629 1676 1724 1772 GAG CTC ATC CTC CCC GAG CAT GAA ACT GAC Giu Vai Ile Leu Ala Ciu Asp Clu Thr Asp 405 410 CCC GAG CAC AG Gy Gu Gn Arg CAT CCC- His Pro TTT CAT CCA Phe Asp Cly CTG CCC AGA Leu Arg Arg ACC ACC ATA Thr Thr Ile AAC AAA AGC Lys Lys Ser TTG CTC AAC CCC CAA GAG CCT Leu Leu Asn Pro Giu Giu Ala AAG ACA CAT Lys Thr Asp 430 ATA CCC TCT CAT CAC CTG CCT Asp Gin Leu Ala
GAT
Asp CTC CCT Val Giy 450 TCT CCC TTC GCC Ser Pro Phe Ala
CCA
Arg CCC ACC ATT AAA Ala Ser Ile Lys CCC AAC CTC GAG Aia TVs T.u Glu
AAC
Asn 465 TCC ACC TTT TTT Ser Thr Phe Phe
CAC
His AAA AAC GAG AGC Lys Lys Glu Arg
ACC
Arg ATG CT TTC TAC Met Arg Phe Tyr CCC CCC ATCGCTC Arg Arg Met Val ACT CAC CCC TTC Thr Gin Ala Phe
TAC
Tyr TGC ACT CTA CTC ACT TTG Trp Thr Vai Leu Ser Leu CTA CCT CTC Vai Ala Leu AAC ACC CTG Asn Thr Leu 500 TCT CTT GCT ATT CTT CAC TAC AAC CAC CCC Cys Val Ala Ile Vai His Tyr Asn Gin Pro 50 5 193 GAG TGG CTC Giu TrD Leu 515 TCC GAO TTC OTT TAO Ser Asp Phe Leu Tyr TAT GCA Tyr Al a GAA TTC ATT TTC TTA GGA Giu Phe Ile Phe Leu Gly 525 CTC TTT ATG TCC GAA ATG TTT ATA AAA Leu Phe 530 CCT TAO Pro Tyr 545 GGG AGO Gly Ser TTT GGA Phe Gly GTO ACA Val Thr AAO TCC Asn Ser 610 ATT GTO Ile Val 625 AAT TTO Asn Phe GOA ATA Ala Ile GTC ATG Val Met GTG TTC Val Phe 690 CTC CTG Leu Leu 705 CAG GAG Gin Giu Met
TTC
Phe
ATC
Ile
ATC
Ile
AAG
Lys 595
ATG
Met
GTC
Val
GAT
Asp
ATG
Met
TAC
Tyr 675
TOO
Ser
AAT
Asn Ser
CAC
His
TTC
Phe
AGC
Ser 580
TAC
Tyr
AAG
Lys
TTC
Phe
GAA
Glu
ACG
Thr 660
GAC
Asp
ATC
Ile
GTG
Val Glu
TCT
Ser
GAG
Glu 565 GTG Val
TGG
Trp
TOO
Ser
GOO
Ala
GGG
Gly 645
GTG
Val
GGG
Gly
TAT
Tyr
TTC
Phe Met
TOO
Ser 550
GTO
Val
TTA
Leu
GCA
Ala
ATC
Ile
CTT
Leu 630
ACT
Thr
TTT
Phe
ATC
Ile
TTC
Phe
TTG
Leu 710 Phe 535
TTC
Phe
ATC
Ile
CGA
Arg
TCT
Ser
ATC
Ile 615
TTG
Leu
CCT
Pro
CAG
Gln
AAG
Lys
ATT
Ile 695
GOO
Ala Ile
AAC
Asn
TGG
Trp
GOO
Ala
OTO
Leu 600
AGC
Ser
GGA
Gly 000 Pro
ATC
Ile
TCT
Ser 680
GTA
Val
ATC
Ile Lys
TGC
Cys
GCT
Ala
OTO
Leu 585
AGA
Arg
CTG
Leu
ATG
Met
ACC
Thr
CTG
Leu 665
CAG
Gln
CTG
Leu
GCT
Ala ATG TAC Met Tyr TTT GAC Phe Asp 555 GTO ATA Val Ile 570 AGG TTA Arg Leu AAC CTG Asn Leu TTG TTT Leu Phe CAA OTO Gln Leu 635 AAC TTC Asn Phe 650 ACG GGO Thr Gly GGG GGO Gly Gly ACG OTO Thr Leu GTG GAC Val Asp 715
GGG
Gly 540
TGT
Cys
AAA
Lys
TTG
Leu
GTO
Val
OTO
Leu 620
TTC
Phe
GAT
Asp
GAA
3TG lai
E'TT
?he 00o k.AT ~sn
CTT
Leu
GGG
Gly
CCT
Pro
CT
Arg
GTO
Val 605 CTT Leu
GGO
Gly
ACT
Thr
GAC
Asp
CAG
Gln 685
GGG
Gly
CTG
Leu Val2
GGO
Gly
ATT
Ile 590
TCT
Ser
TTC
Phe
GGO
Gly
TTT
Phe
TGG
Trp 670
GGO
~AA
Asn Ala Ile
ACA
Thr 575
TTC
Phe
OTO
Leu
CTG
Leu
CAG
Gln
CCA
Pro 655
AAC
Asn
GGO
Gly
TAC
Tyr
AAC
Asn Ile 560
TOO
Ser
AAA
Lys
OTO
Leu
TTC
Phe
TTT
Phe 640
GCA
Ala
GAG
Glu
ATG
Met
ACC
Thr
CO
Al.a 720 GGG AC OGG Giy Thr Arg GTT ATO ATT 1820 1868 1916 1964 2012 2060 2108 2156 2204 2252 2300 2348 2396 2444 OTO ACC AAG GTG GAG GOG GAO GAG Leu Thr Lys Val Giu Ala Asp Giu 725 730 CAA GAG GAA GAA Gin Giu Glu Glu GAA GOA Ciu Aia 735 -194- GCG AAC Ala Asn CAG AAA CTT CCC CTA CAG AAA GCC Gin1 LYS Leu Ala Leu Gin LYS Ala 740 745 AAG GAG GTG GCA GAA GTG Lys Glu Val Ala Glu Val AGT CCT CTG Ser Pro Leu 755 TCC GCG GCC AAC Ser Ala Ala Asn
ATG
Met TCT ATA GCT GTG Ser Ile A-la Val GAG CAA CAG AAG AAT Lys Asn 770 CAA AAG CCA GCC Gin Lys Pro Ala
AAG
Lys TCC GTG TGG GAG Ser Val Trp Glu
ATG
Met 785 CGA AAG CAG AAC Arg Lys Gln Asn CTG CCC ACC CGC Leu Ala Ser Arg CCC ACC AGT GAG Arg Thr Ser Glu CTC TAT AAC GAA Leu Tyr Asn Glu GAG GCC Glu Ala ATG CAC CCG CAC Met Asp Pro Asp
GAG
Glu 805 CCC TGC AAG CCT Arg Trp Lys Ala TAC ACG CGG CAC Tyr Thr Arg His CTG CGG Leu Arg CCA GAC Pro Asp GAG AAC Glu Asn CTC GAC Val Asp 850
ATG
Met
AAG
Lys 820 ACG CAC TTG GAC Thr His Leu Asp
CGG
Arg CCG CTG CTG GTC Pro Leu Val Val GAC CCC CAG ASP Pro Gin CGC AAC AAC AAC Arg Asn Asn Asn 835 CAG CCC CTC GC Gin Arg Leu Cly ACC AAC Thr Asn AAC AGC CCC CC CCC GAG CCC ACC Lys Ser Ar9 Ala Ala Glu Pro Thr 2492 2540 2588 2636 2684 2732 2780 2828 2876 2924 29721 3020 3068 3116
CAG
Gln CAG CCC CCC GAG Gln Arg Ala Glu TTC CTC AGG AAA Phe Leu Arg Lys
CAG
Gln 865 CCC CCC TAC CAC Ala Arg Tyr His CCC CCC CCC GAC Arg Ala Arg Asp
CCC
Pro ACC CCC TCG CC Ser Gly Ser Ala CTG CAC CCA CCC Leu Asp Ala Arg
ACC
Arg 885 CCC TCC CC CGA Pro Trp Ala Cly CAC GAG CCC GAG Gn lu Ala Cu CTG AC Leu Ser CGG GAG GCA Arg Glu Gly ACC CTG GAG Ser Leu Glu 915 TAC CCC CCC GAG Tyr Cly Arg Clu GAC CAC CAC GCCC Asp His His Ala CCC GAG GCC Ara clu I, CAA CCC CCC TTC TGC Gin Pro Gly Phe Trp 920 GAG GGC GAG GCC GAG CCA CCC AAC Glu Cly Glu Ala Glu Arg Gly Lys CCC CCC Ala Cly 930 GAC CCC CAC Asp Pro His CCC AGC Arg Arg CAC CTG CAC CCC His Val His Arg
CAG
Gin CCC CCC ACC AGG Gly Gly Ser Arg GAG ACC CCC ACC CCC TCC CCC Clu Ser Arg Ser Gly Ser Pro 945 950 CCC ACG CCC CC CAC CCC GAG CAT Arg Thr Gly Ala Asp Cly Giu His 955 -195- CGT CAT CGC GCG CAC CGC AGG CCC GGG Arg His Arg Ala His Arg Arg Pro Gly 965 GCG GAG CGG Ala Glu Arg GGC GAG GGC Gly Glu Gly 995 CAC CGG CAT His Arg His 1010 GCG CGG CAC CGC GAG Ala Arg His Arg Glu 985 GAG GAG GGT CCG GAG GAC AAG Glu Glu Gly Pro Glu Asp Lys 970 975 GGC AGC CGG CCG GCC CGG GGC Gly Ser Arg Pro Ala Arg Gly 990 GGG GGC GAG CGC AGG AGA AGG Gly Gly Glu Arg Arg Arg Arg 1005 GAG GGG GAC GCG CGG AGG GAG Glu Gly Asp Ala Arg Arg Glu 1020 GAG GGC GAG Glu Gly Glu GGC CCC GAC Gly Pro Asp 1000 GGC GCT CCA GCC ACG TAC Gly Ala Pro Ala Thr Tyr 1015 GAC AAG Asp Lys 1025 GAG CGG AGG Glu Arg Arg CAT CGG His Arg 1030 AGG AGG AAA GAG AAC Arg Arg Lys Glu Asn 1035 CTG TCA ACC ACC CGG Leu Ser Thr Thr Arg 1050 GTC CCT GTG Val Pro Val GAC CTG GGC Asp Leu Gly AAG AAC AAC Lys Asn Asn 107; CTT GGC CAC Leu Gly His 1090 TCG GGC CCC AAC Ser Gly Pro Asn 1045 CAG GGC TCC GGG Gin Gly Ser Gly 1040 CCA ATC CAG CAG Pro lie Gin Gin 1055 ATT GAC AAC ATG Ile Asp Asn Met 1070 CCC CAC GGC AGC Pro His Gly Ser 1085 CGC CAA Arg Gin 1060 GAC CCA CCC CTG GCA GAG GAT Asp Pro Pro Leu Ala Glu Asp 1065 3164 3212 3260 3308 3356 3404 3452 3500 3548 3596 3644 3692 3740 3788 AAG CTG GCC Lys Leu Ala GCC GGC CTG Ala Gly Leu ACC GCG GAG TCG GCC GCT Thr Ala Glu Ser Ala Ala 1080 CCC CAG AGC CCA GCC AAG ATG GGA AAC AGC Pro Gin Ser Pro Ala Lys Met Gly Asn Ser 1095 1100 ACC GAC CCC GGC CCC ATG CTG Thr Asp Pro Gly Pro Met Leu 1105 1110 CAG AAC GCC GCC AGC CGC CGG Gin Asn Ala Ala Ser Arg Arg 1125 GCC ATC CCT GCC ATG GCC ACC AAC Ala Ile Pro Ala Met Ala Thr Asn 1115
CCC
Pro 1120 ACG CCC AAC AAC CCG Thr Pro Asn Asn Pro 1130
GGG
Gly AAT CCC GGC CCC CCC AAG ACC CCC GAG AAT AGC CTT ATC Asn Pro Gly Pro Pro Lys Thr Pro Glu Asn Ser Leu Ile 1140 AAC CCA TCC Asn Pro Ser 1135 GTCCC AAC Val Thr Asn 1150 5 -LJ CCC AGC GGC ACC CAG ACC Pro Ser Gly Thr Gin Thr 1155 CAC ACC ACA GTG GAC ATC His Thr Thr Val Asp Ile 1170 AAT TCA GCT AAG ACT GCC AGG AAA CCC GAC Asn Ser Ala Lys Thr Ala Arg Lys Pro Asp 1160 1165 CCC CCA GCC TGC CCA CCC CCC CTC AAC CAC Pro Pro Ala Cys Pro Pro Pro Leu Asn His 1175 1180 -196- ACC GTC GTA CAA GTG AAC AAA AAC GCC AAC CCA GAC CCA CTG CCA AAA 83 Thr Val. Val Gin Val Asn LYS Asn Ala Asn Pro, Asp Pro LeuPrLy .1185 191195 1200 33 AAA GAG GAA GAG AAG A.AG GAG GAG GAG GAA GAC GAC CGT GGG CAA GAC 3884 Lys Giu Giu Glu Lys Lys Giu Giu Glu Glu Asp Asp Arg Gly Giu Asp 1205 1210 1215 GGC CCT AAG CCA ATG CCT CCC TAT AGC TCC ATC TTC ATC CTG TCC ACG 3932 Gly Pro Lys Pro Met Pro Pro Tyr Ser Ser Met Phe Ile Leu Ser Thr 1220 1225 1230 ACC AAC CCC CTT CGC CGC CTG TGC CAT TAC ATC CTC AAC CTC CCC TAC 3980 Thr Asn Pro Leu Arg Arg Leu Cys His Tyr Ile Leu Asn Leu Arg Tyr 1235 1240 1245 TTT GAG ATG TCC ATC CTC ATG GTC ATT GCC ATG AGC ACC ATC GCC CTG 4028 Phe GJlu Met Cys Ile Leu Met Vai Ile Ala Met Ser Ser Ile Ala Leu 1250 1255 1260 GCC GCC GAG GAC CCT GTG CAG CCC AAC GCA CCT CGG AAC AAC GTC CTG 4076 Ala A-la Giu Asp Pro Val. Gin Pro Asn Ala Pro Arg Asn Asn Val. Leu 1265 1270 1275 1280 CGA TAC TTT GCTAC T TTT A GGC GTC TTC ACC TTT GAG ATC GTG 4124 Arg Tyr Phe ApTrVlPeT GyalPhe Thr Phe Glu Met Val.
*..1285 1290 1295 ATC AAC ATG ATGAC CTG GGG CTC GTC CTG CAT CAC G GCC TAC TTC 4172 *le Lys Met Ile Asp Leu Gly Leu Val Leu His Gn ly Ala Tyr Phe *1300n 1305 11 CCT GAC CTC TGG AAT ATT CTC GAC TTC ATA CTG GTC ACT GCC GCC CTG 4220 A-rg Asp Leu Trp Asn Ile Leu Asp Phe Ile Vai Va. Ser Giy Ala Leu 1315 1320 1325 GTA GCC TTT GCC TTC ACT GGC AAT AGC AAA GGA AAA GAC ATC AAC ACC 4268 Ala Phe Ala Phe Thr Gy Asn Ser Lys Gly Lys Asp Ile Asn Thr 1330 1335 1340 ***ATT AAA TCC CTC CGA GTC CTC CCC GTC CTA CGA CCT CTT AAA ACC ATC 4316 Ile Lys Ser Leu Arg Va. Leu Arg Val Leu Arg Pro Leu Lys Thr T1e 1345 1350z( 1355 1.360 AAG CCC CTG CCA AAG CTC AAG CCT GTG TTT GAC TCT CTC GTG AAC TCA 4364 Lys A-rg Leu Pro Lys Leu Lys Ala Va]. Phe Asp Cys Val Val Asri Ser 1365 1370 1375 CTT AAA AAC GTC TTC AAC ATC CTC ATC GTC TAC ATC CTA TTC ATC TTC 4412 Leu Lys Asn Vai Phe Asn Ile Leu Ile Va. Tyr Met Leu Phe Met Phe 1380 1385 1390 ATC TTC CCC GTC GTG GCT GTC CAC CTC TTC AAG CCC AAA TTC TTC CAC 4460 Ile Phe Ala Vai Va]. Ala Vai Gin Leu Phe Lys Cly Lys Phe Phe His 1395 1400 1405 -197- TGC ACT GAC Cys Thr Asp 1410 CTC CTC TAC Leu Leu Tyr 1425 AAG TAT GAA Lys Tyr Giu GAG TCC AAA GAG TTT GAG AAA Glu Ser Lys Glu Phe Glu Lys 1415 GAG AAG AAT GAG GTG AAG GCG Giu Lys Asn Glu Val Lys Ala GAT TGT CGA GGC AAA TAC Asp Cys Arg Gly Lys Tyr 1420 CGA GAC Arg Asp
TTC
Phe 1430 CGG GAG TGG AAG Arg Giu Tr-p Lys 1440 CTG CTG ACC CTC Leu Leu Thr Leu CAT TAC GAC His Tyr Asp 1445 AAT GTG CTG TGG GCT Asn Vai Leu Trp Ala 1450 TTC ACC GTG Phe Thr Val TCC ACG GGA Ser Thr Gly 1460 GAA GGC TGG CCA CAG Giu Gly Trp, Pro Gln 1465 GTG GAC GCC ACC Val Asp Ala Thr 1475 GAG ATG TCC ATT Glu Met Ser Ile 1490 GTC CTC AAG CAT TCG Val Leu Lys His Ser 1470 CCC GGG TAC CCC ATG Pro Gly Tyr Arg Met 1485 TTT GAG AAC CAG GGC CCC AGC Phe Giu Asn Gin Gly Pro Ser 1480 TTC TAC GTC GTC TAC Phe Tyr Val Val Tyr 1495 TTT GTC Phe Val 1505 TTT GTG GTG TTC Phe Vai Vai Phe 1500 ATC ATC ACC TTC Ile Ile Thr Phe 1515s AAT ATC TTT GTG GCC TTG ATC Asn Ile Phe Val Ala Leu Ile 1510 CCC TTC TTC Pro Phe Phe CAG GAG CAA Gin Glu Gin 4508 4556 4604 4652 4700 4748 4796 4844 4892 4940 4988 GGG GAC AAG ATG Gly Asp Lys Met ATG GAG GAA Met Giu Giu 1525 TAC AGC CTG GAG Tyr Ser Leu Giu 1530 AAA AAT GAG Lys Asn Giu AGG GCC Arg Al a TGC ATT GAT Cys Ile Asp TTC GCC ATC Phe Ala Ile 1540 AGC GCC AAG CCG CTG Ser Ala Lys Pro Leu 1545 CAG TAC CGC ATG TGG Gin Tyr Arg Met Trp 1560 CAG AAC AAG CAG AGC TTC Gin Asn Lys Gin Ser Phe 1555 ACC CGA CAC ATG CCG Thr Arg His Met Pro 1550 CAG TTC GTG GTG TCT Gin Phe Vai Val Ser CCC CCT TTC Pro Pro Phe 1570U GAG TAC ACG Glu Tyr Thr ATC ATG GCC ATG ATC GCC CTC AAC ACC ATC Ile Met Ala Met Ile Ala Leu A~n Tll T1 1575 1580 GTG CTT ATG ATG AAG Val Leu met Met Lys 1585 TTC TAT GGG GCT Phe Tyr Gly Ala 1590 TCT GTT GCT Ser Val Ala 1595 TAT GAA AAT Tyr Giu Asn
CC
Al a CTG CGG GTG Leu Arg Val GTG CTG AAA Val Leu Lys TTC AAC ATC GTC TTC Phe Asn Ile Vai Phe 1605 GTC ATG GCT TTT GGG Vai Met Ala Phe Gly 1620 ACC TCC CTC Thr Ser Leu 1610 ATT CTG AAT Ile Leu Asn 1625 TTC TCT CTG Phe Ser Leu GAA TGT Glu Cys 5036 5084 5132 TAT TTC CCC GAT GCC Tyr Phe Arg Asp Ala 1630 -198- TGG AAC ATC TTC GAC TTT GTG ACT GTT CTG GGC AGC ATC ACC GAT ATC 5180 Trp Asn Ile Phe Asp Phe Vai Thr Val. Leu Gly Ser Ile Thr Asp Ile 1635 1640 1645 CTC GTG ACT GAG TTT GGG AAT CCG AAT AAC TTC ATC AAC CTG AGC TTT 5228 Leu Val Thr Giu Phe Gly Asn Pro Asn Asn Phe Ile Asn Leu Ser Phe 1650 1655 1660 CTC CGC CTC TTC CGA GCT GCC CGG CTC ATC AAA CTT CTC CGT CAG GGT 5276 Leu Arg Leu Phe Arg Ala Ala Arg Leu Ile Lys Leu LeU A-rg Gin Giy 1665 1670 1675 1680 TAC ACC ATC CGC ATT CTT CTC TGG ACC TTT GTG CAG TCC TTC AAG GCC 5324 Tyr Thr Ile Arg Ile Leu Leu Trp Thr Phe Val Gin Ser Phe Lys A-la 1685 1690 1695 CTG CCT TAT GTC TGT CTG CTG ATC GCC ATG CTC TTC TTC ATC TAT GCC 5372 Leu Pro Tyr Vai Cys Leu Leu Ile Ala Met Leu Phe Phe Ile Tyr Ala 1700 1705 1710 ATC ATT GGG ATG CAG GTG TTT GGT AAC ATT GGC ATC GAC GTG GAG GAC 5420 Ile Ile Gly Met Gin Vai Phe Gly Asn Ile Gly Ile Asp Val Giu Asp 1715 1720 12 GAG GAC AGT GAT GAA GAT GAG TTC CAA ATC ACT GAG CAC AAT AAC TTC 5468 Giu Asp Ser Asp Glu Asp Giu Phe Gin Ile Thr Giu His Asn Asn Phe 1730 1735 14 CGG ACC TTC TTC CAG GCC CTC ATG CTT CTC TTC CGG AGT GCC ACC GGG 5516 aArg Thr Phe Phe Gn Ala Leu Met Leu Leu Phe Arg Ser Ala Thr Gly a1745 1750 1755 1760 GAA GCT TGG CAC AAC ATC ATG CTT TCC TGC CTC AGC GGG AAA CCG TGT 5564 Glu Ala Trp His Asn Ile Met Leu Ser Cys Leu Ser Gly Lys Pro Cys 1765 1770 1775 GAT AAG AAC TCT GGC ATC CTG ACT CGA GAG TGT GGC AAT GAA TTT GCT 5612 Asp Lys Asn Ser Gly Ile Leu Thr Arg Giu Cys Giy Asn Giu Phe Aia *.a1780 1785 1790 a..TAT TTT TAC TTT GTT TCC TTC ATC TTC CTC TGC TCG TTT CTG ATG CTG Tyr Phe TrPhe Val Ser Phe Ile Ph.- Cy -Ser Ffle Leu Met Leu 1751800 1805 je..AAT CTC TTT GTC GCC GTC ATC ATG GAC AAC TTT GAG TAC CTC 
ACC CGA 5708 Asn Leu Phe Val. Ala Val Ile Met Asp Asn Phe Giu Tyr Leu Thr Arg a..1810 1815 12 GAC TCC TCC ATC CTG GGC CCC CAC CAC CTG GAT GAG TAC GTG CGT GTC 57S6 Asp Ser Ser Ile Leu Giy Pro His His Leu Asp Glu Tyr Val Arg Vai 1825 1830 1.835 1840 TGG GCC GAG TAT GAC CCC GCA GCT TGG GGC CGC ATG CCT TAC CTG GAC 5804 Trp Ala Giu Tyr Asp Pro Ala Aia Trp Giy Arg met Pro Tyr Leu Asp 1845 1850 1855 -199- ATG TAT CAG ATG CTG AGA CAC Met Tyr Gin Met Leu Arg His 1860 AAG TGT CCG GCC Lys Cys Pro Ala 1.875 COO GTC GCA GAT Pro Val Ala Asp 1890 CTG ATO OGO ACA Leu Ile Arg Thr 1905 ATG TCT COG COO CTG GGT CTG GGG AAG Met Ser Pro Pro Leu Gly Leu Gly Lys 1865 1870 TAC AAG CGG CTT CTG CGG ATG GAO CTG Tyr Lys Arg Leu Leu Arg Met Asp Leu 1880 1885 AGA GTG GCT Arg Val Ala GAO AAO ACO GTO CAC TTC AAT TOO ACC OTO ATG GOT Asp Asn Thr Val His Phe Asn Ser Thr Leu Met A-la 1895 1900 GOC OTG GAO ATO AAG Ala Leu Asp Ile Lys ATT GOC AAG GGA GGA CO GAO Ile Ala Lys Gly Giy Ala Asp AAA CAG CAG ATG GAO GOT GAG OTG CCC AAG GAG ATG Lys Gin Gin Met Asp Ala Giu Leu Arg Lys Giu Met 1925 1930
COO AAT CTG TOO CAG AAG ACG OTA GAO CTG CTG GTO Pro Asn Leu Ser Gin Lys Thr Leu Asp Leu Leu Val 1940 1945 ATG COG ATT TGG 1935 ACA CT CAC AAG Thr Pro His Lys 1950 ATG ATG ATO ATG Met Met Ile Met TOO ACG GAO OTO ACC Ser Thr Asp Leu Thr 1955 GAG TAO TAO CCC CAG Giu Tyr Tyr Arg Gin 1970 GAG CAG GAO CCC ACA Giu Gin Asp Arg Thr 1985 CCA ACG CAG GAA GG Pro Thr Gin Ciu Gly 200S GTG CCC AAG ATO TAO GOA GC Vai Cly Lys Ile Tyr Ala Ala 1960 5852 5900 5948 5996 6044 6092 6140 6188 6236 6284 6332 6380 642a 6476 AGO AAG CCC AAG Ser Lys Ala Lys 1975 COO OTO ATG TTO Pro Leu Met Phe 1990 AAG CTG CAG CO ATG CCC GAG Lys Leu Gin Ala Met Arai Giu CAG CCC ATG GAG Gin Arg Met Giu GGA COT Giy Pro COO CG TOO Pro Pro Ser 2000 TOO ACC CAG Ser Thr Gin CCC CAG AAC CCC OTO COO Gly Gin Asn Ala Leu Pro CTG GAO OCA GGA GGA CO OTG ATG Leu Asp Pro Gly Giy Ala Leu Met 2020 GOT CAC GAA Ala His Giu AGO COG TOO TGC Ser Pro Ser Trp, 2035 AGO CCC OTO AAG GAG Ser Gly Leu Lv~q rl 2030 ATG TTC CAG AAG AOG Met Phe Gin Lys Thr GTG ACC CAG CGT CO CAG GAG Vai Thr Gin Arg Ala Gin Giu 2040 GGO ACA TGG ACT COG GAA CAA GC Gly Thr Trp Ser Pro Giu Gin Gly 2050 2055 CAG COT AAO TOT CAG TOO GTG GAG Gin Pro Asn Ser Gin Ser Vai Giu 2065 2070 COO OCT ACC GAO ATG COO AAC AGO Pro Pro Thr Asp Met Pro Asn Ser ATG CA GAG ATG CCC AGA CAT Met Arg Giu Met Gly Arg Asp 2075
GCC
Gly 2080 -200- TAC TCC GAC AGC GAG CAC TAC CTC CCC ATG GAA GGC CAG GGC CGG GCT 6524 Tyr Ser Asp Ser Glu His Tyr Leu Pro Met Glu Gly Gin Giy ArgAa 20852090 2095 GCC TCC ATG CCC CGC CTC CCT GCA GAG AAC CAG AGG AGA AGG GGC CGG 67 A-la Ser Met Pro Arg Leu Pro Ala Giu Asn Gin Arg Arg Arg Gly Arg 67 2100 2105 2110 CCA CGT GGG AAT AAC CTC AGT ACC ATC TCA GAC ACC AGC CCC ATG AAG 6620 Pro Arg Giy Asn Asn Leu Ser Thr Ile Ser Asp Thr Ser Pro Met Lys 2115 2120 2125 CGT TCA GCC TCC GTG CTG GGC CCC AAG GCC CGA CGC CTG GAC GAT TAC 6668 Arg Ser Ala Ser Val Leu Gly Pro Lys Ala Arg Arg Leu Asp Asp Tyr 2130 2135 2140 TCG CTG GGCGGCCGCCGGAGAC CAG CGG CAC CAC CAG CGG 6716 Ser Leu Glu AgVlPoPoGuG snl Arg His His Gin Arg 2145 2150 2155 26 *.6CGC CGC GAC CGC AGC CAC CGC GCC TCT GAG CGC TCC CTG GGC CGC TAC 6764 *Arg Arg Asp Arg Ser His Arg Ala Ser Glu Arg Ser Leu Gly Arg Tyr 2165 2170 2175 ACC GAT GTG GAC ACA GGC TTG GGG ACA GAC CTG AGC ATG ACC ACC CAA 6812 Thr Asp Val Asp Thr Gly Leu Gay Thr Asp Leu Ser Met Thr Thr Gin 2180 21529 *l~o TC A CTG CCG TCG AAG GAG CGG GAC CAG GAG CGG 2190GCC 66 .o Ser Gly Asp Leu Pro Ser Lys Glu Arg Asp Gin Giu Arg Gly Arg Pro 2.195 2200 2205 AAG GAT CGG AAG CAT CGA CAG CAC CAC CAC CAC CAC CAC CAC CAC CAC 6908 Lys Asp Arg Lys His Arg Gin His His His His His His His His His 2210 2215 22 CAT CCC CCG CCC CCC GAC AAG GAC CGC TAT GCC CAG GAA CGG CCG GAC 6956 0 His Pro Pro Pro Pro Asp Lys Asp Arg Tyr Ala Gin Giu Arg Pro Asp 2225 2230 2235 24 0*.CAC GGC CGG GCA CGG GCT CGG GAC CAG CGC TGG TCC CGC TCG CCC AGC 7004 His Gly Arg Ala Arg Ala Arg Asp Gln Arg Trp Ser Arg Ser Pro Ser 2245 2250 2255 GAG GGC CGA GAG CAC ATG GCG CAC CGG CAG TAGTTCCGTA AGTGGAAGCC 7054 Glu Gly Arg Glu His Met Ala His Arg Gin 2260 2265 CAGCCCCCTC AACATCTGGT ACCAGCACTC CGCGGCGGGG CCGCCGCCAG CTCCCCCZAGA 7114 CCCCCTCCAC CCCCCGGCCA CACGTGTCCT ATTCCCCTGT GATCCGTAAG GCCGGCGGCT 7174 CGGGGCCCCC GCAGCAGCAG CAGCAGCAGC AGGCGGTGGC CAGGCCGGGC CGGGCGGCCA 7234 CCAGCGGCCC TCGGAGGTAC CCAGGCCCCA CGGCCGAGCC 
TCTGGCCGGA GATCGGCCGC 7294
CCACGGGGGG CCACAGCAGC GGCCGCTCGC
CCAGGATGGA
CCCGGAGCGA GTCCCCCAGG GCCTGTCGAC
ACGGCGGGGC
CGCACGTGTC CGAGGGGCCC CCGGGTCCCC
GGCACCATGG
ACGACGAGGC CGATGGCCCG GGCAGCGGGG
GCGGCGAGGA
ACGCGCCACC CCCCGTACGA CACGCGTCCT
CGGGCGCCAC
CCCGGGCCTC GGGCCCGGCC TGCGCCTCGC
CTTCTCGGCA
GCTACTACCC GGCGCACGGA CTGGCCAGGC
CCCGCGGGCC
ACGAACCCTA CAGCGAGAGT GACGATGATT
GGTGCTAAGC
CCGGCCCCCC
ACGCACC
INFORMATION FOR SEQ ID NO:24:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 7032 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: DNA (genomic)
GAGGCGGGTC
CCGGTGGCCG
CTACTACCGG
GGCCATGGCC
CGGGCGCTCG
CGGCCGGCGA
GGGCTCCAGG
CCGGGCGAGG
CCAGGCCCGG
GCATCTGGCC
GGCTCCGACT
GGGGCCTACG
CCCAGGACTC
CTCCCCAACG
AAGGGCCTGC
TGGCGCCCGC
7354 7414 7474 7534 7594 7654 7714 7774 7791
(ix) FEATURE:
(A) NAME/KEY: CDS
(B) LOCATION: 166..6921
(D) OTHER INFORMATION: /standard-name= "Alpha-1E-1"
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:24:
GCTGCTGCTG CCTCTCCGAA GAGCTCGCGG AGCTCCCCAG AGGCGGTGGT
CCCCGTGCTT
GTCTGGATGC GGCTCTGAGT CTCCGTGTGT CTTTCTGCTT GTTGCTGTGT
GCGGGTGTTC
GGCCGC'AT ACCTTTGTGT GTCTTCTGTC TGTTTAAACC TCAGG ATO GCT COC Met Ala Arg 1 TTC GGG GAG GCG GTG GTC GCC AGO CCA GGG TCC GGC GAT GGA GAC TCG Phe Gly Glu Ala Val Val Ala Arg Pro Gly Ser Gly Asp Gly Asp Ser 510 GAC CAG AGC AGG AAC CGG CAA OGA ACC CCC GTG CCG GCC TCG GGG CAG Asp Gin Ser Arg Asn Arg Gin Gly Thr Pro Val Pro Ala Ser Gly Gln 25 30 GCG GCC 0CC TAC AAG CAG ACG AAA GCA CAG AGO GCG COG ACT ATG GCT Ala Ala Ala TIyr Lys Gln Thr Lys Ala Gln Arg Ala Arg Thr Met Ala 45 120 174 2 22 270 318 -202- TTG TAC AAC CCC ATT CCC GTC CGG CAG AAC TGT TTC ACC GTC AAC AGA 366 Leu Tyr Asn Pro Ile Pro Val Arg Gin Asn Cys Phe Thr Vai Asn Arg 065 TCC CTG TTC ATC TTC GGA GAA GAT AAC ATT GTC AGG AAA TAT GCC AAG 414 Ser Leu Phe le Phe Giy Giu Asp Asn Ile Val Arg Lys Tyr Aia Lys 70 75 80 AAG CTC ATC GAT TGG CCG CCA TTT GAG TAC ATG ATC CTG GCC ACC ATC 462 Lys Leu Ile Asp Trp Pro Pro Phe Gu Tyr Met Ile Leu Aa Thr Ile 90 ATT GCC AAC TGC ATC GTC CTG GCC CTG GAG CAG CAT CTT CCT GAG GAT 510 Ile Aa Asn Cys Ile Val Leu Ala Leu Giu Gin His Leu Pro Giu Asp 100 105 110 115 GAC A-AG ACC ATG TCC CGA AGA CTG GAG AAG ACA GAA CCT TAT TTC 558 Asp Lys Thr Pro Met Ser Arg Arg Leu Gl Lys Thr Glu Pro Tyr Phe 120 125 130 ATT GGG ATC TTT TGC TTT GAA GCT GGG ATC AA ATT GTG GCC CTG GGG 606 Ile Gy Ile Phe Cys Phe Gu Ala Gy le Lys le Vai Aa Leu Gly *135 140 145 TTC ATC TTC CAT AAG GGC TCT TAC CTC CGC AAT GGC TGG AAT GTC ATG 654 Phe lie Phe His Lys Giy Ser Tyr Leu Arg Asn Giy Trp Asn Vai Met 150 155 160 GAC TTC ATC GTG GTC CTC AGT GGC ATC CTG GCC ACT GCA GGA ACC CAC 702 Asp Phe le Val Vai Leu Ser Giy Ile Leu Aa Thr Aa Gly Thr His 165170 175 TT ACT CAC GTG GAC CTG AGG A-CC CTC CGG GCT GTG CGT GTC CTG 750 Phe Asn Thr His Vai Asp Leu Arg Thr Leu Arg A-a Val Arg Val Leu 180 185 190 195 CGG CCT TTG AAG CTC GTG TCA GGG ATA CCT AGC CTG CAG ATT GTG TTG 798 Arg Pro Leu Lys Leu Val Ser Gly Ile Pro Ser Leu Gin Ile Val Leu 200 205 210 AAG TCC ATC ATG AAG GCC ATG GTA CCT CTT CTG CAG ATT GGC CTT CTG 846 Lys Ser Ile Met Lys Aa 
Met Val Pro Leu Leu Gln ie Gly Leu Leu 150 225 CTC TTC TTT GCC ATC CTG ATG TTT GCT ATC ATT GGT TTG GAG TTC TAC 894 Leu Phe Phe Aila Ile Leu Met Phe Aa Ile Ile Gly Leu Giu Phe Tyr 230 235 240 AGT GGC AAG TTA CAT CGA GCG TGC TTC ATG AAC AAT TCA GGT ATT CTA 942 Ser Gly Lys Leu His Arg A-a Cys Phe Met Asn Asn Ser Gly Ile Leu 245 250 255 GAA GGA TTT GA-C CCC CCT CA-C CCA TGT GGT GTG CAG GGC TGC CCA GCT 990 Glu Gly Phe Asp Pro Pro His Pro Cys Gly Val Gin Gly Cys Pro Ala 260 265 270 275 0 -203- GGT TAT GAA TGC Gly Tyr Giu Cys
AAG
Lys 280 GAC TGG ATC Asp Trp Ile GGC CCC AAT GAT GGG ATC ACC CAG Giy Pro Asn Asp Gly Ile Thr Gin 285 290 CTG ACT GTC TTC CAG TGC ATC ACC Leu Thr Vai Phe Gin Cys Ile Thr 300 305 TTT GAT AAC Phe Asp Asn
ATC
Ile 295 CTT TTT GCT GTG Leu Phe Aia Val ATG GAA GGG TGG ACC ACT GTG CTG TAC AAT ACC AAT Met Giu Giy Trp Thr Thr Val Leu Tyr Asn Thr Asn
GAT
Asp GCC TTA GGA Aia Leu Giy GCC ACC Ai a Thr 325 TGG AAT TGG CTG Trp Asn Trp Leu 00. 0 0..0*
TAC
Tyr 330 TTC ATC CCC CTC Phe Ile Pro Leu ATC ATT GGA TCC Ile Ile Giy Ser TTT GTT CTC AAC Phe Vai Leu Asn GTC CTG GGA GTG Vai Leu Giy Vai TCC GGG GAA TTT Ser Giy Giu Phe AAA GAG AGA GAG Lys Giu Arg Giu
AGA
Arg 360 GTG GAG AAC CGA Vai Giu Asn Arg
AGG
Arg 365 OCT TTC ATG AAG Aia Phe Met Lys CTG CGG Leu A-rg 1038 1086 1134 1182 1230 1278 1326 1374 1422 1470 1518 CGC CAG Arg Gin GAC AAA Asp Lys ACA TCC Thr Ser 405
CAG
Gin ATT GAG COT GAG Ile Oiu Arg Giu
CTG
Leu 380 AAT GGC TAC CGT Asn Oly Tyr Arg GCC TGG ATA Ala Trp Ile 385 AAT GCT GGA Asn Aia Oly GCA GAG OAA OTC Aia Giu Giu Vai 390 GCC TTA GAA OTG Aia Leu Oiu Vai
ATG
Met
CTC
Leu 395 OCT OAA GAA AAT Ala Giu Giu Asn CTT CGA AGG OCA ACC ATC AAG AGG AGC CGG Leu Arg Arg Aia Thr Ile Lys Arg Ser Arg 410 415
ACA
Thr 420 GAG GCC ATG ACT Giu Ala Met Thr
CGA
Arg 425 GAC TCC AGT OAT Asp Ser Ser Asp
GAG
Oiu 430 CAC TGT OTT GAT His Cys Val Asp
ATC
Ile TCC TCT GTG GGC ACA CCT CTG GCC CGA Ser Ser Val Giy Thr Pro Leu Ala Arg 440 AGT ATC AAA AGT Ser Ile Lys Ser GCA AAG Ala Lys OTA GAC OGG GTC TCT TAT TTC CG Val Asp Gly Val Ser Tyr Phe Arg 455 TCC ATT CGC CAC ATG GTT AAA TCC Ser Ile Arg His Met Val Lys Ser 470 475 AAO GAA AGG CTT Lys Giu Arg Leu CTG CGC ATC Leu A-rg Ile 465 ATT GTG CTG Ile Val Leu CAG GTG TTT TAC Gin Val Phe Tyr 1614 1662 AGC CTT GTG GCA CTC AAC ACT Ser Leu Vai Ala Leu Asn Thr 485 490 GCC TOT GTG GCC ATT GTC CAT CAC AAC Ala Cys Val Ala Ile Vai His His Asn -204- CAG CCC CAG TGG CTC ACC CAC CTC CTC TAC TAT GCA GAA TTT CTG TTT 1710 Gin Pro Gin Trp, Leu Thr His Leu Leu Tyr Tyr Ala Giu Phe Leu Phe 500 505 510 CTG GGA CTC TTC CTC TTG GAG ATG TCC CTG AAG ATG TAT GGC ATG GGG 1758 Leu Gly Leu Phe Leu Leu Giu Met Ser Leu Lys Met Tyr Giy Met Gay 520 525 530 CCT CGC CTT TAT TTT CAC TCT TCA TTC AAC TGC TTT GAT TTT GGG GTC 1806 Pro Arg Leu Tyr Phe His Ser Ser Phe A-sn Cys Phe Asp Phe Gly Val 535 540 545 ACA GTG GGC AGT ATC TTT GAA GTG GTC TGG GCA ATC TTC AGA CCT GGT 1854 Thr Val Gly Ser Ile Phe Giu Val Val Trp, Ala Ile Phe Arg Pro Gly 550 55556 AC CTTTGGA ATC AGT GTC TTG CGA GCC CTC CGG CTT CTA AGA ATA 1902 ThrSerPheGly Ile Ser Val Leu Arg A-1a Leu Arg Leu Leu Arg Ile *..565 570 575 T AAA ATA ACC AAG TAT TGG GCT TCC CTA CGG AAT TTG GTG GTC TCC Phe Lys Ile Thr Lys Tyr Trp Ala Ser Leu Arg Asn Leu Val Val Ser .580 585 590 595 TTG ATG AGC TCA ATG AAG TCT ATC ATC AGT TTG CTT TTC CTC CTC TTC 1998 Leu Met Ser SrMet Lys Ser Ile Ile Ser Leu Leu Phe Leu Leu Phe *..600 605 610 CTC TTC ATC GTT GTC TTT GCT CTC CTA GGA ATG CAG TTA TTT GGA GGC 2046 Leu Phe Ile Val Val Phe Ala Leu Leu Giy Met Gin Leu Phe Gly Gly :615 620 625 AGG TTT AAC TTT AAT GAT GGG ACT CCT TCG GCA AAT TTT GAT ACC TTC 2094 Arg Phe Asn Phe Asn Asp Gly Thr Pro Ser Ala Asn Phe Asp Thr Phe 630 635 640 CCTGCAGCCATC ATG ACT GTG TTC CAG ATC CTG ACG GGT GAG GCTG 24 PoAla Ala Ile MtThr Val Phe Gin Ile Leu Thr Gly Giu Asp Trp S645 650 655 *59AAT GAG GTG ATG TAC AAT 
GGG ATC CGC TCC CAG GGT GGG GTC AGC TCA 21qn Asn Giu Val Met Tyr Asn Giy Ile Arq Ser G 1ln Gy Gl a -er Ser 660 665 670 675 GGC ATG TGG TCT GCC ATC TAC TTC ATT GTG CTC ACC TTG TTT GGC AAc 2238 Gly Met Trp Ser Ala Ile Tyr Phe Ile Val Leu Thr Leu Phe Gly Asn 680 685 690 TAC ACG CTA CTG AAT GTG TTC TTG GCT ATC GCT GTG GAT AAT CTC GCC 2286 Tyr Thr ILeu Leu Asn Val Phe Leu Ala Ile Ala Val Asp Asn Leu Ala 695 700 705 AAC GCC CAG GAA CTG ACC AAG GAT GAA CAG GAG GAA GAA GAG GCC TTC 2334 A-sn Ala Gin Giu Leu Thr Lys Asp Giu Gin Giu Giu Giu Giu Ala Phe 710 715 720 -205- AAC CAG AAA Asri Gin Lys 725 CAT GCA CTG CAG AAG GCC A-AG GAG His Ala Leu Gin Lys Ala Lys Glu 730 GTc AGC CCG ATG TCT Val Ser Pro Met Ser
GCA
Al a 740 CCC AAC ATG CCT Pro Asn Met Pro TCG ATC GAG AGG GAG CGG Ser Ile Giu Arg Giu Arg 745 750 AGG CGC CGG CAC A-rg Arg Arg His
CAC
His ATG TCC GTG TGG Met Ser Val Trp CAG CGT ACC AGC Gin Arg Thr Ser
CAG
Gin CTG AGG AAG CAC Leu Arg Lys His ATG CAG Met Gin ATG TCC AGC Met Ser Ser CCG CTC AAC Pro Leu Asn 790 GAG CCC CTC AAC Glu Ala Leu Asn GAG GAG GC CCG Giu Giu Ala Pro ACC ATG AAC Thr Met Asn 785 CTC AAT CC CCC CTC AAC CCG Pro Leu Asn Pro
CTC
Leu AGC TCC CTC AAC Ser Ser Leu Asn
CCG
Pro CAC CCC His Pro 805 AGC CTT TAT CGG Ser Leu Tyr Arg
CGA
Arg CCC AGG GCC ATT Pro Arg Ala Ile GCC CTG GCC CTG Gy Leu Ala Leu
GGC
Cl y 820 CTG GCC CTG Leu Ala Leu GAG A-AC TTC GAG GAG GAG CC Clu Lys Phe Ciu Giu Giu Arg 825 8-30 ATC AGC CGT CCC Ile Ser Arg ly 2382 2430 2478 2526 2574 2622 2670 2718 2766 2814 2862 2910 2958 3006 TCC CTC AAG
GC
Ser Leu Lys Cly CAT GCA CCG GAC CGA TCC CGT GCC CTC GAC Asp Gly Ciy Asp Arg Ser Ser Ala Leu Asp 840
D
AAC GAG Asn Gin AGG ACC
CCT
Arg Thr Pro CCC TGT CAT Pro Cys His 870
TTG
Leu 855 TCC CTC CCC GAG Ser Leu Cly Gin GAG CCA CGA TG Giu Pro Pro Trp, CTG CCC AGC Leu Ala Arg 865 GCG CCA GGA Gly Gly Gly CGA AAC TCT CAC Gly Asn Cys Asp
CCG
Pro ACT GAG GAG GAG Thr Gin Gin Giu GAG CCT Giu Ala 885 GTG GTG ACC TTT Val Val Thr Phe GAC CGC CCC ACG Asp Arg Ala Arg AGG C-AG AGC CAA Arg Gin Ser Gin
CGG
Arg 900 CGC AGC CGG CAT Arg Ser Arg His CGC GTC AGG ACA Arg Val Arg Thr
CA-A
Giu GGC A-AC GAG TCC ly Lys lu Ser TGA GCC TCC CG Ser Ala Ser A-rg
AGC
Ser 920 AGG TCT GCC AGC Arg Ser Ala Ser CAA CGC ACT CTG Gu Arg Ser Leu GAT CAA Asp Giu GCC ATG CCC ACT CAA CCC GAG AAG Ala Met Pro Thr Giu Gly Glu Lys 935 CAT GAG CTC AGC His Glu Leu Arg GCC AAC CAT Cly Asn His 945 -206- GGT GCC AAG GAG CCA ACG ATC CAA GAA GAG AGA GCC CAG GAT TTA AGG 3054 Gly Ala Lys Giu Pro Thr Ile Gin Giu Giu Arg Ala Gin Asp Leu Arg 950 955 960 AGG ACC AAC AGT CTG ATG GTG TCC AGA GGC TCC GGG CTG GCA GGA GGC 3102 Arg Thr Asn Ser Leu Met Va Ser Arg Giy Ser Giy Leu Ala Gly Gly 965 970 975 CTT GAT GAG GCT GAC ACC CCC CTA GTC CTG CCC CAT CCT GAG CTG GAA 3150 Leu Asp Giu Ala Asp Thr Pro Leu Vai Leu Pro His Pro Giu Leu Glu 980 985 990 995 GTG GGG AAG CAC GTG GTG CTG ACG GAG CAG GAG CCA GAA GGC AGC AGT 3198 Vai Gly Lys His Vai Vai Leu Thr Giu Gin Giu Pro Giu Gly Ser Ser 1000107 1000 1005 1010 GAG CAG GCC CTG CTG GGG AAT GTG CAG CTA GAC ATG GGC CGG GTC ATC 3246 Glu Gin Ala Leu Leu Gly Asn Val Gin Leu Asp Met Gly Arg Val Ile Giu Gly 1Asp Gl A r oa ie 1015 1020 1025 AGC CAG AGC GAG CCT GAC CTC TCC TGC ATC ACG GCC AAC ACG GAC AAG 3294 Ser Gin Ser Glu Pro Asp Leu Ser Cys Ile Thr Ala Asn Thr Asp Lys 1035 1040 1040 GCC ACC ACC GAG AGC ACC AGC GTC ACC GTC GCC ATC CCC GAC GTG GAC 3342 Ala Thr Thr Giu Ser Thr Ser Val Thr Val Ala Ile Pro Asp Val Asp 1045 1050 1055 CCC TTG GTG GAC TCA ACC GTG GTG CAC ATT AGC AAC AAG ACG GAT GGG 3390 G**67 Pro Leu Val Asp Ser Thr Val Val His Ile Ser Asn Lys Thr Asp Gly 1060 1065 *a1070 1075 GAA GCC AGT CCC TTG AAG GAG GCA GAG ATC AGA GAG GAT GAG GAG GAG 3438 a Giu Ala Ser Pro Leu Lys Giu Ala Giu Ile Arg Giu Asp Giu Giu Glu 1080 1085 1090 GTG GAG AAG AAG AAG CAG AAG AAG GAG AAG CGT GAG ACA GGC AAA GCC 3486 Val Gu Lys Lys Lys Gin Lys Lys Gu Lys Arg Gu Thr Gly Lys Ala 1095 1100 1105 ATG GTG CCC CAC AGC TCA ATO TTC ATC TTC AGC ACC ACC AAC CCG ATC 3534 Met Val Pro His Ser Ser Met Phe Ile Phe Spr Th, Tr AS Po ie 1110 1115 1120 CGG AGG GCC TGC CAC TAC ATC GTG AAC CTG CGC TAC TTT GAG ATG TGC 3582 Arg Arg Ala Cys His Tyr Ile Val Asn Leu Arg 
Tyr Phe Gu Met Cys 1125 1130 1135 ATC CTC CTG GTG ATT GCA GCC AGC AGC ATC GCC CTG GCG GCA GAG GAC 3630 Ile Leu Leu Val Ile Ala Aia Ser Ser Ile Ala Leu Ala Ala Gu Asp 1140 1145 1150 1155 CCC GTC CTG ACC AAC TCG GAG CGC AAC AAA GTC CTG AGG TAT TTT GAC 3678 Pro Val Leu Thr Asn Ser Gu Arg Asn Lys Vai Leu Arg Tyr Phe Asp 1160 1165 1170 -207- TAT GTG TTC ACG GGC GTG TTC ACC TTT GAG ATG GTT ATA AAG ATG ATA Tyr Val Phe Thr Gly Val Phe Thr Phe Giu Met Val Ile Lys Met Ile 1175 10 GAC CAA GGC TTG ATC Asp Gin Gly Leu Ile 1190 AAC ATC CTG GAC TTT Asn Ile Leu Asp Phe 1205 CTG CAG GAT GGG TCC Leu Gin Asp Gly Ser 1195 GTG GTG GTC GTT GGC Val Val Val Vai Gly 1210 1185 TAC TTC CGA GAC TTG TGG Tyr Phe Arg Asp Leu Trp 1200 GCA TTG GTG GCC TTT GCT Ala Leu Val Ala Phe Aia 1215 CTG GCG Leu Ala 1220 AAC GCT TTG Asn Ala Leu GGA ACC AAC AAA GGA Gly Thr Asn Lys Gly 1225 CGG GAC Arg Asp ATC AAG ACC Ile Lys Thr a. a a a
ATC
Ile AAG TCT CTG CGG GTG CTC CGA GTT Lys Ser Leu Arg Val Leu Arg Val 1240 CTA AGG CCA CTG Leu Arg Pro Leu 1245 AAA ACC ATC AAG Lys Thr Ile Lys CGC TTG CCC AXg Leu Pro AAG CTC AAG LYS Leu Lys 1255 GCC GTC TTC GAC Ala Val Phe Asp 1260 AAG AAT GTC TTC Lys Asn Val Phe 1270 TTT GCT GTC ATC Phe Ala Val Ile 1285 ACG GAC AGT TCC Thr Asp Ser Ser 1300 AAC ATA CTC Asn Ile Leu ATT GTG TAC Ile Val Tyr TGC GTA GTG ACC TCC TTG Cys Vai Val Thr Ser Leu 126S AAG CTC TTC ATG TTC ATC LYS Leu Phe Met Phe Ile 1280 GGA AAG TTC TTT TAT TGC Gly Lys Phe Phe Tyr Cys 1295 3726 3774 3822 3870 3918 3966 4014 4062 4110 4158 4206 4254 4302 4350 GCA GTT CAG CTC TTC AAG Ala Val Gin Leu Phe Lys 1290 AAG GAC ACA GAG AAG GAG Lys Asp Thr Glu Lys Glu 1305 TGC ATA Cys Ile GGC AAC TAT Gly Asn Tyr
GTA
Val1 GAT CAC GAG AAA AAC AAG ATG GAG GTG AAG GGC CGG Asp His Glu Lys Asn Lys Met Glu Val Lys Gly Arg 1320 1325 CAT GAA TTC CAC TAC His Glu Phe His Tyr 1335 ACC GTC TCC ACA GGG Thr Val Ser Thr Gly 1350 GAT GTG ACA GAG GAA Asp Val Thr Glu Glu 1365 ATG TCT ATC TTT TAT Met Ser Ile Phe Tyr 1380 GAA TGG AAG CGC Giu Trp Lys Arg 1330 CTG ACC CTC TTC Leu Thr Leu Phe AAC ATT ATC TGG GCC CTG Asn Ile Ile Trp, Ala Leu GAA GGA TGG CCT CAA Glu Gly Trp Pro Gin 1355 GAC CGA GGC CCA AGC Asp Arg Gly Pro Ser 1370 GTA GTC TAC TTT GTG Val Val Tyr Phe Val 1385 GTT CTG CAG CAC TCT GTA Val Leu Gin His Ser Val 1360 CGC AGC AAC CGC ATG GAG Arg Ser Asn Arg Met Giu 1375 GTC TTC CCC TTC TTC TTT Val Phe Pro Phe Phe Phe 1390 139S -208- GTC AAT ATC TTT GTG GCT CTC ATC ATC ATC ACC TTC CAG GAG CAA GGG 4398 Val Asn Ile Phe Val Aia Leu le Ile Ile Thr Phe Gin Giu Gin Gly 1400 1405 1410 GAT AAG ATG ATG GAG GAG TGC AGC CTG GAG AAG AAT GAG AGG GCG TGC 4446 Asp Lys Met Met Giu Giu Cys Ser Leu Glu Lys Asn Giu Arg Aia Cys 1415 1420 1425 ATC GAC TTC GCC ATC AGC GCC AAA CCT CTC ACC CGC TAC ATG CCG CAG 4494 Ile Asp Phe Aaa Ile Ser Aa Lys Pro Leu Thr Arg Tyr Met Pro Gin 1430 1435 1440 AAC AGA CAC ACC TTC CAG TAC CGC GTG TGG CAC TTT GTG GTG TCT CCG 4542 Asn Arg His Thr Phe Gin Tyr Arg Vai Trp His Phe Val Vai Ser Pro :i 1455 1445 1450 14551475 TCC TTT GAG TAC ACC ATT ATG GCC ATG ATC GCC TTG AAT ACT GTT GTG 4590 Ser Phe Giu Tyr Thr Ile Met Aia Met Ile Aia Leu Asn Thr Vai Vai *1460 1465 1470 1475 CTG ATG ATG AAG TAT TAT TCT GCT CCC TGT ACC TAT GAG CTG GCC CTG 4638 Leu Met Met Lys Tyr Tyr Ser Ala Pro Cys Thr Tyr Giu Leu Ala Leu 1480 1485 1490 AAG TAC CTG AAT ATC GCC TTC ACC ATG GTG TTT TCC CTG GAA TGT GTC 4686 Lys Tyr Leu Asn eli Ala Phe Thr Met Val Rhe Ser Leu Giu Cys Vai 1495 1500 1505 CTG AAG GTC ATC GCT TTT GGC TTT TTG AAC TAT TTC CGA GAC ACC TGG 4734 Leu Lys Va Ile Ala Phe Gly Phe Leu Asn Tyr Phe Arg Asp Thr Trp 1510 I55 1520 AAT ATC TTT GAC TTC ATC ACC GTG ATT GGC AGT ATC ACA GAA ATT ATC 4782 A-sn lie Phe Asp 
Phe lie Thr Val Ile Gly Ser Ile Thr Giu Ile Ile 1525 1530 1535 CTG ACA GAC AGC AAG CTG GTG AAC ACC AGT GGC TTC AAT ATG AGC TTT 4830 ***eUThr Asp Ser Lys Leu Val Asn Thr Ser Gly Phe Asn Met Ser Phe 1540 1545 1550 1555 CTG AAG CTC TTC CGA GCT GCC CGC CTC ATA AAG CTC CTG CGT CAG GGC 4878 Leu Lys Leu Phe Arg Aa Aa Arg Leu Ile Lys Leu Lei lrg Gn Gly 1560 1565 1570 TAT ACC ATA CGC ATT TTG CTG TGG ACC TTT GTG CAG TCC TTT AAG GCC 4926 Tyr Thr Ile Arg le Leu Leu Trp Thr Phe Val Gin Ser Phe Lys Ala 1575 1580 1585 CTC CCT TAT GTC TGC CTT TTA ATT GCC ATG CTT TTC TTC ATT TAT GCC 4974 Leu Pro Tyr Val Cys Leu Leu Ile A-a Met Leu Phe he Ile Tyr Ala 1590 1595 1600 ATC ATT GGG ATG CAG GTA TTT GGA AAC ATA AAA TTA GAC GAG GAG AGT 5022 Ile Ile Gy Met Gin Val Phe Gy Asn Ile Lys Leu Asp Gu Gu Ser 1605 1610 1615 -209- CAC ATC His Ile 1620 AAC COG CAC AAC
AAC
Asn Arg His Asn As, TTC CGG AGT TTC TTT GGG TCC CTA ATG Phe Ara Ser Phe Phe Gly Ser Leu Met 1630 1635 CTA CTC TTC AGG AGT GCC ACA Leu Leu Phe Ara Ser Ala Thr 1640 TCA TGC CTT GGG GAG AAG
GGC
Ser Cys Leu Gly GiU Lys Gly 1655 GOT GAG GCC TGG CAG GAG Gly Glu Ala Trp Gin Giu ATT ATG
CTG
GGG CAG AAC GAG AAT GAA Giy Gin Asn Giu Asn Giu 1670
CGC
Arg TGT GAG CCT GAC Cys Giu Pro Asp 1660 TGC GGC ACC
GAT
Cys Gly Thr Asp ACC ACC GCA CCA TCA Thr Thr Ala pro Ser 1665 CTG GCC TAC GTG TAC Leu Aia Tyr Val Tyr
*S*b 9 TTT GTC TCC Phe Val Ser 1G85 GTG GCC OTC Val Ala Val 1700 ATC CTG GGG Ile Leu Gly TTC ATC TTC TTC
TGC
Phe Ile Phe Phe Cys 1690 ATC ATG GAC AAC TTT Ile Met Asp Asn Phe 1705 CCT CAC CAC TTG GAC Pro His His Leu Asp 1720 TCC TTC TTG ATG CTC AAC CTG TTT Ser Phe Leu Met Leu Asn Leu Phe GAG TAC CTG ACT Giu Tyr Leu Thr 1710 GAG TTT TC CGC Giu Phe Val Arg CGG GAC TCC
TCC
TAT GAC CGA GCA GCA Tyr Asp Arg Ala Ala 1735 ATG CTG ACT CTC ATG Met Leu Thr Leu Met 1750 TCC AAA GTG GCA
TAT
Ser Lys Val Ala Tyr 1765 GAG GAC ATG ACG
GTC
Giu Asp Met Thr Val 1780 ACA GCT CTG GAC
ATT
Thr Ala Leu Asp Ile
L
1800 TGT GGC CGC ATC CAT TAC ACT Cys Gly Arg Ile His Tyr Thr OTC TGO GCA GAA Val Trp Ala Giu 1730 GAG ATG TAT GAA Glu Met Tyr Giu 5070 5118 5166 5214 5262 5310 5358 5406 5454 5502 5550 5598 5646 5694 TCA CCT CCG CTA GGC Ser Pro Pro Leu Gly 1755 AAG AGG TTG GTC CTG Lys Arg Leu Val Leu CTC GGC AAG AGA TT CCC Leu Gly Lys Arg Cys Pro 1760 ATG AAC ATG CCA GTA
CT
Met Asn Met Pro Val Ala CAC TTC ACC TCC ACA CTT ATG GCT iis Phe Thr Ser Thr Leu Met zi L785 1790 LAA ATT GCC AAA GGT GGT GCA GAC ~ys Ile Ala Lys Gly Gly Ala Asp CTG ATC CG Leu lie Arg 1795 AGG CAG CAG Ara Gin Gin 1810 CCT CAC CTA CTA GAC TCA GAG CTA CAA AAG GAG ACC CTA GCC ATC TGG Leu Asp Ser Glu Leu Gin Lys Glu Thr Leu Ala Ile Trp 1815 1820 ProHi2 Le TCC CAG AAG ATG CTG GAT CTG CTT GTG CCC ATG CCC AAA GCC TCT GAC Ser Gin Lys Met Leu Asp Leu Leu Val Pro Met Pro Lys Ala Ser Asp 1830 1835 1840 -210- CTG ACT GTG GGC AAA ATC TAT GCA GCA ATG ATG ATC ATG GAC TAC TAT Leu Thr Val Gly Lys Ile Tyr Ala Ala Met Met Ile Met Asp Tyr Tyr 1845 1850 1855 AAG CAG AGT Lys Gin Ser 1860 AAT GCC Ccc Asn Ala Pro ATC ATT GCT Ile Ile Ala AAG GTG AAG AAG CAG Lys Val Lys Lys Gin 1865 ATG TTC CAG CGC ATG Met Phe Gin Arg Met 1880 AAT GCC AAA GCC CTG Asn Ala Lys Ala Leu 1895 AGG CAG CAG CTG Arg Gin Gin Leu 1870 GAG CCT TCA TCT Giu Pro Ser Ser 1885 CCT TAC CTC CAG Pro Tyr Leu Gin 1900 TAC CC? TCG ATG Tyr Pro Ser Met GAG GAA CAG AAA IGu Gu Gn Lys 1875 CTG CCT CAG GAG Leu Pro Gn Gu 1890 CAG GAC CCC GTT Gn Asp Pro Val 1905 AG? CCA CTC TCT Ser Pro Leu Ser .0 0.
,so*, 00 0 *00 006 TCA GGC CTG AGT GGC CGG Ser Gly Leu Ser Gly Arg 1910 AGT GGA Ser Gly CCC CAG GAT ATA TTC CAG TTG GCT TGT Pro Gin Asp Ile Phe Gin Leu Ala Cys 1925 1930 CAG TTC CAA GAA CGG CAG TCT CTG GTG Gin Phe Gin Giu Arg Gin Ser Leu Val 1940 1945 ATG GAC CCC GCC GAT GAC GGA Met Asp Pro Ala Asp Asp Gly 5742 5790 5838 5886 5934 5982 6030 6078 6126 6174 6222 GTG ACA GAC Val Thr Asp AGA CGT TCA Arg Arg Ser TTT TCC ACT AT? Phe Ser Thr Ile 1960 CCT AGC TCC ATG Pro Ser Ser Met 1955 AAT TCC TCG TGG Asn Ser Ser Trp CGG GAT AAG CGT TCA Arg Asp Lys Arg Ser 1965 TTG GAG GAA TTC TCC Leu Giu Giu Phe Ser 1975 CGT CGC CGG AGT TAC Arg Arg Arg Ser Tyr 1990 AAC TCT GAT TCA GGC Asn Ser Asp Ser Gly 20015 ATG GAG CGA Met Giu Arg AGC AGT Ser Ser GAA AAT ACC Giu Asn Thr TAC AAG TCC T'yr Lys Ser CAC TCC TCC TTG CGG His Ser Ser Leu Arg 1995 CAC AAG TCT GAC ACT His Lys Ser Asp Thr 2010 CTG TCA GCC CAC CGC CTG Leu Ser Ala His Arg Leu 2000 CAC CCC TCA GGG GGC AGG His Pro S er Gly Gly Ar-g GAG CGG Glu Arg 2020 CGA CGA TCA Arg Arg Ser AAA GAG Lys Glu 2025 CGA AAG CAT Arg Lys His CT? CTC TCT Leu Leu Ser CCT GAT GTC Pro Asp Vai TCC CGC TGC AAT TCA GAA GAG CGA Ser Arg Cys Asn Ser Giu Giu Arg 2040 CCA GAG CGC CGT CAA TCC AGG TCA Pro Giu Arg Arg Gin Ser Arg Ser 2055 GGG ACC CAG GCT GAC TGG Gly Thr Gin Ala Asp Trp, GAG TCC Giu Ser 6270 6318 6366 CCC AGT GAG GGC Pro Ser Giu Gly 2060 AGG TCA CAG ACG Arg Ser Gn Thr 2065 -211- CCC AAC AGA CAG CCC ACA CCT TCC CTA Pro Asn Arg Gin Giy Thr Cly Ser Leu 2070 2075 ACT GAG AGC TCC ATC CCC TCT Ser Giu Ser Ser Ile Pro Ser 2080 GTC TCT GAC Val Ser Asp 2085 GTC CCC CCA Vai Pro Pro 2100 CAC GCG GGC His Ala Ciy ACC AGC ACC CCA AGA AGA Thr Ser Thr Pro Arg Arg 2090 AAG CCC CCC CCC CTC CTT Lys Pro Arg Pro Leu Leu 2105 AGT CGT CGC CAG Ser Arg Arg Gin 2095 TCC TAC AGC TCC Ser Tyr Ser Ser
AGC
Ser CTC CCA CC Leu Pro Pro CTC ATT CGA Leu Ile Arg 2115 GAG CCC TCC Cu Cy Ser 2130 CTC ACC GAG Leu Thr Giu ATC TCT Ile Ser 2120 CCA CCT CCT Pro Pro Ala CAT GCA ACC GAG Asp Ciy Ser Ciu CCC CTC ACC TCC CA-A CCT CTG GAG Pro Leu Thr Ser Gin Ala Leu Ciu 2135 AGC AAC AAT CCT TG Ser Asn Asn Ala Trp TCT TC-C AAC TCT CCC Ser Ser Asn Ser Pro 2150 CAC CCC CAC CAC AGC CAA His Pro Gin Gin Arg Gin CAT CCC TCC CCA CAG His Ala Ser Pro Gin CCC TAC ATC Arg Tyr Ile 2165 TCA CAC TGT Ser Asp Cys 2180 TCC GAG CCC TAC TTG CCC CTC Ser Ciu Pro Tyr Leu Al1a Leu 2170 GTT GAG GAG GAG ACG CTC ACT Val Giu Giu Giu Thr Leu Thr CAC GAA CAC His Giu Asp 2175 TTC GAA GCA Phe Ciu Ala TCC CAC GCC Ser His Ala CCC GTC GCT Ala Vai Aia 6414 6462 6510 6558 6606 6654 6702 6750 6798 6846 6894 6948 2185 ACT ACC CTG GGC Thr Ser Leu Cly CGT TCC AAC Arg Ser Asn 2200 ACC ATC GC Thr Ile Gly 2205 CCC CAC TAT Giy His Tyr TCA CCC CCA CCC Ser Ala Pro Pro CTG CGG Leu Arg CAT ACC TCC CAC ATG CCC AAC His Ser Trp Gin Met Pro Asn 2215 CCC CCC CCC Arg Arg Arg ACG CCC GGC A-rg Arg Cy CCC CCT CCC CCA GC Cly Pro Giy Pro Gly 2230 CAC ACG GAA GAA GAT Asp Thr Ciu Giu Asp 2245 ATG ATC TCT GGC GCT CTC AAC Met Met Cys Ciy Ala Val A~ 2235 AAC CTG CTA ACT T, T- I-- GAC AAA TCC TACACGCTGC TCCCCCCTCC
GATGCATGCT'
Asp Lys Cys 2250 CTTCTCTCAC ATGGAGAAA CCAAGACAGA ATTGGGAAGC CAGTGCGGCC
CCGCGGCCAG
GAAGAGGGA AAGGAAGATG
GAAG 7032
INFORMATION FOR SEQ ID NO:25:
(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 7089 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: DNA (genomic)
(ix) FEATURE:
(A) NAME/KEY: CDS
LOCATION: 6 67 D)OTHER INFORMATION: /standard -name= "AlPhaLE- 3 1, (xSEQUENCE DESCRIPTION: SEQ ID GCTGCTGCTG CCTCTCCGAA GAGCTCGCGG AGCTCCCCAO AGGCGGTGGT CCCCGTGCTT GTCTGGATGC GGCTCTGAGT CTCCGTGTGT CTTTCTGCTT GTTGCTGTGT GCGGGTGTY.C 120 GGCCGCGATC ACCTTTGTGT GTCTTCTGTC TGTTTACC TCAGG ATG GCT CGC 174 TTC GGG GAG GCG GTG GTC GCC AGG CCA 000 TCC GGC GAT GGA GCTCG22 PheGlyGluAla Val Val Ala Arg Pro Gly Ser GlyAs lApSe 510 2225 GAC CAG AGC AGG AAC COG CAA OGA ACC CCC GTG CCG GCC TCG GGG CAG 270 Asp Gn Ser Arg Asn Arg Gn Gly Thr Pro Val Pro Ala Ser Gly Gn 20 25 30 35 GC C C A A A ACG AAGCA CGAGG GCOOC ACT ATO GCT 318 Ala Ala Ala Tyr Lys Gln Thr Lys Ala Gln Arg Ala Arg Thr Met Ala 40455 TTG TAC AAC COCATCCGCCG CAG AAC TGT TTC ACC GTC AAC AGA 366 Leu Tyr Asn Pro Ile Pro Val Arg Gln Asn Cys Phe Thr Val Asra Arg 5560 TCC CTG TTC ATC TTC GGA GAA GAT AAC ATT GTC AGG AAA TAT 0CC AAO 414 Ser Leu Phe Ile Phe Gly Glu Asp Asra Ile Val Arg Lys Tyr Ala Lys 75 8 AAO CTC ATC GAT TGG CCG CCA TTT GAG TAC ATG ATC CTG GCC ACC ATC 462 LYS Leu Ile Asp Trp Pro Pro Phe Glu Tyr Met Ile Leu Ala Thr Ile 90 95 ATT 0CC AAC TGC ATC GTC CTG 0CC CTG GAG CAG CAT CTT CCT GAO OAT 510 Ile Ala Asn Cys Ile Val Leu Ala Leu Glu GTn His Leu Pro Olu Asp 100 105 110 115 GAC AAo ACC CCC ATO TCC CGA AGA CTO GAO AAG ACA OAA CCT TAT TTC 558 Asp Lys Thr Pro met Ser Ara- Arg Leu Glu Lys Thr Olu Pro Tyr Phe 120 125 130 ATT 000 ATC TTT TOC TTT GAA OCT 000 ATC AAA ATT OTO 0CC CTO 000en -213- 44** 4* 4 4 4 4* 44 #4 4 *4*4 4*44 4
I
P1
G;
AS
TT
Ph 18
CG
A~r
AA
Ly
CTC
Lei.
AGI
Ser
GAA
Glu 260
GGT
Gly
TTT
Phe
ATG
Met
GCC
Al a
TTC
Phe 340
AAA
le Gly Ile Phe Cys Phe Clu Ala Gly Ile 135 140 PC ATC TTC CAT AAG GGC TCT TAC CTC CGC ie Ile Phe His Lys Gly Ser Tyr Leu Arg 150 155 ~C TTC ATC GTC GTC CTC ACT GCC ATC CTG P Phe Ile Val Val Leu Ser Cly Ile Leu 165 170 'C AAT ACT CAC GTG GAC CTC AGG ACC CTC e Asn Thr His Val Asp Leu Arg Thr Leu 0 185 G CCT TTC A-AG CTC GTG TCA CCG ATA CCT g Pro Leu Lys Leu Val Ser Gly Ile Pro 200 205 G TCC ATC ATG AAG GCC ATC GTA CCT CTT sSer Ile Met Lys Ala Met Val Pro Leu 215 220 TTC TTT GCC ATC CTG ATG TTT GCT ATC Phe Phe Ala Ile Leu Met Phe Ala Ile I 230 235 CC AAC TTA CAT CCA CC TCC TTC ATC Cly Lys Leu His Arg Ala Cys Phe MetA 245 250 CCA TTT CAC CCC CCT CAC CCA TCT CCT C Cly Phe Asp Pro Pro His Pro Cys Cly V 265 2 TAT GAA TCC AAC CAC TCC ATC CCC CCC k- Tyr Ciu Cys Lys Asp Trp Ile Cly Pro A- 280 285 CAT A-AC ATC CTT TTT CCT CTC CTC ACT CG Asp Asn Ile Leu Phe Ala Val Leu Thr Vz 295 300 CA-A CCC TCC ACC ACT CTC CTC TAC A-AT AC Glu Cly Trp Thr Thr Val Leu Tyr A-sn T±2 310 315 A-CC TCC AAT TCG CTC TAC TTC ATC CCC CT Thr Trp, Asn Trp Leu Tyr Phe Ile Pro Le 325 330 TTT CTT CTC AAC CTA CTC CTC CCA CTC CT Phe Vai Leu Asn Leu Val Leu Cly Val Le 345 35 CAG ACA GAG A-GA CTC GAG A-AC CCA A-CC GC Lys Ile Val Ala 145 AAT CCC TGG A-AT Asn Gly Trp A-sn 160 CCC ACT GCA CCA Ala Thr Ala Cly 175 CCC CCT CTC CGT Arg Ala Val Arg 190 AGC CTC CAC ATT Ser Leu Gin Ile CTG CAC ATT GC eu Cn Ile Cly 225 ~TT CCT TTC GAG 'I :ie Cly Leu Ciu P 240 AC AAT TCA CCT A -sn A-sn Ser Gly 1 255 TC CAC CCC TGC C al Gin Cly Cys p 70 .T CAT CCC ATC A sn Asp Cly Ile TI PC TTC CA-C TCC A 1i Phe Gin Cys Il 305 C AAT CAT CCC TI Lr Asn Asp Ala Le 320 C ATC ATC ATT CC u Ile Ile Ile l 335 T TCC CCC GAA TT U Ser Cly Ciu Ph 0 T TTC ATG A-AC CT
I
C
V
A
T
Va 21
CT
e ~eu Cly TC ATC a! Met cc CAC hr His T C CTC ULeu 195 GC TTC .1 Leu 0 'T CTC U Leu C TAC e Tyr T' CTA eLeu G CT Ala 275
CAC
Cm
ACC
Thr
CCA
Gly
TCC
Ser
CC
Al a 355
CCC
798 846 894 942 990 1038 1086 1134 1182 1230 1278 -214- Lys Glu A-rg Giu Arg Val. Giu A-sn Arg Arg AlaPeMtLyLeAr 360 365 370etLSLe r CGC CAC CA-C CAG ATT GAG CGT GAG CTG A-AT GGC TAC CCT CCC TGG ATA .1326 AgGin Girl Gin Ile GiU Arg Glu Leu Asn ly Tyr Arg Ala Trp Ile 3750 8 GAC A-AA GCA GAG GAA GTC ATG CTC GCT GA-A GAp A-AT A-A-A A-AT CCT GGA 17 Asp Lys Ala Ciu Glu Val Met Leu Ala Glu Giu Asn Lys Asn Ala Gly 17 390 395 400 ACA TCC GCC TTp. CAA GTG CTT CGA AGG GCA A-CC A-TC A-AG AGG A-GO CGG 42 Thr Ser Ala Leu Giu Val. Leu A-g A-rg A-la Thr Ile Lys Arg Ser A-rg 4054145 410 42130Hs422VlAs l 4155 .ACA TCC AT GC ACT OGA CG TCC ACT A GAG AC A TGT TTCAT ATC 51 Thr ea Glu -i ThrPr e A- l rg Asp SerIlLy Ser Asp CuHsysaiApLie 44430 1470 TCC TCT GTGGT CTA TC CC CA-C ACTGA A- T A--CT CGC A-AC 6 e S rVal Cly ThVraL u All-r l Ser Tye Lys Ser Alais 440 445 1518uA g e L u Ar l 50 I CA- CTC A T T A A T C C G G G T T T C TG A T G G C G1 1 Va le AsCyVlSrTrPeArg HisMeVaLy rGn Vls Giu Tyrg Leu Ileu Val le A G C C T GTG GCA C C A A C A CA-C A-A G G GA-A A -CC T T CATG C cC A -TO6 6 48 455 4605 CrG~ CCC CA C ATC GTAACC CA-C CTC T TA-C TCAG TT G T 170 Sen r Gi A-r i Met Vai Ls er Gin Var Phe Tyr Trp lie VLe Lhe C T C C C p.A TO-CAT C OO TCT CC CC TT GATC CT CA-C A-C 166 SrAg Leu Vai Alaeu A-sn Thr AlaPe Cys aAiale Ai his Hs Va-s AC CCT CA TCT A-TC CGGC C C COA CATTT O G T T902 GinSe Po Gn Trp SeTr Vals Leu Aru Ayr Ler A-ia Phu e le 50557 1710 TC AA TO TCC OTO TTCT CACATC TCC CTC A-G ATG TT CCC TC CC -215- Phe Lys 580 TTG ATG Leu Met CTC TTC Leu Phe AGG TTT Arg Phe CCT GCA Pro Ala 645 AAT GAG Asri Glu 660 GGC ATG Gly Met TAC ACG Tyr Thr AAC GC Asn Ala C AAC CAG Asn Gin L 725 GCA CCC Ala ProA 740 ATG TCG A Met Ser M CGG CAC C Arg His H CAC ATG C His Met G.
7: ACC ATG 'U 11
AG
Se
AT
11
AA
As: 63 Gc Al~ GTc Val
I'GC
Tr
:TA
eu
:AG
1n ~10 ~ys
AC
'Sn
TG
et
AC
is ln 90 .e Thr C TCA .r Ser C GTT e Val 615 C TTT n Phe 0 C ATC a Ile 3ATG *Met
TCT
Ser CTG2 Leu2 695 GAA C Giu I CAT C His ATG C met P TGG G Trp G 7 ATG T Met S 775 ATG T Met S CCG C~
AT
Me
GT
Va
AA
As:
AT
Mel
TAC
Tyi
GCC
680
%AT
ksn
TG
~eu
CA
dla
'CT
ro
AG
iu 60 cc er er
SE
G AA t Ly 0 C TT 1 Ph T GA ni AS' G AC tTh:
:AA
-Asr 66S
ATC
Ile
GTC
Val
ACC
Thr
CTG
Leu
TCG
Ser 745
CCA
Pro
GTG
Val
AGC
Ser
AAC
'r Trp, Ala 5S *G TCT ATC s Ser Ile T GCT CTC e Ala Leu T GGG ACT p Gly Thr 635 T GTG TTC r Val Phe 650 r' GGG ATC 1Gly Ile *TAC TTC *Tyr Phe TTC TTGC Phe Leu AAG GAT C Lys Asp C 715 CAG AAG C Gin Lys A 730 ATC GAA A Ile Giu A CGC AGC Arg Ser TGG GAG C.
Trp, Giu G.
7' CAG GAG G( Gin Giu A: 795 CCC CTC A] Se
AT
Ii
CT
Le 62 cc Pr
CAC
G1i
CGC
A.rS
W.T
Ile
CT
Ula ~00
;AA
CC
l a
GA
2r in 80 la ~r Leu Arg Asn Leu 590 'C AGT TTG CTT TTC e Ser Leu Leu Phe 605 AGGA ATG CAG TTA u Gly Met Gin Leu 0 T' TCG GCA AAT TTT o Ser Ala Asn Phe 640 3 ATC CTG ACG GGT a Ile Leu Thr Gly 655 TCC CAG GGT GGG Ser Gin Gly Giy 670 *GTG CTC ACC TTG Val Leu Thr Leu 685 ATC GCT GTG GAT Ile Ala Val Asp CAG GAG GAA GAA C Gin Giu Giu Giu C 720 AAG GAG GTC AGC C Lys Giu Val Ser P 735 GAC AGA AGG AGA A Asp Arg Arg Arg A 750 CAC CTG AGG GAG C His Leu Arg Giu A: 765 CGT ACC AGC CAG C Arg Thr Ser Gin L~ 71 CTC AAC AGA GAG G2 Leu Asn Arg Giu G 800 CCG CTC AGC TCC C]~ Val Val Ser 595 CTC CTC TTC Leu Leu Phe 610 TTT GGA GG Phe Gly Gly 625 GAT ACC TTC Asp Thr Phe GAG GAC TGG Giu Asp Trp, 3TC AGC TCA Val Ser Ser 675 CTT GGC AAC ?he Giy Asn 690 AT CTC GCC Lsn Leu Ala
'OS
~AG GCC TTC iu Ala Phe CG ATG TCT ro Met Ser GA CAC CAC rgHis His 755 GG AGG CGC rg Arg Arg 770 TG AGG AAG eu Arg Lys kG GCG CCG Lu Aia Pro ~C AAC
CCG
1998 2046 2094 2142 2190 2238 2286 2334 2382 2430 2478 2526 2574 2622 -216- Thr Met Asn Pro Leu Asn Pro Leu Asn Po Leu Ser Ser Leu Asn Pro 805 810 815e erSrLu s r CTC AAT GCC CAC CCC AGC CTT TAT CGG CGA CCC AGG GCC ATT GAG GGC 2670 Leu Asn Ala His Pro Ser Leu Tyr Arg Arg 2Po Arg Ala le Glu Gly 820ProArgAlaIleGluGl 825 830 835 CTG GCC CTG GGC CTG GCC CTG GAG AAG TTC GAG GAG GAG CGC ATC AGC 2718 Leu Ala Leu Gly Leu Ala Leu Glu Lys Phe Giu Giu Giu Arg Ile Ser 840 845 850 CGT GGG GGG TCC CTC AAG GGG GAT GGA GGG GAC CGA TCC AGT GCC CTG 276 Arg Gly Gly Ser Leu Lys Gly Asp Gly Gly Asp Arg Ser Ser Ala Leu .855 860 6 865 GAC AAC CAG AGG ACC CCT TTG TCC CTG GGC CAG CGG GAG CCA CCA TGG 2814 00 Asp Asn Gin Arg Thr Pro Leu Ser Leu Gly Gin Arg Glu Pro Pro Trp 80875 880 C GT T GGA AAC TGT GAC CCG ACT CAG CAG GAG GCA 2862 Leu Ala Arg Pro Cys His Gly Asn Cys Asp Pro Thr Gin Gin Glu Ala 885 890 895 *.GGG GGA GGA GAG GCT GTG GTG ACC TTT GAG GAC CGG GCC AGG CAC AGG 2910 Gly Giy Gly Glu Ala Val Val Thr Phe Giu Asp Arg Ala Arg His Arg 900 905 910 915 CAG AGC CAA CGG CGC AGC CGG CAT CGC CGC GTC AGG ACA GAA GGC AAG 2958 Gl..in Ser Gln Arg Ser Arg His Arg Arg Val Arg Thr Glu Gly Lys 920 925 930 GAG TCC TCT TCA GCC TCC CGG AGC AGG TCT GCC AGC CAG GAA CGC AGT 3006 Glu Ser Ser Ser Ala Ser Arg Ser Arg Ser Ala Ser Gin Glu Arg Ser 935 94094 CTG GAT GAA GCC ATG CCC ACT GAA GGG GAG AAG GAC CAT GAG CTC AGG 3054 945 Leu Asp Glu Ala Met Pro Thr Glu Gly Glu Lys Asp His Gu Leu Arg 950 955 960 GGC AAC CAT GGT GCC AAG GAG CCA ACG ATC CAA GAA GAG AGA GCC CAG 3102 Gly Asn His Gly Ala Lys Glu Pro Thr Ile Gin Glu Glu Arg Ala Gin 965 970 975 GAT TTA AGG AGG ACC AAC AGT CTG ATG GTG TCC AGA GGC TCC GGG CTG 3150 Asp Leu Arg Arg Thr Asn Ser Leu Met Val Ser Arg Gly Ser Gly Leu 980 985 990 995 GCA GGA GGC CTT GAT GAG GCT GAC ACC CCC CTA GTC CTG CCC CAT CCT 3198 Ala Gly Gly Leu Asp Glu Ala Asp Thr Pro Leu Val Leu Pro His Pro 1000 1005 1010 GAG CTG GAA GTG GGG AAG CAC GTG GTG CTG ACG GAG CAG GAG CCA GAA 3246 Glu Leu 
Glu Val Gly Lys His Val Val Leu Thr Glu Gln Glu Pro Glu 1015 1020 1025 GGC AGC ACT GAG CAG GCC CTG CTG GGG AAT GTG CAG CTA GAC ATG GGC 3294 -217- Gly Ser Ser Giu Gin Ala Leu Leu Gly Asn Val Gin Leu Asp Met Gly 1030 1035 1040 CGG GTC ATC AGC CAG AGC GAG CCT GAC CTC TCC TGC ATC ACG GCC AAC 3342 Arg Val Ile Ser Gin Ser Giu Pro Asp Leu Ser Cys Ile TrAia Asn 1045 1050 1055 ACG GAC AAG GCC ACC ACC GAG AGC ACC AGC GTC ACC GTC GCC ATC CCC 3390 Thr Asp Lys Ala Thr Thr Giu Ser Thr Ser Val Thr Val Ala Ile Pro 1060 1065 1070 1075 GAC GTG GAC CCC TTG GTG GAC TCA ACC GTG GTG CAC ATT AGC AAC AAG 3438 Asp Val Asp Pro Leu Val Asp Ser Thr Val Val His Ile Ser Asn Lys 1080 1085 1090 *ACG GAT GGG GAA GCC AGT CCC TTG AAG GAG GCA GAG ATC AGA GAG GAT 3486 *Thr Asp Gly Giu Ala Ser Pro Leu Lys Giu Ala Giu Ile Arg Giu Asp .1095 1100 1i05 .GAG GAG GAG GTG GAG AAG AAG AAG CAG AAG AAG GAG AAG CGT GAG ACA 3534 Giu Giu Giu Val Giu Lys Lys Lys Gin Lys Lys Giu Lys Arg Giu Thr *1110 1115 1120 *GGC AAA GCC ATG GTG CCC CAC AGC TCA ATG TTC ATC TTC AGC ACC ACC 3582 Gly Lys Ala Met Val Pro His Ser Ser Met Phe Ile Phe Ser Thr Thr 1.125 1130 1135 AAC CCG ATC CGG AGG GCC TGC CAC TAC ATC GTG AAC CTG CGC TAC TTT 3630 Asn Pro Ile Arg Arg Ala Cys His Tyr Ile Val Asn Leu Arg Tyr Phe 1140 1145 1150 1155 GAG ATG TGC ATC CTC CTG GTG ATT GCA GCC AGC AGC ATC GCC CTG GCG 3678 Giu Met Cys Ile Leu Leu Val Ile Ala Ala Ser Ser Ile Ala Leu Ala 1160 1165 1170 *GCA GAG GAC CCC GTC CTG ACC AAC TCG GAG CGC AAC AAA GTC CTG AGG 3726 S..Ala Giu Asp Pro Val Leu Thr Asn Ser Glu Arg Asn Lys Val Leu Arg 1175 1180 1185 TAT TTT GAC TAT GTG TTC ACG GGC GTG TTC ACC TTT GAG ATG GTT ATA 3774 Tyr Phe Asp Tyr Val Phe Thr Gly Val Phe Thr Phe Giu Met Val Ile 1190 1195 in AAG ATG ATA GAC CAA GGC TTG ATC CTG GAG GAT GGG TCC TAC TTC CGA 3822 Lys Met Ile Asp Gin Gly Leu Ile Leu Gin Asp Gly Ser Tyr Phe Arg 1205 1210 1215 GAC TTG TGG AAC ATC CTG GAC TTT GTG GTG GTC GTT GGC GCA TTG GTG 3870 Asp Leu Trp Asn Ile Leu Asp Phe Val Val Val Val Gly Ala Leu 
Vai 1220 1225 1230 1235 GCC TTT GCT CTG GCG AAC GCT TTG GGA ACC AAC AAA GGA CGG GAC ATC 3918 Ala Phe Ala Leu Ala Asn Ala Leu Gly Thr Asn Lys Gly Arg Asp Ile 1240 1245 1250 AAG ACC ATC AAG TCT CTG CGG GTG CTC CGA GTT CTA AGG CCA CTG AAA 3966 -218- Lys Thr lie Lys Ser Leu Arg Val Leu Arg Val Leu Arg Pro 1Leu Lys 1255 1260 16 ACC ATC AAG CGC TTG CCC A-AG CTC AAG GCC GTC TTC GAC TGC GTA GTG 4014 Thr Ile Lys Arg Leu Pro LYS Leu LYS Ala VaPhAsCyVaVa 1270 1275 1280 s CsVa a A C C T C C T T G A A G A A T G T C T T C A A C A T C T1T2GG T C AA8T0T C4 6 T 12Sr8e5 L s s Val. Phe Asn lie Leu lie Val Tyr Lys Leu 4Phe 12 51290 1295T r L S e h ATG TTC ATC T GCT GTC ATC G C GT -G C C T c G G A A A T Met ~~30 Pe ie Pe Aa Vl e Ala Val Gin Leu Phe LYS GlY Lys Ph 1320 325 uC~s 1310l AAC TAT TAGAT GAC GAGTAAACAGAG A T A 13 11315 TTC TC ACTC AAG GA C ACA G AG AAG GAG TGGCAAT CT ATA
GG
TruPhe T Cys asp Ser Se L As Thr Giu LyslyTp r Gi Cys Le Gn 136 137 3120732 CA TC *T GAT CTGACA GAG GAGCCAGCCAACCCACC is~ Tr Val Asp Vaish Gu Lys ASP LysMe Gl Vao seGr Arg GiuAs 11330 ~39 13913 5534 1400 ~1405 h r TGGTT AAG CGC CT GATC TT GTA GC AC ATT ATC TGG CC TCT
CG
Tr y -g GuPhe is Tyr Asp Asn ie lie TVal Ala LeuleIe44 1350 1355 ~1 46 25 e 4 GAG-jc CC TG T AGT TCC AC-A GG GA TGCAGG CT CAGAA GTT CGAG 449 Thr Geu Phe Thr Lai er Mth Gly Gu Gy Trp Pro Gin Lai Leu Gi 143751343024 A C TC GT GA GT C- GCA GACC GC CAAGG CC-A AC ACC AGC AC H 44s Sa Val Asp Val Thr Giu GTysp ArGl r 1413801138 CTG CAG AG ATGA TC C ACCTTC TA TA GT G AC TTT GTG GTC590CC Aet Met Gn Metn Sr ie h PheGn Tyr l Val T Hi Phe ValVaPhPr 140140 140 1475 GTGTCT TT C TTT GAG AC TC TT TG GCC AT T ATC CC TTC AG 46 Val Ser Pro Ser Phe Glu Tyr Thr Ile Met Ala Met Ile Ala Leu Asn 1480 1485 1490 ACT GTT CTG CTG ATG ATG AAC TAT TAT TCT GCT CCC TGT ACC TAT GAG 4686 Thr Val Val Leu Met Met Lys Tyr Tyr Ser Ala Pro Cys Thr Tyr Clu 1495 1500 1505 CTG CCC CTG AAG TAC CTG AAT ATC GCC TTC ACC ATC CTG TTT TCC CTG 4734 Leu Ala Leu Lys Tyr Leu Asn Ile Ala Phe Thr Met Val Phe Ser Leu 1510 1515 1520 GAA TGT GTC CTG AAG GTC ATC CCT TTT GGC TTT TTG AAC TAT TTC CGA 4782 Glu Cys Val Leu Lys Val Ile Ala Phe Cly Phe Leu Asn Tyr Phe Arg *1525 1530 1535 G AC ACC TGC AAT ATC TTT GAC TTC ATC ACC GTC ATT CCC AGT ATC ACA 4830 .Asp Thr Trp Asn Ile Phe Asp Phe Ile Thr Val Ile Cly Ser Ile Thr *1540 1545 1550 1555 G AA ATT ATC CTG ACA GAC AGC AAG CTG GTG A.AC ACC ACT GGC TTC AAT 4878 Giu Ile Ile Leu Thr Asp Ser Lys Leu Val Asn Thr Ser Cly Phe Asn *1560 1565 1570 ATG AGC TTT CTC AAC CTC TTC CGA GCT GCC CCC CTC ATA AAG CTC CTG 4926 **.Met Ser Phe Leu Lys Leu Phe Arg Ala Ala Arg Leu Ile Lys Leu Leu a.*1575 1580 1585 CCT CAC GCC TAT ACC ATA CGC ATT TTG CTC TCC ACC TTT CTC CAC TCC 4974 .*Arg Gin Gly Tyr Thr Ile Arg Ile Leu Leu Trp Thr Phe Val Gin Ser *1590 1595 1600 TTT AAC GCC CTC CCT TAT CTC TCC CTT TTA ATT CCC ATC CTT TTC TTC 5022 Phe Lys Ala Leu Pro Tyr Val Cys Leu Leu Ile Ala Met Leu Phe Phe 1605 1610 1615 *ATT TAT CCC ATC ATT CCC ATC CAG CTA TTT CCA AAC ATA AAA TTA GAC 5070 Ile Tyr Ala Ile Ile ly Met Gin Val Phe Gly Asn Ile Lys Leu Asp 1620 1625 1630 1635 GAG GAG ACT CAC ATC AAC CCC CAC AAC AAC TTC CCC 
ACT TTC TTT CCC 5118 Clu Giu Ser His Ile Asn Arg His Asn Asn Phe Arg Ser Phe Phe Cly 1640 1645 1 Zr TCC CTA ATC CTA CTC TTC ACC ACT CCC ACA GGT GAG CCC TGC CAC GAG 5166 Ser Leu Met Leu Leu Phe Arg Ser Ala Thr Gly Ciu Ala Trp Gin Glu 1655 1660 1665 ATT ATG CTG TC.A TCC CTT CCC GAG AAG CCC TCT GAG CCT GAC ACC ACC 5214 Ile Met Leu Ser Cys Leu Cly Giu Lys Cly Cys Ciu Pro Asp Thr Thr 1670 1675 1680 CCA CCA TCA CCC GAG AAC GAG AAT CAA CCC TCC CCC ACC CAT CTC CCC 5262 Ala Pro Ser Gly Gin Asn Clu Asn Glu Arg Cys Cly Thr Asp Leu Ala 1685 1690 1695 TAC CTC TAC TTT GTC TCC TTC ATC TTC TTC TCC TCC TTC TTC ATG CTC 5310 -220- T1700 y P e a Ser Phe lie Phe, h y Ser Phe Leu Met Leu 170 1 051710 1715 A-AC CTC TTT GTC GCC CTC A-TC A-TG GA-C A-AC TTT GA-G TAC CTG ACT CGG 5358 A-sn Leu Phe Val Ala Val Ile Met Asp A-sn Phe Giu Tyr Leu Thr A-rg 1720 1725 1730 GAC TCC TCC ATC CTC GGG CCT CA-C CAC TTC AC GA-G TTT GTC CC GTC 5406 Asp Ser Ser Ile Leu Gly Pro His His Leu Asp Giu Phe Val Arg Va.1 1735 1740 1745 TGC GCA GA-A TAT GA-C CGA OCA GCA TGT GGC CGC ATC CAT TA-C ACT GAG Tr laGuTyr Asp A-rg Ala A-ia Cys Gly A-rg Ile His Tyr454Gi 0 1750 1755 1760Ty hGl ~A TG TAT AA A T C CT C ACT CTC A TG TCA CCT CC C T- C C C C G -G5 0 Met Gyr 1Ceu LeuTCGC Tyr Met eu Thr L u Met Ser Pro Pro Leu G y Le G y Ls 1765 1 770 17 l e l y A-GA TGT CCC TCC A-AA GTG GCA TAT A-AG AGG TTG GTC CTG A-TG A-AC ATG Arg CyS Pro Ser Lys Val A-a Tyr LYS Arg Leu Val Leu Met Asn Met 1720 1785 1790 19 **,CCA GTA GCT GA-C GA-C ATG A-CC GTC CAC TTC A-CC TCC ACA CTT ATG GCT 5598 Pro Val A-ia Giu Asp Met Thr Val His Phe Thr Ser Thr Leu Met A-i a .1800 1805 1810 CTG A-TC CCC A-CA GCT CTG GA-C A-TT AA-A TT CCC AAA GGT GGT CCA GA-C54 *e le iArg Thr A a Leu As Ile Ly le A-a Ly Cl ly A-a As 1851820 1825 A-CC CA-C CA-C CTA CA-C TCA GA-C CTA CA-A A-AG GAG A-CC CTA CCC A-TC TGC 5694 A-gGGin Leu Asp Ser Giu Leu Gin Lys Giu Thr Leu A-ia Ile Trp CCT CAC CTA TCC CAC A-AG ATC CTG CAT CTC CTT GTG CCC ATC CCC AAA 5742 *get Pro His Leu Ser Gin 
Lys Met Leu Asp Leu Leu Val Pro Met Pro Lys 1845 1850 1855 CCC TCT GA-C CTC A-CT GTC CCC AAA ATC TAT CCA GCA A-TG A-TG ATC ATG 5790 A-ia Ser Asp Leu Thr Val Giy Lys Ile Tyr A-ia A-ia Met Met l e 1860 1865 1870187 CA-C TA-C TAT A-AC CA-C A-CT A-AC CTC AAG AA-C CA-C A-CC CA-C CAC CTG GAC 5838 Asp Tyr Tyr Lys Gin Ser Lys Val Lys Lys Gin A-rg Gin Gin Leu Glu 1880 1885 1890 AA CA-C AAA- AAT GCC CCC ATG TTC CA-C CC ATC A-C CCT TCA TCT CTG 5886 Ciu Cmn Lys A-sn Ala Pro Met Phe Cmn A-rg Met Ciu Pro Ser Ser Leu 1895 1900 1905 CCT CA-C GAC A-TC ATT GCT A-AT CCC A-AA GCC CTC CCT TA-C CTC CAG CA-C93 Pro Gin Clu Ile Ile A-ia A-sn A-ia Lys A-ia Leu Pro Tyr Leu Gin Gin 1910 1915 1920 GA-C CCC CTT TCA CCC CTG A-CT CCC CCC A-CT CA TAC CCT TCC A-TC A-CT 5982 -221- Asp Pro Val Ser Gly Leu Ser Gly Arg Ser Gly Tyr Pro Ser Met Ser 1925 1930 1935 CCA CTC TCT CCC CAG GAT ATA TTC CAG TTG GCT TGT ATG GAC CCC GCC 6030 Pro Leu Ser Pro Gin Asp lie Phe Gin Leu Ala Cys Met Asp Pro Ala 1940 1945 1950 1955 GAT GAC GGA CAG TTC CAA GAA CGG CAG TCT CTG GTG GTG ACA GAC CCT 6078 Asp Asp Gly Gln Phe Gin Giu Arg Gin Ser Leu Val Val Thr Asp Pro 1960 1.965 1970 AGC TCC ATG AGA CGT TCA TTT TCC ACT ATT CGG GAT AAG CGT TCA AAT 6126 Ser Ser Met Arg Arg Ser Phe Ser Thr Ile Arg Asp Lys Arg Ser Asn 1975 1980 1985 TCC TCG TGG TTG GAG GAA TTC TCC ATG GAG CGA AGC AGT GAlA AAT ACC 6174 Ser Ser Trp, Leu Giu Giu Phe Ser Met Giu Arg Ser Ser Glu Asn Thr 1990 1995 2000 TAC AAG TCC CGT CGC CGG AGT TAC CAC TCC TCC TTG CGG CTG TCA GCC 6222 Tyr Lys Ser Arg Arg Arg Ser Tyr His Ser Ser Leu Arg Leu Ser Ala 2005 2010 2015 CAC CGC CTG AAC TCT GAT TCA GGC CAC AAG TCT GAC ACT CAC CCC TCA 6270 His Arg Leu Asn Ser Asp Ser Giy His Lys Ser Asp Thr His Pro Ser 2020 2025 2030 2035 GGG GGC AGG GAG CGG CGA CGA TCA AAA GAG CGA AAG CAT CTT CTC TCT 6318 Gly Gly Arg Giu Arg Arg Arg Ser Lys Glu Arg Lys His Leu Leu Ser 2040 2045 2050 ***CCT GAT GTC TCC CGC TGC AAT TCA GAA GAG CGA GGG ACC CAG GCT GAC 6366 *..Pro Asp Val Ser Arg Cys Asn Ser Giu Giu 
Arg Gly Thr Gin Ala Asp .2055 2060 2065 TGG GAG TCC CCA GAG CGC CGT CAA TCC AGG TCA CCC AGT GAG GGC AGG 6414 *..Trp Giu Ser Pro Giu Arg Arg Gin Ser Arg Ser Pro Ser Giu Gly Arg 2070 2075 2080 *..TCA CAG ACG CCC AAC AGA CAG GGC ACA GGT TCC CTA AGT GAG AGC TCC 6462 Ser Gin Thr Pro Asn Arg Gin Gly Thr Gly Ser Leu Ser Giu Ser Ser 2085 2090 ATC CCC TCT GTC TCT GAC ACC AGC ACC CCA AGA AGA AGT CGT CGG CAG 6510 Ile Pro Ser Val Ser Asp Thr Ser Thr Pro Arg Arg Ser Arg Arg Gin 2100 2105 2110 2115 CTC CCA CCC GTC CCG CCA AAG CCC CGG CCC CTC CTT TCC TAC AGC TCC 6558 Leu Pro Pro Val Pro Pro Lys Pro Arg Pro Leu Leu Ser Tyr Ser Ser 2120 2125 2130 CTG ATT CGA CAC GCG GGC AGC ATC TCT CCA CCT GCT GAT GGA AGC GAG 6606 Leu Ile Arg His Ala Gly Ser Ile Ser Pro Pro Ala Asp Gly Ser Giu 2135 2140 2145 GAG GGC TCC CCG CTG ACC TCC CAA GCT CTG GAG AGC AAC AAT GCT TGG 6654 -222- Giu Gly Ser Pro Leu Thr Ser Gin Ala Leu Glu Ser Asn Asn Ala Trp 2150 2155 2160 CTG ACC G AG T T T C A C CCG CAC CCC CAG CAG AGO CAA CAT GCC 6702 2eu ThrG5 Ser Ser Asn Ser Pro His Pro Gn Gn Arg Gn His Ala TCC CCA CAG CGC TAC ATC TCC GAG CCC TAC TTG GCC CTG CAC GAA GAC 6750 Ser Pro Gin Arg Tyr le Ser Giu Pro Tyr Leu Ala Leu His Giu Asp9 2180 2185 2190 29 TCC CAC GCC TCA GAC TGT GTT GAG GAG GAG ACG CTC ACT TTC GAA GCA 6798 Ser His A-la Ser Asp Cys Val Giu Glu Giu Thr Leu Thr Phe Giu Al1a 2200 2205 2210 GCC GTG GCT ACT AGC CTG GGC CGT TCC AAC ACC ATC GGC TCA GCC CCA 6846 A-la Val Al1a Thr Ser Leu Gly Arg Ser Asn Thr Ile Gly Ser Ala Pro 2215 2220 2225 CCC CTG CGG CAT AGC TGG CAG ATG CCC AAC GGG CAC TAT CGG CGG CGG 6894 Pro Leu Arg His Ser Trp Gin Met Pro Asn Gly His Tyr Arg Arg Arg 2230 2235 2240 AGG CGC GGG GGG CCT GGG CCA GGC ATG ATG TGT GGG GCT GTC AAC AAC 6942 Arg Arg Gly Gly Pro Gly Pro Gly Met Met Cys Giy Ala Val Asn Asn 2245 2250 2255 .CTG CTA AGT GAC ACG GAA GAiA GAT GAC AAA TGC TAGAGGCTGC TCCCCCCTCC 6995 ***Leu Leu Ser Asp Thr Giu Giu Asp Asp Lys Cys ***2260 2265 2270 .GATGCATGCT CTTCTCTCAC ATGGAGAAAA CCAAGACAGA 
ATTGGGAAGC CAGTGCGGCC 7055 CCGCGGGGAG GAAGAGGGAA AAGGAAGATG GAAG 7089 INFORMATION FOR SEQ ID NO:26: SEQUENCE
CHARACTERISTICS:
LENGTH: 2634 base pairs TYPE: nucleic acid STRANDEDNESS: double TOPOLOGY: linear (ii) MOLECULE TYPE: DNA (genomic) (ix)
FEATURE:
NAME/KEY: CDS LOCATION: 1..1983 OTHER INFORMATION: /standard-name= "Beta-2d" (xi) SEQUENCE DESCRIPTION: SEQ ID NO:26: ATG GTC CAA AGO GAC ATG TCC AAG TCT CCT CCC ACA CCG GCG GCG GCG 48 Met Val Gin Arg Asp Met Ser Lys Ser Pro Pro Thr Pro Ala Ala Ala 1 GTG GCG CAG GAG ATC CAG ATG GAA CTG Val Ala Gin Glu Ile Gin Met Gin Leu 25 GGG GCG CTC GGA GCC GCC GCA CAG TCA Gly Ala Leu Gly Ala Ala Ala Gin Ser 40 AAA AAC AGA TTT AAA GGA TCT GAT GGA Lys Asn Arg Phe Lys Gly Ser Asp Gly 55 TCA AAT AGT TTT GTT CGC CAG GGT TCG Ser Asn Ser Phe Val Arg Gin Gly Ser 70 CCA TCC GAT TCC GAT GTA TCT CTG GAG Pro Ser Asp Ser Asp Val Ser Leu Gin AGA GAA GCG GAG CGG CAG GCC CAG GCA Arg Gin Ala Gin Arg Gin Ala Gin Ala 100 105 AAG CCC GTT GCA TTT C GTT CGG ACA2 Lys Pro Val Ala Phe Ala Val Arg Thr2 115 120 CAT GAA CAT GAT CTT CCA GTG CCT GGC His Gin Asp Asp Val Pro Val Pro Gly 130 135 AAA GAT TTT CTG CAT CTT AAG GAA AAA Lys Asp Phe Leu His Val Lys Gin Lys P 145 150 CCC CGA TTG GTA AAA GAA CCC TGT CAA Gly Arg Leu Val Lys Gin Gly Cys Gin I 165 1 Val Lys Leu Gin Asn Met Arg Leu Gin H 180 185 GGG AAA TTC TAC TCC AGT AAA TCA GGA G Gly Lys Phe Tyr Ser Ser Lys Ser Gly G.
195 200 CGT GAC ATA CTA CCT AGT TCC AGA AAA T( Gly Asp Ile Val Pro Ser Ser Arg Lys S~ 210 215 ATA GAC ATA CAT GCT ACT GCC TTA CAT CG Ile Asp Ile Asp Ala Thr Giy Len Asp Al 10 CTA GP Leu Gl TAT GG Tyr Gi AGC AC Ser Th GCA GA( Ala As; 7! GAG GAC Gin Asr 90 TAG TTC G1n Leu !LAT GTC ksn Val ~TG GCC let Ala 'TT AAC 'he Asn 155 .TC GGA le Cly 70 AT CAA is Gin GA AAT ly Asn CA ACA er Thr :A GAA .a Gin LG AAC GTG GCT CCC .u Asn Val Ala Pro 30 A AAA GGA GCC AGA y Lys Giy Aia Arg G TCA TCT GAT ACT r Ser Ser Asp Thr 60 C TCC TAC ACT AGC p Ser Tyr Thr Ser CGG GAG CCA CTG Arg Gin Ala Val GAA AAA GCA AAG Gin Lys Ala Lys 110 AGC TAC ACT CCG Ser Tyr Ser Ala 125 ATC TCA TTC CAA G Ile Ser Phe Gln 140 AAT GAC TGG TGG A Asn Asp Trp Trp I 1 TTC ATT CCA ACC C Phe Ile Pro Ser P 17S CAC AGA CCC AAC Gin Arg Ala Lys G 190 TCA TCA TCC AGT T Ser Ser Ser Ser L 205 CCT CCA TCA TCT G Pro Pro Ser Ser A: 220 CAA AAT GAT ATT CC Gin Asn Asp Ile P2
GOC
Al a
AGG
Arg
ACC
Thr
CGT
Axg
CGC
Axrg
%CA
Thr Ula
;CA
Lla
~TA
le
CA
ro
AA
in
C
La 144 192 24C 288 336 384 432 480 S28 576 624 672 720 -224- 2 25 230 235 240 GCA AAC CAC CGC TCC CCT AAA CCC ACT GCA AAC ACT GTA ACG TCA CCC 768 Ala Asn His Arg Ser Pro Lys Pro Ser Ala Asn Ser Val Thr Ser Pro 245 250 255 CAC TCC AAA GAG AAA AGA ATG CCC TTC TTT AAC AAC ACA GAG CAC ACT81 His Ser L.ys Glu Lys Arg Met Pro Phe Phe Lys Lys Thr Ciu His Thr 260 265 270 CCT CCG TAT GAT CTG CTA CCT TCC ATG CGA CCA GTG GTC CTA CTG GGC 864 Pro Pro Tyr Asp Val Val Pro Ser Met Arg pro Val Val Leu Val Cly 275 280 285 CCT TCT CTG ARC GGC TAC GAG GTC ACA CAT ATG ATG CAA AAA. GC'- CTC 912 Pro Ser Leu Lys Cly Tyr Giu Val Thr Asp Met Met Gln Lys Ala Leu 290 295 300 TTT CAT TTT TTA AAA CAC AGA TTT CAR CCC CCC ATA TCC ATC ACA ACG 960 Phe Asp Phe Leu Lys His Arg Phe Giu Cly Arg Ile Ser Ile Thr Arg 305 310 315 320 CTC ACC CCT GAC ATC TCC CTT CCC AAA CCC TCC CTA TTA AAC AAT CCC 1008 Val Thr Ala Asp Ile Ser Leu Ala Lys Arg Ser Val Leu Asn Asn Pro 325 330 335 ACT ARC CAC CCA ATA ATA CAR ACA TCC AAC ACA ACC TCA AGC TTA CC 1056 Ser Lys His Ala Ile Ile Clu Arg Ser Asn Thr Arg Ser Ser Leu Ala *340 345 350 GA GT AC T GAR ATC CAR AGG ATT TTT CAR CTT CCA AGA ACA TTC 1104 Glu Val Gin Ser iu Ile Ciu Arg Ile Phe Ciu Leu Ala Arg Thr Leu 355 30365 CAC TTG CTC GTC CTT GAC CC CAT ACA ATT ART CAT CCA CCT CAR CTC 1152 Gin Leu Val Val Leu Asp Ala Asp Thr Ile Asn His Pro Ala Cln Leu *370 375 380 .*AT AAR ACC TCC TTG GCCC CCT ATT ATA GTA TAT TA ARC ATT TCT TCT 1200 Ser Lys Thr Ser Leu Ala Pro Ile Ile Val Tyr Val Lys Ile Ser Ser 385 390 395 n CTARC CTT TTA CAR AGC TTA ATA AAA TCT CGA GGG AAA TCT CAR C 1248 Pro Lys Val Leu Cmn Arg Leu Ile Lys Ser Arg Cly Lys Ser Gin Ala 405 410 415 *AAA CAC CTC AC TC CAC ATG TA CA GCT AT AAA CTG CT CAC TT 1296 Lys His Leu Asn Val Gin Met Val Ala Ala Asp Lys Leu Ala Gin Cys 420 425 430 CCT CCA GAG CTC TTC CAT GTC ATC TTC CAT GAG ARC CAG CTT GAG CAT 1344 Pro Pro Ciu Leu Phe Asp Val Ile Leu Asp Ciu Asn Gin Leu Ciu Asp 435 440 445 CCC TCT CAC CAC CTT CCC GAC TAT CTC GAG CCC TAC TGC 
ARC CCC ACC 1392 Ala Cys Ciu His Leu Ala Asp Tyr Leu Giu Ala Tyr Trp Lys Ala Thr -225- 450 455 460 CAT CCT CCC AGC AGT AGC CTC CCC AAC CCT CTC CTT AGC CGT ACA TTA His Pro Pro Ser Ser Ser Leu Pro Asn Pro Leu Leu Ser Arg Thr Leu 465 470 475 480 GCC ACT TCA AGT CTG CCT CTT ACC CCC ACC CTA CCC TCT AAT TCA CAG Ala Thr Ser Ser Leu Pro Leu Ser Pro Thr Leu Ala Ser Asn Ser Gin 485 490 495 GGT TCT CAA GGT GAT CAG AGG ACT CAT CGC TCC GCT CCT ATC CGT TCT Cly Ser Gin Gly Asp Gin Arg Thr Asp Arg Ser Ala Pro Ile Arg Se r 500 505 510 GCT TCC CAA GCT GA\ GAA GAA CCT AGT GTG GAA CCA CTC AAC AAA TCC Ala Ser Gin Aia Ciu Giu Giu Pro Ser Val Giu Pro Vai Lys Lys Ser 515 520 525 CAG CAC CGC TCT TCC TCC TCA GCC CCA CAC CAC AAC CAT CGC ACT GGG Gin His Arg Ser Ser Ser Ser Al1a Pro His His Asn His Arg Ser Gly 530 535 540 ACA ACT CGC GGC CTC TCC AGG CAA GAG ACA TTT GAC TCC GAA ACC CAG Thr Ser Arg Cly Leu Ser Arg Gin Giu Thr Phe Asp Ser Ciu Thr Gin 545 550 555 560 GAG ACT CGA GAC TCT CCC TAC CTA GAG CCA AAG GAA GAT TAT TCC CAT Giu Ser Arg As p Ser Ala Tyr Val Giu Pro Lys Giu Asp Tyr Ser His 565 570 575 CAC CAC CTG GAC CAC TAT GCC TCA CAC CGT CAC CAC AAC CAC ACA GAC Asp His Val Asp His Tyr Ala Ser His Arg Asp His Asn His Arg Asp 580 585 590 GAG ACC CAC CGG AGC ACT CAC CAC ACA CAC AGG GAG TCC CGG CAC CCT Clu Thr His Gly Ser Ser Asp His Arg His A-rq Ciu Ser Arg His Arg 595 600 605 TCC CGG GAC GTG GAT CGA GAG CAC CAC CAC AAC GAG TCC AAC AAC CAC Ser Arg Asp Vai Asp Arg Giu Gin Asp His Asn Giu Cys Asn Lys Gin 610 615 620 CCC AGC CGT CAT AAA TCC AAG CAT CCC TAC TCT GAA AAG GAT GCA CAA Arg Ser Arg His Lys Ser Lys Asp Arg Tyr Cys Clu Lys Asp Cly Clu 625 630 635 640 GTC ATA TCA AAA AAA CGG AAT GAG CCT CCC GAG TGC AAC AGC CAT CTT Val Ile Ser Lys Lys Arg Asn Ciu Ala Gly Glu Trp Asn Arg Asp Val 645 650 655 TAC ATC CCC CAA TCAGTTTTCC CCTTTTGTCT TTTTTTTTTT
TTTTTTTTGA
Tyr Ile Pro Gin 660 AGTCTTGTAT AACTAACAGC ATCCCCAAAA CAAAAAGTCT TTGGCCTCTA
CACTGCAATC
1440 1488 1584 1632 1680 1728 1776 1824 1872 1920 1968 2080 ATATGTGATC TGTCTTGTAA TATTTTGTAT TATTGCTGTT GCTTGAATAG CAATAGCATG 2140 GATAGAGTAT TGAGATACTT TTTCTTTTGT AAGTGCTACA TAAATTGGCC TGGTATGGCT 2200 GCAGTCCTCC GGTTGCATAC TGGACTCTTC AAAAACTGTT TTGGGTAGCT GCCACTTA 2260 CAAAATCTGT TGCCACCCAG GTGATGTTAO TGTTTTAAGA AATGTAGTTG ATGTATCA 2320 CAAGCCAGA TCAGCACAGA TAAAAAGTGG AATTTCTTGT TTCTCCAGAT TTTTAATACG 2380 TTAATACGCA GGCATCTGAT TTGCATATTC ATTCATGGAC CACTGTTTCT TGCTTGTACC 2440 TCTGGCTGAC TAAATTTGGG GACAGATTCA GTCTTOCCTT ACACAAAGGG GATCATAG 2500 TTAGAATCTA TTTTCTATGT ACTAGTACTG TGTACTGTAT AGACAGTTTG TAAATOTTAT 2560 TTCTGCAAAC AAACACCTCC TTATTATATA TAATATATAT ATATATATCA GTTTGATCAC 2620 ACTATTTTAG AGTC 2634 INFORMATION FOR SEQ ID NO:27: SEQUENCE
CHARACTERISTICS:
LENGTH: 1823 base pairs TYPE: nucleic acid STRANDEDNESS: double TOPOLOGY: linear MOLECULE TYPE: DNA (genomic)
FEATURE:
NAME/KEY: CDS
LOCATION: 69..1631 OTHER INFORMATION: /standard-name= "Beta-4" (xi) SEQUENCE DESCRIPTION: SEQ ID NO:27: AGCCCAGCCT CGGGGGCCAG CCCCCTCCGC CCACCGCACA CGGGCTGGCC ATGCGGCGGC TCTCG-AC ATG TCC TCC TCC TCC TAC GCC AAG AAC GGG ACC GCG GAC GG 110 Met Ser Ser Ser Ser Tyr Ala Lys Asn Gly Thr Ala Asp Gly 1 5 10 CCG CAC TCC CCC ACC TCG CAG GTG GCC CGA GGC ACC ACA ACC CGG AGG 158 Pro His Ser Pro Thr Ser Gln Val Ala Arg Gly Thr Thr Thr Arg Arg 15 20 25 30 AGC AGG TTG AAA AGA TCC GAT GGC AGC ACC ACT TCG ACC AGC TTC ATC 206 Ser Arg Leu Lys Arg Ser Asp Gly Ser Thr Thr Ser Thr Ser Phe Ile 35 40 45 CTC AGA CAG GGT TCA GCG GAT TCC TAC ACA AGC AGG CCG TCT GAC TCC 254 Leu Arg Gln Gly Ser Ala Asp Ser Tyr Thr Ser Arg Pro Ser Asp Ser 50 55 60 GAT GTC TCT TTG GAA GAG GAC CGG GAA GCA ATT CGA CAG GAG AGA GAA Asp Val Ser Leu Glu Glu Asp Arg Glu Ala Ile Arg Gln Glu Arg Glu 65 70 75 CAG CAA Gln Gln GCA GCT ATC CAG Ala Ala Ile Gln
CTT
Leu 85 GAG AGA GCA AAG Giu Arg Ala Lys AAA CCT GTA GCA Lys Pro Val Ala GCC GTG AAG ACA Ala Val Lys Thr
AAT
Asn 100 GTG AGC TAC TGC Val Ser Tyr Cys GCC CTG GAC GAG Ala Leu Asp Giu
GAT
Asp GTG CCT GTT CCA Val Pro Val Pro CAT ATT AAA GAG His Ile Lys Giu 130 AAA GAG GGC TGT Lys Giu Gly Cys 145 ACA GCT ATC TCC Thr Ala Ile Ser
TTT
Phe GAT GCT AAA GAC Asp Ala Lys Asp TTT CTA Phe Leu AAA TAT AAC AAT Lys Tyr Asn Asn TGG TGG ATA GGA Trp, Trp Ile Gly AGG CTG GTG Arg Leu Val 140 AGA TTG GAG Arg Leu Giu GAA ATT GGC TTC ATT CCA AGT CCA Giu Ile Gly Phe Ile Pro Ser Pro *e
AAC ATA Asn Ile 160 CGG ATC CAG CAA Arg Ile Gin Gin
GAA
Giu 165 CAA AAA AGA GGA Gin Lys Arg Gly
CGT
A-rg TTT CAC GGA GGG Phe His Gly Gly 302 350 398 446 494 542 590 G38 686 734 782 830 878 AAA TCA AGT GGA AAT Lys Ser Ser Gly Asn 175 ACA TTC CGA GCA ACT Thr Phe Arg Ala Thr 195 ACG GAG CAC ATT CCT Thr Giu His Ile Pro 210 GTG TTA GTG GGG CCG Val Leu Val Gly Pro 225; TCT TCA AGT CTT Ser Ser Ser Leu GAA ATG GTA TCT Giu Met Val Ser
GGG
Gly CCC ACA TCA ACA Pro Thr Ser Thr
GCA
Ala AAA CAG AAG CAA Lys Gin Lys Gin AAA GTG Lys Val CCT TAC GAT Pro Tyr Asp GTA CCG TCA ATG Val Pro Ser Met CGT CCG GTG Arg Pro Val 220 GAC ATG ATG Asp Met Met TCA CTG Ser Leu GGT TAC GAG GTA
ACA
Thru CAG AAA GCC CTC TTT GAT Gin Lys Ala Leu Phe Asp 240 TCA ATA ACG AGA GTG ACA Ser Ile Thr Arg Val Thr 255 260 CTA AAT AAT CCC AGC AAG Leu Asn Asn Pro Ser Lvs
TCC
Ser 245 CTG AAG CAC AGG Leu Lys His Arg GAT GGG AGG ATT Asp Giy A-rg Ile GCT GAC ATT TCT Ala Asp Ile Ser
CTT
Leu GCT AAG AGG TCT Ala Lys Arg Ser AGA GCA ATA Arg Ala Ile 275
ATT
Ile 280 GAA CGT TCG AAC Giu Arg Ser Asn ACC CGG Thr Arg 285 -228- TCC ACC
TTA
Ser Ser Leu
GC
Al a GAA GTA CAA
ACT
Glu Val Gin Ser CAA ATT GAA AGA ATC TTT GAG TTC Guu lie Ciu Arg Ile PheCl Le ,RATCT TTG CAA CTG
GTT
Ala Arg Ser Leu Gin Leu Val 305 CCA GCA CAA CTT ATA AAG
ACT
Pro Ala Gin Lieu Ile Lys Thr 320 CTT AT CA AC ACC ATC AAT
CAC
Leu Asp Ala Asp Thr Ile Asn His TCC TTA CCA CCA ATT
AT
Ser Leu Ala Pro Ile Ii T GTT e Val 32S AAA GTC TCA Lys Val Ser 335 AAC TCA CA.A Lys Ser Gin CTT CCA CAA Leu Al1a Gin TCT CCA AAG CTT TTA CAC CCC TTC ATT AAA
TCT
Ser Pro Lys Val Leu Gin Arg Leu Ile LysSe 340 345 e ACT AAA CAC TTC AAT CTT CAA CTC CTG GCA
CCT
Ser Lys His Leu Asn Val Gin Leu Val Ala Ala CAT CTA His Vai AGA
GGA
Arg Gly 350 CAT AAA Asp Lys 365 GAA
AAT
1lu Asn TGC CCC CCA CAA ATC TTT
CAT
Cys Pro Pro Clu met Phe Asp 370 375 GTT ATA
TTG
CAC
CTT
Gin Leu
CAT
Asp 380 GAG C GAG CAT Glu Asp CCA TCT GAA CAT
CTA
Ala Cys Clii His Leu CCC GAG TAC
CTC
TCC CCT CCC ACC CAC ACA
ACC
Trp Arg Ala Thr His Thr Thr 400 405 CGA ACG AAT TTC CCC TCC
ACC
Cly Arg Asn Leu Cly Ser Thr 415 420 AT AC ACA CCC ATC Ser Ser Thr Pro Met 410 CA CTC TCA CCA TAT Ala Leu Ser Pro Tyr ACC CCC CTG
CTG
1022 1070 i118 1166 1214 1262 1310 1358 1406 1454 1502 1550 1598 CCC ACA
CCA
ATT
TCT CCC TTA CAC Ser Cly Leu Cmn
ACT
Ser CAC CA ATC ACC CAC Cmn Arg Met Arg His ACC AAC CAC
TCC
ACA
GAG
AAC
TCT
Asn Ser CCA ATT CAA AGA CGA ACT
CTA
-L lArg Arg Ser Leu ATC ACrTt-'irr_ Met Thr Ser Asp CAC AAT CAA ACC CCT CCC AAG AGT ACC His Asn Ciii Arg Ala Arg Lys Ser Arg 465 470 GzAA AAT
TAT
Ciii Asn ?yr, 460 TCC ACT TCT CAC
CAT
Gin His ACC CCA CAT Ser Arg Asp CAT TAC
CCT
His Tyr Pro
CTT
AAC CCC TTC Asn Arg Leii CTC CAA GAA
TCT
CAT TAC CCT
CAC
4850 TCA TAC CAC CAC ACT TAC AAA CCC CAT ACC AAC
CGA
Ser Tyr Gin Asp Thr Tyr Lys Pro His Arg Asn Arg 495 500 505 GGA TCA CCT Gly Ser Pro -229- GGA TAT AGC CAT GAC TCC CGA CAT AGG CTT TGAGTCTAAT GAAACAA164 Gly Tyr Ser His Asp Ser Arg His Arg Leu 14 515 520 ATATTCATCT GTTGACAATT TGCCATAGCA GTGCTAGGAT AAACCAATCA TCTTAACTTG 1708 GCTAACATAG CACAGTATT ACTGTGCTA TGGGCTGCTG TCATTTTATG CTAAGTAAGG 1768 GGCAAAAA AAAATTACAT TATGCCCTTG AGTCTAGATG GATATTAGAT GCCCG 182.3 INFORMATION FOR SEQ ID NO:28: SEQUENCE
CHARACTERISTICS:
LENGTH: 520 amino acids TYPE: amino acid TOPOLOGY: linear (ii) MOLECULTE TYPE: protein (xi) SEQUENCE DESCRIPTION: SEQ ID NO:28: Met Ser Ser Ser Ser Tyr Ala Lys Asn Gly Thr Ala Asp Gly Pro His 1 5 10 Ser Pro Thr Ser Gin Val. Ala Arg Gly Thr Thr Thr Arg Arg Ser Arg 2025 Leu Lys Arg Ser Asp Gly Ser Thr Thr Ser Thr Ser Phe Ile Leu Arg :**Gin Gly Ser Al1a Asp Ser Tyr Thr Ser Arg Pro Ser Asp Ser Asp Val *50 55 SSer Leu Giu Giu Asp A-rg Giu Ala Ile A-rg Gin Giu Arg Giu Gin Gin 70 75 s0 Al1a Ala Ile Gin Leu Giu Arg Ala Lys Ser Lys ProVaAl hAa 90 ***Vai Lys Thr Asn Vai Ser Tyr Cys Gly Ala Leu Asp Giu Asp Vai Pro 100 105 110 V a 1a Pro Ser Thr Ala Ile Ser Phe Asp Ala Lys Asp Phe Leu His Ile .5.115 12012 Lys Giu Lys Tyr Asn Asn Asp Trp, Trp Ile Gly Arg Leu Vai Lys Glu 130 135 140 Gly Cys Giu Ile Gly Phe Ile Pro Ser Pro Leu Arg Leu Glu Asn Ile S..145 150 155 160 Arg Ile Gin Gin Glu Gin Lys Arg Gly Arg Phe His Gly Gly Lys Ser 165 170 175 Ser Gly Asn Ser ser Ser Ser Leu Gly Giu Met Val Ser Giy Thr Phe 180 185 190 -230- Arg Ala Thr Pro Thr Ser Thr Ala Lys Gin Lys Gin Lys Val Thr Giu 195 20020 His Ile Pro Pro Tyr Asp Val Val Pro Ser Met Arg Pr2a0V5 e 210 215 220Pr Va Vl e Val Giy Pro Ser Leu Lys Gly Tyr Giu Vai Thr AspMe Me Gi Ly 2 5230 235 240et G n y Ala Leu Phe Asp Ser Leu Lys His Arg Phe As240AHle e i 245 250 255Gy ~gIe e l Thr Arg Val Thr Ala Asp Ile Ser Leu Ala Lvs Ar Se255Lu s 260265 270 Asn Pro Ser Lys Ar9 Ala Ile 2181e Glu Ag Sr Asn Thr Arg Ser Ser 275 280S 285 Leu Ala Giu Vai Gin Ser Giu Ile Glu Arg Ile Phe Giu Leu Aia Arg 290 295 300 Ser Leu Gin Leu Vai Val Leu Asp Ala Asp Thr Ile Asn His Pro Ala 305 310 31S 320 GnLeu Ile Lys Thr Ser Leu Ala Pro Ile Ile Vai His Vai Lys Vai 325 330 3 Ser Ser PoLys VlLuGnAgLeu Ile Lys Ser Arg Giy Lys Ser 340 ~34535 :Gin Ser Lys His Leu Asn Vai Gin Leu Vai Ala Ala Asp Lys Leu Ala 355 360 365 Gin Cys Pro Pro Giu Met Phe Asp Val lie Leu Asp GiU Asn Gin Leu S..370 375 380 Glu Asp Al1a Cys Glu His Leu Gly Giu Tyr Leu Giu Aia Tyr Trp, Arg 35390 
39540 Aia Thr His Thr Thr Ser Ser Thr Pro Met Thr Pro Leu Leu Gly Ar; *S405 41045 Asn Leu Giy Ser Thr Ala Leu Ser Pro Tyr Pro Thr Ala Ile Ser Giy *420 425 430 Leu Gin Ser Gin Ar; Met Ar; His Ser Asn His Ser Thr Giu Asn Ser .5.4354445 Pro Ile Giu Ar; A-rg Ser Leu Met Thr Ser Asp Giu Asn Tyr His Asn 450 455 460 Giu A-r; Ala Ar; Lys Ser Arg Asn Arg Leu Ser Ser Ser Ser Gin His 465 470 475 480 Ser Arg Asp His Tyr Pro Leu Val Giu Giu Asp Tyr Pro Asp Ser Tyr 485 490 495 -231- Glr2 Asp Thr Tyr Lys Pro His Arg Asn Arg Gly Ser Pro Gly Gly Tyr S00 505 510 Ser His Asp Ser Arg His Arg Leu 515 520 INFORMATION FOR SEQ ID NO:29: SEQUENCE
CHARACTERISTICS:
LENGTH: 3636 base pairs TYPE: nucleic acid STRANDEDNESS: double TOPOLOGY: linear (ii) MOLECULE TYPE: DNA (genomic) (ix) FEATURE: NAME/KEY: CDS
LOCATION: 35..3346 OTHER INFORMATION: /standard-name= "Alpha-2a" (ix) FEATURE: NAME/KEY: LOCATION: 1..34 (ix) FEATURE: NAME/KEY: 3'UTR LOCATION: 3347..3636 SEQUENCE DESCRIPTION: SEQ ID NO:29: GCGGGGGAGG GGGCATTGAT CTTCGATCGC GAAG ATG GCT GCT GGC TGC CTG 52 Met Ala Ala Gly Cys Leu 1 5 CTG GCC TTG ACT CTG ACA CTT TTC CAA TCT TTG CTC ATC GGC CCC TCG 100 Leu Ala Leu Thr Leu Thr Leu Phe Gln Ser Leu Leu Ile Gly Pro Ser 10 15 20 TCG GAG GAG CCG TTC CCT TCG GCC GTC ACT ATC AAA TCA TGG GTG GAT 148 Ser Glu Glu Pro Phe Pro Ser Ala Val Thr Ile Lys Ser Trp Val Asp 25 30 35 AAG ATG CAA GAA GAC CTT GTC ACA CTG GCA AAA ACA GCA AGT GGA GTC 196 Lys Met Gln Glu Asp Leu Val Thr Leu Ala Lys Thr Ala Ser Gly Val 40 45 50 AAT CAG CTT GTT GAT ATT TAT GAG AAA TAT CAA GAT TTG TAT ACT GTG 244 Asn Gln Leu Val Asp Ile Tyr Glu Lys Tyr Gln Asp Leu Tyr Thr Val 55 60 65 70 GAA CCA AAT AAT GCA CGC CAG CTG GTA GAA ATT GCA GCC AGG GAT ATT 292 Glu Pro Asn Asn Ala Arg Gln Leu Val Glu Ile Ala Ala Arg Asp Ile 75 80 85 GAG AAA CTT CTG AGC AAC AGA TCT AAA GCC CTG GTG AGC CTG GCA
TTG
Glu Lys Leu Leu Ser Asn Arg Ser Lys Ala Leu Val Ser Leu Ala Leu Glu Ala Glu Lys GTT CAA GCA GCT CAC CAG TGG AGA GAA GAT TTT
GC
VI Gin Ala Ala His Gin Trp Arg Giu Asp PheA AGC AAT GAA GTT GTC TAC TAC A-AT GOA AAG GAT
GAT
Ser Asn Giu Val Val Tyr Tyr Asn Ala Lys Asp Asp 1.2012 3 AAA AAT GAC AGT GAG CCA GGC AGC CA1GGA 0 A Ly135 ApSe i Pro Giy Ser Gin Arg Ile Lys 135 14014 115 CTO GAT OCT GAG Leu Asp Pro Giu CCT GTT TTC ATT (3 AT GOT AAT TTT
GGA
Giu Asp Ala Asn Phe Gly 145 CGA CAA ATA TCT TAT CAG
CAC
Arg Gin Ile Ser Tyr Gin His Pro Al a Vhe le OAT
ATT
His Ile CTC A.AC Leu Asn OCT ACT GAC
ATC
Pro Thr Asp Ile TAT GAG
GGC
TGG
Trp, ACA AGT GCC TTA GAT
GAA
Thr Ser Aia Leu ASP Giu TCA ACA ATT GTG TTA AAT
GAA
Ser Thr Ile Val Leu Asn Giu 180 GTT TTC AAA AAGAT G
GG
Va Phe LYS LYS Asn Arg Giu 388 436 4 8; 532 580 628 676 724 772 820 868 GAA GAC OCT
TOA
Giu Asp pro Ser TTA T TG LeU Leu GOT
OGA
Ala Arg TAT TAT COA
GOT
Tyr Tyr Pro Ala TGG CAG GTT TTT GGC AGT GC Trp Gin Vai Phe Giy Ser Ala 205 210 TCA OCA TGG GTT GAT AAT AGT Ser Pro Tr-p Val Asp Asn Ser 225 GAT GTA GO AGA AGA CA TGG Asp Val Arg Arg Arg Pro Trp AAT AAG ATT
GAO
Asn Lys Ile Asp OTT
TAT
Leu Tyr ACT GGC CTA Thr GiY Leu AGA ACT COA Arg Thr pro 230 TAO ATC CAA T'yr Ile Gin 245 TGAG
GA
T
ai Ser Gly GGA GOT GCA Giy Ala Ala AGT GTT
AGT
Ser Vai Ser TOT COT AAA GAO ATG OTT
ATT
Ser Pro Lv, n- 250 k 'tL eu Ile 255 GGA TTG AcA OTT AAA CTG
ATO
Giy Leu Thr Leu, LYS Leu Ile CTG GTG
GAT
Leu Val Asp OGA ACA
TOT
2 ATG TTA Met Leu GAA ACC
OTO
Giu Thr Leu TOA GAT
GAT
GATTTCGT AA GT GTO TOO GAz AsnValAl Ser Giu 2800 AAC AGO AAT GOT CAG GAT GTA AGO TGT TTT CAG CAC OTT GTC
CAA
Asn Ser Asn Ala Gin Asp Vai Ser Cys Phe Gin His Leu Val Gin 295 300 305 -233- AAT GTA AGA AAT AAA AAA GTG TTG AAA GAC Asn Val Arg Asn Lys Lys Val Leu Lys Asp 315 320 GCG GTG AAT AAT ATC ACA Ala Val Asn Asn Ile Thr 325 GCC AAA GGA Ala Lys Gly CAG CTG CTT Gin Leu Leu 345
ATT
Ile 330 ACA GAT TAT AAG Thr Asp Tyr Lys GGC TTT ACT TTT Gly Phe Ser Phe GCT TTT GAA Ala Phe Glu 340 AAG ATT ATT Lys Ile Ile 1012 1060 1108 AAT TAT AAT GTT Asn Tyr Asn Val AGA GCA AAC TGC Arg Ala Asn Cys ATG CTA Met Leu 360 TTC ACG GAT GGA Phe Thr Asp Gly
GGA
Gly 365 GAA GAG AGA GCC Glu Giu Arg Ala
CAG
Gin 370 GAG ATA TTT AAC Glu Ile Phe Asn TAC AAT AAA Tyr Asn Lys CAT AAA AspLys 380 GAG AGA Glu Arg 395 AAA GTA CGT GTA Lys Val Arg Val AGG TTT TCA GTT Arg Phe Ser Val CAA CAC AAT TAT Gin His Asn Tyr GGA CCT ATT Gly Pro Ile
CAG
Gin 400 TGG ATG GCC TGT Trp Met Ala Cys GAA AAC Glu Asn 405
AAA GGT TAT Lys Gly Tyr ACT CAG GAA Thr Gin Glu 425 TAT GAA ATT CCT Tyr Giu Ile Pro
TCC
Ser 415 ATT GGT GCA ATA Ile Gly Ala Ile AGA ATC AAT Arg Ile Asn 420 TTA GCA GGA Leu Ala Gly TAT TTG CAT GTT Tyr Leu Asp Val CGA AGA CCA ATG Cly Arg Pro Met GAC AAA Asp Lys 440 GCT AAG CRA GTC Ala Lys Gin Val TGG ACA AAT GTG Trp Thr Asn Val
TAC
Tyr 450 CTG CAT GCA TTG Leu Asp Ala Leu 1156 1204 1252 1300 1348 1396 1444 1492 1540 1588 1636
GAA
Glu 455 CTG GGA CTT GTC Leu Giy Leu Val
ATT
Ile 460 ACT GCA ACT CTT Thr Gly Thr Leu GTC TTC RAC ATA Val Phe Asn Ile
ACC
Thr 470 se GGC CRA TTT Gly Gin Phe GTG ATG GGA Val Met Gly COT TTT ACA Arg Phe Thr 505 GAA RAT Glu Asn
GTA CAT Val Asp 490 AAG ACA AAC TTA Lys Thr Asn Leu AAC CAG CTG ATT Asn Gin Leu Ile CTT GGT Leu Gly 485 GTG TCT TTG Val Ser Leu GAA CAT Glu Asp 495 ATT AAA AGA Ile Lys Arg CTG ACA CCA Leu Thr Pro 500 CAT CCT RAT Asp Pro Asn CTG TGC CCC AAT Leu Cys Pro Asn TAT TAC TTT CA Tyr Tyr Phe Ala GOT TAT GTT TTA TTA CAT Gly Tyr Val Leu Leu His 520 RAT CTT CAC CCA Asn Leu Gin Pro
RAG
Lys 530 CCT ATT GGT GTA Pro Ile Gly Val -234- CCT ATA CCA ACA ATT A-AT TTA AGA AAA
AGG
Giy Ile Pro Thr Ile Asn Leu Arg Lys Arg 535 540 CCC AAA TCT
C-AG
Pro Lys Ser Gin GAG CCA GTA ACA TTG GAT Giu Pro Val Thr Leu Asp 555 560 AGA CCC A-AT ATC CAG A-AC Arg Pro Asn Ile Gn Asn 545 550 TTC CTT GAT GCA GAG
TTA
Phe Leu Asp Aia Giu Leu GAG AAT
GAT
Giu Asn Asp AGT GGA
GAA
Ser Giy Giu 585 AA-A GTC GAG ATT Lys Vai Giu Ile CGA AAT Arg Asn AAG ATC ATT GAT GGG GAA Lys Met Ile Asp Giy Giu AAA ACA TTC AGA Lys Thr Phe Arg
ACT
Thr CTG GTT AAA
TCT
Leu Vai Lys Ser TAT ATT Tyr Ile 600 GAC AAA. GGA A-AC Asp LYS GiY As, CAA GAT GAG AGA Gin Asp Giu Arg 595 CCT GTC AAT GGC
AGG
Arg ACA TAC ACA
TGG
Thr Tyr Thr Trp
ACA
Thr 1G84 1732 1780 1828 1876 1924 1972 2020 2068
ACA
Thr 615 GAT TAC AGT
TTG
Asp Tyr Ser Leu TTG GTA TTA Leu Vai Leu CCA ACC Pro Thr TAC AGT TTT TAC Tyr Ser Phe Tyr @Oeb 0@40 *69* 0* 6* @9 4 6 0 eq 0e 9 9 6 ATA AAA GCC AAA Ile Lys Aia Lys ACC CTG AAG
CCA
Thr Leu Lys pro 650 CCA AGA GAT TAC Pro Arg Asp Tyr 665 GAA GAG ACA ATA Giu Giu Thr Ile CAG GCC AGA
TAT
Gin Aia Arg Tyr TCG GAA CAT A-AT TTT GA-A CAA Asp Asn Phe Giu Ciu TCT GGC TAT ACA Ser Giy Tyr Thr TTC ATA GCA Phe Ile Aia 660 AAC ACT GAA TGC AAT GAC 2 'Ys Asn Asp
CTC
Leu AAA ATA TCC GAT Lys Ile Ser Asp
TTT
Phe
CCA
Pro 6.95 CTT TTA A-AT
TTC
Leu Leu A-sn Phe 680 TCA TGT A-AC GCG Ser Cys Asn Ala A-AC GAG Asn Giu TTT ATT
GAT
Phe Ile Asp A-GA AAA Arg Lys 690 OTC
TTG
Val Leu ACT CCA AAC
A-AC
TTG ATT AAT
AGA
-e lie= Asn Arg CTT GAT CC'A TTT ACA
AAT
Phe Thr Asn AAC GGA
GTG
Lys Cly Val
GAA
Gi u
CTT
Leu GTC CAA A-AT Val Gin A-sn TAC TCG Tyr Trp ACT AAG CAG AAA A-AT
ATC
Ser Lys Gin Lys Asn le 2212 2260 2308 AAA GCA CGA TTT GTT GTC ACT CAT GGT GG Lys Aia Arg Phe Val Vai Thr Asp Cly Cly GTT TAT CCC A-A-A GAG OCT GCA CAA AA-T TOO Val Tyr Pro Lys Giu Ala Cly Ciu Asn Trp 745 750 ATT ACC AGA Ile Thr A-rg 740 CCA GAG ACA Pro Ciu Thr CAA GAA AAC Gin Giu Asn 755 -235- TAT GAG GAC Tyr Glu Asp 760 AGC TTC TAT Ser Phe Tyr AAA AGG Lys Arg 765 AGC CTA GAT AAT GAT AAC TAT GTT Ser Leu Asp Asn Asp Asn Tyr Val 770 2356
TTC
Phe 775 ACT GCT CCC TAC Thr Ala Pro Tyr
TTT
Phe 780 AAC AAA AGT GGA Asn Lys Ser Gly GGT GCC TAT GAA Gly Ala Tyr Glu GGC ATT ATG GTA Gly Ile Met Val AGC AAA GCT GTA GAA ATA Ser Lys Ala Val Glu Ile 795 800 TAT ATT CAA GGG Tyr Ile Gin Gly AAA CTT Lys Leu 805 *c *oo* CTT AAA CCT Leu Lys Pro GAG AAT TTC Glu Asn Phe 825
GCA
Ala 810 GTT GTT GGA ATT Val Val Gly Ile ATT GAT GTA AAT Ile Asp Val Asn TCC TGG ATA Ser Trp Ile 820 GGT CCA GTT Gly Pro Val ACC AAA ACC TCA Thr Lys Thr Ser
ATC
Ile 830 AGA GAT CCG TGT Arg Asp Pro Cys
GCT
Ala 835 TGT GAC TGC AAA AGA AAC AGT GAC GTA ATG GAT Cys Asp Cys Lys Arg Asn Ser Asp Val Met Asp 840 A GTG ATT CTG GAT Val lie Leu Asp
GAT
Asp 855 GGT GGG TTT Gly Gly Phe CTT CTG Leu Leu 860 ATG GCA AAT CAT Met Ala Asn His GAT TAT ACT AAT Asp Tyr Thr Asn
CAG
Gin 870 2404 2452 2500 2548 2596 2644 2692 2740 2788 2836 2884 2932 ATT GGA AGA TTT Iie Gly Arg Phe GGA GAG ATT GAT Gly Glu Ile Asp ooo.
CCC
Pro 880 AGC TTG ATG AGA Ser Leu Met Arg CAC CTG His Leu 885 GTT AAT ATA Val Asn Ile GTA TGT GAG Val Cys Glu 905
TCA
Ser 890 GTT TAT GCT TTT Val Tyr Ala Phe AAA TCT TAT GAT Lys Ser Tyr Asp TAT CAG TCA Tyr Gin Ser 900 CAT CGC TCA His Arg Ser CCC GGT GCT GCA Pro Gly Ala Ala
CCA
Pro 910 AAA CAA GGA GCA Lys Gin Gly Ala
GGA
Gly 915 GCA TAT Ala Tyr 920 GTG CCA TCA GTA Val Pro Ser Val GAC ATA TTA CAA Asp Ile Leu Gin GGC TGG TGG GCC Gly Tr pTr Ala ACT GCT GCT GCC TGG Thr Ala Ala Ala Trp 935 TTT CCA CGA CTC CTT Phe Pro Arg Leu Leu 955
TCT
Ser 940 ATT CTA CAG CAG Ile Leu Gin Gin CTC TTG AGT TTG Leu Leu Ser Leu GAG GCA GTT GAG Glu Ala Val Glu
ATG
Met 960 GAG GAT GAT GAC Glu Asp Asp Asp TTC ACG Phe Thr 965 GCC TCC CTG TCC Ala Ser Leu Ser 970 AAG CAG AGC TGC Lys Gin Ser Cys ACT GAA CAA ACC Thr Glu Gln Thr CAG TAT TTC Gin Tyr Phe 980 2980 -236- TTC GAT AAC GAC AGT AAA TCA TTC AGT GGT GTA TTA GAO TGT GGA AAO 02 Ph spAriApSer Lys Ser Phe Ser Giv Vai Leu Asp Cys Gy Asn 985 990 995 TGT TCC AGA ATC TTT CAT GGA GAA AAG CTT ATG AAC ACC AAC TTA ATA 3076 Cys Ser Arg Ile Phe His Giy Glu Lys Leu Met Asn Thr Asn Leu Ile 1000 1005 1010 TTO ATA ATG GTT GAG AGC AAA GGG ACA TGT CCA TGT GAO ACA CGA CTG 3124 Phe Ile Met Vai Giu Ser Lys Giy Thr Cys Pro Cys Asp Thr Arg Leu 1015 1020 1025 1030 CTC ATA CAA GCG GAG CAG ACT TCT GAC GGT OCA AAT COT TGT GAO ATG 3172 Leu Ile Gin Aia Giu Gin Thr Ser Asp Giy Pro Asn Pro Cys Asp Met 1051040 1045 eo~oGTT AAG CAA CT AGA TAO CGA AAA GGG CT GAT GTC TGO TTT GAT AAC 3 22 0 Vai Lys GnPro Arg Tyr A-rg Lys Gly Pro Asp Vai Cys Phe Asp Asn *..1050 1055 1060 AAT GTO TTG GAG GAT TAT ACT GAO TGT GGT GGT GTT TOT GGA TTA AAT 3268 *Asn Val Leu Gu Asp Tyr Thr A-sp Cys Gy Gly Vai Ser Gy Leu Asn 1065 1070 1075 CCC TOO CTG TGG TAT ATO ATT GGA ATO CAG TTT OTA OTA OTT TGCG 31 PoSrLuTrp TrIeleGly IeGnPhe Leu Leu Leu Trp Leu 1080 1085 1090 GTA TOT GGO AGO ACA CAC CGG CTG TTA TGAOOTTCTA AAAAOOAAAT 3363 Vai Ser Gly Ser Thr His Arg Leu Leu 1095 1100 .CTGCATAGTT AAACTCCAGA CCCTGCCAAA ACATGAGOCO TGCCCTOAAT TACAGTAACG 3423 TAGGGTOAGO TATAAAATCA GACAAACATT AGCTGGGOCT GTTOCATGGO ATAACACTAA 3483 GGCGCAGACT COTAAGGCAC CCACTGGCTG CATGTCAGGG TGTCAGATOC TTAAAOGTGT 34 0000 GTGAATGCTG CATOATOTAT GTGTAACATC AAAGCAAAAT OCTATACGTG TCCTOTATTG 3603 GAAAATTTGG GCGTTTGTTG TTGCATTGTT GGT 3636 INFORMATION FOR SEQ ID SEQUENCE
CHARACTERISTICS:
LENGTH: 3585 base pairs
TYPE: nucleic acid
STRANDEDNESS: double
TOPOLOGY: linear
(ii) MOLECULE TYPE: DNA (genomic)
(ix) FEATURE:
NAME/KEY: CDS
LOCATION: 35..3295
OTHER INFORMATION: /standard-name= "Alpha-2c"
(ix) FEATURE:
NAME/KEY: LOCATION: 1. .34 (ix) FEATREp: NAME/KEy. 3'UTR LOCATION: 3296. .3585 SEQUENCE DESCRIPTION: SEQ ID GCGGGGGAGG GGGCATTGAT CTTCGATCGC GAAG ATC GCT GCT CCC TGC C7C Met AaAla Gly Cys Leu, CTG GCC TTG ACT CTG ACA CTT TTC CAA TCT TTG CTC ATC GCC CCC TCC Leu Ala Leu Thr Le, Thr Leu Phe Gln Ser Leu Leu Ile Gly Pro Ser 1s TC A GAG CCG TTC CCT TCG GCC GTC ACT TAATCTGGGGT14 Ser Cu Gu Pro Phe Pro Ser Ala Val Thr Ile Lys Ser Tr-p Val. Asp 30 35 *AAG ATG CAA GAA GAC CTT CTC ACA CTG GCA AAA ACA GCA AGT GGA GTC 196 Lys Met Gin Giu Asp Leu Val Thr Leu Ala Lys Thr Ala Ser Gly Val 45 a.aAAT CAG CTT GTT GAT ATT TAT GAG AAA TAT CAA GAT TTC TAT ACT GTC 244 Asn Gin Leu Val Asp Ile Tyr Giu Lys Tyr Gin Asp Leu Tyr Thr Val 60 65 *GAA CCA AAT AAT GCA CGC CAC CTG GTA GAA ATT GCA GCC AGG GAT ATT 292 Glu Pro Asn Asn Ala Arg Gin Leu Val Giu Ile A-la Ala Arg Asp Ile 80 GAG AAA CTT CTG AGC AAC AGA TCT AAA GCC CTG GTG AGC CTG GCA TTC 340 Glu Lys Leu Leu Ser Asn Arg Ser Lys Ala Leu Val Ser Leu Ala Leu a90 100 GAA GCG GAG AA CTT CAA CCA GCT CAC CAC TGG AGA GAA GAT TTT CCA
-R
Ciu Ala Giu Lys Val Gin Ala Ala His G1n As Phe Ala ly1-rg Ul1uAs hAl 105 110 115 AGC AAT GAA GTT GTC TAC TAC AAT GCA AAG CAT CAT CTC CAT CCT CAC 436 Ser Asn Giu Val Val Tyr Tyr Asn Ala Lys Asp Asp Leu Asp Pro Glu 120 125 130 AAA AAT CAC ACT GAG CCA CCC ACC CAG ACC ATA AAA CCT CTT TTC ATT 484 Lys Asn Asp Ser Ciu Pro Cly Ser Gin Arg Ile Lys Pro Val Phe Ile 135 140 145 150 GAA CAT GCT AAT TTT CCA CGA CAA ATA TCT TAT CAC CAC CCA GCA CTC 532 Clu Asp Aia Asn Phe Gly Arg Gin Ile Ser Tyr Gin His Ala Ala Vai 155 160 165 -238- CAT ATT CCT ACT
GAC
His Ile Pro Thr Asp ATC TAT
GAG
Ile Tyr Gu
GGC
Giy TCA ACA ATT CTC TTA AAT
CAA
Ser Thr Ile Val Leu Asn Giu CTC AAC
TGG
Leu Asn Trp ACA ACT GCC TTA
GAT
Thr Ser Ala1 Leu Asp 580 622 GAA GTT
TTC
AAA AAG LYS Lys AAT CGC GAG GAA GAC Ciu Asp 200 CCT TCA TTA TTG TGC
CAC
Pro Ser Leu Leu Trp Gin CTT TTT GCC
ACT
Val Phe Cly Ser CCC ACT CCC
CTA
CCT
Al a 215 CCA TAT TAT
CCA
Arg Tyr Tyr Pro
CT
Ala AAT AAG ATT
GAC
Asn Lys Ile Asp TCA CCA TCC CTT
CAT
Ser Pro Trp Val Asp 225 GAT CTA CCC ACA
AGA
Asp Val Ag Arg Ag AAT ACT AGA
ACT
CCA
CTT
TAT
Leu Tyr CCA TCC TAC ProCA 240 CCA CCT
GCA
Cly Ala Ala ACT CTT
ACT
Ser Val Ser 265
TCT
Ser CCT AAA CAC ATG
CTT
Pro Lys Asp Met Leu ATT CTC CTC
CAT
Ile Leu Vai Asp V le Gin 245 GCA TTG ACA CTT AAkA
CTC
Giy Leu Thr Leu LYS Leu ATC QZGA ACA CTC TCC GAA ATC TTA Met Leu 280 GAA ACC CTC TCA
CAT
Giu Thr Leu Ser Asp
AAC
Asn 295 ACC AAT
CCT
Ser Asn Ala CAG CAT GTA Gin Asp Val AT AT TTC TG AAT Asp Asp Phe Vai Asn 290 AC TGT TTT CAG
CAC
Ser Cys Phe Gin His 305 TTC AAA GAC CC CTC Lieu Lys Asp Ala Vai GTA GCT TCA TTT CTT CTC CAA AAT GTA AGA AAT AAA AAA
GTC
Asn Vai Arg Asn LYS Lys Val 315 Leu aiGi ATC ACA Ile Thr 325 CCC AAA CGA
ATT
Ala Lys T1y-l ACA CAT TAT
AAG
T11Z ASP Tyr Lys .TTT AAT 3325 CAC CTG Gin Leu AAT TAT AAT CTT Asn Tyr Asn Val 964: 1012 1060 1108c 1159- 1204
TCC
Ser AGA GCA AAC
TGC
AAG ATT ATT ATG CTA TTC ACC CAT
GGA
Met Leu Phe Thr Asp Cly 360
CGA
Ci y CAA GAG AGA CCC
CAG
Giu Ciu Arg Ala Gin Lys Ili e Aie 370 AAA TAC AAT AAA CAT AAA AAA CTA CCT CTA TTC AGC TTT TCA GTT
GCT
Lys Tyr A-sn Lys Asp Lys Lys Vai Arg Val Ph3r h e a l 375 h rPh eVa l 380 385 390 -239- CAA CAC AAT TAT GAG AGA GGA CCT ATT CAG Gin His Asn Tyr Giu Arg Gly Pro Ile Gin 395 400 AAA GGT TAT TAT TAT GAA ATT CCT TCC ATT Lys Cly Tyr Tyr Tyr Giu Ile Pro Ser Ile 410 TGG ATG GCC Trp Met Ala GGT GCA ATA Giy Ala Ile TGT GAA AAC Cys Gu Asn 405 AGA ATC AAT Ar Ile Asn 420 TTA GCA GGA ACT CAG GAA Thr Gin Giu 425 TAT TTG GAT GTT TTG Tyr LeU Asp Val Leu 430 GGA AGA CCA ATG Gly Arg Pro Met
GTT
Val GAC AAA Asp Lys 440 9 99..
GCT AAG CAA GTC CAA TGG ACA AAT GTG TAC CTG GAT CCA TTG Ala Lys Gin Vai Gin Trp Thr Asn Val Tyr Leu Asp Ala Leu 445 450
GAA
Giu 455 CTG GGA CTT Leu Giy Leu GTC ATT ACT GGA ACT CTT CCG Val Ile Thr Cly Thr Leu Pro 460 465 GTC TTC AAC ATA Vai Phe Asn Ile 1252 1300 1348 1396 1444 1492 1540 1588 1636 GGC CAA TTT GAA Giy Gin Phe Giu
AAT
Asn 475 AAG ACA AAC TTA Lys Thr Asn Leu
AAG
Lys AAC CAG CTG ATT Asn Gin Leu Ile CTT GCT Leu Giy GTG ATG GGA Val Met Gly CGT TTT ACA Arg Phe Thr 505
GTA
Vai 490 GAT GTG TCT TTG Asp Val Ser Leu CAA GAT Giu Asp ATT AAA AGA Ile Lys Arg CTG ACA CCA Leu Thr Pro CTG TGC CCC AAT Leu Cys Pro Asn TAT TAC TTT GCA ATC GAT CCT AAT Tyr Tyr Phe Aia Ile Asp Pro Asn 515 CTT CAG CCA AAG GAG CCA CTA ACA Leu Gin Pro Lys Giu Pro Vai Thr GCT TAT GTT TTA TTA CAT CCA AAT Giy Tyr Vai Leu Leu His Pro Asn
TTG
Leu 535 CAT TTC CTT GAT Asp Phe Leu Asp
GCA
Ala GAG TTA GAG Giu Leu Giu CGA AAT AAG ATG ATT GAT GGG GAA ACT Arg Asn Lys Meft T1e As lyCu e AAT GAT Asn Asp 545 GGA CAA Giy Giu ATT AAA GTG GAG Ile Lys Vai Giu
ATT
Ile AAA ACA TTC Lys Thr Phe AGA ACT Arg Thr CTG GTT AAA I.eu Val Lys TAC ACA TGG Tyr Thr Trp, 585 CAA CAT GAG AGA Gin Asp Giu Axg
TAT
Tyr ATT CAC AAA GGA Ile Asp Lys Giy AAC AGG ACA Asn Arg Thr 580 GCC TTG GTA Aia Leu Val 1684 1780 1828 1876 ACA CCT CTC AAT Thr Pro Vai Asn ACA CAT TAC AGT Thr Asp Tyr Ser TTA CCA ACC TAC ACT TTT TAC Leu Pro Thr Tyr Ser Phe Tyr 600 605 TAT ATA AAA GCC Tyr Ile Lys Aia AAA CTA CAA GAG ACA Lys Leu Giu Ciu Thr 610 m -240- ATA ACT CAG GCC ACA TCA AAA AAC GGC AAA
ATG
Ile Thr Gin Ala Arg Ser Lys Lys Gly Lys Met 65620 625 RAG GAT TOG
RA
OTG RAG CCA
CAT
Leu Lys Pro Asp
AAT
Asn TTT GA
A
TOT GCC
TAT
ACA TTO
ATA
GCA OCA ACA CAT TAO TGO RAT GAO OTC AAA ATA TOG CAT AAT
AAO
Arg Asp Tyr Cys Asn Asp Leu Lys Ile Se r Asp Asn Asn 650 AC Aa Pro a OTT TTA RAT Leu Leu Asn 665 TOA TGT AAO Ser Cys Asn 680 TTO AAO GAG TTT ATT CAT AGA AAA ACT
OCA
Phe Asn Giu Phe Ile Asp Arg Lys Thr Pro 670 675 AT GAA TTTe 660 AAO RAC OCA Asn Asn pro CA CCC TTT
CG
Aa CAT TTC ATT AAT AGA
GC
Asp Leu Ile Asn Arg Val
TTOCOTT
CAT
ACA
Thr 695 RAT GAA OTT GC CAA
AAT
Asn Glu Leu Vai Gin Asn 700 TAO TGC ACT AG CAC Tyr Trp Ser Lys Gin AAA AAT ATO
AAC
GCA GTC ARA
CCA
Cly Val Lys Ala OGA TTT GTT Arg Phe Val GTG ACT CAT
OCT
Val Thr Asp Gly CCC ATT ACC 720 TAT COO AAA Tyr Pro Lys GAG GAO AGO Giu Asp Ser 1924 1972 2020 2066 2116 2164 2212 2260 2303 2356 2404 2452 2500 2548 GOT GGA CRA RAT TGGCOAA CAA ARO Ala Gly Ciu Asn Trp Gin Giu Asn TTO TAT AAA AGG
AGO
Phe Tyr Lys Arg Ser ACT CT Thr Ala COO TAO TTT Pro Tyr Phe OTA CAT RAT
CAT
Leu Asp Asn Asp CCA COT GCT
CO
Cly Pro ClY Ala OCA GAG ACA
TAT
Pro Giu Thr Tyr 740 ARC TAT CTT TTO Asn Tyr Val Phe 755 TAT GRA TCC
GCC
AC AA ACT Asn Lys Ser ATT ATG GTA AGO
AA
Ile Met VAI1 Se Lys 775
CT
Al a GTA GA ATA TAT ATT Val Glu Ile Tyr Ile CARCC ARkA OTT AAA OCT GOA
CTT
Lys Pro Ala Val CGA ATT AAA ATT
CAT
Gly Ile Lys Ile Asp CTA RAT TOO
TGC
Val Asn Ser Trp ATA
GAG
RAT TTO
ACC
Asn Phe Thr
AA
Lys ACC TOA ATO Thr Ser Ile AGA CAT
COG
Arg Asp Pro TCT CT
CCT
Cys Ala Gly GAO TC Asp CYS ARA AGA RAC ACT
GAO
Lys Arg Asn Ser Asp 825 OCA GTT
TCT
Pro Val Cys 820 OTC CAT
CAT
Leu Asp Asp CTA ATC CAT
TCT
Val Met Asp Cys 830 CTG ATT Val Ile 835 -241- GGT GGG TTT CTT CTG ATG GCA AAT CAT GAT GAT TAT ACT Gly Gly Phe Leu Leu Met Ala Asn His Asp Asp Tyr Thr 840 845 850 AAT CAG ATT Asn Gin lie GGA AGA Gly Arg 855 TTT TTT GGA GAG Phe Phe Gly Glu 860 ATT GAT CCC AGC Ile Asp Pro Ser
TTG
Leu 865 ATG AGA CAC CTG Met Arg His Leu 2644 AAT ATA TCA GTT Asn Ile Ser Val TGT GAG CCC GGT Cys Giu Pro Gly 890 TAT GTG CCA TCA Tyr Val Pro Ser 905 TAT GCT Tyr Ala 875 GCT GCA Ala Ala TTT AAC AAA Phe Asn Lys CCA AAA CAA Pro Lys Gin 895 GCA GGA Ala Gly TAT GAT TAT CAG Tyr Asp Tyr Gin TCA GTA Ser Val 885 TC GCA Ser Ala CAT CGC His Arg 900 GTA GCA GAC Val Ala Asp TTA CAA ATT GGC Leu Gin Ile Gly TGG GCC ACT Trp Ala Thr
GCT
Ala GCT GCC TGG TCT ATT Ala Ala Trp Ser 1le 920
CTA
Leu 925 CAd CAG TTT CTC Gin Gin Phe Leu AGT TTG ACC TTT Ser Leu Thr Phe CCA CGA Pro Arg 935 CTC CTT GAG Leu Leu Glu
GCA
Ala 940 GTT GAG ATG GAG Val Glu Met Glu GAT GAC TTC ACG Asp Asp Phe Thr
GCC
Al a TCC CTG TCC AAG CAG AGC TGC ATT ACT Ser Leu Ser Lys Gin Ser Cys Ile Thr CAA ACC CAG TAT Gin Thr Gin Tyr TTC TTC Phe Phe 965 2836 2884 2932 298C 3026 GAT AAC GAC Asp Asn Asp TCC AGA ATC Ser Arg Ile 985 ATA ATG GTT Ile Met Val 1000 AAA TCA TTC AGT Lys Ser Phe Ser
GGT
Gly 975 GTA TTA GAC TGT Val Leu Asp Cys GGA AAC TGT Gly Asn Cys 980 TTA ATA TTC Leu Ile Phe TTT CAT GGA GAA Phe His Gly Glu CTT ATG AAC ACC Leu Met Asn Thr GAG AGC AAA GGG ACA Glu Ser Lys Gly Thr 1005 TGT CCA TGT Cys Pro Cys GAC ACA Asp Thr i0io ATA CAA Ile Gin 1015 GCG GAG CAG Ala Glu Gin ACT TCT GAC Thr Ser Asp 1020 GGT CCA AAT CCT TGT Gly Pro Asn Pro Cvs 1025 CGA CTG CT: Arg Leu Leu GAC ATG GTT Asp Met Val 1030 GAT AAC AAT Asp Asn Asn 1045 AAG CAA CCT Lys Gin Pro GTC TTG GAG Val Leu Glu
AGA
Arg TAC CGA AAA Tyr Arg Lys 1035 GGG CCT GAT GTC TGC TTT Gly Pro Asp Val Cys Phe 1040 3026 3124 3172 3220 3268 GAT TAT ACT GAC TGT GGT GGT GTT TCT GGA TTA AAT CCC Asp Tyr Thr Asp Cys Gly Gly Val Ser Gly Leu Asn Pro 1050 1055 1060 TCC CTG TGG TAT ATC AT? GGA ATC CAG TTT CTA CTA CT? TGG CTG GTA -242- Ser Leu Trp Tyr Ile Ile Gly Ile Gl-i Phe Leu Leu Leu Trp Leu Val 1065 1070 1075 TCT GGC AGC ACA CAC CGG CTG TTA TCACCTTCTA AAAACCAAAT CTGCATAGTT 3322 Ser Gly Ser Thr His Axg Leu Leu 1080 1085 AAACTCCACA CCCTGCCA.AA ACATCACCCC TGCCCTCAAT TACATACc TAGCTCAGC 3382 TATAJAAATC.A GACAAACATT ACCTGGGCCT GTTCC.ATGGC ATA-ACACTAJA GCCCACACT 3442 CCTAACGCAC CCACTGGCTG CATGTCAGGG TGTCACATCC TTAAACGTGT GTCXATGCTG 3502 CATCATCTAT CTCTAACATC AAAGCCXAAT CCTATACSTC- TCCTCTATTG CAA-AATTTC-C 3562 **.GCSTTTGTTG TTCCA27c3'T CGT 3585 INFORMATION FOR SEQ ID NO:31: SEQUENCE C:-A-RACTERISTICS: LENCTE1: 3564 base pairs TYPE: nucleic acid STRAN1-,DEDNESS: double TOPOLOGY: linear toOV (ii) MOLECULE TYPE: DNIA (gencmiic) too* (ix) FE.ATURE: NUME/KEZY:
CDS
LOCATION: 35..3374 (n1625 to 1639 &01908 to 1928)
OTHER INFORMATION: /standard-name= "Alpha-2d"
(ix) FEATURE:
NAME/KEY: 5'UTR
LOCATION: 1. .34 (ix) FEATURhE: NAME/KEY: 3'UTR LOCATION: 337S. .3565 (xi) SEQUENCE DESCRIPTION: SEQ ID NO:31: CCGCCGAGC GCCCATTrAT CTTCCATCGC'r ATG GCT GCT GGC TCC CTG 52 Met Ala Ala Gly Cys Leu 1
S
CTC GCC TTC ACT CTG ACA CTT TTC C A TCT TTG CTC ATC CCC CCC TCG 100 Leu Ala Leu Thr Leu Thr Leu Phe Cln- Ser Leu Leu Ile Gly Pro Ser 15 TCG GAG GAG CCG TTC CCT TCG GCC GTC ACT ATC AAA TCA. TGG GTG CAT 148 Ser Clu Glu Pro Phe Pro Ser Ala Val Thr Ile Lys Ser Tr-p Val Asp 30 AAC ATC CAA GA.A CAC CTT GTC ACA CTG GCA A~AA A CA GCA AGT GCA GTC 196 Lys Met Cln Clu A sp Leu Val Thr Leu Ala Lys Thr Ala Ser Gly Val -243- 45 AAT CAG CTT GTT GAT ATT TAT GAG AAA TAT CAA GAT TTG TAT ACT GTG 244 Asn Gln Leu Val Asp Ile Tyr Glu Lys Tyr Gin Asp Leu Tyr Thr Val 55 60 65 GAA CCA AAT AAT GCA CGC CAG CTG GTA GAA ATT GCA GCC AGG GAT ATT 292 Glu Pro Asn Asn Ala Arg Gin Leu Val Glu Ile Ala Ala Arg Asp Iie GAG AAA CTT CTG AGC AAC AGA TCT AAA GCC CTG GTG AG CTGCA TTG 34 Glu Lys Leu Leu Ser Asn Arg Ser Lys Ale Leu Va Ser Leu Ala Leu 95100 GAA GCG GAG AAA GTT CAA GCA GCT CAC CAG TGG AGA GAA GAT TTT GCA 388 Glu Ala Glu Lys Val Gln Ala Ala His Gln Trp Arg Glu Asp Phe Ala 105 0 115 A ACT AA T TA6 AGC AAT GAA GTT GTC TAC TAC AAT GCA AAG GAT GAT CTC GAT CCT GAG 436 Ser Asn Glu Val Val Tyr Tyr Asn Ala Lys Asp Asp Leu Asp Pro Glu 120 125 130 130 AAA AAT GAC AGT GAG CCA GGC AGC CAG AGG ATA AAA CCT GTT TTC ATT 484 Lys Asn Asp Ser Glu Pro Gly Ser Gln Arg Ile Lys Pro Val Phe Ile 170 145 150 GAA GAT GCTA ACAA ATA TCT TAT CAG CAC GCA GCA GTC 532 Glu Asp Ala Asn Phe Gly Arg Gn Ile Ser Tyr Gl His Ala Ala Val 160 165 CAT ATT CCT ACT GAC ATC TAT GAG GGC TCA AC AT GTG TTA AAT G A 580 o His Ile Pro Thr Asp Ile Tyr Glu Giy SeG Thr Ile Val Leu Asn Glu 170 5 180 CTC AAC TGG ACA AGT GCC TTA GAT GAA GTT TTC AAA AAG AAT CGC GAG 628 Leu Asn Trp Thr Ser Ala Leu Asp Glu Val Phe Lys Lys Asn Arg Glu 185 190 195 GAA GAC CCT TCA TTA TTG TGG CAG GTT TTT GGC AGT GCC ACT GGC CTA 676 Glu Asp Pro Ser Leu Leu Trp Gln Val Phe Gly Ser Ala Thr Gly Leu 200240 245 GCT CGA TAT TAT CCA GCT TCA CCA TGG GTT GAT AAT AGT AGA ACT CCA 724 Ala Arg Tyr Tyr Pro Ala Ser Pro Trp Val Asp Asn Set Arg Thr Pro S2 260 AAT GTT AGT GAA TT AA T AT A CTG 
AGT CA ACA TGG TAC ATC CAA 8 Ser Val Ser GIy Leu Thr Leu Lys Leu Ile Arg Thr Ser Val Ser Glu -244- 265 A-TG TTA GA-A ACC CTC
TCA
Met Leu Clu Thr Leu Ser 28o A-AC AGC A-AT CCT CAG
GAT
Asn Ser Asn Ala Gln Asp 29530 270 GAT GAT GAT TTC GTG
AAT
Asp Asp Asp Phe Va]. A-sn 285 290 TA- AGC TT TTT CA-C
CAC
Val Ser Cys Phe Gin His 305 CTC TTG AA-A CAC GC
GTG
Val Leu Lys Asp Ala Val GTA GCT TCA
TTT
916 L Vala SerPh 0 0* *0 *0 00 0 0 00 00 *0 ~0 *0 *0 0 00*0 000.
0 0000 A-AT CTA ACA A-sn Val Arg CCC A-A-A
CCA
Ala LYS Cly AT AAA
A-A-A
Asn Lys Lys A-AT A-AT A-Tc
A-CA
ATT
Ile ACA CAT TAT A-AG A-AG
GGC
Thr Asp Tyr LYS Lys Gly TTT ACT TTT GCT TTT
GAA
Phe Ser Phe Ala Phe Glu CAC CTG CTT
A-AT
Cmn LeU Leu A-sn TAT AAT TT TCC Tyr A-sn Val Ser ACA CCA A-AC TCC
A-AT
Arg Ala A-sn Cys A-sn AAG ATT ATT Lys Ile Ile A-TC CTA TTC A-CC CAT CCA CCA CA-A Met Leu Phe Thr AspCyCyCu 36 0 365G y l AAA TA-C A-AT AAA CAT AA-A AAA CTA Lys Tyr A-sn LYS Asp LYS Lys Val 375 380 CAC A-CA CCC CA-C CAC A-TA TTT
AA-C
Cu Arg Ala Gin Cu Ile Phe Asn 370 CCT CTA TTC A-CC TTT TCA CTT
CGT
38Sg V l P e Arg Phe Ser Val ly CAA CA-C A-AT Gin His A-sn AAA GGT
TAT
Lys Gly Tyr A-CT CA-C
CAA
Thr Gin Clu TAT
CA-C
Tyr lu 395 TAT
TAT
Tyr Tyr AGA CCA CCT
A-TT
Arg ly Pro Ile CAA A-TT CCT
TCC
Ciu Ile Pro Ser
CAC
Gin 400
ATT
TCG A-TC CC
TGT
CAA A-AC Ciu A-sn 405 1012 1060 1108 1156 1204 1252 1300 1348 1396 1444 1492 1540 TAT TTC CAT
CTT
Tyr Leu Asp Vai CCT CCA
A-TA
Cly A-ia Ile
ACA
TTC CA
AGA
L lae A-sn 44 4 Al l 450 Clu 455 CTG CCA CTT
CTC
Leu Cly Leu Val A-TT
A-CT
Ile Thr CCA A-CT CTT
CCC
Cly Thr Leu Pro CTC TTC A-A-C
ATA
Acc Thr 470 CCC CAA TTT
GAA
Ciy Gin Phe Ciu CTC A-TC CGA
TA
Val Met Cly Val 490
A-AT
A-sn AA-C ACA AAC TTA-
AA-C
Lys Thr Asn Leu Lys AA-C CA-C CTC A-TT CTT
CCT
A-sn Gin Leu Ile LeU Cly CAT CTG TCT TTC GAA
GA-T
Asp Val Ser Leu Giu Asp 495 A-TT AAA
A-GA
Ile LYS A-rg CTG A-CA
CCA
Leu Thr Pro 500 -245- CGT TTT ACA CTC TGC CCC AAT GGG TAT TAC TTT GCA ATC GAT CCT AAT Arg Phe Thr Leu Cy, Pro Asn Gly Tyr Tyr Phe Ala Ile Asp pro Asri 505 510 515 GGT TAT GTT TTA TTA CAT CCA AAT CTT CAG CCA AAG GAG CCA CTA
ACA
Gly Tyr Val Leu Leu His Pro Asn Leu Gin Pro LYS Glu Pro Val Thr 158 1636 TTG GAT TTC CTT
GAT
Leu Asp Phe Leu Asp 535
CCA
Al a GAG TTA GAG AAT GAT Glu Leu Ciu Asn Asp ATT AAA CTG GAG Ile Lys Val Giu ATT
CGA A-AT AAC ATC -Arg Asn Lys Met ATT GAT
CCC
Ile Asp Gly CAA ACT
CGA
Ciu Ser Cly GAA AAA ACA
TTC
Clu Lys Thr Phe AGA ACT CTG OTT AAA Leu Val Lys
TCT
Ser CAA CAT GAG AGA TAT ATT Gin Asp Giu Arg Tyr Ile CAC AAA
CGA
ASP LYS Cly TAC ACA TOO ACA CCT GTC AAT GC Tyr Thr Trp Thr Pro Val Asn Gly 585 590 AAC AGO ACA Asn Arg Thr 580 CCC TTC CTA ACA CAT TAC ACT Thr Asp Tyr Ser
TTG
Leu TTA CCA Leu pro 600 ACC TAC ACT
TTT
Thr Tyr Ser Phe
TAC
Tyr TAT ATA AAA GCC Tyr Ile Lys Ala
AAA
Lys CTA GAA GAG ACA Leu Glu Clu Thr ACT CAC CCC ACA Thr Gin Ala A-rg
TAT
Tyr TCG GAA ACC Ser Clu Thr CTG AAC Leu Lys CCA CAT AAT
TTT
Pro Asp Asn Phe GAA CAA TCT CCC
TAT
Clu Ser Cly Tyr 1684 1780 1828 1876 1924 1972 2020 2068 2116 21G4 2212 ACA TTC ATA CCA Thr Phe Ile Ala 635 AAT AAC ACT CAA Asn Asn Thr Clu AAA ATA TCG Lys Ile Ser ATT CAT
ACA
lie tpP Arg 665
GAT
Asp 650 CCA ACA CAT TAC Pro Arg Asp Tyr 640 TTT CTT TTA AAT Phe Leu Leu Asn 655 (C-CA TCA TGT AAC Pro Ser CYS Asn TCC AAT CAC CTG TTC AAC GAG
TTT
Phe Asn Ciu Phe AAA ACT CCA AAC LYS Thr Pro Asn CAT TTC ATT AAT AGA Asn Arg 680 CTC TTC CTT
CAT
Val Leu Leu Asp
CCA
Al a CCC TTT ACA AAT Cly Phe Thr Asn
CAA
Clu CTT OTC CAA AAT Leu Val Gin Asn
TAC
Tyr 695 TCC ACT AAG CAd AAA AAT ATC AAG CCA Trp Ser Lys Gin Lys Asn Ile Lys Cly AAA CCA CCA TTT Lys Ala Arg Phe CTC ACT CAT GOT CCO ATT ACC ACA CTT Val Thr Asp Cly Cly Ile Thr Arg Val 715
TAT
Tyr 720 CCC AAA GAG CT Pro Lys Glu Ala GGA CAA Cly Giu 725 -246- AAT TGG CAA
GAA
Asn Trp Gin Giu 730 AGO CTA GAT
AAT
Ser Leu Asp Asn 745 ACT GGA COT
GGT
Ser Gly Pro Gly 760 AAC OCA GAG ACA TAT Asn Pro Giu Thr Tyr GAG GAO AGO TTO TAT AAA
AG
Giu Asp Ser Phe Tyr Lys Arg CAT AAO TAT
GTT
Asp Asn Tyr Val 750 CO TAT GAA
TOG
Alia Tyr Ciu Ser TTO ACT
GOT
000 TAO TTT AAO AAA Phe Asn Lys AAA GOT CTA GGO ATT ATG GTA
AGO
Cly Ile Met Val Ser *saw 04 0 00 0* .0 0* C ATA TAT
ATT
Ile Tyr Ile CAA~ CCC Gin Gly 780 AAT TOO A-sn Ser AAA OTT OTT AAA OOT GOA GTT CTT CGA Lys 1,eu Leu Lys Pro Ala Val Vai Gly AAA ATT CAT CTA Lys Ile Asp Vai TGG ATA GAG AAT TTO ACC
AAA
Trp Ile Giu Asn Phe Thr Lys AGA CAT CCG A-rg ASP Pro
TCT
ys CT CGT OOA CTT TCT
GAO
Ala Cly Pro Vai Oys Asp TGO AAA
AGA
GTA
ATG
Val Met ACC TOA ATO Thr Ser Ile 805 AAO ACT GAO Asn Ser Asp 820 CTG ATG
GOA
Leu Met Ala GGA GAG ATT TGT CTC ATT OTG CAT CAT CCT GGG TTT OTT Oys Vai Ile Leu Asp Asp ClY Giy Phe Leu 2260 2308 2356 2 452 2500 2548 259-9 264- 2692 274 C 278e 2836 AAT OAT CAT CAT TAT ACT ALT OAC ATT GGA AGA TTT
TTT
Asn His Asp Asp Tyr Thr Asn Ginli GyArPePe 840 845 Il250AgPh h 000 AGO TTG Pro Ser Leu ATC AGA CAC OTC CTT AAT ATA TOA GTT TAT
GOT
Met Arg His Leu Val Asn Ile Ser VaTyAl 8 6 0 a y l AAO AAA
TOT
ASn Lys Ser AAA CAA GGA Lys Gin Gly ATA TTA CAA Ile Leu Gin 905 TAT CAT Tyr Asp 875 GCA GGA Ala Gly TAT OAC TCA Tyr Gin Ser AT CC TA His Arg Ser TA TT GAG 000 GGT CT GCA
OA
Vai Cys Cu Pro Gly Ala Al1a Pro 880 825 GA TAT GTG OA TA GTA GA GAO Ala Tyr Vai Pro~ Ser 'Jai Ala Asp 895 900 ATT GGO TCG
TGG
Ile Gly Trp Trp ACT GOT GOT
CO
Thr Ala Ala Ala TGG TOT ATT OTA OAG
OAG
Gin Gin 920 TTT OTO TTG
ACT
Phe Leu Leu Ser ACC TTT OCA
OGA
Thr Phe Pro Arg OTT GAG GOA
GTT
GAG ATG GAG CAT
CAT
Giu Met Giu Asp Asp 935
GAO
Asp TTO AG COO TOO OTG TOO AAC CAG
AGO
TGC
CYS
ATT ACT GAA CAA ACC CAC TAT TTO TTO CAT AAO GAO ACT AAA TOA
TTO
2932 -247- Ile Thr Giu Gin Thr Gin Tyr Phe Phe Asp Asn Asp Ser Lys Ser Phe 955 960 965 AGT GGT GTA TTA GAC TGT GGA AAC TGT TCC AGA ATC TTT CAT GGA
GAA
Ser Giy Val LeU Asp Cys Gly Asn Cys Ser Arg Ile Phe His Giv Giu 970 975 980 AAG CTT ATG AAC ACC AAC TTA ATA TTC ATA ATG GTT GAG AGC AAA
GGG
Lys Leu Met Asn Thr Asn Leu Ile Phe Ile Met Vai Giu Ser Lys Giy 985 9.90 995 ACA TGT CCA TGT GAC ACA CGA CTG CTC ATA CAA GCG GAG CAG ACT
TCT
Thr Cys Pro Cys A-sp Thr Arg Leu Leu Ile Gin Aia Giu Gin Thr Ser 1000 1005 1010 GAC GGT CCA AAT CCT TGT GAC ATG GTT AAG CAA CCT AGA TAC CGA AAA Asp Giy Pro Asn Pro Cys Asp Met Vai Lys Gin Pro Arg Tyr Arg Lys J015 1020 1025 1030 GGG CCT GAT GTC TGC TTT GAT AAC AAT GTC TTG GAG GAT TAT ACT
GAC
Glyv Pro Asp Vai Cys Phe Asp Asn Asn Vai Leu diu Asp Tyr Thr Asp 1035 1040 1045 TGT GGT GGT GTT TCT GGA TTA AAT CCC TCC CTG TGG TAT ATC ATT
GGA
Cys Giy Giy Vai Ser Giy Leu Asn Pro Ser Leu Trp Tyr Ile Ile Giy 1050 1055 1060 ATC CAd TTT CTA CTA CTT TGG CTG GTA TCT GGC AGC ACA CAC CGG
CTG
Ilie Gin Phe Leu Leu Leu Trp Leu Vai Ser Giy Ser Thr His Arg Leu 1065 1070 1075 TTA TGACCTTCTA AAAACCAAAT CTGCATAGTT AAACTCCAGA
CCCTGCCA
Leu ACATGAGCCC TGCCCTCAAT TACAGTAACG TAGGGTCAGC TATAAAATCA
GACAAACATT
AGCTGGGCCT GTTCCATGGC ATAACACTAA GGCGCAGACT CCTAAGGCAC
CCACTGGCTG
CATGTCAGGG TGTCAGATCC TTAAA.CGTGT GTGAATGCTG CATCATCTAT
GTGTAACATC
AAAGCAAAAT CCTATACGTG TCCTCTATTG GAAAATTTGG GCGTTTGTTG
TTGCATTGTT
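The feature tables in these listings give 1-based, inclusive coordinates for the 5'UTR, CDS and 3'UTR of each sequence. As a minimal sketch of how such a table can be checked for internal consistency, the Python below uses the Alpha-2c values stated in this listing (length 3585; 5'UTR 1..34; CDS 35..3295; 3'UTR 3296..3585). The names `Feature`, `check_features` and `cds_residues` are hypothetical helpers introduced only for illustration, and the residue count assumes that the stated CDS range includes the terminal stop codon.

```python
from typing import List, NamedTuple


class Feature(NamedTuple):
    key: str    # feature key, e.g. "CDS"
    start: int  # 1-based, inclusive
    end: int    # 1-based, inclusive


def check_features(length: int, features: List[Feature]) -> List[str]:
    """Return a list of problems; an empty list means the features
    cover positions 1..length with no gaps and no overlaps."""
    problems = []
    expected = 1
    for f in sorted(features, key=lambda f: f.start):
        if f.start != expected:
            problems.append(f"{f.key} starts at {f.start}, expected {expected}")
        expected = f.end + 1
    if expected != length + 1:
        problems.append(f"features end at {expected - 1}, sequence length is {length}")
    return problems


def cds_residues(cds: Feature) -> int:
    """Number of encoded residues, assuming the CDS range includes the stop codon."""
    n = cds.end - cds.start + 1
    if n % 3 != 0:
        raise ValueError(f"CDS length {n} is not a multiple of 3")
    return n // 3 - 1  # subtract the stop codon


# Coordinates as stated in the Alpha-2c feature table of this listing.
ALPHA_2C = [
    Feature("5'UTR", 1, 34),
    Feature("CDS", 35, 3295),
    Feature("3'UTR", 3296, 3585),
]

if __name__ == "__main__":
    print(check_features(3585, ALPHA_2C))
    print(cds_residues(ALPHA_2C[1]))
```

The same check can be run on the other feature tables in the listing to flag coordinates that do not tile the stated sequence length.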
INFORMATION FOR SEQ ID NO:32:
SEQUENCE CHARACTERISTICS:
LENGTH: 3579 base pairs
TYPE: nucleic acid
STRANDEDNESS: double
TOPOLOGY: linear
(ii) MOLECULE TYPE: DNA (genomic)
(ix) FEATURE:
NAME/KEY: CDS
LOCATION: 35..3289
OTHER INFORMATION: /standard-name= "Alpha-2e"
(ix) FEATURE:
NAME/KEY: 5'UTR
LOCATION: 1..34
(ix) FEATURE:
NAME/KEY: 3'UTR
LOCATION: 3289..3579
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:32:
GCGGGGGAGG GGGCATTGAT CTTCGATCGC GAAG ATG GCT GCT GGC TGC CTG
Met Ala Ala Gly Cys Leu
CTG
Leu
TCG
Ser
AAG
Lys
AAT
Asri 55
GA.A
Glu
GAG
Glu
GAA
Glu
AGC
Ser
AAA
Ly s 13S
GC
Al a
GAG
Glu
ATG
Met 40
CAG
Gin
CCA
Pro
AAA
Lys
GCG
Ala
AAT
Asn.
120
AAT
A sn
TTG
Leu
GAG
Glu
CAA
Gin
CTT
Leu
AAT
Asn
CTT
Le u
GAG
Glu 105
GAA
Glu
GAC
Asp
CTG
Leu
TTC
Phe
GAC
Asp
GAT
Asp
GCA
Al a
AGC
Ser
GTT
Val1
GTC
Val1
GAG
G111
ACA
Thr
CCT
Pro
CTT
Leu
ATT
Ile 60
CGC
Axrg
AAC
Asn.
CAA
Gin
TAC
Tyr
CCA
Pro 140 TCT TTG CTC ATC GGC CCC Le u
AAA
Lys
ACA
Thr
GAT
Asp
GCA
Ala
GTG
Val1
AGA
Arg
GAT
Asp 130
AAA
Lys Ile
TCA
Ser
OCA
Ala
TTG
Le u
GCC
Al1 a
AGC
Ser
GAA
Glu 115
CTC
Leu C CT Pro Gly
TGG
T rp AGT1 Ser
TAT
Tyr
AGO
Axg
CTG
Leu 100
GAT
Asp
OAT
Asp O-1T Val1 Pro
GTG
Val1
GGA
Gly
ACT
Th r
GAT
Asp
GCA
Ala
TTT
Ph e
OCT
Pro
TTC
Phe 388 145 150 GPA GAT OCT A.AT TTT GGA CGA CAA ATA TOT TAT CAG CAC GCA GCA GTO -249- Giu Asp Ala Asn Phe Gly Arg Gin Ile Ser Tyr Gin 155 160 CAT ATT CCT ACT GAC ATC TAT GAG GGC TCA ACA ATT His Ile Pro Thr Asp Ile Tyr Giu Giy Ser Thr Ile 170 175 CTC A-AC TGG ACA AGT GCC TTA GAT GAA GTT TTC AAA Leu Asn Trp, Thr Ser Aia Leu Asp Giu Vai Phe Lys 185 190 GAA GAC CCT TCA TTA TTG TGG CAG GTT TTT GOC AGT Giu Asp Pro Ser Leu Leu Trp Gin Val Phe Gly Ser 200 205 210 GCT CGA TAT TAT CCA GCT TCA CCA TGG GTT GAT AAT Ala Ar; Tyr Tyr Pro Ala Ser Pro Trp Val Asp Asn 215 220 225 AAT AAG ATT GAC CTT TAT GAT GTA CGC AGA AGA CCA 'I Asn Lys Ile Asp Leu Tyr Asp Val Arg Ar; Ar; Pro 'I 235 240 GGA GCT GCA TCT CCT AAA GAC ATG CTT ATT CTG GTG G Gly Ala Ala Ser Pro Lys Asp Met Leu Ile Leu Val A 250 255 ACT GTT AGT GGA TTG ACA CTT AAA CTG ATC CGA ACA T Ser Vai Ser Gly Leu Thr Leu Lys Leu Ile Arg Thr S 265 270 2 ATC TTA GAA ACC CTC TCA GAT GAT GAT TTC CTG AAT CG Met Leu Giu Thr Leu Ser Asp Asp Asp Phe Val Asn V~ 280 285 290 AAC ACC AAT GCT CAG CAT GTA AGC TGT TTT CAC CAC C] Asn Ser Asn Ala Gin Asp Val Ser Cys Phe Gin His Le 295 300 305 AAT GTA ACA AAT AAA AAA GTG TTG AAA GAC GCG CTG AA Asn Val Arg Asn Lys Lys Val Leu Lys Asp Ala Val As 315 320 GCC AAA GGA ATT ACA GAT TAT AAG AAG CGC TTT ACT TT Ala Lys Gly Ile Thr Asp Tyr Lys Lys Gly Phe Ser Ph 330 335 CAC CTG CTT AAT TAT AAT GTT TCC AGA GCA AAC TCC AA Gin Leu Leu Asn Tyr Asn Val Ser Arg Ala Asn Cys As 345 350 35.
ATC CTA TTC ACG GAT GGA GGA GAA GAG AGA GCC CAG GAC Met Leu Phe Thr Asp Gly Giy Giu Clu Ar; Ala Gin Cl.
360 365 370 AAA TAC AAT AAA GAT AAA AAA GTA CGT CTA TTC AGG TT] H{is A GTC T Val L AAG A Lys A~ 195 GCC AC Ala Th r,,GT AG Ser Ar ?GC TA ~rp TY AT CT( .Sp Va 26( C T GTC e r Val 75 TA GC a 1 Ala 7T GTC ~u Val ~T AAT n Asn T GCT e Ala 340 T AAC n~ Lys
GATA
7TCA i a Ala Vai 165 TA AAT GAA eu Asn Ciu ~T CGC GAG sn Arg Giu T GGC CTA ~r Cly Leu A ACT CCA.
g Thr Pro 230 C ATC CAA r Ile Gin 245 3AGT1 GGA LSer Gly TCC GAA Ser Giu TCA TTT Ser Phe CAA GCA Gin Ala 310 ATC ACA Ile Thr 325S TTT GAA Phe Giu ATT ATT Ile Ile TTT AAC Phe Asn GTT
CCT
580 6 28 6706 724 772 820 868 964 1060 1108 1156 1204 -250- Lys Tyr Asn Lys ASP Lys Lys Val Arg Val Phe Arg Phe SerVa Gy 375 380 385 3al90 CAA CAC AAT TAT GAG AGA GGA OCT ATT 390G AGGCTG A GnHsAnTyr Giu Arg Gly Pro Ile Gin Trp Met Ala Cys Giu Asn 395 400 405 AAA GGT TAT TAT TAT GA.A ATT CCT TCC ATT GGT GCA ATA AGA ATO AAT 1300 Lys Cly Tyr Tyr Tyr Giu Ile Pro Ser Ile Gly Ala Ile Arg Ile Asn 410 415 420 ACT CAG GAA TAT TTG GAT GTT TTG GGA AGA CCA ATG GTT TTA GCA GCA 1348 Thr Gin Giu Tyr Leu Asp Val Leu Giy Arg Pro Met Val Leu Ala Gly 45430 435 GAO AAA GOT AAG CAA GTC CAA TGG ACA AAT CTG TAO OTG GAT CCA TTG 1396 ApLys Aia Lys Gin Vai Gin Trp, Thr Asn VlTrLuApAla Leu 440* 44545 CTG OGTT GTO ATT ACT CCA ACT OTT COG 450 TCACAT CC14 *Gu Leu Gly Leu Val Ile Thr ly Thr Leu Pro Vai Phe Asn Ile Thr 45460 465 470 :GOC CAA TTT GAA AAT AAG ACA AAO TTA AAG AAO CAGCTG ATT TT GGT 1492 *Gly Gin Phe Cu Asn Lys Thr Asn Leu Lys Asn Gin Leu Ile Leu Gly 475 480 -485 GTG ATG GGA GTA GAT GTG TOT TTG CAA CAT ATT AAA AGA OTG ACA OCA 1540 Val Met Gly Val Asp Vai Ser Leu Ciu Asp Ile Lys A-rg Leu Thr Pro 490 495 500 CT TTT ACA OTG TC COO AAT GGC TAT TAO TTT GOA ATO GAT OCT AAT 1588 Arg Phe Thr Leu Cys Pro Asn Gly Tyr Tyr Phe Ala Ile Asp Pro Asn 505 510 515 GGT TAT CTT TTA TTA CAT OCA AAT OTT CAG OCA AAG AAC COO AAA TOT 1638 Gly Tyr Val Leu Leu His Pro Asn Leu Gin Pro Lys Asn Pro Ls Ser 520 525 530 CAG GAG OCA CTA ACA TTG CAT TTO OTT GAT GCA GAG TTA GAG AAT CAT 1684 Gin Giu Pro Val Thr Leu Asp Phe Leu Asp Ala Ciu Leu Clu Asn Asp 535 540 545 550) ATT AAA GTG GAG ATT OGA AAT AAC ATC ATT CAT GGG GAA ACT CGA CAA 1732 Ile Lys Val Ciu Ile Arg Asn Lys Met Ile Asp Gly Giu Ser Gly Ciu 55560 565 AAA ACA TTC AGA ACT OTG CTT AAA TOT CAA GAT GAG AGA TAT ATT GAO 1780 Lys Thr Phe Axg Thr Leu Val Lys Ser Gin Asp Giu Arg Tyr Ile Asp 570 575 580 AAA CGA AAC AGG ACA TAO ACA TGG ACA OCT GTC AAT CCC ACA CAT TAO 1828 Lys Gly Asn Axg Thr Tyr Thr Trp Thr Pro Vai Asri Gly Thr Asp Tyr 585 590 595 ACT TTG CO TTG GTA TTA 
OCA ACC TAO ACT TTT TAO TAT ATA AAA CCC 1876 -251oooo o o I oo Ser Leu Ala 600 AAA CTA GAA Lys Leu Glu 615 CCA GAT AAT Pro Asp Asn TAC TGC AAT Tyr Cys Asn AAT TTC AAC Asn Phe Asn 665 AAC GCG GAT Asn Ala Asp 680 GAA CTT GTC C Glu Leu Val C 695 AAA GCA CGA T Lys Ala Arg p AAA GAG GCT G Lys Glu Ala G 7 AGC TTC TAT A Ser Phe Tyr L 745 CCC TAC TTT A; Pro Tyr Phe AE 760 GTA AGC AAA GC Val Ser Lys Al 775 GCA GTT GTT GG Ala Val Val G1l ACC AAA ACC TC Thr Lys Thr Se 81 AAA AGA AAC AG' Leu Val GAG ACA Glu Thr TTT GAA Phe Glu 635 GAC CTG Asp Leu 650 GAG TTT Gl31u Phe TTG ATT Leu Ile :AA AAT ;ln Asn 1 'TT GTT G 'he Val V 715 GA GAA A ly Glu A 30 AA AGG Ai ys Arg S AC AAA AC sn Lys SE IT GTA G; .a Val G1 78 ;A ATT AA y Ile Ly 795 A ATC AG r Ile Ar 0 T GAC GT
I
A
I
6
G.
G
L
A
Ii a.
As rA '0
!TI
'a.
A:
AT
er
A
.u 10 *s
A
g
A
,eu P 6 .TA A( le T 20 AA T( lu SE kA AT ys II 'T GA .e As rT AG.
;n Ar.
68 C TG( r Tr] 0 3 AC' 1 Th2 T TGG 1 Trp
CTA
Leu
GGA
Gly 765
ATA
Ile
ATT
Ile
GAT
Asp
ATG
ro Thr T 05 CT CAG G hr Gin A CT GGC T er Gly T 'A TCG G .e Ser As 65 .T AGA A; P Arg Lj 670 A GTC TT g Val Le 5 G AGT AA Ser Ly r GAT GG' Asp G1l CAA GAA SGin Gli 735 GAT AA1 Asp Asn 750 CCT GGT Pro Gly TAT ATT Tyr Ile GAT GTA Asp Val CCG TGT Pro Cys 815 GAT TGT 'yr Ser Phe Tyr 610 CC AGA TAT TCG la Arg Tyr Ser 625 AT ACA TTC ATA yr Thr Phe Ile 640 IT AAT AAC ACT sp Asn Asn Thr 55 A ACT CCA AAC 's Thr Pro Asn 'G CTT GAT GCA u Leu Asp Ala C 690 G CAG AAA AAT s Gin Lys Asn I 705 T GGG ATT ACC A y Gly Ile Thr A 720 A AAC CCA GAG A u Asn Pro Glu T.
GAT AAC TAT G Asp Asn Tyr V; 7; GCC TAT GAA TC Ala Tyr Glu SE 770 CAA GGG AAA C1 Gin Gly Lys Le 785 AAT TCC TGG AT Asn Ser Trp Ii 800 GCT GGT CCA GT Ala Gly Pro Va GTG ATT CTG GA'
T
G.
G.
G(
A!
GA
G1l
AA
As ;7
'G
1: Si
.G;
rg
CA
hr
TT
al
"G
ir
FT
'u
'A
e
T
1
T
yr
AA
lu
CA
!a
LA
.u .C C n P C T y P C A i L h Gq
V
T
TC
74
TT
Ph
GG
G1l
CT
Le
GA(
GIl
TG'
Cys 82C
GAI
Ile Lys Ala ACC CTG AAG Thr Leu Lys 630 CCA AGA GAT Pro Arg Asp 645 TTT CTT TTA Phe Leu Leu :CA TCA TGT 'ro Ser Cvs 'TT ACA AAT he Thr Asn AG GGA GTG ys Gly Val 710 TT TAT CCC al Tyr Pro 725 AT GAG GAC yr Glu Asp [0 'C ACT GCT ie Thr Ala C ATT ATG y Ile Met T AAA CCT u Lys Pro 790 G AAT TTC U Asn Phe 805 r GAC TGC s Asp Cys GGT GGG 1924 1972 2020 2068 2164 2212 2260 2308 2356 2404 2452 2500 2548 -252- Lys Arg Asn Ser Asp Val Met Asp Cys Val Ile Leu Asp Asp Gly Gly 825 830 835 TTT CTT CTG ATG CA AAT CAT GAT GAT TAT ACT AAT CAG ATT GGA AGA 2596 Phe Leu Leu Met Ala Asn His Asp Asp Tr Thr Aso Gin le Gly Arg 850 TTT TTT GGA GAG ATT GAT CCC AGC TTG ATG AGA CAC CTG cT AAT ATA 2644 Phe Phe Gly Giu ie Asp Pro Ser Leu Met Arg His Leu Vai Asn Ile 855 860 865 870 TCA GTT TA- GCT TTT AAC AAA TCT TAT GAT TAT CAG TCA GTA TGT GAG 2692 Ser Val iyla y Phe Asn Lys Ser Ty Asp Tyr Gin Ser Val Cys Glu 87 880 885 .CCC GGT GCT GA CCA AAA CAA GGA GCA GGA CAT CGC TCA GCA TAT GTG 2740 *Pro Gl Ala Ala Pro Lys Gin Giy Ala Gly His Arg Ser Ala Tyr Val S9890 900 CCA TCA GTA GCA GAC ATA TTA CAA ATT GGC TGG TGG GCC ACT GCT GCT 2788 Pro SerVal Ala Asp Ile Leu Gin Ile Gy Trp Trp Ala Thr Ala Ala 905 910 915 GCC TGG TCT ATT CTA CAG CAG TTT CTC TTG AGT TTG ACC TTT CCA CGA 2836 Ala Trp Ser Ile Leu Gin Gin Phe Leu Leu Ser Leu Thr Phe Pro Arg 925 930 CTC CTT GAG GCA GTT GAG ATG GAG GAT GAT GAC TTC ACG GCC TCC CTG 2884 Leu Leu Giu Ala Vai Giu Met Giu Asp Asp Asp Phe Thr Aia Ser Leu 935 9409 9969 945 950 TCC AAG CAG AGC TGC ATT ACT GAA CAA ACC CAG TAT TTC TTC GAT AAC 2932 Ser Lys Gin Ser Cys Ile Thr Giu Gin Thr Gin Tyr Phe Phe Asp Aso 970 975 980 ATC TTT CAT GGA GAA AAC CTT ATG AAC ACC AAC TTA ATA TTC ATA ATG 3025 lie Phe His Cly Gu Lys Leu Met Asn Thr Asn Leu Ile Phe Ile Met 995 GTT GAG AGC AAA GGG ACA TGT CCA TGT GAC ACA CGA CTC CTC ATA CAA 3076 G G A A G A T C T G A C C C AA AA 3076 Vai Giu Ser Lys Giy Thr Cys Pro Cys Asp Thr Arg Leu Leu Ile Gin 1000 1005 1010 GCG GAG CAC ACT TCT GAC GGT CCA AAT CCT TGT 
GAC ATC GTT AAG CAA 3124 Ala Glu Gin Thr Ser Asp Gy Pro Asn Pro Cys Asp Met Val Lys Gin 1015 1020 1025 1030 CCT AGA TAC CA AAA GGG CC? GAT GTC TGC TTT GAT AAC AAT GTC TTG 3172 Pro Arg Tyr Arg Lys Giy Pro Asp Vai Cys Phe Asp Asn Asn Val Leu 1035 ysPhe Asp Asn Asn Val Leu 1035 1040 1045 GAG GAT TAT ACT GAC TGT GGT GGT GTT TCT GGA TTA AAT CCC TCC CTG 3220 I -253- 0Th As T r T r0 sp Cy Gy GlY Val Ser Gly Leu Asn Pro Ser Leu 10501055 1060 TGG TAT ATC ATT GGA ATC CAG TTT CTA OTA CTT TGG OTG GTA TOT GGC Trp Tyr Ile Ile Giy Ile Gin Phe Leu Leu Leu Trp Leu Vai Ser Giy 1065 1070 1075 ACC ACA CAC CGG OTG TTA TGAOCTTOTA AAAACCAAAT
CTGCATAGTT
Ser Thr His Arg Leu Leu 1080 108 0* a C. a. AAACTCCAGA CCCTGCCAAA
ACATGAGCOO
TATAAAATCA GACAAACATT
AGCTGGGCCT
COTAAGGCAO OCACTGGCTG
CATGTCAGGG
CATCATCTAT GTGTAACATC
AAAGCAAAAT
GCGTTTGTTG TTGOATTGTT
GGT
INFORMATION FOR SEQ ID NO:33: (i) SEQUENCE CHARACTERISTICS:
LENGTH: 1681 base pairs TYPE: nucleic acid STRANDEDNESS: double TOPOLOGY: linear (ii) MOLECULE TYPE: DNA (genomic)
TGCCCTCAAT TACAGTAACG
TAGG-TOAGC
OTTOCATOGO ATAACACTAA
GGCGCAGACT
TGTOAGATOC TTAAACGTGT
GTGAATGCTG
CCTATACGTG TOCTOTATTG
GAAAATTTGG
3268 3316 3376 3436 3496 3556 3579
(ix) FEATURE: NAME/KEY: CDS LOCATION: 1..1437 OTHER INFORMATION: /standard_name= "Beta-1-1" (ix) FEATURE: NAME/KEY: 3'UTR LOCATION: 1435..1681 (xi) SEQUENCE DESCRIPTION: SEQ ID NO:33:
ATG GTC CAG AAG ACC AGC ATG TCC CGG GGC CCT TAC CCA CCC TCC CAG
Met Val Gln Lys Thr Ser Met Ser Arg Gly Pro Tyr Pro Pro Ser Gln 1 5 10
GAG ATC CCC ATG GAG GTC TTC GAC CCC AGC CCG CAG GGC AAA TAC AGC
Glu Ile Pro Met Glu Val Phe Asp Pro Ser Pro Gln Gly Lys Tyr Ser 25
AAG AGG AAA GGG CGA TTC AAA CGG TCA GAT GGG AGC ACG TCC TCG GAT
Lys Arg Lys Gly Arg Phe Lys Arg Ser Asp Gly Ser Thr Ser Ser Asp 40
ACC ACA TCC AAC AGC TTT GTC CGC CAG GGC TCA GCG GAG TCC TAC ACC
Thr Thr Ser Asn Ser Phe Val Arg Gln Gly Ser Ala Glu Ser Tyr Thr 55
ACC
Sex- CGT CCA TCA Arg Pro Ser GAC TCT GAT CTA TCT CTG GAG GAG CAC CCG CAA GCC Asp Sex- Asp Val Ser Leu Clii Clu Asp Arg CluAa 70 75 80a TTA ACC AAC GAA Lieu Arg LYS Glu AAC ACC AAC CCA Lys Thr Lys pro 100 240 28R CCA GAG Ala Clu CCC CAG CCA TTA CC CAG CTC Arg Gin Ala Leu Ala Gin Leu GAG AAG GCC CTC CCA TTT GOT CTC CCC ACA Val Ala Phe Ala Val Arg Thr AAT CTT CCC TAC AAT CCC TCT CCA CCC CAT GAG CTG CCT GTG CAG CCA GTC AGCC ATC ACC TTC Pro Sex- Pro Gly Asp Glu Val Pro Val Gin Gly Val Al12 eTr5h 115 12012 GAG CCC AAA GAC TTC CTC CAC ATC AAC GAG AAA TAC AAT AAT CAC
TCC
Glu Pro LYS Asp Phe Leu His Ile Lys Glu Lys Tyr Asn Asn Asp Trp 130 135 140 e *e a a.
a a a a a. ATC CCC
CCC
Ile Gly Argr 38.: 432 480 52C CTG CTG AAC GAG Leu Val LYS Glu CCC TGT
GAG
Gly Cys Glu GTT CCC TTC ATT Val Cly Phe Ile CCC ACC CCC GTC AAA Ser Pro Val Lys CTC GAC Leu Asp AGC CTT CCC CTG CTG Sex- Leu Arg Leu Leu CAG CAA CAC AAG CTC 170 Lys17 e CCC CAC AAC CCC CTC CCC TCC AGC AAA TCA CCC CAT AAC TCC ACT TOO Arg Gin Asn Ax-g Leu Cly Sex- Sex- Lys Sex- Cly Asp Asn Sex- Sex- Sex- 180 15190 ACT CTC GGA
CAT
Sex- Leu Cly Asp 195 CTC GTC ACT Val Val Thr ACT
GCT
Sex- Gly 210 CCC ACC CCC CC Gly Thr Ax-g Arg 200 TTA CCC TTT GAA Leu Ai Dpi-l AAT GAA ATC ACT AAC Asn ClU Met Thr Asn 215 CCC ACA Pro Thr 205 CTA CAC Leu
ASP
CCC CCT CC Pro Pro Ala CCC CTA GAG
TTA
Leu 225 GAG GAG CAA GAG Ciu Clu Ciu Glu CCT GAG Ala Ciu CTT GGT GAG CAC ACT CCC TCT CC Leu Cly Clu Gin Sex- Ciy Sex- Ala ACT ACT CTT AC Thr Sex- Vai Sex-
ACT
Sex- CTC ACC ACC Val Thr Thr CCC CCA Pro Pro CCC CAT GC CCC TTC TTT AAG AAC ACA GAG CAT CTG CCC CCC TAT GAC Pro Phe Phe Lys Lys Thx- Giu His Val Pro Pro y s 260 TyAs AAA CCC ATO Lys Arg Ile 255 CTC CTC CCT Val Val Pro 270 720 768 816 -255- TCC ATG AGG CCC ATC ATC CTG GTG GGA CCG TCG CTC AAG GGC TAC GAG Ser Met Ar Pro Ile Ile Leu Val Gly Pro Ser Leu Lys Gly Tyr Glu 275 280 285 285 GTT ACA GAC ATG ATG CAG AAA GCT TTA TTT GAC TTC TTG AAG CAT CGG Val Thr Asp Met Met Gin Lys Ala Leu Phe Asp Phe Leu Lys His Arg 290 295300 TTT GAT GGC AGG ATC TCC ATC ACT CGT GTG ACG GCA GAT ATT TCC
CTG
Phe Asp Gly Arg Ile Ser Ile Thr Arg Val Thr Ala Asp Ile Ser Leu GCT AAG CGC TCA GTT CTC AAC AAC CCC AGC AAA CAC ATC ATC ATT GAG Ala Lys Arg Ser Val Leu Asn Asn Pro Ser Lys His Ile Ile Ile Glu 325 330 335 CGC TCC AAC ACA CGC TCC AGC CTG GCT GAG GTG CAG AGT GAA ATC GAG Arg Ser Asn Thr Arg Ser Ser Leu Ala Glu Val Gin Ser Glu Ile Glu 340 345 350 CGA ATC TTC GAG CTG GCC CGG ACC CTT CAG TTG GTC GCT CTG GAT GCT Arg Ile Phe Glu Leu Ala Arg Thr Leu Gin Leu Val Ala Leu Asp Ala 355 360 365 GAC ACC ATC AAT CAC CCA GCC CAG CTG TCC AAG ACC TCG CTG GCC CCC Asp Thr Ile Asn His Pro A-la Gin Leu Ser Lys Thr Ser Leu Alia Pro 370 375 380 ATC ATT GTT TAC ATC AAG ATC ACC TCT CCC AAG GTA CTT CAA AGG CTC le lile Val Tyr Ile Lys Ile Thr Ser Pro Lys Val Leu Gln Aro Leu 385 390 395 400 3 9 5 400 ATC AAG TCC CGA GGA AAG TCT CAG TCC AAA CAC CTC AAT GTC CAA ATA Ile Lys Ser Arg Gly Lys Ser Gin Ser Lys His Leu Asn Val Gln Ile 405 410 415 415 Ala Ala Ser Glu Lys Leu Ala Gin Cys Pro Pro Glu Met Phe Asp Ile 420 425 430 ATC CTG GAT GAG AAC CAA TTG GAG GAT GCC TGC GAG CAT CTG GCG GAG Ile Leu Asp Glu Asn Gin Leu Glu Asp Ala Cys Glu His Lel Al1m G1., 440 445 TAC TTG GAA GCC TAT TGG AAG GCC ACA CAC CCG CCC AGC AGC ACG CCA Tyr Leu Glu Ala Tyr Trp Lys Ala Thr His Pro Pro Ser Ser Thr Pro 450 455 460 CCC AAT CCG CTG CTG AAC CGC ACC ATG GCT ACC GCA GCC CTG GCT Pro Asn Pro Leu Leu Asn Arg Thr Met Ala Thr Ala Ala Leu Ala 465 470 475 GCCAGCCCTG CCCCTGTCTC CAACCTCCAG GTACAGGTGC TCACCTCGCT
CAGGAGAAAC
CTCGGCTTCT GGGGCGGGCT GGAGTCCTCA CAGCGGGGCA GTGTGGTGCC
CCAGGAGCAG
864 912 960 1008 1056 1104 1152 1200 1248 1296 1344 1392 1437 1497 1557
GAACATGCCA TGTAGTGGGC GCCCTGCCCG TCTTCCCTCC TGCTCTGGGG TCGGAACTGG 1617
AGTGCAGGGA ACATGGAGGA GGAAGGGAAG AGCTTTATTT TGTAAAAAAA TAAGATGAGC 1677
GGCA 1681
INFORMATION FOR SEQ ID NO:34: (i) SEQUENCE CHARACTERISTICS:
LENGTH: 1526 base pairs TYPE: nucleic acid STRANDEDNESS: double TOPOLOGY: linear (ii) MOLECULE TYPE: DNA (genomic) (ix) FEATURE: NAME/KEY: CDS
LOCATION: 1..651 OTHER INFORMATION: /standard_name= "Beta-1-4" (xi) SEQUENCE DESCRIPTION: SEQ ID NO:34: ATG GTC CAG AAG ACC AGC ATG TCC CGG GGC CCT TAC CCA CCC TCC CAG 48 Met Val Gln Lys Thr Ser Met Ser Arg Gly Pro Tyr Pro Pro Ser Gin *1 5 10 40 ACC ACA TCC AA AGC TTT GTC CGC CAG GGC TCA GCG GAG TCC TAC ACC 192 Thr Thr Ser Asn Ser Phe Val Arg Gln Gly Ser Ala Glu Ser Tyr Thr 55 SAGC CGT CCA TCA GAC TCT GAT GTA TCT CTG GAG GAG GAC CGG GAA GCC 240 Ser Arg Pro Ser Asp Ser Asp Val e Leu G u Asp Arg Gu Aa 70 75 TTA AGG AAG GAA GCA GAG CGC CAG GCA TTA GCG CAG CTC GAG AAG GCC 288 Leu Arg Lys Glu Ala Glu Arg Gin Ala Leu Ala Gin Leu Glu Lys Ala 90 AAG ACC AAG CCA GTG GCA TTT GCT GTG CGG ACA AAT GTT GGC TAC AAT 336 Lys Thr Lys Pro Val Ala Phe Ala Val Arg Thr Asn Val Gly Tyr Asn 100 105 110 CCG TCT CCA GGG GAT GAG GTG CCT GTG CAG GGA GTG GCC ATC ACC TTC 384 Pro Ser Pro Gly Asp Glu Val Pro Val Gin Gly Val Ala Ile Thr Phe 115 120 125 -257- GAG CCC AAA GAC TTC CTC CAC ATC AAG GAG AAA TAC AAT AAT GAC TG Ciu Pro Lys Asp Phe Leu H{is Ile Lys Giu Lys Tyr Asn Asn Asp Trp 130 135 14 0 TGG ATC CGG CGG CTG GTG AAG GAG GGC TGT GAG GTT GGC TTC ATT CCC Trp Ile Ciy Arg Leu Val Lys Giu Giy Cys Giu Vai Cly Phe Ile Pro 145 150 155 160 AGC CCC GTC AAA CTG GAC AGC CTT CGC CTG CTG CAC GAA CAG AAG CTC Ser Pro Val Lys Leu Asp Ser Leu Arg Leu Leu Gin Giu Gin Lys Leu 165 170 175 CGC CAG AAC CGC CTC GGC TCC AGC AAA TCA GGC GAT AAC TCC ACT TCC Arg Gin Asn- Arg Leu Giy Ser Ser Lys Ser Giy Asp Asz Ser Ser Ser 180 185 190 ACT CTG GGA GAT GTG GTG ACT GGC ACC CGC CCC CCC ACA CCC CCT CC Ser Leu Cly Asp Val Val Thr Gly Thr Arg Arg Pro Thr Pro Pro Ala 195 200 205 AGT GAC AGA GCA TGT GCC CCC CTA TGACGTGGTC CCTTCCATGA GGCCCATCAT Ser Asp Arg Ala Cys Ala Pro Leu 210 215 CCTGGTGGGA CCGTCGCTCA AGGGCTACGA GGTTACAGAC ATGATGCAGA
AACCTTTATT
TGACTTCTTG AAGCATCGGT TTGATGGCAG GATCTCCATC ACTCGTGTGA
CGGCAGATAT
TTCCCTGGCT AAGCGCTCAG TTCTCACA CCCCAGCAAA CACATCATC
'"M
432 480 528 576 624 678 738 798 858 918 978 1038 1098 1158 1218 1278 1338 1398 1458 1518 1526
CAACACACC
CCGGACCCTT
CAAGACCTCG
-AAGGCTCATC
CTCGGAAAAG
ATTGGAGGAT
CCCGCCCAGC
GGCTGCCAGC
TCCAGCCTGG
CAGTTGGTCG
CTGGCCCCA
AAGTCCCGAG
CTGGCACAGT
GCCTGCGAC
CCTGCCCCTC
CTGAGGTGCA
CTCTGGATGC
TCATTGTTTA
GAAAGTCTCA
GCCCCCCTGA
ATCTGGCCGA
*GAGTGAAATC
TGACACCATC
CATCAAGATC
GTCCAAACAC
AATGTTTGAC
GTACTTGGAjA .1UAA2CCCC GACCGAATCT TCCACCTGGC p I
AATCACCCAC
ACCTCTCCCA
CTCAATTCC
ATCATCCTCC
GCCTATTGGA
ACCATGGCTA
CCCACCTCTC_
ACGTACTTCA
AAATAGCGGC
ATCAGAACCA
AGGCCACACA
CCGCAGCCCT
CCCTCACCAC
TCCCCCACGA
CCCCTCGGAA
AAAATAAGAT
AAACCTCCGC TTCTGGGGCC GGCTCGAGTC CTCACACCGC
GGCACTGTCC
GCAGGACT GCCATGTACT GCCCGCCCTC
CCCGTCTTCC
CTGCAGTCCA GGGAACATGG AGCAGGAAGC
GAACAGCTTT
GAGCGCA
INFORMATION FOR SEQ ID NO:35:
CTCCTCTCT
ATTTTTAAA
(i) SEQUENCE CHARACTERISTICS:
LENGTH: 1393 base pairs TYPE: nucleic acid STRANDEDNESS: double TOPOLOGY: linear (ii) MOLECULE TYPE: DNA (genomic)
(ix) FEATURE: NAME/KEY: CDS
LOCATION: 1..660 OTHER INFORMATION: /standard_name= (xi) SEQUENCE DESCRIPTION: SEQ ID NO:35:
ATG GTC CAG AAG ACC AGC ATG TCC CGG GGC CCT TAC CCA CCC TCC CAG 48
Met Val Gln Lys Thr Ser Met Ser Arg Gly Pro Tyr Pro Pro Ser Gln 1 5 10
GAG ATC CCC ATG GAG GTC TTC GAC CCC AGC CCG CAG GGC AAA TAC AGC 96
Glu Ile Pro Met Glu Val Phe Asp Pro Ser Pro Gln Gly Lys Tyr Ser 30
AAG AGG AAA GGG CGA TTC AAA CGG TCA GAT GGG AGC ACG TCC TCG GAT 144
Lys Arg Lys Gly Arg Phe Lys Arg Ser Asp Gly Ser Thr Ser Ser Asp 40 45
ACC ACA TCC AAC AGC TTT GTC CGC CAG GGC TCA GCG GAG TCC TAC ACC 192
Thr Thr Ser Asn Ser Phe Val Arg Gln Gly Ser Ala Glu Ser Tyr Thr 55
AGC CGT CCA TCA GAC TCT GAT GTA TCT CTG GAG GAG GAC CGG GAA GCC 240
Ser Arg Pro Ser Asp Ser Asp Val Ser Leu Glu Glu Asp Arg Glu Ala 70 75
TTA AGG AAG GAA GCA GAG CGC CAG GCA TTA GCG CAG CTC GAG AAG GCC 288
Leu Arg Lys Glu Ala Glu Arg Gln Ala Leu Ala Gln Leu Glu Lys Ala 90
AAG ACC AAG CCA GTG GCA TTT GCT GTG CGG ACA AAT GTT GGC TAC AAT 336
Lys Thr Lys Pro Val Ala Phe Ala Val Arg Thr Asn Val Gly Tyr Asn 100 105 110
CCG TCT CCA GGG GAT GAG GTG CCT GTG CAG GGA GTG GCC ATC ACC TTC 384
Pro Ser Pro Gly Asp Glu Val Pro Val Gln Gly Val Ala Ile Thr Phe 115 120 125
GAG CCC AAA GAC TTC CTG CAC ATC AAG GAG AAA TAC AAT ART GAC TOG 432
0 u Pro Lys Asp Phe Leu His Ile Lys Olu Lys Tyr Asn Asn Asp Trp 130 135 140
TOG ATC 000 COG CTO OTO AAG GAO GOC TOT GAG OTT GOC TTC ATT CCC 480
Trp, Ile Gly Arg Leu Val Lys Glu Gly Cys Olu Val Gly Phe Ile Pro 145 150 155 160
ACC CCC CrC AAA CTC GAC ACC CTT CCC CTG CTC CAC GAA CAG AAC CTG
Ser Pro Val Lys Leu Asp Ser Leu Arg Leu Leu Gin 175lnLs e 165 17017
CCC CAC AAC CGC CTC GCC TCC AGC AAA TCA GGC GAT AAC TCC AGT
TCC
Arg Gin Asn Arg Leu Gly Ser Ser Lys Ser Gly Asp Asn Ser Ser Ser 180 185 190 ACT CTG CCA GAT GTC GTC ACT GGC ACC CCC CCC CCC ACA CCC CCT
CC
Ser Leu Gly Asp Val Val Thr Cly Thr Arg Arg Pro Thr Pro Pro Ala 195 200 205 ACT GCT TAC AGA CAT GAT CCA CAA ACC TTT ATT TGACTTCTT-G
AAGCATCGCT
Ser Cly Tyr Axg His Asp Ala Clu Ser Phe Ile 210 215 220 TTCATCCCAC GATCTCCATC ACTCCTCTCA CCGCAGATAT
TTCCCTGCC
TTCTCACA CCCCAGCAAA CACATCATCA TTCACCTC CAACACACGC1 CTCACGTGCA GACTGAAATC GAGCGAATCT TCCACCTCCC CCGCACCcTr CTCTCCATCC TCACACCATC AATCACCCAC CCCAGCTGTC
CAACACCTCC
TCATTCTTTA CATCAAGATC ACCTCTCCCA AGGTACTTCA
AACGCTCATC
GAAACTCTCA CTCCAAA AC CTCAATCTCC AAATAGCCCC
CTCCCAAAAC
CCCCCCCTCA AATGTTTGAC ATCATCCTCC ATGAGAACCA
ATTGCACCAT
ATCTGCCCA GTACTTGGAA CCCTATTCCA ACCCCACACA
CCCCCCAC
CCAATCCGCT GCTCAACCCC ACCATCCCTA CCCCACCCT
GCCTCCCAC
TCTCCAACCT CCACGTACAC CTCCTCACCT CCCTCACGAG
AAACCTCGC
CCCTGCACTC CTCACAGCCG GCCACTGTGC TCCCCCAGCA
CCACGAACT
CCCCCCCTC CCCCTCTTCC CTCCTCCTCT CCCCTCCCAA
CTCCAGTGCA
INFORMATION FOR SEQ ID NO:36: (i) SEQUENCE CHARACTERISTICS:
LENGTH: 6725 base pairs TYPE: nucleic acid STRANDEDNESS: double TOPOLOGY: linear (ii) MOLECULE TYPE: DNA (genomic)
AAGCGCTCAC
TCCACCCTGC
CACTTCCTCG
CTCCCCCCCA
AAGTCCCGAG
CTCGCACAGT
GCCTCAC
AGCACGcCAC
CCTCCCCCTC
TTCTGCCCG
GCCATCTACT
GAAC
528 576 624 677 737 797 857 917 977 1037 1097 1157 1217 1277 1337 1393 (ix) FEATURE: NAME/KEY:
CDS LOCATION: 226..6642 OTHER INFORMATION: /standard_name= "Alpha-1C-2"
(xi) SEQUENCE DESCRIPTION: SEQ ID NO:36: CTCGAGGAGG CAGTAGTGGA AAGGAGCAGT TTTTGGGGTT TGATGCCATA
ATGGGAATCA
GGTAATCGTC GGCGGGGAAG AAGAAACGCT GCAGACCACG GCTTCCTCGA
ATCTTGCGCG
AAAGCCGCCG GCCTCGGAGG AGGGATTAAT CCAGACCCGC CGGGGGGTGT
TTTCACATTT
CTTCCTCTTC GTGGCTGCTC CTCCTATTAJA AACCATTTTT GGTCC ATG GTC AAT Met Val Asn 1 GAG AAT ACG AGG ATG TAC ATT CCA GAG GAA AAC CAC CAA GGT TCC AAC Giu Asn Thr Arg Met Tyr Ile Pro Giu Giu Asn His Gin Giy Ser Asn 10 i5 TAT GGG AGC CCA CGC CCC GCC CAT GCC AAC ATG AAT GCC AAT GCG GCA T~yr Giy Ser Pro Arg Pro Ala His Ala Asn Met Asn Aia Asn Ala Ala 25 30 35 GCG GGG CTG GCC CCT GAG CAC ATC CCC ACC CCG GGG GCT GCC CTG TCG Ala Giy Leu Ala Pro Glu His Ile Pro Thr Pro Gly Ala Aia Leu Ser 40 45 50 TGG CAG GCG GCC ATC GAC GCA GCC CGG CAG GCT AAG CTG ATG GGC AGC Trp Gin Ala Ala Ile Asp Ala Ala Arg Gin Ala L~ys Leu Met Gly Ser 55 60 65 GCT GGC AAT GCG ACC ATC TCC ACA GTC AGC TCC ACG CAG CGG AAG CGG Aia Giy Asn Aia Thr Ile Ser Thr Vai Ser Ser Thr Gin Arg Lys Arg 75 80 CAG CAA TAT GGG AAA CCC AAG AAG CAG GGC AGC ACC ACG GCC ACA CGC Gin Gin Tyr Gly Lys Pro Lys Lys Gin Gly Ser Thr Thr Ala Thr Arg 85 90 95 CCG CCC CGA GCC CTG CTC TGC CTG ACC CTG AAG AAC CCC ATC CGG AGG Pro Pro Arg Ala Leu Leu Cys Leu Thr Leu Lys Asn Pro Ile Arg Arg 100 105 110 115 GCC TGC ATC AGC ATT GTC GAA TGG AAA CCA TTT GAA ATA ATT ATT TTA Ala Cys Ile Ser Ile Val Giu Trp Lys Pro Phe Giu Ilp Ile T1 Te 20125 130 CTG ACT ATT TTT GCC AAT TGT GTG GCC TTA GCG ATC TAT ATT CCC TTT Leu Thr Ile Phe Ala Asn Cys Val Ala Leu Ala Ile Tyr Ile Pro Phe 135 140 145 CCA GAA GAT GAT TCC AAC GCC ACC AAT TCC AAC CTG GAA CGA GTG GAA Pro Giu Asp Asp Ser Asn Ala Thr Asn Ser Asn Leu Giu Arg Val Giu 150 155 160 TAT CTC TTT CTC ATA ATT TTT ACG GTG GAA GCG TTT TTA AAA GTA ATC Tyr Leu Phe Leu Ile Ile Phe Thr Val Giu Ala Phe Leu Lys Vai Ile 165 170 175 120 180 234 282 330 378 426 474 5 22 570 666 714 762 -261- CCC TAT CCA CTC OTO TTT CAC Ala Tyr Gly Leu Leu Phe His 180 185 COO AAT GCC TAO CTO OGO AAO GCC Pro Asn Ala Tyr Leu Arg Asn Gly 190 AAO CTA CTA Asn Leu Leu TTA CAA cAA Leu Clu Gln CCC GCC GCA Gly Ala Gly 230 GAT TTT ATA ATT GTG CTT GTG Asp Phe Ile Ile Val Val Val 200 205 GOA ACC AAA GCA GAT GGG GCA Ala Thr Lys Ala Asp Cly Ala 215 220 GGG OTT 
TTT AGT Gly Leu Phe Ser GOA ATT Ala Ilie 810 858 AAO GOT OTO Asn Ala Leu GGA GGG AAA Cly Cly Lys 225 GTGCT TCOGO Val Leu Arg TTT GAT GTG AAG Phe Asp Val Lys OTG ACG CO TTO Leu Arg Ala Phe
CO
Arg 000 OTO Pro Leu 245 OGG OTG GTG TOO Arg Leu Val Ser
GGA
Ci y 250 GC OCA AGT OTO Val Pro Ser Leu GTG GC OTG AAT Val Val Leu Asn ATO ATO AAG GOO Ile Ile Lys Ala
ATC
Met 265 GC 000 OT COTC Val Pro Leu Leu
S
S.
S.
S S
S
S S
CAC
His ATO CO OTGCOTT Ile Ala Leu Leu
CTG
Val CTG TTT GTO ATO Leu Phe Val Ile ATO TAO CCC ATO Ile Tyr Al1a Ile CCC TTC GAG OTO Cly Leu Ciu Leu TTOC ATG Phe Met CCC AAC ATC Cly Lys Met OCA GCA CAA Pro Ala Clu 310
CAC
His 295 AAG ACO TGC TAO Lys Thr Cys Tyr
AAO
Asn 300 CAC GAG CCC ATA Gin Giu Gly Ile CA CAT CTT Ala Asp Val 305 CCC CAC CCC Cly His Cly CAT GAO COT TOO Asp Asp Pro Ser
OOT
Pro 315 TGT CG OTG CAA Cys Ala Leu Giu
AOG
Thr 1002 1050 1098 1146 1194 1242 1290 1338 1386 1434 CC CAG Arg Gin 325 TGO CAC AAO GC Cys Gin Asn Gly CTG TC AAG 000 Val Cys Lys Pro TCC CAT CCT COO Trp Asp Cly Pro CAC GGO ATO ACC His Gly Ile Thr TTT GAO AAO TTT Phe Asp Asn Phe TTO CO ATG OTO Phe Ala Met Leu CTG TTO CAC TC Val Phe Gin Cys
ATO
Ile 360 ACC ATG GAG CCC TCC AOG GAO CTG OTG Thr Met Clu Cly Trp, Thr Asp Val Leu TAO TG Tyr Trp GTO AAT CAT Val Asn Asp GTA CGA AGC GAO Val Gly Arg Asp
TGG
Tr- 380 000 TCG ATO TAT Pro Trp Ile Tyr TTT GTT ACA Phe Val Thr OTA ATO ATO ATA CCC TOA TTT Leu Ile Ile Ile Gly Ser Phe 390 TTT GTA OTT AAO TTG GTT OTO CCT CTG Phe Val Leu Asn Leu Val Leu Cly Val 395 400 -262- CTT AGC GGA GAG TTT TCC AAA GAG AGG GAG Leu Ser Gly Giu Phe Ser Lys Giu Arg Giu 405 410 AAG Gcc LYS Ala AAG GCC CGG GGA LYS Ala Arg Gly GAT TTC Asp Phe 420 CAG AAG CTG CGG GAG Gin Lys Leu Arg Glu 425 AAG CAG CAG Lys Gin Gln
CTA
Leu GAA GAG GAT CTC Glu Giu Asp Leu GGC TAC CTG GAT TGG Gly Tyr Leu Asp Trp 440 GAG GAC GAA GGC ATG Giu Asp Giu Gly Met 455 ATC ACT CAG GCC GAA GAC ATC GAT CCT GAG AAT Ile Thr Gin Ala Glu Asp Ile Asp Pro Giu Asn 445 450 GAT GAG GAG AAG CCC CGA AAC ATG AGC ATG CCC Asp Giu Giu Lys Pro Arg Asn Met Ser Met Pro 460 465 ACC AGT GAG Thr Ser Giu 470 ACC GAG TCC Thr Giu Ser GTC A.AC Val Asn ACC GAA AAC GTG Thr Giu Asn Val
GCT
Al a GGA GGT GAC ATC GAG Ile Glu 485 GGA GAA AAC TGC Gly Giu Asn Cys
GGG
Gly GCC AGG CTG GCC Ala Arg Leu Aia CGG ATC TCC AAG Arg Ile Ser Lys
TCA
Ser 500 AAG TTC AGC CGC Lys Phe Ser Arg TGG CGC CGG TGG Trp Arg Arg Trp CGG TTC TGC AGA Arg Phe Cys Arg
AGG
Arg 1482, 1530 1578 1626 1G74 1722 1770 1818 1866 1914 1 9r7 2 010 2058 2106 AAG TGC CGC GCC Lys Cys Arg Ala
GCA
Al a 520 GTC AAG TCT AAT Vai Lys Ser Asn
GTC
Val TTC TAC TGG CTG Phe Tyr Trp Leu GTG ATT Val Ile TTC CTG GTG Phe Leu Val CAG CCC AAC Gin Pro Asn 550 CTC AAc ACG CTC Leu Asn Thr Leu ATT GCC TCT GAG Ile Ala Ser Giu CAC TAC AAC His Tyr Asn 545 AAG GCC CTG TGG CTC ACA GAA Trp Leu Thr Giu
GTC
Val CAA GAC ACG GCA Gin Asp Thr Ala CTG GCC Leu Ala 565 CTG TTC ACG GCA Leu Phe Thr Ala ATG CTC CTG AAG e- Leu L=u j
ATG
met TAC AGC CTG GGC Tyr Ser Leu Gly CAG GCC TAC TTC Gin Ala Tyr Phe
GTG
Val1 585 TCC CTC TTC AAC Ser Leu Phe Asn TTT GAC TGC TTC Phe Asp Cys Phe GTG TGT GGC GGC Val Cys Gly Gly CTG GAG ACC ATC Leu Giu Thr Ile GTG GAG ACC AAG ATC ATG Val Giu Thr Lys Ile Met TCC CCA CTG Ser Pro Leu
GGC
Gly 615 ATC TCC GTG CTC Ile Ser Val Leu
AGA
Axg G20 TGC GTC CGG CTG Cys Val Arg Leu CTG AGG ATT Leu Arg Ile 625 -263- TTC A-AG ATO A-CC AGG TAO TGG AAO TCO TTG Phe Lys Ile Thr Arg Tyr Trp A-sn Ser Leu 630 635 AGO AAc OTG GTG GCA TOO Se -nLeu VJal Ala Ser TTG CTG Leu Leu 645 A-AC TCT GTG
CGO
A-sn Ser Vai A-rg
TOO
Ser A-TO GCC TCO CTG Ile Ala Ser Leu
OTO
Leu OTT OTO CTC TTO Leu Leu Leu Phe
CTO
Leu 660 TTC ATC A-TO
ATO
Phe Ile Ile Ile
TTC
Phe 665 TOO CTC CTG GGG Ser Leu Leu Giy OAG OTO TTT GGA Gin Leu Phe Gly
GGA
Giy A-AG TTC A-AC TTT Lys Phe Asn Phe GAG ATG CAG ACC Giu Met Gin Thr
CGG
Arg A-GG AGO ACA TTC Arg Ser Thr Phe GAT
AAO
AspD A-sn TTC COO CAG Phe Pro Gin CTC OTO ACT GTG Leu Leu Thr Val CAG A-TO OTG ACC Gin Ile Leu Thr TGG
A-AT
Trm A-sn TTT
CZA
Phe Pro 725 GGG GAG GAO Gly Giu Asp 705 GGO 000 TOT
TOG
Ser 710 GTG ATG TAT GAT GGG Val Met Tyr Asp Gly 715 A-TO ATG GOT TAT Ile Met Ala Tyr 55.
S. GGG ATG
TTA
Gly Met Leu GTO TGT ATT Val Cys Ile TAO TTC ATO Tyr Phe Ile
A-TO
Ile OTO TTO ATO TGT Leu Phe Ile Cys 2154 2202 2250 2298 2346 2394 2442 2490 2538 2586 2634' 2682 2730 2778 A-AC TAT A-TO OTA A-sn Tyr Ile Leu A-AT GTG TTC TTG A-sn Val Phe Leu ATT GOT GTG GAO Ile Ala Val Asp CTG GOT GAT GOT Leu Ala Asp Ala
GAG
Glu 760 A-GO OTO A-CA TOT Ser Leu Thin Ser CAA AAG GAG GAG Gin Lys Glu Giu GAA GAG Giu Giu S GAG A-AG Giu Lys CA-A GAG Gin Giu A-TT GAG Ile Giu 805
GAG
Glu
A-GA
Arg 775 A-AG A-AG CTG GCC Lys Lys Leu A-la
A-GG
Aing ACT GOC AGO OCA Thr A-la Ser Pro GAG A-AG AAA Glu Lys Lys 785 GAG GAG A-AG Giu Giu Lys TTG GTG GAG A-AG Leu Vai Glu Lys 790 CTG A-AA TOO A-TO Leu Lys Ser Ile COG GCA Pro A-la '70C GTG GGG GA-A TOO Val Giy Giu Ser
AAG
Lys GOT GAC GGA GAG Ala Asp Gly Giu OCA CCC GOC A-CO
A-AG
Lys 820 A-TO A-AC ATG GAT Ile A-sn Met Asp GAO OTO CAG CCC A-AT GA-A A-AT GAG GAT A-AG AGO Asp Leu Gin Pro A-sn Giu A-sn Giu Asp Lys Ser 825
D)
CCC TA-C CCC A-AC Pro Tyr Pro A-sn OCA GAA A-CT A-CA GGA Pro Giu Thr Thr Gly 840
GAA
Giu 845 GA-G GAT GAG GAG Giu Asp Giu Glu GAG OCA Giu Pro 850 -2 64- GAG ATG
CCT
G].u Met Pro AAG GAA AAG LYS Glu Lys 870 GTC CCC
CCT
Val Gly pro 855 GCA GTG CCC Ala Val Pro CGC CCA CGA CCA CTC TCT GAG CTT CAC OTT Arg Pro Arg Pro Leu Ser Giu Leu His Leu 860 865 ATG CCA CAA CCC AGC C TTT TTC ATC TTC Met Pro Giu Ala Ser Ala Phe Phe h AGO TCT Ser Ser 885 AAC AAC AGC Asn Asn Ara TTT CGC CTC CAG TC CAC Phe Arg Leu Cln Cys His
CC
Arg ATT GTC AAT CAC Ile Vai Asn Asp
ACG
Thr 900 ATC TTC ACC AAC Ile Phe Thr Asn CTG ATC Leu Ile CTO TTC TTO
ATT
Leu Phe Phe Ile CTG CTC AGC AGC
ATT
TCC CTG GCT GCT Ser Leu Ala Ala
GAG
Giu CAC CCG GTC
CAG
Asp Pro Val Gin CAC ACC TCC TTC AG His Thr Ser Phe Arg AAO CAT ATT CTC
TTT
Ile Leu Phe *4*e 0 *0 *0 a a a
TAT
Tyr ATT CCT CTC
AAC
Ile Ala Leu Lys 950 TTC TCC CCC
AAC
Phe Cys Arg Asn TTT CAT ATT CTT
TTT
Phe Asp Ile Val Phe 940 ATC ACT CCT TAT
CCC
Met Thr Ala Tyr Cly ACC ACC ATT
TTC
Thr Thr Ile Phe ACC ATT GAA Thr Ile Clu 945 AAC CCT TCT CCT TTC TTC Ala Pbe Leu
CAC
2826 2874 2922 2970 3018 3066 3114 3162 3210 3258 3306 3354 3402 3450 TAC TTC AAC Tyr Phe Asn ATC CTC AC CTG Ile Leu Asp Leu CTG GTC AGO
GTG
0.* a a a. a *a CTC ATC
TCC
Leu Ile Ser TTT GC Phe Cly ATC TTC CCA CTC CTG
CCA
Ile Leu Arg Val Leu Arg 1000 ATC CAC TCC ACT
CCA
Ile Gin Ser Ser Ala 990 GTA CTC AGC CCC CTC Val Leu Arg Pro Leu ATC AAT CTC CTG AAG Ile Asn Val Val Lys 995 ACC CCC ATC AAC AGC Arg Ala Ile Asn Arg CCC AAC CCC CTA AAG CAT
GTG
Ala Lys Cly Leu LYS His Val 1 i ~i CTT CAG TCT CTC
TTT
Val Gin Cys Vai Phe ACC ATC Thr Ile CTC CCC ATC CCC Val Ala le Arcr 1025 CAC TTC ATG TTT CCC AAC ATC CTC ATT GTC
ACC
Cly Asn Ile Vai Ile Val Thr 1030 1035 ACC CTC CTC Thr Leu Leu CCC TGC ATC CCC
CTC
Ala Cys Ile Cly Val 1045 GAC ACT TCC AAC
CAC
ASP Ser Ser Lys Gli 1060 CAC CTC TTC AAC Gin Leu Phe Lys CGA AAG CTG TAO ACC TCT
TCA
Cly Lys Leu Tyr Thr Cys Ser ACA GAC CC CAA TCC AAC CCC AAC
TAC
Thr Ciu Ala Giu Cys Lys Gly Asn Tyr 1065 1070 ATC ACG Ile Thr 1075 -265- TAO AAA GAC GGG GAG GTT Tyr Lys Asp Gly Glu Val 1080 GAC CAC 000 ATC ATO Asp His Pro Ile Ile 1085 TTT GAO AAT GTT OTG Phe Asp Asn Val Leu GAG AAC AGC Giu Asn Ser AAG TTT GAO Lys Phe Asp 1095 CAA 000 CGC AGO TGG Gn Pro Arg Ser Trp 1090 GCA GCC ATG ATG GCO Aa Aa Met Met Ala 1105 GAG CTG CTG TAO OGO GlU Leu Leu Tyr Arg 1100 CTC TTO ACC GTC TOO ACC TTC GAA GGG TGG OCA Leu Phe Thr Vai Ser Thr Phe Giu Gly Trp, Pro 1110 1115 TOO ATO GAO Ser Ile Asp 1125 GTG GAG ATO Val Giu Ile 1140 TOO CAC AOG Ser His Thr GAA GAO AAG GGO Giu Asp Lys Gly 000 ATO TAO AAO TAO OGT Pro Ile Tyr Asn Tyr Arg TOO ATO TTO TTO ATO ATO TAO ATO ATO Ser Ile Phe Phe Ile Ile Tyr Ile Ile .L -L 'i 1150 TTO ATG ATG Phe Met Met ATO ATO GOO TTO Ile Ile Ala Phe 1155 ACC TTT OAG GAG Thr Phe Gin Giu 6 9S.
a. S. *g 5
S
S. 5 0
S
S
S. 56 S. AAO ATO TTO GTG GGO TTO GTO ATO GTO A-on Ile Phe Vai Gly Phe Vai Ile Vai 1160 1165 OAG GGG GAG OAG GAG TAO Gin Giy Giu Gin Giu Tyr 1175 OAG TGO GTG GAA TAO CO Gin Oys Vai Giu Tyr Ala 1190 000 AAG AAO OAG CAC OAG Pro Lys Asn Gin His Gin 1205 TAO TTO GAG TAO OTG ATG Tyr Phe Giu Tyr Leu Met 1220 1225 AAG AAO TGT GAG Lys Asn Cys Giu OTG GAO AAG AAO CAG OGA Leu Asp Lys Asn Gin Arg 3498 3546 3594 3642 369C 373S 3786 3834 3 882 3930 3978 4026 4074 4122 OTO AAG GOC OGG 000 OTG OGG AGG TAO ATO Leu Lys Aia A-rg Pro Leu Arg Arg Tyr Ile 1195 120n TAO AAA GTG TGG Tyr Lys Vai Trp 1210 ~TO GTO OTO ATC Phe Val Leu Ile TAO GTG GTO AAO TOO ACC Tyr Val Val Asn Ser Thr TG TO AAO Leu Leu Asn OTG GOC ATG CAd CAC TAO GGO CAd AGO TGO OTG TTO AAA Leu Ala Met Gin His Tyr Gly Gin Ser Oys Leu Phe Lys 1240 1245 ACC ATO TGO Thr Ile Cys 1235 ATO GOC ATG Ile Ala Met AAO ATO OTO AAO ATG OTO TTO ACT GGO OTO TTT Asn Ile Leu Asn Met Leu Phe Thr Giy Leu Phe 1255 1260 OTG AAG OTO ATT GOC TTO Leu Lys Leu Ile Ala Phe 1270 AAT ACA TTT GAO GOC TTG Asn Thr Phe Asp Ala Leu 1285 ACC TG GG1250AT ACO GTG GAG Aet ATe 1265 TTO TGT AT GA TGG Phe CYs Asp Ala Trp, AAA 000 AAG CAC TAT Lys Pro Lys His Tyr ATT GTT GTG GGT Ile Val Val Gly 1290 AGO ATT GTT GAT ATA GCA Ser Ile Val Asp Ile Ala 1295 -266- ATC ACC GAG GTA AAC CCA GCT GAA CAT
ACCCATCTTCCTCAG
130e 1h30uVa5sn Po Gu His Thr Giln, Cys Ser Pro Ser Met u AAC GCA GAG GAA AAC TCC Asn Aia Giu Giu Asn Ser 1320 CGG GTC ATG CGT CTG
GTG
Arg Val Met Arg Leu Vai 1315 CGC ATC TCC ATC
ACC
Ar Ile Ser Ile Thr 1325 AAG CTG CTG ACC
CGT
Lys Leu Leu Ser Arg TTC TTC CGC CTG TTC Phe Phe Arg Leu Phe 1330 GGG GAG GGC ATC
CG
ACC CTG CTG TOG ACC TTC ATC AAC TCC Thr Leu Leu Trp Thr Phe Ile Lys Ser 1350 1355 4170 4218 4266 4314 4362 CCC CTC CTG A1TC Ala Leu Leu le 1365 TTC CAC CC CTC CCC TAT
GTG
Phe Gin Ala Leu Pro Tyr Vai 1360 ATC TAC GCG GTG ATC GOC
ATG
GTG ATG CTG TTC
TTC
Vai Met Leu Phe Phe 13370 CAG GTG TTT GGC AAA ATT GCC CTG AAT GAT ACC ACA GAG ATC
AAC
Gin Val Phe Gly Lys Ile Ala Leu Asn Asp *Thr Thr Giu Ile Asn 1301385 1390
COG
Ara 139. AAC AAC AAC TTT CAG ACC TTC CCC CAG GCC GTG CTG CTC CTC TTC
AGG
A-sn Asn Asn Phe Gin Thr Phe Pro Gin Ala Vai Leu LeuLePhAr 1.400 1405 1410heAr TOT GCC ACC GGG GAG GCC TGG CAG GAC ATC ATG CTC GCC TCC ATG
CCA
Cys Ala Thr Giy Glu Ala Trp Gin Asp Ile Met Leu Ala Cys Met Pro 1415 1420 GGC AAC Giy Lys AAG TOT CC Lys Cys Ala 1430 CCA GAG TCC GAG CCC Pro Ciu Ser Ciu Pro ACC AAC ACC ACC GAG
GCT
GAA ACA
CCC
GiU Thr Pro 1445 TAC ATG CTC Tyr Met Leu -1460 ATC GAC AAC Met Asp Asn TGT GGT AGC AGC
TTT
Cys Giy Ser Ser Phe 1450 TOT GCC TTC CTG
ATC
Cys Ala Phi Te-l GCT GTC TTC TAC TTC ATC AGC
TTC
Ala Val Phe Tyr Phe Ile Ser Phe 1455 ATC AAC CTC TTT GTA CT cT
T
ie Asn Leu Phe Val Ala Val le 4458 4506 4554 4602 4650 4698 4746 4794
TTT
Phe GAC TAC CTG ACA ACG GAC TGC TCC ATC Asp Tyr Leu Thr Ara Asp Trp Ser Ile 1480 1485 CTT GOT CCC CAC CAC His His CTG CAT GAG TTT AAA Leu Asp Giu Phe Lys AGA ATC TGG
GCA
Ar Ile Trp Ala GCC AAG GGT CGT ATC AAA CAC CTG CAT GTG
GTG
Ala Lys Giy Arg Ile Lys His Leu Asp Val Val 1510 Iis GAG TAT GAC CCT
GAA
G1u Tyr Asp pro Clu 1505 ACC CTC CTC CGC
CCC
Thr Leu Leu Arg Arg 1520 -267- ATT CAC CCG CCA CTA CGT TTT GOC AAG Ile Gin Pro Pro Leu Gly Phe Gly Lys 1525 1530 CTG TGC CCT CAC CGC GTG OCT Leu Cys Pro His Arg Val Ala TGC AAA CGC Cys Lys Arg 1540 CTG GTC TCC ATC AAC ATC Leu Val Ser Met Asn Met 1545 CCT CTG AAC AGC CAC Pro Leu Asri Ser Asp GGG ACA Cly Thr GTC ATG TTC AAT GCC ACC CTC TTT GCC Val Met Phe Asn Ala Thr Leu Phe Ala 1560 CTG GTC AGG Leu Val Arg ATC AAA ACA GAA COG AAC lie Lys Thr Clii Oly Asn ATC ATC AAG AAG ATC TGG Ilie Ile Lys Lys Ile Trp, 1590 ACG GCC CTO AGO Thr Ala Leu Arg 1570 GAG CTG CGG GCC Glu Leu Ar; Al CTA GAA CAA GCC AAT GAG Leu Giu Gin Ala Asn Glu 1580 AAG CGG ACC AGC ATC AAG CTG CTG GAC C-AG Lys Ar; Thr Ser Met Lys Leu Leu Asp Gin 1595 1600 GTO' GTG CCC cm' Val Val Pro Prc 1605 GCC ACC TTC CTG Ala Thr Phe Leu 1620 GAG CAG CCC CTT G1'a Gin Gly Leu CAG OCT 0CC TTC Gin Ala Cly Leu 165.
CC ATC TCT GCA Ala Ile Ser Gly 1670 AAG GAG CCT GTG Lys Giu Ala Val 1685; G CA CGT CAT Ala Cly Asp 161 ATC CAG GAG Ile Gin Clu 1625 CTG CCC AAC Val Cly Lys 1640 CCC ACA CTC Ar; Thr Leu 5 CAT GAG GTC Asp Ciu Val 0 TAC TTC CG Tyr Phe Ar; CCC TCC CAC Pro Ser Gin ACC GTT CCC AAG TTC TAC Thr Val Gly Lys Phe Tyr AAG TTC AAG Lys Phe Lys 1:630 ACC AAC GCO Ar; Asn Ala AAO CC AAA1 Lys Ar; Lys 1635 CTC TCT CTG Leu Ser Leu 4842 4890 4938 4986 5034 5 0 82 5130 5226 52 "14 53220 5418 5466 CAT GAC ATC GCC His Asp Ile Cly 1660 CAT CTC ACC Asp Leu Thr OCT CAC GAO Ala Ciu Ciu CCT GAG ATC COA CG Pro Glu Ile Arg Ar; 1665 CTC GAC AAO CCC ATG Leu Asp Lys Aia Met 1680 ATC TTC AGO AOG 0CC Ile Phe Ar; Ar; Ala- TCC GCT OCT TCT CAA CAT CAC Ser Ala Ala Ser Ciu Asp Asp 1690 GGT GOC ClV Gly 1700 CTG TTC GOC AAC CAC GTC AGC TAC TAC CAA AOC GAC GC Leu Phe Gly Asn His Val Ser Tyr Tyr Gin Ser Asp Gly 1705 1710
CGG
Arc AGC CCC TTC CCC Ser A-la Phe Pro CAC ACC TTC Gin Thr Phe 1720 AGC CAC GCC Ser Cmn Gly ACC ACT CAC CC CCC Thr Thr Gin Ar; Pro 1725 CAC ACT GAG TCO CCA Asp Thr Giu Ser Pro 1740 AAO CCG OGC Lys Ala Cly CTG CAC ATC AAC Leu His Ile Asn 1730 TCC CAC GAG AAG Ser His Clu Lvs 1745
AGC
Ser 1735 -268- CTG GTG GAO TCC ACC TTO ACC COG AGO AGC TAO TOG TOO ACC GGC TOO 5514 Leu Val Asp Ser Thr Phe Thr Pro Ser Ser Tyr Ser Ser Thr Gly Ser 1750 1755 1760 AAO GOC AAO ATO AAC AAO GOC AAO AAC ACC GOC OTG GGT OGO OTO COT 5562 Asn Ala Asn Ile Asn Asn Ala Asn Asn Thr Ala Leu Gly Arg Leu Pro 1765 1770 1775 CGC CCC GOC GGO TAO 000 AGO ACG GTO AGO ACT GTG GAG GGO CAC GGG 5610 Arg Pro Ala Gly Tyr Pro Ser Thr Val Ser Thr Val Glu Gly His Gly 1780 1785 1790 17;5 000 000 TTG TOO COT GOC ATO OGG GTG CAG GAG GTG GOG TGG AAG OTO 5658 Pro Pro Leu Ser Pro Ala Ile Arg Val Gin Giu Val Ala Trp Lys Leu *1800 1805 1810 *AGO TOO AAO AGG TGO CAC TOO OGG GAG AGO CAG GOA GOC ATG GOG GOT 5706 Ser Ser Asn Arg Cys His Ser Arg Glu Ser Gin Ala Ala Met Ala Gly .181518012 *CAG GAG GAG ACG TOT CAG GAT GAG ACC TAT GAA GTG AAG ATG AAO OAT 5754 .Gin Glu Glu Thr Ser Gin Asp Giu Thr Tyr Glu Val Lys Met Asn His 1830 1835 1840 *.*GAO ACG GAG GOC TGC AGT GAG CCC AGO CTG OTO TOO ACA GAG ATG OTO 5802 *Asp Thr Glu Ala Cys Ser Glu Pro Ser Leu Leu Ser Thr Glu Met Leu .**1845 1850 1855 TOO TAO CAG GAT GAO GAA AAT CGG CAA CTG AOG OTO OCA GAG GAG GAO 5850 *Ser Tyr Gin Asp Asp Glu Asn Arg Gin Leu Thr Leu Pro Glu Glu Asp 1860 1865 1870 17 A-AG AGG GAO ATO CGG CAA TOT COG AAG AGG GGT TTO OTO OGO TOT GOC 5898 Lys Arg Asp Ile Arg Gin Ser Pro Lys Arg Gly Phe Leu Arg Ser Ala 1880 1885 1890 TOA OTA GGT OGA AGG GOC TOO TTO CAC CTG GAA TGT CTG AAG OGA CAG 5946 Ser Leu Gly Arg Arg Ala Ser Phe His Leu Giu Cys Leu Lys Arg Gin 1895 1900 1905 AAG GAO OGA GGG GGA GAO ATO TOT CAG AAG ACA CTO OTG 000 TTG OAT 5994 Lys Asp Ar-g Gly Gly Asp Ile Ser Gin Lys Thr 17n Ie Pr Let:u his 1910 1-915 1920 OTG GTT OAT CAT CAG GOA TTG GCA GTG GOA GO CTG AGO COO OTO OTO 6042 Leu Val His His Gin Ala Leu Ala Val Ala Gly Leu Ser Pro Leu Letu 1925 1930 1935 CAG AGA AGO OAT TOO COT GOC TCA TTO COT AGG COT TTT GOC ACC OCA 6090 Gin Arg Ser His Ser Pro Ala Ser Phe Pro Arg Pro Phe Ala Thr Pro 1940 1945 1950 1955 OCA CO ACA OCT GO AGO OGA GGC TGG 
COO OCA CAG COO GTO COO ACC 6138 Pro Ala Thr Pro Gly Ser Arg Gly Trp, Pro Pro Gln Pro Val Pro Thr 1960 1965 1970 -269- CTG CGG CTT GAG GGG Leu Arg Leu Giu Gly 1975 CCA TCC ATC CAC TGC Pro Ser Ile His Cys 1990 GTC GAG TCC AGT GAG Val Giu Ser Ser Giu 1980 AAA CTC AAC AGC AGC TTC Lys Leu Asn Ser Ser Phe GGC TCC TGG GCT GAG ACC Gly Ser Trp Ala Giu Thr 1995 0.0.
to. too,* 0.0* GGC AGC AGC GCC Gly Ser Ser Aia 2005 AGC CAG GCT GGC Ser Gin Ala Giy 2020 CTG GTG GAA GCG Leu Val Giu Ala GAT CCC AAG TTC A-sp Pro Lys Phe 205.
GAC ATG ACC ATA Asp Met Thr Ile 2070 GGG GGC GCC CCA Giy Giy Ala Pro 2085 ACC CCC GGT GGC GGG Thr Pro Gly Gly Gly 2000 TCC CTC ATG GTG CCC Se r Leu Met Val Pro GCC CGG Al a Arg AGA GTC Arg Val CGG CCC GTC Arg Pro Val GCC CCA GGG AGG Ala Pro Gly Arg 2025 GTC TTG ATT TCA Val Leu Ile Ser 2040 ATC GAG GTC ACC Ile Glu Val Thr 6234 6282 6233 0 CAG TTC CAC GGC AGT GCC Gin Phe His Gly Ser Ala 2030 AGC AGC Ser Ser 2035 GCT CAA Ala Gin GAA GGA CTG Glu Gly Leu 2045 ACC CAG GAG Thr Gin Giu GGG CAG TTT Gly Gin Phe CTG GCC GAC GCC TGC Leu Ala Asp Ala Cys 2065 GAC AAC ATC CTC AGC Asp Asn Ile Leu Ser GAG GAG ATG Giu Glu Met GAG AGC GCG GCC Giu Ser Ala Ala 6; 74 CAG AGC CCC AAT GGC GCC CTC Gin Ser Pro Asn Giy Ala Leu 2090 TTA CCC TTT GTG AAC Leu Pro Phe Val Asn TGC AGG Cys Axrg 2100 GAC GCG GGG Asp Ala Gly CAG GAC Gin Asp 2105 TGT GTG CGC CYs Val Arg AGG GTC TAC Arg Val Tyr
GCG
Ala CGA GCC GGG GGC GAA GAG GAC GCG GGC Arg Ala Gly Gly Glu Glu Asp Ala Gly 2110 21i5 CCG AGT GAG GAG GAG CTC CAG GAC AGC Pro Ser Glu Giu Giu Leu Gin Asp Ser 2125 2130 CGG GGT CGA Arg Gly Arg 2120 657 0 6616 6669 GTC AGC AGC CTG TAGTGGGCGC TGCCAGATGC
GGGCTTTTTT
Val Ser Ser Leu TTATTTGTTT CAATGTTCCT AATGGGTTCG TTTCAGAAGT GCCTCACTGT
TCTCGT
INFORMATION FOR SEQ ID NO:37:
SEQUENCE CHARACTERISTICS:
LENGTH: 2970 base pairs
TYPE: nucleic acid
STRANDEDNESS: double
TOPOLOGY: linear
(ii) MOLECULE TYPE: DNA (genomic)
(ix) FEATURE:
NAME/KEY: CDS
LOCATION: 502. .2316 OTHER INFORM4ATION: /standard-name= "B3eta-2C" (xi) SEQUENCE DESCRIPTION: SEQ ID NO:37: CAGCAGCGTG CTAAGAAGCA GTCACATA CAGCAGCAGG AGTAGGCCTC CTGCTTTTCA 9AAAGCAGAGT ACTGCAGGGT CGCGAAATGC AAGACACTCA GATGTTTGA AATCTCCCGA 120 .9GTTGAGAATG GCTACTGTA AAGCGTCACC AAGAAACTCT GACGATCTGG ACAGTCCTAA 180 CTCTGTGTTA GCAATACTTA CTTCCGGA TATCACT GA ATTTTTGCAA 240 *ATAGGAAACC CCCTTGAAGA AGATCTCAAA TTACCCCCC CACCCCCAAA AAALAGACAAZ 300 *CAGGGGAGACAGTTGGAGCG AGGAACGGTG GCTTTTTTAG AAACTACCTA 360 'GGAGGCAGA GCTAAGTGAT TTGCTCATGC CTCTTACCTG GGAGTAGAAG GTGGGAAGAJ4 420 9..*ATGGACCGAG GCTGTGACGA GAAGACAAGG CACAGTGCAG CTTGGTGAJAG CCACACGCTO 480 ATCGT-C GCCCCCTCTT C ATG CAG TGC TGC GGG CTG GTG CAT CGC CGG53 .9 4 Met Gin Cys Cys Gly Leu Val His Arg Arg53 1 5 ~*CGA GTA CGG GTG TCC TAT GGT TCG GCA GAC TCC TAC ACT GCTCA57 Arg Val Arg Val Ser Tyr Gly Ser Ala Asp Ser Tyr Thr Ser Arg Pro is. 1 20 TCC GAT TCC GAT GTA TCT CTG GAG GAG GAC CGG GAG OCA GTG CGC AGA 627 Ser Asp Ser Asp Val Ser Leu Glu Glu Asp Arg Glu Ala Val Arg Arg .9.30 35 woo* GAA GCG GAG CGG CAG GCC CAG GCA CAG TTG GAA AAA GCA AAG ACA AAG 675 Glu Ala Glu A-rg Gin Ala Gln Ala Gln Leu Glu Lys Ala Lys Thr Lys 50 CCC OTT GCA TTT GCG GTT CGG ACA AAT GC AGC TAC AGT GCG GCC CAT 723 Pro Val Ala Phe Ala Val Arg Thr Asn Val Ser Tyr Ser Ala Ala His 65 GAA OAT OAT OTT CCA OTO CCT GOC ATO 0CC ATC TCA TTC OAA OCA AAA 771 Glu Asp Asp Val Pro Val Pro Oly Met Ala Ile Ser Phe Olu Ala L.ys 80 85 OAT TTT CTG CAT OTT AAG GAA AAA TTT AAC AAT GAC TOG TOG ATA 000 819 Asp Phe Leu His Val Lys Glu Lys Phe Asn Asn Asp Trp Trp Ile Oly 100 105 COA TTG OTA AAA OAA 0CC TOT GAA ATC GCA TTC ATT CCA AOC CCA OTC 867 Arg Leu Val Lys Olu Gly Cys Olu Ie Gly Phe Ile Pro Ser Pro Val 110 115 120 -271- AAA CTA LYS Leu AAA TTC Lys Phe 140
GAA
Glu 125 AAC ATC ACG CTC Asn Met Arg Leu CAT GAA CAC His Giu Gin AGA CC AAG CAA GGC Arg Ala Lys Gin Gly TAC TCC ACT AAA Tyr Ser Ser Lys TCA GGA GGA AAT TCA TCA TCC AGT TTG GGT Ser Cly Cly Asn Ser Ser Ser Ser Leu Gly GAC ATA GTA CCT ACT TCC AGA AAA Asp Ile Val Pro Ser Ser Arg Lys 155 160 TCA ACA CCT Ser Thr Pro CCA TCA TCT GCT Pro Ser Ser Ala GAC ATA GAT CCT Asp Ile Asp Ala GGC TTA CAT Cly Leu Asp AAC CAC Asn His TCC AAA Ser Lys CCC TAT Pro Tyr 220
CC
Arg GCA CAA Ala Glu 180 CCA AAC Ala Asn CAA AAT CAT ATT Glu Asn Asp Ile CCA CCA Pro Ala CCT AAA CCC ACT Pro Lys Pro Ser AGT GTA ACG TCA CCC CAC Ser Val Thr Ser Pro His GAG AAA AGA ATG CCC Ciu Lys Arg Met Pro 205 CAT CTC CTA CCT TCC Asp Val Val Pro Ser
TTC
Phe TTT AAC AAG ACA Phe Lys Lys Thr CAC ACT CCT CTG AAC CCC TAC Leu Lys Cly Tyr 225
CTC
Val ATG CCA CCA Met Arg Pro ACA CAT ATG Thr Asp Met CTC CTC CTA CTC CCC CCT Val Val Leu Val Cly Pro 915 963 1011 1059 1107 1155 1203 1251 1299 1347 1395 1443 1491
ATC
Met CAA AAA CC CTC Gin LYS Ala Leu CAT TTT TTA AAA Asp Phe Leu Lys
CAC
His 255 AGA TTT GAA CCC Arg Phe Clu Cly ATA TCC ATC ACA Ile Ser Ile Thr ACG CTC A-rg Val ACC CCT CAC Thr Ala Asp
ATC
Ile 270 TCC CTT CCC AAA Ser Leu Ala Lys
CC
Arg TCC CTA TTA AAC Ser Val Leu Asn AAC CAC Lys His CTT CAG Val Gin 300 AAT CCC ACT Asn Pro Ser 280 TTA CC GAA
CCA
Al a ATA ATA CAA ACA Ile Ile Ciu Arg AAC ACA ACC TCA Asn Thr Arg Ser ACT CAA ATC GAA Ser Giu Ile Giu
AG
Arg 305 ATT TTT CAA CTT Ile Phe Glu Leu
GCA
Al a AGA ACA TTG CAG Arg Thr Leu ln
TTC
Leu 315 CTG CTC CTT GAC Val Val Leu Asp
C
Al a CAT ACA ATT AAT Asp Thr Ile Asn
CAT
His CCA CCT CAA CTC AAA ACC TCC TTG CCC CCT ATT ATA GTA TAT CTA AAC ATT TCT TCT CCT LYS Thr Ser Leu Ala Pro Ile Ile Val Tyr Val Lys Ile Ser Ser Pro 335 340 345 1539 -272- AAC GTT TTA CAA AGG TTA ATA AAA TCT CGA GGG AAA TCT CAA GCT
A
Lys Vai Leu Gin Arg Leu ieLsSer Arg Giy Lys Ser Gin Ala Lys 35o0l y 355 360 AA CT AAc GTC His Leu Asn Val 365 CAG ATG GTA CA CCT CAT AAA CTG GCT Gin M~et Vai Al71a Aia Asp Lys Leu Aia CAC
5555 CCA GAG Pro Ciu 380 TGT
GAG
Gys Gin 395 CCT CCC pro Pro S CTG TTC CAT GTG ATC TTC CAT GAG AAC CAG CTT
GAG
Leu, Phe Asp Vai Ile Leu Asp Giu Asn Gin Leu Giu 385 390 CAG CTT CCC GAG TAT CTC GAG GCC TAG TGG AAG
CC
Hi s Len Aia Asp Tyr Leu Ciu Aia Tyr Trp Lys Aia 400 405 ~GC ACT ACC CTC CCC AAC GCT CTC CTT AGC CGT
ACA
;er Ser Ser Leu Pro Asn Pro Leu Leu Ser Arg Thr 415 420 TGT
CCT
Gys Pro CAT CC ASP Ala ACC
CAT
Thr His 410 TTA CC ACT TCA ACT CTC CCT
CTT
Thr Ser Ser Leu Pro Leu TCT CAA
GCT
Ser Gin Cly 445 AGC CCC Ser Pro ACT
AT
Thr Asp
ACC
Thr 435 CTA CC TCT AAT TCA CAC
GT
Leu Ala Ser Asn Ser Gin Ciy CAT GAG
AG
Asp Gin Arg TCC CAA Ser Gin 460 GGT GAA CAA CAA Ala Ciu Giu Gin CC TC CT CCT
ATG
Arg Ser -Aa Pro Ile 455 GTC GAA GA GC AAG Val Cu pro Vai Lys CCT TGT CCT Arg Ser Ala 1587 163s 1683 1731 1779 1827 1875 1923 19 71 2019 2115 2163 2211 GT ACT Pro Ser
CAC
His 475 CCC TGT TCC TCC
TGA
Arg Ser Ser Ser Ser 480 CC CA AG A-la Pro His CAA GAG ACA Gin Glu Thr GAG AAG His Asn CAT CCC ACT CCC
ACA
His A-rg Ser Cly Thr ACT CCC CCC
GTG
Ser Arg Cly Leu ACT CGA GAG
TCT
Ser Arg Asp Ser 510 GAG GTG GAG GAG His Val Asp His 525 TGG AG Ser Arg GAG TCG CAA ACC ASP Ser Giu Thr GAG GAG CCC TAG GTA GAG Aia Tyr Val Gin GGA AAC GAA GAT TAT Lys GiU ASP Tyr TAT CCC TCA Tyr Ala Ser TCC CAT GAG Ser His Asp 520 AGA GAG GAG
GAG
His 515 ACCC A gT AGp GAG AAG ACC GACGGCC ACC ACT GAG GAG AGA GAG ACC GAG Thr His Gly Ser Ser Asp HisAr Hi AgCu 540 r Hi X Gl TC CCC GAG CGT TCC CCC GAG GTG CAT
CGA
Arg Asp Val Asp Arg 555 GAG GAG GAG GAG AAG
GAG
Giu Gin Asp His Asn Giu 560 565 TGC AAG AAG GAG Gys Asn Lys Gin
CC
Arg 570 -273- ACC CGT CAT AAA TCC AAC AG C A G A A A C A T 2259 Ser Arg Hi Ly S r L s sp Arg Tyr Cys iu Lys Asp Gly Ciu Val 575 580 585 ATA TCA AAA AAA CGG AAT GAG CCT GGG GAG TGC AAC AGC GAT GTT TAC 2307 Ile Ser Lys Lys A-rg Asn Ciu Ala Gly Glu Trp Asra Arg Asp Val Tyr 590 595 600 ATC CCC CAA TCACTTTTGC CCTTTTGTCT TTTTTTTTTT TTTTTTTTCA 2356 Ile Pro Gin 605 AGTCTTCTAT AACTAACAGC ATCCCCAAAA CAAAAACTCT TTGGCCTCTA CACTGCCATC 2416 ATATGTGATC TGTCTTGTAA TATTTTGTAT TATCGTGCTTGAATAGCAAACT 2476 GATAGACTAT TGAGATACTT TTTCTTTTGT AAGTCCTACA TAAATTGGCC TGCTATCGCT 23 SGCAGTCCTCC GGTTGCATAC TGGACTCTTC AAAAACTGTT TTGGGTAGCT GCCACTTGAA 2596 .CAAAATCTGT TGCCACCCAC GTGATGTTAC TGTTTTAAGA AATGTAGTTG ATGTATCCAA 2656 *CAACCCAGAA TCAGCACACA TAAAAAGTGG AATTTCTTGT TTCTCCACAT TATCG 21 *TTAATACCCA GCCATCTGAT TTCCATATTC ATTCATCGAC CACTCTTTCT GTTAC 27 TCTGGCTGAC TAAATTTGGG GACAGATTCA GTCTTGCCTT ACACAAAGGG GATCATAAC 2836 *TTAGAATCTA TTTTCTATCT ACTATACTG TGTACTGTAT AACAGTTTC TAAATTTAT 2896 .TTCTGCAAC AAACACCTCC TTATTATATA TAATATATAT ATATATATCA GTTTGATCAC 2956 ACTATTTTAG ACTC 2970 INFORMATION FOR SEQ ID NO:38: SEQUENCE
CHARACTERISTICS:
LENGTH: 2712 base pairs
TYPE: nucleic acid
STRANDEDNESS: double
TOPOLOGY: linear
(ii) MOLECULE TYPE: DNA (genomic)
(ix) FEATURE:
NAME/KEY: CDS
LOCATION: 223. .2061 OTHER INFORMATION: /standard-name= "Beta-2E" (xi) SEQUENCE DESCRIPTION: SEQ ID NO:38: ACTGTCTCTT TTCAGCCCCT CCTCGAATGG CAAAATAACA ATCTCCCTCG ATCCCACTCC TCTGGCGCAG GGAGTCAAAG CCCCGCACGC AGAAAGCGAC CCAGAACAGC CGCTTGCCCA 120 -274- GAGCATGGAT AGGAAAGGAG CTGGGGTTCT CCGGGGCTCA GCGCGCACTG AGAACCTGTG 180 CCCGGGGCTG CAGCTGCGGA CGATAAAGGC GCTGTCTGGC TC ATG AAG GCC ACC 234 Met Lys Ala Thr 1 TG ATC AGO CTT CTG AAA AGA CC AAG GGA GGA AGG CTG AAs AAT TCT22 Tm Ile Arg Leu Leu Lys Arg Ala Lys Gly Gly ArgLuLsAr e 510 15 GAT ATC TOT GOT TCG GCA GAC TCC TAC ACT AGc COT CCA TCC GAT TCC 330 Asp Ile Cys Gly Ser Ala Asp Ser Tyr Thr Ser Arg Pro Ser Asp Ser 30 3 GAT GTA TCT CTG GAG GAG GAC CGG GAG GCA GTG CGC AGA GAA GCG GAG 378 S.Asp Val Ser Leu Gu Gu Asp Arg Gu Ala Val Arg Arg Glu Ala Glu 45 CGG CAG GCC CAG GC.A CAG TTG GAA AAA GCA AAG ACA AAG CCC GTT GCA 426 *SArg Gin Ala Gin Ala Gin Leu Clu Lys Ala Lys Thr Lys Pro Val Ala 60 55*TTT GCG OTT CGG ACA AAT OTC AGC TAC AGT GCG GCC CAT GAA GAT GAT47 P le Val Arg Thr Asn Vai Ser Tyr Ser Ala Ala His Giu Asp Asp 75 8 GTT CCA GTG CCT GGC ATG GCC ATC TCA TTC GAA GCA AAA GAT TTT CTG 522 Val Pro Val Pro Gly Met Al1a Ile Ser Phe Glu Ala Lys Asp Phe Leu *5*85 90 95100 *00CAT GTT AAG GAA AAA TTT AAC AAT GAC TGG TGG ATA GGG, CGA TTG GTA 570 His Val Lys Giu Lys Phe Asn Asn Asp Trp Trp Ile Gly Arg Leu Val 105v* 110 115 AAA GAA GGC TGT GAA ATC GGA TTC ATT CCA AGC CCA OTC AAA CTA GAA 618 Lys Glu Gly Cys Gu Ile Gly Phe Ile Pro Ser Pro Val Lys Leu Gu 120 125 130 AAC ATG AGG CTG CAG CAT GAA CAG AGA GCC AAG CAA GGG AAA TTC TAC 666 Asn Met Arg Leu Gin His Giu Gin Arg Ala Lys Gin Gly Lys Phe Tyr 135 140 145 TCC ACT "ATrCA GGA GGA AAT TCA TCA TCC AGT TTG GOT GAC ATA GTA 714 Ser Ser Lys Ser Gly Gly Asn Ser Ser Ser Ser Leu Gly Asp Ile Val 150 155 160 CCT AGT TCC AGA AAA TCA ACA CCT CCA TCA TCT OCT ATA GAC ATA GAT 762 Pro Ser Ser Arg Lys Ser Thr Pro Pro Ser Ser Ala Ile Asp Ile Asp 165 170 175 180 GCT ACT GGC TTA CAT GCA GAA GAA AAT GAT ATT CCA 
GCA AAC CAC CCC 810 Ala Thr Cly Leu Asp Ala Glu Giu Asn Asp Ile Pro Ala Asn His Arg 185 190 195 TCC CCT AAA CCC ACT GCA AAC AGT GTA ACC TCA CCC CAC TCC AAA GAG 858 Ser Pro Lys Pro Ser Ala Asn Ser Val Thr Ser Pro His Ser Lys Glu -275- 200 205 210 AAA AGA ATG CCC TTC TTT AAG AAG ACA GAG CAC ACT CCT CCG TAT GAT 906 Lys Arg Met Pro Phe phe Lys Lys Thr Giu His Thr Pro Pro Tyr Asp 215 220 225 GTG GTA CCT TOC ATG CGA CCA GTG GTC CTA GTG GGC CCT TCT CTG AAG 9S4 Val Val Pro Ser Met Arg Pro Val Val Leu Val Gly Pro Ser Leu Lys 230 235 240 GGC TAC GAG GTC ACA GAT ATG ATG CAA AAA GCG CTG TTT CAT TTT TTA 1002 Gly Tyr Giu Val Thr Asp Met Met Gin Lys Ala Leu Phe Asp Phe Leu 245 250 255 260 AAA CAC AGA TTT GAA GGG CGG ATA TCC ATC ACA AGG GTC ACC GCT GAC 1050 Lys His Arg Phe Glu Gly Arg Ile Ser Ile Thr Arg Val Thr Ala Asp 265 270 275 275 ATC TCG CTT GCC AAA CGC TCG GTA TTA AAC AAT CCC ACT AAG CAC GCA 1098 Ile Ser Leu A-la Lys Arg Ser Val Leu Asn Asn Pro Ser Lys His Ala 280 285 290 ATA ATA GAA AGA TCC AAC ACA AGG TCA ACC TTA GCG GAA GTT CAG AGT 1146 Ile Ile Giu Arg Ser Asn Thr Arg Ser Ser Leu Ala Giu Val Gin Ser 295 300 305 GAA ATC GAA AGG ATT TTT GAA CTT GCA AGA ACA TTG CAG TTG GTG GTC 1194 Glu Ile Glu Arg Ile Phe Gu Leu Ala Arg Thr Leu Gin Leu Vai Val 310 315 320 CTT GAC GCG GAT ACA ATT AAT CAT CCA GCT CAA CTC ACT AAA ACC TCC 1242 Leu Asp Ala Asp Thr Ile Asn His Pro Aia Gin Leu Ser Lys Thr Ser 325 330 335 340 TTG GCC CCT ATT ATA GTA TAT GTA AAG ATT TCT TCT CCT AAG GTT TTA 1290 Leu Ala Pro Ile Ile Val Tyr Val Lys Ile Ser Ser Pro Lys Val Leu 345 350 355 CAA AGG TTA ATA AAA TCT CGA GGC AAA TCT CAA GCT AAA CAC CTC AAC 1338 355 Gin Arg Leu lie Lys Ser Arg Gly Lys Ser Gin Ala Lys His Leu Asn A 13 360 365 370 ***GTC CAG ATG GTA a GCT AAA CTG GCT CA TGT CCT CCA GAG CTC 1386 Val Gin Met Val Ala Ala Asp Lys Leu Ala Gin Cys Pro Pro Glu Leu 375 380 385 TTC AT GTG ATC TTG AT GAG AAC CAG CTT GAG AT CC TT GAG CAC 1434 Phe Asp Val Ile Leu Asp Glu Asn Gin Leu Gu Asp Ala Cys Gu His 390 395 400 CTT 
CC GAC TAT CTG GAG GCC TAC TGG AAG GCC ACC CAT CCT CCC AGO 1482 Leu Ala Asp Tyr Leu Giu Ala Tyr Trp Lys Ala Thr His Pro Pro Ser 405 410 415 420 ACT AC CTC CCC AAC CCT CTC CTT AGC CT ACA TTA GCC ACT TCA ACT 1530 Ser Ser Leu Pro Asn Pro Leu Leu Ser Arg Thr Leu Ala Thr Ser Ser -276- 4 4 4* 4 *4 *40.
425 CTG CCT CTT AGC CCC ACC CTA GCC Leu Pro Leu Ser Pro Thr Leu Ala 440 GAT CAG AGG ACT GAT CGC TCC GCT Asp Gin Arg Thr Asp Arg Ser Ala 455 460 GAA GAA GAA CCT AGT GTG GAA CCA Giu Giu Giu Pro Ser Val Giu Pro 470 475 TCC TCC TCA GCC CCA CAC CAC AAC Ser Ser Ser Ala Pro His His Asn 485 490 CTC TCC AGG CAA GAG ACA TTT GAC Leu Ser Arg Gin Giu Thr Phe Asp 505 TCT GCC TAC GTA GAG CCA AAG GAA Ser Ala Tyr Val Glu Pro Lys Giu 520 CAC TAT GCC TCA CAC CGT GAC C-AC His Tyr Ala Ser His Arg Asp His 535 540 AGC AGT GAC CAC AGA CAC AGG GAG Ser Ser Asp His Arg His Arg Giu 550 555 GAT CGA GAG CAG GAC CAC AAC GAG I Asp Arg Giu Gin Asp His Asn Giu C 565 570 AAA TCC AAG GAT CGC TAC TGT GAA Lys Ser Lys Asp Arg Tyr Cys Giu L.
585 AAA CGG AAT GAG GCT GGG GAG TGG Lys Arg Asn Giu Ala Gly Giu Trp 600 6 TGAGTTTTGC CCTTTTGTGT
TTTTTTTTTT
ATCCCCAAAA CAAAZAGCTCT
TTGGGGTCTA
TATTTTGTAT TATTGCTGTT
GCTTGAATAG
TTTCTTTTGT AAGTGCTACA
TAAATTGGCC
TGGACTCTTC AAAAACTGTT
TTGGGTAGCT
430 TCT AAT TC Ser Asn Se 445 CCT ATC CG Pro Ile Ar GTC AAG AA Val Lys Ly5 CAT CGC AGJ His Arg SeT.
495 TCG GAA ACC Ser Giu Thr 510 GAT TAT TCC Asp Tyr Ser 525 PAC CAC AGA ksn His Arg r'CC CGG CAC 'er Arg His 'GC AAC AAG ys Asn Lys 575 LAG GAT GGA ~ys Asp Giy 590 Sn Arg Asp 05
TTTTTTTTGA
CACTGCAATC
CAATAGCATG
TGGTATGGCT
GCCACTTGAA
435 A CAG GGT TCT CAA GGT r Gin Gly Ser Gin Cly 450 T' TCT GCT TCC CAA GCT SSer Ala Ser Gin Ala 465 ~TCC CAG CAC CGC TCT sSer Gin His Arg Ser 480 SGGG ACA AGT CGC GC Cly Thr Ser Arg Gly 500 CAG GAG AGT CGA GAC Gin Giu Ser Arg Asp 515 CAT GAC CAC GTG GAC His Asp His Val Asp 530 GAC GAG ACC CAC GGG Asp Giu Thr His Gly 545 CGT TCC CGG GAC GTG Arg Ser Arg Asp Val 560 CAC CGC AGC CGT CAT Gin Arg Ser Arg His 580 GAA GTG ATA TCA AAA Giu Val Ile Ser Lys 595 GT r .T TAC ATC C CCC CAA Val Tyr Ile Pro Gin 610 AGTCTTGTAT
AACTAACAGC
ATATGTGATC
TGTCTTGTAA
GATAGAGTAT
TGAGATACTT
CCAGTCCTCC
GGTTGCATAC
CAAAATCTGT TGCCACCCAG 1578 1626 1674 1722 1770 1818 1866 1914 1962 2010 2058 2118 2178 2238 2298 2358 -277- GTGATGTTAG TGTTTTAAGA AATGTAGTTG ATGTATCA CAAGCCAGA TCAGCACAGA 2418 TAAAAAGTGG AATTTCTTCT TTCTCCAGAT TTTTAATAC TTAATACGCA GGCATCTGAT 24!78 TTGCATATTC ATTCATGGAC CACTGTTTCT TGCTTGTACC TCTGGCTGAC TAAATTTCGG 2538 GACAGATTCA GTCTTGCCTT ACACAAAGGG GATCATAAAG TTAGAATCTA TTTTCTATGT 2598 ACTAGTACTG TGTACTGTAT AGACAGTTTG TAAATGTT.AT TTCTGCAAAC AAACACCTCC 2658 TTATTATATA TA-ATATATAT ATATATATCA GTTTGATCAC ACTATTTTAG AGTC 2712
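The feature tables for SEQ ID NOs: 37 and 38 state a total length and a CDS location for each sequence. As an illustrative consistency check only, the following Python sketch tests whether each stated CDS range lies within the stated length and spans a whole number of codons. The class and function names are hypothetical, and the sketch assumes that the coordinates are 1-based and inclusive; it uses only the numbers given in the feature tables and does not read the sequences themselves.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CdsFeature:
    """A CDS feature as stated in a sequence listing (hypothetical helper)."""
    name: str
    start: int          # first base of the CDS, 1-based (assumed)
    end: int            # last base of the CDS, inclusive (assumed)
    total_length: int   # stated LENGTH of the whole sequence in base pairs

    def in_bounds(self) -> bool:
        # The CDS must lie entirely inside the stated sequence length.
        return 1 <= self.start <= self.end <= self.total_length

    def nucleotides(self) -> int:
        # Inclusive range, hence the +1.
        return self.end - self.start + 1

    def codons(self) -> int:
        # A coding range should be a whole number of three-base codons.
        n = self.nucleotides()
        if n % 3 != 0:
            raise ValueError(f"{self.name}: {n} nt is not a multiple of 3")
        return n // 3


# Values copied from the feature tables of SEQ ID NOs: 37 and 38.
FEATURES = [
    CdsFeature("Beta-2C (SEQ ID NO:37)", start=502, end=2316, total_length=2970),
    CdsFeature("Beta-2E (SEQ ID NO:38)", start=223, end=2061, total_length=2712),
]

if __name__ == "__main__":
    for f in FEATURES:
        print(f.name, f.in_bounds(), f.nucleotides(), f.codons())
```

Under these assumptions, the stated ranges correspond to 1815 nucleotides (605 codons) for SEQ ID NO:37 and 1839 nucleotides (613 codons) for SEQ ID NO:38.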

Claims (31)

1. An isolated DNA fragment, comprising a sequence of nucleotides that encodes an α1 subunit of a human calcium channel selected from the group consisting of α1A-1, α1A-2, α1E-1, α1C-2 and α1E-3.
2. The DNA fragment of claim 1, wherein the α1 subunit is α1A-1 or α1A-2, wherein: the α1A-1 subunit includes the sequence of amino acids set forth in SEQ ID NO. 22 or functionally equivalent variants thereof; and the α1A-2 subunit includes the sequence of amino acids set forth in SEQ ID NO. 23 or functionally equivalent variants thereof.
3. The DNA fragment of claim 2, wherein the subunit includes the sequence of amino acids set forth in SEQ ID NO. 22 or 23.
4. The DNA fragment of claim 1, wherein the α1 subunit is α1E-1, wherein: the α1E-1 subunit includes the sequence of amino acids set forth in SEQ ID NO. 24 or functionally equivalent variants thereof.
5. The DNA fragment of claim 1, wherein the α1 subunit is α1E-3, wherein: the α1E-3 subunit includes the sequence of amino acids set forth in SEQ ID NO. 25 or functionally equivalent variants thereof.
6. The DNA fragment of claim 4, wherein the subunit includes the sequence of amino acids set forth in SEQ ID NO. 24.
7. The DNA fragment of claim 5, wherein the subunit includes the sequence of amino acids set forth in SEQ ID NO. 25.
8. The DNA fragment of claim 1, wherein the α1 subunit is α1C-2, wherein: the α1C-2 subunit includes the sequence of amino acids set forth in SEQ ID NO. 36 or functionally equivalent variants thereof.
9. The DNA fragment of claim 8, wherein the α1C-2 subunit includes the sequence of amino acids set forth in SEQ ID NO. 36.
10. An isolated DNA fragment, comprising a sequence of nucleotides that encodes a β2 or β4 subunit of a human calcium channel.
11. The DNA fragment of claim 10, wherein the subunit is a β2C, β2D or β2E subunit.
12. The DNA fragment of claim 11, wherein the β2C, β2D or β2E subunit includes the sequence of amino acids set forth in SEQ ID Nos. 26, 29 or 30 or functionally equivalent variants thereof.
13. The DNA fragment of claim 12, wherein the β2C, β2D or β2E subunit includes the sequence of amino acids set forth in SEQ ID Nos. 26, 29 or 30.
14. A DNA fragment that encodes a β3 subunit of a human calcium channel.
15. The DNA fragment of claim 14, wherein the subunit is a β3-1 subunit.
16. The DNA fragment of claim 15, wherein the β3-1 subunit includes the sequence of amino acids set forth in SEQ ID NO. 19.
17. The DNA fragment of claim 10, wherein the subunit is a β4 subunit, wherein the β4 subunit includes the sequence of amino acids set forth in SEQ ID NO. 27 or functionally equivalent variants thereof.
18. The DNA fragment of claim 17, wherein the β4 subunit includes the sequence of amino acids set forth in SEQ ID NO. 27.
19. A eukaryotic cell, comprising heterologous DNA that encodes an α1 subunit of a human calcium channel selected from the group of subunits consisting of α1A-1, α1A-2, α1C-2, α1E-1 and α1E-3.
20. The eukaryotic cell of claim 19, wherein the α1 subunit includes the sequence of amino acids set forth in any of SEQ ID NOs. 22, 23, 24, 25 or 36.
21. The eukaryotic cell of claim 20, wherein the subunit is an α1A-1, α1A-2 or α1C-2 subunit.
22. A eukaryotic cell, comprising heterologous DNA that encodes an α1 subunit of a human calcium channel and heterologous DNA that encodes a β subunit of a human calcium channel, wherein at least one subunit is selected from the group of subunits consisting of a1,A-' 1 a-2' 21, P3-1' and a subunit.
23. The eukaryotic cell of claim 22, wherein the β subunit is selected from the group of subunits consisting of β2C, β2D or β2E.
24. The eukaryotic cell of claim 22 or claim 23, wherein the β2 subunit is a β2C, β2D or β2E subunit that includes the sequence of amino acids set forth in SEQ ID Nos. 26, 29 or 30 or functionally equivalent variants thereof.
25. The eukaryotic cell of claim 23 or claim 24, wherein the β2 subunit is a β2C, β2D or β2E subunit that includes the sequence of amino acids set forth in SEQ ID Nos. 26, 29 or 30.
26. The eukaryotic cell of claim 22, wherein the β subunit is a β4 subunit.
27. The eukaryotic cell of claim 26, wherein the β4 subunit includes the sequence of amino acids encoded by the sequence of nucleotides set forth in SEQ ID NO. 27 or functionally equivalent variants thereof.
28. The eukaryotic cell of claim 26, wherein the β4 subunit includes the sequence of amino acids encoded by the sequence of nucleotides set forth in SEQ ID NO. 27.
29. The eukaryotic cell of claim 19, selected from the group consisting of HEK 293 cells, Chinese hamster ovary cells, African green monkey cells, and mouse L cells.
30. The eukaryotic cell of claim 22 selected from the group consisting of HEK 293 cells, Chinese hamster ovary cells, African green monkey cells, and mouse L cells.
31. A eukaryotic cell with a functional, heterologous calcium channel, produced by a process comprising: introducing into the cell heterologous nucleic acid that encodes an α1 subunit of a human calcium channel, wherein: the α1 subunit is selected from the group consisting of α1A-1, α1A-2, α1C-2, α1E-1 and α1E-3; the heterologous calcium channel contains at least one subunit encoded by the heterologous nucleic acid; and the only heterologous ion channels are calcium channels.
32. The eukaryotic cell of claim 31, wherein the α1 subunit has the sequence of amino acids encoded by any of SEQ ID NOs. 22, 23, 24, 25 or 36 or functionally equivalent variants thereof.
33. The eukaryotic cell of claim 31, wherein the subunit has the sequence of amino acids encoded by any of SEQ ID NOs. 22, 23 or 36.
34. A eukaryotic cell with a functional, heterologous calcium channel, produced by a process comprising: introducing into the cell nucleic acid that encodes an α1 subunit of a human calcium channel and introducing into the cell nucleic acid that encodes a β subunit of a human calcium channel, wherein: at least one of the subunits is selected from the group consisting of a A A-2 I E-3 2C' 2D' 2E/ A 3a n 1A 2 1E 1 1E 3 2C a 3 and a 4 subunit; the at least one subunit including the sequence of amino acids set forth in or encoded by the DNA set forth in any of SEQ ID NOs. 19, 22, 23, 24, 25, 26, 27, 28, 29, 30 or 36 or functionally equivalent variants thereof; the heterologous calcium channel containing at least one subunit encoded by the heterologous nucleic acid; and the only heterologous ion channels being calcium channels.
35. The eukaryotic cell of claim 34, wherein the at least one subunit includes the sequence of amino acids set forth in or encoded by the DNA set forth in any of SEQ ID NOs. 19, 22, 23, 26, 27, 28, 29, 30 or 36.
36. The eukaryotic cell of claim 31 selected from the group consisting of HEK 293 cells, Chinese hamster ovary cells, African green monkey cells, mouse L cells and amphibian oöcytes.
37. The eukaryotic cell of claim 34 selected from the group consisting of HEK 293 cells, Chinese hamster ovary cells, African green monkey cells, mouse L cells and amphibian oöcytes.
38. The eukaryotic cell of claim 34, wherein the β subunit is a β2C, β2D, β2E, a β3 or a β4 subunit of a human calcium channel.
39. A eukaryotic cell with a functional, heterologous calcium channel, produced by a process comprising: introducing into the cell nucleic acid that encodes an α1 subunit of a human calcium channel, nucleic acid that encodes an α2 subunit of a human calcium channel and introducing into the cell nucleic acid that encodes a β subunit of a human calcium channel, wherein: the α2 subunit is an α2b subunit of a human calcium channel, the α1 subunit is an 1A, a a 1 0 or alE subunit of a human calcium channel and the β subunit is a β3 or β4 subunit of a human calcium channel, wherein the α1 subunit includes a sequence of amino acids set forth in any of SEQ ID NOs. 1, 2, 3, 6, 7, 8, 22, 23, 24, 25, or 36, the α2b subunit includes an amino acid sequence set forth in SEQ ID NO. 11, and the β subunit includes an amino acid sequence set forth in SEQ ID NO. 19 or functionally equivalent variants of the subunits encoded by any of the aforesaid SEQ ID NOs.
40. The eukaryotic cell of claim 39, wherein the calcium channel comprises an A2', a 2 b, and a P3- subunit, or a alB, r 2 b' and an (3-1 subunit.
41. A eukaryotic cell with a functional, heterologous calcium channel, produced by a process comprising: introducing into the cell nucleic acid that encodes an α1 subunit of a human calcium channel, nucleic acid that encodes an α2 subunit of a human calcium channel and introducing into the cell nucleic acid that encodes a β subunit of a human calcium channel, wherein: the calcium channel contains an α2b subunit of a human calcium channel, an or an ac, subunit of a human calcium channel and a O ,2 or 1-3 subunit of a human calcium channel, wherein the α1 subunit includes a sequence of amino acids encoded by SEQ ID NOs. 1, 2, 7 or 8, the α2b subunit includes an amino acid sequence set forth in SEQ ID NO. 11, and the β subunit includes an amino acid sequence set forth in SEQ ID NOs. 9, 10 and 33-35 or functionally equivalent variants of the subunits encoded by any of the aforesaid SEQ ID NOs.
42. The eukaryotic cell of claim 41, wherein the calcium channel includes an ae I 1 2b, and a 2 subunit, or an aie a2b' and a p.3 subunit, or an a1 2 aQb, and a B, 1 subunit.
43. A method for identifying a compound that modulates the activity of a calcium channel, comprising: suspending a eukaryotic cell that has a functional, heterologous calcium channel, in a solution containing the compound and a calcium channel-selective ion; depolarizing the cell membrane of the cell; and detecting the current flowing into the cell, wherein: the heterologous calcium channel includes at least one human calcium channel subunit encoded by DNA or RNA that is heterologous to the cell; at least one subunit is selected from the group consisting of α1A-1, α1A-2, α1E-1, α1E-3, α1C-2, β2C, β2D, β2E, a β3 subunit and a β4 subunit and includes the sequence of amino acids set forth in or encoded by the sequence of nucleotides set forth in any of SEQ ID NOs. 19, 22, 23, 24, 25, 26, 27, 28, 29, 30 or 36 or functionally equivalent variants thereof; the current that is detected is different from that produced by depolarizing the same or a substantially identical cell in the presence of the same calcium channel selective ion but in the absence of the compound.
44. The method of claim 43, wherein the heterologous DNA or RNA encodes a β3 subunit.
45. The method of claim 43, wherein the heterologous DNA or RNA encodes a β4 subunit.
46. A subunit-specific antibody selected from the group consisting of antibodies that bind to an a subunit type or a subunit subtype of a human calcium channel, wherein the subunit is an lA, a E a 0 D or al. type a 1 subunit, wherein the subunit includes a sequence of amino acids set forth in any of SEQ ID NOs. 1, 2, 3, 6, 7, 8, 22, 23, 24, 25, or 36,
47. The antibody of claim 46, wherein the antibody is subtype specific and the a subunit is lA a E and a 1 A
48. The antibody of claim 29, wherein the subunit subtype is 1A 1 A-2, 1 C- 2 f 1E- or a 3
49. An RNA or single-stranded DNA probe of at least 30 bases in length comprising at least substantially contiguous bases from nucleic acids that encode a subunit of a human calcium channel selected from the group of subunits consisting of a I a 2 alEl- I alE-3' P2C' B2D, 2E and P4.
50. An RNA or single-stranded DNA probe of at **20 least 16 bases in length comprising at least 16 substantially contiguous bases from nucleic acids that subunit of a human calcium channel selected from the group of subunits consisting of alE- aIE- 3 n 2E and ,4 subunits. 1E 3 C
51. A method for identifying nucleic acids that encode a human calcium channel subunit, comprising hybridizing under conditions of at least low stringency a probe of claim 49 or claim 50 to a library of nucleic acid fragments, and selecting hybridizing fragments.
52. A method for identifying cells or tissues that express a calcium channel subunit-encoding nucleic acid, comprising hybridizing under conditions of at least low stringency a probe of claim 49 or claim 50 with mRNA expressed in the cells or tissues or cDNA produced from the mRNA, and thereby identifying cells or tissue that express mRNA that encodes the subunit.
53. A substantially pure human calcium channel subunit selected from the group consisting of iA-1 a1A-2' a1E-1' a 1 C- 2 a 1E-3,' P 3 2ZC' 2D' 2PE and P4.
54. A substantially pure human calcium channel subunit selected from among subunits that include the sequence of amino acids set forth in any of SEQ ID NOs. 19, 22, 23, 24, 25, 26, 28, 29, 30 or 36 or functionally equivalent variants thereof. *@0 9 9 9 9* e 9 *f 0 287 1. An isolated DNA molecule, comprising a sequence of nucleotides that encodes an a, subunit of a human calcium channel, wherein the subunit is selected from the group consisting of aIA-i, 1 (E 1 and alE 3 2. The DNA molecule of claim 1, wherein the a, subunit is or IA-.2, wherein: the IA-1 subunit includes the sequence of amino acids set forth in SEQ ID NO. 22 or functionally equivalent variants thereof; and the alA-2 includes the sequence of amino acids set forth in SEQ ID NO. 23 or functionally equivalent variants thereof. 3. The DNA molecule of claim 2, wherein the subunit includes the sequence of amino acids set forth in SEQ ID NO. 22 or 23. 4. The DNA molecule of claim 1, wherein the a, subunit is a.-1. The DNA molecule of claim 1, wherein the a, subunit is aE-3. 6. The DNA molecule of claim 4, wherein the subunit includes the sequence of amino acids set forth in SEQ ID NO. 24. 7. The DNA molecule of claim 5, wherein the subunit includes the sequence of amino acids set forth in SEQ ID NO. 8. An isolated nucleic acid molecule that encodes an a, 1 C 2 subunit of a human calcium channel, comprising as sequence of amino acids encoded by the coding portion of SEQ ID NO. 36. a 9. The nucleic acid molecule of claim 8 that comprises the coding portion of the sequence of nucleotides set forth in SEQ ID NO. 36. 10. An isolated DNA molecule, comprising a sequence of nucleotides that encodes a P2 subunit of a human calcium channel. 11. The DNA molecule of claim 10, wherein the subunit is a 2C, P2D or 2E subunit. 12. The DNA molecule of claim 11, wherein the 3 2C, P2D or P2E subunit includes the sequence of amino acids set forth in SEQ ID Nos. 26, 29 or 30 or functionally equivalent variants thereof. 13. 
The DNA molecule of claim 12, wherein the β2C, β2D or β2E subunit includes the sequence of amino acids set forth in SEQ ID Nos. 26, 29 or 30.
14. A DNA molecule that encodes a β3 subunit of a human calcium channel.
15. The DNA molecule of claim 14, wherein the subunit is a β3 subunit.
16. The DNA molecule of claim 15, wherein the subunit includes the sequence of amino acids set forth in SEQ ID NO. 19.
17. A DNA molecule, comprising a sequence of nucleotides that encodes a β4 subunit of a human calcium channel.
18. The DNA molecule of claim 17, wherein the subunit includes an amino acid sequence encoded by the sequence of nucleotides set forth in SEQ ID No. 27.
19. A eukaryotic cell, comprising heterologous DNA that encodes an α1 subunit selected from the group of subunits consisting of α1A-1, α1A-2, α1C-2, α1E-1, and α1E-3.
20. The eukaryotic cell of claim 19, wherein the α1 subunit includes the sequence of amino acids set forth in any of SEQ ID NOs. 22, 23, 24, 25 or 36.
21. The eukaryotic cell of claim 20, wherein the α1 subunit is an α1A-1, α1A-2 or α1C-2 subunit.
22. A eukaryotic cell, comprising heterologous DNA that encodes an α1 subunit and heterologous DNA that encodes a β subunit, wherein at least one subunit is selected from the group of subunits consisting of α1A-1, α1A-2, PI-2, P 3 2, β2E, β3-1 and a β4 subunit.
23. The eukaryotic cell of claim 22, wherein the β subunit is a β2 subunit.
24. The eukaryotic cell of claim 22 or claim 23, wherein the β2 subunit is a β2C, β2D or β2E subunit that includes the sequence of amino acids set forth in SEQ ID Nos. 26, 29 or 30 or functionally equivalent variants thereof.
25. The eukaryotic cell of claim 23 or claim 24, wherein the β2 subunit is a β2C, β2D or β2E subunit that includes the sequence of amino acids set forth in SEQ ID Nos. 26, 29 or 30.
26. The eukaryotic cell of claim 22, wherein the β subunit is a β4 subunit.
27. The eukaryotic cell of claim 26, wherein the β4 subunit includes the sequence of amino acids encoded by the sequence of nucleotides set forth in SEQ ID NO. 27 or functionally equivalent variants thereof.
28. The eukaryotic cell of claim 26, wherein the β4 subunit includes the sequence of amino acids encoded by the sequence of nucleotides set forth in SEQ ID NO. 27.
29. The eukaryotic cell of any one of claims 19-28 selected from the group consisting of HEK 293 cells, Chinese hamster ovary cells, African green monkey cells, and mouse L cells.
30. A eukaryotic cell with a functional, heterologous calcium channel, produced by a process comprising: introducing into the cell heterologous nucleic acid that encodes an α1 subunit of a human calcium channel, wherein: the α1 subunit is selected from the group consisting of α1A-1, α1A-2, α1C-2, α1E-1, and α1E-3; the heterologous calcium channel contains at least one subunit encoded by the heterologous nucleic acid; and the only heterologous ion channels are calcium channels.
31. The eukaryotic cell of claim 30, wherein the α1 subunit has the sequence of amino acids encoded by any of SEQ ID NOs. 22, 23, 24, 25 or 36 or functionally equivalent variants thereof.
32. The eukaryotic cell of claim 30, wherein the α1 subunit has the sequence of amino acids encoded by any of SEQ ID NOs. 22, 23 or 36.
33. A eukaryotic cell with a functional, heterologous calcium channel, produced by a process comprising: introducing into the cell nucleic acid that encodes an α1 subunit of a human calcium channel and introducing into the cell nucleic acid that encodes a β subunit of a human calcium channel, wherein: at least one of the subunits is selected from the group consisting of α1A-1, α1A-2, α1E-1, α1E-3, α1C-2, β2C, β2D, β2E, a β3 and a β4 subunit; the at least one subunit includes the sequence of amino acids set forth in or encoded by the DNA set forth in any of SEQ ID NOs. 19, 22, 23, 24, 25, 26, 27, 28, 29, or 36 or functionally equivalent variants thereof; the heterologous calcium channel contains at least one subunit encoded by the heterologous nucleic acid; and the only heterologous ion channels are calcium channels.
34. The eukaryotic cell of claim 33, wherein the at least one subunit includes the sequence of amino acids set forth in or encoded by the DNA set forth in any of SEQ ID NOs. 19, 22, 23, 26, 27, 28, 29, 30 or 36.
35. The eukaryotic cell of any one of claims 30-34 selected from the group consisting of HEK 293 cells, Chinese hamster ovary cells, African green monkey cells, mouse L cells and amphibian oöcytes.
36. The eukaryotic cell of claim 33, wherein the β subunit is a β2C, β2D, β2E, a β3 or a β4 subunit of a human calcium channel.
37. A eukaryotic cell with a functional, heterologous calcium channel, produced by a process comprising: introducing into the cell nucleic acid that encodes an α1 subunit of a human calcium channel, nucleic acid that encodes an α2 subunit of a human calcium channel and introducing into the cell nucleic acid that encodes a β subunit of a human calcium channel, wherein: the α2 subunit is an α2b subunit of a human calcium channel, the α1 subunit is an α1A, Ca,, α1B, or α1E subunit of a human calcium channel and the β subunit is a β3 or β4 subunit of a human calcium channel, wherein the α1 subunit includes a sequence of amino acids set forth in any of SEQ ID NOs. 1, 2, 3, 6, 7, 8, 22, 23, 24, 25, or 36, the α2b subunit includes an amino acid sequence set forth in SEQ ID NO. 11, and the β subunit includes an amino acid sequence set forth in SEQ ID NO. 19 or functionally equivalent variants of the subunits encoded by any of the aforesaid SEQ ID NOs.
38. The eukaryotic cell of claim 37, wherein the calcium channel includes an aIA-, α2b, and a β3-1 subunit, or an α1B-1, α2b and a β3-1 subunit.
39. A eukaryotic cell with a functional, heterologous calcium channel, produced by a process comprising: introducing into the cell nucleic acid that encodes an α1 subunit of a human calcium channel, nucleic acid that encodes an α2 subunit of a human calcium channel and introducing into the cell nucleic acid that encodes a β subunit of a human calcium channel, wherein: the calcium channel contains an α2 subunit of a human calcium channel, an α1B or an α1D subunit of a human calcium channel and a β1-1, β1-2 or β1-3 subunit of a human calcium channel, wherein the α1 subunit includes a sequence of amino acids encoded by SEQ ID NOs. 1, 2, 7 or 8, the α2b subunit includes an amino acid sequence set forth in SEQ ID NO. 11, and the β subunit includes an amino acid sequence set forth in SEQ ID NOs. 9, and 33-35 or functionally equivalent variants of the subunits encoded by any of the aforesaid SEQ ID NOs.
40. The eukaryotic cell of claim 39, wherein the calcium channel includes an a C 2 b, and a 1 -2 subunit, or an a,,c a 2 b, and a 3,, 3 subunit, or an a1B-2, a 2 b, and a subunit.
41. A method for identifying a compound that modulates the activity of a calcium channel, comprising: suspending a eukaryotic cell that has a functional, heterologous calcium channel, in a solution containing the compound and a calcium channel-selective ion; depolarizing the cell membrane of the cell; and detecting the current flowing into the cell, wherein: the heterologous calcium channel includes at least one human calcium channel subunit encoded by DNA or RNA that is heterologous to the cell; at least one subunit is selected from the group consisting of α1A-1, α1E-1, α1E-3, α1C-2, β2C, β2D, β2E, a β3 subunit and a β4 subunit and includes the sequence of amino acids set forth in or encoded by the sequence of nucleotides set forth in any of SEQ ID NOs. 19, 22, 23, 24, 25, 26, 27, 28, 29, 30 or 36 or functionally equivalent variants thereof; the current that is detected is different from that produced by depolarizing the same or a substantially identical cell in the presence of the same calcium channel selective ion but in the absence of the compound.
42. The method of claim 41, wherein the heterologous DNA or RNA encodes a β3 subunit.
43. The method of claim 41, wherein the heterologous DNA or RNA encodes a β4 subunit.
44. A subunit-specific antibody selected from the group consisting of antibodies that bind to an α1 subunit type or a subunit subtype of a human calcium channel, wherein the subunit is an α1A, α1C, α1E, α1D or α1B type α1 subunit, wherein the subunit includes a sequence of amino acids set forth in any of SEQ ID NOs. 1, 2, 3, 6, 7, 8, 22, 23, 24, 25, or 36.
45. The antibody of claim 44, wherein the antibody is subtype specific and the α1 subunit is α1A, α1E and α1C.
46. The antibody of claim 44, wherein the subunit subtype is α1A-1, α1A-2, α1C-2, α1E-1 or α1E-3.
47. An RNA or single-stranded DNA probe of at least 30 bases in length comprising at least 30 substantially contiguous bases from nucleic acids that encode a subunit of a human calcium channel selected from the group of subunits consisting of α1A-1, α1A-2, 1a,,- α1E-3, β2C, β2D, β2E and β4.
48. An RNA or single-stranded DNA probe of at least 16 bases in length comprising at least 16 substantially contiguous bases from nucleic acids that encode a subunit of a human calcium channel selected from the group of subunits consisting of α1E-1, α1E-3, β2C, β2D, β2E and β4 subunits.
49. A method for identifying nucleic acids that encode a human calcium channel subunit, comprising hybridizing under conditions of at least low stringency a probe of claim 49 or claim 50 to a library of nucleic acid molecules, and selecting hybridizing molecules.
50. A method for identifying cells or tissues that express a calcium channel subunit-encoding nucleic acid, comprising hybridizing under conditions of at least low stringency a probe of claim 50 with mRNA expressed in the cells or tissues or cDNA produced from the mRNA, and thereby identifying cells or tissue that express mRNA that encodes the subunit.
51. A substantially pure human calcium channel subunit selected from the group consisting of α1A-1, α1A-2, α1E-1, α1C-2, α1E-3, β3-1, β2C, β2D, β2E and β4.
52. A substantially pure human calcium channel subunit selected from among subunits that include the sequence of amino acids set forth in any of SEQ ID NOs. 19, 22, 23, 24, 25, 26, 28, 29, 30 or 36 or functionally equivalent variants thereof.
53. The DNA molecule of claim 4, wherein: the α1E-1 subunit includes the sequence of amino acids set forth in SEQ ID NO. 24 or functionally equivalent variants thereof.
54. The DNA molecule of claim 5, wherein: the α1E-3 includes the sequence of amino acids set forth in SEQ ID NO. 25 or functionally equivalent variants thereof.
55. The DNA molecule of claim 8, wherein: the α1C-2 subunit includes the sequence of amino acids set forth in SEQ ID NO. 36 or functionally equivalent variants thereof.
56. A eukaryotic cell, comprising a DNA molecule of claim 1 or claim 10 or claim 17 or any combination thereof.
57. The DNA molecule of claim 17, wherein the β4 subunit includes the sequence of amino acids set forth in SEQ ID NO. 27 or functionally equivalent variants thereof.
58. An isolated RNA molecule encoded by the DNA molecule of any of claims 1-18 or 53-56.
59. An isolated eukaryotic cell, comprising the RNA of claim 58.
60. An isolated DNA molecule, comprising a sequence of nucleotides that encodes an α1 subunit of a human calcium channel, substantially as hereinbefore described with reference to any one of the Examples.
61. A eukaryotic cell, comprising heterologous DNA that encodes an α1 subunit of a human calcium channel, which DNA is substantially as hereinbefore described with reference to any one of the Examples.
62. An isolated DNA molecule, comprising a sequence of nucleotides that encodes a β2, β3 or β4 subunit of a human calcium channel, substantially as hereinbefore described with reference to any one of the Examples.
63. A eukaryotic cell, comprising heterologous DNA that encodes a β2, β3 or β4 subunit of a human calcium channel, which DNA is substantially as hereinbefore described with reference to any one of the Examples.
Dated 27 May, 1999
SIBIA Neurosciences, Inc.
Patent Attorneys for the Applicant/Nominated Person
SPRUSON & FERGUSON
AU33904/99A 1993-08-11 1999-06-07 Human calcium channel compositions and methods using them Abandoned AU3390499A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
AU33904/99A AU3390499A (en) 1993-08-11 1999-06-07 Human calcium channel compositions and methods using them

Applications Claiming Priority (6)

Application Number Priority Date Filing Date Title
US10553693A 1993-08-11 1993-08-11
US08/105536 1993-08-11
US08/149097 1993-11-05
US08/149,097 US5874236A (en) 1988-04-04 1993-11-05 DNA encoding human calcium channel α-1A, β1, β-2, and β-4 subunits, and assays using cells that express the subunits
AU76322/94A AU707793B2 (en) 1993-08-11 1994-08-11 Human calcium channel compositions and methods using them
AU33904/99A AU3390499A (en) 1993-08-11 1999-06-07 Human calcium channel compositions and methods using them

Related Parent Applications (1)

Application Number Title Priority Date Filing Date
AU76322/94A Division AU707793B2 (en) 1993-08-11 1994-08-11 Human calcium channel compositions and methods using them

Publications (1)

Publication Number Publication Date
AU3390499A true AU3390499A (en) 1999-08-19

Family

ID=27422998

Family Applications (1)

Application Number Title Priority Date Filing Date
AU33904/99A Abandoned AU3390499A (en) 1993-08-11 1999-06-07 Human calcium channel compositions and methods using them

Country Status (1)

Country Link
AU (1) AU3390499A (en)

Similar Documents

Publication Publication Date Title
AU707793B2 (en) Human calcium channel compositions and methods using them
US5429921A (en) Assays for agonists and antagonists of recombinant human calcium channels
US7063950B1 (en) Nucleic acids encoding human calcium channel and methods of use thereof
JP3628691B2 (en) Human neuronal nicotinic acetylcholine receptor compositions and methods for their use
EP0696320B1 (en) Human n-methyl-d-aspartate receptor subunits, nucleic acids encoding same and uses therefor
US5876958A (en) Assays of cells expressing human calcium channels containing α1 β subunits
EP0598840B1 (en) Human calcium channel compositions and methods
US5846757A (en) Human calcium channel α1, α2, and β subunits and assays using them
AU760309B2 (en) Low-voltage activated calcium channel compositions and methods
US5851824A (en) Human calcium channel α-1C/α-1D, α-2, β-1, and γsubunits and cells expressing the DNA
US6528630B1 (en) Calcium channel compositions and methods
US6090623A (en) Recombinant human calcium channel β4 subunits
US7414110B2 (en) Human calcium channel compositions and methods
AU3390499A (en) Human calcium channel compositions and methods using them
US6653097B1 (en) Human calcium channel compositions and methods
US6387696B1 (en) Human calcium channel compositions and methods

Legal Events

Date Code Title Description
PC1 Assignment before grant (sect. 113)

Owner name: MERCK AND CO., INC.

Free format text: THE FORMER OWNER WAS: SIBIA NEUROSCIENCES, INC.

MK5 Application lapsed section 142(2)(e) - patent request and compl. specification not accepted