AU742489B2 - Regulatory DNA sequences of the human catalytic telomerase sub-unit gene, diagnostic and therapeutic use thereof - Google Patents


Publication number
AU742489B2
Authority
AU
Australia
Prior art keywords
dna
recombinant
sequences
gene
intron
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Ceased
Application number
AU22729/99A
Other versions
AU2272999A (en
Inventor
Gustav Hagen
Maresa Wick
Dmitry Zubov
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Bayer AG
Original Assignee
Bayer AG
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Bayer AG
Publication of AU2272999A
Application granted
Publication of AU742489B2
Anticipated expiration
Ceased

Classifications

    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12N MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N9/00 Enzymes; Proenzymes; Compositions thereof; Processes for preparing, activating, inhibiting, separating or purifying enzymes
    • C12N9/10 Transferases (2.)
    • C12N9/12 Transferases (2.) transferring phosphorus containing groups, e.g. kinases (2.7)
    • C12N9/1241 Nucleotidyltransferases (2.7.7)
    • A HUMAN NECESSITIES
    • A01 AGRICULTURE; FORESTRY; ANIMAL HUSBANDRY; HUNTING; TRAPPING; FISHING
    • A01K ANIMAL HUSBANDRY; CARE OF BIRDS, FISHES, INSECTS; FISHING; REARING OR BREEDING ANIMALS, NOT OTHERWISE PROVIDED FOR; NEW BREEDS OF ANIMALS
    • A01K2217/00 Genetically modified animals
    • A01K2217/05 Animals comprising random inserted nucleic acids (transgenic)
    • A HUMAN NECESSITIES
    • A61 MEDICAL OR VETERINARY SCIENCE; HYGIENE
    • A61K PREPARATIONS FOR MEDICAL, DENTAL OR TOILETRY PURPOSES
    • A61K38/00 Medicinal preparations containing peptides

Description

WO 99/33998 PCT/EP98/08216
Regulatory DNA sequences of the gene for the human catalytic telomerase subunit, and their diagnostic and therapeutic use
Structure and function of the chromosome ends
The genetic material of eukaryotic cells is distributed on linear chromosomes. The ends of hereditary units are termed telomeres, derived from the Greek words telos (end) and meros (part, segment). Most telomeres consist of repeats of short sequences which are mainly composed of thymine and guanine (Zakian, 1995). In all the vertebrates which have so far been investigated, the telomeres consist of the sequence TTAGGG (Meyne et al., 1989).
The telomeres have a variety of important functions. They prevent the fusion of chromosomes (McClintock, 1941) and thus the formation of dicentric hereditary units. Such chromosomes having two centromeres can lead to the development of cancer due to loss of heterozygosity or duplication, or loss of genes.
In addition, telomeres serve the purpose of distinguishing intact hereditary units from damaged hereditary units. Thus, yeast cells ceased their cell division when they contained a chromosome without a telomere (Sandell and Zakian, 1993).
Telomeres fulfil another important task in association with the replication of eukaryotic cell DNA. In contrast to the circular genomes of prokaryotes, the linear chromosomes of eukaryotes cannot be completely replicated by the DNA polymerase complex. RNA primers are required to initiate DNA replication. After elimination of the RNA primers, extension of the Okazaki fragments and subsequent ligation, the newly synthesized DNA strand lacks the 5' end since the RNA primer cannot be replaced by DNA at that point. Without special protective mechanisms, the chromosomes would therefore shrink with each cell division ("end-replication problem"; Harley et al., 1990). The non-coding telomere sequences presumably constitute a buffer zone for preventing the loss of genes (Sandell and Zakian, 1993).
In addition to this, telomeres also play an important role in regulating cell ageing (Olovnikov, 1973). Human somatic cells exhibit a limited capacity for replication in culture; after a certain period of time, they become senescent. In this state, the cells no longer divide even after having been stimulated with growth factors; however, they do not die and remain metabolically active (Goldstein, 1990). Various observations support the hypothesis that a cell determines how many more times it can divide on the basis of the length of its telomeres (Allsopp et al., 1992).
In summary, the telomeres consequently possess key functions in the ageing of cells, and in stabilizing the genetic material and preventing cancer.
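The link between telomere length and replicative capacity described above can be illustrated with a toy calculation. This is only a sketch: the initial length, the loss per division, the critical length and the function name `divisions_until_senescence` are hypothetical values chosen for illustration, not figures taken from this document.

```python
def divisions_until_senescence(initial_bp, loss_per_division_bp, critical_bp,
                               telomerase_gain_bp=0, max_divisions=1000):
    """Count cell divisions until the telomere reaches the critical length.

    All parameters are illustrative assumptions. Returns None if the
    critical length is not reached within max_divisions (for example when
    telomerase fully compensates the loss).
    """
    length = initial_bp
    divisions = 0
    while length > critical_bp:
        if divisions >= max_divisions:
            return None
        # Each division loses sequence; telomerase (if active) adds some back.
        length += telomerase_gain_bp - loss_per_division_bp
        divisions += 1
    return divisions


# Hypothetical somatic cell without telomerase: finite number of divisions.
print(divisions_until_senescence(10_000, 100, 5_000))
# Hypothetical cell in which telomerase balances the loss: no limit reached.
print(divisions_until_senescence(10_000, 100, 5_000, telomerase_gain_bp=100))
```

With these assumed numbers the first call gives a finite division count, while the second never reaches the critical length, which mirrors the contrast drawn in the text between somatic cells and cells that stabilize their telomeres.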
The enzyme telomerase synthesizes the telomeres
As described above, organisms which possess linear chromosomes can only replicate their genome incompletely in the absence of a special protective mechanism. Most eukaryotes use a special enzyme, i.e. telomerase, for regenerating the telomere sequences. Telomerase is expressed constitutively in the single-cell organisms which have so far been investigated. On the other hand, telomerase activity has only been measured in humans in germ cells and tumour cells, whereas neighbouring somatic tissue did not contain any telomerase (Kim et al., 1994).
Telomerase can also be designated functionally as terminal telomere transferase, which is located in the cell nucleus as a multiprotein complex. While the RNA moiety of human telomerase has been known for a relatively long period of time (Feng et al., 1995), the catalytic subunit of this enzyme group was recently identified in a variety of organisms (Lingner et al., 1997; cf. our application PCT EP/98/03468 which is likewise pending). These catalytic subunits of telomerase are strikingly homologous both among themselves and in relation to all previously known reverse transcriptases.
WO 98/14592 also describes nucleic acid and amino acid sequences of the catalytic telomerase subunit.
Activation of telomerase in human tumours
It was originally only possible to demonstrate telomerase activity in humans in germ line cells and not in normal somatic cells (Hastie et al., 1990; Kim et al., 1994).
Following the development of a more sensitive detection method (Kim et al., 1994), a low telomerase activity was also detected in hematopoietic cells (Broccoli et al., 1995; Counter et al., 1995; Hiyama et al., 1995). It is true, however, that these cells nevertheless exhibited a reduction in the telomeres (Vaziri et al., 1994; Counter et al., 1995). It has still not been resolved whether the quantity of enzyme in these cells is not sufficient for compensating the telomere loss or whether the telomerase activity which is measured stems from a subpopulation, e.g. incompletely differentiated CD34+38 precursor cells (Hiyama et al., 1995). In order to resolve this, it would be necessary to detect telomerase activity in a single cell.
Interestingly, however, significant telomerase activity was detected in a large number of the tumour tissues which had thus far been tested (1734/2031, 85%; Shay, 1997), whereas no activity was found in normal somatic tissue (1/196, Shay, 1997). In addition, various investigations have shown that the telomeres still shrank in senescent cells which were transformed with viral oncoproteins and it was only possible to detect telomerase in the subpopulation which survived the growth crisis (Counter et al., 1992). The telomeres were also stable in these immortalized cells (Counter et al., 1992). Similar findings from investigations in mice (Blasco et al., 1996) support the assumption that reactivation of the telomerase is a late event in tumorigenesis.
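The proportions quoted from Shay (1997) can be checked with a few lines of arithmetic. The counts are the ones given in the text; the helper name `percent` is only illustrative.

```python
def percent(positive, total):
    """Return the share of telomerase-positive samples as a percentage."""
    return 100.0 * positive / total


tumour = percent(1734, 2031)  # telomerase-positive tumour samples
normal = percent(1, 196)      # telomerase-positive normal somatic samples

print(f"tumour tissue: {tumour:.1f}%")
print(f"normal somatic tissue: {normal:.1f}%")
```

The tumour figure rounds to the 85% stated in the text, and the normal-tissue figure is well below 1%.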
Based on these results, a "telomerase hypothesis" was developed which links the loss of telomere sequences and cell ageing with telomerase activity and the development of cancer. In long-lived species such as humans, the shrinking of the telomeres can be regarded as being a mechanism for suppressing tumours. Differentiated cells which do not contain any telomerase cease their cell division at a particular telomere length.
If such a cell mutates, it can only form a tumour if the cell can extend its telomeres. Otherwise, the cell would continue to lose telomere sequences until its chromosomes became unstable and it was finally destroyed. Telomerase reactivation is presumably the main mechanism used by tumour cells to stabilize their telomeres.
It follows from these observations and considerations that it should be possible to treat tumours by inhibiting the telomerase. Conventional cancer therapies using cytostatic agents or short-wave radiation damage all the dividing cells in the body in addition to the tumour cells. However, since only germ line cells, apart from tumour cells, contain significant telomerase activity, telomerase inhibitors would attack the tumour cells more specifically and consequently elicit fewer undesirable side effects.
Telomerase activity has been detected in all the tumour tissues which have so far been tested, which means that these therapeutic agents could be employed against all types of cancer. The effect of telomerase inhibitors would then set in when the telomeres of the cells had shortened to such an extent that the genome became unstable. Since tumour cells usually possess telomeres which are shorter than those of normal somatic cells, cancer cells would be the first to be eliminated by the telomerase inhibitors. By contrast, cells possessing long telomeres, such as the germ cells, would only be damaged at a much later date. Telomerase inhibitors consequently represent a potential way forward in the treatment of cancer.
It becomes possible to obtain unambiguous answers to the question of the nature and points of attack of physiological telomerase inhibitors once the manner in which expression of the telomerase gene is regulated has also been identified.
Regulation of gene expression in eukaryotes
There are a large number of points in eukaryotic gene expression, i.e. the cellular flow of information from the DNA to the protein by way of the RNA, at which regulatory mechanisms can exert an effect. Examples of individual control steps are gene amplification, the recombination of gene loci, chromatin structure, DNA methylation, transcription, post-transcriptional modifications of mRNA, mRNA transport, translation and post-translational modifications of proteins. Studies which have been carried out to date indicate that control at the level of transcription initiation is of the greatest importance (Latchman, 1991).
A region which is responsible for regulating transcription, and which is designated the promoter region, is located directly upstream of the transcription start of a gene which is transcribed by RNA polymerase II. Comparison of the nucleotide sequences of promoter regions from a large number of known genes shows that particular sequence motifs occur regularly in this region. These elements include, inter alia, the TATA box, the CCAAT box and the GC box, which elements are recognized by specific proteins. The TATA box, which is located about 30 nucleotides upstream of the transcription start, is, for example, recognized by the TFIID subunit TBP ("TATA box-binding protein"), whereas particular GC-rich sequence segments are specifically bound by the transcription factor Sp1 ("specificity protein 1").
The promoter can be functionally subdivided into a regulatory segment and a constitutive segment (Latchman, 1991). The constitutive control region comprises the so-called core promoter which enables transcription to be initiated correctly. This promoter contains the sequence elements which are described as UPEs (upstream promoter elements) which are necessary for efficient transcription. The regulatory control segments, which can be interlaced with the UPEs, possess sequence elements which can be involved in the signal-dependent regulation of transcription by hormones, growth factors, etc. They impart tissue-specific or cell-specific promoter properties.
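A search for the promoter elements named above (TATA box, CCAAT box, GC box) can be sketched as a simple pattern scan. The consensus patterns below are simplified textbook forms chosen for illustration, the toy sequence is invented, and only the forward strand is scanned; none of this reproduces the analysis actually used for the hTC promoter.

```python
import re

# Simplified consensus patterns (assumptions for illustration only).
MOTIFS = {
    "TATA box": "TATA[AT]A",
    "CCAAT box": "CCAAT",
    "GC box": "GGGCGG",
}


def find_motifs(sequence, motifs=MOTIFS):
    """Return (name, 1-based position, matched text) for each motif hit.

    Only the given strand is scanned; a lookahead is used so that
    overlapping occurrences are also reported.
    """
    seq = sequence.upper()
    hits = []
    for name, pattern in motifs.items():
        for match in re.finditer(f"(?=({pattern}))", seq):
            hits.append((name, match.start() + 1, match.group(1)))
    return sorted(hits, key=lambda hit: hit[1])


# Invented toy sequence, not part of the hTC promoter.
toy = "ccaatgggcggacgtatataagg"
for name, position, text in find_motifs(toy):
    print(f"{name:10s} at {position:3d}: {text}")
```

A real analysis would also scan the reverse strand and use position weight matrices rather than fixed strings.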
DNA segments which are able to exert an influence on gene expression over relatively large distances are a characteristic feature of eukaryotic genes. These elements can be located upstream or downstream of a transcription unit, or within the unit, and can perform their function independently of their orientation. These sequence segments may reinforce (enhancers) or attenuate (silencers) promoter activity. In a similar way to the promoter regions, enhancers and silencers also accommodate several binding sites for transcription factors.
The invention relates to the DNA sequences from the 5'-flanking region of the gene for the catalytically active human telomerase subunit and intron sequences for this gene.
The invention particularly relates to the 5'-flanking regulatory DNA sequence which contains the promoter DNA sequence for the gene for the human catalytic telomerase subunit, as depicted in Fig. 10 (SEQ ID NO 3).
The invention furthermore relates to part regions of the 5'-flanking regulatory DNA sequence, as depicted in Fig. 4 (SEQ ID NO 1), which have a regulatory effect.
Intron sequences for the gene for the human catalytic telomerase subunit, in particular those sequences which have a regulatory effect, are also part of the subject-matter of the present invention. The intron sequences according to the invention are described in detail in the context of Example 5 (cf. SEQ ID NO 4, 5, 6, 7, 8, 9, 11, 12, 13, 14, 15, 16, 17, 18, 19 and
The invention furthermore relates to a recombinant construct which comprises the DNA sequences according to the invention, in particular the 5'-flanking DNA sequence of the gene for the human catalytic telomerase subunit, or part regions thereof.
Preference is given to recombinant constructs which, in addition to the DNA sequences according to the invention, in particular the 5'-flanking DNA sequence of the gene for the human catalytic telomerase subunit, or part regions thereof, also contain one or more additional DNA sequences which encode polypeptides or proteins.
According to a particularly preferred embodiment, these additional DNA sequences encode antineoplastic proteins.
Particular preference is given to those antineoplastic proteins which inhibit angiogenesis directly or indirectly. Examples of these proteins are: plasminogen activator inhibitor (PAI-1), PAI-2, PAI-3, angiostatin, endostatin, platelet factor 4, TIMP-1, TIMP-2, TIMP-3 and leukaemia inhibitory factor (LIF).
Antineoplastic proteins which have a direct or indirect cytostatic effect on tumours are likewise particularly preferred. These proteins include, in particular: perforin, granzyme, IL-2, IL-4, IL-12, interferons, such as IFN-α, IFN-β and IFN-γ, TNF, TNF-α, TNF-β, oncostatin M; tumour suppressor genes, such as p53, retinoblastoma.
Particular preference is furthermore given to antineoplastic proteins which, where appropriate in addition to their antineoplastic effect, stimulate inflammations and thereby contribute to the elimination of tumour cells. Examples of these proteins are: RANTES, monocyte chemotactic and activating factor (MCAF), IL-8, macrophage inflammatory protein (MIP-1α, -β), neutrophil activating protein-2 (NAP-2), IL-3, IL-5, human leukaemia inhibitory factor (LIF), IL-7, IL-11, IL-13, GM-CSF, G-CSF and M-CSF.
Particular preference is furthermore given to antineoplastic proteins which, due to their action as enzymes, are able to convert precursors of an antineoplastic active compound into an antineoplastic active compound. Examples of these enzymes are: herpes simplex virus thymidine kinase, varicella zoster virus thymidine kinase, bacterial nitroreductase, bacterial β-glucuronidase, plant β-glucuronidase from Secale cereale, human glucuronidase, human carboxypeptidase, bacterial carboxypeptidase, bacterial β-lactamase, bacterial cytosine deaminase, human catalase and/or phosphatase, human alkaline phosphatase, type 5 acid phosphatase, human lysooxidase, human acid D-aminooxidase, human glutathione peroxidase, human eosinophil peroxidase and human thyroid peroxidase.
The abovementioned recombinant constructs can also contain DNA sequences which encode factor VIII or factor IX, or part fragments thereof. These DNA sequences also include other blood clotting factors.
The abovementioned recombinant constructs can also contain DNA sequences which encode a reporter protein. Examples of these reporter proteins are: chloramphenicol acetyl transferase (CAT), firefly luciferase (LUC), β-galactosidase (β-Gal), secreted alkaline phosphatase (SEAP), human growth hormone (hGH), β-glucuronidase (GUS), green fluorescent protein (GFP) and all the variants derived therefrom, aequorin and obelin.
Recombinant constructs according to the invention can also contain DNA which encodes the human catalytic telomerase subunit and its variants and fragments in the antisense orientation. Where appropriate, these constructs can also contain other protein subunits of the human telomerase and the telomerase RNA component in the antisense orientation.
The recombinant constructs can, in addition to the DNA which encodes the human catalytic telomerase subunit, and its variants and fragments, also contain other protein subunits of the human telomerase and the telomerase RNA component.
The invention furthermore relates to a vector which contains the abovementioned DNA sequences according to the invention, in particular the 5'-flanking DNA sequences and also one or more of the other DNA sequences mentioned above.
The preferred vector for these constructs is a virus, for example a retrovirus, an adenovirus, an adeno-associated virus, a herpes simplex virus, a vaccinia virus, a lentivirus, a Sindbis virus or a Semliki Forest virus.
Preference is also given to using plasmids as vectors.
The invention furthermore relates to pharmaceutical preparations which comprise recombinant constructs or vectors according to the invention; for example a preparation in a colloidal dispersion system.
Examples of suitable colloidal dispersion systems are liposomes or polylysine ligands.
The preparations of the constructs or vectors according to the invention in colloidal dispersion systems can be supplemented with a ligand which binds to the membrane structures of tumour cells. Such a ligand can, for example, be attached to the construct or the vector or else be a component of the liposome structure.
Suitable ligands are, in particular, polyclonal or monoclonal antibodies, or antibody fragments thereof, which bind, by their variable domains, to the membrane structures of tumour cells, or substances carrying mannose terminally, cytokines or growth factors, or fragments or part sequences thereof, which bind to receptors on tumour cells.
Examples of corresponding membrane structures are receptors for a cytokine or a growth factor, such as IL-1, EGF, PDGF, VEGF, TGF-β, insulin or insulin-like growth factor (ILGF), or adhesion molecules, such as SLeX, LFA-1, MAC-1, LECAM-1 or VLA-4, or the mannose-6-phosphate receptor.
The present invention includes pharmaceutical preparations which, in addition to the vector constructs according to the invention, can also comprise non-toxic, inert, pharmaceutically suitable excipients. It is possible to conceive of administering (e.g. intravenously, intraarterially, intramuscularly, subcutaneously, intradermally, anally, vaginally, nasally, transdermally, intraperitoneally, as an aerosol or orally) these preparations at the site of a tumour or administering them systemically.
The vector constructs according to the invention can be employed in gene therapy.
The invention furthermore relates to a recombinant host cell, in particular a recombinant eukaryotic host cell, which harbours the above-described constructs or vectors.
The invention furthermore relates to a process for identifying substances which affect the promoter activity, silencer activity or enhancer activity of the catalytic telomerase subunit, with this process comprising the following steps:
A. adding a candidate substance to a host cell which harbours the regulatory DNA sequence according to the invention, in particular the regulatory DNA sequence for the gene for the human catalytic telomerase subunit, or a part region thereof which has a regulatory effect, which sequence or part region is functionally linked to a reporter gene, and
B. measuring the effect of the substance on expression of the reporter gene.
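Step B of this process amounts to comparing reporter gene expression with and without the candidate substance. The following sketch shows one way such readings could be evaluated; the reporter readings, the twofold threshold and the names `fold_change` and `classify` are hypothetical and serve only to illustrate the comparison.

```python
from statistics import mean


def fold_change(treated, control):
    """Mean reporter signal with substance divided by mean signal without."""
    return mean(treated) / mean(control)


def classify(fc, threshold=2.0):
    """Label a substance by its effect; the twofold threshold is an assumption."""
    if fc >= threshold:
        return "increases reporter expression"
    if fc <= 1.0 / threshold:
        return "inhibits reporter expression"
    return "no clear effect"


# Hypothetical reporter readings (arbitrary light units), triplicate wells.
control = [1000, 1100, 900]
candidates = {
    "substance A": [250, 240, 260],
    "substance B": [2100, 1900, 2000],
    "substance C": [1050, 950, 1000],
}

for name, readings in candidates.items():
    fc = fold_change(readings, control)
    print(f"{name}: fold change {fc:.2f}, {classify(fc)}")
```

In practice the reporter signal would usually be normalized, for example to a co-transfected control reporter, before such a comparison is made.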
The process can be employed for identifying substances which increase the promoter activity, silencer activity or enhancer activity of the catalytic telomerase subunit.
The process can furthermore be employed for identifying substances which inhibit the promoter activity, silencer activity or enhancer activity of the catalytic telomerase subunit.
The invention furthermore relates to a process for identifying factors which bind specifically to fragments of the DNA fragments according to the invention, in particular the 5'-flanking regulatory DNA sequence of the catalytic telomerase subunit. This method comprises screening an expression cDNA library using the above-described DNA sequence, or subfragments of widely differing length, as the probe.
The above-described constructs or vectors can also be used for preparing transgenic animals.
The invention furthermore relates to a process for detecting telomerase-associated conditions in a patient, which process comprises the following steps:
A. incubating a construct or vector, which contains the DNA sequence according to the invention, in particular the 5'-flanking regulatory DNA sequence for the gene for the human catalytic telomerase subunit, or a part region thereof having a regulatory effect, and a reporter gene, with body fluids or cell samples,
B. detecting the activity of the reporter gene in order to obtain a diagnostic value; and
C. comparing the diagnostic value with standard values for the reporter gene construct in standardized normal cells or body fluids of the same type as the test sample.
The detection of diagnostic values which are higher or lower than the standard comparative values indicates a telomerase-associated condition, which in turn indicates a pathogenic condition.
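Step C, the comparison of a diagnostic value with standard values, can be sketched as follows. The text does not specify how the comparison is made; the reference range of mean plus or minus two standard deviations, the example readings and the name `assess` are assumptions made purely for illustration.

```python
from statistics import mean, stdev


def assess(sample_value, standard_values, n_sd=2.0):
    """Compare a diagnostic value with a range derived from standard values.

    The range mean +/- n_sd standard deviations is an assumed convention,
    not one prescribed by the described process.
    """
    centre = mean(standard_values)
    spread = stdev(standard_values)
    lower, upper = centre - n_sd * spread, centre + n_sd * spread
    if sample_value > upper:
        return "higher than standard"
    if sample_value < lower:
        return "lower than standard"
    return "within standard range"


# Hypothetical reporter activities measured in standardized normal cells.
standard = [95, 100, 105, 98, 102]

for value in (150, 60, 101):
    print(value, "->", assess(value, standard))
```

Values flagged as higher or lower than the standard would, in the terms of the process above, point to a telomerase-associated condition.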
Explanation of the figures:
Fig. 1: Southern blot analysis using genomic DNA from various species
A: Photograph of an ethidium bromide-stained 0.7% agarose gel containing approximately 4 µg of Eco RI-cut genomic DNA. Track 1 contains Hind III-cut λ DNA as size markers (23.5, 9.4, 6.7, 4.4, 2.3, and 0.6 kb). Tracks 2 to 10 contain human, rhesus monkey, Sprague Dawley rat, BALB/c mouse, dog, bovine, rabbit, chicken and yeast (Saccharomyces cerevisiae) genomic DNA.
B: Autoradiogram, corresponding to Fig. 1 A, of a Southern blot analysis in which a radioactively labelled hTC cDNA probe of about 720 bp in length is used for the hybridization.
Fig. 2: Restriction analysis of the recombinant λ DNA of the phage clone P12, which hybridizes with a probe from the 5' region of the hTC cDNA.
The figure shows a photograph of an ethidium bromide-stained 0.4% agarose gel. Tracks 1 and 2 contain Eco RI/Hind II-cut λ DNA and a 1 kb ladder from Gibco as size markers. Tracks 3 to 7 each contain 250 ng of the DNA from the recombinant phage which has been cut with Bam HI (track 3), Eco RI (track 4), Sal I (track 5), Xho I (track 6) and Sac I (track 7). The arrows mark the two λ arms of the vector EMBL3 Sp6/T7.
Fig. 3: Restriction analysis and Southern blot analysis of the recombinant λ DNA of the phage clone which hybridizes with a probe from the region of the hTC cDNA.
A: The figure shows a photograph of an ethidium bromide-stained 0.8% agarose gel. Tracks 1 and 15 contain a 1 kb ladder from Gibco as size markers. Tracks 2 to 14 each contain 250 ng of cut λ DNA from the recombinant phage clone. The following enzymes were employed: track 2: Sac I, track 3: Xho I, track 4: Xho I, Xba I, track 5: Sac I, Xho I, track 6: Sal I, Xho I, Xba I, track 7: Sac I, Xho I, Xba I, track 8: Sac I, Sal I, Xba I, track 9: Sac I, Sal I, BamH I, track 10: Sac I, Sal I, Xho I, track 11: Not I, track 12: Sma I, track 13: empty, track 14: not digested.
B: Autoradiogram, corresponding to Fig. 3 A, of a Southern blot analysis. A 5'-hTC cDNA fragment of about 420 bp in length was used as the probe for the hybridization.
Fig. 4: Partial DNA sequence of the 5'-flanking region and of the promoter of the gene for the human catalytic telomerase subunit. The ATG start codon in the sequence is printed in bold. The depicted sequence corresponds to SEQ ID NO 1.
Fig. 5: Use of primer extension analysis to identify the transcription start.
The figure shows an autoradiogram of a denaturing polyacrylamide gel which was selected for depicting a primer extension analysis. An oligonucleotide having the sequence 5'GTTAAGTTGTAGCTTACACTGGTTCTC 3' was used as the primer.
The primer extension reaction was loaded in track 1. Tracks G, A, T and C constitute the sequence reactions using the same primer and the corresponding dideoxynucleotides. The thick arrow marks the main transcription start while the thin arrows point to three subsidiary transcription start points.
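Because the primer anneals to the RNA it is extended on, its reverse complement gives the stretch of sense sequence to which it binds. The sketch below derives that sequence and the GC content for the primer quoted above; the function names are illustrative, and the calculation is an aid to reading the legend rather than part of the original analysis.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def gc_count(seq):
    """Count G and C residues in a DNA sequence."""
    seq = seq.upper()
    return seq.count("G") + seq.count("C")


# Primer sequence as given in the figure legend (5' to 3').
PRIMER = "GTTAAGTTGTAGCTTACACTGGTTCTC"

target = reverse_complement(PRIMER)
print("primer length:", len(PRIMER))
print("binding site (sense strand, 5' to 3'):", target)
print(f"GC content: {100.0 * gc_count(PRIMER) / len(PRIMER):.1f}%")
```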
Fig. 6: cDNA sequence of the human catalytic telomerase subunit (hTC; cf. our pending application PCT/EP/98/03468). The depicted sequence corresponds to SEQ ID NO 2.
Fig. 7: Structural organization and restriction map of the human hTC gene and its 5'-flanking and 3'-flanking regions.
Exons are shown as consecutively numbered rectangles which are filled in in black, and introns are shown as regions which are not filled in. Untranslated sequence segments in the exons are hatched. Translation starts in exon 1 and ends in exon 16. Restriction enzyme cleavage sites are marked as follows: S, SacI; X, XhoI. The relative arrangement of the five phage clones (P2, P3, P5, P12, P17), and of the product from the genome walking, are shown by thin lines. As the dots indicate, the sequence of intron 16 has only been partly deciphered.
Fig. 8: hTC splice variants.
A: Diagrammatic structure of the hTC mRNA splice variants. The complete hTC mRNA is depicted as a rectangle with a grey background in the upper region of the figure. The 16 exons are depicted in accordance with their size. The translation start (ATG) and the stop codon, and also the telomerase-specific T motif, and the seven RT motifs, are all shown.
The hTC variants are subdivided into deletion and insertion variants. The missing exon sequences are marked in the deletions. The insertions are shown by additional white rectangles. The sizes and origins of the inserted sequences are given. Newly formed stop codons are marked. The size of the insertion in variant INS2 is unknown.
B: Exon-intron transitions in the hTC splice variants. Unspliced flanking and 3'-flanking sequences are shown as white rectangles. The origins of the exon and intron sequences are given. Intron and exon sequences are shown in small letters and large letters, respectively. The donor and acceptor sequences in the splice sites are underlaid as grey rectangles, and their exon and intron origins are also given.
Fig. 9: Identification of the transcription start by means of RT-PCR analysis.
The RT-PCR was carried out using a cDNA library prepared from HL cells and genomic DNA as the positive control. A common 3' primer hybridizes to a region of the exon 1 sequence. The positions of the different 5' primers in the coding region or the 5'-flanking region are given. In the negative control, no template DNA was added to the PCR reaction. M: DNA size marker.
Fig. 10: Nucleotide sequence and structural features of the hTC promoter.
The figure depicts 11273 bp of the 5'-flanking hTC gene sequence, beginning with the translation start codon ATG. The putative region of the translation start is underlined. Possible regulatory sequence segments within the 4000 bp upstream of the translation start are ringed.
The depicted sequence corresponds to SEQ ID NO 3.
Fig. 11: Activity of the hTC promoter in HEK-293 cells.
The first 5000 bp of the 5'-flanking hTC gene region are shown diagrammatically in the upper part of the figure. The ATG start codon is picked out. CpG-rich islands are marked by grey rectangles. The sizes of the hTC promoter-luciferase construct are shown on the left-hand side of the figure. The promoterless pGL2 basic construct and the promoter construct pGL2-Pro were used as controls in each transfection.
The relative luciferase activities of the different promoter constructs in HEK cells are shown as continuous bars on the right-hand side of the figure. The standard deviation is indicated. The numerical values represent the average of two independent experiments which were carried out in duplicate.
Tab. 1: Exon-intron transitions in the hTC gene
The table lists the nucleotide sequences at the 3' and 5' splice transitions of the hTC gene. The consensus sequences for donor and acceptor sequences (AG and GT) are underlaid with grey rectangles. The table shows the intron sequences (small letters) and exon sequences (large letters) which flank the splice acceptor and donor sites. The sizes of the exons and introns are given in bp.
Tab. 2: Potential binding sites for DNA-binding factors in the nucleotide sequence of intron 2
The search for possible DNA-binding factors (e.g. transcription factors) was carried out using the "find pattern" algorithm from the Genetics Computer Group (Madison, USA) GCG sequence analysis program package. The table lists the abbreviations of the DNA-binding factors which were identified and their location in intron 2.
Tab. 1
Exon | 3' acceptor sequence (intron / EXON)    | Exon size (bp) | 5' donor sequence (EXON / intron)        | Intron | Intron size (bp)
1    | 5' flanking region GTTTCAGGCAGCGCTGCGT  | 281            | CGCCCCCTCCTTCCGCCAG gtgggcctccccggggtcg  | 1      | 104
2    | cagggcgcttcccccgcag GTGTCCTGCCTGAAGGAGC | 1354           | TGGCTGCGCAGGAGCCCAG gtaggaggtggtggccgt   | 2      | 8616
3    | catgtccttctcgtttaag GGGTTGGCTGTGTTCCGGC | 196            | TGCAAAGCATTGGAATCAG gtactgtatccccacgcca  | 3      | 2089
4    | gaggggctctctattgaag ACAGCACTTGAAGAGGGTG | 181            | GTTCCGCAGAGAAAAGAGG gtggctgtgctttggttta  | 4      | 687
5    | cccatgctgtccccgcag GCCGAGCGTCTCACCTCGA  | 180            | TGAGCTGTACTTTGTCAAG gtgggtgccggggaccccc  | 5      | 494
6    | ctcgcctccactcacaag GTGGATGTGACGGGCGCGT  | 156            | CAAGGCCTTCAAGAGCCAC gt?taggttcacgtgtgata | 6      | >4660
7    | ccctctcctctgccggcag GTCTCTACCTTGACAGACC | 96             | TGCCGTCGTCATCGAGCAG gt?tgggcactgccctgca  | 7      | 980
8    | ctcccgtctggtttcgag AGCTCCTCCCTGAATGAGG  | 86             | CCGTGCGCATCAGGGGCAA gtagtcaggtggccaggt   | 8      | 2485
9    | ctgtgtcttcccgccccag GTCCTACGTCCAGTGCCAG | 114            | CGGGGATTCGGCGGGACGG gaggcctcctcttcccc    | 9      | 1984
10   | gtattttcccttattttag GCTGCTCCTGCGTTTGGTG | 72             | ACGCGAAAACCTTCCTCAG gtaggcccgtgccgtgtg   | 10     | 1871
11   | cattgcccctctgcctag GACCCTGGTCCGAGGTGTC  | 189            | TGCAGAGCGACTACTCCAG gtagcgcacctggccgga   | 11     | 3801
12   | attcccccctgtgtctag CTATGCCCGGACCTCCATC  | 127            | CCTGTTTCTGGATTTGCAG gtagcaggctgatggtca   | 12     | 880
13   | tctttcttggcgactctag GTGAACAGCCTCCAGACGG | 62             | TCCTGCTGCAGGCGTACAG gtagccgccaccaagggg   | 13     | 3187
14   | ctgtccgccatcctctcag GTTTCACGCATGTGTGCTG | 125            | CTGAAAGCCAAGAACGCAG gt?tgtgcaggtgcctggc  | 14     | 781
15   | agcctctgttttccccag GGATGTCGCTGGGGGCCAA  | 138            | CTGGGGTCACTCAGGACAG gtagtgtgggtggaggcc   | 15     | 536
16   | tctgattttggccccgag CCCAGACGCAGCTGAGTCG  | 664            | TTTTTCAGTTTTGAAAAAA 3' flanking region   |        |

Tab. 2
Factor                  | Location in intron 2
C/EBP                   | 2925
CRE.2                   | 2749
Sp1                     | 2378, 4094, 4526, 4787, 4835, 4995
AP-2 CS3                | 5099
AP-2 CS4                | 2213, 3699, 4667, 5878, 5938, 6059, 6180, 6496
AP-2 CS5                | 5350, 5798, 5880, 5940, 6061, 6182, 6375, 6498
PEA3                    | 934, 2505
P53                     | 2125
GR uteroglobin          | 848, 1487, 2956
PR uteroglobin          | 3331
Zeste-white             | 1577, 1619, 1703, 1745, 1787, 1829, 1871, 1913, 1955, 1997, 2039, 2081, 3518, 3709, 4765, 5014, 5055
GRE                     | 846
MyoD-MCK right site/rev | 447, 509, 558, 1370, 1595, 1900, 2028, 2099, 4557
MyoD-MCK left site      | 108, 118, 453, 1566, 1608, 1692, 1734, 1818, 1902, 1986, 2372, 2460, 2720, 3491, 5030
Ets-1 CS                | 6408
AP                      | 3784, 4406
CREB                    | 2801
GATA-1                  | 839, 1390, 3154
c-Myc                   | 108, 118, 453, 1566, 1608, 1692, 1734, 1818, 1902, 1986, 2372, 2460, 2720, 3491, 5030
CACCC site              | 991
CCAAT site              | 1224
CCAC box                | 992
CAAT site               | 463, 2395
Rb site                 | 992, 4663
TATA                    | 3650
CDEI                    | 106, 1564, 1606, 1690, 1732, 1816, 1900, 1984

Examples
The human gene for the catalytic telomerase subunit (ghTC), and the regions of this gene located 5' and were cloned, while the start point for transcription was determined, potential binding sites for DNA-binding proteins were identified and active promoter fragments were highlighted. The sequence of the hTC cDNA (Fig. 6) has already been reported in our application PCT/EP/98/03468, which is also pending. Unless otherwise mentioned, all the data refer to the position of the cDNA in this sequence.
Example 1 A genomic Southern blot analysis was used to determine whether ghTC constitutes a single gene in the human genome or whether there exist several loci for the hTC gene and possibly also ghTC pseudogenes.
In order to do this, a commercially available zoo blot from Clontech was subjected to Southern blot analysis. This blot contains 4 µg of Eco RI-cut genomic DNA from nine different species (human, monkey, rat, mouse, dog, bovine, rabbit, chicken and yeast). With the exception of yeast, chicken and human, the DNA was isolated from kidney tissue. The human genomic DNA was isolated from placenta and the chicken genomic DNA was purified from liver tissue. An hTC cDNA fragment of about 720 bp in length, which was isolated from hTC cDNA, variant Del2 (position 1685 to 2349 plus 2531 to 2590 in Fig. 6 [deletion 2; cf. Example 5 in Fig. was used as the radioactively labelled probe in the autoradiogram in Fig. 1. The experimental conditions for the blot hybridization and washing steps were taken from Ausubel et al. (1987).
In the case of the human DNA, the probe recognizes two specific DNA fragments.
The smaller Eco RI fragment, of from about 1.5 to 1.8 kb in length, probably originates from two Eco RI cleavage sites in an intron in the ghTC DNA. On the basis of this result, it is to be assumed that only one single ghTC gene is present in the human genome.
Example 2 In order to isolate the 5' flanking hTC gene sequence, approx. 1.5 x 10^6 phages from a human genomic placenta gene library (EMBL 3 SP6/T7 from Clontech, order number HL1067j) were hybridized on nitrocellulose filters (0.45 µm; from Schleicher and Schuell), in accordance with the manufacturer's instructions, with a radioactively labelled 5'-hTC cDNA fragment of about 500 bp in length (position 839 to 1345 in Fig. The nitrocellulose filters were firstly incubated, at 42 °C for two hours, in 2 x SSC (0.3 M NaCl; 0.5 M Tris-HCl, pH 8.0) and then in a prehybridization solution (50% formamide; 5 x SSPE, pH 7.4; 5 x Denhardt's solution; 0.25% SDS; 100 µg of herring sperm DNA/ml). For the overnight hybridization, the prehybridization solution was supplemented with 1.5 x 10^6 cpm of denatured, radioactively labelled probe/ml of solution. Nonspecifically bound radioactive DNA was removed under stringent conditions, i.e. by means of three five-minute steps of washing with 2 x SSC; 0.1% SDS at from 55 to 65 °C. The filters were evaluated by autoradiography.
The phage clones which were identified in this primary investigation were purified (Ausubel et al. (1987)). In subsequent analyses, one phage clone, i.e. P12, turned out to be potentially positive. A λ DNA preparation carried out on this phage (Ausubel et al. (1987)), and the subsequent restriction digestion with enzymes which release the genomic insert in fragments, showed that this phage clone contains an insert of approx. 15 kb in the vector (Fig. 2).
In order to isolate the complete hTC gene sequence, in each case from 1 to 1.5 x 10^6 phages were screened, in independent experiments, with in each case different radioactively labelled probes, as described above.
The phage clones which were identified in these primary investigations, and which were positive for the corresponding probes, were purified. The phage clone P17 was found to contain an hTC cDNA fragment of about 250 bp in length (position 1787 to 2040 in Fig. The phage clone P2 was identified as containing an hTC cDNA fragment of about 740 bp in length (position 1685 to 2349 plus 2531 to 2607 in Fig. 6 [deletion 2; cf. Example The phage clones P3 and P5 were found to contain a 3' hTC cDNA fragment of 420 bp in length (position 3047 to 3470 in Fig. After the λ DNA had been prepared from these phages, and subsequently subjected to restriction digestion with enzymes which release the genomic insert in fragments, the inserts were subcloned into plasmids (Example 4).
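The approximate probe lengths quoted in Example 2 can be checked against the cDNA positions given in the text. The following is a minimal sketch, assuming the quoted positions are inclusive coordinates on the hTC cDNA; the dictionary labels and the helper name `span_length` are illustrative and do not appear in the original.

```python
# cDNA positions quoted in Example 2, taken as inclusive start/end pairs
# (the inclusive convention is an assumption, not stated in the text).
PROBES = {
    "5' probe (about 500 bp)": [(839, 1345)],
    "P17 probe (about 250 bp)": [(1787, 2040)],
    "P2 probe, variant Del2 (about 740 bp)": [(1685, 2349), (2531, 2607)],
    "P3/P5 3' probe (420 bp)": [(3047, 3470)],
}


def span_length(segments):
    """Total length in bp of a probe made of inclusive cDNA segments."""
    return sum(end - start + 1 for start, end in segments)


for label, segments in PROBES.items():
    print(f"{label}: {span_length(segments)} bp from quoted positions")
```

Under this convention the computed lengths (507, 254, 742 and 424 bp) agree with the rounded values stated in the text.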
Example 3 In order to investigate whether the 5' end of the hTC cDNA was also present in the insert in the recombinant phage clone P12, the λ DNA from this clone was hybridized, in a Southern blot analysis, with a radioactively labelled hTC cDNA fragment of about 440 bp in length (position 1 to 440 in Fig. 6) from the extreme 5' region (Fig. 3).
Since the isolated λ DNA from the positive clone also hybridizes with the extreme 5' end of the hTC cDNA, this phage probably also contains the 5' sequence region flanking the ATG start codon.
Example 4 In order to subclone the entire 15 kb insert in the positive phage clone P12 in the form of subfragments, and subsequently to sequence these fragments, restriction endonucleases which, on the one hand, release the entire insert from EMBL3 Sp6/T7 (cf. Example 2) and, in addition, cut within the insert, were selected for digesting the DNA.
In all, two Xho I subfragments, of about 8.3 and about 6.5 kb in length, respectively, and three Sac I subfragments, of about 8.5, about 3.5 and about 3 kb in length, respectively, were subcloned into the pBluescript vector (from Stratagene).
The 5123 bp 5'-flanking nucleotide sequence of the ghTC gene region, starting from the ATG start codon, was determined by analysing the sequences of these fragments (Fig. 4; corresponding to SEQ ID NO Fig. 4 depicts the first 5123 bp (starting from the ATG start codon). Fig. 10 depicts the entire cloned 5' sequence (corresponding to SEQ ID NO 3).
In order to subclone the entire insert, of approx. 14.6 kb in size, in phage clone P17 in the form of subfragments, restriction endonucleases which, on the one hand, release the entire insert from EMBL3 Sp6/T7 and, in addition, cut a few times within the insert, were selected for digesting the DNA. Three XhoI/BamHI fragments, of 7.1 kb, 4.2 kb and 1.5 kb in size, respectively, and one BamHI fragment, of 1.8 kb in size, were subcloned by means of a combination digestion with the enzymes XhoI and BamHI. Combination restriction digestion with the enzymes XhoI and XbaI resulted in a XhoI/XbaI fragment of 6.5 kb in size, and two XhoI fragments, of 6.5 kb and 1.5 kb in size, respectively, being cloned.
Digestion with the restriction enzyme XhoI was used to subclone the insert, of approx. 17.9 kb in size, in phage clone P2 in the form of subfragments. In all, three XhoI subfragments, of 7.5 kb, 6.4 kb and 1.6 kb in length, respectively, were cloned.
Four SacI fragments, of 4.8 kb, 3 kb, 2 kb and 1.8 kb in size, respectively, were additionally subcloned by digesting with the restriction enzyme SacI.
The insert, of approx. 13.5 kb in size, in phage clone P3 was subcloned by digesting with the restriction enzymes SacI and/or XhoI. Six SacI subfragments, of 3.2 kb, 2 kb, 0.9 kb, 0.8 kb, 0.65 kb and 0.5 kb in length, respectively, and two XhoI subfragments, of 6.5 kb and 4.3 kb in length, respectively, were obtained in this connection.
The insert, of approx. 13.2 kb in size, in phage clone P5 was subcloned by digesting with the restriction enzymes SacI and/or XhoI. In all, SacI fragments of 6.5 kb, 3.3 kb, 3.2 kb, 0.8 kb and 0.3 kb in size, and XhoI fragments of 7 kb and 3.2 kb in size, were subcloned.
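As a plausibility check on the subcloning described in Example 4, the fragment sizes of a digest can be summed and compared with the stated insert size. The sketch below uses the values given for phage clones P12 and P17; the data layout and the function name `digest_total_bp` are assumptions made for illustration, and the sizes are approximate gel estimates, so agreement only suggests, and does not prove, that a digest covers the whole insert.

```python
# Fragment sizes in bp, converted from the kb values quoted in Example 4.
DIGESTS = {
    ("P17", "XhoI + BamHI"): [7100, 4200, 1500, 1800],
    ("P12", "SacI"): [8500, 3500, 3000],
    ("P12", "XhoI"): [8300, 6500],
}

# Approximate insert sizes in bp as stated in the text.
INSERT_BP = {"P17": 14600, "P12": 15000}


def digest_total_bp(fragments):
    """Sum of all fragment sizes recovered from one digest."""
    return sum(fragments)


for (clone, enzymes), fragments in DIGESTS.items():
    total = digest_total_bp(fragments)
    difference = INSERT_BP[clone] - total
    print(f"{clone} {enzymes}: {total} bp recovered, "
          f"{difference} bp short of the stated insert")
```

For P17, the four fragments from the XhoI/BamHI digestion add up exactly to the stated 14.6 kb; for P12, the SacI fragments add up to about 15 kb and the XhoI fragments to about 14.8 kb.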
In order to clone the hTC genomic sequence region located 3' of phage clone P17 and 5' of phage clone P2, 3 genomic walkings were carried out using the Clontech GenomeWalker™ kits (catalogue number K1803-1) and various combinations of primers. In a final volume of 50 µl, 10 pmol of dNTP mix were added to 1 µl of human GenomeWalker Library HDL (from Clontech), and a PCR reaction was carried out in 1x Klen Taq PCR reaction buffer and 1x Advantage Klen Taq polymerase mix (from Clontech). 10 pmol of an internal gene-specific primer, and pmol of the adaptor primer AP1 (5'-GTAATACGACTCACTATAGGGC-3'; from Clontech) were added as primers. The PCR was carried out in 3 steps as a touchdown PCR. First of all, denaturation was carried out at 94 °C for 20 sec, and the primers were then annealed, and the DNA chain extended, at 72 °C for 4 min, over 7 cycles.
There then followed 37 cycles in which the DNA was denatured at 94 °C for 20 sec but the subsequent primer extension took place at 67 °C for 4 min. Finally, there followed a chain extension at 67 °C for 4 min. After this first PCR, the PCR product was diluted 1:50. One µl of this dilution was used in a second nested PCR together with 10 pmol of dNTP mix in 1x Klen Taq PCR reaction buffer and 1x Advantage Klen Taq polymerase mix and also 10 pmol of a nested gene-specific primer and 10 pmol of the nested Marathon Adaptor primer AP2 (5'-ACTATAGGGCACGCGTGGT-3'; from Clontech). The PCR conditions corresponded to the parameters which were selected in the first PCR. As the sole exception, only 5 cycles rather than 7 cycles were selected in the first PCR step and only 24 cycles, instead of 37 cycles, were run in the second PCR step. The products of this nested genomic walking PCR were cloned into the TA Cloning Vector pCRII from Invitrogen.
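The two touchdown programmes described above differ only in their cycle counts, which can be made explicit in a small data structure. This is a minimal sketch, assuming that only the stated hold times count (ramp times between temperatures are ignored); the names `Stage`, `touchdown_program`, `total_cycles` and `hold_seconds` are illustrative and are not taken from the kit documentation.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Stage:
    cycles: int
    denature_s: int       # hold at 94 °C; 0 if the stage has no denaturation
    extend_temp_c: int    # combined annealing/extension temperature
    extend_s: int         # hold at the extension temperature


def touchdown_program(first_cycles, second_cycles):
    """Three-step programme: 72 °C stage, 67 °C stage, final 67 °C extension."""
    return [
        Stage(first_cycles, 20, 72, 240),
        Stage(second_cycles, 20, 67, 240),
        Stage(1, 0, 67, 240),
    ]


def total_cycles(program):
    """Number of amplification cycles (stages that include denaturation)."""
    return sum(stage.cycles for stage in program if stage.denature_s > 0)


def hold_seconds(program):
    """Sum of all hold times, ignoring temperature ramps."""
    return sum(stage.cycles * (stage.denature_s + stage.extend_s)
               for stage in program)


FIRST_PCR = touchdown_program(7, 37)    # first PCR
NESTED_PCR = touchdown_program(5, 24)   # nested PCR

for name, program in (("first PCR", FIRST_PCR), ("nested PCR", NESTED_PCR)):
    print(f"{name}: {total_cycles(program)} cycles, "
          f"{hold_seconds(program) / 60:.1f} min of hold time")
```

With the stated parameters, the first PCR comprises 44 cycles and the nested PCR 29 cycles.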
In the first genomic walking, the gene-specific primer C3K2-GSP1 (5'-GACGTGGCTCTTGAAGGCCTTG-3') and the nested gene-specific primer C3K2-GSP2 (5'-GCCTTCTGGACCACGGCATACC-3') were used, together with the HDL library 4, and a PCR fragment of 1639 bp in length was obtained. In the second genomic walking, a PCR fragment of 685 bp in length was amplified from the HDL library 4 using the gene-specific primer C3F2 (5'-CGTAGTTGAGCACGCTGAACAGTG-3') and the nested gene-specific primer C3F (5'-CCTTCACCCTCGAGGTGAGACGCT-3'). The third genomic walking mixture, using the gene-specific primer DEL5-GSP1 (5'-GGTGGATGTGACGGGCGCGTACG-3') and the nested gene-specific primer C5K-GSP1 (5'-GGTATGCCGTGGTCCAGAAGGC-3'), led to a 924 bp PCR fragment being cloned from the HDL library 1. In all, 2100 bp of the genomic hTC region located 3' of phage clone P17 were identified using this genomic walking method (see Fig. 7).
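The nested primers C3K2-GSP2 and C5K-GSP1 listed above are reverse complements of one another, i.e. they sit on the same site but point in opposite directions, which is what one would expect for walks proceeding both ways from a known sequence. The short sketch below verifies this from the primer sequences given in the text; the function name `reverse_complement` is illustrative.

```python
# Nested gene-specific primers as quoted in the text (5' to 3').
C3K2_GSP2 = "GCCTTCTGGACCACGGCATACC"
C5K_GSP1 = "GGTATGCCGTGGTCCAGAAGGC"

# Translation table for the four unambiguous DNA bases.
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence written 5' to 3'."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


print(reverse_complement(C3K2_GSP2) == C5K_GSP1)
```

The comparison evaluates to `True`, so the two primers anneal to opposite strands of the same 22-nucleotide site.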
The subcloned fragments, and the genomic walking products, were sequenced in single-stranded form. The Lasergene Biocomputing Software (DNASTAR Inc., Madison, Wisconsin, USA) was used to identify overlapping regions and form contigs. In all, 2 large contigs were assembled from the sequences collected from phage clones P12, P17, P2, P3 and P5, and also the sequence data from the genomic walking. Contig 1 consists of sequence data from phage clones P12 and P17 and the sequence data from the genomic walking. Contig 2 was put together from the sequences from phage clones P2, P3 and P5. Overlapping phage clone regions are shown diagrammatically in Fig. 7. The sequence data from the 2 contigs are shown below. The ATG start codon in contig 1 is underlined. The TGA stop codon is underlined in contig 2.
Con tigi: ACTTGAGCCC AAGAGTTCAA ATGAGACCCT GTCTCAAAAA ACAAAACCAG AAATCAACAA TTCTGAATGA CCAGTGAGTC CGGAAACATA ACCTCTCAAA AGCAGCTACA TCAAAAAAGT GCCAAGGCGG GCAGATCGCC CTACTAAAAA TACAAAATTA GCAGGATAAC CGCTTGAACC TGGTGTAACAA GAGTGAAACC GATGCACCTT AAAGAACTAG AAGATCAGAG CAGAAATAAA TTTTGAAAAG ATAAACAAAA ATAAAGTCAG AGATGAAAAA CTATGAGCAA CTGTACACTA CTACCAAGAT TGAACCATGA AATAAAAAGT CTCCTAGCAA AAAGAAGAAT GAATTCCAAT TCTACATGGC CAGTATTACC CAGAAAGAAA GAAAACTACA GCAAACCAAA TTAAACAACA AAGGATGGTT CAACATATGC TATGATTATT TCACTTTATG AAAAACCAGG TATACAAGAA AGGCCAAGGT GGGATGATTG CTACAAAAAA CTTTTTTAAA GCTGAGGTGG GAGAATCACT CCAGCCTAGA CAACAGAACA AAGGGAGGAG GAGGAGAAGG AAGAAACATA TTTCAACATA AGCCTTTCCT CTAAGATCTG AGTCCTAGCT AGAGCAATCA TTATCCTGTT TGCAGATGAT GCTGAAATTT GGTACAGCAG AAACAATCTG AAAAAGAAAC GTGAAAGATC TCTACAATGA AGATATTCCA TGTTCATAGA AAATTCAATG CAATCCCTAT CACATTACCT GACTTCAAAT AGATGAGACA TGGACCAGAG TTTTTGACAA AGGTGCCAAG CTGGATATCC ATATGCAAAA GGATGAAAGG CTTAAATCTA GGACATTGGA GTGGGCAAAG AAATGGGATC ATATCAAGTT CCACAGAATG GGAGAATATA GCTCAAACTA CTCTATAAGA CATTTCTCAA AATAAGTCAT GAGAAATGCA AATCAAAACT GGCTACGGTG AGCCATGATT GCAACACCAC ACGCCAGCCT TGGTGACAGA AAAAAAAAAA AATTGAAATA ATATAAAGCA TCTTCTCTGG CCACAGTGGA CAAGAGGAAT TTTGAAAACT ATACAAACAC ATGAAAATTA AACAATATAC AATGAAGAAA TTAAAAAGGA AATTGAAAAA TTTATTTAAG CAAATGATAA ACCCACGGTA TACAGCAAAA GCAGTGCTAA GAAGGAAGTT TATAGCTATA AGAAAAGCCA GGCGCAGTGG CTCATGCCTG TAATCCCAGC ACTTTGGGAG TGAGGTCAGG AGTTCGAGAC CAGCCTGACC AACACAGAGA AACCTTGTCG GCTGGGCATG GTGGCACATG CCTGTAATCC CAGCTACTCG GGAGGC'rGAG CAGGAGGTGG AGGTTGCGGT GAGCCGGGAT TGCGCCATTG GACTCCAGCC CTGTCTCAAG AAAAAAAAAA AAGTAGAAAA ACTTAAAAAT ACAACCTAAT AAAAGCAAGA GCAAACTAAA CCTAAAATTG GTAAAAGAAA AGAAATAATA TGAAACTGAA AGATAACAAT ACAAAAGATC AACAAAATTA AAAGTTGGTT TTGACAAACC TTTGCCCAGA CTAAGAAAAA AGGAAAGAAG ACCTAAATAA AGAGACATTA CAACTGATAC CACAGAAATT CAAAGGATCA CTAGAGGCTA ATAAATTGAA AAACCTAGAA AAAATAGATA -,AATTCCTAGA. 
TGCATACAAC AGAAATCCAA AGCCCAAACA :GACCAATAACAATAATGGGA TTAAAGCCAT AGAGAAGCCC AGGACCCAAT, GGCTTCCCTG CTGGATTTTA CCAATCATTT CCTACTCAAA CTATTCTGAA AAATAGAGGA AAGAATACTT.CCAAACTCAT CTGATTCCAA AACCAGACAA AAACACATCA -AAAACAAACA AACAAAAAAA GGCCAATATC CCTGATGAAT ACTGATACAA AAATCCTCAA CAAAACACTA CCTTCGAAAG ATCATTCATT GTGATCAAGT GGGATTTATT CCAGGGATGG AAATCAATCA ATGTGATACA TCATCCCAAC AAAATGAAGT ACAAAAACTA CAGAAAAAGC ATTTGATAAA ATTCTGCACC CTTCATGATA AAAACCCTCA ACATACAGGC CAGGCACAGT GGCTCACACC TGCGATCCCA GCACTCTGGG CTTGGGCCCA GGAGTTTGAG ACTAGCCTGG GCAACAAAAT GAGACCTGGT AAATTAGCCA GGCATGATGG CATATGCCTG TAGTCCCAGC TAGTCTGGAG TAAGCCTAGG AGGTCGAGGC TGCAGTGAGC CATGAACATG TCACTGTACT AGACCCCACT GAATAAGAAG AAGGAGAAGG AGAAGGGAGA AGGGAGGGAG AGGAGGTGGA GGAGAAGTGG AAGGGGAAGG GGAAGGGAAA GAGGAAGAAG ATAAAAGCCC TATATGACAG ACCGAGGTAG TATTATGAGG AAAAACTGAA GAAAATGACA AGGGCCCACT TTCACCACTG TGATTCAACA TAGTACTAGA GATAAGAGAA AGAAATAAAA GGCATCCAAA CTGGAAAGGA AGAAGTCAAA ATGATCTTAT ATCTGGAAAA GACTTAAGAC ACCACTAAAA AACTATTAGA GATACAAAAT CAATGTACAA AAATCAGTAG TATTTCTATA TTCCAACAGC CAAAAAAGCA GCTACAAATA AAATTAAACA GCTAGGAATT AACCAAAGAA AAACTATAAA ATGTTGATAA AAGAAATTGA AGAGGGCACA AAAAAAGAAA TTGGAAGAAT AAATACTGTT AAAATGTCCA TACTACCCAA AGCAATTTAC TAAAATACTA ATGACGTTCT TCACAGAAAT AGAAGAAACA ATTCTAAGAT CCCAGAATAG CCAAAGCTAT CCTGACCAAA AAGAACAAAA CTGGAAGCAT TATACTACAA AGCTATAGTA ACCCAAACTA CATGGTACTG GCATAAAAAC GAACAGAATA GAGAATCCAG AAACAAATCC ATGCATCTAC AGTGAACTCA AACATACTTT GGGGAAAAGA TAATCTCTTC AATAAATGGT GCTGGAGGAA TAACAATACT AGAACTCTGT CTCTCACCAT ATACAAAAGC AAATCAAAAT AAACCTCAAA CTTTGCAACT ACTAAAAGAA AACACCGGAG AAACTCTCCA ACTTCTTGAG TAATTCCCTG CAGGCACAGG CAACCAAAGC AAAAACAGAC AAAAAGCTTC TGCCCAGCAA AGGAAACAAT CAACAAAGAG AAGAGACAAC TTTGCAAACT ATTCATCTAA CAAGGAATTA ATAACCAGTA TATATAAGGA AAAACACCTA ATAAGCTGAT TTTCAAAAAT AAGCAAAAGA TCTGGGTAGA ACAAATGGCA AACAGGCATC TGAAAATGTG CTCAACACCA CTGATCATCA ACTATGAGAG ATCATCTCAT CCCAGTTAAA ATGGCTTTTA TTCAAAAGAC GAGGATGTGG ATAAAAGGAA ACCCTTGGAC ACTGTTGGTG GGAATGGAAA 
AGTTTGAAAG TTCCTCAAAA A.ACTAAAAAT AAAGCTACCA TACAGCAATC CAAAAAAGGG AATCAGTGTA TCAACAAGCT ATCTCCACTC CCACATTTAC CCAAGGTTTG GAAGCAACCT CAGTGTCCAT CAACAGACGA ATGGAAAAAG AATGGAGTAC TACGCAGCCA TAAAAAAGAA TGAGATCCTG TCAGTTGCAA AGTATGTTAA GTGAAATAAG CCAGGCACAG AAAGACAAAC TTTTCATGTT AAAATTAAAA CAATTGACAT AGAAATAGAG GAGAATGGTG GTTCTAGAGG GAGTCAACAA TAATTTATTG TATGTTTTAA AATAACTAAA AGAGTATAAT GAAAGGATAA ATGCTTGAAG GTGACAGATA CCCCATTTAC CCTGATGTGA GTATCAAAAT ATCTCATGTA TGCTATAGAT ATAAACCCTA CTATATTAAA GGCACGGTGG CTCATGTCCG TAATCCCAGC ACTTTGGGAG GCCGAGGCGG AGTTTGAAAC CAGTCTGGCC ACCATGATGA AACCCTGTCT CTACTAAAGA GGTGGCACAT ACCTGTAGTC CCAACTACTC AGGAGGCTGA GACAGGAGAA GAGGTTGCAG TGAGCCGAGA TCATGCCACT GCACTGCAGC CTGGGTGACA ACAAAAACAA AAAAAAGAAG ATTAAAATTG TAATTTTTAT GTACCGTATA AGAAGTTAAA AATTAAAACA ATTATAAAAG GTAATTAACC ACTTAATCTA GGGTTTCTAG CTTCTGAAGA AGTAAAAGTT ATGGCCACGA TGGCAGAAAT GTTACTGTTG TTAGACGCTC ATACTCTCTG TAAGTGACTT AATTTTAACC TAAAGAGGCA TTCTATMAGC CCTAAAACAA CTGCTAATAA TGGTGAAAGG TAATTACAGA TATCTCTAAA ATCGAGCTGC AGAATTGGCA CGTCTGATCA TGCTTTTTTT CTTGTGTGC'r TGGAGATTTT CGATTGTGTG TTCGTGTTTG ATCCTGAAAC GAAAAATGGT GGTGATTTCC TCCAGAAGAA TTAGAGTACC TGTGGACCTG AGCCACTTCA ATCTTCAAGG GTCTCTGGCC AAGACCCAGG 140 210 280 350 420 490 560 630 700 770 840 910 980 1050 1120 1190 1260 1330 1400 1470 1540 1610 1680 1750 1820 1890 1960 2030 2100 2170 2240 2310 2380 2450 2520 2590 2660 2730 2800 2870 2940 3010 3080 3150 3220 3290 3360 3430 3500 3570 3640 3710 3780 3850 3920 3990 4060 4130 4200 4270 4340 4410 4480 4550 4620 4690 4760 4830 4900 4970 5040 5110
*AGGCAATAAC
TTGCTACCAC
CCATTGCTAG
TGCAGCACTG
AAAATGTGGT
CAGCATGGGG
CTCCCTTACT
GGTGGGGGAC
TGGGTTGTTT
TTATTACACA
AATTAAAATT
GTGGATCACC
TACAAAAATT
TTGCTTGAAC
GAGCAAGACT
AATATATACT
AAATAAGAAC
GTGAGGAGGG
AAAGACAGGC
TAATCTCTAT
CACCGTCCTC
S GTTAAACTTA
TGGCAGGAAG
AAATGCCAGT
TATGGAGAAC
GTATATACTC
TTCATAGCAG
GCACATACAC
GGCACTGGTC
TGTGGGAGCA
AGGGTGACTA
GTAACACAAA
TTGTATGCCT
TTAATGGCCA
TGAGGTCAGG
AGCCAGGCGT
CTGGGAGGCG
CCATCTCAAA
CTACTATATT
AATGTATGTG
AACAGTGGAA
TGGGAGAAGT
TAATTACCAA
TCATTCACGG
ATCTGTATGA
CAGGTGGCTC
26 TGCAAGGCAG AGGCCTGATG CGTTGGTGAG CAGCGCATGA AGGCGGCCAG CGGGAATGCA CCCGGGCGTG TGCCAGAGGG AGCAGTGACC AAGGTGTATT GCACCCTTCT CAAGGGAAAA CCCTCGCGGT TTCTGATCGG TGCAAAGGGC TCCACAGACC TGGCTGGGGG CGGACAGCGA CCGAATGGAT TTGGATTTTA ATTTTTCGCC CTAAGTACTT AGAGGCCTGG GCGGCAGGGC ACAGCAGGAC CACTGACCGT GTGACTCAGG ACCCCATACC CCGTGGAAAC GAACATGACC GGAAATGGCC ATGTAAATTA CCCAAGGACT GAATGATTCC ACTCTTTTAC TAGGCCCACA GCTTTCAGCC ACCAGGCTGG CAGGCACTCC CCCAGATTCT CTGGGGTGCC ACACTGAGGC TTCCTAAACC CTGGGTGGGC ACGTAGCTCG CACGGTTCCT GAGGAGATTC TGCGCCTCCC CCTGGCGTCC GGCTGCACGC CCGGTGTGTT CTTCTGTTTC GGGGTTTTTA TAGGCATAGG TGAAAGTAGG AGTGCCTGTC CCCGCCCTTC TCTGCCCAGC ACTAAGCATC CTCTTCCCAA CACGTGACTA CGCACATCAT CAAAGCAGGG AAATCCCTGC GGACAGTTCC TCACAGTGAA GAGTCAAAAC TGCCACCTCC AGGGGAGTGG TTAGGGGGGT AAGCCAGTTT CCTGGTTCTG GGAACCCGGA GGCTGTGCCA ACGTCCTGAT TCCCCCAAAC TGAGGACCCT GAGGTCTGGG AGAGGCGGGC AGGAGGGTCA GGAGGGAGGG CCTCGAGCCC ACGGAGCCTG CAGCAGGAAG GTGCCATAGG AGGGCACTCG' ACCCATGCAC TGTGAATCTA GTGCCTCCGG GCAAGGGCAG CTGAGACAGA GTTATGCTCT CGTCTCCTGG GTTCAAGCAA ACACCCGGCT AATTTTGTAT CTGACCTCAG GTGATCCGCC GGCCTATTTA ACCATTTTAA AATTTCCCCT TTACTCAGGA CGTCTCTTGA CATATTCACA GAGGCTGCAG GCTTCAGGTC AAGTGTGGAC ACTG'rCCTGA CTCCTACTCT ACTGGGATTG TGGAGGAAGG AATGATACTT TTTGTTTTGT TTTGAGAGGC GCTTACTGCA GCCTCTGCCT CAGGCACCCG CCACCATGCC ATGTTGGCCA GGCTGGTCTC
ACCCGAGGAC
AGTGCCCTTA
AGGAGTCAGA
AGAGGAGTCA
CTGAGGGAAG
CCAGACGCCC
GACAGAGTGA
CCCGCCCTGG
CGGCGGGATT
TCTTAATATT
CTGAGCTATG
TTTATTGGTT
TATGAGCACG
CCTCCCTGGG
GGCTTCCTGG
CTTGCCTGCC
CACGACTCTG
AGCAACTTCT
GAGCACGGSC
GGTGACAACA
AGGGCCTGGT
CAGCCCTGTC
CGTGTTCCAG
CCTCACATGG
AGACTGGCTC
TGACCTCCAT
TGTGCTCCTT
ACGGGGGCGT
CTCACCTAGG.
ACTTTCCTGC
AAGACCCAGC
GTACACACTC
TAAAATGTCC
GAGGAACATG
ATGGGATACG
TAAGGACGGT
ATGGTATTGG
TCTTTGCCAT
CTGTGGACAG
ATCCTTCGGG
GAGGGGGGCA
AGGCCTGCAA
GCACGGCTGG
CGCTGCCCTT
GGATTATTTC
GGCAGGCACG
TGTTGCCCAG
TTCTCGTGCC
TTTTAGTAGA
CACCTCAGCC
AACTTCCCTG
GTTACCCTCC
GTTTCTGTGA
CCAGTGGGGT
ATCTCAATGT
AGCCCCTTCC
TGTTATTTTT
GGTTTCACTC
CCCAGGTTCA
CAGCTAATTT
GAACTTCTGA
ATGCCCAGCT
GTTGTGGTGT
GATGACTAAG
TATCAGCAAT
AATCTTCTGC
AACCAGTGTA
GAGACAATTC
TGTAATCCTA
CTGTTCAAAT
TTAAGGTTGC
GACCCAGAAG
CAGCTGTCCT
GGCTGGAAGT
CTCCACCCTG
GTCAAGGCCG
CCCTTTCTCG
CTCGCCGCCT
AGGAAAGCTC GGATGGGAAG GGGCGATGAG AAGCCTGCCT TTTACGCTTT GCAAAGATTG CTCTGGATAC CATCTGGAAA AGCCTCCTGC TCAAACCCAG GCCAGCAGCT ATGGCGCCCA AGGCACCTCG AAGTATGGCT TAAATCTTTT TTTCACCTGA CTTGAGTTAG GTGCCTTCTT TAAAACAGAA AGTCATGGAA GCTCTGCGGT CATTTACC'rC TTTCCTCTCT CCCTCTCTTG CCCCCGTGGA GCTTCTCCGA GCCCGTGCTG AGGACCCTCT AGAGAGGAGT CTGAGCCTGG CTTAATAACA AACTGGGATG CAAAGACTTA ATTCCATGAG TAAATTCAAC CTTTCCACAT TTCTTAAATT TCATCAAATA ACATTCAGGA CTGCAGAAAT TTTGCCAAGG TCCAAGGACT TAATAACCAT GTTCAGAGGG TTCATAAGGT GGCTTAGGGT GCAAGGGAAA GTACACGAGG GCAGGGCCAC CGGGGAGAGA GTCCCCGGCC TGGGAGGCTG AGCTGCCACA TTGGGCAACG CGAAGGCGGC CACGCTGCGT GCCCACCCAC ACTAACCCAG GAAGTCACGG AGCTCTGAAC TGCTTCCCTG GGTGGGTCAA GGGTAATGAA GTGGTGTGCA CTGATGGGGA CCGTTCCTTC CATCATTATT CATCTTCACC TCGGGTGTGA CAAGCCATGA CAAAACTCAG TACAAACACC CACACCCCTG ATATATTAAG.AGTCCAGGAG AGATGAGGCT GCGGCTGAAC AGTCTGTTCC TCTAGACTAG TAGACCCTGG TGCTGCTTCC CGAGGGCGCC ATCTGCCCTG GAGACTCAGC TCCACACCCT CCGCCTCCAG GCCTCAGCTT CTCCAGCAGC CGCTACTGTC TCACCTGTCC 'CACTGTGTCT TGTCTCAGCG GGTGTCTGTC TCCTTCCCCA ACACTCACAT'GCGTTGAAGG CTCTGAGCCT GAACCTGGCT CGTGGCCCCC GATGCAGGTT TTCCAGGCGC TCCCCGTCTC CTGTCATCTG CCGGGGCCTG TCCACGTCCA GCTGCGTGTG TCTCTGCCCG CTAGGGTCTC GGTGGGCCAG GGCGCTCTTG GGAA.ATGCAA CATTTGGGTG TCCACGGGCA CAGGCCTGGG GATGGAGCCC CCGCCAGGGA CCCCCTCCCT CTGGAACACA GAGTGGCAGT TTCCACAAGC ATTGGCACCC CTGGACATTT GCCCCACAGC CCTGGGAATT CCGTCCACGA CCGACCCCCG CTGTTTTATT TTAATAGCTA TTTAACAAAC TGGTTAAACA AACGGGTCCA TCCGCACGGT CCGTTTATAA AGCCTGCAGG CATCTCAAGG GAATTACGCT TACGCAACAT GCTCAAAAAG AAAGAATTTC ACCCCATGGC GGGGGCGGCA GCTGGGGGCT ACTGCACGCA CCTTTTACTA CTCAGTTATG GGAGACTAAC CATAGGGGAG TGGGGATGGG GCCCGAGTGT CCTGGGCAGG ATAATGCTCT AGAGATGCCC AACCCGCCCG GCCCCAGGGC CTTTGCAGGT GTGATCTCCG ACTACCTGCA GGCCCGAAAA GTAATCCAGG GGTTCTGGGA GCCTCAGGAC GATGGAGGCA GTCAGTCTGA GGCTGAAAAG GCGCCTCCAG AAGCTGGAAA AAGCGGGGAA GGGACCCTCC CCCTTAGCCC ACCAGGGCCC ATCGTGGACC TCCGGCCTCC CTAGCATGAA GTGTGTGGGG ATTTGCAGAA GCAACAGGAA AAAACAAAGG TTTACAGAAA CATCCAAGGA CAGGGCTGAA AGTGATTTTA 
TTTAGCTATT TTATTTTATT TACTTACTTT GCTGGAGTGC AGCGGCATGA TCTTGGCTCA CTGCAACCTC TCAGCCTCCC AAGTAGCTGG GATTTCAGGC GTGCACCACC GATGGGCTTT CACCATGTTG GTCAAGCTGA TCTCAAAATC TCCCAAAGTG CTGGGATTAC AGGCATGAGC CACTGCACCT GGCTCAAGTC. ACACCCACTG GTAAGGAGTT CATGGAGTTC TTTGATATTT TCTGTAATTC TTCGTAGACT GGGGATACAC CCACCTGTTA TCCCATGGGA CCCACTGCAG GGGCAGCTGG TGCCATCTGC :'CAGTAGAAAC'CTGATGTAGA ATCAGGGCGC CTCAGTGTGT GCTGAAACAT GTAGAAATTA AAGTCCATCC CTATCCCCCC CCAGGGGCAG AGGAGTTCCT CTCACTCCTG CACTGCTGGT ACTGAATCCA CTGTTTCATT TGTTGGTTTG TTGTTGCTCA GGCTGGAGGG AGTGCAATGG CGCGATCTTG AGTGATTCTC CTGCTTCCGC CTCCCATTTG GCTGGGATTA TTTGTATTTT TAGTAGAGAC GGGGGTGGGT GGGGTTCACC CCTCAGATGA TCCACCTGCC TCTGCCTCCT AAAGTGCTGG CAGAATTTAC TCTGTTTAGA AACATCTGGG TCTGAGGTAG TTTAAGCCAA TGATAGAATT TTTTTATTGT TGTTAGAACA ACATCATCAG CTTTTCAAAG ACACACTAAC TGCACCCATA CTTCATTGAA TGCCGGGAGG CGTTTCCTCG CCATGCACAT TTCCATTTCT TCTCTTCCCT CTTTTAAAAT TGTGTTTTCT AGCTACAACT TAACTTTTGT TGGAACAAAT TTTCCAALACC ACAAACACAG CCCTTTAAAA AGGCTTAGGG ATCACTAAGG AGTATTTACA AGACGAGGCT AACCTCCAGC GAGCGTGACA GCTAGCTCCA TAAATAAAGC AATTTCCTCC GGCAGTTTCT GTTTGTTAGC ATTTCAGTGT TTGCCGACCT CAGCTACAGC TTTCTCGCCC CCTTAGATCC AAACTTGAGC AACCCGGAGT GCGGTTGTGC CGGGGCCCCA GGTCTGGAGG GGACCAGTGG CGGGCCTCCT AGCTCTGCAG TCCGAGGCTT GGAGCCAGGT TGCGGGCGGG ATGTGACCAG ATGTTGGCCT CATCTGCCAG TTGTGGCTGG TGTGAGGCGC CCGGTGCGCG GCCAGCAGGA ACGGGACCGC CCCGGTGGGT GATTAACAGA TTTGGGGTGG GAGAACCTGC AAAGAGAAAT GACGGGCCTG TGTCAAGGAG 5180 5250 5320 5390 5460 5530 5600 5670 5740 5810 5880 5950 6020 6090 6160 6230 6300 6370 6440 6510 6580 6650 6720 6790 6860 6930 7000 7070 7140 7210 7280 7350 7420 7490 7560 7630 7700 7770 7840 7910 7980 8050 8120 8190 8260 8330 8400 8470 8540 8610 8680 8750 8820 8890 8960 9030 9100 9170 9240 9310 9380 9450 9520 9590 9660 9730 9800 9870 9940 10010 10080 10150 10220 10290 10360 10430 10500 10570
GATTACAGGT
GAAGCTCACC
CTCTTGATGT
ATACTGGGGT
GGTGTTAATT
ATGTTGGCTT
GCCCCTTTGC
GGATTTCTAG
GCCCAGGGAG
GTGAGCCACC
CCACTCAAGT
TTTACACTGT
GTCTTCTGGG
ACTCCAGCAT
CTCTGCAGAG
CCTAGTGGCA
AAGAGCGACC
GGTGCGAGGC
GAAAGTAGGA AAGGTTACAT ATCCCTGCAA GGCCTCGGGA CTGGATTCCT GGGAAGTCCT CCGTGTGGCT TCTACTGCTG GCCTGGACCC CGAGGCTGCC ACAGAGTGCC GGGGCCCAGG GCGCCTGGCT CCATTTCCCA TTTGCTCATG GTGGGGACCC -27-
CCCAAGTCGC
GTCCTCGGGT
CCGGAGCCCG
GGTCGCCGCA
GCCGATTCGA
GGCGGGGAAG
CCAGTGGATT
ACCCGTCCTG
GGGTCCCCGG
TCGCGGCGCG
GCGATGCCGC
CGCTGGCCAC
TTTCCGCGCG
TCCTTCCGCC
GACATGCGGA
GCCCGAGTGC
ACGGGGCCCG
CGACGCACTG
GGGGAAGTGT
TCGTCCCCAG
ACGCCCCGCG
CGCACCTGTT
CCTCTCTCCG
CGCGGCCCAG
CGCGGGCACA
CCCCTTCACC
CCCAGCCCCC
AGTTTCAGGC
GCGCTCCCCG
GTTCGTGCGG
CTGGTGCCC
AGGTGGGCCT
GAGCAGCGCA
TGCAGAGGCT
CGGGGGCCCC
CGGGGGAGCG
CTGCTGGCAC GCTGCGCGCT TGTACCAGCT CGGCGCTGCC ATGCGAACGG GCCTGGAACC AGGAGGCGCG GGGGCAGTGC AGCCGGAGCG GACGCCCGTT TGGTTTCTGT GTGGTGTCAC ACGCGCCACT CCCACCCATC GTCCCTGGGA CACGCCTTGT GGAGCAGCTG CGGCCCTCCT GAGACCATCT TTCTGGGTTC GCTACTGGCA AATGCGGCCC CCTCAAGACG CACTGCCCGC CAGGGCTCTG TGGCGGCCCC ACAGCAGCCC CTGGCAGGTG GGGCTCCAGG CACAACGAAC AAGCTCTCGC TGCAGGAGCT GTGAGGAGGT GGTGGCCGTC AGGCAGAGCC CTGGTCCTCC TGGACACGGT GATCTCTGCC CGTTTTGATG GACACGCGGT TGGAGCCGGG TTGCCGGCAA TTACCTATAA TCCTCTTCGC GGGTGGGAGG TAAGGGTTTT GTGTGTTGAC GGCCAGGTGC TCACCTGAGG TCAGGAGTTT AAATTAGCTG GGCATGGTGG TGAACCCAGG AGGCGGAGGC GAAACTCTGT CTTTAAAAAA ACTGTTCTCC AGCACAGATC AGATGGCTCC ACCTGCTGAG CGTGTCCCCA CCCTGTTTTT TGCTCCCAGG CCCTACCGTG ATGTAAGACT TCCGGCCATG TTATGGTGGC AAAAGTCATA TGCTAACTCG GCGGTGTTTA ATCGAACGGC AGCTGCCTCA ACCCAGTTTT GCTTTTTGTG CCTGTCATTG CTGTTCTCTG TGGTCCCCGG GTGTCCCTGT GGGTGAGTGA GGCGCGGCCC CCCTGTCACG TGTAGGGTGA TGGTCCCCGG GTGTCCCTGT GGGTGAGTGA GGCGCGGTCC CCCTGTCACG TGTAGGGTGA CTGTCCCCGG GTGTCCCTGT GGGTGAGTGA GGCGCTGTCC CCTTGGCGTT TGCTCACTTG CCCATTGCCT GGGTAGATGG TCTTGGTCAC CTCTCCGTTC GGCACTGCAG CCACAGCTTC ATGCATGCTG CCAATACTCC CTGCCACGTG TTGCTGGAGA CCCCTCACTT GTCCTGTTTT TCACCTTATT CTGGGCACCT ATTGACGTCC AGCCACAGGT GTCTCCGCCA GCCTTCGTCA TTTCTATCTC TCCATTGTAT CATGCCTTTC CCTCTAAGTG CACTTTCAAG TGTTCTTAAA TTTCTTTGTG CACGCTGTGT TGCAGGGAGG CACTCCGGGA CCGCGTCTAC GCGCCTCCGT TCCGGACCTG GAGGCAGCCC CCCAGGGCCT CCACATCATG CTGGGGCCCT CGCTGGCGTC ACCCCCGGGT CCGCCCGGAG GACGCCCAGG ACCGCGCTCC TTCCAGCTCC GCCTCCTCCG TCCGGGCCCT CCCAGCCCCT AGCGCTGCGT CCTGCTGCGC CTGCCGAGCC GTGCGCTCCC CGCCTGGGGC CCCAGGGCTG AGTGCCTGGT GTGCGTGCCC CCCCGGGGTC GGCGTCCGGC GGCGACTCAG GGCGCTTCCC GTGCGAGCGC GGCGCGAAGA CCCGAGGCCT TOACCACCAG GGGCGTGGGG GCTGCTGCTG CTTTGTGCTG GTGGCTCCCA ACTCAGGCCC GGCCCCCGCC ATAGCGTCAG GGAGGCCGGG CAGCCGAAGT CTGCCGTTGC GGGCAGGGGT CCTGGGCCCA CTGCCAGACC CGCCGAAGAA CGTGGGCCGC CAGCACCACG CCCCCGGTGT ACGCCGAGAC TCCTACTCAG CTCTCTGAGG CAGGCCCTGG ATGCCAGGGA CTGTTTCTGG AGCTGCTTGG TGCGAGCTGC GGTCACCCCA CGAGGAGGAG 
GACACAGACC TACGGCTTCG TGCGGGCCTG GCCGCTTCCT CAGGAACACC GACGTGGAAG ATGAGCGTGC GAGGGCCCAG GCCCCAGAGC TGTCTCCATC GTCACGTGGG TCTGCTCTCC CTCCTGTCCA TTCCAGGCGC CGAGGCCAGA TGGGGAGAAG TGTCTGGAAG AATTTCAAGG GTGGGAATGA GCAGGTGCAC GTGGTCAGCC GGTGGCTCAC GCCGGTAATC GAGACCAGCC TGACCAACAT TGTGTGCCTG TAATCCCAGC TGCAGTGAGC TGAGATTGTG AAAAAGTGTT CGTTGATTGT CTGGTCCCAT CTTTAGGTAT GAAGGGACAG TGTTTGTGGG CTGGATTTGA TGTTGAGGAA GCAGCTAGAA GAAGTCCCGA CAGACAAGGA GGGTGACCTT TAACATGAGA TTGGCACTCC CAGCAGGTTG CTTGAAATGC CACCTGCTGC GGCTCAGGTG CTCCAGCTTC CTTCGTTGAG ACTTCAGATG AGGTCACAAT GGTCCCGCGT GCCCGTCCAG CCTCCCCTTC ACGTCCGGCA TGGGTCTCCG GATCAGGCCA GCCCCTCCCT CGGGTTACCC CCTGCACCCT GGGAGCGCGA CAGCTGCGCT GTCGGGGCCA CCACGTGGCG GAGGGACTGG CGCGGACCCC GCCCCGTCCC CCCCTTCCTT TCCGCGGCCC ACGTGGGAAG CCCTGGCCCC TGCTGCGCAG CCACTACCGC GCGGCTGGTG CAGCGCGGGG TGGGACGCAC GGCCGCCCCC TGGGGTTGAG GGCGGCCGGG CCGCAGGTGT CCTGCCTGAA ACGTGCTGGC CTTCGGCTTC CGTGCGCAGC TACCTGCCCA CGCCGCGTGG GCGACGACGT GCTGCGCCTA CCAGGTGTGC ACACGCTAGT GGACCCCGAA GTCCCCCTGG GCCTGCCAGC CCAAGAGGCC CAGGCGTGGC CCCGGGCAGG. ACGCGTGGAC GCCACCTCTT TGGAGGGTGC CAGGCCCCCC ATCCACATCG CAAGCACTTC CTCTACTCCT CCCAGCCTGA CTGGCGCTCG CTCCCCGCAG GTTGCCCCGC GAACCACGCG CAGTGCCCCT GCAGCCGGTG TCTGTGCCCG CCCGTCGCCT GGTGCAGCTG CCTGCGCCGG CTGGTGCCCC AAGAAGTTCA TCTCCCTGGG GGGACTGCGC TTGGCTGCGC TGAATGCAGT AGGGGCTCAG CACACGTGGC TTTTCGCTCA GTTTGCATAA ACTTACGAGG GCAGTGAACA GAGGAGGCTG CACAGACGCT CTGGCGAGGG GAGGTGGGGA CGAGAACCCC AATATGCAGG TTTGTGTTTA CCAGCACTTT GGGAAGCTGA GGTGAILACCC TATCTGTACT TACTTGGGAG GCTGAGGCAG CCATTGTACT CCAGCCTGGG GCCAGGACAG GGTAGAGGGA GAAGAGGGCC ACATGGGAGC TGTTCAGGGG ATGGTGCTGC CCTCCGCTCC AGCCCCCTTT
GGAGCAATGC
TTCGTGGTGC
GCGGCCAAAG
CACAGCCTAG
GCGGCGCGCG
GGCCGGGCTC
GGACCCGGGC
GACCCCTCCC
CGCCCTCTCC
GGCCACCCCC
GAGGTGCTC
ACCCGGCGGC
CGCCGCCCCC
GGGAACCAGC
GGAGCTGGTG
GCGCTGCTGG
ACACGGTGAC
GCTGGTTCAC
GGGCCGCCGC
GGCGTCTGGG
CCCGGGTGCG
GCTGCCCCTG
CGAGTGACCG
GCTCTCTGGC
CGGCCACCAC
CAGGCGACAA
GAGGCTCGTG
CTGCCCCAGC
ACGGGGTGCT
GGAGAAGCCC
CTCCGCCAGC
CAGGCCTCTG
GAAGCATGCC
AGGAGCCCAG
AAAAGGGGGC
GGACGTCGAG
TTCACCTTCA
GGCGCGGCAG
TGCCTGCAGG
CTCTTCCTGG
AGATTTAATT
GGCAGGTGGA
AAAAATACAA
GAGAATCACT
CGACAAGAGT
GGGAGATAAG
AGAGGACAGC
TGGGCCCTGC
TGGCTCCCAG
CTCCCAAGAC
TTTTTTCTTT
AGTGCAGAAT
GTCCCTACCC
GCGTCATGCA
CAGGACTCTG
GAGTGAGGCG
GTCACGTGTA
CCCCGGGTGT
GAGTGAGGCG
GTCACGTGCA
CCCAGGGTGT
GAGTGAGGCA
CTCAGGTGCA
CCCCAGGTGT
GCGCCGGTTG
GCTCGGCTCT
CCGCGTGCCA
TGCCCGCCAC
CCTGTGTCTG
TCACCCCAGC
GCAACGCTTG
CCTCACATGG
GAGGGCCGGT
ACCTCTGACG
TTAGTTTAGT
CAACATCAGC
CTGTGATTTT
TTCTTTAGCT
10640 10710 10780 10850 10920 10990 11060 11130 11200 11270 11340 11410 11480 11550 11620 11690 11760 11830 11900 11970 12040 12110 12180 12250 12320 12390 12460 12530 12600 12670 12740 12810 12880 12950 13020 13090 13160 13230 13300 13370 13440 13510 13580 13650 13720 13790 13860 13930 14000 14070 14140 14210 14280 14350 14420 14490 14560 14630 14700 14770 14840 14910 14980 15050 15120 15190 15260 15330 15400 15470 15540 15610 15680 15750 15820 15890 15960 16030
CACGTGCAGG
CCGGGTGTCC
GTGAGGCGCC
CCCGTGCAGG
CCGGGTGTCC
GTGAGGCACC
CACGTGCAGG
CTGGGTGTCC
AGCTTGCTCC
TGCAGGCGCA
CATTTTGCTA
AGGTCCGCTT
TCTCCCAGCT
CATCCCAGAA
CTCCCAAGCT
GCCGCTCATT
TGGAGTGTCT
GACTTCCCTC
GCTTTTTCTT
CTGCCTTACC
ATACTTCAAA
GTGAGTGAGG
CTGTCCCGTG
ATCCCCGGGT
GTGAGTGAGG
CTCTCAGGTG
GTCCCTGGGT
GTGAGTGAGG
CTGTCTCGTG
TGAATGTTTG
GTGCTGGTCC
CGGGGACACG
GCCTCTGTTG
TGTCTCATGC
AGGGTTCTCT
GCCCCTCTGC
GCTTAGGCTG
CTGTCTGTCT
TTGGGTCTTA
GGTTTATTCT
TGCACCCTGT
GTGTTAATAC
TTTCACCCCC
CTTGGGGCTC
TAACACCGTT
TGCGTCTTGC
GACCACGCCG
GAGAGTTTGA
CTGCCCCTGG
CGTTGCCCCC
CAGCGTGATT
GTCCCTGTCA
CACTGTCCCC
TAGGGTGAGT
GTCCCTCCCA
CGCGGCCCCC
TAGGGTGAGT
CTCTTTCTAT
CCAAGCCTAT
GGACTGCAGG
GGCCTGGCTT
CGAGGCTGGA
GTGCCCTGAA
TTGGCCCCCT
GGCTCTGCCT
CCTGCTCTGA
GTTTTGAATT
TTCATTCCTT
GTTTTGATGT
TTCTTTTAAG
ATATCAGTGA
TCCCCACAAA
TTTTTTTTCT
TTCTGTGTAC
GTGACTGGAA
AGTCAGATAA
GTTCTCTGAT
CTTATGCAGG
AGGTGTCCCT
GAGGTGTGGC
CGTGTAGGGT
GGGTGTCCCT
GAGGCGCGGC
GGTATAGGGT
GGGTGTCCCT
GAGGCTCTGT
AGCCACAGCT
CTTTTCTGAT
CTCTCGCCTC
GCTCACCACG
CTCTGGGCTG
GGAAAGCAAG
TGGGTGGGTG
CCAGTCGCCC
GACCCACGTG
TCACTGATTT
TTCTAGCTTC
GAAGTAATCT
TATTCTTATT
CTTTTAAGTA
TTTGACGTGA AATCATTTTG -28- TATTCTGTGA TTTCTTTGAG ATACGTAGAG TATTTTAAGT ACCAATTATT TGAAGTTTGC TGTAAATTTG ACATCCTGTC AAGCTTCTGT CTCCTTCTAG TCTGTTCATT TCTTCTCGTT TTATCTTCCT GATGAGTGAA TTGCTCTTAG TACTGCCACA AATTTATATA TATATATATA AGTGGTGTGA TCACAGGTCA AGTAGCTGGA ACTGCAGACA TGCTGTGTTG CCCAGGCTGG CTGAATTACA GGCATGAGCC TGTCCTGTTA ACAGCATGTA TTTATTTTCA TTTTTTTGTC .CCTCGTTCCC TTGTTTCTCA CTCGTTGCCT CCTGGTCACT ATTGTCGTTG TTTGCTTTTG ATCTCGGCTC ACTGCAACCT GGATTACAGG CGCCCACCAC TGGCCAGGCT GGTCTCAAAC ACAGGTGCAA GCCACCGTGC TCCTGAGCAA TAAGACCCTT TGACTTAGTT CTATCTCAGG GAGTGTTTCT GTAGCTTTGC CCGCCGTCTG GGGTCCCCTT CTTTACCTGT GCTGGCCTCC ATGTGGAGAC TCACGAGGAG CTTAGCCAGT GAGTGACAGC TCTGGTGGCT CCGCGGTGTC GCCTGGCGGG GGAGTGTCTG TTGTCGCCCA ACAGGAGCAT TCACGCCTGT AATCCCAGCA CCTGGCCAAC ATGATGAAAC TGTAATCCCA GCTACTCGGG GCCGACATTG CACCACTGCA AAAAAAAAAA AATTCTAGTA TTTTACTGAA GCCCAGCATG ACATTTGACA TTTTTTGAGC GGACCTGCTG GGCTTCCCAT CCCTCAGTGA GCTGGATGTG AGCTGGATGT GTGGTGTCTG ATGGAGTCCG GATGATGCAG GGATGGTGCA GGTCAGGGGT GGTCCGGGGT GAGGTCGCCA TGAGGTCACC AGGCCCTGCG CAGACGGTGC CAGACCATGC CCAGGCCCTG CTGTGAGTTG GCTGTGAGCT GGATGTGTGG AGCTGGATGT GTGGTGTCTG GCAGTGTCCA GATGGTGCAG GGATGGTGCA GGTCTGGAGT GGTCCGGGGT GAGGTCGCCA TGAGGTCGCC AGACCCTGCT CAGGCCCTCG GTGAGCTGGA TGTGAACTGG ATGTGCGGCG AGGTATGGAG TCCGGATGAT GTCTGGATGG TGCAGGTCTG TGCAGGTCCG GGGTGAGGTT GGGTGAGGTC GCCAGGCCCT CACCAGGCCC TGCGGTGAGC CTCGGTGAGC TGGATGTGCG TGGTGGGCTG GATGTGCCGT GATGTGCGGT GTCTGCATGG GTCCGGATGG TGCAGGTCCG GTGCAGGTCC GGGGTGAGGT CGGGGTGAGG TCGCCAGGCC GTCACCAGGC CCTGCGGTTA CCCTGCTGTG AGCTGGATGT GAGCTGGATG TGCTGTATCC ATGCGGTGTC GGATGGTGCA CGGATGGTGC AGGTCTGGGG CAGGTCCGGG GTGAGGTCGC CGTGAGGTCG CCAGGCCCTG GCCAGGCCCT GCGGTGGGCT TGCGGTGAGC TGGATGTGTG TGGATATGCG GTGTCCCCGT $A GATGTGCCGT GTCCGGATGG CAGTGAGTTA TTTGAACACT GTTTATGTTC AAGATATGTA GAGTATCAAG 16100 TATCATTTTA TTATTGATTT CTAACTCAGT TGTGTAGTGG TCTGTATAAT 16170 GGAGCCTTGC TTTGTGATCT AGTGTGTGCA TGGTTTCCAG AACTGTCCAT 16240 AATAGTGGGC ATGCATGTTC ACTATATCCA GCTTATTAAG GTCCAGTGCA 16310 
ATGCATGAAA TTCCAAGAAG GAGGCCATAG TCCCTCACCT GGGGGATGGG 16380 TGGTAGCATT TATGTGAGGC ATTGTTAGGT GCATGCACGT GGTAGAATTT 16450 TCTTTTGGAG ACTTCTATGT CTCTAGTAAT CTAGTAATTC TTTTTTTAAA 16520 CTGGGCTTCT TTTGATTAGT ATTTTCCTGC TGTGTCTGTT TTCTGCCTTT 16590 TTTTTTTTTT TTTTGAGACA GAGTCTTGGT CTGTCGCCCA GGGTGAGTGC 16660 GTGTAACTTT TACCTTCTGG CCTGAGCCGT CCTCTCACCT CAGCCTCCTG 16730 CGCACCGCTA CACCTGGCTA ATTTTTAAAT TTTTTCTGGA GACAGGGTCT 16800 TCTCAAACTC TTGGACTCAA GGGATCCATC TACCTCGGCT TCCCAAAGTG 16870 ACCATGTCTG GCCTAATTTT CAACACTTTT ATATTCTTAT AGTGTGGGTA 16940 GGTGAATTTC CAATCCAGTC TGACAGTCGT TGTTTAACTG GATAACCTGA 17010 ACTAGAGACC CGCCTGGTGC ACTCTGATTC TCCACTTGCC TGTTGCATGT 17080 CCACCTCTTG GGTTGCCATG TGCGTTTCCT GCCGAGTGTG TGTTGATCCT 17150 GGGCATTTGC TTTTATTTCT CTTTGCTTAG TGTTACCCCC TGATCTTTTT 17220 TTTATTGAGA CAGTCTCACT CTGTCACCCA GGCTGGAGTG TAATGGCACA 17290 CTGCCTCCTC GGTTCAAGCA GTTCTCATTC CTCAACCTCA .TGAGTAGCTG 17360 CACGCCTGGC TAATTTTTGT -ATTTTTAGTA, GAGATAGGCT TTCACCATGT 17430 TCCTGACCTC AAGTGATCTG CCCGCCTTGG -CCTCCCACAG TGCTGGGATT 17500 CCGGCATACC TTGATCTTTT AAAATGAAGTCTGAAACATT -GCTACCCTTG 17570 AGTGTATTTT AGCTCTGGCC *ACCCCCCAGC 'CTGTGTGCTG, TTTTCCCTGC 17640 CATCTTGACA CCCCCACAAG CThAGCATTA TTAATATTGT. 
TTTCCGTGTT 17710 CCCCGCCCTG CTTTTCCTCC TTTGTTCCCC GTCTGTCTTC TGTCTCAGGC 17780 CCTTGTCCTT TGCGTGGTTC TTCTGTCTTG TTATTGCTGG TAAACCCCAG 17850 ATGGCATCTA GCGACGTCCG GGGACCTCTG CTTATGATGC ACAGATGAAG 17920 GGCGGTCATC TTGGCCCGTG AGTGTCTGGA GCACCACGTG'GCCAGCGTTC 17990 AACGTCCGCT CGGCCTGGGT TCAGCCTGGA AAACCCCAGG CATGTCGGGG 18060 GAGTTTGAAA TCGCGCAAAC CTGCGGTGTG GCGCCAGCTC TGACGGTGCT 18130 CTTCCTCCCT TCTGCTTGGG AACCAGGACA AAGGATGAGG CTCCGAGCCG 18200 GACGTGAGCC ATGTGGATAA TTTTAAAATT TCTAGGCTGG GCGCGGTGGC 18270 CTTTGGGAGG CCAAGGCGGG TGGATCACGA GGTCAGGAGG TCGAGACCAT 18340 CCCATCTGTA CTAAAAACAC AAAAATTAGC TGGGCGTGGT GGCGGGTGCC 18410 AGGCTGAGGC AGGAGAATTG CTTGAACCTG GGAGTTGGAA GTTGCAGTGA 18480 CTCCAGCCTG GCAACACAGC GAGACTCTGT CTCAAAAAAA AAAAAAAAAA 18550 GCCACATTAA AAAAGTAAAA AAGAAAAGGT GAAATTAATG TAATAATAGA 18620 TCCACACCTC ATCATTTTAG GGTGTTATTG GTGGGAGCAT CACTCACAGG 18690 TTTGTCTGCG GGATCCCGTG TGTAGGTCCC GTGCGTGGCC ATCTCGGCCT 18760 GGCCATGGCT GTTGTACCAG ATGGTGCAGG TCCGGGATGA GGTCGCCAGG 18830 CAGTGTCCGG ATGGTGCACG TCTGGGATGA GGTCGCCAGG CCCTGCTGTG 18900 GATGGTGCAG GTCAGGGGTG AGGTCTCCAG GCCCTCGGTG AGCTGGAGGT 18970 GTCCGGGGTG AGGTCGCCAG GCCCTGCTGT GAGCTGGATG TGTGGTGTCT 19040 GAGGTCTCCA GGCCCTCGGT AAGCTGGAGG TATGGAGTCC GGATGATGCA 19110 GGCCCTGCTG TGAGCTGGAT GTGTGGTGTC TGGATGGTGC AGGTCTGGGG 19180 GTGAGCTGGG TGTGCGGTGT CTGGATGGTG CAGGTCTGGA GTGAGGTCGC 19250 GGTGAGCTGG ATATGCGGTG TCCGGATGGT GCAGGTCTGG GGTGAGGTTG 19320 GATGTGGGGT GTCCGGATGC TGCAGGTCCG GTGTGAGGTC ACCAGGCCCT 19390 TGTCTGGATG GTGCAGGTCT GGGGTGAAGG TCGCCAGGCC CCTGCTTGTG 19460 GATGGTGCAG GTCTGGAGTG AGGTCGCCAG GCCCTCGGTG AGCTGGATGT 19530 GTCCGGGGTG AGGTCGCCAG ACCCTGCGGT .GAGCTGGATG. TGCGGTGTCT 19600 GAGGTCGCCA GGCCCTCGGT.
GACCCTGCTG TGAGCTGGAT GTGAGCTGGA TATGCGGTGT GGTATGGAGT CCGGATGATG TCTGGATGGT GCAGGTCTGG GCAGGTCCGG GGTGAGGTCG GGGTGTGGTC GCCAGGCCCT GCCAGGCCCT GCTGTGAGCT GCTGTGAGCT GGATGTGCTG TGGTTGTGCG GTGTCCGGTT GTGTCCCCGT GTCCGGATGG GTCCGGATGG TGCAGGTCTG TGCAGGTCTG GGGTGAGGTC GCGTGAGGTC GCC-AGGCCCT AGCCAAGGCC TTCGGTGAGC CTGCGGTTAG CTGGATATGC GCTGGATGTG CGGTGTCTGG GCTGTATCCG GATGGTGCAG GGATGGTGCA GGTCTGGCGT GGTCCGGGGT GAGGTCACCA TGAGGTCGCC AGGCCCTGCT CAGGCCCTGC GGTGAGCTGG CGGTGAGCTG GATGTGCAGT GTATGTGTGT TGTCTGGATG GTGTCTGGAT GCTGCAGOTC GTCCGAATGG TGCAGGTCCA TGCAGGTCTG GGGTGAGGTC GAGCTGGATG ITATGGAGTCC -GGATGGTGCC 19670 GTGCGGTGTC .TGGATGGTAC .AGGTCTGGAG 19740
CCGGATGGTG
CAGGTCCGGG
GGTGTGGTCG
CCAGGCCCTG
CGGTGAGCTG
GGATGTGCTG
TATCCGGATG
GCTGCAGGTC
TGCAGGTCCA
GGGTGAGGTC
GCCAGGCCCT
GCTGTGAGCT
TGGATGTGGG
GGTGTCCGGA
ATGGTGCAGG
GTCCGGGGTG
GAGGTCGCCA
GGCCCTGCGG
GTGAGCTGGA
ATGTGCTGTA
GTACGGATGG
GTGCAGGTCC
CGGGGTGAGT
GGGTGAGGTC
GCCAGGCCCT
CAGGTCAGGG 'GTGAGGTCTC GTGAGGTCGC CAGGCCCTGC CCAGGCCCTC GGTGAGCTGG CTGTGAGCTG GATGTGCGGC GAGGTATGGA GTCCGGATGA TATCCGGATG GTGCAGTCCG GTGCAGGTCT GGGGTGAGGT CGGGGTGAGT TCGCCAGGCC GGGTGAGGTC GCTAGGCCCT GCCAGGCCTT TGGTGAGCTG TGGTGGGCTG GATGTGTGGT GGATGTGCGG TGTCTGGATG GTGTCCGGAT GGTGCAGGTC TGGTGCAGGT CCGGGGTGAG TCCGGGGTGA GGTCGCCAGG AGGTCGCCAG GCCCTGCAGT GGCCCTGCGG TTAGCTGGAT TTAGCTGGAT GTGCGGTGTC TGTGCTGTAT CCGGATGGTG TCCGGATGGT GCAGGTCTGG TGCAGGTCCG GGGTGAGGTC GGGGTGAGTT CGCCAGGCCC TCGCCAGGCC CTCGGTGAGC GCCAGGCCCT TGGTGGGCTG TGGTGAGCTG GATGTGCGGT 19810 19880 19950 20020 20090 20160 20230 20300 20370 20440 20510 20580 20650 20720 20790 20860 20930 21000 21070 21140 21210 21280 21350 21420 21490 29 GTCCGGATGG TGCAGGTCCG TTTAAGGGGT TGGCTGTGTT CTGGCTGATG AGTGTGTACG AAGAACAGGC TC'rTTTTCTA TCCCCACGCC AGGCCTCTGC CTTGCCTGTG CTTCCCTGGC GTGGATTCTG TGCAAGGCTC CAAGTGGTCT CTAGGGTTTG GCCTTTCCCT GGGATGTGGG TGTTGCCCAG GCTGGAGTGC TTCACCAGCC TCAGCCTCCT CTTTTAGGAG AGACGGGGTT CCACCTTGGC CTCCCAAAGT TTCATGCTGT TCTGTATGAA GGCGACTCAC TGCAGGGAGC TCTAGGTGGC TGCATTTGAA TGACAGATTC AAGCTGGATT AGCCCAGGCC ATGGTATTAG CAGGGCTTCC CCAGCTCCCC GGATGTCTGC AGAGGGAGCT CGTGGTGCTG GGGCCATTTC ACAATGCACC TTACTTAGAC TTTGGAAAGA ATTTAATTGG CACTACTGGG ACTGTTGTTC TTTCTACTCT GCTGGGCCTG ACGGAGTGCC AGGCTGTCAG GGGTGAGGTC ACCAGGCCCT CGGTGATCTG GATGTGGCAT GTCCTTCTCG CCGGCCGCAG AGCACCGTCT GCGTGAGGAG ATCCTGGCCA AGTTCCTGCA TCGTCGAGCT GCTCAGGTCT TTCTTTTATG TCACGGAGAC CACGTTTCAA CCGGAAGAGT GTCTGGAGCA AGTTGCAAAG CATTGGAATC AGGTACTGTA TTCTCGAAGT CCTGGAACAC CAGCCCGGCC TCAGCATGCG CCTGTCTCCA TGTGCAGCTC TGGGC'rGGGA GCCAGGGGCC CCG'rCACAGG CCTGGTCCAA TGACTGCCTG GAGCTCACGT TCTCTTACTT GTAAAATCAG GAGTTTGTGC TAAAGCAGAA GGGATTTAAA TTAGATGGAA ACACTACCAC TAGCCTCCTT TCTGATTCTC TCTCTCTTTT TTTTTTCTTT TTTGAGATGG AGTCTCACTC AGTGGCATAA TCTTGGCTCA CTGCAACCTC CACCTCCTGG GTTTAAGCGA AAGTAGCTGG GATTACAGGC ACCTGCCACC ACGCCTGGCT AATTTTTGTA TCACCATGTT GGCCAGGCTG GTCTCGAACT CATGACCTCA GGTGATCCAC GCTGGGTTTA CAGGCTAAGC *CACCGTGCCC AGCCCCCGAT TCTCTTTTAA 
TCTTCAATCT ATTGGATTTA GGTCATGAGA GGATAAAATC CCACCCACTT ACCTGTGCAG GGAGCACCTG GGGATAGGAG AGTTCCACCA TGAGCTAACT TGGCTGTGAG ATTTTGTCTG CAATGTTCGG CTGATGAGAG TGTGAGATTG TGCATCAGTG AGGGACGGGA GCGCTGGTCT GGGAGATGCC AGCCTGGCTG CTTCTCCGTG TCCCGCCCAG GCTGACTGTG GAGGGCTTTA GTCAGAAGAT TGCACACTCG AGTCCCTGGG GGGCCTTGTG ACACCCCATG CCCCAAATCA GGCAGCAGAC CTCGTCAGAG. GTAACACAGC CTCTGGGCTG. GGGACCCCGA CTTGCATCTG GGGGAGGGTC AGGGCTTTCC CTGTGGGAAC AAGTTAATAC TTTACACGTA TTTAATGGTG .TGCGACCCAA .CATGGTCATT TGACCAGTAT GGTGACCGGA AGGAGCAGAC AGACGTGGTG GTCCCCAAGA*TGCTCCTTGT TGCCTGGGGG GCCTTGGAGG CCCCTCCTCC CTGGACAGGG TACCGTGCCT CGGCCTGCGG TCAGGGCACC AGCTCCGGAG CACCCGCGGC CCCAGTGTCC *CCACAGATGC CCAGGTCCAG GTGTGGCCGC TCCAGCCCCC GTGCCCCCAT GCCAAGGGCA GAGGTGTCAG GAGACTGGTG GGCTCATGAG AGCTGATTCT TGAGCAGCCT CTCCCGCCCT CTCCATCTGA AGGGATGTGG CTCTTTCTAC CAGCCTTGGG CTACCCCAGT GGCTGTACCA GAGGGACAGG CATCCTGTGT GGCCCCAGAT GCAGCCTGGG ACCAGGCTCC CTGGTGCTGA TGGTGGGACA CCGGACTGGG CGTCCCCAGG GTTGACTATA GGACCAGGTG TCCAGGTGCC AGAGGCGTCT GGCTGGCATG GGTGGACGTG GCCCCGGGCA TGGCCTTCAG CTGAGCCCTC ACTGAGTCGG TGGGGGCTTG TGGCTTCCCG TGAGCTTCCC AGCAAGCCTC CTGAGGGGCT, CTCTATTGCA GACAGCACTT GAAGAGGGTG AGCAGAGGTC AGGCAGCATC GGGAAGCCAG GCCCGCCCTG CTGACGTCCA CCTGACGGGC TGCGGCCGAT TGTGAACATG GACTACGTCG TGGGAGCCAG GGGTGGCTGT GCTTTGGTTT AACTTCCTTT TTAAACAGAA GTGCGTTTGA TTAGATGAAG GGCCCGGAGG AGGGGCCACG GGACACAGCC AGGGCCATGG GCGCACAGTG AGGTGGCCGA GGTGCCGGTG CCTCCAGAAA AGCAGCGTGG GGGCAGGGAC AGGCTCTGAG GACCACAAGA AGCAGCCGGG CCAGGGCCTG CCTGGATCCG TGTCCTGCTG TGGTGCGCAG CCTCCGTGCG CTTCCGCTTA ACGACTGCCA GGAGCCCACC GGGCTCTGAG GATCCTGGAC CTTGCCCCAC
GGGTGGTTTT
GCTCCTTGGC
CTGGGGGTCC
GGAGGGGCAT
GTCACCCTGG
CTGCAAGTAG
CGTGTGCTGC
CCTAGTCTGT
CAGCTGCGGG
GACTCCGCTT
AACGTTCCGC
GCCCCACATT
CACGGCGCCA
GGGTGTAGGG
GATGCAGCAC
CGGGGCCCGG
GGCTCCTGCA
AGGTGGACAG
GGTCTGAGGA
TCTCCCCTGG
CCCCGCCAGG
GGCGCCCCGG
GCTGCGTGTG
GAGCAGCCCT
GGGTGGGCCG
TTCAGCCTTT
GTCCCACTGG
CACCCCAGTC
GGGTCTGGAG
ATGTCTGAGT
GTACGACACC
TGCGTGCGTC
TAAGGTTCAC
GTGTCTGTGA
CGTGTGTGTC
CCATGGTGTG
CATGTCTGTG
GGTGTGTGTG
CGGGTGCTGG
GGGCTGGGCC
CCCTGGCACC
TCGCTCCCCG
TGGGCTTGGG
GGGTGCAGAG
CCTCTTTCTC
GGGGGAAAAG
TGAGCTGCCC
TGCCTGGGGC
GGGTTCACGT
GGGTTGACCG
AGGGGCTCTC
CGTGGGTCC
TGTCTGGCTG
AGCTGTCGGA
CATCCCCAAG
AGAGAAAAGA
TGGTATCAGC
ACCCATTTGT
GGAGCTCCTG
GGCCCGAGGT
GGACCAGGCC
CCCCACCCCT
AGGTGTGGCA
AGCTGGGAGG
GTCCCTATGG
CCGAGCGTCT
CCTCCTGGGC
CGGGCCCAGG
GCTGGACCTT
CAGGGAGTGC
CCTGCAGCAC
ATTCCAGTTT
CTGAGCCAGG
TGGTGGGGGT
TTCTGCGTGG
ATCCCCCAGG
GGTATGCCGT
GTGTGATAGT
TGCGTTTCTG
GTGGTGCATG
TGTGCCTGTG
ATGTGCCTAT
GCCCCTTGGC
TTTGGGGAGC
TTGGAGACTG
CCCAGGACCC
GGACACACTC
21560 21630 21700 21770 21840 21910 21980 22050 22120 22190 22260 22330 22400 22470 22540 22610 22680 22750 22820 22890 22960 23030 23100 23170 23240 23310 23380 23450 23520 23590 23660 23730 23800 23870 23940 24010 24080 24150 24220 24290 24360 24430 24500 24570 24640 24710 24780 24850 24920 24990 25060 25130 25200 25270 25340 25410 25480 25550 25620 25690 25760 25830 25900 25970 26040 26110 26180 26250 26320 26390 26414
GTGGCTGCGG
TGAGGATCCC
GGTTCTAGGT
TGGGGTGGGC
CACCTCGAGG
GCCTCTGTGC
ACCCGCCGCC
GGGAGTGGCT
AGGTGACCCT
ATGGGGCCGA
CCGTCAGAGA
GGTCTCCTGT
CAGAGAGAGA
CCACTGTCAG
ACAGGCTCAC
GGTCCAGAAG
CGTGTCCAGG
TGGTGGAGGT
TATCTGTGGC
GTGTGCATGT
TTGTGGTGTG
CTTACTCCTT
TCCACATTCA
TAAGCCAGGT
CAGTCTGGCC
CTCCCAGAGC
TGGCTGCGGT GACCCCGTCA GTGTGCAACA CACATGCGGC CCCGGGTCTG GGTGGCTGGG ACTTGGCCGG ATCCACTTTC GTGAAGGCAC TGTTCAGCGT TGGGCCTGGA CGATATCCAC TGAGCTGTAC TTTGTCAAGG
GCCTGATTGG CACCTCATGT
GTCACTGTTG AGGACACACC CTGTGCACCC TGACTGCCCG AGGAACCGCA ACGGCTCAGC CCTGAGGCTC AGAGAGGGGA GTGGGGGACA CCGCCAGGCC TCTCCTCGCC TCCACTCACA GGAGGTCATC GCCAGCATCA GCCGCCCATG GGCACGTCCG ATGTGTGTCT CTGGGATATG ACTTCCATGA TTTACACATC GTGCATATTT GTGGTGTGTG GTGTGTGTCT GTGACACGTG TGTGTGCATG TGTCCGTGAC CCTCCTCCAG GCATGGTCCG GGGTCCTCAC TTCTAGCATG TTGAGAGGAG AGTAGGGATG TATGCCGGCT CCATGAGATA GGCCGGGGGC CTTGGGGCTC GCACGCTGGA GGGGTAAGCC CTTCGGTCTG GGGAGAGGCA
TCTGAGGAGA
CAGGAACCCG
GACACTGGGG
CTGACTGTCT
GCTCAACTAC
AGGGCCTGGC
TGGGTGCCGG
TGGGTGGAGG
GTGTGGGGTG
TTTCAAACAG
AGGGGCTGCT
CCCATGCTGT
GAGCGGGCGC
GCACCTTCGT
GGACCCCCGT
AGGTACTCCT
TGGCACCTAG GGTGGAGGCC GGCTCCTATT CCCAAGGAGG
CACCAGGCCC CGGTGCCTTG
CACAGCCCGC
AGGCCCTGAG
CAGGTGGATG
TCAAACCCCA
CAAGGCCTTC
AATGTGTCTA
TGTGATATGC
TGTGTGTGGC
CATGTTCATG
ATATGCGTGT
CACCATTGTC
GGTGCCCCTG
CTGGTGGTAC
TAGGAAGGCT
GGCAGGGGTG
CTCAAAGTCG
CATGTGGAAA
CCTGCCCTTG
GGCAGAGGTG
TGACGGGCGC
GAACACGTAC
AAGAGCCACG
GAATGCAGTC
GTGTGTGGCA
ACGTGTGTGT
CTGTGTGCTG
CTATGGCATG
CTCACGCTCT
TCCTGTCACA
CTTCCTGGAC
GATTCAGGCC
AAAGGGGCCC
TGCCAGGCCG
CCCACAAGGA
TTCCCACCCA GTGGTCATGA GTGAAGAAGT ATCCCTGGAG TGACTTCTTG AGCT
Contig 2:
TGTGGGATTG GTTTTCATGT AGAGTTCAAG GCGAGCTTTC
ATCTTCCTTT
CTGTTTCTTC
TCCTTGTGTT
GCTTTCTTTC
GGCTGGAGTG
CTCAGCCTCC
AGACGAGGTT
CTCCCAAAGT
GAGATCTGCA
AGGGGTCTTT
GCGTAATTGG
TCCGGCTCCT
TGCCGTTTTC
AGGAACCCGG
TCTCAGATCA
TCTAGATTCT
CATCCCCTTC
TTGGGGACCC
TGAGTCCAGA
GTGTTGCCTG
TTCTCAGGGT
AACATGCTGA
TGTTAGTGTG
AATGACAAGC
ACTCTTCTCC
CCGGGAGCTC
TGACTGTGGA
TGTGGTGACT
TGTGGTGACT
TGGTGACTGT
TGGTGACTGT
TGTGGTGACT
TGTGTGGTGA
TGTGTGGTGA
GGTCTGATGT
GTCGTGGGGT
GATGGCGATC
GACTGTGGAT
GACTGTGGAT
GACTGTGGAT
GTGGTGACTG
GTGTGGTGAC
ATGTGTGGTG
ATGTGTGGTG
GGGTCTGATG
GGGTCTGATG
GTCACAGGGG
GCGGTCGTGG
TGGCGGTCGT
GCAGGTGGAG
GAAGCTTCCC
GGAAGAAACA
AGGTCTCTAC
GCTGAGGGAT
GTGGGTCCTC
GAATGAGCTG
CGACAGCTGC
TGTTCATGAT
TTTAGGAGGG
AGGAAACGTG
TGCTGGGTCT
TGCAGATGCC
CTGGGCGAAZT
CCCGGCAGCT
CCCAGCTGGA
CTGCAGAGAC
CACTGACCCC
CCCATGTGTC
TTCCTACGCT
CCCTGCGGGT
CTGCCCGGGG
GAACTATGGT
CACTCTGGGG
CATTGGCCTG
TTTCTTTTTT
GTTTGGCGTG
CAAGTAGCTG
TCTCCATGTT
GCTGGGATGA
GCGATAGCTG
CCATTTCATG
TGTCTGCTGT
TGAAGGAAAA
CCTCTAAAGC
CGCACAGCGG
GCAGTGGCAT
GTGCTCCTTA
CCCACTGCTG
TGTGCTAAAG
ATAATTACGG
CTCCTGGGTT
TGAATCGTAC
AGCACAGAGT
TGTCACGTGC
GTCCTGGGGG
TGCCTGTGCT
GAGTGCCACT
TGGCGGTTGG
GTGGATGGCG
GTGGATGGCG
GGATGGCAGT
GGATGGCAGT
GTGGATGGCG
CTGTGGATGG
CTGTGGATGG
GTGGTGACTG
CTGATGTGTG
GGTCACAGGG
GGCGGTCGTG
GGCGGTCGTG
GGCGGTTGGT
GTGGGATAGG TGGGGATCTG TTCCTGTAGT GGGTCTGCAG CGGGTTTATA GTAAGTCAGG TCGTGTGGTG CCTGCTGTGG GATGTGGCCC TGGCTACGCT TCTTTCTTTT TTTTTTTTTT ATCTTGGCTC ACTGCAACCT GAATTATAGG CGCCCACCAC GGCCAGGCTG GTCTCGAACT CAGGTGTGAA CCGCCGCGCC CCTGCAGCCT TGGTGCTGAC ACTCTCTTCA CAGAAGAGTT TTATCGATGG CCTCCTTCCA GTTTCGATTA TGGATGTTTG AGGGATCCCG AGGCCCCTGG GAGGCTAGGT GGGGTGTGGG GCGGTGCTCA GAGGCGCACA TGGGAATCTA ATGCCTGATG TCCTGTGGAA AAATCGTCTT ACCTGCTTCA GCAGCCTCTC ATTTCTGTGA TGCTTTCCGC GGGAAGGGTG CAGGCCCCAT TCGATGTGGT TTTAGCCCAC CACCGTGCGC GTCTTTTGAT CTGCTCACAT CCTGTCTTGG AGTCTGCAGA ATAGGAGGTG GTGGCTGCAC CTGCATCCCT TGTGCCACGT GACTGTGGAT TCACAGGGGT CTGATGTGTG GTCGTGGGGT CTGATGTGTG GTCGTGGGGT CTGATGTGGT CGTGGGGTCT GATGTGTGGT CGTGGGGTCT GATGTGTGGT GTCGTGGGGT CTGATGTGTG CGGTCGTGGG GTCTGATGTG TGATCGGTCA CAGGGGTCTG TGGATGGTGA TCGGTCACAG GTGACTGTGG ATGGCGGTTG GTCTGATGTG TGGTGACTGT GGGTCTGATG TGTGGTGACT GGGTCTGATG TGGTGACTGT CCCGGGGGTC TGATGTGTGG
TGGGATTGGT
GTGCTCCAAC
GGTGTGGAGG
TGTGTGGCCG
CCGTCCTTGG
TGATAACAGA
GTGCTTCCTG
CATGCTGACT
CCTGACCTCA
CGGCCGAGAC
AACCTCCGTT
TCACGTGTGC
TTTCCTTTAG
AACTTTCTTT
CTGTGGAGTG
GAGCCAGCGT
CACCCTACTG
ATCTGAGGTG
CCACGAAACC
GTCAGTGTTG
CGACCTCAGA
GTACCTTCCT
GGCCCTGCCG
GCCTCACAAG
GGACGCAGGG
GGGGTGCCGG
GCAATCCCTC
GGCAGTCGGT
GTGACTGTGG
GTGACTGTGG
GACTGTGGAT
GACTGTGGAT
GACTGTGGAT
GTGACTGTGG
GTGACTGTGG
ATGTGTGGTG
GGGTCTGATG
GTCCCGGGGG
GGATGGCGGT
GTGGATGGCG
GGATGGCGGT
TTTTATGAGT
AGCTTTATTG
CCTCCCCTGG
GTGGGCAGGG
AATTCCCCTG
GTCTCGCTCT
AGTTCAAGCA
AATTTTTGTA
GGTGATCCTC
TCGCTTCCTG
TTCCTTCTCC
TGATTTCCCG
GCTTTGTTTA
TCTAAACAAG
GCACCGGTCT
TCCCGCCTGA
AGAACTGTGC
GAACCGTTTG
AGTCCCTGGT
ATATATTGGC
CCCATGGGCT
GTTACTGCCT
CCAGCTCCTG
CTCGAGGCCT
GCTTAGCAGG
TCTCTCTCCC
CAGCACTGGG
CACGGGGGTC
ATGGCGGTCG
ATGGCGGTCG
GGCGGTCGTG
GGCGGTCGTG
GGCGGTCGTG
ATGGCGGTCG
ATGGCGGTCG
ACTGTGGATG
TGTGGTGACT
TCTGATGTGT
CGTGGGGTCT
GTCGTGGGGT
CGTGGGGTCT
GGGGTAACAC
AGGAGACCAT
GCTCCCTGTT
CTTCCAGGCC
CGAGTTGGAG
TTTTTGCCCA
ATTCTCTTGC
ATTTTAGTAG
CCACCTCGGC
CAGCTTCCGT
AGGTCTCGCT
GCTGTTTCCT
TTGTTGTTTT
CATCTGAAGT
GGGGCCTGTT
GCCCCGCCCC
GTGAGAGGGG
CTCCCAAAAC
ACCACAATGG
TTTTCTGTGT
ATTTGTGGGC
TCCAGGTTGG
GGGGCTGGGG
CCTGTGTCCG
TCCCGTAGTA
GCGTCTTCAG
CTGGAGAGGC
TGATGTGTGG
TGGGGTCTGA
TGGGGTCTGA
GGGTCTGATG
GGGTCTGATG
GGGTCTGATG
TGGGGTCTGA
TGGGGTCTGA
GCGGTCGTGG
GTGGATGGCG
GGTGACTGTG
GATGTGTGGT
CTGATGTGGT
GATGTGTGGT
GGGGTCTGAT
GGGGTCTGAT
140 210 280 350 420 490 560 630 700 770 840 910 980 1050 1120 1190 1260 1330 1400 1470 1540 1610 1680 1750 1820 1890 1960 2030 2100 2170 2240 2310 2380 2450 2520 2590 2660 2730 2800 2870 2940 3010 3080 3150 3220 3290 3360 3430 3500 3570 3640 3710 3780 3850 3920 3990 4060
TGGATGGCAG
TGTGGATGGC
ACTGTGGATG
ACTGTGGATG
TGTGGTGACT
TGTGGTGACT
TCTGATGTGT
GGTCTGATGT
GGGGTCTGAT
TCCCAGGTGT
AGGCGCTCTC
AGTGCCCAGC
CTTGACAGAC
GCCGTCGTCA
CCCTGGGCAA
TGATGGGGGC
TGCATTCAGG
TTGCTAAATG
CAGGCCATGT
TCCCCCTTCT
GTCCACGTGG
TGTTAGCACT
TTCCTTGGCT
GGGCAGCAAC
CCCACAGGTG
TCGGCCCGGC
AAAAGGGACG
TCCCGTCTGC
TCATGTGCCA
GGCTGGGCGG
CCACCAGAGT
TCGTGGGGTC TGATGTGTGG TGACTGTGGA TGGCGGTCGT TGACTGTGGA TGGCGGTCGT GGTCGTGGGG TCTGATGTGT GGTGACTGTG GATGGCGGTC GTGGGGTCTG GCGGTCGTGG GGTCTGATGT GGTGACTGTG GATGGCGGTC GTGGGGTCTG GTGATCGGTC ACAGGGGTCT GATGTGTGGT GACTGTGGAT GGCGGTCGTG GTGGATGGCG GTCGTGGGGT CTGATGTGGT GACTGTGGAT GGCGGTCGTG GTGGATGGCG GTCGTAGGGT CTGATGTGTG GTGACTGTGG ATGGCAGTCG GGTGACTGTG GATGGCGGTC GTGGGGTCTG ATGTGTGGTG ACTGTGGATG GTGGTGACTG TGGATGGCGG TCGTGGGGTC TGATGTGTGG TGACTGTGGA GTGGTGACTG TGGATGGTGA TCGGTCACAG GGGTCTGATG TGTGGTAGCT GTCTGTAGCT ACTTTGCGTC CTCGGCCCCC CGGCCCCCGT TTCCCAAACA TGGGCTTCAT CCCGCCATCG GGCTTGGCCG CAGGTCCACA CGTCCTGATC TCTGGCCGGG GCAGGCCACA TTTGTGGCTC ATGCCCTCTC CTCTGCCGGC CTCCAGCCGT ACATGCGACA GTTCGTGGCT CACCTGCAGG AGACCAGCCC TCGAGCAGGT CTGGGCACTG CCCTGCAGGG TTGGGCACGG ACTCCCAGCA TCACTGGGCT CATGACCGGA CAGACTGTTG GCCCTGGGGG GCAGTGGGGG ATGATGAGCT GTGTGCCTTG GCGAAATCTG AGCTGGGCCA TGCCAGGCTG CACCTGCTCA CGTTTGACTG TCTTCTCTGC CAGTTTTGAT TTGAGCCGTG TCCTGCCCAG TAGGAGGACG GGCCGTGTTT CCCTGTGGCC CTTTGCAGAT TGCTCGGCTC TAGGGGACAG CCCAGGGTGG GGGTGGAGGT TCCTGGATCA CATATGCCAT GCCCAGAGGA GACGTTCTGT CAGCCCACGA TGGCCCTGCA GAGGGTCTTG GCCACGTGGT TTTCGCAGAG CTCCTCCCTG CCACGCCGTG CGCATCAGGG GCTGGCAGGG CTTCTGCTCA CTCCTTTTCT GGCCCCCGCC GCTTGCGGCA GCCGGAGCGG CCTGCGTGGA GGACGAGGGG CGCGGCCTCT CTCCAGTTCC GCAGTGCCTT 4130 CTTGAGGCCA AAGGAAAGGT CTGGCCCCTC AGTGCTGGGT
GAGCCACGCC
GTGGTCTGTC
TCGTGTCCAC
GGCCTGGGCT
CCGGGCCACG
GTCACACACT
TTCCAGCCCA
CCTGCCTGTC
AATGAGGCCA
GCAAGTGAGT
CCTCTCTCCT
CCCTCCGGCT
AGCAGGTGCC
CGGGGGGTGT
CCGCTGAGCG
CACGTGGCCC
CGCATGAGGC
GCTGGGACCC
GTGGGCTGTG
CTGCCTAAGC
GCCCCGCACT
TCAGCACCCA
GCAGTGGCCT
CAGGTGGCCA
GCCCCTTCCC
CCTGGGCTGC
ACACGAGGCC
GTCTGGGTCA
GTCCCCCTCC
CTGAGGCCAA
GGCCTCTCAG
TGTGGCTCTT
TCAGAGACCT
AGACCCTGTG
TGGGTGTGAG
CCATGTGTGT
TCATCACAAA
CCGGCTCACT
CTTCGACGTC
GGTGCCATTG
CACTGNCCTT
AGGCTCCCGA
TGGAAATGGC
GGTGTGCGCC
4200 4270 4340 4410 4480 4550 4620 4690 4760 4830 4900 4970 5040 5110 5180 5250 GGCCCCGGAA ACATGGCTCG AAGCGGGGTG TGGAGTTGCT
GAGCGTTTGA
GTATCTCTCT
GAGGGCGCGG
CTCCTCTCGG
GTGAGCCACA
ACCACACACC
AATTCGTGCA
GATCGTCTCC
ATTGCATTAT
AGTAGTACAC
TGTGCACATG
ACCACGGGCC
GGGCTTTGGG
CTGTCCCTAG
ACGTCTTCAA
CGTATCTGCT
AGTGCAGTCC
TTTTTTTTTT
TGCAACCTCC
CCCACCCCCT
GTCTCGAACT
CCATCACGCC
TGTTTTTGCA
AGGGTCGCGT
TGTCTGAAAA
GGGCAAGGTT
CCTTGAAAAA
TCTAAGATTT
GCTGGATCTG
ATCTGGTATG
TCTGTGTGTG
CCGCCCCAGG
CTGTGCTACG
CCAGGGGGGC
ACCATGACTG
GGGTTGCACA
GCTCAGCAGG
CTTCTGTCAC
ATCTCCCAGC
ATGGGGTCTA
TTTTATTGAC
GTTAGTTACA
TATCTCCTAA
CTGTGTCCAA
CTTGCAATAG
CTTTTTTATG
GGACATTTGG
CTTTATAGCA
TTCTAGTTCT
CAACAGTGTA
ACTGAACAAG
GACTCCGGGT
TCTTCTTCAT
CACAACTTGC
GGATACCTCT
CTTCCTGTGG
ACGTGGCTCA
TCTGAGAAGT
GTAAACACTG
AAAGAAATTT
GACATTCAGT
ATTTTAGGCT
CTTCCTCAGG
CCCCGTGTCC
GCAAGAGCAG
ACCGTGCAGG
CTCCTGCTGC
ACAGGGCCTG
GCTGAGTGAG
GAATTTGGAT
TTTGTGAATC
CCTCAAGGC
AAAGCTGTAA
TTGTGAAAAC
TGGGGGTATG
AGATTCACTC
AGAGCAGACG
TGTATGTTGG
GCCTGCAGCT TGTCAGCTCC CTCCCGATAC AAAAGGA'TT GCTCTTCTCT CTGTGACTAG GGGGCCTGTG GTGGCCATGG CTCACGGTGG TAGAGCCACA TCCCGGCAGG CATCTGCCTG CACTCAAGGT CATCAGCAAG AGCGGATAAA GGACTGTGCA AAGTAATCAC TAATGGTATC ACGTTCTGGA AAAACACAAA TGTGTAAGCG GCCCCCAGGC TCCTTCGTGG TCGTGAATTT GAATGTGAGG TGATGACTGC GACACGGACA GGCCCGAAGC AACCTGTTGC CCCAAAAACT TGCGTTGACT CGCTGGGCTG TCTTGCCCAT CACTGTGATA GAGACGGAAC GTCACTGTTG GCCTCCCGGG TTCCAGCATT GCGCCTGGCT AATTTTTGTA CCTGACCTCA GGTGATCCAC CAGCCGGAAA GCCTCTTTTT GTAGGCTGCA AGCGTCTCTT GGCAGCCATG CCTTCTGTGT CGCACCCTTG GCATCCTTGT .GTATCCGTTG GCGCGCAGCG AAAAAAAGGA GTCCGGTTAA AAGAAACCTT AATGAAAGAA TTTCAGCCGC CCCAGTGCAT TTTCTGAGGT GTTTGCCGGC GCTCGGTTTG AGTGTACGCA TCCTACGTCC AGTGCCAGGG GCGACATGGA GAACAAGCTG TTGGGTGGGG GTTGATTTGC CTCTGTCTTG AGGAACCAGA GCCTGAGGAC TGCGGGCTCC CGGGAGGGCC GCTGCCCTGC GTCACCCAGG TTCCGTTAGG AGGCCCTCGA CAGGTGGCCT CACCCAAGGA CGCACACACC AGCAGTTACT TTTTTTTTTT TATGTATACA TGTGCCATGT TGCTATCCCT CCCCACTCCC GTGTTCTCAT TGTTCAGTTC TTTGCTCAGA GTGATGGTTT ACTGCATAGT ATTCCGTGGT GTTGGTTGCA AGTCTTTGCT GCATGATTTA TAATCCTTTG AGATCCTTGA GGAATCACCA AAAGTGTTCT GGTGCTGGAG CAGACAGTTA GTGAAGGATG TTTTCCTGTG CATCTTTTGA TAAGGTTCAA GTTCTAGATT TCTGGGATTT GGAGGAAAGT GGCCCATGGT CATGGGGCC ATAGGATCTG GGTCTCGGAT GAGGGGGCGA GGTTCCCAGC GGGGCCGCGC CTGATGGCCT AGTACTTATA ATGAATGAGG AAGTTTTTCA TTTAACCGCT CCCTCGTAGA CAGATACTAC GCTCCTGCGT TTGGTGGATG TGAGGCCCGT GCCGTGTGTC TGCCCCTGGC ACCGCAGCGT AGGCGCTTGG CCGTGCACCC CCCTGGTCCT GCAGAGACGC TCTTTGGAAA GTCAAGAGTG CAGGGCCGAG GCGGCAGCCT CTGGCCCACA GCGTTCGCTG TTGCTGAGTG CTGC'rGTCTT AAACTAAAAT CAGGCACAGG GCCCCACAGA GCCGGTGGGC AGGGAACCCT CAGAAAATGT CCATTTGGAC CCGCCCTCCA CCTGGCGTTC CTTGTGCCGC GGGGGGAGCC CAGGTCCCAA GGGAACGCTG CTTCTGTGTG GGTCCCACCG GGGGCAGAAC AAGTTACTAC TGACGCTGGA CACCCGGCTC TCACACGCTT TATCCGATTC TCATTCCTGT CCCTGTCGTG TGACCCCCGC ATTTCCCATC TGGAAAGTGC GGGGTTGACC GTGTAGTTTG GGCAGGCGGC CTGGGAGAGC TGCCGTCACA CAGCCACTGG GTGCCTGGTG CCACATCACG TCCTCTGGAT TTTAAGTAALA CGACCCTGTG TGTGCCTGGG GAGAGTGGTA GCACGGAGGA GTCATCCGCA 
GTCAGGTGGA ACGTGGAGGC CTCTCTCTGG CAGCTTCGGA AGCTTTTATT TAAAAATATA ACTATTAATT AGCAATTATA ATATTTATTA AAGTATAATT AGAAATATTA TTGCACATGG CAGCAGAGTG AATTTTGGCC GAGGGACACG CCACAGAATT CGCTGACAAA GTCACCTCCC CAGAGAAGCC TATTAAGATG GATCAAGTCA CGTACCGTCC ACGTGTGGCA GTCCTCATGC CCTGACAGAC AGGAGGTGAC TGTGTCTGTC TCTAGTCCCC ATCGTGGTCC AGTTTGGCCT CTGAATAAAA AAGAACAGAG AGAGTTTCCC ATCCCATGTG CTCACAGGGG GCCGGACTCC TAGAGTTGGT GCGTGTGCTT CTGTGCAAAA TCTGCACCAG CAAGGAAAGC CTCTTTTCTT TTCTTTCTTT TCTGCCTGGG CTTGAGTGCA GTGGCGCGAT CTCAACTCAC TCTCCTGCCT CAGCCTCCCG AGCAGCTGAG. ATTACAGGCA TTTTTAGTAG AGAGGGGTTT'.T.TGCCATGTT GGCCAGGCTG CCACCTCGGC 'CTCCCAAAGT GCTGGGATTA CAGGTGTGAG AAGGTGACCA. CCTATAGCGC TTCCCGAAAA TAACAGGTCT AGCAACAGGA GTGGCGTCCT 'GTGGGCTCTG GGGATGGCTG GCACCTTTAG GTTCCACGGG GCTATTCTGC TCTCACTGTT TTGGAGAGTT TCTGCTTCTC GTTGGTCATG CTGAAACTAG GCTACATGTA GGGTCATGAG TCTTTCACCG TGGACAAATT GCATTCATTC CGGGTCAAGT GTCTGGTTCT GTGAATAAAC AACCTTGATG ATTCAGAGCA AGGATGTGGT CACACCTGTG GGTGAGAGTG GGGAGCAGGG ATTGTTTGTT CAGAGGTCTC TGAATGGTAG ACGTGTCGTT TGTGTGTATG AGGTTCTGTG TGTCCAGCAC ATGCCCTGCC CGTCTCTCAC CTGTGTCTTC GATCCCGCAG GGCTCCATCC TCTCCACGCT GCTCTGCAGC TTTGCGGGGA TTCGGCGGGA CGGGTGAGGC CTCCTCTTCC TTTTGATGCA TTCAGTGTTA ATATTCCTGG TGCTCTGGAG CAAGGTTGCA GCCCCTTCTT GGTATGAAGC CGCACGGGAG ACGCAGGCTC TGTCCAGCGG CCATGTCCAG AGGCCTCAGG ATGATGAGCA TGTGAATTCA ACACCGAGGA AGCACACCAG GTCCTTGGGG AGATGGGGCT GGTGCAGCCT GAGGCCCCAC GGACTGGGCG CCTCTTCAGC CCATTGCCCA TCCCAC'rTGC TAAATATCGT GCCAACCTAA TGTGGTTCAA CTCAGCTGGC TAATACTTTA AGTTCTAGGG *TACATGTGCA CGACGTGCAG TGGTGTGCTG CACCCATTAA CTCATCATTT ACATTAGGTA CCCATCCCAT GACAGGCCCT GGTGTGTGAT GTTCCCCACC CCACCTGTGA GTGAGAACAT GTGGTGTTTG GTTTTCTTTC CCAGCTTCGT CCATGTCCCT ACAAAGGACA TGAACTCATC GTATATGTGC CACATTTTCT TAATCCAGTC TATCATCGAT ACTGTGAATA GTGCCGCAAT AAACATACGT GTGCATGTGT GGTATATACC CAGTAATGGG ATGGCTGGGT CAAATGGTAT CACTGTCTTC CACAATGGTT GAACTAGTTT ACACTCCCAC AGGATGTGGA ,CAGCAGTTAT. 
TTTTTTATGA AAATAGTATC CGTCAGGAAG CCTGCAGGCC ACACAGCCAT .TTCTCTCGAA AACTCTAGCT' CCAATTATAG. CATGTACAGT GGATCAAGGT GAAATAAGTT TATGTAACAG AAACAAAAAT ;TTCTTGTACA GTCCTCGAGC TGGCGGCACA CTGGTCAGCC CTCTGGGACA TGGGCTTGGG CCTGAGGGTC ACACAGTGCA CCATGCCCAG CATGCTGAGG ACCACAGCTG CCATGCTGGT AAAGGGCACC CCCAGCTTTC TTACCGTCTT CAGTTATTTT TCCCTAAGAG TCGTTCGTCT TCAGCTGGCA CAGAATTGCA CAAGCTGATG AATTGCTGTA GCAGTTAACT GTAGAGAGCT CGTCTGTTGG TTGGAGAATG TTACTTTATT TATGGCTGTG TAAATTGTTT GTAAAAAGTG TAAAGTTAAC CTTGCTGTGT ATTTTCCCTT ATTTCTTGTT GGTGACACCT CACCTCACCC ACGCGAAAAC TGTGGGGACC TCCACAGCCT GTGGGCTTTG CAGTTGAGCC TGTCTCTGCC AAGTCCTCTC TCTCTGCCGG TGCTGGATCC AGGCCTGGGG GCGCAGGGGC ACCTTCGGGA GGGAGTGGGT ACCCAGGTTA CACACGTGGT GAGTGCAGGC GGTGACCTGG GCGGCTCCTG GGGCCCCAGT GAGACCCCCA GGAGCTGTGC CCTCCCCAGG GTGCACCTGA GCCTGCGGAG AGCAGGAGCT CGGTCACGTT CCTGCGTGGG GTTGTTTGGG ATCGGTGGGA GAACCACGGA GATGGCTAGG AGTGGGTTTC AGAGTTGATT GGACCTGGCC TCAGCACAGG GGATTGTCCA ATGTGGTCCC TTGTTTTAAA GTGCGATTTG ACGAGGGACG AGAAACCTTG GGCCGCCAGG GGTGGTTTCA GGTGCTTTGC TGGGCTGTGT AGTCCACCCT CCAGGTCCAC CCTCCAGGGC CGCCCTGGGC AGCCCGGAGC ACAGCAGGCT GTGCACATTT AAATCCACTA GCAACTGAGG GCTCAGGAGT CCTGAGGCTG CTGAGGGGAC GCAAGTTCCT GAGGGTGCTG GCCAGGGAGG TGGCTCAGAG TCTGTCTCTG ATGAGTCGGC AGCCATGTAA cAGGAAGGGG 5320 5390 5460 5530 5600 5670 5740 5810 5880 5950 6020 6090 6160 6230 6300 6370 6440 6510 6580 6650 6720 6790 6860 6930 7000 7070 7140 7210 7280 7350 7420 7490 7560 7630 7700 7770 7840 7910 7980 8050 8120 8190 8260 8330 8400 8470 8540 8610 8680 8750 8820 8890 8960 9030 9100 9170 9240 9310 9380 9450 9520 9590 9660 9730 9800 9870 9940 10010 10080 10150 10220 10290 10360 10430 10500 10570 10640 10710 32 TGGCCACAGG GAGCTGGGAA GAAGGGCAGG GGGACGCCCG GAGGCTACCG GGCACAGGGG TTGACGTGAA GCTGACGACT CAGAACCCTC CCCTTTGTCT TTGAGAAACG TCTTAAAAGA CAGATGAGTC TATAACGGGA GGGGCTGCAC CTCCCATCTG GTCCTGCCCG GGAGACAGGG CTGCCAGGCC CAGCACCCTG CAGGTTACCT CCTGGGTGAC CGAGGTGTCC CTGAGTATGG AGGCCCTGGG TGGCACGGCT GGATACCCGG ACCCTGGAGG GCCCGGCTGG GGCAGGTGCT 
CCAATCCCAA AGGGTCAGAG CTGTGGGAGT GAGGGTGCTC AGACCTGGGT GCACTGAGGT GGCGTGAGTC TCTCAAACCC GGCCCTGCTG GGCGTGAGTC GTGAGCCCCA CACTCCAAGG TCAGGGGACA GGGCCATGGT GAGCTCAAGG CCCCGTCTCA TGTTTCTTTT ATGAATAAAA ACTTTGGGAG GCCGAGGTGG AATTCCATTT CTACTTAAAA GCGGGAGGCT GAGGCAGGAG CTGCACTCCA GCCTGGGCAA AACCATAGTG GACAGGTGTT AACTGGGGGT GCCTTCCTCT CCAGAGGTTT AAACTGGGGT ATGCTCCCTG GGGTTTGCTT TGCACCAGGG GAGCTGCGCA GGGCCACAGC AGAGGCCGCA GGCTCCCTGA GCTGGGTGAG GGTGTTGCCC AGCTCACAGC AAAGCACAGC AGATGCCrTC AGGTGGGATG GTGGCAATTT TTGTGGTGTT GCCATGGGGA AGTCCTGGCT GTCCCGGGTC AAAGCACCCC GAAGTCTGGA CTCCAAATCA CCACTTCTCT GCTGGCCGAG GTCCCAGGGC CAGGCCACAG GGAAGGGAAG GGGATGCCCA GGCCAGAGCA CGAGGCTCAT GACTCGGCGA GGGAACCTCC CCAGCCAGGT CCCGCGCCTG AGCAGGAACT AGGGCATCTA GGAGAAAACA GGCAAAGTCG CTTGTCCAGA TTTTAGTCTG CCCCGGACCA CACATGAGAT GGACCATCAC AGAGGCCACT
CAGGCCAGGT
GCAGGGCTGG
GGGGTTTTCC
TCTTGCATGC
GTCCAGGCTC
AAAGCATTTA
GGCCCCGCAT CCTGGGGCTG ACATTGCCCC TCTGCCTTAG CTGCGTGGTG AACTTGCGGA AGACAGTGGT GAACTTCCCT TTTGTTCAGA TGCCGGCCCA CGGCCTATTC CCCTGGTGCG TGCAGAGCGA CTACTCCAGG TGAGCGCACC TGGCCGGAAG GCTGCAGGGC CGTTGCGTCC ACCTCTGCTT CCGTGTGGGG GCCACAGGGT GCCCCTCGTC CCATCTGGGG CTGAGCAGAA ACAACGGGAG CAGTTTTCTG TGCTATTTTG GTAAAAGGAA GTCTTCAGAA AGCAGTCTGG ATCCGAACCC AAGACGCCCG GAACACAGGG GCCCTGCTGG GCATGAGTCC CTCTGAACCC TCTCCGAACC CAGAGACTTC AGGGCCCTTT TGGGCGTGAG CTCATCCACA GTCTACAGGA TGCCATGAGT TCATGATCAC GTGGGGGGGG TCTCTACAAA ATTCTGGGGT CTTGTTTCCC GGCTCAGACA CAAATGAATT GAAGATGGAC ACAGATGCAG AGTATCAACA TTCCAGGCAG GGCAAGGTGG CTCACACCTA GTGGATCACT TGAGGCCAGG AGTTTGAGGC CAACCTAACC
TCACCTACCT
CTCAGAGCTC
ACAAGGGTGT
GACCCTGGTC
GTAGAAGACG
GCCTGCTGCT
TGGAGCCTGT
CAGGCGACTG
ATGCATCTTT
ATGGTGCACC
GGCCCTGCTG
GAGACCCTGG
TCTCTCCGCT
GTGTGACCCA
CAGAGCCCGA
AAATCTGTGC
TAATCCCAGC
AACATAGTGA
10780 10850 10920 10990 11060 11130 11200 11270 11340 11410 11480 11550 11620 11690 11760 11830 11900 11970 12040 12110 12180 12250 12320 12390 12460
GCAGACGCCC
TGCAGCTCCC
TACCTGGTCC
GCCTCTCCCA
CCCCATGAAA
TTGAAGGACA
GGCTAGTGCA
AAGGCCACTT
CCATACTCAG
AAGAAAACAG
ATTTTAGTCT
AAACGGAAGC
TGTGTGTAAT
AGCAGATTCT
AGGGAAGGGA
GAGGCAACGG
CTGGAGCGTT
TTCTCCTAAC
GCAAAGGGCA
AAGTCAGACC
TGGACGTCTG
AGACCCATCC
TTTTCTGGGC
TTTGCATGGA
CACGGGTCTT
AAATACAGGG
TTGAAGAATG
ACACCCCAGG
CGTGCACTCA
AGATTCCCCA
GCCCGGGAGC
GATGATGAGT
AGTTTGGTCA
GGCTGCAGCG
GCAGGAGGGG
ACCTCCATCA
TTGGGGTCTT
AGAGTTCAGA
ATGGTGACTG
TGTACATGAA
CCTGACATGC
GTGTGAGTAG
CCAGTGCCAC
CTGAGGGCAT
TCTCCTGTGG
TCATGATGGG
TCCCCACAAG
TGGCCTCCAC
GGCACCTCTG
TGTATTTTTT
AAGGACAGAC
GGATGGGTGG
GGTCAGAGTG
GGTGAACTCA
GCAAAATGAT
CCCAAACCAC
CCTATCTCTC
TTTTTTTTCT
AGGTAGAAGA
GGTGAACGTT
GCATTGCTTT
TGTGCACGTG
CACCTGAGAG
TGCATGATTG
CATAGGCTCA
ATGCACACTT
CTCAAAGAAA
TTGCCAAGAG
AGTCCTCACA
ATTTACCATT
CTAAGGAGAT
TTTAATGGCA
AGCCTGCCGT
CCCTGGAGGA
CTGGAGCCCT
CGCCCAACTC
CAGGGCTCCA
GCACAAACAC
TGCAGAGTCT
CATGCCCCAG
GCTGGGTGTG
GAGCCAGTCT
GCGGCTGAAG
GTTCAGGAGG
GCTGCACGTA
GGCATGGCAG
ATGTGTGTTC
CATGTGTGCA
TCCTTACAGG
TGTCCCATCT
GCATTTACAT
AATACAAAAA TTAGCCTGGC CTGGTGGCAC ACGCCTGTAG TCCCCGCTAT 12530 AATCATTTGA ACCCAGGAGG CAGAGGTTGC AGTGAGCCGA GATCACACCA 12600 CAGAGTGAGA CTTCATCTTA AAAAAAAAAA AAAAAGTATC AGCATTCCAA 12670 TTTTTATTCT'GTCCTTCGAT AATATTTACT GGTGCTGTGC TAGAGGCCGG 12740 GAAAGGCACA CC'TTCATGGG AAGAGAAATA AGTGGTGAAT GGTTGTTAAA 12810 CCTGTCGTTC TGAGTTAACA GTCCAGATCT GGACTTTGCC TCTTTCCAGA 12880 CATGGGGGAG CAGCAGGTGT GGACACCCTC GTGATGGGGG AGCAGCAGGT 12950 GGAGTGGCAG GTGCAGACAC CCTTGTGCAT GGTGCCCAGC ATGTCCCTGT 13020 GATGCCGGTC TCCTGTGCTC CCCACAGTCC CTGCTTCCCT CTCACAGCCT 13090 TGGCTTTGTC TGCATGATTT CCACATTTCC TGGGCTCCCA GCACCTCTTC 13160 CAGTGCTGGC CATACCAGTC AGCTGTGAAC TGTCCACTGC TTATTTTGCT 13230 AGGACAGGCA CCCCTGGTTC CAGCCTCTGG CACAGCATCA GTGAATGTTA 13300 AAACAAATCA GGAAAATGGG TTCTCTCTAA ACACATTGCA AAGCCACAGA 13370 GCATCAGGTC ATCAGATGTG GGTCCAATGC CAGAATATTC TGTGCTCCCA 13440 TGTGCTTGCA GAGGTGGCTC TAAAAGCTCA GCAGTGGAGG CAGTGGTTCG 13510 CATCCTCTGT GTCTGAAGTA TACAGCAGAG GCTTGAAGGG CATCTGGGAG 13580 TAAGAAAAGT GAAAAAGGAA AAGTGGTAAG ATGGGAATTT TCTTGTCCAG 13650 AGCTCAGATG GTAGAATGTG GTCAGAACTG ATGGACAGAA CAATAGAACA 13720 AGAAACGTGT GTTAATGTGG TATGTGGCAC AGCTGATGGA AAAGAGAGTG 13790 GAGAAAACTG ACTGGAAGCA AATAAGTTGT GTCTTTACAG CATATACCAG 13860 GGAGACACAT GCAAACAACA CCAGCAACAG AAATAAAACA AAAGACTCAA 13930 CCCTGGTTTG GTGTTGGGGA AGGACACACA GGGAGGCGGA TGAAACCAGT 14000 CACTGCAGAG AAACTCAGCT TGCCTGAGCC ACAGTGAAAA TGGCCATTCC 14070 ATTTATTTAA GGCGCCCTGT GAGGTCCTGC ACATTCATCC TCTCACTTTG 14140 GTAGAGGAGG AAAGGCTCCA GGGGAGCAGC CGCCCTTGGT -CACCCAGCTG 14210 CAGCCTGGCC TCCTGCTCCG .GGGCCCTTGC .TCTGCCCGAG S GACCCCACAC 14280 GGGTGAGCCG GAGCCCAAGG .TCGTGTTGGG GATGGCTGTG'AAAGAAGAAA 14350 GGGAAGGTCC TACCAGCAGC- GTCAAAGAAA TGCATGTGAA ACTGACAGCG 14420 CGCACGTGAA ACTGATGGCG'AGACCTGTCC 'CCATCCCTCA TGCTGGCTCC 14490 CCAGCATCAG GTTGAGGCAA GCTGGAAAGA CTTTTCTGGA AAGCAGCTTG 14560 ATGTCCTGTG TCTTCCCAGT AATTCCACTT CTGAAGTGAC CAGACATTAT 14630 TCCAGTGTTC CAGGCAGGGG GACTTGCCAC AGCAAGTCAC GAACCTGCCC 14700 ATTATGCATC ACAAAACTTG CTCTGCCATT 
AAACATTTTT CAAAGAATTT 14770 CAAAACGTTT ATTTCAATGT AGCAGTGTTC AAAGCTGGAT GTAAAAGAAC 14840 GAATGTCATG TGTGTTCATC TTTGGACATG GACATACATG GGCAGTGAGT 14910 CATCGGTGGG ATGCCTCCAT CCTGCCCCTC TGGAGACACC ATGTGTGCCA 14980 GTTTAGCTGG TGCCACCTGG CTCTTCCATC CCTGAGATTC AAACACAGTG 15050 AGTGTTCTCC CACAAAAAAC CTGAGTCACA CCTGTGTTCA CTCGAGGGAC 15120 CAGTTTATTA TGTGTTTTTG GCTGAGTTAT GTGCAGATCT CATCAGGGCA 15190 GGCCGTGCGA GGTTTGGATA CACTCAACAT CACTAGCCAG GTCCTGGTGG 15260 GGATGGCATG TAGCATTTGG AGTCCATGGA GTGAGCACCC AGCCCCCTCG 15330 GCAGGACAAG GAAGCGGGAG GAAGGCAGGA GGCTCTTTGG AGCAAGCTTT 15400 GGGCAGGCAC CTGTGTCTGA CATTCCCCCC TGTGTCTCAG CTATGCCCGG 15470 CACCTTCAAC CGCGGCTTCA AGGCTGGGAG GAACATGCGT CGCAAACTCT 15540 TGTCACAGCC TGTTTCTGGA TTTGCAGGTG AGCAGGCTGA TGGTCAGCAC 15610 TGTGTGCGCA AGTATGTGTG TGTGTGTGTG CGCGCGTGCC TGCAAGGCTG 15680 AGAGTGCACA TGTACGCATA TACACGTGAG CACATACATG TGTGCATGTG 15750 TGTGTGCACA GGTGTGCAAG GGCACAAGTG TGTGCACATG CGAATGCACA 15820 GTGCACAGTC GTGTGGGCAT TCACGTGAGG TGCATGCGTG TGGGTGTGCA 15890 CATAACATGT ATTGAGGGGT CCTCGTGTTC ACCCCGCTAG GTCCTCAGCA 15960 ATGAGACGGG GTCCCAGGCC TTGGTGGGCT GAGGCTCTGA AGCTGCAGCC 16030 GGGCATCCGC GTCCACTCCC TCTCCTGTGG GCTTCTGTGT CCACTCCCCC 16100 CCACTCCACT CCCTCTCTCC TGTGGGCATC CGCGTCCACT CCCCCTCTCT 16170 33
GTGGGCATCT
CTTGGCCGAG
GGTGAGGGCC
CATCTGAATG
TTTCTTGGCG
GGCGTACAGG
CTGACCCGGG
TGCTGCCTGT
AGGAGCCGGT
CCAGGGCAGC
CTGAACAGTA
AGGTGAGGGT
CACCTGTGCT
CAAGAATCGA
AGTCAGGGCA
TCCACCTCAA
GCTGAGAAGG
TGAAATGAGG
TACCATGAAA
TGCATGTTAC
GCTCAGACCG
GTGCCTGGGC
CACTGAGGAC
GGGCCATGAT
GCCACCTGCT
GCTCCCGTAG
GTCGCTTGAT
GCGTCCACCT CCCCTCTCTG CCTCGGGGGC AGGCAGATGA
AGGCCGGATT
GATGATAAAG
ACTCTAGGTG
TGAGCCGCCA
GCTTCACCTT
GCACAGTTCT
GTGGCCCCAG
AGGGGCATGG
GATGGGAGAT
CGCTGGCCCC
CTGGGCATGG
CAACTTTATC
GGTGGTGGCA
CAGGCCTCCC
AGTGTGAGCA
TCGTCGTCTA
ATGGTTTTTA
CGCCTTTGCA
CCCTCCTCTC
CCTCGTGCAA
TGGAGGTGTC
GAGGTCAGAG
CCTCCCATAT
AGGGCCTGGG
TGGGTAGCCC
TCACTGGGAA
CAAAAAGTAA
AACAGCCTCC
CCAAGGGGTG
GGAACTCCTG
GTTCGCGTGG
GTGTCCCCAC
GGTAAAGAGA
CAGATGCCCG
ACCCCCGGGA
CTGTGCTCCT
ACAGAGGGAA
CAAGCCTCGG
GAGCCACTGG
TTTGTGTTAC
TCGTGGAAAC
ACCCGAGTGC
CCAGCTCCAG
TGCCTTCTCT
GCTGCTTGAC
TGACACTGTG
GAGTTTTCCC
TCAGCTCAGT
CTCAGGGCAG
TGAGGAGGCC
AATCCCCTTC
AGTGGGCGAG
CTTCCTGGCC
ATTTCTCCCT
GTGGCAACCC
AAATGGGGAA
TGTGATCTCA
TGGGTGAGAT
TGGGCATTTG CGTCCACTCC CTCTCCTGGT TCCTTCCTGT CACAGAGTCT TGACTCGCCC AGGGTGGTTC GCAGCTGCCG GAGGGATAGT TTCTTGTCAA AATGTTCCTC TTTCTTGTTC AAACTTAAAA TCCCAGAGAG GTTTCTACCG TTTCTCACTC AGACGGTGTG CACCAACATC TACAAGATCC TCCTGCTGCA CAGGCCCAGC CTCCAGGGAC CCTCCGCGCT CTGCTCACCT GGTTTTAGGG GCAAGGAATG TCTTACGTTT TCAGTGGTGC CTCTGTGCAA AGCACCTGTT CTCCATCTCT GGGTAGTGGT TGTGCCTGTG CACTGGCCGT GGGACGTCAT GGAGGCCATC TGTTTATGGG GAGTCTTAGC AGAGGAGGCT GGGAAGGTGT GAGGATTTGG GGTCTCAGCA AAGAGGGCCG AGGTGGGTGC AGGTGCAGCA GAGCTGTGGC TCCCCACACA GCCCGGCCAG GGAACGTTCC CTGTCCTGGC TGGTCAGGGG GTGCCCCTGC GGGCCAATCT GTGGAGGCCA CAGGGCCAGC TTCTGCCTGG GGCTGTACCA AAGGGCAGTC GGGCACCACA GGCCCGGGCC GAGCTGAATG CCAGGAGGCC GAAGCCCTCG CCCCATGAGG CCAGGGCCGA GGCTGCGCGA ATTACCGTGC ACACTTGATG CCAGCAAGGG CTCACGGGAG AGTTTTCCAT TACAAGGTCG TTGCGCCTTC ATGCTCTGGC AGGGAGGGCA GAGCCACAGC AGGCTTGGGA CCAGGCTGTC TCAGTTCCAG GGTGCGTCCG CTCTGCCTCA. AATCTTCCCT CGTTTGCATC TCCCTGACGC TCCTTTCCGG AAACCCTTGG-GGTGTGCTGG ATACAGGTGC GTTGACCCCA GGGTCCAGCT GGCGTGCTTG GGGCCTCCTT AGGTGAAAAC TCCTGGGAAA CTCCCAGGGC CATGTGACCT CTTGTCCTCA TTTCCCCACC AGGGTCTCTA GCTCCGAGGA GGCGGCTGAG TTTCCCCACC CATGTGGGGA CCCTTGGGTA GAGATGCGAT GGGCCACGGG CCGTTTCCAA ACACAGAGTC CCTCGAGGCA GGAGTGGGAG AACGGAGAGC TGGGCCCCGA GCTGTGGTGG TCCACGTGGC GCTGGGGGCG GGGTCTGATT CGTGCTGGCC GCGCCTCCAC ACGGGCTTGG GGTGGACGCC TTGGAAGAGA GCCCCTCACC CATGCTAGGT GTTTCCCTCC CGGGACCTTA GGCTTATTTA TTTGTTTAA6A AACATTCTGG AAGACATCCC ACCTCAGCAG AGTTACTGAG AGGCTGAA6AC GGTCATTCCA GAAGTGGCTC AGGAAGTCAG TGAGACCAGG 16240 16310 16380 16450 16520 16590 16660 16730 16800 16870 16940 17010 17080 17150 17220 17290 17360 17430 17500 17570 17640 17710 17780 17850 17920 17990 18060 18130 18200 18270 18340 18410 18480 18550 AGGCACGTGG AAGGCCCAGG TTTCACGGCA GCCAGGCTGC CAAATCCGCT GGGGCTCGGC CCGACCTCTA GCAGGTGGCT TGGGTCAGGA GCGTGGCCGT GCCTGGCTTC CGTTGTTGCT CGGGGTGCTG GCTTGACTGG TACATGGGGG GCTCAGGCAG CATGGGGGGC TCAGGCACTG CGGGGGCTCT GATCACACGC CACACCTGCC CCAAAGTCCC CACACCTGTA GTCCCAGCAC GCCTGAGCAA CATAGTAGAA GCGCCTGTAG 
TTCCAATACT GTGAGCTGAG ATTGCACCAC GAAGACTGAC AAATGCAGTT TCGGTGTCTC GGTGTCAGTG ACCACAGGGG CGGGTGGCTC GAAGGGCAGG ATTCATGATA CCTTCCCGGA ACAGGGGCTA TCCACATGCG TGTTCATACA CTCGCACACA CAAGCACACA AACCCATGCA TGTGCATTCA CTGATTAGGA GGCCTTTCCT CCATTTCATC AGCAAGTTTG GCTACTCCAT CCTGAAAGCC CTGCTGGTGT TAGTGTGTCA GGAAGTGGTT TAACCCAACC TGGAAGGGAC AGGAGCTGTC GCCTGGTCTC TCCTGTTTGC GATTGGGCTG TCTCCCGTCC CCAGGCCCAG GCTACCCCAC CCTCTGCTTC CCAGTCACCG CTGAAATTCA AGCCATGTCG GTGGAAATTT CACCTGGAGA AGCCAGAGAT GGAGCCACCC TGGGCTGGGC CTGTGACTCC GGCCCTCTGC CCTCCGAGGC GTGTCACCTA CGTGCCACTC ACCTGCCCAG GGGTCATCCT GCCCCCGGGC CTGACCCTGG AGGCCACGGA GCCTGGCAGG CACGCTTGGG AGCCTTCTGA TCCTGGGGTC CTGAGCAAGT CCCGGTGGAG GGGTGTCTGT TGCCCGGCCA CCCACACGTC ATTTTGGCCC CGCAGCCCAG CGCAGCCAAC CCGGCACTGC GCCGAGAGCA GACACCAGCA CCAGGCCCGC ACCGCTGGGA GCTGAGTGTC CGGCTGAGGC GAGGTACACG GGGGGCTCAG GCAGTGGGTG AGGCCAGGTA 18620 AGAGGGTCAG ACCAGGTACA 18690 GGTGAGATGA GGTACACGGG GGGCTCAGGC ACATATGAGC ACATGTGCAC ATGTGCTGTT AGGAAGCTGA GAGGCCAAAG ATGGAGGCTG TTTGGGAGGC CGAGGCGAGA GGATCCCTTG CCCCATCTCT ATGAAAAATA AAAACAAAAA TGGGAGGCTG AAGTGGGAGG ATCACTTGAG TGTACTGCAG CCTGGGTGAC AGAGTGAGAG TCTTGGAAAG AAACATTTAG TAGGAACTTA AGATGAGATG ATGGGTCCTC ACACCATCAC AGAAGGGATG CGCAGGACGT AGTACCTGCT GGTACACAAG ATCAGAAGCC AGCATGGGGG GATGGTGCAC AGAAACGCAG
CACAGACATG
TGCACGCACA
CTGACGCTGT
GAAGAACCCC
AAGAACGCAG
GGAGACTGAG
ACTGTCAGGC
TGGGAGCTGC
CCCATGGTGG
ATGGCACTTA
CCCTCTCAGG
TCCTCTGCCC
AACCTGCGGT
AGCCGAAGAA
CGCAGACCGT
TCAGCCTCTG
CGTGCAGTGG
CTGGGGTCAC
TGAACGCCCT
GGGCCTGGAG
GTCCCCAACT
CCCCTGACCT
TCTCTCCCCG
CCCTTCACTG
CTAGGAGGGT
ACGCAGCTGA
CCTCAGACTT
GCCCTGTCAC
GTCTGAGGCC
CTGAGCGAGT
CATGCATGCA
CAGGCACCGG
CCGCCATCCT
ACATT'rTTCC
GTATGTGCAG
TGAATCTGGG
TCGTCTGCCC
CATCCTTCCC
GATTTGGGGG
GGGCCCTTGT
AGCAGAGGCC
CTGGACACTT
CCTGAGCTTA
AACATTTCTG
CGGGTGTGGG
TTTTCCCCCA
CTGTGCCACC
TCAGGACAGG
GTGTGGGGCG
CCACGCTGGC
TCTTGAACCC
GTGTCCTCTC
CCCCGCCGCT
AGGTTCCCAC
TGGAGGATGC
GTCGGAAGCT
CAAGACCATC
GCCGGGCTCT
TGAGTGAGTG
TGATATACGA
GAACAATGGA
GCTGGCATCC
TGTACCTGTG
TCCGTGTGTG
TGGGCCCATG
CTCAGGTTTC
TGCGCGTCAT
GTGCCTGGCC
CTTAGGAAGT
GCCCTCTCGT
ACCTTGCTCT
GCCTGGCCTC
GCAAACCCAG
GCGTATCACC
TGTCCAGCAT
ACAGCTTCTA
TCGTGACTCC
CAGCTTTCCG
GGGATGTCGC
AAGCATTCCT
TCATGGTAGC
ACAGGGCTGG
AGCCCAGGAG
TTAGCTGAAC
CCCAGGAGGT
CCCATCTCAA
ACCTACACAC
CCCAGACCCA
TGACATCAAG
TAAACTGGAA
AGGATGGAGC
CACACACAGA
TGCACCTGTG
CCCACACCCA
ACGCATGTGT
CTCTGACACG
TCAGTGGCAG
TCTTACCCCT
GGGGTGAGCA
GCCTGGGGAA
TCCTGTTTGC
GCCAAGGGCT
ACGACAGAGC
CAGGGAGGTT
CTTTCTGTTC
TGCGGTGCTT
GTGTCTCCTG
TGGGGGCCAA
GCTCAAGCTG
CAGGTCTGTG
CGCGGTGGCT
TTTAAGACCA
ATGGTGGTGT
GGAAGCTGCA
CAACAACAAA
AGAAGCCAAG
GGGTTTATGC
GTTGTCTGAC
ACCTTAGAGG
TGCTTCAGCC
CACGCAGCTA
CCCATGAGGA
CGAGCACCGT
GCTGCAGCTC
GCCTCCCTCT
CAGTGCCTGC
TTTCGCATCA
GAGCACCTGA
GCGCTGGGGG
CCTGTGGTGG
TAGGAGGAGG
CCCGCGCCGT
TCTGATCCGT
TTTCTGTGTT
GGGTCGGGAC
GGAGGGGAGC
GGGCGCCGCC
ACTCGACACC
TGCGGGCCCC
AAGTGCAGAC
TGGTGTCCCC
GATGGCTCCC
TGCCCTGAGC
GTCTGCTCGC
AGGCCCTGCC
18760 18830 18900 18970 19040 19110 19180 19250 19320 19390 19460 19530 19600 19670 19740 19810 19880 19950 20020 20090 20160 20230 20300 20370 20440 20510 20580 20650 20720 20790 20860 20930 21000 21070 21140 21210 21280 21350 21420 21490 21560 CAAGTGTGGG TGGAGGCCAG AGCAGCCTCA GATGCTGCTG AGCCCTATGT GATTAAACGC CTGCTTCCCA TCTCAGGGGC ACAGCCTCTT CCCTGGCTGC CCAGCGTCAC TGGGCTGCCT CAGCCAGGGC CACGAGGTGC CACCTCTGGC CTCTTCTGGA ACGGAGTCTG CCCGGGGACG ACGCTGACTG CCCTGGAGGC CTGGACTGAT GGCCACCCGC CCACAGCCAG ACGTCCaGG GAGGGAGGGG CGGCCCACAC TTTGGCCGAG GCCTGCATGT CCGGCTGAAG GTCCAGCCAA GGGCTGAGTG TCCAGCACAC CTGCCGTCTT 21630 -34 CACTTCCCCA CAGGCTGGCG CTCCCCACAT AGGAATAGTC CCACCCCCAC CATCCAGGTG GTGTGCCCTG TACACAGGCG GCTGTGGGAG TAAAATACTG GTGCACTGCA TAGACACCAC GCCCATGGCC TGGCTGTGCA CCTGGCTGGG CCTGGGAGGT GAGCCCCCAC CCTGGAAGAC TTGGGCGGCG GGGATGATGG GGGGTGATGG GGGGGGCTGG GCCTCCCACC TGCAGCCGTG GGAGGTGGGG GGCAGGGGCA TCAGGTTGAA AGTCACATTC GTGGGTAGGG TGGGGCAGTG TCCTCTTATC ATCTCCCAGT TCTCCCAGTC TCATCTGTCA CATCCAGACT TACCTCCCAG GAAGGAACTG GAAGGATTGC AGCCCCTCCT CAGAAGTTGG CAGCAGGTCC CTGGTGGGGC TTTGAGTGCA GCCCGGACGT TGCTGCTGCT TCAGAGAATG GTCGTAAATG CACTCTGGTG GGCCTGGGGG CGCCTTTGCC GTGAGAGGTT GGACAGAACA TGAATCACAG ACCAACaGGT
CTCGGCTCCA
CATCCCCAGA
GAGACCCTGA
AGGACCCTGC
AATATATGAG
TGTATGCAAT
TTTACGGAAG
TTCTGATGCT
ATAACAGTAA
AGGGCCTGGC
TCTGGGTGGC
GATCCGGATG
TGACACCATC
CGCCTCTGGC
GAGGGTGTGG
CTCATCTCTC
TCCTCTTACC
GGCGGGTGCC
AGAGAACAGG
CTTGGGCCAC
CTTATGGTAT
GCCTGGTGTC
TCTGAGTGAC
CCTGGAGCCC
CTGCAAACTG
GGGCGGGGAC
CAGGCCATTG
AATTTTACTC
TGGCTTAACT
GTGTCTACTC
GGCTCAAGTG
TGGCACTTTT
GAAGCCGAGG
TCTCTACAAA
CTGAGTGGGA
AGCCTGGGCA
AAGAAGGAAG
AGGTAGACTG
TTTGGACTTC
TGAGTTCAAA
TGAAAGGGAG
ATGTTGGTGC
AAGCCCCAGG.
CCTGTGAATG
CGATCTTAAG
GGAGGCAGGG
GGGGGTCCCC
AGGGCACACG
GCTGTAAGAT
AGTACAGGGA
CCCCAGGGCC AGCTTTTCCT CACCAGGAGC CCGGCTTCCA TTCGCCATTG TTCACCCCTC GCCCTGCCCT CCTTTGCCTT GAAGGACCCT GGGAGCTCTG GGAATTTGGA GTGACCAAAG ACCTGGATGG GGGTCCCTGT GGGTCAAATT GGGGGGAGGT TTTTTCAGTT TTGAAAAAAA TCTCATGTTT GAATCCTAAT TACAGAAGCC TGTGAGTGAA CGGGGTGGTG GTCAGTGCGG TCTATGAGTG AATGGGGTTG TGGTCAGTGC GGGCCCATGG GTGAGGCAGG AGGGGAAGGA GGGTAGGGGA TAGACAGTGG GTCCAGGCCC GAAGGGCAGC AGGGATGCTG GGGGCCCAGC CAGGGTGGrCA GGGATGATGG GGGCCCCAGC TGGGGTGGCA GGGGAAGATG GGGAAGCCTG GCTGGGCCCC CTCCTCCCCT TGCTTCCCTG GTGCACATCC TCTGGGCCAT CAGCTTTCAT CTGTATAAAA TCCAGGATTC CTCCTCCTGA ACGCCCCAAC CATTCTCTTA AGAGTAGACC AGGATTCTGA TCTCTGAAGG ACACAGGAGG CTTCAGGGTG GGGCTGGTGA TGCTCTCTCA ATCCTCTTAT CATCTCCCAG TCTCATCTGT CTTCCTCTTA ATCTCCCAGT CTCATCTCTT ATCCTCTTAT CTCCTAGTCT AGGCTCGCAG TGGAGCTGGA CATACGTCCT TCCTCAGGCA AGGGGCGGCT CAGAGGGACG CAGTCTTGGG GTGAAGAAAC ACGAAACCGA GGGCCCTGCG. TGAGTGGCTC CAGAGCCTTC GGCCGGGTCC TACTGAGTGC ACCTTGGACA GGGCTTCTGG GGGGTGGGGG *.CTTATGGCCA CTGGATATGG CGTCATTTAT CGAGCCTAAT GTGTATGGTG.'GGCCCAAGTC CACAGACTGT CCGTATAGGA GCTGTGAGGA AGGAGGGGCT 'CTTGGCAGCC GAAGGGAGCG GCCCCGGGCG CCGTGGGCGG ACGACCTCAA TTCCCAGGAG CAGAGGCCGC TGCTCAGGCA CACCTGGGTT TTCAGCTATC CATCTTCTAC AAAGCTCCAG ATTCCTGTTT AGGATTACTT ATATTTTTTG CTAAAGTATT AGACCCTTAA CACTAAGCAC CTACTTTATT TGTCTGTTTT TATTTATTAT TGTCACCCAG GTTGTTAGTG CAGTGGCACA GTCATGGCTC ATCCTCCGGC CTCAGCTTCC CAGAGTGCTG GGATTACAGG AAAAACCACT ATGTAAGGTC AGGTCCAGTG GCTTCCACAC CAGAAGGATT GTCTGAGGCC AGGAGTTTGA GACCAGCATG AAATGCAAAA AGTTATCCGG GCGTGGGGTC CAGCATCTGT GGATCGCTTG AGCCCGGGAG GTCATGGCTG CAGTGAGCTG ACAGAGTGAG ACCCTGTCTC AAAAAAAAAA AAAAAAAAAG AAGGAAAGAG AAGAAGAAGG AAGAAGGAAG AAAGAAGGAG TCAAATCTCA GAGCAAAATG AAAATAACAA AGTTTTAAAG CTTAGGCCTG AACTTCATCT CAAGCAGCTT CCTTCCACAG GCAGAAAGGG AGGAGAAGCA GGCAAGGGTG GAGGCTGTGG TGGTTGTTTT CCTGCCTCAG CCCCACGCTC CTGCCGGTCC CAGGTGCCCA CCTGGGAAGG ATGCTGTGCA GGGGGCTTGC CACTTGTGGC AGGCACAATT ACAGCCCCTC CCCAAAGATG TGTCACCCGC ALAGGCAGAGG CTGGTGAAGG CTGCAGGTGG GTCATCCTGG ATTATCTGGT GGGCCTGATA TGGCCACAAG GAGAGTCAGA 
GAGGGGACGT GAGAAGGACC ACTGGCCACT AGCCAAGGAA TGGGGGCAGC CGCTCCATGC TGGAAAAGCA GCCCTGCCCA CGCCTCGATT TCAGGCCAGT GGGACCTGTT GATGCGTTTG TGTTCAGCCA CTAAGCTGCA GTGATTCGTC AATGAATACA GGGACAGTTC T~CAGAGTGAC TCTCAGCCCA 21700 21770 21840 21910 21980 22050 22120 22190 22260 22330 22400 22470 22540 22610 22680 22750 22820 22890 22960 23030 23100 23170 23240 23310 23380 23450 23520 23590 23660 23730 23800 23870 23940 24010 24080 24150 24220 24290 24360 24430 24500 24570 24640 24710 24780 24850 24920 24990 25060 25130 25138
CTCCGGGTGT
AAAAGGTATT
TATTATTATT
GCTGTAGCCG
TGTGAGCCAC
CTGTCATCCC
GGTAACATAG
AGTCCCAGCT
TTTTTGTTGA
TGCTTTGATA
ATTAGAGATG
CAAACCCCCA
TGCCCTTGCC
AGTAGTTTGG
GGAGACCCCA
GCTCGGGAGG
TGATTGTACC ATCGCACTCC AAGGAGAAGG AGAAGAGAAG AAGGAGGCCT GCTAGGTGCT GGAAAGAAAA ACCCCAGCTC ACAAGCGTGT ATGGAGCGAG GTGACACCAG CCAGGACCCC TGCACCTGCT GTAACCGTCG CAAACTTTGG TGGGTTTCAG CCCACGTCCT .TCTCCTGGAA AATCACGGCT GCCAGTCAGC GGTCCCTAGA AGTGAGAGAG GCTGGCTTTG AGATGGAGGA AGCAATCCTC CCCGGTCCTG TCAGCTTTCC GGCCTCCAGA ACAGCAGCAA ATGGAATAGC
CCCCTGGG
Example
Comparison of the above-described genomic hTC sequence and the sequence of the hTC cDNA (Fig. 6; corresponding to SEQ ID NO 2) made it possible to elucidate the exon-intron structure of the hTC gene. The genomic organization of the hTC gene is illustrated diagrammatically in Fig. 7. The coding region of the hTC gene is composed of 16 exons which vary in size between 62 bp and 1354 bp (see Table 1).
Exon 1 contains the translation start codon ATG. The translation stop codon TGA and the 3'-untranslated region lie on exon 16 (Fig. ). No possible polyadenylation signal (AATAAA) was found either in exon 16 or in the 3195 bp of the following 3'-flanking region. The exon-intron transitions were determined on the basis of the following pre-mRNA consensus sequence (frequency of each base in % given in parentheses; "|" marks the exon-intron boundary):
5' splice site (exon | intron): A/C (70) A (60) G (80) | G (100) T (100) A/G (95) A (70)
3' splice site (intron | 3'-exon): N C (80) A (100) G (100) | G (60)
The transitions are listed in Table 1. With the exception of the 5' splice site between exon 15 and intron 15, all the exon-intron transitions are in accord with the published splice consensus sequence (Shapiro and Senapathy, 1987). The sizes of the introns are between 104 bp and 8616 bp. Since only part of intron 6 was isolated, it is not possible to determine the precise length of the hTC gene. Based on the partial sequence of ~4660 bp which was obtained from intron 6, the minimum size of the hTERT gene is 37 kb.
Introns 1-5 and the 5' region of intron 6 are contained in contig 1:
Intron 1: bp 11493-11596 (SEQ ID NO 4);
Intron 2: bp 12951-21566 (SEQ ID NO );
Intron 3: bp 21763-23851 (SEQ ID NO 6);
Intron 4: bp 24033-24719 (SEQ ID NO 7);
Intron 5: bp 24900-25393 (SEQ ID NO 8);
5' region of intron 6: bp 25550-26414 (SEQ ID NO 9).
The 3' region of intron 6 and introns 7-15 are located in contig 2 at the following positions:
3' region of intron 6: bp 1-3782 (SEQ ID NO );
Intron 7: bp 3879-4858 (SEQ ID NO 11);
Intron 8: bp 4945-7429 (SEQ ID NO 12);
Intron 9: bp 7544-9527 (SEQ ID NO 13);
Intron 10: bp 9600-11470 (SEQ ID NO 14);
Intron 11: bp 11660-15460 (SEQ ID NO );
Intron 12: bp 15588-16467 (SEQ ID NO 16);
Intron 13: bp 16530-19715 (SEQ ID NO 17);
Intron 14: bp 19841-20621 (SEQ ID NO 18);
Intron 15: bp 20760-21295 (SEQ ID NO 19).
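The intron sizes stated in the example can be cross-checked against the contig positions listed above. The following is a minimal sketch, assuming the positions are 1-based and inclusive (an assumption, although it reproduces the stated minimum of 104 bp and maximum of 8616 bp); the dictionary and function names are illustrative only and do not come from the specification.

```python
# Intron positions (bp) as quoted in the text for contig 1 and contig 2.
# Assumption: coordinates are 1-based and inclusive.
CONTIG_1 = {
    "Intron 1": (11493, 11596),
    "Intron 2": (12951, 21566),
    "Intron 3": (21763, 23851),
    "Intron 4": (24033, 24719),
    "Intron 5": (24900, 25393),
    "Intron 6 (5' region)": (25550, 26414),
}
CONTIG_2 = {
    "Intron 6 (3' region)": (1, 3782),
    "Intron 7": (3879, 4858),
    "Intron 8": (4945, 7429),
    "Intron 9": (7544, 9527),
    "Intron 10": (9600, 11470),
    "Intron 11": (11660, 15460),
    "Intron 12": (15588, 16467),
    "Intron 13": (16530, 19715),
    "Intron 14": (19841, 20621),
    "Intron 15": (20760, 21295),
}


def span_length(start: int, end: int) -> int:
    """Length in bp of an inclusive coordinate range."""
    return end - start + 1


# Length of every listed intron (or intron fragment).
lengths = {
    name: span_length(*coords)
    for contig in (CONTIG_1, CONTIG_2)
    for name, coords in contig.items()
}

# Intron 6 was only partly isolated, so its fragments are excluded
# when looking for the shortest and longest complete introns.
complete = {n: bp for n, bp in lengths.items() if not n.startswith("Intron 6")}
shortest = min(complete, key=complete.get)
longest = max(complete, key=complete.get)

# Sequenced portion of intron 6 (5' fragment plus 3' fragment).
intron6_known = lengths["Intron 6 (5' region)"] + lengths["Intron 6 (3' region)"]

for name, bp in lengths.items():
    print(f"{name}: {bp} bp")
print(f"shortest complete intron: {shortest} ({complete[shortest]} bp)")
print(f"longest complete intron: {longest} ({complete[longest]} bp)")
print(f"sequenced part of intron 6: {intron6_known} bp")
```

Under the inclusive-coordinate assumption, the two intron 6 fragments sum to a value close to the partial intron 6 length quoted in the example.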
The 3'-untranscribed region is also located in contig 2 at position 21960-25138 (SEQ ID NO ). The individual sequences of the abovementioned introns are as follows:
Intron 1 (SEQ ID NO 4)
GTGGGCCTCCCCGGGGTCGGCGTCCGGCTGGGGTTGAGGGCGGCCGGGGGGAACCAGCGACATGCGGAGAGCAGCGCAGG
CGACTCAGGGCGCTTCCCCCGCAG
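As an illustration of how a listed intron can be checked against the splice consensus, the sketch below tests the intron 1 sequence given above for the invariant GT at its 5' end and AG at its 3' end, and compares its length with the contig 1 positions (bp 11493-11596). The function name `has_gt_ag_boundaries` is hypothetical, and the check covers only the invariant dinucleotides, not the full consensus.

```python
# Intron 1 (SEQ ID NO 4) exactly as printed in the listing above.
INTRON_1 = (
    "GTGGGCCTCCCCGGGGTCGGCGTCCGGCTGGGGTTGAGGGCGGCCGGGGGGAACCAGCGACATGCGGAGAGCAGCGCAGG"
    "CGACTCAGGGCGCTTCCCCCGCAG"
)


def has_gt_ag_boundaries(intron: str) -> bool:
    """Return True if the intron starts with GT and ends with AG."""
    seq = intron.upper()
    return seq.startswith("GT") and seq.endswith("AG")


# Expected length from the contig 1 positions, assuming inclusive coordinates.
expected_length = 11596 - 11493 + 1

print("length:", len(INTRON_1), "expected:", expected_length)
print("GT...AG boundaries:", has_gt_ag_boundaries(INTRON_1))
```

The same function could be applied to any of the other intron sequences once OCR damage in the listing has been corrected against the original sequence protocol.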
Intron 2 (SEQ ID NO
GTGAGGAGGTGGTGGCCGTCGAGGGCCCAGGCCCCAGAGCTGAATGCAGTAGGGGCTCAGAGGCAGCAGAGCC
CTGGTCCTCCTGTCTCCATCGTCACGTGGGCACACGTGGCTTTTCGCTCAGGACGTCGAGTGGACACGGTGATCTCTGCC
TCTGCTCTCCCTCCTGTCCAGTTTGCATAAACTTACGAGGTTCACCTTCACGTTTTGATGGACACGCGGTTTCCAGGCGC
CGAGGCCAGAGCAGTGAACAGAGGAGGCTGGGCGCGGCAGTGGAGCCGGGTTGCCGGCAATGGGGAGAAGTGTCTGGAAG
CACAGACGCTCTGGCGAGGGTGCCTGCAGGTTACCTATATCCTCTTCGCAATTTCAAGGGTGGGAATGAGAGTGGGGA
CGAGAACCCCCTCTTCCTGGGGGTGGGAGGTAAGGGTTTTGCAGGTGCACGTGGTCAGCCAATATGCAGTTTGTGTTTA
AGATTTAATTGTGTGTTGACGGCCAGGTGCGGTGGCTCACGCCGGTAATCCCAGCACTTTGGGAGCTGAGGCAGJTGGA
TCCTAGCGATTAACGCGCACTGTAACTTTTCAAAAAAATGT
GGCATGGTGGTGTGTGCCTGTAATCCCAGCTACTTGGGAGGCTGAGGCAGGAGAATCACTTGAACCCAGGAGGCGGAGGC
TGCAGTGAGCTGAGATTGTGCCATTGTACTCCAGCCTGGGCGACAAGAGTGAAACTCTGTCTTTAAAAAGTGTT
CGTTGATTGTGCCAGGACAGGGTAGAGGGAGGGAGATAAGACTGTTCTCCAGCACAGATCCTGGTCCATCTTTAGGTAT
GAAGAGGGCCACATGGGAGCAGAGGACAGCAGATGGCTCCACCTGCTGAGGAGGGACAGTGTTTGTGGGTGTT~CAGG
ATGGTGCTGCTGGGCCCTGCCGTGTCCCCACCCTGTTTTTCTGGATTTGATGTTGAGGAACCTCCGCTCCAGCCCCCTTT
TGCCCGGTCAGCTCGGCGTGAGATCGTTACCTCCCACCCAA
ATGTAAGACTTCCGGCCATGCAGACAAGGAGGGTGACCTTCTTGGGGCTCTTTTTTTTCTTTTTTTCTTTTTATGTGGC
AAAAGTCATATAACATGAGATTGGCACTCCTAACACCGTTTTCTGTGTACAGTGCAGAATTGCTAACTCGC~jGGTGTTTA
CAGCAGGTTGCTTGAAATGCTGCGTCTTGCGTGACTGGAAGTCCCTACCCATCGAACGGCAGCTGCCTCACACCTGCTGC
GGCTCAGGTGGACCACGCCGAGTCAGATAAGCGTCATGCAACCCAGTTTTGCTTTTTGTGCTCAGCTTCCTTCGTTGAG
GAGAGTTTGAGTTCTCTGATCAGGACTCTGCCTGTCATTGCTGTTCTCTGACTTCAGATGAGGTCACAATCTGCCCCTGG
CTAGAGATAGGGTCCGTTCTTCCTCGGGGGGCTGCCAGGCC
GTCACGTGTAGGGTGAGTGAGGCGCGGCCCCCGGGTGTCCCTGTCCCGTGCAGCGTGATTGAGTGTGGCCCCCGGTGT
CCCTGTCACGTGTAGGGTGAGTGAGGCGCCATCCCCGGGTGTCCCTGTCACGTGTAGGGTGAGTGAGGCGTGGTCCCCGG
GTGTCCCTGTCCCGTGCAGGGTGAGTGAGGCACTGTCCCCGGGTGTCCCTGTCACGTGCAGGGTGAGTGAGGCGCGGTCC
CCGGGTGTCCCTCTCAGGTGTAGGGTGAGTGAGGCGCGGCCCCAGGGTGTCCCTGTCACGTGTAGGGTGAGTGAGCCC
GTCCCTGGGTGTCCCTCCCAGGTATAGGGTGAGTGAGGCACTGTCCCCGGGTGTCCCTGTCACGTGCAGGGTGAGTGAGG
CGCGCCCCCGGTGTCCCTCTCAGTGCAGGTGAGTGAGGCGCTGTCCCTGGTGTCCCTGTCTCGTGTAGGTGAGT
GAGGCTCTGTCCCCAGGTGTCCTTGGCGTTTGCTCACTTGAGCTTGCTCCTGAATGTTTGCTCTTTCTATAGCCACAGCT
GCGCCGGTTGCCCATTGCCTGGGTAGATGGTGCAGGCGCAGTGCTGGTCCCAGCCTATCTTTTCTGATGCTCGGCTCT
TCTTGGTCACCTCTCCGTTCCATTTTGCTACGGGGACACGGGACTGCAGGCTCTCGCCTCCCGCGTGCCAGCACTGCAG
cC-ACAGCTTCAGGTCCGCTTGCCTCTGTTGGGCCTGGCTTGCTCACCACGTGCCCGCCACATGCATGCTGCCAATACTCC
TCTCCCAGCTTGTCTCATGCCGAGGCTGGACTCTGGGCTGCCTGTGTCTGCTGCCACGTGTTGCTGGAGACATCCCAGAA
AGGGTTCTCTGTGCCCTGAAGGAAAGCAAGTCACCCCAGCCCCCTCACTTGTCCTGTTTTCTCCCAAGCTGCCCCTCTGC
TTGGCCCCCTTGGGTGGGTGGCAACGCTTGTCACCTTATTCTGGGCCCTGCCGCTCATTGCTTAGGCTGGGCTCTGCCT
CCAGTCGCCCCCTCACATGGATTGACGTCCAGCCACAGGTTGGAGTGTCTCTGTCTGTCTCCTGCTCTGAGACCCCGTG
GAGGGCCGGTGTCTCCGCCAGCCTTCGTCAGACTTCCCTCTTGGTCTTAGTTTTGAATTTCACTGATTTACCTCTGACG
TTTCTATCTCTCCATTGTATGCTTTTTCTTGGTTTATTCTTTCATTCCTTTTCTAGCTTCTTAGTTTAGTCATGCCTTTC
CCCAGGTCTACGACTTTTGTTGATACCAACGCCTCATTCTA
ATACTTCAAAGTGTTAATACTTCTTTTAAGTATTCTTATTCTGTGATTTTTTTCTTTGTGCACGCTGTGTTTTGACGTGA
AATCATTTTGATATCAGTGACTTTTAAGTATTCTTTAGCTTATTCTGTGATTTCTTTGAGCAGTGAGTTATTTGACCT
GTTTATGTTCAAGATATGTAGAGTATCAAGATACGTAGAGTATTTTAAGTTATCATTTTATTATTGATTTCTAACTCAGT
TGTGTAGTGGTCTGTATAATACCAATTATTTGAGTTTGCGAGCCTTGCTTTGTGATCTAGTGTGTGCATGGTTTCCAG
AACTGTCCATTGTAAATTTGACATCCTGTCAATAGTGGGCATGCATGTTCACTATATCCAGCTTATTAAGGTCCAGTGCA
AAGCTTCTGTCTCCTTCTAGATGCATGAATTCCAAGAAGGAGGCCATAGTCCCTCACCTGGATGGGTCTGTTCATT
TCTTCTCGTTTGGTAGCATTTATGTGAGGCATTGTTAGGTGCATGCACGTGGTAGAATTTTTATCTTCCTGATGAGTGAA
TCTTTTGGAGACTTCTATGTCTCTAGTAATCTAGTAATTCTTTTTTTAAATTGCTCTTAGTACTGCCACACTGGCTTCT
TTTGATTAGTATTTTCCTGCTGTGTCTGTTTTCTGCCTTTAATTTATATATATATATATATTTTTTTTTTTTTTGAGACA
GAGTCTTGGTCTGTCGCCCAGGGTGAGTGCAGTGGTGTGATCACAGGTCAGTGTAACTTTTACCTTCTGGCCTGAGCCGT
CCTCTCACCTCAGCCTCCTGAGTAGCTGGAACTGCAGACACGCACCGCTACACCTGGCTAATTTTTAAATTTTTTCTGGA
GACAGGGTCTTGCTGTGTTGCCCAGGCTGGTCTCAAACTCTTGGACTCAAGGGATCCATCTACCTCGGCTTCCCAGTG
CTGAATTACAGGCATGAGCCACCATGTCTGGCCTAATTTTCAACACTTTTATATTCTTATAGTGTGGGTATGTCCTGTTA
ACGAGAGGATCATCGCGCGCTTTTATGTACGTTTTCTTTTT
ACTAGAGACCCGCCTGGTGCACTCTGATTCTCCACTTGCCTGTTGCATGTCCTCGTTCCCTTGTTTCTCACCACCTCTTG
GGTTGCCATGTGCGTTTCCTGCCGAGTGTGTGTTGATCCTCTCGTTGCCTCCTGGTCACTGGGCATTTGCTTTTATTTCT
CTTTGCTTAGTGTTACCCCCTGATCTTTTTATTGTCGTTGTTTGCTTTTGTTTATTGAGACAGTCTCACTCTGTCACCCA
GGCTGGAGTGTAATGGCACAATCTCGGCTCACTGCAACCTCTGCCTCCTCGGTTCAGCAGTTCTCATTCCTCACCTCA
TGAGTAGCTGGGATTACAGGCGCCCACCACCACGCCTGGCTAATTTTTGTATTTTTAGTAGAGATAGGCTTTACCATGT
TGGCCAGGCTGGTCTCAAACTCCTGACCTCAAGTGATCTGCCCGCCTTGGCCTCCCACAGTGCTGGGATTACAGGTGCAA
GCCACCGTGCCCGGCATACCTTGATCTTTTAAAATGAGTCTGAACATTGCTACCCTTGTCCTGAGCATAGACCCTT
AGTGTATTTTAGCTCTGGCCACCCCCCAGCCTGTGTGCTGTTTTCCCTGCTGACTTAGTTCTATCTCAGGCATCTTGACA
CCCCCACAAGCTAAGCATTATTAATATTGTTTTCCGTGTTGAGTGTTTCTGTAGCTTTGCCCCCGCCCTGCTTTTCCTCC
TTTGTTCCCCGTCTGTCTTCTGTCTCAGGCCCGCCGTCTGGGGTCCCCTTCCTTGTCCTTTGCGTGGTTCTTCTGTCTTG
TTATTGCTGGTAAACCCCAGCTTTACCTGTGCTGGCCTCCATGGCATCTAGCGACGTCCGGGGACCTCTGCTTATGATGC
ACAGATGAAGATGTGGAGACTCACGAGGAGGGCGGTCATCTTGGCCCGTGAGTGTCTGGAGCACCACGTGGCCAGCGTTC
CTTAGCCAGTGAGTGACAGCAACGTCCGCTCGGCCTGGGTTCAGCCTGGAAAACCCCAGGCATGTCGGGGTCTGGTGGCT
CCGCGGTGTCGAGTTTGAAATCGCGCAAACCTGCGGTGTGGCGCCAGCTCTGACGGTGCTGCCTGGCGGGGGAGTGTCTG
CTTCCTCCCTTCTGCTTGGGAACCAGGACAAAGGATGAGGCTCCGAGCCGTTGTCGCCCAACAGGAGCATGACGTGAGCC
ATGTGGATAATTTTAATTTCTAGGCTGGGCGCGGTGGCTCACGCCTGTATCCCAGCACTTTGGGAGGCCAAGGCGGG
TGGATCACGAGGTCAGGAGGTCGAGACCATCCTGGCCAACATGATG
CCCCATCTGTACTAACAAAAATTAGC
TGGGGTGGGGCGATCACATGGAGTAGAGGATCTACTGATGA
GTTGCAGTGAGCCGACATTGCACCACTGCACTCCAGCCTGGCCACAGCGAGACTCTGTCTCAAAZAAAA
AAAAAAAAATTCTAGTAGCCACATTAAAAAAGTAAAAAAGAAAAGGTGAAATTAATGTAATAATAGATTTTACTGA
GCCCAGCATGTCCACACCTCATCATTTTAGGGTGTTAT.TGGTGGGAGCATCACTCACAGGACATTTGACATTTTTTGAGC
TTTGTCTGCGGGATCCCGTGTGTAGGTCCCGTGCGTGGCCATCTCGGCCTGACCTGCTGG-JCTTCCCATGGCCATGGCT
GTTGTACCAGATGGTGCAGGTCCGGGATGAGGTCGCCAGGCCCTCAGTGAGCTGGATGTGCAGTGTCCGGATGGTGCACG
TCTGGGATGAGGTCGCCAGGCCCTGCTGTGAGCTGGATGTGTGGTGTCTGGATGGTGCAGTCAGGGGTGAGGTCTCCAG
GCCCTCGGTGAGCTGGAGGTATGGAGTCCGGATGATGCAGGTCCGGGGTGAGGTCGCCAGGCCCTGCTGTGAGCTGGATG
TGTGGTGTCTGGATGGTGCAGGTCAGGGTGAGGTCTCCAGGCCCTCGGTAGCTGGAGTATGGAGTCCGATGATGCA
GGTCCGGGGTGAGGTCGCCAGGCCCTGCTGTGAGCTGGATGTGTGGTGTCTGATGGTGCAGGTCTGGGTGAGGTCACC
AGGCCCTGCGGTGAGCTGGGTGTGCGGTGTCTGGATGGTGCAGGTCTGGAGTGAGGTCGCCAGACGGTGCCAGACCATGC
GGTGAGCTGGATATGCGGTGTCCGGATGGTGCAGGTCTGGGGTGAGGTTGCCAGGCCCTGCTGTGAGTTGGATGTGGT
GTCCGGATGCTGCAGGTCCGGTGTGAGGTCACCAGGCCCTGCTGTGAGCTGGATGTGTGGTGTCTGGATGGTGCAGGTCT
GGGGTGAAGGTCGCCAGGCCCCTGCTTGTGAGCTGGATGTGTGGTGTCTGGATGGTGCAGGTCTGGAGTGAGGTCGCCAG
GCCCTCGGTGAGCTGGATGTGCAGTGTCCAGATGTGCAGGTCCGGGGTGAGGTCGCCAGACCCTGCGGTGAGCTGGATG
TGCGGTGTCTGGATGGTGCAGGTCTGGAGTGAGGTCGCCAGGCCCTCGGTGAGCTGGATGTATGGAGTCCGGATGGTGCC
GGTCCGGGGTGAGGTCGCCAGACCCTGCTGTGAGCTGGATGTGCGGTGTCTGGATGGTACAGGTCTGGAGTGAGGTCGCC
AGACCCTGCTGTGAGCTGGATATGCGGTGTCCGGATGGTGCAGGTCAGGGGTGAGGTCTCCAGGCCCTCGGTGAGCTGGA
GGTATGGAGTCCGGATGATGCAGGTCCGGGGTGAGGTCGCCAGGCCCTGCTGTGAACTGGATGTGCGGCGTCTGGATGGT
GCAGGTCTGGGGTGTGGTCGCCAGGCCCTCGGTGAGCTGGAGGTATGGAGTCCGGATGATGCAGGTCCGGGGTGAGGTCG
CCAGGCCCTGCTGTGAGCTGGATGTGCGGCGTCTGGATGGTGCAGGTCTGGGGTGTGGTCGCCAGGCCCTCGGTGAGCTG
GAGGTATGGAGTCCGGATGATGCAGGTCCGGGGTGAGGTTGCCAGGCCCTGCTGTGAGCTGGATGTGCTGTATCCGGATG
GTGCAGTCCGGGGTGAGGTCGCCAGGCCCTGCTGTGAGCTGGATGTGCTGTATCCGGATGGTGCAGGTCTGGGGTGAGGT
CACCAGGCCCTGCGGTGAGCTGGTTGTGCGGTGTCCGGTTGCTGCAGGTCCGGGGTGAGTTCGCCAGGCCCTCGGTGAGC
TGGATGTGCGGTGTCCCCGTGTCCGGATGGTGCAGGTCCAGGGTGAGGTCGCTAGGCCCTTGGTGGGCTGGATGTGCCGT
GTCCGGATGGTGCAGGTCTGGGGTGAGGTCGCCAGGCCTTTGGTGAGCTGGATGTGCGGTGTCTGCATGGTGCAGGTCTG
GGGTGAGGTCGCCAGGCCCTTGGTGGGCTGGATGTGTGGTGTCCGGATGGTGCAGGTCCGGCGTGAGGTCGCCAGGCCCT
GCTGTGAGCTGATGTGCGGTGTCTGGATGGTGCAGTCCGGGGTGAGGTAGCCAAGGCCTTCGGTGAGCTGGATGTGGG
GTGTCCGGATGGTGCAGGTCCGGGGTGAGGTCGCCAGGCCCTGCGGTTAGCTGGATATGCGGTGTCCGGATGGTGCAGGT
CCGGGGTGAGGTCACCAGGCCCTGCGTTAGCTGGATGTGCGGTGTCTGGATGGTGCAGGTCCGGGGTGAGTCGCCAGG
CCCTGCTGTGAGCTGGATGTGCTGTATCCGGATGGTGCAGGTCCGGGGTGAGGTCGCCAGGCCCTGCAGTGAGCTGGATG
TGCTGTATCCGGATGGTGCAGGTCTGGCGTGAGGTCGCCAGGCCCTGCGGTTAGCTGGATATGCGGTGTCGGATGGTGCA
GGTCCGGGGTGAGGTCACCAGGCCCTGCGGTTAGCTGGATGTGCGGTGTCCGGATGGTGCAGGTCTGGGGTGAGGTCGCC
AGGCCCTGCTGTGAGCTGGATGTGCTGTATCCGGATGGTGCAGGTCCGGGGTGAGGTCGCCAGGCCCTGCGGTGAGCTGG
ATGTGCTGTATCCGGATGGTGCAGGTCTGGCGTGAGGTCGCCAGGCCCTGCGGTGAGCTGGATGTGCAGTGTACGGATGG
TGCAGGTCCGGGGTGAGGTCGCCAGGCCCTGCGGTGGGCTGTATGTGTGTTGTCTGGATGGTGCAGGTCCGGGGTGAGTT
CGCCAGGCCCTGCGGTGAGCTGGATGTGTGGTGTCTGGATGCTGCAGGTCCGGGGTGAGTTCGCCAGGCCCTCGGTGAGC
TGGATATGCGGTGTCCCCGTGTCCGAATGGTGCAGGTCCAGGGTGAGGTCGCCAGGCCCTTGGTGGGCTGATGTGCCGT
GTCCGGATGGTGCAGGTCTGGGGTGAGGTCGCCAGGCCCTTGGTGAGCTGGATGTGCGGTGTCCGGATGTGCAGGTCCG
GGGTGAGGTCACCAGGCCCTCGGTGATCTGGATGTGGCATGTCCTTCTCGTTTAAG
Intron 3 (SEQ ID NO 6)
GTACTGTATCCCCACGCCAGGCCTCTGCTTCTCGAAGTCCTGGAACACCAGCCCGGCCTCAGCATGCGCCTGTCTCCACT
TGCCTGTGCTTCCCTGGCTGTGCAGCTCTGGGCTGGGAGCCAGGGGCCCCGTCACAGGCCTGGTCCAAGTGGATTCTGTG
CAAGGCTCTGACTGCCTGGAGCTCACGTTCTCTTACTTGTAAAATCAGGAGTTTGTGCCAAGTGGTCTCTAGGGTTTGTA
AACGAGATAATGTGACCACCAGCCTGCTCCGGTTGTTATTT
TCTCTTTTTTTTTTCTTTTTTGAGATGGAGTCTCACTCTGTTGCCCAGGCTGGAGTGCAGTGGCATATCTTGGCT~CAT
GCAACCTCCACCTCCTGGGTTTAAGCGATTCACCAGCCTCAGCCTCCTAAGTAGCTGGGATTACAGGCACCTGCCACCAC
GCCTGGCTAATTTTTGTACTTTTAGGAGAGACGGGGTTTCACCATGTTGGCCAGGCTGGTCTCGACTCATGACCTAJG
TGATCCACCCACCTTGGCCTCCCAAAGTGCTGGGTTTACAGGCTAAGCCACCGTGCCCAGCCCCCGATTCTCTTTTAATT
CATGCTGTTCTGTATGAATCTTCAATCTATTGGATTTAGGTCATGAGAGGATAJAAATCCCACCCACTTGrJCGACTCACTG
CAGGGAGCACCTGTGCAGGGAGCACCTGGGGATAGGAGAGTTCCACCATGAGCTACTTCTAGGTGGCTGCATTTGAATG
GCTGTGAGATTTTGTCTGCAATGTTCGGCTGATGAGAGTGTGAGATTGTGACAGATTCAAGCTGGATTTGCATCAGTGAG
GGACGGGAGCGCTGGTCTGGGAGATGCCAGCCTGGCTGAGCCCAGGCCATGGTATTAGCTTCTCCGTGTCCCGCC~CAC
TGACTGTGGAGGGCTTTAGTCAGAAGATCAGGGCTTCCCCAGCTCCCCTGCACACTCGAGTCCCTGGGGyGGCCTTGTGAC
ACCCCATGCCCCAAATCAGGATGTCTGCAGAGGGAGCTGGCAGCAGACCTCGTCAGAGGTAACACAGCCTCTGGGCTGGG
GACCCCGACGTGGTGCTGGGGCCATTTCCTTGCATCTGGGGGAGGGTCAGGGCTTTCCCTGTGGGACAAGTTAATACAC
AAGACTCTGCTAAGATATGGGCACACTGCTTACGATTGAGA
TTAATTGGGGTGACCGGAAGGAGCAGACAGACGTGGTGGTCCCCAJAGATGCTCCTTGTCACTACTGGGACTGTTGTTCTG
CCTGGGGGGCCTTGGAGGCCCCTCCTCCCTGGACAGGGTACCGTGCCTTTTCTACTCTGCTGGGCCTGCGGCCTGCGGTC
AGGGCACCAGCTCCGGAGCACCCGCGGCCCCAGTGTCCACGGAGTGCCAGGCTGTCAGCCACAGATGCCCAGGTCCAGGT
GTGCCCACCCTCCCTGTGTTGGGAAGCAGCGGTTAGGCGTG
CTCATGAGAGCTGATTCTGCTCCTTGGCTGAGCTGCCCTGAGCAGCCTCTCCCGCCCTCTCCATCTGAAGGGATGTGGCT
CTTTCTACCTGGGGGTCCTGCCTGGGGCCAGCCTTGGGCTACCCCAGTGGCTGTACCAGAGGGACAGGCATCCTGTGTGG
AGGGGCATGGGTTCACGTGGCCCCAGATGCAGCCTGGGACCAGGCTCCCTGGTGCTGATGGTGGGACAGTCACCCTGGGG
GTTGACCGCCGGACTGGGCGTCCCCAGGGTTGACTATAGGACCAGGTGTCCAGGTGCCCTGCAAGTAGAGGGGCTCTCAG
AGGCGTCTGGCTGGCATGGGTGGACGTGGCCCCGGGCATGGCCTTCAGCGTGTGCTGCCGTGGGTGCCCTGAGCCCTCAC
TGGCGGGG=G=CGGG CCCATCGTTrGrACACTCGGGCC
CTATTGCAG
Intron 4 (SEQ ID NO 7)
GTGGCTGTGCTTTGGTTTAACTTCCTTTTTAAACAGAAGTGCGTTTGAGCCCCACATTTGGTATCAGCTTAGATGAAGGG
CCCGGAGGAGGGGCCACGGGACACAGCCAGGGCCATGGCACGGCGCCAACCCATTTGTGCGCACAGTGAGGTGGCCGAGG
TGCCGGTGCCTCCAGAAAAGCAGCGTGGGGGTGTAGGGGAGCTCCTGGGGCAGGGACAGGCTCTGAGACCACAAGAAG
CAGCCGGGCCAGGGCCTGGATGCAGCACGGCCCGAGGTCCTGGATCCGTGTCCTGCTGTGGTGCGCAGCCTCCGTGCGCT
TCCGCTTACGGGGCCCGGGGACCAGGCCACGACTGCCAGGAGCCCACCGGGCTCTGAGGATCCTGGACCTTGCCCCACGG
CTCCTGCACCCCACCCCTGTGGCTGCGGTGGCTGCGGTGACCCCGTCATCTGAGGAGAGTGTGGGGTGAGGTGGACAGAG
GTGTGGCATGAGGATCCCGTGTGCACACACATGCGGCCAGGACCCGTTTCACAGGGTCTGAGGAGCTGGGAGGGG
TTCTAGGTCCCGGGTCTGGGTGGCTGGGGACACTGGGGAGGGGCTGCTTCTCCCCTGGGTCCCTATGGTGGGGTGGGCAC
TTGGCCGGATCCACTTTCCTGACTGTCTCCCATGCTGTCCCCGCCAG
Intron 5 (SEQ ID NO 8)
GTGTCGGACCGGGACCGTGCTTGATGTCTATGACCTTGGGA
GAGGTACTCCTGGGTGGGCCGCAGGGAGTGCAGGTGACCCTGTCACTGTTGAACACCTGCACCTAGGTGGAGGC
CTCGCTCTCGAAGGCGCGGACCGCGCGGTCATCAGAGTCAT
GATTCCGTTTCCGTCAGAGAAGGAACCGCAACGGCTCAGCCACCAGGCCCCGGTGCCTTGCACCCCAGTCCTGAGCCG
GGGTCTCCTGTCCTGAGGCTCAGAGAGGGGACACAGCCCGCCCTGCCCTTGGGGTCTGGAGTGGTGGGGGTCGAGAGAG
AGTGGGGGACACCGCCAGGCCAGGCCCTGAGGGCAGAGGTGATGTCTGAGTTTCTGCGTGGCCACTGTCAGTCTCCTCGC
CTCCACTCACACAG
5'-region intron 6 (SEQ ID NO 9)
GTAAGGTTCACGTGTGATAGTCGTGTCCAGGATGTGTGTCTCTGGGATATGAATGTGTCTAGAATGCAGTCGTGTCTGTG
ATGCGTTTCTGTGGTGGAGGTACTTCCATGATTTACACATCTGTGATATGCGTGTGTGGCACGTGTGTGTCGTGTGCAT
GTATCTGTGGCGTGCATATTTGTGGTGTGTGTGTGTGTGGCACGTGTGTGTCCATGGTGTGTGTGCCTGTGGTGTGCATG
TGTGTGTGTCTGTGACACGTGCATGTTCTGCTGTGTGCTGATGTCTGTGATGTGCCTATTTGTGGTGTGTGTGTGCAT
GTGTCCGTGACATATGCGTGTCTATGGCATGGGTGTGTGTGGCCCCTTGGCCTTACTCCTTCCTCCTCCAGGCATGGTCC
GCACCATTGTCCTCACGCTCTCGGGTGCTGGTTTGGGGAGCTCCACATTCAGGGTCCTACTTCTAGCATGGGTGCCCCT
GTCCTGTCACAGGGCTGGGCCTTGGAGACTGTAAGCCAGGTTTGAGAGGAGAGTAGGGATGCTGGTGGTACCTTCCTGGA
CCCCTGGCACCCCCAGGACCCCAGTCTGGCCTATGCCGGCTCCATGAGATATAGGAGGCTGATTCAGGCCTCGCTCCCC
GGAAATCCCGGGCGGGCTGGTGCGGTAAGGC-GGTGGTCAC
AGTGGTCATGAGCACGCTGGAGGGGTAAGCCCTCAAGTCGTGCCAGGCCGGGGTGCAGAGGTGAGAAGTATCCCTGA
GCTTCGGTCTGGGGAGAGGCACATGTGGAAACCCACAAGGACCTCTTTCTCTGACTTCTTGAGCT
3'-region intron 6 (SEQ ID NO
TGTGGGATTGGTTTTCATGTGTGGGATAGGTGGGGATCTGTGGGATTGGTTTTTATGAGTGGGGTACAAGAGTTCAAG
GCGAGCTTTCTTCCTGTAGTGGGTCTGCAGGTGCTCCAACAGCTTTATTGAGGAGACCATATCTTCCTTTGAACTATGGT
CGGGTTTATAGTAAGTCAGGGGTGTGGAGGCCTCCCCTGGGCTCCCTGTTCTGTTTCTTCCACTCTGGGGTCGTGTGGTG
CCTGCTGTGGTGTGTGGCCGGTGGGCAGGGCTTCCAGGCCTCCTTGTGTTCATTGGCCTGGATGTGGCCCTGGCTACGCT
CCGTCCTTGGAATTCCCCTGCGAGTTGGAGGCTTTCTTTCTTTCTTTTTTTCTTTCTTTTTTTTTTTTTTTGATAACAGA
GTCTCGCTCTTTTTTGCCCAGGCTGGAGTGGTTTGGCGTGATCTTGGCTCACTGCACCTGTGCTTCCTGAGTTCAGCA
ATCCTCTACTCAGACGATAAGCCCCACTCGCATTTTATTGA
AGACGAGGTTTCTCCATGTTGGCCAGGCTGGTCTCGACTCCTGACCTCAGGTGATCCTCCCACCTCGGCCTCCCAGT
GCTGGGATGACAGGTGTGAACCGCCGCGCCCGGCCGAGACTCGCTTCCTGCAGCTTCCGTGAGATCTGCAGCGATAGCTG
CCTGCAGCCTTGGTGCTGACAACCTCCGTTTTCCTTCTCCAGGTCTCGCTAGGGGTCTTTCCATTTCATGACTCTCTTCA
CAGAAGAGTTTCACGTGTGCTGATTTCCCGGCTGTTTCCTGCGTAATTGGTGTCTGCTGTTTATCGATGGCCTCCTTCCA
TTTCCTTTAGGCTTTGTTTATTGTTGTTTTTCCGGCTCCTTGAGGAAAAGTTTCGATTATGGATGTTTGACTTTCTTT
TCTAAACAAGCATCTGAAGTTGCCGTTTTCCCTCTAAGCAGGGATCCCGAGGCCCCTGGCTGTGGAGTCCCGGTCT
GGGGCCTGTTAGGAACCCGGCGCACAGCGGGAGGCTAGGTGGGGTGTGGGGAGCCAGCGTTCCCGCCTGAGCCCCGCCCC
TCTCAGATCAGCAGTGGCATGCGGTGCTCAGAGGCGCACACACCCTACTGAGACTGTGCGTGAGAGGTCTAGATTCT
GTGCTCCTTATGGGAATCTAATGCCTGATGATCTGAGGTGGAACCGTTTGCTCCAACCATCCCCTTCCCACTGCTG
TCCTGTGGAAAAATCGTCTTCCACGAAACCAGTCCCTGGTACCACAATGGTTGGGGACCCTGTGCTAGACCTGCTTCA
GCAGCCTCTCGTCAGTGTTGATATATTGGCTTTTCTGTGTTGAGTCCAGAATAATTACGGATTTCTGTGATGCTTTCCGC
CGACCTCAGACCCATGGGCTATTTGTGGGCGTGTTGCCTGCTCCTGGGTTGGGAAGGGTGCAGGCCCCATGTACCTTCCT
GTTACTGCCTTCCAGGTTGGTTCTCAGGGTTGAATCGTACTCGATGTGGTTTTAGCCCACGGCCCTGCCGCCAGCTCCTG
GGGGCTGGGGAACATGCTGAGCACAGAGTCACCGTGCGCGTCTTTTGATGCCTCACAAGCTCGAGCCTCCTGTGTCCG
TGTTAGTGTGTGTCACGTGCCTGCTCACATCCTGTCTTGGGGACGCAGGCTTAGGGTCCCGTAGTATGACAAGC
GTCCTGGGGGAGTCTGCAGAATAGGAGGTGGGGGTGCCGGTCTCTCTCCCGCGTCTTAGACTCTTCTCCTGCCTGTGCT
GTGTCCTCTCTCACCCACCGGCGAAGCGGGTGGGCCTTCAG
GACTGTGGATGGCAGTCGGTCACGGGGGTCTGATGTGTGGTGACTGTGGATGGCGGTTGGTCACAGGGTCTGATGTGTG
GTGACTGTGGATGGCGGTCGTGGGTCTGATGTGGTGACTGTGGATGGCGGTCGTGGGGTCTGATGTGTGTGACTGTGG
ATGGCGGTCGTGGGGTCTGATGTGGTGACTGTGGATGGCGGTCGTGGGTCTGATGTGGTGACTGTGGATGCGTCGTG
GGGTCTGATGTGGTGACTGTGGATGGCAGTCGTGGGGTCTGATGTGTGGTGACTGTGGATGGCGGTCGTGGGGTCTGATG
TGGTGACTGTGGATGGCAGTCGTGGGGTCTGATGTGTGGTGACTGTGGATGGCGGTCGTGGGGTCTGATGTGTGGTGACT
GTGGATGGCGGTCGTGGGTCTGATGTGTGGTGACTGTGGATGGCGGTCGTGGGGTCTGATGTGTGTGACTGTGATGG
CGGTCGTGGGGTCTGATGTGGTGACTGTGGATGGCGGTCGTGGTCTGATGTGTGGTGACTGTGGATGGTGATCGGTCA
CAGGGGTCTGATGTGTGGTGACTGTGGATGGCGGTCGTGGGGTCTGATGTGTGGTGACTGTGGATGTGATCGTCAQAG
GGGTCTGATGTGTGGTGACTGTGGATGGCGGTCGTGGGGTCTGATGTGTGGTGACTGTGGATGGCTTTCCCGGG
TCGTTTGGCGGAGCACGCCGGGCGTTTGGCGGAGCGCTGGC
GATGTGTGGTGACTGTGGATGGCGGTCGTGGGGTCTGATGTGTGGTGACTGTGGATGCGTCGTGGGGTCTGATGTGGT
GACTGTGGATGGCGGTCGTGGGGTCTGATGTGGTGACTGTGGATGGCGGTCGTGGGGTCTGATGTGTGGTGACTGTGGAT
GGCGGTTGGTCCCGGGGGTCTGATGTGTGGTGACTGTGGATGGCGGTCGTGGGGTCTGATGTGGTGACTGTGGATGGCAG
TCGTGGGGTCTGATGTGTGGTGACTGTGGATGGCGGTCGTGGGGTCTGATGTGTGGTGACTGTGGATGGCTCGTGG
TCTGATGTGTGGTGACTGTGGATGGCGGTCGTGGGGTCTGATGTGTGGTGACTGTGGATGGCGGTCGTGGGGTCTGATGT
GGTGACTGTGGATGGCGGTCGTGGGGTCTGATGTGTGGTGACTGTGGATGGTGATCGGTACGGGGTCTGATGTGTGGT
GATTGTGGTGGGTTAGGGTATGGAGCGCTGGCGTTGGCGGA
GGCGGTCGTGGGGTCTGATGTGTGGTGACTGTGGATGGCGGTCGTAGGGTCTGATGTGTGGTGACTGTGGATGGCAGTCG
GTCACAGGGGTCTGATGTGTGGTGACTGTGGATGGCGGTCGTGGGGTCTGATGTGTGGTGACTGTGATGGCGTCGTGG
GGTCTGATGTGTGGTGACTGTGGATGGCGGTCGTGGGGTCTGATGTGTGGTGACTGTGGATGGCGGTCGTGGGGTCTGAT
GTGGTGACTGTGGATGGTGATCGGTCACAGGGGTCTGATGTGTGGTAGCTGCAGGTGGAGTCCCAGGTGTGTCTGTAGCT
ACTTTGCGTCCTCGGCCCCCCGGCCCCCGTTTCCCACAGAAGCTTCCCGGCGCTCTCTGGGCTTCTCCCGCCATCG
GGCTTGGCCGCAGGTCCACACGTCCTGATCGGAAGACAGTGCCCAGCTCTGGCCGGGGCAGGCCCATTTGTGGCTC
ATGCCCTCTCCTCTGCCGGCAG
Intron 7 (SEQ ID NO 11)
GTCTGGGCACTGCCCTGCAGGGTTGGGCACGGACTCCCAGCAGTGGGTCCTCCCCTGGGCAATCACTGGGCTCATGACCG
GACAGACTGTTGGCCCTGGGGGGCAGTGGGGGGAATGAGCTGTGATGGGGGCATGATGAGCTGTGTGCCTTGGCGATC
TGAGCTGGGCCATGCCAGGCTGCGACAGCTGCTGCATTCAGGCACCTGCTCACGTTTGACTGCGCGGCCTCTCTCCAGTT
CCGCAGTGCCTTTGTTCATGATTTGCTAAATGTCTTCTCTGCCAGTTTTGATCTTGAGGCCAGAAGTGTCCCCCT
CCTTTAGGAGGGCAGGCCATGTTTGAGCCGTGTCCTGCCCAGCTGGCCCCTCAGTGCTGGGTCTGAGGCCGGACG
TGTCCCCCTTCTTAGGAGGACGGGCCGTGTTTGAGCCACGCCCCGCTGAGCGGGCCTCTCAGTGCTGGGTCTGTCCACGT
GGCCCTGTGGCCCTTTGCAGATGTGGTCTGTCCACGTGGCCCTGTGGCTCTTTGCAGATGCCTGTTAGCACTTGCTCGGC
TCTAGGGGACAGTCGTGTCCACCGCATGAGGCTCAGAGACCTCTGGGCGTTTCCTTGGCTCCAGGTGGGGGTGGAG
GTGGCCTGGGCTGCTGGGACCCAGACCCTGTGCCCGGCAGCTGGGCAGCAACTCCTGGATCACATATGCCATCCGGGCCA
CGGTGGGCTGTGTGGGTGTGAGCCCAGCTGGACCCACAGGTGGCCCAGAGGAGACGTTCTGTGTCACACACTCTGCCTAA
GCCCATGTGTGTCTGCAGAGACTCGGCCCGGCCAGCCCACGATGGCCCTGCATTCCAGCCCAGCCCCGCACTTCATCACA
AACACTGACCCCAAAAGGGACGGAGGGTCTTGGCCACGTGGTCCTGCCTGTCTCAGCACCCACCGCTCACTCCCATGTG
TCTCCCGTCTGCTTTCGCAG
Intron 8 (SEQ ID NO 12)
GTGAGTCAGGTGGCCAGGTGCCATTGCCCTGCGGGTGGCTGGGCGGGCTGGCAGGGCTTCTGCTCACCTCTCTCCTGCCC
CTTCCCCACTGNCCTTCTGCCCGGGGCCACCAGAGTCTCCTTTTCTGGCCCCCGCCCCCTCCGGCTCCTGGGCTGCAGGC
TCCCGAGGCCCCGGAAACATGGCTCGGCTTGCGGCAGCCGGAGCGGAGCAGGTGCCACACGAGGCCTGGATGGCAAGC
GGGGTGTGGAGTTGCTCCTGCGTGGAGGACGAGGGGCGGGGGGTGTGTCTGGGTCAGGTGTGCGCCGAGCGTTTGAGCCT
GCAGCTTGTCAGCTCCAAGTTACTACTGACGCTGGACACCCGGCTCTCACACGCTTGTATCTCTCTCTCCGATAC~AA
GGATTTTATCCGATTCTCATTCCTGTCCCTGTCGTGTGACCCCCGCGAGGGCGCGGGCTCTTCTCTCTGTGACTAGATTT
CCCATCTGGAAAGTGCGGGGTTGACCGTGTAGTTTGCTCCTCTCGGGGGGCCTGTGGTGGCCATGGGGCAGGCGGCCTGG
GAGAGCTGCCGTCACACAGCCACTGGGTGAGCCACACTCACGGTGGTAGAGCCACAGTGCCTGGTGCCACATCACGTCCT
CTGGATTTTAAGTAAAACCACACACCTCCCGGCAGGCATCTGCCTGCGACCCTGTGTGTGCCTGGGGAGAGTGGTAGCAC
GGAGGAAATTCGTGCACACTCAAGGTCATCAGCAAGGTCATCCGCAGTCAGGTGGAACGTGGAGGCCTCTCTCTGGGATC
GTCTCCAGCGGATAAAGGACTGTGCACAGCTTCGGAAGCTTTTATTTAAAAATATAACTATTAATTATTGCATTATAAGT
AATCACTAATGGTATCAGCAATTATAATATTTATTAAAGTATAATTAGAAATATTAAGTAGTACACACGTTCTGGAAAAA
CACAAATTGCACATGGCAGCAGAGTGAATTTTGGCCGAGGGACACGTGTGCACATGTGTGTAAGCGGCCCCCAGGCCCAC
AGAATTCGCTGACAAAGTCACCTCCCCAGAGAAGCCACCACGGGCCTCCTTCGTGGTCGTGAATTTTATTAAGATGGATC
AAGTCACGTACCGTCCACGTGTGGCAGGGCTTTGGGGAATGTGAGGrTGATGACTGCGTCCTCATGCCCTGACAGACAGGA
GGTGACTGTGTCTGTCCTGTCCCTAGGACACGGACAGGCCCGAAGCTCTAGTCCCCATCGTGGTCCAGTTTGCCTCTGA
ATAAAAACGTCTTCAAAACCTGTTGCCCCAAAAACTA-AGAACAGAGAGAGTTTCCCATCCCATGTGCTCACAGGGGCGTA
TCTGCTTGCGTTGACTCGCTGGGCTGGCCGGACTCCTAGAGTTGGTGCGTGTGCTTCTGTGCAAAAAGTGCAGTCCTCTT
GCCCATCACTGTGATATCTGCACCAGCAAGGAAAGCCTCTTTTCTTTTCTTTCTTTTTTTTTTTTTGAGACGGAACGTCA
CTGTTGTCTGCCTGGGCTTGAGTGCAGTGGCGCGATCTCAACTCACTGCAACCTCCGCCTCCCGGGTTCCAGCATTTCTC
CTGCCTCAGCCTCCCGAGCAGCTGAGATTACAGGCACCCACCCCCTGCGCCTGGCTAATTTTGTATTTTTAGTAGAGAG
GGGTTTTTGCCATGTTGGCCAGGCTGGTCTCGAACTCCTGACCTCAGGTGATCCACCCACCTCGGCCTCCCAAAGTGCTG
GGATTACAGGTGTGAGCCATCACGCCCAGCCGGAAAGCCTCTTTTTAAGGTGACCACCTATAGCGCTTCCCGAAAATAAC
AGGTCTTGTTTTTGCAGTAGGCTGCAAGCGTCTCTTAGCAACAGGAGTGGCGTCCTGTGGGCTCTGGGGATGGCTGAGGG
TCGCGTGGCAGCCATGCCTTCTGTGTGCACCTTTAGGTTCCACGGGGCTATTCTGCTCTCACTGTTTGTCTGAAAACGCA
CCCTTGGCATCCTTGTTTGGAGAGTTTCTGCTTCTCGTTGGTCATGCTGAAACTAGGGGCAAGGTTGTATCCGTTGGCGC
GCAGCGGCTACATGTAGGGTCATGAGTCTTTCACCGTGGACATTCCTTGA flAAGGAGTCCGGTTAAGCAT
TCATTCCGGGTCAAGTGTCTGGTTCTGTGA-ATAAACTCTAAGATTTAAGAAACCTTAATGAAAGAAAACCTTGATGATTC
AGAGCAAGGATGTGGTCACACCTGTGGCTGGATCTGTTTCAGCCGCCCCAGTGCATGGTGAGAGTGGGGAGCAGGGATTG
TTTGTTCAGAGGTCTCATCTGGTATGTTTCTGAGGTGTTTGCCGGCTGATGGTAGACGTGTCGTTTGTGTGTATGAGGT
TCTGTGTCTGTGTGTGGCTCGGTTTGAGTGTACGCATGTCCAGCACATGCCCTGCCCGTCTCTCACCTGTGTCTTCCCGC
CCCAG
Intron 9 (SEQ ID NO 13)
GTGAGGCCTCCTCTTCCCCAGGGGGGCTTGGGTGGGGGTTGATTTGCTTTTGATGCATTCAGTGTTAATATTCCTGGTGC
TCTGGAGACCATGACTGCTCTGTCTTGAGGAACCAGACAAGGTTGCAGCCCCTTCTTGGTATGAAGCCGCACGGGAGGGG
TTGCACAGCCTGAGGACTGCGGGCTCCACGtAGGCTCTGTCCAGCGGCCATGTCCAGAGGCCTCAGGCJCTCAGCAGGCGG
GAGGGCCGCTGCCCTGCATGATGAGCATGTGAATTCCACCGAGGAAGCACACCAGCTTCTGTCACGTACCQA.JTTC
CGTTAGGGTCCTTGGGGAGATGGGGCTGGTGCAGCCTGAGGCCCCACATCTCCCAGCAGGCCCTCGACAGGTGGCCTGGA
CTGGGCGCCTCTTCAGCCCATTGCCCATCCCACTTGCATGGGGTCTACACCCAAGGACGCACACACCTATATCGTGCC
AACCTAATGTGGTTCAACTCAGCTGGCTTTTATTGACAGCAGTTACTTTTTTTTTTTTAATACTTTAAGTTCTAGGGTAC
ATGTGCACGACGTGCAGGTTAGTTACATATGTATACATGTGCCATGTTGGTGTGCTGCACCCATTAACTCATCATTTACA
TTAGGTATATCTCCTAATGCTATCCCTCCCCACTCCCCCCATCCCATGACAGGCCCTGGTGTGTGATGTTCCCCACCCTG
TGTCCAAGTGTTCTCATTGTTCAGTTCCCACCTGTGAGTGAGAACATGTGGTGTTTGGTTTTCTTTCCTTGCATAGTTT
GCTCAGAGTGATGGTTTCCAGCTTCGTCCATGTCCCTACAAAGGACATGAACTCATCCTTTTTTATGACTGCATAGTATT
CCGTGGTGTATATGTGCCACATTTTCTTAATCCAGTCTATCATCGATGGACATTTGGGTTGGTTGCAGTCTTTGCTACT
GTGAATAGTGCCGCAATAAACATACGTGTGCATGTGTCTTTATAGCAGCATGATTTATAATCCTTTGGGTATATACCCAG
TAATGGGATGGCTGGGTCAAATGGTATTTCTAGTTCTAGATCCTTGAGGATCACCACACTGTCTTCCTGGTTGAA
CTAGTTTACACTCCCACCAACAGTGTAAAAGTGTTCTGGTGCTGGAGAGGATGTGGAAGCAGTTATTTTTTTATGJAJAA
TAGTATCACTGACAAGCAGACAGTTAGTGAAGGATGCGTCAGGAAGCCTGCAGGCCCACAGCCATTTCTCTCGAAGAC
TCCGGGTTTTTCCTGTGCATCTTTTGAAACTCTAGCTCCATTATAGCATGTACAGTGGATCAGGTTCTTCTTATTAA
GGTTCAAGTTCTAGATTGAATAAGTTTATGTAACAGACAAATTTCTTGTACACACACTTGCTCTGGGATTTGGA
GGAAAGTGTCCTCGAGCTGGCGGCACACTGGTCAGCCCTCTGGGACAGGATACCTCTGGCCCATGGTCATGG~CGCTG
GCTTGGGCCTGAGGGTCACACAGTGCACCATGCCCAGCTTCCTGTGGATAGGATCTGGGTCTCGGATCATGCTGAGGACC
ACAGCTGCCATGCTGGTAAAGGGCACCACGTGGCTCAGAGGGGGCGAGGTTCCCAGCCCCAGCTrTCTTACCGTCTTCAG
TTATTTTTCCCTAAGAGTCTGAGAAGTGGGGCCGCGCCTGATGGCCTTCGTTCGTCTTCAGCTGGCACAGAATTGCACAA
GCTGATGGTAAACACTGAGTACTTATAATGAATGAGGAATTGCTGTAGCAGTTAACTGTAGAGAGCTCGTCTGTTGGA
GAAATTTAAGTTTTTCATTTAACCGCTTTGGAGAATGTTACTTTATTTATGGCTGTGTAAATTGTTTGACATTCAGTCcC
TCGTAGACAGATACTACGTAAAAAGTGTAAAGTTAACCTTGCTGTGTATTTTCCCTTATTTTAG
Intron 10 (SEQ ID NO 14)
GTGAGGCCCGTGCCGTGTGTCTGTGGGGACCTCCACAGCCTGTGGGCTTTGCAGTTGAGCCCCCCGTGTCCTGCCCCTGG
CACCGCAGCGTTGTCTCTGCCAGTCCTCTCTCTCTGCCGGTGCTGGATCCGCAGAGCAGAGGCGCTTGGCCGTGCACC
CAGGCCTGGGGGCGCAGGGGCACCTTCGGAGGGAGTGGGTACCGTGAGGCCCTGGTCCTGCAGAGACGACCCAGTT
ACACACGTGGTGAGTGCAGGCGGTGACCTGGCTCCTGCTGCTCTTTGGAAGTCAAGAGTGGCGGCTCCTGGGGCCCCAG
TGAGACCCCCAGGAGCTGTGCACAGGGCCTGCAGGGCCGAGGCGGCAGCCTCCTCCCCAGGGTGCACCTGAGCCTGCGGA
GAGCAGGAGCTGCTGAGTGAGCTGGCCCACAGCGTTCGCTGCGGTCACGTTCCTGCGTGGGTTGTTTGGATCGGTGGG
AGAATTTGGATTTGCTGAGTGCTGCTGTCTTGAACCACGGAGATGGCTAGGAGTGGGTTTCAGAGTTGATTTTTGTGAAT
CAAACTAAAATCAGGCACAGGGGACCTGGCCTCAGCACAGGGGATTGTCCATGTGGTCCCCCTCAGGGCGCCCCACAG
AGCGGGTGTTAGGGTTAGGGCGGACTGAGTTAGGACTAAAT
TGGCCGCCAGGGGTGGTTTCAGGTGCTTTGCTGGGCTGTGTTTGTGACCCATTTGACCCGCCCTCCAGTCCACCC
TCCAGGTCCACCCTCCAGGGCCGCCCTGGGCTGGGGGTATGCCTGGCGTTCCTTGTGCCGCAGCCCGGAGCACAG~CAC
TGTGCACATTTAAATCCACTAAGATTCACTCGGGGGGAGCCCAGGTCCCAAGCAACTGAGGGCTCAGGAGTCCTGAGGCT
GCTGAGGGGACAGAGCAGACGGGGAACGCTGCTTCTGTGTGGCAAGTTCCTGAGGGTGCTGGCCAGGGAGGTGGCTCAGA
GTGTATGTTGGGGTCCCACCGGGGGCAGAACTCTGTCTCTGATGAGTCGGCAGCCATGTACAGGAAGGGGTGGCCACA.G
GGAGCTGGGAATGCACCAGGGGAGCTGCGCAGCTGGCCGAGGTCCCAGGGCCAGCCACAGGAAGGCAGGGGGACGCCC
GGGGCCACAGCAGAGGCCGCAGGAAGGGAAGGGGATGCCCAGGCCAGAGCAGAGGCTACCGGGCACAGGGGGGCTCCCTG
AGCTGGGTGAGCGAGGCTCATGACTCGGCGAGGGAACCTCCTTGACGTGAAGCTGACGACTGGTGTTGCCCAGCTCACAG
CCCAGCCAGGTCCCGCGCCTGAGCAGGAACTCAGAACCCTCCCCTTTGTCTAAAGCACAGCAGATGCCTTCAGGGCATCT
AGGAGAAAACAGGCAAAGTCGTTGAGAAACGTCTTAAAGAAGGTGGGATGGTGGCATTTCTTGTCCAGATTTTAGTCT
GCCCCGGACCACAGATGAGTCTATAACGGGATTGTGGTGTTGCCATGGGGACACATGAGATGGACCATCACAGAGGCCAC
TGGGGCTGCACCTCCCATCTGAGTCCTGGCTGTCCCGGGTCCAGGCCAGGTTCTTGCATGCTCACCTACCTGTCCTGCCC
GGGAGACAGGGAAAGCACCCCGAAGTCTGGAGCAGGGCTGGGTCCAGGCTCCTCAGAGCTCCTGCCAGGCCCAGCACCCT
GCTCCAAATCACCACTTCTCTGGGGTTTTCCAAAGCATTTAACAAGGGTGTCAGGTTACCTCCTGGGTGACGGCCCCGCA
TCCTGGGGCTGACATTGCCCCTCTGCCTTAG
Intron 11 (SEQ ID NO
GTGAGCGCACCTGGCCGGAAGTGGAGCCTGTGCCCGGCTGGGGCAGGTGCTGCTGCAGGGCCGTTGCGTCCACCTCTGCT
TCCGTGTGGGGCAGGCGACTGCCAATCCCAAAGGGTCAGAGGCCACAGGGTGCCCCTCGTCCCATCTGGGGCTGAGCAGA
AATGCATCTTTCTGTGGGAGTGAGGGTGCTCACAACGGGAGCAGTTTCTGTGCTATTTGGTAAAGGATGGTGCAC
CAGACCTGGGTGCACTGAGGTGTCTTCAGAAAGCAGTCTGGATCCGAACCCAAGACGCCCGGGCCCTGCTGCGTGAGT
CTCTCAAACCCGAACACAGGGGCCCTGCTGGGCATGAGTCCCTCTGAACCCGAGACCCTGGGGCCCTGCTGGGCGTGAGT
CTCTCCGAACCCAGAGACTTCAGGGCCCTTTTGGGCGTGAGTCTCTCCGCTGTGAGCCCCACACTCCAGGCTCATCC
AGTCTACAGGATGCCATGAGTTCATGATCACGTGTGACCCATCAGCGGAGCCATGGTGTGGGGGGTCTCTAA
AATTCTGGGGTCTTGTTTCCCCAGAGCCCGAGAGCTCAAGGCCCCGTCTCACTCAGACACTGATTGAGATGGA
CAAAGAAACGGTTTTTAGAAAAGACAATCGCGGAGTGTAAC
ATACCGATTGAGCAGGGGACCTGGCAGGTGGCACTACAAAT
AAATTCCATTTCTACTTAAAAAATACAAATTAGCCTGGCCTGGTGGCACACGCCTGTAGTCCCCGCTATGCGA3GC
TGAGGCAGGAGAATCATTTGACCCAGGAGGCAGAGGTTGCAGTGAGCCGAGATCACACCACTGCACTCCAGCCTGG.
ACAGAGTGAGACTTCATCTTAAAAAAAAGTATCAGCATTCCAAAACCATAGTGGACAGGTGTTTTTTTATTC
TGTCCTTCGATAATATTTACTGGTGCTGTGCTAGAGGCCGGAACTGGGGTGCCTTCCTCTGAAAGGCACACCTTCATGG
GAAGAGAAATAAGTGGTGAATGGTTGTTAACCAGAGGTTTAAACTGGGGTCCTGTCGTTCTGAGTTAACAGTCCAGATC
TGGACTTTGCCTCTTTCCAGAATGCTCCCTGGGGTTTGCTTCATGGGGGAGCAGCAGGTGTGGACACCCTCGTGATGG
GAGCAGCAGGTGCAGACGCCCTCATGATGGGGGAGTGGCAGGTGCAGACACCCTTGTGCATGGTGCCCAGCATGTCCCTG
TTGCAGCTCCCTCCCCACAAGGATGCCGGTCTCCTGTGCTCCCCACAGTCCCTGCTTCCCTCTCACAGCCTTACCTGGTC
CTGGCCTCCACTGGCTTTGTCTGCATGATTTCCACATTTCCTGGGCTCCCAGCACCTCTTCGCCTCTCCCAGGCACCTCT
GCAGTGCTGGCCATACCAGTCAGCTGTGACTGTCACTGCTTATTTTGCTCCCCATGATGTATTTTTTAGGACAGGC
ACCTGTCGCCGCCGACGGAGTATAGAAAGCGCACATAGAAG
GTTCTCTCTAACACATTGCAAGCCACAGAGGCTAGTGCAGGATGGGTGGGCATCAGGTCATCAGATGTGTCATG
CCGAATTTCCCAGCATGTAATGGGTGAAGGCCAAGTACGGA
GCGGTCCAATAGTACCCTCCGGTTAGAAACGGCTACGCTT
G
GAAGAAAACAGGCAAAATGATTAAGAAGTGAAAGGAAJGTGGTAGATGGGATTTTCTTGTCCAGATTTTAGTC
TCCACAACCGTGAATTGCGATGTGCGAATGAAACGACCACC
CAGAAACGTGTGTTAATGTGGTATGTGGCACAGCTGATGGAAGAGAGTGTGTGTGTATTTTTTTTTCTGAGAACT
GACTGGAAGCAAATAAGTTGTGTCTTTACAGCATATACCAGAGCAGATTCTAGGTAGAGAGGAGACACATGJA.ACAAC
ACACAAAAAACAAATAAGAGGAGGAGTCTGTGTTGGAGAAA
AGGGAGGCGGATGAAACCAGTGAGGCAACGGGCATTGCTTTCACTGCAGAGAACTCAGCTTGCCTGAGCCACAGTGA
ATGGCCATTCCCTGGAGCGTTTGTGCACGTGATTTATTTAJAGGCGCCCTGTGAGGTCCTGCACATTATCCTCTCACTTT
GTTCTCCTAACCACCTGAGAGGTAGAGGAGGAGGCTCCAGGGGAGCAGCCGCCCTTGGTCACCCAGCTGCAAGGGC
ATGCATGATTGCAGCCTGGCCTCCTGCTCCGGGGCCCTTGCTCTGCCCGAGGACCCCACACAAGTCAGACCCATAGGCTC
AGGGTGAGCCGGAGCCCAAGGTCGTGTTGGGGATGGCTGTGGGAATGGACGTCTGATGCACACTTGGGAAGGTC
CTACCAGCAGCGTCAAAGAAATGCATGTGAAACTGACAGCGAGACCCATCCCTCAAAGAAACGCACGTGAAACTGATGGC
GAGACCTGTCCCCATCCCTCATGCTGGCTCCTTTTCTGGGCTTGCGAGCCAGCATCAGGTTGAGGCGCTGG
ACTTTTCTGGAAGCAGCTTGTTTGCATGGAAGTCCTCACATGTCCTGTGTCTTCCCAGTATTCCACTTCTGAGTGA
CCAGACATTATCACGGGTCTTATTTACCATTTCCAGTGTTCCAGGCAGGGGACTTGCCACAGCAAGTCACGAACCTGCC
CAAAAGCAGAAATTCTAAACTGTTCATACTTTAAATTTAGA
GTTTAATGGCACAAAACGTTTATTTCAATGTAGCAGTGTTCA-AGCTGGATGTAAAGACACACCCCAGGAGCCTGCCG
GAATGTCATGTGTGTTCATCTTTGGACATGGACATACATGGGCAGTGAGTGGTGGTGAGGCCCTGGAGGACATCGGTGG
GATGCCTCCATCCTGCCCCTCTGGAGACACCATGTGTGCCACGTGCACTCACTGGAGCCCTGTTTAGCTGGTGCCACCTG
GCTCTTCCATCCCTGAGATTCAAACACAGTGAGATTCCCCACGCCCAACTCAGTGTTCTCCCACAAAACCTGAGTCAC
ACCTGTGTTCACTCGAGGGACGCCCGGGAGCCAGGGCTCCACAGTTTATTATGTGTTTTTGGCTGAGTTATGTGCAGATC
TCATCAGGGCAGATGATGAGTGCACAAACACGGCCGTGCGAGGTTTGGATACACTCAACATCACTAGCCAGTCCTGGTG
GAGTTTGGTCATGCAGAGTCTGGATGGCATGTAGCATTTGGAGTCCATGGAGTGAGCACCCAGCCCCCTCGGGCTGCAGC
GCATGCCCCAGGCAGACAAGGAAGCGGGAGGAAGGCAGGAGGCTCTTTGGAGCAAGCTTTGCAGGAGGGGGCTGGGTGT
GGGGCAGGCACCTGTGTCTGACATTCCCCCCTGTGTCTCAG
Intron 12 (SEQ ID NO 16)
GTGAGCAGGCTGATGGTCAGCACAGAGTTCAGAGTTCAGGAGGTGTGTGCGCAAGTATGTGTGTGTGTGTGTGCGCGCGT
GCCTGCAAGGCTGATGGTGACTGGCTGCACGTAAGAGTGCACATGTACGCATATACACGTGAGCACATACATGTGTGCAT
GTGTGTACATGAAGGCATGGCAGTGTGTGCACAGGTGTGCAAGGGCACAAGTGTGTGCACATGCGATGCACACCTGACA
TGCATGTGTGTTCGTGCACAGTCGTGTGGGCATTCACGTGAGGTGCATGCGTGTGGGTGTGCAGTGTGAGTAGCATGTGT
GCACATAACATGTATTGAGGGGTCCTCGTGTTCACCCCGCTAGGTCCTCAGCACCAGTGCCACTCCTTACAGATGAGAC
GGGGTCCCAGGCCTTGGTGGGCTGAGGCTCTGAAGCTGCAGCCCTGAGGGCATTGTCCCATCTGGGCATCCGCGTCCACT
CCCTCTCCTGTGGGCTTCTGTGTCCACTCCCCCTCTCCTGTGGGCATTTACATCCACTCCACTCCCTCTCTCCTGTGGC
ATCCGCGTCCACTCCCCCTCTCTGTGGGCATCTGCGTCCACCTCCCCTCTCTGTGGGCATTTGCGTCCACTCCCTCTCCT
GGTTCCTTCCTGTCTTGGCCGAGCCTCGGGGGCAGGCAGATGACACAGAGTCTTGACTCGCCCAGGGTGGTTCGCAGCTG
CCGGGTGAGGGCCAGGCCGGATTTCACTGGGAAGAGGGATAGTTTCTTGTCAAAATGTTCCTCTTTCTTGTTCCATCTGA
ATGGATGATAAAGCAAAAAGTAAAAACTTAAAATCCCAGAGAGGTTTCTACCGTTTCTCACTCTTTCTTGGCGACTCTAG
Intron 13 (SEQ ID NO 17)
GTGAGCCGCCACCAAGGGGTGCAGGCCCAGCCTCCAGGGACCCTCCGCGCTCTGCTCACCTCTGACCCGGGGCTTCACCT
TGGAACTCCTGGGTTTTAGGGGCAAGGAATGTCTTACGTTTTCAGTGGTGCTGCTGCCTGTGCACAGTTCTGTTCGCGTG
GCTCTGTGCAAGCACCTGTTCTCCATCTCTGGGTAGTGGTAGGAGCCGTGTGGCCCCAGGTGTCCCCACTGTGCCTGT
GCACTGGCCGTGGGACGTCATGGAGGCCATCCCAGGGCAGCAGGGGCATGGGGTAAAGAGATGTTTATGGGGAGTCTTAG
CAGAGGAGGCTGGGAAGGTGTCTGAACAGTAGATGGGAGATCAGATGCCCGGAGGATTTGGGGTCTCAGCAGAGGGCC
GAGGTGGGTGCAGGTGAGGGTCGCTGGCCCCACCCCCGGGAAGGTGCAGCAGAGCTGTGGCTCCCCACACAGCCCGGCCA
GCACCTGTGCTCTGGGCATGGCTGTGCTCCTGGAACGTTCCCTGTCCTGGCTGGTCAGGGGGTGCCCCTGCCAAGAATCG
ACAACTTTATCACAGAGGGAAGGGCCAATCTGTGGAGGCCACAGGGCCAGCTTCTGCCTGGAGTCAGGGCAGGTGGTGGC
ACAAGCCTCGGGGCTGTACCAAAGGGCAGTCGGGCACCACAGGCCCGGGCCTCCACCTCAACAGGCCTCCCGAGCCACTG
GGAGCTGAATGCCAGGAGGCCGAAGCCCTCGCCCCATGAGGGCTGAGAAGGAGTGTGAGCATTTGTGTTACCCAGGGCCG
AGGCTGCGCGAATTACCGTGCACACTTGATGTGAAATGAGGTCGTCGTCTATCGTGGACCCAGCAAGGGCTCACGGGA
GAGTTTTCCATTACAAGGTCGTACCATGAAAATGGTTTTTAACCCGAGTGCTTGCGCCTTCATGCTCTGGCAGGGAGGGC
AGAGCCACAGCTGCATGTTACCGCCTTTGCACCAGCTCCAGAGGCTTGGGACCAGGCTGTCTCAGTTCCAGGGTGCGTCC
GGCTCAGACCGCCCTCCTCTCTGCCTTCTCTCTCTGCCTCAAATCTTCCCTCGTTTGCATCTCCCTGACGCGTGCCTGGG
CCCTCGTGCAAGCTGCTTGACTCCTTTCCGGAAACCCTTGGGGTGTGCTGGATACAGGTGCCACTGAGGACTGGAGGTGT
CTGACACTGTGGTTGACCCCAGGGTCCAGCTGGCGTGCTTGGGGCCTCCTTGGGCCATGATGAGGTCAGAGGAGTTTTCC
CAGGTGAAAACTCCTGGGAAACTCCCAGGGCCATGTGACCTGCCACCTGCTCCTCCCATATTCAGCTCAGTCTTGTCCTC
ATTTCCCCACCAGGGTCTCTAGCTCCGAGGAGCTCCCGTAGAGGGCCTGGGCTCAGGGCAGGGCGGCTGAGTTTCCCCAC
CCATGTGGGGACCCTTGGGTAGTCGCTTGATTGGGTAGCCCTGAGGAGGCCGAGATGCGATGGGCCACGGGCCGTTTCCA
AACACAGAGTCAGGCACGTGGAAGGCCCAGGAATCCCCTTCCCTCGAGGCAGGAGTGGGAGAACGGAGAGCTGGGCCCCG
ATTTCACGGCAGCCAGGCTGCAGTGGGCGAGGCTGTGGTGGTCCACGTGGCGCTGGGGGCGGGGTCTGATTCAAATCCGC
TGGGGCTCGGCCTTCCTGGCCCGTGCTGGCCGCGCCTCr-ACACGGGCTTGGGGTGGACGCCCCGACCTCTAGCAGGTGGC
TATTTCTCCCTTTGGAAGAGAGCCCCTCACCCATGCTAGGTGTTTCCCTCCTGGGTCAGGAGCGTGGCCGTGTGGCAACC
CCGGGACCTTAGGCTTATTTATTTGTTTAAAACATTCTGGGCCTGGCTTCCGTTGTTGCTWATGGGGAAAGACATCC
CACCTCAGCAGAGTTACTGAGAGGCTGAAACCGGGGTGCTGGCTTGACTGGTGTGATCTCAGGTCATTCCAGpAAGTGGCT
CAGGAAGTCAGTGAGACCAGGTACATGGGGGGCTCAGGCAGTGGGTGAGATGAGGTACACGGGGGGCTCAGJCAGTGGGT
GAGGCCAGGTACATGGGGGGCTCAGGCACTGGGTGAGATGAGGTACACGGGGGGCTCAGGCAGAGGGTr-GACCAGGTAC
ACGGGGGCTCTGATCACACGCACATATGAGCACATGTGCACATGTGCTGTTTCATGGTAGCCAGGTCTGTGCACACCTGC
CCCAAAGTCCCAGGAAGCTGAGAGGCCAAAGATGGAGGCTGACAGGGCTGGCGCGGTGGCTCACACCTGTAGTCCCAGCA
CTTTGGGAGGCCGAGGCGAGAGGATCCCTTGAGCCCAGGAGTTTAAGACCAGCCTGAGCACATAGTAGACCCATCTC
TATGAAAAATAAAAACAAAAATTAGCTGAACATGGTGGTGTGCGCCTGTAGTTCCAATACTTGGGAGGCTGAAGTGGGAG
GATCACTTGAGCCCAGGAGGTGGAAGCTGCAGTGAGCTGAGATTGCACCACTGTACTGCAGCCTGGGTGACAGAGTGAGA
GCCCATCTCAACAACAACAAGAAGACTGACAAATGCAGTTTCTTGGAAAGAAACATTTAGTAGGAACTTACCTACACA
CAGAAGCCAAGTCGGTGTCTCGGTGTCAGTGAGATGAGATGATGGGTCCTCACACCATCACCCCAGACCCAGGGTTTATG
CACCACAGGGGCGGGTGGCTCAGAAGGGATGCGCAGGACGTTGATATACGATGACATCAGGTTGTCTGACGAJGGGCAG
GATTCATGATAAGTACCTGCTGGTACACAAGGAACAATGGATACTGGACCT,TAGAGGCCTTCCCGACAGGGGCT
AATCAGAAGCCAGCATGGGGGGCTGGCATCCAGGATGGAGCTGCTTCAGCCTCCACATGCGTGTTCATACAGATGGTGCA
CAGAAACGCAGTGTACCTGTGCACACACAGACACGCAGCTACTCGCACACACGCACACACACAGAATGATGCTGC
ATCCGTGTGTGTGCACCTGTGCCCATGAGGAACCCATGCATGTGCATTCATGCACGCACACAGGCCGTGGGCCCAT
GCCCACACCCACGAGCACCGTCTGATTAGGAGGCCTTTCCTCTGACGCTGTCCGCCATCCTCTCAG
Intron 14 (SEQ ID NO 18)
GTATGTGCAGGTGCCTGGCCTCAGTGGCAGCAGTGCCTGCCTGCTGGTGTTAGTGTGTCAGGAGACTGAGTGAATCTGGG
CTTAGGAAGTTCTTACCCCTTTTCGCATCAGGAAGTGGTTTAACCCAACCACTGTCAGGCTCGTCTGCCCGCCCTCTCGT
GGGGTGAGCAGAGCACCTGATGGAAGGGACAGGAGCTGTCTGGGAGCTGCCATCCTTCCCACCTTGCTCTGCCTGGGGAA
GCGCTGGGGGGCCTGGTCTCTCCTGTTTGCCCCATGGTGGGATTTGGGGGGCCTGGCCTCTCCTGTTTGCCCTGTGGTGG
GATTGGGCTGTCTCCCGTCCATGGCACTTAGGGCCCTTGTGCAAACCCAGGCCAAGGGCTTAGGAGGAGGCCAGGCCCAG
GCTACCCCACCCCTCTCAGGAGCGAGGCCGCGTATCACCACGACAGAGCCCCGCGCCGTCCTCTGCTTCCCAGTCACCG
TCCTCTGCCCCTGGACACTTTGTCCAGCATCAGGGAGGTTTCTGATCCGTCTGATTCAAGCCATGTCGACCTGCGGT
CCTGAGCTTAACAGCTTCTACTTTCTGTTCTTTCTGTGTTGTGGATTTCACCTGGAGAGCCGAGAACATTTCTG
TCGTGACTCCTGCGGTGCTTGGGTCGGGACAGCCAGAGATGGAGCCACCCCGCAGACCGTCGGTGTGGGCAGCTTTCCG
GTGTCTCCTGGGAGGGGAGCTGGGCTGGGCCTGTGACTCCTCAGCCTCTGTTTTCCCCCAG
Intron 15 (SEQ ID NO 19)
GCAAGTGTGGGTGGAGGCCAGTGCGGGCCCCACCTGCCCAGGGGTCATCCTTGAACGCCCTGTGTGGGGCGAGCAGCCTC
AGATGCTGCTGAAGTGCAGACGCCCCCGGGCCTGACCCTGGGGGCCTGGAGCCACGCTGGCAGCCCTATGTGATTACG
CTGGTGTCCCCAGGCCACGGAGCCTGGCAGGGTCCCCAACTTCTTGAACCCCTGCTTCCCATCTCAGGGGCGATGGCTCC
CCACGCTTGGGAGCCTTCTGACCCCTGACCTGTGTCCTCTCACAGCCTCTTCCCTGGCTGCTGCCCTGAGCTCCTGGGGT
CCTGAGCAAGTTCTCTCCCCGCCCCGCCGCTCCAGCGTCACTGGGCTGCCTGTCTGCTCGCCCCGGTGGAGGGGTGTCTG
TCCCTTCACTGAGGTTCCCACCAGCCAGGGCCACGAGGTGCAGGCCCTGCCTGCCCGGCCACCCACACGTCCTAGGAGGG
TTGGAGGATGCCACCTCTGGCCTCTTCTGGAACGGAGTCTGATTTTGGCCCCGCAG
3'-untranscribed region (SEQ ID NO
ATCTCATGTTTGATCCTAATGTGCACTGCATAGACACCACTGTATGCAATTACAGAGCCTGTGAGTGAACGGGGTGGT
GGTCAGTGCGGGCCCATGGCCTGGCTGTGCATTTACGGAAGTCTATGAGTGAATGGGGTTGTGGTCAGTGCGGCCCATG
GCCTGGCTGGGCCTGGGAGGTTTCTGATGCTGTGAGGCAGGAGGGGAAGGAGGGTAGGGGATAGACAGTGGGAGCCCCCA
CCCTGGAAGACATAACAGTAAGTCCAGGCCCGAAGGGCAGCAGGGATGCTGGGGGCCCAGCTTGGGCGGCGGGGATGATG
GAGGGCCTGGCCAGGGTGGCAGGGATGATGGGGGCCCCAGCTGGGGTGGCAGGGGTGATGGGGGGGGCTGGTCTGGGTGG
CGGGGAAGATGGGGAAGCCTGGCTGGGCCCCCTCCTCCCCTGCCTCCCACCTGCAGCCGTGGATCCG(JATGTGCTTCCCT
GGGAACTTGCACGTTAGAGGGGGAGGAGCCACTTTAACAGT
CCTCCTCCTGAACGCCCCACTCAGGTTGAAGTCACATTCCGCCTCTGGCCATTCTCTTAAGAGTAGACCAGATTCTG
ATCTCTGAAGGGTGGGTAGGGTGGGGCAGTGGAGGGTGTGACACAGGAGGCTTCAGGGTGGCTGGTGATGCTCTCTC
ATCCTCTTATCATCTCCCAGTCTCATCTCTCATCCTCTTATCATCTCCCAGTCTCATCTGTCTTCCTCTTATCTCCCAGT
CTCATCTGTCATCCTCTTACCATCTCCCAGTCTCATCTCTTATCCTCTTATCTCCTAGTCTCATCCAGACTTACCTCCCA
GGGCGGGTGCCAGGCTCGCAGTGGAGCTGGACATACGTCCTTCCTCAGGCAGAAGGACTGGAAGGATTGAGAGACAG
GAGGGGCGGCTCAGAGGGACGCAGTCTTGGGGTGAGAAACAGCCCCTCCTCAGGTTCTTGGGCCAACGACCG
AGGGCCCTGCGTGAGTGGCTCCAGAGCCTTCCAGCAGGTCCCTGGTGGGGCCTTATGGTATGGCCGGGTCCTACTGAGTG
CACCTTGGACAGGGCTTCTGGTTTGAGTGCAGCCCGGACGTGCCTGGTGTCGGGGTGGGGGCTTATGGCCACTGGATATG
GCGTCATTTATTGCTGCTGCTTCAGAGAATGTCTGAGTGACCGAGCCTTGTGTATGTGGGCCCAGTCACGACTG
TGTCGTAAATGCACTCTGGTGCCTGGAGCCCCCGTATAGGAGCTGTGAGGAAGGAGGGGCTCTTGGCAGCCGGCCTGGG
GCGCCTTTGCCCTGCAAACTGGAAGGGAGCGGCCCCGGGCGCCGTGGGCGGACGACCTCAAGTGAGAGGTTGGACAGAAC
AGGGCGGGGACTTCCCAGGAGCAGAGGCCGCTGCTCAGGCACACCTGGGTTTGAATCACAGAC CaGTCGGCCATT
GTTCAGCTATCCATCTTCTACAAAGCTCCAGATTCCTGTTTCTCCGGGTGTTTTTTGTTGATTTTACTCAGGATTACT
TATATTTTTTGCTAAGTATTAGACCCTTAAAAGGTATTTGCTTTGATATGGCTTACTCACTAGCCCTACTTTAT
TTGTCTGTTTTTATTTATTATTATTATTATTATTAGAGATGTGTCTACTCTGTCACCCAGGTTGTTAGTGAGTGGC
AGTCATGGCTCGCTGTAGCCGCAAACCCCCAGGCTCAAGTGATCCTCCGGCCTCAGCTTCCCAGAGTGCTGGGATTACG
GTGTGAGCCACTGCCCTTGCCTGGCACTTTTAAAAACCACTATGTAAGGTCAGGTCCAGTGGCTTCCACACCTGTCATCC
CAGTAGTTTGGGAAGCCGAGGCAGAAGGATTGTCTGAGGCCAGGAGTTTGAGACCAGCATGGGTAACATAGGGAGACCCC
ATCTCTACAAAAAATGCAAAAGTTATCCGGGCGTGGGGTCCAGCATCTGTAGTCCCAGCTGCTCGGGACTGAGTG
AGGATCGCTTGAGCCCGGGAGGTCATGGCTGCAGTGAGCTGTGATTGTACCATCGCACTCGCCTGCACGAGTGA
GACCCTGTCTCAAAAAAAAAAAAAGAAGGAGAAGGAGAAGAGAAGAAGAAGGAAGAAGGAAAGAGAAGAAGAAG
GAAGAAGGAAGAAAGAAGGAGAAGGAGGCCTGCTAGGTGCTAGGTAGACTGTTcTCAGAGTAATAACA
AAGTTTTAGGGAAGAAACCCCAGCTCTTTGGACTTCCTTAGGCCTGACTTCATCTCAGCAGCTTCCTTCCACA
GAAGGGAGACATATCAGAAAGGGAAGAGAGGGAGTTGTAAC
GCCAGGACCCCTGAAAGGGAGTGGTTGTTTTCCTGCCTCAGCCCCACGCTCCTGCCGGTCCTGCACCTGCTGTAACCGTC
GATGTTGGTGCCAGGTGCCCACCTGGGAAGGATGCTGTGCAGGGGGCTTGCCAAACTTTGGTGGGTTTCAGAAGCCCCAG
GCACTTGTGGCAGGCACAATTACAGCCCCTCCCCAAAGATGCCCACGTCCTTCTCCTGGAACCTGTGAATGTGTCACCCG
CAAGGCAGAGGCTGGTGAAGGCTGCAGGTGGAATCACGGCTGCCAGTCAGCCGATCTTAGGTCATCCTGGATTATCTGG
TGGCGTTGCCAGTCTGATAAAGGGCGGAATAAAGGCTAAGA
CACTGGCCACTGCTGGCTTTGAGATGGAGGAGGGGGTCCCCAGCCAAGGAATGGGGGCAGCCGCTCCATGCTGGAAAGC
AAGCAATCCTCCCCGGTCCTGAGGGCACACGGCCCTGCCCACGCCTCGATTTCAGGCCAGTGGGACCTGTTTCAGCTTTC
CGGCCTCCAGAGCTGTAAGATGATGCGTTTGTGTTCAGCCACTAAGCTGCAGTGATTCGTCACAGCAGCAAATGGAATAG
CAGTACAGGGAAATGAATACAGGGACAGTTCTCAGAGTGACTCTCAGCCCACCCCTGGG
Characterization of the exons showed, interestingly, that the functionally important hTC protein domains which are described in our Patent Application PCT/EP/98/03469 are arranged on separate exons. The telomerase-characteristic T motif is located on exon 3. The RT (reverse transcriptase) motifs 1-7, which are important for the catalytic function of the telomerase, are located on the following exons: RT motifs 1 and 2 on exon 4, RT motif 4 on exon 9, RT motif 5 on exon [lacuna], and RT motifs 6 and 7 on exon 11. RT motif 3 is shared by exons 5 and 6 (see Fig. 8).
Elucidation of the exon-intron structure of the hTC gene also shows that the four deletion or insertion variants of the hTC cDNA which were described in our Patent Application PCT/EP/98/03469, as well as three additional hTC insertion variants which are described in the literature (Kilian et al., 1997), in all probability represent alternative splicing products. As shown in Fig. 8, the splicing variants can be divided into two groups: deletion variants and insertion variants.
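The exon-intron boundaries on which such splicing products depend can be inspected with a short script. The following is a minimal Python sketch, not part of the original analysis: the function name is an illustrative assumption, and the two test strings are simply the first and last bases of introns 12 and 15 as printed in the listing above, joined with the internal sequence omitted.

```python
def splice_site_class(intron: str) -> str:
    """Return the donor-acceptor dinucleotide pair of an intron, e.g. 'GT-AG'."""
    seq = intron.upper()
    if len(seq) < 4:
        raise ValueError("intron too short to carry both splice-site dinucleotides")
    return f"{seq[:2]}-{seq[-2:]}"


# Terminal bases of two introns as printed in the listing (internal sequence omitted).
intron_12_ends = "GTGAGCAGGCTGATGG" + "CTTGGCGACTCTAG"
intron_15_ends = "GCAAGTGTGGGTGGAG" + "GGCCCCGCAG"

print("intron 12:", splice_site_class(intron_12_ends))
print("intron 15:", splice_site_class(intron_15_ends))
```

Because the printed listing contains character-recognition errors, any result obtained in this way should be checked against the deposited sequence.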
The hTC variants in the deletion group lack specific sequence segments. The 36 bp in-frame deletion in variant DEL1 in all probability results from the use of an alternative 3' splice acceptor sequence in exon 6, as a result of which part of RT motif 3 is lost. In variant DEL2, the normal 5' splice donor and 3' splice acceptor sequences of introns 6, 7 and 8 are not used. Instead, exon 6 is fused directly to exon 9, which shifts the open reading frame and causes a stop codon to appear in exon [lacuna]. Variant DEL3 is a combination of variants 1 and 2.
The insertion variant group is characterized by the insertion of intron sequences which lead to premature cessation of translation. Instead of the 5' splice donor sequence of intron 5, which is normally used, use is made, in variant INS1, of an alternative, 3'-located splice site, resulting in the insertion of the first 38 bp from intron 4 between exon 4 and exon 5. The insertion, in variant INS2, of a region of the intron 11 sequence likewise results from using an alternative 5' splice donor sequence in intron 11. Since this variant was only described inadequately in the literature (Kilian et al., 1997), it is not possible to determine the precise alternative splice donor sequence in this variant. The insertion of intron 14 sequences between exon 14 and exon 15 in variant INS3 comes from using an alternative 3' splice acceptor sequence, resulting in the 3' part of intron 14 not being spliced.
The hTC variant INS4 (variant [lacuna]), which is described in our Patent Application PCT/EP/98/03469, is characterized by exon 15, and the 5' region of exon 16, being replaced by the first 600 bp of intron 14. This variant can be attributed to the use of an alternative internal 5' splice donor sequence in intron 14 and an alternative 3' splice acceptor sequence in exon 16, resulting in an altered C terminus.
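Whether a deletion or insertion preserves the reading frame follows from its length modulo 3. The sketch below is illustrative only; the function name is an assumption, and the two lengths used are the 36 bp deletion of variant DEL1 and the 38 bp insertion of variant INS1 mentioned above.

```python
def frame_effect(length_change_bp: int) -> str:
    """Classify a net change in coding-sequence length (negative = deletion)."""
    if length_change_bp == 0:
        return "unchanged"
    # A change that is a multiple of 3 removes or adds whole codons only.
    return "in-frame" if length_change_bp % 3 == 0 else "frameshift"


print("DEL1 (-36 bp):", frame_effect(-36))
print("INS1 (+38 bp):", frame_effect(38))
```

A frameshift of this kind is consistent with the premature stop of translation described for the insertion variants.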
The in vivo generation of hTC protein variants which are probably non-functional and which could interfere with the function of the complete hTC protein constitutes a possible mechanism, in addition to transcription regulation, for controlling hTC protein function. The function of the hTC splicing variants is not yet known.
Although most of these variants presumably encode proteins without reverse transcriptase activity, they could nevertheless play a crucial role as transdominant-negative telomerase regulators by, for example, competing for interaction with important binding partners.
The search for possible transcription factor binding sites was carried out using the "Find Pattern" algorithm from the Genetics Computer Group (Madison, USA) GCG Sequence Analysis program package. This resulted in the identification of a variety of potential binding sites for transcription factors in the nucleotide sequence of intron 2, which binding sites are listed in Tab. 2. In addition, an Sp1 binding site was found in intron 1 (pos. 43), and a c-Myc binding site was found in the 5'-untranslated region (cDNA position 29-34, cf. Fig. 6).
Example 6
In order to ascertain the start point(s) of hTC transcription in HL-60 cells, the 5' end of the hTC mRNA was determined by means of primer extension analysis.
2 µg of polyA RNA from HL-60 cells were denatured at 65 °C for 10 min. 1 µl of RNasin (30-40 U/ml) and 0.3-1 pmol of radioactively labelled primer (2.5-8 × 10^5 cpm) were added for primer annealing, and the whole was incubated, at 37 °C for 30 min, in a total volume of 20 µl. After the addition of 10 µl of 5× reverse transcriptase buffer (from Gibco-BRL), 2 µl of 10 mM dNTPs, 2 µl of RNasin (see above), 5 µl of 0.1 M DTT (from Gibco-BRL), 2 µl of ThermoScript RT (15 U/µl; from Gibco-BRL) and 9 µl of DEPC-treated water, primer extension took place, at 58 °C for 1 h, in a total volume [lacuna]. The reaction was stopped by adding 4 µl of 0.5 M EDTA, pH 8.0, and the RNA was degraded, at 37 °C for 30 min, after having added 1 µl of RNase A ([lacuna] mg/ml). 2.5 µg of sheared calf thymus DNA and 100 µl of TE were then added, and the mixture was extracted once with 150 µl of phenol/chloroform. The DNA was precipitated, at -70 °C for 45 min, after adding 15 µl of 3 M Na acetate and 450 µl of ethanol, and then centrifuged at 14,000 rpm for 15 min. The precipitate was washed once with 70% ethanol, dried in air and dissolved in 8 µl of sequencing stop solution. After 5 min of denaturation at 80 °C, the samples were loaded onto a 6% polyacrylamide gel and fractionated electrophoretically (Ausubel et al., 1987) (Fig. [lacuna]).
In this connection, a main transcription start site was identified which is located 1767 bp 5' of the ATG start codon of the hTC cDNA sequence (nucleotide position 3346 in Fig. [lacuna]). In addition to this, the nucleotide sequence around this main transcription start (TTA+1TTGT) represents an initiator element (Inr) which, in 6 out of 7 nucleotides, matches the consensus motif (PyPyA+1N(A/T)PyPy) (Smale, 1997) of an initiator element.
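The stated agreement of 6 out of 7 nucleotides with the initiator consensus can be reproduced by a position-by-position comparison. The following minimal Python sketch assumes that the consensus is written with IUPAC ambiguity codes (Py = Y, A/T = W, N = any base); the function and variable names are illustrative and do not appear in the original analysis.

```python
# IUPAC codes needed for the initiator consensus Py-Py-A(+1)-N-(A/T)-Py-Py.
IUPAC = {"A": "A", "C": "C", "G": "G", "T": "T", "Y": "CT", "W": "AT", "N": "ACGT"}


def count_matches(site: str, consensus: str) -> int:
    """Count the positions at which the site is compatible with the consensus."""
    if len(site) != len(consensus):
        raise ValueError("site and consensus must have the same length")
    return sum(base in IUPAC[code] for base, code in zip(site.upper(), consensus))


INR_CONSENSUS = "YYANWYY"   # assumed IUPAC rendering of the consensus motif
START_SITE = "TTATTGT"      # sequence around the main start; A at index 2 is +1

matches = count_matches(START_SITE, INR_CONSENSUS)
print(f"{matches} of {len(INR_CONSENSUS)} positions match")  # 6 of 7 positions match
```

The single mismatch is the G at position +4, where the consensus calls for a pyrimidine.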
It was not possible to identify any unambiguous TATA box in the immediate vicinity of the experimentally identified main transcription start, which means that the hTC promoter should probably be classified in the family of TATA-less promoters (Smale, 1997). However, a potential TATA box from nucleotide position 1306 to nucleotide position 1311 (Fig. 4) was found by means of bioinformatics analysis. The subsidiary transcription starts which were additionally observed around the main transcription start have also been described in the case of other TATA-less promoters (Geng and Johnson, 1993), for example in the strongly regulated promoters of some cell cycle genes (Wick et al., 1995).
Example 7
In addition to the start point of the hTC transcript which was described in Example 6 and identified in HL60 cells, a further transcription start region was also identified in HL60 cells. With the aid of RT-PCR analyses, the region of the hTC gene transcription start in HL60 cells was localized to bp -60 to bp -105.
The cDNA for this was synthesized using a First Strand cDNA Synthesis kit (Clontech), in accordance with the manufacturer's instructions, and employing 0.4 µg of HL60 cell polyA RNA (Clontech) and the gene-specific primer GSP13 (5'-CCTCCAAAGAGGTGGCTTCTTCGGC-3', cDNA position 920-897). In a final volume of 50 µl, 10 pmol of dNTP mix were added to 1 µl of cDNA, and a PCR reaction was carried out in 1× PCR reaction buffer F (PCR-Optimizer kit from InVitrogen) using one unit of platinum Taq DNA polymerase (from Gibco/BRL).
10 pmol of each of the 5' and 3' primers defined below were added as primers. The PCR was carried out in 3 steps. A two-minute denaturation at 94 °C was followed by 36 PCR cycles in which the DNA was first denatured at 94 °C for 45 sec, after which the primers were annealed and the DNA chain was extended at 68 °C for [lacuna] min. The cycles were concluded by a chain extension at 68 °C for 10 min. In all, six different 5' PCR primers were combined with the 3' PCR primer (5'-GTCCCAGGGCACGCACACCAG-3', cDNA position 245 to 225):
primer [lacuna]: 5'-CGCAGCCACTACCGCGAGGTGC-3', cDNA position 105 to 126;
primer [lacuna]: 5'-CTGCGTCCTGCTGCGCACGTGGGAAGC-3', 5'-flanking region -49 to -23;
primer PRO-TEST1: 5'-CTCGCGGCGCGAGTTTCAGGCAG-3', region -74 to -52;
primer PRO-TEST2: 5'-CCAGCCCCTCCCCTTCCTTTCC-3', region -112 to -91;
primer PRO-TEST4: 5'-CCAGCTCCGCCTCCTCCGCGC-3', 5'-flanking region -191 to -171;
primer RP-3A: 5'-CTAGGCCGATTCGACCTCTCTCC-3', 5'-flanking region -427 to -405.
Genomic DNA was also employed for the PCR, as a control, in addition to the oligo-dT- and GSP13-primed cDNAs. As Fig. 9 shows, a PCR product was only obtained with the primer combinations HTRT5B-C5Rback, C5S-C5Rback and [lacuna], indicating that the start point for hTC transcription lies in the region between [lacuna] and bp -105.
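A simple consistency check on the primer data given above is to compare the length of each oligonucleotide with the length of the stated position range. The following Python sketch is illustrative only: the labels beginning with "unnamed" are placeholders for primers whose names are not legible in the text, and the helper names are assumptions. The check assumes that a range does not cross from negative to positive coordinates, which holds for all ranges given here.

```python
# (label, sequence, first position, last position) as given in the text.
# Labels starting with "unnamed" are placeholders, not names from the source.
PRIMERS = [
    ("unnamed 5' primer A", "CGCAGCCACTACCGCGAGGTGC", 105, 126),
    ("unnamed 5' primer B", "CTGCGTCCTGCTGCGCACGTGGGAAGC", -49, -23),
    ("PRO-TEST1", "CTCGCGGCGCGAGTTTCAGGCAG", -74, -52),
    ("PRO-TEST2", "CCAGCCCCTCCCCTTCCTTTCC", -112, -91),
    ("PRO-TEST4", "CCAGCTCCGCCTCCTCCGCGC", -191, -171),
    ("RP-3A", "CTAGGCCGATTCGACCTCTCTCC", -427, -405),
    ("unnamed 3' primer", "GTCCCAGGGCACGCACACCAG", 245, 225),
]


def span_length(first: int, last: int) -> int:
    """Number of positions covered by an inclusive range (either orientation)."""
    return abs(last - first) + 1


def gc_percent(seq: str) -> float:
    """Percentage of G and C bases in a sequence."""
    return 100.0 * sum(base in "GC" for base in seq.upper()) / len(seq)


def check_primers(primers=PRIMERS):
    """Return (label, primer length, span length, GC %) for each primer."""
    return [
        (label, len(seq), span_length(first, last), round(gc_percent(seq), 1))
        for label, seq, first, last in primers
    ]


for label, length, span, gc in check_primers():
    status = "ok" if length == span else "MISMATCH"
    print(f"{label:20s} length={length:2d} span={span:2d} GC={gc:5.1f}% {status}")
```

A mismatch would point either to a transcription error in the sequence or to a primer carrying additional non-templated bases.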
Example 8
Several extremely GC-rich regions, so-called CpG islands, are located in the isolated region, of about 11.2 kb in size, of the hTC gene. One CpG island, having a GC content of > 70%, extends from bp 1214 into intron 2. Two further GC-rich regions having a GC content of > 60% extend from bp -3872 to bp -3113 and from bp -5363 to bp -3941, respectively. The positions of the CpG islands are shown graphically in Fig. 11.
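GC-rich regions of this kind are typically located by scanning the sequence with a sliding window. The following Python sketch shows one way of doing so; it is not the procedure used for the analysis described above, and the function names, the window size and the step size are assumptions chosen for illustration. Only the thresholds of 60% and 70% are taken from the text.

```python
def gc_fraction(seq: str) -> float:
    """Fraction of G and C bases in a sequence."""
    seq = seq.upper()
    return sum(base in "GC" for base in seq) / len(seq)


def cpg_obs_exp(seq: str) -> float:
    """Observed/expected CpG ratio: (#CG * length) / (#C * #G)."""
    seq = seq.upper()
    c, g = seq.count("C"), seq.count("G")
    if c == 0 or g == 0:
        return 0.0
    return seq.count("CG") * len(seq) / (c * g)


def gc_rich_windows(seq: str, window: int, step: int, min_gc: float):
    """Return (0-based start, GC fraction) for every window with GC >= min_gc."""
    hits = []
    for start in range(0, len(seq) - window + 1, step):
        gc = gc_fraction(seq[start:start + window])
        if gc >= min_gc:
            hits.append((start, gc))
    return hits


# Toy sequence: an AT-rich stretch followed by a GC-rich stretch.
toy = "AT" * 5 + "GC" * 5
print(gc_rich_windows(toy, window=10, step=10, min_gc=0.7))  # [(10, 1.0)]
```

For a genomic sequence of about 11 kb, a window of a few hundred base pairs would be a more realistic choice than the toy values used here.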
The search for possible transcription factor binding sites was carried out using the "Find Pattern" algorithm from the Genetics Computer Group (Madison, USA) GCG Sequence Analysis program package. This resulted in the identification of a variety of potential binding sites in the region up to -900 bp upstream of the translation start codon ATG: five Sp1 binding sites, one c-Myc binding site, and one CCAC box (Fig. 10). In addition, a CCAAT box and a second c-Myc binding site were found at positions -1788 and -3995, respectively, of the 5'-flanking region.
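A pattern search of this kind can be approximated with regular expressions. The sketch below is not the GCG "Find Pattern" program and does not use its transcription factor database; the three motifs are common textbook core sequences (GC box for Sp1, E box for c-Myc, CCAAT box) chosen as assumptions for illustration, and the test sequence is synthetic.

```python
import re

# Illustrative core motifs only; a real search would use a curated pattern file.
MOTIFS = {
    "Sp1 (GC box)": "GGGCGG",
    "c-Myc (E box)": "CACGTG",
    "CCAAT box": "CCAAT",
}


def reverse_complement(seq: str) -> str:
    """Reverse complement of an unambiguous DNA sequence."""
    return seq.upper().translate(str.maketrans("ACGT", "TGCA"))[::-1]


def find_sites(seq: str, motifs=MOTIFS):
    """Return sorted (1-based position, motif name, strand) for all hits."""
    seq = seq.upper()
    hits = []
    for name, motif in motifs.items():
        patterns = {"+": motif}
        rc = reverse_complement(motif)
        if rc != motif:                      # skip palindromic motifs on '-'
            patterns["-"] = rc
        for strand, pattern in patterns.items():
            # A lookahead is used so that overlapping occurrences are reported.
            for match in re.finditer(f"(?=({pattern}))", seq):
                hits.append((match.start() + 1, name, strand))
    return sorted(hits)


synthetic = "TTGGGCGGAACACGTGTTATTGGAA"
for position, name, strand in find_sites(synthetic):
    print(position, name, strand)
```

Positions are reported relative to the start of the input sequence; converting them to coordinates relative to the ATG start codon requires the offset of that codon in the sequence searched.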
Example 9
In order to analyse the activity of the hTC promoter, PCR amplification was used to generate four hTC promoter sequence segments of differing length, which segments were cloned into the Promega vector pGL2 5' in front of the luciferase reporter gene.
The 8.5 kb SacI fragment which was subcloned from phage clone P12 was selected as the DNA source for the PCR amplification. In a final volume of 50 µl, 10 pmol of dNTP mix were added to 35 ng of this DNA, and a PCR reaction was carried out in 1× PCR reaction buffer (PCR-Optimizer kit from InVitrogen) using one unit of platinum Taq DNA polymerase (from Gibco/BRL). In each case 20 pmol of the 5' and 3' primers which are defined below were added as primers. The PCR was carried out in three steps. A two-minute denaturation at 94 °C was followed by 30 PCR cycles in which the DNA was first denatured at 94 °C for 45 sec, after which the primers were annealed, and the DNA chain was extended, at 68 °C for 5 min. The cycles were concluded by a chain extension at 68 °C for 10 min. The selected 3' PCR primer was in each case the primer PK-3A (5'-GCAAGCTTGACGCAGCGCTGCCTGAAACTCG-3', position -43 to [lacuna]), which recognizes a sequence region 42 bp upstream of the ATG START codon. A promoter fragment of 4051 bp in size (NPK8) was amplified by combining the PK-3A primer with the 5' PCR primer [lacuna] (5'-CCAGATCTCTGGAACACAGAGTGGCAGTTTCC-3', position -4093 to -4070). Combining the pair of primers PK-3A and [lacuna] (5'-CCAGATCTGCATGAAGTGTGTGGGGATTTGCAG-3', position -3120 to -3096) led to the amplification of a promoter fragment of 3078 bp in size ([lacuna]). Use of the primer combination PK-3A and [lacuna] (5'-GGAGATCTGATCTTGGCTTACTGCAGCCTCTG-3', position -2110 to -2087) amplified a promoter fragment of 2068 bp in size (NPK22). Finally, using the primer combination PK-3A and [lacuna] (5'-GGAGATCTGTCTGGATTCCTGGGAAGTCCTCA-3', position -1125 to -1102) led to the amplification of a promoter fragment of 1083 bp in size (NPK27).
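The stated fragment sizes follow directly from the primer positions, counting inclusively from the 5' end of each upstream primer to position -43 at the end of PK-3A. The short Python sketch below reproduces this arithmetic; the function and variable names are illustrative, and the restriction-site tails added by the primers are deliberately not counted.

```python
PK_3A_END = -43  # promoter position at which the amplified region ends

# 5' start position of each upstream primer, mapped to the size stated in the text.
STATED_SIZES = {-4093: 4051, -3120: 3078, -2110: 2068, -1125: 1083}


def fragment_size(five_prime_pos: int, three_prime_pos: int = PK_3A_END) -> int:
    """Inclusive length of the promoter region between two positions."""
    return three_prime_pos - five_prime_pos + 1


for start, stated in STATED_SIZES.items():
    computed = fragment_size(start)
    print(f"{start} to {PK_3A_END}: computed {computed} bp, stated {stated} bp")
```

All four computed values agree with the sizes given for the promoter fragments.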
The PK-3A primer contains a HindIII recognition sequence. The different 5' primers contain a BglII recognition sequence.
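The presence of the recognition sequences can be confirmed by searching the primer sequences for the known sites AAGCTT (HindIII) and AGATCT (BglII). The following Python sketch is illustrative; the variable and function names are assumptions, and the primer sequences are those given in Example 9.

```python
SITES = {"HindIII": "AAGCTT", "BglII": "AGATCT"}

PK_3A = "GCAAGCTTGACGCAGCGCTGCCTGAAACTCG"
FIVE_PRIME_PRIMERS = [
    "CCAGATCTCTGGAACACAGAGTGGCAGTTTCC",
    "CCAGATCTGCATGAAGTGTGTGGGGATTTGCAG",
    "GGAGATCTGATCTTGGCTTACTGCAGCCTCTG",
    "GGAGATCTGTCTGGATTCCTGGGAAGTCCTCA",
]


def site_offset(primer: str, enzyme: str) -> int:
    """0-based offset of the enzyme's recognition site in the primer, or -1."""
    return primer.upper().find(SITES[enzyme])


print("PK-3A HindIII offset:", site_offset(PK_3A, "HindIII"))
for primer in FIVE_PRIME_PRIMERS:
    print(primer[:10] + "...", "BglII offset:", site_offset(primer, "BglII"))
```

In every primer the site begins at the third base, that is, directly after a two-base 5' overhang.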
The resulting PCR products were purified using the Qiagen QIAquick spin PCR purification kit, in accordance with the manufacturer's instructions, and then digested with the restriction enzymes BglII and HindIII. The pGL2 promoter vector was digested with the same restriction enzymes, and the SV40 promoter contained in this vector was released and removed. The PCR promoter fragments were ligated into the vector, which was then transformed into competent DH5α bacteria (from Gibco/BRL). DNA for the promoter activity analyses, which are described below, was isolated from transformed bacterial clones using the Qiagen plasmid kit.
Example [lacuna]
The activity of the hTC promoter was analysed in transient transfections in eukaryotic cells.
All the work with eukaryotic cells was carried out at a sterile workstation. CHO-K1 and HEK 293 cells were obtained from the American Type Culture Collection.
CHO-K1 cells were kept in DMEM Nut Mix F-12 cell culture medium (from Gibco-BRL, order number: 21331-020) containing 0.15% streptomycin/penicillin, 2 mM glutamine and 10% FCS (from Gibco-BRL).
HEK 293 cells were cultured in DMEM cell culture medium (from Gibco-BRL, order number: 41965-039) containing 0.15% streptomycin/penicillin, 2 mM glutamine and FCS (from Gibco-BRL).
CHO-K1 and HEK 293 cells were cultured at 37 °C in a water-saturated atmosphere while being gassed with 5% CO2. When the cell lawn was confluent, the medium was sucked off, after which the cells were washed with PBS (100 mM KH2PO4, pH 7.2; 150 mM NaCl) and released by adding a trypsin-EDTA solution (from Gibco-BRL). The trypsin was inactivated by adding medium and the cell count was determined using a Neubauer counting chamber in order to plate out the cells at the desired density.
For the transfection, in each case 2 × 10^5 HEK 293 cells were plated out, per well, in a 24-well cell culture plate. The HEK 293 medium was removed after 3 hours. For the transfection, up to 2.5 µg of plasmid DNA, 1 µg of a CMV β-Gal plasmid construct (from Stratagene, order number: 200388), 200 µl of serum-free medium and 10 µl of transfection reagent (DOTAP from Boehringer Mannheim) were incubated at room temperature for 15 minutes and then dropped uniformly onto the HEK 293 cells. [lacuna] ml of medium were added after 3 hours. The medium was changed after 20 hours.
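Plating a defined number of cells per well requires converting the counting chamber reading into a cell concentration. The following Python sketch illustrates this conversion for a Neubauer chamber, in which one large square corresponds to a volume of 10^-4 ml. The counts are hypothetical example values and the function names are assumptions; only the target of 2 × 10^5 cells per well is taken from the text.

```python
from statistics import mean

LARGE_SQUARE_VOLUME_ML = 1e-4  # 1 mm x 1 mm x 0.1 mm depth


def cells_per_ml(counts_per_large_square, dilution_factor: float = 1.0) -> float:
    """Cell concentration from the mean count per large square."""
    return mean(counts_per_large_square) * dilution_factor / LARGE_SQUARE_VOLUME_ML


def volume_for_cells(target_cells: float, concentration_per_ml: float) -> float:
    """Volume of suspension (ml) that contains the target number of cells."""
    return target_cells / concentration_per_ml


hypothetical_counts = [48, 52, 50, 50]          # example values, not measured data
concentration = cells_per_ml(hypothetical_counts)
volume = volume_for_cells(2e5, concentration)
print(f"{concentration:.0f} cells/ml, {volume:.2f} ml per well")
```

With these example counts the suspension contains 5 × 10^5 cells per ml, so that 0.4 ml would be plated per well.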
After a further 24 hours, the cells were harvested for determining the luciferase activity and the β-Gal activity. For this, the cells were lysed, at room temperature for 15 minutes, in the cell culture lysis reagent (25 mM Tris [pH 7.8] containing H3PO4; 2 mM CDTA; 2 mM DTT; 10% glycerol; 1% Triton X-100). Twenty µl of this cell lysate were mixed with 100 µl of luciferase assay buffer (20 mM Tricine; 1.07 mM (MgCO3)4Mg(OH)2·5H2O; 2.67 mM MgSO4; 0.1 mM EDTA; 33.3 mM DTT; 270 µM coenzyme A; 470 µM luciferin; 530 µM ATP), and the light generated by the luciferase was measured.
In order to measure the β-galactosidase activity, equal quantities of cell lysate and β-galactosidase assay buffer (100 mM sodium phosphate buffer, pH 7.3; 1 mM MgCl2; [lacuna] mM β-mercaptoethanol; 0.665 mg of ONPG/ml) were incubated at 37 °C for at least 30 minutes or until a slight yellow coloration appeared. The reaction was stopped by adding 100 µl of 1 M Na2CO3, and the absorption was determined at 420 nm.
In order to analyse the hTC promoter, four hTC promoter sequence segments of differing length were cloned 5' in front of the luciferase reporter gene (cf. Example 9).
The relative luciferase activities of two independent transfections in HEK 293 cells, using the constructs NPK8, NPK15, NPK22 and NPK27, are plotted in Fig. 11. Each experiment was carried out in duplicate. The standard deviation has also been given.
The construct NPK27 exhibits a luciferase activity which is 40 times higher than the basal activity of the promoterless luciferase control construct (pGL2-basic) and from 2 to 3 times higher than that of the SV40 promoter control construct (pGL2PRO).
Interestingly, a luciferase activity which was from 2 to 3 times lower than that obtained with the NPK27 construct was observed in cells which were transfected with longer hTC promoter constructs (NPK8, NPK15, NPK22). Similar results were also observed in CHO cells (data not shown).
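Relative activities of this kind are commonly obtained by dividing each luciferase reading by the β-galactosidase reading of the same lysate and expressing the result as a multiple of the promoterless control. The following Python sketch illustrates that calculation under this assumption; the text does not state the exact normalization used. All readings are hypothetical example values, and the function names are illustrative.

```python
from statistics import mean, stdev


def normalized(luciferase, beta_gal):
    """Luciferase signal divided by the beta-galactosidase signal, per well."""
    return [luc / gal for luc, gal in zip(luciferase, beta_gal)]


def fold_over_basal(sample_ratios, basal_ratios) -> float:
    """Mean normalized activity of a construct relative to the basal control."""
    return mean(sample_ratios) / mean(basal_ratios)


# Hypothetical duplicate readings (light units, absorbance at 420 nm).
basal = normalized([90.0, 110.0], [0.9, 1.1])        # promoterless control
npk27 = normalized([3800.0, 4200.0], [0.95, 1.05])   # shortest promoter construct

print(f"fold over basal: {fold_over_basal(npk27, basal):.1f}")
print(f"standard deviation of normalized NPK27 values: {stdev(npk27):.3g}")
```

The example values were chosen so that the result reproduces the 40-fold difference reported for NPK27; they are not measured data.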
The reference to any prior art in this specification is not, and should not be taken as, an acknowledgement or any form of suggestion that the prior art forms part of the common general knowledge in Australia.
It will be appreciated that an 'isolated' nucleic acid, or DNA as referred to in the claims which follow, is one which has been identified and separated from at least one contaminant nucleic acid molecule with which it is associated in its natural state.
Accordingly, it will be understood that isolated nucleic acids are in a form which differs from the form or setting in which they are found in nature. An isolated nucleic acid may be obtained using a number of techniques known in the art, for example, recombinant DNA technologies or chemical synthesis. It will further be appreciated that 'isolated' does not reflect the extent to which the nucleic acid molecule has been purified.
References
Allsopp, R. Vaziri, Patterson, Goldstein, Younglai, Futcher, Greider, C.W. and Harley, C.B. (1992). Telomere length predicts replicative capacity of human fibroblasts. Proc. Natl. Acad. Sci. 89, 10114-10118.
Ausubel, Brent, Kingston, Moore, Seidman, Smith, Struhl, K. (1987). Current protocols in molecular biology. Greene Publishing Associates and Wiley-Interscience, New York.
Blasco, M. Rizen, Greider, C. W. and Hanahan, D. (1996). Differential regulation of telomerase activity and telomerase RNA during multistage tumorigenesis. Nature Genetics 12, 200-204.
Broccoli, Young, J. W. and deLange, T. (1995). Telomerase activity in normal and malignant hematopoietic cells. Proc. Natl. Acad. Sci. 92, 9082-9086.
Counter, C. Avilion, A. LeFeuvre, C. Stewart, N. G. Greider, C.W. Harley, C. B. and Bacchetti S. (1992). Telomere shortening associated with chromosome instability is arrested in immortal cells which express telomerase activity. EMBO J. 11, 1921-1929.
Feng, Funk, W. Wang, Weinrich, S. Avilion, Chiu, Adams, R.R., Chang, Allsopp, Yu, Le, West, Harley, Andrews, Greider, C.W. and Villeponteau, B. (1995). The RNA component of human telomerase. Science 269, 1236-1241.
Geng, and Johnson, L.F. (1993). Lack of an initiator element is responsible for multiple transcriptional initiation sites of the TATA-less mouse thymidine synthase promoter. Mol. Cell. Biol. 14:4894.
Goldstein, S. (1990). Replicative senescence: The human fibroblast comes of age. Science 249, 1129-1133.
Harley, Futcher, Greider (1990). Telomeres shorten during ageing of human fibroblasts. Nature 345, 458-460.
Hastie, N. Dempster, Dunlop, M. Thompson, A. Green, D.K. and Allshire, R.C. (1990). Telomere reduction in human colorectal carcinoma and with ageing. Nature 346, 866-868.
Hiyama, Hirai, Kyoizumi, Akiyama, Hiyama, Piatyszek, Shay, J.W., Ishioka, S. and Yamakido, M. (1995). Activation of telomerase in human lymphocytes and hematopoietic progenitor cells. J. Immunol. 155, 3711-3715.
Kim, Piatyszek, Prowse, Harley, C. West, Ho, Coviello, Wright, Weinrich, S.L. and Shay, J.W. (1994). Specific association of human telomerase activity with immortal cells and cancer. Science 266, 2011-2015.
Latchman, D.S. (1991). Eukaryotic transcription factors. Academic Press Limited, London.
Lingner, Hughes, Shevchenko, Mann, Lundblad, V. and Cech T.R. (1997). Reverse transcriptase motifs in the catalytic subunit of telomerase. Science 276, 561-567.
Lundblad, V. and Szostak, J. W. (1989). A mutant with a defect in telomere elongation leads to senescence in yeast. Cell 57, 633-643.
McClintock, B. (1941). The stability of broken ends of chromosomes in Zea mays. Genetics 26, 234-282.
Meyne, Ratliff, R. L. and Moyzis, R. K. (1989). Conservation of the human telomere sequence (TTAGGG)n among vertebrates. Proc. Natl. Acad. Sci. 86, 7049-7053.
Olovnikov, A. M. (1973). A theory of marginotomy. J. Theor. Biol. 41, 181-190.
Sandell, L. L. and Zakian, V. A. (1993). Loss of a yeast telomere: Arrest, recovery and chromosome loss. Cell 75, 729-739.
Shapiro, Senapathy (1987). RNA splice junctions of different classes of eukaryotes: sequence statistics and functional implications in gene expression. Nucl. Acids Res. 15, 7155-7174.
Smale, S.T. and Baltimore, D. (1989). The "initiator" as a transcription control element. Cell 57, 103-113.
-59- Smale, S.T. (1997). Transcription initation from TATA-less promoters within eukaryotic proteincoding genes. Biochimica et Biophysica Acta 1351, 73-88.
Shay, J. W. (1997). Telomerase and Cancer. Ciba Foundation Meeting: Telomeres and Telomerase. London.
Vaziri, Dragowska, Allsopp, R., Thomas, T., Harley, C. B. and Lansdorp, P. M. (1994). Evidence for a mitotic clock in human hematopoietic stem cells: Loss of telomeric DNA with age. Proc. Natl. Acad. Sci. 91, 9857-9860.
Wick, Harnen, Mumberg, Birger, Olsen, Budarf, Apte, S. S. and Müller, R. (1995). Structure of the human TIMP-3 gene and its cell-cycle-regulated promoter. Biochemical Journal 311, 549-554.
Zakian, V. A. (1995). Telomeres: Beginning to understand the end. Science 270, 1601-1607.
SEQUENCE LISTING
<110> Bayer AG
<120> Regulatory DNA sequences from the 5' region of the gene for the human catalytic telomerase subunit, and their diagnostic and therapeutic use
<130> LeA32805-Foreign Countries
<140>
<141>
<160>
<170> PatentIn Vers.
<210> 1
<211> 5126
<212> DNA
<213>
Homo sapiens <400> 1 gagctctgaa gggtaatgaa ccgttccttc tcgggtgtga gagcacgggc accaggctgg caggcactcc gagactcagc gcctcagctt tcacctgtcc ggtgtctgtc agactggctc ggctgcacgc ccggtgtgtt ctagggtctc ggaaatgcaa caggcctggg ccccctccct aagacccagc cgcacatcat caaagcaggg tccgcacggt catctcaagg gctcaaaaag gggggcggca atggtattgg ggctgtgcca acgtcctgat gtgatctccg gtaatccagg gatggaggca gcgcctccag gcacggctgg agggcactcg acccatgcac cagggctgaa ttattttatt agcggcatga tcagcctccc ttttagtaga gtgatccgcc ggcctattta catggagttc ttcgtagact tcccatggga tgccatctgc atctcaatgt actgggattg tggaggaagg ccgtggaaac gtggtgtgca catcattatt caagccatga cacacccctg ggtgacaaca cccaaattct ctggggtgcc ctccagcagc cactgtgtct tccttcccca ctctgagcct tgacctccat cttctgtttc ggggttttta catttgggtg gatggagccc ctggaacaca attggcaccc gtacacactc aaatccctgc ggacagttcc gaattacgct aaagaatttc gctgggggct ctcagttatg tctttgccat tcccccaaac tgaggaccct ggttctggga gtcagtctga aagctggaaa cccttagccc cgctgccctt tgtgaatcta gtgcctccgg tacttacttt tcttggctca aagtagctgg gatgggcttt cacctcagcc accattttaa aatttcccct ggggatacac cccactgcag cagtagaaac ctcagtgtgt agccccttcc aatgatactt gaacatgacc ggaaatggcc catcttcacc caaaact cag atatattaag gcggctgaac agggcctggt acactgaggc ttcctaaacc tgtctcagcg acactcacat gaacctggct ttccaggcgc tgtgctcctt taggcatagg tgaaagtagg ccgccaggga gagtggcagt ctggacattt ccgtccacga taaaatgtcc tcacagtgaa gagtcaaaac accccatggc actgcacgca ggagactaac gcccgagtgt ctgtggacag gaggtctggg agaggcgggc ggctgaaaag aagcggggaa accagggccc ctagcatgaa ggattatttc gcaagggcag ctgagacaga ctgcaacctc gatttcaggc caccatgttg tcccaaagtg aacttccctg ttactcagga cgtctcttga gggcagctgg ctgatgtaga gctgaaacat ctatcccccc tgttattttt cttgcctgcc atgtaaatta ccc aaggac t tacaaacacc agtccaggag agtctgttcc tgctgcttcc cagccctgtc ctgggtgggc acgtagctcg gcgt tgaagg cgtggccccc tccccgtctc tccacgtcca acgggggcgt agtgcctgtc: cccgcccttc ttccacaagc gccccacagc ccgacccccg tttaacaaac gaggaacatg tgccacctcc aggggagtgg ccttttacta cataggggag cctgggcagg aacccgcccg atccttcggg aggagggtca ggagggaggg gggaccctcc 
atcgtggacc gtgtgtgggg aaaacaaagg ggcaggcacg gttatgctct.
cgtctcctgg gtgcaccacc: gt caagctga ctgggattac ggctcaagtc gttaccctcc catattcaca gaggctgcag atcagggcgc gtagaaatta ccaggggcag cactgctggt tgcttccctg cacgactctg gaatgattcc actcttttac agatgaggct tctagactag cgagggcgcc tccacaccct cgtgttccag cacggttcct gaggagattc gatgcaggtt ctgtcatctg gctgcgtgtg ggtgggccag ctcacctagg tctgcccagc actaagcatc cctgggaatt ctgttttatt tggttaaaca ccgtttataa atgggatacg 'ttaggggggt aagccagttt tggggatggg ataatgctct gccccagggc actacctgca gaggggggca cctcgagccc acggagcctg tccggcctcc atttgcagaa tttacagaaa agtgatttta tgttgcccag gttcaagcaa acacccggct tctcaaaatc aggcatgagc acacccactg tttgatattt gtttctgtga gcttcaggtc aagtgtggac aagtccatcc aggagttcct actgaatcca ggtgggtcaa ctgatgggga agcaacttct taggcccaca gctttcagcc tagaccctgg atctgccctg ccgcctccag cgctactgtc cctcacatgg tgcgcctccc: cctggcgtcc ccggggcctg tctctgcccg ggcgct ct tg tccacgggca actttcctgc ctcttcccaa cacgtgacta ttaatagcta aacgggtcca agcctgcagg tacgcaacat taaggacggt cctggttctg ggaacccgga agagatgccc ctttgcaggt ggcccgaaaa gcctcaggac aggcctgcaa cagcaggaag gtgccatagg gcaacaggaa catccaagga tttagctatt gctggagtgc ttctcgtgcc aattttgtat ctgacctcag cactgcacct gt aaggagt t tctgtaattc ccacctgtta ccagtggggt actgtcctga ctcctactct ctcactcctg ctgtttcatt 120 180 240 300 360 420 480 540 600 660 720 780 840 900 960 1020 1080 1140 1200 1260 1320 1380 1440 1500 1560 1620 1680 1740 1800 1860 1920 1980 2040 2100 2160 2220 2280 2340 2400 2460 2520 2580 2640 2700 2760 2820 2880 2940 -61 tgttggtttg agtgcaatgg Ctgcttccgc tttgtatttt gaacttctga gtgagccacc gaagctcacc tgttagaaca acacactaac tgccgggagg ttccatttct aaccagtgta cctagtggca ggatttctag gagcgtgaca aatttcctcc atttcagtgt tttctcgccc cagctgtcct tctactgctg gcctggaccc catctgccag ccggtgcgcg cccggtgggt gagaacctgc tgcagggagg tcgtccccag ccggagcccg gcggccaaag cgggttaccc cctgcaccct ccgcccggag gacgcccagg ccccttcacc gggtccccgg cgccctctcc ccctggcccc tttgttttgt cgcgatcttg ctcccatttg tagtagagac cctcagatga atgcccagct Ccactcaagt ctcttgatgt tgcacccata cgt tt cc tcg tctcttccct agctacaact gagacaattc 
aagagcgacc gcccagggag ggcagtttct ttgccgacct ccttagatcc gcggttgtgc ggctggaagt cgaggctgcc acagagtgcc gccagcagga gattaacaga aaagagaaat cactccggga ccgcgtctac acgccccgcg ggt cgccgca cacagcctag gggagcgcga cagctgcgct accgcgctcc ttccagctcc cccagccccc tcgcggcgcg ggccaccccc tttgagaggc gcttactgca gctgggatta gggggtgggt tccacctgcc cagaatttac gttgtggtgt tttacactgt atactggggt ccatgcacat cttttaaaat taacttttgt acaaacacag tgtaatccta ggtgcgaggc gaaagtagga cagctacagc aaacttgagc cggggcccca cgggcctcct ctccaccctg ggggcccagg gcgcctggct tttggggtgg gacgggcctg ggtcccgcgt gcgcctccgt tccggacctg cgcacctgtt gccgattcga gcggcgcgcg gtcggggcca ccacgtggcg gcctcctccg tccgggccct agtttcaggc gcgatg ggtttcactc gcctctgcct caggcacccg ggggttcacc tctgcctcct tctgtttaga tttaagccaa gatgactaag gtcttctggg ggtgttaatt tgtgttttct tggaacaaat ccctttaaaa agtatttaca ctgttcaaat aaggttacat atccctgcaa aacccggagt ggtctggagg, agctctgcag tgcgggcggg.
gtcaaggccg ccatttccca tttgctcatg tgt caaggag gcccgtccag cctccccttc gaggcagccc cccagggcct cctCt Ct ccg ggcggggaag ggccgggctc gagggactgg cgcggacccc cccagcccct agcgctgcgt ttgttgctca cccaggttca ccaccatgcc atgttggcca aaagtgctgg aacatctggg tgatagaatt acatcatcag tatcagcaat actccagcat atgttggctt tttccaaacc aggcttaggg agacgaggct gctagctcca ttaaggttgc ggcctcggga ctggattcct ggaccagtgg.
tccgaggctt atgtgaccag ttgtggctgg ccCtttctcg gtggggaccc cccaagtcgc ggagcaatgc acgtccggca tgggtctccg ccacatcatg ctggggccct cgcggcccag ccagtggatt ggacccgggc gCcccgtccc ccccttcctt cctgctgcgc ggctggaggg agtgattctc cagctaattt ggctggtctc gattacaggt tctgaggtag tttttattgt cttttcaaag cttcattgaa aatcttctgc ctctgcagag gcccctttgc atcactaagg aacctccagc taaataaagc gtttgttagc gacccagaag gggaagtcct ccgtgtggct ggagccaggt atgttggcct tgtgaggqgc acgggaccgc ctcgccgcct ggggaagtgt gtcctcgggt ttcgtggtgc gatcaggcca gcccctccct cgctggcgtc acccccgggt cgcgggcaca acccgtcctg gacccctccc tccgcggccc acgtgggaag 3000 3060 3120 3180 3240 3300 3360 3420 3480 3540 3600 3660 3720 3780 3840 3900 3960 4020 4080 4140 4200 4260 4320 4380 4440 4500 4560 4620 4680 4740 4800 4860 4920 4980 5040 5100 5126 <210> 2 <211> 4042 <212> DNA <213> Homo sapiens <400> 2 gtttcaggca cgatgccgcg aggtgctgcc agcgcgggga gggacgcacg tggtggcccg gcttcgcgct gcagctacct tgctgcgccg tgctggtggc ctgccactca aacgggcctg gtgcgaggag gtggcgctgc gcaggacgcg aagaagccac gccgccagca cttgtccccc agctgcggcc tcgtggagac cccgcctgcc acgcgcagtg ccccagcagc aggaggacac aggtgtacgg ccaggcacaa atgccaagct tgcgcaggag gcgctgcgtc cgctccccgc gctggccacg cccggcggct gccgcccccc agtgctgcag gctggacggg gcccaacacg cgtgggcgac tcccagctgc ggcccggccc gaaccatagc gcgcgggggc ccctgagccg tggac cgagt ctCtttggag ccacgcgggc ggtgtacgcc ctccttccta catctttctg ccagcgctac cccctacggg cggtgtctgt agacccccgt cttCgtgcgg cgaacgccgc ctcgctgcag cccaggggtt ctgctgcgca tgccgagccg ttcgtgcggc ttccgcgcgc gccgccccct aggctgtgcg gcccgcgggg gtgaccgacg gacgtgctgg gcctaccagg ccgccacacg gtcagggagg agtgccagcc gagcggacgc gaccgtggtt ggtgcgctct cccccat cca gagaccaagc ctcagctctc ggttccaggc tggcaaatgc gtgctcctca gcccgggaga CgCctggtgc gcctgcctgc ttcctcagga gagctgacgt ggctgtgttc cgtgggaagc CCtggccccg tgcgctccct gctgcgcagc gcctggggcc ccagggctgg.
tggtggccca: gtgcctggtg ccttccgcca ggtgtcctgc agcgcggcgc' gaagaacgtg gcccccccga ggccttcacc cactgcgggg gagcggggcg ttcacctgct ggcacgctgc tgtgcgggcc gccgctgtac ctagtggacc ccgaaggcgt Ccggggtccc CCtgggcctg gaagtctgcc gttgcccaag ccgttgggca ggggtcctgg tctgtgtggt gtcacctgcc ctggcacgcg ccactcccac catcgcggcc accacgtccc acttcctcta ctcctcaggc tgaggcccag cctgactggc cctggatgcc agggactccc ggcccctgtt tctggagctg agacgcactg cccgctgcga agccccaggg ctctgtggcg agctgctccg ccagcacagc gccggctggt gcccccaggc acaccaagaa gttcatctcc ggaagatgag cgtgcgggac cggccgcaga gcaccgtctg gccacccccg cactaccgcg cggctggtgc tgcgtgccct ctgaaggagc ctggccttcg accagcgtgc tgggggctgc gcgctctttg cagctcggcg ctgggatgcg ccagccccgg aggcccaggc gcccacccgg agacccgccg ccatccgtgg tgggacacgc gacaaggagc gctcggaggc cgcaggttgc cttgggaac gctgcggtca gcccccgagg agcccctggc CtCtggggct ctggggaagc tgcgcttggc cgtgaggaga 120 180 240 300 360 420 480 540 600 660 720 780 840 900 960 1020 1080 1140 1200 1260 1320 1380 1440 1500 1560 1620 1680 62 tcctggccaa tcttttatgt tctggagcaa agctgtcgga gactccgctt tgggagccag cactgttcag tgctgggcct aggacccgcc tcccccagga gcgtgcgtcg agagccacgt tgcaggagac.
aggccagcag tcaggggcaa tgctctgcag acgggctgct cgaaaacctt tgcggaagac ttcagatgcc tggaggtgca tcaaccgcgg tgaagtgtca acatctacaa catttcatca cctccctctg gcgccgccgg tcaagctgac agacgcagct acccggcact aggccgagag ggcggcccac aggcctgcat aagggctgag caccccaggg tccatcccca accatccagg aggtgtgccc ttggggggag aaaaaaaaaa gttcctgcac cacggagacc gt tgcaaagc agcagaggtc catccccaag aacgttccgc cgtgctcaac ggacgatatc gcctgagctg caggctcacg gtatgccgtg ctctaccttg cagcccgctg tggcctcttc gtcctacgtc cctgtgctac cctgcgtttg cctcaggacc agtggtgaac ggcccacggc gagcgactac cttcaaggct cagcctgttt gatcctcctg gcaagtttgg ctactccatc ccctctgccc tcgacaccgt gagtcggaag gccctcagac cagacaccag acccaggccc gtccggctga tgtccagcac ccagcttttc gattcgccat tggagaccct tgtacacagg gtgctgtggg aaaaaaaaaa tggctgatga acgtttcaaa attggaatca aggcagcatc cctgacgggc agagaaaaga tacgagcggg cacagggcct tactttgtca gaggtcatcg gtccagaagg acagacctcc agggatgccg gacgtcttcc cagtgccagg ggcgacatgg gtggatgatt ctggtccgag ttccctgtag ctattcccct tccagctatg gggaggaaca ctggatttgc ctgcaggcgt aagaacccca ctgaaagcca tccgaggccg gtcacctacg ctcccgggga ttcaagacca cagccctgtc gcaccgctgg aggctgagtg acctgccgtc ctcaccagga tgttcacccc gagaaggacc cgaggaccct agtaaaatac aa gtgtgtacgt agaacaggct gacagcactt gggaagccag tgcggccgat gggccgagcg cgcggcgccc ggcgcacctt aggtggatgt ccagcatcat ccgcccatgg agccgtacat tcgtcatcga tacgcttcat ggatcccgca agaacaagct tcttgttggt gtgtccctga aagacgaggc.
ggtgcggcct cccggacctc tgcgtcgcaa aggtgaacag acaggtttca catttttcct agaacgcagg tgcagtggct tgccactcct cgacgctgac tcctggactg acgccgggct gagtctgagg tccggctgag ttcacttccc gcccggcttc tcgccctgcc ctgggagctc gcacctggat tgaatatatg cgtcgagctg ctcaggtctt ctttttctac cggaagagtg gaagagggtg cagctgcggg gcccgccctg ctgacgtcca tgtgaacatg gactacgtcg tctcacctcg agggtgaagg cggcctcctg ggcgcctctg cgtgctgcgt gtgcgggccc gacgggcgcg tacgacacca caaaccccag aacacgtact gcacgtccgc aaggccttca gcgacagttc gtggctcacc gcagagctcc tccctgaatg gtgccaccac gccgtgcgca gggctccatc Ctctccacgc gtttgcgggg attcggcggg gacacctcac ctcacccacg gtatggctgc gtggtgaact cctgggtggc .acggcttt.tg .gctgctggat:acccggaccc catcagagcc-:agtctcacct actctttggg-gtcttgcggc 1740 1800 1860 1920 1980 2040 2100 2160 2220 2280 2340 2400 2460 2520 2580 2640 2700 2160 2820 2880 2940 3000 3060 3120 3180 3240 3300 3360 3420 3480 3540 3600 3660 3720 3780 3840 3900 3960 4020 4042 cctccagacg cgcatgtgtg gcgcgtcatc gatgtcgctg gtgccaccaa ggggtcactc tgccctggag atggccaccc ctacgtccca cctgagtgag gcctgagcga cacaggctgg cactccccac ctcctttgcc tgggaatttg gggggtccct agtttttcag gtgtgcacca ctgcagctcc tctgacacgg ggggccaagg gcattcctgc aggacagccc gccgcagcca gcccacagcc gggagggagg tgtttggccg gtgtccagcc cgctcggctc ataggaatag ttccaccccc gagtgaccaa gtgggtcaaa ttttgaaaaa <210> 3 <211> 11276 <212> DNA <213> Homo sapiens <400> 3 acttgagccc tggtgacaga tcttctctgg atacaaacac ttaaaaagga acccacggta tcaaaaaagt gccaaggcgg aaccttgtcg cagctactcg gagccgggat aaaaaaaaaa aaaagcaaga cagaaataaa ttttgaaaag acctaaataa caaaggatca aaaatagata agcccaaaca agagaagccc gaattccaat tctacatggc aacaaaaaaa aaatcctcaa gtgat caagt aagagttcaa atgagaccct ccacagtgga atgaaaatta aattgaaaaa tacagcaaaa agaaaagcca gcagatcgcc ctactaaaaa ggaggctgag tgcgccattg aagtagaaaa gcaaactaaa tgaaactgaa ataaacaaaa ataaagtcag ctagaggcta aattcctaga gaccaataac aggacccaat cctactcaaa cagtattacc cagaaagaaa caaaacacta gggatttatt ggctacggtg gtctcaaaaa acaaaaccag aacaatatac tttatttaag gcagtgctaa ggcgcagtgg tgaggt cagg tacaaaatta 
gcaggataac gactccagcc acttaaaaat cctaaaattg agataacaat ttgacaaacc agatgaaaaa ctatgagcaa tgcatacaac aataatggga ggcttccctg ctattctgaa ctgattccaa gaaaactaca gcaaaccaaa ccagggatgg agccatgatt aaaaaaaaaa aaatcaacaa ttctgaatga caaatgataa gaaggaagtt ctcatgcctg agttcgagac gctgggcatg cgct tgaacc tggg ta acaa acaacctaat gtaaaagaaa acaaaagatc tttgcccaga agagacatta ctgtacacta ctaccaagat ttaaagccat ctggatttta aaatagagga aaccagacaa ggccaatatc ttaaacaaca aaggatggtt gcaacaccac aattgaaata caagaggaat ccagtgagtc cggaaacata tatagctata taatcccagc cagcctgacc gtggcacatg caggaggtgg gagtgaaacc gatgcacctt agaaataata aacaaaatta ctaagaaaaa caactgatac ataaattgaa tgaaccatga aataaaaagt ccaatcattt aagaatactt aaacacatca cctgatgaat ccttcgaaag caacatatgc acgccagcct atataaagca tttgaaaact aatgaagaaa acctctcaaa agcagctaca actttgggag aacacagaga cctgtaatcc aggt tgcggt ctgtctcaag aaagaactag aagatcagag aaagttggtt aggaaagaag cacagaaatt aaacctagaa agaaatccaa ctcctagcaa aaagaagaat ccaaactcat aaaacaaaca actgatacaa atcattcatt aaatcaatca 120 180 240 300 360 420 480 540 600 660 720 780 840 900 960 1020 1080 1140 1200 1260 1320 1380 1440 1500 63 atgtgataca cagaaaaagc tatacaagaa aggccaaggt gagacctggt tagtcccagc tgcagtgagc gaataagaag aggaggtgga tttcaacata agcctttcct tagtactaga ctggaaagga gacttaagac caatgtacaa caaaaaagca tctacaatga agatattcca agcaatttac agaagaaaca cctgaccaaa agctatagta gaacagaata aggtgccaag ctggatatcc aaatcaaaat aacaccggag caggcacagg tgcccagcaa tttgcaaact ctctataaga catttctcaa ctgatcatca atggctttta acccttggac ttcctcaaaa caaaaaaggg ttcatagcag aaaatgtggt tcagttgcaa aaagacaaac agaaatagag taatttattg gaaaggat aa ttgtatgcct aattaaaatt gccgaggcgg aaccctgtct ccaactactc tgagccgaga acaaaaacaa ctactatatt aaataagaac tggcagaaat taagtgactt cctaaaacaa tatctctaaa tgcttttttt atctgtatga tggcaggaag aagacccagg gggcgatgag gcaaagattg agcctcctgc agaggagtca aaggtgtatt gcacccttct ccctctcttg gcccgtgctg ctgagcctgg caaagactta 4 tcttaatatt tcatcccaac aaaatgaagt atttgataaa attctgcacc acatacaggc caggcacagt gggatgattg cttgggccca 
ctacaaaaaa cttttttaaa tagtctggag gctgaggtgg catgaacatg tcactgtact aaggagaagg agaagggaga ggagaagtgg aaggggaagg ataaaagccc tatatgacag ctaagatctg gaaaatgaca agtcctagct agagcaatca agaagtcaaa ttatcctgtt accactaaaa aactattaga aaatcagtag tatttctata gctacaaata aaattaaaca aaactataaa atgttgataa tgttcataga ttggaagaat aaattcaatg caatccctat attctaagat ttgtacagaa aagaacaaaa ctggaagcat acccaaacta-catggtactg gagaatccag aaacaaatcc aacatacttt ggggaaaaga atatgcaaaa taacaatact ggatgaaagg cttaaatcta aaactctcca ggacattgga caaccaaagc aaaaacagac aggaaacaat caacaaagag attcatctaa caaggaatta aaaacaccta ataagctgat aataagtcat acaaatggca gagaaatgca aatcaaaact ttcaaaagac aggcaataac actgttggtg ggaatggaaa aactaaaaat aaagctacca aatcagtgta tcaacaagct ccaaggtttg gaagcaacct gcacatacac aatggagtac cagcatgggg ggcactggtc ttttcatgtt ctcccttact gagaatggtg gttctagagg tatgttttaa aataactaaa atgcttgaag gtgacagata gtatcaaaat atctcatgta ttaatggcca ggcacggtgg .gtggatcacc .tgaggtcagg ctactaaaga tacaaaaatt aggaggctga gacaggagaa tcatgccact gcactgcagc aaaaaagaag attaaaattg agaagttaaa aattaaaaca aatgtatgtg gggtttctag gtgaggaggg aacagtggaa aattttaacc aaagacaggc ctgctaataa tggtgaaagg atcgagctgc agaattggca cttgtgtgct tggagatttt atcctgaaac gaaaaatggt caggtggctc tgtggacctg tgcaaggcag aggcctgatg aagcctgcct cgttggtgag ctctggatac catctggaaa tcaaacccag gccagcagct aggcacctcg aagtatggct ctgagggaag cttgagttag caagggaaaa ccagacgccc ccctcgcggt ttctgatcgg aggaccctct tgcaaagggc cttaataaca aactgggatg attccatgag taaattcaac ttcttaaatt tcatcaaata acaaaaacta cttcatgata ggctcacacc ggagtttgag aaattagcca gagaatcact ccagcctaga agggagggag ggaagggaaa accgaggtag agggcccact gataagagaa tgcagatgat gctgaaattt ttccaacagc gctaggaatt aagaaattga aaatactgtt taaaatacta tatgattatt aaaaccctca tgcgatccca actagcctgg ggcatgatgg taagcctagg caacagaaca aagggaggag gaggaagaag tattatgagg ttcaccactg agaaataaaa atgatcttat ggtacagcag aaacaatctg aaccaaagaa agagggcaca aaaatgtcca atgacgttct ccacaaaaga- cccagaatag cacattacct~ gacttcaaat gcataaaaac, agatgagaca.
atgcatctac agtgaactca taatctcttc aataaatggt agaactctgt ctctcaccat aaacctcaaa ctttgcaact gtgggcaaag acttcttgag aaatgggatc atatcaagtt aagagacaac ccacagaatg ataaccagta tatataagga tttcaaaaat aagcaaaaga aacaggcatc tgaaaatgtg actatgagag atcatctcat aaatgccagt gaggatgtgg ttgctaccac tatggagaac tacagcaatc ccattgctag atctccactc ccacatttac cagtgtccat caacagacga tacgcagcca taaaaaagaa agtatgttaa gtgaaataag tgtgggagca aaaattaaaa ggtgggggac agggtgacta agagtataat tgggttgttt ccccatttac cctgatgtga tgctatagat ataaacccta ctcatgtccg taatcccagc agtttgaaac cagtctggcc agccaggcgt ggtggcacat ttgcttgaac ctgggaggcg ctgggtgaca gagcaagact taatttttat gtaccgtata attataaaag gtaattaacc cttctgaaga agtaaaagtt gttactgttg ttagacgctc tgggagaagt taaagaggca taatctctat taattaccaa cgtctgatca caccgtcctc cgattgtgtg ttcgtgtttg ggtgatttcc tccagaagaa agccacttca atcttcaagg acccgaggac aggaaagctc cagcgcatga agtgccctta aggcggccag cgggaatgca atggcgccca cccgggcgtg taaatctttt tttcacctga gtgccttctt taaaacagaa gctctgcggt catttacctc gacagagtga cccccgtgga tccacagacc cccgccctgg tggctggggg cggacagcga ctttccacat ccgaatggat acattcagga ctgcagaaat tcactttatg aaaaaccagg gcactctggg gcaacaaaat catatgcctg aggtcgaggc agaccccact gaggagaagg aagaaacata aaaaactgaa tgattcaaca ggcatccaaa atctggaaaa gatacaaaat aaaaagaaac gtgaaagatc aaaaaagaaa tactacccaa .tcacagaaat ccaaagctat .tatactacaa tggaccagag tttttgacaa gctggaggaa atacaaaagc actaaaagaa taattccctg aaaaagcttc ggagaatata gctcaaacta tctgggtaga ctcaacacca cccagttaaa at aaaaggaa agtttgaaag gtatatactc tgcagcactg atggaaaaag tgagatcctg ccaggcacag caattgacat gagtcaacaa gtaacacaaa ttattacaca ctatattaaa actttgggag accatgatga acctgtagtc gaggttgcag ccatctcaaa aatatatact acttaatcta atggccacga atactctctg ttctataagc taattacaga tcattcacgg gttaaactta ttagagtacc gtctctggcc ggatgggaag tttacgcttt aggagtcaga tgccagaggg agcagtgacc agtcatggaa tttcctctct gcttctccga agagaggagt cggcgggatt ttggatttta ccaaaggcgt 1560 1620 1680 1740 1800 1860 1920 1980 2040 2100 2160 2220 2280 2340 2400 2460 2520 2580 2640 2700 2760 
2820 2880 2940 3000 3060 3120 3180 3240 3300 3360 3420 3480 3540 3600 3660 3720 3780 3840 3900 3960 4020 4080 4140 4200 4260 4320 4380 4440 4500 4560 4620 4680 4740 4800 4860 4920 4980 5040 5100 5160 5220 5280 5340 5400 5460 5520 5580 5640 5700 5760 5820 64 aaaacaggaa atttttcgcc gtacacgagg gtccccggcc ttgggcaacg gcccacccac ct tgcctgcc atgtaaatta cccaaggact tacaaacacc agtccaggag agtctgttcc tgctgcttcc cagccctgtc ctgggtgggc acgtagctcg gcgttgaagg cgtggccccc tccccgtctc tccacgtcca acgggggcgt agtgcctgtc cccgcccttc ttccacaagc gccccacagc ccgacccccg tttaacaaac gaggaacatg tgccacctcc aggggagtgg ccttttacta cataggggag cctgggcagg aacccgcccg atccttcggg aggagggtca ggagggaggg gggaccctcc atcgtggacc gtgtgtgggg aaaacaaagg ggcaggcacg gttatgctct Cgtctcctgg gtgcaccacc gtcaagctga ctgggattac ggctcaagtc gttaccctcc catattcaca gaggctgcag atcagggcgc gtagaaatta ccaggggcag cactgctggt ggtttcactc gcctctgcct caggcacccg ggggttcacc tctgcctcct tctgtttaga tttaagccaa gatgactaag gtcttctggg ggtgttaatt tgtgttttct tggaacaaat ccctttaaaa agtatttaca ctgttcaaat aaggttacat atccctgcaa ctgagctatg ct aagt act t agaggcctgg tgggaggctg cgaaggcggc actaacccag tgcttccctg cacgactctg gaatgattcc actcttttac agatgaggct tctagactag cgagggcgcc tccacaccct cgtgttccag cacggttcct gaggagattc gatgcaggtt ctgtcatctg gctgcgtgtg ggtgggccag ctcacctagg tctgcccagc actaagcatc cctgggaatt ctgttttatt tggttaaaca ccgtttataa atgggatacg ttaggggggt aagccagttt tggggatggg ataatgctct gccccagggc actacctgca gaggggggca cctcgagccc acggagcctg tccggcctcc atttgcagaa tttacagaaa agtgatttta tgttgcccag gttcaagcaa acacccggct tctcaaaatc aggcatgagc acacccactg tttgatattt gtttctgtga gcttcaggtc aagtgtggac aagtccatcc aggagttcct actgaatcca ttgttgctca cccaggttca ccaccatgcc atgttggcca aaagtgctgg aacatctggg tgatagaatt acatcatcag tatcagcaat actccagcat atgttggctt tttccaaacc aggcttaggg agacgaggct gctagctcca ttaaggttgc ggcctcggga tttgccaagg tttattggtt gcggcagggc acagcaggac cacgctgcgt gaagtcacgg ggtgggtcaa ctgatgggga agcaacttct taggcccaca gctttcagcc tagaccctgg atctgccctg ccgcctccag 
cgctactgtc cctcacatgg tgcgcctccc cctggcgtcc ccggggcctg tctctgcccg ggcgctcttg tccacgggca actttcctgc ctcttcccaa cacgtgacta ttaatagcta aacgggtcca agcctgcagg tacgcaacat t aaggacggt cc tggtt ct g ggaacccgga agagatgccc ctttgcaggt ggcccgaaaa gcctcaggac aggcctgcaa cagcaggaag gtgccatagg gcaacaggaa catccaagga tttagctatt gctggagtgc ttctcgtgcc aattttgtat ctgacctcag cactgcacct gtaaggagtt tctgtaattc ccacctgtta ccagtggggt actgtcctga ctcctactct ctcactcctg ctgtttcatt ggctggaggg agtgattctc cagctaattt ggctggtctc gattacaggt tctgaggtag tttttattgt cttttcaaag cttcattgaa aatcttctgc ctctgcagag gcccctttgc atcactaagg aacctccagc taaataaagc gtttgttagc gacccagaag tccaaggact taataaccat ttcataaggt ggcttagggt tatgagcacg gcagggccac cactgaccgt cctccctggg gtgactcagg accccatacc agctctgaac ccgtggaaac gggtaatgaa gtggtgtgca ccgttccttc catcattatt tcgggtgtga caagccatga gagcacggsc cacacccctg accaggctgg ggtgacaaca caggcactcc cccagattct gagactcagc ctggggtgcc gcctcagctt ctccagcagc tcacctgtcc cactgtgtct ggtgtctgtc tccttcccca agactggctc ctctgagcct ggctgcacgc. tgacctccat ccggtgtgtt cttctgtttc ctagggtctc -ggggttttta ggaaatgcaa catttgggtg caggcctggg ccccctccct aagacccagc cgcacatcat caaagcaggg tccgcacggt catctcaagg gctcaaaaag gggggcggca atggtattgg ggctgtgcca acgtcctgat gtgatctccg gtaatccagg gatggaggca gcgcctccag gcacggctgg agggcactcg acccatgcac cagggctgaa ttattttatt agcggcatga tcagcctccc ttttagtaga gtgatccgcc ggcctattta catggagttc ttcgtagact.
tcccatggga tgccatctgc atctcaatgt actgggattg tggaggaagg tgttggtttg agtgcaatgg ctgcttccgc tttgtatttt gaacttctga gtgagccacc gaagctcacc tgttagaaca acacactaac tgccgggagg ttccatttct aaccagtgta cctagtggca ggatttctag gagcgtgaca aatttcctcc atttcagtgt tttctcgccc gatggagccc ctggaacaca attggcaccc gtacacactc aaatccctgc ggacagttcc gaattacgct aaagaatttc gctgggggct ctcagttatg tctttgccat tcccccaaac tgaggaccct ggttctggga gtcagtctga aagctggaaa cccttagccc cgctgccctt tgtgaatcta gtgcctccgg tacttacttt tcttggctca aagtagctgg gatgggcttt cacctcagcc accattttaa aatttcccct, ggggatacac cccactgcag cagtagaaac ctcagtgtgt agccccttcc aatgatactt tttgttttgt cgcgatcttg ctcccatttg tagtagagac cctcagatga atgcccagct ccactcaagt ctcttgatgt tgcacccata cgtttcctcg tctcttccct agctacaact gagacaattc aagagcgacc gcccagggag ggcagtttct ttgccgacct ccttagatcc gttcagaggg gcaagggaaa cggggagaga agctgccaca ggcttcctgg gaacatgacc ggaaatggcc catcttcacc caaaactcag at at at taag gcggctgaac agggcctggt acactgaggc ttcctaaacc tgtctcagcg acactcacat gaacctggct ttccaggcgc tgtgctcctt taggcatagg.
tgaaagtagg ccgccaggga gagtggcagt ctggacattt ccgtccacga taaaatgtcc tcacagtgaa gagtcaaaac ac ccca tggc actgcacgca ggagactaac gcccgagtgt ctgtggacag gaggtctggg agaggcgggc ggctgaaaag aagcggggaa accagggccc ctagcatgaa ggattatttc gcaagggcag ctgagacaga ctgcaacctc gatttcaggc caccatgttg tcccaaagtg aacttccctg ttactcagga cgtctcttga gggcagctgg ctgatgtaga gctgaaacat ctatcccccc tgttattttt t ttgagaggc gcttactgca gctgggatta gggggtgggt tccacctgcc cagaatttac gttgtggtgt tttacactgt atactggggt ccatgcacat cttttaaaat taacttttgt acaaacacag tgtaatccta ggtgcgaggc gaaagtagga cagctacagc aaacttgagc 5880 5940 6000 6060 6120 6180 6240 6300 6360 6420 6480 6540 6600 6660 6720 6780 6840 6900 6960 7020 7080 7140 7200 7260 7320 7380 7440 7500 7560 7620 7680 7740 7800 7860 7920 7980 8040 8100 8160 8220 8280 8340 8400 8460 8520 8580 8640 8700 8760 8820 8880 8940 9000 9060 9120 9180 9240 9300 9360 9420 9480 9540 9600 9660 9720 9780 9840 9900 9960 10020 10080 10140 65 aacccggagt ggtctggagg agctctgcag tgcgggcggg gtcaaggccg ccatttccca tttgctcatg tgtcaaggag gcccgtccag gaggcagccc cccagggcct cctctctccg ggcggggaag ggccgggctc gagggactgg cgcggacccc cccagcccct agcgctgcgt ctggattcct ggaccagtgg tccgaggctt atgtgaccag ttgtggctgg ccctttctcg gtggggaccc cccaagtcgc ggagcaatgc acgtccggca tgggtctccg ccacatcatg c tggggc cc t cgcggcccag ccagtggatt ggacccgggc gccccgtccc ccccttcctt cctgctgcgc gggaagtcct ccgtgt ggct ggagccaggt atgt tggcc t tgtgaggcgc acgggaccgc Ctcgccgcct ggggaagtgt gtcctcgggt ttcgtggtgc gatcaggcca gcccctccct cgctggcgtc acccccgggt cgcgggcaca acccgtcctg gacccctccc tccgcggccc acgtgggaag cagctgtcct tctactgctg gcctggaccc catctgccag ccggtgcgcg cccggtgggt gagaacctgc tgcagggagg tcgtccccag ccggagcccg gcggccaaag cgggttaccc cctgcaccct ccgcccggag gacgcccagg ccccttcacc gggtccccgg cgccctctcc gcggttgtgc ggctggaagt cgaggctgcc acagagtgcc gccagcagga gattaacaga aaagagaaat cactccggga ccgcgtctac acgccccgcg ggtcgccgca cacagcctag gggagcgcga cagctgcgct accgcgctcc ttccagctcc cccagccccc tcgcggcgcg cggggcccca cgggcctcct ctccaccctg 
ggggcccagg gcgcctggct tttggggtgg gacgggcctg ggtcccgcgt gcgcctccgt tccggacctg cgcacctgtt gccgattcga gcggcgcgcg gtcggggcca ccacgtggcg gcct Cc tccg tccgggccct agtttcaggc gcgatg 10200 10260 10320 10380 10440 10500 10560 10620 10680 10740 10800 10860 10920 10980 11040 11100 11160 11220 11276 ccctggcccc. ggccaccccc <210> 4 <211> 104 <212> DNA <213> Homo sapiens <400> 4 gtgggcctcc catgcggaga ccggggtcgg Cgtccggctg gggttgaggg cggccggggg gaaccagcga gcagcgcagg Cgactcaggg cgcttccccc gcag 104 <210> <211> 8616 <212> DNA <213> Homo sapiens <400> gtgaggaggt aaaagggggc ttttcgctca gtttgcataa cgaggccaga tggggagaag tCctcttcgc gggtgggagg agatttaatt gggaagctga ggtgaaaccc taatcccagc tgcagtgagc ctttaaaaaa actgttctcc agaggacagc atggtgctgc CCtccgctcc gaagtcccga cagacaagga aaaagtcata tgctaactcg gtccctaccc agtcagataa gagagtttga aggtcacaat cacgtgcagg ggcgcggccc ccctgtcacg gagtgaggcg gggtgtccct tagggtgagt gtccctgggt cacgtgcagg ggcgctgtcc Ccttggcgtt gcgccggttg ggtggccgtc aggcagagcc ggacgtcgag acttacgagg gcagtgaaca tgtctggaag aatttcaagg taagggtttt gtgtgttgac ggcaggtgga tatctgtact tacttgggag tgagattgtg aaaaagtgtt agcacagatc agatggctcc tgggccctgc agcccccttt tttcaccccc gggtgacctt taacatgaga gcggtgttta atcgaacggc gcgtcatgca gttctctgat ctgcccctgg gtgagtgagg ccgggtgtcc tgtagggtga tggtccccgg gtcacgtgca gaggcgcggc gtccctccca gtgagtgagg ctgggtgtcc tgctcacttg cccattgcct gagggcccag ctggtcctCC tggacacggt ttcaccttca gaggaggctg cacagacgct gtgggaatga gcaggtgcac ggccaggtgc tcacctgagg aaaaatacaa gctgaggcag ccattgtact cgttgattgt c tggt cccat acctgctgag cgtgtcccca tggctcccag tCCCcacaaa cttggggctc ttggcactcc cagcaggttg agctgcctca acccagtttt caggactctg cttatgcagg cgttgccccc ctgtcccgtg gtgaggcgcc gtgtccctgt gggtgagtga cccagggtgt ggtatagggt cgCggccccc ctgtctcgtg agcttgctcc gggtagatgg gccccagagc tgaatgcagt tgtctccatc gtcacgtggg gatctctgcc tctgctctcc cgttttgatg gacacgcggt ggcgcggcag tggagccggg ctggcgaggg tgcctgcagg gaggtgggga cgagaacccc gtggtcagcc aatatgcagg ggtggctcac gccggtaatc tcaggagttt gagaccagcc aaattagctg 
ggcatggtgg gagaatcact tgaacccagg ccagcctggg cgacaagagt gccaggacag. ggtagaggga ctttaggtat gaagagggcc gaagggacag tgtttgtggg Ccctgttttt ctggatttga tgctcccagg CCCtaccgtg ctcccaagac atgtaagact taacaccgtt ttctgtgtac cttgaaatgc tgcgtcttgc cacctgctgc ggctcaggtg gctttttgtg ctccagcttc cctgtcattg ctgttctctg gagtgaggcg tggtccccgg aggtgtccct gtcacgtgta cagcgtgatt gaggtgtggc atccccgggt gtccctgtca cccgtgcagg gtgagtgagg ggcgcggtcc Ccgggtgtcc ccctgtcacg tgtagggtga gagtgaggca ctgtccccgg gggtgtccct ctcaggtgca tagggtgagt gaggctctgt tgaatgtttg Ctctttctat tgcaggcgca gtgctggtcc aggggctcag cacacgtggc ctcctgtcca ttccaggcgc ttgccggcaa ttacctataa ctCt t cc tgg tttgtgttta ccagcacttt tgaccaacat tgtgtgcct~g aggcggaggc gaaactctgt gggagataag acatgggagc tgttcagggg tgttgaggaa gcagctagaa tccggccatg ttatggtggc agtgcagaat gtgactggaa gaccacgccg cttcgttgag acttcagatg gtgtccctgt gggtgagtga ccccgggtgt cgtgtagggt cactgtcccc ctctcaggtg gtgaggcacc gtgtccctgt gggtgagtga ccccaggtgt agccacagct ccaagcctat 120 180 240 300 360 420 480 540 600 660 720 780 840 900 960 1020 1080 1140 1200 1260 1320 1380 1440 1500 1560 1620 1680 1740 1800 1860 1920 1980 2040 2100 2160 2220 66 cttttctgat ggacigcagg gcctcigitg tctcccagct ttgctggaga cccctcactt gcaacgcitg ccagtcgccc cctgctctga ttgggtctta gctitttctt cctctaagig cactttcaag ctgtgattt cttttaagta gtttaigttc itatigatt ggagccttgc acatcctgtc aagcttctgt gggggatggg gcatgcacgt ctctagtaat tttgaiiagt tcacaggtca agtagctgga gacagggtct tacctcggct caacactit caaiccagtc actagagacc ttgttctca ctcgttgcct tgatcttttt ggciggagtg gttctcattc taaittttgt icctgaccic gccaccgtgc icctgagcaa ttttccctgc ttaaiattgt tttgttcccc tgcgtggtic atggcatcta tcacgaggag cttagccagt catgtcgggg gcgccagctc aaccaggaica atgtggaiaa ctttgggagg atgatgaaac tgtaatccca gttgcagtga ctcaaaaaaa aagaaaaggt atcattttag tttgtctgcg ggcttcccat ccctcagtga ccctgctgig gccctcggtg gccctgctgt ggccctcggt ggccctgctg aggccctgcg cagacggtgc ggtgaggttg \gigigaggic ggggtgaagg gc tcggc t c ctctcgcctc ggcctggctt tgtctcatgc catcccagaa gt cctgtttt tcaccttatt 
cctcacatgg gacccacgtg gtittgaatt ggtttattci ctgccttacc tgitcttaaa ittctitgtg ttcttiagci aagatatgta ciaactcagt titgtgatct aaiagtgggc ciccitctag tctgiicatt ggtagaaittt ctagiaattc attttccigc ttttgagaca gtgtaacittt acigcagaca tgctgtgttg tcccaaagtg atattcttat tgacagtcgt cgcctggtgc ccacctcttg cctggtcact atigicgtig taatggcaca cicaacctca attitagia aagtgaictg ccggcatacc taagacccit igactiagit tttccgigtt gtctgtcttc tictgtcttg gcgacgtccg ggcggt cat c gagtgacagc tctggtggct tgacggtgci aaggatgagg ttttaaaatt ccaaggcggg cccatctgta gctactcggg gccgacattg aaaaaaaaaa gaaattaatg ggigttattg ggaicccgig ggccaiggct gctggatgtg agctggatgt agctggaggi gagctggatg aagctggagg tgagctggat gtgagctggg cagaccatgc ccaggcccig accaggccci tcgccaggcc ictiggtcac ccgcgtgcca gctcaccacg cgaggctgga agggttctct ctcccaagct ctgggcacct aitgacgtcc gagggccggi icacigatti ttcattccit tgcaccctgt atacttcaaa cacgctgtgt taticigiga gagtatcaag igigtagtgg agigtgtgca atgcatgttc atgcaigaaa tCt t cicgitt ttatcttcct tttt taaa igtgictgit gagtcttggt taccttctgg cgcaccgcta cccaggctgg ctgaattaca agtgtgggia tgtttaacig actctgattc ggttgccatg gggcatttgc tttgcttitg atctcggctc tgagtagctg gagataggct cccgccttgg ttgatcitt agtgtaitit ctaictcagg gagtgittct tgtctcaggc ttattgctgg gggacctctg ttggcccgtg aacgtccgct ccgcggtgtc gcctggcggg ctccgagccg tctaggctgg tggatcacga ctaaaaacac aggctgaggc caccactgca aaaaaaaaaa taataaiaga gtgggagcat tgtaggtccc gttgtaccag cagtgtccgg gtggtgtcig atggagtccg tgtggtgtct tatggagicc gigiggigtc tgtgcggigt ggigagcigg ctgtgagttg gctgigagct ccigcitgtg Ctctccgttc ggcactgcag tgcccgccac ctctgggctg gtgccctgaa gcccctcigc gccgcicatt agccacaggi gtctccgcca acctctgacg ttctagctic gittigaigt gtgtiaatac ttigacgtga tttctttgag atacgtagag tctgtataat tggtticcag actataicca., ttccaagaag tggtagcati gatgagtgaa ttgctcttag ttctgccttt ctgicgccca cctgagccgt cacctggcta tctcaaactc ggcatgagcc tgtccigtta gataacctga tccacttgcc tgcgtttcct tttatttci tttattgaga actgcaacct ggattacagg ticaccaigi cctcccacag aaaaigaagt agctciggcc caictigaca gtagctttgc ccgccgtctg 
taaaccccag cttatgatgc.
agtgicigga cggccigggtgagittgaaaggagtgtctg ttgtcgccca gcgcggtggc ggt caggagg aaaaattagc aggagaai ig ctccagcctg aatictagta ttiiactgaa cacicacagg gtgcgtggcc atggtgcagg aiggtgcacg gatggtgcag gatgatgcag ggatggtgca ggatgaigca tggatggtgc ciggatggtg ataigcggtg gaigiggggt ggatgtgigg agciggatgt cattitgcia ccacagcttc atgcatgctg cctgtgtctg ggaaagcaag tiggccccct gcttaggctg tggagtgtct gccitcgtca tiictaictc tiagtttagi gaagtaatct ttcitttaag aatcattttg cagtgagtta iattttaagt accaaitati aacigtccai cggggacacg aggtccgctt ccaatacicc cigccacgtg tcaccccagc tgggtgggtg ggctctgcct ctgtctgict gacitccctc iccattgtat catgcctiic caacatcagc taitciiatt aiatcagtga tttgaacact tatcattia tgaagttigc tgtaaaiiig .gcti'attaag giccagigca -gaggccaiag tcccicacct .tatgigaggc attgitaggi tcttttggag..acttciatgt tactgccaca aatttatata gggtgagtgc cctcicacct atttitaaat ttggacicaa accatgicig acagcatgia tttattitca tgiigcatgt gccgagtgtg ciiigcttag cagicicact Cigcctcctc cgcccaccac tggccaggct tgctgggatt ctgaaacatt accccccagc cccccacaag ccccgccctg gggicccctt ctttaccigt acagaigaag gcaccacgtg tcagcciggatcgcgcaaac cticctccct acaggagcat icacgcctgt icgagaccat igggcgtggt ciigaacctg gcaacacagc gccacattaa gcccagcatg acatttgaca aicicggcct iccgggaiga tctgggaiga gtcaggggig gtccggggig ggtcaggggt ggtccggggi aggictgggg caggicigga tccggaiggi giccggatgc igtctggatg gtggigtctg ctgggcttct iataiaiata agtggtgiga cagcctcctg tttttctgga gggatccaic gcctaatti ggtgaatttc titttgic cctcgiiccc tgttgatcct tgttaccccc ctgtcaccca ggticaagca cacgcciggc ggictcaaac acaggigcaa gctaccctig ctgtgtgctg ciaagcatta ctiicctcc ccitgtcct gciggcctcc aigtggagac gccagcgttc aaaccccagg ctgcggtgt~g tctgctiggg gacgtgagcc aatcccagca cctggccaac ggcgggtgCC ggagtiggaa gagacicigt aaaagiaaaa iccacaccic ttttttgagc ggacctgctg ggicgccagg ggicgccagg aggtciccag aggtcgccag gaggictcca gaggicgcca tgaggtcacc gtgaggtcgc gcaggtcigg tgcaggtccg gigcaggtci gaiggigcag 2280 2340 2400 2460 2520 2580 2640 2700 2760 2820 2880 2940 3000 3060 33.20 33.80 3240 3300 3360 3420 3480 3540 3600 3660 3720 
3780 3840 3900 3960 4020 4080 4140 4200 4260 4320 4380 4440 4500 4560 4620 4680 4740 4800 4860 4920 4980 5040 5100 5160 5220 5280 5340 5400 5460 5520 5580 5640 5700 5760 5820 5880 5940 6000 6060 6120 6180 6240 6300 6360 6420 6480 6540 67 gtctggagtg gtccggggtg ggtctggagt ggtccggggt aggtctggag caggtcaggg caggtccggg gcaggtctgg gcaggtccgg tgcaggtctg tgcaggtccg gtgcagtccg gtgcaggtct gctgcaggtc gtccggatgg gtccggatgg gtctgcatgg gtccggatgg tgtctggatg gtgtccggat ggtgtccgga cggtgtctgg gctgtatccg tgctgtatcc atgcggtgtc gtgcggtgtc tgtgctgtat atgtgctgta gatgtgcagt gtatgtgtgt tggatgtgtg tggatatgcg tggtgggctg tggtgagctg cggtgatctg aggtcgccag aggtcgccag gaggtcgcca gaggtcgcca tgaggtcgcc gtgaggtctc gtgaggtcgc ggtgtggtcg ggtgaggtcg gggtgtggtc gggtgaggtt gggtgaggtc ggggtgaggt cggggtgagt tgcaggtcca tgcaggtctg tgcaggtctg tgcaggtccg gtgcaggtcc ggtgcaggtc tggtgcaggt atggtgcagg gatggtgcag ggatggtgca ggatggtgca cggatggtgc ccggatggtg tccggatggt gtacggatgg tgtctggatg gtgtctggat gtgtccccgt gatgtgccgt gatgtgcggt gatgtggcat gccctcggtg accctgcggt ggccctcggt gaccctgctg agaccctgct caggccctcg caggccctgc ccaggccct c ccaggccctg gccaggccct gccaggccct gccaggccct caccaggccc tcgccaggcc gggtgaggtc gggtgaggtc gggtgaggtc gcgtgaggtc ggggtgaggt cggggtgagg ccggggtgag tccggggtga gtccggggtg ggtctggcgt ggtccggggt aggt ctgggg caggtccggg gcaggtctgg tgcaggtccg gtgcaggtcc gctgcaggtc gt ccgaa tgg gtccggatgg gtccggatgg gtccttctcg agctggatgt gagctggatg gagctggatg tgagctggat gtgagctgga gtgagctgga tgtgaactgg ggtgagctgg ctgtgagctg cggtgagctg gctgtgagct gctgtgagct tgcggtgagc ctcggtgagc gctaggccct gccaggcctt gccaggccct gccaggccct agccaaggcc tcgccaggcc gtcaccaggc ggtcgccagg aggtcgccag gaggtcgcca gaggtcacca tgaggtcgcc gtgaggt cgc cgtgaggtcg gggtgaggtc ggggtgagtt cggggtgagt tgcaggtcca tgcaggtctg tgcaggtccg tttaag gcagtgtcca tgcggtgtct tatggagtcc gtgcggtgtc tatgcggtgt ggtatggagt atgtgcggcg aggtatggag gatgtgcggc gaggtatgga ggatgtgctg ggatgtgctg tggttgtgcg tggatgtgcg tggtgggctg tggtgagctg tggtgggctg gctgtgagct ttcggtgagc ctgcggttag 
cctgcggtta ccctgctgtg gccctgcagt ggccctgcgg ggccctgcgg aggccctgct caggccctgc ccaggccctg gccaggccct cgccaggccc tcgccaggcc gggtgaggtc gggtgaggtc gggtgaggtc gatggtgcag ggatggtgca ggatggtgcc tggatggtac ccggatggtg ccggatgatg tctggatggt t ccggatgat gtctggatgg gtccggatga tatccggatg tatccggatg gtgtccggtt gtgtccccgt gatgtgccgt gatgtgcggt gatgtgtggt ggatgtgcgg tggatgtggg ctggatatgc gctggatgtg agctggatgt gagctggatg ttagctggat ttagctggat gtgagctgga ggtgagctgg cggtgagctg gcggtgggct tgcggtgagc ctcggtgagc gccaggccct gccaggccct accaggccct 6600 6660 6720 6780 6840 6900 6960 7020 7080 7140 7200 7260 7320 7380 7440 7500 7560 7620 7680 7740 7800 7860 7920 7980 8040 8100 8160 8220 8280 8340 8400 8460 8520 8580 8616 <210> 6 <211> 2089 <212> DNA <213> Homo sapiens <400> 6 gtactgtatc agcatgcgcc caggggcccc gctcacgttc aagcagaagg gatgtgggtc ttgcccaggc ttaagcgatt gcctggctaa ctcgaactca ggctaagcca ttcaatctat cagggagcac taggtggctg tgagattgtg gagatgccag tgactgtgga tccctggggg cagcagacct gccatttcct aatgcacctt accagtattt ccccaagatg CCtcctccct agggcaccag acagatgccc gggaaaaggc tccttggctg ctttctacct gggacaggca cccacgccag tgtctccact gtcacaggcc tcttacttgt gatttaaatt tgattctctc tggagtgcag caccagcctc tttttgtact tgacctcagg ccgtgcccag tggatttagg ctgtgcaggg catttgaatg acagattcaa cctggctgag gggctttagt gccttgtgac cgtcagaggt tgcatctggg acttagactt tggaaagaat ctccttgtca ggacagggta ctccggagca aggtccaggt caagggcaga agctgccctg gggggtcctg tcctgtgtgg gcctctgctt tgcctgtgct tggtccaagt aaaatcagga agatggaaac tctctttttt tggcataatc agcctcctaa tttaggagag tgatccaccc cccccgattc tcatgagagg agcacctggg gctgtgagat gctggatttg cccaggccat cagaagatca accccatgcc aacacagcct ggagggtcag tacacgtatt ttaattgggg ct ac tgggac ccgtgccttt cccgcggccc gtggccgctc ggtgtcagga agcagcctct cctggggcca aggggcatgg ctcgaagtcc tcCctggctg ggattctgtg gtttgtgcca actaccacta.
ttttcttttt ttggctcact gtagctggga acggggtttc accttggcct tcttttaatt ataaaatccc gataggagag tttgtctgca catcagtgag ggtattagct gggcttcccc ccaaatcagg ctgggctggg ggctttccct taatggtgtg tgaccggaag tgttgttctg tctactctgc cagtgtccac cagcccccgt gactggtggg cccgccctct gccttgggct gttcacgtgg tggaacacca tgcagctctg caaggctctg agtggtct Ct gcctccttgc tgagatggag gcaacctcca ttacaggcac accatgttgg cccaaagtgc catgctgttc acccacttgg ttccaccatg atgttcggct ggacgggagc tctCcgtgtc agctcccctg atgtctgcag gaccccgacg gtgggaacaa cgacccaaca gagcagacag CCtggggggc tgggcctgcg ggagtgccag gcccccatgg ctcatgagag ccatctgaag accccagtgg ccccagatgc gcccggcctc ggctgggagc actgcctgga agggtttgta ctttCCctgg tctcactctg cctcctgggt ctgccaccac ccaggctggt tgggtttaca tgtatgaatc cgactcactg agctaacttc gatgagagtg gctggtctgg ccgcccaggc cacactcgag agggagctgg tggtgctggg gttaatacac tggtcatttg acgtggtggt cttggaggcc gcctgcggtc gctgtcagcc gtggttttgg ctgattctgc ggatgtggct ctgtaccaga agcctgggac 120 180 240 300 360 420 480 540 600 660 720 780 840 900 960 1020 1080 1140 1200 1260 1320 1380 1440 1500 1560 1620 1680 1740 1800 68 caggctccct tccccagggt aggcgtctgg tgggtgccct tagtctgttg ggtgctgatg tgactatagg ctggcatggg gagccctcac t Ctggc tgag gtgggacagt accaggtgtc tggacgtggc tgagtcggtg caagcctcct caccctgggg Caggtgccct c ccgggc atg ggggcttgtg gaggggctct gttgaccgcc ggactgggcg 1860 gcaagtagag gggctctcag 1920 gccttcagcg tgtgctgccg 1980 gcttcccgtg agcttccccc 2040 ctattgcag 2089 <210> 7 211> 687 <212> DNA 213> Homo sapiens <400> 7 gtggctgtgc gtatcagctt cggcgccaac cagcgtgggg cagccgggcc gtgcgcagcc agcccaccgg ggctgcggtg gtgtggcatg tctgaggaag gggctgcttc gactgtctcc t ttggt t taa agatgaaggg ccatttgtgc gtgtaggggg agggcctgga tccgtgcgct gctctgagga gctgcggtga aggatcccgt ctgggagggg tCCCctgggt catgctgtcc cttccttttt cccggaggag gcacagtgag agctcctggg tgcagcacgg tccgcttacg tcctggacct ccccgtcatc gtgcaacaca ttctaggtcc ccctatggtg ccgccag aaacagaagt gcgtttgagc cccacatttg gggccacggg acacagccag ggccatggca gtggccgagg tgccggtgcc tccagaaaag gcagggacag gctctgagga ccacaagaag 
cccgaggtcc tggatccgtg tcctgctgtg gggcccgggg accaggccac gactgccagg tgccccacgg ctcctgcacc -ccacccctgt tgaggagagt gtggggtgag gtggacagag catgcggcca ggaacccgtt -tcaaacaggg cgggtctggg tggctgggga. cactggggag gggtgggcac ttggccggat ccactttcct <210> 8 <211> 494 <212> DNA <213> Homo sapiens <400> 8 gtgggtgccg gcacctcatg tgtcactgtt catggggccg gattccagtt gcaccccagt ccctgccctt caggccctga ctccactcac gggacccccg t tgggtggag gaggacacac actgtgcacc tccgtcagag cctgagccag ggggtctgga gggcagaggt acag tgagcagccc gaggtactcc ctggcaccta ctgactgccc aaggaaccgc gggtctcctg gtggtggggg gatgtctgag tgctggacct tgggtgggcc gggtggaggc gggctcctat aacggctcag tcctgaggct tcagagagag tttctgcgtg tgggagtggc gcagggagtg cttcagcctt tcccaaggag ccaccaggcc cagagagggg agtgggggac gccactgtca tgcctgattg caggtgaccc tcctgcagca ggtcccactg ccggtgcct t acacagcccg accgccaggc gtctcctcgc <210> 9 <211> 865 <212> DNA <213> Homo sapiens <400> 9 gtaaggttca agaatgcagt Ctgtgatatg tgtggtgtgt tgtgtgtgtc tttgtggtgt ggccccttgg tcgggtgctg gtcctgtcac gctggtggta tccatgagat cggccggggg agtggtcatg ggtgaagaag acctctttct cgtgtgatag cgtgtctgtg cgtgtgtggc gtgtgtgtgg tgtgacacgt gtgtgtgcat ccttactcct gtttggggag agggctgggc ccttcctgga ataggaaggc ccttggggct agcacgctgg tatccctgga ctgacttctt tcgtgtccag atgcgtttct acgtgtgtgt cacgtgtgtg gcatgttcat gtgtccgtga tcctcctcca ctccacattc cttggagact ccctggcac tgattcaggc cggcaggggt aggggtaagc gcttcggtct gagct gatgtgtgtc tctgggatat gtggtggagg 'tacttccatg cgtggtgcat gtatctgtgg tccatggtgt gtgtgcctgt gctgtgtgct gcatgtctgt catatgcgtg tctatggcat ggcatggtcc gcaccattgt agggtcctca cttctagcat gtaagccagg tttgagagga ccccaggacc ccagtctggc ctcgctcccc gggacacact gaaaggggcc ctgggcttgg cctcaaagtc gtgccaggcc ggggagaggc acatgtggaa gaatgtgtct atttacacat 120 cgtgcatatt 180 ggtgtgcatg 240 gatgtgccta 300 gggtgtgtgt 360 cctcacgctc 420 gggtgcccct 480 gagtagggat 540 ctatgccggc 600 cctcccagag 660 gttcccaccc 720 ggggtgcaga 780 acccacaagg 840 <210> <211> 3782 <212> DNA <213> Homo sapiens <400> tgtgggattg ~rA, ggggtaacac gttttcatgt gtgggatagg 
tggggatctg tgggattggt ttttatgagt agagttcaag gcgagctttc ttcctgtagt gggtctgcag gtgctccaac 120 -69 agctttattg ggtgtggagg cctgctgtgg gatgtggccc tttctttttt ggctggagtg attctcttgc aatttttgta cctgacctca ccgccgcgcc cctgcagcct ccatttcatg gcgtaattgg ttgttgtttt tctaaacaag ctgtggagtg ggggtgtggg gcggtgctca gtgctcctta catccccttc accacaatgg atatattggc cgacctcaga caggccccat tcgatgtggt agcacagagt tgttagtgtg tcccgtagta tctctctccc gcaatccctc gactgtggat tcacaggggt gtggatggcg tgtggtgact gggtctgatg ggcggtcgtg gactgtggat ctgatgtgtg cggtcgtggg ctgtggatgg ggtctgatgt gtggatggcg tctgatgtgt ggatggcggt tgtggtgact gggtctgatg ggcggttggt gtggtgactg ggggtctgat gatggcggtc ggtgactgtg acaggggtct gtggatggcg tgtggtgact gtcacagggg actgtggatg tgatgtgtgg tcggtcacag actttgcgtc tgggcttcat agtgcccagc ag aggagaccat cct cccc tgg tgtgtggccg tggct acgct tctttctttt gtttggcgtg ctcagcctcc attttagtag ggtgatcctc cggccgagac tggtgctgac actctcttca tgtctgctgt tccggctcct catctgaagt gcaccggtct gagccagcgt gaggcgcaca tgggaatcta cccactgctg t tggggaccc ttttctgtgt cccatgggct gtaccttcct tttagcccac caccgtgcgc tgtcacgtgc aatgacaagc gcgtcttcag cagcactggg ggcagt cggt ctgatgtgtg gtcgtggggt gtggatggcg tggtgactgt gggtctgatg ggcggtcgtg gtgactgtgg gtctgatgtg tgatcggtca gtggtgactg gtcgtggggt ggtgactgtg cgtggggtct gtggatggcg tggtgactgt cccgggggtc tggatggcag gtgtggtgac gtggggtctg gatggcggtc gatgtgtggt gtcgtggggt gtggatggcg tctgatgtgt gcggt cgtgg tgactgtgga gggtctgatg ctcggccccc cccgccatcg tctggccggg atcttccttt gaactatggt gctccctgtt ctgtttcttc gtgggcaggg cttccaggcc ccgtccttgg aattcccctg tttttttttt tgataacaga atcttggctc actgcaacct caagtagctg gaattatagg agacgaggtt tctccatgtt ccacctcggc ctcccaaagt tcgcttcctg cagcttccgt aacctccgtt ttccttctcc cagaagagtt tcacgtgtgc ttatcgatgg cctccttcca tgaaggaaaa gtttcgatta tgccgttttc cctctaaagc ggggcctgtt aggaacccgg tcccgcctga gccccgcccc caccctactg agaactgtgc, atgcctgatg atctgaggtg tcctgtggaa 'aaatcgtctt, tgtgctaaag 'acctgcttca tgagtccaga ataattacgg atttgtgggc gtgttgcctg gttactgcct tccaggttgg ggccctgccg 
ccagctcctg gtcttttgat gcctcacaag ctgctcacat cctgtcttgg gtcctggggg agtctgcaga actcttctcc tgcctgtgct ctggagaggc ccgggagctc cacgggggtc tgatgtgtgg gtgactgtgg atggcggtcg ctgatgtgtg gtgactgtgg gtcgtggggt ctgatgtggt ggatggcagt cgtggggtct tggtgactgt ggatggcagt gggtctgatg tgtggtgact atggcggtcg tggggtctga gtgactgtgg atggcggtcg caggggtctg atgtgtggtg tggatggtga tcggtcacag ctgatgtgtg gtgactgtgg gatggcgatc ggtcacaggg gatgtgtggt gactgtggat gtcgtggggt ctgatgtggt ggatggcggt cgtggggtct tgatgtgtgg tgactgtgga tcgtggggtc tgatgtgtgg tgtggatggc ggtcgtgggg atgtgtggtg actgtggatg gtggggtctg atgtgtggtg gactgtggat ggcggtcgtg ctgatgtggt gactgtggat gtcgtagggt ctgatgtgtg ggtgactgtg gatggcggtc ggtctgatgt gtggtgactg tggcggtcgt ggggtctgat tgtggtagct gcaggtggag cggcccccgt ttcccaaaca ggcttggccg caggtccaca gcaggccaca tttgtggctc cgggtttata gtaagtcagg cactctgggg tcgtgtggtg tccttgtgtt cattggcctg cgagttggag gctttctttc gtctcgctct tttttgccca gtgcttcctg agttcaagca cgcccaccac catgctgact ggccaggctg gtctcgaact gctgggatga caggtgtgaa gagatctgca gcgatagctg aggtctcgct aggggtcttt tgatttcccg gctgtttcct tttcctttag gctttgttta tggatgtttg aactttcttt agggatcccg aggcccctgg cgcacagcgg gaggctaggt tctcagatca gcagtggcat gtgagagggg tctagattct .gaaccgtttg ctcccaaaac ccacgaaacc agtccctggt igcagcctctc gtcagtgttg atttctgtga tgctttccgc ctcctgggtt gggaagggtg ttctcagggt tgaatcgtac ggggctgggg aacatgctga ctcgaggcct cctgtgtccg ggacgcaggg gcttagcagg ataggaggtg ggggtgccgg gtggctgcac ctgcatccct gagtgccact tgtgccacgt tgactgtgga tggcggttgg tggggtctga tgtggtgact atggcggtcg tggggtctga gactgtggat ggcggtcgtg gatgtgtggt gactgtggat cgtggggtct gatgtgtggt gtggatggcg gtcgtggggt tgtgtggtga ctgtggatgg tggggtctga tgtgtggtga actgtggatg gcggtcgtgg gggtctgatg tgtggtgact atggcggttg gtcccggggg gtctgatgtg tggtgactgt ggcggtcgtg gggtctgatg gactgtggat ggcggtcgtg gatgtgtggt gactgtggat tggcggtcgt :ggggtctgat tgactgtgga tggcggtcgt tctgatgtgt .ggtgactgtg gcggtcgtgg ggtctgatgt actgtggatg gtgatcggtc gggtctgatg tgtggtgact ggcggtcgtg gggtctgatg gtgactgtgg atggcagtcg 
gtggggtctg atgtgtggtg tggatggcgg tcgtggggtc gtggtgactg tggatggtga tcccaggtgt gtctgtagct gaagcttccc aggcgctctc cgtcctgatc ggaagaaaca atgccctctc ctctgccggc 180 240 300 360 420 480 540 600 660 720 780 840 900 960 1020 1080 1140 1200 1260 :1320 1380 1440 1500 1560 1620 1680 1740 1800 1860 1920 1980 2040 2100 2160 2220 2280 2340 2400 2460 2520 2580 2640 2700 2760 2820 2880 2940 3000 3060 3120 3180 3240 3300 3360 3420 3480 3540 3600 3660 3720 3780 3782 <210> 11 <211> 980 <212> DNA <213> Homo sapiens <400> 11 gtctgggcac aatcactggg tgtgatgggg tgccctgcag ggttgggcac ggactcccag cagtgggtcc tcccctgggc ctcatgaccg gacagactgt tggccctggg gggcagtggg gggaatgagc 120 gcatgatgag ctgtgtgcct tggcgaaatc tgagctgggc catgccaggc 180 70 tgcgacagct ccgcagtgcc caaaggaaag agctggcccc cgggccgtgt ggccctgtgg cctgttagca ctctgggcga ccagaccctg cggtgggctg gtgtcacaca gatggccctg cggagggtct tctcccgtct gctgcattca tttgttcatg gtgtccccct tcagtgctgg ttgagccacg ccctttgcag cttgctcggc atttccttgg tgcccggcag tgtgggtgtg ctctgcctaa ggcacctgct atttgctaaa cc t ttaggag gtctgaggcc ccccgctgag atgtggtctg tctaggggac ctcccagggt c tgggcagca agcccagctg gcccatgtgt cacgtttgac tgtcttctct ggcaggccat aaaggaaacg cgggcctctc tccacgtggc agtcgtgtcc gggggtggag actcctggat gacccacagg gtctgcagag cttcatcaca tctcagcacc tgcgcggcct gccagttttg gtttgagccg tgtccccctt agtgctgggt cctgtggctc accgcatgag gtggcctggg cacatatgcc tggcccagag actcggcccg aacactgacc caccggc tca ctctccagtt atcttgaggc tgtcctgccc cttaggagga ctgtccacgt tttgcagatg gctcagagac ctgctgggac atccgggcca gagacgttct gccagcccac ccaaaaggga ctcccatgtg cat tccagcc cagccccgca tggccacgtg gtcctgcctg gctttcgcag <210> 12 <211> 2485 <212> DNA <213> Homo sapiens <400> 12 gtgagtcagg tgctcacctc ttttctggcc ggctcggctt ggggtgtgga tgcgccgagc cggctctcac tcctgtccct cccatctgga ccatggggca cggtggtaga cacacctccc ggaggaaat t ggaggcctct tttatttaaa attataatat cacaaattgc taagcggccc cgggcctcct gtggcagggc ggtgactgtg tggtccagtt acagagagag gggctggccg gcccatcact t tt t ttgaga actcactgca gctgagatta gggtttttgc ctcggcctcc 
ctttttaagg gctgcaagcg tcgcgtggca actgtttgtc gtcatgctga catgagtctt tcattccggg aaagaaaacc agccgcccca ggtatgtttc tctgtgtctg tctcacctgt tggccaggtg tctcctgccc cccgccccct gcggcagccg gttgctcctg gtttgagcct acgcttgtat gtcgtgtgac aagtgcgggg ggcggcctgg gccacagtgc ggcaggcatc cgtgcacact ctctgggatc aatataacta ttattaaagt acatggcagc ccaggcccac tcgtggtcgt tttggggaat tctgtcctgt tggcctctga tttcccatcc gactcctaga gtgatatctg cggaacgt ca acctccgcct caggcaccca catgt tggcc caaagtgctg tgaccaccta tctcttagca gccatgcctt tgaaaacgca aactaggggc tcaccgtgga tcaagtgtct ttgatgattc gtgcatggtg tgaggtgttt tgtgtggctc gtcttcccgc ccattgccct cttccccact ccggctcctg gagcggagca cgtggaggac gcagcttgtc ctctctctcc ccccgcgagg ttgaccgtgt gagagctgcc ctggtgccac tgcctgcgac caaggtcatc gtctccagcg ttaattattg ataattagaa agagtgaat t agaattcgct gaattttatt gtgaggtgat ccctaggaca ataaaaacgt catgtgctca gttggtgcgt caccagcaag ctgttgtctg cccgggttcc cccCctgcgc aggctggtct ggattacagg tagcgcttcc acaggagtgg ctgtgtgcac ccc ttggcat aaggttgtat caaattcctt ggttctgtga agagcaagga agagtgggga gccggctgaa ggtttgagtg cccag gcgggtggct gncct t Ctgc ggctgcaggc ggtgccacac gaggggcggg agctccaagt cgatacaaaa gcgcgggctc agtttgctcc gtcacacagc atcacgtcct cctgtgtgtg agcaaggtca gataaaggac cattataagt a tat taagt a ttggccgagg gacaaagtca aagatggat c gactgcgtcc cggacaggcc cttcaaaacc caggggcgta gtgcttctgt gaaagcctct cctgggcttg.
agcatttctc ctggctaatt cgaactcctg tgtgagccat cgaaaataac cgtcctgtgg ctttaggttc ccttgtttgg ccgttggcgc gaaaaaaaaa ataaactcta tgtggtcaca gcagggat tg tggtagacgt tacgcatgtc gggcgggctg ccggggccac tcccgaggcc gaggcctgga gggtgtgtct tactactgac ggattttatc ttctctctgt tctcgggggg cactgggtga ctggatttta cctggggaga tccgcagtca tgtgcacagc aatcactaat gtacacacgt gacacgtgtg cctccccaga aagtcacgta tcatgccctg cgaagctcta tgttgcccca tctgcttgcg gcaaaaagtg tttcttttct agtgcagtgg C tgc ctcagc tttgtatttt *acctcaggtg cacgcccagc aggtcttgtt gctctgggga cacggggcta agagtttctg gcagcggcta aaaggagt cc agatttaaga cctgtggctg t ttgt tcaga gtcgtttgtg cagcacatgc gcagggcttc cagagtctcc ccggaaacat aatggcaagc gggtcaggtg gctggacacc cgattctcat gactagattt cctgtggtgg gccacactca agtaaaacca gtggtagcac ggtggaacgt ttcggaagct ggtatcagca tctggaaaaa cacatgtgtg gaagccacca ccgtccacgt acagacagga gtccccatcg aaaactaaga ttgactcgct cagtcctctt ttcttttttt cgcgatctca ctcccgagca ,tagtagagag atccacccac cggaaagcct tttgcagtag tggctgaggg ttctgctctc cttctcgttg catgtagggt ggttaagcat aaccttaatg gatctgtttc ggtct cat ct tgtatgaggt cctgcccgtc 120 180 240 300 360 420 480 540 600 660 720 780 840 900 960 1020 1080 1140 1200 1260 1320 1380 1440 1500 1560 1620 1680 .174 0 1800 1860 1920 1980 2040 2100 2160 2220 2280 2340 2400 2460 2485 <210> 13 <211> 1984 <212> DNA <213> Homo sapiens <400> 13 gtgaggcctc agtgttaata ctcttcccca ggggggcttg ggtgggggtt gatttgcttt tgatgcattc ttcctggtgc tctggagacc atgactgctc tgtcttgagg aaccagacaa 120 71 ggttgcagcc gggctccacg gagggccgct ctgtcacgtc gccccacatc ttgcccatcc aacctaatgt tactttaagt gccatgttgg tatccctccc tgtccaagtg ttctttcctt aaggacatga attttcttaa gtgaatagtg tcctttgggt tccttgagga cagtgtaaaa tagtatcact cagccatttc attatagcat ataagtttat ggaaagtgtc ccatggtcat cctgtggata gggcaccacg ttatttttcc gc tggcacag tgctgtagca aaccgctttg tcgtagacag t tag ccttcttggt caggctctgt gccctgcatg acccaggttc tcccagcagg cacttgcatg ggttcaactc tctagggtac tgtgctgcac cactcccccc ttctcattgt gcaatagttt actcatcctt tccagtctat ccgcaataaa atatacccag 
atcaccacac gtgttctggt gaacaagcag tctcgaagac gtacagtgga gtaacagaaa ctcgagctgg ggggcgctgg ggatctgggt tggctcagag ctaagagtct aattgcacaa gttaactgta gagaatgtta atactacgta atgaagccgc ccagcggcca atgagcatgt cgttagggtc ccctcgacag gggtctacac agctggcttt atgtgcacga ccattaactc atcccatgac tcagttccca gctcagagtg ttttatgact catcgatgga catacgtgtg taatgggatg tgtcttccac gctggagagg acagttagtg tccgggtttt tcaaggttct caaaaatttc cggcacactg gcttgggcct ctcggatcat ggggcgaggt gagaagtggg gctgatggt a gagagctcgt ctttatttat aaaagtgtaa acgggagggg tgtccagagg gaattcaaca ct tggggaga gtggcctgga ccaaggacgc tattgacagc cgtgcaggtt atcatttaca aggccctggt cctgtgagtg atggtttcca gcatagtatt catttgggtt catgtgtctt gctgggtcaa aatggttgaa atgtggacag aaggatgcgt tcctgtgcat tcttcattaa ttgtacacac gtcagccctc gagggtcaca gctgaggacc tcccagcccc gccgcgcctg aacactgagt ctgttggaaa ggctgtgtaa agt taacct t ttgcacagcc cctcagggct ccgaggaagc tggggctggt ctgggcgcct acacacctaa agttactttt agttacatat ttaggtatat gtgtgatgtt agaacatgtg gcttcgtcca ccgtggtgta ggttgcaagt tatagcagca atggtatttc ctagtttaca cagttatttt caggaagcct cttttgaaac ggttcaagtt aacttgctct tgggacagga cagtgcacca acagctgcca agctttctta atggccttcg acttataatg gaaatttaag attgtttgac gctgtgtatt tgaggactgc cagcaggcgg acaccagctt gcagcctgag cttcagccca atatcgtgcc ttttttttaa gtatacatgt ctcctaatgc ccccaccctg gtgtttggtt tgtccctaca tatgtgccac ctttgctact tgatttataa tagttctaga Ctcccaccaa tttatgaaaa gcaggccaca tctagctcca ctagattgaa gggatttgga tacctctggc tgcccagctt tgctggtaaa ccgtcttcag ttcgtcttca aatgaggaat tttttcattt attcagtccc ttcccttatt 180 240 300 360 420 480 540 600 660 720 780 840 900 960 1020 1080 1140 1200 1260 1320 1380 1440 1500 1560 1620 1680 1740 1800 1860 1920 1980 1984 <210> 14 <211> 1871 <212> DNA <213> Homo sapiens <400> 14 gtgaggcccg tgccgtgtgt cccccgtgtc ctgcccctgg gtgctggatc cgcaagagca caccttcggg agggagtggg acacacgtgg tgagtgcagg ggcggctcct ggggccccag ggcggcagcc tcctccccag gctggcccac agcgttcgct agaatttgga tttgctgagt cagagttgat ttttgtgaat gggattgtcc aatgtggtcc agtgcgattt 
gacgagggac tggccgccag gggtggtttc cccgccctcc aagtccaccc gcctggcgtt ccttgtgccg aagattcact cggggggagc gctgagggga cagagcagac ggccagggag gtggctcaga gatgagtcgg cagccatgta ggagctgcgc agctggccga ggggccacag cagaggccgc gggcacaggg gggctccctg cttgacgtga agctgacgac gagcaggaac tcagaaccct aggagaaaac aggcaaagtc tcttgtccag attttagtct tgccatgggg acacatgaga gagtcctggc tgtcccgggt gggagacagg gaaagcaccc cctgccaggc ccagcaccct aacaagggtg tcaggttacc ctctgcctta g ctgtggggac caccgcagcg gaggcgcttg taccgtgcag cggtgacctg tgagaccccc ggtgcacctg gcggtcacgt gctgctgtct caaactaaaa ccctcaaggg gagaaacctt aggtgctttg tccaggtcca cagcccggag ccaggtccca ggggaacgct gtgtatgttg acaggaaggg ggtcccaggg aggaagggaa.
agctgggtga tggtgttgcc cccctttgtc gttgagaaac gccccggacc tggaccatca
ccaggccagg cgaagtctgg gctccaaatc tcctgggtga ctccacagcc ttgtctctgc gccgtgcacc gccctggtcc gctcctgctg aggagctgtg agcctgcgga tcCtgcgtgg tgaaccacgg tcaggcacag cgccccacag gaaagctgta ctgggctgtg Ccctccaggg cacagcaggc agcaactgag gcttctgtgt gggtcccacc gtggccacag ccaggccaca ggggatgccc gcgaggctca cagct cacag taaagcacag gtcttaaaag acagatgagt cagaggccac ttcttgcatg agcagggctg accacttctc cggccccgca tgtgggcttt caagtcctct caggcctggg tgcagagacg ctctttggaa cacagggcct gagcaggagc ggttgtttgg agatggctag gggacctggc agccggtggg aagggaaccc tttgtgaaaa ccgccctggg tgtgcacatt ggctcaggag ggcaagttcc gggggcagaa ggagctggga ggaagggcag aggccagagc tgactcggcg cccagccagg cagatgcctt aaggtgggat ctataacggg tggggctgca ctcacctacc ggtccaggct tggggttttc tcctggggct gcagttgagc ctctctgccg ggcgcagggg cacccaggtt agtcaagagt gcagggccga tgctgagtga gatcggtggg gagtgggttt ctcagcacag cttgttttaa tcagaaaatg cccatttgga ctgggggtat taaatccact tcctgaggct tgagggtgct ctctgtctct atgcaccagg ggggacgccc agaggctacc agggaacctc tcccgcgcct cagggcatct ggtggcaatt attgtggtgt cctcccatct tgtcctgccc cctcagagct caaagcattt gacattgccc 120 180 240 300 360 420 480 540 600 660 720 780 840 900 960 1020 1080 1140 1200 1260 1320 1380 1440 1500 1560 1620 1680 1740 1800 1860 1871 72 <210> <211> 3801 <212> DNA <213> Homo sapiens <400> gtgagcgcac ccgttgcgtc ggccacaggg tgagggtgct cagacctggg gggccctgct cctctgaacc cagggccctt agtctacagg tgtggggggg gccccgtctc ctgtttcttt ataatcccag ccaacctaac cctggtggca aacccaggag acagagtgag ggacaggtgt gaactggggg tggttgttaa tggactttgc tggacaccct ggtgcagaca ggatgccggt ctggcctcca cgcctctccc cttattttgc gcacagcatc gttctctcta catcagatgt gtgtgcttgc gggtgaactc gaagaaaaca ttcttgtcca gatggacaga gtatgtggca gactggaagc aggagacaca aggtgaacgt tgaggcaacg atggccattc cacattcatc aggggagcag ctcctgctcc agggtgagcc gatgcacact gagacccatc atgctggctc acttttctgg taattccact ccaggcaggg tattatgcat gtttaatggc cacaccccag gggcagtgag ctggagacac gctcttccat ccacaaaaaa acagtttatt tgcacaaaca gagtttggtc cagccccctc aggctctttg acattccccc ctggccggaa cacctctgct tgcccctcgt 
cacaacggga tgcactgagg gggcgtgagt cgagaccctg ttgggcgtga atgccatgag gtctctacaa aggctcagac tatgaataaa cactttggga caacatagtg cacgcctgta gcagaggttg acttcatctt ttttttattc tgccttcctc accagaggtt ctctttccag cgtgatgggg cccttgtgca ctcctgtgct ctggctttgt aggcacctct tccccatgaa agtgaatgtt aacacattgc gggtccaatg agaggtggct acatcctctg ggcaaaatga gattttagtc acaatagaac cagctgatgg aaataagttg tgcaaacaac tccctggttt ggcattgctt cctggagcgt ctctcacttt ccgcccttgg ggggcccttg ggagcccaag tgggaaggtc cctcaaagaa cttttctggg aaagcagctt tctgaagtga ggacttgcca cacaaaactt acaaaacgtt gagcctgccg tggtggtgag catgtgtgcc ccctgagatt cctgagtcac atgtgttttt cggccgtgcg atgcagagtc gggctgcagc gagcaagctt ctgtgtctca gtggagcctg tccgtgtggg cccatctggg gcagttttct tgtcttcaga ctctcaaacc gggccctgct gtctctccgc ttcatgatca aattctgggg acaaatgaat aagtatcaac ggccgaggtg aaattccatt gtccccgcta cagtgagccg aaaaaaaaaa tgtccttcga tgaaaggcac taaactgggg aatgctccct gagcagcagg tggtgcccag ccccacagtc Ctgcatgatt gcagtgctgg atgtattttt attgaaggac aaagccacag ccagaatatt ctaaaagctc tgtctgaagt ttaagaaaag tcccaaacca aaaacggaag aaaagagagt tgtctttaca accagcaaca ggtgttgggg tcactgcaga ttgtgcacgt gttctcctaa tcacccagct ctctgcccga gtcgtgttgg ctaccagcag -acgcacgtga cttgccaaga gt ttgcatgg ccagacatta cagcaagtca gctctgccat tatttcaatg tgaatgtcat gccctggagg acgtgcactc caaacacagt acctgtgttc ggctgagtta aggtttggat tggatggcat gcatgcccca tgcaggaggg g tgcccggctg gcaggcgact gctgagcaga gtgctatttt aagcagtctg cgaacacagg gggcgtgagt tgtgagcccc cgtgtgaccc tcttgtttcc tgaagatgga attccaggca gggcaggtgc gccaatccca aatgcatctt ggtaaaagga gatccgaacc ggccctgctg ctctccgaac acactccaag atcaggggac ccagagcccg cacagatgca gggcaaggtg ggtggatcac .ttgaggccag tctacttaaa .aaatacaaaa, tgcgggaggc.- tgaggcagga agatcacacc actgcactcc aaaaaagtat cagcattcca taatatttac tggtgctgtg accttcatgg gaagagaaat tcctgtcgtt ctgagttaac ggggtttgct tcatggggga tgcagacgcc ctcatgatgg catgtccctg ttgcagctcc cctgcttccc tctcacagcc tccacatttc ctgggctccc ccataccagt cagctgtgaa taggacaggc acccctggtt aaaggacaga caaacaaatc aggctagtgc 
aggatgggtg ctgtgctccc aaaggccact agcagtggag gcagtggttc atacagcaga ggcttgaagg tgaaaaagga aaagtggtaa cagctcagat ggtagaatgt ccctatctct cagaaacgtg gtgtgtgtaa tttttttttc gcatatacca gagcagattc gaaataaaac aaaagactca aaggacacac agggaggcgg gaaactcagc ttgcctgagc gatttattta aggcgccctg ccacctgaga ggtagaggag ggcaaagggc.-atgcatgatt ggaccccaca-.caagtcagac tgctgcaggg aagggtcaga tctgtgggag aatggtgcac caagacgccc ggcatgagtc ccagagactt gctcatccac agggccatgg agagctcaag gaaatctgtg gctcacacct gagtttgagg attagcctgg .gaatcatttg agcctgggca aaaccatagt ctagaggccg aagtggtgaa agtccagatc gcagcaggtg gggagtggca ctccccacaa ttacctggtc agcacctctt ctgtccactg ccagcctctg aggaaaatgg ggcatcaggt tggtcagagt gccatactca gcatctggga gatgggaatt ggtcagaact tgttaatgtg tgagaaaact taggtagaag aagggaaggg atgaaaccag cacagtgaaa .tgaggtcctg ,gaaaggctcc ccataggctc atggacgtct aactgacagc cccatccctc agctggaaag gtcttcccag ttccagtgtt gctaaggaga tttgaagaat tgtaaaagaa ggacatacat tcctgcccct gtgccacctg cagtgttctc ccagggctcc agatgatgag ggtcctggtg agtgagcacc ggaaggcagg cctgtgtctg 120 180 240 300 360 420 480 540 600 660 720 780 840 900 960 1020 1080 1140 1200 1260 1320 1380 1440 1500 1560 1620 1680 1740 1800 1860 1920 1980 2040 2100 2160 2220 2280 2340 2400 2460 2520 2580 2640 2700 2760 2820 2880 2940 3000 3060 3120 3180 3240 3300 3360 3420 3480 3540 3600 3660 3720 3780 3801 ggatggctgt cgtcaaagaa aactgatggc gccagcatca aagtcctcac tcacgggtct cgaacctgcc taaacatttt tagcagtgtt gtgtgttcat acatcggtgg actggagccc gagattcccc actcgaggga tgtgcagatc acactcaaca gtagcatttg ggcaggacaa ggctgggtgt gaaagaagaa atgcatgtga gagacctgtc ggttgaggca aatgtcctgt tatttaccat caaatacagg tcaaagaatt caaagctgga ctttggacat gatgcctcca tgtttagctg acgcccaact cgcccgggag tcatcagggc tcactagcca gagtccatgg ggaagcggga ggggcaggca <210> 16 73 <211> 880 <212> DNA <213> Homo sapiens <400> 16 gtgagcaggc gtgtgtgtgt acatgtacgc cagtgtgtgc tgcatgtgtg *gcagtgtgag taggtcctca gctgaggctc CCCtCtCctg actccctctc cctcccctct gagcctcggg ccgggtgagg ctctttcttg gaggtttcta tgatggtcag gtgcgcgcgt 
atatacacgt acaggtgtgc ttcgtgcaca tagcatgtgt gcaccagtgc tgaagctgca tgggcttctg tcctgtgggc ctgtgggcat ggcaggcaga gccaggccgg ttccatctga ccgtttctca cacagagttc gcctgcaagg gagcacatac aagggcacaa gtcgtgtggg gcacataaca cactccttac gccctgaggg tgtccactcc atccgcgtcc ttgcgtccac tgacacagag atttcactgg atggatgata ctctttcttg agagttcagg aggtgtgtgc gcaagtatgt ctgatggtga ctggctgcac gtaagagtgc 120 atgtgtgcat gtgtgtacat gaaggcatgg 180 gtgtgtgcac atgcgaatgc acacctgaca 240 cattcacgtg aggtgcatgc gtgtgggtgt 300 tgtattgagg ggtcctcgtg ttcaccccgc 360 aggatgagac ggggtcccag gccttggtgg 420 cattgtccca tctgggcatc cgcgtccact 480 ccctctcctg tgggcattta catccactcc 540 actccccctc tctgtgggca tctgcgtcca 600 tccctctcct ggttccttcc tgtcttggcc 660 tcttgactcg cccagggtgg ttcgcagctg 720 gaagagggat.- agtttcttgt. caaaatgttC 780 aagcaaaaag, taaaaactta. aaatcccaga 840 gcgactctag 880 <210> 17 <211> 3186 <212> DNA <213> Mono sapiens <400> 17 gtgagccgcc tctgacccgg ttcagtggtg tctccatctc gcactggccg atgtttatgg tcagatgccc tcgctggccc gcacctgtgc ggtgcccctg acagggccag aaagggcagt ggagctgaat atttgtgtta gtcgtcgtct gtaccatgaa agagccacag ctcagttcca aaatcttccc ctcctttccg ctgacactgt tgaggtcaga tgccacctgc agctccgagg ccatgtgggg tgggccacgg ccctcgaggc cagtgggcga tggggctcgg cccgacctct tgtttccctc atttgtttaa cacctcagca aggtcattcc gtgggtgaga ctcaggcact acgggggctc ccaggtctgt gacagggctg aggatccctt tatgaaaaat ttgggaggct gattgcacca agaagactga cagaagccaa accaaggggt ggcttcacct ctgctgcctg tgggtagtgg tgggacgtca ggagtcttag ggaggatttg cacccccggg tctgggcatg ccaagaatcg cttctgcctg cgggcaccac gccaggaggc cccagggccg atcgtggaaa aatggttttt ctgcatgtta gggtgcgtcc tcgtttgcat gaaacccttg ggttgacccc ggagtt t tcc tcctcccata agctcccgta acccttgggt gccgtttcca aggagtggga ggctgtggtg ccttcctggc agcaggtggc ctgggtcagg aaacattctg gagttactga agaagtggct tgaggtacac gggtgagatg tgatcacacg gcacacctgc gcgcggtggc gagcccagga aaaaacaaaa gaagtgggag ctgtactgca caaatgcagt gtcggtgtct gcaggcccag cctccaggga tggaactcct gggttttagg tgcacagttc tgttcgcgtg taggagccgg tgtggcccca tggaggccat 
cccagggcag cagaggaggc tgggaaggtg gggtctcagc aaagagggcc aaggtgcagc agagctgtgg gctgtgctcc tggaacgttc acaactttat cacagaggga gagtcagggc aggtggtggc aggcccgggc ctccacctca cgaagccctc gccccatgag aggctgcgcg aattaccgtg cccagcaagg gctcacggga aacccgagtg cttgcgcctt ccgcctttgc accagctcca ggctcagacc gccctcctct ctccctgacg cgtgcctggg gggtgtgctg gatacaggtg agggtccagc tggcgtgctt caggtgaaaa ctcctgggaa ttcagctcag tcttgtcctc gagggcctgg gctcagggca agtcgcttga ttgggtagcc aacacagagt caggcacgtg gaacggagag ctgggccccg gtccacgtgg cgctgggggc ccgtgctggc cgcgcctcca tatttctccc tttggaagag agcgtggccg tgtggcaacc ggcctggctt ccgttgttgc gaggctgaaa ccggggtgct caggaagtca gtgagaccag ggggggctca ggcagtgggt aggtacacgg ggggctcagg cacatatgag cacatgtgca cccaaagtcc caggaagctg tcacacctgt agtcccagca gtttaagacc agcctgagca attagctgaa catggtggtg gatcacttga gcccaggagg gcctgggtga cagagtgaga ttcttggaaa gaaacattta cggtgtcagt gagatgagat ccctccgcgc ggcaaggaat gctctgtgca ggtgtcccca caggggcatg tctgaacagt gaggtgggtg ctccccacac cc tgt cctgg agggccaatc acaagcctcg acaggcctcc ggctgagaag cacacttgat gagttttcca catgctctgg gaggcttggg ctgccttctc ccctcgtgca ccactgagga.
ggggcctcct actcccaggg atttccccac gggcggctga ctgaggaggc gaaggcccag atttcacggc ggggtctgat cacgggcttg agcccctcac ccgggacctt taaatgggga ggcttgactg gtacatgggg gaggccaggt cagagggtca catgtgctgt agaggccaaa ctttgggagg acatagtaga tgcgcctgta tggaagctgc gcccatctca gtaggaactt gatgggtcct tctgctcacc gtcttacgtt aagcacctgt ctgtgcctgt gggtaaagag agatgggaga caggtgaggg agcccggcca ctggtcaggg tgtggaggcc gggctgtacc cgagccactg gagtgtgagc gtgaaatgag ttacaaggtc cagggagggc accaggctgt tctctgcctc agctgcttga *ctggaggtgt tgggccatga ccatgtgacc cagggtctct gtttccccac cgagatgcga gaatcccctt agccaggctg tcaaatccgc gggtggacgc ccatgctagg aggcttattt aaagacatcc gtgtgatctc ggctcaggca acatgggggg gaccaggtac ttcatggtag gatggaggct ccgaggcgag accccatctc gttccaatac agtgagctga acaacaacaa aacctacaca cacaccatca 120 180 240 300 360 420 480 540 600 660 720 780 840 900 960 1020 1080 1140 1200 1260 1320 1380 1440 1500 1560 1620 1680 1740 1800 1860 1920 1980 2040 2100 2160 2220 2280 2340 2400 2460 2520 2580 2640 2700 74 ccccagaccc ttgatatacg tggtacacaa aatcagaagc gtgttcatac actcgcacac gcccatgagg gcccacaccc tctcag agggtttatg atgacatcaa ggaacaatgg cagcatgggg aga tggtgca acaagcacac aaacccatgc acgagcaccg caccacaggg ggttgtctga ataaactgga ggctggcatc cagaaacgca acacagacat atgtgcattc tctgattagg gcgggtggct cgaagggcag aaccttagag caggatggag gtgtacctgt gcatgcatgc atgcacgcac aggcctttcc cagaagggat gattcatgat gccttcccgg ctgcttcagc gcacacacag atccgtgtgt acaggcaccg tctgacgctg gcgcaggacg aagtacctgc aacaggggct ctccacatgc acacgcagct gtgcacctgt gtgggcccat tccgccatcc 2760 2820 2880 2940 3000 3060 3120 3180 3186 <210> 18 <211> 781 <212> DNA <213> Homo sapiens <400> 18 gtatgtgcag ggagactgag taacccaacc tggaagggac gcgctggggg tcctgtttgc gcaaacccag agcagaggcc tcctctgccc agccatgtcg gtggaaattt gggt cgggac gtgtctcctg g gtgcctggcc tgaatctggg actgtcaggc aggagctgtc gcctggtctc cctgtggtgg gccaagggct gcgtatcacc ctggacactt aacctgcggt cacctggaga agccagagat ggaggggagc tcagtggcag cttaggaagt tcgtctgccc tgggagctgc tcctgtttgc gattgggctg taggaggagg acgacagagc 
tgtccagcat cctgagctta agccgaagaa ggagccaccc tgggctgggc cagtgcctgc tcttacccct gccctctcgt catccttccc, cccatggtgg.
tctcccgtcc
ccaggcccag cccgcgccgt cagggaggt t acagcttcta aacatttctg cgcagaccgt ctgtgactcc ctgctggtgt tagtgtgtca tttcgcatca..ggaagtggtt 120 :ggggtgagca".gagcacctga.-l180 acct 'tgctct.,gcctggggaa.240 *gatttggggg gcctggcctcA.00 atggcactta-'gggcccttgt 360 gctaccccac ccctctcagg 420 cctctgcttc ccagtcaccg 480 tctgatccgt ctgaaattca 540 ctttctgttc tttctgtgtt 600 tcgtgactcc cgggtgtggg tcagcctctg tgcggtgctt cagctttccg ttttccccca <210> 19 <211> 536 <212> DNA <213> Homo sapiens <400> 19 gcaagtgtgg tgtgtggggc ggggcctgga agcctggcag ccacgcttgg ctgccctgag ctgggctgcc ccagccaggg ttggaggatg gtggaggcca gagcagcctc gccacgctgg ggtccccaac gagccttctg ctcctggggt tgtctgctcg ccacgaggtg ccacctctgg gtgcgggccc agatgctgct cagccctatg ttcttgaacc acccctgacc cctgagcaag ccccggtgga caggccctgc cctcttctgg cacctgccca gaagtgcaga tgattaaacg cctgcttccc tgtgtcctct ttctctcccc ggggtgtctg ctgcccggcc aacggagtc t ggggtcatcc cgcccccggg ctggtgtccc atctcagggg cacagcctct gccccgccgc tcccttcact acccacacgt gattttggcc ttgaacgccc cctgaccctg caggccacgg cgatggctcc tccctggctg tccagcgtca.
gaggttccca cctaggaggg ccgcag <210> <211> 3179 <212> DNA <213> Homo sapiens <400> atctcatgtt ctgtgagtga gtctatgagt tttctgatgc ccctggaaga cttgggcggc ctggggtggc ggctgggccc ggtgcacatc cctgtataaa ccgcctctgg gtggggcagt atcctcttat tCttCCtctt tatcctCtta gtggagctgg gaggggcggc gcttgggcca tgaatcctaa acggggtggt gaatggggtt tgtgaggcag cataacagta ggggatgatg aggggtgatg cctcctcccc ctctgggcca atccaggatt ccattctctt ggagggtgtg cat ct CCcag atctcccagt tctcctagtc acatacgtcc tcagagggac cacgaaaccg tgtgcactgc ggtcagtgcg gtggtcagtg gaggggaagg agtccaggcc gagggcctgg gggggggctg tgcctcccac tcagctttca cctcctcctg aagagtagac gacacaggag tctcatctct ctcatctgtc tcatccagac ttcctcaggc gcagtcttgg agggccctgc atagacacca ggcccatggc cgggcccatg agggtagggg cgaagggcag ccagggtggc gtctgggtgg ctgcagccgt tggaggtggg aacgccccaa caggattctg gcttcagggt catcctctta atcCtCttac ttacctccca, agaaggaact ggtgaagaaa gtgagtggct ctgtatgcaa ctggctgtgc gcctggctgg atagacagtg cagggatgct agggatgatg cggggaagat ggatccggat gggcaggggc ctcaggttga atctctgaag ggggctggtg tcatctccca catctcccag gggcgggtgc ggaaggattg cagcccctcc ccagagcctt ttacagaagc atttacggaa gcctgggagg ggagccccca gggggcccag ggggccccag ggggaagcct gtgcttccct atgacaccat aagtcacatt ggtgggtagg atgctctctc gtctcatctg tctcatctct caggctcgca cagagaacag tcagaagttg ccagcaggtc 120 180 240 300 360 420 480 540 600 660 720 780 840 900 960 1020 1080 75 cctggtgggg gtttgagtgc gcgtcattta gggcccaagt agctgtgagg ggaagggagc agggcgggga gaccaacagg tctccgggtg tagaccctta ttgtctgttt ggttgttagt gatcctccgg ctggcacttt cagtagtttg gggtaacata ccagcatctg ggtcatggct gaccctgtct gaaggaaaga taggtagact aaccccagct gacaagcgtg ggaggctgtg gccccacgct acctgggaag gcacttgtgg acctgtgaat tgccagtcag gggtccctag cactggccac ccgctccatg acgcctcgat tgatgcgttt cagtacaggg ccttatggta agcccggacg ttgctgctgc ccacagactg aaggaggggc ggccccgggc cttcccagga tcaggccatt ttttttgttg aaaaaggtat ttatttatta gcagtggcac cctcagcttc taaaaaccac ggaagccgag gggagacccc tagtcccagc gcagtgagct caaaaaaaaa gaagaagaag gtcaaatctc ctttggactt tatggagcga ggtgacacca cc 
tgc cggt c gatgctgtgc caggcacaat gtgtcacccg ccgatcttaa aagtgagaga tgctggc tt t ctggaaaagc ttcaggccag gtgttcagcc aaatgaatac tggccgggtc tgcctggtgt ttcagagaat tgtcgtaaat tcttggcagc gccgtgggcg gcagaggccg gttcagctat aaattttact ttgctttgat ttattattat agtcatggct ccagagtgct tatgtaaggt gcagaaggat atctctacaa tgctcgggag gtgattgtac aaaaaaaaaa gaagaaggaa agagcaaaat.
ccttaggcct gtgagttcaa gccaggaccc ctgcacctgc agggggcttg tacagcccct caaggcagag ggt cat cctg gggaggcagg gagatggagg aagcaatcct tgggacctgt actaagctgc agggacagtt ctactgagtg cggggtgggg gtctgagtga gcactctggt cggcctgggg gacgacctca ctgctcaggc ccatcttcta caggattact atggcttaac tattagagat cgctgtagcc gggattacag caggtccagt tgtctgaggc aaaatgcaaa gctgagtggg catcgcactc caccttggac gcttatggcc ccgagcctaa gcctggagcc gcgcctttgc agtgagaggt acacctgggt caaagctcca tatatttttt tcactaagca ggtgt ctac t gcaaaccccc gtgtgagcca ggcttccaca caggagtttg aagttatccg aggatcgctt cagcctgggc agggcttctg actggatatg tgtgtatggt cccgtatagg cctgcaaact tggacagaac ttgaatcaca gattcctgtt gctaaagtat cctactt tat ctgtcaccca aggctcaagt ctgcccttgc cctgtcatcc agaccagcat ggcgtggggt gagcccggga aacagagtga 1140 1200 1260 1320 1380 1440 1500 1560 1620 1680 1740 1800 1860 1920 1980 2040 2100 2160 gaaggagaag.gagaagagaa -,gaagaaggaa.:222 0 gaaagaagga: gaaggaggcc tgctaggtgc- 22 gaaaataaca. aagttttaaa. gggaaagaaa. 2340 gaacttcatc agcagaaagg ctgaaaggga tgtaaccgtc ccaaactttg ccccaaagat gctggtgaag gattatctgg ggagagt cag agggggtccc ccccggtcct t tcagct tt c agtgattcgt ctcagagtga tcaagcagct gaggagaagc gtggttgttt gatgttggtg gtgggtttca gcccacgtcc gctgcaggtg tgggcctgat agaggggacg cagccaagga gagggcacac cggcctccag cacagcagca ctctcagccc tccttccaca aggcaagggt tcctgcctca ccaggtgccc gaagccccag ttctcctgga gaatcacggc atggccacaa tgagaaggac atgggggcag ggccctgccc agctgtaaga aatggaatag acccctggg .2400 2460 2520 2580 2640 2700 2760 2820 2880 2940 3000 3060 3120 3179 EDITORIAL NOTE FOR APPLICATION NO. 22729/99 THE FOLLOWING SEQUENCE LISTING IS PART OF THE DESCRIPTION THE CLAIMS BEGIN ON PAGE 76 WO 99/33998 WO 9933998PCT/E P98/08216 1 18
SEQUENCE LISTING
ci1o> Bayer AG <120> Regulatory DNA sequences of the 51f region of the gene for the human catalytic telomerase subunit, and their diagnostic and therapeutic use 0 <130> -LeA32805-Foreign countries <140> <141> <160>. <170> Patentln Vers. <210> 1 <211> 5126 <212> DNA <213> Homo sapiens 1 gagctctgaa gggtaatgaa ccgttccttc tcgggtgtga gagcacgggc accaggctgg caggcactcc gagactcagc gcctcagct tcacctgtcc ggtgtctgtc agactggctc ggctgcacgc ccggtgtgtt ctagggtctc ggaaatgcaa caggcctggg CCCCCtccct aagacccagc cgcacatcat caaagcaggg tccgcacggt catctcaagg gctcaaaaag gggggcggca atggtattgg ggctgtgcca acgtcctgat gtgatctccg gtaatccagg gatggaggca gcgcctccag gcacggctgg agggcactcg acccatgcac cagggctgaa tattt agcggcatga tcagcctccc gtgatccgcc :cgtggaaac ;tggtgtgca catcattatt caagccatga :acacccctg ggtgacaaca cccaaattct :tggggtgcc ctccagcagc cactgtgtct tccttcccca ctctgagcct tgacctccat cttctgtttc ggggt tta catttgggtg gacggagccc ctggaacaca attggcaccc gtacacactc aaatccctgc ggacagttcc gaattacgct aaagaatttc 9"99999"~ ctcagttatg tctttgccat tcccccaaac tgaggaccct ggttctggga gtcagtctga aagctggaaa cccttagccc cgctgccct t tgtgaatcta gtgcctccgg tacttacttt tcttggctca aagtagctgg gatgggcttt caccccagcc gaacatgacc cttgcctgcc ggaaatggcc atgtaaata catctrtcacc cccaaggact caaaactcag tacaaacacc atatattaag agtccaggag gcggctgaac agtctgttc tgcttccctg ggtgggtcaa cacgactctg ctgatgggga 120 gaatgattcc actcttttac agatgaggct tctagactag a t 9 t agggcctggt acactgaggc ttcctaaacc tgtct cagcg acactcacat gaacctggct ttccaggcgc tgtgctcctt taggcatagg tgaaagtagg ccgccaggga gagtggcagt ctggacatrtt ccgtccacga taaaatgtcc tcacagtgaa gagtcaaaac accccatggc actgcacgca ggagactaac gcccgagtgt ctgtggacag gaggtccggg agaggcgggc ggctgaaaag aagcggggaa accagggccc ctagcatgaa ggattatt~c gcaagggcag ctgagacaga ctgcaacctc gatttcaggc caccatgttg tcccaaagcg tgctgcttcc cgagggcgcc a cagccctgtc tccacaccct c ctgggtgggc cgtgttccag c acgtagctcg cacggttcct c gcgttgaagg gaggagattc t cgtggccccc gatgcaggtt c tccccgtctc ctgtcatctg c tccacgtcca gctgcgtgtg-t acgggggcgt 
ggtgggccag agtgcctgtc ctcacctagg cccgcccttc tctgcccagc ttccacaagc actaagcatc gccccacagc cctgggaatt ccgacccccg ctgttttatt tttaacaaac tggttaaaca
gaggaacatg ccgtttataa tgccacctcc atgggatacg aggggagtgg ttaggggggt ccrtttacta aagccagttt cataggggag tggggatggg cctgggcagg ataatgctct aacccgcccg gccccagggc atccttcggg actacctgca aggagggtca gaggggggca ggagggaggg cctcgagccc gggaccctcc acggagcctg atcgtggacc tccggcctcc gtgtgtgggg atttgcagaa aaaacaaagg tttacagaaa ggcaggcacg agtgatttta gttatgCtct tgttgcccag cgtctcctgg gttcaagcaa gtgcaccacc acacccggct gtcaagctga tctcaaaatc ctgggattac aggcatgagc .gcaacttct 1 aggcccaca 2 rctttcagcc 3 .agaccctgg 3 tctgCCCtg 4 :cgcctccag 4 :gctactgtc 5 :ctcacatgg 6 :gCgCCtCCC E :ctggcgtcc :cggggcctg :ctCtgCCCg ;gcgctcttg :ccacgggca acttcctgc ctcttcccaa cacgtgacta ttaatagcta.
aacgggtcca, agcctgcagg tacgcaacat taaggacggt cctggttctg ggaacccgga agagatgccc ctttgcaggt ggcccgaaaa gcctcaggac aggcctgcaa cagcaggaag gtgccatagg gcaacaggaa catccaagga tttagctact gctggagtgc ttctcgtgcc aattttgtat ctgacctcag cactgcacct so 00 so 00o 340 900 960 1020 1.080 1140 1200 1260 1320 1380 1440 1500 1560 1620 1680 1740 1800 1860 1920 1980 2040 2100 2160 2220 2280 2340 2400 2460 WO 99/33998 WO 9933998PCT/EP98/082 16 2 18 ggcctattta accattttaa catggagttc aatttcccct ttcgtagact ggggat5aca tcccatggga eccactgcag cgccatctgc cagtagaaac atctcaatgt ctcagtgtgc actgggattg agccccttc tggaggaagg aatgatactt agtgcaatgg cgcgatcttg ctgcttccgc ctcccatttg tttgtatttt tagcagagac gaacttctga cctcagatga gtgagccacc atgcccagct gaagctcacc ccactcaagt tgttagaaca ctcttgatgt acacactaac tgcacccata tgccgggagg cgtttcctcg aaccagtgta agctacaact cctagtggca gagacaarttc ggatttctag aagagcgacc gagcgtgaca gcccagggag aatttcctcc ggcagtct atttcagcgt ttgccgacct tttcccgccc ccttagatcc cagctgtcct gcggttgtgC tcrtactgctg ggctggaagt gcctggaccc cgaggctgcc catctgccag acagagtgcc ccggtgcgcg gccagcagga cccggtgggt gattaacaga gagaacctgc aaagagaaat tgcagggagg cactccggga tcgtccccag ccgcgtctac ccggagcccg acgccccgcg gcggccaaag ggtcgccgca cgggttaccc cacagcctac cctgcaccct gggagcgcga ccgcccggag cagctgcgct gacgcccagg accgcgctcc cccctccacc ttccagctcc gggtccccgg cccagccCc cgccctctcc tcgcggCgcc ccctggcccc ggccaccccc <210> 2 <211> 4042 c2l2> DNA <213> Homo sapiens aacttccctg ttactcagga cgtctcttga gggcagctgg ctgatgtaga gccgaaacat ctatccccc tgttattttt ggctcaagtc gttaccctcc catattcaca gaggctgcag atcagggcgc gtagaaatta, ccaggggcag cactgctggt acacccactg tttgatattt gtttctgtga gcttcaggtc aagtgtggac aagtccatcc aggagttcct actgaatcca ttgttgctca cccaggttca ccaccatgcc atgttggcca aaagtgCtgg aacatctggg tgatagaat tttgagaggc ggtttcactc gcttactgca gcctctgcct gtaaggagtt 2520 tctgtaattc 2580 ccacctgtta 2640 ccagtggggt 2700 actgtcctga 2760 ctcctactCt 2820 ctcactcctg 2880 ctgtttcatt 2940 ggctggaggg 3000 agtgattctc 3060 cagctaattt 3120 ggctggtctc 3180 
gattacaggt 3240 tctgaggtag 3300 tttttattgt 3360 cttttcaaag.,3420 cttcattgaa..3480 gctgggatta tccacctgcc cagaatttac gttgtggtgt ttacaccgt atactggggt ccatgcacat cttttaaaat taacttgt acaaacacag tgtaatccr-a ggtgcgaggc gaaagtagga cagctacagc aaacttgagc cggggcccca cgggcctcct ctccaccctg ggggcccagg gcgcctggCt tttggggtgg9 gacgggcctg ggtcccgcgt gcgcctCCgt tccggacctg cgcacctgtt Igccgattcga LgcggcgcgCg gtcggggcca ccacgtggcs gcctcctccc tccgggccct ;agtttcaggc gcgatg caggcacccg ggggctcacc tctgcctcct tctgtttaga tttaagccaa gatgactaag. acatcatcaggtcttctggg. tatcagcaat ggtgttaatt tgtgttttcc tggaacaaat ccctttaaaa agtatttaca ctgttcaaac aaggttacat atccctgcaa aacccggagz ggt ctggag agectgcag tgcgggcggg gtcaaggccg ccatttccca tttgctcatg tgtcaaggag gcccgtccag cctccccttc gaggcagccc cccagggcct cctctctccg ggcggggaag ggCCgggCtc gagggactgg cgcggaCCCC cccagcccct agcgctgcgt actccagcat-aatcttctgc'3540 atgttggcrtt ctctgcagag.3600 tttccaaacc aggcttaggg agacgaggct gctagctcca ttaaggttgc ggcct cggga ctggatCtc ggaccagtgg tccgaggctt atgtgaccag ttgtggctgg ccctttctcg gtggggaCc cccaagtcgc ggagcaatgc acgiccggca tgggtctCCg ccacatcatg ctggggCCCt cgcggcccag ccagtggatt ggacccgggc *gccccgtCc ccccttcctt *cctgctgcgC gcccctttgc 3660 atcactaagg 3720 aacctccagc 3780 taaataaagc 3840 gtttgttagc 3900 gacccagaag 3960 gggaagtcct 4020 ccgrtgcggct 4080 ggagccaggt 4140 atgttggcct 4200 tgtgaggcgc 4260 acgggaccgc 4320 ctcgccgcc 4380 ggggaagtgt 4440 gtCCtCgggt 4500 r.tCgtggtgC 4560 gatcaggcca 4620 gCCCCtCcCi 4680 CgCtggCgtC 4740 acccccgggt 4800 cgcgggcaca 4860 acccgtCCtg 4920 gacccctccc 4980 tccgcggccc 5040 acgtgggaag 5100 5126 <400> 2 gtttcaggca cgatgccgcg aggtgctgc agcgcgggga gggacgcacg tggtggcccg gcttcgcgCt gcagctacct tgccgcgCCg tgctggtggc aacgggcctg gtgcgaggag gcgCtgCgtC cgctCCCCgC gctggccacg cccggCggcc gccgcccC agtgctgcag gctggacggg gcccaacacg cgtgggcgac tcccagctgc ggcccggCCC ctgctgcgca tgccgagccg ttcgtgcggc ttccgcgcgc gccgccccct aggctgtgcg gcccgcgggg gtgaccgacg gacgcgctgg gcctaccagg ccgccacacg cgtgggaagc tgcgctccct gcctggggcc 
tggtggccca ccttccgcca agcgcggcg c gcccccccga caccgcgggg ttcacctgct tgtgcgggcc ctagtggacc ccggggtccc gaagtctgcc cctggccccg gctgcgcagC ccagggctgg gcgcctggtg ggtgtccCgc gaagaacgtg ggccttcacc gagcggggcg ggcacgetgc gccgctgtac ccgaaggcgt cctgggcCtg gttgcccaag gccacccccg cactaccgcg cggctggtgc tgcgtgccct ctgaaggagc ctggccrttcg accagcgtgc tgggggctgc gCgctctttg cagctcggcg ctgggatgcg ccagccccgg aggcccaggc gaaccatagc gtcacggagg gcgcgggggc agtgccagcc WO 99/33998 WO 9933998PCT/EP98/0821 6 I 18 gtggcgctgc gcaggacgcg aagaagccac gccgccagca cttgtccccc agctgcggc tcgtggagac cccgcctgc acgcgcagcg ccccagcagc aggaggacac aggtgtacgg ccaggcacaa atgccaagct tgcgcaggag tcctggccaa tcttttatgt tctggagcaa agctgtcgga gactccgctt tgggagccag cactgttcag tgctgggcc aggacccgcc tcccccagga gcgtgcgtcg agagccacgt tgcaggagac aggccagcag tcaggggcaa tgctctgcag acgggctgct cgaaaacctt tgcggaagac ttcagatgcc tggaggtgca tcaaccgcgg tgaagtgtca acatctacaa catttcatca cctccctctg gcgccgccgg tcaagctgac agacgcagct acccggcact aggccgagag ggcggcccac aggcctgcat aagggctgag caccccaggg tccatcccca accatccagg aggtgtgccc t tggggggag aaaaaaaaaa :cctgagccg gagc tggaccgagi gacc :ictttggag ggr.5 ccacgegggc cccc ;gtgiacgcc gagz :tccttccta ctcz ccagcgctac tggc :ccctacggg gcgc :ggtgtctgt gccc agacccccgt cgcc :ttcgtgcgg gcct cgaacgccgc ttcc ctcgctgcag gag cccaggggtt ggcl gctcctgcac tgg cacggagacc acgt gttgcaaagc att agcagaggtc agga catccccaag cct~ aacgttccgc aga cgtgctcaac tae ggacgatatc cac, gcctgagctg tac caggctcacg gag gtatgccgcg gtci ctctaccttg aca cagcccgctg agg tggcctcttc gadi gtcctacgtc cag cctgtgctac ggc cctgcgtttg gtg cctcaggacc ctg agtggtgaac ttc ggcccacggc cta gagcgactac tcc cttcaaggct ggg cagcctgttt ctg gatcctcctg ctg gcaagtttgg aag ctactccatc ctg ccctcigccc tcc tcgacaccgt gtc gagtcggaag ctc gccctcagac ttc cagacaccag cag acccaggcdc gca gcccggctga agg tgtccagcac acc ccagcttttc ctc gattcgccat tgt tggagacct ga5 tgtacacagg cgz gtgctgcggg ag~ aaaaaaaaaa aa :ggacgc :gtggtt rcgctct :catcca iccaagc Lgctctc :ccaggc :aaatgc :tcctca :gggaga 
:tggtgc :gcctgc :tcagga :tgacgt :gtgttc ctgatga t ttcaaa ;gaatca :agcatc gacgggc gaaaaga gagcggg agggcct tttgtca gccaicg :agaagg gacctcc gatgccg gtcttcc tgccagg gacatgg gatgatt gt ccgag cctgtag ttcccct agctatg aggaaca gatttgc caggcgc aacccca aaagcca gaggccg :acctacg ccgggga :aagacca ccgt tgggca t ctgtgtggt ctggcacgcg catcgcggcc acttcctcca tgaggcCcag cctggatgcc ggcccctgtt agacgcactg agccccaggg agctgctccg gccggctggt acaccaagaa ggggtcctgg gtcacctgcc ccactcccac accacgtccc ctcctcaggc cctgactggc agggactccc tctggagctg cccgctgcga ctctgtggcg ccagcacagc gcccccaggc gttcatctcc gcccacccgg 840 agaeccgccg 900 ccaiccgtgg 960 tgggacacgc 1020 gacaaggagc 1080 gctcggaggc 1140 cgcaggttgc 1200 cttgggaacc 1260 gctgcggtca 1320 gcccccgagg 1380 agcccctggc 1440 ctctggggct 1500 ctggggaagc 1560 tgcgcttggc 1620 cgtgaggaga 1680 cccaggtctt 1740 cggaagagtg.18 00 cagctgcggg .1860 ggaagatgag cgtgcgggac cggccgcaga gcaccgtctg gtgtgtacgt agaacaggct gacagcactt, gggaagccag tgcggccgat gggccgagcg cgcggcgcc ggcgcacctt aggtggatgt ccagcatcat ccgcccacgg agccgtacat tcgtcatcga tacgcttcat ggat cccgca agaacaagct tcttgttggt gtgtccctga aagacgaggc ggtgcggcCt cccggacctc tgcgccgcaa aggt-gaacag acaggtttca catttttcct agaacgcagg tgcagtggct tgccactccc cgacgctgac tcctggactg cgccgagctg ctttttctac gaagagggtg gcccgccctg ~ccgacgtcca.1920 tgtgaacatg gactacgtcg .1980 tctcacctcg :ggcctcdtg :gtgctgcgt gacgggcgcg caaaccccag gcacgcccgc gcgacagttc gcagagctcc gtgccaccac gggctccatc ;tttgcgggg gacacctcac gtatggctgc cctgggtggd gctgctggat catcagagcc actctttggg cctccagacg cgcatgtgtg gcgcgtcatc gatgtcgctg gtgccaccaa ggggtcactc igccctggag atggccaccc ctacgtccca cctgagtgag gcctgagcga cacaggctgg cactccccac ctcctttgcc tgggaatttg gggggtccCt agtttttcag agggtgaagg 2040 ggcgcctctg 2100 gtgcgggccc 2160 tacgacacca 2220 aacacgtact 2280 aaggccttca 2340 gtggctcacc 2400 tccctgaatg 2460 gccgtgcgca 2520 ctctccacgc 2580 aitcggcggg 2640 ctcacccacg 2700 gtggtgaact 2760 acggcttttg 2820 acccggaccc 2880 agtctcacct 2940 gtcttgcggc 3000 gtgtgcacca 3060 ctgcagctcc 
3120 tctgacacgg 3180 ggggccaagg 3240 gcattcctgc 3300 aggacagccc 3360 gccgcagcca 3420 gcccacagcc 3480 gggagggagg 3540 tgtttggccg 3600 gtgtccagcc 3660 cgctcggctc 3720 ataggaacag 3780 ttccaccccc 3840 gagcgaccaa 3900 gtgggtcaaa 3960 rtttgaaaaa 4020 4042 rccctgtc acgccgggct ccgctgg gagtctgagg rctgagtg :tgccgtc :accagga :tcacccc ~aaggacc ~ggaccct taaaatac tccggctgag ttcacttccc gcccggcttc tcgCcctgcc ctgggagctc gcacctggat tgaatatatg <210> 3 <211> 11276 <212> DNA <213> Homo sapiens <400> 3 acttgagccc aagagttcaa ggctacggtg agccatgatt gcaacaccac acgccagcct tggtgacaga atgagacccc gtctcaaaaa aaaaaaaaaa aattgaaata atataaagca 120 tCttCtCtgg ccacagtgga acaaaaccag aaatcaacaa caagaggaat tttgaaaact 180 WO 99133998 WO 9933998PCT/EP98/082 16 4 18 ataeaaacac atgaaaacLa ttaaaaagga aat tgaaaaa acccacggta tacagcaaaa t caaaaaagt agaaaagcca gccaaggcgg gcagatcgcc aaccttgtcg ctactaaaaa cagctactcg ggaggctgag gagccgggat tgcgccattg aaaaaaaaaa aagtagaaaa aaaagcaaga gcaaactaaa cagaaataaa tgaaactgaa ttttgaaaag ataaacaaaa acctaaataa ataaagtcag caaaggatca ctagaggcta aaaatagata aattcctaga agcccaaaca gaccaataac agagaagccc aggacccaat gaattccaat cctactcaaa tctacatggc cagtattacc aacaaaaaaa cagaaagaaa aaatcctcaa caaaacacta gtgatcaagt gggattat atgtgataca tcatcccaac cagaaaaagc atttgataaa tatacaagaa acatacaggc aggccaaggt gggatgattg gagacctggt ctacaaaaaa tagtcccagc tagtctggag tgcagtgagc catgaacatg gaataagaag aaggagaagg aggaggcgga ggagaagtgg tttcaacata ataaaagcc agcccttcct ctaagatctg '5 tagtactaga agtcctagct ctggaaagga agaagtcaaa gacttaagac accactaaaa caatgtacaa aaatcagtag caaaaaagca gctacaaata tctacaatga aaactataaa agatattcca tgttcataga agcaatttac aaattcaatg agaagaaaca attctaagat cctgaccaaa aagaacaaaa agctatagta acccaaacta gaacagaata gagaatccag aggtgccaag aacatactt ctggatatcc atatgcaaaa aaatcaaaat ggatgaaagg aacaccggag aaactcca caggcacagg caaccaaagc tgcccagcaa aggaaacaat tttgcaaact attcatctaa ctctataaga aaaacaccta catttctcaa aataagtcat ctgatcacca gagaaatgca atggctta ttcaaaagac accctcggac actgttggtcttcctcaaaa 
aactaaaaat caaaaaaggg aatcagtgta ttcatagcag ccaaggtttc aaaatgtggt gcacatacac tcagttgcaa cagcatgggS S aaagacaaac ttttcatgti agaaatagag gagaatggtc aacaatatac tttatttaag gcagtgctaa ggcgcagtgg tgaggtcagg tacaaaatta gcaggataac gactccagc acttaaaaat cctaaaattg agataacaat ttgacaaacc agatgaaaaa ctatgagcaa tgcatacaac aataatggga ggcttccctg ctattctgaa ctgattccaa gaaaactaca gcaaaccaaa ccagggatgg aaaargaagt attctgcacc caggcacagt cttgggccca cttttttaaa gctgaggtgg tcactgtact agaagggaga aaggggaagg tatatgacag gaaaatgaca agagcaatca ttatcctgtt aactattaga tatttctata aaattaaaca aigttgataa ttggaagaac caatccctat ttgtacagaa ctggaagcat catggtactg aaacaaatcc ggggaaaaga taacaatact cttaaatcta ggacattgga *aaaaacagac caacaaagag *caaggaacta *ataagctgat *acaaacggca *aatcaaaact aggcaataac Fggaatggaai aaagctacca itcaacaagct Igaagcaacci -aatggagtac j ggcactggtc ctccctcaci ;gttctagag i aataactaai ttctgaazga caaatgataa gaaggaagtt ctcacgcctg agttcgagac gctgggcatg cgcttgaacc tgggtaacaa acaacctaat gtaaaagaaa acaaaagatc tttgcccaga agagacat ta ctgcacacta ctaccaagat ttaaagccat ctggatttta.
aaatagagga aaccagacaa
ccagrgagc cggaaacata tatagctata taatcccage cagcctgacc gtggcacatg caggaggtgg gagtgaaacc gatgcacctt agaaataata aacaaaatta ctaagaaaaa caactgatac ataaattgaa tgaaccatga aataaaaagt aatgaagaaa acCtctcaaa agcagctaca acittgggag aacacagaga cctgtaatcc aggttgcggt ctgtctcaag aaagaactag aagatcagag aaagttggtt aggaaagaag cacagaaatt aaacctagaa agaaatccaa ctcctagcaa 2 3 3 4 4 5 6 6 7 7 8 9 9 ccaatcattt .aaagaagaat aagaatacttr ccaaactcat aaacacatca .aaaacaaaca ggccaatatc cctgatgaat ttaaacaaca aaggatggt t acaaaaacta cttcatgata ggctcacacc ggagcttgag aaattagcca gagaatcact ccagcctaga agggagggag ggaagggaaa accgaggtag agggcccact gataagagaa tgcagatgat gctgaaattt ccttcgaaag caacatatgc tatgattatt aaaaccctca tgcgatccca actagcctgg ggcatgatgg taagcctagg caacagaaca aagggaggag gaggaagaag tattatgagg ttcaccactg agaaataaaa atgatcttat ggtacagcag ttccaacagc aaaeaatctg gctaggaatt aaccaaagaa aagaaattga agagggcaca aaacactgtt aaaatgtcca taaaatacta atgacgttct.
ccacaaaaga ,cccagaatag cacattacct. gacttcaaat gcataaaaac-.agatgagacaatgcatctac agtgaactca taatctcctc aataaatggt agaactctgt ctctcaccat aaacctcaaa ctttgcaact gtgggcaaag acttcttgag aaatgggatc atatcaagtt aagagacaac ccacagaatg ataaccagta tatacaagga tttcaaaaat aagcaaaaga aacaggcatc tgaaaatgtg actatgagag atcacctcat aaatgccagt gaggatgtgg ttgccaccac tatggagaac tacagcaatc ccattgctag atctccactc ccacatttac cagtgtccat caacagacga tacgcagcca taaaaaagaa agtacgttaa gcgaaataag tgtgggagca aaaattaaaa ggtgggggac agggtgacta agagcacaat tgggttgttt actgatacaa 1 atcattcatt 1 aaatcaatca i tcactttatg 1 aaaaaccagg I gcactctggg I gcaacaaaat I :atatgcctg 1 aggtcgaggc 1 agaccccact 1 ;aggagaagg 1 aagaaacata 2 aaaaactgaa2 tgattcaaca2 ggcatccaaa atctggaaaa gatacaaaat aaaaagaaac gtgaaagatc aaaaaagaaa tactacccaa tcacagaaat ccaaagctat tatactacaa tggaccagag tttttgacaa gctggaggaa atacaaaagc actaaaagaa taattccctg aaaaagcttc ggagaatata gctcaaacta t ctgggt aga ctcaacacca cccagttaaa ataaaaggaa agtttgaaag gtatatactc tgcagcactg atggaaaaag tgagatcctg ccaggcacag caattgacat gagtcaacaa gtaacacaaa 00 so 00 s0 00 020 *0s0 140 *200 .260 .320 380 .440 So00 .560 .620 .680 .740 .S00 .960 .920 .980 ~040 ~100 ~160 2220 1280 2340 2400 2460 2520 2580 2640 2700 2760 2920 2880 2940 3000 3060 3120 3180 3240 3300 3360 3420 3480 3540 3600 3660 3720 3780 3840 3900 3960 4020 4080 WO 99/33998 WO 9933998PCTIEP98/08216 18 gaaaggataa atgcttgaag ttgtatgcct gtatcaaaat aattaaaatt ttaatggcca gccgaggcgg gtggatcacc aaccctgtct ctactaaaga ccaactactc aggaggctga tgagccgaga tcatgccact acaaaaacaa aaaaaagaag ctactatatt agaagttaaa aaataagaac aatgtatgtg tggcagaaat gtgaggaggg taagtgactt aattttaacc cctaaaacaa ctgctaataa tatctctaaa atcgagctgc atctgtatga atcctgaaac tggcaggaag caggtggctc aagacccagg tgcaaggcag gggcgatgag aagcctgcct gcaaagattg ccctggatac agcctcctgc tcaaacccag agaggagtca aggcacctcg aaggtgtatt ctgagggaag gcacccttct caagggaaaa ccctctcttg ccctcgcggt gcccgtgctg aggaccctct ctgagcctgg cttaataaca caaagactta attccatgag aaaacaggaa ctgagctatg 
atttttcgcc ctaagtact gtacacgagg agaggcctgg gtccccggcc tgggaggctg ttgggcaacg cgaaggcggc gcccacccac accaacccag cctgcctgcc tgcttccctg atgtaaatta cacgactctg cccaaggact gaatgattcc tacaaacacc actcttttac agtccaggag agatgaggct agtctgttcc tcragactag tgctgcttcc cgagggcgce cagccctgtc tccacacccc ctgggtgggc cgtgttccag acgtagctcg cacggttcct gcgttgaagg gaggagattc cgtggccccc gatgcaggtt tccccgtctc ctgtcatctg tccacgtcca gctgcgtgtg acgggggcgt ggtgggccag agcgcctgtc ctcacctagg cccgccctc tctgcccagc ttccacaagc actaagcatc gccccacagc cctgggaatt ccgacccccg ctgttttatt tttaacaaac tggttaaaca gaggaacatg ccgtttatae tgccacctcc atgggatacc aggggagtgg tcaggggggt cctttacta aagccagttt cataggggag tggggatggc cctgggcagg ataatgctct aacccgcccg gccccagggc atccttcggg actacctgci ,4 aggagggtca gaggggggci gtgacagata atctcatgta ggcacggtgg tgaggtcagg tacaaaaatt gacaggagaa gcactgcagc attaaaattg aattaaaaca gggtttctag aacagtggaa aaagacaggc tggtgaaagg agaattggca tggagatt gaaaaatggt tgtggacctg aggcctgatg cgtt~ggtgag catctggaaa gccagcagct aagtatggct cttgagttag ccagacgccc ttctgatcgg tgcaaagggc aactgggatg taaattcaac tcatcaaata tttgccaagg tttattggtt gcggcagggc acagcaggac cacgctgcgt gaagtcacgg ggtgggtcaa ctgatgggga agcaactt taggcccaca gctttcagcc tagaccctgg atctgccctg ccgcctccag cgctactgtc cetcacatgg tgcgcctccc cctggcgtcc ccggggcctg tctctgcccg ggcgctcttg tccacgggca actttcctgc ctcttcccaa *cacgtgacta *ttaacagcta Laacgggccca Lagcctgcagg Itacgcaacac taaggacggt cct-ggttct9- Iggaacccgga agagatgCc ctttgcaggt i ggcccgaaaz i gcctcagga( ccccatttac tgctatagat ctcatgtccg agtttgaaac agccaggcgt ttgcttgaac ctgggtgaca taatttttat attataaaag cttcegaaga gttactgttg tgggagaagt taatctat cgtctgatca cgat tgtgtg ggtgatttcc agccacttca acccgaggaccagcgcatga aggcggccag atggcgccca taaatctttt gctctgcggt gacagagtga t ccacagac c tggctggggg cctttccacat acattcagga tccaaggact ttcataaggt tatgagcacg caccgaccgt gtgactcagg agctctgaac gggtaatgaa ccgttccttc tcgggtgtga gagcacggsc accaggctgg caggcactcc gagactcagc gcctcagcrtt tcacctgtcc ggtgtctgtc agactggcic 
ggctgcacgc ccggtgigtt ctagggtctc ggaaatgcaa caggcctgg ccccctccct aagaccagc cgcacatcat caaagcagg tccgcacggt catc~caag gctcaaaaac gggggcggci Iacggtattg Lggctgtgcci acgtcctgal gtgatctcc(, Lgrtaatccag! gargagc cctgatgtga ataaacccta taatcccagc cagtctggcc ggtggcacat ctgggaggcg gagcaagact gtaccgtata gtaactaacc agtaaaagt ttagacgctc taaagaggca taattaccaa caccgtcctc ttcgtgtttg tccagaagaa atcttcaagg.
aggaaagctc .agtgccctta cgggaatgcacccgggcgtg ~ttcacctga taaaacagaa catttacctc cccccgtgga cccgccccgg cggacagcga ccgaatggat ctgcagaaat taataaccat ggcttagggt gcagggccac cctccctggg accccatacc ccgtggaaac gtggtgtgca catcattatt caagccatga cacacccctg ggtgacaaca cccagattct ctggggtgcc.
.Ctccagcagc cactgtgtct tccttcccca ctccgagcct *tgacctccat ctctgtttc *ggggttttta catrttgggtg gatggagccc *ctggaacaca *attggcaccc *gtacacactc aaatccctgc *ggacagttcc Igaactacgct ;aaagaatttc Igcr-gggggct ctcigttatg t cccccaaac t gaggacccc 3 ggttctggga a gtcagtctga ttattacaca 4140 ctatattaaa 4200 actttgggag 4260 accatgatga 4320 acctgtagtc 4380 gaggttgcag 4440 ccatctcaaa 4500 aatatatact 4560 acttaatcta 4620 atggccacga 4680 atactCtCtg 4740 ttctataagC 4800 taattacaga 4860 tcattcacgg 4920 gttaaactta 4980 ttagagtacc 5040 gtctctggcc: 5100 ggatgggaag 5160 tttacgcttt 5220 aggagtcaga '5280 tgccagaggg 5340 agcagtgacc 5400 agtcatggaa 5460 tttcctctct 5520 gcttctccga 5580 agagaggagt 5640 cggcgggatt 5700 tggatttta 5760 ccaaaggcgt 5820 gttcagaggg 5880 gcaagggaaa 5940 cggggagaga 6000 agctgccaca 6060 ggcttcctgg 6120 gaacatgacc 6180 ggaaatggcc 6240 catcttcacc 6300 caaaactcag 6360 atatatiaag 6420 gcggctgaac 6480 agggcctggt 6540Or acactgaggc 6600 ttccr~aaacc 6660 tgtctcagcg .6720 acactcacat 6780 gaacctggct 6840 ttccaggcgc 6900 tgtgCtcctt 6960 taggcatagg 7020 tgaaagtagg 7080 ccgccaggga 7140 gagtggcagt 7200 ctggacattt 7260 ccgtccacga 7320 taaaatgtcc 7380 tcacagtgaa 7440 gagtcaaaac 7500 accccatggc 7560 actgcacgca 7620 ggagactaac 7680 gcccgagtgt 7740 ctgtggacag 7800 gaggtctggg 7860 agaggcgggc 7920 ggcigaaaag 7980 WO 99/33998 WO 9933998PCT/EP98/082 16 6 18 ggagggaggg ccccgagccc gggaccctcc acggagcctg atcgtggacc tccggcctcc gtgtgtgggg atttgcagaa aaaacaaagg tttacagaaa ggcaggcacg agtgatttta gttatgctct tgttgcccag cgtctcctgg gttcaagcaa gtgcaccacc acacccggct gtcaagctga tctcaaaatc ctgggattac aggeatgagc ggctcaagcc acacccactg gttaccctcc tttgatattt catattcaca gtttctgtga gaggctgcag gcttcaggtc atcagggcgc aagtgtggac gtagaaatta aagtccatcc ccaggggcag aggagttcct cactgctggt actgaatcca ggtttcactc ttgttgctca gcccctgcct cccaggt.tca caggcacccg ccaccatgcc ggggttcacc acgttggcca tctgcctcct aaagtgctgg tctgtttaga aacatctggg tttaagccaa tgatagaatt gatgactaag acatcatcag gccttctggg tatcagcaac ggtgttaatt actccagcat tggaacaaat 
tttccaaacc ccctiaaaa aggcttaggg agtatttaca agacgaggct ctgttcaaat gctagctcca aaggttacat ttaaggttgc atccctgcaa ggcctcggga aacccggagt ctggattccc ggtctggagg ggaccagtgg agctctacag tccgaggct tgcgggcggg atgtgaccag gtcaaggccg ttgtggctgg ccatttccca cccttctcS ticgctcatg gtggggaccc tgtcaaggag cccaagtcgc gcccgtccag ggagcaatgc cctccccttc acgtccggci gaggcagccc tgggtctcCi cccagggcct ccacatcatc cctctctccg ctggggcCCl ggcggggaag cgcggccca ggccgggctc ccagtggatl gagggactgg ggacccggg, cgcggacccc gccccgtcci cccagcccct ccccttC agcgctgcgt cctgctgCg aggcctgcaa cagcaggaag gtgccatagg gcaacaggaa catccaagg~a tttagctatt gctggagtgc ttCtcgtgcc aattttgtat ctgacctcag cactgcacct gtaaggaget tctgtaattc ccacctgtta ccagtggggt actgtcctga ctcctactct ctcactcctg ctgtttcatt ggctggaggg agtgatctc cagctaattt gcgcctccag gcacggctgg agggeaCtcg acccatgcac cagggctgaa ttattat agcggcatga tcagcctCccc ttttagtaga gtgatccgcc ggcccattta catggagttc ttcgtagact tcccatggga tgccatctgc atctcaatgt actgggattg tggaggaagg agtgcaatgg ctgcttccgc aagctggaaa cccttagccc cgctgccCtt tgtgaatcta gtgcctccgg tacttacttt t citggctca aagtagctgg gatgggcrt cacctcagcc accattttaa aatttccct ggggatacac cccactgcag cagtagaaac ctcagtgtgt agcccttcc aatgatactt cgcgatcttg ctcccatttg tagcagagac cctcagatga atgcccagcc ccactcaagt ctcttgatgt tgcacccata cgtttcccg tctcttccct agctacaact gagacaattc aagagcgacc gcccagggag ggcagtttc ttgccgacct ccttagatc gcggttgtgC ggctggaagt cgaggctgcc acagagtgcc aagcggggaa 8 accagggccc a ctagcatgaa 8 ggattatttc8 gcaagggcag 8 ctgagacaga 8 ctgcaacctc8 gattcaggc 8 caccatgttg8 tcccaaagtg aacttccctg ttactcagga cgtctcttga gggcagccgg ctgatgtaga :gctgaaacat :ctatcccccc ,tgttatttit .tttgagaggc, :gcttactgca gctgggatta 9ggggtgggt tccacctgcc cagaatttac gttgtggigt tttacactgt atactggggt ccatgcacat cttttaaaat taacttttgt acaaacacag tgtaatccta ggtgegaggc gaaagtagga cagctacagc aaacttgagc cggggcccca cgggcctcct ctccaccctg ggggcccagg ggctggcctc gaacittcga gattacaggt gtgagccacc tctgaggtag gaagctcacc rtttattgt cttttcaaag cttcattgaa aatcttctgc ctctgcagag 
gcccctttgc atcactaagg aacctccagc taaataaagc gtttgttagc gacccagaag gggaagtcct ccgtgtggct ggagccaggt atgttggcct tgtgaggcgc acgggaccgc ctcgccgcct ggggaagtgt gtcctcgggt ttcgtggtgc gatcaggcca *gcccccccCt *cgctggcgtc acccccgggt *cgcgggcaci acccgtcctc gacccctcccc tccgcggccc acgtgggaac tgrttagaaca acacactaac tgccgggagg ttccaittct aaccagtgta cctagtggca ggatttctag gagcgtgaca aatttCCtCC tttctcgccc cagc igtcct tctactgctg gcctggaccc catctgccag ccggtgCgCg cccggi-ggglgagaacctgc tgcagggags icgtcccca ccggagCcc *gcggccaaa cgggttacc cctgcaccc ccgcccgga igacgcccag Iccccttcac *gggtcCCcg -cgccctcte F ccctggccc 1040 3100 1160 1220 1280 1340 1400 1460 3520 8580 8640 8700 8760 8820 8880 8940 9000 9060 9120 9190 9240 9300 9360 9420 9480 9540 9600 9660 9720 9780 9840 9900 9960 10020 10080 10140 10200 10260 10320 10380 10440 10500 10560 10620 10680 10740 10800 10860 10920 10980 11040 11100 11160 11220 11276 19 9 9 9 c g c c gccagcagga. gcgcccggct gattaacaga tttggggtgg.
aaagagaaat *,gacgggcctg cactccggga-ggtcccgcgt ccgcgtctac acgccccgcg ggtcgccgca cacagcctag gggagcgcga cagctgcgct accgcgctcc ttccagctcc cccagccccc tcgcggcgcg ggccaccccc gcgcctccgt tccggacctg cgcacctgtt gccgattcga gcggcgcgcg gtcggggcca ccacgtggcg gcctcctccg tccgggccct agtttcaggc gcgatg <210> 4 <211> 104 <212> DNA <213> Homo sapiens <400> 4 grgggcctcc ccggggtcgg cgccggccg gggtcgaggg cggccggggg gaaccagcga 1 N. catgcggaga gcagcgcagg cgactcaggg cgcttccccc gcag WO 99/33998 WO 9933998PCT/EP98/0821 6 7 18 <210o~ 4211>' 8616 <212> DNA <213> Homo sapiens c400z- gtgaggaggt 9 aaaagggggc a gtttgcataa a cgaggccaga g tggggagaag t tcctcttcgc a gggtgggagg t gggaagctga S ggtgaaaccc t taatcccagct tgcagtgagct cr.ttaaaaaa actgttctcc i agaggacagc atggtgctgc1 cctccgctcc gaagtcccga cagacaagga aaaagtcata tgctaactcg gtccctaccc agtcagataa gagagtttga aggtcacaat cacgtgcagg ggcgcggccc ccctgtcacg gagtgaggcg gggtgtccct tagggtgagt gtccctgggt cacgtgcagg ggcgctgtcc ccttggcgt gcgccggttg cttttctgat ggactgcagg gcctcigttg tctcccagct ttgctggaga cccctcact gcaacgcttg ccagtcgccc cctgctctga ttgggtctta cctctaagtg cactttcaag cttttaagta gtttatgttc ggagccrtgc acatcctgtc aagcttctgt gggggatggg gcatgcacgt ~gtgccgtc Lggcagagcc ~gacgtcgag Lcttacgagg rcagtgaaca ~gtctggaag Latttcaagg :aagggt t tt ~tgtgttgac ~gcaggtgga :atctgtact :acttgggag :gagattgtg Laaaagtgtt ~gcacagatc igatggctcc :gggccCtgc IgcCccctt tttcaccccc ;ggtgacct :aacatgaga gcggtgttta atcgaacggc gcgtcatgca gttctctgat :tgcccctgg gtgagtgagg ccgggtgtcc tgtagggtga tggtccccgg gtcacgtgca gaggcgcggc gtccctccca gtgagtgagg ctgggtg tcc tgctcacttg cccattgcct gctcggctct ctctcgcctc ggcctggct tgtctcatgc catcccagaa gtcctgtttt tcaccttatt cctcacatgg gacccacgtg gttttgaatt ggrtttattct ctgcctcacc tgttcttaaa tttctttgtg ttctttagct aagatatgta ctaactcagt tttgtgatct aatagtgggc ctcctttac tctgttcatt ggtagaatt gagggcccag g ctggtcctcc t tggacacggt g ttcaccttca c gaggaggctg g cacagacgct c gtgggaatga g gcaggtgcac g ggccaggtgc g tcacctgaggt aaaaatacaa a gctgaggcag S ccattgtact cgttgattgt g 
ctggtcccat acctgctgag 5 cgtgtccccac tggctcccag t tccccacaaa cttggggctc t ttggcacicc cagcaggttg agctgcctca acccagtttt caggactctg cttatgcagg cgttgccccc ctgtcccgtg gtgaggcgcc gtgtccctgt gggtgagtga cccagggtgt ggtatagggt cgcggcccec ctgtctcgtg agcttgctcc gggtagatgg ttggt cac ccgcgtgcca gctcaccacg cgaggctgga agggt tct ct ctcccaagct ctgggcacct attgacgtcc gagggccggt tcactgattt ttcattcctt tgcaccccgt atacttcaaa cacgctgtgt tattctgtga *gagtatcaag *tgtgzagcgg *agtgtgtgca atgcatgttc Iatgcatgaaa tcttctcgtt ttatcttcct rccccagagc t gtctccatc g ratctctgcc t rgcgcggcag t :tggcgaggg t ~aggtgggga c ~cggtcagcc a rggctcac .g ;caggagt.t g Laattagctg g ~agaatcact. t :cagcctggg-c ;ccaggacag g :ttaggcat g ;aagggacag t :cctgttttt c :gctcccagg c ctcccaagac a :tttttttct taacaccgttt :ttgaaatgc t :acctgctgc 5 gctttttgtg cctgtcattg gagtgaggcg aggtgtccct cagcgtgatt atccccgggt .ccgtgcagg ;gcgcggtcc ccctgtcacg1 gagtgaggca gggigtccct tagggtgagt.
tgaatgtttg
tgcaggcgca
ctctccgttc ggcactgcag tgcccgccac ctctgggctg gtgccctgaa gcccctctgc gccgctcat t agccacaggt gtctccgcca acctctgacg tcctagcttc gttttgatgt gtgttaatac tttgacgtga tttctttgag atacgtagag tctgtataat tggtttccag actatatcca ttccaagaag tggtagcatt gatgagtgaa gaatgcagt a rtcacgtggg c ccgctctce c racacgcggt t ggagccggg t *gcctgcagg t :gagaacccc c .atatgcagg t rccggtaatc c ragaccagcc t gcatggtgg t gaacccagg a :gacaagagt S ;gtagaggga S raagagggcc z :gtttgtggg :cggatttga t :cctaccgtg S Ltgtaagactt :tttttctttt :tctgtgtac i :gcgtcttgc S ;gctcaggtg 5 :tccagctic :tgtictctg :ggtccCgg ;tcacgtgta ;aggtgtggc gtccctgtca gtgagtgagg :cgggtgtcc tgtagggtga :tgtcCCcgg ctcaggtgca gaggctctgt ccctttciat gtgct~ggtCC cattttgcta ccacagcttc atgcatgctg cctgtgtCtg ggaaagcaag ttggccCCCt gct taggctg tggagtgtct gcctcgtca tttctatctc ttagtttagt gaagtaatct ttcttttaag aaccattttg cagtgagtta tattttaagt accaattatt aactgtccat gcttattaag gaggccatag tatgtgaggc tcttttggag ggggctcag :acacgtggc 120 :tcctgtcca 180 tccaggcgc 240 tgccggcaa 300 ~tacctataa 360 :tcttcctgg 420 ttgtgttta 490 :cagcacttt 540 :gaccaacat 600 :gtgtgcct~g 660 LggqggaggC 720 ~aaactctgt 780 ~ggagataag 840 icatgggagc 900 :gttcagggg 960 :gttgaggaa 1020 ~cagctagaa 1080 :ccggccatg 1140 :tatggtggc 1200 Igtgcagaat 1260 ;tgactggaa 1320 ~accacgccg 1380 cttcgttgag 1440 icttcagatg 1500 ;tgtccctgt 1560 ;ggtgagtga 1620 :cccgggtgt 1680 :gtgtagggt 1740 :actgtcccc 1800 ctctcaggtg 1860 gtgaggcacc 1920 gtgtccctgt 1980 gggtgagtga 2040 ccccaggtgt 2100 agccacagct 2160 ccaagcctat 2220 cggggacacg 2290 aggtccgctt 2340 ccaatactcc 2400 ctgccacgtg 2460 tcaccccagc 2520 tgggtgggtg 2580 ggctctgcct 2640 ctgtctgtct 2700 gacttccctc 2760 tccattgtat 2820 catgcctttc 2890 caacatcagc 2940 tatccttatc 3000 atatcagtga 3060 tttgaacact 3120 tatcatttta 3180 tgaagtctgc 3240 tgtaaatttg 3300 gtccagtgca 3360 tccctcacct 3420 attgttaggt 3480 acttctatgt 3540 I R
-o 4i/ ~'VT 0 WO 99/33998 WO 9933998PCT/EP98/0821 6 8 18 ctctagtaat ctagtaattc tcacaggtca gtgtaacttt agtagctgga actgcagaca gacagggtct tgctgtgttg tacctcggct tcccaaagtg caacactttt atattcttat caatccagtc tgacagtcgt actagagacc cgcctggtgc ttgtttctca ccacctcttg ctcgttgcct cctggtcact ggctggagtg taatggcaca gctctcattc ctcaacctca ccctgacctc aagtgatctg gccaccgtgc ccggeaiacc tcctgagcaa taagaccctt ttttccctgc tgacttagtt Ettgttcccc gtctgcttC tgcgtggttc ttctgtcttg atggcatcca gcgacgtccg tcacgaggag ggcggtcatc cttagccagt gagtgacagc catgtcgggg tctggtggct gcgccagctc tgacggtgct aaccaggaca aaggatgagg atgtggataa tttiaaaatt ctttgggagg ccaaggcggg atgatgaaac cccatctgta tgtaatccca gctactcggg gttgcagtga gccgacattg cccaaaaaaa aaaaa aagaaaaggt gaaattaatg atcattttag ggtgttatrg tttgtctgcg ggatcccgtg ggcttcccat ggccatggct ccctcagtga gctggatgtg ecctgctgtg agctggatgt gacctcggtg agctggaggt gccctgctgt gagctggatc ggccctcggt aagctggagS ggccctgctg tgagctggat aggccctgcg gtgagctggS cagacggtgc cagaccatg( ggtgaggttg ccaggccctc gtgtgaggtc accaggccci ggggtgaagg ccgccaggc( gtctggagtg aggtcgcca( gtccggggtg aggtcgcca! ggtctggagt gaggtcgcc.
ggtccggggt gaggtcgcc, aggtctggag tgaggtcgc, caggtcaggg gtgaggtct.
caggtccggg gtgaggtcg gcaggtctgg ggtgtggtc gcaggtccgg ggtgaggtc tgcaggtctg gggtgtggt tgcaggtccg gggtgaggt gtgcagtccg gggtgaggt -T gtgcaggtct ggggcgagg gctgcaggcc cggggtgag gtccggatgg tgcaggtcc tttttttaaa tgtgtctgt gagtctrggt taccitctgg cgcacegcta cccaggctgg ctgaattaca agtgtgggta tgtttaactg actctgattc ggttgccatg gggcatttgc tttgcttttg atctcggctc tgagtagctg gagataggct cccgccttgg ttgatcttt agtgtattt ctatctcagg gagtgtttct tgtctcaggc ttattgccgg gggacctctg ttggcccgtg aacgtccgcc ccgcggtgtc gcctggcggg ctccgagccg tctaggctgg tggatcacga ctaaaaacac aggctgaggc caccactgca aaaaaaaaaa taataataga gtgggagcat tgtaggtccC *gttgtaccag fcagtgtccgg *gtggigtctg atggagtccg r gtggtgtct ttgctctcag ttctgccttt ctgtcgccca cctgagccgc cacctggcta tctrcaaactc ggcatgagcc tgtcctgtta gataacer~ga tccacttgcc tgcgtttcct ttttaitict tttattgaga actgcaacct ggar.tacagg ttcaccatgt cctcccacag tactgccaca aattaata gggtgagtgc cctctcacct atttttaaat ttggactcaa accatgtctg acagcatgta tttattttca tgttgcatgt gccgagtgtg ctttgcttag cagtctcact ctgcCtcctc cgcccaccac ctgggcttct tatatatata agtggtgtga cagcctcctg tttttctgga gggatccatc gcctaattt ggtgaatttc tttttgtc cctcgttccc tgttgatcct tgttaccccc ctgtcaccca ggttcaagca cacgcctggc 3600 3660 3720 3780 3840 3900 3960 4020 4080 4140 4200 4260 4320 4380 4440 tggccaggct ggictcaaaC 4500 tgctgggatt :acaggtgCaa 4560 aaaatgaagt ctgaaacart t;.gcrcaccCttg 4620 agctrctggcc accccccagc 'Ctgtgtgctg 4680 catcttgaca gtagctttgc ccgccgtctg taaaccccag cttatgatgc agtgtctgga cggcctgggt gagtttgaaa ggagtgtctg ttgtcgccca gcgcggtggc ggtcaggagg aaaaattagc aggagaattg ctccagcctg aatt~ctagt a cac-tcacagg gtgcgtggc atggr-gcagg atggtgcacg gatggtgcag gatgatgcag ggatggtgca cccccacaag-ctaagcatta 4740 ccccgccctg gggtcccctt cttacctgt acagatgaag gcaccacgcg tcagcctgga tcgcgcaaac cttcCtCCCt acaggagcat tcacgcctgt tcgagaccat tgggcgtggt cttgaacctg gcaacacagc gccacattaa gcccagcatg acatttgaca atctcggcct tccgggatga tctgggatga CttC~tCC 4800 CCtgtCCrtt 4860 gCtggCC~CC 4920 atgtggagaC 4980 gccagcgttc 5040 aaaccccagg 5100 ctgcggtgtg 
5160 tctgcttggg 5220 gacgtgagcc 5280 aatcccagca 5340 cctggccaac 5400 ggcgggtgcc 5460 ggagttggaa 5520 gagactctgt 5580 aaaagtaaaa 5640 tccacacctc 5700 ttttttgagc 5760 ggacctgctg 5820 ggtcgccagg 5880 ggtcgccagg 5940 gtcaggggtg .aggtctccag 6000 gtccggggtg .aggtcgccag 6060 ggtcaggggt .gaggtctcca 6120 tatggagtcc ggatgatgca gtgtggtgtC tgtgcggtgt ggtgagctgg ctgtgagttg gctgtgagcc cctgct tgtg gccctcggtg accctgcggc ggccctcggt gaccctgctg agaccctgct tggatggtgc ctggatggtg atatgcggtg gatgtggggt ggatgtgtgg agctggatgt agctggatgt gagctggatg gagctggatg tgagctggat gtgagctgga ggtccggggt aggtctgggg caggtctgga tccggatggt gtccggatgc tgtctggatg gtggtgtctg gcagtgtcca tgcggtgtct tatggagtcc gtgcggigtc tatgcggtgt ggtatggagt atgtgcggcg aggtatggag gatgtgcggc gaggtatgga ggatgtgctg ggatgtgctg tggttgtgcg tggatgtgcg tggtgggctg_ gaggtcgcca 6180 tgaggtcacc 6240 gtgaggtcgc 6300 gcaggtctgg 6360 tgcaggtccg 6420 gtgcaggtct 6480 gatggtgcag 6540 gatggtgcag 6600 ggatggtgca 6660 ggatggtgcc 6720 tggatggtac 6780 ccggatggtg 6840 ccggatgatg 6900 tctggatggt 6960 tccggatgat 7020 gtctggartgg 7080 gtccggatga 7140 tatccggatg 7200 tatccggatg 7260 gtgtccggtt 7320 gtgtccccgt 7380 fgatgtgccgt 7440 caggccctcg gtgagctgga caggccctgc ccaggccctc ccaggccctg gccaggccct gccaggccct gccaggccct caccaggccc tcgccaggcc gggtgaggtc tgtgaactgg ggtgagctgg ctgtgagctg cggtgagctg gctgtgagct gctgtgagct tgcggtgagc ctcggtgagc gctaggccct WO 99/33998 WO 9933998PCT/EP98/082 16 9 18 gtccggatgg gtctgcatgg gtccggatgg tgtctggatg gtgtccggat ggtgtccgga cggtgtctgg gctgtatccg tgctgtatcc atgcggtgtc gtgcggtgtc tgtgctgtat atgtgctgta gatgcgcagt gtatgtgtgt tggatgtgtg tggatatgcg tggtgggctg tgcaggtctg tgcaggtctg tgcaggtccg gtgcaggtcc ggtgcaggtc tggtgcaggt atggtgcagg gatggtgcag ggatggtgca ggatggtgca cggatggtgc ccggatggtg tccggacggt gtacggatgg tgtctggatg gtgtctggat gtgtccccgt gatgtgccgt gggtgaggtc gggtgaggtc gcgtgaggtc ggggtgaggt cggggtgagg ccggggtgag tccggggtga gtccggggtg ggtccggcgt ggtccggggt gccaggcctt gccaggccct gccaggccct agccaaggc tcgccaggcc gtcaccaggc 
ggtcgccagg aggtcgccag gaggtcgcca gaggtcacca tggcgagctg tggtgggctg gctgtgagct tccggtgagc ctgcggtcag cccgcggtta ccctgctgtg gccctgcagt ggccctgcgg ggccctgcgg aggccctgct caggccctgc ccaggccctg gccaggccct cgccaggccc tcgccaggcc gatgtgCggt 7500 gatgtgtggt 7560 ggatgtgcgg 7620 tggatgtggg 7680 ctggatatgc 7740 gctggatgtg,7800 agctggatgt 7860 gagctggatg 7920 ttagccggat 7980 ttagctggat 8040 gtgagctgga 8100 ggtgagctgg 8160 cggtgagctg 8220 gcggtgggct 8280 tgcggtgagc 8340 ctcggtgagc .8400 aggtctgggg tgaggtcgcc caggtccggg gtgaggtcgc gcaggtctgg cgtgaggtcg tgcaggtccg gggtgaggtc gcgcaggtcc ggggtgagt gctgcaggtc cggggtgagt gtccgaatgg tgcaggtcca gggtgaggtc 'gccaggccct.8460 tggtgagctg gatgtgcggt cggtgatctg gatgtggcat <210> 6 <211> 2089 c2l2> DNA <213> Homo sapiens gcccggatgg gtccggatgg gtccCtctcg tgqaggtrctg gggtgaggtc gccaggccct, 8520 tgcaggtccg gggcgaggtc .accaggccct 8580 tttaag 8616 <400> 6 gtactgtatc agcatgcgcc caggggcccc gctcacgttc aagcagaagg gatgtgggtc ttgcccagge 3 5 ctaagcgatt gcctggctaa ctcgaactca ggctaagcca ttcaatctat cagggagcac taggtggctg tgagattgtg gagaegecag tgactgtgga tccctggggg cagcagacct gccatttcct aatgcacctt accagtattt ccccaagatg cctcctccct agggcaccag acagatgccc gggaaaaggc tcettggctg ctttctacct gggacaggca caggctccct cccccagggt aggcgtctgg i:U tgggtgccct R A I tagtctgttg cccacgccag tgtctccact gtcacaggcc tcttacttgt gataaat tgattctCc tggagtgcag caccagcctc tttttgtact tgacctcagg ccgtgcccag tggatttagg ctgtgcaggg catttgaatg acagattcaa cctggctgag gggctttagt gccttgtgac cgtcagaggt tgcatctggg acttagactt tggaaagaat ctccttgtca ggacagggta ctccggagca aggtccaggt caagggcaga agctgccctg gggggtcctg tcctgtgtgg ggtgctgatg tgactatagg ctggcatggS gagccctcac tctggctgag gcctctgctt ctcgaagtcc tgccr-gtgCt tccctggctg tggtccaagt ggattctgtg aaaatcagga gtttgtgcca agatggaaac actaccacta tggcar-aatc ttggctcact agcctcctaa gtagctggga tttaggagag acggggtttc tgatccaccc accttggcct cccccgattc tcttttaatt tcatgagagg ataaaatccc agcacctggg gataggagag gctgtgagat tttgtctgca gctggatttg catcagtgag cccaggccat ggtattagct 
cagaagatca gggcttcccc acccatgcc ccaaatcagg aacacagcct czgggctggg ggagggtcag ggctttccc tacacgtatt taatggtgtg ttaattgggg tgaccggaag ctactgggac tgttgttctg ccgtgccttt tctactctgc cccgcggccc cagtgtccac gtggccgctc cagcccccgt ggtgtcagga gactggtggg agcagcctct cccgccctct cctggggcca gccr.tgggct aggggcatgg gttcacgtgg gtgggacagc caccctgggg accaggtgtc caggcgcccc tggacgtggc cccgggcatg tgagtcggtg ggggcttgtg caagcctcct gaggggctct tggaacacca gcccggcctc tgcagctctg ggctgggagc 120 caaggctctg agtggtctct gcctccttgc tgagatggag gcaacctcca ttacaggcac accatgttgg cccaaagtgc catgctgttc acccacttgg ttccaccatg atgttcggct ggacgggagc.
tctccgtgtc agctcccctg.
atgtctgcag gaccccgacg gtgggaacaa cgacccaaca gagcagacag cctggggggc tgggcctgcg ggagtgccag gcccccatgg ctcatgagag ccatctgaag accccagtgg ccccagatgc gttgaccgcc gcaagtagag gccttcagcg gcttcccgtg ctattgcag actgcctgga agggctgta tctcactctg :cccctgggr ctgccaccac ccaggctggt tgggtttaca tgtatgaatc cgactcactg agctaacttc gatgagagtg gctggtCtgg ccgcccaggc cacactcgag agggagctgg tggtgctggg gttaatacac tggtcatttg acgtggtggt cttggaggcc gcctgcggtc gctgtcagcc grggttttgg ctgattctgc ggatgtggct ctgtaccaga agcctgggac ggactgggcg gggctctcag cgtgctgccg agcttccccc
180
240 300 360 420 480 540 600 660 720 780 840 900 960 1020 1080 1140 1200 1260 1320 1380 1440 1500 1560 1620 1680 1740 1800 1860 1920 1980 2040 2089
<210> 7
<211> 687
<212> DNA
<213> Homo sapiens
<400> 7
gtggctgtgC gtatcagett cggcgccaac cagcgtgggg cagccgggcc gtgcgcagcc agcccaccgg ggctgcggtg gtgtggcatg tctgaggaag gggctgctiC gactgtctcc t ttggtttaa agatgaaggg ccatttgtgc gtgtaggggg agggcctgga tccgtgcgct gctctgagga gctgcggtga aggatcccgt ctgggagggg tcccctgggt catgctgtcc cttccttttt cccggaggag gcacagcgag agctcctggg tgcagcacgg tccgettacg tcctggacct ccccgtcatc gtgcaacaca ttctaggtcc ccctatggtg ccgccag aaacagaagt gcgtttgagc gggccacggg acacagccag gtggccgagg tgccggtgc gcagggacag gctctgagga cccgaggtcc tgccccacgg tgaggagagt catgcggcca cgggtctggg: gggtgggcac tggatccgtg accaggccac ctcctgcacc gtggggcgag ggaacCgt t tggctgggga ttggccggat cccadatttg ggccatggca tccagaaaag ccacaagaag tcctgctgtg gactgccagg ccacccctgt gtggacagag tcaaacaggg cactggggag ccactttcct
<210> 8
<211> 494
<212> DNA
<213> Homo sapiens
<400> 8
gtgggtgCCg gcacctcatg tgtcactgtt catggggccg gattccagt gcaccccagt ccctgccctt caggccctga ctccactcac gggacccccg ttgggtggag gaggacacac actgtgcacc tccgtcagag cctgagccag ggggtctgga gggcagaggt acag tgagcagccc tgctggacct tgggagtggc gaggtactcc tgggtgggcc gcagggagtg ctggcaccta gggtggaggc cttcagcct ctgactgccc .gggctCCtat tcccaaggag aaggaaccgc aacggctcag ccaccaggcc gggtctcctg tcctgaggct cagagagggg gtggtggggg tcagagagag agtgggggac gatgtctgag t ttctgegtg gccactgtca tgcctgattg caggtgaccc tcctgcagca ggtcccactg ccggtgccLt acacagcccg accgccaggc gtctcctcgc
<210> 9
<211> 865
<212> DNA
<213> Homo sapiens
<400> 9
gtaaggttca cgtgtgatag agaatgcagt cgtgtctgtg ctgtgatatg cgtgtgtggc tgtggtgtgt gtgtgtgtgg tgtgtgtgtc tgtgacacgt tttgtggtgt gtgtgtgcat ggccccttgg ccttactcct tcgggtgctg gtttggggag gtcctgtcac agggctgggC gctggtggta ccttcctgga tccatgagat ataggaaggc cggccggggg ccttggggct agtggtcatg agcacgctgg ggtgaagaag tatccctgga acctctttet ctgacttctt tcgtgtccag.
atgcgttt acgtgtgtgt cacgtgtgtg gcatgttcat gtgtccgtga cctccca ctccacattc cttggagact cccctggcac tgattcaggc cggcaggggt aggggtaagc gcttcggtt gagct gatgtgtgtc tctgggatat .gaatgtgtct gtggtggagg tacitccatg cgtggtgcatr tccatggtgt gctgtgtgct catatgcgtg ggcatggtcc agggtcctca gtaagccagg ccccaggac ctcgctcccc gaaaggggcc cctcaaagtc ggggagaggc gtatctgtgg gtgtgcctgt gcatgtctgt tctatggcat gcaccattgt cttctagcat tttgagagga ccagtctggc gggacacact ctgggcttgg gtgccaggcc acatgtggaa atttacacat 120 cgtgcatatt 190 ggtgtgcatg 240 gatgtgccta 300 gggtgtgtgt 360 cctcacgctc 420 gggtgCCCCt 480 gagtagggat 540 ctatgccggc 600 cctcccagag 660 gttcccaccc 720 ggggtgcaga 780 acccacaagg 840 865 C210> c2l1> 3782 <212> DNA c2l3> Homo sapiens <400> tgtgggattg gttttcatgt gtgggatagg tggggatctg cgggattgg: ttttatgagt ggggtaacac agagttcaag gcgagctttc ttcctgtagt gggtctgcag gtgctccaac 120 WO 99/33998 WO 9933998PCT/EP98/0821 6 11 I 18 agctttattg aggagaccat a ggcgtggagg cctcccctgg g cctgctgtgg tgtgtggccg 9 gatgtggccc tggctacgct c ggctggagtg gtttggcgtg a attctcttgc ctcagcctcc C aatttttgta attttagtag a cctgacctca ggtgatcetc c ccgccgcgcc cggccgagac t cctgcagcct tggtgctgac a ccattteatg actctcttca c gcgtaattgg tgtctgctgt t tctaaacaag catctgaagt t ctgtggagcg gcaccggtct g ggggtgtggg gagccagcgt t gcggtgctca gaggcgcaca c gtgctcctta tgggaatcta a catccccttc cccactgctg t accacaatgg ttggggaccc t atatattggc ttttctgtgt t cgacctcaga cccacgggct a caggccccat gtaccttcct S tcgatgtggt tagcccac S agcacagagt caccgtgcgc s tgttagtgtg tgtcacgtgc C tcccgtagta aatgacaagc tctctctccc gcgtcttcag gcaatccctc cagcactggg gactgtggat ggcagtcggt tcacaggggt ctgatgtgtg gtggatggcg gtcgtggggt tgtggtgact gtggatggcg 3 5 gggtctgatg tggtgactgt ggcggtcgtg gggtctgatg gactgtggat ggcggtcgtg ctgatgtgtg gtgactgtgg cggtcgtggg gtctgatgtg ctgtggatgg tgatcggtca ggtctgatgt geggtgactg gtggatggcg gtcgtggggt tctgatgtgt ggtgactgig ggatggcggt cgtggggtct tgtggtgact gtggatggcg gggtCtgatg tggtgactgt ggcggttggt cccgggggtC gtggtgactg tggatggcag ggggtctgat gtgtggtgac gatggcggtc 
gtggggtctg ggtgactgtg gatggcggtc acaggggtct gatgtgtggt gtggatggcg gtcgtggggt tgtggtgact gtggatggcg gtcacagggg tctgatgtgt actgtggatg gcggtcgtgg tgatgtgtgg tgactgtgga tcggtcacag gggtctgatg actttgcgtc ctcggccccc tgggcctcat cccgccatcg agtgcccagc tctggccggg ag tcttccttt ctccctgtt tgggcaggg Cgtccttgg ttttttttt tcttggctc aagiagctg gacgaggtt cacctcggc cgcttcctg acctccgtt agaagagtt tatcgatgg gaaggaaaa gccgt titc rgggcctgtt cccgcctga :accctactg Ltgcctgatg cctgtggaa :gtgctaaag ~gagtccaga Ltttgtgggc ~ttactgcct ~gccctgcCg ;tcttttgat :tgctcacat ltcctggggg ictcttctcc :tggagaggc cacgggggtc gtgactgtgg :tgatgtgtg gtcgtggggt ggatggcagt tggtgactgt gggtctgatg atggcggtcg gtgactgtgg caggggtctg tggatggtga ctgatgtgtg gatggcgatc gatgtgtggt gtcgtggggt ggatggcggt tgatgtgtgE tcgtggggtc tgtggatggc atgtgtggtc gtggggtcti gactgtggai ctgatgtggl gtcgtagggi ggtgactgti ggtctgatg tggcggtCg tgtggtagc cggcccccg ggct tggcc gcaggccac gaactatggt ctgtttettC cttccaggcc aattcccctg tgataacaga actgcaacct gaattatagg tctccatgtt ctcccaaagt cagcttccgt ttCcttctc tcacgtgtgc cctccttcca gt t tcgat ta cctctaaagc cgggtttata cactctgggg tccttgtgtt cgagttggag gtctcgctCt gtgcttcctg cgcccaccac ggccaggctg gctgggatga gagatctgca aggtctcgct tgatttcccg tttcctttag tggatgtttg agggatcccg aggaaccCgg cgcacagcgg: gccccgcccc -tctcagatca agaactgtgc atctgaggtg aaaccgtctt acctgcttca at aat tacgg gtgttgcctg tccaggttgg ccagcrtcctg gcctcacaag cctgtcttgg agtctgcaga tgcctgtgCt ccgggagctc tgatgtgtgg atggcggtcg gtgactgtgg ctgatgtggt cgtggggtct ggatggcagt tgtggtgact tggggtctga atggcggtcg atgtgtggtg tcggtcacag gtgagagggg gaaccgtttg, ccacgaaacc gcagcctctc atttctgtga ctcctgggtt ttctcagggt ggggctgggg ctegaggcct ggacgcaggg ataggaggtg gtggctgcac gagtgccact tgactgtgga tggggtctga atggcggccg gactgtggat gatgtgtggt cgtggggtCt gtggatggcg tgtgtggtga tggggtctga ,actgtggatg gggtctgatg ;taagtcagg 180 tcgtgtggtg 240 cattggcctg 300 gctttctttc 360 cttttgccca 420 agttc aagca 480 catgctgact 540 gtctcgaact 600 caggtgtgaa 660 gcgatagctg 720 aggggtcttt 780 gCtgtttcCt 840 
gctttgttta 900 aactttcttt 960 aggcccctgg 1020 gaggctaggt 1080 gcagtggcat 1140 tctagattct 1200 ctcccaaaac 1260 agtccctggt 1320 gtcagtgttg 1380 tgCttccgC 1440 gggaagggtg 1500 tgaatcgtac 1560 aacatgctga 1620 cctgtgtccg 1680 gcttagcagg 1740 ggggtgccgg 1800 ctgcatccct 1860 cgtgccacgt 1920 tggcggttgg 1980 tgtggtgact 2040 tggggtctga 2100 ggcggtcgtg 2160 gactgtggat 2220 gatgtgtggt 2280 gtcgtggggt 2340 ctgtggatgg 2400 tgtgtggtga 2460 gcggtcgtgg 2520 tgtggtgact 2580 gtgactgtgg atggcggttg'gtcccggggg 2640 ggtcacaggggtctgatgtg tggtgactgt 2700 gactgtggat. ggcggtcgtg ctgatgtggt cgtggggitt tgactgtgga tgatgtgtgg ggtCgtgggg actgtggatg atgcgtggtg ggcggtcgtg gactgtggat ctgatgtgtg gacggcggr-c gtggcgactg ggggtctgat gcaggtggag ttcccaaaca caggtccaca tttgtggctc gactgtggat gatgtgtggt tggcggtCgt tgactgtgga tctgatgtgt gCggtCgtgg actgtggatg gggtctgatg ggcggtCgtg gtgactgtgg gtggggtctg tggatggcgg gtggtgactg tcccaggcgc gaagcttccc cgtcctgatc atgccctctc gggtctgatg 2760 ggcggtcgtg 2820 gactgtggat 2880 ggggtctgat 2940 iggcggtcgt 3000 ggtgactgtg 3060 ggtctgatgt 3120 gtgatcggtc 3180 tgtggtgact 3240 gggtctgatg 3300 atggcagtcg 3360 atgtgtggtg 3420 tcgtggggtc 3480 tggatggtga 3540 gtctgtagct 3600 aggcgctctc 3660 ggaagaaaca 3720 ctctgccggc 3780 3782 WO 99/33998 WO 9933998PCT/EP98/082 16 12 18 <210 11 c2ll:, 980 c212> DNA <213z- Homo sapiens <400> 11 gtctgggcac aatcactggg tgtgatgggg tgcgacagct ccgcagtgcc caaaggaaag agctggcccc cgggccgtgt ggccctgcgg cctgttagca ctctgggcga ccagaccctg cggtgggctg gtgtcacaca gatggccctg cggagggtct tctcccgtct tgccctgcag ggtgggcac ggactcccag ctcatgaccg gacagactgt tggccctggg gcatgatgag ctgtgtgcct tggcgaaatc gctgcattca ggcacctgct cacgtttgac gtgtccccct cctttaggag ggcaggccat tcagtgctgg gtctgaggcc aaaggaaacg ttgagccacg ceccgctgag cgggcctctc ccctttgcag atgtggtctg tccacgtggc cttgctcggc tctaggggac agtcgtgtcc atttccttgg ctcccagggt gggggtggag tgcccggcag ctgggcagca actctggat tgtgggtgtg agcccagctg gacccacagg ctctgcctaa gcccatgtgt gtctgcagag cattccagcc cagccccgca cr.tcatcaca tggccacgtg gtcctgcctg 
tctcagcacc gcttcgcag cagtgggtcc gggcagtggg cgagctgggc tgcgcggccL gccagttttg gtttgagccg tgtccccctt agtgctgggt cctgtggctc accgcat~gag .gtggcctggg cacatatgcc tggcccagag actcggcccg aacactgacc caccggctca tcccctgggc gggaatgagc 120 catgccaggc 180 ctctccagtt 240 atcttgaggc 300 tgtcctgccc 360 cttaggagga 420 ctgtccaCgt 480 tttgcagatg 540 gctcagagac .600 ctgctgggac 660 .atccgggcca-720 gagacgttct 1780 gccagcccac 840 ccaaaaggga 900 ctcccatgtg 960 980 <210> 12 <211> 2485 <212> DNA <213> Homo sapiens <400> 12 gtgagtcagg tggccaggtg tgctcacctc tctcctgcCC ttttctggcc cccgccccct ggctcggctt gcggcagccg ggggtgtgga gttgctcctg tgcgccgagc gtttgagcct cggctctcac acgcttgtat tcccgtccet gicgtgtgac cccatctgga aagtgcgggg ccatggggca ggcggcctgg cggtggtaga gccacagtgc cacacctccc ggcaggcatc ggaggaaatt cgtgcacact ggaggcctct ctctgggatc ttatttaaa aatataacta attataatat ttattaaagt cacaaattgc acatggcagc taagcggccc ccaggcccac cgggcctcct tcgtggtcgt gtggcagggc tttggggaat ggtgactgtg tctgtcctgt tggtccagtt iggcctctga acagagagag tttcccatcc gggctggccg gactcctaga gcccatcact gtgatatctg ttttttgaga cggaacgtca actcactgca acctccgcct gctgagatta caggcaccca gggtttttgc catgttggc ctcggcctcc caaagtgctg cttttaagg tgaccaccta gctgcaagcg cctcttagca tcgcgtggca gccatgcct 6 actgtttgtc tgaaaacgca gtcatgctga aactaggggc ccattgccct cttccccact ccggctcctg gagcggagca cgtggaggac gcagcttgt c Ccccccc ccccgcgagg ttgaccgtgt gagagctgcc ctggtgccac tgcctgcgac caaggt~catc gtctccagcg ttaattattg ataattagaa agagtgaat agaattcgct gaattttatt gtgaggtgat ccct agga ca ataaaaacgt catgtgctca gttggtgcgt caccagcaag ctgttgtctg cccgggtc ccccctgcgc aggctggtct ggattacagS tagcgcttcc *acaggagtgS ctgtgtgcac *cccttggcat aaggttgrtat gcgggtggct gnccttctgc ggctgcaggc ggtgccacac gaggggcggg agctccaagt cgatacaaaa gcgcgggctc agtttgctc gtcacacagc atcacgtcct cctgtgtgtg agcaaggtca gataaaggac cattataagt atattaagta ttggccgagg gacaaagtca aagatggatc gactgcgtcc cggacaggcc ctccaaaacc caggggcgta gtgcttCtgt gaaagcct cctgggcttg agcatttctc ctggctaatt cgaactcctg rtgtgagccat :cgaaaataac 
fcgtcctgtgg ctttaggttc ccttgtttgg ccgttggcgc gggcgggCtg ccggggccac tcccgaggcc gaggcctgga gggtgtgtCt tactactgac ggattttatc ttctctctgt tctcgggggg cactgggtga ctggatttta cctggggaga tccgcagtca tgtgcacagc aatcactaat gtacacacgt gacacgtgtg cctccccaga aagtcacgta tcatgccctg cgaagctcta tgttgcccca tctgcttgcg gcaaaaagcg agtgcagtgg ctgcctcagc tttgtatttt acctcaggtg cacgcccagc aggtcttgtt gctctgggga cacggggcta *agagttctg *gcagcggcca gcagggcttc cagagtctcc 120 ccggaaacat 180 aatggcaagc 240 gggtcaggtg 300 gctggacacc 360 cgattctcat 420 gactagattt 480 cctgtggigg 540 gccacactca 600 agtaaaacca 660 gtggtagcac 720 ggtggaacgt 780 ttcggaagct 840 gatcagca 900 tctggaaaaa 960 cacatgtgtg 1020 gaagccacca 1080 ccgtccacgt 1140 acagacagga 1200 gtccccatcg 1260 aaaactaaga 1320 ttgactcgct 1380 cagtcctctt 1440 cgcgatctca 1560 ctcccgagca 1620 tagtagagag 1680 atccacccac 1740 cggaaagccc 1800 tttgcagtag 1860 tggccgaggg 1920 ttctgctctc 1980 *cttctcgttg 2040 catgcagggt 2100 WO 99/33998 WO 9933998PCT/EP98/0821 6 13 18 catgagtctt tcattccggg aaagaaaacc agccgcccca ggtatgtttc tctgtgtctg tctcaccigt tcaccgtgga tcaagtgtct ttgatgattc gtgcatggtg tgaggtgitt tgtgtggctc gtcttcccgc caaattcctt gaaaaaaaaa ggttctgtga ataaactcta agagcaagga tgtggtcaca agagtgggga gcagggattg gccggctgaa tggtagacgt ggtctgagtg tacgcatgtc cccag aaaggagtcc agatttaaga cctgtggctg tttgttcaga gtcgtttgtg cagcacatgc ggttaagcat aaccttaatg gatctgtc ggtctcatct tgtatgaggt cctgcccgtc 2160 2220 2280 2340 2400 2460 2485 <210> 13 <212> DNA <213> Homo sapiens <400> 13 gtgaggcctc agtgttaata ggttgcagcc gggctccacg gagggccgct ctgtcacgtc gccccacatc ttgcccatcc aacctaatgt tactttaagt gccatgttgg tatccctccc tgtccaagtg ttctttcctt aaggacatga attttcttaa gtgaatagtg tcctttgggt tccttgagga cagtgtaaaa tagtatcact cagccatttc attatagcat ataagtttat ggaaagtgtc ccatggtcat cctgtggata gggcaccacg ttatttttcc gctggcacag tgctgtagca aaccgct ttg tcgtagacag t tag ctcttcecca ttcctggtgc ccttcttggt caggctctgt gccctgcatg acccaggttc tcccagcagg cacttgcatg ggttcaactc tctagggtac tgtgctgcac cactcccccc 
ttctcattgt gcaatagttt actcatcctt tccagcctat ccgcaataaa atatacecag atcaccacac gigittctggi gaacaagcag tctcgaagac gtacagtgga gtaacagaaa ctcgagctgg ggggcgctgg ggatctgggt tggctcagag ctaagagtct aattgcacaa gttaactgta gagaatgtta atactacgta ggggggcttg tctggagacc atgaagccgc ccagcggcca atgagcatgt cgttagggtc ccctcgacag gggtctacac agctggCttt atgtgcacga ccattaactc atcccatgac tcagttcca gctcagagtg ttatgact catcgatgga catacgtgtg taatgggatg tgtcttccac gctggagagg acagttagtg tccgggtttt tcaaggttct caaaaatttc cggcacactg gcttgggiCt ctcggatcat ggggcgaggt gagaagtggg gctgatggta gagagctcgt ctttatttat aaaagtgtaa ggtgggggtt gatttgcttt tgatgcattc acgactgctc. tgtcttgagg -aaccagacaa 120 acgggagggg tgtccagagggaattcaacacttggggaga gtggcctgga ccaaggacgc tattgacagc cgtgcaggtt atcatttaca aggecccggt cctgtgagtg a iggtttcca gcatagtatt catttgggtt catgtgtct gctgggtcaa aatggttgaa atgtggacag aaggatgcgt tcctgtgcat tcttcattaa ttgtacacac gtcagccctc gagggtcaca ttgcacagcc -tgaggactgc cctcagggct cagcaggcgg ccgaggaagc acaccagctt tggggctggt ccgggcgcct acacacctaa agttacttti agttacatat ttaggtatat gtgtgatgt t agaacatgtg gcttcgtcca ccgtggcgt a ggttgcaagt tatagcagca atggtatttc ctagtttaca cagttatttt caggaagcct cttttgaaac ggttcaagtt aacttgctct tgggacagga cagtgcacca gcagcctgag cttcagccca atatcgtgcc ttttttttaa gtatacatgt ctcctaatgc ccccaccctg gtgtttggtt tgtccctaca tatgtgccac ctttgctact tgatttataa tagttctaga ctcccaccaa tttatgaaaa gcaggccaca tctagctcca ctagattgaa gggatttgga tacctctggc tgcccagctt tgctggtaaa ccgtcttcag ttcgtcttca aat~gaggaat tttttca ttt attcagtccc ttcccttatt 360 420 480 540 600 660 720 780 840 900 960 1020 1080 1140 1200 1260 1320 1380 1440 1500 1560 1620 1680 1740 1800 1960 1920 1980 1984 gctgaggacc *acagctgcca tcccagcccc agctttctta gccgcgcctg atggccttcg aacactgagt acttataatg ctgttggaaa gaaatttaag ggctgtgtaa attgtttgac agttaacctt gctgtgtatt <210> 14 <211> 1871 <212> DNA <213> Homo sapiens <400> 14 gtgaggcccg cccccgcgtc gtgctggat c caccttcggg acacacgtgg ggcggc icCt ggcggcagcc gctggcccac agaatctgga N6 cagagttgat 
tgcCgtgtgt ctgcccCtgg cgcaagagca agggagtggg tgagcgcagg ggggCCCCag tcctccccag agcgttcgct tttgctgagt ttttgtgaat ctgtggggaC caccgcagcg gaggcgcttg taccgtgcag cggtgacctg tgagaccccc ggtgcacczg gcggtcacgt gctgctgtct caaactaaaa ctccacagcc tcgtctctgc gccgtgcacc gccctggtcc gctcctgctg aggagccgtg agcctgcgga tcctgcgtgg tgaaccacgg tcaggcacag tgtgggCttt caagtcctc.
caggcctggg tgcagagacg ctctttggaa cacagggcct gagcaggagc ggttgtttgg agatggctag gggacctggc gcagttgagc ctctctgccg ggcgcagggg cacccaggtt agtcaagagt gcagggccga tgctgagtga gatcggtggg gagtgggt tt ctcagcacag WO 99/33998 WO 9933998PCT/EP98/0821 6 14 18 gggattgtcc aatgtggtcc agtgcgattt gacgagggac tggccgccag gggtggtttc cccgcccc aagtccaccc gcctggcgtt ccttgtgccg aagattcact cggggggagc gctgagggga cagagcagac ggccagggag gtggctcaga gatgagtcgg cagccatgta ggagctgcgc agctggccga ggggccacag cagaggccgc gggcacaggg gggctccctg crttgacgtga agctgacgac gagcaggaac tcagaaccct aggagaaaac aggcaaagtc tcttgtccag attttagtct tgccatgggg acacatgaga gagtcctggc tgtcccgggt gggagacagg gaaagcaccc cctgccaggc ccagcaccct aacaagggtg tcaggttacc ccctgcctta g C210> <211> 3801 c212> DNA <213> Homo sapiens ccctcaaggg cgccccacag gagaaacctt gaaagctgta aggtgctttg tccaggtcca cagcccggag ccaggtccca ggggaacgct gtgtatgttg acaggaaggg ggtcccaggg aggaagggaa agetgggrga tggtgttgcc cccctttgtc gttgagaac gccccggacc tggaccatca ccaggccagg cgaagtctgg gctccaaatc t cctgggtga ctgggctgtg ccctccaggg cacagdaggc agcaactgag gcttctgtgi gggtcccacc gtggccacag ccaggccaca ggggatgccc gcgaggctca cagctcacag taaagcacag gtcttaaaag acagatgagt agccggtggg aagggaaccc tttgtgaaaa ccgccctggg tgtgcacatt ggctcaggag ggcaagttcc gggggcagaa ggagctggga ggaagggcag aggccagagc tgactcggcg cccagccagg cagatgcctt aaggcgggat .ctataacggg cttgttttaa tcagaaaatg cccatttgga ctgggggtat taaatccact tcct~qaggct tgagggtgct ctctgtctct atgcaccagg ggggacgcc agaggctacc agggaacctc tcccgcgcct cagggcatcc ggtggcaat t attgtggtgt 660 720 780 840 900 960 1020 1080 1140 1200 1260 1320 1380 1440 1500 1560 1620 1680 1740 1800 1860 1871 cagaggccac ,.ggggctgca., cctcccatct ttcttgcatg :ctcacctacc tgtcctgccc agcagggctg .ggtccaggct :cctcagagct accactt.ctc tggggttttc caaagcattt cggccccgca tcctggggct gacattgccc <400> gtgagcgcac ctggccggaa ccgrttgcgtc cacctctgct ggccacaggg tgcccctcgt tgagggtgci cacaacggga cagacctggg tgcaccgagg 3 5 gggccctgct gggcgtgagt cctctgaacc cgagaccctg cagggccctt ttgggcgtga agtctacagg 
atgccatgag tgtggggggg gtcictacaa gccccgtctc aggctcagac ctgtttctti tatgaataaa ataatcccag cactttggga ccaacctaac caacatagtg cctggtggca cacgcctgta aacccaggag gcagaggttg acagagtgag acttcatctt ggacaggtgt ttttttattc gaactggggg tgccttcctc tggttgttaa accagaggt tggactttgc ctctttccag tggacaccct cgtgatgggg ggtgcagaca ccettgtgca ggatgccggt cccctgtgct ctggcctcca ctggctttgt cgcctctccc aggcacctct cttattttgc tccccatgaa gcacagcatc agtgaatgt t gttctcicta aacacattgc catcagatgt gggtccaatg gtgtgcttgc agaggtggct gggtgaactc acatcctctg gaagaaaaca ggcaaaatga ttcttgtcca gattttagtc gatggacaga acaacagaac gtaigtggca cagctgatgs gtggagcctg tccgtgtggg cccatctggg gcagttttct tgtcttcaga ctctcaaacc gggccctget gtctct ccgc ttcatgatca aattctgggg acaaatgaat aagtatcaac ggccgaggtg aaattccatt gtccccgcta cagtgagccg aaaaaaaaaa tgtccttcga tgaaaggcac taaactgggg aatgctccct gagcagcagg tggtgcccag ccccacagtc ctgcatgatt gcagtgctgg atgtattttt at tgaaggac aaagccacag ccagaatatt ctaaaagctc tgtctgaagt taagaaaag *tcccaaacca *aaaacggaag Iaaaagagagt tgcccggctg gcaggcgact gctgagcaga gtgctatttt aagcagtctg cgaacacagg gggcgtgagt tgtgagcccc cgtgtgaccc tcttgtttcc gggcaggtgc gccaatccca aatgcatctt ggtaaaagga gatccgaacc ggccctgctg ctctccgaac acactccaag aicaggggac ccagagcccg tgaagatgga. cacagatgca attccaggca. gggcaaggcg.
ggtggatcac.
tctacttaaa tgcgggaggc agatcacacc aaaaaagtat taatatttac accrttcatgg tcctgtcgtt ggggtttgct tgcagacgcc catgtccctg cctgcttccc tccacatttc ccataccagt taggacaggc aaaggacaga aggctagtgc ctgtgctccc agcagtggag atacagcaga tgaaaaagga cagctcagat ccctatctct gtgtgtgtaa C Cgaggccag aaatacaaaa tgaggcagga actgcactcc cagcattcca tggtgctgtg gaagagaaat ctgagttaac tcatggggga ctcatgatgg C tgcagct cc tctcacagcc ctgggctccc cagctgtgaa acccctggtt caaacaaat c aggatgggtg aaaggccact gcagtggttc ggcttgaagg aaagtggtaa ggtagaatgt cagaaacgtg tttttttc :gctgcaggg aagggtcaga 120 :ctgtgggag 180 ~aggtgcac 240 caagacgccc 300 ggcatgagtc 360 ccagagactt 420 gctcatccac 480 agggccatgg 540 agagctcaag 600 gaaatctgtg 660 gctcacacct 720 gagtttgagg 780 attagcc gg 840gaatcatttg 900 agcctgggca 960 aaaccatagt 1020 ctagaggccg 1080 aagtggtgaa 1140 agtccagatc 1200 gcagcaggtg 1260 gggagtggca 1320 ctccccacaa 1380 ttacctggcc 1440 agcacctctt 1500 ctgtccactg 1560 ccagcctctg 1620 aggaaaatgg 1680 ggcatcaggt 1740 tggtcagagt 1800 gccatactca 1860 gcatctggga 1920 gatgggaatt 1990 ggicagaact 2040 tgttaatgtg 2100 tgagaaaact 2160 WO 99/33998 WO 9933998PCT/EP98/082 16 18 gactggaagc aggagacaca aggtgaacgt tgaggcaacg atggccattc cacattcatc aggggagcag .ctcctgctcc agggtgagcc gatgcacact gagacccatc atgctggctc actttctgg taattccact ccaggcaggg tattatgcat gttaatggc cacaccccag gggcagtgag ctggagacac gctcttccat ccacaaaaaa acagtttatt tgcacaaaca gagtttggtc cagccccctc aggctctttg acattccccc aaataagttg tgcaaacaac tccctggttt ggcattgctt cctggagcgt ctctcacttt ccgcccttgg ggggCCCttg ggagcCCaag tgggaaggtc cctcaaagaa cttttctggg aaagcagctt tctgaagtga ggacttgcca cacaaaact t acaaaacgt gagcctgccg tggtggtgag catgtgtgcc ccctgagatt cctgagtcac atgtgtcttt cggccgtgCg atgcagagtc gggctgcagc gagcaagc tt ctgtgtctca tgtctttaca accagcaaca ggtgttgggg tcactgcaga ttgtgcacgt gttctcctaa tcacccagct ctctgcccga gtcgtgttgg ctaccagcag acgcacgtga cttgccaaga gttgcatgg ccagacatta cagcaagtca gctctgccat tatttcaatg tgaatgtcat gccctggagg acgtgcactc caaacacagt acctgtgttc ggctgagtta aggttggat 
tggatggcat gcatgcccca tgcaggaggg g gcatatacca gaaataaaac aaggacacac gaaactcagc gatttattta ccacctgaga ggcaaagggc ggaceccaca ggatggctgt cgtcaaagaa aacegatggc gccagcatca aagtcctcac tcacgggtct cgaacctgcc taaacatttt gagcagattc aaaagactca agggaggcgg ttgcctgagc aggcgccctg ggtagaggag atgcatgatt caagtcagac gaaagaagaa atgcatgtga gagacccgtc ggctgaggca aatgtcctgt tatttaccat caaatacagg tcaaagaatt taggtagaag 2220 aagggaaggg 2280 atgaaaccag 2340 cacagcgaaa 2400 tgaggtctg 2460 gaaaggctcc 2520 gcagcctggc 2590 ccataggctc 2640 acggacgtct 2700 aactgacagc 2760 cccatccctc 2820 agctggaaag 2880 gtcttcccag 2940 ttccagtgtt 3000 gctaaggaga 3060 Ettgaagaat 3-120 tgtaaaagaa 3180 ggacatacat 3240 tcctgcccct' 3300 gtgccacctg 3360 cagtgttctc 3420 ccagggctcc 3480 agatgatgag 3540 ggtcctggtg 3600 agtgagcacc 3660 ggaaggcagg 3720 cctgtgtctg 3780 3801 tagcagtgtt-caaagctgga gtgtgttcat .ctttggacat acatcggtgg gatgcctcca: actggagccc gagattcccc actcgaggga tgtgcagac acactcaaca gtagcatttg ggcaggacaa ggctgggtgt tgtttagctg acgcccaact cgcccgggag ccatcagggc tcactagcca gagtccatgg ggaagcggga ggggcaggca C210> 16 <211> 880 <212> DNA <23 Homo sapiens <400> 16 gtgagcaggc gtgtgtgtgt acatgtacgc cagtgtgtgc tgcatgtgtg gcagtgtgag taggtcctca gctgaggctc ccctctcctg actccctctc cctcccctct gagcctcggg ccgggtgagg ctctttcttg gaggtttcta tgatggtcag gtgcgcgcgt atatacacgt acaggtgtgc ttcgtgcaca tagcatgtgt gcaccagtgc tgaagctgca tgggcttctg tcctgigggC ctgtgggcat ggcaggcaga gccaggccgg ttccatctga ccgtttctca cacagagtic gcctgcaagg gagcacatac aagggcacaa gtcgtgtggg gcacataaca cactccttac gccctgaggg tgtccactc atccgcgtcc ttgcgtccac tgacacagag atttcactgg acggatgata ctctttcttg agagttcagg ctgatggtga atgtgtgcat gtgtgtgcac cattcacgtg tgtattgagg aggatgagac.
cattgtccca ccctctcctg actccccctc tccctctcct tcttgactcg gaagagggat aagcaaaaag gcgactctag aggtgtgtgc ctggctgcac gtgtgtacat atgcgaatgc aggrtgcatgc *ggccccgtg ggggtcccag tctgggcatc ,tgggcattta tctgtgggca ggttccttcc cccagggtgg agtttcttgt taaaaactta gcaagtatgt gtaagagtgc gaaggcatgg acacctgaca gtgtgggtgt ,ttcaccccgc: gccttggtgg cgcgtccact catccactcc tctgcgtcca tgtcttggcc t tcgcagctg caaaatgttc aaatcccaga <210> 17 <211> 3186 <212> DNA 4213> Homno sapiens <400> 1I? gtgagccgcc tctgacccgg ttcagtggtg tctccatctc gcactggccg a tgt t a tgg tcagatgccc accaaggggt ggcttcacct ctgctgcctg tgggtagtgg cgggacgtca ggagtcttag ggaggatttg cacccccggg gcaggcccag tggaactcct tgcacagttc taggagccgg tggaggccat cagaggaggc gggtctcagc aaggtgcagc cctccaggga gggttttagg tgttcgcgtg tgtggcccca cccagggcag tgggaaggtg aaagagggcc agagc tgtgg ccctccgcgc ggcaaggaa t gctctgtgca ggtgtcccca caggggcatg tctgaacagt gaggtgggtg ctccccacac cctgctcacc gtcttacgtt aagcacctgt ctgtgcctgt gggtaaagag agatgggaga caggtgaggg agcccggcca WO 99/33998 WO 9933998PCTIEP98/082 16 16 18 gcacctgtgc ggtgcccctg acagggccag aaagggcagt ggagctgaat atttgtgtta gtcgtcgtct gtaccatgaa agagccacag ctcagttcca aaatcttccc ctcctttccg ctgacactgt tgaggtcaga tgccacctgc agctccgagg ccatgtgggg tgggccacgg ccctcgaggc cagtgggcga tggggctcgg cccgacctct tgcttccctc atttgtttaa cacctcagca aggtcattcc gtgggtgaga ctcaggcact acgggggctc ccaggtctgt gacagggctg aggatccctt tatgaaaaat ttgggaggct gattgcacca agaagactga cagaagccaa ccccagaccc ttgatatacg tggtacacaa.
aatcagaagc gtgttcatac actcgcacac gcccatgagg .gcccacaccc tctcag tctgggcatg :caagaatcg :ttctgcctg cgggcaccac ;ccaggaggc :ccagggccg atcgtggaaa ~aggttttt ctgcatgtta ~gggcgtcc tcgtttgcat gaaacccttg ggttgacccc ggagttttcc tcctcccata agctcccgta acccttgggt gccgtttcca aggagtggga ggctgcggtg cctt~cctggc agcaggtggc ctgggtcagg aaacattctg gagttactga agaagtggct tgaggtacac gggtgagatg tgatcacacg gcacacctgc gcgcggtggC gagcccagga aaaaacaaaa gaagtgggag ctgtactgca caaatgcagt gtcggtgtt agggtttatg atgacatcaa ggaacaatgg cagcatgggg agatggtgca acaagcacac aaacccatgc acgagcaccg gctgtgctcc tggaacgttc acaactttat cacagaggga gagtcagggc aggtggtggc aggcccgggc ctccacctca cgaagccctc gccccatgag aggctgcgcg aattaccgtg cccagcaagg gctcacggga aacccgagtg cttgcgcctt ccgcctttgc accagctcca ggctcagacc gccetcctct ctcctgacg cgtgcctggg gggtgtgctg gatacaggtg agggtccagc tggcgtgctt caggtgaaaa ctcctgggaa ttcagctcag tcttgtcctc gagggcctgg gctcagggca agtcgcttga ttgggtagcc aacacagagt caggcacgtg gaacggagag ctgggccccg gtccacgtgg cgctgggggc ccgtgctggc cgcgcctcca tatttctccc tttggaagag agcgtggccg tgtggcaacc ggcctggctt ccgttgttgc gaggctgaaa ccggggtgct caggaagtca gtgagaccag cctgtcctgg agggccaatc acaagcct cg acaggcctcc ggctgagaag cacacttgat gagttttcca catgctctgg gaggcttggg ctgccttctc ccctcgtgca ccactgagga ggggcctcCt actcccaggg atttccccac gggcggctgactgaggaggc, gaaggcccag atttcacggC ggggtctgat cacgggcttg agcccctcac ccgggacctt taaatgggga ggcttgactg gtacatgggg gaggccaggt cagagggtca catgtgctgt agaggccaaa cttt gggagg acatagtaga tgcgcctgta tggaagctgc gcccatctca gtaggaactt gatgggtcct cagaagggat gattcatgat gcct~cccgg ctgcttcagc gcacacacag atccgtgtgt acaggcaccg t c gacgc tg ctggicaggg 540 tgtggaggcc 600 gggctgtacc 660 cgagccactg 720 gagtgtgagc 780 gtgaaatgag 840 ttaciaggtc 900 cagggagggc 960 accaggctgt 1020 tetctgcctc 1080 agctgcttga 1140 ctggaggtgt 1200 tgggccatga 1260 ccatgtgacc 1320 cagggtctci 1380 gtttccccac 1440 cgagatgcga 1500 gaatcccctt 1560 agccaggctg 1620 tcaaatccgc 1680 gggtggacgc 1740 ccatgctagg 1800 aggcttattt 1860 aaagacatcc 
1920 gtgtgatctc 1980 ggctcaggca 2040 acatgggggg 2100 gaccaggtac 2160 ttcatggtag 2220 gatggaggct 2280 ccgaggcgag 2340 accccatct 2400 gttccaatac 2460 agtgagctga 2520 acaacaacaa 2580 aacctacaca 2640 cacaccaica 2700 gcgcaggacg 2760 aagtacctgc 2820 aacaggggct 2880 ctccacatgc 2940 acacgcagct .3 000 gtgcacctgt 3060 gtgggcccat 3120 tccgccatcc 3180 3186 ggggggccca ggcagtgggt aggtacacgg ggggctcagg cacatatgag cccaaagtcc tcacacctgt gtttaagac attagctgaa gatcacttga gcctgggtga ttcttggaaa cggtgtcagt caccacaggg ggttgtctga ataaactgga ggctggcatc cacatgtgca caggaagctg agtcccagca agcctgagca catggtggig gcccaggagg cagagtgaga gaaacattta gagatgagat gcgggtggct cgaagggcag aaccttagag caggatggag cagaaacgca gtgtacctgt acacagacat gcatgcatgc atgtgcattc atgcacgcac tctgattagg aggcctttcc <210> 18 <211> 781 <212> DNA <213> Homo sapiens <400>. 18 gtatgtgcag ggagactgag taacccaacc tggaagggac gcgctggggg tcctgttgc gcaaacccag agcagaggcc tcctctgccc NI agccatgtcg gtgcCtggcc tgaatctggg actgtcaggc aggagccgtc gcctggtCtC cctgtggtgg gccaagggct gcgcatcacc ctggacacit aacctgcggt tcagtggcag cttaggaagt tcgtctgccc tgggagctgc gattgggctg taggaggagg acgacagagc tgtccagcat cctgagctta cagtgcctgc tcttacccct gccctctcgt catccttccc cccatggtgg tctcccgtcc ccaggcccag Cccgcgccgt cagggaggtt acagcttcta ctgctggtgt trttcgcatca ggggtgagca accttgctct gatttggggg atggcactta gctacccac cctctgcttc tctgatccgt cttcgttc tagtgtgtca ggaagtggtt gagcacctga gcctggggaa gcctggcctc gggcccttgt ccctctcagg ccagtcaccg ctgaaattca WO 99/33998 WO 9933998PCTIEP98/0821 6 17 18 gtggaaattt gggtcgggac gtgtctcccg g cacctggaga agccgaagaa aacattctg tcgtgactcc tgcggtgctt agccagagat ggagccaccc cgcagaccgt cgggtgtggg cagctttccg ggaggggagc tgggctgggc ctgtgactcc tcagcctctg ttttccccca <210> 19 <2112, 536 <212> DNA <213> Homo sapiens <400> 19 gcaagtgtgg tgtgtggggc ggggcctgga agcctggcag ccacgcttgg ctgccctgag ctgggctgcc ccagccaggg ttggaggatg gtggaggcca gagcagcctc gccacgctgg ggtccccaac gagccttctg ctcctggggL tgtctgctcg ccacgaggtg ccacctctgg gtgcgggccc cacctgccca agatgctgct 
gaagtgcaga cagccctatg ttcttgaacc acccctgacc cctgagcaag ccccggtgga caggccctgc cctcttctgg tgattaaacg cctgcttccc tgtgtcctct ttctctcccc ggggtgtctg ctgcccggcc aacggagtct ggggtcatcc cgcccccggg ccggtgtccc atctcagggg cacagcctct gccccgccgc tcccttcact.
acccacacgt gattttggcc ttgaacgccc cctgaccctg caggccacgg cgatggctcc tccctggctg tccagcgtca.
gaggr.tccca cctaggaggg ccgcag <210> <211> 3179 <212> DNA <213> Homo sapiens <400> atcicatgct tgaatcctaa ctgtgagtga acggggtggt gtctatgagt gaatggggtt tttctgatgc tgtgaggcag ccctggaaga cataacagta cttgggcggc ggggatgatg ctggggtggc aggggtgatg ggccgggccc cctcctcccc ggtgcacaic ctctgggcca cctgtataaa atccaggatt ccgcctctgg ccattctctt gtggggcagt ggagggtgtg atcctcttat catctcccag *tcttcctctt atctcccagt tatcctctta tctcctagtc gtggagctgg acatacgtcc gaggggcggc tcagagggac gcttgggcca cacgaaaccg cctggtgggg ccicatggta gtttgagtgc agcccggacg gcgtcattta ttgctgctgc gggcccaagt ccacagactg agctgtgagg aaggaggggc ggaagggagc ggccccgggc agggcgggga cttcccagga gaccaacagg tcaggccat tagaccctta aaaaaggtat ggttgttagt gcagtggcac gatcctccgg cctcagcttc ctggcacttt taaaaaccac cagtagtttg ggaagccgag gggtaacata gggagacccc ccagcatctg tagtcccagc ggtcatggct gcagtgagct S gaccctgtct caaaaaaaaa gaaggaaaga gaagaagaag tgtgcactgc gcagtgcg gtggtcagtg gaggggaagg agtccaggc gagggcctgg 9g9ggggct9 tgcctcccac tcagctttca cetcctcctg aagagtagac gacacaggag tctcatctct ctcatctgtc tcatccagac ttcctcaggc gcagtcttgg agggccctgc tggccgggtc tgcctggtgt ttcagagaat tgtcgtaaat tctcggcagc gccgtgggcg gcagaggccg gttcagccat aaattttact ttgctttgat ttattattat agt~catggc t ccagagtgct tatgtaaggt gcagaaggat atctctacaa tgctcgggag gtgattgtac aaaaaaaaaa gaagaaggaa atagacacca ctgtatgcaa ggcccatggc ctggctgtgc cgggcccatg gcctggctgg agggtagggg atagacagtg cgaagggcag cagggatgct ccagggtggc agggatgatq gtctgggtgg cggggaagat ctgcagccgt ggatccggat tggaggtggg gggrcaggggc aacgccccaacaggattctg gcttcagggt catcctctta atcctctiac ttacctccca.
agaaggaact ggtgaagaaa gtgagtggct ctaccgagtg cggggtgggg gtctgagtga gcacccggt cggcctgggg gacgacctca ctgct caggc ccatcttcta caggattact atggcttaac tarttagagat cgctgtagcc gggattacag caggtccagt tgtctgaggc aaaatgcaaa gctgagtggg catcgcactc gaaggagaag gaaagaagga ctcaggttga atctctgaag ggggctgg tcatctccca catctcccag gggcgggtgc ggaaggattg cagcccctcc ccagagcctt caccttggac gcttatggcc ccgagcctaa gcctggagcc gcgcctttgc agtgagaggt acacctgggt caaagctcca taiatttttt tcactaagca ggtgtctact gcaaaccccc gtgtgagcca ggcttccaca caggagtttg aagttatccg aggatcgctt cagcctgggc gagaagagaa gaaggaggcc ttacagaagc atttacggaa 120 gcctgggagg 180 ggagccccca 240 gggggcccag 300 ggggccccag 360 ggggaag6CCt 420 gtgcttccct .480 atgacaccat 540 aagtcaca, 600 ggtgggtagg-660 atgctctctc 720 gtctcatctg 780 tctcatctct 840 caggctcgca 900 cagagaacag 960 tcagaagttg 1020 ccagcaggtc 1080 agggattctg 1140 actggatatg 1200 tgtgtatggt 1260 cccgtatagg 1320 cctgcaaact 1380 tggacagaac 1440 ttgaatcaca 1500 gactcctgtt 1560 gctaaagcat 1620 cctactttat 1680 ctgtcaccca 1740 aggctcaagt 1800 ctgcccttgc 1860 cctgtcatcc 1920 agaccagcat 1980 ggcgtggggt 2040 gageccggga 2100 aacagagtga 2160 gaagaaggaa 2220 tgctaggtgc 2280 WO 99/33998 PCTIEP98/082 16 18 18 taggtagact aaccccagct gacaagcgtg ggaggctgtg gccccacgct acctgggaag gcacttgtgg acctgtgaat tgccagtcag gggtccccag cactggccac ccgctccatg acgcctcgat tgacgcgtt cagtacaggg gtcaaatct ctttggactt tatggagcga ggtgacacca cctgccggtc gatgctgtgc caggcacaac gtgtcacccg ccgatcttaa aagtgagaga tgctggcttt ctggaaaagc ttcaggccag gtgttcagcc aaatgaatac agagcaaaat ccttaggcct gtgagttcaa gccaggaccc ctgcacctgc agggggcttg tacagcccct caaggcagag ggtcatcctg gggaggcagg gagatggagg aagcaatcct tgggacctgt actaagctgc agggacagtt gaaaataaca gaacttcatc agcagaaagg ctgaaaggga tgtaaccgtc ccaaactttg ccccaaagat gctggtgaag gattatctgg ggagagtcag agggggtccc ccccggtcct ttcagetttc agtgattcgt ctcagagtga aagttttaaa tcaagcaget gaggagaagc gtggttgttt gatgrtggtg gtgggtttca gcccacgtcc gctgcaggtg tgggcctgat agaggggacg cagccaagga gagggcacac cggcctccag 
cacagcagca ctctcagccc gggaaagaaa 2340 tccttccaca 2400 aggcaagggt 2460 tcctgcctca 2520 ccaggtgccc 2S80 gaagccccag 2640 ttctcctgga 2700 gaatcacggc 2760 atggccacaa 2820 tgagaaggac 2880 atgggggcag 2940 ggccctgccc 3000 agctgtaaga 3060 aatggaatag 3120 acCCCtggg 3179

Claims (14)

1. Isolated DNA sequences characterized in that the sequences are intron sequences in accordance with SEQ ID NO: 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19 and/or 20 or fragments of these sequences which have a regulatory effect.
2. Isolated DNA sequences characterized in that the sequences are the flanking regulatory DNA sequence for the gene for the human catalytic telomerase subunit as depicted in Fig. 10 (SEQ ID NO or fragments of this DNA sequence which have a regulatory effect.
3. Recombinant construct which contains a DNA sequence according to one of Claims 1 or 2.
4. Recombinant construct according to Claim 3, characterized in that it additionally contains one or more DNA sequences which encode polypeptides or proteins.
5. Vector which contains a recombinant construct according to Claims 3 or 4.
6. Use of recombinant constructs or vectors according to one of Claims 3 to for preparing medicaments.
7. Recombinant host cells which harbour recombinant constructs or vectors according to one of Claims 3 to
8. Process for identifying substances which affect the promoter activity, silencer activity or enhancer activity of the human catalytic telomerase subunit, comprising the following steps: [...] functionally linked to a reporter gene, and B. measuring the effect of the substance on expression of the reporter gene.
9. Process for identifying factors which bind specifically to the DNA according to one of Claims 1 or 2, or to fragments thereof, characterized in that an expression cDNA library is screened using a DNA sequence according to one of Claims 1 or 2, or subfragments of widely differing length, as the probe.
10. Transgenic animals, excluding humans, which harbour recombinant constructs or vectors according to Claims 3 to
11. Process for detecting telomerase-associated conditions in a patient, comprising the following steps: A. incubating a recombinant construct or vector according to Claims 3 to which additionally contains a reporter gene, with body fluids or cell samples, B. detecting the activity of the reporter gene in order to obtain a diagnostic value, and C. comparing the diagnostic value with standard values for the reporter gene construct in standardized normal cells or body fluids of the same type as the test sample.
12. Pharmaceutical preparations comprising recombinant constructs or vectors according to one of Claims 3 to 5 and one or more pharmaceutically suitable excipients.
13. Use of recombinant vectors or constructs according to one of Claims 3 to 5 in gene therapy.
14. Isolated DNA sequences, recombinant constructs or vectors, or host cells, use of recombinant constructs or vectors, transgenic animals (excluding humans), or processes substantially as hereinbefore described with reference to the Examples and accompanying figures.
DATED this 30th day of October, 2001
BAYER AKTIENGESELLSCHAFT
By its Patent Attorneys DAVIES COLLISON CAVE
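The comparison described in steps B and C of Claim 11, and the measurement in step B of Claim 8, can be illustrated with a minimal sketch. It assumes, hypothetically, that reporter gene activity is recorded as replicate numeric readings (for example luminescence units) and that a test sample is flagged when its mean reading reaches a chosen multiple of the mean of the standard values. The function names, the threshold of 2.0 and the example readings are illustrative assumptions and are not taken from the patent.

```python
from statistics import mean


def fold_change(sample_readings, standard_readings):
    """Return mean sample reporter activity divided by mean standard activity."""
    if not sample_readings or not standard_readings:
        raise ValueError("both reading lists must be non-empty")
    standard_mean = mean(standard_readings)
    if standard_mean <= 0:
        raise ValueError("mean standard reporter activity must be positive")
    return mean(sample_readings) / standard_mean


def is_elevated(sample_readings, standard_readings, threshold=2.0):
    """Flag a sample whose fold change reaches the (assumed) threshold."""
    return fold_change(sample_readings, standard_readings) >= threshold


if __name__ == "__main__":
    # Hypothetical replicate readings, in arbitrary units.
    standard = [100.0, 110.0, 90.0]   # standardized normal cells
    sample = [300.0, 310.0, 290.0]    # test sample of the same type
    print("fold change:", fold_change(sample, standard))
    print("elevated:", is_elevated(sample, standard))
```

In practice the threshold and the number of replicates would have to be established from the standard values for the specific reporter construct and sample type; the ratio of means is used here only because it is the simplest way to express a diagnostic value relative to a standard.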
AU22729/99A 1997-12-24 1998-12-22 Regulatory DNA sequences of the human catalytic telomerase sub-unit gene, diagnostic and therapeutic use thereof Ceased AU742489B2 (en)

Applications Claiming Priority (3)

Application Number Priority Date Filing Date Title
DE1997157984 DE19757984A1 (en) 1997-12-24 1997-12-24 Regulatory DNA sequences from the 5 'region of the gene of the human catalytic telomerase subunit and their diagnostic and therapeutic use
DE19757984 1997-12-24
PCT/EP1998/008216 WO1999033998A2 (en) 1997-12-24 1998-12-22 Regulatory dna sequences of the human catalytic telomerase sub-unit gene, diagnostic and therapeutic use thereof

Publications (2)

Publication Number Publication Date
AU2272999A AU2272999A (en) 1999-07-19
AU742489B2 true AU742489B2 (en) 2002-01-03

Family

ID=7853458

Family Applications (1)

Application Number Title Priority Date Filing Date
AU22729/99A Ceased AU742489B2 (en) 1997-12-24 1998-12-22 Regulatory DNA sequences of the human catalytic telomerase sub-unit gene, diagnostic and therapeutic use thereof

Country Status (6)

Country Link
EP (1) EP1040195A2 (en)
JP (1) JP2003519462A (en)
AU (1) AU742489B2 (en)
CA (1) CA2316282A1 (en)
DE (1) DE19757984A1 (en)
WO (1) WO1999033998A2 (en)

Families Citing this family (20)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
DE69739675D1 (en) 1996-10-01 2010-01-07 Geron Corp OTER
US6808880B2 (en) 1996-10-01 2004-10-26 Geron Corporation Method for detecting polynucleotides encoding telomerase
US6475789B1 (en) 1996-10-01 2002-11-05 University Technology Corporation Human telomerase catalytic subunit: diagnostic and therapeutic methods
US6093809A (en) 1996-10-01 2000-07-25 University Technology Corporation Telomerase
US6610839B1 (en) 1997-08-14 2003-08-26 Geron Corporation Promoter for telomerase reverse transcriptase
US6777203B1 (en) 1997-11-19 2004-08-17 Geron Corporation Telomerase promoter driving expression of therapeutic gene sequences
US6261836B1 (en) 1996-10-01 2001-07-17 Geron Corporation Telomerase
US7585622B1 (en) 1996-10-01 2009-09-08 Geron Corporation Increasing the proliferative capacity of cells using telomerase reverse transcriptase
US7413864B2 (en) 1997-04-18 2008-08-19 Geron Corporation Treating cancer using a telomerase vaccine
US7622549B2 (en) 1997-04-18 2009-11-24 Geron Corporation Human telomerase reverse transcriptase polypeptides
US7378244B2 (en) 1997-10-01 2008-05-27 Geron Corporation Telomerase promoters sequences for screening telomerase modulators
WO2000046355A2 (en) * 1999-02-04 2000-08-10 Geron Corporation Telomerase reverse transcriptase transcriptional regulatory sequences
DE19947668A1 (en) 1999-10-04 2001-04-19 Univ Eberhard Karls Tumor-specific vector for gene therapy
DE10019195B4 (en) * 2000-04-17 2006-03-09 Heart Biosystems Gmbh Reversible immortalization
AU2001286540A1 (en) 2000-08-24 2002-03-04 Sierra Sciences, Inc. Methods and compositions for modulating telomerase reverse transcriptase (tert) expression
US6576464B2 (en) 2000-11-27 2003-06-10 Geron Corporation Methods for providing differentiated stem cells
WO2002042468A2 (en) 2000-11-27 2002-05-30 Geron Corporation Glycosyltransferase vectors for treating cancer
AU2002345743A1 (en) 2001-06-21 2003-01-08 Sierra Sciences, Inc. Telomerase expression repressor proteins and methods of using the same
US20030143228A1 (en) * 2001-10-29 2003-07-31 Baylor College Of Medicine Human telomerase reverse transcriptase as a class-II restricted tumor-associated antigen
US8163892B2 (en) 2002-07-08 2012-04-24 Oncolys Biopharma, Inc. Oncolytic virus replicating selectively in tumor cells

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
DE69739675D1 (en) * 1996-10-01 2010-01-07 Geron Corp OTER

Also Published As

Publication number Publication date
JP2003519462A (en) 2003-06-24
CA2316282A1 (en) 1999-07-08
AU2272999A (en) 1999-07-19
WO1999033998A2 (en) 1999-07-08
DE19757984A1 (en) 1999-07-01
WO1999033998A3 (en) 1999-08-19
EP1040195A2 (en) 2000-10-04

Similar Documents

Publication Publication Date Title
AU742489B2 (en) Regulatory DNA sequences of the human catalytic telomerase sub-unit gene, diagnostic and therapeutic use thereof
KR101441843B1 (en) Conditionally immortalized long-term stem cells and methods of making and using such cells
KR100581990B1 (en) Vertebrate telomerase genes and proteins and uses thereof
US20030100093A1 (en) Human telomerase catalytic subunit: diagnostic and therapeutic methods
JPH11253177A (en) Human telomerase catalytic subunit promoter
US20020102686A1 (en) Inactive variants of the human telomerase catalytic subunit
ES2792126T3 (en) Treatment method based on polymorphisms of the KCNQ1 gene
AU745420B2 (en) Human catalytic telomerase sub-unit and its diagnostic and therapeutic use
US20050032094A1 (en) Regulatory DNA sequences of the human catalytic telomerase sub-unit gene, diagnostic and therapeutic use thereof
US6706511B2 (en) Isolated human kinase proteins
US6808911B2 (en) Isolated human kinase proteins, nucleic acid molecules encoding human kinase proteins, and uses thereof
US6426206B1 (en) Isolated human kinase proteins, nucleic acid molecules encoding human kinase proteins, and uses thereof
US6653117B2 (en) Isolated human kinase proteins
EP0783583B1 (en) Cytoplasmic tyrosine kinase
US6410294B1 (en) Isolated human kinase proteins, nucleic acid molecules encoding human kinase proteins, and uses thereof
US6753175B2 (en) Isolated human kinase proteins, nucleic acid molecules encoding human kinase proteins, and uses thereof
US20040014193A1 (en) Isolated human kinase proteins, nucleic acid molecules encoding human kinase proteins, and uses thereof
CN114846157A (en) Adenylyl cyclase 7(ADCY7) variants and uses thereof
US20040220387A1 (en) Methods
US20030119037A1 (en) Isolated human kinase proteins, nucleic acid molecules encoding human kinase proteins, and uses thereof
CA2439798A1 (en) Isolated human kinase proteins, nucleic acid molecules encoding human kinase proteins, and uses thereof
CA2422549A1 (en) Isolated human kinase proteins, nucleic acid molecules encoding human kinase proteins, and uses thereof
CA2440575A1 (en) Isolated human kinase proteins, nucleic acid molecules encoding human kinase proteins, and uses thereof

Legal Events

Date Code Title Description
FGA Letters patent sealed or granted (standard patent)