AU2272999A - Regulatory DNA sequences of the human catalytic telomerase sub-unit gene, diagnostic and therapeutic use thereof
- Publication number
- AU2272999A (AU 22729/99 A)
- Authority
- AU
- Australia
- Prior art keywords
- dna
- sequences
- gene
- intron
- sequence
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Links
- 241000282414 Homo sapiens Species 0.000 title claims description 69
- 108090000623 proteins and genes Proteins 0.000 title claims description 69
- 108010017842 Telomerase Proteins 0.000 title claims description 63
- 108091028043 Nucleic acid sequence Proteins 0.000 title claims description 35
- 230000003197 catalytic effect Effects 0.000 title claims description 26
- 230000001105 regulatory effect Effects 0.000 title claims description 24
- 230000001225 therapeutic effect Effects 0.000 title description 3
- 108020004414 DNA Proteins 0.000 claims description 71
- 230000000694 effects Effects 0.000 claims description 46
- 239000012634 fragment Substances 0.000 claims description 41
- 239000002299 complementary DNA Substances 0.000 claims description 28
- 239000013598 vector Substances 0.000 claims description 23
- 102000004169 proteins and genes Human genes 0.000 claims description 22
- 238000000034 method Methods 0.000 claims description 13
- 239000000523 sample Substances 0.000 claims description 13
- 108700008625 Reporter Genes Proteins 0.000 claims description 11
- 230000008569 process Effects 0.000 claims description 10
- 239000000126 substance Substances 0.000 claims description 9
- 230000014509 gene expression Effects 0.000 claims description 8
- 239000003623 enhancer Substances 0.000 claims description 6
- 210000001124 body fluid Anatomy 0.000 claims description 4
- 239000010839 body fluid Substances 0.000 claims description 4
- 230000003584 silencer Effects 0.000 claims description 4
- 101150072531 10 gene Proteins 0.000 claims description 2
- 241001465754 Metazoa Species 0.000 claims description 2
- 239000003814 drug Substances 0.000 claims description 2
- 229920001184 polypeptide Polymers 0.000 claims description 2
- 102000004196 processed proteins & peptides Human genes 0.000 claims description 2
- 108090000765 processed proteins & peptides Proteins 0.000 claims description 2
- 238000012360 testing method Methods 0.000 claims description 2
- 230000009261 transgenic effect Effects 0.000 claims description 2
- 210000004027 cell Anatomy 0.000 description 54
- 239000013615 primer Substances 0.000 description 51
- 102000055501 telomere Human genes 0.000 description 38
- 108091035539 telomere Proteins 0.000 description 38
- 210000003411 telomere Anatomy 0.000 description 36
- 238000013518 transcription Methods 0.000 description 22
- 230000035897 transcription Effects 0.000 description 22
- 206010028980 Neoplasm Diseases 0.000 description 18
- 230000027455 binding Effects 0.000 description 14
- 102100034343 Integrase Human genes 0.000 description 13
- 108010092799 RNA-directed DNA polymerase Proteins 0.000 description 13
- 239000005089 Luciferase Substances 0.000 description 12
- 108020005029 5' Flanking Region Proteins 0.000 description 11
- 108060001084 Luciferase Proteins 0.000 description 11
- 108091081024 Start codon Proteins 0.000 description 11
- 210000000349 chromosome Anatomy 0.000 description 11
- 238000003780 insertion Methods 0.000 description 11
- 230000037431 insertion Effects 0.000 description 11
- 102000004190 Enzymes Human genes 0.000 description 10
- 108090000790 Enzymes Proteins 0.000 description 10
- 238000004458 analytical method Methods 0.000 description 10
- 239000002773 nucleotide Substances 0.000 description 10
- 125000003729 nucleotide group Chemical group 0.000 description 10
- 238000013519 translation Methods 0.000 description 10
- 230000009184 walking Effects 0.000 description 10
- 108700024394 Exon Proteins 0.000 description 9
- 108091008146 restriction endonucleases Proteins 0.000 description 9
- 210000004881 tumor cell Anatomy 0.000 description 9
- 230000000118 anti-neoplastic effect Effects 0.000 description 8
- 201000011510 cancer Diseases 0.000 description 8
- 238000006243 chemical reaction Methods 0.000 description 8
- 238000012217 deletion Methods 0.000 description 8
- 230000037430 deletion Effects 0.000 description 8
- 239000000203 mixture Substances 0.000 description 8
- 108091032973 (ribonucleotides)n+m Proteins 0.000 description 7
- 108091092195 Intron Proteins 0.000 description 7
- 238000002105 Southern blotting Methods 0.000 description 7
- 108091023040 Transcription factor Proteins 0.000 description 7
- 102000040945 Transcription factor Human genes 0.000 description 7
- 230000006870 function Effects 0.000 description 7
- 239000000047 product Substances 0.000 description 7
- 238000011144 upstream manufacturing Methods 0.000 description 7
- LFQSCWFLJHTTHZ-UHFFFAOYSA-N Ethanol Chemical compound CCO LFQSCWFLJHTTHZ-UHFFFAOYSA-N 0.000 description 6
- 240000004808 Saccharomyces cerevisiae Species 0.000 description 6
- 235000014680 Saccharomyces cerevisiae Nutrition 0.000 description 6
- 230000001580 bacterial effect Effects 0.000 description 6
- 108020004705 Codon Proteins 0.000 description 5
- 230000029087 digestion Effects 0.000 description 5
- 239000002609 medium Substances 0.000 description 5
- 108020004999 messenger RNA Proteins 0.000 description 5
- 239000013612 plasmid Substances 0.000 description 5
- 239000000243 solution Substances 0.000 description 5
- 239000003277 telomerase inhibitor Substances 0.000 description 5
- 210000001519 tissue Anatomy 0.000 description 5
- 230000007704 transition Effects 0.000 description 5
- 108010054576 Deoxyribonuclease EcoRI Proteins 0.000 description 4
- 241000206602 Eukaryota Species 0.000 description 4
- 241000287828 Gallus gallus Species 0.000 description 4
- 102100038895 Myc proto-oncogene protein Human genes 0.000 description 4
- 101710135898 Myc proto-oncogene protein Proteins 0.000 description 4
- 108700015679 Nested Genes Proteins 0.000 description 4
- FAPWRFPIFSIZLT-UHFFFAOYSA-M Sodium chloride Chemical compound [Na+].[Cl-] FAPWRFPIFSIZLT-UHFFFAOYSA-M 0.000 description 4
- 108700026226 TATA Box Proteins 0.000 description 4
- 108010006785 Taq Polymerase Proteins 0.000 description 4
- 101710150448 Transcriptional regulator Myc Proteins 0.000 description 4
- 238000000211 autoradiogram Methods 0.000 description 4
- 238000004925 denaturation Methods 0.000 description 4
- 230000036425 denaturation Effects 0.000 description 4
- 210000003527 eukaryotic cell Anatomy 0.000 description 4
- 239000003102 growth factor Substances 0.000 description 4
- 210000003958 hematopoietic stem cell Anatomy 0.000 description 4
- 238000009396 hybridization Methods 0.000 description 4
- 238000011835 investigation Methods 0.000 description 4
- 239000003446 ligand Substances 0.000 description 4
- BASFCYQUMIYNBI-UHFFFAOYSA-N platinum Chemical compound [Pt] BASFCYQUMIYNBI-UHFFFAOYSA-N 0.000 description 4
- 238000002360 preparation method Methods 0.000 description 4
- 239000011535 reaction buffer Substances 0.000 description 4
- UCSJYZPVAKXKNQ-HZYVHMACSA-N streptomycin Chemical compound CN[C@H]1[C@H](O)[C@@H](O)[C@H](CO)O[C@H]1O[C@@H]1[C@](C=O)(O)[C@H](C)O[C@H]1O[C@@H]1[C@@H](NC(N)=N)[C@H](O)[C@@H](NC(N)=N)[C@H](O)[C@H]1O UCSJYZPVAKXKNQ-HZYVHMACSA-N 0.000 description 4
- 238000001890 transfection Methods 0.000 description 4
- 108020005065 3' Flanking Region Proteins 0.000 description 3
- 101100297345 Caenorhabditis elegans pgl-2 gene Proteins 0.000 description 3
- 108091035707 Consensus sequence Proteins 0.000 description 3
- 108091029523 CpG island Proteins 0.000 description 3
- 230000004568 DNA-binding Effects 0.000 description 3
- 108700039691 Genetic Promoter Regions Proteins 0.000 description 3
- PEDCQBHIVMGVHV-UHFFFAOYSA-N Glycerine Chemical compound OCC(O)CO PEDCQBHIVMGVHV-UHFFFAOYSA-N 0.000 description 3
- 241000282412 Homo Species 0.000 description 3
- 108091030087 Initiator element Proteins 0.000 description 3
- 239000013616 RNA primer Substances 0.000 description 3
- 102100040296 TATA-box-binding protein Human genes 0.000 description 3
- 241000700605 Viruses Species 0.000 description 3
- 239000011543 agarose gel Substances 0.000 description 3
- 230000032683 aging Effects 0.000 description 3
- 230000032823 cell division Effects 0.000 description 3
- 238000001246 colloidal dispersion Methods 0.000 description 3
- 238000011161 development Methods 0.000 description 3
- 230000018109 developmental process Effects 0.000 description 3
- ZMMJGEGLRURXTF-UHFFFAOYSA-N ethidium bromide Chemical compound [Br-].C12=CC(N)=CC=C2C2=CC=C(N)C=C2[N+](CC)=C1C1=CC=CC=C1 ZMMJGEGLRURXTF-UHFFFAOYSA-N 0.000 description 3
- 229960005542 ethidium bromide Drugs 0.000 description 3
- 238000002474 experimental method Methods 0.000 description 3
- 210000002950 fibroblast Anatomy 0.000 description 3
- 210000004602 germ cell Anatomy 0.000 description 3
- 230000002401 inhibitory effect Effects 0.000 description 3
- 230000007246 mechanism Effects 0.000 description 3
- 239000012528 membrane Substances 0.000 description 3
- 239000011734 sodium Substances 0.000 description 3
- 210000001082 somatic cell Anatomy 0.000 description 3
- 241000894007 species Species 0.000 description 3
- 108010057210 telomerase RNA Proteins 0.000 description 3
- -1 IL-4 Proteins 0.000 description 2
- 102000002260 Alkaline Phosphatase Human genes 0.000 description 2
- 108020004774 Alkaline Phosphatase Proteins 0.000 description 2
- 241000894006 Bacteria Species 0.000 description 2
- 241000283690 Bos taurus Species 0.000 description 2
- 102100021943 C-C motif chemokine 2 Human genes 0.000 description 2
- 108700013048 CCL2 Proteins 0.000 description 2
- 102000005367 Carboxypeptidases Human genes 0.000 description 2
- 108010006303 Carboxypeptidases Proteins 0.000 description 2
- 208000005623 Carcinogenesis Diseases 0.000 description 2
- 108010035563 Chloramphenicol O-acetyltransferase Proteins 0.000 description 2
- HEDRZPFGACZZDS-UHFFFAOYSA-N Chloroform Chemical compound ClC(Cl)Cl HEDRZPFGACZZDS-UHFFFAOYSA-N 0.000 description 2
- 108091026890 Coding region Proteins 0.000 description 2
- 102000004127 Cytokines Human genes 0.000 description 2
- 108090000695 Cytokines Proteins 0.000 description 2
- KCXVZYZYPLLWCC-UHFFFAOYSA-N EDTA Chemical compound OC(=O)CN(CC(O)=O)CCN(CC(O)=O)CC(O)=O KCXVZYZYPLLWCC-UHFFFAOYSA-N 0.000 description 2
- ZHNUHDYFZUAESO-UHFFFAOYSA-N Formamide Chemical compound NC=O ZHNUHDYFZUAESO-UHFFFAOYSA-N 0.000 description 2
- 102000002265 Human Growth Hormone Human genes 0.000 description 2
- 108010000521 Human Growth Hormone Proteins 0.000 description 2
- 239000000854 Human Growth Hormone Substances 0.000 description 2
- 101150089655 Ins2 gene Proteins 0.000 description 2
- 108090000723 Insulin-Like Growth Factor I Proteins 0.000 description 2
- 102000014429 Insulin-like growth factor Human genes 0.000 description 2
- 102100025390 Integrin beta-2 Human genes 0.000 description 2
- 241000699666 Mus <mouse, genus> Species 0.000 description 2
- 239000000020 Nitrocellulose Substances 0.000 description 2
- 241000283973 Oryctolagus cuniculus Species 0.000 description 2
- 238000012408 PCR amplification Methods 0.000 description 2
- 229930182555 Penicillin Natural products 0.000 description 2
- JGSARLDLIJGVTE-MBNYWOFBSA-N Penicillin G Chemical compound N([C@H]1[C@H]2SC([C@@H](N2C1=O)C(O)=O)(C)C)C(=O)CC1=CC=CC=C1 JGSARLDLIJGVTE-MBNYWOFBSA-N 0.000 description 2
- 241000009328 Perro Species 0.000 description 2
- ISWSIDIOOBJBQZ-UHFFFAOYSA-N Phenol Chemical compound OC1=CC=CC=C1 ISWSIDIOOBJBQZ-UHFFFAOYSA-N 0.000 description 2
- 102000010752 Plasminogen Inactivators Human genes 0.000 description 2
- 108010077971 Plasminogen Inactivators Proteins 0.000 description 2
- 108010001267 Protein Subunits Proteins 0.000 description 2
- 102000002067 Protein Subunits Human genes 0.000 description 2
- 241000700584 Simplexvirus Species 0.000 description 2
- IQFYYKKMVGJFEH-XLPZGREQSA-N Thymidine Chemical compound O=C1NC(=O)C(C)=CN1[C@@H]1O[C@H](CO)[C@@H](O)C1 IQFYYKKMVGJFEH-XLPZGREQSA-N 0.000 description 2
- 102100040247 Tumor necrosis factor Human genes 0.000 description 2
- 102000003848 Uteroglobin Human genes 0.000 description 2
- 108090000203 Uteroglobin Proteins 0.000 description 2
- 241000251539 Vertebrata <Metazoa> Species 0.000 description 2
- 101100072652 Xenopus laevis ins-b gene Proteins 0.000 description 2
- 239000002253 acid Substances 0.000 description 2
- 230000004913 activation Effects 0.000 description 2
- 230000003321 amplification Effects 0.000 description 2
- 230000000692 anti-sense effect Effects 0.000 description 2
- 239000012131 assay buffer Substances 0.000 description 2
- 230000033228 biological regulation Effects 0.000 description 2
- 239000000872 buffer Substances 0.000 description 2
- 230000036952 cancer formation Effects 0.000 description 2
- 231100000504 carcinogenesis Toxicity 0.000 description 2
- 230000032677 cell aging Effects 0.000 description 2
- 238000004113 cell culture Methods 0.000 description 2
- 239000006143 cell culture medium Substances 0.000 description 2
- 239000013592 cell lysate Substances 0.000 description 2
- 238000003776 cleavage reaction Methods 0.000 description 2
- 150000001875 compounds Chemical class 0.000 description 2
- OPTASPLRGRRNAP-UHFFFAOYSA-N cytosine Chemical compound NC=1C=CNC(=O)N=1 OPTASPLRGRRNAP-UHFFFAOYSA-N 0.000 description 2
- 230000008030 elimination Effects 0.000 description 2
- 238000003379 elimination reaction Methods 0.000 description 2
- 239000000499 gel Substances 0.000 description 2
- ZDXPYRJPNDTMRX-UHFFFAOYSA-N glutamine Natural products OC(=O)C(N)CCC(N)=O ZDXPYRJPNDTMRX-UHFFFAOYSA-N 0.000 description 2
- UYTPUPDQBNUYGX-UHFFFAOYSA-N guanine Chemical compound O=C1NC(N)=NC2=C1N=CN2 UYTPUPDQBNUYGX-UHFFFAOYSA-N 0.000 description 2
- NOESYZHRGYRDHS-UHFFFAOYSA-N insulin Chemical compound N1C(=O)C(NC(=O)C(CCC(N)=O)NC(=O)C(CCC(O)=O)NC(=O)C(C(C)C)NC(=O)C(NC(=O)CN)C(C)CC)CSSCC(C(NC(CO)C(=O)NC(CC(C)C)C(=O)NC(CC=2C=CC(O)=CC=2)C(=O)NC(CCC(N)=O)C(=O)NC(CC(C)C)C(=O)NC(CCC(O)=O)C(=O)NC(CC(N)=O)C(=O)NC(CC=2C=CC(O)=CC=2)C(=O)NC(CSSCC(NC(=O)C(C(C)C)NC(=O)C(CC(C)C)NC(=O)C(CC=2C=CC(O)=CC=2)NC(=O)C(CC(C)C)NC(=O)C(C)NC(=O)C(CCC(O)=O)NC(=O)C(C(C)C)NC(=O)C(CC(C)C)NC(=O)C(CC=2NC=NC=2)NC(=O)C(CO)NC(=O)CNC2=O)C(=O)NCC(=O)NC(CCC(O)=O)C(=O)NC(CCCNC(N)=N)C(=O)NCC(=O)NC(CC=3C=CC=CC=3)C(=O)NC(CC=3C=CC=CC=3)C(=O)NC(CC=3C=CC(O)=CC=3)C(=O)NC(C(C)O)C(=O)N3C(CCC3)C(=O)NC(CCCCN)C(=O)NC(C)C(O)=O)C(=O)NC(CC(N)=O)C(O)=O)=O)NC(=O)C(C(C)CC)NC(=O)C(CO)NC(=O)C(C(C)O)NC(=O)C1CSSCC2NC(=O)C(CC(C)C)NC(=O)C(NC(=O)C(CCC(N)=O)NC(=O)C(CC(N)=O)NC(=O)C(NC(=O)C(N)CC=1C=CC=CC=1)C(C)C)CC1=CN=CN1 NOESYZHRGYRDHS-UHFFFAOYSA-N 0.000 description 2
- 208000032839 leukemia Diseases 0.000 description 2
- 239000002502 liposome Substances 0.000 description 2
- 229920001220 nitrocellulos Polymers 0.000 description 2
- 238000003199 nucleic acid amplification method Methods 0.000 description 2
- 230000008520 organization Effects 0.000 description 2
- 229940049954 penicillin Drugs 0.000 description 2
- 239000000825 pharmaceutical preparation Substances 0.000 description 2
- 210000002826 placenta Anatomy 0.000 description 2
- 239000002797 plasminogen activator inhibitor Substances 0.000 description 2
- 229910052697 platinum Inorganic materials 0.000 description 2
- 229920002401 polyacrylamide Polymers 0.000 description 2
- 239000002243 precursor Substances 0.000 description 2
- 230000009979 protective mechanism Effects 0.000 description 2
- 230000007420 reactivation Effects 0.000 description 2
- 108020003175 receptors Proteins 0.000 description 2
- 102000005962 receptors Human genes 0.000 description 2
- 230000009467 reduction Effects 0.000 description 2
- 230000010076 replication Effects 0.000 description 2
- 238000003757 reverse transcription PCR Methods 0.000 description 2
- 230000007017 scission Effects 0.000 description 2
- 239000011780 sodium chloride Substances 0.000 description 2
- 230000000392 somatic effect Effects 0.000 description 2
- 229960005322 streptomycin Drugs 0.000 description 2
- RWQNBRDOKXIBIV-UHFFFAOYSA-N thymine Chemical compound CC1=CNC(=O)NC1=O RWQNBRDOKXIBIV-UHFFFAOYSA-N 0.000 description 2
- 238000005406 washing Methods 0.000 description 2
- KSXTUUUQYQYKCR-LQDDAWAPSA-M 2,3-bis[[(z)-octadec-9-enoyl]oxy]propyl-trimethylazanium;chloride Chemical compound [Cl-].CCCCCCCC\C=C/CCCCCCCC(=O)OCC(C[N+](C)(C)C)OC(=O)CCCCCCC\C=C/CCCCCCCC KSXTUUUQYQYKCR-LQDDAWAPSA-M 0.000 description 1
- QKNYBSVHEMOAJP-UHFFFAOYSA-N 2-amino-2-(hydroxymethyl)propane-1,3-diol;hydron;chloride Chemical compound Cl.OCC(N)(CO)CO QKNYBSVHEMOAJP-UHFFFAOYSA-N 0.000 description 1
- KUWPCJHYPSUOFW-YBXAARCKSA-N 2-nitrophenyl beta-D-galactoside Chemical compound O[C@@H]1[C@@H](O)[C@@H](O)[C@@H](CO)O[C@H]1OC1=CC=CC=C1[N+]([O-])=O KUWPCJHYPSUOFW-YBXAARCKSA-N 0.000 description 1
- HRGUSFBJBOKSML-UHFFFAOYSA-N 3',5'-di-O-methyltricetin Chemical compound COC1=C(O)C(OC)=CC(C=2OC3=CC(O)=CC(O)=C3C(=O)C=2)=C1 HRGUSFBJBOKSML-UHFFFAOYSA-N 0.000 description 1
- 101150110188 30 gene Proteins 0.000 description 1
- PQVHMOLNSYFXIJ-UHFFFAOYSA-N 4-[2-(2,3-dihydro-1H-inden-2-ylamino)pyrimidin-5-yl]-1-[2-oxo-2-(2,4,6,7-tetrahydrotriazolo[4,5-c]pyridin-5-yl)ethyl]pyrazole-3-carboxylic acid Chemical compound C1C(CC2=CC=CC=C12)NC1=NC=C(C=N1)C=1C(=NN(C=1)CC(N1CC2=C(CC1)NN=N2)=O)C(=O)O PQVHMOLNSYFXIJ-UHFFFAOYSA-N 0.000 description 1
- DEXFNLNNUZKHNO-UHFFFAOYSA-N 6-[3-[4-[2-(2,3-dihydro-1H-inden-2-ylamino)pyrimidin-5-yl]piperidin-1-yl]-3-oxopropyl]-3H-1,3-benzoxazol-2-one Chemical compound C1C(CC2=CC=CC=C12)NC1=NC=C(C=N1)C1CCN(CC1)C(CCC1=CC2=C(NC(O2)=O)C=C1)=O DEXFNLNNUZKHNO-UHFFFAOYSA-N 0.000 description 1
- FVFVNNKYKYZTJU-UHFFFAOYSA-N 6-chloro-1,3,5-triazine-2,4-diamine Chemical compound NC1=NC(N)=NC(Cl)=N1 FVFVNNKYKYZTJU-UHFFFAOYSA-N 0.000 description 1
- QTBSBXVTEAMEQO-UHFFFAOYSA-M Acetate Chemical compound CC([O-])=O QTBSBXVTEAMEQO-UHFFFAOYSA-M 0.000 description 1
- 102000012936 Angiostatins Human genes 0.000 description 1
- 108010079709 Angiostatins Proteins 0.000 description 1
- 101100278884 Arabidopsis thaliana E2FD gene Proteins 0.000 description 1
- 238000011725 BALB/c mouse Methods 0.000 description 1
- 101001032758 Bacillus subtilis (strain 168) General stress protein 13 Proteins 0.000 description 1
- DWRXFEITVBNRMK-UHFFFAOYSA-N Beta-D-1-Arabinofuranosylthymine Natural products O=C1NC(=O)C(C)=CN1C1C(O)C(O)C(CO)O1 DWRXFEITVBNRMK-UHFFFAOYSA-N 0.000 description 1
- 102100032367 C-C motif chemokine 5 Human genes 0.000 description 1
- 101710186200 CCAAT/enhancer-binding protein Proteins 0.000 description 1
- 101100179596 Caenorhabditis elegans ins-3 gene Proteins 0.000 description 1
- 101100179594 Caenorhabditis elegans ins-4 gene Proteins 0.000 description 1
- 241000282693 Cercopithecidae Species 0.000 description 1
- 108010055166 Chemokine CCL5 Proteins 0.000 description 1
- 108010077544 Chromatin Proteins 0.000 description 1
- 208000037051 Chromosomal Instability Diseases 0.000 description 1
- RGJOEKWQDUBAIZ-IBOSZNHHSA-N CoASH Chemical compound O[C@@H]1[C@H](OP(O)(O)=O)[C@@H](COP(O)(=O)OP(O)(=O)OCC(C)(C)[C@@H](O)C(=O)NCCC(=O)NCCS)O[C@H]1N1C2=NC=NC(N)=C2N=C1 RGJOEKWQDUBAIZ-IBOSZNHHSA-N 0.000 description 1
- 102100022641 Coagulation factor IX Human genes 0.000 description 1
- 102100031162 Collagen alpha-1(XVIII) chain Human genes 0.000 description 1
- 206010009944 Colon cancer Diseases 0.000 description 1
- 208000001333 Colorectal Neoplasms Diseases 0.000 description 1
- 108020004635 Complementary DNA Proteins 0.000 description 1
- 102000005636 Cyclic AMP Response Element-Binding Protein Human genes 0.000 description 1
- 108010045171 Cyclic AMP Response Element-Binding Protein Proteins 0.000 description 1
- FCKYPQBAHLOOJQ-UHFFFAOYSA-N Cyclohexane-1,2-diaminetetraacetic acid Chemical compound OC(=O)CN(CC(O)=O)C1CCCCC1N(CC(O)=O)CC(O)=O FCKYPQBAHLOOJQ-UHFFFAOYSA-N 0.000 description 1
- IGXWBGJHJZYPQS-SSDOTTSWSA-N D-Luciferin Chemical compound OC(=O)[C@H]1CSC(C=2SC3=CC=C(O)C=C3N=2)=N1 IGXWBGJHJZYPQS-SSDOTTSWSA-N 0.000 description 1
- WQZGKKKJIJFFOK-QTVWNMPRSA-N D-mannopyranose Chemical compound OC[C@H]1OC(O)[C@@H](O)[C@@H](O)[C@@H]1O WQZGKKKJIJFFOK-QTVWNMPRSA-N 0.000 description 1
- 102000053602 DNA Human genes 0.000 description 1
- 230000004544 DNA amplification Effects 0.000 description 1
- 230000007067 DNA methylation Effects 0.000 description 1
- 230000004543 DNA replication Effects 0.000 description 1
- 102000052510 DNA-Binding Proteins Human genes 0.000 description 1
- 108700020911 DNA-Binding Proteins Proteins 0.000 description 1
- 108010014303 DNA-directed DNA polymerase Proteins 0.000 description 1
- 102000016928 DNA-directed DNA polymerase Human genes 0.000 description 1
- 102000004163 DNA-directed RNA polymerases Human genes 0.000 description 1
- 108090000626 DNA-directed RNA polymerases Proteins 0.000 description 1
- CYCGRDQQIOGCKX-UHFFFAOYSA-N Dehydro-luciferin Natural products OC(=O)C1=CSC(C=2SC3=CC(O)=CC=C3N=2)=N1 CYCGRDQQIOGCKX-UHFFFAOYSA-N 0.000 description 1
- 241000702421 Dependoparvovirus Species 0.000 description 1
- 239000006144 Dulbecco’s modified Eagle's medium Substances 0.000 description 1
- 102100032449 EGF-like repeat and discoidin I-like domain-containing protein 3 Human genes 0.000 description 1
- 102100039578 ETS translocation variant 4 Human genes 0.000 description 1
- 241000196324 Embryophyta Species 0.000 description 1
- 108010079505 Endostatins Proteins 0.000 description 1
- 102100031690 Erythroid transcription factor Human genes 0.000 description 1
- 101710100588 Erythroid transcription factor Proteins 0.000 description 1
- 101150031329 Ets1 gene Proteins 0.000 description 1
- 108010076282 Factor IX Proteins 0.000 description 1
- 108010054218 Factor VIII Proteins 0.000 description 1
- 102000001690 Factor VIII Human genes 0.000 description 1
- BJGNCJDXODQBOB-UHFFFAOYSA-N Fivefly Luciferin Natural products OC(=O)C1CSC(C=2SC3=CC(O)=CC=C3N=2)=N1 BJGNCJDXODQBOB-UHFFFAOYSA-N 0.000 description 1
- 102000002464 Galactosidases Human genes 0.000 description 1
- 108010093031 Galactosidases Proteins 0.000 description 1
- 102000053187 Glucuronidase Human genes 0.000 description 1
- 108010060309 Glucuronidase Proteins 0.000 description 1
- 102000006587 Glutathione peroxidase Human genes 0.000 description 1
- 108700016172 Glutathione peroxidases Proteins 0.000 description 1
- 108010017080 Granulocyte Colony-Stimulating Factor Proteins 0.000 description 1
- 102000004269 Granulocyte Colony-Stimulating Factor Human genes 0.000 description 1
- 108010017213 Granulocyte-Macrophage Colony-Stimulating Factor Proteins 0.000 description 1
- 102100039620 Granulocyte-macrophage colony-stimulating factor Human genes 0.000 description 1
- 102000001398 Granzyme Human genes 0.000 description 1
- 108060005986 Granzyme Proteins 0.000 description 1
- 102100031573 Hematopoietic progenitor cell antigen CD34 Human genes 0.000 description 1
- 101000715499 Homo sapiens Catalase Proteins 0.000 description 1
- 101001016381 Homo sapiens EGF-like repeat and discoidin I-like domain-containing protein 3 Proteins 0.000 description 1
- 101000813747 Homo sapiens ETS translocation variant 4 Proteins 0.000 description 1
- 101000987586 Homo sapiens Eosinophil peroxidase Proteins 0.000 description 1
- 101000777663 Homo sapiens Hematopoietic progenitor cell antigen CD34 Proteins 0.000 description 1
- 101001046686 Homo sapiens Integrin alpha-M Proteins 0.000 description 1
- 101000935040 Homo sapiens Integrin beta-2 Proteins 0.000 description 1
- 101100046340 Homo sapiens TIMP3 gene Proteins 0.000 description 1
- 101000799461 Homo sapiens Thrombopoietin Proteins 0.000 description 1
- 101000694103 Homo sapiens Thyroid peroxidase Proteins 0.000 description 1
- 108010031792 IGF Type 2 Receptor Proteins 0.000 description 1
- 102000008394 Immunoglobulin Fragments Human genes 0.000 description 1
- 108010021625 Immunoglobulin Fragments Proteins 0.000 description 1
- 206010061218 Inflammation Diseases 0.000 description 1
- 102100023915 Insulin Human genes 0.000 description 1
- 108090001061 Insulin Proteins 0.000 description 1
- 108010008212 Integrin alpha4beta1 Proteins 0.000 description 1
- 102000014150 Interferons Human genes 0.000 description 1
- 108010050904 Interferons Proteins 0.000 description 1
- 108010002352 Interleukin-1 Proteins 0.000 description 1
- 102000003815 Interleukin-11 Human genes 0.000 description 1
- 108090000177 Interleukin-11 Proteins 0.000 description 1
- 102000003816 Interleukin-13 Human genes 0.000 description 1
- 108090000176 Interleukin-13 Proteins 0.000 description 1
- 108010002350 Interleukin-2 Proteins 0.000 description 1
- 108010002586 Interleukin-7 Proteins 0.000 description 1
- 102100021592 Interleukin-7 Human genes 0.000 description 1
- 108090001007 Interleukin-8 Proteins 0.000 description 1
- 108010092694 L-Selectin Proteins 0.000 description 1
- 102000016551 L-selectin Human genes 0.000 description 1
- 241001484259 Lacuna Species 0.000 description 1
- 241000131894 Lampyris noctiluca Species 0.000 description 1
- 108091026898 Leader sequence (mRNA) Proteins 0.000 description 1
- DDWFXDSYGUXRAY-UHFFFAOYSA-N Luciferin Natural products CCc1c(C)c(CC2NC(=O)C(=C2C=C)C)[nH]c1Cc3[nH]c4C(=C5/NC(CC(=O)O)C(C)C5CC(=O)O)CC(=O)c4c3C DDWFXDSYGUXRAY-UHFFFAOYSA-N 0.000 description 1
- 108010064548 Lymphocyte Function-Associated Antigen-1 Proteins 0.000 description 1
- 241000282560 Macaca mulatta Species 0.000 description 1
- 108010046938 Macrophage Colony-Stimulating Factor Proteins 0.000 description 1
- 102000007651 Macrophage Colony-Stimulating Factor Human genes 0.000 description 1
- 102000009571 Macrophage Inflammatory Proteins Human genes 0.000 description 1
- 108010009474 Macrophage Inflammatory Proteins Proteins 0.000 description 1
- 101000962498 Macropis fulvipes Macropin Proteins 0.000 description 1
- 102000019218 Mannose-6-phosphate receptors Human genes 0.000 description 1
- 102100039364 Metalloproteinase inhibitor 1 Human genes 0.000 description 1
- 102100026262 Metalloproteinase inhibitor 2 Human genes 0.000 description 1
- 102100026261 Metalloproteinase inhibitor 3 Human genes 0.000 description 1
- 229910019440 Mg(OH) Inorganic materials 0.000 description 1
- 241000699670 Mus sp. Species 0.000 description 1
- 102000004459 Nitroreductase Human genes 0.000 description 1
- 108091034117 Oligonucleotide Proteins 0.000 description 1
- 102000004140 Oncostatin M Human genes 0.000 description 1
- 108090000630 Oncostatin M Proteins 0.000 description 1
- 108700026244 Open Reading Frames Proteins 0.000 description 1
- 108010056995 Perforin Proteins 0.000 description 1
- KHGNFPUMBJSZSM-UHFFFAOYSA-N Perforine Natural products COC1=C2CCC(O)C(CCC(C)(C)O)(OC)C2=NC2=C1C=CO2 KHGNFPUMBJSZSM-UHFFFAOYSA-N 0.000 description 1
- 102000045595 Phosphoprotein Phosphatases Human genes 0.000 description 1
- 108700019535 Phosphoprotein Phosphatases Proteins 0.000 description 1
- IDDMFNIRSJVBHE-UHFFFAOYSA-N Piscigenin Natural products COC1=C(O)C(OC)=CC(C=2C(C3=C(O)C=C(O)C=C3OC=2)=O)=C1 IDDMFNIRSJVBHE-UHFFFAOYSA-N 0.000 description 1
- 102100024078 Plasma serine protease inhibitor Human genes 0.000 description 1
- 102000004179 Plasminogen Activator Inhibitor 2 Human genes 0.000 description 1
- 108090000614 Plasminogen Activator Inhibitor 2 Proteins 0.000 description 1
- 102000004211 Platelet factor 4 Human genes 0.000 description 1
- 108090000778 Platelet factor 4 Proteins 0.000 description 1
- 108010039918 Polylysine Proteins 0.000 description 1
- 108010001953 Protein C Inhibitor Proteins 0.000 description 1
- 108091008109 Pseudogenes Proteins 0.000 description 1
- 102000057361 Pseudogenes Human genes 0.000 description 1
- 102000009572 RNA Polymerase II Human genes 0.000 description 1
- 108010009460 RNA Polymerase II Proteins 0.000 description 1
- 108020005067 RNA Splice Sites Proteins 0.000 description 1
- 238000010240 RT-PCR analysis Methods 0.000 description 1
- 241000700159 Rattus Species 0.000 description 1
- 201000000582 Retinoblastoma Diseases 0.000 description 1
- 101100393821 Saccharomyces cerevisiae (strain ATCC 204508 / S288c) GSP2 gene Proteins 0.000 description 1
- 244000082988 Secale cereale Species 0.000 description 1
- 235000007238 Secale cereale Nutrition 0.000 description 1
- 241000710961 Semliki Forest virus Species 0.000 description 1
- 241000710960 Sindbis virus Species 0.000 description 1
- 208000037065 Subacute sclerosing leukoencephalitis Diseases 0.000 description 1
- 206010042297 Subacute sclerosing panencephalitis Diseases 0.000 description 1
- 108700025695 Suppressor Genes Proteins 0.000 description 1
- 108010044281 TATA-Box Binding Protein Proteins 0.000 description 1
- 102000007591 Tartrate-Resistant Acid Phosphatase Human genes 0.000 description 1
- 108010032050 Tartrate-Resistant Acid Phosphatase Proteins 0.000 description 1
- 102000006601 Thymidine Kinase Human genes 0.000 description 1
- 108020004440 Thymidine kinase Proteins 0.000 description 1
- 108010031374 Tissue Inhibitor of Metalloproteinase-1 Proteins 0.000 description 1
- 108010031372 Tissue Inhibitor of Metalloproteinase-2 Proteins 0.000 description 1
- 108010031429 Tissue Inhibitor of Metalloproteinase-3 Proteins 0.000 description 1
- 108010083268 Transcription Factor TFIID Proteins 0.000 description 1
- 108700009124 Transcription Initiation Site Proteins 0.000 description 1
- 102000004357 Transferases Human genes 0.000 description 1
- 108090000992 Transferases Proteins 0.000 description 1
- 239000007983 Tris buffer Substances 0.000 description 1
- 229920004890 Triton X-100 Polymers 0.000 description 1
- 239000013504 Triton X-100 Substances 0.000 description 1
- 102000004142 Trypsin Human genes 0.000 description 1
- 108090000631 Trypsin Proteins 0.000 description 1
- 108060008682 Tumor Necrosis Factor Proteins 0.000 description 1
- 101900150902 Varicella-zoster virus Thymidine kinase Proteins 0.000 description 1
- 102000005789 Vascular Endothelial Growth Factors Human genes 0.000 description 1
- 108010019530 Vascular Endothelial Growth Factors Proteins 0.000 description 1
- 240000008042 Zea mays Species 0.000 description 1
- 235000007244 Zea mays Nutrition 0.000 description 1
- INULNSAIIZKOQE-YOSAUDMPSA-N [(3r,4ar,10ar)-6-methoxy-1-methyl-3,4,4a,5,10,10a-hexahydro-2h-benzo[g]quinolin-3-yl]-[4-(4-nitrophenyl)piperazin-1-yl]methanone Chemical compound O=C([C@@H]1C[C@H]2[C@H](N(C1)C)CC=1C=CC=C(C=1C2)OC)N(CC1)CCN1C1=CC=C([N+]([O-])=O)C=C1 INULNSAIIZKOQE-YOSAUDMPSA-N 0.000 description 1
- 238000010521 absorption reaction Methods 0.000 description 1
- 150000007513 acids Chemical class 0.000 description 1
- 230000009471 action Effects 0.000 description 1
- 230000003213 activating effect Effects 0.000 description 1
- 239000012190 activator Substances 0.000 description 1
- 239000008186 active pharmaceutical agent Substances 0.000 description 1
- 239000000443 aerosol Substances 0.000 description 1
- 150000001413 amino acids Chemical group 0.000 description 1
- 230000033115 angiogenesis Effects 0.000 description 1
- 238000000137 annealing Methods 0.000 description 1
- FZCSTZYAHCUGEM-UHFFFAOYSA-N aspergillomarasmine B Natural products OC(=O)CNC(C(O)=O)CNC(C(O)=O)CC(O)=O FZCSTZYAHCUGEM-UHFFFAOYSA-N 0.000 description 1
- 239000012298 atmosphere Substances 0.000 description 1
- 238000000376 autoradiography Methods 0.000 description 1
- IQFYYKKMVGJFEH-UHFFFAOYSA-N beta-L-thymidine Natural products O=C1NC(=O)C(C)=CN1C1OC(CO)C(O)C1 IQFYYKKMVGJFEH-UHFFFAOYSA-N 0.000 description 1
- 238000003766 bioinformatics method Methods 0.000 description 1
- 230000015572 biosynthetic process Effects 0.000 description 1
- 239000003114 blood coagulation factor Substances 0.000 description 1
- 238000010805 cDNA synthesis kit Methods 0.000 description 1
- 244000309466 calf Species 0.000 description 1
- 108700021031 cdc Genes Proteins 0.000 description 1
- 230000022131 cell cycle Effects 0.000 description 1
- 210000003855 cell nucleus Anatomy 0.000 description 1
- 230000001413 cellular effect Effects 0.000 description 1
- 210000002230 centromere Anatomy 0.000 description 1
- 238000012512 characterization method Methods 0.000 description 1
- 239000003153 chemical reaction reagent Substances 0.000 description 1
- 239000003795 chemical substances by application Substances 0.000 description 1
- 210000004978 chinese hamster ovary cell Anatomy 0.000 description 1
- 210000003483 chromatin Anatomy 0.000 description 1
- 239000013599 cloning vector Substances 0.000 description 1
- RGJOEKWQDUBAIZ-UHFFFAOYSA-N coenzime A Natural products OC1C(OP(O)(O)=O)C(COP(O)(=O)OP(O)(=O)OCC(C)(C)C(O)C(=O)NCCC(=O)NCCS)OC1N1C2=NC=NC(N)=C2N=C1 RGJOEKWQDUBAIZ-UHFFFAOYSA-N 0.000 description 1
- 239000005516 coenzyme A Substances 0.000 description 1
- 229940093530 coenzyme a Drugs 0.000 description 1
- 201000010989 colorectal carcinoma Diseases 0.000 description 1
- 230000000052 comparative effect Effects 0.000 description 1
- 230000001276 controlling effect Effects 0.000 description 1
- 230000009089 cytolysis Effects 0.000 description 1
- 229940104302 cytosine Drugs 0.000 description 1
- 239000000824 cytostatic agent Substances 0.000 description 1
- 230000001085 cytostatic effect Effects 0.000 description 1
- 230000006378 damage Effects 0.000 description 1
- 230000007547 defect Effects 0.000 description 1
- 230000001419 dependent effect Effects 0.000 description 1
- KDTSHFARGAKYJN-UHFFFAOYSA-N dephosphocoenzyme A Natural products OC1C(O)C(COP(O)(=O)OP(O)(=O)OCC(C)(C)C(O)C(=O)NCCC(=O)NCCS)OC1N1C2=NC=NC(N)=C2N=C1 KDTSHFARGAKYJN-UHFFFAOYSA-N 0.000 description 1
- 238000001514 detection method Methods 0.000 description 1
- 239000005546 dideoxynucleotide Substances 0.000 description 1
- KCFYHBSOLOXZIF-UHFFFAOYSA-N dihydrochrysin Natural products COC1=C(O)C(OC)=CC(C2OC3=CC(O)=CC(O)=C3C(=O)C2)=C1 KCFYHBSOLOXZIF-UHFFFAOYSA-N 0.000 description 1
- 238000010790 dilution Methods 0.000 description 1
- 239000012895 dilution Substances 0.000 description 1
- 238000006073 displacement reaction Methods 0.000 description 1
- 229960004222 factor ix Drugs 0.000 description 1
- 229960000301 factor viii Drugs 0.000 description 1
- 230000004927 fusion Effects 0.000 description 1
- 238000001415 gene therapy Methods 0.000 description 1
- 230000012010 growth Effects 0.000 description 1
- 229940094991 herring sperm dna Drugs 0.000 description 1
- 239000005556 hormone Substances 0.000 description 1
- 229940088597 hormone Drugs 0.000 description 1
- 102000045501 human CAT Human genes 0.000 description 1
- 102000053400 human TPO Human genes 0.000 description 1
- 230000008676 import Effects 0.000 description 1
- 238000001727 in vivo Methods 0.000 description 1
- 230000004054 inflammatory process Effects 0.000 description 1
- 239000003999 initiator Substances 0.000 description 1
- 230000000977 initiatory effect Effects 0.000 description 1
- 229940125396 insulin Drugs 0.000 description 1
- 230000003993 interaction Effects 0.000 description 1
- 229940047124 interferons Drugs 0.000 description 1
- 210000005228 liver tissue Anatomy 0.000 description 1
- WLHQHAUOOXYABV-UHFFFAOYSA-N lornoxicam Chemical compound OC=1C=2SC(Cl)=CC=2S(=O)(=O)N(C)C=1C(=O)NC1=CC=CC=N1 WLHQHAUOOXYABV-UHFFFAOYSA-N 0.000 description 1
- 238000003670 luciferase enzyme activity assay Methods 0.000 description 1
- 210000004698 lymphocyte Anatomy 0.000 description 1
- 230000003211 malignant effect Effects 0.000 description 1
- 239000003550 marker Substances 0.000 description 1
- 230000000394 mitotic effect Effects 0.000 description 1
- 230000004048 modification Effects 0.000 description 1
- 238000012986 modification Methods 0.000 description 1
- 239000013642 negative control Substances 0.000 description 1
- 238000007857 nested PCR Methods 0.000 description 1
- 210000000440 neutrophil Anatomy 0.000 description 1
- 108020001162 nitroreductase Proteins 0.000 description 1
- 108091027963 non-coding RNA Proteins 0.000 description 1
- 102000042567 non-coding RNA Human genes 0.000 description 1
- 231100000252 nontoxic Toxicity 0.000 description 1
- 230000003000 nontoxic effect Effects 0.000 description 1
- 102000039446 nucleic acids Human genes 0.000 description 1
- 108020004707 nucleic acids Proteins 0.000 description 1
- 150000007523 nucleic acids Chemical class 0.000 description 1
- 108010089433 obelin Proteins 0.000 description 1
- 238000011275 oncology therapy Methods 0.000 description 1
- 108091008819 oncoproteins Proteins 0.000 description 1
- 102000027450 oncoproteins Human genes 0.000 description 1
- 230000036961 partial effect Effects 0.000 description 1
- 230000001717 pathogenic effect Effects 0.000 description 1
- 229930192851 perforin Natural products 0.000 description 1
- 239000000546 pharmaceutical excipient Substances 0.000 description 1
- 230000008488 polyadenylation Effects 0.000 description 1
- 229920000656 polylysine Polymers 0.000 description 1
- 239000013641 positive control Substances 0.000 description 1
- 230000004481 post-translational protein modification Effects 0.000 description 1
- 230000001124 posttranscriptional effect Effects 0.000 description 1
- 239000002244 precipitate Substances 0.000 description 1
- 230000002028 premature Effects 0.000 description 1
- 239000002987 primer (paints) Substances 0.000 description 1
- 108020001580 protein domains Proteins 0.000 description 1
- 230000004853 protein function Effects 0.000 description 1
- 238000000746 purification Methods 0.000 description 1
- 230000005855 radiation Effects 0.000 description 1
- 230000002285 radioactive effect Effects 0.000 description 1
- 230000006798 recombination Effects 0.000 description 1
- 238000005215 recombination Methods 0.000 description 1
- 238000011084 recovery Methods 0.000 description 1
- 230000001172 regenerating effect Effects 0.000 description 1
- 230000014493 regulation of gene expression Effects 0.000 description 1
- 230000037425 regulation of transcription Effects 0.000 description 1
- 230000022532 regulation of transcription, DNA-dependent Effects 0.000 description 1
- 230000008844 regulatory mechanism Effects 0.000 description 1
- 210000005084 renal tissue Anatomy 0.000 description 1
- 230000003362 replicative effect Effects 0.000 description 1
- 230000008943 replicative senescence Effects 0.000 description 1
- 229920006395 saturated elastomer Polymers 0.000 description 1
- 238000012216 screening Methods 0.000 description 1
- 230000009758 senescence Effects 0.000 description 1
- 238000011896 sensitive detection Methods 0.000 description 1
- 238000012163 sequencing technique Methods 0.000 description 1
- 239000012679 serum free medium Substances 0.000 description 1
- 238000004904 shortening Methods 0.000 description 1
- 239000012064 sodium phosphate buffer Substances 0.000 description 1
- 230000000087 stabilizing effect Effects 0.000 description 1
- 239000012089 stop solution Substances 0.000 description 1
- 229940124597 therapeutic agent Drugs 0.000 description 1
- 229940104230 thymidine Drugs 0.000 description 1
- 229940113082 thymine Drugs 0.000 description 1
- 210000001541 thymus gland Anatomy 0.000 description 1
- 230000005026 transcription initiation Effects 0.000 description 1
- 230000002103 transcriptional effect Effects 0.000 description 1
- 239000012096 transfection reagent Substances 0.000 description 1
- 238000003146 transient transfection Methods 0.000 description 1
- BMCJATLPEJCACU-UHFFFAOYSA-N tricin Natural products COc1cc(OC)c(O)c(c1)C2=CC(=O)c3c(O)cc(O)cc3O2 BMCJATLPEJCACU-UHFFFAOYSA-N 0.000 description 1
- LENZDBCJOHFCAS-UHFFFAOYSA-N tris Chemical compound OCC(N)(CO)CO LENZDBCJOHFCAS-UHFFFAOYSA-N 0.000 description 1
- 239000012588 trypsin Substances 0.000 description 1
- 241000701161 unidentified adenovirus Species 0.000 description 1
- 241001430294 unidentified retrovirus Species 0.000 description 1
- 230000003612 virological effect Effects 0.000 description 1
- XLYOFNOQVPJJNP-UHFFFAOYSA-N water Substances O XLYOFNOQVPJJNP-UHFFFAOYSA-N 0.000 description 1
- 210000005253 yeast cell Anatomy 0.000 description 1
- DGVVWUTYPXICAM-UHFFFAOYSA-N β‐Mercaptoethanol Chemical compound OCCS DGVVWUTYPXICAM-UHFFFAOYSA-N 0.000 description 1
Classifications
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12N—MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
- C12N9/00—Enzymes; Proenzymes; Compositions thereof; Processes for preparing, activating, inhibiting, separating or purifying enzymes
- C12N9/10—Transferases (2.)
- C12N9/12—Transferases (2.) transferring phosphorus containing groups, e.g. kinases (2.7)
- C12N9/1241—Nucleotidyltransferases (2.7.7)
-
- A—HUMAN NECESSITIES
- A01—AGRICULTURE; FORESTRY; ANIMAL HUSBANDRY; HUNTING; TRAPPING; FISHING
- A01K—ANIMAL HUSBANDRY; AVICULTURE; APICULTURE; PISCICULTURE; FISHING; REARING OR BREEDING ANIMALS, NOT OTHERWISE PROVIDED FOR; NEW BREEDS OF ANIMALS
- A01K2217/00—Genetically modified animals
- A01K2217/05—Animals comprising random inserted nucleic acids (transgenic)
-
- A—HUMAN NECESSITIES
- A61—MEDICAL OR VETERINARY SCIENCE; HYGIENE
- A61K—PREPARATIONS FOR MEDICAL, DENTAL OR TOILETRY PURPOSES
- A61K38/00—Medicinal preparations containing peptides
Landscapes
- Life Sciences & Earth Sciences (AREA)
- Chemical & Material Sciences (AREA)
- Health & Medical Sciences (AREA)
- Genetics & Genomics (AREA)
- Organic Chemistry (AREA)
- Zoology (AREA)
- Engineering & Computer Science (AREA)
- Bioinformatics & Cheminformatics (AREA)
- Wood Science & Technology (AREA)
- Microbiology (AREA)
- Biotechnology (AREA)
- Biomedical Technology (AREA)
- Molecular Biology (AREA)
- Biochemistry (AREA)
- General Engineering & Computer Science (AREA)
- General Health & Medical Sciences (AREA)
- Medicinal Chemistry (AREA)
- Micro-Organisms Or Cultivation Processes Thereof (AREA)
- Enzymes And Modification Thereof (AREA)
- Medicines That Contain Protein Lipid Enzymes And Other Medicines (AREA)
- Measuring Or Testing Involving Enzymes Or Micro-Organisms (AREA)
Description
Regulatory DNA sequences of the gene for the human catalytic telomerase subunit, and their diagnostic and therapeutic use
Structure and function of the chromosome ends
The genetic material of eukaryotic cells is distributed on linear chromosomes. The ends of these hereditary units are termed telomeres, derived from the Greek words telos (end) and meros (part, segment). Most telomeres consist of repeats of short sequences which are mainly composed of thymine and guanine (Zakian, 1995). In all the vertebrates which have so far been investigated, the telomeres consist of the sequence TTAGGG (Meyne et al., 1989).
The telomeres have a variety of important functions. They prevent the fusion of chromosomes (McClintock, 1941) and thus the formation of dicentric hereditary units. Such chromosomes having two centromeres can lead to the development of cancer due to loss of heterozygosity, or to the duplication or loss of genes.
In addition, telomeres serve the purpose of distinguishing intact hereditary units from damaged hereditary units. Thus, yeast cells ceased their cell division when they contained a chromosome without a telomere (Sandell and Zakian, 1993).
Telomeres fulfil another important task in association with the replication of eukaryotic cell DNA. In contrast to the circular genomes of prokaryotes, the linear chromosomes of eukaryotes cannot be completely replicated by the DNA polymerase complex. RNA primers are required to initiate DNA replication. After elimination of the RNA primers, extension of the Okazaki fragments and subsequent ligation, the newly synthesized DNA strand lacks the 5' end since the RNA primer cannot be replaced by DNA at that point. Without special protective mechanisms, the chromosomes would therefore shrink with each cell division ("end-replication problem"; Harley et al., 1990). The non-coding telomere sequences presumably constitute a buffer zone for preventing the loss of genes (Sandell and Zakian, 1993).
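The end-replication problem described above can be illustrated with a toy calculation. The following Python sketch assumes, purely for illustration, a constant loss of telomeric sequence per cell division; the function names and all numerical values are hypothetical and are not taken from the description.

```python
import math


def telomere_length_after(initial_bp: int, loss_per_division_bp: int, divisions: int) -> int:
    """Remaining telomere length (bp) after a number of divisions, never below zero."""
    return max(initial_bp - loss_per_division_bp * divisions, 0)


def divisions_until_exhausted(initial_bp: int, loss_per_division_bp: int) -> int:
    """Number of divisions after which the telomeric buffer zone is used up."""
    if loss_per_division_bp <= 0:
        raise ValueError("loss_per_division_bp must be positive")
    return math.ceil(initial_bp / loss_per_division_bp)


# Hypothetical values chosen only to show the arithmetic.
initial_length = 10_000   # assumed starting telomere length in bp
loss = 100                # assumed loss per division in bp

print(telomere_length_after(initial_length, loss, 30))
print(divisions_until_exhausted(initial_length, loss))
```

In this simplified model the non-coding telomere acts as a buffer that is consumed linearly; real shortening rates vary between cell types and are not constant.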
In addition to this, telomeres also play an important role in regulating cell ageing (Olovnikov, 1973). Human somatic cells exhibit a limited capacity for replication in culture; after a certain period of time, they become senescent. In this state, the cells no longer divide even after having been stimulated with growth factors; however, they do not die and remain metabolically active (Goldstein, 1990). Various observations support the hypothesis that a cell determines how many more times it can divide on the basis of the length of its telomeres (Allsopp et al., 1992).
In summary, the telomeres consequently possess key functions in the ageing of cells, in stabilizing the genetic material and in preventing cancer.
The enzyme telomerase synthesizes the telomeres
As described above, organisms which possess linear chromosomes can only replicate their genome incompletely in the absence of a special protective mechanism. Most eukaryotes use a special enzyme, i.e. telomerase, for regenerating the telomere sequences. Telomerase is expressed constitutively in the single-cell organisms which have so far been investigated. On the other hand, telomerase activity has only been measured in humans in germ cells and tumour cells, whereas neighbouring somatic tissue did not contain any telomerase (Kim et al., 1994).
Telomerase can also be designated functionally as terminal telomere transferase, which is located in the cell nucleus as a multiprotein complex. While the RNA moiety of human telomerase has been known for a relatively long period of time (Feng et al., 1995), the catalytic subunit of this enzyme group was recently identified in a variety of organisms (Lingner et al., 1997; cf. our application PCT EP/98/03468 which is likewise pending). These catalytic subunits of telomerase are strikingly homologous both among themselves and in relation to all previously known reverse transcriptases.
WO 98/14592 also describes nucleic acid and amino acid sequences of the catalytic telomerase subunit.
Activation of telomerase in human tumours
It was originally only possible to demonstrate telomerase activity in humans in germ line cells and not in normal somatic cells (Hastie et al., 1990; Kim et al., 1994). Following the development of a more sensitive detection method (Kim et al., 1994), a low telomerase activity was also detected in hematopoietic cells (Broccoli et al., 1995; Counter et al., 1995; Hiyama et al., 1995). These cells nevertheless exhibited a reduction in the telomeres (Vaziri et al., 1994; Counter et al., 1995). It has still not been resolved whether the quantity of enzyme in these cells is insufficient to compensate for the telomere loss or whether the telomerase activity which is measured stems from a subpopulation, e.g. incompletely differentiated CD34+38+ precursor cells (Hiyama et al., 1995). In order to resolve this, it would be necessary to detect telomerase activity in a single cell.
Interestingly, however, significant telomerase activity was detected in a large number of the tumour tissues which had thus far been tested (1734/2031, 85%; Shay, 1997), whereas virtually no activity was found in normal somatic tissue (1/196, <1%; Shay, 1997). In addition, various investigations have shown that the telomeres still shrank in senescent cells which were transformed with viral oncoproteins, and it was only possible to detect telomerase in the subpopulation which survived the growth crisis (Counter et al., 1992). The telomeres were also stable in these immortalized cells (Counter et al., 1992). Similar findings from investigations in mice (Blasco et al., 1996) support the assumption that reactivation of the telomerase is a late event in tumorigenesis.
Based on these results, a "telomerase hypothesis" was developed which links the loss of telomere sequences and cell ageing with telomerase activity and the development of cancer.
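The detection frequencies quoted above from Shay (1997) can be checked with a few lines of arithmetic. The Python sketch below uses only the counts given in the text; the function name is an illustrative choice.

```python
def detection_rate(positive: int, tested: int) -> float:
    """Percentage of tested samples in which telomerase activity was detected."""
    if tested <= 0:
        raise ValueError("tested must be positive")
    return 100.0 * positive / tested


# Counts as quoted in the text (Shay, 1997).
tumour_rate = detection_rate(1734, 2031)
normal_rate = detection_rate(1, 196)

print(f"tumour tissue: {tumour_rate:.1f}% telomerase-positive")
print(f"normal somatic tissue: {normal_rate:.2f}% telomerase-positive")
```

The computed values agree with the rounded figures in the text, i.e. about 85% for tumour tissue and less than 1% for normal somatic tissue.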
In long-lived species such as humans, the shrinking of the telomeres can be regarded as being a mechanism for suppressing tumours. Differentiated cells which do not contain any telomerase cease their cell division at a particular telomere length. If such a cell mutates, it can only form a tumour if the cell can extend its telomeres. Otherwise, the cell would continue to lose telomere sequences until its chromosomes became unstable and it was finally destroyed. Telomerase reactivation is presumably the main mechanism used by tumour cells to stabilize their telomeres.
It follows from these observations and considerations that it should be possible to treat tumours by inhibiting the telomerase. Conventional cancer therapies using cytostatic agents or short-wave radiation damage all the dividing cells in the body in addition to the tumour cells. However, since only germ line cells, apart from tumour cells, contain significant telomerase activity, telomerase inhibitors would attack the tumour cells more specifically and consequently elicit fewer undesirable side effects. Telomerase activity has been detected in all the tumour tissues which have so far been tested, which means that these therapeutic agents could be employed against all types of cancer. The effect of telomerase inhibitors would then set in when the telomeres of the cells had shortened to such an extent that the genome became unstable. Since tumour cells usually possess telomeres which are shorter than those of normal somatic cells, cancer cells would be the first to be eliminated by the telomerase inhibitors. By contrast, cells possessing long telomeres, such as the germ cells, would only be damaged at a much later date. Telomerase inhibitors consequently represent a potential way forward in the treatment of cancer.
Unambiguous answers to the question of the nature and points of attack of physiological telomerase inhibitors can be obtained once the manner in which expression of the telomerase gene is regulated has also been identified.
Regulation of gene expression in eukaryotes
There are a large number of points in eukaryotic gene expression, i.e. the cellular flow of information from the DNA to the protein by way of the RNA, at which regulatory mechanisms can exert an effect. Examples of individual control steps are gene amplification, the recombination of gene loci, chromatin structure, DNA methylation, transcription, post-transcriptional modifications of mRNA, mRNA transport, translation and post-translational modifications of proteins. Studies which have been carried out to date indicate that control at the level of transcription initiation is of the greatest importance (Latchman, 1991).
A region which is responsible for regulating transcription, and which is designated the promoter region, is located directly upstream of the transcription start of a gene which is transcribed by RNA polymerase II. Comparison of the nucleotide sequences of promoter regions from a large number of known genes shows that particular sequence motifs occur regularly in this region. These elements include, inter alia, the TATA box, the CCAAT box and the GC box, which elements are recognized by specific proteins. The TATA box, which is located about 30 nucleotides upstream of the transcription start, is, for example, recognized by the TFIID subunit TBP ("TATA box-binding protein"), whereas particular GC-rich sequence segments are specifically bound by the transcription factor Sp1 ("specificity protein 1").
The promoter can be functionally subdivided into a regulatory segment and a constitutive segment (Latchman, 1991). The constitutive control region comprises the so-called core promoter which enables transcription to be initiated correctly.
This promoter contains the sequence elements which are described as UPEs (upstream promoter elements) which are necessary for efficient transcription. The regulatory control segments, which can be interlaced with the UPEs, possess sequence elements which can be involved in the signal-dependent regulation of transcription by hormones, growth factors, etc. They impart tissue-specific or cell-specific promoter properties.
DNA segments which are able to exert an influence on gene expression over relatively large distances are a characteristic feature of eukaryotic genes. These elements can be located upstream or downstream of a transcription unit, or within the unit, and can perform their function independently of their orientation. These sequence segments may reinforce (enhancers) or attenuate (silencers) promoter activity. In a similar way to the promoter regions, enhancers and silencers also accommodate several binding sites for transcription factors.
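The promoter motifs named in this section (TATA box, CCAAT box, GC box) can be located in a sequence by a simple pattern search. The following Python sketch is a minimal illustration under stated assumptions: the patterns are simplified textbook consensus sequences rather than patterns taken from this description, only the given strand is scanned, and the example sequence is invented for demonstration.

```python
import re
from typing import Dict, List

# Simplified consensus patterns (assumptions for illustration only).
MOTIFS: Dict[str, str] = {
    "TATA box": "TATA[AT]A[AT]",
    "CCAAT box": "CCAAT",
    "GC box": "GGGCGG",
}


def find_motifs(sequence: str) -> Dict[str, List[int]]:
    """Return the 0-based start positions of each motif on the given strand."""
    seq = sequence.upper()
    hits: Dict[str, List[int]] = {}
    for name, pattern in MOTIFS.items():
        # A lookahead is used so that overlapping occurrences are also reported.
        hits[name] = [m.start() for m in re.finditer(f"(?=({pattern}))", seq)]
    return hits


# Invented example sequence, not part of any SEQ ID NO of this document.
example = "GGGGCGGAACCAATCGTATAAAAGC"
for motif, positions in find_motifs(example).items():
    print(motif, positions)
```

A match found in this way only indicates a candidate site; whether a motif is functional depends on its position relative to the transcription start and on experimental confirmation.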
The invention relates to the DNA sequences from the 5'-flanking region of the gene for the catalytically active human telomerase subunit and intron sequences for this gene.
The invention particularly relates to the 5'-flanking regulatory DNA sequence which contains the promoter DNA sequence for the gene for the human catalytic telomerase subunit, as depicted in Fig. 10 (SEQ ID NO 3).
The invention furthermore relates to part regions of the 5'-flanking regulatory DNA sequence, as depicted in Fig. 4 (SEQ ID NO 1), which have a regulatory effect.
Intron sequences for the gene for the human catalytic telomerase subunit, in particular those sequences which have a regulatory effect, are also part of the subject matter of the present invention. The intron sequences according to the invention are described in detail in the context of Example 5 (cf. SEQ ID NO 4 to 20).
The invention furthermore relates to a recombinant construct which comprises the DNA sequences according to the invention, in particular the 5'-flanking DNA sequence of the gene for the human catalytic telomerase subunit, or part regions thereof.
Preference is given to recombinant constructs which, in addition to the DNA sequences according to the invention, in particular the 5'-flanking DNA sequence of the gene for the human catalytic telomerase subunit, or part regions thereof, also contain one or more additional DNA sequences which encode polypeptides or proteins.
According to a particularly preferred embodiment, these additional DNA sequences encode antineoplastic proteins.
Particular preference is given to those antineoplastic proteins which inhibit angiogenesis directly or indirectly. Examples of these proteins are:
plasminogen activator inhibitor (PAI-1), PAI-2, PAI-3, angiostatin, endostatin, platelet factor 4, TIMP-1, TIMP-2, TIMP-3 and leukaemia inhibitory factor (LIF).
Antineoplastic proteins which have a direct or indirect cytostatic effect on tumours are likewise particularly preferred. These proteins include, in particular:
perforin, granzyme, IL-2, IL-4, IL-12, interferons, such as IFN-α, IFN-β and IFN-γ, TNF, TNF-α, TNF-β, oncostatin M; tumour suppressor genes, such as p53, retinoblastoma.
Particular preference is furthermore given to antineoplastic proteins which, where appropriate in addition to their antineoplastic effect, stimulate inflammations and thereby contribute to the elimination of tumour cells. Examples of these proteins are:
RANTES, monocyte chemotactic and activating factor (MCAF), IL-8, macrophage inflammatory protein (MIP-1α, -β), neutrophil activating protein-2 (NAP-2), IL-3, IL-5, human leukaemia inhibitory factor (LIF), IL-7, IL-11, IL-13, GM-CSF, G-CSF and M-CSF.
Particular preference is furthermore given to antineoplastic proteins which, due to their action as enzymes, are able to convert precursors of an antineoplastic active compound into an antineoplastic active compound. Examples of these enzymes are:
herpes simplex virus thymidine kinase, varicella zoster virus thymidine kinase, bacterial nitroreductase, bacterial β-glucuronidase, plant β-glucuronidase from Secale cereale, human glucuronidase, human carboxypeptidase, bacterial carboxypeptidase, bacterial β-lactamase, bacterial cytosine deaminase, human catalase and/or phosphatase, human alkaline phosphatase, type 5 acid phosphatase, human lysooxidase, human acid D-aminooxidase, human glutathione peroxidase, human eosinophil peroxidase and human thyroid peroxidase.
The abovementioned recombinant constructs can also contain DNA sequences which encode factor VIII or factor IX, or part fragments thereof. These DNA sequences also include other blood clotting factors.
The abovementioned recombinant constructs can also contain DNA sequences which encode a reporter protein.
Examples of these reporter proteins are:
chloramphenicol acetyl transferase (CAT), glow-worm luciferase (LUC), β-galactosidase (β-Gal), secreted alkaline phosphatase (SEAP), human growth hormone (hGH), β-glucuronidase (GUS), green-fluorescing protein (GFP), and all the variants derived therefrom, aequorin and obelin.
Recombinant constructs according to the invention can also contain DNA which encodes the human catalytic telomerase subunit and its variants and fragments in the antisense orientation. Where appropriate, these constructs can also contain other protein subunits of the human telomerase and the telomerase RNA component in the antisense orientation.
The recombinant constructs can, in addition to the DNA which encodes the human catalytic telomerase subunit, and its variants and fragments, also contain other protein subunits of the human telomerase and the telomerase RNA component.
The invention furthermore relates to a vector which contains the abovementioned DNA sequences according to the invention, in particular the 5'-flanking DNA sequences and also one or more of the other DNA sequences mentioned above.
The preferred vector for these constructs is a virus, for example a retrovirus, an adenovirus, an adeno-associated virus, a herpes simplex virus, a vaccinia virus, a lentivirus, a Sindbis virus or a Semliki forest virus.
Preference is also given to using plasmids as vectors.
The invention furthermore relates to pharmaceutical preparations which comprise recombinant constructs or vectors according to the invention; for example a preparation in a colloidal dispersion system.
Examples of suitable colloidal dispersion systems are liposomes or polylysine ligands.
The preparations of the constructs or vectors according to the invention in colloidal dispersion systems can be supplemented with a ligand which binds to the membrane structures of tumour cells.
Such a ligand can, for example, be attached to the construct or the vector or else be a component of the liposome structure.
Suitable ligands are, in particular, polyclonal or monoclonal antibodies, or antibody fragments thereof, which bind, by their variable domains, to the membrane structures of tumour cells, or substances carrying mannose terminally, cytokines or growth factors, or fragments or part sequences thereof, which bind to receptors on tumour cells.
Examples of corresponding membrane structures are receptors for a cytokine or a growth factor, such as IL-1, EGF, PDGF, VEGF, TGF-β, insulin or insulin-like growth factor (ILGF), or adhesion molecules, such as SLeX, LFA-1, MAC-1, LECAM-1 or VLA-4, or the mannose-6-phosphate receptor.
The present invention includes pharmaceutical preparations which, in addition to the vector constructs according to the invention, can also comprise non-toxic, inert, pharmaceutically suitable excipients. It is possible to conceive of administering (e.g. intravenously, intraarterially, intramuscularly, subcutaneously, intradermally, anally, vaginally, nasally, transdermally, intraperitoneally, as an aerosol or orally) these preparations at the site of a tumour or administering them systemically.
The vector constructs according to the invention can be employed in gene therapy.
The invention furthermore relates to a recombinant host cell, in particular a recombinant eukaryotic host cell, which harbours the above-described constructs or vectors.
The invention furthermore relates to a process for identifying substances which affect the promoter activity, silencer activity or enhancer activity of the catalytic telomerase subunit, with this process comprising the following steps:
A. adding a candidate substance to a host cell which harbours the regulatory DNA sequence according to the invention, in particular the 5'-flanking regulatory DNA sequence for the gene for the human catalytic telomerase subunit, or a part region thereof which has a regulatory effect, which sequence or part region is functionally linked to a reporter gene, and
B. measuring the effect of the substance on expression of the reporter gene.
The process can be employed for identifying substances which increase the promoter activity, silencer activity or enhancer activity of the catalytic telomerase subunit.
The process can furthermore be employed for identifying substances which inhibit the promoter activity, silencer activity or enhancer activity of the catalytic telomerase subunit.
The invention furthermore relates to a process for identifying factors which bind specifically to fragments of the DNA sequences according to the invention, in particular the 5'-flanking regulatory DNA sequence of the catalytic telomerase subunit. This method comprises screening an expression cDNA library using the above-described DNA sequence, or subfragments of widely differing length, as the probe.
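Steps A and B of the screening process above amount to comparing reporter gene expression with and without the candidate substance. The Python sketch below shows one possible way of evaluating such a comparison. It is not a method taken from this description: the function names, the two-fold threshold and the example readings are hypothetical assumptions chosen for illustration.

```python
def fold_change(treated_signal: float, untreated_signal: float) -> float:
    """Ratio of reporter signal with the candidate substance to the signal without it."""
    if untreated_signal <= 0:
        raise ValueError("untreated_signal must be positive")
    return treated_signal / untreated_signal


def classify_substance(treated_signal: float, untreated_signal: float,
                       threshold: float = 2.0) -> str:
    """Classify a candidate substance using a hypothetical fold-change threshold."""
    ratio = fold_change(treated_signal, untreated_signal)
    if ratio >= threshold:
        return "increases reporter expression"
    if ratio <= 1.0 / threshold:
        return "inhibits reporter expression"
    return "no clear effect"


# Hypothetical reporter readings in arbitrary units.
print(classify_substance(300.0, 100.0))
print(classify_substance(40.0, 100.0))
print(classify_substance(120.0, 100.0))
```

In practice the threshold would be set from replicate measurements and suitable controls rather than fixed in advance.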
The above-described constructs or vectors can also be used for preparing transgenic animals.
The invention furthermore relates to a process for detecting telomerase-associated conditions in a patient, which process comprises the following steps:
A. incubating a construct or vector, which contains the DNA sequence according to the invention, in particular the 5'-flanking regulatory DNA sequence for the gene for the human catalytic telomerase subunit, or a part region thereof having a regulatory effect, and a reporter gene, with body fluids or cell samples,
B. detecting the activity of the reporter gene in order to obtain a diagnostic value; and
C. comparing the diagnostic value with standard values for the reporter gene construct in standardized normal cells or body fluids of the same type as the test sample.
The detection of diagnostic values which are higher or lower than the standard comparative values indicates a telomerase-associated condition, which in turn indicates a pathogenic condition.
Explanation of the figures:
Fig. 1: Southern blot analysis using genomic DNA from various species
A: Photograph of an ethidium bromide-stained 0.7% agarose gel containing approximately 4 µg of Eco RI-cut genomic DNA. Track 1 contains Hind III-cut λ DNA as size markers (23.5, 9.4, 6.7, 4.4, 2.3, 2.0 and 0.6 kb). Tracks 2 to 10 contain human, rhesus monkey, Sprague-Dawley rat, BALB/c mouse, dog, bovine, rabbit, chicken and yeast (Saccharomyces cerevisiae) genomic DNA.
B: Autoradiogram, corresponding to Fig. 1 A, of a Southern blot analysis in which a radioactively labelled hTC cDNA probe of about 720 bp in length is used for the hybridization.
Fig. 2: Restriction analysis of the recombinant λ DNA of the phage clone P12, which hybridizes with a probe from the 5' region of the hTC cDNA.
The figure shows a photograph of an ethidium bromide-stained 0.4% agarose gel.
Tracks 1 and 2 contain Eco RI/Hind III-cut λ DNA and a 1 kb ladder from Gibco as size markers. Tracks 3 to 7 each contain 250 ng of the DNA from the recombinant phage which has been cut with Bam HI (track 3), Eco RI (track 4), Sal I (track 5), Xho I (track 6) and Sac I (track 7). The arrows mark the two λ arms of the vector EMBL3 Sp6/T7.
Fig. 3: Restriction analysis and Southern blot analysis of the recombinant λ DNA of the phage clone which hybridizes with a probe from the 5' region of the hTC cDNA.
A: The figure shows a photograph of an ethidium bromide-stained 0.8% agarose gel. Tracks 1 and 15 contain a 1 kb ladder from Gibco as size markers. Tracks 2 to 14 each contain 250 ng of cut λ DNA from the recombinant phage clone. The following enzymes were employed: track 2: Sac I; track 3: Xho I; track 4: Xho I, Xba I; track 5: Sac I, Xho I; track 6: Sal I, Xho I, Xba I; track 7: Sac I, Xho I, Xba I; track 8: Sac I, Sal I, Xba I; track 9: Sac I, Sal I, BamH I; track 10: Sac I, Sal I, Xho I; track 11: Not I; track 12: Sma I; track 13: empty; track 14: not digested.
B: Autoradiogram, corresponding to Fig. 3 A, of a Southern blot analysis. A 5'-hTC cDNA fragment of about 420 bp in length was used as the probe for the hybridization.
Fig. 4: Partial DNA sequence of the 5'-flanking region and of the promoter of the gene for the human catalytic telomerase subunit.
The ATG start codon in the sequence is printed in bold. The depicted sequence corresponds to SEQ ID NO 1.
Fig. 5: Use of primer extension analysis to identify the transcription start.
The figure shows an autoradiogram of a denaturing polyacrylamide gel which was selected for depicting a primer extension analysis. An oligonucleotide having the sequence 5'-GTTAAGTTGTAGCTTACACTGGTTCTC-3' was used as the primer. The primer extension reaction was loaded in track 1. Tracks G, A, T and C constitute the sequence reactions using the same primer and the corresponding dideoxynucleotides.
The thick arrow marks the main transcription start while the thin arrows point to three subsidiary transcription start points.

Fig. 6: cDNA sequence of the human catalytic telomerase subunit (hTC; cf. our pending application PCT/EP/98/03468). The depicted sequence corresponds to SEQ ID NO 2.

Fig. 7: Structural organization and restriction map of the human hTC gene and its 5'-flanking and 3'-flanking regions.

Exons are shown as consecutively numbered rectangles which are filled in black, and introns are shown as regions which are not filled in. Untranslated sequence segments in the exons are hatched. Translation starts in exon 1 and ends in exon 16. Restriction enzyme cleavage sites are marked as follows: S, SacI; X, XhoI. The relative arrangement of the five phage clones (P2, P3, P5, P12, P17), and of the product from the genome walking, is shown by thin lines. As the dots indicate, the sequence of intron 16 has only been partly deciphered.

Fig. 8: hTC splice variants.

A: Diagrammatic structure of the hTC mRNA splice variants. The complete hTC mRNA is depicted as a rectangle with a grey background in the upper region of the figure. The 16 exons are depicted in accordance with their size. The translation start (ATG) and the stop codon, and also the telomerase-specific T motif and the seven RT motifs, are all shown. The hTC variants are subdivided into deletion and insertion variants. The missing exon sequences are marked in the deletions. The insertions are shown by additional white rectangles. The sizes and origins of the inserted sequences are given. Newly formed stop codons are marked. The size of the insertion in variant INS2 is unknown.

B: Exon-intron transitions in the hTC splice variants. Unspliced 5'-flanking and 3'-flanking sequences are shown as white rectangles. The origins of the exon and intron sequences are given. Intron and exon sequences are shown in small letters and large letters, respectively.
The donor and acceptor sequences in the splice sites are underlaid as grey rectangles, and their exon and intron origins are also given.

Fig. 9: Identification of the transcription start by means of RT-PCR analysis.

The RT-PCR was carried out using a cDNA library prepared from HL 60 cells and genomic DNA as the positive control. A common 3' primer hybridizes to a region of the exon 1 sequence. The positions of the different 5' primers in the coding region or the 5'-flanking region are given. In the negative control, no template DNA was added to the PCR reaction. M: DNA size marker.

Fig. 10: Nucleotide sequence and structural features of the hTC promoter.

The figure depicts 11273 bp of the 5'-flanking hTC gene sequence, beginning with the translation start codon ATG (+1). The putative region of the translation start is underlined. Possible regulatory sequence segments within the 4000 bp upstream of the translation start are ringed. The depicted sequence corresponds to SEQ ID NO 3.

Fig. 11: Activity of the hTC promoter in HEK-293 cells.

The first 5000 bp of the 5'-flanking hTC gene region are shown diagrammatically in the upper part of the figure. The ATG start codon is picked out. CpG-rich islands are marked by grey rectangles. The sizes of the hTC promoter-luciferase constructs are shown on the left-hand side of the figure. The promoterless pGL2 basic construct and the SV40 promoter construct pGL2-Pro were used as controls in each transfection. The relative luciferase activities of the different promoter constructs in HEK cells are shown as continuous bars on the right-hand side of the figure. The standard deviation is indicated. The numerical values represent the average of two independent experiments which were carried out in duplicate.

Tab. 1: Exon-intron transitions in the hTC gene

The table lists the nucleotide sequences at the 3' and 5' splice transitions of the hTC gene.
The consensus sequences for the donor and acceptor sites (GT and AG, respectively) are underlaid with grey rectangles. The table shows the intron sequences (small letters) and exon sequences (large letters) which flank the splice acceptor and donor sites. The sizes of the exons and introns are given in bp.

Tab. 2: Potential binding sites for DNA-binding factors in the nucleotide sequence of intron 2

The search for possible DNA-binding factors (e.g. transcription factors) was carried out using the "find pattern" algorithm from the Genetics Computer Group (Madison, USA) GCG sequence analysis program package. The table lists the abbreviations of the DNA-binding factors which were identified and their location in intron 2.
Tab. 1
[Table body (exon-intron transition sequences with exon and intron sizes) is not recoverable from the source scan.]

Tab. 2

Factors                   Location in intron 2
C/EBP                     2925
CRE.2                     2749
Sp1                       2378, 4094, 4526, 4787, 4835, 4995
AP-2 CS3                  5099
AP-2 CS4                  2213, 3699, 4667, 5878, 5938, 6059, 6180, 6496
AP-2 CS5                  5350, 5798, 5880, 5940, 6061, 6182, 6375, 6498
PEA3                      934, 2505
P53                       2125
GR uteroglobin            848, 1487, 2956
PR uteroglobin            3331
Zeste-white               1577, 1619, 1703, 1745, 1787, 1829, 1871, 1913, 1955, 1997, 2039, 2081, 3518, 3709, 4765, 5014, 5055
GRE                       846
MyoD-MCK right site/rev   447, 509, 558, 1370, 1595, 1900, 2028, 2099, 4557
MyoD-MCK left site        108, 118, 453, 1566, 1608, 1692, 1734, 1818, 1902, 1986, 2372, 2460, 2720, 3491, 5030
Ets-1 CS                  6408
AP1                       3784, 4406
CREB                      2801
GATA-1                    839, 1390, 3154
c-Myc                     108, 118, 453, 1566, 1608, 1692, 1734, 1818, 1902, 1986, 2372, 2460, 2720, 3491, 5030
CACCC site                991
CCAAT site                1224
CCAC box                  992
CAAT site                 463, 2395
Rb site                   992, 4663
TATA                      3650
CDEI                      106, 1564, 1606, 1690, 1732, 1816, 1900, 1984

Examples

The human gene for the catalytic telomerase subunit (ghTC), and the regions of this gene located 5' and 3', were cloned, while the start point for transcription was determined, potential binding sites for DNA-binding proteins were identified and active promoter fragments were highlighted. The sequence of the hTC cDNA (Fig. 6) has already been reported in our application PCT/EP/98/03468, which is also pending. Unless otherwise mentioned, all the data refer to the position of the cDNA in this sequence.

Example 1

A genomic Southern blot analysis was used to determine whether ghTC constitutes a single gene in the human genome or whether there exist several loci for the hTC gene and possibly also ghTC pseudogenes.

In order to do this, a commercially available zoo blot from Clontech was subjected to Southern blot analysis. This blot contains 4 µg of Eco RI-cut genomic DNA from nine different species (human, monkey, rat, mouse, dog, bovine, rabbit, chicken and yeast). With the exception of yeast, chicken and human, the DNA was isolated from kidney tissue.
The human genomic DNA was isolated from placenta and the chicken genomic DNA was purified from liver tissue. An hTC cDNA fragment of about 720 bp in length, which was isolated from hTC cDNA, variant Del2 (position 1685 to 2349 plus 2531 to 2590 in Fig. 6 [deletion 2; cf. Example 5 in Fig. 8]), was used as the radioactively labelled probe in the autoradiogram in Fig. 1. The experimental conditions for the blot hybridization and washing steps were taken from Ausubel et al. (1987).

In the case of the human DNA, the probe recognizes two specific DNA fragments. The smaller Eco RI fragment, of about 1.5 to 1.8 kb in length, probably originates from two Eco RI cleavage sites in an intron in the ghTC DNA. On the basis of this result, it is to be assumed that only one single ghTC gene is present in the human genome.

Example 2

In order to isolate the 5'-flanking hTC gene sequence, approx. 1.5 x 10^6 phages from a human genomic placenta gene library (EMBL 3 SP6/T7 from Clontech, order number HL1067j) were hybridized on nitrocellulose filters (0.45 µm; from Schleicher and Schuell), in accordance with the manufacturer's instructions, with a radioactively labelled 5'-hTC cDNA fragment of about 500 bp in length (position 839 to 1345 in Fig. 6). The nitrocellulose filters were firstly incubated, at 42°C for two hours, in 2 x SSC (0.3 M NaCl; 0.5 M Tris-HCl, pH 8.0) and then in a prehybridization solution (50% formamide; 5 x SSPE, pH 7.4; 5 x Denhardt's solution; 0.25% SDS; 100 µg of herring sperm DNA/ml). For the overnight hybridization, the prehybridization solution was supplemented with 1.5 x 10^6 cpm of denatured, radioactively labelled probe/ml of solution. Nonspecifically bound radioactive DNA was removed under stringent conditions, i.e. by means of three five-minute steps of washing with 2 x SSC; 0.1% SDS at from 55 to 65°C. The filters were evaluated by autoradiography.
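The probe lengths quoted in Examples 1 and 2 can be cross-checked against the Fig. 6 coordinates given for them. The following is a minimal sketch, assuming the quoted positions are inclusive; the dictionary keys and function names are illustrative and do not appear in the application.

```python
# Inclusive cDNA coordinates (Fig. 6 numbering) as quoted in Examples 1 and 2.
# The labels are illustrative names, not terms used in the application.
PROBES = {
    "Example 1 probe (variant Del2)": [(1685, 2349), (2531, 2590)],
    "Example 2 5' probe": [(839, 1345)],
}


def span_length(segments):
    """Total length in bp of a list of inclusive (start, end) segments."""
    return sum(end - start + 1 for start, end in segments)


def probe_lengths(probes=PROBES):
    """Map each probe label to its length in bp."""
    return {name: span_length(segments) for name, segments in probes.items()}


if __name__ == "__main__":
    for name, length in probe_lengths().items():
        print(f"{name}: {length} bp")
```

Under the inclusive-coordinate assumption, the two probes come out at 725 bp and 507 bp, which agrees with the "about 720 bp" and "about 500 bp" stated in the text.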
The phage clones which were identified in this primary investigation were purified (Ausubel et al. (1987)). In subsequent analyses, one phage clone, i.e. P12, turned out to be potentially positive. A λ DNA preparation carried out on this phage (Ausubel et al. (1987)), and the subsequent restriction digestion with enzymes which release the genomic insert in fragments, showed that this phage clone contains an insert of approx. 15 kb in the vector (Fig. 2).

In order to isolate the complete hTC gene sequence, in each case from 1 to 1.5 x 10^6 phages were screened, in independent experiments, with in each case different radioactively labelled probes, as described above.

The phage clones which were identified in these primary investigations, and which were positive for the corresponding probes, were purified. The phage clone P17 was found to contain an hTC cDNA fragment of about 250 bp in length (position 1787 to 2040 in Fig. 6). The phage clone P2 was identified as containing an hTC cDNA fragment of about 740 bp in length (position 1685 to 2349 plus 2531 to 2607 in Fig. 6 [deletion 2; cf. Example 5]). The phage clones P3 and P5 were found to contain a 3' hTC cDNA fragment of 420 bp in length (position 3047 to 3470 in Fig. 6). After the λ DNA had been prepared from these phages, and subsequently subjected to restriction digestion with enzymes which release the genomic insert in fragments, the inserts were subcloned into plasmids (Example 4).

Example 3

In order to investigate whether the 5' end of the hTC cDNA was also present in the insert in the recombinant phage clone P12, the λ DNA from this clone was hybridized, in a Southern blot analysis, with a radioactively labelled hTC cDNA fragment of about 440 bp in length (position 1 to 440 in Fig. 6) from the extreme 5' region (Fig. 3).
Since the isolated λ DNA from the positive clone also hybridizes with the extreme 5' end of the hTC cDNA, this phage probably also contains the 5' sequence region flanking the ATG start codon.

Example 4

In order to subclone the entire 15 kb insert in the positive phage clone P12 in the form of subfragments, and subsequently to sequence these fragments, restriction endonucleases which, on the one hand, release the entire insert from EMBL3 Sp6/T7 (cf. Example 2) and, in addition, cut within the insert, were selected for digesting the DNA.
In all, two Xho I subfragments, of about 8.3 and about 6.5 kb in length, respectively, and three Sac I subfragments, of about 8.5, about 3.5 and about 3 kb in length, respectively, were subcloned into the pBluescript KS(+) vector (from Stratagene). The 5123 bp 5'-flanking nucleotide sequence of the ghTC gene region, starting from the ATG start codon, was determined by analysing the sequences of these fragments (Fig. 4; corresponding to SEQ ID NO 1). Fig. 4 depicts the first 5123 bp (starting from the ATG start codon). Fig. 10 depicts the entire cloned 5' sequence (corresponding to SEQ ID NO 3).

In order to subclone the entire insert, of approx. 14.6 kb in size, in phage clone P17 in the form of subfragments, restriction endonucleases which, on the one hand, release the entire insert from EMBL3 Sp6/T7 and, in addition, cut a few times within the insert, were selected for digesting the DNA. Three XhoI/BamHI fragments, of 7.1 kb, 4.2 kb and 1.5 kb in size, respectively, and one BamHI fragment, of 1.8 kb in size, were subcloned by means of a combination digestion with the enzymes XhoI and BamHI. Combination restriction digestion with the enzymes XhoI and XbaI resulted in a XhoI/XbaI fragment of 6.5 kb in size, and two XhoI fragments, of 6.5 kb and 1.5 kb in size, respectively, being cloned.

Digestion with the restriction enzyme XhoI was used to subclone the insert, of approx. 17.9 kb in size, in phage clone P2 in the form of subfragments. In all, three XhoI subfragments, of 7.5 kb, 6.4 kb and 1.6 kb in length, respectively, were cloned. Four SacI fragments, of 4.8 kb, 3 kb, 2 kb and 1.8 kb in size, respectively, were additionally subcloned by digesting with the restriction enzyme SacI.

The insert, of approx. 13.5 kb in size, in phage clone P3 was subcloned by digesting with the restriction enzymes SacI and/or XhoI.
Six SacI subfragments, of 3.2 kb, 2 kb, 0.9 kb, 0.8 kb, 0.65 kb and 0.5 kb in length, respectively, and two XhoI subfragments, of 6.5 kb and 4.3 kb in length, respectively, were obtained in this connection.
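As a bookkeeping aid for the subcloning data quoted so far in this example, the fragment sizes of each digest can be summed and compared with the approximate insert size of the corresponding phage clone. The sketch below is illustrative only: the data structure and function names are assumptions, and a sum below the insert size merely shows that the listed subclones do not account for the whole insert, since the text does not state that every fragment of a digest was subcloned.

```python
# Fragment sizes (kb) and approximate insert sizes (kb) as quoted in Example 4.
# The layout of this dictionary is an illustrative choice, not from the source.
SUBCLONES = {
    "P12": {"insert_kb": 15.0,
            "digests": {"XhoI": [8.3, 6.5], "SacI": [8.5, 3.5, 3.0]}},
    "P17": {"insert_kb": 14.6,
            "digests": {"XhoI+BamHI": [7.1, 4.2, 1.5, 1.8]}},
    "P2": {"insert_kb": 17.9,
           "digests": {"XhoI": [7.5, 6.4, 1.6], "SacI": [4.8, 3.0, 2.0, 1.8]}},
    "P3": {"insert_kb": 13.5,
           "digests": {"SacI": [3.2, 2.0, 0.9, 0.8, 0.65, 0.5],
                       "XhoI": [6.5, 4.3]}},
}


def digest_totals(subclones=SUBCLONES):
    """Return {clone: {digest: (total_kb, fraction_of_insert)}}."""
    report = {}
    for clone, info in subclones.items():
        report[clone] = {}
        for digest, sizes in info["digests"].items():
            total = round(sum(sizes), 2)
            fraction = round(total / info["insert_kb"], 2)
            report[clone][digest] = (total, fraction)
    return report


if __name__ == "__main__":
    for clone, digests in digest_totals().items():
        for digest, (total, fraction) in digests.items():
            print(f"{clone} {digest}: {total} kb ({fraction:.0%} of insert)")
```

For phage clone P17, the four fragments from the XhoI/BamHI combination digestion add up to the stated insert size of 14.6 kb.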
The insert, of approx. 13.2 kb in size, in phage clone P5 was subcloned by digesting with the restriction enzymes SacI and/or XhoI. In all, SacI fragments of 6.5 kb, 3.3 kb, 3.2 kb, 0.8 kb and 0.3 kb in size, and XhoI fragments of 7 kb and 3.2 kb in size, were subcloned.

In order to clone the hTC genomic sequence region located 3' of phage clone P17 and 5' of phage clone P2, 3 genomic walkings were carried out using the Clontech GenomeWalker™ kits (catalogue number K1803-1) and various combinations of primers. In a final volume of 50 µl, 10 pmol of dNTP mix were added to 1 µl of human GenomeWalker Library HDL (from Clontech), and a PCR reaction was carried out in 1x Klen Taq PCR reaction buffer and 1x Advantage Klen Taq polymerase mix (from Clontech). 10 pmol of an internal gene-specific primer, and 10 pmol of the adaptor primer AP1 (5'-GTAATACGACTCACTATAGGGC-3'; from Clontech) were added as primers. The PCR was carried out in 3 steps as a touchdown PCR. First of all, denaturation was carried out at 94°C for 20 sec, and the primers were then annealed, and the DNA chain extended, at 72°C for 4 min, over 7 cycles. There then followed 37 cycles in which the DNA was denatured at 94°C for 20 sec but the subsequent primer extension took place at 67°C for 4 min. In conclusion, there followed a chain extension at 67°C for 4 min. After this first PCR, the PCR product was diluted 1:50. One µl of this dilution was used in a second nested PCR together with 10 pmol of dNTP mix in 1x Klen Taq PCR reaction buffer and 1x Advantage Klen Taq polymerase mix and also 10 pmol of a nested gene-specific primer and 10 pmol of the nested Marathon Adaptor primer AP2 (5'-ACTATAGGGCACGCGTGGT-3'; from Clontech). The PCR conditions corresponded to the parameters which were selected in the first PCR. As the sole exception, only 5 cycles rather than 7 cycles were selected in the first PCR step and only 24 cycles, instead of 37 cycles, were run in the second PCR step.
The products of this nested genomic walking PCR were cloned into the TA Cloning Vector pCRII from InVitrogen.
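The two cycling programs described above (first PCR and nested PCR) differ only in their cycle counts, which makes them easy to write down as data. The sketch below is one possible representation; the `Stage` class, the helper names and the decision to ignore temperature ramp times are assumptions made for illustration and are not taken from the kit protocol.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Stage:
    """One stage of the cycling program (times in seconds, temperatures in deg C)."""
    cycles: int
    denature_s: int      # time at 94 deg C per cycle; 0 for a hold-only stage
    extend_temp_c: int   # combined annealing/extension temperature
    extend_s: int        # annealing/extension time per cycle


def touchdown_program(step1_cycles, step2_cycles):
    """Three-step program as described in the text: 72 deg C step, 67 deg C step, final hold."""
    return [
        Stage(step1_cycles, 20, 72, 240),
        Stage(step2_cycles, 20, 67, 240),
        Stage(1, 0, 67, 240),  # final chain extension
    ]


FIRST_PCR = touchdown_program(7, 37)
NESTED_PCR = touchdown_program(5, 24)


def amplification_cycles(stages):
    """Number of cycles that include a denaturation step."""
    return sum(s.cycles for s in stages if s.denature_s > 0)


def hold_time_s(stages):
    """Sum of programmed hold times in seconds (ramp times are not included)."""
    return sum(s.cycles * (s.denature_s + s.extend_s) for s in stages)


if __name__ == "__main__":
    for name, prog in (("first PCR", FIRST_PCR), ("nested PCR", NESTED_PCR)):
        print(name, amplification_cycles(prog), "cycles,",
              hold_time_s(prog) / 60, "min programmed hold time")
```

Keeping the program as data rather than prose makes the single stated difference between the two rounds (5 and 24 cycles instead of 7 and 37) explicit.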
In the first genomic walking, the gene-specific primer C3K2-GSP1 (5'-GACGTGGCTCTTGAAGGCCTTG-3') and the nested gene-specific primer C3K2-GSP2 (5'-GCCTTCTGGACCACGGCATACC-3') were used, together with the HDL library 4, and a PCR fragment of 1639 bp in length was obtained. In the second genomic walking, a PCR fragment of 685 bp in length was amplified from the HDL library 4 using the gene-specific primer C3F2 (5'-CGTAGTTGAGCACGCTGAACAGTG-3') and the nested gene-specific primer C3F (5'-CCTTCACCCTCGAGGTGAGACGCT-3'). The third genomic walking mixture, using the gene-specific primer DEL5-GSP1 (5'-GGTGGATGTGACGGGCGCGTACG-3') and the nested gene-specific primer C5K-GSP1 (5'-GGTATGCCGTGGTCCAGAAGGC-3'), led to a 924 bp PCR fragment being cloned from the HDL library 1. In all, 2100 bp of the genomic hTC region located 3' of phage clone P17 were identified using this genomic walking method (see Fig. 7).

The subcloned fragments, and the genomic walking products, were sequenced in single-stranded form. The Lasergene Biocomputing Software (DNASTAR Inc. Madison, Wisconsin, USA) was used to identify overlapping regions and form contigs. In all, 2 large contigs were assembled from the sequences collected from phage clones P12, P17, P2, P3 and P5, and also the sequence data from the genomic walking. Contig 1 consists of sequence data from phage clones P12 and P17 and the sequence data from the genomic walking. Contig 2 was put together from the sequences from phage clones P2, P3 and P5. Overlapping phage clone regions are shown diagrammatically in Fig. 7. The sequence data from the 2 contigs are shown below. The ATG start codon in contig 1 is underlined. The TGA stop codon is underlined in contig 2.
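The primer sequences listed for the three genome walks can be compared with one another by computing reverse complements, which shows whether two primers anneal to the same site on opposite strands. The following is a minimal sketch using only the sequences quoted in the text; the primer names are normalised spellings of those in the text, and the function names are illustrative.

```python
from itertools import combinations

# Primer sequences (5' to 3') as quoted for the three genome walks.
PRIMERS = {
    "C3K2-GSP1": "GACGTGGCTCTTGAAGGCCTTG",
    "C3K2-GSP2": "GCCTTCTGGACCACGGCATACC",
    "C3F2": "CGTAGTTGAGCACGCTGAACAGTG",
    "C3F": "CCTTCACCCTCGAGGTGAGACGCT",
    "DEL5-GSP1": "GGTGGATGTGACGGGCGCGTACG",
    "C5K-GSP1": "GGTATGCCGTGGTCCAGAAGGC",
}

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Reverse complement of an unambiguous DNA sequence (A, C, G, T only)."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def reverse_complement_pairs(primers=PRIMERS):
    """List primer name pairs whose sequences are exact reverse complements."""
    return [
        (name_a, name_b)
        for (name_a, seq_a), (name_b, seq_b) in combinations(primers.items(), 2)
        if reverse_complement(seq_a) == seq_b
    ]


if __name__ == "__main__":
    for name_a, name_b in reverse_complement_pairs():
        print(f"{name_a} is the reverse complement of {name_b}")
```

With the sequences as quoted, C3K2-GSP2 and C5K-GSP1 are exact reverse complements of each other, so the nested primers of the first and third walks cover the same 22 nucleotides on opposite strands.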
< o -25 Contig1: ACTTGAGCCC AAGAGTTCAA GGCTACGGTG AGCCATGATT GCAACACCAC ACGCCAGCCT TGGTGACAGA 70 ATGAGACCCT GTCTCAAAAA AAAAAAAAAA AATTGAAATA ATATAAAGCA TCTTCTCTGG CCACAGTGGA 140 5 ACAAAACCAG AAATCAACAA CAAGAGGAAT TTTGAAAACT ATACAAACAC ATGAAAATTA AACAATATAC 210 TTCTGAATGA CCAGTGAGTC AATGAAGAAA TTAAAAAGGA AATTGAAAAA TTTATTTAAG CAAATGATAA 280 CGGAAACATA ACCTCTCAAA ACCCACGGTA TACAGCAAAA GCAGTGCTAA GAAGGAAGTT TATAGCTATA 350 AGCAGCTACA TCAAAAAAGT AGAAAAGCCA GGCGCAGTGG CTCATGCCTG TAATCCCAGC ACTTTGGGAG 420 GCCAAGGCGG GCAGATCGCC TGAGGTCAGG AGTTCGAGAC CAGCCTGACC AACACAGAGA AACCTTGTCG 490 10 CTACTAAAAA TACAAAATTA GCTGGGCATG GTGGCACATG CCTGTAATCC CAGCTACTCG GGAGGCTGAG 560 GCAGGATAAC CGCTTGAACC CAGGAGGTGG AGGTTGCGGT GAGCCGGGAT TGCGCCATTG GACTCCAGCC 630 TGGGTAACAA GAGTGAAACC CTGTCTCAAG AAAAAAAAAA AAGTAGAAAA ACTTAAAAAT ACAACCTAAT 700 GATGCACCTT AAAGAACTAG AAAAGCAAGA GCAAACTAAA CCTAAAATTG GTAAAAGAAA AGAAATAATA 770 AAGATCAGAG CAGAAATAAA TGAAACTGAA AGATAACAAT ACAAAAGATC AACAAAATTA AAAGTTGGTT 840 15 TTTTGAAAAG ATAAACAAAA TTGACAAACC TTTGCCCAGA CTAAGAAAAA AGGAAAGAAG ACCTAAATAA 910 ATAAAGTCAG AGATGAAAAA AGAGACATTA CAACTGATAC CACAGAAATT CAAAGGATCA CTAGAGGCTA 980 CTATGAGCAA CTGTACACTA ATAAATTGAA AAACCTAGAA AAAATAGATA AATTCCTAGA TGCATACAAC 1050 CTACCAAGAT TGAACCATGA AGAAATCCAA AGCCCAAACA GACCAATAAC AATAATGGGA TTAAAGCCAT 1120 AATAAAAAGT CTCCTAGCAA AGAGAAGCCC AGGACCCAAT GGCTTCCCTG CTGGATTTTA CCAATCATTT 1190 20 AAAGAAGAAT GAATTCCAAT CCTACTCAAA CTATTCTGAA AAATAGAGGA AAGAATACTT CCAAACTCAT 1260 TCTACATGGC CAGTATTACC CTGATTCCAA AACCAGACAA AAACACATCA AAAACAAACA AACAAAAAAA 1330 CAGAAAGAAA GAAAACTACA GGCCAATATC CCTGATGAAT ACTGATACAA AAATCCTCAA CAAAACACTA 1400 GCAAACCAAA TTAAACAACA CCTTCGAAAG ATCATTCATT GTGATCAAGT GGGATTTATT CCAGGGATGG 1470 AAGGATGGTT CAACATATGC AAATCAATCA ATGTGATACA TCATCCCAAC AAAATGAAGT ACAAAAACTA 1540 25 TATGATTATT TCACTTTATG CAGAAAAAGC ATTTGATAAA ATTCTGCACC CTTCATGATA AAAACCCTCA 1610 AAAAACCAGG TATACAAGAA ACATACAGGC CAGGCACAGT GGCTCACACC TGCGATCCCA GCACTCTGGG 1680 AGGCCAAGGT 
GGGATGATTG CTTGGGCCCA GGAGTTTGAG ACTAGCCTGG GCAACAAAAT GAGACCTGGT 1750 CTACAAAAAA CTTTTTTAAA AAATTAGCCA GGCATGATGG CATATGCCTG TAGTCCCAGC TAGTCTGGAG 1820 GCTGAGGTGG GAGAATCACT TAAGCCTAGG AGGTCGAGGC TGCAGTGAGC CATGAACATG TCACTGTACT 1890 30 CCAGCCTAGA CAACAGAACA AGACCCCACT GAATAAGAAG AAGGAGAAGG AGAAGGGAGA AGGGAGGGAG 1960 AAGGGAGGAG GAGGAGAAGG AGGAGGTGGA GGAGAAGTGG AAGGGGAAGG GGAAGGGAAA GAGGAAGAAG 2030 AAGAAACATA TTTCAACATA ATAAAAGCCC TATATGACAG ACCGAGGTAG TATTATGAGG AAAAACTGAA 2100 AGCCTTTCCT CTAAGATCTG GAAAATGACA AGGGCCCACT TTCACCACTG TGATTCAACA TAGTACTAGA 2170 AGTCCTAGCT AGAGCAATCA GATAAGAGAA AGAAATAAAA GGCATCCAAA CTGGAAAGGA AGAAGTCAAA 2240 35 TTATCCTGTT TGCAGATGAT ATGATCTTAT ATCTGGAAAA GACTTAAGAC ACCACTAAAA AACTATTAGA 2310 GCTGAAATTT GGTACAGCAG GATACAAAAT CAATGTACAA AAATCAGTAG TATTTCTATA TTCCAACAGC 2380 AAACAATCTG AAAAAGAAAC CAAAAAAGCA GCTACAAATA AAATTAAACA GCTAGGAATT AACCAAAGAA 2450 GTGAAAGATC TCTACAATGA AAACTATAAA ATGTTGATAA AAGAAATTGA AGAGGGCACA AAAAAAGAAA 2520 AGATATTCCA TGTTCATAGA TTGGAAGAAT AAATACTGTT AAAATGTCCA TACTACCCAA AGCAATTTAC 2590 40 AAATTCAATG CAATCCCTAT TAAAATACTA ATGACGTTCT TCACAGAAAT AGAAGAAACA ATTCTAAGAT 2660 TTGTACAGAA CCACAAAAGA CCCAGAATAG CCAAAGCTAT CCTGACCAAA AAGAACAAAA CTGGAAGCAT 2730 CACATTACCT GACTTCAAAT TATACTACAA AGCTATAGTA ACCCAAACTA CATGGTACTG GCATAAAAAC 2800 AGATGAGACA TGGACCAGAG GAACAGAATA GAGAATCCAG AAACAAATCC ATGCATCTAC AGTGAACTCA 2870 TTTTTGACAA AGGTGCCAAG AACATACTTT GGGGAAAAGA TAATCTCTTC AATAAATGGT GCTGGAGGAA 2940 45 CTGGATATCC ATATGCAAAA TAACAATACT AGAACTCTGT CTCTCACCAT ATACAAAAGC AAATCAAAAT 3010 GGATGAAAGG CTTAAATCTA AAACCTCAAA CTTTGCAACT ACTAAAAGAA AACACCGGAG AAACTCTCCA 3080 GGACATTGGA GTGGGCAAAG ACTTCTTGAG TAATTCCCTG CAGGCACAGG CAACCAAAGC AAAAACAGAC 3150 AAATGGGATC ATATCAAGTT AAAAAGCTTC TGCCCAGCAA AGGAAACAAT CAACAAAGAG AAGAGACAAC 3220 CCACAGAATG GGAGAATATA TTTGCAAACT ATTCATCTAA CAAGGAATTA ATAACCAGTA TATATAAGGA 3290 50 GCTCAAACTA CTCTATAAGA AAAACACCTA ATAAGCTGAT TTTCAAAAAT AAGCAAAAGA TCTGGGTAGA 3360 CATTTCTCAA AATAAGTCAT 
ACAAATGGCA AACAGGCATC TGAAAATGTG CTCAACACCA CTGATCATCA 3430 GAGAAATGCA AATCAAAACT ACTATGAGAG ATCATCTCAT CCCAGTTAAA ATGGCTTTTA TTCAAAAGAC 3500 AGGCAATAAC AAATGCCAGT GAGGATGTGG ATAAAAGGAA ACCCTTGGAC ACTGTTGGTG GGAATGGAAA 3570 TTGCTACCAC TATGGAGAAC AGTTTGAAAG TTCCTCAAAA AACTAAAAAT AAAGCTACCA TACAGCAATC 3640 55 CCATTGCTAG GTATATACTC CAAAAAAGGG AATCAGTGTA TCAACAAGCT ATCTCCACTC CCACATTTAC 3710 TGCAGCACTG TTCATAGCAG CCAAGGTTTG GAAGCAACCT CAGTGTCCAT CAACAGACGA ATGGAAAAAG 3780 AAAATGTGGT GCACATACAC AATGGAGTAC TACGCAGCCA TAAAAAAGAA TGAGATCCTG TCAGTTGCAA 3850 CAGCATGGGG GGCACTGGTC AGTATGTTAA GTGAAATAAG CCAGGCACAG AAAGACAAAC TTTTCATGTT 3920 CTCCCTTACT TGTGGGAGCA AAAATTAAAA CAATTGACAT AGAAATAGAG GAGAATGGTG GTTCTAGAGG 3990 60 GGTGGGGGAC AGGGTGACTA GAGTCAACAA TAATTTATTG TATGTTTTAA AATAACTAAA AGAGTATAAT 4060 TGGGTTGTTT GTAACACAAA GAAAGGATAA ATGCTTGAAG GTGACAGATA CCCCATTTAC CCTGATGTGA 4130 TTATTACACA TTGTATGCCT GTATCAAAAT ATCTCATGTA TGCTATAGAT ATAAACCCTA CTATATTAAA 4200 AATTAAAATT TTAATGGCCA GGCACGGTGG CTCATGTCCG TAATCCCAGC ACTTTGGGAG GCCGAGGCGG 4270 GTGGATCACC TGAGGTCAGG AGTTTGAAAC CAGTCTGGCC ACCATGATGA AACCCTGTCT CTACTAAAGA 4340 65 TACAAAAATT AGCCAGGCGT GGTGGCACAT ACCTGTAGTC CCAACTACTC AGGAGGCTGA GACAGGAGAA 4410 TTGCTTGAAC CTGGGAGGCG GAGGTTGCAG TGAGCCGAGA TCATGCCACT GCACTGCAGC CTGGGTGACA 4480 GAGCAAGACT CCATCTCAAA ACAAAAACAA AAAAAAGAAG ATTAAAATTG TAATTTTTAT GTACCGTATA 4550 AATATATACT CTACTATATT AGAAGTTAAA AATTAAAACA ATTATAAAAG GTAATTAACC ACTTAATCTA 4620 AAATAAGAAC AATGTATGTG GGGTTTCTAG CTTCTGAAGA AGTAAAAGTT ATGGCCACGA TGGCAGAAAT 4690 70 GTGAGGAGGG AACAGTGGAA GTTACTGTTG TTAGACGCTC ATACTCTCTG TAAGTGACTT AATTTTAACC 4760 AAAGACAGGC TGGGAGAAGT TAAAGAGGCA TTCTATAAGC CCTAAAACAA CTGCTAATAA TGGTGAAAGG 4830 TAATCTCTAT TAATTACCAA TAATTACAGA TATCTCTAAA ATCGAGCTGC AGAATTGGCA CGTCTGATCA 4900 CACCGTCCTC TCATTCACGG TGCTTTTTTT CTTGTGTGCT TGGAGATTTT CGATTGTGTG TTCGTGTTTG 4970 GTTAAACTTA ATCTGTATGA ATCCTGAAAC GAAAAATGGT GGTGATTTCC TCCAGAAGAA TTAGAGTACC 5040 TGGCAGGAAG CAGGTGGCTC TGTGGACCTG 
AGCCACTTCA ATCTTCAAGG GTCTCTGGCC AAGACCCAGG 5110 uj - 26 TGCAAGGCAG AGGCCTGATG ACCCGAGGAC AGGAAAGCTC GGATGGGAAG GGGCGATGAG AAGCCTGCCT 5180 CGTTGGTGAG CAGCGCATGA AGTGCCCTTA TTTACGCTTT GCAAAGATTG CTCTGGATAC CATCTGGAAA 5250 AGGCGGCCAG CGGGAATGCA AGGAGTCAGA AGCCTCCTGC TCAAACCCAG GCCAGCAGCT ATGGCGCCCA 5320 CCCGGGCGTG TGCCAGAGGG AGAGGAGTCA AGGCACCTCG AAGTATGGCT TAAATCTTTT TTTCACCTGA 5390 5 AGCAGTGACC AAGGTGTATT CTGAGGGAAG CTTGAGTTAG GTGCCTTCTT TAAAACAGAA AGTCATGGAA 5460 GCACCCTTCT CAAGGGAAAA CCAGACGCCC GCTCTGCGGT CATTTACCTC TTTCCTCTCT CCCTCTCTTG 5530 CCCTCGCGGT TTCTGATCGG GACAGAGTGA CCCCCGTGGA GCTTCTCCGA GCCCGTGCTG AGGACCCTCT 5600 TGCAAAGGGC TCCACAGACC CCCGCCCTGG AGAGAGGAGT CTGAGCCTGG CTTAATAACA AACTGGGATG 5670 TGGCTGGGGG CGGACAGCGA CGGCGGGATT CAAAGACTTA ATTCCATGAG TAAATTCAAC CTTTCCACAT 5740 10 CCGAATGGAT TTGGATTTTA TCTTAATATT TTCTTAAATT TCATCAAATA ACATTCAGGA CTGCAGAAAT 5810 CCAAAGGCGT AAAACAGGAA CTGAGCTATG TTTGCCAAGG TCCAAGGACT TAATAACCAT GTTCAGAGGG 5880 ATTTTTCGCC CTAAGTACTT TTTATTGGTT TTCATAAGGT GGCTTAGGGT GCAAGGGAAA GTACACGAGG 5950 AGAGGCCTGG GCGGCAGGGC TATGAGCACG GCAGGGCCAC CGGGGAGAGA GTCCCCGGCC TGGGAGGCTG 6020 ACAGCAGGAC CACTGACCGT CCTCCCTGGG AGCTGCCACA TTGGGCAACG CGAAGGCGGC CACGCTGCGT 6090 15 GTGACTCAGG ACCCCATACC GGCTTCCTGG GCCCACCCAC ACTAACCCAG GAAGTCACGG AGCTCTGAAC 6160 CCGTGGAAAC GAACATGACC CTTGCCTGCC TGCTTCCCTG GGTGGGTCAA GGGTAATGAA GTGGTGTGCA 6230 GGAAATGGCC ATGTAAATTA CACGACTCTG CTGATGGGGA CCGTTCCTTC CATCATTATT CATCTTCACC 6300 CCCAAGGACT GAATGATTCC AGCAACTTCT TCGGGTGTGA CAAGCCATGA CAAAACTCAG TACAAACACC 6370 ACTCTTTTAC TAGGCCCACA GAGCACGGSC CACACCCCTG ATATATTAAG AGTCCAGGAG AGATGAGGCT 6440 20 GCTTTCAGCC ACCAGGCTGG GGTGACAACA GCGGCTGAAC AGTCTGTTCC TCTAGACTAG TAGACCCTGG 6510 CAGGCACTCC CCCAGATTCT AGGGCCTGGT TGCTGCTTCC CGAGGGCGCC ATCTGCCCTG GAGACTCAGC 6580 CTGGGGTGCC ACACTGAGGC CAGCCCTGTC TCCACACCCT CCGCCTCCAG GCCTCAGCTT CTCCAGCAGC 6650 TTCCTAAACC CTGGGTGGGC CGTGTTCCAG CGCTACTGTC TCACCTGTCC CACTGTGTCT TGTCTCAGCG 6720 ACGTAGCTCG CACGGTTCCT CCTCACATGG GGTGTCTGTC 
TCCTTCCCCA ACACTCACAT GCGTTGAAGG 6790 25 GAGGAGATTC TGCGCCTCCC AGACTGGCTC CTCTGAGCCT GAACCTGGCT CGTGGCCCCC GATGCAGGTT 6860 CCTGGCGTCC GGCTGCACGC TGACCTCCAT TTCCAGGCGC TCCCCGTCTC CTGTCATCTG CCGGGGCCTG 6930 CCGGTGTGTT CTTCTGTTTC TGTGCTCCTT TCCACGTCCA GCTGCGTGTG TCTCTGCCCG CTAGGGTCTC 7000 GGGGTTTTTA TAGGCATAGG ACGGGGGCGT GGTGGGCCAG GGCGCTCTTG GGAAATGCAA CATTTGGGTG 7070 TGAAAGTAGG AGTGCCTGTC CTCACCTAGG TCCACGGGCA CAGGCCTGGG GATGGAGCCC CCGCCAGGGA 7140 30 CCCGCCCTTC TCTGCCCAGC ACTTTCCTGC CCCCCTCCCT CTGGAACACA GAGTGGCAGT TTCCACAAGC 7210 ACTAAGCATC CTCTTCCCAA AAGACCCAGC ATTGGCACCC CTGGACATTT GCCCCACAGC CCTGGGAATT 7280 CACGTGACTA CGCACATCAT GTACACACTC CCGTCCACGA CCGACCCCCG CTGTTTTATT TTAATAGCTA 7350 CAAAGCAGGG AAATCCCTGC TAAAATGTCC TTTAACAAAC TGGTTAAACA AACGGGTCCA TCCGCACGGT 7420 GGACAGTTCC TCACAGTGAA GAGGAACATG CCGTTTATAA AGCCTGCAGG CATCTCAAGG GAATTACGCT 7490 35 GAGTCAAAAC TGCCACCTCC ATGGGATACG TACGCAACAT GCTCAAAAAG AAAGAATTTC ACCCCATGGC 7560 AGGGGAGTGG TTAGGGGGGT TAAGGACGGT GGGGGCGGCA GCTGGGGGCT ACTGCACGCA CCTTTTACTA 7630 AAGCCAGTTT CCTGGTTCTG ATGGTATTGG CTCAGTTATG GGAGACTAAC CATAGGGGAG TGGGGATGGG 7700 GGAACCCGGA GGCTGTGCCA TCTTTGCCAT GCCCGAGTGT CCTGGGCAGG ATAATGCTCT AGAGATGCCC 7770 ACGTCCTGAT TCCCCCAAAC CTGTGGACAG AACCCGCCCG GCCCCAGGGC CTTTGCAGGT GTGATCTCCG 7840 40 TGAGGACCCT GAGGTCTGGG ATCCTTCGGG ACTACCTGCA GGCCCGAAAA GTAATCCAGG GGTTCTGGGA 7910 AGAGGCGGGC AGGAGGGTCA GAGGGGGGCA GCCTCAGGAC GATGGAGGCA GTCAGTCTGA GGCTGAAAAG 7980 GGAGGGAGGG CCTCGAGCCC AGGCCTGCAA GCGCCTCCAG AAGCTGGAAA AAGCGGGGAA GGGACCCTCC 8050 ACGGAGCCTG CAGCAGGAAG GCACGGCTGG CCCTTAGCCC ACCAGGGCCC ATCGTGGACC TCCGGCCTCC 8120 GTGCCATAGG AGGGCACTCG CGCTGCCCTT CTAGCATGAA GTGTGTGGGG ATTTGCAGAA GCAACAGGAA 8190 45 ACCCATGCAC TGTGAATCTA GGATTATTTC AAAACAAAGG TTTACAGAAA CATCCAAGGA CAGGGCTGAA 8260 GTGCCTCCGG GCAAGGGCAG GGCAGGCACG AGTGATTTTA TTTAGCTATT TTATTTTATT TACTTACTTT 8330 CTGAGACAGA GTTATGCTCT TGTTGCCCAG GCTGGAGTGC AGCGGCATGA TCTTGGCTCA CTGCAACCTC 8400 CGTCTCCTGG GTTCAAGCAA TTCTCGTGCC TCAGCCTCCC AAGTAGCTGG 
GATTTCAGGC GTGCACCACC 8470 ACACCCGGCT AATTTTGTAT TTTTAGTAGA GATGGGCTTT CACCATGTTG GTCAAGCTGA TCTCAAAATC 8540 50 CTGACCTCAG GTGATCCGCC CACCTCAGCC TCCCAAAGTG CTGGGATTAC AGGCATGAGC CACTGCACCT 8610 GGCCTATTTA ACCATTTTAA AACTTCCCTG GGCTCAAGTC ACACCCACTG GTAAGGAGTT CATGGAGTTC 8680 AATTTCCCCT TTACTCAGGA GTTACCCTCC TTTGATATTT TCTGTAATTC TTCGTAGACT GGGGATACAC 8750 CGTCTCTTGA CATATTCACA GTTTCTGTGA CCACCTGTTA TCCCATGGGA CCCACTGCAG GGGCAGCTGG 8820 GAGGCTGCAG GCTTCAGGTC CCAGTGGGGT TGCCATCTGC CAGTAGAAAC CTGATGTAGA ATCAGGGCGC 8890 55 AAGTGTGGAC ACTGTCCTGA ATCTCAATGT CTCAGTGTGT GCTGAAACAT GTAGAAATTA AAGTCCATCC 8960 CTCCTACTCT ACTGGGATTG AGCCCCTTCC CTATCCCCCC CCAGGGGCAG AGGAGTTCCT CTCACTCCTG 9030 TGGAGGAAGG AATGATACTT TGTTATTTTT CACTGCTGGT ACTGAATCCA CTGTTTCATT TGTTGGTTTG 9100 TTTGTTTTGT TTTGAGAGGC GGTTTCACTC TTGTTGCTCA GGCTGGAGGG AGTGCAATGG CGCGATCTTG 9170 GCTTACTGCA GCCTCTGCCT CCCAGGTTCA AGTGATTCTC CTGCTTCCGC CTCCCATTTG GCTGGGATTA 9240 60 CAGGCACCCG CCACCATGCC CAGCTAATTT TTTGTATTTT TAGTAGAGAC GGGGGTGGGT GGGGTTCACC 9310 ATGTTGGCCA GGCTGGTCTC GAACTTCTGA CCTCAGATGA TCCACCTGCC TCTGCCTCCT AAAGTGCTGG 9380 GATTACAGGT GTGAGCCACC ATGCCCAGCT CAGAATTTAC TCTGTTTAGA AACATCTGGG TCTGAGGTAG 9450 GAAGCTCACC CCACTCAAGT GTTGTGGTGT TTTAAGCCAA TGATAGAATT TTTTTATTGT TGTTAGAACA 9520 CTCTTGATGT TTTACACTGT GATGACTAAG ACATCATCAG CTTTTCAAAG ACACACTAAC TGCACCCATA 9590 65 ATACTGGGGT GTCTTCTGGG TATCAGCAAT CTTCATTGAA TGCCGGGAGG CGTTTCCTCG CCATGCACAT 9660 GGTGTTAATT ACTCCAGCAT AATCTTCTGC TTCCATTTCT TCTCTTCCCT CTTTTAAAAT TGTGTTTTCT 9730 ATGTTGGCTT CTCTGCAGAG AACCAGTGTA AGCTACAACT TAACTTTTGT TGGAACAAAT TTTCCAAACC 9800 GCCCCTTTGC CCTAGTGGCA GAGACAATTC ACAAACACAG CCCTTTAAAA AGGCTTAGGG ATCACTAAGG 9870 GGATTTCTAG AAGAGCGACC TGTAATCCTA AGTATTTACA AGACGAGGCT AACCTCCAGC GAGCGTGACA 9940 70 GCCCAGGGAG GGTGCGAGGC CTGTTCAAAT GCTAGCTCCA TAAATAAAGC AATTTCCTCC GGCAGTTTCT 10010 GAAAGTAGGA AAGGTTACAT TTAAGGTTGC GTTTGTTAGC ATTTCAGTGT TTGCCGACCT CAGCTACAGC 10080 ATCCCTGCAA GGCCTCGGGA GACCCAGAAG TTTCTCGCCC CCTTAGATCC AAACTTGAGC 
AACCCGGAGT 10150 CTGGATTCCT GGGAAGTCCT CAGCTGTCCT GCGGTTGTGC CGGGGCCCCA GGTCTGGAGG GGACCAGTGG 10220 CCGTGTGGCT TCTACTGCTG GGCTGGAAGT CGGGCCTCCT AGCTCTGCAG TCCGAGGCTT GGAGCCAGGT 10290 75 GCCTGGACCC CGAGGCTGCC CTCCACCCTG TGCGGGCGGG ATGTGACCAG ATGTTGGCCT CATCTGCCAG 10360 ACAGAGTGCC GGGGCCCAGG GTCAAGGCCG TTGTGGCTGG TGTGAGGCGC CCGGTGCGCG GCCAGCAGGA 10430 GCGCCTGGCT CCATTTCCCA CCCTTTCTCG ACGGGACCGC CCCGGTGGGT GATTAACAGA TTTGGGGTGG 10500 TTTGCTCATG GTGGGGACCC CTCGCCGCCT GAGAACCTGC AAAGAGAAAT GACGGGCCTG TGTCAAGGAG 10570 u-LL -27 CCCAAGTCGC GGGGAAGTGT TGCAGGGAGG CACTCCGGGA GGTCCCGCGT GCCCGTCCAG GGAGCAATGC 10640 GTCCTCGGGT TCGTCCCCAG CCGCGTCTAC GCGCCTCCGT CCTCCCCTTC ACGTCCGGCA TTCGTGGTGC 10710 CCGGAGCCCG ACGCCCCGCG TCCGGACCTG GAGGCAGCCC TGGGTCTCCG GATCAGGCCA GCGGCCAAAG 10780 GGTCGCCGCA CGCACCTGTT CCCAGGGCCT CCACATCATG GCCCCTCCCT CGGGTTACCC CACAGCCTAG 10850 5 GCCGATTCGA CCTCTCTCCG CTGGGGCCCT CGCTGGCGTC CCTGCACCCT GGGAGCGCGA GCGGCGCGCG 10920 GGCGGGGAAG CGCGGCCCAG ACCCCCGGGT CCGCCCGGAG CAGCTGCGCT GTCGGGGCCA GGCCGGGCTC 10990 CCAGTGGATT CGCGGGCACA GACGCCCAGG ACCGCGCTCC CCACGTGGCG GAGGGACTGG GGACCCGGGC 11060 ACCCGTCCTG CCCCTTCACC TTCCAGCTCC GCCTCCTCCG CGCGGACCCC GCCCCGTCCC GACCCCTCCC 11130 GGGTCCCCGG CCCAGCCCCC TCCGGGCCCT CCCAGCCCCT CCCCTTCCTT TCCGCGGCCC CGCCCTCTCC 11200 10 TCGCGGCGCG AGTTTCAGGC AGCGCTGCGT CCTGCTGCGC ACGTGGGAAG CCCTGGCCCC GGCCACCCCC 11270 GCGATGCCGC GCGCTCCCCG CTGCCGAGCC GTGCGCTCCC TGCTGCGCAG CCACTACCGC GAGGTGCTGC 11340 CGCTGGCCAC GTTCGTGCGG CGCCTGGGGC CCCAGGGCTG GCGGCTGGTG CAGCGCGGGG ACCCGGCGGC 11410 TTTCCGCGCG CTGGTGGCCC AGTGCCTGGT GTGCGTGCCC TGGGACGCAC GGCCGCCCCC CGCCGCCCCC 11480 TCCTTCCGCC AGGTGGGCCT CCCCGGGGTC GGCGTCCGGC TGGGGTTGAG GGCGGCCGGG GGGAACCAGC 11550 15 GACATGCGGA GAGCAGCGCA GGCGACTCAG GGCGCTTCCC CCGCAGGTGT CCTGCCTGAA GGAGCTGGTG 11620 GCCCGAGTGC TGCAGAGGCT GTGCGAGCGC GGCGCGAAGA ACGTGCTGGC CTTCGGCTTC GCGCTGCTGG 11690 ACGGGGCCCG CGGGGGCCCC CCCGAGGCCT TCACCACCAG CGTGCGCAGC TACCTGCCCA ACACGGTGAC 11760 CGACGCACTG CGGGGGAGCG GGGCGTGGGG GCTGCTGCTG 
CGCCGCGTGG GCGACGACGT GCTGGTTCAC 11830 CTGCTGGCAC GCTGCGCGCT CTTTGTGCTG GTGGCTCCCA GCTGCGCCTA CCAGGTGTGC GGGCCGCCGC 11900 20 TGTACCAGCT CGGCGCTGCC ACTCAGGCCC GGCCCCCGCC ACACGCTAGT GGACCCCGAA GGCGTCTGGG 11970 ATGCGAACGG GCCTGGAACC ATAGCGTCAG GGAGGCCGGG GTCCCCCTGG GCCTGCCAGC CCCGGGTGCG 12040 AGGAGGCGCG GGGGCAGTGC CAGCCGAAGT CTGCCGTTGC CCAAGAGGCC CAGGCGTGGC GCTGCCCCTG 12110 AGCCGGAGCG GACGCCCGTT GGGCAGGGGT CCTGGGCCCA CCCGGGCAGG ACGCGTGGAC CGAGTGACCG 12180 TGGTTTCTGT GTGGTGTCAC CTGCCAGACC CGCCGAAGAA GCCACCTCTT TGGAGGGTGC GCTCTCTGGC 12250 25 ACGCGCCACT CCCACCCATC CGTGGGCCGC CAGCACCACG CAGGCCCCCC ATCCACATCG CGGCCACCAC 12320 GTCCCTGGGA CACGCCTTGT CCCCCGGTGT ACGCCGAGAC CAAGCACTTC CTCTACTCCT CAGGCGACAA 12390 GGAGCAGCTG CGGCCCTCCT TCCTACTCAG CTCTCTGAGG CCCAGCCTGA CTGGCGCTCG GAGGCTCGTG 12460 GAGACCATCT TTCTGGGTTC CAGGCCCTGG ATGCCAGGGA CTCCCCGCAG GTTGCCCCGC CTGCCCCAGC 12530 GCTACTGGCA AATGCGGCCC CTGTTTCTGG AGCTGCTTGG GAACCACGCG CAGTGCCCCT ACGGGGTGCT 12600 30 CCTCAAGACG CACTGCCCGC TGCGAGCTGC GGTCACCCCA GCAGCCGGTG TCTGTGCCCG GGAGAAGCCC 12670 CAGGGCTCTG TGGCGGCCCC CGAGGAGGAG GACACAGACC CCCGTCGCCT GGTGCAGCTG CTCCGCCAGC 12740 ACAGCAGCCC CTGGCAGGTG TACGGCTTCG TGCGGGCCTG CCTGCGCCGG CTGGTGCCCC CAGGCCTCTG 12810 GGGCTCCAGG CACAACGAAC GCCGCTTCCT CAGGAACACC AAGAAGTTCA TCTCCCTGGG GAAGCATGCC 12880 AAGCTCTCGC TGCAGGAGCT GACGTGGAAG ATGAGCGTGC GGGACTGCGC TTGGCTGCGC AGGAGCCCAG 12950 35 GTGAGGAGGT GGTGGCCGTC GAGGGCCCAG GCCCCAGAGC TGAATGCAGT AGGGGCTCAG AAAAGGGGGC 13020 AGGCAGAGCC CTGGTCCTCC TGTCTCCATC GTCACGTGGG CACACGTGGC TTTTCGCTCA GGACGTCGAG 13090 TGGACACGGT GATCTCTGCC TCTGCTCTCC CTCCTGTCCA GTTTGCATAA ACTTACGAGG TTCACCTTCA 13160 CGTTTTGATG GACACGCGGT TTCCAGGCGC CGAGGCCAGA GCAGTGAACA GAGGAGGCTG GGCGCGGCAG 13230 TGGAGCCGGG TTGCCGGCAA TGGGGAGAAG TGTCTGGAAG CACAGACGCT CTGGCGAGGG TGCCTGCAGG 13300 40 TTACCTATAA TCCTCTTCGC AATTTCAAGG GTGGGAATGA GAGGTGGGGA CGAGAACCCC CTCTTCCTGG 13370 GGGTGGGAGG TAAGGGTTTT GCAGGTGCAC GTGGTCAGCC AATATGCAGG TTTGTGTTTA AGATTTAATT 13440 GTGTGTTGAC GGCCAGGTGC GGTGGCTCAC 
GCCGGTAATC CCAGCACTTT GGGAAGCTGA GGCAGGTGGA 13510 TCACCTGAGG TCAGGAGTTT GAGACCAGCC TGACCAACAT GGTGAAACCC TATCTGTACT AAAAATACAA 13580 AAATTAGCTG GGCATGGTGG TGTGTGCCTG TAATCCCAGC TACTTGGGAG GCTGAGGCAG GAGAATCACT 13650 45 TGAACCCAGG AGGCGGAGGC TGCAGTGAGC TGAGATTGTG CCATTGTACT CCAGCCTGGG CGACAAGAGT 13720 GAAACTCTGT CTTTAAAAAA AAAAAGTGTT CGTTGATTGT GCCAGGACAG GGTAGAGGGA GGGAGATAAG 13790 ACTGTTCTCC AGCACAGATC CTGGTCCCAT CTTTAGGTAT GAAGAGGGCC ACATGGGAGC AGAGGACAGC 13860 AGATGGCTCC ACCTGCTGAG GAAGGGACAG TGTTTGTGGG TGTTCAGGGG ATGGTGCTGC TGGGCCCTGC 13930 CGTGTCCCCA CCCTGTTTTT CTGGATTTGA TGTTGAGGAA CCTCCGCTCC AGCCCCCTTT TGGCTCCCAG 14000 50 TGCTCCCAGG CCCTACCGTG GCAGCTAGAA GAAGTCCCGA TTTCACCCCC TCCCCACAAA CTCCCAAGAC 14070 ATGTAAGACT TCCGGCCATG CAGACAAGGA GGGTGACCTT CTTGGGGCTC TTTTTTTTCT TTTTTTCTTT 14140 TTATGGTGGC AAAAGTCATA TAACATGAGA TTGGCACTCC TAACACCGTT TTCTGTGTAC AGTGCAGAAT 14210 TGCTAACTCG GCGGTGTTTA CAGCAGGTTG CTTGAAATGC TGCGTCTTGC GTGACTGGAA GTCCCTACCC 14280 ATCGAACGGC AGCTGCCTCA CACCTGCTGC GGCTCAGGTG GACCACGCCG AGTCAGATAA GCGTCATGCA 14350 55 ACCCAGTTTT GCTTTTTGTG CTCCAGCTTC CTTCGTTGAG GAGAGTTTGA GTTCTCTGAT CAGGACTCTG 14420 CCTGTCATTG CTGTTCTCTG ACTTCAGATG AGGTCACAAT CTGCCCCTGG CTTATGCAGG GAGTGAGGCG 14490 TGGTCCCCGG GTGTCCCTGT CACGTGCAGG GTGAGTGAGG CGTTGCCCCC AGGTGTCCCT GTCACGTGTA 14560 GGGTGAGTGA GGCGCGGCCC CCGGGTGTCC CTGTCCCGTG CAGCGTGATT GAGGTGTGGC CCCCGGGTGT 14630 CCCTGTCACG TGTAGGGTGA GTGAGGCGCC ATCCCCGGGT GTCCCTGTCA CGTGTAGGGT GAGTGAGGCG 14700 60 TGGTCCCCGG GTGTCCCTGT CCCGTGCAGG GTGAGTGAGG CACTGTCCCC GGGTGTCCCT GTCACGTGCA 14770 GGGTGAGTGA GGCGCGGTCC CCGGGTGTCC CTCTCAGGTG TAGGGTGAGT GAGGCGCGGC CCCAGGGTGT 14840 CCCTGTCACG TGTAGGGTGA GTGAGGCACC GTCCCTGGGT GTCCCTCCCA GGTATAGGGT GAGTGAGGCA 14910 CTGTCCCCGG GTGTCCCTGT CACGTGCAGG GTGAGTGAGG CGCGGCCCCC GGGTGTCCCT CTCAGGTGCA 14980 GGGTGAGTGA GGCGCTGTCC CTGGGTGTCC CTGTCTCGTG TAGGGTGAGT GAGGCTCTGT CCCCAGGTGT 15050 65 CCTTGGCGTT TGCTCACTTG AGCTTGCTCC TGAATGTTTG CTCTTTCTAT AGCCACAGCT GCGCCGGTTG 15120 CCCATTGCCT GGGTAGATGG 
TGCAGGCGCA GTGCTGGTCC CCAAGCCTAT CTTTTCTGAT GCTCGGCTCT 15190 TCTTGGTCAC CTCTCCGTTC CATTTTGCTA CGGGGACACG GGACTGCAGG CTCTCGCCTC CCGCGTGCCA 15260 GGCACTGCAG CCACAGCTTC AGGTCCGCTT GCCTCTGTTG GGCCTGGCTT GCTCACCACG TGCCCGCCAC 15330 ATGCATGCTG CCAATACTCC TCTCCCAGCT TGTCTCATGC CGAGGCTGGA CTCTGGGCTG CCTGTGTCTG 15400 CTGCCACGTG TTGCTGGAGA CATCCCAGAA AGGGTTCTCT GTGCCCTGAA GGAAAGCAAG TCACCCCAGC 15470 CCCCTCACTT GTCCTGTTTT CTCCCAAGCT GCCCCTCTGC TTGGCCCCCT TGGGTGGGTG GCAACGCTTG 15540 TCACCTTATT CTGGGCACCT GCCGCTCATT GCTTAGGCTG GGCTCTGCCT CCAGTCGCCC CCTCACATGG 15610 ATTGACGTCC AGCCACAGGT TGGAGTGTCT CTGTCTGTCT CCTGCTCTGA GACCCACGTG GAGGGCCGGT 15680 GTCTCCGCCA GCCTTCGTCA GACTTCCCTC TTGGGTCTTA GTTTTGAATT TCACTGATTT ACCTCTGACG 15750 TTTCTATCTC TCCATTGTAT GCTTTTTCTT GGTTTATTCT TTCATTCCTT TTCTAGCTTC TTAGTTTAGT 15820 CATGCCTTTC CCTCTAAGTG CTGCCTTACC TGCACCCTGT GTTTTGATGT GAAGTAATCT CAACATCAGC 15890 CACTTTCAAG TGTTCTTAAA ATACTTCAAA GTGTTAATAC TTCTTTTAAG TATTCTTATT CTGTGATTTT 15960 TTTCTTTGTG CACGCTGTGT TTTGACGTGA AATCATTTTG ATATCAGTGA CTTTTAAGTA TTCTTTAGCT 16030 TATTCTGTGA TTTCTTTGAG CAGTGAGTTA TTTGAACACT GTTTATGTTC AAGATATGTA GAGTATCAAG 16100 ATACGTAGAG TATTTTAAGT TATCATTTTA TTATTGATTT CTAACTCAGT TGTGTAGTGG TCTGTATAAT 16170 ACCAATTATT TGAAGTTTGC GGAGCCTTGC TTTGTGATCT AGTGTGTGCA TGGTTTCCAG AACTGTCCAT 16240 TGTAAATTTG ACATCCTGTC AATAGTGGGC ATGCATGTTC ACTATATCCA GCTTATTAAG GTCCAGTGCA 16310 AAGCTTCTGT CTCCTTCTAG ATGCATGAAA TTCCAAGAAG GAGGCCATAG TCCCTCACCT GGGGGATGGG 16380 TCTGTTCATT TCTTCTCGTT TGGTAGCATT TATGTGAGGC ATTGTTAGGT GCATGCACGT GGTAGAATTT 16450 TTATCTTCCT GATGAGTGAA TCTTTTGGAG ACTTCTATGT CTCTAGTAAT CTAGTAATTC TTTTTTTAAA 16520 TTGCTCTTAG TACTGCCACA CTGGGCTTCT TTTGATTAGT ATTTTCCTGC TGTGTCTGTT TTCTGCCTTT 16590 AATTTATATA TATATATATA TTTTTTTTTT TTTTGAGACA GAGTCTTGGT CTGTCGCCCA GGGTGAGTGC 16660 AGTGGTGTGA TCACAGGTCA GTGTAACTTT TACCTTCTGG CCTGAGCCGT CCTCTCACCT CAGCCTCCTG 16730 AGTAGCTGGA ACTGCAGACA CGCACCGCTA CACCTGGCTA ATTTTTAAAT TTTTTCTGGA GACAGGGTCT 16800 TGCTGTGTTG 
CCCAGGCTGG TCTCAAACTC TTGGACTCAA GGGATCCATC TACCTCGGCT TCCCAAAGTG 16870 CTGAATTACA GGCATGAGCC ACCATGTCTG GCCTAATTTT CAACACTTTT ATATTCTTAT AGTGTGGGTA 16940 TGTCCTGTTA ACAGCATGTA GGTGAATTTC CAATCCAGTC TGACAGTCGT TGTTTAACTG GATAACCTGA 17010 15 TTTATTTTCA TTTTTTTGTC ACTAGAGACC CGCCTGGTGC ACTCTGATTC TCCACTTGCC TGTTGCATGT 17080 CCTCGTTCCC TTGTTTCTCA CCACCTCTTG GGTTGCCATG TGCGTTTCCT GCCGAGTGTG TGTTGATCCT 17150 CTCGTTGCCT CCTGGTCACT GGGCATTTGC TTTTATTTCT CTTTGCTTAG TGTTACCCCC TGATCTTTTT 17220 ATTGTCGTTG TTTGCTTTTG TTTATTGAGA CAGTCTCACT CTGTCACCCA GGCTGGAGTG TAATGGCACA 17290 ATCTCGGCTC ACTGCAACCT CTGCCTCCTC GGTTCAAGCA GTTCTCATTC CTCAACCTCA TGAGTAGCTG 17360 20 GGATTACAGG CGCCCACCAC CACGCCTGGC TAATTTTTGT ATTTTTAGTA GAGATAGGCT TTCACCATGT 17430 TGGCCAGGCT GGTCTCAAAC TCCTGACCTC AAGTGATCTG CCCGCCTTGG CCTCCCACAG TGCTGGGATT 17500 ACAGGTGCAA GCCACCGTGC CCGGCATACC TTGATCTTTT AAAATGAAGT CTGAAACATT GCTACCCTTG 17570 TCCTGAGCAA TAAGACCCTT AGTGTATTTT AGCTCTGGCC ACCCCCCAGC CTGTGTGCTG TTTTCCCTGC 17640 TGACTTAGTT CTATCTCAGG CATCTTGACA CCCCCACAAG CTAAGCATTA TTAATATTGT TTTCCGTGTT 17710 25 GAGTGTTTCT GTAGCTTTGC CCCCGCCCTG CTTTTCCTCC TTTGTTCCCC GTCTGTCTTC TGTCTCAGGC 17780 CCGCCGTCTG GGGTCCCCTT CCTTGTCCTT TGCGTGGTTC TTCTGTCTTG TTATTGCTGG TAAACCCCAG 17850 CTTTACCTGT GCTGGCCTCC ATGGCATCTA GCGACGTCCG GGGACCTCTG CTTATGATGC ACAGATGAAG 17920 ATGTGGAGAC TCACGAGGAG GGCGGTCATC TTGGCCCGTG AGTGTCTGGA GCACCACGTG GCCAGCGTTC 17990 CTTAGCCAGT GAGTGACAGC AACGTCCGCT CGGCCTGGGT TCAGCCTGGA AAACCCCAGG CATGTCGGGG 18060 30 TCTGGTGGCT CCGCGGTGTC GAGTTTGAAA TCGCGCAAAC CTGCGGTGTG GCGCCAGCTC TGACGGTGCT 18130 GCCTGGCGGG GGAGTGTCTG CTTCCTCCCT TCTGCTTGGG AACCAGGACA AAGGATGAGG CTCCGAGCCG 18200 TTGTCGCCCA ACAGGAGCAT GACGTGAGCC ATGTGGATAA TTTTAAAATT TCTAGGCTGG GCGCGGTGGC 18270 TCACGCCTGT AATCCCAGCA CTTTGGGAGG CCAAGGCGGG TGGATCACGA GGTCAGGAGG TCGAGACCAT 18340 CCTGGCCAAC ATGATGAAAC CCCATCTGTA CTAAAAACAC AAAAATTAGC TGGGCGTGGT GGCGGGTGCC 18410 35 TGTAATCCCA GCTACTCGGG AGGCTGAGGC AGGAGAATTG CTTGAACCTG GGAGTTGGAA GTTGCAGTGA 18480 
GCCGACATTG CACCACTGCA CTCCAGCCTG GCAACACAGC GAGACTCTGT CTCAAAAAAA AAAAAAAAAA 18550 AAAAAAAAAA AATTCTAGTA GCCACATTAA AAAAGTAAAA AAGAAAAGGT GAAATTAATG TAATAATAGA 18620 TTTTACTGAA GCCCAGCATG TCCACACCTC ATCATTTTAG GGTGTTATTG GTGGGAGCAT CACTCACAGG 18690 ACATTTGACA TTTTTTGAGC TTTGTCTGCG GGATCCCGTG TGTAGGTCCC GTGCGTGGCC ATCTCGGCCT 18760 40 GGACCTGCTG GGCTTCCCAT GGCCATGGCT GTTGTACCAG ATGGTGCAGG TCCGGGATGA GGTCGCCAGG 18830 CCCTCAGTGA GCTGGATGTG CAGTGTCCGG ATGGTGCACG TCTGGGATGA GGTCGCCAGG CCCTGCTGTG 18900 AGCTGGATGT GTGGTGTCTG GATGGTGCAG GTCAGGGGTG AGGTCTCCAG GCCCTCGGTG AGCTGGAGGT 18970 ATGGAGTCCG GATGATGCAG GTCCGGGGTG AGGTCGCCAG GCCCTGCTGT GAGCTGGATG TGTGGTGTCT 19040 GGATGGTGCA GGTCAGGGGT GAGGTCTCCA GGCCCTCGGT AAGCTGGAGG TATGGAGTCC GGATGATGCA 19110 45 GGTCCGGGGT GAGGTCGCCA GGCCCTGCTG TGAGCTGGAT GTGTGGTGTC TGGATGGTGC AGGTCTGGGG 19180 TGAGGTCACC AGGCCCTGCG GTGAGCTGGG TGTGCGGTGT CTGGATGGTG CAGGTCTGGA GTGAGGTCGC 19250 CAGACGGTGC CAGACCATGC GGTGAGCTGG ATATGCGGTG TCCGGATGGT GCAGGTCTGG GGTGAGGTTG 19320 CCAGGCCCTG CTGTGAGTTG GATGTGGGGT GTCCGGATGC TGCAGGTCCG GTGTGAGGTC ACCAGGCCCT 19390 GCTGTGAGCT GGATGTGTGG TGTCTGGATG GTGCAGGTCT GGGGTGAAGG TCGCCAGGCC CCTGCTTGTG 19460 50 AGCTGGATGT GTGGTGTCTG GATGGTGCAG GTCTGGAGTG AGGTCGCCAG GCCCTCGGTG AGCTGGATGT 19530 GCAGTGTCCA GATGGTGCAG GTCCGGGGTG AGGTCGCCAG ACCCTGCGGT GAGCTGGATG TGCGGTGTCT 19600 GGATGGTGCA GGTCTGGAGT GAGGTCGCCA GGCCCTCGGT GAGCTGGATG TATGGAGTCC GGATGGTGCC 19670 GGTCCGGGGT GAGGTCGCCA GACCCTGCTG TGAGCTGGAT GTGCGGTGTC TGGATGGTAC AGGTCTGGAG 19740 TGAGGTCGCC AGACCCTGCT GTGAGCTGGA TATGCGGTGT CCGGATGGTG CAGGTCAGGG GTGAGGTCTC 19810 55 CAGGCCCTCG GTGAGCTGGA GGTATGGAGT CCGGATGATG CAGGTCCGGG GTGAGGTCGC CAGGCCCTGC 19880 TGTGAACTGG ATGTGCGGCG TCTGGATGGT GCAGGTCTGG GGTGTGGTCG CCAGGCCCTC GGTGAGCTGG 19950 AGGTATGGAG TCCGGATGAT GCAGGTCCGG GGTGAGGTCG CCAGGCCCTG CTGTGAGCTG GATGTGCGGC 20020 GTCTGGATGG TGCAGGTCTG GGGTGTGGTC GCCAGGCCCT CGGTGAGCTG GAGGTATGGA GTCCGGATGA 20090 TGCAGGTCCG GGGTGAGGTT GCCAGGCCCT GCTGTGAGCT GGATGTGCTG TATCCGGATG GTGCAGTCCG 
20160 GGGTGAGGTC GCCAGGCCCT GCTGTGAGCT GGATGTGCTG TATCCGGATG GTGCAGGTCT GGGGTGAGGT 20230 CACCAGGCCC TGCGGTGAGC TGGTTGTGCG GTGTCCGGTT GCTGCAGGTC CGGGGTGAGT TCGCCAGGCC 20300 CTCGGTGAGC TGGATGTGCG GTGTCCCCGT GTCCGGATGG TGCAGGTCCA GGGTGAGGTC GCTAGGCCCT 20370 TGGTGGGCTG GATGTGCCGT GTCCGGATGG TGCAGGTCTG GGGTGAGGTC GCCAGGCCTT TGGTGAGCTG 20440 GATGTGCGGT GTCTGCATGG TGCAGGTCTG GGGTGAGGTC GCCAGGCCCT TGGTGGGCTG GATGTGTGGT 20510 GTCCGGATGG TGCAGGTCCG GCGTGAGGTC GCCAGGCCCT GCTGTGAGCT GGATGTGCGG TGTCTGGATG 20580 GTGCAGGTCC GGGGTGAGGT AGCCAAGGCC TTCGGTGAGC TGGATGTGGG GTGTCCGGAT GGTGCAGGTC 20650 CGGGGTGAGG TCGCCAGGCC CTGCGGTTAG CTGGATATGC GGTGTCCGGA TGGTGCAGGT CCGGGGTGAG 20720 GTCACCAGGC CCTGCGGTTA GCTGGATGTG CGGTGTCTGG ATGGTGCAGG TCCGGGGTGA GGTCGCCAGG 20790 CCCTGCTGTG AGCTGGATGT GCTGTATCCG GATGGTGCAG GTCCGGGGTG AGGTCGCCAG GCCCTGCAGT 20860 GAGCTGGATG TGCTGTATCC GGATGGTGCA GGTCTGGCGT GAGGTCGCCA GGCCCTGCGG TTAGCTGGAT 20930 ATGCGGTGTC GGATGGTGCA GGTCCGGGGT GAGGTCACCA GGCCCTGCGG TTAGCTGGAT GTGCGGTGTC 21000 CGGATGGTGC AGGTCTGGGG TGAGGTCGCC AGGCCCTGCT GTGAGCTGGA TGTGCTGTAT CCGGATGGTG 21070 CAGGTCCGGG GTGAGGTCGC CAGGCCCTGC GGTGAGCTGG ATGTGCTGTA TCCGGATGGT GCAGGTCTGG 21140 CGTGAGGTCG CCAGGCCCTG CGGTGAGCTG GATGTGCAGT GTACGGATGG TGCAGGTCCG GGGTGAGGTC 21210 GCCAGGCCCT GCGGTGGGCT GTATGTGTGT TGTCTGGATG GTGCAGGTCC GGGGTGAGTT CGCCAGGCCC 21280 TGCGGTGAGC TGGATGTGTG GTGTCTGGAT GCTGCAGGTC CGGGGTGAGT TCGCCAGGCC CTCGGTGAGC 21350 TGGATATGCG GTGTCCCCGT GTCCGAATGG TGCAGGTCCA GGGTGAGGTC GCCAGGCCCT TGGTGGGCTG 21420 GATGTGCCGT GTCCGGATGG TGCAGGTCTG GGGTGAGGTC GCCAGGCCCT TGGTGAGCTG GATGTGCGGT 21490 GTCCGGATGG TGCAGGTCCG GGGTGAGGTC ACCAGGCCCT CGGTGATCTG GATGTGGCAT GTCCTTCTCG 21560 TTTAAGGGGT TGGCTGTGTT CCGGCCGCAG AGCACCGTCT GCGTGAGGAG ATCCTGGCCA AGTTCCTGCA 21630 CTGGCTGATG AGTGTGTACG TCGTCGAGCT GCTCAGGTCT TTCTTTTATG TCACGGAGAC CACGTTTCAA 21700 AAGAACAGGC TCTTTTTCTA CCGGAAGAGT GTCTGGAGCA AGTTGCAAAG CATTGGAATC AGGTACTGTA 21770 TCCCCACGCC AGGCCTCTGC TTCTCGAAGT CCTGGAACAC CAGCCCGGCC 
TCAGCATGCG CCTGTCTCCA 21840 CTTGCCTGTG CTTCCCTGGC TGTGCAGCTC TGGGCTGGGA GCCAGGGGCC CCGTCACAGG CCTGGTCCAA 21910 GTGGATTCTG TGCAAGGCTC TGACTGCCTG GAGCTCACGT TCTCTTACTT GTAAAATCAG GAGTTTGTGC 21980 CAAGTGGTCT CTAGGGTTTG TAAAGCAGAA GGGATTTAAA TTAGATGGAA ACACTACCAC TAGCCTCCTT 22050 GCCTTTCCCT GGGATGTGGG TCTGATTCTC TCTCTCTTTT TTTTTTCTTT TTTGAGATGG AGTCTCACTC 22120 10 TGTTGCCCAG GCTGGAGTGC AGTGGCATAA TCTTGGCTCA CTGCAACCTC CACCTCCTGG GTTTAAGCGA 22190 TTCACCAGCC TCAGCCTCCT AAGTAGCTGG GATTACAGGC ACCTGCCACC ACGCCTGGCT AATTTTTGTA 22260 CTTTTAGGAG AGACGGGGTT TCACCATGTT GGCCAGGCTG GTCTCGAACT CATGACCTCA GGTGATCCAC 22330 CCACCTTGGC CTCCCAAAGT GCTGGGTTTA CAGGCTAAGC CACCGTGCCC AGCCCCCGAT TCTCTTTTAA 22400 TTCATGCTGT TCTGTATGAA TCTTCAATCT ATTGGATTTA GGTCATGAGA GGATAAAATC CCACCCACTT 22470 15 GGCGACTCAC TGCAGGGAGC ACCTGTGCAG GGAGCACCTG GGGATAGGAG AGTTCCACCA TGAGCTAACT 22540 TCTAGGTGGC TGCATTTGAA TGGCTGTGAG ATTTTGTCTG CAATGTTCGG CTGATGAGAG TGTGAGATTG 22610 TGACAGATTC AAGCTGGATT TGCATCAGTG AGGGACGGGA GCGCTGGTCT GGGAGATGCC AGCCTGGCTG 22680 AGCCCAGGCC ATGGTATTAG CTTCTCCGTG TCCCGCCCAG GCTGACTGTG GAGGGCTTTA GTCAGAAGAT 22750 CAGGGCTTCC CCAGCTCCCC TGCACACTCG AGTCCCTGGG GGGCCTTGTG ACACCCCATG CCCCAAATCA 22820 20 GGATGTCTGC AGAGGGAGCT GGCAGCAGAC CTCGTCAGAG GTAACACAGC CTCTGGGCTG GGGACCCCGA 22890 CGTGGTGCTG GGGCCATTTC CTTGCATCTG GGGGAGGGTC AGGGCTTTCC CTGTGGGAAC AAGTTAATAC 22960 ACAATGCACC TTACTTAGAC TTTACACGTA TTTAATGGTG TGCGACCCAA CATGGTCATT TGACCAGTAT 23030 TTTGGAAAGA ATTTAATTGG GGTGACCGGA AGGAGCAGAC AGACGTGGTG GTCCCCAAGA TGCTCCTTGT 23100 CACTACTGGG ACTGTTGTTC TGCCTGGGGG GCCTTGGAGG CCCCTCCTCC CTGGACAGGG TACCGTGCCT 23170 25 TTTCTACTCT GCTGGGCCTG CGGCCTGCGG TCAGGGCACC AGCTCCGGAG CACCCGCGGC CCCAGTGTCC 23240 ACGGAGTGCC AGGCTGTCAG CCACAGATGC CCAGGTCCAG GTGTGGCCGC TCCAGCCCCC GTGCCCCCAT 23310 GGGTGGTTTT GGGGGAAAAG GCCAAGGGCA GAGGTGTCAG GAGACTGGTG GGCTCATGAG AGCTGATTCT 23380 GCTCCTTGGC TGAGCTGCCC TGAGCAGCCT CTCCCGCCCT CTCCATCTGA AGGGATGTGG CTCTTTCTAC 23450 CTGGGGGTCC TGCCTGGGGC CAGCCTTGGG CTACCCCAGT 
GGCTGTACCA GAGGGACAGG CATCCTGTGT 23520 30 GGAGGGGCAT GGGTTCACGT GGCCCCAGAT GCAGCCTGGG ACCAGGCTCC CTGGTGCTGA TGGTGGGACA 23590 GTCACCCTGG GGGTTGACCG CCGGACTGGG CGTCCCCAGG GTTGACTATA GGACCAGGTG TCCAGGTGCC 23660 CTGCAAGTAG AGGGGCTCTC AGAGGCGTCT GGCTGGCATG GGTGGACGTG GCCCCGGGCA TGGCCTTCAG 23730 CGTGTGCTGC CGTGGGTGCC CTGAGCCCTC ACTGAGTCGG TGGGGGCTTG TGGCTTCCCG TGAGCTTCCC 23800 CCTAGTCTGT TGTCTGGCTG AGCAAGCCTC CTGAGGGGCT CTCTATTGCA GACAGCACTT GAAGAGGGTG 23870 35 CAGCTGCGGG AGCTGTCGGA AGCAGAGGTC AGGCAGCATC GGGAAGCCAG GCCCGCCCTG CTGACGTCCA 23940 GACTCCGCTT CATCCCCAAG CCTGACGGGC TGCGGCCGAT TGTGAACATG GACTACGTCG TGGGAGCCAG 24010 AACGTTCCGC AGAGAAAAGA GGGTGGCTGT GCTTTGGTTT AACTTCCTTT TTAAACAGAA GTGCGTTTGA 24080 GCCCCACATT TGGTATCAGC TTAGATGAAG GGCCCGGAGG AGGGGCCACG GGACACAGCC AGGGCCATGG 24150 CACGGCGCCA ACCCATTTGT GCGCACAGTG AGGTGGCCGA GGTGCCGGTG CCTCCAGAAA AGCAGCGTGG 24220 40 GGGTGTAGGG GGAGCTCCTG GGGCAGGGAC AGGCTCTGAG GACCACAAGA AGCAGCCGGG CCAGGGCCTG 24290 GATGCAGCAC GGCCCGAGGT CCTGGATCCG TGTCCTGCTG TGGTGCGCAG CCTCCGTGCG CTTCCGCTTA 24360 CGGGGCCCGG GGACCAGGCC ACGACTGCCA GGAGCCCACC GGGCTCTGAG GATCCTGGAC CTTGCCCCAC 24430 GGCTCCTGCA CCCCACCCCT GTGGCTGCGG TGGCTGCGGT GACCCCGTCA TCTGAGGAGA GTGTGGGGTG 24500 AGGTGGACAG AGGTGTGGCA TGAGGATCCC GTGTGCAACA CACATGCGGC CAGGAACCCG TTTCAAACAG 24570 45 GGTCTGAGGA AGCTGGGAGG GGTTCTAGGT CCCGGGTCTG GGTGGCTGGG GACACTGGGG AGGGGCTGCT 24640 TCTCCCCTGG GTCCCTATGG TGGGGTGGGC ACTTGGCCGG ATCCACTTTC CTGACTGTCT CCCATGCTGT 24710 CCCCGCCAGG CCGAGCGTCT CACCTCGAGG GTGAAGGCAC TGTTCAGCGT GCTCAACTAC GAGCGGGCGC 24780 GGCGCCCCGG CCTCCTGGGC GCCTCTGTGC TGGGCCTGGA CGATATCCAC AGGGCCTGGC GCACCTTCGT 24850 GCTGCGTGTG CGGGCCCAGG ACCCGCCGCC TGAGCTGTAC TTTGTCAAGG TGGGTGCCGG GGACCCCCGT 24920 50 GAGCAGCCCT GCTGGACCTT GGGAGTGGCT GCCTGATTGG CACCTCATGT TGGGTGGAGG AGGTACTCCT 24990 GGGTGGGCCG CAGGGAGTGC AGGTGACCCT GTCACTGTTG AGGACACACC TGGCACCTAG GGTGGAGGCC 25060 TTCAGCCTTT CCTGCAGCAC ATGGGGCCGA CTGTGCACCC TGACTGCCCG GGCTCCTATT CCCAAGGAGG 25130 GTCCCACTGG ATTCCAGTTT CCGTCAGAGA 
AGGAACCGCA ACGGCTCAGC CACCAGGCCC CGGTGCCTTG 25200 CACCCCAGTC CTGAGCCAGG GGTCTCCTGT CCTGAGGCTC AGAGAGGGGA CACAGCCCGC CCTGCCCTTG 25270 GGGTCTGGAG TGGTGGGGGT CAGAGAGAGA GTGGGGGACA CCGCCAGGCC AGGCCCTGAG GGCAGAGGTG 25340 ATGTCTGAGT TTCTGCGTGG CCACTGTCAG TCTCCTCGCC TCCACTCACA CAGGTGGATG TGACGGGCGC 25410 GTACGACACC ATCCCCCAGG ACAGGCTCAC GGAGGTCATC GCCAGCATCA TCAAACCCCA GAACACGTAC 25480 TGCGTGCGTC GGTATGCCGT GGTCCAGAAG GCCGCCCATG GGCACGTCCG CAAGGCCTTC AAGAGCCACG 25550 TAAGGTTCAC GTGTGATAGT CGTGTCCAGG ATGTGTGTCT CTGGGATATG AATGTGTCTA GAATGCAGTC 25620 GTGTCTGTGA TGCGTTTCTG TGGTGGAGGT ACTTCCATGA TTTACACATC TGTGATATGC GTGTGTGGCA 25690 CGTGTGTGTC GTGGTGCATG TATCTGTGGC GTGCATATTT GTGGTGTGTG TGTGTGTGGC ACGTGTGTGT 25760 CCATGGTGTG TGTGCCTGTG GTGTGCATGT GTGTGTGTCT GTGACACGTG CATGTTCATG CTGTGTGCTG 25830 CATGTCTGTG ATGTGCCTAT TTGTGGTGTG TGTGTGCATG TGTCCGTGAC ATATGCGTGT CTATGGCATG 25900 GGTGTGTGTG GCCCCTTGGC CTTACTCCTT CCTCCTCCAG GCATGGTCCG CACCATTGTC CTCACGCTCT 25970 CGGGTGCTGG TTTGGGGAGC TCCACATTCA GGGTCCTCAC TTCTAGCATG GGTGCCCCTG TCCTGTCACA 26040 GGGCTGGGCC TTGGAGACTG TAAGCCAGGT TTGAGAGGAG AGTAGGGATG CTGGTGGTAC CTTCCTGGAC 26110 CCCTGGCACC CCCAGGACCC CAGTCTGGCC TATGCCGGCT CCATGAGATA TAGGAAGGCT GATTCAGGCC 26180 TCGCTCCCCG GGACACACTC CTCCCAGAGC GGCCGGGGGC CTTGGGGCTC GGCAGGGGTG AAAGGGGCCC 26250 TGGGCTTGGG TTCCCACCCA GTGGTCATGA GCACGCTGGA GGGGTAAGCC CTCAAAGTCG TGCCAGGCCG 26320 GGGTGCAGAG GTGAAGAAGT ATCCCTGGAG CTTCGGTCTG GGGAGAGGCA CATGTGGAAA CCCACAAGGA 26390 CCTCTTTCTC TGACTTCTTG AGCT 26414
Contig 2:
TGTGGGATTG GTTTTCATGT GTGGGATAGG TGGGGATCTG TGGGATTGGT TTTTATGAGT GGGGTAACAC 70 AGAGTTCAAG GCGAGCTTTC TTCCTGTAGT GGGTCTGCAG GTGCTCCAAC AGCTTTATTG AGGAGACCAT 140 ATCTTCCTTT GAACTATGGT CGGGTTTATA GTAAGTCAGG GGTGTGGAGG CCTCCCCTGG GCTCCCTGTT 210 CTGTTTCTTC CACTCTGGGG TCGTGTGGTG CCTGCTGTGG TGTGTGGCCG GTGGGCAGGG CTTCCAGGCC 280 TCCTTGTGTT CATTGGCCTG GATGTGGCCC TGGCTACGCT CCGTCCTTGG AATTCCCCTG CGAGTTGGAG 350 GCTTTCTTTC TTTCTTTTTT TCTTTCTTTT TTTTTTTTTT TGATAACAGA GTCTCGCTCT 
TTTTTGCCCA 420 GGCTGGAGTG GTTTGGCGTG ATCTTGGCTC ACTGCAACCT GTGCTTCCTG AGTTCAAGCA ATTCTCTTGC 490 CTCAGCCTCC CAAGTAGCTG GAATTATAGG CGCCCACCAC CATGCTGACT AATTTTTGTA ATTTTAGTAG 560 10 AGACGAGGTT TCTCCATGTT GGCCAGGCTG GTCTCGAACT CCTGACCTCA GGTGATCCTC CCACCTCGGC 630 CTCCCAAAGT GCTGGGATGA CAGGTGTGAA CCGCCGCGCC CGGCCGAGAC TCGCTTCCTG CAGCTTCCGT 700 GAGATCTGCA GCGATAGCTG CCTGCAGCCT TGGTGCTGAC AACCTCCGTT TTCCTTCTCC AGGTCTCGCT 770 AGGGGTCTTT CCATTTCATG ACTCTCTTCA CAGAAGAGTT TCACGTGTGC TGATTTCCCG GCTGTTTCCT 840 GCGTAATTGG TGTCTGCTGT TTATCGATGG CCTCCTTCCA TTTCCTTTAG GCTTTGTTTA TTGTTGTTTT 910 15 TCCGGCTCCT TGAAGGAAAA GTTTCGATTA TGGATGTTTG AACTTTCTTT TCTAAACAAG CATCTGAAGT 980 TGCCGTTTTC CCTCTAAAGC AGGGATCCCG AGGCCCCTGG CTGTGGAGTG GCACCGGTCT GGGGCCTGTT 1050 AGGAACCCGG CGCACAGCGG GAGGCTAGGT GGGGTGTGGG GAGCCAGCGT TCCCGCCTGA GCCCCGCCCC 1120 TCTCAGATCA GCAGTGGCAT GCGGTGCTCA GAGGCGCACA CACCCTACTG AGAACTGTGC GTGAGAGGGG 1190 TCTAGATTCT GTGCTCCTTA TGGGAATCTA ATGCCTGATG ATCTGAGGTG GAACCGTTTG CTCCCAAAAC 1260 20 CATCCCCTTC CCCACTGCTG TCCTGTGGAA AAATCGTCTT CCACGAAACC AGTCCCTGGT ACCACAATGG 1330 TTGGGGACCC TGTGCTAAAG ACCTGCTTCA GCAGCCTCTC GTCAGTGTTG ATATATTGGC TTTTCTGTGT 1400 TGAGTCCAGA ATAATTACGG ATTTCTGTGA TGCTTTCCGC CGACCTCAGA CCCATGGGCT ATTTGTGGGC 1470 GTGTTGCCTG CTCCTGGGTT GGGAAGGGTG CAGGCCCCAT GTACCTTCCT GTTACTGCCT TCCAGGTTGG 1540 TTCTCAGGGT TGAATCGTAC TCGATGTGGT TTTAGCCCAC GGCCCTGCCG CCAGCTCCTG GGGGCTGGGG 1610 25 AACATGCTGA AGCACAGAGT CACCGTGCGC GTCTTTTGAT GCCTCACAAG CTCGAGGCCT CCTGTGTCCG 1680 TGTTAGTGTG TGTCACGTGC CTGCTCACAT CCTGTCTTGG GGACGCAGGG GCTTAGCAGG TCCCGTAGTA 1750 AATGACAAGC GTCCTGGGGG AGTCTGCAGA ATAGGAGGTG GGGGTGCCGG TCTCTCTCCC GCGTCTTCAG 1820 ACTCTTCTCC TGCCTGTGCT GTGGCTGCAC CTGCATCCCT GCAATCCCTC CAGCACTGGG CTGGAGAGGC 1890 CCGGGAGCTC GAGTGCCACT TGTGCCACGT GACTGTGGAT GGCAGTCGGT CACGGGGGTC TGATGTGTGG 1960 30 TGACTGTGGA TGGCGGTTGG TCACAGGGGT CTGATGTGTG GTGACTGTGG ATGGCGGTCG TGGGGTCTGA 2030 TGTGGTGACT GTGGATGGCG GTCGTGGGGT CTGATGTGTG GTGACTGTGG ATGGCGGTCG TGGGGTCTGA 2100 
TGTGGTGACT GTGGATGGCG GTCGTGGGGT CTGATGTGGT GACTGTGGAT GGCGGTCGTG GGGTCTGATG 2170 TGGTGACTGT GGATGGCAGT CGTGGGGTCT GATGTGTGGT GACTGTGGAT GGCGGTCGTG GGGTCTGATG 2240 TGGTGACTGT GGATGGCAGT CGTGGGGTCT GATGTGTGGT GACTGTGGAT GGCGGTCGTG GGGTCTGATG 2310 35 TGTGGTGACT GTGGATGGCG GTCGTGGGGT CTGATGTGTG GTGACTGTGG ATGGCGGTCG TGGGGTCTGA 2380 TGTGTGGTGA CTGTGGATGG CGGTCGTGGG GTCTGATGTG GTGACTGTGG ATGGCGGTCG TGGGGTCTGA 2450 TGTGTGGTGA CTGTGGATGG TGATCGGTCA CAGGGGTCTG ATGTGTGGTG ACTGTGGATG GCGGTCGTGG 2520 GGTCTGATGT GTGGTGACTG TGGATGGTGA TCGGTCACAG GGGTCTGATG TGTGGTGACT GTGGATGGCG 2590 GTCGTGGGGT CTGATGTGTG GTGACTGTGG ATGGCGGTTG GTCCCGGGGG TCTGATGTGT GGTGACTGTG 2660 40 GATGGCGATC GGTCACAGGG GTCTGATGTG TGGTGACTGT GGATGGCGGT CGTGGGGTCT GATGTGTGGT 2730 GACTGTGGAT GGCGGTCGTG GGGTCTGATG TGTGGTGACT GTGGATGGCG GTCGTGGGGT CTGATGTGGT 2800 GACTGTGGAT GGCGGTCGTG GGGTCTGATG TGGTGACTGT GGATGGCGGT CGTGGGGTCT GATGTGTGGT 2870 GACTGTGGAT GGCGGTTGGT CCCGGGGGTC TGATGTGTGG TGACTGTGGA TGGCGGTCGT GGGGTCTGAT 2940 GTGGTGACTG TGGATGGCAG TCGTGGGGTC TGATGTGTGG TGACTGTGGA TGGCGGTCGT GGGGTCTGAT 3010 45 GTGTGGTGAC TGTGGATGGC GGTCGTGGGG TCTGATGTGT GGTGACTGTG GATGGCGGTC GTGGGGTCTG 3080 ATGTGTGGTG ACTGTGGATG GCGGTCGTGG GGTCTGATGT GGTGACTGTG GATGGCGGTC GTGGGGTCTG 3150 ATGTGTGGTG ACTGTGGATG GTGATCGGTC ACAGGGGTCT GATGTGTGGT GACTGTGGAT GGCGGTCGTG 3220 GGGTCTGATG TGTGGTGACT GTGGATGGCG GTCGTGGGGT CTGATGTGGT GACTGTGGAT GGCGGTCGTG 3290 GGGTCTGATG TGTGGTGACT GTGGATGGCG GTCGTAGGGT CTGATGTGTG GTGACTGTGG ATGGCAGTCG 3360 50 GTCACAGGGG TCTGATGTGT GGTGACTGTG GATGGCGGTC GTGGGGTCTG ATGTGTGGTG ACTGTGGATG 3430 GCGGTCGTGG GGTCTGATGT GTGGTGACTG TGGATGGCGG TCGTGGGGTC TGATGTGTGG TGACTGTGGA 3500 TGGCGGTCGT GGGGTCTGAT GTGGTGACTG TGGATGGTGA TCGGTCACAG GGGTCTGATG TGTGGTAGCT 3570 GCAGGTGGAG TCCCAGGTGT GTCTGTAGCT ACTTTGCGTC CTCGGCCCCC CGGCCCCCGT TTCCCAAACA 3640 GAAGCTTCCC AGGCGCTCTC TGGGCTTCAT CCCGCCATCG GGCTTGGCCG CAGGTCCACA CGTCCTGATC 3710 55 GGAAGAAACA AGTGCCCAGC TCTGGCCGGG GCAGGCCACA TTTGTGGCTC ATGCCCTCTC CTCTGCCGGC 3780 AGGTCTCTAC 
CTTGACAGAC CTCCAGCCGT ACATGCGACA GTTCGTGGCT CACCTGCAGG AGACCAGCCC 3850 GCTGAGGGAT GCCGTCGTCA TCGAGCAGGT CTGGGCACTG CCCTGCAGGG TTGGGCACGG ACTCCCAGCA 3920 GTGGGTCCTC CCCTGGGCAA TCACTGGGCT CATGACCGGA CAGACTGTTG GCCCTGGGGG GCAGTGGGGG 3990 GAATGAGCTG TGATGGGGGC ATGATGAGCT GTGTGCCTTG GCGAAATCTG AGCTGGGCCA TGCCAGGCTG 4060 CGACAGCTGC TGCATTCAGG CACCTGCTCA CGTTTGACTG CGCGGCCTCT CTCCAGTTCC GCAGTGCCTT 4130 TGTTCATGAT TTGCTAAATG TCTTCTCTGC CAGTTTTGAT CTTGAGGCCA AAGGAAAGGT GTCCCCCTCC 4200 TTTAGGAGGG CAGGCCATGT TTGAGCCGTG TCCTGCCCAG CTGGCCCCTC AGTGCTGGGT CTGAGGCCAA 4270 AGGAAACGTG TCCCCCTTCT TAGGAGGACG GGCCGTGTTT GAGCCACGCC CCGCTGAGCG GGCCTCTCAG 4340 TGCTGGGTCT GTCCACGTGG CCCTGTGGCC CTTTGCAGAT GTGGTCTGTC CACGTGGCCC TGTGGCTCTT 4410 TGCAGATGCC TGTTAGCACT TGCTCGGCTC TAGGGGACAG TCGTGTCCAC CGCATGAGGC TCAGAGACCT 4480 CTGGGCGAAT TTCCTTGGCT CCCAGGGTGG GGGTGGAGGT GGCCTGGGCT GCTGGGACCC AGACCCTGTG 4550 CCCGGCAGCT GGGCAGCAAC TCCTGGATCA CATATGCCAT CCGGGCCACG GTGGGCTGTG TGGGTGTGAG 4620 CCCAGCTGGA CCCACAGGTG GCCCAGAGGA GACGTTCTGT GTCACACACT CTGCCTAAGC CCATGTGTGT 4690 CTGCAGAGAC TCGGCCCGGC CAGCCCACGA TGGCCCTGCA TTCCAGCCCA GCCCCGCACT TCATCACAAA 4760 CACTGACCCC AAAAGGGACG GAGGGTCTTG GCCACGTGGT CCTGCCTGTC TCAGCACCCA CCGGCTCACT 4830 CCCATGTGTC TCCCGTCTGC TTTCGCAGAG CTCCTCCCTG AATGAGGCCA GCAGTGGCCT CTTCGACGTC 4900 TTCCTACGCT TCATGTGCCA CCACGCCGTG CGCATCAGGG GCAAGTGAGT CAGGTGGCCA GGTGCCATTG 4970 CCCTGCGGGT GGCTGGGCGG GCTGGCAGGG CTTCTGCTCA CCTCTCTCCT GCCCCTTCCC CACTGNCCTT 5040 CTGCCCGGGG CCACCAGAGT CTCCTTTTCT GGCCCCCGCC CCCTCCGGCT CCTGGGCTGC AGGCTCCCGA 5110 GGCCCCGGAA ACATGGCTCG GCTTGCGGCA GCCGGAGCGG AGCAGGTGCC ACACGAGGCC TGGAAATGGC 5180 AAGCGGGGTG TGGAGTTGCT CCTGCGTGGA GGACGAGGGG CGGGGGGTGT GTCTGGGTCA GGTGTGCGCC 5250 GAGCGTTTGA GCCTGCAGCT TGTCAGCTCC AAGTTACTAC TGACGCTGGA CACCCGGCTC TCACACGCTT 5320 GTATCTCTCT CTCCCGATAC AAAAGGATTT TATCCGATTC TCATTCCTGT CCCTGTCGTG TGACCCCCGC 5390 GAGGGCGCGG GCTCTTCTCT CTGTGACTAG ATTTCCCATC TGGAAAGTGC GGGGTTGACC GTGTAGTTTG 5460 CTCCTCTCGG GGGGCCTGTG 
GTGGCCATGG GGCAGGCGGC CTGGGAGAGC TGCCGTCACA CAGCCACTGG 5530 5 GTGAGCCACA CTCACGGTGG TAGAGCCACA GTGCCTGGTG CCACATCACG TCCTCTGGAT TTTAAGTAAA 5600 ACCACACACC TCCCGGCAGG CATCTGCCTG CGACCCTGTG TGTGCCTGGG GAGAGTGGTA GCACGGAGGA 5670 AATTCGTGCA CACTCAAGGT CATCAGCAAG GTCATCCGCA GTCAGGTGGA ACGTGGAGGC CTCTCTCTGG 5740 GATCGTCTCC AGCGGATAAA GGACTGTGCA CAGCTTCGGA AGCTTTTATT TAAAAATATA ACTATTAATT 5810 ATTGCATTAT AAGTAATCAC TAATGGTATC AGCAATTATA ATATTTATTA AAGTATAATT AGAAATATTA 5880 10 AGTAGTACAC ACGTTCTGGA AAAACACAAA TTGCACATGG CAGCAGAGTG AATTTTGGCC GAGGGACACG 5950 TGTGCACATG TGTGTAAGCG GCCCCCAGGC CCACAGAATT CGCTGACAAA GTCACCTCCC CAGAGAAGCC 6020 ACCACGGGCC TCCTTCGTGG TCGTGAATTT TATTAAGATG GATCAAGTCA CGTACCGTCC ACGTGTGGCA 6090 GGGCTTTGGG GAATGTGAGG TGATGACTGC GTCCTCATGC CCTGACAGAC AGGAGGTGAC TGTGTCTGTC 6160 CTGTCCCTAG GACACGGACA GGCCCGAAGC TCTAGTCCCC ATCGTGGTCC AGTTTGGCCT CTGAATAAAA 6230 15 ACGTCTTCAA AACCTGTTGC CCCAAAAACT AAGAACAGAG AGAGTTTCCC ATCCCATGTG CTCACAGGGG 6300 CGTATCTGCT TGCGTTGACT CGCTGGGCTG GCCGGACTCC TAGAGTTGGT GCGTGTGCTT CTGTGCAAAA 6370 AGTGCAGTCC TCTTGCCCAT CACTGTGATA TCTGCACCAG CAAGGAAAGC CTCTTTTCTT TTCTTTCTTT 6440 TTTTTTTTTT GAGACGGAAC GTCACTGTTG TCTGCCTGGG CTTGAGTGCA GTGGCGCGAT CTCAACTCAC 6510 TGCAACCTCC GCCTCCCGGG TTCCAGCATT TCTCCTGCCT CAGCCTCCCG AGCAGCTGAG ATTACAGGCA 6580 20 CCCACCCCCT GCGCCTGGCT AATTTTTGTA TTTTTAGTAG AGAGGGGTTT TTGCCATGTT GGCCAGGCTG 6650 GTCTCGAACT CCTGACCTCA GGTGATCCAC CCACCTCGGC CTCCCAAAGT GCTGGGATTA CAGGTGTGAG 6720 CCATCACGCC CAGCCGGAAA GCCTCTTTTT AAGGTGACCA CCTATAGCGC TTCCCGAAAA TAACAGGTCT 6790 TGTTTTTGCA GTAGGCTGCA AGCGTCTCTT AGCAACAGGA GTGGCGTCCT GTGGGCTCTG GGGATGGCTG 6860 AGGGTCGCGT GGCAGCCATG CCTTCTGTGT GCACCTTTAG GTTCCACGGG GCTATTCTGC TCTCACTGTT 6930 25 TGTCTGAAAA CGCACCCTTG GCATCCTTGT TTGGAGAGTT TCTGCTTCTC GTTGGTCATG CTGAAACTAG 7000 GGGCAAGGTT GTATCCGTTG GCGCGCAGCG GCTACATGTA GGGTCATGAG TCTTTCACCG TGGACAAATT 7070 CCTTGAAAAA AAAAAAAGGA GTCCGGTTAA GCATTCATTC CGGGTCAAGT GTCTGGTTCT GTGAATAAAC 7140 TCTAAGATTT AAGAAACCTT AATGAAAGAA 
AACCTTGATG ATTCAGAGCA AGGATGTGGT CACACCTGTG 7210 GCTGGATCTG TTTCAGCCGC CCCAGTGCAT GGTGAGAGTG GGGAGCAGGG ATTGTTTGTT CAGAGGTCTC 7280 30 ATCTGGTATG TTTCTGAGGT GTTTGCCGGC TGAATGGTAG ACGTGTCGTT TGTGTGTATG AGGTTCTGTG 7350 TCTGTGTGTG GCTCGGTTTG AGTGTACGCA TGTCCAGCAC ATGCCCTGCC CGTCTCTCAC CTGTGTCTTC 7420 CCGCCCCAGG TCCTACGTCC AGTGCCAGGG GATCCCGCAG GGCTCCATCC TCTCCACGCT GCTCTGCAGC 7490 CTGTGCTACG GCGACATGGA GAACAAGCTG TTTGCGGGGA TTCGGCGGGA CGGGTGAGGC CTCCTCTTCC 7560 CCAGGGGGGC TTGGGTGGGG GTTGATTTGC TTTTGATGCA TTCAGTGTTA ATATTCCTGG TGCTCTGGAG 7630 35 ACCATGACTG CTCTGTCTTG AGGAACCAGA CAAGGTTGCA GCCCCTTCTT GGTATGAAGC CGCACGGGAG 7700 GGGTTGCACA GCCTGAGGAC TGCGGGCTCC ACGCAGGCTC TGTCCAGCGG CCATGTCCAG AGGCCTCAGG 7770 GCTCAGCAGG CGGGAGGGCC GCTGCCCTGC ATGATGAGCA TGTGAATTCA ACACCGAGGA AGCACACCAG 7840 CTTCTGTCAC GTCACCCAGG TTCCGTTAGG GTCCTTGGGG AGATGGGGCT GGTGCAGCCT GAGGCCCCAC 7910 ATCTCCCAGC AGGCCCTCGA CAGGTGGCCT GGACTGGGCG CCTCTTCAGC CCATTGCCCA TCCCACTTGC 7980 40 ATGGGGTCTA CACCCAAGGA CGCACACACC TAAATATCGT GCCAACCTAA TGTGGTTCAA CTCAGCTGGC 8050 TTTTATTGAC AGCAGTTACT TTTTTTTTTT TAATACTTTA AGTTCTAGGG TACATGTGCA CGACGTGCAG 8120 GTTAGTTACA TATGTATACA TGTGCCATGT TGGTGTGCTG CACCCATTAA CTCATCATTT ACATTAGGTA 8190 TATCTCCTAA TGCTATCCCT CCCCACTCCC CCCATCCCAT GACAGGCCCT GGTGTGTGAT GTTCCCCACC 8260 CTGTGTCCAA GTGTTCTCAT TGTTCAGTTC CCACCTGTGA GTGAGAACAT GTGGTGTTTG GTTTTCTTTC 8330 45 CTTGCAATAG TTTGCTCAGA GTGATGGTTT CCAGCTTCGT CCATGTCCCT ACAAAGGACA TGAACTCATC 8400 CTTTTTTATG ACTGCATAGT ATTCCGTGGT GTATATGTGC CACATTTTCT TAATCCAGTC TATCATCGAT 8470 GGACATTTGG GTTGGTTGCA AGTCTTTGCT ACTGTGAATA GTGCCGCAAT AAACATACGT GTGCATGTGT 8540 CTTTATAGCA GCATGATTTA TAATCCTTTG GGTATATACC CAGTAATGGG ATGGCTGGGT CAAATGGTAT 8610 TTCTAGTTCT AGATCCTTGA GGAATCACCA CACTGTCTTC CACAATGGTT GAACTAGTTT ACACTCCCAC 8680 50 CAACAGTGTA AAAGTGTTCT GGTGCTGGAG AGGATGTGGA CAGCAGTTAT TTTTTTATGA AAATAGTATC 8750 ACTGAACAAG CAGACAGTTA GTGAAGGATG CGTCAGGAAG CCTGCAGGCC ACACAGCCAT TTCTCTCGAA 8820 GACTCCGGGT TTTTCCTGTG CATCTTTTGA AACTCTAGCT 
CCAATTATAG CATGTACAGT GGATCAAGGT 8890 TCTTCTTCAT TAAGGTTCAA GTTCTAGATT GAAATAAGTT TATGTAACAG AAACAAAAAT TTCTTGTACA 8960 CACAACTTGC TCTGGGATTT GGAGGAAAGT GTCCTCGAGC TGGCGGCACA CTGGTCAGCC CTCTGGGACA 9030 55 GGATACCTCT GGCCCATGGT CATGGGGCGC TGGGCTTGGG CCTGAGGGTC ACACAGTGCA CCATGCCCAG 9100 CTTCCTGTGG ATAGGATCTG GGTCTCGGAT CATGCTGAGG ACCACAGCTG CCATGCTGGT AAAGGGCACC 9170 ACGTGGCTCA GAGGGGGCGA GGTTCCCAGC CCCAGCTTTC TTACCGTCTT CAGTTATTTT TCCCTAAGAG 9240 TCTGAGAAGT GGGGCCGCGC CTGATGGCCT TCGTTCGTCT TCAGCTGGCA CAGAATTGCA CAAGCTGATG 9310 GTAAACACTG AGTACTTATA ATGAATGAGG AATTGCTGTA GCAGTTAACT GTAGAGAGCT CGTCTGTTGG 9380 60 AAAGAAATTT AAGTTTTTCA TTTAACCGCT TTGGAGAATG TTACTTTATT TATGGCTGTG TAAATTGTTT 9450 GACATTCAGT CCCTCGTAGA CAGATACTAC GTAAAAAGTG TAAAGTTAAC CTTGCTGTGT ATTTTCCCTT 9520 ATTTTAGGCT GCTCCTGCGT TTGGTGGATG ATTTCTTGTT GGTGACACCT CACCTCACCC ACGCGAAAAC 9590 CTTCCTCAGG TGAGGCCCGT GCCGTGTGTC TGTGGGGACC TCCACAGCCT GTGGGCTTTG CAGTTGAGCC 9660 CCCCGTGTCC TGCCCCTGGC ACCGCAGCGT TGTCTCTGCC AAGTCCTCTC TCTCTGCCGG TGCTGGATCC 9730 65 GCAAGAGCAG AGGCGCTTGG CCGTGCACCC AGGCCTGGGG GCGCAGGGGC ACCTTCGGGA GGGAGTGGGT 9800 ACCGTGCAGG CCCTGGTCCT GCAGAGACGC ACCCAGGTTA CACACGTGGT GAGTGCAGGC GGTGACCTGG 9870 CTCCTGCTGC TCTTTGGAAA GTCAAGAGTG GCGGCTCCTG GGGCCCCAGT GAGACCCCCA GGAGCTGTGC 9940 ACAGGGCCTG CAGGGCCGAG GCGGCAGCCT CCTCCCCAGG GTGCACCTGA GCCTGCGGAG AGCAGGAGCT 10010 GCTGAGTGAG CTGGCCCACA GCGTTCGCTG CGGTCACGTT CCTGCGTGGG GTTGTTTGGG ATCGGTGGGA 10080 70 GAATTTGGAT TTGCTGAGTG CTGCTGTCTT GAACCACGGA GATGGCTAGG AGTGGGTTTC AGAGTTGATT 10150 TTTGTGAATC AAACTAAAAT CAGGCACAGG GGACCTGGCC TCAGCACAGG GGATTGTCCA ATGTGGTCCC 10220 CCTCAAGGGC GCCCCACAGA GCCGGTGGGC TTGTTTTAAA GTGCGATTTG ACGAGGGACG AGAAACCTTG 10290 AAAGCTGTAA AGGGAACCCT CAGAAAATGT GGCCGCCAGG GGTGGTTTCA GGTGCTTTGC TGGGCTGTGT 10360 TTGTGAAAAC CCATTTGGAC CCGCCCTCCA AGTCCACCCT CCAGGTCCAC CCTCCAGGGC CGCCCTGGGC 10430 TGGGGGTATG CCTGGCGTTC CTTGTGCCGC AGCCCGGAGC ACAGCAGGCT GTGCACATTT AAATCCACTA 10500 AGATTCACTC GGGGGGAGCC CAGGTCCCAA GCAACTGAGG GCTCAGGAGT 
CCTGAGGCTG CTGAGGGGAC 10570 AGAGCAGACG GGGAACGCTG CTTCTGTGTG GCAAGTTCCT GAGGGTGCTG GCCAGGGAGG TGGCTCAGAG 10640 TGTATGTTGG GGTCCCACCG GGGGCAGAAC TCTGTCTCTG ATGAGTCGGC AGCCATGTAA CAGGAAGGGG 10710 TGGCCACAGG GAGCTGGGAA TGCACCAGGG GAGCTGCGCA GCTGGCCGAG GTCCCAGGGC CAGGCCACAG 10780 GAAGGGCAGG GGGACGCCCG GGGCCACAGC AGAGGCCGCA GGAAGGGAAG GGGATGCCCA GGCCAGAGCA 10850 GAGGCTACCG GGCACAGGGG GGCTCCCTGA GCTGGGTGAG CGAGGCTCAT GACTCGGCGA GGGAACCTCC 10920 TTGACGTGAA GCTGACGACT GGTGTTGCCC AGCTCACAGC CCAGCCAGGT CCCGCGCCTG AGCAGGAACT 10990 CAGAACCCTC CCCTTTGTCT AAAGCACAGC AGATGCCTTC AGGGCATCTA GGAGAAAACA GGCAAAGTCG 11060 TTGAGAAACG TCTTAAAAGA AGGTGGGATG GTGGCAATTT CTTGTCCAGA TTTTAGTCTG CCCCGGACCA 11130 CAGATGAGTC TATAACGGGA TTGTGGTGTT GCCATGGGGA CACATGAGAT GGACCATCAC AGAGGCCACT 11200 GGGGCTGCAC CTCCCATCTG AGTCCTGGCT GTCCCGGGTC CAGGCCAGGT TCTTGCATGC TCACCTACCT 11270 GTCCTGCCCG GGAGACAGGG AAAGCACCCC GAAGTCTGGA GCAGGGCTGG GTCCAGGCTC CTCAGAGCTC 11340 CTGCCAGGCC CAGCACCCTG CTCCAAATCA CCACTTCTCT GGGGTTTTCC AAAGCATTTA ACAAGGGTGT 11410 CAGGTTACCT CCTGGGTGAC GGCCCCGCAT CCTGGGGCTG ACATTGCCCC TCTGCCTTAG GACCCTGGTC 11480 CGAGGTGTCC CTGAGTATGG CTGCGTGGTG AACTTGCGGA AGACAGTGGT GAACTTCCCT GTAGAAGACG 11550 AGGCCCTGGG TGGCACGGCT TTTGTTCAGA TGCCGGCCCA CGGCCTATTC CCCTGGTGCG GCCTGCTGCT 11620 GGATACCCGG ACCCTGGAGG TGCAGAGCGA CTACTCCAGG TGAGCGCACC TGGCCGGAAG TGGAGCCTGT 11690 GCCCGGCTGG GGCAGGTGCT GCTGCAGGGC CGTTGCGTCC ACCTCTGCTT CCGTGTGGGG CAGGCGACTG 11760 CCAATCCCAA AGGGTCAGAG GCCACAGGGT GCCCCTCGTC CCATCTGGGG CTGAGCAGAA ATGCATCTTT 11830 CTGTGGGAGT GAGGGTGCTC ACAACGGGAG CAGTTTTCTG TGCTATTTTG GTAAAAGGAA ATGGTGCACC 11900 AGACCTGGGT GCACTGAGGT GTCTTCAGAA AGCAGTCTGG ATCCGAACCC AAGACGCCCG GGCCCTGCTG 11970 GGCGTGAGTC TCTCAAACCC GAACACAGGG GCCCTGCTGG GCATGAGTCC CTCTGAACCC GAGACCCTGG 12040 GGCCCTGCTG GGCGTGAGTC TCTCCGAACC CAGAGACTTC AGGGCCCTTT TGGGCGTGAG TCTCTCCGCT 12110 GTGAGCCCCA CACTCCAAGG CTCATCCACA GTCTACAGGA TGCCATGAGT TCATGATCAC GTGTGACCCA 12180 TCAGGGGACA GGGCCATGGT GTGGGGGGGG 
TCTCTACAAA ATTCTGGGGT CTTGTTTCCC CAGAGCCCGA 12250 GAGCTCAAGG CCCCGTCTCA GGCTCAGACA CAAATGAATT GAAGATGGAC ACAGATGCAG AAATCTGTGC 12320 TGTTTCTTTT ATGAATAAAA AGTATCAACA TTCCAGGCAG GGCAAGGTGG CTCACACCTA TAATCCCAGC 12390 25 ACTTTGGGAG GCCGAGGTGG GTGGATCACT TGAGGCCAGG AGTTTGAGGC CAACCTAACC AACATAGTGA 12460 AATTCCATTT CTACTTAAAA AATACAAAAA TTAGCCTGGC CTGGTGGCAC ACGCCTGTAG TCCCCGCTAT 12530 GCGGGAGGCT GAGGCAGGAG AATCATTTGA ACCCAGGAGG CAGAGGTTGC AGTGAGCCGA GATCACACCA 12600 CTGCACTCCA GCCTGGGCAA CAGAGTGAGA CTTCATCTTA AAAAAAAAAA AAAAAGTATC AGCATTCCAA 12670 AACCATAGTG GACAGGTGTT TTTTTATTCT GTCCTTCGAT AATATTTACT GGTGCTGTGC TAGAGGCCGG 12740 30 AACTGGGGGT GCCTTCCTCT GAAAGGCACA CCTTCATGGG AAGAGAAATA AGTGGTGAAT GGTTGTTAAA 12810 CCAGAGGTTT AAACTGGGGT CCTGTCGTTC TGAGTTAACA GTCCAGATCT GGACTTTGCC TCTTTCCAGA 12880 ATGCTCCCTG GGGTTTGCTT CATGGGGGAG CAGCAGGTGT GGACACCCTC GTGATGGGGG AGCAGCAGGT 12950 GCAGACGCCC TCATGATGGG GGAGTGGCAG GTGCAGACAC CCTTGTGCAT GGTGCCCAGC ATGTCCCTGT 13020 TGCAGCTCCC TCCCCACAAG GATGCCGGTC TCCTGTGCTC CCCACAGTCC CTGCTTCCCT CTCACAGCCT 13090 35 TACCTGGTCC TGGCCTCCAC TGGCTTTGTC TGCATGATTT CCACATTTCC TGGGCTCCCA GCACCTCTTC 13160 GCCTCTCCCA GGCACCTCTG CAGTGCTGGC CATACCAGTC AGCTGTGAAC TGTCCACTGC TTATTTTGCT 13230 CCCCATGAAA TGTATTTTTT AGGACAGGCA CCCCTGGTTC CAGCCTCTGG CACAGCATCA GTGAATGTTA 13300 TTGAAGGACA AAGGACAGAC AAACAAATCA GGAAAATGGG TTCTCTCTAA ACACATTGCA AAGCCACAGA 13370 GGCTAGTGCA GGATGGGTGG GCATCAGGTC ATCAGATGTG GGTCCAATGC CAGAATATTC TGTGCTCCCA 13440 40 AAGGCCACTT GGTCAGAGTG TGTGCTTGCA GAGGTGGCTC TAAAAGCTCA GCAGTGGAGG CAGTGGTTCG 13510 CCATACTCAG GGTGAACTCA CATCCTCTGT GTCTGAAGTA TACAGCAGAG GCTTGAAGGG CATCTGGGAG 13580 AAGAAAACAG GCAAAATGAT TAAGAAAAGT GAAAAAGGAA AAGTGGTAAG ATGGGAATTT TCTTGTCCAG 13650 ATTTTAGTCT CCCAAACCAC AGCTCAGATG GTAGAATGTG GTCAGAACTG ATGGACAGAA CAATAGAACA 13720 AAACGGAAGC CCTATCTCTC AGAAACGTGT GTTAATGTGG TATGTGGCAC AGCTGATGGA AAAGAGAGTG 13790 45 TGTGTGTAAT TTTTTTTTCT GAGAAAACTG ACTGGAAGCA AATAAGTTGT GTCTTTACAG CATATACCAG 13860 AGCAGATTCT AGGTAGAAGA 
GGAGACACAT GCAAACAACA CCAGCAACAG AAATAAAACA AAAGACTCAA 13930 AGGGAAGGGA GGTGAACGTT CCCTGGTTTG GTGTTGGGGA AGGACACACA GGGAGGCGGA TGAAACCAGT 14000 GAGGCAACGG GCATTGCTTT CACTGCAGAG AAACTCAGCT TGCCTGAGCC ACAGTGAAAA TGGCCATTCC 14070 CTGGAGCGTT TGTGCACGTG ATTTATTTAA GGCGCCCTGT GAGGTCCTGC ACATTCATCC TCTCACTTTG 14140 50 TTCTCCTAAC CACCTGAGAG GTAGAGGAGG AAAGGCTCCA GGGGAGCAGC CGCCCTTGGT CACCCAGCTG 14210 GCAAAGGGCA TGCATGATTG CAGCCTGGCC TCCTGCTCCG GGGCCCTTGC TCTGCCCGAG GACCCCACAC 14280 AAGTCAGACC CATAGGCTCA GGGTGAGCCG GAGCCCAAGG TCGTGTTGGG GATGGCTGTG AAAGAAGAAA 14350 TGGACGTCTG ATGCACACTT GGGAAGGTCC TACCAGCAGC GTCAAAGAAA TGCATGTGAA ACTGACAGCG 14420 AGACCCATCC CTCAAAGAAA CGCACGTGAA ACTGATGGCG AGACCTGTCC CCATCCCTCA TGCTGGCTCC 14490 55 TTTTCTGGGC TTGCCAAGAG CCAGCATCAG GTTGAGGCAA GCTGGAAAGA CTTTTCTGGA AAGCAGCTTG 14560 TTTGCATGGA AGTCCTCACA ATGTCCTGTG TCTTCCCAGT AATTCCACTT CTGAAGTGAC CAGACATTAT 14630 CACGGGTCTT ATTTACCATT TCCAGTGTTC CAGGCAGGGG GACTTGCCAC AGCAAGTCAC GAACCTGCCC 14700 AAATACAGGG CTAAGGAGAT ATTATGCATC ACAAAACTTG CTCTGCCATT AAACATTTTT CAAAGAATTT 14770 TTGAAGAATG TTTAATGGCA CAAAACGTTT ATTTCAATGT AGCAGTGTTC AAAGCTGGAT GTAAAAGAAC 14840 60 ACACCCCAGG AGCCTGCCGT GAATGTCATG TGTGTTCATC TTTGGACATG GACATACATG GGCAGTGAGT 14910 GGTGGTGAGG CCCTGGAGGA CATCGGTGGG ATGCCTCCAT CCTGCCCCTC TGGAGACACC ATGTGTGCCA 14980 CGTGCACTCA CTGGAGCCCT GTTTAGCTGG TGCCACCTGG CTCTTCCATC CCTGAGATTC AAACACAGTG 15050 AGATTCCCCA CGCCCAACTC AGTGTTCTCC CACAAAAAAC CTGAGTCACA CCTGTGTTCA CTCGAGGGAC 15120 GCCCGGGAGC CAGGGCTCCA CAGTTTATTA TGTGTTTTTG GCTGAGTTAT GTGCAGATCT CATCAGGGCA 15190 65 GATGATGAGT GCACAAACAC GGCCGTGCGA GGTTTGGATA CACTCAACAT CACTAGCCAG GTCCTGGTGG 15260 AGTTTGGTCA TGCAGAGTCT GGATGGCATG TAGCATTTGG AGTCCATGGA GTGAGCACCC AGCCCCCTCG 15330 GGCTGCAGCG CATGCCCCAG GCAGGACAAG GAAGCGGGAG GAAGGCAGGA GGCTCTTTGG AGCAAGCTTT 15400 GCAGGAGGGG GCTGGGTGTG GGGCAGGCAC CTGTGTCTGA CATTCCCCCC TGTGTCTCAG CTATGCCCGG 15470 ACCTCCATCA GAGCCAGTCT CACCTTCAAC CGCGGCTTCA AGGCTGGGAG GAACATGCGT CGCAAACTCT 15540 70 TTGGGGTCTT 
GCGGCTGAAG TGTCACAGCC TGTTTCTGGA TTTGCAGGTG AGCAGGCTGA TGGTCAGCAC 15610 AGAGTTCAGA GTTCAGGAGG TGTGTGCGCA AGTATGTGTG TGTGTGTGTG CGCGCGTGCC TGCAAGGCTG 15680 ATGGTGACTG GCTGCACGTA AGAGTGCACA TGTACGCATA TACACGTGAG CACATACATG TGTGCATGTG 15750 TGTACATGAA GGCATGGCAG TGTGTGCACA GGTGTGCAAG GGCACAAGTG TGTGCACATG CGAATGCACA 15820 CCTGACATGC ATGTGTGTTC GTGCACAGTC GTGTGGGCAT TCACGTGAGG TGCATGCGTG TGGGTGTGCA 15890 75 GTGTGAGTAG CATGTGTGCA CATAACATGT ATTGAGGGGT CCTCGTGTTC ACCCCGCTAG GTCCTCAGCA 15960 CCAGTGCCAC TCCTTACAGG ATGAGACGGG GTCCCAGGCC TTGGTGGGCT GAGGCTCTGA AGCTGCAGCC 16030 CTGAGGGCAT TGTCCCATCT GGGCATCCGC GTCCACTCCC TCTCCTGTGG GCTTCTGTGT CCACTCCCCC 16100 TCTCCTGTGG GCATTTACAT CCACTCCACT CCCTCTCTCC TGTGGGCATC CGCGTCCACT CCCCCTCTCT 16170 - 33 GTGGGCATCT GCGTCCACCT CCCCTCTCTG TGGGCATTTG CGTCCACTCC CTCTCCTGGT TCCTTCCTGT 16240 CTTGGCCGAG CCTCGGGGGC AGGCAGATGA CACAGAGTCT TGACTCGCCC AGGGTGGTTC GCAGCTGCCG 16310 GGTGAGGGCC AGGCCGGATT TCACTGGGAA GAGGGATAGT TTCTTGTCAA AATGTTCCTC TTTCTTGTTC 16380 CATCTGAATG GATGATAAAG CAAAAAGTAA AAACTTAAAA TCCCAGAGAG GTTTCTACCG TTTCTCACTC 16450 5 TTTCTTGGCG ACTCTAGGTG AACAGCCTCC AGACGGTGTG CACCAACATC TACAAGATCC TCCTGCTGCA 16520 GGCGTACAGG TGAGCCGCCA CCAAGGGGTG CAGGCCCAGC CTCCAGGGAC CCTCCGCGCT CTGCTCACCT 16590 CTGACCCGGG GCTTCACCTT GGAACTCCTG GGTTTTAGGG GCAAGGAATG TCTTACGTTT TCAGTGGTGC 16660 TGCTGCCTGT GCACAGTTCT GTTCGCGTGG CTCTGTGCAA AGCACCTGTT CTCCATCTCT GGGTAGTGGT 16730 AGGAGCCGGT GTGGCCCCAG GTGTCCCCAC TGTGCCTGTG CACTGGCCGT GGGACGTCAT GGAGGCCATC 16800 10 CCAGGGCAGC AGGGGCATGG GGTAAAGAGA TGTTTATGGG GAGTCTTAGC AGAGGAGGCT GGGAAGGTGT 16870 CTGAACAGTA GATGGGAGAT CAGATGCCCG GAGGATTTGG GGTCTCAGCA AAGAGGGCCG AGGTGGGTGC 16940 AGGTGAGGGT CGCTGGCCCC ACCCCCGGGA AGGTGCAGCA GAGCTGTGGC TCCCCACACA GCCCGGCCAG 17010 CACCTGTGCT CTGGGCATGG CTGTGCTCCT GGAACGTTCC CTGTCCTGGC TGGTCAGGGG GTGCCCCTGC 17080 CAAGAATCGA CAACTTTATC ACAGAGGGAA GGGCCAATCT GTGGAGGCCA CAGGGCCAGC TTCTGCCTGG 17150 15 AGTCAGGGCA GGTGGTGGCA CAAGCCTCGG GGCTGTACCA AAGGGCAGTC GGGCACCACA GGCCCGGGCC 17220 
TCCACCTCAA CAGGCCTCCC GAGCCACTGG GAGCTGAATG CCAGGAGGCC GAAGCCCTCG CCCCATGAGG 17290 GCTGAGAAGG AGTGTGAGCA TTTGTGTTAC CCAGGGCCGA GGCTGCGCGA ATTACCGTGC ACACTTGATG 17360 TGAAATGAGG TCGTCGTCTA TCGTGGAAAC CCAGCAAGGG CTCACGGGAG AGTTTTCCAT TACAAGGTCG 17430 TACCATGAAA ATGGTTTTTA ACCCGAGTGC TTGCGCCTTC ATGCTCTGGC AGGGAGGGCA GAGCCACAGC 17500 20 TGCATGTTAC CGCCTTTGCA CCAGCTCCAG AGGCTTGGGA CCAGGCTGTC TCAGTTCCAG GGTGCGTCCG 17570 GCTCAGACCG CCCTCCTCTC TGCCTTCTCT CTCTGCCTCA AATCTTCCCT CGTTTGCATC TCCCTGACGC 17640 GTGCCTGGGC CCTCGTGCAA GCTGCTTGAC TCCTTTCCGG AAACCCTTGG- GGTGTGCTGG ATACAGGTGC 17710 CACTGAGGAC TGGAGGTGTC TGACACTGTG GTTGACCCCA GGGTCCAGCT GGCGTGCTTG GGGCCTCCTT 17780 GGGCCATGAT GAGGTCAGAG GAGTTTTCCC AGGTGAAAAC TCCTGGGAAA CTCCCAGGGC CATGTGACCT 17850 25 GCCACCTGCT CCTCCCATAT TCAGCTCAGT CTTGTCCTCA TTTCCCCACC AGGGTCTCTA GCTCCGAGGA 17920 GCTCCCGTAG AGGGCCTGGG CTCAGGGCAG GGCGGCTGAG TTTCCCCACC CATGTGGGGA CCCTTGGGTA 17990 GTCGCTTGAT TGGGTAGCCC TGAGGAGGCC GAGATGCGAT GGGCCACGGG CCGTTTCCAA ACACAGAGTC 18060 AGGCACGTGG AAGGCCCAGG AATCCCCTTC CCTCGAGGCA GGAGTGGGAG AACGGAGAGC TGGGCCCCGA 18130 TTTCACGGCA GCCAGGCTGC AGTGGGCGAG GCTGTGGTGG TCCACGTGGC GCTGGGGGCG GGGTCTGATT 18200 30 CAAATCCGCT GGGGCTCGGC CTTCCTGGCC CGTGCTGGCC GCGCCTCCAC ACGGGCTTGG GGTGGACGCC 18270 CCGACCTCTA GCAGGTGGCT ATTTCTCCCT TTGGAAGAGA GCCCCTCACC CATGCTAGGT GTTTCCCTCC 18340 TGGGTCAGGA GCGTGGCCGT GTGGCAACCC CGGGACCTTA GGCTTATTTA TTTGTTTAAA AACATTCTGG 18410 GCCTGGCTTC CGTTGTTGCT AAATGGGGAA AAGACATCCC ACCTCAGCAG AGTTACTGAG AGGCTGAAAC 18480 CGGGGTGCTG GCTTGACTGG TGTGATCTCA GGTCATTCCA GAAGTGGCTC AGGAAGTCAG TGAGACCAGG 18550 35 TACATGGGGG GCTCAGGCAG TGGGTGAGAT GAGGTACACG GGGGGCTCAG GCAGTGGGTG AGGCCAGGTA 18620 CATGGGGGGC TCAGGCACTG GGTGAGATGA GGTACACGGG GGGCTCAGGC AGAGGGTCAG ACCAGGTACA 18690 CGGGGGCTCT GATCACACGC ACATATGAGC ACATGTGCAC ATGTGCTGTT TCATGGTAGC CAGGTCTGTG 18760 CACACCTGCC CCAAAGTCCC AGGAAGCTGA GAGGCCAAAG ATGGAGGCTG ACAGGGCTGG CGCGGTGGCT 18830 CACACCTGTA GTCCCAGCAC TTTGGGAGGC CGAGGCGAGA GGATCCCTTG AGCCCAGGAG TTTAAGACCA 
18900 40 GCCTGAGCAA CATAGTAGAA CCCCATCTCT ATGAAAAATA AAAACAAAAA TTAGCTGAAC ATGGTGGTGT 18970 GCGCCTGTAG TTCCAATACT TGGGAGGCTG AAGTGGGAGG ATCACTTGAG CCCAGGAGGT GGAAGCTGCA 19040 GTGAGCTGAG ATTGCACCAC TGTACTGCAG CCTGGGTGAC AGAGTGAGAG CCCATCTCAA CAACAACAAA 19110 GAAGACTGAC AAATGCAGTT TCTTGGAAAG AAACATTTAG TAGGAACTTA ACCTACACAC AGAAGCCAAG 19180 TCGGTGTCTC GGTGTCAGTG AGATGAGATG ATGGGTCCTC ACACCATCAC CCCAGACCCA GGGTTTATGC 19250 45 ACCACAGGGG CGGGTGGCTC AGAAGGGATG CGCAGGACGT TGATATACGA TGACATCAAG GTTGTCTGAC 19320 GAAGGGCAGG ATTCATGATA AGTACCTGCT GGTACACAAG GAACAATGGA TAAACTGGAA ACCTTAGAGG 19390 CCTTCCCGGA ACAGGGGCTA ATCAGAAGCC AGCATGGGGG GCTGGCATCC AGGATGGAGC TGCTTCAGCC 19460 TCCACATGCG TGTTCATACA GATGGTGCAC AGAAACGCAG TGTACCTGTG CACACACAGA CACGCAGCTA 19530 CTCGCACACA CAAGCACACA CACAGACATG CATGCATGCA TCCGTGTGTG TGCACCTGTG CCCATGAGGA 19600 50 AACCCATGCA TGTGCATTCA TGCACGCACA CAGGCACCGG TGGGCCCATG CCCACACCCA CGAGCACCGT 19670 CTGATTAGGA GGCCTTTCCT CTGACGCTGT CCGCCATCCT CTCAGGTTTC ACGCATGTGT GCTGCAGCTC 19740 CCATTTCATC AGCAAGTTTG GAAGAACCCC ACATTTTTCC TGCGCGTCAT CTCTGACACG GCCTCCCTCT 19810 GCTACTCCAT CCTGAAAGCC AAGAACGCAG GTATGTGCAG GTGCCTGGCC TCAGTGGCAG CAGTGCCTGC 19880 CTGCTGGTGT TAGTGTGTCA GGAGACTGAG TGAATCTGGG CTTAGGAAGT TCTTACCCCT TTTCGCATCA 19950 55 GGAAGTGGTT TAACCCAACC ACTGTCAGGC TCGTCTGCCC GCCCTCTCGT GGGGTGAGCA GAGCACCTGA 20020 TGGAAGGGAC AGGAGCTGTC TGGGAGCTGC CATCCTTCCC ACCTTGCTCT GCCTGGGGAA GCGCTGGGGG 20090 GCCTGGTCTC TCCTGTTTGC CCCATGGTGG GATTTGGGGG GCCTGGCCTC TCCTGTTTGC CCTGTGGTGG 20160 GATTGGGCTG TCTCCCGTCC ATGGCACTTA GGGCCCTTGT GCAAACCCAG GCCAAGGGCT TAGGAGGAGG 20230 CCAGGCCCAG GCTACCCCAC CCCTCTCAGG AGCAGAGGCC GCGTATCACC ACGACAGAGC CCCGCGCCGT 20300 60 CCTCTGCTTC CCAGTCACCG TCCTCTGCCC CTGGACACTT TGTCCAGCAT CAGGGAGGTT TCTGATCCGT 20370 CTGAAATTCA AGCCATGTCG AACCTGCGGT CCTGAGCTTA ACAGCTTCTA CTTTCTGTTC TTTCTGTGTT 20440 GTGGAAATTT CACCTGGAGA AGCCGAAGAA AACATTTCTG TCGTGACTCC TGCGGTGCTT GGGTCGGGAC 20510 AGCCAGAGAT GGAGCCACCC CGCAGACCGT CGGGTGTGGG CAGCTTTCCG GTGTCTCCTG 
GGAGGGGAGC 20580 TGGGCTGGGC CTGTGACTCC TCAGCCTCTG TTTTCCCCCA GGGATGTCGC TGGGGGCCAA GGGCGCCGCC 20650 65 GGCCCTCTGC CCTCCGAGGC CGTGCAGTGG CTGTGCCACC AAGCATTCCT GCTCAAGCTG ACTCGACACC 20720 GTGTCACCTA CGTGCCACTC CTGGGGTCAC TCAGGACAGG CAAGTGTGGG TGGAGGCCAG TGCGGGCCCC 20790 ACCTGCCCAG GGGTCATCCT TGAACGCCCT GTGTGGGGCG AGCAGCCTCA GATGCTGCTG AAGTGCAGAC 20860 GCCCCCGGGC CTGACCCTGG GGGCCTGGAG CCACGCTGGC AGCCCTATGT GATTAAACGC TGGTGTCCCC 20930 AGGCCACGGA GCCTGGCAGG GTCCCCAACT TCTTGAACCC CTGCTTCCCA TCTCAGGGGC GATGGCTCCC 21000 70 CACGCTTGGG AGCCTTCTGA CCCCTGACCT GTGTCCTCTC ACAGCCTCTT CCCTGGCTGC TGCCCTGAGC 21070 TCCTGGGGTC CTGAGCAAGT TCTCTCCCCG CCCCGCCGCT CCAGCGTCAC TGGGCTGCCT GTCTGCTCGC 21140 CCCGGTGGAG GGGTGTCTGT CCCTTCACTG AGGTTCCCAC CAGCCAGGGC CACGAGGTGC AGGCCCTGCC 21210 TGCCCGGCCA CCCACACGTC CTAGGAGGGT TGGAGGATGC CACCTCTGGC CTCTTCTGGA ACGGAGTCTG 21280 ATTTTGGCCC CGCAGCCCAG ACGCAGCTGA GTCGGAAGCT CCCGGGGACG ACGCTGACTG CCCTGGAGGC 21350 75 CGCAGCCAAC CCGGCACTGC CCTCAGACTT CAAGACCATC CTGGACTGAT GGCCACCCGC CCACAGCCAG 21420 GCCGAGAGCA GACACCAGCA GCCCTGTCAC GCCGGGCTCT ACGTCCCAGG GAGGGAGGGG CGGCCCACAC 21490 CCAGGCCCGC ACCGCTGGGA GTCTGAGGCC TGAGTGAGTG TTTGGCCGAG GCCTGCATGT CCGGCTGAAG 21560 GCTGAGTGTC CGGCTGAGGC CTGAGCGAGT GTCCAGCCAA GGGCTGAGTG TCCAGCACAC CTGCCGTCTT 21630 RA4.' 
- 34 CACTTCCCCA CAGGCTGGCG CTCGGCTCCA CCCCAGGGCC AGCTTTTCCT CACCAGGAGC CCGGCTTCCA 21700 CTCCCCACAT AGGAATAGTC CATCCCCAGA TTCGCCATTG TTCACCCCTC GCCCTGCCCT CCTTTGCCTT 21770 CCACCCCCAC CATCCAGGTG GAGACCCTGA GAAGGACCCT GGGAGCTCTG GGAATTTGGA GTGACCAAAG 21840 GTGTGCCCTG TACACAGGCG AGGACCCTGC ACCTGGATGG GGGTCCCTGT GGGTCAAATT GGGGGGAGGT 21910 5 GCTGTGGGAG TAAAATACTG AATATATGAG TTTTTCAGTT TTGAAAAAAA TCTCATGTTT GAATCCTAAT 21980 GTGCACTGCA TAGACACCAC TGTATGCAAT TACAGAAGCC TGTGAGTGAA CGGGGTGGTG GTCAGTGCGG 22050 GCCCATGGCC TGGCTGTGCA TTTACGGAAG TCTATGAGTG AATGGGGTTG TGGTCAGTGC GGGCCCATGG 22120 CCTGGCTGGG CCTGGGAGGT TTCTGATGCT GTGAGGCAGG AGGGGAAGGA GGGTAGGGGA TAGACAGTGG 22190 GAGCCCCCAC CCTGGAAGAC ATAACAGTAA GTCCAGGCCC GAAGGGCAGC AGGGATGCTG GGGGCCCAGC 22260 10 TTGGGCGGCG GGGATGATGG AGGGCCTGGC CAGGGTGGCA GGGATGATGG GGGCCCCAGC TGGGGTGGCA 22330 GGGGTGATGG GGGGGGCTGG TCTGGGTGGC GGGGAAGATG GGGAAGCCTG GCTGGGCCCC CTCCTCCCCT 22400 GCCTCCCACC TGCAGCCGTG GATCCGGATG TGCTTCCCTG GTGCACATCC TCTGGGCCAT CAGCTTTCAT 22470 GGAGGTGGGG GGCAGGGGCA TGACACCATC CTGTATAAAA TCCAGGATTC CTCCTCCTGA ACGCCCCAAC 22540 TCAGGTTGAA AGTCACATTC CGCCTCTGGC CATTCTCTTA AGAGTAGACC AGGATTCTGA TCTCTGAAGG 22610 15 GTGGGTAGGG TGGGGCAGTG GAGGGTGTGG ACACAGGAGG CTTCAGGGTG GGGCTGGTGA TGCTCTCTCA 22680 TCCTCTTATC ATCTCCCAGT CTCATCTCTC ATCCTCTTAT CATCTCCCAG TCTCATCTGT CTTCCTCTTA 22750 TCTCCCAGTC TCATCTGTCA TCCTCTTACC ATCTCCCAGT CTCATCTCTT ATCCTCTTAT CTCCTAGTCT 22820 CATCCAGACT TACCTCCCAG GGCGGGTGCC AGGCTCGCAG TGGAGCTGGA CATACGTCCT TCCTCAGGCA 22890 GAAGGAACTG GAAGGATTGC AGAGAACAGG AGGGGCGGCT CAGAGGGACG CAGTCTTGGG GTGAAGAAAC 22960 20 AGCCCCTCCT CAGAAGTTGG CTTGGGCCAC ACGAAACCGA GGGCCCTGCG TGAGTGGCTC CAGAGCCTTC 23030 CAGCAGGTCC CTGGTGGGGC CTTATGGTAT GGCCGGGTCC TACTGAGTGC ACCTTGGACA GGGCTTCTGG 23100 TTTGAGTGCA GCCCGGACGT GCCTGGTGTC GGGGTGGGGG CTTATGGCCA CTGGATATGG CGTCATTTAT 23170 TGCTGCTGCT TCAGAGAATG TCTGAGTGAC CGAGCCTAAT GTGTATGGTG GGCCCAAGTC CACAGACTGT 23240 GTCGTAAATG CACTCTGGTG CCTGGAGCCC CCGTATAGGA GCTGTGAGGA AGGAGGGGCT 
CTTGGCAGCC 23310 25 GGCCTGGGGG CGCCTTTGCC CTGCAAACTG GAAGGGAGCG GCCCCGGGCG CCGTGGGCGG ACGACCTCAA 23380 GTGAGAGGTT GGACAGAACA GGGCGGGGAC TTCCCAGGAG CAGAGGCCGC TGCTCAGGCA CACCTGGGTT 23450 TGAATCACAG ACCAACaGGT CAGGCCATTG TTCAGCTATC CATCTTCTAC AAAGCTCCAG ATTCCTGTTT 23520 CTCCGGGTGT TTTTTGTTGA AATTTTACTC AGGATTACTT ATATTTTTTG CTAAAGTATT AGACCCTTAA 23590 AAAAGGTATT TGCTTTGATA TGGCTTAACT CACTAAGCAC CTACTTTATT TGTCTGTTTT TATTTATTAT 23660 30 TATTATTATT ATTAGAGATG GTGTCTACTC TGTCACCCAG GTTGTTAGTG CAGTGGCACA GTCATGGCTC 23730 GCTGTAGCCG CAAACCCCCA GGCTCAAGTG ATCCTCCGGC CTCAGCTTCC CAGAGTGCTG GGATTACAGG 23800 TGTGAGCCAC TGCCCTTGCC TGGCACTTTT AAAAACCACT ATGTAAGGTC AGGTCCAGTG GCTTCCACAC 23870 CTGTCATCCC AGTAGTTTGG GAAGCCGAGG CAGAAGGATT GTCTGAGGCC AGGAGTTTGA GACCAGCATG 23940 GGTAACATAG GGAGACCCCA TCTCTACAAA AAATGCAAAA AGTTATCCGG GCGTGGGGTC CAGCATCTGT 24010 35 AGTCCCAGCT GCTCGGGAGG CTGAGTGGGA GGATCGCTTG AGCCCGGGAG GTCATGGCTG CAGTGAGCTG 24080 TGATTGTACC ATCGCACTCC AGCCTGGGCA ACAGAGTGAG ACCCTGTCTC AAAAAAAAAA AAAAAAAAAG 24150 AAGGAGAAGG AGAAGAGAAG AAGAAGGAAG AAGGAAAGAG AAGAAGAAGG AAGAAGGAAG AAAGAAGGAG 24220 AAGGAGGCCT GCTAGGTGCT AGGTAGACTG TCAAATCTCA GAGCAAAATG AAAATAACAA AGTTTTAAAG 24290 GGAAAGAAAA ACCCCAGCTC TTTGGACTTC CTTAGGCCTG AACTTCATCT CAAGCAGCTT CCTTCCACAG 24360 40 ACAAGCGTGT ATGGAGCGAG TGAGTTCAAA GCAGAAAGGG AGGAGAAGCA GGCAAGGGTG GAGGCTGTGG 24430 GTGACACCAG CCAGGACCCC TGAAAGGGAG TGGTTGTTTT CCTGCCTCAG CCCCACGCTC CTGCCGGTCC 24500 TGCACCTGCT GTAACCGTCG ATGTTGGTGC CAGGTGCCCA CCTGGGAAGG ATGCTGTGCA GGGGGCTTGC 24570 CAAACTTTGG TGGGTTTCAG AAGCCCCAGG CACTTGTGGC AGGCACAATT ACAGCCCCTC CCCAAAGATG 24640 CCCACGTCCT TCTCCTGGAA CCTGTGAATG TGTCACCCGC AAGGCAGAGG CTGGTGAAGG CTGCAGGTGG 24710 45 AATCACGGCT GCCAGTCAGC CGATCTTAAG GTCATCCTGG ATTATCTGGT GGGCCTGATA TGGCCACAAG 24780 GGTCCCTAGA AGTGAGAGAG GGAGGCAGGG GAGAGTCAGA GAGGGGACGT GAGAAGGACC ACTGGCCACT 24850 GCTGGCTTTG AGATGGAGGA GGGGGTCCCC AGCCAAGGAA TGGGGGCAGC CGCTCCATGC TGGAAAAGCA 24920 AGCAATCCTC CCCGGTCCTG AGGGCACACG GCCCTGCCCA CGCCTCGATT 
TCAGGCCAGT GGGACCTGTT 24990 TCAGCTTTCC GGCCTCCAGA GCTGTAAGAT GATGCGTTTG TGTTCAGCCA CTAAGCTGCA GTGATTCGTC 25060 ACAGCAGCAA ATGGAATAGC AGTACAGGGA AATGAATACA GGGACAGTTC TCAGAGTGAC TCTCAGCCCA 25130 CCCCTGGG 25138

Example 5

Comparison of the above-described genomic hTC sequence with the sequence of the hTC cDNA (Fig. 6; corresponding to SEQ ID NO 2) made it possible to elucidate the exon-intron structure of the hTC gene. The genomic organization of the hTC gene is illustrated diagrammatically in Fig. 7. The coding region of the hTC gene is composed of 16 exons which vary in size between 62 bp and 1354 bp (see Table 1). Exon 1 contains the translation start codon ATG. The translation stop codon TGA and the 3'-untranslated region lie on exon 16 (Fig. 8). No possible polyadenylation signal (AATAAA) was found either in exon 16 or in the 3195 bp of the following 3'-flanking region.

The exon-intron transitions were determined on the basis of the consensus sequence

                5'-Exon          Intron                          3'-Exon
Pre-mRNA        A/C  A    G   |  G    T    A/G  A   ...  N    C    A    G   |  G
Frequency (%)   70   60   80  |  100  100  95   70            80   100  100 |  60

and listed in Table 1. With the exception of the 5' splice site between exon 15 and intron 15, all the exon-intron transitions are in accord with the published splice consensus sequence (Shapiro and Senapathy, 1987). The sizes of the introns are between 104 bp and 8616 bp. Since only part of intron 6 was isolated, it is not possible to determine the precise length of the hTC gene. Based on the partial sequence of ~4660 bp which was obtained from intron 6, the minimum size of the hTERT gene is 37 kb.

Introns 1-5 and the 5' region of intron 6 are contained in contig 1:
Intron 1: bp 11493-11596 (SEQ ID NO 4);
Intron 2: bp 12951-21566 (SEQ ID NO 5);
Intron 3: bp 21763-23851 (SEQ ID NO 6);
Intron 4: bp 24033-24719 (SEQ ID NO 7);
Intron 5: bp 24900-25393 (SEQ ID NO 8);
5' region of intron 6: bp 25550-26414 (SEQ ID NO 9).
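The intron and exon sizes quoted in this example can be cross-checked against the contig 1 coordinates listed above. The following is a minimal sketch, not part of the original disclosure; it assumes that the positions are 1-based and inclusive at both ends, and the names `CONTIG1_INTRONS`, `span_length`, `intron_lengths` and `internal_exon_lengths` are hypothetical helpers introduced only for illustration.

```python
# Intron coordinates in contig 1, copied from the list above.
# Assumption: positions are 1-based and inclusive on both ends.
CONTIG1_INTRONS = [
    ("intron 1", 11493, 11596),
    ("intron 2", 12951, 21566),
    ("intron 3", 21763, 23851),
    ("intron 4", 24033, 24719),
    ("intron 5", 24900, 25393),
    # Only the 5' part of intron 6 lies in contig 1, so this span is partial.
    ("intron 6 (5' region)", 25550, 26414),
]


def span_length(start, end):
    """Length in bp of an inclusive, 1-based span."""
    return end - start + 1


def intron_lengths(introns):
    """Map each listed intron (or intron fragment) to its length in bp."""
    return {name: span_length(start, end) for name, start, end in introns}


def internal_exon_lengths(introns):
    """Lengths of the exons lying between consecutive listed introns."""
    lengths = {}
    for k in range(len(introns) - 1):
        prev_end = introns[k][2]
        next_start = introns[k + 1][1]
        # Exon k+2 sits between intron k+1 and intron k+2.
        lengths[f"exon {k + 2}"] = next_start - prev_end - 1
    return lengths


if __name__ == "__main__":
    for name, bp in intron_lengths(CONTIG1_INTRONS).items():
        print(f"{name}: {bp} bp")
    for name, bp in internal_exon_lengths(CONTIG1_INTRONS).items():
        print(f"{name}: {bp} bp")
```

Under the stated assumption, introns 1 and 2 come out at 104 bp and 8616 bp, which are the smallest and largest intron sizes given in the text, and exon 2 comes out at 1354 bp, the upper bound of the exon size range. The value computed for intron 6 covers only the fragment present in contig 1.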
The 3' region of intron 6 and introns 7-15 are located in contig 2 at the following positions:
3' region of intron 6: bp 1-3782 (SEQ ID NO 10);
Intron 7: bp 3879-4858 (SEQ ID NO 11);
Intron 8: bp 4945-7429 (SEQ ID NO 12);
Intron 9: bp 7544-9527 (SEQ ID NO 13);
Intron 10: bp 9600-11470 (SEQ ID NO 14);
Intron 11: bp 11660-15460 (SEQ ID NO 15);
Intron 12: bp 15588-16467 (SEQ ID NO 16);
Intron 13: bp 16530-19715 (SEQ ID NO 17);
Intron 14: bp 19841-20621 (SEQ ID NO 18);
Intron 15: bp 20760-21295 (SEQ ID NO 19).

The 3'-untranscribed region is also located in contig 2 at position 21960-25138 (SEQ ID NO 20).

The individual sequences of the abovementioned introns are as follows:
-37 Intron 1 (SEQ ID NO 4) GTGGGCCTCCCCGGGGTCGGCGTCCGGCTGGGGTTGAGGGCGGCCGGGGGGAACCAGCGACATGCGGAGAGCAGCGCAGG CGACTCAGGGCGCTTCCCCCGCAG 5 Intron 2 (SEQ ID NO 5) GTGAGGAGGTGGTGGCCGTCGAGGGCCCAGGCCCCAGAGCTGAATGCAGTAGGGGCTCAGAAAAGGGGGCAGGCAGAGCC CTGGTCCTCCTGTCTCCATCGTCACGTGGGCACACGTGGCTTTTCGCTCAGGACGTCGAGTGGACACGGTGATCTCTGCC TCTGCTCTCCCTCCTGTCCAGTTTGCATAAACTTACGAGGTTCACCTTCACGTTTTGATGGACACGCGGTTTCCAGGCGC CGAGGCCAGAGCAGTGAACAGAGGAGGCTGGGCGCGGCAGTGGAGCCGGGTTGCCGGCAATGGGGAGAAGTGTCTGGAAG 10 CACAGACGCTCTGGCGAGGGTGCCTGCAGGTTACCTATAATCCTCTTCGCAATTTCAAGGGTGGGAATGAGAGGTGGGGA CGAGAACCCCCTCTTCCTGGGGGTGGGAGGTAAGGGTTTTGCAGGTGCACGTGGTCAGCCAATATGCAGGTTTGTGTTTA AGATTTAATTGTGTGTTGACGGCCAGGTGCGGTGGCTCACGCCGGTAATCCCAGCACTTTGGGAAGCTGAGGCAGGTGGA TCACCTGAGGTCAGGAGTTTGAGACCAGCCTGACCAACATGGTGAAACCCTATCTGTACTAAAAATACAAAAATTAGCTG GGCATGGTGGTGTGTGCCTGTAATCCCAGCTACTTGGGAGGCTGAGGCAGGAGAATCACTTGAACCCAGGAGGCGGAGGC 15 TGCAGTGAGCTGAGATTGTGCCATTGTACTCCAGCCTGGGCGACAAGAGTGAAACTCTGTCTTTAAAAAAAAAAAGTGTT CGTTGATTGTGCCAGGACAGGGTAGAGGGAGGGAGATAAGACTGTTCTCCAGCACAGATCCTGGTCCCATCTTTAGGTAT GAAGAGGGCCACATGGGAGCAGAGGACAGCAGATGGCTCCACCTGCTGAGGAAGGGACAGTGTTTGTGGGTGTTCAGGGG ATGGTGCTGCTGGGCCCTGCCGTGTCCCCACCCTGTTTTTCTGGATTTGATGTTGAGGAACCTCCGCTCCAGCCCCCTTT TGGCTCCCAGTGCTCCCAGGCCCTACCGTGGCAGCTAGAAGAAGTCCCGATTTCACCCCCTCCCCACAAACTCCCAAGAC 20 ATGTAAGACTTCCGGCCATGCAGACAAGGAGGGTGACCTTCTTGGGGCTCTTTTTTTTCTTTTTTTCTTTTTATGGTGGC AAAAGTCATATAACATGAGATTGGCACTCCTAACACCGTTTTCTGTGTACAGTGCAGAATTGCTAACTCGGCGGTGTTTA CAGCAGGTTGCTTGAAATGCTGCGTCTTGCGTGACTGGAAGTCCCTACCCATCGAACGGCAGCTGCCTCACACCTGCTGC GGCTCAGGTGGACCACGCCGAGTCAGATAAGCGTCATGCAACCCAGTTTTGCTTTTTGTGCTCCAGCTTCCTTCGTTGAG GAGAGTTTGAGTTCTCTGATCAGGACTCTGCCTGTCATTGCTGTTCTCTGACTTCAGATGAGGTCACAATCTGCCCCTGG 25 CTTATGCAGGGAGTGAGGCGTGGTCCCCGGGTGTCCCTGTCACGTGCAGGGTGAGTGAGGCGTTGCCCCCAGGTGTCCCT GTCACGTGTAGGGTGAGTGAGGCGCGGCCCCCGGGTGTCCCTGTCCCGTGCAGCGTGATTGAGGTGTGGCCCCCGGGTGT CCCTGTCACGTGTAGGGTGAGTGAGGCGCCATCCCCGGGTGTCCCTGTCACGTGTAGGGTGAGTGAGGCGTGGTCCCCGG 
GTGTCCCTGTCCCGTGCAGGGTGAGTGAGGCACTGTCCCCGGGTGTCCCTGTCACGTGCAGGGTGAGTGAGGCGCGGTCC CCGGGTGTCCCTCTCAGGTGTAGGGTGAGTGAGGCGCGGCCCCAGGGTGTCCCTGTCACGTGTAGGGTGAGTGAGGCACC 30 GTCCCTGGGTGTCCCTCCCAGGTATAGGGTGAGTGAGGCACTGTCCCCGGGTGTCCCTGTCACGTGCAGGGTGAGTGAGG CGCGGCCCCCGGGTGTCCCTCTCAGGTGCAGGGTGAGTGAGGCGCTGTCCCTGGGTGTCCCTGTCTCGTGTAGGGTGAGT GAGGCTCTGTCCCCAGGTGTCCTTGGCGTTTGCTCACTTGAGCTTGCTCCTGAATGTTTGCTCTTTCTATAGCCACAGCT GCGCCGGTTGCCCATTGCCTGGGTAGATGGTGCAGGCGCAGTGCTGGTCCCCAAGCCTATCTTTTCTGATGCTCGGCTCT TCTTGGTCACCTCTCCGTTCCATTTTGCTACGGGGACACGGGACTGCAGGCTCTCGCCTCCCGCGTGCCAGGCACTGCAG 35 CCACAGCTTCAGGTCCGCTTGCCTCTGTTGGGCCTGGCTTGCTCACCACGTGCCCGCCACATGCATGCTGCCAATACTCC TCTCCCAGCTTGTCTCATGCCGAGGCTGGACTCTGGGCTGCCTGTGTCTGCTGCCACGTGTTGCTGGAGACATCCCAGAA AGGGTTCTCTGTGCCCTGAAGGAAAGCAAGTCACCCCAGCCCCCTCACTTGTCCTGTTTTCTCCCAAGCTGCCCCTCTGC TTGGCCCCCTTGGGTGGGTGGCAACGCTTGTCACCTTATTCTGGGCACCTGCCGCTCATTGCTTAGGCTGGGCTCTGCCT CCAGTCGCCCCCTCACATGGATTGACGTCCAGCCACAGGTTGGAGTGTCTCTGTCTGTCTCCTGCTCTGAGACCCACGTG 40 GAGGGCCGGTGTCTCCGCCAGCCTTCGTCAGACTTCCCTCTTGGGTCTTAGTTTTGAATTTCACTGATTTACCTCTGACG TTTCTATCTCTCCATTGTATGCTTTTTCTTGGTTTATTCTTTCATTCCTTTTCTAGCTTCTTAGTTTAGTCATGCCTTTC CCTCTAAGTGCTGCCTTACCTGCACCCTGTGTTTTGATGTGAAGTAATCTCAACATCAGCCACTTTCAAGTGTTCTTAAA ATACTTCAAAGTGTTAATACTTCTTTTAAGTATTCTTATTCTGTGATTTTTTTCTTTGTGCACGCTGTGTTTTGACGTGA AATCATTTTGATATCAGTGACTTTTAAGTATTCTTTAGCTTATTCTGTGATTTCTTTGAGCAGTGAGTTATTTGAACACT 45 GTTTATGTTCAAGATATGTAGAGTATCAAGATACGTAGAGTATTTTAAGTTATCATTTTATTATTGATTTCTAACTCAGT TGTGTAGTGGTCTGTATAATACCAATTATTTGAAGTTTGCGGAGCCTTGCTTTGTGATCTAGTGTGTGCATGGTTTCCAG
AACTGTCCATTGTAAATTTGACATCCTGTCAATAGTGGGCATGCATGTTCACTATATCCAGCTTATTAAGGTCCAGTGCA
-38 AAGCTTCTGTCTCCTTCTAGATGCATGAAATTCCAAGAAGGAGGCCATAGTCCCTCACCTGGGGGATGGGTCTGTTCATT TCTTCTCGTTTGGTAGCATTTATGTGAGGCATTGTTAGGTGCATGCACGTGGTAGAATTTTTATCTTCCTGATGAGTGAA TCTTTTGGAGACTTCTATGTCTCTAGTAATCTAGTAATTCTTTTTTTAAATTGCTCTTAGTACTGCCACACTGGGCTTCT TTTGATTAGTATTTTCCTGCTGTGTCTGTTTTCTGCCTTTAATTTATATATATATATATATTTTTTTTTTTTTTGAGACA 5 GAGTCTTGGTCTGTCGCCCAGGGTGAGTGCAGTGGTGTGATCACAGGTCAGTGTAACTTTTACCTTCTGGCCTGAGCCGT CCTCTCACCTCAGCCTCCTGAGTAGCTGGAACTGCAGACACGCACCGCTACACCTGGCTAATTTTTAAATTTTTTCTGGA GACAGGGTCTTGCTGTGTTGCCCAGGCTGGTCTCAAACTCTTGGACTCAAGGGATCCATCTACCTCGGCTTCCCAAAGTG CTGAATTACAGGCATGAGCCACCATGTCTGGCCTAATTTTCAACACTTTTATATTCTTATAGTGTGGGTATGTCCTGTTA ACAGCATGTAGGTGAATTTCCAATCCAGTCTGACAGTCGTTGTTTAACTGGATAACCTGATTTATTTTCATTTTTTTGTC 10 ACTAGAGACCCGCCTGGTGCACTCTGATTCTCCACTTGCCTGTTGCATGTCCTCGTTCCCTTGTTTCTCACCACCTCTTG GGTTGCCATGTGCGTTTCCTGCCGAGTGTGTGTTGATCCTCTCGTTGCCTCCTGGTCACTGGGCATTTGCTTTTATTTCT CTTTGCTTAGTGTTACCCCCTGATCTTTTTATTGTCGTTGTTTGCTTTTGTTTATTGAGACAGTCTCACTCTGTCACCCA GGCTGGAGTGTAATGGCACAATCTCGGCTCACTGCAACCTCTGCCTCCTCGGTTCAAGCAGTTCTCATTCCTCAACCTCA TGAGTAGCTGGGATTACAGGCGCCCACCACCACGCCTGGCTAATTTTTGTATTTTTAGTAGAGATAGGCTTTCACCATGT 15 TGGCCAGGCTGGTCTCAAACTCCTGACCTCAAGTGATCTGCCCGCCTTGGCCTCCCACAGTGCTGGGATTACAGGTGCAA GCCACCGTGCCCGGCATACCTTGATCTTTTAAAATGAAGTCTGAAACATTGCTACCCTTGTCCTGAGCAATAAGACCCTT AGTGTATTTTAGCTCTGGCCACCCCCCAGCCTGTGTGCTGTTTTCCCTGCTGACTTAGTTCTATCTCAGGCATCTTGACA CCCCCACAAGCTAAGCATTATTAATATTGTTTTCCGTGTTGAGTGTTTCTGTAGCTTTGCCCCCGCCCTGCTTTTCCTCC TTTGTTCCCCGTCTGTCTTCTGTCTCAGGCCCGCCGTCTGGGGTCCCCTTCCTTGTCCTTTGCGTGGTTCTTCTGTCTTG 20 TTATTGCTGGTAAACCCCAGCTTTACCTGTGCTGGCCTCCATGGCATCTAGCGACGTCCGGGGACCTCTGCTTATGATGC ACAGATGAAGATGTGGAGACTCACGAGGAGGGCGGTCATCTTGGCCCGTGAGTGTCTGGAGCACCACGTGGCCAGCGTTC CTTAGCCAGTGAGTGACAGCAACGTCCGCTCGGCCTGGGTTCAGCCTGGAAAACCCCAGGCATGTCGGGGTCTGGTGGCT CCGCGGTGTCGAGTTTGAAATCGCGCAAACCTGCGGTGTGGCGCCAGCTCTGACGGTGCTGCCTGGCGGGGGAGTGTCTG CTTCCTCCCTTCTGCTTGGGAACCAGGACAAAGGATGAGGCTCCGAGCCGTTGTCGCCCAACAGGAGCATGACGTGAGCC 25 
ATGTGGATAATTTTAAAATTTCTAGGCTGGGCGCGGTGGCTCACGCCTGTAATCCCAGCACTTTGGGAGGCCAAGGCGGG TGGATCACGAGGTCAGGAGGTCGAGACCATCCTGGCCAACATGATGAAACCCCATCTGTACTAAAAACACAAAAATTAGC TGGGCGTGGTGGCGGGTGCCTGTAATCCCAGCTACTCGGGAGGCTGAGGCAGGAGAATTGCTTGAACCTGGGAGTTGGAA GTTGCAGTGAGCCGACATTGCACCACTGCACTCCAGCCTGGCAACACAGCGAGACTCTGTCTCAAAAAAAAAAAAAAAAA AAAAAAAAAAAATTCTAGTAGCCACATTAAAAAAGTAAAAAAGAAAAGGTGAAATTAATGTAATAATAGATTTTACTGAA 30 GCCCAGCATGTCCACACCTCATCATTTTAGGGTGTTATTGGTGGGAGCATCACTCACAGGACATTTGACATTTTTTGAGC TTTGTCTGCGGGATCCCGTGTGTAGGTCCCGTGCGTGGCCATCTCGGCCTGGACCTGCTGGGCTTCCCATGGCCATGGCT GTTGTACCAGATGGTGCAGGTCCGGGATGAGGTCGCCAGGCCCTCAGTGAGCTGGATGTGCAGTGTCCGGATGGTGCACG TCTGGGATGAGGTCGCCAGGCCCTGCTGTGAGCTGGATGTGTGGTGTCTGGATGGTGCAGGTCAGGGGTGAGGTCTCCAG GCCCTCGGTGAGCTGGAGGTATGGAGTCCGGATGATGCAGGTCCGGGGTGAGGTCGCCAGGCCCTGCTGTGAGCTGGATG 35 TGTGGTGTCTGGATGGTGCAGGTCAGGGGTGAGGTCTCCAGGCCCTCGGTAAGCTGGAGGTATGGAGTCCGGATGATGCA GGTCCGGGGTGAGGTCGCCAGGCCCTGCTGTGAGCTGGATGTGTGGTGTCTGGATGGTGCAGGTCTGGGGTGAGGTCACC AGGCCCTGCGGTGAGCTGGGTGTGCGGTGTCTGGATGGTGCAGGTCTGGAGTGAGGTCGCCAGACGGTGCCAGACCATGC GGTGAGCTGGATATGCGGTGTCCGGATGGTGCAGGTCTGGGGTGAGGTTGCCAGGCCCTGCTGTGAGTTGGATGTGGGGT GTCCGGATGCTGCAGGTCCGGTGTGAGGTCACCAGGCCCTGCTGTGAGCTGGATGTGTGGTGTCTGGATGGTGCAGGTCT 40 GGGGTGAAGGTCGCCAGGCCCCTGCTTGTGAGCTGGATGTGTGGTGTCTGGATGGTGCAGGTCTGGAGTGAGGTCGCCAG GCCCTCGGTGAGCTGGATGTGCAGTGTCCAGATGGTGCAGGTCCGGGGTGAGGTCGCCAGACCCTGCGGTGAGCTGGATG TGCGGTGTCTGGATGGTGCAGGTCTGGAGTGAGGTCGCCAGGCCCTCGGTGAGCTGGATGTATGGAGTCCGGATGGTGCC GGTCCGGGGTGAGGTCGCCAGACCCTGCTGTGAGCTGGATGTGCGGTGTCTGGATGGTACAGGTCTGGAGTGAGGTCGCC AGACCCTGCTGTGAGCTGGATATGCGGTGTCCGGATGGTGCAGGTCAGGGGTGAGGTCTCCAGGCCCTCGGTGAGCTGGA 45 GGTATGGAGTCCGGATGATGCAGGTCCGGGGTGAGGTCGCCAGGCCCTGCTGTGAACTGGATGTGCGGCGTCTGGATGGT GCAGGTCTGGGGTGTGGTCGCCAGGCCCTCGGTGAGCTGGAGGTATGGAGTCCGGATGATGCAGGTCCGGGGTGAGGTCG CCAGGCCCTGCTGTGAGCTGGATGTGCGGCGTCTGGATGGTGCAGGTCTGGGGTGTGGTCGCCAGGCCCTCGGTGAGCTG
-39 GAGGTATGGAGTCCGGATGATGCAGGTCCGGGGTGAGGTTGCCAGGCCCTGCTGTGAGCTGGATGTGCTGTATCCGGATG GTGCAGTCCGGGGTGAGGTCGCCAGGCCCTGCTGTGAGCTGGATGTGCTGTATCCGGATGGTGCAGGTCTGGGGTGAGGT CACCAGGCCCTGCGGTGAGCTGGTTGTGCGGTGTCCGGTTGCTGCAGGTCCGGGGTGAGTTCGCCAGGCCCTCGGTGAGC TGGATGTGCGGTGTCCCCGTGTCCGGATGGTGCAGGTCCAGGGTGAGGTCGCTAGGCCCTTGGTGGGCTGGATGTGCCGT 5 GTCCGGATGGTGCAGGTCTGGGGTGAGGTCGCCAGGCCTTTGGTGAGCTGGATGTGCGGTGTCTGCATGGTGCAGGTCTG GGGTGAGGTCGCCAGGCCCTTGGTGGGCTGGATGTGTGGTGTCCGGATGGTGCAGGTCCGGCGTGAGGTCGCCAGGCCCT GCTGTGAGCTGGATGTGCGGTGTCTGGATGGTGCAGGTCCGGGGTGAGGTAGCCAAGGCCTTCGGTGAGCTGGATGTGGG GTGTCCGGATGGTGCAGGTCCGGGGTGAGGTCGCCAGGCCCTGCGGTTAGCTGGATATGCGGTGTCCGGATGGTGCAGGT CCGGGGTGAGGTCACCAGGCCCTGCGGTTAGCTGGATGTGCGGTGTCTGGATGGTGCAGGTCCGGGGTGAGGTCGCCAGG 10 CCCTGCTGTGAGCTGGATGTGCTGTATCCGGATGGTGCAGGTCCGGGGTGAGGTCGCCAGGCCCTGCAGTGAGCTGGATG TGCTGTATCCGGATGGTGCAGGTCTGGCGTGAGGTCGCCAGGCCCTGCGGTTAGCTGGATATGCGGTGTCGGATGGTGCA GGTCCGGGGTGAGGTCACCAGGCCCTGCGGTTAGCTGGATGTGCGGTGTCCGGATGGTGCAGGTCTGGGGTGAGGTCGCC AGGCCCTGCTGTGAGCTGGATGTGCTGTATCCGGATGGTGCAGGTCCGGGGTGAGGTCGCCAGGCCCTGCGGTGAGCTGG ATGTGCTGTATCCGGATGGTGCAGGTCTGGCGTGAGGTCGCCAGGCCCTGCGGTGAGCTGGATGTGCAGTGTACGGATGG 15 TGCAGGTCCGGGGTGAGGTCGCCAGGCCCTGCGGTGGGCTGTATGTGTGTTGTCTGGATGGTGCAGGTCCGGGGTGAGTT CGCCAGGCCCTGCGGTGAGCTGGATGTGTGGTGTCTGGATGCTGCAGGTCCGGGGTGAGTTCGCCAGGCCCTCGGTGAGC TGGATATGCGGTGTCCCCGTGTCCGAATGGTGCAGGTCCAGGGTGAGGTCGCCAGGCCCTTGGTGGGCTGGATGTGCCGT GTCCGGATGGTGCAGGTCTGGGGTGAGGTCGCCAGGCCCTTGGTGAGCTGGATGTGCGGTGTCCGGATGGTGCAGGTCCG GGGTGAGGTCACCAGGCCCTCGGTGATCTGGATGTGGCATGTCCTTCTCGTTTAAG 20 Intron 3 (SEQ ID NO 6) GTACTGTATCCCCACGCCAGGCCTCTGCTTCTCGAAGTCCTGGAACACCAGCCCGGCCTCAGCATGCGCCTGTCTCCACT TGCCTGTGCTTCCCTGGCTGTGCAGCTCTGGGCTGGGAGCCAGGGGCCCCGTCACAGGCCTGGTCCAAGTGGATTCTGTG CAAGGCTCTGACTGCCTGGAGCTCACGTTCTCTTACTTGTAAAATCAGGAGTTTGTGCCAAGTGGTCTCTAGGGTTTGTA 25 AAGCAGAAGGGATTTAAATTAGATGGAAACACTACCACTAGCCTCCTTGCCTTTCCCTGGGATGTGGGTCTGATTCTCTC TCTCTTTTTTTTTTCTTTTTTGAGATGGAGTCTCACTCTGTTGCCCAGGCTGGAGTGCAGTGGCATAATCTTGGCTCACT 
GCAACCTCCACCTCCTGGGTTTAAGCGATTCACCAGCCTCAGCCTCCTAAGTAGCTGGGATTACAGGCACCTGCCACCAC GCCTGGCTAATTTTTGTACTTTTAGGAGAGACGGGGTTTCACCATGTTGGCCAGGCTGGTCTCGAACTCATGACCTCAGG TGATCCACCCACCTTGGCCTCCCAAAGTGCTGGGTTTACAGGCTAAGCCACCGTGCCCAGCCCCCGATTCTCTTTTAATT 30 CATGCTGTTCTGTATGAATCTTCAATCTATTGGATTTAGGTCATGAGAGGATAAAATCCCACCCACTTGGCGACTCACTG CAGGGAGCACCTGTGCAGGGAGCACCTGGGGATAGGAGAGTTCCACCATGAGCTAACTTCTAGGTGGCTGCATTTGAATG GCTGTGAGATTTTGTCTGCAATGTTCGGCTGATGAGAGTGTGAGATTGTGACAGATTCAAGCTGGATTTGCATCAGTGAG GGACGGGAGCGCTGGTCTGGGAGATGCCAGCCTGGCTGAGCCCAGGCCATGGTATTAGCTTCTCCGTGTCCCGCCCAGGC TGACTGTGGAGGGCTTTAGTCAGAAGATCAGGGCTTCCCCAGCTCCCCTGCACACTCGAGTCCCTGGGGGGCCTTGTGAC 35 ACCCCATGCCCCAAATCAGGATGTCTGCAGAGGGAGCTGGCAGCAGACCTCGTCAGAGGTAACACAGCCTCTGGGCTGGG GACCCCGACGTGGTGCTGGGGCCATTTCCTTGCATCTGGGGGAGGGTCAGGGCTTTCCCTGTGGGAACAAGTTAATACAC AATGCACCTTACTTAGACTTTACACGTATTTAATGGTGTGCGACCCAACATGGTCATTTGACCAGTATTTTGGAAAGAAT TTAATTGGGGTGACCGGAAGGAGCAGACAGACGTGGTGGTCCCCAAGATGCTCCTTGTCACTACTGGGACTGTTGTTCTG CCTGGGGGGCCTTGGAGGCCCCTCCTCCCTGGACAGGGTACCGTGCCTTTTCTACTCTGCTGGGCCTGCGGCCTGCGGTC 40 AGGGCACCAGCTCCGGAGCACCCGCGGCCCCAGTGTCCACGGAGTGCCAGGCTGTCAGCCACAGATGCCCAGGTCCAGGT GTGGCCGCTCCAGCCCCCGTGCCCCCATGGGTGGTTTTGGGGGAAAAGGCCAAGGGCAGAGGTGTCAGGAGACTGGTGGG CTCATGAGAGCTGATTCTGCTCCTTGGCTGAGCTGCCCTGAGCAGCCTCTCCCGCCCTCTCCATCTGAAGGGATGTGGCT CTTTCTACCTGGGGGTCCTGCCTGGGGCCAGCCTTGGGCTACCCCAGTGGCTGTACCAGAGGGACAGGCATCCTGTGTGG AGGGGCATGGGTTCACGTGGCCCCAGATGCAGCCTGGGACCAGGCTCCCTGGTGCTGATGGTGGGACAGTCACCCTGGGG 45 GTTGACCGCCGGACTGGGCGTCCCCAGGGTTGACTATAGGACCAGGTGTCCAGGTGCCCTGCAAGTAGAGGGGCTCTCAG AGGCGTCTGGCTGGCATGGGTGGACGTGGCCCCGGGCATGGCCTTCAGCGTGTGCTGCCGTGGGTGCCCTGAGCCCTCAC TGAGTCGGTGGGGGCTTGTGGCTTCCCGTGAGCTTCCCCCTAGTCTGTTGTCTGGCTGAGCAAGCCTCCTGAGGGGCTCT CTATTGCAG
-40 Intron 4 (SEQ ID NO 7) GTGGCTGTGCTTTGGTTTAACTTCCTTTTTAAACAGAAGTGCGTTTGAGCCCCACATTTGGTATCAGCTTAGATGAAGGG CCCGGAGGAGGGGCCACGGGACACAGCCAGGGCCATGGCACGGCGCCAACCCATTTGTGCGCACAGTGAGGTGGCCGAGG TGCCGGTGCCTCCAGAAAAGCAGCGTGGGGGTGTAGGGGGAGCTCCTGGGGCAGGGACAGGCTCTGAGGACCACAAGAAG 5 CAGCCGGGCCAGGGCCTGGATGCAGCACGGCCCGAGGTCCTGGATCCGTGTCCTGCTGTGGTGCGCAGCCTCCGTGCGCT TCCGCTTACGGGGCCCGGGGACCAGGCCACGACTGCCAGGAGCCCACCGGGCTCTGAGGATCCTGGACCTTGCCCCACGG CTCCTGCACCCCACCCCTGTGGCTGCGGTGGCTGCGGTGACCCCGTCATCTGAGGAGAGTGTGGGGTGAGGTGGACAGAG GTGTGGCATGAGGATCCCGTGTGCAACACACATGCGGCCAGGAACCCGTTTCAAACAGGGTCTGAGGAAGCTGGGAGGGG TTCTAGGTCCCGGGTCTGGGTGGCTGGGGACACTGGGGAGGGGCTGCTTCTCCCCTGGGTCCCTATGGTGGGGTGGGCAC 10 TTGGCCGGATCCACTTTCCTGACTGTCTCCCATGCTGTCCCCGCCAG Intron 5 (SEQ ID NO 8) GTGGGTGCCGGGGACCCCCGTGAGCAGCCCTGCTGGACCTTGGGAGTGGCTGCCTGATTGGCACCTCATGTTGGGTGGAG GAGGTACTCCTGGGTGGGCCGCAGGGAGTGCAGGTGACCCTGTCACTGTTGAGGACACACCTGGCACCTAGGGTGGAGGC 15 CTTCAGCCTTTCCTGCAGCACATGGGGCCGACTGTGCACCCTGACTGCCCGGGCTCCTATTCCCAAGGAGGGTCCCACTG GATTCCAGTTTCCGTCAGAGAAGGAACCGCAACGGCTCAGCCACCAGGCCCCGGTGCCTTGCACCCCAGTCCTGAGCCAG GGGTCTCCTGTCCTGAGGCTCAGAGAGGGGACACAGCCCGCCCTGCCCTTGGGGTCTGGAGTGGTGGGGGTCAGAGAGAG AGTGGGGGACACCGCCAGGCCAGGCCCTGAGGGCAGAGGTGATGTCTGAGTTTCTGCGTGGCCACTGTCAGTCTCCTCGC CTCCACTCACACAG 20 5'-region intron 6 (SEQ ID NO 9) GTAAGGTTCACGTGTGATAGTCGTGTCCAGGATGTGTGTCTCTGGGATATGAATGTGTCTAGAATGCAGTCGTGTCTGTG ATGCGTTTCTGTGGTGGAGGTACTTCCATGATTTACACATCTGTGATATGCGTGTGTGGCACGTGTGTGTCGTGGTGCAT GTATCTGTGGCGTGCATATTTGTGGTGTGTGTGTGTGTGGCACGTGTGTGTCCATGGTGTGTGTGCCTGTGGTGTGCATG 25 TGTGTGTGTCTGTGACACGTGCATGTTCATGCTGTGTGCTGCATGTCTGTGATGTGCCTATTTGTGGTGTGTGTGTGCAT GTGTCCGTGACATATGCGTGTCTATGGCATGGGTGTGTGTGGCCCCTTGGCCTTACTCCTTCCTCCTCCAGGCATGGTCC GCACCATTGTCCTCACGCTCTCGGGTGCTGGTTTGGGGAGCTCCACATTCAGGGTCCTCACTTCTAGCATGGGTGCCCCT GTCCTGTCACAGGGCTGGGCCTTGGAGACTGTAAGCCAGGTTTGAGAGGAGAGTAGGGATGCTGGTGGTACCTTCCTGGA CCCCTGGCACCCCCAGGACCCCAGTCTGGCCTATGCCGGCTCCATGAGATATAGGAAGGCTGATTCAGGCCTCGCTCCCC 30 
GGGACACACTCCTCCCAGAGCGGCCGGGGGCCTTGGGGCTCGGCAGGGGTGAAAGGGGCCCTGGGCTTGGGTTCCCACCC AGTGGTCATGAGCACGCTGGAGGGGTAAGCCCTCAAAGTCGTGCCAGGCCGGGGTGCAGAGGTGAAGAAGTATCCCTGGA GCTTCGGTCTGGGGAGAGGCACATGTGGAAACCCACAAGGACCTCTTTCTCTGACTTCTTGAGCT 3'-region intron 6 (SEQ ID NO 10) 35 TGTGGGATTGGTTTTCATGTGTGGGATAGGTGGGGATCTGTGGGATTGGTTTTTATGAGTGGGGTAACACAGAGTTCAAG GCGAGCTTTCTTCCTGTAGTGGGTCTGCAGGTGCTCCAACAGCTTTATTGAGGAGACCATATCTTCCTTTGAACTATGGT CGGGTTTATAGTAAGTCAGGGGTGTGGAGGCCTCCCCTGGGCTCCCTGTTCTGTTTCTTCCACTCTGGGGTCGTGTGGTG CCTGCTGTGGTGTGTGGCCGGTGGGCAGGGCTTCCAGGCCTCCTTGTGTTCATTGGCCTGGATGTGGCCCTGGCTACGCT CCGTCCTTGGAATTCCCCTGCGAGTTGGAGGCTTTCTTTCTTTCTTTTTTTCTTTCTTTTTTTTTTTTTTTGATAACAGA 40 GTCTCGCTCTTTTTTGCCCAGGCTGGAGTGGTTTGGCGTGATCTTGGCTCACTGCAACCTGTGCTTCCTGAGTTCAAGCA ATTCTCTTGCCTCAGCCTCCCAAGTAGCTGGAATTATAGGCGCCCACCACCATGCTGACTAATTTTTGTAATTTTAGTAG AGACGAGGTTTCTCCATGTTGGCCAGGCTGGTCTCGAACTCCTGACCTCAGGTGATCCTCCCACCTCGGCCTCCCAAAGT GCTGGGATGACAGGTGTGAACCGCCGCGCCCGGCCGAGACTCGCTTCCTGCAGCTTCCGTGAGATCTGCAGCGATAGCTG RA CCTGCAGCCTTGGTGCTGACAACCTCCGTTTTCCTTCTCCAGGTCTCGCTAGGGGTCTTTCCATTTCATGACTCTCTTCA -41 CAGAAGAGTTTCACGTGTGCTGATTTCCCGGCTGTTTCCTGCGTAATTGGTGTCTGCTGTTTATCGATGGCCTCCTTCCA TTTCCTTTAGGCTTTGTTTATTGTTGTTTTTCCGGCTCCTTGAAGGAAAAGTTTCGATTATGGATGTTTGAACTTTCTTT TCTAAACAAGCATCTGAAGTTGCCGTTTTCCCTCTAAAGCAGGGATCCCGAGGCCCCTGGCTGTGGAGTGGCACCGGTCT GGGGCCTGTTAGGAACCCGGCGCACAGCGGGAGGCTAGGTGGGGTGTGGGGAGCCAGCGTTCCCGCCTGAGCCCCGCCCC 5 TCTCAGATCAGCAGTGGCATGCGGTGCTCAGAGGCGCACACACCCTACTGAGAACTGTGCGTGAGAGGGGTCTAGATTCT GTGCTCCTTATGGGAATCTAATGCCTGATGATCTGAGGTGGAACCGTTTGCTCCCAAAACCATCCCCTTCCCCACTGCTG TCCTGTGGAAAAATCGTCTTCCACGAAACCAGTCCCTGGTACCACAATGGTTGGGGACCCTGTGCTAAAGACCTGCTTCA GCAGCCTCTCGTCAGTGTTGATATATTGGCTTTTCTGTGTTGAGTCCAGAATAATTACGGATTTCTGTGATGCTTTCCGC CGACCTCAGACCCATGGGCTATTTGTGGGCGTGTTGCCTGCTCCTGGGTTGGGAAGGGTGCAGGCCCCATGTACCTTCCT 10 GTTACTGCCTTCCAGGTTGGTTCTCAGGGTTGAATCGTACTCGATGTGGTTTTAGCCCACGGCCCTGCCGCCAGCTCCTG GGGGCTGGGGAACATGCTGAAGCACAGAGTCACCGTGCGCGTCTTTTGATGCCTCACAAGCTCGAGGCCTCCTGTGTCCG 
TGTTAGTGTGTGTCACGTGCCTGCTCACATCCTGTCTTGGGGACGCAGGGGCTTAGCAGGTCCCGTAGTAAATGACAAGC GTCCTGGGGGAGTCTGCAGAATAGGAGGTGGGGGTGCCGGTCTCTCTCCCGCGTCTTCAGACTCTTCTCCTGCCTGTGCT GTGGCTGCACCTGCATCCCTGCAATCCCTCCAGCACTGGGCTGGAGAGGCCCGGGAGCTCGAGTGCCACTTGTGCCACGT 15 GACTGTGGATGGCAGTCGGTCACGGGGGTCTGATGTGTGGTGACTGTGGATGGCGGTTGGTCACAGGGGTCTGATGTGTG GTGACTGTGGATGGCGGTCGTGGGGTCTGATGTGGTGACTGTGGATGGCGGTCGTGGGGTCTGATGTGTGGTGACTGTGG ATGGCGGTCGTGGGGTCTGATGTGGTGACTGTGGATGGCGGTCGTGGGGTCTGATGTGGTGACTGTGGATGGCGGTCGTG GGGTCTGATGTGGTGACTGTGGATGGCAGTCGTGGGGTCTGATGTGTGGTGACTGTGGATGGCGGTCGTGGGGTCTGATG TGGTGACTGTGGATGGCAGTCGTGGGGTCTGATGTGTGGTGACTGTGGATGGCGGTCGTGGGGTCTGATGTGTGGTGACT 20 GTGGATGGCGGTCGTGGGGTCTGATGTGTGGTGACTGTGGATGGCGGTCGTGGGGTCTGATGTGTGGTGACTGTGGATGG CGGTCGTGGGGTCTGATGTGGTGACTGTGGATGGCGGTCGTGGGGTCTGATGTGTGGTGACTGTGGATGGTGATCGGTCA CAGGGGTCTGATGTGTGGTGACTGTGGATGGCGGTCGTGGGGTCTGATGTGTGGTGACTGTGGATGGTGATCGGTCACAG GGGTCTGATGTGTGGTGACTGTGGATGGCGGTCGTGGGGTCTGATGTGTGGTGACTGTGGATGGCGGTTGGTCCCGGGGG TCTGATGTGTGGTGACTGTGGATGGCGATCGGTCACAGGGGTCTGATGTGTGGTGACTGTGGATGGCGGTCGTGGGGTCT 25 GATGTGTGGTGACTGTGGATGGCGGTCGTGGGGTCTGATGTGTGGTGACTGTGGATGGCGGTCGTGGGGTCTGATGTGGT GACTGTGGATGGCGGTCGTGGGGTCTGATGTGGTGACTGTGGATGGCGGTCGTGGGGTCTGATGTGTGGTGACTGTGGAT GGCGGTTGGTCCCGGGGGTCTGATGTGTGGTGACTGTGGATGGCGGTCGTGGGGTCTGATGTGGTGACTGTGGATGGCAG TCGTGGGGTCTGATGTGTGGTGACTGTGGATGGCGGTCGTGGGGTCTGATGTGTGGTGACTGTGGATGGCGGTCGTGGGG TCTGATGTGTGGTGACTGTGGATGGCGGTCGTGGGGTCTGATGTGTGGTGACTGTGGATGGCGGTCGTGGGGTCTGATGT 30 GGTGACTGTGGATGGCGGTCGTGGGGTCTGATGTGTGGTGACTGTGGATGGTGATCGGTCACAGGGGTCTGATGTGTGGT GACTGTGGATGGCGGTCGTGGGGTCTGATGTGTGGTGACTGTGGATGGCGGTCGTGGGGTCTGATGTGGTGACTGTGGAT GGCGGTCGTGGGGTCTGATGTGTGGTGACTGTGGATGGCGGTCGTAGGGTCTGATGTGTGGTGACTGTGGATGGCAGTCG GTCACAGGGGTCTGATGTGTGGTGACTGTGGATGGCGGTCGTGGGGTCTGATGTGTGGTGACTGTGGATGGCGGTCGTGG GGTCTGATGTGTGGTGACTGTGGATGGCGGTCGTGGGGTCTGATGTGTGGTGACTGTGGATGGCGGTCGTGGGGTCTGAT 35 GTGGTGACTGTGGATGGTGATCGGTCACAGGGGTCTGATGTGTGGTAGCTGCAGGTGGAGTCCCAGGTGTGTCTGTAGCT 
ACTTTGCGTCCTCGGCCCCCCGGCCCCCGTTTCCCAAACAGAAGCTTCCCAGGCGCTCTCTGGGCTTCATCCCGCCATCG GGCTTGGCCGCAGGTCCACACGTCCTGATCGGAAGAAACAAGTGCCCAGCTCTGGCCGGGGCAGGCCACATTTGTGGCTC ATGCCCTCTCCTCTGCCGGCAG 40 Intron 7 (SEQ ID NO 11) GTCTGGGCACTGCCCTGCAGGGTTGGGCACGGACTCCCAGCAGTGGGTCCTCCCCTGGGCAATCACTGGGCTCATGACCG GACAGACTGTTGGCCCTGGGGGGCAGTGGGGGGAATGAGCTGTGATGGGGGCATGATGAGCTGTGTGCCTTGGCGAAATC TGAGCTGGGCCATGCCAGGCTGCGACAGCTGCTGCATTCAGGCACCTGCTCACGTTTGACTGCGCGGCCTCTCTCCAGTT CCGCAGTGCCTTTGTTCATGATTTGCTAAATGTCTTCTCTGCCAGTTTTGATCTTGAGGCCAAAGGAAAGGTGTCCCCCT CCTTTAGGAGGGCAGGCCATGTTTGAGCCGTGTCCTGCCCAGCTGGCCCCTCAGTGCTGGGTCTGAGGCCAAAGGAAACG TGTCCCCCTTCTTAGGAGGACGGGCCGTGTTTGAGCCACGCCCCGCTGAGCGGGCCTCTCAGTGCTGGGTCTGTCCACGT -0 C) -42 GGCCCTGTGGCCCTTTGCAGATGTGGTCTGTCCACGTGGCCCTGTGGCTCTTTGCAGATGCCTGTTAGCACTTGCTCGGC TCTAGGGGACAGTCGTGTCCACCGCATGAGGCTCAGAGACCTCTGGGCGAATTTCCTTGGCTCCCAGGGTGGGGGTGGAG GTGGCCTGGGCTGCTGGGACCCAGACCCTGTGCCCGGCAGCTGGGCAGCAACTCCTGGATCACATATGCCATCCGGGCCA CGGTGGGCTGTGTGGGTGTGAGCCCAGCTGGACCCACAGGTGGCCCAGAGGAGACGTTCTGTGTCACACACTCTGCCTAA 5 GCCCATGTGTGTCTGCAGAGACTCGGCCCGGCCAGCCCACGATGGCCCTGCATTCCAGCCCAGCCCCGCACTTCATCACA AACACTGACCCCAAAAGGGACGGAGGGTCTTGGCCACGTGGTCCTGCCTGTCTCAGCACCCACCGGCTCACTCCCATGTG TCTCCCGTCTGCTTTCGCAG Intron 8 (SEQ ID NO 12) 10 GTGAGTCAGGTGGCCAGGTGCCATTGCCCTGCGGGTGGCTGGGCGGGCTGGCAGGGCTTCTGCTCACCTCTCTCCTGCCC CTTCCCCACTGNCCTTCTGCCCGGGGCCACCAGAGTCTCCTTTTCTGGCCCCCGCCCCCTCCGGCTCCTGGGCTGCAGGC TCCCGAGGCCCCGGAAACATGGCTCGGCTTGCGGCAGCCGGAGCGGAGCAGGTGCCACACGAGGCCTGGAAATGGCAAGC GGGGTGTGGAGTTGCTCCTGCGTGGAGGACGAGGGGCGGGGGGTGTGTCTGGGTCAGGTGTGCGCCGAGCGTTTGAGCCT GCAGCTTGTCAGCTCCAAGTTACTACTGACGCTGGACACCCGGCTCTCACACGCTTGTATCTCTCTCTCCCGATACAAAA 15 GGATTTTATCCGATTCTCATTCCTGTCCCTGTCGTGTGACCCCCGCGAGGGCGCGGGCTCTTCTCTCTGTGACTAGATTT CCCATCTGGAAAGTGCGGGGTTGACCGTGTAGTTTGCTCCTCTCGGGGGGCCTGTGGTGGCCATGGGGCAGGCGGCCTGG GAGAGCTGCCGTCACACAGCCACTGGGTGAGCCACACTCACGGTGGTAGAGCCACAGTGCCTGGTGCCACATCACGTCCT CTGGATTTTAAGTAAAACCACACACCTCCCGGCAGGCATCTGCCTGCGACCCTGTGTGTGCCTGGGGAGAGTGGTAGCAC 
GGAGGAAATTCGTGCACACTCAAGGTCATCAGCAAGGTCATCCGCAGTCAGGTGGAACGTGGAGGCCTCTCTCTGGGATC 20 GTCTCCAGCGGATAAAGGACTGTGCACAGCTTCGGAAGCTTTTATTTAAAAATATAACTATTAATTATTGCATTATAAGT AATCACTAATGGTATCAGCAATTATAATATTTATTAAAGTATAATTAGAAATATTAAGTAGTACACACGTTCTGGAAAAA CACAAATTGCACATGGCAGCAGAGTGAATTTTGGCCGAGGGACACGTGTGCACATGTGTGTAAGCGGCCCCCAGGCCCAC AGAATTCGCTGACAAAGTCACCTCCCCAGAGAAGCCACCACGGGCCTCCTTCGTGGTCGTGAATTTTATTAAGATGGATC AAGTCACGTACCGTCCACGTGTGGCAGGGCTTTGGGGAATGTGAGGTGATGACTGCGTCCTCATGCCCTGACAGACAGGA 25 GGTGACTGTGTCTGTCCTGTCCCTAGGACACGGACAGGCCCGAAGCTCTAGTCCCCATCGTGGTCCAGTTTGGCCTCTGA ATAAAAACGTCTTCAAAACCTGTTGCCCCAAAAACTAAGAACAGAGAGAGTTTCCCATCCCATGTGCTCACAGGGGCGTA TCTGCTTGCGTTGACTCGCTGGGCTGGCCGGACTCCTAGAGTTGGTGCGTGTGCTTCTGTGCAAAAAGTGCAGTCCTCTT GCCCATCACTGTGATATCTGCACCAGCAAGGAAAGCCTCTTTTCTTTTCTTTCTTTTTTTTTTTTTGAGACGGAACGTCA CTGTTGTCTGCCTGGGCTTGAGTGCAGTGGCGCGATCTCAACTCACTGCAACCTCCGCCTCCCGGGTTCCAGCATTTCTC 30 CTGCCTCAGCCTCCCGAGCAGCTGAGATTACAGGCACCCACCCCCTGCGCCTGGCTAATTTTTGTATTTTTAGTAGAGAG GGGTTTTTGCCATGTTGGCCAGGCTGGTCTCGAACTCCTGACCTCAGGTGATCCACCCACCTCGGCCTCCCAAAGTGCTG GGATTACAGGTGTGAGCCATCACGCCCAGCCGGAAAGCCTCTTTTTAAGGTGACCACCTATAGCGCTTCCCGAAAATAAC AGGTCTTGTTTTTGCAGTAGGCTGCAAGCGTCTCTTAGCAACAGGAGTGGCGTCCTGTGGGCTCTGGGGATGGCTGAGGG TCGCGTGGCAGCCATGCCTTCTGTGTGCACCTTTAGGTTCCACGGGGCTATTCTGCTCTCACTGTTTGTCTGAAAACGCA 35 CCCTTGGCATCCTTGTTTGGAGAGTTTCTGCTTCTCGTTGGTCATGCTGAAACTAGGGGCAAGGTTGTATCCGTTGGCGC GCAGCGGCTACATGTAGGGTCATGAGTCTTTCACCGTGGACAAATTCCTTGAAAAAAAAAAAAGGAGTCCGGTTAAGCAT TCATTCCGGGTCAAGTGTCTGGTTCTGTGAATAAACTCTAAGATTTAAGAAACCTTAATGAAAGAAAACCTTGATGATTC AGAGCAAGGATGTGGTCACACCTGTGGCTGGATCTGTTTCAGCCGCCCCAGTGCATGGTGAGAGTGGGGAGCAGGGATTG TTTGTTCAGAGGTCTCATCTGGTATGTTTCTGAGGTGTTTGCCGGCTGAATGGTAGACGTGTCGTTTGTGTGTATGAGGT 40 TCTGTGTCTGTGTGTGGCTCGGTTTGAGTGTACGCATGTCCAGCACATGCCCTGCCCGTCTCTCACCTGTGTCTTCCCGC CCCAG Intron 9 (SEQ ID NO 13) GTGAGGCCTCCTCTTCCCCAGGGGGGCTTGGGTGGGGGTTGATTTGCTTTTGATGCATTCAGTGTTAATATTCCTGGTGC 5 TCTGGAGACCATGACTGCTCTGTCTTGAGGAACCAGACAAGGTTGCAGCCCCTTCTTGGTATGAAGCCGCACGGGAGGGG -43 
TTGCACAGCCTGAGGACTGCGGGCTCCACGCAGGCTCTGTCCAGCGGCCATGTCCAGAGGCCTCAGGGCTCAGCAGGCGG GAGGGCCGCTGCCCTGCATGATGAGCATGTGAATTCAACACCGAGGAAGCACACCAGCTTCTGTCACGTCACCCAGGTTC CGTTAGGGTCCTTGGGGAGATGGGGCTGGTGCAGCCTGAGGCCCCACATCTCCCAGCAGGCCCTCGACAGGTGGCCTGGA CTGGGCGCCTCTTCAGCCCATTGCCCATCCCACTTGCATGGGGTCTACACCCAAGGACGCACACACCTAAATATCGTGCC 5 AACCTAATGTGGTTCAACTCAGCTGGCTTTTATTGACAGCAGTTACTTTTTTTTTTTTAATACTTTAAGTTCTAGGGTAC ATGTGCACGACGTGCAGGTTAGTTACATATGTATACATGTGCCATGTTGGTGTGCTGCACCCATTAACTCATCATTTACA TTAGGTATATCTCCTAATGCTATCCCTCCCCACTCCCCCCATCCCATGACAGGCCCTGGTGTGTGATGTTCCCCACCCTG TGTCCAAGTGTTCTCATTGTTCAGTTCCCACCTGTGAGTGAGAACATGTGGTGTTTGGTTTTCTTTCCTTGCAATAGTTT GCTCAGAGTGATGGTTTCCAGCTTCGTCCATGTCCCTACAAAGGACATGAACTCATCCTTTTTTATGACTGCATAGTATT 10 CCGTGGTGTATATGTGCCACATTTTCTTAATCCAGTCTATCATCGATGGACATTTGGGTTGGTTGCAAGTCTTTGCTACT GTGAATAGTGCCGCAATAAACATACGTGTGCATGTGTCTTTATAGCAGCATGATTTATAATCCTTTGGGTATATACCCAG TAATGGGATGGCTGGGTCAAATGGTATTTCTAGTTCTAGATCCTTGAGGAATCACCACACTGTCTTCCACAATGGTTGAA CTAGTTTACACTCCCACCAACAGTGTAAAAGTGTTCTGGTGCTGGAGAGGATGTGGACAGCAGTTATTTTTTTATGAAAA TAGTATCACTGAACAAGCAGACAGTTAGTGAAGGATGCGTCAGGAAGCCTGCAGGCCACACAGCCATTTCTCTCGAAGAC 15 TCCGGGTTTTTCCTGTGCATCTTTTGAAACTCTAGCTCCAATTATAGCATGTACAGTGGATCAAGGTTCTTCTTCATTAA GGTTCAAGTTCTAGATTGAAATAAGTTTATGTAACAGAAACAAAAATTTCTTGTACACACAACTTGCTCTGGGATTTGGA GGAAAGTGTCCTCGAGCTGGCGGCACACTGGTCAGCCCTCTGGGACAGGATACCTCTGGCCCATGGTCATGGGGCGCTGG GCTTGGGCCTGAGGGTCACACAGTGCACCATGCCCAGCTTCCTGTGGATAGGATCTGGGTCTCGGATCATGCTGAGGACC ACAGCTGCCATGCTGGTAAAGGGCACCACGTGGCTCAGAGGGGGCGAGGTTCCCAGCCCCAGCTTTCTTACCGTCTTCAG 20 TTATTTTTCCCTAAGAGTCTGAGAAGTGGGGCCGCGCCTGATGGCCTTCGTTCGTCTTCAGCTGGCACAGAATTGCACAA GCTGATGGTAAACACTGAGTACTTATAATGAATGAGGAATTGCTGTAGCAGTTAACTGTAGAGAGCTCGTCTGTTGGAAA GAAATTTAAGTTTTTCATTTAACCGCTTTGGAGAATGTTACTTTATTTATGGCTGTGTAAATTGTTTGACATTCAGTCCC TCGTAGACAGATACTACGTAAAAAGTGTAAAGTTAACCTTGCTGTGTATTTTCCCTTATTTTAG 25 Intron 10 (SEQ ID NO 14) GTGAGGCCCGTGCCGTGTGTCTGTGGGGACCTCCACAGCCTGTGGGCTTTGCAGTTGAGCCCCCCGTGTCCTGCCCCTGG 
CACCGCAGCGTTGTCTCTGCCAAGTCCTCTCTCTCTGCCGGTGCTGGATCCGCAAGAGCAGAGGCGCTTGGCCGTGCACC CAGGCCTGGGGGCGCAGGGGCACCTTCGGGAGGGAGTGGGTACCGTGCAGGCCCTGGTCCTGCAGAGACGCACCCAGGTT ACACACGTGGTGAGTGCAGGCGGTGACCTGGCTCCTGCTGCTCTTTGGAAAGTCAAGAGTGGCGGCTCCTGGGGCCCCAG 30 TGAGACCCCCAGGAGCTGTGCACAGGGCCTGCAGGGCCGAGGCGGCAGCCTCCTCCCCAGGGTGCACCTGAGCCTGCGGA GAGCAGGAGCTGCTGAGTGAGCTGGCCCACAGCGTTCGCTGCGGTCACGTTCCTGCGTGGGGTTGTTTGGGATCGGTGGG AGAATTTGGATTTGCTGAGTGCTGCTGTCTTGAACCACGGAGATGGCTAGGAGTGGGTTTCAGAGTTGATTTTTGTGAAT CAAACTAAAATCAGGCACAGGGGACCTGGCCTCAGCACAGGGGATTGTCCAATGTGGTCCCCCTCAAGGGCGCCCCACAG AGCCGGTGGGCTTGTTTTAAAGTGCGATTTGACGAGGGACGAGAAACCTTGAAAGCTGTAAAGGGAACCCTCAGAAAATG 35 TGGCCGCCAGGGGTGGTTTCAGGTGCTTTGCTGGGCTGTGTTTGTGAAAACCCATTTGGACCCGCCCTCCAAGTCCACCC TCCAGGTCCACCCTCCAGGGCCGCCCTGGGCTGGGGGTATGCCTGGCGTTCCTTGTGCCGCAGCCCGGAGCACAGCAGGC TGTGCACATTTAAATCCACTAAGATTCACTCGGGGGGAGCCCAGGTCCCAAGCAACTGAGGGCTCAGGAGTCCTGAGGCT GCTGAGGGGACAGAGCAGACGGGGAACGCTGCTTCTGTGTGGCAAGTTCCTGAGGGTGCTGGCCAGGGAGGTGGCTCAGA GTGTATGTTGGGGTCCCACCGGGGGCAGAACTCTGTCTCTGATGAGTCGGCAGCCATGTAACAGGAAGGGGTGGCCACAG 40 GGAGCTGGGAATGCACCAGGGGAGCTGCGCAGCTGGCCGAGGTCCCAGGGCCAGGCCACAGGAAGGGCAGGGGGACGCCC GGGGCCACAGCAGAGGCCGCAGGAAGGGAAGGGGATGCCCAGGCCAGAGCAGAGGCTACCGGGCACAGGGGGGCTCCCTG AGCTGGGTGAGCGAGGCTCATGACTCGGCGAGGGAACCTCCTTGACGTGAAGCTGACGACTGGTGTTGCCCAGCTCACAG CCCAGCCAGGTCCCGCGCCTGAGCAGGAACTCAGAACCCTCCCCTTTGTCTAAAGCACAGCAGATGCCTTCAGGGCATCT AGGAGAAAACAGGCAAAGTCGTTGAGAAACGTCTTAAAAGAAGGTGGGATGGTGGCAATTTCTTGTCCAGATTTTAGTCT GCCCCGGACCACAGATGAGTCTATAACGGGATTGTGGTGTTGCCATGGGGACACATGAGATGGACCATCACAGAGGCCAC TGGGGCTGCACCTCCCATCTGAGTCCTGGCTGTCCCGGGTCCAGGCCAGGTTCTTGCATGCTCACCTACCTGTCCTGCCC 40 -44 GGGAGACAGGGAAAGCACCCCGAAGTCTGGAGCAGGGCTGGGTCCAGGCTCCTCAGAGCTCCTGCCAGGCCCAGCACCCT GCTCCAAATCACCACTTCTCTGGGGTTTTCCAAAGCATTTAACAAGGGTGTCAGGTTACCTCCTGGGTGACGGCCCCGCA TCCTGGGGCTGACATTGCCCCTCTGCCTTAG 5 Intron 11 (SEQ ID NO 15) GTGAGCGCACCTGGCCGGAAGTGGAGCCTGTGCCCGGCTGGGGCAGGTGCTGCTGCAGGGCCGTTGCGTCCACCTCTGCT 
TCCGTGTGGGGCAGGCGACTGCCAATCCCAAAGGGTCAGAGGCCACAGGGTGCCCCTCGTCCCATCTGGGGCTGAGCAGA AATGCATCTTTCTGTGGGAGTGAGGGTGCTCACAACGGGAGCAGTTTTCTGTGCTATTTTGGTAAAAGGAAATGGTGCAC CAGACCTGGGTGCACTGAGGTGTCTTCAGAAAGCAGTCTGGATCCGAACCCAAGACGCCCGGGCCCTGCTGGGCGTGAGT 10 CTCTCAAACCCGAACACAGGGGCCCTGCTGGGCATGAGTCCCTCTGAACCCGAGACCCTGGGGCCCTGCTGGGCGTGAGT CTCTCCGAACCCAGAGACTTCAGGGCCCTTTTGGGCGTGAGTCTCTCCGCTGTGAGCCCCACACTCCAAGGCTCATCCAC AGTCTACAGGATGCCATGAGTTCATGATCACGTGTGACCCATCAGGGGACAGGGCCATGGTGTGGGGGGGGTCTCTACAA AATTCTGGGGTCTTGTTTCCCCAGAGCCCGAGAGCTCAAGGCCCCGTCTCAGGCTCAGACACAAATGAATTGAAGATGGA CACAGATGCAGAAATCTGTGCTGTTTCTTTTATGAATAAAAAGTATCAACATTCCAGGCAGGGCAAGGTGGCTCACACCT 15 ATAATCCCAGCACTTTGGGAGGCCGAGGTGGGTGGATCACTTGAGGCCAGGAGTTTGAGGCCAACCTAACCAACATAGTG AAATTCCATTTCTACTTAAAAAATACAAAAATTAGCCTGGCCTGGTGGCACACGCCTGTAGTCCCCGCTATGCGGGAGGC TGAGGCAGGAGAATCATTTGAACCCAGGAGGCAGAGGTTGCAGTGAGCCGAGATCACACCACTGCACTCCAGCCTGGGCA ACAGAGTGAGACTTCATCTTAAAAAAAAAAAAAAAAGTATCAGCATTCCAAAACCATAGTGGACAGGTGTTTTTTTATTC TGTCCTTCGATAATATTTACTGGTGCTGTGCTAGAGGCCGGAACTGGGGGTGCCTTCCTCTGAAAGGCACACCTTCATGG 20 GAAGAGAAATAAGTGGTGAATGGTTGTTAAACCAGAGGTTTAAACTGGGGTCCTGTCGTTCTGAGTTAACAGTCCAGATC TGGACTTTGCCTCTTTCCAGAATGCTCCCTGGGGTTTGCTTCATGGGGGAGCAGCAGGTGTGGACACCCTCGTGATGGGG GAGCAGCAGGTGCAGACGCCCTCATGATGGGGGAGTGGCAGGTGCAGACACCCTTGTGCATGGTGCCCAGCATGTCCCTG TTGCAGCTCCCTCCCCACAAGGATGCCGGTCTCCTGTGCTCCCCACAGTCCCTGCTTCCCTCTCACAGCCTTACCTGGTC CTGGCCTCCACTGGCTTTGTCTGCATGATTTCCACATTTCCTGGGCTCCCAGCACCTCTTCGCCTCTCCCAGGCACCTCT 25 GCAGTGCTGGCCATACCAGTCAGCTGTGAACTGTCCACTGCTTATTTTGCTCCCCATGAAATGTATTTTTTAGGACAGGC ACCCCTGGTTCCAGCCTCTGGCACAGCATCAGTGAATGTTATTGAAGGACAAAGGACAGACAAACAAATCAGGAAAATGG GTTCTCTCTAAACACATTGCAAAGCCACAGAGGCTAGTGCAGGATGGGTGGGCATCAGGTCATCAGATGTGGGTCCAATG CCAGAATATTCTGTGCTCCCAAAGGCCACTTGGTCAGAGTGTGTGCTTGCAGAGGTGGCTCTAAAAGCTCAGCAGTGGAG GCAGTGGTTCGCCATACTCAGGGTGAACTCACATCCTCTGTGTCTGAAGTATACAGCAGAGGCTTGAAGGGCATCTGGA 30 GAAGAAAACAGGCAAAATGATTAAGAAAAGTGAAAAAGGAAAAGTGGTAAGATGGGAATTTTCTTGTCCAGATTTTAGTC 
TCCCAAACCACAGCTCAGATGGTAGAATGTGGTCAGAACTGATGGACAGAACAATAGAACAAAACGGAAGCCCTATCTCT CAGAAACGTGTGTTAATGTGGTATGTGGCACAGCTGATGGAAAAGAGAGTGTGTGTGTAATTTTTTTTTCTGAGAAAACT GACTGGAAGCAAATAAGTTGTGTCTTTACAGCATATACCAGAGCAGATTCTAGGTAGAAGAGGAGACACATGCAAACAAC ACCAGCAACAGAAATAAAACAAAAGACTCAAAGGGAAGGGAGGTGAACGTTCCCTGGTTTGGTGTTGGGGAAGGACACAC 35 AGGGAGGCGGATGAAACCAGTGAGGCAACGGGCATTGCTTTCACTGCAGAGAAACTCAGCTTGCCTGAGCCACAGTGAAA ATGGCCATTCCCTGGAGCGTTTGTGCACGTGATTTATTTAAGGCGCCCTGTGAGGTCCTGCACATTCATCCTCTCACTTT GTTCTCCTAACCACCTGAGAGGTAGAGGAGGAAAGGCTCCAGGGGAGCAGCCGCCCTTGGTCACCCAGCTGGCAAAGGGC ATGCATGATTGCAGCCTGGCCTCCTGCTCCGGGGCCCTTGCTCTGCCCGAGGACCCCACACAAGTCAGACCCATAGGCTC AGGGTGAGCCGGAGCCCAAGGTCGTGTTGGGGATGGCTGTGAAAGAAGAAATGGACGTCTGATGCACACTTGGGAAGGTC 40 CTACCAGCAGCGTCAAAGAAATGCATGTGAAACTGACAGCGAGACCCATCCCTCAAAGAAACGCACGTGAAACTGATGGC GAGACCTGTCCCCATCCCTCATGCTGGCTCCTTTTCTGGGCTTGCCAAGAGCCAGCATCAGGTTGAGGCAAGCTGGAAAG ACTTTTCTGGAAAGCAGCTTGTTTGCATGGAAGTCCTCACAATGTCCTGTGTCTTCCCAGTAATTCCACTTCTGAAGTGA CCAGACATTATCACGGGTCTTATTTACCATTTCCAGTGTTCCAGGCAGGGGGACTTGCCACAGCAAGTCACGAACCTGCC CAAATACAGGGCTAAGGAGATATTATGCATCACAAAACTTGCTCTGCCATTAAACATTTTTCAAAGAATTTTTGAAGAAT 4/ GTTTAATGGCACAAAACGTTTATTTCAATGTAGCAGTGTTCAAAGCTGGATGTAAAAGAACACACCCCAGGAGCCTGCCG
GAATGTCATGTGTGTTCATCTTTGGACATGGACATACATGGGCAGTGAGTGGTGGTGAGGCCCTGGAGGACATCGGTGG
-45 GATGCCTCCATCCTGCCCCTCTGGAGACACCATGTGTGCCACGTGCACTCACTGGAGCCCTGTTTAGCTGGTGCCACCTG GCTCTTCCATCCCTGAGATTCAAACACAGTGAGATTCCCCACGCCCAACTCAGTGTTCTCCCACAAAAAACCTGAGTCAC ACCTGTGTTCACTCGAGGGACGCCCGGGAGCCAGGGCTCCACAGTTTATTATGTGTTTTTGGCTGAGTTATGTGCAGATC TCATCAGGGCAGATGATGAGTGCACAAACACGGCCGTGCGAGGTTTGGATACACTCAACATCACTAGCCAGGTCCTGGTG 5 GAGTTTGGTCATGCAGAGTCTGGATGGCATGTAGCATTTGGAGTCCATGGAGTGAGCACCCAGCCCCCTCGGGCTGCAGC GCATGCCCCAGGCAGGACAAGGAAGCGGGAGGAAGGCAGGAGGCTCTTTGGAGCAAGCTTTGCAGGAGGGGGCTGGGTGT GGGGCAGGCACCTGTGTCTGACATTCCCCCCTGTGTCTCAG Intron 12 (SEQ ID NO 16) 10 GTGAGCAGGCTGATGGTCAGCACAGAGTTCAGAGTTCAGGAGGTGTGTGCGCAAGTATGTGTGTGTGTGTGTGCGCGCGT GCCTGCAAGGCTGATGGTGACTGGCTGCACGTAAGAGTGCACATGTACGCATATACACGTGAGCACATACATGTGTGCAT GTGTGTACATGAAGGCATGGCAGTGTGTGCACAGGTGTGCAAGGGCACAAGTGTGTGCACATGCGAATGCACACCTGACA TGCATGTGTGTTCGTGCACAGTCGTGTGGGCATTCACGTGAGGTGCATGCGTGTGGGTGTGCAGTGTGAGTAGCATGTGT GCACATAACATGTATTGAGGGGTCCTCGTGTTCACCCCGCTAGGTCCTCAGCACCAGTGCCACTCCTTACAGGATGAGAC 15 GGGGTCCCAGGCCTTGGTGGGCTGAGGCTCTGAAGCTGCAGCCCTGAGGGCATTGTCCCATCTGGGCATCCGCGTCCACT CCCTCTCCTGTGGGCTTCTGTGTCCACTCCCCCTCTCCTGTGGGCATTTACATCCACTCCACTCCCTCTCTCCTGTGGGC ATCCGCGTCCACTCCCCCTCTCTGTGGGCATCTGCGTCCACCTCCCCTCTCTGTGGGCATTTGCGTCCACTCCCTCTCCT GGTTCCTTCCTGTCTTGGCCGAGCCTCGGGGGCAGGCAGATGACACAGAGTCTTGACTCGCCCAGGGTGGTTCGCAGCTG CCGGGTGAGGGCCAGGCCGGATTTCACTGGGAAGAGGGATAGTTTCTTGTCAAAATGTTCCTCTTTCTTGTTCCATCTGA 20 ATGGATGATAAAGCAAAAAGTAAAAACTTAAAATCCCAGAGAGGTTTCTACCGTTTCTCACTCTTTCTTGGCGACTCTAG Intron 13 (SEQ ID NO 17) GTGAGCCGCCACCAAGGGGTGCAGGCCCAGCCTCCAGGGACCCTCCGCGCTCTGCTCACCTCTGACCCGGGGCTTCACCT TGGAACTCCTGGGTTTTAGGGGCAAGGAATGTCTTACGTTTTCAGTGGTGCTGCTGCCTGTGCACAGTTCTGTTCGCGTG 25 GCTCTGTGCAAAGCACCTGTTCTCCATCTCTGGGTAGTGGTAGGAGCCGGTGTGGCCCCAGGTGTCCCCACTGTGCCTGT GCACTGGCCGTGGGACGTCATGGAGGCCATCCCAGGGCAGCAGGGGCATGGGGTAAAGAGATGTTTATGGGGAGTCTTAG CAGAGGAGGCTGGGAAGGTGTCTGAACAGTAGATGGGAGATCAGATGCCCGGAGGATTTGGGGTCTCAGCAAAGAGGGCC GAGGTGGGTGCAGGTGAGGGTCGCTGGCCCCACCCCCGGGAAGGTGCAGCAGAGCTGTGGCTCCCCACACAGCCCGGCCA 
GCACCTGTGCTCTGGGCATGGCTGTGCTCCTGGAACGTTCCCTGTCCTGGCTGGTCAGGGGGTGCCCCTGCCAAGAATCG 30 ACAACTTTATCACAGAGGGAAGGGCCAATCTGTGGAGGCCACAGGGCCAGCTTCTGCCTGGAGTCAGGGCAGGTGGTGGC ACAAGCCTCGGGGCTGTACCAAAGGGCAGTCGGGCACCACAGGCCCGGGCCTCCACCTCAACAGGCCTCCCGAGCCACTG GGAGCTGAATGCCAGGAGGCCGAAGCCCTCGCCCCATGAGGGCTGAGAAGGAGTGTGAGCATTTGTGTTACCCAGGGCCG AGGCTGCGCGAATTACCGTGCACACTTGATGTGAAATGAGGTCGTCGTCTATCGTGGAAACCCAGCAAGGGCTCACGGGA GAGTTTTCCATTACAAGGTCGTACCATGAAAATGGTTTTTAACCCGAGTGCTTGCGCCTTCATGCTCTGGCAGGGAGGGC 35 AGAGCCACAGCTGCATGTTACCGCCTTTGCACCAGCTCCAGAGGCTTGGGACCAGGCTGTCTCAGTTCCAGGGTGCGTCC GGCTCAGACCGCCCTCCTCTCTGCCTTCTCTCTCTGCCTCAAATCTTCCCTCGTTTGCATCTCCCTGACGCGTGCCTGGG CCCTCGTGCAAGCTGCTTGACTCCTTTCCGGAAACCCTTGGGGTGTGCTGGATACAGGTGCCACTGAGGACTGGAGGTGT CTGACACTGTGGTTGACCCCAGGGTCCAGCTGGCGTGCTTGGGGCCTCCTTGGGCCATGATGAGGTCAGAGGAGTTTTCC CAGGTGAAAACTCCTGGGAAACTCCCAGGGCCATGTGACCTGCCACCTGCTCCTCCCATATTCAGCTCAGTCTTGTCCTC 40 ATTTCCCCACCAGGGTCTCTAGCTCCGAGGAGCTCCCGTAGAGGGCCTGGGCTCAGGGCAGGGCGGCTGAGTTTCCCCAC CCATGTGGGGACCCTTGGGTAGTCGCTTGATTGGGTAGCCCTGAGGAGGCCGAGATGCGATGGGCCACGGGCCGTTTCCA AACACAGAGTCAGGCACGTGGAAGGCCCAGGAATCCCCTTCCCTCGAGGCAGGAGTGGGAGAACGGAGAGCTGGGCCCCG ATTTCACGGCAGCCAGGCTGCAGTGGGCGAGGCTGTGGTGGTCCACGTGGCGCTGGGGGCGGGGTCTGATTCAAATCCGC TGGGGCTCGGCCTTCCTGGCCCGTGCTGGCCGCGCCTCCACACGGGCTTGGGGTGGACGCCCCGACCTCTAGCAGGTGGC TATTTCTCCCTTTGGAAGAGAGCCCCTCACCCATGCTAGGTGTTTCCCTCCTGGGTCAGGAGCGTGGCCGTGTGGCAACC 1- -46 CCGGGACCTTAGGCTTATTTATTTGTTTAAAAACATTCTGGGCCTGGCTTCCGTTGTTGCTAAATGGGGAAAAGACATCC CACCTCAGCAGAGTTACTGAGAGGCTGAAACCGGGGTGCTGGCTTGACTGGTGTGATCTCAGGTCATTCCAGAAGTGGCT CAGGAAGTCAGTGAGACCAGGTACATGGGGGGCTCAGGCAGTGGGTGAGATGAGGTACACGGGGGGCTCAGGCAGTGGGT GAGGCCAGGTACATGGGGGGCTCAGGCACTGGGTGAGATGAGGTACACGGGGGGCTCAGGCAGAGGGTCAGACCAGGTAC 5 ACGGGGGCTCTGATCACACGCACATATGAGCACATGTGCACATGTGCTGTTTCATGGTAGCCAGGTCTGTGCACACCTGC CCCAAAGTCCCAGGAAGCTGAGAGGCCAAAGATGGAGGCTGACAGGGCTGGCGCGGTGGCTCACACCTGTAGTCCCAGCA CTTTGGGAGGCCGAGGCGAGAGGATCCCTTGAGCCCAGGAGTTTAAGACCAGCCTGAGCAACATAGTAGAACCCCATCTC 
TATGAAAAATAAAAACAAAAATTAGCTGAACATGGTGGTGTGCGCCTGTAGTTCCAATACTTGGGAGGCTGAAGTGGGAG GATCACTTGAGCCCAGGAGGTGGAAGCTGCAGTGAGCTGAGATTGCACCACTGTACTGCAGCCTGGGTGACAGAGTGAGA 10 GCCCATCTCAACAACAACAAAGAAGACTGACAAATGCAGTTTCTTGGAAAGAAACATTTAGTAGGAACTTAACCTACACA CAGAAGCCAAGTCGGTGTCTCGGTGTCAGTGAGATGAGATGATGGGTCCTCACACCATCACCCCAGACCCAGGGTTTATG CACCACAGGGGCGGGTGGCTCAGAAGGGATGCGCAGGACGTTGATATACGATGACATCAAGGTTGTCTGACGAAGGGCAG GATTCATGATAAGTACCTGCTGGTACACAAGGAACAATGGATAAACTGGAAACCTTAGAGGCCTTCCCGGAACAGGGGCT AATCAGAAGCCAGCATGGGGGGCTGGCATCCAGGATGGAGCTGCTTCAGCCTCCACATGCGTGTTCATACAGATGGTGCA 15 CAGAAACGCAGTGTACCTGTGCACACACAGACACGCAGCTACTCGCACACACAAGCACACACACAGACATGCATGCATGC ATCCGTGTGTGTGCACCTGTGCCCATGAGGAAACCCATGCATGTGCATTCATGCACGCACACAGGCACCGGTGGGCCCAT GCCCACACCCACGAGCACCGTCTGATTAGGAGGCCTTTCCTCTGACGCTGTCCGCCATCCTCTCAG Intron 14 (WEQ ID NO 18) 20 GTATGTGCAGGTGCCTGGCCTCAGTGGCAGCAGTGCCTGCCTGCTGGTGTTAGTGTGTCAGGAGACTGAGTGAATCTGGG CTTAGGAAGTTCTTACCCCTTTTCGCATCAGGAAGTGGTTTAACCCAACCACTGTCAGGCTCGTCTGCCCGCCCTCTCGT GGGGTGAGCAGAGCACCTGATGGAAGGGACAGGAGCTGTCTGGGAGCTGCCATCCTTCCCACCTTGCTCTGCCTGGGGAA GCGCTGGGGGGCCTGGTCTCTCCTGTTTGCCCCATGGTGGGATTTGGGGGGCCTGGCCTCTCCTGTTTGCCCTGTGGTGG GATTGGGCTGTCTCCCGTCCATGGCACTTAGGGCCCTTGTGCAAACCCAGGCCAAGGGCTTAGGAGGAGGCCAGGCCCAG 25 GCTACCCCACCCCTCTCAGGAGCAGAGGCCGCGTATCACCACGACAGAGCCCCGCGCCGTCCTCTGCTTCCCAGTCACCG TCCTCTGCCCCTGGACACTTTGTCCAGCATCAGGGAGGTTTCTGATCCGTCTGAAATTCAAGCCATGTCGAACCTGCGGT CCTGAGCTTAACAGCTTCTACTTTCTGTTCTTTCTGTGTTGTGGAAATTTCACCTGGAGAAGCCGAAGAAAACATTTCTG TCGTGACTCCTGCGGTGCTTGGGTCGGGACAGCCAGAGATGGAGCCACCCCGCAGACCGTCGGGTGTGGGCAGCTTTCCG GTGTCTCCTGGGAGGGGAGCTGGGCTGGGCCTGTGACTCCTCAGCCTCTGTTTTCCCCCAG 30 Intron 15 (SEQ ID NO 19) GCAAGTGTGGGTGGAGGCCAGTGCGGGCCCCACCTGCCCAGGGGTCATCCTTGAACGCCCTGTGTGGGGCGAGCAGCCTC AGATGCTGCTGAAGTGCAGACGCCCCCGGGCCTGACCCTGGGGGCCTGGAGCCACGCTGGCAGCCCTATGTGATTAAACG CTGGTGTCCCCAGGCCACGGAGCCTGGCAGGGTCCCCAACTTCTTGAACCCCTGCTTCCCATCTCAGGGGCGATGGCTCC 35 CCACGCTTGGGAGCCTTCTGACCCCTGACCTGTGTCCTCTCACAGCCTCTTCCCTGGCTGCTGCCCTGAGCTCCTGGGGT 
CCTGAGCAAGTTCTCTCCCCGCCCCGCCGCTCCAGCGTCACTGGGCTGCCTGTCTGCTCGCCCCGGTGGAGGGGTGTCTG TCCCTTCACTGAGGTTCCCACCAGCCAGGGCCACGAGGTGCAGGCCCTGCCTGCCCGGCCACCCACACGTCCTAGGAGGG TTGGAGGATGCCACCTCTGGCCTCTTCTGGAACGGAGTCTGATTTTGGCCCCGCAG 40 3'-untranscribed region (SEQ ID NO 20) ATCTCATGTTTGAATCCTAATGTGCACTGCATAGACACCACTGTATGCAATTACAGAAGCCTGTGAGTGAACGGGGTGGT GGTCAGTGCGGGCCCATGGCCTGGCTGTGCATTTACGGAAGTCTATGAGTGAATGGGGTTGTGGTCAGTGCGGGCCCATG GCCTGGCTGGGCCTGGGAGGTTTCTGATGCTGTGAGGCAGGAGGGGAAGGAGGGTAGGGGATAGACAGTGGGAGCCCCCA CCCTGGAAGACATAACAGTAAGTCCAGGCCCGAAGGGCAGCAGGGATGCTGGGGGCCCAGCTTGGGCGGCGGGGATGATG R 45 GAGGGCCTGGCCAGGGTGGCAGGGATGATGGGGGCCCCAGCTGGGGTGGCAGGGGTGATGGGGGGGGCTGGTCTGGGTGG
- 47 CGGGGAAGATGGGGAAGCCTGGCTGGGCCCCCTCCTCCCCTGCCTCCCACCTGCAGCCGTGGATCCGATGTGCTTCCCT GGTGCACATCCTCTGGGCCATCAGCTTTCATGGAGGTGGGGGGCAGGGGCATGACACCATCCTGTATAAATCCAGGATT CCTCCTCCTGAACGCCCCAACTCAGGTTGAAAGTCACATTCCGCCTCTGGCCATTCTCTTAAGAGTAGACCAGGATTCTG ATCTCTGAAGGGTGGGTAGGGTGGGGCAGTGGAGGGTGTGGACACAGGAGGCTTCAGGGTGGGGCTGGTGATGCTCTCTC 5 ATCCTCTTATCATCTCCCAGTCTCATCTCTCATCCTCTTATCATCTCCCAGTCTCATCTGTCTTCCTCTTATCTCCCAGT CTCATCTGTCATCCTCTTACCATCTCCCAGTCTCATCTCTTATCCTCTTATCTCCTAGTCTCATCCAGACTTACCTCCCA GGGCGGGTGCCAGGCTCGCAGTGGAGCTGGACATACGTCCTTCCTCAGGCAGAAGGAACTGGAAGGATTG-AGAGAACAG GAGGGGCGGCTCAGAGGGACGCAGTCTTGGGGTGAAGAAACAGCCCCTCCTCAGAAGTTGGCTTGGGCCACACGACCG AGGGCCCTGCGTGAGTGGCTCCAGAGCCTTCCAGCAGGTCCCTGGTGGGGCCTTATGGTATGGCCGGGTCCTACTGAGTG 10 CACCTTGGACAGGGCTTCTGGTTTGAGTGCAGCCCGGACGTGCCTGGTGTCGGGGTGGGGGCTTATGGCCACTGGATATG GCGTCATTTATTGCTGCTGCTTCAGAGAATGTCTGAGTGACCGAGCCTAATGTGTATGGTGGGCCCAAGTCCACAGACTG TGTCGTAAATGCACTCTGGTGCCTGGAGCCCCCGTATAGGAGCTGTGAGGAAGGAGGGGCTCTTGGCAGCCGGCCTGGGG GCGCCTTTGCCCTGCAAACTGGAAGGGAGCGGCCCCGGGCGCCGTGGGCGGACGACCTCAAGTGAGAGGTTGGACAGAAC AGGGCGGGGACTTCCCAGGAGCAGAGGCCGCTGCTCAGGCACACCTGGGTTTGAATCACAGACCAACaGGTCAGGCCATT 15 GTTCAGCTATCCATCTTCTACAAAGCTCCAGATTCCTGTTTCTCCGGGTGTTTTTTGTTGAAATTTTACTCAGGATTACT TATATTTTTTGCTAAAGTATTAGACCCTTAAAAAAGGTATTTGCTTTGATATGGCTTAACTCACTAAGCACCTACTTTAT TTGTCTGTTTTTATTTATTATTATTATTATTATTAGAGATGGTGTCTACTCTGTCACCCAGGTTGTTAGTGCAGTGGCAC AGTCATGGCTCGCTGTAGCCGCAAACCCCCAGGCTCAAGTGATCCTCCGGCCTCAGCTTCCCAGAGTGCTGGGATTACAG GTGTGAGCCACTGCCCTTGCCTGGCACTTTTAAAAACCACTATGTAAGGTCAGGTCCAGTGGCTTCCACACCTGTCATCC 20 CAGTAGTTTGGGAAGCCGAGGCAGAAGGATTGTCTGAGGCCAGGAGTTTGAGACCAGCATGGGTAACATAGGGAGACC ATCTCTACAAAAAATGCAAAAAGTTATCCGGGCGTGGGGTCCAGCATCTGTAGTCCCAGCTGCTCGGGAGGCTGAGTGGG AGGATCGCTTGAGCCCGGGAGGTCATGGCTGCAGTGAGCTGTGATTGTACCATCGCACTCCAGCCTGGGCAACAGAGTGA GACCCTGTCT C n ~ZGAGAGAAGGAGAAGAGAAGAAGAAGGAAGAAGGAAAGAGAAGAAGAAG GAAGAAGGAAGAAAGAAGGAGAAGGAGGCCTGCTAGGTGCTAGGTAGACTGTCAAATCTCAGAGCAAAATGAAAATAACA 25 
AAGTTTTAAAGGGAAAGAAACCCCAGCTCTTTGGACTTCCTTAGGCCTGACTTCATCTCAGCAGCTTCCTTCCACA GACAAGCGTGTATGGAGCGAGTGAGTTCAAAGCAGAAAGGGAGGAGAAGCAGGCAAGGGTGGAGGCTGTGGGTGA~CAC GCCAGGACCCCTGAAAGGGAGTGGTTGTTTTCCTGCCTCAGCCCCACGCTCCTGCCGGTCCTGCACCTGCTGTAACCGTC GATGTTGGTGCCAGGTGCCCACCTGGGAAGGATGCTGTGCAGGGGGCTTGCCAAACTTTGGTGGGTTTCAGAAGCCCCAG GCACTTGTGGCAGGCACAATTACAGCCCCTCCCCAAGATGCCCACGTCCTTCTCCTGGAACCTGTGATGTGTCACCCG CAGCGGCGTAGCGAGGATAGCTCATACGTTAGTACTGTACG TGGGCCTGATATGGCCACAGGGTCCCTAGAGTGAGAGAGGGAGGCAGGGGAGAGTCAGAGAGGGGACGTGAGAAGGAC CACTGGCCACTGCTGGCTTTGAGATGGAGGAGGGGGTCCCCAGCCAAGGAATGGGGGCAGCCGCTCCATGCTGGAAAAGC AAGCAATCCTCCCCGGTCCTGAGGGCACACGGCCCTGCCCACGCCTCGATTTCAGGCCAGTGGGACCTGTTTCAGCTTTC CGGCCTCCAGAGCTGTAAGATGATGCGTTTGTGTTCAGCCACTAAGCTGCAGTGATTCGTCACAGCAGCATGGATAG CAGTACAGGGAAATGAATACAGGGACAGTTCTCAGAGTGACTCTCAGCCCAcCCCTGGG

Characterization of the exons showed, interestingly, that the functionally important hTC protein domains which are described in our Patent Application PCT/EP/98/03469 are arranged on separate exons. The telomerase-characteristic T motif is located on exon 3. The RT (reverse transcriptase) motifs 1-7, which are important for the catalytic function of the telomerase, are located on the following exons: RT motifs 1 and 2 on exon 4, RT motif 4 on exon 9, RT motif 5 on exon 10, and RT motifs 6 and 7 on exon 11. RT motif 3 is shared by exons 5 and 6 (see Fig. 8).

Elucidation of the exon-intron structure of the hTC gene also shows that the four deletion or insertion variants of the hTC cDNA which were described in our Patent Application PCT/EP/98/03469, as well as three additional hTC insertion variants which are described in the literature (Kilian et al., 1997), in all probability represent alternative splicing products. As shown in Fig. 8, the splicing variants can be divided into two groups: deletion variants and insertion variants.

The hTC variants in the deletion group lack specific sequence segments.
The 36 bp in-frame deletion in variant DEL1 in all probability results from the use of an alternative 3' splice acceptor sequence in exon 6, as a result of which part of RT motif 3 is lost. In variant DEL2, the normal 5' splice donor and 3' splice acceptor sequences of introns 6, 7 and 8 are not used. Instead, exon 6 is fused directly to exon 9, which shifts the open reading frame and produces a stop codon in exon 10. Variant DEL3 is a combination of variants DEL1 and DEL2.

The insertion variant group is characterized by the insertion of intron sequences which lead to premature termination of translation. In variant INS1, an alternative splice site located further 3' is used instead of the normally used 5' splice donor sequence of intron 4, resulting in the insertion of the first 38 bp of intron 4 between exon 4 and exon 5. The insertion, in variant INS2, of a region of the intron 11 sequence likewise results from the use of an alternative 5' splice donor sequence in intron 11. Since this variant was only described inadequately in the literature (Kilian et al., 1997), it is not possible to determine the precise alternative 5' splice donor sequence in this variant. The insertion of intron 14 sequences between exon 14 and exon 15 in variant INS3 results from the use of an alternative 3' splice acceptor sequence, as a result of which the 3' part of intron 14 is not spliced out.

The hTC variant INS4 (variant 4), which is described in our Patent Application PCT/EP/98/03469, is characterized by exon 15, and the 5' region of exon 16, being replaced by the first 600 bp of intron 14. This variant can be attributed to the use of an alternative internal 5' splice donor sequence in intron 14 and an alternative 3' splice acceptor sequence in exon 16, resulting in an altered C terminus.
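The reading-frame consequences described for DEL1 and INS1 follow from simple arithmetic: an insertion or deletion keeps the frame only if its length is a multiple of three. The following is a minimal sketch of that check; the function name `frame_effect` is hypothetical, and only the two lengths stated in the text (36 bp and 38 bp) are used.

```python
def frame_effect(length_bp: int) -> str:
    """Classify an insertion or deletion by its effect on the reading frame."""
    if length_bp <= 0:
        raise ValueError("length must be a positive number of base pairs")
    # A codon is three nucleotides, so only multiples of 3 preserve the frame.
    return "in-frame" if length_bp % 3 == 0 else "frameshift"


# Lengths taken from the description of the splicing variants.
variants = {"DEL1 (36 bp deletion)": 36, "INS1 (38 bp insertion)": 38}

for name, length in variants.items():
    print(f"{name}: {frame_effect(length)}")
```

This only reproduces the frame arithmetic. Whether and where a stop codon appears depends on the actual sequence, which the sketch does not examine.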
The in vivo generation of hTC protein variants which are probably non-functional and which could interfere with the function of the complete hTC protein constitutes a possible mechanism, in addition to transcriptional regulation, for controlling hTC protein function. The function of the hTC splicing variants is not yet known. Although most of these variants presumably encode proteins without reverse transcriptase activity, they could nevertheless play a crucial role as transdominant negative telomerase regulators by, for example, competing for interaction with important binding partners.

The search for possible transcription factor binding sites was carried out using the "Find Pattern" algorithm from the Genetics Computer Group (Madison, USA) GCG Sequence Analysis program package. This resulted in the identification of a variety of potential binding sites for transcription factors in the nucleotide sequence of intron 2, which binding sites are listed in Tab. 2. In addition, an Sp1 binding site was found in intron 1 (pos. 43), and a c-Myc binding site was found in the 5'-untranslated region (cDNA position 29-34, cf. Fig. 6).

Example 6

In order to ascertain the start point(s) of hTC transcription in HL-60 cells, the 5' end of the hTC mRNA was determined by means of primer extension analysis.

2 µg of polyA+ RNA from HL-60 cells were denatured at 65°C for 10 min. 1 µl of RNasin (30-40 U/ml) and 0.3-1 pmol of radioactively labelled primer (5'-GTTAAGTTGTAGCTTACACTGGTTCTC-3'; 2.5-8 × 10^5 cpm) were added for primer annealing, and the whole was incubated, at 37°C for 30 min, in a total volume of 20 µl. After the addition of 10 µl of 5x reverse transcriptase buffer (from Gibco-BRL), 2 µl of 10 mM dNTPs, 2 µl of RNasin (see above), 5 µl of 0.1 M DTT (from Gibco-BRL), 2 µl of ThermoScript RT (15 U/µl; from Gibco-BRL) and 9 µl of DEPC-treated water, primer extension took place, at 58°C for 1 h, in a total volume [lacuna].
The reaction was stopped by adding 4 µl of 0.5 M EDTA, pH 8.0, and the RNA was degraded, at 37°C for 30 min, after adding 1 µl of RNase A (10 mg/ml). 2.5 µg of sheared calf thymus DNA and 100 µl of TE were then added, and the mixture was extracted once with 150 µl of phenol/chloroform (1:1). The DNA was precipitated, at -70°C for 45 min, after adding 15 µl of 3 M Na acetate and 450 µl of ethanol, and then centrifuged at 14,000 rpm for 15 min. The precipitate was washed once with 70% ethanol, dried in air and dissolved in 8 µl of sequencing stop solution. After 5 min of denaturation at 80°C, the samples were loaded onto a 6% polyacrylamide gel and fractionated electrophoretically (Ausubel et al., 1987) (Fig. 5).

In this connection, a main transcription start site was identified which is located 1767 bp 5' of the ATG start codon of the hTC cDNA sequence (nucleotide position 3346 in Fig. 4). In addition to this, the nucleotide sequence around this main transcription start (TTA(+1)TTGT) represents an initiator element (Inr) which, in 6 out of 7 nucleotides, matches the consensus motif (PyPyA(+1)N(T/A)PyPy) (Smale, 1997) of an initiator element.

It was not possible to identify any unambiguous TATA box in the immediate vicinity of the experimentally identified main transcription start, which means that the hTC promoter probably has to be classified in the family of TATA-less promoters (Smale, 1997). However, a potential TATA box from nucleotide position 1306 to nucleotide position 1311 (Fig. 4) was found by means of bioinformatics analysis. The subsidiary transcription starts which were additionally observed around the main transcription start have also been described in the case of other TATA-less promoters (Geng and Johnson, 1993), for example in the strongly regulated promoters of some cell cycle genes (Wick et al., 1995).
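The statement in Example 6 that the sequence around the main transcription start matches the Inr consensus in 6 out of 7 nucleotides can be checked position by position. The following is a minimal sketch; the helper name `inr_matches` and the list encoding of the consensus are illustrative assumptions, with Py read as C or T and N as any base.

```python
# Consensus PyPyA(+1)N(T/A)PyPy, written as the set of allowed bases per position.
CONSENSUS = ["CT", "CT", "A", "ACGT", "AT", "CT", "CT"]


def inr_matches(site: str) -> int:
    """Count the positions of a 7-nt site that agree with the Inr consensus."""
    site = site.upper()
    if len(site) != len(CONSENSUS):
        raise ValueError("site must be exactly 7 nucleotides long")
    return sum(base in allowed for base, allowed in zip(site, CONSENSUS))


# Sequence around the main transcription start, as given in the text.
print(inr_matches("TTATTGT"))  # 6 (the G at the sixth position is not a pyrimidine)
```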
Example 7

In addition to the start point of the hTC transcript which was described in Example 6 and identified in HL60 cells, a further transcription start region was also identified in HL60 cells. With the aid of RT-PCR analyses, the region of the hTC gene transcription start in HL60 cells was localized to bp -60 to bp -105.

The cDNA for this was synthesized using a First Strand cDNA Synthesis kit (Clontech), in accordance with the manufacturer's instructions, and employing 0.4 µg of HL60 cell polyA RNA (Clontech) and the gene-specific primer GSP13 (5'-CCTCCAAAGAGGTGGCTTCTTCGGC-3', cDNA position 920-897). In a final volume of 50 µl, 10 pmol of dNTP mix were added to 1 µl of cDNA, and a PCR reaction was carried out in 1x PCR reaction buffer F (PCR-Optimizer kit from InVitrogen) and using one unit of platinum Taq DNA polymerase (from Gibco/BRL). 10 pmol of each of the 5' and 3' primers defined below were added as primers. The PCR was carried out in 3 steps. A two-minute denaturation at 94°C was followed by 36 PCR cycles in which the DNA was first denatured at 94°C for 45 sec and, after that, the primers were annealed, and the DNA chain was extended, at 68°C for 5 min. The cycles were concluded by a chain extension at 68°C for 10 min. In all, six different 5' PCR primers (primer HTRT5B: 5'-CGCAGCCACTACCGCGAGGTGC-3', cDNA position 105 to 126; primer C5S: 5'-CTGCGTCCTGCTGCGCACGTGGGAAGC-3', 5'-flanking region -49 to -23; primer PRO-TEST1: 5'-CTCGCGGCGCGAGTTTCAGGCAG-3', 5'-flanking region -74 to -52; primer PRO-TEST2: 5'-CCAGCCCCTCCCCTTCCTTTCC-3', 5'-flanking region -112 to -91; primer PRO-TEST4: 5'-CCAGCTCCGCCTCCTCCGCGC-3', 5'-flanking region -191 to -171; primer RP-3A: 5'-CTAGGCCGATTCGACCTCTCTCC-3', 5'-flanking region -427 to -405) were combined with the 3' PCR primer C5Rback (5'-GTCCCAGGGCACGCACACCAG-3', cDNA position 245 to 225). Genomic DNA was also employed for the PCR, as a control, in addition to the oligo(dT)- and GSP13-primed cDNAs.
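When primer sequences are quoted together with coordinate ranges, a quick consistency check is that each primer's length equals the inclusive span of its stated positions. The following is a minimal sketch for the six 5' primers and the 3' primer C5Rback listed above; the table layout and the function names `span` and `length_mismatches` are hypothetical.

```python
# (sequence, first stated position, second stated position), as quoted in the text.
PRIMERS = {
    "HTRT5B": ("CGCAGCCACTACCGCGAGGTGC", 105, 126),
    "C5S": ("CTGCGTCCTGCTGCGCACGTGGGAAGC", -49, -23),
    "PRO-TEST1": ("CTCGCGGCGCGAGTTTCAGGCAG", -74, -52),
    "PRO-TEST2": ("CCAGCCCCTCCCCTTCCTTTCC", -112, -91),
    "PRO-TEST4": ("CCAGCTCCGCCTCCTCCGCGC", -191, -171),
    "RP-3A": ("CTAGGCCGATTCGACCTCTCTCC", -427, -405),
    "C5Rback": ("GTCCCAGGGCACGCACACCAG", 245, 225),
}


def span(first: int, second: int) -> int:
    """Inclusive number of positions between two coordinates on one side of zero."""
    return abs(second - first) + 1


def length_mismatches() -> list[str]:
    """Names of primers whose sequence length differs from their coordinate span."""
    return [
        name
        for name, (seq, first, second) in PRIMERS.items()
        if len(seq) != span(first, second)
    ]


print(length_mismatches())  # [] when every primer length equals its stated span
```

The check assumes that no primer range crosses the boundary between the 5'-flanking region and the cDNA numbering, which holds for the seven primers in the table.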
As Fig. 9 shows, a PCR product was only obtained with the primer combinations HTRT5B-C5Rback, C5S-C5Rback and PRO-TEST1-C5Rback, indicating that the start point for hTC transcription lies in the region between bp -60 and bp -105.

Example 8

Several extremely GC-rich regions, so-called CpG islands, are located in the isolated 5'-flanking region, of about 11.2 kb in size, of the hTC gene. One CpG island, having a GC content of > 70%, extends from bp -1214 into intron 2. Two further GC-rich regions having a GC content of > 60% extend from bp -3872 to bp -3113 and from bp -5363 to bp -3941, respectively. The positions of the CpG islands are shown graphically in Fig. 11.

The search for possible transcription factor binding sites was carried out using the "Find Pattern" algorithm from the Genetics Computer Group (Madison, USA) GCG Sequence Analysis program package. This resulted in the identification of a variety of potential binding sites in the region up to -900 bp upstream of the translation start codon ATG: five Sp1 binding sites, one c-Myc binding site, and one CCAC box (Fig. 10). In addition, a CCAAT box and a second c-Myc binding site were found at positions -1788 and -3995, respectively, of the 5'-flanking region.
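The GC-content thresholds used in Example 8 to delimit GC-rich regions (> 60% and > 70%) can be illustrated with a sliding-window calculation. The following is a minimal sketch and not the procedure used in the patent, which does not state a window size; the function names, the window length and the toy sequence are assumptions chosen for illustration.

```python
def gc_fraction(seq: str) -> float:
    """Fraction of G and C nucleotides in a sequence."""
    seq = seq.upper()
    if not seq:
        raise ValueError("sequence must not be empty")
    return (seq.count("G") + seq.count("C")) / len(seq)


def gc_rich_windows(seq: str, window: int, threshold: float) -> list[int]:
    """0-based start indices of all windows whose GC fraction exceeds threshold."""
    return [
        start
        for start in range(len(seq) - window + 1)
        if gc_fraction(seq[start:start + window]) > threshold
    ]


# Primer PRO-TEST4 from Example 7 lies in the 5'-flanking region (-191 to -171).
print(round(gc_fraction("CCAGCTCCGCCTCCTCCGCGC"), 3))

# Toy sequence (assumption) with an AT-rich start and a GC-rich end.
print(gc_rich_windows("ATATGCGCGC", window=4, threshold=0.7))
```

In practice a much larger window would be scanned across the 11.2 kb region, and adjacent windows above the threshold would be merged into one region.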
Example 9

In order to analyse the activity of the hTC promoter, PCR amplification was used to generate four hTC promoter sequence segments of differing length, which segments were cloned into the Promega vector pGL2 5' in front of the luciferase reporter gene. The 8.5 kb Sac fragment which was subcloned from phage clone P12 was selected as the DNA source for the PCR amplification. In a final volume of 50 µl, 10 pmol of dNTP mix were added to 35 ng of this DNA, and a PCR reaction was carried out in 1x PCR reaction buffer (PCR-Optimizer kit from InVitrogen) and using one unit of platinum Taq DNA polymerase (from Gibco/BRL). In each case 20 pmol of the 5' and 3' primers which are defined below were added as primers. The PCR was carried out in three steps. A two-minute denaturation at 94°C was followed by 30 PCR cycles in which the DNA was first denatured at 94°C for 45 sec, after which the primers were annealed, and the DNA chain was extended, at 68°C for 5 min. The cycles were concluded by a chain extension at 68°C for 10 min.

The selected 3' PCR primer was in each case the primer PK-3A (5'-GCAAGCTTGACGCAGCGCTGCCTGAAACTCG-3', position -43 to -65), which recognizes a sequence region 42 bp upstream of the ATG start codon. A promoter fragment of 4051 bp in size (NPK8) was amplified by combining the PK-3A primer with the 5' PCR primer PK-5B (5'-CCAGATCTCTGGAACACAGAGTGGCAGTTTCC-3', position -4093 to -4070). Combining the pair of primers PK-3A and PK-5C (5'-CCAGATCTGCATGAAGTGTGTGGGGATTTGCAG-3', position -3120 to -3096) led to the amplification of a promoter fragment of 3078 bp in size (NPK15). Use of the primer combination PK-3A and PK-5D (5'-GGAGATCTGATCTTGGCTTACTGCAGCCTCTG-3', position -2110 to -2087) amplified a promoter fragment of 2068 bp in size (NPK22).
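The fragment sizes quoted so far follow directly from the outermost primer positions: the inclusive distance from the 5' end of the 5' primer to position -43, the 3'-most position covered by PK-3A. The following is a minimal sketch of that arithmetic; the function name `amplicon_size` is hypothetical, and the calculation counts only template-matching positions, on the assumption that the restriction-site tails of the primers are not included in the stated sizes.

```python
PK3A_END = -43  # 3'-most template position covered by primer PK-3A


def amplicon_size(five_prime_start: int, three_prime_end: int = PK3A_END) -> int:
    """Inclusive template span between two positions upstream of the ATG."""
    if five_prime_start >= 0 or three_prime_end >= 0:
        raise ValueError("both positions are expected to be negative (upstream)")
    return abs(five_prime_start - three_prime_end) + 1


# 5' start positions of the 5' primers, as stated in the text.
FRAGMENTS = {"NPK8": -4093, "NPK15": -3120, "NPK22": -2110}

for name, start in FRAGMENTS.items():
    print(f"{name}: {amplicon_size(start)} bp")
```

The same function applies to any further 5' primer that is paired with PK-3A.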
Finally, using the primer combination PK-3A and PK-5E (5'-GGAGATCTGTCTGGATTCCTGGGAAGTCCTCA-3', position -1125 to -1102) led to the amplification of a promoter fragment of 1083 bp in size (NPK27).

The PK-3A primer contains a HindIII recognition sequence. The different 5' primers contain a BglII recognition sequence.

The resulting PCR products were purified using the Qiagen QIAquick spin PCR purification kit, in accordance with the manufacturer's instructions, and then digested with the restriction enzymes BglII and HindIII. The pGL2 promoter vector was digested with the same restriction enzymes, and the SV40 promoter contained in this vector was released and removed. The PCR promoter fragments were ligated into the vector, which was then transformed into competent DH5α bacteria (from Gibco/BRL). DNA for the promoter activity analyses, which are described below, was isolated from transformed bacterial clones using the Qiagen plasmid kit.

Example 10

The activity of the hTC promoter was analysed in transient transfections in eukaryotic cells.

All the work with eukaryotic cells was carried out at a sterile workstation. CHO-K1 and HEK 293 cells were obtained from the American Type Culture Collection.

CHO-K1 cells were kept in DMEM Nut Mix F-12 cell culture medium (from Gibco-BRL, order number: 21331-020) containing 0.15% streptomycin/penicillin, 2 mM glutamine and 10% FCS (from Gibco-BRL).

HEK 293 cells were cultured in DMEM cell culture medium (from Gibco-BRL, order number: 41965-039) containing 0.15% streptomycin/penicillin, 2 mM glutamine and 10% FCS (from Gibco-BRL).

CHO-K1 and HEK 293 cells were cultured at 37°C in a water-saturated atmosphere while being gassed with 5% CO2. When the cell lawn was confluent, the medium was aspirated, after which the cells were washed with PBS (100 mM KH2PO4, pH 7.2; 150 mM NaCl) and released by adding a trypsin-EDTA solution (from Gibco-BRL). The trypsin was inactivated by adding medium and the cell count was determined using a Neubauer counting chamber in order to plate out the cells at the desired density.

For the transfection, in each case 2 x 10^5 HEK 293 cells were plated out, per well, in a 24-well cell culture plate. The HEK 293 medium was removed after 3 hours. For the transfection, up to 2.5 µg of plasmid DNA, 1 µg of a CMV β-Gal plasmid construct (from Stratagene, order number: 200388), 200 µl of serum-free medium and 10 µl of transfection reagent (DOTAP from Boehringer Mannheim) were incubated at room temperature for 15 minutes and then dropped uniformly onto the HEK 293 cells. 1.5 ml of medium were added after 3 hours. The medium was changed after 20 hours. After a further 24 hours, the cells were harvested for determining the luciferase activity and the β-Gal activity. For this, the cells were lysed, at room temperature for 15 minutes, in the cell culture lysis reagent (25 mM Tris [pH 7.8] containing H3PO4; 2 mM CDTA; 2 mM DTT; 10% glycerol; 1% Triton X-100). Twenty µl of this cell lysate were mixed with 100 µl of luciferase assay buffer (20 mM Tricine; 1.07 mM (MgCO3)4Mg(OH)2·5H2O; 2.67 mM MgSO4; 0.1 mM EDTA; 33.3 mM DTT; 270 µM coenzyme A; 470 µM luciferin; 530 µM ATP), and the light generated by the luciferase was measured.

In order to measure the β-galactosidase activity, equal quantities of cell lysate and β-galactosidase assay buffer (100 mM sodium phosphate buffer, pH 7.3; 1 mM MgCl2; 50 mM β-mercaptoethanol; 0.665 mg of ONPG/ml) were incubated at 37°C for at least 30 minutes or until a slight yellow coloration appeared. The reaction was stopped by adding 100 µl of 1 M Na2CO3, and the absorption was determined at 420 nm.

In order to analyse the hTC promoter, four hTC promoter sequence segments of differing length were cloned 5' of the luciferase reporter gene (cf. Example 9).

The relative luciferase activities of two independent transfections in HEK 293 cells, using the constructs NPK8, NPK15, NPK22 and NPK27, are plotted in Fig. 11. Each experiment was carried out in duplicate. The standard deviation is also shown.

The construct NPK27 exhibits a luciferase activity which is 40 times higher than the basal activity of the promoterless luciferase control construct (pGL2-basic) and from 2 to 3 times higher than that of the SV40 promoter control construct (pGL2PRO). Interestingly, a luciferase activity which was from 2 to 3 times lower than that obtained with the NPK27 construct was observed in cells which were transfected with longer hTC promoter constructs (NPK8, NPK15, NPK22). Similar results were also observed in CHO cells (data not shown).
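Relative activities of this kind are commonly derived by dividing each luciferase reading by the β-Gal reading from the same well, to correct for transfection efficiency, and then expressing the result as a multiple of a control construct. The text does not state how the two readings were combined, so the Python sketch below is an assumption about the calculation rather than the procedure actually used. The function names and all numbers are hypothetical.

```python
from statistics import mean, stdev


def normalized(wells):
    """Luciferase signal divided by beta-Gal signal for each (luc, bgal) well."""
    return [luc / bgal for luc, bgal in wells]


def fold_over_control(sample_wells, control_wells):
    """Mean and standard deviation of sample activity relative to a control.

    Each argument is a list of (luciferase reading, beta-Gal reading) pairs;
    at least two sample wells are needed for the standard deviation.
    """
    control = mean(normalized(control_wells))
    ratios = [value / control for value in normalized(sample_wells)]
    return mean(ratios), stdev(ratios)


if __name__ == "__main__":
    # Hypothetical readings (luciferase light units, A420), not measured data.
    basic = [(100.0, 0.50), (120.0, 0.60)]
    npk27 = [(4100.0, 0.50), (4700.0, 0.60)]
    fold, sd = fold_over_control(npk27, basic)
    print(f"{fold:.1f}-fold over control (SD {sd:.1f})")
```

Passing the wells of a promoterless control as `control_wells` yields fold-over-basal values comparable in form to those quoted above. Passing the wells of another construct instead gives a direct construct-to-construct ratio.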
References

Allsopp, R.C., Vaziri, H., Patterson, C., Goldstein, S., Younglai, E.V., Futcher, A.B., Greider, C.W. and Harley, C.B. (1992). Telomere length predicts replicative capacity of human fibroblasts. Proc. Natl. Acad. Sci. 89, 10114-10118.

Ausubel, F.M., Brent, R., Kingston, R.E., Moore, D.D., Seidman, J.G., Smith, J.A. and Struhl, K. (1987). Current protocols in molecular biology. Greene Publishing Associates and Wiley Interscience, New York.

Blasco, M.A., Rizen, M., Greider, C.W. and Hanahan, D. (1996). Differential regulation of telomerase activity and telomerase RNA during multistage tumorigenesis. Nature Genetics 12, 200-204.

Broccoli, D., Young, J.W. and de Lange, T. (1995). Telomerase activity in normal and malignant hematopoietic cells. Proc. Natl. Acad. Sci. 92, 9082-9086.

Counter, C.M., Avilion, A.A., LeFeuvre, C.E., Stewart, N.G., Greider, C.W., Harley, C.B. and Bacchetti, S. (1992). Telomere shortening associated with chromosome instability is arrested in immortal cells which express telomerase activity. EMBO J. 11, 1921-1929.

Feng, J., Funk, W.D., Wang, S.-S., Weinrich, S.L., Avilion, A.A., Chiu, C.-P., Adams, R.R., Chang, E., Allsopp, R.C., Yu, J., Le, S., West, M.D., Harley, C.B., Andrews, W.H., Greider, C.W. and Villeponteau, B. (1995). The RNA component of human telomerase. Science 269, 1236-1241.

Geng, Y. and Johnson, L.F. (1993). Lack of an initiator element is responsible for multiple transcriptional initiation sites of the TATA-less mouse thymidylate synthase promoter. Mol. Cell. Biol. 14:4894.

Goldstein, S. (1990). Replicative senescence: The human fibroblast comes of age. Science 249, 1129-1133.

Harley, C.B., Futcher, A.B. and Greider, C.W. (1990). Telomeres shorten during ageing of human fibroblasts. Nature 345, 458-460.

Hastie, N.D., Dempster, M., Dunlop, M.G., Thompson, A.M., Green, D.K. and Allshire, R.C. (1990). Telomere reduction in human colorectal carcinoma and with ageing. Nature 346, 866-868.

Hiyama, K., Hirai, Y., Kyoizumi, S., Akiyama, M., Hiyama, E., Piatyszek, M.A., Shay, J.W., Ishioka, S. and Yamakido, M. (1995). Activation of telomerase in human lymphocytes and hematopoietic progenitor cells. J. Immunol. 155, 3711-3715.

Kim, N.W., Piatyszek, M.A., Prowse, K.R., Harley, C.B., West, M.D., Ho, P.L.C., Coviello, G.M., Wright, W.E., Weinrich, S.L. and Shay, J.W. (1994). Specific association of human telomerase activity with immortal cells and cancer. Science 266, 2011-2015.

Latchman, D.S. (1991). Eukaryotic transcription factors. Academic Press Limited, London.

Lingner, J., Hughes, T.R., Shevchenko, A., Mann, M., Lundblad, V. and Cech, T.R. (1997). Reverse transcriptase motifs in the catalytic subunit of telomerase. Science 276, 561-567.

Lundblad, V. and Szostak, J.W. (1989). A mutant with a defect in telomere elongation leads to senescence in yeast. Cell 57, 633-643.

McClintock, B. (1941). The stability of broken ends of chromosomes in Zea mays. Genetics 26, 234-282.

Meyne, J., Ratliff, R.L. and Moyzis, R.K. (1989). Conservation of the human telomere sequence (TTAGGG)n among vertebrates. Proc. Natl. Acad. Sci. 86, 7049-7053.

Olovnikov, A.M. (1973). A theory of marginotomy. J. Theor. Biol. 41, 181-190.

Sandell, L.L. and Zakian, V.A. (1993). Loss of a yeast telomere: Arrest, recovery and chromosome loss. Cell 75, 729-739.

Shapiro, M.B. and Senapathy, P. (1987). RNA splice junctions of different classes of eukaryotes: sequence statistics and functional implications in gene expression. Nucl. Acids Res. 15, 7155-7174.

Smale, S.T. and Baltimore, D. (1989). The "initiator" as a transcription control element. Cell 57, 103-113.

Smale, S.T. (1997). Transcription initiation from TATA-less promoters within eukaryotic protein-coding genes. Biochimica et Biophysica Acta 1351, 73-88.

Shay, J.W. (1997). Telomerase and Cancer. Ciba Foundation Meeting: Telomeres and Telomerase. London.

Vaziri, H., Dragowska, W., Allsopp, R.C., Thomas, T.E., Harley, C.B. and Lansdorp, P.M. (1994). Evidence for a mitotic clock in human hematopoietic stem cells: Loss of telomeric DNA with age. Proc. Natl. Acad. Sci. 91, 9857-9860.

Wick, M., Hironen, R., Mumberg, D., Burger, C., Olsen, B.R., Budarf, M.L., Apte, S.S. and Miller, R. (1995). Structure of the human TIMP-3 gene and its cell-cycle-regulated promoter. Biochemical Journal 311, 549-554.

Zakian, V.A. (1995). Telomeres: Beginning to understand the end. Science 270, 1601-1607.
SEQUENCE LISTING

<110> Bayer AG

<120> Regulatory DNA sequences from the 5' region of the gene for the human catalytic telomerase subunit, and their diagnostic and therapeutic use

<130> LeA32805-Foreign Countries

<140>
<141>

<160> 20

<170> PatentIn Vers. 2.0

<210> 1
<211> 5126
<212> DNA
<213> Homo sapiens

<400> 1
gagctctgaa ccgtggaaac gaacatgacc cttgcctgcc tgcttccctg ggtgggtcaa 60
gggtaatgaa gtggtgtgca ggaaatggcc atgtaaatta cacgactctg ctgatgggga 120
ccgttccttc catcattatt catcttcacc cccaaggact gaatgattcc agcaacttct 180
tcgggtgtga caagccatga caaaactcag tacaaacacc actcttttac taggcccaca 240
gagcacgggc cacacccctg atatattaag agtccaggag agatgaggct gctttcagcc 300
accaggctgg ggtgacaaca gcggctgaac agtctgttcc tctagactag tagaccctgg 360
caggcactcc cccaaattct agggcctggt tgctgcttcc cgagggcgcc atctgccctg 420
gagactcagc ctggggtgcc acactgaggc cagccctgtc tccacaccct ccgcctccag 480
gcctcagctt ctccagcagc ttcctaaacc ctgggtgggc cgtgttccag cgctactgtc 540
tcacctgtcc cactgtgtct tgtctcagcg acgtagctcg cacggttcct cctcacatgg 600
ggtgtctgtc tccttcccca acactcacat gcgttgaagg gaggagattc tgcgcctccc 660
agactggctc ctctgagcct gaacctggct cgtggccccc gatgcaggtt cctggcgtcc 720
ggctgcacgc tgacctccat ttccaggcgc tccccgtctc ctgtcatctg ccggggcctg 780
ccggtgtgtt cttctgtttc tgtgctcctt tccacgtcca gctgcgtgtg tctctgcccg 840
ctagggtctc ggggttttta taggcatagg acgggggcgt ggtgggccag ggcgctcttg 900
ggaaatgcaa catttgggtg tgaaagtagg agtgcctgtc ctcacctagg tccacgggca 960
caggcctggg gatggagccc ccgccaggga cccgcccttc tctgcccagc actttcctgc 1020
ccccctccct ctggaacaca gagtggcagt ttccacaagc actaagcatc ctcttcccaa 1080
aagacccagc attggcaccc ctggacattt gccccacagc cctgggaatt cacgtgacta 1140
cgcacatcat gtacacactc ccgtccacga ccgacccccg ctgttttatt ttaatagcta 1200
caaagcaggg aaatccctgc taaaatgtcc tttaacaaac tggttaaaca aacgggtcca 1260
tccgcacggt ggacagttcc tcacagtgaa gaggaacatg ccgtttataa agcctgcagg 1320
catctcaagg gaattacgct gagtcaaaac tgccacctcc atgggatacg tacgcaacat 1380
gctcaaaaag aaagaatttc accccatggc
aggggagtgg ttaggggggt taaggacggt 1440 gggggcggca gctgggggct actgcacgca ccttttacta aagccagttt cctggttctg 1500 atggtattgg ctcagttatg ggagactaac cataggggag tggggatggg ggaacccgga 1560 50 ggctgtgcca tctttgccat gcccgagtgt cctgggcagg ataatgctct agagatgccc 1620 acgtcctgat tcccccaaac ctgtggacag aacccgcccg gccccagggc ctttgcaggt 1680 gtgatctccg tgaggaccct gaggtctggg atccttcggg actacctgca ggcccgaaaa 1740 gtaatccagg ggttctggga agaggcgggc aggagggtca gaggggggca gcctcaggac 1800 gatggaggca gtcagtctga ggctgaaaag ggagggaggg cctcgagccc aggcctgcaa 1860 55 gcgcctccag aagctggaaa aagcggggaa gggaccctcc acggagcctg cagcaggaag 1920 gcacggctgg cccttagccc accagggccc atcgtggacc tccggcctcc gtgccatagg 1980 agggcactcg cgctgccctt ctagcatgaa gtgtgtgggg atttgcagaa gcaacaggaa 2040 acccatgcac tgtgaatcta ggattatttc aaaacaaagg tttacagaaa catccaagga 2100 cagggctgaa gtgcctccgg gcaagggcag ggcaggcacg agtgatttta tttagctatt 2160 60 ttattttatt tacttacttt ctgagacaga gttatgctct tgttgcccag gctggagtgc 2220 agcggcatga tcttggctca ctgcaacctc cgtctcctgg gttcaagcaa ttctcgtgcc 2280 tcagcctccc aagtagctgg gatttcaggc gtgcaccacc acacccggct aattttgtat 2340 ttttagtaga gatgggcttt caccatgttg gtcaagctga tctcaaaatc ctgacctcag 2400 gtgatccgcc cacctcagcc tcccaaagtg ctgggattac aggcatgagc cactgcacct 2460 65 ggcctattta accattttaa aacttccctg ggctcaagtc acacccactg gtaaggagtt 2520 catggagttc aatttcccct ttactcagga gttaccctcc tttgatattt tctgtaattc 2580 ttcgtagact ggggatacac cgtctcttga catattcaca gtttctgtga ccacctgtta 2640 tcccatggga cccactgcag gggcagctgg gaggctgcag gcttcaggtc ccagtggggt 2700 tgccatctgc cagtagaaac ctgatgtaga atcagggcgc aagtgtggac actgtcctga 2760 70 atctcaatgt ctcagtgtgt gctgaaacat gtagaaatta aagtccatcc ctcctactct 2820 actgggattg agccccttcc ctatcccccc ccaggggcag aggagttcct ctcactcctg 2880 tggaggaagg aatgatactt tgttattttt cactgctggt actgaatcca ctgtttcatt 2940 - 61 tgttggtttg tttgttttgt tttgagaggc ggtttcactc ttgttgctca ggctggaggg 3000 agtgcaatgg cgcgatcttg gcttactgca gcctctgcct cccaggttca agtgattctc 3060 ctgcttccgc ctcccatttg 
gctgggatta caggcacccg ccaccatgcc cagctaattt 3120 tttgtatttt tagtagagac gggggtgggt ggggttcacc atgttggcca ggctggtctc 3180 5 gaacttctga cctcagatga tccacctgcc tctgcctcct aaagtgctgg gattacaggt 3240 gtgagccacc atgcccagct cagaatttac tctgtttaga aacatctggg tctgaggtag 3300 gaagctcacc ccactcaagt gttgtggtgt tttaagccaa tgatagaatt tttttattgt 3360 tgttagaaca ctcttgatgt tttacactgt gatgactaag acatcatcag cttttcaaag 3420 acacactaac tgcacccata atactggggt gtcttctggg tatcagcaat cttcattgaa 3480 10 tgccgggagg cgtttcctcg ccatgcacat ggtgttaatt actccagcat aatcttctgc 3540 ttccatttct tctcttccct cttttaaaat tgtgttttct atgttggctt ctctgcagag 3600 aaccagtgta agctacaact taacttttgt tggaacaaat tttccaaacc gcccctttgc 3660 cctagtggca gagacaattc acaaacacag ccctttaaaa aggcttaggg atcactaagg 3720 ggatttctag aagagcgacc tgtaatccta agtatttaca agacgaggct aacctccagc 3780 15 gagcgtgaca gcccagggag ggtgcgaggc ctgttcaaat gctagctcca taaataaagc 3840 aatttcctcc ggcagtttct gaaagtagga aaggttacat ttaaggttgc gtttgttagc 3900 atttcagtgt ttgccgacct cagctacagc atccctgcaa ggcctcggga gacccagaag 3960 tttctcgccc ccttagatcc aaacttgagc aacccggagt ctggattcct gggaagtcct 4020 cagctgtcct gcggttgtgc cggggcccca ggtctggagg ggaccagtgg ccgtgtggct 4080 20 tctactgctg ggctggaagt cgggcctcct agctctgcag tccgaggctt ggagccaggt 4140 gcctggaccc cgaggctgcc ctccaccctg tgcgggcggg atgtgaccag atgttggcct 4200 catctgccag acagagtgcc ggggcccagg gtcaaggccg ttgtggctgg tgtgaggcgc 4260 ccggtgcgcg gccagcagga gcgcctggct ccatttccca ccctttctcg acgggaccgc 4320 cccggtgggt gattaacaga tttggggtgg tttgctcatg gtggggaccc ctcgccgcct 4380 25 gagaacctgc aaagagaaat gacgggcctg tgtcaaggag cccaagtcgc ggggaagtgt 4440 tgcagggagg cactccggga ggtcccgcgt gcccgtccag ggagcaatgc gtcctcgggt 4500 tcgtccccag ccgcgtctac gcgcctccgt cctccccttc acgtccggca ttcgtggtgc 4560 ccggagcccg acgccccgcg tccggacctg gaggcagccc tgggtctccg gatcaggcca 4620 gcggccaaag ggtcgccgca cgcacctgtt cccagggcct ccacatcatg gcccctccct 4680 30 cgggttaccc cacagcctag gccgattcga cctctctccg ctggggccct cgctggcgtc 4740 cctgcaccct 
gggagcgcga gcggcgcgcg ggcggggaag cgcggcccag acccccgggt 4800 ccgcccggag cagctgcgct gtcggggcca ggccgggctc ccagtggatt cgcgggcaca 4860 gacgcccagg accgcgctcc ccacgtggcg gagggactgg ggacccgggc acccgtcctg 4920 ccccttcacc ttccagctcc gcctcctccg cgcggacccc gccccgtccc gacccctccc 4980 35 gggtccccgg cccagccccc tccgggccct cccagcccct ccccttcctt tccgcggccc 5040 cgccctctcc tcgcggcgcg agtttcaggc agcgctgcgt cctgctgcgc acgtgggaag 5100 ccctggcccc ggccaccccc gcgatg 5126 <210> 2 40 <211> 4042 <212> DNA <213> Homo sapiens <400> 2 45 gtttcaggca gcgctgcgtc ctgctgcgca cgtgggaagc cctggccccg gccacccccg 60 cgatgccgcg cgctccccgc tgccgagccg tgcgctccct gctgcgcagc cactaccgcg 120 aggtgctgcc gctggccacg ttcgtgcggc gcctggggcc ccagggctgg cggctggtgc 180 agcgcgggga cccggcggct ttccgcgcgc tggtggccca gtgcctggtg tgcgtgccct 240 gggacgcacg gccgcccccc gccgccccct ccttccgcca ggtgtcctgc ctgaaggagc 300 50 tggtggcccg agtgctgcag aggctgtgcg agcgcggcgc gaagaacgtg ctggccttcg 360 gcttcgcgct gctggacggg gcccgcgggg gcccccccga ggccttcacc accagcgtgc 420 gcagctacct gcccaacacg gtgaccgacg cactgcgggg gagcggggcg tgggggctgc 480 tgctgcgccg cgtgggcgac gacgtgctgg ttcacctgct ggcacgctgc gcgctctttg 540 tgctggtggc tcccagctgc gcctaccagg tgtgcgggcc gccgctgtac cagctcggcg 600 55 ctgccactca ggcccggccc ccgccacacg ctagtggacc ccgaaggcgt ctgggatgcg 660 aacgggcctg gaaccatagc gtcagggagg ccggggtccc cctgggcctg ccagccccgg 720 gtgcgaggag gcgcgggggc agtgccagcc gaagtctgcc gttgcccaag aggcccaggc 780 gtggcgctgc ccctgagccg gagcggacgc ccgttgggca ggggtcctgg gcccacccgg 840 gcaggacgcg tggaccgagt gaccgtggtt tctgtgtggt gtcacctgcc agacccgccg 900 60 aagaagccac ctctttggag ggtgcgctct ctggcacgcg ccactcccac ccatccgtgg 960 gccgccagca ccacgcgggc cccccatcca catcgcggcc accacgtccc tgggacacgc 1020 cttgtccccc ggtgtacgcc gagaccaagc acttcctcta ctcctcaggc gacaaggagc 1080 agctgcggcc ctccttccta ctcagctctc tgaggcccag cctgactggc gctcggaggc 1140 tcgtggagac catctttctg ggttccaggc cctggatgcc agggactccc cgcaggttgc 1200 65 cccgcctgcc ccagcgctac tggcaaatgc ggcccctgtt tctggagctg cttgggaacc 1260 
acgcgcagtg cccctacggg gtgctcctca agacgcactg cccgctgcga gctgcggtca 1320 ccccagcagc cggtgtctgt gcccgggaga agccccaggg ctctgtggcg gcccccgagg 1380 aggaggacac agacccccgt cgcctggtgc agctgctccg ccagcacagc agcccctggc 1440 aggtgtacgg cttcgtgcgg gcctgcctgc gccggctggt gcccccaggc ctctggggct 1500 70 ccaggcacaa cgaacgccgc ttcctcagga acaccaagaa gttcatctcc ctggggaagc 1560 -M - atgccaagct ctcgctgcag gagctgacgt ggaagatgag cgtgcgggac tgcgcttggc 1620 tgcgcaggag cccaggggtt ggctgtgttc cggccgcaga gcaccgtctg cgtgaggaga 1680 -62 tcctggccaa gttcctgcac tggctgatga gtgtgtacgt cgtcgagctg ctcaggtctt 1740 tcttttatgt cacggagacc acgtttcaaa agaacaggct ctttttctac cggaagagtg 1800 tctggagcaa gttgcaaagc attggaatca gacagcactt gaagagggtg cagctgcggg 1860 agctgtcgga agcagaggtc aggcagcatc gggaagccag gcccgccctg ctgacgtcca 1920 5 gactccgctt catccccaag cctgacgggc tgcggccgat tgtgaacatg gactacgtcg 1980 tgggagccag aacgttccgc agagaaaaga gggccgagcg tctcacctcg agggtgaagg 2040 cactgttcag cgtgctcaac tacgagcggg cgcggcgccc cggcctcctg ggcgcctctg 2100 tgctgggcct ggacgatatc cacagggcct ggcgcacctt cgtgctgcgt gtgcgggccc 2160 aggacccgcc gcctgagctg tactttgtca aggtggatgt gacgggcgcg tacgacacca 2220 10 tcccccagga caggctcacg gaggtcatcg ccagcatcat caaaccccag aacacgtact 2280 gcgtgcgtcg gtatgccgtg gtccagaagg ccgcccatgg gcacgtccgc aaggccttca 2340 agagccacgt ctctaccttg acagacctcc agccgtacat gcgacagttc gtggctcacc 2400 tgcaggagac cagcccgctg agggatgccg tcgtcatcga gcagagctcc tccctgaatg 2460 aggccagcag tggcctcttc gacgtcttcc tacgcttcat gtgccaccac gccgtgcgca 2520 15 tcaggggcaa gtcctacgtc cagtgccagg ggatcccgca gggctccatc ctctccacgc 2580 tgctctgcag cctgtgctac ggcgacatgg agaacaagct gtttgcgggg attcggcggg 2640 acgggctgct cctgcgtttg gtggatgatt tcttgttggt gacacctcac ctcacccacg 2700 cgaaaacctt cctcaggacc ctggtccgag gtgtccctga gtatggctgc gtggtgaact 2760 tgcggaagac agtggtgaac ttccctgtag aagacgaggc cctgggtggc acggcttttg 2820 20 ttcagatgcc ggcccacggc ctattcccct ggtgcggcct gctgctggat acccggaccc 2880 tggaggtgca gagcgactac tccagctatg cccggacctc catcagagcc 
agtctcacct 2940 tcaaccgcgg cttcaaggct gggaggaaca tgcgtcgcaa actctttggg gtcttgcggc 3000 tgaagtgtca cagcctgttt ctggatttgc aggtgaacag cctccagacg gtgtgcacca 3060 acatctacaa gatcctcctg ctgcaggcgt acaggtttca cgcatgtgtg ctgcagctcc 3120 25 catttcatca gcaagtttgg aagaacccca catttttcct gcgcgtcatc tctgacacgg 3180 cctccctctg ctactccatc ctgaaagcca agaacgcagg gatgtcgctg ggggccaagg 3240 gcgccgccgg ccctctgccc tccgaggccg tgcagtggct gtgccaccaa gcattcctgc 3300 tcaagctgac tcgacaccgt gtcacctacg tgccactcct ggggtcactc aggacagccc 3360 agacgcagct gagtcggaag ctcccgggga cgacgctgac tgccctggag gccgcagcca 3420 30 acccggcact gccctcagac ttcaagacca tcctggactg atggccaccc gcccacagcc 3480 aggccgagag cagacaccag cagccctgtc acgccgggct ctacgtccca gggagggagg 3540 ggcggcccac acccaggccc gcaccgctgg gagtctgagg cctgagtgag tgtttggccg 3600 aggcctgcat gtccggctga aggctgagtg tccggctgag gcctgagcga gtgtccagcc 3660 aagggctgag tgtccagcac acctgccgtc ttcacttccc cacaggctgg cgctcggctc 3720 35 caccccaggg ccagcttttc ctcaccagga gcccggcttc cactccccac ataggaatag 3780 tccatcccca gattcgccat tgttcacccc tcgccctgcc ctcctttgcc ttccaccccc 3840 accatccagg tggagaccct gagaaggacc ctgggagctc tgggaatttg gagtgaccaa 3900 aggtgtgccc tgtacacagg cgaggaccct gcacctggat gggggtccct gtgggtcaaa 3960 ttggggggag gtgctgtggg agtaaaatac tgaatatatg agtttttcag ttttgaaaaa 4020 40 aaaaaaaaaa aaaaaaaaaa aa 4042 <210> 3 <211> 11276 <212> DNA 45 <213> Homo sapiens <400> 3 acttgagccc aagagttcaa ggctacggtg agccatgatt gcaacaccac acgccagcct 60 tggtgacaga atgagaccct gtctcaaaaa aaaaaaaaaa aattgaaata atataaagca 120 50 tcttctctgg ccacagtgga acaaaaccag aaatcaacaa caagaggaat tttgaaaact 180 atacaaacac atgaaaatta aacaatatac ttctgaatga ccagtgagtc aatgaagaaa 240 ttaaaaagga aattgaaaaa tttatttaag caaatgataa cggaaacata acctctcaaa 300 acccacggta tacagcaaaa gcagtgctaa gaaggaagtt tatagctata agcagctaca 360 tcaaaaaagt agaaaagcca ggcgcagtgg ctcatgcctg taatcccagc actttgggag 420 55 gccaaggcgg gcagatcgcc tgaggtcagg agttcgagac cagcctgacc aacacagaga 480 aaccttgtcg ctactaaaaa tacaaaatta 
gctgggcatg gtggcacatg cctgtaatcc 540 cagctactcg ggaggctgag gcaggataac cgcttgaacc caggaggtgg aggttgcggt 600 gagccgggat tgcgccattg gactccagcc tgggtaacaa gagtgaaacc ctgtctcaag 660 aaaaaaaaaa aagtagaaaa acttaaaaat acaacctaat gatgcacctt aaagaactag 720 60 aaaagcaaga gcaaactaaa cctaaaattg gtaaaagaaa agaaataata aagatcagag 780 cagaaataaa tgaaactgaa agataacaat acaaaagatc aacaaaatta aaagttggtt 840 ttttgaaaag ataaacaaaa ttgacaaacc tttgcccaga ctaagaaaaa aggaaagaag 900 acctaaataa ataaagtcag agatgaaaaa agagacatta caactgatac cacagaaatt 960 caaaggatca ctagaggcta ctatgagcaa ctgtacacta ataaattgaa aaacctagaa 1020 65 aaaatagata aattcctaga tgcatacaac ctaccaagat tgaaccatga agaaatccaa 1080 agcccaaaca gaccaataac aataatggga ttaaagccat aataaaaagt ctcctagcaa 1140 agagaagccc aggacccaat ggcttccctg ctggatttta ccaatcattt aaagaagaat 1200 gaattccaat cctactcaaa ctattctgaa aaatagagga aagaatactt ccaaactcat 1260 tctacatggc cagtattacc ctgattccaa aaccagacaa aaacacatca aaaacaaaca 1320 70 aacaaaaaaa cagaaagaaa gaaaactaca ggccaatatc cctgatgaat actgatacaa 1380 aaatcctcaa caaaacacta gcaaaccaaa ttaaacaaca ccttcgaaag atcattcatt 1440 gtgatcaagt gggatttatt ccagggatgg aaggatggtt caacatatgc aaatcaatca 1500 -63 atgtgataca tcatcccaac aaaatgaagt acaaaaacta tatgattatt tcactttatg 1560 cagaaaaagc atttgataaa attctgcacc cttcatgata aaaaccctca aaaaaccagg 1620 tatacaagaa acatacaggc caggcacagt ggctcacacc tgcgatccca gcactctggg 1680 aggccaaggt gggatgattg cttgggccca ggagtttgag actagcctgg gcaacaaaat 1740 5 gagacctggt ctacaaaaaa cttttttaaa aaattagcca ggcatgatgg catatgcctg 1800 tagtcccagc tagtctggag gctgaggtgg gagaatcact taagcctagg aggtcgaggc 1860 tgcagtgagc catgaacatg tcactgtact ccagcctaga caacagaaca agaccccact 1920 gaataagaag aaggagaagg agaagggaga agggagggag aagggaggag gaggagaagg 1980 aggaggtgga ggagaagtgg aaggggaagg ggaagggaaa gaggaagaag aagaaacata 2040 10 tttcaacata ataaaagccc tatatgacag accgaggtag tattatgagg aaaaactgaa 2100 agcctttcct ctaagatctg gaaaatgaca agggcccact ttcaccactg tgattcaaca 2160 tagtactaga agtcctagct agagcaatca 
gataagagaa agaaataaaa ggcatccaaa 2220 ctggaaagga agaagtcaaa ttatcctgtt tgcagatgat atgatcttat atctggaaaa 2280 gacttaagac accactaaaa aactattaga gctgaaattt ggtacagcag gatacaaaat 2340 15 caatgtacaa aaatcagtag tatttctata ttccaacagc aaacaatctg aaaaagaaac 2400 caaaaaagca gctacaaata aaattaaaca gctaggaatt aaccaaagaa gtgaaagatc 2460 tctacaatga aaactataaa atgttgataa aagaaattga agagggcaca aaaaaagaaa 2520 agatattcca tgttcataga ttggaagaat aaatactgtt aaaatgtcca tactacccaa 2580 agcaatttac aaattcaatg caatccctat taaaatacta atgacgttct tcacagaaat 2640 20 agaagaaaca attctaagat ttgtacagaa ccacaaaaga cccagaatag ccaaagctat 2700 cctgaccaaa aagaacaaaa ctggaagcat cacattacct gacttcaaat tatactacaa 2760 agctatagta acccaaacta catggtactg gcataaaaac agatgagaca tggaccagag 2820 gaacagaata gagaatccag aaacaaatcc atgcatctac agtgaactca tttttgacaa 2880 aggtgccaag aacatacttt ggggaaaaga taatctcttc aataaatggt gctggaggaa 2940 25 ctggatatcc atatgcaaaa taacaatact agaactctgt ctctcaccat atacaaaagc 3000 aaatcaaaat ggatgaaagg cttaaatcta aaacctcaaa ctttgcaact actaaaagaa 3060 aacaccggag aaactctcca ggacattgga gtgggcaaag acttcttgag taattccctg 3120 caggcacagg caaccaaagc aaaaacagac aaatgggatc atatcaagtt aaaaagcttc 3180 tgcccagcaa aggaaacaat caacaaagag aagagacaac ccacagaatg ggagaatata 3240 30 tttgcaaact attcatctaa caaggaatta ataaccagta tatataagga gctcaaacta 3300 ctctataaga aaaacaccta ataagctgat tttcaaaaat aagcaaaaga tctgggtaga 3360 catttctcaa aataagtcat acaaatggca aacaggcatc tgaaaatgtg ctcaacacca 3420 ctgatcatca gagaaatgca aatcaaaact actatgagag atcatctcat cccagttaaa 3480 atggctttta ttcaaaagac aggcaataac aaatgccagt gaggatgtgg ataaaaggaa 3540 35 acccttggac actgttggtg ggaatggaaa ttgctaccac tatggagaac agtttgaaag 3600 ttcctcaaaa aactaaaaat aaagctacca tacagcaatc ccattgctag gtatatactc 3660 caaaaaaggg aatcagtgta tcaacaagct atctccactc ccacatttac tgcagcactg 3720 ttcatagcag ccaaggtttg gaagcaacct cagtgtccat caacagacga atggaaaaag 3780 aaaatgtggt gcacatacac aatggagtac tacgcagcca taaaaaagaa tgagatcctg 3840 40 tcagttgcaa cagcatgggg 
ggcactggtc agtatgttaa gtgaaataag ccaggcacag 3900 aaagacaaac ttttcatgtt ctcccttact tgtgggagca aaaattaaaa caattgacat 3960 agaaatagag gagaatggtg gttctagagg ggtgggggac agggtgacta gagtcaacaa 4020 taatttattg tatgttttaa aataactaaa agagtataat tgggttgttt gtaacacaaa 4080 gaaaggataa atgcttgaag gtgacagata ccccatttac cctgatgtga ttattacaca 4140 45 ttgtatgcct gtatcaaaat atctcatgta tgctatagat ataaacccta ctatattaaa 4200 aattaaaatt ttaatggcca ggcacggtgg ctcatgtccg taatcccagc actttgggag 4260 gccgaggcgg gtggatcacc tgaggtcagg agtttgaaac cagtctggcc accatgatga 4320 aaccctgtct ctactaaaga tacaaaaatt agccaggcgt ggtggcacat acctgtagtc 4380 ccaactactc aggaggctga gacaggagaa ttgcttgaac ctgggaggcg gaggttgcag 4440 50 tgagccgaga tcatgccact gcactgcagc ctgggtgaca gagcaagact ccatctcaaa 4500 acaaaaacaa aaaaaagaag attaaaattg taatttttat gtaccgtata aatatatact 4560 ctactatatt agaagttaaa aattaaaaca attataaaag gtaattaacc acttaatcta 4620 aaataagaac aatgtatgtg gggtttctag cttctgaaga agtaaaagtt atggccacga 4680 tggcagaaat gtgaggaggg aacagtggaa gttactgttg ttagacgctc atactctctg 4740 55 taagtgactt aattttaacc aaagacaggc tgggagaagt taaagaggca ttctataagc 4800 cctaaaacaa ctgctaataa tggtgaaagg taatctctat taattaccaa taattacaga 4860 tatctctaaa atcgagctgc agaattggca cgtctgatca caccgtcctc tcattcacgg 4920 tgcttttttt cttgtgtgct tggagatttt cgattgtgtg ttcgtgtttg gttaaactta 4980 atctgtatga atcctgaaac gaaaaatggt ggtgatttcc tccagaagaa ttagagtacc 5040 60 tggcaggaag caggtggctc tgtggacctg agccacttca atcttcaagg gtctctggcc 5100 aagacccagg tgcaaggcag aggcctgatg acccgaggac aggaaagctc ggatgggaag 5160 gggcgatgag aagcctgcct cgttggtgag cagcgcatga agtgccctta tttacgcttt 5220 gcaaagattg ctctggatac catctggaaa aggcggccag cgggaatgca aggagtcaga 5280 agcctcctgc tcaaacccag gccagcagct atggcgccca cccgggcgtg tgccagaggg 5340 65 agaggagtca aggcacctcg aagtatggct taaatctttt tttcacctga agcagtgacc 5400 aaggtgtatt ctgagggaag cttgagttag gtgccttctt taaaacagaa agtcatggaa 5460 gcacccttct caagggaaaa ccagacgccc gctctgcggt catttacctc tttcctctct 5520 ccctctcttg 
ccctcgcggt ttctgatcgg gacagagtga cccccgtgga gcttctccga 5580 gcccgtgctg aggaccctct tgcaaagggc tccacagacc cccgccctgg agagaggagt 5640 70 ctgagcctgg cttaataaca aactgggatg tggctggggg cggacagcga cggcgggatt 5700 caaagactta attccatgag taaattcaac ctttccacat ccgaatggat ttggatttta 5760 tcttaatatt ttcttaaatt tcatcaaata acattcagga ctgcagaaat ccaaaggcgt 5820 - 64 aaaacaggaa ctgagctatg tttgccaagg tccaaggact taataaccat gttcagaggg 5880 atttttcgcc ctaagtactt tttattggtt ttcataaggt ggcttagggt gcaagggaaa 5940 gtacacgagg agaggcctgg gcggcagggc tatgagcacg gcagggccac cggggagaga 6000 gtccccggcc tgggaggctg acagcaggac cactgaccgt cctccctggg agctgccaca 6060 5 ttgggcaacg cgaaggcggc cacgctgcgt gtgactcagg accccatacc ggcttcctgg 6120 gcccacccac actaacccag gaagtcacgg agctctgaac ccgtggaaac gaacatgacc 6180 cttgcctgcc tgcttccctg ggtgggtcaa gggtaatgaa gtggtgtgca ggaaatggcc 6240 atgtaaatta cacgactctg ctgatgggga ccgttccttc catcattatt catcttcacc 6300 cccaaggact gaatgattcc agcaacttct tcgggtgtga caagccatga caaaactcag 6360 10 tacaaacacc actcttttac taggcccaca gagcacggsc cacacccctg atatattaag 6420 agtccaggag agatgaggct gctttcagcc accaggctgg ggtgacaaca gcggctgaac 6480 agtctgttcc tctagactag tagaccctgg caggcactcc cccagattct agggcctggt 6540 tgctgcttcc cgagggcgcc atctgccctg gagactcagc ctggggtgcc acactgaggc 6600 cagccctgtc tccacaccct ccgcctccag gcctcagctt ctccagcagc ttcctaaacc 6660 15 ctgggtgggc cgtgttccag cgctactgtc tcacctgtcc cactgtgtct tgtctcagcg 6720 acgtagctcg cacggttcct cctcacatgg ggtgtctgtc tccttcccca acactcacat 6780 gcgttgaagg gaggagattc tgcgcctccc agactggctc ctctgagcct gaacctggct 6840 cgtggccccc gatgcaggtt cctggcgtcc ggctgcacgc tgacctccat ttccaggcgc 6900 tccccgtctc ctgtcatctg ccggggcctg ccggtgtgtt cttctgtttc tgtgctcctt 6960 20 tccacgtcca gctgcgtgtg tctctgcccg ctagggtctc ggggttttta taggcatagg 7020 acgggggcgt ggtgggccag ggcgctcttg ggaaatgcaa catttgggtg tgaaagtagg 7080 agtgcctgtc ctcacctagg tccacgggca caggcctggg gatggagccc ccgccaggga 7140 cccgcccttc tctgcccagc actttcctgc ccccctccct ctggaacaca gagtggcagt 7200 
ttccacaagc actaagcatc ctcttcccaa aagacccagc attggcaccc ctggacattt 7260 25 gccccacagc cctgggaatt cacgtgacta cgcacatcat gtacacactc ccgtccacga 7320 ccgacccccg ctgttttatt ttaatagcta caaagcaggg aaatccctgc taaaatgtcc 7380 tttaacaaac tggttaaaca aacgggtcca tccgcacggt ggacagttcc tcacagtgaa 7440 gaggaacatg ccgtttataa agcctgcagg catctcaagg gaattacgct gagtcaaaac 7500 tgccacctcc atgggatacg tacgcaacat gctcaaaaag aaagaatttc accccatggc 7560 30 aggggagtgg ttaggggggt taaggacggt gggggcggca gctgggggct actgcacgca 7620 ccttttacta aagccagttt cctggttctg atggtattgg ctcagttatg ggagactaac 7680 cataggggag tggggatggg ggaacccgga ggctgtgcca tctttgccat gcccgagtgt 7740 cctgggcagg ataatgctct agagatgccc acgtcctgat tcccccaaac ctgtggacag 7800 aacccgcccg gccccagggc ctttgcaggt gtgatctccg tgaggaccct gaggtctggg 7860 35 atccttcggg actacctgca ggcccgaaaa gtaatccagg ggttctggga agaggcgggc 7920 aggagggtca gaggggggca gcctcaggac gatggaggca gtcagtctga ggctgaaaag 7980 ggagggaggg cctcgagccc aggcctgcaa gcgcctccag aagctggaaa aagcggggaa 8040 gggaccctcc acggagcctg cagcaggaag gcacggctgg cccttagccc accagggccc 8100 atcgtggacc tccggcctcc gtgccatagg agggcactcg cgctgccctt ctagcatgaa 8160 40 gtgtgtgggg atttgcagaa gcaacaggaa acccatgcac tgtgaatcta ggattatttc 8220 aaaacaaagg tttacagaaa catccaagga cagggctgaa gtgcctccgg gcaagggcag 8280 ggcaggcacg agtgatttta tttagctatt ttattttatt tacttacttt ctgagacaga 8340 gttatgctct tgttgcccag gctggagtgc agcggcatga tcttggctca ctgcaacctc 8400 cgtctcctgg gttcaagcaa ttctcgtgcc tcagcctccc aagtagctgg gatttcaggc 8460 45 gtgcaccacc acacccggct aattttgtat ttttagtaga gatgggcttt caccatgttg 8520 gtcaagctga tctcaaaatc ctgacctcag gtgatccgcc cacctcagcc tcccaaagtg 8580 ctgggattac aggcatgagc cactgcacct ggcctattta accattttaa aacttccctg 8640 ggctcaagtc acacccactg gtaaggagtt catggagttc aatttcccct ttactcagga 8700 gttaccctcc tttgatattt tctgtaattc ttcgtagact ggggatacac cgtctcttga 8760 50 catattcaca gtttctgtga ccacctgtta tcccatggga cccactgcag gggcagctgg 8820 gaggctgcag gcttcaggtc ccagtggggt tgccatctgc cagtagaaac 
ctgatgtaga 8880 atcagggcgc aagtgtggac actgtcctga atctcaatgt ctcagtgtgt gctgaaacat 8940 gtagaaatta aagtccatcc ctcctactct actgggattg agccccttcc ctatcccccc 9000 ccaggggcag aggagttcct ctcactcctg tggaggaagg aatgatactt tgttattttt 9060 55 cactgctggt actgaatcca ctgtttcatt tgttggtttg tttgttttgt tttgagaggc 9120 ggtttcactc ttgttgctca ggctggaggg agtgcaatgg cgcgatcttg gcttactgca 9180 gcctctgcct cccaggttca agtgattctc ctgcttccgc ctcccatttg gctgggatta 9240 caggcacccg ccaccatgcc cagctaattt tttgtatttt tagtagagac gggggtgggt 9300 ggggttcacc atgttggcca ggctggtctc gaacttctga cctcagatga tccacctgcc 9360 60 tctgcctcct aaagtgctgg gattacaggt gtgagccacc atgcccagct cagaatttac 9420 tctgtttaga aacatctggg tctgaggtag gaagctcacc ccactcaagt gttgtggtgt 9480 tttaagccaa tgatagaatt tttttattgt tgttagaaca ctcttgatgt tttacactgt 9540 gatgactaag acatcatcag cttttcaaag acacactaac tgcacccata atactggggt 9600 gtcttctggg tatcagcaat cttcattgaa tgccgggagg cgtttcctcg ccatgcacat 9660 65 ggtgttaatt actccagcat aatcttctgc ttccatttct tctcttccct cttttaaaat 9720 tgtgttttct atgttggctt ctctgcagag aaccagtgta agctacaact taacttttgt 9780 tggaacaaat tttccaaacc gcccctttgc cctagtggca gagacaattc acaaacacag 9840 ccctttaaaa aggcttaggg atcactaagg ggatttctag aagagcgacc tgtaatccta 9900 agtatttaca agacgaggct aacctccagc gagcgtgaca gcccagggag ggtgcgaggc 9960 ctgttcaaat gctagctcca taaataaagc aatttcctcc ggcagtttct gaaagtagga 10020 aaggttacat ttaaggttgc gtttgttagc atttcagtgt ttgccgacct cagctacagc 10080 atccctgcaa ggcctcggga gacccagaag tttctcgccc ccttagatcc aaacttgagc 10140 -65 aacccggagt ctggattcct gggaagtcct cagctgtcct gcggttgtgc cggggcccca 10200 ggtctggagg ggaccagtgg ccgtgtggct tctactgctg ggctggaagt cgggcctcct 10260 agctctgcag tccgaggctt ggagccaggt gcctggaccc cgaggctgcc ctccaccctg 10320 tgcgggcggg atgtgaccag atgttggcct catctgccag acagagtgcc ggggcccagg 10380 5 gtcaaggccg ttgtggctgg tgtgaggcgc ccggtgcgcg gccagcagga gcgcctggct 10440 ccatttccca ccctttctcg acgggaccgc cccggtgggt gattaacaga tttggggtgg 10500 tttgctcatg gtggggaccc ctcgccgcct 
gagaacctgc aaagagaaat gacgggcctg 10560 tgtcaaggag cccaagtcgc ggggaagtgt tgcagggagg cactccggga ggtcccgcgt 10620 gcccgtccag ggagcaatgc gtcctcgggt tcgtccccag ccgcgtctac gcgcctccgt 10680 10 cctccccttc acgtccggca ttcgtggtgc ccggagcccg acgccccgcg tccggacctg 10740 gaggcagccc tgggtctccg gatcaggcca gcggccaaag ggtcgccgca cgcacctgtt 10800 cccagggcct ccacatcatg gcccctccct cgggttaccc cacagcctag gccgattcga 10860 cctctctccg ctggggccct cgctggcgtc cctgcaccct gggagcgcga gcggcgcgcg 10920 ggcggggaag cgcggcccag acccccgggt ccgcccggag cagctgcgct gtcggggcca 10980 15 ggccgggctc ccagtggatt cgcgggcaca gacgcccagg accgcgctcc ccacgtggcg 11040 gagggactgg ggacccgggc acccgtcctg ccccttcacc ttccagctcc gcctcctccg 11100 cgcggacccc gccccgtccc gacccctccc gggtccccgg cccagccccc tccgggccct 11160 cccagcccct ccccttcctt tccgcggccc cgccctctcc tcgcggcgcg agtttcaggc 11220 20 agcgctgcgt cctgctgcgc acgtgggaag ccctggcccc ggccaccccc gcgatg 11276 <210> 4 <211> 104 <212> DNA 25 <213> Homo sapiens <400> 4 gtgggcctcc ccggggtcgg cgtccggctg gggttgaggg cggccggggg gaaccagcga 60 catgcggaga gcagcgcagg cgactcaggg cgcttccccc gcag 104 30 <210> 5 <211> 8616 <212> DNA <213> Homo sapiens 35 <400> 5 gtgaggaggt ggtggccgtc gagggcccag gccccagagc tgaatgcagt aggggctcag 60 aaaagggggc aggcagagcc ctggtcctcc tgtctccatc gtcacgtggg cacacgtggc 120 ttttcgctca ggacgtcgag tggacacggt gatctctgcc tctgctctcc ctcctgtcca 180 gtttgcataa acttacgagg ttcaccttca cgttttgatg gacacgcggt ttccaggcgc 240 40 cgaggccaga gcagtgaaca gaggaggctg ggcgcggcag tggagccggg ttgccggcaa 300 tggggagaag tgtctggaag cacagacgct ctggcgaggg tgcctgcagg ttacctataa 360 tcctcttcgc aatttcaagg gtgggaatga gaggtgggga cgagaacccc ctcttcctgg 420 gggtgggagg taagggtttt gcaggtgcac gtggtcagcc aatatgcagg tttgtgttta 480 agatttaatt gtgtgttgac ggccaggtgc ggtggctcac gccggtaatc ccagcacttt 540 45 gggaagctga ggcaggtgga tcacctgagg tcaggagttt gagaccagcc tgaccaacat 600 ggtgaaaccc tatctgtact aaaaatacaa aaattagctg ggcatggtgg tgtgtgcctg 660 taatcccagc tacttgggag gctgaggcag gagaatcact tgaacccagg aggcggaggc 720 
tgcagtgagc tgagattgtg ccattgtact ccagcctggg cgacaagagt gaaactctgt 780 ctttaaaaaa aaaaagtgtt cgttgattgt gccaggacag ggtagaggga gggagataag 840 50 actgttctcc agcacagatc ctggtcccat ctttaggtat gaagagggcc acatgggagc 900 agaggacagc agatggctcc acctgctgag gaagggacag tgtttgtggg tgttcagggg 960 atggtgctgc tgggccctgc cgtgtcccca ccctgttttt ctggatttga tgttgaggaa 1020 cctccgctcc agcccccttt tggctcccag tgctcccagg ccctaccgtg gcagctagaa 1080 gaagtcccga tttcaccccc tccccacaaa ctcccaagac atgtaagact tccggccatg 1140 55 cagacaagga gggtgacctt cttggggctc ttttttttct ttttttcttt ttatggtggc 1200 aaaagtcata taacatgaga ttggcactcc taacaccgtt ttctgtgtac agtgcagaat 1260 tgctaactcg gcggtgttta cagcaggttg cttgaaatgc tgcgtcttgc gtgactggaa 1320 gtccctaccc atcgaacggc agctgcctca cacctgctgc ggctcaggtg gaccacgccg 1380 agtcagataa gcgtcatgca acccagtttt gctttttgtg ctccagcttc cttcgttgag 1440 60 gagagtttga gttctctgat caggactctg cctgtcattg ctgttctctg acttcagatg 1500 aggtcacaat ctgcccctgg cttatgcagg gagtgaggcg tggtccccgg gtgtccctgt 1560 cacgtgcagg gtgagtgagg cgttgccccc aggtgtccct gtcacgtgta gggtgagtga 1620 ggcgcggccc ccgggtgtcc ctgtcccgtg cagcgtgatt gaggtgtggc ccccgggtgt 1680 ccctgtcacg tgtagggtga gtgaggcgcc atccccgggt gtccctgtca cgtgtagggt 1740 65 gagtgaggcg tggtccccgg gtgtccctgt cccgtgcagg gtgagtgagg cactgtcccc 1800 gggtgtccct gtcacgtgca gggtgagtga ggcgcggtcc ccgggtgtcc ctctcaggtg 1860 tagggtgagt gaggcgcggc cccagggtgt ccctgtcacg tgtagggtga gtgaggcacc 1920 gtccctgggt gtccctccca ggtatagggt gagtgaggca ctgtccccgg gtgtccctgt 1980 cacgtgcagg gtgagtgagg cgcggccccc gggtgtccct ctcaggtgca gggtgagtga 2040 70 ggcgctgtcc ctgggtgtcc ctgtctcgtg tagggtgagt gaggctctgt ccccaggtgt 2100 ccttggcgtt tgctcacttg agcttgctcc tgaatgtttg ctctttctat agccacagct 2160 gcgccggttg cccattgcct gggtagatgg tgcaggcgca gtgctggtcc ccaagcctat 2220 cttttctgat gctcggctct tcttggtcac ctctccgttc cattttgcta cggggacacg 2280 ggactgcagg ctctcgcctc ccgcgtgcca ggcactgcag ccacagcttc aggtccgctt 2340 gcctctgttg ggcctggctt gctcaccacg tgcccgccac atgcatgctg
ccaatactcc 2400 tctcccagct tgtctcatgc cgaggctgga ctctgggctg cctgtgtctg ctgccacgtg 2460 5 ttgctggaga catcccagaa agggttctct gtgccctgaa ggaaagcaag tcaccccagc 2520 cccctcactt gtcctgtttt ctcccaagct gcccctctgc ttggccccct tgggtgggtg 2580 gcaacgcttg tcaccttatt ctgggcacct gccgctcatt gcttaggctg ggctctgcct 2640 ccagtcgccc cctcacatgg attgacgtcc agccacaggt tggagtgtct ctgtctgtct 2700 cctgctctga gacccacgtg gagggccggt gtctccgcca gccttcgtca gacttccctc 2760 10 ttgggtctta gttttgaatt tcactgattt acctctgacg tttctatctc tccattgtat 2820 gctttttctt ggtttattct ttcattcctt ttctagcttc ttagtttagt catgcctttc 2880 cctctaagtg ctgccttacc tgcaccctgt gttttgatgt gaagtaatct caacatcagc 2940 cactttcaag tgttcttaaa atacttcaaa gtgttaatac ttcttttaag tattcttatt 3000 ctgtgatttt tttctttgtg cacgctgtgt tttgacgtga aatcattttg atatcagtga 3060 15 cttttaagta ttctttagct tattctgtga tttctttgag cagtgagtta tttgaacact 3120 gtttatgttc aagatatgta gagtatcaag atacgtagag tattttaagt tatcatttta 3180 ttattgattt ctaactcagt tgtgtagtgg tctgtataat accaattatt tgaagtttgc 3240 ggagccttgc tttgtgatct agtgtgtgca tggtttccag aactgtccat tgtaaatttg 3300 acatcctgtc aatagtgggc atgcatgttc actatatcca gcttattaag gtccagtgca 3360 20 aagcttctgt ctccttctag atgcatgaaa ttccaagaag gaggccatag tccctcacct 3420 gggggatggg tctgttcatt tcttctcgtt tggtagcatt tatgtgaggc attgttaggt 3480 gcatgcacgt ggtagaattt ttatcttcct gatgagtgaa tcttttggag acttctatgt 3540 ctctagtaat ctagtaattc tttttttaaa ttgctcttag tactgccaca ctgggcttct 3600 tttgattagt attttcctgc tgtgtctgtt ttctgccttt aatttatata tatatatata 3660 25 tttttttttt ttttgagaca gagtcttggt ctgtcgccca gggtgagtgc agtggtgtga 3720 tcacaggtca gtgtaacttt taccttctgg cctgagccgt cctctcacct cagcctcctg 3780 agtagctgga actgcagaca cgcaccgcta cacctggcta atttttaaat tttttctgga 3840 gacagggtct tgctgtgttg cccaggctgg tctcaaactc ttggactcaa gggatccatc 3900 tacctcggct tcccaaagtg ctgaattaca ggcatgagcc accatgtctg gcctaatttt 3960 30 caacactttt atattcttat agtgtgggta tgtcctgtta acagcatgta ggtgaatttc 4020 caatccagtc tgacagtcgt tgtttaactg gataacctga 
tttattttca tttttttgtc 4080 actagagacc cgcctggtgc actctgattc tccacttgcc tgttgcatgt cctcgttccc 4140 ttgtttctca ccacctcttg ggttgccatg tgcgtttcct gccgagtgtg tgttgatcct 4200 ctcgttgcct cctggtcact gggcatttgc ttttatttct ctttgcttag tgttaccccc 4260 35 tgatcttttt attgtcgttg tttgcttttg tttattgaga cagtctcact ctgtcaccca 4320 ggctggagtg taatggcaca atctcggctc actgcaacct ctgcctcctc ggttcaagca 4380 gttctcattc ctcaacctca tgagtagctg ggattacagg cgcccaccac cacgcctggc 4440 taatttttgt atttttagta gagataggct ttcaccatgt tggccaggct ggtctcaaac 4500 tcctgacctc aagtgatctg cccgccttgg cctcccacag tgctgggatt acaggtgcaa 4560 40 gccaccgtgc ccggcatacc ttgatctttt aaaatgaagt ctgaaacatt gctacccttg 4620 tcctgagcaa taagaccctt agtgtatttt agctctggcc accccccagc ctgtgtgctg 4680 ttttccctgc tgacttagtt ctatctcagg catcttgaca cccccacaag ctaagcatta 4740 ttaatattgt tttccgtgtt gagtgtttct gtagctttgc ccccgccctg cttttcctcc 4800 tttgttcccc gtctgtcttc tgtctcaggc ccgccgtctg gggtcccctt ccttgtcctt 4860 45 tgcgtggttc ttctgtcttg ttattgctgg taaaccccag ctttacctgt gctggcctcc 4920 atggcatcta gcgacgtccg gggacctctg cttatgatgc acagatgaag atgtggagac 4980 tcacgaggag ggcggtcatc ttggcccgtg agtgtctgga gcaccacgtg gccagcgttc 5040 cttagccagt gagtgacagc aacgtccgct cggcctgggt tcagcctgga aaaccccagg 5100 catgtcgggg tctggtggct ccgcggtgtc gagtttgaaa tcgcgcaaac ctgcggtgtg 5160 50 gcgccagctc tgacggtgct gcctggcggg ggagtgtctg cttcctccct tctgcttggg 5220 aaccaggaca aaggatgagg ctccgagccg ttgtcgccca acaggagcat gacgtgagcc 5280 atgtggataa ttttaaaatt tctaggctgg gcgcggtggc tcacgcctgt aatcccagca 5340 ctttgggagg ccaaggcggg tggatcacga ggtcaggagg tcgagaccat cctggccaac 5400 atgatgaaac cccatctgta ctaaaaacac aaaaattagc tgggcgtggt ggcgggtgcc 5460 55 tgtaatccca gctactcggg aggctgaggc aggagaattg cttgaacctg ggagttggaa 5520 gttgcagtga gccgacattg caccactgca ctccagcctg gcaacacagc gagactctgt 5580 ctcaaaaaaa aaaaaaaaaa aaaaaaaaaa aattctagta gccacattaa aaaagtaaaa 5640 aagaaaaggt gaaattaatg taataataga ttttactgaa gcccagcatg tccacacctc 5700 atcattttag ggtgttattg gtgggagcat 
cactcacagg acatttgaca ttttttgagc 5760 60 tttgtctgcg ggatcccgtg tgtaggtccc gtgcgtggcc atctcggcct ggacctgctg 5820 ggcttcccat ggccatggct gttgtaccag atggtgcagg tccgggatga ggtcgccagg 5880 ccctcagtga gctggatgtg cagtgtccgg atggtgcacg tctgggatga ggtcgccagg 5940 ccctgctgtg agctggatgt gtggtgtctg gatggtgcag gtcaggggtg aggtctccag 6000 gccctcggtg agctggaggt atggagtccg gatgatgcag gtccggggtg aggtcgccag 6060 65 gccctgctgt gagctggatg tgtggtgtct ggatggtgca ggtcaggggt gaggtctcca 6120 ggccctcggt aagctggagg tatggagtcc ggatgatgca ggtccggggt gaggtcgcca 6180 ggccctgctg tgagctggat gtgtggtgtc tggatggtgc aggtctgggg tgaggtcacc 6240 aggccctgcg gtgagctggg tgtgcggtgt ctggatggtg caggtctgga gtgaggtcgc 6300 cagacggtgc cagaccatgc ggtgagctgg atatgcggtg tccggatggt gcaggtctgg 6360 70 ggtgaggttg ccaggccctg ctgtgagttg gatgtggggt gtccggatgc tgcaggtccg 6420 gtgtgaggtc accaggccct gctgtgagct ggatgtgtgg tgtctggatg gtgcaggtct 6480 ggggtgaagg tcgccaggcc cctgcttgtg agctggatgt gtggtgtctg gatggtgcag 6540 gtctggagtg aggtcgccag gccctcggtg agctggatgt gcagtgtcca gatggtgcag 6600 gtccggggtg aggtcgccag accctgcggt gagctggatg tgcggtgtct ggatggtgca 6660 ggtctggagt gaggtcgcca ggccctcggt gagctggatg tatggagtcc ggatggtgcc 6720 ggtccggggt gaggtcgcca gaccctgctg tgagctggat gtgcggtgtc tggatggtac 6780 5 aggtctggag tgaggtcgcc agaccctgct gtgagctgga tatgcggtgt ccggatggtg 6840 caggtcaggg gtgaggtctc caggccctcg gtgagctgga ggtatggagt ccggatgatg 6900 caggtccggg gtgaggtcgc caggccctgc tgtgaactgg atgtgcggcg tctggatggt 6960 gcaggtctgg ggtgtggtcg ccaggccctc ggtgagctgg aggtatggag tccggatgat 7020 gcaggtccgg ggtgaggtcg ccaggccctg ctgtgagctg gatgtgcggc gtctggatgg 7080 10 tgcaggtctg gggtgtggtc gccaggccct cggtgagctg gaggtatgga gtccggatga 7140 tgcaggtccg gggtgaggtt gccaggccct gctgtgagct ggatgtgctg tatccggatg 7200 gtgcagtccg gggtgaggtc gccaggccct gctgtgagct ggatgtgctg tatccggatg 7260 gtgcaggtct ggggtgaggt caccaggccc tgcggtgagc tggttgtgcg gtgtccggtt 7320 gctgcaggtc cggggtgagt tcgccaggcc ctcggtgagc tggatgtgcg gtgtccccgt 7380 15 gtccggatgg tgcaggtcca
gggtgaggtc gctaggccct tggtgggctg gatgtgccgt 7440 gtccggatgg tgcaggtctg gggtgaggtc gccaggcctt tggtgagctg gatgtgcggt 7500 gtctgcatgg tgcaggtctg gggtgaggtc gccaggccct tggtgggctg gatgtgtggt 7560 gtccggatgg tgcaggtccg gcgtgaggtc gccaggccct gctgtgagct ggatgtgcgg 7620 tgtctggatg gtgcaggtcc ggggtgaggt agccaaggcc ttcggtgagc tggatgtggg 7680 20 gtgtccggat ggtgcaggtc cggggtgagg tcgccaggcc ctgcggttag ctggatatgc 7740 ggtgtccgga tggtgcaggt ccggggtgag gtcaccaggc cctgcggtta gctggatgtg 7800 cggtgtctgg atggtgcagg tccggggtga ggtcgccagg ccctgctgtg agctggatgt 7860 gctgtatccg gatggtgcag gtccggggtg aggtcgccag gccctgcagt gagctggatg 7920 tgctgtatcc ggatggtgca ggtctggcgt gaggtcgcca ggccctgcgg ttagctggat 7980 25 atgcggtgtc ggatggtgca ggtccggggt gaggtcacca ggccctgcgg ttagctggat 8040 gtgcggtgtc cggatggtgc aggtctgggg tgaggtcgcc aggccctgct gtgagctgga 8100 tgtgctgtat ccggatggtg caggtccggg gtgaggtcgc caggccctgc ggtgagctgg 8160 atgtgctgta tccggatggt gcaggtctgg cgtgaggtcg ccaggccctg cggtgagctg 8220 gatgtgcagt gtacggatgg tgcaggtccg gggtgaggtc gccaggccct gcggtgggct 8280 30 gtatgtgtgt tgtctggatg gtgcaggtcc ggggtgagtt cgccaggccc tgcggtgagc 8340 tggatgtgtg gtgtctggat gctgcaggtc cggggtgagt tcgccaggcc ctcggtgagc 8400 tggatatgcg gtgtccccgt gtccgaatgg tgcaggtcca gggtgaggtc gccaggccct 8460 tggtgggctg gatgtgccgt gtccggatgg tgcaggtctg gggtgaggtc gccaggccct 8520 tggtgagctg gatgtgcggt gtccggatgg tgcaggtccg gggtgaggtc accaggccct 8580 35 cggtgatctg gatgtggcat gtccttctcg tttaag 8616 <210> 6 <211> 2089 <212> DNA 40 <213> Homo sapiens <400> 6 gtactgtatc cccacgccag gcctctgctt ctcgaagtcc tggaacacca gcccggcctc 60 agcatgcgcc tgtctccact tgcctgtgct tccctggctg tgcagctctg ggctgggagc 120 45 caggggcccc gtcacaggcc tggtccaagt ggattctgtg caaggctctg actgcctgga 180 gctcacgttc tcttacttgt aaaatcagga gtttgtgcca agtggtctct agggtttgta 240 aagcagaagg gatttaaatt agatggaaac actaccacta gcctccttgc ctttccctgg 300 gatgtgggtc tgattctctc tctctttttt ttttcttttt tgagatggag tctcactctg 360 ttgcccaggc tggagtgcag tggcataatc ttggctcact gcaacctcca 
cctcctgggt 420 50 ttaagcgatt caccagcctc agcctcctaa gtagctggga ttacaggcac ctgccaccac 480 gcctggctaa tttttgtact tttaggagag acggggtttc accatgttgg ccaggctggt 540 ctcgaactca tgacctcagg tgatccaccc accttggcct cccaaagtgc tgggtttaca 600 ggctaagcca ccgtgcccag cccccgattc tcttttaatt catgctgttc tgtatgaatc 660 ttcaatctat tggatttagg tcatgagagg ataaaatccc acccacttgg cgactcactg 720 55 cagggagcac ctgtgcaggg agcacctggg gataggagag ttccaccatg agctaacttc 780 taggtggctg catttgaatg gctgtgagat tttgtctgca atgttcggct gatgagagtg 840 tgagattgtg acagattcaa gctggatttg catcagtgag ggacgggagc gctggtctgg 900 gagatgccag cctggctgag cccaggccat ggtattagct tctccgtgtc ccgcccaggc 960 tgactgtgga gggctttagt cagaagatca gggcttcccc agctcccctg cacactcgag 1020 60 tccctggggg gccttgtgac accccatgcc ccaaatcagg atgtctgcag agggagctgg 1080 cagcagacct cgtcagaggt aacacagcct ctgggctggg gaccccgacg tggtgctggg 1140 gccatttcct tgcatctggg ggagggtcag ggctttccct gtgggaacaa gttaatacac 1200 aatgcacctt acttagactt tacacgtatt taatggtgtg cgacccaaca tggtcatttg 1260 accagtattt tggaaagaat ttaattgggg tgaccggaag gagcagacag acgtggtggt 1320 65 ccccaagatg ctccttgtca ctactgggac tgttgttctg cctggggggc cttggaggcc 1380 cctcctccct ggacagggta ccgtgccttt tctactctgc tgggcctgcg gcctgcggtc 1440 agggcaccag ctccggagca cccgcggccc cagtgtccac ggagtgccag gctgtcagcc 1500 acagatgccc aggtccaggt gtggccgctc cagcccccgt gcccccatgg gtggttttgg 1560 gggaaaaggc caagggcaga ggtgtcagga gactggtggg ctcatgagag ctgattctgc 1620 70 tccttggctg agctgccctg agcagcctct cccgccctct ccatctgaag ggatgtggct 1680 ctttctacct gggggtcctg cctggggcca gccttgggct accccagtgg ctgtaccaga 1740 gggacaggca tcctgtgtgg aggggcatgg gttcacgtgg ccccagatgc agcctgggac 1800 caggctccct ggtgctgatg gtgggacagt caccctgggg gttgaccgcc ggactgggcg 1860 tccccagggt tgactatagg accaggtgtc caggtgccct gcaagtagag gggctctcag 1920 aggcgtctgg ctggcatggg tggacgtggc cccgggcatg gccttcagcg tgtgctgccg 1980 tgggtgccct gagccctcac tgagtcggtg ggggcttgtg gcttcccgtg agcttccccc 2040 5 tagtctgttg tctggctgag caagcctcct gaggggctct
ctattgcag 2089 <210> 7 <211> 687 <212> DNA 10 <213> Homo sapiens <400> 7 gtggctgtgc tttggtttaa cttccttttt aaacagaagt gcgtttgagc cccacatttg 60 gtatcagctt agatgaaggg cccggaggag gggccacggg acacagccag ggccatggca 120 15 cggcgccaac ccatttgtgc gcacagtgag gtggccgagg tgccggtgcc tccagaaaag 180 cagcgtgggg gtgtaggggg agctcctggg gcagggacag gctctgagga ccacaagaag 240 cagccgggcc agggcctgga tgcagcacgg cccgaggtcc tggatccgtg tcctgctgtg 300 gtgcgcagcc tccgtgcgct tccgcttacg gggcccgggg accaggccac gactgccagg 360 agcccaccgg gctctgagga tcctggacct tgccccacgg ctcctgcacc ccacccctgt 420 20 ggctgcggtg gctgcggtga ccccgtcatc tgaggagagt gtggggtgag gtggacagag 480 gtgtggcatg aggatcccgt gtgcaacaca catgcggcca ggaacccgtt tcaaacaggg 540 tctgaggaag ctgggagggg ttctaggtcc cgggtctggg tggctgggga cactggggag 600 gggctgcttc tcccctgggt ccctatggtg gggtgggcac ttggccggat ccactttcct 660 25 gactgtctcc catgctgtcc ccgccag 687 <210> 8 <211> 494 <212> DNA 30 <213> Homo sapiens <400> 8 gtgggtgccg gggacccccg tgagcagccc tgctggacct tgggagtggc tgcctgattg 60 gcacctcatg ttgggtggag gaggtactcc tgggtgggcc gcagggagtg caggtgaccc 120 tgtcactgtt gaggacacac ctggcaccta gggtggaggc cttcagcctt tcctgcagca 180 35 catggggccg actgtgcacc ctgactgccc gggctcctat tcccaaggag ggtcccactg 240 gattccagtt tccgtcagag aaggaaccgc aacggctcag ccaccaggcc ccggtgcctt 300 gcaccccagt cctgagccag gggtctcctg tcctgaggct cagagagggg acacagcccg 360 ccctgccctt ggggtctgga gtggtggggg tcagagagag agtgggggac accgccaggc 420 caggccctga gggcagaggt gatgtctgag tttctgcgtg gccactgtca gtctcctcgc 480 40 ctccactcac acag 494 <210> 9 <211> 865 <212> DNA 45 <213> Homo sapiens <400> 9 gtaaggttca cgtgtgatag tcgtgtccag gatgtgtgtc tctgggatat gaatgtgtct 60 50 agaatgcagt cgtgtctgtg atgcgtttct gtggtggagg tacttccatg atttacacat 120 ctgtgatatg cgtgtgtggc acgtgtgtgt cgtggtgcat gtatctgtgg cgtgcatatt 180 tgtggtgtgt gtgtgtgtgg cacgtgtgtg tccatggtgt gtgtgcctgt ggtgtgcatg 240 tgtgtgtgtc tgtgacacgt gcatgttcat gctgtgtgct gcatgtctgt gatgtgccta 300 tttgtggtgt gtgtgtgcat gtgtccgtga catatgcgtg tctatggcat
gggtgtgtgt 360 55 ggccccttgg ccttactcct tcctcctcca ggcatggtcc gcaccattgt cctcacgctc 420 tcgggtgctg gtttggggag ctccacattc agggtcctca cttctagcat gggtgcccct 480 gtcctgtcac agggctgggc cttggagact gtaagccagg tttgagagga gagtagggat 540 gctggtggta ccttcctgga cccctggcac ccccaggacc ccagtctggc ctatgccggc 600 tccatgagat ataggaaggc tgattcaggc ctcgctcccc gggacacact cctcccagag 660 60 cggccggggg ccttggggct cggcaggggt gaaaggggcc ctgggcttgg gttcccaccc 720 agtggtcatg agcacgctgg aggggtaagc cctcaaagtc gtgccaggcc ggggtgcaga 780 ggtgaagaag tatccctgga gcttcggtct ggggagaggc acatgtggaa acccacaagg 840 acctctttct ctgacttctt gagct 865 65 <210> 10 <211> 3782 <212> DNA <213> Homo sapiens 70 <400> 10 tgtgggattg gttttcatgt gtgggatagg tggggatctg tgggattggt ttttatgagt 60 ggggtaacac agagttcaag gcgagctttc ttcctgtagt gggtctgcag gtgctccaac 120 agctttattg aggagaccat atcttccttt gaactatggt cgggtttata gtaagtcagg 180 ggtgtggagg cctcccctgg gctccctgtt ctgtttcttc cactctgggg tcgtgtggtg 240 cctgctgtgg tgtgtggccg gtgggcaggg cttccaggcc tccttgtgtt cattggcctg 300 gatgtggccc tggctacgct ccgtccttgg aattcccctg cgagttggag gctttctttc 360 5 tttctttttt tctttctttt tttttttttt tgataacaga gtctcgctct tttttgccca 420 ggctggagtg gtttggcgtg atcttggctc actgcaacct gtgcttcctg agttcaagca 480 attctcttgc ctcagcctcc caagtagctg gaattatagg cgcccaccac catgctgact 540 aatttttgta attttagtag agacgaggtt tctccatgtt ggccaggctg gtctcgaact 600 cctgacctca ggtgatcctc ccacctcggc ctcccaaagt gctgggatga caggtgtgaa 660 10 ccgccgcgcc cggccgagac tcgcttcctg cagcttccgt gagatctgca gcgatagctg 720 cctgcagcct tggtgctgac aacctccgtt ttccttctcc aggtctcgct aggggtcttt 780 ccatttcatg actctcttca cagaagagtt tcacgtgtgc tgatttcccg gctgtttcct 840 gcgtaattgg tgtctgctgt ttatcgatgg cctccttcca tttcctttag gctttgttta 900 ttgttgtttt tccggctcct tgaaggaaaa gtttcgatta tggatgtttg aactttcttt 960 15 tctaaacaag catctgaagt tgccgttttc cctctaaagc agggatcccg aggcccctgg 1020 ctgtggagtg gcaccggtct ggggcctgtt aggaacccgg cgcacagcgg gaggctaggt 1080 ggggtgtggg gagccagcgt tcccgcctga gccccgcccc
tctcagatca gcagtggcat 1140 gcggtgctca gaggcgcaca caccctactg agaactgtgc gtgagagggg tctagattct 1200 gtgctcctta tgggaatcta atgcctgatg atctgaggtg gaaccgtttg ctcccaaaac 1260 20 catccccttc cccactgctg tcctgtggaa aaatcgtctt ccacgaaacc agtccctggt 1320 accacaatgg ttggggaccc tgtgctaaag acctgcttca gcagcctctc gtcagtgttg 1380 atatattggc ttttctgtgt tgagtccaga ataattacgg atttctgtga tgctttccgc 1440 cgacctcaga cccatgggct atttgtgggc gtgttgcctg ctcctgggtt gggaagggtg 1500 caggccccat gtaccttcct gttactgcct tccaggttgg ttctcagggt tgaatcgtac 1560 25 tcgatgtggt tttagcccac ggccctgccg ccagctcctg ggggctgggg aacatgctga 1620 agcacagagt caccgtgcgc gtcttttgat gcctcacaag ctcgaggcct cctgtgtccg 1680 tgttagtgtg tgtcacgtgc ctgctcacat cctgtcttgg ggacgcaggg gcttagcagg 1740 tcccgtagta aatgacaagc gtcctggggg agtctgcaga ataggaggtg ggggtgccgg 1800 tctctctccc gcgtcttcag actcttctcc tgcctgtgct gtggctgcac ctgcatccct 1860 30 gcaatccctc cagcactggg ctggagaggc ccgggagctc gagtgccact tgtgccacgt 1920 gactgtggat ggcagtcggt cacgggggtc tgatgtgtgg tgactgtgga tggcggttgg 1980 tcacaggggt ctgatgtgtg gtgactgtgg atggcggtcg tggggtctga tgtggtgact 2040 gtggatggcg gtcgtggggt ctgatgtgtg gtgactgtgg atggcggtcg tggggtctga 2100 tgtggtgact gtggatggcg gtcgtggggt ctgatgtggt gactgtggat ggcggtcgtg 2160 35 gggtctgatg tggtgactgt ggatggcagt cgtggggtct gatgtgtggt gactgtggat 2220 ggcggtcgtg gggtctgatg tggtgactgt ggatggcagt cgtggggtct gatgtgtggt 2280 gactgtggat ggcggtcgtg gggtctgatg tgtggtgact gtggatggcg gtcgtggggt 2340 ctgatgtgtg gtgactgtgg atggcggtcg tggggtctga tgtgtggtga ctgtggatgg 2400 cggtcgtggg gtctgatgtg gtgactgtgg atggcggtcg tggggtctga tgtgtggtga 2460 40 ctgtggatgg tgatcggtca caggggtctg atgtgtggtg actgtggatg gcggtcgtgg 2520 ggtctgatgt gtggtgactg tggatggtga tcggtcacag gggtctgatg tgtggtgact 2580 gtggatggcg gtcgtggggt ctgatgtgtg gtgactgtgg atggcggttg gtcccggggg 2640 tctgatgtgt ggtgactgtg gatggcgatc ggtcacaggg gtctgatgtg tggtgactgt 2700 ggatggcggt cgtggggtct gatgtgtggt gactgtggat ggcggtcgtg gggtctgatg 2760 45 tgtggtgact gtggatggcg gtcgtggggt 
ctgatgtggt gactgtggat ggcggtcgtg 2820 gggtctgatg tggtgactgt ggatggcggt cgtggggtct gatgtgtggt gactgtggat 2880 ggcggttggt cccgggggtc tgatgtgtgg tgactgtgga tggcggtcgt ggggtctgat 2940 gtggtgactg tggatggcag tcgtggggtc tgatgtgtgg tgactgtgga tggcggtcgt 3000 ggggtctgat gtgtggtgac tgtggatggc ggtcgtgggg tctgatgtgt ggtgactgtg 3060 50 gatggcggtc gtggggtctg atgtgtggtg actgtggatg gcggtcgtgg ggtctgatgt 3120 ggtgactgtg gatggcggtc gtggggtctg atgtgtggtg actgtggatg gtgatcggtc 3180 acaggggtct gatgtgtggt gactgtggat ggcggtcgtg gggtctgatg tgtggtgact 3240 gtggatggcg gtcgtggggt ctgatgtggt gactgtggat ggcggtcgtg gggtctgatg 3300 tgtggtgact gtggatggcg gtcgtagggt ctgatgtgtg gtgactgtgg atggcagtcg 3360 55 gtcacagggg tctgatgtgt ggtgactgtg gatggcggtc gtggggtctg atgtgtggtg 3420 actgtggatg gcggtcgtgg ggtctgatgt gtggtgactg tggatggcgg tcgtggggtc 3480 tgatgtgtgg tgactgtgga tggcggtcgt ggggtctgat gtggtgactg tggatggtga 3540 tcggtcacag gggtctgatg tgtggtagct gcaggtggag tcccaggtgt gtctgtagct 3600 actttgcgtc ctcggccccc cggcccccgt ttcccaaaca gaagcttccc aggcgctctc 3660 60 tgggcttcat cccgccatcg ggcttggccg caggtccaca cgtcctgatc ggaagaaaca 3720 agtgcccagc tctggccggg gcaggccaca tttgtggctc atgccctctc ctctgccggc 3780 ag 3782 <210> 11 65 <211> 980 <212> DNA <213> Homo sapiens <400> 11 70 gtctgggcac tgccctgcag ggttgggcac ggactcccag cagtgggtcc tcccctgggc 60 aatcactggg ctcatgaccg gacagactgt tggccctggg gggcagtggg gggaatgagc 120 tgtgatgggg gcatgatgag ctgtgtgcct tggcgaaatc tgagctgggc catgccaggc 180 tgcgacagct gctgcattca ggcacctgct cacgtttgac tgcgcggcct ctctccagtt 240 ccgcagtgcc tttgttcatg atttgctaaa tgtcttctct gccagttttg atcttgaggc 300 caaaggaaag gtgtccccct cctttaggag ggcaggccat gtttgagccg tgtcctgccc 360 agctggcccc tcagtgctgg gtctgaggcc aaaggaaacg tgtccccctt cttaggagga 420 5 cgggccgtgt ttgagccacg ccccgctgag cgggcctctc agtgctgggt ctgtccacgt 480 ggccctgtgg ccctttgcag atgtggtctg tccacgtggc cctgtggctc tttgcagatg 540 cctgttagca cttgctcggc tctaggggac agtcgtgtcc accgcatgag gctcagagac 600 ctctgggcga atttccttgg ctcccagggt
gggggtggag gtggcctggg ctgctgggac 660 ccagaccctg tgcccggcag ctgggcagca actcctggat cacatatgcc atccgggcca 720 10 cggtgggctg tgtgggtgtg agcccagctg gacccacagg tggcccagag gagacgttct 780 gtgtcacaca ctctgcctaa gcccatgtgt gtctgcagag actcggcccg gccagcccac 840 gatggccctg cattccagcc cagccccgca cttcatcaca aacactgacc ccaaaaggga 900 cggagggtct tggccacgtg gtcctgcctg tctcagcacc caccggctca ctcccatgtg 960 15 tctcccgtct gctttcgcag 980 <210> 12 <211> 2485 <212> DNA 20 <213> Homo sapiens <400> 12 gtgagtcagg tggccaggtg ccattgccct gcgggtggct gggcgggctg gcagggcttc 60 tgctcacctc tctcctgccc cttccccact gnccttctgc ccggggccac cagagtctcc 120 ttttctggcc cccgccccct ccggctcctg ggctgcaggc tcccgaggcc ccggaaacat 180 25 ggctcggctt gcggcagccg gagcggagca ggtgccacac gaggcctgga aatggcaagc 240 ggggtgtgga gttgctcctg cgtggaggac gaggggcggg gggtgtgtct gggtcaggtg 300 tgcgccgagc gtttgagcct gcagcttgtc agctccaagt tactactgac gctggacacc 360 cggctctcac acgcttgtat ctctctctcc cgatacaaaa ggattttatc cgattctcat 420 tcctgtccct gtcgtgtgac ccccgcgagg gcgcgggctc ttctctctgt gactagattt 480 30 cccatctgga aagtgcgggg ttgaccgtgt agtttgctcc tctcgggggg cctgtggtgg 540 ccatggggca ggcggcctgg gagagctgcc gtcacacagc cactgggtga gccacactca 600 cggtggtaga gccacagtgc ctggtgccac atcacgtcct ctggatttta agtaaaacca 660 cacacctccc ggcaggcatc tgcctgcgac cctgtgtgtg cctggggaga gtggtagcac 720 ggaggaaatt cgtgcacact caaggtcatc agcaaggtca tccgcagtca ggtggaacgt 780 35 ggaggcctct ctctgggatc gtctccagcg gataaaggac tgtgcacagc ttcggaagct 840 tttatttaaa aatataacta ttaattattg cattataagt aatcactaat ggtatcagca 900 attataatat ttattaaagt ataattagaa atattaagta gtacacacgt tctggaaaaa 960 cacaaattgc acatggcagc agagtgaatt ttggccgagg gacacgtgtg cacatgtgtg 1020 taagcggccc ccaggcccac agaattcgct gacaaagtca cctccccaga gaagccacca 1080 40 cgggcctcct tcgtggtcgt gaattttatt aagatggatc aagtcacgta ccgtccacgt 1140 gtggcagggc tttggggaat gtgaggtgat gactgcgtcc tcatgccctg acagacagga 1200 ggtgactgtg tctgtcctgt ccctaggaca cggacaggcc cgaagctcta gtccccatcg 1260 tggtccagtt tggcctctga ataaaaacgt 
cttcaaaacc tgttgcccca aaaactaaga 1320 acagagagag tttcccatcc catgtgctca caggggcgta tctgcttgcg ttgactcgct 1380 45 gggctggccg gactcctaga gttggtgcgt gtgcttctgt gcaaaaagtg cagtcctctt 1440 gcccatcact gtgatatctg caccagcaag gaaagcctct tttcttttct ttcttttttt 1500 ttttttgaga cggaacgtca ctgttgtctg cctgggcttg agtgcagtgg cgcgatctca 1560 actcactgca acctccgcct cccgggttcc agcatttctc ctgcctcagc ctcccgagca 1620 gctgagatta caggcaccca ccccctgcgc ctggctaatt tttgtatttt tagtagagag 1680 50 gggtttttgc catgttggcc aggctggtct cgaactcctg acctcaggtg atccacccac 1740 ctcggcctcc caaagtgctg ggattacagg tgtgagccat cacgcccagc cggaaagcct 1800 ctttttaagg tgaccaccta tagcgcttcc cgaaaataac aggtcttgtt tttgcagtag 1860 gctgcaagcg tctcttagca acaggagtgg cgtcctgtgg gctctgggga tggctgaggg 1920 tcgcgtggca gccatgcctt ctgtgtgcac ctttaggttc cacggggcta ttctgctctc 1980 55 actgtttgtc tgaaaacgca cccttggcat ccttgtttgg agagtttctg cttctcgttg 2040 gtcatgctga aactaggggc aaggttgtat ccgttggcgc gcagcggcta catgtagggt 2100 catgagtctt tcaccgtgga caaattcctt gaaaaaaaaa aaaggagtcc ggttaagcat 2160 tcattccggg tcaagtgtct ggttctgtga ataaactcta agatttaaga aaccttaatg 2220 aaagaaaacc ttgatgattc agagcaagga tgtggtcaca cctgtggctg gatctgtttc 2280 60 agccgcccca gtgcatggtg agagtgggga gcagggattg tttgttcaga ggtctcatct 2340 ggtatgtttc tgaggtgttt gccggctgaa tggtagacgt gtcgtttgtg tgtatgaggt 2400 tctgtgtctg tgtgtggctc ggtttgagtg tacgcatgtc cagcacatgc cctgcccgtc 2460 tctcacctgt gtcttcccgc cccag 2485 65 <210> 13 <211> 1984 <212> DNA <213> Homo sapiens 70 <400> 13 gtgaggcctc ctcttcccca ggggggcttg ggtgggggtt gatttgcttt tgatgcattc 60 agtgttaata ttcctggtgc tctggagacc atgactgctc tgtcttgagg aaccagacaa 120 ggttgcagcc ccttcttggt atgaagccgc acgggagggg ttgcacagcc tgaggactgc 180 gggctccacg caggctctgt ccagcggcca tgtccagagg cctcagggct cagcaggcgg 240 gagggccgct gccctgcatg atgagcatgt gaattcaaca ccgaggaagc acaccagctt 300 ctgtcacgtc acccaggttc cgttagggtc cttggggaga tggggctggt gcagcctgag 360 5 gccccacatc tcccagcagg ccctcgacag gtggcctgga ctgggcgcct cttcagccca
420 ttgcccatcc cacttgcatg gggtctacac ccaaggacgc acacacctaa atatcgtgcc 480 aacctaatgt ggttcaactc agctggcttt tattgacagc agttactttt ttttttttaa 540 tactttaagt tctagggtac atgtgcacga cgtgcaggtt agttacatat gtatacatgt 600 gccatgttgg tgtgctgcac ccattaactc atcatttaca ttaggtatat ctcctaatgc 660 10 tatccctccc cactcccccc atcccatgac aggccctggt gtgtgatgtt ccccaccctg 720 tgtccaagtg ttctcattgt tcagttccca cctgtgagtg agaacatgtg gtgtttggtt 780 ttctttcctt gcaatagttt gctcagagtg atggtttcca gcttcgtcca tgtccctaca 840 aaggacatga actcatcctt ttttatgact gcatagtatt ccgtggtgta tatgtgccac 900 attttcttaa tccagtctat catcgatgga catttgggtt ggttgcaagt ctttgctact 960 15 gtgaatagtg ccgcaataaa catacgtgtg catgtgtctt tatagcagca tgatttataa 1020 tcctttgggt atatacccag taatgggatg gctgggtcaa atggtatttc tagttctaga 1080 tccttgagga atcaccacac tgtcttccac aatggttgaa ctagtttaca ctcccaccaa 1140 cagtgtaaaa gtgttctggt gctggagagg atgtggacag cagttatttt tttatgaaaa 1200 tagtatcact gaacaagcag acagttagtg aaggatgcgt caggaagcct gcaggccaca 1260 20 cagccatttc tctcgaagac tccgggtttt tcctgtgcat cttttgaaac tctagctcca 1320 attatagcat gtacagtgga tcaaggttct tcttcattaa ggttcaagtt ctagattgaa 1380 ataagtttat gtaacagaaa caaaaatttc ttgtacacac aacttgctct gggatttgga 1440 ggaaagtgtc ctcgagctgg cggcacactg gtcagccctc tgggacagga tacctctggc 1500 ccatggtcat ggggcgctgg gcttgggcct gagggtcaca cagtgcacca tgcccagctt 1560 25 cctgtggata ggatctgggt ctcggatcat gctgaggacc acagctgcca tgctggtaaa 1620 gggcaccacg tggctcagag ggggcgaggt tcccagcccc agctttctta ccgtcttcag 1680 ttatttttcc ctaagagtct gagaagtggg gccgcgcctg atggccttcg ttcgtcttca 1740 gctggcacag aattgcacaa gctgatggta aacactgagt acttataatg aatgaggaat 1800 tgctgtagca gttaactgta gagagctcgt ctgttggaaa gaaatttaag tttttcattt 1860 30 aaccgctttg gagaatgtta ctttatttat ggctgtgtaa attgtttgac attcagtccc 1920 tcgtagacag atactacgta aaaagtgtaa agttaacctt gctgtgtatt ttcccttatt 1980 ttag 1984 <210> 14 35 <211> 1871 <212> DNA <213> Homo sapiens <400> 14 40 gtgaggcccg tgccgtgtgt ctgtggggac ctccacagcc tgtgggcttt gcagttgagc 60 
cccccgtgtc ctgcccctgg caccgcagcg ttgtctctgc caagtcctct ctctctgccg 120 gtgctggatc cgcaagagca gaggcgcttg gccgtgcacc caggcctggg ggcgcagggg 180 caccttcggg agggagtggg taccgtgcag gccctggtcc tgcagagacg cacccaggtt 240 acacacgtgg tgagtgcagg cggtgacctg gctcctgctg ctctttggaa agtcaagagt 300 45 ggcggctcct ggggccccag tgagaccccc aggagctgtg cacagggcct gcagggccga 360 ggcggcagcc tcctccccag ggtgcacctg agcctgcgga gagcaggagc tgctgagtga 420 gctggcccac agcgttcgct gcggtcacgt tcctgcgtgg ggttgtttgg gatcggtggg 480 agaatttgga tttgctgagt gctgctgtct tgaaccacgg agatggctag gagtgggttt 540 cagagttgat ttttgtgaat caaactaaaa tcaggcacag gggacctggc ctcagcacag 600 50 gggattgtcc aatgtggtcc ccctcaaggg cgccccacag agccggtggg cttgttttaa 660 agtgcgattt gacgagggac gagaaacctt gaaagctgta aagggaaccc tcagaaaatg 720 tggccgccag gggtggtttc aggtgctttg ctgggctgtg tttgtgaaaa cccatttgga 780 cccgccctcc aagtccaccc tccaggtcca ccctccaggg ccgccctggg ctgggggtat 840 gcctggcgtt ccttgtgccg cagcccggag cacagcaggc tgtgcacatt taaatccact 900 55 aagattcact cggggggagc ccaggtccca agcaactgag ggctcaggag tcctgaggct 960 gctgagggga cagagcagac ggggaacgct gcttctgtgt ggcaagttcc tgagggtgct 1020 ggccagggag gtggctcaga gtgtatgttg gggtcccacc gggggcagaa ctctgtctct 1080 gatgagtcgg cagccatgta acaggaaggg gtggccacag ggagctggga atgcaccagg 1140 ggagctgcgc agctggccga ggtcccaggg ccaggccaca ggaagggcag ggggacgccc 1200 60 ggggccacag cagaggccgc aggaagggaa ggggatgccc aggccagagc agaggctacc 1260 gggcacaggg gggctccctg agctgggtga gcgaggctca tgactcggcg agggaacctc 1320 cttgacgtga agctgacgac tggtgttgcc cagctcacag cccagccagg tcccgcgcct 1380 gagcaggaac tcagaaccct cccctttgtc taaagcacag cagatgcctt cagggcatct 1440 aggagaaaac aggcaaagtc gttgagaaac gtcttaaaag aaggtgggat ggtggcaatt 1500 65 tcttgtccag attttagtct gccccggacc acagatgagt ctataacggg attgtggtgt 1560 tgccatgggg acacatgaga tggaccatca cagaggccac tggggctgca cctcccatct 1620 gagtcctggc tgtcccgggt ccaggccagg ttcttgcatg ctcacctacc tgtcctgccc 1680 gggagacagg gaaagcaccc cgaagtctgg agcagggctg ggtccaggct cctcagagct 1740 cctgccaggc 
ccagcaccct gctccaaatc accacttctc tggggttttc caaagcattt 1800 70 aacaagggtg tcaggttacc tcctgggtga cggccccgca tcctggggct gacattgccc 1860 ctctgcctta g 1871 <210> 15 <211> 3801 <212> DNA 5 <213> Homo sapiens <400> 15 gtgagcgcac ctggccggaa gtggagcctg tgcccggctg gggcaggtgc tgctgcaggg 60 ccgttgcgtc cacctctgct tccgtgtggg gcaggcgact gccaatccca aagggtcaga 120 ggccacaggg tgcccctcgt cccatctggg gctgagcaga aatgcatctt tctgtgggag 180 10 tgagggtgct cacaacggga gcagttttct gtgctatttt ggtaaaagga aatggtgcac 240 cagacctggg tgcactgagg tgtcttcaga aagcagtctg gatccgaacc caagacgccc 300 gggccctgct gggcgtgagt ctctcaaacc cgaacacagg ggccctgctg ggcatgagtc 360 cctctgaacc cgagaccctg gggccctgct gggcgtgagt ctctccgaac ccagagactt 420 cagggccctt ttgggcgtga gtctctccgc tgtgagcccc acactccaag gctcatccac 480 15 agtctacagg atgccatgag ttcatgatca cgtgtgaccc atcaggggac agggccatgg 540 tgtggggggg gtctctacaa aattctgggg tcttgtttcc ccagagcccg agagctcaag 600 gccccgtctc aggctcagac acaaatgaat tgaagatgga cacagatgca gaaatctgtg 660 ctgtttcttt tatgaataaa aagtatcaac attccaggca gggcaaggtg gctcacacct 720 ataatcccag cactttggga ggccgaggtg ggtggatcac ttgaggccag gagtttgagg 780 20 ccaacctaac caacatagtg aaattccatt tctacttaaa aaatacaaaa attagcctgg 840 cctggtggca cacgcctgta gtccccgcta tgcgggaggc tgaggcagga gaatcatttg 900 aacccaggag gcagaggttg cagtgagccg agatcacacc actgcactcc agcctgggca 960 acagagtgag acttcatctt aaaaaaaaaa aaaaaagtat cagcattcca aaaccatagt 1020 ggacaggtgt ttttttattc tgtccttcga taatatttac tggtgctgtg ctagaggccg 1080 25 gaactggggg tgccttcctc tgaaaggcac accttcatgg gaagagaaat aagtggtgaa 1140 tggttgttaa accagaggtt taaactgggg tcctgtcgtt ctgagttaac agtccagatc 1200 tggactttgc ctctttccag aatgctccct ggggtttgct tcatggggga gcagcaggtg 1260 tggacaccct cgtgatgggg gagcagcagg tgcagacgcc ctcatgatgg gggagtggca 1320 ggtgcagaca cccttgtgca tggtgcccag catgtccctg ttgcagctcc ctccccacaa 1380 30 ggatgccggt ctcctgtgct ccecacagtc cctgcttccc tctcacagcc ttacctggtc 1440 ctggcctcca ctggctttgt ctgcatgatt tccacatttc ctgggctccc agcacctctt 1500
cgcctctccc aggcacctct gcagtgctgg ccataccagt cagctgtgaa ctgtccactg 1560 cttattttgc tccccatgaa atgtattttt taggacaggc acccctggtt ccagcctctg 1620 gcacagcatc agtgaatgtt attgaaggac aaaggacaga caaacaaatc aggaaaatgg 1680 35 gttctctcta aacacattgc aaagccacag aggctagtgc aggatgggtg ggcatcaggt 1740 catcagatgt gggtccaatg ccagaatatt ctgtgctccc aaaggccact tggtcagagt 1800 gtgtgcttgc agaggtggct ctaaaagctc agcagtggag gcagtggttc gccatactca 1860 gggtgaactc acatcctctg tgtctgaagt atacagcaga ggcttgaagg gcatctggga 1920 gaagaaaaca ggcaaaatga ttaagaaaag tgaaaaagga aaagtggtaa gatgggaatt 1980 40 ttcttgtcca gattttagtc tcccaaacca cagctcagat ggtagaatgt ggtcagaact 2040 gatggacaga acaatagaac aaaacggaag ccctatctct cagaaacgtg tgttaatgtg 2100 gtatgtggca cagctgatgg aaaagagagt gtgtgtgtaa tttttttttc tgagaaaact 2160 gactggaagc aaataagttg tgtctttaca gcatatacca gagcagattc taggtagaag 2220 aggagacaca tgcaaacaac accagcaaca gaaataaaac aaaagactca aagggaaggg 2280 45 aggtgaacgt tccctggttt ggtgttgggg aaggacacac agggaggcgg atgaaaccag 2340 tgaggcaacg ggcattgctt tcactgcaga gaaactcagc ttgcctgagc cacagtgaaa 2400 atggccattc cctggagcgt ttgtgcacgt gatttattta aggcgccctg tgaggtcctg 2460 cacattcatc ctctcacttt gttctcctaa ccacctgaga ggtagaggag gaaaggctcc 2520 aggggagcag ccgcccttgg tcacccagct ggcaaagggc atgcatgatt gcagcctggc 2580 50 ctcctgctcc ggggcccttg ctctgcccga ggaccccaca caagtcagac ccataggctc 2640 agggtgagcc ggagcccaag gtcgtgttgg ggatggctgt gaaagaagaa atggacgtct 2700 gatgcacact tgggaaggtc ctaccagcag cgtcaaagaa atgcatgtga aactgacagc 2760 gagacccatc cctcaaagaa acgcacgtga aactgatggc gagacctgtc cccatccctc 2820 atgctggctc cttttctggg cttgccaaga gccagcatca ggttgaggca agctggaaag 2880 55 acttttctgg aaagcagctt gtttgcatgg aagtcctcac aatgtcctgt gtcttcccag 2940 taattccact tctgaagtga ccagacatta tcacgggtct tatttaccat ttccagtgtt 3000 ccaggcaggg ggacttgcca cagcaagtca cgaacctgcc caaatacagg gctaaggaga 3060 tattatgcat cacaaaactt gctctgccat taaacatttt tcaaagaatt tttgaagaat 3120 gtttaatggc acaaaacgtt tatttcaatg tagcagtgtt caaagctgga tgtaaaagaa 
3180 60 cacaccccag gagcctgccg tgaatgtcat gtgtgttcat ctttggacat ggacatacat 3240 gggcagtgag tggtggtgag gccctggagg acatcggtgg gatgcctcca tcctgcccct 3300 ctggagacac catgtgtgcc acgtgcactc actggagccc tgtttagctg gtgccacctg 3360 gctcttccat ccctgagatt caaacacagt gagattcccc acgcccaact cagtgttctc 3420 ccacaaaaaa cctgagtcac acctgtgttc actcgaggga cgcccgggag ccagggctcc 3480 65 acagtttatt atgtgttttt ggctgagtta tgtgcagatc tcatcagggc agatgatgag 3540 tgcacaaaca cggccgtgcg aggtttggat acactcaaca tcactagcca ggtcctggtg 3600 gagtttggtc atgcagagtc tggatggcat gtagcatttg gagtccatgg agtgagcacc 3660 cagccccctc gggctgcagc gcatgcccca ggcaggacaa ggaagcggga ggaaggcagg 3720 aggctctttg gagcaagctt tgcaggaggg ggctgggtgt ggggcaggca cctgtgtctg 3780 70 acattccccc ctgtgtctca g 3801 <210> 16 - 73 <211> 880 <212> DNA <213> Homo sapiens 5 <400> 16 gtgagcaggc tgatggtcag cacagagttc agagttcagg aggtgtgtgc gcaagtatgt 60 gtgtgtgtgt gtgcgcgcgt gcctgcaagg ctgatggtga ctggctgcac gtaagagtgc 120 acatgtacgc atatacacgt gagcacatac atgtgtgcat gtgtgtacat gaaggcatgg 180 cagtgtgtgc acaggtgtgc aagggcacaa gtgtgtgcac atgcgaatgc acacctgaca 240 10 tgcatgtgtg ttcgtgcaca gtcgtgtggg cattcacgtg aggtgcatgc gtgtgggtgt 300 gcagtgtgag tagcatgtgt gcacataaca tgtattgagg ggtcctcgtg ttcaccccgc 360 taggtcctca gcaccagtgc cactccttac aggatgagac ggggtcccag gccttggtgg 420 gctgaggctc tgaagctgca gccctgaggg cattgtccca tctgggcatc cgcgtccact 480 ccctctcctg tgggcttctg tgtccactcc ccctctcctg tgggcattta catccactcc 540 15 actccctctc tcctgtgggc atccgcgtcc actccccctc tctgtgggca tctgcgtcca 600 cctcccctct ctgtgggcat ttgcgtccac tccctctcct ggttccttcc tgtcttggcc 660 gagcctcggg ggcaggcaga tgacacagag tcttgactcg cccagggtgg ttcgcagctg 720 ccgggtgagg gccaggccgg atttcactgg gaagagggat agtttcttgt caaaatgttc 780 ctctttcttg ttccatctga atggatgata aagcaaaaag taaaaactta aaatcccaga 840 20 gaggtttcta ccgtttctca ctctttcttg gcgactctag 880 <210> 17 <211> 3186 <212> DNA 25 <213> Homo sapiens <400> 17 gtgagccgcc accaaggggt gcaggcccag cctccaggga ccctccgcgc tctgctcacc 60 tctgacccgg 
ggcttcacct tggaactcct gggttttagg ggcaaggaat gtcttacgtt 120 30 ttcagtggtg ctgctgcctg tgcacagttc tgttcgcgtg gctctgtgca aagcacctgt 180 tctccatctc tgggtagtgg taggagccgg tgtggcccca ggtgtcccca ctgtgcctgt 240 gcactggccg tgggacgtca tggaggccat cccagggcag caggggcatg gggtaaagag 300 atgtttatgg ggagtcttag cagaggaggc tgggaaggtg tctgaacagt agatgggaga 360 tcagatgccc ggaggatttg gggtctcagc aaagagggcc gaggtgggtg caggtgaggg 420 35 tcgctggccc cacccccggg aaggtgcagc agagctgtgg ctccccacac agcccggcca 480 gcacctgtgc tctgggcatg gctgtgctcc tggaacgttc cctgtcctgg ctggtcaggg 540 ggtgcccctg ccaagaatcg acaactttat cacagaggga agggccaatc tgtggaggcc 600 acagggccag cttctgcctg gagtcagggc aggtggtggc acaagcctcg gggctgtacc 660 aaagggcagt cgggcaccac aggcccgggc ctccacctca acaggcctcc cgagccactg 720 40 ggagctgaat gccaggaggc cgaagccctc gccccatgag ggctgagaag gagtgtgagc 780 atttgtgtta cccagggccg aggctgcgcg aattaccgtg cacacttgat gtgaaatgag 840 gtcgtcgtct atcgtggaaa cccagcaagg gctcacggga gagttttcca ttacaaggtc 900 gtaccatgaa aatggttttt aacccgagtg cttgcgcctt catgctctgg cagggagggc 960 agagccacag ctgcatgtta ccgcctttgc accagctcca gaggcttggg accaggctgt 1020 45 ctcagttcca gggtgcgtcc ggctcagacc gccctcctct ctgccttctc tctctgcctc 1080 aaatcttccc tcgtttgcat ctccctgacg cgtgcctggg ccctcgtgca agctgcttga 1140 ctcctttccg gaaacccttg gggtgtgctg gatacaggtg ccactgagga ctggaggtgt 1200 ctgacactgt ggttgacccc agggtccagc tggcgtgctt ggggcctcct tgggccatga 1260 tgaggtcaga ggagttttcc caggtgaaaa ctcctgggaa actcccaggg ccatgtgacc 1320 50 tgccacctgc tcctcccata ttcagctcag tcttgtcctc atttccccac cagggtctct 1380 agctccgagg agctcccgta gagggcctgg gctcagggca gggcggctga gtttccccac 1440 ccatgtgggg acccttgggt agtcgcttga ttgggtagcc ctgaggaggc cgagatgcga 1500 tgggccacgg gccgtttcca aacacagagt caggcacgtg gaaggcccag gaatcccctt 1560 ccctcgaggc aggagtggga gaacggagag ctgggccccg atttcacggc agccaggctg 1620 55 cagtgggcga ggctgtggtg gtccacgtgg cgctgggggc ggggtctgat tcaaatccgc 1680 tggggctcgg ccttcctggc ccgtgctggc cgcgcctcca cacgggcttg gggtggacgc 1740 cccgacctct 
agcaggtggc tatttctccc tttggaagag agcccctcac ccatgctagg 1800 tgtttccctc ctgggtcagg agcgtggccg tgtggcaacc ccgggacctt aggcttattt 1860 atttgtttaa aaacattctg ggcctggctt ccgttgttgc taaatgggga aaagacatcc 1920 60 cacctcagca gagttactga gaggctgaaa ccggggtgct ggcttgactg gtgtgatctc 1980 aggtcattcc agaagtggct caggaagtca gtgagaccag gtacatgggg ggctcaggca 2040 gtgggtgaga tgaggtacac ggggggctca ggcagtgggt gaggccaggt acatgggggg 2100 ctcaggcact gggtgagatg aggtacacgg ggggctcagg cagagggtca gaccaggtac 2160 acgggggctc tgatcacacg cacatatgag cacatgtgca catgtgctgt ttcatggtag 2220 65 ccaggtctgt gcacacctgc cccaaagtcc caggaagctg agaggccaaa gatggaggct 2280 gacagggctg gcgcggtggc tcacacctgt agtcccagca ctttgggagg ccgaggcgag 2340 aggatccctt gagcccagga gtttaagacc agcctgagca acatagtaga accccatctc 2400 tatgaaaaat aaaaacaaaa attagctgaa catggtggtg tgcgcctgta gttccaatac 2460 ttgggaggct gaagtgggag gatcacttga gcccaggagg tggaagctgc agtgagctga 2520 70 gattgcacca ctgtactgca gcctgggtga cagagtgaga gcccatctca acaacaacaa 2580 agaagactga caaatgcagt ttcttggaaa gaaacattta gtaggaactt aacctacaca 2640 cagaagccaa gtcggtgtct cggtgtcagt gagatgagat gatgggtcct cacaccatca 2700 -4&7 - 74 ccccagaccc agggtttatg caccacaggg gcgggtggct cagaagggat gcgcaggacg 2760 ttgatatacg atgacatcaa ggttgtctga cgaagggcag gattcatgat aagtacctgc 2820 tggtacacaa ggaacaatgg ataaactgga aaccttagag gccttcccgg aacaggggct 2880 aatcagaagc cagcatgggg ggctggcatc caggatggag ctgcttcagc ctccacatgc 2940 5 gtgttcatac agatggtgca cagaaacgca gtgtacctgt gcacacacag acacgcagct 3000 actcgcacac acaagcacac acacagacat gcatgcatgc atccgtgtgt gtgcacctgt 3060 gcccatgagg aaacccatgc atgtgcattc atgcacgcac acaggcaccg gtgggcccat 3120 gcccacaccc acgagcaccg tctgattagg aggcctttcc tctgacgctg tccgccatcc 3180 10 tctcag 3186 <210> 18 <211> 781 <212> DNA 15 <213> Homo sapiens <400> 18 gtatgtgcag gtgcctggcc tcagtggcag cagtgcctgc ctgctggtgt tagtgtgtca 60 ggagactgag tgaatctggg cttaggaagt tcttacccct tttcgcatca ggaagtggtt 120 taacccaacc actgtcaggc tcgtctgccc gccctctcgt ggggtgagca gagcacctga 180 
20 tggaagggac aggagctgtc tgggagctgc catccttccc accttgctct gcctggggaa 240 gcgctggggg gcctggtctc tcctgtttgc cccatggtgg gatttggggg gcctggcctc 300 tcctgtttgc cctgtggtgg gattgggctg tctcccgtcc atggcactta gggcccttgt 360 gcaaacccag gccaagggct taggaggagg ccaggcccag gctaccccac ccctctcagg 420 agcagaggcc gcgtatcacc acgacagagc cccgcgccgt cctctgcttc ccagtcaccg 480 25 tcctctgccc ctggacactt tgtccagcat cagggaggtt tctgatccgt ctgaaattca 540 agccatgtcg aacctgcggt cctgagctta acagcttcta ctttctgttc tttctgtgtt 600 gtggaaattt cacctggaga agccgaagaa aacatttctg tcgtgactcc tgcggtgctt 660 gggtcgggac agccagagat ggagccaccc cgcagaccgt cgggtgtggg cagctttccg 720 30 gtgtctcctg ggaggggagc tgggctgggc ctgtgactcc tcagcctctg ttttccccca 780 9 781 <210> 19 <211> 536 35 <212> DNA <213> Homo sapiens <400> 19 gcaagtgtgg gtggaggcca gtgcgggccc cacctgccca ggggtcatcc ttgaacgccc 60 40 tgtgtggggc gagcagcctc agatgctgct gaagtgcaga cgcccccggg cctgaccctg 120 ggggcctgga gccacgctgg cagccctatg tgattaaacg ctggtgtccc caggccacgg 180 agcctggcag ggtccccaac ttcttgaacc cctgcttccc atctcagggg cgatggctcc 240 ccacgcttgg gagccttctg acccctgacc tgtgtcctct cacagcctct tccctggctg 300 ctgccctgag ctcctggggt cctgagcaag ttctctcccc gccccgccgc tccagcgtca 360 45 ctgggctgcc tgtctgctcg ccccggtgga ggggtgtctg tcccttcact gaggttccca 420 ccagccaggg ccacgaggtg caggccctgc ctgcccggcc acccacacgt cctaggaggg 480 ttggaggatg ccacctctgg cctcttctgg aacggagtct gattttggcc ccgcag 536 <210> 20 50 <211> 3179 <212> DNA <213> Homo sapiens <400> 20 55 atctcatgtt tgaatcctaa tgtgcactgc atagacacca ctgtatgcaa ttacagaagc 60 ctgtgagtga acggggtggt ggtcagtgcg ggcccatggc ctggctgtgc atttacggaa 120 gtctatgagt gaatggggtt gtggtcagtg cgggcccatg gcctggctgg gcctgggagg 180 tttctgatgc tgtgaggcag gaggggaagg agggtagggg atagacagtg ggagccccca 240 ccctggaaga cataacagta agtccaggcc cgaagggcag cagggatgct gggggcccag 300 60 cttgggcggc ggggatgatg gagggcctgg ccagggtggc agggatgatg ggggccccag 360 ctggggtggc aggggtgatg gggggggctg gtctgggtgg cggggaagat ggggaagcct 420 ggctgggccc cctcctcccc tgcctcccac 
ctgcagccgt ggatccggat gtgcttccct 480 ggtgcacatc ctctgggcca tcagctttca tggaggtggg gggcaggggc atgacaccat 540 cctgtataaa atccaggatt cctcctcctg aacgccccaa ctcaggttga aagtcacatt 600 65 ccgcctctgg ccattctctt aagagtagac caggattctg atctctgaag ggtgggtagg 660 gtggggcagt ggagggtgtg gacacaggag gcttcagggt ggggctggtg atgctctctc 720 atcctcttat catctcccag tctcatctct catcctctta tcatctccca gtctcatctg 780 tcttcctctt atctcccagt ctcatctgtc atcctcttac catctcccag tctcatctct 840 tatcctctta tctcctagtc tcatccagac ttacctccca gggcgggtgc caggctcgca 900 70 gtggagctgg acatacgtcc ttcctcaggc agaaggaact ggaaggattg cagagaacag 960 gaggggcggc tcagagggac gcagtcttgg ggtgaagaaa cagcccctcc tcagaagttg 1020 R4 gcttgggcca cacgaaaccg agggccctgc gtgagtggct ccagagcctt ccagcaggtc 1080 39 - 75 cctggtgggg ccttatggta tggccgggtc ctactgagtg caccttggac agggcttctg 1140 gtttgagtgc agcccggacg tgcctggtgt cggggtgggg gcttatggcc actggatatg 1200 gcgtcattta ttgctgctgc ttcagagaat gtctgagtga ccgagcctaa tgtgtatggt 1260 gggcccaagt ccacagactg tgtcgtaaat gcactctggt gcctggagcc cccgtatagg 1320 5 agctgtgagg aaggaggggc tcttggcagc cggcctgggg gcgcctttgc cctgcaaact 1380 ggaagggagc ggccccgggc gccgtgggcg gacgacctca agtgagaggt tggacagaac 1440 agggcgggga cttcccagga gcagaggccg ctgctcaggc acacctgggt ttgaatcaca 1500 gaccaacagg tcaggccatt gttcagctat ccatcttcta caaagctcca gattcctgtt 1560 tctccgggtg ttttttgttg aaattttact caggattact tatatttttt gctaaagtat 1620 10 tagaccctta aaaaaggtat ttgctttgat atggcttaac tcactaagca cctactttat 1680 ttgtctgttt ttatttatta ttattattat tattagagat ggtgtctact ctgtcaccca 1740 ggttgttagt gcagtggcac agtcatggct cgctgtagcc gcaaaccccc aggctcaagt 1800 gatcctccgg cctcagcttc ccagagtgct gggattacag gtgtgagcca ctgcccttgc 1860 ctggcacttt taaaaaccac tatgtaaggt caggtccagt ggcttccaca cctgtcatcc 1920 15 cagtagtttg ggaagccgag gcagaaggat tgtctgaggc caggagtttg agaccagcat 1980 gggtaacata gggagacccc atctctacaa aaaatgcaaa aagttatccg ggcgtggggt 2040 ccagcatctg tagtcccagc tgctcgggag gctgagtggg aggatcgctt gagcccggga 2100 ggtcatggct gcagtgagct 
gtgattgtac catcgcactc cagcctgggc aacagagtga 2160 gaccctgtct caaaaaaaaa aaaaaaaaaa gaaggagaag gagaagagaa gaagaaggaa 2220 20 gaaggaaaga gaagaagaag gaagaaggaa gaaagaagga gaaggaggcc tgctaggtgc 2280 taggtagact gtcaaatctc agagcaaaat gaaaataaca aagttttaaa gggaaagaaa 2340 aaccccagct ctttggactt ccttaggcct gaacttcatc tcaagcagct tccttccaca 2400 gacaagcgtg tatggagcga gtgagttcaa agcagaaagg gaggagaagc aggcaagggt 2460 ggaggctgtg ggtgacacca gccaggaccc ctgaaaggga gtggttgttt tcctgcctca 2520 25 gccccacgct cctgccggtc ctgcacctgc tgtaaccgtc gatgttggtg ccaggtgccc 2580 acctgggaag gatgctgtgc agggggcttg ccaaactttg gtgggtttca gaagccccag 2640 gcacttgtgg caggcacaat tacagcccct ccccaaagat gcccacgtcc ttctcctgga 2700 acctgtgaat gtgtcacccg caaggcagag gctggtgaag gctgcaggtg gaatcacggc 2760 tgccagtcag ccgatcttaa ggtcatcctg gattatctgg tgggcctgat atggccacaa 2820 30 gggtccctag aagtgagaga gggaggcagg ggagagtcag agaggggacg tgagaaggac 2880 cactggccac tgctggcttt gagatggagg agggggtccc cagccaagga atgggggcag 2940 ccgctccatg ctggaaaagc aagcaatcct ccccggtcct gagggcacac ggccctgccc 3000 acgcctcgat ttcaggccag tgggacctgt ttcagctttc cggcctccag agctgtaaga 3060 tgatgcgttt gtgttcagcc actaagctgc agtgattcgt cacagcagca aatggaatag 3120 35 cagtacaggg aaatgaatac agggacagtt ctcagagtga ctctcagccc acccctggg 3179 EDITORIAL NOTE FOR APPLICATION NO. 22729/99 THE FOLLOWING SEQUENCE LISTING IS PART OF THE DESCRIPTION THE CLAIMS BEGIN ON PAGE 76
Claims (12)
1. Regulatory DNA sequences for the gene for the human catalytic telomerase subunit.
2. DNA sequences according to Claim 1, characterized in that the sequences are intron sequences in accordance with SEQ ID NO 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19 and/or 20 or fragments of these sequences which have a regulatory effect.
3. DNA sequences according to Claim 1, characterized in that the sequences are the 5'-flanking regulatory DNA sequence for the gene for the human catalytic telomerase subunit as depicted in Fig. 10 (SEQ ID NO 3), or fragments of this DNA sequence which have a regulatory effect.
4. Recombinant construct which contains a DNA sequence according to one of Claims 1 to 3.
5. Recombinant construct according to Claim 4, characterized in that it additionally contains one or more DNA sequences which encode polypeptides or proteins.
6. Vector which contains a recombinant construct according to Claim 4 or 5.
7. Use of recombinant constructs or vectors according to one of Claims 4 to 6 for preparing medicaments.
8. Recombinant host cells which harbour recombinant constructs or vectors according to one of Claims 4 to 6.
9. Process for identifying substances which affect the promoter activity, silencer activity or enhancer activity of the human catalytic telomerase subunit, comprising the following steps:
A. adding a candidate substance to a host cell which harbours DNA sequences according to one of Claims 1 to 3, which sequences are functionally linked to a reporter gene, and
B. measuring the effect of the substance on expression of the reporter gene.
10. Process for identifying factors which bind specifically to the DNA according to one of Claims 1 to 3, or to fragments thereof, characterized in that an expression cDNA library is screened using a DNA sequence according to one of Claims 1 to 3, or subfragments of widely differing length, as the probe.
11. Transgenic animals which harbour recombinant constructs or vectors according to Claims 4 to 6.
12. Process for detecting telomerase-associated conditions in a patient, comprising the following steps:
A. incubating a recombinant construct or vector according to Claims 4 to 6, which additionally contains a reporter gene, with body fluids or cell samples,
B. detecting the activity of the reporter gene in order to obtain a diagnostic value, and
C. comparing the diagnostic value with standard values for the reporter gene construct in standardized normal cells or body fluids of the same type as the test sample.
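The measurement step of Claim 9 and the comparison step of Claim 12 can be illustrated with a minimal sketch. The claims do not prescribe any particular statistic, so everything here is an assumption made for illustration only: the function names `fold_change` and `compare_with_standard`, the use of a ratio of mean reporter signals, the z-score against the standard values, and the cutoff of 2.0 are all hypothetical choices.

```python
from statistics import mean, stdev


def fold_change(treated, untreated):
    """Ratio of mean reporter signal with vs. without the candidate substance.

    Illustrates Claim 9, step B. Taking a ratio of means is an assumed readout.
    """
    return mean(treated) / mean(untreated)


def compare_with_standard(diagnostic_value, standard_values, z_cutoff=2.0):
    """Compare a diagnostic value with standard values (Claim 12, step C).

    Returns (z_score, flagged). The z-score and the cutoff are assumptions;
    the claim only requires a comparison with standard values.
    """
    mu = mean(standard_values)
    sigma = stdev(standard_values)  # sample standard deviation, needs >= 2 values
    z_score = (diagnostic_value - mu) / sigma
    return z_score, abs(z_score) > z_cutoff


if __name__ == "__main__":
    # Hypothetical reporter readings in arbitrary units.
    print(fold_change([200, 220], [100, 110]))
    print(compare_with_standard(150, [100, 110, 90]))
```

In practice the standard values would come from standardized normal cells or body fluids of the same type as the test sample, as Claim 12 requires; the numbers above are placeholders, not measured data.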
Applications Claiming Priority (3)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
DE19757984 | 1997-12-24 | ||
DE1997157984 DE19757984A1 (en) | 1997-12-24 | 1997-12-24 | Regulatory DNA sequences from the 5 'region of the gene of the human catalytic telomerase subunit and their diagnostic and therapeutic use |
PCT/EP1998/008216 WO1999033998A2 (en) | 1997-12-24 | 1998-12-22 | Regulatory dna sequences of the human catalytic telomerase sub-unit gene, diagnostic and therapeutic use thereof |
Publications (2)
Publication Number | Publication Date |
---|---|
AU2272999A (en) | 1999-07-19 |
AU742489B2 (en) | 2002-01-03 |
Family
ID=7853458
Family Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
AU22729/99A (Ceased; AU742489B2 (en)) | 1997-12-24 | 1998-12-22 | Regulatory DNA sequences of the human catalytic telomerase sub-unit gene, diagnostic and therapeutic use thereof |
Country Status (6)
Country | Link |
---|---|
EP (1) | EP1040195A2 (en) |
JP (1) | JP2003519462A (en) |
AU (1) | AU742489B2 (en) |
CA (1) | CA2316282A1 (en) |
DE (1) | DE19757984A1 (en) |
WO (1) | WO1999033998A2 (en) |
Families Citing this family (20)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US6610839B1 (en) | 1997-08-14 | 2003-08-26 | Geron Corporation | Promoter for telomerase reverse transcriptase |
US6808880B2 (en) | 1996-10-01 | 2004-10-26 | Geron Corporation | Method for detecting polynucleotides encoding telomerase |
US6777203B1 (en) | 1997-11-19 | 2004-08-17 | Geron Corporation | Telomerase promoter driving expression of therapeutic gene sequences |
US6093809A (en) | 1996-10-01 | 2000-07-25 | University Technology Corporation | Telomerase |
GB2321642B8 (en) | 1996-10-01 | 2006-08-22 | Geron Corp | Human telomerase reverse transcriptase promoter |
US6475789B1 (en) | 1996-10-01 | 2002-11-05 | University Technology Corporation | Human telomerase catalytic subunit: diagnostic and therapeutic methods |
US7585622B1 (en) | 1996-10-01 | 2009-09-08 | Geron Corporation | Increasing the proliferative capacity of cells using telomerase reverse transcriptase |
US6261836B1 (en) | 1996-10-01 | 2001-07-17 | Geron Corporation | Telomerase |
US7413864B2 (en) | 1997-04-18 | 2008-08-19 | Geron Corporation | Treating cancer using a telomerase vaccine |
US7622549B2 (en) | 1997-04-18 | 2009-11-24 | Geron Corporation | Human telomerase reverse transcriptase polypeptides |
US7378244B2 (en) | 1997-10-01 | 2008-05-27 | Geron Corporation | Telomerase promoters sequences for screening telomerase modulators |
ES2220448T3 (en) * | 1999-02-04 | 2004-12-16 | Geron Corporation | Regulatory sequences for the transcription of telomerase reverse transcriptase |
DE19947668A1 (en) * | 1999-10-04 | 2001-04-19 | Univ Eberhard Karls | Tumor-specific vector for gene therapy |
DE10019195B4 (en) * | 2000-04-17 | 2006-03-09 | Heart Biosystems Gmbh | Reversible immortalization |
US6686159B2 (en) | 2000-08-24 | 2004-02-03 | Sierra Sciences, Inc. | Methods and compositions for modulating telomerase reverse transcriptase (TERT) expression |
AU2002235141A1 (en) | 2000-11-27 | 2002-06-03 | Geron Corporation | Glycosyltransferase vectors for treating cancer |
US6576464B2 (en) | 2000-11-27 | 2003-06-10 | Geron Corporation | Methods for providing differentiated stem cells |
US7211435B2 (en) | 2001-06-21 | 2007-05-01 | Sierra Sciences, Inc. | Telomerase expression repressor proteins and methods of using the same |
AU2002363231A1 (en) * | 2001-10-29 | 2003-05-12 | Baylor College Of Medicine | Human telomerase reverse transcriptase as a class-ii restricted tumor-associated antigen |
US8163892B2 (en) | 2002-07-08 | 2012-04-24 | Oncolys Biopharma, Inc. | Oncolytic virus replicating selectively in tumor cells |
Family Cites Families (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
GB2321642B8 (en) * | 1996-10-01 | 2006-08-22 | Geron Corp | Human telomerase reverse transcriptase promoter |
- 1997
  - 1997-12-24 DE DE1997157984 patent/DE19757984A1/en not_active Withdrawn
- 1998
  - 1998-12-22 JP JP2000526653A patent/JP2003519462A/en not_active Withdrawn
  - 1998-12-22 EP EP98966334A patent/EP1040195A2/en not_active Withdrawn
  - 1998-12-22 CA CA002316282A patent/CA2316282A1/en not_active Abandoned
  - 1998-12-22 WO PCT/EP1998/008216 patent/WO1999033998A2/en not_active Application Discontinuation
  - 1998-12-22 AU AU22729/99A patent/AU742489B2/en not_active Ceased
Also Published As
Publication number | Publication date |
---|---|
AU742489B2 (en) | 2002-01-03 |
WO1999033998A3 (en) | 1999-08-19 |
WO1999033998A2 (en) | 1999-07-08 |
EP1040195A2 (en) | 2000-10-04 |
JP2003519462A (en) | 2003-06-24 |
DE19757984A1 (en) | 1999-07-01 |
CA2316282A1 (en) | 1999-07-08 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
AU742489B2 (en) | Regulatory DNA sequences of the human catalytic telomerase sub-unit gene, diagnostic and therapeutic use thereof | |
KR101441843B1 (en) | Conditionally immortalized long-term stem cells and methods of making and using such cells | |
ES2792126T3 (en) | Treatment method based on polymorphisms of the KCNQ1 gene | |
US20020102686A1 (en) | Inactive variants of the human telomerase catalytic subunit | |
AU745420B2 (en) | Human catalytic telomerase sub-unit and its diagnostic and therapeutic use | |
US20050032094A1 (en) | Regulatory DNA sequences of the human catalytic telomerase sub-unit gene, diagnostic and therapeutic use thereof | |
AU744188C (en) | Human growth gene and short stature gene region | |
US6706511B2 (en) | Isolated human kinase proteins | |
US6808911B2 (en) | Isolated human kinase proteins, nucleic acid molecules encoding human kinase proteins, and uses thereof | |
US6426206B1 (en) | Isolated human kinase proteins, nucleic acid molecules encoding human kinase proteins, and uses thereof | |
US6653117B2 (en) | Isolated human kinase proteins | |
US6387677B1 (en) | Nucleic acid molecules encoding human calcium/calmodulin (CaMK) dependent kinase proteins | |
US6410294B1 (en) | Isolated human kinase proteins, nucleic acid molecules encoding human kinase proteins, and uses thereof | |
US6753175B2 (en) | Isolated human kinase proteins, nucleic acid molecules encoding human kinase proteins, and uses thereof | |
US20040014193A1 (en) | Isolated human kinase proteins, nucleic acid molecules encoding human kinase proteins, and uses thereof | |
US20040220387A1 (en) | Methods | |
US20030119037A1 (en) | Isolated human kinase proteins, nucleic acid molecules encoding human kinase proteins, and uses thereof | |
CA2422549A1 (en) | Isolated human kinase proteins, nucleic acid molecules encoding human kinase proteins, and uses thereof | |
CA2440575A1 (en) | Isolated human kinase proteins, nucleic acid molecules encoding human kinase proteins, and uses thereof | |
US20030228595A1 (en) | Isolated human kinase proteins, nucleic acid molecules encoding human kinase proteins, and uses thereof |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
FGA | Letters patent sealed or granted (standard patent) |