US20030130215A1 - Isolated genomic polynucleotide fragments from chromosome 7 - Google Patents

Isolated genomic polynucleotide fragments from chromosome 7

Info

Publication number
US20030130215A1
Authority
US
United States
Prior art keywords
polynucleotide
seq
polypeptide
sequence
human
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Abandoned
Application number
US09/957,956
Inventor
James Ryan
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Ryogen LLC
Original Assignee
Individual
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Individual
Priority to US09/957,956 (US20030130215A1)
Publication of US20030130215A1
Priority to US10/642,946 (US7588915B2)
Assigned to RYOGEN LLC. Assignment of assignors interest (see document for details). Assignors: RYAN, JAMES W.
Priority to US12/533,105 (US8313899B2)
Priority to US12/533,164 (US8313900B2)
Priority to US12/533,087 (US8178662B2)
Priority to US12/533,130 (US8323884B2)
Priority to US13/680,178 (US8795959B2)
Priority to US13/680,223 (US8822145B2)
Priority to US13/680,203 (US20130130251A1)
Legal status: Abandoned (current)

Classifications

    • C CHEMISTRY; METALLURGY
    • C07 ORGANIC CHEMISTRY
    • C07K PEPTIDES
    • C07K14/00 Peptides having more than 20 amino acids; Gastrins; Somatostatins; Melanotropins; Derivatives thereof
    • C07K14/435 Peptides having more than 20 amino acids; Gastrins; Somatostatins; Melanotropins; Derivatives thereof from animals; from humans
    • C07K14/705 Receptors; Cell surface antigens; Cell surface determinants
    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12Q MEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
    • C12Q1/00 Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions
    • C12Q1/68 Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions involving nucleic acids
    • C12Q1/6876 Nucleic acid products used in the analysis of nucleic acids, e.g. primers or probes
    • A HUMAN NECESSITIES
    • A61 MEDICAL OR VETERINARY SCIENCE; HYGIENE
    • A61P SPECIFIC THERAPEUTIC ACTIVITY OF CHEMICAL COMPOUNDS OR MEDICINAL PREPARATIONS
    • A61P43/00 Drugs for specific purposes, not provided for in groups A61P1/00-A61P41/00
    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12N MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N9/00 Enzymes; Proenzymes; Compositions thereof; Processes for preparing, activating, inhibiting, separating or purifying enzymes
    • C12N9/10 Transferases (2.)
    • C12N9/12 Transferases (2.) transferring phosphorus containing groups, e.g. kinases (2.7)
    • C12N9/1205 Phosphotransferases with an alcohol group as acceptor (2.7.1), e.g. protein kinases
    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12N MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N9/00 Enzymes; Proenzymes; Compositions thereof; Processes for preparing, activating, inhibiting, separating or purifying enzymes
    • C12N9/10 Transferases (2.)
    • C12N9/12 Transferases (2.) transferring phosphorus containing groups, e.g. kinases (2.7)
    • C12N9/1241 Nucleotidyltransferases (2.7.7)
    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12N MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N9/00 Enzymes; Proenzymes; Compositions thereof; Processes for preparing, activating, inhibiting, separating or purifying enzymes
    • C12N9/14 Hydrolases (3)
    • C12N9/48 Hydrolases (3) acting on peptide bonds (3.4)
    • A HUMAN NECESSITIES
    • A61 MEDICAL OR VETERINARY SCIENCE; HYGIENE
    • A61K PREPARATIONS FOR MEDICAL, DENTAL OR TOILETRY PURPOSES
    • A61K38/00 Medicinal preparations containing peptides
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y10 TECHNICAL SUBJECTS COVERED BY FORMER USPC
    • Y10T TECHNICAL SUBJECTS COVERED BY FORMER US CLASSIFICATION
    • Y10T436/00 Chemistry: analytical and immunological testing
    • Y10T436/14 Heterocyclic carbon compound [i.e., O, S, N, Se, Te, as only ring hetero atom]
    • Y10T436/142222 Hetero-O [e.g., ascorbic acid, etc.]
    • Y10T436/143333 Saccharide [e.g., DNA, etc.]

Definitions

  • the invention is directed to isolated genomic polynucleotide fragments that encode human SNARE YKT6, human liver glucokinase, human adipocyte enhancer binding protein (AEBP1) and DNA directed 50 kD regulatory subunit (POLD2), vectors and hosts containing these fragments and fragments hybridizing to noncoding regions as well as antisense oligonucleotides to these fragments.
  • the invention is further directed to methods of using these fragments to obtain SNARE YKT6, human liver glucokinase, AEBP1 protein and POLD2 and to diagnose, treat, prevent and/or ameliorate a pathological disorder.
  • Chromosome 7 contains genes encoding, for example, epidermal growth factor receptor, collagen-1-Alpha-1-chain, SNARE YKT6, human liver glucokinase, human adipocyte enhancer binding protein and DNA polymerase delta small subunit (POLD2).
  • SNARE YKT6, human liver glucokinase, human adipocyte enhancer binding protein and DNA polymerase delta small subunit (POLD2) are discussed in further detail below.
  • SNARE YKT6, a substrate for prenylation, is essential for vesicle-associated endoplasmic reticulum-Golgi transport (McNew, J. A. et al. J. Biol. Chem. 272, 17776-17783, 1997). It has been found that depletion of this function stops cell growth and manifests a transport block at the endoplasmic reticulum level.
  • adipocyte-enhancer binding protein is a transcriptional repressor having carboxypeptidase B-like activity which binds to a regulatory sequence (adipocyte enhancer 1, AE-1) located in the proximal promoter region of the adipose P2 (aP2) gene, which encodes the adipocyte fatty acid binding protein (Muise et al., 1999, Biochem. J. 343:341-345).
  • B-like carboxypeptidases remove C-terminal arginine and lysine residues and participate in the release of active peptides, such as insulin, alter receptor specificity for polypeptides and terminate polypeptide activity (Skidgel, 1988, Trends Pharmacol.
  • Full-length cDNA clones encoding AEBP1 have been isolated from human osteoblast and adipose tissue (Ohno et al., 1996, Biochem. Biophys. Res. Commun. 228:411-414). Two forms have been found to exist due to alternative splicing. This gene appears to play a significant role in regulating adipogenesis. In addition to playing a role in obesity, adipogenesis may play a role in osteopenic disorders. It has been postulated that adipogenesis inhibitors may be used to treat osteopenic disorders (Nuttal et al., 2000, Bone 27:177-184).
  • DNA polymerase delta core is a heterodimeric enzyme with a catalytic subunit of 125 kD and a second subunit of 50 kD and is an essential enzyme for DNA replication and DNA repair (Zhang et al., 1995, Genomics 29:179-186).
  • cDNAs encoding the small subunit have been cloned and sequenced.
  • the gene for the small subunit has been localized to human chromosome 7 via PCR analysis of a panel of human-hamster hybrid cell lines. However, the genomic DNA has not been isolated and the exact location on chromosome 7 has not been determined.
  • although cDNAs encoding the above-disclosed proteins have been isolated, their location on chromosome 7 has not been determined. Furthermore, genomic DNA encoding these polypeptides has not been isolated. Noncoding sequences can play a significant role in regulating the expression of polypeptides as well as the processing of RNA encoding these polypeptides.
  • the invention is directed to an isolated genomic polynucleotide, said polynucleotide obtainable from human chromosome 7 having a nucleotide sequence at least 95% identical to a sequence selected from the group consisting of:
  • the polynucleotides of the present invention may be used for the manufacture of a gene therapy for the prevention, treatment or amelioration of a medical condition by adding an amount of a composition comprising said polynucleotide effective to prevent, treat or ameliorate said medical condition.
  • the invention is further directed to obtaining these polypeptides by
  • polypeptides obtained may be used to produce antibodies by
  • step (b) immunizing a host animal with said polypeptide or peptide-carrier protein conjugate of step (b) with an adjuvant and
  • the invention is further directed to polynucleotides that hybridize to noncoding regions of said polynucleotide sequences as well as antisense oligonucleotides to these polynucleotides as well as antisense mimetics.
  • the antisense oligonucleotides or mimetics may be used for the manufacture of a medicament for prevention, treatment or amelioration of a medical condition.
  • the invention is further directed to kits comprising these polynucleotides and kits comprising these antisense oligonucleotides or mimetics.
  • the noncoding regions are transcription regulatory regions.
  • the transcription regulatory regions may be used to produce a heterologous peptide by expressing in a host cell, said transcription regulatory region operably linked to a polynucleotide encoding the heterologous polypeptide and recovering the expressed heterologous polypeptide.
  • polynucleotides of the present invention may be used to diagnose a pathological condition in a subject comprising
  • the invention is directed to isolated genomic polynucleotide fragments that encode human SNARE YKT6, human liver glucokinase, human adipocyte enhancer binding protein and DNA directed 50 kD regulatory subunit (POLD2), which in a specific embodiment are the SNARE YKT6, human liver glucokinase, human adipocyte enhancer binding protein and DNA directed 50 kD regulatory subunit (POLD2) genes, as well as vectors and hosts containing these fragments and polynucleotide fragments hybridizing to noncoding regions, as well as antisense oligonucleotides to these fragments.
  • POLD2 DNA directed 50 kD regulatory subunit
  • a “gene” is the segment of DNA involved in producing a polypeptide chain; it includes regions preceding and following the coding region, as well as intervening sequences (introns) between individual coding segments (exons).
  • isolated refers to material removed from its original environment and is thus altered “by the hand of man” from its natural state.
  • An isolated polynucleotide can be part of a vector, a composition of matter or could be contained within a cell as long as the cell is not the original environment of the polynucleotide.
  • the polynucleotides of the present invention may be in the form of RNA or in the form of DNA, which DNA includes genomic DNA and synthetic DNA.
  • the DNA may be double-stranded or single-stranded and if single stranded may be the coding strand or non-coding strand.
  • the human SNARE YKT6 polypeptide has the amino acid sequence depicted in SEQ ID NO:1: KLYSLSVLYKGEAKVVLLKAAYDVSSFSFFQRSSVQEFMTFTSQLIVERSSKGTRASVKFQDYLCH VYVRNDSLAGVVIADNEYPSRVAFTLLEKVLDEFSKQVDRIDWPVGSPATIHYPALDGHLSRYQN PREADPMTKVQAELDETKIILHNTMESLLERGEKLDDLVSKSEVLGTQSKAFYKTARKQNSCCAI M
  • the genomic DNA or YKT6 SNARE gene is 39,000 base pairs in length and contains seven exons (see Table 1 below for location of exons). As will be discussed in further detail below, the YKT6 SNARE gene is situated in genomic clone AC006454 at nucleotides 36,001-75,000.
  • the human liver glucokinase is depicted in SEQ ID NO:2: MPRPRSQLPQPNSQVEQILAEFQLQEEDLKKVMRRMQKEMDRGLRLETHEEASVKMLPTYVRSTP EGSEVGDFLSLDLGGTNFRVMLVKVGEGEEGQWSVKTKHQTYSIPEDAMTGTAEMLFDYISECIS DFLDKHQMKHKKLPLGFTFSFPVRHEDIDKGILLNWTKGFKASGAEGNNVVGLLRDAIKRRGDFE MDVVAMVNDTVATMISCYYEDHQCEVGMIVGTGCNACYMEEMQNVELVEGDEGRMCVNTEW GAFGDSGELDEFLLEYDRLVDESSANPGQQLYEKLIGGKYMGELVRLVLLRLVDENLLFHGEASE QLRTRGAFETREVSQVESDTGDRKQIYNILSTLGLRPSTTDCDIVRRACESVSTRAAHMCSAGLAG VINRMRESRSEDVMRITVGVDGSVYKL
  • the human liver glucokinase genomic DNA is 46,000 base pairs in length and contains ten exons (see Table 2 below for location of exons).
  • the human adipocyte enhancer binding protein has the amino acid sequence depicted in SEQ ID NO:3: MAAVRGAPLLSCLLALLALCPGGRPQTVLTDDEIEEFLEGFLSELEPEPREDDVEAPPPPEPTPRVR KAQAGGKPGKRPGTAAEVPPEKTKDKGKKGKKDKGPKVPKESLEGSPRPPKKGKEKPPKATKKP KEKPPKATKKPKEEPPKATKKPKEKPPKATKKPPSGKRPPILAPSETLEWPLPPPPSPGPEELPQEGG APLSNNWQNPGEETHVEAQEHQPEPEEETEQPTLDYNDQIEREDYEDFEYIRRQKQPRPPPSRRRR PERVWPEPPEEKAPAPEERIEPPVKPLLPPLPPDYGDGYVIPNYDDMDYYFGPPPQKPDAERQT DEEKEELKKPKKEDSSPKEETDKWAVEKGKDHKEPRKGEELEEEWTPTEKVKCPPIGMESHRIED
  • the adipocyte enhancer binding protein 1 is 16,000 base pairs in length and contains 21 exons (see Table 3 below for location of exons).
  • the human AEBP1 gene is situated in genomic clone AC006454 at nucleotides 137,041-end.
  • POLD2 has an amino acid sequence depicted in SEQ ID NO:4: MFSEQAAQRAHTLLSPPSANNATFARVPVATYTNSSQPFRLGERSFSRQYAHIYATRLIQMRPFLE NRAQQHWGSGVGVKKLCELQPEEKCCVVGTLFKAMPLQPSILREVSEEHNLLPQPPRSKYIHPDD ELVLEDELQRIKLKGTIDVSKLVTGTVLAVFGSVRDDGKYLVEDYCFADLAPQKpAPPLDTDRFVL LVSGLGLGGGGGESLLGTQLLVDVVTGQLGDEGEQCSAAHVSRVILAGNLLSHSTQSRDSINKAK YLTKKTQAASVEAVKMLDEILLQLSASVPVDVMPGEFDPTNYTLPQQPLHPCMFPLATAYSTLQL VTNPYQATIDGVRFLGTSGQNVSDIFRYSSMEDHLEJLEWTLRVRHISPTAPDTHGCYPFYKTDPHF PECPHVY
  • the POLD2 gene is 19,000 base pairs in length and contains ten exons (see Table 4 below for location of exons). As will be discussed in further detail below, the POLD2 gene is situated in genomic clone AC006454 at nucleotides 119,001-138,000.
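The clone coordinates quoted for the YKT6 and POLD2 genes can be cross-checked against the stated gene lengths. The following is a minimal sketch, assuming the coordinates are 1-based and inclusive (the text does not state the convention); the names `span_length` and `GENE_SPANS` are illustrative and do not come from the application.

```python
def span_length(start: int, end: int) -> int:
    """Length of a coordinate range, assuming 1-based inclusive coordinates."""
    if end < start:
        raise ValueError("end must not precede start")
    return end - start + 1


# Coordinates within genomic clone AC006454, as quoted in the text.
GENE_SPANS = {
    "YKT6": (36_001, 75_000),
    "POLD2": (119_001, 138_000),
}

gene_lengths = {name: span_length(*coords) for name, coords in GENE_SPANS.items()}
```

Under that assumption the computed spans agree with the 39,000 and 19,000 base pair lengths given for the two genes.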
  • the polynucleotides of the invention have at least a 95% identity and may have a 96%, 97%, 98% or 99% identity to the polynucleotides depicted in SEQ ID NOS:5, 6, 7 or 8 as well as the polynucleotides in reverse sense orientation, or the polynucleotide sequences encoding the SNARE YKT6, human glucokinase, AEBP1 or POLD2 polypeptides depicted in SEQ ID NOS:1, 2, 3, or 4 respectively.
  • a polynucleotide having 95% “identity” to a reference nucleotide sequence of the present invention is identical to the reference sequence except that the polynucleotide sequence may include on average up to five point mutations per each 100 nucleotides of the reference nucleotide sequence encoding the polypeptide.
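To make the "five point mutations per 100 nucleotides" reading concrete, the sketch below computes percent identity for two pre-aligned sequences of equal length. It is a simplified illustration that handles substitutions only, not insertions or deletions, and it is not the FASTDB algorithm; the function name and the 100-nucleotide test sequence are made up for the example.

```python
def percent_identity(reference: str, candidate: str) -> float:
    """Percent of positions that match between two equal-length, pre-aligned sequences."""
    if len(reference) != len(candidate):
        raise ValueError("sequences must be pre-aligned to the same length")
    if not reference:
        raise ValueError("sequences must not be empty")
    matches = sum(r == c for r, c in zip(reference.upper(), candidate.upper()))
    return 100.0 * matches / len(reference)


# Hypothetical 100-nucleotide reference; positions 0, 20, 40, 60, 80 are all "A".
reference = "ACGT" * 25
mutated = list(reference)
for pos in (0, 20, 40, 60, 80):
    mutated[pos] = "C"  # introduce five point mutations
candidate = "".join(mutated)

identity = percent_identity(reference, candidate)
```

Five substitutions in 100 nucleotides leave 95 matching positions, which is the 95% threshold described above.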
  • the query sequence may be an entire sequence, the ORF (open reading frame), or any fragment specified as described herein.
  • whether any particular nucleic acid molecule or polypeptide is at least 95%, 96%, 97%, 98% or 99% identical to a nucleotide sequence of the present invention can be determined conventionally using known computer programs.
  • a preferred method for determining the best overall match between a query sequence (a sequence of the present invention) and a subject sequence, also referred to as a global sequence alignment, can be determined using the FASTDB computer program based on the algorithm of Brutlag et al. (Comp. App. Biosci. (1990) 6:237-245).
  • the query and subject sequences are both DNA sequences.
  • An RNA sequence can be compared by converting U's to T's.
  • the result of said global sequence alignment is in percent identity.
  • the percent identity is corrected by calculating the number of bases of the query sequence that are 5′ and 3′ of the subject sequence, which are not matched/aligned, as a percent of the total bases of the query sequence. Whether a nucleotide is matched/aligned is determined by results of the FASTDB sequence alignment.
  • This percentage is then subtracted from the percent identity, calculated by the above FASTDB program using the specified parameters, to arrive at a final percent identity score.
  • This corrected score is what is used for the purposes of the present invention. Only bases outside the 5′ and 3′ bases of the subject sequence, as displayed by the FASTDB alignment, which are not matched/aligned with the query sequence are calculated for the purposes of manually adjusting the percent identity score.
  • a 95 base subject sequence is aligned to a 100 base query sequence to determine percent identity.
  • the deletions occur at the 5′ end of the subject sequence and therefore, the FASTDB alignment does not show a match/alignment of the first 5 bases at the 5′ end.
  • the 5 unpaired bases represent 5% of the sequence (number of bases at the 5′ and 3′ ends not matched/total number of bases in the query sequence) so 5% is subtracted from the percent identity score calculated by the FASTDB program. If the remaining 95 bases were perfectly matched the final percent identity would be 95%.
  • a 95 base subject sequence is compared with a 100 base query sequence.
  • deletions are internal deletions so that there are no bases on the 5′ or 3′ of the subject sequence which are not matched/aligned with the query.
  • percent identity calculated by FASTDB is not manually corrected.
  • bases 5′ and 3′ of the subject sequence which are not matched/aligned with the query sequence are manually corrected for. No other manual corrections are made for purposes of the present invention.
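The manual correction described above reduces to one subtraction: the unmatched terminal query bases, expressed as a percentage of the query length, are taken off the score reported by the alignment program. The sketch below illustrates only that arithmetic; it does not run FASTDB, and the function and parameter names are assumptions made for the example.

```python
def corrected_percent_identity(
    reported_identity: float,
    unmatched_5prime: int,
    unmatched_3prime: int,
    query_length: int,
) -> float:
    """Subtract the terminal-overhang penalty from a reported percent identity.

    reported_identity: percent identity reported by the alignment program.
    unmatched_5prime / unmatched_3prime: query bases lying outside the subject
        sequence at each end that are not matched/aligned.
    query_length: total number of bases in the query sequence.
    """
    if query_length <= 0:
        raise ValueError("query_length must be positive")
    if unmatched_5prime < 0 or unmatched_3prime < 0:
        raise ValueError("unmatched counts must be non-negative")
    penalty = 100.0 * (unmatched_5prime + unmatched_3prime) / query_length
    return reported_identity - penalty


# Terminal case: 5 unmatched bases at the 5' end of a 100-base query,
# with the aligned remainder matching perfectly.
terminal_case = corrected_percent_identity(100.0, 5, 0, 100)

# Internal-deletion case: no unmatched terminal bases, so no correction.
internal_case = corrected_percent_identity(95.0, 0, 0, 100)
```

Both cases end at 95%, but only the first involves a manual correction, which mirrors the distinction the text draws between terminal and internal deletions.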
  • a polypeptide that has an amino acid sequence at least, for example, 95% “identical” to a query amino acid sequence is identical to the query sequence except that the subject polypeptide sequence may include on average, up to five amino acid alterations per each 100 amino acids of the query amino acid sequence.
  • up to 5% of the amino acid residues in the subject sequence may be inserted, deleted, (indels) or substituted with another amino acid.
  • These alterations of the reference sequence may occur at the amino or carboxy terminal positions of the reference amino acid sequence or anywhere between those terminal positions, interspersed either individually among residues in the referenced sequence or in one or more contiguous groups within the reference sequence.
  • a preferred method for determining the best overall match between a query sequence (a sequence of the present invention) and a subject sequence can be determined using the FASTDB computer program based on the algorithm of Brutlag et al. (Comp. App. Biosci. (1990) 6:237-245).
  • the query and subject sequence are either both nucleotide sequences or both amino acid sequences.
  • the result of said global sequence alignment is in percent identity.
  • the percent identity is corrected by calculating the number of residues of the query sequence that are N- and C-terminal of the subject sequence, which are not matched/aligned with a corresponding subject residue, as a percent of the total bases of the query sequence. Whether a residue is matched/aligned is determined by results of the FASTDB sequence alignment.
  • This percentage is then subtracted from the percent identity, calculated by the above FASTDB program using the specified parameters, to arrive at a final percent identity score.
  • This final percent identity score is what is used for the purposes of the present invention. Only residues to the N- and C-termini of the subject sequence, which are not matched/aligned with the query sequence, are considered for the purposes of manually adjusting the percent identity score. That is, only query residue positions outside the farthest N- and C-terminal residues of the subject sequence.
  • the invention also encompasses polynucleotides that hybridize to the polynucleotides depicted in SEQ ID NOS: 5, 6, 7 or 8.
  • a polynucleotide “hybridizes” to another polynucleotide, when a single-stranded form of the polynucleotide can anneal to the other polynucleotide under the appropriate conditions of temperature and solution ionic strength (see Sambrook et al., supra). The conditions of temperature and ionic strength determine the “stringency” of the hybridization.
  • low stringency hybridization conditions corresponding to a temperature of 42° C.
  • Moderate stringency hybridization conditions correspond to a higher temperature of 55° C., e.g., 40% formamide, with 5× or 6× SSC.
  • High stringency hybridization conditions correspond to the highest temperature of 65° C., e.g., 50% formamide, 5× or 6× SSC.
  • Hybridization requires that the two nucleic acids contain complementary sequences, although depending on the stringency of the hybridization, mismatches between bases are possible.
  • The appropriate stringency for hybridizing nucleic acids depends on the length of the nucleic acids and the degree of complementation, variables well known in the art. The greater the degree of similarity or homology between two nucleotide sequences, the greater the value of Tm for hybrids of nucleic acids having those sequences. The relative stability (corresponding to higher Tm) of nucleic acid hybridizations decreases in the following order: RNA:RNA, DNA:RNA, DNA:DNA.
  • the invention is directed to both polynucleotide and polypeptide variants.
  • a “variant” refers to a polynucleotide or polypeptide differing from the polynucleotide or polypeptide of the present invention, but retaining essential properties thereof. Generally, variants are overall closely similar and in many regions, identical to the polynucleotide or polypeptide of the present invention.
  • the variants may contain alterations in the coding regions, non-coding regions, or both.
  • polynucleotide variants containing alterations which produce silent substitutions, additions, or deletions, but do not alter the properties or activities of the encoded polypeptide are preferred.
  • variants in which 5-10, 1-5, or 1-2 amino acids are substituted, deleted, or added in any combination are also preferred.
  • the invention also encompasses allelic variants of said polynucleotides.
  • An allelic variant denotes any of two or more alternative forms of a gene occupying the same chromosomal locus. Allelic variation arises naturally through mutation, and may result in polymorphism within populations. Gene mutations can be silent (no change in the encoded polypeptide) or may encode polypeptides having altered amino acid sequences.
  • An allelic variant of a polypeptide is a polypeptide encoded by an allelic variant of a gene.
  • amino acid sequences of the variant polypeptides may differ from the amino acid sequences depicted in SEQ ID NOS:1, 2, 3 or 4 by an insertion or deletion of one or more amino acid residues and/or the substitution of one or more amino acid residues by different amino acid residues.
  • amino acid changes are of a minor nature, that is conservative amino acid substitutions that do not significantly affect the folding and/or activity of the protein; small deletions, typically of one to about 30 amino acids; small amino- or carboxyl-terminal extensions, such as an amino-terminal methionine residue; a small linker peptide of up to about 20-25 residues; or a small extension that facilitates purification by changing net charge or another function, such as a poly-histidine tract, an antigenic epitope or a binding domain.
  • Examples of conservative substitutions are within the group of basic amino acids (arginine, lysine and histidine), acidic amino acids (glutamic acid and aspartic acid), polar amino acids (glutamine and asparagine), hydrophobic amino acids (leucine, isoleucine and valine), aromatic amino acids (phenylalanine, tryptophan and tyrosine), and small amino acids (glycine, alanine, serine, threonine and methionine).
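The substitution groups listed above translate directly into a lookup table. The sketch below encodes them with standard one-letter amino acid codes and tests whether a substitution stays within one group; the names `CONSERVATIVE_GROUPS` and `is_conservative` are illustrative, and treating group membership as the sole criterion is a simplification made for the example.

```python
# Groups as listed in the text, written with one-letter amino acid codes.
CONSERVATIVE_GROUPS = {
    "basic": set("RKH"),        # arginine, lysine, histidine
    "acidic": set("ED"),        # glutamic acid, aspartic acid
    "polar": set("QN"),         # glutamine, asparagine
    "hydrophobic": set("LIV"),  # leucine, isoleucine, valine
    "aromatic": set("FWY"),     # phenylalanine, tryptophan, tyrosine
    "small": set("GASTM"),      # glycine, alanine, serine, threonine, methionine
}


def is_conservative(original: str, replacement: str) -> bool:
    """True if two different residues fall within the same listed group."""
    a, b = original.upper(), replacement.upper()
    if a == b:
        return False  # identical residues are not a substitution
    return any(a in group and b in group for group in CONSERVATIVE_GROUPS.values())
```

For example, `is_conservative("R", "K")` is true because both residues are basic, while `is_conservative("L", "D")` is false.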
  • Amino acid substitutions which do not generally alter the specific activity are known in the art and are described, for example, by H. Neurath and R. L. Hill, 1979, In, The Proteins, Academic Press, New York.
  • the invention is further directed to polynucleotide fragments containing or hybridizing to noncoding regions of the SNARE YKT6, AEBP1, human glucokinase and POLD2 genes. These include but are not limited to an intron, a 5′ non-coding region, a 3′ non-coding region and splice junctions (see Tables 1-4), as well as transcription factor binding sites (see Table 5).
  • the polynucleotide fragments may be short polynucleotide fragments of between about 8 and about 40 nucleotides in length. Such shorter fragments may be useful for diagnostic purposes.
  • Such short polynucleotide fragments are also preferred with respect to polynucleotides containing or hybridizing to polynucleotides containing splice junctions.
  • larger fragments, e.g., of about 50, 150, 500, 600 or about 2000 nucleotides in length, may be used.
  • noncoding sequences are expression control sequences. These include but are not limited to DNA regulatory sequences, such as promoters, enhancers, repressors, terminators, and the like, that provide for the regulation of expression of a coding sequence in a host cell. In eukaryotic cells, polyadenylation signals are also control sequences.
  • the expression control sequences may be operatively linked to a polynucleotide encoding a heterologous polypeptide.
  • Such expression control sequences may be about 50-200 nucleotides in length and specifically about 50, 100, 200, 500, 600, 1000 or 2000 nucleotides in length.
  • a transcriptional control sequence is “operatively linked” to a polynucleotide encoding a heterologous polypeptide sequence when the expression control sequence controls and regulates the transcription and translation of that polynucleotide sequence.
  • operatively linked includes having an appropriate start signal (e.g., ATG) in front of the polynucleotide sequence to be expressed and maintaining the correct reading frame to permit expression of the DNA sequence under the control of the expression control sequence and production of the desired product encoded by the polynucleotide sequence. If a gene that one desires to insert into a recombinant DNA molecule does not contain an appropriate start signal, such a start signal can be inserted upstream (5′) of and in reading frame with the gene.
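The start-signal requirement can be illustrated with a small check on a coding sequence. This is a minimal sketch under simplifying assumptions: the sequence is treated as a plain DNA string, and "in frame" is reduced to the length being a multiple of three. The function names are hypothetical.

```python
START_CODON = "ATG"


def has_start_in_frame(coding_sequence: str) -> bool:
    """True if the sequence begins with ATG and its length is a multiple of three."""
    seq = coding_sequence.upper()
    return seq.startswith(START_CODON) and len(seq) % 3 == 0


def add_start_signal(coding_sequence: str) -> str:
    """Insert ATG upstream (5') of the sequence if it lacks one.

    Prepending exactly three nucleotides leaves the downstream reading frame unchanged.
    """
    seq = coding_sequence.upper()
    if seq.startswith(START_CODON):
        return seq
    return START_CODON + seq
```

For instance, `add_start_signal("GCCAAA")` returns `"ATGGCCAAA"`, which begins with the start signal and keeps the original codons GCC and AAA in frame.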
  • the human chromosome 7 genomic clone of accession number AC006454 has been discovered to contain the SNARE YKT6 gene, the human liver glucokinase gene, the AEBP1 gene, and the POLD2 gene by Genscan analysis (Burge et al., 1997, J. Mol. Biol. 268:78-94), BLAST2 and TBLASTN analysis (Altschul et al., 1997, Nucl. Acids Res. 25:3389-3402), in which the sequence of AC006454 was compared to the SNARE YKT6 cDNA sequence, accession number NM_006555 (McNew et al., 1997, J. Biol. Chem.
  • the cloning of the nucleic acid sequences of the present invention from such genomic DNA can be effected, e.g., by using the well known polymerase chain reaction (PCR) or antibody screening of expression libraries to detect cloned DNA fragments with shared structural features.
  • Other nucleic acid amplification procedures such as ligase chain reaction (LCR), ligated activated transcription (LAT) and nucleic acid sequence-based amplification (NASBA) or long chain PCR may be used.
  • 5′ or 3′ non-coding portions of each gene may be identified by methods including, but not limited to, filter probing, clone enrichment using specific probes, and protocols similar or identical to 5′ and 3′ “RACE” protocols, which are well known in the art. For instance, a method similar to 5′ RACE is available for generating the missing 5′ end of a desired full-length transcript (Fromont-Racine et al., 1993, Nucl. Acids Res. 21:1683-1684).
  • identification of the specific DNA fragment containing the desired SNARE YKT6 gene, the human liver glucokinase gene, the AEBP1 gene, or POLD2 gene may be accomplished in a number of ways. For example, if an amount of a portion of a SNARE YKT6 gene, the human liver glucokinase gene, the AEBP1 gene, or POLD2 gene or its specific RNA, or a fragment thereof, is available and can be purified and labeled, the generated DNA fragments may be screened by nucleic acid hybridization to the labeled probe (Benton and Davis, 1977, Science 196:180; Grunstein and Hogness, 1975, Proc. Natl. Acad.
  • nucleic acid probes which can be conveniently prepared from the specific sequences disclosed herein, e.g., a hybridizable probe having a nucleotide sequence corresponding to at least a 10, and preferably a 15, nucleotide fragment of the sequences depicted in SEQ ID NOS:5, 6, 7 or 8.
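As an illustration of deriving probe candidates of a given length from a disclosed sequence, the sketch below enumerates every window of a chosen length (15 nucleotides by default, with 10 as the minimum, following the lengths mentioned above) and provides a reverse-complement helper. The function names and the short input sequence are made up for the example, and no uniqueness or hybridization filtering is attempted.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(sequence: str) -> str:
    """Reverse complement of a DNA sequence containing only A, C, G and T."""
    return sequence.upper().translate(_COMPLEMENT)[::-1]


def probe_windows(sequence: str, length: int = 15) -> list[str]:
    """All contiguous windows of the given length, in 5' to 3' order."""
    if length < 10:
        raise ValueError("probe length should be at least 10 nucleotides")
    seq = sequence.upper()
    return [seq[i:i + length] for i in range(len(seq) - length + 1)]


# Hypothetical 18-nucleotide input, not taken from any SEQ ID NO.
windows = probe_windows("ACGTACGTACGTACGTAC", length=15)
```

An 18-nucleotide input yields four overlapping 15-nucleotide windows; a real probe design would additionally screen each candidate for uniqueness, as the text notes.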
  • a fragment is selected that is highly unique to the encoded polypeptides. Those DNA fragments with substantial homology to the probe will hybridize. As noted above, the greater the degree of homology, the more stringent hybridization conditions can be used.
  • low stringency hybridization conditions are used to identify a homologous SNARE YKT6, the human liver glucokinase, the AEBP1, or POLD2 polynucleotide.
  • a nucleic acid encoding a polypeptide of the invention will hybridize to a nucleic acid derived from the polynucleotide sequence depicted in SEQ ID NOS:5, 6, 7 or 8 or a hybridizable fragment thereof, under moderately stringent conditions; more preferably, it will hybridize under high stringency conditions.
  • the presence of the gene may be detected by assays based on the physical, chemical, or immunological properties of its expressed product.
  • cDNA clones, or DNA clones which hybrid-select the proper mRNAs can be selected which produce a protein that, e.g., has similar or identical electrophoretic migration, isoelectric focusing behavior, proteolytic digestion maps, or antigenic properties as known for the SNARE YKT6, the human liver glucokinase, the AEBP1, or POLD2 polynucleotide.
  • a gene encoding SNARE YKT6, the human liver glucokinase, the AEBP1, or POLD2 polypeptide can also be identified by mRNA selection, i.e., by nucleic acid hybridization followed by in vitro translation. In this procedure, fragments are used to isolate complementary mRNAs by hybridization. Immunoprecipitation analysis or functional assays of the in vitro translation products of the isolated mRNAs identify the mRNA and, therefore, the complementary DNA fragments that contain the desired sequences.
  • the present invention also relates to nucleic acid constructs comprising a polynucleotide sequence containing the exon/intron segments of the SNARE YKT6 gene (nucleotides 4320-15463 of SEQ ID NO:5), human liver glucokinase gene (nucleotides 20485-33460 of SEQ ID NO:6), AEBP1 gene (nucleotides 1301-13893 of SEQ ID NO:7) or POLD2 gene (nucleotides 11546-18811 of SEQ ID NO:8) operably linked to one or more control sequences which direct the expression of the coding sequence in a suitable host cell under conditions compatible with the control sequences.
  • Expression will be understood to include any step involved in the production of the polypeptide including, but not limited to, transcription, post-transcriptional modification, translation, post-translational modification, and secretion.
  • the invention is further directed to a nucleic acid construct comprising expression control sequences derived from SEQ ID NOS: 5, 6, 7 or 8 and a heterologous polynucleotide sequence.
  • Nucleic acid construct is defined herein as a nucleic acid molecule, either single- or double-stranded, which is isolated from a naturally occurring gene or which has been modified to contain segments of nucleic acid which are combined and juxtaposed in a manner which would not otherwise exist in nature.
  • nucleic acid construct is synonymous with the term expression cassette when the nucleic acid construct contains all the control sequences required for expression of a coding sequence of the present invention.
  • coding sequence is defined herein as a portion of a nucleic acid sequence which directly specifies the amino acid sequence of its protein product.
  • the boundaries of the coding sequence are generally determined by a ribosome binding site (prokaryotes) or by the ATG start codon (eukaryotes) located just upstream of the open reading frame at the 5′ end of the mRNA and a transcription terminator sequence located just downstream of the open reading frame at the 3′ end of the mRNA.
  • a coding sequence can include, but is not limited to, DNA, cDNA, and recombinant nucleic acid sequences.
  • the isolated polynucleotide of the present invention may be manipulated in a variety of ways to provide for expression of the polypeptide. Manipulation of the nucleic acid sequence prior to its insertion into a vector may be desirable or necessary depending on the expression vector. The techniques for modifying nucleic acid sequences utilizing recombinant DNA methods are well known in the art.
  • the control sequence may be an appropriate promoter sequence, a nucleic acid sequence which is recognized by a host cell for expression of the nucleic acid sequence.
  • the promoter sequence contains transcriptional control sequences which regulate the expression of the polynucleotide.
  • the promoter may be any nucleic acid sequence which shows transcriptional activity in the host cell of choice including mutant, truncated, and hybrid promoters, and may be obtained from genes encoding extracellular or intracellular polypeptides either homologous or heterologous to the host cell.
  • Suitable promoters for directing the transcription of the nucleic acid constructs of the present invention are the promoters obtained from the E. coli lac operon, the prokaryotic beta-lactamase gene (Villa-Komaroff et al., 1978, Proc. Natl. Acad. Sci. USA 75: 3727-3731), as well as the tac promoter (DeBoer et al., 1983, Proc. Natl. Acad. Sci. USA 80: 21-25). Further promoters are described in “Useful proteins from recombinant bacteria” in Scientific American, 1980, 242: 74-94; and in Sambrook et al., 1989, supra.
  • promoters for directing the transcription of the nucleic acid constructs of the present invention in a filamentous fungal host cell are promoters obtained from the genes encoding Aspergillus oryzae TAKA amylase, Rhizomucor miehei aspartic proteinase, Aspergillus niger neutral alpha-amylase, Aspergillus niger acid stable alpha-amylase, Aspergillus niger or Aspergillus awamori glucoamylase (glaA), Rhizomucor miehei lipase, Aspergillus oryzae alkaline protease, Aspergillus oryzae triose phosphate isomerase, Aspergillus nidulans acetamidase, Fusarium oxysporum trypsin-like protease (WO 96/00787), NA2-tpi (a hybrid of the promoters from the genes encoding
  • useful promoters are obtained from the Saccharomyces cerevisiae enolase (ENO-1) gene, the Saccharomyces cerevisiae galactokinase gene (GAL1), the Saccharomyces cerevisiae alcohol dehydrogenase/glyceraldehyde-3-phosphate dehydrogenase genes (ADH2/GAP), and the Saccharomyces cerevisiae 3-phosphoglycerate kinase gene.
  • Other useful promoters for yeast host cells are described by Romanos et al.
  • Eukaryotic promoters may be obtained from the genomes of viruses such as polyoma virus, fowlpox virus, adenovirus, bovine papilloma virus, avian sarcoma virus, cytomegalovirus, a retrovirus, hepatitis-B virus and SV40.
  • heterologous mammalian promoters such as the actin promoter or immunoglobulin promoter may be used.
  • the constructs of the invention may also include enhancers.
  • Enhancers are cis-acting elements of DNA, usually from about 10 to about 300 bp, that act on a promoter to increase its transcription. Enhancers from the globin, elastase, albumin, alpha-fetoprotein, and insulin genes may be used. Alternatively, an enhancer from a virus may be used; examples include the SV40 enhancer on the late side of the replication origin, the cytomegalovirus early promoter enhancer, the polyoma enhancer on the late side of the replication origin, and adenovirus enhancers.
  • the control sequence may also be a suitable transcription terminator sequence, a sequence recognized by a host cell to terminate transcription.
  • the terminator sequence is operably linked to the 3′ terminus of the nucleic acid sequence encoding the polypeptide. Any terminator which is functional in the host cell of choice may be used in the present invention.
  • the control sequence may also be a suitable leader sequence, a nontranslated region of an mRNA which is important for translation by the host cell.
  • the leader sequence is operably linked to the 5′ terminus of the nucleic acid sequence encoding the polypeptide. Any leader sequence that is functional in the host cell of choice may be used in the present invention.
  • the control sequence may also be a polyadenylation sequence, a sequence which is operably linked to the 3′ terminus of the nucleic acid sequence and which, when transcribed, is recognized by the host cell as a signal to add polyadenosine residues to transcribed mRNA.
  • Any polyadenylation sequence which is functional in the host cell of choice may be used in the present invention.
  • the control sequence may also be a signal peptide coding region, which codes for an amino acid sequence linked to the amino terminus of the polypeptide which can direct the encoded polypeptide into the cell's secretory pathway.
  • the 5′ end of the coding sequence of the nucleic acid sequence may inherently contain a signal peptide coding region naturally linked in translation reading frame with the segment of the coding region which encodes the secreted polypeptide.
  • the 5′ end of the coding sequence may contain a signal peptide coding region which is foreign to the coding sequence.
  • the foreign signal peptide coding region may be required where the coding sequence does not normally contain a signal peptide coding region.
  • the foreign signal peptide coding region may simply replace the natural signal peptide coding region in order to obtain enhanced secretion of the polypeptide.
  • any signal peptide coding region which directs the expressed polypeptide into the secretory pathway of a host cell of choice may be used in the present invention.
  • the control sequence may also be a propeptide coding region, which codes for an amino acid sequence positioned at the amino terminus of a polypeptide.
  • the resultant polypeptide is known as a proenzyme or propolypeptide (or a zymogen in some cases).
  • a propolypeptide is generally inactive and can be converted to a mature active polypeptide by catalytic or autocatalytic cleavage of the propeptide from the propolypeptide.
  • the propeptide coding region may be obtained from the Bacillus subtilis alkaline protease gene (aprE), the Bacillus subtilis neutral protease gene (nprT), the Saccharomyces cerevisiae alpha-factor gene, the Rhizomucor miehei aspartic proteinase gene, or the Myceliophthora thermophila laccase gene (WO 95/33836).
  • the propeptide region is positioned next to the amino terminus of a polypeptide and the signal peptide region is positioned next to the amino terminus of the propeptide region.
  • it may also be desirable to add regulatory sequences which allow the regulation of the expression of the polypeptide relative to the growth of the host cell.
  • regulatory systems are those which cause the expression of the gene to be turned on or off in response to a chemical or physical stimulus, including the presence of a regulatory compound.
  • Regulatory systems in prokaryotic systems would include the lac, tac, and trp operator systems.
  • in yeast, the ADH2 system or GAL1 system may be used.
  • in filamentous fungi, the TAKA alpha-amylase promoter, Aspergillus niger glucoamylase promoter, and the Aspergillus oryzae glucoamylase promoter may be used as regulatory sequences.
  • regulatory sequences are those which allow for gene amplification. In eukaryotic systems, these include the dihydrofolate reductase gene which is amplified in the presence of methotrexate, and the metallothionein genes which are amplified with heavy metals. In these cases, the nucleic acid sequence encoding the polypeptide would be operably linked with the regulatory sequence.
  • the present invention also relates to recombinant expression vectors comprising a nucleic acid sequence of the present invention, a promoter, and transcriptional and translational stop signals.
  • the various nucleic acid and control sequences described above may be joined together to produce a recombinant expression vector which may include one or more convenient restriction sites to allow for insertion or substitution of the nucleic acid sequence encoding the polypeptide at such sites.
  • the polynucleotide of the present invention may be expressed by inserting the nucleic acid sequence or a nucleic acid construct comprising the sequence into an appropriate vector for expression.
  • the coding sequence is located in the vector so that the coding sequence is operably linked with the appropriate control sequences for expression.
  • the recombinant expression vector may be any vector (e.g., a plasmid or virus) which can be conveniently subjected to recombinant DNA procedures and can bring about the expression of the nucleic acid sequence.
  • the choice of the vector will typically depend on the compatibility of the vector with the host cell into which the vector is to be introduced.
  • the vectors may be linear or closed circular plasmids.
  • the vector may be an autonomously replicating vector, i.e., a vector which exists as an extrachromosomal entity, the replication of which is independent of chromosomal replication, e.g., a plasmid, an extrachromosomal element, a minichromosome, or an artificial chromosome.
  • the vector may contain any means for assuring self-replication.
  • the vector may be one which, when introduced into the host cell, is integrated into the genome and replicated together with the chromosome(s) into which it has been integrated.
  • a single vector or plasmid or two or more vectors or plasmids which together contain the total DNA to be introduced into the genome of the host cell, or a transposon may be used.
  • the vectors of the present invention preferably contain one or more selectable markers which permit easy selection of transformed cells.
  • a selectable marker is a gene the product of which provides for biocide or viral resistance, resistance to heavy metals, prototrophy to auxotrophs, and the like.
  • Examples of bacterial selectable markers are the dal genes from Bacillus subtilis or Bacillus licheniformis, or markers which confer antibiotic resistance such as ampicillin, kanamycin, chloramphenicol or tetracycline resistance.
  • Suitable markers for yeast host cells are ADE2, HIS3, LEU2, LYS2, MET3, TRP1, and URA3.
  • Suitable selectable markers for mammalian cells are those that enable the identification of cells competent to take up the nucleic acids of the present invention, such as DHFR or thymidine kinase.
  • An appropriate host cell when wild-type DHFR is employed is the CHO cell line deficient in DHFR activity, prepared and propagated as described by Urlaub et al., Proc. Natl. Acad. Sci. USA, 77:4216 (1980).
  • the vectors of the present invention preferably contain an element(s) that permits stable integration of the vector into the host cell genome or autonomous replication of the vector in the cell independent of the genome of the cell.
  • the vector may rely on the polynucleotide sequence encoding the polypeptide or any other element of the vector for stable integration of the vector into the genome by homologous or nonhomologous recombination.
  • the vector may contain additional nucleic acid sequences for directing integration by homologous recombination into the genome of the host cell.
  • the additional polynucleotide sequences enable the vector to be integrated into the host cell genome at a precise location(s) in the chromosome(s).
  • the integrational elements should preferably contain a sufficient number of nucleic acids, such as 100 to 1,500 base pairs, preferably 400 to 1,500 base pairs, and most preferably 800 to 1,500 base pairs, which are highly homologous with the corresponding target sequence to enhance the probability of homologous recombination.
  • the integrational elements may be any sequence that is homologous with the target sequence in the genome of the host cell.
  • the integrational elements may be non-encoding or encoding nucleic acid sequences.
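The nested length preferences stated for integrational elements (100 to 1,500 base pairs, preferably 400 to 1,500, most preferably 800 to 1,500) can be expressed as a simple classification. This is only an illustrative sketch; the function name `integration_arm_tier` and the tier labels are hypothetical, and the check covers length alone, not the degree of homology with the target sequence.

```python
def integration_arm_tier(length_bp):
    """Classify an integrational element by length, per the ranges in the text.

    Tiers are checked from the narrowest range outward, because the
    stated ranges are nested (all share the 1,500 bp upper bound).
    """
    if 800 <= length_bp <= 1500:
        return "most preferred"
    if 400 <= length_bp <= 1500:
        return "preferred"
    if 100 <= length_bp <= 1500:
        return "within stated range"
    return "outside stated range"


if __name__ == "__main__":
    for n in (50, 250, 600, 1200, 2000):
        print(n, "bp:", integration_arm_tier(n))
```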
  • the vector may be integrated into the genome of the host cell by non-homologous recombination.
  • the vector may further comprise an origin of replication enabling the vector to replicate autonomously in the host cell in question.
  • origins of replication are the origins of replication of plasmids pBR322, pUC19, pACYC177, and pACYC184 permitting replication in E. coli, and pUB110, pE194, pTA1060, and pAMβ1 permitting replication in Bacillus.
  • origins of replication for use in a yeast host cell are the 2 micron origin of replication, ARS1, ARS4, the combination of ARS1 and CEN3, and the combination of ARS4 and CEN6.
  • the origin of replication may be one having a mutation which makes its functioning temperature-sensitive in the host cell (see, e.g., Ehrlich, 1978, Proceedings of the National Academy of Sciences USA 75: 1433).
  • More than one copy of a polynucleotide sequence of the present invention may be inserted into the host cell to increase production of the gene product.
  • An increase in the copy number of the polynucleotide sequence can be obtained by integrating at least one additional copy of the sequence into the host cell genome or by including an amplifiable selectable marker gene with the nucleic acid sequence where cells containing amplified copies of the selectable marker gene, and thereby additional copies of the nucleic acid sequence, can be selected for by cultivating the cells in the presence of the appropriate selectable agent.
  • the present invention also relates to recombinant host cells, comprising a nucleic acid sequence of the invention, which are advantageously used in the recombinant production of the polypeptides.
  • a vector comprising a nucleic acid sequence of the present invention is introduced into a host cell so that the vector is maintained as a chromosomal integrant or as a self-replicating extra-chromosomal vector as described earlier.
  • the term “host cell” encompasses any progeny of a parent cell that is not identical to the parent cell due to mutations that occur during replication. The choice of a host cell will to a large extent depend upon the gene encoding the polypeptide and its source.
  • the host cell may be a unicellular microorganism, e.g., a prokaryote, or a non-unicellular microorganism, e.g., a eukaryote.
  • Useful unicellular cells are bacterial cells such as gram positive bacteria including, but not limited to, a Bacillus cell, or a Streptomyces cell, e.g., Streptomyces lividans or Streptomyces murinus, or gram negative bacteria such as E. coli and Pseudomonas sp.
  • the introduction of a vector into a bacterial host cell may, for instance, be effected by protoplast transformation (see, e.g., Chang and Cohen, 1979, Molecular General Genetics 168: 111-115), using competent cells (see, e.g., Young and Spizizen, 1961, Journal of Bacteriology 81: 823-829, or Dubnau and Davidoff-Abelson, 1971, Journal of Molecular Biology 56: 209-221), electroporation (see, e.g., Shigekawa and Dower, 1988, Biotechniques 6: 742-751), or conjugation (see, e.g., Koehler and Thorne, 1987, Journal of Bacteriology 169: 5271-5278).
  • the host cell may be a eukaryote, such as a mammalian cell (e.g., human cell), an insect cell, a plant cell or a fungal cell.
  • Mammalian host cells that could be used include but are not limited to human HeLa, embryonic kidney cells (293), lung cells, H9 and Jurkat cells, mouse NIH3T3 and C127 cells, Cos 1, Cos 7 and CV1, quail QC1-3 cells, mouse L cells and Chinese hamster ovary (CHO) cells. These cells may be transfected with a vector containing a transcriptional regulatory sequence, a protein coding sequence and transcriptional termination sequences.
  • the polypeptide can be expressed in stable cell lines containing the polynucleotide integrated into a chromosome.
  • a selectable marker such as dhfr, gpt, neomycin, or hygromycin allows the identification and isolation of the transfected cells.
  • the host cell may be a fungal cell.
  • “Fungi” as used herein includes the phyla Ascomycota, Basidiomycota, Chytridiomycota, and Zygomycota (as defined by Hawksworth et al., In, Ainsworth and Bisby's Dictionary of The Fungi, 8th edition, 1995, CAB International, University Press, Cambridge, UK) as well as the Oomycota (as cited in Hawksworth et al., 1995, supra, page 171) and all mitosporic fungi (Hawksworth et al., 1995, supra).
  • the fungal host cell may also be a yeast cell.
  • yeast as used herein includes ascosporogenous yeast (Endomycetales), basidiosporogenous yeast, and yeast belonging to the Fungi Imperfecti (Blastomycetes). Since the classification of yeast may change in the future, for the purposes of this invention, yeast shall be defined as described in Biology and Activities of Yeast (Skinner, F. A., Passmore, S. M., and Davenport, R. R., eds, Soc. App. Bacteriol. Symposium Series No. 9, 1980). The fungal host cell may also be a filamentous fungal cell. “Filamentous fungi” include all filamentous forms of the subdivision Eumycota and Oomycota (as defined by Hawksworth et al., 1995, supra).
  • the filamentous fungi are characterized by a mycelial wall composed of chitin, cellulose, glucan, chitosan, mannan, and other complex polysaccharides. Vegetative growth is by hyphal elongation and carbon catabolism is obligately aerobic. In contrast, vegetative growth by yeasts such as Saccharomyces cerevisiae is by budding of a unicellular thallus and carbon catabolism may be fermentative.
  • Fungal cells may be transformed by a process involving protoplast formation, transformation of the protoplasts, and regeneration of the cell wall in a manner known per se. Suitable procedures for transformation of Aspergillus host cells are described in EP 238 023 and Yelton et al., 1984, Proceedings of the National Academy of Sciences USA 81: 1470-1474. Suitable methods for transforming Fusarium species are described by Malardier et al., 1989, Gene 78: 147-156 and WO 96/00787. Yeast may be transformed using the procedures described by Becker and Guarente, In Abelson, J. N. and Simon, M.
  • the present invention also relates to methods for producing a polypeptide of the present invention comprising (a) cultivating a host cell under conditions conducive for production of the polypeptide; and (b) recovering the polypeptide.
  • the cells are cultivated in a nutrient medium suitable for production of the polypeptide using methods known in the art.
  • the cell may be cultivated by shake flask cultivation, small-scale or large-scale fermentation (including continuous, batch, fed-batch, or solid state fermentations) in laboratory or industrial fermentors performed in a suitable medium and under conditions allowing the polypeptide to be expressed and/or isolated.
  • the cultivation takes place in a suitable nutrient medium comprising carbon and nitrogen sources and inorganic salts, using procedures known in the art. Suitable media are available from commercial suppliers or may be prepared according to published compositions (e.g., in catalogues of the American Type Culture Collection). If the polypeptide is secreted into the nutrient medium, the polypeptide can be recovered directly from the medium. If the polypeptide is not secreted, it can be recovered from cell lysates.
  • the polypeptides may be detected using methods known in the art that are specific for the polypeptides. These detection methods may include use of specific antibodies, formation of an enzyme product, or disappearance of an enzyme substrate.
  • an enzyme assay may be used to determine the activity of the polypeptide.
  • AEBP1 activity can be determined by measuring carboxypeptidase activity as described by Muise and Ro, 1999, Biochem. J. 343:341-345.
  • the conversion of hippuryl-L-arginine, hippuryl-L-lysine or hippuryl-L-phenylalanine to hippuric acid may be monitored spectrophotometrically.
  • POLD2 activity may be detected by assaying for DNA polymerase activity (see, for example, Ng et al., 1991, J. Biol. Chem. 266:11699-11704).
  • the resulting polypeptide may be recovered by methods known in the art.
  • the polypeptide may be recovered from the nutrient medium by conventional procedures including, but not limited to, centrifugation, filtration, extraction, spray-drying, evaporation, or precipitation.
  • polypeptides of the present invention may be purified by a variety of procedures known in the art including, but not limited to, chromatography (e.g., ion exchange, affinity, hydrophobic, chromatofocusing, and size exclusion), electrophoretic procedures (e.g., preparative isoelectric focusing), differential solubility (e.g., ammonium sulfate precipitation), SDS-PAGE, or extraction (see, e.g., Protein Purification, J.-C. Janson and Lars Ryden, editors, VCH Publishers, New York, 1989).
  • the SNARE YKT6, human glucokinase, AEBP1 or POLD2 polypeptides produced according to the method of the present invention may be used as an immunogen to generate antibodies that recognize any of these polypeptides.
  • Such antibodies include but are not limited to polyclonal, monoclonal, chimeric, single chain, Fab fragments, and an Fab expression library.
  • polypeptides for the production of antibodies.
  • various host animals can be immunized by injection with the polypeptide or a fragment thereof, including but not limited to rabbits, mice, rats, sheep, goats, etc.
  • the polypeptide or fragment thereof can optionally be conjugated to an immunogenic carrier, e.g., bovine serum albumin (BSA) or keyhole limpet hemocyanin (KLH).
  • adjuvants may be used to increase the immunological response, depending on the host species, including but not limited to Freund's (complete and incomplete), mineral gels such as aluminum hydroxide, surface active substances such as lysolecithin, pluronic polyols, polyanions, peptides, oil emulsions, keyhole limpet hemocyanins, dinitrophenol, and potentially useful human adjuvants such as BCG (bacille Calmette-Guerin) and Corynebacterium parvum.
  • any technique that provides for the production of antibody molecules by continuous cell lines in culture may be used. These include but are not limited to the hybridoma technique originally developed by Kohler and Milstein (1975, Nature 256:495-497), as well as the trioma technique, the human B-cell hybridoma technique (Kozbor et al., 1983, Immunology Today 4:72), and the EBV-hybridoma technique to produce human monoclonal antibodies (Cole et al., 1985, in Monoclonal Antibodies and Cancer Therapy, Alan R. Liss, Inc., pp.
  • monoclonal antibodies can be produced in germ-free animals utilizing recent technology (PCT/US90/02545).
  • human antibodies may be used and can be obtained by using human hybridomas (Cote et al., 1983, Proc. Natl. Acad. Sci. U.S.A. 80:2026-2030) or by transforming human B cells with EBV virus in vitro (Cole et al., 1985, in Monoclonal Antibodies and Cancer Therapy, Alan R. Liss, pp. 77-96).
  • techniques developed for the production of “chimeric antibodies” (Morrison et al., 1984, J. Bacteriol.
  • Antibody fragments which contain the idiotype of the antibody molecule can be generated by known techniques.
  • such fragments include but are not limited to: the F(ab′)2 fragment, which can be produced by pepsin digestion of the antibody molecule; the Fab′ fragments, which can be generated by reducing the disulfide bridges of the F(ab′)2 fragment; and the Fab fragments, which can be generated by treating the antibody molecule with papain and a reducing agent.
  • screening for the desired antibody can be accomplished by techniques known in the art, e.g., radioimmunoassay, ELISA (enzyme-linked immunosorbent assay), “sandwich” immunoassays, immunoradiometric assays, gel diffusion precipitin reactions, immunodiffusion assays, in situ immunoassays (using colloidal gold, enzyme or radioisotope labels, for example), western blots, precipitation reactions, agglutination assays (e.g., gel agglutination assays, hemagglutination assays), complement fixation assays, immunofluorescence assays, protein A assays, and immunoelectrophoresis assays, etc.
  • antibody binding is detected by detecting a label on the primary antibody.
  • the primary antibody is detected by detecting binding of a secondary antibody or reagent to the primary antibody.
  • the secondary antibody is labeled.
  • Many means are known in the art for detecting binding in an immunoassay and are within the scope of the present invention. For example, to select antibodies which recognize a specific epitope of a particular polypeptide, one may assay generated hybridomas for a product which binds to a particular polypeptide fragment containing such epitope. For selection of an antibody specific to a particular polypeptide from a particular species of animal, one can select on the basis of positive binding with the polypeptide expressed by or isolated from cells of that species of animal.
  • Immortal, antibody-producing cell lines can also be created by techniques other than fusion, such as direct transformation of B lymphocytes with oncogenic DNA, or transfection with Epstein-Barr virus. See, e.g., M. Schreier et al., “Hybridoma Techniques” (1980); Hammerling et al., “Monoclonal Antibodies And T-cell Hybridomas” (1981); Kennett et al., “Monoclonal Antibodies” (1980); see also U.S. Pat. Nos. 4,341,761; 4,399,121; 4,427,783; 4,444,887; 4,451,570; 4,466,917; 4,472,500; 4,491,632; 4,493,890.
  • Polynucleotides containing noncoding regions of SEQ ID NOS:5, 6, 7 or 8 may be used as probes for detecting mutations from samples from a patient. Genomic DNA may be isolated from the patient. A mutation(s) may be detected by Southern blot analysis, specifically by hybridizing restriction digested genomic DNA to various probes and subjecting to agarose electrophoresis.
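The logic behind detecting a mutation from restriction-digested genomic DNA can be illustrated in silico: a mutation that destroys (or creates) a restriction site changes the set of fragment sizes that a probe would reveal on a blot. The sketch below is illustrative only. The toy sequences, the function name `digest`, and the choice of the EcoRI site (GAATTC, cut after the first G) are assumptions not taken from the text, and the code models a linear molecule without simulating hybridization.

```python
def digest(sequence, site, cut_offset):
    """Return fragment lengths from cutting a linear sequence at every site.

    cut_offset is the position of the cut within the recognition site
    (e.g. 1 for EcoRI, which cuts G^AATTC). Only one strand is scanned,
    which is sufficient for palindromic sites such as GAATTC.
    """
    cuts = []
    pos = sequence.find(site)
    while pos != -1:
        cuts.append(pos + cut_offset)
        pos = sequence.find(site, pos + 1)
    bounds = [0] + cuts + [len(sequence)]
    return [b - a for a, b in zip(bounds, bounds[1:])]


if __name__ == "__main__":
    # Hypothetical toy sequences: the patient differs by one base (A -> G)
    # inside the second EcoRI site, which abolishes that site.
    reference = "AAAAGAATTCCCCCGAATTCGGGG"
    patient = "AAAAGAATTCCCCCGAGTTCGGGG"
    ref_bands = digest(reference, "GAATTC", 1)
    pat_bands = digest(patient, "GAATTC", 1)
    print("reference fragments:", ref_bands)
    print("patient fragments:  ", pat_bands)
    print("pattern differs:", sorted(ref_bands) != sorted(pat_bands))
```

A difference in the fragment pattern only indicates that a site was gained or lost; sequencing of the amplified region, as described in the text, would be needed to identify the change itself.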
  • Polynucleotides containing noncoding regions may be used as PCR primers and may be used to amplify the genomic DNA isolated from the patients. Additionally, primers may be obtained by routine or long range PCR, that can yield products containing more than one exon and intervening intron. The sequence of the amplified genomic DNA from the patient may be determined using methods known in the art. Such probes may be between 10-100 nucleotides in length and may preferably be between 20-50 nucleotides in length.
  • kits comprising these polynucleotide probes.
  • these probes are labeled with a detectable substance.
  • the invention is further directed to antisense oligonucleotides and mimetics to these polynucleotide sequences.
  • Antisense technology can be used to control gene expression through triple-helix formation or antisense DNA or RNA, both of which methods are based on binding of a polynucleotide to DNA or RNA.
  • a DNA oligonucleotide is designed to be complementary to a region of the gene involved in transcription or RNA processing (triple helix; see Lee et al., Nucl. Acids Res., 6:3073 (1979); Cooney et al., Science, 241:456 (1988); and Dervan et al., Science, 251:1360 (1991)), thereby preventing transcription and the production of said polypeptides.
  • the antisense oligonucleotides or mimetics of the present invention may be used to decrease levels of a polypeptide.
  • SNARE YKT6 has been found to be essential for vesicle-associated endoplasmic reticulum-Golgi transport and cell growth. Therefore, the SNARE YKT6 antisense oligonucleotides of the present invention could be used to inhibit cell growth and in particular, to treat or prevent tumor growth.
  • POLD2 is necessary for DNA replication. POLD2 antisense sequences could also be used to inhibit cell growth.
  • Glucokinase and AEBP1 antisense sequences may be used to treat hyperglycemia.
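The complementarity requirement for an antisense oligonucleotide can be sketched as taking the reverse complement of a chosen target region. This is a minimal illustration, not a design method from the text: the function names, the toy target sequence, and the 1-based coordinate convention are assumptions, and real oligonucleotide design would also weigh factors the sketch ignores (target accessibility, specificity, chemical modification).

```python
# Translation table for complementing DNA bases (case preserved).
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(dna):
    """Return the reverse complement of a DNA string, written 5' to 3'."""
    return dna.translate(_COMPLEMENT)[::-1]


def antisense_oligo(target, start, length):
    """Return a DNA oligo complementary to target[start : start + length - 1].

    start is 1-based (an assumption for this sketch). The result is
    written 5' to 3' and can base-pair with the chosen target region.
    """
    if start < 1:
        raise ValueError("start must be >= 1")
    region = target[start - 1:start - 1 + length]
    if len(region) != length:
        raise ValueError("target region runs past the end of the sequence")
    return reverse_complement(region)


if __name__ == "__main__":
    toy_target = "ATGGCCAAGT"  # hypothetical sequence, for illustration only
    print(antisense_oligo(toy_target, 1, 6))
```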
  • the antisense oligonucleotides of the present invention may be formulated into pharmaceutical compositions. These compositions may be administered in a number of ways depending upon whether local or systemic treatment is desired and upon the area to be treated. Administration may be topical (including ophthalmic and to mucous membranes including vaginal and rectal delivery), pulmonary (e.g., by inhalation or insufflation of powders or aerosols, including by nebulizer; intratracheal, intranasal, epidermal and transdermal), oral or parenteral. Parenteral administration includes intravenous, intraarterial, subcutaneous, intraperitoneal or intramuscular injection or infusion; or intracranial, e.g., intrathecal or intraventricular, administration.
  • compositions and formulations for topical administration may include transdermal patches, ointments, lotions, creams, gels, drops, suppositories, sprays, liquids and powders.
  • Conventional pharmaceutical carriers, aqueous, powder or oily bases, thickeners and the like may be necessary or desirable.
  • compositions and formulations for oral administration include powders or granules, suspensions or solutions in water or non-aqueous media, capsules, sachets or tablets. Thickeners, flavoring agents, diluents, emulsifiers, dispersing aids or binders may be desirable.
  • compositions and formulations for parenteral, intrathecal or intraventricular administration may include sterile aqueous solutions which may also contain buffers, diluents and other suitable additives such as, but not limited to, penetration enhancers, carrier compounds and other pharmaceutically acceptable carriers or excipients.
  • compositions of the present invention include, but are not limited to, solutions, emulsions, and liposome-containing formulations. These compositions may be generated from a variety of components that include, but are not limited to, preformed liquids, self-emulsifying solids and self-emulsifying semisolids.
  • compositions of the present invention may be prepared according to conventional techniques well known in the pharmaceutical industry. Such techniques include the step of bringing into association the active ingredients with the pharmaceutical carrier(s) or excipient(s). In general, the formulations are prepared by uniformly and intimately bringing into association the active ingredients with liquid carriers or finely divided solid carriers or both, and then, if necessary, shaping the product.
  • compositions of the present invention may be formulated into any of many possible dosage forms such as, but not limited to, tablets, capsules, liquid syrups, soft gels, suppositories, and enemas.
  • the compositions of the present invention may also be formulated as suspensions in aqueous, non-aqueous or mixed media.
  • Aqueous suspensions may further contain substances which increase the viscosity of the suspension including, for example, sodium carboxymethylcellulose, sorbitol and/or dextran.
  • the suspension may also contain stabilizers.
  • the pharmaceutical compositions may be formulated and used as foams.
  • Pharmaceutical foams include formulations such as, but not limited to, emulsions, microemulsions, creams, jellies and liposomes. While basically similar in nature these formulations vary in the components and the consistency of the final product.
  • the preparation of such compositions and formulations is generally known to those skilled in the pharmaceutical and formulation arts and may be applied to the formulation of the compositions of the present invention.
  • the formulation of compositions and their subsequent administration is believed to be within the skill of those in the art. Dosing is dependent on severity and responsiveness of the disease state to be treated, with the course of treatment lasting from several days to several months, or until a cure is effected or a diminution of the disease state is achieved. Optimal dosing schedules can be calculated from measurements of drug accumulation in the body of the patient. Persons of ordinary skill can easily determine optimum dosages, dosing methodologies and repetition rates. Optimum dosages may vary depending on the relative potency of individual oligonucleotides, and can generally be estimated based on EC50 values found to be effective in in vitro and in vivo animal models.
  • dosage is from 0.01 μg to 10 g per kg of body weight, and may be given once or more daily, weekly, monthly or yearly, or even once every 2 to 20 years. Persons of ordinary skill in the art can easily estimate repetition rates for dosing based on measured residence times and concentrations of the drug in bodily fluids or tissues. Following successful treatment, it may be desirable to have the patient undergo maintenance therapy to prevent the recurrence of the disease state, wherein the oligonucleotide is administered in maintenance doses, ranging from 0.01 μg to 10 g per kg of body weight, once or more daily, to once every 20 years.
  • SNARE YKT6 is necessary for cell growth
  • POLD2 is involved in DNA replication and repair
  • AEBP1 is involved in repressing adipogenesis
  • glucokinase is involved in glucose sensing in pancreatic islet beta cells and liver.
  • the SNARE YKT6 gene may be used to modulate or prevent cell apoptosis and treat such disorders as virus-induced lymphocyte depletion (AIDS); cell death in neurodegenerative disorders characterized by the gradual loss of specific sets of neurons (e.g., Alzheimer's Disease, Parkinson's disease, ALS, retinitis pigmentosa, spinal muscular atrophy and various forms of cerebellar degeneration), cell death in blood cell disorders resulting from deprivation of growth factors (anemia associated with chronic disease, aplastic anemia, chronic neutropenia and myelodysplastic syndromes) and disorders arising out of an acute loss of blood flow (e.g., myocardial infarctions and stroke).
  • the glucokinase gene may be used to treat diabetes mellitus.
  • the AEBP1 gene may be used to modulate or inhibit adipogenesis and treat obesity, diabetes mellitus and/or osteopenic disorders.
  • POLD2 may be used to treat defects in DNA repair such as xeroderma pigmentosum, progeria and ataxia telangiectasia.
  • the polynucleotide of the present invention may be introduced into a patient's cells for therapeutic uses.
  • cells can be transfected using any appropriate means, including viral vectors, as shown by the example, chemical transfectants, or physico-mechanical methods such as electroporation and direct diffusion of DNA. See, for example, Wolff, Jon A, et al., “Direct gene transfer into mouse muscle in vivo,” Science, 247, 1465-1468, 1990; and Wolff, Jon A, “Human dystrophin expression in mdx mice after intramuscular injection of DNA constructs,” Nature, 352, 815-818, 1991.
  • vectors are agents that transport the gene into the cell without degradation and include a promoter yielding expression of the gene in the cells into which it is delivered.
  • promoters can be general promoters, yielding expression in a variety of mammalian cells, or cell specific, or even nuclear versus cytoplasmic specific. These are known to those skilled in the art and can be constructed using standard molecular biology protocols. Vectors have been divided into two classes:
  • Viral vectors have a higher transduction ability (ability to introduce genes) than do most chemical or physical methods of introducing genes into cells.
  • Vectors that may be used in the present invention include viruses, such as adenoviruses, adeno associated virus (AAV), vaccinia, herpesviruses, baculoviruses and retroviruses, bacteriophages, cosmids, plasmids, fungal vectors and other recombination vehicles typically used in the art which have been described for expression in a variety of eukaryotic and prokaryotic hosts, and may be used for gene therapy as well as for simple protein expression. Polynucleotides are inserted into vector genomes using methods well known in the art.
  • Retroviral vectors are the vectors most commonly used in clinical trials, since they carry a larger genetic payload than other viral vectors. However, they are not useful in non-proliferating cells. Adenovirus vectors are relatively stable and easy to work with, have high titers, and can be delivered in aerosol formulation. Pox viral vectors are large and have several sites for inserting genes; they are thermostable and can be stored at room temperature.
  • promoters are SP6, T4, T7, SV40 early promoter, cytomegalovirus (CMV) promoter, mouse mammary tumor virus (MMTV) steroid-inducible promoter, Moloney murine leukemia virus (MMLV) promoter, phosphoglycerate kinase (PGK) promoter, and the like.
  • the promoter may be an endogenous adenovirus promoter, for example the E1a promoter or the Ad2 major late promoter (MLP).
  • those of ordinary skill in the art can construct adenoviral vectors utilizing endogenous or heterologous poly A addition signals.
  • Plasmids are not integrated into the genome and the vast majority of them are present only from a few weeks to several months, so they are typically very safe. However, they have lower expression levels than retroviruses and since cells have the ability to identify and eventually shut down foreign gene expression, the continuous release of DNA from the polymer to the target cells substantially increases the duration of functional expression while maintaining the benefit of the safety associated with non-viral transfections.
  • Liposomes are commercially available from Gibco BRL, for example, as LIPOFECTIN® and LIPOFECTACE®, which are formed of cationic lipids such as N-[1-(2,3-dioleyloxy)propyl]-N,N,N-trimethylammonium chloride (DOTMA) and dimethyl dioctadecylammonium bromide (DDAB). Numerous published methods for making liposomes are also known to those skilled in the art.
  • Nucleic acid-lipid complexes: lipid carriers can be associated with naked nucleic acids (e.g., plasmid DNA) to facilitate passage through cellular membranes.
  • Cationic, anionic, or neutral lipids can be used for this purpose.
  • cationic lipids are preferred because they have been shown to associate better with DNA which, generally, has a negative charge.
  • Cationic lipids have also been shown to mediate intracellular delivery of plasmid DNA (Felgner and Ringold, Nature 337:387 (1989)).
  • Intravenous injection of cationic lipid-plasmid complexes into mice has been shown to result in expression of the DNA in lung (Brigham et al., Am. J. Med.
  • Cationic lipids are known to those of ordinary skill in the art.
  • Representative cationic lipids include those disclosed, for example, in U.S. Pat. No. 5,283,185 and U.S. Pat. No. 5,767,099.
  • the cationic lipid is N4-spermine cholesteryl carbamate (GL-67) disclosed in U.S. Pat. No. 5,767,099.
  • Additional preferred lipids include N4-spermidine cholesteryl carbamate (GL-53) and 1-(N4-spermine)-2,3-dilaurylglycerol carbamate (GL-89).
  • the vectors of the invention may be targeted to specific cells by linking a targeting molecule to the vector.
  • a targeting molecule is any agent that is specific for a cell or tissue type of interest, including for example, a ligand, antibody, sugar, receptor, or other binding molecule.
  • invention vectors may be delivered to the target cells in a suitable composition, either alone, or complexed, as provided above, comprising the vector and a suitably acceptable carrier.
  • the vector may be delivered to target cells by methods known in the art, for example, intravenous, intramuscular, intranasal, subcutaneous, intubation, lavage, and the like.
  • the vectors may be delivered via in vivo or ex vivo applications. In vivo applications involve the direct administration of an adenoviral vector of the invention formulated into a composition to the cells of an individual. Ex vivo applications involve the transfer of the adenoviral vector directly to harvested autologous cells which are maintained in vitro, followed by readministration of the transduced cells to a recipient.
  • the vector is transfected into antigen-presenting cells.
  • Suitable sources of antigen-presenting cells include, but are not limited to, whole cells such as dendritic cells or macrophages; purified MHC class I molecules complexed to β2-microglobulin; and foster antigen-presenting cells.
  • the vectors of the present invention may be introduced into T cells or B cells using methods known in the art (see, for example, Tsokos and Nepom, 2000, J. Clin. Invest. 106:181-183).

Abstract

The invention is directed to isolated genomic polynucleotide fragments that encode human SNARE YKT6, human liver glucokinase, human adipocyte enhancer binding protein (AEBP1) and DNA directed 50 kD regulatory subunit (POLD2), vectors and hosts containing these fragments and fragments hybridizing to noncoding regions as well as antisense oligonucleotides to these fragments. The invention is further directed to methods of using these fragments to obtain SNARE YKT6, human liver glucokinase, AEBP1 protein and POLD2 and to diagnose, treat, prevent and/or ameliorate a pathological disorder.

Description

    PRIORITY CLAIM
  • This application claims priority under 35 U.S.C. §119(e) to provisional application serial No. 60/234,422, filed Sep. 21, 2000, the contents of which are incorporated herein by reference. [0001]
  • FIELD OF THE INVENTION
  • The invention is directed to isolated genomic polynucleotide fragments that encode human SNARE YKT6, human liver glucokinase, human adipocyte enhancer binding protein (AEBP1) and DNA directed 50 kD regulatory subunit (POLD2), vectors and hosts containing these fragments and fragments hybridizing to noncoding regions as well as antisense oligonucleotides to these fragments. The invention is further directed to methods of using these fragments to obtain SNARE YKT6, human liver glucokinase, AEBP1 protein and POLD2 and to diagnose, treat, prevent and/or ameliorate a pathological disorder. [0002]
  • BACKGROUND OF THE INVENTION
  • Chromosome 7 contains genes encoding, for example, epidermal growth factor receptor, collagen-1-Alpha-1-chain, SNARE YKT6, human liver glucokinase, human adipocyte enhancer binding protein and DNA polymerase delta small subunit (POLD2). SNARE YKT6, human liver glucokinase, human adipocyte enhancer binding protein and DNA polymerase delta small subunit (POLD2) are discussed in further detail below. [0003]
  • SNARE YKT6 [0004]
  • SNARE YKT6, a substrate for prenylation, is essential for vesicle-associated endoplasmic reticulum-Golgi transport (McNew, J. A. et al. J. Biol. Chem. 272, 17776-17783, 1997). It has been found that depletion of this function stops cell growth and manifests a transport block at the endoplasmic reticulum level. [0005]
  • Human Liver Glucokinase [0006]
  • Human liver glucokinase (ATP:D-hexose 6-phosphotransferase) is thought to play a major role in glucose sensing in pancreatic islet beta cells (Tanizawa et al., 1992, Mol. Endocrinol. 6:1070-1081) and in the liver. Glucokinase defects have been observed in patients with noninsulin-dependent diabetes mellitus (NIDDM). Mutations in the human liver glucokinase gene are thought to play a role in the early onset of NIDDM. The gene has been shown by Southern blotting to exist as a single copy on chromosome 7. It was further found to contain 10 exons, including one exon expressed in islet beta cells and another expressed in liver. [0007]
  • Human Adipocyte Enhancer Binding Protein [0008]
  • The adipocyte-enhancer binding protein (AEBP1) is a transcriptional repressor having carboxypeptidase B-like activity which binds to a regulatory sequence (adipocyte enhancer 1, AE-1) located in the proximal promoter region of the adipose P2 (aP2) gene, which encodes the adipocyte fatty acid binding protein (Muise et al., 1999, Biochem. J. 343:341-345). B-like carboxypeptidases remove C-terminal arginine and lysine residues and participate in the release of active peptides, such as insulin, alter receptor specificity for polypeptides and terminate polypeptide activity (Skidgel, 1988, Trends Pharmacol. Sci. 9:299-304). For example, they are thought to be involved in the onset of obesity (Naggert et al., 1995, Nat. Genet. 10:135-142). It has been reported that obese and hyperglycemic mice homozygous for the fat mutation contain a mutation in the carboxypeptidase E (CP-E) gene. [0009]
  • Full length cDNA clones encoding AEBP1 have been isolated from human osteoblast and adipose tissue (Ohno et al., 1996, Biochem. Biophys. Res. Commun. 228:411-414). Two forms have been found to exist due to alternative splicing. This gene appears to play a significant role in regulating adipogenesis. In addition to playing a role in obesity, adipogenesis may play a role in osteopenic disorders. It has been postulated that adipogenesis inhibitors may be used to treat osteopenic disorders (Nuttall et al., 2000, Bone 27:177-184). [0010]
  • DNA Polymerase Delta Small Subunit (POLD2) [0011]
  • DNA polymerase delta core is a heterodimeric enzyme with a catalytic subunit of 125 kD and a second subunit of 50 kD and is an essential enzyme for DNA replication and DNA repair (Zhang et al., 1995, Genomics 29:179-186). cDNAs encoding the small subunit have been cloned and sequenced. The gene for the small subunit has been localized to human chromosome 7 via PCR analysis of a panel of human-hamster hybrid cell lines. However, the genomic DNA has not been isolated and the exact location on chromosome 7 has not been determined. [0012]
  • OBJECTS OF THE INVENTION
  • Although cDNAs encoding the above-disclosed proteins have been isolated, their location on chromosome 7 has not been determined. Furthermore, genomic DNA encoding these polypeptides has not been isolated. Noncoding sequences can play a significant role in regulating the expression of polypeptides as well as the processing of RNA encoding these polypeptides. [0013]
  • There is clearly a need for obtaining genomic polynucleotide sequences encoding these polypeptides. Therefore, it is an object of the invention to isolate such genomic polynucleotide sequences. [0014]
  • SUMMARY OF THE INVENTION
  • The invention is directed to an isolated genomic polynucleotide, said polynucleotide obtainable from human chromosome 7 having a nucleotide sequence at least 95% identical to a sequence selected from the group consisting of: [0015]
  • (a) a polynucleotide encoding a polypeptide selected from the group consisting of human SNARE YKT6 depicted in SEQ ID NO:1, human liver glucokinase depicted in SEQ ID NO:2, human adipocyte enhancer binding protein 1 (AEBP1) depicted in SEQ ID NO:3 and DNA directed 50 kD regulatory subunit (POLD2) depicted in SEQ ID NO:4; [0016]
  • (b) a polynucleotide selected from the group consisting of SEQ ID NO:5 which encodes human SNARE YKT6 depicted in SEQ ID NO:1, SEQ ID NO:6 which encodes human liver glucokinase depicted in SEQ ID NO:2, SEQ ID NO:7 which encodes human adipocyte enhancer binding protein 1 depicted in SEQ ID NO:3 and SEQ ID NO:8 which encodes DNA directed 50 kD regulatory subunit (POLD2) depicted in SEQ ID NO:4; [0017]
  • (c) a polynucleotide which is a variant of SEQ ID NOS:5, 6, 7, or 8; [0018]
  • (d) a polynucleotide which is an allelic variant of SEQ ID NOS:5, 6, 7, or 8; [0019]
  • (e) a polynucleotide which encodes a variant of SEQ ID NOS:1, 2, 3, or 4; [0020]
  • (f) a polynucleotide which hybridizes to any one of the polynucleotides specified in (a)-(e); [0021]
  • (g) a polynucleotide that is a reverse complement to the polynucleotides specified in (a)-(f) and [0022]
  • (h) a polynucleotide containing at least 10 transcription factor binding sites selected from the group consisting of AP1FJ-Q2, AP1-C, AP1-Q2, AP1-Q4, AP4-Q5, AP4-Q6, ARNT-01, CEBP-01, CETS1P54-01, CREL-01, DELTAEF1-01, FREAC7-01, GATA1-02, GATA1-03, GATA1-04, GATA1-06, GATA2-02, GATA3-02, GATA-C, GC-01, GFI1-01, HFH2-01, HFH3-01, HFH8-01, IK2-01, LMO2COM-01, LMO2COM-02, LYF1-01, MAX-01, NKX25-01, NMYC-01, S8-01, SOX5-01, SP1-Q6, SREBP1-01, SRY-02, STAT-01, TATA-01, TCF11-01, USF-01, USF-C and USF-Q6; as well as nucleic acid constructs, expression vectors and host cells containing these polynucleotide sequences. [0023]
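  • The "at least 95% identical" criterion can be illustrated with a short Python sketch. This is a hedged example only: it assumes the two sequences have already been aligned to equal length (with "-" marking gaps) and counts gap columns as mismatches. This section does not state how identity is to be computed, so the counting rule, the function names and the toy sequences are all assumptions made for illustration.

```python
# Minimal sketch: percent identity over a pre-computed pairwise alignment.
# The gap-handling rule and all names here are illustrative assumptions.

def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percentage of alignment columns in which both sequences carry the
    same base. Columns containing a gap ('-') count as mismatches."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    if not aligned_a:
        raise ValueError("aligned sequences must not be empty")
    matches = sum(
        1
        for x, y in zip(aligned_a.upper(), aligned_b.upper())
        if x == y and x != "-"
    )
    return 100.0 * matches / len(aligned_a)


def meets_identity_threshold(aligned_a: str, aligned_b: str,
                             threshold: float = 95.0) -> bool:
    """True if the aligned pair reaches the given percent identity."""
    return percent_identity(aligned_a, aligned_b) >= threshold


if __name__ == "__main__":
    reference = "ACGT" * 5              # hypothetical 20-base reference
    candidate = "ACGT" * 4 + "ACGA"     # one substitution in 20 columns
    print(percent_identity(reference, candidate))
    print(meets_identity_threshold(reference, candidate))
```

  • A real comparison would first produce the alignment with an established alignment program; the sketch only shows the final counting step under the stated assumptions.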
  • The polynucleotides of the present invention may be used for the manufacture of a gene therapy for the prevention, treatment or amelioration of a medical condition by adding an amount of a composition comprising said polynucleotide effective to prevent, treat or ameliorate said medical condition. [0024]
  • The invention is further directed to obtaining these polypeptides by [0025]
  • (a) culturing host cells comprising these sequences under conditions that provide for the expression of said polypeptide and [0026]
  • (b) recovering said expressed polypeptide. [0027]
  • The polypeptides obtained may be used to produce antibodies by [0028]
  • (a) optionally conjugating said polypeptide to a carrier protein; [0029]
  • (b) immunizing a host animal with said polypeptide or peptide-carrier protein conjugate of step (a) with an adjuvant and [0030]
  • (c) obtaining antibody from said immunized host animal. [0031]
  • The invention is further directed to polynucleotides that hybridize to noncoding regions of said polynucleotide sequences as well as antisense oligonucleotides to these polynucleotides as well as antisense mimetics. The antisense oligonucleotides or mimetics may be used for the manufacture of a medicament for prevention, treatment or amelioration of a medical condition. The invention is further directed to kits comprising these polynucleotides and kits comprising these antisense oligonucleotides or mimetics. [0032]
  • In a specific embodiment, the noncoding regions are transcription regulatory regions. The transcription regulatory regions may be used to produce a heterologous peptide by expressing in a host cell, said transcription regulatory region operably linked to a polynucleotide encoding the heterologous polypeptide and recovering the expressed heterologous polypeptide. [0033]
  • The polynucleotides of the present invention may be used to diagnose a pathological condition in a subject comprising [0034]
  • (a) determining the presence or absence of a mutation in the polynucleotides of the present invention and [0035]
  • (b) diagnosing a pathological condition or a susceptibility to a pathological condition based on the presence or absence of said mutation. [0036]
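  • Step (a) above, determining the presence or absence of a mutation, can be sketched as a comparison of a sample sequence against a reference. The following Python example is a simplified illustration that assumes both sequences cover the same region, have equal length and differ only by substitutions; insertions and deletions would require an alignment step that is not shown. The function names and toy sequences are hypothetical.

```python
# Minimal sketch: report single-base substitutions between a reference
# and a sample of equal length. Names and data are hypothetical.
from typing import List, Tuple

Substitution = Tuple[int, str, str]  # (1-based position, reference base, sample base)


def find_substitutions(reference: str, sample: str) -> List[Substitution]:
    """List every position at which the sample differs from the reference."""
    if len(reference) != len(sample):
        raise ValueError("this sketch only compares sequences of equal length")
    return [
        (index + 1, ref_base, sample_base)
        for index, (ref_base, sample_base)
        in enumerate(zip(reference.upper(), sample.upper()))
        if ref_base != sample_base
    ]


def has_mutation(reference: str, sample: str) -> bool:
    """True if at least one substitution is present."""
    return bool(find_substitutions(reference, sample))


if __name__ == "__main__":
    print(find_substitutions("ACGTAC", "ACGAAC"))
    print(has_mutation("ACGTAC", "ACGTAC"))
```

  • Whether a detected difference is pathological is a separate question that the code does not address; it only covers the detection step.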
  • DETAILED DESCRIPTION OF THE INVENTION
  • The invention is directed to isolated genomic polynucleotide fragments that encode human SNARE YKT6, human liver glucokinase, human adipocyte enhancer binding protein and DNA directed 50 kD regulatory subunit (POLD2), which in a specific embodiment are the SNARE YKT6, human liver glucokinase, human adipocyte enhancer binding protein and DNA directed 50 kD regulatory subunit (POLD2) genes, as well as vectors and hosts containing these fragments and polynucleotide fragments hybridizing to noncoding regions, as well as antisense oligonucleotides to these fragments. [0037]
  • As defined herein, a “gene” is the segment of DNA involved in producing a polypeptide chain; it includes regions preceding and following the coding region, as well as intervening sequences (introns) between individual coding segments (exons). [0038]
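  • The definition of a gene as coding segments (exons) separated by intervening sequences (introns) can be pictured with a small data-structure sketch in Python. The coordinates, the toy genomic string and the function names are hypothetical; intervals are assumed to be 0-based and half-open, and lowercase letters mark non-exon sequence purely for readability.

```python
# Minimal sketch: represent exons as intervals on a genomic sequence,
# join them into a spliced sequence, and derive the introns between them.
# Coordinates and sequence are hypothetical illustrations.
from typing import List, Tuple

Interval = Tuple[int, int]  # (start, end), 0-based, end exclusive


def _checked(genomic: str, exons: List[Interval]) -> List[Interval]:
    """Sort exons and verify they are in bounds and do not overlap."""
    ordered = sorted(exons)
    previous_end = 0
    for start, end in ordered:
        if not (0 <= start < end <= len(genomic)):
            raise ValueError(f"exon {(start, end)} lies outside the sequence")
        if start < previous_end:
            raise ValueError(f"exon {(start, end)} overlaps the previous exon")
        previous_end = end
    return ordered


def splice_exons(genomic: str, exons: List[Interval]) -> str:
    """Concatenate exon segments in genomic order."""
    return "".join(genomic[start:end] for start, end in _checked(genomic, exons))


def introns_between(genomic: str, exons: List[Interval]) -> List[Interval]:
    """Return the intervals lying between consecutive exons."""
    ordered = _checked(genomic, exons)
    return [(ordered[i][1], ordered[i + 1][0]) for i in range(len(ordered) - 1)]


if __name__ == "__main__":
    toy_gene = "ggATGGCgtaagtcagTTGAtt"   # flanks and intron in lowercase
    toy_exons = [(2, 7), (16, 20)]
    print(splice_exons(toy_gene, toy_exons))
    print(introns_between(toy_gene, toy_exons))
```

  • The regions before the first exon and after the last exon correspond to the "regions preceding and following the coding region" in the definition; they are part of the gene but are not returned by either function.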
  • As defined herein “isolated” refers to material removed from its original environment and is thus altered “by the hand of man” from its natural state. An isolated polynucleotide can be part of a vector, a composition of matter or could be contained within a cell as long as the cell is not the original environment of the polynucleotide. [0039]
  • The polynucleotides of the present invention may be in the form of RNA or in the form of DNA, which DNA includes genomic DNA and synthetic DNA. The DNA may be double-stranded or single-stranded and if single stranded may be the coding strand or non-coding strand. The human SNARE YKT6 polypeptide has the amino acid sequence depicted in SEQ ID NO:1: [0040]
    KLYSLSVLYKGEAKVVLLKAAYDVSSFSFFQRSSVQEFMTFTSQLIVERSSKGTRASVKFQDYLCH
    VYVRNDSLAGVVIADNEYPSRVAFTLLEKVLDEFSKQVDRIDWPVGSPATIHYPALDGHLSRYQN
    PREADPMTKVQAELDETKIILHNTMESLLERGEKLDDLVSKSEVLGTQSKAFYKTARKQNSCCAI
    M
  • and is encoded by the genomic DNA sequence shown in SEQ ID NO:5: [0041]
    CCAGACATAGGCAAGGCGCAAGGTGATACAGTAGGCAGCCACCATGGGGGCCAGGAGGCTCC
    AGCAGAGGCCACACAACCAGCCCAGAATCCAGGACAGAGAGCTGGAATGGAGACAGGGAAG
    CCAGATACCAGGCCAGACTGGCCAGGTGCTACAGGCCTGTGGGCCAGGCCAGGCTTGGGGAC
    TTCGTCCTGGGTGTGAAGGAGACAGGCACCCCTGAGGCCTTCCCTCTGCATCTCCAGCCCAAG
    CTAAGCGCAAACTCTTAGGTTGGAGTAAGGAGTAACCCCCTGCCAAGTTTCTCCTGTCCTCAG
    GCTCCACCGACCACCTATGCTGCCTGGCCCCATGGGGCACACGCTCAGGCCCAGCCTGGGAAA
    GCAACTGCACCTGCCTGTGCTATGCTGGCCCTTCTCAGCCTCAATGCCCTCCTCCCTCCCCGAC
    GCACCCTCGTGGCCCCCGCTGGGCCCCCTGATGCACCCTCATGTCTCCATGGCAACCTGCTCA
    GAGTGTGGCCCTGCCCTTGGCTCCCCTCCACACCTGTGTCCCAGGCAGTGCCACGGCACTTTCC
    TAAACAGAAGGATGGGCTTCAAAACAGTCCCAGACACTAAACACACCTGCATTTTGGGTCCA
    AGTAACTTCTGACAAGACGAGTGCCCCTACACACTCTCAGTCCTATCCACTATGGGCAAGGAG
    CCTGAAGGATCCCCCAGAACTGGCTAAAGCCCTCAGTCTCCTCCTCCACCCTGAGCACCTTCA
    CGCGGCAGAGTGGCCCTGGATGTCAGCTTCTTGCTCCCCATGGTCTGCACCTGGACAGGTGCT
    CTCAGGTGTGTGGGTGGGCAGGTGGCAGGTCCCAAGAGCCAGGTGCAAAGAATCTAGGCCAG
    TGCCCACGAGTGCTGCAGTGTCTGTCCCCAGCATGGTATCTAGGGCTCCACTTGCCTATCAGCT
    GTAATCGGAGGAGGCTTTCCAGGCCAGGCCTCCCCCAGGAAGGCTGCAGGCACTGCGGATCG
    TGCGCCCTCACATGCATTATTCCTGAGGCCCTTCTGCAGATGCCATCAGGGCAGCAACTCTGA
    TGAGGTATTAGGGCACAGCACACAGGGCTAAGCCACCCTGTACTGGGCCAAGCGCTACAGGC
    GCCACATGACCCTAAGGGTTACCCCATCCCACCCCAACCCAGGTCTGGCAGGTCCTCAGAACA
    GGAAAAGCTGAGCACTGCCCAAGGCTGCTTGCTGGGCCAGTCAGAGAGGTCTCTGCCTTCCAG
    GATCAGAAGTACAGGCTGAAAGCAGCCTTGGGCCCGCCTCCCTGGGAGGCTACAGAGGCTTC
    AGAGGGTTCCCTGAACTCAAAACCAGATGTGAGACTTGAATTTGACTTACCCCTGGTTCACCT
    CCCAAGCAAAGCAGGGGTCAGCTTTGGCTCCTCCAGGAACCAGGAAGCTTCCAGGTACCCTGT
    GGAGCCCCCTCTGCTCCTGAAAAGTTGCCACCTGTGCTTGGTGGGATGCCAGGTGGTCTCAGA
    TTGACCCTGGGGTCAGCGGTGAGGGACAGGAAGCCTACAGCGGGATCAGGATGGGGATGGGG
    CCTCCTGTCCCATGGCTCTGCAGCTATGAGGCAGCTTTCCTAGGGTGGGTCTCCTGGCTGCAGC
    TAAGACCAGGCAACAGGATTCAGCAATGACAGGGCTTCTTCTACTCCAGGGCTCCCTCACCTG
    GTTAACAGCAAAAAAGAAAATACAGTTCCTGCTAGCAAGGTCTATAGAAAGGAGGTGAAGGA
    GTCAGGCCTGCAGCTACCTCTCCTGGACAGGAGCTGGTCAGGATAACTTGGACCCTTGCATGC
    GGCAGGCCCACAGGCACACAGCATGAGGCCACTCTCTCCCCCGGGGGAAGGGCTTGGTGAAG
    AAAGGATTCCCCTGAAGCACAAAGAAAGCACAGGACCACTGTGAAATTTCAAGACAACTTTTTA
    TCCAGACAGGCGCCTCTCAAATAGAACACAGGGAAGTTAGGCAGCAGTTACTAAAATACAGT
    CTGGCCAAATGATTTACAACAGAACACAACAGGAGCAGGGGATCTGTGGGTGGGGCTGGGCT
    GGGCCCTCTATCTCACAGGGCCTGAGTCAAGCCAGCCCGCCCTGCAAGGCAGGGGCTGACCT
    GCAAGCGGAGATCTCACTTCCTCTTACCCCAAATTCATACCTCCATTTTCCCCGCCCCCATCTC
    TCCCCAGGGTCCTCAAGTGGGAAAGGGAGAGGTAGCATCCCTCGGATCCAGGCCCACTCCAC
    TCCGTCTCCGGCACCAGTGGGCAGGCTGAGTCTGGGCCTCAAGGGGCCCTGGGCTTAGGGTAT
    CTATGGCAGTAGGAAAATGACATGGACAGGCTCTTCAGGGGTAGGCTAAAGTCCTCTGGCCA
    GCAGTACCCAGAGAAAATGGGCAGCAGCAGGTAAACCAGCCAGGAGGTGGAGTCCTCTGAAC
    CCACAGCAGACCCCACCCTCCTGCCCAGCCCCTGCCCACATTGGGGGTCAGGACCACTGAGAC
    TCTGGTCAGGACAGTGGGTGCTCTCAGCAGTGTGGCAAGCTCAGAGCAGAGCTCCCAAGGAC
    CATACCACACTGGTTCAAAACCCATAGGTGACACCATCCCAGCAGAAGCTTCCATGGGTGCTG
    GATCCCAGGGCTGCATCCTGAGCACAGGTGGGCAGACTGGAACATAACACTAGGACCCAAGG
    GATCCAGAACATTTAGGCCCATCTCCTGGGCTGCTCCAGCCTGTTGCCATGACTTGGGCAGT
    GAGTGGGCCTCCTGCCAGGTGGCAGGGCACAGCTTAGACCAAACCCTTGGCCTCCCCCCTCTG
    CAGCTACCTCTGACCAAGAAGGAACTAGCAAGCCTATGCTGGCAAGACCATAGGTGGGGTGC
    TGGGAATCCTCGGGGCCGGCTGGCACCCACTCCTGGTGCTCAAAGGGAGAGACCCACTTGTTCA
    GATGCATAGGCCTCAGGCGGTTCAAGGCAGTCTTAGAGCCACAGAGTCAAATAAAAATCAAT
    TTTGAGAGACCACAGCACCTGCTGCTTTGATCGTGATGTTCTAAGGCAAGTTGCAAGTCAAGGC
    AAGTGTCCCAGAGGCCCTGGGCAGCTGAGTGCACCTGTGTTTGATCTTCCCCTGATGATGGAC
    ACTCCCAGCTGACCATCCAAACACCAGGAAAACATCCCCCTTTCCTGGGCTCAGTTCCTAGTC
    TACTTGCTGGTACGAACCCAACCCACACACTCCCCGCCCACAATGCAGCTCCTCCAAATCCT
    CCCACAAGCCACCTTTGTGGGACTTGGAAGCTGCAGGATGGGCCCTGCCCTCTGCGGGAAG
    CCAATCCTAGCAGAAAGGTAAGCTAAACAACAGTCTCAGAATCTGAGACCCAGTGACT
    GTTCCCCCCGCCCCAGGCCTTGGGCCTGAAGTGGGGGCCTGCCTGTGGCCTCTGTGGTGGGCT
    CACTCCCACCCCCAACAGTGGCCCCAGGAGAGGCTTTCCCAAGAGTCTTCAAACTCCACCCAC
    CCCAGCCCTAGCATCAGGGACTCCCCACCCCCCACTGGAGTGTTAATATCATTAATGTACAAA
    TAAGATCCAAAGATATACCAAAGATCGAGAAACAGCTGGCTCCGACCTCCCTCCCACAGAGC
    CTTCCCAGGGTTAGCTGAAAAAGAGCCCTTTGGCATCTACAGAAGCCAGTCGGAGTTTATGGT
    TTCATTTGCCCAAAAATACACCTTTGGGGACCTCAAACTTTCCAAGAATCACTACCACACAT
    ATGAATTTGAACATTCGCCACCCTTCCACCATCCATTTCTCGCAGGAACTTCAAAATAAAAAT
    GGCCAGTCTGCCGCCACTCTGGCTCCTCGTCTATGGCTGTCTCTTCTTTTCCAGGGGCTGCAGT
    TCTGATGTGAATGATGGTGCCATTCCAGCATTGGGCCTCTGGCAGGCTGCATCACATGATGGC
    ACAGCATGAGTTTTGTTTCCGGGCCTTGGAAAAAAACAAAGAGGAGCTGAGAAGGAGGACTG
    ACGAAGTAAGGGAAGCCCCAATCCTGGCAGGCGTGGCAGAGGGAGCTCCACAGGACACAGC
    CAGGCAGAGAAACTAGCACTAGAACAGGGTGGGGGTGGAGGCCTTGAGGGAAGCTGTCCAC
    AAGCAATTCCCATCACCAAGCACAAGGCGGGCCCCGGCTTCCAAAACTAGTCTGGGATCCTTT
    TTCCTTTCTTTCTCACACCCCATTAATGCTATCAAAAAGTGAGTAAAATTCCTACAGTTAGGC
    CAGGTACAAACAAAGGACCAATAATACAAATGGGATTGGCAGAATATCTTAACTTTGGCCCA
    CTCCTGTCTTCACACAATGCTATCTGACCACCACGGTGGTGTTTCTCCTAGAAGATGGTCCTG
    AGGACAACAGATGTGGTTCCCACTTGGGATGTGGTTTGTGGGGACCACTGTTGCCACCTTCTC
    TCTTGCTTTCTGGTCACAGACTATCTTCCTAATCCCACCTAGCCATCTCCCTCCAATGTGCACA
    TGAAAGCAAATGTGTGTGGACAGACCAAGTAAATTTGTCCCTATGACTATCCAACCATGGGCC
    AACAGTGCCATCTCCAGATAGGAAGACATGAGCACTGACCTGAGAGAAAGCGGCAGTCAGCA
    GCACCCATCCTTGTCAATTAAATATTTTCTGTCAAAGGGAAATTAAAAGCTTAAGAACCTCTT
    CAGGAAGGCTGAATTGCTTGCATCTTAAAGACTTATGTCTACTCAGCAGAAAGAGGAATAAG
    ATTCAACAGTAAATCTCTGGTGATCAGAACTGAACCAGCCTTCCTGGACTGGGAGTAGGAGT
    TCAGAAATCAGCCAGAGCAGCAGAGGGCAGAGCAGAGGCAGGAGTGGAACAAGGCCTCGGC
    CCGCATCGACTCCAACGGCGCCCAAGTGAACTGCCTCCAACCACCTGGGCCTGAGGCGCTCAC
    CTTAGGCTCTTGCCGCACAAGGAATCATCCACCATGATCAACAGTCTAAGAAAGACCCGTTC
    ATAGTGGAGAGTGCCAGAAGCAGCAAGCTGCGACTGCTCTCTAGAGAGAACACCCAGGAGGC
    AGCAGGTGCTGGGTACTCACAGTTTATAGAAGGCTTTAGACTGTGTTCCCAGCACCTCGGAT
    TTGGACACCAAGTCATCTAGCTTCTCACCTCGCTCTAACAGAGACTCCATGGTGTTGTGCTGG
    ACAAAAAAGAAAAGAGAATCCAGCTCTGTTCAGTACGTGCCCTGACATGAGCCCCTCATATTT
    CAGTCATGGGGGAAAGTGCCTTACCTGGGTTCCTCTCCAACACACACAAACTTCACCTCTAGG
    TGTCGAGACTCGGTCCAAGAATAGTTACTGTCCAAGTGGATGGAACAGAACCTGGTGACATTC
    CCGTGAAATCTAGAAGATCTAACTGGGATGTAGCAGACTTCCCAAAAAGCTGTCCCCAGCAC
    AGGCTTAGATAACCAGCACTCCAGGAAAACTCATATATATATATACACACACATTTATATATA
    CATTTGTGTGTGTGTGTGTGTGTGCACGCACATGTGCGTGTGCATGGAGCTTGGAAAAAAGA
    GTTCCCCCCGCCCCAGGCCTTGGGCCTGAAGTGGGGGCCTGCCTGTGGCCTCTGTGGTGGGCT
    CACTCCCACCCCCAACAGTGGCCCCAGGAGAGGCTTTCCCAAGAGTCTTCAAACTCCACCCAC
    CCCAGCCCTAGCATCAGGGACTCCCCACCCCCCACTGGAGTGTTAATATCATTAATGTACAAA
    TAAGATCCAAAGATATACCAAAGATCGAGAAACAGCTGGCTCCGACCTCCCTCCCACAGAGC
    CTTCCCAGGGTTAGCTGAAAAAGAGCCCTTTGGCATCTACAGAAGCCAGTCGGAGTTTATGGT
    TTCATTTGCCCAAAAATACACCTTTGGGGACCTCAAACTTTCCAAGAATCACTACCACACAT
    ATGAATTTGAACATTCGCCACCCTTCCACCATCCATTTCTCGCAGGAACTTCAAAATAAAAAT
    GGCCAGTCTGCCGCCACTCTGGCTCCTCGTCTATGGCTGTCTCTTCTTTTCCAGGGGCTGCAGT
    TCTGATGTGAATGATGGTGCCATTCCAGCATTGGGCCTCTGGCAGGCTGCATCACATGATGGC
    ACAGCATGAGTTTTGTTTCCGGGCCTTGGAAAAAAACAAAGAGGAGCTGAGAAGGAGGACTG
    ACGAAGTAAGGGAAGCCCCAATCCTGGCAGGCGTGGCAGAGGGAGCTCCACAGGACACAGC
    CAGGCAGAGAAACTAGCACTAGAACAGGGTGGGGGTGGAGGCCTTGAGGGAAGCTGTCCAC
    AAGCAATTCCCATCACCAAGCACAAGGCGGGCCCCGGCTTCCAAAACTAGTCTGGGATCCTTT
    TTCCTTTCTTTCTCACACCCCATTAATGCTATCAAAAAGTGAGTAAAATTCCTACAGTTAGGC
    CAGGTACAAACAAAGGACCAATAATACAAATGGGATTGGCAGAATATCTTAACTTTGGCCCA
    CTCCTGTCTTCACACAATGCTATCTGACCACCACGGTGGTGTTTCTCCTAGAAGATGGTCCTG
    AGGACAACAGATGTGGTTCCCACTTGGGATGTGGTTTGTGGGGACCACTGTTGCCACCTTCTC
    TCTTGCTTTCTGGTCACAGACTATCTTCCTAATCCCACCTAGCCATCTCCCTCCAATGTGCACA
    TGAAAGCAAATGTGTGTGGACAGACCAAGTAAATTTGTCCCTATGACTATCCAACCATGGGCC
    AACAGTGCCATCTCCAGATAGGAAGACATGAGCACTGACCTGAGAGAAAGCGGCAGTCAGCA
    GCACCCATCCTTGTCAATTAAATATTTTCTGTCAAAGGGAAATTAAAAGCTTAAGAACCTCTT
    CAGGAAGGCTGAATTGCTTGCATCTTAAAGACTTATGTCTACTCAGCAGAAAGAGGAATAAG
    ATTCAACAGTAAATCTCTGGTGATCAGAACTGAACCAGCCTTCCTGGACTGGGAGTAGGAGT
    TCAGAAATCAGCCAGAGCAGCAGAGGGCAGAGCAGAGGCAGGAGTGGAACAAGGCCTCGGC
    CCGCATCGACTCCAACGGCGCCCAAGTGAACTGCCTCCAACCACCTGGGCCTGAGGCGCTCAC
    CTTAGGCTCTTGCCGCACAAGGAATCATCCACCATGATCAACAGTCTAAGAAAGACCCGTTC
    ATAGTGGAGAGTGCCAGAAGCAGCAAGCTGCGACTGCTCTCTAGAGAGAACACCCAGGAGGC
    AGCAGGTGCTGGGTACTCACAGTTTATAGAAGGCTTTAGACTGTGTTCCCAGCACCTCGGAT
    TTGGACACCAAGTCATCTAGCTTCTCACCTCGCTCTAACAGAGACTCCATGGTGTTGTGCTGG
    ACAAAAAAGAAAAGAGAATCCAGCTCTGTTCAGTACGTGCCCTGACATGAGCCCCTCATATTT
    CAGTCATGGGGGAAAGTGCCTTACCTGGGTTCCTCTCCAACACACACAAACTTCACCTCTAGG
    TGTCGAGACTCGGTCCAAGAATAGTTACTGTCCAAGTGGATGGAACAGAACCTGGTGACATTC
    CCGTGAAATCTAGAAGATCTAACTGGGATGTAGCAGACTTCCCAAAAAGCTGTCCCCAGCAC
    AGGCTTAGATAACCAGCACTCCAGGAAAACTCATATATATATATACACACACATTTATATATA
    CATTTGTGTGTGTGTGTGTGTGTGCACGCACATGTGCGTGTGCATGGAGCTTGGAAAAAAGA
    GTAGCTGGGCACTATATGATTGTACTGGGTTGGAGAGTGACCCACACCGCACCCCCCAACCCC
    AACCGCATCCCAGAAATTAACATCCCCAGAATCTCTGAATGTGACCATATTTAGAAATAGGGT
    CTTGGCAGATGTAACTAGTTAGGAAGAGGTAATACTGGATTAGGGTGGCATCTAATTCCATGA
    CTGATGCCTGGTAAGAAACGGAAACACACACACAGAAGGTCACGTGACGGCAGAGGCAGA
    GCCTGAAGTGATGCACCTCTAATCCAAGGAATGCCAAGGATGGCCAGCAGCCACCAGAGGCT
    GGAGAGAGGCCTGGGACAGACACTTCAGAGCCCCAAAAGACACCAGCCAGGCCCACAGAGCT
    ATCGTTAAAAGCAAATATTTGAGGGTTTCTGTTGACAGCAGCCACAGGAAACAAAAGGCGG
    TGGGAAATGGCTATTGAGCACTTGATGTGAGGCAAGTCCAAACTGAGCAGCGCTCTGAGTAC
    AGACACACCAGATTTCAGATGCAAACTCACACATGCTTCATTAGTAAGTTTTATACTGAAAAA
    AAAACAAGTTTTATACCGATTACATGTTGGAAAAATTGTATTTGGATATACTGCGTTAAGTAA
    AATATATAATTAAATTAAATTCTACCTATTTCCTTTTATCATTTTAAAATATGGCTCCTAGAA
    AATTCTAAGTTACACACATGCCCCAAATATATACCAGACAGCACTATGACAGAACATGTCCTG
    CCTTCTAAATGGGCTATGTCCTAAATGTCATCACTACAAACTCTGACTTAGGAAATGTGAAAACA
    CTGCCCCATGGGAAGGGGTCTAGAGATGGAGACCTCACAAGAGCCAGCAGCTCTGCTGCCA
    GGGCCCTCAGGAAGCAGCAGCTCGCTTCTCTCCTCAGATGGCCACTGCTGCAGCAGCTAGATG
    CACACATGAAGCGCCATCGAACAAGGAGCCAGCAAGAATGTCCTTCATCCCTACACACAGCT
    GAGCGACTCTAAATTTTTAACACAGAAAGTTAACTGATTCAGATATGCACACCAATCATCTAGA
    TTTTACAACTGCAGCTAGATGAGGCTGGGTGAATAGGACTCATCCACTCCCCACCGTGGGGAG
    AGGAGAAACAGCGGGTGTCCCAGGTGTCATGGTACTCAGACTAGGACTTGAGCAACAGAAAG
    AGATGGCTTGAGGAGAAAACGGAGAAATGCACCTAGGTGGTAAGAAAGCTCACAAGGTTTC
    AAAAGACACAGATACCATGAGACTTTCACATCTATCGTTCATTCCAAAGCCACGTTATTTGGA
    GTGCAGTCAGCACACCTGTGTTTGAAGCCCCTGGGATGCTTTTTATAAAATGCAGGTTCCCAG
    GCTCCATCGCAGGCCAACAACTCCAACCCCAGGAGACGCTGATGTACACACTAAAGCTATGC
    CTGTGTAAATGGTAAAGCTTTGTATGTGGGTTTCAATCCACTCCAGGTATCTATCAACTGCTGA
    GCATGGTATAAACTAGGCACTGTATCATGAGCAGGATGGAAAGATGTCCCAGTGCTCATACG
    CTGGTCAGGGAGACATGTAAACAAGCAGTGACAAAACTGTGACATCTGGTCAGAAAGGCCCA
    ACCTTCAGGCGCCTGTGTGTGAGCTGGGCAAGAAAGGGTATAAGAGAGAACAGGGCCCAGTC
    AGGAGACTGTGAGTTAGTTTGCACTTTATCCTGGGGCGGATCTGAGAGCTGCGAAGGGTTCT
    AAGTTGTGCAGATCAATGACTACTCTCTGGTGGACAGACTGGAGGTGAGCAGGAGGCAAGGG
    GACCACTTAGAGGCAAAGGCTGTAAGAGAAAAACCTGAGAAAAACAGATAGCTGCTTACATT
    CCACTTGTAGCAAAAATTTAAAAAAAAAGAGTTGAAGCAACAGTTACAAATCAGGAGATTT
    CAGCTCAAAATGCAGGGTTCTGGCTCTTTTCAAAGGGGCCTATGTGACAACCCTGGGCCCATA
    TTCCAGCCGCTGCCCTGTGGTCAGTGCACGGTGCTTCAATCTGTTCACCTTCAATGCAAACGCT
    GCAAGGGGAGGCACCTGTGGGGTGTGGAGGCACCCGAAACCCTAACAAAGGCACCAGGGTG
    GGAATCCAGGTCTTCAGAAGCCAAACCCTAGGAACCCAGTAAATGGTCAGACAGGCAGTAGC
    CATGAGGAAGGGAGACTTGAGGGTTCCACTGGTTCCCAGCTTGGTCCCCTAGAAACAATGGGT
    GCCATTAACCAAGAGAAGGGTATAGGAAAGACAGTCGATGCCCGGGGTGGGGGAAGGGGT
    GGGCAATCCCACTTGCTGGAGAGTGCCGTGGTTACTATTATATTAAAACGAGGATGGATCTGT
    GCATGCCTGGCCAGTGGAAATCGCACCCCCGCCTCAGTTCTTGGGCTTGCTCTCCATCTTCCTG
    CTTACCAGAATGATTTTGGTCTCATCTAGTTCGGCCTGCACTTTAGTCATGGGATCAGCTTCTC
    GTGGGTTCTAGGAAAGAGTGAAAAATAATAAAGTCAGGACTGGAGTGGCTACCTGCAAACAA
    AACCTAAAACTGAGGAAGCTGGACAAACTTTCACAGGTTAAAAACCACAGCCTGGGCCGGGC
    ACAGTGGCTCACGCCTGTAATGGGAGCATTTTGGGAGGATGAGGCGGGTGGATCACCAGAGA
    TCAAGAGTTCGAGACCAGCCTGACCAACATGGTGAAACCGTCTCTACTAAAAATACAAAAAT
    TAGCCAGGCGTGGTGGCACATGCCTGTAATCCCAGCTACTCGGGAGGCTGAGGCAGGAGAAT
    CGCTTGAACCCAGGAGGTGCAGGTTGAGTGAGCCGAGATTGCGCCACTGCACTCCAGCCTGG
    GAAACAGAGTGAGACTCCAACTCAAAAAACAAAAAACAAAAAAAAAAACCCACAGCCTGTT
    TAACATGTAACAGAAACCCAAAGCCTGCCTAGAGCTTGGGTTCCCCGGTCTGAACGTAGATTC
    TCTGTTTTCCAAACAGTAAGGCTTGAGAGAGGACACCAGCATCAGAAGCTGTCAGAAGTAATT
    AGACCAGAACTATCAGGGCAGTTGGCTTTTTCAGTTTCACATGGATTCTGGGCCACATGGTGT
    CTGCTGAAGCTTCCTTTAACCCTACCTGGTATCTACTGAGGTGACCATCCAGGGCTGGGTAAT
    GGATTGTAGCAGGGGATCCTACTGGCCAGTCTATCCTGTCGACTTGCTTGGAGAATTCATCTA
    GTACCTGCAAGACAAAGGAGACTCAACAAGCCTCCCACTGTGCACTCACCAGTGGTCTCAATG
    ACAGGGCTTCACCCCTGAGCACCTCACCCTGAATGAGGCTCCTTGGCCTTCACAGCCCAGGAA
    GGAGGAATGAGGGGGACATATAATGGCAACAGAGAAAATCTAGGCTAAAGTTCTTTCCAAAT
    TTTTATCATTAAAACATATCCTAAATATTCTGAGAATCAAAAGTATGCCCAGCCCGAGATGAA
    CCTCACTTGGGGAGTAATAAAGGTATTTGAATTTTAAACTACAGATTTCCAGAAAAAAGGGGC
    ACTGGTCCTCTAATTTTCCAAAGCAATTTTTTAAAAAAGAGAATTAGGTCCCCTAGATTTAAG
    AAACCACCAGATTCCATGTGTTTGGAGGTATTTTGGTGCTCTGGGGTATAGGATGAAGCCTCT
    GACTTCAAAGAGTTAATATTAGTAATTAGCACCGTACGCAAAAAAATTTAAAGAATGCTTAGG
    TGCTAAGCTCTGTGGTGCAACTGACTGACATCAAGGTAGAGGGATGCAGCAACTGCAGGAGG
    CAATGGGGAGAGTGAAGGCATTCAAGAGGGAGACTCCTTGAGCAGAAGCACAGGGGGCGAG
    AACACAAGGCACAGCTGTCTCCGAGGGTCCCATCCCAGAGAATAGATGCTATGACTCAGTGG
    CCTAGACCCAGCTCACATGAGGGACAGCACCGGGGAGGAAACCCATACAGGGATGCCAAATT
    GTCTCTTGGGTTGCAGGGAAGGGGGCTGAAAAATGTGTTGACTTTGGACACATCATTTCATCC
    CTTATGTCTCAGGGACTGCCATCAACCCCTGTCCCAGTCCATAAATGTGCCCATTCATCATCCA
    AGTCCAGGAGAGGCAAATAAAAAACTCACCTTCTCCAGCAAGGTAAAGGCCACCCGGGATGG
    GTATTCATTGTCAGCAATGACCACACCTGCAAGACTATCATTCCGGACGTAGACGTGGCACAG
    ATAGTCTAAGGAGACAAGAGATCAGACACATGGATGCTGACATGAGGGCTTCAGACTTCTTTT
    AATCCCCCCAAATCAAAGCATCCAATGTTAGGCCAAATGAAGCCACTCGGAAGCTCAATAGC
    TCTGGGCAAGTCTTGTGGAGAGGCTTAGCAGCACAGCCCAATGGGCCACACACAGGAGCTTG
    GCCCAACGCCTGCTTTAGGACCAGTAAATACCCAGAGGCCCAGTATGCAAAGCCAGGGCTTA
    AAGAAACAGCCAGTGGTGCAGAAAACACACCCTTGACAACATGGCCCCAGGAGCATTTCCAA
    GTGTATTCCTTAAGCTCGGGTCAGGCCAAGCTATATCTTAGGGATCTGGAGCCCTTGGGGCTC
    TGTGCTGCTCCCAAACTTAGGGAACCCTGGACAAGCCAAGAGGCCTCTGCTTTTCTTAAAAAAT
    CTTTTCAGAGCAGCCAAAAGACAGGAAATTACCCCCCAGGGCCTCAGTCTTCCATATATAGC
    AACCTGCTGGGTTTGCTCCACTCTGGTGGGTGACTGGGAGTAGGGGGGTAAAGTCTAGAAAAAG
    ATTAGCTACTGCCAGCTAAGGCCTCCAGAGCACTGTGCTAAAATCCTCATATGATTGAAAGGT
    ACAGTTGTACAGGTCTTCCGCAAAATATTCACAATCCACAGGATTGTTCATTCCATCACTTTG
    AAAGGATTCAGAGTTGATACAGCTAACCATATCCCCAAGGAAAGAGAAATGTAAGGATTACA
    GCTTACAAATAAGAACCTTCTTGTCCTAAAAGGATCTGACCCAGAAGATTCCAATGCTAAACAA
    CAGAAAAACAAATAAAAGAGGAGGGAATGATGGTGAGCCCCTGAAATCAGAAAAGAGCAGA
    GATAAATGAGAACAAGAATGAGGAGGAGGAAGAGGACAGGGGGTTGTCACCAATGCTCTCC
    AGATTTTGTATACCATCCCCAATTAAGATTCAAACATGGGGTCAAAGTGCATACCCTCCAAAG
    AAACTGAGAACCTGGTCAGTGGAGGAATTGTCTTAAGTAATAAACGTGGGAAGGGCAGGCA
    CAGTTTGAAGAACAGAGCAAGAACACTGAAATATTTGTGATGCGATTTCACTTCTATGATGTT
    AATAGCACAGAGATCCCACATAAAGTGTATATAGTCAATCCTGCCTGTATCATAACTGACATT
    TATATCATCAATTCAGTAACTCTATGTCACGTGACTTGAGGTTAGCATAAGTGTGAGATGATC
    TTTGTCCCTACCTGATGAAACTCATGTAACTCTTTCCTGATCTGTCTGTATAACATACACATCT
    AAATAAATGCCTAAACCTGAATTATCAGAAAGAAAAAATAGTTTTTTCAGATTCCTGATCAAA
    AAATCTACGATGCACAGAATACATATAGTACCTCAACAGTGCTAGCTGGAAATCCTTTTTGA
    GGGGTCTGCAACTCTGAAGAGGATAGGGAAGAATACGATATGAAGGCTGCTTACTGCTCCAA
    AAGAGTCAGACCCTAATCTTAAATGAGTCTAAGTTTGAGGGCAATTTTATCTGGGAAGCTCAG
    ACTTCAACAGTGGGCACAGAATTCTGCATAAATAGGAAAAGGAAGAGGTGGGAAAGAGAGA
    ACAAGCTAGAGGAGGAGTAGGGTCCCAGTAGAAAGGAGAAAGCTGGGTGCTATGTGAGGTG
    AGGCATGGCAGCCAGGCCAGCACACGCACAGAAGTTGGAGGGTCTTCTTACCTTGTTCTTTGA
    CAGAAGCTCTAGTGCCTTTCGATGAGCGCTCCACAATCAGTGACTCGTGAAGGTCATGAATT
    CCTGAACGCTAAGAAACACAAAATGTATTTATTGCCTACTTCTTATCACCTTGTCCCCAACACA
    GTGGAAAGTGACCTCTGGGCTTATACATTAAGTAGACATTGCTTCTTGGTTTCATTCCTTTCCC
    TCCCATCCCTAGTAACAAACACTCTATAAATGAGCACAAATACTGATAATTATGAATTATCAT
    CACCATGAAAGCTCCATCTGTTTGCTACCTGGCTCACCAAAACAGGTGAATTTTCTGGGGGGT
    TTTTCCACAGGATACAGTCAATTTTACATTTTGGTGAATGCATAATTTGGAATGCAATGGAAA
    AACAAGAGGCAGGTCCTGCTCTCAAGGTCCCAATAACTTCCAAGAAGCAGGACATTTATAAG
    AACTGCACTAGAAGAATAGTGTGCAAAAACTGTCAGGCAGAAATGCACAACCATTTATGGCT
    GTGTCCACATGACAGACCCTCGCAATGCCACATACACCCATAGTGAGTGCTGGCTCAGGTCTG
    CTGGGGCTCGTCCACAGAACGAGCGCAAGACACTCTGGATGGAACAAAAGGAAAACTGCTCA
    TCCAAGACAAAGAAGTGGGAAATGGCTCATACAAAGGGTGAAAGGGAGAAGGTCCATCATG
    GGCTCAACAGAGAGATCTATCCAGAACAGAACAGTCACAGGAGATGGTACAGCCAGAGGAA
    GAGGTGCTGACAAGGAGCCTCCAACTGAGGATGTGATATAAAGGGCAACCAGGGCCATCAAA
    GCAGGGTGCTCAAATGGGAGTCTGCAGCAGGCTCCAGCAGAGCCATATAGGTAACTGAAGGC
    CTGACTCTGGGCCTGTGTGCTGTGCCTCCACATTAAAAAAATCAAGATTTGTGCAACAGTTAA
    ACGAGGTAATACGTGTAAAGCACTTGGAACAATGCCTGCACACACAGTATTACTTGTTAATAT
    CTTGAGGGACTGAAGTGATCAAAATAACCCCTCAGAAAAGAAGACCTCAAACAAGGAAGGCT
    TTGCAGTAAACCTAGAGACAGCATTTGAGACACGGCTATAAAGAGACAAAGGAAGAACTGCA
    TTGTGACAGCATGTATACAAAGACCAAAAAAGCTGGGAAACTACTTTTTCAACTTTGGAATCG
    GGTAATATAGGGCACAAAGGACGTAAGTAAAGCGGTCTTATAAGAAAACAAGCTCAGGCCG
    GACGTGGTGGCTCAAGCCTGTAATCCTAGCACTTTGGGAGGCCAAGGCAGGCGGATCACTTG
    AGCTCAGGAGTTCGAGACCAGCCTGGCTAACATGGTAAAACCCCATCTCTACTAAAAATACA
    AAAATTAGCCGGGTGTGGTGGTGCGCGCCTGTAATCCCAGCTACTGGGAGGCTGAGGCAGG
    AGAATCACTTGAACCCAGGAGGCGGAGGTTGCAGTGAGCTGACACTGTGCCACTGCACTCCA
    GCCTGGGTGACAGAGCAAGACTCCATCTAAAATAAAATAAATAAATAAATAAATCAGCTGGG
    ACATGTGTTGTTTTAAGACATATTAGTAGAGATGTCCCTTTAGTGTTGCAGCTGTTAGTCATTG
    GAAACTAGTGTGGGCATCCCAAGCAGGTGAGGTATAAGTCCTACAAGTGAAATCTCTGAGAA
    TCTTAAGTACTAATGGGAAGGAAAAAGGAAAAAGAATCAGAGCCAAGTAAGGCACCAAAAGTT
    CCATCTGAGAAAAGCAACAACACAGAGCAGTGAATGTAGGCCATGGTAAAGACTGCAAAGAC
    CAAGAACCCCAAGAAGGAGCTAAAAGATAATGCAGCAATTCCGCTTCTGGGTAAATACCAAA
    AAAATGCGAGCAGGGTCTTGAAGAGATATTTGTACATCCATGTTCATAGCAGTATCATTCACA
    ATGGCTGAAATGTGGAAGCAACCCAGGTGTCCACTGACAGATGAACAGATAAGCAAAATGTG
    GTGAATAATACAATGGATTATTCAGCCTTAAAAAGGAAAGAAATTCTGATATATGCAACAAG
    ATGCATGAGCCTTGAGGACATTATGCTACATGAAATAAGCCAGACACACAAAAACTATATGA
    TTCCATTTATCTAAGGTCGCCAGAAAAGTCAAAATCACAGAGACAAATTAGAATGGCAGTTGC
    CATGGGCTGGGGGAGAAGGGAATGTGTTTAATAGACACGAATTTGATAAAAAGGAGTTCTGG
    AGACGATTGACAGTGATGGCTGCACAACACTATCAATCTATTTCATATCAATGCACTCACTAC
    ACGCTTAAAGATAGTGAAGATAAATTTTGTGTACCATTTTACCACAATTAAAAATATTTTTTTA
    AAAGAACTCAAAGAAGCAGAAAGTTTCAACAAAATAACATTTTAATTTTAATTTACATCCAGCAA
    GTCCTTGGCAAAGAACTCTCATCAAGAACCAGCTGCACTGAAGCAGGGAAAACAGAATCCAA
    ACGGCAGATTCCATCAGATTTTGAGACAAGATGACCATAGATACCGACCATGTAGGGTCCTCC
    TTCTTTCGTGCCTGAGTCACCCCAATCCCTCCCACGAATGGTCTGGAAGTGTCTGTGTTACTTC
    TAACACGTTCCAGCAATTAAAGCGCCCCAGAAACAAGTAAAAGCCTGTAAGCCCTACAGATC
    CCATGCTTCATTTGCATCTTCCGTGTGGAATCCTTTTGTACCACTAGTGTCCAACTAAAAAGCG
    TTAAACCTGGCTTTTCAGTTCTAGCTGGTTGTGATATAACCTCTTGGTACCTCAGTGACTTCACC
    CATTAAAAACAAACAAAAAAAAGTATATCACTATCTCTCATACAGAATTGTTGGGAAGCCCC
    GCAAGAAAATCAAAATATGGCTCTCAAGATGCGGCACCCAAGCTCCCAGAGTCAGAATCACT
    GGGTGGGAAGTGTTGGTCTAAAATATAAATACCGAGGCCTCAATCTACTAATTCAGAACATCT
    TGGCATGAAGCTTGGAAATCTGCACTACTTCACAGTCTCCTTAAAATTTTTACACGACAGAAA
    TTTGAAAAACACTGAGTAGAGAACTATATTCTAGAATGGTATAAGCTCTTAAAGAGCTAATGT
    TGGTTCCTCAAAGGTAGAGTCCACGGCCAGATTCCATTATAGGAGACCAAGCCCGGACAGCA
    GACCCCGGGCCCTCCCCACCCCGCCCCGCCTCTGACTCGGACACCAGCCTTTCTCAGACCCCGG
    GCACTCGGCCACCCCGCCCTGCCCCTACCCTTGGCCTCCTCCACCCTCCCCTCATCCCTCCGCC
    GACCCCAGGCCCACTCCGACTCGGACCCCCACCCCAGTCCTCTCCGCCCGACCGCCACGGCCC
    ACCAGCCTGTGCCGCTCACCTGGATCTCTGGAAAAAGCTGAAGGAAGACACATCGTATGCGG
    CTTTGAGCAGCACCACCTTGGCCTCGCCTTTGTAGAGGACGCTGAGGCTGTACAGCTTCATGG
    TCCGGCCCTCAGGCCGCCCGCCTGCCCAGCTGCGGGACCCGTTCTCAGGGAGCAGCGCGGCCG
    CCGCCCCTCGGGACCGCCGCCGCCTACCGGCCTCTCAGCAGCCGGCTGCTGACGGGGCCACCG
    CCGGCTTCCTCCTCCTGGCTCGCAATCCACTTCCGGATCCGGTCAGCCTGGTTGAGGGTTCTCA
    TACTCCGGATGCAGAAATGTGAGCCCGGAAGTACAATGCAGCGAGGGGCGGGATGCCACGCC
    TCGCGTAAGCTTGGCCCCTCCCTGCTCGCCAGGTGGAGTCGGGCGCGCGGCGGGATACCGTAC
    TGTCTTGTGCTGGGTGGTGCTGGGCCTCCCACAGCGGCCTGAACCCTTCTTTTTTTTTTTTCT
    TTTCTTTCTTTTTTAAAGTAAGCATTTTTTTTATTATTATACTTAAGTTAGGGTACATGTG
    CACAACGTGCAGGTTTGTTACATATGTATACATGTGCCATGTTGGTGTGCTGCACCCATTAACT
    CGTCATTTAGCATTAAGTATATCTCCTAATGCTATCCCTCCCCCCTCCCCCCACCCCACAACAG
    TCCCCGGTGTGTGATGTTCGCCTTCCTGTGTCCATGTGTTCTTATTGTTCAATTCCCACCTATGA
    GTGAGAACATGCGGTGTTTGGTTTTTTGTCCTTGCAATAGTAATGCTGAGAATGATGGTTTTCCAG
    CTTCATCCATGTCCCTACAAAGGACATGAACTCATCATTTTTTATGGCTGCATAGTATTCCATG
    GTGTATATGTGCCACATTTTAGGAGGAGCTTGTACCATTCCTTCTGAAACTATTCCAATCAAAA
    GAAAAAGAGAGAATCCTCCCTAACTCATTTTTATGAGGCCAGCATCATCCTGATACCAAAGGGT
    GGCAGAGAGAGACACAACAAAAAAAGAATTTTAGACCAATATCCTTGATGAACATTGAAGCA
    AAAATCCTCAGTAAAATACTGGCAAACCGAATCCAGCAACACATCAAAAAGCTTATCCACCA
    TGATCAAGTGGGCTTCATCCCTGGGATGCAAGGCTGGTTCAACATACGAAAATCAGTAAACGT
    AATCCAGCATATAAACAGAACCAAAGACAAAAACCACATGATTATCTCAATAGATGCAGAAA
    AGGCCTTTGACAAAATTCAACAACCCTCATGCTAAAAACTCTCAATAAATTAGGTATTGATGG
    GACGTATCTCAAAATAATAAGAGCTATCTATGACAAACCCACAGCCAATATCATACTGAATGG
    ACAAAAACTGGAAGCATTCCCTTTGAAAACTGGCACAAGACTGGGATGCCCTCTCTCACCACT
    CCTAATTCAACATAGTGTTGGAAGTTCTGGCCAGGGCAATCAGGTAGGAGAAGGAAATAAAGG
    GTATTCAATTAAGAAAAGAGGAAGTCAAATTGTCCCTGTTGCAGATGACATGATTGTATATC
    TAGAAAACCCCATCGTCTCAGCCCAAAATCTCCTTAAGCTGATAAGCAACTTCAGCAAAGTCT
    CAGGATACAAAATCAATGTGCAAAAATCACAAGCAGTCTTATACACCAATAACAGACAGAGA
    GCCAAATCATGAGTGAACTCCCATTCACAATTGCTTCAAAGAGAATAAAATACCTAGGAATCC
    AACTTACAAGGGATGTGAAGGACCTCTTCAAGGAGAACTACAAACGACTGCTCAATGAAATA
    AAAGAGGATACAAACAAATGGAAGAACATTCCATGCTCATGGGTAGGAAGAATCAGTATCGT
    GAAAATGGCCATACTGCCCAAGGTAATTTATAGATCAATGCCATCCCTATCAAGCTACCAAT
    GACTTCTTCACAGAATTGGAAAAAACTAAAGTTCATATGGAACCAAAAAAGAGCCCGCATT
    GCCAAGTCAATCCTAAGCCAAAAGAACAAAGCTGGAGGCATCACACTACCTGACTTCTAACT
    ATACTACAAGGCTACAGTAACCAAAACAGCATGCTACTGGTACCAAAACAGAGATATAGAGC
    AATGGAACAGAACAGAGCCCTCAGAAATAATGCCGCATATCTACAAGCATCTGATCTTTGAC
    AAACCTGACAAAAACAAGCAATGGGGAAAGGATTCCCTATTTAATAAATGGTGCTGGGAAAA
    CTGGCTAGCCATATGTAGAAAGCTGAAACTGGATCCCTTCCTTACACCTTTATACAAAAATTAA
    TTCAAGATGGATTAAAGACTTACATGTTAGACCTAAAACCATAAAAACCCTAGAAGAAAACC
    TAGGCAATACCATTCAGGACATAGGCATGGGCAAGGACTTCATGTCTAAAACACCAAAAGCA
    ATGGCAACAAAAGCCAAAATTGACAAATGGGATCTAATTTAAACTAAAGAGCTTCTGCACAGC
    AAAAGAAACTACCATCAGAGTGAACAGGCAACCTACAGAATGGGAGAAAATTTTTGCAACCT
    ACTCATCTGACAAAGGGCTAATATCCAGAATCTACAATGAACTCAAACAAATTTACAAGAAA
    AAAACAAACAACCCCATCAACAAATGGGCGAAGGATATGAACAGACACTTCTCAAAAGAAG
    ACATTTATGTAGCCAAAAAACACATGAAAAAATGCTCATCATCACTGGCCATCAGAGAAATG
    CAAATCAAAACCACAATGAGATACCATCTCACACCAGTTAGAATGGTGATCATTAAAAAGTC
    AGGAAACAACAGGTGCTGGAGAGGATGTGGAGAAATAGGAACACTAATTACACTGTTCGTGGG
    ACTGTAAACTAGTTCAACCATTGTGGAAGTCAGTGTGGCGATCCTCAGGGATCTAGAACTGG
    AAATACCATTTGACCCAGCCATCCCATTACTAGGTATATACCCAAAGGATTATAAATCATGCT
    GCTATAAGGACACATGCACACGTATGTTTATTGTGGCACTGTTCAGAATAGCAAAGACTTGGA
    ACCAACCCAAATGAACCCTTCTTTTTGCTTGCGTTGTTGAAAGAAGGCAAGTCTATGGATAGG
    AATGAGTGAGGCACAGCTCCCTGAGGATGCCATATCTTGCCCGTTTCTTGTGTATTAAGTGAC
    ATCACGTGTTACCAAACTAAACCGGCTGCATTTGCCTGCGCACAACATAAAACCAAACACCCT
    AGCATTGGATTTTTGTAGCAAGAAAGATGTATTGCCAAGCAGCCTTGCAAGGGGACAGAAGA
    CGGGCTCAAATCTGTCTCCCAATACTTGCTTCGCAGCAGTAGATTTAAGGGAGAGATTTTGGA
    AGTGGAGTTTCGGGCTGGACGGTGATTGGCTGAAACGAAGAAGTGTTTAGAAAATCTCTTGGAA
    CATGAGCTGTTGCTTCTTCATGCTGCTTCAAGGGTCACATGCAGATTCAGGAGGTGGTATAAA
    ACAAGCTGTGGGAATNTGGGCTGTGACATCAAAGGGCCGCTCCTCGGGCTAGTAAGTCTATTT
    TGCACAGGCTCCAGTCAGCCATATTGGTTCCAACCTGTTCCAGCAAGTTGTATAAGCAGAGGC
    GATTATAGCAAACTGTTCCTTATCGGCTGCCCTGCAAGACAAGCTCAAGATTTCTGTTAGTTA
    CCAGTTTCTTTAACCCTGTCGGGCACAGTTTCACATGTAATCAGAAAGGAACTTGCAAGACAC
    ATACAACTGAAAGAAACTTGGTCTTTGGAAGTTGTCAGTAAGGTCACAAAGTTGTGATGCTAG
    AAGCAGCCGTATCTGAGATTATGGGAAAGAGATGATATATTGGAAAAACAACAGCATCACTT
    TAAACATTACTCTAAATCAAGGTTTCTCAACCTTGGCACTATTGACATTTTGGGTTAGATAGTT
    CTTTCTTGTTGGGAGACTGCCCTGTACATTGTGTAGGCAGCATCTCAGGCCTTTGTAGAAATGT
    CAGTACCAACCCACCCCCTCCCCACTGCACAATCAAAAACGTCAAAATGTCCTTTGGGAGCAG
    TAGTTTTGAGAAACATTGCNTGCAGATATATATGTTTGTTTGTTTGTTTTGCTTTGTGACAGG
    GTCTTACTCTGTTGCCCAGGCAGAAGTGCAATGGTGTGATCCCACTCACTGCAACCTCTGCCTC
    CCAGGTTCAAGCGATTCTCATGCCTCAGCCTCCCGAGTAGCTGGGATTACAGGAATGCATCCA
    TACACGCGGCTAATTTTTGTATTTTTAATAGAGATGGGATTTCACCATGTTGGCCAGGCTGGTC
    TGAAACTCCTGGCCTCATGTGATCCACCCACCTCGACCTCCCAAATTGCTGGGATTTACAAGCT
    TAAGCCACTGCGCCCAGCTGAGAAACATTGCTTTAAATAATCTGTGGTGAAAGGAAGTTCCCA
    CCACCTGCCCACTCACTCAGTACCTCTGTCACCAACCCTCTTCCCTGGGTGTTTCCAAGTACAG
    AGGGTGGAAAGGGCTTTCCACATTTCCCCTGTTTTGGTAGTAAACATTAGGAACAGCCAG
    GCCGTGGCTAGGCTCAGCCACCCACAGATATGGACACAGTAGTCTGACAAGCTGGGTTGCTG
    GGTGCTATCAGTCCAGGCTCAACTGCTTGCACTGACACCATTTCCCTATAGGAGGCAGGTGAG
    AGCCATTTCTGAGGAAAGTCTCTGGAGCCCCTCTTCCTTCCACTGAAAGTTGTGCAAAAAGAT
    CAGGAAGACAGCGCTTGGATGGAATAAATTTCAGTGTATCCACTTGACACATTATAGTGGCTG
    TCCCAAAGTTTTTACCTTATGCCAAGTACTNCCATGTGCCACATCATTTAATCCTCACAAAAACA
    GGGGAAAATATTAGCCACCCTACAGACATAGAGACTGAGATTCAATTTAAGGAGATGGTT
    GGTAAGGGACAGAGTTGGGGTTCAGATGTCAACAGTGAAATGCTTAACAAACTGTCATGCAG
    CCCACTCCTGGCAACTCTTCCTGCTCCTCTCTGGCCTCACTCAGCCTCTACTGTTCCAGGAAGG
    CTCATTCATAGTCATGTGGTTGCAGACTTCCCAAGCTCACTGTGTTACCAAAAAGCAAGACCT
    GCCTTCTGCTGCATCGCCCCAGCTGTCACCCAACTTGGATTCAGTCCCAGCACTGACACATCA
    CAAAATCACAAAAGTGAGCAAACCATTACCTCCCTGAGTCTCCTTTTGTTTTTATCTATAAAAC
    TAGAAAAATATTCTTTCCATAGGAATGTTGTTGGAAATAATAAAACATTATATAAACAAGCTCT
    AGTCATTGTTGATGTTTAACAGGTAACAGTGATAATTATTTGTCTTCTCATTAATGAAGAAAA
    GGATTATTAATCATAGAGGGTGGAAGGCATCTATGGGAAGTAGAGATTTGAAGATAGGCTAA
    AACCCAAGTAAGGCCTCTAGATTAGATAATAGTATTGTATCTATTTTAATTTCCTGCTTTCCAT
    CACTGTGCCATGGTTATATAAGAGAAGTCTTGTAATATAGGAAATATACACAAGAATTAGAA
    GTAAAGGGACATTGTGTCTGCAACAAACTCTTACAGGGTGTGTGTGTGTGTGTGTGTGTGTGTG
    TGTGAGAGAGAGAGAGAGACAGAGAGAGAGAGAGACAGAGAGAAAGAGAATGATAAAGCA
    AATACAGGAATCAGGATGAAGCGTATCTGTTTTGTTTGTTTTGCTTTGTGATAGGGTCTTGCTCT
    GTTGCCCAGGCAGGAGTGCAATGGTGTGATCCCGCTCACTGCAACCTCTGCCTCCCAGGTTCA
    AGCGATTCTCATGCTTGTATTGTTCTGCACCTGTTCTGCAAGTACAACATTGTGGGAATGGAA
    AATGCAGGAAATGGGCAGTAAGGCTATGAACGAAGCCCGCACAGGAGTGTGGGTAGCAGAG
    TTCTCTAGTCCAGGCTCCCACCTGAGGTGCTGGGACCTAGAAGAAAAGCCTCTCTGCAGACAG
    AACTGGAGTTAACGCTGTCCACGATAAATGGCCCAGGCCCTGTTAAGTTTGCCCCATTGAGCA
    AAACAAGTACCCACCCGCCTTTGCAGCCTTGCCTAGCTCACATAAGGTGCCAGCCCTTGCTGT
    ACAGCAGAACCTTTGGGGAGCTGGACAAAAGCCTATCAAGGAGCATACCCCCAGGAAGCCCA
    GTCCAGGTGGGGAGCCCAGCCACACAATGGCCCTTGCCCCCACACCTCCTCATTCAGTCAGCT
    AAGGCCATGGCAGCTGAGCTGCCTCCACAGCTCATATAGGAAAAGGGTGTGGAAAGGGGCCA
    CCAATGTGGTCAGGCCTCCATGGCCTGAGTAGGTCACCAAGCCTCAGGTGCACAGACTTGATG
    TCATCAATCAGGGTCTGTCAGCACACCTAGCCCTCAGGAACACTGCTCCCCACTGCAACCCCA
    CACCAAGGCATCCTGGGCTCCCTCTGGGTTCTCCAGGCCCCAGGGAAGACAGACAGAGTCTGC
    CACCAAAGGTTTGAGCTCTGCCACTGGCTACGAAGCAATAGGGGATGTCAGAGCAAGGGAGG
    AACAGGACAGGAGTATACGTGGGCAGGAAGGGATTACAGCCAAGGAAGACAGGAGGCAGGT
    GCCCTGATTTTGAGGCTGTGCCCCAGCAGGGGCTTCCCAGAAGCTGTATTTGTCCTAAGACAC
    CCCTCTGCAGCTGAGGGGCTAGAGATGGATATGTAGCTGTGTTAGGCCATTCTTGCATTTGCTA
    TAAAGAAATACCTGAGACCAGGTAATTTATAAAGAAAAGAGGTTTCATTTGGTTCACAGTTCTG
    CTGGCTTTGCAAGAGGCATGGTGCTGGCATCTGCTCAGCCTTTGAGGAGGCCTCAGGAAACTT
    ACAGTCATGGCGGAAGGCAAAGGGGAAGCAGGCACATCACACAGTGGAAGCAGGAGTGAGA
    GAGAGAGAGGCACTGGGAGGTGCCACACTTTTAAACAACCAGATCTCGTGTGAACTCAGAGC
    AAGAGCTGACTCATCACCAAGGGGATGGCCCAAGCCATTCATGAGGGATCCACCCCCATGAC
    TCAGACACCTCCCACCAGGCCCCACCTCCAATATTGGGGATTACAATTCAGATGAGATTTGGT
    GGGGACACATATCCAAACCATATCAGTTATCAGTAGCCATACTGGATGAATGCCAGGAACTTA
    GAATTAGGACACATGGTCATTTAGGCAAGTGGCTTGTCCTGTCAATGGTACCCTGATAGTCGT
    GGGGTTGCCCCGTACAAAAAGCGAGAGGAAGTCTACAGAGCTGTCAAAGAGGGGCAGGTGG
    AAAGGCCTGCAGAGGAGTCCCCTGCTCCACAACCAGGCGTGCACCTCCCACATCCTCGGGGCT
    GTAGGCCCCACATGAGAGCAGAAAGAAGGATGCAGAGGAAGGCC
    AAGAACACAAGGTGTGCCCTTGGAAAGGCTGGGCACACCAAACACAACCTAATAAACAACAG
    CAATGAGCACACAGGGAAAGTACTCACAGGGAAACCATCATGAACTAGAGGCTGATCCCACA
    CCCTGCCACATGGGGCCCCAGGCCCCAGCCTATCAACCAGTGGTCCTTTATTGCCACAGCGATT
    GGTCTTGGATAGGCACCTGATGCAAGCTTCAGCCAATCAACAGGCCAGTCAGCTGGCCATCA
    GTAGGCCATCCAATCAGAGCAAAGCCCAGGACTTTCTTCGACTCTTAAGAAAAGAGAAGCAA
    AGTAACTGGCACAGATTGGAGAGGATCAAGGAACCCCGAGCTGGATACATACAAACTTTGGG
    TTAACATGGATGATTAAATACATATGTTTATGTGAACCACCTCCCAAATATGCTCCACTATAAT
    GACACAAGACAAAGGGCAGGGGGAGACCAATTGCAAGGTGGCGCAAATGAGAGATGCTACC
    AAGGGTGGCGGGGGAGAGAGGGGAGCAGTGTCAAGTTAGGAGGCAACAGGCTGAGGGACA
    GGGACCAGCAGACGGGGAGGGAGGGGCTGAAGCAGAAGTGTCCAGTGTCTGGAGGGATGGG
    GCCAGAAAGGCAAGGGGCATCCTGAAGAAGCTATACCTGGGGAGGGCAGCTCTCTCCCCACC
    TGCTCCCCAATTCATCAGCCAGGAATGCCCCATCCACCCCACCCCAGGGAGGAGGACAGAGG
    ACTTTCGTTTGGGAGCATTGAATGGTTCAGAGATTCTGCAACTCTGCGGTCCCCAACTAAACT
    GCTCATTGTTTCAAGCAGTCCCTGTTGGGTAAATGTCCCCCATTGTAACCGGACTCGGATTTCCA
    CCGCTTGAAAGCCAAATACAAGAGGAGAGGTTTGGTGGGAGGAAAAGTGGTTTTAACTAGAG
    CCAGCAAACCAAGAAGATGGTGAATTGTTGTTTAAAGCATTCAATTATCTCAAATTTTAAAA
    TTTATCATAGGATTCTGAAAGGAAAACTTGGTATGGGACATACGTGGGAGCAGTGCAGGGTA
    CAGGGTCTATGTGTCTTGATCCAATGGCTGTCTTGAGTATCACCTATCGTGAGGTCTGGTTGGT
    GTTATCTTTCCTTCGGCCAGATGGTGGTGGGTGAATTGTTTTCGACTCCCCCTAAGTTGGAGGAT
    TCCGCAGGGGTTCCGTGTCTGGTTNTTGTTTCAAGATTAGCCCCTGGAATTCCCAAATAAGCAT
    AGAGTTAGATAAGCGGGCATGGTGCAAAGGAGTGTCTAGTGGGAAAGGGAGAGAAGCAGAG
    TTTCAAAGTACATTTCAAGGTTACATTTTAAGACTAAAGAAAAAGCCTTAAAATGCATTTTTA
    AAGCTGATTTAATGCTTGGCTACACTAGGCTGTGGCCAGTGTGCAGTGTGGCTGCTCTTGGAT
    CAGGTGATGTTTCATCAGCTGTGTCCAGGGAGGGCAGGGCCATGTGGCAGAACCTGGGACCT
    CTGTGTGAGGGACTACCTTGGCCCCTGTCCTTAGCAGGAAGCTATGGTAAGGAACCCTTAGGG
    AGACATTAAATTGGGGAGACCGTCCCTGCCAATCCTTTAACCTCCCCAGCCTCAGCGACCTCA
    GTTGGAAAGTGGTGGTAATAATACTACCACTGACCAGGTGTGGTGGCCAGACATTCCACACTT
    TGGCTTCAGCCGCTCCCTCCCCACTCTACTGTAATCCCAGCACTTTGGGAGGAAGAGGTAGGC
    GGAACCTGAGGTCTGGAGTTGAGACCAGCCTGGTCAACATGGTGAAACCCCATATCTACTAA
    AAAGAAAGTACAAAAAATTAGCCAGGTGCAGTGGCACACGTGTGTGGTCCCAGCTACTCGTG
    GGTCTGAGGCATGAAAATTGTTGAGCCTGGGAGGCAGAGGTTCCATTGAGTGGAGATCGAG
    CCACTGCACTCCAGCCTGGGTGATAGAACGAGATTCTGTCTCAAAAAAATAAAAATAAAATA
    ATAATAATAATACCACTGCCTGCCACACTAAGANGTCTGATTAGATGACAGAATGAATGCAA
    AAGTACTTTGTGAATCATAAATNTTTTCATCAATATTAGTTATAATGACAATTGCTCCTTCTCC
    TAATAAATGTATTGCCTTTCTTTAGGAATAAATATAACAAGAAATGTGTAAGATATATATGAG
    AAAAATAATAAAATTCACCTGAAGGACATAAAAGAAGACCAAAATAAATGAAACAACACAT
    ACTTCTAGATGAGAAAACTCAATATTATAAAGAGGTTAGTTCTCTAAAATGAATCCCTAAACC
    CACAAAGTCAATGTATTTCCAATGAAATTGTCAACAGCATTATTTTCCGAAGTGGGATGAGTA
    GTGCTAAGATTTATAAGAAAGCCAACATTCCAGAGCAGTGGGGAAGGGATGCTTCACCACC
    AAATAGCCATATTAGAGATTCCCTTGCACCATACCCAAACCACCATCTCCCAGGACCCGGGAG
    AGCAGAAAAGAGGAATGAGAAGAAAGGCGAGGATGTGAGGTGTGCCCTCATAATGGCGGTG
    CACGCAGCACAAGCAATTGCAGAAAGACTAAAGTACTGAACAAATAGAAAACTAAGGAAAAAT
    ATTAGAAGGAAATGTGGGAGAACATTTTTGCAATTTGGGGATTTGGAAACGGTTTTCTTAACAA
    GATATAAAAACCCCAAAACAAGAAAACAAAGGTTGAAATTCATAAAAACTAGATACTTCTGT
    ATGATGAAAGACACGATTAATCAAGTTTGTTAAGTTAGCAATAGACTAGGGGAGATATCATA
    GTATATTTAACAGACAAAGGATTAATAGATACTACAGATGAAATATAAAATAGTTTCTCCAAG
    TCCATAGGCAGAAGATAATCCAATAGCAACATAGTTAAGTAATGTAAACAAATCATCCTTAG
    AAGAAGAAATGCAATCACCAAGAAACACATGAAAAGGTGTCCAGCATTTTGCAATTCAAGCA
    ACAATGAGGTGACAGATCGGCAAAAAACTCATAAAGATTTATCATCTGAAGGATTGGCCAAG
    ATAAAGCCAAACTTCTCGTGTTGGCAGAAGAAACTGGTGAAGCCATGTGAAGAGGCCACGTG
    GTCCTGCCTACCAAGATGTAAAATGTGTACAGCATTGAACTAGCAATTCAGCCTCCAGGAGC
    CATCCAGAAGAAACACTGACACACACTTAGACTCCGGTGAAATTCAAGGACTTCTGCCACAG
    CCTGCTTCGTAATAGTGAAAATCTGAAACTGCCTCAATGACCGTCAATAGGAAGTTGATTTA
    AAGTGTTACAGCACATCTGTCTGGAGAGATCGCACTGGCCACTCCTCCTCACCCCCTCTGCTG
    GACCTCTGAGCGTAGGTGGCCTGGAGCTGGGTCCTGAGCCCTCTTTGGTCTATACCGACACTA
    CCCAATATGGTAGCCACCAGTCACGCTGGACACTTGAAAAGTGGCCGATCCTGACTGAGAAG
    GGCCACGAGTGGGAAAAACACACCAGACCTCAGTGACTTAGGCAGAAGTATGTTTTGTTCCA
    GACTATTGACTGAGCCCGCAGCTGAGTTGGCTCCAGCACCCTGGCCCCCTGCTCCATCCACTC
    ACTGGGACTCCCCACTGCACAGGGCAACCTCTCCAGGGGCACTTGGGCTGCGAAGGGGAGAG
    TGGGTGGCATCCCAGGCTGAAGCTTCCTGAGCAGGGCCAGAGGAGGAGCCAGTCCCTGTGGG
    CCTCTGTTCTGACAGTGTCAACCTCAGCCAGGCTTGTGTGGGCCAGGTGTACTGTTCTGGTTCA
    GATTTCAAGGAGATAGTCAGGGCAGGCCGCGCCAAAGCCCTCCGATGGGCTCCCCTACTGCCT
    GGCAGACCTGTCCAGCTTTGGACTCTGGCCCTGCGACCTGGAAGTCAGGCTGCCAAGAGGTCC
    AGGCAGTGGCCTCCACTGTGGAGGGTCTCTGGAGAGTTTACAGCCCTAGATAGGGGGGTTAG
    GGATGTGAGATGGTCCCAGGGGCCTGCTCCTGAGCCACGCCAAGCTGCCTGCTCCCTTTCCTC
    TGCTTCCAGACTCACGGGATCCTCTGCTCATCAGAACAGGAGTGTGGGAGACCCTGAGACACT
    GCCCCAGGATCTGAACAGGTGGCAAAGGCAATTAACAGGCTAGCGGTCACTGTAGTGACAAGGC
    GATTGAGTGGTCACCATGGTGATGGGGATGGAGGCTCTTTGCCACCAGTCCCAGTTTTATGCA
    TGGCAGCTCTAATGACAGGATGGTCAGCCCTGCTGAGGCCACTCCTGGTCACCATGACAACCA
    CAGGCCCTCTCAGGAGCACAGTAAGCCCTGGCAGGAGAATCCCCCACTCCACACCTGGCTGG
    AGCAGGAAATGCCGAGCGGCGCCTGAGCCCCAGGGAAGCAGGCTAGGATGTGAGAGACACA
    GTCACCTGGAGCCTAATTACTCAAAAGCTGTCCCCAGGTCACAGAAGGGAGAGGACATTTCCC
    ACTGAATCTGTCTGAAGGACACTAAGCCCCACAGCTCAACACAACCAGGAGAGAAAGCGCTG
    AGGACGCCACCCAAGCGCCCAGCAATGGCCCTGCCTGGAGAACATCCAGGCTCAGTGAGGAA
    GGGTCCAGAAGGGAATGCTGCCGACTCGTTGGAGAACAATGAAAAGGAGGAAACTGTGACT
    GAACCTCAAACCCCAAACCAGCCCGAGGAGAACCACATTCTCCCAGGGACCCAGGGCGGGCC
    GTGACCCCTGCGGCGGAGAAGCCTTGGATATTTCCACTTCAGAAGCCTACTGGGGAAGGCTGA
    GGGGTCCCAGCTCCCCACGCTGGCTGCTGTGCAGATGCTGGACGACAGAGCCAGGATGGAGG
    CCGCCAAGAAGGAGAAGGTATCTCGCCCTCCATTGGGCATTCTGGGAGTGTTTGCTTGCCTGT
    CCCCAACATTCCATGGTTTGTTGAGCCTCAGAATCTGATTTTATGCACAGGCTCTTTGAGAAG
    GGTCTTGCCAGGGGTGCCTTCTGGGGCAGGAAGGCCCCTACTGCCTGGCAGACCCATCCAGCT
    TTGGACTCTGGTCCTGCGACCCGGAAGTCAGGCTGCCAAGAGGTCCAGGCAGTGGCCTCCACT
    GGGGAGGGGCTCTGGAGAGTTTAGAGCCCTAGATGTGGGGGTTAGGGACATGAGGTCTTGTG
    GACAAAGCCCACTACCTGATTTTGAGACAACACTCACTAGACATGGTGACAAGTCAAAGATG
    CCTTGCCTCCTACCAGGAATCACTTCGCAGGGAGCCCGAGGGCTGCTGTGGCCTGCTGAGGAG
    TGCAGGGCAGTTACTTTTCCAAAAACAAAGAGAAATCCAGGCATGCTCTGAGCCAGCCCTGA
    GCCCAGCAGTGAGCAAGGAGAGAGCTGGAGACAGGGGACTTTGCTGTGAAACACTGGGGGG
    AATGTGCCTGCATCACCCCAGCTGGGGGCCCAGGCAGAGTGGGGGAGAAGGGGTAAGTGGGC
    AGAGCCAGTCACTTTGGGCATGCAATCCCTCTCGCCTCTGTGTGAAATGACCAGGTCAGCATAA
    ACCCCGGGCTGGCTGTGCTTCTGGCAGAGCTAATGATGTTAGGAGGAAAACAACCAACCCAA
    GTGAGAGGGTGCGCAGCCAGACAGCTGGACCGGCCGAGGCCCCAACCAAGTCCCAGATCTGC
    CTGTCACTGGTGCTATGGCAGCAATTTGGATGAGAAATCCTGCCCAAAGGGCCCCTTCAGGCC
    ACCCGGGGAGAAGGAAGCGGCTGTCTTTGGCATGACCAGAAAGATGGCTCGGAGCTAGGGAG
    AGGTGGACATGTGGGCTGTGGAGATCTGGCACTNTCCCCAAACAAGGAGAGAAAGCATAGTG
    TGCCTATGTGTGAATGTGCTATGTGTGCATGTTTGTGCCTGTGCATACCTGCATGTGTACATGC
    ATGTGCACATATGTGTGCACAGGGAATCACTTTAATAAAGGCCACAGCAGAGCTGTCCCTGAG
    CCCCTTGCATTCACAGTGGCATGTGAGTGAACCACCTTCTTAGGCTGGGCATCCAGTCTCAGA
    CTCTGGGGCTGCCCATGCCCCATCCTTTATCTGCTCCACGTGTGAGGGGTTGCTGGTCCTGACC
    AGGGCCAGCTGTGAACCCCAGAATCCTGGGAAGTCACTGACATTCTTGTCAGGGCCAAGAGT
    GGAGCAAGGCAATGCCTCGGGCACAAACTTTAAGGGGTCACCAGAAACATCAATCATCAAGA
    TATATGCTATTTAAATAATCAAAATGAATGCAAAAAAAAAATATGATGGACAACATACCAAA
    TCTAAACAAAGGCAGGATGAGTATCACTGGCTTCTGCACTTTTCTCCACCCAGTCTACCCCTC
    TTCTAGTGCCTGGATCGCAGGGTGCCAAGGCCTGGATGAGGGAAGCGTGGAGCTGCAATGGC
    CACTCCTGTCTGCCTGTTCTGGCTGCACAGAGGACTCAGTCCTTGTCTTGGGGGAACCTATCTT
    GGTTTAGGGTCATCCTAAGGATCTGATGTNTAACCAAGTGAGCTGGCTGTCCAGGCCCACCCA
    GGTTCAGTCCAGTCCTGTGTCTCTGGGAAGTGCTGCCCCTACCCCAAGCCAGTGTTTGACCTTG
    GAGCAATGAGCAATGCCCTCCTTCCACTTTCAAAGTTGTCCCCAAGACGTCAGCTGTGGTGT
    CTCTGTGCAGACACCGAGGAGGAACTGTCTTCTTTCTCCTAATTGGTTGCTTTGGAGGAAAGTAA
    AGTGTTGCTGGTTTCCCTCTTTCTACTTCTTTGATTGAGAGCAGCCGTCTTGCCGGTACCAACC
    TTCCAGATCTTACCTGTGGTTTGCAGGAGCCTGTGGCCTCAGTCCTGTGCCCAGTGACTTCTCCA
    TGTGGATGTCAGCTCCTTAGGGGCAAGCCTGATCCACTGACACTACTCCCACCCCTCATAAG
    CCCCTTCTTACCAGCTGCAGTTGCCTGGTACCCCACCATCGCTGACTCATTCCTTGGCATCAA
    GGTTCATCCCTTACTGGGCCACCACTTCTGGGTGGCCTGAAATAGGGCCCTGGGCATCCCTCTT
    GGGGACCTTTTGGTCTATATTTTTCACTCTCACCTCACTAAGGACAGATGAGTAAATCTGGTTAA
    CTTTGCCTGATAGATTTGGTGACCAATTTTCAGGAAGGAGCCTGGAAAGATGAGATTCAGGTG
    TATTGGTCAGCTTAGACTGCCATAAGAGAATACCATCCACTGATGGCTTAGAAACAACAGAA
    ATCTATTTCTCACTATTCTAGAGGCTGGACGTCCAAGATCAGATGCCAGCATGGTCAGGTTGC
    AGGGAGGGCTCTCTTCCTGACTTGCAGACCGCCACCTTCTTGCTGTGTCCTCACATCGTGGAG
    AGAGAGTGAAAACAAGCTCTCTGGTGTCTCTTCTATAAGAATGCTAATCCTATGATGGGGGC
    TCCCCCTCCTTACCTCATCTAAACCTAATTATCTCCCAAAGGTCTCATCTCCAGATACCATCAC
    ACTGGGGTTAGGGCTTTGACATATGAATCTGGGGGGACACAATCAATCTGTAACACCAGGA
    GGGCATGCCGGGAGGAACTGACCTTCCTCCCTCCAGCTGCCCTGGACACCTTTGCCCCATTGA
    AGGAGCAGGCTGAGAAGTGGAATGAGGATGGAATAAGGTGCACTCCATCATGCTTACCCACA
    TCCCTGGCAGGAATTGTCCTGGGCCCCAGCAGGAGAGATGCCCCCCCATACTGCCATGGCACC
    TGCTCTGAGACAGGTGTGCAGAGTGCAAAGCTCCAGGTGGCCCCCAAGCAGGTGTGCTGGGA
    GGAGGGGCCCGTGTGGGAGGAGCAGGCAGCGCCAAGGCCTAGCGGAGCAGTGACAGGTCCC
    TGACTTCAGGGAATGGGCACGCTGTGGGCAGGCAGCTGGTGTGGGGGTGAGGGCTGGGGCTG
    CATCTGTGGGACCAGGGCTGGGCCATCCTCATATGCCGTGTCACAACCCCAGTGCCCCTGCTG
    TAGCCAGGACAGGAGGCTGGGCCAGGCTGGGAGGTGACAAGAGTGGGGGCTGTCCCCAGGA
    GAAGCACTCTGCTGCCTGTGCCCAGGCCTCTGGGGATGAGGACCCCTCAGAAGGAGTAGCTAT
    GTCTAGGAAGCCCCAGGGCAGGAGCAAGCCAAAGGGGACATCATTAGTGAGATCCAGGGGAT
    CAGTGGGCCACAGAAGCCCCAGCGTGAGCCCCTCTGACTGATGCAGCTAGGCCCACACCTGC
    ACCTGCCCACAGCAAGACCCCCAGGAGGAGAGGGGACAGATGGAGAGAGGCACAAAGTGCC
    CCTGGCCTCTGCCTTGAAGCCACCCCAAGGCAAGAGAGATGAGCCCCTGTTTAGTGACCTC
    CAGGGGAACATTCTGGCCCATCTGATGTGGGAAGCCCCTTGTGGAGTCTGTCATTCCTCAGCT
    GAGCCAGGCCTTTGGAGGCAGCCCAGGCATGTCCCCTGTGTGCTCCTATCCCTGTGTTGGGAC
    ACCTGGCCCAGCCCCTCCTTCTGCCTTTCTCTTCCCTTCCCTTCTCAGGAGTGGACACTTCCTCC
    TTTAGCCCCCTCACAGCTGTGTGAACTTCTCTGTATCTCTCTCTTTCTGTCTCTTTCTCCCCCTCT
    CTCTCTGTCTCATTGTCTCTCTGTGTGTCTGTCTGTAGTATTCTCTCTCTGTCTCTGTCACTCTGT
    CTCTCTCTCTCTCTGTGTCTACCTTTCTGTATTTCGCTTTGTTTCTTTTCTCTGTGTGTGTGTGT
    GTGTATCTGTTTTTCTCACTCTCTCTCTGTGTCTATCTTTCTGTATTTCGCTTTGTTTCTTTTTTCT
    GTGTGTGTGTGTGTGTATATCTGTTTTTCTCACTCTCTCAATCTCTCTCTCTCTTTCTGTCTCTCT
    TTTGCTGGCCTGAGCAAAGAGGGAGCCCCATCCTGATGCTACATAACCG
    TGAACCAGCACAGACAGAATTTTGTAGGAAAGTCCTGCAAGTAGAAGGATAGAAGGATGAGGG
    AAGAAACGCCATGTGAGTCATGACAGATCCCTTTCCAGGAGCCACTGACTCACCCTGCCTCCT
    GCCCTCCCACTGTGACACTATTACTCACAGACAGGCCCGGATTAAACCTATGTTCCAGGTGGC
    CTGTGGTTCCCACAGTGTGGCTCCCTGGGTCTGGCCTCAGGCTCCACAGGTGCCCAGCCCTGC
    CAAAGTCTCCAGAGCAGCTGTCCAGCTGGGGAGCTGCGGGGCCCCTTCACAGAGCGCATGGG
    AAGAAGTTCCATCCTACACATTACATCGAGAGGGACGTGCCTGAGAAGGGGAGCTGGAGCCC
    GTGCAGCCCCCTGCTTGCGTGCAGAACATAGTGTACCCTGAGCATGCCATGAAAAACACAAA
    CGCACAAAGTTGTAAAGAAAAAAGAAATGACAGGTGGCTGTAAAATCAGTTATAGCCCACGA
    GAGGCCCACTAATGAGTGGTGATTTCAGCTGATTACAAAGAAATGATGGTGTTTCTGTAATGA
    ACTAAACATGCACTCGTGCGTGCACACACGCGCACGTATAGTCACATAACTGACCAGCCCTAT
    GCATCACTTGTTAATTTACTTAGTAACTGTAACAATAATAGTTTCCAATAAGTGAGCCTTAGTCT
    CTGCGCAAGGGTCAGTTTATTGAGCACACGGGGGCCTTGCAGTGGGGGCAGGTGATCTGCTCC
    TGGGAGCCGCCAGCCTCTCCTCTCCTGCTCTTCATCTTCCTCCGTGGTGGGAAATTGTCTCACT
    GCTTCTACACCTGAGGCTGAACATCTCCCTTTATTTCAGTCTGAAACACATGTATAAAATATACT
    GGAATGAATTAAGGTTGCAATTATTGATATCAGGCAGTGAGTACATCAGGGTTTATTATACTA
    TCTCCTTTACTTACTTCGAAGTTCTCTANACCAAAAAATTAAAAACTATAAAAGAAAGAAAA
    AGGAAATGAGGCTAGATTCAACACAGATTACTCTTACCAAACCCTTCGTAGTCCCAGGAGTCC
    CCTAACACAAGCACTTGTGACCTGGAGTGATATTCACAGCATTCCTTACCTGGCAATACCTGA
    GTATTAGCCCCCCCAGTGGGATCNTGTTGTAGACAACCAGCAACTATCAGCCCAGCCAATAA
    ACAAGTAGGAAAGGGGAGTGCTGGAGAGGCCAAGAAGTGGGATTTTCCATGCTCCTGGGCTG
    TGATCCAGAGGGCACGGCTGTGAGGCTGATCTCAATGAACACTCTGTCTTGGAAGTACAGGG
    ATCCTCTGCTACCTGAAAACGTTCTGAGTATTCACTTTCATGGATTGCAAAGTCATTTACCCAA
    AATTCACTCTCCAAATGAAAAGTGAGTATGATGAATCAGTATTCAAGTTCCACCTGGGTCCTG
    GGAGAGGGCATGGACATCATATCCCAGCTGTTCCGACAGGAGGACCCAATCTGAGTCTCACT
    GCCTGGCTGCATCGTTTGTCTGCTGCCAGCCTGCACAGTAGGAAGGGAAAACATGATTTGTAT
    CTGTTTTAGGTCAGGTTCCCAAGAAGTAGAGCCTGAGATTGGAATTCTTGGAAAATGGTGTTT
    GCGGGAGCGCTGTCAGCAGAAGCTATAAGGAAGTGGGGGGACAGAAAACGAGAGGTAAGA
    AGCCAGTCAAAAAGGCAGGTCCAGCTTAAGTCCGCCTCAGTCTGGTTCCACAAGGGCTCTGAT
    GCATGAAGAATATCACAGGGTTGTCCCTCCTGGGAGAGGGGCCAGCCTATTGTACCTGTATCA
    AAGCCACCAGCTGAGGGCCAGTGGGGAGGGAAGATCTTCCAGGCATTTCCAGGAAACTCTCA
    GGAGAAGGGTGTAGCTGTGAGCAGTCTGCAGCTGCTGCTCACTGCGGCTAAAGGCTGGGTGT
    GCAGGCCAGTCAGCCAGTGAGGTGCCAACAGCAGGCACTACAGTCCACCCCTTGACTGCTCA
    GACCTACTGCTTTCCACTTTAAGCTCTCTCCATCCAGGCACAGCTTCAGGGAAAACTTACAATT
    GGAGAAACAGAGGGATGAACTACAATGCCCACTTCTGCATGTGATTGTAAGACTGTCACTGAT
    ACTCACCATCATGCCCCATCCCCACCATCCATTCTAGTGTCCCCTTTCCCCTTGGCTAACACTGC
    TGGTCTAGGTGACTTCCCTAGAGCAGGAGCCAAACCCTTATCCCTGAGGCATCTGAATCCTGG
    ATTCCTTATCAGGCTATTGTTTGTTTGTAAGTTGTCCATTCCCAATTACAACTGGACATGAGACT
    ACCAAGAAACACCCTGGCAAATCATCTGAGTGCAAGCCATATTCTTCCTGCTCCATTATGTAG
    CGGTAGTCCTACCTCCTAATGACAAGGGTAAATTGCCACATTTTGCTCCTTGTGCCAGGATGG
    TAATACCTTTCTCTACCTGCTTTTGGCTACTGGCACAAGGAAGCACAGCATGACCAGGAGGCAAT
    TGTAGCTGTACATTTAGTGAATGTGTTAATGTATCACCTGGTGGAAGGACCCCCTCTGAGAAC
    CAGGACTTCTAGACCCACAAAACCTAAAGTTGTGAATGGCGGAAGCACAAATTTCCCAAGTG
    GATCATGGAGAGTGATGAAGAGTTCTTGGTTCCCAAACCCACATATAATACCTTTCAGGAACA
    TGGCCTCATCCCATAGCCATTAGAGTGCATATTGCATTCTGGAGGAGACTGGGCCCTCCTCAT
    GGGTGTCATCTTCAAGATGACAGCTCCACTGTGCCTCCAAGAGGATGCTCCACCACCCTATCT
    GTGATTCCTTGGTTAGCAGGACAGGCTGCTGCACTGAGGGTAGGAAAGGCAAGTCCATTGAT
    GGCTGGAATACATGTCAATCCAAGTCAAGAGAAAATGCCGCCCTTCCAGGTTGGAAGGGGC
    CCGATTTAGCCAACTTGTCACCCAGTAGTGGCTGGTTGGTCTCCTCCAGGAGCAGTGTTATAC
    CAGGAATTCAGCACCAGTCGCTATTGCTGGCAGTTCAATTACATTCAACAGCAGCAAAACTAGGT
    CAGCCTGATGAGAGGGAATGTATGCTTCTGGGCACAGGCATGGCTCCTTCTCTGACTCCAT
    GACTATCTATTTCTGAGTGCATGGTGGCCGACATTCAGCTGCCTGCCCATCCTATCCACTTGGT
    TATTATTGCCTCTTCCACAAGAAGTGGTTTCTGGCTGTCATTAATGTCTCATACTTTGTGCCCA
    CTCACACAGGTTAGCTCTACAACTTTTCCCCATGCCACCACTTTTCCACAATCTTCTAATGTT
    GCTCCTTCCAAGCTACTGAAGAACGAGCTAAGCTATTCACCAATGTCCATGAGTCTATATTTA
    CCTTAGGCCACATCTCTCTCCACACAAAGTGAATAAGCAGGTGCACCCTCCAAAACTCTACTA
    AGAGGATTTCTTCTCCCCAGTGTCTTCAGGGCCACCTTGAGTGGGGCTGAAGTACAGCAGAA
    GTCCATTTCCAGCTTGCATCAACATTCCAAACTAACCTATCCATGATCAATGCATAGATGGGTT
    TTTCCCTCCTCCAGCAGCTAGACAAAAGACACCCCCCACCAGGAGGCCATATTTGCATGTGGG
    TGAAAGAGAGGCACAGGGGCCAATATTCGTGCAACAGTGGTAGATGGCAGGTGGGTCTGGGC
    CACCTGTCCCTGCAGCTTATCTGTGCCATCTGGACCTGCTCAAGCCTGATTCCAGATATACCAT
    TTCCATCTTATGATGGATGGCTATGACCTAGTGGGTCTGACAGCACCAAACTCATAATGGGC
    AGTATGGCCACATGGTCACTTAATGTCCTATGGTCAGACACTCTGCTGAGTGGCATGCCAGG
    AAATGCTTTACAAGTGGTGTTTGGTTCTCTGCTGCAGATGGCATGACCTTGGTCCGGAGCCCT
    AGGGGTTTGGACAGTGACTCCTGTTGGGGCCTAATCTCACATTCCATGCAGAGTATCATCAGA
    TTTGCCAATCACATAGCCTAAGGGTCAGGACTGATCCAACCAGTTTTGCAGAGATCAAACTG
    GAGAATGAAAGGTTGATATGATGTGACCATCATATCACGTTTTCTCTCTTGAAAAGTATGCA
    GATGTCTGAAAGAGACAAGTGCCCCAGGAGAAAATGCATGCCTTCCTCAGGATCGGCCCCCA
    CCTCCCCTCCTGGCCACAAGGAGGGTCAAATCTCAGCATGGCCCAACTTGGACCTGTCAAGGA
    AGAAGAAAAAAATTGTATGCCAAAGGAACTCAGTCTTTTGGCTAACAAGTACTAGACATCCTTT
    AAGTCTTTGAGAATGGTAATAATTTCTGCCATCCCTCCAGATTTTGTGTTTTTCTGTTTTGGCTG
    GGTGGGAATGCAGCATTTTCACTTTGCCTTTGTTATTACAAATGTTGCTTATTCTATAAATCAA
    GGAACCATGTAAGGGCTCTTCTGATGGTTAAGTATATCCATTCCAATGATTTATTCGGGATCC
    AAGGAAATGATTTCTGGGTGAATACACAGAACTAGTGGATCCAATTTGAGACATACCTGGGC
    CAGAACTATATTTGTCGTCTTACCCCAATAAGCCTGCACTCTACTAGGACAGCCATGACAGCA
    CTTTGGGACCCTAGATATAAGTGTGAATTGCTGGCTGGGCATGGTGGCTCACGCCTGTAATCC
    CAGCATTTTGGGAGGCTGAGGCAGGTAGATCACCTGAGGTCAGGAGTTGAAGACCAGCCTGG
    CCAACACGGTGAAACCCCATCTCTACTAAAAAATACAAAAATTAGCTGGGCGTGGTGGTGGG
    TGCCTGTAATCCCAGCTACTCGGGAGGCTGAGGCAGGGAGAATTGCTTGAACCCAGGAGGCG
    GAGGTTGCAGTGAGCCAAAATCACACCACTGCACTCCAGCCTGGGTGACAGAGCGAGATTCC
    ATCTCAAAAAAAGAAAAAAAAAAGTGTGAATTGCTATGAAATCACTATCAAAAGATCTGAGT
    GTTACCCTTACTCAGTGTGGTCGAATATAAATAGCCATAGGTTCCTGTTATACACACTTGCTGT
    GGTGCTACAGAGTCTTTCCTCATGGGAACCCAGTCCCTCTTTCAGTCAATGGGTTCTGGTTCGA
    GAACTGGCTGAGGTTTGGAAACTGTGCCTTCCATCATAACTTTCCACTGGGGTGACTGACCTT
    GGCCTTCTGTTCATCCTTTCTAGCCCCTAAGAATCCAACACTCTATTAGCCTAACTCCTTAGACC
    CCTATAAGCTAATCCCTTCTAGTTGTTAGTCTGACCTTGGTGCCCAATATGATAATTATAACCCA
    CTTTGCTAACTGATATGCTTCTAAGTGCTGCCCCTGGTCTCTGCCCTTAAGTGATCTATCATCCCC
    ACTGCCATTAGGGGGAGAAGCTCTGAAAAAGAGVTGTCTCCCATCAACTCTGGTCTACAAAGG
    ACAGCCCTACTGAGCCTCAGCCATGTGCCCGACACCAGCAGATTCTTTACAGCCTGGGAAGCA
    GAGTGTCTTCCCTGCCTTTCCAGGGAACATAGCCAGCTTACAGGCTTTTTGATCTTATAGAGTA
    GGTCAGTTATATTTTGCCCCATTTCTTTTATCCTTTTGATCACTTCCTCTTGGCCCACCATGTAA
    ACTCAAGCATCCCTGCTTCATTTAATCGAGCTGTTTGCTTTTTCTAAGCTACCAAGAGCAACCCC
    AGCAATATATCAGAGCCCTCTCTTGGGACCCTTGCTAGGGTGTTAAATCCTGCATCATAGGAG
    AATGCCCCCACATCAGCAAAGTCCCCTTATCCTCTTGATATCCCACCTGCCCCAGTCCAGCACC
    TTCAGGATCTGGTCTCAATCACAGGATCCAGCACCTTTGGGACTGTTGCAAGCATAAGATCCA
    GCACTTTTGGGATCTAGTCTCCCACTTCCTGCTAGTACTTGTTAGCCAAAGACTGAGTTCCTTT
    GGCATACAATTTTTTTCTTCTTCCTTGACAGGTCCAAGTTGGGCCATGCTGAGATTTGACCCTC
    CTTGTGGCCAGGAGGGGAGGTAGGGGCCGATCCTGAGGAAGGCACTCATTTCTCCTGGGGC
    ACTAAGTCTCTTTCAGACATCTGCATACTTTTCAAGAGAGAAAAGGCCTCCTTCTCACAGCAAG
    ACTACAAAACTGTAGATGCAGGTGGCTCGTGGGAATCTGGCAATTCAAAATTCTCAAGTGTACTC
    ACTAGCACATTAGAAAACCAGTAGTACACATCTCTTTCCAAATCCATTCAGTGACACTATG
    TCAGTAGCTGGAAATGGGCCATGGTGGGTGTATTTAAACCATGAAAATCAGAAAATGCTACA
    AACCAGGGCATCCCGCATCTCTAGACAGCAGATTGTTGGCCATTTCCCAGCATACCATTGTGT
    ATACTCCTTCCCATCAGGGCCGTGGCTAAGCCTTGGTGGAGGACTCAGCCCTTGCTGAAGTTCTG
    CTACTGCTCTTACAATTGAGTCCTATGCCTGGTCTCCAGCTCTGCCTGCCTCACTACAGGAGAC
    AAGCATCTCTTGAACACTGCCGAGAAGACCCTCTGGCTCTCAGGCTTGGCTTTAAATCGATA
    GACCTGAGCCTGCCATTTTCTCTAATTCCATGCATCACTCCACTGATCCACAGGTCTCAGTGGCA
    TAGTCCTTCGGGTTAGCATCTCCCCCACACCCTCGGTGCCAGAGACACTGAGTAAGAAAGTAC
    CTCCCTGTCTACCCCCATCCCCGCTCCCCACAGGCAGGGCCTTGGCGATCCACTGCTGCAATGT
    GCCAGAGACTGTCAGTACTCCTACCACCAGTGAGGTGGCAACCAGCTGGGAAGTGATCCAAC
    TCCAGAGTCCCGCCCTCATAGGCTGATTTCTAGGACCACCCCTGGTATACTGTGTTAGGTTCTT
    GAAGCAGAGCCTGAGATAAGGATTCTGGCACCTGTGATTGAGTGGGAGGGTGCTCTCAGGAT
    GAGATGGGGTAGAAATAGGCAAAGGTACAGATTCAGCAGCAGTAAGAGCCTCAGTCTGACCCA
    GCAGGGAGCTCTCAAATGTGAATGACATCACAGAGTTGTCCCTCTGAGGCAGGGGCCAGCCTT
    TGTGCTCCTACATGAGTCAGTCACTGGCTGGAGGCCCCTGGGGAAAGGCTAGGGCTGCCAGCT
    TTAGCAAATAAAAAATTAGGGCACTCAGTTAAATTGAATTCAGATAAACAACA
  • The genomic DNA for the YKT6 SNARE gene is 39,000 base pairs in length and contains seven exons (see Table 1 below for the location of the exons). As discussed in further detail below, the YKT6 SNARE gene is situated in genomic clone AC006454 at nucleotides 36,001-75,000. [0042]
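The stated gene length follows from the clone coordinates if the nucleotide range is read as 1-based and inclusive. That reading is an assumption, although it agrees with the 39,000 base pair figure given above. A minimal sketch of the arithmetic follows; the helper names `span_length` and `extract_region` are hypothetical and are not part of the specification:

```python
def span_length(start: int, end: int) -> int:
    """Length of a nucleotide interval, assuming 1-based inclusive coordinates."""
    if start < 1 or end < start:
        raise ValueError("expected 1 <= start <= end")
    return end - start + 1


def extract_region(clone_seq: str, start: int, end: int) -> str:
    """Return bases start..end (1-based, inclusive) of a clone sequence string."""
    span_length(start, end)  # validates the coordinates
    if end > len(clone_seq):
        raise ValueError("interval extends past the end of the clone sequence")
    # Convert to Python's 0-based, end-exclusive slicing.
    return clone_seq[start - 1:end]


# Coordinates of the YKT6 SNARE gene within clone AC006454, as stated in the text.
YKT6_START, YKT6_END = 36_001, 75_000

if __name__ == "__main__":
    print(span_length(YKT6_START, YKT6_END))
```

With the full sequence of the clone loaded into a string, `extract_region(clone, YKT6_START, YKT6_END)` would return the gene region under the same coordinate assumption.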
  • The human liver glucokinase is depicted in SEQ ID NO:2: [0043]
    MPRPRSQLPQPNSQVEQILAEFQLQEEDLKKVMRRMQKEMDRGLRLETHEEASVKMLPTYVRSTP
    EGSEVGDFLSLDLGGTNFRVMLVKVGEGEEGQWSVKTKHQTYSIPEDAMTGTAEMLFDYISECIS
    DFLDKHQMKHKKLPLGFTFSFPVRHEDIDKGILLNWTKGFKASGAEGNNVVGLLRDAIKRRGDFE
    MDVVAMVNDTVATMISCYYEDHQCEVGMIVGTGCNACYMEEMQNVELVEGDEGRMCVNTEW
    GAFGDSGELDEFLLEYDRLVDESSANPGQQLYEKLIGGKYMGELVRLVLLRLVDENLLFHGEASE
    QLRTRGAFETREVSQVESDTGDRKQIYNILSTLGLRPSTTDCDIVRRACESVSTRAAHMCSAGLAG
    VINRMRESRSEDVMRITVGVDGSVYKLHPSFKERFHASVRRLTPSCEITFIESEEGSGRGAALVSAV
    ACKKACMLGQ
  • and is encoded by the genomic DNA sequence shown in SEQ ID NO:6: [0044]
    ACTAGCACATTAGAAAACCAGTAGTACACATCTCTTTCCAAATCTTCATTCAGTGACACTATG
    TCAGTAGCTGGAAATGGGCCATGGTGGGTGTATTTAAACCATGAAAATCAGAAAATGCTACA
    AACCAGGGCATCCCGCATCTCTAGA
    AGCAGATTGTTGGCCATTTCCCAGCATACCATTGTGTATACTCCTTCCCATCAGGGCCGTGGCT
    TGCCTTGGTGGAGGACTCAGCCCTTGCTGAAGTTCTGCTACTGCTCTTACAATTGAGTCCTATG
    CCTGGTCTCCAGCTCTGCCTGCCTCACTACAGGAGACAAGCATCTCTTTGAACACTGCCGAGA
    AGACCCTCTGGCTCTCAGGCTTGGCTTTAAATCGATAGACCTGAGCCTGCCATTTCT
    CTTTTCCATGCATCACTCCACTGATCCACAGGTCTCAGTGGCATAGTCCTTCGGGTTAGCATCT
    CCCCCACACCCTCGGTGCCAGAGACACTGAGTAAGAAAGTACCTCCCTGTCTACCCCCATCCC
    CGCTCCCCACAGGCAGGGCCTTGGCGATCCACTGCTGCAATGTGCCAGAGACTGTCAGTACTC
    CTACCACCAGTGAGGTGGCAACCAGCTGGGAAGTGATCCAACTCCAGAGTCCCGCCCTCATA
    GGCTGATTTCTAGGACCACCCCTGGTATACTGTGTTAGGTTCTTGAAGCAGAGCCTGAGATAA
    GGATTCTGGCACCTGTGATTGAGTGGGAGGGTGCTCTCAGGATGAGATGGGGTAGAAATAGG
    CAAAGGTACAGATTCAGCAGCAGTTGAGCCTCAGTCTGACCCAGCAGGGAGCTCTCAAATGT
    GAATGACATCACAGAGTTGTCCCTCTGAGGCAGGGGCCAGCCTTTGTGCTCCTACATGAGTCA
    GTCACTGGCTGGAGGCCCCTGGGGAAAGGCTAGGGCTGCCAGCTTTAGCAAATAAAAAATTA
    GGGCACTCAGTTAAATTGAATTTCAGATAAACAACAAATTATTTTTTAGTATATGTCCCAAATT
    GTGCATAACATAATGTGTTTTCTCCGCCAGCCCTGGGAAGGGCGTAACTTCCCAGGTATTTCT
    AGGTGAAGTAACTTTGTAGATCAGGAGTAAGTCCCAGGAAAGAAGTCCAGCTCTTCTCT
    TCAGCCCTGGGCAGCTGGGGGTAGGCACAGGGGCCCAGCAGGCACCCATA
    GCATCTCCTACAGCATCTGAAATGAACAGGGTCATCACGTACTACATACAAATGTACCCACTG
    CTGAGTTCTTCAGGGATTATATCATTAGGTACTTGGTATTTTAAATACATTACATTATGCAGAA
    GTCCTTTGTGGATTGCTATATTTGGAGAGTTTTGTGATATTGGGGGGATTAGATGGAGTTTTCA
    GATGGGCATCATACGGTTTTTCATTTAAAACCCTAGAGTATTGTAATCCTAGGGAGTGA
    TCCTGCGATTAGTAAATTAGCTCTCCAATAGATTTTCAATGTGGTTGCAAAGGACATGCATGT
    GGTTCACCCTCCCAGGAAATCCAGAAGGGCAGCATTGGCCTGAGTGGCCTGAGTTTGGCTGGT
    TGGGCTGGTAATGCTGGACAAAGA
    CAATGGGTGGAATGGTTTGCTTCCCTCAGTCCTTTCAGACACAGCCCAGC
    CCACCACGTCAAGCCAGTGGGTGCATCTGCAACCAATCCCCATGAGAACT
    GCAGCCTCTCAGAGGTGGGCAAGTTGGCCCGGGTGGGTCAGGAGGATCAG
    ATGTTGAGGAAATCTTTGGATTGGAGGCAGGCAGAGCAGGGAAGCATCGG
    GTGATTCTATGACAGACCCAGGGCTCCAAGCTGCAGTTCAGGAGGGGCAC
    TGGCACGGCCTCTGCTCAACTCCCCCTTGAGTGACATCAGGTGAAGTGCC
    GACAACACAGAAGGCAGCAAATGCTGCCAGTCAGGTCTGCTTCCCAGGAC
    AGCCAGTTGCTAACCCTTCTCCAGCACAGCACTGGATTTTGGTCACCTGG
    CTGGGAGCTCCACCTCCCCAGCTGCTGCCTCACCTGCTTTTCCAAACCCC
    ACCCTGTAAACGGTAACTACATTTTGTGCCCACTACGCCTCGTTTCCATC
    TCTTTGGAGCACCTCTCACGTGGAGCTGAACAGAACGACCTGTTAAGCCC
    ACCGTGTCTGTTAGGGTTGTCTAGGCTGTATCAGATACCCAACTAAAACT
    GGATTGACCAACAGGTATTGTCAAAGCACATAAGAAAGAGTCCAGAGGCA
    GGCAGCTCTCAGCCTGGTGTCAGGCTCTGGGTCAGCTTTCCAGATTCTCT
    TAACCTTCCCCACATCTGCCAGATGCCGCCACAGGCACAGGAGGTACAAA
    CAAACCCAAAAATGTTCTGGAAACAAGAAGGGAAGGGGATCCCCACCATA
    TCTCCCCAGAGGCCTTCCTTCTCACATCTCACTGTACTGAAGCCAGCTCT
    AGCAGAAGACAGCAGGGTGAATTTGTCCAGGGTATTCAGCCCCCAGTGCT
    GGGTCCATTACTACTTGACCCCTGAATAAAACAGAGGTTCCATGAGCAAG
    AAGGAAGGGGAACTGGATGTTAGAGGGCAAGAATGTATCCATCCCACCCC
    TAGGAGCACGCATGGACAACTGCCCCATTTTTGCTCCTATTGCAGCCCAG
    GGGCTAGCCCAGAGACCTTGCCAGTGCTGAGTCACAAGATGCTGGGAAAG
    TGAGACCAGAGCCTGGTCTTGGGGAACAGCTCAAGGCCGCATTGGTCTGC
    AGGTCATAGAGCAGCTGCTGAGCAGTGAGAGCCCACGATGGGCCAGGCCC
    TGGGTCTTGGAGACCTGAATGAGATAGACTGGGTTCCTGTTCTCCTGGGC
    ATTGCCTCTTAGAGGGCAAAGACAATTAACAATAAACAAATAGAACATGA
    AGTGTTTTCCGATAGTGACTGATATACTTTGGATATTTGTCCTCTCCAAA
    TCTCATGTTGAAATGTAATTCCTTATGTTGGAGGTGGGGCCTGGAAGGAG
    GTGTCTGGGTCATGGGGGCAGATCCCTCATGAATGGTTTAGTGCCATCCC
    CTTGGTGATGAGTGAGTTCACGTGAGAGCTGGTTGTTTGAAAGAGCCTGG
    CCCCCTCTCATTCTCCTGCTCCCACTCTTGCATGAGACACCTGCTCCCCC
    TTCTCCTTCTGCCATGATTTTAAGATTCCAGGGACTTCACAAGAAGCAAA
    TGCTAACGCCATGCTTCTTGTTCTGTCTGCAAAACTGTAAGCCAATTAAA
    CCTCTTTTCTTTGTAATTTATCCAGTCTTGGGTATTTCTTTATAACAGCA
    CAAGAACAGCCTAATACAGTGATGCTCTCCAAGTGACCTTTGGGCTGAGA
    CCTGAAGAAGAAGGGGAAGCAGTTAGGTCTGATAGCTCATGCCTGTAATC
    CCAGCTCTTTAGGAGGCTGAAGTGGGAGGACTGCTTGAGCCTAGGAGTTG
    AAGACCAGCTTGGAAAACATAGCAAGACCCTGGCTCTACAAAAATATTTT
    TTAATTGGCCAGGTGTGGTGGTGCACACCTGTAGTCCCACCTACTTGGAA
    GGCTGAGGCAGGAGCATCTCTTGAGCCCAGGAGGTTGAGACTGCAGTGAG
    TCATGTTCACACCACTGCACTCCAGCTTGGGTGACAGAGCAAGACCTGTC
    TCGAAAAAGAAGAAAGAAGAAAGTAGGAAGAAGAAGAAGAAGAAGAAGAA
    GAAGAAGAAGAAGAAGAAGAAGAAGAAGAAGAAGAAGAAGAAGAGGAAGA
    GGAACAAGAACAAGAAGAAGAACAAGAAGAACAAGAAGAAGAACAAGGAG
    AACAAGAAGAAGAATAAGAAGAAGAAGGAGAAGAAGAAGAAGGAGAGGAA
    GAAGAAGAAGAGGAAGAGGAGGAAGAGGAGGAGGAGGAAGATGAGGAGGA
    GGAAGCAGAAGCAGAAGAAAAAGAAAGAAAAGAAAGAAAGAGAAAGAAAG
    AAAAGGGAAGGAGGGAAGGAAGGAAGGAAGGAAAAAGGGAAGGAAAGGGA
    AGGAGAGGGAGAGGGAGAAGGAAGAACAAAGAAGAAAGAAGGAGAAGCAG
    AGGCTTGTGCTGGATAGCCTTGCTTTTGCCAATGACCTTGCTGATTTTCA
    GGGGGTCCTGGTGTCTTAGTCCATTTGTGTTGCTGTAAAGGCATACCTGA
    GGCTGGATAATTTACAGAGAAAAGAGGTTTATTTGGCTGAGAGTTCTGCA
    GGCTCTACAAGAAGCATGGCACCAATGCCTACTTCTGATGAGGGCCTCAG
    TCTGGTTCCACTCATGGCAGAAGGTGAAGCAGAGCCTGCATGTGCAGATA
    TCACATGGTGAGAGAGGAAGCACGAGGGGGCAGGGAGGTGCCAGCCTCTT
    CCTAATAGTAAGCTGTCTTGAGAACTAATAGAGTAAGAAATAACTCACAC
    CCTGCCCCCAAGGAAGGGCATTAATCTATTCATGAAGTATCTGCCCCCAT
    GAGCCAAACATCTCCCATTAGGCCCCCCACCTCCAACATTGAGGATCAAA
    TTTCAACATGAGGTTCCGGTGGGCAAACATCCAGCTATAATACTGGGCAA
    TGCTGACCAGACTCTTCCCCTCTCAGGCCCAGAGCTCCTTGGCCCTGTAA
    CAACAGAAAATTGCGTTTGAGTGTCAAGATTTTTCCTTTAGTCCCCATGC
    AGCTCCTTAGAATGAGGTGGCATCTTCTCCCTTTTCATAGGTGAAGAAAC
    AGAAGCTCTGGAGGAACGAATCATTCATCCAAGGTCAGGTAGCTAGTAAG
    CGTCCCACCAGCTCCCCAGATCTCCTGTTTCCTGTCCCAAGTCCCACTGA
    GTGAGCTGGAACAATGGCTTCACTGGCACCTGCCGGGAATGGTGGCAGGT
    GCCTATAATCCCAGCTACTCGGGAGGCTGAGGCATGAGAATCACTTGAAC
    CCGGGAGGCAGAGGTTGCAGCGAGCCAAGATCACACCACTGCACTCCAGC
    CTGGATAACAAACGGAGATTCCATTTAAAAAAATTAACATATAATATACA
    TACAGTAACATTCACTTTTTAAGTGTACAGTTTGATGAGTTTTATCAAAT
    GTATATGGTTATATAACCACCATCACCATTAAGGCAGAATCTTCCCATCA
    CTCAAATAATTCCCTCAGCCCCACCTCTTGCTGTCAATCACTTCTCCCAC
    CCTAGCCACTGGAAATCATTCATCTGTTTTCTGTCCCCTTGGTTTTGCCT
    TTTCTAGAATGTTCTATACATGAGACCACTGAGAATATAGTCTTCTGTGT
    CTGGCTTCTTTCACTTAACATAATGCCTAGCTCAGCAGTGTGTCAATCCT
    CCCTCCCTTGCCATTGCTGAGCAGTGAGTATTCCACTGTATGGCTGTGCT
    ACGGTGTGTTCATCCATTTATTCATTCACCAGCTAATGGGCATTTGGATT
    GTTTCCAGGCTTTGGCTATGATGAGTGAAGCTGCTGTGAATGTTCAAGTA
    CAAGTCTTTGTGTAGACAGGGGTTTTCAATTGGCGGGATAAATACCTAGG
    AGTAGTATCGTGTGGTTAAGCGTACGTTTAAACTTAGAAAAACTGTCAAA
    CTGTTTTCCAATGTGGCCTGTACCATGTTGCATTTCCATCAGCAGTGTTT
    GAGAATTCCAATTGCTCCACATCCTCCTCCCGACACTTGGTTTCACCCAT
    CTTTTAAATATTAGCCACTCTGGTGACTGTGTAGTGATATGTCAGTGTGG
    TTGTAATTTGCATTTCTATGATTGACTAATAATAATGTTGCAGATATTTC
    TGTATGCTTAGTGGGCATTTTTGGTGAGTTTTTAAAAATTGGGTTGTTGT
    CACCGTCTTATTGAGTTGGAAGAATTCTTTATATGTTCTGGATGTTTATT
    CATGTGTGTGTCTGCTAAGAGGTGAGACTGGTTCTACCCTGGTCCTAACA
    AGCACCCTGGGCCTGCATCCCTTTTTGTGTCTGTGAGCTGGGTCTGCAGC
    CCTCTCCTCCCACTACCTACTGCCCAGCAGTACCCCTCACCCATCACTGT
    GGCTCCTGCAATGACATCTCAGCCTGTCTCTCCCTCCCTCCAGCTAGCCA
    GAGGCAGGATGGCTCAGTGACACAGGGTGGGCCCTGAAGACAGAGTGCCA
    GGGTTTGGACCTTGTATTAGCAAGAGTCACAAGGGAAACTTACTTTATCT
    CTCCATAGCTCTGTTGTGAGGATCCAATAAATTAATCCATAGAAGAGCTT
    AGGACAGCACCTGGCACAAAGTATACATGAGCTATTATGATGTTATTCTT
    CCAACCCATTGTTTCTGTGTTGTCATAAACATGAATGCAGGACTCAGTGT
    CCCAGCTCTGTGTCCCTCGCATACATTCCCTAACAGCCCACAGGTCTTGC
    CTGTCACCGCCTCATTCAATAAGTGATGACTCTGCCTCTTCCTTGGCTGG
    GGCCTTGCATTGGACATTTCTGTATCCATATTTGTTTTTTAAAAACTAGC
    TGTTGGCCGGGCGCGGTGGCTCACATCTCTAATCCCAGCACTTGGGAGGC
    AGAGACAGGTGGATCATGAGGTCAGGAGTTCAAGGCCAGCCTGGCCAACA
    TGGTGAAACCCCATCTGTACAAAAAATACGAAAATTAGCTGGGCGTGGTG
    GCATGCACCTGTAATCCCAGCTACTTGGGAGGCTGAAGCAGGAGAATCGC
    TTGAACCTGGGAGGCAGAGGTTGTAGTGAGCCAATATAGCGCCACTGCAC
    TCCAGCCTGGGCAACACAGCAAAACTCCATCTCAAAAAAAAAAAAAACAA
    AAAACAACCTAGCTGGACTTGACACTCTTGTTAGAGGAAGATTTTTCCAC
    ATCTGTTAACTTTTCTTCTATTGTTATCCATCTGTGCAGGTTTTTCTGTC
    CTCCTGAGTCATTTTGATAATTTATATTATATTTTGAAAATCATCCATTT
    CCTATAGTTGTTTATTAGTGTCTTCTCTGTTATATTTGATCAGATTACCA
    AATCTTGCTCATTGATTGCCCATTTATTTTATTGTGTTTATTTTTTTGAG
    ACAGGGTCTCACTCGACAGCCCAGGCTGAAGTGCAGTGGTGCAATCATGG
    CTCACTGCAGCCTTGACCTCCTGGGCTCAAGCAATTCTCCCACCTCAGCC
    TCCTGAGTAGCTGGGACCTCAGGCACACGCCACCACAGCTGGCTAATATT
    TTATTTATTTATTTATTTATTTATTTTTGTAGAGATGGGGTCTCACTATG
    TTGCCCAGGCTGGTTTCAAACTCCTTGGTTCAAGTGATCCTCCTGCCTCA
    GCTTCCCAAAGTACTGGGATTACAGGAGTGAGCCACCATGCCCAGCCCCT
    ATTTACTTTATAGTAAGTGCCTTCATGGGCATAAATGTTCCTCTGAGACA
    GCTTTGGCTATTAGCCATACTTTTAATATTTTGTACATTCATGGTTATTC
    ATTTATAAATGGTCTGTAATGCAATGCAGATTTCCCCTTTGGCCCAAATG
    CCATTTACAGCAGCACTTTTCTCTTTCTGAGCAGACAGAATATTTTGGTT
    TCCCCTCTGTTGTTTATTTCTCGTCTGCCTCGCCTCATTTGCTAGGTGTT
    CCCTTGGTGTGCCTTAAGTATGAGCCACTCAAATATTTGTGTTTCTCTAA
    ACACCCCTGACACTGTCCTGCTGGTTTCTCTATCTGGAATATCCTTCCCT
    TCTTGGCCAGTTCCCCCTAGTGCATCAAAGAAATCCTGCTCTTTTGCCTT
    CAGAAAACAAAACAAAACGAAACCTATCAGTCTCCTTATGTCCCCAAAGA
    CATAGCTTTGCTGGTATCTGGTTGTATTGAGCTGTTCATTTGTCTCTTCT
    GCTAGATGGTAAGCTCCTTGGAAACTAAAAACTAATCACTTTTCTAACTT
    CAGACTGAGCACAAATTAGGTTCTCAAGAAACATTGAATAATGAGTGATC
    CGGTATCCCCTTCCAACATATTTTTGGTCATTGATACCATCATTCTGAGT
    AGTTACTAGGGAACACTTCACTGCAGTAACCAATACAGCAAAACGTGAAA
    TACAGTTACATAGTAGAATTGTATTTCTTGCCCATATAATAGTCAAGTGC
    AGTTCTTCATCAGCTGGGAGGTTCTCCTCCACACAGTCATTTAGGAATCC
    AGGGAACATAGCAGAGGTTGCTAGCTCTAGACCCAAACCCATGTCCTCTT
    TGTCCACAGTGAGGACAATGCCAGCAACAGCTGGCCAGCTGTTCTGTAGT
    TCTCAGCCTCCCTCGCAGTGAGATGTCTCCATGCAATTTCAGTGGAGCAA
    CATATACCATTTCCATTTCCAGGTGTAGGCTCCTAAGAAGAGGGTGGCTT
    CTTCATGTTCTTTCTCACCTTTCCGTAGGCTAGCTGCAGATAATGATGAG
    GCTTTAGGGAGTGGGTGGAGCCATAAAGTAGAAGCCTGGATTCCTAAATG
    ACGGTGTGAAGTGTTCCCTAATTTCACGTAATTGTTTCTTAATTTCCTGT
    TTGGGTTATTTGTTGCTAAGGTATAAAAAAACCCTGATTTTTGTGTGTTG
    ATATTTGTGTGCTGCAACTTTGCTGAATTAGCTTATTAGCTCAATTTGAT
    CTCAGATATTAGCTCAAATATTTTGGGAGATTATTTATGGTTATCTACAT
    AAGATCATGTCATCTGAAATAAAGATAGTTCTATTTCCTTCTTTCTATCT
    TAGTCCATTTGGGCTGCTGTAACAAAATGCCATAAATTGGAGGCTGAGAA
    GTCCAAGATCAAGGCCCAAGCTAATTCACTGTCTGATGAAGGCCTGCTTT
    CTGGTTCATACATGGCACCTTCTAGCTGTGTCCTCACATGGTGGAAAAGG
    CAAGGTAGCTCTCTGGGATTCCTTTTTGTTTGTTTGTTTGTTTTGTTGTT
    TTTGTTTGATTTTTTGAGACAGAGTCTCACTCTGTCACCAGGCTGGAGTG
    CAGTGGCACAATCTCGGCTCATTGCAACCTCTGACTCCCTGGTTCAAACG
    ATTCTCCTGCCTCAGCCTCCTGAGTAGCTGGGATTACAGGTACCCATCAC
    CATGTCCAGCTACTTTTTGTATTTTTAGTAGAGACAGGGTTTCACCATGT
    TGGCCAGGATGGTCTCGATCTCTTGACCTCGTGATCTGCCCACCTTGGCC
    TCCCAAAGTGCTGGGATTACAGGCATGAGCCACCGTGCCTGTCCTCCGGT
    ATTCTTTTTATAAGGGCTCTTTTTCTTTTTATGTGGGCTCTACCCTCATG
    ACCTAGCACCTTCTAAGGCCCCACCTCTTAATATCATCACACAGCAGATT
    TAATATATGAATTTTGAGGGGACACATTCTTTCCATAGCACTTTCCAGTA
    TGGATACCTTTTATTTATTTTTCTTCCCTAATTGCTTTGGTTAGAAATGT
    CTTCCCTAATTGCTCCACTACTATGTTGAAAAGAAGTGGCAAAAGTGGGT
    ATTCTTGTCTTGCTCCTCTCTTAGGAAGAAAGTTTAAGTCTTTTGCCATT
    AAATATGACGTTAGCTATGGGGTTTTCATATATGACATTTATCATGTTGA
    GGAAATTTTCTTCTTGTTTCAATGATGACAGGGTGTTGAGTTTTGTCAGA
    TGCTTTTTCTGCATCAATCAATATGACCATGTAGTTTCTTTGTTTTATTC
    CATTATTGTAGTACATTACATTAATTTTTGCATGTTGAACTATTCTTGTG
    TTCCTGGGATAAATTTCACTTGGTTATGGTGTATAATCCATAACCATAAC
    CTGAAGATATGCTGAAGAGGCTAAGTGCCATGGCTCATGCCTGTAATTCC
    AACACTTTGGGAGGCTGGTGTGGGAGGATCACCTGAAATCAGGAGTTTTA
    GAAGAGCCTGGGCAAGTAAACAAGATCCCATCTCTACAAAAAATTGAAAA
    TTACCGCTGGGCATGGTGGCTCACGCCTGTAATCCCAGCACTTTGGGTGG
    CCGAGGCAGGCAGATCACCTGAGGTCGGGAGTTCTAGACCAGCCTGACCA
    ACATAGAAAAACCCCGTCTCTACTGAAAATACAGAATTAGCCAGGCGTGG
    TGGCACATGCCTGTAATCCCAGCTACTCAGGAGGCTGAGGCAGGAAAATC
    ACTTGAACCTGGGAGACGGAGGTTGCAGCGAGCCAAGATCATGCCATTGC
    ACTCCAGCCTGGGCAACAAGAGCAAATCTCCGTCTCAAAAAAAAAAAAAA
    GAAAAGAAAGAAAGAAAGAAAAGAAAAGAAAGAAAATTAGCTTGATGTGG
    TGGTTGTGCACCTTTAGTCCTAGCTACTCAGGAGGCTGAGGCAGGAGGAT
    TGTTTGAGCCCAGGAGGTTGAGGCTGCAGTGAGCCATGATTGCACCACTG
    CACTCCAGCCTGAGCAACAAAGTAAGACCTCATCACTAAAAACAAATTTT
    TTAATACTGAAGAATTTTATTTGCTGGTATTTTGTTGAGGATTTTGCATC
    TATATTCACAAGAAATATTACTCTGTAGTTTTTCTTCTTGTAGTATCTTT
    GTCTGGTTTCAGTATCAAGGCAATGCTGGCCTCATGAGATCAATCAGGAA
    GTGTTACTTCCTCTTTTATTTTTTGGAAGAATTTGAGAGAATTGGTGTTA
    ATTCTTCTTTAAATGGTTGGTAGAATTACCAGTGTAGACATCTGGTCCTG
    GGATTTTCTTTGTTGGGAGGTTTTTTAGTACTAATTCCATTTCCTTACTT
    GTTATTAGTCTAATGAGATTTTCTGTTTCTTCTTGAGCTAGTTGTAGTAG
    CTCATGTGTGGAATTTTTCTATTTCATCTAAGTTATCCAAGTTTACCTAA
    GTTAAAGTTCCATTTTATCTAACTTGGGTAAGCCAACAAACAATACTAAA
    TTGTTCATAGTATTCTCTCATAGTCCTTTTTTTCTCTAAAGTCAGTAATA
    ACGTTCACTCTTTCATTTTTTCATTCCTGATTTTAATAATCTGAGTTCTT
    TCTCTCCCCCTCCCTGCAATTGAGAGTCATTTAAAAGTGTCTTGATTAAA
    TTTTATATATCTGTGAGTTTTCCAGTTTTCCCTCTGTTATTCTCTTCTAG
    TTTTATTTCATGTGATCCAAAAAGATACTTTATATGATTTCAATTTTTTT
    ACATTTACTAAGACTTGTTTTGTGACTAAAATATCCTTGAGAATTTCCAT
    GCACATTTGAGAAAAATGCACATTCTGCTGTTGTTGGACAGAGTGTTCTG
    TATATGTCTGTTAGGTCTAATTGGTTTAGAGTATTGTTCTAGTCCTCTCT
    TTCCTTATTGATCTTCTGTCTAGTTGTTTAATCCATTATTCAAAGTAGTG
    GCCGGGCACGGTGGCTCACACCTGTAATCCCAGCACTTTGGGAGGCCGAG
    GAGGGTGGATCACAATGTCAGGAGGTTGAGACCAGCCTGGCCAACATGGT
    GAAACTCCGTCTCTACTGAAAATACAAAAAATTTGCTGGACATGGTGGCA
    CACGCCTGTAATCCCAGCTACTCAGGAGGCCAAGGCAGGAGAATCACTTG
    AACCCAGGAGGCAGAAGTTGCAGTGAGCTGAGATCGCACCATTGCACTGC
    AGCCTGGGCAACAGAGCAAGACTCTGTCTCGAGAAACAACAAAAACAAAA
    ACAAAAAACAAAGTAGTGTACTAAAGTCTCCAACTACTATTGTAGAACTC
    TATTTCTCCCTTCAATGTTGCAAAATTTTGTTTCATGTATTTTGGTGTTC
    TGTTCTTTATAATTTTTATATCTTCTTAATGGATGAAAACTTTTATCAAC
    ATATAATGTTCTTTGTCTCTTGAGACTTTTTTTTTTAACTTAAAATCTAT
    TTGGGCTGATAATACAGCCACCACAACTCTCATATTGGTTGTTATTTTCA
    TAGAATATCTTCTTCCATCCTTCTACTTTAAAATTCTTCTATCTTTATAT
    CTAAAGTGAGCCTCTTGTAGATAGCATATAGGTGGATAATGTTCTCTTTA
    TTCACTCTGCCAATATCTGCCTTTTAACTGGAGTTTAATCTATTTATATA
    TAAAATAATTACTGATTAGGAAGGACTTACTTCTACCACTCAGCTATTTT
    TTTTCTGTGTGTCTTATACATTTTTAAGTTTCTCAATTCCTCCATTACTG
    GATTTTTTTTTTTACTTCTTGATTTTGTGTCTGTGTTGTTACATTTTGAT
    TATTTTCTCCTTTTGATAGCGGCAGGAGGCAGCCAAATGCCTGGCAGATA
    GAAGCTTGTCCCCCATGAAACCCCACCTTCAAGCCAAAAAATAGCCTGAA
    GGCTGAAAGACCGGACTGCTGGTCCCAGATGAAACCCATGATCCAGAGTG
    AGAACTTCCATTCCTGTTTGCCTGCCCTCTAAATAATCCCTTTTAACCAA
    TCGAATGTTGCCTTTTCCAATACTACCTATGGCCTGCCCCTCCCCCATTC
    TGAGCCCATAAAAGCCCTGGAATCAGCCACATTGGGGGCACTTTGCCAAC
    TTCAGGTAGGGGGACCACCTCTGTATCCCTTCTCTGCTGAAAGCTGTTTT
    CATCACTCAATGAAACTCTCACCTTGCTCCCTCTTTGATTGTCAGCGTAT
    CCTCATTTTTCTTGGGTGTGGTACAAGAACTCGGGAACCAGTGCACAAGC
    CAGACTTGGTCTGGGCAGCACGGGTTAGTGGGCCATCTCCCACAGCAGGT
    AGCATGGCCAAGTGAGGCCTGGGCAGGGCATCACCAAGGTCCCTGGCTTG
    CAAAGTGACCAAGGAAAAAATCCTGTGTCACTTTCCTTTTCTCATATTTT
    TTAGTTATTTTCCTAATGATTGCCTTGAGGATGGCAATTAACATCTTACA
    CTTATAAGAAGCTAGTTTGAATAATAGTTCCAATAGTACATGAACACTCT
    ACTCCTATATATCTCCATCCTTCTTCCTTTATATTGTTATTCCCACAAAT
    TATGTTTTTATACATTATATCCTCACTAACATAAACTTATTATTATTTTC
    TGCATTTGCCTTTTAAATCATACAGGAAAACAAGAATCACAAAGAAAAAC
    TACATTAATATTTGCTGTTATATTTACCTATATAGTGACATTTAACAGTG
    TATTTTTATGTCTTCAGATGTCTTTGAATTACTACTTAGTGTCTTTTCAT
    TTTAGCCTCAATGTTTCCCTTTAGCATTTCCTATAGGGCAGGCCTGCCGG
    TAATTAATTCCCTTTGGTTTTCTTTATCTGAAATGTCTAATTTCTTTTTT
    ATTCTTGAAGAATAGTTTTGCTGGCTATAAGATTCTTAGTTAATAGTTTT
    TTTCCCAGCACTTCAATTATTATTAAAGTGTTATTATTATTATTATTATT
    ATTTTGAGATGGAGTCTCCCTCTGTCACTCAGGCTGGAGTGCAGTGGCGC
    AATCTCTGCTCACTGCAACCTCCGCCTCCCAGGTTCAAGCAATTCTCCTG
    CCTCAGCCTCCCGAGTTAGCTGGGATTACAGGTGCCCGCCACCATGCCCA
    GCTAATTTTTGTATTTTTAGTAGAGACGGGGTTTCACCATGTTGGTCAGG
    CTGATCTTGAACTCCTGACCTCAAGTGATACACCCACCTTGGCCTCCCAA
    AGTGCTGGGATTAGAGGCATGAGCCACCATGCCTGGTCTAAAGTGTAATT
    ATTATTACAGCTGCCATTTGGCCTCCTTGGTTTCTAATGAGAAATCATCT
    GTTAAACTTATTGCAAATCCTTGGTATGTATGCTATGTGTCATTTCTCTC
    TTGCTGCTTCCAAGATTCTCTCTCTGTCTTTGTCTTTTGACAATTTGACT
    ATAATGTGTTTCAGTGTGAATTTCTTAGAGTTTATCCCACTTGGATTTCA
    TTGAGCTTCTTGGATGTGTACGTTTGTCTTTCACCAAATCTGGGAAATTA
    TTTCACCATTTCTCAAATATCTTTTTCTTCCCCTTTCCATCTCTCTTCTT
    CTGGAGCTCCCGTATACTTAGTTGGCATGACTGATGGTATCCTACTGGTC
    CCTCAGGTTCTGTTCATTTTTCTTCTTTCTTTTTTTCTGCTCTGCAGACT
    GGATAACTTCAATCGCCTTTTCTTCAAGTTCAATGATTATTTCTTCTGCC
    TGCTCAAATTGGCCATTTAACCCCTCCAGTGACTTTTTCATTTCAGTATT
    GTACTTTTCAGATCCAGAATTTCTATTTGGTTCCTCTTTAATAAATTCTT
    TTTATTGTCATTCCCCATCTGTTCATACATTGCTCTCCCAATTTCCTGTA
    GTTCTTTGTCCATGGTTTTCTTTAGTTAATTAAGCATATTTAAGACAGTT
    GACTTAATGTCTTTGACTAGTAATTCCAATGTCTAAAATTCCTTATGGAT
    AGCTTCTTTTAAATTATTTTTGTCCTGTTAGAGAGTCATATCTTCCTCTT
    TATTTGCTTTGTAATACTTTGTTGAAAACTTAACATTTTGAGTAGTAAAA
    TGTGGTAATTCTGAAGCCAGATTCTCCCCCTCCTTTGAGATTGGTTTTGT
    TGTTTGTTGAGGGCTGCAGTTGTCCATTTGTATAGTGACTTTTCCAAACG
    ATTTTTGCAAAGTATGTATTCTCTCTTGTGTCTGGTCACTGACGTTTCTG
    TTCTGGTGCCTCTGCAGTCAGCCTATGACCTGGAAGAGCATTCCTTAAAT
    GCATAGATTTTTTTAAAACCCAAGAAACAAAAAACCTAGCATGTATGTAC
    CTTTTTAAAAATCTTCTGATAGATGCCACCTGGAAGGCTGCTGCTGCCTG
    AAGGGGCAGAAACAAAGGCAAGCTCTACTCTGAGCCCTCAGGGAACCACC
    AGATAAACAAAAGAAATTTGATTCTCCAAATTTCTGGAAGACAAGGTCCT
    TTCTGCCCACTCCTGCTCCAGCCAGCTGCTCTAGGAACACAATTACTGTC
    CACATGGCCACAGGAATGTTGAAGAATGCAGGATGGTAGCTGGTTTGCCC
    ACACCACTCACTTATGAGCCATCAGCATGCCTCTCCCTTCATCGAGCACT
    CCCATGGTTGCTGTAAGTGTCCAATCAGGTTCCAGAATTCTGAAAGAGTT
    GACTCTTACAGGATTTTTTTCTTTTCTAACTTGCTGGTTGTTTAGATAGA
    GGAACCAATTCCTGAAGTTTCCTACGTTGCCAGCTTCATGAGGATCATTC
    CCTAGTAACTCTTTTCAGACAAAAAGCTTCATTGATTTACTGTAGGACTA
    GCATCAAAGAGTCTATGCCACCTAGTCTGTCTCCTTAAAACACAGAAATA
    ATCAGTATGCATTGGGGTAGGAGTTTGGCATTAGATCTGCCGTAAATCAA
    GAGCTGGGGACAGCCCATGTCTTAAACTCTGACCCAAGGGCTAAAATATC
    CTTTGGTAGCAACAACAGCTACAAACTATTGAACAACTTGTATGTGCCAA
    GAGCCTTACCTGCATTATCCCATTGAATCCTCTCAACAGCCCTGTGAGGT
    AGTAGAATTGTTGCCTGCCCCTTACTGAGGCCTAGAAACATTAAGGAATT
    TGCCCGAGGCCCTAGAGCCAGTGAGTGGCAAAGCCAGTCTCCAGACTCAG
    GCTGGAGATCCTACAGTTCTGTGTTACCCCAGTGTTATCCTGCCTCTCAG
    CACAGAGTCTTGGATGATTCTCCTAACCCCTCCCTAGGCAATGCACAGGG
    CTGCTCCCTGCACCCTTACTCATGCTCTGCTCTTCAACCCCAACAGTGCT
    GGCCTTAGGCTTTATCCCTGACACCCAGCCCCAGGCTCCATTCCATCTGT
    TGACAGAGGCAAACACTGGGGCAAAACTGACCTCTGTGGATACCACTGTG
    TCCACCTCCACCAGCTTCAGCTGAAGCCTCTGAACATCTCCAGCATGGAA
    GAAGCCCCAAAGGATATTTCCTGTCCCCCAGCATATGCTTGACCCTGAAG
    CCCTCCCCATCTAGTCAAGAAGACCAAACTGTTAACAATCCTGGAGTCAG
    AGTGACCCATGGGTGAATCTTAGCCAAGTCACTCATAGCTGTTGCATCCT
    AGTAAATCCCTTAACTCCCATAGGCTTCAGTTTCCCTGCATATAAAATGA
    CAGCGTTCAGCTCATCGGCCAGTTTCAATCCATCTAAAGGGTCTAGCACA
    TCCCCTGGCATGTGGAAGCCACAGGGCACACACTAGTTGTGGTCATTTGA
    TCCTGGCATGCTCTGCTGTCTCTCGGCTCTCCCCTTGCCTCTTTCCCTGA
    TGTCCTGGCGATCAGCCACTGCCTAACACCCTCCCACTCACCAGGCCCTT
    AGCCTGCCCCTTAGCACAAGAGCACAGCCGGTCTCAAGTCTACCCTGCTG
    TAAGCAAACACTTGCAACATCATGCTGACCTCCAGGCCCTGTTGCATCAG
    CGTGCCCACACTTGGTGCCCAGCTGGTACTGAGGGTATCAGGGAACAGGC
    CAGTGGTGGAAGGGCGGACACTTTGGGTTCCCTGGTTTCCTGGCTCCCAA
    TATCTTTCCCAATGGCATATGGGGTCTAGCAGCTTGGCTCATTTAACTGT
    GAACCTCTACCCTTTAGAATCTGGGCCTCCAGGCTTGCTTCTGTGCAAAA
    TGGCAGATAAGGCTCAACCTTTCTTTTTTTAACTTCATTGTTAAATATTA
    CTCCATTAATACCCATTTACTGCAGAAAAGGTAGGAAATACAGATAAGCA
    AAAAGGAAAATAAATTAAAATCCTCATACCACCATCATCAAGATAATTAC
    TGTCACCATTTTGGTATATTTCCTCCCAATACATATATTATCTATATCGT
    ATATACGACAAAAATGGATCATACTATGTTTCCTGTTCTTCCCCTGTGTT
    AGTCATCTATTGCTGTATAACAAACTGCCTCAAAACTTAGTGGCTTCACC
    TTTCCGTGTATTATGATGACAAGAATGTGGTATGACACTGTCTTATATCT
    GGATCATATGCTAAAAGATAGAAAATGGTTTCTAAACTTATTTGTTCTGT
    AATAACAAAATTTTATTTCATAAAGTGTTTTTAAAAAAAACCATAGTAGC
    TTGAAACAACAAACCTTTGTTATCTCACACAGTTTCTGTAGGTCAAGAATTCAGAAGCAGCTT
    AGCTGGGTGGTCTGGCTTGGTGTCTCTCCTGAGGTCAGGGTTTTGGCTGGGGCTGCATCACCT
    GAAGGCTTGACTGGGGCCAGAGGA
    CTGCTTCCAAAGTGGTCCACTCACATGGCTGGCAAGTTGGAGTTGCGTATTGGCAAGAGACTT
    CGCTTCTTCTCAATGGATCTTCCCAGAGTTCTTGTAGGCAACCTCATAGCATAGCAGTTGGCTT
    CCCCCAGAGGGAACAGTCCAGGAGAGAACAAGGCAGAAACCACAGGGTCTTTTCTGGCTTAG
    GCTCCAAAGTCATACTCCACCATTTCTGCATTATCATATTAGTTACACAGGCTAGACCTA
    TTCTGCATGGAAGAGACTATACCATGGGGTGAATACCAGAAGCAGGGCTATTGAAGGCCAGC
    TTCAAGGGCGGCTACACATTCCCTTTCAACAGTATGTCATGAACATCTTTCCATGCCAATAGA
    GCAGATGAATCTTACCATTTTTAATGACTACATGTAAGTGTAGCATAATTTATTTAACCAACCT
    CCTGTAGTTGGGTATGTGGGTTGTGTCTCGTTTTTTGATAGTAGAATTAATCATCTTGAA
    TATCCATCACCAAACTTGTCATATTATTTTCTTTTGATGAATGAAAAAGAAAATCAAGTCATGT
    CTGTCAATCAGAACCCTGAGCAACTAAGAAATGGGGGTACCACTGGGACATAGAGCAAGGTC
    CCTTCTGATTCTGCTCTTGTCTTTCTCTCCCCATGAAATGGGGAGTTCACTATCTACTGAGACA
    TCCTAGCCCACAGCTGCACAGTTCTGTCTTTTTAGAAAGCTCTAAGCAGAAACAATGTTC
    ATCCATCCTCCTCGGGACAGCCCTTGAGCTACTGAAGACTCTAAGCATGTCCTGGTCATCCTCC
    ATGAGCCATCATCTCTGAGGCCCTCCCCTTCTTGGCCCCTCTTCTCTGGACAGGTTCTGGACAG
    TCTTGCCCTTCCAAAATTCCTGAAAGCAGGAACTGTTCCTGCTACAATGACTCTCAACTCCAGT
    GCAGTACAGACTGTTGGTGTCACCCCTTATCCTGAAGAAGAGGCACTGAGACAGGAC
    AAGGGTGGGTGCCCAGGAGGGCTGGCATGAGTCATGAGAATCTGGTCCCGGAGAATTAGACG
    GTGTGGGGAAGTAGGGGTGTTGGGCCGCTTTCTGGCCTCATGGATGCCAATGAATATCAGCAG
    GTGGCTCCCAGAAAGGAACTCTAGGGGATGCCTGTTGCTCTAAATAGAGGCTAGAGAGGGCA
    CTGGCAGTTCAGTCAACCAAGAAAGGGGGCCCACTTGCCTCAGCTTCAGGCTTTGTACACATC
    CTCAGCCTTTCTTGAGAACTGAATTTAGATTCTCCTCCCCTGTGCTGTGTGCTTGGCCCAGAAG
    AAGGGCAAGTCTCGCTGGGTGGCTGCTTCTTGGCCTGGCTGAACCAGAAGGCCCCAGTGCCAC
    TCCAAACCTGGGTGTGAGCCCTGCCCCCATGAGCAAACAGTAGCTCAGAGCTGGGGGCTGTG
    GGGGTCAGTGGCCTGTCACATGAGATCTGATGAGGCCATCTCTGCTCTATATTGGGAAAGG
    GATCAATTGTATCAAGGGCTTTCTTGGGAGTGATCACTCTGGCCATTGGCGAGAGACCTGGCA
    TTCTGACAAGGCACCCTCCATACCCTGACCCACTTGCCAGCTCCAGCTAATTTTAGCAGGCTTT
    GGCAGGTGCCAGCAAGTACATAGCATGTGGATGTCACTCCCAGGTGAGCCCAAGGAGAGGCC
    TGGGCCAGAGCCTGGAAGTCATGGTCTATGCCCATGGAGGCACCCAAAGCAAGCCTGAGGC
    CTGGACTTTGCAGTCACAAAATTAAGAATGATACCCCTGTTTTTTGTTTG
    TTTTGATCAGTTGGCCACCTTCCTCCACCACCCCTTCCCCAAGTTCCAT
    CAGACCCCTGGATTGTATGAAATGCAAATCGAACCTCTCTGCAGATGAA
    AATCCACTGGGGATCCCCTTGCCTCCAAGAGCAAGTCCAGACCTGCACCA
    GCGCGGGCCAGGCCCCCTTAGGACCCCCTCCCTGTCCAAGGGCATTTCAG
    TAAGTGTTCTGTGGCCAAGGCAGCCTGGTGACTTTCTGCCCGCACAAGGC
    TGAGGAATGGAAGATGGGTAGGCTGGCTCTGCACACCCCCTCCTGCTGGG
    CAGCAATCCCTACCCCATGTTCACAGAGTGTGGCCGGCTGCCCCATGGCT
    CTGTCCCCGTGGCCCTGTCAACTGTTACCCACATGGCCTACCCTCCCTTT
    CTGCCCTGCCTCTGACCCCATGGCAGGGGGCAGAGTATTTGAGCAGCCGC
    CAGGCTGAGCCCTTTCAGTGCAGAAGCCCTGGGCTGCCAGCCTCAGGCAG
    CTCTCCATCCAAGCAGCCGTTGCTGCCACAGGCGGGCCTTACGCTCCAAG
    GCTACAGCATGTGCTAGGCCTCAGCAGGCAGGAGCATCTCTGCCTCCCAA
    AGCATCTACCTCTTAGCCCCTCGGAGAGATGGCGATGGATGTCACAAGGA
    GCCAGGCCCAGACAGCCTTGACTCTGGTAAGGGTCACACCAAAGTTAGGG
    ACTTTGCACTGGGAGAGCAGCACCCAGGGCAGGGCCCTTGGTTTTGCAGA
    TTACCAAAACTAAGGCTGGGGGCAGGGAAGGCGAGCAGGCTTGGGGCACC
    TTGGAAGGAGGCACATGGGCCTTGGGGGTCCTGGCTAGGGCAGCTGTGCCTGCCACTGGCCCT
    CTGCCCACCACCCCTCCTCACTGTGGCTATCCAGTGTCCAGCCTCTCGAGGGGTTCTAGGGTAC
    TTATTCCTGGAGCTAACGGTGACCCAGGACACCAGTGTCCGGGGCCTGGCCTGGGGCTTTTAT
    GGGGGGAGCTGGCTGGCTGCCCAGGGCTGTCTGGCTCTCTGGGGGCTCTGCATGGCATTT
    CCAGGGGTTGGTGGATCAGGGATTCTGTCCCTCAGGAGAATGTGGGCACTAGCCCAAGGCCA
    CTCACTTCTGTGTACATAGCCACCTGAGGGCCCAGGAATGGAGGGGGCCAGGCTACAGCTGG
    ACATCTGGCACTCGGATGGGCTCTGGAGCCCCCAGGCCTGCAGCATCTGCCCAGGGACTGCCC
    TGGCCCTTGGCCATTTCCTCAGGGACCCACAGCTCCACCAGCCGGCCCCTCCCAGTGCTGGAA
    TAGACAGTTCCTCAGTCCACATCTGCCAAAGGCGGCACTAGAAGGCATCCTGCCTTTTTTACT
    GCGTTCTGGAGGTGGGGTCACAAAGCACTGCTCACTGCATAAAAGGGACAGCATCCTGCCCCT
    GGCAGCCCTGCCTGACCAGCTCCGCCTCTCCCACTGCTATCCAACCTGTACACCCTGGTGACC
    ATGTCCAGGCCAGTGGCCTTAAGGACTGTCTCTGTACTGATGGCTCCACATCTACCTCTCC
    AGCCAGACTCTCCTCTGAACTCGGGCCTCACATGGCCAACTGCTACTTGGAACAAATCGCCCC
    TTGGCTGGCAGATGTGTTAACATGCCCAGACCAAGATCCCAACTCCCACAACCCAACTCCCAG
    GTCAGATGGAACCTCTTCTTCCCAGGCCCTTCTGTTCCTCTCCTCAGCCCCTCCCACCTCCCTTC
    AGAATAAGTCTAGACTCTTATCGCTTTCACCAAGCCTGCGCCCAGCATCCCTGCACAGG
    GATTGTTAGGACAGCCTGACGCCCTGCTTCCACCCTGCCCCAAGATGCCCCTGCTCTGCAGCC
    CGGCGCCTCCAGGCTTCTCACCTCCTGCTGCTCACAGCTCAGCCTCACTCCCTCCCTCCCCGCC
    TCTGCTCCAGCCTCAGTGCAGGT
    CCCTGCTCCCATCTTCTGGCAGCAGCTGCCCGACCTGGTCCCTCTTCATCTGTCCCCATTCCTTC
    ACCCCCCAGCCTGTCCCCAACTTGACTGAGGTTCTTTCCTGCAGATCCCCGCCCTTGAGAGGG
    GTTGGTCCCACTGTCAACTCTGCTTCTGTGCCCTGTGCCGCACCTGGCATTCAGTGAGCATCTG
    CTGAAGAGATGAGGGTCAGATGCCCTGCAGGGAGTGTGGGGGCGTCCTCAGGCAAGA
    AAAGTTGTACGTTTGGCTGTGGGCCCTGATTATGTGTCCTGTGACCTCTTGGGTGAGGTCAGC
    AAGAGAAACCTCTGCAAGCTGGCTGGGGCTGCCTCCCAGAGGCTGCCAGGGGGAGGGACAGG
    CTCTGTCTGTGCTCTTCTTCCGAGGCTACACCTGGGGCGCCAGGCTCTCAGGGCTCCCCAGGTA
    CCACCACATTTCCTACACTGCTTGGGAAAGCCCTGTAAGTTTGCACAGACACCCAGCATGA
    GGCTCGCCAGAGAGATACTTGTAGCTGGGGTCTGGGCACCAGGAACAGCTTGGTGCTGGGCC
    TGAAGTCGGGCAGGATGCAGCCTGGCCAGGTGAGAGGAAAGCTTGGAGCCAGTGCCTGGGTT
    CAAACTCCTCTGTGGCCTATGGTTCTGTGGGCTTGGGGAAGGGTTTGTACCTCTGTGTCCAGTT
    TCCTCACTTATAAAAAAAGGAGATAATAAAAGTACCCATGTCCCAGGGTGGCTGTAGCAATA
    ATAGGGAGGGGTGCCCAGAGCAGGTCTGGCACACAGGAAGTGTGCATCAG
    CCTCAGTCCCTGCCATTGGGCTTGTCCTGGGAGTCTGTGAAGCCAACCTC
    TGCTCCACAATGTGACCCCCAGGCTTGTGAGACCAAGCTGGGTCAGAGCT
    TCCTCCTCTGGGGTTGCACCAGGAGGGGAACTTCTGCAGGCCCAGATGCA
    CCCTGAGGAAAGGGCTTGTTCCCACCAAGAACAAGGCTCACCTTTGGAGG
    ATGCTCCCCACATGAGAGGTGAACCCCCAGGTCTACTGGTGACTGCAGCC
    TCGGAAGCTGACAGCATCTATCCTCCAACCCATGCCCACTGGGAAGTGTG
    TGAGGGGTCCTCATAGGCCCTGCGGTGTGGACAATGCAGAGACCCTGTAG
    CATCTGGCTAGGGCGGGGCCCAGATAAGAGCCCTGTGCCAGGAGAGCCTG
    GCCGGTTCTGCCACTGTGGGGAGACAGGCTCCCCCACCCCATGTCCCCTG
    CTTCCCTGCAGCCCACAGAGAATACAGACCTACTTTTACAGAAATCCAGA
    TTTTTGTGTAAAAGTGTCTCTATTTTAAGTAGATTTTAAGTGGTGGCAGC
    AAATTTAAGCTTTTGAGAATATTATACAGAACAAATCAGATTCACAGGCC
    AGATGCAACTTTATTTACAGAAATGGGATCAGGTCCTACCTCAGGTCCCA
    TCTCACGTTTTCACTTATGCCTATACGTCTCCTTCACGGGAAAGGCCACA
    AGAGGCCCTGCGGTAAGTGTCCCGGTGTTGATTTAAAGTCCCCAACAGTG
    AATATGAGGGTCCTCACTGTTGCAGCAAGAGGATACCCCCCTGTGTATCT
    TGGAAATGCCTGCAGCCCTCTTGCTGCAGAACAGATTCTTAGGAGAGAAA
    CTGTCAGATCAAAGTTAAACTTAGAGAAACTCCAAATTGCCCTCTGAACA
    GACGGTATCAGTTTGACATCATCCAATACCGGGATTCCTCGGGGAGAACT
    TTCTGGCCTAGAAGGCAGTAGAGCCAGGACTTCACCCAGTCAGTGGCAGG
    GCCACACGTGGGCCTTGATACAGAGGGGGAAGACTTGAGCCTCCTCGACA
    CCCTACAGGGCCCAGCCTCCCAACATGTGATAAGAGAAACAACAGCCAAC
    TTGTACCTAGCTCTCCTTATTCTCCAAGGGCTGGGCCAGTTCTCCCCACA
    GCCCTGCAAGGGAGGATCACTCAAGGGCCCCAACTGTCTGACAATACAGC
    CACACTCTGATCAGCCACCTGGGCATAGGCTCCATGCCATTGTCCTCCGC
    CAAGACCTCAGACTGAAATGTTGGCTCCTCCCATGAAGAACCTGGGGCCA
    AAGGACCAGAGTCCAGGTCCGTGGCTGCCAGGATGGGCCACTTGGAGAGA
    GGCACAAGGGTGGTGCCAGGCAGGTGTGAGGGCTGGACCTTTGCAAGAGC
    AGCATCACTTTTGTTGAGAGCCCACAGGTATCTTATAATTGGGTCCTAGG
    ACTTCCTGCCAGTAGCCATTGTGTGCATGGATTTGGGTGCTGGCCTCACC
    ATGGTGTGCTGGCTGCCCATGCCTGCAATAATGACTTCTGTAAGCCTTTC
    TTCATCTGCAAGATGGGTGCTGCTGGCACCTCCTCCCCGGTGCTGTGGTG
    ACAGGGCATAGTGTGTGAGGCTGCTATGTGAAGCACCTAATGCAGGGCCT
    GGCATATGGAGGAATTCAGCAAATGACAGATGCCTTCACAGTTAGTTCCT
    GGCATCCTCTACATTGGTGGGTGTAGGAAAGAAAGACAGAGGAGGCAAAA
    GTTGTAGCTGTGGGGCATTGAGGACAGCCTGGATTGTTCCACAGAGCCCT
    GAGGACATCTCCAGGGGTGTGCTCTGCAGGGGCAGCTGGATTGGAGGGTT
    AGGGGTCGGGGAGGGCGTGCACTCCCACCCATGCTCACAGCCTCGGAACA
    GTGCCTGCTCAGCCAACATGGGTGTTTGATTCTGTGTCTTTTGTCACAGA
    CTTTATCAGCCCCATCCCTTTCTGACCTTGCCTCAGTTTAAATTTTACAT
    GTGGGGCCTCATTAAGAGACATGGTTCTTAACTAAAGATCTGTATCCATT
    AGGAATGCTTTGGGCTGCAGGAAGACAAACACCTGACTCACTGTGGCATA
    AGTGGTTTGCGTCTGCTCCCATAAGCTGCACGTGGAGGGTGGATCTGGCA
    TTACTCTCTCTTCCCTACATTTGCAGTATGCTAACAGCTTTAACCTCCAG
    CCTTGTTTCTTCATGGTTGCAGGGTGGCTATCACAGCGCTGGCCATCACA
    TCCTTACACAGCTGTGTTTACAAATTTAGGGGGACATTGAAGCTCCTCCC
    CTGCTAAAATCAGGCTTCCCTTCACCTGTCATTGGCCAGAACTGGGTGAA
    ATGCCCAACTCTAGACCGATCATCAGTAAGAGGAGTATAGAATTGCTGTG
    CCCACCTTAGATTAATCATGGCGCAATGTGCTCCCCATACCAACAAAATC
    TGAGTTCTAGAAACTGAGGAAGAAGAGGAAAATGGCCGTCTTGCCTCCTG
    GCTGGGATTCAGAGCATCTCCAACCCTCTGAGCTTATGTGTAAGACTGTG
    GGCAAAAGTGTGTGAGTTTTTGTGGAATGGATCCACGGCTTTTATCAGAG
    CATCTTTCCTTTTTCTTTTTGATTCAAGATGAAAATATTCTTATGATTAT
    TTTTCTCACCACTGCCCAGAGATAACCAGCACATTAACATGGCCTTTTCT
    CCATGAATAGCACTAGGGTGCCCAGTGGACAGACACATAGCTGTCCACAC
    ACCAGCTTGCTGGGGATGCATAGGCAGAGTCACATCTGCACTCACGGCCT
    GTCCTCACACTGCCATGTGGAGAGCCAGCAGCCACACCATGGGCCGTCCA
    TGCTCACGGGAGTGGCAGTATCAGATCTGAGCTTCGTGTGCCCAGGCGTC
    TCTCACATCAGTGCATAGGGACCCTCTTTGTTCTGTGGCCCAGTGTGCCC
    ATGCCACAGATGGCTTCAGTCAGCAGACACCTCCTTCTAGACACTCACAC
    TCACTCCTGGCTGGCCCTTAGCACACCTGTGCAGACAGGCCCATTTATTT
    TCTTGTGTAAATCCCAAGTAGGAGGACTGGGTCTCTCTGACAGCAATGCC
    AGCTGCCTGGCACCCTCCAGACAGGTGGCTCAAGCCCCACCTCGCCAGCT
    CTCCCAGTTAGCCCCTCCTTTCCCTGGCTCTGACCTGAGGGACGAAGCAG
    GGTGCTACAGGACGCTGTGCCACAGGGATATCGTCAGGGACAGAAGCTAC
    TCTGCCCTCTGCTGCTCACCCCTCCAACACGCTGTGGGCTGCATTTGTTG
    AGTGGCTGGTACCAGACTCTGCTCTTCTGACTTTCCAGCTGGTTTTACCT
    GTAGTAAAGTTTGAGAAGATGGGTCATCCTGACCCCGGGGTCAGAAGACA
    GAAGGAGGCCCATGGCGTGTGGGGGAGATGCCCCGTGAGGCCCTCGGTGT
    GCAGATGCCTGGTGACAGCCCCACCCTGAGGTCCCCAGCCTACCCCCTCC
    CCAGCCCGACTGCTCCCATCCCCCTCCCTGTGCAGGTAGAGCAGATCCTG
    GCAGAGTTCCAGCTGCAGGAGGAGGACCTGAAGAAGGTGATGAGACGGAT
    GCAGAAGGAGATGGACCGCGGCCTGAGGCTGGAGACCCATGAAGAGGCCA
    GTGTGAAGATGCTGCCCACCTACGTGCGCTCCACCCCAGAAGGCTCAGGT
    ACCACATGGTAACCGGCTCCTCATCCAGAAGCAGCTGTGGGCTCAGCCCT
    AGCTGGGAGAAGCACCCCAGGCACTCCCAGACTCACAGCCAGCCCGAGAC
    AGAATCTCCTGGGGAGCAATGAAGTCCTCGACTTGGGCCAGTTCTCACCC
    TTGGCTCCTCTGGTCCGGCCCTGGGGCACTCGGGCTCACCCTGGAGCTGG
    CAAACCTCAGGAAAACTGGCGTTTTAAATCTCACTCCTGGCCAGGTGCAG
    TGGCTCACCCCTGTAACTTCAACACTTTGGGAGGCCAAAGCAGGCGGATC
    TCTTGAGGCCAGGAGTTTGAGACCAGCCTGCCCAACATGGTGAAACCCCG
    TCTCTACTAAAAATACAAAAATTATCCAGGCATGGTGGCACATTCCTGTA
    GTTCCAGCTACTCGGGAGGCTGAGGCATAAGAATTGCTTGAACCCGGGAG
    GCCGAGGTTGCAGTGAGCCAAAATCGCGCCACTGCACTCCAGCCTGGGGT
    GACAGGGTGAGACACCATCTCAAAAAAAAAAAAAAAAAAAGACCTCACTG
    CTCCCCATGGGCACTTAGGGAACTCTCCCAGCCCAGTTCTGCAGCTGGGC
    CATTGCACTAGATCCTCAGTTGGTCCCTGGGCTCTCGGTGACTGTCCAGG
    GCAGGAGTTTCCCATTGACTTTTCCCTGGTTGACCTTTGACCCCTTCCAC
    AGTTGACACTGGTGTCCCCAGGTGTCTGGTGGCCCCTTGTCCAGCTCCCT
    TAGTCCCTTGTGCCTTCCCTCCTCCTCTTTGTAATATCCGGGCTCAGTCA
    CCTGGGGCCCACCCAGCCCAAGGCCAGCCTGTGGGTGTCCCTGAGGCTGA
    CACACTTCTCTCTGTGCCTTTAGAAGTCGGGGACTTCCTCTCCCTGGACC
    TGGGTGGCACTAACTTCAGGGTGATGCTGGTGAAGGTGGGAGAAGGTGAG
    GAGGGGCAGTGGAGCGTGAAGACCAAACACCAGATGTACTCCATCCCCGA
    GGACGCCATGACCGGCACTGCTGAGATGGTGAGCAGCGCAGGGGCCGGGG
    CAGGGGGCCAAGGCCATGCAGGATCTCAGGGCCCAGCTAGTCCTGACGGG
    AGGTGCCACCTGTCTACCAGGGGTGGGGAGAGCGGGGGCTGGAGGACCAC
    CCAGCCTCAGAGGCAGCTGGAGGCCTGGGTGAACAGGACTGGCCAACATG
    TCCCCAAGTCCCACAGTCACCATCTGGCCAGCATTGAGAGGGGAACGGGC
    TGAGGAAGAGTTAGTGGCAAGAGGAACCCCAGCCAGTCACACCTTGTCCA
    GTTTACCAGAGGAAAAAGCAATGTGTAAGAACAGAAATGTGACCCGGCAG
    CCAGTGCACTGCCCCCCTCTCCAAAGGCCACCCCTCACCCTCCACCAGCA
    TGCACAGAAAGTGGGGTGACAGCAATCACAATGTCTACCCAGGCAGCAAG
    GACCCCTGACCATGGGGAGGACTGGGGTGCAGGGAACATAGAAGCAGAAT
    GAGGCCTAGGGGGAGTTGGGCAAGGCCAGAGCCCTAGCTGCAGCCAAGCA
    CATGGCCAAGGCCAGCTCCTGGAAGGGCAGGGCTCCGAGGCAGGAGGCAG
    GAGGCTGCCCGTGGCTACCCGTCCTCACACCCCTGCAGCTTGCTAGTCTG
    TCTGTGGGCTGGGTGTGAATCAAGGCAGTGGGATGGTGTGGGGACCTCCC
    TGGCCCCAGCAGCCAGTGAGGAGCCTGGTCAGTCAGCAGAGCATTCAGCA
    GTATCCAGTTCCATGGAGAGGCCCGTGTGAGGGGAGTCGGGGCTGGTCTT
    CAGTAAGGATGGGTGGCCAGGGCCCCTAGAAGTAGAAAAGGAGACTCCGG
    GTGCTGGAGACAGAAATCAAGGATGTGCCTCCATGTGGAGCCTCAGGAAT
    AGCTGGCCAGGCCTGAGGCTGAACCTCACAAGGTTCAGCTGGGAGGGCTA
    GGCTGACAGAGCACAGCCGGGCCAGGGACCAGCCTGCCCTGTGTTGCCTT
    GTCCCGAGGGCCACTGTCAGCAGGTCTCTGGCATGGGGGAGGCTTAGGGC
    CTGAGCCCAACAAGCAGCAGCGGAAGAGGAGAGGGAAACTGTGGACAGGC
    CTGGCATTCAGTGGCCAGGTGTTGCAGTGTCCCTGAGGAATAGCTTGGCT
    TGAGGCCGTGGGGAGGGCTGCCGGCCAGCGCACCCCCCCATGCCAGATGG
    TCACCATGGCGTGCATCTTCCAGCTCTTCGACTACATCTCTGAGTGCATC
    TCCGACTTCCTGGACAAGCATCAGATGAAACACAAGAAGCTGCCCCTGGG
    CTTCACCTTCTCCTTTCCTGTGAGGCACGAAGACATCGATAAGGTGGGCC
    GGGTGGAGGGGCAGAAGGCAGATGAGGGGAGGCACAGGCACCCCAGAGGA
    ACTCTGCCTTCAAATGTAGCCCCCATACCATGTGCTCAGAAGGGAGATCT
    GGATTCAAATTGTGGCCATGTCACCTGCCACCTCTAATGCTGTGGAAAAG
    AAGCATCACATTAGCTAATTCTGGCTGTGCGCCTTGTGAGGCACCAGCTA
    TGATCACCCCACTCCAGTGGAAAGAGCAGCTGGCAGTAGGGTGGGGCTCA
    AACTCAGGCAGCCGGGCTCTGGGTCACCTGCAGGCCACGGTCATGTCACA
    CTGCCTCTAGCTGAGTCAGAAATGTGAAGGAACTGAGATTCTACCCTTCC
    TGCAAGCTAGCAAAGTGGCCTGCCAGTTACATCTGTGCATGCACACACAC
    ACACAGTTATATATGCACACACATAAAACACGAGACCTTTGGGTCAGGGA
    GAAAGCCAGATCCTCACTCACGGCAGAAGCAGCAGCCAAAGCAACATCTC
    ATGTGGTTTTCCAAGCCCCAGTCCCTACAGAGACAGAGAGGGCCAGGTGG
    CACCTGTGCATGCAGCGGGGTACCTTGCAGGAGGGAAATCCTGATTTTAC
    ACAAAGCTGCTCCCCCCACGCCCTGCCTTGACTCTGGGATGACGTCTCAG
    AGCTGTGCAGTACAACATTCTTAAATTGGCTGGGACTCAGCCCTGCAGAA
    ATATGATATCTTCAAGGAGAATCGTTCCCAAAACCTCTCAAAGCTATGGG
    GCTGCTCTGAGCCTGTTTCCTCAGCTGTAAAGTAGGGTGCATACTTTTAT
    GGCCCTGTGCAGGAGGTAGTGACAGGCCCTAGCACCCTGCCTCCAGTATA
    TGTTAGCAGCCACGAGGCCTATCTCTCCCCACAGGGCATCCTTCTCAACT
    GGACCAAGGGCTTCAAGGCCTCAGGAGCAGAAGGGAACAATGTCGTGGGG
    CTTCTGCGAGACGCTATCAAACGGAGAGGGGTGAGGGGGCACCTGTACCT
    GCCGGGGGGGCTGCCCTGGGCCACCCACCCCAGCACTGCCTGCCTTTCTC
    CTTGGCTTCCAGCACTGCAGCTTCTGTGCTTCTTGGCAGGACTTTGAAAT
    GGATGTGGTGGCAATGGTGAATGACACGGTGGCCACGATGATCTCCTGCT
    ACTACGAAGACCATCAGTGCGAGGTCGGCATGATCGTGGGTAAGGGCTCC
    TTGCACCCCTGCCCCTTCCAGACTGCTGAGGCTCCCTGTGTACAACAGGC
    TTCAAGGGCCCTGTGGGGTGAGGACCAAACTACTTAACAACCGGTGATGT
    CAGAGCAGAGCCTGGTGCTACAGCCTGGGTGGTCTTGGGGTATCAAGATG
    GAAGCACCGTGTACAGTAGGAAGCATTTCAACGCCATGATGCCACATTCC
    TGCATCAGATGGTATGCCAGCTGCATATCCACCTCACCCATCAGGATTAT
    AATTAAAACACTTATCTGGTAAATTGACCAACTGGACAGATTGGTCCAAG
    TGGAAGAGGATAAGCAAAAGTGGTACCATCTCCACCCGAATGGTCTTTCC
    ACGGGCCTGCCCCTGCCCCTGCCCCCACCCAAAGTGAAGGCAGGTACCAG
    GAAAGGGAGCAGCAGTCCGCCCCTCCCAGCAGAGGGGTCTTCCACACCAA
    CTCGGACCTTTCTCAGAAGTTCCGGAGGTCATTATAACCAGCCTTCACTG
    AGGAGCAATCCAATCAGATCAGTTATCTGCTGTGCGCACAGCCGTGTGGT
    TCTATACTTCTCTTACTTCCATTTTCACCTTTCAGAAGGAACGTTGTCTT
    TAAATGCAGCATCTAAACGTGAGCCCCAGCCATCCCTGGCTGTGATCCCC
    CCAGCCCTTTCCACCCTATCCTCTGGAACTGCCTGGGGCTCCCCAAGACA
    CTTCCACATGAATTCCCACCAAGCCAAGCTGCAGCTGCTGGGCCCAGGCA
    TAACCCCTCCTGGGGCAGAGGTGGCAAGGAGTGACCCACCACTCACATCT
    GCCCCACATCCACTCTTGACTCTGCTCAGTGTTTAAAAACATGTTTATAA
    CAATTACCAAGATCTGAAAATTAGGAGAATTCACATCAAAGTCTGGATTT
    CTGTTTGTTCATAAAAAACTAGAAGGCAGCCAGGCAAGGTGGCTCACGCC
    AGTAATCCCAACACTTTGGGAGGCTAAGGCAGGCGGGTCACTTGAGGTCA
    GGATTTGAAGACTAGCTGGCCAACAAGGTGTAACCTCGTCTCTACTAAAA
    ATACAAAAATTAGCTGGGTGTGATGGCGCATGCCTGTAATCCCAGGTACT
    CAGGAGACTGAGGCAGGAGAATTGCTTAAACCCTGGAGGCAGAGGTTGCA
    GTGAGCCAAGATCACGCCACTGCACTCCAGCCTGGGTGATGGAGTGAGTG
    AGACTCTGTCTCCAAATAAATAAATAAATAAATAAAAACTGGAAGTCTAA
    GCATCACTGAGCCCTGATTCCTATGTGGCAGCTCGACTGACCAGCATTTG
    AGTTGCTGTCCCTGACAGCTTTGGGGGTGTGCAGCCCACACAGTCATGCT
    AGCTTGAGGCTCTGCTGTCAGCAGTTTGAAACTCTTAATAACTTGTGAAC
    AAAAGACTCCATGTTGTCACTCTGCACAGGGGCCAGCAAATTACAAAATT
    CCATATCCGGAATTGTCTACAGGAGCCTCTGGGCTGCTCCCAAGGGCCCA
    CACCATGCCTTACTCACTTTGGGTTGCCATCCAAACATGTCTCATGACAA
    AGAAGCTCAAACATGTGCATGGACAGTGCCAGAAAACAAGGGTCGTACAT
    AGACAAAATAAAATGATAACGTCCCACAACCATTTCTTTGATACACACTG
    TTTCTCTCAGTCCTCCCAACCACCTAGGTAACAGGCAGGGAAGGTGTTAC
    TGTTGCCTGTTAGGAAAGAGGACAGCCCTGAAAGCTGTCCCTGGCCACTG
    AAGCAACCCAGGTCTTCCAGCCCCAGGGAGAGCCGCCTTTCCATTGTTCC
    AGACAAAGCAGAGACAGGCATGGGGGAGCGGGAGAGGGACTCCTGTGGGC
    AGGAACCAGGCCCTACTCCGGGGCAGTGCAGCTCTCGCTGACAGTCCCCC
    CGACCTCCACCCCAGGCACGGGCTGCAATGCCTGCTACATGGAGGAGATG
    CAGAATGTGGAGCTGGTGGAGGGGGACGAGGGCCGCATGTGCGTCAATAC
    CGAGTGGGGCGCCTTCGGGGACTCCGGCGAGCTGGACGAGTTCCTGCTGG
    AGTATGACCGCCTGGTGGACGAGAGCTCTGCAAACCCCGGTCAGCAGCTG
    TAAGGATGCCCCCCTCCCCCACAACCCAGGCCCTGGGCCGCTCTGGTGCA
    GCGGCAGATGGGAGCCGGGCCATTGCAGATAATGGGCTTGTTTTTAAACA
    ACTCTGGGGAAAAGCAAACTGACAATCCGTTCGTAAGCTCCATCCCTTCT
    GCTCAGTCATGACCTGCCCCTGTGAGAGATGAAGGGTTAGTCCCAGTTGT
    GATGTGATAAGCCCAGACCTCTTTCCTTCCGACAGGTGATCGTGCATGCA
    GAGGAGGCTCTGAGACGCCCCCAGCAAGGTTCCTGGGTTTAACCCAACAT
    TCCCCAAAGTATGTATTTGGCCACATTCACAGAAAGAATATTAGTCTTTT
    GTGGAATGCTGCGGGTTGACAGTCACAGCTTGGAAACCAACCCACAGAGA
    GCTCATCATTAATCATGGCTATCACTTGTTTACCACCTACTGTGCCAGGC
    CTATGCTAATTACTTTATTAGCGTCCTCTCTGCCGCTCGCAGGCCTCTAT
    TATTATAGGTCAGTAGTATTCGATTTATTTAAATTAAATACGGAAGGTCA
    TAGATTAAGCAAGAAAGTGCCAGCAACATGGTGCGTGCCTCTGACTGGGC
    ACTAACCCTCCAAGTCTTAGTTTTCCCAACCATAACTGGCCAATGAACAG
    CAGCTCTGGATGCAGCTAAAGGAAGACTGAAGCTGTAGGTCCCGTGCTCG
    GCGCAGGGCCCCCTGCAAGGAAGGTTTCGGAGGGACTGGATGGGGTCTTT
    GAACTATCTGTCTTTCCCTTTACTGCAGTGGGCCCAGGGGCAGGCCAAAG
    TTGCTCCCGTGATTGACTTGAACGTGCACGTTCCTAATCCCTGACATTTC
    TAAAGCTCTGGCTCATTAACGAGGGAAAGACGTGAACCAGCTGGGGGAGT
    GGGGATCGCAGTGCCCCACGTGGCCGCCTCGTGACCTCAGTGGGGAGCAG
    TGGGGCCGGCTCCCGGCTTCCACCTGCATGAGGGGCCCTCCCTCGTGCCT
    GCTGATGTAATGGACCTGCCCTATGTCCAGGTATGAGAAGCTCATAGGTG
    GCAAGTACATGGGCGAGCTGGTGCGGCTTGTGCTGCTCAGGCTCGTGGAC
    GAAAACCTGCTCTTCCACGGGGAGGCCTCCGAGCAGCTGCGCACACGCGG
    AGCCTTCGAGACGCGCTTCGTGTCGCAGGTGGAGAGGTGTGCGGAGGAGG
    AGGGTGGGTGCAAAGGGCAGGGGCTGGGGACGCCCGGGCACTGCAGACTT
    GGTCTCAGGGCGACGCTGAGTCCCAGGCCCGGGGCGCAGGGATGGGAAAC
    TAGGGCCTGGGGCGGGATTCCGGGCGTGGGCGGGGCCCGGGGCGGGGCAC
    AGGGGGCGGGGGAGTGGGCGGGGCCCGAGGCCGGGCGCTGGAGGCGAGGG
    CGGGGCAGGGACGGGTCCAAGGGCAGGAGGCTGGGACAGGACGGGGATGC
    AAAGGGAGGGGCGGGGCCCGAGACGGGGAGGAGGGGGAGGGCCCAAGGGG
    AGGAGGCGGGGTCCGGACGGGGATGCCAAGAGCAGGGATGGGAGCGAGCC
    TGCGTCCGGGCACTGGTCCCCATCCGTGAGTCCCCTCGGTGCTCCCTGCC
    CGCCGTGGCCATCCTCTCACATCACTCACAACCCCAAGGCGCGGCATGGT
    TGACACCCCCACGTTAGGACGGAGACCCTGGGCTTAGTTAGAGGGGGCAG
    TACTAACCAGTCCCTGGCGGAAACGCTTTGGCTGGGTGAGGTGAGCGGGA
    TCGCCCCCATTTCTCCAGAGAGGGGTCCCGGCTCAGCGAGGGAAAGAGGC
    CGCCGCTGGGGGGACGGCTGGCCGGGGCCCCTCCCTGGAGAACGAGAGGC
    CGCCGCTGGAGGGGGATGGACTGTCGGAGCGACACTCAGCGACCGCCCTA
    CCTCCTCCCGCCCCGCAGCGACACGGGCGACCGCAAGCAGATCTACAACA
    TCCTGAGCACGCTGGGGCTGCGACCCTCGACCACCGACTGCGACATCGTG
    CGCCGCGCCTGCGAGAGCGTGTCTACGCGCGCTGCGCACATGTGCTCGGC
    GGGGCTGGCGGGCGTCATCAACCGCATGCGCGAGAGCCGCAGCGAGGACG
    TAATGCGCATCACTGTGGGCGTGGATGGCTCCGTGTACAAGCTGCACCCC
    AGGTGAGCCCGCCCCGCTCTCTCCCTGGTAAAGTGGGGCCCAAAAAGCGC
    GCGCTCCAAGGTTCCTTGCGGTTCCCAAGCTCCAAGATTTCGTAGTCCTC
    TTCTCGTCCCCCTTGGCCTAGATTTGGGGGAAGGGTCGACTGCGTGCAGG
    GCGCCCGGTAATGAATGTGGAGGATGAGGTGGGAGGAGGGACGGCAGCCC
    TGCTTCTCTTCTGCCCAGCTTCAAGGAGCGGTTCCATGCCAGCGTGCGCA
    GGCTGACGCCCAGCTGCGAGATCACCTTCATCGAGTCGGAGGAGGGCAGT
    GGCCGGGGCGCGGCCCTGGTCTCGGCGGTGGCCTGTAAGAAGGCCTGTAT
    GCTGGGCCAGTGAGAGCAGTGGCCGCAAGCGCAGGGAGGATGCCACAGCC
    CCACAGCACCCAGGCTCCATGGGGAAGTGCTCCCCACACGTGCTCGCAGC
    CTGGCGGGGCAGGAGGCCTGGCCTTGTCAGGACCCAGGCCGCCTGCCATA
    CCGCTGGGGAACAGAGCGGGCCTCTTCCCTCAGTTTTTCGGTGGGACAGC
    CCCAGGGCCCTAACGGGGGTGCGGCAGGAGCAGGAACAGAGACTCTGGAA
    GCCCCCCACCTTTCTCGCTGGAATCAATTTCCCAGAAGGGAGTTGCTCAC
    TCAGGACTTTGATGCATTTCCACACTGTCAGAGCTGTTGGCCTCGCCTGG
    GCCCAGGCTCTGGGAAGGGGTGCCCTCTGGATCCTGCTGTGGCCTCACTT
    CCCTGGGAACTCATCCTGTGTGGGGAGGCAGCTCCAACAGCTTGACCAGA
    CCTAGACCTGGGCCAAAAGGGCAGCCAGGGGCTGCTCATCACCCAGTCCT
    GGCCATTTTCTTGCCTGAGGCTCAAGAGGCCCAGGGAGCAATGGGAGGGG
    GCTCCATGGAGGAGGTGTCCCAAGCTTTGAATACCCCCAGAGACCTTTTC
    TCTCCCATACCATCACTGAGTGGCTTGTGATTCTGGGATGGACCCTCGCA
    GCAGGTGCAAGAGACAGAGCCCCCAAGCCTCTGCCCCAAGGGGCCCACAA
    AGGGGAGAAGGGCCAGCCCTACATCTTCAGCTCCCATAGCGCTGGCTCAG
    GAAGAAACCCCAAGCAGCATTCAGCACACCCCAAGGGACAACCCCATCAT
    ATGACATGCCACCCTCTCCATGCCCAACCTAAGATTGTGTGGGTTTTTTA
    ATTAAAAATGTTAAAAGTTTTAAACATGGCCTGTCCACTGTTCTTTGACT
    TCTGTGCATTAGGACTGTGGGGACAATCTATAAAGAGTCTGCGTCACATG
    CATGAAGACACTTCAGTATCTCGGCAATGCCCTCCAGACAGCTCCTCCAG
    CCATCTGTGCCAAGGGGAGTGTGAGGAGTGACAGACCAGGCTGTAGGAAC
    AGGAATGGGGTGTCATGGGGGATGGCAGAGCAGTGGACAGTACACTGCCT
    GGCCCGGGCCCCTGCTTGCCTGCCCATGGAATGTGTGCAGAGGGAGTGCC
    AGGCCAGGTGCTGCTCTGGAGAAGTGGGGGAATGAGGCTGGTCCTGCTGC
    AGGTCAGTCTCAGCACCGTCCTGTCCAGTCAGAGTCACTTAGGTTTGCCA
    GTGAGTAGGGGCCCAGATACATGTTGGATTTCTAAGGTCCCTCCAGATGC
    TCCTGTCAGTGGAACGCCTATTTAGAGTTAGCCAAGCGTAGGCATAATGC
    CATCTTTCTGCAGCATAAAATACAGTGACATAGAAACATATTTGTGTGAT
    TTTCATGCATTCCTTTTTTGATGAGAGATATTACCCAGCTAATTAGGAAC
    AACTGTTTTGTTTCCTTCAGATCATAACCCAAAGTTGTGATTTTGAAAAG
    TCATGTCCCCCTTCAGATTTCTTGTTTTCTGCTACTTCTCATGTGGAATT
    GCTTTGGCTCTTCTTAGTTCTCTTGAGTCTAAATTATTCCTTATAAGTTG
    GTGCAAGCATCTGATTATTTTGTTATCATTACTGTTATGCTCAAGCATTC
    ACAGAGTGGAACACATTTTAATATCAATTGCTTTCTATTTCTCCTTTATA
    TTACAGTTCAGGACATTGTATTAATTATTAAAATTCTATTCGTAGGTAGG
    TTATATGACTGAATTGAAATAGATAAAATGAATTTCTTTTCTAGATAACA
    AAGGAGGTGTCATAAAACACTTGTTATGGGCCAGTGTGATGGCTCATGCC
    TATAATCTCAGTGCTTTGAGAGGCTGAGGTGGAGGATTGCTTGAGGCCAG
    GAATTTGAGACCAGCCTGGGGCAACATAGCAAGACCCCATCTCTTAAAAA
    AAAAAGGGTGGGGCGGGGGGGCACTGCTGGGCGCGGTGGCTCATGCCTGT
    AATCCCAGCACTTTGGGAAGCCAAAGCAGGTGGATCAAAAGGTCAGGAGT
    TCGAGATCAGCCTGGCCAACATGGTGAAACCCCAACTCTACTAAAAATAC
    AAAAATTAGCCGGGCATGATGGCGGGTGCTTATAATCCCAGCTACTCAGG
    AGGCTGAGGCAGAAGAATTGCTTGAACCCAGGAGGCGGAGGTTGCAGTGA
    GCAGAGATTGCACCACTGCACTCCAGCCTGGGCAACAGAGCGAAACTCTG
    TCTCAAAAATGAATTAATTAATTAAAAAAAGAAAAAAAAAACACTGGGCA
    GGGTGGTGTGCACCTGTAGTCCCAACTACTCCAGAGGCTGAGGCAGGAAG
    GAGCACTTGAGCCCAGGAGGTTGTCTGCAGTGAGCTCTACTCATGCCACT
    GCACTCCAGCCTGGGTGACAGAGCTCAGTGGCTTACACCTGTAATCCTAG
    CACTTTGGGAGGCTGAAGCAGGCAGATCACCTAAGATCAGGAGTTCGAGA
    CCGGCTGGCCAACATGATAAAACCCCGTCTTTACTAAAAATAAAATAAAA
    TAAAAAATATATATAAAAATTAGCTGGGTGTGGTGGCACATGCCTATAAT
    CCCAGCTGCTTGGGAGGCTGAGGAACAAGAATGGCTTGAACCCGGGAGGC
    AGAGGTGGCAGTGAGCTGAGATCGCGCCACTGCACTCCAGCCTGTGCGAG
    AGTGAGACTCTGTCTCAAAAAAAAAAAAGGGAATTTAAGAAATTTAAAAG
    AAAACTCTTGTTATATAAAAAGGGTATTGGGTCTGACAGATAAGAGCTCC
    TGCACTCTACCAGCCAGCTACTGACAGACATAGGTCTGGCTCCAGTGGAG
    GGGCAGCAGCCAGTGAGCCCAGCCTGGGGTGGCCCACTCCTGCTGCCTCC
    AGGATGTCCCCTGTTTCCCCAGCCCCTCTGCTGTGCCCTCGGCCCCAGAA
    GCTGGCGAGACTGCTTCTCTGGAACAGCATCACGCAGGCCTGCCCATCGG
    CCCACTGTGCACCAGGCCTTCTGGGGATACAGATGTCAACCAGGTGGGGT
    GCTCAGGAGGGGCACAGAAGCCAGGAATGACAAACACATCAGCCACCAGG
    CAAATGGGAAATGTGCCCCAGAAGCTCCCTGCTGAGGATGTTAGGGAGAG
    CATTCTGAAGTAGTGTGGTTGAGATGAGGCTTGAGGAAGGCAAGGCTCCA
    AACAGCAGGGCAGACTGGGAGCAAGGTAGACTGCATGGGAGGGCAGCTGA
    TGGAGCTCCTTAACCCTCTGGAATTGCCCCAAAGCCAAGCAAAGTGTTCT
    TCTTGGGGTCACAGCTAGCTCAGGGATGCCTTCTGCCCCTTGGTCAGAGG
    GGCAAAAGGTCAGAGCCTAGGGTCACCAAAACCTCTGGGAAGCCCCGGGG
    GTCTCAGGCCACAGACCATCCTCAGAACTACACACTGCCCTCCCATGCCT
    GGCGGGGGCCCTGGACTGGCCCTCACCAGCTGTCTTCTTGCACTGGCCAG
    GGTTCTGGCTGGACTGGCAAGGAGGGGTGGTCAGATACAGGAGTAACTGG
    ATCCCTTCATCAGGACCTAGGGTGGTGAGAGCTTTGAGCCTGCTCTGCTC
    CAGGCAGACATTGTGTCTGGCCCTGCCAGGATGGATAGACAGCAGGATGT
    TACACGTTGAGGACATGAAGGTCATCAGGAATGTGGCTGGAATCTGTTAG
    GCCTCCCCCAGCCCAGGCGGGGGCTGCCAAGTTTGGGCCTATCCTCTGTT
    CCTCTCCTTATTTGGACCTTCAGGTGATAAGGCTGAGACATAAAGGAGGC
    TGGGCCCTGCCACCACGACAGCAGCCACACCTCTGCAGAGAGAATGGTGA
    GTGCCTGCTGGGGAAGAAAGGCTAGCGGTCTCCCAGGTGCTGGCCTTTGG
    GCTGGGGGAGCAGAGTTTTCTGTGCTTGTGTTGGGTTGAGGGTGGTCCCC
    AGGGAGAGGAAGAGGATCCTGGCCCTGGCTCTCCTGGGAATGCTCTGGGA
    CTGTGCATGATGGGTGGGGTGGGGAGACTCTGAGGAGTTGGGGAGAGGAC
    CCCTCCCTACTCACAGTGTTGCAGGCCAGCAGGAAGGCGGGGACCCGGGG
    CAAGGTGGCAGCCACCAAGCAGGCCCAACGTGGTTCTTCCAACGTCTTTT
    CCATGTTTGAACAAGCCCAGATACAGGAGTTCAAAGAAGTGAGTGCCCAC
    TCCCAGTAGCCTCAGATCCCATCCTGGCCCCCCCACCCCACCCCACATAC
    ATACCCCCCTTCTACCCTGACCTTGCCTCTCACACCACCCAGGTCTCTCC
    CCCACCTCCCACCTTCCCTAGAGCTGGGGGCTGCTCCCACCTGAAGGCCC
    CCATCCCACAGGCCTTCAGCTGTATCGACCAGAATCGTGATGGCATCATC
    TGCAAGGCAGACCTGAGGGAGACCTACTCCCAGCTGGGTGCGTGCACCCA
    CCTCCCACCCTGCGCACTGGGGTCCCTACTCTGAGCTGCTGGGCGGGTGG
    GAGTGGCTGGGGGGACAGGACTCTGCTCCCCTGCTTCCCCTCCTCCCCGT
    CTCCTCACACTGCCCTTCCCCCCTTGTCACGCCTTGCTTCCACTTCACCT
    TCCCGACCCACAGCTGCCTCTGCCCCTCCAGCCCCTGTGGCCAGGATGGA
    GGGAGGGCGGCCTGGGCCTTCTGGGGGACACCCAGGGTCCCTGTGTGCAC
    CTCATGCCCCACCCCCACCAGGGAAGGTGAGTGTCCCAGAGGAGGAGCTG
    GACGCCATGCTGCAAGAGGGCAAGGGCCCCATCAACTTCACCGTCTTCCT
    CACGCTCTTTGGGGAGAAGCTCAATGGTGAGCCTGGGACAGAGCTGGGCA
    CCCTTGGCCAGGCAGGGAGCCTGCACCCTGCCTGAACCCCACCTGAACCC
    TGCCTGAACCCCACCTGAACCTTACATGAACCCCACCTGAACCCTAACTG
    AACCCCACCTGGACCCACCTGGACTCTTCCTGGCCATGACCCATTCCAAG
    CACATCCTCTGCCCCAGAATCCCATGTGCACTGGTCACCCCAGTGCTGAC
    TTGGAGCCAGGAAATGTGCCTTCAGCCCCCACCCCCAAATTCCAGTCTCC
    CAGCCAAGCTGCCCGCCTCAGGAGGATGACCATTCCCAGCCCCACTGATC
    CCCGAGAAACATTTTATGTTAGGGAATACCCCCACCTCTTCTGGGATGTG
    GGAGGCTCCTCATGCAGCCCAGTTCCTCCTGCGGGGGACCTGGGATGCTG
    GAGACATGGATGCTCACCTGGCTGCCTCGGCCTTCCAGGGACAGACCCCG
    AGGAAGCCATCCTGAGTGCCTTCCGCATGTTTGACCCCAGCGGCAAAGGG
    GTGGTGAACAAGGATGAGTAAGTATGGGCCCAGCCAGATGAGGAGCACCG
    TGGTGGAAGCAGAGAGCGGGGTGAGGCCCCTAGTGAGGGGGGCTGCCTGT
    GCTTCGGGGCCTTACACTGCTCTTTGGGGTGCAGCCAACCCTTCCCTGCG
    CCATGGGAGCCTCCGTACCCACCTTCCCTGTGCAGTCACTCCCCCGCAGT
    CTCCTGCTCAGACCCTCCTCACCCCCCAGGTTCAAGCAGCTTCTCCTGAC
    CCAGGCAGACAAGTTCTCTCCAGCTGAGGTGAGGCTGCCCAGCCCCTTCA
    ATACTCATCCCCAGCACCTTCTCTGGGCCTTCACCCATGACCCAGAGCCC
    AGTACCAGTGAGGCAGTTGCTGGAAGGGTGAGCCGAGGGCCCTTCTGGAG
    GAGGTGCCATCTCTGTTGAGACCTAGAGGGTAAAGATGTGGAGTCAGAAA
    AGAGGGCAGGGTGCGCCAGGCAGGGAGACTGTGCACAGACCTGGGGGGAA
    GTGGATAGGGAGAGGTTTCGTACACTCGGGGTGGGCCTGTGCCTGTGGCT
    GGAGGGGCGTCCTTTGCCTCTTGGCCCACATTTGCACTGACTCCTCACTC
    TGCCCAGAGTCAGCCAAGAGAAAAACATTAACCCAGAGTCTGGGGTCTAG
    GGTTGAAAAGCTAAGGCAAAAAGCACAGATGCAGGGGGCAGACAGAAAGG
    CCACAGGACTCAGGTGAGGTCTCTGCCGGGCTGGGCCAGGAGCCAGGGGA
    CTGCCACTCACCAGTGTCCCCTGCAGGTGGAGCAGATGTTCGCCCTGACA
    CCCATGGACCTGGCGGGGAACATCGACTACAAGTCACTGTGCTACATCAT
    CACCCATGGAGACGAGAAAGAGGAATGAGGGGCAGGGCCAGGCCCACGGG
    GGGGCACCTCAATAAACTCTGTTGCAAAATTGGAATTGCTGTGGTGTCTT
    GTCTGTGACAGATGGGTTGGGGACCAGCCAAGGGGGATCCCAGGGTCTCA
    GTGCGCACATCACCATGATCATGGCCACCATCTACCTCCTGGGAGCTGGC
    CCCTCGCCAGCTCACCTTGATTCACTCCCATGATGCCAAGTGAAGTGTGA
    ACTATGATCATGCCTAGTTTACAGATGAGGACACTGAGGCCCAGAAAGTG
    TGAGCATCTTACCAAGGCCAGCCCTCTAGAAGAGGAGATGGTGGGATTTA
    CACCACCTCCACCAAGCCCAGGAATGAGCCACAAAGTGGGCACTGCCCAG
    CTACTTGGGGCTGTGCAGAGAAGAGGCTGCTTGCTGGGCACTCAGCAAAC
    TCTGCCCAACAGCCCAGCGGGTGGGCAGCAGCCCTGGGACCCCCACACCC
    AACCACACAGCCTCCCCTGGCCCACTGCTCGCACCCCATCTCAATACACT
    GGCTTGGGTGCCTCCCTGCATGGGCCCTTTGTGAAAGGCAGAGAGGTACC
    CATTTGAAACACAACCAGCTTCTCATTGCAAATACAGGCAAGGCACTAAG
    ACATGAGGAACATGGACACCAAAGCAGGGGCCAGGTAACATGCAAATTTC
    TAGAGGAAATGCCCAGAACCTGGCATCATGCCTCCTGAGCCCCTCATGCG
    CCGTGAGGGGTAAGAGGGTCAGACAGCTGGAGTGTAGGGAGACGACTTCT
    CAGGAGAGAATAGTTAGTGCTCCCGTCACCCTTCATCTGAGAACCCAAGA
    GCTAGAGGAGAAAGTGATCCTCATGAGTACCAGAGGAGCAGCAGGGGACA
    TCCAAAGCACCAGAGAGAGAAACAGAGACAGAGAGACAGGCAGTGACAGC
    TCAAACCTCAGCCAGATCCAGAGCATACAAAGTCTCCTGCCTACAGGACA
    GCCCAGTAAGAGCTCTCAGCTTGCCTCCTTCCCTCCCCACAAGCCCTGCT
    GCAATCCCTGTACCTGGGGGTCAGTGGGAAGGAGGTGAGCGAGAAAGGAG
    GGGCACCCCTTCCTGAAGGCCCCAAGAGGAAAGGCGTTTTCACCCAGACA
    GGTGTTCAGTTTTGATTTTATCTGGCGCCTGGCAATTTAATTACTAAATT
    GAAACTTGAGACTTTCTGGAATTATGGCATTTTCTGTTGCTTAGAGAGAT
    TACAAAAGTCACGAACTGCCTGAGTTTCCATCCTGAAAGCAGGCCACCAG
    CCCACTCCACTGACCATGCTGGAACAGTGGATGAACAAAATCAAGTACCA
    TTAGGATTCTACCACATGAGTCTGCTTGTTCAACAAGCTGATTTCATAAA
    GTAAGGGATCATGTTATAATCCAAGCTCTACAGGGGTAAATTGTGAAAGA
    CTAAAATGAACCAAAAAGATCATAGGTGTCCAGTTATCTGATTTGATGGG
    GTGTCTGAACCTTTTGTTATCTTTGAGCTGTTTCAAAACTCTCTAAATTA
    TTATTATTATTTTTGAGACAGAGTCTCTCTCTGTCACCCAGGCTGGAGTG
    CAGTGGCATGATCTCAGCTCACTGCAACCTCCACCTCCCAGGTTCAAGTG
    ATTCTCATGCCTCACCCTCCCAAGTAGCTAGTATTACAGATGGGCACACC
    TTGCCTGGCTAATTTTTGTATTTTTAATAGAGACGTGGTTTCACCATGTT
    AGCCAGGCTGGTCTCGAACTCCTGACCTCCGTTGATCCACCTGCCTCTGC
    CTCCCAAAGTGCTGGGATTACAGGGGTGAGCCACCGTGCCCTGCCACAAC
    TCTAAATTATAACTAATAGCAAGGCAATGGTTCTTCTCTATTAACGTGCA
    AATAAATGTTGTCCAGTGGAAGCACAACTGATTTTTCCCTTCTCTGTGGA
    AGAAGCCAATTTTGCATCTATTAAGCAAATTCATCTGGGCATTCCTAACC
    GTCTACACATGCACCGGCTCTTTGAATTCTTCTCTGAACCAGGCCCAGGA
    ATAAGCCACAAGATGAGCACTGCCCAGCTCCTTGGGCTGTCACATCTTAT
    TGATTCCCACATGAATTCACAAGTAAATAAAATATTTGGCGGTTGTTCAC
    TTAGTATGCAAGTCAATATTTTGCTTTAAAAATATTATCCTTTCACACTC
    CTGATATAGTTGTCTGATAAGGTTAGTCCTTCCCACACCAAAACTGCCTG
    TATTAGTGTTGTTTGGAATAAACTGAGGGTAGAATGTATATGGTGTGTGT
    ATGTGGTGTGTGTGTTTGTGTGTGTGTGTGTGTGAGAGAGAGAGAGAGAC
    AAAAGAGAGAGACAGAAGGATAGAGAGAAACAGATGGGCACAGACCCAGG
    ACATGAGTTCAGCCTACACTGACCAATATGACAGCCACTGGCCACTTGAA
    ATGTGGTGTGAGTTGGGATATGCCAAAAGTGTAAAATGCACACAATATTT
    TGAAGATTTCATACAAAAAAGAATGCAAACATCTCATTAATAACTTTTAT
    ATAGATCACATGTTGAAATGATAATGTTTTGGATATTAGATTATTACTAA
    AATTAATTTCACCTATTTCTTTTCACTTTTTAAATGTGGCTACTAGAATA
    TTTAGAATTCCATAAGTGGCTTGCATTTCTGGCTTTCACTCCTGTTGGAA
    AGCACTGAGTTAGACTGTGTAGTACGTCTATTTAAGACTGCAGTTTCCAG
    GCCGAACACCGTGGCTCACGCCTATAATCCCAGCACTTTGGGAGGCCGAG
    GCGGGCAGATCACCTGAGGTCAGGAGTTTGAGATAAGCCTGGCTAACGTG
    GTGAAACCCTGTCTCTACTAAAAATACAGAAATTAGCCAGGTGTGGTAGT
    GCATGCCTGTAGTCCCAGCTACTAGGGAGGCTGAGGCAGGAGAATCTCTT
    GAACCCAGAAGGGGAGGTTGCAGTGAGCCAAGATCAAGCCACTGCACTCC
    AGCCTAGATGACAGAGCAAGACTCCATCTCAAAAAAAAAAAAGTAGAATA
    AAAATAAATAAATAAATAAAGACTGCAGTTTCTGGGAGACTCTGAGGCAG
    GCATTAGCCTTCTCTGCAGAGAGTACTTGCAGCAGGGAGCAGCAGTTTTG
    ATGTCCTCAAAAGGAGCCAATTTCATTTGGGTAGGGTTGCCTCTGAGTAT
    TCTAGCAGTACAGACAGAAAGGAGAGAAGGCTGTTTCCAGAAAGCAGAGA
    TCATACGAATTACTTGTGAGACCAAACTTGTTCCTCAGGTGAAGCTCAGG
    CATCCCTTATGTGGAGTGTCTAACAGTCTACACCTGAGGATGTTGGACAT
    AAGGGGGTGTGAGGTGGGCATGGCTGGGGAGAGCTCTGGGAGGGGGAAAA
    CCAGCTCCATGTTGTCCACCCACTGAAAGGAAAGCTCCCTCTGGGGGAGG
    TAGATGCCCCCTGGCCAGGCCTGCAGGGCCCTGCTCACTGTGAGCCCTGT
    GTGGTCCTGGCCTGGGTCCCACCAGCCATTGCCAGGCAACAGCTCCCAGT
    TGGAAAACAGAGCAAGGCTCCCTCTTAGAAAAAAAAAAAAGAAAGAAAGA
    AAAGAAAAGAAATACAACAGGTAACTAAGCATGACGGCTCACGCCTGAAA
    TCCCAGCTACTTGGGAGGCCAAGGCAGAGGATTGCTTGAGACTGGGAGGT
    TGAGGCAGCAGTGAGCCAGGATTCTGCAATTGCACTCCAGCCTGGGTGAC
    AAAGTGAGACCCTAGTAAAAAAAAAAAAAATAGAGACAGAGAAAGAAAGA
    CATGCAACAGGGCCAGGCGCAGTGACTCATACCTGTGATCCCAACACTTT
    GGGAGGCAGAGAAGGGAGGATTGCTTAAGACCAGGAGTGCAAGACCAACC
    TGGGCAACATGGCAAAAACCCATCTCTTCAAAAAATAAAAAAATTAGCCT
    GTTGTGGTGGTGCGCACCTATAGTCCCAGATATTCAGGGAGCTTGAACCA
    GGTCCAGGCTGCAGTAAGCCATGATCGTGCCACTGCACTCCAGCCTGGGT
    GACAGAGCGAGACCTTGTGAGAAAGAAAAGAAAGAAGGGAAGGAAGGAAG
    GAGGGAAGGAGGGAAGGAGGGAGGAAGGGAGGAAGGAAGAATATAGGACC
    CAAAGGCCTAAATGCCCCTACTGTGCCCCAGTTCTGCGTGACTCAGGACC
    AGCCTCCTCCACACTCCCACCACCACAACCCTGCACCCTACTTGTTCCTG
    GGGGCCCCAAGGGGAGCCTCACCAGAAGCCTCCTCATAAACCCACTGCCC
    CTTACCTTTCCTGTCTTTCTAGAAGCCTCAGAAGCCTTGCCACTCTAAGG
    ACACCTCCATCTGAGCCAAGGCGCTCGCTCCAGATGTCCCAGAGCTCCTG
    GTCCTGGGTGTCCCTGCCACACAACCCCCCATGGAGCCCTGCTCTGGCTC
    AAGCCCCCTGACTGTGCATGAGCAGGCCTGTTGCCCTCACTGGGACTGTC
    CAGAGCCTTCCCATCTCTCTGGAGGGACTTCCATCAGTTTCTGCCCCTTC
    TCCTCTGCCAAGAACTCACGTTCAGTCTGATAGCAGAAGAATCATCTGGC
    ACCCTCCTGAATGGAACCCAGAGTACCTCCTTTGTGGACCGGTCTCTGGA
    TTTTCCCCACTCTCTCCCTTCAGCCATGCTGATGGCAGAGAAGGTAAGAA
    CTTCCAGCCCACTTCTCTGGCGAGGGGAACTTGTCATCTGGGTCTGCAGA
    GAAGGTTCCACCTTATGCTCATAGTACATTATCTTTACTATGTACTAGGA
    TATCACATTTAAAAGGACAAAAAAGGCCAGGCAGTGGCTCATGCTTGTAA
    TCCTAGCACTTTGGGAGGCTGAGGCAGGTGGATTACCTGAGGCCAGGAGT
    TCAAGACCAGCCTGACCAACATGGCGAAACCCCATCTCTATTAAAAATAC
    AAAAATTAGCTGGGTGTCGTGGCATGTGCCTACAATCCCAACTACTTGGG
    AGGCTGAAGCAAGAGAATCACTTGAACCCAGGAGGCAGAGGATGCAGTGA
    GCTGAGATCGTGCCACTGCACACCAGCCTGGGCGACAAACCGAGACTCCA
    TCTCAAAAAATAATAATAATAAAATACAACAAAATAAAAGAACAAAAAAA
    AAGAAATGTAAAATACTTGAAGGGGCTTGTATAACATTAATAGGATTGAC
    AGTATCTGCTTTCCAGGCTGAAGTGATTCATTCATTATTCTAGACGTCTT
    TAGTCCTTTGCAATTTGTGGTAATTAGGCTTTTCTTTTTAACATTAAAAA
    TATACAAAAATAAAAGGCAAAAAAAGCATCATCCCATTAGTCTGACCTTC
    CCCTCCTCCATCCCTGCCCCAACACCCTGAAGACCCTGGATGCAAACAAA
    GGCCCGAGGGAGCCTCTTCCCTCGCAGTGCAGGCCTCACCTGGGGCTCAG
    AGTCAGAATCTGCATTTTATTCCCTAGGACAACCTCTAGTCAGGGCAGAG
    GCCGGCTGTGCTGCCCAAGTGCCCTAACCCTAGCTTTGAGGCACCAGAAG
    GGCAAATGGAAATTAAAAATGAGAATAAGTTTATTCTCCTTGGTGAAAAA
    AAAAAAAAAAGACTTTCCCCTCTCCTTTTTCTTTAGAAAATCTATCATTG
    CAAGTTCCTTCCTGGACTTTTTTTATGTAGATCTGTTCAAAAGCTAAATA
    AGCCTCTTTCAAGTTTCACATCCCAGGAATGTCTCCTTAAGGACCTAGGA
    GCCACCATTTGAAGTGTAATCACCAAGGGAGATACATCCTTATCTCCCAG
    TTTCCGTGGGCAAAGGGGAGCCTAACTTTAGCCCGGTGCCTAGCTCAAGT
    TGCAAACACACTTCCAGTCTTAAAGGAATGAATTTATTTTTTTTCCTTTA
    GGCAAACCCAGGTAGCCACCACAGTTACCTGGGGATTCACAGAGAACTGT
    GTGTGACCACTGGTGCTGTCAAGTCCTCTTACCTGAGCACCTGTGACGTT
    TCCCTTGAGAACGTGTACGGGATGGGTTGCACCTGGTTATATACAAGCGT
    GAGACTTCTTTCTGCCTTTGTAATTTATTAGCAGATTATCTGTGATGAGC
    ATCGCAATCTGTTTAATGCCTATTCAATAATTAAATTTTTCTTTCTCTTC
    TTTTGTGGAAAGGTTTTCTGCATTGGCAGGAGATTTTTGTTTTCGATTAT
    GTCCCCAACATGCCTGATGTTCCACCCCTCAAGAGCCTCAGCCTTGCCCA
    GGGAGGGCATGGGGGTGAGTGGCCTCTCCCACAGAGAGTGCTGGCCAAGT
    TGGCCCAGGTGCGCAGCAAGGGCTGCTGCCCAAAGGCTCCCTCCTGGTTG
  • The human liver glucokinase genomic DNA is 46,000 base pairs in length and contains ten exons (see Table 2 below for the locations of the exons). [0045]
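As an illustration of how exon locations such as those referenced in Table 2 might be applied to a genomic sequence, the following minimal sketch joins exon segments into a single spliced sequence. It assumes 1-based, inclusive coordinates on the strand as listed. The function name `splice_exons`, the toy sequence, and the toy coordinates are hypothetical and are not taken from the sequence listing or from Table 2.

```python
def splice_exons(genomic, exons):
    """Join exon segments of a genomic sequence into one spliced sequence.

    genomic: nucleotide string for the strand as listed.
    exons:   iterable of (start, end) pairs, 1-based and inclusive
             (an assumed convention for this sketch).
    """
    pieces = []
    prev_end = 0
    for start, end in sorted(exons):
        # Reject coordinates that are out of range, reversed, or overlapping.
        if start <= prev_end or end < start or end > len(genomic):
            raise ValueError(f"invalid exon coordinates: ({start}, {end})")
        # Convert the 1-based inclusive range to a Python slice.
        pieces.append(genomic[start - 1:end])
        prev_end = end
    return "".join(pieces)


# Toy data for illustration only; not derived from any SEQ ID NO.
toy_genomic = "AAATGGCCGTAAGTTTCAGGATTGACC"
toy_exons = [(3, 8), (20, 25)]
spliced = splice_exons(toy_genomic, toy_exons)
print(spliced)
```

For a gene with ten exons, the same call would take ten coordinate pairs. A sequence listed on the opposite strand would first need to be reverse-complemented, which this sketch does not do.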
  • The human adipocyte enhancer binding protein has the amino acid sequence depicted in SEQ ID NO:3: [0046]
    MAAVRGAPLLSCLLALLALCPGGRPQTVLTDDEIEEFLEGFLSELEPEPREDDVEAPPPPEPTPRVR
    KAQAGGKPGKRPGTAAEVPPEKTKDKGKKGKKDKGPKVPKESLEGSPRPPKKGKEKPPKATKKP
    KEKPPKATKKPKEEPPKATKKPKEKPPKATKKPPSGKRPPILAPSETLEWPLPPPPSPGPEELPQEGG
    APLSNNWQNPGEETHVEAQEHQPEPEEETEQPTLDYNDQIEREDYEDFEYIRRQKQPRPPPSRRRR
    PERVWPEPPEEKAPAPAPEERIEPPVKPLLPPLPPDYGDGYVIPNYDDMDYYFGPPPQKPDAERQT
    DEEKEELKKPKKEDSSPKEETDKWAVEKGKDHKEPRKGEELEEEWTPTEKVKCPPIGMESHRIED
    NQIRASSMLRHGLGAQRGRLNMQTGATEDDYYDGAWCAEDDARTQWIEVDTRRTTRFTGVITQ
    GRDSSIHDDFVTTFFVGFSNDSQTWVMYTNGYEEMTFHGNVDKDTPVLSELPEPVVARFIRIYPLT
    WNGSLCMRLEVLGCSVAPVYSYYAQNEVVATDDLDFRHHSYKDMRQLMKVVNEECPTITRTYS
    LGKSSRGLKIYAMEISDNPGEHELGEPEFRYTAGIHGNEVLGRELLLLLMQYLCRERYDGNPRVRS
    LVQDTRIHLVPSLNPDGYEVAAQMGSEFGNWALGLWTEEGFDIFEDFPDLNSVLWGAEERKWVP
    YRVPNNNLPIPERYLSPDATVSTEVRAIIAWMEKNPFVLGANLNGGERLVSYPYDMARTPTQEQLL
    AAAMAAARGEDEDEVSEAQETPDHAIFRWLAISFASAHLTLTEPYRGGCQAQDYTGGMGIVNGA
    KWNPRTGTINDFSYLHTNCLELSFYLGCDKFPHESELPREWENNKEALLTFMEQVHRGIKGVVTD
    EQGIPIANATISVSGINHGVKTASGGDYWRILNPGEYRVTAHAEGYTPSAKTCNVDYDIGATQCNF
    ILARSNWKRIREIMAMNGNRPIPHIDPSRPMTPQQRRLQQRRLQHRLRLRAQMRLRRLNATTTLGP
    HTVPPTLPPAPATTLSTTIEPWGLIPPTTAGWEESETETYTEVVTEFGTEVEPEFGTKVEPEFETQLEP
    EFETQLEPEFEEEEEEEKEEEIATGQAFPFTTVETYTVNFGDF
  • and is encoded by the genomic DNA sequence shown in SEQ ID NO:7: [0047]
    CAGCAGGGCCAAGGTCTTGTGACAATGTCTGGAGGTGCCCCTATTGTCACACTGGGGGTCTCC
    TACTGGCCTGCAATGGGAGGAGGGGCTGCAGCCCCACATCCTGTGCAGAGTGCTAGTGCTGA
    GGCGGAACCCTCCTCAGAGCTGCCCCTTCTCCTCCAGGTTGTTACCCCTTCTACAAAACTGACC
    CGTTCATCTTCCCAGAGTGCCCGCATGTCTACTTTTGTGGCAACACCCCCAGCTTTGGCTC
    CAAAATCATCCGAGGTAATTTTTGTCTTCTGGGGGCCCAGGCTGATTTGCTGATTTGCTCTCAC
    CTGGGGACAAGGTTCACAGAGAAGAAAACCTGCATTGTGGAGTCCCCCTGGCCCTTGTGGGA
    TGGACAGCTGAGGTCTTCTGCACAGCTGCCATTTCACTGTGGGAGCCAAGCTGCCTCGCCAGC
    TGGGCAGGGACTGGAACGGCTCCCAGCCTGTGTGCCTCTCAAGGCTAATCTCTGGTCTCCT
    ATTGTCACTGCCCCACTGTGTGCCAATGGGGACTCCTGTTTATTTCTGGCAGCTTCTCTTTGAG
    GCAGGACTTACTTGGAACCTACAGTGGGTCCTATGTGACTTCTTTGCAGGTCCTGAGGACCAG
    ACAGTGCTGTTGGTGACTGTCCCTGACTTCAGTGCCACGCAGACCGCCTGCCTTGTGAACCTG
    CGCAGCCTGGCCTGCCAGCCCATCAGCTTCTCGGGCTTCGGGGCAGAGGACGATGACCTG
    GGAGGCCTGGGGCTGGGCCCCTGACTCAAAAAAGTGGTTTTGACCAGAGAGGCCCAGATGGA
    GGCTGTTCATTCCCTGCAGTGTCGGCATTGTAAATAAAGCCTGAGCACTTGCTGATGCGAGCC
    TTGAGCCCTGGGCACTCTGGCTATGGGACTCCTGCAGGGGTGCCCACAGTGACCATAGCCCAT
    GCACCCACCAGCCGGTCTCCCTCCTCCCCATCCCTGACACCTCAGAATGTGAGCAGTCCGTGC
    CATGAGCTTGTTTTATTGGAGTGACCTTGGCTCCCTCCCTCTGCCCCTACTCCAACACTGCAGC
    AACCCCATCTCTTACGAGACTGGCAGGTGGAGCAGGAGCCTCTACACAGCCTCTGGCTCTTAG
    GTCCCAGTCATGTTTGCACCCCCTCAAAGGGGCAGGACCAGCCCTTCCTTTCAGTGTCCATAC
    CAGGGGCCTTCCATGTGCTGATGGGTGATGTGACTGTGGTCAGCAGGCTTGGGAAGTGC
    TGCTGCTGTAGCTTGAGTTGGGCTGGGGTCTTGGTAGGACGCTGATCTCAGAAGTCCCCAAAG
    TTCACTGTGTAGGTCTCTACTGTTGTGAAGGGGAATGCCTGGCCAGTGGCTATCTCCTCCTCTT
    TCTCCTCCTCCTCCTCTTCCTCAAACTCGGGTTCCAGCTGGGTCTCGAACTCAGGCTCCAACTG
    GGTCTCAAACTCGGGCTCCACCTTGGTCCCAAACTCGGGCTCCACCTCGGTCCCAAACT
    CTGTCACCACCTCTGTGTAGGTCTCAGTCTCCGACTCCTCCCAGCCAGCGGTGGTTGGCGGTAT
    GAGGCCCCAGGGCTCTATGGTAGTGCTCAGGGTGGTGGCAGGGGCAGGGGGCAGCGTGGGAG
    GCACAGTGTGGGGGCCTAGGGTGGTGGTGGCGTTGAGGCGCCGCAGCCGCATCTGTGCCCGA
    AGCCGCAGGCGGTGTTGTAGGCGTCGCTGCTGCAGGCGTCGCTGTTGGGGGGTCATAGGGCG
    CGATGGGTCTATGTGTGGGATAGGCCGGTTCCCGTTCATGGCCATGATCTCCCGGATGCGCTT
    CCAGTTGGAGCGAGCCAGGATGAAGTTGCACTGAGTGGCCCCGATGTCATAGTCAACATTGC
    AGGTCTTGGCGCTCGGGGTGTAGCCCTCCGCGTGGGCTGTCACGCGGTACTCACCCGGGTTCA
    AGATTCGCCAGTAATCACCACCACTGGCTGCGGAGGGAGAACGATCCGGCTGCCCCAGAGCG
    CCCCTCCCAGGCCCCCACCCTCCCACTCAGTCCTGCCCCCAGCCCCGCCCTCCCCCTCTGAGTT
    CCCGCCCCCAGCACCGCCCTCCCTCTCTGAATTTCGCCCCCAGGCTCCCCAGACTCTACCTGCT
    CGCTGAGTTCCTCAAGCCCCCACCCTCTCTGGCGGGTCCTCCCTCAGAAAGATGGGGTAAGG
    TGTGCACACTAGGTACCTGTCTTCACGCCGTGATTAATGCCACTCACAGAGATGGTGGCGTTG
    GCAATGGGGATGCCTTGCTCGTCCGTCACCACCCCCTTAATGCCGCGGTGCACCTAGGGAAGC
    AGGTGAGGGCTGCTGGTCCTCAGGAAGGTCCAATGTGGTCCGCTGCTCCCTCCCGCCCATCCA
    GGAGCCTGTGCAGCCTCCTCTCCCCAGGCATTGCCCTAGCCACCCCACCTGCTCCATGAAGGT
    GAGCAGCGCCTCCTTGTTGTTCTCCCACTCGCGGGGCAGCTCACTCTCATGAGGGAACTTGTC
    ACAGCCCAGGTAGAAGGAGAGCTCCAGGCAGTTGGTATGCAGGTAACTGAAGTCATTGATAG
    CTGGCCGGGGACAGATACAGACCCAAAGTCAGCCCCTCTCCGGACCAGGCCCCGCCCACAGC
    CCCTCCCAGGCTGACTCACTCCCGGTCCGGGGGTTCCACTTGGCCCCGTTGACGATGCCCATG
    CCGCCGGTGTAGTCCTGGGCTTGGCAGCCTCCGCGGTAGGGCTCGGTCAAGGTGAGGTGTGCG
    GAGGCGAAGGAGATGGCAAGCCACCGGAAGATGGCGTGGTCTGGAGTCTCCTGGGCCTCGGA
    GACCTCGTCCTCATCCTCCCCCCGGGCTGCTGCCATGGCTGCGGCCAGCAGCTGCTCCTGGGT
    AGGCGTGCGGGCCATATCGTAGGGGTAGGATACTAGCCGCTCGCCGCCGTTCAGATTTGCTCC
    CAGCACGAAGGGGTTCTTCTCCATCCAGGCAATGATGGCCCGGACCTCCGTGGATACCTGGAG
    TGGCCAGCACGTGTGAGGCCAGGGCTGCAGCTCCGGCCACTATCCCCAACCTAGCCCGATCAC
    CCTCCATGAAGCTTCACACCAGTACTCGCACGATCCCCTGTCCCCCAACCCCCAGAGCCTCAG
    CGTCTGGAGTTCAGGCACCGTCAGCCCCACCCCCAAGCCCAGAACACCAGGACCCCAGGGTC
    CAGCTGCTCCCTCCTGCCCTTTCAGCCAGGCTGTAGCCTCACCGTGGCATCTGGCGAAAGGTA
    GCGTTCAGGGATGGGCAAGTTATTGTTGGGGACCCGGTAGGGGACCCATTTCCTCTCCTCAGC
    TCCCCAGAGCACAGAGTTGAGATCCGGGAAATCTTCAAAGATGTCAAAGCCCTCCTCAGTCCA
    CAGTCCCAGCGCCCAGTTCCCAAACTCTGAGCCCTGTGGGGAGCCAGCAGGGTAGGCATCGG
    CTACCCACACCCCCACAACCCCCAGCTGCCTGGACCCTGGCCAGCCTCACCCTTCAACCCACC
    ATCTGCGCTGCCACCTCGTAGCCATCAGGGTTCAGTGAGGGCACCAGGTGGATGCGTGTGTCC
    TGCACCAGGCTGCGCACACGTGGGTTCCCATCGCGGTACTCTCGGCACAGGTACTGCATGAGC
    AGCAGCAACAGCTCTCGGCCCAGCACCTCGTTGCCATGGATCCCAGCAGTGTAGCGGAACTCG
    GGCTCCCCTGCAAGGGCGGGAGCCTCAGTGAGCACTCAGTCTCCCGAGGCCCAGGGCAGCTG
    AGGAAGGACCCAGACCCACCTCATACCCGAGGGTCTGGGGGACAGCTGGGGCTCCTAGGGCC
    CTGTAAGACAAGCCAGAATCCCCAGAGAGGCTCCGGAACAGGCGGGAGGCAGTGAGCTCTGC
    ACATCAGCAGCAGAGGCCAGCTGCTGGCCCCCACAGACCCTCCCCCAGTTCATGCTCCCCAGG
    GTTGTCTGAGATCTCCATGGCATAGATCTTGAGGCCTCGTGAGCTCTTGCCCAGGCTGTAAGT
    GCGGGTGATGGTGGGGCACTCCTCGTTCACCACCTTCATGAGCTGGCGCAGAGGGGGAGGAC
    GTGGAATCAATCATGCAATCCGTCCCCCGCTGACCATGCCCCTTCCACTTCCAGGGCCTGCTCT
    ATGGCGAGGGACGGGCATGACCCCTTCACGCAGCCCCCAGGTACTGGCCTCCTTCCTAAGGTG
    AGGGACAGCCAGCATCCCTGGAACCAGTAGGGACTGGGCCCAGTGACAGAAGCACCAGGCAC
    ACACTCCCGTCAGCCACAGACAGGTCCCACCCCCAGCCCCAGGATATATGCTCCCAACCTGGC
    GCATGTCCTTGTAGCTGTGGTGCCGGAAATCCAGGTCATCGGTGGCCACCACCTCATTCTGTG
    CGTAGTAGCTGTAGACAGCTGCAAGGGAGGCGGGGTTGTCTTTAGCTGGGTGCCGGCTGGCCC
    ACCCTAGCACCCCACCTCCACTCAGAGCCCCTGCCAGCCCTCCACACTCACGGGCCACAGAGC
    ACCCCAGCACCTCCAGGCGCATGCACAGGCTGCCATTCCAGGTGAGTGGGTAGATGCGGATG
    AAACGAGCCACCACCGGCTCTGGGAGCTCACTCAGCACGGGTGTGTCCTTGTCCACGTTCCCA
    TGAAAGGTCTGGGGAGAGGCAGGCCTCAGAGCAGTACTGCCAGCCCCTCTGAGAGCCCACCC
    CTCGCCCAGACAATGGGAGCAGAGCCAAGAGCCTGGGCATGGTGCCCACCATTTCCTCATAG
    CCGTTGGTGTACATCACCCATGTCTGGCTGTCATTGCTGAAGCCCACGAAGAAGGTGGTCACA
    AAATCGTCACTGTGGAGTGGACAGTGGTCAGAGCAAGGGTCTTCCCCCTCCCAGGCCCTCAGG
    TGGCCTGAGCCTCCCTCTTCCGAGCCCCAAGAATTTAAGAGCTAGCAGGGTGGTGCTGCACG
    GCCCAGGTGTTGAGCCTGGGTCCTATGCCCGTCACATAGCCATGGGCAGGTGATCTGTCCCTA
    AACTCATGTGCTATCAGGACACAGGGGCTGACTGACCAGGCTGAGGAGTGGGGATGGGCAGG
    GTGAGTCCCTCACTGATCTTTTTGGCCTTCTTTGGCTGGGCCAAAGAAGGGCCCACTGGAATCT
    CCTTAATGGGACACAGAGCCATGCCTATGTAGCCACTCCCCTCTGCCAACTATCCATGAGC
    CTGGCCACGCACTGGATGCTGGAGTCTCTGCCCTGGGTGATGACGCCTGTGAACCGGGTAGTC
    CTCCTGGTGTCCACCTCTATCCACTGGGTCCTGGCATCGTCCTCGGCACACCACGCACCATCAT
    AGTAGTCGTCCTCAGTGGCACCGGTCTGTCCAGGGGGCAGGGGAGGCTGAGCATGGGCGGAG
    GAGTCCCTTTATCCCAGTTGGGAGATGGGCCCATCCCAATGCCCACCTGCATGTTGAGCCGG
    CCGCGCTGTGCCCCCAGGCCGTGGCGCAGCATGGAGGAGGCTCGGATCTGGTTGTCCTCAATA
    CGGTGTGACTCCATCCCAATGGGGGGACACTCTGAGGACGCGTACCCCAGAATGGTGGCTCA
    CTAGCTCCATCCTTCCCTCCACCAAACCCAGAACCAAGGAGCCCAGAGCCCACTCCCGGCACA
    TCGGGGGCACAGTCAGAGGGCAGCTCTGGTCAGCTGGTGGCTCCCTGGTGCCCTGCACCAGC
    CCACCTGGAATCGACTCAAAGCCAGGCCAGGAGCTGTTTCCAATCCCAGCCTGTGCTTCCCCT
    CCCTGGGCCTCAGCTGCCCCATCTGGAGAACGGGCTGACCATGCCCAGCTCTCAGGGGACACA
    CGTGAAATCACAGGTAGAGCTCCCCCAGGGCGCAGCCACAGATGTCATCCAGATGGGGACGG
    TCTGCACAATGGCCCTGCAGGGATACCTGTGAAGGTACCTGAGGTCCTCACTCCCCACCAAGG
    CCCCAGGTCCTCCCCCTACCACGCCCAGCCACTAGGGGCCCTGGGGAGCTGCCACCCTCCTGA
    AGCAGGCCAGCCTGGGGTCCAGGGCTGGGGCAGCCAAGCGAGGCTATCCTGGGCTCCCGGGG
    CCCCTCCCTTCTGGGTCCCAAGAATCTGAGTAGGAAAGGGTTCCGGGGACCTGGGTCCTGTTT
    GTGACATTGGGCCAGTCACTTGTCCCAGCACCCCCATCCTGTGGCCCCCACCCTCACCCCCTTG
    TGCCCCCCACTTACTGACTFTTCTCCGTAGGCGTCCACTCCTCCTCCAACTCCTCGCCCTTTCGG
    GGCTCTAGGGACAATGAAGGGAGGACATGGCACCAAGGGCCCGGGAGGCAATCAGGAGTCC
    AGATGCTGCCCCACAGGGACCCAGGCCCCAAGCCCCAGCCACACACCTTTGTGGTCCTTGCCC
    TTCTCCACTGCCCACTTGTCGGTCTCCTCCTTGGGGCTGCTGTCCTCCTTTTTGGGTTTCTCTGG
    AAGGTGCAAGGTAGGAGGGGCCAGTCAGCCTGGCTCTGGGCTTTGAGGACCATGTGGGGTGG
    ATCAGGCAGGCCCCAGGTGGCCTTCAGGGCAGGCCTGGTGTGGGAAGTCCTTGGTCCCACTCA
    CTCAGCTCCTCCTTCTCTTCGTCCGTCTGGCGCTCAGCATCGGGCTTCTGGGGCGGAGGAGGCC
    CAAAGTAATAGTCCACTATGGGGAGGGAGAGCCAGCTGAGGCTGCCCTGACCCTGCTGCGGG
    GCCTCAGCTCCTGGGTCCACAGGAGCTCAGCAGGACAGGACCGCGCCAGAGGGGAGGAGGAC
    GGGAGATGGGGGACAGCTGAGTTGGGAGAGGGTCTTGCAGGAGTCAGGAGCAGCCCGAGCTC
    AGGGGCAGCTGAGCAAGACCCTGCTGAAGTCACCAGCCCGGCCTTCCAGGAGCATCTGGCCT
    GGGGAAAGGACTCGAGGCCCAGGGCATGGGAAAGGCCTGGAGGGACAACTGGCACCTGTGC
    CTGGGGTTGCGGGCTGGGGGGTGAGATGGGGAGACATTGGAGGCACTGATGGGGACCTGGGG
    GCAGGGAAATGGCGATGCACGGGCTGCCACCCAGGAGGAAAGGGAACCTGAGGGCTCCAGG
    GACGCAGGGGCATGAGCAACAGGGAGGCAAAAGCCCTCGGGCTCCCTGAAGAGAGTGGGGC
    AGTGGCCACGAGCCAGCGGGAAGCCAGTTAGAGCACAGGACTGGGAGGGCTGGAACCCACA
    TGGGTGACAGGGCAGAGTGTGTGCCTAGGGACACCCCTGTGGGGGTCACAGCCAAGCAGGAA
    TGGGCACGCAGGGACAGCAGGGACAGCGAGGTAACCACGGGCACAGGTGGGGTTGCAAGGT
    GGGTGAGTTGCCCCAGCTGGCTCCTGACCACACCCCAGCCCCGACCCCCACCTGCCTATGTCC
    CTCAGACTCTGGGGTGCTGGGTACTCACTGTCATCGTAGTTGGGGATCACGTAACCATCACCA
    TAGTCAGGGGGCAGCGGGGGCAGCAGAGGCTTCACAGGAGGCTCTGGGGAGGCGGGGAGGT
    TAGGAGGGGGCCAGAGCGCCGTGGCCATGGCACCTCCTCTCCTGCCCCCCATCCTACCAATCC
    TCTCCTCCGGGGCTGGGGCCGGGGCCTTCTCCTCAGGGGGCTCTGGCCAGACCCGCTCGGGCC
    TCCTCCTTCTGCTTGGGGGTGGCCTGGGTTGCTTCTGGCGCCGAATGTACTCAACTGAGGGGG
    AGGCTGGCTCAGAGTGGGGCCCAAGGCTGGGATGGGCCCATTGGCACATCCCCCAGGCCAGG
    GGTCCGACCCAGGTGGGGCTGGCAGGACCCTACTCAAAGTCCTCATAGTCCTCCCTCTCGATC
    TGGTCATTGTAGTCCAGTGTGGGTTTGCTCGGTCTCCTCCTCCGGCTCTGAGGGGAAAGCGCTG
    GTAGCTGCCTGACAACCCCACCCAGGCCTACTCTGGGGAAGCCCTCAGTCCAACCAGCCAGG
    GCAGCTGGCCCCAAGGCCAGGCGGATGACGGCCACTCACCAGGCTGGTGCTCCTGTGCCTCCA
    CATGGGTCTCCTCTCCTGGATTCTGCCAGTTATTTGAGAGGGGCGCCCCTGCAACACAGGAGT
    TCCAGAAGCAGGTGGGCGGGAGGCCTGCTCTGACCACCTTGGGAGCCTCAGGCCACCAGCCA
    CCCATAGAGCCCACACAGAGCCTGTGGACACCCTCCTGAGGCCGAGCTCACTCCAAGGAGGC
    CTGAGCTCCTCTGGCCTTCAGCATCCTGCTGGCATCTCATGGGGCCAGAGAGCTGGGCCCACC
    TTCTGGGGAACCTACTGTGCTGCTGGAGGCCCTACCACAAAGCTGTCCCCAGCGGGAGAAGG
    CAGGAGGGAACTCCATGGGCTCAGAGCCCAGGGACATCTGGGCAGGGGCCTGAGGGACAGA
    GGTCCCACCCAAAAGGCTGCCAAGCCCTCTCCCTACCCAAAAGAGGCTACAGCACTGAGGGA
    GCCCACCAATCAAATTGTGAAATTTATAGCAAAAGTGAGGTTCCCATCCAGTGGGGAGCTGA
    AGGTCTATAGGAAGCAGGGCCCCAGAAACCTGCCTCCCACTCCCTGCCTCCACCCGAGCAGGC
    AGTCAGAGCCCCATCACCCCAGAGGAGCCCGGCACAAACCTCCCTCCTGGGGTAGCTCCTCGG
    GGCCAGGGCTGGGGGGTGGGGGCAGTGGCCACTCCAGGGTTTCTGAGGGAGCCAGAATGGGG
    GGCCTCTTCCCTGACGGGGGCTTCTTGGTGGCCTTGGGTGGCTTCTCTTTGGGCTTCTTGGTGG
    CCTTGGGTGGCTCCTCCTTGGGCTTCTTGGTGGCCTTAGGTGGCTTCTCCTTGGGCTTCTTGGT
    GGCCTTGGGTGGCTTCTCCTTCCCCTTCTTGGGCGGCCTGGGGGACCCCTCCAAGGACTCCTTG
    GGCACCTTGGGGCCTTTGTCTTTCTTGCCTTTCTTCCCTTTGTCTTTGGTCTTTTCCGGAGGCAC
    TGTCCAAGATGCAGACTCGTGTCAAATGAACAGAGCCAGCTCTGTGCCCCCATGAGGCCCCTC
    TCTAGATGCCCAGAACCTGGGCACAGGGACTCTTGTCAGTTCCCAGTGCGGATCAGCAAACTG
    AGAGGTTAAGTCATTTGCCCAAGTGGCAAACTGGGATCCGGACCCAGATTTTCTGTCTGCAAG
    TCTGGGGCTGTGACCACCAATCTCAACCTCTCTAAAGACTGAGCGTAGGGTTCCCAGTTCCCA
    GGGGGAGGCCCTCATCCCCCCACCTGCCAAAACCTCAATAGGGTTCCTTACTATCCACTCCT
    CCACTATTCTGTTCTGGGCACAGAAGGGGCAGAGAGGTGACTGAGCCATCCAGGCCTGGAGG
    AGCATCTGGTCATCCCTGCCAACTGCCATACAAAGGAAGGGACATGGGCCCAAGACCTTCCCC
    TGGTCTCCTACGGGGCAAGAAAAGCTTCAAAGAAAAGGGACACTTGGTTGAGTATTGAAGCC
    CAAAGAAGAGGAAGTGGTCTCCTFTCGAGAAGTAAGGGGTTTGAATTGATTGGAAGGATAG
    GGAGTCCTGGGGGGTTCAGGGATCACACAGAGGACAGAAAAGACAGGTAGGGAGCTTGTGG
    CTCGACACTCATTTCAGAGTCTGGGAGAGGGAGCAGGGACTGGTTGTGAGGATTCCCCATGG
    GAATCCTCCCAGGACCCTAAGCAGGAGCTGCAAGTGCTGTTGAGAACCTGATGAGAGGTGGG
    GAGCATGAGGGAAGTTTGGCAGAAACACAGGAAAGCTACCAAATGCAGACAGCCAGGGGAC
    GCAGGGCTGCTAGAGCGGTGCCCCAGAGCCAGGAGAGCAAGCCTGGAAGGAGAGCCAGAGG
    CAGGAGGGGCACAGGCAGCCCAGGGTGTGGGAAGCAGCCAGGAAAGATCTAGAGCTGGGGT
    GGCAGGGGAGGGGCTGCTGACATCAGGAATGTTGGATGGTGCCTTGGAATCTCCTGGGAGAC
    AGGGATCACAAGACCCTCTGCCACCTTCCAGAGGGCCACGATGAAAACAGCTAAGATTTACT
    GACAACTGATTATGCAAGAGGCCGTGGGTTAAATGCTTCAGTGATGCATCACCTCATCTAATT
    TCCTGTACTAATGTAGGACCACCCATTGCTCACCACCACCTGAAGCCCTGTGCTCACCACCAC
    CTGAAACTCTCTCACCTACGTGAGACCTCCTGGAGTAGGAGGGCAAAGGCAGGAGGGAGGGA
    CGACGTGAAGCTGTGCCACCAACAGGGAGAGTGGTCCCATTAGTATGGCAGGGGGTGACACA
    GCACAGTCCCCTGTGGCTCAAGCCTAGTACCTGTCGCGTACTGGAGGAATGGGGATAAGCGA
    CCCGTACAACCACAGCACCAACCCTAGAGCCACCGGCCCCCAAAAGCGGCCCTGCCGCCCGG
    GTGCTGGATGTGCCTCCACGCCAGCGCTGACCTCGGCCTAGCACAGGGTCCCTCCAGGCATCT
    GGGCTCGCGTGCGCATTAGTAAGCCAGCCATTCCTCCCCTAGCAGACTGGGGAGTGGCCAGAC
    CCTACCGAATCCCCCTGTTCCCACCTGAGATGCCAGCCCCCCACACCCCCGCCCTGCCCTGGG
    CTCTTACCTTCTGCGGCCGTCCCTGGCCGCTTCCCTGGCTTGCCCCCCGCCTGGGCTTTTCGGA
    CCCGCGGGGTGGGCTCGGGAGGCGGCGGGGCCTCCACGTCGTCCTCCCGGGGCTCAGGTTCTA
    GCTCTGACAGGAAGCCCTCGAGGAACTCCTCGATCTCGTCGTCGGTCAGCACCGTCTGCGGGC
    GCCCTCCAGGGCACAGGGCCAGCAACGCCAGGAGGCAGCTGAGCAGGGGCGCCCCGCGCAC
    GGCCGCCATGGCCGCGGCACGCGCGGGGGGCTCCGGGGAGGGCGCGGGGGGTCAGGGGCTCT
    GGGTCTCTGGGAAAGGGCGGAGAGGGGATCGAGACGGGTGAGGGAATCCAGGAAGGGGCGG
    GAGAGAGGATGGGGTGAGCGAGGGAATCCGGGAAAGGGAGGGAGAGTGGATTAGGGTGGGC
    GAGGGGACCCGGGAAGGGGTGCTGGGGGGCTCCGAAGCCAGAGGGGCTCAGGGGTGGTCGG
    GGCGCTCCGAGGTCTGGCGGCTAATAGGCGCTCCGGCCCCGCGTGGCGCACTCCCGCGCGGAT
    AGCCGTCTCCAAAGCGCTGGCGGGGCCCGGGGCGGGGGCGCCGGGGCTTCCGGAGCCGGCTC
    CCCACCCCCGGGGAGGAGGAGGAGGAAGAGAAGGAGGAGCCGAGAGTGGACGGAGGGGCTG
    CGGGGGGGCGGGGGGCGGGGGGCGGGGGGCTAGGGGCGGGGCAGGCGGGCGGGCGCTGGCG
    GCGAGCGTCCCAAGCCCGGAGACTTGCGCCTAGGACAGAGGGGCAGGGGGCGGGGCGACTG
    GGAAGACAGAGGGCCTGAGGGAAGGAAAGGTGGTGGGGAGGGCCTGGGGTGCGGGTCTGAG
    GGGGCCGACATCCCTCCTCCTTCTGCCCTAGGCACCCCCCTTAAGGCGGGACCCCGAGTCCAC
    CGGGGCTCTGAGCCCTCCGCGGGTGACCAGGAACCCTGGACGGAAAGCCGTGGTGTCAGGCC
    TCTGAGACCTCTCTCAATTCGGAGGGCCACAGAAAGGCCACCCCATCCTTCCCAGGCTCTGGA
    GCCTCTGCCCATGGGCCCTGCTGCATCCCAGCGTCAATTCATTCAGTCATCCTACCAACCTCTT
    CAGGTCGGTGTGGGGCCGGGCCCCGTGCTGGGCCCCAGGGAGGGACAGCACAGTGGGAACTC
    ACTTTCCAGCCAGGAGGCAGGTGCAAAACTGCCCTCAGAGTGGCCAGCTGCCCCGCTGGGGG
    TAGGAGTCCCATGTAAGGGCATGCCATCCCTCCCCTCCGGGTCCCAACGTGGACAGAAAGCCA
    TTTATCACCTTCTTCTTACCAGAACTCATTTTTTAAAAAGTGTCTACCATACCTCCAGCTGCCA
    CATGGACCCAGAGGGCCCAGAGGACCCAGAAGGCAGGTGGATTGAGTGTCAACTGATCCCAG
    TTGGCTGCCCTACAACGGCCATCAACAGGCAGAGTGGTCCCATTAGTATGGCAGGGCGTGAC
    ACAGCACAGTCCCCCGTGACTCAAGCCTAGTCCCTGTCTCATACTGGAGGAATGGGGAGCTAA
    GGACAGAGCTCCGAGGACATTCCCCCTTAAAGGAATGAGGACACAAGAGAAAGCTCACAGGT
    AGTCCATGGGCCAAGTGCAGAGGCAGACAGCCCTAAGCCACGATTGTCTGCGGGGTTTGGCC
    CCAGTGAAGTAGTCAGGTAGGGAAGCCTAGGAGCCCCTGGGATGATTGACAGGGCAGAGTTT
    GGACCTGGGGTCAAAAGGAAAGAGGAAAAGTGGGTCAGGAAGCACCTGGGTCCCCAGAGCA
    GCCCCGAGTGAGTTGGAGCAGGCAGCAGCCGGGGAGGCCACAGTGGAGGCTGCTGGGCCTGG
    GATACATGCCACCCCCTGGGAGCAGGACCACAAGGAGGCCTTGCCTCCTCTCACACCTGGTCC
    TGCCAAGACCCTGCCTTTGCTTTCTCACTGCATCTCCTTGAAAAAGCAGTGGGACTGTGTCAG
    GTTCTGGCTCTACCTCCCAGGCACCACATCTCGGCAGGTAGCCTCAGTGCCGTCCACCTGTGTC
    CCTGTTCTCCTTGTCGTTCATACAGGATCATGCATGTGCTGTGCCTAGCACACATTCTTGGCAC
    TCACACTGCTGCCTTTTAGCTCTCATCATTTGCCCTCAGAGATCAACCTGAGCTGTGCCCACTG
    GGGCGCTCAGAGCAGACCCTGAGCCCCAACACCCAGGCTCCCTGTGCACCTGAGCCTGCCTCT
    GCCTGCCACGTGCCCCCAGGCCAGTCCTGGTGGCAGCAAGGATCCGCAAGCTCTCCCCTTTCC
    TCATCCTCTGCAAAGCTCTGAATCATCTTTCTCAAAACTTGTTCTGGGAATTTGCTCCGTTGCC
    CCAGTTGAGCATGTCAAGCCCGGCGGCCCAAGGCTGGGGTGAAGCAGCGTGGCACGTCACTT
    CCCTGGGAACAACTCACACATGGATTGGATTTGGGTCCAACATCCTCTGCCAGGGAAAATAGA
    AGCCATAAGAAAACAAAAAAGGAACAGAAGGAGGCTTTCTTCAGTCACAGCGAGTCACCAA
    CAAAAACATGTGCAAAAGCTCTCATGGAGAGCTGGGCCACAAGGAGGGCCATGATGTTGGGG
    GCCCTCTGACACCAAGGGTGTGGGCAGGTGGATGGGAGGCAGCTGCCCTCCATGCCAGGCTG
    ATGTGCCTCCCTTTGGGTGGTGGGGCTGGGACTCCCACTCCACTTGAAGACCTGCACCAAAAA
    GTCCTTTAGCCCTGTGCCCAGGCTCTGCCACGGGGCCGGTGAGGGGACTTCTCCCCTCTGCTG
    CCAGAGTGAAGCCAGTCAGGGGGATGGGAGGCTTGTAGCCAAGAGCACCTAGTGGCTTTCAG
    GGTCCCTTACCCCTGCCACTTAGCAGGGTCTGCACCTGCATCCAAGTGTTCTCCTGGGCTACAG
    TGGGGGGCTGGTAGACACTCTGGTGATCCACTTTCAGCTTCCCACATGGATGTGGCAGGGACT
    GCTTTGGCATTTCCCTACCCCAAGGGACAGCCACTGCGGCAGGACTGGGCTGGGGAGGGTGG
    GGCCTGCGCTGGGGAGGGTGCCCCCTGTCCCTTGCTGCTGCTGGAATGGGAAGGAGAGTTGTT
    GAGAGAGCCAGAACTGTCCAAGGGTGGAAGCTGGCGAAAGTGACCTGCAGGGAACAGGGAG
    ACAGGGAGCATGGCCCAGTGAGTAGGTCCTATGTAGCTCTGAGGCCATCAACCCTGCCATGA
    GGGCTGAGACCCCAAGAGAGAAGTTGAGGTTGGGTCAGGGGCCTGTTAGTGCCAGCTGAGGA
    GGGGGACAGGCCAGCCTCCTCCCACTGGGACCCAAGCTATAGCTCCTGAGCCTCCAGAGCTGC
    CTGGTGCCTCAACGTGGTCAGAGGTGGAAACTCACCTGCCAGCAGGCCGAGTGTGCCTGAGTT
    CTGACTGTGGGGATCTGCAGGGCACAGAAGGATAAGAGGTCATCAGGGCCTGGGGACAGGCA
    GGAGTGGCAGGGTCTGGGAGGCTGGGAGCAGACCCTCCCAACCTGCCCCATGGCCTCCGTGG
    CCCCCAGGACCCCCATGGCAGCAGCTCAGACACGGGTTGTGCCTCAGAAGGAAGTGAAGCTG
    TGTGTACCGAGATGGCCCAGCAAACCCTTTGTATGTAAACTTCCGCCACAGCCCAGCTGTCCA
    GCACCAGCATGTGTATCTGGGGGAGGGGGATAAATAGAAGGTCTGGGAGGCCTGGGATCTGG
    CCAGCAGGCTACTGGGATCACAGATGCCAGCCCCTCCATATCTCCGCTTGAGTCCTGGATCTG
    CCTCCTGGGACCAAAGGGGAAAGGACCAGGCTAGGCTCCTTCCTTTTTGTTCTTCCCTCTTGGG
    GGAGGCTCCTAGAAACTCCCCCTTCTCTGCCGCCCAAGTGCCTGGATATTACCAGTGGGGTTA
    GCCTGTTTGGGCCCACAAGATGGGATGGCTCCCAGAGCCATGGGACCTGAGGTCTCCCAGAC
    AGTGTCTAGCCACCCTCACAACTGGCAGAACAATTTCCTTGGTTTTCAACAACTTGAAAAACA
    TATGTGATTTTCCACAGTCCGGTGCTTCTCAGGCCTGGCTGCTGAGTGAGCAGAGTTCATGCTG
    AATTCCTTCCACTCACCACAGGGCAGACAGCAAGCCCAGCTGTGGGGACTCGGTTGGGGTGG
    GGGTCACCACAGCAAGGCGCGGGGAGTGGGGAGGGGGGCAGGCTTCCAGCACTGATGAGTA
    ATTCTGCTGCCCGAAGATCTGGGAAGAGGGCATGTGACAACTTAGTGCAACAATCTGCCCAGT
    GTTAGGTCAGAAGGAAGGAGAGGTCGTTCAAAATGGAGTCTGGTGGAAAAAATAATGTTTGG
    CCCCACCTCATACCTCCCTCAAAATTAACTCCAGATTAATGAGGTAGATGTTAGAAGAGGAAC
    CAGGGAAGGACTACAAGAAAATATGGAGTCTTTATTTACATTGTGAGGTTTTCTTTAGGTTTT
    GTTTGTTTTTGTTTTTGATATGGAGTCTCACTCTGTCACCCAGGCTGGAGTGCAGTGGTGCGAT
    CCCGGCTAACTGCAACCTCCGCCTCCCAGGTTCAAGAGATTCTCCTGCCTCAGCCTCCCAAGT
    ATCTGGGGATTACAGGCACATGCCACCATGCCCGGCTTTTTTTTTTTTTTTTTTTTTTTGTATTT
    TTAGTAGAGATGGGGTTTCACCATGTTGACCAGGCAGATCTCAAACTCCTGACCTCAAGTGAT
    CCACCCGCCTCAGCCTCCCAAAGTGCTGGGCGCCCGGCATGTGTGCCCAGCCTATATTGACAT
    TCTTGATGGAGAAGTCTCTTAAGGAAGGACAGAGAAGTTTGGTTGCATAAAAGTTTTTACCTT
    CTGTACATCAAAATATACTGAAAATGAAAATAAAGAGCAAACAAAATACTGAGAAAGAATGC
    AGTGCTTAGAGAGCGAACATTCCTGGCCTCCTGTAGTTTTAGGAAGCAGCTGTGGCCTCAGAC
    CCATCTGCTGTGAACCTCTACTCCATATTTATTGCACTTTCTGTCTGTGAGCGTCGGTTTCTCTC
    CTCTATAACAATAGGATAATAATGACACTACCATGCCTTGCAAAAATGCTACAAGGGTTCACT
    GAGATAAATCTGGAGAGTCATGCCTGAAAAATAGTAAGTCGTTGATAAAGGGAAGCTGCTAT
    TAATAAATAAAGCTTTTTCTTTTTTTTTTTTTTTGAGATGGAATCTCACTCTGGCGCCTAGGCTGG
    AGTGCAGTGATGCAATCTTTGGCTCACTGCAACCTCCGCCTCCTGTGTTCAAGCAATCCTCCTAC
    TTCAGCATCCTCAGTAGCTGGGACTACAGGTGCGCACCACCATGCCCGGCTAGTTTTTTACATT
    TTTAAAGCTATTAATAGGCCAGCCACAGTGGCTCATGCCTATAATCCCAGCACTTTGGGAAGC
    TGAGGCAGGTGGATC
  • The adipocyte enhancer binding protein 1 (AEBP1) gene is 16,000 base pairs in length and contains 21 exons (see Table 3 below for location of exons). As will be discussed in further detail below, the human AEBP1 gene is situated in genomic clone AC006454 at nucleotides 137,041-end. [0048]
  • POLD2 has an amino acid sequence depicted in SEQ ID NO:4: [0049]
    MFSEQAAQRAHTLLSPPSANNATFARVPVATYTNSSQPFRLGERSFSRQYAHIYATRLIQMRPFLE
    NRAQQHWGSGVGVKKLCELQPEEKCCVVGTLFKAMPLQPSILREVSEEHNLLPQPPRSKYIHPDD
    ELVLEDELQRIKLKGTIDVSKLVTGTVLAVFGSVRDDGKYLVEDYCFADLAPQKPAPPLDTDRFVL
    LVSGLGLGGGGGESLLGTQLLVDVVTGQLGDEGEQCSAAHVSRVILAGNLLSHSTQSRDSINKAK
    YLTKKTQAASVEAVKMLDEILLQLSASVPVDVMPGEFDPTNYTLPQQPLHPCMFPLATAYSTLQL
    VTNPYQATIDGVRFLGTSGQNVSDIFRYSSMEDHLEILEWTLRVRHISPTAPDTHGCYPFYKTDPHF
    PECPHVYFCGNTPSFGSKIIRGPEDQTVLLVTVPDFSATQTACLVNLRSLACQPISFSGFGAEDDDL
    GGLGLGP
  • and a genomic DNA sequence depicted in SEQ ID NO:8. [0050]
    CCCTCCTCCATCCCTGCCCCAACACCCTGAAGACCCTGGATGCAAAGAAAGGCCCGAGGGAG
    CCTCTTCCCTCGCAGTGCAGGCCTCACCTGGGGCTCAGAGTCAGAATCTGGATTTTATTCCCTA
    GGACAACCTCTAGTCAGGGCAGAGGCCGGCTGTGCTGCCCAAGTGCCCTAACCCTAGCTTTGA
    GGCACCAGAAGGGCAAATGCAAATTAAAAATGAGAATAAGTTTATTCTCCTTGGTGAAAAAA
    AAAAAAAAAGACTTTCCCCTCTCCTTTTTCTTTAGAAAATCTATCATTGCAAGTTCCTTCCTGG
    ACTTTTTTTATGTAGATCTGTTCAAAAGCTAAATAAGCCTCTTTTCAAGTTTCACATCCCAGGAA
    TGTCTCCTTAAGGACCTAGGAGCCACCATTTGAAGTGTAATCACCAAGGGAGATACATCCTTA
    TCTCCCAGTTTCCGTGGGCAAAGGGGAGCCTAACTTTAGCCCGGTGCCTAGCTCAAGT
    TGCAAACACACTTCCAGTCTTAAAGGAATGAATTTATTTTTTTTCCTTTAGGCAAACCCAGGTA
    GCCACCACAGTTACCTGGGGATTCACAGAGAACTGTGTGTGACCACTGGTGCTGTCAAGTCCT
    CTTACCTGAGCACCTGTGACGTTTCCCTTGAGAACGTGTACGGGATGGGTTGCACCTGGTTAT
    ATACAAGCGTGAGACTTCTTTCTGCCTTTGTAATTTATTAGCAGATTATCTGTGATGAGC
    ATCGCAATCTGTTTAATGCCTATTCAATAATTAAATTTTTCTTTCTCTTCTTTTGTGGAAAGGTT
    TTCTGCATTGGCAGGAGATTTTTGTTTTCGATTATGTCCCCAACATGCCTGATGTTCCACCCCT
    CAAGAGCCTCAGCCTTGCCCAGGGAGGGCATGGGGGTGAGTGGCCTCTCCCACAGAGAGTGC
    TGGCCAAGTTGGCCCAGGTGCGCAGCAAGGGCTGCTGCCCAAAGGCTCCCTCCTGGTTG
    GCATGGGTCGGGACCCTGTTGTGTTGTGTTTTCGCTCTTTTTCGTAGAGTTCAAGGGGGTCCTG
    CTATGTTGTCCAGACTGGTCTTGAACTGACCTCAAGGGATCCTCTCGTCTCAGCCTCCCAAAGT
    GCTGGGATTACTGTGCCCAGCTTTGTGTTGTATTTTCTGATCTTATCCTGCAACCTCTTGAGCC
    CCCAACCTGGGCCCCAGTTCCTGCTGTGCCCCAGCCTGCCAGCCCTCTCTCTCTGCAT
    ATTCTTTCTTTAGCTGAGTTAACACCACTGATAAGGTTAAAGACAGGCTCTTAAATTTCTGCCC
    TGGCATGAGAAATATGTGACCCAGATGCTTCTCCAGCTTTAGCTGTCCAGTGTAACTGTCAGGG
    ACTGATGGGCGCGTGCTGGCCCACAGCCCACCTCAGTCCTGACCCTCCCTGACAGGCTGAGAG
    AGGCCCCAGCCTGAACCTGGACTCCCCCATGTTCTGATATTCCTGCACAAGAGTGCAGAG
    GCCTGGTTAAGCTGGAGAAACATAAGGAATAGGTAGGTCTGCACACACTCACCTCTTCCTTTG
    CAGTGAACCTTCTAGAATCTTCTAGATGGAAAAGCTGGGGGTGTGGAGGTGTAGGGATAGGA
    CAGCTGGGGGAGGCCTTGGCCAAGGTCAAGGAGTAGGCCCAGTCTCCCTCTCTGTGTGCCTGT
    CTGGGACTCGGTTTCCTGTCTGTGAAGCAGGGCTGGACGGGATATTGACAGCACCTGATGGTC
    ATTGAGCTCCTCTGCCCCAGGCACTCAGCTGCTGGGCACAGTGCACACGTGGCAGTCCGGTGC
    CCTCTCACGCTCCGTGATGACTGAGTCTGTAGTTACACCCCTGGCCTCAGAATAAAGACTACA
    CTTTCTGCCTCCCTCACTGGCAGGTATGACTAGGTGTGGTGGCAGTTTTCTCCTTAAGAGACAG
    ATGTTTGTGCCTCCCTCCAACCCGCTGGCTAACACCTAGCTGGCACACAGCCTCCTGGGGCTA
    TGAAGATGAGGGCCACAGCCACAGGGTGGGGGAGCCGTGAGCTGGGTCTGGCTGCGTCTCTG
    ACATATGGGGGCATCACACATCACCTCTACCTCCCATCGAATGCTACACGAAGAGAACAAACT
    CCACCTGATGGAAGCTGCTGTTGTTTGAAGTCTTTCATGCTCACAACAGAACCTAACCCCAAC
    CAATACAGTATGAGTATTGGCCCCACGTGGTTAAGCAAGCTGTCCAAGGFFACACACAGCTGG
    GAGGTGGTGGAGCTGGGTTTGAGCCTGTTATTGACCTTTGTGCAGACAGACCTCAGAGCAGAG
    CACAAGGCAGCAAGGCTGTGGGTCTGGGGCTCCCTCTCCAGGAGAATCAACTGGCTGCACAC
    AGCCTGGAGAGCCCATGGGCAACCTGAGTCCTTGCACCTGGAAGTTTCTGTGTCCCACACATA
    TCCAGGAGCTTAAAATGAAGATGTCTGAATTACCCAACCTCTTGATAGCACCAACCCAACCTT
    CCAGCCTCCTCTTCTGAGGTCAGCCCAGAGCAAGCCCCTTGCAAAGCTGATTTAACTCAGAA
    CCACTGGGCATACCCACAGGGCAGTGACCCTGCAGCCCTCGATCAAATGTGCAGATGGACTTG
    GGGGTGGGCTGGTACCCCAGATGGCCTCATTCTCCCAGGGTTGCAGAGCCCCTGAAAGCCACA
    GCCCTGTGTGCACACCACTGGGGAGTCATCACAGGATACTTCAAGAATTCAGTGCCAGGCAA
    GGTGGCTCATGGCTGTAATCCCAGCACTTCGGGAGGCTGAAGCGGGCAGATCACCTGAGGTC
    AGGAGCTAGAGACCACCCTGGTCAACATAGGGAAACCCCATCTCTACTAAAAATACAAAAAT
    TATCTGGGCGTGGTGGCGGGTGCCTGTAATCCCAGCTACTCAGGAGGCTGAGACCGGAAAAT
    CGCTTTGAGCCTGGGAGGCAGAGGTTGCAGTGAGCTGAGATTGCACTGCTGCACTCCAGCTTGG
    GGGACAGAGTAAGACTCCATCTCAGAAAAAAGAGTTCTGTGTATCATTTAATGTGGAGATCCT
    CCCATCACGAGGATGAGGCTGTTTCTCTACTCCCCAGATCTGGGCTGGCCTGTGGTTTGTTGAC
    CTCAGCCTTGTAGTTCTCACTTTCCTGGAACCTGAATGCCACCACGCGACATCCATAAGACAA
    AGCCCAGGATAAAAGATCACTTGGAGAGACAGGCCTGGCCTGGCACCACCCCGGCTGAGGCT
    GGACCCCTGGGAAGGAGACTCTGATGGACCTCCAGACCCAGT
    CAAATGACCACTTCCAAGGTCAGGCAAGAAGGGACAAAGAGCCACTGGCTCAGCCCACAGCA
    TCTGAGAAATAAGAAACCGCTGCATTTTTTGAGCCAGTAAGATTTGACAGGTTTGTTTTGCAG
    CAATAGATGAGTGGTACCTCATCTTAGCCCATGTTCTGATGAAGACAAACAGTAGCATTGACA
    AAGTTTTAAGAAAAGTTAACCAAAAACTGGGATTCCTTTCTTCATTTTGACCCTTTGTTACAAG
    AAACAGAGGCCCACCCCACCAGACTCACTGTTCACTGGTCCCTGAGTGCCTGTGAGTCTCAGT
    GGGAGTTACCTTGAGACCAGCCCTTCTGAGTGGAGGGTGCTGGGTGCTGAGGTCAAGTCGAG
    CTCAGTCCAGGCTAAAAGGAGAGCAGCTCTGGCCAGGCTGTCAGGGCTGTGGCCTCCCCAAG
    AACCTCCTACCCTGGCCCCTCCAGGCTTTGCTGCTATGGTTGTGTGAGGGGAGTTGCTGTCCCA
    GCATTCTGGCCCCCTTGCCCCCAGCCCCTCCCTGACCTCCACGGGCTTCAGGCCTCAGTCCAGA
    GTCACCTCCTCTAGGAAGCCATCCCCCAGTGCAAGTCTGGGCAACATTCCTCCTTGCCTGGCC
    CACCTGCTCACTCTCATGCTATGGCTTTCTGTAAGCAAACACAAAGATAGGAACAACTCTGTC
    CCTGGCACAGAGCAGATGCTCTGGCAATATCTCATGAGTGAATGAAGGCACATGACAAACCT
    CCAGACCTGTGGAGACTGAAGGCTGAGAGCCTTTATAGATGCTGTGGGGCCGAGGAGTTTGC
    CAACTACAGCAGGTCATGCCCAGAGGTTTCTCTCTGGGTAGCAAGGTGTGTCTCCCACCAAAG
    GCCATTGGCATGGGGCCCGCCCTGCTGACCCGAGGCAGTGCACAGCAGAGGCCAGATGCAGT
    GAGAAGGAGCCTCTCCTTGGCCTGCTGTCTGCTGCCATGCCTGTGGGGGCGTGGACACAAGTG
    TGTGGCATAGAAGGTGGTGTGGCAGGTGAGAGGTTGGGGGTGTGTATGTAGCAGGTGTCTGT
    GTGTGTATGTGCATGTGGGGGTGTGTGTGCATGCATGTGTGTGTGTGCATATGCACGTGTGTG
    CATATGCATGTGTGTGCATGGAGAGAGAAGACCTCCTCTTTCTGGCCCCTCTCCTAGCTGCCCC
    CCTCCCTCCTGCTGCCAACACACTGTCAACCCTTCACTGTCTTTTTCCTTGGGACTCGTTGATCT
    GTCTCTACCATCCCAGGTGTCTGGAGCAGCCTCTAACCTTCCATCTGCCAAGGTACTTCAGCCC
    CACCCCTCCCAGCTGTGGAATGTCCCCTAGGATGTGCCACTGACACAAAGAGCCACACAGCTC
    CAAAATAGAATATTATCTAACCCACTGCTCCCTTTGCTGTCAGCAACACCTCCACCATGCTTCT
    CCCAGGACCCCCCFTTGAACTCTCTGCTTCCTCCCTGAGGCCAAAGGAAAGACAGGAAAGGGG
    CCACCTTCCTGTCCTTGGGTCCCACAGAGATGTATCCTTGTAATGAAACCTACTTTATGCTTGA
    GTTGTATCCAGTTAGTTTCTGTGGCTTGCAATCAAGACCCACACCCACCTCAACCCAGGCTCTA
    GAGAGTAGACCCTTGTTTTTGCCTGGCTTGGGTCGACCTGGCACCTGCCAGGGTCCCAGCCTC
    TGAGTCAGCCCACCTTGCCCTCATCGGTGCCACCTCCAGGCGGCTGT
    ACATAGACTCTGGCTTCTGCCCTGGCCTGGCCTCTGGGAACTGCAGCTGTCTGCTTCCATCCTA
    TGTGGATGGTGCCTGAAAGTGAATAGGGATCAGTTACCAGCCCAGTATCTGTCCCCTTCTCAA
    TAGCACTGATTCCTATGGGGAACTGCTTTTCTTGGACTATGTATGGGTTTGGTGGGAGGGTAG
    TTCCTGTAACCAACCCTACAGGGTGTAGGAACCTAGACTCTCAGCAACATAACAGGCAGC
    AGGCTCCCAAGCTAAGTCTGGCCAGCTGGGCCACCTCTCCCAGATTCTGTTTCATGAGAGCAT
    CATCCAAGAGCAGTGGGAACACTGGGGACGGTCCAGCCTAGGACTGGTATGCAGATCAGAGA
    ATCCCAGATAGAAGGTGATTGCTGTTCTTCCAGTTTCTTGGCCCTCCAGAGCAACCATACTTCC
    CATCTGCCCCAAAACCTGATCCTCCAAACTCCCACCATTTCTGTGCATCCCCAATATCTAA
    TAGATCAACTGCCTTTCATTTACATTTGTCACAACCAAATGATACACCTGCCCTTCACCCAGTA
    CTGAACTGCAGCTGGGTTAGTCCAAATTCAGGGCCCACGTGTCATTTCAAGCCTGTCTTGAAT
    AATGTACACCTTCCTGCAATGTGAGGATGGCCACCACCTTGGTCTTATACCCACGGGTGTCCT
    GAGCTACATTTCTCATAATCAAAAATAAACTCAACACATCACTCCAGCCTGAGCAACAGA
    GCAAGACACTAGCTCTAAAAATAAAAAATAAAAACAAACAAATGAAAAACCCAGCAAACTT
    GGGGAAAGAGGAAGCACCTGATTTCCAGAGTTTCCACATCATGAGATGCAAATGTCCAGTTTT
    CAACAACAACAACAACAACAAAAAAAAAATCACAAGGCATACAAAGAAATAGGAGACTAAG
    ACCCACTCAAAGGAAAAGAATAAATAAGCAGAAGCCATACCAGAGGAAAACCAGATGGCTG
    ACTTACTAGACAAATACTTTAAAACAACTGTCTTAAAGATGCTTGAAGAGCTAAAGGAAAAT
    GTGAACAAAGTCAAGAAAGTGATGGAACAAATGGAAATTCCAATAAAGTGATAGAAAACTTT
    TTGGAGTTTTTTTTCTTGGTAGCAAAAAATTATGAAGCTGAAGAATACAATAAATTCCCTAGA
    GGGCTTCAAAGGCAGATGTAAGCAAACTTGGCCAGGTGCAGTGGCTCATGCTCATAATCCAG
    CACTTTGGAAGGCTGAGGCAGGAGGATTGCTTGAGCCCAGGAGTFJTGAAACCAGCCTGGGCA
    ACATAGAAAAACGCTATCTTTAAAAAAACTTATATAAAATTTAAAAATTATAAAATTTATTTA
    AAAAATCAGCAATTTGAAGACTGGACAGGGAAATTATCAAATTTGAGGAACAGAAAGGAAA
    AAGATGGAAGAAAAATAAACAGAGCCTAAGAGACCTGCGGGACACCATCAAGCAGACTAAT
    ACCCATTGTGGAAATTCCAGAAAGAAAAGAGAGTGAAGGACCAGAGAGATTATTAGGAGAA
    ATAATGGCTGAAAATGTCTCAAATTTGATGAATGACATGAATATGAACATTCAAAAATCTCGA
    CAAACTCCAAGTAGGAAAAACTCAAAGATACTCATACTGAGATTCATCATAATCAAACTGCTG
    AAAGCCAAAGACAAGGAGACAATATCAAAAGCTGCAAGAGAGAAGTGACTCATCACATACA
    AGGGATCTTTCAAAAAGATTATCAGATATCTTGGCTGGGCACGGTGGCTCACACCTGTAATCTT
    AGCACTTTGGGAGGGCGAGGCAGGTGGATCACTTGAGGTCAGGAGTTTGAGACCAGCCTGGC
    CAACATGGCAAAAACCCATCTCCATTAAAAATACAAAGATTGGTGAGGCATGGTGGTGCATG
    CCTGTAATCCCAGCTACTCGGGAGGCTGAAGCAGGAGAATCACTTGAACCTGGGAGGCGGAG
    GGTGCACCAAGCCAAGATCGTGCCACCACTGCACTCCAGCCTGGGTGACAGAGTGTGACCTTG
    TTTCAAAAAAAAAAGAAAAAGAAAAAGAAAAAAAAGATCATCAGCTATCTCATCAGAAACCT
    CAGAGGCCAAAAGGCAGTAGATTGATATATTCAAAGTGCTAAAAGAAAAAAATAAATCTGTC
    AGCTGAGAATCCTGTATCTGTATCTCACTTAACCATTATTTTAAAATAAGGGAAAATGAAGAC
    ATTCCCAGATAAACACAAGCTGAGGGAGTTCATTATCACTAGATCTGCCCTGCAAAGAAAGCC
    AAAGAAAGCCTTTCAGGATGAAATGAAAGGATACTAGACAGTGACTCAAAGCTGAATAAAGA
    GGCCAGGCATAGTGGCTCACACCTGTAATCTCAGCACTTTGGGAGGCTGAGATGGGCGGATC
    ACCTGAGGAGTTTGGAGACCAGCCTGGCTAATATGGTGGAACCCCATCTCTACGAAAAATACA
    AAAATTAGCCAGGTGTGGTGGCACATGCCTGTAATCCCAGCTACTTTGGGAGGCTGAGGCAAG
    AGAATCACCTGAACCCAGGAGGCGGAGGTTGCAGTGAGCCGAGATTGTGCCACCGCACTCCA
    GCCTGGGTGACAGAGTGATACCCTGTCTCAAAAAAAAAAGCCGAATAAACGAATAAAGATCT
    CATCTATGGCCGTACCACCCTGAATGTGTCCAATCTCAGAAGCTAAGCAGAGTTGGGCCTGGT
    TAGTACTTGGAGGGGAGAAATAACGGTCTATGCTAAAGGAAAATTCAGGTGCAATTAAAGTA
    AAATTAATTATATAAAAGAGAATACATTAAAAGCTAGTATTATTGTAACTTTGGTTTGTAATT
    CCACCAAGTGGAATTTGTTCCTGAAATGCTAGAATGGTTCAACATAAAAATCAATAAATGTAA
    TAGACCACATTAACAGAAAAAAAACCCACACGGTCATCTCAATTGATGTCAAAAAAGTATTT
    GACAAAATTCAACACTCTTTTGAAAGAAGAAAAAGCTCAACAAACTAAGAATAGGAGGAAAC
    TACCTCAAATAATAAAATCCATAGGCCAAATCCCCAAACTCACAGCTAGCAACATATTTAATG
    CTAAAGACTGAAAGCTTCCCCTTTAAGATCCGGAATAAGACAAAGATGCCCACTTTCACCACT
    TCTACTCAACATAGTATGGGAAGTTCTAGCCAGAGTAATCAGGTAAGAAAAAAGAAATAAAA
    AGCATCTGAATTGGAAAGGAAAAAGTAAAATTATTTGTTTGCCCAATACATGTACAATGTTTC
    AGGTGAAGGCTCAGAACAGTACAACCTTACCAGCAAGAGTCCTGCTGTCTCTGTGTGAATCCC
    AGCTATTACTCACTAGCTACATGATCTCTCTTGCCCTCCCTGCCTCAATITTCCTCATGTGTAAA
    GTGGGAGAAAAATAATAGTTCATGCTTCAAAGGTTTTTTGTTTGTTTGCTTGCTTTGAGACAGC
    GTCTGGCTCTGTCGCTCAGGCTGAAGTGCAGTGGTGCAATCTTAGGTCACTGCAACCTCAGCC
    TCCTGGGCTTAAGCGATCCTCCCACCTCGGCCTCCCAAAGTGTTGGGATACAGGCGTGAACCA
    CTGTGTCTGACCCAAAGGATTATFTTGAGGAGCAGATGAATTAATGTGTCATAACCTCAAAGCA
    GTTGCAAAGGCGTTTAATAATTAAAATATCACATTTTAAATTAAAATATAAGGCTGGGCGTGG
    TGGCTCATGCCTGTAATCCCAGCACTTTGGGAGGCTGAGGTGGGAGGATCACTTGAGCCCAGG
    AGTTCCACACTAGCCTGGGCACCATTGGGAGACCCTGTCTCTACACACACACGCACACACACA
    CACACACACACAAACTTAAAGTAGCCAGGCGTGGTGCTGCGCGCCTGTTGTCCCAGCTACTCG
    GGAGGCTGAGGCGGGAGAATCACTGGAGCCTGGGAGTTCGAGGCTGCAGTGAGCCGAGATCG
    CACCACTGCACTCCAGCCTGGGCCACAGAGCAAGACGCTGCCTCAAACAAACAAACAAAAAC
    AAAAATTAAAATATTAAGTAATAATTAACGAGTGTTAATATCCACTCGTTGTGGAGACAAGAC
    CTGGACTTAGGAAACAGGCCCAGGGAAGTAGCAGAACAGTAGCGCTAGAGGACGCCTGGGA
    GAATCAGCGCGCGGCGGGAAGAGCCCGGGAAGCTTAGTGGGGAAGCGTCTCTTGATGGGGTG
    AGGAATTCTATAAATTAGTGGAGATGGAAAAAAAAAAAAAAAAGTATTCCCAAAGTGGGAG
    ACAGCACTCAGAAAGACGTGGTGGTAAGAACGAGTATGAGTAACGGGGACAACGAGGACAC
    TGGAGATTGGGGAGTGTTGGGCTGGAAGCTGGTGTGCAGCTGTGGGCAAGCTAGGGAGGACC
    CCGAAACCGCCAATGCGTTTCCCGGACGCAGACGCTGGCAGGACGGGAGGAACCCCGAGACC
    CCGCGCCATCCCTTCAGGAAGAGTTACTTCTCCCCGGCCAAGTTAGTGGGCCTTGGGCCTTCTT
    TCTGTTGGGATCCTCCTCGCGTGTCGCCATCGCTACAAGTGGGCAGCTCTGCGGGGAAAGCTG
    GGACGCTGGGGGCTTCACCAAGGAGGCTGGCGGCCGACCACTGGGAGGTCTGGCGGGGTGAC
    GACCACTGGGAGGTTTGGGCAGGGCCTGACGGGGTGACGCGGTCAGCCCACTGGAGGCCGAC
    ACCCCCCGTCAGCCCAACCCCTGCACGCGCGGCCGCCAACCAAAGACCCGCGGCGCCGGCCT
    GCGAGCCCCCGCCCCGCGTTGCCCAGGAAACCGAGGGTGTGGCTCCGCGTTCTCTGGGCGTCC
    CAGGGACTGGGCGCACAGTGGTCGGCGGGATGAGGCGCCTGGTGACGGACGGGGCGAGGAG
    GGCAGCGATTGGTGAGATTAGGCGATGGGCGGGGAAGCCGCGCGGGGATTAGCGAGTTGCGG
    CGATGGGCGGGGCAGGCGCGCGGGGATTGGCGGGATGCGGCGCGCCGCGCGTTGAGTGGGGT
    CCAGGGAAACGGGGTCAGCTGGGGGTGGCAGTTCCAGGCCGCGAGGCCGGGCTCCTGGGTCG
    GTGGGCTGGTGTCTTGGCGGACGTCCCGCAGCTGCCGCGTGGATCCGAGCCGGGGCACCCGCC
    GTGACTGGGACAGCCCCCAGGGCGCTCTCGGCCCCATCCCGAGTAGCGCGGCCTGGCTGCTGC
    CGCCATCAAGCACGTTCGAGCCAAAAGCTCCTAACGAGTCACTCGTTAGACACGTGTGCGGA
    GCCTGTGTCCCAGGCCAGTGCTGTCCCGTGGAGATAGATTGCAAGCCGCTAGGGAATTTTTTA
    ACTTTCTAGTAGGTGTACGAAAAAAGTAAAACGAAACAAATCAATTGGAGTAAATCCATAAA
    TATATTCAAACTATTATTTCAATTGTATGTGAAAAAATTATTGGGATAYTTCTTTGTACTATTCTT
    AGAAATCCAYTTTGTGTCCAACCCAAACATCACAGTTGGACTCACCACATCTCCTGTACTTCG
    TAGCCCTAGGTGGCTAGTGGCATAAGACACAAAAATCTCAGCTCTCCTGGAGCTTATGGTCTA
    GTTGGAGCAGGCAGACAATACAFTTTTAAAATATACAGTTTGTTAGAAGGTAAATGTTGTAAACA
    ACAATAACAGTTGAAGTACTGGGGAGAGTTGCAGTTGTAAATCAGATGGGCAGGGCACAAGG
    TAACATTTGAGTAAAGATGTAAGAACTTGAAGGAGATGGGCAAGTGAGCTCTATAAGTATAC
    GGGAGAGGGGCAAGCAAGAGTTCAGAGGCCCCTGCTGTGGGGAGGGATCCAAGGTGGAGG
    AGTGGGAACCAGGAGGGGAGAGGACCAGTGGAGCAGATCTCATAGGCAGTTGTAAGGACTTG
    GGGCCTTATTCAATGAAATGAGGACACTTTGGAGAGTTTTGAACAGAGCAGTGACTGATTTAT
    GTTTTTGGTTTTGGTTTAGTTCTATTATTATTTAATAATAGGCTTATTATTTCACAGAAGTTTTAT
    TTAATAAGGCAGACCTCTTGTCTGGAAATGAGACAGGTGCCGGAGAGCTGGATGGAGGCAGA
    TCGGGAATTCCATTTGGGGCAAACTGAACTTGATTGAGACCCTGGTAGTTGTCCAGATGGAAC
    AGGACACCTGAGTCTAGGGflCGGGAAGAACTCCAGATGGGACAAACACTCCTAGCTTTCCTT
    TTCTCTTYFTGGATGACCGCTACAGGGTGAGACATCGGTATCCAGGCACGATAAATTTCCAAG
    TGGACACAATGTCTGGTGTCAACTACAGCTGTTCTCCTTCTTTTCCCAGTATCCTTTGGGTGCA
    GTGAGACACCAGGAGAGCTGCTGCTTTGGGGGATGGACAGGGGCAGCAGGAATGCCTTTGTG
    TTTTCGCAGTGAACCTCCTTGGCCTGGGCGAAGCTGTGTGGACCAAGCAAGTCAGGAGTGTGG
    CCATGTTTTCTGAGCAGGCTGCCCAGAGGGCCCACACTCTACTGTCCCCACCATCAGCCAACA
    ATGCCACCTTTGCCCGGGTGCCAGTGGCAACCTACACCAACTCCTCACAACCCTTCCGGCTAG
    GAGAGCGCAGCTTTAGCCGGCAGTATGCCCACATTTATGCCACCCGCCTCATCCAAATGAGAC
    CCTTCCTGGAGAACCGGGCCCAGCAGCACTGGGGTAAGTGAGAGTTTGGGAAGGTGCTTCCC
    CCACAGCATCCCTGAACTTAGAAGTGTTCTGCAAGAGAATGGGAACAGTTTATCTAATTGATC
    CCACTTCCTGTTACCTTGGGAAAATTAACCTCTTLTTTCCCTCAGTTTCTTCTTAAGATAGTAAC
    AAGGATTAAATTAAGTAATTTGTGGGTFTTGGAGTTAGTTTTAGTTCAGAGGCTGGTTGGAGAT
    GAGGACTTAGTTCTGGCGGTGATGGCGATTACTTCACTGGCAGAGGAAAATGGTTTTCCTATC
    TTCAGTGCAGATTATTCAGGTATTTGCCTGTGCTGTAGCCAGAGAGCCCCTCAGTGTGGCAAG
    CCTGGCGCCAGGCACCAGGAGCCAAGACTGGTGAGGATGCACTCTCTGGTCTCGAGGGGACC
    CCCTCTGTTCACTCATGTCTGTTTGCCTGTCCTCCTGGCCCCCATATTTGCTGGCCATGAATTTT
    CCTGTCCCTTGGGCCCTCTGTCTTTCCTAATAAAGTGGCCTGCCCAACACAACCCTTGTTCTTT
    GCCCCCATTTCTTCCCTGGTGATCTCTCCTGCAGTTGGATTACTCTTGGTGGTGAAGCAGGGAC
    CCCCATCTCCCCCTTTGAGTTTATTTGAGTTTTAGGTGCTGCTGCATTCCCCCATTCCTACCACT
    TACATAAGAGTGGCTTTCCAGGTAATTTTCAAATCCATCTCCTATTATATTTTTAAACTGAGGA
    TTTAGTAGGTGAGACCAGGTCTTACTCATTTTACTGTCCTTGGCACCAGGCAAAATGGATCTC
    AGCCCTAGTTGCACATTGGAATCCCCTGGGGAGCTTTGAGAAGCCCATCTCATCCCATGCCAA
    GCCAAGATCAATTCTCGTTATAGGCAGGCGGAGAACCCTGGGCCTAGAAATCTAGCTAGAAC
    CTCAAATTCATTAGGGATATGTATTAGTCCATTTTCACATTGCTATAAAAAACTACCTGAGATA
    GGGTAATTTATAAAGAAAAGAGGTTTAATTGACTCACAGTTCCTCATGGCTGGGGAGGCCTCA
    GGAAACTTAACAATCATGGCAGAAGGTGAAGGGAAAGCAAGGCTCTTTTACATGATAGCAGG
    AGAGAGAGAGCAAGGGGAACTGCCAACCATTTTTAAACCATCAGATCGCATGATGGCTTGAT
    CTCACTCACCATCACAAGAACAGCATGGGGGAAATCCACCCCCACAATCCAGTCACCTCCCAC
    CAGGTCCCTCCGTCAACACCGTGTGGATTATAATTCCAGATGAGATGTGGGTGGGGACACAGA
    GCCAAATCATATCAGGATGTTTTTCTGTTTTTGTTTACCTGAGACAAAGTGCTGTTCACCTCTCCT
    CTCCCACATAATCAGGGGCTCCCTCCTGCGGCTCCGGTAGCTTTTCCTCACTTTCCTTTCAGCC
    CTCGGGACACCTTCCTTGGCTCCTTTCAGAGCTCAGTTACTACTTGGGCCCAATGTCAATGCCA
    CCTTCTAGATTCTTTCCGGCAGCACCTCCTCTGGTCGCACATLTTCTCTTCCAGTTATTGGAGCT
    GTCAAAAAAGCTCCCCAGTGATGGACGATAGCGATTTCACTGTGCTCACAGACTGGTCAGGA
    AACCAAACAGCTGCCACAGTGAATGTGTTGATAGCAGCGGGGCAGCAGTAGCACTCGCTCAC
    AGGCCTGGTGGTTGGTGCTGGCCCCCACCCTGAATACCTACATGTGGCTTCTCCATGTGGCCT
    GTGCATCCTCACTGAAGCTCAGCCTGTCTCTCCAAATTGGTCTTTCCACTCACCTGTTCCCCAA
    ACCTGCCCAGACCTTCCTGCTGTAGGCTTTTCCCTTCACTTGGCACACTCTTTCCCTTGTCTTCC
    CATGGCCCCATCTAAGCCCCACTGTCAGCTGAAGTGTTATATTCTTTGAGGGGCCACCTGAAG
    CCACCTTGCAATGAGGGCCTCCGTTTTCTACCTCAGCTCACCATTTGTTCACAGCACTTGTCAC
    TGTGGCGAGTTACTTGTCTATGGCCTGTTGTCGTTCTCCTGCCTAGACCCAGTGGGCTGAGTGG
    GGGCAAGTGTTGGCTTTTATGTCCAGTTTTGATCTTGGTGCCAGCACATTGCCTGGGTGGAAG
    CATGTCCTACTATCGGTTACAGGGATGTCATTCTGCCCAGTGCTCAGGGGCATACACTTGGAT
    CCCAGTTGTGTGCCCTTGGACACATTGCTTAACCTCTCTGTGCATCAGTTGGGTGATAATATCT
    ACTCCTGGCACATTTTCAGCGTTGGCTGAGTTACATTACAGTGCTTAGGCCACCTGGGGGAGA
    GTAAGAGTGGGATACGTGAGGATGTGGAGTCTGTTGCATTTCTGTCTGCTGCTGGCATCCTTCT
    TGTCTTGTTTTGAGTTGCTCGCCTCTGTCTGCTCCCTAGGGCGTAGATTTGAGGAATATTCCTG
    GTTCTTCCCAGGCAGCAGGGGCTCAGGCTGTGCTGGAGTCAGCTAGGCTAAGGGGCTGGTCTG
    GCATCCGCGTTGTCCTGTCACCTCCTTGGTGTTTTTCTCCAGGCCTGGATCTGTGCTGTGTGGGC
    ACCTGTATTCCTCCCTCCTGCCCTCACTGATTCTCCATACCTTTCTTCGAGAGTGCCAAGCC
    CCTCCCATGTGTTCTTGTTCATACCTAGGATCCCGGGAAGGGGCTGGGGAAGACGGTGCCCAG
    GTGCCCTGGGTAAACAAAGCCACCTGACTCCACGGGAATGGAATGGGTGGAGGGGATCTGAG
    GTCTGCATTTTGAGTATCTCTGGTCTCAGAGGATGAAGCATFTTGGTGGGGGTTGGGGGTGGGG
    GGTAGGGTGGAAGAATCTAAAGTCTTAAAAGAAAATGGCAGTTATTTGTGGGACAGGGCTGT
    GTTGAGACTTGGCATGCTTCTTTTTAAGAGTCAGTGTTGTAATTTAGGTATAAGTGAAGCAGT
    ACTTTGTATTAGTTTCCTGTAGGCGCTGTAACAAAGCACCACAAACTGGTTGACTTAAAACAA
    CAGAGATGGCCGGGCACGGTGGCTCACGACTGTAATCCCAGCACTTTGGGAGGCCGAGGCGG
    GCAGATCACAAGGTCAAGAGATTGAGACCATCCTGGCTAACACGGTGAAACCCTGTCTCTACT
    AAAAATACAAAAAAAAAAAAATTAGCTGGGCGTGGTGGCACACGCCTGTAGTCCCAGCTACT
    CGGGAGGCTGAGGCAGGAGAATGGCGTGAACCCGGGAGGCGGAGCTTGCAGTGAGCTGAGA
    TCGCGCCACTGCACTCCAGCCTGGATGACAGCGAGACTCCGCCTCAAAACAAAAACAAAAAC
    AGAAACAACAATAACAGAAAAACACAGACATTTACTCTCTGGCAGTTCTGGAGGCCAGAAGT
    TGAAATCCAGATGTCAGCAGGATFTGGCTCCTTCTGAAGGCCCGAGGGGAGGGTCCTTCCTGGC
    CTCCTCCCTGGTGTTCCTGGGCTTGTGGCCGCATCACTCCGCTCTGCCCGTCTTCACACTCCCT
    CTTGTCTGTGTGTCTGTCTCTCTGTTCTCATGAGGACACTTGGCATCCAGGGCCCAACCACACC
    CAGAGTCCCTGGTCTCCTGTGGCTGACTCACTTTTTACTGTCACCGTGAAGTCCAGGGGGTCCT
    TGTACTTGATGTTCTCTCCTGGCAAGGCCAGGGCCCTGTGATTGGCCTCTCATGGAGTGCTGG
    GCAGGGCCTCCATGGCCTCTGTCGGGCGGGGGGGCTACTTCATCTCTGAGTCTGTACCCCTCG
    TGTCCCAGGCAGTGGAGTGGGAGTGAAGAAGCTGTGTGAACTGCAGCCTGAGGAGAAGTGCT
    GTGTGGTGGGCACTCTGTTCAAGGCCATGCCGCTGCAGCCCTCCATCCTGCGGGAGGTCAGCG
    AGGAGGTGAGGCAGGGTGCTACACAGTGGGGCCGCCAGGCAGACCTGGCCTCCCACTAGAAC
    ACCTCCCTGGAGGTGGGGTTGTGGGGAAGCAGGTTCAGAGACAATGGACTCCAGAGGGGTGG
    GGGCTGCGGTGCCAGCTCACTAACACCAGAGCTTTGGTGGGCTCTGGCCCCAAGATTATACCT
    CCTGTCTCTGCATTCCAGCACAACCTGCTCCCCCAGCCTCCTCGGAGTAAATACATACACCCA
    GATGACGAGCTGGTCTTGGAAGATGAACTGCAGCGTATCAAACTAAAAGGCACCATTGACGT
    GTCAAAGCTGGTTACGGGTAGGGAGCCCAATGAGAGGATGTGGGTGATGCAGGTGAAGAGCC
    CAGCGGTGGTGTGTTAGGGATGGTGTGAGTGGGGAGCCTGGGGGGAGTGGGGGGGTGTGGCC
    TGGGCACACGTGTGTTCTTGAGGAGGTAGGTGAGGCTCCAGGCGGTCGGAGGCCATCAGATT
    GGGTGAGACCTGGCTGGGAGATGGGTCTCCCCACCTCCATCCAAGGGCAGTGACTCCAGGAA
    GCAGGCATGCATCCTGGAGTCCTAGGTGAGAATTCACCAATGTGGTTGTGGAGAACTGGCTTG
    TTTTGCCCGTTGGGGTGACTGGAAGGAGTGGTAGCACCTGGGGCTCCCTGCTCAGGCCTGATG
    CCACTGCTCCCCAGGGACTGTCCTGGCTGTGTTTGGCTCCGTGAGAGACGACGGGAAGTTTCT
    GGTGGAGGACTATTGCTTTGCTGACCTTGCTCCCCAGAAGCCCGCACCCCCACTTGACACAGA
    TAGGTGAGCAGCAGTTCTCGGGAGCTGGAACCAGCTCATGGTCAGTGGAATCTTTGAGTTGCA
    CCTAGGAGGGGCTGCCTCCCTTCTCGGCACCCTGGAGGACCCCACCTTCTCCCGCAGGTTFPGT
    GCTACTGGTGTCCGGCCTGGGCCTGGGTGGCGGTGGAGGCGAGAGCCTGCTGGGCACCCAGC
    TGCTGGTGGATGTGGTGACGGGGCAGCTTGGGGACGAAGGGGAGCAGTGCAGCGCCGCCCAC
    GTCTCCCGGGTTATCCTCGCTGGCAACCTCCTCAGCCACAGCACCCAGAGCAGGGATTCTATC
    AATAAGGTATGGAGCCCACCTGGCTGCATTCAGCCCCAGCCCAGGAGCCTGCAAGCCTGTAA
    GACCCTCCTTCCCCAGGGCGAGTAGGGTACCCTGTGAGGTCTCGCAGGTCGGTGGGAAGCGCC
    CTGCAGTGACTCTGGGGCCTCCTGCAATGGGGCTCCTCATGCCCAGGCCCTCGCTGAGGATGG
    TGGGAGGCTTGAAGGGAGTGAGGGTCTATGGGACAACAACTGCATCTTCCAGCTGGTGGGGC
    TCTACTCTCCTCTGAGCCTGGGACTCGCCTGGGCCTGATGGCCTTCTGGGCTTCTATTCCAGGC
    CAAATACCTCACCAAGAAAACCCAGGCAGCCAGCGTGGAGGCTGTTAAGATGCTGGATGAGA
    TCCTCCTGCAGCTGAGCGTGAGCGAGCTGGGGGCTGGAGGGGTGATGGGGATTGCAGTCTTC
    AAAGCTGCCACTGGGCAACAGAAGGCAGGCAGGAGGGCAGGGGGAGTGGCCGGAGTTGGTG
    TAGGGGGCTCCTTCGGGGCCCTGTGAGCTCTCCCTGCCCTGTGCCTTCCAGGCCTCAGTGCCCG
    TGGACGTGATGCCAGGCGAGTTTGATCCCACCAATTACACGCTCCCCCAGCAGCCCCTCCACC
    CCTGCATGTTCCCGCTGGCCACTGCCTACTCCACGCTCCAGCTGGTCACCAACCCCTACCAGG
    CCACCATTGATGGAGTCAGGTAGCTGGCACAGCCACACTTCAGTCTGACCCAGCCTTTTGCCT
    CAGGAGGCACAAAGAAGGGAGGGGAGGGAGGGCCCAGGAAGGTGGCAGGGCTGCAGAGGC
    CCACCTAGCATCTGTTTCCTTCTCTCTGGGGCATCCCCACAAGAGCGCCAGATGAGCTCTGGGC
    TGACCACTATGGGTGGCACCCAAAGCCAAGAGTCAGCTGAGCTTTGCCTTGCAGATTTTTGGG
    GACATCAGGACAGAACGTGAGTGACATTTCCGATACAGCAGCATGGAGGATCACTTGGAGA
    TCCTGGAGTGGACCCTGCGGGTCCGTCACATCAGCCCCACAGCCCCGGACACTCTAGGTAACA
    GGCTCAGCCATACAGGGTGGGAGCAGAGGGCCAGGAGGCCTGGCAGGACCCTGAAGTGCAC
    AGGGTCCCCCTGTGGGTTTGCACTTGCCAGCATTGCTGAGAACTGTCTGAGGAGAAGTTCAGA
    GGCTTGGCACCTGCTCTGGAAGCTACTCTGGAATCTCTAAGGCCAATGGCTGCCCACC
    CCAACGGGCAGCAACAGCAGGGCCAAGGTCTTGTGACAATGTCTGGAGGTGCCCCTATGTC
    ACACTGGGGGTCTCCTACTGGCCTGCAATGGGAGGAGGGGCTGCAGCCCCACATCCTGTGCA
    GAGTGCTAGTGCTGAGGCGGAACCCTCCTCAGAGCTGCCCCTTCTCCTCTAGGTTGTTACCCCT
    TCTACAAAACTGACCCGTTCATCTTCCCAGAGTGCCCGCATGTCTACTTTTGTGGCAACACCCC
    CAGCTTTGGCTCCAAAATCATCCGAGGTAATTTTTGTCTTCTGGGGGCCCAGGCTGATTTGCTG
    ATTTGCTCTCACCTGGGGACAAGGTTCACAGAGAAGAAAACCTGCATTGTGGAGTCCCCCTGG
    CCCTTGTGGGATGGACAGCTGAGGTCTTCTGCACAGCTGCCATTTCACTGTGGGAGCCAAGCT
    GCCTCGCCAGCTGGGCAGGGACTGGAACGGCTCCCAGCCTGTGTGCCTCTCAAGGCTAATCTC
    TGGTCTCCTATTGTCACTGCCCCACTGTGTGCCAATGGGGACTCCTGTTTATTTCTGGCAGCTT
    CTCTTTGAGGCAGGACTTACTTGGAACCTACAGTGGGTCCTATGTGACTTCTTTGCAGGTCCTG
    AGGACCAGACAGTGCTGTTGGTGACTGTCCCTGACTTCAGTGCCACGCAGACCGCCTGCCTTG
    TGAACCTGCGCAGCCTGGCCTGCCAGCCCATCAGCTTCTCGGGCTTCGGGGCAGAGGACGATG
    ACCTGGGAGGCCTGGGGCTGGGCCCCTGACTCAAAAAAGTGGTTTTGACCAGAGAGGCCCAG
    ATGGAGGCTGTTCATTCCCTGCAGTGTCGGCATTGTAAATAAAGCCTGGCACTTGCTGATGCG
    AGCCTTGAGCCCTGGGCACTCTGGCTATGGGACTCCTGCAGGGGTGCCCACAGTGACCATAGC
    CCATGCACCCACCAGCCGGTCTCCCT
  • The POLD2 gene is 19,000 base pairs in length and contains ten exons (see Table 1 below for location of exons). As will be discussed in further detail below, the POLD2 gene is situated in genomic clone AC006454 at nucleotides 119,001-138,000. [0051]
  • The polynucleotides of the invention have at least a 95% identity and may have a 96%, 97%, 98% or 99% identity to the polynucleotides depicted in SEQ ID NOS:5, 6, 7 or 8 as well as the polynucleotides in reverse sense orientation, or the polynucleotide sequences encoding the SNARE YKT6, AEBP1, human glucokinase or POLD2 polypeptides depicted in SEQ ID NOS:1, 2, 3, or 4 respectively. [0052]
  • A polynucleotide having 95% “identity” to a reference nucleotide sequence of the present invention, is identical to the reference sequence except that the polynucleotide sequence may include on average up to five point mutations per each 100 nucleotides of the reference nucleotide sequence encoding the polypeptide. In other words, to obtain a polynucleotide having a nucleotide sequence at least 95% identical to a reference nucleotide sequence, up to 5% of the nucleotides in the reference sequence may be deleted or substituted with another nucleotide, or a number of nucleotides up to 5% of the total nucleotides in the reference sequence may be inserted into the reference sequence. The query sequence may be an entire sequence, the ORF (open reading frame), or any fragment specified as described herein. [0053]
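  • The arithmetic behind the identity threshold above can be sketched as follows. This is a minimal illustration, not part of the disclosed method; the function names `max_alterations` and `meets_identity` are hypothetical, and the sketch assumes alterations are counted against the length of the reference sequence, as the paragraph describes.

```python
def max_alterations(reference_length: int, min_identity_percent: int = 95) -> int:
    """Largest number of altered positions allowed at the given identity level."""
    if reference_length <= 0:
        raise ValueError("reference_length must be positive")
    if not 0 <= min_identity_percent <= 100:
        raise ValueError("min_identity_percent must be between 0 and 100")
    # Integer arithmetic avoids floating-point rounding at the threshold.
    return reference_length * (100 - min_identity_percent) // 100


def meets_identity(reference_length: int, alterations: int,
                   min_identity_percent: int = 95) -> bool:
    """True if the number of alterations stays within the allowed budget."""
    return alterations <= max_alterations(reference_length, min_identity_percent)


if __name__ == "__main__":
    # Up to five point differences per 100 reference nucleotides at 95% identity.
    print(max_alterations(100))        # 5
    print(max_alterations(100, 99))    # 1
    print(meets_identity(100, 6))      # False
```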
  • As a practical matter, whether any particular nucleic acid molecule or polypeptide is at least 95%, 96%, 97%, 98% or 99% identical to a nucleotide sequence of the present invention can be determined conventionally using known computer programs. A preferred method for determining the best overall match between a query sequence (a sequence of the present invention) and a subject sequence, also referred to as a global sequence alignment, can be determined using the FASTDB computer program based on the algorithm of Brutlag et al. (Comp. App. Biosci. (1990) 6:237-245). In a sequence alignment the query and subject sequences are both DNA sequences. An RNA sequence can be compared by converting U's to T's. The result of said global sequence alignment is in percent identity. Preferred parameters used in a FASTDB alignment of DNA sequences to calculate percent identity are: Matrix=Unitary, k-tuple=4, Mismatch Penalty=1, Joining Penalty=30, Randomization Group Length=0, Cutoff Score=1, Gap Penalty=5, Gap Size Penalty=0.05, Window Size=500 or the length of the subject nucleotide sequence, whichever is shorter. [0054]
  • If the subject sequence is shorter than the query sequence because of 5′ or 3′ deletions, not because of internal deletions, a manual correction must be made to the results. This is because the FASTDB program does not account for 5′ and 3′ truncations of the subject sequence when calculating percent identity. For subject sequences truncated at the 5′ or 3′ ends, relative to the query sequence, the percent identity is corrected by calculating the number of bases of the query sequence that are 5′ and 3′ of the subject sequence, which are not matched/aligned, as a percent of the total bases of the query sequence. Whether a nucleotide is matched/aligned is determined by results of the FASTDB sequence alignment. This percentage is then subtracted from the percent identity, calculated by the above FASTDB program using the specified parameters, to arrive at a final percent identity score. This corrected score is what is used for the purposes of the present invention. Only bases outside the 5′ and 3′ bases of the subject sequence, as displayed by the FASTDB alignment, which are not matched/aligned with the query sequence are calculated for the purposes of manually adjusting the percent identity score. [0055]
  • For example, a 95 base subject sequence is aligned to a 100 base query sequence to determine percent identity. The deletions occur at the 5′ end of the subject sequence and therefore the FASTDB alignment does not show a match/alignment of the first 5 bases at the 5′ end. The 5 unpaired bases represent 5% of the sequence (number of bases at the 5′ and 3′ ends not matched/total number of bases in the query sequence), so 5% is subtracted from the percent identity score calculated by the FASTDB program. If the remaining 95 bases were perfectly matched the final percent identity would be 95%. In another example, a 95 base subject sequence is compared with a 100 base query sequence. This time the deletions are internal deletions so that there are no bases on the 5′ or 3′ of the subject sequence which are not matched/aligned with the query. In this case the percent identity calculated by FASTDB is not manually corrected. Once again, only bases 5′ and 3′ of the subject sequence which are not matched/aligned with the query sequence are manually corrected for. No other manual corrections are made for purposes of the present invention. [0056]
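  • The manual correction for terminal truncations described above reduces to a single subtraction, sketched below. The function name and its parameters are hypothetical; the sketch assumes the uncorrected percent identity and the counts of unaligned query bases at each end have already been read from an alignment report, and it does not call or emulate the FASTDB program itself.

```python
def corrected_percent_identity(reported_identity: float, query_length: int,
                               unaligned_5prime: int = 0,
                               unaligned_3prime: int = 0) -> float:
    """Subtract the terminal-overhang percentage from a reported percent identity.

    reported_identity  percent identity reported by the alignment program
    query_length       total number of bases in the query sequence
    unaligned_5prime   query bases lying 5' of the subject and not aligned
    unaligned_3prime   query bases lying 3' of the subject and not aligned
    """
    if query_length <= 0:
        raise ValueError("query_length must be positive")
    if unaligned_5prime < 0 or unaligned_3prime < 0:
        raise ValueError("overhang counts must not be negative")
    overhang_percent = 100.0 * (unaligned_5prime + unaligned_3prime) / query_length
    return reported_identity - overhang_percent


if __name__ == "__main__":
    # Five unaligned 5' bases on a 100-base query cost five percentage points.
    print(corrected_percent_identity(100.0, 100, unaligned_5prime=5))  # 95.0
    # Internal deletions leave no terminal overhang, so nothing is subtracted.
    print(corrected_percent_identity(95.0, 100))                       # 95.0
```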
  • A polypeptide that has an amino acid sequence at least, for example, 95% “identical” to a query amino acid sequence is identical to the query sequence except that the subject polypeptide sequence may include on average, up to five amino acid alterations per each 100 amino acids of the query amino acid sequence. In other words, to obtain a polypeptide having an amino acid sequence at least 95% identical to a query amino acid sequence, up to 5% of the amino acid residues in the subject sequence may be inserted, deleted, (indels) or substituted with another amino acid. These alterations of the reference sequence may occur at the amino or carboxy terminal positions of the reference amino acid sequence or anywhere between those terminal positions, interspersed either individually among residues in the referenced sequence or in one or more contiguous groups within the reference sequence. [0057]
  • A preferred method for determining the best overall match between a query sequence (a sequence of the present invention) and a subject sequence, also referred to as a global sequence alignment, can be determined using the FASTDB computer program based on the algorithm of Brutlag et al. (Comp. App. Biosci. (1990) 6:237-245). In a sequence alignment, the query and subject sequence are either both nucleotide sequences or both amino acid sequences. The result of said global sequence alignment is in percent identity. Preferred parameters used in a FASTDB amino acid alignment are: Matrix=PAM 0, k-tuple=2, Mismatch Penalty=1, Joining Penalty=20, Randomization Group Length=0, Cutoff Score=1, Window Size=sequence length, Gap Penalty=5, Gap Size Penalty=0.05, Window Size=500 or the length of the subject amino acid sequence, whichever is shorter. [0058]
  • If the subject sequence is shorter than the query sequence due to N- or C-terminal deletions, not because of internal deletions, a manual correction must be made to the results. This is because the FASTDB program does not account for N- and C-terminal truncations of the subject sequence when calculating global percent identity. For subject sequences truncated at the N- and C-termini, relative to the query sequence, the percent identity is corrected by calculating the number of residues of the query sequence that are N- and C-terminal of the subject sequence, which are not matched/aligned with a corresponding subject residue, as a percent of the total residues of the query sequence. Whether a residue is matched/aligned is determined by results of the FASTDB sequence alignment. This percentage is then subtracted from the percent identity, calculated by the above FASTDB program using the specified parameters, to arrive at a final percent identity score. This final percent identity score is what is used for the purposes of the present invention. Only residues to the N- and C-termini of the subject sequence, which are not matched/aligned with the query sequence, are considered for the purposes of manually adjusting the percent identity score; that is, only query residue positions outside the farthest N- and C-terminal residues of the subject sequence are considered. [0059]
  • The invention also encompasses polynucleotides that hybridize to the polynucleotides depicted in SEQ ID NOS: 5, 6, 7 or 8. A polynucleotide “hybridizes” to another polynucleotide, when a single-stranded form of the polynucleotide can anneal to the other polynucleotide under the appropriate conditions of temperature and solution ionic strength (see Sambrook et al., supra). The conditions of temperature and ionic strength determine the “stringency” of the hybridization. For preliminary screening for homologous nucleic acids, low stringency hybridization conditions, corresponding to a temperature of 42° C., can be used (e.g., 5×SSC, 0.1% SDS, 0.25% milk, and no formamide; or 40% formamide, 5×SSC, 0.5% SDS). Moderate stringency hybridization conditions correspond to a higher temperature of 55° C., e.g., 40% formamide, with 5× or 6×SSC. High stringency hybridization conditions correspond to the highest temperature of 65° C., e.g., 50% formamide, 5× or 6×SSC. Hybridization requires that the two nucleic acids contain complementary sequences, although depending on the stringency of the hybridization, mismatches between bases are possible. The appropriate stringency for hybridizing nucleic acids depends on the length of the nucleic acids and the degree of complementation, variables well known in the art. The greater the degree of similarity or homology between two nucleotide sequences, the greater the value of Tm for hybrids of nucleic acids having those sequences. The relative stability (corresponding to higher Tm) of nucleic acid hybridizations decreases in the following order: RNA:RNA, DNA:RNA, DNA:DNA. [0060]
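  • For reference, the three stringency levels and the hybrid stability ordering given above can be collected in a small lookup structure. This is only an organizational sketch; the names `STRINGENCY_CONDITIONS`, `conditions_for` and `more_stable` are hypothetical, and the values are copied from the preceding paragraph rather than from any laboratory protocol.

```python
# Temperatures and example buffers as stated in the text above.
STRINGENCY_CONDITIONS = {
    "low": {
        "temperature_c": 42,
        "examples": [
            "5x SSC, 0.1% SDS, 0.25% milk, no formamide",
            "40% formamide, 5x SSC, 0.5% SDS",
        ],
    },
    "moderate": {
        "temperature_c": 55,
        "examples": ["40% formamide, 5x or 6x SSC"],
    },
    "high": {
        "temperature_c": 65,
        "examples": ["50% formamide, 5x or 6x SSC"],
    },
}

# Most stable hybrid first, as listed in the text.
HYBRID_STABILITY_ORDER = ["RNA:RNA", "DNA:RNA", "DNA:DNA"]


def conditions_for(level: str) -> dict:
    """Return the temperature and example buffers for a stringency level."""
    try:
        return STRINGENCY_CONDITIONS[level.lower()]
    except KeyError:
        raise ValueError(f"unknown stringency level: {level!r}") from None


def more_stable(hybrid_a: str, hybrid_b: str) -> bool:
    """True if hybrid_a ranks above hybrid_b in the stated stability order."""
    return (HYBRID_STABILITY_ORDER.index(hybrid_a)
            < HYBRID_STABILITY_ORDER.index(hybrid_b))


if __name__ == "__main__":
    print(conditions_for("high")["temperature_c"])  # 65
    print(more_stable("RNA:RNA", "DNA:DNA"))        # True
```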
  • Polynucleotide and Polypeptide Variants [0061]
  • The invention is directed to both polynucleotide and polypeptide variants. A “variant” refers to a polynucleotide or polypeptide differing from the polynucleotide or polypeptide of the present invention, but retaining essential properties thereof. Generally, variants are overall closely similar and in many regions, identical to the polynucleotide or polypeptide of the present invention. [0062]
  • The variants may contain alterations in the coding regions, non-coding regions, or both. Especially preferred are polynucleotide variants containing alterations which produce silent substitutions, additions, or deletions, but do not alter the properties or activities of the encoded polypeptide. Nucleotide variants produced by silent substitutions due to the degeneracy of the genetic code are preferred. Moreover, variants in which 5-10, 1-5, or 1-2 amino acids are substituted, deleted, or added in any combination are also preferred. [0063]
  • The invention also encompasses allelic variants of said polynucleotides. An allelic variant denotes any of two or more alternative forms of a gene occupying the same chromosomal locus. Allelic variation arises naturally through mutation, and may result in polymorphism within populations. Gene mutations can be silent (no change in the encoded polypeptide) or may encode polypeptides having altered amino acid sequences. An allelic variant of a polypeptide is a polypeptide encoded by an allelic variant of a gene. [0064]
  • The amino acid sequences of the variant polypeptides may differ from the amino acid sequences depicted in SEQ ID NOS:1, 2, 3 or 4 by an insertion or deletion of one or more amino acid residues and/or the substitution of one or more amino acid residues by different amino acid residues. Preferably, amino acid changes are of a minor nature, that is conservative amino acid substitutions that do not significantly affect the folding and/or activity of the protein; small deletions, typically of one to about 30 amino acids; small amino- or carboxyl-terminal extensions, such as an amino-terminal methionine residue; a small linker peptide of up to about 20-25 residues; or a small extension that facilitates purification by changing net charge or another function, such as a poly-histidine tract, an antigenic epitope or a binding domain. [0065]
  • Examples of conservative substitutions are within the group of basic amino acids (arginine, lysine and histidine), acidic amino acids (glutamic acid and aspartic acid), polar amino acids (glutamine and asparagine), hydrophobic amino acids (leucine, isoleucine and valine), aromatic amino acids (phenylalanine, tryptophan and tyrosine), and small amino acids (glycine, alanine, serine, threonine and methionine). Amino acid substitutions which do not generally alter the specific activity are known in the art and are described, for example, by H. Neurath and R. L. Hill, 1979, In, The Proteins, Academic Press, New York. The most commonly occurring exchanges are Ala/Ser, Val/Ile, Asp/Glu, Thr/Ser, Ala/Gly, Ala/Thr, Ser/Asn, Ala/Val, Ser/Gly, Tyr/Phe, Ala/Pro, Lys/Arg, Asp/Asn, Leu/Ile, Leu/Val, as well as these in reverse. [0066]
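  • The substitution groups and exchange pairs listed above lend themselves to a simple membership test, sketched here with one-letter amino acid codes. The names `CONSERVATIVE_GROUPS`, `COMMON_EXCHANGES`, `is_conservative` and `is_common_exchange` are hypothetical, and the sketch treats the two lists exactly as enumerated in the text without adding further residues.

```python
# Groups as enumerated in the text, in one-letter code.
CONSERVATIVE_GROUPS = {
    "basic": {"R", "K", "H"},
    "acidic": {"E", "D"},
    "polar": {"Q", "N"},
    "hydrophobic": {"L", "I", "V"},
    "aromatic": {"F", "W", "Y"},
    "small": {"G", "A", "S", "T", "M"},
}

# Commonly occurring exchanges from the text; frozensets make each pair
# order-independent, which covers "these in reverse".
COMMON_EXCHANGES = {
    frozenset(pair) for pair in (
        "AS", "VI", "DE", "TS", "AG", "AT", "SN", "AV",
        "SG", "YF", "AP", "KR", "DN", "LI", "LV",
    )
}


def is_conservative(original: str, replacement: str) -> bool:
    """True if both residues differ and fall within one listed group."""
    a, b = original.upper(), replacement.upper()
    if a == b:
        return False  # identical residues are not a substitution
    return any(a in group and b in group
               for group in CONSERVATIVE_GROUPS.values())


def is_common_exchange(original: str, replacement: str) -> bool:
    """True if the pair appears in the list of commonly occurring exchanges."""
    return frozenset((original.upper(), replacement.upper())) in COMMON_EXCHANGES


if __name__ == "__main__":
    print(is_conservative("K", "R"))     # True: both basic
    print(is_conservative("D", "K"))     # False: acidic versus basic
    print(is_common_exchange("N", "S"))  # True: Ser/Asn in reverse
```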
  • Noncoding Regions [0067]
  • The invention is further directed to polynucleotide fragments containing or hybridizing to noncoding regions of the SNARE YKT6, AEBP1, human glucokinase and POLD2 genes. These include but are not limited to an intron, a 5′ non-coding region, a 3′ non-coding region and splice junctions (see Tables 1-4), as well as transcription factor binding sites (see Table 5). The polynucleotide fragments may be a short polynucleotide fragment which is from about 8 nucleotides to about 40 nucleotides in length. Such shorter fragments may be useful for diagnostic purposes. Such short polynucleotide fragments are also preferred with respect to polynucleotides containing or hybridizing to polynucleotides containing splice junctions. Alternatively, larger fragments, e.g., of about 50, 150, 500, 600 or about 2000 nucleotides in length may be used. [0068]
    TABLE 1
    Exon/Intron Regions of Polymerase, DNA directed, 50 kD regulatory subunit (POLD2) Genomic DNA
    EXON    LOCATION (nucleotide no.)    (Amino acid no.)
    1.      11546-11764                  1-73
    2.      15534-15656                  74-114
    3.      15857-15979                  115-155
    4.      16351-16464                  156-193
    5.      16582-16782                  194-260
    6.      17089-17169                  261-287
    7.      17327-17484                  288-339
    8.      17704-17829                  340-381
    9.      18199-18303                  382-416
    10.     18653-18811                  417-469
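  • Exon and intron lengths follow directly from the coordinates in Table 1. The sketch below copies those coordinates and treats them as 1-based inclusive positions, which is an assumption; the names `POLD2_EXONS`, `exon_lengths` and `intron_spans` are hypothetical. As a sanity check on that assumption, exon 1 spans 219 nucleotides, which is 73 codons and agrees with amino acids 1-73.

```python
# (start, end) nucleotide positions copied from Table 1,
# assumed to be 1-based and inclusive.
POLD2_EXONS = [
    (11546, 11764), (15534, 15656), (15857, 15979), (16351, 16464),
    (16582, 16782), (17089, 17169), (17327, 17484), (17704, 17829),
    (18199, 18303), (18653, 18811),
]


def exon_lengths(exons):
    """Length in nucleotides of each exon, in the order given."""
    return [end - start + 1 for start, end in exons]


def intron_spans(exons):
    """(start, end) of each intron lying between consecutive exons."""
    return [(prev_end + 1, next_start - 1)
            for (_, prev_end), (next_start, _) in zip(exons, exons[1:])]


if __name__ == "__main__":
    lengths = exon_lengths(POLD2_EXONS)
    print(lengths[0], lengths[0] // 3)   # 219 73
    print(intron_spans(POLD2_EXONS)[0])  # (11765, 15533)
```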
  • [0069]
    TABLE 2
    AEBP1 (adipocyte enhancer binding protein 1), vascular smooth muscle-type. Reverse strand coding.
    EXON    LOCATION (nucleotide no.)    (Amino acid no.)
    21.     1301-1966                    1158-937
    20.     2209-2304                    936-905
    19.     2426-2569                    904-857
    18.     2651-3001                    856-740
    17.     3238-3417                    739-680
    16.     3509-3706                    679-614
    15.     3930-4052                    613-573
    14.     4320-4406                    572-544
    13.     4503-4646                    543-496
    12.     4750-4833                    495-468
    11.     5212-5352                    467-421
    10.     5435-5545                    420-384
    9.      6219-6272                    383-366
    8.      6376-6453                    365-340
    7.      6584-6661                    339-314
    6.      7476-7553                    313-288
    5.      7629-7753                    287-247
    4.      7860-7931                    246-223
    3.      8050-8121                    222-199
    2.      8673-9014                    198-85
    1.      10642-10893                  84-1
  • [0070]
    TABLE 3
    Glucokinase
    EXON    LOCATION (nucleotide no.)    (Amino acid no.)
    1.      20485-20523                  1-13
    2.      25133-25297                  14-68
    3.      26173-26328                  69-120
    4.      27524-27643                  121-160
    5.      28535-28630                  161-192
    6.      28740-28838                  193-225
    7.      30765-30950                  226-287
    8.      31982-32134                  288-338
    9.      32867-33097                  339-415
    10.     33314-33460                  416-464
  • [0071]
    TABLE 4
    SNARE. Reverse strand coding.
    EXON    LOCATION (nucleotide no.)    (Amino acid no.)
    7.      4320-4352                    198-188
    6.      5475-5576                    187-154
    5.      8401-8466                    153-132
    4.      9107-9211                    131-97
    3.      10114-10215                  96-63
    2.      11950-12033                  62-35
    1.      15362-15463                  34-1
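  • For genes coded on the reverse strand, as in Tables 2 and 4, the coding sequence is read from the highest exon coordinate to the lowest, with each exon reverse-complemented. The sketch below illustrates this on a short made-up sequence, not on SEQ ID NO:5 or SEQ ID NO:7; the function names are hypothetical and coordinates are assumed to be 1-based inclusive positions on the listed (forward) strand.

```python
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(sequence: str) -> str:
    """Complement each base, then reverse the strand."""
    return sequence.translate(_COMPLEMENT)[::-1]


def spliced_reverse_strand(genomic: str, exons) -> str:
    """Join reverse-strand exons into a 5'-to-3' coding sequence.

    exons are (start, end) pairs, 1-based inclusive, on the forward strand.
    """
    # Exon 1 of a reverse-strand gene has the highest coordinates,
    # so exons are processed in descending order of start position.
    ordered = sorted(exons, key=lambda exon: exon[0], reverse=True)
    return "".join(reverse_complement(genomic[start - 1:end])
                   for start, end in ordered)


if __name__ == "__main__":
    toy_genomic = "CCGGTTAAAATCATGG"   # made-up 16-base sequence
    toy_exons = [(3, 6), (11, 14)]     # two made-up reverse-strand exons
    print(spliced_reverse_strand(toy_genomic, toy_exons))  # ATGAAACC
```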
  • [0072]
    TABLE 5
    TRANSCRIPTION FACTOR BINDING SITES
    BINDING SITES SNARE GLUCOKINASE POLD2 AEBP
    AP1FJ-Q2 11 11
    AP1-C 15 15 7 6
    AP1-Q2 9 5
    AP1-Q4 7 4
    AP4-Q5 36 5 43
    AP4-Q6 17 23
    ARNT-01 7 5
    CEBP-01 7
    CETS1P54-01 6
    CREL-01 7
    DELTAEF1-01 64 12 5 50
    FREAC7-01 4
    GATA1-02 19
    GATA1-03 12 6
    GATA1-04 25 6
    GATA1-06 8 5
    GATA2-02 10
    GATA3-02 5
    GATA-C 11 6
    GC-01 4
    GFII-01 6
    HFH2-01 5
    HFH3-01 10
    HFH8-01 4
    IK2-01 49 29
    LMO2COM-01 41 6 27
    LMO2COM-02 31 5 7
    LYF1-01 10 13 6
    MAX-01 4
    MYOD-01 7
    MYOD-Q6 32 19 7 12
    MZF1-01 99 40 15 94
    NF1-Q6 5 7
    NFAT-Q6 43 8 7 8
    NFKAPPAB50-01 4
    NKX25-01 13 14 5
    NMYC-01 12 8
    S8-01 30 4
    SOX5-01 21 20 4 4
    SP1-Q6 8
    SAEBP1-01 4
    SRV-02 5
    STAT-01 6
    TATA-01 8
    TCF11-01 47 28 5 19
    USF-01 12 8 6 8
    USF-C 16 12 12 8
    USF-Q6 6
  • In a specific embodiment, such noncoding sequences are expression control sequences. These include but are not limited to DNA regulatory sequences, such as promoters, enhancers, repressors, terminators, and the like, that provide for the regulation of expression of a coding sequence in a host cell. In eukaryotic cells, polyadenylation signals are also control sequences. [0073]
  • In a more specific embodiment of the invention, the expression control sequences may be operatively linked to a polynucleotide encoding a heterologous polypeptide. Such expression control sequences may be about 50-200 nucleotides in length and specifically about 50, 100, 200, 500, 600, 1000 or 2000 nucleotides in length. A transcriptional control sequence is “operatively linked” to a polynucleotide encoding a heterologous polypeptide sequence when the expression control sequence controls and regulates the transcription and translation of that polynucleotide sequence. The term “operatively linked” includes having an appropriate start signal (e.g., ATG) in front of the polynucleotide sequence to be expressed and maintaining the correct reading frame to permit expression of the DNA sequence under the control of the expression control sequence and production of the desired product encoded by the polynucleotide sequence. If a gene that one desires to insert into a recombinant DNA molecule does not contain an appropriate start signal, such a start signal can be inserted upstream (5′) of and in reading frame with the gene. [0074]
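  • The start-signal and reading-frame requirement described above can be illustrated with a short check. This is a simplified sketch under stated assumptions: sequences are plain strings, the coding sequence is supplied in frame from its first base, and `ensure_start_codon` and `operatively_link` are hypothetical helper names. It ignores spacing, untranslated regions and every other consideration that matters in an actual construct.

```python
def ensure_start_codon(coding_sequence: str) -> str:
    """Return the coding sequence with an in-frame ATG at its 5' end."""
    seq = coding_sequence.upper()
    if len(seq) % 3 != 0:
        raise ValueError("coding sequence length must be a multiple of three")
    # Prepending a whole codon keeps the downstream reading frame unchanged.
    return seq if seq.startswith("ATG") else "ATG" + seq


def operatively_link(control_sequence: str, coding_sequence: str) -> str:
    """Place a control sequence directly 5' of a start-codon-bearing coding sequence."""
    return control_sequence.upper() + ensure_start_codon(coding_sequence)


if __name__ == "__main__":
    print(ensure_start_codon("GCTGAA"))          # ATGGCTGAA
    print(operatively_link("ttgaca", "atggct"))  # TTGACAATGGCT
```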
  • Expression of Polypeptides [0075]
  • Isolated Polynucleotide Sequences [0076]
  • The human chromosome 7 genomic clone of accession number AC006454 has been discovered to contain the SNARE YKT6 gene, the human liver glucokinase gene, the AEBP1 gene, and the POLD2 gene by Genscan analysis (Burge et al., 1997, J. Mol. Biol. 268:78-94), BLAST2 and TBLASTN analysis (Altschul et al., 1997, Nucl. Acids Res. 25:3389-3402), in which the sequence of AC006454 was compared to the SNARE YKT6 cDNA sequence, accession number NM_006555 (McNew et al., 1997, J. Biol. Chem. 272:17776-17783), the human liver glucokinase cDNA sequence (Tanizawa et al., 1992, Mol. Endocrinol. 6:1070-1081), accession numbers NM_000162 (major form) and M69051 (minor form), the AEBP1 cDNA sequence, accession number NM_001129 (accession number D86479 for the osteoblast type) (Layne et al., 1998, J. Biol. Chem. 273:15654-15660) and the POLD2 cDNA sequence, accession number NM_006230 (Zhang et al., 1995, Genomics 29:179-186). [0077]
  • The cloning of the nucleic acid sequences of the present invention from such genomic DNA can be effected, e.g., by using the well known polymerase chain reaction (PCR) or antibody screening of expression libraries to detect cloned DNA fragments with shared structural features. See, e.g., Innis et al., 1990, PCR: A Guide to Methods and Application, Academic Press, New York. Other nucleic acid amplification procedures such as ligase chain reaction (LCR), ligated activated transcription (LAT) and nucleic acid sequence-based amplification (NASBA) or long chain PCR may be used. In a specific embodiment, 5′ or 3′ non-coding portions of each gene may be identified by methods including, but not limited to, filter probing, clone enrichment using specific probes, and protocols similar or identical to 5′ and 3′ “RACE” protocols, which are well known in the art. For instance, a method similar to 5′ RACE is available for generating the missing 5′ end of a desired full-length transcript (Fromont-Racine et al., 1993, Nucl. Acids Res. 21:1683-1684). [0078]
  • Once the DNA fragments are generated, identification of the specific DNA fragment containing the desired SNARE YKT6 gene, the human liver glucokinase gene, the AEBP1 gene, or POLD2 gene may be accomplished in a number of ways. For example, if an amount of a portion of a SNARE YKT6 gene, the human liver glucokinase gene, the AEBP1 gene, or POLD2 gene or its specific RNA, or a fragment thereof, is available and can be purified and labeled, the generated DNA fragments may be screened by nucleic acid hybridization to the labeled probe (Benton and Davis, 1977, Science 196:180; Grunstein and Hogness, 1975, Proc. Natl. Acad. Sci. U.S.A. 72:3961). The present invention provides such nucleic acid probes, which can be conveniently prepared from the specific sequences disclosed herein, e.g., a hybridizable probe having a nucleotide sequence corresponding to at least a 10, and preferably a 15, nucleotide fragment of the sequences depicted in SEQ ID NOS:5, 6, 7 or 8. Preferably, a fragment is selected that is highly unique to the encoded polypeptides. Those DNA fragments with substantial homology to the probe will hybridize. As noted above, the greater the degree of homology, the more stringent hybridization conditions can be used. In one embodiment, low stringency hybridization conditions are used to identify a homologous SNARE YKT6, the human liver glucokinase, the AEBP1, or POLD2 polynucleotide. However, in a preferred aspect, and as demonstrated experimentally herein, a nucleic acid encoding a polypeptide of the invention will hybridize to a nucleic acid derived from the polynucleotide sequence depicted in SEQ ID NOS:5, 6, 7 or 8 or a hybridizable fragment thereof, under moderately stringent conditions; more preferably, it will hybridize under high stringency conditions. [0079]
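  • Candidate probe fragments of the lengths mentioned above (at least 10, preferably 15 nucleotides) can be enumerated with a sliding window. The sketch below keeps only fragments that occur once within the supplied sequence, which is a much weaker criterion than uniqueness within a genome and is used here purely for illustration; `unique_probes` is a hypothetical name.

```python
from collections import Counter


def unique_probes(sequence: str, length: int = 15):
    """List (1-based start, fragment) for fragments occurring exactly once."""
    if length < 10:
        raise ValueError("probe length should be at least 10 nucleotides")
    seq = sequence.upper()
    fragments = [seq[i:i + length] for i in range(len(seq) - length + 1)]
    counts = Counter(fragments)
    return [(i + 1, fragment)
            for i, fragment in enumerate(fragments)
            if counts[fragment] == 1]


if __name__ == "__main__":
    # A homopolymer run yields no fragment that occurs only once.
    print(unique_probes("A" * 12, length=10))  # []
    # Adding one distinct base makes both 10-mers unique.
    print(unique_probes("A" * 10 + "C", length=10))
```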
  • Alternatively, the presence of the gene may be detected by assays based on the physical, chemical, or immunological properties of its expressed product. For example, cDNA clones, or DNA clones which hybrid-select the proper mRNAs, can be selected which produce a protein that, e.g., has similar or identical electrophoretic migration, isoelectric focusing behavior, proteolytic digestion maps, or antigenic properties as known for the SNARE YKT6, the human liver glucokinase, the AEBP1, or POLD2 polynucleotide. [0080]
  • A gene encoding SNARE YKT6, the human liver glucokinase, the AEBP1, or POLD2 polypeptide can also be identified by mRNA selection, i.e., by nucleic acid hybridization followed by in vitro translation. In this procedure, fragments are used to isolate complementary mRNAs by hybridization. Immunoprecipitation analysis or functional assays of the in vitro translation products of the isolated mRNAs identify the mRNA and, therefore, the complementary DNA fragments that contain the desired sequences. [0081]
  • Nucleic Acid Constructs [0082]
  • The present invention also relates to nucleic acid constructs comprising a polynucleotide sequence containing the exon/intron segments of the SNARE YKT6 gene (nucleotides 4320-15463 of SEQ ID NO:5), human liver glucokinase gene (nucleotides 20485-33460 of SEQ ID NO:6), AEBP1 gene (nucleotides 1301-13893 of SEQ ID NO:7) or POLD2 gene (nucleotides 11546-18811 of SEQ ID NO:8) operably linked to one or more control sequences which direct the expression of the coding sequence in a suitable host cell under conditions compatible with the control sequences. Expression will be understood to include any step involved in the production of the polypeptide including, but not limited to, transcription, post-transcriptional modification, translation, post-translational modification, and secretion. [0083]
  • The invention is further directed to a nucleic acid construct comprising expression control sequences derived from SEQ ID NOS: 5, 6, 7 or 8 and a heterologous polynucleotide sequence. [0084]
  • “Nucleic acid construct” is defined herein as a nucleic acid molecule, either single- or double-stranded, which is isolated from a naturally occurring gene or which has been modified to contain segments of nucleic acid which are combined and juxtaposed in a manner which would not otherwise exist in nature. The term nucleic acid construct is synonymous with the term expression cassette when the nucleic acid construct contains all the control sequences required for expression of a coding sequence of the present invention. The term “coding sequence” is defined herein as a portion of a nucleic acid sequence which directly specifies the amino acid sequence of its protein product. The boundaries of the coding sequence are generally determined by a ribosome binding site (prokaryotes) or by the ATG start codon (eukaryotes) located just upstream of the open reading frame at the 5′ end of the mRNA and a transcription terminator sequence located just downstream of the open reading frame at the 3′ end of the mRNA. A coding sequence can include, but is not limited to, DNA, cDNA, and recombinant nucleic acid sequences. [0085]
  • The isolated polynucleotide of the present invention may be manipulated in a variety of ways to provide for expression of the polypeptide. Manipulation of the nucleic acid sequence prior to its insertion into a vector may be desirable or necessary depending on the expression vector. The techniques for modifying nucleic acid sequences utilizing recombinant DNA methods are well known in the art. [0086]
  • The control sequence may be an appropriate promoter sequence, a nucleic acid sequence which is recognized by a host cell for expression of the nucleic acid sequence. The promoter sequence contains transcriptional control sequences which regulate the expression of the polynucleotide. The promoter may be any nucleic acid sequence which shows transcriptional activity in the host cell of choice including mutant, truncated, and hybrid promoters, and may be obtained from genes encoding extracellular or intracellular polypeptides either homologous or heterologous to the host cell. [0087]
  • Examples of suitable promoters for directing the transcription of the nucleic acid constructs of the present invention, especially in a bacterial host cell, are the promoters obtained from the E. coli lac operon, the prokaryotic beta-lactamase gene (Villa-Komaroff et al., 1978, Proc. Natl. Acad. Sci. USA 75: 3727-3731), as well as the tac promoter (DeBoer et al., 1983, Proc. Natl. Acad. Sci. USA 80: 21-25). Further promoters are described in “Useful proteins from recombinant bacteria” in Scientific American, 1980, 242: 74-94; and in Sambrook et al., 1989, supra. [0088]
  • Examples of suitable promoters for directing the transcription of the nucleic acid constructs of the present invention in a filamentous fungal host cell are promoters obtained from the genes encoding Aspergillus oryzae TAKA amylase, Rhizomucor miehei aspartic proteinase, Aspergillus niger neutral alpha-amylase, Aspergillus niger acid stable alpha-amylase, Aspergillus niger or Aspergillus awamori glucoamylase (glaA), Rhizomucor miehei lipase, Aspergillus oryzae alkaline protease, Aspergillus oryzae triose phosphate isomerase, Aspergillus nidulans acetamidase, Fusarium oxysporum trypsin-like protease (WO 96/00787), NA2-tpi (a hybrid of the promoters from the genes encoding Aspergillus niger neutral alpha-amylase and Aspergillus oryzae triose phosphate isomerase), and mutant, truncated, and hybrid promoters thereof. [0089]
  • In a yeast host, useful promoters are obtained from the Saccharomyces cerevisiae enolase (ENO-1) gene, the Saccharomyces cerevisiae galactokinase gene (GAL1), the Saccharomyces cerevisiae alcohol dehydrogenase/glyceraldehyde-3-phosphate dehydrogenase genes (ADH2/GAP), and the Saccharomyces cerevisiae 3-phosphoglycerate kinase gene. Other useful promoters for yeast host cells are described by Romanos et al., 1992, Yeast 8: 423-488. [0090]
  • Eukaryotic promoters may be obtained from the genomes of viruses such as polyoma virus, fowlpox virus, adenovirus, bovine papilloma virus, avian sarcoma virus, cytomegalovirus, a retrovirus, hepatitis-B virus and SV40. Alternatively, heterologous mammalian promoters, such as the actin promoter or immunoglobulin promoter may be used. [0091]
  • The constructs of the invention may also include enhancers. Enhancers are cis-acting elements of DNA, usually from about 10 to about 300 bp, that act on a promoter to increase its transcription. Enhancers from the globin, elastase, albumin, alpha-fetoprotein, and insulin genes may be used. Alternatively, an enhancer from a virus may be used; examples include the SV40 enhancer on the late side of the replication origin, the cytomegalovirus early promoter enhancer, the polyoma enhancer on the late side of the replication origin and adenovirus enhancers. [0092]
  • The control sequence may also be a suitable transcription terminator sequence, a sequence recognized by a host cell to terminate transcription. The terminator sequence is operably linked to the 3′ terminus of the nucleic acid sequence encoding the polypeptide. Any terminator which is functional in the host cell of choice may be used in the present invention. [0093]
  • The control sequence may also be a suitable leader sequence, a nontranslated region of an mRNA which is important for translation by the host cell. The leader sequence is operably linked to the 5′ terminus of the nucleic acid sequence encoding the polypeptide. Any leader sequence that is functional in the host cell of choice may be used in the present invention. [0094]
  • The control sequence may also be a polyadenylation sequence, a sequence which is operably linked to the 3′ terminus of the nucleic acid sequence and which, when transcribed, is recognized by the host cell as a signal to add polyadenosine residues to transcribed mRNA. Any polyadenylation sequence which is functional in the host cell of choice may be used in the present invention. [0095]
  • The control sequence may also be a signal peptide coding region, which codes for an amino acid sequence linked to the amino terminus of the polypeptide which can direct the encoded polypeptide into the cell's secretory pathway. The 5′ end of the coding sequence of the nucleic acid sequence may inherently contain a signal peptide coding region naturally linked in translation reading frame with the segment of the coding region which encodes the secreted polypeptide. Alternatively, the 5′ end of the coding sequence may contain a signal peptide coding region which is foreign to the coding sequence. The foreign signal peptide coding region may be required where the coding sequence does not normally contain a signal peptide coding region. Alternatively, the foreign signal peptide coding region may simply replace the natural signal peptide coding region in order to obtain enhanced secretion of the polypeptide. However, any signal peptide coding region which directs the expressed polypeptide into the secretory pathway of a host cell of choice may be used in the present invention. [0096]
  • The control sequence may also be a propeptide coding region, which codes for an amino acid sequence positioned at the amino terminus of a polypeptide. The resultant polypeptide is known as a proenzyme or propolypeptide (or a zymogen in some cases). A propolypeptide is generally inactive and can be converted to a mature active polypeptide by catalytic or autocatalytic cleavage of the propeptide from the propolypeptide. The propeptide coding region may be obtained from the Bacillus subtilis alkaline protease gene (aprE), the Bacillus subtilis neutral protease gene (nprT), the Saccharomyces cerevisiae alpha-factor gene, the Rhizomucor miehei aspartic proteinase gene, or the Myceliophthora thermophila laccase gene (WO 95/33836). [0097]
  • Where both signal peptide and propeptide regions are present at the amino terminus of a polypeptide, the propeptide region is positioned next to the amino terminus of a polypeptide and the signal peptide region is positioned next to the amino terminus of the propeptide region. [0098]
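The ordering described above can be sketched in Python. This is a minimal illustration only: the function name `assemble_prepropolypeptide` and the short amino acid strings are hypothetical placeholders and are not sequences from the specification.

```python
def assemble_prepropolypeptide(mature, propeptide="", signal_peptide=""):
    """Join the regions from amino terminus to carboxy terminus.

    Order: signal peptide, then propeptide, then the mature polypeptide,
    so the propeptide sits next to the polypeptide's amino terminus and
    the signal peptide sits next to the propeptide's amino terminus.
    """
    return signal_peptide + propeptide + mature


# Placeholder sequences (hypothetical, for illustration only).
signal = "MKKLL"
pro = "AEEAK"
mature = "GSTVQ"
full = assemble_prepropolypeptide(mature, propeptide=pro, signal_peptide=signal)
```

Either optional region may be left empty, which corresponds to a coding sequence that lacks a signal peptide or a propeptide region.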
  • It may also be desirable to add regulatory sequences which allow the regulation of the expression of the polypeptide relative to the growth of the host cell. Examples of regulatory systems are those which cause the expression of the gene to be turned on or off in response to a chemical or physical stimulus, including the presence of a regulatory compound. Regulatory systems in prokaryotic systems would include the lac, tac, and trp operator systems. In yeast, the ADH2 system or GAL1 system may be used. In filamentous fungi, the TAKA alpha-amylase promoter, Aspergillus niger glucoamylase promoter, and the Aspergillus oryzae glucoamylase promoter may be used as regulatory sequences. Other examples of regulatory sequences are those which allow for gene amplification. In eukaryotic systems, these include the dihydrofolate reductase gene which is amplified in the presence of methotrexate, and the metallothionein genes which are amplified with heavy metals. In these cases, the nucleic acid sequence encoding the polypeptide would be operably linked with the regulatory sequence. [0099]
  • Expression Vectors [0100]
  • The present invention also relates to recombinant expression vectors comprising a nucleic acid sequence of the present invention, a promoter, and transcriptional and translational stop signals. The various nucleic acid and control sequences described above may be joined together to produce a recombinant expression vector which may include one or more convenient restriction sites to allow for insertion or substitution of the nucleic acid sequence encoding the polypeptide at such sites. Alternatively, the polynucleotide of the present invention may be expressed by inserting the nucleic acid sequence or a nucleic acid construct comprising the sequence into an appropriate vector for expression. In creating the expression vector, the coding sequence is located in the vector so that the coding sequence is operably linked with the appropriate control sequences for expression. [0101]
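The arrangement of elements in such a vector can be sketched as a simple check in Python. This is an illustrative sketch under stated assumptions, not a method from the specification: the function name `check_expression_cassette`, the element labels, and the toy sequences are hypothetical, the "terminator" label stands in for the transcriptional stop signal, and the translational stop signal is assumed to be a stop codon of the standard genetic code at the end of the coding sequence.

```python
# Stop codons of the standard genetic code (DNA alphabet).
STOP_CODONS = {"TAA", "TAG", "TGA"}


def check_expression_cassette(parts):
    """Return a list of problems found in a cassette.

    `parts` is a list of (kind, sequence) tuples given in 5' to 3' order.
    An empty list means the basic checks passed.
    """
    problems = []
    kinds = [kind for kind, _ in parts]
    for required in ("promoter", "coding_sequence", "terminator"):
        if required not in kinds:
            problems.append(f"missing {required}")
    if problems:
        return problems

    # The promoter must precede the coding sequence, which must precede
    # the transcriptional stop signal.
    if not (kinds.index("promoter")
            < kinds.index("coding_sequence")
            < kinds.index("terminator")):
        problems.append("elements out of order")

    cds = parts[kinds.index("coding_sequence")][1].upper()
    if len(cds) % 3 != 0:
        problems.append("coding sequence length is not a multiple of 3")
    elif cds[-3:] not in STOP_CODONS:
        problems.append("coding sequence lacks a translational stop codon")
    return problems


# Toy cassette (hypothetical sequences, for illustration only).
cassette = [
    ("promoter", "TATAAT"),
    ("coding_sequence", "ATGGCCTAA"),
    ("terminator", "AATAAA"),
]
issues = check_expression_cassette(cassette)
```

A check of this kind covers only element order and the presence of a stop codon; whether the elements are in fact operably linked in a given host cell is an experimental question.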
  • The recombinant expression vector may be any vector (e.g., a plasmid or virus) which can be conveniently subjected to recombinant DNA procedures and can bring about the expression of the nucleic acid sequence. The choice of the vector will typically depend on the compatibility of the vector with the host cell into which the vector is to be introduced. The vectors may be linear or closed circular plasmids. [0102]
  • The vector may be an autonomously replicating vector, i.e., a vector which exists as an extrachromosomal entity, the replication of which is independent of chromosomal replication, e.g., a plasmid, an extrachromosomal element, a minichromosome, or an artificial chromosome. The vector may contain any means for assuring self-replication. Alternatively, the vector may be one which, when introduced into the host cell, is integrated into the genome and replicated together with the chromosome(s) into which it has been integrated. Furthermore, a single vector or plasmid or two or more vectors or plasmids which together contain the total DNA to be introduced into the genome of the host cell, or a transposon may be used. [0103]
  • The vectors of the present invention preferably contain one or more selectable markers which permit easy selection of transformed cells. A selectable marker is a gene the product of which provides for biocide or viral resistance, resistance to heavy metals, prototrophy to auxotrophs, and the like. Examples of bacterial selectable markers are the dal genes from Bacillus subtilis or Bacillus licheniformis, or markers which confer antibiotic resistance such as ampicillin, kanamycin, chloramphenicol or tetracycline resistance. Suitable markers for yeast host cells are ADE2, HIS3, LEU2, LYS2, MET3, TRP1, and URA3. Examples of suitable selectable markers for mammalian cells are those that enable the identification of cells competent to take up the nucleic acids of the present invention, such as DHFR or thymidine kinase. An appropriate host cell when wild-type DHFR is employed is the CHO cell line deficient in DHFR activity, prepared and propagated as described by Urlaub et al., Proc. Natl. Acad. Sci. USA, 77:4216 (1980). [0104]
  • The vectors of the present invention preferably contain an element(s) that permits stable integration of the vector into the host cell genome or autonomous replication of the vector in the cell independent of the genome of the cell. [0105]
  • For integration into the host cell genome, the vector may rely on the polynucleotide sequence encoding the polypeptide or any other element of the vector for stable integration of the vector into the genome by homologous or nonhomologous recombination. Alternatively, the vector may contain additional nucleic acid sequences for directing integration by homologous recombination into the genome of the host cell. The additional polynucleotide sequences enable the vector to be integrated into the host cell genome at a precise location(s) in the chromosome(s). To increase the likelihood of integration at a precise location, the integrational elements should preferably contain a sufficient number of nucleic acids, such as 100 to 1,500 base pairs, preferably 400 to 1,500 base pairs, and most preferably 800 to 1,500 base pairs, which are highly homologous with the corresponding target sequence to enhance the probability of homologous recombination. The integrational elements may be any sequence that is homologous with the target sequence in the genome of the host cell. Furthermore, the integrational elements may be non-encoding or encoding nucleic acid sequences. On the other hand, the vector may be integrated into the genome of the host cell by non-homologous recombination. [0106]
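The length ranges stated above for integrational elements can be expressed as a small classifier in Python. This is a minimal sketch: the function name `integration_element_tier` and the tier labels are hypothetical, and only the numeric ranges (100 to 1,500, 400 to 1,500, and 800 to 1,500 base pairs) come from the text.

```python
def integration_element_tier(length_bp):
    """Classify an integrational element by the length ranges in the text.

    100 to 1,500 bp is the stated range, 400 to 1,500 bp is preferred,
    and 800 to 1,500 bp is most preferred.
    """
    if not 100 <= length_bp <= 1500:
        return "outside stated range"
    if length_bp >= 800:
        return "most preferred"
    if length_bp >= 400:
        return "preferred"
    return "sufficient"


# Example: classify a few candidate homology lengths (in base pairs).
tiers = {n: integration_element_tier(n) for n in (50, 250, 600, 1200)}
```

Length is only one factor; the text also requires that the element be highly homologous with the corresponding target sequence.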
  • For autonomous replication, the vector may further comprise an origin of replication enabling the vector to replicate autonomously in the host cell in question. Examples of bacterial origins of replication are the origins of replication of plasmids pBR322, pUC19, pACYC177, and pACYC184 permitting replication in E. coli, and pUB110, pE194, pTA1060, and pAMβ1 permitting replication in Bacillus. Examples of origins of replication for use in a yeast host cell are the 2 micron origin of replication, ARS1, ARS4, the combination of ARS1 and CEN3, and the combination of ARS4 and CEN6. The origin of replication may be one having a mutation which makes its functioning temperature-sensitive in the host cell (see, e.g., Ehrlich, 1978, Proceedings of the National Academy of Sciences USA 75: 1433). [0107]
  • More than one copy of a polynucleotide sequence of the present invention may be inserted into the host cell to increase production of the gene product. An increase in the copy number of the polynucleotide sequence can be obtained by integrating at least one additional copy of the sequence into the host cell genome or by including an amplifiable selectable marker gene with the nucleic acid sequence where cells containing amplified copies of the selectable marker gene, and thereby additional copies of the nucleic acid sequence, can be selected for by cultivating the cells in the presence of the appropriate selectable agent. [0108]
  • The procedures used to ligate the elements described above to construct the recombinant expression vectors of the present invention are well known to one skilled in the art (see, e.g., Sambrook et al., 1989, supra). [0109]
  • Host Cells [0110]
  • The present invention also relates to recombinant host cells, comprising a nucleic acid sequence of the invention, which are advantageously used in the recombinant production of the polypeptides. A vector comprising a nucleic acid sequence of the present invention is introduced into a host cell so that the vector is maintained as a chromosomal integrant or as a self-replicating extra-chromosomal vector as described earlier. The term “host cell” encompasses any progeny of a parent cell that is not identical to the parent cell due to mutations that occur during replication. The choice of a host cell will to a large extent depend upon the gene encoding the polypeptide and its source. [0111]
  • The host cell may be a unicellular microorganism, e.g., a prokaryote, or a non-unicellular microorganism, e.g., a eukaryote. Useful unicellular cells are bacterial cells such as gram positive bacteria including, but not limited to, a Bacillus cell, or a Streptomyces cell, e.g., Streptomyces lividans or Streptomyces murinus, or gram negative bacteria such as E. coli and Pseudomonas sp. [0112]
  • The introduction of a vector into a bacterial host cell may, for instance, be effected by protoplast transformation (see, e.g., Chang and Cohen, 1979, Molecular and General Genetics 168: 111-115), using competent cells (see, e.g., Young and Spizizen, 1961, Journal of Bacteriology 81: 823-829, or Dubnau and Davidoff-Abelson, 1971, Journal of Molecular Biology 56: 209-221), electroporation (see, e.g., Shigekawa and Dower, 1988, Biotechniques 6: 742-751), or conjugation (see, e.g., Koehler and Thorne, 1987, Journal of Bacteriology 169: 5271-5278). [0113]
  • The host cell may be a eukaryote, such as a mammalian cell (e.g., human cell), an insect cell, a plant cell or a fungal cell. Mammalian host cells that could be used include but are not limited to human HeLa, embryonic kidney cells (293), lung cells, H9 and Jurkat cells, mouse NIH3T3 and C127 cells, Cos 1, Cos 7 and CV1, quail QC1-3 cells, mouse L cells and Chinese hamster ovary (CHO) cells. These cells may be transfected with a vector containing a transcriptional regulatory sequence, a protein coding sequence and transcriptional termination sequences. Alternatively, the polypeptide can be expressed in stable cell lines containing the polynucleotide integrated into a chromosome. Co-transfection with a selectable marker such as dhfr, gpt, neomycin, or hygromycin allows the identification and isolation of the transfected cells. [0114]
  • The host cell may be a fungal cell. “Fungi” as used herein includes the phyla Ascomycota, Basidiomycota, Chytridiomycota, and Zygomycota (as defined by Hawksworth et al., In, Ainsworth and Bisby's Dictionary of The Fungi, 8th edition, 1995, CAB International, University Press, Cambridge, UK) as well as the Oomycota (as cited in Hawksworth et al., 1995, supra, page 171) and all mitosporic fungi (Hawksworth et al., 1995, supra). The fungal host cell may also be a yeast cell. “Yeast” as used herein includes ascosporogenous yeast (Endomycetales), basidiosporogenous yeast, and yeast belonging to the Fungi Imperfecti (Blastomycetes). Since the classification of yeast may change in the future, for the purposes of this invention, yeast shall be defined as described in Biology and Activities of Yeast (Skinner, F. A., Passmore, S. M., and Davenport, R. R., eds, Soc. App. Bacteriol. Symposium Series No. 9, 1980). The fungal host cell may also be a filamentous fungal cell. “Filamentous fungi” include all filamentous forms of the subdivision Eumycota and Oomycota (as defined by Hawksworth et al., 1995, supra). The filamentous fungi are characterized by a mycelial wall composed of chitin, cellulose, glucan, chitosan, mannan, and other complex polysaccharides. Vegetative growth is by hyphal elongation and carbon catabolism is obligately aerobic. In contrast, vegetative growth by yeasts such as Saccharomyces cerevisiae is by budding of a unicellular thallus and carbon catabolism may be fermentative. [0115]
  • Fungal cells may be transformed by a process involving protoplast formation, transformation of the protoplasts, and regeneration of the cell wall in a manner known per se. Suitable procedures for transformation of Aspergillus host cells are described in EP 238 023 and Yelton et al., 1984, Proceedings of the National Academy of Sciences USA 81: 1470-1474. Suitable methods for transforming Fusarium species are described by Malardier et al., 1989, Gene 78: 147-156 and WO 96/00787. Yeast may be transformed using the procedures described by Becker and Guarente, In Abelson, J. N. and Simon, M. I., editors, Guide to Yeast Genetics and Molecular Biology, Methods in Enzymology, Volume 194, pp 182-187, Academic Press, Inc., New York; Ito et al., 1983, Journal of Bacteriology 153: 163; and Hinnen et al., 1978, Proceedings of the National Academy of Sciences USA 75: 1920. [0116]
  • Methods of Production [0117]
  • The present invention also relates to methods for producing a polypeptide of the present invention comprising (a) cultivating a host cell under conditions conducive for production of the polypeptide; and (b) recovering the polypeptide. [0118]
  • In the production methods of the present invention, the cells are cultivated in a nutrient medium suitable for production of the polypeptide using methods known in the art. For example, the cell may be cultivated by shake flask cultivation, small-scale or large-scale fermentation (including continuous, batch, fed-batch, or solid state fermentations) in laboratory or industrial fermentors performed in a suitable medium and under conditions allowing the polypeptide to be expressed and/or isolated. The cultivation takes place in a suitable nutrient medium comprising carbon and nitrogen sources and inorganic salts, using procedures known in the art. Suitable media are available from commercial suppliers or may be prepared according to published compositions (e.g., in catalogues of the American Type Culture Collection). If the polypeptide is secreted into the nutrient medium, the polypeptide can be recovered directly from the medium. If the polypeptide is not secreted, it can be recovered from cell lysates. [0119]
  • The polypeptides may be detected using methods known in the art that are specific for the polypeptides. These detection methods may include use of specific antibodies, formation of an enzyme product, or disappearance of an enzyme substrate. In a specific embodiment, an enzyme assay may be used to determine the activity of the polypeptide. For example, AEBP1 activity can be determined by measuring carboxypeptidase activity as described by Muise and Ro, 1999, Biochem. J. 343:341-345. Here, the conversion of hippuryl-L-arginine, hippuryl-L-lysine or hippuryl-L-phenylalanine to hippuric acid may be monitored spectrophotometrically. POLD2 activity may be detected by assaying for DNA polymerase activity (see, for example, Ng et al., 1991, J. Biol. Chem. 266:11699-11704). [0120]
  • The resulting polypeptide may be recovered by methods known in the art. For example, the polypeptide may be recovered from the nutrient medium by conventional procedures including, but not limited to, centrifugation, filtration, extraction, spray-drying, evaporation, or precipitation. [0121]
  • The polypeptides of the present invention may be purified by a variety of procedures known in the art including, but not limited to, chromatography (e.g., ion exchange, affinity, hydrophobic, chromatofocusing, and size exclusion), electrophoretic procedures (e.g., preparative isoelectric focusing), differential solubility (e.g., ammonium sulfate precipitation), SDS-PAGE, or extraction (see, e.g., Protein Purification, J.-C. Janson and Lars Ryden, editors, VCH Publishers, New York, 1989). [0122]
  • Antibodies [0123]
  • According to the invention, the SNARE YKT6, human glucokinase, AEBP1 or POLD2 polypeptides produced according to the method of the present invention may be used as an immunogen to generate antibodies to any of these polypeptides. Such antibodies include but are not limited to polyclonal, monoclonal, chimeric, single chain, Fab fragments, and an Fab expression library. [0124]
  • Various procedures known in the art may be used for the production of antibodies. For the production of antibody, various host animals can be immunized by injection with the polypeptide or a fragment thereof, including but not limited to rabbits, mice, rats, sheep, goats, etc. In one embodiment, the polypeptide or fragment thereof can optionally be conjugated to an immunogenic carrier, e.g., bovine serum albumin (BSA) or keyhole limpet hemocyanin (KLH). Various adjuvants may be used to increase the immunological response, depending on the host species, including but not limited to Freund's (complete and incomplete), mineral gels such as aluminum hydroxide, surface active substances such as lysolecithin, pluronic polyols, polyanions, peptides, oil emulsions, keyhole limpet hemocyanins, dinitrophenol, and potentially useful human adjuvants such as BCG (bacille Calmette-Guerin) and Corynebacterium parvum. [0125]
  • For preparation of monoclonal antibodies directed toward the SNARE YKT6, human glucokinase, AEBP1 or POLD2 polypeptide, any technique that provides for the production of antibody molecules by continuous cell lines in culture may be used. These include but are not limited to the hybridoma technique originally developed by Kohler and Milstein (1975, Nature 256:495-497), as well as the trioma technique, the human B-cell hybridoma technique (Kozbor et al., 1983, Immunology Today 4:72), and the EBV-hybridoma technique to produce human monoclonal antibodies (Cole et al., 1985, in Monoclonal Antibodies and Cancer Therapy, Alan R. Liss, Inc., pp. 77-96). In an additional embodiment of the invention, monoclonal antibodies can be produced in germ-free animals utilizing recent technology (PCT/US90/02545). According to the invention, human antibodies may be used and can be obtained by using human hybridomas (Cote et al., 1983, Proc. Natl. Acad. Sci. U.S.A. 80:2026-2030) or by transforming human B cells with EBV in vitro (Cole et al., 1985, in Monoclonal Antibodies and Cancer Therapy, Alan R. Liss, pp. 77-96). In fact, according to the invention, techniques developed for the production of “chimeric antibodies” (Morrison et al., 1984, J. Bacteriol. 159:870; Neuberger et al., 1984, Nature 312:604-608; Takeda et al., 1985, Nature 314:452-454) by splicing the genes from a mouse antibody molecule specific for the SNARE YKT6, human glucokinase, AEBP1 or POLD2 polypeptide together with genes from a human antibody molecule of appropriate biological activity can be used; such antibodies are within the scope of this invention. [0126]
  • According to the invention, techniques described for the production of single chain antibodies (U.S. Pat. No. 4,946,778) can be adapted to produce polypeptide-specific single chain antibodies. An additional embodiment of the invention utilizes the techniques described for the construction of Fab expression libraries (Huse et al., 1989, Science 246:1275-1281) to allow rapid and easy identification of monoclonal Fab fragments with the desired specificity for the SNARE YKT6, AEBP1, human glucokinase or POLD2 polypeptides. [0127]
  • Antibody fragments which contain the idiotype of the antibody molecule can be generated by known techniques. For example, such fragments include but are not limited to: the F(ab′)2 fragment, which can be produced by pepsin digestion of the antibody molecule; the Fab′ fragments, which can be generated by reducing the disulfide bridges of the F(ab′)2 fragment; and the Fab fragments, which can be generated by treating the antibody molecule with papain and a reducing agent. [0128]
  • In the production of antibodies, screening for the desired antibody can be accomplished by techniques known in the art, e.g., radioimmunoassay, ELISA (enzyme-linked immunosorbent assay), “sandwich” immunoassays, immunoradiometric assays, gel diffusion precipitin reactions, immunodiffusion assays, in situ immunoassays (using colloidal gold, enzyme or radioisotope labels, for example), western blots, precipitation reactions, agglutination assays (e.g., gel agglutination assays, hemagglutination assays), complement fixation assays, immunofluorescence assays, protein A assays, and immunoelectrophoresis assays, etc. In one embodiment, antibody binding is detected by detecting a label on the primary antibody. In another embodiment, the primary antibody is detected by detecting binding of a secondary antibody or reagent to the primary antibody. In a further embodiment, the secondary antibody is labeled. Many means are known in the art for detecting binding in an immunoassay and are within the scope of the present invention. For example, to select antibodies which recognize a specific epitope of a particular polypeptide, one may assay generated hybridomas for a product which binds to a particular polypeptide fragment containing such epitope. For selection of an antibody specific to a particular polypeptide from a particular species of animal, one can select on the basis of positive binding with the polypeptide expressed by or isolated from cells of that species of animal. [0129]
  • Immortal, antibody-producing cell lines can also be created by techniques other than fusion, such as direct transformation of B lymphocytes with oncogenic DNA, or transfection with Epstein-Barr virus. See, e.g., M. Schreier et al., “Hybridoma Techniques” (1980); Hammerling et al., “Monoclonal Antibodies And T-cell Hybridomas” (1981); Kennett et al., “Monoclonal Antibodies” (1980); see also U.S. Pat. Nos. 4,341,761; 4,399,121; 4,427,783; 4,444,887; 4,451,570; 4,466,917; 4,472,500; 4,491,632; 4,493,890. [0130]
  • Uses of Polynucleotides [0131]
  • Diagnostics [0132]
  • Polynucleotides containing noncoding regions of SEQ ID NOS:5, 6, 7 or 8 may be used as probes for detecting mutations in samples from a patient. Genomic DNA may be isolated from the patient. A mutation(s) may be detected by Southern blot analysis, specifically by subjecting restriction-digested genomic DNA to agarose gel electrophoresis and hybridizing it to various probes. [0133]
  • Polynucleotides containing noncoding regions may be used as PCR primers and may be used to amplify the genomic DNA isolated from the patients. Additionally, primers may be obtained by routine or long range PCR, that can yield products containing more than one exon and intervening intron. The sequence of the amplified genomic DNA from the patient may be determined using methods known in the art. Such probes may be between 10-100 nucleotides in length and may preferably be between 20-50 nucleotides in length. [0134]
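The length limits stated above can be applied when enumerating candidate probes from a noncoding region, as in the following Python sketch. The function names `candidate_probes` and `in_preferred_range` and the toy sequence are hypothetical; only the 10 to 100 and 20 to 50 nucleotide ranges come from the text, and no selection criterion other than length is modeled.

```python
def candidate_probes(region, length=20):
    """Return every contiguous window of `length` nucleotides in `region`.

    Raises ValueError if `length` falls outside the stated 10 to 100
    nucleotide range. Returns an empty list if the region is too short.
    """
    if not 10 <= length <= 100:
        raise ValueError("length must be between 10 and 100 nucleotides")
    return [region[i:i + length] for i in range(len(region) - length + 1)]


def in_preferred_range(length):
    """True if `length` is within the preferred 20 to 50 nucleotide range."""
    return 20 <= length <= 50


# Toy 28-nucleotide region (hypothetical, for illustration only).
region = "ACGT" * 7
probes = candidate_probes(region, length=20)
```

In practice, candidates of acceptable length would be screened further, for example for specificity to the target region, using methods known in the art.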
  • The invention is thus directed to kits comprising these polynucleotide probes. In a specific embodiment, these probes are labeled with a detectable substance. [0135]
  • Antisense Oligonucleotides and Mimetics [0136]
  • The invention is further directed to antisense oligonucleotides and mimetics to these polynucleotide sequences. Antisense technology can be used to control gene expression through triple-helix formation or antisense DNA or RNA, both of which methods are based on binding of a polynucleotide to DNA or RNA. A DNA oligonucleotide is designed to be complementary to a region of the gene involved in transcription or RNA processing (triple helix; see Lee et al., Nucl. Acids Res., 6:3073 (1979); Cooney et al., Science, 241:456 (1988); and Dervan et al., Science, 251:1360 (1991)), thereby preventing transcription and the production of said polypeptides. [0137]
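Complementarity of an antisense DNA oligonucleotide to its target can be illustrated with a reverse-complement calculation in Python. This is a minimal sketch that covers only Watson-Crick antisense pairing; it does not model triple-helix formation. The function name `antisense_dna` and the six-nucleotide example are hypothetical and are not taken from SEQ ID NOS of the specification.

```python
# Watson-Crick complements for the DNA alphabet (upper and lower case).
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def antisense_dna(target_sense):
    """Return the reverse complement of a sense-strand DNA target.

    Both the input and the result are written 5' to 3'.
    """
    unexpected = set(target_sense) - set("ACGTacgt")
    if unexpected:
        raise ValueError(f"unexpected characters: {sorted(unexpected)}")
    return target_sense.translate(_COMPLEMENT)[::-1]


# Toy target (hypothetical, for illustration only).
oligo = antisense_dna("ATGGCC")
```

Applying the function twice returns the original target, which is a convenient sanity check on any designed oligonucleotide.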
  • The antisense oligonucleotides or mimetics of the present invention may be used to decrease levels of a polypeptide. For example, SNARE YKT6 has been found to be essential for vesicle-associated endoplasmic reticulum-Golgi transport and cell growth. Therefore, the SNARE YKT6 antisense oligonucleotides of the present invention could be used to inhibit cell growth and in particular, to treat or prevent tumor growth. POLD2 is necessary for DNA replication. POLD2 antisense sequences could also be used to inhibit cell growth. Glucokinase and AEBP1 antisense sequences may be used to treat hyperglycemia. [0138]
  • The antisense oligonucleotides of the present invention may be formulated into pharmaceutical compositions. These compositions may be administered in a number of ways depending upon whether local or systemic treatment is desired and upon the area to be treated. Administration may be topical (including ophthalmic and to mucous membranes including vaginal and rectal delivery), pulmonary (e.g., by inhalation or insufflation of powders or aerosols, including by nebulizer; intratracheal, intranasal, epidermal and transdermal), oral or parenteral. Parenteral administration includes intravenous, intraarterial, subcutaneous, intraperitoneal or intramuscular injection or infusion; or intracranial, e.g., intrathecal or intraventricular, administration. [0139]
  • Pharmaceutical compositions and formulations for topical administration may include transdermal patches, ointments, lotions, creams, gels, drops, suppositories, sprays, liquids and powders. Conventional pharmaceutical carriers, aqueous, powder or oily bases, thickeners and the like may be necessary or desirable. [0140]
  • Compositions and formulations for oral administration include powders or granules, suspensions or solutions in water or non-aqueous media, capsules, sachets or tablets. Thickeners, flavoring agents, diluents, emulsifiers, dispersing aids or binders may be desirable. [0141]
  • Compositions and formulations for parenteral, intrathecal or intraventricular administration may include sterile aqueous solutions which may also contain buffers, diluents and other suitable additives such as, but not limited to, penetration enhancers, carrier compounds and other pharmaceutically acceptable carriers or excipients. [0142]
  • Pharmaceutical compositions of the present invention include, but are not limited to, solutions, emulsions, and liposome-containing formulations. These compositions may be generated from a variety of components that include, but are not limited to, preformed liquids, self-emulsifying solids and self-emulsifying semisolids. [0143]
  • The pharmaceutical formulations of the present invention, which may conveniently be presented in unit dosage form, may be prepared according to conventional techniques well known in the pharmaceutical industry. Such techniques include the step of bringing into association the active ingredients with the pharmaceutical carrier(s) or excipient(s). In general, the formulations are prepared by uniformly and intimately bringing into association the active ingredients with liquid carriers or finely divided solid carriers or both, and then, if necessary, shaping the product. [0144]
  • The compositions of the present invention may be formulated into any of many possible dosage forms such as, but not limited to, tablets, capsules, liquid syrups, soft gels, suppositories, and enemas. The compositions of the present invention may also be formulated as suspensions in aqueous, non-aqueous or mixed media. Aqueous suspensions may further contain substances which increase the viscosity of the suspension including, for example, sodium carboxymethylcellulose, sorbitol and/or dextran. The suspension may also contain stabilizers. [0145]
  • In one embodiment of the present invention, the pharmaceutical compositions may be formulated and used as foams. Pharmaceutical foams include formulations such as, but not limited to, emulsions, microemulsions, creams, jellies and liposomes. While basically similar in nature these formulations vary in the components and the consistency of the final product. The preparation of such compositions and formulations is generally known to those skilled in the pharmaceutical and formulation arts and may be applied to the formulation of the compositions of the present invention. [0146]
  • The formulation of therapeutic compositions and their subsequent administration is believed to be within the skill of those in the art. Dosing is dependent on severity and responsiveness of the disease state to be treated, with the course of treatment lasting from several days to several months, or until a cure is effected or a diminution of the disease state is achieved. Optimal dosing schedules can be calculated from measurements of drug accumulation in the body of the patient. Persons of ordinary skill can easily determine optimum dosages, dosing methodologies and repetition rates. Optimum dosages may vary depending on the relative potency of individual oligonucleotides, and can generally be estimated based on EC50 values found to be effective in in vitro and in vivo animal models. [0147]
  • In general, dosage is from 0.01 μg to 10 g per kg of body weight, and may be given once or more daily, weekly, monthly or yearly, or even once every 2 to 20 years. Persons of ordinary skill in the art can easily estimate repetition rates for dosing based on measured residence times and concentrations of the drug in bodily fluids or tissues. Following successful treatment, it may be desirable to have the patient undergo maintenance therapy to prevent the recurrence of the disease state, wherein the oligonucleotide is administered in maintenance doses, ranging from 0.01 μg to 10 g per kg of body weight, once or more daily, to once every 20 years. [0148]
  • Gene Therapy [0149]
  • As noted above, SNARE YKT6 is necessary for cell growth, POLD2 is involved in DNA replication and repair, AEBP1 is involved in repressing adipogenesis and glucokinase is involved in glucose sensing in pancreatic islet beta cells and liver. Therefore, the SNARE YKT6 gene may be used to modulate or prevent cell apoptosis and treat such disorders as virus-induced lymphocyte depletion (AIDS); cell death in neurodegenerative disorders characterized by the gradual loss of specific sets of neurons (e.g., Alzheimer's Disease, Parkinson's disease, ALS, retinitis pigmentosa, spinal muscular atrophy and various forms of cerebellar degeneration), cell death in blood cell disorders resulting from deprivation of growth factors (anemia associated with chronic disease, aplastic anemia, chronic neutropenia and myelodysplastic syndromes) and disorders arising out of an acute loss of blood flow (e.g., myocardial infarctions and stroke). The glucokinase gene may be used to treat diabetes mellitus. The AEBP1 gene may be used to modulate or inhibit adipogenesis and treat obesity, diabetes mellitus and/or osteopenic disorders. POLD2 may be used to treat defects in DNA repair such as xeroderma pigmentosum, progeria and ataxia telangiectasia. [0150]
  • As described herein, the polynucleotide of the present invention may be introduced into a patient's cells for therapeutic uses. As will be discussed in further detail below, cells can be transfected using any appropriate means, including viral vectors, as shown by the example, chemical transfectants, or physico-mechanical methods such as electroporation and direct diffusion of DNA. See, for example, Wolff, Jon A, et al., “Direct gene transfer into mouse muscle in vivo,” Science, 247, 1465-1468, 1990; and Wolff, Jon A, “Human dystrophin expression in mdx mice after intramuscular injection of DNA constructs,” Nature, 352, 815-818, 1991. As used herein, vectors are agents that transport the gene into the cell without degradation and include a promoter yielding expression of the gene in the cells into which it is delivered. As will be discussed in further detail below, promoters can be general promoters, yielding expression in a variety of mammalian cells, or cell specific, or even nuclear versus cytoplasmic specific. These are known to those skilled in the art and can be constructed using standard molecular biology protocols. Vectors have been divided into two classes: [0151]
  • a) Biological agents derived from viral, bacterial or other sources. [0152]
  • b) Chemical/physical methods that increase the potential for gene uptake, directly introduce the gene into the nucleus or target the gene to a cell receptor. [0153]
  • Biological Vectors [0154]
  • Viral vectors have a higher transduction ability (ability to introduce genes) than do most chemical or physical methods of introducing genes into cells. Vectors that may be used in the present invention include viruses, such as adenoviruses, adeno-associated virus (AAV), vaccinia, herpesviruses, baculoviruses and retroviruses, bacteriophages, cosmids, plasmids, fungal vectors and other recombination vehicles typically used in the art which have been described for expression in a variety of eukaryotic and prokaryotic hosts, and may be used for gene therapy as well as for simple protein expression. Polynucleotides are inserted into vector genomes using methods well known in the art. [0155]
  • Retroviral vectors are the vectors most commonly used in clinical trials, since they carry a larger genetic payload than other viral vectors. However, they are not useful in non-proliferating cells. Adenovirus vectors are relatively stable and easy to work with, have high titers, and can be delivered in aerosol formulation. Pox viral vectors are large and have several sites for inserting genes; they are thermostable and can be stored at room temperature. [0156]
  • Examples of promoters are SP6, T4, T7, SV40 early promoter, cytomegalovirus (CMV) promoter, mouse mammary tumor virus (MMTV) steroid-inducible promoter, Moloney murine leukemia virus (MMLV) promoter, phosphoglycerate kinase (PGK) promoter, and the like. Alternatively, the promoter may be an endogenous adenovirus promoter, for example the E1a promoter or the Ad2 major late promoter (MLP). Similarly, those of ordinary skill in the art can construct adenoviral vectors utilizing endogenous or heterologous poly A addition signals. [0157]
  • Plasmids are not integrated into the genome, and the vast majority of them are present only from a few weeks to several months, so they are typically very safe. However, they have lower expression levels than retroviruses. Since cells have the ability to identify and eventually shut down foreign gene expression, the continuous release of DNA from a polymer to the target cells substantially increases the duration of functional expression while maintaining the safety benefit associated with non-viral transfections. [0158]
  • Chemical/physical Vectors [0159]
  • Other methods to directly introduce genes into cells or exploit receptors on the surface of cells include the use of liposomes and lipids, ligands for specific cell surface receptors, cell receptors, and calcium phosphate and other chemical mediators, microinjections directly to single cells, electroporation and homologous recombination. Liposomes are commercially available from Gibco BRL, for example, as LIPOFECTIN® and LIPOFECTACE®, which are formed of cationic lipids such as N-[1-(2,3-dioleyloxy)-propyl]-N,N,N-trimethylammonium chloride (DOTMA) and dimethyl dioctadecylammonium bromide (DDAB). Numerous published methods for making liposomes are also known to those skilled in the art. [0160]
  • For example, in nucleic acid-lipid complexes, lipid carriers can be associated with naked nucleic acids (e.g., plasmid DNA) to facilitate passage through cellular membranes. Cationic, anionic, or neutral lipids can be used for this purpose. However, cationic lipids are preferred because they have been shown to associate better with DNA which, generally, has a negative charge. Cationic lipids have also been shown to mediate intracellular delivery of plasmid DNA (Felgner and Ringold, Nature 337:387 (1989)). Intravenous injection of cationic lipid-plasmid complexes into mice has been shown to result in expression of the DNA in lung (Brigham et al., Am. J. Med. Sci. 298:278 (1989)). See also, Osaka et al., J. Pharm. Sci. 85(6):612-618 (1996); San et al., Human Gene Therapy 4:781-788 (1993); Senior et al., Biochimica et Biophysica Acta 1070:173-179 (1991); Kabanov and Kabanov, Bioconjugate Chem. 6:7-20 (1995); Remy et al., Bioconjugate Chem. 5:647-654 (1994); Behr, J-P., Bioconjugate Chem. 5:382-389 (1994); Behr et al., Proc. Natl. Acad. Sci., USA 86:6982-6986 (1989); and Wyman et al., Biochem. 36:3008-3017 (1997). [0161]
  • Cationic lipids are known to those of ordinary skill in the art. Representative cationic lipids include those disclosed, for example, in U.S. Pat. No. 5,283,185 and U.S. Pat. No. 5,767,099. In a preferred embodiment, the cationic lipid is N4-spermine cholesteryl carbamate (GL-67) disclosed in U.S. Pat. No. 5,767,099. Additional preferred lipids include N4-spermidine cholesteryl carbamate (GL-53) and 1-(N4-spermine)-2,3-dilaurylglycerol carbamate (GL-89). [0162]
  • The vectors of the invention may be targeted to specific cells by linking a targeting molecule to the vector. A targeting molecule is any agent that is specific for a cell or tissue type of interest, including for example, a ligand, antibody, sugar, receptor, or other binding molecule. [0163]
  • Invention vectors may be delivered to the target cells in a suitable composition, either alone, or complexed, as provided above, comprising the vector and a suitably acceptable carrier. The vector may be delivered to target cells by methods known in the art, for example, intravenous, intramuscular, intranasal, subcutaneous, intubation, lavage, and the like. The vectors may be delivered via in vivo or ex vivo applications. In vivo applications involve the direct administration of an adenoviral vector of the invention formulated into a composition to the cells of an individual. Ex vivo applications involve the transfer of the adenoviral vector directly to harvested autologous cells which are maintained in vitro, followed by readministration of the transduced cells to a recipient. [0164]
  • In a specific embodiment, the vector is transfected into antigen-presenting cells. Suitable sources of antigen-presenting cells (APCs) include, but are not limited to, whole cells such as dendritic cells or macrophages; purified MHC class I molecule complexed to β2-microglobulin and foster antigen-presenting cells. In a specific embodiment, the vectors of the present invention may be introduced into T cells or B cells using methods known in the art (see, for example, Tsokos and Nepom, 2000, J. Clin. Invest. 106:181-183). [0165]
  • The invention described and claimed herein is not to be limited in scope by the specific embodiments herein disclosed, since these embodiments are intended as illustrations of several aspects of the invention. Any equivalent embodiments are intended to be within the scope of this invention. Indeed, various modifications of the invention in addition to those shown and described herein will become apparent to those skilled in the art from the foregoing description. Such modifications are also intended to fall within the scope of the appended claims. [0166]
  • Various references are cited herein, the disclosures of which are incorporated by reference in their entireties. [0167]

Claims (22)

What is claimed is:
1. An isolated genomic polynucleotide, said polynucleotide obtainable from human chromosome 7 having a nucleotide sequence at least 95% identical to a sequence selected from the group consisting of:
(a) a polynucleotide encoding a polypeptide selected from the group consisting of human SNARE YKT6 depicted in SEQ ID NO:1, human liver glucokinase depicted in SEQ ID NO:2, human adipocyte enhancer binding protein depicted in SEQ ID NO:3 and DNA directed 50 kD regulatory subunit (POLD2) depicted in SEQ ID NO:4;
(b) a polynucleotide selected from the group consisting of SEQ ID NO:5 which encodes human SNARE YKT6 depicted in SEQ ID NO:1, SEQ ID NO:6 which encodes human liver glucokinase depicted in SEQ ID NO:2, SEQ ID NO:7 which encodes human adipocyte enhancer binding protein depicted in SEQ ID NO:3 and SEQ ID NO:8 which encodes DNA directed 50 kD regulatory subunit (POLD2) depicted in SEQ ID NO:4;
(c) a polynucleotide which is a variant of SEQ ID NOS:5, 6, 7, or 8;
(d) a polynucleotide which is an allelic variant of SEQ ID NOS:5, 6, 7, or 8;
(e) a polynucleotide which encodes a variant of SEQ ID NOS:1, 2, 3, or 4;
(f) a polynucleotide which hybridizes to any one of the polynucleotides specified in (a)-(e);
(g) a polynucleotide which is a reverse complement of the polynucleotides specified in (a)-(f);
(h) a polynucleotide selected from the group consisting of a polynucleotide which encodes human SNARE YKT6 with exons as depicted in Table 1, a polynucleotide which encodes human liver glucokinase with exons as depicted in Table 3; a polynucleotide which encodes human adipocyte enhancer binding protein with exons as depicted in Table 2 and a polynucleotide which encodes DNA directed 50 kD regulatory subunit (POLD2) with exons as depicted in Table 4 and
(i) containing at least 10 transcription factor binding sites selected from the group consisting of AP1FJ-Q2, AP1-C, AP1-Q2, AP1-Q4, AP4-Q5, AP4-Q6, ARNT-01, CEBP-01, CETS1P54-01, CREL-01, DELTAEF1-01, FREAC7-01, GATA1-02, GATA1-03, GATA1-04, GATA1-06, GATA2-02, GATA3-02, GATA-C, GC-01, GFII-01, HFH2-01, HFH3-01, HFH8-01, IK2-01, LMO2COM-01, LMO2COM-02, LYF1-01, MAX-01, NKX25-01, NMYC-01, S8-01, SOX5-01, SP1-Q6, SAEBP1-01, SRV-02, STAT-01, TATA-01, TCF11-01, USF-01, USF-C and USF-Q6.
2. A nucleic acid construct comprising the polynucleotide of claim 1.
3. An expression vector comprising the polynucleotide of claim 1.
4. A recombinant host cell comprising the nucleic acid construct of claim 2.
5. A recombinant host cell comprising the expression vector of claim 4.
6. A method for obtaining a polypeptide encoded by a polynucleotide obtainable from human chromosome 7, said polypeptide selected from the group consisting of human SNARE YKT6, human liver glucokinase, human adipocyte enhancer binding protein and DNA directed 50 kD regulatory subunit (POLD2) comprising:
(a) culturing the recombinant host cell of claim 5 under conditions that provide for the expression of said polypeptide and (b) recovering said expressed polypeptide.
7. A method for preparing an antibody specific to a polypeptide selected from the group consisting of human SNARE YKT6, human liver glucokinase, human adipocyte enhancer binding protein and DNA directed 50 kD regulatory subunit (POLD2) comprising:
(a) obtaining a polypeptide according to the method of claim 6;
(b) optionally conjugating said polypeptide to a carrier protein;
(c) immunizing a host animal with said polypeptide or polypeptide-carrier protein conjugate of step (b) with an adjuvant and
(d) obtaining antibody from said immunized host animal.
8. An antisense oligonucleotide or mimetic to an isolated polynucleotide which hybridizes to a non-coding region of SEQ ID NOS:5, 6, 7 or 8, which non-coding region is selected from the group consisting of an intron, a splice junction, a 5′ non-coding region, a transcription factor binding region and a 3′ non-coding region.
9. A method of diagnosing a pathological condition or susceptibility to a pathological condition in a subject comprising:
(a) determining the presence or absence of a mutation in the polynucleotide of claim 1 and
(b) diagnosing a pathological condition or a susceptibility to a pathological condition based on the presence or absence of said mutation.
10. A composition comprising the polynucleotide of claim 1 and a carrier.
11. A composition comprising the antisense oligonucleotide of claim 8 and a carrier.
12. A method for preventing, treating or ameliorating a medical condition, comprising administering to a subject an amount of the composition of claim 10 effective to prevent, treat or ameliorate said medical condition.
13. A method for preventing, treating or ameliorating a medical condition, comprising administering to a subject an amount of the composition of claim 11 effective to prevent, treat or ameliorate said medical condition.
14. A kit comprising the polynucleotide of claim 1.
15. The kit according to claim 14, in which the polynucleotide is labeled with a detectable substance.
16. A kit comprising the antisense oligonucleotide or mimetic of claim 8.
17. The kit according to claim 16, in which the antisense oligonucleotide is labeled with a detectable substance.
18. An isolated polynucleotide which hybridizes to a transcriptional regulatory region of SEQ ID NOS:5, 6, 7 or 8.
19. A nucleic acid construct comprising the polynucleotide sequence of claim 18 operably linked to a polynucleotide sequence encoding a heterologous polypeptide.
20. An expression vector comprising the nucleic acid construct of claim 19.
21. A recombinant host cell comprising the nucleic acid construct of claim 19.
22. A method for expressing a heterologous polypeptide sequence comprising (a) culturing the recombinant host cell of claim 21 under conditions that provide for the expression of said polypeptide and (b) recovering said expressed polypeptide.
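
The "at least 95% identical" threshold and the reverse complement recited in claim 1 can be illustrated with a minimal sketch. The function names, the toy sequences and the identity convention used here (matching columns divided by alignment length, with gap columns counted as mismatches, over sequences that are already aligned) are assumptions made for illustration only; they are not the comparison method defined in the specification, and a real comparison would rely on an established alignment program.

```python
# Illustrative sketch only: names, toy sequences and the identity
# convention are hypothetical and are not taken from the patent.

# Translation table mapping each base to its complement.
COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percentage of alignment columns holding the same base.

    Assumes both sequences are already aligned to equal length,
    with '-' marking gaps; a column containing a gap is a mismatch.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("sequences must be aligned to the same length")
    if not aligned_a:
        raise ValueError("sequences must not be empty")
    matches = sum(
        1
        for a, b in zip(aligned_a.upper(), aligned_b.upper())
        if a == b and a != "-"
    )
    return 100.0 * matches / len(aligned_a)


def meets_identity_threshold(
    aligned_a: str, aligned_b: str, threshold: float = 95.0
) -> bool:
    """True if the aligned pair reaches the given percent identity."""
    return percent_identity(aligned_a, aligned_b) >= threshold


if __name__ == "__main__":
    reference = "ACGTACGTACGTACGTACGT"  # 20-base toy sequence
    candidate = "ACGTACGTACGTACGTACGA"  # differs at the last base
    print(reverse_complement("ATGC"))
    print(percent_identity(reference, candidate))
    print(meets_identity_threshold(reference, candidate))
```

Under this convention a 20-base candidate with a single mismatch sits exactly at the 95% threshold, while two mismatches fall below it. Other conventions (for example, excluding gap columns from the denominator) give different percentages for the same pair.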
US09/957,956 2000-09-21 2001-09-21 Isolated genomic polynucleotide fragments from chromosome 7 Abandoned US20030130215A1 (en)

Priority Applications (9)

Application Number Priority Date Filing Date Title
US09/957,956 US20030130215A1 (en) 2000-09-21 2001-09-21 Isolated genomic polynucleotide fragments from chromosome 7
US10/642,946 US7588915B2 (en) 2000-09-21 2003-08-18 Isolated genomic polynucleotide fragments from chromosome 7
US12/533,130 US8323884B2 (en) 2000-09-21 2009-07-31 Isolated SNARE YKT6 genomic polynucleotide fragments from chromosome 7 and their uses
US12/533,087 US8178662B2 (en) 2000-09-21 2009-07-31 Isolated AEBP1 genomic polynucleotide fragments from chromosome 7 and their uses
US12/533,164 US8313900B2 (en) 2000-09-21 2009-07-31 Isolated DNA directed 50kD regulatory subunit (POLD2) genomic polynucleotide fragments from chomosome 7 and their uses
US12/533,105 US8313899B2 (en) 2000-09-21 2009-07-31 Isolated snare YKT6 genomic polynucleotide fragments from chomosome 7 and their uses
US13/680,178 US8795959B2 (en) 2000-09-21 2012-11-19 Isolated glucokinase genomic polynucleotide fragments from chromosome 7
US13/680,223 US8822145B2 (en) 2000-09-21 2012-11-19 Identification of POLD2 sequences
US13/680,203 US20130130251A1 (en) 2000-09-21 2012-11-19 Identification of snare ykt6 sequences

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
US23442200P 2000-09-21 2000-09-21
US09/957,956 US20030130215A1 (en) 2000-09-21 2001-09-21 Isolated genomic polynucleotide fragments from chromosome 7

Related Child Applications (1)

Application Number Title Priority Date Filing Date
US10/642,946 Continuation US7588915B2 (en) 2000-09-21 2003-08-18 Isolated genomic polynucleotide fragments from chromosome 7

Publications (1)

Publication Number Publication Date
US20030130215A1 true US20030130215A1 (en) 2003-07-10

Family

ID=22881323

Family Applications (9)

Application Number Status Publication Number Priority Date Filing Date Title
US09/957,956 Abandoned US20030130215A1 (en) 2000-09-21 2001-09-21 Isolated genomic polynucleotide fragments from chromosome 7
US10/642,946 Expired - Lifetime US7588915B2 (en) 2000-09-21 2003-08-18 Isolated genomic polynucleotide fragments from chromosome 7
US12/533,087 Expired - Fee Related US8178662B2 (en) 2000-09-21 2009-07-31 Isolated AEBP1 genomic polynucleotide fragments from chromosome 7 and their uses
US12/533,164 Expired - Fee Related US8313900B2 (en) 2000-09-21 2009-07-31 Isolated DNA directed 50kD regulatory subunit (POLD2) genomic polynucleotide fragments from chomosome 7 and their uses
US12/533,130 Expired - Fee Related US8323884B2 (en) 2000-09-21 2009-07-31 Isolated SNARE YKT6 genomic polynucleotide fragments from chromosome 7 and their uses
US12/533,105 Expired - Fee Related US8313899B2 (en) 2000-09-21 2009-07-31 Isolated snare YKT6 genomic polynucleotide fragments from chomosome 7 and their uses
US13/680,223 Expired - Fee Related US8822145B2 (en) 2000-09-21 2012-11-19 Identification of POLD2 sequences
US13/680,178 Expired - Fee Related US8795959B2 (en) 2000-09-21 2012-11-19 Isolated glucokinase genomic polynucleotide fragments from chromosome 7
US13/680,203 Abandoned US20130130251A1 (en) 2000-09-21 2012-11-19 Identification of snare ykt6 sequences

Country Status (3)

Country Link
US (9) US20030130215A1 (en)
AU (1) AU2001296274A1 (en)
WO (1) WO2002024741A2 (en)

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
AU2001296274A1 (en) * 2000-09-21 2002-04-02 James W. Ryan Isolated genomic polynucleotide fragments from chromosome 7
CN105950619A (en) * 2016-04-20 2016-09-21 刘媛 shRNA molecule capable of inhibiting expression of human AEBP1 gene

Family Cites Families (14)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US5541060A (en) 1992-04-22 1996-07-30 Arch Development Corporation Detection of glucokinase-linked early-onset non-insulin-dependent diabetes mellitus
US5624803A (en) * 1993-10-14 1997-04-29 The Regents Of The University Of California In vivo oligonucleotide generator, and methods of testing the binding affinity of triplex forming oligonucleotides derived therefrom
AU697269B2 (en) * 1994-01-27 1998-10-01 Human Genome Sciences, Inc. Human DNA mismatch repair proteins
US5776746A (en) * 1996-05-01 1998-07-07 Genitope Corporation Gene amplification methods
US6783961B1 (en) * 1999-02-26 2004-08-31 Genset S.A. Expressed sequence tags and encoded human proteins
US20070015162A1 (en) * 1999-03-12 2007-01-18 Rosen Craig A 99 human secreted proteins
US20070031842A1 (en) * 1999-03-12 2007-02-08 Rosen Craig A 379 human secreted proteins
US20070048818A1 (en) * 1999-03-12 2007-03-01 Human Genome Sciences, Inc. Human secreted proteins
WO2000058467A1 (en) * 1999-03-26 2000-10-05 Human Genome Sciences, Inc. 50 human secreted proteins
US20030204075A9 (en) * 1999-08-09 2003-10-30 The Snp Consortium Identification and mapping of single nucleotide polymorphisms in the human genome
US20040018969A1 (en) * 2000-01-31 2004-01-29 Rosen Craig A. Nucleic acids, proteins, and antibodies
US20020048763A1 (en) * 2000-02-04 2002-04-25 Penn Sharron Gaynor Human genome-derived single exon nucleic acid probes useful for gene expression analysis
US6812339B1 (en) * 2000-09-08 2004-11-02 Applera Corporation Polymorphisms in known genes associated with human disease, methods of detection and uses thereof
AU2001296274A1 (en) * 2000-09-21 2002-04-02 James W. Ryan Isolated genomic polynucleotide fragments from chromosome 7

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20130244903A1 (en) * 2004-12-01 2013-09-19 The Curators Of The University Of Missouri Modulators of alpha-synuclein toxicity
US10526651B2 (en) * 2004-12-01 2020-01-07 Whitehead Institute For Biomedical Research Modulators of alpha-synuclein toxicity
US9879257B2 (en) 2005-05-13 2018-01-30 Whitehead Institute For Biomedical Research Modulators of alpha-synuclein toxicity
US9909160B2 (en) 2007-12-21 2018-03-06 Whitehead Institute For Biomedical Research Modulators of alpha-synuclein toxicity

Also Published As

Publication number Publication date
US8822145B2 (en) 2014-09-02
US8323884B2 (en) 2012-12-04
US8178662B2 (en) 2012-05-15
US20130130252A1 (en) 2013-05-23
US20050100912A1 (en) 2005-05-12
US20130130250A1 (en) 2013-05-23
US7588915B2 (en) 2009-09-15
US20090324627A1 (en) 2009-12-31
US20130130251A1 (en) 2013-05-23
US8795959B2 (en) 2014-08-05
US8313899B2 (en) 2012-11-20
US20100081709A1 (en) 2010-04-01
WO2002024741A3 (en) 2003-03-27
US20100291556A1 (en) 2010-11-18
US20090324626A1 (en) 2009-12-31
AU2001296274A1 (en) 2002-04-02
WO2002024741A2 (en) 2002-03-28
US8313900B2 (en) 2012-11-20

Similar Documents

Publication Publication Date Title
US8822145B2 (en) Identification of POLD2 sequences
US8765927B2 (en) Identification of isolated genomic nucleotide fragments from the p15 region of chromosome 11 encoding human cluster of differentiation antigen 81 and variants thereof
US7985571B1 (en) Isolated genomic polynucleotide fragments from chromosome 17 that encode human carboxypeptidase D
US8338100B1 (en) Isolated genomic polynucleotide fragments from chromosome 12 that encode human carboxypeptidase M
US8258273B1 (en) Isolated genomic polynucleotide fragments from chromosome 10q25.3 that encode human soluble aminopeptidase P
US8053195B1 (en) Isolated genomic nucleic acid molecules obtainable from chromosome 1p21-p13 that encode human RhoC

Legal Events

Date Code Title Description
STCB Information on status: application discontinuation

Free format text: ABANDONED -- FAILURE TO RESPOND TO AN OFFICE ACTION

AS Assignment

Owner name: RYOGEN LLC, NEW YORK

Free format text: ASSIGNMENT OF ASSIGNORS INTEREST;ASSIGNOR:RYAN, JAMES W.;REEL/FRAME:016309/0848

Effective date: 20050503