POLYNUCLEOTIDE ASSOCIATED WITH THE BO OR BOR SYNDROME, ITS CORRESPONDING POLYPEPTIDE AND DIAGNOSTIC AND THERAPEUTIC APPLICATIONS
The present invention pertains to a polynucleotide, the alteration of which is associated with Branchio-Oto-Renal (BOR) syndrome and which may also be involved in some carcinogenic processes, especially in kidney tumorigenesis. The present invention is also directed to new polynucleotides which share a strong homology with the polynucleotide associated with BOR syndrome, thus defining the BOR-syndrome-linked polynucleotide as a member of a new human gene family. The present invention also concerns a polypeptide encoded by any one of the above-cited polynucleotides. The invention is also directed to the use of said polypeptides or said polynucleotides, or fragments thereof, for example as tools for the in vitro detection of their presence in a biological sample, as well as for the detection of any kind of alteration of a gene belonging to the new gene family to which the polynucleotide associated with BOR syndrome belongs. Finally, the present invention concerns the use of the polynucleotide or polypeptide for the preparation of therapeutic compositions useful for the treatment of genetic diseases or for the control of tumor proliferation, especially of cancers, particularly kidney tumors.
The association of branchial arch anomalies and hearing impairment has been recognized since the nineteenth century. In 1976, Melnick et al., observing the additional association of renal aberrations, used the term Branchio-Oto-Renal (BOR) syndrome to describe individuals presenting with such branchial, otic and renal anomalies1. BOR syndrome is an autosomal dominant disorder with incomplete penetrance and variable expressivity. Several reports have described members of the same family who show different combinations of symptoms with varying degrees of severity, especially with regard to renal anomalies, which range from undetected (BO syndrome) to bilateral aplasia2-3.
The branchial anomalies of BOR syndrome consist of latero-cervical fistulas or cysts4. The otic anomalies involve the outer, middle and inner ear, affecting both the cochlea and the vestibular apparatus. The outer ear anomalies most frequently include preauricular pits and tags and malformed auricle and, less commonly, malpositioned ears, microtia (undersized auricle), and atresia to stenosis of the external auditory canal5. Anomalies of the middle ear mainly comprise hypoplasia or absence of the three ossicles (malleus, incus and stapes) and malformation of the middle ear cavity. Inner ear anomalies include an absent cochlea, an underdeveloped cochlea showing only 1 1/2 turns instead of the normal 2 1/2 turns, as well as absent or abnormal semicircular canals of the vestibular apparatus6. Hearing loss is the most commonly observed feature of BOR syndrome (93% of affected individuals). Hearing impairment can range from mild to profound and can be either conductive (i.e. due to external and/or middle ear anomalies), sensorineural (i.e. due to inner ear anomalies) or mixed, as seen in half of the cases5. BOR syndrome accounts for approximately 2% of profoundly deaf children4. Renal anomalies include unilateral or bilateral hypoplasia, dysplasia and aplasia6. In addition, anomalies of the collecting system, such as duplication or absence of the ureter, megaureter, blunted or distorted calyces and extra or bifid pelvis, have been observed2-8.
The incidence of BOR syndrome is difficult to estimate. Branchial anomalies and minor renal aberrations may be overlooked and, due to the lethal consequence of the most severe kidney anomalies, BOR syndrome may often remain undetected.
The clinical features of BOR syndrome are indicative of an early developmental defect taking place between the fourth week and the tenth week of embryonic development. The inner ear develops from the otic placode at three and a half weeks5 and the cochlea has completed its full two and a half turns at nine weeks. The middle ear ossicles, which are visible at week seven, develop from neural-crest-derived cells of the first and second branchial arches. The malleus and incus derive from the first arch and the stapes from the second. The external auditory canal and the middle ear cavity are derived from the first branchial cleft and pouch respectively, which separate the first and second branchial arches externally (cleft) and internally (pouch). The formation of the auricle requires the correct fusion of six mesenchymal hillocks from the first and second arches, which takes place at week [...]. Latero-cervical cysts or fistulas are remnants of the second, third and fourth branchial clefts, which are normally obliterated at the end of the sixth week. Likewise, the definitive kidney (metanephros) develops from a reciprocal induction between the metanephric mesenchyme and the ureteric bud10,11 that begins during week five and is completed by around week 11.
In 1989, Haan et al. described a family with an inherited rearrangement of chromosome 8q, dir ins(8)(q24.11,q13.3,q21.13), presenting with BO syndrome and Tricho-Rhino-Phalangeal syndrome12. This report mapped the gene underlying BOR syndrome to chromosome band 8q13.3 or 8q21.13. By linkage analysis the gene was then localised to 8q13.3 between the polymorphic markers D8S543 and D8S286, an interval of approximately 7 cM13,14. Subsequent refinement of this interval was then accomplished by the precise mapping of the aforementioned translocation, which was shown to be associated with a deletion15,16. The gene interval was recently redefined to between 470 and 650 kb, flanked by the polymorphic markers D8S1060 and D8S1807 16.
In an attempt to clone, in eukaryotic genomes, the gene underlying BOR syndrome, the inventors constructed a bacteriophage P1 and PAC contig spanning this interval and undertook a positional cloning approach based on a sequencing strategy.
For the purpose of the present invention, a contig is defined as a consensus sequence obtained after alignment of the multiple nucleic sequences of the same clones and of the multiple nucleic sequences of different clones containing common sequences.
Computer analysis of the sequences obtained from P1 4405, mapping to the centre of the deletion, demonstrated the presence of putative exons showing homology to the Drosophila developmental gene eyes absent (eya). Eya is required prior to the morphogenetic furrow for eye morphogenesis. We report here the characterisation of this newly identified gene as underlying BOR syndrome, as well
as the identification of two other human homologues of eya. These three newly identified eya related genes are suggestive of a novel gene family that plays a role in development. They do not seem to be linked to the development of eyes in humans, according to the present data.
As clearly appears from the prior art detailed above, Branchio-Oto-Renal (BOR) syndrome is a congenital disease characterized by a combination of varying branchial, otic and renal anomalies. The causative gene has been assigned to chromosome band 8q13.3, but it must be underlined that no evidence for genetic heterogeneity has ever been reported in the prior art.
The present inventors have now isolated and sequenced both the genomic and the cDNA corresponding to the gene associated with the BOR syndrome.
Using a positional cloning strategy based on genomic DNA sequencing, the inventors have identified a gene in the candidate region that they have named eya1. A chromosomal deletion and seven mutations predicted to disrupt the eya1 gene product have been identified in unrelated BOR-affected individuals, thus establishing that eya1 underlies BOR syndrome. The EYA1 polypeptide encoded by the eya1 gene was found to share a substantial homology with a Drosophila protein that is responsible for the eyes absent phenotype, which is characterized by a reduction or absence of the adult compound eye and is thus a phenotype unrelated to the BOR syndrome caused by eya1.
For the purpose of the present invention, the expression « eya1 » is used to designate the genomic DNA. As will be further disclosed, the expressions « eya1-a », « eya1-b » and « eya1-c » are used to designate respectively the three transcripts produced by eya1. The eya1 gene was also found to be poorly homologous (43% identity over 72 amino acids) with an Expressed Sequence Tag (EST) of C. elegans, which is a short transcribed sequence.
Moreover, the inventors have also characterized two other genes, structurally related to eya1, that have been named eya2 and eya3. These two genes are also involved in the development of genetic defects, as will be detailed further in the specification. For example, eya2 is localized in a chromosome region associated with the Fanconi Anemia type 1 (FA1) and epilepsy benign neonatal 1 (EBN1) genetic defects.
The inventors' discovery of three members of a new gene family provides new means for the detection of genetic defects whose causative agent had previously remained unknown. The availability of the genes eya1, eya2 and eya3 also makes it possible to obtain their expression products, which can be used both to produce diagnostic means to detect a genetic defect in a patient and to prepare therapeutic compositions in order to prevent or cure the disease induced by said genetic defect.
More particularly, eya1 is strongly associated with some carcinogenic processes, especially in kidney tumorigenesis. eya2 and eya3 are likely to be responsible for other human developmental diseases and may similarly be linked to the control of the development of various tumors or of eukaryotic cell differentiation processes. The type of DNA modifications detected in BOR-affected patients supports the view that the disease results from reduced gene dosage, implying that the amount of the protein encoded by eya1 is critical for the normal development of the branchial arches, ear and kidney. Consequently, the polynucleotide or polypeptide (including antibodies raised against the polypeptide) detection means made available by the inventors allow not only the determination of the presence or absence of the expression of eya1, eya2 or eya3 but also their quantification in a biological sample, which is useful to detect a pathological level of expression of these genes. Indeed, these new detection means are likely to be used in ante-natal diagnosis.
Thus, one object of the present invention consists in a polynucleotide corresponding to the eya1 genomic sequence, which is characterized by the SEQ ID N°1 that is detailed in Annex 1.
The polynucleotide of SEQ ID N°1 contains all the exons and introns of the eya1 gene as well as non-coding sequences localized at both the 5' and 3' ends, which may be very useful to express the eya1 gene or its cDNA counterparts in a suitable vector. For this reason, all the non-coding regions of the polynucleotide of SEQ ID N°1 are also part of the present invention.
The genomic sequence has been found by the inventors to contain 18 exons named, from 5' to 3' of the strand carrying the coding sequence, respectively Q, R, S, T, U, V, W, X, Y, z, A, B, C, D, E, F, G and H. The inventors have isolated three different mRNA transcripts corresponding to three different splicing patterns of the genomic sequence. These three mRNAs (as well as their cDNA counterparts) are herein called respectively:
- eya1-a: this transcript includes exons R, T, U, V, W, X, Y, z, A, B, C, D, E, F, G and H of the genomic sequence of SEQ ID N° 1. The resultant polynucleotide obtained by reverse transcription of this mRNA has the nucleotide sequence SEQ ID N° 2.
- eya1-b: this transcript includes exons R, S, T, U, V, W, X, Y, z, A, B, C, D, E, F, G and H of the genomic sequence of SEQ ID N° 1. The resultant polynucleotide obtained by reverse transcription of this mRNA has the nucleotide sequence SEQ ID N° 3.
- eya1-c: this transcript includes exons P, Q, S, T, U, V, W, X, Y, z, A, B, C, D, E, F, G and H of the genomic sequence of SEQ ID N° 1. The resultant polynucleotide obtained by reverse transcription of this mRNA has the nucleotide sequence SEQ ID N° 4.
The polynucleotides eya1-b and eya1-c have the same translation start codon ATG, localized at the same position, and thus code for the same polypeptide of SEQ ID N° 8, which is described further in the specification.
The polynucleotide eya1-a has an ATG start codon which is localized in exon R but, due to an alternative splicing generating both a shorter transcript and a frameshift, eya1-a gives rise to a polypeptide different from the polypeptide resulting from the translation of eya1-b and eya1-c. The polypeptide translated from eya1-a is described hereinafter as the polypeptide of SEQ ID N° 7.
Another object of the present invention consists in a polynucleotide comprising the cDNA sequence eya1-a, which is characterized by the sequence SEQ ID N°2.
The polynucleotide of sequence SEQ ID N°2 comprises notably the restriction sites described in Annex 2. The invention also concerns the fragments of this polynucleotide obtainable by the use of at least one of said restriction enzymes.
The polynucleotide of sequence SEQ ID N°2 is contained, from 5' end to 3' end, in the following clones: HSEYA1aNEYAK7, HSEYA1a6K7F4, HSEYA13'5' and HSEYA1 Puc3, deposited at the Collection Nationale de Cultures de Microorganismes (CNCM) respectively under the accession numbers I-1824, I-1836, I-1832 and I-1830.
Another object of the present invention consists in a polynucleotide comprising the cDNA sequence eya1-b, which is characterized by the sequence SEQ ID N°3.
The polynucleotide of sequence SEQ ID N°3 comprises notably the restriction sites described in Annex 3. The invention also concerns the fragments of this polynucleotide obtainable by the use of at least one of said restriction enzymes.
The polynucleotide of sequence SEQ ID N°3 is contained, from 5' end to 3' end, in the following clones: HSEYA1b5p35, HSEYA13'5' and HSEYA1 Puc3, deposited at the Collection Nationale de Cultures de Microorganismes (CNCM) respectively under the accession numbers I-1835, I-1832 and I-1830.
Another object of the present invention consists in a polynucleotide comprising the cDNA sequence eya1-c, which is characterized by the sequence SEQ ID N°4.
The polynucleotide of sequence SEQ ID N°4 comprises notably the restriction sites described in Annex 4. The invention also concerns the fragments of this polynucleotide obtainable by the use of at least one of said restriction enzymes.
The polynucleotide of sequence SEQ ID N°4 is contained, from 5' end to 3' end, in the following clones: HSEYA1aNEYAK7, HSEYA1c14K7F6, HSEYA13'5' and HSEYA1 Puc3, deposited at the Collection Nationale de Cultures de Microorganismes (CNCM) respectively under the accession numbers I-1824, I-1822, I-1832 and I-1830.
Another object of the present invention consists in a polynucleotide the sequence of which shares a strong homology with the sequences SEQ ID N°1 to SEQ ID N°4.
For the purpose of the present invention, a first polynucleotide which shares a strong homology with a second polynucleotide is defined as a first polynucleotide having 80% identity in its nucleic sequence with the second polynucleotide.
The polynucleotide defined above comprises the cDNA of a new gene named eya2, said polynucleotide being characterized by the sequence SEQ ID N°5.
The polynucleotide of sequence SEQ ID N°5 comprises notably the restriction sites described in Annex 5. The invention also concerns the fragments of this polynucleotide obtainable by the use of at least one of said restriction enzymes.
The polynucleotide of sequence SEQ ID N°5 is contained, from 5' end to 3' end, in the following clones: HSEYA2-20R50 3, HSEYA2-20int8 and HSEYA2-20F250 10, deposited at the Collection Nationale de Cultures de Microorganismes (CNCM) respectively under the accession numbers I-1834, I-1829 and I-1833.
Another object of the present invention consists in a polynucleotide the sequence of which shares a strong homology with the sequences SEQ ID N°1 to SEQ ID N°4.
For the purpose of the present invention, a first polynucleotide which shares a strong homology with a second polynucleotide is defined as a first polynucleotide having 80% identity in its nucleic sequence with the second polynucleotide.
The polynucleotide defined above comprises the cDNA of a new gene named eya3, said polynucleotide being characterized by the sequence SEQ ID N°6.
The polynucleotide of sequence SEQ ID N°6 comprises notably the restriction sites described in Annex 6. The invention also concerns the fragments of this polynucleotide obtainable by the use of at least one of said restriction enzymes.
The polynucleotide of sequence SEQ ID N°6 is contained, from 5' end to 3' end, in the following clones: HSEYA3-z39F11, HSEYA3-z39int and HSEYA3-z39 23, deposited at the Collection Nationale de Cultures de Microorganismes (CNCM) respectively under the accession numbers I-1823, I-1825 and I-1831.
More particularly, the present invention is directed to a polynucleotide which is selected from the group consisting of:
(a) a polynucleotide comprising at least 8 consecutive nucleotides of SEQ ID N°1;
(b) a polynucleotide differing from the polynucleotide defined in (a) by mutation, insertion, deletion or substitution of one or more bases;
(c) a polynucleotide, the sequence of which is complementary to the sequence of any one of the polynucleotides defined in (a) and (b);
(d) a polynucleotide which hybridizes specifically with any one of the polynucleotides defined in (a), (b) or (c).
Another object of the present invention consists in the polynucleotides falling within the above definition, which are selected from the group consisting of:
(a) a polynucleotide comprising at least 8 consecutive nucleotides of SEQ ID N°2, SEQ ID N°3, SEQ ID N°4, SEQ ID N°5 or SEQ ID N°6;
(b) a polynucleotide differing from the polynucleotide defined in (a) by mutation, insertion, deletion or substitution of one or more bases;
(c) a polynucleotide, the sequence of which is complementary to the sequence of any one of the polynucleotides defined in (a) and (b);
(d) a polynucleotide which hybridizes specifically with any one of the polynucleotides defined in (a), (b) or (c).
By polynucleotide or nucleic acid according to the present invention is meant either a double stranded DNA, a single stranded DNA, including a synthetic nucleic acid, as well as their transcription products.
By polynucleotide having a sequence complementary to the sequence of one polynucleotide of the invention is meant any nucleic acid wherein the nucleotides are complementary to those of either SEQ ID N°1, SEQ ID N°2, SEQ ID N°3, SEQ ID N°4, SEQ ID N°5 or SEQ ID N°6 and the orientation of which is reversed.
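The definition of a complementary polynucleotide given above (complementary nucleotides, reversed orientation) corresponds to what is commonly called the reverse complement. A minimal sketch is given below for DNA written with the letters A, C, G and T; the function name is illustrative and the handling of other symbols (left unchanged) is an assumption.

```python
# Translation table pairing each DNA base with its complement (both cases).
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq: str) -> str:
    """Return the complementary strand read in the reversed orientation."""
    return seq.translate(_COMPLEMENT)[::-1]
```

Applying the function twice returns the starting sequence, which is a convenient sanity check when deriving probes from either strand.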
By DNA according to the present invention is meant either genomic DNA or cDNA, the latter being obtained by reverse transcription of an RNA molecule.
By polynucleotide hybridizing specifically with a second polynucleotide is meant that the temperature and ionic strength parameters are selected in such a manner that they allow two polynucleotides having complementary sequences to be maintained in a hybridized form.
As an illustrative embodiment of the conditions to be met to keep two polynucleotides having complementary sequences hybridized, the following hybridization conditions are used:
The hybridization step is carried out at 65°C in the presence of 6 x SSC buffer, 5 x Denhardt's solution, 0.5% SDS and 100 μg/ml of salmon sperm DNA.
For technical information, 1 x SSC corresponds to 0.15 M NaCl and 0.05 M sodium citrate; 1 x Denhardt's solution corresponds to 0.02% Ficoll, 0.02% polyvinylpyrrolidone and 0.02% bovine serum albumin.
The hybridization step is followed by four washing steps:
- two washings of 5 min each, preferably at 65°C, in a 2 x SSC and 0.1% SDS buffer;
- one washing of 30 min, preferably at 65°C, in a 2 x SSC and 0.1% SDS buffer;
- one washing of 10 min, preferably at 65°C, in a 1 x SSC and 0.1% SDS buffer.
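The buffer strengths and washing schedule above lend themselves to a short worked calculation. The sketch below takes the 1 x SSC NaCl concentration (0.15 M) and the wash durations from the text; the data layout and function names are illustrative assumptions.

```python
# NaCl molarity of 1 x SSC, as stated in the text.
SSC_1X_NACL_M = 0.15

# Washing schedule from the text: (number of washes, minutes each, SSC strength).
WASHES = [
    (2, 5, 2.0),   # two 5-min washes in 2 x SSC, 0.1% SDS
    (1, 30, 2.0),  # one 30-min wash in 2 x SSC, 0.1% SDS
    (1, 10, 1.0),  # one 10-min wash in 1 x SSC, 0.1% SDS
]


def nacl_molarity(ssc_fold: float) -> float:
    """NaCl concentration (mol/l) of an n-fold SSC buffer."""
    return ssc_fold * SSC_1X_NACL_M


def total_wash_minutes(washes) -> int:
    """Total washing time over the whole schedule."""
    return sum(count * minutes for count, minutes, _ in washes)
```

Under these figures the 6 x SSC hybridization buffer contains about 0.9 M NaCl and the four washes add up to 50 minutes.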
Fragments of the polynucleotides according to the invention may be obtained by cleavage using one or several restriction endonucleases, the desired fragments being obtained by taking into account the restriction sites contained in SEQ ID N°2, SEQ ID N°3, SEQ ID N°4, SEQ ID N°5 or SEQ ID N°6, respectively disclosed in the Annexes. The conditions under which the restriction enzymes are used in order to generate the polynucleotide fragments according to the invention are described in Sambrook et al., 1989.
Another object of the present invention consists in a polypeptide which is encoded by the polynucleotide of SEQ ID N°1 or SEQ ID N°2, said polypeptide having the sequence SEQ ID N°7.
Another object of the present invention is a polypeptide which is encoded by the polynucleotide of SEQ ID N°1, SEQ ID N°3 or SEQ ID N°4, said polypeptide being named EYA1-B and having the sequence SEQ ID N°8.
The eya1-a cDNA contains sixteen exons from the genomic sequence of SEQ ID N°1. The most conserved C-terminal part of the EYA1 polypeptide spans eight exons of the eya1-a cDNA of sequence SEQ ID N° 2, starting with exon A and ending with exon H. For exons D and F, 5' splicing occurs after the first base of the last codon. For exons C and E, 5' splicing occurs after the second base of the last codon. For exons A, B and H, 5' splicing occurs after the third base of the last codon.
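The splice positions stated above can be recorded and checked with a small sketch. The table reproduces the positions given in the text (exon G is not mentioned there and is therefore left out); the helper function, which derives the same position from the cumulative coding length counted from the first base of the start codon, is an illustrative assumption about how such a check might be carried out.

```python
# Position within the last codon after which 5' splicing occurs, as stated
# in the text (1, 2 or 3). Exon G is not mentioned and is deliberately absent.
SPLICE_AFTER_BASE = {
    "A": 3, "B": 3, "C": 2, "D": 1, "E": 2, "F": 1, "H": 3,
}


def bases_in_last_codon(coding_length_through_exon: int) -> int:
    """Number of bases of the last codon contained in the exon.

    Assumption: the length is counted from the first base of the ATG start
    codon up to and including the last base of the exon.
    """
    if coding_length_through_exon <= 0:
        raise ValueError("coding length must be positive")
    remainder = coding_length_through_exon % 3
    return 3 if remainder == 0 else remainder
```

An exon ending after the third base leaves the reading frame intact at the junction, whereas positions 1 and 2 mean that the last codon is completed by the next exon.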
The amino acid sequences corresponding to the translation of the exons of eya1 are the following:
Exon A RVFIWDLETIIVFHSLLTGSYANRYGR
Exon B DPPTSVSLGLRMEEMIFNLADTHLFFNDLE
Exon C ECDQVHIDDVSSDDNGQDLS
Exon D TYNFGTDGFPAAATSANLCLATGVRGGVDWMRKLAFRYRRVKEIYNTYKNNVGG
Exon E LLGPAKREAWQLRAEIEALTDSWLTLALKALSLIHSR
Exon F TNCVNILVTTTQLIPALAKVLLYGLGIVFPIENIYSATKIG
Exon G KESCFERIIQRFGRKWYWIGDGVEEEQGAKK
Exon H HAMPFWRISSHSDLMALHHALELEYL
Another object of the present invention consists in a polypeptide which is encoded by the polynucleotide of SEQ ID N°5, said polypeptide being named EYA2 and having the sequence SEQ ID N°9.
Another object of the present invention consists in a polypeptide which is encoded by the polynucleotide of SEQ ID N°6, said polypeptide being named EYA3 and having the sequence SEQ ID N°10.
Another object of the present invention consists in a polypeptide which is selected from the group consisting of:
(a) a polypeptide comprising any one of the amino acid sequences SEQ ID N° 7, SEQ ID N°8, SEQ ID N°9 or SEQ ID N°10;
(b) a peptide fragment of at least 10 amino acids in length of the polypeptide defined in (a), said peptide fragment being recognized by a polyclonal or a monoclonal antibody directed against the polypeptide defined in (a);
(c) a polypeptide comprising, with regard to the polypeptide defined in (a) or (b), at least one modification by addition or substitution of an amino acid, said modified polypeptide being recognized by an antibody directed against the polypeptide defined in (a).
A further object of the present invention consists in peptide fragments of the above-disclosed polypeptides that may be obtained by cleavage of the polypeptides of SEQ ID N° 7, SEQ ID N°8, SEQ ID N°9 or SEQ ID N°10 with a proteolytic enzyme such as trypsin, chymotrypsin, collagenase, clostripain, Myxobacter protease, proline endopeptidase, Staphylococcal protease, trypsin having the lysine residues blocked, trypsin having the arginine residues blocked or the endoproteinase Asp-N. Fragments of the polypeptides according to the invention may also be obtained by placing the polypeptide in a very acidic solution (pH 2.5) or by cleavage using chemical reagents such as cyanogen bromide or iodobenzoate.
Preferred peptide fragments according to the present invention are the fragments that are recognized by antibodies directed respectively to the polypeptides of SEQ ID N° 7, SEQ ID N°8, SEQ ID N°9 or SEQ ID N°10. Such peptide fragments advantageously have a length of at least 20 amino acids.
Such peptide fragments may be prepared either by chemical synthesis (Houbenweyl et al., 1974) or from a host cell transformed with an expression vector containing a nucleic acid allowing the expression of said peptide fragments, when the peptide-encoding sequence is placed under the control of appropriate regulation and/or expression elements. These peptide fragments may also be generated by chemical or enzymatic cleavage as described above.
Another object of the present invention consists in polypeptides that are homologous to any of the polypeptides of SEQ ID N° 7, SEQ ID N°8, SEQ ID N°9 or SEQ ID N°10. By homologous peptide according to the present invention is meant a polypeptide containing one or several amino acid additions, deletions and/or substitutions in either SEQ ID N° 7, SEQ ID N°8, SEQ ID N°9 or SEQ ID N°10. In the case of an amino acid substitution, one or several consecutive or non-consecutive amino acids are replaced by « equivalent » amino acids. The expression « equivalent » amino acid is used herein to name any amino acid that may be substituted for one of the amino acids belonging to the initial polypeptide structure without modifying the antigenic properties of the corresponding peptides. In other words, the « equivalent » amino acids are those which allow a polypeptide to be generated or obtained with a modified sequence with regard to SEQ ID N° 7, SEQ ID N°8, SEQ ID N°9 or SEQ ID N°10, said modified polypeptide being able to induce antibodies recognizing the parent polypeptide of either SEQ ID N° 7, SEQ ID N°8, SEQ ID N°9 or SEQ ID N°10.
These equivalent amino acids may be determined either by their structural homology with the initial amino acids to be replaced or by the results of cross-immunogenicity assays between the parent peptides and their modified counterparts.
As an illustrative example, substitutions may be performed without a deep change in the immunogenicity of the corresponding modified peptides by replacing, for example, leucine by valine or isoleucine, aspartic acid by glutamic acid, glutamine by asparagine, arginine by lysine, etc., it being understood that the reverse substitutions are permitted under the same conditions.
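The example substitutions listed above can be captured in a small lookup, sketched below with one-letter amino acid codes. Only the pairs named in the text are included, each being treated as reversible as the text allows; the names and the comparison helper are illustrative assumptions, and the list is an example rather than an exhaustive definition of « equivalent » amino acids.

```python
# Example equivalent pairs named in the text (one-letter codes), stored as
# unordered pairs so that the reverse substitution is also accepted.
EQUIVALENT_PAIRS = {
    frozenset("LV"),  # leucine <-> valine
    frozenset("LI"),  # leucine <-> isoleucine
    frozenset("DE"),  # aspartic acid <-> glutamic acid
    frozenset("QN"),  # glutamine <-> asparagine
    frozenset("RK"),  # arginine <-> lysine
}


def are_equivalent(residue_a: str, residue_b: str) -> bool:
    """True when the two residues form one of the example pairs."""
    return frozenset((residue_a.upper(), residue_b.upper())) in EQUIVALENT_PAIRS


def non_equivalent_positions(parent: str, modified: str) -> list:
    """0-based positions where the peptides differ by a non-listed substitution."""
    if len(parent) != len(modified):
        raise ValueError("peptides must have the same length")
    return [
        i
        for i, (p, m) in enumerate(zip(parent.upper(), modified.upper()))
        if p != m and not are_equivalent(p, m)
    ]
```

A modified peptide for which `non_equivalent_positions` returns an empty list differs from its parent only by the example substitutions; whether its antigenic properties are actually retained would still have to be established experimentally.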
The present invention is also directed to a nucleic acid encoding a polypeptide of SEQ ID N° 7, SEQ ID N°8, SEQ ID N°9 or SEQ ID N°10 according to the invention, a polypeptide homologous to the latter, or a peptide fragment of any of these, as defined herein above.
The invention also pertains to nucleotide fragments that are complementary to the polynucleotides according to the invention, as well as nucleotide fragments that are modified, with regard to the latter, by deletion or addition of one or several nucleotides in a ratio of about 15% of the length of the nucleic fragments and/or modified by substitution of one or several nucleotides, provided that the modified nucleic fragments retain their ability to hybridize with any of the polynucleotides of SEQ ID N°1, SEQ ID N°2, SEQ ID N°3, SEQ ID N°4, SEQ ID N°5 or SEQ ID N°6 under the hybridization conditions described above.
Advantageously, a nucleic fragment as defined herein above has a length of at least 8 nucleotides, which is the minimal length that has been determined to allow a specific hybridization with either SEQ ID N°1, SEQ ID N°2, SEQ ID N°3, SEQ ID N°4, SEQ ID N°5 or SEQ ID N°6. Preferably the nucleic fragment has a length of at least 12 nucleotides and more preferably 20 consecutive nucleotides of any of the SEQ ID N°1, SEQ ID N°2, SEQ ID N°3, SEQ ID N°4, SEQ ID N°5 or SEQ ID N°6.
These nucleic fragments may be used as primers for use in amplification reactions, or as nucleic probes.
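The length criteria above (at least 8, preferably at least 12, more preferably 20 consecutive nucleotides) can be illustrated by a minimal sketch that enumerates the consecutive fragments of a reference sequence and checks a candidate fragment. The function names are illustrative assumptions, and any sequence passed to them in an example is a hypothetical placeholder, not one of the sequences of the invention.

```python
MIN_FRAGMENT_LENGTH = 8  # minimal length stated in the text


def consecutive_fragments(reference: str, length: int) -> list:
    """All windows of `length` consecutive nucleotides of the reference."""
    if length < 1:
        raise ValueError("length must be positive")
    return [reference[i:i + length] for i in range(len(reference) - length + 1)]


def is_valid_fragment(candidate: str, reference: str,
                      min_length: int = MIN_FRAGMENT_LENGTH) -> bool:
    """True when the candidate is long enough and is a run of consecutive
    nucleotides of the reference (same strand only in this sketch)."""
    candidate, reference = candidate.upper(), reference.upper()
    return len(candidate) >= min_length and candidate in reference
```

Calling `is_valid_fragment` with `min_length=12` or `min_length=20` applies the preferred thresholds instead of the minimal one.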
Thus, the polynucleotides of SEQ ID N°1, SEQ ID N°2, SEQ ID N°3, SEQ ID N°4, SEQ ID N°5 or SEQ ID N°6, or the nucleic fragments obtained from such polynucleotides, may be used to select nucleotide primers, notably for an amplification reaction such as the amplification reactions further described.
PCR is described in the US patent N° 4,683,202. The amplified fragments may be identified by agarose or polyacrylamide gel electrophoresis, by capillary electrophoresis or alternatively by a chromatography technique (gel filtration, hydrophobic chromatography or ion exchange chromatography). The specificity of the amplification may be ensured by a molecular hybridization using as nucleic probes the polynucleotides SEQ ID N°1, SEQ ID N°2, SEQ ID N°3, SEQ ID N°4, SEQ ID N°5 or SEQ ID N°6, fragments thereof, oligonucleotides that are complementary to these polynucleotides or fragments thereof, or their amplification products themselves.
Amplified nucleotide fragments are used as probes in hybridization reactions in order to detect the presence of one polynucleotide according to the present invention or in order to detect mutations in SEQ ID N°1, SEQ ID N°2, SEQ ID N°3, SEQ ID N°4, SEQ ID N°5 or SEQ ID N°6.
Another object of the present invention consists in the amplified nucleic fragments (« amplicons ») defined herein above.
These probes and amplicons may be radioactively or non-radioactively labeled, using for example enzymes or fluorescent compounds.
Preferred nucleic acid fragments that can serve as primers according to the present invention are those of SEQ ID N° 11 to SEQ ID N° 47.
Such nucleic acid fragments may be used as pairs in order to amplify specific regions of SEQ ID N°1, SEQ ID N°2, SEQ ID N°3, SEQ ID N°4, SEQ ID N°5 or SEQ ID N°6. Illustrative pairs of primers according to the present invention that can be used in a nucleic acid amplification reaction are the following:
Pair 1: SEQ ID N° 11, SEQ ID N° 12
This pair of oligonucleotides is used to amplify the full-length coding region of SEQ ID N°2.
Pair 2: SEQ ID N° 13, SEQ ID N° 14
This pair of oligonucleotides is used as internal primers to amplify exons E and F of SEQ ID N°2.
Pair 3: SEQ ID N° 15, SEQ ID N° 16
This pair of oligonucleotides is used to amplify exon z of SEQ ID N°2.
Pair 4: SEQ ID N° 17, SEQ ID N° 18
This pair of oligonucleotides is used to amplify exon A of SEQ ID N°2.
Pair 5: SEQ ID N° 19, SEQ ID N° 20
This pair of oligonucleotides is used to amplify exon B of SEQ ID N°2.
Pair 6: SEQ ID N° 21, SEQ ID N° 22
This pair of oligonucleotides is used to amplify exon C of SEQ ID N°2.
Pair 7: SEQ ID N° 23, SEQ ID N° 24
This pair of oligonucleotides is used to amplify exon D of SEQ ID N°2.
Pair 8: SEQ ID N° 25, SEQ ID N° 26
This pair of oligonucleotides is used to amplify exons E and F of SEQ ID N°2.
Pair 10: SEQ ID N° 27, SEQ ID N° 28
This pair of oligonucleotides is used to amplify exon G of SEQ ID N°2.
Pair 11: SEQ ID N° 30, SEQ ID N° 31
This pair of oligonucleotides is used to amplify exon R of SEQ ID N°2.
Pair 12: SEQ ID N° 32, SEQ ID N° 33
This pair of oligonucleotides is used to amplify exon S of SEQ ID N°2.
Pair 13: SEQ ID N° 34, SEQ ID N° 35
This pair of oligonucleotides is used to amplify exon T of SEQ ID N°2.
Pair 14: SEQ ID N° 36, SEQ ID N° 37
This pair of oligonucleotides is used to amplify exon U of SEQ ID N°2.
Pair 15: SEQ ID N° 38, SEQ ID N° 39
This pair of oligonucleotides is used to amplify exon V of SEQ ID N°2.
Pair 16: SEQ ID N° 40, SEQ ID N° 41
This pair of oligonucleotides is used to amplify exon W of SEQ ID N°2.
Pair 17: SEQ ID N° 42, SEQ ID N° 43
This pair of oligonucleotides is used to amplify exon X of SEQ ID N°2.
Pair 18: SEQ ID N° 44, SEQ ID N° 45
This pair of oligonucleotides is used to amplify exon Y of SEQ ID N°2.
Pair 19: SEQ ID N° 46, SEQ ID N° 47
This pair of oligonucleotides is used to amplify exon H of SEQ ID N°2.
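The primer pairs listed above can be held in a simple lookup table, sketched below, so that the pair or pairs covering a given exon can be retrieved. The pair numbers, SEQ ID numbers and targets are copied from the list; the data layout and function names are illustrative assumptions. The list contains no Pair 9 and no SEQ ID N° 29, and none is invented here.

```python
# pair number -> (SEQ ID numbers of the two primers, amplified exons).
# An empty exon tuple marks Pair 1, which targets the full-length coding region.
PRIMER_PAIRS = {
    1: ((11, 12), ()),
    2: ((13, 14), ("E", "F")),   # used as internal primers
    3: ((15, 16), ("z",)),
    4: ((17, 18), ("A",)),
    5: ((19, 20), ("B",)),
    6: ((21, 22), ("C",)),
    7: ((23, 24), ("D",)),
    8: ((25, 26), ("E", "F")),
    10: ((27, 28), ("G",)),
    11: ((30, 31), ("R",)),
    12: ((32, 33), ("S",)),
    13: ((34, 35), ("T",)),
    14: ((36, 37), ("U",)),
    15: ((38, 39), ("V",)),
    16: ((40, 41), ("W",)),
    17: ((42, 43), ("X",)),
    18: ((44, 45), ("Y",)),
    19: ((46, 47), ("H",)),
}


def pairs_for_exon(exon: str) -> list:
    """Pair numbers whose amplified region includes the given exon."""
    return sorted(n for n, (_, exons) in PRIMER_PAIRS.items() if exon in exons)


def seq_ids_for_pair(pair_number: int) -> tuple:
    """SEQ ID numbers of the two primers of a pair."""
    return PRIMER_PAIRS[pair_number][0]
```

For instance, exons E and F are each covered by two pairs (2 and 8), whereas every other listed exon is covered by a single pair.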
The primers may also be used as oligonucleotide probes to specifically detect a polynucleotide according to the invention.
Other techniques related to nucleic acid amplification may also be used and are generally preferred to the PCR technique.
The Strand Displacement Amplification (SDA) technique (Walker et al., 1992) is an isothermal amplification technique based on the ability of a restriction enzyme to cleave one of the strands at its recognition site (which is in a hemiphosphorothioate form), on the property of a DNA polymerase to initiate the synthesis of a new strand from the 3'OH end generated by the restriction enzyme, and on the property of this DNA polymerase to displace the previously synthesized strand located downstream. The SDA method comprises two main steps:
a) the synthesis, in the presence of dCTP-alpha-S, of DNA molecules that are flanked by restriction sites that may be cleaved by an appropriate enzyme;
b) the exponential amplification of these modified DNA molecules by enzyme cleavage, strand displacement and copying of the displaced strands. The steps of cleavage, strand displacement and copying are repeated a sufficient number of times to obtain adequate sensitivity of the assay.
The SDA technique was initially carried out using the restriction endonuclease HincII but is now generally practiced with an endonuclease from Bacillus stearothermophilus (BsoBI) and a fragment of a DNA polymerase devoid of any 5'→3' exonuclease activity isolated from Bacillus caldotenax (exo- Bca). Both enzymes are able to operate at 60°C, and the system is now optimized to allow the use of dUTP and decontamination by UDG. When using this technique, as described by Spargo et al. in 1996, the doubling time of the target DNA is 26 seconds and the amplification rate is 10^10 after an incubation.
The SDA amplification technique is easier to perform than PCR (only a single thermostated water bath is necessary) and is faster than the other amplification methods.
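As an illustrative check of the figures quoted above, the following minimal sketch relates the 26 second doubling time to an amplification of the order of 10^10. It assumes ideal, uninterrupted exponential doubling and an incubation time of 15 minutes (the duration used in the reaction conditions given further on); both are simplifying assumptions and not measured values.

```python
import math

# Doubling time quoted in the text (Spargo et al., 1996).
doubling_time_s = 26.0
# Assumption: 15 min incubation, as in the reaction conditions of this description.
incubation_min = 15.0

# Number of doublings that fit in the incubation, assuming ideal exponential growth.
doublings = incubation_min * 60.0 / doubling_time_s
fold_amplification = 2.0 ** doublings

print(f"doublings: {doublings:.1f}")
print(f"fold amplification: {fold_amplification:.2e}")
print(f"log10(fold amplification): {math.log10(fold_amplification):.1f}")
```

Under these assumptions the computed amplification falls between 10^10 and 10^11, which is consistent with the order of magnitude stated for the technique.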
Thus, another object of the present invention consists in using the nucleic acid fragments according to the invention (primers) in a method of DNA or RNA amplification according to the SDA technique. To perform SDA, two pairs of primers are used: a pair of external primers (B1, B2) consisting of a sequence specific for the target polynucleotide of interest, and a pair of internal primers (S1, S2) consisting of a fusion oligonucleotide carrying a site that is recognized by a restriction endonuclease, for example the enzyme BsoBI.
As an illustrative embodiment of the use of the primers according to the invention in an SDA amplification reaction, a sequence that is not specific for the target polynucleotide and that carries a restriction site for HincII or BsoBI is added at the 5' end of a primer specific for SEQ ID N°1, SEQ ID N°2, SEQ ID N°3, SEQ ID N°4, SEQ ID N°5 or SEQ ID N°6. Such an additional sequence containing a restriction site that is recognized by BsoBI is advantageously the following sequence: GCATCGAATGCATGTCTCGGGT, the nucleotides represented in bold characters corresponding to the recognition site of the enzyme BsoBI. Thus, primers useful for performing SDA amplification may be designed from any of the primers according to the invention as described above and are part of the present invention. The operating conditions to perform SDA with such primers are described in Spargo et al., 1996.
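The construction of such a fusion primer may be sketched as follows. The sketch assumes that BsoBI recognizes the degenerate site CYCGRG (Y = C or T, R = A or G); the function name and the target-specific sequence are hypothetical and serve for illustration only, the latter not being taken from SEQ ID N°1 to SEQ ID N°6.

```python
import re

# Additional 5' sequence quoted in the text, carrying the BsoBI site.
BSOBI_TAIL = "GCATCGAATGCATGTCTCGGGT"

# Assumption: BsoBI recognition site CYCGRG (Y = C/T, R = A/G).
BSOBI_SITE = re.compile(r"C[CT]CG[AG]G")


def make_sda_primer(target_specific: str, tail: str = BSOBI_TAIL) -> str:
    """Prepend the restriction-site tail to the 5' end of a target-specific primer."""
    if BSOBI_SITE.search(tail) is None:
        raise ValueError("the tail does not carry a BsoBI recognition site")
    return tail + target_specific.upper()


# Hypothetical target-specific sequence, for illustration only.
example_target = "ACGTTGCAAGGCTTAC"
primer = make_sda_primer(example_target)

# Locate the recognition site inside the tail.
site = BSOBI_SITE.search(BSOBI_TAIL)
print(primer)
print(site.group(), "at offset", site.start())
```

With the tail sequence given above, the search locates the hexamer CTCGGG near the 3' end of the tail, immediately upstream of the target-specific portion of the primer.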
More specifically, the following conditions are used when performing the SDA amplification reaction with the primers of the invention designed to contain a BsoBI restriction site:
BsoBI/exo- Bca SDA reactions are performed in a 50 μl volume with final concentrations of 9.5 mM MgCl2, 1.4 mM each dGTP, dATP, TTP, dCTP-alpha-S, 100 μg/ml acetylated bovine serum albumin, 10 ng/ml human placental DNA, 35 mM K2HPO4 pH 7.6, 0.5 μM primers S1 BsoBI and S2 BsoBI, 0.05 μM primers B1 BsoBI and B2 BsoBI, 3.2 U/μl BsoBI enzyme, 0.16 U/μl exo- Bca enzyme, 3 mM Tris-HCl, 11 mM NaCl, 0.3 mM DTT, 4 mM KCl, 4% glycerol, 0.008 mM EDTA and varying amounts of target DNA. Prior to the addition of BsoBI and exo- Bca, incomplete reactions (35 μl) are heated at 95°C for 3 min to denature the target DNA, followed by 3 min at 60°C to anneal the primers. Following the addition of a 15 μl enzyme mix consisting of 4 μl of BsoBI (40 Units/μl), 0.36 μl of exo- Bca (22 Units/μl), and 10.6 μl of enzyme dilution buffer (10 mM Tris-HCl, 10 mM MgCl2, 50 mM NaCl, 1 mM DTT), the reactions are incubated at 60°C for 15 min. Amplification is terminated by heating for 5 min in a boiling water bath. A non-SDA sample is created by heating a sample in a boiling water bath immediately after enzyme addition. Aerosol-resistant tips from Continental Laboratory Products are used to reduce contamination of SDA reactions with previously amplified products.
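The volumes and enzyme concentrations stated above can be cross-checked with a short calculation. The sketch below reads the enzyme mix volumes as 4 μl, 0.36 μl and 10.6 μl and the stock concentrations as 40 and 22 Units/μl; the variable names are illustrative, and the calculation is only a consistency check, not part of the protocol.

```python
final_volume_ul = 50.0       # stated final reaction volume
pre_enzyme_volume_ul = 35.0  # incomplete reaction before enzyme addition

# Enzyme mix: name -> (volume in µl, stock concentration in U/µl, or None for buffer).
enzyme_mix = {
    "BsoBI": (4.0, 40.0),
    "exo- Bca": (0.36, 22.0),
    "dilution buffer": (10.6, None),
}

# Total volume of the enzyme mix and of the assembled reaction.
mix_volume_ul = sum(volume for volume, _ in enzyme_mix.values())
total_volume_ul = pre_enzyme_volume_ul + mix_volume_ul

# Final enzyme concentrations after dilution into the 50 µl reaction.
final_conc = {
    name: volume * stock / final_volume_ul
    for name, (volume, stock) in enzyme_mix.items()
    if stock is not None
}

print(f"enzyme mix volume: {mix_volume_ul:.2f} µl")
print(f"assembled reaction volume: {total_volume_ul:.2f} µl")
for name, conc in final_conc.items():
    print(f"{name}: {conc:.2f} U/µl")
```

The enzyme mix adds up to approximately 15 μl, and the diluted enzymes come to approximately 3.2 U/μl for BsoBI and 0.16 U/μl for exo- Bca, in agreement with the stated final concentrations.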
The polynucleotides of SEQ ID N°1, SEQ ID N°2, SEQ ID N°3, SEQ ID N°4, SEQ ID N°5 or SEQ ID N°6 and their above described fragments, especially the primers according to the invention, are useful as technical means for performing different target nucleic acid amplification methods such as:
- TAS (Transcription-based Amplification System), described by Kwoh et al. in 1989;
- 3SR (Self-Sustained Sequence Replication), described by Guatelli et al. in 1990;
- NASBA (Nucleic Acid Sequence Based Amplification), described by Kievits et al. in 1991;
- TMA (Transcription Mediated Amplification).
The polynucleotides of SEQ ID N°1, SEQ ID N°2, SEQ ID N°3, SEQ ID N°4, SEQ ID N°5 or SEQ ID N°6 and their above described fragments, especially the primers according to the invention, are also useful as technical means for performing methods for amplification or modification of a nucleic acid used as a probe, such as:
- LCR (Ligase Chain Reaction), described by Landegren et al. in 1988 and improved by Barany et al. in 1991, who employ a thermostable ligase;
- RCR (Repair Chain Reaction), described by Segev et al. in 1992;
- CPR (Cycling Probe Reaction), described by Duck et al. in 1990;
- Q-beta replicase reaction, described by Miele et al. in 1983 and improved by Chu et al. in 1986, Lizardi et al. in 1988 and by Burg et al. and Stone et al. in 1996.
When the target polynucleotide to be detected is an RNA, for example an mRNA, a reverse transcriptase enzyme will be used before the amplification reaction in order to obtain a cDNA from the RNA contained in the biological sample. The generated cDNA is subsequently used as the nucleic acid target for the primers or the probes used in an amplification process or a detection process according to the present invention.
Nucleic acid probes according to the present invention specifically detect a polynucleotide of the invention. By « specific probes » according to the invention is meant any oligonucleotide that hybridizes with one polynucleotide of SEQ ID N°1, SEQ ID N°2, SEQ ID N°3, SEQ ID N°4, SEQ ID N°5 or SEQ ID N°6 and which does not hybridize with unrelated sequences. Preferred oligonucleotide probes according to the invention are at least 8 nucleotides in length, and more preferably have a length of between 8 and 300 nucleotides.
The oligonucleotide probes according to the present invention hybridize specifically, under stringent conditions, with a DNA or RNA molecule comprising all or part of one polynucleotide among SEQ ID N°1, SEQ ID N°2, SEQ ID N°3, SEQ ID N°4, SEQ ID N°5 or SEQ ID N°6.
As an illustrative embodiment, the stringent hybridization conditions used in order to specifically detect a polynucleotide according to the present invention are advantageously the following:
The hybridization step is carried out at 65°C in the presence of 6 x SSC buffer, 5 x Denhardt's solution, 0.5% SDS and 100 μg/ml of salmon sperm DNA.
The hybridization step is followed by four washing steps:
- two washings of 5 min each, preferably at 65°C in a 2 x SSC and 0.1% SDS buffer;
- one washing of 30 min, preferably at 65°C in a 2 x SSC and 0.1% SDS buffer;
- one washing of 10 min, preferably at 65°C in a 0.1 x SSC and 0.1% SDS buffer.
The non-labeled polynucleotides or oligonucleotides of the invention may be directly used as probes. Nevertheless, the polynucleotides or oligonucleotides are generally labeled with a radioactive element (32P, 35S, 3H, 125I) or with a non-isotopic molecule (for example, biotin, acetylaminofluorene, digoxigenin, 5-bromodeoxyuridine, fluorescein) in order to generate probes that are useful for numerous applications.
Examples of non-radioactive labeling of nucleic acid fragments are described in the French patent N° FR-7810975 or by Urdea et al. or Sanchez-Pescador et al., 1988.
In the latter case, other labeling techniques may also be used, such as those described in the French patents FR-2,422,956 and 2,518,755. The hybridization step may be performed in different ways (Matthews et al., 1988). The most general method consists in immobilizing the nucleic acid that has been extracted from the biological sample on a substrate (nitrocellulose, nylon, polystyrene) and then incubating, in defined conditions, the target nucleic acid with the probe. Subsequently to the hybridization step, the excess amount of the specific probe is discarded and the hybrid molecules formed are detected by an appropriate method (radioactivity, fluorescence or enzyme activity measurement).
Advantageously, the probes according to the present invention may have structural characteristics such that they allow signal amplification, such structural characteristics being, for example, those of branched DNA probes such as those described by Urdea et al. in 1991 or in the European patent N° EP-0225,807 (Chiron).
In another advantageous embodiment of the probes according to the present invention, the latter may be used as « capture probes », and are for this purpose immobilized on a substrate in order to capture the target nucleic acid contained in a biological sample. The captured target nucleic acid is subsequently detected with a second probe which recognizes a sequence of the target nucleic acid which is different from the sequence recognized by the capture probe.
The oligonucleotide fragments useful as probes or primers according to the present invention may be prepared by cleavage of the polynucleotides of SEQ ID N°1, SEQ ID N°2, SEQ ID N°3, SEQ ID N°4, SEQ ID N°5 or SEQ ID N°6 by restriction enzymes, the person skilled in the art being guided by the restriction maps presented in the annexes 2 to 6 of the instant specification.
Another appropriate preparation process of the nucleic acids of the invention containing at most 200 nucleotides (or 200 bp if these molecules are double stranded) comprises the following steps :
- synthesizing DNA using the automated beta-cyanoethylphosphoramidite method described in 1986;
- cloning the thus obtained nucleic acids in an appropriate vector;
- purifying the nucleic acid by hybridizing an appropriate probe according to the present invention.
A chemical method for producing the nucleic acids according to the invention which have a length of more than 200 nucleotides (or 200 bp if these molecules are double stranded) comprises the following steps :
- assembling the chemically synthesized oligonucleotides, having different restriction sites at each end ;
- cloning the thus obtained nucleic acids in an appropriate vector ;
- purifying the nucleic acid by hybridizing an appropriate probe according to the present invention.
In the case in which the above nucleic acids are used as coding sequences in order to produce a polypeptide according to the present invention, it is important to ensure that their sequences are compatible (in the appropriate reading frame) with the amino acid sequence of the polypeptide to be produced.
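A simple way of verifying reading-frame compatibility before cloning may be sketched as follows. The sketch only checks that the length of the insert is a multiple of three and that no stop codon of the standard genetic code occurs in frame before the last codon; the function name and the test sequences are hypothetical and are not taken from the sequences of the invention.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}  # stop codons of the standard genetic code


def check_reading_frame(cds: str) -> list[str]:
    """Return the problems found in a candidate coding sequence (empty list if none)."""
    cds = cds.upper()
    problems = []
    if len(cds) % 3 != 0:
        problems.append("length is not a multiple of 3")
    # Split into complete codons, ignoring any trailing partial codon.
    usable = len(cds) - len(cds) % 3
    codons = [cds[i:i + 3] for i in range(0, usable, 3)]
    # A stop codon anywhere before the last codon would truncate the polypeptide.
    internal_stops = [i for i, codon in enumerate(codons[:-1]) if codon in STOP_CODONS]
    if internal_stops:
        problems.append(f"in-frame stop codon at codon index {internal_stops}")
    return problems


# Hypothetical sequences, for illustration only.
in_frame = check_reading_frame("ATGGCTGAATAA")
premature_stop = check_reading_frame("ATGTAAGCTTAA")
frame_shifted = check_reading_frame("ATGGCTT")
print(in_frame, premature_stop, frame_shifted)
```

A complete verification would additionally translate the insert and compare the result with the expected amino acid sequence, for example SEQ ID N°7 to SEQ ID N°10.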
The oligonucleotide probes according to the present invention may also be used in a detection device comprising a matrix library of probes immobilized on a substrate, the sequence of each probe of a given length being shifted by one or several bases relative to the next, each probe of the matrix library thus being complementary to a distinct sequence of the target nucleic acid. Optionally, the substrate of the matrix may be a material able to act as an electron donor, the matrix positions in which a hybridization has occurred being subsequently determined by an electronic device. Such matrix libraries of probes and methods of specific detection of a target nucleic acid are described in the European patent application N° EP-0713,016 (Affymax technologies) and also in the US patent N° US-5,202,231 (Drmanac).
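The layout of such a matrix library may be illustrated by the following minimal sketch, which enumerates probes of a given length shifted by a fixed number of bases along a target sequence, each probe being the reverse complement of its window. The function name, the probe length and the target sequence are hypothetical and are chosen for illustration only.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def overlapping_probes(target: str, probe_len: int, step: int = 1) -> list[tuple[int, str]]:
    """Return (start position, probe) pairs; each probe is the reverse complement of its window."""
    target = target.upper()
    probes = []
    for start in range(0, len(target) - probe_len + 1, step):
        window = target[start:start + probe_len]
        probes.append((start, window.translate(COMPLEMENT)[::-1]))
    return probes


# Hypothetical target sequence, for illustration only.
target = "AAGCTTGGCA"
probes = overlapping_probes(target, probe_len=8, step=1)
for start, probe in probes:
    print(start, probe)
```

With a shift of one base, a target of length L gives L - n + 1 probes of length n, so that every position of the target is covered by several overlapping probes.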
An oligonucleotide probe matrix may advantageously be used to detect mutations occurring in the eya1, eya2 or eya3 gene. For this particular purpose, probes are specifically designed to have a nucleotide sequence allowing their hybridization to the genes that carry known mutations (either by deletion, insertion or substitution of one or several nucleotides). By known mutations is meant mutations in the eya1, eya2 or eya3 gene that have been identified according, for example, to the technique used in Example 4. Specifically, probes are designed to hybridize with the mutated sequences depicted in Table 1.
Another technique that is used to detect mutations in the eya1, eya2 or eya3 gene is the use of a high-density DNA array. Each oligonucleotide probe constituting a unit element of the high-density DNA array is designed to match a specific subsequence of the eya1, eya2 or eya3 genomic DNA or cDNA. Thus, an array consisting of oligonucleotides complementary to subsequences of the target gene sequence is used to determine the identity of the target sequence with the wild-type gene sequence, to measure its amount, and to detect differences between the target sequence and the reference wild-type sequence of the eya1, eya2 or eya3 gene. One such design, termed a 4L tiled array, implements sets of four probes (A, C, G, T), preferably 15-nucleotide oligomers. In each set of four probes, the perfect complement will hybridize more strongly than the mismatched probes. Consequently, a nucleic acid target of length L is scanned for mutations with a tiled array containing 4L probes, the whole probe set containing all the possible mutations in the known wild-type reference sequence. The hybridization signals of the 15-mer probe set tiled array are perturbed by a single base change in the target sequence. As a consequence, there is a characteristic loss of signal, or « footprint », for the probes flanking a mutation position. This technique was described by Chee et al. in 1996.
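The principle of the 4L tiled array may be sketched as follows. For each interrogated position of a reference sequence, the sketch builds a set of four 15-mers that differ only at their central base. It is a simplified illustration under stated assumptions: sequences are written in the sense of the reference (the physical probes would be their complements), only positions having seven flanking bases on each side are interrogated, and the function name and reference sequence are hypothetical.

```python
BASES = "ACGT"


def tiled_array_4l(reference: str, probe_len: int = 15) -> list[tuple[int, dict[str, str]]]:
    """For each interior position, return four probes differing only at the central base."""
    ref = reference.upper()
    half = probe_len // 2
    array = []
    # Positions closer than `half` bases to either end lack flanking sequence and are skipped.
    for pos in range(half, len(ref) - half):
        window = ref[pos - half:pos + half + 1]
        probe_set = {base: window[:half] + base + window[half + 1:] for base in BASES}
        array.append((pos, probe_set))
    return array


# Hypothetical reference sequence, for illustration only.
reference = "ACGTACGTACGTACGTACGT"
array = tiled_array_4l(reference)
for pos, probe_set in array:
    # The probe carrying the reference base at its center matches the reference window.
    print(pos, reference[pos], probe_set[reference[pos]])
```

In each set, the probe carrying the reference base corresponds to the perfect match for a wild-type target, whereas a single base change in the target shifts the strongest signal to one of the three other probes of the set and weakens the signal of the neighbouring sets that overlap the changed position.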
Another object of the present invention consists in hybrid molecules resulting from:
- the hybrid formation between a DNA (genomic DNA or cDNA) or an RNA contained in a biological sample with a nucleic probe or primer according to the present invention;
- the hybrid formation between a DNA (genomic DNA or cDNA) or an RNA contained in a biological sample with an amplified nucleic fragment obtained by the use of a pair of primers according to the present invention.
By cDNA according to the present invention is meant a DNA molecule that has been obtained by incubating an RNA molecule in the presence of an enzyme having a reverse transcriptase activity, as described by Sambrook et al. in 1989.
The present invention also pertains to a family of recombinant plasmids containing at least one nucleic acid according to the above teachings. According to an advantageous embodiment, a recombinant plasmid comprises a polynucleotide of SEQ ID N°1, SEQ ID N°2, SEQ ID N°4 or SEQ ID N°6, or one nucleic fragment thereof.
Another object of the present invention consists in an appropriate vector for cloning, expressing or inserting a nucleic sequence, wherein the vector comprises a nucleic acid as described above in a site that is non-essential for its replication, optionally under the control of regulation elements allowing the expression of a polypeptide of the invention.
Particular vectors used are plasmids, phages, cosmids, phagemids, PACs (P1-derived Artificial Chromosomes) and YACs (Yeast Artificial Chromosomes). As plasmids, pUC vectors are preferred.
These vectors are useful for transforming or transfecting cell hosts in order to clone or express the nucleic acids of the invention.
It is now easy to produce proteins in high amounts by genetic engineering techniques using plasmids, phages or phagemids as expression vectors. The polynucleotide that codes for a polypeptide of the present invention is inserted in an appropriate expression vector in order to produce the polypeptide of interest in vitro.
Thus, the present invention also concerns a method for producing a polypeptide of the invention, and especially a polypeptide of SEQ ID N° 7, SEQ ID N°8, SEQ ID N°9 or SEQ ID N°10, the said method comprising the steps of: a) optionally amplifying the nucleic acid coding for the desired polypeptide using a pair of primers according to the invention (by SDA, TAS, 3SR, NASBA, TMA, etc.); b) inserting the nucleic acid of interest in an appropriate vector; c) growing, in an appropriate culture medium, a cell host previously transformed or transfected with the recombinant vector of step b); d) harvesting the culture medium thus conditioned or lysing the cell host, for example by sonication or by osmotic shock; e) separating or purifying, from the said culture medium or from the pellet of the resultant host cell lysate, the thus produced polypeptide of interest; f) characterizing the produced polypeptide of interest.
The polypeptides according to the invention may be characterized by binding onto an immunoaffinity chromatography column on which polyclonal or monoclonal antibodies directed to a polypeptide among the polypeptides of SEQ ID N°7 to SEQ ID N°10 have previously been immobilized.
The polypeptides according to the invention may also be prepared by the conventional methods of chemical synthesis, either in a homogeneous solution or in solid phase. As an illustrative embodiment of such chemical polypeptide synthesis techniques, the homogeneous solution technique described by Houben-Weyl in 1974 may be cited.
Another object of the present invention consists in a polypeptide produced by genetic engineering techniques or a polypeptide synthesized chemically as described above.
The polypeptides according to the present invention, especially the polypeptides of SEQ ID N° 7, SEQ ID N°8, SEQ ID N°9 or SEQ ID N°10, allow the preparation of polyclonal or monoclonal antibodies that recognize the polypeptides of SEQ ID N° 7, SEQ ID N°8, SEQ ID N°9 or SEQ ID N°10 or fragments thereof. The monoclonal antibodies may be prepared from hybridomas according to the technique described by Kohler and Milstein in 1975. The polyclonal antibodies may be prepared by immunization of a mammal, especially a mouse or a rabbit, with a polypeptide according to the invention that is combined with an adjuvant of immunity, and then by purifying the specific antibodies contained in the serum of the immunized animal on an affinity chromatography column on which the polypeptide that has been used as the antigen has previously been immobilized.
Consequently, the invention is also directed to a method for detecting specifically the presence of a polypeptide according to the invention in a biological sample, said method comprising the following steps : a) bringing into contact the biological sample with an antibody according to the invention; b) detecting the antigen-antibody complex formed.
Also part of the invention is a diagnostic kit for detecting in vitro the presence of a polypeptide according to the present invention in a biological sample, the said kit comprising :
- a polyclonal or monoclonal antibody as described above, optionally labeled;
- a reagent allowing the detection of the antigen-antibody complexes formed, said reagent carrying optionally a label, or being able to be recognized itself by a labeled reagent, more particularly in the case when the above-mentioned monoclonal or polyclonal antibody is not labeled by itself.
Another object of the present invention consists in a method for detecting specifically the presence of a polynucleotide of the invention contained in a biological sample comprising the steps of : a) bringing into contact an oligonucleotide probe according to the invention with a biological sample under appropriate conditions, the DNA contained in the biological sample or the cDNA obtained by reverse transcription of the RNA contained in the biological sample having previously been made available to the hybridization reaction ; b) detecting the hybrid molecule formed between the oligonucleotide probe and the DNA of the biological sample.
Another object of the present invention consists in a method for detecting a genetic abnormality linked to the BO or to the BOR syndrome in a biological sample containing DNA or cDNA, comprising the steps of : a) bringing the biological sample into contact with a pair of oligonucleotide fragments according to the invention, the DNA contained in the sample having been optionally made available to hybridization and under conditions permitting a hybridization of the said oligonucleotide fragments with the DNA contained in the biological sample; b) amplifying the DNA; c) revealing the amplification products; d) optionally detecting a mutation or a deletion by appropriate techniques.
The step d) of the above-described method may consist in a Single-Strand Conformation Polymorphism (SSCP) technique, a Denaturing Gradient Gel Electrophoresis (DGGE), or the FAMA technique such as described in the PCT patent application N° WO-95/07361.
Another object of the present invention consists in a method for detecting a genetic abnormality linked to the BO or to the BOR syndrome in a biological sample containing DNA or cDNA, comprising the steps of: a) bringing the biological sample into contact with an oligonucleotide probe according to the invention, the DNA contained in the sample having been optionally made available to hybridization and under conditions permitting a hybridization of the probe with the DNA contained in the biological sample; b) detecting the hybrid formed between the oligonucleotide probe and the DNA contained in the biological sample.
The present invention also comprises a method for detecting a genetic abnormality linked to the BO or to the BOR syndrome in a biological sample containing DNA, comprising the steps of: a) bringing the biological sample into contact with a first oligonucleotide probe according to the invention that has been immobilized on a support, the DNA contained in the sample having been optionally made available to hybridization and under conditions permitting a hybridization of the probe with the DNA contained in the biological sample; b) bringing into contact the hybrid formed between the immobilized first oligonucleotide probe and the DNA contained in the biological sample with a second oligonucleotide probe according to the invention, which second probe hybridizes with a sequence different from the sequence to which the immobilized first probe hybridizes, optionally after having removed the DNA contained in the biological sample which has not hybridized with the immobilized first oligonucleotide probe.
Another object of the present invention consists in a method for detecting a genetic abnormality linked to the BO or to the BOR syndrome in a biological sample containing DNA, by detecting the presence and the position of base substitutions or base deletions in a nucleotide sequence included in a double stranded DNA preparation to be tested, the said method comprising the steps of: a) amplifying specifically the region containing, on one hand, the nucleotide sequence of the DNA to be tested and, on the other hand, the nucleotide sequence of a DNA of known sequence, the DNA of known sequence being a polynucleotide according to the invention; b) labeling the sense and antisense strands of these DNAs with different fluorescent or other non-isotopic labels; c) hybridizing the amplified DNAs; d) revealing the heteroduplex formed between the DNA of known sequence and the DNA to be tested by cleavage of the mismatched parts of the DNA strands.
Such a mismatch localization technique has been described by Meo et al. in the PCT application N° WO-95/07361.
The invention also pertains to a kit for the detection of a genetic abnormality linked to the BOR syndrome in a biological sample, comprising the following elements: a) a pair of oligonucleotides according to the invention; b) the reagents necessary for carrying out a DNA amplification; c) a component which makes it possible to determine the length of the amplified fragments or to detect a mutation.
The discovery by the inventors of the linkage between the alteration of the eya1 gene and both developmental defects and tumorigenesis of various organs, especially kidneys, related to the BOR syndrome has allowed them to design specific therapeutic compositions for treating Eya1-defect associated disorders, particularly renal disorders, using an active principle which is selected from the group consisting of: a) a purified EYA1 or EYA1-B protein or one of their biologically active derivatives; b) a polynucleotide encoding the EYA1 or EYA1-B protein or one of their biologically active derivatives; c) an antisense polynucleotide hybridizing specifically with the genomic DNA of the EYA1 gene or hybridizing specifically with the mRNA encoding the EYA1 or EYA1-B protein; d) a polyclonal or a monoclonal antibody that specifically binds to the EYA1 or EYA1-B protein.
The above-described therapeutic compositions are also useful to modulate the expression or the biological activity of the translation products of the EYA1 gene in case of organ grafting, and more particularly when embryonic organs are grafted. Such therapeutic compositions are used to ensure a correct development of the embryonic grafted organ, in particular grafted embryonic liver or kidney. These therapeutic compositions may be used in case of transplantation of allo- or xeno-organs.
In a preferred embodiment of the therapeutic compositions of the present invention, the amount of the biologically active peptide component is comprised in the range from 0.1 μg/ml to 10 μg/ml in the body fluid. The dose range is expressed in reference to the bioavailability of the EYA1 or EYA1-B protein.
Another subject of the present invention is a therapeutic composition containing a pharmaceutically effective amount of a polypeptide of SEQ ID N° 7, SEQ ID N°8, SEQ ID N°9 or SEQ ID N°10 according to the present invention, useful in the treatment of renal disorders, particularly renal disorders linked to the BOR syndrome. For clinical use, the purified therapeutic polypeptide according to the present invention can be administered in the form of a solution, a gel or a dry powder. It can be introduced locally, for example at the renal level, or it can be administered systemically, for example intravenously.
Another object of the present invention is a therapeutic composition containing an effective amount of a polynucleotide (RNA, genomic DNA, cDNA) coding for the purified polypeptide of SEQ ID N° 7, SEQ ID N°8, SEQ ID N°9 or SEQ ID N°10.
A suitable vector for the expression of a polypeptide according to the invention is a baculovirus vector that can be propagated in insect cells and in insect cell lines. A specific suitable host vector system is the pVL1392/1393 baculovirus transfer vector (Pharmingen) that is used to transfect the Sf9 cell line (ATCC N°CRL 1711), which is derived from Spodoptera frugiperda.
Another suitable vector, for expression in bacteria and in particular in E. coli, is the pQE-30 vector (QIAexpress) that allows the production of a recombinant protein containing a 6xHis affinity tag. The 6xHis tag is placed at the C-terminus of the recombinant EYA1, EYA1-B, EYA2 or EYA3 protein, which allows a subsequent efficient purification of the recombinant protein by passage onto a nickel or copper affinity chromatography column. The nickel chromatography column may contain the Ni-NTA resin (Porath et al., 1975).
In another embodiment of the therapeutic composition according to the invention, the said composition comprises a polynucleotide coding for the EYA1, EYA1-B, EYA2 or EYA3 polypeptide of interest.
Gene therapy consists in correcting a defect or an anomaly (mutation, aberrant expression, etc.) by the introduction of genetic information in the affected organism. This genetic information may be introduced in vitro in a cell that has been previously extracted from the organism, the modified cell being subsequently reintroduced in the said organism, or directly in vivo into the appropriate tissue.
The method for delivering the corresponding protein or peptide to the interior of a cell of a vertebrate in vivo comprises the step of introducing a preparation comprising a pharmaceutically acceptable injectable carrier and a naked polynucleotide operatively coding for the polypeptide into the interstitial space of a tissue comprising the cell, whereby the naked polynucleotide is taken up into the interior of the cell and has a pharmaceutical effect at the renal, retinal or neuronal level of the vertebrate.
In a specific embodiment, the invention provides a pharmaceutical product comprising a naked polynucleotide operatively coding for the EYA1, EYA1-B, EYA2 or EYA3 protein, in solution in a physiologically acceptable injectable carrier and suitable for introduction interstitially into a tissue to cause cells of the tissue to express the said protein or polypeptide.
Advantageously, the therapeutic composition containing all or part of a polynucleotide of SEQ ID N° 1 to SEQ ID N° 6 is administered locally, near the site to be treated.
The polynucleotide operatively coding for the EYA1, EYA1-B, EYA2 or EYA3 protein may be a vector comprising the genomic DNA or the complementary DNA coding for the corresponding protein or its protein derivative and a promoter sequence allowing the expression of the genomic DNA or the complementary DNA in the desired eukaryotic cells, such as vertebrate cells, specifically mammalian cells.
The vector component of a therapeutic composition according to the present invention is advantageously a plasmid, a part of which is of viral or bacterial origin, which carries a viral or a bacterial origin of replication and a gene allowing its selection, such as an antibiotic resistance gene.
By « vector » according to this specific embodiment of the invention is intended a circular or linear DNA molecule.
This vector may also contain an origin of replication that allows it to replicate in the eukaryotic host cell, such as an origin of replication from a bovine papillomavirus.
The promoter carried by the said vector is advantageously the cytomegalovirus promoter (CMV). Nevertheless, the promoter may also be any other promoter, with the proviso that the said promoter allows an efficient expression of the DNA insert coding for the EYA1, EYA1-B, EYA2 or EYA3 protein within the host.
Thus, the promoter is selected among the group comprising:
- an internal or an endogenous promoter, such as the natural promoter associated with the structural gene coding for EYA1, EYA1-B, EYA2 or EYA3; such a promoter may be completed by a regulatory element derived from the vertebrate host, in particular an activator element;
- a promoter derived from a cytoskeletal protein gene, such as the desmin promoter (Bolmont et al., J. of Submicroscopic Cytology and Pathology, 1990, 22: 117-122; Zhenlin et al., Gene, 1989, 78: 243-254).
As a general feature, the promoter may be heterologous to the vertebrate host, but it is advantageously homologous to the vertebrate host.
By a promoter heterologous to the vertebrate host is intended a promoter that is not found naturally in the vertebrate host.
Therapeutic compositions comprising a polynucleotide are described in the PCT application N° WO 90/11092 (Vical Inc.) and also in the PCT application N° WO 95/11307 (Institut Pasteur, INSERM, Universite d'Ottawa), as well as in the articles of Tascon et al. (1996, Nature Medicine, 2(8): 888-892) and of Huygen et al. (1996, Nature Medicine, 2(8): 893-898).
Other therapeutic compositions according to the present invention advantageously comprise an oligonucleotide fragment of SEQ ID N°1 to SEQ ID N°6 of the invention as an antisense tool that inhibits the expression of the eya1, eya2 or eya3 gene and is thus useful in order to prevent or limit tumor cell proliferation in certain patient organs, specifically in kidney tumors.
The therapeutic compositions described above may be administered to the vertebrate host by a local route such as an intramuscular route.
The therapeutic polynucleotide according to the present invention may be injected into the host after it has been coupled with compounds that promote the penetration of the therapeutic polynucleotide within the cell or its transport to the cell nucleus. The resulting conjugates may be encapsulated in polymer microparticles as described in the PCT application N° WO 94/27238 (Medisorb Technologies International).
In another embodiment, the DNA to be introduced is complexed with DEAE-dextran (Pagano et al., 1967, J. Virol., 1: 891), with nuclear proteins (Kaneda et al., 1989, Science, 243: 375) or with lipids (Felgner et al., 1987, Proc. Natl. Acad. Sci., 84: 7413), or encapsulated within liposomes (Fraley et al., 1980, J. Biol. Chem., 255: 10431).
In another embodiment, the therapeutic polynucleotide may be included in a transfection system comprising polypeptides that promote its penetration within the host cells as described in the PCT application WO 95/10534 (Seikagaku Corporation).
The therapeutic polynucleotide and vector according to the present invention may advantageously be administered in the form of a gel that facilitates their transfection into the cells Such a gel composition may be a complex of poly-L-lysme and lactose, as described by Midoux (1993, Nucleic Acids Research, 21 871 -878) or also poloxamer 407 as described by Pastore (1994, Circulation, 90 1-517) The
therapeutic polynucleotide and vector according to the invention may also be suspended in a buffer solution or be associated with liposomes
Thus, the therapeutic polynucleotide and vector according to the invention are used to prepare pharmaceutical compositions for delivering the DNA (genomic DNA or cDNA) coding for the EYA1 , EYA1-B, EYA2 or EYA3 protein at the site of the injection
The amount of the vector to be injected varies according to the site of injection and also to the kind of disorder to be treated As an indicative dose, it will be injected between 0,1 and 100 μg of the vector in a patient
In another embodiment of the therapeutic polynucleotide according to the invention, this polynucleotide may be introduced in vitro into a host cell, preferably a host cell previously harvested from the patient to be treated and more preferably a somatic cell such as a muscle cell, a renal cell or a neuron. In a subsequent step, the cell that has been transformed with the therapeutic polynucleotide coding for the EYA1, EYA2 or EYA3 protein is implanted back into the patient's body in order to deliver the recombinant protein within the body either locally or systemically.
In a preferred embodiment, gene targeting techniques are used to introduce the therapeutic polynucleotide into the host cell. One of the preferred targeting techniques according to the present invention consists in a process for specific replacement, in particular by targeting the EYA1, EYA1-B, EYA2 or EYA3 protein encoding DNA, called insertion DNA, comprising all or part of the DNA structurally encoding the corresponding protein, when it is recombined with a complementing DNA in order to supply a complete recombinant gene in the genome of the host cell of the patient, characterized in that:
- the site of insertion is located in a selected gene, called the recipient gene, containing the complementing DNA encoding the EYA1, EYA1-B, EYA2 or EYA3 protein, and in that
- the polynucleotide coding for the said protein or one of its biologically active derivatives may comprise:
- « flanking sequences » on either side of the DNA to be inserted, respectively homologous to two genomic sequences which are adjacent to the desired insertion site in the recipient gene,
- the insertion DNA being heterologous with respect to the recipient gene, and
- the flanking sequences being selected from those which constitute the above-mentioned complementing DNA and which allow, as a result of homologous recombination with corresponding sequences in the recipient gene, the reconstitution of a complete recombinant gene in the genome of the eukaryotic cell.
Such a DNA targeting technique is described in the PCT patent application N° WO 90/11354 (Institut Pasteur).
Such a DNA targeting process makes it possible to insert the therapeutic polynucleotide according to the invention downstream of an endogenous promoter which has the desired functions (for example, specificity of expression in the selected target tissue).
According to this embodiment of the invention, the inserted therapeutic polynucleotide may contain, between the flanking sequences and upstream from the open reading frame encoding the EYA1, EYA1-B, EYA2 or EYA3 protein, a sequence carrying a promoter sequence either homologous or heterologous with respect to the EYA1, EYA1-B, EYA2 or EYA3 encoding DNA. The insertion DNA may contain in addition, downstream from the open reading frame and still between the flanking sequences, a gene coding for a selection agent, associated with a promoter making possible its expression in the target cell.
According to this embodiment of the present invention, the vector contains in addition a bacterial origin of replication of the type colE1, pBR322, which makes the cloning and preparation in E. coli possible. A preferred vector is the plasmid pGN described in the PCT application N° WO 90/11354.
Gene therapy methods other than those using homologous recombination may also be used in order to allow the expression of a polynucleotide encoding the EYA1, EYA1-B, EYA2 or EYA3 protein within a patient's body.
In all the gene therapy methods that may be used according to the present invention, different types of vectors are utilized.
In one specific embodiment, the vector is derived from an adenovirus. Adenovirus vectors that are suitable for the gene therapy methods of the present invention are those described by Feldman and Steg (1996, Médecine/Sciences, synthèse, 12: 47-55) or Ohno et al. (1994, Science, 265: 781-784) or also in the French patent application N° FR-94 03 151 (Institut Pasteur, Inserm). Another preferred recombinant adenovirus according to this specific embodiment of the present invention is the human adenovirus type 2 or 5 (Ad 2 or Ad 5) or an adenovirus of animal origin (French patent application N° FR-93 05954).
Among the adenoviruses of animal origin, there may be cited the adenoviruses of canine (CAV2, strain Manhattan or A26/61 [ATCC VR-800]), bovine, murine (Mav1, Beard et al., 1980, Virology, 75: 81) or simian (SAV) origin.
Preferably, the recombinant defective adenoviruses are prepared following a technique well known to one skilled in the art, for example as described by Levrero et al. (1991, Gene, 101: 195) or by Graham (1984, EMBO J., 3: 2917) or in the European patent application N° EP-185 573. Another defective recombinant adenovirus that may be used according to the present invention, as well as a pharmaceutical composition containing such a defective recombinant adenovirus, is described in the PCT application N° WO 95/14785.
In another specific embodiment, the vector is a recombinant retroviral vector such as the vector described in the PCT application N° WO 92/15676 or the vector described in the PCT application N° WO 94/24298 (Institut Pasteur). The latter recombinant retroviral vector comprises:
- a DNA sequence from a provirus that has been modified such that:
- the gag, pol and env genes of the provirus DNA have been deleted at least in part in order to obtain a proviral DNA which is incapable of replicating, this DNA not being able to recombine to form a wild virus,
- the LTR sequence comprises a deletion in the U3 sequence, such that the mRNA transcription that the LTR controls is significantly reduced, for example at least 10 times, and
- the retroviral vector comprises in addition an exogenous nucleotide sequence coding for the EYA1, EYA1-B, EYA2 or EYA3 protein or one of its biologically active derivatives under the control of an exogenous promoter, for example a constitutive or an inducible promoter.
By exogenous promoter in the recombinant retroviral vector described above is intended a promoter that is exogenous with respect to the retroviral DNA but that may be endogenous or homologous with respect to the EYA1, EYA1-B, EYA2 or EYA3 protein entire or partial nucleotide coding sequence.
In the case in which the promoter is heterologous with respect to the EYA1, EYA1-B, EYA2 or EYA3 protein entire or partial nucleotide coding sequence, the promoter is preferably the mouse inducible promoter Mx or a promoter comprising a tetracycline operator or also a hormone-regulated promoter. A preferred constitutive promoter is one of the internal promoters that are active in resting fibroblasts, such as the promoter of the phosphoglycerate kinase gene (PGK-1). The PGK-1 promoter is either the mouse promoter or the human promoter such as described by Adra et al. (1987, Gene, 60: 65-74). Other constitutive promoters may also be used, such as the beta-actin promoter (Kort et al., 1983, Nucleic Acids Research, 11: 8287-8301) or the vimentin promoter (Rettlez and Basenga, 1987, Mol
A preferred retroviral vector used according to this specific embodiment of the present invention is derived from the Mo-MuLV retrovirus (WO 94/24298).
In one preferred embodiment, the recombinant retroviral vector carrying the therapeutic nucleotide sequence coding for the EYA1, EYA1-B, EYA2 or EYA3 protein or one of its biologically active derivatives is used to transform mammalian cells, preferably autologous cells from the mammalian host to be treated, and more preferably autologous fibroblasts from the patient to be treated. The fibroblasts that have been transformed with the retroviral vector according to the invention are reimplanted directly in the patient's body or are seeded in a preformed implant before the introduction of the implant, colonized with the transformed fibroblasts, within the patient's body. The implant used is advantageously made of a biocompatible carrier allowing the transformed fibroblasts to anchor, associated with a compound allowing the gelification of the cells. The biocompatible carrier is either a biological carrier, such as coral or bone powder, or a synthetic carrier, such as synthetic polymer fibres, for example polytetrafluoroethylene fibres.
An implant having the characteristics as defined above is the implant described in the PCT application N° WO 94/24298 (Institut Pasteur).
The present invention also provides a method for the screening of ligands which are capable of binding to the EYA1, EYA1-B, EYA2 or EYA3 protein. Such a screening method, in one embodiment, comprises the steps of:
a) preparing a complex between the EYA1, EYA1-B, EYA2 or EYA3 protein and a ligand that binds to the EYA1, EYA1-B, EYA2 or EYA3 protein by a method selected among the following:
- preparing a tissue extract containing the EYA1, EYA1-B, EYA2 or EYA3 protein putatively bound to a natural ligand,
- bringing into contact the purified EYA1, EYA1-B, EYA2 or EYA3 protein with a solution containing a molecule to be tested as a ligand binding to the EYA1, EYA1-B, EYA2 or EYA3 protein,
b) visualizing the complex formed between the EYA1, EYA1-B, EYA2 or EYA3 protein from the tissue extract and the natural ligand of the EYA1, EYA1-B, EYA2 or EYA3 protein, or the complex formed between the purified EYA1, EYA1-B, EYA2 or EYA3 protein and the molecule to be tested.
For the purpose of the present invention, a ligand means a molecule, such as a protein, a peptide, a transcription factor, an antibody or a synthetic compound, capable of binding to the EYA1, EYA1-B, EYA2 or EYA3 protein or one of its biologically active derivatives, or of modulating the expression of the polynucleotide coding for the EYA1, EYA1-B, EYA2 or EYA3 protein or coding for one of its biologically active derivatives.
In the first embodiment of the screening procedure, wherein a natural ligand of the EYA1, EYA1-B, EYA2 or EYA3 protein is to be characterized, the procedure is as follows.
The tissue putatively containing the EYA1, EYA1-B, EYA2 or EYA3 protein bound to its natural ligand, for example the heart tissue, the skeletal muscle or the kidney, is homogenized in 10 mM Hepes, pH 7.4, containing 100 μg/ml PMSF, 200 μg/ml aprotinin and 5 μg/ml DNase, with a glass-Teflon homogenizer. The homogenate is centrifuged at 1,000 g for 10 minutes, the supernatant is removed and centrifuged at 190,000 g for 30 min at 4°C. The pellet containing the membrane fraction is stored at -20°C until used.
The cell membrane fractions are incubated first in 0.9% Triton X-100, 0.1% ovalbumin, 5 mM EDTA, 50 mM Tris-HCl, pH 8, with an immune serum of the invention overnight at 4°C, then with Protein G-sepharose (Pharmacia) for 2 hours. Complexes are centrifuged, washed three times in PBS and three times in 50 mM Tris-HCl, pH 8. Then the complexes are dissociated in a dissociating buffer containing SDS in order to dissociate the EYA1, EYA1-B, EYA2 or EYA3 protein from its bound natural ligand. Immunoprecipitates are analysed by western blot following the technique described by Gershoni and Palade (1983, Anal. Biochem., 131: 1-15). The anti-EYA1, EYA1-B, EYA2 or EYA3 protein monoclonal antibody produced by the hybridoma clone 1-4 is used to detect the EYA1, EYA1-B, EYA2 or EYA3 protein, and a panel of candidate antibodies, for example antibodies directed against different sub-units of integrins, is used (at a concentration of 1.5 μg/ml) to identify the ligand that was previously bound to the EYA1, EYA1-B, EYA2 or EYA3 protein in the tissue extract. IgG peroxidase-conjugated antibody (Bio-Rad, 1/6,000 dilution) is used as second antibody. The blots are revealed by chemiluminescence with the ECL kit (Amersham France).
In a second embodiment of the ligand screening method according to the present invention, a biological sample or a defined molecule to be tested as a putative ligand of the EYA1, EYA1-B, EYA2 or EYA3 protein is brought into contact with the purified EYA1, EYA1-B, EYA2 or EYA3 protein, in order to form a complex between the EYA1, EYA1-B, EYA2 or EYA3 protein and the putative ligand molecule to be tested. The biological sample may be obtained from a cerebellum or a renal extract, for example.
When the ligand source is a biological sample, the complexes are processed as described above in order to identify and characterize the unknown ligand.
When the putative ligand is a defined known molecule to be tested, the complexes formed between the EYA1, EYA1-B, EYA2 or EYA3 protein and the molecule to be tested are not dissociated prior to the western blotting, in order to allow the detection of the complexes using polyclonal or monoclonal antibodies directed against the EYA1, EYA1-B, EYA2 or EYA3 protein.
In a particular embodiment of the screening method, the putative ligand is the expression product of a DNA insert contained in a phage vector (Parmley and Smith, Gene, 1988, 73: 305-318). According to this particular embodiment, the recombinant phages expressing a protein that binds to the immobilized EYA1, EYA1-B, EYA2 or EYA3 protein are retained, and the complex formed between the EYA1, EYA1-B, EYA2 or EYA3 protein and the recombinant phage is subsequently immunoprecipitated by a polyclonal or a monoclonal antibody directed against the EYA1, EYA1-B, EYA2 or EYA3 protein.
According to this particular embodiment, a ligand library is constructed in recombinant phages from human or chicken genomic DNA or cDNA. Once the ligand library in recombinant phages has been constructed, the phage population is brought into contact with the immobilized EYA1, EYA1-B, EYA2 or EYA3 protein. Then the preparation of complexes is washed in order to remove the non-specifically bound recombinant phages. The phages that bind specifically to the EYA1, EYA2 or EYA3 protein are then eluted by a buffer (acid pH) or immunoprecipitated by the monoclonal antibody produced by the hybridoma anti-EYA1, EYA1-B, EYA2 or EYA3, and this phage population is subsequently amplified by an over-infection of bacteria (for example E. coli). The selection step may be repeated several times, preferably 2-4 times, in order to select the more specific recombinant phage clones. The last step consists in characterizing the protein produced by the selected recombinant phage clones, either by expression in infected bacteria and isolation, by expressing the phage insert in another host-vector system, or by sequencing the insert contained in the selected recombinant phages.
Another subject of the present invention is a method for screening molecules that modulate the expression of the EYA1, EYA1-B, EYA2 or EYA3 protein. Such a screening method comprises the steps of:
a) cultivating a prokaryotic or a eukaryotic cell that has been transfected with a nucleotide sequence encoding the EYA1, EYA1-B, EYA2 or EYA3 protein, placed under the control of its own promoter,
b) bringing into contact the cultivated cell with a molecule to be tested,
c) quantifying the expression of the EYA1, EYA1-B, EYA2 or EYA3 protein.
Using DNA recombination techniques well known to one skilled in the art, the EYA1, EYA1-B, EYA2 or EYA3 protein encoding DNA sequence is inserted into an expression vector, downstream from its promoter sequence, the said promoter sequence being described by Cohen-Salmon et al. (1995, Gene, 164: 235-242).
The quantification of the expression of the EYA1, EYA1-B, EYA2 or EYA3 protein may be performed either at the mRNA level or at the protein level. In the latter case, polyclonal or monoclonal antibodies may be used to quantify the amounts of the EYA1, EYA1-B, EYA2 or EYA3 protein that have been produced, for example in an ELISA or a RIA assay.
In a preferred embodiment, the quantification of the eya1-a, eya1-b, eya1-c, eya2 or eya3 mRNA is carried out by quantitative PCR amplification of the cDNA obtained by reverse transcription of the total mRNA of the cultivated eya1-a, eya1-b, eya1-c, eya2 or eya3-transfected host cell, using a pair of primers specific for eya1-a, eya1-b, eya1-c, eya2 or eya3 of the kind described in the PCT application N° WO 93/07267 (Institut Pasteur, HHS).
As an illustrative example, a pair of primers used to quantitate the reverse-transcribed eya1-a, eya1-b, eya1-c, eya2 or eya3 mRNA is the following:
Primer 1: 5'-CTCAGCCATGTGCTCTGTATAATTAAGAGC-3'
Primer 2: 5'-AGGCACGGCACATGAATTTGTGCACGTGCT-3'
The process for determining the quantity of the cDNA corresponding to the eya1, eya2 or eya3 mRNA present in the cultivated eya1-a, eya1-b, eya1-c, eya2 or eya3-transfected cells is characterized in that:
1) a standard DNA fragment, which differs from the eya1-a, eya1-b, eya1-c, eya2 or eya3 cDNA fragment obtained by the reverse transcription of the eya1-a, eya1-b, eya1-c, eya2 or eya3 mRNA, but can be amplified with the same oligonucleotide primers, is added to the sample to be analyzed containing the eya1-a, eya1-b, eya1-c, eya2 or eya3 cDNA fragment, the standard DNA fragment and the eya1-a, eya1-b, eya1-c, eya2 or eya3 cDNA fragment differing in sequence and/or size by not more than approximately 10%, and preferably by not more than 5 nucleotides per strand,
2) the eya1-a, eya1-b, eya1-c, eya2 or eya3 cDNA fragment and the standard DNA fragment are coamplified with the same oligonucleotide primers, preferably to saturation of the amplification of the eya1-a, eya1-b, eya1-c, eya2 or eya3 cDNA fragment,
3) to the reaction medium obtained in step 2), there are added
- either two types of labeled oligonucleotide probes which are each specific for the eya1-a, eya1-b, eya1-c, eya2 or eya3 cDNA fragment and the standard DNA fragment, respectively, and different from the amplification oligonucleotide primers of step 2),
- or one or more labeled oligonucleotide primer(s), specific for the eya1-a, eya1-b, eya1-c, eya2 or eya3 cDNA fragment and the standard DNA fragment and different from said oligonucleotide primers of step 2), and one or more additional amplification cycle(s) with said labeled oligonucleotide primer(s) is/are performed, so that, during a cycle, after denaturation of the DNA, said labeled oligonucleotide primer(s) hybridize(s) with said fragments at a suitable site in order that an elongation with the DNA polymerase generates labeled DNA fragments of different sizes and/or sequences and/or with different labels according to whether they originate from the DNA fragment of interest or the standard fragment, respectively, and then
4) the initial quantity of eya1-a, eya1-b, eya1-c, eya2 or eya3 cDNA fragment is determined as being the product of the initial quantity of standard DNA fragment and the ratio of the quantity of amplified eya1-a, eya1-b, eya1-c, eya2 or eya3 cDNA fragment to the quantity of amplified standard DNA fragment, which ratio is identical to that of the quantities of the labeled DNA fragments originating from the amplified eya1-a, eya1-b, eya1-c, eya2 or eya3 cDNA fragment and from the amplified standard DNA fragment, respectively, obtained in step 3).
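The calculation in step 4) can be illustrated with a minimal sketch. It assumes that the two labeled-fragment signals from step 3) are proportional to the amounts of amplified cDNA and amplified standard; the function and parameter names are hypothetical and are not taken from WO 93/10257.

```python
def initial_cdna_quantity(initial_standard, labeled_cdna_signal, labeled_standard_signal):
    """Estimate the starting amount of target cDNA in a competitive PCR.

    initial_standard: known amount of standard DNA added in step 1)
    labeled_cdna_signal / labeled_standard_signal: quantities of labeled
    fragments measured in step 3), in any common unit.
    Assumption: both fragments are amplified with the same efficiency, so the
    ratio of the amplified products equals the ratio of the starting amounts.
    """
    if labeled_standard_signal <= 0:
        raise ValueError("standard signal must be positive")
    ratio = labeled_cdna_signal / labeled_standard_signal
    return initial_standard * ratio


# Illustrative numbers only: 1000 copies of standard, target signal half as strong.
estimate = initial_cdna_quantity(1000.0, 300.0, 600.0)
```

The result is expressed in the same unit as the amount of standard added, which is why the standard must be quantified independently before it is spiked into the sample.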
Primers and probes hybridizing with the eya1-a, eya1-b, eya1-c, eya2 or eya3 cDNA fragment and used in the above-described quantitative PCR amplification reaction are described in the PCT application N° WO 93/07267 (Institut Pasteur, HHS).
More technical details regarding the performing of the quantitative PCR amplification reaction are found in the PCT application N° WO 93/10257 (Institut Pasteur, Inserm).
Finally, the invention also pertains to a method for screening a ligand molecule capable of binding to a polynucleotide according to anyone of claims 1 to 6, comprising the steps of:
a) bringing into contact a polynucleotide as defined hereabove with a ligand molecule to be tested,
b) detecting the complexes formed between the said polynucleotide and the ligand molecule.
Such screening techniques are described, for example, by Kadonaga et al. in 1987 and by Kageyama et al. in 1989.
In order to better understand the present invention, reference will be made to the appended figures which depict specific embodiments to which the present invention is in no case limited in scope with
D8S1060, D8S1807, D8S530, D8S279), YAC end clones (876D10L, 201A6L, 942C8R) and internal Alu PCR product (456H3int) in the region. PAC clones are indicated in italics and P1 clones in standard print. On this Figure, the 5' end of SEQ ID N°1 is on the right side whereas the 3' end of SEQ ID N°1 is on the left side.
Fig. 2 Sequence of the EYA1 cDNA. The potential N-glycosylation sites (amino acid positions 30, 166, 171, 176 and 220) are circled. The protein kinase C phosphorylation sites (amino acid positions 42, 57, 259 and 414) and the cAMP-dependent protein kinase C phosphorylation site (amino acid position 263) are boxed. The leucine zipper is underlined and, in the 3' non coding region, the two polyadenylation sites are boxed.
Fig. 3 Detection of a deletion by Southern blot analysis in a sporadic case of BOR syndrome. a, Hybridisation with a probe containing exon D resulted in a signal of reduced intensity in the proband's DNA as compared to the unaffected parents. b, Hybridisation with a probe containing exon C resulted in two bands, one of the normal size range (2.3 kb) and the second of a lower molecular weight. c, Hybridisation with the Sp6 insert end of P1 4405 resulted in a band of equal intensity in DNA from all three individuals.
Fig. 4 Alignment generated by the PILEUP program of the amino acid sequence for EYA1 with the deduced amino acid sequence of the murine orthologue Eya1, and of the human homologues EYA2 and EYA3. Residues highlighted in grey indicate
boxes indicate similarities. The start of the eya homologous region (eyaHR) is indicated by a triangle.
Fig. 5 Northern blot analysis of the expression of eya1-a and its mouse orthologue Eya1. Human: a, foetal and b, adult poly A+ northern blots (Clontech MTN7765-1 and 7760-1) hybridised with a 278 bp PCR product derived from the 3' untranslated region (beginning at the stop codon) of the eya1-a cDNA. Mouse: c, embryonic and d, adult poly A+ northern blots (Clontech MTN7763-1 and 7762-1) hybridised with a 256 bp product derived from the 3' untranslated region of the Eya1 cDNA (beginning 9 bp before the stop codon).
Fig. 6 In situ hybridisation to parasagittal (E13.5 and E19.5) and transversal (E14.5 and E17.5) sections of mouse embryos. a, Hybridisation of the antisense RNA probe to the neuroepithelium (NE), and to the mesenchyme of the otic capsule, of the inner ear at E13.5. b, Hybridisation signal in the cochlear neuroepithelium (NE) and in the spiral ganglia (SpG) at E19.5. c, Hybridisation signal in the olfactory epithelium at E13.5. d, Labelling of the metanephric mesenchyme surrounding the just divided ureteric branches (arrows) near the medullary region, at E13.5. Progression of this hybridisation signal towards the peripheral cortex at e, E14.5 and f, E17.5 (arrows).
The general techniques used within the framework of the present invention are described below.
A - P1 and PAC library screening
Nineteen P1 and 3 PAC clones from the Genome Systems libraries (P1-2535 and PAC-6539, Missouri, U.S.A.) were isolated by PCR using primers flanking the 7 markers from D8S1060 to D8S530 (see Fig. 1). End clones of the P1/PAC inserts were isolated (according to the manufacturer's directions) and primers corresponding to six of them were chosen to rescreen the two libraries, resulting in the isolation of a further 3 P1s and 13 PACs. These 16 clones were positioned relative to the initial 22 clones using the aforementioned probes. A contig of minimum overlap was constructed using 3 P1 and 3 PAC clones. The corresponding end clones were isolated and used to confirm the overlaps by Southern blot analysis.
B - Subcloning, sequencing and analysis
Approximately 15 μg of DNA from P1 4405 was sonicated and separated on a preparative 0.4% LMP agarose gel. DNA within the size range of 7-10 kb was isolated, blunt-ended by T4 DNA polymerase and Klenow, phosphorylated by T4 polynucleotide kinase and ligated to dephosphorylated SmaI-digested pBC SK+ (Stratagene). Approximately 250 subclones with a confirmed insert size of 7-10 kb were amplified in LB medium containing 30 μg/ml of chloramphenicol. End clones were directly sequenced, using the T3 and T7 primers, on an ABI 377 sequencer. In the first instance, sequence data was compared to the EMBL and GenBank protein and nucleic acid databases, and also analysed using the GRAIL program50 to screen for the presence of potential exons. Subsequent to the detection of sequence homology to the drosophila gene eya, translated genomic sequences were compared to the deduced amino acid sequence of the eya cDNA. Once all the ends of the individual clones had been sequenced, the sequences were assembled using the computer programs Phred (B. Ewing, unpublished) and Phrap (P. Green, unpublished), and edited with Consed (D. Gordon, unpublished). This resulted in the construction of 31 sequence contigs covering approximately 90% of the P1 insert. Using the orientation and distance constraints of end sequences from subclone inserts of a determined length, a scaffold of these 31 contigs covering the entire P1 clone was determined, as well as the size and position of intercontig gaps (method adapted from 51, R. Heilig et al., manuscript in preparation). Individual subclones from the scaffold were then selected for sequence extension to bridge the gaps between contigs.
C - cDNA isolation
Oligo dT and random primed cDNA populations were generated from a) nine week total human foetus polyA+ RNA and b) embryonic day (E) 13 total mouse polyA+ RNA, and amplified using the Marathon cDNA amplification kit (Clontech) according to the manufacturer's recommendations. The reconstructed EYA1 cDNA sequence was verified by PCR amplification of the full length coding region, and sequencing, using the primers 5'GSP.F (5'CTCAGCCATGTGCTCTGTATAATTAAGAGC3') and 3'GSP.R (5'AGGCACGGCACATGAATTTGTGCACGTGCT3'). Sequence analysis was performed using the GCG Sequence Analysis Software Package52, SAPS (sequence analysis of protein sequences)53 and ppsearch (protein pattern searching) software (derived from 54,55).
D - Mutation detection
Southern blots containing EcoRI, MspI, PstI or TaqI digested DNA from both familial and sporadic BOR affected patients were prepared according to standard techniques. The eight exons, exon z and exons A to G, were amplified using P1 clone 4405 DNA and the flanking intronic primers listed below. The resultant PCR products were random labelled with 32P and hybridised to the Southern blots. A search for sequence mutations was carried out on the 8 exons in 42 unrelated individuals. The exons were amplified and sequenced using the following primers derived from intronic sequences:
Exon z exz.F 5'AGGCTAATCTTGGCACCATGG3' exz.R 5OACTGCTGTTTACGTAGCAGG3'
Exon A exA.F 5'TGAATAACAGCTTTCTCAGCC3' exA.R 5'GACTATATAGTTCTTCTCCATTT3'
Exon B exB.F 5OTTTCAGCCTCTCCCAATGC3' exB.R 5'ACCAACAAACTCCTGTCTCAC3'
Exon C exC.F 5'ACCTACTGATTGACATAGTTGA3' exC.R 5'ACTATAAAAGGGAGATGGTCAC3'
Exon D exD.F 5'GTGACCATCTCCCTTTTATAGT3' exD.R 5'TGCTGAGGTACTGGTGGTAA3'
Exons E & F exE.F 5'AAATCTGGAGGCTGGTATTC3' exF.R 5'AGAGTACTGCACATATTCATCA3'
Exon G exG.R 5'TGCTGTGGCACATACAACCC3'
Exons E and F were coamplified in the same PCR product and sequenced with the external primers exE.F and exF.R, and the internal primers exE.R (5'ATGAACAAGCACGAGCATTGC3') and exF.F (5'GCAATGCTCGTGCTTGTTCAT3'). All PCRs, and subcloning of PCR products into Hindi digested M13, were carried out as previously described56.
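As a consistency check on the primer list, the internal primers exE.R and exF.F, and likewise exC.R and exD.F, appear to be exact reverse complements of each other, which would mean each pair anneals to opposite strands of the same short genomic stretch. The sketch below verifies this; it assumes the garbled 5' prefixes in the extracted text stand for a leading A, and the helper name is illustrative.

```python
# Map each base to its Watson-Crick partner.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence written 5' to 3'."""
    return seq.upper().translate(COMPLEMENT)[::-1]


# Primer sequences as read from the list above (5' to 3').
EXE_R = "ATGAACAAGCACGAGCATTGC"
EXF_F = "GCAATGCTCGTGCTTGTTCAT"
EXC_R = "ACTATAAAAGGGAGATGGTCAC"
EXD_F = "GTGACCATCTCCCTTTTATAGT"

pairs_match = (
    reverse_complement(EXE_R) == EXF_F
    and reverse_complement(EXC_R) == EXD_F
)
```

A match of this kind is also a useful proofreading tool for a primer table, since a single mistyped base in either member of a pair breaks the equality.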
E - In situ hybridization
A 256 bp PCR product was derived from the 3' untranslated region (starting 9 bp before the stop codon) of the murine cDNA. Hybridisation of this PCR product to Southern blots containing mouse genomic DNA resulted in a single band. The 256 bp PCR product was then subcloned via the EcoRI and BamHI linkers into the EcoRI/BamHI digested pGEM-4Z, Promega (creating pM3UTR). The sense and antisense RNA probes were labelled with digoxigenin-11-UTP and the hybridisation signal detected by sheep anti-digoxigenin antibodies coupled to alkaline phosphatase. Whole E9.5 and E11.5 Rj:SWISS mouse embryos, parasagittal cryosections (7 μm) from E13.5 and E19.5 embryos and transversal sections through the kidneys of E14.5 and E17.5 embryos were treated and hybridised as previously described57-58.
Examples
Example 1 : Construction of a P1/PAC contig
Initially, 7 markers mapping proximal to (D8S1060), distal to (D8S1807, D8S530) and within (876D10L, 456H3int, 201A6L, 942C8R) the translocation-associated deletion16 (see Fig. 1) were used to screen P1 and PAC human genomic libraries. 19 P1 and 3 PAC clones were isolated. In order to fill the gaps between these clones, chromosome walks were initiated by using the insert ends to rescreen the libraries. A further 3 P1s and 13 PACs were isolated. Of the 38 total clones, 3 P1s and 3 PACs (see methods) were used to construct a contig with minimum overlap (Fig. 1).
Example 2 : Sequencing and identification of a candidate gene
P1 4405 (see Fig. 1) was chosen as the starting point for DNA sequencing of the deletion interval. Initial sequence data showed homology to the 3' coding region of the drosophila developmental gene eyes absent (eya). Further genomic sequence data was then directly translated and compared to the deduced amino acid sequence of the eya1-a cDNA (eya protein). Seven putative exons (A to G, Fig. 2), flanked by consensus splice sites, were identified. The corresponding amino acid sequence was found to be highly homologous to an eya protein fragment that finished 26 amino acids before the C-terminal end. In order to search for an exon(s) encoding the C-terminal end of the putative human protein, DNA sequencing was continued into the adjacent P1, 10910 (see Fig. 1). An additional potential exon encoding a sequence homologous to the last 26 amino acids of the eya C-terminal end (exon H), and containing a stop codon, was identified. The 271 aa peptide encoded by exons A to H showed 69% identity and 88% similarity with the drosophila eya protein. This proposed human homologue was therefore named eyes absent-like 1 (EYA1). Databank comparison of this human sequence with a C. elegans expressed sequence tag (EST) database resulted in the detection of an EST (C07779) encoding a 99 aa peptide showing 43% identity and 64% similarity over 72. This peptide was 38% identical and 58% similar to the eya protein. Finally, no significant homology was detected upon comparison of the EYA1 deduced amino acid sequence with the amino acid sequence derived from the complete coding sequence of S. cerevisiae.
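For readers unfamiliar with how identity and similarity percentages of this kind are derived, the following is a minimal sketch of the calculation on a pair of pre-aligned sequences. The similarity groups, the treatment of gaps and the toy sequences are all assumptions made for illustration; they are not the scoring scheme used by the programs cited in this document and they do not reproduce the figures reported above.

```python
# Hypothetical groups of residues counted as "similar" (illustrative only).
SIMILAR_GROUPS = [set("ILVM"), set("FYW"), set("KRH"), set("DE"), set("ST"), set("NQ")]


def identity_and_similarity(aligned_a, aligned_b):
    """Return (percent identity, percent similarity) for two aligned sequences.

    Both strings must have the same length; '-' marks a gap.
    Assumption: columns containing a gap are excluded from the denominator.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = identical = similar = 0
    for x, y in zip(aligned_a, aligned_b):
        if x == "-" or y == "-":
            continue
        compared += 1
        if x == y:
            identical += 1
            similar += 1  # identical residues also count as similar
        elif any(x in group and y in group for group in SIMILAR_GROUPS):
            similar += 1
    if compared == 0:
        raise ValueError("no aligned residue pairs to compare")
    return 100.0 * identical / compared, 100.0 * similar / compared


# Toy alignment, not real EYA1 or eya sequence.
identity, similarity = identity_and_similarity("MKLVDE", "MRLIDQ")
```

Because similarity counts identical residues as well as conservative substitutions, it is always at least as large as identity, as in every comparison quoted in this example.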
Example 3 : Determination of the cDNA sequence
RACE PCR was used to isolate the eya1-a cDNA. Successive rounds of amplification were performed on 9 week total foetus mRNA, resulting in a reconstituted 3698 bp cDNA sequence containing a poly A tail (see methods). The translation initiation site was identified by the presence of an adequate Kozak consensus sequence (GTTCAGatgT)18, preceded by stop codons in all three frames. The cDNA sequence contained a 138 bp 5' non coding region followed by a 1677 bp open reading frame and a 1883 bp 3' non coding region (GenBank accession number Y10260) (Fig. 2). An additional 3' non coding region of 1164 bp was also amplified. Sequence comparison with genomic sequence from P1 10910 demonstrated that both were transcribed from exon H and resulted from the usage of two distinct polyadenylation sites (Fig. 2).
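The segment lengths quoted above can be checked for internal consistency with a few lines of arithmetic. This is only a sanity check on the reported numbers; whether the 1677 bp figure includes the stop codon is not stated here, so the sketch reports the number of whole codons rather than a protein length.

```python
# Segment lengths in bp, as quoted in this example.
FIVE_PRIME_NONCODING = 138
OPEN_READING_FRAME = 1677
THREE_PRIME_NONCODING = 1883
REPORTED_TOTAL = 3698


def segment_sum():
    """Total cDNA length implied by the three reported segments."""
    return FIVE_PRIME_NONCODING + OPEN_READING_FRAME + THREE_PRIME_NONCODING


def whole_codons(orf_length):
    """Number of codons in an open reading frame; fails if not a multiple of 3."""
    codons, remainder = divmod(orf_length, 3)
    if remainder:
        raise ValueError("open reading frame length is not a multiple of 3")
    return codons


lengths_agree = segment_sum() == REPORTED_TOTAL
codon_count = whole_codons(OPEN_READING_FRAME)
```

The three segments add up to the reported 3698 bp, and the open reading frame length is an exact multiple of three, as expected for a complete coding region.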
Example 4 : Mutations in BOR affected patients
To test for possible DNA rearrangements within eya1 in BOR affected patients, Southern blots containing DNA from 21 familial and sporadic patients were hybridised with probes corresponding to exons A to G and to the exon immediately adjacent to exon A (exon z), which was identified by comparison of genomic sequence from P1 4405 with the eya1-a cDNA sequence. In individual 4 (a sporadic case), hybridisation with exon D resulted in a signal of reduced intensity suggestive of a deletion; this reduction was not seen in the proband's parents (Fig. 3a). Hybridisations with exons A and B resulted in signals of normal intensity (data not shown), exon C resulted in a band shift (Fig. 3b), and exons E, F and G resulted in bands of reduced intensity (data not shown). Hybridisation with a probe corresponding to the Sp6 insert end of P1 4405 (located between exons G and H) resulted in a band of normal intensity (Fig. 3c). According to the assembled genomic sequence of the P1 4405 subregion, the deletion in individual 4 was estimated to span 5.8 to 7 kb.
Exons A to G and exon z of 42 unrelated affected individuals were sequenced (see methods) and seven mutations were detected (Table 1) (ref. 19). A premature stop codon was detected in exon z of individual 1. A replacement of a T by CC was detected in exon D in individual 11. The affected family members of these two probands carried the same mutations. In individual 23, a 1 bp insertion was detected in exon z. The phenotypically normal parents of this proband did not carry this mutation. In individuals 22, 34, 40 and 48, four mutations were detected, consisting of a 4 bp insertion in exon G, a 1 bp insertion in exon E, a 5 bp substitution/insertion in exon E and a donor splice site mutation of exon F, respectively (Table 1). No other family members of these four individuals were available for testing.
Table 1 : Mutations in BOR affected patients

Type of case   Individual   Nucleotide change (b)   Effect on coding sequence   Exon
Familial       1            823C→T                  R275X                       z
               11           1251T→CC                Frameshift                  D
Sporadic       4            del ~7 kb               Deletion                    C, D, E, F, G
               23           755insC                 Frameshift                  z
Unknown        22           1555ins4 (a)            Frameshift                  G
               34           1359insC                Frameshift                  E
               40           1372T→AGAGC             Frameshift                  E
               48           1498+2T→G               Aberrant splicing           F

(a) TTGT insertion. Nomenclature denotes mutations as described in ref. 19.
(b) The nucleotide positions given are referenced to the eya1-a cDNA.
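The relationship between the nucleotide changes and their effects on the coding sequence listed in Table 1 can be sketched as follows. The sketch assumes that nucleotide positions are counted from the A of the initiation codon (position 1); this is an assumption, although it is consistent with 823C→T falling on the first base of codon 275 (R275X). The function names are hypothetical.

```python
def codon_of(cds_position: int) -> tuple[int, int]:
    """Map a 1-based coding position to (codon number, position in codon).

    Assumes position 1 is the A of the ATG initiation codon.
    """
    if cds_position < 1:
        raise ValueError("coding positions are 1-based")
    return (cds_position - 1) // 3 + 1, (cds_position - 1) % 3 + 1


def shifts_frame(deleted_bp: int, inserted_bp: int) -> bool:
    """True if the net length change is not a multiple of three."""
    return (inserted_bp - deleted_bp) % 3 != 0


# 823C>T: first base of codon 275, in agreement with R275X.
print(codon_of(823))

# (deleted bp, inserted bp) for the small changes listed in Table 1.
changes = {
    "1251T>CC": (1, 2),
    "755insC": (0, 1),
    "1555ins4": (0, 4),
    "1359insC": (0, 1),
    "1372T>AGAGC": (1, 5),
}
for name, (deleted, inserted) in changes.items():
    print(name, "frameshift" if shifts_frame(deleted, inserted) else "in frame")
```

Every small insertion or substitution/insertion in the table changes the coding length by 1 or 4 bp, so each is classified as a frameshift, as reported.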
Example 5 : Amino acid sequence analysis
The 559 amino acid sequence of the EYA1 gene product was compared with the Drosophila eya protein. A high degree of homology between the two proteins was restricted to the previously mentioned 271 aa C-terminal region encoded by the last eight exons, A to H. This region was named eyaHR, for eya homologous region. A lower yet significant degree of homology was seen over 125 aa (amino acid positions 130-254) in the N-terminal part of the EYA1 protein, which was 21% identical and 79% similar to the eya protein. The deduced amino acid sequence of the EYA1 protein has a molecular weight of 60.3 kDa ± 10%. Sequence analysis failed to detect a signal peptide. An α-helical segment composed of four successive heptad repeats forming a leucine zipper was predicted between amino acids 431 and 458 (Fig. 2). The α-helical segment was preceded by a 30 residue basic segment, suggesting the possible existence of a bZIP domain. The Drosophila eya product also contains a basic stretch of amino acids followed by four successive heptad-like repeats, in which three leucine residues (amino acid positions 635, 642 and 656 (ref. 17)) are replaced by three other residues, an isoleucine, an alanine and a methionine respectively, which also have a positive hydropathy value (ref. 20). Five potential N-glycosylation sites, four potential protein kinase C phosphorylation sites and one potential cAMP-dependent protein kinase phosphorylation site were detected (Fig. 2).
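The notion of four successive heptad repeats can be illustrated by a minimal sketch that scans a protein sequence for a residue recurring at every seventh position. The function name, the parameters and the toy sequence are hypothetical; the EYA1 sequence itself is not reproduced here, and this simple scan is not the prediction method used for the analysis above.

```python
def heptad_starts(seq: str, repeats: int = 4, allowed: str = "L") -> list[int]:
    """Return 1-based positions where an allowed residue recurs every
    seventh residue for the requested number of successive heptads."""
    span = 7 * (repeats - 1)
    hits = []
    for i in range(len(seq) - span):
        if all(seq[i + 7 * k] in allowed for k in range(repeats)):
            hits.append(i + 1)
    return hits


# Toy sequence with leucines at positions 3, 10, 17 and 24 (illustrative only).
toy = "MK" + "LAAAAAA" * 4 + "GG"
print(heptad_starts(toy))

# Heptad-like repeats tolerating other hydrophobic residues, in the spirit of
# the Drosophila eya segment; the residue set is an assumption.
print(heptad_starts("MK" + "LAAAAAA" + "IAAAAAA" + "AAAAAAA" + "MAAAAAA" + "GG",
                    allowed="LIAM"))
```

Widening the `allowed` set shows how a segment with isoleucine, alanine or methionine in place of leucine can still be read as a heptad-like repeat.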
Example 6 : Evidence for a novel gene family
The EMBL and GenBank databases were screened for sequences homologous to the eya1-a cDNA sequence. Three human and two murine ESTs, showing significant homology to the eyaHR, were detected. The human ESTs H07988, R72695 and Z39529 consisted of two overlapping ESTs assigned to chromosome 20 and an unmapped human EST, respectively. The two murine ESTs detected (W34432 and W83314) were unmapped. Using primers specific to H07988, the cDNA sequence was extended in the 5' and 3' directions by RACE PCR using 9 week total human foetal mRNA. A 1929 bp cDNA sequence with a 198 bp 5' non coding region, a 1614 bp open reading frame with an initiation site within a strong Kozak consensus sequence (GTGGAAatgG), and a 117 bp 3' non coding region was obtained (GenBank accession number Y10261). The 538 aa coding sequence predicted a 59.2 kDa protein. Due to the extensive homology between the deduced amino acid sequence of this cDNA sequence and the EYA1 protein, the corresponding gene was named EYA2. The 1207 bp human EST clone Z39529 was sequenced and mapped to chromosome 1 by hybridisation to a panel of somatic cell hybrids (data not shown). The cDNA sequence was extended in the 5' and 3' directions by RACE PCR. A 2860 bp cDNA sequence with a 113 bp 5' non coding region, a 1716 bp open reading frame with an initiation site within an adequate Kozak consensus sequence (GTCCTCatgG), and a 1031 bp 3' non coding region was obtained (GenBank accession number Y10262). The 572 aa coding sequence predicted a 62 kDa protein. Likewise, because of the extensive homology with the EYA1 protein, the corresponding gene was named EYA3. The overall homology between EYA1, EYA2 and EYA3 ranged from 45-68% identity and 73-84% similarity, with EYA1 and EYA2 showing a significantly higher degree of homology with one another than with EYA3 (Fig. 4). The eyaHR of EYA1 was highly conserved in both the EYA2 and EYA3 proteins, with the EYA2 protein being 83% identical and 99% similar, and the EYA3 protein being 62% identical and 96% similar. Also within the eyaHR, EYA2 and EYA3 contain a basic amino acid stretch followed by an α-helical domain in which the third leucine residue (amino acid position 448 of EYA1) has been replaced by a histidine (in EYA2) and a threonine (in EYA3) (Fig. 4). With regard to the Drosophila eya protein, EYA2 was observed to be 65% identical and 90% similar, and EYA3 58% identical and 90% similar, for the region corresponding to the eyaHR, and the two proteins showed the same lower degree of homology for the N-terminal region as was seen for EYA1.
Using primers specific to the murine EST W34432 and E13 total mouse mRNA, the cDNA sequence was extended in the 5' and 3' directions by RACE PCR. This led to a reconstituted sequence of 2482 bp containing a 233 bp 5' non coding sequence, an open reading frame of 1659 bp with an initiation codon within an adequate Kozak consensus sequence (TCAGCCatgT), and a 590 bp 3' non coding region (GenBank accession number Y10263). The 553 aa coding sequence predicts a protein of 60.3 kDa. The deduced amino acid sequence of this cDNA was 99% identical to EYA1 for the eyaHR, and 91% identical and 97% similar for the remaining region. The homology between the extended sequence of W34432 and the eya1-a cDNA sequence extends into the non coding regions and was 85% in a 46 nt overlap at the 5' end and 73% in a 594 nt overlap at the 3' end, compared to a homology of 90% in a 1659 nt overlap in the coding region. The high level of homology led us to consider the corresponding gene as the murine orthologue of eya1-a (Eya1). The deduced amino acid sequence of the 436 bp murine EST W83314 showed a higher homology with the EYA2 protein (99% identical) than with EYA1 (85% identical and 90% similar) or EYA3 (64% identical and 80% similar). These results suggested that this EST sequence could belong to the murine orthologue of EYA2, and it was not further characterised.
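The lengths reported for the four cDNAs in Examples 3 and 6 can be cross-checked with a short sketch. The figures are those given in the text; the dictionary layout and function name are hypothetical. The check also shows that each stated open reading frame length equals three times the stated number of amino acids, which suggests (as an inference, not a statement of the text) that the stop codon is not counted in the open reading frame lengths.

```python
# (5' non coding bp, open reading frame bp, 3' non coding bp, total bp, amino acids)
CDNAS = {
    "EYA1 (eya1-a, Y10260)": (138, 1677, 1883, 3698, 559),
    "EYA2 (Y10261)": (198, 1614, 117, 1929, 538),
    "EYA3 (Y10262)": (113, 1716, 1031, 2860, 572),
    "Eya1, mouse (Y10263)": (233, 1659, 590, 2482, 553),
}


def is_consistent(utr5: int, orf: int, utr3: int, total: int, aa: int) -> bool:
    """True if the three regions sum to the total length and the open
    reading frame is exactly three nucleotides per amino acid."""
    return utr5 + orf + utr3 == total and orf == 3 * aa


for name, values in CDNAS.items():
    print(name, "consistent" if is_consistent(*values) else "INCONSISTENT")
```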
Example 7 : Analysis of the expression pattern of eya1-a and Eya1
By northern blot analysis of human foetal tissues (Fig. 5a), a probe specific to the 3' untranslated region of the EYA1 cDNA (see legend of Fig. 5) hybridised to a 4.4 kb transcript that was highly expressed in kidney, less in brain and weakly expressed in lung. An additional weaker 3.8 kb transcript was seen in foetal kidney. In adult human tissues (Fig. 5b), a 4.4 kb transcript (accompanied by a fainter 3.8 kb transcript), which was highly expressed in heart and skeletal muscle and weakly expressed in brain, was detected. The size difference between the two transcripts is likely to be explained by the use of the two distinct polyadenylation sites. The total size of the two transcripts is indicative of additional 5' non coding sequence. A faint band of 4.6 kb was seen in adult liver. No transcript was detected in adult kidney. By northern blot analysis of total mouse embryos (Fig. 5c), a probe specific to the 3' untranslated region of the Eya1 cDNA (see legend of Fig. 5) hybridised to a 4.4 kb and a 4.2 kb transcript that were highly expressed at E11 and less expressed at E15 and E17; no transcript was detected in E7 mouse mRNA. In murine adult tissues (Fig. 5d), a 4.4 kb transcript was seen that was highly expressed in skeletal muscle and weakly expressed in brain. No transcript was found in murine adult kidney.
In situ hybridisation was performed using an antisense probe from pM3UTR (see methods) on whole mount mouse embryos and on parasagittal and transversal cryosections. On whole mount E9.5 and E11.5 mouse embryos, a specific weak labelling of the otic placode (E9.5) and the otic vesicle (E11.5) was observed (data not shown). No labelling of the branchial arches was seen. By in situ hybridisation to cryosections, at E13.5 a strong signal was detected in the cochlear and vestibular neuroepithelium, as well as a weaker signal in the mesenchyme of the otic capsule (Fig. 6a) and of the middle ear, and in the statoacoustic ganglia (data not shown). At E19.5 there was a strong signal in the whole vestibular (utricle, saccule and semicircular canals) and cochlear neuroepithelia, and in both the vestibular and spiral (auditory) ganglia (Fig. 6b). At E13.5 a strong signal in the olfactory epithelium (Fig. 6c), and a weak signal in the surrounding nasal mesenchyme, was seen. The labelling of the olfactory epithelium persisted at E19.5 (data not shown). At E13.5, within the metanephros, a strong labelling of the mesenchyme surrounding the just divided ureteric branches, located nearer to the medullary zone, was observed (Fig. 6d), and no labelling of the peripheral mesenchymal blastema was seen. There was no signal in the medullary zone, the ureter, the mesonephros or the mesonephric duct (data not shown). Kidney sections of E14.5 (Fig. 6e) and E17.5 (Fig. 6f) mouse embryos showed that the mesenchymal hybridisation signal accompanies the progression of the ureteric branching towards the peripheral cortex. This was concomitant with the disappearance of the previously observed signal nearer to the medullary zone, leaving only a weak signal in the S-shaped bodies. At E19.5 only a signal at the limit of detection was observed in the peripheral cortical region of the kidney. Finally, the sympathetic chain comprising the cervical, thoracic, lumbar and sacral ganglia displayed a strong hybridisation signal at E13.5 (Fig. 6g,h), which persisted at E19.5 (data not shown). No signal was ever detected in the eye.
Conclusion
eya1 seems to join the group of developmental genes with a normal activity that is close to the threshold level for the appearance of clinical defects, such as PAX6 (refs 21-23), GLI3 (ref. 24), SOX9 (ref. 25) and Sonic Hedgehog (refs 26, 27) (for review see ref. 28). Such threshold dosage effects may explain the variable expressivity and incomplete penetrance of BOR syndrome within the same family.
The three compartments of the mammalian ear are derived from different embryonic structures and are generally considered to develop independently (ref. 9). Development of the inner ear begins with the otic placode (evident in the mouse at E9), which subsequently forms the otic vesicle. The ventral wall of the otic vesicle gives rise to the stato-acoustic ganglia, which will eventually differentiate into the vestibular and spiral ganglia. From E12, the otic vesicle itself enlarges and two pouches emerge which differentiate into the cochlea (ventral pouch) and vestibular apparatus (dorsal pouch). At E13 the cochlea extends through 1/2 a coil and by E18 has reached the full number of coils (1 3/4 coils). The differentiation and maturation of the inner ear neuroepithelium is a long and complex process that is completed at 20 days postnatal. The ossicles of the middle ear are identifiable at E13. The first branchial arch appears at E8 and all arches have disappeared by E12.
Given the branchial and ear anomalies associated with BOR syndrome, expression of Eya1 was expected in the branchial arches and in the developing middle and inner ear. Within the inner ear, as a reciprocal interaction between the epithelium and the mesenchyme of the otic capsule is known to be essential for its development (ref. 29), Eya1 expression was expected within either one or both of these tissues. Likewise, the sensorineural deafness would be consistent with an expression in the neuroepithelium and/or in the spiral ganglia. By in situ hybridisation to mouse sections, Eya1 expression was detected in the otic placode, in the developing membranous inner ear (i.e. the cochlea and vestibule), as well as in the surrounding mesenchyme of the otic capsule. While the cells of the organ of Corti stop dividing at E14, expression persisted concomitant with the maturation of the sensorineural epithelium. Gene expression was also detected in the associated vestibular and spiral ganglia. Assuming a similar pattern of expression of the murine and human homologues, these multiple sites of expression thus account for all the inner ear anomalies of BOR syndrome. The expression pattern of Eya1 suggests a direct role for this gene in the development of all components of the inner ear (whether derived from the otic placode or the surrounding mesenchyme), as well as in the maturation of the neuroepithelium. The observed expression of Eya1 in the middle ear primordium is also compatible with the defects of the ossicles, and of the middle ear cavity, seen in BOR syndrome. In contrast, no expression was seen in the branchial arches, which is most probably attributable to a transient expression of the gene in these structures or to a level of expression below the sensitivity of our assays. Rather intriguing is the observation of a high Eya1 expression in another placodal derivative, the olfactory epithelium. This could be indicative of a general role for this gene in the development of the sensory placodes although, to date, in contrast to the olfactory and inner ear epithelia, we failed to detect a signal in the developing lens.
Consistent with the renal (hypoplasia, dysplasia, aplasia) and collecting system (duplication or absence of the ureter, bifid or extra pelvis, blunted or distorted calyces) anomalies of BOR syndrome, Eya1 expression was expected in the ureteric bud and/or in the metanephros. Eya1 expression was not detected in the ureter, in the kidney collecting system or in the non condensed metanephric mesenchyme. Our data show that Eya1 expression was restricted to the metanephric cells surrounding the just divided ureteric branches and accompanied the progression of the induced clustering of the metanephric cells from the juxtamedullary to the peripheral cortex. This expression was barely detectable upon completion of this inductive process (E17-E18) and undetectable in the adult kidney. Eya1 expression thus seems to be a marker of the newly condensed metanephric mesenchyme. The temporal pattern of expression of Eya1 indicates that the activation of this gene may be dependent on induction by the ureteric bud. According to the anomalies observed in BOR affected patients, this gene plays a crucial role in kidney morphogenesis, as well as in the growth and branching of the developing ureter and collecting system.
Hox-1.6 (refs 30, 31), Pax2 (ref. 32) and some components of the retinoic acid pathway (reviewed in ref. 33) have been shown to be essential for murine inner ear development. Signalling molecules required for kidney morphogenesis have been identified based on in vitro studies and genetic evidence (for review see refs 10, 34). These molecules include the transcription factors Pax2 (refs 32, 35, 36), Pax8 (ref. 37), WT-1 (ref. 38), HNF1 (ref. 39) and BF-2 (ref. 40), the secreted glycoprotein Wnt-4 (ref. 41), the tyrosine kinase receptor c-ret (ref. 42), and several retinoic acid receptor genes (ref. 43). How the developmental step(s) controlled by Eya1 fit into the cascade of events controlled by these other regulatory genes remains to be determined. Of particular interest is a possible relationship in kidney morphogenesis between Eya1 and Pax2, Pax8 and WT-1. These three genes are expressed in the condensed metanephric mesenchyme, albeit not exclusively, and for two of the three, Pax2 (ref. 32) and WT-1 (ref. 38), a direct role in kidney morphogenesis has been shown by studies of knockout mice.
This work has also demonstrated the existence of a gene family comprising at least two other members, EYA2 and EYA3. This family is defined by a highly conserved 271 aa C-terminal region which is also present in the Drosophila eya protein; the remaining part of the protein shows a lesser degree of homology that decreases towards the N-terminal end. The actual size of this newly identified gene family remains to be determined, a task which will be greatly facilitated by the high conservation of the C-terminal domain shared by the EYA genes. Interestingly, the high conservation between the human eya1 and murine Eya1 genes extends into the 5' and 3' non coding regions. This conservation, which was not found with EYA2 and EYA3, strongly suggests an important role for these untranslated regions. The conserved 271 aa C-terminal region is likely to confer a similar molecular activity on the encoded proteins. As the Drosophila eya protein was known to have a nuclear localisation, the identification of a putative bZIP domain within the eyaHR of EYA1 was particularly appealing, due to the association of this domain with various families of transcription factors. However, this possibility needs to be considered with caution, as EYA2, EYA3 and eya do not entirely fit the recently determined criteria for a bZIP domain (ref. 44).
ANNEXES
Annex 1 : Sequence of the eya1 gene
Annex 2 : Sequences SEQ ID N° 2 to SEQ ID N° 47
Annex 3 : Restriction map of the eya1-a cDNA
Annex 4 : Restriction map of the eya1-b cDNA
Annex 5 : Restriction map of the eya1-c cDNA
Annex 6 : Restriction map of the eya2 cDNA
Annex 7 : Restriction map of the eya3 cDNA
ANNEX 1
The polynucleotide of SEQ ID N°1 according to the present invention is presented hereafter. In order to facilitate the reading of SEQ ID N°1, it has been divided into five subsequences that are presented below, beginning with the 5' non coding region that is located upstream of the eya1 gene, and ending with the 3' non coding region located downstream of the eya1 gene.
SEQ ID N°1-A : This sequence corresponds to the Contig Exon P. This contig is contained in the S1C6 clone (a subclone derived from the P1 11083). The sequence contains a 5' non coding exon that extends to the nucleotide in position 794. This exon is contained in eya1-b (SEQ ID N°3). The sequence SEQ ID N°1-A is contained in the clone HSEYA1 S1C6, deposited at the Collection Nationale de Cultures de Microorganismes (CNCM) under the accession number I-1828. The nucleotide sequence of SEQ ID N°1-A is the following :
GAAAAGATAACGXGTAATTAXTCCCCAAAGATCTTAAXTGGTAACTACTGTTGCA
AGAGGACGCGTGTGTGTTGACCACGTCTCAGAGAAGTTATTCTCACACAGCTGT
TTATTCAGATAGCGTGGGGCGCGTTCATTXACATAATTTTTTTTTTTTTAAACAGC
AAGGTTGAGTCAGACAGAGCAGTGCTCCTTTCAATTGTTGCATTTAAACCAATAA
GGTTAGGACAAGAGMTAGCTGTGGTTTGCGTTGCAAAAACCAAAAAAAAAAAA
AAAAAAAAAAAGAAAGCCCCGAGGCTCCATGGGCAGACCTACAAGGCTGCGCA
AACAAATCGAGGGATGAGATTCTGCTGTTTCTTTGTCTAGGGTTCTCAGATGCTA
TCTGCCGCTGCTGTTTGGTGGGGAAGGAGCGCTGGGCGCAAAGCTGTTACCAA
ACAGAACGGTGGGAGCTGATGGCTCCGAGTTTGGGGCGAGGTAGAAACTCTCC
AGTGCCACTTCCGACTTTAAGCCTTCCTGTTGCCGTCCACTGTGGCGGGTTTCT
TCCTGGGGAACACGTTTTCGCTCAGTCGCTCGGCAGCCCGAGCCTGCGGCAGC
GGCCAGGCGCCTGCCCCCTGCGCCGAGCTTTCCCCTGCAGAGGCGCTCCACT
CCCAGAAGCGCCGCGGCTGCACCAGAGCGCCTGAGAGCCCCCGCGCGTACCC
ATCCAGGAGCAAAACTATGTCAGGAATGGAGGTTTGCTAACCCAGAAAATTCGA
AGGAACACATTAAACTGGTGGATGCAGCAGATGTAAGCGCTGTAAGTACAAGCT
GACTGTTTGAAAAATTGTAGTGACTGACAACAGCTGTCCTGTGGGCAAAGCCCA
AGCTAACAGGAGCGCTTTTATTATTGTTTTGATTTGTTTXCAAAGAAAAACATTAT
GGCATGGGGAAAATTATGACACTTTGTACCTAGTACTTTATGAGCCACAGGGTT
TCATCGTAGAAGTAGAATTCCCAACCTCCAAATTATGAGCGATCAGGATCATTAA
XAAATAATTACAAATTAATTGTTAACAATTACTTTTGTTATGCTCTAGATAAATGTA
ATGCAXGACCATATTCTTGCAGGGAGATGACTTCATATCTCTTTTCTGAATTGCT
GCACAAATCTCAAATATATTTGTGTGGTATTGAATAAAAGCXGTACATATATGTTC
ATTGGC TACATXAGTT CCTTTTACACCCTTTGATATGAAAAAAATGAXCXTT
TAXCXTTTC
SEQ ID N°1-B : This sequence corresponds to the Contig exon R. This sequence is contained in clone B4CL34 (coming from a classified lambda phage library wherein the YAC 953H7 was subcloned). This sequence contains the first exon, exon R, the translation initiation codon of eya1-a (SEQ ID N°2) being at the nucleotide in position 378 of the sequence. The end of exon R is located at nucleotide 402 of this sequence. Exon R contains a potential splicing site at the nucleotide in position 402 of this sequence (fusion with exon P) that is used by the eya1-b form (SEQ ID N°3) and by the eya1-c form (SEQ ID N°4). In eya1-b and eya1-c, exon R is spliced to give rise to exon Q. Exon Q spans from the nucleotide at position 353 to the nucleotide at position 402 of this sequence. The sequence of SEQ ID N°1-B is contained in the clone HSEYA1 B4CI34 deposited at the Collection Nationale de Cultures de Microorganismes (CNCM) under the accession number I-1837. The nucleotide sequence of SEQ ID N°1-B is the following :
TTATTTTTTTCCAAGCACCTTTTCCAAGCTGCCTGGGCAAGGCTGCTTTGCATGT
TGATTGGGGGAGGGGTAATCCCATTTGCAGCTGTCATAGGCCAGTGATGAATCA
CTATCACGTCATTTGGCTGCCTTCCCCCCCTGACAAGGGGAGGAGGAGGATAG
ACCTGCAGGAGGAAACAACAGTTGAGACCTCAACTTAAAGACAATATGCCTTTC
ACTGCTGCAGTATCCTCCTTTTTCTCTTTTGGTTAAAAGAGAGCATTGTCGTTAT
CAGCCATGTGCTCTGTATAATTAAGAGCTGACACTGAAGCAGAGTAACAACATAT
TCTAATTTTTTTACCCCTGATCACAGGTGCAAACATCTCAAGCCAGTTCAAATGT
TGCTGTTTCCTCAAGTTGCAGGTAAGGATATTGTTGATATTTGACATTTTTGGAG
ATTACCTTTCTGTGACCCATATCTATTTTTGTCTAGTGTTTCTTAATTATTTCTATT
AATAGTAACAGCCTCTGTTGGAAATTACTTAACAGTTTGTGTGTGTGTGCTTGTA
AAACATTTTAAGTATGTAGAAAGGGAATAGGAGCCTCGGTTTGAAAATAATTATG
GGCTATAGTTTCTAGTGTTAGGAAAACTGTGAATGTGTAATGAGAATCAGGAGA
GAAAAATTGGATTTTAGACCATTTCTAGTTTGAATTCAGAATATTTGGGGAAGTC
AAAGCTCACTTTAGAAGTAAAATGGTTCATGTGAAGCTGTTTT
SEQ ID N°1-C : This sequence (Contig S) contains exon S, which is an additional exon that is present in eya1-b and eya1-c. This sequence is contained in clone B4CL34. Exon S spans from the nucleotide at position 638 to the nucleotide at position 765 of this sequence. The polynucleotide of SEQ ID N°1-C is contained in the clone HSEYA1 B4CI34 deposited at the Collection Nationale de Cultures de Microorganismes (CNCM) under the accession number I-1837. The nucleotide sequence of SEQ ID N°1-C is the following :
TXCAGGXXCGACTCATAXGTTTAAGGGGCATTGTTCAGGGAAACAGGGGACAAT
TAAGTAGTTGCCAGAGTAAACACAGGCATCAGGATXTCTCXTCCTGTCATAGACT
ATAGGTTTCAAGGTTTATTTGGTTGGAGACCCATTAAATTACAACTGGGTGGTAA
ATTATAAGGGAGTACCTTAGTGCTAATTTATTAGATGAAACAGATTCCACAGAAG
AGTGATCTATTTCCATTTTACTTCTTAACATAGTTGCCTTTTCTTTTAGAAAAGTA
ATGCCTTTGGTCCTTTTTAATATAAAAAGACAACGATGACTGTATTTTTTCTTTGA
AAATTATAATAGATGTCATTGTCTCACATTAGACCTTGTGACTTGATGAACTGTGT
AAGAGAAAAATGAAATTTTAAAAGCTTCATGTGACTGGCTGTATAGGTAAAGCAA
GTGACATCTTTTGATAGTGCATATGATGAAGGGATACCTGAAGTCAAAATGTGA
GATGAATTATTTTGAGTCTAAGAAATAGATACAATGGGACTTTTGTGCAAGTGTG
TTTCAAAGGTGTCAAGTCATTAGCGCATTAAATGGTGGTGTTATGAATTTGGTAC
ATCTGTCAAATGGAAGGCTTCTTGTCTTTAGGTCTATGGAAATGCAGGATCTAAC
CAGCCCGCATAGCCGTCTGAGTGGTAGTAGTGAATCCCCCAGTGGCCCCAAAC
TCGGTAACTCTCATATAAATAGTAATTCCATGACTCCCAATGGCACCGAAGGTGA
GTGTCTGCCTTCTCACTATTTACTTTGCACCTTGTTTCTCCAXCAGTATATGCTTT
GTTATTAGGTGGTTATGATTTCAGTTGTCAAAATCATAAAACTACTCTTATATTTC
CCCTTTAGGTAATAATGTGACTTACCAATCAGTACTCATTGTTAGGAGCTTCATG
TTTCCAGTTATTGTAACATTGCGTAGATTGGTTTCGTTTCACCAAATGCATATACA
TATATACACATGTATATTATACATATTTGTGTGCATAACATTTTTAACCTTTGAGT
GTGTTTTATCTTCTTATTGATCATGTACACXCTGCAGTATCAATTTGTGTTGTXTT
GTTGTGTGACCAGTGAAAACTTGGAATXATATCCXTTATTTTCCCXTTTGTGGTT
AAATATAGCCXAATTTGTTTAAATGTACACXTXGGACTGGATTGTTTGTAAATATC
AAXGAAATAATTTGTGTGCCXAXTGTTCTTAAGGCTTGGCTAGAAAATAXAAXAA
TTTTAAAAAGTATGTTTATGAAXTAXCCXGCXTTTAGGTGATAACXCCXATTATXX
CTXAAXATAGGTCTTAAATGTXAACATXCATATTCTXGTATTAAACCAGXAXATTA
CXXAATTTAATAATCXAATCCXCTCGAACCTA
SEQ ID N°1-D : This sequence contains exon T. Exon T spans from the nucleotide at position 1056 to the nucleotide at position 1133 of this sequence. The sequence is contained in clone S3B12 (which is a subclone derived from the P1 11083). The polynucleotide of SEQ ID N°1-D is contained in the clone HSEYA1 S3B12 deposited at the Collection Nationale de Cultures de Microorganismes (CNCM) under the accession number I-1827. The nucleotide sequence of SEQ ID N°1-D is the following :
TTTGATTAAGAGXAXAAGGGAAATAACCXTTTTACCAACAAAAAAGCXAGGGGTT GGGAXAAAATTTTTTGGGAXGGGAGGGXCAAXTTTTXXACCCCAGTAAGGTT
TTTTXTXXAGGGXAAAAACCAAAAGGGAGGGXAAAXTTTTXTTTTXXTGXAAAAA
GAAXATXTTAXAAATTTGGGAGGGXAACAAAGGGXCXGGATAGTGXXXAAATTXT
ATTGGCTTGCCTTXTTTXGTTAGGGGGAACAAXAAAAAGGTTXACCXATATTCAG
GGGAGTGGAXGGCTXTTAAGXGAATXXGXXGCATAAAATTTTAAGAGXTTXTTTT
CACAGCACAAAGGGACTTGACCCAAATATCAATGTATGTATTTCCACAAAXAGTC
TGCXATXCAATAXXGTXTTGAAATAGAAAACTTTGTTTTGTGGTXTCATAAACATT
CAGTTATGTTTGXTGGTACGAAGCATAACATAAAATATTGGGTGAATTGGCAGTC
AGATATGGXTGGGTGGTAATGGATGGTGCATTAAAATCAAGCCCTXGGTTTTTTA
GCCATGCTGGAGAACCCAGAAGGTGTGTGTGTGTGTGGTGGGGGAGACXTTGC
CCTTGAGCATCAGGAAGACCTGCTGGTGTCGACACCCACTTACTTCXTGCTTCC
CCACATCTTTCTGTTCTATCCTTTXGACTCTGACCGAATAACAGATTTTTTCCACA
ACCCCAGGAAATGATTTTATATCACATCATGTAGATTTTGAGAGATAATTTATCAA
TAAGTAATTCAGTAAATATGTAAAACAATGATGXATTTAATTAXTTTXACAGTGXT
GCTTTTAAAGGGTAACTTGTGACAATAGATTGTTTCCTAAGGGGAAAGAACTTAA
TATGTGAATTTTTTTAAAAGGAGTCTATTTATGCTTATGATATATGTTCAGTTAGG
GAAAATGTTTTTTGAAATGATTTTCTTGTCATTACTCTGAATGTTATATCTTCTCTT
TCCATGTCTTTACAAATAACTATTAATTCAATATCGATGTATGTTGTTTATTTTTGT
AGTTAAAACAGAGCCAATGAGCAGCAGTGAAACAGCTTCAACGACAGCCGACG
GGTCTTTAAACAATTTCTCAGGTTCAGGTAAGGAATAAATATTAAATTTTCACATC
AACACCAAGCATTTTTTCTAGTTTATTCATGTAATGTCCCTGTGTATGTATATGTA
TACATGTATATGTGTATATATGTGGGTATATACGTACTTATGTATATATACATATA
AAATAGCTTTGCTCAXAXTCACATATTACCAATXCTGTTTTXATTCATTGTGAAAA
AGTGGTTGCTGAAATGTTGCTATCCAAXAAATGGGGTATAAAAAGGGTGTTGTT
CTTCTCCTTTCTTCTACTAATTATATTTACTGTGCTGGTGACTCCXCCCCCGTTCT
CTCTTAXATGTATTATTTTTCCAGGTCTXCAAATTTGGTXAGGCXAAATTTGGTGT
TTTACXCCXCTAAGCAAACTCGCCAAXTCATCTTGCTAAXTAACTGTCAXATCXX
CTATAGATCTAAATXAAATACACTCTGTAAAGCAXCATTAAGTXCTTCAGTTAXTT
CTTAATCTCCTCCACATTAATAAAACCXGCCTCCAATAAXXCAXAATAXCCAACTT
TCACAAGTGTCACAATGAACTACCGACXAGAAAATGACTGTGATCXXAXGGTGG
TXGTXAXAACGTTACXACATGTTTGCCXXGTCACCCACCCCCATTTTGGXAATAT
XXCATTCAXAATXTATTTACXACCCAXGGTAACCTCAGAAATTCXACCTXTTTCXC
AACXCXTGCCTXCTAACXTTXCAGCCCCCCCCCTAATTXGXAAAATXCXATAACT
TCCACGTAAAXTCGTXAATTGAXTTTAATCCXAACCTCCCCTGGAATCXTTTCXTC
TCAATACCAXTTXTCATXCCCCCTAXAXAACAACTTXCTCTTXTXXXCXAXCATTX
AAAAXXXGCTAAAACAXCACXCACCCTCCAAAXAXTTCACTGAATTCTCACAGCT
TXGAAXAACCGAAAXTXCAAAXTTTXTXCCXCCGGGGCXTTCAXTCCCCCCCTTT
TTGTTTCCCCTTCCTXCTAAAAAAXAX
SEQ ID N°1-E : This sequence corresponds to the Contig 4405-9480, i.e. the assembly of the sequences derived from the P1 9480 and the P1 4405, which are publicly available at the Genome Systems library, Missouri, USA. This sequence contains 13 exons of eya1, which are respectively located at the following nucleotide positions:
- Exon U : nt 2606 to nt 2675;
- Exon V : nt 2995 to nt 3140;
- Exon W : nt 7185 to nt 7322;
- Exon X : nt 25141 to nt 25223;
- Exon Y : nt 25628 to nt 25814;
- Exon Z : nt 52945 to nt 53084;
- Exon A : nt 55019 to nt 5102;
- Exon B : nt 80417 to nt 80235;
- Exon C : nt 107808 to nt 107866;
- Exon D : nt 107979 to nt 108140;
- Exon E : nt 109103 to nt 109217;
- Exon F : nt 109323 to nt 109444;
- Exon G : nt 113575 to nt 113675.
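As an illustration of how the exon coordinates listed above can be used, the following minimal sketch computes each exon length from its 1-based, inclusive start and end positions and flags entries whose end position precedes the start position. The coordinates are copied as printed; the function name and data layout are hypothetical, and no correction of the flagged entries is attempted.

```python
# (exon, start, end) as printed in the list above, 1-based and inclusive.
EXONS = [
    ("U", 2606, 2675), ("V", 2995, 3140), ("W", 7185, 7322),
    ("X", 25141, 25223), ("Y", 25628, 25814), ("Z", 52945, 53084),
    ("A", 55019, 5102), ("B", 80417, 80235), ("C", 107808, 107866),
    ("D", 107979, 108140), ("E", 109103, 109217), ("F", 109323, 109444),
    ("G", 113575, 113675),
]


def exon_length(start: int, end: int):
    """Length of an inclusive interval, or None if end precedes start."""
    return end - start + 1 if end >= start else None


lengths = {name: exon_length(start, end) for name, start, end in EXONS}
flagged = [name for name, length in lengths.items() if length is None]

for name, length in lengths.items():
    print(f"Exon {name}: {length if length is not None else 'coordinates need checking'}")
```

With the coordinates as printed, exons A and B are flagged because their end positions are lower than their start positions; the remaining exons range from 59 to 187 nucleotides.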
The nucleotide sequence of SEQ ID N°1-E is the following :
GAGGATCCTGAGAGTATGGATGGAGTATGCAGGACAGACCAACAGGCAATTCT
GGATTGGGGGTCGGGGAGCCACCTGAG
TCATGAAATAGAGGGGAAAACAAGATGGCTGTGAAGTTCTAATTTCCACGCAGG
GGGTCGGAAGACAGTATGCTTTGGTG
GGAAAAGTTAGGCAGACCTGGGTTCAATTTGCTTAGATACGTTACTTTACTTCTA
TCAGCCCTTAGTTCCTTATATGTAA
AATGGAAACAGAAAATCCAACCTTGAAGTATTTTGTGGGACAGATGGTGATGCA
GTCCACTGAGTAGAGTTGAAAACACA
GAAAGAAAAGACAAGTGTGGGGTAAAAGTTAGTGGGAATCTGGGAAAATTATCT
CAGTCTTCATGACAATCCTATGAGGT
GTAGGTAGTATTACTGCCATCATTTTATAGGAGAACATATGGAGACATAGAGGG
AATAACTTGACTAAGCCCACATAGAA
GAAGTGGGATTTGAATCTGAGCTGAAATTCAATTCTGATTCCAGAACCTAGGCTT
AATTCCCACGTTCTGCAGCCTCCTA
AGTCTTTCATCATATGTATATACACACATACTGTGCATATCCTCTAAATCAAAGCC
TTATGGGACTTCCTGTACTCTGTG
TAGTAACCCATAACCCTGATATCTTTTATTCTGTTGTTTGTTAAAAAGAAAGAAAA
AAGAAAAAAATTGTTGGGTACAAC
CTATTGAGTTTATG CTTTTCATGAGCCATGACCCACAATTCAAAAAACATAGA
CATAGAGCTAGGAAAGAATAAATGG
GGTTTCTGTGGCTTCTGTCAGCAAGTAGCATTGCATTTTTTTGTTGTTTGGTTTT TAACCAATTATTTTGAACTCCAATT
GTCCCAGGAAGAGATTACTTCTCCAATAATTCTGCTGGGTAAGTCCATGTTACC
CTTGGCCTGTTTTCTGTCTTAGAGGA
GCAGAAGAGGTTTTAGTTGCAGATGACCTGGGCAAGTGGTAGTGCTCATAGTCC
AAGTGTTCTCTCACTAATCCCTCTCT
TCTATAAACTGCAGTCTTTAACTGACCATCCCACAGTTCTTTATGTGTCACCCTT
CCCATGTGTATCAGACAGATCTAGT
CTTTGCACCTTCGCACGTGCAGCATCTAGCACAGCACCTGGAACAGAGTAGTG
GCTCAGTATGTGCCAGATGATGGGGTA
TACCAGCTGGGCCGGCTTGCCATCTCTTCAGTACAGATGCTGTCTCTGCCAAGC
GGCCAGCCAGATGTTTGAAGTCTGGG
AGCTTGCAAGAGTGGTACTAATCTAGATCTACTGTTCCTGGACAGGCCTTGTTT
GTTTTACAAGGAAAAGATTACTCATA
GATTTGAAGACTCTAGTCCAATTCTACCATTGTCATCCCAAACACAAGCTAGAAA
ATTAAATTTGAGAAACATTAATGTA
CCAGTGTGACAATGGGAAAACTATATGAAACTTTGAGTACATGGAACCTCCTAAA
AGGATTTTAGTGTAGAATAGAATTA
ATAAATAATAGAGCCTCCGAAGATAGTAATTATTGGTAGAAGAATAAAATAAACAT
TACATCTTATAATAATTGCCCCCA
AGGAATATGCATCTTAGCCCGATAGAGATTAGAGAATTGGAGGAAAAAAATATGT
ACATGTATACCCATAAACGTAAATA
TGTGTAGATATATATGTATGTATATATACACACATATATGTAGATTTATATATATGT
AGATTCACTGTGTTTACAATGAA
AGAAGCTGAAACTGCCATTTTCAGGACTTTGTTAACATAGGTTTTTGCTTCTTCA
GTTCAAAATTAGAGCACAAGAACAT
TTGCTTGGACTTCAGTGCATTATTTAATCAAGTAAACGTGATTAAGAAGTAAAGT
TCTGCTTGAATGCTTAAATACCTCT
TTAAAAAATAAAGGCAAACCTTTTTTTAGTTGAACATTGAAGTTGATCACATATAA
CTAAAAGGCAAAGGAAGACCCAGA
TCAGCACAGGAAAAAACAGGCTTAGCTTGCAACTTGAGAAATTTAAGATAGTTAT
AGGAACGCATTTTTACTAAAAACCA
TTGCTAAATCAATGCAGGCAAATGAGGATCTATCCTGTTTCTTCTCACCCGAGAG
AAGATTAGACCATGTTTCTTGGTAA
TACAGCCTCAGGTTTGTGTGTTCTTGCTGAGCACAAACAATAGGCCAGCTGACC
TTTCAGATTCTCACCCAATCCTGCAA
TTCAGCTTCATACCTCCCCTTTCAGCACAGCTTTCCAAACAGTAGCCAGGACTC
CTTGTACTCTTTAAGTATTGCCACTG
TAAAAACTCACAGCATGCCAGCAGGTATTTGCGAAGATTGTGATGACGTTTTCTA
AGAGTTCATTTAAAACTAGAGCCCT
TAATCAGTTTCTATTCATTCCTCTTTATAGTTTAACTTTCTGTTAGGTGGAACCAT
ATGAAATTTCCATCTCCGCAGGTC
ACAAAGACCAAATATCTGAAATTTCATATGGCCAACCTATAGTTGAAAATATGTCA GTGCTTTCTATATATTTTGTGTTT
TGCTTGGATTATTGTGCATTATTTTGTTTTGTTTTTGTTTTCCAGCAATTGGGAGC AGTAGTTTCAGCCCACGACCAACT
CACCAGTTCTCTCCACCACAGATTTACCCTTCCAAGTAAGATGTATTTTCTCTTA ATCAATAAATAACTTCTCTGAGTAG
ACTTTTATTTATTTAAAGTCATGTCTGTGCCCACATGTTCCATCTTAATTTAAATG AAAGTGCATTTCATAGAGATGTAA
ACATATTGATACAAATTTATTTTGAGCATCATGTAGTGGAGACACTGAAGTATTTC
AGTGCTTTGTCAATTCTGTGTGAA
AGTTTTACTTTAAGATACAATATTTAGCTGGAATTTGTGATGTGGTTGTTAATCGG
TTTGCATGTTTCCATCTAATGCAT
GGGCTTTCTATTTTCTGTCATTTCTGACAAATAGCAGACCATACCCACATATTCT
CCCTACCCCTTCCTCACAAACTATG
GCTGCATATGGGCAAACACAGTTTACCACAGGAATGCAACAAGCTACAGCCTAT
GCCACGTACCCACAGCCAGGACAGCC
GTACGGCATTTCCTCATATGGTGAGTAACCTGCAACTGTAGTGGTGGCGTTATC
GCATGGGCATGCGTGTATTAACATGT
TTGAGTGGTACTTAAAGACCCAATCTTTGGCCAATTTAGAACGTGTTGTCACCTT
CTGTGTCTAAATTCTTAGTATGATT
GTTTCTAAAGAGCATTATAAGAACTGGTAGCTCCTTATCTTGAGGTCAGAGGTAG
CGAAAAGATTATGTGGTTCGTATGC
CTTCTTGCTCTCATTTTTCTCGATTTACATCTCAACCCTGAATTTGCCATTTGAGC
ACTGGGCTCTGGAATCCAAAATAT
TTTTGTGCATATTTGTGATAAAACGTCTGTGAATGAAGTCTTTTGGGGTACAGAT
TCATTTCCTATTTATCAGATGCCTA
GCTGTGATTAGGAACGTACAAGTAGAATTCAACAAACTCTTGCTTTAAACTCGAG
GGTGGTAATAGTCTCAAATTTCAGC
ATTATCAACCATTAGGAAAAGTGATCTATATATCACTGCCTGGAGTGTACTTTGA
AGAGGCAAATCGGGAAACATTTAAG
GCTCACATGGACATTTGTTCCTTCCCACCTGGAGATTATCTTTATCTTGGTGTAA
TTTAACTCACTTTTTATGGGTCAAT
ACCTGTAAATTAAAAATAAATCCAACAAAATCCCAAAGCTTAATTAAAAAAATTTC
AGAATTTCAAAATACCTTTTCTTC
CATTAAATTGTCATTGCAATTCTAAGTAGATTGAACTGAATAGAAACACTTGAATT
ATTGGCATTTTATTAAGAAAAGAA
TAAAAACTAGTGATCAAGGAATAGTACATGGACACTTGGAGGAAAGGATTGATA
GAATAAGTAATTCTGTGCAACAAGTA
CAGAATTTTTTGTGACCAGTATTCTAATTGTGCTTCTGCTTGATACTGTTTTGTAG
TATTCTTTTCTGTGGTATATAAAA
AGAATTAATGCTCTTCTTTAAGTATTTCATTAATAATTAGAAGGTAAATTACTTTAT
TCCTTTTAATTCTATCTCAAACA
AAATCAACTGTATATGAATTTCTTATAAAATATATACAAATAATCTAAATTATGTAC
GACTTCAGCATTTATAGTTTTAT
TTTAAAACCATCAGAACCAATGTTTTCTAGATAGGTATTTTTAAAGTAGACTTTAG AGACAGCCCTAAAATGGAATTTTT
AAAACTATCTCTCTTTTCCATAACTTTTATTCACAGTTTCTCATAATAATATTAGCT ATGGATTAAATTTAGCTGGAATT
TACTTTCAAACTATTCTTGGACTACAAGAAGAAAAAGGATCAGCTCCCATTCTAT AATAGAAGTCAAGGCAGTCTTTGAT
TTACGTTCTCAAAATTTATTTGTGAGATGGTTATTTGAAGTATAAATGTATTTTCC AGTCCACACAGTTCTAAAGTACTG
TCAGCCTACCAGTGCTGCCATTGGAAGGTGCCATTCCCATCACTGCCCTGGCTA CCAGCCACACATCTTTCTTCTTCCTT
CTCCTCCCCATTCAGTCTTGGTGGGACCTGGACATCTCCCTTTGCCAATTTCTG AGCTCTGGGTTAGGGGTTCCTCCAGT
TCTAAACCATTAATGAATGTCCTTGTCTGAAATAGCTGTTCAAAGAGTGTGAGAT
GAGCAAAAGATCTGGGAATAGAATA
AGAAGTAGCAGAGTGAGGATGGGGATTTTAATGGGATAGATGGGGATAAGGGA
GGGGGAGAGAGCGAGGCTCAGTGTGAA
GGGCTTTGATTCCAGAGTTTTCCATGGTTATAAATTCCCAGGGAATAATTTTCAG
TTCTTAGCATTTTCCTAATAACCCC
CATCTTCTCTTTTGGGAACTTGAGTAGAAGGAGATGCCCTGCGTACTTCTGATC
CAAAAGTTCTATTTTGGGGGCCAGTG
CATTGAAAAATATGGCATCCTTCTCCCCTTTGCTCATTCATGAAGGAAGAGCACA
TAGCTTGGGTTTTAACTGCTGATAG
TCAAGGCTCAGATGGTCTTACAAAACTTTCTCCAAGGACTTGTACTCTTGACTTT
TTATTTAGACCCTGTAAGAAAAGGT
TGGTGTATCTTTTTATCATGAGAATGATAGATCAGTGGGGAAACACAAATTCATT
AGTCTGCCAGTGAAATGAGTTTGCCACCTTGTAGGTAGAAGTCACAGAGATAGG
TGTGGCTATGGGTGCTTTCTACCTCT
TTCTGCCTTTCCTTACATGTTCTTATCACAAAGATCCCCATCTGGAACCAATAGG
CTAGAAACAGAATAAAAATACATGA
CTTAGGCCAGGCGCAGTGGCTCACACCTGTAATCCCAGCACTTTGGGAGGCTG
AGGCGGGCGGATCATGAGGTCAGGAGA
TCTAGACCATCCTGGCTAACACGGTGAAACGCCGTCTCTACTAAAAATACAAAAA
ATTAGCCAGGCGTGGTGGCGGGTGC
CTGTAGTCCCAGCTACTTGGGAGGCTGAGGCAGGAGAATGGCATGAACCTGGG
AGGTGGAGCTTGCAGTGAGCTGAGATT
GCGCCACTGCACTCGAGCCTGGGTGACAGAGCGAGACTCCGTCTCAAAAAAAA
AAAAAATACATGAATTATTGGTATTTT
ATTGGTATAGGTGAAGGCTGGAAGCAGATCAGTATAGTCTCAAATGCAGCAAGA
GGTGGAAAATCCCTAGGGGTGACAAC
ATGGAAAGAAAGGGATTTTCTAAAGCCATCATGCCACCTGTCTGGGGCAGGTGT
ATTTCCAGGGACCATTTCTATGTGTG
GGTTATTCATTGGTTTGGGATCACCTTTACAGGGAAAATCACATTTTTAAAATGT
TAAGGTTGTGTTAAGACTTAAATGT
TAAGGTCATTATGAAGTTATTTAGAGATACTGCAGGAATTAAAGTAGGCACTCTC
AGCCAAGAGTCAGATTCTCCCACTG
CCTTAAAACAATTACCTAGCTGTACTCTGCTGATGAGATTCCCCTAAAATGGAAA
TAAACAGTAAGACTACTGTGTATTC
TAGATCCTAAAGATATCATTTATCTTAAGTACGAAGCAAAATTATTAAAAAATGTG
AATTCCCTCGGCAGAAATTTTTCT
TCCCTCCCTCTGTGAGAATGAAGGAGAGTTGTTAACATGTATGACTGTCTTTAAT TGTAATTAATTAAAGACAGAGCTGA
GCCCCATACCTCAGTGCTGCCAAATCCAAGTCCCAAAATTACGATGTTAGCCTG GTGTTTGGGGAAGTATTTGGGGGCCA
CAGATACATCATTTTTATATTATCATATGATAAAAACAAAGGTACTTTAGACTACA TCGAAAACAATTATAGTCAAATAT
TTCACACTAACAGGGACAAAAGGAAACTGCTGGTCAACCTCACTTGTTTATAAAT CACTGCTGTTTTATAAGATTCAGCT
CATTAGTGTTCTGCTTCCTTTGTGCTTATTTTCTGATAGGACCTAGATATTCAGG AAAAGCCTTTCCACCATGTTGTTTA
GTTCTGTGCCCAGTATTGCTTCTTTGGACTTCAAAATGAGTTGGTACTGAAATCA GCTTTGTAGTAGAAAGAGTAAAGGA
GGGAAAAATACTGTGATGGTATAGGATCTAATTAATATTTTTAATATAAAAAAGGA TTTTATTTTTAACACTAGTCAATA
TAGCAACTTTGGGAACATTTAGCTAAAATTTAATTATGGCTATTTCTTTACTGTGA AGGAGATGATCTTCAACTCTTACA
TAATTAAAACAACAAAATTAAGTATGAAAATATATCACTGGAGTTACATTTTATAG AAGGGATATGTCTTGAAGTGCAAT
AAAACAAGAGAGAATAATGTGAATAGACATTCAAATCTGTAATCCTGGATTTAAA AAACTGAAGTTGTACTTTATGGTAT
TTCACTGAATGCATGTATACCCTTTATTTTTGAGATAAGATTGGGGAAGCATTTG TAGTTAAACGGTTGTGGAAGTGTAT
GTTTTTGTATATTTACCAGCTTTTGAAAATGGACAGATAGATAAGCTATCACTAAC TGATTTAGGTGCATTGTGGGCAGG
CATCAAGACTGAAGGTGGATTGTCACAGTCTCAGTCACCTGGACAGACAGGATT TCTCAGCTATGGCACAAGCTTCAGTA
CCCCTCAACCTGGACAGGCACCATACAGCTACCAGATGCAAGGTCTGTATATAC ACATTGCTATTTTCCTTAACCCTGTA
GGTTGATAGTTAACTTCTAAAGTAGAACCTGATGTTCCATGTAAATAGTAAATGA TGATGGCAACTGGATTGGGCTTAGA
GGCAGGACACCTGCTTCTAGTTTGATTCATCACTTACCTATTTTCTAGGCCTAAA TTTCCATATTTATAACATCAGGGGG
TTAGACTGGATGATCTATAACATTCTTTCCAAAAAGCACTAAGATTTTTTGTGAAG TTGACATTTTTGCCTTTGTGTCTT
CTGTCAGTTATAAAAGAAATGAAAAAGTCAAAAGTCAGCCACGTAGGAGGTTGT GATATGGCATGTACTAAAAACTTACG
CACTTGGTATTACCATCCAACTTTTCCAAAAGGGCTGTATTCTTTATTTTCATTAT GAAGTGAAGCTTGGGGAGTGACAG
TCATAGCTGTAGAAATAAAGTATGTGATAAGTGTTGAATAATGATCTCCCAAAGA ACAGCTTGATCAGATTTACAGACAC
CCAGTGAATCTTTAAGAAAATGCTTGTAAGAATCAGTCCCTGGGTTCTGTTTTTG AAAATATAGACAAAACATAATGTAA
GTACAAGTCAAATGGAATGACAGCTGTCATTGTCAAAGCTGGTCTCAAGAATAC ATAGACATTACTTAAAGTTAGCACAT
TGTTATCTTGCAAATAGATCTTTAAAGTGCTTGAGAATAGTTCTCACATAATTTAA ACATTTTCCTAATATCTTGTCTTT
ACTACAGTTCAGTTCAACGAATCATATTTGGCCTCCTAATAGTGCCAATCCATAG TTGAATTAAATCCCTGATGATCTCT
TTTTGATTGAAGGGGTTCTTTTTATAGTTGTTATTGTTTTTGTATATTTTAGCTTAT TCTTCATTACATTAATAATATTT
GTTTATCACAGACAAATTAGAAAATACAGAGAAGCAAAAATATAAGTCTCATACAT TTATTATTATCATTATGATATATG
TTTGTCTAGATATACAGCATGGGCTCATTATTTAATACAGTATTGCAGCCTGCTG TTTTTAGTAACAAGTACATTTTAAT
CATCTTTTCACATCAATAAGTTCTGTTCTATGGTATCCTTTTGATGGCTACATGGA ATTAGCTAAATAGGTATATCTTAC
TTGACTAATTCCCTAAATAGATATATCTTATTTACCTAGTTCCCTACTATTTGACA TTTAGAGCAATGCTATGCATATCT
TATCTACACACCTTTAATTATTTCCTTTGAATAAGTTTAAGTATAGAATTTCTAAGT AAATTGTTTTATGCTCCTTTAAG
GATTTCCTATATATTTACACAAGGAAGGGCTGTGTGTTAAAGTATACACATTTGA TACCTTTCTTTGAAAAAAGTGTCAT
TTTGGTAGGTACAAGATTTTGTTTTTACACACATTTAGAAAAAACATAAAACGTTT TTTCTTCTCTTCAGTGGTTGTCAT
CTTTATCTTATTATTTTTCACATTAAAGATCTTAGCCTTTTTAAATGTCTTATTAAT ATGTTTCCTTAAGAATGATTTTT
TTTTATTTATCATTTTTAAGAGTCTATTTTCCTGTATGGCAAACTACAATTTTAGTG GACAAGATCATTAACCTTTTCAT
ATTTGATTTTGCATTTGATTCAAAAGGCCTGAACTGTAAAGTTTTGACATATTCAG TGATTATAAAATAAAAACTATTTT
ATAACAGATAGCATTTATTGGTAGAAGGATATTATTTATGTGCCTCATTCCAAAAC TTTTATGCAGCCTTGGAAAAGAAC
TGTTCTGCACATGCTGAATATCAATTGAGTAAAGGAATATAAAGAACAAAAAGTG ATGGAAAGACTTTATTAATGAATTT
AACAGTCCTAGGCTCTAGTAATAAAATGGATCATAAGGCAGACAAGGTCTGTTT GCTCATAGAGCTTTCATTCTTGCCAA
AGAGAGAGACAGTAGACAAGCAACCAACTAAATTATGCAATTGCAGATTCCTCC GGGAGACATCAGGAAAATAAAGAGTG
CTGGAGGGGTCCCAGGGTCTAGCAGGGTGCCCGTGGTATCCAGGTCTCCTTGA GGAGAAGTGGATCCTAAAAGTGGGAAA
GAGGCAAGTGAAGTGCCCTGCCCAGAGCAGAGCTACAGAATTGTATGGTTCTG GGCCTGCGCCCATCCACAGCCTTGCCA
TTACCCGTCCACCAAAAGCTAAAGCTAAGTACAGAAACTGAGAGTTAGCATCTA GAAACTTTCATAGTCAGTCAATAATA
AATTTTATGTTTATTGAATCAAATAATAATTTTTTAAAATGAGACTTATACTGTCTC TTTTTTTAGAAAATTTTTTTTCT
TTTAG GTACATTACTTTGTATTTTGCTCAGTTGTCAGTCTGTGAAGGATTGGA GGAAAAAACCAATCCTGGACCTCAG
ATAGTTTGAGAAATATTGTGCTTGAGAGTTACTTTCATCTTGCTCTTCTCCCAGA TCAAAACTTATGCAGCAGGAACAAT
TCAGAAAAAGACAAACCAGCCATTACAGTATAGTGCATAAAAAGGGGAAATGAAT GAGTAGAAAAGGAAATGAATTTGGA
AGAAAATAAAATTATCATTTAAAAGAAATAAGGACATGTATAGCACCAAATTAGGC ACTGCTTTTCTAGCCATAGATTTA
TTATCATCTGGGCTTACTGGATATTGCATTAAAACTTTTTGTGTGTTTTTATTCAT C C CTTAGTTGAATGGCCTTTCTGT
GGGACATTAGGACCCATTTCCCATGACTGAGGCATTATTTCACATGACTATCTTT ATAATGATAATTTACTCAGTTTTAG
ACTATTGGATATACAGGTTGAGTATTCCTAATCCAAAAATCTGAAATCCAAAATG ATTCAAAATTTAAAACTTTTTGAGC
ACTCACGTGATTCTCAAAGAAAATGCTTATAGGAGCATTCTGGATTTCAGATTTT TGGGTTAGGAATGCTCAACCAGTAT
ATAATGCAAATATTCCAAAATTTGAAAAAAATTCAAAATCCAAAACATTTCTGGTC C CAAGCATTTCAGATAAGGGATAT
TCAACCTGTAATATGACTTCCACATATTAAGCCAGATGATCATATACTATTAATTT GAAAATTAACTTGCCAAAATGATG
ATGGATAATGCTATATAATTTGCTACAGAATGTATATGGCCAGATTTGGGTACTT
AATTGTGATTAATACGTAGTTCTTA
AATATTAAATGCAGTATGTATACTGCTTCTCATTATTCAATGATTTTGTGCACCTA
GTTGCGGGAGTCTGTTGACCTTTG
AGTGTCTTTTTAATACATGTTTTTTAATATAGCAATTTCTACAATGTCTAAAATTAT
TTTCAAAATTCTCTCAGTTCTTT
TAATCCTTAATGAATAACAGTGATTATTTGCTTTTATCAATTTCATCTTATAGCTAT
TAAAATGTATTTGAATCCTGGAT
GAGAATATATTTTGGCAATATGGGGGATATTTTTGCCTAGCTTTATTTCCGGGAA
ATCTTTTCAACTTTAAAAATATAAG
GTATGATGACTAGTTATTCAGAGATTTACCTGTTCAAATTTAATATCAGAATTTTA
CAATGATAGAAGAACTATTCTAGT
AGCCCAGGGGAAATATTCTAGTAGCCCCAAAAACTGAAGTGTGACATGTATGCA
TCCTCTAAGTGTTTTGCTGATTAATT
TAAAGAAAACTTATGATGATCAGCTCTTTGGTTCTATCAGACTTAGAAAGACTGG
AGTAAAGGTACAGTGAAATCTGGCT
TCCTGGTTAGCAGTATGGGTTCTGGAGTAAGATTGGCTGTCAGCAGTTCTACCT
TTGCTTTCTGTGGGCCTTCAGCATGC
AATGACTCCTTCACATGTGTTTCCTTATCTGGAATAGGGGAATGAGATCTACTCC
CACAAAACTGTTGTGAGATATCATT
GAGATTGTCTGTAAATCACTCTGACCAAACTGTGGTACATAGTAAGAACTCAAAT
GATTGTTGTCAATTTAACTGCTCTC
ATTGCAGTGGTGATGTGATTACTATTATTTTCATTATTATCATTATGGTCCAGCTT
TCTGGAATAGTTCCAGATTAGGTA
GCATAACCTAAATGCTCCTTCTCTTCCCCCCACAACACACAAAGTGTAATTAAAA
TTTTAATATGTGTAATAATATATAT
AACAAAATATATTTTAGCCATGAAAGAGAACTTTAAGAAACCAATAACCAAATAAG
CATAGCAAAAAAGGAGCCAGATGT
AATTAATATGCTGGCATGGAAGGAAACTTACTATCAACAAAGGTCAAGCTTTTTG
TACATGATACCGAAAAAGCGGGGGA
GAAACAACATGAAGAAAAAAATCAGGAGAGAGGAAAAGCTGTATGGCAGCAGTG
GTGAATGGGGTGTTGGAAAATGAGAG
TGAGGAAGCAAGGAAGAGCCGCTGGCAAAAAATAAACTCAGCTTTGCAGTGGT
GGTGGGAGAGAGACAGTCATCAACCAA
GTCCTTGTTGGTCATGGGCCCTGCAGTCACATCTCTGAGTGCTAAGAATGGGTG
GAGAAAGAGGTCTCCACCAGTCATTG
AAGCATCAAATTGTTTAGTTCAGAAAATTAGGGAAGAAGATGGGAATGTGATAAG
CTGTGCTATGTAGAGATGGAAAACT
CTAGTGCCCATACAACAAAATAAAATAACAGTTGACATTTATTGTTTACTTTTTAT
ATGCCAAGGACTAACCTAGTTATT
TCCCTTGTGACATCTTATTTAATCTTTACATAACTCTGAACAGTAACTTTGAAAAG ATAAGAACCTATGGCTCACGCCTG
TAATCCCAGCACTTTGGGAGGCCGAGGTGGGCAGATCACCCAAGGTCAGGAGT TTGAGACCAGCCTGGCCAACATGGCAA
AACCCCGTCTCTATTAAAAATACAAAAATTAGCCAGGTGTAATGTCACGTGCCTG TAATCCCAGCTACTTGGGAGGCTGA
GGTAAGAGAATCTTGGGACCGTGAGGTTCAGTGAGACAAGATCGTGCCACTGC ACTCCCGCCTGGGCAATGGAGTGAGAC
TGTGTTTTCAAAAAACTAACTAAATAATAAATAAATAATAAGAACCTGAAAATGCT
AAAGAATGCATAAAATTAGTAATT
TAGTTTGTATGATATCAACAGTTAAAAATGGCTACTCAAAACGGAAAATGGCTTA
GAAATGTCAGAACTCCTTGTTGTAC
ATACACCCAAATTACAGAGCTCACAGGAGTGGCTTTTAAAGAAAACTACAAAAAG
CCAAAATGTTGTTGCTTGGTCTTTA
AATTTTTCTGAAGGGCAGGCTTGATCGTTAGAACTAAACCATTGCATTACATACT
TAGTTATATAACTAGTCAAACACCA
ATTCATGAAGCTCTAAAAAAAACTTCTTAGAAAACTCATTTCTGAGCACTGTCAGT
GCTTTGACACTTGCACAGCCTTTT
AAGAGCAGATTTGGCAATTTACGGCCATTTTGCCACTGTCAAACCTTGTGAACT
GGGCCAGCCTTACTTGGTGCCCTCAT
TTGCTCCAAGTGATCCTTCTGGAATCACCCTTTCAGCTCCAACTTTCAACACTTC
TGTCCTGCCAAGCAGGCTTCTGTGA
ATCTGACACATGCCATTTTAAACTAAACAAGGATCAGATTTTCGTCGTGCATGCT
CTAATACTAAAATTCCATGAAATAA
AACCTCAGTGATTGGGAAATCTGATTGTTTTCACCTTTGCTGTAAGATCACTGGA
AAATTCTATTCATGCTTTAATTTTT
CTTTTGTGCACCACTAAAAGATTAGAAACTTAAGTTCAGCTGAGAAAGACAAAGT
GTTCTTCTTCATAAACAACTGTAGA
GTTCAACCTGAGGAGGTGAAATCTTTGCTTATTCATTTTTGACATTTTCCTGTTC
CGCAGTGTTAGCTTTGGTTTTTCTG
GAAGAAGTTTTGGAATGGTCTGAGGGGGTTTTTCTTCACACACCATTGACAAAG
CTGATGCAGATGAGTTAGCTGAATGG
GTGGGTGTGGATGGACAGAGTGTGCCTGGAGACCCTTGGGTCAGTTCTGCAAA
CAATATGTTTTTACTGCTGCTTACCAA
TCAAAGGGCACTGGCCACAATAGTGAAACGGGGTGTTGTGTGTGAGTTTGTGC
GTGTGTGCACGCATGCATGAAAGGTAA
GAAAACTGCTTAGCAGAAGCAGCTTCGGCCTCTGATTGCTTTCACAGAATATCT
CCCCAGTGTGTCATAAATAACCTTTA
TTCTCCTTTTATGTAAAAGAAACAACTAACTTGAAATCAAACCTTTAAAAATGCAA
GGTCTGTACCTATCACTGGTGTCG
CTATCCATTTAACTCCACATATAATATAGATTTATGGTTCCATCTAATACAGAAAA
CAATGAGCTATCTTTATAATGGTG
GTGCCCAATCATATGAGGAAAATAAAATTTTGTTTAGCAATTTGGCTGAGGTAAG
GCAGTCTGTAGTTCACGTGCAAACA
ATTTTCACACACCTACAAAACAAAACCATTAAAATTAAGTTGTTATTTAACTGGTT
AAGCCTCAGCTCTGTGGAAAACCC
AGCTTTCCATAATAGCTTAGTTCTAGTGATTCTAGATTATTAAAGGTAATGAACAT
CTTTAGTCATATTTTAATCTCCAG
TATCTAGTATTATGTCTAAAATATACAAATGTTTGTTAGAAGTGGAGAGGTCTATG
TCACAATGAATGTTAGAAATTAAG
GTTCATAAAA CTCTGATTTTCTTCATTTGGGGAAATGAATTTGGAATGTAGCAT
AATAAAAATTCAGTTAATTCACCA
TTTAGAAGAGCAAAGACAAAGTTATGGAATCAACCTAAATGCCCATCAATGGTAA ACTGGATAAAGAAAATGTGGTACAT
ATACAC CATGGAATATTATGCAGCTATACAAAAAGAATGAGATC ATGTC CTTTG C AGGAACATGGATGGAGCTGGAGGCC
ATTATCCTTAGCAAACTAACACAAGAACAGAAAACCAGCTACCATATGTTCTCAC
CTATAAGTGGGAGCTAAATAATGAG
AACATATGGACATGTAGAGGGGAACAACACACACTGGGACCTGTCAGAGAACG
AAGGGTGGGAAGACAGAGAGGATCAGG
AAAAATAACTAATGTGTACTAGGCTTAATACCTGGGTGATGAAATAATCTGTACA
CCAAACCTCCATGACACAAGTTTAC
ATAACAAACCTGCACATGTACCTCTGAACTTAAAAGTTAAAAATATATATAAAACA
TAAATAAGACATTTCAGCTAATTT
ATTGTCATCGTATGTTTAATGCCATCTTTCCTTGTGTTAATACTCGTAGATAGAG
CTGGTTGGAGACATCACAGAATCAT
CACGATGGTGGTATAGAATGCCTTTCTGAGTGTTTTATTGTCCCAGAATATCAAT
AATGGTATAGGACATTCCCATCATC
CCACACCAGGGAAGGTGTCAGGATCTGCCATGATAGATCGAAGGGGCTGCACA
GATGTACATGTACTAAACTCAGAGGGC
CCTGGAAACCCTGCTTTGTTTATTTTCCTTTAGGTTTTATTGCTCTCACTTAGAGA
AGAAAATAACACAAATTCTCACTC
CCTCTTTGTGTATAATGTTGTTTGTTCTGCTAAAGAGATTGGGGTTCAAACTAGT
AAGGAAAGACAATAAATCAGGGCAG
AGGTTTGTTAAAGGGTACGTGGAAGAGGCACAGCTCTGCGACCATCACACCTCA
TGCATCCCTGCCTCATTATGAAGTGG
GAATGATCCAGTAACATCACCCTTGTTTCCTTTGATCTGCCTCTCTGTAGCTTTT
CAAATGTACGCCTGAACCATTTAAG
TTTATTAGTCATTTATAACCTAAAAATTGCTTAACTAAATATGCTGTTATGCTTTAA
AATTTTTTTATTATCTATAGAAA
AGGTATTTTTACTTTAAAAGATAGTGGTTTTAGGAAAATATACATTAAGCTTAAAA
AAGTAATCTGTGATAAAGCATTTG
AGATTTGTACTTTCTGACGTAAAATGAAATTGAAGAATAGCTGTTGCAGCTTGCA
CTGATGTGCAGAATTGCCCCTTAAT
TGCTTTGCCACTCTTCTCATATTTATTTTTTAGTTCTTGGCATTTTATGTTCTAGA
TTTTTTTTTTGCTTAAATTGATTG
TATAAGTAATATTTTGTTTCACTTCTTTTGGCATCTTGTATATTTCAGGTGTTAAA
AACTGGATAAAAATTTCTTGACAT
CTAGGCACAGCAATAATTAGTGACCCATACCAATATTGTTTTCTTCATGAGTTGT
TTTTAAT AAAATATATAGAAATAT
ATCAAGACAGGTAAATGTAAAGGCTTTGCAGTCTTTATAACCTGTGTTAAAAGTT
CTCTTTGGAAATAAAAACTATTACT
GGGACAAAAAGGAAAACTTGGTCATTTAAAATCTATCTTCTAAGGAAGGGGGAA
AATCCTGGTTTTAAATAGCCCATCTG
GCCTTCGAAAGCTTTTCCCCAGGGAGCTGCACTAACTTCTGCCACATAGATCGT
GCTTGATGATGGACATAGCCCATGGC
AAAGAGGCCAGAGATGGAAAGCAAATGGACTTCTGTGAAAATACACGAGGGACA AACAGCATCCTGCTTGCCCAGTAAGT
GCTAGAAAATGCTCTAAAACATGGCAAGTACAGACAGGCCCCAGCAGGTTCCAG CATTCTTATGTTGGCTGGGATTCTGG
TTTTTAATGTGAGTCCTTGTTCTCTGTGAAGCAGACCTTTGACCTACTTGATACT c CTTTT i π 1 1 1 T i TCTTAAAAAAA
AAAAAAAGCTTTTTGAAGTTATAACTGGGCAGTATTATCCAGGTTAAATCATACA CAGAACTTTGATTTAATATTTTAAG
TAATGTTTAAATCGACATAGGCTGTGTTCTTAACTACGAAGAGGTAGGGAAGAT
GTATTAAGAGAAGTTTGGAGGGATGG
GTTTGTTTTAAAACTAAAATTGCTTAGCACTAACAGAACTGCATGTCCCATATAG
GATCTTCAAGCAAGACTGTTTCTTG
TTTGTTATGGTGACAAGGCCACCACTCTTTTTCTGTAGGAAGCCAACATTGATAA
GAAGAAAGCAAAGCAAAATCTTTGT
GATGCACTGTCCCTGGGACACTTTATGTAATTTACTTTTAAATTTACTAGCCTAAA
ACGGAGTTACAATGAAGAAGTCCT
TTAAAAATGGGATCATTTGATTCCAAATGTATTTATTAAATGTCTTAGGTATGATT
ATTTCTGCCCTTAGTGTGTCTATG
ATTTTTGTGTAGCCAAAAATTAATACCACTTAATACCAATTTGCTTTCAACATTAC
C G ATTCAGAAAATCTAAAAGGTAG
ATTTGGCATGACAGATTAGAAAACCATAACCTGGCACCAGAGATAATGGTCACC
TTATCCATCCTCTACTTTTTGAATGA
AGACGAGGACCCTGCCTCAGCCATCTCACATGATGATGTTGGAAATAACCTCAG
AAAATGGATGTCGAAAGATTGAACTA
TAAATGTACTCTAAAGCCATAGGATGTTATTGCATTCTGAATTTGAAGTTTTTCTT
ACACAAGCAAGACTTGTTTTCATT
AATATACCTGTGATATGCCAATGTCTGAGCTATGAGAGTGAGTTACCTTCTTAGG
AGGGCAGAATCTGGACCATGCGTCA
GTGTTCTGTTAGGATCCTGACAGCTGGACTAACAAGGTTAACATCCTAGCGGCA
TGCTCCGATGGTATGTGGATGCGAAG
TTCTTGTACCATGTCTGGGGTCCAGTGGTATTCATCGCTCAAATAGGTGAGATTT
TATGAGGTAAAAATCCATGGATACT
TAATCAACTTAAAATTTTATTTGATTTAAAGTCTGCTGGACTATTTACTGGAGATT
CAAAAATATTTTAACTCCTTTCAG
AAGGTGCTTACAGGTCCAGAATTTAACTTTCTTTACAAAAATTAAATCTATAGTGA
TTATAATCTGATTTCAAAGATAAT
CCTTAAAACATCAAGTAGTTGGAAAGAAGCCAGTCAGCCGAGTGTGGTGGGAC
GGACACGCTTGTAATCCCAGATACTCA
AGAGGCTGAGGCACGAGAGTCGCTTGAACCTGGGAGGCAGAGGTTGCTGTGA
GCCAAGATCATGCCACTGCACTCTAGCC
TGGGTGACAGAGTGAGACATTGTCTCTAAAAAATAAAAAAAACCCAGCAAAGTGT
TCAGATAAAATCTGTTAAATATCAA
AGTTCTCCAGATAAAGTCTGAGAAATAAGCTCAATTATTATTTATCTTTTCTTTCC
CTACTCACTTTTTTAAATTGATGG
AAAAAGTTGAGAGACCTTCTTTTCTCTCTTAAAATCTTTTGTGGGCCTTAAAGTCA
TTCTTCCAAACTGTCCCCCTCCCC
ACCAACAAAAATCTATGGAAAGGTCCTCACCTGTCCCCACACCTAATACTTTATG
ACAATCGTGTTCTCACTAAATCCTT
TAGTCACTTACCACATGGATTGAAATTATCAAGCAACAGTACAGTTCTGGCTCCC ACTCCAGCCTAGAAGAAGCCTTCAA
AGGCATGAGTCAGACACGGTTTATCCATCTCTATGTGAGGCAGACAGGATAAAT ATAGAAAATGAGCTTAAGGAACTTAG
GTCAGAATGAAAACAAGACTAAGGGGGACTTGCTTAGTACTGGTCTAATGGAGA GGTCAATTACCTGCTATCAATTATTA
GGGAGCTCAGGTTTCCAGAAGATAAACTCAGTACAAAGAGTAACAGTCAAGGAA AAGCTTATTGGACAGATGAGACTTGG
GTGAGGCCTTGAAGGATGGAAAATATGGAGTGAGGGAGGGAAGCATTAAGAGG AGCAAGACTGAGCAAGGTGAGTTTCCG
ATATAGCAGTGATCACATCCTAAAGTACCATGGTGAGCTGTGCATGGGGAGCTG TAGGCAACAGCCTGGTTATAATATTC
TCCTGAGCACCCCCATCCCAAACCCCGAGTGGCATCTCCAGGCCCAGATAAAC ATTTGTATCCTTGCATTTGTAAGGGTT
CTCTAGCTTTTCCCCCACCCCCGGCAATTCAGAATCGTAACATTTCACTGGTCC CAGTATACTAATGATGATTTGTTGGA
CATTCGGATTCTTGCATTTTGCCATTAAAGGATGCCATTTCTTTTCCTGAGCTGT TTTTCTTAAATGATCAAGAAACTTC
CATCACCCACGATCAGTCACACTCCTCACCCAACTTCCCAAAAACTTTCTATTCT CTGATCTTCTTTGGATAGTTTATTC
TCACCCTGTGCCAGGAGTGAGTACACTTCTTCCCTTTTTGGGGAAGCATGTGCT TTCCTTCCCAGTCTAACCAGGTGTCT
TCATGTCATTTTGCTGAGCACCTCCTTAGCAGGGCTGTGGCACCCAAGTGAGTA ACACAGTGAGAGCTGTGACACCTGGG
TGGGTGATGCAAGGTGATGTTCTTTGTGTCATCTTCATGAGTGTGTGGGATTTT CAAACTTTCTTCGAGGGATAGGTTCA
TGATGCCCACTTCTTTTGCATCACAACAACCTCTAGAGTTGTACCTATAAAGAAT GCCAGTAGATGCTGTTAATGAAACT
AACTTGAAAAGCTGTGTGTGTGTGCGTATGCGTGTGATTAGAGATATAGTATCTT CCCACTTAGAATATAAAAATAATCA
ACTTTAAAAGACAGCAGTGGAAGGACAAATAATGCTGTATCTTAAGAGATGTTGT GAAGGGAAAAATTTTCCAGTTTCCT
GTTGTGGAAATGAGTGTCTCCCCTCCATGTCAAAAAAAAAATAATAATAACCATG ACTATGATTGTTAACAATTATTTAT
TTATTTATTTATTTATTTGAGATAGAGTTACACTCTTGTCACCCAAGCTGGAGCG CAATAGCATGATCTCGGCTCACTGA
AACTTCAGCCTCCCAAGTTCAAGTGATTCTCCTGCCTCAGCCACCCAAGTAGCT GAGATTACACATCCGGCTAATTTTTG
TATTTTTAGTAGAGACAGAGTTTCACCATATTGGCCAATAGTTATATAACTAGGTA AGCTTTTCCTTTTAACCAATGAGT
AAGAAGGTATCCCCACTCCCACCCCAAACTCTATTGAGAATTATTGAAGTTTCAC ATGAAGCAATCTCTGCCAAACTTAT
TGATTGAGAAATAATGGGTTATAATGGGAATTCTTTCTACTTTGAGGAATTTTCAT GATATAATGTGATGTAAAACCTTG
CATTATAGGATGTAGCGGCTCTATTTTATTTATCTGGAGGTGTCTTTTGATGTAA AGTCATTTGTTTCTAGAAAAGAAAA
TACTTCTATATTATTTTGATTAAAGAATTCCAAACATTACAAATAATTCTGTGACCA AAAGTTGTAAGTGAAGACCCTTG
CAAATCCAACAGAAACTCTCAGTGGATTGATGTTAAGGGAAGGAGGTCTGTTTT ATCCTGGTCATGTCTTATCCTGTCTC
TCTCTAGCATCCAGGTTCAAGCCTGGTGTGTGACAGCCTTCTACACAAATGTGC CTCTCTTGCTAAGCCTGCAAGTGAGT
CTAGCAGCCCATACACTGAAGCTTTCTGGGGAGCATGAGCATTGTTGTTTTGCC TTTGCTTTGTGGCGTTCTAAACTTCA
GTGAAATCCATCTCGTAGTGTTGGCACTGAAGTCTCCATTGAATTAGCTATCAAA GTGGAGCCCCTTACTCCAGATGATC
TTTAGCTCAGCTCAAGAACCAGAAATACCCCACACAGATAGGCTCTGAGAAGCT
TCAGCTCTGAATGCTTGAAGAGAATG
ATTACTTTCTTAGTGTATAAGAAGAAAGGTTATACTAGTTACATATTTTTGAAGCA
ATTGCTATAACACCTCCTGAATTG
GTACTGAAAATCAGACACTTATAGGTATTTGGTAAACATGTGGATTTTCAGATCA
CACCCTAGTCCTTCTGCCTCAGTGA
GACTGATTTCTGTGTCTGAAAGGGAAGCCCAGGTATTTGCCTAGGAAGAATAAG
CACTTCCTAATGACTCTGAGGCAGTT
TGTCAGGCATATTTTGGAAATGGTCATAAAAAAGGGAAATAACTCACAGACAAAA
ACCTTGCCATTCTGAGATGCAATTT
ACTATAGAGATTAAGAGCGTCAGTTCTGGAGCCAGGTGCCTGGGACTGAATCCC
AGCTCTGCCACAGTTGCTTATTACCT
CTGGCAGTTTCTTCGTCTATAAACGGGATGTAACAGCATCCACCTCCTGGGGTG
GTTGTGAGGAATCGGTGAGTTCTAAT
AGGCAAAATGCTTTATTTGGGCTGGCTCAATGTAAGCACTCAAAAATGTTTAAGT
ATTATTATTTGTTCTTGTCACTTCT
ATTTTGCTTCTAACACTTTTTAATATAATTATAAAATAATCCCTGAGATTATTTACA
AGATATAAATTATTTTAATATAA
TTGTTTTATAAAATAATGCCTGGGAAAACTTTAGTGTATATCAGAACTTC C C ATTA
TACAATATAGAATCGAGCAAGAGG
AGGCAAGACAGCATTGTATAGACCTTAAAACCAGACAGATCTGTGTTTGTGTTTC
ACCTTCATTCCTTGGAGCTGTGTTA
CTGTAACCTGACCCTCCATTTCCTTATCTGCAAAATGGGAATGATCATACACCCT
GTAGTATCAAGTTGTTGTGGAGTAT
AGGTGAAGCATGTACGGGCGAGCCTACTTCCTAACATAGAGAGAGTACTCAATG
GGTATTTTGTTTTATTGTTATTATAT
TTTGCACTCTGGGGATATTTGTATGTTCCAGTCTTGGAAGTAGCATTTAGTCTGG
AAAGTCACTTGGGTGCCATTTGGAG
TACTGAATTGTGATTTACTTGGTCAGGCCCTAAGTCCTATGTGAAGGCAGGTGG
AGCGGCCCTGCGAACTTGATTCTGGT
CGTGAAAAATTAATATTGCTTGAGCCTGCTTCTCTAGTATGTACACCTCTAGTAT
TGTTATTATATTTTGCACTCTGGGG
ATATTTGTATGGGTTAGGTGTAAATATCCAATTTTTTATAGGAATGTCGAAGTTTT
TTTAGTTGTTTTGTAGGGGAAGGA
CTAGTTTACATATAATTAGTATTTGTTAGTTTACAGGTCAATTAGTACTTGGCTTA
AAATACTGATTAGTAAGTCATGGA
TGTGTGTGTGTGTGTGTGTGTGTGTATATCAAATGTACGTACATCACAGAGATTG
ATAAACACACACATATATCATGGAG
ATAGATGGACAGATGATAGGTGGATTGATTCAGCAACCCAGGTTTAGAGGGAAA
GAGGAGAGAATGGGGATTCAGGGAGG
CTTTTCTGAGAAAGCGGATTTCAAATTCTGCTGAATTGTGAAGTGCAATCAAAGT
TAACCTGGATTTGACAGGAGGAAGT
AGGAGCTGTAAGGAGTAACATTAGAAAGCCTGCTGCATGTTGGTTTACTAGGGT TAGGCTGGAATGTATGCCCGGGTGCA
TTGGAGTCACAGTTGCTGGGAAAAGACCAAAGAAGTTCAAAAGATAAACTGGGT TTAGACCAGCTAAGGCCTAATACATA
AACCCTAGTAAGCAGTTTGAATATTATCCTGAGGGCAACAGGGAGCCATTAAAG GTTACTTTTTTTAGCAGGGGAGTGAC
AAGATCAGTCATGCACATTAGAGAGTCTACTGACTTTGCAGAGAATGGGCTGGA
CAGTACCCAAAGATGGGTAAAGAAAC
CAGTAGAGGAGGCTGCTGCAGAAGTCCCCTGCTGTGGCCTGATCCGAGTTTAG
CAGCTCTGTTTGCCTAAGTCCTATGTG
AAGGCAAGTGGAGTGGCCCTGCGAACTTGATTCTGGTGGTGAAAAATTAATATT
GCCTGAGCCTGATTCTCTAGTATGTA
CACCTCTCCCTAAAGTAAGATCCTTTTGGAAGTTGCATTAAGTCTCCAAAGCCAC
TGGGGTGCCATTAAACTCCTGTTTC
GGCATACTAAATGGAATGGAATCTCAGCCTGCAGTCAGCCAGCAAATGAGTATA
ACTTATGTTGAATTCCAGCTAATGTA
CAGTCTTGGAAGTAGCATTTAGTCTGGAAAGTCACTTGGGTGCCATTTGGAGTA
CTGAACTGTGATTTACTTGGTCAGAC
CCTCGTGGCCTGTGTGTGCCAATATAGCTTCCATGGTCTTTCCTACTTCATTTCA
GGGGATAACTTATAGAATGAAAACA
TCCATTTTAAAAATATATTTTCTATATCTTATTCATTTTCTTGGTGGATAGGGATTA
TTGAGGTCAGTATTTATTGCTTA
AAACTCTGTCATTTAAATGAAAATAGCAAGTCGTAACCTAAAAATGTATAGTTTAA
ATGAAAGTAACGGATCATAACCCA
AGGTAATTTTTCCTCTAAAATGCATATAATAGCTGAAGTTTCTTTTAAAAAAGCAT
TGATATCCACATCTTTATCCAAAA
GTAATTTAAAATTTCTTTTAC GGCATTTTTAAATGTTCATCATGTATAAGTTTGC
TTTATATATAATATCAATAATAA
ATTGTTCACTGATATGCACGTAAAGAATGAGTCATATGCCTGTAAGGGACAATGT
TTAAAAAAAAAACAAAAACTGGTAG
TCTGATAAGCCCTGTGAGGTCTGAGGCTATGTCTTAGTCATCCTTTTAGCCCCA
TGTAATTGTCACCATGAGCCTGGCAT
TGAATATATGTTTCCTTTGCATGTAATTACCACATGCAAATGAACATTAATATGTG
AAGCCTGTCTATTGATGGCTAATT
GTGGTGGCTACTAGCACTGTGAGAGCTCTCTGCCTATCTGAGATAGGTGTTCTG
GGAACTTCAAACAATTGATATTCCTG
AAATATACAAATGAAGAAACCTCAAATAGAGTCTCTGTGATAACTAAATTTAACAC
TATTTTAGGATATAGATTCCCTTC
TTAATAAAAATAATTCTTGGCTCAAGCTTTTAAAGTTCATAGATGCTTGATAAGTA
AATAGGAATAGAATTTTAGATATG
TAAAGAGATTAAGTCATATTTATCTAATTCGGGAAAGTAAAAGAGGGAAAAGATA
CGTAAATAGAGAATTTTACAAGTGC
TTGCTGCCATAAACCATAATGTGCCTAATGTTTATGTGATATTGTAGAGTGTCCA
TTATGCTTTCCAAAACATTACCTCA
TCTGATCTGTTAGGAACACTGCAAAGTAGAGTTTTTGCAGATAAGGAAATTATGT TTTAGAGTGGCTGGGAGACTTGTCC
AAAGTCCAAATAGCTCATAAATGATGAACTGGAAACTCTGTTTTCTGACTCTTAA GCCAATACTTAAATGTTCAGTAGCA
AGTAAATAATTGGCATATATTACCACATTTTATACCTGGAGCCATCTGAATAGTG AGTTCACATTGAAGAGATTCCTCCT
TGAACTGCCAACACATTTATGGCCTATACTGCTCATTTGCCGTAACACTGTAACT TTTTAAAAAATGGCTTTTATTTCTA
ATTATAGAAGCAACATAGGTTCCAAGTAGAAAACTTAAAAATGTAAATTGGCTTAA TGAATGCTATGTGACCATATATCA
AATCCTTCAGAGACAGCATTTTTTAGAATTTTGTGCTTTTTGAAAATTTGTCATGC
TTTAAAAATATTTTAAATGAGATC
ATCAAA TACTTTTTGTTATTTTAGTTTTTGCTTTCACAAAATAATATGAAAATTT
TTCTCATGCCATTAAACATCATT
GTAATACCGTTTAAACAGCTTCATAGTATTCCATTTTCTGAATAGTACCTGTTATG
GAGTTGGGGCTGCTTCCCTAGGTT
TTCTTATTAGAAGCAACACTGAAGAATATCTTTGCTATCTCTTTGTATAAATCTAT
AATTTTTTTTCTTAGAGTAAATGC
CTGAAATAAAATTGCTGGGTTAAATGGCATACGAAATTTCTCAATCATAAGCCAA
ATAGCCAAACAGAAAGATTGTGCCT
ATGTAAATTCCCATTTTGCAGTGTATGAATGCCGTTTTTTATATATTTCCCAATCC
CTGATCTTACCTTCTTAAAAAAAT
TTTTGCCAATATGATTATCAATGGCAGCATATTATTGTTTTGCATAGCTTAATAAC
AAAATAGTAAATCATGTGAAAGAG
GGAACCCAGCTTTGCTCAGGTGTTGTATCTTACACAGCGCCTACTGTTCATTGT
TTAACAATGTGGACATTCAACAGACA
GAACATTATCTTGACAGCATATGTAACAGATAAAACATTAATCATATTCTTCTTTC
ATCTTTGTTGTGTCAAATTTATCA
GGTGCTGCCATTCATTCTTCTGTGAAATGAAATGGCTTTTTCCATACGCCATTTT
AATGACCATATAATACAAGTCTGAA
AGCATTTTAAGTTCTTGGAAGTAGTATCATAGTTTTTGAGCTAGCAAGAACCTTA
AAGATCATCTAATCTGACCCCATCA
TTTTACATTATGTGAATTTAATATAAAGACATGAAAAATCTTGGTGTGTTTTGCAA
ATTGAAAAGTTCATTGTCTAATAT
TTTTACACAAATGTATTATTTTGTCATTAGTCTATATAAGTTACTAACATCCTTGG
AAATTTAAACCTCTTTGTTTTTTA
ATTTTGTAGTAAAAATAAAATCATCAACTATGTTAAGAAAACTACATTTTAGACAT
GACAACAGATTCTAGTGTCGAAGA
TAAACCAAATACCAATTCTGCCTTTTCTAAATGCTATTTTCCTGTACCCACATTAT
TCCACAAGCCTGGTATAATCGTAT
TTCTTTTCATTTTGTTTTAGGTAGCAGTTTTACAACATCATCAGGAATATATACAG
GAAATAATTCACTCACAAATTCCT
CTGGATTTAATAGTTCACAGCAGGTAATTTAAAAAGCCAGTTTAACTTCTGAAAA
ATATCT TAAATGCAACATGATTTT
CTTATGTAAAGAAAGATCTGTTTATAAGTGAGAGCTTTCCTAAGGCTTTTAGTAG
TTTTCTTTTAAAATTCACTTTAGCA
GTCTTTTGTGGTGAGTTGTTTGGTTTTCAGATTGGTTTAATTTTACCTAGTATTGC
ATCATAGAAAAGCAAAAATATGAA
GAATGTGTATAATCTTACATAAAATTTGTTTGCAAGTCTGAGCTCTTTAGTTAATG TGTACTTGTATGCCCCCGTGTGAA
TTCAGAAAAGGCTCAGAAACCCAAACATACTTACAGTTACACTTTGATATGTAAG AGATAAGCAGCAGTATTTTGACAGA
CAATTCTTGTGTGGTTTGATTTGGTAGGACTATCCGTCTTATCCCAGTTTTGGCC AGGGTCAGTACGCACAGTATTATAA
CAGCTCACCGTATCCAGCACATTATATGACCAGCAGCAACACCAGCCCAACGAC ACCATCCACCAATGCCACTTACCAGC
TTCAAGAACCGCCATCTGGCATCACCAGCCAAGCAGTTACAGATCCCACAGCAG GTAATTATGTGAATATTATTAGTCCT
GCAGGTGATATTTTCAATGGTTTCAATGCATTTTTCTACAGTCATATATTCATGCA
GTGGTTGCACAATTTTGGCAGCTT
TTGGCAAGGACTAAATATAAGTCAAGCAGCATGACTGTGGCATTCATTTTTATTT
ATTTTTTTATTTGCTTGAAATTTTA
AAATAAAGTATATAAACATCAAAAGCACATGAAATAGAGATTTTATTTGTAAATAT
ACTCAATTTTTAAGATCTGTGTAG
TTTTTGAGATGCATAATTATAAATACTTAGCTATAACTATAATTCATTGGCACCAC
ATATAGAGCATAAAATGGCAATGC
CTAGAATTTGGCATTTATCATTATTCAATATTTAGTGTTTTTCATCAAAGTAATATC
TTGCTGATAGTTTGAAAAACAAT
CAAAAGTGAGCCTCTGCTTTCCCACATGTGGCCTGCTGTGAATTTCATGACAAT
GTGTCATGAGTTTGGGCATTTGTTAT
GCTTCTGCACAACTCCCTTGCTGGTCTCTTTCAAATGCCAGCCTCTCCATGTGG
AAATCTTCCTACATGATGTCTTTGAT
ATCTCTTCCCCTCATTGATTCTTTCTTTCTAGAACTCTTAATGGTGTCTGCCTGAT
TTCCTGAATTGGTCTTTACTTAAC
GTTTCCTATACTTTCACATACTTCTCACCTCCTTGCCTTAACTTTTTAAAGAGTTA
ATCATACCTTTCATGGATTACAAA
ATAAGCTATTATCCTCAAAGAGTCAAAGTTTTCTTCCTCTTGCCTGTTTCCAGAG
CTAAAGCCCACTCCATTGCTACAAG
TCAGGTTAAAATGGTCTTAACATTGAAAAAATTACCTCTCTGATATTATAATCATA
GTATGCATTCTAGGTTATCCTAAA
TCATGAAAAAAAGTTCTGTAT TAAATGCTAATTTAATTTTTGACTATAGAATTAT
ATTTTAAATATTAATGGGTGATT
TCTCCCCCAAAAGTCTTCAAAATTTTTCATGTTCCACGTTTTTTTCCTTTTAAAAA
TAAGTATGCAGACATTGCTTTATT
CTAAAGTTTCATCATTCTTGTCTTTCAATTAACTTCTAATGCTCATTATTTCTGCT
GCACATTTTAACTTTAAATATTTT
TGTCTTTGATAATTGTATTCCATGTATCTTTATTTTCCAAAAACTAAGCTGATTCC
TCATTCAGTTTACATGTGTACATA
CAGTTTTCATAGAACTGAGTATACATATCTCTCTCTTCCTTTAATTAGATTTCACA
TCTATATAATGACACTGAAGACAA
ATTTTATTATAG TGCATTGAATTTATTATAGGAGTTTTTATAGAACTTTCTCTAC
TTACATAAACTTCATATAGGCAT
CATGAGGAAAGGTAATGGCTTGTCCCAACTTTAGGGAGATCTTGCCTTTTCATTT
GAACCACCTGTGTCATGCAAGCTTC
CTGAAAACTAAAAGGAGTTTTGAAATGAAGATCATTATGTGAATGAGATGTAGCA
AACCTCTTTTAGGATCTAGAGAGTC
ATAGAAAGAGAAGAGGAGACAGCGAATAATATTATTTAATTTCTGATATTTTGGC TGTGAATACTACATTGTAAATAAAC
TGTTGAAGACTTAATTGTGTTTCTTAAATTAAGGTTGATTAGAGCTTTTTTGAAAA GGCCATTTGGTCAGATATAAGTGT
TTGTCTCTGAAACCAGACATTGTTCTAAGTGAGTATTTTTCCTAAAAGGGATGCT ATGTGAGAGAGTGGGGATGGTGGAC
TATAAAGCCAGGCCACATGGAAAACAATGAAAAATTCTAATTTCAAAATCGGTCC ATGGTTTTAAGAGTGAATGATGTAT
TTTAGCTTGTATTGTAAAAGTAAAAGTGCTTAGTGTTTTAGGTACTTATTTTAATT TCATTATCTTATGGGAAAAGAATA
GGAATGAGTTGAACCGAGAAAAAAATAATACTTTTCAGATTTCTTATACATACCTC
TTCCAACTAAAGCGAGTATGTCTG
GATCAGCCTTTCTTGTGTTGCATAATTAGAAGTGGAAAGTTCACTGAGTTTCCAC
ATTGAAAAAGAAATGGTTTTCTTTA
AAAATTCAAGGAAGATCTAGTCTGGGTATTTTGTTTTCATTCTTTAATTTGTGAAA
AGACCGCAGATTCAGGCTAATTTT
TTGGCATTAACTTGTCTTTCTTAGTAGTCGTATATAAAATAATAGACTTACTATTC
TTGATGAAAAGGAAGTTTTGATGG
TAAAGTACATAAAACAGAAGAATTTATTAAACTAAATAATTAGAAATGACTTATTAA
AAACAATTTCTGTTTTGAGGTAG
AATTTTTTACATTGTTAAAATAATGAATGTCTCTTTTTGTTGGAATAATTTCGGGT
CTTTTTTTTTTCTTTTACTTTTAG
TCTATTGGGTTTAAAGTTAGTATTTTATAATACTGGAGAGTTAGACTCTAGTCATT
TTCTTATCTTATGTATGACCTGGT
TTCTGCCTTTCAATTTCATTATATTATGCAGTAGGGACAAATCTAGGTATATGTGA
TATTTATCTCATGCTTCATGATTT
CTGTTCCTGCATTGTTCTTAGAATTGTTGTGTCACAATGGACATATAAAGATAATT
TTGTAAGGGTTTTTCAGGGCAAAC
ATGTAAGGACTCATTTGTGGGTTTTTTGTGCCTTTTTAAAGTTACTTTTTAAGTCT
GTGATGGTGTGCAGTATTACTTGT
AAACGTAATTCCTTATGAATTTTGAGTTCAAAAATTATTTTAAAGTGTTCATAATG
AAATATTCAAAGCTTTATTACCCT
TTGGATGGATATTTTAATACGTTAAAAATAATATAACTTGAATCTGAAAGCTTCCA
AAATTTTACTCTATTTTAAAAACT
AAAAACAGTGCCTTTGCATTCATTCTTTTAAAAATGAATTTGTTAGCTCTTTCTTC
ACTATCTTAATTAAAAAAATTCAC
ATAATCTGCAGATATGTTAGAATCTACCTTGAGAAATATAGATATTTTCTAGTCAA
TGAGATTTTAATTTATTTATTGCC
TACTCATGCCTACTCTTCAACTAGCCATGCCTTAAACTCATGCCTACTCTTAAAC
TCATGCCTACTTTTCAACTAGCCAT
GTCTTTAAACTCATGTCTTAAACTCATGCCTACTCTTCAACTAGCCATGCCAATC
ACCCCTCACCGCCAATTGAAGGACA
TCTTGGTTGCTCTGAGTTTCTGGCAGTTATGAATGAAGTTTCTATAAACATTTGT
GTGCAΘI I I I I I I AATAGATATAAA
TTTTCAACTCCTGTGAGTAAATACCTAGGAGTGTGATTGCTGGATCATATGGCAG
CACTACGTTTAGCTTTAGAAGAAAC
TTCCAAATCATCTTCCAAAGTGGCTATACCATTTTGTATTCTCATCAGCAATGAAT
AAGAGTTCCTGTTGCTCCATATTC
TTGCCAGAATTTGGTGTTGTCAGTGTTTTAGATTTTGGACATTCTAATTGAGAAG
TAGTGCTATCTTGTTGTTTTGTTTC
CTAATGACATATGATGTGAAGCACCTTTTCATATGCTTATTTGCCATCTGTATGT
CTTCTTTGGTGAGGTGTCTGTAGAG
ATATTTTGCCCACTTCTTAAACAGGTTGTTTGTTTTCTTATTGTTGGGTTTTAGGA GTTCTTTGTTTATTTTGATACACA
TCTTTTATCAGATAGGTGTTTTGCAATATTTTCTCCCAGTCTATGGATTTTCTTTT CTCTTAGCAGTGTCTTTTGCAGAG
AAGTTTTTAATTTTAATACTCCAGTTTATCAGTTTTTCCCTCATGGATCATGCCTT TGCTCCTGTATCTAAAAACCTATT
GCCAAAACCAAGGCCATCTAGATTTTCTCCTGTGTTACTATCTAGAAGTTTCATA
ATTTTGCATTTTACATCTAGTTCTC
TGACACATTTTGAATTAATTTTTGTGAAAGATGTAAGGTCAGGGTCTAGATTCAT
TTTTTTGCATATGGATGTGCAGTGG
TTCCAGATTAATTTTTGAAAAACTATCTTTTCTCCATTATGTTGCCTTTGCTGCTT
TGTCAAAGACCACTTGACTGTATT
TAAGTGGGGCTATTTCAGGGCTCTCTCTTCTGTTCCATTGATCCATTTGTCTATT
TTTTCACCAATACCACTCTGTCTTG
ATTACTACAGCTTTATAGTAAGTCCTAAAGTCAGACAGTGTCTGTCTTCCAACTT
TTTTCTTCTCCTTCAGCATTGTGTT
GGCTATTCTGAGTCTTTTATCTCTCCATATAATCTTTAGAATCAATTTGTCAGTAT
CCACAAAGTAAATTTCTGGGATTT
TGGCTGGAATCACGTTGAATTTATAGATCAAATTGGGACAAACTGATGTCTTGAC
AATATTGAATCTATCCATATGCATG
GAATATCTCTCCATTTATTTATATCTTCTTTGATTTCTTTTATAGTTTTTCTCACAT
AAAGCTTATATATAATTTTTAGA
GTTACACCTCTTTCTCTCTCTCTCTTTCCCTCTCTAGTATTGATGGAAATAATGGT
GTGTTTTTTTTGTTTGTTTGTTTG
TTTTTTGAGACGGAGTCTCCCTGTCGCCCAGGCTGGAGTGCGTGGCGCTATCT
CGGCTCACTGCAGGCTCCGCCCCCCGG
GGTTCACGCCATTCTCCTGCCTCAGCCTCCCGAGTAGCTGGGACTACAGGCGC
CCGCTACCTCGCCCGGCTAATTTTTTG
TATTTTTAGTAGAGACGGGGTTTCACTGTGTTAGCCAGGATGGTCTCGATCTCC
TGACCTCGTGATCCACCCGCCTCGGC
CTCCCAAAGTGCTGGGATTACAGGCGTGAGCCACCGCGCCCGGCCGGTGTGTT
TTTAATTTCAAATTCCAACTGCTCATT
GCTGGTATATAGGAAGGCATTTGACTTTTATCTATTATGTTATATCCTGCAACCTT
CTTCATTGGTTTCAAGAGTTGTTT
GGTTTTGGGGGTGGGTATTTGTTGATTTTTTTTTTTAATATTTTCTACATAGACAA
TCATGTTTTTTGTGAAAAAAGACA
GTTATATTTCTTCCTTCCCAATTTGTATACCTTTTATTTTCTTATTGCATTAGTAAG
GACTTGTACAATGTCGAACAGGA
GTGGTAGAGGGGACATCTTTGCTTTGTTCCTTATCTTAGAAAGGAGACATCTAG
TTTCTQGCCATGAAGTATAATGTTAG
CTGTAAGTTGTTTTTTGATGTTCTTTGTCAGCTTTAGGAAGCTCTCTCCCCTTTC
CTAGATTACTGAGTTTTTTTATTAT
GAATGGGTATTGGACTTTGTCAAATGCTTTTTATGCATCTGATGATTTGATCATAT
GGTTTTTCTTCATTAGCTCATTGA
TGTGATGGATTACATTAATTTTCAAATGTTGAACCAGCCTTATATATCTTGGATTA
ATCCTACTTGGTCATGACATATAA
TTCTTTTTATACATTGTTGGATTCAGTTTGCTAATATTTTGTTGAGGATCTTTGCA TTTATGTTCATGAGAAATATTATT
CTAGTTTCTAGTTTTCATATCTTGTAATGCCTTTATCTCATTTCAGTATTAGGGTA ATGCTGGCCTCATTGATGGAATTA
CAAAATGTTCCCTCTGCTTCTGTTTTCTGGAACAGATTGTAGAGAATTGTAATCA TTTTTTTTTTTAAGTATTTGATGAA
ATTTACCAGGGAACCCATCCTGGTTTAGTGCTTTCTGTTTTGAAAGATTAATTGT TGGCTCAATTTTTTTAATAGGTACA
GACCTATTCATATTATCTATTTTTCCTTCTGTGAGTTTGGGTAGACTGTCTTTCAA
GGAATTGGTCCATTTCATCTAACT
TTGTCTAATTTGTGGATATAGAATTGTTCATAACATTTCTTTATTATCCTTTTAATG
TTCATGGGATTGCTAATGATGGC
TGTCTTTCATTTCTGATATTAGTAATTTGTGTATTCTCTGTTTATCTTGGTTAGCC
TGCGTAGAGGTTTATCAATTTTAT
TGATATTCTCAAACAGCCAATTTCATTGATTTTCTCTGTTGTTTTCCTGTTTTTAAT
TTTATTAACTTCTCTGATTTTCA
TTATTTCATAATTTTTTTCTGATTGCTTTTGAGTTAACTTGCTTTTCTTTTTATGGT
TTCCTAAGAGTGGTGCTTAGACG
ATTGATTTTTAGATCTTTCTTTTCTAATATATGTGTTCAGTGCTGTTAATTTTCTCC
TGAGCGCTGTTTTTGTGGCATTT
CACAGAATTCGATGGGTTCTGCTTTTTCATTCTCATTTACTTCACAATATTTTAAT
ACTTATTTTTGTTTTTATTACCAC
TGGAATCCTCCCTTTTAATGATGTATATATTTATAAACATTTTATTGGAGAGAAAA
ATCTTAGAAATTTCATAGTTTACA
GTAAATATAATTAGTTACAGTTACCCAGATATTAACATCTTACAGATTAAATGGCT
TTCTCTTATTAGAAGATATGTGTA
ATTTATGAGTAGAAGGGTATTCCATGGTAAATTCAAATTTCACTCTCTATAAAATA
AAATTTTGATCAGAACCAACTAGT
TTTCCTTGTTCAATCAAAGACTGTAAGACAGTGTCACAATGAAATCCATGATTAC
TTTAAATAATAAATTACAGATTTCA
CTCATTTTTATTTTTAAAAACCCATCTTTTACAAAGCCAATGATTTTTTTTCTATGA
TCAAAATAGAGGAAGGGAAAAAT
AAATTTGAAGGTGTTTTCCTTTTGTGCTAATATCTAAACTAGCATTTTATC ATC C C
CTAGCCACTGAGGTATAAAGTAAG
TTATCATCTTGTACTGATCGTAAAAGCTTAAGTCATATTACTACTAATCTAACTAA
ATATATTCTTTCAAAAGAGTGATA
AAAATACTTTTTGAGTATTATTACGAAGAAAATTGTAATTGACTTTTCTCAAAAGA
TCACTTAAACCATTTTTATCATAT
GATATTTTTCATTATATGATTCATGTGTATAATCAAAATTTAAACAATCATCTATAA
TC C AATTACTGTGCTGTATTTAA
ATAATTTGGCCTGACAACATATTTTTTATCCCAAATCTCTTAGAATATCTCATAAT
TTTGATAGCTATCATCACAATACA
TAATAAAATACAGTCGAACACAATTGTTGTGCTTAGAGCTCTACATTCATTACCA
CATGTAATCTTACAATATACCATGA
GATAGGTACTATTTGTTCCATTTTACAGATAAGGAAAATGAGGCACATCACTCTT
AAATGCTCAATAAGTGTTAGGTATC
ATTAAGTTAATCAAGAATGTTGAGGCTAAATTCTTAAAAAAGCACTATTTAGCTAT
AATATTGTATTTTTAAAATAATAC
AAATATAAAGCAGTGAGGGACAAAGATAAAAGTGAATGGTGAACGAAACATTTTA GATTAATGTGAAATAAAAGATAATT
AGTTTTTAGGAATAAGATTTAGTATGAATGCCGTAGAAAAGTTTGTTTTTCAAAG GATTTTTTCATTTGAATTTCTTACA
TCAACTTGATGTTGAAATCTACCTTCACGTGTGCTGATCACATGCAGACAATTTA CTGGTTAAGGTGTTGGACGGGAATC
TTGCCGATATTGAATTTCAAGTGACAGAGGAAAAATCAGTATAAGAGGGAATTCA ATATGCATGGGAAACCATTTGTTCC
CTCTGCTTCTTTGATCCCTGCCAAGGCACAAGAAAGTAGAGCAAAGCGCCTGAA
AAAGGGGAATGTTCTTGTTCAAAGTT
GCAGAACTCTGAGCCCTCAGAGGAACTCAGTGTGACAGGTTAAAAAACTAAACT
CCCGGCTATCTTCCCTGTCATGCAGG
ATATTTGAGTCTCGTCTGTACTTGCTAACATGCTTATGTCTGTTCATTGTGGAAA
TGATCCTGAGGTTTGCAATAGCAAG
CGTCTTGATGCTTCCATTTTACTAAGTGAACGTTTTGGAAAATCTGCAGGCCATA
CTTGCTCATTTTCACAGATGCCTTG
TGGACTTTGGAAAAGTTTGCAATTTGTTCAGCTCAAAAATAAACTATTTCAGCAAA
ACAATTTTAACTAATGTTTTGGAA
ACATTTGTCGGCAAAACAAATATTTGTATCTGAAATAGCTAGGCTATTAAGATGT
GAGAAAACAAAACAGATGTTTGTTT
TGGGTTGTTGGGTTTGTTTGTCTTGATGGCATTTGTTTTCTTTCTTTCTTTCTTTC
TTTTTTCCCAGAGGATATATAGTG
GTATAATAAGATTATAAAATTCATGGAAATAAAGTTCACTCTTTGTTTTGTACTGT
GAGAAACGTTATGAAAATGAGTTT
TGTTACCTGTGAATTTTGGACTGTATTTCCAAGAAAAGAAAAATTCAGTCTTCTAA
GCACTTAGCCCTTTTGTGCCTTGT
TACTGGTTTCCCTAATTTTTTCAATTACTTAGCCATATGTCAGGATGCCTTATAGA
GTCAGTTTTAGTACACTGGATTTA
AATATGTAATTCTTTCTTATATTTATACATAACAAGATTTTTATATTAATCAAAATAA
TGCATATATTATTGAGCTAATT
AAATTAGTTTATTAACCCTCCCAAACATTTATTATCTTCTTTTGTAACTAATTGATT
CATTATTTGAAAAACTGTTACAT
ATTAAGTGTCATATGGGACCAACACATATCAGTGTAATAATTTCTCGTATTTAAGT
TGGAGGACATCGGCATTTACTGGT
TCAGTTTTTTAATTGGAGATGGTTGTAAAGAGGACAAAGTTATGAGTGTTTGCCA
TCTCTTCACCTAATTTAATTGCGAT
TTCATCGTTTGGTAAGTGGCTTTTGGATTAAGTTTTAAATAAAGGTAGAACTGAT
GGATTAATGGGGCTACCAATGTTAT
TTCTATGTTGATAGTAAGTATTGTTTGATAAATTTTTTTACTGAATAATAACACATG
AACATGGGACAAAATTATAAAGG
CATAGATGTGTTTACAGACAAGAATAAACCTCCCTCCATGTTCTATACTCGGCAC
CTTGGTTTCATATTCTTTGCTTATA
TAAACATATACTTAACAACTGTCCCCCGCCTCTCTTTTTTTCCACATTCAAGTATG
TTTAGCTTTTTGGTCCTTCACAAA
TAATAGAGTACTATACATGCTGTTCTGCACCTTACTTTTTCACTAGACAATATACA
GAGAAAATTTCAGTAAGTATTTTT
GTTGTGGGTTATATATACAGATTTAACAGTGGCTAATATTTTTTGAACATTTACTA
GTGCCAGGTATTGTTCTAAGCACT
GTTCATTCATTAACTTAGATAATTCCCACCCAGCTCCATGAGATAGAGTTGTATT
GTCATTGTCCTCATTTTACAGATGG
GAAAACTAAGGAAAGGTCAGTGACAACACTGAAATTAAACCTCTGGAAGCTTGA
CTTCAAACTTGAGTTTTTAATCAGTA
AAGCATGCTGCTGCTTGAAACCACTGTGTCTGAAGTTTCAAAGGAGCAGAAAAA TGAAAATGTAAACAGAGAGTCCTAAA
GGTAGTAGCTGCGTAGTAAACATCTTCTTTAATCAGCACTCAGCAAAAGCATATA TTATGAGGCATTCAGTTAACTATGT
GTTAAATGTTTGAGCTCTACTTGCTGATAAAATAATGTAGGTCTGAAATTACAGG
GAAGAGGTATGTAGTATATTTTGGG
TTATTATGTCATAAGTATTCACTGAATTATAAAAGCAGTTTGTGGATATCTTGGAT
AATCCAACAGCTGATTGCTCCCAT
TCTCCCTCATTTTCCACTTCCTGAATCAGTTCACTCCGTGCTTCCCCTGTATCCT
ACTTAAATAGTCACCTGTCATGGAC
CTGCTTAGGGAACAACTTGCCACATCTCTACCATTTCCCTCCCCTTCCTTCCAG
GGACCTCCTTGCATCTTGTCCTCCTT
TTTTTCTTTATCTTTTAAAAAAAATTTTAAATAACTCAAATGGCAGCTCATGGATG
ATGAACACTAGGTGTAGGAATGGT
GACGATTCTATCAAGGAAAAGGGAAGCTATTCGTTTTAAAGTGAAGGTGGGGAT
CTGAATCCAATACCATCACCTTCTAA
CACTTTTAAAACTTATCTGACTCCTTCAGAAACTTCAGGGACCTCTTGATTAATTT
AAATTTCATTTCTTCATTATCATT
ATTTAAGATAAGTTTTAACTCTTTCAGTTTTATGTGACTGCTGATATTTAAGGGTA
TAATTCTCTGCTATTTTGCATTAT
ATTTGTGCAGTATTTCACCTGGATTATTGATTGATTAGGTAAGCATGAATTGGTT
TGGAAATCATGTCTGACATCGTTTC
TGTGAGAAAATAGCTTCTGAATTTCAAACAATTGATTTAAACATATTTAGAACGTA
GCATATTTGTAAATAGACTACTCT
CTTATATCTACAAATCCACTTGCGTAACCTTAGCAGGAAGCAGCTGGGCAATAG
GGAAGCTTTTAATTTTCGTTTAGCAT
CGACTGTATAGATTCTTTTGCAGAAGACCTTCACCTTACTGAATATAAATAAGGA
ATCCAGAAAGATTAAATTTTTCCTA
AGCAGCACAGCCTGCCATTGGTGACATCTCTGAATGGGACACTAGAGTGTCAG
CCTTGGGAGACAGAAGGGTGTAGTGTG
GCCTTTCCCATACTTCCCTGGGTAGCACAACTGTCTCCCAAAGATGGAGATGAA
TATCCTTTTATCCTGTAATGAGTTCC
CAGGTGGGAGCAGTTAGGGATACAAGTTATCATATTTAAAGTTATTGGAGTTTCT
CTACTTTAAGTCTTTTCTGATTTTT
GATATGACAGTACTCATTGAGAATTGAGTGTAATGTCCAATGTTTCCCAGTCATA
CCTGGCTGGGGTTGGGTACCACTTT
CCACCTGTGCTTCCCCCAACACCCCATCTGAGAATGTCTATGAATGTCTCTTGA
CCAGCACCCTCCCTTTGAATACAATT
AGGAAAAACTGGAGTGGTGGCTAGGGATTTGGAGTTGGGCTAACCTGGGTTTG
AATTCTGGCTCTGCCACTCACTTGGGA
AGGCCACTTAGCTCCTTGGTGTATTATTTCTCTGTGGAATGAGCATAATACCACT
CTTGTGTATAGGATGACTATGAGAA
TGAAATGATGTAAGTGGCTAGCAAAGTGACTGATACATATTAGGCCCTCAAATGT
CCATTTCTTCCATTCGGGCATTCAA
GAGATATTTATCAATGCTTGCTGCATGCCAGGCACTGTGCTTGGTGCTGGGAAT
ATGGGGGTGACAAGAAAGGCAAGGCC
CCTGGCCTCAGGAAGCTGATGTGTTCTTCCCTCTTGAAGTTCATTACTGTTGTAC ATTGTGGATTGTGGTTGTTTCAGAC
ACTAAAGCCCCAAAAGTGTGGCTCACCAGTTTCTTTCATGTAGAATGACTGTGG GGAGCTTAAACCAGTGGTTGAAATGT
GTAAGAAGMT TTAGAGGTTTTTACCAGGTTATCCAGAGATTACTGGAAAGTG GATAATGGTCATATTTTAGGAGTGA
TGAAGCTGTCTTAATTGAGTTACTAGACCTTGACTAATGAATTAATTTACCTAGA
GACCTCGATTAGTGCCTGCGTGAAC
AGGGAATAATGGCTATGGAATACAGGTAGAGAAGTCAAAGCAATTCTCTAAGAT
TAATTCAGTTAGTGTTTATTGGTCTC
GGACTACTGGTTATCCCTATAAGGCTAGTCAACAAGTTTACTTGCCTTTAGTTTA
TTTTACAAAAGTTTTTTGTGGGAGA
TTTCTATGACTGAAAACAAAATTATTACAATTCCTTTACATTAAATCTTTTATTCTG
CTAAGGTAATATTCACTGAGAAT
CTAAACTAATATAATGTAATTTATTCAAGGTTTTGCTGAGGTCCTTTTTTTTCTTTT
ATGCCAAAATCTAAAAAAAAAAT
TTAGGCCAAGCGGGGTGTCTCACGCCTGTAATCCCAGCACTTTGGGAGGCCGA
AGAGGGTGGATCACGAAGTCAGGAGTC
CAAGACCAGCCTGGCCAAGATGATGAAACCCTGTCTCTACTAAAAATAGAAAAAT
TAGCCAGGCGTGGTGGCAGGCACCT
GTAATCCCAGCTACTCGGGAGTCTGAGGCGGGGAATTGCCTGAACCCAGGAGG
CAGAGGTTGCAGTGAACTGAGATCGTG
CCACTGCACTCCAGCCTGCATGAGAGAGTAAGTCTCTGTCTCAAAAAAAAAAAA
AAAAACTACCAATATGAAATTATCAT
GGTCACACTATCACACGTAACTTTGAAGGTTTGATGACCCATAACTTCATAGATC
TGTGTGGATCCCAGGTGATTTTTGA
GGAATATCATTGGCCTTCTTTCAGGACTGTGGCCTTCTACGTGGGAATCTTCCC
TGATTGGGCCTGTAATTTTCATAGGA
AAATGTCTATTTCTATTCTCAGCACAGCTAACTTAGTAAAGCCTATGAGTTTCCC
CCCACTCAACCAAGCAGTAACACCT
TCGCATACCCTCTTACCTCACTGCCACTTAAAACTCAAAGGAGAAACTTGTTATT
GTCAGCTGGCATAACAATATGTCAT
GGCTCCATGCCTCCAGATATTTTCAGTTTTACCCTTAAACAAGATGCCTGATTCC
AGGGTTCTGCACTCTGCAGTAGTGA
ACTCCAGAGTTACAGAGTCAGCAGTAATCAGCCTTAGAAGAAAGAATCTTGTGA
CACTTGCTGTTTTATTTACCACGAGG
GTGTGCTCCTGGGATTCCGTAGCATTCCTCAGTGAGCAGTTCTGTGAGGTGTTT
CCATCCCATCTCTATAATGGGAACAC
CTACATTTTTTATTTTAAGTTCCGGGTTACATGTGCAGGATGTGCAGGTTTATAC
CTGGGTCAACGGGGCACACCTCCAC
TATTGTGGAGAAGTTGTTTGTCACATCTCATGCACACCATGAGTCATAGAACTGT
TCCTGCCTTTGCCCTGCATGGACTG
CAAGAAGACCCGTGTGGTCCATTTCCTCACTGGCTCCATTTTCCCCTTCTGATC
CACATTCTTCTTCTAAGGTACTCATT
CATTTTTTAAAAAGTTGACTGTATAGTTATTATGTATATATTTGGTGGGTTATCTT
AAACCCTTCTTTTAGAAAAGTGGG
ATATCAATTATTCAGAAACTAAAATTAAGAATTAAAAGGTTGTATTCAACAAGAAT
AAGAACCAGAGAACAGACATGCTC
AGGAAGAAATTACCTTGCAAAAACAATAAGAACCAGAGAACAGACATGCTCAGG AAGCACTTACTTTGTTAACAACAACT
GTTTTGGAAACTTTTTGGTTAATATTTGCACAAAAACCTGATATGCACAAAAAAAG CCTTTTGGGTTAATATGCACTTCA
TGTTTTCATATAACTGTAGTCTCCCAAGAGCACATGCCTCTTTATCATTTCTTCTC TTTGCTCAAAGTTCCAAACCTACT
TGTATAATGAATAGGCAGGCCTGTTATTCGTTTCAGTTCAGCTATTTCATGTATTT
TTTGAAAAGCATGCTGCCCATGGT
AAAATCTCTCCCAATGTAAGGAGTACACTTTCTCTGTCATGAATATTGTTTTCAG
CTTCTGCAATTTATTTTATTTCCCA
TGCCTACCACTTCTTAAATGTTCTCCCTTCAAACATTCTAAGAATTACAGGATTTG
CTTGGAGTCTCTCCTTTCTAGTTT
CTCATTTAGTGTTTTTCTTAAAGTATATGTTTCCCTACACTTGTTCATCTAATTACT
TGCCTTGATTCAGCTTTTATATT
GATAAAGGTTACTAAGAAAGCACACTTCTCTTGAGATATATTTCTTAATTGTGCTA
TTAAATTGCATCCCCTGGTTCCCT
AATTGAAGGCAAAGTAGAAAACGCAGTTTCTCCTACTGACATATGGAAATGAGAT
TTTAATGATGTTCGTCAGCGTTTCA
TTACCAGCCATTGAGCATCTCTTACTTTCAAAGTACCATTGGGTTCTATGGAGAT
ATAGAAGACAAGAGGGTCATTCACT
GTCCTCAGGAGCATTTTGCCACTAAAGAAAACTATGTGTACAGCCAAACCTAATA
AAATTTCCATTTTTAAAATGCCATG
CTAGAAGAAACAAAGTCCTCAGGAAGAGTGCAACAACAGATTCCTTACATAAGT
GTTTGAAGAAGAAATTAACAAAGAAG
GTGAAATTCTCAGCATTGGAGGATGTGAAAGTGCGAGGGACAGAGGATAGGGA
AGGGATATTTATTTGAGGCTAGAGAAA
TAGACCACTTTAAATGGGCTCAATGACAGGATATCCTTTCCATCAGAATTCGGTT
AAGTTATATTTAAATTTTTTAAAGG
ATGAAAGCATTTTTGAACTGGAATATCACTTTATCATCATCTTTACAAATAAGGTA
ACAGCCCAGAGGGGTAATTTTGAA
CTATTGACTTAGAGCAACAACACAAATACACTTGTATAAAACACTTTAGCATGCAT
TATTTGATGCAATCTTGACAATGG
TGCTGTTCCCATTTTACTGTTGAGGAAACTGAAGTCCAGCAAAAGTCCAGATAAC
TGAATAAATTATAAGCTATCAAGGC
TGCTCTGGAATGTATGCAATCCTAATGTGTTAATTTGTAGGTTATTCTGTCCACT
ATGGTGAAATTTGAATGAATCCATT
TTCTGTGTTATGCCCTGTAAGTTAGAGAGGATATGCAGATGTACCAGAATGTGC
AAGATTCAGCTTCTCTGCAGGTGGGT
TATTTTCCCAAAGAGTGAAAGGAAAGAAGGTCCCCTTACTGGAGTCAGTGAGGG
TTCCATCACCCAAGCAGCCAAGCTCT
GGTTGTGGGCAGCCCCTCTCCTTTAGGCCCAAACCTCCAGAAGAAGGTGGCAG
CCAAGGGTCCCCAGGCAGGTACCCATC
AGAGGCTGCACAGCCTGTCTCTGGCCTCCTCATTGACAAGTGAGACTTACGGCT
TGGAGAAACAATGCTGTTTCTGAGGC
CAGTCAAGTCTGTAGGGCGTCTGAGGTGGGAGCAGGCAGGTGGAGTTGGAGG
CTTTAGAGAAGACAGGCCTGGGTTCAGA
TCCTGATATTGGTACAGCCTAGCTGGCAGACCTTTATTTTGATTAACTTCAATCT
TTTTATCTGTTAACTGGGGACAGTA
ATACATCCCTAAACAGATCTGGGTAGAAATAATCCTAAACCTAGAAGAAGCCCTG TTCTCTCCACTCTTCCATCTCTCAA
CCCTGACCCCAGCCTCAAGTTTCTTTTGCTGTGATATGTCTTGCCTGGCTTTGC ACTGTTTCAATCAGTTGAGACCAGAG
GATGGGATAGTGAACCAAGTGTGTTTGAGTTCTCCTGTCAGAGAAATGCTGTCC CCTAATGAACATGAAGACATTGATAA
GTATTTCCCCTGTGACACATCCCTCAAGAGAGAGGCTAAACAGACCCTTGGAGA
TTATCAATGATCATGTGCATTTTGCT
GTGTGAACCAGAGATGCTCCACATGCCAGGTTGTGGAAAAAAAAAAAAAGTTTA
ACGAGCTTCTGATATGTTCCAGGCAC
TGTTCTAGACATAGGGGATTCTGCAGTGAGCAAAAATCAAGCCTCTGCCGTCAG
GGAACTTACATTCTATTGGAGGCACA
GATGATAAGTACATATAAAATGCATCTGGCAGAGTGGCTGGAAGGATGAACAGC
TGCCTAAGGAGAATAGCGTTAGGGCA
GGTAAACTGTTTCGGAAGAGTTGTCCATCGGGTGACACTTGAGCAGAGGCCTTC
AGGAAGGGAGGAGAGAGCCACCTTAA
TATCTGGGGAAAGAATTCTGGGGAGAGGAAACAGTAGTTGCCAAGGCCTGGAG
GTGGGGCTGTGCTTTTTGGATTCAGCA
AGAGCAAAGAAGTCCCATAAGGCATTGGAGCCTCTGGGAAAGGAAAGTGGGAG
GGTAAAGGCAGGGAGGAAGAGAAGCTG
TAGTAGAAGGCCTTGTGGGCATGAGAAGGATTTGGGATTTTATTTTGAATAACAT
GAGCAGTGTTTAAGCAGAAGCATGG
CATGATTTCTTACACCTGTAAATAATTTCTCTAGTCATGGACAGAGAATAAGTTG
CAACAAGGCAAAAAAAAAAAAAAAA
ACAAGGTGGTGTTGAAGAGATCAGAGCTTCTGCTTAGCACATGTGAAGTTTGAG
CTGTCTGTTAGAAATCTAACAGGAGG
TGGTGACTCAGTCATTGGATAAGTAAGTCTAATGTTTACAAGACAGGTTAGGGC
CTAAACTTGTGAGTCTTCAACATCTG
GAAGCCACTGAAAGCTGTGTGTGTGGATGAGTTAACCAGGACTGTAGGAGCAA
ACACAGAAGAGGAGACAAGCACTGAGC
CCAGGGCCTTCCCACGTTAGCCAGCTGGGAAAAGAGGAAGAGCCAGAGAAGGA
GATTGGGAAAGAGCACCCAGGGAGTCA
GGAGGAGAACTGAGAAAAGGGCCTTTCAGAGGCCAAGGAACACAAGTGTTGCA
GAAGGAAGCCATGATCAACTGTGTGTG
GCGCTGGCAAAAGGCGGAGAAATATGGAAATTGAGACTTGACTATTTGATTTGA
CAGAGTGGATGTTGTTGGTGATCTTG
ACAAAGAGGAACATTAGAGAAAATAGCATGATGGAAGTGGGTTCAAGAGAAAAT
GGGATAAAAGGAAGTTTAAGCAGCAG
GGATTCTCAATTCTTTCAGTGACTTTTGCTGTACACTGTAAGGCAGTAGAGAAAG
AGGGCAGAATCTAGAGGAAGTGGAA
GATGCTTTTTAGGCAAGAAATCTATCTAGCTTGTGTGCACCTGCAGGTCATCTAG
AGTAAAGGGAACGTGGACTGGAATC
AGTGACAGGGCACTGGAGTCTCATGCCAAAAGGAGGGGTTGGCCTTAGGAGTG
CAGAAGGTCCACCCATCATAACCTGAG
GGGAGGCAGACTGCATACAGAGTACTTACGGGTGGGGAGAGCCTGGCAAGGA
GAGTAGAGAGAGTGGTTCCTTCTGTTTT
CTCAGTGAATGGGAAGCCAGGTTCCTTGGCAATGAGAGAGTAGAAAGCTGTCC CTTGGTATTGCTCATGACACCCCTGGT
GCATAATAAATATAGGCAGCGCCAATATTCTACAAGGTCATGGACCATCATCCTG CTCAGTCACATCCCTGGTGCTGAGT
GTGTTGCCTGGCACATAGGTGCTCTGTGATACCACAGAATCCAAAAGTAGTGCT TGTTGCCCTGGCCTAGGAATTTGTGA
TAGCCACAGGCAGGACCAGCTGCAAAACTGGTGGGCCCCTTGTTCAAAAATAAA GAATTTTAAGACAGCACAGCAGACTA
CTGAACCAAGTTGCAGGCCCAGCTGAACACAAAGCCCTGTGGGAGTGCACAGG
CTGCACCTCCAGGAGACTGGCCCTAGC
CACAGGTACTTGTTAAATCTGATCAAACCCCAGTGATGCAAGAATCACAGTCAG
CAGACCCGTGGCAGTGGCCAGAGATA
AAACTCCCTGCAGAGACAAATTTTGCTTATTTTCCTCTTTCAGATGTTTAAGACTG
AGTGCAAGCAGCAGTCACCATCTT
CACAGATTAGCAGAGATGGAGGTTTAATGATTCTTGTAAGAAAATGAACAGAAAG
ATGCATCAGAGATATTCAGAAACTT
TTAGGAGCAAAATGTTCACTAGTGGTATCTCTTAAATGACGATACCACAAATCCC
TGCTGTAGGAAGAGAAAGGTGCCGA
ATGGTGGGACAGAGGGAGCTTCAGCAGTGTTGGAGGTTGCAGCCCATTGTGAG
TATGAGGTGGCATTAATGACCATTCAA
GAAAAAAACATTTTCAGACAAAACAAAATAACATAAGAAAACCTCCCCATTCATTC
CCAAAACTGCAAAGGTAAAGAGTA
GTATGGGTAGATTAGATCCAGTGCAAAGGTAAGATGACCTGTTTTATAGCTACTT
CAAATTGAGTTGTTTCAAAGATTGT
TTTTAAAAAACTATTATATAGAGACCTTTTTAATGGAGTTTTTCACTGGGCATGGT
GGCTCATGCCTATAATCCCAGCAC
TTCTGGTGGCCCAGGCAGGCAGATCGCTTGATCCCAGGAGTTTGAGACTAGCC
TGGCAACGTGGCGGAACCACATCTGTA
TAAAAATTACAAAAAATTTAAAAATTAGCTGGGGATGGTGGTGCTCACTTATAGA
CTCAGCTACATGGGAGGCTAAGGTG
GAAGGATCACCTAAGCCCAGGAGGTCAAGGCTGCAGTGAACTGTGATTATGCC
ACAGTATTGCAGCCTAGGTGACAGAGT
GAGACCCTGTCTCCAAAGAAAAAAAAAGCAAAACAAAGTTTTCCAACATTGTTCT
CCTTTTATCGGAAAAAAAGCAAGTC
AGCTGAATGAATGTGTGCACATGTATACATTCCATGTTGTGAGCATTTGGTTTCA
GTGGCATTCAGCCACATGAGAATTC
AAGCGTGAAGTTTAGGCCACCACTGGAATTGCTTTAGACATTTTAAGAATAAAAG
GAATTTTGGTATTCACTCTGCGGAT
ATCTGACTGTAGAGGGGGAACCTTGGCCAGCCCTTCCCTCTGACTGCCTTACA
GGAGAGGAGAATTCATTTTATGCAATG
TTGTCTAGAGGATTAGAGGTGGGTTTGTTAGAAATATCTGTCATATTCAGCTATG
GGAGG-TCACTTTAAGTCAGAGAAAA
TCTAAGTGTTCATTGTAAAAGGCCTACATATCCTTTTTTCATTCACCTTTTCACCT
GAATTTCCTGGTTTTCTTACATAG
TAAACAGCATGGCTTGACAATGGTAGGGTCTTCGAATTTTTCTTTTCTCAGAAAC
TCAGCTTAGCTCATTTTCTGGTGTC
ACTGGTATGCTTGTCTGCTCAAGTAAGTTGAGTTTGTGTCCCCTAGATTCAGCT
GACTACAAAGAGCCTGGTAAACAAGT
ACGCCCAGGGACCCAGACCTGGGACAGCAGACTTGGTGCCTGCACCTGGCCAT
TATGTTAGAGATGAAATCAAAAAATCA
CTCACAAGATGTATAGTTCTGGTGCGTGTATCAAACCAAGCGGAATTCATAGCC
TGTCAAGCATTTCTGTGTGCTGCTAT
GACAGCAGAGTGGAATGGTTGCAACAGAGACCGTGTGGCCCAGAAAGCCTAAA ATATTTACTGTATGACTCTTGACAGGA
AAAATTTACCGAGCCTTGCTCTAGAGTATAGGAACTCTCCCCTCCACACACACAC AAAAAAGGAGAATTAAATGTTTATA
GTGTTGGTAAGTTAGTCTCCGTGCCCCATCATATACCTGTGACCTGTACCAACA
CCGTCCAATGGGTTAGTTTCAGTGCC
TTTGACTTTCACAGTGCAGCTGGTGTAGAGAAGTTCTGGGTAATTTTGATTATAT
TTCTTTGTTTACCTTGTCCATCCCT
GGAGTAAGGGTCAATCCCACTCACTTGCCTTAAAGTAATTCAACTCAGCTTCACC
AGTGTTTTCCCCCATAGGAAAACTA
ATTTCAAAATCATAACCAAAAATATGCCCAATGGATGTATATGATTCTTTTGGCCA
ATGTCATCAAATTTGATGCATCTG
CTTCCCTATGGGGAATGCCACGGAAGAACTATAATATCCTTGCCTTCTTGACTTA
GATTTCAACCTCTCTACATTCTGTC
TTCCTGATCTGGCCTTGGTATACCATCTATTAAGATGACTTTTGCATATACTTATG
TTGGTTTTGAGAGTCTCAAAGTTC
AGGGGTGCCTTAGTGTCAGTTGTTGTCTGACATCTACTTCGTGCTCATGTGTAC
AATGTGCTTTAAGGCTTGAGTCTTGA
ACATGATGGAATCTACATTTAGGGAAGTGATATGCTCTGTTATTTTTATCTGAGT
CTCCCATAATGGCAAATAGAATATA
ATTATTGATTAGCCTGTCTCATGTTCACTAAATTCAGATAATCTTTCTAAGGATTT
CATGAAAAGCTCTGCACTTTGTGT
TGTAAGCTATATTGTCTTAGCAGACAGATTCAGGGTCTTGCCAGAGATCGTAGC
TACCCTGGCCTCAGTCCAATGTCATT
GTACATGTGGTGAGAGCTTCCTCCCGGCTAGGAGTTCAATTACCTACTTCTGTT
TAGGGAGAGTAGTCTTTCACCTCACT
TAGACTTCAATATGAGTGAGGAGGGCCTCCACCCTCCCAAAGCACGGAGAAAG
GAGCCTGCAGGAATTGTGTATTCTACT
TCCTTTGCAACCACTACTGGCATATGTAGTCCCTGGGGTGTAAAGCTAGATCAG
AAATCCCCTACTACCCTTAGAATAAA
TGTGAACACTTATCTCTAAATCTTCAAGGAGCCAACCAGGCCTTCATGTCGTGCA
GCAAGACTCCATAAAAATAAAATAC
CTCTGTGAAGGAAAGCATAATTTAAGGAAATATCAGGCTGGGAGAAGTATCAGT
AGATATGAAAAAAGGGTTAAAACTAT
TTTGTATAAATCAATAAAGGAATATTTAAAAACTGGCAAAGGATTTGGACTGTTCA
CAGAGGGAATATAATTTTTAAGCA
AATATATAAGCATCTTTAATATATGCAAATTAAAAATGATGCACTGTTCTGTGTAT
GATAAATAGCAATGACTTTAAAAT
GATGACAGCCAATGTATATATGGTTATGGTAAAAGAGACACAATTTAAATTGATA
CAGTCCTTTGAAAACTATTTGGCAA
TATGTATAAGAAAACAGCGTATTAAGGAAGGCCTGAGAAATAAAACCACATAGGT
AATTTGGGCTAGATATAGTAGGCAA
TAAACCCTGATGGGTTTTGAGGGTTGGAGTTGCCTGAGATCTTTTATATAGATTA ATATGAAGGTAACTAATCTACTTAA
GAGAGAGAGAGACAAAGAACTAATCATTATGAGGCTTTGGTAAGTGTTGTACAT CCACAGGTAATAGAATTTTAAAGCTA
AATAATCTAGGCTGTCGTCAATTTGATAAACAGAGTGGGAACTGAGGAAAAAAT GGAAGTAACTTCCCCAAGGCTCAAGG
AGGAGTTACTAGCTCAGGGGAACTAAATGGACTCTTGCGAGTTCCAGTTCTTTG TTTTTCTTTTTAATATTGAAGAATTT
ATGGAAATCTGTGGTATTTACAGATGTAAAAATTTAATTATCTTATGTGGATTTTA ATATTTTATGACTTAATATATAAA
CATTAACTTTTAAATGAGCTCTTAATGATAGCTATTTCTCATTTATAAATATTTTTG CTGTCTTGTAGAAATAATTAACC
TTTATAGCTTTAATAAATACTAGACTGAAGAAACATTTTTCTAATCATCAATTTTTT AATGTTTGCCTTGCACTCTAGAC
ACTTGCCATGCTCTCAGTGTGCAAACATGAATACCACTGTGTTCTGTTTTCCCAG AGTGGGAATGATGGACATGTGCTCA
AAATAATTTCATCAAATTATTTATTATCCACCACAAATGCAAATAATTGCAAAAATG TTATAATCCTATAATAGAGATAA
GAGTACATAATTACAAAGAAGGGACCAGAAGTTAGAGAATTCACAAAGATTCTTG AGTTTGATCTTCAGAGGCTAATAGG
AATTTCTCAGACAGCGGAGAGGGTAAGGAGTTGAGAAAAGGAGACAAGGGCCA GGTGCATTGGCTCATGCCTGTAATCCT
AGTGCTTTGGGAGGCTGAGGCAGGAGGACTCTTTGAGGCCAGGAATTCAAGAC CAGCTTGGGCAACATAGCAAGACCCCC
AGTCCTAGCTGCTTGAGAGCTTGAG
AGATATATATTGCAGGTCCCTTGAACCAAGGAATTTGAGGCTGCAGTGAGCTAT
GGTTGTACCACTACACTTCATCCTCA
GTGACAGAGAAAGGTCCTGTCTCTAAAMCTAAAAAAAAAAAAAAAAAAAAAAAA
ATGTGTGAAGCCATGAAATAGCATG
GCACTTTCAGAGAAATGAGATAATTAAGTTTGTCTATCTGAAGTAGACTATGAAG
AGAGGGTAGTTAGAGATAAGGTTGT
TGAAGGATGACTCGGTGACAACACTTTTTTCAGTTAGCTAACATCGTAGAGTGCT
AACGACATGTGCAGCTCCAAGTACC
TTTACATGTATGTTAGTTTTATTTTACCCCATTTATTCCCGTTTTACAAATGAAAAA
ACTGAGTCATGTAAAGGCTAAAT
AACGAGCAAGCAATACTACTAGTAAGTACTACAGACAAGAAGCAAAATGTGGCA
AGCCAATACATACCTAAGGAAAAAAA
AAACCAAATATCCCCAGATAAAAACTAGAAGGAAGCAAACGGTCAAACTCTAGGT
GATGGCAGTGCAGAAAATAGGTGGA
AGACGAGCCTATCGATCCTGGAAATAGCCTGTTGCAAGAATTAAGGCAAGATAT
AATGAGAGCCTCAACTTAGAGCAAAA
GTTTTTAGGAGTAAAAACCACAGAACTTAGTCTTAATTTGATGGAGACACTAAGG
AGTTTCTGGATTGAAAAAGTGGATA
GTTACTGGTACCATTAATTCAACTTGTTGATAGGGAGGATTCCTTTTGGTACAAG
TTTAATTATGGTGTCCATGGAATAC
TTGGGTAGAGAGGTCTAGAAATAGTGGGATACTCAGGTCAGGAGATCAAAAGGT
AAAGCTTTAGAATTTATCAGCTTGGA
GGGCATCAATTAAATTCCTCCTACAAATATTACCAAGCACTAAGAGCCTACAGTG
GGATAAGAAGGCCATGGAAGAAACT
CCAGTGCACTGCTGCTCAAAATATAGTCCATCAACTGGCTGCCTGTCCACTAAC TGTTTATTATTAGTGCAAGATAACAA
AGGATTTGCTTCAAAATGTAATCGACCATGTTACTAAGTAATCAATGACGTTACA TAGTGTTTTCTTCAGCTGACTATAT
TTTTAATATTAGCAAGACTTTTAGATGAAGGAAACATTGTGTTGATTCATATCCTG ATGCAAAATCTGTATCTTGTCCCA
GATCAGCACTTTGAATAACAGGGATGTAGAAAACACCAATATTCACACTTTGGTC AGAGCAGAAGTTCTCCAAAGAGACT
GATAAGGGATAGTGTAAAGCAGAAGTAAGGGAAGTCAAGATATTCTGAATTGGA
CATGATTTAAAGGAGCAAGTGGTCAG
CACAGCTGCAGAGGGGTCAAATAATATTGAAGACTAAGACAATAATTATTGGATT
TGGCCATTAGAGGGTCATCAGTAAC
AGTTTAGGAGTTTTACCCCCTGTTGTTTCACTCTTCTGCCTCCTTGAATCTGTAT
TATTGCCATGTAAACCAAATGCCTC
TTGACTCTAATCTAAAGTACTCAGATTCAGTCTTTATTTTACCTGTTTCTTGGGTG
ATCTGAAATTATTGATCACTCTCT
CCTTCCTGACATACTTTTTCATCTGCCATACATTGACTTTGGTTTTCCTCTTCCAA
CCTCCCCTGTCTTTCTGTGCCCAA
CCCTTGTACACACACTGCATTTCTTTAACAACACTCATTTCCTTAATTACATATGG
TGTCAGTTTCCAATCCTAAACCCA
GCCCATATTTCTTTATGGAGCTTCAGAGTAGTCTAGCAGCCTACTGGACATCTCT
ACTTGGATGCCCACAGCCCCTTCAA
ATTCAATATGTCCAAAGCGGAACTGAATTCTCACCTATACTCTTCCCTGTAATGC
CTTCTTCCTCCTAAGTCCCGTACCT
TGTATTACAATCTTCACCCAGTTGTGAAACCACTGACATGGGCATGATTTTTTAC
TGCATCTTTCTTCCTTATCCACTTA
GTTATTGATTTTACCTCCTAATGACACTTCAGATCTCCTTATGCCCGGCACCTGC
ATCACCATCCCTCACCTGGATCACT
GAATTATCCTCCTGGCCGGTCTCCCATTCTCAGTCTCCTCTCCAGCACCCTGCC
ACGTGGAGTGTGAGCCCCCACTCCGT
TGTTGAATTCTTTACCTGCCACTCCAGCCTCATCTCTGTGTTCTTTTTCTGTAGA
CTCAGGTCAGATTGGATTTTTATTT
TCATTTCTTAAATGTGACATGCGCTCTCGCACGCGCTTGCGCTTGTTCACTCGC
TTGCTCTCGTTCACTCGCTCTCATTC
TCTCTCTCTCGCTCGCTCACTCTCTCTCTCTCGCTCGCTCACTCTCTCTCTCTCT
CCCACTTTACCTTCACACATACTTA
TGCCCCTGTCTGGAACACAGTCATATTCCCTCTTCACTGGTGCATTTGTCCTCT
GTCTTCAGGAAGCTTCTGTAAGCCTG
CAAATCTGAGTTATGTGGACTTCCTCTGTGTTCTCCTAGCATTCTGCCTTTCCTG
TGTTGTGACCTTTCCACATTACTTT
GTAATTTCTCATTTTTGTCCACAAGTCAATGCCTTCCTCTGAATTGTAAGCTCCG
AGAAGQCAGGAACCATGTCTGTTTG
CTACATCATTATATTTCATGCTCCTGGTACAGTGCCTGCATGGTGTGAACGTGAC
ATCCACTCCAAGTTTGTTAAAATCA
ATGAATGCCCTCCAAAGTATTTTCACATCCAGTATCTCAGTGGGGGTAAACTTGA
ACAATTTGGAGAGGGTTTTAGTGAA
ACATGAGCTGTCAGTGGATTAGAAGATTAAATGATTTGAATATTTGAGTTGTTTT
GCACTAGGAGGAGGAAGGAGAAGTT
AGAGTCTGGAAAGCTGGTTAAGAGAATAGAAGCATTTAATTCCACCTCATAGAAA
AGGAAGGGATTTTGAAAAATAACTA
GCTTGCATCACTTTTATTTTCATCTTCCTCCAGAAGTGGAGGCCTGCAAGGTAC GGGTGTGGGTGTATTTTAATATTTAA
AGGTGTCAGGTTGTCTATCTTTTAGGGGTTTTTTTCCCTTTACCTGGAAAAACTT TAGATAACATGATTTAAAACTTTCA
CTTTCTACTTTGCAGMTGMTTTATTTTCTTTTGGAAAGAGGCTTACAAAAGTAA CTCACTTCATTATAGCTAGTCTGT
TTCCAAAAAAAGTGATAATAATAAAGACAGAATATATTTTAAAAGATTGAGTTTTT
AAAGTGGGGATAAACATTAGGAAA
TAAAGGAATGAGCCATCAATAGGAGGAAATCTAGGCTAACAAAATCTAGGAAAG
ATAAACGACTTTTACGACAAAATTAT
ACTGTCGGTAAGTCAAGGAAAAAGTTGTAACATGCTAACTTACATTGCTCGATGT
CTTTTAGCTTCTCTTCAAATACTTA
CTTCCAAAAGCCAGAGAGCTCCCAATTCAAGAGAGACTTTGTAGTAAAGTACTT
GAAGTGCAGGTTGAATTAGATGGTGA
CTGTACTGGTTGTAAAATGAATGCTTGGTATCGAAACATATTAGAGATTGATGTG
TGTTTTGTCTCAGATGGTGAATATC
TGGTTCTGCATGGAGACGTACATTGCAGAATATTTTTTAAAAAGGAAGGCCAAGA
CTTTTTCCTCAAAGAACAGCTTTGA
ATCCACTTGACTACTTAAAATTCTACTCATTTGGAGCTGCTGATTCTCTTTAAAAC
ATGGAAGAGAGATGCTCAGTTTGG
AACATCTTAAGGAGAGGATGCCCTAGTTATCTGCCTCTGTAATAAAAGACTATTA
ATGTAGATTTAGAGAATGATTAACT
GACAAGCCAGCATTATGTTGTAATGGTTCTGTGAGATGAGTCAGCTTAAAATTGC
ATCCCAAATAAACTAACCCATGGGG
TGGAGTTGATTTGGACTTTCTTATTGTCATTAATAAGCTTACAGGTCATGATGGT
GGTTTCTGGGTGTGAGAATCACATG
ACCAGTTTAGATGAGGAAGTTTGCCATCTTGTTATAGAAGAAGTCTCTTGCTTGT
TTTCTATAAGTAATAGGAATCTATT
TAGGTTTTAAGGTTAAATAAGAGGTATAGAGCTGTCTCCTGTGATTTAGAAACAG
AAATTTATTCTGTCAAAGCCTTTCC
TATCTACTCTGACTCAAGGCTTTGGAGAAGCACTTAAGCGTGGAGGTAATTCTG
AGACCATGCAGTTTGACACGTGGATG
GCCTTATTGGTATAGTTGAGGCTGATAGTGTGTTAATAAAGGCTACCAAGCCTTC
AGTTGTTCTCTTCAAACAGTGGTTT
TCTGTTTAGTGAAATAGACCAACCTTAACAACATCGTAGGGAAAATAATCATTCC
TCATCTTAGATGGAGAGATTCCAAA
CTAAATCTGGTACAGAAATCTGGAAAAGAGTTTGTGGGGTGAGCCAGCTTTTTG
GACGTCTGGAATTTGGGAGAAGAGGA
GATATTCTGATGTATGTATTGGATAGACTAAAAAGTACAAAATTGGTGTGAGTAA
ATGGCJ-I I I I GTTCAGTCTGAACAC
AAGATAAGCATCTTCCCTCGCTCCACAATCACTGCCCGACTCTTCCTTTGAGCCT
TTTGGGGGTTGTTCTTTATTTTGTG
GTTTCTATGTAATTATTTTAGTGGCTTCTCTGAGTTACCCAGATGAGTAATTTTAA
TGTGCTGCCTGGTTTTGGATGTTT
GGTTCAAAGGGACTAGGACCGAGGAAGAGCCGGTGATGGACAGTGTGTGGAGT
ACGGGTATTGTCCATGTATCTGCCTCT
TTATCTTGTCACCCGCTTGTTCTCACATAAAATGAGGGAGAAAATGGGGAATGTA
AAATTTTCAAAGAAAAATCTACCTG
ATGACTCAGTTTTCCCTGCCCCAGAGAGCTGGTTTTTTCTTTTCACCTCTTTGCA CGCACCAGAACCTTTTTGTAGCTGA
GAGAGCCACAGCTGAGGCAAATGTGTGAGCCTCTCAGTATGGCCAGGTGCTGT GACTGGAATGAATGAGAGAAGCGGGGA
AAGGCATGGGGAGCGTTCTCTAATCACAGAGAACCACATGGAAGATGCACGCA ACCTTGTACAACCTGTTGCCTCTCACC
TTCAATTAAATTTTAATTGTTCCTCATCACACTTTCTTTCAAAACAAATCTTGTAAT
CAGATATCTTATAAATTATTTAC
CAGCGCAAGTAAAAAACAATGTAGGATTTCAGCAAAATGAAATGTGCTTTACTAT
TGAAACTCCCCTTGTGAATTAAAAC
AAGGCTAATCTTGGCACCATGGTGATGGAAATAAAAAATGAAAGTCCTTCCTATA
TTCCTGCAAAGTACAGCACAATCCA
CAGCCCATCAACACCCATTAAAAATTCAAATTCTGATCAATTGCTTCAAGGTTCA
AATGGGAAATCACGTGGACGGGGCC
AAAAAAACAATAATCCTTCACCTCCCCCAAATTCTGATCTTGAGGTACGTCATAC
CCAACACCCAAGTGAATAAAAAGGT
GTCTTAATAGCTAACTGGTGGTTTTACATACCTGCTACGTAAACAGCAGTGGTTA
AGGTATCAAATGCTTTTGTTACTGC
TTTCTTTTTTATATTAACAATAGATAAAGTAATATTTTCTCTCCTTTATTAACTATA
GGTTAAGCAATAATTTGTTTTTT
CTTCCCTATGAACTATATGGCAAAGGGTCTGTTTCTCTCTAAAGAAATATAGTAA
ATTATGCAATTCTACTTGAACAAAA
TTCTGTTATGCTGATCTGTATCATTGCCATTTTAATTAAGTGTCATACTAAAAATA
CTTGGTAAAATCCACTTGTTAAAA
GGCACAAAAAAAATAGAATGAGTGAGACCTACTATTTGATAGCACAACAGGGTG
ACTATAGCCAATAATAACTTAATTGT
ACATTTTCAAATAACTAAAAGAGCGTACCTGGATTGTTTTTAACACAAAGGACAA
ATGCTTGAAGGGATAGAGACCCCAT
TCTCCATGATGTGCTTATTTCGCATTGCATGCCTATATCAAAACATCTCATGTAC
CCCATAATTATATACACCTAGTATC
CACAAAAATTAAAAATACAAAAAATAAATAAAATCCACTTTTTGTTGTTTTGAAGAT
GGAGTCTTGCTTTGTCACCCAGC
GTAGAGTGCAATGGCGTGATCTCGGCTCACTGCAACCTCTGCCTCCCGGTTCA
AGCGATTCTTCTGCCCCAGCCTCCCGA
GTAGCTGGGACTACAGGCACCAAGCCCGGCTAATTTTTGTATTTTTAGTARAGA
CGTCATTTCACCATGTTGGCCAGGAT
GGTCTCCATCTCCTGACCTCATGGTCCACCCACCTCGGCCTCCCAAAGTGCTG
GATTACAGGCACGAGCCACCACGCCCA
GCCATAAAATCTACTTTGAAATGAAAGACACACTTATATATATTGAGAAAGGAAAA
CCTGGAATGTTCTAAACTGAAATA
GTAATCATCTAATGAGACATATTTTGGGGGAGTTCAAAAGCAGGAAGGATTAAGT
AAGAACTTAAAGAGTGGTAGATTTC
ATAGCCTTGAATCCACTGCAACTAAGTTTCTGCTTTTTGCCAGAGTGAAATCTGA
TGGTTGGAAAACATGATTATTTTGG
CTGAAAAAATGAAACTTTTAGCAGAATCTATGTATATGAATATGTACTGGTGTGA
GTGGGAATAGGAAGAAGTAAAACAG
GCTTGATAGGCTTTTTGTGGTATCCCAAAGACCTTTTTGAAATGAATTCCTTCAG TTCCCATTTTAAATACTTCAAGAGT
ATGTCTAAGAGCATGAGTGATGTGTATGCTTATTTTGTTAAATATCTATACCCTAA AGACTTAAAAAACCTGGCAAGAAC
AATGTGTATCCCATCCCTTTGATAACAGCTCAATAGGTAACATAAGGTGAATTTA TTATGTAAGGTTTTAGTCTTTGGAT
GCAATTTTAATACTTGGCTTCCTAAGAGTAAAACATAAGATTCATTATTTTAGTGC TTCTCAATGAACAGAGATCAGAGG
TAAGGAAAAGATAAATGATATATGATTAGTTGCTCATCTCTATCTCAAGTTATAAC
CAGACTAGCTATAACTCATTTTTT
TCATTTATTCAGAGCTGAGACCGAAATTCTATTACAGAATTAGAAAGGACATAGA
TTATCTCTTTCCTATATTTATATTG
ACATTACAAATGTAATTATCCTCTTGCACCTCATCCAAAGTTATATCTGGAGAAAA
GTTAGGAGATACTCTTTGAAACAG
TTGTGTTTTTTTTGAAATAATAAATTCAGATTTTTCATTCATCTTCCGTTTCAAGAT
GACTTTTTCGTGTTGAAGTTAAC
TGAATAACAGCTTTCTCAGCCAAGGCAAAGACACATTGATTTGGTTCTTCCTTTG
TAGAGAGTGTTCATCTGGGACTTGG
ATGAGACAATCATTGTTTTCCACTCCTTGCTTACTGGGTCCTACGCCAACAGATA
TGGGAGGGTGAGTACCAAGAGGCAT
CAGTTTTTAATTTCTTTGGAGTAAATATACCTCTAATTTGTTTATTATCCAAATGAA
TGTAATTTAACTTTAACAAATTT
AAAATATAAATGGAGAAGAACTATATAGTCAAATATATATATGTAAGGTAAAAAAT
ATGTAAAAGAGCCAAAGAAGATAT
TATTTTTAGGTCTTTGGCAATAAGATACACTTTGTTCAATTAAAGTCATGCTTATT
CAGACCCCAGTGAATAAAAGTGAT
TGAATTTTATCCTAGGATGAAATATTAAAATACATAAGTGACATAGAAAGATGTTT
CGTGTTCTCTTTTACTTTTCCTGT
CTTTTTATTCCCTAACATTTAATGGAGCAAATATTTCACTTTCTGTTGTTCTCAGT
GATTAGAAGCTTTGGTATTTTATC
TTCAAAACAACTGCTACAAATGTCTCTATATGTCCCTTTGCTACTTGGGTTAAAT
GTATTTGAATGGTATGTAAGATTGA
TCCTCTGAAAGGCAGTGGGACCATGTTAGAGTTGTACAGTACTTATATTGGTATT
ATAGATAAAGAATAATATTATAGAC
CACCTTCTGTGAGTGAAACTTTGGTTTCCATGTTGTACATTTGGGTATCCCAAAA
GTTTGGAATTTCTTTTTTGGCATAT
TGATGATTTCATAAGTCTTTCCCAGGCAAAAATGTTATTTAAAGTAGCTATAGGTA
ATTTCTCAAGTTATCCTGCTAATA
TTTCATTTCTCTTGTTTTCATCTGTAAAGAAATAATAAGTCTTAGGTTTTTTTAAGT
TATTTGGAGCATTAGAGACTGAA
TTCTTATTATGTCCCAGTGACTTCAGCAGTTTATCATAATTAAGTGTTAGAATTAA
TACCATTAAAATGCACATACTGTT
ATTACAAATTTGACCGGGTGGTTGTTCTTTCAACAAATATTTATTTTAACAAATAC
ACACTGAGTATCTGCTATGGATAT
GCAGTCTGGGGGATTTAGCTTCACAGCCCAGATTTCCTTTTCATCATCTCAAAG GTACACCTAATCAAAAATAGGCTGAA
GCAATAATATCCCACCATTTTAATTCCTGTTCCGAGATAATATATCGGTAAAACAC
AGTATGTCTCATCTTATTGTTAGG
CAAATTCTCTATAAGAAGACAACAAAAATGCAGATTATAAAATATAAGGCCATAGA
ATCAATCAGCCAAATGGTCTCTAC
TGCGTGATTAGCACATAGTAGGAATTCAAGAAACATTAGCTGTGCTTTTTCTTGT
ATTATTTCTTCTGTCAATACTATCA
CTGTTCCTATTTCTAGAAAAACTACTACTTAAAAACTAAAGATTAATATCTGAAAT ATCAACATGTAGTTAGTAGGCAGA
ACCTTGAAGATCTCCTTAACTGATCCATTATAAAATCCCTACACCTGTCTTCTAAG GAGACCACTGGTGCCTGTAACTCA
CACAAGAGTGTTGAAAAGAGTAGGCACTTAGAAAATGAGCATTTGAAAACCACC
CAGAAAGTCTCCTATAAGCTCTCGCT
AAGGGCCAAATTGACGCAAAAGAGAGTTATCTAAGAGTATTCTATTAGCCAATGA
AAGAGACCAGCTTAGAGTTAAGGAG
TTAGGCATCCCATGACCTACTGAACTTTCAATCTTTGGCCATAAATTATTGGCTT
ACACATGGAACTTACACTTAACCTA
GAAATGCCTGGCTTCATACAATTTTTCATTCATGTCTAAACCATTCTCTAAATAAT
GTATTACAAATACCATGAGTTTAA
TTTTTTAATGTGTCAATCATTAGAACTTTGAGGCTTCTTCAAGTAGTAATATATTT
GTTTGTTTGTTTAGTTTAGTTTTC
TCATTCCAGGTCATGTGATAAAAAGTTTTATCTCGGCCGGGCGCGGTGGCTCAT
GCCTGTAATCCCAGCACTTTGGGAGG
CCAAGGCGGGTGGATCACGAGGTCAGGAGATCGAGACCATCCTGACTAACACA
GTGAAACCCCGTCTCTACTAAAAATAC
AAAAAATTAGCCAGGCGTGGTGGCAGGCGCCTGTAGTCCCAGCTACTCAGGAA
GCTGAGGCAGGAGAACGGCGTGAACCT
GGGAGGCGGAGCTTGCAGTGAGCAGAGATCGCGCCACTGTGCTCCAGCCTGG
GCAACAGAGCAAGACTCCATCTCAAAAA
AAACAAAAGTTTTATCTCATATAATTTCCAAAAGCCAAATTTATATACTCTCTGAA
GTACAGGCTATTTGAGTTTCTTAA
ATATAAGTGTATCCTTCATTGATAAGTTTATGGGGCTGTGGAAGAACTTCAGAAA
GCCAGGTTTCTGAGGAAGATCTTCC
TGCATGTGACTTTGGGTCTGAGGCCTGGAGTTACTGTATGGTCATCAGGAGAAA
GGGTGTTGTAAGCAGAAGAAACAGTA
AGTGCAGAGGCAAGAAAGGCTTTGCTGTTTGATGGACAGAGGCCAGTGAGACC
AAGCTTAGTGATGGAGATGGAGAGTCA
ATGAGATGTGGTCAGGGATGAAGGCAGGTTCCAAAGGGCCTCCCAGCCACACA
CAGTCAACAGGGCTAATCTCACTGCAT
GTGCCATGGGAAGTCATCGGGGGACCACAATGGAGGAGATGAGAAGATCTGAT
TTACATTTTACTTTAGCTAAGTGTGAA
GAATGAATGGTAGTGGAAGAAAAATGGGGGTCCAAATTGGAGGTTGTAGCAGT
GATCTGAGGCACCAGCAGTAGAGATGG
AAAGAATTATTTGGACTCTGGATTACATTGCAGGTAGAGAACTGGATTTAGAAAT
GATTTGAATTGATTGGAGTGGTGGG
TAGTAAACAAGACAGAGAAGTGAAAGATGATTCCTGTGTTGTCCTGGGCAACTG
AGAAAATAGTCGTATATTTGTTGAAA
TAGTAAAGAATTCAATTTGGAGGGAAGGGAGAGTTCAGTGTTGGACAAGTTTCA
GATGCTTATTAGACAACTGCTTGGAC
CAGGCAAGGAAACCACTGGAAGGTGAGCCCAGGGCACTGGGGGGACTAGTCA
GTACAGGAGATGGCAGATTCACAGTCAA
CATTGCACAGATTAGGGTTAGAGGCTCCTTGGAGTCTCTTAGCTTGATTTCTTCC TAAACATTTTTTGAGCACCTTCCTT
GTACTGAACACTTGCGGTCAGCCTGTTGTCTCATTGCCTGGAGGTGTCTTTTCC TGGCCAGGTGACTTTTGAAGATGCAT
GGTTTAGGATGTTGAATTGTAGCTGGAAGCATGCAAGGAAGTCTACTTAATGATT TCCTCATTAGAAATCATTACAGTTG
ATAAGAATTGGCTCTCAAATCTGATCATCAAGCTTTGTAAAAATGTGATTCAACAA GGATTCTTCCAAACAATAATAATA
AGCAAGCTGATCAGAGTGTCTTGCAGTCCTTTTGTTTCCCCCAGGGCACTGAGG
AGAAAAGTCTTAAAGGAGTTCAGTGA
CAAGACTTCTTTTTAGGAGCACAGATTTGGTGGTGTTCCTGGGCATTGTCAAGG
GGAGCATGTGGCTGAGCCTTACTCTA
CATGTATATAAAGAATGAATGCCTGACAATGTGATCGCTGTTCTGTCTCTAAGGG
AACCAGATACTTGAGAAGTAGCTTA
GTCACAGTGGCTTAGGCAACTCAATGAGTAGGCCATTCAGTAAGAGGACACAG
GATCTGGTTGATACAATGCAAAATTAG
GGCATGTTACAGCAATGTGTACCAAAAAAAAGTTTGAGAATGTCCCACCTTCCGT
GTAATCCTAGCTGTGGGAAGAAGGC
ACGTTGGGTGTTATAAGAAAGTCTTGTTTTATAACAAAGGTGATAGGGACAAAAG
AGGGTGCTGGGCACAGTCAGCCATG
GCTGAGGAAAACAAGTAAGATGTTAGGAAATTGAAGAGCTTGTGAGTGCTACGT
AAACTAGGATGTTTTAAATAGTATCT
GACCTAGCTGTTTTTGTTTAATGCATTTTTAAAAGTTGCTTTAAAACTAATAACAA
ATTCCAGAATGTCATTGAAGAAAA
CTATTCTTGGGTTATTTTTCATTGAATGGTCTCCTTATAATAAACAATGTAATTCA
ATTATTGTTATCAAAGTGCACACC
CAAAACTTGTATTTCAAATCAAAATTGTGCTGATGTCTCAGGCATGGTCCCCCTC
CTGTGGGAGGATTACCCAAGATATC
TACACTCAACGCCTTGATTGTTTCAACACTATCTTTTTATTGTTTAATGTCTGTGT
CTCACAGTGATTTGGGCTAGAATG
GATCTTAAAAATATGGATATTCACTTTTCATTTTCAACAAAAAACAAAACATGAAC
ACTTAAGATAGTGTAAAGCTTAGT
ATGTTTAATCATAGGCCCTATTAAGAACTTTCTATATGCCCAGCATTCATTTAATT
CAACATTAATGCTCCAAAGGTGCT
GTTATCAAGCCTATTTTATAGATGAGAAAATTGAAACGGGAGAGGTTAGAGACC
CAATGTAATAAATACTCAGCACCATG
TCTAGCATAAAAAAAAATTTCAACAAATACCAGTTATATTTGAAAGTAGGTGAAGC
TTCACTATCAAAAAAAAACAAAAC
AAAAACAAAAACTTGAATTCCTGCATAGTGGCCAGTGAGTAACATGGTGGCAGT
GTCCTCAAATTTATGAATTACAAGTA
TACGCTATGCCTTAGAGTGTGATAGAAAGAGGATCGGATTTGAATTCAAGTAGA
TCTAGACTCAAATCCAAGCTCTGACT
TTTCACCTCTTTGACCTTAGGCAAATTATTTATTCTTACTGAGCCTCAATTTTCTT
ATCCATAAAGCCAGCTACGTGACT
TGCGGAATTGTTTTAAAAATTAAATTGTATCATGAATTTGGCCCAGCATGGTGGT
TTATGCGTGTAATCCCAGCACTTTG
GGAGGCTGAGGCGGGTAGATCACTTGAGCTCAGGAGTTCAAGACAAGCCTGGG TAACATGGCAAGAGTCCATCGCTACAA
AAAAAAAATACAAAAATTAGCTGGGTGTGGTGGCACATGCGTGTAGTCCTAGCT GCTTGGGAGGGCTGAGGTAGGAGGAT
CGCTTGAGCCTGGGAGGTCAAGGCTGCAGTGAACTATGATTGCACTACTGCATT CCAGCCTGGGTGACAGAGTAAGATCC
TGTCTCAAAAAAAAAAAAAAAATTGAATTGCTCACGAATGTAAAAGTCCATGCAC AGAGTCTTATTAATGGCAACAACCG
ATAGTAAGAAAGGAAAGCATGTGTCACCTATTGAACTTGTAGTTGCTCTCTAAAG TATTTGTTTTATAACTCACTAGGCT
TCTAAAAGGATCTCACTTGTCATAGTGAGTTTAGATAAACAGCCTCCCTTGTGGG
TGTGTTTCTGTGATCATTCAGTACC
TTGAGGAAAGAAAAACTGAAAGCAGTTTATATCGCCCATTTCATCAGAACAGCTC
AGTTCGCCAGTAGAGATTGTTGTTC
CTACTCTGCTTTAGGGGACAACACAACTCCAAGCTCTTTTGGTGGAGGATTTTCT
TCCATGGTGGTTGATGAAAAAGCTA
AACTCTTAGCATGGCACCAGAGCAGTAGGACACCAGTGTCCTCACTGGGTTGCA
GGAGATCTGGTTACTTAACCAGGATG
ATCTCCCTTTTGGAACCTCCAGTCCAAGGGTTGGCAAACATTTTCTGTAAAGCAA
CAGATAGTAAATATTTCAGGCTTTG
CGGGCCATATGCTTGCCATAACTACACAGTTCTGCCTTAGTAGTGCAAAAGCAA
CCAATGAATAAATGTGGCTGTGTTCC
AATGAAACTTGATTGAATGCTCCCAAATTTGAATTTTATATAATGTTCATGTCTTA
TGAAATATTTTTCTTTTGGTTTTA
AAAACTCTAGACCGAGGATTGACAGACCATACCCCATAGGTCAAATCTGGTTTAT
GGTCTGCTTCTGTCAGGCCTTTAAG
AATGTTGTTTAAGTTTTTAAAGTGTAATTTAAAAAAATAAAAGCACAAGGAAGAAT
GTGTGTCAGAGATGATATGGCCTG
CAAAGATAGAAATACTCCCTGTCCCTTTAGAGAGAAAGGCTGTCTGCCTGCCTC
TGCTGTAGGTGATGCTCTGAGCTTTG
CACTGGTTCATTTTCCTCCTGCCTTCCGGGACAATTCAATGACAGCAACAATGT
GACTAGACCGTGAACTCAATGCAGGC
AGTGCATACATGCGCCTGGGGCCCCAAGGAAACACAGACCCTCACGTGAATTC
ATTAGGGTAGGTGAAATGGCTGTTACA
AACAGATCTAAAATATAATGGCACAAACAAAAGAAATCTGTCTCTTGTTTAGTTGT
CTAGAGTGGGGATGTTCCAGGTTG
CTTGGTTTGGGCATGGTTTGGTGGTACAAATAATAGTAGTGCCATTTTCATCATG
CAGCTTTTGAAGCTTCTGGGATCAT
TAGCATTCAATCCCCAGGGGAGGGGGTGGAGGCAGGGGCAGTGTGAATGGGA
GGCTTTGTAAGCTAGGCTGAGACAGGCA
CACACCTCTGCCTGTGCCATTGACCTCAGTCAAATGACCCAGTGCAGGGGAGG
CTGGGCGATTCAXCCTTCCTGTGTGCC
CAGGAAGAAACAGAGAATGGGTTTTAGTGAACAGGTAGCACTCACAGCCCAGTT
CTCAACTATTTGACCAACGAATGAAA
CCCAAAGCACACTTTCAACCCACCTTGCAGCTCTAGCCACACTGACAGAGCCAA
GCCCCAGGCTATGTGAGTGGTGGCTC
TGGAGTGATGATGGCCAATCTAATCCCAAAGGACAGAGCCCTGATTTTATTTTCT
TCCTTGTATATCAATGCCAAATTAG
ATAAAAAGGATGAAATAATATATGTGTCTATATATGAAACCTCTTTATGCTGTAGA
GTGACATAGAATCTAAACTCTCAA
AACAAACAGAAGCACATTTTTTCAAATGCCAGCCAAACAGGCATGCTTCATCTCT
CTTCTGGCCATATCTTGAAGGTTTC
TCTCTTTTCCTTTAACAGCCAGCCCCTCTGCCTCACTACCAGTAGTATACAGCG
GAAATGTACATTCTTGCTGTTTAATG
GAACACAGCCTTACTGTTCTTGTTCATGAATATAAAAGGGAATGCAGTGAATGAA
CAGGAATCATCACAGCTCCTTAACC
ACTGCACTGGCCTGCCACCTGCAACTTTGGACCTCCCTTGCTCTGTGGCCTTGG AAAAAGCAGCCTATAAAATGGGAGCA
ATTACAGGGCTGTTAGGGGAACAATGTGAACCACCTAGAAAAATACTCATAAGTA
CTTCATGAACAAGAGCTATTATTTA
TTGCTTACTTTGGGGGTTCAGAGGAGGAAGCAGGAAACCATGCAGACTTTGTTA
TTCCAGGCAGCAAAGAGGGAGTGGGA
AACAGAGAATTACAAGTGGCAGAAATCATAAGCAAAAAGGGTAAATAGTGATAAA
GTATATTTAGGGGATTAGACTTGAA
GAAGAATAGTAGCAGAAAATGTGTTGGATCAGTAATAAAAAGACTGCATCTTTAT
TCAACTTTTATGGTTTACACTGCTA
TCTCACACTTATTATCCAATGAACTCCTTCTTTTTGCCAATAAGAAAAGTGTAGCT
CAGAAAAATTAAATTATCTGTCCA
AAAGGCAGATCCATCAACTGGCAACACCAGTCTTGTAATTGACCCAGGCTTAGA
GAGAGCATTGAGTGCCAGGATTTCGT
TCTGTGGGCATGG " AATGCCATGAATTAGGGACATGATGGAAGAACATGATGG
CAATGGAGGGACATGCTGGAAGTGTAC
ATTTGGAAGACCGATTTAATAGCAAGGAAAACTATTGTGCAGTCCTCTGGTAGAA
AATAATAAAGGCCTAAAAAGTGGCT
GGAACACTCTGTAGGAAGGAGGGAGGTAGGAGAGTCATTACAAATGAAATCCT
GATAGGTCTTTCGACTGGATTTAGCTG
TTTAGCCTTTCCCTTAGCACCCCATGGGCCCCTTGGTGTTGAAGCCAATTTGGC
CAGGTCCATGCTGGCTTTATTCTTCT
TCCTCCCTCTTTGTACCACTTCTTGGCAGTTGGCCCATGTTTGAGATTCACTGG
GTTTTCTCCACAGTCAAAATTAGTCT
TTGGAATGTTGAAGTTCTTTCTCTTATTCTCTACTTGAGTATTATTTTCATGGTGT
TGAAAGGGAGTCTCTTAAGTAAGG
AGAAAATTAACACAATCGAGTGTCACGTATCTAATCAGTAAAAATGACCTGATTA
GCCCCACCACAAGTTTGAGGGTTTT
GGGGGTGTTTTTCCAGATCATTCAGGCAGTAGTTTAACCATGGTTTATTTTTTAT
GTGTATACAGTCTCTCTTGGTAATC
TGTATTATTTCTTAATTGGCAAGAGAATCACATCCAACTCCCAAAATCTGATATTT
CATATCTAAGTAGCCCAGTGATAG
GGCCTTAAACCTTGAAAAAAGTTAATATTTTATTGATGATATTGCTCCCTTTAGAC
AGGAGAAAACACAATAATGGAACC
TTTGGGTCTCGTAAGTTGTACAGATCATTTGGATAACACTTTTGGTTATGTGTTT
ATGTG N I I I I A I I I I I I A I I I I I I
AAAAAAACAATGTTTCAGCATAGTTTTTCCTTTCAAGATAGGAAGCTAATTTTCAA
AGCTTAACCTTAATCCTTTCCTTG
TGGGCAGCCAGCATGGGGTCTTAAAAATAAACCACTCATGTATTCAGAGTGCTC
ATTCCTG Π I i ΓCAGCTTATTTTCCC
CCAAAACAAAAGATCTTGGATGGGGAGAATTTTTAAATGACATTCAGGTCCTGTT TGTGGCTGTAATATTAGCATTCTAA
GTAAGGAATTGTTTAAGGTGCACTGAATGAGTATTTTTCTTTCTTTTGATTGATAC TGCCCAAAGATTTGTCCTTGATTT
GAGCATACTTAACAAGATTTTCACATGGACGAATAAATCTCATAAACCTGCTCAC ATTGTTAGCAACACACCTCACTGGG
CTACCACAACTCCCAGTAATTGCTTGCCAGGGAGGCTTAGCTGCCCATGTAGGA TTAGTACAGAGAGAGTTTAGGAACCT
TATTTTCTTTTATTCCAGAAATTCAAAATGTAGGTAGCATATCCTGAGGAACTAGT CATGCTTTCTGAAATAGTTGCACT
GTCTTCTGATAACTCCTATCTCAAGATTAATTATTTAAATAGCACTTTTTTAAAGAT
AATATATTCTTTGTGATTCAAAG
TAGAATAGCAATTGCTAGGAGTTTGTGTAGAGTATCATCTCATGGATATAAGTAG
GTACATATTCAGAGAGGAGAGATAC
AAAAGACATACATTATGTATTTAATGTACTTATAAATTGAAAATAATAGGATAAGA
ATTAAAAGGCAAAATAATTCCCTT
TCCTTTTTATTTTTAGTAAGAACACCATAAGTACTTTGTAGAGACCAAGTCAGAAT
GCTGAAGCTACTTAAAATTGGAAC
AATGACTTAGCCATTAAAATAGTTCTTAAATGTTGACTTAAAGCATTTAAATTAGA
CAATTTATATTATGTGTTTACTTA
TTTTGTTTTACTGTTAAATTTTTTGTTTTCAATTTAGAAAAATAGCATGGCCAAAG
AAAATAAAGCTAAATGTGCAAGGG
TGCATAAATAAAAACTAGAGCTACATTAAAAATAGGAACAATTTTTCTTTTGGTGC
TTTACAGGTGTATCTAGCAATAAA
GAGAACCTGCTTCCTCATTAATATTTGATGCCATTTTTAGTGGTTCTTATAATAAA
ACTAAAGTGACTTATTCTTAGATG
ACATCAGATCTAATGTAATACTTGTCACTTGTGGCATTATTAAATCAACAATAAAA
ATCTTGGGAAGTATCTTTTAAAAC
ATAGTTACATAAAAAGGAGAGATACTTTCATGGTTAGAAAATGGTTACCTTTAAC
GTTTCTAGAGAAAGGAGAAGAGATA
TTTTGTTTTTATACTCAGAAATATATTCTGGATCCTTCTAAAAGCCCCAAAAGAGA
TAAACAACAGGTTTCCCTTCAACT
TTCTATCTGTTTTGAGTCTCCAAAGACCGTGAACTATCTGAACTGAAAGTAAATA
AAAACGGGTCAAGAAGATCATTTCT
GAATCAGCCTCACTTCTGTGAGGGGTTGGTGTTTGCCACTAAAAATAGACTACA
TTGTCATTTGAAGCCAGGAAGTGAGA
AGTGTTCATTTTTTCTTGCCTCTTTCTGATCATTTTAGTTCAACGCATAACTCTAC
CTTTCTCCAGCAGGCTGAATTGTG
CGACTTTCATGAGTCTTACGTGTTTCTTTTATGCATTTTGCCAGAATACTGTGCT
TAAATGGTATGGGGCCAGTAGAGAG
CTCCAGGAGTGGGAACTCACTGAACTGGTAAATGACAGCAAGAGAAGTTAGGAC
AGATACCAGTGTGTCCACAGGTGACT
CTTGGTGCTCTTTGGCATCTGTGTAATAAGGTAACATGAACTGTGGTGGGCGGG
CATCTCCAGTCTGCTGTGCGGGAGAG
GGAGGATTGATTGTAGAAAGTCCTTAGTGGTACCAAGAGAATGTGATACGGCTC
TGGCTTGGTTGACCATAGACAGAGTT
CAGGGTAAAAAAGGTATTTTGTTGTTGTAGGGAGAAAGCTGCTGGATTTATAGG AAAAATGATGTGTGCTTACCCAATAC
TATCAGTAAAATAGTGTCTGGCTGCTTTACATTTTTCTTTAAAAAACACTATAAAT TCCACATTGCAATCCTTTTCTCAA
GCACTCTGAATGGGGTCACTAAGCAACTTTGTGACACAATTAGCTTCAGCTGTC CAGTTGGTCAAATTAGATTTTACCTC
TTCTCCCTTCCTCAATGTTTGTAAGCTGCCTCACGAACATGACTACAACATAAAA CATTTAACTTTCCAAAGAGGTCAAG
AGAATTGAGGATAAATACATACCGCCTTCAAAACAGTTTTCACAGTCCTTATTTA CAAGATAGTTAAGATCAAATATCAT
GTACAAACACTAGAATTAGCAGTGCAAATGTAACTATAATAGTTCTCCACATAAAT GGGCAACAAAATAAAATGAGCAGC
ATAAAGTTTAACCTAGTCACTAGAATAAAGCAAAATGGCATGTGAGAATGTACTC
CCTTCCACTCCTTCCTGGGGACCTA
GAAAAATTCCTGGTGGTTTCTGTAGTGGGTAAGGTCCCCTGCTAATACCATGTG
GTCTCCATGTTAGGACTACCCAAAGT
TTAAGACTCCACCATAGAAGTATCAGGTGGTACTCAGAACGTTAGGCTTCTAGC
AGAGATGAGTTTCCCTGTGGATAAAT
GATCATTAAAAACATCCATGAAGTAATTCAGAAAATTGTACCACCAAATGGCACC
CTTCTCAAGTTGCACACGCAAAACC
TCCAGGGGTATATGGACTCCAGTACCCAGAACACATGCCTGGCTGCACACCTCT
CTCCTGCCTGAGTCACCATTTGTTTC
CTGGCCTGACATTAATGAGTCATTCAGAAAGTCATTCTTTTATGATTTATAGAATA
CATAAACAGTAATGTTGTAGGTGC
CTAGCCAAAATTTCTGCCTTCAAGGAGCTTTTAAATTAACAGAGGAGGCTGTTTG
GTAGGGGATGGATGAATCTTCAATA
ATAGTCTGTCATTAGGGTAAAACTATTCTAAACTAATGCCACTCTAAATCCACTAA
GGGTAAAACTGCAGAATCTCCAAA
GAAATAACAGTGCTAAGAAGGCAACAAGTCTCTCTGTATTATTCATGAGAGCCAT
TCTCTAGTAGCTACATCATTTAACT
AGCAAGAAATAGCCAGTAATAGGATGTCAGATGATAACACTAGCATTGTGAATCA
GAAAAAAAAAAATCAGGATCCACAG
GAGACATACAACCAAAAAATAAATGCCAGGAGCCTTAAATGAAATGTTTCTAGGG
TTACTGCTCTTCTTAAATTACACCG
GGATTTCTCCTTTGCAGCTGATCAGCACTGCAATTCAACACCGCTATTGCTAAAT
GTATGAGATGATCTTGTGGCTTCAA
AAATGTAGGAGTTCCTCTGGGTTCCTGGAAAAGAACATCCCATATGGTAAAAAT
GAATAGAGTGAATAGAAGCCAAGCAT
CGTCTGACATACATTCTTGTGTATATTCCAAATCCTCACTTATATCTGGGTACAG
ATTACCCAAGTATGCCCTAGCACCT
TCTTTTCAATAAAAGGTGATATGCTGAGGAACTTTGGCAGGCTTTTGCATTATGG
CCTTTAAAATGTTTTTGGCATTTTA
AAAGTATGGCTTGCACGTGTTTTA TTTTCAGAATAAATATTTTTAATCCATAGT
TTCACCACCAAAAACAGCCAATCT
TGTCTTAGTGAATATTTTTATATTTTTTTTAACTTTTCATTATAGAAACTTGCAAAC
ATAGAAAAGTAGATAGAATAATA
TGTATAATGAAACCCAATGTAACCACTCCAGCTTTAACAATATCAAGTCCTAGCC
AATCTTGTTTATGTATATCTTCATC
GTATCCCTCTGCTTCCTGGTCCTGTCCAAGGATGATTTTTCAAAACAAATCCCAG
ATATCACATTATTTAATTTGTATAG
AGTTGAGTATGTATTTAGGAATTCATTAAAAGAATAGAATAGAAGTAATAAATAGT
ATTCAGCACCTTAGCCATCTAGTG
AAATTAAAAATAATTCTGTAATATCATTAAATGTCTAACCAATAGTTATATTTAAAA TTATTTTTTATTTTATTTTTTTT
TTGAGATGGAGTTTAGTTCTTGTTGCCCAGGCTGGAGTGCAGTGGAGTGATCTC AGCTCACTGCAACCTCCGTCTCCCGG
GTTCAGGCGATTCTCCTACCTCAGCCTCCCATGTAGCTGGGATTAGAGGCATGC GCCACCATGCCCAGCTAATTTTGTAT
TTTTAGTAGAGACAGGGTTTCACCACGTTGGTCAGGCTGGTCTTGAACTCCTGA CCTCAGGTGATCCACCCACCTTGGCC
TCCCAAAGTGTTGGGATTATAGGCATGAGCCACCGTGCCTGGCCAATTTTTATT
TTTAATTTTGATATACACATAAAAAA
TTTTACCTTCTTAACCATTTTAAGTATACAGTTCAGTAGTTTTAAGTACCTTCACA
TTGTTGTAAAACCAATCTCCAGAA
GTTTTTTATCTCTCAAAATTGAAAGTCTAAACCCATTAAACAACTCCCCATTCTCC
CCTCCCCCCAATTCCTGGCAACAT
CTATTCTTTCTGTCTCTGTGTATTGCAGGTACCCTCATATAAGTGGAATCATATA
GTACTTCACTTTTTGTGACTGGCTG
GTTTCACTCAGGACAAGGTCCTCAAGGTTCATCCGTGTTATAACATGTCAGAATT
TCCTCCCTTTTTAAGGTTACATAAT
ATTCTGTTGTATGTATATGCCACATTTTATTTTTTTGATATATCTATCAATGGTCAT
TTGAGTTGCTTCCACCTATGACT
GTTTTAAATAATGCTGCTATGAATATGAGTATGCAAAATGTCTTCCCTGCTTTTAG
TATTTTTTGGGTATATAATCAGAA
GGGGAATTGTGGGATCATATAGTAATTATATACTTAATTTTTTGAGAAACTGCTG
TACTGTTTATCTTAGCAGCTGCAGC
ATTTTATGTTCCCACCAGCTGTGCACAGGGATTCCAATTTCTCCACATCCTCATC
AACACTTATTATTTTCTGTTTTTTG
TTTTCTGATAGTAGCCATTCTAATGGCTGTGAGATAATAACTCACAGTGGTTTGG
GTTCACATTTCCCTCATGAATAATG
ATGATGAGCATATTTCATGTGCTTTTTGACCATTTGTATAACTTCTTTAGAGATGA
TCCTTTTCCCATTTTTTAAATCAG
GGTTTTTTTTATTGAGTCATAGGAGGTCTTTACATATTCTGGATATTGACCCCTTA
TTAAATATATGATTTTCAAATATT
TTCTCCCATTCCGTGGCCTGCCTTTTCACTGTTGTTTCCTTTGATTATACACATTT
TTCATTTTGATGTTTTGAATTTAT
TTTCTTGGTTATTGCCTTTGCTTTTGGTGTCATATCCAAAAAGTCATTGACAAATC
CAATGTCATGAAGCTTTTCCTATA
CCATTTTATATTTTAGGTCTTTCATTTAGGTTTTTGATCTATTTTGAGTTAATTTTT
GCATATGGTGTAAGGTAACGATT
TACCTTGAATTTTTTGCATGTGGATATCCAGTTTCCCCAGCACCGTTTGTTGAAA
AGACTATTATTTCCTCATTGAATGG
TCTTAGCACCCTTATTGAAAATCATCTTATCATACATGTGAGGGCTTATTTCTGG
TCTCTCTCTATATTCCATTGGTCTG
TACATCTGTATTTTTGCCAGAACCACATTGTTTTGATTACTGTGTCTTTGTAATCA
GTTTTGAAATTGGCAAGTGTAACA
CTTCC GTTTGTTATTTTTTAAAATTGTTTTGGATCTTCAAGGTCCCTTGAGATT
TCATATAAACTTCAAAATGGAGTT
TTCTGTTTCTGCAGAAAAATACTATTGGGGATTTGATAGGGATTGCATTAGATCT GTACATCACTTTGGATAGTATTGAC
ATCTTAACACTATATCTTTCATCCATGAACATGGGATGTCTTTCCATTTATTTGTA TCTTCTTTAATGTTTTTAGCAATA
ATTTTTTAAAATTTTTCAGCGTACMGTGTTTTCCCTCCCTATTTAAGTTTATTCCT AAGTATTTGATTTCTTTCATTTC
TGCTATTGTAAATGGAATCATTTTCTTAATTTATTTCCTGGCTTAGTCGTTGTTAG TGTATAGAAACACAACTGATTTTT
GTGTGTTGTTTTTTCATCCTGCAACTGTGCTGAATGTATTCATTAGTTCTAACAG ACTCTGTGTGTGTGGTGGGGGGCGG
GGTGGGGGGAGTTTTTCGGGGCTTTCTACACACAAAATTATGTCATTGTAAACA GATATCATTTTACTTCTTCTTTTACA
ACTTTGACGCCTTTTATTTGTTATTCTTGCCTAATACCTCTGGCTAGAACTTCTAT GTTTTGTTACACAGAAGTGGCAAA
AGCAGCCTGGTTGTCTTGTTCCTGATCTTAGAGGAAAAACTTTCAGACATTTACC ATTGAGTATGATGTTCTTTGTGGGC
TTTTCATATATGAGCTTTATTTTGCTGGGTAATTTCCTTTTCTTTCTAGTTTATAAA GTGTTTTTCTCCAGTAGTTTTGT
TTATTTACTTTACAGTTTATTATATGTCTCCTAAGTGTCTATAATCTATAGATTCTC ACTCAATCCCTTTTTATAAACCT
TGTGTATTTCTGTTGAAGTAACCAGACTGTCTGTCCTACAGAGTTCCTGGGAGT CTGGGTTTTCTTGACAGCAACCCTAT
TGTATTATTTAACATGTTTCTTTGTCCTCTGTATTTCCTACAATCTGATAGCTAGA TCTAGGAGCTTGCTTAGTTAGATT
CAGTTTCCTTTGCTTTTTTTTTTTTGTATTTTTGGTTGGCTAACACTTTATAGATG GGGTATATCCTTGAGCATCTATTT
GTTGTTTTTAAAACTTCAGTGCTCCAGTTGGAGAAGCTCCTTCAGCAGTCCATTA CAGAATGTTTTTACTTATGCTGTAC
ATTCTTTCGTTTTAGCTCTAAGGAGGTTGACTAAGAAGAAGAAAACTCAAGAGTT TGGAAGTTTTATGAACCACAGATTC
TTAATTGATCTGAATCAAATATATGTAGCCTTGCTGCCAGAAAAAAAGACTGACA GTTCCTGTTGCAGATAATTAGCTCA
GAGCTATTTTTGCTGAGGTGAAGTTTTGAGATGAAACTAAACCACTTGCTGAATT TTGATATTTTCTCAAGTATCAACTC
TTCTTGAATAGCTCTACAGTATTATTGCATATTTTTGATTAAGGCTTCCTCGTTAC CTTTAGTGGACAAGTAGGTGTCTT
GAGTGACTCATGCTGAGTCACCTGGCAGGTGCCACATTCCTTTGTTATCTTTTC AATAGTGTGGTGAAGTTCAATTTACC
ACCCTTAGTTTCAAGTGAGAGAACTAACTGATAGGATCAGTGAATTAACATGGTT AGTCTATAGCTATGCTTGGTGGCTA
ACTTTAAAGGCAGATTCCAGCTATCCTACTTGAGGACATTTGCTCTACGGGTAGT TTCTTTGTCTCAGTGTAAGAATAAA
TATTTATTAGAGATTTAGAGATTTATATACAATCTAACATGCTAATCTTGACTGGG AAGTAATCAACTATTTACATTACA
GAAATATGCTATACTCACATCAGAGAGCCAAAGAATAGACCACCTTTCCCCTATA TAATCAGTTTGATGATTCAACCAAA
CTTTCTATTTGTGGGAGAAGGATCCTTTTTTATTTTTTAAATAATACCCCCAAATT TTAAACTCTTTATCTTCTTCCTGG
AGTGGGCACAGTCCAAGTCACATGCATTTTACCCTGAGCGGACAGAATACCAGG TACTTGATACAGATACTCCAGAACTG
ATACTGGAAGACTCTACTTCTCTTTAGTATTATGGTAGCTGATATTTTATGTTAAT TTGATATTCTTAAAGTGCTTTCAC
AAATTTATCCTTCGACTTGACCACAGACAAAACCTTTCTAAAATATTATGTCATTC CTGAAATGCTAAAGTATTGGATCA
TTTTATGATCACTGAAAAATTTCAGGGACTTTACATTGAAGGCACTAAATTTGGA CTCTATGTCAATAATTCCTGTTTAT
AACCATGTATTACTGGCATACATTTGTAGATAAACTTGGTCAGGGTCAAGATGTT GCCAAGACCTATTCCCCCCAGTCAT
TTGTTAATTGATTGGATTTGGGCTAAGCTATGTTCACAAATTGTCACTGTTAAAT GATTAAATGGTATAAGCAGAAAGGT
ATATGTGACTAACTTAAAGCAATCACATGACATGGCTATGCAAGTAATTGATTTTT TTGAAGAATAGTTCTAGGTTTCAG
TTGAACTGGTGCATTGCTCAGATCCCATCTGAATGATCCTACCTTCCTCTTAGTT CCTAGATCTTCAGAGTATAGGACAC
CTTTTGCCAGCTTCCTAAAGTGGGGTCATCTTTCCTGGTGATAAAAGCTTGGTT GACATCTTTGGGAAAAACCAGTGATG
GCCCCTTCTGGGATTGCTCATGTTGCATCGTCTATTTTAACTTCCACCTAAATTA GCTTTACTGATGACTGAGCTCTTAA
AAATATGGTCACCTCAAAAATAATTGGTATGAGGAATTGTATGAACATTCTATCG AAAGTGCCTTGAGTAGTAAGTACAG
CGAAAGTCTAGTTTGAATGTATGACTCCTTTATGGTAGTGCTGCAGAGATTGAGT GTGTTAGTCCATTTTCCATTGTTGT
AAAGGAATACCTTTACTTTTTAGTAGGCACAATACATATAGCACTTCAGAAATTTG AATTTACCAAGATTCTCACATACA
CAGATGAAGGCATTGTTGGAAGCAAATGACTTAGCCCTGCAAAAACAACAATGC CTTCATCTGGTGTGTGTGAGAGTCTT
GGTAAATTAAAGAATCAGAAGTGCTATATGTATTGTGCCTACTAAAAAGTCCATT TTTAATGATACTTGAAGTTTATCAC
ATGCCTTAGCTCACTGATGAAACAAAGCAGGGCGCCAAGAGGAAATATCCCAGT CCTCTGATGATTTAGTACACACATGC
ATTTATATACCCTTTTCTTTTATTTACTTCAAACCCAATTACAGATTTTCAACATTT GAATTCAAAGCTGACAGACATTA
ATCTCTCTAGAGACTGTATTTAATGGCATTTCAGTATTTCTGTGCACTGCTTTTGT CCAGAATATTTTTCCTTTTTAACA
TTTTAATCAAGATCTACTGTGATAGCTGTGATGAGGATTAAAAAAAATCAATAAAC AATTCAAGGTAGTGGACCAGGGTT
GATGCTGGTGATTTGGGTACATGATGAAAATTTTAGTCTCTGCGAGCAGTTCTG TGTTGTAGAAATAAGAGAAAGAGATC
TTGAGGCAATGGGATAATTAAAGACCACTTGACAGAGGATGAAGGCAAAACAAT TGTTTTATTCCTGGTTCTACAGACAC
TCAGCCAATTTCATCATTATTTACTTTTTCTACATAATGTTACTTTGAATTAGTAGA CATAATTTGCCTTTGATGTACTG
AAAACAGCCATGGGTAGCCTAGGATGACAACAAGTTCCTTCACTTCACTTTAGT
GACTTTAAGTAGACATTTGCAATTTA
ATATCATATCCAAATAATTTCGGAGCAGGGTTAGATTTCATTCCTGTGTGTGGGT
GTGAAGTTAGATGTTGTGTATTGAG
TTTTGTTTGGGTGAAGAACAACTTAAGAAGGCTTCCACTAGAAGAATTTCCTGTA CATAAGAACTATAGTAAATGTACAA
TTTCTTCGTTAAACAAGAAAATTGATAAAACAGTTGACTGTATTTTTTTCCTTTTG AGCCTCTTCACTGTACTATTCAAT
TTCTACCAGAGAAGTAAATTTCATGGCACATAAGAACATATTTTACAAAATGGTG TAATACATACCACTACTGAGTTTAC
AAAATCAGTGTTATGCTGACCTTATGGTAATATTATCCACTTCATTAAATAGAAAA GTAAGGCTTCTTATGCAGCTATTC
TAAGAATGGGCTCTTCCTTAATCAGGTAGATAGCATTTACTAGTGAACACGTTTA ATGTTTAATACTATCTTATGTAGAT
CAAATGAGATTTTCAGATTCAAGTGCCCATTTTTTCCCTAGGCTAGATCATTAGG
AATTGTTTTTTCACTGACAACACTA
ACAGCTAATGAAAGGTAAAATTTAGTATTTTCTTCTTTTCTTCACTGGAATTATTT
GAAATATTCTCTATTTATTGCTTT
TCATTCATTTATTCCTCCACCAAAGTTTGGGAAAGCCTTTATGACATGCATTGTG
TTGAGTACCAAGGATATACAGATAA
GGATATAGGGATAGGAAAAAACAAGGAATCATAGCCTAAAGCTTAAAATGTTGTT
TAAAGGCTGTAAAGAGAGAGAAAAA
TCTAATTGAACAAAACATTCTACACGAAATAGAAACAGAGCCTCCCACTATGTTT
TTTCAGAGGATTGATACCCTCTTTT
TGATTCTGCAAATAACAGTTTCATACTAAAACTAAAAAGAAAGAAAACCATACTCT
GAATCTAAAAAACAGGAGGAGGAA
GATACAGGGATTAAACCCCTGGCCTGCCGAGACAAGAACACATAGAGCTGTGG
TACCAGCAACTGTGGAGCCCAGCCAGC
AAGGATCAGGGAGGTGGAGCTGCGGAACTGGGAAAGGGGCCCAGCGGGGAG
CGCTGCCTGTAGGGATGGAGATGTGAGTG
TACAGGCTCCATCTGAGACCACGGTTCATGAACCTGTGTGGACAGATTCTTTGC
CCCAGTTTAGGCTTCTGCCTAGTTGT
GTTTTGAGGAAAGGATTTCATGATCAAGACAACAGTTTGAAAAATCAGAAACAGG
CCATCTTGAATATGTAAGCAAAAGT
TAGACATATACGGGTCTTTTGTACTTGTTAAATGTTTGAATGTTTTTTGAAAGAGA
TGAGAGGCTGGAATGAAGATAGTG
ACAACCTAGGAATAGACCCCAGTGAAGGAGACCATACCAAATATCCACAAAAGA
CACAAGGCTAAGGGGAAAAAAAAAAA
ACAAGAAAGCAAAGAACAATGGCAATGACAATAAGGGAGGCTGAAATTTATGGT
CCGAGTGACAGAGTGTTGAAATTGAT
TCAATATGGGGCTAAAAAGAGGTGGCTAGGAAAATGCCAAGGTCTGGGTTGGTA
GGATGCCATTCCTCTAAAGAATGAAT
ACAGTAGGTGTCCAGACGAAATGTGATGACTCCAACTTGCCACACATTGAATTT
CAGGTGCTTTATGGGACATCTAGGTG
GAAATGACCAATAAGTAATTGATATATGAGCCTCCAGTTGAGATCTAAGATCTGA
TAGAGGGGAGTGGGTGGTGGTTGAA
GCCATCAGTGTGCTTGAGTCCACAAGGTGATCTGAACAGAATGAGAGATGGAAG
GATAAGAATAAAGTCCTAGGAACCAA
CGTCATTGATATGGTTTTGCTGTGTCTCCACCCAAATCTCATTTTGAATTGTAGC
TCCCATAATTCCCATGTGTCATGGG
AGGGACCCAGTGGGAGATAATTGAATCATGGGGATGGTTACCTCCATGATGTTC
TCATGATAGTGAGTTCTCATGAGATC
TGATGGTGTTATAAGGGGCTTTTCCCTGCATTTGCTCTGCACTTCTCCTTGCTG
CCACCATGTGAAGAAGAATATGTTTG
CTTCCCCTTCCTCCATGATTGTAGGTTTCCTGAGGCCTCCCCAGCCATGCTGAA
CTGTGAGTCAGTTAAACCTCTTTTCT
TTATAAATTACCCAGTCTTGGGTATGTCTTTATTAGAAGCATGAATGGAATAATAC AGACATTTAACAAGCAGATGATGT
ATACATAGATTCTAGGAAAATGCCCATAAAAGATTGCTGTCATAGAGGAGAAATG TCAAGAAGAGAGTAAACAGCAGTGG
CATATGCCACAGAAAGGCCACGTTTTGAAAAGAATGAGAACATTTGGCAATGAG ATTGACTTAAAATGAGATGTTACTTT
GAAGTGGAGGCAAAGCTCTAAATAATTTTTTGAAAGTGATATAACAAAAAATGGT
GTTTTAAAAGAAGTCCTTTTATAAC
TGCAGCTCTGGAATTTGTTATTGCAAGAAAATTATGTATTATAAAAATAGTAAGAT
TAAACTAAAGACCTGGTGATTTTA
ACATCTCAGTTTTGGTTTAAGAACGTGTTTTTTTCCTTGTTCTTAATATGCCAAAA
TAAGCCATCAAGCCAGAGTAATTT
ATTAACTGTTCAATGTTAGCTTTTTTCACATTTGACTTGTACTCGTATGAGACTTT
AGAAATTTTAACAGTGTTAACGTA
TTAGCTATTATAATAGGAAAATAAAATCTAACCTTGATATGAGAACTCCAGTTGCT
AGAATATGTAAGCAATTCCATGAG
CAGAATTATCTATCTGTCCTTCTGCTAGTTTCCTCTTGGGTAAAACACTCAGGGT
GAGATGGCTTTGCACTCTGCTTGCT
GGGTCATCACGCACAAATCACAACCCTCTAGGAACTCCGTGCTCTGTTTCTAAA
GTCAAAGAGTATGTGTTTTCTAGAGC
TCTGCCTTTCAAACTGTAAATTGAGACAGGGCTGTGAAATGGACAGTGTCAGAG
TATATCCACACAGTGAGCAAAATCTT
GCTTTGGCAAATATATTTCAGCTGTGTGTGTGCACATAAGAGAGATTCTTAGATC
CCAGTGTTAATTGTGTTTTTTACTG
AGCATTGTGGTTCAAAGATGTCTGAAAAATACTGCTCTCAACTAGAACCCCCTTC
CTATTCTAAAATTGTCTCATTCCAG
TTAGCTGCCAGACTGTTCTGTTGGTGTAGTAAATGTTTCAAAACAATATCTTGGA
GTGGCAGGTATAAAAACTGTTGAGG
AATGACAGTTAATAACAAAAGAGATTAACATTAAGAGGTAAATAAGACCAACATC
TGTGTTGTATTTTACCACTAGACTG
AACAAGCCTTTAAAATTTATTAGTATTCTGACTAGCAAGTCATTACTGTAATTTAT
GAAGAAAAGTTAAAGTATGCTATA
TATTAACGAACAATAGAGTAATTCTTTCAATGGTCATATAAATAAGCAACCGTATC
AAGGAGAMTAATACTTTTTCAAA
AGTAACAATACTATGGCACATAAATTTTAAAATTGGCAATTAAATAAATGTGGTGA
GGCTCCATCCTTGCCACATAAAAA
GATTCATTGACATGGTTATGAGACAAGTCAGGGTATTTTTATTGGCCTGAGCTAG
AGGGAATAAACACCTTAGTCAGGTG
TTTTGTTTTAGACACATCTGTTTGAGCTTCCCCTACAGGTTAATTCAGCTGTAGG
GTCTATTTGAGCATTTGCTTTTTCC
AGCATGCCTACTGATACTTAAACTCCTTGTGAATGAGCAATTTCTTATAAATCCAA
GAGTAAATGAGTTAGTTTTTCTTT
ACAGTAATTTGTAAACTGTAATTCTCCTAAAACCTATGAAATTTCATAGAATTTCA
TAGATTTCAGGTGTATTAAAAAGG
GACAATTTCAGAATGCTAAGGAGTATGAATCATCATACACAAAATTTGAGGGAAT CGTTTTCCTTTGTCCCTCCCAGCAG
AACTCTTAGACACTCAGCCTCTGCATCTTTATTTCCTCTCTGCCTTCTTATCAACC CTGGGGTATTTGCCAATCTAACCC
CTGACGTTTGGCAATACTATGTCCTCTGCTTCCCTTCCACTGTCAGTCAGCTGG CAGAAAAGGCACAGGCAGGAGAGGAA
CTGAGGTGAGTGAATTGCACCATCCTGCCATATTTCACGTACTCTGTTCCTCCTC TAGGGCTGCTGGAAGGTAGAACTTG
GGCCACACTGGAGAGGTGGGGGCAGTGCTTAAGGGGGTGGGATGTACTGTAX GGATGTGGGTTATGCTGGAAGGGTGTGG
GTAGTGCTAAGGGGGTGAGAGTGCTAGGAGGGAAGTGAGCTGTGCCAAGAGG
GTGGGAGGTGCTGGAGGGATGTGGGAGG
TGCTGGAGGGATGTGGGCGGTGCCAAGACAGTAGGAAGTATTAGAGAGATGTG
GGAGGTGCTGGAGGGATGTGGGTAGTG
CCAAGAGGGTGAGAAGTTCTGGAGGGATGTGGGAGGTGCTGGAAGGGTGTGG
GTGGTGCCAAGAGGGTGGGGAGTGCTGG
AGGGGTGTGGGCAGTGCTAGGGGTGGGCAGTGCTGGAGGAATAGCCAGGTTG
GGTTGGGTTGACCTCCAGGCTACAGCAG
ATGGGCCCCTGGCAGGTGCCCAGCAAGGCTCCGCTCCCATCTCCCACCCTTTC
CAGCCCCTGCTCTCTCCTAACTGTACA
AACATGTACGATTCTGACAAATGAAAGACAAGATTTTTTTTATTAGTCTATGTGAG
AATATTGTGTTTGCCATCTTGCTC
TCAGATTTATTAGTGAACAAACCAGGCTCGATGAGAGGATGACACCTAAGTGGG
AAAGCAAAGGCAGGGTCTGGCCTACA
GTCTAGGCCTCCTCTCTCACCACTTCTTGCCACTTCCCAAACTTAACAGGCATG
GCAGATGTTAAAGAGTAGGGGTAACA
CACGGGCATGTCTAGATATTGGAGATAAAGTTAGAGTGTTCAGAAAACAAGGTA
GATAGTGTACAAAGTTGACATTGGGC
AGTATCGAATCAAAGTCTTTTCTTTGGATTGTATGTTCTTGACTCTAAGACTCAAA
GGCACATGACACACATATCATGAC
ACATGGCATACCTGCCCAATGGGGCCTACCTGAAGGCAGTTTCTCTGCTTACAA
TCCCCCTGGTTACATAATGATGCAAG
GAGATTCCATACTTATTCCAAAGGTTTGCCGTGACAATAAAATAGTATATCATAAA
TTATTGTTCTTTGGAAGAATTGAA
AATGGCATACATACGAAAGTTGGTATTATTAATAGAATGCCAATAAACCCTTAAAA
AATTCATTTTATAGCCTTGTATAC
TTCCTGGAAATGTTATCTTTCATTACTTCCTATACAGGATCTTCAGCCATGAACC
ATGTTGCTTCTTGGATTTGATACAC
AGTAAATACAGGATAAAACTGCAAATACCTGCATATCCCCCAAGTTCATTTCAGT
GCCAGTAAATGTCATCAGGGTGGCT
TGTAGGGAGAGGCTGACTACTCACTGCTAATGTCCAACACTTGAATTTTTAGATA
GAAACCACGGTGCACTGTGGAATGT
AGCACTGCTCTCTGAGTATGGCATGGGCTGTTTGATGCCCGTACAGTGTCATGT
GAATCTTAAGATCAGCTCCCTGAGCT
GGACAGTGCCTGGTGGGGTCAGAGTGGCCATCACTTCCTGCTTTGAGTACACT
TAAAATTGACAGGTTTAATGAATGCAA
ATAATGATAGAGAGTTATCTAGATTAAAATTACTATGTACTACCCTTCTGAAATTA
ACCTATAGTGCAGGGCTGTCCAAT
CTTTTGACTTCCCTGGGCCACACTGGAAGAAGAAGAATTGTTTGAGGCCACACG
TAAAATACACTAACACTAATGTTAGC
TGATGAGCTAAAAAAAAAAAAAATCATCACTTAGGCCGGGCGCAGTGGCTCATG
CCTGTAATCCCAGCACTTTGGGAGGC
TGAGGCAGGCGGATCATGAGGTCAGATCAAGACCATCCTGGCCAACATGGTGA
AACTCCGTCTCTACTAAAAATACAAAA
AAAAATTAGCTGGTCGTGTTGGTGCGCGCCTGTAGTCCCAGCTACTCGGGAGG CTGAGGCAGGAGAATTGCTTGAACCTG
GGACATGGAGGCTGCAGTGAGCTGAGATCACGCCAATGTACTCCAGCCTGGGT GACAGAGCGAGACTCTGTCTCAAAAAA
AAAAAATCATCACTTAAAACATAGTGTTATAAGAAAAGTTTATGAATTTGTGTTGG
GCTGCATTCAAAGCCGTACTGGGC
TGCACGCAGCCTGCAGGCTGCAGGTTGGGCAAGCTTGCTATAGTGTATCTATTT
TCTGATATTTGTGACAGTGCACTTAG
ATGCTCTTAATGGATTGTGGCGTAACATAACCAGTGGCTACTTGATTCACTGAAT
TTTTTGCTCCTGAAATAGACATAAA
AAGTTGAATTTCCTTTATTTTATATGTCTAAGTAACCCTAGATTTTAAATTTTAAAA
CAGTGTGTAAAGCACCATCAACATTTGGGGCTCTTTTTATAAAATAAGTCCCTGA
AGTTTGCACACAGAATAAGTCATGC
ACACTAAAATAAGCAGCTGTTGCCCAAGTCTCTCTCTGTGTCTCTTGCTTTCAGC
CTCTCCCAATGCCTGCTTCCTCTTA
ATGAGACACTCTCTTTCATGTAACCTGAAGTATACATGTTCTTCACCTGTCATATT
CTTATTTTAGGATCCACCCACTTC
AGTTTCCCTTGGACTGCGAATGGAAGAAATGATTTTCAACTTGGCAGACACACAT
GTTTTAATTTTTAATACATATTTTTGCATGTCTGAAAATAGGTGATTGGTTTTAAAA
TAAAGATAGGATATTA I M I M A
ATGTATAATTGGAATAGTGAGACAGAGGTTTGTTGGTAATATGGCAATGTGTTTT
GCCTTTATGGGATTTTTATTTTGCT
TTACTTTGGCTGTTGATTTTCAGAGTTAGATCCTACATGACCAGTAGTAAAGATT
AAAATTTGATTTCAAGAATTTTAAA
CTTTGTTGATTATCAGAAGCCACATATTTTTCAGTTACATAATCACCACTAAGATA
TGTCCAACATAATTTGTAACACTT
GCATATGTTATGTCAAATTCTTTTTTTTTTTGAGACTGTTATGTCAAACTCTTACT
GTATTAGGACCGATGCTTTACTTT
CTGGCTGATCCAATTGGTAGGAAGATGTTTTAGTCTGGCCTCTGCTGTTTGGGT
TGCCAGGACTGCTTATGTCACACTGC
TGTTAATTTTATTGGTCCATTATGATTTCAGGGCAGACTCTTGTAAGATGGCTAG
CTGAATTGTTACAGGCACTGACAAT
CACAATGGTCTTGATTAGCAGCATTCTTAAGAGTAGAGAATATTGTTTTTCATATC
TCCATTATGAGAAAATATGACTAT
AACTACATTTCGTTCTCTTCTGAATAGAGACAGTTGTTTATTTAAATTCCGCTGTT
AAGCAAGCTTGTGGTAGGGCTGCC
AACAGGCATTAATCATCAAGTTGTTGTTTTGTTCGATTTATAAATGTATGTTATTT
CAGACAGGAGAAGAATCTGAAAGA
TTCAAATAGAACTTTTACTGATCAAAGTTAATTGGCAGCAATTGGCCTTTAAATAT
TTCCTGATGTTTCCCTTTTCAAAA
TCAAGCAAAATGCACCTTATTTAATAGATGAGCCAAAATCATTTGAAAAATGCCA
GTGTGTGACAGAACTTAAATGCGGT
TTGTTCAACATCTGTCCTGTATTTTGATTTCTTTTCATCTGCAATGAAACTACTCA AAGACGAATTGGCAACTTCCAGTG
CAATAAAGTTATTAAATATGCTTTATTTTTTCTCTTCCTTAGTGAAAAATTTATTTT ACACTTAAGGGTAAAATACAAAA
TTTTTTTGC TTTAAAGTGTTTTCCTCAAGAAGAAATTACATGTAAGGCATTACT ATTTTTTTAAGGTAGAAATATTCC
CATTGAGTGACAAGTTTAGAATTAATTGCCTTTAGAAACACATAATTATCATACTT ATTTTCAAAAGCTGTTTCTGATTG
ATAATTGTTTTCACTTAGAGAGCAGCACAAAAAATTCCACATTTTACTGGTATGTT
AAAAAATGCCTTTCCTAAGTTTCC
ATTTGTTTCTAAAGATACAGAGTAGACTGATTTTTTGTCCTGGGTACAATCTTGTT
ACCTTTATTGATGTTAATAGGTGT
TAATAACAAACATCAAGAAGAAATTTGTAGTAATTTCCTATTGTTTTAAAACTCTG
TACCTTCTTAAAAATGCCAAAACT
TGGCTGGGCGCAGTGGCTCATGCCTGTAATCCCAGCGCTTTGGGAGGCAGAG
GCAGCTGGATCACCTGAGGTTGGGAGTT
CGAGACCATCCCGACCAACATGGAGAAACCCCCGTCTCTATTAAAAATACAAAAT
TAGCTGGGTGTGGTGGCCAATGCTT
GTAATCCCAGCTACTCAGGGGGCTGAGGCAGGAGAATCACTTGAGCCCGGGAG
GCAGAGGTTACGATGAGCCAGAGATCA
CGCCATTGCACTCCAGCCTGGGCAACAAGAGCAAAACTCTGTCTCAAAAAAAAA
AAAAAAAAAAAGTGCCAAAACTTGCT
TATACAGTGCCAAAATTAGGTGGTGAAAAGTGAGATGTTAGGCAGTCATAACTT
GTTATAATATAGACCAAAAAACTTCC
AAACTCAATTTCAATGTAGTAT TCAGATTTTTGTTTATTGACGTTATTTACACT
GATTATATAGAGACTGAAATACTA
CTCAAATATATTAAAAGCTCTAGGTTTTTAGGGTATAGGAGATGTATTTTTTTCTG
TAAACCCAAATTTCAAAAGAAAAA
GAAAAAGCAGTCTATGAAGTTACTGAAATATCAGATTCTATTTTTGGCATGAGGA
CTATTCCTTAGTAGAGGTAAAGTAG
AAAAGAGAATGAAGGTATAATTCAACCAAATATATAAAACAATCATGGGAAAATG
TTTAAGCTTGTAGCCTTACAAAAGT
TGATTTTATGTTAGGGATATTAACAAAAGTATAAATGAAAGATTCATTTAGTCTGT
CTTTGGAAATTGTTTACTTGCCCT
AGATTTGTACAATATTTTAAAAAATAAATTGGTTTTTTACTTATTTCAATATTTTTTA
TTTTATTTTA M i l l GAGATAG
GGTCTTGCTCTGTCATCCAGGCTGGAGTGCAGTGGCACCATCTCAGCTCACTG
CACCCTCTGCCTCCCAGGTTCAAGCAA
TTCTGCCTCAGCATCCCAAGTAGCTGGGATTACAGATACACACCACCATGCGTG
GCTAATTTTTGTACTTTTAGTAGAAA
CGTGTTCTTCATGTTGGCCAGGCTAGTCTCAAACTCCTGACCTCAGGTGAGCCA
CCTGCCTCGGCCACCCACCCAAGGCA
CTGGGATTACAGGCATGAGTCACTGCTCCCAGGCCAATCTTTTATTTTATATTTT
TGAAATTCAACAAACCTGGGACTGG
TCATTGCAAATATGTTTTCTTCTAAATAAAAAGTTAAATTATTTTTTGGCAAAAAAA
AAAAAATGTCTATTGGCATAATG
CTCCTCAGAACACCAGAATTATCTCATCAGGGAGAAGTGATGTGGTAATGCCTG
ATTTCACACTCCAACTAGCAATGGAT
AGTGGTAGGGGACATGAATATTCTAATATATTTAAGTTACATGTTTTTAGCTTTGT AAGGTCTATAGCATAAGTAATTTC
AAGGAGCTTAATAGTCTCTTGGTTAGCTTGAAGTATGATGCCTGAACTCTAATTG TTGACTTTTGGTCTTATTTTCTTCT
AAATCTTAGCTTTTGCCAGCTATCTCTTGGTATAATTTGGTTAGTTCACTCCTTGC TTAATAAGACATTCTGACTAGTGA
AATACACCCCCTCTGTGAGGGGCTTTCAGTAACATTTGTCATATTATGAATCAGT GTACATGCTTAAAATGGTTATTAGG
ATATGATTTAGAAAATTTAAGGTATAATTTTTGAAGTACTTTAGAAATATCTTAATC
GAACTCTAAGAACTAAGTTAATT
CATTTTAGTTTCATAAGTCTTGTTTGTAATCTCACTTAGTGACTTTTTTAATTTAAA
GAAAATCACAATTTATATCAATT
CTGTTGACACACCAGGAAAAACAAAAAGGAGAAAAAAATGTTTTCTGTACCATAA
AAAATGTTTGACGTATGAAGGAACA
TGAGTGACAGGGAGACATAATTATCTTAGGAAATAATATTTTCCCCCCAAGAGAA
GGGTTTCGACTTTAGTTCCGTAGTG
TACGTCTGGCTTTCCTATTTCTAAGTCTGACTAACGTCCTGCATATCATTTTACAA
AAAAACTCTCCATGTGCAATACGA
TTCAGCTTCTTTGGTAAAGTATGAC
TCTTTCTGACTGGTTGGAGAAAGAGTTCTTTCTGTTAGCTTTTTATTGAAAGGCA
ACCTTCAAACGGAGTGAGGAAAGGC
CTTCAGCATACGATAATATTTCAGTTTAACCTGCTGCTGGTCAGTTCTCCTGCTG
GATTTTTCCACAGTTAGAAGAAAAT
ATTTTATGAAGATGGATACAGGTCTAAATAAAAATAATGCTTTAAGTTCATGTTGA
ATTTTCTTCTTCTGTCTGGTTAAG
GAATATGATTTCACTGGCATTATGGTTGCCATCCTAACCCATGTAGACAGAGTGA
CCTTTAAGAGTCAGCAGGAACCAAA
TCTCTGGTCTCGGGTCACCCGTTACCCCTGAAGAGTAAGGATTGGGAAGAGTG
GGAACCCTCCTCTCGTGTCTGTGAGAT
GGTGGCACATGAAACTATTGCACATACCACGCTCACCCCGTTAGTAGCAAAGCT
GTTCTAAACTCCCCTCTCCACTCTTT
CAGTGCATTGTTTAGTTGGAGCAGACCAGAGCATTCTTTGCATCAAGATTTACAA
CATGAATATTCCTTGGGGCCTTCAA
TATGTAGACTCCTAGCCCGGGTCTGATATTTGAACACCTAGACTTGTTGCCCCTT
GTTTTCACCCTTTCTATTAAAAAAA
AAAAAAAAGGCCTGGGGCAGAAAGTTCTGGTCCCACTAGAGTATGAGTTTGATG
AGTCCAGGGCCAAGGAAAGGGGCTTG
TGGATGGTCTTTAGGTAATAGCATTATGTAAAAGGAGAAAGCTGCAATGATTTTT
CTTGTAAAGCAAAGAGAAGATTATC
CCTTGGTTTAATTCAGTCTTTAAAGTTGCAATAATATTTTCTTACTTTAAAACTTAG
GTTATGATCTATAAATTATTTTA
ATATGTGTTATTTATGTTTATTTTTAGCTAGGAAAAAATTCTTTCTGTAAAATATCA
GTTACGCCAATGTTAAGAAACTT
CGATTTCAAGTAAAGCACTTGGTACTGACATACTGTTTGTGCTTCCAAAAAGTGT
GAAATATTTAATCCGATTCAGAAGT
TGATTTCTCAAGTTGGGAAAATATAACTCACCTAGAAAGTCCTTTATTCTCTATAA
ATAAAGAAGAAACTCTGATGCTCT
ATGTTCAGCTTTTGTGTGGATGCTAGAGCTCTGGTTCTCTGCTGCCAAAGGATC TCACAGGTGCTTGCAAGGAAGACACT
GTAGTTCTTTGCTGCTTCCAAACCCCTCCACAAGCCAAGAGTAGCCATATGAAA GGGTGTGAGCATATGGATAATTACAG
TCCAAACCTAAGGGTCCCTATGTATATGTACACACATATGGAATGCCACTTGTCA TTTTCAACCCTGTACACAGACTGGT
ATACTTGGTGATTTTCTCCTGACAAATTCAGCAAACATAAAGGACCACATTTTTTA AATCCTAGTAAACATCCCTTATAA
CACTCAGAATGTTGCCTTATATATGTTGACCATGAAATGGATATCCTTTAGACTG
AATTGTTATGTGTTGTCTTAGACTA
GAATACCGATTTTTGTTTCCTTCTTCATAGAAGATAATAGAGTTGACTTTGCCTTC
CTGTGTTTTCAGGTAGGTTGTTCA
GCCTTGTGCATTTCCTTAGTTGACAAGAGTAACATCAAAACCCTCAGTTTAGGGT
GGGGAGGACAAAATAAGTTTTTTAA
GCAAGAAATGATTATTAGTCATAATAAATGCGTATTGAGTTTAGCTGCATTGTAA
CTAATGGCCAAAAAGAAACAATGCC
CATATTTTATTATTACTAATGAATAATTTTTATTGCTCTTCTTGTTTCTTCATCATTT
CTTATAAGATTGTCATCTTTAT
TTTATCTGAAGTTAAAAATAGAAATATAAAAGTATATTTAAAGATTTCCTTAACTCT
GATTTTCATCAGAACAGTCTTAA
AATATAGCTATTTAACACAGTAAAAGATAATATATTATAAATAAATTTATACTCTGA
ATACTTTATATTCTACTGATAGC
AAACATTGTACTTAGCAGTAAAATACTGGAAGCAAACTGTCATTACAAGTAAAAA
CAAGATAAGAATGCCTTCTATCAGC
ACTATTATTCAGCAGTATTCTAGGAATTTTGGACAGTACAATAAGACACAAGCTT
TACTTAATGAAATGAAAGAGATTCA
AATTATTTTTGTAGATAATTTGATATGTAGAAGACTGAAGAAAATCAACTAGAATA
TTCTTAGAATTTACAACAGCCTTC
AGAAAAATGAAGAAATATAGAATTCAAAAATCAAATTGATTTGAAATTTTAATTATA
ATTTAAAATAGCGGTAGTTACTA
AAACACAAAAACAAAAAATATCATAAAAGGAAATTTAGGAAGAAACCTTATAAGAA
GTATTTGTGAAGAATATATGAATA
AACCTGTGACATCTTACTGAGAGATATAAAGGAAGGTTTCAATAAATGAGATTCT
TTCTTTCTCATTTTTGTAATATGAA
AAAATTGTTCTAAGTTCTATTTGAAAGGATAAATCAGCAAAAATACCTAGCAATTT
TTGACAAAACATGGGAAATGAATG
TCTGACTGAACCAATTATTAAAATGTTTTATAATTCTACAATAGCGTGAAGAATTA
TGCGGCCTAATGTTGACATTGGAC
TAAGTATGAGAGAAGAAACAAACTGCTCAGGCACAGACCCAACTTAAAGAAAAAT
TGTGTATGTTTTAAATAAAACCAAG
CAAATTAATAGTGAAGTAATAGATTATTCAGCTTCTATGCTAAAATAGGTAAACAG
TTAAAAAATGAAAGAATGAAACAC
AAGGAGAATATATAGGTAAATGAAAAGACTAAATAAGTCAAAATCTTCAGATGGC
CATTTGGAAATAAGTATTGAGATCC
TATAAAAGGTTAATACCATTTGTACCAATTATACATCTAGGACTTTTTACTAATCA
GAGATAAAGGCAAGAATTTATAAC
AATAAACATTTCTAACTTATGAAATCCATCATATTTTGTTTCTTCATCTGGGTGTT
AGTTATACCCATGCATTTAGTTTA
TAAAAATTCATTGAGCTGTAATATTTGTTTAATAAAAAGTTAATTTTTAAAAACTAA TATTTGAGGAATATACACACATA
TTGTTGGCTCACAATTCTATTATAAAATGTATACCCTATGTATAATATATTCACTG CTAATTTCTCTAGCACAGGACCTG
GCCCATAGGAGGACCCAGTAAATATATATTGCATAAATAATGGAAATATATGTTG ACTGTCTTATGGAAATATAAATCGT
AGAAGTTTAACATTGATTGTTTATGGAATTTGTGGTTTTATTATTTTTTTATTTATG CTTTTCTGTATGTCCCAAATTCT
GTAAAGCATATGCTTATTTCTTCTGTAGTCAGAGACAAATACTATTTTTAAAAGCC
ACAGATAATATTAAAGACACAACT
GAGCCAAATATAACCAATCAAAGAACTAAGATTACTGCAATTCATTAAGAAATAG
GTAAAGAAATAGCTCCACAGAAGAA
ATGAGCAAAGACCATGAACAGATAATTCACAAGAGAAAAGCATTCAAATTATCAA
GCTACAGACGAAAAATGTTGGATTT
CACTTXTAATAATCAATCAGGGCGGATCACGAGGTCAGGAGTTTGAGACCAGCX
TGGCCATCATGGTGAGACCCCATCTT
TACTAAAAATACCAAAATTAGTTGGGCATGGTAGTGTGAGCCTGTAATCCCAGTT
ATTTTGGAGGCTGAGGCAGGAGAAT
CCCTTGAACCCGGGAGGCGGAGGTTGCAGTAAGCCGAGATCATGCCACTGCAC
TCCAGCCTGGTGACGGAGTGAGACTCC
ATCTCAAAAAAAAAAAAGAAAAGAAATGTCAAGTCAAACAGCCTCTTGCACATGA
AATAGGCAAAGGTTTAAAATAAATA
ACATGATCTACAATGCAGTAAGATGAAAACTTGTGTACATTGCTGGTAAGGATAT
AAAGGGATACAAGTTTTCAGAAAAG
CAGTTTCATAATATGTATTCAGAACTTTGATGTGCTGATCCTTTTACTCAAAATTC
CACTTTTAGCATCACTTCTAAGAA
AATAGAATTACAGATAATTAAGATAATAGCAATAAGCATTTGTCAAACACTTGTCC
TATATCTTTTACATACATTATTCC
ATTTAACAGCATGATATCTCTACTATAGTAACACTATTAATATCTTAATTTTATGGA
GGAGCCAGGTGAAGCTTAGGATA
GGTAAAAGAAAACTTGTCTAAGGTCACACAACTGTAGAGTGGCAGATTCTAGCT
CAGGTCTGACACTTTCCAGAAATCCA
CTCAACCCCATAATATACCCACCCCCCATGGTTTTTTTACCACTACAATAGAGAA
ATATTTTAAATATCCAATAACAGAT
TAAAAATGAAATAAATTAAAATACATCCTCATAAAACTTGTTTTAAGACAATTTATT
GTATATAGGGAAGGTATTCATGG
TCGATTTCAAGTTTTAAAAATTAGCTCTAAGATACAATCTAGCCTATTCAACATGT
CATACCTCTTCAACATGTATGGCA
TATTTAAAAAGTGTAAAATTCGGCCAGGTGCTGTGGCTCATGCCTTAATCTCAGC
ACTTAGGGAGTCTGGGGCAGGCAGA
TCACTTGGGATCGGGAGTTTGAAACCCGTCTGACCAACACGGTGAAACCCCATC
TCTACTAAAAATACAAAAATTAGCTG
GGCGTGATGGCGTGTACCTGTAATCCCAGCTACTCGGGATGCTGAGGCAGGAG
AATCAAACCCGGAGGTGGAGGTTGCAG
TGAGCCAAGATTGTGCCACTGCACTCCAGCCTGAGTGACAGAGTGAGACTATTT
CAAAAAAATAAAATAAAGAGTAAAAT
TTGTATTCAACACGTATGGCATACTAAAGAAAAACTGGAAGGGAATACATTTTTT
AAAAGCCACGACGATATATTTTCAA
GTTTAAAAGTGATCAGTCTACAAAATGAATGATAGTATATAATCCTGATTTAATGT CTATGACATATTAAAGGTAATATT
TACAAGTTGCCAGTGGTTATTTCTGGGTGTTGGGGCTATAAATGCTATATTTCTG TGATGTTTCAGAACCGACATGCTAT
CGCATCCAGTAGTCTTTTTAGACTTTGTCATTTCTTTCCGTCATTCAATATTCAAT ACCTATTTATTGAATAACCATTAA
GTACCTAGGCACAGTTCTAACTACTGGGGATACAGCAATGACTTTATTAAGGAC ATTCTCCTTATGGAGTTAGTGCTGTG
GAAGAGAAGTAGGAATAAACAGTATGTAAACAAAACACTTGCAGGCAGAGATAA
GTGCTATGAAGCAAATCAAGCAGTAT
TGGCTAAGGGATGAATTGGAGTGGGGCATTGTTAGCTGAGATGATCAGGAAGA
GATGTCTCTGAGGAGGTAACATTTGTC
AAGAAGGGAGAAGCTGGGAGGGAAGCAACTGCAAGGATATGGTAGGAAGTCAT
TTTGGGAGGACAATAGTAGGAATGAGT
TTCACTCCTTCAGGAAGTGAAAGATGCCCAGTGTGGCTGAGACAGAGTAACTAG
GGAAGAGCGGCAGGGTCCAGATGATG
AAGGGTCTTGTAGACCATGGTAAAGAATATTTTTTTTCTAATTGCAGTGCGAAGC
CATTGGAGAGTTTTAAGCATGGACA
TGATAGAATAGTGTTTACACTTTTAAAACATCAGTCTGGTTATCAAATGGAATCG
GGGCAAGAGTGAAACCAATGAGACC
AATTAAGAGACTGTCTCAGCCAGCCTAGGAGGACATTAGCTTAGGGCTGAGGAT
ATAGCAGTCTGGATGTGGTGAGAAGT
CACAGGACTTACAGTATCCTTTTATGGAAATATGGGCAGAATTTGCTAATGGATC
AGACGTAGGTGGGGAGGGAAAGCAG
AATCAAGGATGAGCACTAGGTTTTGCCTTGAGTACTGGAAAAGTGATCACACTTT
TGAAGAAGAGTGTGTTTGGGGGGTG
GCTGGGGAGGAGTTGGATTCAAGCATTCTGTATTGAGTGTATTAAGTATGAAAT
ACATATTCAACATCCAAGTCAAGAGA
TGACATATAAGCAGGTAGACATACAGGTTTGGATCTCAGAAGAGAGAGGCAAAT
TTGAAGGTATAGTTTTGGGAGTCCCG
ACGCCATAAAAATAGATGGCACAGGGGAAGGAATGAAAAATTGAGAAGGGGGC
CTAAGATTGAGCATTTTGGTACCAACA
ATTAGAAATAGGGCCAAGAAGAAGAGCTTTGAAGAGAGAATTGTATGGTAGAGA
GAATTCCACAAGGTAGGTGGAAAAGT
ATGAGCTATCACAGAAACCTGGAGAATAGTGTTTCAAAAAGGAGGGAGTGGTAT
ACTCCTGTGTCAAATGTGGCCAAAGT
CAAGTAAGATGAGGACAGAGAAGTGAGCGACTGTTAGATTGGGCAAGAATGTTG
GCTGTTAATGACCCTGATAAGCACCA
TGTCAATGAAGTGGAAGACATCAGCCAGATGGGAATGGGTTGAGAAGAGGATG
AGGGCTGACAGTGAAGACAGAAAGGAT
CAAGAATTCCCACAGAGACAGACACAAAGAGATGGGCTATGGTGTGTGGAAGG
CATCATGGAATCAAGGGAAGGTGGTTT
CATTTTTAAATAAAAGATATTTAGGCATGTCTGGATGTCAGCAGGAATGATTCCA
TAGGAAGGAAAATACATGTGGTATG
GCAAGGAGAATGACTACAGGAAAGAAGTCCTTGGATAGCAAGACAGAATGAAAT
CCGGAGCACAGGTTACATGGTGGGTC
TTAAATAAGAAGGCAGGTAAAGTCTGCAGGTACAGTGCAGTCCAGGAGTGGGG
ATTTGGTCATGATCAAAGGAGCTGGTT
CTTGCTATAAAAATATTTACTGCACTTTATTAAGATTTTCTGCATACCTCGGTCTA
CCCCCCACTGGGAAATCCCTTGAG
GCCAGGAATCAGGATTTGTCCACTTAAATCCTGAGCTCCTGACACAAGATTTGG TACATGGAAGGCTCTCATATGTTTTC
TTTAATTACTTATTTTCAAATTTTTCTAATAGTAGTCAATGATGAGCATTACCTTTA TTATTTAGAAAAGTTAACTTTTA
AAAGTAAATGAAGATTAATCTCAATCTTAACAATTAAAACACCCAAGATATTAGAC TTTCATATTACGTTAAGTCAGATT
TTCAAAATCCTTAATGTCAAGACACAATATACATAGAAGAGGTATGTTATAAAATG
ATATGCAGCGTGGTTCCAGTTTGG
GGATTAAAAAACAAAGGTATGTATATCAAAAGAAAACAAAGGCTTTCTTTTTTTTC
TTTTTTTTTTTTTTTTGAGACGGA
GTCTCACTCTGTCGCCCAGGATGGAGTGCAGTGGCACAATCTCGGCTCACTGC
AAGCTCTGCCTCCCAGGTTCATGCCAT
TCTCCTGCCTCAGCCTCCTGAGTAGCTGGGAATACAGGCGCCCACCACCACGC
CCAACTAATTTTTTGTGTTTTTAGTAG
AGACGGGGTTTCACTGTGTTAGCCAGGATGGTCTCGATCTCCTGACCTCGTGAT
CTGCCCGCCTTGGCCTCCCAAAGTGC
TGGGATTACAGGCATGAGCCACCGCACCCGGCAGAAAACAAAGGCTTTCTATAT
TTAAAATGATACAGAAATAATTTCAA
CAATAAACACTGATTTATCTCTAGATAGTAGGATAAGAGAAATTTTCTGTTTTTAC
TTTCTATATGTAATGTTTCTTATA
ATTAGGAAAATGCTGAATGTAATCATGGCTCTCTTATTCCAGCCTGACTCAAAGC
CCAGTGTTGTTTATGTGGTTTGTTA
GCTTGCAGTTCTAGAGACACCAGAGGTGGAACTGGTGAATTCATTCATTCGTGA
GGGCCCCCTTTATGCCTCAATTTCCC
ATCTGTAAAACTGGATGATGATATACCTCATCGGGTCAGTGTGAGAATTGCGTG
AGTTAAAATCTGTTAAGTGCATACAA
CTAGCACTGGCTCGCAGGAAGCACTAAGGAAGTCTATGCTGCTGTTATTGTCCA
TTGAGTACTTCTACATGCAGAATGCT
GGGTCAAGCCCTGTAAGGCTGTAGGGACGAAGGGAAAGTGACTTAAGGAGGGA
CGAGGGAATGAAGGGCATGAGAAAATA
AGTGGTTTGCTGCCAAATTCTCTTGTAGAATTCCAACAGAAATCACACCAGACAC
ACCAAAAAAAAAAAAGAAAAGAAAA
AAGAAAAAACATCTGTCAGTGGCAAATGCCCCTGCTCCAGGTAGATCACAGAAT
ACTTCTTTGGGTCCATCCTTATTCAT
GGGCTCCTCCCACCCCCAGTTGCATGCCGGGAGGTGTGCTGTTGGCAGTTAAG
TAGCTACTGTATCTGGTTCCCGTCACA
GTGTGCTGTCCCTCATAGAAAATGAGGATGCTGGCTTCGTGTAAAAAGACACGT
TTCCTTGGTTGTGCTGATAAATGTCC
TTTTACTCGTTGTCTTCCTACCTCATGTTTTGAAGGATGGCTCATGGAACTTGCT
GGTGGTGTCACGGCATGCAGCGGAG
CCTGGCTGGGTGTGAAGTCACTGTGGAGCAGAGCCTCCCCTTCTCAGGCCGCT
TGGACTGACAGTGCAGGCTGCTGGGCG
TCCTTAGTGAAGTAGGAGGCAGGGGGCAGCAGTCCTTTGGGAAACTTTTCTCAT
GTCAGTACAAAAGCACTGACTCCATC
AACACTGTGAATTCAATCTTATTATCTGCAGTTTACACATGTGGGAAATTGAAGG
GAGGTGGTTGTTGCCCAAAGCCACT
TGGTTAGGTGATGGTAAAGCTGAGGTTGAAACTCAGGTTTTCTTTTGCAAACATC ATTGCACAAAACCTTTCTAGCCAAG
TCAGAGAACCTGGCTATTCCTTTTTAGCAAAGGAAAGGAAGTCAATAACACCTTG CTGCTGTCACACAAGGATGCTATGG
AGCGGGGCCTAATAACTGGATGGGGCTTTCTTGACTGTATTCTTTGAAAGCTTT CTTTCTCACTGTGTCTTACAAACCTG
CAAGCCATAGTTGCCCAGGCAACTTAAAATCTATTTTGGAAATGTTGAGAGGTAG GTCAATAGATAGAGAGTGGACAGAA
AGACATGTTGTTGTAATGAACAACGATAAAATAATATTGGAGCAGTCCCTACACC
ACAGACTAAAAAGAGCTACTTTGCG
CTGAATGTCCTCATGCTCACTCTATTCTGGACACTAGCTAGGCACTGAGGATAA
AATATGAAACACATTCTTTGCTGTCA
AGGAGCCCAGATACTGGGTAGAGACAGAAAAGTGAAAGAATAATTCCATCACAG
CAATGATGGAGATGTCTGTGGTCATT
AGCTAGAGTCTGAGAGATGGGATGGCTATCCCAGATTGGGGAGAGGTGGCAAG
GCAGAGTGTTGGAGCAGTGACTTCCCC
AGGAAGAAACATTGAGCTTGAGAAAGCATGACATTTAGGAACAGAGCCTGTGTA
GGTATAAAGCACAGAACACACAGAGA
GTGCTTGTGAGATGAGAAAATGGAAAGAAAGACAGAAGCCATGTCATCAAAGGT
TTTGCACACTTTGCTAAGGGAATTGG
ATTTTACAATGGATGCCGAGGGTAACCAGTAAAGAAATTATAGCACTACCAAGGA
GATGAGTTGTATTGTAAAGTCAAAG
GGGTCAATAATGATGAGAAATCAGTGGCATTTGTTTGTGGAGACAGAGAACAGG
GGCTGATGTGAAAGCTACTGAGAAAG
TAGTTTGGGTAGGATTTGCAGATATATTCAGTGTAACTATTCAGTGAGCCAGTCT
TTTAGTTTTTGGCTTGAGCAATAAG
GTAGAGATAGAAAACCCAGGAGAATGGTCTGGCTTTGGAAGGAAGATGCTGAAT
TCAGATTTGAGATATTGAACCTGAAG
TGTGTGTTGGACACACAAGTGGAGTCCAGTGAGCAGGTGTATGCACAGATCTG
CAGCTCAGAAAACAGAAGCTGGAAGCC
ATCATTTCTGACATCAGTTAAGATTTCCCAGGAAGAATATATTGAATGAAAAAGA
AGAGTTATGTCTTAGTTTAGAAACA
CAGAATCCAAATATGTATTGAAAATTCCAGATTCAGAAAATTAAATGATTTATGTG
AGGCAACTGTGGCCATTTGGAGCA
GCAGTTATTCCAACAGGCTGTTAATTAACCTGTTCATGAAACTGCTTTCCTAGTT
TCGCCATAGTTTCTCTATTCTGTTG
GGTGAAGCTACCAGGAACAGTGTGGTCTGAAGGGACAGTAAACACTTCAGAGG
TGGAGAGACCTGGGTTCAGTCCTAGCT
TCCTGCTAATGTATGGCCTCCACACAGTCAACCCTGACTGGCTTCAGAGAGGGT
GACAGGTAGGTGTGGAGTTAAGGGAG
ACAACAGAATCTGGAGCTCCTGGCCGGTGCCTGCAGCATAGTGGTGCCTCAGC
GAATAACAGAGTGTCTTCCTTTCTTGT
TGAATGGAAATACCACTTCAGCACTCACACAGTCTGCTGGATTGCTTCAGTTTAT
TTTAAAGTCGTGTGAAATGGACCTG
TCAGTGTCACCTATGACAGGCTTTTGGTTTAGATCTTGGATTTTGACAGGTGCTT
GCTAATCTTCAATCTGGGAGAGAGT
CACTGCTTTGTAGAGCCAGATTTTCCTCAGTTACCTTCATAGGAAGTTTGCTTAG
TAAAGAAAGGAATTCAAGGTGCCCT
TAGACCTTTTTTTTTTATTTTTTAAGGTTGAATACTGCAAGAGAAGGCCTGTATGC
CTGTAACTTCAACTTCCCCAAAAT
GTGTTTAAGTTGCAGTCCCTGGAGACTTCTCCCATTTCAGTGCAGCTACAATAG GGCCTCTATTTGGGCTCAGTTCAGTA
TTCTAGCAGCATTTCTAAAGAAAGGCTTTCTAAATGCTGGATCCTTCTCACTGCT AATTGTGTGCTACTCCTTTGTAAGG
CAGGTGGACGGTCGGGGTATGGGGAGAGCATGTTGCTTTTGATCAACATTATTT TATAATCTTTTGCCTCTTCAGAAATA
ATCAACTAAGAATTAACTTCTGGAATTGGTATTTTGTCAGAAAGTTAAAGGGAAC TCTTCTGTCCAAGATATCATCCTTT
TCAATTGGTAGGAAACAGCTTGGACTAGAAGAATGTTACAGTGGTTACTTCTGA GGGATGGAATTGGAGATGAATTCTTT
CAAAATTATTTATTTCCACAAAAC
TAACTTCAATGACAAAATGCACTAACATTATCTATTAACTGATTGGGTCATCAATT
CTTTCATAATCAAATCCTGTCAAT
CAGATAACCTAGAAGGTTTCAGTCCATGATATCTATGTGTTTGGTCCAGCGTTAG
AAAATAAGTAGGAAAAAAAGACACG
TGAAAGATGGTCTTTTCTTCAAAGAGTTTCTTATCTGGTAGTGTACACAGAAGTT
GTCTGTGTAAGTCTTTGCTTATTCA
TCAAATGCAGAGTTTTCAGGAGAATTGGAAAGAACGTGAAATACACATGACACAA
AGACTGTAGTCTGTCAATAATGGTA
GTTATTATTGAGACTGTGGGTTGCAGAGAGTATGTCAGGGATGGAGAAGGGAAA
GAAACCAATAAAAACTTAATGGAGGA
GCACTGCCTTGAAGGAGAGAGAAGGCATTCCAGCAAGCAGTAAACACAGCTATT
GTCTGGTGCAGGGAGGCAGGCTGTTC
ATGATAGAATGTGAGGGATGGGCACATTGGTAAGGTTAGGGCTCATTATAGAGG
CCCTTGAATGCTCAATTGCATAGAAA
TATATTAAGTGTTTCCAATTTAATTTTTCTAATACATCTATTGAGGGAGATGGACT
AAAACAGAAGTAAAACATATTCCC
TGTCTCCCCTCCCGCTAGATGCTTTTGCTCTTTGTTGGAGACTGCCTTGCCTGG
AACAGCAAAACCTACTTTCAGAAAAA
TGTATCTGAACGTGCTGCCCAGGGTGGGTTGGAAGAGGAAGGGCTTTACAAGA
ATTTAGTACCTTCCCTGACAGAAAATC
ATAGTGACAGTCACTGTACCCTTTCTCTTTAGCCAGTCCCCATGTTTAAAACCGT
AAAGGCAAGATCTGCATCTCATTTA
TTTTTGTGTGTGTGGTGCCTGGAATAGACATAGGTGTTCAGTCCTTATTAGTTGA
ATGCATGAGAAAATGAATGAAAGAA
TCGAGAACATGTGGTTATGTATTTTCTAGTGACACTGCTACAGTTTAATATATAAC
ACACATAATACTCATGCTATTCTA
ATGATACATAATATTTTCCTGAATAACATATAAACAACCAAAGCATGATTATTTAT
AGCTCTAAATTATACTTCTTGAGA
AGAGGGATGTTTTTATAGGGAGTGAAATTATGAGTATAGACAAAACTTTCCCTTT
TTTTTTTCCTTTCGGGCAGCCAACT
CCATTTCCCCTTTTAACACCCAATGATTTGGGGGCAGGCACACCAGGGCTCCCC
ATGCTGGAAGAAATTTATCTCATAGT
GCAGTTCACACCAAGGCAATGGAGCTTAAATAGTTCACTCTTACCTGAAGTCTTT
GTCTTACAAAAGAAGATTCAGAAGG
GATGATTTTGTGTGGTGGAATTTCAGACCCACGGATACTTAGAGGAGAGAATTT
TTTGAGTAGGAAAGTGCCCCTCCAAG
GTCCTCTCCTCGCCGCCAGGGTCTGACAGTGAGGGAGGTGATGATGGAACCCT
TCCTGCAGCTCACAGGGAACTGCTTTT
TAAAAAGTGCCTATGTCCGGTTCTGCATATTTTTCTTTCTGGAGTCAGTAATAATT ACATGGCAGAGAAAATCAGCTTGG
AAAAATCTTTATTAACTTATTAACTAAGGATATGATTGGGCCTGATTTATTTTGAG CTGTATGTTCTTTTATGCAAAATT
GAGTGTTTCTTACTATTAGAAAATTATCTAGCAGTATAAGTTAACTGAAGTTCATA
AATTAGCCTCTTTTAAACTTCAAA
ATTTCAACCCCACTCTCACCCCTGCAAAAGAAAAAGACATTTAAAAAGGTTAACT
ACAAGAAAATATTTCTTTACAAAAT
GCATTGACATATAGTTTGTAGTTAAAGATGAATCTATAGACATGTAATGGCATTA
AACAGAAACTAATATTTGTTGCAAA
TGATTTTAGGAGGAGTGGGAAATTGCTCCAGAAAAAAAAATAATGTATGACCTAT
TTAATATTGGAGCTTATAGTTCATT
GTTTTATATGAGCGAAACAAATATTTTTGTGCAAATGAAAGCTCTAGTTAAATGCT
TTGTGAATAAATTGGTATGCGATT
CACTTTCTGCAAATGGCCATGAAAGTTAAAAGAAAGCCGTTGCAGAAAAGGATG
AAAGTTCTCTCGGTCTGAATATTTTT
CCATGCTTCTGGCAGCAAGTTTGCAATAAAAAGAATGATTTTTATGACATTGTGA
TTATTCACAGCAAGCCTAGAGTTGC
TGTCACTATGAAAAATTCTTCTTTAGTAGATTCTCTCAGCAGGAATAATAGATCTA
AAAGAAAGGAAACAGGAGTTACAA
AGAACACTGTTGCCAGTGACAAATTATGTTTGCAAAAATGATGAGCAAGCCCGTT
GAAAATTGAGAAATAAGTTCAGCAG
TAAATCAAAGGCTCATTCTTAAGAGTAGTTCTGAGTTGCTGTTCCAAAAAAAAAT
GATATTTATATTAGATGTGTGCTGA
CCATTTTAAATATGTGTAATATTTTAATGATAAAATTGTACAATATAATTTGGCATT
GTCAGCAAACACTAATGCATTGC
CCTGTGTGGTATATGAAGGTTTGCACTGGTTATGATACAAATTTGTAAAATTCTA
TGTTCTCTTAAATTAACTGAACCAT
AAATAAAGCTGTGCATTTGGACATAACCTTTTGTGTACAATGTCAGCGGTAGGG
ACATAGAATGCATTGAAATACCTCTT
ACTATTGCATAAAACATAACTTAATAAAAGACGGATCTTGTACGAGATAGTATTTA
TTGTATACTTTTAAAAAAGCAAAG
AAATTTACAAGTATTGATATCATTAGATCAAATGTAATATATGCTATTATCTTTTAA
TTATCCTTATATTAAAACTGTTC
AACACTGAGTCACTAAGGTAATGGATTAGTTAATGGATTAATCCATTACCTTTAG
AGACTGAGTGTTGAACAGTTTTAAA
TAAGCTCATTTTGTTTAAAAGTTTATTTTCAATTCTTTCTGGCTTTCATTAAGTGA
GACTTTCTTTTCCTCACCCTAAGG
TTGAGAGTGTGAATAAAGACAAATATATTTAATATCATGTGGTAAAGAAGGCATT
CATGGTGACAAATATATTTAATATC
ATGTGGTAAAGAAGGCATTCATGGTGAGAAAGAAGGAACTCCACGCACAGTGAA
ACTGACATATCATATTTATTAACATT
GTACTCACTAAGAAGGAGCTATCTTCATGTGATACCTTTGGGATTCTGTAAGCCA CACGCTAAAGTGATATAATGATGTC
TTATTTTTTTTAATTAGCACATTTTAAAAGGGATTAATATTATAAGAAAATATCAAA ACTTTGAAAAACATGTGTTCATG
CAATCTTCTAGTTATTAATAAGTTTGGGCATCCTGGAAGCCTTTACTTAGTGATC TCAAATATTTACCATAATTTTATGC
ATATACCAGCACATAGCTTAAGCACTTCAATTTATTTTTTACCTTAACATACCATC TTCAGGAACTTTCAGTAAAGACTT
TGTTCTCTAAATATTTTATGAATATTAATAAATAGAATCAGATCAAAATAGCTAATA
CCTTACCTTTGTTGAAACTTCTG
TAATTCAAATCTTGTGTTTCTAATACAAGCACTTAGGTCTCCTAGCTAATATAAAC
ATAATTCTAGTAAAATCTCTCTCA
CTTAGATAATCTCTAAAGGATATTATAACATTAAGTATATTAAGACACTGTTTTTG
TTTGTAAATGCCATATGTGTGTCC
ACAAGTTGCATCTTATTGCCACCTCTCGAAAATAGATGCTGAAATTCCTCTTTTT
GGTTTCTGAAATTTGAAATTGTTTT
CTTTTCCCATCGGCTTGTATAGTTGTGTCATCCTATTCCTACCATAACACATATA
GATATTTAAATATTGACTTAAAATA
TCGGTTGGCACCAATAGTATGCTGAGTAGATAGTTTAGAGTGGGAATTAAAAGA
TATATTTGTCCAGGTGTGTCTTAATT
CCTTGACTTGCACTTCCCAAAACTGCCAAACAGAATGGGGCAAATTTCAAAACTC
CTTTTGTTTTTTATCCCCTGTGCCT
TAGCCACAGGTATCTGCACCATTTTTTTCTTTTTCTGCAAGGAAGCATCATGGTG
TGAAGGCCAGGGAAATAGGAGGTCA
CCAGGTGTACCCTAGACAGAGATGACAGGGTCTGCCAGCGATTGCCAGTGTGG
AGAGGCTACAAACCACATACCCTCCCC
ATCTCCAGGAGTCAAGGGCTTCCAGAGAGTCCATGATGCTGGTGAAGCTGGGC
AGAAGAAAGCCTAGCAGCGCTTGAGCT
CTTCATGAGGTCTATTTGAAAGCACAGTGAGGGCTTGGACGTGCCCTTAGAAAG
CACAATGGAATCATGCTGTTCTGCAT
CAGTAAGTATAAATACACAGTACATCAGTTGGGCAGTCTCCTCTTTACAGAGAAA
AACACGCAAGTAAAAACATGGTTCC
TGCAAATTTACTAATTCATGGACAGAGGGTCTTTATCACGGAGCCTAGACTTGAT
GTACCAAATCAAATTGTCAGCTAGT
TGGAAGCCAGTTTCTTCAAATCTGTTCCATCAATTGGCTTCTTCCTTGACTACTC
CTCTCCCTCTGTAGAACTTCTAAAT
CATAGAGAAGTTTAGAAGATGTCCCATTTCACTTAGTTTTTCAAATTAGTTATGAA
AATATCTGGAGAGTTGAGAGTTAT
TTCGAAAAACCACCATGGAGAGCATTCCCTTAAGACCTTTGTCACTCAAACACTG
GTCACCCTGTAAGGCCATCTGTGTA
ATTTAGTTTTCTGTGAGAAAGATGGTATCTGGATAGATCCACAGACAGATATGCA
TTCTTTT CAAGTGCTGCTAATGGC
TTGACAAAATTAGCTTTCAAACAATTTATGTGAGTTTGTCACTAAAATACTAAAAC
TACCCATGTCCCATGTATAAATAA
ACATTTTCACATTTAGAAATAAGAGGAATACTTCTTTTAGTGACCTATAAATGTCA
AATATCAGGCAAGTATCTAAATTG
CTCAATAGGGAAGTCAGAGCCATGGATGTAATCACTTAATGACAGATAAGAGGG GACACAGAACCCAATGATCACTTCCT
CTGCCTTGGCCAGCTCCAGCTGGTCTCAATAGGCACATTGATGTAAAGACACAA TGAAAACAGCTAGTTCTCTGCTGACT
TTTATCCTTCCATTTTTGTTTTGTAAAAGGAAGTATTGAAAGTATTGAAACAATTA GTTGGCAGTATGGTGTTAGTTTTC
TCATGGTAAATTTGGAACATGGAGCATACTATAGCCTTGATCAGCAGGATTTGG ACTTGTGTAACTCAACTTCTCCAAAG
ACACATTGT CTTTTTCATATTATTCCCTTTTATAACATGCAAACTGCCTCACAG TCTTCCTAGAGATTTCCTTCTCTG
TTTATTCAAGGACTTCAGGTTTACTTTGAGTTTGTGTTTTTAGTTTCATGTTCAGA
AGTATACAGCATCAGTGTCCTAGC
TATTACGTCACTGTTGGCACAGATAATAAGAATCATTGTAACTGTTTTAATTTTTT
TTATTAATTTTCCAAAAATACCTT
GAATTGTATGGTGTATTTTTACATAATTATGGTACAAGTGAAAGGCTTGGGGGGA
AAGTACAACTACTACAGTATCAAAG
CAGAATAAATATTTGCAGAAAAATTTTGCCATAAATTAATATTTTGCATATGAGCC
TTTGTACAAAATTTGAGTTATTCC
CGGGGATTGGTCCAAAAGAACATGTTTGTGGCCCAGCCCTAGAAGGCCAGAGA
TGATGATGGATCAGAAAAAGGTCAGCT
TGAGTCTTGATGCCAAGAATAATGACAGAAGGGACTAAAAAGGCAGAATAACAG
TGCAACCAGAGTTCCAAGAGCGAATG
TGCATTTGCCGCATGGCTGTGAGATCGTGCCCCATGTGGCTTTCTGCCCACTCC
CTATACCTACACCAGAACTGGGATTC
AGGACTTCATGTGTTGGCAGTTTCCCCAGCCACCCATGTGAGCATTTTATAATC
CAGCTCCATGAGTAGTTAGATGACTG
TTAGCTGAGGGCCAGCAGTAACTGAAGCTAACTTTATTTATTTTACTCCTCTGTG
AAGCACAGGTGGTGTCATATAAGGG
GAAGGCTGGCAAXGGTAAACCATGGTCATTTCAATTTTGGTATTTACTGAAAGAT
GCTTGTAGGATCCTGTTGTGGAAAA
GCAAAAGTTTGCAGGCTGGTAACTGTGGGCACAACACCGCACCTCTGACTGGT
GCCATCTGCCCTCTTTTCAXCTTCCCC
TGCCATGACGGTGGCCTAATCCGTTTTCCTGGTCCTCAGCTCTCCTACTTATGA
CACGCTGTGGCAGTGTGATATGCTAA
AACCTATGCTTTCACCACCTACAGTTGTACTCTCTACCCTCCACGCCTGGCAGA
GACCCGGTTACTACAGTGGCTCTCCT
TCTGGGAGCATTTTCCTCTGAAGTACCAGTCATTTAAAGGCCTGCTGATGACAT
GAAGCGTTGGGGCTTGTGCTGTAGGA
TGAAGGACCCACATCCATGCTCCAGGCAGGAAGAGGGGTCCATGACCACGGAC
ACAGAGCACAGGTCAGAGTGGGGCAGA
CAGAGCTGGAGTCCCCTAAACTCAAAACAACTGAGAAGCCTCTAAGAACCTAGG
TCGTGACCAGTGTCTAGAGGGAGCAG
TTGTCTACACTGTGTGAAATTGAGGAGCCTGGGAAGAAACGGCGAGTTGAGGG
TCAAAGACATCAGCATCGCAGCATTTA
TCTCACATTATTTAGCTTTTCTGAATCTTTATTTTTCATATCTGTAAAATGGGTTGT
AACATTAAATACTATTGCATTTT
CACCATAAATTCAACTTAAGAACTAGGAGTCTGTGTAAGTGAGAGAAACTGAGCA
CCTGCCTCTGCAGGAACTCCACAAT
AAGTGTAAGGAATGCCTTCTGCAGAAGCTATTGCTGTGGGGGAAGGAAGAATAT
CCTCTACGAGGGATACAAAGTGCCAA
GATGTGGTGCTTTGGTAGTCAGGTATCAAGGGATGATGAATCTCTGGTGTCTTG GGACTTCTGAAGAGAGAGACCAACCT
CAGACTGGGAACAGCGGGCTGGATGCGTTAGCAATACCCAAGTTTGGTAATTAA TTGTAAAATTTTTCTAGTGGAGGATA
ATCGCAACTTTGCCTTGCACTGGCCCTAGTTCTTCCACATAATCTATCTGTACTA TGTGACTGTTACCAGTGGCAAATAC
CTGAGTCACACGGCACCAAATATGTTACCAGTGGAAGGTATCCGAGTTACCAGC AGTGAATCCCTATGGTCTGCAGCAAC
CTCAATTCTTACCTCCTCAGAAGAAAGAATTCAACTGGGGGGCATGAGGCAGAA
AGAGAGACTGAGGCAAATTTCAGAGC
AGGAGTGGAAGTTTTCTGAAAAAGGCTTTAGAACAAGAAAAAAGGAAACTGCAC
TTGGAAGAGACCCAAGCAGGCATCTT
GGTCAAGTGCGGTGTTTAACCTTGATCCTAGGACTTTATATGCTCGCCCCGTTC
CCATGCATCTTCCCTTAGGGTGGGCT
GCCGGCATGTGCATTGCCCCCCTTACCCTTGGGAACTGAGCCTGTGCAGTGTG
TTTAGGAAGCTGCACGCATGTCTATCT
GAGGCTTTCTTCCCTTTTCCAGTGGAGTGCCCCCAGAAGGTCATACTCTGCCAT
TAGGTCTCTTAATGTGCATGCCCAGG
GACCCTGGTGCCTGCATTCAGTTAACATTTAGTGCAACAGGTGTGGACCATCAG
GAAATGGCCTCTCCCTGGCATTGGCT
GCCAATTTATCACTTTTAGAGAGACAATGTGATAATTGCCAGACCATCACCCGAC
ATTCCTAGTGGGTGGCGAAGAGCCC
TCTCCTGCCCCGGCTCATGGCTAACTACCTGTATGTGAAAATTGGCTAAATAGA
AATGTTTATTATATATTCCTCTACTG
CTGACTTGTTTCTGAGCCCTTATGCAAACATGTTGAATTTTTCATCCCTTTGGGA
AAGCCATTAACAAAAAATTACTCTC
TTCCCACGTCTTGATCCCTTTCCAACTCCCATATCAGTTCTCACATCTGTGACTT
GGGGGTGAAAGAAACATGTCTTCAA
AGGCTTCCCCTCACCTCTCTGAGGCCTGGTCACTAAGCCCCCGGGGTTCCTTT
GTACCTTCCCCGCCCCATCCCACCCCA
CCTCACTGCCATTTGAAGGTAGAGGTCCAGCATGGATAGTCTGGGACCAATAAG
CTGTTTTGTTCCTGTCACCACAGTTC
AGTTGTTAGGAACCATCTCCAGTAGGCCACCATTTTATCGGTTTGATTGTTTCAA
TCACAAAAGCCACAAGGACAAGCGG
TGGACCAXGCTGTCCCTCCTGATGTACCAAAATATAATTTTAATTCTTCAGACCT
TTGCAACTCTCTCTGGCTAAATAGA
ATGTAGGACGATGTTCATTTGGATCCTAGGGAAGTTCTGCAGGATTTACCATGA
TTTTAATTATCACGTGTAGTTAGTTA
ACTTCTCTCTTTCCCCCATAACCATGTAAGTGCCTTGAGGGCAAGGACTGATAA
CTTGACCACTGCTTCCTTGGCAAAGC
CTGCTCCTTGCTGGGATCTGCAGGCACAGGGAACTCTTCTCTTCCTCTGCCCTC
TCCTGGCTCCTTCCTCCTACCTCCTT
ACACCAGTGTTCTCCTGCCTCAGGGCCTTTGAACTTGCTCTTTTATTTGCCTGGA
AAACTCTTTCCCTCACACCCACTTG
GCATTGCATTTCTCCCTCATCTACTTTTTTGTTTCCTTAGCACTTACCACTATCCA
GCATATGTACATTTTACATATTTT
CTCGTTTGTGACTAACTAGAATGTTAGTTTCAAGGCAGTAAGGACTTTTGTCTGT
TGTATTAAGCGCTGTATCATTAGCA
TCTGAAACACTACCTGGAACACAGCAGCCTCTCAATAATTTGACACGTACATGAA TGACCACCTTTCTAAGCTAAGATAC
CCATTAGTTTACTGTGAGTGCAAAATCTTTCAGGAGTCCTTGAAATCACTATCTC TGTCTTCTCTGAGGTCCCAGGCCAG
ATCCCCAAACACAGCAGGGATCCCCTGGGCATCTCAGACCCTTCCAXTACAGAT TAGCGCATAAAATATGCTGAACTGGA
GTGGCCCCTAGAAATGTGAATGCTGTCCCAAGATTTGGATTCAATAATGATTTCA TCACCTTAAAAGTCTATCACATAAT
TTTGGTCGTGTATTTCAAAAGTACAGTTTCAGCACGTTTAGTCTGTTTTTGGAAT AAAAACTTATTTTCCCATAGAATTA
TGTATGTTCTGTACTCTACTGAACACCAAATTAGGGTTTTTAGGAATATATATTG GCACCTGGAAATATATTAAGCTTCC
ATTTTTGTCTTTGTCAGTAGATTCTTCCAGGAGAGCTTTAAACCAGGTTGCTGCT GTTGGTTCACGGTGCCCAGAGCCCC
TTCTGCCTAAGCCATGGACTCTGAAATCAGGCCCTGCACCCTCCCTCAGTTATC TCATTGTCTGGGTTTTAAACAAATTT
GGAAGCTAAAAACAACCTAGTCCAGAAGGCAAAAATGACAACCTAAAATTTGGCT CTTAGGAAGACAAAAAGCTAGATAG
ATGGTGTCTGAGAAATGGTAGAAGGAAAGTTTCTCCCCACAGAAGAGGCTGAGT TTCAGGATATCTGATGGGAGACTTCA
GTTTCTCATTTGAGAGCTGCTGGACTTACAACAGGTTAACACATTGTGCAGTTCT GCTGTGGAATATTGCAGTATCTTAA
TATTCCTCATTCTTTCATGAATTAAAATGTTAATAAGTAAGCATTTAGATAGCAAT ATTATAGTTTCTCCTTAAGATTTA
AAACAATTTTTTAAAAACATACCTCCCAAGGCATTTTATGTCATTATTTGTAAGTT CCTACAAAGACACAGATAATGAAC
CCTCAGTATGTATTTTATGTATTTGAAGTTTGTACAATATAGTATTTTTCTATTCAT TAAGTGAAATAAGCAACAATTAC
ATAGACACATTTTACATAGACATTTTAATGATTTTTTTTTCATTCCTGGAGGTGCT GATGAGGCCCCTTAAAGATTACTC
TTGATTTTTAAATCTCTTTTAATTAAAATCATTCTACAGAAATTCTGATGAGGAAT GAGGATTTCCTTGTCACTACTCCG
AACTTTGCTTTAATCAAGTCACTGTGAAGTGAGTTTGCTATCTGCAGGTAAAAGC TTGAAGGTGTTGACACTAAGAGCTA
AGCTAGAGATTGAACAATACTTCAAATGAAGTCCTCAGCCTGGGAACAAAACGTT TTAAATGAAGCTGGCAGGAGAAAAC
AAATACAGGATGTGGAAGAACAATAGCAACAATGGGGTCGTTCACAGCTGTTAG AGGATAACTGCCTTCTCTGTTAGTAA
TGAGGATTTACAGCTAAATTCGCTTCAGAAAAGCATCTTAACACCCTGCTTTTTC AAAAAACTCTTTCCACCCTTAGAGG
GAACCAGGGAATTCCAGGTCAGTTTTGATTCCTTGAATTAAATTCTTTTGAACAT ACATTTTCCTCTTCTACTAGTTCCT
CAACTTCCTCCTCCTTTTTGTTCATTCTCCCAAGAGAGAAAATTTCTCCATCATG GCGCTATGGCCTTCCCCTCTCCCCC
CACCCCCAACGCTAACCACACGCTCTAAAGAAGTTTTACAACATAAAAATTACAT TCCTGGTTAAATATCAGAAATTTCT
ACTCAGGATTTAGTATGTTGTCGAAAACACAAGCGATCATCATTCATCTTCTAAA ATAAGATTACATTAATAAATGAGAA
GTTCCCAACCACGTTTTCCAAGAATGACCTGTTTTTAAAATGTTAAATTCACAGT CAGCATGTGAGGGCTCCTAGACTTG
CTTTTCCTTTTCTCTTATCTGAGGTTATGGACCTCAGGAGAGCAGGGGAGGGTT ACAAGGTCAAAGTCTTTTCCTCCTTG
GAACTTGCTGCGTGCTGGGAGAGGGCTGCCTGGAAAATAGATGCTTTAGAGAA GAGGTTGGGGGTCACGATTTTAATTCT
GTTAAGGAACAGCTGGTTTGAGTTTCTTTCCACTTCTATCTCCTCCTCCCTTCCT CACAAATCCATGGAGAGAGAAGAGT
TTGTTCAGAAGGGACCGGCAACATAAAAATATACCTTCTATTATTATAGTGCTTT
CTAAGTGTCATCTTTGTTATGAAGT
TGAGTCTAAGTGCTAGAGAGATGGGCTAGATTTTATAAAATTAACTCTATCTGAA
ACATGACCAGACCTTCAGTCAAAAA
GAACCTTTTAAAATACAGGATGTGCGCCTGGAAATTTCTTTTAATAATATTTTTAA
ACTAAAACTTTAATGGAAATAAGT
ACAATAAAGGCCTCTGGGGTATTTGGAGCCTGACTGCAAAGAGAGGAACATGAC
TGAAATATGAAAGCAGCCAAGTGGGG
GTGTTAAGTTTCTGGACTAGAGTCCTTGTCTGCCTCCAAATTAATATGCCTTCAT
TCATTCAACAAATGTTTATTGAGTG
CCTGTTCTTCCCACAATTCATCTGGGGATACGGTGGTTAAGATATATGGGTGCT
CCGATATATCTGCAAACAGGTGCAGA
AACGTTAACTGGGCCCAATAGAGCTTTAGCCTGTAAGCATGGCTTCCTATTCTAA
GTAGTTTACACAGTTTGGGATTTAA
AAATAAGATCTGTTCAAAATAACATTATGCCATAATAAACTCATAAATCCAGTGAA
TATGTAATAATAGTAAAACAGCTA
AAAATAACACAAAAAATGTAAAAAAAACTAAACAGCAAACAAGCACATGGTTTGA
CTTTTTATTTAGGTTTTAAAGTCTC
CCATGGTTTTTACCGTGCACATGTGTACACACACACACACACGTACACAGAGAG
AGAGAAAGAGGGAGGATAAAACTCAC
ATCTCTCAAATAAATGGAAACCAGTAACAGATTTTTTGTCTGGGCTCCTCAACAC
ACTGGCCTCAGGCTGTGTTGGTGTG
TGGCTCAACAGCAATCAGAGGTCATAATTTCATTGGCAGTTAGCATCAGGTTTAA
TTTTTCTAGCATGTTACATTTACCA
AAATGTCTGGTTTCAGTTATGCTTAGGCACTGACTGAATCAGCTCTGAGCAGCA
TGATGGCTTAAGGTACCTGAGGTTTA
TGGAACACATATGCT TTGTTTTCAATCACTTTCCTTTCTAAACACAAAAATGTG
TTTTCTGAAGCTGAATAACAGTAG
TCCTTAACGATGTTGCTTCTTTCTTACAGGAGAGGACCCTAAGTAGGTAAAAATG
TCAATTTTACAAGACTGCCACCTAC
TGATTGACATAGTTGAAAAATCCTCCAATTAAGGTGCTTTAAAAAATATGTATGGT
TTTTCCAAAACAAAACAAAAAAAT
TAGGTACTCTTAAAAAATTGACATGTATCATTTTAATTGCATTTTAGGAATGTGAC
CAAGTGCATATAGATGATGTTTCT
TCAGATGATAACGGACAGGACCTAAGGTGAGCACCCTTGAATGTGAAGTGACCA
TCTCCCTTTTATAGTTTAATTACTTA
CCTAATTAAGATGGACTTTCTGTTCGGCTAACTTTGGGCTTTTCTTGCATATAAA
TAGCACATATAACTTTGGAACAGAT
GGCTTTCCTGCTGCAGCAACCAGTGCTAACTTATGTTTGGCAACTGGTGTACGG GGCGGTGTGGACTGGATGAGAAAGTT
GGCCTTCCGCTACAGACGGGTAAAAGAGATCTACAACACCTACAAAAATAATGT TGGAGGCAAGTGATTACCACCAGTAC
CTCAGCAAGTTAAGACAGTCCAGATACAATTCCAAAAGTACTGTACACTTTTGTG TCACACTGCTTTGAATATAATAGCT
TCTATTTGGGCAGTTTCATCTCACTGGCCATTATTCTGCTGTGATTTAGTGACAG AAATATTGCACTGGAGCCAGCTGCC
CTGGCTATTCCAGTCTAGTCCTGTCTCTGTTACCTAGTGATGTGACCTGXGGCA AGTTTCCTGACCTCTCTGACATTCAG
TTGTCCTTTCAAGGAAGGATTAGCATCCTTCGAAGAATTAGCCACCTCTGCCCAA
ATCACAAGAATGGTTGTAAATCTCC
TCCTACAGGGTAAATACTTGGATTAGTTTGAATCTTTTGACTTTGGTTCATAATG
GCTTTAGCATGAAAAGGAAATACAG
TGACTTAATAATAAAAAGTTTAGGGGTGAGGTTGGTTTCCAGCATGGCTATATGC
AGATGCTGAAATGATGCTGTCGGGG
AAGCCTGTGCCTTCCTTTGGATAAAGGCCTTTCTAGGTCAGTCCCTCCCAAGAG
GTGGCAAAATGACCACCAGTAGATCC
AGCTTTATACTCTACCAGTAGTTTGGCACATCCATGAGGGGAGGGAATCTATCA
TTTCCTAATAGATTCAGCAAAAGTTC
CAAGACTGACTTCCACTGGCCAAGTGAATGCTGCACACCTACCCTGGGAGCCA
GAAAGTGGAGTCACCTCTGCCCAAACT
ACACTTAGAGGAATGTAXGAAAGTAGGTTCCCTAAAAAGAGATGACTTCTGTAAC
CAAACCAAATAAATACTTGTCAAAT
GAATGGTACTATGTGAAAGATACTTTGTATACTATAAATCTGGAGGCTGGTATTC
ATAGTACTTTTTAAATGTTAATTAC
TTGCAAAGCCGAAGAAATATGTTGGTCAGGAAGTAGTTCATCATGTGCTTTTAAA
TCCTCAGGTCTGCTTGGTCCAGCTA
AGAGGGAAGCCTGGCTGCAGTTGAGGGCCGAAATTGAAGCCCTGACCGACTCC
TGGTTGACACTGGCCCTGAAAGCACTC
TCGCTCATTCACTCCCGGTGAGGCTCTCTGCGGGGTCTTTGCCAGGAAATGGG
CAATGCTCGTGCTTGTTCATTTTGGAG
ATTTATTTTTCAACAGTCTTTGTTTTCTTTTTGTTCTCATAGGACAAACTGTGTGA
ATATTTTAGTAACAACTACTCAGC
TCATCCCAGCATTGGCGAAAGTCCTGCTGTATGGGTTAGGAATTGTATTTCCAAT
AGAAAATATTTACAGTGCAACTAAA
ATAGGTGAGCTGCTTTTCTTTTCTTTTTAACAATCAGCTCTTTTCCTTCAGGATAA
TAAAAACGAAAGGCAATAAATGAA
ACTTTTTTTCCAGAATGCTACATAGGTCAAAAAAAAAATACAAGAATAATTTAACC
AATTTGCAAAACTGGATTTAGACG
TAAGAGAAACATAAAGCATAATACAGTAGCTGAAGTCACAAAATTTTACTTGACT
GGCTAACGTTAAGACCACGGCATTT
TAGGTAATATAACAGCATCGATTCTGGATTATTATCATATTTTTATCCTTTTAGGC
AACATTAACAAAAATATCTGATTT
TCTAAGCTAAAACAAACAGCAAGATAAAATTATAAAAAGTAGAAATGAAACAACTC
TAATAACAGATTTATTACTAAAAA
TATTTGTTTTGTTTTCCATTCAATCTGTATAATCAATTGAGTTTCCATTTTTGTTGT
GGAAATTTAAAAGCTTCTTGCAT
GAAAAAAACATAGTTGGGAGGAAAATACTTATGAAATTTAGTTAGAATATAAATCA AGGCAAAATCAACATAGGAATAAA
AAGACATGGCTCTTGTTCTCAGGGGAAGCAAACAGTTACATGACAATCCAATAT GATACAAGAGTATGCAAATGAGCATT
TTTTAGTTGTCTACTAAATGCCCTGAACTGTACTAAGCATTTTGCATTTAAAGTCA ATCTTTATAATCTTTCCATTTTAC
AAATGAGGAAACTAAAGTCCAGAGACTTAAGTAAATTGTTGCAATCAATAGTTCA CAAACACAAGGCTCAAGCCTCATGT
TTCTCTGATTCCAAGGTCCAGACCCCTTGCTGTATCCTACAGGGTCCTATTGCTA AGTGCTTGAATGTGCACATGACTGA
GGAAGCATTTACTGCCACCCAGGAAAAAGTAGAGGCCAAGTGTCTATGAAAGTG
TGAAAGCTAAAGGGAAGAAATGACAT
TTGATCTGGGTATAGAAGAGCAAGTAGGAGTTCAACAGGTAGATGGGCTAGTGA
GAACAATGCAAATGGACAGAGAAATT
AAAATGTGTTCCCTGCTGAGGGACTTGTGAGTGGCCCTGGGTAGCTAGTATGCA
GTGCTCATTAAAAGCCAAAGCTGAAA
AGGTATACTTGGGTCTAGGCATGAATGGCGTTGTATATCACCCTAAGGAXAGGA
ATGTTATTTTAGTCCAGGGGAAGGGG
TATGAGCATTCTAGCAGGTGGCAAAAGGGCTGTCTTTGGCTGCTCGACTCCATA
TAATAAAAAGGGGCAAAAATGGAATC
CCTGTAGGTCTAACTTCCGTTTGCCTGCACTTTTACACTTCCTGTGCTTCTCCTT
CAAAGCATTTACTAAATGATTACTA
AAAAATTACTGACTTTTATAATTATTTGTTCATAGTATGACTTCCCCGTCATACTG
TAATAATCTTGAGGACTGGGTTTG
TCTCTCTTGCTCATCATAGTAGCTGTAACACCCAGCACACAGTAGGTATCAATAA
ATGTAGAGATGAATTTGTTGAATGA
ATGGCTGAATAAATTACCATTTATTTTGAAGAAGGCCGACACAAGATTTCTTTTA
GGCACGTGCGTCAATAGCGGCACAA
TTAACTGATGGGGAAATCAGGTTTTATTTTTATCAGTGTTGCATTGGAAGAGACC
CAGTGGGGTCTTCAATCAATGGCAG
AAAGACACAGTCAGCAGGGTGCTGATGGGATTCATCCAGGCTAACCTTTCAAGT
TAGAAATACATATCTGGAGTTTGAGA
ATGACTGATGCTTCCCAAGAAGGGCCAAGAACAAATTGTTGGAAAATATCAAAAA
GCAAGAAATATATCAAATAGAGTAA
AATTCCTAGTAGCCCATTATAACCAGAACTTTTATGTGCTTTCATGTTCCATAAAA
ATAGCAAACAGTTTAAGTCACATG
TTAATTCTATTTTCATCTATATCAATTAAGAGGAATTATAATTATATTGTGTTGTAA
ACAGGGATTGTGGAAAAGGTTTA
CCAGCACATGAAAGTTGGAACAAATGCCGAAGCAGCCCATAGTACAGAAGCGG
GTAGAGCTTAGGTTTTCAGAAAGAAAG
CAGGAGTAGTGTATTACATAGCATGTGCATTGAAGTTTAATCCCAAATAATTTGA
AAATGATAACAGGAAATTACAATCC
AGTAAGCATTGTTTTTTTAATGGGCATTTTGTTTAAAATATTAAGCTTTATTTTTTT
TTCAATACAGAAAAAAAATATGT
GGTGGTGGAAGTAGGATTTCACCATCAAATAAGGTATATGTGGGGAATCAATGC
AAAAAACCCAGCAATTTTTTTTTTCT
TTTGACATGGAGTCTTGTTCTGTCACCTAGGCTGGAGTACAGTGGCACCATCTC
AGCCTGTGCCTCCTGGGTTCAAGTGA
TTCTCCTGCCTCAGCCTCCCCAGGTAGCTGGGATTACAGGCACACACCACCATG
CCCAGCTAATTTTTGTATTTTTAGTA
GAGACGGGGTTTCTCCATGTTAGCCAGGCTGGTCTCAGACTCCTGACCTCAGG TGATCTGCCTTGGCCTTCCAAAGTGCT
GGGATTACAGGTGTGAACCACTATGCCTGGCCCCAGTAATTTTTATTGTAGGCA AAATGCTAAAGTCATAGGAGGAGATG
ACTGGTATCTCAGTTGGCAGAGCTATGATGACCTTTCAGACAAAATGGACTCAT CAAAATGAATACTTCATATAAAACTA
ATAGTTAACTTCTAAAATCTAAGCTTCATTTTCTTAACCTCTATTGTGAAAAATTTC TACATGACCCCAAATTGGCAATG
AGCCCAATAGGAGATAGATAGTATATAGGCAAGGATTTTTTTTGTTTACCACAAC
TTCATCTTTAAATTACATTGAATTT
ACCTTCCTTTTTCCACAAGAGGGAAAATTTGCAAAAACGCATAAGCATTTCCCCG
TAAGGAGAGAAAAAACTTTCATGAC
ATCGAAGTTTTTGGTTGTATATTTTTGCAGTGAAGACTATACTTTGCATTTGCTTC
TTAAAGTGATTTAATATTTCACAG
TTTAGTGGACAACTAAAGAGGCATTTAAATTGCATCTTTGAATAAGGCTAGCTCT
ATTTCATTGTGTCAGATGCAAACTA
GAGATTTTAGTGCAAAATACAAATATAAAGCAGACATTTGTTTAAGCAAACAAGC
AATTTGTTCATGTGTATATCTGATC
CTTATTTGAAAGAGGTGCAGGCTTCGCTCAGCCATTCACATTTGAAAACTATAAA
CTGTCCCATTGTCCATTAAAATGAC
CCTATTGGCAGTCCAAGAGTTGGAACACCAGTTTCTTCTCTGAGAAGAAACAAT
GCTTGCATTAACTGTGGGAATGTGCT
GACAGTGTAGCCACCACAGGAGCCCAAACTAAGTCATGACACGGGAAAGATAG
TAGACAGATAAGGGTATTCTATTACCT
TTATTCCAGTTTTCCTCCAGGAAAGTGCTTCATTGAACATTAGATGCCTGGGTAA
GGCAGGTACCATCTTGCTGCACTAC
CTCAAATTATCTGACATTTATTTGAGAAGGAGTTTTACAAGACTACATCAGACTTT
CAGGGTCTTTAATTCAACCTGAAA
ATG GG TAAGCTAATTCCGATGTTTGCGTTTTCTTGCTAGAACAAAAACAAG
TCCAGTTATTTTTTATCATTCTCAA
TAGTCTTTGGATCAGATATTTTCAAAGTTTGATATATGAAGAAAAAACTTTTTAAA
ATCAAGTTAAAATATGTTGCCAGA
ATTGCCTATGTTTGAGTTCAGCTTATATATAGGGGGAAAAAGAAGTCCCACTGCA
AAACAGTACTTTCCCCCACCACCCC
AATTGCAATGCTGTGATTGACTAGATTCCATTTTTTAATAAATGAGTCCTTGGGA
AGAGAGTATTTAACTATCAATTTCC
AAAGGACAGGACACCACACTGCTAATTAATTTACTTTGTAATTTATTGTGCTGTG
GCACATACAACCCAAGTGGCATAAA
AGCATTCCTCTTAGAGGTACTAGTGGGGCATTCGAATCAGAGCATGAAGAAAGC
ATTTTATATAAAAAGTTCAAGCACTG
TCACATTAGGTTACCCAAACTTCACATTTTCTATCATTTATTTTTGTTCACTAGGA
AAAGAAAGCTGTTTTGAGAGAATA
ATTCAAAGGTTTGGAAGAAAAGTGGTGTATGTTGTTATAGGAGATGGTGTAGAA
GAAGAACAAGGAGCAAAAAAGGTGAG
CATTCCTAGATGGTGCATCTTGTCTCATTTGTTTTTTCAGTCCTCAGTATTCTCAT
AACTGGATCATTTAAACAGGTTCC
ATTTATGTGAAACGTGATGAATATGTGCAGTACTCTAAGCACTGAACAGGGTAAA
GTAGAATTCAATAAATATGGAGTGA
TCTTTCTGAATAGTAGTGTGCACAAACTCAAGGGCCACAAGTGGTGTTCATAGA
AGGTAGAATTTGGGAGCAGAACTGAA
GTGTTGTTAGTGGTGTAGCTGTCATCCAGAGCAGAGATAACACATGTGCCAGCA GGACCTTTTCTGTACCCATGGCAGAC
ATGCCAGGTGGTTCAAACCTTGTTACCTGAGCCTGGACGCAGCCTAAGAACCTT TCAAACATCCTTCAGAAAGCCTCTAT
CAGTTGGTCAAAGCTGACTTGGAAGATAAAACTATCTGCTGCCCCAGAAGACTC GATGTCTTCTTCATATCATGAGGTCC
TACCTGTCTCTCTCAGGGCAGGGGGAAGGGAGTGGACCCAACTATAAATCTCTT
GACTCCCGCTTTGCCCATTGCTGTTC
ATTATATTCTTCTGAATCTCTGCAGGTGTAATTCTATCATTTCTCTGAGCAAAAGC
TTTCTCATTGGGCCTCTCATTGCA
TTTTTAAAAGCCAAGGACCACCTTCCTTAGCCTGGCACTGAGGGTCTCCACAGT
CTGGTTACAGCCTGCCTTTCTGATCT
GTCTCCCTCCTGTACCTTACACTGTAGTTGTGTCAGTCCACTCAGCTTTCCTTGA
CCATGCCCCACAGCTTTTCCTTCCT
GTTGTGTCTCCTTGGAACCTTTTCCCTAATACCATTCCGTCATTGTCTTATTCATT
TGCAAAATTTAATTCATGTACTAC
CTTTTCTGTTATGCTTTTAAGAGATCAATTATAATCATGCTCCCTTATATTTTAACT
GAAGGAGCTGTAGTGATAAATAG
CTTGGATTCTTGAGCCAGACAGCATGAGTTTGAATTATGACCCTGCCACTTACC
CTTCCTGTGACTTTGAACAAGTTTTT
AACCCTCTGTTTCCCTTTCCTTGTCTATAAATGGGGATAGTAATAGCACTTAGCT
CATTGGGCTAGTTTGAGGATTAGAT
ATAGTAAAGATATATAATTCCCATAATATAGATATGTGTAAAGTGTACAACAAATG
GTAGCTATTGTTATTGTTTATGTG
CCTGTTTCAGCAAAGAAATGTGAGCTTCACAGATTCAGGGATCTTACTTTGCTTG
GTTTGTTGAGACTGTCTTCAGATTT
TTTTATAACTCCCTAAATCTTGTCATAGTTGTACTCTGTAAATGTTTCTTGAATTG
AAGAACAATTAAATATTTTATGGG
ATTAACCAACAGCATAATTTTCCATTCCCTTCCTTACACACTCCCTCATAGGAATC
CATGTATATTTCCATAAATGATAC
AATATCTCAGCCCTTCTGGAATATTTCAGCATATACCAGAAACAGCATGTAAGGT
CACCAGATTACAGATGCAGCACTGA
GCCTAAGTATGATACATGGCCAGCCTTCCTACAGGCTCTGAAGATGGCCTAGTT
TACCTTCAGTGCCTCTGAGTTCCACT
AAAATCATAATCACTGACCCTAATTTTAAACTAGATAATTTCATGAATACCCATAA
ATACAGAATGTATCCCAGTCACAA
ACCTGCGTGAGTCTGGTTGGTGCTGGTAGAAACCCACACGTGACTTCAGGTCT
CCTCATGCTCAATGTGAAGCAGCATTT
GCCATTTAAGCAGTAGTTCTAATCTTCCCACTTCACCGACGACCAACCACTAGTG
TCATTGGGTGCCTCTGTTACCTCTC
TGGGCfGTCTTTATAAAATAAAACCTTTCCCAGCCTAAGTCAGAATCCTCAAAAC
ACAATCCTGGAGAGTTGAAGAGAAT
GTTTTCTGACATCCAGACTCAAGTCTTCTTTCTAGGAAAGCCTCTGGCATTCCCA
AGCCATTCCTATTCAGGACCCTCCT
AGTGGCCAGCAGGGCAGGGATGGAACAGATTTGTACCTCTGACCAGTATGGCC CTAGGCCAAGGTGTGAAGCCATCTCCA
GGCTGGTCCTTCTCCCTCTGGTATCTACCAGAAAGTTTAATATCAAAAATTGCTA AAATGTGGCTTTCTAGCTTGTGGTA
GGCAGAGAGGGATGTTTATGAAAGTGATTACTTAAAATGCAGGGAACACAAGTG AAATCCTTTGGATCTTATCAAGGTTT
CTGTGAGTGAGTCCCTTCCTAAACCTTTCTGATAAGTACAATTCTTAGAAGTTTG ACTTCAGTGAACACAACATGGAAAA
AAACAAAGGAATTCTTGGTTTCAGGTTAGAATTGAAAGTATTGATAGGGCTAGAG TATTTGCTGTCTTTGAAGAGCAGTT
AAGATTTTTTTTTCTTTTTTTTATTATACTTTAAGTTCTAGGGTACATGTGCACAAT
GTGCAGGTTTGATACATAGGTAT
ACATGTGCCATGTTGGTTTGCTGCATCCGTCAACTCATCATTACATTAGGCAAAT
GCTGACACTTCCTGGTAGATTCTAA
TCCTGACAGCTTTGACAAATGGGTGCTCAATACATTTTGTTGGTAGAGCGGACA
TGAAAACTTTATGTATTTCTGATGTA
ATAAAAGGTTTTAGTTAAAAGATTTTAGTTAATTTTTATACTGGAAATAGGCTGAG
GTTCGTTTTTGGTTTTTTGTTTTA
TTTGAGAAAGAGTCTCCGTCACCCAGGCTGGAGTGCAGTAGCGCGATCTCGGC
TCACTGCAACCTCCTCCTCCTAGGTTC
AAGTGATTCTCCTGCCTCAGTCTCCTAAGTAGCTGGGATTACAGGCATGTGCCA
CGATGCCTGGCTAATTTTTGTATTTT
TAGTAGAGACGGGGTTTCACCAGTTAGCCAGACTGGTCTCAATCTCCTGACCCC
AAGTGATCTT
SEQ ID N°1-F : This sequence is contained in the subclone J4A10 derived from the P1 10910 and contains exon H (Contig H), which is the last exon of eya1. Exon H begins at the nucleotide at position 362 of this sequence. The translation stop codon is located at the nucleotide at position 440 of this sequence. The polynucleotide of SEQ ID N°1-F is contained in the clone HSEYA1J4A10 deposited at the Collection de Cultures de Microorganismes under the accession number : XXX. The nucleotide sequence of SEQ ID N°1-F is the following :
AATTATAAGAACAGTTTTAAATTTCGCXGAAAACCATGAXAGGGAACACTGCCAT
ATTTGGCAACGTTXTTTTATTTAAAGCTTGGAATTGGCATTCAAGTCACAGTGTA
GATCCAGTTAGGAAATATCCATCCCCTTCTTCAAAGAAACAGACTCTAGATTACA
TTTTACAACTCAGTCCAGTATTCTTAGGGGAGGATTGAGTTTGAATTTTTGTTTTA
AACAGATGTGATGTTAAGATGTTAGCTGGCATTTCAATGATACTGAAAATGGGG
GTCAGCATGGGAGTGGATTTCCAAAAAAGCTGGCAGGGCTGGCGCTGGGTGCT
CTGTCCTCACATGTATGTGTGCCTCTGCTGCAGCACGCGATGCCCTTCTGGAG
GATCTCCAGCCACTCGGACCTCATGGSCCTGCACCATGCCTTGGAACTGGAGT
ACCTGTAACAGCGCTCGGCACTTTGACAGCGCACAGCTGCTCTGTGACCAGGG
ACAGATCCAGCAGGCCCCAGTCTCGCATCAGCGCCGGCCTCCARAACTTAGCA
ATTTCCGCCTGGTGATGCGCAGTTGCTGTCAgTCTTGACCTCTGCCTTTGTGGT
GAATGGAGGACCACGTCTATTTCATCAXAACAGCTGTTGACTCTAGTACTGTGAA
TCCAGTGAAAATAAGCCATGAXAATGTTTTAGCACAGCGTTATGTGTCTGCCACA
TTAACTACACGGTTCAAACCTGTGAAAAAAGGACcTGCAAACGCTTCAGTTGTTA
GCATTTTCAATGTGATATAAACAGCTTCTCCAATACAGCAAACCTAATTGCACAA
CAGAXACTGAAATGTGTTTCCTGAATACCAGTGGAGGAATTTTCTTGTAAAGAAG
GTTTACTTTTTGGTGTCTCATACCCAGGGTAATCTGTACATCTCTACTTATTTATG
AACAGACTTTTTTTAAAAARATAAAAAAACAGCTTTATTGAGGTATAATTCACCCA
CCAGACTTTTTTAAACATCAAATAATTGAAGAGACAATAGCATTAGAAATAAGTGA
TTAAAGGCCTCTGCCTCACAACATGGCAAGTACAGTACTTTGAATTTTAGCACAT
TGCATAATAKTTTTAAGTATGTCTAATTTAAACGTATAATATGTACATCACTGAGA
CAATCATGTXCAGAAAGAATTTTTGGTGTAAATTTGTAATAATGGATAATTCTTTT
ACATATTGTTTAGGGAAATGATATTGAAXAGGTAGCAATGCCTGCATAGTGAAXC
ATGAGGCAGCACGTGCACAAATTCATGTGCCGTGCCTTATCTGAGTTTTCGGTA TAAATATGTA
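The two coordinates given for SEQ ID N°1-F (exon H from position 362, translation stop codon at position 440) can be checked against the printed sequence. The following Python sketch is an illustration only and is not part of the deposited material. It assumes that positions are 1-based and that exon H begins on a codon boundary, neither of which is stated in the text. The names `SEQ_1F_START`, `codon_at` and `first_stop` are hypothetical helpers introduced for this example.

```python
# First nine printed lines of SEQ ID N°1-F, copied as printed, including the
# ambiguity symbols X and S. Only this leading part is needed for the check.
SEQ_1F_START = "".join([
    "AATTATAAGAACAGTTTTAAATTTCGCXGAAAACCATGAXAGGGAACACTGCCAT",
    "ATTTGGCAACGTTXTTTTATTTAAAGCTTGGAATTGGCATTCAAGTCACAGTGTA",
    "GATCCAGTTAGGAAATATCCATCCCCTTCTTCAAAGAAACAGACTCTAGATTACA",
    "TTTTACAACTCAGTCCAGTATTCTTAGGGGAGGATTGAGTTTGAATTTTTGTTTTA",
    "AACAGATGTGATGTTAAGATGTTAGCTGGCATTTCAATGATACTGAAAATGGGG",
    "GTCAGCATGGGAGTGGATTTCCAAAAAAGCTGGCAGGGCTGGCGCTGGGTGCT",
    "CTGTCCTCACATGTATGTGTGCCTCTGCTGCAGCACGCGATGCCCTTCTGGAG",
    "GATCTCCAGCCACTCGGACCTCATGGSCCTGCACCATGCCTTGGAACTGGAGT",
    "ACCTGTAACAGCGCTCGGCACTTTGACAGCGCACAGCTGCTCTGTGACCAGGG",
])

STOP_CODONS = {"TAA", "TAG", "TGA"}


def codon_at(seq, pos):
    """Return the three nucleotides whose first base is at 1-based position pos."""
    return seq[pos - 1:pos + 2].upper()


def first_stop(seq, start):
    """Walk codon by codon from 1-based position start (assumed to be the
    first base of a codon) and return the position of the first stop codon,
    or None if no in-frame stop codon is found."""
    for pos in range(start, len(seq) - 1, 3):
        if codon_at(seq, pos) in STOP_CODONS:
            return pos
    return None


if __name__ == "__main__":
    # Assumed reading frame: exon H starts at position 362 on a codon boundary.
    print(codon_at(SEQ_1F_START, 440))
    print(first_stop(SEQ_1F_START, 362))
```

Read in this way, the first in-frame stop codon reached from position 362 is the TAA triplet beginning at position 440, which agrees with the coordinates stated for SEQ ID N°1-F. Codons containing an ambiguity symbol (such as GSC) are simply treated as non-stop codons in this sketch.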
ANNEX 2
SEQ ID N° 2 :
5'-TCTCCTTTTTCTCTTTTGGTTAAAAGAGGGCATTGTCGTTCTCAGCCATG
TGCTCTGTATAATTAAGAGCTGACACTGAAGCAGAGTAACAACATCTTCT
AATTTTTTTACCCCTGATCACAGGTGCAAACATCTCAAGCCAGTTCAGAT
GTTGCTGTTTCCTCAAGTTGCAGTTAAAACAGAGCCAATGAGCAGCAGTG
AAACAGCTTCAACGACAGCCGACGGGTCTTTAAACAATTTCTCAGGTTCA
GCAATTGGGAGCAGTAGTTTCAGCCCACGACCAACTCACCAGTTCTCTCC
ACCACAGATTTACCCTTCCAACAGACCATACCCACATATTCTCCCTACCC
CTTCCTCACAAACTATGGCTGCATATGGGCAAACACAGTTTACCACAGGA
ATGCAACAAGCTACAGCCTATGCCACGTACCCACAGCCAGGACAGCCGTA
CGGCATTTCCTCATATGGTGCATTGTGGGCAGGCATCAAGACTGAAGGTG
GATTGTCACAGTCTCAGTCACCTGGACAGACAGGATTTCTCAGCTATGGC
ACAAGCTTCAGTACCCCTCAACCTGGACAGGCACCATACAGCTACCAGAT
GCAAGGTAGCAGTTTTACAACATCATCAGGAATATATACAGGAAATAATT
CACTCACAAATTCCTCTGGATTTAATAGTTCACAGCAGGACTATCCGTCT
TATCCCAGTTTTGGCCAGGGTCAGTACGCACAGTATTATAACAGCTCACC
GTATCCAGCACATTATATGACCAGCAGCAACACCAGCCCAACGACACCAT
CCACCAATGCCACTTACCAGCTTCAAGAACCGCCATCTGGCATCACCAGC
CAAGCAGTTACAGATCCCACAGCAGAGTACAGCACAATCCACAGCCCATC
AACACCCATTAAAGATTCAGATTCTGATCGATTGCGTCGAGGTTCAGATG
GGAAATCACGTGGACGGGGCCGAAGAAACAATAATCCTTCACCTCCCCCA
GATTCTGATCTTGAGAGAGTGTTCATCTGGGACTTGGATGAGACAATCAT
TGTTTTCCACTCCTTGCTTACTGGGTCCTACGCCAACAGATATGGGAGGG
ATCCACCCACTTCAGTTTCCCTTGGACTGCGAATGGAAGAAATGATTTTC
AACTTGGCAGACACACATTTATTTTTTAATGACTTAGAAGAATGTGACCA
AGTCCATATAGATGATGTTTCTTCAGATGATAACGGACAGGACCTAAGCA
CATATAACTTTGGAACAGATGGCTTTCCTGCTGCAGCAACCAGTGCTAAC
TTATGTTTGGCAACTGGTGTACGGGGCGGTGTGGACTGGATGAGAAAGTT
GGCCTTCCGCTACAGACGGGTAAAAGAGATTTACAACACCTACAAAAATA
ATGTTGGAGGTCTGCTTGGTCCAGCTAAGAGGGAAGCCTGGCTGCAGTTG
AGGGCCGAAATTGAAGCCCTGACCGACTCCTGGTTGACACTGGCCCTGAA
AGCACTCTCGCTCATTCACTCCCGGACAAACTGTGTGAATATTTTAGTAA
CAACTACTCAGCTCATCCCAGCATTGGCGAAAGTCCTGCTGTATGGGTTA
GGAATTGTATTTCCAATAGAAAATATTTACAGTGCAACTAAAATAGGAAA
AGAAAGCTGTTTTGAGAGAATAATTCAAAGGTTTGGAAGAAAAGTGGTGT
ATGTTGTTATAGGAGATGGTGTAGAAGAAGAACAAGGAGCAAAAAAGCAC
GCGATGCCCTTCTGGAGGATCTCCAGCCACTCGGACCTCATGGCCCTGCA
CCATGCCTTGGAACTGGAGTACCTGTAACAGCGCTCGGCACTTTGACAGC
GCACAGCTGCTCTGTGACCAGGGACAGATCCAGCAGGCCCAGTCTCGCAT
CAGCGCCGGCCTCCAGAACTTAGCAATTTCCGCCCTGGTGATGCGCAGTT
GCTGTCAGTCTTGACCTCTGCCTTTGTGGTGAATGGAGGACCACGTCTAT
TTCATCAGAACAGCTGTTGACTCTAGTACTGTGAATCCAGTGAAAATAAG
CCATGAGAATGTTTTAGCACAGCGTTATGTGTCTGCCACATTAACTACAC
GGTTCAAACCTGTGAAGAAAGGACCTGCAAACGCTTCAGTTGTTAGCATT
TTCAATGTGATATAAACAGCTTCTCCAATACAGCAAACCTAATTGCACAA
CAGAGACTGAAATGTGTTTCCTGAATACCAGTGGAGGAATTTTCTTGTAA
AGAAGGTTTACTTTTTGGTGTCTCATACCCAGGGTAATCTGTACATCTCT
ACTTATTTATGAACAGACTTTTTTTAAAAAGATAAAAAAACAGCTTTATT
GAGGTATAATTCACCCACCAGACTTTTTTAAACATCAAATAATTGAGGAG
ACAATAGCATTAGAAATAAGTGATTAAAGGCCTCTGCCTCACAACATGGC
AAGTACAGTACTTTGAATTTTAGCACATTGCATAATAGTTTTAAGTATGT
CTAATTTAAACGTATAATATGTACATCACTGAGACAATCATGTACAGAAA
GAATTTTTGGTGTAAATTTGTAATAATGGATAATTCTTTTACATATTGTT
TAGGGAAATGATATTGAAAGGTAGCAATGCCTGGATAGTGAAGCATGAGG
CAGCACGTGCACAAATTCATGTGCCGTGCCTTATCTGAGTTTTCGGTATA
AATATGTAGATAATGGATTTTTTTTTAGATAATGTTGTCAAGACCAAAAG
CATGGATGTCAAGTGTCAGTAAGGATTTTGTTTCTAAAATTTTTTCCTGC
ATCAGTTCTTCTGAGGGCCTTGATGAAATAACACAGCAGTTTCTTAAACA
ATTTGAAACAAAATGAGCTCTCCTACCACCTCACTTTTTCATTTCCACAC
TAATGTATTATATGTAACTACTTGGAAAAAATAATTATTCAAATGCTTCT
TCCCACAAAGAATATAGATGATAGTAGATATATTTTATTAATAAAATGGT
TCATGAATCGGAGACTAACAAAGTTTTCATGTGCTCAGAATTATTAATTA
TCGTGTCTGCATTTTCTTTCGATAAAGGAAGACACACGATGCTAATCCGG
AAATCAGCAAACTTTGCATTACTCCCTATGTGCGTATTTTCTCTTTCTTC
CTGTCACCCTGAGGAAGGTTCATTGCCATTGTCATCACCATGGAAACAAC
GTTCCTCTCCACCTGCATTATGTACTACATGACAGGCATCAATCTGGGGA
AATAATAAAATTATCACCTTTGTCAGACCATAAGAGTTTCTCCAAAAGTG
GTCAGTTTGGCTGGGCAATATTTTCTCTCATCTAACAAACACAATCCATT
GTCATGAAATTACCCTTAGGATGAGTCTTCTTTAATCAATCATATATTGG
GCGGAAAAAACACCAGCTTTGACCCGAAGTAGTTGAAGAGCTACTTCATT
CTTTTCTGAAGTTGTGTGTTGCTGCTAGAAATAGTCATTTGTGAATTATC
CAAATTGTTTAAATTCACAATTGAATTAGTTTTTTCTTCCTTTTGGCTTG
AAGCAAACAGTTGACCATTTTTAACCTTTTCATTTTATGTTTTTGTACTC
TGCAGACTGAAAAGACAAAGTTTATCTTGGCCTTACTGTATAAAGGTATG
CTGTGTCCACCGTTGTGTACAGAATTTTTCTTCATTAATTTTGTGTTTAA
GTTAATAAAATTTATTTGTGATGTACTGTAA-3'
SEQ ID N° 3 :
AAACCAATAAGGTTAGGACAAGAGAATAGCTGTGGTTTGCGTTGCAAAAA
CCAAAAAAAAAAAAAAAAAAAAAAAGAAAGCCCCGAGGCTCCATGGGCAG
ACCTACAAGGCTGCGCAAACAAATCGAGGGATGAGATTCTGCTGTTTCTT
TGTCTAGGGTTCTCAGATGCTATCTGCCGCTGCTGTTTGGTGGGGAAGGA
GCGCTGGGCGCAAAGCTGTTACCAAACAGAACGGTGGGAGCTGATGGCTC
CGAGTTTGGGGCGAGGTAGAAACTCTCCAGTGCCACTTCCGACTTTAAGC
CTTCCTGTTGCCGTCCACTGTGGCGGGTTTCTTCCTGGGGAACACGTTTT
CGCTCAGTCGCTCGGCAGCCCGAGCCTGCGGCAGCGGCCAGGCGCCTGCC
CCCTGCGCCGAGCTTTCCCCTGCAGAGGCGCTCCACTCCCAGAAGCGCCG
CGGCTGCACCAGAGCGCCTGAGAGCCCCCGCGCGTACCCATCCAGGAGCA
AAACTATGTCAGGAATGGAGGTTTGCTAACCCAGAAAATTCGAAGGAACA
CATTAAACTGGTGGATGCAGCAGATGTAAGCGCTGTGCAAACATCTCAAG
CCAGTTCAGATGTTGCTGTTTCCTCAAGTTGCAGGTCTATGGAAATGCAG
GATCTAACCAGCCCGCATAGCCGTCTGAGTGGTAGTAGTGAATCCCCCAG
TGGCCCCAAACTCGGTAACTCTCATATAAATAGTAATTCCATGACTCCCA
ATGGCACCGAAGTTAAAACAGAGCCAATGAGCAGCAGTGAAACAGCTTCA
ACGACAGCCGACGGGTCTTTAAACAATTTCTCAGGTTCAGCAATTGGGAG
CAGTAGTTTCAGCCCACGACCAACTCACCAGTTCTCTCCACCACAGATTT
ACCCTTCCAACAGACCATACCCACATATTCTCCCTACCCCTTCCTCACAA
ACTATGGCTGCATATGGGCAAACACAGTTTACCACAGGAATGCAACAAGC
TACAGCCTATGCCACGTACCCACAGCCAGGACAGCCGTACGGCATTTCCT
CATATGGTGCATTGTGGGCAGGCATCAAGACTGAAGGTGGATTGTCACAG
TCTCAGTCACCTGGACAGACAGGATTTCTCAGCTATGGCACAAGCTTCAG
TACCCCTCAACCTGGACAGGCACCATACAGCTACCAGATGCAAGGTAGCA
GTTTTACAACATCATCAGGAATATATACAGGAAATAATTCACTCACAAAT
TCCTCTGGATTTAATAGTTCACAGCAGGACTATCCGTCTTATCCCAGTTT
TGGCCAGGGTCAGTACGCACAGTATTATAACAGCTCACCGTATCCAGCAC
ATTATATGACCAGCAGCAACACCAGCCCAACGACACCATCCACCAATGCC
ACTTACCAGCTTCAAGAACCGCCATCTGGCATCACCAGCCAAGCAGTTAC
AGATCCCACAGCAGAGTACAGCACAATCCACAGCCCATCAACACCCATTA
AAGATTCAGATTCTGATCGATTGCGTCGAGGTTCAGATGGGAAATCACGT
GGACGGGGCCGAAGAAACAATAATCCTTCACCTCCCCCAGATTCTGATCT
TGAGAGAGTGTTCATCTGGGACTTGGATGAGACAATCATTGTTTTCCACT
CCTTGCTTACTGGGTCCTACGCCAACAGATATGGGAGGGATCCACCCACT
TCAGTTTCCCTTGGACTGCGAATGGAAGAAATGATTTTCAACTTGGCAGA
CACACATTTATTTTTTAATGACTTAGAAGAATGTGACCAAGTCCATATAG
ATGATGTTTCTTCAGATGATAACGGACAGGACCTAAGCACATATAACTTT
GGAACAGATGGCTTTCCTGCTGCAGCAACCAGTGCTAACTTATGTTTGGC
AACTGGTGTACGGGGCGGTGTGGACTGGATGAGAAAGTTGGCCTTCCGCT
ACAGACGGGTAAAAGAGATTTACAACACCTACAAAAATAATGTTGGAGGT
CTGCTTGGTCCAGCTAAGAGGGAAGCCTGGCTGCAGTTGAGGGCCGAAAT
TGAAGCCCTGACCGACTCCTGGTTGACACTGGCCCTGAAAGCACTCTCGC
TCATTCACTCCCGGACAAACTGTGTGAATATTTTAGTAACAACTACTCAG
CTCATCCCAGCATTGGCGAAAGTCCTGCTGTATGGGTTAGGAATTGTATT
TCCAATAGAAAATATTTACAGTGCAACTAAAATAGGAAAAGAAAGCTGTT
TTGAGAGAATAATTCAAAGGTTTGGAAGAAAAGTGGTGTATGTTGTTATA
GGAGATGGTGTAGAAGAAGAACAAGGAGCAAAAAAGCACGCGATGCCCTT
CTGGAGGATCTCCAGCCACTCGGACCTCATGGCCCTGCACCATGCCTTGG
AACTGGAGTACCTGTAACAGCGCTCGGCACTTTGACAGCGCACAGCTGCT
CTGTGACCAGGGACAGATCCAGCAGGCCCCAGTCTCGCATCAGCGCCGGC
CTCCAGAACTTAGCAATTTCCGCCTGGTGATGCGCAGTTGCTGTCAGTCT
TGACCTCTGCCTTTGTGGTGAATGGAGGACCACGTCTATTTCATCAGAAC
AGCTGTTGACTCTAGTACTGTGAATCCAGTGAAAATAAGCCATGAGAATG
TTTTAGCACAGCGTTATGTGTCTGCCACATTAACTACACGGTTCAAACCT
GTGAAGAAAGGACCTGCAAACGCTTCAGTTGTTAGCATTTTCAATGTGAT
ATAAACAGCTTCTCCAATACAGCAAACCTAATTGCACAACAGAGACTGAA
ATGTGTTTCCTGAATACCAGTGGAGGAATTTTCTTGTAAAGAAGGTTTAC
TTTTTGGTGTCTCATACCCAGGGTAATCTGTACATCTCTACTTATTTATG
AACAGACTTTTTTTAAAAAGATAAAAAAACAGCTTTATTGAGGTATAATT
CACCCACCAGACTTTTTTAAACATCAAATAATTGAAGAGACAATAGCATT
AGAAATAAGTGATTAAAGGCCTCTGCCTCACAACATGGCAAGTACAGTAC
TTTGAATTTTAGCACATTGCATAATAGTTTTAAGTATGTCTAATTTAAAC
GTATAATATGTACATCACTGAGACAATCATGTACAGAAAGAATTTTTGGT
GTAAATTTGTAATAATGGATAATTCTTTTACATATTGTTTAGGGAAATGA
TATTGAAAGGTAGCAATGCCTGGATAGTGAAGCATGAGGCAGCACGTGCA
CAAATTCATGTGCCGTGCCTTATCTGAGTTTTCGGTATAAATATGTAGAT
AATGGATTTTTTTTTAGATAATGTTGTCAAGACCAAAAGCATGGATGTCA
AGTGTCAGTAAGGATTTTGTTTTCTAAAATTTTTTCCTGCATCAGTTCTT
CTGAGGGCCTTGATGAAATAACACAGCAGTTTCTTAAACAATTTGAAACA
AAATGAGCTCTCCTACCACCTCACTTTTTCATTTCCACACTAATGTATTA
TATGTAACTACTTGGAAAAAATAATTATTCAAATGCTTCTTCCCACAAAG
AATATAGATGATAGTAGATATATTTTATTAATAAAATGGTTCATGAATCG
GAGACTAACAAAGTTTTCATGTGCTCAGAATTATTAATTATCGTGTCTGC
ATTTTCTTTCGATAAAGGAAGACACACGATGCTAATCCGGAAATCAGCAA
ACTTTGCATTACTCCCTATGTGCGTATTTTCTCTTTCTTCCTGTCACCCT
GAGGAAGGTTCATTGCCATTGTCATCACCATGGAAACAACGTTCCTCTCC
ACCTGCATTATGTACTACATGACAGGCATCAATCTGGGGAAATAATAAAA
TTATCACCTTTGTCAGACCATAAGAGTTTCTCCAAAAGTGGTCAGTTTGG
CTGGGCAATATTTTCTCTCATCTAACAAACACAATCCATTGTCATGAAAT
TACCCTTAGGATGAGTCTTCTTTAATCAATCATATATTGGGCGGAAAAAA
CACCAGCTTTGACCCGAAGTAGTTGAAGAGCTACTTCATTCTTTTCTGAA
GTTGTGTGTTGCTGCTAGAAATAGTCATTTGTGAATTATCCAAATTGTTT
AAATTCACAATTGAATTAGTTTTTTCTTCCTTTTGGCTTGAAGCAAACAG
TTGACCATTTTTAACCTTTTCATTTTATGTTTTTGTACTCTGCAGACTGA
AAAGACAAAGTTTATCTTGGCCTTACTGTATAAAGGTATGCTGTGTCCAC
CGTTGTGTACAGAATTTTTCTTCATTAATTTTGTGTTTAAGTTAATAAAA
TTTATTTGTGATGTACTGTAA
SEQ ID N° 4 :
TCTCCTTTTTCTCTTTTGGTTAAAAGAGGGCATTGTCGTTCTCAGCCATG
TGCTCTGTATAATTAAGAGCTGACACTGAAGCAGAGTAACAACATCTTCT
AATTTTTTTACCCCTGATCACAGGTGCAAACATCTCAAGCCAGTTCAGAT
GTTGCTGTTTCCTCAAGTTGCAGGTCTATGGAAATGCAGGATCTAACCAG
CCCGCATAGCCGTCTGAGTGGTAGTAGTGAATCCCCCAGTGGCCCCAAAC
TCGGTAACTCTCATATAAATAGTAATTCCATGACTCCCAATGGCACCGAA
GTTAAAACAGAGCCAATGAGCAGCAGTGAAACAGCTTCAACGACAGCCGA
CGGGTCTTTAAACAATTTCTCAGGTTCAGCAATTGGGAGCAGTAGTTTCA
GCCCACGACCAACTCACCAGTTCTCTCCACCACAGATTTACCCTTCCAAC
AGACCATACCCACATATTCTCCCTACCCCTTCCTCACAAACTATGGCTGC
ATATGGGCAAACACAGTTTACCACAGGAATGCAACAAGCTACAGCCTATG
CCACGTACCCACAGCCAGGACAGCCGTACGGCATTTCCTCATATGGTGCA
TTGTGGGCAGGCATCAAGACTGAAGGTGGATTGTCACAGTCTCAGTCACC
TGGACAGACAGGATTTCTCAGCTATGGCACAAGCTTCAGTACCCCTCAAC
CTGGACAGGCACCATACAGCTACCAGATGCAAGGTAGCAGTTTTACAACA
TCATCAGGAATATATACAGGAAATAATTCACTCACAAATTCCTCTGGATT
TAATAGTTCACAGCAGGACTATCCGTCTTATCCCAGTTTTGGCCAGGGTC
AGTACGCACAGTATTATAACAGCTCACCGTATCCAGCACATTATATGACC
AGCAGCAACACCAGCCCAACGACACCATCCACCAATGCCACTTACCAGCT
TCAAGAACCGCCATCTGGCATCACCAGCCAAGCAGTTACAGATCCCACAG
CAGAGTACAGCACAATCCACAGCCCATCAACACCCATTAAAGATTCAGAT
TCTGATCGATTGCGTCGAGGTTCAGATGGGAAATCACGTGGACGGGGCCG
AAGAAACAATAATCCTTCACCTCCCCCAGATTCTGATCTTGAGAGAGTGT
TCATCTGGGACTTGGATGAGACAATCATTGTTTTCCACTCCTTGCTTACT
GGGTCCTACGCCAACAGATATGGGAGGGATCCACCCACTTCAGTTTCCCT
TGGACTGCGAATGGAAGAAATGATTTTCAACTTGGCAGACACACATTTAT
TTTTTAATGACTTAGAAGAATGTGACCAAGTCCATATAGATGATGTTTCT
TCAGATGATAACGGACAGGACCTAAGCACATATAACTTTGGAACAGATGG
CTTTCCTGCTGCAGCAACCAGTGCTAACTTATGTTTGGCAACTGGTGTAC
GGGGCGGTGTGGACTGGATGAGAAAGTTGGCCTTCCGCTACAGACGGGTA
AAAGAGATTTACAACACCTACAAAAATAATGTTGGAGGTCTGCTTGGTCC
AGCTAAGAGGGAAGCCTGGCTGCAGTTGAGGGCCGAAATTGAAGCCCTGA
CCGACTCCTGGTTGACACTGGCCCTGAAAGCACTCTCGCTCATTCACTCC
CGGACAAACTGTGTGAATATTTTAGTAACAACTACTCAGCTCATCCCAGC
ATTGGCGAAAGTCCTGCTGTATGGGTTAGGAATTGTATTTCCAATAGAAA
ATATTTACAGTGCAACTAAAATAGGAAAAGAAAGCTGTTTTGAGAGAATA
ATTCAAAGGTTTGGAAGAAAAGTGGTGTATGTTGTTATAGGAGATGGTGT
AGAAGAAGAACAAGGAGCAAAAAAGCACGCGATGCCCTTCTGGAGGATCT
CCAGCCACTCGGACCTCATGGCCCTGCACCATGCCTTGGAACTGGAGTAC
CTGTAACAGCGCTCGGCACTTTGACAGCGCACAGCTGCTCTGTGACCAGG
GACAGATCCAGCAGGCCCAGTCTCGCATCAGCGCCGGCCTCCAGAACTTA
GCAATTTCCGCCCTGGTGATGCGCAGTTGCTGTCAGTCTTGACCTCTGCC
TTTGTGGTGAATGGAGGACCACGTCTATTTCATCAGAACAGCTGTTGACT
CTAGTACTGTGAATCCAGTGAAAATAAGCCATGAGAATGTTTTAGCACAG
CGTTATGTGTCTGCCACATTAACTACACGGTTCAAACCTGTGAAGAAAGG
ACCTGCAAACGCTTCAGTTGTTAGCATTTTCAATGTGATATAAACAGCTT
CTCCAATACAGCAAACCTAATTGCACAACAGAGACTGAAATGTGTTTCCT
GAATACCAGTGGAGGAATTTTCTTGTAAAGAAGGTTTACTTTTTGGTGTC
TCATACCCAGGGTAATCTGTACATCTCTACTTATTTATGAACAGACTTTT
TTTAAAAAGATAAAAAAACAGCTTTATTGAGGTATAATTCACCCACCAGA
CTTTTTTAAACATCAAATAATTGAGGAGACAATAGCATTAGAAATAAGTG
ATTAAAGGCCTCTGCCTCACAACATGGCAAGTACAGTACTTTGAATTTTA
GCACATTGCATAATAGTTTTAAGTATGTCTAATTTAAACGTATAATATGT
ACATCACTGAGACAATCATGTACAGAAAGAATTTTTGGTGTAAATTTGTA
ATAATGGATAATTCTTTTACATATTGTTTAGGGAAATGATATTGAAAGGT
AGCAATGCCTGGATAGTGAAGCATGAGGCAGCACGTGCACAAATTCATGT
GCCGTGCCTTATCTGAGTTTTCGGTATAAATATGTAGATAATGGATTTTT
TTTTAGATAATGTTGTCAAGACCAAAAGCATGGATGTCAAGTGTCAGTAA
GGATTTTGTTTCTAAAATTTTTTCCTGCATCAGTTCTTCTGAGGGCCTTG
ATGAAATAACACAGCAGTTTCTTAAACAATTTGAAACAAAATGAGCTCTC
CTACCACCTCACTTTTTCATTTCCACACTAATGTATTATATGTAACTACT
TGGAAAAAATAATTATTCAAATGCTTCTTCCCACAAAGAATATAGATGAT
AGTAGATATATTTTATTAATAAAATGGTTCATGAATCGGAGACTAACAAA
GTTTTCATGTGCTCAGAATTATTAATTATCGTGTCTGCATTTTCTTTCGA
TAAAGGAAGACACACGATGCTAATCCGGAAATCAGCAAACTTTGCATTAC
TCCCTATGTGCGTATTTTCTCTTTCTTCCTGTCACCCTGAGGAAGGTTCA
TTGCCATTGTCATCACCATGGAAACAACGTTCCTCTCCACCTGCATTATG
TACTACATGACAGGCATCAATCTGGGGAAATAATAAAATTATCACCTTTG
TCAGACCATAAGAGTTTCTCCAAAAGTGGTCAGTTTGGCTGGGCAATATT
TTCTCTCATCTAACAAACACAATCCATTGTCATGAAATTACCCTTAGGAT
GAGTCTTCTTTAATCAATCATATATTGGGCGGAAAAAACACCAGCTTTGA
CCCGAAGTAGTTGAAGAGCTACTTCATTCTTTTCTGAAGTTGTGTGTTGC
TGCTAGAAATAGTCATTTGTGAATTATCCAAATTGTTTAAATTCACAATT
GAATTAGTTTTTTCTTCCTTTTGGCTTGAAGCAAACAGTTGACCATTTTT
AACCTTTTCATTTTATGTTTTTGTACTCTGCAGACTGAAAAGACAAAGTT
TATCTTGGCCTTACTGTATAAAGGTATGCTGTGTCCACCGTTGTGTACAG
AATTTTTCTTCATTAATTTTGTGTTTAAGTTAATAAAATTTATTTGTGAT
GTACTGTAA
SEQ ID N° 5 :
5'-GGAAATGGTAGAACTAGTGATCTCACCCAGCCTCACTGTAAACAGCGATT
GTCTGGATAAACTGAAGTTTAACCGTGCTGACGCTGCTGTGTGGACTCTG
AGTGACAGACAAGGCATCACCAAATCGGCCCCCCTGAGAGTGTCCCAGCT
CTTCTCCAGATCTTGCCCACGTGTCCTCCCCCGCCAGCCTTCCACAGCCA
TGGCAGCCTACGGCCAGACGCAGTACAGTGCGGGGATCCAGCAGGCTACC
CCCATTACAGCTTACCCACCTCCAGCACAAGCCTATGGAATCCCTTCCTA
CAGCATCAAGACAGAAGACAGCTTGAACCATTCCCCTGGCCAGAGTGGAT
TCCTCAGCTATGGCTCCAGCTTCAGCACCTCACCCACTGGACAGAGCCCA
TACACCTACCAGATGCACGGCACAACAGGGTTCTATCAAGGAGGAAATGG
ACTGGGCAACGCAGCCGGTTTCGGGAGTGTGCACCAGGACTATCCTTCCT
ACCCCGGCTTCCCCCAGAGCCAGTACCCCCAGTATTACGGCTCATCCTAC
AACCCTCCCTACGTCCCGGCCAGCAGCATCTGCCCTTCGCCCCTCTCCAC
GTCCACCTACGTCCTCCAGGAGGCATCTCACAACGTCCCCAACCAGAGTT
CCGAGTCACTTGCTGGTGAATACAACACACACAATGGACCTTCCACACCA
GCGAAAGAGGGAGACACAGACAGGCCGCACCGGGCCTCCGATGGGAAGCT
CCGAGGCCGGTCTAAGAGGAGCAGTGACCCGTCCCCGGCAGGGGACAATG
AGATTGAGCGTGTGTTCGTGTGGGACTTGGATGAGACAATAATTATTTTT
CACTCCTTACTCACGGGGACATTTGCATCCAGATACGGGAAGGACACCAC
GACGTCCGTGCGCATTGGCCTTATGATGGAAGAGATGATCTTCAACCTTG
CAGATACACATCTGTTCTTCAATGACCTGGAGGATTGTGACCAGATCCAC
GTTGATGACGTCTCATCAGATGACAATGGCCAAGATTTAAGCACATACAA
CTTCTCCGCTGACGGCTTCCACAGTTCGGCCCCAGGAGCCAACCTGTGCC
TGGGCTCTGGCGTGCACGGCGGCGTGGACTGGATGAGGAAGCTGGCCTTC
CGCTACCGGCGGGTGAAGGAGATGTACAATACCTACAAGAACAACGTTGG
TGGGTTGATAGGCACTCCCAAAAGGGAGACCTGGCTACAGCTCCGAGCTG
AGCTGGAAGCTCTCACAGACCTCTGGCTGACCCACTCCCTGAAGGCACTA
AACCTCATCAACTCCCGGCCCAACTGTGTCAATGTGCTGGTCACCACCAC
TCAACTAATTCCTGCCCTGGCCAAAGTCCTGCTATATGGCCTGGGGTCTG
TGTTTCCTATTGAGAACATCTACAGTGCAACCAAGACAGGGAAGGAGAGC
TGCTTCGAGAGGATAATGCAGAGATTCGGCAGAAAAGCTGTCTACGTGGT
GATCGGTGATGGTGTGGAAGAGGAGCAAGGAGCGAAAAAGCACAACATGC
CTTTCTGGCGGATATCCTGCCACGCAGACCTGGAGGCACTGAGGCACGCC
CTGGAACTGGAGTATTTATAGCAGGATCAGCAGCATCTCCACCTGCCAT-3'
SEQ ID N° 6 :
5'-ACTCTAGAAGGACCCACTCACTATAGGGCTCGAGCGGCCGCCCGATTGGT
TCTACTGTGGGTCTGGACTGATCTCCATGTCCTGTTGTGGGGCTTTTACA
GCCTTTGGATTGTGAAAACTGCTGAGAGAGACTTGCAATCCAGTCACATA
AGTATAATAAAGAAATATTGGTCCTCATGGAAGAAGAGCAAGATTTACCA
GAGCAACCAGTGAAAAAAGCCAAGATGCAGGAATCAGGAGAGCAAACTAT
AAGTCAAGTAAGCAATCCAGATGTCAGTGATCAGAAGCCTGAAACATCAA
GCCTTGCTTCAAACCTTCCCATGTCAGAGGAAATTATGACATGCACCGAT
TACATCCCTCGCTCATCCAATGATTATACCTCACAAATGTATTCTGCAAA
ACCTTATGCACATATTCTCTCAGTTCCTGTTTCGGAAACTGCTTACCCTG
GACAGACTCAATACCAGACACTACAGCAGACTCAACCCTATGCTGTCTAC
CCTCAGGCAACCCAAACGTATGGACTACCTCCTTTTGGTGCATTGTGGCC
AGGTATGAAACCTGAAAGTGGTTTAATTCAGACTCCATCTCCAAGTCAAC
ACAGTGTTCTTACCTGCACTACAGGGTTAACCACAAGCCAGCCAAGCCCA
GCACATTATTCTTATCCCATTCAAGCTTCAAGCACAAATGCCAGCCTGAT
ATCTACTTCTTCTACAATTGCCAATATTCCAGCAGCAGCAGTAGCCAGCA
TCTCAAACCAGGATTATCCCACCTATACTATTCTTGGTCAGAATCAGTAC
CAGGCCTGCTACCCCAGCTCCAGCTTTGGAGTCACAGGTCAGACTAACAG
TGATGCAGAGAGCACCACATTAGCAGCAACCACATACCAGTCGGAGAAGC
CTAGTGTCATGGCGCCTGCACCTGCAGCACAGAGACTTTCCTCTGGAGAC
CCTTCTACAAGTCCATCTTTGTCCCAGACTACACCAAGTAAAGATACTGA
TGATCAGTCCAGGAAAAACATGACTAGCAAGAACCGGGGCAAGAGGAAAG
CTGATGCCACTTCTTCCCAAGACAGTGAATTAGAACGGGTATTTCTGTGG
GACTTGGATGAAACCATCATCATCTTCCACTCACTTCTTACTGGATCCTA
TGCCCAGAAATATGGAAAGGACCCAACAGTAGTGATTGGCTCAGGTTTAA
CAATGGAAGAAATGATTTTTGAAGTGGCTGATACTCATCTATTTTTCAAT
GACTTAGAGGAGTGTGACCAGGTACATGTGGAAGATGTGGCTTCTGATGA
CAATGGCCAAGACTTGAGCAACTACAGTTTCTCAACAGATGGTTTCAGTG
GCTCAGGAGGTAGTGGCAGCCATGGTTCATCTGTGGGTGTTCAGGGAGGT
GTGGACTGGATGAGGAAACTAGCTTTCCGCTACCGGAAAGTGAGAGAAAT
CTATGATAAGCATAAAAGCAACGTGGGTGGTCTCCTCAGTCCCCAGAGGA
AGGAAGCACTGCAGAGATTAAGAGCAGAAATTGAAGTTTTAACAGATTCC
TGGTTAGGAACTGCATTAAAGTCCTTACTTCTCATCCAGTCCAGAAAGGA
TTGTGTGAATGTTCTGATCACTACCACCCAGCTGGTTCCAGCCCTGGCCA
AGGTTCTCCTATATGGACTAGGAGAAATATTTCCTATTGAGAACATCTAT
AGTGCTACCAAAATTGGTAAGGAGAGCTGCTTTGAGAGAATTGTGTCAAG
GTTTGGAAAGAAAGTCACATATGTAGTGATTGGAGATGGACGAGATGAAG
AAATTGCAGCCAA-3'
SEQ ID N° 7 :
MLLFPQVAVKTEPMSSSETASTTADGSLNNFSGSAIGSSSFSPRPTNQFS
PPQIYPSNRPYPHILPTPSSQTMAAYGQTQFTTGMQQATAYATYPQPGQP
YGISSYGALWAGIKTEGGLSQSQSPGQTGFLSYGTSFSTPQPGQAPYSYQ
MQGSSFTTSSGIYTGNNSLTNSSGFNSSQQDYPSYPSFGQGQYAQYYNSS
PYPAHYMTSSNTSPTTPSTNATYQLQEPPSGITSQAVTDPTAEYSTIHSP
STPIKDSDSDRLRRGSDGKSRGRGRRNNNPSPPPDSDLERVFIWDLDETI
IVFHSLLTGSYANRYGRDPPTSVSLGLRMEEMIFNLADTHLFFNDLEECD
QVHIDDVSSDDNGQDLSTYNFGTDGFPAAATSANLCLATGVRGGVDWMRK
LAFRYRRVKEIYNTYKNNVGGLLGPAKREAWLQLRAEIEALTDSWLTLAL
KALSLIHSRTNCVNILVTTTQLIPALAKVLLYGLGIVFPIENIYSATKIG
KESCFERIIQRFGRKVVYVVIGDGVEEEQGAKKHAMPFWRISSHSDLMAL
HHALELEYL
SEQ ID N° 8 :
MEMQDLTSPHSRLSGSSESPSGPKLGNSHINSNSMTPNGTEVKTEPMSSS
ETASTTADGSLNNFSGSAIGSSSFSPRPTHQFSPPQIYPSNRPYPHILPT
PSSQTMAAYGQTQFTTGMQQATAYATYPQPGQPYGISSYGALWAGIKTEG
GLSQSQSPGQTGFLSYGTSFSTPQPGQAPYSYQMQGSSFTTSSGIYTGNN
SLTNSSGFNSSQQDYPSYPSFGQGQYAQYYNSSPYPAHYMTSSNTSPTTP
STNATYQLQEPPSGITSQAVTDPTAEYSTIHSPSTPIKDSDSDRLRRGSD
GKSRGRGRRNNNPSPPPDSDLERVFIWDLDETIIVFHSLLTGSYANRYGR
DPPTSVSLGLRMEEMIFNLADTHLFFNDLEECDQVHIDDVSSDDNGQDLS
TYNFGTDGFPAAATSANLCLATGVRGGVDWMRKLAFRYRRVKEIYNTYKN
NVGGLLGPAKREAWLQLRAEIEALTDSWLTLALKALSLIHSRTNCVNILV
TTTQLIPALAKVLLYGLGIVFPIENIYSATKIGKESCFERIIQRFGRKVV
YVVIGDGVEEEQGAKKHAMPFWRISSHSDLMALHHALELEYL
SEQ ID N° 9 :
MVELVISPSLTVNSDCLDKLKFNRADAAVWTLSDRQGITKSAPLRVSQLF
SRSCPRVLPRQPSTAMAAYGQTQYSAGIQQATPYTAYPPPAQAYGIPSYS
IKTEDSLNHSPGQSGFLSYGSSFSTSPTGQSPYTYQMHGTTGFYQGGNGL
GNAAGFGSVHQDYPSYPGFPQSQYPQYYGSSYNPPYVPASSICPSPLSTS
TYVLQEASHNVPNQSSESLAGEYNTHNGPSTPAKEGDTDRPHRASDGKLR
GRSKRSSDPSPAGDNEIERVFVWDLDETIIIFHSLLTGTFASRYGKDTTT
SVRIGLMMEEMIFNLADTHLFFNDLEDCDQIHVDDVSSDDNGQDLSTYNF
SADGFHSSAPGANLCLGSGVHGGVDWMRKLAFRYRRVKEMYNTYKNNVGG
LIGTPKRETWLQLRAELEALTDLWLTHSLKALNLINSRPNCVNVLVTTTQ
LIPALAKVLLYGLGSVFPIENIYSATKTGKESCFERIMQRFGRKAVYVVI
GDGVEEEQGAKKHNMPFWRISCHADLEALRHALELEYL
SEQ ID N° 10 :
MEEEQDLPEQPVKKAKMQESGEQTISQVSNPDVSDQKPETSSLASNLPMS
EEIMTCTDYIPRSSNDYTSQMYSAKPYAHILSVPVSETAYPGQTQYQTLQ
QTQPDAVYPQATQTYGLPPFGALWPGMKPESGLIQTPSPSQHSVLTCTTG
LTTSQPSPAHYSYPIQASSTNASLISTSSTIANIPAAAVASISNQDYPTY
TILGQNQYQACYPSSSFGVTGQTNSDAESTTLAATTYQSEKPSVMAPAPA
AQKLSSGDPSTSPSLSQTTPSKDTDDQSRKNMNSKNRGKKKADATSSQDS
ELERVFLWDLDETIIIFHSLLTGSYAQKYGKDPTVVIGSGLTMEKMIFEV
ADTHLFSNDLKECDQVHVEDVAPNDKGQNLNNYSFSTNGFSGSGGSGSHG
SSVGVQGGVDWMRKLAFRYRKVREIYDKHKSNVGGLLSPQRKEALQKLKA
EIEVLTNSWLGTALKSLLLIQSKKNCVNVLITTTQLLPALAKVLLYGLGK
IFPIENIYSATKIGKESCFERIVTSLGKKLTYVVIGDGRDEEIAAK
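The correspondence between the nucleotide listings and the polypeptide listings can be checked by translating a stretch of DNA in the appropriate reading frame. The following is a minimal sketch, assuming the standard genetic code; the names `CODON_TABLE` and `translate` are hypothetical helpers introduced for illustration only. The 24-nucleotide input is the stretch of SEQ ID N° 6 that begins ATGGAAGAAGAG, which should give the first eight residues of SEQ ID N° 10.

```python
# Standard genetic code, built from the conventional TCAG ordering.
# Assumption: no alternative (e.g. mitochondrial) code applies.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    b1 + b2 + b3: AMINO[16 * i + 4 * j + k]
    for i, b1 in enumerate(BASES)
    for j, b2 in enumerate(BASES)
    for k, b3 in enumerate(BASES)
}


def translate(dna, frame=0):
    """Translate a DNA string (5'->3') starting at offset `frame`.

    Stop codons are returned as '*'; codons containing characters
    other than A, C, G, T are returned as 'X'.
    """
    dna = dna.upper()
    codons = (dna[i:i + 3] for i in range(frame, len(dna) - 2, 3))
    return "".join(CODON_TABLE.get(codon, "X") for codon in codons)


if __name__ == "__main__":
    # Stretch of SEQ ID N° 6 starting at ATGGAAGAAGAG...
    print(translate("ATGGAAGAAGAGCAAGATTTACCA"))  # MEEEQDLP
```

The same helper can be run over any line of a nucleotide listing, trying `frame` values 0, 1 and 2, to confirm that an amino acid listing was transcribed correctly.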
SEQ ID N° 11 :
5'-CTCAGCCATGTGCTCTGTATAATTAAGAGC-3'
SEQ ID N° 12 :
5'-AGGCACGGCACATGAATTTGTGCACGTGCT-3'
SEQ ID N° 13 :
5'-ATGAACAAGCACGAGCATTGC-3'
SEQ ID N° 14 :
5'-GCAATGCTCGTGCTTGTTCAT-3'
SEQ ID N° 15 :
5'-AGGCTAATCTTGGCACCATGG-3'
SEQ ID N° 16 :
5'-CACTGCTGTTTACGTAGCAGG-3'
SEQ ID N° 17 :
5'-TGAATAACAGCTTTCTCAGCC-3'
SEQ ID N° 18 :
5'-GACTATATAGTTCTTCTCCATT-3'
SEQ ID N° 19 :
5'-CTTTCAGCCTCTCCCAATGC-3'
SEQ ID N° 20 :
5'-ACCAACAAACTCCTGTCTCAC-3'
SEQ ID N° 21 :
5'-ACCTACTGATTGACATAGTTGA-3'
SEQ ID N° 22 :
5'-ACTATAAAAGGGAGATGGTCAC-3'
SEQ ID N° 23 :
5'-GTGACCATCTCCCTTTTATAGT-3'
SEQ ID N° 24 :
5'-TGCTGAGGTACTGGTGGTAA-3'
SEQ ID N° 25 :
5'-AAATCTGGAGGCTGGTATTC-3'
SEQ ID N° 26 :
5'-TGCTTTATGTTTCTCTTACGTC-3'
SEQ ID N° 27 :
5'-AGAGTACTGCACATATTCATCA-3'
SEQ ID N° 28 :
5'-TGCTGTGGCACATACAACCC-3'
SEQ ID N° 29 :
5'-TGCTGTGGCACATACAACCC-3'
SEQ ID N° 30 :
5'CACTGAAGCAGAGTAACAACA3'
SEQ ID N° 31 :
5'CCAACAGAGGCTGTTACTATT3'
SEQ ID N° 32 :
5'GGGACTTTTGTGCAAGTGGT3'
SEQ ID N° 33 :
5'GACAACTGAAATCATAACCAC3'
SEQ ID N° 34 :
5'TTATGATATATGTTCAGTTAGGG3'
SEQ ID N° 35 :
5'CATACACAGGGACATTACATG3'
SEQ ID N° 36 :
5OGCAGGTCACAAAGACCAAA3'
SEQ ID N° 37 :
5'AGATGGAACATGTGGGCACA3'
SEQ ID N° 38 :
5OTGATGTGGTTGTTAATCGGT3'
SEQ ID N° 39 :
5'ACACAGAAGGTGACAACACTG3'
SEQ ID N° 40 :
5'GAGATAAGATTGGGGAAGCAT3'
SEQ ID N° 41 :
5'CCAATCCAGTTGCCATCATC3'
SEQ ID N° 42 :
5'GCTATTTTCCTGTACCCACATT3'
SEQ ID N° 43 :
5'GAAAGCTCTCACTTATAAACAG3'
SEQ ID N° 44 :
5'GGCTCAGAAACCCAAACATAC3'
SEQ ID N° 45 :
5'GTGCAACCACTGCATGAATAT3'
SEQ ID N° 46 :
5'AGCTGGCATTTCAATGATACT3'
SEQ ID N° 47 :
5'GTGGCAGACACATAACGCTG3'
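Some of the oligonucleotides listed above are exact reverse complements of one another (SEQ ID N° 13 and N° 14; SEQ ID N° 22 and N° 23), that is, they correspond to the same site read on opposite strands. The following is a minimal sketch of how such pairs can be checked; the function name `reverse_complement` is a hypothetical helper, and only the four unambiguous bases are handled.

```python
# Complement map for unambiguous DNA bases (upper and lower case).
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq):
    """Return the reverse complement of a DNA string written 5'->3'."""
    return seq.translate(_COMPLEMENT)[::-1]


if __name__ == "__main__":
    seq_13 = "ATGAACAAGCACGAGCATTGC"
    seq_14 = "GCAATGCTCGTGCTTGTTCAT"
    # True: SEQ ID N° 14 is the reverse complement of SEQ ID N° 13.
    print(reverse_complement(seq_13) == seq_14)
```

Applied to a primer list, the same check gives a quick way to tell forward/reverse pairs that flank a region from pairs that overlap the same site.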
ANNEX 3
Untitled Sequence # 1 -> Full Restriction Map
DNA sequence 3731 b.p. TCTCCTTT TCT ... GATGTACTGTAA linear
Positions of Restriction Endonucleases sites (unique sites underlined)
26 47 68
50
ScrF I Hph I Dde I EcoR II Eco57 I
BsmA I BstN I Alu I Alu I
SfaN I Eco57 I Mae III Mae III Dde I HinD ITT
AGG ICATCAAGACITGAAGGTGGATTGITCACACTICTI II I I I II I 560
TCCGTAGTTCTGACTTCCACCTAACAGTGTCΑGAGTCAGT^
4 I83 - 4I92 5I05 -I I 5I1I7-I . 5I3-9I - 5I5I3I -
511 521 542 554
513 521 556
518 521 ScrF I EcoR II Mnl I Nla IV Rsa I BstN I Ban I Alu I SfaN I
G ITACCCICTCAACICTGGACAOI-CACCATACAIGCTACCA^I 640
CATGGGGAGTTGGACCTGTCCGTGGTATC_TCGATGGT^
5 I61 I • 5I72 5I80 5I90 5I98• 566 580
572 572
Sec I ScrF I EcoR II BstN I Hae III
Xmn I Mnl I Mse I Bsr I Eas_I
GG IAAATAATTCACTCACAAATTCICTCrc.GATTITAA^ I II I 720
CCTTTATTAAGTGAGTGTTTAAGGAGACCrAAATTATCAA^
6 I42 ♦ - 6I63 ' 6I72 • • 7I05 • 7I1I2I
712 712 713 715
746 774
Sau3A I Mbo I
Dpn I Alw I Hph I Mae III Alu I SfaN I Tthlll II BstY I Rsa I
CC^CCAATGCCACTTACCA IGCTTCAAGAACCGCCATCTO^I I I I M I 880
GGTGGTTACGGTGAATGGTCGAAGTTCTTGGCGGTAGACCGTAGTGGTCGGT^
|. . I I *| I • II • I ■
819 840 851 862 877
843 857
863 863 863 863
cia i
Sau3A I
Hinf I Taq I Mnl I Mae II
Hinf I Mbo I Taq I Pml I
Mse I AlwN I Don I Hσa I BsaA I
AGCACAATCCACAGCCCATCAAC^CCCAT ITAAAGIATTCIAGIATTC^III I I I II 960
TCGTGTTAGGTGTCGGGTAGTTGTGGGTAATTTCTAAGTCTAAGACT
9 I0-9 I 9I18I 9I2I6I • 9I34I I- • 9I5I7•
914 926 937 957
920 928 939 958
926 927 Mbo II Sau3A I
Hae III Mbo I
Sau96 I Mnl I Hinf I BsmA I
Nla IV Hph I AlwN I Dpn I Fok I
TGGACG IGIGIGCCGIAAGAAACAATAATCCTTICACICTCCCCCIAGIATTCTGIATCTTGAGAG^ I I 1040
ACCTGCCCCGGCririTiυi ATTAGGAAGTGG
9 I6I6I • I • 9I8-9I 9l9-9l 1I007• • - 1I036I
967 992 1001 1040
968 1007
972 1007
Sau3A I
Mbo I
Dpn I
Sau96 I Al I
Ava II Nla IV
PpuM I BstY I
Nla IV BamH I
ECOO109 I Alw I Sty I
Bsr I Mnl I Eco57 I Sec I
AGACAATCATTGTΓΓTCCACTCCTTGCΠA ICIK-I^I^ I II I I 1120
TCTGTTAGTAACAAAAGGTGAGGAACGAATGACCCAGGATGCGGT^
1 I070II 1I096II 1I110 1I120
1073 1099 1120
1073 1099
1073 1099
1074 1099
1074 1100
1100 1100 1100
Mbo II Tthlll I Mbo II Mse I Dde I Mae III
CTTGGACK^GAATGG IAAGAAATGATTTTCAACTTGG I I I I I 1200
GAACCTGACGCTΓACCTTCTT ACTAAAAGTTGAACCGTCTGTGTGTAAATAAA
I . . . i . i i . i i .
1136 1176 1183 1194
1187 1196 • Dde I Sau96 I Ava II Eco57 I PpuM I Fnu4H I
Mbo II EcoO109 I Bbv I
AGTCCATATAGATGATGΥ-TCTTCAGATGATAAC 1280
TCAGGTATATCTACTAOUUGAAGTCTACTATTGCCT^
1 i2i20 • 1I2I39 I • • • , 1 i280
1221 1239 1280
1240 1240
1244
11 y
nu4H I
Bbv I Bsr I Fok I Hae III
;t I Bsr I Tthlll II Rsa I Bsr I Hae I
CTGCAGCAACCAGTGCTAACTTATGTTTGGCAACTGGTGTA^ 1360
GACGTCGTTGGTCACGATTGAATACAAACCGTTGACCACAT^
I I I - l - l l' - I I - II
1281 1290 1304 1319 1335 1350
1283 1313 1338 1351
1283
Alu I Sau96 I ScrF I
Mnl I Ava II Mnl I EcoR II
Mme I Tthlll II Dde I BstN I
I I I I I I I I
TACAGACGGGTAAAAGAGATTTACAACACCTACAAAAATAATGTTGGAG 1440
AT ?IX rGCCCATTTTCTCTAAATGTTGTGGATGTTTTTATTAC^
• 1I403I • 1I413 l- l 1l425I' 1I437• 1407 1418 1429 1437 1418 1437
1423 ScrF I EcoR II Pst I Hae III BstN I Sau96 I
Fnu4H I Sau96 I Pie I Bsr I
Bbv I Mnl I Hinf I HinC II Hae III
G ICITGCAGTTGIAGIGIGCCGAAATTGAAGCCCTGACCGIACTCICT^I^ I I 1520
CGACGTCAACT XCGGCTTTAACTTCGGGACπ -KX- GAGGACCAACTG^
1 I4I41 1I45I0I - 1I475I- 1I483 I- 1I492
1441 1452 1475 1489
1442 1453 1479 1492
1479 1479
Msp I
Hpa II ScrF I
Nci I Alu I
Ben I Ssp I Mae III Dde I Fok I
C ICICGGAC-AAACTGTGTGAIATATTTTAGITAAC-AACTACIT^I I 1600
GGGCCTGTTTGACACACTTATAAAATCATTGTTGATGAGTCGACT
1 I5I21 • 1 I53 •8 1 I547 • 1 I557 I 1 I564
1521 1560
1521
1522
1522
Ssp I Alu I
GGAATTGTATTTCCAATAGAAAATATTTTACAGTGCAACTAA^ 1680
CCTTAACATAAAGGTΓATCTI'.ΎATAAATGTCACGTT^
• • j • • • I • * •
1622 1655
Mbo II SfaN I
Mbo II Mbo II BstU I EcoN I
I I I I I I
GTTTGGAAGAAAAGTGGTGTATGTTGTTATAGGAGATGGTGTAGA 1760
CAAACCTTCTM^CCACATAC-AACAATATCCTCTA
I . . . . i i . . I l l *
1686 1724 1750 1758
1727 1753
Sau3A I
Mbo I
Dpn I HinP I
BstY I Nla III Sty I Hha I
Alw I Mnl I Sec I Rsa I Hae II
Mnl I Sau96 I Sau96 I Nla III Gsu I Eco47 TTT
Gsu I Gsu I Ava II Hae III pfiM r Bsr I Mae III
TCTGGAGGATCTCCAGCCACTCGGACCr ATGGCCCTGCACCATGCCTT^ 1840
AGACCTCCTAGAGGTCGGTGAGCCTGGAGTACCGGGACGTGGTACGGAA^
1762 1771 1783 1792 1801 1813 1825
1765 1783 1792 1802 1814 1830
1767 1786 1806 1819 1830
1767 1789 1806 1831
1768 1831
1768
1768
Hae III Msp I
Fnu4H I Hpa II
Alu I Sec I Sau3A I Nae I
Pvu II ScrF I Mbo I BsmA I HinP I
NspB II EcoR II Dpn I Bsr I Hha I Gsu I
HinP I BstN I Alw I Sau96 I Hae II Mnl I
Hha I Bbv I Mae III BstY I Hae III SfaN I CfrlO T Dde I
I III I I II I I I I II II I II I
CTTTGACAGCGCACAGCTGCTCTGTGACCAGGGACAGATCCAGCAGGCC 1920
GAAACTGTCGCGTGTCGACGAGACACTGGTCCCTGTCTAG
1 I8-49 II 1I856• 1I864I • 1I8I76• 1I886I- I 1I897• II 1I9I05I II 1I9-19
1849 1868 1877 1886 1902 1910
1854 1868 1877 1889 1903 1911
1854 1868 1877 1892 1903
1855 1868 1877 1905
1856 1906 1906 1908
• 1I9I33I 1I94I0I I • • 1I965 1I97•8 1I98I6I- 1I993
1934 1942 1988
1934 1943 1988
1934 1943 1989
1937 1946 HinC II Sea I Alu I Pie I Rsa I Pvu II Hinf I Bsr I
NspB II Mae I Hinf I Nla III
TTCATCAGAAC IAIGCTG ITTG IACTC ITA IGITACTGTG IAA I I 2080
AACTAGTCTTGTCGACAACTGACiATCATGACACTTAGGTO
•II I I- I II • I I • - I
2011 2023 2033 2052
2011 2019 2037
2012 2019 2026 2016 2025
Sau96 I Ava II PpuM I EcoO109 I Mse I Mbo II BspM I Eco57 I
GTCTGCCACAT ITAACTACACGGTTCAAACCTGTGIAAGAAAIGIGAICCT^ I 2160
CAGACGGTGTAATTtaTGTGCα^AGTTTGGACA riCITI^CTGGAC
•I • • I II I • I
2091 2114 2123 2134
2120 2120
2121
BsmA I Mnl I
Alu I AlwN I Bsr I
TATAAACA IGCTTCTCCAATACAGCAAACCTAATTGCACAACIAGIAGA^ I I 2240
ATATTTGT ICG.AAGAGGTTAT.GTCGTTTGGA.TTAACGTGTT.G|TCITCTGACT.TTAC^ . I . i 2168 2201 2228
2203 2234
Sec I ScrF I EcoR II BstN I
Dra I Alu I Mnl I Hph I Dra I Mnl I
I I I I I M I I
TTTTTAAAAAGATAAAAAAACAGCITTATTGAGGTATAATTCARC^ 2400
AAAAATTTTTCTATTTTΠ-ICTCGAAATAAC
I I - I - I - I • I I • - I I -
2323 2342 2351 2361 2377 2395
2324 2378 2398
Hae III Stu I Rsa I
Hae I Mnl I Rsa I
Mse I Mnl I Nla III Sea I
ACAATAGCATTAGAAATAAGTGAT ITAAAIGIGCICTCTGCICTCACAACIATGGCAAGI^ II 2480
TGTTATCGTAATCTTTATTCACTAATTTCCGGAGACGGAGT^^
• 2I424ll-2l431 I • 2I445 • I 2I4I57* 2428 2437 2453
2428 2458
2429 Mae II Mse I BsmA I Rsa I
Mse I Dra I Rsa I Dde I Nla III
CATAATAGTTT ITAACTATGTCTAATITITAA^I I I I I I 2560
GTATTATCAAAATT<-ATAC»GATTAAATTTGCATATT^^
• 2I491 - 2I5I05I -2I521 2l5-2l9 2I5-39I
2506 2531 2542
2510
ScrF I EcoR II BstN I
TGTAAATTTCΓAATAATGGATAATTCTTTTAΓΛTAT^ I 2640
AC^TTTAAACATTATTACCTATTAAGAAAATGTATAAOU^
• • • • • • I •
2630 2630 2630
HgiA I Bspl286 I Mae II Fnu4H I Bbv I &paτ, τ Mnl I Pml I Nla III BsaA I Nla III Dde I
I I I II I I I
AAGCATGAGGCAGCACGTGCACAAATTCATGTGCCGTGCCπ^ 2720
TTCGTACTCCGTCGTGCACGTGTTTAAGTACACGGCACGGAATAGACTCAAAAGCCATATTT
I I I II I • I • I
2644 2654 2668 2685
2647 2654 2650 2657 2650
2655 2657 2657
Nla III Bst T Fok I SfaN I I I I I
TrTπTAGATAATGTTGTCAAGACCAAAAGCATGGATGTC^ 2800
AAAAAATCTATTACAACAGTTCTGGTTTTCGTACCT^^
I ' l l * • • • I •
2744 2754 2799
2751 Hae III Alu I
EcoO109 I ≤ c_X
Mnl I HgiA I
Dde I Bspl286 I
Mbo II Sau96 I Mse I Baα_II Mnl I
I I M M I I I I
ATCAGΓΓCTTCTGAGGGCCITGATGAAATAACAC^^ 2880
TAGTCAAGAAGACTCCCGGAACTACTTTATTGTGTCGT^
I - I M i l • • • I • I I • I -
2807 2815 2844 2865 2879
2811 2865
2813 ' 2865
2814 2865
2816 2866
Mae III Mbo II I I
TCACTTTTTCATTTCCΛCACTAATGTATTATATGT^ 2960
AGTGAAAAAGTAAAGGTGTGATTACATAATATACATTGATGAACCTΓTTΓTATTAATA^
2914 2948
Dde I
Hinf I HgiA I
Mse I Nla III Bspl286 I
Ase I BspH I BsmA I Nla III
AATATAGATGATAGTAGATATATTTTATTAATAAAATGGTTCATGAATCGGAGACT 3040
Msp I
Mse I Mbo II Hpa II Ase I Taq I Bbv TT SfaN I BspM II
TTA 1T1TAATTATCGTGTCTGCATTTTCT TICGATAAAGGIAAGACACACGIATGCTA^II 3120
AATAATTAATAGCAC-AGACCTAAAAGAAAGCTATTTCCTTC^
II • I- I • I • II •
3043 3069 3078 3088 3096
3044 3078 3097
3097
Nla III
Dde I Sty I
EcoN I Sec I
Bsu36 I Nco I
Mae III Dsa I
Mbo II Hph I Mnl I Hph I Mae II
ACTCCCTATGTGCGTATTTTCΓ ΓTT ICTTCCTGITICACCICI^I I II I 3200
TGAGGGATACACGCATAAAAGAGAAAGAAGGACAGTGGGACTCCTTCCAAGTAACGGTAACA^
3 I146• I3I154ll-3l161 • 3I185II- 3I1-99
3153 3188
3158 3188
3158 3188
3159 3188
3189 Nla III
Mnl I BspM I Rsa I SfaN I Hph I
GTTC ICTCTCCAICCTGCATTATGITACTACIATGACAGGI I 3280
CAAGGAGAGGTGGACGTAATACATGATGTACTGTCCGTAGTTAGACCCCTTTATTATTTTAATA
3 I204 -3I211 - 3l222 l- 3l236- • • 3I264 •
3228
Nla III Ssp I Tthlll II BspH I
TAAGAGTTTCTCCAAAAGTGGTCAGTTTGGCTGGGCA IATATTTTCTC^^ I II 3360
ATTCTCAAAGAGGTTTTCACCAGTCAAACCGACCC^
I . j . j i
3317 3336 3352
3353 Bbv II Fok I Mbo II Alu I
Dde I Pie I Mbo II
Bsu36 I Hinf I Mse I Alu I Ear I TACC ICITTAGIGATGIAGITICTTCIITAATCA^ I I I 3440
ATGGGAATCCTACTCAGAAGAAATTAGTTAGTATATAACCCGCCrTTTTTGTGGTCGAA^
3 I3I64 I- 3I37I3I • 3I382 • • 3I415 - 3I435I-
3365 3373 3435
3369 3376 3439
3375
AACTTAATCAAAAAAGAAGGAAAACCGAACTTCGTTTGTCAACTGGTAAAAATTGGA
3 I535 - 3I554 I -3I571 • - 3I595I
3560 3600
Hae III Hae I Rsa I Mbo II
TGCAGACTGAAAAGACAAAGTTTATCTT IGIGCCTTACTC^ I I 3680
ACGTCTGACTTTTCIXTITrCAAATAGAAC^
I I - • • • I • I -
3628 3667 3679
3629 Mse I Mse I
Ase I Mse I Rsa I
TTCA ITITAATTTTGTGTTITAAGTITAATAAAATTTATTTGTGATGITACTGTAA 3731
Restriction Endonucleases site usage
Aat II - BstN I 10 HinC II 3 Ple I 3
Acc I - BstU I 1 HinD III 1 Pml I 2
Afl II - BstX I 1 Hinf I 8 PpuM I 3
Afl III - BstY I 4 HinP I 4 Pst I 3
Aha II - Bsu36 I 2 Hpa I - Pvu I -
Alu I 18 Cfr10 I 1 Hpa II 3 Pvu II 2
Alw I 5 Cla I 1 Hph I 11 Rsa I 17
AlwN I 4 Dde I 15 Kpn I - Rsr II -
Apa I - Dpn I 7 Mae I 2 Sac I 1
ApaL I 1 Dra I 5 Mae II 6 Sac II -
Ase I 3 Dra III - Mae III 10 Sal I -
Asp718 - Drd I - Mbo I 7 Sau3A I 7
Ava I - Dsa I 1 Mbo II 17 Sau96 I 12
Ava II 6 Eae I 1 Mlu I - Sca I 2
Avr II - Eag I - Mme I 2 ScrF I 11
BamH I 1 Ear I 1 Mnl I 27 Sec I 8
Ban I 1 Eco47 III 1 Msc I 1 SfaN I 9
Ban II 1 Eco57 I 7 Mse I 22 Sfi I -
Bbe I - EcoN I 2 Msp I 3 Sma I -
Bbv I 9 EcoO109 I 4 Nae I 1 SnaB I -
Bbv II 2 EcoR I - Nar I - Spe I -
Bcl I 1 EcoR II 10 Nci I 1 Sph I -
Bcn I 1 EcoR V - Nco I 1 Spl I 1
Bgl I - Esp I - Nde I 2 Ssp I 3
Bgl II - Fnu4H I 9 Nhe I - Stu I 1
BsaA I 3 Fok I 6 Nla III 14 Sty I 3
Bsm I 1 Fsp I 1 Nla IV 4 Taq I 3
BsmA I 8 Gdi II - Not I - Tth111 I 2
Bsp1286 I 4 Gsu I 4 Nru I - Tth111 II 7
BspH I 2 Hae I 4 Nsi I - Xba I -
BspM I 2 Hae II 2 Nsp7524 I - Xca I -
BspM II 1 Hae III 11 NspB II 2 Xho I -
Bsr I 12 Hga I 1 NspH I - Xcm I -
BssH II - HgiA I 4 PaeR7 I - Xma I -
BstB I - HgiE II - PflM I 1 Xmn I 1
BstE II - Hha I 4
Enzyme Site Use Site position (Fragment length) Fragment order
ApaL I g/tgcac 1 1 ( 2656) 2657 ( 1075)
BamH I g/gatcc 1 1 ( 1098) 1099 ( 2633)
Ban I g/gyrcc 1 1 ( 579) 580 ( 3152)
Ban II grgcy/c 1 1 ( 2864) 2865 ( 867)
Bcl I t/gatca 1 1 ( 114) 115 ( 3617)
Bcn I ccs/gg 1 1 ( 1520) 1521 ( 2211)
Bsm I gaatgc 1/-1 1 1 ( 398) 399 ( 3333)
BspM II t/ccgga 1 1 ( 3095) 3096 ( 636)
BstU I cg/cg 1 1 ( 1749) 1750 ( 1982)
BstX I ccannnnn/ntgg 1 1 ( 2743) 2744 ( 988)
Cfr10 I r/ccggy 1 1 ( 1904) 1905 ( 1827)
Cla I at/cgat 1 1 ( 926) 927 ( 2805)
Dsa I c/crygg 1 1 ( 3187) 3188 ( 544)
Eae I y/ggccr 1 1 ( 711) 712 ( 3020)
Ear I ctcttc 1/4 1 1 ( 3434) 3435 ( 297)
Eco47 III agc/gct 1 1 ( 1829) 1830 ( 1902)
Fsp I tgc/gca 1 1 ( 1941) 1942 ( 1790)
Hga I gacgc 5/10 1 1 ( 933) 934 ( 2798)
HinD III a/agctt 1 1 ( 552) 553 ( 3179)
Msc I tgg/cca 1 1 ( 711) 712 ( 3020)
Nae I gcc/ggc 1 1 ( 1904) 1905 ( 1827)
Nci I cc/sgg 1 1 ( 1520) 1521 ( 2211)
Nco I c/catgg 1 1 ( 3187) 3188 ( 544)
PflM I ccannnn/ntgg 1 1 ( 1800) 1801 ( 1931)
Sac I gagct/c 1 1 ( 2864) 2865 ( 867)
Spl I c/gtacg 1 1 ( 446) 447 ( 3285)
Stu I agg/cct 1 1 ( 2427) 2428 ( 1304)
Xmn I gaann/nnttc 1 1 ( 641) 642 ( 3090)
Bbv II gaagac 2/6 2 1 ( 3077) 1 3078 ( 297) 3 3375 ( 357) 2
BspH I t/catga 2 1 ( 3000) 1 3001 ( 351) 3 3352 ( 380) 2
BspM I acctgc 4/8 2 1 ( 2122) 1 2123 ( 1088) 2 3211 ( 521) 3
Bsu36 I cc/tnagg 2 1 ( 3157) 1 3158 ( 206) 3 3364 ( 368) 2
EcoN I cctnn/nnnagg 2 1 ( 1757) 1 1758 ( 1400) 2 3158 ( 574) 3
Hae II rgcgc/y 2 1 ( 1829) 2 1830 ( 72) 3 1902 ( 1830) 1
Mae I c/tag 2 1 ( 2022) 1 2023 ( 1452) 2 3475 ( 257) 3
Mme I tccrac 20/18 2 1 ( 316) 3 317 ( 1086) 2 1403 ( 2329) 1
Nde I ca/tatg 2 1 ( 371) 2 372 ( 90) 3 462 ( 3270) 1
NspB II cmg/ckg 2 1 ( 1853) 1 1854 ( 157) 3 2011 ( 1721) 2
Pml I cac/gtg 2 1 ( 956) 3 957 ( 1697) 1 2654 ( 1078) 2
Pvu II cag/ctg 2 1 ( 1853) 1 1854 ( 157) 3 2011 ( 1721) 2
Sca I agt/act 2 1 ( 2024) 1 2025 ( 432) 3 2457 ( 1275) 2
Tth111 I gacn/nngtc 2 1 ( 1195) 2 1196 ( 793) 3 1989 ( 1743) 1
Ase I at/taat 3 1 ( 2986) 1 2987 ( 56) 3 3043 ( 641) 2 3684 ( 48) 4
BsaA I yac/gtr 3 1 ( 423) 4 424 ( 533) 3 957 ( 1697) 1 2654 ( 1078) 2
HinC II gty/rac 3 1 ( 1482) 2 1483 ( 533) 3 2016 ( 1544) 1 3560 ( 172) 4
Hpa II c/cgg 3 1 ( 1521) 1 1522 ( 384) 4 1906 ( 1191) 2 3097 ( 635) 3
Msp I c/cgg 3 1 ( 1521) 1 1522 ( 384) 4 1906 ( 1191) 2 3097 ( 635) 3
Ple I gagtc 4/5 3 1 ( 1474) 1 1475 ( 544) 3 2019 ( 1354) 2 3373 ( 359) 4
PpuM I rg/gwccy 3 1 ( 1072) 2 1073 ( 166) 4 1239 ( 881) 3 2120 ( 1612) 1
Pst I ctgca/g 3 1 ( 1280) 2 1281 ( 161) 3 1442 ( 2158) 1 3600 ( 132) 4
Ssp I aat/att 3 1 ( 1537) 2 1538 ( 84) 4 1622 ( 1695) 1 3317 ( 415) 3
Sty I c/cwwgg 3 1 ( 1119) 2 1120 ( 686) 3 1806 ( 1382) 1 3188 ( 544) 4
Taq I t/cga 3 1 ( 927) 2 928 ( 9) 4 937 ( 2132) 1 3069 ( 663) 3
AlwN I cagnnn/ctg 4 1 ( 917) 3 918 ( 81) 5 999 ( 947) 2 1946 ( 255) 4
2201 ( 1531) 1
Bsp1286 I gdgch/c 4 1 ( 49) 5 50 ( 2607) 1 2657 ( 208) 3 2865 ( 166) 4 3031 ( 701) 2
BstY I r/gatcy 4 1 ( 861) 2 862 ( 237) 4 1099 ( 668) 3 1767 ( 109) 5 1876 ( 1856) 1
EcoO109 I rg/gnccy 4 1 ( 1072) 1 1073 ( 166) 5 1239 ( 881) 3 2120 ( 694) 4 2814 ( 918) 2
Gsu I ctggag 16/14 4 1 ( 1761) 2 1762 ( 9) 5 1771 ( 43) 4 1814 ( 97) 3 1911 ( 1821) 1
Hae I wgg/ccw 4 1 ( 711) 3 712 ( 638) 4 1350 ( 1078) 2 2428 ( 1200) 1 3628 ( 104) 5
HgiA I gwgcw/c 4 1 ( 49) 5 50 ( 2607) 1 2657 ( 208) 3 2865 ( 166) 4 3031 ( 701) 2
Hha I gcg/c 4 1 ( 1830) 1 1831 ( 18) 5 1849 ( 54) 3 1903 ( 40) 4 1943 ( 1789) 2
HinP I g/cgc 4 1 ( 1830) 1 1831 ( 18) 5 1849 ( 54) 3 1903 ( 40) 4 1943 ( 1789) 2
Nla IV ggn/ncc 4 1 ( 579) 2 580 ( 386) 3 966 ( 107) 4 1073 ( 26) 5 1099 ( 2633) 1
Alw I ggatc 4/5 5 1 ( 862) 2 863 ( 236) 4 1099 ( 1) 6 1100 ( 667) 3 1767 ( 110) 5 1877 ( 1855) 1
Dra I ttt/aaa 5 1 ( 228) 3 229 ( 2094) 1 2323 ( 54) 6 2377 ( 128) 5 2505 ( 1003) 2 3508 ( 224) 4
Ava II g/gwcc 6 1 ( 1073) 2 1074 ( 166) 6 1240 ( 178) 5 1418 ( 365) 3
1783 ( 205) 4 1988 ( 133) 7 2121 ( 1611) 1
Fok I ggatg 9/13 6 1 ( 797) 2 798 ( 238) 6 1036 ( 302) 5 1338 ( 226) 7
1564 ( 1190) 1 2754 ( 615) 3 3369 ( 363) 4
Mae II a/cgt 6 1 ( 424) 6 425 ( 533) 3 958 ( 1035) 1 1993 ( 517) 5
2510 ( 145) 7 2655 ( 544) 2 3199 ( 533) 4
Dpn I ga/tc 7 1 ( 115) 4 116 ( 747) 2 863 ( 63) 8 926 ( 81) 7
1007 ( 93) 6 1100 ( 668) 3 1768 ( 109) 5 1877 ( 1855) 1
Eco57 I ctgaag 16/14 7 1 ( 75) 7 76 ( 416) 4 492 ( 64) 8 556 ( 554) 3
1110 ( 111) 6 1221 ( 913) 2 2134 ( 1322) 1 3456 ( 276) 5
Mbo I /gatc 7 1 ( 115) 4 116 ( 747) 2 863 ( 63) 8 926 ( 81) 7
1007 ( 93) 6 1100 ( 668) 3 1768 ( 109) 5 1877 ( 1855) 1
Sau3A I /gatc 7 1 ( 115) 4 116 ( 747) 2 863 ( 63) 8 926 ( 81) 7
1007 ( 93) 6 1100 ( 668) 3 1768 ( 109) 5 1877 ( 1855) 1
Tth111 II caarca 11/9 7 1 ( 126) 7 127 ( 253) 4 380 ( 471) 2 851 ( 453) 3
1304 ( 109) 8 1413 ( 1923) 1 3336 ( 218) 5 3554 ( 178) 6
BsmA I gtctc 1/5 8 1 ( 510) 4 511 ( 529) 3 1040 ( 852) 1 1892 ( 311) 6
2203 ( 67) 9 2270 ( 128) 8 2398 ( 133) 7 2531 ( 480) 5
3011 ( 721) 2
Hinf I g/antc 8 1 ( 913) 2 914 ( 6) 9 920 ( 81) 7 1001 ( 474) 4
1475 ( 544) 3 2019 ( 14) 8 2033 ( 972) 1 3005 ( 368) 5
3373 ( 359) 6
Sec I c/cnngg 8 1 ( 714) 2 715 ( 405) 5 1120 ( 686) 3 1806 ( 62) 8 7
3188 ( 544) 4
Bbv I gcagc 8/12 9 1 ( 191) 7 192 ( 176) 8 368 ( 406) 5 774 ( 506) 3 1280 ( 3) 10 1283 ( 158) 9 1441 ( 415) 4 1856 ( 794) 2 2650 ( 821) 1 3471 ( 261) 6
Fnu4H I gc/ngc 9 1 ( 191) 7 192 ( 176) 8 368 ( 406) 5 774 ( 506) 3 1280 ( 3) 10 1283 ( 158) 9 1441 ( 415) 4 1856 ( 794) 2 2650 ( 821) 1 3471 ( 261) 6
SfaN I gcatc 5/9 9 1 ( 482) 4 483 ( 115) 9 598 ( 242) 6 840 ( 913) 1 1753 ( 144) 8 1897 ( 43) 10 1940 ( 859) 2 2799 ( 289) 5 3088 ( 148) 7 3236 ( 496) 3
BstN I cc/wgg 10 1 ( 436) 3 437 ( 84) 8 521 ( 51) 10 572 ( 143) 7
715 ( 722) 2 1437 ( 42) 11 1479 ( 389) 4 1868 ( 66) 9
1934 ( 345) 6 2279 ( 351) 5 2630 ( 1102) 1
EcoR II /ccwgg 10 1 ( 436) 3 437 ( 84) 8 521 ( 51) 10 572 ( 143) 7
715 ( 722) 2 1437 ( 42) 11 1479 ( 389) 4 1868 ( 66) 9
1934 ( 345) 6 2279 ( 351) 5 2630 ( 1102) 1
Mae III /gtnac 10 1 ( 85) 9 86 ( 419) 3 505 ( 12) 11 517 ( 340) 5
857 ( 337) 6 1194 ( 353) 4 1547 ( 278) 7 1825 ( 39) 10
1864 ( 1050) 1 2914 ( 239) 8 3153 ( 579) 2
Hae III gg/cc 11 1 ( 712) 2 713 ( 255) 7 968 ( 383) 5 1351 ( 102) 9
1453 ( 39) 11 1492 ( 300) 6 1792 ( 94) 10 1886 ( 22) 12
1908 ( 521) 3 2429 ( 387) 4 2816 ( 813) 1 3629 ( 103) 8
Hph I ggtga 8/7 11 1 ( 285) 5 286 ( 232) 6 518 ( 228) 7 746 ( 97) 9
843 ( 146) 8 989 ( 948) 1 1937 ( 41) 11 1978 ( 383) 4
2361 ( 793) 2 3154 ( 31) 12 3185 ( 79) 10 3264 ( 468) 3
ScrF I cc/ngg 11 1 ( 436) 3 437 ( 84) 8 521 ( 51) 10 572 ( 143) 7
715 ( 722) 2 1437 ( 42) 11 1479 ( 42) 12 1521 ( 347) 5
1868 ( 66) 9 1934 ( 345) 6 2279 ( 351) 4 2630 ( 1102) 1
Bsr I actgg 1/-1 12 1 ( 139) 10 140 ( 149) 8 289 ( 416) 2 705 ( 365) 3 1070 ( 220) 5 1290 ( 23) 12 1313 ( 22) 13 1335 ( 154) 7 1489 ( 324) 4 1813 ( 76) 11 1889 ( 148) 9 2037 ( 191) 6 2228 ( 1504) 1
Sau96 I g/gncc 12 1 ( 966) 1 967 ( 107) 8 1074 ( 166) 6 1240 ( 178) 5 1418 ( 34) 12 1452 ( 40) 11 1492 ( 291) 4 1783 ( 9) 13 1792 ( 94) 10 1886 ( 102) 9 1988 ( 133) 7 2121 ( 694) 3 2815 ( 917) 2
Nla III catg/ 14 1 ( 46) 11 47 ( 1742) 1 1789 ( 13) 15 1802 ( 250) 5 2052 ( 393) 2 2445 ( 94) 9 2539 ( 105) 8 2644 ( 24) 14 2668 ( 83) 10 2751 ( 251) 4 3002 ( 26) 13 3028 ( 161) 6 3189 ( 39) 12 3228 ( 125) 7 3353 ( 379) 3
Dde I c/tnag 15 1 ( 40) 15 41 ( 200) 8 241 ( 272) 5 513 ( 26) 16
539 ( 644) 1 1183 ( 61) 14 1244 ( 181) 9 1425 ( 132) 11
1557 ( 362) 4 1919 ( 610) 2 2529 ( 156) 10 2685 ( 126) 12
2811 ( 223) 6 3034 ( 125) 13 3159 ( 206) 7 3365 ( 367) 3
Mbo II gaaga 8/7 17 1 ( 94) 11 95 ( 877) 1 972 ( 164) 6 1136 ( 51) 15 1187 ( 33) 17 1220 ( 466) 3 1686 ( 38) 16 1724 ( 3) 18 1727 ( 387) 4 2114 ( 693) 2 2807 ( 141) 8 2948 ( 130) 9 3078 ( 68) 12 3146 ( 230) 5 3376 ( 59) 13 3435 ( 100) 10 3535 ( 144) 7 3679 ( 53) 14
Rsa I gt/ac 17 1 ( 426) 4 427 ( 21) 15 448 ( 113) 11 561 ( 163) 8 724 ( 153) 10 877 ( 442) 3 1319 ( 500) 2 1819 ( 207) 7 2026 ( 265) 6 2291 ( 162) 9 2453 ( 5) 18 2458 ( 63) 13 2521 ( 21) 16 2542 ( 680) 1 3222 ( 373) 5 3595 ( 72) 12 3667 ( 56) 14 3723 ( 9) 17
Alu I ag/ct 18 1 ( 67) 16 68 ( 137) 11 205 ( 204) 5 409 ( 133) 13
542 ( 12) 19 554 ( 36) 17 590 ( 153) 10 743 ( 76) 15
819 ( 604) 1 1423 ( 137) 12 1560 ( 95) 14 1655 ( 200) 6
1855 ( 157) 8 2012 ( 156) 9 2168 ( 174) 7 2342 ( 524) 3
2866 ( 549) 2 3415 ( 24) 18 3439 ( 293) 4
Mse I t/taa 22 1 ( 19) 20 20 ( 43) 18 63 ( 111) 11 174 ( 56) 14 230 ( 442) 2 672 ( 237) 6 909 ( 267) 5 1176 ( 915) 1 2091 ( 233) 7 2324 ( 54) 16 2378 ( 46) 17 2424 ( 67) 12 2491 ( 15) 21 2506 ( 338) 3 2844 ( 144) 8 2988 ( 56) 15 3044 ( 338) 4 3382 ( 127) 9 3509 ( 62) 13 3571 ( 114) 10 3685 ( 12) 22 3697 ( 5) 23 3702 ( 30) 19
Mnl I cctc 7/7 27 1 ( 25) 23 26 ( 135) 10 161 ( 193) 8 354 ( 105) 14 459 ( 107) 13 566 ( 97) 16 663 ( 276) 5 939 ( 53) 19 992 ( 104) 15 1096 ( 311) 3 1407 ( 22) 24 1429 ( 21) 25
1450 ( 315) 2 1765 ( 21) 26 1786 ( 124) 11 1910 ( 55) 18
1965 ( 21) 27 1986 ( 248) 6 2234 ( 117) 12 2351 ( 44) 20
2395 ( 36) 22 2431 ( 6) 28 2437 ( 210) 7 2647 ( 166) 9
2813 ( 66) 17 2879 ( 282) 4 3161 ( 43) 21 3204 ( 528) 1
460 sites found
No Sites found for the following Restriction Endonucleases
Aat II gacgt/c Drd I gacnnnn/nngtc NspH I rcatg/y
Acc I gt/mkac Eag I c/ggccg PaeR7 I c/tcgag
Afl II c/ttaag EcoR I g/aattc Pvu I cgat/cg
Afl III a/crygt EcoR V gat/atc Rsr II cg/gwccg
Aha II gr/cgyc Esp I gc/tnagc Sac II ccgc/gg
Apa I gggcc/c Gdi II yggccg -5/-1 Sal I g/tcgac
Asp718 g/gtacc HgiE II accnnnnnnggt Sfi I ggccnnnn/nggcc
Ava I c/ycgrg Hpa I gtt/aac Sma I ccc/ggg
Avr II c/ctagg Kpn I ggtac/c SnaB I tac/gta
Bbe I ggcgc/c Mlu I a/cgcgt Spe I a/ctagt
Bgl I gccnnnn/nggc Nar I gg/cgcc Sph I gcatg/c
Bgl II a/gatct Nhe I g/ctagc Xba I t/ctaga
BssH II g/cgcgc Not I gc/ggccgc Xca I gta/tac
BstB I tt/cgaa Nru I tcg/cga Xho I c/tcgag
BstE II g/gtnacc Nsi I atgca/t Xcm I ccannnnn/nnnntgg
Dra III cacnnn/gtg Nsp7524 I r/catgy Xma I c/ccggg
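The site positions and fragment lengths tabulated above can be reproduced by scanning a sequence for each recognition site. The following is a minimal sketch under stated assumptions: `cut_positions` and `fragment_lengths` are hypothetical helper names; only the strand given is searched (sufficient for palindromic sites written with a "/" cut mark, but not for enzymes specified with offsets such as "gaagac 2/6"); and the reported position is taken to be the first base after the cut, which is how the table appears to count (for HinD III, 552 + 3179 = 3731). The test fragment is one 50-nucleotide line of the sequence listing that contains a HinD III site and two Alu I sites.

```python
import re

# IUPAC ambiguity codes used in the recognition sites.
IUPAC = {
    "a": "a", "c": "c", "g": "g", "t": "t",
    "r": "[ag]", "y": "[ct]", "m": "[ac]", "k": "[gt]",
    "s": "[cg]", "w": "[at]", "d": "[agt]", "h": "[act]",
    "b": "[cgt]", "v": "[acg]", "n": "[acgt]",
}


def cut_positions(seq, site):
    """Return 1-based positions of the first base after each cut.

    `site` is a recognition sequence with '/' marking the cut,
    for example 'a/agctt'. Overlapping matches are reported.
    """
    offset = site.index("/")
    pattern = "".join(IUPAC[base] for base in site.replace("/", "").lower())
    regex = re.compile("(?=%s)" % pattern)  # lookahead keeps overlaps
    return [m.start() + offset + 1 for m in regex.finditer(seq.lower())]


def fragment_lengths(seq_len, cuts):
    """Lengths of the fragments of a linear sequence cut at `cuts`."""
    bounds = [1] + sorted(cuts) + [seq_len + 1]
    return [b - a for a, b in zip(bounds, bounds[1:])]


if __name__ == "__main__":
    fragment = "TGGACAGACAGGATTTCTCAGCTATGGCACAAGCTTCAGTACCCCTCAAC"
    hind3 = cut_positions(fragment, "a/agctt")
    print(hind3, fragment_lengths(len(fragment), hind3))  # [32] [31, 19]
    print(cut_positions(fragment, "ag/ct"))               # [22, 34]
```

The two Alu I cuts in this fragment lie 12 bases apart, matching the 12-base Alu I fragment between positions 542 and 554 in the table.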
ANNEX 4
DNA sequence 4321 b.p. AAACCAATAAGG ... GATGTACTGTAA linear
Positions of Restriction Endonucleases sites (unique sites underlined)
Alu I
I
AAACCAATAAGG'TTAGGACAAGAGAATAGCTGTGG'I^TGCG'rTGCAAAAACCAAAAAAAAAAAAAAAAAAAAAAAGAAAG 80
T'TTGG'rTATTCCAATCCTGTTCTCTTATCGACACCAAACGCAACtf
• • I • • • • • •
28 Sty I Tthlll II
Sec I HinP I
Nla IV Hha I —
Mnl I Nla III Fsp I Fok I
Sec I Nco I Fnu4H I Mnl I
Ava I Dsa I Bbv I Taq I Hinf I Mae I
II I I II I II I I I I I I
CCCCGAGGCTCCATGGGCAGACCTAO^AGGCTGCGCAAACAAATCGAGTiGATGAGATTCTGCTGTTTCTTTGTCTAGGGT 160 GGGGCTCCGAGGTACCCGTCTGGATGTTCCGACGCGTTTG'ITTAGCTCCCTACTCTAAGACGACAAAGAAACAGATCCCA
II I I -II • I II I • I I I- I - I
82 91 110 124 135 154
83 91 110 126
85 92 112 129
87 113
91 113
91 116
Tthlll II Fnu4H I HinP I
Bbv I Hha I HinP I
SfaN I NspB II Hae II Hha I Mae III
Dde I Fnu4H I Eco47 III Alu I Tthlll II Alu I
I I II I I II I I I I I
TCTCAGATGCTATCTGCCGCrGCTG'TTrGGTGGGGAAGGAGC 240
AGAGTCTACGATAGACGGCGACGACAAACCACCCCTTCCTCGCGACCCGCG'ITTCGACAATGG'ITTGTCTTGCCACCCTC
I I - II I- I • II I • I I • I • I-
162 176 200 214 223 239
166 177 200 208 218
179 201 208
179 201 184
Bsr I ..
Nla IV Mnl I Gsu I Mme I Mse I
I I I I I I
CTGATGGCTCCGAGTTTGGGGCGAGGTAGAAACTCTC^GTGCCAC'ITCCGACTTTAAGCCTTCCTGTTGCCGTCCACTG 320 GACTACCGAGGCTCAAACCCCGCTCCATC'TTTGAGAGGTCACGGTGAAGGCTGAAATTCGGAAGGACAACGGCAGGTGAC
I • - I - I I - I • I •
246 263 275 288 295
277
HinP I
Nla IV
Nar I.
Hae II
Bbe I
ScrF I
Hae III
Gdi II
Eae I Hha I
Sec I Fnu4H I ScrF I NspB II Ban I EcoR II Ava I Fnu4H I Aha II
BstN I Mae II Fnu4H I Bbv I EcoR II Mbo II Afl III Dde I Bbv I Fnu4H I BstN I
TC ΪCGGGTTTCTTCCTGGGGAACACGTTTTCGCTCAGTCGCTCGGCAGCCCGAGCCTGCGGCAGCGGCCAGGCGCCTGCC 400 ACCGCCCAAAGAAGGACCCCTTGTGCAAAAGCGAGTCAGCGAGCCGTCGGGCTCGGACGCCGTCGCCGGTCCGCGGACGG
330 342 353 365 378 388
334 344 365 381 388
334 369 381 391
334 382 391
334 384 385 392 385 386 388 391 391 391 391 392
Fnu4H I BstU I Sec I Sac II NspB II Dsa I
HinP I Fnu4H I Dde I HinP I
Alu I Hha I HinP I HinP I BstU I
HinP I Mn] . I Hha I Fnu4H I Hha I Bspl286 I
Hha I Pst I Hae II Hae II Bbv I Hae II Ban II Hha I
CCCTGCGCCGAGCTTTCCCCTGCAGAGGCGCTCCACTCCCAGAAGCGCCGCGGCTGCACCAGAGCGCCTGAGAGCCCCCG 480 GGGACGCGGCTCGAAAGGGGACGTCTCCGCGAGGTGAGGGTCTTCGCGGCGCCGACGTGGTCTCGCGGACTCTCGGGGGC
405 420 427 444 453 463 472 480
405 425 445 453 464 472
411 428 445 464 479 428 447 468 480
448
448
448
448
449
45C 1
ScrF I EcoR II
Rsa I BstN I Taq I Bsr I
BstU I Fok I Mnl I £s B_I Mse I
CGCGTACCCATCCAGGAGCAAAACTATGTCAGGAATGGAGGTTTGCTAACCCAGAAAATTCGAAGGAACACATTAAACTG 560 GCGCATGGGTAGGTCCTCGTTTTGATACAGTCCTTACCTCCAAACGATTGGGTCT'ITTAAGCTTCCTTGTGTAATTTGAC
481 489 518 539 553
484 492 540 557
492
492
Fnu4H I HinP I
Bbv I Hha I
SfaN I Hae II Tthlll II
Fok I Eco47 III Bsr I Mnl I BspM I
GTGGATGCAGCAGATGTAAGCGCTGTGCAAACATCTCAAGCCAGTTCAGATGTTGCTGTTTCCTCAAGTTGCAGGTCTAT 640 CACCTACGTCGTCTACATTCGCGACACG'TTTGTAGAGTTCGGTCAAGTCTACAACGACAAAGGAGTTCAACGTCCAGATA
563 579 601 622 631
564 579 588
567 580
567 580
Sau3A I
Mbo I Sau96 I
Dpn I Nla IV
BstY I Bsr I
Alw I Dde I Hinf I Hae III Mae III
GGAAATGCAG IGIATCTAACCAGCCCGCATAGCCGTCITGAGTGGTAGTAGTGIAATCCCCICAGTGIGCCCCAAACTCGGITAACT 720
CCTTTACGTCCTAGATTGGTCGGGCGTATCGGCAGACTCACCATCATCACTTAGGGGGTCACCGGGGTTTGAGCCATTGA
|| • I l l - l I
650 675 690 702 715
650 697
651 702
651 702 651
Pie I
Hinf I Nla IV Fnu4H I
Nla III Ban I Mse I Bbv I Alu I
CTCATATAAATAGTAATTCC IATGIACTCCCAATGIGCACCGAAGTITAAAACAGAGCCAATGAGICAGCAGTGAAACAIGCTTCA 800
GAGTATATTTATCATTAAGGTACTGAGGGTTACCGTGGCTTCAATTTTGTCTCGGTTACTCGTCGTCACTTTGTCGAAGT | | . I . I . .| . I
740 753 763 781 794
743 753 781
743 Mse I Bsr I
Dra I Dde I Hph I
ACGACAGCCGACGGGTCT ITITAAACAATTTCITCAGGTTCAGCAATTGGGAGCAGTAGTTTCAGCCCACGACCAACTICACICA 880 TGCTGTCGGCTGCCCAGAAATTTGTTAAAGAGTCCAAGTCGTTAACCCTCGTCATCAAAGTCGGGTGCTGGTTGAGTGGT
II- I • • • - I I -
818 830 875
819 878
Fnu4H I Mme I Mnl I Bbv I
GTTCTCTCCACCACAGATTTACCCTT ICCAACAGACCATACCCACATATTCTCCCTACCCCTTCICTCACAAACTATGGICTG 960 CAAGAGAGGTGGTGTCTAAATGGGAAGGTTGTCTGGTATGGGTGTATAAGAGGGATGGGGAAGGAGTGTTTGATACCGAC • • I • • • • j • I •
906 943 957 957
Rsa I ScrF I
Mae II EcoR II Rsa I
Nde I Tth111 II Bsm I Alu I BsaA I BstN I Spl I
CATATGGGCAAACACAGTTTACCACAGGAATGCAACAAGCTACAGCCTATGCCACGTACCCACAGCCAGGACAGCCGTAC 1040 GTATACCCGTTTGTGTCAAATGGTGTCCTTACGTTGTTCGATGTCGGATACGGTGCATGGGTGTCGGTCCTGTCGGCATG
961 969 988 998 1013 1026 1036
1014 1026 1037 1016 1026
ScrF I Hph I Dde I EcoR II Nde I BsmA I BstN I
Mnl I SfaN I Eco57 I Mae III Mae III
GGCATTTC ICTCIATATGGTGCATTGTGGGCAGGICATCAAGACITGAAGGTGGATTGITCACAGITCITCAGITICACICTGGACAGAC 1120 CCGTAAACWAGTATACCACGTAACACCCGTCCGTAGTTCTGACTTCCACCTAACAGTGTCAGAGTCAGTGGACCTGTCTG
1 I04-8I • - 1I072 -1I081 • 1I094 I I 1I1I06I
1051 1100 1110
1102 1110 1107 1110 ScrF I Eco57 I EcoR II Alu I Alu I Rsa I BstN I Nla IV Dde I HinD III Mnl I Ban I Alu I SfaN I
I I II I I I I I I I
AGGATTTCTCAGCTATGGCACAAGCTTCAGTACCCCTCAACCTGGACAGGCACCATACAGCTACCAGATGCAAGGTAGCA 1200 TCCTAAAGAGTCGATACCGTGTTCGAAGTCATGGGGAGTTGGACCTGTCCGTGGTATGTCGATGGTCTACGTTCCATCGT
1 I12-8I • 1I1I42I I 1I155 -I 1I1-69 1I1-79 1I187■
1131 1143 1150 1161 1169 1145 1161 1161 Xmn I Mnl I Mse I
GTT-RAACAACATCATCAGGAATATATACAGG IAAATAATTCACTCACAAATTCICTCTGGATTITAATAGTTCACAGCAGGAC 1280 CAAAATGTTGTAGTAGTCCTTATATATGTCCTTΓATTAAGTGAGTGTTTAAGGAGACCTAAATTATCAAGTGTCGTCCTG
• • • I • • I • I • •
1231 1252 1261
Sec I ScrF I EcoR II BstN I Hae III Msc I
Hae I Hph I
Bsr I Eae I Rsa I Alu I
TATCCGTCTTATCC ICAGTTTT IGIGC ICAGGGTCAG ITACGCACAGTATTATAACA IGCTICACCGTATCCAGCACATTATATGAC 1360 ATAGGCAGAATAGGGTCAAAACCGGTCCCAGTCATGCGTGTCATAATATTGTCGAGTGGCATAGGTCGTGTAATATACTG . j .|| I . I . . i i
1294 1301 1313 1332
1301 1335
1301 1302 1304 1304 1304 1304 Fnu4H I Hph I
Bbv I Fok I Alu I SfaN I Tth111 II
CAG ICAGCAACACCAGCCCAACGACACCIATCCACCAATGCCACTTACCAIGCTTCAAGAACCGCCATCTGGICATICACCAGCCI 1440 GTCGTCGTTGTGGTCGGGTTGCTGTGGTAGGTGGTTACGGTGAATGGTCGAAGTTCTTGGCGGTAGACCGTAGTGGTCGG
1 I363 1I387• • 1I40•8 • 1I4-29I -1I440
1363 1432
Sau3A I Mbo I Cla I
Dpn I Sau3A I
Alw I Hinf I Taq I
BstY I Hinf I Mbo I
Mae III Rsa I Mse I AlwN I Dpn I
I I I I I I I I I I I
AAGCAGTTACAGATCCCACAGCAGAGTACAGCACAATCCACAGCCCATCAACACCCATTAAAGATTCAGATTCTGATCGA 1520 TTCGTCAATGTCTAGGGTGTCGTCTCATGTCGTGTTAGGTGTCGGGTAGTTGTGGGTAATTTCTAAGTCTAAGACTAGCT
1446 1466 1498 1507 1515
1451 1503 1515
1452 1509 1517
1452 1515
1452 1516
1452
Mbo II Sau3A I
Mnl I Mae II Hae III Mbo I
Taq I Pml I Sau96 I Mnl I Hinf I
Hga I BsaA I Nla IV Hph I AlwN I Dpn I
TTGCGTCGAGGTTCAGATGGGAAATCACGTGGACGGGGCCGAAGAAACAATAATCCTTCACCTCCCCCAGATTCTGATCT 1600 AACGCAGCTCCAAGTCTACCCTTTAGTGCACCTGCCCCGGCTTCTTTGTTATTAGGAAGTGGAGGGGGTCTAAGACTAGA
1 I523I I - • 1I5I46• 1I5I5I5 -I • I .57-8I 1I58I8 1I596 •
1526 "* 1546 1556 1581 1590 1528 1547 1557 1596
1561 1596
Sau96 I
Ava II
PpuM I
Nla IV
BsmA I EcoO109 I Fok I Bsr I
TGAGAGAGTGTTCATCTGGGACTTG IGATGIAGACAATCATTGTTTTCCACTCCTTGCTTAICTGIGIGTCCTACGCCAACAGAT 1680 ACTCTCTCACAAGTAGACCCTGAACCTACTCTGTTAGTAACAAAAGGTGAGGAACGAATGACCCAGGATGCGGTTGTCTA
I I- • • I- II 1625 1659
1629 1662
1662 1662 1663 1663
Sau3A Mbo I Dpn I Alw I Nla IV BstY I
BamH I
Alw I Sty I
Mnl I Eco57 I Sec I Mbo II
ATGGGAGGIGIATCCACCCACTTCAGTTTCCCTTGGACTGCGAATGGAAGAAATGATTTTCAACTTGGCAGACACACATTTA 1760 TACCCTCCCTAGGTGGGTGAAGTCAAAGGGAACCTGACGCTTACCTTCTTTACTAAAAGTTGAACCGTCTGTGTGTAAAT
1685 1699 1709 1725
1688 1709
1688 1688 1688
1689
1689
1689
1689
Dde I
Sau96 I
Ava II
Mbo II Tth111 I Eco57 I PpuM I Mse I Dde I Mae III Mbo II EcoO109 I
TTTTTTAATGACTTAGAAGAATGTGACCAAGTCCATATAGATGATGTTTCTTCAGATGATAACGGACAGGACCTAAGCAC 1840 AAAAAATTACTGAATCTTCTTACACTGGTTCAGGTATATCTACTACAAAGAAGTCTACTATTGCCTGTCCTGGATTCGTG
1765 1772 1783 1809 1828
1776 1785 1810 1828 1829 1829
1833
Fnu4H I Bbv I Pst I
Fnu4H I Bsr I
Bbv I Bsr I Tth111 II Rsa I
ATATAACTTTGGAACAGATGGCTTTCCTGCTGCAGCAACCAGTGCTAACTTATGTTTGGCAACTGGTGTACGGGGCGGTG 1920 TATATTGAAACCTTGTCTACCGAAAGGACGACGTCGTTGGTCACGATTGAATACAAACCGTTGACCACATGCCCCGCCAC
II I I- - I - I I -
1869 1879 1893 1908
1869 1902
1870 1872 1872 Fok I Hae III Mnl I
Bsr I Hae I Mme I
TGGA ICTGIGATGAGAAAGTTIGIC^CTTCCGCTACAGACGGGTAAAAGAGATTTACAACACCTACAAAAATAATGITTGGIAGGT 000 ACCTGACCTACTCTTTCAACCGGAAGGCGATGTCTGCCCATTTTCTCTAAATGTTGTGGATGTTTTTATTACAACCTCCA
1 I924I - 1I9I39 • • • • • - 1I992I -
1927 1940 1996
Pst I Fnu4H I ScrF I
Alu I Bbv I EcoR II
Sau96 I ScrF I Hae III BstN I
Ava II Mnl I EcoR II Sau96 I Ple I Bsr I
Tth111 II Dde I BstN I Mnl I Hinf I HinC II
CT IGCTTGIGTCCAIGCITAAGIAGGGAAGCICTGGICITGCAGTTGIAGIGIGCCGAAATTGAAGCCCTGACCGIACTCICTGGITTGACAICT 2080 GACGAACCAGGTCGATTCTCCCTTCGGACCGACGTCAACTCCCGGCTTTAACTTCGGGACTGGCTGAGGACCAACTGTGA
2 I002 I • I 2I014I • 2I026II 2l0-3l9l • - 2l064l - 2l072 I •
2007 2018 2026 2041 2064 2078
2007 2026 2042 2068
2012 2030 2068
2030 2068 2031
Hpa II ScrF I Sau96 I Nci I Alu I
Hae III Bcn I Ssp I Mae III Dde I Fok I
I I I I I I I I
GGCCCTGAAAGCACTCTCGCTCATTCACTCCCGGACAAACTGTGTGAATATTTTAGTAACAACTACTCAGCTCATCCCAG 2160 CCGGGACTTTCGTGAGAGCGAGTAAGTGAGGGCCTGTTTGACACACTTATAAAATCATTGTTGATGAGTCGAGTAGGGTC
I • • II • I • I • I I- I • 2081 2110 2127 2136 2146 2153
2081 2110 2149
2110 2111 2111
Ssp I
CATTGGCGAAAGTCCTGCTGTATGGGTTAGGAATTGTATTTCCAATAGAAA IATATTTACAGTGCAACTAAAATAGGAAAA 2240 GTAACCGCTTTCAGGACGACATACCCAATCCTTAACATAAAGGTTATCTTTTATAAATGTCACGTTGATTTTATCCTTTT • • • • • 1 • • •
2211
Mbo II Alu I Mbo II Mbo II
GAAA IGCTGTTTTGAGAGAATAATTCAAAGGTTTGG IAAGAAAAGTGGTGTATGTTGTTAAGGAGATGGTGTAG IAAGIAAGA 2320 CTTTCGACAAAACTCTCTTATTAAGTTTCCAAACCTTCTTTTCACCACATACAACAATATCCTCTACCACATCTTCTTCT
I • I • • - I I -
2244 2275 2313
231£_ Sau3A I Mbo I Dpn I
BstY I Nla III Sty I
Mnl I Mnl I Sec I
SfaN I Gsu I Gsu I Sau96 I Sau96 I Nla III
BstU I EcoN I Alw I Ava II Hae III PflM I
I I I I I I I I I I I I I I I
ACAAGGAGCAAAAAAGCACGCGATGCCCTTCTGGAGGATCTCCAGCCACTCGGACCTCATG&CCCTGCACCATGCCTTGG 2400 TGTTCCTCGTTTTTTCGTGCGCTACGGGAAGACCTCCTAGAGGTCGGTGAGCCTGGAGTACCGGGACGTGGTACGGAACC
I - I I - I I I I I • I I I - I I I I
2339 2347 2356 2372 2381 2390
2342 2351 2360 2372 2381 2391
54 2375 2395
2356 2378 2395
2357
2357
2357
Fnu4H I
HinP I Alu I Sec I Sau3A I Bsr I Hha I Pvu II ScrF I Mbo I Sau96 I Rsa I Hae II NspB II EcoR II Dpn I Nla IV Gsu I Eco47 III HinP I BstN I Alw I Hae III
Bsr I Mae III Hha I Bbv I Mae III BstY I EcoO109 I
I I I I I I I I I I I I I I I I I
AACTGGAGTACCTGTAACAGCGCTCGGCACTTTGACAGCGCACAGCTGCTCTGTGACCAGGGACAGATCCAGCAGGCCCC 2480 TTGACCTCATGGACATTGTCGCGAGCCGTGAAACTGTCGCGTGTCGACGAGACACTGGTCCCTGTCTAGGTCGTCCGGGG
II I • I II • I • III • I I • II • II I-
2402 2414 2438' 2445 2453 2465 2474 2403 2419 2438 2457 2466 2475
2408 2419 2443 2457 2466 2475 2420 2443 2457 2466 2475 2420 2444 2457 2466 2479
2445
Hae III Msp I Hpa II Nae I SfaN I HinP I Hph I AlwN I
Hha I Gsu I ScrF I HinP I SfaN I CfrlO I EcoR II Hha I BsmA I Hae II Mnl I Dde I BstN I Fsp I Mnl I
I I II II I II I
AGTCTCGCATCAGCGCCGGCCTCCAGAACTTAGCAATTTCCGCCTGGTGATGCGCAGTTGCTGTCAGTCTTGACCTCTGC 2560 TCAGAGCGTAGTCGCGGCCGGAGGTCTTGAATCGTTAAAGGCGGACCACTACGCGTCAACGACAGTCAGAACTGGAGACG
I I • II II I II I- - I I l-ll I • • I •
2482 2492 2500 2509 2523 2531 2554
2487 2495 2523 2532
2493 2501 2523 2532
2493 2526 2535
2495 2529
2496 2496 2498
Tth111 I HinC II Sca I
Sau96 I Alu I Ple I Rsa I
Ava II Pvu II Hinf I Bsr I
Hph I Mnl I Mae II NspB II Mae I Hinf I
CTTTGTG IGTGAATGG IAG IGIACCA ICGTCTATTTCATCAGAAC IAIGCTG ITTG IACTC ITA IGITACTGTG IAATCICAGTGAAAATAAGC 2640 GAAACACCACTTACCTCCTGGTGCAGATAAAGTAGTCTTGTCGACAACTGAGATCATGACACTTAGGTCACTTTTATTCG
I • I I I • I • I I I h i l l - I I ■
2567 2575 2582 2600 2612 2622
2577 2600 2608 2626
2577 2601 2608 2615
2578 2605 2614
Sau96 I Ava II PpuM I EcoO109 I Nla III Mse I Mbo II BspM I
I I I I I I
CATGAGAATGTTTTAGCACAGCGTTATGTGTCTGCCACATTAACTACACGGTTCAAACCTGTGAAGAAAGGACCTGCAAA 2720 GTACTCTTACAAAATCGTGTCGCAATACACAGACGGTGTAATTGATGTGCCAAGTTTGGACACTTCTTTCCTGGACGTTT
I • • • I • • I I I I
2641 2680 2703 2712
2709 2709 .2710 2710 _ BsmA I Eco57 I Alu I AlwN I
I I I I
CGCTTCAGTTGTTAGCATTTTCAATGTGATATAAACAGCTTCTCCAATACAGCAAACCTAATTGCACAACAGAGACTGAA 2800 GCGAAGTCAACAATCGTAAAAGTTACACTATATTTGTCGAAGAGGTTATGTCGTTTGGATTAACGTGTTGTCTCTGACTT
2 I723 ' ' 2I757- • • 2I79I0 -
2792 Sec I ScrF I EcoR II Mnl I BstN I
Bsr I BsmA I Sec I Rsa I
I I I I I I
ATGTGTTTCCTGAATACCAGTGGAGGAATTTTCTTGTAAAGAAGGTTTACTTTTTGGTGTCTCATACCCAGGGTAATCTG 2880 TACACAAAGGACTTATGGTCACCTCCTTAAAAGAACATTTCTTCCAAATGAAAAACCACAGAGTATGGGTCCCATTAGAC
I • I • • • I- II • I
2817 2859 2867 2880
2823 2868
2868 2868 2868 Mse I Dra I Alu I Mnl I Hph I I I I I I
TACATCTCTACTTATTTATGAACAGAC'TTTTTTTAAAAAGATAAAAAAACAGC'I TATTGAGGTATAATTCACCCACCAG 2960 ATGTAGAGATGAATAAATACTTGTCTGAAAAAAATTTTTCTATr ri GTCGAAATAACTCCATATTAAGTGGGTGGTC
• • II • -I I I
2912 2931 2940 2950
2913
Hae III BsmA I Stu I
Mse I Mbo II Hae I Mnl I
Dra I Ear I Mse I Mnl I Nla III
ACTTT TTAAACATCAAATAATT AALGACAAT GC I 11 I I I
TGAAAAAA'ITTGTAGTTTATTAACTTCTCTGTTATCGTAATCTTTATTCACTAATTTCCGGAGACGGAGTGTTGTACCGT
I I • - I I - • - I N I I • I
2966 2984 3013 3020 3034
2967 2984 3017 3026
2987 3017
3018
Rsa I Mae II
Sca I Mse I BsmA I
Rsa I Mse I Dra I Rsa I Dde I
I I I I I I I I I I
AGTACAGTACTTTGAATTTTAGCACATTGCATAATAGTTTTAAGTATGTCTAATTTAAACGTATAATATGTACATCACTG 120 TCATGTCATGAAACTTAAAATCGTGTAACGTATTATCAAAATTCATACAGATTAAATTTGCATATTATACATGTAGTGAC
I II • • • I - II I- I I I
3042 3080 3094 3110 3118
3046 3095 3120
3047 3099
Rsa I
Nla III AGACAATC IATGITACAGAAAGAATTTTTGGTGTAAATTTGTAATAATGGATAATTCTTTTACATATTGTTTAGGGAAATGA 3200
TCTGTTAGTACATGTCTTTCTTAAAAACCACATTTAAACATTATTACCTATTAAGAAAATGTATAACAAATCCCTTTACT
| *| • • * • • • •
3128
3131
HgiA I
Bsp1286 I
Mae II
BstN I Nla III BsaA I Nla III Dde I
I I I I I I I I I
TATTGAAAGGTAGCAATGCCTGGATAGTGAAGCATGAGGCAGCACGTGCACAAATTCATGTGCCGTGCCTTATCTGAGTT 3280 ATAACTTTCCATCGTTACGGACCTATCACTTCGTACTCCGTCGTGCACGTGTTTAAGTACACGGCACGGAATAGACTCAA
I - • I I I - I I I • I • - I
3219 3233 3243 3257 3274
3219 3236 3243
3219 3239 3246
3239
3244
3246
3246
Nla III
BstX I Fok I
TTCGGTATAAATATGTAGATAATGGATTTTTTTTTAGATAATGTTGTCAAGACCAAAAGCATGGATGTCAAGTGTCAGTA 3360 AAGCCATATTTATACATCTATTACCTAAAAAAAAATCTATTACAACAGTτCTGGTTTTCGTACCTACAGTTCACAGTCAT
• • • • * I I 1 * *
3333 3343 3340 Hae III EcoO109 I Mnl I Dde I SfaN I Mbo II Sau96 I Mse I
I I I M M I
AGGATTTTGTTTTCTAAAATTTTTTCCTGCATCAGTTCTTCTGAGGGCCTTGATGAAATAACACAGCAGTTTCTTAAACA 3440 TCCTAAAACAAAAGATTTTAAAAAAGGACGTAGTCAAGAAGACTCCCGGAACTACTTA'iτGTGTCGTCAAAGAATTTGT
I- I -I MM • • • I •
3389 3397 3405 3434
3401 3403 3404 3406
Alu I Sac I HgiA I Bsp1286 I Ban II Mnl I Mae III
I I I I
ATTTGAAACAAAATGAXSCTCTCCTACCACCTCACTTTTTCATTTCCACACTAATGTATTATATGTAACTACTTGGAAAAA 3520 TAAACTTTGTTTTACTCGAGAGGATGGTGGAGTGAAAAAGTAAAGGTGTGATTACATAATATACATTGATGAACCTTTTT
I I - I • • • • I •
3455 3469 3504
3455 3455 3455 3456
Hinf I Mse I Nla III
Mbo II Ase I BspH I
I II II I
ATAATTATTCAAATGCTTCTTCCCACAAAGAATATAGATGATAGTAGATATATTTTATTAATAAAATGGTTCATGAATCG 3600 TATTAATAAGTTTACGAAGAAGGGTGTTTCTTATATCTACTATCATCTATATAAAATAATTATTTTACCAAGTACTTAGC
I • • • II • -II I -
3538 577 3591
3578 3592
3595
Dde I HgiA I
Bspl286 I Mse I Mbo II
BsmA I Nla III Ase I Taq I Bbv II SfaN I
G IAGACTAACAAAGTTTTCIATGITGCITCAGAATTAITITAATTATCGTGTCTGCATTTTCTTTICGATAAAGGIAAGACACACGIAT 3680 CTCTGATTGTTTCAAAAGTACACGAGTCTTAATAATTAATAGCACAGACGTAAAAGAAAGCTATTTCCTTCTGTGTGCTA
I • I - I I • I I • • I - I • I •
3601 3618 3633 3 3665599 3 3666688 3678
3621 3634 3668 3621 3624
Dde I EcoN I
Msp I Bsu36 I
Hpa II Mae III
BspM II Mbo II Hph I Mnl I
GCTAAT MCCGGAAATCAGCAAACTTTGCATTACTCCCTATGTGCGTATTTTCTCTTTICTTCCTGITICACCICITGIAGGAAGGTT 3760
CGATTAGGCCTTTAGTCGTTTGAAACGTAATGAGGGATACACGCATAAAAGAGAAAGAAGGACAGTGGGACTCCTTCCAA
II • • • •
3686 3I736' I3I744ll-3l751
3687 3743
3687 3748 3748 3749
Nla III
Sty I
Sec I
Nco I
Dsa I Mnl I Nla III
Hph I Mae II BspM I Rsa I SfaN I
I I I I I I I I I
CATTGCCATTGTCATCACCATGGAAACAACGTTCCTCTCCACCTGCATTATGTACTACATGACAGGCATCAATCTGGGGA 3840 GTAACGGTAACAGTAGTGGTACCTTTGTTGCAAGGAGAGGTGGACGTAATACATGATGTACTGTCCGTAGTTAGACCCCT
3 I775II- 3I7-89 I - 3I801 - 3l812 l - 3l826-
3778 3794 3818
3778
3778
3778
3779
Hph I Ssp I
AATAATAAAATTAT ICACCTTTGTCAGACCATAAGAGTTTCTCCAAAAGTGGTCAGTTTGGCTGGGCAIATATTTTCTCTCA 3920 TTATTATTTTAATAGTGGAAACAGTCTGGTATTCTCAAAGAGGTTTTCACCAGTCAAACCGACCCGTTATAAAAGAGAGT • I • • • • • I ■ •
3854 3907
Bbv II Fok I Mbo II Nla III Dde I Ple I Tth111 II BspH I Bsu36 I Hinf I Mse I
I I I I I I I I I I
TCTAACAAACACAATC ATTGTCATGAAATTACCCTTAGGATGAGTCTTCTTTAATCAATCATATATTGGGCGGAAAAAA 4000 AGATTGTTTGTGTTAOTTAACAGTACnraAATGGGAATCCTACTCAGAAGAAATTAG'raAGTATATAACCCGCCTTTTTT
3 I926• - 3I9I42 • II I- - I II • I
3954 3963 3972
3943 3955 3963
3959 3966 3965
Alu I Mae I
Mbo II Fnu4H I
Alu I Ear I Eco57 I Bbv I
CACCA IGCTrTGACCCGAAGTAGTTG IAAGAIGCTACTTCATTCTTTTC ITGAAGTTGTGTGTTG ICTGCITAGAAATAGTCATTT 4080 GTGGTCGAAACTGGGCTTCATCAACTTCTCGATGAAGTAAGAAAAGACTTCAACACACAACGACGATCTTTATCAGTAAA
I • - I I- • I • -I I -
4005 4025 4046 4061
4025 4061
4029 4065
Mse I HinC II
Dra I Mbo II Tthlll II
I I I I I
GTGAATTATCCAAATTGTTTAAATTCACAATTGAATTAGT'ITI TCTTCCTTTTGGCTTGAAGCAAACAGTTGACCATTT 4160 CACTTAATAGGTTTAACAAATTTAAGTGTTAACTTAATCAAAAAAGAAGGAAAACCGAACTTCGTTTGTCAACTGGTAAA
4099 4150
Pst I Hae III
Mse I Rsa I Hae I
TTAACCTTTTCATTTTATGTTTTTGTACTCTGCAGACTGAAAAGACAAAGTTTATCTTGGCCTTACTGTATAAAGGTATG 4240 AATTGGAAAAGTAAAATACAAAAACATGAGACGTCTGACTTTTCTGTTTCAAATAGAACCGGAATGACATATTTCCATAC
4161 4185 4218
4190 4219
Mse I Ase I Mse I
Rsa I Mbo II Mse I Rsa I
CTGTGTCCACCGTTGTGTACAGAATTTTTCTTCATTAATTTTGTGTTTAAGTTAATAAAATTTATTTGTGATGTACTGTA 4320 GACACAGGTGGCAACACATGTCTTAAAAAGAAGTAATTAAAACACAAATTCAATTATTTTAAATAAACACTACATGACAT
4257 4269 4287 4313
4274 4292 4275
A 4321 T
Restriction Endonucleases site usage
Aat II - BstN I 13 HinC II 3 Ple I 4
Acc I - BstU I 4 HinD III 1 Pml I 2
Afl II - BstX I 1 Hinf I 11 PpuM I 3
Afl III 1 BstY I 5 HinP I 14 Pst I 4
Aha II 1 Bsu36 I 2 Hpa I - Pvu I -
Alu I 21 Cfr10 I 1 Hpa II 3 Pvu II 2
Alw I 6 Cla I 1 Hph I 11 Rsa I 18
AlwN I 4 Dde I 18 Kpn I - Rsr II -
Apa I - Dpn I 7 Mae I 3 Sac I 1
ApaL I 1 Dra I 5 Mae II 7 Sac II 1
Ase I 3 Dra III - Mae III 11 Sal I -
Asp718 - Drd I - Mbo I 7 Sau3A I 7
Ava I 2 Dsa I 3 Mbo II 18 Sau96 I 13
Ava II 6 Eae I 2 Mlu I - Sca I 2
Avr II - Eag I - Mme I 3 ScrF I 14
BamH I 1 Ear I 2 Mnl I 30 Sec I 11
Ban I 3 Eco47 III 3 Msc I 1 SfaN I 11
Ban II 2 Eco57 I 6 Mse I 22 Sfi I -
Bbe I 1 EcoN I 2 Msp I 3 Sma I -
Bbv I 15 EcoO109 I 5 Nae I 1 SnaB I -
Bbv II 2 EcoR I - Nar I 1 Spe I -
Bcl I - EcoR II 13 Nci I 1 Sph I -
Bcn I 1 EcoR V - Nco I 2 Spl I 1
Bgl I - Esp I - Nde I 2 Ssp I 3
Bgl II - Fnu4H I 20 Nhe I - Stu I 1
BsaA I 3 Fok I 9 Nla III 15 Sty I 4
Bsm I 1 Fsp I 2 Nla IV 10 Taq I 5
BsmA I 8 Gdi II 1 Not I - Tth111 I 2
Bsp1286 I 4 Gsu I 5 Nru I - Tth111 II 10
BspH I 2 Hae I 4 Nsi I - Xba I -
BspM I 3 Hae II 8 Nsp7524 I - Xca I -
BspM II 1 Hae III 13 NspB II 5 Xho I -
Bsr I 15 Hga I 1 NspH I - Xcm I -
BssH II - HgiA I 3 PaeR7 I - Xma I -
BstB I 1 HgiE II - PflM I 1 Xmn I 1
BstE II - Hha I 14
Enzyme Site Use Site position (Fragment length) Fragment order
Afl III a/crygt 1 1( 341) 2 342( 3980) 1
Aha II gr/cgyc 1 1( 390) 2 391( 3931) 1
ApaL I g/tgcac 1 1( 3245) 1 3246( 1076) 2
BamH I g/gatcc 1 1( 1687) 2 1688( 2634) 1
Bbe I ggcgc/c 1 1( 390) 2 391( 3931) 1
Bcn I ccs/gg 1 1( 2109) 2 2110( 2212) 1
Bsm I gaatgc 1/-1 1 1( 987) 2 988( 3334) 1
BspM II t/ccgga 1 1( 3685) 1 3686( 636) 2
BstB I tt/cgaa 1 1( 538) 2 539( 3783) 1
BstX I ccannnnn/ntgg 1 1( 3332) 1 3333( 989) 2
Cfr10 I r/ccggy 1 1( 2494) 1 2495( 1827) 2
Cla I at/cgat 1 1( 1515) 2 1516( 2806) 1
Gdi II yggccg -5/-1 1 1( 384) 2 385( 3937) 1
Hga I gacgc 5/10 1 1( 1522) 2 1523( 2799) 1
HinD III a/agctt 1 1( 1141) 2 1142( 3180) 1
Msc I tgg/cca 1 1( 1300) 2 1301( 3021) 1
Nae I gcc/ggc 1 1( 2494) 1 2495( 1827) 2
Nar I gg/cgcc 1 1( 390) 2 391( 3931) 1
Nci I cc/sgg 1 1( 2109) 2 2110( 2212) 1
PflM I ccannnn/ntgg 1 1( 2389) 1 2390( 1932) 2
Sac I gagct/c 1 1( 3454) 1 3455( 867) 2
Sac II ccgc/gg 1 1( 447) 2 448( 3874) 1
Spl I c/gtacg 1 1( 1035) 2 1036( 3286) 1
Stu I agg/cct 1 1( 3016) 1 3017( 1305) 2
Xmn I gaann/nnttc 1 1( 1230) 2 1231( 3091) 1
Ava I c/ycgrg 2 1( 81) 3 82( 287) 2 369( 3953) 1
Ban II grgcy/c 2 1( 471) 3 472( 2983) 1 3455( 867) 2
Bbv II gaagac 2/6 2 1( 3667) 1 3668( 297) 3 3965( 357) 2
BspH I t/catga 2 1( 3590) 1 3591( 351) 3 3942( 380) 2
Bsu36 I cc/tnagg 2 1( 3747) 1 3748( 206) 3 3954( 368) 2
Eae I y/ggccr 2 1( 384) 3 385( 916) 2 1301( 3021) 1
Ear I ctcttc 1/4 2 1( 2983) 1 2984( 1041) 2 4025( 297) 3
EcoN I cctnn/nnnagg 2 1( 2346) 1 2347( 1401) 2 3748( 574) 3
Fsp I tgc/gca 2 1( 111) 3 112( 2419) 1 2531( 1791) 2
Nco I c/catgg 2 1( 90) 3 91( 3687) 1 3778( 544) 2
Nde I ca/tatg 2 1( 960) 2 961( 90) 3 1051( 3271) 1
Pml I cac/gtg 2 1( 1545) 2 1546( 1697) 1 3243( 1079) 3
Pvu II cag/ctg 2 1( 2442) 1 2443( 157) 3 2600( 1722) 2
Sca I agt/act 2 1( 2613) 1 2614( 432) 3 3046( 1276) 2
Tth111 I gacn/nngtc 2 1( 1784) 1 1785( 793) 3 2578( 1744) 2
Ase I at/taat 3 1( 3576) 1 3577( 56) 3 3633( 641) 2 4274( 48) 4
Ban I g/gyrcc 3 1( 390) 3 391( 362) 4 753( 416) 2 1169( 3153) 1
BsaA I yac/gtr 3 1( 1012) 3 1013( 533) 4 1546( 1697) 1 3243( 1079) 2
BspM I acctgc 4/8 3 1( 630) 3 631( 2081) 1 2712( 1089) 2 3801( 521) 4
Dsa I c/crygg 3 1( 90) 4 91( 357) 3 448( 3330) 1 3778( 544) 2
Eco47 III agc/gct 3 1( 199) 4 200( 379) 3 579( 1840) 2 2419( 1903) 1
HgiA I gwgcw/c 3 1( 3245) 1 3246( 209) 3 3455( 166) 4 3621( 701) 2
HinC II gty/rac 3 1( 2071) 1 2072( 533) 3 2605( 1545) 2 4150( 172) 4
Hpa II c/cgg 3 1( 2110) 1 2111( 385) 4 2496( 1191) 2 3687( 635) 3
Mae I c/tag 3 1( 153) 4 154( 2458) 1 2612( 1453) 2 4065( 257) 3
Mme I tccrac 20/18 3 1( 287) 4 288( 618) 3 906( 1086) 2 1992( 2330) 1
Msp I c/cgg 3 1( 2110) 1 2111( 385) 4 2496( 1191) 2 3687( 635) 3
PpuM I rg/gwccy 3 1( 1661) 1 1662( 166) 4 1828( 881) 3 2709( 1613) 2
Ssp I aat/att 3 1( 2126) 1 2127( 84) 4 2211( 1696) 2 3907( 415) 3
AlwN I cagnnn/ctg 4 1( 1506) 2 1507( 81) 5 1588( 947) 3 2535( 255) 4 2790( 1532) 1
Bsp1286 I gdgch/c 4 1( 471) 3 472( 2774) 1 3246( 209) 4 3455( 166) 5 3621( 701) 2
BstU I cg/cg 4 1( 448) 3 449( 30) 4 479( 2) 5 481( 1858) 2 2339( 1983) 1
Hae I wgg/ccw 4 1( 1300) 1 1301( 638) 4 1939( 1078) 3 3017( 1201) 2 4218( 104) 5
Ple I gagtc 4/5 4 1( 742) 3 743( 1321) 2 2064( 544) 4 2608( 1355) 1 3963( 359) 5
Pst I ctgca/g 4 1( 419) 3 420( 1450) 2 1870( 161) 4 2031( 2159) 1 4190( 132) 5
Sty I c/cwwgg 4 1( 90) 5 91( 1618) 1 1709( 686) 3 2395( 1383) 2 3778( 544) 4
BstY I r/gatcy 5 1( 649) 4 650( 801) 2 1451( 237) 5 1688( 668) 3 2356( 109) 6 2465( 1857) 1
Dra I ttt/aaa 5 1( 817) 3 818( 2094) 1 2912( 54) 6 2966( 128) 5 3094( 1004) 2 4098( 224) 4
EcoO109 I rg/gnccy 5 1( 1661) 1 1662( 166) 6 1828( 646) 4 2474( 235) 5 2709( 695) 3 3404( 918) 2
Gsu I ctggag 16/14 5 1( 274) 3 275( 2076) 1 2351( 9) 6 2360( 43) 5 2403( 98) 4 2501( 1821) 2
NspB II cmg/ckg 5 1( 176) 4 177( 205) 3 382( 66) 6 448( 1995) 1 2443( 157) 5 2600( 1722) 2
Taq I t/cga 5 1( 123) 5 124( 416) 4 540( 977) 2 1517( 9) 6 1526( 2133) 1 3659( 663) 3
Alw I ggatc 4/5 6 1( 649) 4 650( 802) 2 1452( 236) 5 1688( 1) 7 1689( 667) 3 2356( 110) 6 2466( 1856) 1
Ava II g/gwcc 6 1( 1662) 1 1663( 166) 6 1829( 178) 5 2007( 365) 3 2372( 205) 4 2577( 133) 7 2710( 1612) 2
Eco57 I ctgaag 16/14 6 1( 1080) 2 1081( 64) 7 1145( 554) 4 1699( 111) 6 1810( 913) 3 2723( 1323) 1 4046( 276) 5
Dpn I ga/tc 7 1( 650) 4 651( 801) 2 1452( 63) 8 1515( 81) 7 1596( 93) 6 1689( 668) 3 2357( 109) 5 2466( 1856) 1
Mae II a/cgt 7 1( 343) 7 344( 670) 2 1014( 533) 4 1547( 1035) 1 2582( 517) 6 3099( 145) 8 3244( 545) 3 3789( 533) 5
Mbo I /gatc 7 1( 650) 4 651( 801) 2 1452( 63) 8 1515( 81) 7
1596( 93) 6 1689( 668) 3 2357( 109) 5 2466( 1856) 1
Sau3A I /gatc 7 1( 650) 4 651( 801) 2 1452( 63) 8 1515( 81) 7
1596( 93) 6 1689( 668) 3 2357( 109) 5 2466( 1856) 1
BsmA I gtctc 1/5 8 1( 1099) 1 1100( 529) 4 1629( 853) 2 2482( 310) 6
2792( 67) 9 2859( 128) 8 2987( 133) 7 3120( 481) 5
3601( 721) 3
Hae II rgcgc/y 8 1( 199) 3 200( 191) 4 391( 36) 7 427( 17) 9
444( 19) 8 463( 116) 5 579( 1840) 1 2419( 73) 6
2492( 1830) 2
Fok I ggatg 9/13 9 1( 128) 9 129( 360) 5 489( 74)10 563( 824) 2
1387( 238) 7 1625( 302) 6 1927( 226) 8 2153( 1190) 1
3343( 616) 3 3959( 363) 4
Nla IV ggn/ncc 10 1( 86) 9 87( 159) 6 246( 145) 7 391( 311) 5
702( 51)10 753( 416) 3 1169( 386) 4 1555( 107) 8
1662( 26)11 1688( 787) 2 2475( 1847) 1
Tth111 II caarca 11/9 10 1( 115) 8 116( 68)10 184( 39)11 223( 365) 5
588( 381) 4 969( 471) 2 1440( 453) 3 1893( 109) 9
2002( 1924) 1 3926( 218) 6 4144( 178) 7
Hinf I g/antc 11 1( 134) 8 135( 555) 3 690( 53)10 743( 760) 2
1503( 6)12 1509( 81) 9 1590( 474) 5 2064( 544) 4
2608( 14)11 2622( 973) 1 3595( 368) 6 3963( 359) 7
Hph I ggtga 8/7 11 1( 874) 2 875( 232) 6 1107( 228) 7 1335( 97) 9
1432( 146) 8 1578( 948) 1 2526( 41)11 2567( 383) 5
2950( 794) 3 3744( 31)12 3775( 79)10 3854( 468) 4
Mae III /gtnac 11 1( 217)10 218( 497) 3 715( 379) 4 1094( 12)12
1106( 340) 6 1446( 337) 7 1783( 353) 5 2136( 278) 8
2414( 39)11 2453( 1051) 1 3504( 239) 9 3743( 579) 2
Sec I c/cnngg 11 1( 82) 9 83( 8)11 91( 243) 7 334( 114) 8
448( 856) 2 1304( 405) 6 1709( 686) 3 2395( 62)10
2457( 410) 5 2867( 1)12 2868( 910) 1 3778( 544) 4
SfaN I gcatc 5/9 11 1( 165) 8 166( 398) 5 564( 508) 3 1072( 115)11
1187( 242) 7 1429( 913) 1 2342( 145)10 2487( 42)12
2529( 860) 2 3389( 289) 6 3678( 148) 9 3826( 496) 4
BstN I cc/wgg 13 1( 333) 7 334( 54)12 388( 104) 9 492( 534) 3
1026( 84)10 1110( 51)13 1161( 143) 8 1304( 722) 2
2026( 42)14 2068( 389) 4 2457( 66)11 2523( 345) 6
2868( 351) 5 3219( 1103) 1
EcoR II /ccwgg 13 1( 333) 7 334( 54)12 388( 104) 9 492( 534) 3
1026( 84)10 1110( 51)13 1161( 143) 8 1304( 722) 2
2026( 42)14 2068( 389) 4 2457( 66)11 2523( 345) 6
2868( 351) 5 3219( 1103) 1
Hae III gg/cc 13 1( 385) 5 386( 316) 7 702( 600) 2 1302( 255) 9
1557( 383) 6 1940( 102)11 2042( 39)13 2081( 300) 8
2381( 94)12 2475( 23)14 2498( 520) 3 3018( 388) 4
3406( 813) 1 4219( 103)10
Sau96 I g/gncc 13 1( 701) 3 702( 854) 2 1556( 107) 9 1663( 166) 7
1829( 178) 6 2007( 34)13 2041( 40)12 2081( 291) 5
2372( 9)14 2381( 94)11 2475( 102)10 2577( 133) 8
2710( 695) 4 3405( 917) 1
Hha I gcg/c 14 1( 112) 4 113( 88) 6 201( 7)15 208( 184) 3
392( 13)14 405( 23) 9 428( 17)12 445( 19)10
464( 16)13 480( 100) 5 580( 1840) 1 2420( 18)11
2438( 55) 7 2493( 39) 8 2532( 1790) 2
HinP I g/cgc 14 1( 112) 4 113( 88) 6 201( 7)15 208( 184) 3
392( 13)14 405( 23) 9 428( 17)12 445( 19)10
464( 16)13 480( 100) 5 580( 1840) 1 2420( 18)11
2438( 55) 7 2493( 39) 8 2532( 1790) 2
ScrF I cc/ngg 14 1( 333) 7 334( 54)12 388( 104) 9 492( 534) 3
1026( 84)10 1110( 51)13 1161( 143) 8 1304( 722) 2
2026( 42)14 2068( 42)15 2110( 347) 5 2457( 66)11
2523( 345) 6 2868( 351) 4 3219( 1103) 1
Bbv I gcagc 8/12 15 1( 109)12 110( 69)14 179( 186) 8 365( 16)15
381( 72)13 453( 114)11 567( 214) 7 781( 176) 9
957( 406) 5 1363( 506) 3 1869( 3)16 1872( 158)10
2030( 415) 4 2445( 794) 2 3239( 822) 1 4061( 261) 6
Bsr I actgg 1/-1 15 1( 276) 6 277( 280) 5 557( 44)14 601( 96)12
697( 181) 9 878( 416) 2 1294( 365) 3 1659( 220) 7
1879( 23)15 1902( 22)16 1924( 154)10 2078( 324) 4
2402( 77)13 2479( 147)11 2626( 191) 8 2817( 1505) 1
Nla III catg/ 15 1( 91)11 92( 648) 2 740( 1638) 1 2378( 13)16
2391( 250) 6 2641( 393) 3 3034( 94)10 3128( 105) 9
3233( 24)15 3257( 83)12 3340( 252) 5 3592( 26)14
3618( 161) 7 3779( 39)13 3818( 125) 8 3943( 379) 4
Dde I c/tnag 18 1( 161)11 162( 191) 9 353( 115)17 468( 207) 7
675( 155)13 830( 272) 5 1102( 26)19 1128( 644) 1
1772( 61)18 1833( 181)10 2014( 132)14 2146( 363) 4
2509( 609) 2 3118( 156)12 3274( 127)15 3401( 223) 6
3624( 125)16 3749( 206) 8 3955( 367) 3
Mbo II gaaga 8/7 18 1( 329) 5 330( 1231) 1 1561( 164) 8 1725( 51)16
1776( 33)18 1809( 466) 2 2275( 38)17 2313( 3)19
2316( 387) 4 2703( 281) 6 2984( 413) 3 3397( 141)10
3538( 130)11 3668( 68)13 3736( 230) 7 3966( 59)14
4025( 100)12 4125( 144) 9 4269( 53)15
Rsa I gt/ac 18 1( 483) 4 484( 532) 2 1016( 21)16 1037( 113)12
1150( 163) 9 1313( 153)11 1466( 442) 5 1908( 500) 3
2408( 207) 8 2615( 265) 7 2880( 162)10 3042( 5)19
3047( 63)14 3110( 21)17 3131( 681) 1 3812( 373) 6
4185( 72)13 4257( 56)15 4313( 9)18
Fnu4H I gc/ngc 20 1( 109)12 110( 66)13 176( 3)16 179( 186) 8
365( 13)15 378( 3)17 381( 3)18 384( 63)14
447( 3)19 450( 3)20 453( 114)11 567( 214) 7
781( 176) 9 957( 406) 5 1363( 506) 3 1869( 3)21
1872( 158)10 2030( 415) 4 2445( 794) 2 3239( 822) 1
4061( 261) 6
Alu I ag/ct 21 1( 27)19 28( 186) 8 214( 25)20 239( 172)10
411( 383) 4 794( 204) 6 998( 133)15 1131( 12)22
1143( 36)18 1179( 153)13 1332( 76)17 1408( 604) 1
2012( 137)14 2149( 95)16 2244( 200) 7 2444( 157)11
2601( 156)12 2757( 174) 9 2931( 525) 3 3456( 549) 2
4005( 24)21 4029( 293) 5
Mse I t/taa 22 1( 294) 5 295( 258) 7 553( 210)10 763( 56)16
819( 442) 2 1261( 237) 8 1498( 267) 6 1765( 915) 1
2680( 233) 9 2913( 54)18 2967( 46)19 3013( 67)14
3080( 15)21 3095( 339) 3 3434( 144)11 3578( 56)17
3634( 338) 4 3972( 127)12 4099( 62)15 4161( 114)13
4275( 12)22 4287( 5)23 4292( 30)20
Mnl I cctc 7/7 30 1( 84)20 85( 41)26 126( 137)11 263( 162)10
425( 93)19 518( 104)16 622( 321) 2 943( 105)15
1048( 107)14 1155( 97)18 1252( 276) 6 1528( 53)24
1581( 104)17 1685( 311) 4 1996( 22)27 2018( 21)28
2039( 315) 3 2354( 21)29 2375( 125)12 2500( 54)23
2554( 21)30 2575( 248) 7 2823( 117)13 2940( 80)21
3020( 6)31 3026( 210) 8 3236( 167) 9 3403( 66)22
3469( 282) 5 3751( 43)25 3794( 528) 1
587 sites found
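The fragment columns of the table above follow directly from the listed site positions. The following is a minimal Python sketch of that arithmetic, not the program used to produce the annex. It assumes that each listed position is the first base of the fragment that follows it, and that fragments are numbered by decreasing length, with ties kept in map order. The function name `fragment_table` is hypothetical.

```python
def fragment_table(site_positions, sequence_length):
    """Return (start, length, order) for each fragment of a linear sequence.

    Assumption: each site position is the first base of the next fragment,
    and order 1 is the longest fragment (ties keep their map order).
    """
    starts = [1] + sorted(site_positions)
    ends = [s - 1 for s in starts[1:]] + [sequence_length]
    lengths = [end - start + 1 for start, end in zip(starts, ends)]

    # Rank fragments by decreasing length; sorted() is stable, so equal
    # lengths keep the order in which they occur along the sequence.
    by_length = sorted(range(len(lengths)), key=lambda i: -lengths[i])
    order = [0] * len(lengths)
    for rank, index in enumerate(by_length, start=1):
        order[index] = rank

    return list(zip(starts, lengths, order))


if __name__ == "__main__":
    # Afl III has one site at 342 in the 4321 b.p. sequence.
    for start, length, rank in fragment_table([342], 4321):
        print(f"{start}( {length}) {rank}")
```

With the Ava I positions 82 and 369, the same call yields fragments of 81, 287 and 3953 b.p., which sum to the 4321 b.p. length of the sequence.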
No Sites found for the following Restriction Endonucleases
Aat II gacgt/c EcoR I g/aattc Pvu I cgat/cg
Acc I gt/mkac EcoR V gat/atc Rsr II cg/gwccg
Afl II c/ttaag Esp I gc/tnagc Sal I g/tcgac
Apa I gggcc/c HgiE II accnnnnnnggt Sfi I ggccnnnn/nggcc
Asp718 g/gtacc Hpa I gtt/aac Sma I ccc/ggg
Avr II c/ctagg Kpn I ggtac/c SnaB I tac/gta
Bcl I t/gatca Mlu I a/cgcgt Spe I a/ctagt
Bgl I gccnnnn/nggc Nhe I g/ctagc Sph I gcatg/c
Bgl II a/gatct Not I gc/ggccgc Xba I t/ctaga
BssH II g/cgcgc Nru I tcg/cga Xca I gta/tac
BstE II g/gtnacc Nsi I atgca/t Xho I c/tcgag
Dra III cacnnn/gtg Nsp7524 I r/catgy Xcm I ccannnnn/nnnntgg
Drd I gacnnnn/nngtc NspH I rcatg/y Xma I c/ccggg
Eag I c/ggccg PaeR7 I c/tcgag
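The recognition sequences in these tables use the IUPAC degenerate codes (r, y, m, k, s, w, n, ...), with "/" marking the cleavage point. The following is a minimal Python sketch of how such sites can be located in a sequence; it is an illustration, not the software that generated the annex. It assumes, from entries such as BstB I (tt/cgaa) at 539, that a site is reported at the first base of the recognition sequence on the strand shown, and it searches both strands by also matching the reverse complement. The names `find_sites` and `ROW_481` are hypothetical.

```python
import re

# IUPAC nucleotide codes expressed as regular-expression character classes.
IUPAC = {
    "a": "a", "c": "c", "g": "g", "t": "t",
    "r": "[ag]", "y": "[ct]", "m": "[ac]", "k": "[gt]",
    "s": "[cg]", "w": "[at]", "h": "[act]", "b": "[cgt]",
    "v": "[acg]", "d": "[agt]", "n": "[acgt]",
}
COMPLEMENT = str.maketrans("acgtrymkswhbvdn", "tgcayrkmswdvbhn")


def reverse_complement(site):
    """Reverse complement of a (possibly degenerate) recognition sequence."""
    return site.translate(COMPLEMENT)[::-1]


def find_sites(sequence, site, first_position=1):
    """Return sorted positions at which `site` occurs on either strand.

    `site` may contain a "/" cleavage marker, which is ignored here.
    Assumption: a hit is reported at the first base of the match.
    """
    core = site.lower().replace("/", "")
    seq = sequence.lower()
    hits = set()
    for pattern in {core, reverse_complement(core)}:
        # A lookahead keeps overlapping occurrences.
        regex = re.compile("(?=" + "".join(IUPAC[base] for base in pattern) + ")")
        hits.update(m.start() + first_position for m in regex.finditer(seq))
    return sorted(hits)


# Bases 481 to 560 of the Annex 4 sequence, as printed in the map.
ROW_481 = ("CGCGTACCCATCCAGGAGCAAAACTATGTCAGGAATGGAGGTTTGCTAAC"
           "CCAGAAAATTCGAAGGAACACATTAAACTG")

if __name__ == "__main__":
    for name, site in [("Rsa I", "gt/ac"), ("BstN I", "cc/wgg"),
                       ("BstB I", "tt/cgaa"), ("Mse I", "t/taa")]:
        print(name, find_sites(ROW_481, site, first_position=481))
```

An enzyme belongs in the "No Sites found" list when `find_sites` returns an empty list for the full sequence; on the 80-base excerpt above this is the case, for example, for EcoR I (g/aattc).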
ANNEX 5
DNA sequence 3859 b.p. TCTCCTTTTTCT ... GATGTACTGTAA linear
Positions of Restriction Endonucleases sites (unique sites underlined)
I 80
Sau3A I
Mbo I
Dpn I
Mae III Mbo II Bleeil TT IIII Bsr I
GCAGAG ITAACAACATICTTCTAArrT'lTTTACCCCTIGIATCACAGGTGCIAAACATCTCAAGCICAGTTCAGATGTTGCTGTTT 160 CGTCrrCATTGTTGTAGAAGATTAAAAAAATGGGGACTAGTGTCCACGTTTGTAGAGTTCGGTCAAGTCTACAACGACAAA
8 I6 • 9I5 1I1I5 1I27' 1I40
116 116 116 Sau3A I Mbo I Dpn I BstY I Bsr I
Mnl I BspM I Alw I Dde I Hinf I
C ICTCAAGTTGICACKn'CTATGGAAATGCAGIGIATCTAACCAGCCCGCATAeCCGTCITGAGTGGTAGTAGTGIAATCCCCICAGT 2 0 GGAGTTCAACGTCCAGATACCTTTACGTCCTAGATTGGTCGGGCGTATCGGCAGACTCACCATCATCACTTAGGGGGTCA
1 I61 1I70 • 1I8I9 • • 2I14 • 2I2-9 I •
189 236 190 190 190
Sau96 I Ple I
Nla IV Hinf I Nla IV Fnu4H I
Hae III Mae III Nla III Ban I Mse I Bbv I
G IGCCCCAAACTCGGITAACTCTCATATAAATAGTAATTCCtATGiACTCCCAATGlGCACCGAAGTlTAAAACAGAGCCAATGAGI 320 CCOGGGTTTGAGCCATTGAGAGTATATTTATCATTAAGGTACTGAGGGTC^ 2 I41 ' 2I54 —• • 2I7-9I - 2I92 - 3I02 • 3I20
241 282 292 320
241 282
Mse I Alu I Dra I Dde I
CAGCAGTGAAACA IGCTTCAACGACAGCCGACGGGTCTIrITAAACAATTTCITCA∞ 400
417
Nde I Rsa I
Fnu4H I Mae II
Mnl I Bbv I Tthlll II Bsm T Alu I BsaA I
TC ICTCACAAACTATGGIC GCIATATGGGCIAAACACAGTTTACCACAGGIAATGCAACAAIGCTACAGCCTATGCCIAICGITACCC 560 AGGAGTGrraGATACCGACGTATACCCGTTTGTGTCAAATGGTGTCCTTACGTTGTTCGATGTCGGATACGGTGCATGGG
4 '82 • 4I96 I 5 I08 • • 5 I27 • 5 I37 • • 5 I5I2 I
496 553
500 555
ScrF I
EcoR II Rsa I Nde I BsmA I
BstN I Sol I Mnl I SfaN I Eco57 I Mae III
ACAGC ICAGGACAGCC IGITACGGCATTTC ICTCIATATGGTGCATTGTGGGCAGG ICATCAAGACITGAAGGTGGATTG ITCACAGIT 640 TGTCGGTCCTGTCGGCATGCCGTAAAGGAGTATACCACGTAACACCCGTCCGTAGTTCTGACTTCCACCTAACAGTGTCA
I . M • I I • -I I - I I-
565 575 587 611 620 633
565 576 590 639
565
ScrF I
EcoR II ScrF I
Hph I Eco57 I EcoR II
Mae III Alu I Alu I Rsa I BstN I Nla IV
Dde I BstN I Dde I HinD III Mnl I Ban I Alu I
C ITCAG ITICAC ICTGGACAGACAGGATTTC ITCAIGCTATGGCACA IAIGC ITTCAG ITACCC ICTCAAC ICTGGACAG IGCACCATACA IGC 720 GAGTCAGTGGACCTGTCTGTCCTAAAGAGTCGATACCGTGTTCGAAGTCATGGGGAGTTGGACCTGTCCGTGGTATGTCG
I I I I - • I I - I I I I - I I I • I •
641 649 666677 668811 669944 770088 718
645 670 682 689 700 708 646 684 700 649 700 649 SfaN I Xmn I Mnl I Mse I
I I I I
TACCAGATGCAAGGTAGCAGTTTTACAACATCATCAGGAATATATACAGGAAATAATTCACTCACAAATTCCTCTGGATT 800 ATGGTCTACGTTCCATCGTCAAAATGTTGTAGTAGTCCTTATATATGTCCTTTATTAAGTGAGTGTTTAAGGAGACCTAA
7 I26 • • • • 7I70 • -7I91 8I00
Sec I ScrF I EcoR II BstN I Hae III Msc I
Hae I Hph I
Bsr I Eae I Rsa I Alu I
I I I I I I I
TAATAGTTCACAGCAGGACTATCCGTCTTATCCCAGTTTTGGCCAGGGTCAGTACGCACAGTATTATAACAGCTCACCGT 880 ATTATCAAGTGTCGTCCTGATAGGCAGAATAGGGTCAAAACCGGTCCCAGTCATGCGTGTCATAATATTGTCGAGTGGCA
• I II I - I • -I I -
833 840 852 871
840 874
840 841 843 843 843 843 Fnu4H I Bbv I Fok I Alu I
I I I
ATCCAGCACATTATATGACCAGCAGCAACACCAGCCCAACGACACCATCCACCAATGCCACTTACCAGCTTCAAGAACCG 960 TAGGTCGTGTAATATACTGGTCGTCGTTGTGGTCGGGTTGCTGTGGTAGGTGGTTACGGTGAATGGTCGAAGTTCTTGGC
• " | • • _ | • • | » •
902 926 947
902
Sau3A I
Mbo I
Dpn I
Alw I
Hph I Mae III
SfaN I Tthlll II BstY I Rsa I Mse I
I I I I II I I
CCATCTGGCATCACCAGCCAAGCAGTTACAGATCCCACAGCAGAGTACAGCACAATCCACAGCCCATCAACACCCATTAA 1040 GGTAGACCGTAGTGGTCGGTTCGTCAATGTCTAGGGTGTCGTCTCATGTCGTGTTAGGTGTCGGGTAGTTGTGGGTAATT
I - I I - I I I • I • • • I •
68 979 990 1005 1037
971 985
991 991 991 991
Cla I
Sau3A I Mbo II Hinf I Taq I Mnl I Mae II Hae III AlwN I Mbo I Taq I Pml I Sau96 I Mnl I Hinf I Dpn I H a I BsaA I Nla IV Hph I
AG IATTCMAGATTCTGIAITICGATTGICGTICGIAGGTTCAGATGGGAAATCIAICGTGGACGIGIGIGCCGIAAGAAACAATAATCCTTiCACl 1120 TCTAAGTCTAAGACTAGCTAACGCAGCTCCAAGTCTACCCTTTAGTGCACCTGCCCCGGCTTCTTTGTTATTAGGAAGTG i i i . i l l • I I I • • I I • I I I I - I I
1042 1054 1062 1085 1094 1117
1046 1054 1065 1085 1095 1120
1048 1056 1067 1086 1096 1054 1100 1055
Sau3A I
Mbo I
Hinf I BsmA I
AlwN I Dpn I Fok I Bsr I
CTCCCCC .AG.ATTCTG.ATCTTGAGAGAGTGTTCATCTGGGACTTG IGATGIAGACAATCATTGTTTTCCACTCCTTGCTTA ICT 1200 GAGGGGGTCTAAGACTAGAACTCTCTCACAAGTAGACCCTGAACCTACTCTGTTAGTAACAAAAGGTGAGGAACGAATGA 1l12l7- 1l135 - • - 1I164I - ' ' 1I19*8
1129 1168
1135 1135
Sau3A I
Mbo I —
Dpn I
Alw I
Sau96 I Nla IV
Ava II BstY I
PpuM I BamH I
Nla IV Alw I Sty I
EcoO109 I Mnl I Eco57 I Sec I Mbo II
I I I I I I I I
GGGTCCTACGCCAACAGATATGGGAGGGATCCACCCACTTCAGTTTCCCTTGGACTGCGAATGGAAGAAATGATTTTCAA 1280 CCCAGGATGCGGTTGTCTATACCCTCCCTAGGTGGGTGAAGTCAAAGGGAACCTGACGCTTACCTTCTTTACTAAAAGTT
II • - I II - I • I • • I •
1201 1224 1238 1248 1264
1201 1227 1248
1201 1227
1202 1227
1202 1227
1228 1228 1228 1228
Mbo II Tthlll I Eco57 I
Mse I Dde I Mae III Mbo II I I I I I II
CTTGGCAGACACACATTTATTTTTTAATGACTTAGAAGAATGTGACCAAGTCCATATAGATGATGTTTCTTCΛGATGATA 13 60 GAACCGTCTGTGTGTAAATAAAAAATTACTGAATC'ITCraACACTGGTTCAGGTATATCTACTACAAAGAAG'I CTACTAT
, • I - I I • I I • I I -
1304 1311 1322 1348 1315 . 1324 1349
Dde I Fnu4H I
Sau96 I Bbv I
Ava II Pst I
PpuM I Fnu4H I
EcoO109 I Bbv I Bsr I Tthlll II
I I I I I I I I
ACGGACAGGACCTAAGCACATATAACTTTGGAACAGATGGCTTTCCTGCTGCAGCAACCAGTGCTAACTTATGTTTGf'A 1440 TGCCTGTCCTGGATTCGTGTATATTGAAACCTTGTCTACCGAAAGGACGACGTCGTTGGTCACGATTGAATACAAACCGT ll - l • • • ll-l I • - I
1367 1408 1418 1432
1367 1408
1368 1409
1368 1411
1372 1411
Rsa I Fok I Hae III
I Bsr I Hae I
ACTGGTGTACGGGGCGGTGTGGACTGGATGAGAAAGTTGGCCTTCCGCTACAGACGGGTAAAAGAGATTTACAACACCTA 1520 TGACCACATGCCCCGCCACACCTGACCTACTCTTTCAACCGGAAGGCGATGTCTGCCCATTTTCTCTAAATGTTGTGGAT
1 I441 I - - 1I463I - 1I4I7-8
1447 1466 1479
Pst I
Fnu4H I
Alu I Bbv I
Sau96 I ScrF I Hae III
Mnl I Ava II Mnl I EcoR II Ξau96 I
Mme I Tthlll II Dde I BstN I Mnl I
CAAAAATAATGTTGGAGGTCTGCTTGGTCCAGCTAAGAGGGAAGCCTGGCTGCAGTTGAGGGCCGAAATTGAAGCCCTGA 1600 GTTTTTATTACAACCTCCAGACGAACCAGGTCGATTCTCCCTTCGGACCGACGTCAACTCCCGGCTTTAACTTCGGGACT
1531 1541 1553 1565 1578
1535 1546 1557 1565 1580 1546 1565 1581
1551 1569 1569 1570
ScrF I Msp I
EcoR II Hpa II
BstN I Sau96 I ScrF I
Ple I Bsr I Nci I
Hinf I HinC II Hae III Bcn I Ssp I Mae III
CCG IACTCICTGGITTGACAICTGIGCCCTGAAAGCACTCTCGCTCATTCACTCICICGGACAAACTGTGTGAIATATTTTAGITAACA 1680 GGCTGAGGACCAACTGTGACCGGGACTTTCGTGAGAGCGAGTAAGTGAGGGCCTGTTTGACACACTTATAAAATCATTGT
1 I603I -1I611 I 1I620 • • 1I6I49 1I666• 1I675
1603 1617 1649
1607 1620 1649 —
1607 1650
1607 1650
Alu I Dde I Fok I Ssp I
ACTACTCAGCTCATCCCAGCATTGGCGAAAGTCCTGCTGTATGGGTTAGGAATTGTATTTCCAATAGAAAATATTTACAG 1760 TGATGAGTCGAGTAGGGTCGTAACCGCTTTCAGGACGACATACCCAATCCTTAACATAAAGGTTATCTTTTATAAATGTC
1685 1692 1750
1688
Alu I Mbo II
TGCAACTAAAATAGGAAAAGAAAGCTGTTTTGAGAGAATAATTCAAAGGTTTGGAAGAAAAGTGGTGTATGTTGTTATAG 1840 ACGTTGATTTTATCCTTTTCTTTCGACAAAACTCTCTTATTAAGTTTCCAAACCTTCTTTTCACCACATACAACAATATC
1783 1814
Sau3A I Mbo I Dpn I BstY I Nla III
Mnl I Mnl I
Mbo II SfaN I Gsu I Gsu I Sau96 I Sau96 I
Mbo II BstU I EcoN I Alw I Ava II Hae III
GAGATGGTGTAGAAGAAGAACAAGGAGCAAAAAAGCACGCGATGCCCTTCTGGAGGATCTCCAGCCACTCGGACCTCATG 1920 CTCTACCACATCTTCTTCTTGTTCCTCGTTTTTTCGTGCGCTACGGGAAGACCTCCTAGAGGTCGGTGAGCCTGGAGTAC
1852 1878 1886 1895
1855 1881 1890 1899 1911 1920
1893 1914
1895 1917
1896
1896
1896
Fnu4H I
HinP I Alu I Sec I
Sty I Hha I Pvu II ScrF I
Sec I Rsa I Hae II NspB II EcoR II
Nla III Gsu I Eco47 III HinP I BstN I
PflM I Bsr I Mae III Hha I Bbv I Mae III
GCCCTGCACCATGCCTTGGAACTGGAGTACCTGTAACAGCGCTCGGCACTTTGACAGCGCACAGCTGCTCTGTGACCAGG 2000 CGGGACGTGGTACGGAACCTTGACCTCATGGACATTGTCGCGAGCCGTGAAACTGTCGCGTGTCGACGAGACACTGGTCC
1929 1941 1953
1977 1984 1992 1930 1942 1958 1977 1996
1934 1947 1958 1982 1996 1934 1959 1982 1996
1959 1983 1996
1984
Hae III
Msp I
Hpa II
Sau3A I Nae I Hph I AlwN I
Mbo I BsmA I HinP I ScrF I HinP I
Dpn I Bsr I Hha I Gsu I EcoR II Hha I
Alw I Sau96 I Hae II Mnl I BstN I Fsp I
BstY I Hae III SfaN I Cfr10 I Dde I Sec I SfaN I
GACAGATCCAGCAGGCCCAGTCTCGCATCAGCGCCGGCCTCCAGAACTTAGCAATTTCCGCCCTGGTGATGCGCAGTTGC 2080 CTGTCTAGGTCGTCCGGGTCAGAGCGTAGTCGCGGCCGGAGGTCTTGAATCGTTAAAGGCGGGACCACTACGCGTCAACG
M - I I I I II II I II- I • -II I I II I
2004 2014 2025 2033 2047 2061 2068
2005 2014 2030 2038 2062 2070
2005 2017 2031 2039 2062 2071
2005 2020 2031 2062 2071
2005 2033 2065 2074 2034 2034 2036
Tth111 I HinC II Sca I Sau96 I Alu I Ple I Rsa I Ava II Pvu II Hinf I
Mnl I Hph I Mnl I Mae II NspB II Mae I
I I I II I II I I I II
TGTCAGTCTTGACCTCTGCCTTTGTGGTGAATGGAGGACCACGTCTATTTCATCAGAACAGCTGTTGACTCTAGTACTGT 2160 ACAGTCAGAACTGGAGACGGAAACACCACTTACCTCCTGGTGCAGATAAAGTAGTCTTGTCGACAACTGAGATCATGACA
• I • I • I II -I • II I I -I II
2093 2106 2114 2121 2139 2151
2116 2139 2147 2116 2140 2147 2154
2117 2144 2153
GAATCCAGTGAAAATAAGCCATGAGAATGTTTTAGCACAGCGTTATGTGTCTGCCACATTAACTACACGGTTCAAACCTG 2240 CTTAGGTCACTTTTATTCGGTACTCTTACAAAATCGTGTCGCAATACACAGACGGTGTAATTGATGTGCCAAGTTTGGAC
I I - I • • I-
2161 2180 2219
2165
Sau96 I Ava II PpuM I EcoO109 I Mbo II BspM I Eco57 I Alu I
I II I I I
TGAAGAAAGGACCTGCAAACGCTTCAGTTGTTAGCATTTTCAATGTGATATAAACAGCTTCTCCAATACAGCAAACCTAA 2320 ACTTCTTTCCTGGACGTTTGCGAAGTCAACAATCGTAAAAGTTACACTATATTTGTCGAAGAGGTTATGTCGTTTGGATT
I ll-l • I • . i .
2242 2251 2262 2296
2248 2248 2249
AACGTGTTGTCTCTGACTTTACACAAAGGACTTATGGTCACCTCCTTAAAAGAACATTTCTTCCAAATGAAAAACCACAG l - l
• • I
• I • • • I •
2329 2356 2398
2331 2362
Sec I ScrF I EcoR II
BstN I Mse I
Sec I Rsa I Dra I Alu I Mnl I
II I II I I
TCATACCCAGGGTAATCTGTACATCTCTACTTATTTATGAACAGACTTTTTTTAAAAAGATAAAAAAACAGCTTTATTGA 2480 AGTATGGGTCCCATTAGACATGTAGAGATGAATAAATACTTGTCTGAAAAAAATTTTTCTATTTTTTTGTCGAAATAACT
II - I- • • -II • I I-
2406 2419 2451 2470 2479
2407 2452
Stu I Mse I BsmA I Hae I
Hph I Dra I Mnl I Mse I Mnl I
GGTATAATTCACCCACCAGACTTTTTTAAACATCAAATAATTGAGGAGACAATAGCATTAGAAATAAGTGATTAAAGGCC 2560 CCATATTAAGTGGGTGGTCTGAAAAAATTTGTAGTTTATTAACTCCTCTGTTATCGTAATCTTTATTCACTAATTTCCGG
| . I I • - I I - • • I I I I -
2489 2505 2523 2552 2559
2506 2526 2556
2556 2557 Rsa I Mae II
Rsa I Mse I
Mnl I Nla III Sca I Mse I Dra I
I I I I I I I I I
TCTGCCTCACAACATGGCAAGTACAGTACTTTGAATTTTAGCACATTGCATAATAGTTTTAAGTATGTCTAATTTAAACG 2640 AGACGGAGTGTTGTACCGTTCATGTCATGAAACTTAAAATCGTGTAACGTATTATCAAAATTCATACAGATTAAATTTGC
I • I -I II • • • I- • II I •
2565 2573 2585 2619 2633
2581 2634
2586 2638
BsmA I Rsa I Rsa I Dde I Nla III
TATAATATGTACATCACTGAGACAATCATGTACAGAAAGAATTTTTGGTGTAAATTTGTAATAATGGATAATTCTTTTAC 2720 ATATTATACATGTAGTGACTCTGTTAGTACATGTCTTTCTTAAAAACCACATTTAAACATTATTACCTATTAAGAAAATG
I- I I- I I - - - - -
2649 2657 2667
2659 2670
HgiA I Bsp1286 I Mae II Fnu4H I ScrF I Bbv I ApaL I
EcoR II Mnl I Pml I
BstN I Nla III BsaA I Nla III
I I I I I I I I
ATATTGTTTAGGGAAATGATATTGAAAGGTAGCAATGCCTGGATAGTGAAGCATGAGGCAGCACGTGCACAAATTCATGT 2800 TATAACAAATCCCTTTACTATAACTTTCCATCGTTACGGACCTATCACTTCGTACTCCGTCGTGCACGTGTTTAAGTACA
2758 2772 2782 2796
2758 2775 2782
2758 2778 2785
2778
2783 2785 2785
Nla III
Dde I BstX I
I
GCCGTGCCTTATCTGAGTTTTCGGTATAAATATGTAGATAATGGATTTTTTTTTAGATAATGTTGTCAAGACCAAAAGCA 2880 CGGCACGGAATAGACTCAAAAGCCATATTTATACATCTATTACCTAAAAAAAAATCTATTACAACAGTTCTGGTTTTCGT
• • • • • • • j | ■
2813- 2872
2879 Hae III EcoO109 I Mnl I Dde I Fok I SfaN I Mbo II Sau96 I
TGGATGTCAAGTGTCAGTAAGGATTTTGTTTCTAAAATTTTTTCCTGCATCAGTTCTTCTGAGGGCCTTGATGAAATAAC 2960 ACCTACAGTTCACAGTCATTCCTAAAACAAAGATTTTAAAAAAGGACGTAGTCAAGAAGACTCCCGGAACTACTTTATTG
2882 2927 2935 2943
2939 2941 2942 2944
Alu I Sac I HgiA I Bsp1286 I Mse I Ban II Mnl I
ACAGCAGTTTCTTAAACAATTTGAAACAAAATGAGCTCTCCTACCACCTCACTTTTTCATTTCCACACTAATGTATTATA 3040 TGTCGTCAAAGAATTTGTTAAACTTTGTTTTACTCGAGAGGATGGTGGAGTGAAAAAGTAAAGGTGTGATTACATAATAT
• I • • II • I •
2972 2993 3007
2993 2993 2993 2994
Mse I Mae III Mbo II Ase I
TGTAACTACTTGGAAAAAATAATTATTCAAATGCTTCTTCCCACAAAGAATATAGATGATAGTAGATATATTTTATTAAT 3120 ACATTGATGAACCTTTTTTATTAATAAGTTTACGAAGAAGGGTGTTTCTTATATCTACTATCATCTATATAAAATAATTA
I • • • I • • ' * I I • 3042 3076 3115
3116 Dde I Hinf I HgiA I
Nla III Bsp1286 I Mse I
BspH I BsmA I Nla III Ase I Taq I
II I I I I I II I
AAAATGGTTCATGAATCGGAGACTAACAAAGTTTTCATGTGCTCAGAATTATTAATTATCGTGTCTGCATTTTCTTTCGA 3200 TTTTACCAAGTACTTAGCCTCTGATTGTTTCAAAAGTACACGAGTCTTAATAATTAATAGCACAGACGTAAAAGAAAGCT
II I I- - l l- l -II • • I •
3129 3139 3156 3171 3197
3130 3159 3172
3133 3159 3162
Msp I
Mbo II Hpa II
Bbv II SfaN I BspM II Mbo II
I I II I
TAAAGGAAGACACACGATGCTAATCCGGAAATCAGCAAACTTTGCATTACTCCCTATGTGCGTATTTTCTCTTTCTTCCT 3280 ATTTCCTTCTGTGTGCTACGATTAGGCCTTTAGTCGTTTGAAACGTAATGAGGGATACACGCATAAAAGAGAAAGAAGGA
3206 3216 3224 3274
3206 3225 3225
Nla III
Dde I Sty I
EcoN I Sec I
Bsu36 I Nco I
Hph I Mnl I Dsa I Mnl I Nla III Mae III Hph I Mae II BspM I Rsa I
GTCACCCTGAGGAAGGTTCATTGCCATTGTCATCACCATGGAAACAACGTTCCTCTCCACCTGCATTATGTACTACATGA 3360 CAGTGGGACTCCTTCCAAGTAACGGTAACAGTAGTGGTACCTTTGTTGCAAGGAGAGGTGGACGTAATACATGATGTACT
II II I- • - I II - I • I I- | l -
3281 3313 3327 3339 3350
3282 3289 3316 3332 3356
3286 3316 3286 3316 3287 3316
3317
SfaN I Hph I
I I
CAGGCATCAATCTGGGGAAATAATAAAATTATCACCTTTGTCAGACCATAAGAGTTTCTCCAAAAGTGGTCAGTTTGGCT 3440 GTCCGTAGTTAGACCCCTTTATTATTTTAATAGTGGAAACAGTCTGGTATTCTCAAAGAGGTTTTCACCAGTCAAACCGA
I • • • j • • • * •
3364 3392
Bbv II Fok I Mbo II Nla III Dde I Ple I Ssp I Tth111 II BspH I Bsu36 I Hinf I Mse I
I I II II I I II I
GGGCAATATTTTCTCTCATCTAACAAACACAATCCATTGTCATGAAATTACCCTTAGGATGAGTCTTCTTTAATCAATCA 3520 CCCGTTATAAAAGAGAGTAGATTGTTTGTGTTAGGTAACAGTACTTTAATGGGAATCCTACTCAGAAGAAATTAGTTAGT
I • • I • II • II I -I II I
3445 3464 3480 3492 3501 3510
3481 3493 3501
3497 3504 3503
Alu I Mbo II Fnu4H I
Alu I Ear I Eco57 I Bbv I
TATATTGGGCGGAAAAAACACCAGCTTTGACCCGAAGTAGTTGAAGAGCTACTTCATTCTTTTCTGAAGTTGTGTGTTGC 3600 ATATAACCCGCCTTTTTTGTGGTCGAAACTGGGCTTCATCAACTTCTCGATGAAGTAAGAAAAGACTTCAACACACAACG
3543 3563 3584 3599 3563 3599
3567
Mse I
Mae I Dra I Mbo II
TGCTAGAAATAGTCATTTGTGAATTATCCAAATTGTTTAAATTCACAATTGAATTAGTTTTTTCTTCCTTTTGGCTTGAA 3680
ACGATCTTTATCAGTAAACACTTAATAGGTTTAACAAATTTAAGTGTTAACTTAATCAAAAAAGAAGGAAAACCGAACTT
3603 3636 3663 3637
HinC II Pst I Hae III Tth111 II Mse I Rsa I Hae I
GCAAACAGTTGACCATTTTTAACCTTTTCATTTTATGTTTTTGTACTCTGCAGACTGAAAAGACAAAGTTTATCTTGGCC 3760 CGTTTGTCAACTGGTAAAAATTGGAAAAGTAAAATACAAAAACATGAGACGTCTGACTTTTCTGTTTCAAATAGAACCGG
3682 3699 3723 3756
3688 3728 3757
Mse I Ase I Mse I —
Rsa I Mbo II Mse I
TTACTGTATAAAGGTATGCTGTGTCCACCGTTGTGTACAGAATTTTTCTTCATTAATTTTGTGTTTAAGTTAATAAAATT 3840 AATGACATATTTCCATACGACACAGGTGGCAACACATGTCTTAAAAAGAAGTAATTAAAACACAAATTCAATTATTTTAA
3795 3807 3825
3812 3830 3813
Rsa I
I TATTTGTGATGTACTGTAA 3859 ATAAACACTACATGACATT
• I
3851
Restriction Endonucleases site usage
Aat II - BstN I 10 HinC II 3 Ple I 4
Acc I - BstU I 1 HinD III 1 Pml I 2
Afl II - BstX I 1 Hinf I 10 PpuM I 3
Afl III - BstY I 5 HinP I 4 Pst I 3
Aha II - Bsu36 I 2 Hpa I - Pvu I -
Alu I 18 Cfr10 I 1 Hpa II 3 Pvu II 2
Alw I 6 Cla I 1 Hph I 11 Rsa I 17
AlwN I 4 Dde I 16 Kpn I - Rsr II -
Apa I - Dpn I 8 Mae I 2 Sac I 1
ApaL I 1 Dra I 5 Mae II 6 Sac II -
Ase I 3 Dra III - Mae III 11 Sal I -
Asp718 - Drd I - Mbo I 8 Sau3A I 8
Ava I - Dsa I 1 Mbo II 17 Sau96 I 13
Ava II 6 Eae I 1 Mlu I - Sca I 2
Avr II - Eag I - Mme I 2 ScrF I 11
BamH I 1 Ear I 1 Mnl I 27 Sec I 8
Ban I 2 Eco47 III 1 Msc I 1 SfaN I 9
Ban II 1 Eco57 I 7 Mse I 22 Sfi I -
Bbe I - EcoN I 2 Msp I 3 Sma I -
Bbv I 9 EcoO109 I 4 Nae I 1 SnaB I -
Bbv II 2 EcoR I - Nar I - Spe I -
Bcl I 1 EcoR II 10 Nci I 1 Sph I -
Bcn I 1 EcoR V - Nco I 1 Spl I 1
Bgl I - Esp I - Nde I 2 Ssp I 3
Bgl II - Fnu4H I 9 Nhe I - Stu I 1
BsaA I 3 Fok I 6 Nla III 15 Sty I 3
Bsm I 1 Fsp I 1 Nla IV 6 Taq I 3
BsmA I 8 Gdi II - Not I - Tth111 I 2
Bsp1286 I 4 Gsu I 4 Nru I - Tth111 II 7
BspH I 2 Hae I 4 Nsi I - Xba I -
BspM I 3 Hae II 2 Nsp7524 I - Xca I -
BspM II 1 Hae III 12 NspB II 2 Xho I -
Bsr I 13 Hga I 1 NspH I - Xcm I -
BssH II - HgiA I 4 PaeR7 I - Xma I -
BstB I - HgiE II - PflM I 1 Xmn I 1
Enzyme Site Use Site position (Fragment length) Fragment order
ApaL I g/tgcac 1 1( 2784) 1 2785( 1075) 2
BamH I g/gatcc 1 1( 1226) 2 1227 ( 2633) 1
Ban II grgcy/c 1 1( 2992) 1 2993( 867) 2
Bcl I t/gatca 1 1( 114) 2 115( 3745) 1
Bcn I ccs/gg 1 1( 1648) 2 1649( 2211) 1
Bsm I gaatgc 1/-1 1 1( 526) 2 527( 3333) 1
BspM II t/ccgga 1 1( 3223) 1 3224( 636) 2
BstU I cg/cg 1 1( 1877) 2 1878( 1982) 1
BstX I ccannnnn/ntgg 1 1( 2871) 1 2872( 988) 2
Cfr10 I r/ccggy 1 1( 2032) 1 2033( 1827) 2
Cla I at/cgat 1 1( 1054) 2 1055( 2805) 1
Dsa I c/crygg 1 1( 3315) 1 3316( 544) 2
Eae I y/ggccr 1 1( 839) 2 840( 3020) 1
Ear I ctcttc 1/4 1 1( 3562) 1 3563( 297) 2
Eco47 III agc/gct 1 1( 1957) 1 1958( 1902) 2
Fsp I tgc/gca 1 1( 2069) 1 2070( 1790) 2
Hga I gacgc 5/10 1 1( 1061) 2 1062( 2798) 1
HinD III a/agctt 1 1( 680) 2 681( 3179) 1
Msc I tgg/cca 1 1( 839) 2 840( 3020) 1
Nae I gcc/ggc 1 1( 2032) 1 2033( 1827) 2
Nci I cc/sgg 1 1( 1648) 2 1649( 2211) 1
Nco I c/catgg 1 1( 3315) 1 3316( 544) 2
PflM I ccannnn/ntgg 1 1( 1928) 2 1929( 1931) 1
Sac I gagct/c 1 1( 2992) 1 2993( 867) 2
Spl I c/gtacg 1 1( 574) 2 575( 3285) 1
Stu I agg/cct 1 1( 2555) 1 2556( 1304) 2
Xmn I gaann/nnttc 1 1( 769) 2 770( 3090) 1
Ban I g/gyrcc 2 1( 291) 3 292 ( 416) 2 708( 3152) 1
Bbv II gaagac 2/6 2 1( 3205) 1 3206( 297) 3 3503 ( 357) 2
BspH I t/catga 2 1( 3128) 1 3129( 351) 3 3480( 380) 2
Bsu36 I cc/tnagg 2 1( 3285) 1 3286 ( 206) 3 3492 ( 368) 2
EcoN I cctnn/nnnagg 2 1( 1885) 1 1886( 1400) 2 3286( 574) 3
Hae II rgcgc/y 2 1( 1957) 1 1958( 72) 3 2030( 1830) 2
Mae I c/tag 2 1( 2150) 1 2151 ( 1452) 2 3603 ( 257) 3
Mme I tccrac 20/18 2 1( 444) 3 445( 1086) 2 1531( 2329) 1
Nde I ca/tatg 2 1( 499) 2 500( 90) 3 590( 3270) 1
NspB II cmg/ckg 2 1( 1981) 1 1982 ( 157) 3 2139( 1721) 2
Pml I cac/gtg 2 1( 1084) 2 1085( 1697) 1 2782 ( 1078) 3
Pvu II cag/ctg 2 1( 1981) 1 1982 ( 157) 3 2139( 1721) 2
Sca I agt/act 2 1( 2152) 1 2153( 432) 3 2585( 1275) 2
Tth111 I gacn/nngtc 2 1( 1323) 2 1324( 793) 3 2117( 1743) 1
Ase I at/taat 3 1( 3114) 1 3115( 56) 3 3171( 641) 2 3812 ( 48) 4
BsaA I yac/gtr 3 1( 551) 3 552( 533) 4 1085( 1697) 1 2782( 1078) 2
BspM I acctgc 4/8 3 1( 169) 4 170( 2081) 1 2251( 1088) 2 3339( 521) 3
HinC II gty/rac 3 1( 1610) 1 1611( 533) 3 2144( 1544) 2 3688( 172) 4
Hpa II c/cgg 3 1( 1649) 1 1650( 384) 4 2034( 1191) 2 3225( 635) 3
Msp I c/cgg 3 1( 1649) 1 1650( 384) 4 2034( 1191) 2 3225( 635) 3
PpuM I rg/gwccy 3 1( 1200) 2 1201( 166) 4 1367( 881) 3 2248( 1612) 1
Pst I ctgca/g 3 1( 1408) 2 1409( 161) 3 1570( 2158) 1 3728 ( 132) 4
Ssp I aat/att 3 1( 1665) 2 1666 ( 84) 4 1750( 1695) 1 3445 ( 415) 3
Sty I c/cwwgg 3 1( 1247) 2 1248 ( 686) 3 1934 ( 1382) 1 3316( 544) 4
Taq I t/cga 3 1( 1055) 2 1056( 9) 4 1065( 2132) 1 3197( 663) 3
AlwN I cagnnn/ctg 4 1( 1045) 2 1046( 81) 5 1127 ( 947) 3 2074 ( 255) 4
2329( 1531) 1
Bsp1286 I gdgch/c 4 1( 49) 5 50( 2735) 1 2785( 208) 3 2993( 166) 4 3159( 701) 2
EcoO109 I rg/gnccy 4 1( 1200) 1 1201( 166) 5 1367( 881) 3 2248( 694) 4 2942( 918) 2
Gsu I ctggag 16/14 4 1( 1889) 1 1890( 9) 5 1899( 43) 4 1942( 97) 3 2039( 1821) 2
Hae I wgg/ccw 4 1( 839) 3 840( 638) 4 1478( 1078) 2 2556( 1200) 1 3756( 104) 5
HgiA I gwgcw/c 4 1( 49) 5 50( 2735) 1 2785( 208) 3 2993( 166) 4 3159( 701) 2
Hha I gcg/c 4 1( 1958) 1 1959( 18) 5 1977( 54) 3 2031( 40) 4 2071( 1789) 2
HinP I g/cgc 4 1( 1958) 1 1959( 18) 5 1977( 54) 3 2031( 40) 4 2071( 1789) 2
Ple I gagtc 4/5 4 1( 281) 5 282( 1321) 2 1603( 544) 3 2147( 1354) 1 3501( 359) 4
BstY I r/gatcy 5 1( 188) 5 189( 801) 2 990( 237) 4 1227 ( 668) 3 1895( 109) 6 2004( 1856) 1
Dra I ttt/aaa 5 1( 356) 3 357( 2094) 1 2451( 54) 6 2505( 128) 5
2633( 1003) 2 3636( 224) 4
Alw I ggatc 4/5 6 1( 188) 5 189( 802) 2 991( 236) 4 1227( 1) 7
1228( 667) 3 1895( 110) 6 2005( 1855) 1
Ava II g/gwcc 6 1( 1201) 2 1202( 166) 6 1368( 178) 5 1546( 365) 3
1911( 205) 4 2116( 133) 7 2249( 1611) 1
Fok I ggatg 9/13 6 1( 925) 2 926( 238) 6 1164( 302) 5 1466( 226) 7
1692( 1190) 1 2882( 615) 3 3497( 363) 4
Mae II a/cgt 6 1( 552) 2 553( 533) 4 1086( 1035) 1 2121( 517) 6
2638( 145) 7 2783( 544) 3 3327( 533) 5
Nla IV ggn/ncc 6 1( 240) 4 241( 51) 6 292( 416) 2 708( 386) 3
1094( 107) 5 1201( 26) 7 1227( 2633) 1
Eco57 I ctgaag 16/14 7 1( 75) 7 76( 544) 4 620( 64) 8 684( 554) 3
1238( 111) 6 1349( 913) 2 2262( 1322) 1 3584( 276) 5
Tth111 II caarca 11/9 7 1( 126) 7 127( 381) 4 508( 471) 2 979( 453) 3
1432( 109) 8 1541( 1923) 1 3464( 218) 5 3682( 178) 6
BsmA I gtctc 1/5 8 1( 638) 3 639( 529) 4 1168( 852) 1 2020( 311) 6
2331( 67) 9 2398( 128) 8 2526( 133) 7 2659( 480) 5
3139( 721) 2
Dpn I ga/tc 8 1( 115) 4 116( 74) 8 190( 801) 2 991( 63) 9
1054( 81) 7 1135( 93) 6 1228( 668) 3 1896( 109) 5
2005( 1855) 1
Mbo I /gatc 8 1( 115) 4 116( 74) 8 190( 801) 2 991( 63) 9
1054( 81) 7 1135( 93) 6 1228( 668) 3 1896( 109) 5
2005( 1855) 1
Sau3A I /gatc 8 1( 115) 4 116( 74) 8 190( 801) 2 991( 63) 9
1054( 81) 7 1135( 93) 6 1228( 668) 3 1896( 109) 5
2005( 1855) 1
Sec I c/cnngg 8 1( 842) 2 843( 405) 5 1248( 686) 3 1934( 62) 8
1996( 65) 7 2061( 345) 6 2406( 1) 9 2407( 909) 1
3316( 544) 4
Bbv I gcagc 8/12 9 1( 319) 6 320( 176) 8 496( 406) 5 902( 506) 3
1408( 3)10 1411( 158) 9 1569( 415) 4 1984( 794) 2
2778( 821) 1 3599( 261) 7
Fnu4H I gc/ngc 9 1( 319) 6 320( 176) 8 496( 406) 5 902( 506) 3
1408( 3)10 1411( 158) 9 1569( 415) 4 1984( 794) 2
2778( 821) 1 3599( 261) 7
SfaN I gcatc 5/9 9 1( 610) 3 611( 115) 9 726( 242) 6 968( 913) 1
1881( 144) 8 2025( 43)10 2068( 859) 2 2927( 289) 5
3216( 148) 7 3364( 496) 4
BstN I cc/wgg 10 1( 564) 3 565( 84) 8 649( 51)10 700( 143) 7
843( 722) 2 1565( 42)11 1607( 389) 4 1996( 66) 9
2062( 345) 6 2407( 351) 5 2758( 1102) 1
EcoR II /ccwgg 10 1( 564) 3 565( 84) 8 649( 51)10 700( 143) 7
843( 722) 2 1565( 42)11 1607( 389) 4 1996( 66) 9
2062( 345) 6 2407( 351) 5 2758( 1102) 1
Hinf I g/antc 10 1( 228) 7 229( 53) 9 282( 760) 2 1042( 6)11
1048( 81) 8 1129( 474) 4 1603( 544) 3 2147( 14)10
2161( 972) 1 3133( 368) 5 3501( 359) 6
Hph I ggtga 8/7 11 1( 413) 4 414( 232) 6 646( 228) 7 874( 97) 9
971( 146) 8 1117( 948) 1 2065( 41)11 2106( 383) 5
2489( 793) 2 3282( 31)12 3313( 79)10 3392( 468) 3
Mae III /gtnac 11 1( 85)10 86( 168) 9 254( 379) 3 633( 12)12
645( 340) 5 985( 337) 6 1322( 353) 4 1675( 278) 7
1953( 39)11 1992( 1050) 1 3042( 239) 8 3281( 579) 2
ScrF I cc/ngg 11 1( 564) 3 565( 84) 8 649( 51)10 700( 143) 7
843( 722) 2 1565( 42)11 1607( 42)12 1649( 347) 5
1996( 66) 9 2062( 345) 6 2407( 351) 4 2758( 1102) 1
Hae III gg/cc 12 1( 240) 8 241( 600) 2 841( 255) 7 1096( 383) 5
1479( 102)10 1581( 39)12 1620( 300) 6 1920( 94)11
2014( 22)13 2036( 521) 3 2557( 387) 4 2944( 813) 1
3757( 103) 9
Bsr I actgg 1/-1 13 1( 139)10 140( 96)11 236( 181) 7 417( 416) 2
833( 365) 3 1198( 220) 5 1418( 23)13 1441( 22)14
1463( 154) 8 1617( 324) 4 1941( 76)12 2017( 148) 9
2165( 191) 6 2356( 1504) 1
Sau96 I g/gncc 13 1( 240) 5 241( 854) 2 1095( 107) 9 1202( 166) 7
1368( 178) 6 1546( 34)13 1580( 40)12 1620( 291) 4
1911( 9)14 1920( 94)11 2014( 102)10 2116( 133) 8
2249( 694) 3 2943( 917) 1
Nla III catg/ 15 1( 46)12 47( 232) 6 279( 1638) 1 1917( 13)16
1930( 250) 5 2180( 393) 2 2573( 94)10 2667( 105) 9
2772( 24)15 2796( 83)11 2879( 251) 4 3130( 26)14
3156( 161) 7 3317( 39)13 3356( 125) 8 3481( 379) 3
Dde I c/tnag 16 1( 40)16 41( 173) 9 214( 155)11 369( 272) 5
641( 26)17 667( 644) 1 1311( 61)15 1372( 181) 8
1553( 132)12 1685( 362) 4 2047( 610) 2 2657( 156)10
2813( 126)13 2939( 223) 6 3162( 125)14 3287( 206) 7
3493( 367) 3
Mbo II gaaga 8/7 17 1( 94)11 95( 1005) 1 1100( 164) 6 1264( 51)15
1315( 33)17 1348( 466) 3 1814( 38)16 1852( 3)18
1855( 387) 4 2242( 693) 2 2935( 141) 8 3076( 130) 9
3206( 68)12 3274( 230) 5 3504( 59)13 3563( 100)10
3663( 144) 7 3807( 53)14
Rsa I gt/ac 17 1( 554) 2 555( 21)15 576( 113)11 689( 163) 8
852( 153)10 1005( 442) 4 1447( 500) 3 1947( 207) 7
2154( 265) 6 2419( 162) 9 2581( 5)18 2586( 63)13
2649( 21)16 2670( 680) 1 3350( 373) 5 3723( 72)12
3795( 56)14 3851( 9)17
Alu I ag/ct 18 1( 67)16 68( 265) 5 333( 204) 6 537( 133)13
670( 12)19 682( 36)17 718( 153)11 871( 76)15
947( 604) 1 1551( 137)12 1688( 95)14 1783( 200) 7
1983( 157) 9 2140( 156)10 2296( 174) 8 2470( 524) 3
2994( 549) 2 3543( 24)18 3567( 293) 4
Mse I t/taa 22 1( 19)20 20( 43)18 63( 239) 6 302( 56)14
358( 442) 2 800( 237) 7 1037( 267) 5 1304( 915) 1
2219( 233) 8 2452( 54)16 2506( 46)17 2552( 67)12
2619( 15)21 2634( 338) 3 2972( 144) 9 3116( 56)15
3172( 338) 4 3510( 127)10 3637( 62)13 3699( 114)11
3813( 12)22 3825( 5)23 3830( 30)19
Mnl I cctc 7/7 27 1( 25)23 26( 135)10 161( 321) 2 482( 105)14
587( 107)13 694( 97)16 791( 276) 6 1067( 53)19
1120( 104)15 1224( 311) 4 1535( 22)24 1557( 21)25
1578( 315) 3 1893( 21)26 1914( 124)11 2038( 55)18
2093( 21)27 2114( 248) 7 2362( 117)12 2479( 44)20
2523( 36)22 2559( 6)28 2565( 210) 8 2775( 166) 9
2941( 66)17 3007( 282) 5 3289( 43)21 3332( 528) 1
478 sites found
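The "Site position (Fragment length) Fragment order" columns of the table above can be recomputed from the site positions alone. The following is a minimal sketch, assuming (as the table entries suggest) that every listed position is the first base of a fragment, that position 1 always opens the first fragment, and that order 1 denotes the longest fragment, with ties ranked left to right. The function name `fragments` is hypothetical and is not part of the original analysis software.

```python
def fragments(site_positions, seq_length):
    """Return (start, length, order) for each fragment of a linear sequence.

    site_positions: 1-based positions at which a new fragment starts.
    seq_length: total length of the sequence in base pairs.
    """
    # Position 1 always opens the first fragment; duplicates are merged.
    starts = sorted({1, *site_positions})
    # Each fragment ends where the next one starts (or at the sequence end).
    ends = starts[1:] + [seq_length + 1]
    lengths = [end - start for start, end in zip(starts, ends)]
    # Order 1 is the longest fragment; the stable sort keeps ties left to right.
    by_length = sorted(range(len(lengths)), key=lambda i: -lengths[i])
    order = {index: rank + 1 for rank, index in enumerate(by_length)}
    return [(starts[i], lengths[i], order[i]) for i in range(len(starts))]


# ApaL I row of the 3859 b.p. sequence: one site at position 2785.
print(fragments([2785], 3859))
# Ase I row: sites at positions 3115, 3171 and 3812.
print(fragments([3115, 3171, 3812], 3859))
```

Because the fragment lengths of each row must add up to the sequence length, the same arithmetic also serves as a consistency check on individual table entries.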
No Sites found for the following Restriction Endonucleases
Aat II gacgt/c Drd I gacnnnn/nngtc NspH I rcatg/y
Acc I gt/mkac Eag I c/ggccg PaeR7 I c/tcgag
Afl II c/ttaag EcoR I g/aattc Pvu I cgat/cg
Afl III a/crygt EcoR V gat/atc Rsr II cg/gwccg
Aha II gr/cgyc Esp I gc/tnagc Sac II ccgc/gg
Apa I gggcc/c Gdi II yggccg -5/-1 Sal I g/tcgac
Asp718 g/gtacc HgiE II accnnnnnnggt Sfi I ggccnnnn/nggcc
Ava I c/ycgrg Hpa I gtt/aac Sma I ccc/ggg
Avr II c/ctagg Kpn I ggtac/c SnaB I tac/gta
Bbe I ggcgc/c Mlu I a/cgcgt Spe I a/ctagt
Bgl I gccnnnn/nggc Nar I gg/cgcc Sph I gcatg/c
Bgl II a/gatct Nhe I g/ctagc Xba I t/ctaga
BssH II g/cgcgc Not I gc/ggccgc Xca I gta/tac
BstB I tt/cgaa Nru I tcg/cga Xho I c/tcgag
BstE II g/gtnacc Nsi I atgca/t Xcm I ccannnnn/nnnntgg
Dra III cacnnn/gtg Nsp7524 I r/catgy Xma I c/ccggg
ANNEX 6
DNA sequence 1649 b.p. GGAAATGGTAGA ... TCCACCTGCCAT linear
Positions of Restriction Endonucleases sites (unique sites underlined)
Sau3A I Mbo I
Dpn I Mae I Mse I
Spe I Hph I Mnl I Eco57 I Hga I
II I I I I I I
GGAAATGGTAC-AACTAGT^TCTCΑCCC-AGCCT 80
CCI ΓACCATCTΓGATCACTAGAGT^^
• II I- I -I • • - I I- I
13 23 31 62 80 14 69
19 19 19
Sau3A I
Mbo I
Dde Dpn I
Dde I Sau96 I Mbo II BstY I
Fnu4H I Ple I Hph I Nla IV Ear I Bgl II Bbv I Hinf I Mae III SfaN I Hae III Alu I Gsu I
ACG 1CTGCTGTGTGG 1ACICT1X_AG1TGACA^ 1 1 1 1 I II I II 160
TGCGACGACACACCTGAGACTCACTGTCIX^^
I - l l - l - I I - I • I • I II I II-
83 94 102 114 127 147 154
83 94 117 127 149 158
98 127 150 158
134 159
159 159 Sau3A I Fnu4H I Mbo I
Nla III Dpn I
Sty I Alw I
Mae II Sec I Hga I Nla IV
Afl III Nco I Hae III BstY I
Pml I Mnl I Dsa I Gdi II BamH I
BsaA I BstX I Bbv I Eae I Rsa I Alw I
II I I II I II I I II
TCTTGCCCAOSTGTCCrrcCCCCK 240
AGAACGGGTO-ACAGGAGGGGGCGGTCGGAAGG^
II- I - I II- -I -II I • I • II
168 192 203 211 223 234
168 175 198 211 234
169 198 212 234
169 198 217 234
198 235
199 235
203 235 235
Alu I
Gsu I Xmn I Mbo II Alu I Mnl I Hinf I SfaN I Bbv II
I II I I I I
GCAOCTACCCCCATTACAGCTTACC^ 20
CGTCCGATGGGGGTAATGTCGAATt-OSTGGAGGTC^
I- II • I • - l - l l
259 269 288 303 314
270 288 314
320
Hae III Mse I Hae I ScrF I EcoR II BstN I Dde I Alu I Sec I Mnl I Nla IV Eco57 I Hph I Bsp1286 I BstX I Eae I Hinf I Alu I Gsu I Mnl I Bsr I Ban II
GCTTGAACCATTCCCCΌGCCAGAGTCGAT^ 400
CGAACTTGGTAAGGGGACCGGTCTCACCTAAGGAGTCGA^
328 337 348 356 364 378 386 394
334 352 362 370 380 394
335 353 368
335
335
337
337
338
Msp I
Hpa II
Cfr10 I HgiA I
Fnu4H I Bsp1286 I
SfaN I Mnl I Bsr I Bbv I ApaL I
TACACCTACCAGATGCACGGCAC-AA(AGGGTTC^^ 480
ATGTGGATGGTCTACGTGCCGTCTTGTCCCAAGATA^
412 441 451 461 479
461 479
464 479
465
465
Msp I
Hpa II
ScrF I
ScrF I Nci I
EcoR II Bcn I Rsa I
BstN I Sec I Bsr I Bsr Fok I Mnl I
GCACCAGGACΓATCCTTCCTACCCCGGCTTCCCCCAGAGCCAGTACCCCCAGTATO 560 CGTGGTCCΓGATAGGAAGGATGGGGCCGAAGGGGGTCTCGGTC^
484 502 520 529 543 554
484 503 523
484 503
503
" "* 504
504
Hae III
Gdi II
Eae I ScrF I
Msp I EcoR II
Hpa II BstN I
ScrF I SfaN I Gsu I
Nci I AlwN I Mnl I
Bcn I Fnu4H I EcoN I SfaN I
Mae II Bbv I Mnl I Mae II Mae II Mnl I Mae II
ACGTCCCGGCCAGCΛGCATCTGCCCTTCGCCCCTC CCACG^ 640
TGC-AGGGCCGGTCGTCGTAGACGGGAAGCGGGGAGAGGTGCAGGT^
561 573 592 599 609 620 633
565 573 613 623
565 574 613
565 576 614
566 616
566 616
567 616
567
568
Mae III Ple I Sau96 I BsmA I
Hinf I Hph I Ava II Mnl I
AACCAGAGT-TCCG IAGITCACTTGCTOI STGAATACAAC I I I 720
TTGGTCTCAAGGCIK^GTGAACGACCACITO
• l l - l - • I • • I - I
653 665 686 707
653 686 711
655
Mnl I
ScrF I
Nci I Msp I Msp I
Msp I Hpa II Hpa II
Hpa II CfrlO I ScrF I
Bcn I Mnl I Nci I
Fnu4H I Hae III Sec I Dde I Bcn I
Hae III Sau96 I Alu I Hae III Mnl I Mae III Sec I
CAGGCCG<^CCGG<_CCTCCGATGGGAAGCTCCGAGGCCGGTCTAAGAGGAG 800
GTCCG XX?rcGCCCGGAGGCTACCCrrcCGAGGCTC
I I I I I I • I -I I I I I • I I • I - I I I
723 732 747 755 766 774 783
724 733 751 762 784
730 753 784
730 756 784
730 757 785
730 757 785
730
735
BsmA I Fok I
Fok I SfaN I 1 1 I I AGATTGAGCGTGTGTTCGTGTGGGACTTGGATGA^^ 880
TCTAACTCGCACA(^ΛG aCΑCCCI\3AA^ j . I . . . I I
829 875
833 876
Mbo II
Mae II HinP I Sau3A I Aha II Hha I Hae III Mbo II Mbo I Aat II Fsp I Hae I Ear I Dpn I
AGATACGGGAAGGACACCAOSIAICGTCCGTIGICGCATTI^I I I I 960
TCTATG^CCTTCCΓGTX-GTGCΓGCAGG^^
• 9I0I1 9 M09 9M16 - 9 I2-9 9 I37 I-
901 910 917 929 937
902 910 937
939
Sau3A I
Mnl I Mbo I
Gsu I Dpn I BsmA I Hae III
ScrF I Alw I Mae II Mse I
AGACAAGAAσiτACIGGACCTCCTAACACTGGTCTAGσiX_CAA
9 166 * 9 9 H7766 I 9 9 l8877- l l 9 9 I99'99 1 1 I00I0077 I • 1 I0I2;7 1 I037 •
976 993 1007 1027
976 994 1008 1027
977 994 1010 1028
980 994 994
Nla IV
ScrF I
EcoR II Bsp1286 I
BstN I Ban II
Sec I Sec I Fnu4H I
Sau96 I ScrF I HgiA I
Nla IV EcoR II Bsp1286 I
NspB II Hae III Xcm I BstN I ApaL I
G ^(^TACAACITCTC ICGCr3ACGGCITCCACAGTrrC I II I I I I I I 1120
CGTGTATGTTGAAG-GGCGACTGCCGAAGG G C-AAGCCGGGGTC
1056 1078 1089 1099 1112
1078 1099 1112
1078 1099 1112
1081 1099 1119
1082 1102
1082 1102
1082
1085
Hph I
Hae III Msp I
Fok I Alu I Hpa II
Bsr I Mnl I Hae I Cfr10 I Rsa I Mae II
Q3CσiX-<aCTGGATGAG<-AAGClX-GCC K:CGrrACCGGCG 1200
CαSCACCTGACCTACTCCTTCGACOXSAAG^
I -I I I II - ll - l - I • - I
1128 1135 1143 1155 11 1194
1131 1140 1156
1144 1156
1162
ScrF I Alu I
EcoR II Dde I
Hae III
Msp I
Hpa II
* "* ScrF I Hph I Nci I Mae III
Eco57 I Mnl I Bcn I BstE II
I I II I III
CCCACTCCCTGAAGGCACTAAAO_TC^TC^ 1360
GGGTGAGGGACTTCOπ'GATTTGGAGTAGTTGAGGGCα^^
I- • I • II I • • III
1289 1303 1314 1339
1314 1340
1314 1341
1315 1315 1317 1317
Hae III Mse I Hae I Sec I Eae I ScrF I ScrF I EcoR II EcoR II BstN I BstN I Hae III Sec I Hae I
CCTGCCCTGGCCAAAGTCCTGCTATATXSGCCTGGGGTCTGTGT^ 1440
GGACGGGACCGGTTTCAGGACGATATACCGGACCCV^GACACAAAGGATAACTCTTG^
1365 1387 1366 1388 1366 1390 1366 1390 1368 1390 1368 1390 1368 1369
Sau3A I
Mbo I
Fnu4H I Mae II Mnl I Bbv I Mnl I Acc I Hph I Hph I Mbo II Alu I Taq I Hinf I Alu I BsaA I Dpn I Ear I
GAAGGAGAGCrarrTCGAGAGGATAATGCAGAGAT^ 1520
CΓTCCTCTCGAOSAAGCΓCΓCCTATTACGTCT^
1448 1455 1473 1486 1493 1501 1517 1449 1459 1490 1498 1505 1517
1449 1494 1520
1501
1501
Mnl I
Bgl I Gsu I ScrF I Nla III ScrF I EcoR II NspH I EcoR II Mnl I BstN I Nsp7524 I EcoR V BstN I Dde I Sec I
AGGAGCAAGGAGCGAAAAAGCACAACATGCCTTTCTGGCG^ 1600 TCCTCGTTCCTO-CTTTTTCGTGTTGT^^
1545 1561 1579 1589 1599 1545 1579 1591 1600 1546 1579 1600 1549 1580 1600
1583
Fnu4H I Bbv I Sau3A I Mbo I
Gsu I Dpn I Bsr I Alw I SfaN I BspM I
CTGGAACTG<AGTATTTATAGC-AGGATCAGCAGCA^ 1649
GACCΓTGACCTCATAAATATCGTCCTAGTCGTCGTAGAGGTGGACGGTA
1606 1624 1633 1641 1607 1625 1625 1625
1630 1630
Restriction Endonucleases site usage
Aat II 2 BstN I 11 HinC II - Ple I
Acc I 1 BstU I - HinD III - Pml I
Afl II - BstX I 2 Hinf I 5 PpuM I
Afl III 1 BstY I 3 HinP I 1 Pst I
Aha II 2 Bsu36 I - Hpa I - Pvu I
Alu I 13 Cfr10 I 3 Hpa II 8 Pvu II
Alw I 4 Cla I - Hph I 8 Rsa I
AlwN I 1 Dde I 6 Kpn I - Rsr II
Apa I - Dpn I 7 Mae I 1 Sac I
ApaL I Dra I Mae II 10 Sac II
Ase I Dra III Mae III 5 Sal I
Asp718 Drd I Mbo I 7 Sau3A I 7
Ava I Dsa I Mbo II 6 Sau96 I 5
Ava II 1 Eae I Mlu I Sca I
Avr II Eag I Mme I ScrF I 16
BamH I 1 Ear I Mnl I 22 Sec I 10
Ban I Eco47 III Msc I 3 SfaN I 7
Ban II 2 Eco57 I Mse I 2 Sfi I
Bbe I EcoN I Msp I 8 Sma I
Bbv I 6 EcoO109 I Nae I SnaB I
Bbv II 1 EcoR I Nar I Spe I 1
Bcl I EcoR II 11 Nci I 5 Sph I
Bcn I 5 EcoR V 1 Nco I 1 Spl I
Bgl I 1 Esp I 1 Nde I Ssp I
Bgl II 1 Fnu4H I 8 Nhe I Stu I
BsaA I 2 Fok I 4 Nla III 2 Sty I 1
Bsm I Fsp I 1 Nla IV 5 Taq I 1
BsmA I 4 Gdi II 2 Not I Tth111 I
Bsp1286 I 4 Gsu I 7 Nru I Tth111 II
BspH I Hae I 6 Nsi I Xba I
BspM I 1 Hae II Nsp7524 I 1 Xca I
BspM II Hae III 14 NspB II 1 Xho I
Bsr I 6 Hga I 2 NspH I 1 Xcm I 1
BssH II HgiA I 2 PaeR7 I Xma I
BstB I HgiE II PflM I Xmn I 1
BstE II Hha I
Enzyme Site Use Site position (Fragment length) Fragment order
Acc I gt/mkac 1 1( 1489) 1 1490( 160) 2
Afl III a/crygt 1 1( 168) 2 169( 1481) 1
AlwN I cagnnn/ctg 1 1( 573) 2 574( 1076) 1
Ava II g/gwcc 1 1( 685) 2 686( 964) 1
BamH I g/gatcc 1 1( 233) 2 234( 1416) 1
Bbv II gaagac 2/6 1 1( 313) 2 314( 1336) 1
Bgl I gccnnnn/nggc 1 1( 1548) 1 1549( 101) 2
Bgl II a/gatct 1 1( 157) 2 158( 1492) 1
BspM I acctgc 4/8 1 1( 1640) 1 1641( 9) 2
BstE II g/gtnacc 1 1( 1338) 1 1339( 311) 2
Dsa I c/crygg 1 1( 197) 2 198( 1452) 1
EcoN I cctnn/nnnagg 1 1( 612) 2 613( 1037) 1
EcoR V gat/atc 1 1( 1560) 1 1561( 89) 2
Esp I gc/tnagc 1 1( 1246) 1 1247( 403) 2
Fsp I tgc/gca 1 1( 908) 1 909( 741) 2
Hha I gcg/c 1 1( 909) 1 910( 740) 2
HinP I g/cgc 1 1( 909) 1 910( 740) 2
Mae I c/tag 1 1( 13) 2 14( 1636) 1
Nco I c/catgg 1 1( 197) 2 198( 1452) 1
Nsp7524 I r/catgy 1 1( 1544) 1 1545( 105) 2
NspB II cmg/ckg 1 1( 1055) 1 1056( 594) 2
NspH I rcatg/y 1 1( 1544) 1 1545( 105) 2
Pml I cac/gtg 1 1( 167) 2 168( 1482) 1
Spe I a/ctagt 1 1( 12) 2 13( 1637) 1
Sty I c/cwwgg 1 1( 197) 2 198( 1452) 1
Taq I t/cga 1 1( 1454) 1 1455( 195) 2
Xcm I ccannnnn/nnnntgg 1 1( 1088) 1 1089( 561) 2
Xmn I gaann/nnttc 1 1( 287) 2 288( 1362) 1
Aat II gacgt/c 2 1( 900) 1 901( 106) 3 1007( 643) 2
Aha II gr/cgyc 2 1( 900) 1 901( 106) 3 1007( 643) 2
ApaL I g/tgcac 2 1( 478) 3 479( 633) 1 1112( 538) 2
Ban II grgcy/c 2 1( 393) 3 394( 708) 1 1102( 548) 2
BsaA I yac/gtr 2 1( 167) 2 168( 1325) 1 1493( 157) 3
BstX I ccannnnn/ntgg 2 1( 191) 2 192( 136) 3 328( 1322) 1
Gdi II yggccg -5/-1 2 1( 210) 3 211( 356) 2 567( 1083) 1
Hga I gacgc 5/10 2 1( 79) 3 80( 137) 2 217( 1433) 1
HgiA I gwgcw/c 2 1( 478) 3 479( 633) 1 1112( 538) 2
Mse I t/taa 2 1( 68) 3 69( 968) 1 1037( 613) 2
Nla III catg/ 2 1( 198) 2 199( 1347) 1 1546( 104) 3
Ple I gagtc 4/5 2 1( 93) 3 94( 559) 2 653( 997) 1
BstY I r/gatcy 3 1( 157) 3 158( 76) 4 234( 759) 1 993( 657) 2
Cfr10 I r/ccggy 3 1( 463) 2 464( 292) 4 756( 399) 3 1155( 495) 1
Ear I ctcttc 1/4 3 1( 148) 3 149( 780) 1 929( 588) 2 1517( 133) 4
Eco57 I ctgaag 16/14 3 1( 61) 4 62( 308) 3 370( 919) 1 1289( 361) 2
Msc I tgg/cca 3 1( 336) 3 337( 690) 1 1027( 341) 2 1368( 282) 4
Rsa I gt/ac 3 1( 222) 4 223( 300) 3 523( 651) 1 1174( 476) 2
Alw I ggatc 4/5 4 1( 233) 3 234( 1) 5 235( 759) 1 994( 630) 2 1624( 26) 4
BsmA I gtctc 1/5 4 1( 710) 1 711( 122) 5 833( 177) 4 1010( 216) 3 1226( 424) 2
Bsp1286 I gdgch/c 4 1( 393) 3 394( 85) 4 479( 623) 1 1102( 10) 5 1112( 538) 2
Fok I ggatg 9/13 4 1( 542) 1 543( 286) 3 829( 47) 5 876( 255) 4 1131( 519) 2
Bcn I ccs/gg 5 1( 502) 2 503( 62) 5 565( 165) 4 730( 54) 6
784( 530) 1 1314( 336) 3
Eae I y/ggccr 5 1( 210) 5 211( 126) 6 337( 230) 4 567( 460) 1
1027( 341) 2 1368( 282) 3
Hinf I g/antc 5 1( 93) 5 94( 194) 3 288( 60) 6 348( 305) 2
653( 820) 1 1473( 177) 4
Mae III /gtnac 5 1( 101) 6 102( 553) 1 655( 119) 5 774( 213) 4
987( 353) 2 1340( 310) 3
Nci I cc/sgg 5 1( 502) 2 503( 62) 5 565( 165) 4 730( 54) 6
784( 530) 1 1314( 336) 3
Nla IV ggn/ncc 5 1( 126) 4 127( 107) 5 234( 128) 3 362( 716) 1
1078( 7) 6 1085( 565) 2
Sau96 I g/gncc 5 1( 126) 5 127( 559) 1 686( 46) 6 732( 346) 2
1078( 239) 4 1317( 333) 3
Bbv I gcagc 8/12 6 1( 82) 6 83( 120) 4 203( 258) 2 461( 112) 5
573( 876) 1 1449( 181) 3 1630( 20) 7
Bsr I actgg 1/-1 6 1( 385) 3 386( 65) 5 451( 69) 4 520( 9) 7
529( 599) 1 1128( 478) 2 1606( 44) 6
Dde I c/tnag 6 1( 97) 5 98( 36) 7 134( 219) 4 353( 409) 2
762( 486) 1 1248( 341) 3 1589( 61) 6
Hae I wgg/ccw 6 1( 336) 2 337( 579) 1 916( 111) 6 1027( 116) 5
1143( 225) 4 1368( 19) 7 1387( 263) 3
Mbo II gaaga 8/7 6 1( 149) 4 150( 164) 3 314( 615) 1 929( 10) 7
939( 27) 6 966( 551) 2 1517( 133) 5
Dpn I ga/tc 7 1( 18) 8 19( 140) 3 159( 76) 5 235( 702) 1
937( 57) 6 994( 507) 2 1501( 124) 4 1625( 25) 7
Gsu I ctggag 16/14 7 1( 153) 4 154( 116) 5 270( 94) 6 364( 250) 3
614( 363) 2 977( 603) 1 1580( 27) 8 1607( 43) 7
Mbo I /gatc 7 1( 18) 8 19( 140) 3 159( 76) 5 235( 702) 1
937( 57) 6 994( 507) 2 1501( 124) 4 1625( 25) 7
Sau3A I /gatc 7 1( 18) 8 19( 140) 3 159( 76) 5 235( 702) 1
937( 57) 6 994( 507) 2 1501( 124) 4 1625( 25) 7
SfaN I gcatc 5/9 7 1( 113) 5 114( 189) 3 303( 109) 6 412( 164) 4
576( 47) 7 623( 252) 2 875( 758) 1 1633( 17) 8
Fnu4H I gc/ngc 8 1( 82) 8 83( 120) 6 203( 258) 3 461( 112) 7
573( 151) 5 724( 395) 1 1119( 330) 2 1449( 181) 4
1630( 20) 9
Hpa II c/cgg 8 1( 464) 1 465( 39) 7 504( 62) 6 566( 164) 4
730( 27) 9 757( 28) 8 785( 371) 2 1156( 159) 5
1315( 335) 3
Hph I ggtga 8/7 8 1( 22) 8 23( 94) 7 117( 263) 3 380( 285) 2
665( 497) 1 1162( 179) 4 1341( 157) 5 1498( 7) 9
1505( 145) 6
Msp I c/cgg 8 1( 464) 1 465( 39) 7 504( 62) 6 566( 164) 4
730( 27) 9 757( 28) 8 785( 371) 2 1156( 159) 5
1315( 335) 3
Mae II a/cgt 10 1( 168) 5 169( 392) 1 561( 38) 8 599( 10)10
609( 24) 9 633( 269) 3 902( 97) 7 999( 9)11
1008( 186) 4 1194( 300) 2 1494( 156) 6
Sec I c/cnngg 10 1( 197) 5 198( 136) 7 334( 168) 6 502( 249) 3
751( 32) 9 783( 298) 1 1081( 18)11 1099( 266) 2
1365( 25)10 1390( 209) 4 1599( 51) 8
BstN I cc/wgg 11 1( 334) 2 335( 149) 4 484( 132) 6 616( 360) 1
976( 106) 8 1082( 17)12 1099( 131) 7 1230( 136) 5
1366( 24)10 1390( 189) 3 1579( 21)11 1600( 50) 9
EcoR II /ccwgg 11 1( 334) 2 335( 149) 4 484( 132) 6 616( 360) 1
976( 106) 8 1082( 17)12 1099( 131) 7 1230( 136) 5
1366( 24)10 1390( 189) 3 1579( 21)11 1600( 50) 9
Alu I ag/ct 13 1( 146) 5 147 ( 112) 6 259 ( 61) 8 320 ( 36)10
356( 12)11 368 ( 379) 2 747 ( 393) 1 1140 ( 99) 7
1239 ( 7)12 1246 ( 5)14 1251 ( 7)13 1258 ( 190) 3
1448 ( 38) 9 1486 ( 164) 4
Hae III gg/cc 14 1( 126) 6 127 ( 85) 9 212 ( 126) 7 338 ( 230) 2
568 ( 155) 5 723 ( 10)15 733 ( 22)13 755 ( 162) 4
917 ( 111) 8 1028 ( 50)12 1078 ( 66)10 1144 ( 173) 3
1317 ( 52)11 1369 ( 19)14 1388 ( 262) 1
ScrF I cc/ngg 16 1( 334) 1 335( 149) 4 484( 19)16 503( 62) 9
565 ( 51)12 616 ( 114) 6 730( 54)10 784 ( 192) 2
976 ( 106) 7 1082 ( 17)17 1099 ( 131) 5 1230 ( 84) 8
1314 ( 52)11 1366 ( 24)14 1390 ( 189) 3 1579 ( 21)15
1600 ( 50)13
Mnl I cctc 7/7 22 1( 30)16 31( 144) 4 175( 94) 7 269( 83) 9
352 ( 26)18 378 ( 63)10 441 ( 113) 6 554 ( 38)14
592 ( 21)19 613 ( 7)23 620 ( 87) 8 707 ( 28)17
735 ( 18)20 753 ( 13)21 766 ( 214) 1 980 ( 155) 3
1135 ( 135) 5 1270 ( 33)15 1303 ( 156) 2 1459 ( 61)12
1520 ( 63)11 1583 ( 8)22 1591 ( 59)13
325 sites found
No Sites found for the following Restriction Endonucleases
Afl II c/ttaag Eco47 III agc/gct Pst I ctgca/g
Apa I gggcc/c EcoO109 I rg/gnccy Pvu I cgat/cg
Ase I at/taat EcoR I g/aattc Pvu II cag/ctg
Asp718 g/gtacc Hae II rgcgc/y Rsr II cg/gwccg
Ava I c/ycgrg HgiE II accnnnnnnggt Sac I gagct/c
Avr II c/ctagg HinC II gty/rac Sac II ccgc/gg
Ban I g/gyrcc HinD III a/agctt Sal I g/tcgac
Bbe I ggcgc/c Hpa I gtt/aac Sca I agt/act
Bcl I t/gatca Kpn I ggtac/c Sfi I ggccnnnn/nggcc
Bsm I gaatgc 1/-1 Mlu I a/cgcgt Sma I ccc/ggg
BspH I t/catga Mme I tccrac 20/18 SnaB I tac/gta
BspM II t/ccgga Nae I gcc/ggc Sph I gcatg/c
BssH II g/cgcgc Nar I gg/cgcc Spl I c/gtacg
BstB I tt/cgaa Nde I ca/tatg Ssp I aat/att
BstU I cg/cg Nhe I g/ctagc Stu I agg/cct
Bsu36 I cc/tnagg Not I gc/ggccgc Tthlll I gacn/nngtc
Cla I at/cgat Nru I tcg/cga Tthlll II caarca 11/9
Dra I ttt/aaa^ Nsi I atgca/t Xba I t/ctaga
Dra III cacnnn/gtg PaeR7 I c/tcgag Xca I gta/ ae
Drd I gacnnnn/nngtc PflM I ccaπnnn/ntgg Xho I c/tcgag
Eag I c/ggccg PpuM I rg/gwccy Xma I c/ccggg
ANNEX 7
DNA sequence 1813 b.p. ACTCTAGAAGGA ... AATTGCAGCCAA linear
Positions of Restriction Endonucleases sites (unique sites underlined)
[Restriction map of the 1813 b.p. sequence: enzyme names printed above the double-stranded sequence, site positions printed below it. The sequence rows are not legible in this copy. The site positions for each enzyme are listed in the table "Enzyme Site Use Site position (Fragment length) Fragment order" below.]
Restriction Endonucleases site usage
Aat II - BstN I 8 HinC II 2 Ple I 4
Acc I 1 BstU I - HinD III 1 Pml I -
Afl II - BstX I 3 Hinf I 7 PpuM I 2
Afl III 1 BstY I 1 HinP I 1 Pst I 2
Aha II 1 Bsu36 I 1 Hpa I 1 Pvu I -
Alu I 7 Cfr10 I - Hpa II 2 Pvu II 1
Alw I 2 Cla I - Hph I - Rsa I 2
AlwN I 1 Dde I 7 Kpn I - Rsr II -
Apa I - Dpn I 5 Mae I 5 Sac I -
ApaL I - Dra I - Mae II 2 Sac II -
Ase I - Dra III - Mae III 4 Sal I -
Asp718 - Drd I - Mbo I 5 Sau3A I 5
Ava I 1 Dsa I 1 Mbo II 8 Sau96 I 3
Ava II 3 Eae I 4 Mlu I - Sca I -
Avr II - Eag I 1 Mme I 1 ScrF I 9
BamH I 1 Ear I 1 Mnl I 14 Sec I 5
Ban I 1 Eco47 III - Msc I 3 SfaN I 4
Ban II 1 Eco57 I - Mse I 6 Sfi I -
Bbe I 1 EcoN I 1 Msp I 2 Sma I -
Bbv I 7 EcoO109 I 2 Nae I - SnaB I -
Bbv II - EcoR I - Nar I 1 Spe I -
Bcl I 3 EcoR II 8 Nci I 1 Sph I -
Bcn I 1 EcoR V 1 Nco I 1 Spl I -
Bgl I - Esp I - Nde I 1 Ssp I 3
Bgl II - Fnu4H I 9 Nhe I - Stu I 1
BsaA I - Fok I 5 Nla III 8 Sty I 2
Bsm I - Fsp I - Nla IV 5 Taq I 1
BsmA I 4 Gdi II 1 Not I 1 Tth111 I -
Bsp1286 I 2 Gsu I 2 Nru I - Tth111 II 1
BspH I - Hae I 4 Nsi I - Xba I 1
BspM I 2 Hae II 1 Nsp7524 I 2 Xca I -
BspM II - Hae III 5 NspB II 1 Xho I 1
Bsr I 6 Hga I - NspH I 2 Xcm I -
BssH II - HgiA I 1 PaeR7 I 1 Xma I -
BstB I - HgiE II 1 PflM I - Xmn I 1
BstE II - Hha I 1
Enzyme Site Use Site position (Fragment length) Fragment order
Acc I gt/mkac 1 1 ( 494) 2 495 ( 1319) 1
Afl III a/crygt 1 1 ( 1273) 1 1274 ( 540) 2
Aha II gr/cgyc 1 1 ( 910) 1 911 ( 903) 2
AlwN I cagnnn/ctg 1 1 ( 420) 2 421 ( 1393) 1
Ava I c/ycgrg 1 1 ( 28) 2 29 ( 1785) 1
BamH I g/gatcc 1 1 ( 1142) 1 1143 ( 671) 2
Ban I g/gyrcc 1 1 ( 910) 1 911 ( 903) 2
Ban II grgcy/c 1 1 ( 25) 2 26( 1788) 1
Bbe I ggcgc/c 1 1 ( 910) 1 911 ( 903) 2
Bcn I ccs/gg 1 1 ( 1033) 1 1034 ( 780) 2
BstY I r/gatcy 1 1 ( 1142) 1 1143 ( 671) 2
Bsu36 I cc/tnagg 1 1 ( 500) 2 501 ( 1313) 1
Dsa I c/crygg 1 1 ( 1369) 1 1370 ( 444) 2
Eag I c/ggccg 1 1 ( 34) 2 35 ( 1779) 1
Ear I ctcttc 1/4 1 1 ( 182) 2 183 ( 1631) 1
EcoN I cctnn/nnnagg 1 1 ( 1642) 1 1643 ( 171) 2
EcoR V gat/atc 1 1 ( 697) 2 698 ( 1116) 1
Gdi II yggccg -5/-1 1 1 ( 34) 2 35 ( 1779) 1
Hae II rgcgc/y 1 1 ( 910) 1 911 ( 903) 2
HgiA I gwgcw/c 1 1 ( 859) 2 860 ( 954) 1
HgiE II accnnnnnnggt 1 1 ( 1706) 1 1707 ( 107) 2
Hha I gcg/c 1 1 ( 911) 1 912 ( 902) 2
HinD III a/agctt 1 1 ( 672) 2 673 ( 1141) 1
HinP I g/cgc 1 1 ( 911) 1 912 ( 902) 2
Hpa I gtt/aac 1 1 ( 625) 2 626 ( 1188) 1
Mme I tccrac 20/18 1 1 ( 889) 2 890 ( 924) 1
Nar I gg/cgcc 1 1 ( 910) 1 911 ( 903) 2
Nci I cc/sgg 1 1 ( 1033) 1 1034 ( 780) 2
Nco I c/catgg 1 1 ( 1369) 1 1370 ( 444) 2
Nde I ca/tatg 1 1 ( 1767) 1 1768 ( 46) 2
Not I gc/ggccgc 1 1 ( 33) 2 34 ( 1780) 1
NspB II cmg/ckg 1 1 ( 1628) 1 1629 ( 185) 2
PaeR7 I c/tcgag 1 1 ( 28) 2 29 ( 1785) 1
Pvu II cag/ctg 1 1 ( 1628) 1 1629 ( 185) 2
Stu I agg/cct 1 1 ( 801) 2 802 ( 1012) 1
Taq I t/cga 1 1( 29) 2 30( 1784) 1
Tth111 II caarca 11/9 1 1 ( 678) 2 679 ( 1135) 1
Xba I t/ctaga 1 1( 2) 2 3( 1811) 1
Xho I c/tcgag 1 1( 28) 2 29 ( 1785) 1
Xmn I gaann/nnttc 1 1( 1673) 1 1674 ( 140) 2
Alw I ggatc 4/5 2 1( 1142) 1 1143 ( 1) 3 1144 ( 670) 2
Bsp1286 I gdgch/c 2 1 ( 25) 3 26 ( 834) 2 860 ( 954) 1
BspM I acctgc 4/8 2 1( 611) 2 612 ( 308) 3 920 ( 894) 1
EcoO109 I rg/gnccy 2 1( 8) 3 9( 1159) 1 1168 ( 646) 2
Gsu I ctggag 16/14 2 1( 817) 2 818 ( 125) 3 943 ( 871) 1
HinC II gty/rac 2 1 ( 594) 2 595 ( 31) 3 626 ( 1188) 1
Hpa II c/cgg 2 1( 1033) 1 1034 ( 399) 2 1433 ( 381) 3
Mae II a/cgt 2 1( 515) 2 516 ( 955) 1 1471 ( 343) 3
Msp I c/cgg 2 1( 1033) 1 1034 ( 399) 2 1433 ( 381) 3
Nsp7524 I r/catgy 2 1 ( 338) 3 339 ( 935) 1 1274 ( 540) 2
NspH I rcatg/y 2 1( 338) 3 339( 935) 1 1274 ( 540) 2
PpuM I rg/gwccy 2 1( 8) 3 9( 1159) 1 1168 ( 646) 2
Pst I ctgca/g 2 1( 921) 1 922 ( 587) 2 1509 ( 305) 3
Rsa I gt/ac 2 1( 796) 1 797 ( 475) 3 1272 ( 542) 2
Sty I c/cwwgg 2 1( 1369) 1 1370 ( 278) 2 1648 ( 166) 3
Ava II g/gwcc 3 1( 9) 4 10 ( 160) 3 170 ( 999) 1 1169 ( 645) 2
Bcl I t/gatca 3 1 ( 277) 3 278 ( 723) 1 1001 ( 614) 2 1615 ( 199) 4
BstX I ccannnnn/ntgg 3 1 ( 511) 2 512 ( 642) 1 1154 ( 470) 3 1624 ( 190) 4
Msc I tgg/cca 3 1 ( 545) 2 546 ( 758) 1 1304 ( 341) 3 1645 ( 169) 4
Sau96 I g/gncc 3 1 ( 9) 4 10 ( 160) 3 170 ( 999) 1 1169 ( 645) 2
Ssp I aat/att 3 1 ( 163) 3 164 ( 559) 2 723 ( 953) 1 1676 ( 138) 4
BsmA I gtctc 1/5 4 1 ( 127) 4 128 ( 804) 1 932 ( 14) 5 946 ( 534) 2 1480 ( 334) 3
Eae I y/ggccr 4 1 ( 34) 5 35 ( 511) 2 546 ( 758) 1 1304 ( 341) 3 1645 ( 169) 4
Hae I wgg/ccw 4 1 ( 545) 1 546 ( 256) 4 802 ( 502) 2 1304 ( 341) 3 1645 ( 169) 5
Mae III /gtnac 4 1 ( 142) 4 143 ( 688) 1 831 ( 433) 3 1264 ( 500) 2 1764 ( 50) 5
Ple I gagtc 4/5 4 1 ( 454) 2 455 ( 24) 5 479 ( 102) 4 581 ( 248) 3 829 ( 985) 1
SfaN I gcatc 5/9 4 1 ( 223) 3 224 ( 524) 2 748 ( 104) 5 852 ( 201) 4 1053 ( 761) 1
Dpn I ga/tc 5 1 ( 69) 6 70 ( 209) 3 279 ( 723) 1 1002 ( 142) 5 1144 ( 472) 2 1616 ( 198) 4
Fok I ggatg 9/13 5 1 ( 352) 2 353 ( 11) 6 364 ( 742) 1 1106 ( 302) 3 1408 ( 175) 5 1583 ( 231) 4
Hae III gg/cc 5 1 ( 35) 6 36 ( 511) 1 547 ( 256) 4 803 ( 502) 2 1305 ( 341) 3 1646 ( 168) 5
Mae I c/tag 5 1 ( 3) 6 4 ( 897) 1 901 ( 123) 5 1024 ( 395) 2 1419 ( 249) 3 1668 ( 146) 4
Mbo I /gatc 5 1 ( 69) 6 70 ( 209) 3 279 ( 723) 1 1002 ( 142) 5 1144 ( 472) 2 1616 ( 198) 4
Nla IV ggn/ncc 5 1 ( 9) 6 10 ( 901) 1 911 ( 232) 3 1143 ( 26) 5 1169 ( 465) 2 1634 ( 180) 4
Sau3A I /gatc 5 1 ( 69) 6 70 ( 209) 3 279 ( 723) 1 1002 ( 142) 5 1144 ( 472) 2 1616 ( 198) 4
Sec I c/cnngg 5 1 ( 445) 2 446 ( 588) 1 1034 ( 336) 3 1370 ( 272) 4 1642 ( 6) 6 1648 ( 166) 5
Bsr I actgg 1/-1 6 1 ( 139) 6 140 ( 67) 7 207 ( 680) 1 887 ( 253) 3
1140 ( 265) 2 1405 ( 181) 5 1586 ( 228) 4
Mse I t/taa 6 1 ( 572) 1 573 ( 54) 5 627 ( 570) 2 1197 ( 321) 3
1518 ( 21) 7 1539 ( 27) 6 1566 ( 248) 4
Alu I ag/ct 7 1 ( 673) 1 674 ( 142) 5 816 ( 6) 8 822 ( 227) 3
1049 ( 372) 2 1421 ( 209) 4 1630 ( 95) 6 1725 ( 89) 7
Bbv I gcagc 8/12 7 1 ( 731) 1 732 ( 3) 8 735 ( 138) 4 873 ( 51) 6
924 ( 442) 2 1366 ( 360) 3 1726 ( 80) 5 1806 ( 8) 7
Dde I c/tnag 7 1 ( 121) 5 122 ( 297) 3 419 ( 83) 7 502 ( 688) 1
1190 ( 63) 8 1253 ( 99) 6 1352 ( 133) 4 1485 ( 329) 2
Hinf I g/antc 7 1 ( 230) 3 231 ( 224) 4 455 ( 24) 8 479 ( 102) 6
581 ( 210) 5 791 ( 38) 7 829 ( 716) 1 1545 ( 269) 2
BstN I cc/wgg 8 1 ( 446) 1 447 ( 102) 7 549 ( 209) 4 758 ( 42) 9
800 ( 209) 5 1009 ( 259) 3 1268 ( 281) 2 1549 ( 94) 8
1643 ( 171) 6
EcoR II /ccwgg 8 1 ( 446) 1 447 ( 102) 7 549 ( 209) 4 758 ( 42) 9
800 ( 209) 5 1009 ( 259) 3 1268 ( 281) 2 1549 ( 94) 8 1643 ( 171) 6
Mbo II gaaga 8/7 8 1 ( 179) 4 180 ( 3) 9 183 ( 525) 1 708 ( 354) 3
1062 ( 61) 7 1123 ( 83) 5 1206 ( 75) 6 1281 ( 516) 2
1797 ( 17) 8
Nla III catg/ 8 1 ( 75) 8 76 ( 100) 6 176 ( 144) 4 320 ( 20) 9
340 ( 568) 1 908 ( 111) 5 1019 ( 256) 3 1275 ( 96) 7
1371 ( 443) 2
Fnu4H I gc/ngc 9 1 ( 33) 7 34 ( 3) 9 37 ( 695) 1 732 ( 3)10
735 ( 138) 4 873 ( 51) 6 924 ( 442) 2 1366 ( 360) 3
1726 ( 80) 5 1806 ( 8) 8
ScrF I cc/ngg 9 1 ( 446) 1 447 ( 102) 7 549 ( 209) 4 758 ( 42) 9
800 ( 209) 5 1009 ( 25)10 1034 ( 234) 3 1268 ( 281) 2
1549 ( 94) 8 1643 ( 171) 6
Mnl I cctc 7/7 14 1 ( 172) 4 173 ( 154) 5 327 ( 30)11 357 ( 22)13
379 ( 122) 6 501 ( 27)12 528 ( 412) 1 940 ( 103) 7
1043 ( 214) 3 1257 ( 100) 8 1357 ( 39)10 1396 ( 16)14
1412 ( 72) 9 1484 ( 12)15 1496 ( 318) 2
256 sites found
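The fragment columns of the table above follow from the site positions by simple arithmetic: each fragment runs from one listed position to the base before the next, and the fragment order ranks the fragments from longest to shortest. The sketch below reproduces that calculation in Python. The function name `fragments` is hypothetical, and the tie-breaking rule (equal lengths ranked by position) is an assumption inferred from the BstN I rows, not a documented feature of the program that generated the annex.

```python
def fragments(site_positions, seq_length):
    """Return (start, length, order) for each fragment of a linear sequence.

    site_positions: 1-based positions as listed in the table.
    seq_length: total length in base pairs (1813 for ANNEX 7).
    """
    starts = [1] + sorted(site_positions)
    # Each fragment ends just before the next start; the last ends at seq_length.
    ends = starts[1:] + [seq_length + 1]
    lengths = [end - start for start, end in zip(starts, ends)]
    # Rank longest first; sorted() is stable, so ties keep positional order
    # (assumed convention, consistent with the two 209 b.p. BstN I fragments).
    by_length = sorted(range(len(lengths)), key=lambda i: -lengths[i])
    order = [0] * len(lengths)
    for rank, index in enumerate(by_length, start=1):
        order[index] = rank
    return list(zip(starts, lengths, order))


# Ssp I sites of ANNEX 7: 164, 723 and 1676 in a 1813 b.p. sequence.
for start, length, order in fragments([164, 723, 1676], 1813):
    print(start, length, order)
```

For Ssp I this yields the rows 1 (163) 3, 164 (559) 2, 723 (953) 1 and 1676 (138) 4, matching the table entry. The same check (last position plus last length equals 1814) can be used to spot misprinted rows.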
No Sites found for the following Restriction Endonucleases
Aat II gacgt/c Cfr10 I r/ccggy PflM I ccannnn/ntgg
Afl II c/ttaag Cla I at/cgat Pml I cac/gtg
Apa I gggcc/c Dra I ttt/aaa Pvu I cgat/cg
ApaL I g/tgcac Dra III cacnnn/gtg Rsr II cg/gwccg
Ase I at/taat Drd I gacnnnn/nngtc Sac I gagct/c
Asp718 g/gtacc Eco47 III agc/gct Sac II ccgc/gg
Avr II c/ctagg Eco57 I ctgaag 16/14 Sal I g/tcgac
Bbv II gaagac 2/6 EcoR I g/aattc Sca I agt/act
Bgl I gccnnnn/nggc Esp I gc/tnagc Sfi I ggccnnnn/nggcc
Bgl II a/gatct Fsp I tgc/gca Sma I ccc/ggg
BsaA I yac/gtr Hga I gacgc 5/10 SnaB I tac/gta
Bsm I gaatgc 1/-1 Hph I ggtga 8/7 Spe I a/ctagt
BspH I t/catga Kpn I ggtac/c Sph I gcatg/c
BspM II t/ccgga Mlu I a/cgcgt Spl I c/gtacg
BssH II g/cgcgc Nae I gcc/ggc Tth111 I gacn/nngtc
BstB I tt/cgaa Nhe I g/ctagc Xca I gta/tac
BstE II g/gtnacc Nru I tcg/cga Xcm I ccannnnn/nnnntgg
BstU I cg/cg Nsi I atgca/t Xma I c/ccggg
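The recognition sequences in these lists use the IUPAC ambiguity codes (n = any base, r = a or g, y = c or t, w = a or t, s = c or g, m = a or c, k = g or t, and so on), with "/" marking the cut. A minimal sketch of how such a pattern can be searched for in a sequence is given below. The function names are hypothetical, only the strand as written is scanned (sufficient for palindromic sites), and the convention that a site position is the 1-based first base of the recognition sequence is an assumption inferred from the Xba I and Mae I entries of ANNEX 7 and the sequence start ACTCTAGAAGGA.

```python
import re

# IUPAC nucleotide codes mapped to regular-expression character classes.
IUPAC = {
    "a": "a", "c": "c", "g": "g", "t": "t",
    "r": "[ag]", "y": "[ct]", "m": "[ac]", "k": "[gt]",
    "s": "[cg]", "w": "[at]", "h": "[act]", "b": "[cgt]",
    "v": "[acg]", "d": "[agt]", "n": "[acgt]",
}


def site_regex(recognition):
    """Build a regex from an entry such as 'c/cnngg' or 'cctc 7/7'."""
    # Keep only the sequence token and drop the cut marker.
    bases = recognition.split()[0].replace("/", "").lower()
    # A lookahead is used so that overlapping sites are all reported.
    return re.compile("(?=" + "".join(IUPAC[b] for b in bases) + ")")


def find_sites(recognition, sequence):
    """Return 1-based start positions of the recognition sequence."""
    pattern = site_regex(recognition)
    return [match.start() + 1 for match in pattern.finditer(sequence.lower())]


head = "ACTCTAGAAGGA"  # first 12 bases of the ANNEX 7 sequence
print(find_sites("t/ctaga", head))  # Xba I
print(find_sites("c/tag", head))    # Mae I
```

On the 12-base start of the ANNEX 7 sequence this reports position 3 for Xba I and position 4 for Mae I, in agreement with the first sites listed for those enzymes. An enzyme belongs in the "No Sites found" list when the returned list is empty for the full sequence; for non-palindromic sites the reverse complement would also have to be scanned.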
BIBLIOGRAPHY
1. Melnick, M., Bixler, D., Nance, W., Silk, K. & Yune, H. Familial branchio-oto-renal dysplasia: A new addition to the branchial arch syndromes. Clin. Genet. 9, 25-34 (1976).
2. Heimler, A. & Lieber, E. Branchio-Oto-Renal syndrome: reduced penetrance and variable expressivity in four generations of a large kindred. Am. J. Med. Genet. 25, 15-27 (1986).
3. Greenberg, C.R., Trevenen, C.L. & Evans, J.A. The BOR syndrome and renal agenesis-Prenatal diagnosis and further clinical delineation. Prenatal Diagnosis 8, 103-108 (1988).
4. Fraser, F.C., Sproule, J.R. & Halal, F. Frequency of the Branchio-Oto-Renal (BOR) syndrome in children with profound hearing loss. Am. J. Med. Genet. 7, 341-349 (1980).
5. Chen, A., Francis, M., Ni, L., Cremers, C.W.R.J., Kimberling, W.J. et al. Phenotypic manifestations of branchiootorenal syndrome. Am. J. Med. Genet. 58, 365-370 (1995).
6. König, R., Fuchs, S. & Dukiet, C. Branchio-Oto-Renal (BOR) syndrome: variable expressivity in a five-generation pedigree. Eur. J. Pediatr. 153, 446-450 (1994).
7. Gimsing, S. & Dyrmose, J. Branchio-Oto-Renal dysplasia in three families. Ann. Otol., Rhinol. & Laryngol. 95, 421-426 (1986).
8. Fraser, F.C., Aymé, S., Halal, F. & Sproule, J. Autosomal dominant duplication of the renal collecting system, hearing loss, and external ear anomalies: a new syndrome? Am. J. Med. Genet. 14, 473-478 (1983).
9. Sulik, K.K. Embryology of the Ear, in Hereditary Hearing Loss and Its Syndromes, 22-42 (Gorlin, R.J., Toriello, H.V. & Cohen, M.M., New York, 1995).
10. Clapp, W.L. & Abrahamson, D.R. Development and Gross Anatomy of the Kidney, in Renal Pathology: With Clinical and Functional Correlations, 3-59 (Tisher, C.C. & Brenner, B.M., Philadelphia, 1994).
11. Patterson, L.T. & Dressler, G.R. The regulation of kidney development: new insights from an old model. Curr. Opin. Genet. Dev. 4, 696-702 (1994).
12. Haan, E.A. et al. Tricho-Rhino-Phalangeal and Branchio-Oto Syndromes in a Family With an Inherited Rearrangement of Chromosome 8q. Am. J. Med. Genet. 32, 490-494 (1989).
13. Vincent, C. et al. A proposed new contiguous syndrome on 8q consists of Branchio-Oto-Renal (BOR) syndrome, Duane syndrome, a dominant form of hydrocephalus and trapeze aplasia; implications for the mapping of the BOR gene. Hum. Mol. Genet. 3, 1859-1866 (1994).
14. Ni, L. et al. Refined localisation of the branchiootorenal syndrome gene by linkage and haplotype analysis. Am. J. Med. Genet. 51, 176-184 (1994).
15. Gu, J.Z., Wagner, M.J., Haan, E.A. & Wells, D.E. Detection of a megabase deletion in a patient with Branchio-Oto-Renal Syndrome (BOR) and Tricho-Rhino-Phalangeal Syndrome (TRPS): Implications for mapping and cloning the BOR gene. Genomics 31, 201-206 (1996).
16. Kalatzis, V., Abdelhak, S., Compain, S., Vincent, C. & Petit, C. Characterisation of a translocation-associated deletion defines the candidate region for the gene responsible for Branchio-Oto-Renal syndrome. Genomics 34, 422-425 (1996).
18. Kozak, M. Interpreting cDNA sequences: some insights from studies on translation. Mammal. Genome 7, 563-574 (1996).
19. Beaudet, A.L. & Tsui, L.-C. A suggested nomenclature for designating mutations. Hum. Mutation 2, 245-248 (1993).
20. Kyte, J. & Doolittle, R.F. A simple method for displaying the hydropathic character of a protein. J. Mol. Biol. 157, 105-132 (1982).
21. Jordan, T. et al. The human PAX6 gene is mutated in two patients with aniridia. Nature Genet. 1, 328-332 (1992).
22. Hanson, I.M. et al. Mutations at the PAX6 locus are found in heterogeneous anterior segment malformations including Peters' anomaly. Nature Genet. 6, 168-173 (1994).
23. Mirzayans, F., Pearce, W.G., MacDonald, I.M. & Walter, M.A. Mutation of the PAX6 gene in patients with autosomal dominant keratitis. Am. J. Hum. Genet. 57, 539-548 (1995).
24. Vortkamp, A., Gessler, M. & Grzeschik, K.-H. GLI3 zinc-finger gene interrupted by translocations in Greig syndrome families. Nature 352, 539-540 (1991).
25. Wagner, T. et al. Autosomal Sex Reversal and Campomelic Dysplasia are caused by Mutations in and around the SRY-Related Gene SOX9. Cell 79, 1111-1120 (1994).
26. Belloni, E. et al. Identification of Sonic hedgehog as a candidate gene responsible for holoprosencephaly. Nature Genet. 14, 353-356 (1996).
27. Roessler, E. et al. Mutations in the human Sonic Hedgehog gene cause holoprosencephaly. Nature Genet. 14, 357-360 (1996).
28. Wilkie, A.O.M. The molecular basis of genetic dominance. J. Med. Genet. 31, 89-98 (1994).
29. Van-De-Water, T.R. et al. Growth factors and development of the stato-acoustic system, in Development of auditory and vestibular systems 2, 1-32 (Romand, R., Amsterdam, 1992).
30. Lufkin, T., Dierich, A., LeMeur, M., Mark, M. & Chambon, P. Disruption of the Hox-1.6 homeobox gene results in defects in a region corresponding to its rostral domain of expression. Cell 66, 1105-1119 (1991).
31. Chisaka, O., Musci, T.S. & Capecchi, M.R. Developmental defects of the ear, cranial nerves and hindbrain resulting from targeted disruption of the mouse homeobox gene Hox-1.6. Nature 355, 516-520 (1992).
32. Torres, M., Gomez-Pardo, E., Dressler, G.R. & Gruss, P. Pax-2 controls multiple steps of urogenital development. Development 121, 4057-4065 (1995).
33. Corey, D.P. & Breakefield, X.O. Transcription factors in inner ear development. Proc. Natl. Acad. Sci. USA 91, 433-436 (1994).
34. Dressler, G.R. The genetic control of renal development. Curr. Opin. Nephrol. & Hypertension 4, 253-257 (1995).
35. Sanyanusin, P. et al. Mutation of the PAX2 gene in a family with optic nerve colobomas, renal anomalies and vesicoureteral reflux. Nature Genet. 9, 358-363 (1995).
36. Rothenpieler, U.W. & Dressler, G.R. Pax-2 is required for mesenchyme-to-epithelium conversion during kidney development. Development 119, 711-720 (1993).
37. Plachov, D. et al. Pax8, a murine paired box gene expressed in the developing excretory system and thyroid gland. Development 110, 643-651 (1990).
38. Kreidberg, J.A. et al. WT-1 is required for early kidney development. Cell 74, 679-691 (1993).
39. Pontoglio, M. et al. Hepatocyte Nuclear Factor 1 inactivation results in hepatic dysfunction, phenylketonuria, and renal Fanconi syndrome. Cell 84, 575-585 (1996).
40. Hatini, V., Huh, S.O., Herzlinger, D., Soares, V.C. & Lai, E. Essential role of stromal mesenchyme in kidney morphogenesis revealed by targeted disruption of Winged Helix transcription factor BF-2. Genes & Dev. 10, 1467-1478 (1996).
41. Stark, K., Vainio, S., Vassileva, G. & McMahon, A.P. Epithelial transformation of metanephric mesenchyme in the developing kidney regulated by Wnt-4. Nature 372, 679-683 (1994).
42. Schuchardt, A., D'Agati, V., Larsson-Blomberg, L., Costantini, F. & Pachnis, V. Defects in the kidney and enteric nervous system of mice lacking the tyrosine kinase receptor Ret. Nature 367, 380-383 (1994).
43. Mendelsohn, C. et al. Function of the retinoic acid receptors (RARs) during development. Development 120, 2749-2771 (1994).
44. Hurst, H. Sequences of bZIP Proteins. Protein Profile 2, 107-119 (1995).
45. Bryden, M.M., Evans, H.E. & Binns, W. Embryology of the Sheep. J. Morphol. 138, 187-205 (1972).
46. Koseki, C., Herzlinger, D. & Al-Awqati, Q. Apoptosis in metanephric development. J. Cell Biol. 119, 1327-1333 (1992).
47. Banfi, S. et al. Identification and mapping of human cDNAs homologous to Drosophila mutant genes through EST database searching. Nature Genet. 13, 167-174 (1996).
48. Yamagata, K. et al. Mutations in the hepatocyte nuclear factor-4a gene in maturity-onset diabetes of the young (MODY1). Nature 384, 458-460 (1996).
49. Spence, S.E. et al. Genetic Localization of Hao-1, blind-sterile (bs), and Emv-13 on mouse chromosome 2. Genomics 12, 403-404 (1992).
50. Uberbacher, E.C. & Mural, R.J. Locating protein-coding regions in human DNA sequences by a multiple sensor-neural network approach. Proc. Natl. Acad. Sci. USA 88, 11261-11265 (1991).
51. Roach, J.C., Boysen, C., Wang, K. & Hood, L. Pairwise End Sequencing: A unified approach to genomic mapping and sequencing. Genomics 26, 345-353 (1995).
52. Devereux, J., Haeberli, P. & Smithies, O. A comprehensive set of sequence analysis programs for the VAX. Nucleic Acids Res. 12, 387-395 (1984).
53. Brendel, V., Bucher, P., Nourbakhsh, I.R., Blaisdell, B.E. & Karlin, S. Methods and algorithms for statistical analysis of protein sequences. Proc. Natl. Acad. Sci. USA 89, 2002-2006 (1992).
54. Fuchs, R. MacPattern: protein pattern searching on the Apple Macintosh. Computer Applications in the Biosciences 7, 105-106 (1991).
55. Fuchs, R. Predicting protein function: a versatile tool for the Apple Macintosh. Computer Applications in the Biosciences 10, 171-178 (1994).
56. Weil, D. et al. Defective myosin VIIA gene responsible for Usher syndrome type 1B. Nature 374, 60-61 (1995).
57. Henrique, D. et al. Expression of a Delta homologue in prospective neurons in the chick. Nature 375, 787-790 (1995).
58. Schaeren-Wiemers, N. & Gerfin-Moser, A. A single protocol to detect transcripts of various types and expression levels in neural tissue and cultured cells: In situ hybridization using digoxigenin-labelled cRNA probes. Histochemistry 100, 431-440 (1993).
Sambrook, J. et al., 1989, Molecular Cloning: A Laboratory Manual. Cold Spring Harbor, NY: Cold Spring Harbor Laboratory Press.
Walker G.T. et al., 1992, Nucleic Acids Res., 20:1691-1696.
Walker G.T. et al., 1992, Proc. Natl. Acad. Sci. USA, 89:392-396.
Spargo C.A. et al., 1996, Mol. and Cell. Probes, 10:247-256.
Kwoh D.Y. et al., 1989, Proc. Natl. Acad. Sci. USA, 86:1173-1177.
Guatelli J.C. et al., 1990, Proc. Natl. Acad. Sci. USA, 87:1874-1878.
Kievits T. et al., 1991, J. Virol. Methods, 35:273-286.
Landegren U. et al., 1988, Science, 241:1077-1080.
Barany F., 1991, Proc. Natl. Acad. Sci. USA, 88:189-193.
Segev D., 1992, in « Non-radioactive Labeling and Detection of Biomolecules », Kessler C. (ed.), Springer Verlag, Berlin, New York, 197-205.
Duck P. et al., 1990, Biotechniques, 9:142-147.
Burg J.L. et al., 1996, Mol. and Cell. Probes, 10:257-271.
Lizardi P.M. et al., 1988, Bio/technology, 6:1197-1202.
Miele E.A. et al., 1983, J. Mol. Biol., 171:281-295.
Stone B.B. et al., 1996, Mol. and Cell. Probes, 10:359-370.
Chu B.C.F. et al., 1986, Nucleic Acids Res., 14:5591-5603.
Sanchez-Pescador R., 1988, J. Clin. Microbiol., 26(10):1934-1938.
Urdea M.S., 1988, Nucleic Acids Research, 11:4937-4957.
Matthews J.A. et al., 1988, Anal. Biochem., 169:1-25.
Urdea M.S. et al., 1991, Nucleic Acids Symp. Ser., 24:197-200.
beta-cyanoethylphosphoramidite technique, 1986, Bioorganic Chem., 4:274-325.
Houben-Weyl, 1974, in Methoden der Organischen Chemie, E. Wünsch Ed., Volume 15-I and 15-II, Thieme, Stuttgart.
Kohler G. et al., 1975, Nature, 256(5517):495-497.
Bolmont et al., J. of Submicroscopic cytology and pathology, 1990, 22:117-122.
Zhenlin et al., Gene, 1989, 78:243-254.
Tascon et al., 1996, Nature Medicine, 2(8):888-892.
Huygen et al., 1996, Nature Medicine, 2(8):893-898.
Pagano et al., 1967, J. Virol., 1:891
Kaneda et al., 1989, Science, 243:375
Felgner et al., 1987, Proc. Natl. Acad. Sci., 84:7413
Fraley et al., 1980, J. Biol. Chem., 255:10431
Midoux, 1993, Nucleic Acids Research, 21:871-878
Pastore, 1994, Circulation, 90:1-517
Feldman and Steg, 1996, Medecine/Sciences, synthese, 12:47-55
Ohno et al., 1994, Science, 265:781-784
Beard et al., 1980, Virology, 75:81
Levrero et al., 1991, Gene, 101:195
Graham, 1984, EMBO J., 3:2917
Adra et al., 1987, Gene, 60:65-74
Kort et al., 1983, Nucleic Acids Research, 11:8287-8301
Rettlez and Basenga, 1987, Mol. Cell. Biol., 7:1676-1685
Parmley and Smith, Gene, 1988, 73:305-318
Cohen-Salmon et al., 1995, Gene, 164:235-242
Chee et al., 1996, Science, 274:610-614
Kadonaga et al., 1987, Cell, 51:1079-1090.
Kageyama et al., 1989, Cell, 59:815-825.
Xu et al., 1997, Development, 124:219-231.