WO2002066501A2 - PROTEIN-PROTEIN INTERACTIONS IN HELICOBACTER PYLORI - Google Patents

PROTEIN-PROTEIN INTERACTIONS IN HELICOBACTER PYLORI

Info

Publication number
WO2002066501A2
Authority
WO
WIPO (PCT)
Prior art keywords
protein
polypeptide
sid
cells
helicobacter pylori
Prior art date
Application number
PCT/EP2001/015428
Other languages
French (fr)
Other versions
WO2002066501A3 (en)
Inventor
Pierre Legrain
Jean-Christophe Rain
Frédéric COLLAND
Hilde De Reuse
Agnès Labigne
Original Assignee
Hybrigenics
Institut Pasteur
Priority date
Filing date
Publication date
Application filed by Hybrigenics, Institut Pasteur filed Critical Hybrigenics
Publication of WO2002066501A2 publication Critical patent/WO2002066501A2/en
Publication of WO2002066501A3 publication Critical patent/WO2002066501A3/en

Classifications

    • C: CHEMISTRY; METALLURGY
    • C07: ORGANIC CHEMISTRY
    • C07K: PEPTIDES
    • C07K14/00: Peptides having more than 20 amino acids; Gastrins; Somatostatins; Melanotropins; Derivatives thereof
    • C07K14/195: Peptides having more than 20 amino acids; Gastrins; Somatostatins; Melanotropins; Derivatives thereof from bacteria
    • C07K14/205: Peptides having more than 20 amino acids; Gastrins; Somatostatins; Melanotropins; Derivatives thereof from bacteria from Campylobacter (G)
    • A: HUMAN NECESSITIES
    • A61: MEDICAL OR VETERINARY SCIENCE; HYGIENE
    • A61K: PREPARATIONS FOR MEDICAL, DENTAL OR TOILETRY PURPOSES
    • A61K39/00: Medicinal preparations containing antigens or antibodies
    • A61K2039/51: Medicinal preparations containing antigens or antibodies comprising whole cells, viruses or DNA/RNA
    • A61K2039/515: Animal cells
    • A61K2039/5156: Animal cells expressing foreign proteins

Definitions

  • the present invention relates to proteins that interact with Helicobacter pylori. More specifically, the present invention relates to complexes of polypeptides or polynucleotides encoding the polypeptides, fragments of the polypeptides, antibodies to the complexes, Selected Interacting Domains (SID®) which are identified due to the protein-protein interactions, methods for screening drugs for agents which modulate the interaction of proteins and pharmaceutical compositions that are capable of modulating the protein-protein interactions.
  • SID® Selected Interacting Domains
  • the present invention provides a protein-protein interaction map called a PIM® which is available in a report relating to the protein-protein interactions of Helicobacter pylori.
  • Protein-protein interactions enable two or more proteins to associate. A large number of non-covalent bonds form between the proteins when two protein surfaces are precisely matched. These bonds account for the specificity of recognition.
  • protein-protein interactions are involved, for example, in the assembly of enzyme subunits, in antibody-antigen recognition, in the formation of biochemical complexes, in the correct folding of proteins, in the metabolism of proteins, in the transport of proteins, in the localization of proteins, in protein turnover, in post-translational modifications, in the core structures of viruses and in signal transduction.
  • the earliest and simplest two-hybrid system, which served as the basis for the development of other versions, is an in vivo assay between two specifically constructed proteins.
  • the first protein, known in the art as the "bait protein", is a chimeric protein which binds to a site on DNA upstream of a reporter gene by means of a DNA-binding domain or BD.
  • the binding domain is the DNA-binding domain from either Gal4 or native E. coli LexA and the sites placed upstream of the reporter are Gal4 binding sites or LexA operators, respectively.
  • the second protein is also a chimeric protein known as the "prey" in the art.
  • This second chimeric protein carries an activation domain or AD.
  • This activation domain is typically derived from Gal4, from VP16 or from B42.
  • Another advantage of the two-hybrid plus one system is that it allows or prevents the formation of the transcriptional activator since the third partner can be expressed from a conditional promoter such as the methionine-repressed Met25 promoter which is positively regulated in medium lacking methionine.
  • the presence of the methionine-regulated promoter provides an excellent control to evaluate the activation or inhibition properties of the third partner due to its "on" and "off" switch for the formation of the transcriptional activator.
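The "on"/"off" behaviour of the conditionally expressed third partner can be illustrated with a small truth-table model. This is only a sketch: the function names and the two role labels ("stabilizer", "inhibitor") are hypothetical and chosen for illustration, and the model assumes that the Met25 promoter is fully off when methionine is present and fully on when it is absent.

```python
def third_partner_expressed(methionine_in_medium: bool) -> bool:
    """Met25 promoter (assumed ideal switch): repressed by methionine, active without it."""
    return not methionine_in_medium


def reporter_active(third_partner_role: str, methionine_in_medium: bool) -> bool:
    """Toy model of whether the transcriptional activator forms.

    third_partner_role (hypothetical labels):
      "stabilizer": bait and prey form the activator only when the third partner is present.
      "inhibitor":  bait and prey form the activator on their own; the third partner prevents it.
    """
    expressed = third_partner_expressed(methionine_in_medium)
    if third_partner_role == "stabilizer":
        return expressed
    if third_partner_role == "inhibitor":
        return not expressed
    raise ValueError(f"unknown role: {third_partner_role!r}")


if __name__ == "__main__":
    for role in ("stabilizer", "inhibitor"):
        for met in (True, False):
            medium = "+Met" if met else "-Met"
            print(role, medium, "reporter on" if reporter_active(role, met) else "reporter off")
```

Comparing growth on medium with and without methionine then distinguishes a third partner that permits activator formation from one that prevents it.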
  • the three-hybrid method is described, for example, in Tirode et al., The Journal of Biological Chemistry, 272, No. 37, pp. 22995-22999 (1997), incorporated herein by reference.
  • The method described in WO 99/42612 permits the screening of more prey polynucleotides with a given bait polynucleotide in a single step than the prior art systems, owing to the cell-to-cell mating strategy between haploid yeast cells. Furthermore, this method is more thorough, reproducible and sensitive. Thus, false negatives and/or false positives are minimal compared to the conventional prior art methods.
  • Helicobacter pylori is a microaerophilic, Gram-negative, slow-growing, spiral-shaped and flagellated organism. H. pylori was first isolated in 1983 from a gastric biopsy specimen of a patient with chronic gastritis (Marshall et al., 1984, Lancet, 1:1311-1314, Unidentified curved bacilli in the stomach of patients with gastritis and peptic ulceration).
  • H. pylori has been identified as a primary cause of chronic gastroduodenal disorders, such as gastritis, dyspepsia, and peptic ulcers, in humans. Studies have shown (Labigne et al.) that H. pylori can be successfully eradicated by a treatment combining two antibiotics with a proton pump inhibitor. However, few antibiotics are active against H. pylori and antibiotic-resistant strains have begun to appear.
  • the H. pylori strain n° 26695 genome has been studied by Tomb et al.
  • This strain's genome consists of a circular chromosome with a size of 1,667,867 bp, average G + C content of 39%, and 1590 predicted coding sequences (open reading frames or "ORF").
  • the multisubunit urease is a characteristic enzyme that is crucial for survival in acidic pH and for successful colonization of the gastric environment, a site that few other microbes can colonize (Labigne et al., WO 93/07273, Helicobacter pylori genes necessary for the regulation and maturation of urease, and use thereof).
  • Genes encoding ureases have been located on a 34 kb chromosome fragment and comprise ureA, ureB, ureC, ureD, ureE, ureF, ureG, ureH and ureI.
  • flagellar filament biosynthesis involves the A and B flagellins and the filament cap.
  • VacA is an H. pylori toxin that induces the formation of large acidic vacuoles in host epithelial cells. These large vacuoles originate from massive swelling of membranous compartments of late stages of the endocytic pathway (de Bernard et al., 1997, Microbiology, 26(4), 665-674, Helicobacter pylori toxin VacA induces vacuole formation by acting in the cell cytosol). Evidence for a receptor-mediated interaction with VacA was provided by Pagliaccia et al.; the m2 allele of the vacA gene has always been described as inactive in the in vitro HeLa cell assay; however, the m2 allele is associated with peptic ulcer and is prevalent in populations in which peptic ulcer and gastric cancer have high incidence (Pagliaccia et al., Proc. Natl. Acad. Sci. U.S.A, 1998, 95(17), 10212-10217, The m2 form of the
  • CagA is one of the proteins encoded by the "cag pathogenicity island".
  • CagA is produced by 50-60% of H. pylori strains; it is a high molecular weight (120-140 kDa) superficial protein and an immunodominant antigen with unknown function.
  • H. pylori strains that produce CagA protein have two genes, cagB and cagC (36 and 101 kDa proteins, respectively). These genes are highly associated with duodenal ulcers (Blaser et al. 1996, WO 96/12825, cagB and cagC genes of Helicobacter pylori and related methods and compositions).
  • virulence factors include several gastric tissue-specific adhesins (Boren et al., 1993, Science, 262, 1892-1895).
  • Therapeutic agents are currently available that eradicate H. pylori infections in vitro. However, methods employing antibiotic agents result in the emergence of bacterial strains which are resistant to these agents.
  • SID® polypeptides: it is still another object of the present invention to identify selected interacting domains of the polypeptides.
  • SID® polynucleotides: it is still another object of the present invention to identify selected interacting domains of the polynucleotides.
  • PGS Putative Essential Genes
  • the present invention relates to the identification of ORFs (open reading frames) having enzymatic activity, which provides a direct way to screen lead compounds that abolish enzymatic activity through the disruption of the oligomeric interaction.
  • the present invention relates to a protein complex of polypeptides as described in Table 1.
  • the present invention provides SID® polynucleotides and SID® polypeptides as defined in Figure 2, as well as a PIM® for Helicobacter pylori.
  • the present invention also provides antibodies to the protein-protein complexes for Helicobacter pylori.
  • the present invention provides a method for screening drugs for agents that modulate the protein-protein interactions and pharmaceutical compositions that are capable of modulating protein-protein interactions.
  • the present invention provides protein chips or protein microarrays.
  • the present invention identifies a superbinder phenotype in H. pylori with the two-hybrid system which completely inhibits protein-protein interactions.
  • the present invention provides oligopeptides, their overlapping or combining derivatives thereof that inhibit H. pylori growth.
  • the present invention identifies ORFs having enzymatic activity which provides a direct way to screen lead compounds.
  • the present invention provides a report in, for example, paper, electronic and/or digital forms.
  • Fig. 1 is a schematic representation of the pB1 plasmid.
  • Fig. 2 is a schematic representation of the pB5 plasmid.
  • Fig. 3 is a schematic representation of the pB6 plasmid.
  • Fig. 4 is a schematic representation of the pB13 plasmid.
  • Fig. 5 is a schematic representation of the pB14 plasmid.
  • Fig. 6 is a schematic representation of the pB20 plasmid.
  • Fig. 7 is a schematic representation of the pP1 plasmid.
  • Fig. 8 is a schematic representation of the pP2 plasmid.
  • Fig. 9 is a schematic representation of the pP3 plasmid.
  • Fig. 10 is a schematic representation of the pP6 plasmid.
  • Fig. 11 is a schematic representation of the pP7 plasmid.
  • Fig. 12 is a schematic representation of vectors expressing the T25 fragment.
  • Fig. 13 is a schematic representation of vectors expressing the T18 fragment.
  • Fig. 14 is a schematic representation of various vectors of pCmAHLI, pT25 and pT18.
  • Fig. 15 is a schematic representation identifying the SID®s of
  • the "Full-length prey protein" is the Open Reading Frame (ORF) or coding sequence (CDS) in which the identified prey polypeptides are included.
  • the Selected Interaction Domain (SID®) is determined by the commonly shared polypeptide domain of every selected prey fragment.
  • Fig. 16 is a protein map (PIM®).
  • Fig. 17 is a gel illustrating the results obtained for the disruption of the ORFs hp0099 to hp0198. This figure shows, first, that multiple insertions of the transposon took place and, second, that in the majority of cases the transposon insertion occurred at a distance of 100 to 600 bp from the 5'-end of the ORF, a distance compatible with the promotion of gene replacement by allelic recombination.
  • Fig. 18 is a schematic diagram of the procedure for classification of the genes as described in the present invention.
  • Fig. 19 shows the results of three-hybrid experiments. Growth phenotypes of diploid strains containing various plasmids were analyzed by incubating cells at various dilutions (from 1 to 10⁻⁴). Yeast growth was performed over 2 days at 30°C on DO-3+Met or DO-3-Met medium. Lane 1: cells containing [p3H1-HP1230]+pP6-HP1529; lane 2: [p3H1-HP1230-SID1529WT]+pP6-HP1529; lane 3: [pH1-HP1230-SID1529(N38D-
  • Fig. 20 is the pP7-centro vector.
  • As used herein, the terms “polynucleotides”, “nucleic acids” and “oligonucleotides” are used interchangeably and include, but are not limited to, RNA, DNA and RNA/DNA sequences of more than one nucleotide in either single chain or duplex form.
  • the polynucleotide sequences of the present invention may be prepared from any known method including, but not limited to, any synthetic method, any recombinant method, any ex vivo generation method and the like, as well as combinations thereof.
  • polypeptide means herein a polymer of amino acids having no specific length.
  • peptides, oligopeptides and proteins are included in the definition of “polypeptide” and these terms are used interchangeably throughout the specification, as well as in the claims.
  • polypeptide does not exclude posttranslational modifications such as polypeptides having covalent attachment of glycosyl groups, acetyl groups, phosphate groups, lipid groups and the like. Also encompassed by this definition of "polypeptide" are homologs thereof.
  • homologs are structurally similar genes contained within a given species.
  • orthologs are functionally equivalent genes from a given species or strain, as determined for example, in a standard complementation assay.
  • a polypeptide of interest can be used not only as a model for identifying similar genes in given strains, but also to identify homologs and orthologs of the polypeptide of interest in other species.
  • the orthologs for example, can also be identified in a conventional complementation assay.
  • orthologs can be expected to exist in bacteria (or other kinds of cells) in the same branch of the phylogenetic tree, as set forth, for example, at ftp://ft.cme.msu.edu/pub/rdp/SSU-rRNA/SSU/Prok.phylo.
  • prey polynucleotide means a chimeric polynucleotide encoding a polypeptide comprising (i) a specific domain; and (ii) a polypeptide that is to be tested for interaction with a bait polypeptide.
  • the specific domain is preferably a transcriptional activating domain.
  • a "bait polynucleotide” is a chimeric polynucleotide encoding a chimeric polypeptide comprising (i) a complementary domain; and (ii) a polypeptide that is to be tested for interaction with at least one prey polypeptide.
  • the complementary domain is preferably a DNA-binding domain that recognizes a binding site that is further detected and is contained in the host organism.
  • By "complementary domain" is meant a functional constitution of the activity when bait and prey are interacting; for example, enzymatic activity.
  • By "specific domain" is meant a functional interacting activation domain that may work through different mechanisms by interacting directly or indirectly through intermediary proteins with RNA polymerase II- or III-associated proteins in the vicinity of the transcription start site.
  • complementary means that, for example, each base of a first polynucleotide is paired with the complementary base of a second polynucleotide whose orientation is reversed.
  • the complementary bases are A and T (or A and U) or C and G.
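The pairing rule above (A with T or U, C with G, second strand read in reverse orientation) can be checked mechanically. The following is a minimal sketch; the function name `is_complementary` is hypothetical and only unambiguous bases are handled.

```python
# Allowed base pairs: A-T (DNA), A-U (RNA) and C-G, in both directions.
PAIRS = {("A", "T"), ("T", "A"), ("A", "U"), ("U", "A"), ("C", "G"), ("G", "C")}


def is_complementary(first: str, second: str) -> bool:
    """True if each base of `first` pairs with the base of `second` read in reverse orientation."""
    first, second = first.upper(), second.upper()
    if len(first) != len(second):
        return False
    return all((a, b) in PAIRS for a, b in zip(first, reversed(second)))


if __name__ == "__main__":
    print(is_complementary("ATGC", "GCAT"))  # second strand is the reverse complement
    print(is_complementary("ATGC", "ATGC"))  # same strand, not complementary
```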
  • sequence identity refers to the identity between two peptides or between two nucleic acids. Identity between sequences can be determined by comparing a position in each of the sequences which may be aligned for the purposes of comparison. When a position in the compared sequences is occupied by the same base or amino acid, then the sequences are identical at that position. A degree of sequence identity between nucleic acid sequences is a function of the number of identical nucleotides at positions shared by these sequences. A degree of identity between amino acid sequences is a function of the number of identical amino acids at positions shared by these sequences.
  • two polynucleotides may each (i) comprise a sequence (i.e., a portion of a complete polynucleotide sequence) that is similar between the two polynucleotides, and (ii) may further comprise a sequence that is divergent between the two polynucleotides
  • sequence identity comparisons between two or more polynucleotides over a "comparison window" refers to the conceptual segment of at least 20 contiguous nucleotide positions wherein a polynucleotide sequence may be compared to a reference nucleotide sequence of at least 20 contiguous nucleotides and wherein the portion of the polynucleotide sequence in the comparison window may comprise additions or deletions (i.e., gaps) of 20 percent or less compared to the reference sequence (which does not comprise additions or deletions) for optimal alignment of the two sequences.
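One possible reading of the comparison-window constraints (at least 20 contiguous reference positions, additions or deletions of 20 percent or less relative to the reference) is sketched below. The function name, the use of "-" as gap symbol, and the choice to measure indels against the ungapped reference length are assumptions made for illustration.

```python
def comparison_window_ok(aligned_query: str, aligned_ref: str,
                         min_ref_len: int = 20,
                         max_indel_fraction: float = 0.20,
                         gap: str = "-") -> bool:
    """Check an aligned window against the length and indel limits.

    A gap in the query marks a deletion, a gap in the reference marks an
    addition; both are counted against the ungapped reference length.
    """
    if len(aligned_query) != len(aligned_ref):
        raise ValueError("aligned sequences must have the same length")
    ref_len = sum(1 for c in aligned_ref if c != gap)
    if ref_len < min_ref_len:
        return False
    indels = aligned_query.count(gap) + aligned_ref.count(gap)
    return indels / ref_len <= max_indel_fraction


if __name__ == "__main__":
    reference = "ACGT" * 5                     # 20 reference positions
    query = "ACGT" * 4 + "AC--"                # 2 deleted positions (10 %)
    print(comparison_window_ok(query, reference))
```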
  • the sequences are aligned for optimal comparison. For example, gaps can be introduced in the sequence of a first amino acid sequence or a first nucleic acid sequence for optimal alignment with the second amino acid sequence or second nucleic acid sequence.
  • the amino acid residues or nucleotides at corresponding amino acid positions or nucleotide positions are then compared. When a position in the first sequence is occupied by the same amino acid residue or nucleotide as the corresponding position in the second sequence, the molecules are identical at that position.
  • sequences can be the same length or may be different in length.
  • Optimal alignment of sequences for determining a comparison window may be conducted by the local homology algorithm of Smith and Waterman (J. Theor. Biol., 91(2) pgs. 370-380 (1981)), by the homology alignment algorithm of Needleman and Wunsch (J. Mol. Biol., 48(3) pgs. 443-453 (1970)), by the search for similarity via the method of Pearson and Lipman (PNAS, USA, 85(5) pgs. 2444-2448 (1988)), by computerized implementations of these algorithms (GAP, BESTFIT, FASTA and TFASTA in the Wisconsin Genetics Software Package Release 7.0, Genetics Computer Group, 575 Science Drive, Madison, Wisconsin) or by inspection.
  • the best alignment i.e., resulting in the highest percentage of identity over the comparison window generated by the various methods is selected.
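As an illustration of the global alignment approach of Needleman and Wunsch cited above, the sketch below computes only the optimal alignment score by dynamic programming. The scoring values (match +1, mismatch -1, linear gap penalty -1) are assumptions chosen for the example and are not taken from the specification or from the GAP program.

```python
def needleman_wunsch_score(a: str, b: str,
                           match: int = 1, mismatch: int = -1, gap: int = -1) -> int:
    """Optimal global alignment score of a and b under a linear gap penalty."""
    # prev[j] holds the best score for aligning a[:i-1] with b[:j].
    prev = [j * gap for j in range(len(b) + 1)]
    for i in range(1, len(a) + 1):
        curr = [i * gap]  # aligning a[:i] against an empty prefix of b
        for j in range(1, len(b) + 1):
            diagonal = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            up = prev[j] + gap        # gap in b
            left = curr[j - 1] + gap  # gap in a
            curr.append(max(diagonal, up, left))
        prev = curr
    return prev[-1]


if __name__ == "__main__":
    print(needleman_wunsch_score("ACGT", "AGT"))
```

Recovering the alignment itself would additionally require keeping the full score matrix and tracing back from the last cell.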
  • sequence identity means that two polynucleotide sequences are identical (i.e., on a nucleotide by nucleotide basis) over the window of comparison.
  • percentage of sequence identity is calculated by comparing two optimally aligned sequences over the window of comparison, determining the number of positions at which the identical nucleic acid base (e.g., A, T, C, G, U, or I) occurs in both sequences to yield the number of matched positions, dividing the number of matched positions by the total number of positions in the window of comparison (i.e., the window size) and multiplying the result by 100 to yield the percentage of sequence identity.
  • the same process can be applied to polypeptide sequences.
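The percentage calculation described above (matched positions divided by the window size, multiplied by 100) can be written as a short function. This is a sketch that assumes the two sequences are already optimally aligned to equal length, with "-" as a hypothetical gap symbol; it works for nucleotide and amino acid sequences alike.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percentage of window positions occupied by the same residue in both sequences."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    window_size = len(aligned_a)
    if window_size == 0:
        raise ValueError("comparison window is empty")
    # A position counts as matched only if both sequences carry the same non-gap residue.
    matched = sum(1 for x, y in zip(aligned_a.upper(), aligned_b.upper())
                  if x == y and x != gap)
    return 100.0 * matched / window_size


if __name__ == "__main__":
    print(percent_identity("ACGTACGTAC", "ACGTTCGTAC"))
```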
  • the percentage of sequence identity of a nucleic acid sequence or an amino acid sequence can also be calculated using BLAST software (Version 2.06 of September 1998) with the default or user-defined parameters.
  • sequence similarity means that amino acids can be modified while retaining the same function. It is known that amino acids are classified according to the nature of their side groups and some amino acids such as the basic amino acids can be interchanged for one another while their basic function is maintained.
  • isolated means that a biological material such as a nucleic acid or protein has been removed from its original environment in which it is naturally present.
  • a polynucleotide present in a plant, mammal or animal is present in its natural state and is not considered to be isolated.
  • the same polynucleotide separated from the adjacent nucleic acid sequences in which it is naturally inserted in the genome of the plant or animal is considered as being “isolated.”
  • isolated is not meant to exclude artificial or synthetic mixtures with other compounds, or the presence of impurities which do not interfere with the biological activity and which may be present, for example, due to incomplete purification, addition of stabilizers or mixtures with pharmaceutically acceptable excipients and the like.
  • isolated polypeptide or isolated protein as used herein means a polypeptide or protein which is substantially free of those compounds that are normally associated with the polypeptide or protein in its natural state, such as other proteins or polypeptides, nucleic acids, carbohydrates, lipids and the like.
  • purified means at least one order of magnitude of purification is achieved, preferably two or three orders of magnitude, most preferably four or five orders of magnitude of purification of the starting material or of the natural material. Thus, the term “purified” as utilized herein does not mean that the material is 100% purified and thus excludes any other material.
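The "orders of magnitude" criterion can be made concrete with a small calculation. The sketch below assumes, for illustration only, that purification is measured as the fold increase in the fraction of target material relative to the starting material; the function names are hypothetical.

```python
import math


def fold_purification(initial_fraction: float, final_fraction: float) -> float:
    """Fold enrichment of the target (fraction = target mass / total mass)."""
    return final_fraction / initial_fraction


def orders_of_magnitude(fold: float) -> float:
    """Number of orders of magnitude corresponding to a fold enrichment."""
    return math.log10(fold)


def is_purified(initial_fraction: float, final_fraction: float,
                min_orders: float = 1.0) -> bool:
    """True if at least `min_orders` orders of magnitude of purification were achieved."""
    fold = fold_purification(initial_fraction, final_fraction)
    return orders_of_magnitude(fold) >= min_orders


if __name__ == "__main__":
    # A target enriched from 0.1 % to 50 % of the material is a 500-fold purification.
    print(is_purified(0.001, 0.5))
```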
  • variants, when referring to, for example, polynucleotides encoding a polypeptide variant of a given reference polypeptide, are polynucleotides that differ from the polynucleotide encoding the reference polypeptide but generally maintain the functional characteristics of the reference polypeptide.
  • a variant of a polynucleotide may be a naturally occurring allelic variant or it may be a variant that is not known to occur naturally.
  • Such non-naturally occurring variants of the reference polynucleotide can be made by, for example, mutagenesis techniques, including those mutagenesis techniques that are applied to polynucleotides, cells or organisms.
  • Variants of polynucleotides according to the present invention include, but are not limited to, nucleotide sequences which are at least 95% identical after lo
  • Nucleotide changes present in a variant polynucleotide may be silent, which means that these changes do not alter the amino acid sequences encoded by the reference polynucleotide.
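Whether a nucleotide change is silent can be tested by translating the reference and variant coding sequences and comparing the encoded amino acids. The sketch below uses the standard genetic code, built from the usual TCAG-ordered amino acid string; the function names are hypothetical and the sequences in the usage example are made up for illustration.

```python
BASES = "TCAG"
# Standard genetic code, codons enumerated with bases in TCAG order ("*" = stop).
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    b1 + b2 + b3: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, b1 in enumerate(BASES)
    for j, b2 in enumerate(BASES)
    for k, b3 in enumerate(BASES)
}


def translate(cds: str) -> str:
    """Translate a coding sequence codon by codon, ignoring any trailing partial codon."""
    cds = cds.upper().replace("U", "T")
    usable = len(cds) - len(cds) % 3
    return "".join(CODON_TABLE[cds[i:i + 3]] for i in range(0, usable, 3))


def is_silent(reference_cds: str, variant_cds: str) -> bool:
    """True if the variant encodes the same amino acid sequence as the reference."""
    return (len(reference_cds) == len(variant_cds)
            and translate(reference_cds) == translate(variant_cds))


if __name__ == "__main__":
    print(is_silent("ATGGCT", "ATGGCC"))  # GCT and GCC both encode alanine
    print(is_silent("ATGGCT", "ATGGAT"))  # GAT encodes aspartate
```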
  • Substitutions, additions and/or deletions can involve one or more nucleic acids. Alterations can produce conservative or non-conservative amino acid substitutions, deletions and/or additions.
  • Variants of a prey or a SID® polypeptide encoded by a variant polynucleotide can possess a higher affinity of binding and/or a higher specificity of binding to its protein or polypeptide counterpart, against which it has been initially selected.
  • variants can also lose their ability to bind to their protein or polypeptide counterpart.
  • By "anabolic pathway" is meant a reaction or series of reactions in a metabolic pathway that synthesize complex molecules from simpler ones, usually requiring the input of energy.
  • An anabolic pathway is the opposite of a catabolic pathway.
  • catabolic pathway is a series of reactions in a metabolic pathway that break down complex compounds into simpler ones, usually releasing energy in the process.
  • a catabolic pathway is the opposite of an anabolic pathway.
  • By "drug metabolism" is meant the study of how drugs are processed and broken down by the body. Drug metabolism can involve the study of enzymes that break down drugs, the study of how different drugs interact within the body and how diet and other ingested compounds affect the way the body processes drugs.
  • metabolism means the sum of all of the enzyme-catalyzed reactions in living cells that transform organic molecules.
  • By "secondary metabolism" is meant pathways producing specialized metabolic products that are not found in every cell.
  • SID® means a Selected Interacting Domain and is identified as follows: for each bait polypeptide screened, selected prey polypeptides are compared. Overlapping fragments in the same ORF or CDS define the selected interacting domain.
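The overlap rule for defining a SID® can be sketched as an interval intersection: the domain is the region shared by every selected prey fragment of the same ORF. The representation of fragments as 1-based inclusive (start, end) coordinates and the function name are assumptions made for this example; the coordinates shown are invented.

```python
from typing import List, Optional, Tuple

Interval = Tuple[int, int]  # (start, end), 1-based inclusive positions on one ORF


def selected_interacting_domain(fragments: List[Interval]) -> Optional[Interval]:
    """Region common to all prey fragments, or None if they share no position."""
    if not fragments:
        return None
    start = max(s for s, _ in fragments)  # latest fragment start
    end = min(e for _, e in fragments)    # earliest fragment end
    return (start, end) if start <= end else None


if __name__ == "__main__":
    prey_fragments = [(10, 120), (40, 150), (30, 110)]  # hypothetical coordinates
    print(selected_interacting_domain(prey_fragments))
```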
  • PIM® means a protein-protein interaction map. This map is obtained from data acquired from a number of separate screens using different bait polypeptides and is designed to map out all of the interactions between the polypeptides.
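A protein-protein interaction map assembled from several screens can be represented as a graph whose nodes are ORFs and whose edges are observed bait-prey interactions. The sketch below is one simple way to build such a structure; the undirected representation, the function name and the placeholder names ORF_A and ORF_B are assumptions, and the listed pairs are illustrative rather than data from the tables.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Set, Tuple


def build_pim(screens: Iterable[Tuple[str, List[str]]]) -> Dict[str, Set[str]]:
    """Merge (bait, [preys]) screen results into an undirected interaction map."""
    pim: Dict[str, Set[str]] = defaultdict(set)
    for bait, preys in screens:
        for prey in preys:
            pim[bait].add(prey)
            pim[prey].add(bait)
    return dict(pim)


if __name__ == "__main__":
    screens = [
        ("HP1230", ["HP1529"]),          # illustrative pair
        ("ORF_A", ["HP1529", "ORF_B"]),  # hypothetical placeholders
    ]
    pim = build_pim(screens)
    for orf in sorted(pim):
        print(orf, "->", sorted(pim[orf]))
```

Keeping bait-to-prey direction instead would preserve which protein was used as bait in each screen, at the cost of a slightly more involved lookup.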
  • affinity of binding can be defined as the affinity constant Ka of a given SID® polypeptide of the present invention that binds to a polypeptide, given by the following mathematical relationship:
  $$K_a = \frac{[\text{SID® polypeptide complex}]}{[\text{free SID®}]\,[\text{free polypeptide}]}$$
  • where [free SID®], [free polypeptide] and [SID® polypeptide complex] are the concentrations at equilibrium, respectively, of the free SID® polypeptide, of the free polypeptide onto which the SID® polypeptide binds and of the complex formed between the SID® polypeptide and the polypeptide onto which said SID® polypeptide specifically binds.
  • the affinity of a SID® polypeptide of the present invention or a variant thereof for its polypeptide counterpart can be assessed, for example, on a Biacore™ apparatus marketed by Amersham Pharmacia Biotech Company, such as described by Szabo et al., Curr. Opin. Struct. Biol. 5, pgs. 699-705 (1995) and by
  • the phrase "at least the same affinity" with respect to the binding affinity between a SID® polypeptide of the present invention to another polypeptide means that the Ka is identical or can be at least two-fold, at least three fold or at least five fold greater than the Ka value of reference.
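The affinity comparison can be sketched numerically using the standard association-constant form, Ka = [complex] / ([free SID®] × [free polypeptide]), with all concentrations taken at equilibrium in mol/L. The function names and the example concentrations below are hypothetical.

```python
def affinity_constant(free_sid: float, free_polypeptide: float,
                      complex_conc: float) -> float:
    """Ka in L/mol from equilibrium concentrations in mol/L."""
    if free_sid <= 0 or free_polypeptide <= 0:
        raise ValueError("free concentrations must be positive")
    return complex_conc / (free_sid * free_polypeptide)


def has_at_least_same_affinity(ka: float, ka_reference: float) -> bool:
    """True if Ka is identical to or greater than the reference Ka."""
    return ka >= ka_reference


def fold_over_reference(ka: float, ka_reference: float) -> float:
    """How many times greater Ka is than the reference value."""
    return ka / ka_reference


if __name__ == "__main__":
    ka_ref = affinity_constant(1e-6, 1e-6, 1e-6)  # hypothetical reference measurement
    ka_var = affinity_constant(1e-6, 1e-6, 5e-6)  # hypothetical variant measurement
    print(has_at_least_same_affinity(ka_var, ka_ref), fold_over_reference(ka_var, ka_ref))
```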
  • modulating compound means a compound that inhibits or stimulates or can act on another protein which can inhibit or stimulate the protein-protein interaction of a complex of two polypeptides or the protein- protein interaction of two polypeptides.
  • the present invention comprises complexes of polypeptides or polynucleotides encoding the polypeptides composed of a bait polypeptide, or a bait polynucleotide encoding a bait polypeptide, and a prey polypeptide or a prey polynucleotide encoding a prey polypeptide.
  • the prey polypeptide or prey polynucleotide encoding the prey polypeptide is capable of interacting with a bait polypeptide of interest in various hybrid systems.
  • the present invention is not limited to the type of method utilized to detect protein-protein interactions and therefore any method known in the art and variants thereof can be used. It is however better to use the method described in
  • Protein-protein interactions can also be detected using complementation assays such as those described by Pelletier et al. at http://www.abrf.org/JBT/Articles/JBT0012/jbt0012.html, WO 00/07038 and WO 98/34120.
  • the present invention is not limited to detecting protein-protein interactions using yeast, but also includes similar methods that can be used in detecting protein-protein interactions in, for example, mammalian systems as described, for example, in Takacs et al., Proc. Nat. Acad. Sci., USA, 90(21): 10375 (1993) and Vasavada et al., Proc. Nat. Acad.
  • Suitable cells include, but are not limited to, VERO cells,
  • HELA cells such as ATCC No. CCL2
  • CHO cell lines such as ATCC No. CCL61
  • COS cells such as COS-7 cells and ATCC No. CRL 1650 cells, WI38, BHK, HepG2, 3T3 such as ATCC No. CRL6361, A549, PC12, K562 cells, 293 cells, Sf9 cells such as ATCC No. CRL1711 and Cv1 cells such as ATCC No. CCL70.
  • suitable cells include, but are not limited to, prokaryotic host cell strains such as Escherichia coli (e.g., strain DH5-α), Bacillus subtilis, Salmonella typhimurium, or strains of the genera Pseudomonas, Streptomyces and Staphylococcus.
  • yeast cells such as those of Saccharomyces such as Saccharomyces cerevisiae.
  • the bait polynucleotide, as well as the prey polynucleotide, can be prepared according to the methods known in the art such as those described above in the publications and patents reciting the known method per se.
  • the bait polynucleotide of the present invention is obtained from genomic DNA of Helicobacter pylori.
  • the prey polynucleotide is obtained from genomic DNA of Helicobacter pylori, variants of genomic DNA of Helicobacter pylori, and fragments from the genome or transcriptome of Helicobacter pylori ranging from about 20 to 5000.
  • the prey polynucleotide is then selected, sequenced and identified.
  • a genomic DNA prey library is prepared from the Helicobacter pylori and constructed in the specially designed prey vector pP6 as shown in Figure 10 after ligation of suitable linkers such that every genomic DNA insert is fused to a nucleotide sequence in the vector that encodes the transcription activation domain of a reporter gene.
  • Any transcription activation domain can be used in the present invention. Examples include, but are not limited to, Gal4, VP16, B42, His and the like.
  • Toxic reporter genes such as CAT R, CYH2, CYH1, URA3, bacterial and fungal toxins and the like can be used in reverse two-hybrid systems.
  • the polypeptides encoded by the nucleotide inserts of the genomic DNA prey library thus prepared are termed "prey polypeptides" in the context of the presently described selection method of the prey polynucleotides.
  • the bait polynucleotide can be inserted in bait plasmid as illustrated in Figure 1.
  • the bait polynucleotide insert is fused to a polynucleotide encoding the binding domain of, for example, the Gal4 DNA binding domain and the shuttle expression vector is used to transform cells.
  • any cells can be utilized in transforming the bait and prey polynucleotides of the present invention including mammalian cells, bacterial cells, yeast cells, insect cells and the like.
  • the present invention identifies protein-protein interactions in yeast.
  • a prey positive clone is identified containing a vector which comprises a nucleic acid insert encoding a prey polypeptide which binds to a bait polypeptide of interest.
  • the method in which protein-protein interactions are identified comprises the following steps:
  • step ii) cultivating diploid cell clones obtained in step i) on a selective medium
  • This method may further comprise the step of:
  • Escherichia coli is used in a bacterial two-hybrid system, which encompasses a similar principle to that described above for yeast, but does not involve mating for characterizing the prey polynucleotide.
  • mammalian cells and a method similar to that described above for yeast for characterizing the prey polynucleotide are used.
  • the prey polynucleotide that has been selected by testing the library of preys in a screen using the two-hybrid, two-plus-one-hybrid methods and the like encodes the polypeptide interacting with the protein of interest.
  • the present invention is also directed, in a general aspect, to a complex of polypeptides, polynucleotides encoding the polypeptides composed of a bait polypeptide or bait polynucleotide encoding the bait polypeptide and a prey polypeptide or prey polynucleotide encoding the prey polypeptide capable of interacting with the bait polypeptide of interest.
  • complexes are identified in Table 1, as the bait amino acid sequences and the prey amino acid sequences, as well as the bait and prey nucleic acid sequences.
  • the present invention relates to a complex of polynucleotides consisting of a first polynucleotide, or a fragment thereof, encoding a prey polypeptide that interacts with a bait polypeptide and a second polynucleotide or a fragment thereof.
  • This fragment has at least 20 consecutive nucleotides, but can have between 20 and 5,000 consecutive nucleotides, or between 12 and 10,000 consecutive nucleotides or between 12 and 20,000 consecutive nucleotides.
  • polypeptides of column 3 encoded by the polynucleotides of column 2 in Tables 2 and 7 and the polypeptides of column 5 encoded by the polynucleotides of column 3 in Table 8 according to the present invention and the complexes of the two polypeptides encoded by the sets of two polynucleotides also form part of the present invention.
  • the present invention relates to an isolated complex of at least two polypeptides encoded by two polynucleotides wherein said two polypeptides are associated in the complex by affinity binding and are depicted in Table 1 and Table 8.
  • the present invention relates to an isolated complex comprising at least a polypeptide encoded by an ORF (HP####) of column 1 of Table 1 and a polypeptide encoded by an ORF (HP####) of column
  • the present invention is not limited to these polypeptide complexes alone but also includes isolated complexes of the two polypeptides formed by fragments and/or homologous polypeptides exhibiting at least 95% sequence identity, as well as from 96% sequence identity to 99.999% sequence identity.
  • Also encompassed in another embodiment of the present invention is an isolated complex in which the SID® polypeptides (see even SEQ ID Nos. 2 to 3256 in column 3 of Table 2, even SEQ ID Nos. 6590 to 6594 in Table 7 and even SEQ ID Nos. 6596 to 6644 in Table 8) of the prey polypeptides (encoded by uneven SEQ ID Nos. 1 to 3255 in column 2 of Table 2, uneven SEQ ID Nos. 6589 to 6593 in Table 7 and uneven SEQ ID Nos. 6595 to 6643 in Table 8) form the isolated complex.
  • nucleic acids coding for a Selected Interacting Domain (SID®) polypeptide or a variant thereof or any of the nucleic acids set forth in Tables 2, 7 and 8 can be inserted into an expression vector which contains the necessary elements for the transcription and translation of the inserted protein-coding sequence.
  • transcription elements include a regulatory region and a promoter.
  • the nucleic acid which may encode a marker compound of the present invention is operably linked to a promoter in the expression vector.
  • the expression vector may also include a replication origin.
  • Suitable expression vectors include, for example, segments of chromosomal, non-chromosomal and synthetic DNA sequences.
  • Suitable vectors include, but are not limited to, derivatives of SV40 and pcDNA and known bacterial plasmids such as ColE1, pCR1, pBR322, pMal-C2, pET, pGEX as described by Smith et al (1988), pMB9 and derivatives thereof, plasmids such as RP4, phage DNAs such as the numerous derivatives of phage λ such as NM989, as well as other phage DNA such as M13 and filamentous single stranded phage DNA; yeast plasmids such as the 2 micron plasmid or derivatives of the 2 micron plasmid, as well as centromeric and integrative yeast shuttle vectors; vectors useful in eukaryotic cells such as
  • both non-fusion transfer vectors such as, but not limited to, pVL941 (BamHI cloning site; Summers), pVL1393 (BamHI, SmaI, XbaI, EcoRI, NotI, XmaIII, BglII and PstI cloning sites; Invitrogen), pVL1392 (BglII, PstI, NotI, XmaIII, EcoRI, XbaI, SmaI and BamHI cloning sites; Summers and Invitrogen) and pBlueBacIII (BamHI, BglII, PstI, NcoI and HindIII cloning sites, with blue/white recombinant screening; Invitrogen), and fusion transfer vectors such as, but not limited to, pAc700 (BamHI and KpnI cloning sites, in which the BamHI recognition site begins with the initiation cod
  • Mammalian expression vectors contemplated for use in the invention include vectors with inducible promoters, such as the dihydrofolate reductase promoters, any expression vector with a DHFR expression cassette or a DHFR/methotrexate co-amplification vector such as pED (PstI, SalI, SbaI, SmaI and EcoRI cloning sites, with the vector expressing both the cloned gene and DHFR; Kaufman, 1991).
  • glutamine synthetase/methionine sulfoximine co-amplification vector such as pEE14 (HindIII, XbaI, SmaI, SbaI, EcoRI and BclI cloning sites in which the vector expresses glutamine synthetase and the cloned gene; Celltech).
  • a vector that directs episomal expression under the control of the Epstein Barr Virus (EBV) or nuclear antigen (EBNA) can be used such as pREP4 (BamHI, SfiI, XhoI, NotI, NheI, HindIII, NheI, PvuII and KpnI cloning sites, constitutive RSV-LTR promoter, hygromycin selectable marker; Invitrogen), pCEP4 (BamHI, SfiI, XhoI, NotI, NheI, HindIII, NheI, PvuII and KpnI cloning sites, constitutive hCMV immediate early gene promoter, hygromycin selectable marker; Invitrogen), pMEP4 (KpnI, PvuI, NheI, HindIII, NotI, XhoI, SfiI, BamHI cloning sites, inducible metallothionein IIa gene promoter, hygromycin selectable marker; Invitrogen), p
  • Selectable mammalian expression vectors for use in the invention include, but are not limited to, pRc/CMV (HindIII, BstXI, NotI, SbaI and ApaI cloning sites, G418 selection; Invitrogen), pRc/RSV (HindII, SpeI, BstXI, NotI, XbaI cloning sites, G418 selection; Invitrogen) and the like.
  • Vaccinia virus mammalian expression vectors (see, for example, Kaufman 1991) that can be used in the present invention include, but are not limited to, pSC11 (SmaI cloning site, TK- and β-gal selection), pMJ601 (SalI, SmaI, AflI, NarI, BspMII, BamHI, ApaI, NheI, SacII, KpnI and HindIII cloning sites; TK- and β-gal selection), pTKgptF1S (EcoRI, PstI, SalI, AccI, HindII, SbaI, BamHI and Hpa cloning sites, TK or XPRT selection) and the like.
  • Yeast expression systems that can also be used in the present invention include, but are not limited to, the non-fusion pYES2 vector (XbaI, SphI, ShoI, NotI, GstXI, EcoRI, BstXI, BamHI, SacI, KpnI and HindIII cloning sites; Invitrogen), the fusion pYESHisA, B, C (XbaI, SphI, ShoI, NotI, BstXI, EcoRI, BamHI, SacI, KpnI and HindIII cloning sites, N-terminal peptide purified with ProBond resin and cleaved with enterokinase; Invitrogen), pRS vectors and the like.
  • mammalian and typically human cells as well as bacterial, yeast, fungi, insect, nematode and plant cells can be used in the present invention and may be transfected by the nucleic acid or recombinant vector as defined herein.
  • suitable cells include, but are not limited to, VERO cells, HELA cells such as ATCC No. CCL2, CHO cell lines such as ATCC No. CCL61, COS cells such as COS-7 cells and ATCC No. CRL 1650 cells, W138, BHK, HepG2, 3T3 such as ATCC No. CRL6361, A549, PC12, K562 cells, 293 cells, Sf9 cells such as ATCC No. CRL1711 and Cv1 cells such as ATCC No. CCL70.
  • suitable cells include, but are not limited to, prokaryotic host cell strains such as Escherichia coli (e.g., strain DH5-α), Bacillus subtilis, Salmonella typhimurium, or strains of the genera Pseudomonas, Streptomyces and Staphylococcus.
  • yeast cells such as those of Saccharomyces such as Saccharomyces cerevisiae.
  • the present invention relates to and also encompasses SID® polynucleotides.
  • As explained above, for each bait polypeptide, several prey polypeptides may be identified by comparing and selecting the intersection of every isolated fragment that is included in the same polypeptide, as set forth in Example 5.
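The intersection step described above can be pictured with a short sketch. It assumes, purely for illustration, that each isolated prey fragment is recorded as a (start, end) coordinate pair on the same prey ORF; the function name `common_region` and the example coordinates are hypothetical and do not come from Example 5.

```python
from typing import List, Optional, Tuple

# (start, end) positions on the prey ORF, both inclusive (assumed convention)
Fragment = Tuple[int, int]


def common_region(fragments: List[Fragment]) -> Optional[Fragment]:
    """Return the region shared by every fragment, or None if there is none."""
    if not fragments:
        return None
    start = max(s for s, _ in fragments)  # latest start among the fragments
    end = min(e for _, e in fragments)    # earliest end among the fragments
    return (start, end) if start <= end else None


# Invented coordinates for three prey fragments selected against one bait.
prey_fragments = [(120, 480), (200, 610), (150, 450)]
print(common_region(prey_fragments))  # (200, 450)
```

Under these assumptions the shared region (200, 450) would play the role of the selected interacting domain; fragments that do not all overlap yield `None`.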
  • the SID® polynucleotides of the present invention are represented by the nucleic acid sequences of uneven SEQ ID Nos. 1 to 3255 in column 2 of Table 2, uneven SEQ ID Nos. 6589 to 6593 in Table 7 and uneven SEQ ID Nos. 6595 to 6643 in Table 8 encoding the SID® polypeptides of even SEQ ID Nos. 2 to 3256 of Table 2, the even SEQ ID Nos. 6590 to 6594 in Table 7 and the even SEQ ID Nos. 6596 to 6644 in Table 8.
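The numbering convention above can be sketched as follows. The sketch assumes that each uneven (odd) polynucleotide SEQ ID No. n is paired with the even polypeptide SEQ ID No. n + 1, which is consistent with the ranges quoted but is not stated explicitly; the names `POLYNUCLEOTIDE_RANGES`, `polynucleotide_seq_ids` and `encoded_polypeptide_seq_id` are illustrative only.

```python
# Uneven (odd) SEQ ID ranges quoted in the text, per table (first, last inclusive).
POLYNUCLEOTIDE_RANGES = {
    "Table 2": (1, 3255),
    "Table 7": (6589, 6593),
    "Table 8": (6595, 6643),
}


def polynucleotide_seq_ids(table: str) -> list:
    """List the uneven SEQ ID Nos. of the SID polynucleotides in one table."""
    first, last = POLYNUCLEOTIDE_RANGES[table]
    return list(range(first, last + 1, 2))


def encoded_polypeptide_seq_id(polynucleotide_seq_id: int) -> int:
    """Assumed pairing: odd polynucleotide n encodes even polypeptide n + 1."""
    if polynucleotide_seq_id % 2 == 0:
        raise ValueError("SID polynucleotides carry uneven SEQ ID Nos.")
    return polynucleotide_seq_id + 1


for table in POLYNUCLEOTIDE_RANGES:
    ids = polynucleotide_seq_ids(table)
    print(table, len(ids), ids[0], "->", encoded_polypeptide_seq_id(ids[0]))
```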
  • the present invention is not limited to the SID® nucleic acid sequences as described in the above paragraph, but also includes fragments of these sequences having at least 6 consecutive nucleic acids, between 6 and 5,000 consecutive nucleic acids and between 6 and 10,000 consecutive nucleic acids and between 6 and 20,000 consecutive nucleic acids, as well as variants thereof.
  • the fragments or variants of the SID® sequences possess at least the same affinity of binding to its protein or polypeptide counterpart, against which it has been initially selected.
  • this variant and/or fragments of the SID® sequences alternatively can have between 95% and 99.999% sequence identity to its protein or polypeptide counterpart.
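As a rough illustration of the identity thresholds mentioned above, the sketch below computes percent identity over two pre-aligned, ungapped sequences of equal length. This is a simplifying assumption: the text does not specify an alignment method, and the function names and example sequences are hypothetical.

```python
def percent_identity(seq_a: str, seq_b: str) -> float:
    """Percent of identical positions between two equal-length aligned sequences."""
    if len(seq_a) != len(seq_b) or not seq_a:
        raise ValueError("sequences must be non-empty and of equal length")
    matches = sum(1 for a, b in zip(seq_a, seq_b) if a == b)
    return 100.0 * matches / len(seq_a)


def within_claimed_range(identity: float, low: float = 95.0, high: float = 99.999) -> bool:
    """True when the identity lies in the 95% to 99.999% window quoted in the text."""
    return low <= identity <= high


reference = "MKTAYIAKQRQISFVKSHFS"  # invented 20-residue sequence
variant = "MKTAYIAKQRQISFVKSHFA"    # same sequence with one substitution
identity = percent_identity(reference, variant)
print(identity, within_claimed_range(identity))  # 95.0 True
```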
  • the variants can be created by known mutagenesis techniques either in vitro or in vivo. Such a variant can be created such that it has altered binding characteristics with respect to the target protein and more specifically that the variant binds the target sequence with either higher or lower affinity.
  • Polynucleotides that are complementary to the above sequences which include the polynucleotides of the SID®'s, their fragments, variants and those that have specific sequence identity are also included in the present invention.
  • polynucleotide encoding the SID® polypeptide, fragment or variant thereof can also be inserted into recombinant vectors which are described in detail above.
  • the present invention also relates to a composition comprising the above-mentioned recombinant vectors containing the SID® polypeptides in Tables 2, 7 and 8, fragments or variants thereof, as well as recombinant host cells transformed by the vectors.
  • the recombinant host cells that can be used in the present invention were discussed in greater detail above.
  • compositions comprising the recombinant vectors can contain physiologically acceptable carriers such as diluents, adjuvants, excipients and any vehicle in which this composition can be delivered therapeutically and can include, but are not limited to, sterile liquids such as water and oils.
  • the present invention relates to a method of selecting modulating compounds, as well as the modulating molecules or compounds themselves which may be used in a pharmaceutical composition.
  • modulating compounds may act as a cofactor, as an inhibitor, as antibodies, as tags, as a competitive inhibitor, as an activator or alternatively have agonistic or antagonistic activity on the protein-protein interactions.
  • the activity of the modulating compound does not necessarily, for example, have to be 100% activation or inhibition. Indeed, even partial activation or inhibition can be achieved that is of pharmaceutical interest.
  • the modulating compound can be selected according to a method which comprises:
  • said first vector comprises a polynucleotide encoding a first hybrid polypeptide having a DNA binding domain
  • said second vector comprises a polynucleotide encoding a second hybrid polypeptide having a transcriptional activating domain that activates said toxic reporter gene when the first and second hybrid polypeptides interact;
  • the present invention relates to a modulating compound that inhibits the protein-protein interactions of a complex of two polypeptides of Table 1 and
  • the present invention also relates to a modulating compound that activates the protein-protein interactions of a complex of two polypeptides of Table 1 and Table 8.
  • the present invention relates to a method of selecting a modulating compound, which modulating compound inhibits the interactions of two polypeptides of Table 1.
  • This method comprises:
  • said second vector comprises a polynucleotide encoding a second hybrid polypeptide having an enzymatic transcriptional activating domain that activates said toxic reporter gene when the first and second hybrid polypeptides interact;
  • any toxic reporter gene can be utilized including those reporter genes that can be used for negative selection including the URA3 gene, the CYH1 gene, the CYH2 gene and the like.
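The negative-selection logic of this screen can be summarised in a small boolean model. It is only a sketch of the principle described above (reporter activated when the two hybrid polypeptides interact, reporter expression toxic to the host cell); the compound names and the function names are invented for illustration.

```python
def reporter_expressed(interaction_intact: bool) -> bool:
    """The toxic reporter is activated only when the two hybrid polypeptides interact."""
    return interaction_intact


def cell_survives(compound_blocks_interaction: bool) -> bool:
    """A cell grows on selective medium only if the toxic reporter stays silent."""
    interaction_intact = not compound_blocks_interaction
    return not reporter_expressed(interaction_intact)


# Hypothetical screen: True means the compound disrupts the bait/prey interaction.
candidates = {"compound_A": False, "compound_B": True, "compound_C": False}
selected = [name for name, blocks in candidates.items() if cell_survives(blocks)]
print(selected)  # ['compound_B']
```

In this model, growth of the recombinant host cell is the read-out that identifies an inhibitory modulating compound.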
  • the present invention provides a kit for screening a modulating compound.
  • This kit comprises a recombinant host cell which comprises a reporter gene the expression of which is toxic for the recombinant host cell.
  • the host cell is transformed with two vectors.
  • the first vector comprises a polynucleotide encoding a first hybrid polypeptide having a DNA binding domain; and a second vector comprises a polynucleotide encoding a second hybrid polypeptide having a transcriptional activating domain that activates said toxic reporter gene when the first and second hybrid polypeptides interact.
  • a kit for screening a modulating compound by providing a recombinant host cell, as described in the paragraph above, but instead of a DNA binding domain, the first vector comprises a first hybrid polypeptide containing a first domain of a protein.
  • the second vector comprises a second polypeptide containing a second part of a complementary domain of a protein that activates the toxic reporter gene when the first and second hybrid polypeptides interact.
  • the activating domain can be p42 Gal 4, VP16 (HSV) and the DNA-binding domain can be derived from Gal4 or Lex A.
  • the protein or enzyme can be adenylate cyclase, guanylate cyclase, DHFR and the like.
  • SID® in Tables 2, 7 and 8 may be used as modulating compounds.
  • the present invention relates to a pharmaceutical composition comprising the modulating compounds for preventing or treating ulcers in a human or animal, most preferably in a mammal.
  • This pharmaceutical composition comprises a pharmaceutically acceptable amount of the modulating compound.
  • the pharmaceutically acceptable amount can be estimated from cell culture assays.
  • a dose can be formulated in animal models to achieve a circulating concentration range that includes or encompasses a concentration point or range having the desired effect in an in vitro system. This information can thus be used to accurately determine the doses in other mammals, including humans and animals.
  • the therapeutically effective dose refers to that amount of the compound that results in amelioration of symptoms in a patient. Toxicity and therapeutic efficacy of such compounds can be determined by standard pharmaceutical procedures in cell cultures or in experimental animals. For example, the LD50 (the dose lethal to 50% of the population) as well as the ED50 (the dose therapeutically effective in 50% of the population) can be determined using methods known in the art. The dose ratio between toxic and therapeutic effects is the therapeutic index, which can be expressed as the ratio LD50/ED50. Compounds that exhibit high therapeutic indexes are preferred.
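The therapeutic index defined above is a simple ratio, shown here as a minimal sketch. The dose values are invented for illustration and carry no experimental meaning; the function name is hypothetical.

```python
def therapeutic_index(ld50: float, ed50: float) -> float:
    """Therapeutic index as the ratio LD50/ED50 (both in the same dose units)."""
    if ld50 <= 0 or ed50 <= 0:
        raise ValueError("LD50 and ED50 must be positive")
    return ld50 / ed50


# Invented example: LD50 = 500 mg/kg, ED50 = 25 mg/kg.
print(therapeutic_index(500.0, 25.0))  # 20.0
```

A larger ratio means a wider margin between the effective dose and the lethal dose.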
  • the data obtained from the cell culture and animal studies can be used in formulating a range of dosage of such compounds which lies preferably within a range of circulating concentrations that include the ED50 with little or no toxicity.
  • the pharmaceutical composition can be administered via any route such as locally, orally, systemically, intravenously, intramuscularly, mucosally, using a patch and can be encapsulated in liposomes, microparticles, microcapsules, and the like.
  • the pharmaceutical composition can be embedded in liposomes or even encapsulated.
  • any pharmaceutically acceptable carrier or adjuvant can be used in the pharmaceutical composition.
  • the modulating compound will be preferably in a soluble form combined with a pharmaceutically acceptable carrier.
  • the techniques for formulating and administering these compounds can be found in "Remington's Pharmaceutical Sciences", Mack Publication Co., Easton, PA, latest edition.
  • the mode of administration, optimum dosages and galenic forms can be determined by criteria known in the art, taking into account the seriousness of the general condition of the mammal, the tolerance of the treatment and the side effects.
  • the present invention also relates to a method of treating or preventing ulcers in a human or mammal in need of such treatment.
  • This method comprises administering to a mammal in need of such treatment a pharmaceutically effective amount of a modulating compound which binds to a targeted bacterial protein.
  • the modulating compound is a polynucleotide which may be placed under the control of a regulatory sequence which is functional in the mammal or human.
  • the present invention relates to a pharmaceutical composition comprising a SID® polypeptide, a fragment or variant thereof.
  • the SID® polypeptide, fragment or variant thereof can be used in a pharmaceutical composition provided that it is endowed with highly specific binding properties to a bait polypeptide of interest.
  • the original properties of the SID® polypeptide or variants thereof interfere with the naturally occurring interaction between a first protein and a second protein within the cells of the organism.
  • the SID® polypeptide binds specifically to either the first polypeptide or the second polypeptide. Therefore, the SID® polypeptides of the present invention or variants thereof interfere with protein-protein interactions of Helicobacter pylori proteins or between Helicobacter pylori proteins and mammal, for example, human proteins.
  • the present invention relates to a pharmaceutical composition comprising a pharmaceutically acceptable amount of a SID® polypeptide or variant thereof, provided that the variant has the above-mentioned two characteristics; i.e., that it is endowed with highly specific binding properties to a bait polypeptide of interest and is devoid of biological activity of the naturally occurring protein.
  • the present invention relates to a pharmaceutical composition comprising a pharmaceutically effective amount of a polynucleotide encoding a SID® polypeptide or a variant thereof wherein the polynucleotide is placed under the control of an appropriate regulatory sequence.
  • Appropriate regulatory sequences that are used are polynucleotide sequences derived from promoter elements and the like.
  • Polynucleotides that can be used in the pharmaceutical composition of the present invention include the nucleotide sequences of uneven SEQ ID Nos. 1 to 3255 in column 2 of Table 2, uneven SEQ ID Nos. 6589 to 6593 in Table 7 and uneven SEQ ID Nos. 6595 to 6643 in Table 8.
  • the pharmaceutical composition of the present invention can also include a recombinant expression vector comprising the polynucleotide encoding the SID® polypeptide, fragment or variant thereof.
  • compositions can be administered by any route such as orally, systemically, intravenously, intramuscularly, intradermally, mucosally, encapsulated, using a patch and the like.
  • Any pharmaceutically acceptable carrier or adjuvant can be used in this pharmaceutical composition.
  • the SID® polypeptides as active ingredients will be preferably in a soluble form combined with a pharmaceutically acceptable carrier. The techniques for formulating and administering these compounds can be found in "Remington's Pharmaceutical Sciences" supra.
  • the amount of pharmaceutically acceptable SID® polypeptides can be determined as described above for the modulating compounds using cell culture and animal models.
  • Such compounds can be used in a pharmaceutical composition to treat or prevent ulcer.
  • the present invention also relates to a method of preventing or treating ulcer in a mammal, said method comprising the steps of administering to a mammal in need of such treatment a pharmaceutically effective amount of:
  • SID® polynucleotide encoding a SID® polypeptide of uneven SEQ ID Nos. 1 to 3255 in column 2 of Table 2, uneven SEQ ID Nos. 6589 to 6593 in Table 7 and uneven SEQ ID Nos. 6595 to 6643 in Table 8 or variants or fragments thereof wherein said polynucleotide is placed under the control of a regulatory sequence which is functional in said mammal; or
  • a recombinant expression vector comprising a polynucleotide encoding a SID® polypeptide which binds to a bacterial protein.
  • nucleic acids comprising a sequence which encodes the SID® proteins of Table 2, Table 7 and Table 8 and/or functional derivatives thereof are administered to modulate complex of Table 1 and Table 8 by way of gene therapy. Any of the methodologies relating to gene therapy available within the art may be used in the practice of the present invention such as those described by Goldspiel et al, Clin. Pharm. 12, pgs. 488-505 (1993).
  • Delivery of the therapeutic nucleic acid into a patient may be direct in vivo gene therapy (i.e., the patient is directly exposed to the nucleic acid or nucleic acid containing vector) or indirect ex vivo gene therapy (i.e., cells are first transformed with the nucleic acid in vitro and then transplanted into the patient).
  • an expression vector containing the nucleic acid is administered in such a manner that it becomes intracellular; i.e., by infection using a defective or attenuated retroviral or other viral vectors as described, for example in U.S. Patent 4,980,286 or by Robbins et al, Pharmacol.
  • retroviral vectors that are known in the art are such as those described in Miller et al, Meth. Enzymol. 217 pgs. 581-599 (1993) which have been modified to delete those retroviral sequences which are not required for packaging of the viral genome and subsequent integration into host cell DNA.
  • adenoviral vectors can be used which are advantageous due to their ability to infect non-dividing cells and such high-capacity adenoviral vectors are described in Kochanek, Human Gene Therapy, 10, pgs. 2451-2459 (1999).
  • Chimeric viral vectors that can be used are those described by Reynolds et al, Molecular Medicine Today, pgs. 25-31 (1999).
  • Hybrid vectors can also be used and are described by Jacoby et al, Gene Therapy, 4, pgs. 1282-1283 (1997).
  • Direct injection of naked DNA, microparticle bombardment (e.g., Gene Gun®; Biolistic, Dupont), coating the nucleic acid with lipids, cell-surface receptors/transfecting agents, encapsulation in liposomes, microparticles or microcapsules, administering the nucleic acid in linkage to a peptide which is known to enter the nucleus, or administering it in linkage to a ligand predisposed to receptor-mediated endocytosis can also be used in gene therapy. See Wu & Wu, J. Biol. Chem., 262, pgs. 4429-4432 (1987).
  • a nucleic acid ligand compound may be produced in which the ligand comprises a fusogenic viral peptide designed so as to disrupt endosomes, thus allowing the nucleic acid to avoid subsequent lysosomal degradation.
  • the nucleic acid may be targeted in vivo for cell specific endocytosis and expression by targeting a specific receptor such as that described in WO 92/06180, WO 93/14188 and WO 93/20221.
  • the nucleic acid may be introduced intracellularly and incorporated within the host cell genome for expression by homologous recombination. See Zijlstra et al, Nature, 342, pgs. 435-438 (1989).
  • a gene is transferred into cells in vitro using tissue culture and the cells are delivered to the patient by various methods such as injecting subcutaneously, application of the cells into a skin graft and the intravenous injection of recombinant blood cells such as hematopoietic stem or progenitor cells.
  • Cells into which a nucleic acid can be introduced for the purposes of gene therapy include, for example, epithelial cells, endothelial cells, keratinocytes, fibroblasts, muscle cells, hepatocytes and blood cells.
  • the blood cells that can be used include, for example, T-lymphocytes, B-lymphocytes, monocytes, macrophages, neutrophils, eosinophils, megakaryocytes, granulocytes, hematopoietic cells or progenitor cells and the like.
  • the present invention relates to protein chips or protein microarrays. It is well known in the art that microarrays can contain more than 10,000 spots of a protein that can be robotically deposited on a surface of a glass slide or nylon filter. The proteins attach covalently to the slide surface, yet retain their ability to interact with other proteins or small molecules in solution. In some instances the protein samples can be made to adhere to glass slides by coating the slides with an aldehyde-containing reagent that attaches to primary amines.
  • a process for creating microarrays is described, for example, by MacBeath and Schreiber in Science, Volume 289, Number 5485, pgs. 1760-1763 (2000) or Service, Science, Vol. 289, Number 5485, pg. 1673 (2000).
  • An apparatus for controlling, dispensing and measuring small quantities of fluid is described, for example, in U.S. Patent No. 6,112,605.
  • the present invention also provides a record of protein-protein interactions, PIM®'s, SID®'s and any data encompassed in the following Tables. It will be appreciated that this record can be provided in paper or electronic or digital form.
  • the present invention relates to the classification of H. pylori within functional categories such as genes essential or non essential for viability using the general method described in Figure 18.
  • two exhaustive libraries of H. pylori ORFs were constructed in E. coli. The first library contained every H. pylori (strain 26695) ORF cloned individually (Library I), while the second one (Library II) contained these ORFs disrupted by a transposable element.
  • These two ordered libraries are valuable tools for a large project of systematic inactivation of every ORF of the H. pylori genome. They were used to develop a strategy to search at the genomic scale for genes essential for the viability of the bacterium grown in vitro.
  • the inactivation strategy was applied to a series of 138 ORFs that were selected on two different criteria. Ninety-six of them were previously shown to encode proteins involved in protein-protein interactions in the yeast two-hybrid assay (Rain et al, 2001), and 42 encode H. pylori-specific proteins with no known function. The screening procedure led to the identification of 40 Putative Essential Genes (PEGs), of which 15 were shown to be true essential genes. The reasoning behind this analysis was that combining essentiality data with the identification of interacting domains might serve as a direct pathway for the design of active compounds capable of inhibiting protein-protein interactions and possibly bacterial growth.
  • This derivative plasmid corresponds to pILL570 (Labigne et al, 1992) in which DNA from the HindIII site of the polylinker to the AvaI site (position 1425 of the pBR322 backbone) has been excluded by reverse PCR using pILL570 as a template and 570-1 plus 570-2 as primers (Table 4).
  • Each ORF was cloned in such a way that the 5'-end of the gene (including the ATG) was inserted immediately downstream of the three transcriptional and translational stops of pILL570 (Labigne et al, 1992) to prevent toxicity of the recombinant proteins in E. coli.
  • the library consists of seventeen 96-well plates (plate I.1 to plate XVII.1).
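To make the plate layout concrete, the sketch below maps a zero-based clone index onto one of seventeen 96-well plates. The row-major A1 to H12 well ordering and the use of a running clone index are assumptions made for illustration; the text only states the number of plates and their Roman-numeral labels.

```python
ROMAN_PLATES = ["I", "II", "III", "IV", "V", "VI", "VII", "VIII", "IX",
                "X", "XI", "XII", "XIII", "XIV", "XV", "XVI", "XVII"]
ROWS = "ABCDEFGH"   # 8 rows of a standard 96-well plate
COLUMNS = 12        # 12 columns per row
WELLS_PER_PLATE = len(ROWS) * COLUMNS


def well_position(index: int):
    """Map a zero-based clone index to (plate label, well), assuming row-major filling."""
    capacity = len(ROMAN_PLATES) * WELLS_PER_PLATE
    if not 0 <= index < capacity:
        raise IndexError("index outside the 17-plate library")
    plate, within_plate = divmod(index, WELLS_PER_PLATE)
    row, column = divmod(within_plate, COLUMNS)
    return ROMAN_PLATES[plate], f"{ROWS[row]}{column + 1}"


print(len(ROMAN_PLATES) * WELLS_PER_PLATE)  # 1632 wells in total
print(well_position(96))                    # ('II', 'A1')
```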
  • the recombinant plasmids were transformed into DH5α E. coli cells harboring the pTCA plasmid, a plasmid that confers resistance to tetracycline, encodes the Tn3 transposase and is immune to Tn3 (Seifert et al, 1986).
  • the presence of the two compatible plasmids pILL570-HP000X plus pTCA was checked by plasmid extraction and gel electrophoresis on individual isolated tetracycline, spectinomycin, kanamycin resistant clones.
  • using primers 570-3 and 570-4 (corresponding to the boundaries of the cloning site on pILL570), the agreement between the size of the cloned PCR product and that of the corresponding ORF was confirmed.
  • Library I consists of all the putative ORFs described on the TIGR web site in 1997 with the exclusion of 40 ORFs (hp01, 10, 46, 56, 94, 160, 223, 264, 289, 293, 399, 415, 435, 440, 453, 159, 464, 465, 488, 547, 607, 722, 790, 814, 846, 876, 884, 898, 968, 1007, 1069, 1205, 1248, 1304, 1358, 1394, 1452, 1460, 1497, 1511) for which either the initial gene amplification or the final cloning failed.
  • Tn3-Km was shown to preferentially map into the H. pylori inserts due to both the intrinsic properties of Tn3 that transposes into AT rich DNA region and the requirement of maintaining intact replicative function and spectinomycin modifying enzyme (aadA). The efficiency of the whole procedure was checked for five plates. For those 5 plates, the resulting kanamycin transconjugants of 96 independent cloned ORFs were kept individually and as pools of plasmids.
  • This figure exemplifies first that multiple insertions of the transposon took place, second that for the majority transposon insertion occurred at a distance ranging from 100 to 600 bp from the 5'-end of the ORF, a distance compatible with the promotion of gene replacement by allelic recombination.
  • the ordered library of disrupted H. pylori ORFs in E. coli was used for the genomic screening of putative essential genes (PEGs), and the screening of a subset of 96 individual selected ORFs of H. pylori strain 26695 (Table 5) as genes encoding proteins demonstrating homodimeric or heterodimeric protein-protein interactions (Rain et al, 2001), and the screening of a subset of 42 ORFs encoding H. pylori specific predicted protein with no known function (Table 6).
  • ORFs known to be essential [groES (hp0011); holB (hp1231); dnaA (hp1529)], or ORFs known to be non-essential for viability in vitro [ureI (hp0071); rdxA (hp0954); ggt (hp1118)], representative of various sizes.
  • individual Tn3-Km disrupted recombinant plasmids were extracted from Library II, and used to transform H. pylori strain HAS141 (Janvier et al, 1999).
  • Kanamycin transformants were obtained for all but the hspA gene as expected, and 40 of the 138 tested ORFs, namely hp0061, 175, 377, 419, 553, 650, 739, 862, 928, 990, 1012, 1014, 1074, 1230, 1245, 1263, 1493 for the first series (Table 5), and hp0130, 231, 271, 358, 394, 659, 697, 699, 721, 726, 746, 838, 935, 947, 953, 973, 1023, 1028, 1039, 1053, 1085, 1265, 1568 (Table 6) which thus can be designated as Putative Essential Genes.
  • kanamycin resistant transformants obtained for the 78 knock-out (KO) genes were controlled by gene amplification using the 5' and the 3' oligonucleotides of the KO-ORF respectively in pairs with the 38 bp (Table 4) of the inverted repeat of the transposon. Criteria for allelic replacement were that the sum of the size of the two PCR products be identical to that of the KO-ORF. The final identification of the disrupted ORF and of the site of insertion was done by sequencing one of the two PCR products. For ORFs with a size over 700 bp two or three different transposon insertions mapping in the middle of the ORF were commonly observed among the six analysed transformants.
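The size criterion for allelic replacement stated above can be written as a one-line check. The sketch assumes sizes are read in base pairs and adds a hypothetical `tolerance_bp` parameter, since gel-based sizing is approximate; it ignores any contribution of primer or inverted-repeat sequence to the product lengths, which the text does not detail.

```python
def consistent_with_allelic_replacement(orf_bp: int,
                                        five_prime_product_bp: int,
                                        three_prime_product_bp: int,
                                        tolerance_bp: int = 0) -> bool:
    """True when the two PCR products together account for the size of the KO-ORF."""
    total = five_prime_product_bp + three_prime_product_bp
    return abs(total - orf_bp) <= tolerance_bp


# Invented example: a 900 bp ORF with the transposon inserted 350 bp from the 5'-end.
print(consistent_with_allelic_replacement(900, 350, 550))       # True
print(consistent_with_allelic_replacement(900, 350, 700))       # False
print(consistent_with_allelic_replacement(900, 350, 560, 20))   # True
```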
  • chromosomal DNA extracted from individual transformants of the disrupted hp1231(holB), 1514 (unknown) and 1529 (dnaA) as well as some (and not all) of the individual clones of hp0224 and hp0822 did hybridize with part of the vector confirming that a single crossing-over took place.
  • The Tn3-Km strategy is a powerful approach to be used as a first screen, at a genomic scale, for the identification of PEGs.
  • the definitive assignment of a non-essential status cannot be done exclusively on the presence of kanamycin transformants, but had to be confirmed and tested for absence of vector DNA within the chromosome of transformants by hybridization.
  • PEGs consist of ORFs that did not lead to the isolation of kanamycin transformants following the transformation of the parental isolate HAS141 with a pool of disrupted ORF.
  • absence of kanamycin transformants cannot be directly associated with the identification of a true essential gene.
  • kanamycin resistant mutant such as (i) absence of the specific gene in the tested strain, (ii) polar effect of the transposon on an essential ORF located downstream of the PEG, a property associated with miniTn3-Km (Skouloubris et al, 1998) (iii) experimental failure due to the small size of the ORF and of the bordering sequences required for allelic exchange.
  • the recombinant plasmids consist, in each case, of the 0.9 kb-Kanamycin promoter-less cassette flanked respectively upstream and downstream with the 300 first and the 300 last nucleotides of the ORF to knock-out.
  • the cassette carries a ribosome binding site and a start codon (ATG) in phase with the hundred 3'-terminal codons of the KO gene allowing the translation of the end of the gene to overcome any transcriptional/translational coupling effect.
  • the resulting constructed plasmid was transformed in four different H. pylori genetic backgrounds: strains HAS141, N6, X47-2An and the sequenced strain 26695. Taking into account these criteria (Figure 17), the following conclusions were drawn relative to the 40 initially identified PEGs.
  • Non polar kanamycin mutants were unambiguously obtained in HAS141, N6, X47-2An as well as 26695 for ORFs hp0061, 419, 553, 650, 1263, and 1493 of the first series of genes encoding interacting protein (Table 5), and hp0130, 271, 358, 697, 699, 721, 726, 746, 838, 935, 947, 953, 973, 1023, 1028, 1039, 1053 (Table 6).
  • Gene replacement of the parental allele by the deleted and disrupted allele was confirmed by testing the chromosome of the mutant for the disappearance of a PCR product with a size identical to that of the parental ORF and its replacement by the expected size (1.5 kb PCR product: 300 + 900 + 300 bp).
  • the parental ORF had a size ranging around 1.5 kb, gene replacement was confirmed by restricting the 1.5 kb PCR product with SmaI to release the 0.9 kb-
  • All these kanamycin resistant HAS141 transformants were positive when tested by hybridization with the labeled pILL570-» vector used as a probe, again attesting to the presence of rare but possible single crossing-over seen under strong selective pressure.
  • These 15 genes (hp0175, hp0231, hp0377, hp0394, hp0659, hp0739, hp0862, hp0928, hp1012, hp1014, hp1085, hp1230, hp1245, hp1265, hp1568) can thus be definitively recognized as genes essential for the viability of H. pylori in vitro.
  • Of these genes, 9 are known to encode proteins involved in protein-protein interactions, and 6 were selected as encoding H. pylori specific proteins without known function. They encode proteins with properties that will be discussed and classified with regard to their potential as putative therapeutic targets.
  • hp0231, hp0394, hp0659, hp1085, hp1265, and hp1568 ORFs were cloned in the pB6 vector and used as individual baits for the identification of interacting proteins (Table 7).
  • HP0231, HP1085, HP1568 did not provide data allowing assignation of a putative function and did not reveal homodimeric interaction underline the usefulness of the protein as a possible therapeutic target. Those genes remain ubiquitous, essential, H. pylori specific and without known function. In contrast, HP0394, HP0659 and HP1265 gave positive screens (Table 7).
  • T compounds capable of inhibiting protein-protein interactions and possibly bacterial growth.
  • Of the 15 ubiquitous essential genes identified by the procedure, 12 were shown to be involved in protein-protein interaction and could be classified in different categories with regard to their potential as putative therapeutic targets.
  • the first category consists of ubiquitous ORFs encoding proteins with heterodimeric protein-protein interactions in the two-hybrid assay where both partners are playing an essential role for the viability of H. pylori and at least one of the two partners is H. pylori specific.
  • Four of the twelve essential ORFs answer these criteria hp0394, hp0862, hp1230, and hp0659 which encode proteins with no known or putative function.
  • H. pylori specific the recent publication of the Campylobacter genome demonstrates for some of them the existence of homologues in this closely phylogenetically related bacteria, but no homologues have been identified in the other bacterial genomes so far sequenced.
  • the hp1230 gene encodes a protein that has been recognized via the two hybrid assay as an homodimeric protein which interacts with the predicted chromosomal replication initiator protein, DnaA, encoded by hp1529.
  • the proteomic screen allowed the identification of a specific domain of interaction (SID) lying between AA31 and AA180 of HP1230 (SID1230) and a SID of 87 AA within the N-terminal domain of HP1529 or DnaA (AA12 to AA99) (SID1529) (Rain et al., 2001).
  • the oligonucleotide encoding SID1529 was randomly mutagenized, and mutated SIDs that abolished the specific HP1230/HP1529 interaction were selected through the two-hybrid system. This demonstrated that isoleucine 58 and lysine 61 are involved in the HP1230/HP1529 interaction, since the double mutant I58F/K61I within SID1529 abolished this interaction.
  • Another example of this category of interest is the hp0862 gene.
  • the two-hybrid screen procedure revealed interactions between the HP0862 gene product and the C-terminal domain (AA100 to AA191) of the thymidylate kinase (HP1474), an essential enzyme responsible for the first phosphorylation step in the conversion of deoxythymidine 5'-monophosphate to deoxythymidine 5'-diphosphate for the final production of dTTP.
  • HP1230 the actual function of the HP0862 encoding gene is unknown, but its essential character and interaction with a known essential enzyme (Tmk) might orient further functional analysis, and encouraged the definition of more precise domain of interactions between the TmK protein and this new specific interacting protein.
  • hp0394 and hp0659 might also enter this category. Both encode essential proteins specific to H. pylori.
  • hp0394 encodes a protein whose C-terminal domain (last 76 AA) interacts with the β subunit of the acetyl-coenzyme A carboxylase transferase (HP0950). This enzyme has an essential function in membrane lipid synthesis and catalyses the formation of malonyl-CoA, the first intermediate for fatty acid synthesis.
  • the protein encoded by hp0659 interacts with a putative outer membrane protein HP0655 with no known function. The essential property of hp0655 has not yet been tested and thus the classification of the hp0659 gene in this first category has to be confirmed.
  • the second category includes genes essential for H. pylori encoding predicted proteins with known functions.
  • the characteristic of this category is that these genes essential in H. pylori have not been reported to be lethal in other organisms. Thus targeting the proteins encoded by these genes might be relevant of a selective drug design specifically directed against H. pylori.
  • hp0377 encodes a protein partially homologous to DsbC, a thiol-disulfide interchange periplasmic protein involved in disulfide bond formation (Zapun et al., 1995).
  • the HP0377 product interacts in the two-hybrid assay with the last 100 C-terminal amino acids (SID) of the homodimeric secreted cysteine containing protein encoded by hp0224.
  • the HP0224 product is a methionine sulfoxide reductase homologue (MsrA).
  • MsrA plays a role in response to oxidative damage by reducing the methionine sulfoxide residues (Moskovitz et al., 1995), and directly, or indirectly contributes to the maintenance of adhesins (Wizemann et al., 1996).
  • HP0377 encoding DsbC is a good H. pylori specific target candidate because of the essential character unique to the bacterial species, and the accessibility of the protein within the periplasmic space.
  • the hp0175 gene is another representative of this category.
  • the gene encodes a predicted peptidyl-prolyl cis-trans isomerase, an enzyme that accelerates protein folding by catalyzing the cis-trans isomerization of proline imidic peptide bonds in oligopeptides.
  • Two-hybrid screening identified HP0608, a H. pylori specific protein of unknown function, as interacting partner with HP0175.
  • the essential character of the gene documented in this study, as for HP0377, has not been reported for other microorganisms and appears unique to the species.
  • hp1265 is a very unique and specific target for H. pylori.
  • This essential gene is part of a large cluster of 14 genes (hp1260 to hp1273) among which 12 encode homologues of 12 of the 14 subunits of the NADH oxidoreductase complex of E. coli.
  • In E. coli, inactivation of the genes encoding the different subunits (nuo) is not lethal for the bacterium.
  • In H. pylori, inactivation of hp1263 is not lethal, but inactivation of one of the two H. pylori-specific subunit-encoding genes is lethal, indicating that the HP1265 subunit, which should be the subunit involved in NADH binding, has very unique properties deserving further functional investigation.
  • a third category consists in ORFs that encoded conserved hypothetical proteins, distributed in almost all the sequenced bacterial genomes, for which no function has been assigned, nor the essentiality assessed.
  • hp0739 gene which encodes a protein that interacts with another conserved hypothetical but non essential protein (HP810), and hp1012. In the two sequenced H.
  • the hp0739 gene is flanked by two genes involved in the biosynthesis of the peptidoglycan, and its involvement in this biosynthetic pathway remains to be explored.
  • the hp1012 gene encoded a protein, which has similitude with some metallo-proteases, however the function of this specific protease is unknown.
  • a fourth category includes ORFs with known functions that were previously shown to be essential in other microorganisms.
  • the present work allows us to extend this property to the H. pylori species, and reinforces their value as putative targets with large spectrum.
  • the hp0928 gene is one of those; it was selected through the two-hybrid screen as an homodimer.
  • the assigned function was that of GTP cyclohydrolase (folE) involved in the first step of the biosynthetic pathway of tetrahydrofolate, the structure of which was shown to be a homodecameric complex formed of two pentamers. Both the inability to knock out that gene and its oligomeric structure support the assigned function.
  • the two-hybrid screening procedure delineates a domain of interaction consisting of 133 amino-acids between the FolE subunits. This domain might serve as a therapeutical target for the screening of lead compounds with large bacterial spectrum.
  • the hp1014 gene (hdhA) encodes an NAD+-dependent oxidoreductase belonging to the short-chain dehydrogenase/reductase (SDR 1 family).
  • the enzyme is known to require a tetrameric form to be active in E. coli (Yoshimoto et al., 1991), which was compatible with the homodimeric interaction observed in the two-hybrid assay.
  • hp1245 as an essential gene of this category. This gene encodes the SSB protein, the single strand binding protein, involved in DNA replication, recombination and DNA repair, and such observation is confirmatory of work previously done in other model microorganisms.
  • HP1245 was found to interact with HP0650, a non-essential protein of unknown function, but also significantly with HP0661, a predicted ribonuclease H involved in DNA replication, a finding consistent with the HP1245 function.
  • Example 1 Preparation of a Helicobacter pylori genomic collection
  • Example 2 Screening the collection with the two-hybrid in yeast system
  • Example 9 Classification of genes of H. pylori
  • Example 10 Study of the interaction between two essential genes, HP1230-HP1529, by random mutagenesis.
  • Example 1 Preparation of a Helicobacter pylori genomic collection
  • the Helicobacter pylori genomic DNA is fragmented in a nebulizer (GATC) for 1 minute, precipitated and resuspended in water.
  • the obtained nebulized genomic DNA is successively treated with Mung Bean Nuclease (Biolabs) (30 minutes at 30°C), T4 DNA polymerase (Biolabs) (10 minutes at 37°C) and Klenow enzyme (Pharmacia) (10 minutes at room temperature and 1 hour at 16°C).
  • DNA is then extracted, precipitated and resuspended in water.
  • Oligonucleotide PL160 (5' end phosphorylated) 1 µg/µl and PL159 2 µg/µl. Sequence of the oligo PL160: 5'-ATCCCGGACGAAGGCC-3' (SEQ ID NO. 3257)
  • Linkers were preincubated (5 minutes at 95°C, 10 minutes at 68°C, 15 minutes at 42°C) then cooled down at room temperature and ligated with genomic DNA inserts at 4°C overnight.
  • Linkers were further removed on a separation column (Chromaspin TE 400, Clontech), according to the manufacturer's protocol.
  • pACTIIst is successively digested with BamHI restriction enzyme (Biolabs) for 1 hour at 37°C, dephosphorylated with Calf Intestine Phosphatase (CIP) (Biolabs) and filled in with dGTP using Vent DNA polymerase (exo-) (Biolabs), extracted, precipitated and resuspended in water.
  • the prepared vector is ligated overnight at 15°C with the genomic blunt ended DNA described in section 2 using T4 DNA ligase (Biolabs). The DNA is then precipitated and resuspended in water.
  • HGXBHP1 CNCM N° 1-2181 .
  • the plasmid DNA contained in E. coli was extracted (Qiagen) from aliquoted E. coli frozen cells (1.A.5.).
  • Yeast transformation is performed according to standard protocol (Gietz et al. Yeast, 11, 355-360, 1995) using yeast carrier DNA (Clontech). This experiment leads to 10^4 to 5 x 10^4 cells/µg DNA. Spread 2 x 10^4 cells on DO-Leu medium per plate. Aliquot and freeze at -80°C.
  • the genomic amplification of the ORF is obtained by PCR using the Pfu proofreading Taq polymerase (Stratagene) and 200 ng of genomic DNA as the template. PCR primers are chosen in regions flanking the ORF.
  • PCR fragments Purify PCR fragments with Qiaquick column (Qiagen) according to the manufacturer's protocol. Ligate digested PCR fragments into an adequately digested and dephosphorylated bait vector (pAS2 ⁇ ) according to standard protocol (Maniatis et al.).
  • Example 2 Screening the collection with the two-hybrid in yeast system
  • the mating procedure allows a direct selection on selective plates because the two fusion proteins are already produced in the parental cells. No replica plating is required.
  • OD600nm of the DO-Trp preculture of Y187 cells carrying the bait plasmid. The OD600nm must lie between 0.1 and 0.5 in order to correspond to a linear measurement.
  • the number of His+ cell clones will define which protocol is to be processed:
  • the X-Gal overlay assay is performed directly on the selective medium plates after scoring the number of His+ colonies.
  • the water temperature should be 50°C.
  • Overlay mixture: 0.25 M Na2HPO4 pH 7.5, 0.5% agar, 0.1% SDS, 7% DMF (LABOSI), 0.04% X-Gal (ICN). For each plate, 10 ml overlay mixture are needed.
  • Temperature of the overlay mix should be between 45 and 50°C.
  • PCR amplification of fragments of plasmid DNA directly on yeast colonies is a quick and efficient procedure to identify sequences cloned into this plasmid. It is directly derived from a published protocol (Wang H. et al., Analytical Biochemistry, 237, 145-146, 1996). However, it is not a standardized protocol: in our hands it varies from strain to strain and is dependent on experimental conditions (number of cells, Taq polymerase source, etc.). This protocol should be optimized to specific local conditions.
  • PCR mix composition is
  • thermocycler (GeneAmp 9700, Perkin Elmer) 5 minutes at 99.9°C and then 10 minutes at 4°C.
  • the PCR program was set up as follows: 94°C 3 minutes
  • the quality, the quantity and the length of the PCR fragment were checked on an agarose gel.
  • the length of the cloned fragment was the estimated length of the PCR fragment minus 300 base pairs that corresponded to the amplified flanking plasmid sequences
  • Plasmids can be rescued from yeast by electroporation.
  • This experiment allows the recovery of prey plasmids from yeast cells by transformation of E. coli with a yeast cellular extract.
  • the prey plasmid can then be amplified and the cloned fragment can be sequenced.
  • Extraction buffer: 2% Triton X100, 1% SDS, 100 mM NaCl, 10 mM Tris-HCl pH 8.0, 1 M EDTA pH 8.0.
  • Electrocompetent MC1066 cells prepared according to standard protocols (Maniatis).
  • blastwun available on the Internet site of the University of Washington is a development version of software for gene and protein identification through similarity searches of protein and nucleotide sequence databases.
  • Blastwun program compares prey polynucleotide insert sequence (rescued from prey plasmid) with whole Helicobacter pylori genome (available on NCBI web site: http://www.ncbi.nlm.nih.gov under GenBank accession number AE000511). This comparison leads to prey polynucleotide localizations in the H. pylori genome, each localization having a score depending on the homology of sequence. For each prey polynucleotide, we consider the localization with the highest score and, if the insert sequence is included in and is in phase with an Open Reading Frame, we can identify one prey polypeptide interacting with one bait polypeptide.
  • This web page allows several requests concerning Helicobacter pylori's genome, in particular, its ORF sequence. To get the sequences of specific ORFs, click on the window named "HP#" and click search. This operation leads to a new web page presenting nucleic and peptide sequence of the specific ORF.
  • Example 5 Identification of SID® Experiment results in step 4. sequences of each prey fragment encoding for an interacting prey polypeptide.
  • SID® Selected Interacting Domain
  • Analyse results select lead compounds that prevent transformed permeabilized yeast cells from growing.
  • An expression vector containing the SID® polynucleotide is made in the manner described in U.S. Patent 4,980,286. It is then administered to patients to treat H. pylori infections.
  • Example 8 Making of polyclonal and monoclonal antibodies
  • the protein-protein complex of Table 1 was injected into mice and polyclonal and monoclonal antibodies were made following the procedure set forth in Sambrook et al supra.
  • mice are immunized with an immunogen comprising complexes conjugated to keyhole limpet hemocyanin using glutaraldehyde or EDC as is well known in the art.
  • the complexes can also be stabilized by crosslinking as described in WO 00/37483.
  • the immunogen is then mixed with an adjuvant.
  • Each mouse receives four injections of 10 µg to 100 µg of immunogen, and after the fourth injection, blood samples are taken from the mice to determine if the serum contains antibodies to the immunogen. Serum titer is determined by ELISA or RIA. Mice with sera indicating the presence of antibody to the immunogen are selected for hybridoma production.
  • Spleens are removed from immune mice and single-cell suspension is prepared (Harlow et al 1988). Cell fusions are performed essentially as described by Kohler et al. Briefly, P365.3 myeloma cells (ATCC Rockville, Md) or NS-1 myeloma cells are fused with spleen cells using polyethylene glycol as described by Harlow et al. Cells are plated at a density of 2 x 10^5 cells/well in 96-well tissue culture plates. Individual wells are examined for growth and the supernatants of wells with growth are tested for the presence of Table 1 complex-specific antibodies by ELISA or RIA using the Table 1 complex as a target protein. Cells in positive wells are expanded and subcloned to establish and confirm monoclonality.
  • Clones with the desired specificities are expanded and grown as ascites in mice or in a hollow fiber system to produce sufficient quantities of antibodies for characterization and assay development. Antibodies are tested for binding to bait polypeptide (from column 1 of Table 1) alone or to prey polypeptide (from column 2 of Table 1) alone, to determine which are specific for the Table 1 complex as opposed to those that bind to the individual proteins.
  • Monoclonal antibodies against each of the complexes set forth in Table 1 are prepared in a similar manner by mixing specified proteins together, immunizing an animal, fusing spleen cells with myeloma cells and isolating clones which produce antibodies specific for the protein complex, but not for individual proteins.
  • Example 9 Classification of genes of H. pylori within function categories at the genomic scale using 2 exhaustive libraries in E. coli 1. Bacterial strains, growth and storage conditions.
  • Escherichia coli strains DH5α (BRL), HB101 (Boyer and Roulland-Dussoix, 1969) and NS2114 (Rif) (Seifert et al., 1986) were used as hosts for plasmid cloning and disruption experiments and were grown at 37°C in L-broth (10 g of tryptone, 5 g of yeast extract and 5 g of NaCl per liter, pH 7.0) or on L-agar plates (1.5% agar) at 37°C.
  • Antibiotics were used at the following final concentrations (µg/ml) unless indicated in the text: spectinomycin: 100 (Upjohn Laboratories, Paris, France), tetracycline: 8 (Sigma Chemicals, Saint-Quentin Fallavier, France), kanamycin: 25 (Serva, Frankfurt, Germany), rifampicin: 100 (Sigma Chemicals). Independent recombinant E. coli were saved by storing up to 96 clones individually in 96-well microtitre plates; clones were inoculated into L-broth supplemented with 8 µg/ml tetracycline, 100 µg/ml spectinomycin and 7% DMSO (Sigma) and stored at -80°C.
  • H. pylori strain 26695 (Tomb et al., 1997), HAS141 (Janvier et al., 1999), N6 (Ferrero et al., 1992), X47-2an (GUY et al., 1999) were routinely cultured on 10% horse blood agar medium (Blood Agar Base no.
  • Solid and liquid media contained supplements at the following final concentrations: 10 µg vancomycin (Dakota Pharmaceuticals, Creteil, France), 2.5 IU polymyxin (Pfizer Laboratories, Orsay, France), 5 µg trimethoprim (Sigma) and 4 µg amphotericin B (Bristol-Myers Squibb, Paris, France)/ml. Plates were incubated at 37°C under microaerobic conditions in an anaerobic jar with a carbon dioxide generator (CampyGen, Oxoid) without catalyst. H. pylori that had undergone chromosomal allelic exchange were selected on medium supplemented with 25 µg kanamycin.
  • the cloning of the 96 amplicons was performed using the ligation-independent method described by Rashtchian (Rashtchian, 1995).
  • First the linear pILL570-» derivative vector was prepared by gene amplification using #570-1 and #570-2 (Table 4) as primers and the pILL570 plasmid (Labigne et al., 1992) as a template.
  • Three microliters of individual HP0001 to HP1590 PCR products were mixed together with 2 µl of pILL570-» derivative vector (75 ng), 14 µl of 1X PCR buffer, 1 µl of uracil DNA glycosylase (UDG) in a 2000-µl 96-well disposable plate.
  • Competent DH5α cells (100 µl) harboring the pTCA plasmid (Seifert et al., 1986) were added to each well, and the 96-well plate was further incubated for 45 min on ice. One ml of prewarmed L-broth was added to each well, and the plate was then incubated for 90 min at 37°C.
  • a selective antibiotic cocktail containing spectinomycin and tetracycline was added to each well to positively select and enrich in pILL570-« derivative recombinant plasmid transformed DH5α (pTCA) cells; plates were then incubated for another 13 hours at 37°C under agitation.
  • Individual transformant colonies were isolated by spotting 10 µl of liquid culture from each well on square agar plates containing tetracycline and spectinomycin using a 96-well inoculator designed to deliver a 10 µl liquid volume; cloning of the PCR product was confirmed by mini-preparation of recombinant plasmid restricted with ClaI-AvaI. They were stored in DMSO (7%) at -80°C under a 96-well format as "library I" consisting of plate I.1 to plate XVII.1.
  • Transposon mutagenesis of individual E. coli clones was performed using the mini-Tn3-Km transposon as previously described by Jenks et al. (Jenks et al., in press). All manipulations were performed in a 96-well format and four independent transposon mutageneses were carried out in parallel so as to saturate the mutagenesis disrupting process with independent events. Briefly, the stored microtitre plates containing the individual E.
  • Plasmid pILL553 harboring the mini-Tn3-Km transposon (Seifert et al., 1986) (Labigne, 1997) (a low copy auto-transferable plasmid pOX38 derivative) was transferred into these E. coli DH5α clones by conjugation.
  • Transconjugates harboring all three plasmids (recombinant pILL570-» derivative, pTCA and pILL553), were selected by spotting 10 µl of the mating mixture on L-agar containing 25 µg/ml kanamycin, 8 µg/ml tetracycline and 100 µg/ml spectinomycin. Cointegrates were transferred by conjugation into E. coli NS2114SmRif carrying the cre gene.
  • H. pylori strains were naturally transformed with circular plasmid DNA (~2 µg per transformation). Briefly, bacteria were inoculated as 1 cm patches and grown for 5 h before addition of 10 µl supercoiled plasmid DNA. Each disrupted plasmid consisting either of a pool of disrupted plasmids when originating from library II or of a single recombinant plasmid for the non polar mutation construction was added to 4 independently prepared patches of H. pylori. After further incubation for 18 h, the bacteria from each individual patch were harvested and plated directly onto a single plate of selective medium (kanamycin, 25 µg/ml). Six individual kanamycin transformants were then subcultured. Chromosomal DNA was extracted using QIAamp kit extraction, and the constructed mutant characterized by several PCR controls and/or hybridization as described in the result section.
  • Example 10 Study of the interaction between two essential genes, HP1230-HP1529, by random mutagenesis.
  • Mutagenized SID1529 was obtained by PCR using the Taq polymerase (Stratagene) and 200 ng of Helicobacter pylori genomic DNA and the following oligonucleotides :
  • the PCR program was set up as follows :
  • the amplification was checked by agarose gel electrophoresis.
  • PCR fragments were purified with Qiaquick column (Qiagen) according to the manufacturer's protocol and digested (NotI-BamHI).
  • the vector (pP7-centro) (see Figure 20) was digested (NotI-BamHI) and dephosphorylated according to standard protocol (Sambrook et al.).
  • the genomic amplification of the HP1230 ORF was obtained by PCR using the Pfu proofreading Taq polymerase (Stratagene) and 200 ng of Helicobacter pylori genomic DNA as template.
  • the PCR program was set up as follows:
  • the amplification was checked by agarose gel electrophoresis.
  • the PCR fragments were purified with Qiaquick column (Qiagen) according to the manufacturer's protocol.
  • the digested PCR fragments were ligated into an adequately digested (BamHI-PstI) and dephosphorylated bait vector (pB1) according to standard protocol (Sambrook et al.) and were transformed into competent bacterial cells.
  • the cells were grown, the DNA extracted and the plasmid was sequenced.
  • HP1529 protein is expressed fused to the GAL4 Activation Domain (AD) in the pP6 plasmid
  • HP1230 is introduced in the p3H1 vector in fusion with the DNA-binding domain (DBD) of GAL4.
  • this vector contains the Met25 promoter which allows expression of a third partner in medium lacking methionine.
  • the resulting diploid strain was grown on a minimal medium lacking leucine and tryptophan to select for both plasmids (DO-2) and on DO-2 without histidine to select for interaction (DO-3). As a positive control, this strain was observed to grow on the selective medium for dilutions ranging from 1 to 10^-4 (Figure 19, lane 1). This result shows an interaction between HP1230 and HP1529 proteins, as previously identified using library screening (Rain et al., 2001).
  • Two different plasmids were used for this study: (i) the pP6 vector which contains the GAL4 activation domain (AD) (Rain et al., 2001). One of the HP1529 fragments (nucleotides 1-1374) obtained by screening the HP1230 protein was selected and used as prey in the pP6 vector fused to GAL4 AD; (ii) the p3H1 vector which contains the DNA-binding domain (DBD) of GAL4 and a methionine-regulated Met25 promoter (Tirode et al., 1997, J. Biol. Chem. 272: 22995-22999).
  • the HP1230 encoding sequence of 540 bp was sub-cloned from pB1-HP1230 into the BamHI/PstI sites of p3H1 as fusion protein with GAL4-DBD giving p3H1-HP1230.
  • the WT SID1529 or SID1529* (N38D-V53L) or SID1529* (V53L) were sub-cloned from pP7-centro (NotI-BamHI) to the NotI/BglII sites of p3H1-HP1230 under the control of the Met25 promoter. Expression from the Met25 promoter is obtained in the absence of methionine.
  • As negative control we used a prey encoding the HP0875 protein. All PCR fragments and in-frame fusions were checked by sequencing.
  • the pP6 and p3H1 derived-vectors were used to transform the Y187 and CG1945 yeast strains, respectively. Both strains were mated in YPD buffer
  • SID1529 derivatives might have some potential as lead compounds to inhibit
  • Example 11 Modulating compounds/PIM screening
  • Rashtchian A. (1995) Novel methods for cloning and engineering genes using the polymerase chain reaction. Current Opinion in Biotechnology 6: 30-36. Salama, N., Guillemin, K., McDaniel, T.K., Sherlock, G., Tompkins, L., and
  • HP0336 213 GAAGATTTAGGCTCGTTTTTTTTGAAGACGCTTTTGGGTTTGGCGCTAGGGGGAGTAAAAGGCAAAAAAGCTCTATCGC 214 EDLGSFFEDAFGFGARGSKRQ
  • HP0336 217 GGAGCTTGAAATTTTAAAATCTTATCTTAAAATCCCTTATACTTTACTAGAGACCAACACCCTAAATTCCAAGGCTTGT 218 ELEILKSYLKIPYTLLETNTLNS
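
The size criteria quoted in the entries above (a 300 + 900 + 300 bp product for the non-polar constructs, two PCR products whose sizes sum to that of the KO-ORF for the transposon mutants, and a cloned insert estimated as the PCR fragment minus 300 bp of flanking plasmid sequence) reduce to simple arithmetic. The Python sketch below is illustrative only: the function names and the `tolerance_bp` parameter are assumptions introduced here and are not part of the described protocol.

```python
# Illustrative sketch; sizes are in base pairs and taken from the text,
# while all names and the tolerance parameter are hypothetical.
FLANK_BP = 300          # first / last 300 nucleotides of the ORF in the construct
CASSETTE_BP = 900       # 0.9 kb promoter-less kanamycin cassette
PLASMID_FLANK_BP = 300  # plasmid sequence co-amplified with a prey insert


def expected_nonpolar_product_bp(flank_bp: int = FLANK_BP,
                                 cassette_bp: int = CASSETTE_BP) -> int:
    """Expected PCR product size after gene replacement: flank + cassette + flank."""
    return flank_bp + cassette_bp + flank_bp


def is_allelic_replacement(product_5p_bp: int, product_3p_bp: int,
                           ko_orf_bp: int, tolerance_bp: int = 0) -> bool:
    """True when the two PCR product sizes sum to the KO-ORF size.

    tolerance_bp is an assumption (gel sizing is approximate); the text
    itself asks for the sum to be identical, hence the default of 0.
    """
    return abs((product_5p_bp + product_3p_bp) - ko_orf_bp) <= tolerance_bp


def cloned_fragment_bp(pcr_fragment_bp: int,
                       plasmid_flank_bp: int = PLASMID_FLANK_BP) -> int:
    """Estimated insert length: PCR fragment minus amplified plasmid flanks."""
    return pcr_fragment_bp - plasmid_flank_bp
```

For instance, `expected_nonpolar_product_bp()` gives the 1.5 kb product mentioned in the text, and `cloned_fragment_bp(1200)` gives a 900 bp insert.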
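
The prey identification rule described above (keep the genome localization with the highest score, then accept it only if the insert lies within an Open Reading Frame and is in phase with it) can be sketched as follows. This is a minimal sketch under stated assumptions, not the Blastwun workflow itself: the `Hit` and `Orf` records, the 1-based inclusive coordinates, the forward-strand-only treatment, and the reading of "in phase" as "the insert starts on a codon boundary of the ORF" are all hypothetical choices made for illustration.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class Hit:
    """One localization of a prey insert on the genome (hypothetical record)."""
    start: int    # 1-based coordinate of the first insert base
    end: int      # 1-based coordinate of the last insert base (inclusive)
    score: float  # similarity score reported by the search


@dataclass(frozen=True)
class Orf:
    """An annotated ORF on the forward strand (hypothetical record)."""
    name: str
    start: int    # 1-based coordinate of the first base of the start codon
    end: int      # 1-based coordinate of the last base of the ORF (inclusive)


def best_hit(hits: Sequence[Hit]) -> Optional[Hit]:
    """Return the localization with the highest score, or None if there is none."""
    return max(hits, key=lambda h: h.score) if hits else None


def in_frame_orf(hit: Hit, orfs: Sequence[Orf]) -> Optional[Orf]:
    """Return the ORF that contains the insert and shares its reading phase."""
    for orf in orfs:
        inside = orf.start <= hit.start and hit.end <= orf.end
        in_phase = (hit.start - orf.start) % 3 == 0
        if inside and in_phase:
            return orf
    return None
```

A reverse-strand ORF would need the phase to be computed from the ORF end instead of its start; that case is left out to keep the sketch short.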


Abstract

The present invention relates to protein-protein interactions of Helicobacter pylori. More specifically, the present invention relates to complexes of polypeptides or polynucleotides encoding the polypeptides, fragments of the polypeptides, antibodies to the complexes, Selected Interacting Domains (SID®) which are identified due to the protein-protein interactions, methods for screening drugs for agents which modulate the interaction of proteins and pharmaceutical compositions that are capable of modulating the protein-protein interactions.

Description

PROTEIN-PROTEIN INTERACTIONS IN Helicobacter pylori
FIELD OF THE INVENTION
The present invention relates to proteins that interact with Helicobacter pylori. More specifically, the present invention relates to complexes of polypeptides or polynucleotides encoding the polypeptides, fragments of the polypeptides, antibodies to the complexes, Selected Interacting Domains (SID®) which are identified due to the protein-protein interactions, methods for screening drugs for agents which modulate the interaction of proteins and pharmaceutical compositions that are capable of modulating the protein-protein interactions.
In another embodiment the present invention provides a protein-protein interaction map called a PIM® which is available in a report relating to the protein-protein interactions of Helicobacter pylori.
BACKGROUND AND PRIOR ART
Most biological processes involve specific protein-protein interactions. Protein-protein interactions enable two or more proteins to associate. A large number of non-covalent bonds form between the proteins when two protein surfaces are precisely matched. These bonds account for the specificity of recognition. Thus, protein-protein interactions are involved, for example, in the assembly of enzyme subunits, in antibody-antigen recognition, in the formation of biochemical complexes, in the correct folding of proteins, in the metabolism of proteins, in the transport of proteins, in the localization of proteins, in protein turnover, in post-translational modifications, in the core structures of viruses and in signal transduction.
General methodologies to identify interacting proteins or to study these interactions have been developed. Among these methods is the two-hybrid system originally developed by Fields and co-workers and described, for example, in U.S. Patent Nos. 5,283,173, 5,468,614 and 5,667,973, which are hereby incorporated by reference.
The earliest and simplest two-hybrid system, which acted as basis for development of other versions, is an in vivo assay between two specifically constructed proteins. The first protein, known in the art as the "bait protein" is a chimeric protein which binds to a site on DNA upstream of a reporter gene by means of a DNA-binding domain or BD. Commonly, the binding domain is the DNA-binding domain from either Gal4 or native E. coli LexA and the sites placed upstream of the reporter are Gal4 binding sites or LexA operators, respectively.
The second protein is also a chimeric protein known as the "prey" in the art. This second chimeric protein carries an activation domain or AD. This activation domain is typically derived from Gal4, from VP16 or from B42.
Besides the two-hybrid systems, other improved systems have been developed to detect protein-protein interactions. For example, a two-hybrid plus one system was developed that allows the use of two proteins as bait to screen available cDNA libraries to detect a third partner. This method permits the detection of interactions between proteins that are part of a larger protein complex such as the RNA polymerase II holoenzyme and the TFIIH or TFIID complexes. Therefore, this method, in general, permits the detection of ternary complex formation as well as inhibitors preventing the interaction between the two previously defined fused proteins.
Another advantage of the two-hybrid plus one system is that it allows or prevents the formation of the transcriptional activator since the third partner can be expressed from a conditional promoter such as the methionine-repressed Met25 promoter which is positively regulated in medium lacking methionine. The presence of the methionine-regulated promoter provides an excellent control to evaluate the activation or inhibition properties of the third partner due to its "on" and "off" switch for the formation of the transcriptional activator. The three-hybrid method is described, for example, in Tirode et al., The Journal of Biological Chemistry, 272, No. 37 pp. 22995-22999 (1997), incorporated herein by reference.
Besides the two-hybrid and two-hybrid plus one systems, yet another variant is that described in Vidal et al., Proc. Natl. Acad. Sci. 93 pgs. 10315-10320, called the reverse two- and one-hybrid systems, where a collection of molecules can be screened that inhibit a specific protein-protein or protein/DNA interaction, respectively.
A summary of the available methodologies for detecting protein-protein interactions is described in Vidal and Legrain, Nucleic Acids Research Vol. 27, No. 4 pgs.919-929 (1999) and Legrain and Selig, FEBS Letters 480 pgs. 32-36 (2000) which references are incorporated herein by reference.
However, the above conventionally used approaches and especially the commonly used two-hybrid methods have their drawbacks. For example, it is known in the art that, more often than not, false positives and false negatives exist in the screening method. In fact, a doctrine has been developed in this field for interpreting the results and in common practice an additional technique such as co-immunoprecipitation or gradient sedimentation of the putative interactors from the appropriate cell or tissue type is generally performed. The methods used for interpreting the results are described by Brent and Finley, Jr. in Ann. Rev. Genet., 31 pgs. 663-704 (1997). Thus, the data interpretation is very questionable using the conventional systems.
One method to overcome the difficulties encountered with the methods in the prior art is described in WO 99/42612, incorporated herein by reference. This method is similar to the two-hybrid system described in the prior art in that it also uses bait and prey polypeptides. However, the difference with this method is that a step of mating at least one first haploid recombinant yeast cell containing the prey polypeptide to be assayed with a second haploid recombinant yeast cell containing the bait polynucleotide is performed. Of course the person skilled in the art would appreciate that either the first recombinant yeast cell or the second recombinant yeast cell also contains at least one detectable reporter gene that is activated by a polypeptide including a transcriptional activation domain.
The method described in WO 99/42612 permits the screening of more prey polynucleotides with a given bait polynucleotide in a single step than in the prior art systems due to the cell to cell mating strategy between haploid yeast cells. Furthermore, this method is more thorough and reproducible, as well as sensitive. Thus, the presence of false negatives and/or false positives is extremely minimal as compared to the conventional prior art methods.
One of the prokaryotic microorganisms studied by the inventors is Helicobacter pylori. Helicobacter pylori (H. pylori) is a microaerophilic, Gram-negative, slow-growing, spiral-shaped and flagellated organism. H. pylori was first isolated in 1983 from a gastric biopsy specimen of a patient with chronic gastritis (Marshall et al., 1984, Lancet, 1:1311-1314, Unidentified curved bacilli in the stomach of patients with gastritis and peptic ulceration).
Helicobacter pylori has been identified as a primary cause of chronic gastroduodenal disorders, such as gastritis, dyspepsia, and peptic ulcers, in humans. Studies have shown (Labigne et al.) that H. pylori can be successfully eradicated by a treatment combining two antibiotics with a proton pump inhibitor. However, few antibiotics are active against H. pylori and antibiotic-resistant strains have begun to appear.
The genome of H. pylori strain n° 26695 has been studied by Tomb et al. (Tomb et al., 1997, Nature, vol. 388, 539-547, The complete genome sequence of the gastric pathogen Helicobacter pylori). This strain's genome consists of a circular chromosome with a size of 1,667,867 bp, an average G + C content of 39%, and 1590 predicted coding sequences (open reading frames or "ORF").
The availability of the entire genome sequence of two clinical strains, 26695 and J99 (Alm et al, 1999) (Tomb et al, 1997), has encouraged more global approaches to the functional analysis of the whole set of genes. In H. pylori, as many as 42% of the encoded proteins still need to be assigned a biological function. Attempts to classify genes within functional categories, such as genes essential for viability or conditionally essential in a given environment, have been proposed by Akerley and collaborators (Akerley et al, 1998). This concept was first applied to H. pylori by Chalker et al. (Chalker et al, 2001), who analyzed for essentiality a set of genes selected by bioinformatic genome prioritization.
The bacterial factors necessary for colonization of the gastric environment, and for virulence of this pathogen, are poorly understood. Examples of known virulence factors are:
- Enzymes involved in neutralizing the acid gastric pH: the multisubunit urease is a characteristic enzyme that is crucial for survival in acidic pH and for successful colonization of the gastric environment, a site that few other microbes can colonize (Labigne et al., WO 93/07273, Helicobacter pylori genes necessary for the regulation and maturation of urease, and use thereof). Genes encoding ureases have been located on a 34 kb chromosome fragment and comprise ureA, ureB, ureC, ureD, ureE, ureF, ureG, ureH and ureI.
- Bacterial flagellar proteins responsible for motility across the mucous layer (Hazell et al., 1986, J. Inf. Dis., 153, 658-663, Campylobacter pyloridis and gastritis: association with intracellular spaces and adaptation to an environment of mucus as important factors in colonization of the gastric epithelium; Leying et al., 1992, Mol. Microbiol., 6, 2863-2874, Cloning and genetic characterization of Helicobacter pylori flagellin gene): flagellar filament biosynthesis comprises the A and B flagellins and the filament cap. These two biosyntheses are regulated by the flbA gene (Suerbaum et al., French patent application, 1995, n° 2,736,360, Cloning and characterization of flbA gene of Helicobacter pylori, aflagellated strains production).
- Two other essential toxins for virulence are VacA and CagA.
VacA is an H. pylori toxin that induces the formation of large acidic vacuoles in host epithelial cells. These large vacuoles originate from massive swelling of membranous compartments of late stages of the endocytic pathway (de Bernard et al., 1997, Microbiology, 26(4), 665-674, Helicobacter pylori toxin VacA induces vacuole formation by acting in the cell cytosol). Proof for receptor-mediated interaction with VacA has been made by Pagliaccia et al.; the m2 allele of the vacA gene has always been described as inactive in the in vitro HeLa cell assay; however, the m2 allele is associated with peptic ulcer and is prevalent in populations in which peptic ulcer and gastric cancer have high incidence (Pagliaccia et al., Proc. Natl. Acad. Sci. U.S.A, 1998, 95(17), 10212-10217, The m2 form of the Helicobacter pylori cytotoxin has cell type-specific vacuolating activity).
CagA is one of the proteins encoded by the "cag pathogenicity island" (Spohn et al. 1997, Molecular Microbiology, 26(2), 361-372, Transcriptional analysis of the divergent cagAB genes encoded by the pathogenicity island of Helicobacter pylori) found in H. pylori strains isolated from most patients with peptic ulcer disease and adenocarcinoma. CagA is produced by 50-60% of H. pylori strains; it is a high molecular weight (120-140 kDa) superficial protein and an immunodominant antigen with unknown function. H. pylori strains that produce CagA protein have two genes, cagB and cagC (36 and 101 kDa proteins, respectively). These genes are highly associated with duodenal ulcers (Blaser et al. 1996, WO 96/12825, cagB and cagC genes of Helicobacter pylori and related methods and compositions).
Other virulence factors include several gastric tissue-specific adhesins (Boren et al., 1993, Science, 262, 1892-1895).
Therapeutic agents are currently available that eradicate H. pylori infections in vitro. However, methods employing antibiotic agents result in the emergence of bacterial strains which are resistant to these agents.
Thus, it is an object of the present invention to identify protein-protein interactions for Helicobacter pylori.
It is another object of the present invention to identify protein-protein interactions of Helicobacter pylori for the development of more effective and better targeted therapeutic applications.
It is yet another object of the present invention to identify complexes of polypeptides or polynucleotides encoding the polypeptides and fragments of the polypeptides of Helicobacter pylori.
It is yet another object of the present invention to identify antibodies to these complexes of polypeptides or polynucleotides encoding the polypeptides and fragments of the polypeptides of Helicobacter pylori including polyclonal, as well as monoclonal antibodies that are used for detection.
It is still another object of the present invention to identify selected interacting domains of the polypeptides, called SID® polypeptides.
It is still another object of the present invention to identify selected interacting domains of the polynucleotides, called SID® polynucleotides.
It is another object of the present invention to generate protein-protein interactions maps called PIM®s.
It is yet another object of the present invention to classify genes of H. pylori into functional categories at the genomic scale such as genes essential or nonessential for viability.
It is yet another object of the present invention to identify Putative Essential Genes (PEGS) from H. pylori, as well as the true essential genes.
It is yet another object of the present invention to establish a large scale protein-protein interaction map of H. pylori using the two-hybrid system as a way to elucidate the function of yet uncharacterized proteins.
It is yet another object of the present invention to identify a superbinder phenotype in H. pylori with the two-hybrid system which completely inhibits specific protein-protein interactions.
It is yet another object of the present invention to identify oligopeptides, and their overlapping or combining derivatives, that inhibit H. pylori growth.
In yet another aspect, the present invention relates to the identification of ORFs (open reading frames) having enzymatic activity, which provides a direct way to screen lead compounds that abolish enzymatic activity through the disruption of the oligomeric interaction.
It is yet another object of the present invention to provide a method for screening drugs for agents which modulate the interaction of proteins and pharmaceutical compositions that are capable of modulating the protein-protein interactions of Helicobacter pylori.
It is another object to administer the nucleic acids of the present invention via gene therapy.
It is yet another object of the present invention to provide protein chips or protein microarrays.
It is yet another object of the present invention to provide a report in, for example, paper, electronic and/or digital forms, concerning the protein-protein interactions, the modulating compounds and the like, as well as a PIM®.
These and other objects are achieved by the present invention as evidenced by the summary of the invention, description of the preferred embodiments and the claims.
SUMMARY OF THE PRESENT INVENTION
Thus the present invention relates to a protein complex of polypeptides as described in Table 1.
Furthermore, the present invention provides SID® polynucleotides and SID® polypeptides as defined in Figure 2, as well as a PIM® for Helicobacter pylori.
The present invention also provides antibodies to the protein-protein complexes for Helicobacter pylori. In another embodiment the present invention provides a method for screening drugs for agents that modulate the protein-protein interactions and pharmaceutical compositions that are capable of modulating protein-protein interactions.
In another embodiment the present invention provides protein chips or protein microarrays.
In another embodiment the present invention identifies a superbinder phenotype in H. pylori with the two-hybrid system which completely inhibits protein-protein interactions.
In another aspect the present invention provides oligopeptides, or overlapping or combining derivatives thereof, that inhibit H. pylori growth.
In yet another embodiment, the present invention identifies ORFs having enzymatic activity which provides a direct way to screen lead compounds.
In yet another embodiment the present invention provides a report in, for example, paper, electronic and/or digital forms.
BRIEF DESCRIPTION OF THE DRAWINGS
Fig. 1 is a schematic representation of the pB1 plasmid.
Fig. 2 is a schematic representation of the pB5 plasmid.
Fig. 3 is a schematic representation of the pB6 plasmid.
Fig. 4 is a schematic representation of the pB13 plasmid.
Fig. 5 is a schematic representation of the pB14 plasmid.
Fig. 6 is a schematic representation of the pB20 plasmid.
Fig. 7 is a schematic representation of the pP1 plasmid.
Fig. 8 is a schematic representation of the pP2 plasmid.
Fig. 9 is a schematic representation of the pP3 plasmid.
Fig. 10 is a schematic representation of the pP6 plasmid.
Fig. 11 is a schematic representation of the pP7 plasmid.
Fig. 12 is a schematic representation of vectors expressing the T25 fragment.
Fig. 13 is a schematic representation of vectors expressing the T18 fragment.
Fig. 14 is a schematic representation of various vectors of pCmAHLI, pT25 and pT18.
Fig. 15 is a schematic representation identifying the SID®s of Helicobacter pylori. In this figure the "Full-length prey protein" is the Open Reading Frame (ORF) or coding sequence (CDS) where the identified prey polypeptides are included. The Selected Interaction Domain (SID®) is determined by the commonly shared polypeptide domain of every selected prey fragment.
Fig. 16 is a protein map (PIM®).
Fig. 17 is a gel illustrating the results obtained for the disruption of the ORFs hp0099 to hp0198. This figure exemplifies first that multiple insertions of the transposon took place and second that, in the majority of cases, transposon insertion occurred at a distance ranging from 100 to 600 bp from the 5'-end of the ORF, a distance compatible with the promotion of gene replacement by allelic recombination.
Fig. 18 is a schematic diagram of the procedure for classification of the genes as described in the present invention.
Fig. 19 shows the results of three-hybrid experiments. Growth phenotypes of diploid strains containing various plasmids were analyzed by incubating cells at various dilutions (from 1 to 10⁻⁴). Yeast growth was performed over 2 days at 30°C on DO-3+Met or DO-3-Met medium. Lane 1 contains cells with [p3H1-HP1230]+pP6-HP1529; lane 2, [p3H1-HP1230-SID1529WT]+pP6-HP1529; lane 3, [p3H1-HP1230-SID1529(N38D-V53L)]+pP6-HP1529; lane 4, [p3H1-HP1230-SID1529(V53L)]+pP6-HP1529; and lane 5, [p3H1-HP1230]+pP6-HP0875.
Fig. 20 is the pP7-centro vector.
DETAILED DESCRIPTION OF THE PREFERRED EMBODIMENTS
As used herein the terms "polynucleotides", "nucleic acids" and "oligonucleotides" are used interchangeably and include, but are not limited to, RNA, DNA and RNA/DNA sequences of more than one nucleotide in either single chain or duplex form. The polynucleotide sequences of the present invention may be prepared by any known method including, but not limited to, any synthetic method, any recombinant method, any ex vivo generation method and the like, as well as combinations thereof.
The term "polypeptide" means herein a polymer of amino acids having no specific length. Thus, peptides, oligopeptides and proteins are included in the definition of "polypeptide" and these terms are used interchangeably throughout the specification, as well as in the claims. The term "polypeptide" does not exclude posttranslational modifications such as polypeptides having covalent attachment of glycosyl groups, acetyl groups, phosphate groups, lipid groups and the like. Also encompassed by this definition of "polypeptide" are homologs thereof.
By the term "homologs" is meant structurally similar genes contained within a given species; "orthologs" are functionally equivalent genes from a given species or strain, as determined, for example, in a standard complementation assay. Thus, a polypeptide of interest can be used not only as a model for identifying similar genes in given strains, but also to identify homologs and orthologs of the polypeptide of interest in other species. The orthologs, for example, can also be identified in a conventional complementation assay. In addition or alternatively, such orthologs can be expected to exist in bacteria (or other kinds of cells) in the same branch of the phylogenetic tree, as set forth, for example, at ftp://ft.cme.msu.edu/pub/rdp/SSU-rRNA/SSU/Prok.phylo.
As used herein the term "prey polynucleotide" means a chimeric polynucleotide encoding a polypeptide comprising (i) a specific domain; and (ii) a polypeptide that is to be tested for interaction with a bait polypeptide. The specific domain is preferably a transcriptional activating domain.
As used herein, a "bait polynucleotide" is a chimeric polynucleotide encoding a chimeric polypeptide comprising (i) a complementary domain; and (ii) a polypeptide that is to be tested for interaction with at least one prey polypeptide.
The complementary domain is preferably a DNA-binding domain that recognizes a binding site that is further detected and is contained in the host organism.
As used herein, by "complementary domain" is meant a functional constitution of the activity when bait and prey are interacting; for example, enzymatic activity.
As used herein, by "specific domain" is meant a functional interacting activation domain that may work through different mechanisms by interacting directly or indirectly through intermediary proteins with RNA polymerase II or III-associated proteins in the vicinity of the transcription start site.
As used herein the term "complementary" means that, for example, each base of a first polynucleotide is paired with the complementary base of a second polynucleotide whose orientation is reversed. The complementary bases are A and T (or A and U) or C and G.
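As an illustration of this definition, the following minimal Python sketch pairs each base of a first polynucleotide with the base of a second polynucleotide read in reversed orientation. The function names are hypothetical and chosen for this example only; only the base pairs named above (A and T, or A and U, and C and G) are assumed.

```python
# Base pairs named in the definition: A-T (A-U for RNA) and C-G.
DNA_PAIRS = {"A": "T", "T": "A", "C": "G", "G": "C"}
RNA_PAIRS = {"A": "U", "U": "A", "C": "G", "G": "C"}


def reverse_complement(seq, rna=False):
    """Return the complementary strand, read in the reversed orientation."""
    pairs = RNA_PAIRS if rna else DNA_PAIRS
    return "".join(pairs[base] for base in reversed(seq.upper()))


def are_complementary(first, second):
    """True if each base of `first` pairs with `second` in reversed orientation (DNA)."""
    return len(first) == len(second) and reverse_complement(first) == second.upper()


if __name__ == "__main__":
    print(reverse_complement("GAAGATT"))
    print(are_complementary("ATGC", "GCAT"))
```

The check is deliberately strict: it requires pairing at every position and does not model mismatches or partial hybridization.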
The term "sequence identity" refers to the identity between two peptides or between two nucleic acids. Identity between sequences can be determined by comparing a position in each of the sequences which may be aligned for the purposes of comparison. When a position in the compared sequences is occupied by the same base or amino acid, then the sequences are identical at that position. A degree of sequence identity between nucleic acid sequences is a function of the number of identical nucleotides at positions shared by these sequences. A degree of identity between amino acid sequences is a function of the number of identical amino acids at positions shared by these sequences. Since two polynucleotides may each (i) comprise a sequence (i.e., a portion of a complete polynucleotide sequence) that is similar between the two polynucleotides, and (ii) further comprise a sequence that is divergent between the two polynucleotides, sequence identity comparisons between two or more polynucleotides are made over a "comparison window", which refers to a conceptual segment of at least 20 contiguous nucleotide positions wherein a polynucleotide sequence may be compared to a reference nucleotide sequence of at least 20 contiguous nucleotides and wherein the portion of the polynucleotide sequence in the comparison window may comprise additions or deletions (i.e., gaps) of 20 percent or less compared to the reference sequence (which does not comprise additions or deletions) for optimal alignment of the two sequences.
To determine the percent identity of two amino acid sequences or two nucleic acid sequences, the sequences are aligned for optimal comparison. For example, gaps can be introduced in the sequence of a first amino acid sequence or a first nucleic acid sequence for optimal alignment with the second amino acid sequence or second nucleic acid sequence. The amino acid residues or nucleotides at corresponding amino acid positions or nucleotide positions are then compared. When a position in the first sequence is occupied by the same amino acid residue or nucleotide as the corresponding position in the second sequence, the molecules are identical at that position.
The percent identity between the two sequences is a function of the number of identical positions shared by the sequences. Hence % identity = (number of identical positions / total number of overlapping positions) × 100.
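The percent identity relationship can be sketched in Python as follows. This is only an illustration under stated assumptions: the two sequences are taken to be already aligned and of equal length, gaps are written as "-", and positions where either sequence has a gap are assumed not to count as overlapping positions. The function name is hypothetical.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent identity of two pre-aligned sequences of equal length.

    Assumption: a column containing a gap in either sequence is not an
    overlapping position and is skipped.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    overlapping = 0
    identical = 0
    for a, b in zip(aligned_a, aligned_b):
        if a == gap or b == gap:
            continue  # gapped column: not counted as overlapping
        overlapping += 1
        if a.upper() == b.upper():
            identical += 1
    if overlapping == 0:
        return 0.0
    return 100.0 * identical / overlapping


if __name__ == "__main__":
    # Five overlapping positions, four of them identical.
    print(percent_identity("ACGT-A", "ACCTGA"))
```

Other conventions for the denominator exist (for example, counting gapped columns), so the gap handling here should be read as one possible choice.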
In this comparison the sequences can be the same length or may be different in length. Optimal alignment of sequences for determining a comparison window may be conducted by the local homology algorithm of Smith and Waterman (J. Theor. Biol., 91(2) pgs. 370-380 (1981)), by the homology alignment algorithm of Needleman and Wunsch (J. Mol. Biol., 48(3) pgs. 443-453 (1970)), by the search for similarity via the method of Pearson and Lipman (PNAS, USA, 85(5) pgs. 2444-2448 (1988)), by computerized implementations of these algorithms (GAP, BESTFIT, FASTA and TFASTA in the Wisconsin Genetics Software Package Release 7.0, Genetics Computer Group, 575 Science Drive, Madison, Wisconsin) or by inspection.
The best alignment (i.e., resulting in the highest percentage of identity over the comparison window) generated by the various methods is selected.
The term "sequence identity" means that two polynucleotide sequences are identical (i.e., on a nucleotide by nucleotide basis) over the window of comparison. The term "percentage of sequence identity" is calculated by comparing two optimally aligned sequences over the window of comparison, determining the number of positions at which the identical nucleic acid base (e.g., A, T, C, G, U, or I) occurs in both sequences to yield the number of matched positions, dividing the number of matched positions by the total number of positions in the window of comparison (i.e., the window size) and multiplying the result by 100 to yield the percentage of sequence identity. The same process can be applied to polypeptide sequences.
The percentage of sequence identity of a nucleic acid sequence or an amino acid sequence can also be calculated using BLAST software (Version 2.06 of September 1998) with the default or user-defined parameters.
The term "sequence similarity" means that amino acids can be modified while retaining the same function. It is known that amino acids are classified according to the nature of their side groups and some amino acids such as the basic amino acids can be interchanged for one another while their basic function is maintained.
The term "isolated" as used herein means that a biological material such as a nucleic acid or protein has been removed from its original environment in which it is naturally present. For example, a polynucleotide present in a plant, mammal or animal is present in its natural state and is not considered to be isolated. The same polynucleotide separated from the adjacent nucleic acid sequences in which it is naturally inserted in the genome of the plant or animal is considered as being "isolated."
The term "isolated" is not meant to exclude artificial or synthetic mixtures with other compounds, or the presence of impurities which do not interfere with the biological activity and which may be present, for example, due to incomplete purification, addition of stabilizers or mixtures with pharmaceutically acceptable excipients and the like.
"Isolated polypeptide" or "isolated protein" as used herein means a polypeptide or protein which is substantially free of those compounds that are normally associated with the polypeptide or protein in its natural state, such as other proteins or polypeptides, nucleic acids, carbohydrates, lipids and the like.
The term "purified" as used herein means at least one order of magnitude of purification is achieved, preferably two or three orders of magnitude, most preferably four or five orders of magnitude of purification of the starting material or of the natural material. Thus, the term "purified" as utilized herein does not mean that the material is 100% purified and thus excludes any other material.
The term "variants", when referring to, for example, polynucleotides encoding a polypeptide variant of a given reference polypeptide, means polynucleotides that differ from the reference polynucleotide but generally maintain the functional characteristics of the reference polypeptide. A variant of a polynucleotide may be a naturally occurring allelic variant or it may be a variant that is not known to occur naturally. Such non-naturally occurring variants of the reference polynucleotide can be made by, for example, mutagenesis techniques, including those mutagenesis techniques that are applied to polynucleotides, cells or organisms.
Generally, differences are limited so that the nucleotide sequences of the reference and variant are closely similar overall and, in many regions identical.
Variants of polynucleotides according to the present invention include, but are not limited to, nucleotide sequences which are at least 95% identical after alignment to the reference polynucleotide encoding the reference polypeptide. These variants can also have 96%, 97%, 98% and 99.999% sequence identity to the reference polynucleotide.
Nucleotide changes present in a variant polynucleotide may be silent, which means that these changes do not alter the amino acid sequences encoded by the reference polynucleotide.
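A silent change can be checked by translating the reference and variant coding sequences and comparing the encoded amino acids. The Python sketch below is illustrative only: it assumes the standard codon table, in-frame DNA sequences of equal length, and hypothetical function names. The first twelve nucleotides of the HP0336 fragment listed earlier in this document (GAAGATTTAGGC, encoding EDLG) serve as the reference; the variants are made up for the example.

```python
from itertools import product

# Standard codon table (an assumption of this sketch), built in TCAG order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna):
    """Translate an in-frame DNA sequence; trailing partial codons are ignored."""
    dna = dna.upper()
    usable = len(dna) - len(dna) % 3
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))


def is_silent(reference, variant):
    """True if the variant differs in nucleotides but encodes the same amino acids."""
    return (
        len(reference) == len(variant)
        and reference.upper() != variant.upper()
        and translate(reference) == translate(variant)
    )


if __name__ == "__main__":
    reference = "GAAGATTTAGGC"                 # encodes EDLG
    print(translate(reference))
    print(is_silent(reference, "GAGGATTTAGGC"))  # GAA -> GAG, both glutamate
    print(is_silent(reference, "GAAGTTTTAGGC"))  # GAT -> GTT, aspartate to valine
```

Insertions and deletions are outside the scope of this check, since it compares sequences of equal length codon by codon.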
Substitutions, additions and/or deletions can involve one or more nucleic acids. Alterations can produce conservative or non-conservative amino acid substitutions, deletions and/or additions.
Variants of a prey or a SID® polypeptide encoded by a variant polynucleotide can possess a higher affinity of binding and/or a higher specificity of binding to its protein or polypeptide counterpart, against which it has been initially selected. In another context, variants can also lose their ability to bind to their protein or polypeptide counterpart.
By "anabolic pathway" is meant a reaction or series of reactions in a metabolic pathway that synthesize complex molecules from simpler ones, usually requiring the input of energy. An anabolic pathway is the opposite of a catabolic pathway.
As used herein, a "catabolic pathway" is a series of reactions in a metabolic pathway that break down complex compounds into simpler ones, usually releasing energy in the process. A catabolic pathway is the opposite of an anabolic pathway.
As used herein, by "drug metabolism" is meant the study of how drugs are processed and broken down by the body. Drug metabolism can involve the study of enzymes that break down drugs, the study of how different drugs interact within the body and how diet and other ingested compounds affect the way the body processes drugs.
As used herein, "metabolism" means the sum of all of the enzyme- catalyzed reactions in living cells that transform organic molecules. By "secondary metabolism" is meant pathways producing specialized metabolic products that are not found in every cell.
As used herein, "SID®" means a Selected Interacting Domain and is identified as follows: for each bait polypeptide screened, selected prey polypeptides are compared. Overlapping fragments in the same ORF or CDS define the selected interacting domain.
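The SID® definition can be illustrated with a short Python sketch that, for one bait, groups the selected prey fragments by ORF and takes the region shared by all fragments of the same ORF. The coordinate convention (inclusive start and end positions on the ORF), the function names and the example coordinates are assumptions made for illustration; they are not taken from the screens described in this document.

```python
def selected_interacting_domain(fragments):
    """Region common to all prey fragments of one ORF.

    fragments: list of (start, end) inclusive coordinates on the same ORF.
    Returns (start, end) of the shared region, or None if nothing is shared.
    """
    if not fragments:
        return None
    start = max(s for s, _ in fragments)
    end = min(e for _, e in fragments)
    return (start, end) if start <= end else None


def sids_for_bait(prey_hits):
    """Group prey hits (orf, start, end) by ORF and compute one domain per ORF."""
    by_orf = {}
    for orf, start, end in prey_hits:
        by_orf.setdefault(orf, []).append((start, end))
    return {orf: selected_interacting_domain(frags) for orf, frags in by_orf.items()}


if __name__ == "__main__":
    # Hypothetical fragment coordinates for two prey ORFs.
    hits = [
        ("orfA", 10, 120),
        ("orfA", 40, 150),
        ("orfA", 30, 110),
        ("orfB", 5, 60),
    ]
    print(sids_for_bait(hits))
```

Taking the strict intersection is one reading of "overlapping fragments"; a screen with outlier fragments might call for a more tolerant rule.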
As used herein the term "PIM®" means a protein-protein interaction map. This map is obtained from data acquired from a number of separate screens using different bait polypeptides and is designed to map out all of the interactions between the polypeptides.
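As a non-limiting illustration, data from separate screens can be combined into such a map by recording each bait/prey pair as an undirected link. The Python sketch below is one possible representation only; the function name and the ORF labels are hypothetical placeholders and do not correspond to entries of Table 1.

```python
from collections import defaultdict


def build_interaction_map(screens):
    """Combine screen results into an undirected interaction map.

    screens: dict mapping a bait identifier to an iterable of prey identifiers
    selected with that bait.
    """
    pim = defaultdict(set)
    for bait, preys in screens.items():
        for prey in preys:
            pim[bait].add(prey)
            pim[prey].add(bait)  # record the interaction in both directions
    return dict(pim)


# Hypothetical screens using two different bait polypeptides.
screens = {"ORF_A": ["ORF_B", "ORF_C"], "ORF_C": ["ORF_D"]}
pim = build_interaction_map(screens)
print(sorted(pim["ORF_C"]))
```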
The term "affinity of binding", as used herein, can be defined as the affinity constant Ka of a given SID® polypeptide of the present invention that binds to a polypeptide, and is given by the following mathematical relationship:

$$K_a = \frac{[\text{SID®/polypeptide complex}]}{[\text{free SID®}]\,[\text{free polypeptide}]}$$

wherein [free SID®], [free polypeptide] and [SID®/polypeptide complex] are the concentrations at equilibrium, respectively, of the free SID® polypeptide, of the free polypeptide onto which the SID® polypeptide binds and of the complex formed between the SID® polypeptide and the polypeptide onto which said SID® polypeptide specifically binds.
The affinity of a SID® polypeptide of the present invention or a variant thereof for its polypeptide counterpart can be assessed, for example, on a Biacore™ apparatus marketed by Amersham Pharmacia Biotech Company such as described by Szabo et al Curr Opin Struct Biol 5 pgs. 699-705 (1995) and by Edwards and Leatherbarrow, Anal. Biochem 246 pgs. 1-6 (1997). As used herein the phrase "at least the same affinity" with respect to the binding affinity between a SID® polypeptide of the present invention and another polypeptide means that the Ka is identical or can be at least two-fold, at least three-fold or at least five-fold greater than the Ka value of reference.
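For illustration, the affinity constant defined above and the "at least the same affinity" criterion can be expressed as a short calculation. The Python sketch below assumes that the equilibrium concentrations are supplied in the same molar units; the function names and the numerical values are hypothetical and are not measured data.

```python
def affinity_constant(complex_conc, free_sid_conc, free_polypeptide_conc):
    """Ka = [complex] / ([free SID] x [free polypeptide]) at equilibrium."""
    if free_sid_conc <= 0 or free_polypeptide_conc <= 0:
        raise ValueError("free concentrations must be positive")
    return complex_conc / (free_sid_conc * free_polypeptide_conc)


def has_at_least_same_affinity(ka_variant, ka_reference, min_fold=1.0):
    """True if the variant Ka is at least min_fold times the reference Ka."""
    return ka_variant >= min_fold * ka_reference


# Hypothetical equilibrium concentrations, in mol/L.
ka_reference = affinity_constant(4.0e-8, 2.0e-8, 1.0e-6)
ka_variant = affinity_constant(8.0e-8, 2.0e-8, 1.0e-6)
print(has_at_least_same_affinity(ka_variant, ka_reference, min_fold=2.0))
```

Setting `min_fold` to 2.0, 3.0 or 5.0 corresponds to the two-fold, three-fold and five-fold thresholds recited above.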
As used herein, the term "modulating compound" means a compound that inhibits or stimulates or can act on another protein which can inhibit or stimulate the protein-protein interaction of a complex of two polypeptides or the protein-protein interaction of two polypeptides.
More specifically, the present invention comprises complexes of polypeptides or polynucleotides encoding the polypeptides composed of a bait polypeptide, or a bait polynucleotide encoding a bait polypeptide and a prey polypeptide or a prey polynucleotide encoding a prey polypeptide. The prey polypeptide or prey polynucleotide encoding the prey polypeptide is capable of interacting with a bait polypeptide of interest in various hybrid systems.
As described in the Background of the present invention there are various methods known in the art to identify prey polypeptides that interact with bait polypeptides of interest. These methods include, but are not limited to, generic two-hybrid systems as described by Fields et al in Nature, 340:245-246 (1989) and more specifically in U.S. Patent Nos. 5,283,173, 5,468,614 and 5,667,973, which are hereby incorporated by reference; the reverse two-hybrid system described by Vidal et al, supra; the two plus one hybrid method described, for example, in Tirode et al, supra; the yeast forward and reverse 'n'-hybrid systems as described in Vidal and Legrain, supra; the method described in WO 99/42612; those methods described in Legrain et al FEBS Letters 480 pgs. 32-36 (2000) and the like.
The present invention is not limited to the type of method utilized to detect protein-protein interactions and therefore any method known in the art and variants thereof can be used. It is, however, preferable to use the method described in WO 99/42612 or WO 00/66722, both of which are incorporated herein by reference, owing to the methods' sensitivity, reproducibility and reliability. Protein-protein interactions can also be detected using complementation assays such as those described by Pelletier et al at http://www.abrf.org/JBT/Articles/JBT0012/jbt0012.html, WO 00/07038 and WO 98/34120.
Although the above methods are described for applications in the yeast system, the present invention is not limited to detecting protein-protein interactions using yeast, but also includes similar methods that can be used in detecting protein-protein interactions in, for example, mammalian systems as described, for example in Takacs et al., Proc. Nat. Acad. Sci., USA, 90 (21): 10375 (1993) and Vasavada et al., Proc. Nat. Acad. Sci., USA, 88 (23): 10686-90 (1991), as well as a bacterial two-hybrid system as described in Karimova et al (1998), WO 99/28746, WO 00/66722 and Legrain et al FEBS Letters, 480 pgs. 32-36 (2000).
The above-described methods are limited to the use of yeast, mammalian cells and Escherichia coli cells, however the present invention is not limited in this manner. Consequently, mammalian and typically human cells, as well as bacterial, yeast, fungus, insect, nematode and plant cells are encompassed by the present invention and may be transfected by the nucleic acid or recombinant vector as defined herein.
Examples of suitable cells include, but are not limited to, VERO cells, HELA cells such as ATCC No. CCL2, CHO cell lines such as ATCC No. CCL61, COS cells such as COS-7 cells and ATCC No. CRL 1650 cells, WI38, BHK, HepG2, 3T3 such as ATCC No. CRL6361, A549, PC12, K562 cells, 293 cells, Sf9 cells such as ATCC No. CRL1711 and Cv1 cells such as ATCC No. CCL70.
Other suitable cells that can be used in the present invention include, but are not limited to, prokaryotic host cell strains such as Escherichia coli (e.g., strain DH5-α), Bacillus subtilis, Salmonella typhimurium, or strains of the genera of Pseudomonas, Streptomyces and Staphylococcus.
Further suitable cells that can be used in the present invention include yeast cells such as those of Saccharomyces such as Saccharomyces cerevisiae. The bait polynucleotide, as well as the prey polynucleotide, can be prepared according to the methods known in the art such as those described above in the publications and patents reciting the known method per se.
The bait polynucleotide of the present invention is obtained from genomic DNA of Helicobacter pylori. The prey polynucleotide is obtained from genomic DNA of Helicobacter pylori, variants of genomic DNA of Helicobacter pylori, and fragments from the genome or transcriptome of Helicobacter pylori ranging from about 20 to 5000. The prey polynucleotide is then selected, sequenced and identified.
A genomic DNA prey library is prepared from Helicobacter pylori and constructed in the specially designed prey vector pP6 as shown in Figure 10 after ligation of suitable linkers such that every genomic DNA insert is fused to a nucleotide sequence in the vector that encodes the transcription activation domain of a reporter gene. Any transcription activation domain can be used in the present invention. Examples include, but are not limited to, Gal4, VP16, B42, His and the like.
Toxic reporter genes, such as CATR, CYH2, CYH1, URA3, bacterial and fungal toxins and the like can be used in reverse two-hybrid systems.
The polypeptides encoded by the nucleotide inserts of the genomic DNA prey library thus prepared are termed "prey polypeptides" in the context of the presently described selection method of the prey polynucleotides.
The bait polynucleotide can be inserted in a bait plasmid as illustrated in Figure 1. The bait polynucleotide insert is fused to a polynucleotide encoding a DNA binding domain, for example the Gal4 DNA binding domain, and the shuttle expression vector is used to transform cells.
As stated above, any cells can be utilized in transforming the bait and prey polynucleotides of the present invention including mammalian cells, bacterial cells, yeast cells, insect cells and the like. In an embodiment, the present invention identifies protein-protein interactions in yeast. Using known methods, a prey positive clone is identified containing a vector which comprises a nucleic acid insert encoding a prey polypeptide which binds to a bait polypeptide of interest. The method in which protein-protein interactions are identified comprises the following steps:
i) mating at least one first haploid recombinant yeast cell clone from a recombinant yeast cell clone library that has been transformed with a plasmid containing the prey polynucleotide to be assayed with a second haploid recombinant yeast cell clone transformed with a plasmid containing a bait polynucleotide encoding for the bait polypeptide;
ii) cultivating diploid cell clones obtained in step i) on a selective medium; and
iii) selecting recombinant cell clones which grow on the selective medium.
This method may further comprise the step of:
iv) characterizing the prey polynucleotide contained in each recombinant cell clone which is selected in step iii).
In yet another embodiment of the present invention, in lieu of yeast, Escherichia coli is used in a bacterial two-hybrid system, which encompasses a similar principle to that described above for yeast, but does not involve mating for characterizing the prey polynucleotide.
In yet another embodiment of the present invention, mammalian cells and a method similar to that described above for yeast for characterizing the prey polynucleotide are used.
By performing the yeast, bacterial or mammalian two-hybrid system it is possible to identify for one particular bait an interacting prey polypeptide. The prey polynucleotide that has been selected by testing the library of preys in a screen using the two-hybrid, two plus one hybrid methods and the like encodes the polypeptide interacting with the protein of interest.
The present invention is also directed, in a general aspect, to a complex of polypeptides, or of polynucleotides encoding the polypeptides, composed of a bait polypeptide or bait polynucleotide encoding the bait polypeptide and a prey polypeptide or prey polynucleotide encoding the prey polypeptide capable of interacting with the bait polypeptide of interest. These complexes are identified in Table 1, as the bait amino acid sequences and the prey amino acid sequences, as well as the bait and prey nucleic acid sequences.
In another aspect, the present invention relates to a complex of polynucleotides consisting of a first polynucleotide, or a fragment thereof, encoding a prey polypeptide that interacts with a bait polypeptide and a second polynucleotide or a fragment thereof. This fragment has at least 20 consecutive nucleotides, but can have between 20 and 5,000 consecutive nucleotides, or between 12 and 10,000 consecutive nucleotides or between 12 and 20,000 consecutive nucleotides.
The polypeptides of column 3 encoded by the polynucleotides of column 2 in Tables 2 and 7 and the polypeptides of column 5 encoded by the polynucleotides of column 3 in Table 8 according to the present invention and the complexes of the two polypeptides encoded by the sets of two polynucleotides also form part of the present invention. In yet another embodiment, the present invention relates to an isolated complex of at least two polypeptides encoded by two polynucleotides wherein said two polypeptides are associated in the complex by affinity binding and are depicted in Table 1 and Table 8.
In yet another embodiment, the present invention relates to an isolated complex comprising at least a polypeptide encoded by an ORF (HP####) of column 1 of Table 1 and a polypeptide encoded by an ORF (HP####) of column 2 of Table 1 and Table 8. The present invention is not limited to these polypeptide complexes alone but also includes the isolated complex of the two polypeptides in the form of fragments and/or homologous polypeptides exhibiting at least 95% sequence identity, as well as from 96% sequence identity to 99.999% sequence identity.
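As an illustration of the sequence identity thresholds recited above, percent identity can be computed position by position. The Python sketch below assumes two sequences that have already been aligned to equal length without gaps, which is a simplification; the function names and the example peptides are hypothetical.

```python
def percent_identity(seq_a, seq_b):
    """Percentage of identical positions between two pre-aligned sequences."""
    if not seq_a or len(seq_a) != len(seq_b):
        raise ValueError("sequences must be non-empty and of equal length")
    matches = sum(a == b for a, b in zip(seq_a.upper(), seq_b.upper()))
    return 100.0 * matches / len(seq_a)


def meets_identity_threshold(seq_a, seq_b, threshold=95.0):
    """True if the two sequences share at least `threshold` percent identity."""
    return percent_identity(seq_a, seq_b) >= threshold


# Hypothetical 20-residue peptides differing at a single position.
reference = "MKTAYIAKQRQISFVKSHFS"
homologue = "MKTAYIAKQRQISFVKSHFT"
print(percent_identity(reference, homologue))
print(meets_identity_threshold(reference, homologue))
```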
Also encompassed in another embodiment of the present invention is an isolated complex in which the SID® polypeptides (see even SEQ ID Nos. 2 to 3256 in column 3 of Table 2, even SEQ ID Nos. 6590 to 6594 in Table 7 and even SEQ ID Nos. 6596 to 6644 in Table 8) of the prey polypeptides, encoded by uneven SEQ ID Nos. 1 to 3255 in column 2 of Table 2, uneven SEQ ID Nos. 6589 to 6593 in Table 7 and uneven SEQ ID Nos. 6595 to 6643 in Table 8, form the isolated complex.
Besides the isolated complexes described above, nucleic acids coding for a Selected Interacting Domain (SID®) polypeptide or a variant thereof or any of the nucleic acids set forth in Tables 2, 7 and 8 can be inserted into an expression vector which contains the necessary elements for the transcription and translation of the inserted protein-coding sequence. Such transcription elements include a regulatory region and a promoter. Thus, the nucleic acid which may encode a marker compound of the present invention is operably linked to a promoter in the expression vector. The expression vector may also include a replication origin.
A wide variety of host/expression vector combinations are employed in expressing the nucleic acids of the present invention. Useful expression vectors that can be used include, for example, segments of chromosomal, non-chromosomal and synthetic DNA sequences. Suitable vectors include, but are not limited to, derivatives of SV40 and pcDNA and known bacterial plasmids such as ColE1, pCR1, pBR322, pMal-C2, pET, pGEX as described by Smith et al (1988), pMB9 and derivatives thereof, plasmids such as RP4, phage DNAs such as the numerous derivatives of phage λ such as NM989, as well as other phage DNA such as M13 and filamentous single stranded phage DNA; yeast plasmids such as the 2 micron plasmid or derivatives of the 2 micron plasmid, as well as centromeric and integrative yeast shuttle vectors; vectors useful in eukaryotic cells such as vectors useful in insect or mammalian cells; vectors derived from combinations of plasmids and phage DNAs, such as plasmids that have been modified to employ phage DNA or the expression control sequences and the like.
For example, in a baculovirus expression system, both non-fusion transfer vectors, such as, but not limited to, pVL941 (BamHI cloning site; Summers), pVL1393 (BamHI, SmaI, XbaI, EcoRI, NotI, XmaIII, BglII and PstI cloning sites; Invitrogen), pVL1392 (BglII, PstI, NotI, XmaIII, EcoRI, XbaI, SmaI and BamHI cloning sites; Summers and Invitrogen) and pBlueBacIII (BamHI, BglII, PstI, NcoI and HindIII cloning sites, with blue/white recombinant screening; Invitrogen), and fusion transfer vectors such as, but not limited to, pAc700 (BamHI and KpnI cloning sites, in which the BamHI recognition site begins with the initiation codon; Summers), pAc701 and pAc702 (same as pAc700, with different reading frames), pAc360 (BamHI cloning site 36 base pairs downstream of a polyhedrin initiation codon; Invitrogen (195)) and pBlueBacHisA, B, C (three different reading frames with BamHI, BglII, PstI, NcoI and HindIII cloning sites, an N-terminal peptide for ProBond purification and blue/white recombinant screening of plaques; Invitrogen (220)) can be used.
Mammalian expression vectors contemplated for use in the invention include vectors with inducible promoters, such as the dihydrofolate reductase promoters, any expression vector with a DHFR expression cassette or a DHFR/methotrexate co-amplification vector such as pED (PstI, SalI, SbaI, SmaI and EcoRI cloning sites, with the vector expressing both the cloned gene and DHFR; Kaufman, 1991). Alternatively, a glutamine synthetase/methionine sulfoximine co-amplification vector can be used, such as pEE14 (HindIII, XbaI, SmaI, SbaI, EcoRI and BclI cloning sites in which the vector expresses glutamine synthetase and the cloned gene; Celltech). A vector that directs episomal expression under the control of the Epstein Barr Virus (EBV) or nuclear antigen (EBNA) can be used such as pREP4 (BamHI, SfiI, XhoI, NotI, NheI, HindIII, NheI, PvuII and KpnI cloning sites, constitutive RSV-LTR promoter, hygromycin selectable marker; Invitrogen), pCEP4 (BamHI, SfiI, XhoI, NotI, NheI, HindIII, NheI, PvuII and KpnI cloning sites, constitutive hCMV immediate early gene promoter, hygromycin selectable marker; Invitrogen), pMEP4 (KpnI, PvuI, NheI, HindIII, NotI, XhoI, SfiI, BamHI cloning sites, inducible metallothionein IIa gene promoter, hygromycin selectable marker; Invitrogen), pREP8 (BamHI, XhoI, NotI, HindIII, NheI and KpnI cloning sites, RSV-LTR promoter, histidinol selectable marker; Invitrogen), pREP9 (KpnI, NheI, HindIII, NotI, XhoI, SfiI, BamHI cloning sites, RSV-LTR promoter, G418 selectable marker; Invitrogen), and pEBVHis (RSV-LTR promoter, hygromycin selectable marker, N-terminal peptide purifiable via ProBond resin and cleaved by enterokinase; Invitrogen).
Selectable mammalian expression vectors for use in the invention include, but are not limited to, pRc/CMV (HindIII, BstXI, NotI, SbaI and ApaI cloning sites, G418 selection; Invitrogen), pRc/RSV (HindII, SpeI, BstXI, NotI, XbaI cloning sites, G418 selection; Invitrogen) and the like. Vaccinia virus mammalian expression vectors (see, for example, Kaufman 1991) that can be used in the present invention include, but are not limited to, pSC11 (SmaI cloning site, TK- and β-gal selection), pMJ601 (SalI, SmaI, AflI, NarI, BspMII, BamHI, ApaI, NheI, SacII, KpnI and HindIII cloning sites; TK- and β-gal selection), pTKgptF1S (EcoRI, PstI, SalI, AccI, HindII, SbaI, BamHI and Hpa cloning sites, TK or XPRT selection) and the like.
Yeast expression systems that can also be used in the present invention include, but are not limited to, the non-fusion pYES2 vector (XbaI, SphI, ShoI, NotI, GstXI, EcoRI, BstXI, BamHI, SacI, KpnI and HindIII cloning sites; Invitrogen), the fusion pYESHisA, B, C (XbaI, SphI, ShoI, NotI, BstXI, EcoRI, BamHI, SacI, KpnI and HindIII cloning sites, N-terminal peptide purified with ProBond resin and cleaved with enterokinase; Invitrogen), pRS vectors and the like.
Consequently, mammalian and typically human cells, as well as bacterial, yeast, fungal, insect, nematode and plant cells can be used in the present invention and may be transfected by the nucleic acid or recombinant vector as defined herein.
Examples of suitable cells include, but are not limited to, VERO cells, HELA cells such as ATCC No. CCL2, CHO cell lines such as ATCC No. CCL61, COS cells such as COS-7 cells and ATCC No. CRL 1650 cells, WI38, BHK, HepG2, 3T3 such as ATCC No. CRL6361, A549, PC12, K562 cells, 293 cells, Sf9 cells such as ATCC No. CRL1711 and Cv1 cells such as ATCC No. CCL70.
Other suitable cells that can be used in the present invention include, but are not limited to, prokaryotic host cell strains such as Escherichia coli (e.g., strain DH5-α), Bacillus subtilis, Salmonella typhimurium, or strains of the genera of Pseudomonas, Streptomyces and Staphylococcus.
Further suitable cells that can be used in the present invention include yeast cells such as those of Saccharomyces such as Saccharomyces cerevisiae.
Besides the specific isolated complexes, as described above, the present invention relates to and also encompasses SID® polynucleotides. As explained above, for each bait polypeptide, several prey polypeptides may be identified by comparing and selecting the intersection of all isolated fragments that are included in the same polypeptide, as set forth in Example 5. Thus the SID® polynucleotides of the present invention are represented by the nucleic acid sequences of uneven SEQ ID Nos. 1 to 3255 in column 2 of Table 2, uneven SEQ ID Nos. 6589 to 6593 in Table 7 and uneven SEQ ID Nos. 6595 to 6643 in Table 8 encoding the SID® polypeptides of even SEQ ID Nos. 2 to 3256 of Table 2, the even SEQ ID Nos. 6590 to 6594 in Table 7 and the even SEQ ID Nos. 6596 to 6644 in Table 8.
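The numbering convention recited above, in which uneven SEQ ID Nos. designate SID® polynucleotides and even SEQ ID Nos. designate SID® polypeptides, can be sketched as a simple lookup. The Python sketch below assumes that each uneven SEQ ID No. N is paired with the even SEQ ID No. N + 1, which is inferred from the ranges and is not stated explicitly; the function name is hypothetical.

```python
# Ranges of uneven (polynucleotide) SEQ ID Nos. as recited for each Table.
SID_NUCLEOTIDE_RANGES = {
    "Table 2": (1, 3255),
    "Table 7": (6589, 6593),
    "Table 8": (6595, 6643),
}


def polypeptide_seq_id(nucleotide_seq_id):
    """Return (table, polypeptide SEQ ID No.) for a SID polynucleotide SEQ ID No.

    Assumption: the polypeptide encoded by uneven SEQ ID No. N is SEQ ID No. N + 1.
    """
    for table, (first, last) in SID_NUCLEOTIDE_RANGES.items():
        if first <= nucleotide_seq_id <= last and nucleotide_seq_id % 2 == 1:
            return table, nucleotide_seq_id + 1
    raise ValueError(f"SEQ ID No. {nucleotide_seq_id} is not a SID polynucleotide")


print(polypeptide_seq_id(1))
print(polypeptide_seq_id(6643))
```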
The present invention is not limited to the SID® nucleic acid sequences as described in the above paragraph, but also includes fragments of these sequences having at least 6 consecutive nucleic acids, between 6 and 5,000 consecutive nucleic acids and between 6 and 10,000 consecutive nucleic acids and between 6 and 20,000 consecutive nucleic acids, as well as variants thereof. The fragments or variants of the SID® sequences possess at least the same affinity of binding to their protein or polypeptide counterpart, against which they have been initially selected. Moreover, these variants and/or fragments of the SID® sequences alternatively can have between 95% and 99.999% sequence identity to their protein or polypeptide counterpart. According to the present invention the variants can be created by known mutagenesis techniques either in vitro or in vivo. Such a variant can be created such that it has altered binding characteristics with respect to the target protein and more specifically that the variant binds the target sequence with either higher or lower affinity.
Polynucleotides that are complementary to the above sequences which include the polynucleotides of the SID®s, their fragments, variants and those that have specific sequence identity are also included in the present invention.
The polynucleotide encoding the SID® polypeptide, fragment or variant thereof can also be inserted into recombinant vectors which are described in detail above.
The present invention also relates to a composition comprising the above-mentioned recombinant vectors containing the SID® polypeptides in Tables 2, 7 and 8, fragments or variants thereof, as well as recombinant host cells transformed by the vectors. The recombinant host cells that can be used in the present invention were discussed in greater detail above.
The compositions comprising the recombinant vectors can contain physiologically acceptable carriers such as diluents, adjuvants, excipients and any vehicle in which this composition can be delivered therapeutically and can include, but are not limited to, sterile liquids such as water and oils.
In yet another embodiment, the present invention relates to a method of selecting modulating compounds, as well as the modulating molecules or compounds themselves which may be used in a pharmaceutical composition. These modulating compounds may act as a cofactor, as an inhibitor, as antibodies, as tags, as a competitive inhibitor, as an activator or alternatively have agonistic or antagonistic activity on the protein-protein interactions.
The activity of the modulating compound does not necessarily, for example, have to be 100% activation or inhibition. Indeed, even partial activation or inhibition can be achieved that is of pharmaceutical interest. The modulating compound can be selected according to a method which comprises:
(a) cultivating a recombinant host cell with a modulating compound on a selective medium and a reporter gene the expression of which is toxic for said recombinant host cell wherein said recombinant host cell is transformed with two vectors:
(i) wherein said first vector comprises a polynucleotide encoding a first hybrid polypeptide having a DNA binding domain;
(ii) wherein said second vector comprises a polynucleotide encoding a second hybrid polypeptide having a transcriptional activating domain that activates said toxic reporter gene when the first and second hybrid polypeptides interact;
(b) selecting said modulating compound which inhibits or permits the growth of said recombinant host cell.
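The logic of the selection method above can be sketched as follows, for illustration only. The Python sketch assumes a simplified read-out in which the toxic reporter gene is expressed whenever the two hybrid polypeptides interact, so that growth of the host cell on the selective medium in the presence of a compound indicates that the compound inhibits the interaction; the function names and compound labels are hypothetical.

```python
def host_cell_grows(hybrids_interact):
    """Host cell growth in a reverse two-hybrid read-out (simplified assumption).

    The toxic reporter is switched on only when the first and second hybrid
    polypeptides interact, and its expression prevents growth.
    """
    toxic_reporter_on = hybrids_interact
    return not toxic_reporter_on


def select_inhibiting_compounds(growth_observations):
    """Return the compounds in whose presence the host cell grew.

    growth_observations: dict mapping a compound label to True (growth
    observed on the selective medium) or False (no growth).
    """
    return [name for name, grew in growth_observations.items() if grew]


# Hypothetical screening observations.
print(host_cell_grows(hybrids_interact=True))
print(select_inhibiting_compounds({"compound_A": True, "compound_B": False}))
```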
Thus, the present invention relates to a modulating compound that inhibits the protein-protein interactions of a complex of two polypeptides of Table 1 and Table 8. The present invention also relates to a modulating compound that activates the protein-protein interactions of a complex of two polypeptides of Table 1 and Table 8.
In yet another embodiment, the present invention relates to a method of selecting a modulating compound, which modulating compound inhibits the interactions of two polypeptides of Table 1. This method comprises:
(a) cultivating a recombinant host cell with a modulating compound on a selective medium and a reporter gene the expression of which is toxic for said recombinant host cell wherein said recombinant host cell is transformed with two vectors:
(i) wherein said first vector comprises a polynucleotide encoding a first hybrid polypeptide having a first domain of an enzyme;
(ii) wherein said second vector comprises a polynucleotide encoding a second hybrid polypeptide having an enzymatic transcriptional activating domain that activates said toxic reporter gene when the first and second hybrid polypeptides interact;
(b) selecting said modulating compound which inhibits or permits the growth of said recombinant host cell.
In the two methods described above any toxic reporter gene can be utilized including those reporter genes that can be used for negative selection including the URA3 gene, the CYH1 gene, the CYH2 gene and the like.
In yet another embodiment, the present invention provides a kit for screening a modulating compound. This kit comprises a recombinant host cell which comprises a reporter gene the expression of which is toxic for the recombinant host cell. The host cell is transformed with two vectors. The first vector comprises a polynucleotide encoding a first hybrid polypeptide having a DNA binding domain; and a second vector comprises a polynucleotide encoding a second hybrid polypeptide having a transcriptional activating domain that activates said toxic reporter gene when the first and second hybrid polypeptides interact.
In yet another embodiment a kit is provided for screening a modulating compound by providing a recombinant host cell, as described in the paragraph above, but instead of a DNA binding domain, the first vector comprises a first hybrid polypeptide containing a first domain of a protein. The second vector comprises a second polypeptide containing a second part of a complementary domain of a protein that activates the toxic reporter gene when the first and second hybrid polypeptides interact.
In the selection methods described above, the activating domain can be B42, Gal4 or VP16 (HSV) and the DNA-binding domain can be derived from Gal4 or LexA. The protein or enzyme can be adenylate cyclase, guanylate cyclase, DHFR and the like.
The SID®s in Tables 2, 7 and 8 may be used as modulating compounds.
In yet another embodiment, the present invention relates to a pharmaceutical composition comprising the modulating compounds for preventing or treating ulcers in a human or animal, most preferably in a mammal.
This pharmaceutical composition comprises a pharmaceutically acceptable amount of the modulating compound. The pharmaceutically acceptable amount can be estimated from cell culture assays. For example, a dose can be formulated in animal models to achieve a circulating concentration range that includes or encompasses a concentration point or range having the desired effect in an in vitro system. This information can thus be used to accurately determine the doses in other mammals, including humans and animals.
The therapeutically effective dose refers to that amount of the compound that results in amelioration of symptoms in a patient. Toxicity and therapeutic efficacy of such compounds can be determined by standard pharmaceutical procedures in cell cultures or in experimental animals. For example, the LD50 (the dose lethal to 50% of the population) as well as the ED50 (the dose therapeutically effective in 50% of the population) can be determined using methods known in the art. The dose ratio between toxic and therapeutic effects is the therapeutic index, which can be expressed as the ratio LD50/ED50. Compounds that exhibit high therapeutic indexes are preferred.
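As a worked illustration of the dose ratio described above, the therapeutic index can be computed directly from the LD50 and the ED50. The Python sketch below assumes that both doses are expressed in the same units; the function name and the numerical values are hypothetical and are not experimental results.

```python
def therapeutic_index(ld50, ed50):
    """Therapeutic index expressed as the ratio LD50 / ED50 (same dose units)."""
    if ed50 <= 0:
        raise ValueError("ED50 must be positive")
    return ld50 / ed50


# Hypothetical doses, for example in mg/kg.
print(therapeutic_index(ld50=500.0, ed50=25.0))
```

A larger ratio indicates a wider margin between the dose that is therapeutically effective and the dose that is lethal.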
The data obtained from the cell culture and animal studies can be used in formulating a range of dosage of such compounds which lies preferably within a range of circulating concentrations that include the ED50 with little or no toxicity.
The pharmaceutical composition can be administered via any route such as locally, orally, systemically, intravenously, intramuscularly, mucosally, using a patch and can be encapsulated in liposomes, microparticles, microcapsules, and the like. The pharmaceutical composition can be embedded in liposomes or even encapsulated.
Any pharmaceutically acceptable carrier or adjuvant can be used in the pharmaceutical composition. The modulating compound will be preferably in a soluble form combined with a pharmaceutically acceptable carrier. The techniques for formulating and administering these compounds can be found in "Remington's Pharmaceutical Sciences" Mack Publication Co., Easton, PA, latest edition.
The mode of administration, optimum dosages and galenic forms can be determined by criteria known in the art, taking into account the seriousness of the general condition of the mammal, the tolerance of the treatment and the side effects.
The present invention also relates to a method of treating or preventing ulcers in a human or mammal in need of such treatment. This method comprises administering to a mammal in need of such treatment a pharmaceutically effective amount of a modulating compound which binds to a targeted bacterial protein. In a preferred embodiment, the modulating compound is a polynucleotide which may be placed under the control of a regulatory sequence which is functional in the mammal or human.
In yet another embodiment, the present invention relates to a pharmaceutical composition comprising a SID® polypeptide, a fragment or variant thereof. The SID® polypeptide, fragment or variant thereof can be used in a pharmaceutical composition provided that it is endowed with highly specific binding properties to a bait polypeptide of interest.
The original properties of the SID® polypeptide or variants thereof interfere with the naturally occurring interaction between a first protein and a second protein within the cells of the organism. Thus, the SID® polypeptide binds specifically to either the first polypeptide or the second polypeptide. Therefore, the SID® polypeptides of the present invention or variants thereof interfere with protein-protein interactions of Helicobacter pylori proteins or between Helicobacter pylori proteins and mammalian, for example human, proteins.
Thus, the present invention relates to a pharmaceutical composition comprising a pharmaceutically acceptable amount of a SID® polypeptide or variant thereof, provided that the variant has the above-mentioned two characteristics; i.e., that it is endowed with highly specific binding properties to a bait polypeptide of interest and is devoid of biological activity of the naturally occurring protein.
In yet another embodiment, the present invention relates to a pharmaceutical composition comprising a pharmaceutically effective amount of a polynucleotide encoding a SID® polypeptide or a variant thereof wherein the polynucleotide is placed under the control of an appropriate regulatory sequence. Appropriate regulatory sequences that are used are polynucleotide sequences derived from promoter elements and the like.
Polynucleotides that can be used in the pharmaceutical composition of the present invention include the nucleotide sequences of uneven SEQ ID Nos. 1 to 3255 in column 2 of Table 2, uneven SEQ ID Nos. 6589 to 6593 in Table 7 and uneven SEQ ID Nos. 6595 to 6643 in Table 8.
Besides the SID® polypeptides and polynucleotides, the pharmaceutical composition of the present invention can also include a recombinant expression vector comprising the polynucleotide encoding the SID® polypeptide, fragment or variant thereof.
The above described pharmaceutical compositions can be administered by any route such as orally, systemically, intravenously, intramuscularly, intradermally, mucosally, encapsulated, using a patch and the like. Any pharmaceutically acceptable carrier or adjuvant can be used in this pharmaceutical composition. The SID® polypeptides as active ingredients will be preferably in a soluble form combined with a pharmaceutically acceptable carrier. The techniques for formulating and administering these compounds can be found in "Remington's Pharmaceutical Sciences" supra.
The amount of pharmaceutically acceptable SID® polypeptides can be determined as described above for the modulating compounds using cell culture and animal models.
Such compounds can be used in a pharmaceutical composition to treat or prevent ulcers.
Thus, the present invention also relates to a method of preventing or treating ulcers in a mammal, said method comprising the step of administering to a mammal in need of such treatment a pharmaceutically effective amount of:
(1) a SID® polypeptide of even SEQ ID Nos. 2 to 3256 in column 3 of Table 2, even SEQ ID Nos. 6590 to 6594 in Table 7 and even SEQ ID Nos. 6596 to 6644 in Table 8 or variants thereof which binds to a targeted mammalian or typically human protein; or
(2) a SID® polynucleotide of uneven SEQ ID Nos. 1 to 3255 in column 2 of Table 2, uneven SEQ ID Nos. 6589 to 6593 in Table 7 and uneven SEQ ID Nos. 6595 to 6643 in Table 8, encoding a SID® polypeptide, or variants or fragments thereof, wherein said polynucleotide is placed under the control of a regulatory sequence which is functional in said mammal; or
(3) a recombinant expression vector comprising a polynucleotide encoding a SID® polypeptide which binds to a bacterial protein.
In another embodiment of the present invention, nucleic acids comprising a sequence which encodes the SID® proteins of Table 2, Table 7 and Table 8 and/or functional derivatives thereof are administered to modulate the complexes of Table 1 and Table 8 by way of gene therapy. Any of the methodologies relating to gene therapy available within the art may be used in the practice of the present invention, such as those described by Goldspiel et al, Clin. Pharm. 12, pgs. 488-505 (1993).
Delivery of the therapeutic nucleic acid into a patient may be direct in vivo gene therapy (i.e., the patient is directly exposed to the nucleic acid or nucleic acid containing vector) or indirect ex vivo gene therapy (i.e., cells are first transformed with the nucleic acid in vitro and then transplanted into the patient).
For example, for in vivo gene therapy, an expression vector containing the nucleic acid is administered in such a manner that it becomes intracellular; i.e., by infection using a defective or attenuated retroviral or other viral vector as described, for example, in U.S. Patent 4,980,286 or by Robbins et al, Pharmacol. Ther., 80, No. 1, pgs. 35-47 (1998).
The various retroviral vectors that are known in the art are such as those described in Miller et al, Meth. Enzymol. 217, pgs. 581-599 (1993), which have been modified to delete those retroviral sequences which are not required for packaging of the viral genome and subsequent integration into host cell DNA. Adenoviral vectors can also be used; they are advantageous due to their ability to infect non-dividing cells, and such high-capacity adenoviral vectors are described in Kochanek, Human Gene Therapy, 10, pgs. 2451-2459 (1999). Chimeric viral vectors that can be used are those described by Reynolds et al, Molecular Medicine Today, pgs. 25-31 (1999). Hybrid vectors can also be used and are described by Jacoby et al, Gene Therapy, 4, pgs. 1282-1283 (1997).
Direct injection of naked DNA, microparticle bombardment (e.g., Gene Gun®; Biolistic, Dupont) or coating of the DNA with lipids can also be used in gene therapy. Cell-surface receptors/transfecting agents, encapsulation in liposomes, microparticles or microcapsules, administration of the nucleic acid in linkage to a peptide which is known to enter the nucleus, or administration in linkage to a ligand predisposed to receptor-mediated endocytosis (See, Wu & Wu, J. Biol. Chem., 262, pgs. 4429-4432 (1987)) can be used to target cell types which specifically express the receptors of interest. In another embodiment a nucleic acid ligand compound may be produced in which the ligand comprises a fusogenic viral peptide designed so as to disrupt endosomes, thus allowing the nucleic acid to avoid subsequent lysosomal degradation. The nucleic acid may be targeted in vivo for cell specific endocytosis and expression by targeting a specific receptor such as that described in WO 92/06180, WO 93/14188 and WO 93/20221. Alternatively the nucleic acid may be introduced intracellularly and incorporated within the host cell genome for expression by homologous recombination. See, Zijlstra et al, Nature, 342, pgs. 435-438 (1989).
In ex vivo gene therapy, a gene is transferred into cells in vitro using tissue culture and the cells are delivered to the patient by various methods such as subcutaneous injection, application of the cells into a skin graft and intravenous injection of recombinant blood cells such as hematopoietic stem or progenitor cells.
Cells into which a nucleic acid can be introduced for the purposes of gene therapy include, for example, epithelial cells, endothelial cells, keratinocytes, fibroblasts, muscle cells, hepatocytes and blood cells. The blood cells that can be used include, for example, T-lymphocytes, B-lymphocytes, monocytes, macrophages, neutrophils, eosinophils, megakaryocytes, granulocytes, hematopoietic cells or progenitor cells and the like.
In yet another embodiment the present invention relates to protein chips or protein microarrays. It is well known in the art that microarrays can contain more than 10,000 spots of a protein that can be robotically deposited on a surface of a glass slide or nylon filter. The proteins attach covalently to the slide surface, yet retain their ability to interact with other proteins or small molecules in solution. In some instances the protein samples can be made to adhere to glass slides by coating the slides with an aldehyde-containing reagent that attaches to primary amines. A process for creating microarrays is described, for example, by MacBeath and Schreiber in Science, Volume 289, Number 5485, pgs. 1760-1763 (2000) or Service, Science, Volume 289, Number 5485, pg. 1673 (2000). An apparatus for controlling, dispensing and measuring small quantities of fluid is described, for example, in U.S. Patent No. 6,112,605.
The present invention also provides a record of protein-protein interactions, PIM®'s, SID®'s and any data encompassed in the following Tables. It will be appreciated that this record can be provided in paper or electronic or digital form.
In yet another embodiment, the present invention relates to the classification of H. pylori genes within functional categories, such as genes essential or non-essential for viability, using the general method described in Figure 18. In this regard, two exhaustive libraries of H. pylori ORFs were constructed in E. coli. The first library (Library I) contained every H. pylori (strain 26695) ORF cloned individually, while the second one (Library II) contained these ORFs disrupted by a transposable element. These two ordered libraries are valuable tools for a large project of systematic inactivation of every ORF of the H. pylori genome. They were used to develop a strategy to search at the genomic scale for genes essential for the viability of the bacterium grown in vitro. The inactivation strategy was applied to a series of 138 ORFs that were selected on two different criteria. Ninety-six of them were previously shown to encode proteins involved in protein-protein interactions in the yeast two-hybrid assay (Rain et al., 2001), and 42 encode H. pylori specific proteins with no known function. The screening procedure led to the identification of 40 Putative Essential Genes (PEGs), of which 15 were shown to be true essential genes. The reasoning behind this analysis was that the combination of essentiality with the identification of interacting domains might serve as a direct pathway for the design of active compounds capable of inhibiting protein-protein interactions and possibly bacterial growth.
Construction and validation of the two ordered libraries
Library I consisted of an individual and ordered (hp0001 to hp1590) bank of every putative ORF according to the information provided on the TIGR Web site in 1997. Based on the 1590 ORFs identified at that time, 5'- and 3'-oligonucleotides were synthesized with the characteristics described in Example 10 and in Table 3. Each ORF was cloned into a pILL570-» derivative plasmid, 3.6 kb in length. This derivative plasmid corresponds to pILL570 (Labigne et al., 1992) in which DNA from the HindIII site of the polylinker to the AvaI site (position 1425 of the pBR322 backbone) has been removed by reverse PCR using pILL570 as a template and 570-1 plus 570-2 as primers (Table 4). Each ORF was cloned in such a way that the 5'-end of the gene (including the ATG) was inserted immediately downstream of the three transcriptional and translational stops of pILL570 (Labigne et al., 1992) to prevent toxicity of the recombinant proteins in E. coli. The library consists of 17 96-well plates (plate I.1 to plate XVII.1). The recombinant plasmids were transformed into DH5α E. coli cells harboring the pTCA plasmid, a plasmid that confers resistance to tetracycline, encodes the Tn3 transposase and is immune to Tn3 (Seifert et al., 1986). The presence of the two compatible plasmids pILL570-HP000X plus pTCA was checked by plasmid extraction and gel electrophoresis on individual isolated tetracycline, spectinomycin, kanamycin resistant clones. In addition, using primers 570-3 and 570-4 (corresponding to the boundaries of the cloning site on pILL570-»), the agreement between the size of the cloned PCR product and that of the corresponding ORF was confirmed.
Library I consists of all the putative ORFs described on the TIGR Web site in 1997 with the exclusion of 40 ORFs (hp01, 10, 46, 56, 94, 160, 223, 264, 289, 293, 399, 415, 435, 440, 453, 159, 464, 465, 488, 547, 607, 722, 790, 814, 846, 876, 884, 898, 968, 1007, 1069, 1205, 1248, 1304, 1358, 1394, 1452, 1460, 1497, 1511) for which either the initial gene amplification or the final cloning failed.
Library II consists of the random insertion of Tn3-Km into each of the recombinant pILL570-HP000X plasmids. The disruption process was designed to generate multiple independent transposon insertions for each cloned ORF. Tn3-Km was shown to map preferentially into the H. pylori inserts, due both to the intrinsic property of Tn3 of transposing into AT-rich DNA regions and to the requirement of maintaining an intact replicative function and spectinomycin modifying enzyme (aadA). The efficiency of the whole procedure was checked for five plates. For those five plates, the resulting kanamycin transconjugants of 96 independent cloned ORFs were kept individually and as pools of plasmids. The 96 disrupted recombinant plasmids were extracted together and used as a template for individual PCR assays using as primers the 38bp-R (Table 4) of the inverted repeats of the Tn3 transposon and each of the 5'-end specific oligonucleotides used for the cloning of the ORFs. Figure 17 illustrates the results obtained for the disruption of the ORFs hp0099 to hp0198 (Plate II.2). This figure exemplifies, first, that multiple insertions of the transposon took place and, second, that for the majority of ORFs transposon insertion occurred at a distance ranging between 100 and 600 bp from the 5'-end of the ORF, a distance compatible with the promotion of gene replacement by allelic recombination.
Screening of Putative Essential Genes of H. pylori within a series of 138 selected ORFs
The ordered library of disrupted H. pylori ORFs in E. coli was used for the genomic screening of putative essential genes (PEGs) in two subsets: 96 individually selected ORFs of H. pylori strain 26695 (Table 5) encoding proteins demonstrating homodimeric or heterodimeric protein-protein interactions (Rain et al., 2001), and 42 ORFs encoding H. pylori specific predicted proteins with no known function (Table 6). Included in the set of 138 ORFs were ORFs used as controls: ORFs known to be essential [groES (hp0011), holB (hp1231), dnaA (hp1529)] and ORFs known to be non-essential for viability in vitro [ureI (hp0071), rdxA (hp0954), ggt (hp1118)], representative of various sizes. For the 138 ORFs, individual Tn3-Km disrupted recombinant plasmids were extracted from Library II and used to transform H. pylori strain HAS141 (Janvier et al., 1999). This isolate was chosen for the initial screening due to its high natural transformation competency, two logs above that of the sequenced strain 26695, and its ability to colonize mouse stomach (Janvier et al., 1999). Kanamycin transformants were obtained for all but the hspA (groES) gene, as expected, and 40 of the 138 tested ORFs, namely hp0061, 175, 377, 419, 553, 650, 739, 862, 928, 990, 1012, 1014, 1074, 1230, 1245, 1263, 1493 for the first series (Table 5), and hp0130, 231, 271, 358, 394, 659, 697, 699, 721, 726, 746, 838, 935, 947, 953, 973, 1023, 1028, 1039, 1053, 1085, 1265, 1568 for the second series (Table 6), which thus can be designated as Putative Essential Genes.
Due to the presence of the two large terminal inverted repeats (38 bp in size) within Tn3-Km, which quickly reannealed during the amplification procedure, gene amplification of the disrupted allele using the respective 5' and 3' oligonucleotides of a given ORF often led to the production of a PCR product with a size similar to that of the parental allele, and thus was not helpful to confirm gene replacement of the parental by the disrupted allele. Thus, six individual kanamycin resistant transformants obtained for the 78 knock-out (KO) genes were checked by gene amplification using the 5' and the 3' oligonucleotides of the KO-ORF, each paired with the 38 bp oligonucleotide (Table 4) of the inverted repeat of the transposon. The criterion for allelic replacement was that the sum of the sizes of the two PCR products be identical to the size of the KO-ORF. The final identification of the disrupted ORF and of the site of insertion was done by sequencing one of the two PCR products. For ORFs with a size over 700 bp, two or three different transposon insertions mapping in the middle of the ORF were commonly observed among the six analysed transformants. For a few ORFs, the 5' and the 3' oligonucleotides initially designed from the sequence of strain 26695 were unable to amplify the chromosome of the HAS141 parental strain due to intrinsic polymorphism. In these cases, the 5' and 3' oligonucleotides from ORFn+1 or ORFn-1 together with the 38 bp oligonucleotide were used. Unexpectedly, a few kanamycin resistant transformants were obtained for hp1231 (holB) and hp1529 (dnaA). For these genes the transposon insertion mapped at the very 3' end of the genes, in contrast to the other genes, where the transposable element mapped at any place but at a minimal distance of 300 bp from either end of the genes.
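The size criterion for allelic replacement stated above can be sketched as a small check. This is an illustrative sketch only: the function name, the example fragment sizes and the tolerance parameter are assumptions introduced here (the text requires the sum to be "identical" to the ORF size; a tolerance is added only to reflect the limited resolution of sizing on a gel).

```python
def is_allelic_replacement(orf_bp, product_5p_bp, product_3p_bp, tolerance_bp=50):
    """Return True when the two junction PCR products add up to the ORF size.

    orf_bp        -- size of the knocked-out ORF (bp)
    product_5p_bp -- product from the 5' ORF oligonucleotide + 38 bp repeat primer
    product_3p_bp -- product from the 3' ORF oligonucleotide + 38 bp repeat primer
    tolerance_bp  -- assumed sizing error; not a value taken from the text
    """
    return abs((product_5p_bp + product_3p_bp) - orf_bp) <= tolerance_bp


# Hypothetical sizes, for illustration only.
print(is_allelic_replacement(1200, 450, 760))   # sum 1210 bp, within tolerance
print(is_allelic_replacement(1200, 450, 1900))  # sum 2350 bp, inconsistent
```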
These observations indicate that for some of the genes known to be essential, the kanamycin resistance gene could be rescued, without allelic replacement, by integration of the whole plasmid via a single crossing-over. To estimate the frequency of this event, 260 preparations of chromosomal DNA from individual (210) or pooled (50) kanamycin resistant H. pylori transformants were spotted on nitrocellulose, denatured and hybridized with a probe consisting of the pILL570-» vector. Of the 260 DNA preparations tested, chromosomal DNA extracted from individual transformants of the disrupted hp1231 (holB), hp1514 (unknown) and hp1529 (dnaA), as well as some (but not all) of the individual clones of hp0224 and hp0822, did hybridize with part of the vector, confirming that a single crossing-over took place. These results underline that the Tn3-Km strategy is a powerful approach to be used as a first screen, at a genomic scale, for the identification of PEGs. The definitive assignment of a non-essential status cannot be made exclusively on the presence of kanamycin transformants, but has to be confirmed by testing for the absence of vector DNA within the chromosome of transformants by hybridization.
Essentiality and functional analysis of 15 of the 40 ORFs identified as Putative Essential Genes
According to an in-house definition, PEGs consist of ORFs that did not lead to the isolation of kanamycin transformants following the transformation of the parental isolate HAS141 with a pool of disrupted ORFs. However, absence of kanamycin transformants cannot be directly associated with the identification of a true essential gene. Several explanations might account for the lack of knock-out kanamycin resistant mutants, such as (i) absence of the specific gene in the tested strain, (ii) polar effect of the transposon on an essential ORF located downstream of the PEG, a property associated with miniTn3-Km (Skouloubris et al., 1998), or (iii) experimental failure due to the small size of the ORF and of the bordering sequences required for allelic exchange. Thus, to confirm the essential character of a PEG (i.e., essentiality), additional experiments were performed. For each of the 40 PEGs identified through the initial screening in strain HAS141 (Tables 5 & 6), the following criteria were used to confirm their essentiality. First, the distribution of the ORFs among clinical isolates was studied by DNA/DNA hybridization to confirm their presence not only in the transformed isolate (HAS141) but also in every clinical isolate tested. Second, a non-polar mutation was introduced into the PEG cloned in E. coli following the approach depicted in the Materials and Methods section, performed by reverse PCR from plasmids of Library I. The recombinant plasmids consist, in each case, of the 0.9 kb promoter-less kanamycin cassette flanked upstream and downstream, respectively, by the first 300 and the last 300 nucleotides of the ORF to be knocked out.
In addition, the cassette carries a ribosome binding site and a start codon (ATG) in phase with the hundred 3'-terminal codons of the KO gene, allowing the translation of the end of the gene to overcome any transcriptional/translational coupling effect. The resulting constructed plasmid was transformed into four different H. pylori genetic backgrounds: strains HAS141, N6, X47-2An and the sequenced strain 26695. Taking into account these criteria (Figure 17), the following conclusions were drawn relative to the 40 initially identified PEGs.
Strain HAS141 lacks ORFs hp0990 and hp1074. These absences accounted for the lack of mutants for these two ORFs, whereas the other 38 PEGs were found to be present in each of the 37 tested clinical isolates (Salama et al., 2000, and manuscript in preparation). The fact that no HAS141 clones resistant to kanamycin were obtained when transformed with the derivative plasmids containing the disrupted hp0990 and hp1074 ORFs confirmed that, in the absence of a portable homologous region, integration of the plasmid does not occur. These two ORFs were not studied further since a non-ubiquitous ORF has a low probability of being essential and, even if so, would not be a good candidate for H. pylori specific therapeutic drug design.
Non-polar kanamycin mutants were unambiguously obtained in HAS141, N6, X47-2An as well as 26695 for ORFs hp0061, 419, 553, 650, 1263, and 1493 of the first series of genes encoding interacting proteins (Table 5), and hp0130, 271, 358, 697, 699, 721, 726, 746, 838, 935, 947, 953, 973, 1023, 1028, 1039, 1053 (Table 6). Gene replacement of the parental allele by the deleted and disrupted allele was confirmed by testing the chromosome of the mutant for the disappearance of a PCR product with a size identical to that of the parental ORF and its replacement by a product of the expected size (1.5 kb PCR product: 300 + 900 + 300 bp). When, by chance, the parental ORF had a size of around 1.5 kb, gene replacement was confirmed by restricting the 1.5 kb PCR product with SmaI to release the 0.9 kb Km non-polar cassette.
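The size arithmetic of this PCR check can be sketched as follows. The flank and cassette sizes come from the text; the function names and the gel-resolution threshold used to decide when a parental ORF is "around 1.5 kb" are assumptions made for illustration.

```python
FLANK_BP = 300      # first and last 300 nucleotides of the ORF (from the text)
CASSETTE_BP = 900   # 0.9 kb promoter-less kanamycin cassette (from the text)


def expected_mutant_product_bp():
    """Size of the PCR product expected from the replaced allele."""
    return FLANK_BP + CASSETTE_BP + FLANK_BP


def needs_smai_confirmation(parental_orf_bp, resolution_bp=100):
    """True when parental and mutant products would be hard to tell apart by size.

    resolution_bp is an assumed gel resolution, not a value from the text.
    """
    return abs(parental_orf_bp - expected_mutant_product_bp()) <= resolution_bp


print(expected_mutant_product_bp())      # 1500
print(needs_smai_confirmation(1480))     # sizes too close: confirm with SmaI
print(needs_smai_confirmation(800))      # size shift alone is informative
```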
For 15 of the 38 ubiquitous PEGs, we were unable to obtain kanamycin transformants when introducing a non-polar mutation in N6, X47-2An and 26695, although the genes were found to be present in every isolate (ubiquitous). One or two very rare clones were obtained for some of the genes, exclusively in HAS141. All these kanamycin resistant HAS141 transformants were positive when tested by hybridization with the labeled pILL570-» vector used as a probe, again attesting to the presence of rare but possible single crossing-over events seen under strong selective pressure. These 15 genes (hp0175, hp0231, hp0377, hp0394, hp0659, hp0739, hp0862, hp0928, hp1012, hp1014, hp1085, hp1230, hp1245, hp1265, hp1568) can thus be definitively recognized as genes essential for the viability of H. pylori in vitro. Among these 15 genes, 9 are known to encode proteins involved in protein-protein interactions, and 6 were selected as encoding H. pylori specific proteins without known function. They encode proteins with properties that will be discussed and classified with regard to their potential as putative therapeutic targets.
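The bookkeeping that leads from the 40 PEGs to the 15 essential genes can be retraced with simple set arithmetic. The ORF numbers below are copied from the lists given in the text; the variable names are introduced here for illustration only.

```python
# PEGs from the initial Tn3-Km screen (ORF numbers as listed in the text).
pegs_table5 = {61, 175, 377, 419, 553, 650, 739, 862, 928, 990, 1012, 1014,
               1074, 1230, 1245, 1263, 1493}
pegs_table6 = {130, 231, 271, 358, 394, 659, 697, 699, 721, 726, 746, 838,
               935, 947, 953, 973, 1023, 1028, 1039, 1053, 1085, 1265, 1568}

# ORFs absent from strain HAS141 (not ubiquitous, not studied further).
absent_in_has141 = {990, 1074}

# ORFs for which non-polar kanamycin mutants were obtained (hence non-essential).
nonpolar_mutants = {61, 419, 553, 650, 1263, 1493,
                    130, 271, 358, 697, 699, 721, 726, 746, 838, 935, 947,
                    953, 973, 1023, 1028, 1039, 1053}

all_pegs = pegs_table5 | pegs_table6
ubiquitous = all_pegs - absent_in_has141
essential = ubiquitous - nonpolar_mutants

print(len(all_pegs), len(ubiquitous), len(essential))  # 40 38 15
print(sorted(f"hp{n:04d}" for n in essential))
print(len(essential & pegs_table5), len(essential & pegs_table6))  # 9 6
```

The remaining set matches the 15 genes listed above, with 9 from the first series (Table 5) and 6 from the second (Table 6).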
Search for interacting proteins when using the 6 ubiquitous H. pylori specific essential genes as a bait in the two hybrid assay
The hp0231, hp0394, hp0659, hp1085, hp1265, and hp1568 ORFs were cloned in the pB6 vector and used as individual baits for the identification of interacting proteins (Table 7).
Proteomic screens conducted for HP0231, HP1085 and HP1568 did not provide data allowing assignment of a putative function and did not reveal homodimeric interactions that would underline the usefulness of these proteins as possible therapeutic targets. Those genes remain ubiquitous, essential, H. pylori specific and without known function. In contrast, HP0394, HP0659 and HP1265 gave positive screens (Table 7).
Classification of twelve of the 15 ubiquitous essential genes.
The combination of both essentiality and the identification of interacting domains might serve as a direct pathway for the design of active compounds capable of inhibiting protein-protein interactions and possibly bacterial growth. Of the 15 ubiquitous essential genes identified by the procedure, 12 were shown to be involved in protein-protein interactions and could be classified in different categories with regard to their potential as putative therapeutic targets. The first category consists of ubiquitous ORFs encoding proteins with heterodimeric protein-protein interactions in the two-hybrid assay where both partners play an essential role in the viability of H. pylori and at least one of the two partners is H. pylori specific. Four of the twelve essential ORFs meet these criteria: hp0394, hp0862, hp1230, and hp0659, which encode proteins with no known or putative function. Although they were initially annotated as H. pylori specific, the recent publication of the Campylobacter genome demonstrates for some of them the existence of homologues in this phylogenetically closely related bacterium, but no homologues have been identified in the other bacterial genomes sequenced so far.
The hp1230 gene encodes a protein that has been recognized via the two-hybrid assay as a homodimeric protein which interacts with the predicted chromosomal replication initiator protein, DnaA, encoded by hp1529. The proteomic screen allowed the identification of a specific domain of interaction (SID) lying between AA31 and AA180 of HP1230 (SID1230) and a SID of 87 AA within the N-terminal domain of HP1529 or DnaA (AA12 to AA99) (SID1529) (Rain et al., 2001). To examine whether the HP1230/HP1529 interaction was specific and could serve as a target for screening of lead compounds with bactericidal activity, the oligonucleotide encoding SID1529 was randomly mutagenized, and mutated SIDs that abolished the specific HP1230/HP1529 interaction were selected through the two-hybrid system. This demonstrated that isoleucine 58 and lysine 61 were involved in the HP1230/HP1529 interaction, since a double mutant I58F/K61I within SID1529 abolished this interaction. This random mutagenesis procedure also led to the isolation of a mutated SID1529* (V53L) which confers a superbinder phenotype to HP1529 in the two-hybrid system. When overexpressed in the three-hybrid system (Tirode et al., 1997) under the control of the regulated Met25 promoter in vector p3H1 (Colland et al., 2001) (Figure 3), the superbinder SID1529* completely inhibits the HP1230/HP1529 interaction. Thus, the oligopeptide PNQLLCTTITAKYG (SEQ ID No. 6588) or overlapping or combinatory derivatives have some potential as lead compounds to inhibit H. pylori growth. Another example of this category of interest is the hp0862 gene. The two-hybrid screen procedure revealed interactions between the HP0862 gene product and the C-terminal domain (AA100 to AA191) of the thymidylate kinase (HP1474), an essential enzyme responsible for the first phosphorylation step in the conversion of deoxythymidine 5'-monophosphate to deoxythymidine 5'-diphosphate for the final production of dTTP.
As for HP1230, the actual function of the gene encoding HP0862 is unknown, but its essential character and its interaction with a known essential enzyme (Tmk) might orient further functional analysis, and encourage the definition of more precise domains of interaction between the Tmk protein and this new specific interacting protein.
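The point-mutation notation used above for SID1529 (I58F, K61I, V53L) can be illustrated by applying substitutions to the oligopeptide of SEQ ID No. 6588. This is a sketch under a loudly stated assumption: the text does not say which residues of HP1529 the peptide covers, and the numbering used here (first residue = AA50) is inferred only from the fact that it places I at position 58, K at position 61 and L at position 53, consistent with the mutations named in the text. The helper function is hypothetical.

```python
import re

PEPTIDE = "PNQLLCTTITAKYG"   # SEQ ID No. 6588, as given in the text
ASSUMED_FIRST_RESIDUE = 50   # assumption: inferred numbering, not stated in the text


def apply_mutation(seq, mutation, first_residue):
    """Apply a substitution written as <wild-type><position><new>, e.g. 'I58F'."""
    match = re.fullmatch(r"([A-Z])(\d+)([A-Z])", mutation)
    if match is None:
        raise ValueError(f"unrecognized mutation: {mutation}")
    wild, position, new = match.group(1), int(match.group(2)), match.group(3)
    index = position - first_residue
    if not 0 <= index < len(seq):
        raise ValueError(f"position {position} lies outside the peptide")
    if seq[index] != wild:
        raise ValueError(f"expected {wild} at {position}, found {seq[index]}")
    return seq[:index] + new + seq[index + 1:]


double_mutant = PEPTIDE
for mutation in ("I58F", "K61I"):
    double_mutant = apply_mutation(double_mutant, mutation, ASSUMED_FIRST_RESIDUE)

print(double_mutant)  # PNQLLCTTFTAIYG under the assumed numbering
```

The wild-type check inside the function is what makes the numbering assumption testable: a wrong offset would raise an error rather than silently mutate the wrong residue.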
The hp0394 and hp0659 genes might also enter this category. They both encode essential proteins specific to H. pylori. The hp0394 gene encodes a protein whose C-terminal domain (last 76 AA) interacts with the β subunit of the acetyl-coenzyme A carboxylase transferase (HP0950). This enzyme has an essential function in membrane lipid synthesis and catalyses the formation of malonyl-CoA, the first intermediate for fatty acid synthesis. The protein encoded by hp0659 interacts with a putative outer membrane protein, HP0655, with no known function. The essential property of hp0655 has not yet been tested and thus the classification of the hp0659 gene in this first category has to be confirmed.
- The second category includes genes essential for H. pylori encoding predicted proteins with known functions. The characteristic of this category is that these genes, essential in H. pylori, have not been reported to be essential in other organisms. Thus targeting the proteins encoded by these genes might be relevant to a selective drug design specifically directed against H. pylori. Three of the twelve essential genes displayed this characteristic. Two of them, hp0377 and hp0175, both encode periplasmic proteins. Hp0377 encodes a protein partially homologous to DsbC, a thiol-disulfide interchange periplasmic protein involved in disulfide bond formation (Zapun et al., 1995). So far, no other genes encoding Dsb-like proteins have been identified in the genome of H. pylori, whereas in E. coli as many as five proteins are required for disulfide bond formation in the periplasm. In E. coli, dsbC encodes a stable homodimer with both protein disulfide isomerase and chaperone activities (McCarthy et al., 2000), and is not essential, probably due to redundancy of the function. In contrast, in H. pylori we demonstrate the essential character of that gene. The HP0377 product interacts in the two-hybrid assay with the last 100 C-terminal amino acids (SID) of the homodimeric secreted cysteine-containing protein encoded by hp0224. The HP0224 product is a methionine sulfoxide reductase homologue (MsrA). In E. coli, MsrA plays a role in the response to oxidative damage by reducing methionine sulfoxide residues (Moskovitz et al., 1995), and directly or indirectly contributes to the maintenance of adhesins (Wizemann et al., 1996). The H. pylori MsrA homologue (HP0224) was identified as one of the major antigens released into the extracellular space (Cao et al., 1998). Thus, HP0377, the DsbC homologue, is a good H. pylori specific target candidate because of its essential character, unique to this bacterial species, and the accessibility of the protein within the periplasmic space.
The hp0175 gene is another representative of this category. The gene encodes a predicted peptidyl-prolyl cis-trans isomerase, an enzyme that accelerates protein folding by catalyzing the cis-trans isomerization of proline imidic peptide bonds in oligopeptides. Two-hybrid screening identified HP0608, an H. pylori specific protein of unknown function, as an interacting partner of HP0175. The essential character of the gene, documented in this study, as for HP0377, has not been reported for other microorganisms and appears unique to the species.
Finally, hp1265 is a unique and specific target for H. pylori. This essential gene is part of a large cluster of 14 genes (hp1260 to hp1273), among which 12 encode homologues of 12 of the 14 subunits of the NADH oxidoreductase complex of E. coli. In E. coli, inactivation of the genes encoding the different subunits (nuo) is not lethal for the bacterium. In H. pylori, inactivation of hp1263 is not lethal, but inactivation of one of the two H. pylori specific subunit-encoding genes is lethal, indicating that the HP1265 subunit, which should be the subunit involved in NADH binding, has unique properties deserving further functional investigation.
- A third category consists of ORFs that encode conserved hypothetical proteins, distributed in almost all the sequenced bacterial genomes, for which no function has been assigned and whose essentiality has not been assessed. We report here on the essential character of two of them: the hp0739 gene, which encodes a protein that interacts with another conserved hypothetical but non-essential protein (HP0810), and hp1012. In the two sequenced H. pylori genomes (26695 and J99), the hp0739 gene is flanked by two genes involved in the biosynthesis of peptidoglycan, and its involvement in this biosynthetic pathway remains to be explored. The hp1012 gene encodes a protein which has similarity with some metallo-proteases; however, the function of this specific protease is unknown. These two proteins with a large distribution spectrum, so far not known to be essential, represent new putative targets.
- Finally, a fourth category includes ORFs with known functions that were previously shown to be essential in other microorganisms. The present work allows us to extend this property to the H. pylori species, and reinforces their value as putative targets with a large spectrum. The hp0928 gene is one of those; it was selected through the two-hybrid screen as a homodimer. By similarity, the assigned function was that of GTP cyclohydrolase (folE), involved in the first step of the biosynthetic pathway of tetrahydrofolate, the structure of which was shown to be a homodecameric complex formed of two pentamers. Both the inability to knock out that gene and its oligomeric structure support the assigned function. The two-hybrid screening procedure delineates a domain of interaction consisting of 133 amino acids between the FolE subunits. This domain might serve as a therapeutic target for the screening of lead compounds with a large bacterial spectrum. The hp1014 gene (hdhA) encodes an NAD+-dependent oxidoreductase belonging to the short-chain dehydrogenase/reductase (SDR 1 family). The enzyme is known to require a tetrameric form to be active in E. coli (Yoshimoto et al., 1991), which is compatible with the homodimeric interaction observed in the two-hybrid assay. The major interest of these two ORFs resides in their well known enzymatic activity, which provides a direct way to screen lead compounds capable of abolishing the enzymatic activity through the disruption of the oligomeric interactions. The present work also classifies hp1245 as an essential gene of this category. This gene encodes the SSB protein, the single strand binding protein, involved in DNA replication, recombination and DNA repair, and this observation confirms work previously done in other model microorganisms.
HP1245 was found to interact with HP0650, a non-essential protein of unknown function, but also significantly with HP0661, a predicted ribonuclease H involved in DNA replication, a finding consistent with the HP1245 function.
In order to fully illustrate the present invention and advantages thereof, the following specific examples are given, it being understood that the same are intended only as illustrative and in nowise limitative.
EXAMPLES
Example 1: Preparation of a Helicobacter pylori genomic collection
1.A. Fragment collection preparation and transformation in E. coli
1.B. Collection transformation in Saccharomyces cerevisiae
1.C. Construction of bait plasmid
Example 2: Screening the collection with the two-hybrid in yeast system
2.A. The mating protocol
2.B. The X-Gal overlay assay
2.C. The luminometry assay
Example 3: Identification of positive clones
3.A. PCR on yeast colonies
3.B. Plasmid rescue from yeast by electroporation
Example 4: Detection of protein-protein interaction
Example 5: Identification of SID®
Example 6: Screening of modulating agent
Example 7: Gene therapy Example using SID® polypeptides
Example 8: Making of polyclonal and monoclonal antibodies
Example 9: Classification of genes of H. pylori
Example 10: Study of the interaction between two essential genes, HP1230-HP1529, by random mutagenesis.
Medium composition and standard protocols are available in Maniatis et al.
Example 1 : Preparation of a Helicobacter pylori genomic collection
1.A. Collection preparation and transformation in Escherichia coli
1.A.1. Fragment of genomic DNA preparation
The Helicobacter pylori genomic DNA is fragmented in a nebulizer (GATC) for 1 minute, precipitated and resuspended in water.
The obtained nebulized genomic DNA is successively treated with Mung Bean Nuclease (Biolabs) (30 minutes at 30°C), T4 DNA polymerase (Biolabs) (10 minutes at 37°C) and Klenow enzyme (Pharmacia) (10 minutes at room temperature and 1 hour at 16°C).
DNA is then extracted, precipitated and resuspended in water.
1.A.2. Ligation of linkers to blunt-ended genomic DNA
Oligonucleotide PL160 (5' end phosphorylated) 1 μg /μl and PL159 2μg/μl. Sequence of the oligo PL160 : 5'-ATCCCGGACGAAGGCC-3' (SEQ ID NO. 3257)
Sequence of the oligo PL159 : 5'-GGCCTTCGTCCGG-3' (SEQ ID NO. 3258)
Linkers were preincubated (5 minutes at 95°C, 10 minutes at 68°C, 15 minutes at 42°C) then cooled down at room temperature and ligated with genomic DNA inserts at 4°C overnight.
Linkers were then removed on a separation column (Chromaspin TE 400, Clontech), according to the manufacturer's protocol.
1.A.3. Vector preparation
pACTIIst is successively digested with BamHI restriction enzyme (Biolabs) for 1 hour at 37°C, dephosphorylated with Calf Intestine Phosphatase (CIP) (Biolabs) and filled in with dGTP using Vent DNA polymerase (exo-) (Biolabs), extracted, precipitated and resuspended in water.
1.A.4. Ligation between vector and insert of genomic DNA
The prepared vector is ligated overnight at 15°C with the genomic blunt-ended DNA described in section 1.A.2 using T4 DNA ligase (Biolabs). The DNA is then precipitated and resuspended in water.
1.A.5. Library transformation in Escherichia coli
Transform DNA from section 1.A.4 into Electromax DH10B electrocompetent cells (Gibco BRL) with a Cell Porator apparatus (Gibco BRL). Add 1 ml SOC medium and incubate transformed cells at 37°C for 1 hour. Add 9 ml of SOC medium per tube and plate on LB+ampicillin medium. Scrape colonies with liquid LB medium. Aliquot and freeze at -80°C.
The obtained collection of recombinant cell clones was named HGXBHP1 (CNCM N° I-2181).
1.B. Collection transformation in Saccharomyces cerevisiae
The Saccharomyces cerevisiae strain (Y187 (MATα Gal4Δ Gal80Δ ade2-101 His3 Leu2-3,-112 Trp1-901 Ura3-52 URA3::UASGAL1-LacZ Met)) is transformed with the HGXBHP1 H. pylori genomic DNA library.
The plasmid DNA contained in E. coli was extracted (Qiagen) from aliquoted E. coli frozen cells (1.A.5.).
Grow Saccharomyces cerevisiae yeast Y187 in YPGlu.
Yeast transformation is performed according to standard protocol (Gietz et al., Yeast, 11, 355-360, 1995) using yeast carrier DNA (Clontech). This experiment leads to 10⁴ to 5 × 10⁴ cells/μg DNA. Spread 2 × 10⁴ cells on DO-Leu medium per plate. Aliquot and freeze at -80°C.
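As a rough planning aid, the yield quoted above (10⁴ to 5 × 10⁴ cells per μg DNA, with 2 × 10⁴ cells spread per plate) can be converted into a plate count. The Python sketch below is illustrative only: the function name and the 10 μg DNA amount are assumptions and are not part of the protocol.

```python
import math

CELLS_PER_PLATE = 2e4  # cells spread per DO-Leu plate (value from the text)

def plates_needed(dna_ug, cells_per_ug):
    """Number of DO-Leu plates needed to spread all transformed cells."""
    total_cells = dna_ug * cells_per_ug
    return math.ceil(total_cells / CELLS_PER_PLATE)

# Hypothetical 10 ug of library DNA at the low and high ends of the quoted yield
low = plates_needed(10, 1e4)
high = plates_needed(10, 5e4)
print(low, high)
```

The spread between the two results shows why the number of plates is best fixed only after the actual transformation efficiency has been measured.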
1.C. Construction of bait plasmid
The genomic amplification of the ORF is obtained by PCR using the Pfu proofreading Taq polymerase (Stratagene) and 200 ng of genomic DNA as the template. PCR primers are chosen in regions flanking the ORF.
Set up the PCR program as follows:
94° 45"
[cycling steps given in Figure imgf000051_0001] x 30 cycles
72° 10'
15° ∞
Check amplification on agarose gel.
Purify PCR fragments with Qiaquick column (Qiagen) according to the manufacturer's protocol.
Digest purified PCR fragments with adequate restriction enzymes.
Purify PCR fragments with Qiaquick column (Qiagen) according to the manufacturer's protocol. Ligate digested PCR fragments into an adequately digested and dephosphorylated bait vector (pAS2ΔΔ) according to standard protocol (Maniatis et al.).
Transform into competent bacterial cells. Grow cells, extract DNA and sequence plasmid.
Example 2: Screening the collection with the two-hybrid in yeast system
2.A. The mating protocol
We have chosen the mating two-hybrid in yeast system (first described by Legrain et al., Nature Genetics, 1997, vol. 16, 277-282, Toward a functional analysis of the yeast genome through exhaustive two-hybrid screens) for its advantages, but we could also screen the Helicobacter pylori collection in the classical two-hybrid system as described in Fields et al., or in a yeast reverse two-hybrid system.
The mating procedure allows a direct selection on selective plates because the two fusion proteins are already produced in the parental cells. No replica plating is required.
This protocol is written for the use of the library transformed into the Y187 strain. Before mating, transform S. cerevisiae (CG1945 strain (MATa Gal4-542 Gal80-538 ade2-101 His3Δ200 Leu2-3,-112 Trp1-901 Ura3-52 Lys2-801 URA3::GAL4 17mers (X3)-CyC1TATA-LacZ LYS2::GAL1 UAS-GAL1TATA-HIS3 CYHR)) according to step 1.B and spread on DO-Trp medium.
Day 1 , morning : preculture
Preculture of Y187 cells carrying the bait plasmid obtained at step 1.C. in 20 ml DO-Trp medium. Grow at 30°C with vigorous agitation.
Day 1 , late afternoon : culture
Measure the OD600nm of the DO-Trp preculture of Y187 cells carrying the bait plasmid. The OD600nm must lie between 0.1 and 0.5 in order to correspond to a linear measurement.
Inoculate 150 ml DO-Trp at OD600nm 0.006/ml, grow overnight at 30°C with vigorous agitation.
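The inoculation step amounts to a simple dilution calculation. The sketch below assumes that the preculture is diluted before reading so that the measured value falls in the 0.1 to 0.5 linear range; the dilution factor, the function name and the example reading are assumptions, and the small volume added by the inoculum itself is neglected.

```python
def inoculum_volume_ml(measured_od, dilution_factor,
                       target_od=0.006, culture_ml=150.0):
    """Volume of preculture (ml) to add to reach the target OD600nm."""
    if not 0.1 <= measured_od <= 0.5:
        raise ValueError("reading outside the linear range 0.1-0.5; adjust dilution")
    preculture_od = measured_od * dilution_factor  # OD of the undiluted preculture
    return target_od * culture_ml / preculture_od

# Hypothetical reading of 0.3 on a 10-fold diluted sample
volume = inoculum_volume_ml(0.3, 10)
print(round(volume, 3))
```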
Day 2 : mating
Medium and plates
5 YPGlu plates
50 ml tube with 30 ml DO-Leu-Trp-His
100 ml flask with 20 ml of YPGlu
75 DO-Leu-Trp-His plates
2 DO-Leu plates
2 DO-Trp plates
2 DO-Leu-Trp plates
Measure OD600nm of the DO-Trp culture. It should be around 1.
For the mating, you must use twice as many bait cells as library cells. To get a good mating efficiency, you must collect the cells at 10⁸ cells per cm².
Estimate the amount of bait culture (in ml) that makes up 80 OD600nm units for the mating with the prokaryote library.
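The estimate of 80 OD600nm units follows from treating one unit as the cells contained in 1 ml of culture at OD600nm 1. A minimal sketch under that assumption (the function name and the example readings are illustrative):

```python
def bait_volume_ml(culture_od, od_units=80.0):
    """Volume of bait culture (ml) containing the requested OD600nm units."""
    if culture_od <= 0:
        raise ValueError("OD600nm must be positive")
    return od_units / culture_od

# With the expected overnight OD600nm of about 1, roughly 80 ml are needed
print(bait_volume_ml(1.0))
print(bait_volume_ml(1.25))
```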
Thaw a vial containing the HGXYHP1 library slowly on ice. Add the contents of the vial to 20 ml YPGlu. Let those cells recover at 30°C, under gentle agitation for 10 minutes.
Mating
Put the 80 OD600nm units of bait culture into a 250 ml flask. Add the HGXYHP1 library culture to the bait culture. Transfer the mixture of diploids into 50 ml sterile tubes. Centrifuge, discard the supernatant and resuspend in YPGlu medium.
Distribute cells in 400 μl samples in YPGlu plates with glass beads. Spread cells by shaking the plates.
Incubate plates cells-up at 30°C for 4h30min.
Collection of mated cells
Wash and rinse plates and spread collected cells on DO-Leu-Trp-His+Tet plates.
Day 4
Selection of clones able to grow on DO-Leu-Trp-His+Tetracyclin: this medium allows us to isolate diploid clones presenting an interaction.
Count the His+ colonies on control plates.
The number of His+ cell clones will define which protocol is to be processed:
Upon 20 × 10⁶ His+ colonies
- if the number of His+ cell clones > 285 : then process overlay and then luminometry protocols on blue colonies (2.B and 2.C).
- If the number of His+ cell clones < 285 : process luminometry protocol (2.C).
- The following step leads to the selection of the strongest interaction.
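The branching rule above can be written as a small decision function. This is only a sketch: the function name is hypothetical, and because the text does not say what to do when the count is exactly 285, that case is flagged rather than guessed.

```python
def protocols_to_run(his_plus_count, threshold=285):
    """Return the assay sections to run for a given number of His+ clones."""
    if his_plus_count > threshold:
        # overlay first, then luminometry on the blue colonies
        return ["2.B X-Gal overlay", "2.C luminometry"]
    if his_plus_count < threshold:
        return ["2.C luminometry"]
    raise ValueError("count equal to the threshold is not covered by the text")

print(protocols_to_run(400))
print(protocols_to_run(120))
```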
2.B. The X-Gal overlay assay
The X-Gal overlay assay is performed directly on the selective medium plates after scoring the number of His+ colonies.
Material
Set up a waterbath. The water temperature should be 50°C.
• 0.5 M Na2HPO4 pH 7.5.
• 1.2% Bacto-agar.
• 2% X-Gal in DMF.
• Overlay mixture : 0.25 M Na2HPO4 pH 7.5, 0.5% agar, 0.1 % SDS, 7% DMF (LABOSI), 0.04% X-Gal (ICN). For each plate, 10 ml overlay mixture are needed.
• DO-Leu-Trp-His plates.
• Sterile toothpicks.
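The overlay mixture can be assembled from the listed stocks by simple proportion. The sketch below is a worked calculation, not a validated recipe: the 10% SDS stock is an assumption (no SDS stock is listed), and the volume of the X-Gal stock is counted as DMF when reaching the 7% DMF target.

```python
def overlay_volumes_ml(n_plates, ml_per_plate=10.0, sds_stock_pct=10.0):
    """Volumes (ml) of each stock for the X-Gal overlay mixture."""
    total = n_plates * ml_per_plate
    phosphate = total * 0.25 / 0.5   # 0.5 M Na2HPO4 stock -> 0.25 M final
    agar = total * 0.5 / 1.2         # 1.2% agar stock -> 0.5% final
    xgal = total * 0.04 / 2.0        # 2% X-Gal in DMF -> 0.04% final
    dmf_extra = total * 0.07 - xgal  # X-Gal stock volume counted as DMF
    sds = total * 0.1 / sds_stock_pct  # assumed 10% SDS stock -> 0.1% final
    water = total - (phosphate + agar + xgal + dmf_extra + sds)
    return {"phosphate": phosphate, "agar": agar, "xgal_stock": xgal,
            "dmf_extra": dmf_extra, "sds_stock": sds, "water": water}

for name, ml in overlay_volumes_ml(1).items():
    print(f"{name}: {ml:.3f} ml")
```

Under these assumptions the stocks account for almost the whole 10 ml per plate, leaving only a small volume of water to add.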
Experiment
Temperature of the overlay mix should be between 45 and 50°C.
Pour the overlay-mix over the plates in portions of 10 ml.
Collect them when the top layer is settled.
Incubate plates overlay-up at 30°C. Note the time.
Check for blue colonies regularly. If no blue colony appears, wait for overnight incubation. Mark with a pen and number the positives.
Streak the positive colonies on fresh DO-Leu-Trp-His plates with a sterile toothpick.
2.C. The luminometry assay
Grow His+ colonies overnight at 30°C in microtiter plates containing DO-Leu-Trp-His+Tetracyclin medium with shaking. The day after, dilute the overnight culture 15 times into a new microtiter plate containing the same medium. Incubate 5 hours at 30°C with shaking. Dilute samples 5 times and read OD600nm. Dilute again to obtain between 10,000 and 75,000 yeast cells/well in 100 μl final volume. Per well, add 76 μl of One Step Yeast Lysis Buffer (Tropix), 20 μl Sapphire II Enhancer (Tropix) and 4 μl Galacton Star (Tropix), and incubate 40 minutes at 30°C. Measure the β-Gal read-out (L) using a Luminometer (Trilux, Wallac).
Calculate value of OD600nm L and select interacting preys having the highest values.
At this step of the protocol, we have isolated diploid cell clones presenting interactions. The next step is now to identify polypeptides involved in the selected interactions.
Example 3: Identification of positive clones
3.A. PCR on yeast colonies
Introduction
PCR amplification of fragments of plasmid DNA directly on yeast colonies is a quick and efficient procedure to identify sequences cloned into this plasmid. It is directly derived from a published protocol (Wang H. et al., Analytical Biochemistry, 237, 145-146, 1996). However, it is not a standardized protocol: in our hands it varies from strain to strain and is dependent on experimental conditions (number of cells, Taq polymerase source, etc.). This protocol should be optimized to specific local conditions.
Materials
- For 1 well, PCR mix composition is:
32.5 μl water,
5 μl 10X PCR buffer (Pharmacia),
1 μl dNTP 10 mM,
0.5 μl Taq polymerase (5 U/μl) (Pharmacia),
0.5 μl oligonucleotide ABS1 10 pmole/μl
5'-GCGTTTGGAATCACTACAGG-3' (SEQ ID NO. 3259)
0.5 μl oligonucleotide ABS2 10 pmole/μl
5'-CACGATGCACGTTGAAGTG-3' (SEQ ID NO. 3260)
- 1 N NaOH
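Since the mix is given per well, a master mix for a full plate is obtained by scaling each component. The sketch below uses the per-well volumes listed above; the 10% pipetting overage and the function name are assumptions added for illustration.

```python
# Per-well volumes in microliters, as listed in the Materials section
PER_WELL_UL = {
    "water": 32.5,
    "10X PCR buffer": 5.0,
    "dNTP 10 mM": 1.0,
    "Taq polymerase": 0.5,
    "oligo ABS1": 0.5,
    "oligo ABS2": 0.5,
}

def master_mix_ul(n_wells, overage=0.10):
    """Scale the per-well PCR mix to n_wells, with a fractional overage."""
    factor = n_wells * (1.0 + overage)
    return {name: round(vol * factor, 1) for name, vol in PER_WELL_UL.items()}

print(sum(PER_WELL_UL.values()))  # mix volume dispensed per well
print(master_mix_ul(96))
```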
Experiment
Grow positive colonies overnight at 30°C on a 96 well cell culture cluster (Costar), containing 150 μl DO-Leu-Trp-His+Tetracyclin with shaking. Resuspend culture and transfer immediately 100 μl on a Thermowell 96 (Costar).
Centrifuge 5 minutes at 4,000 rpm at room temperature.
Remove supernatant. Dispense 5 μl NaOH in each well, shake 1 minute.
Place the Thermowell in the thermocycler (GeneAmp 9700, Perkin Elmer) 5 minutes at 99.9°C and then 10 minutes at 4°C.
In each well, add PCR mix, shake well.
The PCR program was set up as follows:
94°C 3 minutes
94°C 30 seconds
53°C 1 minute 30 seconds x 35 cycles
72°C 3 minutes
72°C 5 minutes
15°C ∞
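For scheduling purposes, the hold time of this program can be summed as below. The sketch assumes that the "x 35 cycles" bracket covers the three middle steps (the original layout is lost), and it ignores ramping between temperatures, so the real run time is longer.

```python
# (temperature in deg C, hold time in seconds)
INITIAL = [(94, 180)]
CYCLE = [(94, 30), (53, 90), (72, 180)]  # assumed to be the cycled steps
FINAL = [(72, 300)]
N_CYCLES = 35

def hold_time_seconds(initial, cycle, final, n_cycles):
    """Total programmed hold time, excluding ramping and the final 15 deg C hold."""
    return (sum(t for _, t in initial)
            + n_cycles * sum(t for _, t in cycle)
            + sum(t for _, t in final))

total_s = hold_time_seconds(INITIAL, CYCLE, FINAL, N_CYCLES)
print(total_s / 60, "minutes")
```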
The quality, the quantity and the length of the PCR fragment were checked on an agarose gel. The length of the cloned fragment was the estimated length of the PCR fragment minus 300 base pairs that corresponded to the amplified flanking plasmid sequences.
3.B. Plasmid rescue from yeast by electroporation
Introduction
The previous protocol of PCR on yeast cells may not be successful; in such a case, plasmids can be rescued from yeast by electroporation. This experiment allows the recovery of prey plasmids from yeast cells by transformation of E. coli with a yeast cellular extract. The prey plasmid can then be amplified and the cloned fragment can be sequenced.
Material
Plasmid rescue
Glass beads 425-600 μm (Sigma)
Phenol/chloroform (1/1 ) premixed with isoamyl alcohol (Amresco)
Extraction buffer: 2% Triton X100, 1 % SDS, 100 mM NaCl, 10 mM Tris-HCl pH 8.0, 1 mM EDTA pH 8.0.
Ethanol/NH4Ac mix: 6 volumes ethanol with 7.5 M NH4 acetate
70% ethanol
Yeast cells in patches on plates
Electroporation
SOC medium
M9 medium
Selective plates: M9-Leu+Ampicillin
2 mm electroporation cuvettes (Eurogentec)
Experiment
Plasmid rescue
Prepare cell patches on DO-Leu-Trp-His with cell culture of section 2.C. Scrape the cells of each patch into an Eppendorf tube, add 300 μl of glass beads to each tube, then add 200 μl extraction buffer and 200 μl phenol:chloroform:isoamyl alcohol (25:24:1).
Centrifuge tubes 10 minutes at 15,000 rpm.
Transfer 180 μl supernatant to a sterile Eppendorf tube and add to each 500 μl ethanol/NH4Ac, vortex.
Centrifuge tubes 15 minutes, 15,000 rpm at 4°C
Wash pellet with 200 μl 70% ethanol, remove ethanol and dry pellet.
Resuspend pellet in 10 μl water. Store extracts at -20°C.
Electroporation
Material : Electrocompetent MC1066 cells prepared according to standard protocols (Maniatis).
Add 1 μl of yeast plasmid DNA-extract to pre-chilled Eppendorf tube, and keep on ice.
Mix 1 μl plasmid yeast DNA-extract sample, add 20 μl electrocompetent cells and transfer in a cold electroporation cuvette.
Set the Biorad electroporator on 200 ohms resistance, 25 μF capacity; 2.5 kVolts. Place cuvette in the cuvette holder and electroporate.
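The electroporator settings translate into a nominal pulse time constant (resistance times capacitance) and a field strength across the 2 mm cuvette listed in the materials. The sketch below is a back-of-the-envelope calculation; the function name is illustrative, and the actual time constant also depends on the conductivity of the sample.

```python
def pulse_parameters(resistance_ohm=200.0, capacitance_uf=25.0,
                     voltage_kv=2.5, gap_cm=0.2):
    """Nominal RC time constant (ms) and field strength (kV/cm)."""
    tau_ms = resistance_ohm * capacitance_uf * 1e-6 * 1e3  # R*C in milliseconds
    field_kv_per_cm = voltage_kv / gap_cm
    return tau_ms, field_kv_per_cm

tau, field = pulse_parameters()
print(f"time constant ~{tau:.1f} ms, field ~{field:.1f} kV/cm")
```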
Add 1 ml SOC into the cuvette and transfer the cell-mix into a sterile Eppendorf tube. Let cells recover for 30 minutes at 37°C, spin the cells down 1 minute, 4,000x g and pour off supernatant. Keep about 100 μl medium and use it to resuspend the cells and spread them on selective plates (e.g., M9-Leu plates).
Incubate plates for 36 hours at 37°C.
Grow one colony and extract plasmids. Check presence and size of insert through enzymatic digestion and agarose gel. Sequence insert.
Example 4: Protein-protein interaction
For each bait, the previous protocol leads to the identification of prey polynucleotide sequences. In order to identify a protein-protein interaction, we need to characterize the obtained prey polypeptide sequence with regard to the Helicobacter pylori genome.
This may be accomplished with a software program named blastwun, available on the Internet site of the University of Washington (address given in Figure imgf000060_0001), which is a development version of software for gene and protein identification through similarity searches of protein and nucleotide sequence databases.
The Blastwun program compares the prey polynucleotide insert sequence (rescued from the prey plasmid) with the whole Helicobacter pylori genome (available on the NCBI web site: http://www.ncbi.nlm.nih.gov under GenBank accession number AE000511). This comparison leads to prey polynucleotide localizations in the H. pylori genome, each localization having a score depending on the homology of sequence. For each prey polynucleotide, we consider the localization with the highest score and, if the insert sequence is included in and is in phase with an Open Reading Frame, we can identify one prey polypeptide interacting with one bait polypeptide.
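The selection rule described above (best-scoring localization, then inclusion in and phase with an ORF) can be sketched as follows. This is an illustration under stated assumptions, not the program used here: the data classes, the ORF names and the coordinates are hypothetical, only the forward strand is handled, and "in phase" is taken to mean that the first base of the insert is the first base of a codon of the ORF.

```python
from dataclasses import dataclass
from typing import List, Optional

@dataclass
class Hit:
    start: int    # genome coordinate of the first insert base
    end: int      # genome coordinate of the last insert base
    score: float  # similarity score of this localization

@dataclass
class Orf:
    name: str
    start: int    # first base of the start codon
    end: int      # last base of the ORF

def assign_prey(hits: List[Hit], orfs: List[Orf]) -> Optional[str]:
    """Return the ORF containing the best hit in phase, or None."""
    if not hits:
        return None
    best = max(hits, key=lambda h: h.score)
    for orf in orfs:
        included = orf.start <= best.start and best.end <= orf.end
        in_phase = (best.start - orf.start) % 3 == 0
        if included and in_phase:
            return orf.name
    return None

# Hypothetical coordinates for illustration
orfs = [Orf("ORF_A", 100, 999), Orf("ORF_B", 1500, 2600)]
hits = [Hit(130, 400, 50.0), Hit(2000, 2300, 20.0)]
print(assign_prey(hits, orfs))
```

An insert that lies inside an ORF but starts out of phase is rejected, which is why the second check is kept separate from the inclusion test.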
Helicobacter pylori ORF sequences are available on the World-Wide Web site of The Institute for Genomic Research (TIGR) at http://www.tigr.org/tdb/mdb/hpdb/hpdb.html.
This web page allows several requests concerning the Helicobacter pylori genome, in particular its ORF sequences. To get the sequence of a specific ORF, click on the window named "HP#" and click search. This operation leads to a new web page presenting the nucleic and peptide sequences of the specific ORF.
See Table 1 : Protein-protein interactions in Helicobacter pylori.
Example 5: Identification of SID®
The experiment in step 4 results in the sequences of each prey fragment encoding an interacting prey polypeptide.
By comparing and selecting the intersection of all isolated fragments that are included in the same polypeptide, we define the Selected Interacting Domain (SID®); see Figure 15.
See results in Table 2.
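The intersection that defines a SID® can be illustrated with a few lines of Python. The sketch assumes that each prey fragment is described by its first and last amino-acid positions (inclusive) within the same polypeptide; the function name and the coordinates are hypothetical.

```python
from typing import List, Optional, Tuple

def selected_interacting_domain(
        fragments: List[Tuple[int, int]]) -> Optional[Tuple[int, int]]:
    """Intersection of prey fragments given as (start, end), inclusive."""
    if not fragments:
        return None
    start = max(s for s, _ in fragments)  # latest fragment start
    end = min(e for _, e in fragments)    # earliest fragment end
    return (start, end) if start <= end else None

# Three hypothetical overlapping prey fragments of one polypeptide
print(selected_interacting_domain([(10, 200), (50, 250), (30, 180)]))
```

Fragments that share no common region yield no domain, which would indicate that they should not be pooled into a single SID®.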
Example 6: Screening of modulating agent
Select one specific interaction.
Transform a permeabilized yeast cell with plasmids containing bait polypeptide and prey polypeptide of the specific interaction.
Plate a top agar containing transformed permeabilized yeast cells on square boxes (that already contain agarose gel).
Apply by spotting the compounds to test on top agar as soon as it is solidified. Incubate overnight at 30°C.
Analyse results : select lead compounds that prevent transformed permeabilized yeast cells from growing.
Example 7: Gene therapy Example using SID® polypeptides
An expression vector containing the SID® polynucleotide is made in the manner described in U.S. Patent 4,980,286. It is then administered to patients to treat H. pylori infections.
Example 8: Making of polyclonal and monoclonal antibodies
The protein-protein complex of Table 1 was injected into mice and polyclonal and monoclonal antibodies were made following the procedure set forth in Sambrook et al supra.
More specifically, mice are immunized with an immunogen comprising complexes conjugated to keyhole limpet hemocyanin using glutaraldehyde or EDC as is well known in the art. The complexes can also be stabilized by crosslinking as described in WO 00/37483. The immunogen is then mixed with an adjuvant. Each mouse receives four injections of 10 μg to 100 μg of immunogen, and after the fourth injection, blood samples are taken from the mice to determine if the serum contains antibodies to the immunogen. Serum titer is determined by ELISA or RIA. Mice with sera indicating the presence of antibody to the immunogen are selected for hybridoma production.
Spleens are removed from immune mice and a single-cell suspension is prepared (Harlow et al., 1988). Cell fusions are performed essentially as described by Kohler et al. Briefly, P365.3 myeloma cells (ATCC, Rockville, Md) or NS-1 myeloma cells are fused with spleen cells using polyethylene glycol as described by Harlow et al. Cells are plated at a density of 2 × 10⁵ cells/well in 96-well tissue culture plates. Individual wells are examined for growth and the supernatants of wells with growth are tested for the presence of Table 1 complex-specific antibodies by ELISA or RIA using the Table 1 complex as a target protein. Cells in positive wells are expanded and subcloned to establish and confirm monoclonality.
Clones with the desired specificities are expanded and grown as ascites in mice or in a hollow fiber system to produce sufficient quantities of antibodies for characterization and assay development. Antibodies are tested for binding to bait polypeptide (from column 1 of Table 1 ) alone or to prey polypeptide (from column 2 of Table 1 ) alone, to determine which are specific for the Table 1 complex as opposed to those that bind to the individual proteins.
Monoclonal antibodies against each of the complexes set forth in Table 1 are prepared in a similar manner by mixing specified proteins together, immunizing an animal, fusing spleen cells with myeloma cells and isolating clones which produce antibodies specific for the protein complex, but not for individual proteins.
Example 9: Classification of genes of H. pylori within function categories at the genomic scale using 2 exhaustive libraries in E. coli
1. Bacterial strains, growth and storage conditions.
Escherichia coli strains DH5α (BRL), HB101 (Boyer and Roulland-Dussoix, 1969) and NS2114 (Rif) (Seifert et al., 1986) were used as hosts for plasmid cloning and disruption experiments and were grown at 37°C in L-broth (10 g of tryptone, 5 g of yeast extract and 5 g of NaCl per liter, pH 7.0) or on L-agar plates (1.5% agar). Antibiotics were used at the following final concentrations (μg/ml) unless indicated in the text: spectinomycin: 100 (Upjohn Laboratories, Paris, France), tetracycline: 8 (Sigma Chemicals, Saint-Quentin Fallavier, France), kanamycin: 25 (Serva, Frankfurt, Germany), rifampicin: 100 (Sigma Chemicals). Independent recombinant E. coli were saved by storing up to 96 clones individually in 96-well microtitre plates; clones were inoculated into L-broth supplemented with 8 μg/ml tetracycline, 100 μg/ml spectinomycin and 7% DMSO (Sigma) and stored at -80°C. H. pylori strains 26695 (Tomb et al., 1997), HAS141 (Janvier et al., 1999), N6 (Ferrero et al., 1992) and X47-2AL (Guy et al., 1999) were routinely cultured on 10% horse blood agar medium (Blood Agar Base no. 2; Oxoid, Lyon, France) or in Brucella broth supplemented with 10% Fetal Calf Serum (Gibco). Solid and liquid media contained supplements at the following final concentrations: 10 μg vancomycin (Dakota Pharmaceuticals, Creteil, France), 2.5 IU polymyxin (Pfizer Laboratories, Orsay, France), 5 μg trimethoprim (Sigma) and 4 μg amphotericin B (Bristol-Myers Squibb, Paris, France)/ml. Plates were incubated at 37°C under microaerobic conditions in an anaerobic jar with a carbon dioxide generator (CampyGen, Oxoid) without catalyst. H. pylori that had undergone chromosomal allelic exchange were selected on medium supplemented with 25 μg/ml kanamycin.
Production of amplicons corresponding to each of the 1590 ORFs originally identified by TIGR in strain 26695: 1621 pairs of forward and reverse oligonucleotides targeting the 1590 ORFs of the genome of strain 26695, as assigned on the Web site in 1997, were designed and synthesized by Eurogentec (Bel S.A., Seraing, Belgium). Pairs (sense and antisense, Table 3) of oligonucleotides were designed in order to allow full-length amplification of each of the ORFs, with the exception of ORFs with a size over 3 kb, which were split into two or three PCR products. Every forward and reverse 21 bp oligonucleotide was tagged at its 5' extremity with a CAUCAUCAU (SEQ ID No. 3261) and a CUACUACUA (SEQ ID No. 3262) sequence, respectively. Genomic DNA of Hp strain 26695, prepared by cesium chloride extraction (Labigne-Roussel et al., 1988), was used as a template for PCR. Seventeen times 96 amplicons were produced by polymerase chain reactions (PCR) using a PCR-express thermal cycler through 40 cycles consisting of a denaturation step of 94°C for 2 mn, a primer annealing step of 50°C for 30 sec. and an extension step at 72°C for 2 mn, under a 96-well format. The amplicons were controlled for size and quality (single band) on agarose gel.
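The 96-well format explains the seventeen plates mentioned above. A minimal sketch, assuming one oligonucleotide pair per well (an assumption; the function name is illustrative):

```python
import math

def plate_layout(n_reactions, wells_per_plate=96):
    """Number of plates needed and wells used on the last plate."""
    n_plates = math.ceil(n_reactions / wells_per_plate)
    last_plate_wells = n_reactions - (n_plates - 1) * wells_per_plate
    return n_plates, last_plate_wells

# 1621 oligonucleotide pairs, one PCR per well
print(plate_layout(1621))
```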
Systematic UDG cloning of the ORFs of H. pylori strain 26695
The cloning of the 96 amplicons was performed using the ligation-independent method described by Rashtchian (Rashtchian, 1995). First, the linear pILL570-» derivative vector was prepared by gene amplification using #570-1 and #570-2 (Table 4) as primers and the pILL570 plasmid (Labigne et al., 1992) as a template. Three microliters of individual HP0001 to HP1590 PCR products were mixed together with 2 μl of pILL570-» derivative vector (75 ng), 14 μl of 1X PCR buffer and 1 μl of uracil DNA glycosylase (UDG) in a 2000-μl 96-well disposable plate. The plates were incubated at 37°C for 30 mn, allowing the enzymatic reaction as well as the hybridization between the protruding complementary extensions to occur. Competent DH5α cells (100 μl) harboring the pTCA plasmid (Seifert et al., 1986) were added to each well, and the 96-well plate was further incubated for 45 mn on ice. One ml of prewarmed L-broth was added to each well, and the plate was then incubated for 90 mn at 37°C. Finally, a selective antibiotic cocktail containing spectinomycin and tetracycline was added to each well to positively select and enrich for DH5α (pTCA) cells transformed with the pILL570-« derivative recombinant plasmid; plates were then incubated for another 13 hours at 37°C under agitation. Individual transformant colonies were isolated by spotting 10 μl of liquid culture from each well on square agar plates containing tetracycline and spectinomycin using a 96-well inoculator designed to deliver a 10 μl liquid volume; cloning of the PCR product was confirmed by mini-preparation of recombinant plasmids restricted with ClaI-AvaI. They were stored in DMSO (7%) at -80°C under a 96-well format as "library I" consisting of plate I.1 to plate XVII.1.
Systematic disruption of the recombinant plasmids by transposon mutagenesis in E. coli. Transposon mutagenesis of individual E. coli clones was performed using the mini-Tn3-Km transposon as previously described by Jenks et al. (Jenks et al., in press). All manipulations were performed in a 96-well format and four independent transposon mutageneses were carried out in parallel so as to saturate the mutagenesis disrupting process with independent events. Briefly, the stored microtitre plates containing the individual E. coli DH5α clones that harbored the pTCA plus the recombinant pILL570-* derivative plasmids were thawed and used to inoculate fresh plates. Plasmid pILL553 harboring the mini-Tn3-Km transposon (Seifert et al., 1986; Labigne, 1997) (a low copy auto-transferable plasmid pOX38 derivative) was transferred into these E. coli DH5α clones by conjugation. Transconjugants harboring all three plasmids (recombinant pILL570-» derivative, pTCA and pILL553) were selected by spotting 10 μl of the mating mixture on L-agar containing 25 μg/ml kanamycin, 8 μg/ml tetracycline and 100 μg/ml spectinomycin. Cointegrates were transferred by conjugation into E. coli NS2114SmRif carrying the cre gene. Positive selection of resolved forms of the cointegrates was obtained by growth on L-agar containing 100 μg/ml rifampicin (Sigma), 625 μg/ml kanamycin, 625 μg/ml spectinomycin and 625 μg/ml streptomycin. A pool of each TnKm disrupted recombinant plasmid was stored as an individual stock of disrupted ORFs at -80°C in DMSO (7%), designated "library II" consisting of plate I.2 to plate XVII.2.
DNA preparation and standard molecular biology techniques. The alkaline lysis procedure (Sambrook et al., 1989) was used for small-scale preparation and MIDI Qiagen (Qiagen, Courtaboeuf, France) columns were used for large-scale plasmid preparation. Whole cell genomic DNA from individual generated H. pylori mutants was extracted using the QIAamp Tissue Kit (Qiagen) according to the manufacturer's instructions. DNA/DNA hybridizations were performed on nitrocellulose membranes (Schleicher and Schuell) according to the procedure of Sambrook et al. (Sambrook et al., 1989). Membranes were hybridized under standard conditions with [α-32P]-deoxyribonucleotide probes labeled by random priming using the MegaPrime DNA system (Amersham) according to the manufacturer's instructions.
Transformation of H. pylori. H. pylori strains were naturally transformed with circular plasmid DNA (~ 2 μg per transformation). Briefly, bacteria were inoculated as 1 cm patches and grown for 5 h before addition of 10 μl supercoiled plasmid DNA. Each disrupted plasmid, consisting either of a pool of disrupted plasmids when originating from library II or of a single recombinant plasmid for the non-polar mutation construction, was added to 4 independently prepared patches of H. pylori. After further incubation for 18 h, the bacteria from each individual patch were harvested and plated directly onto a single plate of selective medium (kanamycin, 25 μg/ml). Six individual kanamycin transformants were then subcultured. Chromosomal DNA was extracted using the QIAamp kit, and the constructed mutant was characterized by several PCR controls and/or hybridization as described in the results section.
Introduction of a non-polar mutation in a selected ORF (HP000X). For a given cloned ORF (HP000X) present in Library I, the recombinant plasmid was prepared and used as a template for reverse PCR performed with two oligonucleotides. Oligonucleotide HP000X-1 consists of 24 nucleotides complementary to an intragenic sequence located 300 bp downstream of the 5'-end of the gene, and HP000X-2 consists of 24 nucleotides complementary to an intragenic sequence located 300 bp upstream of the 3'-end of the gene. All the HP000X-1 and HP000X-2 oligos were tagged respectively at their 5'-end with a CGGGGTACC (SEQ ID No. 3261) (KpnI) sequence and at their 3'-end with a CGCGGATCC (SEQ ID No. 3262) (BamHI) sequence (Table 4). Following reverse amplification with high fidelity Taq polymerase (Boehringer) from the given cloned ORF, 5 μl of PCR product were restricted with DpnI in order to eliminate the template molecule, then restricted with KpnI and BamHI and directly ligated to the promoterless non-polar KpnI-BamHI kanamycin cassette (0.9 kb) previously described (Skouloubris et al., 1998). Transformants were selected on spectinomycin-containing plates. Recombinant plasmids were then purified, controlled, and introduced by natural transformation into H. pylori cells.
Example 10 - Study of the interaction between two essential genes, HP1230-HP1529, by random mutagenesis.
1. Preparation of mutagenized SID1529 collection
A. Collection preparation and transformation in Escherichia coli
i. Random mutagenesis of SID1529 by PCR
Mutagenized SID1529 was obtained by PCR using the Taq polymerase (Stratagene) and 200 ng of Helicobacter pylori genomic DNA and the following oligonucleotides :
5'-ATTTGCGGCCGCAATCTTGGCGCTAGTCAAACAA-3' (SEQ ID No. 3263)
5'-CCGGGATCCTCAAGATTGGGCGTTAATTTGGAT-3' (SEQ ID No. 3264)
The PCR program was set up as follows :
94° 1'
45° 1' x 30 cycles
75° 1'
The PCR conditions were as follows :
KCl (50 mM), Tris pH 9.0 (10 mM), MgCl2 (7 mM), MnCl2 (0.2 mM)
The amplification was checked by agarose gel electrophoresis.
B. Digestion of mutagenized SID1529 and pP7-centro vector
The PCR fragments were purified with Qiaquick column (Qiagen) according to the manufacturer's protocol and digested (NotI-BamHI). The vector (pP7-centro) (see Figure 20) was digested (NotI-BamHI) and dephosphorylated according to standard protocol (Sambrook et al.).
C. Library transformation in Escherichia coli
The method followed in this section is the same as the one described in Example 1.A.5 above.
2. Collection transformation in Saccharomyces cerevisiae
The method followed in this section is the same as the one described in Example 1.B above.
3. Construction of HP1230 bait plasmid
The genomic amplification of the HP1230 ORF was obtained by PCR using the Pfu proofreading Taq polymerase (Stratagene) and 200 ng of Helicobacter pylori genomic DNA as template.
The PCR program was set up as follows:
94° 45"
94° 45"
48° 45" x 30 cycles
72° 6'
72° 10'
The amplification was checked by agarose gel electrophoresis.
The PCR fragments were purified with Qiaquick column (Qiagen) according to the manufacturer's protocol. The digested PCR fragments were ligated into an adequately digested (BamHI-PstI) and dephosphorylated bait vector (pB1) according to standard protocol (Sambrook et al.) and were transformed into competent bacterial cells. The cells were grown, the DNA extracted and the plasmid was sequenced.
4. Screening the mutagenized SID1529 collection with the HP1230 bait protein using two-hybrid in yeast
The method followed in this section is the same as the one described in Example 2 above, with the exception that DO-Leu-Trp-His + Tet plates were replaced by DO-Leu-Trp-His + Tet + 40 mM 3-AT plates.
5. Identification of positive clones
The method followed in this section is the same as the one described in Example 3 above. 6. The SID 1529* (V53L) inhibits the interaction between HP1529 and HP1230
In the three-hybrid system, the HP1529 protein is expressed fused to the GAL4 Activation Domain (AD) in the pP6 plasmid, whereas HP1230 is introduced in the p3H1 vector in fusion with the DNA-binding domain (DBD) of GAL4. In addition, this vector contains the Met25 promoter which allow expression of a third partner in medium lacking methionine. After transformation of Y187 and CG1945 yeast by the pP6-HP1529 and p3H1-HP1230 vectors, respectively, both strains were mated. The resulting diploid strain was grown on a minimal medium lacking leucine and tryptophan to select for both plasmids (DO-2) and on DO-2 without histidine to select for interaction (DO-3). As a positive control, this strain was observed to grow on the selective medium for dilutions ranging from 1 to 10'4 (Figure 19, lane 1 ). This result shows an interaction between HP1230 and HP1529 proteins, as previously identified using library screening (Rain et al., 2001 ).
Two different plasmids were used for this study: (i) the pP6 vector, which contains the GAL4 activation domain (AD) (Rain et al., 2001). One of the HP1529 fragments (nucleotides 1-1374) obtained by screening with the HP1230 protein was selected and used as prey in the pP6 vector fused to GAL4 AD; (ii) the p3H1 vector, which contains the DNA-binding domain (DBD) of GAL4 and a methionine-regulated Met25 promoter (Tirode et al., 1997, J. Biol. Chem. 272: 22995-22999). The HP1230 encoding sequence of 540 bp was sub-cloned from pB1-HP1230 into the BamHI/PstI sites of p3H1 as a fusion protein with GAL4-DBD, giving p3H1-HP1230. In addition, the WT SID1529, SID1529* (N38D-V53L) or SID1529* (V53L) was sub-cloned from pP7-centro (NotI-BamHI) into the NotI/BglII sites of p3H1-HP1230 under the control of the Met25 promoter. Expression from the Met25 promoter is obtained in the absence of methionine. As a negative control, we used a prey encoding the HP0875 protein. All PCR fragments and in-frame fusions were checked by sequencing.
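The variant labels used here, such as V53L and N38D-V53L, follow the usual convention of wild-type residue, 1-based position, replacement residue. As a minimal sketch of how such labels can be parsed and applied to a peptide sequence, assuming single-letter amino acid codes and hyphen-separated substitutions, the following may be considered. The function names and the toy sequence in the usage note are hypothetical; the actual SID1529 sequence is not reproduced here.

```python
import re

# One substitution: wild-type residue, 1-based position, replacement residue.
_SUBSTITUTION = re.compile(r"^([A-Z])(\d+)([A-Z])$")


def parse_substitutions(label):
    """Turn a label such as 'N38D-V53L' into [('N', 38, 'D'), ('V', 53, 'L')]."""
    substitutions = []
    for token in label.split("-"):
        match = _SUBSTITUTION.match(token.strip())
        if match is None:
            raise ValueError(f"unrecognised substitution: {token!r}")
        wild_type, position, replacement = match.groups()
        substitutions.append((wild_type, int(position), replacement))
    return substitutions


def apply_substitutions(sequence, label):
    """Return the sequence with every substitution in the label applied.

    Raises ValueError if a position is out of range or if the residue found
    at that position is not the stated wild-type residue.
    """
    residues = list(sequence)
    for wild_type, position, replacement in parse_substitutions(label):
        if position < 1 or position > len(residues):
            raise ValueError(f"position {position} is outside the sequence")
        found = residues[position - 1]
        if found != wild_type:
            raise ValueError(
                f"expected {wild_type} at position {position}, found {found}"
            )
        residues[position - 1] = replacement
    return "".join(residues)
```

For example, with a made-up 60-residue sequence `toy = "A" * 37 + "N" + "A" * 14 + "V" + "A" * 7`, calling `apply_substitutions(toy, "N38D-V53L")` changes residue 38 to D and residue 53 to L. The wild-type check guards against applying a label to the wrong fragment or numbering frame.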
The pP6- and p3H1-derived vectors were used to transform the Y187 and CG1945 yeast strains, respectively. Both strains were mated in YPD buffer (Yeast Peptone Dextrose; Bio 101, Inc.) for 4 hours at 30°C and the resulting diploid strain was selected on a minimal medium lacking leucine and tryptophan (DO-2). The interaction between proteins was observed on plates containing DO-2 lacking histidine (DO-3) without methionine.
To assay whether the different versions of SID1529 can modulate this interaction, each version was cloned in the p3H1-HP1230 vector under the control of the Met25 promoter. In the presence of the WT SID1529, growth was observed on DO-3 - Met medium, showing that the WT SID1529 had little or no effect on the HP1230-HP1529 interaction (Figure 19, lane 2). In contrast, cells transformed with pP6-HP1529 and p3H1-HP1230-SID1529* (N38D-V53L) or p3H1-HP1230-SID1529* (V53L) were almost unable to grow on the selective medium (Figure 19, lanes 3-4). The growth of both strains on selective medium in the presence of methionine (DO-3 + Met) was not affected, showing that the effect of SID1529* (N38D-V53L) or SID1529* (V53L) is a specific inhibition of the HP1230-HP1529 interaction (Figure 18, lanes 3-4). Taken together, these results clearly demonstrate that substitution of valine by leucine at position 53 of SID1529 led to complete inhibition of the HP1230-HP1529 interaction and confirm that SID1529 derivatives might have some potential as lead compounds to inhibit Helicobacter pylori growth.
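The reasoning applied to the plates above can be summarised as a small decision rule. The sketch below is illustrative only and assumes that growth is scored as a simple yes/no on DO-3 medium without methionine (third partner expressed from the Met25 promoter) and on DO-3 medium with methionine (third partner repressed). The function name and the returned labels are hypothetical. Graded phenotypes such as "almost unable to grow" would need a quantitative score rather than a boolean.

```python
def classify_third_partner(grows_without_met, grows_with_met):
    """Interpret a three-hybrid result for one bait/prey/third-partner strain.

    grows_without_met: growth on DO-3 - Met (third partner expressed).
    grows_with_met:    growth on DO-3 + Met (third partner repressed).
    """
    if grows_with_met and grows_without_met:
        # Reporter is active whether or not the third partner is made.
        return "no detectable effect on the interaction"
    if grows_with_met and not grows_without_met:
        # Interaction is intact until the third partner is expressed.
        return "specific inhibition of the interaction"
    if not grows_with_met and grows_without_met:
        # Reporter is only active when the third partner is present.
        return "interaction depends on the third partner"
    # No growth under either condition: the bait/prey interaction itself
    # is not detected, so nothing can be said about the third partner.
    return "not interpretable"


if __name__ == "__main__":
    # Pattern reported for the WT SID1529 (growth on both media).
    print(classify_third_partner(True, True))
    # Pattern reported for SID1529* (V53L), scored as no growth without Met.
    print(classify_third_partner(False, True))
```

The control on DO-3 + Met is what makes the inhibition call specific: it shows that the strain can still activate the reporter when the third partner is not expressed.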
Example 11: Modulating compounds/PIM screening
The results obtained from these Examples, as well as the teachings of the specification, are set forth in the Tables below.
While the invention has been described in terms of the various preferred embodiments, the skilled artisan will appreciate that various modifications, substitutions, omissions and changes may be made without departing from the scope thereof. Accordingly, it is intended that the present invention be limited by the scope of the following claims, including equivalents thereof.
REFERENCES
Akerley, B.J., Rubin, E.J., Camilli, A., Lampe, D.J., Robertson, H.M., and Mekalanos, J.J. (1998) Systematic identification of essential genes by in vitro mariner mutagenesis. Proc Natl Acad Sci USA 95: 8927-8932.
Alm, R.A., Ling, L.S., Moir, D.T., King, B.L., Brown, E.D., Doig, P.C., et al. (1999) Genomic-sequence comparison of two unrelated isolates of the human gastric pathogen Helicobacter pylori [published erratum appears in Nature 1999 Feb 25:397(6721):719]. Nature 397: 176-180.
Bijlsma, J.J., Vandenbroucke-Grauls, C.M., Phadnis, S.H., and Kusters, J.G. (1999) Identification of virulence genes of Helicobacter pylori by random insertion mutagenesis. Infect Immun 67: 2433-2440.
Blaser, M.J. (1993) Helicobacter pylori: microbiology of a 'slow' bacterial infection. Trends Microbiol 1: 255-260.
Boyer, H.W., and Roulland-Dussoix, D. (1969) A complementation analysis of the restriction and modification of DNA in Escherichia coli. J Mol Biol 41: 459-472.
Cao, P., McClain, M.S., Forsyth, M.H., and Cover, T.L. (1998) Extracellular release of antigenic proteins by Helicobacter pylori. Infect Immun 66: 2984-2986.
Chalker, A.F., Minehart, H.W., Hughes, N.J., Koretke, K.K., Lonetto, M.A., Brinkman, K.K., et al. (2001) Systematic identification of selective essential genes in Helicobacter pylori by genome prioritization and allelic replacement mutagenesis. J Bacteriol 183: 1259-1268.
Chevalier, C., Thiberge, J.-M., Ferrero, R.L., and Labigne, A. (1999) Essential role of Helicobacter pylori γ-glutamyltranspeptidase for the colonization of the gastric mucosa of mice. Mol Microbiol 31: 1359-1372.
Colland, F., Rain, J.-C., Gounon, P., Labigne, A., Legrain, P., and De Reuse, H. (2001) Identification of the Helicobacter pylori anti-σ28 factor. Molecular Microbiology: in press.
Ferrero, R.L., Cussac, V., Courcoux, P., and Labigne, A. (1992) Construction of isogenic urease-negative mutants of Helicobacter pylori by allelic exchange. J Bacteriol 174: 4212-4217.
Guy, B., Hessler, C., Fourage, S., Rokbi, B., and Quentin Millet, M.-J. (1999) Comparison between targeted and untargeted systemic immunizations with adjuvanted urease to cure Helicobacter pylori infection in mice. Vaccine 17: 1130-1135.
Haas, R., Meyer, T.F., and van Putten, J.P. (1993) Aflagellated mutants of Helicobacter pylori generated by genetic transformation of naturally competent strains using transposon shuttle mutagenesis. Mol Microbiol 8: 753-760.
Janvier, B., Grignon, B., Audibert, C., Pezennec, L., and Fauchere, J.L. (1999) Phenotypic changes of Helicobacter pylori components during an experimental infection in mice. 24: 27-33.
Jenks, P.J., Chevalier, C., Ecobichon, C., and Labigne, A. (in press) Identification of non-essential Helicobacter pylori genes using random mutagenesis and loop amplification (RMLA). Res Microbiol.
Labigne, A. (1997) Random mutagenesis of the H. pylori genome. In Helicobacter pylori protocols. Clayton, C.L., and Mobley, H.L.T. (eds). Totowa, N.J.: Humana Press, pp. 153-163.
Labigne, A., Courcoux, P., and Tompkins, L. (1992) Cloning of Campylobacter jejuni genes required for leucine biosynthesis, and construction of leu-negative mutant of C. jejuni by shuttle transposon mutagenesis. Research in Microbiology 143: 15-26.
Labigne-Roussel, A., Courcoux, P., and Tompkins, L. (1988) Gene disruption and replacement as a feasible approach for mutagenesis of Campylobacter jejuni. J Bacteriol 170: 1704-1708.
McCarthy, M.A., Haebel, P.W., Torronen, A., Rybin, V., Baker, E.N., and Metcalf, P. (2000) Crystal structure of the protein disulfide bond isomerase, DsbC, from Escherichia coli. Nat Struct Biol 7: 196-199.
Moskovitz, J., Rahman, M.A., Strassman, J., Yancey, S.O., Kushner, S.R., Brot, N., and Weissbach, H. (1995) Escherichia coli peptide methionine sulfoxide reductase gene: regulation of expression and role in protecting against oxidative damage. J Bacteriol 177: 502-507.
Odenbreit, S., Till, M., Hofreuter, D., Faller, G., and Haas, R. (1999) Genetic and functional characterization of the alpAB gene locus essential for the adhesion of Helicobacter pylori to human gastric tissue. Mol Microbiol 31: 1537-1548.
Ogura, K., Maeda, S., Nakao, M., Watanabe, T., Tada, M., Kyutoku, T., et al. (2000) Virulence factors of Helicobacter pylori responsible for gastric diseases in Mongolian gerbil. J Exp Med 192: 1601-1610.
Parsonnet, J., Friedman, G.D., Vandersteen, D.P., Chang, Y., Vogelman, J.H., Orentreich, N., and Sibley, R.K. (1991) Helicobacter pylori infection and the risk of gastric carcinoma. N Engl J Med 325: 1127-1131.
Parsonnet, J., Hansen, S., Rodriguez, L., Gelb, A.B., Warnke, R.A., Jellum, E., et al. (1994) Helicobacter pylori infection and gastric lymphoma. N Engl J Med 330: 1267-1271.
Rain, J.-C., Selig, L., de Reuse, H., Battaglia, V., Reverdy, C., Simon, S., et al. (2001) The protein-protein interaction map of Helicobacter pylori. Nature 409: 211-215.
Rashtchian, A. (1995) Novel methods for cloning and engineering genes using the polymerase chain reaction. Current Opinion in Biotechnology 6: 30-36.
Salama, N., Guillemin, K., McDaniel, T.K., Sherlock, G., Tompkins, L., and Falkow, S. (2000) A whole-genome microarray reveals genetic diversity among Helicobacter pylori strains. Proc Natl Acad Sci USA 97: 14668-14673.
Sambrook, J., Fritsch, E.F., and Maniatis, T. (1989) Molecular cloning: a laboratory manual. Cold Spring Harbor, N.Y.: Cold Spring Harbor Laboratory Press.
Seifert, H.S., Chen, E.Y., So, M., and Heffron, F. (1986) Shuttle mutagenesis: a method of transposon mutagenesis for Saccharomyces cerevisiae. Proc Natl Acad Sci USA 83: 735-739.
Skouloubris, S., Thiberge, J.M., Labigne, A., and De Reuse, H. (1998) The Helicobacter pylori UreI protein is not involved in urease activity but is essential for bacterial survival in vivo. Infection & Immunity 66: 4517-4521.
Tirode, F., Malaguti, C., Romero, F., Attar, R., Camonis, J., and Egly, J.M. (1997) A conditionally expressed third partner stabilizes or prevents the formation of a transcriptional activator in a three-hybrid system. J Biol Chem 272: 22995-22999.
Tomb, J.-F., White, O., Kerlavage, A.R., Clayton, R.A., Sutton, G.G., Fleischmann, R.D., et al. (1997) The complete genome sequence of the gastric pathogen Helicobacter pylori. Nature 388: 539-547.
Wizemann, T.M., Moskovitz, J., Pearce, B.J., Cundell, D., Arvidson, C.G., So, M., et al. (1996) Peptide methionine sulfoxide reductase contributes to the maintenance of adhesins in three major pathogens. Proc Natl Acad Sci USA 93: 7985-7990.
Yoshimoto, T., Higashi, H., Kanatani, A., Lin, X.S., Nagai, H., Oyama, H., et al. (1991) Cloning and sequencing of the 7 alpha-hydroxysteroid dehydrogenase gene from Escherichia coli HB101 and characterization of the expressed enzyme. J Bacteriol 173: 2173-2179.
Zapun, A., Missiakas, D., Raina, S., and Creighton, T.E. (1995) Structural and functional characterization of DsbC, a protein involved in disulfide bond formation in Escherichia coli. Biochemistry 34: 5075-5089.
TABLE 1 Complexes of interacting proteins
Figure imgf000075_0001
Figure imgf000076_0001
Figure imgf000077_0001
Figure imgf000078_0001
Figure imgf000079_0001
Figure imgf000080_0001
Figure imgf000081_0001
Figure imgf000082_0001
Figure imgf000083_0001
TABLE 2
Figure imgf000084_0001
SID®s Sequences
Figure imgf000084_0002
Figure imgf000085_0001
Figure imgf000085_0002
HP0339 15 ATGGCGAAAGCTATGTAGCAACGGGCAATATTTATGACACGCTCGTGCAATTCAGATACGGCACCACAGAAGTTGAA 16 GESYVATGNIYDTLVQFRYGTTE
CCCGCCTTAGCGACAAGCTGGGACATATCCCCAGATGGTCTTGTATATACCTTTCATTTACGCAAAGGGGTTTATTTC LATSWDISPDGLVYTFHLRKGVYF
CACCAAACGAAGTATTGGAATAAAAAAGTAGAGTTTAGCGCTAAAGATGTGCTGTTTTCGTTTGAACGCCAGATGGAT KYWNKKVEFSAKDVLFSFERQM
AAAGCTAAACGATATTATAGCCCGGGGGCTAAAAGCTATAAGTATTGGGAAGGCATGGGCATGTCTCATATTATTAA RYYSPGAKSYKYWEG G SHIIK
GAGCATTGAAGCTTTAGATGACTATACCATTAGATTCACACTTAATGGGCCAGAAGCCCCGTTTTTAGCGAATTTGGG LDDYTIRFTLNGPEAPFLANLG D
CATGGACTTTTTGAGCATTTTGAGTAAGGATTACGCTGATTACCTGGCTCAAAATAATAAAAAAGACGAGTTGGCTAA LSKDYADYLAQNNKKDELAKKPIG
AAAACCTATTGGGACAGGGCCTTTCAAATTCTTTTTGTGGAATAAAGATGAAAAAATCATTCTTTTAAAAAATCAAGAT FKFFLWNKDEKIILLKNQDYWGPK
TATTGGGGGCCTAAAGCGTATTTGGATAAGGTGGTGGTGCGCACCATTCCTAATTCTTCCACTCGCGCTTTAGCGTT DKVWRTIPNSSTRALALRTGEIM
GCGCACCGGCGAAATCATGCTCATGACTGGGCCTAATCTCAATGAAGTGGAGCAATTAGAAAAAGTCCCTAATATCG GPNLNEVEQLEKVPNIVVDKSAG
TGGTGGACAAAAGTGCTGGGTTGTTGGCGAGTTGGCTTTCGTTGAACACGCAAAAAAAGTATTTTGACAACCCTTTG WLSLNTQKKYFDNP VRLAINHAI
GTGCGTTTGGCTATCAATCATGCGATCAATGCAGATGATTACATCAAAGTGCTTTATGAAGGCTTTGCTCAAAAAATG DYIKVLYEGFAQK VNPFPPTIW
GTCAATCCTTTCCCGCCCACCATATGGGGTTATAACTACAATATCAAACCCTATGAATACGATTTGAAAAAGGCTAAG NIKPYEYDLKKAKELLKQAGYPNG
GAGTTGTTGAAACAAGCGGGCTATCCTAACGGCTTTA
HP0339 17 TACCCACACAAACACAAGCGCAATTACCCAAAAGCCGAGTGAGGCTAAACGAACGAGAGATTTACGATCTAGACTAT PTQTQAQLPKSRVRLNEREIYDL
GCGATCGTCAAAGCGAAAGATTTAAAACCAAGCTTTACCACAGGCGGGACGCAAAAGAGAACGGACATGAACGAAG KAKDLKPSFTTGGTQKRTD NEE
AGCAGATTAAAAGCATTGCTGAAAATTTTGATCCTAAAAAGATATTTGGTAGCGGAGGGTTTGAAGATTTACCGATCA IAENFDPKKIFGSGGFEDLPIILHD
TTCTACATGACGGGCAAGTGATCGCAGGAAACCACAGAATCCAAGGCATGCTAAACTTCACGCCTAAAAGCCGTTTT AGNHRIQGMLNFTPKSRFSYERAI
TCTTACGAGAGAGCGATCAAGGAATACTATCACATAGACTTAAAACCGGACGAGTTGTTAGTGAGAGTGCCACACAA YHIDLKPDELLVRVPHKRLNNTEI
GCGCCTAAACAACACCGAGATCAACAATTTAGCGGCTTCATCCAATCAAGGACGCTTCAATAGCGAAAGCGATCACG ASSNQGRFNSESDHAIAVLSHYE
CTATAGCGGTTTTAAGCCACTATGAAGCCAAGTTAAAAGAATTAGACCAAAAATTAGACGCTGATAGCATCTACTCAT ELDQKLDADSIYSLKNIVAKNLNF
TAAAAAACATTGTTGCTAAAAATTTGAATTTTGATAAGGCTACGCATCCTAATGTAACCGATAGTAATTTAGCGTTGCT HPNVTDSNLALL FN PRTKTQG
GATGTTTAACATGCCACGAACCAAAACGCAAGGGATAGAATTACTCAACCGCTGGAAAAAAGAATTTTCCAACGACA NRWKKEFSNDIKSYEKVKKMFVD
TTAAAAGCTATGAAAAAGTAAAAAAAATGTTTGTAGATAACGCGGGCAGTTTTCACAATCTCATCCACGATCTGAACTT SFHNLIHDLNFPKVSLNAYLSDI
CCCTAAGGTGAGTTTAAACGCTTATTTAAGCGATATTATGGATCGCAGTTTTGCGAATTTAAAGAATTACCAAAGCAC ANLKNYQSTSESLKDLSEKFYKTS
GAGCGAGAGCCTGAAAGATTTGAGCGAAAAATTCTATAAAACGAGTTCGTTAGAGATGTTTGAAAAGAGCGATCAAA MFEKSDQSTSDISEI GGAIARFA
GCACGAGCGATATTAGCGAGATTTTAGGAGGAGCGATCGCACGATTTGCACGATTTGATGATCCGAGCAAAGCGTT PSKALFEALRSDNIK
ATTTGAAGCCTTAAGAAG
HP0339 19 TATTTCTGTTGTCAAAGAGCTTGTGAAAAACTTAGCAACCAAAAGCGATGAAAAAGATGAGAATGGCAACAGCATCTC 20 IS VKELVKNLATKSDEKDENGNS
TTTTAGCCTAGCAGATTCTAATACGCTTGCAGCGGCAGTAACCAACCTTATCACAGGAGATATGAACCTAGATTATCC LADSNTLAAAVTNLITGD NLDYP
CATCACTCAACTTATTAATGCTTTCGGGAAAGACCACAATGATCCTAATGGGCTTGTCGCGCGATTAGCACCTTTTTG NAFGKDHNDPNGLVARLAPFCKS
CAAATCAACCAATGGTGAATTTCAATGGCTTTTTGATAATAAAGCAACGGATCGCTTAGATTTTTCAAAAACGATTATT EFQWUFDNKATDRLDFSKTIIGVD
GGCGTTGATGGGTCAAGTTTCTTAGACAATAATGATGTTTCGCCCTTTATTTGTTTTTACCTTTTCGCTCGTATCCAAG FLDNNDVSPFICFYLFARIQEAMD
AGGCAATGGATGGGCGTAGATTTGTCTTAGATATTGATGAAGCTTGGAAATATTTAGGCGATCCAAAGGTCGCTTATT FVLDIDEAWKYLGDPKVAYFVRD
TTGTAAGAGACATGCTAAAAACTGCAAGGAAAAGAAACGCTATTGTCAGACTTGCGACTCAAAGCATCACTGATCTTT ARKRNAIVRLATQSITDLLACPIAD
TGGCTTGCCCTATTGCTGATACGATTAGAGAACAATGCCCTACAAAGATTTTTTTGAGj^AACGATGGGGGCAATCTTT QCPTKIFLRNDGGNLSDYQRLAN
CTGATTACCAAAGATTGGCTAATGTTACAGAAAAAGAATTTGAAATCATCACTAAGGGACTAGATAGGA EFEIITKGLDR
Figure imgf000087_0001
Figure imgf000088_0001
HP0339 37 GCGTGGATAAAGACGGCTTTTTAGTTACTCTTAATGGCTTTAGGGTGCTTTCTCGTTCGGGCTTGAACGAAAAAGGA 38 VDKDGFLVTLNGFRVLSRSGL
GGGATCATGCTCATGCCTGACGCTGAAATTGAAGTTGATCAAAATGGCGGAATCACTTTTAGGGATAATGAAGCCCA ML PDAEIEVDQNGGITFRDN
AATTC GCAGGCGCGTTAGCTTTAGTGAGTTTTAGCGAACCTAAAAATCTTAAAAAAATAGGGCAAAACCTTTATAC GA ALVSFSEPKNLKKIGQNLY
CTATCAGGGCGAAGGCGTTCATCAAGTCTCTGACTCTGGTGCGTTAAGGCAATACATGCTAGAAAAAAGCAATGTCA GVHQVSDSGALRQYMLEKSN
ATGCGGTGCGCGAGATGAGCGCTTTGATTGAAATCAACCGCTTTTTGGACATGTATTCTAAAGTGTTAAAAACCCAC MSALIEINRFLDMYSKVLKTHQ
CAAGACGACATGAACGCTGAAGCGATCAACAAACTCGCTGCAAAAGCTTAAACrrTTGGCGATTTTAAAATCCATAG EAINKLAAKA
GAGGGTTTTTACGCAATACCTTAACTCCCTAATGACGCTTTTTAAAAGCCTTGTTAATCTTTTTTATATAGGATTAGTTT
TAAAAGAGTTTAAAGAATTAAATGCCTTTTCTTTTGTAATCGCCTGCAAAGTCGAAACAATCGCAAGCTTCTCTTTATT
TCGTAGTTTATAGCCGCTAACTACTCAACAATAATTCAATTTTTAATTATTTCGTTTGTTGTTTAAAAACGCTCTTTATT
GTGGTATTCTGTAAGCAAGTGTCCCAATTTTGATTGTTATGGTTCTTGTGTTTTAAGCTCGTTTGTGGTAGGATATTCC
CATTAAGGTTTGAGACGCTATAGTTGATGCCGTCTCTGTAATGATTAAAGAGTGTTCCATTGGTTTCTAATTCCCTTAA
TGTTTTTATTGCCCCAATCAAGTTTTTTGATTACCCACTTGATTATCAGCGTCCATTTTAAGATAGAGCCATTCCAAAA
GCTTGTAAAAATTTTCTTTCTGCTTCCACTAAACTAGGGTGTAAGAGCAAGTCATTCTTTAAAACGTTTGTATAGGCGG
TGTGTG
HP0339 39 CTTTAATGCAAGCAGGGATTTTAGGGGGCTTAGCCAATGAAAAGCAATTTGGCTTCACTTACAACAAAGCCCCTAAT 40 LMQAGILGGLANEKQFGFTYN
GGTAGCGATTCCCAACAAGGCTACCAAAGCTTTAGCGGCCCGGGTTATTACACTAAAAACGGCGCTAATGGCACTA SDSQQGYQSFSGPGYYTKNG
CCCAAGCGCCCTTGAAAGCATTACCCGCTGGAGCGACAATTGGATCAGGCAATGGCCAATACACCTACCACCCCAG QAPLKALPAGATIGSGNGQYT
CTCGGCAGTCTATTATTTAGCCGATAGCATCATTGCTAATGGCATCACCGCTTCTATGATTTTTTCAGGCATGCAAAA VYYLADSIIANGITASMIFSGMQ
TTTCGCCAATAAAGCCGCTAAACTGACAGGCACTTCAAGCTATAGCCAGATGCAAGATGCGATCAATTACGGGGAAA AAKLTGTSSYSQMQDAINYGE
GCTTGCTCAGTAACACCGTAGCGTATGGGGATTTCATCACCAATTGGGTCGCCCCCTATTTGGATTTAAACAACAAA VAYGDFITNWVAPYLDLNNKG
GGTTTGAATTTCTTGCCTAGCTATGGGGGGCAATTGAATGGTGCTAATCATCAAACCCCACAATTAACCCCGCAACA YGGQLNGANHQTPQLTPQQA
AGCCCAACAAGAGCAAAAAGTCATCATGAACCAACTAGAGCAAGCCACAAACGCCCCCACCCCCGCGCAAATAAAC VI NQLEQATNAPTPAQINRIL
AGGATTTTAGCCAACCCCTATTCCCCCACGGCAAAAACTTTAATGGCTT TAKTLMA
HP0339 41 GGCGTTGTTTCAAAAGCATGGCAACAAGAATCAAGAGAGTTTGAAGCAATTTGAAGTTTATCATTATCAAAGTGGGG 42 ALFQKHGNKNQESLKQFEVYH
GGATTTCAAGGAATGAAAAAATCCAATATTTTTATAACGAGATTTTAAAAACCCCTATCGCTCAAGAAGAAGTGGATG ISRNEKIQYFYNEILKTPIAQEE
CATTAGCCCTGGAGTTTGGCACTATTATAGAACAAAAGCTTTTTGATAGGGAGCATTTGAATAGCGAAGTGATGGCAT EFGTIIEQKLFDREHLNSEV A
TTATTGATAAGCATTATAAAAATCATGTTTTCCATATCGCTTCAGCGGCCTTGCATAGCGAATTGCAAGTGTTGTGCG KNHVFHIASAALHSELQVLCEF
AGTTTTTAGGGATCATTAAGTATTTTAAGAGCGTTGAAGGGAGTCC KSVEGS
HP0339 43 AAAACAACAAAAAGGAATACCCATGGATATTCGCAACGAATTTTTACAATTTTTTCAAAATAAAGGGCATGCCGTTTAT 44 KQQKGIPMDIRNEFLQFFQNK
CCTAGCATGCCTTTAGTGCCTAATGACGCTACCTTGCTTTTTACCAATGCCGGCATGGTGCAATTTAAAGATATTTTT SMPLVPNDATLLFTNAG VQF
ACCGGGATTGTGCCACGCCCTAGCATTCCTAGAGCGGCAAGCTCGCAATTGTGCATGCGCGCAGGCGGCAAGCAT VPRPSIPRAASSQLCMRAGGK
AACGATTTGGAAAATGTCGGTTATACCGCAAGGCACCACACGCTTTTTGAAATGCTAGGGAATTTCTCTTTTGGGGAT NVGYTARHHTLFE LGNFSFG
TATTTCAAAGAAGAAGCGATCTTGTTTGCGTGGGAATTTGTAACC EAILFAWEFVT
HP0339 45 TCAAAGAAATCAAAAAAGCTAACAGCACCCTAAATAGCCAAAGGCGTTTTTTTAACGCCAGCCAGATCCGCCTTATG 46 KEIKKANSTLNSQRRFFNASQI
GACACTGATGCACTATTGAAACAAAGCGCTTTGGAATTAGAAAAATTACAAGCTTTAGAAAAACACATAAAAAAGGGC DALLKQSA ELEKLQALEKHIK
ATGGAACAAGAACGCTTAATAGAAGAATCCCAAACGCTTTTTTTACAAGAGCATTGCCCTTATTTGAGCGGCGTTAAG ERLIEESQTLFLQEHCPYLSGV
AATTTAGAAGAGGCTTCAAACGCTTTAGAAGTCCAAGAGCAAAACAACGCCCTTTTCTTACTCAAAGAGCCTAAACTC ASNALEVQEQNNALFLLKEPKL
GCCCGTTTGCTCTCACGATTGGATTTGATGAGCGCTTTAAACGCCTTGTGCGATCAGGTTTTAGAAAACCAAGCCCA RLDLMSALNA CDQVLENQAH
TAACC
Figure imgf000090_0001
HP1427 53 GCMCAAACTCAAAGCCCAAGAACTCATTGATGAAAAAAAAGCTCAAGAGATTAAAAACGAATTAGAAAAAGAAAGCT 54 NKLKAQELIDEKKAQEIKNELE
ACGCTATCTCCAGTATCGTTAAAAAATCTAAAAAATCCCCCACGCCGCCCCCTTTCATGACTTCTACTTTACAGCAAA SSIVKKSKKSPTPPPFMTSTLQ
GCGCTTCTAGTCTTTTAGGCTTTTCGCCCACAAAAACCATGAGTATCGCTCAAAAATTGTATGAGGGCGTAGCCACC LLGFSPTKTMSIAQKLYEGVAT
CCACAAGGGGTTATGGGGGTGATCACTTACATGAGGACCGATAGCTTGAATATCGCTAAAGAGGCTTTAGAAGAAG GVITYMRTDSLNIAKEALEEAR
CGAGGAATAAGATTTTAAAAGACTATGGCAAAGACTATTTACCCCCTAAAGCCAAAGTCTATTCCAGCAAGAATAAAA YGKDYLPPKAKVYSSKNKNAQ
ACGCCCAAGAAGCCCATGAAGCGATCAGGCCCACTTCTATTATTTTAGAGCCAAACGCTTTAAAAGACTACCTTAAG RPTSIILEPNALKDYLKPEELRL
CCTGAAGAATTAAGGCTCTATACCTTAATTTACAAACGCTTTTTAGCTTCTCAAATGCAAGACGCTCTTTTTGAAAGCC RFLASQMQDALFESQSVVVA
AAAGCGTGGTTGTGGCTTGCGAAAAAGGCGAGTTTAAAGCGAGTGGGAGAAAGCTCCTTTTTGATGGCTATTATAAA KASGRKLLFDGYYKILGNDDK
ATTTTAGGCAATGACGATAAGGACAAATTGCTCCCCAATTTGAAAGAAAATGACCCCATTAAATTAGAAAAACTAGAG LKENDPIKLEKLESNAHVTEPP AGCAACGCCCATGTTACAGAACCTCCAGCACGCTATTCTGAAGCGAGCTTGATTAAAGTTTTAGAAAGTTTAGGCATA ASLIKVLESLGIGRPSTYAPTIS
GGCAGGCCCAGCACCTACGCCCCAACGATTTCCCTTTTACAAAACAGAGACTACATCAAGGTAGAAAAAAAGCAAAT YIKVEKKQISALESAFKVIEILE CAGTGCTTTAGAGAGCGCTTTTAAGGTGATAGAAATTTTAGAAAAGCATTTTGAAGAAATCGTGGATTCCAAATTCAG DSKFSASLEEELDNIAQNKAD CGCTTCTTTAGAAGAAGAATTGGACAATATCGCTCAAAATAAAGCCGACTACCAGCAAGTCTTAAAGGACTTTTACTA DFYYPFMDKIEAGKKNIISQKV CCCCTTTATGGATAA QSCPKCGGELVKKNSRYGEFI
PKCKYVKQTESANDEADQEL
EMVQKFSRNGAFLACNNYPE
LKNTPNAKETIEGVKCPECGG
SKKGSFYGCNNYPKCNFLSN
CEKCHYLMSERIYRKKKAHECI
VFLEEDNG
HP1427 55 TGAATACAACAAAATCAATGCGATTTCTCAATCGCTCCAAAACACCCTAGAAAATAAAAACAATGATCTTAAAATTGAA 56 EYNKINAISQSLQNTLENKNND
AATGACTACGACCATCTTTTAACTCAAGCTAGCACCATTATTAATACCCTTCAAAGCCAATGCCCAGGCATAGACGGA YDHLLTQASTIINTLQSQCPGI
GGCAATGGCAAACCATGGGGCATTAATGCAAGCGGGAACGCATGCAATATTTTTGGCAACACCTTTAACGCCATCAC PWGINASGNACNIFGNTFNAIT
TAGCATGATAGATAGCGCTAAAAAAGCCGCCGCAGATGCCCGAAGAACTGCCCCAGAAAGTCCAAACCAACCAAGT KKAAADARRTAPESPNQPSAF
GCGTTTAACAACGCTGATTTCAATAAAAACCTTAATCAAGTCTCAAGCGTTATTAATGACACGATCTCTTACCTCAAAG NKNLNQVSSVINDTISYLKGDN
GGGACAATTTAGCAACCATCTACAACACCCTTCAAAAAACGCCCGATTCTAAAGGGTTTCAAAGTTTGGTGAGCCGA TLQKTPDSKGFQSLVSRSSYS
TCTAGCTATAGTTATTCCCTCAACGAAACCCAATATTCTGAATTCCAAACTACCACCAAAGAGTTTGGCCATAACCCTT QYSEFQTTTKEFGHNPFRSVG
TTAGAAGCGTGGGTTTAATCAACTCTCAAAGCAATAACGGAGCGATGAATGGCGTGGGCGTGCAATTAGGCTATAAG NNGAMNGVGVQLGYK
HP1427 57 CTTTAGGGAGCGGGAACGCATGGGGGACTGGGGGGAGCGCGAGCGTAACTTTTAACAGCCAAACTTCGCTCATTCT 58 LGSGNAWGTGGSASVTFNSQ CAATCAGGCTAATATCGTAAGCTCGCAAACCGATGGGATCTTTAGCATGCTGGGTCAAGAGGGTATTAATAAGGTTT QANIVSSQTDGIFSMLGQEGIN TCAATCAAGCCGGGCTCGCTAATA I I I I GGGCGAAGTGGCGGTGCAATCCATCAACAAAGCCGGGGGATTAGGGAA AGLANILGEVAVQSINKAGGL TTTGATAGTAAATACGCTAGGGAGTAATAGCGTGATTGGGGGGTATTTAACGCCTGAACAAAAAAATCAAACCCTAA LGSNSVIGGYLTPEQKNQTLS GCCAGCTTTTAGGGCAGAATAACTTTGATAATCTCATGAACGATAGCGGTTTGAATACGG NFDNLMNDSGLNT
HP1427 59 ATTTGACATCACCAATAACGCTGAACAGCTGTTAAATCAAGCGGCAAACATCATGCAAGTCCTTAATACGCAATGCCC 60 FDITNNAEQLLNQAANIMQVLN TTTAGTGCGTTCCACGAATAACGAAAACACTCCAGGGGGTGGTCAACCATGGGGTTTAAGCACATCCGGGAATGCG VRSTNNENTPGGGQPWGLST TGCAGCATCTTCCAACAAGAATTTAGCCAGGTTACTAGCATGATCAAAAACGCCCAAGAAATAATCGCGCAAAGCAA SIFQQEFSQVTSMIKNAQEIIA AATCGTTAGTGAAAACGCGCAAAATCAAA NAQNQ
HP1427 61 AAGGAACGCAAACTAAAGAAGAAGTCATAACCACCCAAAAAATCTATGAAAACCCCCTAACCCACCCACAAACTAAA 62 GTQTKEEVITTQKIYENPLTHP
GAACAGCCTAAAGAACAAAATAAAAGCGATACGGCCACCCCACAAAGCGCTTACGGAAAATACTACATACCCCAAAG KEQNKSDTATPQSAYGKYYIP
CACCATTTTAAAAAATGCAACGGCTTTATTCACCACGGACAAGATAGAAAATGGCTTAACTTTTTATTCTCAAAACCCT NATALFTTDKIENGLTFYSQN
GTGTATGCGAATATGGTTAATGGGAGCGTAACCATACAAAACTTTCTGCCTTATAATTTAAACAATGTTGAACTGAGTT VNGSVTIQNFLPYNLNNVELS
TTAAAGACGCTCAAGGCAAGGTGGTCAATTTAGGCGTGATAGAGACCATCCCTAAACAATCTCAAATTACCTTGCCT KWNLGVIETIPKQSQITLPAS
GCAAGCTTGTTTAATGATTCAGAATTTGAACAAGCTGATAGCTTTAATTACCAACAACTTCAAGCCACTGCCACACAA EQADSFNYQQLQATATQFSD
TTTTCTGACGCTAACACGCAAAGTTTGTTTCAAAAGCTCAGCAAGATCACAACCAATGTAACAATGAGTTATGAAAAC FQKLSKITTNVTMSYENADTN
GCCGATACCAACAATTTTAAAGGTAATTGCCATGATTGTGTGTCAGATTTCACCCCACAAACCGCAGAAGAATTGACC HDCVSDFTPQTAEELTNLMLD
AATTTAATGCTAGATATGATTGCGGTGTTTGACTCTAAATCGTGGGAAGAAGCCGTTTTAAACGCTCCTTTCCAATTTT SKSWEEAVLNAPFQFSNSSS
CTAACAGCTCATCAGAGTGCGGCTCTGACTTTCCTAAGTGCGTGAATCCTTTCAATAACGGGCGTGTCGCTCCCATC PKCVNPFNNGRVAPIYEKYVL
TATGAAAAATACGTGCTAACCCCACAATCCGTTATAGATGCGTTTAGAAGAACGATCAATCTTGAAGTGAATATCCTA DAFRRTINLEVNILKSGFVGLG
AAATCAGGGTTTGTAGGGCTAGGGTATGAACTTGATGATAATGATGGTAATCTGGGGATAGAAGCTTCTGCCTTAAA DGNLGIEASALNPEKLFGKTL
TCCTGAAAAATTGTTTGGTAAAACTTTGAACAAAGTTGATATTGTGGAATTAAGAGACATTATCCATGAATTTAGCCAC LRDIIHEFSHTKGY
ACTAAAGGCTA
HP1427 63 GGGATTAGGCTTGAATTTGCCTAACATTTCTAGCACGCGCCCCACCAAAGCGATCGTAAGAGAGTCGTTTTTTAACA 64 GLGLNLPNISSTRPTKAIVRES
CCTTGCAAGCAGAAATTAATGGAGCGCATTTTATAGAAGTGTTTTCAGGCAGCGCTTCTATGGGTTTAGAGGCTTTGA AEINGAHFIEVFSGSASMGLE
GTAGGGGGGCTAAAAGTGCGGTGTTTTTTGAACAAAACAAAAGCGCTTATAAGACGCTTTTAGAAAATATTTCCCTTT KSAVFFEQNKSAYKTLLENISL
TTAAAAACCGCTTGAAAAAAGAAATGGAAATTCAAACCTTTTTAGATGACGCTTTCAAGCTTTTGCCCACGCTGTGTTT KEMEIQTFLDDAFKLLPTLCLK
AAAAAATGGCGTTTTGAATATTATTTATTTGGATCCTCCTTTTGAAACAAGTGGGTTTTTAGGGATTTATGAAAAGTGT YLDPPFETSGFLGIYEKCFQA
TTTCAAGCTTTAGAAAGGTTATTGAAACGCTTTAATCCAAAAAATCTTTTAGTGGTTTTTGAGCATGAAAGCATGCATG FNPKNLLWFEHESMHEMPK
AAATGCCTAAAAGTCTTGTAACTTTAGCTATAATCAAACAAAAAAAATTTGGAAAAACCACTTTAACTTATTTTCAATAG KQKKFGKTTLTYFQ
GAATAGGCATGGCAGAAGAACAAGAAAATACCGCGCAACAACCCCCCAAAAAAAGCAAAGCCCTTTTATTTGTCATT
ATTGGAAGCGTGTTAGTGATGCTTTTATTGGTGGGGGTGATTATCATGTTACTTATGGGGAATAAGGAAGAATCTAAA
GAAAACGCTTCTAAAAACACCCAAGAAGTTCAAGCTAATCCTATGGCGAACAAGAATCAAGAAGCCAAAGAAGGCTC
TAATATCCAGCAATATTTGGTGCTTGGGCCTTTGTATGCGATTGATGCGCCTTTTGCGGTGAATCTGGTCTCTCAAAA
TGGCAGACGCTACCTTAAGGCTTCTATTTCGTTAGAATTGAGTAATGAAAAGCTTTTGAATGAAGTCAAGGTTAAAGA
CACAGCGATTAAAGACACGATTATAGAGATTCTGTCGTCTAAAAGCGTGGAAGAAGTGGTTACTAACAAAGGCAAAA
ACAAGCTTA
HP1427 65 AGGACAAGATACGACGACCATCACTTGCAATTCGTATTATGAGCCAGGACATGGTGGGCCTATATCCACTGCAAATT 66 GQDTTTITCNSYYEPGHGGPI
ATGCGAAAATCAATCAAGCCTATCAAATCATCCAAAAGGCTTTGACAGCCAATGGAGCTAATGGAGATGGGGTCCCC KINQAYQIIQKALTANGANGD
GTTTTAAGCAACACCACTACAAAACTTGATTTCACTATCAATGGAGACAAAAGAACGGGGGGCAAACCAAATACACCT NTTTKLDFTINGDKRTGGKPN
GAAAAGTTCCCATGGAGTGATGGGAAATATATTCACACCCAATGGATTAACACAATAGTAACACCAACAGAAACAAAT WSDGKYIHTQWINTIVTPTET
ATCAACACAGAAAATAACGCTCAAGAGCTTTTAAAACAAGCGAGCATCATTATCACTACCCTAAATGAGGCATGCCCA AQELLKQASIIITTLNEACPNF
AACTTCCAAAATGGTGGTAGAAGTTATTGGCAAGGGATAAGCGGCAATGGGACAATGTGCGGGATGTTTAAGAATGA YWQGISGNGTMCGMFKNEIS
AATCAGCGCGATCCAAGGCATGATCGCTAACGCTCAAGAAGCTGTCGCGCAAAGCAAAATCGTTAGTGAAAACGCG NAQEAVAQSKIVSENAQNQN
CAAAATCAAAACAACTTGGATACTGGAAAACCATTCAACCCTTACACGGACGCCAGCTTTGCGCAAAGCATGCTCAA PFNPYTDASFAQSMLKNAQA
AAACGCTCAAGCGCAAGCAGAGATTTTAAACCAAGCCGAACAAGTAGTAAAAAACTTTGAAAAAATCCCTACAGCCTT AEQVVKNFEKIPTAF
TGT
HP1427 67 CTACTAACAATGGCCTTTGTTTCCAAGGTAACCTGGATCTTTATAACGAAATGGTTGGCTCTATCAAAACTTTGAGTC 68 TNNGLCFQGNLDLYNEMVGS
AAAACATCAGCAAGAACATCTTTCAAGGCAACAACAACACCACGAGCCAAAACCTCTCCAACCAGCTCAGTGAGCTT ISKNIFQGNNNTTSQNLSNQL
AACACCGCTAGCGTTTATTTGACTTACATGAACTCCTTCTTAAACGCTAACAACCAAGCGGGTGGGATTTTTCAAAAC SVYLTYMNSFLNANNQAGGIF
AACACTAATCAAGCTTATGGAAATGGTGTTACCGCTCAACAAATCGCTTATATCCTAAAGC V\GCTTCAATCACTATG QAYGNGVTAQQIAYILKQASI
GGGCCAAGCGGTGATAGCGGGGCTGCCGCAGCGTTTTTGGACGCCGCTTTAGCGCAACATGTTTTCAACTCCGCTA DSGAAAAFLDAALAQHVFNS
ACGCCGGGAACGATTTGAGCGCTAAGGAATTCACTAGCTTGGTGCAAAACATCGTCAATAATTCTCAAAACGCTTTA LSAKEFTSLVQNIVNNSQNAL
ACGCTAGCCAACAACGCTAACATCAGCAATTCAACCGGCTATCAAGTGAGCTATGGCGGGCATATTGATCAAGCGC NISNSTGYQVSYGGHIDQAR
GCTCTACCCAAC
HP1427 69 AGGCATTGATGTAAGCACCGGCGAGTTTAAAGGCGCAGACATTCATTCTCAAACCACGCAATCCATGGAAAATATCA 70 GIDVSTGEFKGADIHSQTTQS
AAGCGATTTTAAAAGAAGCAGGGTTAGGGATGGATAGCGTGGTTAAAACGACTATTTTATTGAAAAGTTTAGACGATT LKEAGLGMDSWKTTILLKSL
TTGCGGTGGTGAATGGAATCTATGGGAGTTATTTTACAGAGCCTTATCCGGCCAGAGCGACCTTTCAAGTGGCTAAA NGIYGSYFTEPYPARATFQVA
CTGCCTAAAGACGCTTTAGTAGAAATTGAAGCGATAGCCATTAAGTAATTTATTAAAGGGACTATCAGCATGAAAAAA LVEIEAIAIK
GAGGTCGTGGTCATAGGCGGTGGGATTGTAGGGCTTTCTTGTGCGTATTCTATGCACAAGTTAGGGCATAAGGTCT
GCGTGATAGAAAAAAACGATGGCGCAAACGGCACTTCTTTTGGGAATGCTGGGCTTATTTCTGCGTTTAAAAAAGCC
CCACTCTCATGCCCTGGTGTGGTGTTAGACACCCTGAAGCTCATGCTCAAAAACCAAGCCCCTTTAAAATTCCATTTC
GGGCTTAATTTAAAGCTCTATCAATGGATTTTAAAATTTGTAAAAAGCGCGAACGCCAAATCCACGCACCGCACCATG
GCGTTGTTTGAACGCTACGGGTGGCTGAGTATTGATATGTATCATCAAATGCTAAAAGACGGCATGGACTTTT
HP1427 71 GGCTGAAACAAGAGCTAAAGGATTTGTTATGGTCGTTTTACATTCTCATTTAGAAAACGCGCTAAAACAATTGAAAGA 72 AETRAKGFVMWLHSHLENA
ATTGATTGATTTAACCGAGCGCGATATAAGAGACATCAAGCTCGCTAAACACACCGAAATTTTTGAAAGAAACCATCA IDLTERDIRDIKLAKHTEIFERN
AAAACAGCTAGCGATTCAAGCTTTTGAAAAAGAAAAAGCGAATATAGATGTGCAAATGTTGTCTTTAAAAAACCAATTC IQAFEKEKANIDVQMLSLKNQ
CCTGATAAAGAAATGAGTGAATTATTAGACGAAAAAACGAGCGATTTTTTAAACCAAATGCGAGAGTCCTTGTTTGTT SELLDEKTSDFLNQMRESLFV
TTGAAAGAAAAAAACTTGATTTATTCGCGCATGGCGTTTGCGGTTTCTGAATTTTATTCTTCGCTCATCCAACAAATCA YSRMAFAVSEFYSSLIQQIIPH
TTCCCCATGACACTTGCGATTATAAAGGCTCTAGGCATGTGGGGAGTCATTTTTTAAGAGTGCAGGCGTAAAATGGG GSRHVGSHFLRVQA
CGGAATCTTATCTTCACTCAACACTTCTTACACCGGCCTTCAAGCCCATCAGAGCATGGTGGATGTTACCGGGAATA
ATATTTCTAACGCTAGCGATGAATTTTATAGCCGCCAGCGCGTGATTGCAAAGCCCCAAGCGGCCTATATGTATGGC
ACTAAAAACGTGAATATGGGCGTGGATGTGGAAGCCATTGAAAGGGTGCATGATGAGTTTGTTTTTGCTCGT
HP1427 73 CAACCACACAATCTCCTAACAGCACGGTGATGGGAGCTTTAAACACCGTGTTGCAAAATGTCAGCAATTTCCAACAA 74 TTQSPNSTVMGALNTVLQNV
AGCATTCAAAACGCTTTTCAAAACCAAGAAAGTAATATCCAAGCTTGGGCGAATGCGATTTATAACACTAATGGGAGT QNAFQNQESNIQAWANAIYN
CAGTCGCAAGAGATGACACCTAACAATAACCAAGATTTACGCATCCAATTGAGGGCGAATTTTTACCAGCTCATCAAT QEMTPNNNQDLRIQLRANFY
ACCATTAACCAGCAAGTGCCTACAGACATGAATGCTTTAATTAATCAAAGCCAACAAACCCAACAAACAAGCGGATCA QQVPTDMNALINQSQQTQQT
GCAAGCA
HP1427 75 GCAATAGCTCTTCAGGGGGTTTGAGCATCAGCGGGAACGCCCAATTGCAAAATATTTTAA 76 NSSSGGLSISGNAQLQNIL
HP1427 77 CGATCAACATTGCAGGGCCTACTACCGGCCTTATCACTITAAGCTCTCAAACCGTCATTGACGCTTTAGGCTATGGC 78 INIAGPTTGLITLSSQTVIDALG
GTGAGTAACACTGTTGGCAACCAATTAGAGGGCATTTCTAATATCTTGAATCAAATTGGCAAAAGAAAAGACTTTTATT VGNQLEGISNILNQIGKRKDF
CTAGCCGTCAAATCTCTAGCATTTCCCAACAAATCATAGGGCTTAAAGGAAGCTCTGATCCCTTAAAAGCCCATTCTT SISQQIIGLKGSSDPLKAHSS
CACAGATCACAGCCAAACTCCTTTCCAACACCCAAAGCGCGTTTGATCAGGGCATCGCGCTAAGCACTAACATCATT NTQSAFDQGIALSTNIISSINS
AGCTCTATCAATAGCCTAAACCCTAGCAACAACACCCAAGAGGTTAAAAAACAGCTCCAAAACACCGCGCAATCCAT QEVKKQLQNTAQSMTELLQQ
GACAGAATTGTTGCAACAAATTGAACACAG
HP1427 79 AACCACAACCCAAACCATAGACGGCAAAAGCGTAACCACCACGATCAGTTCAAAAGTGGTTGGTAGCATCGCTAGTG 80 TTTQTIDGKSVTTTISSKVVGSI
GCAACACATCACATGTCATCACCAACAAATTAGACGGTGTGCCTGATAGCGCTCAAGCGCTCTTAGCGCAAGCGAG S HVITNKLDG VPDSAQALLAQ
CACGCTCATCAACACCATCAACGAAGCATGCCCGTATTTCCATGCTACTAATAGTAGTGAGGCTAACGCCCCAAAAT INEACPYFHATNSSEANAPKFS
TCTCTACTACTACTGGGAAAATATGCGGCGCTTTTTCAGAAGAAATCAGCGCGATCCAAAAGATGATCACGGACGCG CGAFSEEISAIQKMITDAQELV
CAAGAGCTAGTTAATCAAACGAGCGTCATTAACAGCAACGAACAATCAACTCCGGTAGGCAATAATAATGGCAAGCC NSNEQSTPVGNNNGKPFNPF
TTTCAACCCTTTCACGGACGCAAGTTTTGCGCAAGGCATGCTCGCTAACGCTAGCGCGCAAGCTAAAATGCTCAATT QGMLANASAQAKMLNLAHQV
TAGCCCATCAGGTGGGGCAAGC
HP1427 81 CAGAGAATGGGAAAGAGATCCCAGTCTCTTATTCAGGCGGATCATCATTCTCGCCTACAATACAATTGACATACCATA 82 ENGKEIPVSYSGGSSFSPTIQL
ATAACGCTGAAAACCTTTTGCAACAAGCCGCCACTATCATGCAAGTCCTTATTACTCAAAAGCCGCATGTGCAAACGA AENLLQQAATIMQVLITQKPHV
GCAATGGCGGTAAAGCGTGGGGGTTGAGTTCTACGCCTGGGAATGTGATGGATATTTTTGGTCCTTCTTTTAACGCT GKAWGLSSTPGNVMDIFGPSF
ATTAATGAGATGATTAAAAACGCTCAAACAGCCCTAGCAAAAACCCAACAGCTTAACGCTAATGAAAACGCCCAAATC MIKNAQTALAKTQQLNANENA
ACGCAACCCAACAATTTCAACCCCTACACCTCTAAAGACAAAGGGTTCGCTCAAGAAATGCTCAATAGAGCTGAAGC NFNPYTSKDKGFAQEMLNRAE
TCAAGCAGAGATTTTAAATTTAGCTAAGCAAGTAGCGAACAATTTCCACAGCATTCAAGGGCCTATTCAAGGGGATTT NLAKQVANNFHSIQGPIQGDL
AGAAGAATGTAAAGCAGGATCGGCTGGCGTGA SAGV
HP0218 83 CATCAAGCAAGCGAGTGCTTGGATAACTTAGATGACCCTACTGATCAAGAGGCCATAGAGCAATGTTTAGAGGGCTT 84 HQASECLDNLDDPTDQEAIEQ
GAGCGATAGTGAAAGGGCGCTAATTCTAGGAATTAAACGACAAGCTGATGAAGTGGATCTGATTTATAGCGATCTAA DSERALILGIKRQADEVDLIYSD
GAAACCGTAAAACCTTTGATAACATGGCGGCTAAAGGTTATCCATTGTTACCAATGGATTTCAAAAATGGCGGCGATA TFDNMAAKGYPLLPMDFKNGG
TTGCCACTATTAACGCCACTAATGTTGATGCGGACAAAATAGCTAGCGATAATCCTATTTATGCTTCCATAGAGCCTG ATNVDADKIASDNPIYASIEPDI
ATATTGCCAAGCAATACGAAACAGAAAAAACCATTAAGGATAAGAATTTAGAAGCTAAATTAGCTAAGGCTTTAGGTG EKTIKDKNLEAKLAKALGGNKK
GCAATAAAAAAGATGACGATAAAGAAAAAAGTAAAAAATCCACAGCAGAAGCTAAAGCAGAAAACAATAAGATAGACA KSKKSTAEAKAENNKIDKDVA
AAGATGTCGCAGAAACTGCCAAGAATATCAGTGAAATCGCTCTTAAGAACAAAJAAAGAAAAGAGTGGGGAATTTGTA EIALKNKKEKSGEFVDENGNPI
GATGAAAATGGTAATCCCATTGATGACAAAAAGAAAGCAGAAAAACAAGATGAAACAAGCCCTGTCAAACAGGCCTT AEKQDETSPVKQAFIGKSDPT TATAGGCAAGAGTGATCCCACATTTGTTTTAGCGCAATACACCCCCATTGAAATCACTCTGACTTCTAAAGTAGATGC TPIEITLTSKVDATLTGIVSGVV CACTCTCACAGGTATAGTGAGTGGGGTTGTAGCCAAAGATGTATGGAACATGAACGGCACTATGATCTTATTAGACA NMNGTMILLDKGTKVYGNYQS AAGGCACTAAGGTGTATGGGAATTATCAAAGCGTGAAAGGTGGCACACCCATTATGACACGCTTAATGATAGTCTTT PIMTRLMIVFTKAITPDGVIIPLA ACTAAAGCCATTACGCCTGATGGTGTGATAATACCTCTAGCAAACGCTCAAGCAGCAGGCATGTTGGGTGAAGCAG GMLGEAGVDGYVNNHFMKRI GGGTAGATGGCTATGTGAATAATCACTTTATGAAGCGCATAGGCTTTGCTGTGATAGCAAGCGTGGTTAATAGCTTC SWNSFLQTAPIIALDKLIGLGK TTGCAAACTGCGCCTATCA TPEFNYALGQAINGSMQSSAQ
GQLMNIPPSFYKNEGDSIKILT
SGVYDVKITNKS VDEIIKQSTK
HEEITTSPKGGN
HP1301 85 TGTTGGCGATGAAATCTCACGACTTAAATATGATATGAGCCACAAGACTATTAAAGGCTCTACAATTGAGAGTTCTAA 86 VGDEISRLKYDMSHKTIKGSTI
TCTTATCAGCATTTATAAAAAGATTGCGAGCGGACTACCTTTTGGGACTATCTCGGCGTTTAGACCTTTTAAAGACGC IYKKIASGLPFGTISAFRPFKDA
TTTTTATAAAGACTTTACCGAAAAAGAACAAAACGCTCTAATCTATGCTTATAAGAGCGGAGCAGACCCTAAAAATGC EKEQNALIYAYKSGADPKNADII
GGACATAATAGCCAAATATTGGTTAAGTCAATCTGTGGATTTAGACCCATACGACCCTATTAAAGTTGTAGATTTCTTT SQSVDLDPYDPIKWDFFHPQ
CACCCACAACCTGAAAATGGTAAAGAGACTACAAAATTTAAGAACTACAAAGATAGGATTGAGAACATTTATGCGACA TTKFKNYKDRIENIYATLYNTLG
CTCTATAACACATTGGGTAGGGGTTATGTGGATAAATTTTTTAAAAAAGAAGCCACAATGAGGGACTTTATGTCTAGC KFFKKEATMRDFMSSDKFVER
GATAAATTTGTTGAGAGATACCGCTACACTAGAAAAGAJAAATATGGCAAGGACACAAGCATTAAAAGACATAATGAAT KENMARTQALKDIMNIDRDFIG
ATTGACAGAGATTTCATTGGTTATATTGAAGTGTTAGGGTATTGGAAAGACAACCCTAAAGACAATATCTTACCAGAC YWKDNPKDNILPDKEVSFFVF
AAAGAGGTTAGCTTTTTTGTATTCCAAAACGAACCTAGTAGCACATTTGATTTGAAAAACCACTTATTGATATGGGGTA TFDLKNHLLIWGKQFRQVAICY
AACAATTCAGACAAGTAGCGATTTGCTATGGCGGACAATTGATTGCTAATAAGAATAAGACTTATAGGATAGATTTGA NKNKTYRIDLISCRPDNFGEV
1 AA(JTTGCAGACCTGATAATTTTGGTGAGGTTTGGGCTAAATTCACAGGGATTAAATTTTCAGTTCCTAGCGACTTAC KFSVPSDLPQALTRINDSVYTF
CACAAGCTCTCACACGCATAAATGACAGCGTTTATACTTTTCTC
HP1301 87 ACAAACCCTTTTACCCACCGCTCAAACCCTTTTAAACCATGCTAAAAAAACTCAAAGCTTGAATGGGGTGGAGATTGT 88 QTLLPTAQTLLNHAKKTQSLN
AGGGTTGGAGCATTTGGATAAAGTGATTTATTTAGATCAAGCCCCCATAGGCAAAACCCCACGAAGCAACCCTGCCA EHLDKVIYLDQAPIGKTPRSNP
CTTACACGGGAGTGATGGATGAAATCAGGATTTTATTTGCCGAGCAAAAAGAAGCTAAAATTTTAGGCTATAGTGCGA MDEIRILFAEQKEAKILGYSAS
GCCGTTTTAGCTTTAATGTTAAAGGAGGGCGGTGCGAGAAATGCCAAGGCGATGGGGACATTAAAATAGAAATGCA KGGRCEKCQGDGDIKIEMHFL
CTTTTTGCCTGATGTGTTAGTCCAATGCGATAGCTGTAAGGGCGCTAAATACAACCCCCAAACTTTAGAAATCAAGGT QCDSCKGAKYNPQTLEIKVKG
GAAAGGCAAATCCATTGCCGATGTGTTGAACATGAGCGTGGAAGAGGCTTATGAATTTTTTGCTAAATTCCCTAAAAT LNMSVEEAYEFFAKFPKIAVKL
CGCCGTGAAGTTAAAAACGCTTATGGATGTGGGCTTAGGCTATATCACTTTAGGGCAAAACGCTACGACTTTAAGTG VGLGYITLGQNATTLSGGEAQ
GGGGGGAGGCTCAAAGGATCAAATTAGCTAAAGAATTGAGTAAAAAAGACACAGGCAAAACCCTTTATATTTTAGAT ELSKKDTGKTLYILDEPTTGLH
GAGCCTACTACCGGTTTGCATTTTGAAGACGTGAATCATCTTTTACAAGTCTTGCATT LLQVLH
HP1301 89 GGTATAAAGCTTCTCTTACCACCAATGCGGCTCATTTGCATATCGGCAAAGGCGGTATCAATCTGTCCAATCAAGCG 90 YKASLTTNAAHLHIGKGGINLS
AGCGGGCGCACCCTTTTAGTGGAAAATCTAACCGGGAATATCACCGTTGATGGGCCTTTAAGAGTGAATAATCAAGT RTLLVENLTGNITVDGPLRVNN
GGGTGGTTATGCTTTGGCAGGATCAAGCGCGAATTTTGAGTTTAAGGCTGGTACGGATACCAAAAACGGCACAGCC ALAGSSANFEFKAGTDTKNGT
ACTTTTAATAACGATATTAGTTTGGGAAGATTTGTGAATTTAAAAGTGGATGCTCATACAGCTAATTTTAAAGGTATTG DISLGRFVNLKVDAHTANFKGI
ATACTGGTAATGGTGGTTTCAACACCTTAGATTTTAGTGGCGTTACAGGTAAGGTCAATATCAACAAGCTCATTACGG GFNTLDFSGVTGKVNINKLITA
CTTCCACTAATGTGGCCGTTAAAAACTTCAACATTAATGAATTGGTTGTTAAGACCAATGGGGTGAGTGTGGGGGAA KNFNINELVVKTNGVSVGEYT
TACACTCATTTTAGCGAAGATATAGGCAGTCAATCGCGCATCAATACCGTGCGTTTGGAAACTGGCACTAGGTCAAT GSQSRINTVRLETGTRSIFSG
CTTTTCTGGGGGTGTCAAATTTAAAAGCGGTGAAAAACTGGTTATAGATGAGTTTTACTATAGCCCTTGGAATTATTTT GEKLVIDEFYYSPWNYFDARNI
GACGCTAGGAATATTAAAAATGTTGAAATCACCAGAAAATTCGCTTCTTCAACCCCAGAAAACCCTTGGGGCACATCA RKFASSTPENPWGTSKLMFN
AAGCTTATGTTTAATAATCTAACCCTGGGTCAAAATGCGGTCATGGACTATAGTCAATTTTCAAATTTAACCATTCAGG NAVMDYSQFSNLTIQGDFINN
GGGATTTCATCAACAATCAAGGCACTATCAATTATTTGGTCCGAGGCGGGCAAGTAGCCACCTTGAATGTAGGCAAT LVRGGQVATLNVGNAAAMFF
GCGGCAGCTATGTTCTTTAGTAATAATGTGGATAGCGCGACTGGGTTTTACCAACCGCTCATGAAGATTAACAGCGC ATGFYQPLMKINSAQDLIKNKE
TCAAGATCTCATTAAAAATAAAGAACATGTCTTATTGAAAGCGAAAATCATCGGTTATGGCAATGTTTCTTTAGGCACT KIIGYGNVSLGTNSISNVNLIEQ
AACAGCATTAGTAA LYNNNNRMDICVVRNTDDIKA
NQSMVNNPDNYKYLIGKA K
ANGSKISVYYLGNSTPTEKGG
TNTTSNVRSANNALAQNAPFA
PNLVAINQHDFGTIESVF
HP1301 91 CCGATAAGGGATTAAAAAAGGTGTTCAAAGACAGCAAAAAAGACGCTTGCGGGTTCATCTATGAGATCAGCGAGTTC 92 DKGLKKVFKDSKKDACGFIYEI
ATGAAAGCCTATACCGCATTGCTAAAAAAACAAGACCGATACGTCTATTTATTGAGGTATCTCCCCTCTAGGTATTGG AYTALLKKQDRYVYLLRYLPS
GCCAGCATTTTAACGACTGCCCTTTATGTCAAATACCCTGATTTTGACGCTTTGAAAAAGCTTTTGGTGTCTTATTATT LTTALYVKYPDFDALKKLLVSY
ACCAAACTTGGATTGCAGGAGGCACGATCACGCGCATCAAGCAAACCAGTATCAACATTATCAAAAACGTTAAAAGC AGGTITRIKQTSINIIKNVKSNK
AATAAGAGCGTTGAAACCATCAAAGAGCTTATATTGAATAGCATCGACTCTTATAACACCTTTGATCAATACCTCTATA LILNSIDSYNTFDQYLYNLWDS
ACTTATGGGATAGCTCTTCTGTTTATCATAGCAAATGGGTGCGTCCTGTCTTAGCCCTAGCTAATTATTTCATGGCAG KWVRPVLALANYFMADEEKP
ATGAAGAGAAACCCCATTTTATCGCTATGGATGCCGAAACCCAAGTGGAGCATATTTTGCCACAAACGCCCAAAAGA AETQVEHILPQTPKRGSQWN
GGCAGTCAATGGAACGCGGATTTTGACAAAGAAAAAAGAGAAGAATGGGTAAATAATATCGCGAATTTAACCCTTTTA KREEWVNNIANLTLLKRKKNA
AAGCGTAAAAAGAACGCGCATGCTTTAAACGGGGATTTTGATGAAAAAAGAAAAATTTATGGAGGCAAAGACACGAG FDEKRKIYGGKDTSKVISCYDI
CAAAGTGATTAGCTGTTATGACATCACTAAAGAATTGTATAGCAATTATAGGAAGTGGAATGAGAAGTCCCTCCAAGA NYRKWNEKSLQERYKSLYNTI
GCGATACAAATCTTTGTATAACACTATCACGCCTGTTTTACACATAGAGGGGCAAGAAGATGATTTTGAAGATGATTT EGQEDDFEDDFDLE
TGATCTAGAATGATTAAAGATTGCCAAGCATCAAAACAA
HP1542 97 TTTAGAAGTGGGTAATCGTGTGGGATCGGGAGCTGGCACGCACACCGGCACAGCCACTTTAAACTTGAACGCTAAT 98 LEVGNRVGSGAGTHTGTATL
AAGGTCAATATCAATTCCAATATCAACGCGTATAAAACTTCGCAAGTGAATATAGGCAACGCTAACAGCGTTATTACC VNINSNINAYKTSQVNIGNANS
ATTGGTTCGGTTTCTTTGAGTGGGGATGTTTGCAGTTCTTTAGCTAGCGTTGGGATAGGGGCTAATTGCTCCACTTCT SLSGDVCSSLASVGIGANCST
GGGCCTAGCTATTCTTTTAAAGGGACGACTAACGCTACTAACACGGCGTTTAGTAATGCAAGCGGCAGTTTCACTTT FKGTTNATNTAFSNASGSFTF
TGAAGAGAACGCCACTTTTAGCGGGGCGAAATGGAATGGGGGGACTTATACCTTTAATAAAGAGTTTAGCGCTACCA SGAKWNGGTYTFNKEFSATN
ATAACACCGCCTTTAGTAGCGGTAGTTTTAATTTTAAAGGTGTAAGCTCTTTTAATGGTACTTCGTTTAGTAACGCTTC GSFNFKGVSSFNGTSFSNAS
TTATACTTTTGACAATCAAGCCACTTTCCAAAACAGCTCCTTTAATGGGGGGACTTTTACTTTTAATAACCAAACTAAT ATFQNSSFNGGTFTFNNQTN
CCAACTAACAACGCTCAGCACCCCCAAATTCAAAACAGCTCTTTTAGTGGTAACGCTACCACTCTTAAGGGCTTTGTG HPQIQNSSFSGNATTLKGFVN
AATTTCCAGCAAGCCTTTAACAATTCAAACCACCAACTAACGATCCAAAACGCTTCCTTTAATAACGCCACTTTTAACA NSNHQLTIQNASFNNATFNNT
ATACCGGTAAAATCACTATAGAAAAAGATGCGAGTTTTAATAACACGACATTCAACACTTCTGTTGATACAAACAACAT DASFNNTTFNTSVDTNNMSV
GAGTGTTACCGGTGGCGTTACTTTAAGCGGTAAAAATGACTTGAAAAATGGCTCAACCCTTGATTTTGGGAGTTCTAA SGKNDLKNGSTLDFGSSKITL
AATCACTCTCGCTCAAGGGACGACTTTCAACCTCACAAGTTTAGGCAGTGAGAAGAGCGTAACGATTTTAAATTCTAG NLTSLGSEKSVTILNSSGGITY
CGGTGGGATCACTTATAGTAACCTTTTAAACCATGCAATCAACGGCTTGACAAGTGCCTTAAAAACGAACGAAAGCC AINGLTSALKTNESLSNPQSF
TTTCAAATCCG IITYNGVTGQLLNENAATSKPT
SSTNSTQVYQVGYKIGDTIYK
HNSIIIQALESGTYTPPPVINGS
SNYINADMPWYDHKYYIPKSQ
TYYLPSVQIWGSYTNSFKQTF
LVIGYNSTWTDHNVSSSGTV
GSALNGHCGP PYYQCTGT
AYHVYITANLRSGNRIGTGGA
VDSINIANATITQHNAGIYSSS
SMDNSQNLNGLNSNGKLSVY
EAKDGKFIFNAGQAVFENTNF
QFSGDSLN
HP1542 99 GGCTATGGATGAGCAGTTAAAAATCTTAGACACCGTTAAGGTTAAAGCGACTCAAGCGGCTCAAGATGGGCAAACTA 100 AMDEQLKILDTVKVKATQAAQ
CAGAATCTCGTAAAGCGATTCAATCTGACATCGTTCGTTTGATTCAAGGTTTAGACAATATCGGTAATACGACTACTT SRKAIQSDIVRLIQGLDNIGNT
ATAACGGGCAAGCGTTATTGTCTGGTCAATTCACCAACAAAGAATTCCAAGTAGGGGCTTATTCTAACCAAAGCATTA ALLSGQFTNKEFQVGAYSNQ
AAGCTTCTATCGGCTCTACCACTTCCGATAAAATCGGTCAGGTTCGTATCGCTACAGGTGCGTTAATCACCGCTTCT STTSDKIGQVRIATGALITASG
GGAGATATTAGCTTGACTTTTAAACAAGTGGATGGCGTGAATGATGTAACTTTAGAGAGCGTAAAAGTTTCTAGTTCA QVDGVNDVTLESVKVSSSAG
GCAGGCACAGGGATTGGCGTGTTAGCAGAAGTGATTAACAAGAACTCTAACCGAACAGGCGTTAAAGCCTATGCGA EVINKNSNRTGVKAYASVITTS
GCGTTATCACCACGAGCGATGTGGCGGTCCAGTCAGGAAGTTTGAGTAATTTAACCTTAAATGGGATTCATTTGGGG SGSLSNLTLNGIHLGNIADIKK
AATATCGCAGATATTAAGAAAAACGACTCAGACGGACGATTAGTCGCAGCGATCAATGCGGTTACTTCAGAAACCGG LVAAINAVTSETGVEAYTDQK
CGTGGAAGCTTATACGGATCAAAAAGGGCGCTTGAATTTGCGCAGTATAGATGGTCGTGGGATTGAAATCAAAACCG SIDGRGIEIKTDSVSNGPSALT
ATAGTGTCAGTAATGGGCCTAGTGCTTTAACGATGGTTAATGGCGGTCAGGATTTAACAAAAGGCTCTACTAACTAC QDLTKGSTNYGRLSLTRLDA
GGAAGGCTTTCTCTCACACGCTTAGACGCTAAGAGCATCAATGTC
HP1542 107 TTTGGATGGGATTTCGCTCGCGCTAGGCTATTTGTGTTTGTTTATATTCGTTTTAAGCGCTTCTTTAATCTCTGAAAAA 108 LDGISLALGYLCLFIFVLSASLI
GCCTTATCCAAGCAGTATTTGCAAACCGCTAAAGATAAAATCACCTCTTTAAAGAATTTAAAAGTCATCGCCATTACC KQYLQTAKDKITSLKNLKVIAIT
GGAAGCTTTGGGAAAACCAGCACCAAAAATTTCTTGCTTCAAATCTTACAAACCACATTCAACGCGCATGCAAGCCC STKNFLLQILQTTFNAHASPKS
CAAAAGCGTCAATACCCTTTTAGGGCTTGCGAATGATATTAATCAGAATTTAGACGATAGGAGTGAAATCTATATCGC LANDINQNLDDRSEIYIAEAGA
TGAAGCCGGGGCAAGGAATAAGGGCGATATTAAAGAAATCACCTGTCTCATTGAACCGCACCTTGTTGTGGTTGCAG KEITCLIEPHLWVAEVGEQHL
AAGTGGGCGAACAGCATTTAGAATACTTTAAAACTTTAGAAAATATTTGCGAGACTAAAGCGGAATTATTGGATTCCA ENICETKAELLDSKRLEKAFCY
AACGCTTAGAAAAAGCCTTTTGTTACTCGGTGGAAAAGATCAAGCCCTATGCCCCTAAAGATAGCCCTTTAATAGACT PYAPKDSPLIDYSSLVKNIQST
ATTCTAGCCTGGTTAAAAACATCCAATCCACTTTAAAAGGCACTTCTTTTGAAATGCTTATAGGTAGCGTTTGGGAAA EMLIGSVWERFETKVLGEFSA
GATTTGAAACAAAGGTTCTAGGGGAGTTTAGCGCTTATAATATCGCTTCAGCCATTTTAATCGCTAAGCATTTAGGCT LIAKHLGLETERIKRLVLELNPI
TAGAGACCGAAAGGATCAAACGGCTTGTTTTAGAACTCAACCCTATTGCTCATCGTTTGCAACTTTTGGAAGTGAATC LEVNQKIIIDDSFNGNLKGMLE
AAAAAATCATCATAGACGATAGCTTTAATGGGAATTTAAAGGGCATGTTAGAGGGCATTCGTTTAGCGAGTTTGCACA HKGRKVIVTPGLVESNTESNE
AAGGGCGTAAAGTCATTGTAACACCGGGCTTAGTGGAAAGCAATACAGAAAGTAATGAGGCTTTAGCGCAAAAAATA DGVFDVAIITGELNSKTIASQL
GACGGGGTTTTTGATGTCGCTATCATCACAGGGGAGTTGAATTCCAAAACGATTGCTTCACAATTGAAAACCCCCCA LKDKAQLENILQATTIQGDLILF
AAAAATCTTACTC NYI
HP1542 109 AAATCTTGAAGTCATTACCCTATACTTACACACTGAAGGCATCACAAACCCCATGCTCTTTGCGATGGGCGCTATCTT 110 NLEVITLYLHTEGITNPMLFAM TTTGATTGGAGCGATTGGCTTTAAGGTTTCTTTGGTGCCTTTTCATACCTGGATGCCTGATGTGTATGAGGGCAATAA AIGFKVSLVPFHTWMPDVYEG CCCGGTCTTTGCGAGCTATATTTCCATTGTGCCTAAAATCGCCGGCTTTGTGGTAGCGACTCGCCTTTTTG ASYISIVPKIAGFVVATRLF
HP1542 111 CCACAAACAAAAACATGCCAAAACACGAAATTACGCCCAAGAAGAATTGGATAGCAACAAAGTAGAGGGCGTTACGG 112 HKQKHAKTRNYAQEELDSNK
AAATTTTGCATGTGAATGAGAGAGGGACTTTAGGCTTTCATAAGGAGTTAAAAAAGGGCGTTGAAGCGAATAACAAG LHVNERGTLGFHKELKKGVEA
ATCCAAGTGGAGCATTTAAACCCGCATTATAAGATGAACTTAAACTCTAAAGCGAGCGTTAAAATCACGCCTTTAGGG EHLNPHYKMNLNSKASVKITP
GGCTTGGGTGAGATTGGGGGGAACATGATGGTCATTGAAACCCCAAAAAGCGCGATCGTGATTGATGCGGGCATGA IGGNMMVIETPKSAIVIDAGMS
GCTTCCCTAAAGAGGGGCTCTTTGGCGTGGATATTTTAATCCCGGATTTTTCCTACTTGCACCAAATCAAGGACAAAA FGVDILIPDFSYLHQIKDKIAGIII
TCGCTGGCATTATCATCACCCATGCCCATGAAGATCACATAGGGGCCACGCCTTATTTGTTTAAAGAGCTGCAATTC DHIGATPYLFKELQFPLYGTPL
CCCCTTTATGGCACGCCCTTGAGTTTGGGGCTGATTGGGAGCAAGTTTGATGAACATGGTTTGAAAAAATACCGCTC SKFDEHGLKKYRSYFKIVEKR
GTATTTTAAAATCGTAGAAAAGCGCTGTCCCATTAGCGTGGGCGAATTTATCATTGAATG EFIIE
HP1542 113 AATCTTTGATAACAATAATAAATCGGCTAATGCAAAAACAGGACCAGCGACTATCATCGCTCAAGGCACAAAAATAAA 114 FDNNNKSANAKTGPATIIAQG
GGGGGAGCTTCATTTAGATTACCATTTGCACGTAGATGGCGAATTAGAAGGGGTGGTGCATTCTAAAAGCACGGTG LHLDYHLHVDGELEGWHSKS
GTGATCGGGCAAACCGGCTCGGTAGTGGGTGAGATTTTTACTAATAAATTAGTGGTCAGTGGCAAGTTCACTGGCAC TGSWGEIFTNKLVVSGKFTG
GGTGGAGGCGGAAGTGGTAGAAATCATGCCTTTAGGGCACCTTGATGGCAAAATCTCTAGCCAAGAGCTTGTGGTG VEIMPLGHLDGKISSQELVVER
GAAAGAAAGGGGATTTTGATTGGGGAAACTCGCCCTAAGAATATTCAAGGGGGGGCGTTGTTAATCAATGAGCAAG ETRPKNIQGGALLINEQEKKIE
AAAAGAAAATTGAAAATAAATAGGGAATGATCCAATCCAGCCTTTATAGAGCCTTAAACAAAGGCTTTGATTACCAAAT
ACTCGCTTGTAAGGATTTTAAAGAATCCGAGCTCGCTAAAGAAGTCATAAGCTATTTTAAGCCAAATACCAAAGCCAT
TCTTTTCCCGGAGTTTAGGGCTAAAAAAAACGACGATTTGCGTTCGTTTTTTGAAGAATTTTTACAGCTTTTAGGGGG
TTTAAGGGAGTTTTATCAAGCCTTAGAAAACAAGCAAGAAACTATCATCATTGCCCCGATTAGCGCGTTATTGCACCC
TTTACCTAAAAAAGAACTTTTAGAAAGCTTTAAAATCACTCTTTTAGAAAAATATAACCTTAAGGATTTGAAAGACAAGC
TCTTTTATTATGGCTATGAAATTTTAGACTTAGTGGAAGTGGAAGGCGAAGCGAGCTTTAGGGGGGATATTGTGGAT
ATTTATGCGCCAAATTCTAAAGCGTATCGCTTGAGTTTTTTTGACACCGAGTGTGAGAGCATTAAGGAATTTGATCCC
ATTACTCAAATGAGCCTTAAAGAAGATTTGTTAGAAATTGAAATCCCCCCCACGCTTTTTAGTTTGGACGAATCATCTT
ATAAGGATCTAAAA
HP0452 115 TAGAAGTTTATTTGGATTTAAGAGACAAGCATGAACGATTGCAGCAAGAAATCACCGAATTGCAAAGCAAGAATGTGC 116 EVYLDLRDKHERLQQEITELQS
GCTTGCAAAAGCGTTTGTTTGAGTTGAAGGAATTACGGCCTAGAGATTAGATTTAAGGAAAATGGTAGTGTTAAAAAA QKRLFELKELRPRD
GATGATAGGTTTGGTGGTGGTTTTAAGCGTTTTATTGGCTAGAGACAACCCTTTTGAGCCTGAAATCAATTCCAAGAA
TTTGCAAGGGGGCTTTAATGGGATCTATGATAGTTATTTTAAAGAAATCCATGTGGATTTGCCCACGAGCGCTAGGAT
CTTAAAACAAATCACGCTCACTTACCAAGATATTGATGGCTCTATCCATTCTAAAGTCGTGGGCATTGATAAAGGCAT
TGATTGGCATTACCCTTTAAAACTCTCCCAACACACCCTTGATCCAGCCGCCTTTGAAAAACGCTACCAGATCCAAGA
CTTTGATTTTTTAATGGCAAGCAACACGATGATTTTGCGTTCCCCTTATAAAATTTTACGCTCCTTTGTGCTAGTCAAT
CCTTATAGAATCGTGTTAGACACGCAAAAAGGCCCTTTGGATATTTATCAAAACATGGATTTAAACCAGAAGTTTTTTT
CTCACATTAAAGTCGGCACGCACAAAGATTATTACCGCAT
HP0452 117 AGGTGTTCAAAGACAGCAAAAAAGACGCTTGCGGGTTCATCTATGAGATCAGCGAGTTCATGAAAGCCTATACCGCA 118 VFKDSKKDACGFIYEISEFMKA
TTGCTAAAAAAACAAGACCGATACGTCTATTTATTGAGGTATCTCCCCTCTAGGTATTGGGCCAGCATTTTAACGACT KQDRYVYLLRYLPSRYWASILT
GCCCTTTATGTCAAATACCCTGATTTTGACGCTTTGAAAAAGCTTTTGGTGTCTTATTATTACCAAACTTGGATTGCAG KYPDFDALKKLLVSYYYQTWIA
GAGGCACGATCACGCGCATCAAGCAAACCAGTATCAACATTATCAAAAACGTTAAAAGCAATAAGAGCGTTGAAACC IKQTSINIIKNVKSNKSVETIKELI
ATCAAAGAGCTTATATTGAATAGCATCGACTCTTATAACACCTTTGATCAATACCTCTATAACTTATGGGATAGCTCTT YNTFDQYLYNLWDSSSVYHSK
CTGTTTATCATAGCAAATGGGTGCGTCCTGTCTTAGCCCTAGCTAATTATTTCATGGCAGATGAAGAGAAACCCCATT LALANYFMADEEKPHFIAMDAE
TTATCGCTATGGATGCCGAAACCCAAGTGGAGCATATTTTGCCACAAACGCCCAAAAGAGGCAGTCAATGGAACGC ILPQTPKRGSQWNADFDKEKR
GGATTTTGACAAAGAAAAAAGAGAAGAATGGGTAAATAATATCGCGAATTTAACCCTTTTAAAGCGTAAAA NIANLTLLKRK
HP0452 119 GAACAAATTGCAAGAAACAGATAATTTATTAAAAACATTGAATGTGAAATCGCTTTTAGAAGCCTTGCTTGTTTATACG 120 NKLQETDNLLKTLNVKSLLEAL CCCAAAGGCTATAAAGATTTAAATTTATTAGAGCGTTTTGAAACGGGCTTGAGCGGCGTTTTAGAAGTGGGTATTTTA GYKDLNLLERFETGLSGVLEV GAGAAAAGAAACTACGCCAAAGTTTTAAAGATTTTCGCCTATTCCAAACGATTTTACAAAAATTTAGAGCTTGTTT TT NYAKVLKIFAYSKRFYKNLELV
TCAATTACAGCGCGTTCCATCACAGCCAGTTTAAAACCGGCGAGAGTTTGTTTATTTATGGTAAATTAGAGCAAAGCT FHHSQFKTGESLFIYGKLEQSS
CTTTTAATCAAGCTTATATCATTAACACGCCTAAAATCATTACCAAATTTGGTAAAATTTCTTTAATTTTTAAAAAAGTTA INTPKIITKFGKISLIFKKVKNHK
AAAATCATAAAAAAATACAAGAAAATTTACAAAAACTCATTTCTTTAGAAAATTTAAAAAAGGAAGGCGTTAAAGAGAA QKLISLENLKKEGVKENIAHLLL
TATCGCGCATTTATTGTTAGAAATCTTTTTCCCCACGCCGCATTTTGTCAAGGATTTTGAAACGAATAAAAATTTTCCT PHFVKDFETNKNFPSQHLNAL
TCACAACATTTAAATGCATTAAAATACATTGAAATGCTTTTTTATATGAAAAATTTAGAGCGCAAAAAATTGCAATTTGG FYMKNLERKKLQFGAKIACPN
CGCTAAAATCGCATGCCCCAATAATAACGAGCGCTTGAAAGCGTTTATCGCTTCTTTACCCTTTAAACTCACACGCGA AFIASLPFKLTRDQQNAI
TCAACAAAACGCCATTA
HP0336 121 ATGGATGGGAGGTTTGAAGATCCTAATCTAACCCCTTTAGAAGTCTTTGATAGAATCCATCATAAAAAAATCGCTAGC 122 MDGRFEDPNLTPLEVFDRIHH
GTGCATTTAGCGGATAAGGAAGCGATTTTAAAAGCCCTAGAAGTGGCTAAAAGCGATAAGAGCCGTTTCAGTCAAAA HLADKEAILKALEVAKSDKSRF
AAGCTTTACAGAAATCCATGCCTTAATGAGTCAAACCGCCCAGCTTTTTAGAGAAAGAAGAGGCGATTTGATAGGGA EIHALMSQTAQLFRERRGDLIG
TTTCCGCTTTAGAAGTGGGTAAGACTTTCGCTGAAACGGACGCTGAAGTGAGCGAAGCCATTGACTTTTTAGAGTTT GKTFAETDAEVSEAIDFLEFYP
TACCCTTACAGCTTAAGGGTGTTGCAAGAGCAAAACACAAAAACGCAATTCACCCCTAAAGGCGTGGGCGTGGTCAT QEQNTKTQFTPKGVGVVIAPW
TGCCCCATGGAATTTCCCTGTGGGCATTTCTGTAGGCACTATCGCTGCCCCCCTAGCTACGGGCAATCGGGTGATTT SVGTIAAPLATGNRVIYKPSSL
ACAAGCCCTCAAGTTTGTCTAGCGTAACGGGCTATAAGCTTTGTGAGTGCTTTTGGGATGCGGGCGTGCCTA KLCECFWDAGVP
HP0336 123 CAACAATGGGGGGCATCAAAAACACCACACCAAGATGAGTATCCCTAAAGCCCCTGATGATTTCAAACCTTTTATCAA 124 NNGGHQKHHTKMSIPKAPDDF AAAAATCCATAGAGATTTTAACCAAAGAAATCTCGTGCCCGTAGAGCATAAAATCTATAATGGCGAGAAGCCTTTAGA IHRDFNQRNLVPVEHKIYNGEK AATGCCTAACACCCTTAAGGCCAATGAAATGCGTTTGCATCTGGGCAAAGAAGCGGATTATGAGCAAAAGGACTTGA NTLKANEMRLHLGKEADYEQK TGGTTGGGTTTGAAAATAGCGAATCGCATTTGTTGGTGGTGAGTCAAGATTTAAGCGCTCGCATCGCCCTGATGAAG FENSESHLLWSQDLSARIALM CTTTTCGCTCAAAATTTCAAGACTGCCAACAAAGAGTTGCTCTTCTACAACGCCGAAAAACGCCTTGCAAGAGAACTT NFKTANKELLFYNAEKRLAREL GATGAGTTGAAAAAACACCACATCACGCCCATGCAAGGCCCTCTAGGGAGCGTTTTGGACACCGCTATGAATCCTAA HHITPMQGPLGSVLDTAMNPN TAGCGTGCTTGTGATAGACAATCTCAACGAAGCCAAAGAGTTGCACGACAAAATAGGGGTGGAAAAATTAAGATCGT NLNEAKELHDKIGVEKLRSFLE TTTTAGAAAAAGCCACAGACAACGAGCA
HP0336 209 ATCCAACCCTGATTATATTTCTACGCATAGCGAATCAGCCCTAGACTTGCTCAAGTTATTGAAAAAAAACCAGATGAA 210 SNPDYISTHSESALDLLKLLKK TGCAAGCGCGATTGAGATCGCTCACTTGCTCCTCAATCAAGATGATGATCTGAAAGCTAAAGAGCAAGCGCTTTATG SAIEIAHLLLNQDDDLKAKEQAL ATTTAGGAGCGTTGTATGCAAGGATCAAGGACTTTAAAAACGCCCACCTTTACAATCTGCAATATTTGCAGGACCATG ARIKDFKNAHLYNLQYLQDH
CGGAACTGGATAAAGCTTCTGTCGTTAGGGCGCGCGATGAAAAAGCCCTTTTTTCCATGGAGGGGAACACGCAAGA ASWRARDEKALFSMEGNTQE
AAAAATCGCCCACTATGACAAAATCATTCAAAATTTCCCTAATTCTAATGAAGCCCTAAAGGCTTTAGAATTGAAAGCC KIIQNFPNSNEALKALELKAQLL
CAACTATTGTTTGAAAACAAGCGTTATGCTGAAGTGTTAAGCATGCAAAAAAATTTGCCTAAAGATTCCCCTTTGATCC YAEVLSMQKNLPKDSPLIQKTL
AAAAAACGCTCAATGTCCTTGCTAAAACCCCATTAGAGAACCATCGTTGTGAAGAAGCCTTAAAATATTTATCCCAAA TPLENHRCEEALKYLSQITTFE
TCACAACCTTTGAATTCAGCCCCAAAGAAGAAATCCAAGCCTTTGATTGCTTGTATTTCGCATCGCTCAAAGAAAAAG IQAFDCLYFASLKEKAQIIALNA
CGCAAATCATTGCCCTAAACGCTTTTAAAACGGCTAAAGCCCCTAGCGAGAAATTAATATGGCTTTATCGTTTGGGGC PSEKLIWLYRLGRNYYRLGDF
GCAATTACTACCGCTTAGGGGATTTTAAAAATTCCACTCTGGCCTCTAAAGACGCTTTA SKDAL
HP0336 211 AGAGAGAAAAAACTCATCCACCCTAACGATGACGTGAACATGTCTCAAAGCTCCAACGACACTTTCCCTACCGCAAT 212 REKKLIHPNDDVNMSQSSNDT
GCACATTGTGAGCGTGCTAGAAATCACGCACAGACTGCTCCCTAGTTTGGAGAATCTGTTAAAAACCTTTAAAGAAAA HIVSVLEITHRLLPSLENLLKTF
AAGCCAACAATTTAAAGAGATTGTCAAGATCGGACGCACGCATTTACAAGACGCTACGCCTTTAACTTTGGGGCAAG QFKEIVKIGRTHLQDATPLTLG
AATTTAGCGGGTATGCGAGCATGCTAGAGCATTCTAAACAACAAATTTTAGAGAGTTTGGAGCATTTAAGAGAATTAG ASMLEHSKQQILESLEHLRELA
CCATAGGCGGGACGGCCGTAGGCACAGGGCTAAACGCTCATAAAGAATTGAGCGAAAAAGTGGCTGAAGAATTGAG GTGLNAHKELSEKVAEELSQF
CCAGTTTAGCGGCGTGAAATTCGTCTCTGCGCCCAATAAGTTCCATGCGCTCACTAGCCATGACGCTATCGCTTATG SAPNKFHALTSHDAIAYAHGAF
CGCATGGGGCTTTTAAAGCTTTAGCGGCGAATTTAATGAAAATCGCTAACGATATTAGATGGCTTGCGAGCGGGCCG NLMKIANDIRWLASGPRCGLG
CGCTGTGGTTTGGGCGAGCTTAATATCCCTGAAAACGAGCCGGGCAGTTCTATTATGCCCGGTAAAGTCAATCCCAC NEPGSSIMPGKVNPTQCEAMT
GCAATGCGAAGCGATGACAATGGTGGCCGTGCAAGTGATGGGGAATGATACCGCTATTGGCATTGCGGCCAGTCA VMGNDTAIGIAASQGNFELNVF
GGGTAATTTTGAATTGAACGTGTTCAAGCCGGTGATCATTTATAATTTCTTGCAAAGTTTAAGGCTATTGAGCGATAG NFLQSLRLLSDSMESFNIHCAS
CATGGAAAGTTTTAATATCCATTGCGCGAGCGGCATTGAGCCTAATAGAGAAAAGATTGATTATTACTTGCACCATTC REKIDYYLHHSLMLVTALNPHV
TTTGATGTTAGTAACCGCCCTAAACCCGCATGTGGGCTATGAAAACGCCGCTAAAATCGCTAAAAACGCCCACAAAA AKIAKNAHKKGISLKES
AAGGCATTTCTTTAAAAGAAAGC
HP0336 213 GAAGATTTAGGCTCGTTTTTTGAAGACGCTTTTGGGTTTGGCGCTAGGGGGAGTAAAAGGCAAAAAAGCTCTATCGC 214 EDLGSFFEDAFGFGARGSKRQ
ACCGGATTATTTGCAAACCCTTGAATTGAGTTTCAAAGAAGCGGTTTTTGGCTGTAAAAAAACCATTAAAGTCCAATA DYLQTLELSFKEAVFGCKKTIK
CCAGAGCGTTTGTGAAAGTTGCGATGGCACGGGCGCTAAAGACAAAGCCCTAGAGACTTGCAAGCAATGCAATGGG VCESCDGTGAKDKALETCKQC
CAGGGGCAGGTGTTTATGCGTCAAGGTTTTATGAGTTTTGCGCAAACTTGTGGGGCGTGTCAAGGCAAGGGCAAGA QVFMRQGFMSFAQTCGACQG
TCGTTAAAACCCCATGCCAAGCGTGCAAGGGTAAAACCTATATCCTTAAAGATGAAGAAATTGATGCGATAATCCCTG TPCQACKGKTYILKDEEIDAIIP
AGGGCATTGATGATCAAAACCGCATGGTGCTTAAAAATAAAGGCAATGAATACGAGAAGGGAAAAAGAGGGGATTTG NRMVLKNKGNEYEKGKRGDL
TATTTAGAAGCGCAAGTCAAAGAAGATGAGCATTTCAAGCGCGAAGGCTGCGATTTATTCATTAAAGCGCCGGTGTT KEDEHFKREGCDLFIKAPVFFT
TTTCACCACTATCGCTTTAGGGCATACGATTAAAGTGCCGTCTTTAAAAGGGGACGAACTGGAATTAAAAATCCCTAG TIKVPSLKGDELELKIPRNARD
AAACGCCAGAGACAAGCAGACTTTTGCGTTTAGAAACG RN
HP0336 215 TCCAACCACGAGATTTTACGCCCTTTAGTGGAAAAATTTGACATCCCTTATTTTTATGCGCCTTGCGACAATCAAGTTT 216 SNHEILRPLVEKFDIPYFYAPC
TGCATGAAAAAGAAGTTTTAGAAATCATTAAAAACCTGGAATTAAAGCACAAAGTGAGTGCAGACTTGCTCGTTTTAG EKEVLEIIKNLELKHKVSADLLV
CCAAATACATGCGCATTTTAAGCCATGATTTTACGAAGCGCTATGAAAACCAGATCTTAAATATCCATCATAGTTTCTT RILSHDFTKRYENQILNIHHSFL
GCCCGCATTCATTGGGGCTAATCCTTACCAGCAAGCGTTTGAAAGGGGCGTGAAAGTCATCGGGGCCACGGCGCAT NPYQQAFERGVKVIGATAHFV
TTTGTGAATGAAAGCCTTGATGCTGGGCCGATTATCATACAAGACACTCTGCCCATTAACCACAATTACAGCGTGGA GPIIIQDTLPINHNYSVEKMRLA
AAAAATGCGCCTAGCGGGTAAGGATATAGAAAAACTGGTTTTAGCTAGGGCTTTAAAACTCGTTTT LVLARALKLV
HP0336 217 GGAGCTTGAAATTTTAAAATCTTATCTTAAAATCCCTTATACTTTACTAGAGACCAACACCCTAAATTCCAAGGCTTGT 218 ELEILKSYLKIPYTLLETNTLNS
TTGAAAGATTTGAGCCAAAAAATCAGCGCCTTTTTCCCCAAACTAGACACTCAAAACAAGCTCTTACTCACTTCCCTA LSQKISAFFPKLDTQNKLLLTS
GCCCAAAAAATTGCCCTAGAAAACGCCATTACTGAATTGCAAAACGCTAAAAACCATTTAGAAACTTTAGAGCTTTTTT ENAITELQNAKNHLETLELFSY
CTTATCACATTTTAAGCGCGATAGAAAACTTGAACTTGCTCACCCGCCCTTATGAAACCAGCCAAATGCTAGACAGCA NLNLLTRPYETSQMLDSMFSE
TGTTCAGCGAATTTTGCTTAGGCAAATGAAACCGCTTAAAGAATGGATT
HP0697 227 CACTCAAATGGCTTTTGATTTTAATACGCCCTTGTTGCAAGTCCAGCACCACCATGCCCACTTTTTAGCGAGCGTCTT 228 TQMAFDFNTPLLQVQHHHAHF
AGACGCATTGTTACAAGATCCGCATTTAAATCACCCCTTTATAGGCATTGTCTGGGATGGGAGTGGGGCTTATGAAA DALLQDPHLNHPFIGIVWDGS
ATAAGATTTATGGGGCGGAGTGTTTTGTGGGGGATTTGGAACGCATTGAAGAAACCGCCAGGTTTGAAGAATTTTGG IYGAECFVGDLERIEETARFEE
CTTTTAGGGGGGCAAAAAGCGATCAAAGAGCCTAGACGCCTGGTTTTAGAAATCGCTTTAAAACACCAACTCAACAA GQKAIKEPRRLVLEIALKHQLN
GCTTTTAAAGCGCGTTCAAAAGCATTTTAAAGAAGACGAATTAGAAATTTTCCAACAAATGCATGACAAAAAAATTCAA QKHFKEDELEIFQQMHDKKIQS
AGCATAGCCACCAATTCCATAGGGCGTTTGTTTGATATAGTAGCGTTTAGTTTGGATTTAACAGGAACGATTAGCTTT GRLFDIVAFSLDLTGTISFEAES
GAAGCAGAGAGCGGGCAGGTTTTAGAAAATCTAGCCTTACAAAGCGATGAGATCGCTTTTTACCCTTTTGAAATCAA NLALQSDEIAFYPFEIKNSWC
AAACAGCGTGGTGTGTTTGAAAGAATTTTATCAAGCGTTTGAAAAGGATTTGGGCGTTTTAGAGCCTGAACGCATCG AFEKDLGVLEPERIAKKFFNSL
CTAAGAAATTTTTTAACAGCCTAGTAGAAATCATTACCGCTTTAATCGTGCCTTTTAAAGAGCATGTGGTGGTGTGCA VPFKEHV CSGGVFCNQLLC
GTGGGGGCGTGTTTTGCAACCAATTATTATGCGAACAATTAGCCAAACGATTGAGAGGGCTAAAGAGGCAGTATTTT RLRGLKRQYFFHKHFPPNDSS
TTCCACAAGCATTTCCCCCCTAATGACAGCAGTATCCCTATCGGTCAAGCCTTAATGGCGTATTTCAACCCTACAATC LMAYFNPTIIKKG
ATCAAAAAAGGATAAAAATGGATAGCGTAACTCTAGCATGCGGGAACGGAGGGAAAGAAACAAACGCTTTGATTGAG
CGAGTCTTTATGCCCTATTTAAAAGAATGGATTGTTGCATTTGATGAAGACGCCCCTAAATTTGAAGCTAGTGGGGAA
TATTGCGTGAGCACGG
HP0697 229 TCAAGGAATCTTACCTCCTACCATCAATCAAGAAATGCCTGACCCAGAATGCGATTTGGATTATATCCCTAATGCGGC 230 QGILPPTINQEMPDPECDLDYI
CAGAGAAAAGCGAGTGGATGCAGTGATGAGTAACTCATTTGGTTTTGGTGGCACTAATGGTGTTGTGATTTTCAAAA KRVDAVMSNSFGFGGTNG VI
AAGCCTAGTTTTACAAAGTTAGGATTTTGAATGGCCGTTTATTTAGATTTTGAAAATCATATTAAAGAGATTCAAAATG
AAATTGAATTAGCCCTTATTAGAGGCGATGAGGACGCTAAAGAAATCTTAGAAAAAAGATTGGATAAGGAGGTTAAAA
GCATTTATTCCAATCTCACTGATTTTCAAAAACTCCAATTAGCAAGACACCCTGACAGACCCTACGCTATGGATTACA
TTGATCTCATCTTAAAAGATAAATATGAAGTCTTTGGGGATAGGCATTATAACGATGATAAAGCGATCGTGTGCTTTG
TAGGGAAAATTGATAATGTCCCAGTTGTGGTGATCGGAGAAGAAAAGGGCAGAGGGACTAAAAACAAACTCTTAAGA
AATTTTGGCATGCCTAACCCTTGTGGCTATCGTAAGGCTTTGAAAATGGCAAAGTTTGCTGAAAAGTTTAATTTGCCT
ATTTTAATGCTTGTGGATACAGCCGGGGCGTATCCGGGGATTGGTGCAGAAGAAAGGGGGCAAAGTGAAGCGATCG
CTAAAAATCTCCAAGAGTTCGCCTCTTTAAAAGTCCCTACTATTTCTGTAATTATCGGTGAGGGGGGCAGTGGTGGT
GCGCTAGCGATTGCAGTGGCTGACAAATTGGCTATGATGGAATATTCCATTTTTAGCGTTATATCCCCAGAAGGTTGT
GCGGCGATTCTTTGGGATGACCCTAGCAAGACTGAAGTGGC
HP0697 231 CATAATGTTTGGGAATAAGCAGTTGCAACTTCAAATCAGTCAGAAAGATTCTGAGATTGCGGAGTTAAAAAAAGAAGT 232 IMFGNKQLQLQISQKDSEIAEL
CAATCTTTATCAAAGCCTTTTAAATTTGTGCTTGCATGAGGGTTTTGTAGGTATTAAAAACAATAAAGTCGTTTTTAAAA YQSLLNLCLHEGFVGIKNNK
GTGGGAATCTTGCAAGCTTGAACAATTTAGAAGAACAAAGCGTTCATTTTAAAGAAAATGCAGAGAGCGTTAATTTAC LASLNNLEEQSVHFKENAESV
AAGGGGTTTCTTATTCTTTAAAAAGCCAAAATATTGATGGCGTGCAGTATTTTTCATTGGCTAAAAACACAAGTTGTGT YSLKSQNIDGVQYFSLAKNTS
GGGGGAATACCATAAAAATGATTTGTTTAAGACTTTTTGCGCGAGCTTAAAAGAAGGCTTAGAGAACGCGCAAGAAA KNDLFKTFCASLKEGLENAQE
GCATGCAGTATTTCCATCAAGAAACCGGTGCTCTTTTAAATGCAGCTAAAAATGGCGAAGCGCATTCTACTGAAGGA HQETGALLNAAKNGEAHSTE iTTGGGGACGGTTAATAAAACGGGGCAAGACATTGAATCGCTTTATGAAAAGATGCAAAACGCCACTTCGCTAGCGGA KTGQDIESLYEKMQNATSLAD
CTCTCTCAACCAACGGAGCAATGAAATCACTCAAGTCATTTCTTTGATTGATGATATTGCAGAACAAACCAATCTATTA NEITQVISLIDDIAEQTNLLALN
GCCCTAAATGCCGCTATTGAGGCCGCGCGAGCGGGCGAGCATGGGAGAGGGTTTGCGGTGGTGGCTGATGAGGT AGEHGRGFAWADEVRKLAE
GAGAAAACTCGCTGAAAAAACCCAAAAAGCCACTAAAGAAATCGCTGTTGTCGTTAAAAGCATGCAACAAGAAGCGA KEIAVWKSMQQEANDIQTNT
ACGATATTCAAACCAATACCCACGATATTAATTCTATTGTAAGCTCTATTAAGGGCGATGTGGAAGAGCTTAAATCCA SSIKGDVEELKSTVKNNMIVAQ
CCGTAAAAAATAACATGATTGTTGCGCAAGCGGCAAAATACACCATCTACAATATCAATAACCGGGTGTTTTGCGGTT YNINNRVFCGLAKLDHWFKN
TGGCCAAACTTGATCATGTGGTCTTTAAAAACAATCTTTATGGCATGGTTTTTGGTCTCAATTCCTTTGATATTACCAG FGLNSFDITSHKNCRLGKWYY
CCATAAGAATTGC ENFSNTSGYRALESHHASVH
VKAVQEDHITDSKYLEHKVHL
HVRENIDKMFYEKQDELNKIIE
HP1423 255 ACGCTCAATATCCGTGTGCCTAGGCTTAAGCCCTTAAGTTTAGAGGATTTCACTTTCAAAGCAAGCGATCCAAAAAGT 256 TLNIRVPRLKPLSLEDFTFKASD TTGAAAGATTTAGCGCTTAAAGGGCATAACATTCTCATTAGCGGGGAGACTTCAAGCGGTAAAACAAGCCTATTAAAC DLALKGHNILISGETSSGKTSLL GCTCTTTTAGATTGCGTCAATAAAGACGAAAGGGTGGTGAGCGTTGAAGACAGCCAAGAATTGGATTTAAAAGCGTT CVNKDERWSVEDSQELDLKAF TAGTAATTGCGTGGGGCTTTTAGTGGGCAAGCAAGAAAACACGCGCTTTAATTATGAAGACGCTCTCAATATGGCCA GLLVGKQENTRFNYEDALNMA
TGCGCTTAAACCCAGACAGGCTCATTGTGGGCGAGATTGATACTAGGAATGCAGCGCTCTTTTTGCGTTTAGGAAAC DRLIVGEIDTRNAALFLRLGNTG ACCGGGCATAAGGGCATGCTCTCAACTATTCACGCTAATAGCGCTCAAAACACTTTAGAAGCCCTTTCACTGAATTTA LSTIHANSAQNTLEALSLNLSMR AGCATGCGTTACATGTATTCTTTGGATAAGGATTTGATGCGAGCGTATTTTAAGAGCGCGATTGATGTGATCGTGCAT LDKDLMRAYFKSAIDVIVHVNRI
GTGAATAGAATCAACAATGAGCGCCAAATCGCTGAAGTCTTATGGACTAAAGAGCTTTAAATGCCCCTAAAATCCTTA IAEVLWTKEL AAAAACCGCTTGAATCAGCATTTTGATCTATCGCCTCGCTATGGGAGCGTGAAAAA
HP1423 257 CTTAAAGAAATTTTAAGCCAGAATAAAGTCGGCATGCATTTAGCCCACAGCGTGGATGTGCGTATTGAAGTAGCGCC 258 LKEILSQNKVGMHLAHSVDVRIE
TAAAATCCAAATTAACGCCCAATCTAATATCAATTACAAAGCCATAAAAACGAGCGTCAAAGACTCTTACACTTTTGAA QINAQSNINYKAIKTSVKDSYTF
AATTTTGTCGTAGGCTCATGCAATAACACCGTTTATGAAATCGCTAAAAAAGTCGCCCAAAGCGATACCCCCCCTTAT GSCNNTVYEIAKKVAQSDTPPY
AACCCGGTGCTTTTTTATGGCGGCACAGGGTTAGGCAAAACGCACATTTTAAACGCTATCGGCAACCATGCCCTAGA YGGTGLGKTHILNAIGNHALEK
AAAGCATAAAAAAGTCGTGTTAGTCACTTCAGAAGACTTTTTGACAGACTTTTTAAAGCATTTAGACAACAAAACCATG LVTSEDFLTDFLKHLDNKTMDS
GATTCTTTTAAAGCAAAATACCGCCATTGCGACTTTTTCTTGTTAGATGACGCTCAATTTTTGCAAGGAAAACCCAAG RHCDFFLLDDAQFLQGKPKLEE
CTAGAAGAAGAATTTTTCCACACCTTTAACGAATTGCACGCCAACAGCAAACAAATCGTATTGATTTCAGACCGATCG FNELHANSKQIVLISDRSPKNIA
CCTAAAAACATCGCCGGCTTAGAAGATCGCTTAAAATCGCGCTTTGAATGGGGGATAACCGCTAAAGTCATGCCCCC LKSRFEWGITAKVMPPDLETKL
TGATTTAGAAACCAAACTTTCCATTGTCAAACAAAAATGCCAGCTCAATCAAATCACTTTGCCTGAAGAGGTGATGGA KCQLNQITLPEEVMEYIAQHISD
ATACATCGCCCAACACATCAGCGACAATATCCGCCAAATGGAAGGCGCGATCATTAAAATCAGCGTGAACGCGAACT EGAIIKISVNANLMNASIDLNLAK
TGATGAACGCTTCCATTGATTTGAACCTCGCTAAAACCGTTTTAGAAGATTTGCAAAAAGATCATGCTGAAGGTTCAA LQKDHAEGSSLENILLAVAQSLN
GCTTGGAAAATATCCTACTCGCTGTCGCGCAAAGCCTGAATCTCAAATCCAGCGAAATCAAAGTCTCTTCGCGCCAA EIKVSSRQKNVALARKLWYFAR
AAAAATGTCGCTTTGGCGAGGAAATTAGTCGTGTATTTCGCCAGGCTTTATACCCCTAACCCCACGCTCTCGCTCGC NPTLSLAQFLDLKDHSSISKMYS
TCAATTTTTGGATT MLEEEKSPFVLSLREEIKNRLNE
KTAFNSSE
HP0691 259 TCATTAAAAGAGCGGCAAAGGAATTGAAAGAGGGCATGTATGTGAATTTAGGGATAGGCTTGCCCACGCTTGTGGCT 260 IKRAAKELKEGMYVNLGIGLPTL
AATGAAGTGAGCGGGATGAATATCGTTTTCCAAAGCGAGAACGGGCTGTTAGGGATTGGCGCTTACCCTTTAGAGG SGMNIVFQSENGLLGIGAYPLE
GGAGCGTTGATGCGGATCTTATCAACGCAGGAAAGGAAACCATAACCGTGGTGCCGGGCGCTTCGTTTTTCAATAG DLINAGKETITWPGASFFNSAD
CGCGGATTCGTTTGCGATGATTCGTGGGGGGCATATTGATTTAGCGATTTTAGGGGGGATGGAAGTCTCACAAAATG RGGHIDLAILGGMEVSQNGDLA
GGGATTTGGCTAATTGGATGATCCCTAAAAAGCTCATAAAGGGCATGGGAGGGGCTATGGATTTGGTGCATGGCGC KKLIKGMGGAMDLVHGAKKVIVI
TAAAAAAGTGATTGTGATCATGGAGCATTGCAACAAATACGGGGAGTCTAAAGTGAAAAAGGAATGCTCATTGCCCT NKYGESKVKKECSLPLTGKGW
TAACAGGAAAAGGCGTGGTGCATCAATTGATAACGGATTTAGCGGTGTTTGAGTTTTCCAATAACGCCATGAAATTAG DLAVFEFSNNAMKLVELQEGVS
TGGAATTGCAAGAGGGGGTCAGCCTTGATCAAGTGAAAGAAAAGACAGAAGCTGAATTTGAGGTGCGCTTATAGCTT KEKTEAEFEVRL
GTAAAAGGGGGTGTTTATGTTTTTATTAAGGCATTTGACTTCAGCGTGCGTGTTTTTGGCGTCTAAATGTTTGCCGGA
CTCCTTTGTCTTGGTCGCTCTTTTATCGTTTGTCGTGTTTGTTCTTGTTTATTGCTTGACAGGGCAAGACGCTTTTTCT
GTCATTTCTAGTTGGGGGAATGGCGCTTGGACGCTTTTAGGTTTTTCTATGCAAATGGCCCTTATTTTGGTGTTGGGT
CAGGCTCTGGCTAACGCTAAATTAGTCCAAAAGCTTTTAAAATATC
HP0691 261 GGGGATTGAACAAGACGCTGATATTGTTTTATTTTTATATAGAGGCTATATCTATCAAATGAGGGCTGAAGACAACAA 262 GIEQDADIVLFLYRGYIYQMRAE
AATAGACAAACTCAAAAAAGAAGGTAAAATTGAAGAGGCGCAAGAGTTGTACTTAAAAGTTAATGAAGAAAGGCGTAT KLKKEGKIEEAQELYLKVNEER
CCACAAGCAAAATGGCAGCATTGAAGAGGCTGAAATCATTGTGGCTAAAAACAGGAATGGGGCTACAGGAACGGTT NGSIEEAEIIVAKNRNGATGTVY
TATACGCGCTTTAACGCTCCTTTCACGCGCTATGAAGACATGCCCATAGATTCCCATTTAGAAGAAGGGCAAGAAAC PFTRYEDMPIDSHLEEGQETKV
TAAAGTGGATTATGATATAGTTACAACTTGAAAGACAAAACTTTTCAGGGGGCGTTTGAACTTCTTACGACCCCCAAA TT
GAATACCTGGTGTGTGGGGTGTTTTTAAGCCTTTTATTGGCGATTAACCTTTATTTAGAATACTTGAATTACCAAAAGC
TTGATTTTTCAAAACCTACAAGTTTGAGCGCTCAAATCTTGTTGCAATACCCTAAAACTAAAGATCAAAAAACCTATTT
TGTTTTAAAGCTCCAATCAAAAAACATGATCTTTTACACCACCATTAAAGAGCCTTTAAAAAACCTCCAATACCGCCAT
GCGCAATTTTTTGGCAAAATCAAACCTTGCTCGTTCTTAGAGTCTCTAAAATCATGCTTTTTTCAAACTTATTCTTTTTC
TTTAACACGAAAACAGGATTTCAAATCGCATTGGCGCCATTTCATTGACAGCGCTCATGAAAACGCTTTGGTGGGTAA
TTTATATCGCGCGTTATTTATAGGGGATAGCTTAAATAAAGACTTAAGGGATAGGGCTAACGCGCTAGGGATCAACC
ACTTACTAGCCATTAGCGGGTTCCATTTAGGGATTTTGAGCGTTAGCGTGTATTTTCTTTCCTCTCTTTTTTATACCCC
CTTACAAAAACGCTATTTCCCTTATAGGAACGCTTTTTATGATATAGGGGTTTTGGTGTGGGTTTTTTTGCTGGGGTAT TTGTTGC
HP0224 263 CGTTATTTTCTTCAGCGGATAAATACGACTCCGGTTGCGGGTGGCCAAGCTTTTCTAAGCCTATCAATAAAGATGTG 264 LFSSADKYDSGCGWPSFSKPIN
GTGAAATACGAAGACGATGAGAGCCTTAATAGGAAACGCATTGAAGTGTTGAGCCGTATTGGTAAGGCGCATTTAGG KYEDDESLNRKRIEVLSRIGKA
GCATGTGTTTAACGATGGGCCTAAAGAATTAGGGGGCTTAAGGTATTGCATCAACAGCGCGGCTTTAAGGTTTATCC FNDGPKELGGLRYCINSAALRFI
CCTTAAAAGACATGGAAAAAGAGGGTTATGGCGAGTTTATCCCTTATATCAAAAAGGGTGAATTGAAAAAATACATCA MEKEGYGEFIPYIKKGELKKYIN
ATGATAAAAAGTCGCATTAAGGGGTAATGACTAAGCCCCCTAAGGGGGGTTAAAATGAGGGGTTTAAGCGTTTGGGT
TGTCTTATGGGTTATCTTAAAAAACGCTAAAAACCGCCTTAAATGATTTATTTTTAATACCCTTACTTTAAAAACGCTCT
CTTTAAGCTTTATGGCTTTGTTCTTAAAATATCAATAACTAAAGCTTTTTAACTTTTTTAAAAATGGGGTTTTTAATTTTA
TTTTTTGTTTCAAACTTATTTTTTAAGGGGGTGGGGGGTTATCCCATTACAACCCCC
HP0224 265 CCATTATTTGGAAAGCGTTACGCATGCTTTAGAGATTAGTCCTAAAAACCCGCTTCTCATTGATAAGTTTTTAGAAAAA 266 HYLESVTHALEISPKNPLLIDKFL
GCGGTGGAATTAGATGTGGATGCTATTTGCGATAAAAAAGAGGTTTATATTGCCGGCATTTTACAGCACATTGAAGAA LDVDAICDKKEVYIAGILQHIEEA
GCCGGAATCCATTCAGGCGATTCGGCGTGCTTTATCCCTTCCACTCTAAGCCCTGAAATTTTAGATGAAATTGAGCG DSACFIPSTLSPEILDEIERVSAK
GGTGAGCGCAAAAATCGCTCTGCATTTGGGCGTAGTAGGGCTGTTGAATATCCAATTTGCTGTGCATCAAAATTCGC GWGLLNIQFAVHQNSLYLIEVN
TGTATTTGATTGAAGTCAATCCCAGAGCCAGCCGAACCGTGCCTTTTTTAAGCAAGGCTTTAGGCGTTCCTTTAGCCA RTVPFLSKALGVPLAKVATRVM
AAGTGGCGACTAGGGTTATGGTGCTAGAAGACTTGAAAGAAGCCTTGAAGTTTTATGATAAAAAAAATATCGTTGGAT KEALKFYDKKNIVGYSKGVYKP
ATTCTAAAGGCGTTTATAAGCCTAAAATGCCCCATTTTGTGGCTTTAAAAGAAGCGGTTTTCCCTTTTAATAAACTTTA VALKEAVFPFNKLYGSDLILGPE
TGGATCGGATTTGATTTTAGGGCCTGAGATGAAAAGCACCGGCGAAGTGATGGGGATTGCTAGATCTTTAGGGCTT GEVMGIARSLGLAFFKAQTACF
GCATTTTTCAAGGCTCAAACGGCTTGCTTTAACCCCATTAAAAACAAGGGGCTTATTTTTGTTTCTATTAAAGATAAGG KGLIFVSIKDKDKEEACVLMKRL ATAAAGAAGAAGCATGCGTTTTAATGAAGCGATTGGTTCAGTTGGGCTTTGAATTGTGCGCCACAGAAGGCACGCAT ELCATEGTHKALEKAGVKSLKV AAAGCTTTGGAAAAAGCCGGGGTAAAGTCTTTAAAGGTGCTTAAAATCTCTGAAGGCCGCCCCAATATCATG GRPNIM
HP0224 267 ATGGGGAATTGCACAAATACGACAATTCTTTATTGGCGCTCAAAAGAGACCGATTGAGAAGAATGGAATCTAGCGCT 268 GELHKYDNSLLALKRDRLRRME
TATGAAGAAGAGTTTTTACTCTCTTACATTCCAGCCCCTAAAGAGCTTAAGGCGGTTTTAGACAATTATGTGATAGGG EEEFLLSYIPAPKELKAVLDNYV
CAAGAGCAGGCTAAAAAGGTTTTTTCCGTAGCCGTGTATAACCATTACAAACGCTTATCTTTTAAAGAAAAACTCAAA AKKVFSVAVYNHYKRLSFKEKL
AAACAAGACAACCAAGACAGCAATGTGGAGTTAGAGCATTTAGAAGAAGTGGAGTTGAGCAAGTCTAATATTTTACTA QDSNVELEHLEEVELSKSNILLI
ATCGGCCCTACAGGATCAGGCAAAACTTTAATGGCGCAAACTCTGGCCAAGCATTTGGATATTCCTATCGCCATTAG GKTLMAQTLAKHLDIPIAISDAT
CGATGCGACTAGCTTGACTGAAGCGGGCTATGTGGGCGAAGACGTGGAAAATATTCTCACAAGATTGTTGCAAGCG GYVGEDVENILTRLLQASDWN
AGCGACTGGAATGTCCAAAAAGCCCAAAAAGGCATTGTGTTTATTGATGAGATTGATAAAATCAGCCGTTTGTCAGAA KGIVFIDEIDKISRLSENRSITRD
AACCGCTCTATCACTAGAGATGTTTCTGGCGAGGGCGTTCAGCAAGCGTTGTTGAAAATCGTTGAAGGTTCTTTAGT VQQALLKIVEGSLVNIPPKGGR
GAATATCCCCCCCAAAGGCGGCAGAAAGCACCCTGAGGGCAATTTCATTCAAATTGACACGAGCGATATTTTATTCA NFIQIDTSDILFICAGAFDGLAEII
TTTGTGCTGGAGCGTTTGATGGGTTAGCTGAAATCATTAAAAAACGCACCACGCAGAATGTGTTGGGTTTCACTCAA QNVLGFTQEKMSKKEQEAILHL
GAAAAGATGAGCAAAAAAGAGCAAGAAGCGATCTTGCATTTAGTCCAAACCCATGACCTGGTTACTTATGGGCTTAT LVTYGLIPELIGRLPVLSTLDSIS
CCCTGAGCTTATTGGCCGTTTGCCGGTTTTAAGCACGCTAGATAGCATCAGTTTAGAAGCGATGGTGGATATTTTAC DILQKPKNALIKQYQQLFKMDE
AAAAACCTAAAAACGCTCTTATCAAGCAATACCAGCAGCTTTTCAAAATGGATGAGGTGGATTTGATCTTTGAAGAAG EEAIKEIAQLALERKTGARGLR
AAGCCATTAAAGAAATCGC LDIMFDLPKLKGSEVRITKDCVL
LIIAKT
HP0224 269 AGGTGTTCAAAGACAGCAAAAAAGACGCTTGCGGGTTCATCTATGAGATCAGCGAGTTCATGAAAGCCTATACCGCA 270 VFKDSKKDACGFIYEISEFMKA
TTGCTAAAAAAACAAGACCGATACGTCTATTTATTGAGGTATCTCCCCTCTAGGTATTGGGCCAGCATTTTAACGACT KQDRYVYLLRYLPSRYWASILT
GCCCTTTATGTCAAATACCCTGATTTTGACGCTTTGAAAAAGCTTTTGGTGTCTTATTATTACCAAACTTGGATTGCAG KYPDFDALKKLLVSYYYQTWIA
GAGGCACGATCACGCGCATCAAGCAAACCAGTATCAACATTATCAAAAACGTTAAAAGCAATAAGAGCGTTGAAACC IKQTSINIIKNVKSNKSVETIKELI
ATCAAAGAGCTTATATTGAATAGCATCGACTCTTATAACACCTTTGATCAATACCTCTATAACTTATGGGATAGCTCTT YNTFDQYLYNLWDSSSVYHSK
CTGTTTATCATAGCAAATGGGTGCGTCCTGTCTTAGCCCTAGCTAATTATTTCATGGCAGATGAAGAGAAACCCCATT LALANYFMADEEKPHFIAMDAE
TTATCGCTATGGATGCCGAAACCCAAGTGGAGCATATTTTGCCACAAACGCCCAAAAGAGGCAGTCAATGGAACGC ILPQTPKRGSQWNADFDKEKR
GGATTTTGACAAAGAAAAAAGAGAAGAATGGGTAAATAATATCGCGAATTTAACCCTTTTAAAGCGTAAAAAGAACGC NIANLTLLKRKKNAHALNGDFD
GCATGCTTTAAACGGGGATTTTGATGAAAAAAGAAAAATTTATGGAGGCAAAGACACGAGCAAAGTGATTAGCTGTTA GGKDTSKVISCYDITKELYSNY
TGACATCACTAAAGAATTGTATAGCAATTATAGGAAGTGGAATGAGAAGTCCCTCCAAG KSLQ
HP0692 271 GGAGCGCGGCTTATGCGGCCCGCTATGTGGCTAAAAATTTGGTAGCGAGTGGGGTTTGCGATAAAGCGACCGTGC 272 SAAYAARYVAKNLVASGVCDK
AGCTTGCTTATGCGATTGGGGTGATAGAGCCAGTGTCTATTTATGTGAACACGCATAACACGAGCAAGTATTCAAGC AYAIGVIEPVSIYVNTHNTSKYS
GCTGAGTTGGAAAAATGCGTGAAATCGGTTTTCAAACTCACGCCAAAAGGCATTATTGAAAGCTTGGATTTATTAAGA KCVKSVFKLTPKGIIESLDLLRPI
CCCATTTATTCGCTCACTTCAGCTTATGGGCATTTTGGGCGCGAATTAGAGGAATTCACTTGGGAAAAAACCAACAAA AYGHFGRELEEFTWEKTNKAE
GCTGAGGAGATTAAAGCGTTCTTTAAGCGTTAAAAAAATATTTTAAGGGTAATATTTTAAAAAATTTTTGTATAATCAAC KR
AATTCACAAGGAGTTTAA
HP0692 273 GGGTGAGGTGATGGTTTTTGAAAGGAATTTAGCCATTCGTTTGAATGAAATCCTAGATTCTAACGCCATTGTATATTA 274 GEVMVFERNLAIRLNEILDSNAI
TCTCGCTAAAAATTCATGAGATTGTTATTCTTGTTATTGAGTGCTGCTTTTATGTTACTGGCTGAAGAAAAAATATCTTT KNS
AAACGATGACGCCCCCATTAAACTAGTGCATTGGCAAAATGCATTAAAAGAAGTCCAACCTGATTCAAACGCTCCAG
CAACACCACCTATAAAAGCCGTGCAAACCACGCTCACTTTTGAAACGCCTTTTAACAAAACGCCTAAAATCATGGAAG
TTGAAGGGCAAAA
HP0692 275 CTCCAGAAGGCGGTTATAAGGATAAACCTAAGGATAAACCTAGTAACACCACGCAAAATAATGCTAACAACAACCAA 276 PEGGYKDKPKDKPSNTTQNNA
CAAAACAGCGCTCAAAACAATAGTAACACTCAGGTTATTAACCCACCCAATAGCGCGCAAAAAACAGAAATTCAACC QNSAQNNSNTQVINPPNSAQK
CACGCAAGTCATTGATGGGCCTTTTGCTGGTGGCAAAGACACGGTTGTCAATATTGATCGCATCAACACTAACGCTG QVIDGPFAGGKDTWNIDRINT
ATGGCACGATTAAAGTGGGAGGGTATAAAGCTTCTCTTACCACCAATGCGGCTCATTTGCATATCGGCAAAGGCGGT KVGGYKASLTTNAAHLHIGKG
ATCAATCTGTCCAATCAAGCGAGCGGGCGCACCCTTTTAGTGGAAAATCTAACCGGGAATATCACCGTTGATGGGCC QASGRTLLVENLTGNITVDGPL
TTTAAGAGTGAATAATCAAGTGGGTGGTTATGCTTTGGCAGGATCAAGCGCGAATTTTGAGTTTAAGGCTGGTACGG VGGYALAGSSANFEFKAGTDT
ATACCAAAAACGGCACAGCCACTTTTAATAACGATATTAGTTTGGGAAGATTTGTGAATTTAAAAGTGGATGCTCATA TFNNDISLGRFVNLKVDAHTAN
CAGCTAATTTTAAAGGTATTGATACTGGTAATGGTGGTTTCAACACCTTAGATTTTAGTGGCGTTACAGGTAAGGTCA GNGGFNTLDFSGVTGKVNINK
ATATCAACAAGCTCATTACGGCTTCCACTAATGTGGCCGTTAAAAACTTCAACATTAATGAATTGGTTGTTAAGACCAA NVAVKNFNINELWKTNGVSVG
TGGGGTGAGTGTGGGGGAATACACTCATTTTAGCGAAGATATAGGCAGTCAATCGCGCATCAATACCGTGCGTTTG SEDIGSQSRINTVRLETGTRSIF
GAAACTGGCACTAGGTCAATCTTTTCTGGGGGTGTCAAATTTAAAAGCGGTGAAAAACTGGTTATAGATGAGTTTTAC FKSGEKLVIDEFYYSPWNYFDA
TATAGCCCTTGGAATTATTTTGACGCTAGGAATATTAAAAATGTTGAAATCACCAGAAAATTCGCTTCTTCAACCCCAG VEITRKFASSTPENPWGTSKL
AAAACCCTTGGGGCACATCAAAGCTTATGTTTAATAATCTAACCCTGGGTCAAAATGCGGTCATGGACTATAGTCAAT LGQNAVMDYSQFSNLTIQGDFI
TTTCAAATTTAACCAT INYLVRGGQVATLNVGNAAAM
VDSATGFYQPLMKINSAQDLIK
LLKAKIIGYGNVSLGTNSISNVN
ERLALYN
HP0071 277 TCATGGCGCAAATCTTTGTGCGTTTCAATTATGTTTTAGGCGCGATCGGTTTTGTAGTGTTACTTTATGAAATCATTTC 278 MAQIFVRFNYVLGAIGFVVLLY
GTTTATTTATTACAAAAGATCGTTAGTGTATTTGATCCTTGGCGTGGCGATAGGGGCGTTGTGTTTGCTCTTTGTTTTT KRSLVYLILGVAIGALCLLFVFY
TATTACACGCCTTATATTTTAAACGCTCAAAAAGCGGGCGAAGCCGCGCTTCAAAGTGCTGAATTTGCCCGCTCGCA NAQKAGEAALQSAEFARSHA
CGCTCAAAGCGAATGGTTGTTTAAGGAATTGTTTGTGCTGGTGTGCGCTTTGTTTTTTTGGCGTTTGCTTGGAAAAAA KELFVLVCALFFWRLLGKNVL
TGTGCTTTAGTCCCTTTGATTTAATCAAATGAGAGAGTT
HP0071 279 AGCTTATTGGCACGCCTTTAGACGAAAGCCGTATCGCTTTTAGAGATGCGTTTGATAGCCGTGGGTATAAATTAAAAA 280 LIGTPLDESRIAFRDAFDSRGY
ATTTGGTTGAAGAAGTTAATCAAAAATCCCCTAACGCACGCAACGAGTTGGATAAAGATGTCTTAAAAGTAGATGAAC EEVNQKSPNARNELDKDVLKV
GCATCAACTTAGTCTATACGCTTTTTAGCGCTCAATTTTTACGCATTTTCCCTAGCGATAAAACCACCGCTTGGCTCT VYTLFSAQFLRIFPSDKTTAWL
CGCCCATTGAAGCGATCAGCAGCCCCAATAAAGAAATTTCAAGCGTGGCAACGGAGTTTTTAAAAAATATTTTTAGCG SPNKEISSVATEFLKNIFSGFD
GGTTTGATGACGCTTTAAAAACCAACCAATGGGATAAGGTAGAAAAAACCCTAAAAGATTTAAGCATTTACCAAAAAG QWDKVEKTLKDLSIYQKEHAK
AGCATGCCAAAAACCTCTATTTATCCTCTTCTAAAGTGGATTCTGAAATTTTTTTAAACCACACCAATTTTTTTAACCGC SKVDSEIFLNHTNFFNRLTLPYI
CTGACCTTGCCTTATATCCTTCTTGGGTTATTGCTTTTTATCGTTGTGATCAGCTCTCTGGTTAAAAACACGATTCCAA FIWISSLVKNTIPNIWLTKILYF
ATATTTGGCTCACTAAAATCCTTTATTTTGCTATCTTGCTTTGCGCGCTCGCTCATTCTATGGGGCTAGTTTTACGCTG AHSMGLVLRWYVSGHSPWSN
GTATGTGAGCGGGCATTCGCCTTGGAGCAACGCTTATGAGTCCATGCTTTATATCGCATGGGCTTCTGTTATCGCAG LYIAWASVIAGFILRSKLALSAS
GGTTTATTTTACGCTCCAAACTCGCGCTATCGGCTTCTAGCTTTTTAGCCGGTATCGCGCTCTTTGTGGCTCATTTAG ALFVAHLGFMDPQIGHLVPVLK
GCTTTATGGATCCTCAAATCGGCCATTTAGTGCCGGTATTAAAATCCTATTGGCTCAATATCCATGTCTCTGTCATTAC IHVSVITASYSFLGLCFVLGILS
CGCTAGTTATAGCTTTTTGGGCTTGTGTTTTGTGCTAGGGATTTTGAGTTTGGTTTTGTTTATTTTGCGCAAACAAGG KQGRFNLDKTILSISAINEMSMI
GCGTTTCAATTTGGACAAAACCATTCTCTCCATTAGCGCTATCAATGAAATGAGCATGATTTTAGGCCTGTTCATGCT TAGNFLGGVWANESWGRYW
CACAGC ETWALISICVYALILHLRFLGSH
ASSSV
HP0071 281 ACTTAAAATTCTGTTAGTGGGGCATTTGATTACGCCCGTCTTTTTTATGAGCCATTTTCAAATGTGGCAAGCGTATTTT 282 LKILLVGHLITPVFFMSHFQMW
TTAAAACAAGGCGTTAAAGAGCAATACCTTTTTGTGTTTTATATCGCTTTTCAAGTGATTTCTATTCTCATTCATTTTTT QGVKEQYLFVFYIAFQVISILIH
AAAAGCCTCTAGTTATAGCCAAAAAATCGCCTTGAGTTCGCTTGTGGTGTTGTTAGGCGTTAGCCCCTTATTGCTTAG YSQKIALSSLWLLGVSPLLLS
CAATATCCCTTATTGTTTCATAGGGGTGTATGCGCTCATGGTGGCGTTTTTCACTTACATGAGCTATTGCTTAAACTAT GVYALMVAFFTYMSYCLNYQF
CAATTCTCCAAATTCGTTTCTAAAAACAACATTTCCTCGCTCTCATCGCTTTTATCAAGCTGTGTGCGCGTGGTCTCT KNNISSLSSLLSSCVRWSVLIL
GTGCTAATCTTATCGCTCAGCAGTCTGGAACTGCGTTACTTCTCACCCCTAACTATCATAACCATGCATTTTGCCTTG LRYFSPLTIITMHFALTLIILFFF
ACGCTTATCATCCTCTTTTTCTTTTTGTATAAGGCTAAGCCGTTTGATGAGTGAGCGGCTTTAAGAGTGCAACCTTTTA FDE
GCGATTTCTATAGCAACATCATAGCCATGGTTTAAAGCGGTAGCGATGCTCGCCCCTG
Figures imgf000121_0001 to imgf000131_0001
HP1073 371 AACATTTAACCCCACTCACTCACACCATCTTTAAAGCCTTATGGCTAGGCACAGCCTTAAGTGCATCTTTAAGTTTAG 372 HLTPLTHTIFKALWLGTALSAS
CCGCAACAGAAAGCCCCACTAAAACAGAGCCTAAGCCCGCTAAAGGGGTTAAAAACAAGCCCAAATCGCCCGTTAC ESPTKTEPKPAKGVKNKPKSP
TAAAGTCATGATGACCAATTGCGACAATATTAAAGATTTTAACGCTAAGCAAAAAGAAGTCTTAAAAGCCGCTTATCAA MTNCDNIKDFNAKQKEVLKAA
TTCGGCTCTAAAGAAAATTTAGGCTATGAAATGGCAGGCATTGCATGGAAAGAGTCATGCGCAGGGGTTTATAAAAT KENLGYEMAGIAWKESCAGVY
CAATTTTTCGGATCCGAGCGCGGGCGTGTATCATTCTTATATCCCAAGCGTTCTAAAAAGCTATGGGCATAATGATAG PSAGVYHSYIPSVLKSYGHND
CCCCTTTTTGCGTAATGTGATGGGGGAATTGCTCATTAAAGACGATGCGTTTGCTTCTGAAGTGGCTTTAAAAGAGTT VMGELLIKDDAFASEVALKELL
GCTCTATTGGAAAACACGCTACCATGACAATTTAAAAGACATGATTAAATCTTACAACAAGGGCAGTCGTTGGGAAAG YHDNLKDMIKSYNKGSRWERS
GAGCGAAAAATCTAACGCTGATGCTGAAAAATATTACGAAGAGATACAAGACAGAATCAGGCGTTTGAAAGAATCTA DAEKYYEEIQDRIRRLKESKIFD
AAATCTTTGATTCGCAGTCTAGTAATGACCAAGAATTGCAAAAAAGCGCTAATAGCAACCTGGATTTAGACCCTATCG DQELQKSANSNLDLDPIGNAM
GCAACGCCATGCCCCAAGCCTTAATTGCCAAAGAAACTAAAATAGAAGAAACCCAAGCAGAAAAATCCCAAGAAATG KETKIEETQAEKSQEMKETTSE
AAAGAGACAACTAGCGAGCAAACAAAAAGTAAGCCAGAAAAAGCAAAAGATAAACCCATGTATTTGGCGCAAATCAA PEKAKDKPMYLAQINSTDFTPV
CAGCACTGATTTCACACCCGTTAAAAAAAGCCCCAAAAAACCGGCTAAAGTGAGCCAAAAACACTCCTTTAAGAATAA KPAKVSQKHSFKNNIKNNVKN
CATTAAAAATAATGTAAAAAACAACGCCAAAACCGCTTCCAAAAAACAAGAAATGTGCAAAAATTGCTCTCCAGGGCA KKQEMCKNCSPGQRNAILANH
AAGGAATGCGATTTT L
HP1073 373 CATGATACAGAGTTTTACAAGATTGAATGTCGCTGACAATAGCGGCGCTAAAGAAATCATGTGCATTAAGGTGTTAG 374 MIQSFTRLNVADNSGAKEIMCI
GAGGCAGCCACAAACGCTATGCGAGCGTGGGTAGCGTGATCGTGGCTTCCGTGAAAAAAGCTATCCCTAATGGTAA SHKRYASVGSVIVASVKKAIPN
GGTGAAGCGCGGTCAAGTCGTCAAAGCCGTTGTGGTGAGAACGAAAAAAGAAATCCAAAGAAAAAATGGTTCTTTG GQVVKAVVVRTKKEIQRKNGS
GTGCGTTTTGATGACAATGCAGCAGTGATCTTGGACGCTAAAAAAGATCCGGTTGGCACAAGGATTTTTGGGCCAGT DNAAVILDAKKDPVGTRIFGPV
GAGCCGAGAAGTGCGTTACGCTAATTTCATGAAAATTATTTCTCTAGCACCGGAGGTTGTATAATGAAAAGCGAAATC YANFMKIISLAPEVV
AAAAAAAATGACATAGTAAAAGTCATTGCAGGAGACGATAAGGGTAAGGTCGCTAAGGTTTTAGCGGTGTTGCCTAA
GACTTCTCAAGTGGTTGTTGAAGGGTGTAAAGTGGTGAAAAAAGCGATTAAACCTACTGATGATAACCCTAAAGGGG
GCTTTATCCATAAAGAAAAGCCCATGCACATTTCCAATGTGAAGAAAGCCTAAGGAGTTGGATACATGTTTGGTTTGA
AACAATTTTATCAAAGTGAAGTGAGAACAAAACTCGCTCAAGAATTAGACATCAAAAACCCCATGCTTTTACCCAAGC
TAGAAAAAATCGTTATCAGCGTGGGCGCTGGGGCTCATGCAAAAGACATGAAAATCATGCAAAATATCGCACAAACG
ATTTCTTTGATTGCAGGGCAAAAAGCGGTTATCACTAAAGCGAAAAAATCCGTTGCAGGCTTTAAGATCAGAGAAGG
CATGGCGGTAGGGGCGAAAGTTACCTTAAGGAATAAACGCATGTATAATTTCTTAGAAAAGCTGATTGTGATTTCGTT
ACCCAGAGTGAAAGACTTTAGAGGGATTTCACGGAATGGTTTTGATGGGTGCGGGAATTACACCTTTGGGATCAATG
AGCAGTTGATTTTTCCGGAAG
HP1073 375 AGGTGTTCAAAGACAGCAAAAAAGACGCTTGCGGGTTCATCTATGAGATCAGCGAGTTCATGAAAGCCTATACCGCA 376 VFKDSKKDACGFIYEISEFMKA
TTGCTAAAAAAACAAGACCGATACGTCTATTTATTGAGGTATCTCCCCTCTAGGTATTGGGCCAGCATTTTAACGACT KQDRYVYLLRYLPSRYWASILT
GCCCTTTATGTCAAATACCCTGATTTTGACGCTTTGAAAAAGCTTTTGGTGTCTTATTATTACCAAACTTGGATTGCAG KYPDFDALKKLLVSYYYQTWIA
GAGGCACGATCACGCGCATCAAGCAAACCAGTATCAACATTATCAAAAACGTTAAAAGCAATAAGAGCGTTGAAACC IKQTSINIIKNVKSNKSVETIKELI
ATCAAAGAGCTTATATTGAATAGCATCGACTCTTATAACACCTTTGATCAATACCTCTATAACTTATGGGATAGCTCTT YNTFDQYLYNLWDSSSVYHSK
CTGTTTATCATAGCAAATGGGTGCGTCCTGTCTTAGCCCTAGCTAATTATTTCATGGCAGATGAAGAGAAACCCCATT LALANYFMADEEKPHFIAMDA
TTATCGCTATGGATGCCGAAACCCAAGTGGAGCATATTTTGCCACAAACGCCCAAAAGAGGCAGTCAATGGAACGC ILPQTPKRGSQWNADFDKEKR
GGATTTTGACAAAGAAAAAAGAGAAGAATGGGTAAATAATATCGCGAATTTAACCCTTTTAAAGCGTAAAAAGAACGC NIANLTLLKRKKNAHALNGDFD
GCATGCTTTAAACGGGGATTTTGATGAAAAAAGAAAAATTTATGGAGGCAAAGACACGAGCAAAGTGATTAGCTGTTA GGKDTSKVISCYDITKELYSNY
TGACATCACTAAAGAATTGTATAGCAATTATAGGAAGTGGAATGAGAAGTCCCTCCAAGAGCGATACAAATCTTTGTA KSLQERYKSLYNTITPVLHIEG
TAACACTATCACGCCTGTTTTACACATAGAGGGGCAAGAAGATGATTTTGAAGATGATTTTGATCTAGA DDFDL
HP1198 377 AATTAGTCAGCTCTTCAGAATACGCTAAAAAACTCAATGCGATTGACAAGATTAAAAAAACCGAAGAAAAGCAAAAAG 378 LVSSSEYAKKLNAIDKIKKTEEK
TTTTAGATGAAGAATTAGAAGATGGCTATGACTTTTTGAAAGAAAAGGATTTTTTAGAGTGGAGCAGAAGCGATAGCC EELEDGYDFLKEKDFLEWSRSD
CAGTGCGCATGTATTTGCGCGAAATGGGGGATATAAAACTTTTAAGCAAAGATGAAGAGATTGAATTGAGCAAGCAA MYLREMGDIKLLSKDEEIELSKQ
ATCCGCTTGGGTGAAGACATTATTTTAGACGCGATCTGCTCGGTGCCGTATTTGATTGATTTTATCTATGCGTATAAA DIILDAICSVPYLIDFIYAYKDALIN
GACGCTTTAATCAATCGTGAAAGAAGGGTTAAAGAGCTTTTCAGGAGCTTTGATGATGACGATGAAAATAGCGTGAG VKELFRSFDDDDENSVSDSKKD
CGATTCTAAAAAAGATGAAGAC CGAAGAAGATG GAAAACGAAGAAAGGAAAAAAGTCGTTTCTGAAAAAGACA EDEENEERKKWSEKDKKRVE
AGAAGCGTGTAGAAAAGGTTCAAGAAAGCTTTAAAGCCCTAGACAAGGCTAAAAAAGAATGGCTTAAAGCCCTTGAA FKALDKAKKEWLKALEAPIDER
GCCCCCATAGATGAAAGAGAAGACGAATTGGTGCGTTCATTGACCCTAGCTTACAAACGCCAAACACTCAAAGACAG RSLTLAYKRQTLKDRLYDLEPTS
ACTCTATGATTTAGAACCTACCAGCAAACTGATTAATGAATTAGTCAAAACGATGGAAACCACTTTAAAAAGCGGCGA LVKTMETTLKSGDGFEKELKRL
TGGGTTTGAAAAAGAGTTGAAACGCTTGGAATACAAACTGCCCTTATTCAATGACACTCTCATCGCAAACCATAAAAA LFNDTLIANHKKILANITNMTKED
AATCCTTGCCAATATCACTAACATGACTAAAGAAGATATTATCGCTCAAGTGCCAGAAGCGACTATGGTGAGCGTGTA PEATMVSVYMDLKKLFLTKEAS
TATGGATCTTAAAAAGCTTTTTTTGACTAAAGAAGCGAGCGAAGAAGGCTTTGATCTAGCCCCCAACAAGCTAAAAGA LAPNKLKEILEQIKRGKLISDRAK
AATTTTAGAGCAAATCAAAAGAGGGAAGTTGATTTCCGATCGCGCTAAAAACAAAATGGCTAAATCCAATTTAAGGTT KSNLRLWSIAKRFTSRGLPFLD
GGTGGTGAGCATCGC NIGLMKAVDKFEHEKGFKFSTY
KQAISRAIADQARTIRIPIHMIDTI
VMRKHIQENGKEPDLEVVAEEV
KVKNVIKVTKEPISLETPVGNDD
GDFVEDKNIVSSIDHIMREDLKA
LDQLNEREKAVIRMRFGLLDDE
EEIGKELNVTRERVRQIESSAIK
QYGRILRNYLRI
HP1198 379 TGAAACTAGTTTTAGCCAAGAATACAAGAAAATCAGACGCTAAGAGCGTGGAATTAGAGGATTTGTATCACGAATTCA 380 KLVLAKNTRKSDAKSVELEDLYH
GTGAAGATAAGCGTTCTATTTTCTATTTTGCCCCCACAAACGCCCACAAAGACATGCTCAAAGCGGTGGATTTTTTCA DKRSIFYFAPTNAHKDMLKAVD
AAGAAAAAGGTCATACGGCTTATTTAGATGAGGTGAGGGTCAGCACTGATGAAAAAGATTTTCTTTATGAATTGCACA GHTAYLDEVRVSTDEKDFLYEL
TTATTTAAAGGCTTGTATTGAAAGTTTATATTGAAACCATGGGTTGTGCCATGAATTCTAGGGATAGTGAGCATTTATT
GAGCGAGCTGTCCAAACTAGACTATAAAGAGACCAATGACCCTAAAACAGCGGATTTGATTTTAATCAACACTTGCA
GCGTGCGCGAAAAGCCTGAACGAAAATTGTTTTCAGAAATCGGTCAATTCGCTAAAATCAAAAAACCCAACGCCAAA
ATCGGGGTTTGCGGGTGCACTGCAAGCCACATGGGAGCGGATATTTTGAAAAAAGCCCCAAGCGTGAGC
HP1198 381 AAGAAGGGGTTTTAAGGGTTTTACTCAATAAAAAGGGCAAGCTCATTAAAGAATACAAAACCTTAGAGCCTTTAAAAA 382 EGVLRVLLNKKGKLIKEYKTLEP
GCCTAGAAATCCGTTTGAGTGAAGCCCCCATTGATAAACGCAATGATTTTTTATACCATAAGACCACTTATGCCCCTT RLSEAPIDKRNDFLYHKTTYAPF
TTTATCAAAAGGCTCGAGCGCTCATTAAAAAGGGCGTTATGTTTGATGAAATCTTTTATAACCAGGATTTGGAACTCA RALIKKGVMFDEIFYNQDLELTE
CTGAGGGCGCTAGGAGCAATCTTGTTTTAGAAATCCATAACAGGCTTTTAACCCCTTATTTTAGCGCGGGCGCGTTA NLVLEIHNRLLTPYFSAGALNGT
AACGGGACGGGTGTTGTGGGGTTGTTAAAAAAGGGTCTTGTTGGGCATGCACCTTTGAAATTGCAAGATTTGCAAAA LLKKGLVGHAPLKLQDLQKASKI
AGCGTCTAAAATCTATTGTATTAACGCGCTATATGGCTTAGTGGAAGTGAAAATAAAATAACTATAAAAACAGAGCGG LYGLVEVKIK
CTAAAACCTCATTTTTAGAAATAGGTTACCCAATGGAGCAAAAAAGTTAAAACTCGCCCACAATAATCATAATGATTAA
AGTTTTCATATTCATTATAAATCCGTTTACACAATTATTTTATAAATTCAAGTAGAGGGTTTGTAGGAACTCTCATCAAA
AAACAAGGAACATAATATGAGACATGGAGATATTAGTAGCAGCCCAGATACTGTGGGTGTAGCGGTAGTTAATTATA
AGATGCCTAGACTCCACACTAAGAATGAGGTGTTGGAAAATTGTCGCAATATCGCTAAGGTGATTGGTGGGGTCAAA
CAGGGTTTGCCTGGGTTGGATCTGATTATTTTCCCTGAATACAGCACGCATGGGATTATGTATGACAGACAAGAAAT
GTTTGATACAGCCGCAAGCGTTCCTGGAGAAGAAACCGCGATCTTTGCTGAAGCTTGTAAGAAAAACAAGGTTTGGG
GAGTGTTCTCTTTGACAGGGGAAAAACACGAGCAAGCCAAAAAGAATCCCTATAACACTTTGATTCTTGTCAATGATA
AGGGTGAGATCGT
HP1198 383 TGGCAGATGAAACGAACGCTCTGAGACAAAAAAACAGAGAACTGAACAAAAAAATAGATTGGCAAAAAGATTATGAC 384 ADETNALRQKNRELNKKIDWQ
AGGGAAAAATTAGAACGAGATAATAGGGATTTAGAACAATGCAAAGAGGATTTATTAGCCGCTAACGAGAATCTAAG EKLERDNRDLEQCKEDLLAANE
AAATAGAATCCAAGAATGGGAAAATGAAAAAAACAAGCTTGATCCAAGAGATGAAAGAATAAAAGAGTTAGAAGAAGA
AAAAAGGGAATTAGAGGGAATTTTAGCCCAAAAAGAGAACGCAGAACAAAAATATAACACTCTTTCAGTCAAAAATAA ELEGILAQKENAEQKYNTLSVK
GCAATTAGAAGCTGAGTTAGATATGCTTAACGAAAAATTTGAAAAACTGAAAAATATGTATGCTGGGGTAGAGGATTT AELDMLNEKFEKLKNMYAGVE
TGAAAAACGCCAAAAAAATATCAAAGAACAAATTGTAAAAACCAACCCCAAAGTCTTAGGCGCACCTTCAAACGAAGT QKNIKEQIVKTNPKVLGAPSNE
GGAAGAATTAGCGTTCTTAGAGCGTATAGAAAAGGGCATGCAAGAGTTCAATGTTTTCTATCCCAAGCGTTTATTGTA LERIEKGMQEFNVFYPKRLLYM
TATGTTCCACACCGCTTTAAAAAGCACGTCTCTATCGCCATTGAGCGTGCTAAGTGGGGTGAGTGGGACAGGAAAAT KSTSLSPLSVLSGVSGTGKSEL
CTGAACTGCCCAAGCTCTATGTGCATTTTGGGGGGTTAAATTTTTTAAGCATTGCTGTGCAGCCTACTTGGGATAGC HFGGLNFLSIAVQPTWDSPESL
CCAGAATCGTTGATGGGG
HP1198 385 GAATGGGGTGGAGATTGTAGGGTTGGAGCATTTGGATAAAGTGATTTATTTAGATCAAGCCCCCATAGGCAAAACCC 386 NGVEIVGLEHLDKVIYLDQAPIG
CACGAAGCAACCCTGCCACTTACACGGGAGTGATGGATGAAATCAGGATTTTATTTGCCGAGCAAAAAGAAGCTAAA NPATYTGVMDEIRILFAEQKEA
ATTTTAGGCTATAGTGCGAGCCGTTTTAGCTTTAATGTTAAAGGAGGGCGGTGCGAGAAATGCCAAGGCGATGGGG ASRFSFNVKGGRCEKCQGDGD
ACATTAAAATAGAAATGCACTTTTTGCCTGATGTGTTAGTCCAATGCGATAGCTGTAAGGGCGCTAAATACAACCCCC HFLPDVLVQCDSCKGAKYNPQ
AAACTTTAGAAATCAAGGTGAAAGGCAAATCCATTGCCGATGTGTTGAACATGAGCGTGGAAGAGGCTTATGAATTTT KGKSIADVLNMSVEEAYEFFAK
TTGCTAAATTCCCTAAAATCGCCGTGAAGTTAAAAACGCTTATGGATGTGGGCTTAGGCTATATCACTTTAGGGCAAA KLKTLMDVGLGYITLGQNATTLS
ACGCTACGACTTTAAGTGGGGGGGAGGCTCAAAGGATCAAATTAGCTAAAGAATTGAGTAAAAAAGACACAGGCAAA QRIKLAKELSKKDTGKTLYILDE
ACCCTTTATATTTTAGATGAGCCTACTACCGGTTTGCATTTTGAAGACGTGAATCATCTTTTACAAGTCTTGCATTCTT HFEDVNHLLQVLHSLVALGNSM
TAGTGGCGTTAGGCAATTCTATGCTAGTGATTGAGCATAATTTAGACATTATCAAAAACGCTGACTACATTATAGACAT NLDIIKNADYIIDMGPDGGDKGG
GGGGCCTGATGGGGGGGATAAGGGCGGGAAAGTCATTGCGAGCGGCACGCCTTTAGAGGTGGCGCAAAATTGCG GTPLEVAQNCEKTQSYTGKFLA
AAAAAACCCAAAGCTACACGGGAAAATTTTTAGCTTTGGAATTGAAATAGCTTGCATTTGTTTGTAGAAACAATGTAG
GGCTGAAAATAACAGAC
HP1198 387 CGGATTCTTACGGATTCTTGATGTTATTAAAAAAGTTACGACCCCAAAGGGTGGCATTGAAATCTTAAGGACTTTAAT 388 GFLRILDVIKKVTTPKGGIEILRT
TGATTTTACGCCCAAAATTGAAAACGCCCTGAATTTAGCGGCCAAAAGCCATAAGGGGCAATACAGAAAAAGCGGCG KIENALNLAAKSHKGQYRKSGE
AGCCTTATATTGTCCATCCTATTTGCGTGGCAAGCTTGGTAGCGTTTTGTGGGGGCGATGAGGCGATGGTGTGTGCT PICVASLVAFCGGDEAMVCAAL
GCGCTTTTGCATGATGTGGTGGAAGACACGCCTTGTAAGATTGAAACGATTGAGCAAGAATTTGGGCAAGATGTGGC EDTPCKIETIEQEFGQDVANLVD
TAATTTAGTGGATGCGCTCACTAAAATCACTGAAATCAGGAAAGAAGAATTAGGCGTGAGCTCTCAAGATCCCAGAA TEIRKEELGVSSQDPRMVVSAL
TGGTGGTTTCAGCGCTCACTTTCAGAAAGATTTTAATTAGCGCGATACAAGATCCAAGAGCCTTAGTGGTAAAGATTA ISAIQDPRALWKISDRLHNMLTL
GCGACAGGTTGCACAACATGCTCACCTTAGACGCCTTGCCTCATGACAAGCAAGTGCGTATTTCTAAAGAGACTCTA HDKQVRISKETLAVYAPIASRLG
GCGGTGTATGCCCCTATAGCGAGCCGATTGGGCATGTCTTCAATCAAAAATGAATTAGAAGACAAGAGCTTTTATTAT NELEDKSFYYIYPEEYKNIKEYL
ATTTATCCAGAAGAGTATAAAAATATCAAGGAATATTTGCACAAAAACAAGCAGTCTTTACTCTTAAAGCTCAACGCTT SLLLKLNAFASKLEKKLFDSGFS
TTGCGAGCAAGTTAGAAAAAAAACTTTTTGATAGTGGGTTTAGCCATTCGGATTTTAAACTCGTTACAAGGGTGAAAC LVTRVKRPYSIYLKMQRKGAVNI
GCCCTTATTCTATCTATCTTAAGATGCAACGAAAGGGCGCGGTTAATATTGATGAAATTTTGGACTTGTTAGCCATTA LLAIRILLKNPIDCYKVLGIIHLNF
GGATTTTATTGAAAAACCCGATTGATTGCTATAAAGTTTTAGGGATTATCCATTTGAATTTCAAACCCATTGTCTCTCG FKDYIALPKENGYKTIHTTIFDES
TTTTAAAGATTACATCGCTTTGCCCAAAGAAAATGGCTATAAGACGATACACACGACCATTTTTGATGAATCTTCTGTT QIRTFDMHMGAEYGNSAHWKY
TATGAAGTGCAG VDHEDHHEGMRWLQNFKYHD
DPKEFYELAKNDLYREDIWFSP
YTLPVGAIALDFAYMVHSDLGD
YINSKKALLNQELRSGDWKIIK
HP1198 389 ACCCAAAGAGAGGGTTTTAAGCGATGAAGAAATCAACAACAGGGCTGAAAGGATCGCTAAAAGCGAGTTAGAGAAG 390 PKERVLSDEEINNRAERIAKSE
GATACGAAGCTCGTTTCATCACACGATCAATACGAGCGCATGAAAAAAAGCGGATCGCTCAACACGGAAAACTTAGA LVSSHDQYERMKKSGSLNTEN
TTCGCACATTCAAGCCAACAGCTTACAAGAGCTGAATCAAAAATTGCTCCAATTCGTGGGCGCGGATAGGAAGTATA ANSLQELNQKLLQFVGADRKY
TGCCCTACACTAAAGCGGTGCAAATTTCTTTGAATAACCCCAATCTTAAAGATTTGGAAGTGATTGACACCCCAGGAG AVQISLNNPNLKDLEVIDTPGV
TGAATGACCCCATCGCTTCCAGGGAAGAACGCACCAAAGCCTTATTGAAAGATTGCGATGTGGTGTTTATCATAAGC REERTKALLKDCDVVFIISSSN
CTTCTAATCAGTTTTTAACGGAGAGCGATATGAGTTTGTTTGACAGGGTTTCTAACAAAGAAAGCCTTCAAGAAATTT DMSLFDRVSNKESLQEIYFVAS
ATTTTGTGGCAAGCCAAGCCGATAGCGCTGTTCTTTCTATGAGTGAAGTGGAAAAATCTCGCCACCACCTCCCCACA VLSMSEVEKSRHHLPTALENA
GCCTTAGAAAACGCGCAAAAATCCCTTTCATCTTCTTTAAATAAAACCATGGAAGCATTGATTCAAACAAACCCTAAC SLNKTMEALIQTNPNQRGIFEK
CAACGAGGGATTTTTGAAAAAGCGATCAAAAACGGCGTCATTTTGACTTCAGGGGCTTGCTTTAGCATGTATAAGGA ILTSGACFSMYKDFKNQASWE
TTTCAAAAACCAAGCTTCTTGGGAAAGCAAAAAAGAAGAGTGTTACAATGCATGGCGGAATTTAACCAACGCTTACCC CYNAWRNLTNAYPLTLLTALIN
CCTGACGCTTTTAACAGCGTTGATAAATCTGAAGAAAGCTTATTATTCTTAAGCAATATGGGCGCGATT S
HP1198 391 AAGCGAAAAAAATATAGAAAAGGTTTTGAACGCCTATGATAAGCAACAACACCACCATCAAGACGATCTCGCTATTCA 392 SEKNIEKVLNAYDKQQHHHQD
GTATTTACCAGCCGTGCGCGCCATGGCGTTTCGTCTAAAAGAGCGCTTGCCCAGCTCTATTGATTTTAACGATCTGG LPAVRAMAFRLKERLPSSIDFN
TTTCTATTGGCACTGAAGAATTGATTAAATTAGCCAGGCGTTATGAGAGCGCGTTAAACGATTCTTTTTGGGGGTATG TEELIKLARRYESALNDSFWGY
CGAAGACTCGTGTCAATGGGGCGATGTTAGATTATTTGCGCTCTTTAGATGTGATTTCTCGCTCTAGCAGGAAACTC NGAMLDYLRSLDVISRSSRKLI
ATTAAAAGCATTGATATTGAAATCACCAAACACCTTAATGAGCATGGGAAAGAGCCTAGCGATGCGTATTTAGCGCAA TKHLNEHGKEPSDAYLAQTLG
ACTTTAGGCGAAAATATTGAAAAAATTAAAGAAGCCAAAACGGCTTCAGATATTTATGCGTTAGTGCCAATAGATGAA EAKTASDIYALVPIDEQFNAIEQ
CAATTCAATGCGATTGAGCAAGATGAAATCACTAAAAAAATTGAAGCAGAAGAGTTGTTAGAGCATGTCCAAAAAGCG IEAEELLEHVQKALNQMSERE
CTGAATCAAATGAGCGAAAGAGAGCAAATCCTTATCCAGCTTTATTACTTTGAAGAGTTGAATTTGAGCGAGATTAAA YFEELNLSEIKEILGITESRISQII
GAGATTTTAGGCATTACTGAATCGCGCATTTCTCAAATCATTAAAGAAGTGATTAAAAAGGTGCGTAAATCCTTAGGA VRKSLGVDHG
GTGGATCATGGCTGATATTTTAAGCCAAGAAGAAATTGATGCGCTTTTAGAAGTCGTTGATGAAAATGTGGATATTCA
AAATGTCCAAAAAAAAGATATTATCCCGCAGCGCAGCGTAACCCTCTATGATTTCAAACGCCCTAATCGTGTGAGTAA
GGAGCAACTGCGCTCTTTTAGGAGTATCCATGACAAAATGGCTAGGAATCTTTCTAGTCAAGTCTCTTCTATCATGCG
TTCTATTGTAGAGATCCAGCTCCATAGCGTGGATCAAATGACTTATGGCGAATTTTTGATGAGTTTGCCTAGCCCTAC
AAGTTTTAATGT
HP1198 393 TGAATACAACCGCCTCAAACAACGCACCGAGCATGATTTAGAAATGATTAGCGCGACCGGTGTGTGTAAGGGCATTG 394 EYNRLKQRTEHDLEMISATGV
AAAATTACGCGCGCCATTTCACCGGTAAAGCCCCTAACGAAACGCCTTTTTGCTTGTTTGATTATTTAGGGATTTTTG YARHFTGKAPNETPFCLFDYL
AGCGGGAGTTTTTAGTCATTGTGGATGAAAGCCATGTGAGTTTGCCACAGTTTGGGGGGATGTATGCAGGGGATAT FLVIVDESHVSLPQFGGMYAG
GAGCAGGAAAAGTGTTTTAGTGGAATATGGTTTTAGATTGCCTAGCGCTTTAGACAACCGCCCTTTAAAATTTGATGA SVLVEYGFRLPSALDNRPLKFD
ATTTATCCATAAAAATTGCCAGTTCCTTTTTGTGTCCGCTACGCCCAATAAGCTAGAATTAGAGCTTTCCAAAAAGAAT NCQFLFVSATPNKLELELSKKN
GTCGCTGAGCAAATCATTCGCCCTACAGGGCTTTTAGACCCTAAATTTGAAGTGCGAGACAGCGATAAGCAAGTCCA RPTGLLDPKFEVRDSDKQVQD
GGATTTGTTTGATGAAATCAAGTTAGTGGTGGCTAGAGGTGAAAGGGTGCTCATCACCACGCTCACTAAAAAAATGG LVVARGERVLITTLTKKMAEEL
CAGAAGAATTGTGCAAATATTATGCTGAATGGGGCTTGAAGGCGCGTTACATGCATAGTGAAATTGATGCGATTGAA WGLKARYMHSEIDAIERNHIIR
AGGAATCACATCATCCGCTCTTTAAGGCTTAAAGAATTTGACATTTTAATAGGGATCAATCTTTTAAGAGAAGGGCTG FDILIGINLLREGLDLPEVSLVAI
GATTTGCCTGAAGTCTCTTTAGTAGCGATCATGGATGCGGATAAAGAAGGGTTTTTAAGGAGTGAAACAAGCCTCAT EGFLRSETSLIQTMGRAARNA
TCAAACCATGGGGCGAGCCGCTAGAAACGCTAATGGCAAGGTTTTATTATACGCTAAAAAGATCACTCAAAGCATGC YAKKITQSMQKAFEITSYRRAK
AAAAAGCCTTTGAGATCACTAGTTACAGGCGCGCCAAACAAGAAGAGTTCAATAAAATCCATAACATCACCCCCAAAA KIHNITPKTVTRALEEELKLRDD
CCGTTACGCGCGCTTTAGAAGAGGAATTGAAATTAAGAGACGATGAAATTAGAATCGCTAAAGCCTTAAAAAAGGAC ALKKDKMPKSEREKIIKELDKK
AAAATGCCTAAAAGTGAA KNLDFEEAMRLRDEIAQLRTL
HP1198 395 AGAATCCTTACGAGCCTTAAAAGCTTCGCAAGAAGTGCAGGCTAACACGCTTAAGCAGCAATCGCAAACTTTAGAGG 396 ESLRALKASQEVQANTLKQQS
ATTTGAGGAATGAGATTCACGCTAACCAGCAAGCTATCCAGCAGTTAGACAAGCAAAATAAAGAGATGAGTGAATTAT RNEIHANQQAIQQLDKQNKEM
TGACCAAGTTAAGCCAGGATTTGGTTTCACAAATCGCCTTAATCCAAAAAGCTCTCAAAGAACAAGAGGAAAAAGCT LSQDLVSQIALIQKALKEQEEK
GAAAAGCCGCTCAAATCAAACGCTCCGGCTAATAAAACCCCCTCTTTGAAAGCCGAATCCCCAAAAAATCAAGAGGG SNAPANKTPSLKAESPKNQEG
AAAAACTCAAGAAAAGGCGAAAATTGAGTTTGATAAAGACTTGTCTAAGCAAAAAGAGATCTTTCAAGAAGCTCTGTC AKIEFDKDLSKQKEIFQEALSFF
TTTTTTTAAAAATAAATCCTATGCAGAAGCCAAAGAGCGTTTGTTGTGGTTAGAAGCCAATAGTTACAGACTTTATTAT AEAKERLLWLEANSYRLYYVR
GTGCGTTATGTTCTTGGAGAAGTGGCTTATG AY
HP1198 397 GCACACTCTAAAAGAAATGCTCACCATTAAATCCGATGATATTAGAGGCAGAGAGAACGCTTATAGGGCTATCGCTA 398 HTLKEMLTIKSDDIRGRENAYR
AAGGTGAGCAAGTGGGCGAGAGTGAAATCCCTGAGACTTTCTATGTTTTGACTA QVGESEIPETFYVLT
HP1198 399 CTTTAGCCCAAATGCGTTTAGCCATTGAAGCGGCTGAAGGCTCTGATTTGAGCAACGCTAACATGCTTTTTAAAGAA 400 LAQMRLAIEAAEGSDLSNANML
GCTTTTTCTAACGCCAAAGACAAAGAGAGTGCGAGTGAAATCGCGCTTAATTGGGCTGAAGCAGAGATAAACTATCA SNAKDKESASEIALNWAEAEIN
AAATTTTAATAACGCTAAATACCTCATTGATAAGGTGGTCCAATCCAACCCTGATTATATTTCTACGCATAGCGAATCA NAKYLIDKWQSNPDYISTHSE
GCCCTAGACTTGCTCAAGTTATTGAAAAAAAACCAGATGAATGCAAGCGCGATTGAGATCGCTCACTTGCTCCTCAA KLLKKNQMNASAIEIAHLLLNQD
TCAAGATGATGATCTGAAAGCTAAAGAGCAAGCGCTTTATGATTTAGGAGCGTTGTATGCAAGGATCAAGGACTTTAA KEQALYDLGALYARIKDFKNAH
AAACGCCCACCTTTACAATCTGCAATATTTGCAGGACCATGCGGAACTGGATAAAGCTTCTGTCGTTAGGGCGCGCG YLQDHAELDKASWRARDEKA
ATGAAAAAGCCCTTTTTTCCATGGAGGGGAACACGCAAGAAAAAATCGCCCACTATGACAAAATCATTCAAAATTTCC GNTQEKIAHYDKIIQNFPNSNE
CTAATTCTAATGAAGCCCTAAAGGCTTTAGAATTGAAAGCCCAACTATTGTTTGAAAACAAGCGTTATGCTGAAGTGT LKAQLLFENKRYAEVLSMQKNL
TAAGCATGCAAAAAAATTTGCCTAAAGATTCCCCTTTGATCCAAAAAACGCTCAATGTCCTTGCTAAAACCCCATTAG LIQKTLNVLAKTPLENHRCEEA
AGAACCATCGTTGTGAAGAAGCCTTAAAATATTTATCCCAAATCACAACCTTTGAATTCAGCCCCAAAGAAGAAATCC ITTFEFSPKEEIQAFDCLYFASL
AAGCCTTTGATTGCTTGTATTTCGCATCGCTCAAAGAAAAAGCGCAAATCATTGCCCTAAACGCTTTTAAAACGGCTA IALNAFKTAKAPSEKLIWLYRLG
AAGCCCCTAGCGAGAAATTAATATGGCTTTATCGTTTGGGGCGCAATTACTACCGCTTAGGGGATTTTAAAAAT LGDFKN
HP1198 401 GCGCGAACAAAAAGAAAAAAAAAGACGAATACAACAAACCGGCGATCTTTTGGTATCAAGGGATT TGAGAGAAATC 402 ANKKKKKDEYNKPAIFWYQGIL CTT TTGCTAATTTAGAAACAGCGGACAATTACTATTCTTCTTTACAAAGCGAACACATCAATTCCCCCCTTGTCCCAG NLETADNYYSSLQSEHINSPLV AAGCGATGCTAGCTTTAGGGCAAGCGCACATGAAAAAGAAAGAGTATGTTTTAGCGTCT "TTTACTTTGATGAATACA ALGQAHMKKKEYVLASFYFDE TCAAGCGCTTTGGGACTAAGGACAATGTGGATTATTTGAC "TT1 TAAAATTGCAATCGCATTATTACGCTTTCAAAAA TKDNVDYLTFLKLQSHYYAFKN
CCATTCTAAAGACCAGGAATTTATCTCTAATTCTATTGTGAGTTTAGGCGAATTTATAGAAAAATACCCTAACAGCCGT EFISNSIVSLGEFIEKYPNSRYR
TACCGCCCCTATGTAGAATACATGCAAATCAAATTCATTTTAGGGCAAAATGAGCTCAATCGCGCGATCGCGAATGT MQIKFILGQNELNRAIANVYKKR
CTATAAAAAACGCCACAAGCCTGAGGGCGTGAAACGCTATTTAGAAAGGATAGATGAGACTTTAGAAAAAGAGACTA GVKRYLERIDETLEKETKPKPS
AACCCAAACCATCGCACATGCCTTGGTATGTGTTAATTTTTGATTGGTAGGATATTTCAAAACCATACACATTATAACA VLIFDW
GAGAGATGAAAAATGACTGAAGAT
HP1198 403 CCTTTAAACCCTTTAAAGACGCTTTTTACAGAGATTTCAATCATAATGAGCAAAAGTTACTGATAGGGGCAGCTAAAA 404 FKPFKDAFYRDFNHNEQKLLIG
GCGGTTGCATTCAATCTAGCGCTGATAAACTGGCTCAGTTAAAAACGCGCTTACTCTACTGGCAAGACAAATCTGTTA CIQSSADKLAQLKTRLLYWQDK
AAGTGGATTGGGATAAACCCATTTTAATCAAGGACTTCTTTAAAGGCAATAATTACCTTTATAGGAGGTTTTGTTTTTT WDKPILIKDFFKGNNYLYRRFC
ATTGGGGAAGCATTTTATGGACAGATTTTTAAAGAATAACGCTAAGGCGAGCGTGAAAGACTTTATGTCTAGTAAGGA HFMDRFLKNNAKASVKDFMSS
GTTTGTCGCTAAATACCGATACACCCCCAAGCAAAATACAGAAAGAGCGAAAAAGCTGCAATCGTATTTAGAGAATAA KYRYTPKQNTERAKKLQSYLEN
GCGCGATTTTATAGGGTTTGTTCAAGCGCTTAACTCTTTAAAAGACAACCCGCAAGATCCTTTTTTACCCAATGAAGA GFVQALNSLKDNPQDPFLPNEE
AACGAGCTTTTTGGTGTTCGCTAATG FAN
HP1198 405 GCTCGCTTTGAATGAGTTGAATCCGGGCGAATGGGTGGTGCATGATGATTATGGGGTGGGCGTGTTTTCTCAATTAG 406 LALNELNPGEWWHDDYGVGV
TCCAGCACAGCGTTTTAGGGAGCAAGAGGGATTTTTTAGAAATCGCTTATTTGGGCGAAGACAAACTGCTGTTACCG QHSVLGSKRDFLEIAYLGEDKL
GTAGAAAACTTGCATCTCATCGCTCGCTATGTGGCGCAAAGCGATAGCGTGCCAGCTAAAGACCGGCTAGGGAAAG NLHLIARYVAQSDSVPAKDRLG
GGAGCTTTCTTAAATTAAAAGCTAAAGTCAGGACTAAGCTTTTAGAGATTGCTAGCAAGATCATTGAATTAGCGGCTG KLKAKVRTKLLEIASKIIELAAER
AACGCAATTTGATCTTGGGTAAAAAGATGGATGTGCATTTAGCGGAGTTGGAAGTCTTTAAATCGCATGCGGGGTTT KMDVHLAELEVFKSHAGFEYTS
GAATACACCAGC
HP1076 407 GTTTGAGCAGCGACAAAAAAGCCCTTTTAAGGAATTTGGAAAAATCGCTTAAAAATAAGATTTTTGCCCAAGCAGAAG 408 LSSDKKALLRNLEKSLKNKIFAQ
CGATCAGTCTTGTCAGCAATGCGATTAAAATCCAGCATTGCGGGCTTTCTGCAAAAAATAAGCCTGTGGGGAGCTTT LVSNAIKIQHCGLSAKNKPVGS
TTATTCGTGGGGCCTAGTGGGGTGGGGAAAACAGAATTGGCTAAAGAATTGGCCTTGAATTTGAATTTGCATTTTGA SGVGKTELAKELALNLNLHFER
ACGCTTTGACATGAGCGAATACAAAGAAGCCCATAGCGTGGCAAAACTCATCGGGAGTCCTAGCGGTTATGTGGGG YKEAHSVAKLIGSPSGYVGFEQ
TTTGAACAAGGGGGGTTATTGGTGAATGCGATTAAAAAACACCCGCATTGTTTGCTGCTTTTAGATGAGATAGAAAAA NAIKKHPHCLLLLDEIEKAHSNV
GCCCATTCTAACGTGTATGATTTGTTGTTGCAAGTGATGGATAACGCCACTTTGAGCGATAATTTAGGCAATCAGGC QVMDNATLSDNLGNQASFKHVI
GAGTTTTAAGCATGTGATTTTGATTATGACTTCAAATGTGGGGAGTAAGGATAAGGATACGCTAGGGTTTTTTAGCGC NVGSKDKDTLGFFSAKNTKYD
TAAAAACACCAAGTATGATAAAGCCGTTAAAGAGCTTTTGACCCCTGAATTACGCTCCAGGATTGATGCGATCGTGC LTPELRSRIDAIVPFNALSLED
CGTTTAACGCGCTCAGTTTGGAGGAT
Figures imgf000137_0001 and imgf000138_0001
HP0071 431 ATGGCGGGATTGCGTGCGCGAATTTGTTGCATAAAAATTCAGGGATCACGATAGATATTGGAGGGGGTAGCACCGA 432 GGIACANLLHKNSGITIDIGGGS
GTGCGCGTTGATTGAAAAAGGCAAGATTAAGGACTTAATCTCGCTTGATGTTGGGACGATTCGCATTAAAGAAATGTT EKGKIKDLISLDVGTIRIKEMFLD
TTTAGACAAAGACTTAGAGGTCAAATTGGCTAAAGCCTTTATCCAAAAAGAAGTCTCTAAACTGCCCTTTAAACACAA KLAKAFIQKEVSKLPFKHKNAF
AAACGCCTTTGGGGTGGGGGGGACGATCAGAGCGTTGAGTAAGGTATTGATGAAACGCTTTTGTTACCCTATTGATT RALSKVLMKRFCYPIDSLHGYEI
CTTTGCATGGCTATGAAATAGATGCACATAAAAATTTAGCGTTCATTGAAAAAATCGTCATGCTCAAAGAAGATCAATT NLAFIEKIVMLKEDQLRLLGVNE
ACGGCTTTTAGGGGTGAATGAAGAGCGTTTGGATAGCATCAGGAGCGGGGCGTTGATTTTATCAGTCGTTTTGGAG IRSGALILSWLEHLKTSLMITS
CATTTAAAAACTTCTTTAATGATCACTAGTGGGGTGGGGGTGAGAGAAGGCGTGTTTTTGAGCGATTTATTGCGCCA EGVFLSDLLRHHYHKFPPNINP
TCATTACCATAAATTCCCCCCCAATATCAACCCCTCTCTCATCTCTTTAAAAGATCGCTTTTTGCCCCATGAAAAGCAC DRFLPHEKHSQKVKKECVKLF
AGCCAAAAGGTCAAAAAAGAATGCGTGAAATTGTTTGAAGCCTTATCGCCTTTGCATAAAATAGATGAAAAATACCTT HKIDEKYLFHLKIAGELASMGKI
TTCCATTTAAAGATTGCGGGGGAATTAGCGAGCATGGGTAAGATTTTAAGCGTCTATTTAGCCCACAAGCACAGCGC HKHSAYFILNALSYGFSHQDRA
GTATTTTATTTTAAACGCTTTGAGTTATGGCTTTAGCCACCAGGATAGAGCGATCATTTGCTTATTAGCCCAATTCAGC QFSHKKI
CATAAAAAAATC
HP0071 433 TTTTGAAAACGCAAAAGCTGAATGCAGTTTAGTTTTTATTATCAATAAGGATTTTAGCCACGCTTGGGTCAAAAATAAA 434 FENAKAECSLVFIINKDFSHAW
GAGTTGCTAGAAACCTTTAAATACGAAGGCGAAGGCGTATTTTTAGACCAAGAAAATAAAATCCTGTATGCGGGCGT LETFKYEGEGVFLDQENKILYA
TAAAGAAGATGATGTGCATTTATTGAGAGAGAGCGCGTGTTTAGCCGTTCGCACCCTTAAAAAACTCGCTTTTAAAAG DVHLLRESACLAVRTLKKLAFK
CGTTAAAGTGGGCGTTTATACTTGTGGTGCACATTCTAAAGATAACGCGCTTTTAGAAAACTTGAAAGCGCTGTTTTT VYTCGAHSKDNALLENLKALFL
GGGCTTGAAATTAGGTTTGTATG TACGACACTTTTAAATCCAACAAAAAAGAAAGCGTTTTAAAAGAAGCCATTGT LYEYDTFKSNKKESVLKEAIVAL
CGCTTTAGAATTGCACAAACCTTGCGAAAAAACTTGCGCAAATTCTTTAGAAAAGAGTGCTAAAGAAGCGTTAAAATA CEKTCANSLEKSAKEALKYAEI
CGCTGAAATCATGACAGAAAGCTTGAATATCGTTAAAGATCTAGTCAATACCCCCCCTATGATTGGCACTCCGGTTTA NIVKDLVNTPPMIGTPVYMAEV
TATGGCTGAAGTGGCGCAAAAAGTGGCTAAAGAAAACCATTTAGAAATCCATGTTCATGATGAAAAATTTTTAGAAGA KENHLEIHVHDEKFLEEKKMNA
AAAGAAAATGAACGCCTTTTTAGCGGTCAATAAAGCCTCTCTTAGCGTCAATCCTCCTCGCTTGATCCATTTAGTCTA KASLSVNPPRLIHLVYKPKKAK
TAAGCCTAAAAAAGCGAAGAAAAAAATCGCTTTAGTGGGTAAGGGCTTGACTTATGATTGTGGGGGTTTGAGCTTGA GKGLTYDCGGLSLKPADYMVT
AACCGGCCGATTACATGGTTACTATGAAAGCGGATAAAGGCGGTGGCTCTGCGGTGATTGGGCTTTTAAACGCATTA GGGSAVIGLLNALAKLGVEAEV
GCCAAACTAGGCGTGGAGGCTGAAGTGCATGGCATTATTGGGGCTACAGAAAACATGATAGGCCCAGCCGCTTATA TENMIGPAAYKPDDILISKEGKS
AACCAGATGATATTTTGATCTCCAAAGAAGGCAAGAGCATAGAGGTCCGTAATACCGACGCTGAGGGGCGTTTGGTT TDAEGRLVLADCLSYAQDLNP
TTAGCGGATTGTTTG ATLTGACWGLGEFTSAIMGHN
LFETSGLESGELLAKLPFNRHL
KIADVCNISSSRYGGAITAGLFL
EFKDKWLHIDIAGPAYVEKEW
ASGAGVRACTAFVEELLKKA
HP0071 435 TGATGTTATTAAAAAAGTTACGACCCCAAAGGGTGGCATTGAAATCTTAAGGACTTTAATTGATTTTACGCCCAAAATT 436 DVIKKVTTPKGGIEILRTLIDFTP
GAAAACGCCCTGAATTTAGCGGCCAAAAGCCATAAGGGGCAATACAGAAAAAGCGGCGAGCCTTATATTGTCCATC NLAAKSHKGQYRKSGEPYIVH
CTATTTGCGTGGCAAGCTTGGTAGCGTTTTGTGGGGGCGATGAGGCGATGGTGTGTGCTGCGCTTTTGCATGATGT LVAFCGGDEAMVCAALLHDW
GGTGGAAGACACGCCTTGTAAGATTGAAACGATTGAGCAAGAATTTGGGCAAGATGTGGCTAATTTAGTGGATGCGC KIETIEQEFGQDVANLVDALTKI
TCACTAAAATCACTGAAATCAGGAAAGAAGAATTAGGCGTGAGCTCTCAAGATCCCAGAATGGTGGTTTCAGCGCTC ELGVSSQDPRMVVSALTFRKIL
ACTTTCAGAAAGATTTTAATTAGCGCGATACAAGATCCAAGAGCCTTAGTGGTAAAGATTAGCGACAGGTTGCACAAC PRALWKISDRLHNMLTLDALP
ATGCTCACCTTAGACGCCTTGCCTCATGACAAGCAAGTGCGTATTTCTAAAGAGACTCTAGCGGTGTATGCCCCTAT RISKETLAVYAPIASRLGMSSIK
AGCGAGCCGATTGGGCATGTCTTCAATCAAAAATGAATTAGAAGACAAGAGCTTTTATTATATTTATCCAGAAGAGTA KSFYYIYPEEYKNIKEYLHKNK
TAAAAATATCAAGGAATATTTGCACAAAAACAAGCAGTCTTTACTCTTAAAGCTCAACGCTTTTGCGAGCAAGTTAGAA LNAFASKLEKKLFDSGFSHSDF
AAAAAACTTTTTGATAGTGGGTTTAGCCATTCGGATTTTAAACTCGTTACAAGGGTGAAACGCCCTTATTCTATCTATC VKRPYSIYLKMQRKGAVNIDEIL
TTAAGATGCAACGAAAGGGCGCGGTTAATATTGATGAAATTTTGGACTTGTTAGCCATTAGGATTTTATTGAAAAACC ILLKNPIDCYKVLGIIHLNFKPIVS
CGATTGATTGCTATAAAGTTTTAGGGATTATCCATTTGAATTTCAAACCCATTGTCTCTCGTTTTAAAGATTACATCGC IALPKENGYK
TTTGCCCAAAGAAAATGGCTATAAG
HP0071 437 TTTTGCTAGCGATTGTGTTGAGCATTTCTACTTTTATTGCACAAGGTAAGATTAAGGTCAGTCTCCCTAACGCTAAAAA 438 LLAIVLSISTFIAQGKIKVSLPNA
TGCGGAAAAATCCCAGCCAAACGATCAAAAAGTGGTGGTCATCTCTGTAGATGAGCATGACAATATTTTCGTAGATG SQPNDQKWVISVDEHDNIFVD
ACAAACCGATGAATTTAGAAGCTTTGAGCGCTGTAGTCAAACAAACAGACCCTAAAACCCTTATAGACTTAAAAAGCG LEALSAWKQTDPKTLIDLKSD
ACAAAAGCTCTCGTTTTGAAACTTTTATCAGCATTATGGATATTTTAAAAGAGCATAATCATGAAAATTTCTCCATCTCC TFISIMDILKEHNHENFSISTQA
ACGCAAGCTCAGTAAAGTTTCAACGAGTGTTAGCTTTTTAATCTCTTTTGCCCTATACGCTATAGGGTTTGGCTATTTT
TTACTGCGCGAAGACGCCCCAGAGCCTTTAGCGCAAGCCGGGACCACTAAGGTTACCATGAGTTTAGCCAGCATCA
ACACTAATTCCAATACAAAGACTAATGCTGAGTCGGCTAAACCCAAAGAAGAGCCTAAAGAAAAACCCAAGAAAGAA
GAGCCAAAAAAAGAAGAACCCAAAAAGGAGGTTACAAAGCCTAAACCTAAGCCTAAACCCAAGCCAAAGCCAAAACC
AAAACCTAAGCCTGAACCCAAACCTGAACCAAAACCCGAGCCTAAGCCTGAGCCTAAAGTTGAAGAGGTTAAAAAAG
AAGAGCCTAAAGAAGAGCCCAAAAAAGAAGAAGCTAAAGAGGAAGCTAAAGAAAAAAGCGCTCCTAAACAAGTAACA
ACTAAGGATATAGTCAAAGAAAAAGACAAGCAAGAAGAATCCAACAAAACCTCTGAGGGGGCCACTTCTGAAGCTCA
AGCTTATAACCCAGGGGTGAGCAACGAATTTTTAATGA
HP0071 439 AAGACAGCAAAAAAGACGCTTGCGGGTTCATCTATGAGATCAGCGAGTTCATGAAAGCCTATACCGCATTGCTAAAA 440 DSKKDACGFIYEISEFMKAYTA
AAACAAGACCGATACGTCTATTTATTGAGGTATCTCCCCTCTAGGTATTGGGCCAGCATTTTAACGACTGCCCTTTAT RYVYLLRYLPSRYWASILTTAL
GTCAAATACCCTGATTTTGACGCTTTGAAAAAGCTTTTGGTGTCTTATTATTACCAAACTTGGATTGCAGGAGGCACG FDALKKLLVSYYYQTWIAGGTI
ATCACGCGCATCAAGCAAACCAGTATCAACATTATCAAAAACGTTAAAAGCAATAAGAGCGTTGAAACCATCAAAGAG SINIIKNVKSNKSVETIKELILNSI
CTTATATTGAATAGCATCGACTCTTATAACACCTTTGATCAATACCTCTATAACTTATGGGATAGCTCTTCTGTTTATCA FDQYLYNLWDSSSVYHSKWVR
TAGCAAATGGGTGCGTCCTGTCTTAGCCCTAGCTAATTATTTCATGGCAGATGAAGAGAAACCCCATTTTATCGCTAT ANYFMADEEKPHFIAMDAETQ
GGATGCCGAAACCCAAGTGGAGCATATTTTGCCACAAACGCCCAAAAGAGGCAGTCAATGGAACGCGGATTTTGAC QTPKRGSQWNADFDKEKREE
AAAGAAAAAAGAGAAGAATGGGTAAATAATATCGCGAATTTAACCCTTTTAAAGCGTAAAAAGAACGCGCATGCTTTA NLTLLKRKKNAHALNGDFDEKR
AACGGGGATTTTGATGAAAAAAGAAAAATTTATGGAGGCAAAGACACGAGCAAAGTGATTAGCTGTTATGACATCACT KDTSKVISCYDITKELYSNYRK
AAAGAATTGTATAGCAATTATAGGAAGTGGAATGAGAAGTCCCTCCAAGAGCGATACAAATCTTTGTATAACACTATC LQERYKSLYNTITPVLHIEGQED
ACGCCTGTTTTACACATAGAGGGGCAAGAAGATGATTTTGAAGATGATTTTGATCTAGAATGATTAAAGATTGCCAAG FDLE
CATCAAAA
HP0316 441 ACCCCTAGAAGAAAGCCTGGATTTAAAAGAGTTTATCGCTCTTTTTAAAACCTTTTTTGCCAAAGAAAGAGATACTATT 442 PLEESLDLKEFIALFKTFFAKER
GCTTTAGAAAACGATCTCAAGCAAACTTTCACTTATTTAAACGAAGTGGATGCGATCGGTTTGCCCACCCCTAAAAGC NDLKQTFTYLNEVDAIGLPTPK
GTGAAAGAAAGCGATCTTATTATCATCAAACTCACCAAATTAGGGACGCTCCATTTAGATGAAATTTTTGAGATTGTCA DLIIIKLTKLGTLHLDEIFEIVKRL
AACGATTGCACTACATTGTCGTTTTACAAAACGCTTTTAAAACTTTCACGCATTTAAAATTTCATGAACGCCTTAACGC QNAFKTFTHLKFHERLNAIVLP
TATTGTCCTGCCCCCTTTTTTTAACGATCTGATCGCTTTATTTGATGATGAAGGGAAAATCAAACAAGGGGCTAACGC IALFDDEGKIKQGANATLDALN
TACCCTAGACGCTTTGAATGAAAGTTTGAACCGCCTTAAAAAAGAGAGCGTAAAAATCATTCACCATTACGCCCGCTC KKESVKIIHHYARSKELAPYLVD
TAAAGAGCTTGCCCCTTATTTAGTGGATACGCAAAGCCATCTTAAGCATGGTTATGAATGCCTTTTATTGAAAAGCGG KHGYECLLLKSGFSGAIKGWL
GTTTTCTGGCGCGATTAAAGGCGTTGTGCTAGAAAGGAGCGCTAATGGCTATTTCTATCTTTTGCCTGAAAGCGCCC GYFYLLPESAQKIAQKIAQIGNE
AAAAAATCGCCCAAAAAATCGCCCAAATTGGTAATGAAATAGATTGTTGCATTGTTGAAATGTGCCAAACTCTAAGCC EMCQTLSHSLQKHLLFLKFLFK
ATAGCTTGCAAAAACACCTTTTATTTTTAAAATTCCTTTTTAAAGAATTTGATTTTTTAGACAGCTTGCAAGCCCGGCTT DSLQARLNFAKAYNLEFVMPSF
AATTTCGCTAAAGCCTACAATTTAGAATTTGTCATGCCAAGCTTTACACAAAAAAAAATGATTTTAGAAAACTTTTCACA MILENFSHPILKEPKPLNLKFEK
CCCCATTTTAAAAGAGCCAAAGCCCTTAAATTTGAAGTTTGAAAAATCCATGCTTGCTGTTACCGGCGTGAATGCGG TGVNAGGKTMLLKSLLSAAFLS
GCGGGAAAACCATGCTCTTAAAATCGCTTTTAAGCGCGGCTTTTTTAAGCAAGCATCTCATTCCTATGAAAATCAACG MKINAHHSIIPYFKEIHAIINDPQ
CCCAT STFAGRMKQFSALLSKENMLL
LGTDAD
HP0316 443 AAAATACAGCAAAAAACAGCTTTTTAATTTAATCCATCAATTAGAGCGAAAAATCAAAAAGATGCAAAATGATAGAATT 444 KYSKKQLFNLIHQLERKIKKMQ
TCTTTTAAAGAAAAAATGGCTAAAGAATTGGAAAAAAGGGATCAAAACTTTAAGGATAAAATAGACGCGTTAAATGAA KEKMAKELEKRDQNFKDKIDA
CTCTTGCAAAAAATCAGTCAAGCTTTTGATGATAAAAG KISQAFDDK
HP0316 445 TAAGGAATTTCAAGCCCAAACCGAGGGCGAATTTGGGGGGCTTGGGATCACGGTGGGCATGCGCGATGGCGTTTTA 446 KEFQAQTEGEFGGLGITVGMR
ACCGTTATTGCCCCTTTAGAAGGCACTCCAGCTTACAAGGCTGGGGTTAAGTCAGGCGATAACATTTTAAAAATCAAT VIAPLEGTPAYKAGVKSGDNIL
AACGAAAGCACGCTGAGCATGAGCATTGAT TLSMSID
HP1448 447 AGGGCTTTGGGGGCTTTAGTAATCAACACGCCTACTAGCGAGGGGATTTCTGGCGCCATTAAAAAAAGTAAAGAGTT 448 RALGALVINTPTSEGISGAIKKS
AGCCGAAAGTATCCCTGATAGCTATTTGCCCTTACAATTTGAAAACCCTGATAATCCCGCCGCTTACTACCACACCCT SIPDSYLPLQFENPDNPAAYYH
AGCCCCTGAGATTGTCCAAGAATTAGGCACAAACCTTACGAGCTTTGTAGCCGGGATAGGGAGTGGTGGCACTTTT VQELGTNLTSFVAGIGSGGTFA
GCGGGCACGGCTAGGTATTTGAAAGAACGCATCCCTGCGATTCGCTTGATTGGGGTGGAGCCGGAGGGTTCTATTT LKERIPAIRLIGVEPEGSILNGGE
TGAATGGGGGTGAGCCTGGGCCTCATGAGATTGAGGGCATTGGCGTGGAGTTCATCCCTCCTTTTTTTGAAAACTTG EIEGIGVEFIPPFFENLDIDGFETI
GATATTGATGGCTTTGAAACGATCTCAGATGAAGAGGGT G
HP0431 449 GGGTAGAGCAAGTGTTAGCCGATCTCAAAAACTTCTCAAAGGAGCAATTGGCTCAACAAGCTCAAAAAAATGAAGAT 450 VEQVLADLKNFSKEQLAQQAQ
TTCAATACTGGAAAAAATTCTGAACTATACCAATCCGTTAAGAATAGTGTAAATAAAACCCTAGTCGGTAATGGGTTAT NTGKNSELYQSVKNSVNKTLV
CTGGAATAGAGGCCACAGCTCTCGCCAAAAATTTTTCGGATATCAAGAAAGAATTGAATGAGAAATTTAAAAATTTCA GIEATALAKNFSDIKKELNEKFK
ATAACAATAATAATGGACTCAAAAACAGCACAGAACCCATTTATGCTAAAGTTAATAAAAAGAAAACAGGACAAGTAG NNGLKNSTEPIYAKVNKKKTGQ
CTAGCCCTGAAGAACCCATTTATACTCAAGTTGCTAAAAAGGTAAATGCAAAAATTGACCGACTCAATCAAATAGCAA EPIYTQVAKKVNAKIDRLNQIAS
GTGGTTTGGGTGGTGTAGGGCAAGCAGCGGGCTTCCCTTTGAAAAGGCATGATAAAGTTGATGATCTCAGTAAGGT GQAAGFPLKRHDKVDDLSKVG
AGGGCTTTCAGCTAGCCCTGAACCCATTTACGCTACGATTGATGATCTCGGCGGACCTTTCCCTTTGAAAAGGCATG EPIYATIDDLGGPFPLKRHDKVD
ATAAAGTTGATGATCTCAGTAAGGTAGGGCGATCAAGGAATCAAGAATTGGCTCAGAAAATTGACAATCTCAATCAAG GRSRNQELAQKIDNLNQAVSEA
CGGTATCAGAAGCTAAAGCAGGTTTTTTTGGCAACCTAGAGCAAACGATAGACAAGCTCAAAGATTCTACAAAAAAG FGNLEQTIDKLKDSTKKNVMNL
AATGTTATGAATCTATATGTTGAAAGTGCAAAAAAAGTGCCTGCTAGTTTGTCAGCGAAATTGGACAATTATGCTATTA KKVPASLSAKLDNYAINSHTRIN
ACAGCCACACACGCATTAATAGCAATATCCAAAATGGAGCAATCAATGAAAAAGCGACCGGTATGCTAACGCAAAAA GAINEKATGMLTQKNPEWLKLV
AACCCTGAGTGGCTTAAGCTCGTGAATGATAAGATAGTTGCGCATAATGTGGGAAGCGTTTCTTTGTCAGAGTATGA AHNVGSVSLSEYDKIGFNQKNM
TAAAATTGGCTTCAACCAGAAGAATATGAAAGATTATTCTGATTCGTTCAAGTTTTCCACCAAGTTGAACAATGCTGTA DSFKFSTKLNNAVKDIKSGFTHF
AAAGACATTAAGTC STGYYCLARENAEHGIKNVNTK
KS
HP0431 451 CCTTTTAAACCATGCTAAAAAAACTCAAAGCTTGAATGGGGTGGAGATTGTAGGGTTGGAGCATTTGGATAAAGTGAT 452 LLNHAKKTQSLNGVEIVGLEHL
TTATTTAGATCAAGCCCCCATAGGCAAAACCCCACGAAGCAACCCTGCCACTTACACGGGAGTGATGGATGAAATCA DQAPIGKTPRSNPATYTGVMDE
GGATTTTATTTGCCGAGCAAAAAGAAGCTAAAATTTTAGGCTATAGTGCGAGCCGTTTTAGCTTTAATGTTAAAGGAG EQKEAKILGYSASRFSFNVKGG
GGCGGTGCGAGAAATGCCAAGGCGATGGGGACATTAAAATAGAAATGCACTTTTTGCCTGATGTGTTAGTCCAATGC QGDGDIKIEMHFLPDVLVQCDS
GATAGCTGTAAGGGCGCTAAATACAACCCCCAAACTTTAGAAATCAAGGTGAAAGGCAAATCCATTGCCGATGTGTT YNPQTLEIKVKGKSIADVLNMSV
GAACATGAGCGTGGAAGAGGCTTATGAATTTTTTGCTAAATTCCCTAAAATCGCCGTGAAGTTAAAAACGCTTATGGA EFFAKFPKIAVKLKTLMDVGLGY
TGTGGGCTTAGGCTATATCACTTTAGGGCAAAACGCTACGACTTTAAGTGGGGGGGAGGCTCAAAGGATCAAATTAG NATTLSGGEAQRIKLAKELSKK
CTAAAGAATTGAGTAAAAAAGACACAGGCAAAACCCTTTATATTTTAGATGAGCCTACTACCGGTTTGCATTTTGAAG LYILDEPTTGLHFEDVNHLLQVL
ACGTGAATCATCTTTTACAAGTCTTGCATTCTTTAGTGGCGTTAGGCAATTCTATGCTAGTGATTGAGCATAATTTAGA LGNSMLVIEHNLDIIKNADYIIDM
CATTATCAAAAACGCTGACTACATTATAGACATGGGGCCTGATGGGGGGGATAAGGGCGGGAAAGTCATTGCGAGC GDKGGKVIASGTPLEVAQNCEK
GGCACGCCTTTAGAGGTGGCGCAAAATTGCGAAAAAACCCAAAGCTACACGGGAAAATTTTTAGCTTTGGAATTGAA TGKFLALELK
ATAGCTTGCATTTGTTTGTAGAAACAATGTAGGGCTGAAAATAACAGAC
HP0674 453 AATACGCTAAAAATTTGAGCGCTTTAGCCCAAGACAACGCTTTAGATCCAGTCATTGGCAGAGAAGAAGAGATTTTAA 454 YAKNLSALAQDNALDPVIGREE
GAGTGATAGAAATTTTAGGGCGCAGAAAAAAGAATAACCCGCTTTTAATTGGCGAAGCGGGCGTAGGGAAAACCTC EILGRRKKNNPLLIGEAGVGKT
CATCGCTGAAGCTTTGGCTTTAAAAATCGCTCAAAAAGAAGTGCCGGAGTTTTTGCAAGAATATGAAGTCTATTCTTT ALKIAQKEVPEFLQEYEVYSLDL
GGATTTAGCCTTAATGGTGGCTGGGGCAAAATACAGAGGGGATTTTGAAAAACGCTTGAAAAAAACGCTCAAAGAAA GAKYRGDFEKRLKKTLKEIQQN
TCCAACAAAACGGCCGTATCATTTTATTCATTGATGAAATCCACACCCTTTTAGGCACAGGGAGCAGTAACGCTGGG DEIHTLLGTGSSNAGSLDAANIL
AGCTTGGATGCGGCGAATATATTAAAACCGGTTTTAACGGATGGGAGCTTGAAATGTTTAGGAGCGACCACTTTTGA DGSLKCLGATTFEEYRSVFEKD
AGAATACCGCAGCGTGTTTGAAAAAGACAAGGCTTTTAATAGGCGTTTTTCAGTCATAAAAGTTGAAGAGCCTTCTAA RFSVIKVEEPSKEACYLILKKIAP
AGAGGCGTGTTACTTGATTTTAAAAAAGATCGCTCCCCTTTATGAAGAACACCACCAGGTGCGTTATGATGAGAGCG HQVRYDESVFKACVDLTSDYM
TGTTTAAGGCATGCGTGGATTTAACGAGTGATTACATGCATGATAAATTCTTGCCGGATAAAGCGATTGAATTATTAG PDKAIELLDEVGSRKKISPKKG
ATGAGGTGGGATCGAGGAAAAAAATCAGCCCTAAAAAGGGCAAAAAAATCGGCGTTGATGATGTGAAAGAAACGCT DVKETLALKLKIPKMRLSSDKK
CGCTCTAAAGCTTAAAATCCCTAAAATGCGTTTGAGCAGCGACAAAAAAGCCCTTTTAAGGAATTTGGAAAAATCGCT EKSLKNKIFAQAEAISLVSNAIKI
TAAAAATAAGATTTTTGCCCAAGCAGAAGCGATCAGTCTTGTCAGCAATGCGATTAAAATCCAGCATTGCGGGCTTTC SAKNKPVGSFLFVGPSGVGKT
TGCAAAAAATAAGCCTGTGGGGAGCTTTTTATTCGTGGGGCCTAGTGGGGTGGGGAAAACAGAATTGGCTAAAGAA A
TTGGCC
HP0674 455 GCGTTTGGGTATTCAAAAGGAAATT rACATTAGCGTGAATGAAGAAAATGAAAAAGCCCTTTTGAATTGTTACCCT 456 RLGIQKEIFYISVNEENEKALLN
AACGCTAAAAATATTGCAGGGTTTTTTCATTTAGAAACCGACTATGTAGGGCTTGGGATAGACCGGCAAATGGCGTG KNIAGFFHLETDYVGLGIDRQM TCTGGCGGTAAATAATGGCGTGGTGGTGGATGCCGGGAGTGCGATTACGATAGATTTAATCAAAGAGGGCAAGCAT NNGVWDAGSAITIDLIKEGKHL TTAGGAGGGTGTATTTTACCCGGTTTAGCCCAATATATTCATGCGTATAAAAAAAGCGCTAAAATTTTAGAGCAACCT PGLAQYIHAYKKSAKILEQPFKA TTCAAGGCCTTAGATTCTTTAGAAGTTTTACCTAAAAGCACTAGAGACGCTGTGAATTACGGCATGGTTTTGAGCGTC VLPKSTRDAVNYGMVLSVIACI ATTGCTTGTATCCAGCATTTAGCCAAAAATCAAAAAATCTATCTTTGTGGGGGCGATGCGAAGTATTTGAGCGCGTTT NQKIYLCGGDAKYLSAFLPHSV TTACCCCATTCTGTTTGCAAGGAGCGTTTGGTTTTTGACGGGATGGAAATCGCTCTTAAAAAAGCAGGGATACTAGA VFDGMEIALKKAGILECK ATGCAAATGATGCACAATTTGAGTTTTTTGGGCATGTTTTTAGCCGCTTTGAGCATGTCTTTAGGGCATTGTGTGGGC
HP0553 457 TGGCTAAGGTGGAACTGCCCTTAGCGGTTTCTTTAAAAGAGGTTAAAAAAGCTCAAAAACTTTTGGTGCTTTGCGGG 458 AKVELPLAVSLKEVKKAQKLLV
ATTACGGATGTGGGGAATATTGGAGGTATTTTTAGGAGCGCGTATTGCTTAGGAATGGGTGGCGTTATTTTAGATTTT VGNIGGIFRSAYCLGMGGVILD
GCTAAAGAATTGGCTTATGAGGGGATTGTGCGATCCAGCTTGGGGCTTATGTATGATTTGCCTTTTAGCGTTATGCC YEGIVRSSLGLMYDLPFSVMPN
TAACACGCTGGATTTAATCAATGAATTGAAAACGAGCGGGTTTTTATGTTTGGGCGCGAGCATGCAAGGCTCTAGTC ELKTSGFLCLGASMQGSSQIEN
AAATAGAAAATCTATCCTTAAAAAAATGCGCTCTTTTTTTGGGGAGCGAGCATGAGGGGTTGTCTAAAAAAATCCTTG CALFLGSEHEGLSKKILAKMDTI
CTAAAATGGATACTATATTGAGCGTAAAAATGCGAAGAGATTTTGATTCGCTCAATGTGAGCGTGGCAGCAGGGATC RRDFDSLNVSVAAGILMDKIN
TTMTGGATAAAATCAACTAGGTGGTCAATTGAATGGAACAGAATAAAAAAAGTTTAGAAAATTTAGATCTTTCTGATG
TTCAAAACATTTCTAAAGATATTTCTGGTGCAACA
HP0311 459 CTAGTGCAGAAATTTTTGACAAGAGGGCGATAGACTATGAGAGCCTCTTTTCACGCAAAAATAGGGCGCGAAATTTT 460 SAEIFDKRAIDYESLFSRKNRAR
ATGCCAAGAATGCCAAAAGATTCGCACTCGCAAGGCTTTGAGACTTTAAGCATTAATTTTGAAGGCACGATGGAGTG RMPKDSHSQGFETLSINFEGT
GAGCGCGTTTGGGATTTGGCTGAGTTTGTTATTGCATCAATACGGCACACAGATTTTACGCATCAAGGGGATTATTG FGIWLSLLLHQYGTQILRIKGIIDI
ACATTGGAAGCGGCTTTTTGGTGAGTATTAACGGCGTGATGCATGTCATTTACCCGCCTAAGCATATTTTAAAGGATC VSINGVMHVIYPPKHILKDQNG
AAAACGGCTCTAACCTCGTTTTTATCATGCGCCATTTAGAGCGTGAAAAAATCTTAAATTCCTTAAAGGGTTTTAAGGA MRHLEREKILNSLKGFKDFLGIK
TTTTCTCGGCATCAAGGGTTTTGAAACCCAATAATTTTTCTATTTATGGATAGCTGTTTGCATTTTGATGGGGAAAAGA
ACGATGAAGCTTAAAACCAAA
HP1077 461 GGGATTTTAAATCTAGAGATTTGCCCCAAAAACTCCATCTTGATAAAAAGCTCTCCCAAACAATACAGCCATGCATGC 462 DFKSRDLPQKLHLDKKLSQTIQ
AACTTAACGCATCAAAACACTACACTTCTACCGGGGTTAGAGAGCCTGATAAATGCACAAAGAGTTTTAAAAAATCCG NASKHYTSTGVREPDKCTKSF
CTCTCATGTCCTATGACTTAGCGCTAGGTTATTTGGTGAGTAAGAATAAGCAATACGGCTTAAAGGCTATAGAAATTT MSYDLALGYLVSKNKQYGLKAI
TAAACGCTTGGGCTAAAGAGCTTCAAAGCGTGGATACTTATCAGAGCGAGGATAATATCAATTTTTACATGCCTTATA WAKELQSVDTYQSEDNINFYM
TGAACATGGCTTATTGGTTTGTCAAAAAGGCGTTTCCTAGCCCAGAATATGAAGATTTCATTAAGCGGATGCGCCAGT AYWFVKKAFPSPEYEDFIKRM
A
HP1077 463 TAATGCCTTTGTTTTAAACCCGCCTTATTCCGCTAGCGGTAATGGCATGGTGTTTGTGGAGCAGGCTTTAGAAAAAAT 464 NAFVLNPPYSASGNGMVFVEQ
GCAAAGCGGTTATGCGAGCGTGATCATCCAATCAAGCGCCGGCAGTGGTAAAGCCAAAGAATACAATGTAAGGATT QSGYASVIIQSSAGSGKAKEYN
TTGGAAAAACACACGCTTTTAGCGAGCATTAAAATGCCTTTAGATTTATTCATCGGTAAAAGCAGCGTTCAAACCCAT HTLLASIKMPLDLFIGKSSVQTHI
ATCTATGTTTTTAGGGTCAATGAAAAGCATGACGCTAAGCAAAGGGTGAAATTTATTAATTTCAGTAACGACGGCTAC NEKHDAKQRVKFINFSNDGYAR
GCTAGAGCGAATCGCAAAAAAGCCAAAGCCAGCCACAATTTAAAAGACACGCATAACGCCAAAGAGCGCTACAACG KAKASHNLKDTHNAKERYNE
AAGTCGTGGATTTAGTCCATATTGGCCAATCATGTTTGAAATTTCTAAGCGAAGATGACTATTATGAAAACACCATAG GQSCLKFLSEDDYYENTIDPKN
ATCCCAAAAACGGGAGCGATTGGAACCAAAACAAACCCACTGACACCAAACCCGAATTAGAGGATTTTAAAAGAACG NQNKPTDTKPELEDFKRTIADY
ATAGCCGATTACCTTTCTTATGAAGTAAGCTTGATTTTAAAAAACC SLILKN
HP0798 465 AGCAAGCGGTCGTATCAGCATGAATAAAGAGGCTTATGACGCTATTATCAATCATTGCGTCAAAAAGGGTCCGGTGT 466 ASGRISMNKEAYDAIINHCVKK
TACAGACTGCTATTATTGCTGGAATTATGGGGGCTAAAAAGACAAGCGAGCTCATTCCCATGTGCCATCCAATCATG TAIIAGIMGAKKTSELIPMCHPIM
CTCAATGGGGTGGATATTGATATTTTAGAAGAAAAAGAGACTTGTAGTTTTAAACTCTATGCGAGAGTCAAAACTCAA IDILEEKETCSFKLYARVKTQAK
GCTAAAACGGGCGTAGAAATGGAAGCGCTAATGAGTGTGAGCATAGGGCTTTTAACCATTTATGACATGGTGAAAGC EALMSVSIGLLTIYDMVKAIDKS
CATTGACAAGAGCATGACAATTAGCGGTGTGATGTTGGAGCATAAAAGTGGAGGCAAAAGTGGGGATTATAACGCTA VMLEHKSGGKSGDYNAKK
AAAAATAGAAAAAGACCAATAATCTAAAGATGTTAGGGTAAAATAACATTTTGACAACAAAAGCGTGTTGGTTGCTTC
GGGTTTGTTGTTATAGAAGTCTAAAATATTACAATCAAGGATAGAACGATGAAAGCAAATAATCATTTTAAAGATTTTG
CATGGAAAAAA
HP0436 467 GCTCCAAGCATCACTAAAGATGGCGTGAGCGTGGCTAAAGAGATTGAATTAAGTTGCCCGGTAGCTAACATGGGCG 468 APSITKDGVSVAKEIELSCPVAN
CTCAACTCGTTAAAGAAGTAGCGAGCAAAACCGCTGATGCTGCCGGCGATGGCACGACCACAGCGACCGTGCTGG LVKEVASKTADAAGDGTTTATV
CTTATAGCATTTTTAAAGAAGGTTTGAGGAACATCACGGCTGGGGCTAACCCTATTGAAGTGAAACGAGGCATGGAT KEGLRNITAGANPIEVKRGMDK
AAAGCCGCTGAAGCCATTATTAATGAGCTTAAAAAAGCGAGCAAAAAAGTGGGCGGTAAAGAAGAAATCACCCAAGT NELKKASKKVGGKEEITQVATIS
GGCGACCATTTCTGCAAACTCCGATCACAATATCGGGAAACTCATCGCTGACGCTATGGAAAAAGTGGGTAAAGACG HNIGKLIADAMEKVGKDGVITVE
GCGTGATCACCGTTGAAGAAGCTAAGGGCATTGAAGATGAACTAGATGTTGTAGAAGGCATGCAATTTGATAGAGGC EDELDVVEGMQFDRGYLSPYF
TACCTCTCCCCTTATTTTGTAACAAACGCTGAGAAAATGACCGCTCAATTGGATAACGCTTACATCCTTTTAACGGAT KMTAQLDNAYILLTDKKISSMKD
AAAAAAATCTCTAGCATGAAAGACATTCTCCCGCTACTAGAAAAAACCATGAAAGAGGGCAAACCGCTTTTAATCATC KTMKEGKPLLII
GC
HP0436 469 ACAGGCGATAAAGTGAGATTGGGCGATACAGACTTGATCGCTGAAGTAGAACATGACTACACCATTTATGGCGAAGA 470 TGDKVRLGDTDLIAEVEHDYTIY
GCTTAAATTCGGTGGCGGTAAAACCCTGAGAGAAGGCATGAGCCAATCCAACAACCCTAGCAAAGAAGAATTGGAT FGGGKTLREGMSQSNNPSKEE
CTAATCATCACTAACGCTTTAATCGTGGATTACACCGGTATTTATAAAGCGGATATTGGTATTAAAGATGGCAAAATC NALIVDYTGIYKADIGIKDGKIAGI
GCTGGCATTGGTAAAGGCGGTAACAAAGACATGCAAGATGGCGTTAAAAACAATCTTAGCGTAGGTCCTGCTACTGA NKDMQDGVKNNLSVGPATEAL
AGCCTTAGCCGGTGAAGGTTTGATCGTAACTGCTGGTGGTATTGACACACACAT IVTAGGIDTH
HP0436 471 AAAAAAATCTTAGTCATAGGCGATCTGATCGCTGATTATTATTTGTGGGGGAAGAGCGAACGGCTTTCGCCTGAAGC 472 KKILVIGDLIADYYLWGKSERLS
CCCTGTGCCTGTTTTAGAAGTCCAGAGGGAGAGTAAGAATTTAGGCGGAGCGGCTAATGTGGCTAATAACCTCATTT PVLEVQRESKNLGGAANVANNL
CTTTAAAAGCTAAAGTTTTTTTATGTGGGGTCGTGGGCGATGATTTAGAGGGCGAGCATTTCATTAGCGCTTTAAAAG KVFLCGVVGDDLEGEHFISALK
CAAGAGGGATTGACGCTTCAGGTATTTTGATAGATAAAACCCGTTGCACCACGCTTAAAACGCGCATCATCGCGCAA ASGILIDKTRCTTLKTRIIAQNQQI
AACCAGCA TCGCGCGCGTGGATAAGGAAATCAAAGACCCCTTAAACGCTGATTTAAGAAAAAAACTTTTAGATTTT KEIKDPLNADLRKKLLDFFTEKI
TTCACAGAAAAAATCCAAGAAATAGATGGCGTTATCCTTTCAGATTACAATAAGGGCGTGTTGGATTTTGAACTCACT ILSDYNKGVLDFELTQAMIALAN
CAAGCAATGATCGCTCTAGCCAACCAACACCACAAGCTCATTTTATGCGACCCTAAAGGGAAAGATTATAGCAAATAT ILCDPKGKDYSKYSHASLITPNR
TCCCATGCGAGTTTGATCACGCCTAATCGCACCGAATTAGAGCATGCGCTCCATTTGAAATTAGACAGCCATGCGAA ALHLKLDSHANLSKALQILKETY
TTTATCAAAAGCGCTCCAAATCTTAAAAGAAACTTATCATATCGCTATGCCTTTAGTAACTTTGAGTGAACAAGGCATC LVTLSEQGIAFLEKGELVNCPTI
GCTTTTTTAGAAAAAGGCGAGTTAGTCAATTGCCCCACTATCGCTAAAGAAGTTTATGATGTAACTGGGGCAGGCGA DVTGAGDTVIASLTLSLLESMSL
TACGGTGATCGCGTCTTTAACGCTCTCTTTATTAGAATCAATGAGCCTAAAAGATGCTTGCGAGTTTGCCAATGCGGC EFANAAAAVVVGKMGSALASLE
GGCGGCTGTGGTGGTGGGTAAAATGGGGAGCGCGTTAGCGAGTTTAGAAGAAATCGCTTTGATTTTGAACCAAACG NQTHP
CACCCT
HP0436 473 TGGTTATGCTTTGGCAGGATCAAGCGCGAATTTTGAGTTTAAGGCTGGTACGGATACCAAAAACGGCACAGCCACTT 474 GYALAGSSANFEFKAGTDTKNG
TTAATAACGATATTAGTTTGGGAAGATTTGTGAATTTAAAAGTGGATGCTCATACAGCTAATTTTAAAGGTATTGATAC NNDISLGRFVNLKVDAHTANFK
TGGTAATGGTGGTTTCAACACCTTAGATTTTAGTGGCGTTACAGGTAAGGTCAATATCAACAAGCTCATTACGGCTTC NGGFNTLDFSGVTGKVNINKLIT
CACTAATGTGGCCGTTAAAAACTTCAACATTAATGAATTGGTTGTTAAGACCAATGGGGTGAGTGTGGGGGAATACA AVKNFNINELVVKTNGVSVGEY
CTCATTTTAGCGAAGATATAGGCAGTCAATCGCGCATCAATACCGTGCGTTTGGAAACTGGCACTAGGTCAATCTTTT DIGSQSRINTVRLETGTRSIFSG
CTGGGGGTGTCAAATTTAAAAGCGGTGAAAAACTGGTTATAGATGAGTTTTACTATAGCCCTTGGAATTATTTTGACG SGEKLVIDEFYYSPWNYFDARN
CTAGGAATATTAAAAATGTTGAAATCACCAGAAAATTCGCTTCTTCAACCCCAGAAAACCCTTGGGGCACATCAAAGC TRKFASSTPENPWGTSKLMFN
TTATGTTTAATAATCTAACCCTGGGTCAAAATGCGGTCATGGACTATAGTCAATTTTCAAATTTAACCATTCAGGGGGA QNAVMDYSQFSNLTIQGDFINN
TTTCATCAACAATCAAGGCACTATCAATTATTTGGTCCGAGGCGGGCAAGTAGCCACCTTGAATGTAGGCAATGCGG YLVRGGQVATLNVGNAAAMFF
CAGCTATGTTCTTTAGTAATAATGTGGATAGCGCGACTGGGTTTTACCAACCGCTCATGAAGATTAACAGCGCTCAA SATGFYQPLMKINSAQDLIKNKE
GATCTCATTAAAAATAAAGAACATGTCTTATTGAAAGCGAAAATCATCGGTTATGGCAATGTTTCTTTAGGCACTAACA AKIIGYGNVSLGTNSISNVNLIEQ
GCATTAGTAATGTTAATCTAATAGAGCAATTCAAAGAGCGCCTAGCCCTTTACAACAACAATAACCGCATGGATATTT ALYNNNNRMDICVVRNTDDIKA
GTGTGGTGCGAAATACTGATGACATTAAAGCATGCGGGACGGCTATCGGCAATCAAAGCATGGTGAATAACCCCGA GNQSMVNNPDNYKYLIGKAWK
CAATTACAAGTAT TANGSKISV
HP0436 475 TTTGGGTGGCATGATTCCTAAAGAAAGAATGGAAAGGGCTTTAGGCAGCGGCGTAATCATTTCTAAAGACGGCTATA 476 LGGMIPKERMERALGSGVIISKD
TTGTAACTAATAACCATGTGATTGATGGCGCGGATAAGATTAAAGTTACCATTCCAGGGAGCAATAAAGAATATTCCG NNHVIDGADKIKVTIPGSNKEYS
CCACTCTAGTAGGCACCGATTCTGAAAGCGATTTAGCGGTGATTCGCATCACTAAAGACAATCTGCCCACGATCAAA TDSESDLAVIRITKDNLPTIKFSD
TTCTCTGATTCTAATGATATTTCAGTGGGCGATTTGGTTTTTGCGATTGGTAACCCTTTTGGCGTGGGCGAAAGCGTT VGDLVFAIGNPFGVGESVTQGI
ACGCAAGGCATTGTTTCAGCGCTCAATAAAAGCGGGATTGGGATCAACAGCTATGAGAATTTCATTCAAACAGACGC KSGIGINSYENFIQTDASINPGN
TTCCATCAATCCTGGAAATTCCGGCGGCGCTTTAATTGATAGCCGTGGAGGGTTAGTGGGGATTAATACCGCTATTA DSRGGLVGINTAIISKTGGNHGI
TCTCTAAAACTGGGGGCAACCACGGCATTGGCTTTGCCATCCCTTCTAACATGGTTAAAGATACTGTAACCCAACTC SNMVKDTVTQLIKTGKIERGYLG
ATCAAAACCGGTAAGATTGAAAGAGGTTACTTGGGCGTGGGCTTGCAAGATTTGAGTGGCGATTTGCAAAATTCTTA DLSGDLQNSYDNKEGAVVISVE
TGACAACAAAGAAGGGGCGGTAGTCATTAGCGTAGAAAAAGACTCTCCGGCTAAAAAAGCAGGGATTTTGGTGTGG KKAGILVWDLITEVNGKKVKNTN
GATTTGATCACCG GTCAATGGGAAAAAGGTTAAAAACACGAATGAGTTAAGAAATCTAATCGGCTCCATGCTACC LIGSMLPNQRVTLKVIRDKKERA
CAATCAAAGAGTAACCTTAAAAGTCATTAGAGACAAAAAAGAACGCGCTTTCACCCTCACTCTAGCTGAAAGGAAAAA AERKNPNKKETISAQNGAQGQL
CCCTAACAAAAAAGAAACCATTTCTGCTCAAAACGGCGCGCAAGGCCAATTGAACGGGCTTCAAGTAGAAGATTTAA VEDLTQETKRSMRLSDDVQGVL
CTCAAGAAACCAAAAGGTCTATGCGTTTGAGCGATGATGTTCAAGGGGTTTTAGTCTCTCAAGTGAATGAAAATTCCC NENSPAEQAGFRQGNIITKIEEV
CAGCAGAGCAAGCCGGATTT ADFNHALEKYKGKPKRFLVLDL
RIILVK
HP0436 477 CGCATTATTGCAAAAAGCCCTCAAACTCTACGCTCTTTTAAAGCCTTTAGAATTGAATGTGAGCATAGCCTCTAGCTT 478 ALLQKALKLYALLKPLELNVSIAS
TTCTAAAATAGGGAATTTGTTTGGTAGGGAATTAGAATCCTTTTGCGTGAAAATCCAGCCCAAAAACACCCGTGCTTT GNLFGRELESFCVKIQPKNTRAL
AAATAGTGAGAAACTTTATTTAAAGCTTTTCCAAAAAGGCGTTATCGCAAGGATTTCATGCGAATTCGTGTGCTTTGAA LYLKLFQKGVIARISCEFVCFEVF
GTCTTTAGCTTGAATGAAAAAGATTTTGAAAAAATCGCTCTGGTTTTAGAAGAAATTCTTAATAAAGCTTAAAAATTCG KDFEKIALVLEEILNKA
CTATAATAAAATTTCTTTTAAACGCGCCATATCCCCCACAAAACGCTAGAGAATGATAGAAAACGACAGAACATCAAT
TTAAAGGAACTTAAGAATGGAAAAAATCAGCGATCTTATAGAATGCATTGCGTATGAAAAAAATTTGCCTAAAGAGAT
GATTTCAAAAGTGATTCAAGGCTGTTTGTTAAAAATGGCGCAAAATGAGTTAGACCCCCTAGCACGCTACTTGGTGG
TTGAAGAAAACAAGCAGCTCCAGCTTATCCAGTTGGTAGAAGTTTTAGAAGATGGTGATGAAAGATTGGTTAACGAC
CCTTCTAAATACATCAGCCTGTCTAAAGCCAAAGAAATGGATCCAAGCGTTAAGATTAAAGACGAATTGTCCTATAGC
TTGAGTTTGGAGAGCATGAAACAAGGAGCGATCAACCGCCTTTTTAAAGATTTGCAATACCAGTTAGAAAAAGCGTTA
GAAGACAGCCACTTTGAAGCGTTTCAAAAGCGTCTTAACAGCGTTTTAATGGGGCAAGTGATTTTAGTGGATCACAA
CCAAAACACCTTTATTGAGATTGAGCAGCAATTTCAGGGCGTTCTTTCCATGCGCCATCGCATCAAGGGCGAGAGTT
TTAAAGTGGGCGATAGCATTAAAGCGGTTTTAACGCAAGTCAAACGCACGAAAAAAGGCTTATTATTAGAGCTGAGC
CGCACCACCCCTA
HP0071 479 GCAAAAAAGACGCTTGCGGGTTCATCTATGAGATCAGCGAGTTCATGAAAGCCTATACCGCATTGCTAAAAAAACAA 480 KKDACGFIYEISEFMKAYTALLK
GACCGATACGTCTATTTATTGAGGTATCTCCCCTCTAGGTATTGGGCCAGCATTTTAACGACTGCCCTTTATGTCAAA VYLLRYLPSRYWASILTTALYVK
TACCCTGATTTTGACGCTTTGAAAAAGCTTTTGGTGTCTTATTATTACCAAACTTGGATTGCAGGAGGCACGATCACG ALKKLLVSYYYQTWIAGGTITRI
CGCATCAAGCAAACCAGTATCAACATTATCAAAAACGTTAAAAGCAATAAGAGCGTTGAAACCATCAAAGAGCTTATA IIKNVKSNKSVETIKELILNSIDSY
TTGAATAGCATCGACTCTTATAACACCTTTGATCAATACCTCTATAACTTATGGGATAGCTCTTCTGTTTATCATAGCA YLYNLWDSSSVYHSKWVRPVL
AATGGGTGCGTCCTGTCTTAGCCCTAGCTAATTATTTCATGGCAGATGAAGAGAAACCCCATTTTATCGCTATGGATG FMADEEKPHFIAMDAETQVEHI
CCGAAACCCAAGTGGAGCATATTTTGCCACAAACGCCCAAAAGAGGCAGTCAATGGAACGCGGATTTTGACAAAGA KRGSQWNADFDKEKREEWVN
AAAAAGAGAAGAATGGGTAAATAATATCGCGAATTTAACCCTTTTAAAGCGTAAAAAGAACGCGCATGCTTTAAACGG LLKRKKNAHALNGDF
GGATTTT
HP0071 481 TTCTTTGATCGCTTCAGTGTTATTATACGCTTATGGCACAGGAGCGATTAAAGGCTTTGCCCTAACTACAGGCATTGG 482 SLIASVLLYAYGTGAIKGFALTT GATTTTAGCCTCTATTATCACCGCTATTGTTGGCACGCAAGGGATTTATCAAGCCCTTTTACCTAAACTCACTCAAAC SIITAIVGTQGIYQALLPKLTQTK AAAAAGCCTTTACTTTTGGTTTGGCGTGAATAAAAGAGCTTAGGAGGTTTTATGGAATTATTCAAACGAACTAGAATCT FGVNKRA TAAGCTTCATGCGTTATTCCAATTATGGGGTGATCGTTTCAGCAATTTTAGCGCTTCTAGCGTTGGGGCTTTTGTTTTT
CAAAGGGTTTTCTTTAGGGATTGATTTTGCGGGGGGGAGTTTGGTGCAAGTGCGCTACACTCAAAACGCCCCCATTA
AAGAAGTGCGCGATCTGTTTGAAAAAGAAGCTCGCTTCAAAGGCGTGCAAGTGAGCGAATTTGGCTCTAAAGAAGAA
ATTTTAATCAAATTCCCTTTTGTAGAAACGGCTGAAAATGAAGATCTGAACGCTATCGTGGCCAACATTCTAAAACCC
AGCGGCGATTTTGAAATCCGTAAATTTGACACCGTGGGCCCTAGAGTGGGGAGCGAATTGAAAGAGAAAGGCATTT
TGTCGCTGATTTTAGCATTAATAGCGATCATGGTTTATGTGAGTTTCCGCTATGAATGGCGTTTTGCTTTAGCGAGCG
TCATTGCGCTTGTGCATGATGTGATTTTAGTGGCAAGCTCGGTGATTGTTTTTAAGATTGATATGAATTTGGAAGTGA
TTGCGGCCTTGCTCACCTTGATTGGGTATTCCATTAATGATACGATCATTATTTTTGACAGGATCAGAGAAGAGATGC
TCTCTCAAAAAACCAAAAACGCCACTCAAGCCATTGATGAAGCCATTTCTAGCACGCTCACGCGCACGCTTTTAACTT
CTTTAACCGTGTTTTTTGTGGTGTTGATTTTGTGCGTGTTTGGGAGTAAGATCATCATTGGCTTTTCATTGCCCATGTT
AATAGGCACGA
HP1402 483 TGGAAAAAGATATAGCCCATGCGCGTTTCAAGGGTAATGAAAGCATGGTGTATGAAGAAAATTTTGTGCATGCCGGG 484 EKDIAHARFKGNESMVYEENFV
TTTGTGCTTATTGCGTGCAATTATGCGGCCTTGTGCGCGTTGAATAAAAGACACAGCGTGGTGGTTTCTAATAACATC VLIACNYAALCALNKRHSVVVS
AATTTTTATGCCCCCCTAGAATTGAATCAAGAAGCACTCATTAAAGCGCAAGTGATTCAAGATGGCGTGAAAAAAGCT APLELNQEALIKAQVIQDGVKKA
GAAATAAAAATAGAGGCGTTTGTGTTAGACATTCAGGTTTTAGAGGGAATGATAGAAATTGTGGTGTTTGATAAAAAG FVLDIQVLEGMIEIVVFDKKPFKF
CCTTTTAAATTCAATTTTAAAGAAGAGTAGTTAAATGGTTATTGTTTTAGTCGTGGATAGTTTTAAAGACACCAGTAAT
GGCACTTCTATGACAGCGTTTCGTTTTTTTGAAGCGCTGAAAAAAAGAGGGCATGTGATGAGAGTGGTCGCCCCTCA
TGTGGATAATTTAGGGAGTGAAGAAGAGGGGTATTACAACCTTAAAGAGCGCTACATCCCCCTAGTTACAGAAATTT
CACACAAACAACACATCCTTTTTGCTAAACCCGATGAAAAAATCTTAAGAAAGGCTTTTAAGGGAGCGGATATGATCC
ATACTTATTTGCCTTTTTTGCTAGAAAAAACAGCCGTAAAAATCGCGCGAGAAATGCAAGTGCCTTATATTGGCTCTTT
CCATTTACAGCCAGAGCATATTTC
HP1402 485 GATAGAAACCGAGCTTGGCATGCGTTTAAAAGCGCATGGGAGTTTGTTGAAAAAAATCCAAAAACCCCCTAAAAACA 486 IETELGMRLKAHGSLLKKIQKPP
AATTCAAACCCCCTAAAACAACCATTCCTAAACCTAAAGAAGCGAGCTTGCGCCTTGATTTAAGGGGGCAACGCAGC PPKTTIPKPKEASLRLDLRGQR
GAAGAAGCCCTGGATTTACTAGACGCTTTTTTAAACGACGCGCTTTTAGGGGGCTTTGAAGAAGTGCTGATTTGCCA DLLDAFLNDALLGGFEEVLICHG
CGGCAAAGGGAGCGGGATTTTAGAAAAGTTTGTGAAAGAATTTTTAAAAAACCACCCCAAAGTGGTGAGCTTTAGCG LEKFVKEFLKNHPKVVSFSDAPI
ACGCTCCCATTAATTTAGGCGG
HP1402 487 TCAATCAAGCTAAAGTCCCTGTGATTTATGAAGAAAACCATTTGTTGCCTATGGGGTTTATCCATTTAGCCTTTAGGG 488 NQAKVPVIYEENHLLPMGFIHL
GGGGTGGGAGCTTAAGCGATAAAAACCAGTTGGGTTTGGCGAAATTATTCGCGCAAGTTTTAAACGAAGGCACTAAA GSLSDKNQLGLAKLFAQVLNE
GAGCTTGGTGCGGTGGGGTTTGCGCAACTTTTAGAGCAAAAAGCGATCAGTTTGAATGTGGATACCAGCACAGAAG AVGFAQLLEQKAISLNVDTSTE
ATTTGCAAATCACTTTAGAATTTTTAAAAGAATACGAAGATGAAGCCATTACGCGCTTAAAAGAGCTTTTAAAATCCCC EFLKEYEDEAITRLKELLKSPNF
TAATTTCACGCAAAACGCTTTAGAAAAAGTCAAAACCCAAATGTTAGCCGCACTTTTACAAAAAGAAAGCGATTTTGA EKVKTQMLAALLQKESDFDYL
CTATTTGGCTAAATTGACTTTAAAGCAAGAGCTTTTTGCTAACACCCCTTTAGCTAACGCAGCCTTAGGCACTAAAGA QELFANTPLANAALGTKESIQKI
GAGCATTCAAAAAATCAAGCTAGACGATTTGAAACAGCAATTTGCTAAGGTCTTTGAACTCAATAAGCTCGTGGTGGT KQQFAKVFELNKLVVVLGGDL
GCTTGGGGGCGATTTGAAAATCGATCAAACCCTTAAGCGTTTGAATAACGCCCTTAATTTCTTGCCACAAGGTAAAG KRLNNALNFLPQGKAYEEPYF
CGTATGAAGAGCCTTATTTTGAAACGAGCGATAAAAAAAGCGAAAAAGTCCTCTATAAAGACACTGAGCAGGCTTTC SEKVLYKDTEQAFVYFGAPFKI
GTGTATTTTGGTGCGCCCTTTAAAATCAAGGATTTAAAACAGGATTTAGCGAAATCTAAAGTCATGATGTTTGTGCTT DLAKSKVMMFVLGGGFGSRL
GGTGGGGGGTTTGGCTCTCGTTTAATGGAAAAAATCA
HP1402 489 GCATTTAAGGCATGTGAAAGATCTCAAAAGCTTTTTAACGCATGCCAGAAAAAACTTGCCTTTCACGGCTAAAATTGA 490 HLRHVKDLKSFLTHARKNLPFT
AATTGAATGCGAAAGCTTTGAAGAGGCCAAAAACGCCATGAATGCGGGAGCGGATATTGTGATGTGCGATAATTTGA CESFEEAKNAMNAGADIVMCD
GCGTTTTAGAGACTAAAGAAATTGCCGCTTATAGAGATGCGCATTATCCCTTTGTTTTACTGGAAGCGAGCGGGAAC ETKEIAAYRDAHYPFVLLEASG
ATTTCACTAGAGAGCATTAACGCTTACGCTAAAAGCGGCGTGGATGCCATTAGCGTAGGGGCTTTAATCCATCAAGC NAYAKSGVDAISVGALIHQATFI
CACTTTTATTGACATGCACATGAAAATGGCTTAAAGACTTTAAAAAGGGGTTATTAACATGCTAAAAGAATATTTAGAA KMA
AGCATTAAAGATCTTACGCCTGAAAAGAATGAACTCACGCACCGCCCTTCTTTATACAACTTGCTTAATCAGTTAAAA
AACCATTTCAATAAAGAGTTTAAGATTGAACATGAGCCTGAAAGAAAGCAAGGAAGCCAGCC
HP0071 491 GCAAAAAAGACGCTTGCGGGTTCATCTATGAGATCAGCGAGTTCATGAAAGCCTATACCGCATTGCTAAAAAAACAA 492 KKDACGFIYEISEFMKAYTALLK
GACCGATACGTCTATTTATTGAGGTATCTCCCCTCTAGGTATTGGGCCAGCATTTTAACGACTGCCCTTTATGTCAAA VYLLRYLPSRYWASILTTALYV
TACCCTGATTTTGACGCTTTGAAAAAGCTTTTGGTGTCTTATTATTACCAAACTTGGATTGCAGGAGGCACGATCACG ALKKLLVSYYYQTWIAGGTITRI
CGCATCAAGCAAACCAGTATCAACATTATCAAAAACGTTAAAAGCAATAAGAGCGTTGAAACCATCAAAGAGCTTATA IIKNVKSNKSVETIKELILNSIDS
TTGAATAGCATCGACTCTTATAACACCTTTGATCAATACCTCTATAACTTATGGGATAGCTCTTCTGTTTATCATAGCA YLYNLWDSSSVYHSKWVRPVL
AATGGGTGCGTCCTGTCTTAGCCCTAGCTAATTATTTCATGGCAGATGAAGAGAAACCCCATTTTATCGCTATGGATG FMADEEKPHFIAMDAETQVEHI
CCGAAACCCAAGTGGAGCATATTTTGCCACAAACGCCCAAAAGAGGCAGTCAATGGAACGCGGATTTTGACAAAGA KRGSQWNADFDKEKREEWVN
AAAAAGAGAAGAATGGGTAAATAATATCGCGAATTTAACCCTTTTAAAGCGTAAAAAGAACGCGCATGCTTTAAACGG LLKRKKNAHALNGDFDEKRKIY
GGATTTTGATGAAAAAAGAAAAATTTATGGAGGCAAAGACACGAGCAAAGTGATTAGCTGTTATGACATCACTAAAGA SKVISCYDITKELYSNYRKWNE
ATTGTATAGCAATTATAGGAAGTGGAATGAGAAGTCCCTCCAAGAGCGATACAAAT RYK
HP0775 493 CCCTTTTAGGCACAGGGAGCAGTAACGCTGGGAGCTTGGATGCGGCGAATATATTAAAACCGGTTTTAACGGATGG 494 LLGTGSSNAGSLDAANILKPVL
GAGCTTGAAATGTTTAGGAGCGACCACTTTTGAAGAATACCGCAGCGTGTTTGAAAAAGACAAGGCTTTTAATAGGC KCLGATTFEEYRSVFEKDKAFN
GTTTTTCAGTCATAAAAGTTGAAGAGCCTTCTAAAGAGGCGTGTTACTTGATTTTAAAAAAGATCGCTCCCCTTTATGA IKVEEPSKEACYLILKKIAPLYEE
AGAACACCACCAGGTGCGTTATGATGAGAGCGTGTTTAAGGCATGCGTGGATTTAACGAGTGATTACATGCATGATA RYDESVFKACVDLTSDYMHDK
AATTCTTGCCGGATAAAGCGATTGAATTATTAGATGAGGTGGGATCGAGGAAAAAAATCAGCCCTAAAAAGGGCAAA AIELLDEVGSRKKISPKKGKKIG
AAAATCGGCGTTGATGATGTGAAAGAAACGCTCGCTCTAAAGCTTAAAATCCCTAAAATGCGTTTGAGCAGCGACAA ETLALKLKIPKMRLSSDKKALLR
AAAAGCCCTTTTAAGGAATTTGGAAAAATCGCTTAAAAATAAGATTTTTGCCCAAGCAGAAGCGATCAGTCTTGTCAG LKNKIFAQAEAISLVSNAIKIQHC
CAATGCGATTAAAATCCAGCATTGCGGGCTTTCTGCAAAAAATAAGCCTGTGGGGAGCTTTTTATTCGTGGGGCCTA NKPVGSFLFVGPSGVGKTELA
GTGGGGTGGGGAAAACAGAATTGGCTAAAGAATTGGCCTTGAATTTG L
HP0775 495 GTGCTTGGACAATAAAGCCCGCTTTGAAATGATTGAGCATGTTTTAGACAAATACAAAAGCCGTGAGAT TTTTCTCC 496 CLDNKARFEMIEHVLDKYKSRE GTTTGAAAAAGTGC TTAATGGGGTTTTTAA CTTTGGGAAAATGCTCCCTGATATGAGCGTGCCTTTCTTTGTCAA KVLLMGFLSFGKMLPDMSVPF
TAAAATCAGAAGCGACACGAAAGCGATGGTCTTGGATCAAGAAGAGAGCCAGTTAAAAGAGCGGATTTTAAAAAGAA SDTKAMVLDQEESQLKERILKR AAAATGAAAAAATCATTTTGAATGTGAATTTTATTGGCGAAGAGGTTTTAGGCGAAGAAGAAGCTAATGCGCGTTTTG LNVNFIGEEVLGEEEANARFEK AAAAATACTCTCAAGCCCTAAAATCCAACTACATCCAATACATTTCCATTAAAATCACGACGATTTTTTCTCAAATCAAT KSNYIQYISIKITTIFSQINILDFE ATCCTTGATTTTGAATACTCTAAAAAAGAGATTGTCAAACGCCTAGACGCTCTTTACGCCCTGGCTTTAGAAGAAGAA VKRLDALYALALEEEKKQGMP AAAAAACAAGGCATGCCTAAATTCATCAATTTGGATATGGAAGAATTTAGGGATTTAGAGCTAACAGTGGAGTCGTTT MEEFRDLELTVESFMESIAKFD ATGGAATCCATCGCTAAATTTGATTTGAACGCTGGTATTGTGCTGCAAGCCTATATTCCGGATTCTTATGAATATTTGA LQAYIPDSYEYLKKLHAFSKER AAAAACTGCACGCTTTTTCTAAAGAAAGGGTTTTAAAAGGGTTGAAGCCCATTAAAATCCGCTTTGTTAAGGGAGCGA PIKIRFVKGANME ACATGGAG
HP0775 497 GAGCGAGTGTGTTGAGCGCGTTACTTCTTGTAGGCTTAGGGGCAGCCCCTAAACATTCAGTTTCAGCTAATGACAAA 498 ASVLSALLLVGLGAAPKHSVSA
CGGATGCAGGATAATTTAGTGAGCGTGATTGAAAAACAGACCAATAAAAAGGTGCGTATTTTAGAAATCAAACCTTTA QDNLVSVIEKQTNKKVRILEIKP
AAATCTAGCCAGGATTTAAAAATGGTCGTTATTGAAGATCCGGACACTAAATACAATATCCCGCTTGTGGTGAGTAAG DLKMVVIEDPDTKYNIPLVVSK
GATGGTAATTTAATCATAGGGCTTAGCAACATATTCTTTAGCAATAAAAGCGATGATGTGCAATTAGTTGCAGAAACC LSNIFFSNKSDDVQLVAETNQK
AATCAAAAAGTTCAAGCTCTTAACGCCACCCAACAAAATAGCGCGAAATTGAACGCTATTTTTAATGAAATACCGGCT ATQQNSAKLNAIFNEIPADYAIE
GATTATGCGATAGAGTTGCCCTCTACTAACGCTGCAAATAAGGATAAAATCCTTTATATTGTCTCTGATCCCATGTGC AANKDKILYIVSDPMCPHCQKE
CCACATTGCCAAAAAGAGCTCACTAAACTTAGGGATCATTTAAAAGAAAACACCGTGAGAATGGTCGTGGTGGGGTG DHLKENTVRMVVVGWLGVNS
GCTTGGGGTCAATTCAGCTAAAAAAGCGGCTTTAAT
HP0775 499 TAAAGGGCGTTTTTGGGCATGACAATAAAGAAAAGATTAACGCGCTTTTACAAGAAAAAAAGCGTTTTTTTATAGATG 500 KGVFGHDNKEKINALLQEKKRF
ACAATTTAGAAAACAAGCACTTAGACACCACGATGGTGAGCGAGTTTGTGGGAAAAACTAGGGCGTTTATTAAGATC LENKHLDTTMVSEFVGKTRAFI
CAAGAAGGCTGTGATTTTGATTGCAATTATTGCATTATCCCAAGCGTGAGAGGGAGGGCTAGGAGTTTTGAAGAGAG CDFDCNYCIIPSVRGRARSFEE
AAAAATTTTAGAGCAAGTGGGCCTTTTATGCTCTAAAGGGGTTCAAGAAGTGGTTTTAACCGGCACCAATGTGGGGA VGLLCSKGVQEVVLTGTNVGS
GCTATGGGAAAGATAGAGGAAGCAATATCGCGCGATTGATTAAAAAATTAAGCCAGATCGCTGGATTAAAACGCATA GSNIARLIKKLSQIAGLKRIRIGS
AGGATTGGGAGCTTAGAACCTAATCAAATTAACGATGAATTTTTAGAGCTTTTAGAAGAGGATTTTTTAGAAAAACATT NDEFLELLEEDFLEKHLHIALQH
TGCATATCGCTTTACAGCACAGCCATGATCTCATGCTAGAGAGGATGAATCGAAGAAACCGCACTAAAAGCGATAGG LERMNRRNRTKSDRELLETIAS
GAATTATTAGAAACAATCGCTTCTAAGAATTTTGCTATTGGCACGGATTTTATTGTGGGGCATCCGGGCGAGAGCGG GTDFIVGHPGESGSVFEKAFKN
AAGCGTTTTTGAAAAAGCGTTTAAAAATTTAGAAAGCTTGCCTTTAACGCACATCCACCCTTTTATTTACAGCAAACGA LTHIHPFIYSKRKDTPSSLMTDS
AAAGACACCCCCTCTAGCTTGATGACTGATAGCGTGAGTTTGGAAGATTCTAAAAAGCGTTTGAATGCGATTAAAGAT SKKRLNAIKDLIFHKNKAFRQLQ
TTGATTTTTCATAAAAATAAGGCGTTCAGGCAATTGCAGCTCAAGCTCAATACGCCTCTAAAAGCCTTAGTGGAAGTG PLKALVEVQKDGEFKALDQFFN
CAAAAAGACGGCGAATTTAAAGCCTTAGATCAATTTTTCAACCCCATTAAAATCAAAAGCGATAAGCCTCTAAGGGCT DKPLRASFLEIKEYEIKERENHA
AGTTTTTTAGAAATCAAAGAGTATGAAATTAAGGAGAGGGAAAATCATGCCGTTTTCTAAAAATTT
HP0775 501 CTTCTTCGTTTGCTAATGAGGCTCCAATTGACATGCATGGGGGAAAAAGCGCTAAAATTGAGCGACAAAGCGTAGAA 502 SSFANEAPIDMHGGKSAKIERQ AATTCCGCCCAAAAAGAAAACTCTAAAAGCGCGATTTTAGAGCGTTTGAAGCGTTTAAGAGAGTATTCCAAAGACCAC AQKENSKSAILERLKRLREYSK TTAAAAGCCTTTCAAAGGCTTCAAGTCCAGGATTTTGACGGGCGTATCAAACCTTTAGACACCATTAGCATTGAATAT FQRLQVQDFDGRIKPLDTISIEY ATCCATA
HP0775 503 GATAGAAACCGAGCTTGGCATGCGTTTAAAAGCGCATGGGAGTTTGTTGAAAAAAATCCAAAAACCCCCTAAAAACA 504 IETELGMRLKAHGSLLKKIQKP AATTCAAACCCCCTAAAACAACCATTCCTAAACCTAAAGAAGCGAGCTTGCGCCTTGATTTAAGGGGGCAACGCAGC PPKTTIPKPKEASLRLDLRGQR GAAGAAGCCCTGGATTTACTAGACGCTTTTTTAAACGACGCGCTTTT DLLDAFLNDAL
HP0775 505 CTCAAACCGAGTTAAATTTAAAAGATTTAGAAAAAAAGCCCGCCGGGATCGTTAGGGATTATTATTTGTGGCGTTATA 506 QTELNLKDLEKKPAGIVRDYYL
TTAGCGATAAAAAAACCAGTTTAGAAAACGCTAAAAAAGCCTATGAATTGACTCAAAATAAAAACAACGCCCTACAAA DKKTSLENAKKAYELTQNKNN
AGGCCATGCAAGAAAAAGGCTCAGACAATGCAGAAAAAAACCCTGATGTTAAATTGCCTGAAGATATTTATTGCAAG MQEKGSDNAEKNPDVKLPEDI
CAAACGGCTTTAGAAAGCATGCTAGAAACAACAGACACTTTCCAAGCAAGCTGCATCGCTATCGCTTTAAAATCAAAG ALESMLETTDTFQASCIAIALKS
ATCAGAGATTTTGATAAAATCCCTATTGAAACCCTTAAGCCCTTACAAATTAAAATCAAAGAGGCTTACCCCGTTCTTT KIPIETLKPLQIKIKEAYPVLYEEL
ATGAAGAATTAGAAATTTTGCAAAGTAAGCATGTGAGCGCTTCTTTGTTTAAGGCTAACGCGCAAGTGTTTAGCGCGC KHVSASLFKANAQVFSALFNHL
TTTTCAATCATTTGAGTTATGAAAAAAAGCTCCAAATTTTTGAAAAGCATATCCCCATTAAAGAGTTAAACCGTCTTTTA LQIFEKHIPIKELNRLLDENYPAF
GACGAAAATTATCCGGCGTTTAACCGCTTGATCTATCAGGTTATTTTAGATCCTAAATTGGATCATTTTAAAGACGCTC QVILDPKLDHFKDALTKSNATH
TCACTAAAAGTAACGCTACCCACAGCAACGCGCAAACCTTTTTTATTCTAGGGATTAATGAAATCTTGCGCAAAAAAC FFILGINEILRKKPSKALKYFERS
CCTCTAAAGCGCTCAAGTATTTTGAACGATCAGAAGCGGTTGTCAAAGACGATGATTTTTCAAAAGACAGAGCGATTT DDDFSKDRAI FWQYLVSKKKKT TTGGCAGTATTTAGTTTCTAAAAAGAAAAAAACTTTAGAACGCCTTTCACAAAGCCCAGCTTTAAATCTCTATAGTCT QSPALNLYSLYAS
TTATGCGAGC
HP0775 507 ACAAACCCTTTTACCCACCGCTCAAACCCTTTTAAACCATGCTAAAAAAACTCAAAGCTTGAATGGGGTGGAGATTGT 508 QTLLPTAQTLLNHAKKTQSLNG
AGGGTTGGAGCATTTGGATAAAGTGATTTATTTAGATCAAGCCCCCATAGGCAAAACCCCACGAAGCAACCCTGCCA EHLDKVIYLDQAPIGKTPRSNP
CTTACACGGGAGTGATGGATGAAATCAGGATTTTATTTGCCGAGCAAAAAGAAGCTAAAATTTTAGGCTATAGTGCGA MDEIRILFAEQKEAKILGYSASR
GCCGTTTTAGCTTTAATGTTAAAGGAGGGCGGTGCGAGAAATGCCAAGGCGATGGGGACATTAAAATAGAAATGCA KGGRCEKCQGDGDIKIEMHFL
CTTTTTGCCTGATGTGTTAGTCCAATGCGATAGCTGTAAGGGCGCTAAATACAACCCCCAAACTTTAGAAATCAAGGT QCDSCKGAKYNPQTLEIKVKG
GAAAGGCAAATCCATTGCCGATGTGTTGAACATGAGCGTGGAAGAGGCTTATGAATTTTTTGCTAAATTCCCTAAAAT LNMSVEEAYEFFAKFPKIAVKL
CGCCGTGAAGTTAAAAACGCTTATGGATGTGGGCTTAGGCTATATCACTTTAGGGCAAAACGCTACGACTTTAAGTG VGLGYITLGQNATTLSGGEAQR
GGGGGGAGGCTCAAAGGATCAAATTAGCTAAAGAATTGAGTAAAAAAGACACAGGCAAAACCCTTTATATTTTAGAT ELSKKDTGKTLYILDEPTTGLHF
GAGCCTACTACCGGTTTGCATTTTGAAGACGTGAATCATCTTTTACAAGTCTTGCATTCTTTAGTGGCGTTAGGCAAT LLQVLHSLVALGNSMLVI
TCTATGCTAGTGATT
HP0775 509 TCTAGCGGTGTATGCCCCTATAGCGAGCCGATTGGGCATGTCTTCAATCAAAAATGAATTAGAAGACAAGAGCTTTT 510 LAVYAPIASRLGMSSIKNELEDK
ATTATATTTATCCAGAAGAGTATAAAAATATCAAGGAATATTTGCACAAAAACAAGCAGTCTTTACTCTTAAAGCTCAA YPEEYKNIKEYLHKNKQSLLLKL
CGCTTTTGCGAGCAAGTTAGAAAAAAAACTTTTTGATAGTGGGTTTAGCCATTCGGATTTTAAACTCGTTACAAGGGT KLEKKLFDSGFSHSDFKLVTRV
GAAACGCCCTTATTCTATCTATCTTAAGATGCAACGAAAGGGCGCGGTTAATATTGATGAAATTTTGGACTTGTTAGC YLKMQRKGAVNIDEILDLLAIRIL
CATTAGGATTTTATTGAAAAACCCGATTGATTGCTATAAAGTTTTAGGGATTATCCATTTGAATTTCAAACCCATTGTCT DCYKVLGIIHLNFKPIVSRFKDYI
CTCGTTTTAAAGATTACATCGCTTTGCCCAAAGAAAATGGCTATAAGACGATACACACGACCATTTTTGATGAATCTTC NGYKTIHTTIFDESSVYEVQIR
TGTTTATGAAGTGCAGATCCGCAC
HP0775 511 CTTAAGGAGTTTTAGAAACCCTCACCAAAGCGCTTCAAAATCCAGTATTTTAGGCTCAAGCTCTCTAAGAGAGCCAAA 512 LRSFRNPHQSASKSSILGSSSL
ACCTGGCGAAATCGCGCTAGCGCATAACGGCATGCTTTTTTTTGATGAATTGCCTCATTTTAAAAAGGATATTTTGGA GEIALAHNGMLFFDELPHFKKDI
AGCTTTAAGAGAGCCTTTAGAAAACAATAAATTGGTGGTTTCAAGAGTGCATAGCAAAATTGAATACGAAACCTCTTT EPLENNKLVVSRVHSKIEYETS
TTTATTTGTAGGGGCTCAAAACCCTTGCTTGTGCGGGAATTTACTCAGCGCGACCAAAGCATGCCGTTGCCAAGACA QNPCLCGNLLSATKACRCQDR
GAGAAATCACGCAGTATAAAAACCGCTTGAGCGAGCCTTTTTTGGATAGGATTGATTTGTTTGTGCAAATGGAAGAG NRLSEPFLDRIDLFVQMEEGNY
GGGAATTATAAAGACACGCCGTCGCATTCTTGGACTTCAAAAGAGATGCATGAATTGGTGTTATTAGCTTTCAAGCAG HSWTSKEMHELVLLAFKQQKL
CAAAAGTTAAGGAAACAGAGCGTTTTTAATGGTAAGCTTAATGAAGAGCAGATAGAACGATTTTGCCCCTTAAACGCT FNGKLNEEQIERFCPLNAEAKK
GAAGCAAAAAAGTTGTTGGAGCAGGCGGTTGAAAGGTTTAATCTCTCCATGCGCTCTATTAATAAGGTCAAAAAAGT VERFNLSMRSINKVKKVARTIA
CGCTAGGACGATTGCGGATTTAAACGCTTGCGAGGATATAGAAAAATCTCACATGCTTAAAGCGCTGAGTTTTAGAA EDIEKSHMLKALSFRKIS
AGATTTCTTAAAAGGATTTTTATAAGGGAGAAAAAATGCAAGAATACCACATTCATAATTTGGATTGCCCTGATTGTGC
1 GTCTAAATTAGAAAGGGATTTAAA
HP0775 513 AAATAGGGCTAAAGAGGGATTTAAGAATGTTAGCAAATGGCTTGGCAAAAACAAAAGCACTGAAGCAGCACCCAAAG 514 NRAKEGFKNVSKWLGKNKSTE AGAGGGTTTTAAGCGATGAAGAAATCAACAACAGGGCTGAAAGGATCGCTAAAAGCGAGTTAGAGAAGGATACGAA RVLSDEEINNRAERIAKSELEK
GCTCGTTTCATCACACGATCAATACGAGCGCATGAAAAAAAGCGGATCGCTCAACACGGAAAACTTAGATTCGCACA SHDQYERMKKSGSLNTENLDS TTCAAGCCAACAGCTTACAAGAGCTGAATCAAAAATTGCTCCAATTCGTGGGCGCGGATAGGAAGTATATGCCCTAC SLQELNQKLLQFVGADRKYMP
ACTAAAGCGGTGCAAATTTCTTTGAATAACCCCAATCTTAAAGATTTGGAAGTGATTGACACCCCAGGAGTGAATGAC QISLNNPNLKDLEVIDTPGVND CCCATCGCTTCCAGGGAAGAACGCACCAAAGCCTTATTGAAAGATTGCGATGTGGTGTTTATCATAAGCTCTTCTAAT ERTKALLKDCDWFIISSSNQFL CAGTTTTTAACGGAGAGCGATATGAGTTTGTTTGACAGGGTTTCTAACAAAGAAAGCCTTCAAGAAATTTATTTTGTG SLFDRVSNKESLQEIYFVASQA GCAAGCCAAGCCGATAGCGCTGTTCTTTCTATGAGTGAAGTGGAAAAATCTCGCCACCACCTCCC SMSEVEKSRHHL
HP0775 515 GCGAGTTAGAACCCCAGATCACAGACGCTCTAAAACCCTTAGAATTTATCAAAGATTTTAAAAGTTGTTTGGATATTG 516 ELEPQITDALKPLEFIKDFKSCL
GGAGCGGGGCGGGACTCCCTGCTATCCCTTTAGCCCTTGAAAAACCTGAAGTCAAATTCATTCTTTTAGAGCCAAGA GLPAIPLALEKPEVKFILLEPRIK
ATAAAAAGAGCGGCTTTTTTAAACTACCTTAAAAGCGTTTTGCCTTTAAAAAATATTGAAATCATTAAAAAGCGTTTAG NYLKSVLPLKNIEIIKKRLEDYQ
AAGATTATCAAAATCTTTTACAAGTGGATTTGATCACTTCCAGAGCGGTCGCTAGCTCTTCTTTTTTGATAGAAAAAAG LITSRAVASSSFLIEKSQRFLKD
FYKGEQLKDEIACKDTECFMH
CACTGAATGCTTTATGCATCAAAAACGAGTTTATTTTTACAAATCAAAGGA YKSK
HP0775 517 ATTCGCAAGCCTATTTTGACGCTTTGCGAACGATCAGCCGCGCGTTTAAAAACTACCCTCAAACGATGTTTAAAAAAG 518 SQAYFDALRTISRAFKNYPQTM
ATTTGTATTTGTTAGAAATTATCGCATTAGGACAATTAGGCATTAAAAAATCCTTACTCATAGACATTGGCACCCAATG YLLEIIALGQLGIKKSLLIDIGTQ
GATTAAAAATTACCCGACTGATCCCAATATCCCTGAAGCGTTATACTATGTCGCCAAAGCTTTAGACGAGAACAACCA TDPNIPEALYYVAKALDENNHY
TTACAAACAGGCCATGCGTTATTACAAACGCATTCTTTTAGAATACAAGAACTCGCGCTACGCTCCTTTAGCCCAAAT YYKRILLEYKNSRYAPLAQMRL
GCGTTTAGCCATTGAAGCGGCTGAAGGCTCTGATTTGAGCAACGCTAACATGCTTTTTAAAGAAGCTTTTTCTAACGC GSDLSNANMLFKEAFSNAKDK
CAAAGACAAAGAGAGTGCGAGTGAAATCGCGCTTAATTGGGCTGAAGCAGAGATAAACTATCAAAATTTTAATAACG ALNWAEAEINYQNFNNAKYLID
CTAAATACCTCATTGATAAGGTGGTCCAATCCAACCCTGATTATATTTCTACGCATAGCGAATCAGCCCTAGACTTGC NPDYISTHSESALDLLKLLKKN
TCAAGTTATTGAAAAAAAACCAGATGAATGCAAGCGCGATTGAGATCGCTCACTTGCTCCTCAATCAAGATGATGATC AIEIAHLLLNQDDDLKAKEQALY
TGAAAGCTAAAGAGCAAGCGCTTTATGATTTAGGAGCGTTGTATGCAAGGATCAAGGACTTTAAAAACGCCCACCTT YARIKDFKNAHLYNLQYLQDHA
TACAATCTGCAATATTTGCAGGACCATGCGGAACTGGATAAAGCTTCTGTCGTTAGGGCGCGCGATGAAAAAGCCCT SVVRARDEKALFSMEGNTQEKI
TTTTTCCATGGAGGGGAACACGCAAGAAAAAATCGCCCACTATGACAAAATCATTCAAAATTTCCCTAATTCTAATGA IIQNFPNSNEALKALELKAQLLF
AGCCCTAAAGGCTTTAGAATTGAAAGCCCAACTATTGTTTGAAAACAAGCGTTATGCTGAAGTGTTAAGCATGCAAAA AEVLSMQKNLPKDSPLIQKTLN
AAATTTGCCTAAAGATTCCCCTTTGATCCAAAAAACGCTCAATGTCCTTGCTAAAACCCCATTAGAGAACCATCGTTG PLENHRCEEALKYLSQITTFEF
TGAAGAAGCCTT QAFDCLYFAS
HP0775 519 GCATTTAAGGCATGTGAAAGATCTCAAAAGCTTTTTAACGCATGCCAGAAAAAACTTGCCTTTCACGGCTAAAATTGA 520 HLRHVKDLKSFLTHARKNLPFT
AATTGAATGCGAAAGCTTTGAAGAGGCCAAAAACGCCATGAATGCGGGAGCGGATATTGTGATGTGCGATAATTTGA CESFEEAKNAMNAGADIVMCD
GCGTTTTAGAGACTAAAGAAATTGCCGCTTATAGAGATGCGCATTATCCCTTTGTTTTACTGGAAGCGAGCGGGAAC ETKEIAAYRDAHYPFVLLEASG
ATTTCACTAGAGAGCATTAACGCTTACGCTAAAAGCGGCGTGGATGCCATTAGCGTAGGGGCTTTAATCCATCAAGC NAYAKSGVDAISVGALIHQATFI
CACTTTTATTGACATGCACATGAAAATGGCTTAAAGACTTTAAAAAGGGGTTATTAACATGCTAAAAGAATATTTAGAA KMA
AGCATTAAAGATCTTACGCCTGAAAAGAATGAACTCACGCACCGCCCTTCTTTATACAACTTGCTTAATCAGTTAAAA
AAC
HP0775 521 GCAAAAAAGACGCTTGCGGGTTCATCTATGAGATCAGCGAGTTCATGAAAGCCTATACCGCATTGCTAAAAAAACAA 522 KKDACGFIYEISEFMKAYTALLK
GACCGATACGTCTATTTATTGAGGTATCTCCCCTCTAGGTATTGGGCCAGCATTTTAACGACTGCCCTTTATGTCAAA VYLLRYLPSRYWASILTTALYVK
TACCCTGATTTTGACGCTTTGAAAAAGCTTTTGGTGTCTTATTATTACCAAACTTGGATTGCAGGAGGCACGATCACG ALKKLLVSYYYQTWIAGGTITRIK
CGCATCAAGCAAACCAGTATCAACATTATCAAAAACGTTAAAAGCAATAAGAGCGTTGAAACCATCAAAGAGCTTATA IIKNVKSNKSVETIKELILNSIDSY
TTGAATAGCATCGACTCTTATAACACCTTTGATCAATACCTCTATAACTTATGGGATAGCTCTTCTGTTTATCATAGCA YLYNLWDSSSVYHSKWVRPVL
AATGGGTGCGTCCTGTCTTAGCCCTAGCTAATTATTTCATGGCAGATGAAGAGAAACCCCATTTTATCGCTATGGATG FMADEEKPHFIAMDAETQVEHIL
CCGAAACCCAAGTGGAGCATATTTTGCCACAAACGCCCAAAAGAGGCAGTCAATGGAACGCGGATTTTGACAAAGA KRGSQWNADFDKEKREEWVN
AAAAAGAGAAGAATGGGTAAATAATATCGCGAATTTAACCCTTTTAAAGCGTAAAAAGAACGCGCATGCTTTAAACGG LLKRKKNAHALNGDFDEKRKIY
GGATTTTGATGAAAAAAGAAAAATTTATGGAGGCAAAGACACGAGCAAAGTGATTAGCTGTTATGACATCACTAAAGA SKVISCYDITKELYSNYRKWNEK
ATTGTATAGCAATTATAGGAAGTGGAATGAGAAGTCCCTCCAAGAGCGATACAAATCTTTGTATAACACTATCACGCC RYKSLYNTITPVLHIEGQEDDFE
TGTTTTACACATAGAGGGGCAAGAAGATGATTTTGAAGATGATTTTGATCTAGAATGATTAAAGATTGCCAAGCATCA E
AAACAACAAAGAGGTGATCAATGCCTAAAAAAGAGCTATTAAAGATGTCAAAGAAAAGGATTTTTAAAGACTTCTTAAA
AGAAGCCAAACAGCACCGCCCTATTGTTTTCTATACAGATAATGATTGTGATGGCATGTTAGCTGGCAGCGTTTTAAT
GTCTATGTGTT
HP1489 523 TTTGAATTTGGCTGAAGACAGCGCGCCTTTGAACCATCCTAACGCTCAAAAACTCTCCTTAAAAAACGCATGGACTAG 524 LNLAEDSAPLNHPNAQKLSLKN GGTATTGTCTAACCATGAAGGCTTGCATGCGCAAGAATACGCCATTAAGCGAGCGAGTAAAATGAAATTAGCGGCTA VLSNHEGLHAQEYAIKRASKMK AACTTTCTTTTTTGCCTCAAATTGATTTGAGCGCTTTTTATGTGTATCTCTCTAACCCCATTAAAATGGATTTTGCCAGC SFLPQIDLSAFYVYLSNPIKMDF CAAAAACAACCGGGCGT PG
HP1086 525 ATAACGAGTTACAATTGTTGTGTTTCAGGCTGGGTAAAAACAAGGATTTGTATGCGGTCAATGTTTTTAAGATCCGTG 526 NELQLLCFRLGKNKDLYAVNVF
AAGTGGTGAAATACCATGGCAATCTCACCATCATTAGCCACGAAAACAATTCGCTCGTTGAGGGGCTAATCATTATAA VKYHGNLTIISHENNSLVEGLIIIR
GAGAACTCACCATTCCCTTGATTGATATGAAAAAATGGTTTTATTATGACAGCCAAAACAAAAACAAGGATTTACGCC LIDMKKWFYYDSQNKNKDLRPY
CTTATAGGATAGAAAAAGAAAAAGGCGAAGATGATATTGTTATGATTTGTGAGTTTTCTCGCTGGACTATAGGGGTTA KGEDDIVMICEFSRWTIGVRIYE
GGATCTATGAAGCGGATAGGATTTTGAGCAAGAAATGGACTGAAATGGAGCAAAGCGCTGGGCTAGGGGGATCTGC SKKWTEMEQSAGLGGSAGNNK
AGGCAATAACAAACTCGTGAGCCGCACGCGCTATTTTGATGGGCGCTTGGTGCAAGTGGTGGATATTGAAAAAATG TRYFDGRLVQVVDIEKMLIDVFP
CTTATAGACGTGTTCCCTTGGATTGAAGATGAAAAACACAACGATTTAGAGACGCTTTCTAAAATCCATTCTAACCAAT KHNDLETLSKIHSNQCVLLADDS
GCGTTTTGCTTGCTGATGACTCCCCAAGCGTTTTGAAAACCATGCAAATGATTTTAGACAAGCTGGGCGTCAAGCAT KTMQMILDKLGVKHIDFINGKTLL
ATAGATTTTATCAATGGTAAAACCTTACTAGAGCATTTATTCAACCCCACAACCGATGTGAGTAATATTGGCCTGATTA NPTTDVSNIGLIITDLEMPEASGF
TTACCGATTTGGAAATGCCAGAGGCGAGCGGTTTTGAAGTGATCAAGCAGGTTAAAAACAATCCTTTGACTTCAAAAA VKNNPLTSKIPIVV
TCCCTATCGTGGTCA
HP1086 527 CGCTCTATGGCGGGATTGCGTGCGCGAATTTGTTGCATAAAAATTCAGGGATCACGATAGATATTGGAGGGGGTAG 528 LYGGIACANLLHKNSGITIDIGGG
CACCGAGTGCGCGTTGATTGAAAAAGGCAAGATTAAGGACTTAATCTCGCTTGATGTTGGGACGATTCGCATTAAAG ALIEKGKIKDLISLDVGTIRIKEMF
AAATGTTTTTAGACAAAGACTTAGAGGTCAAATTGGCTAAAGCCTTTATCCAAAAAGAAGTCTCTAAACTGCCCTTTAA EVKLAKAFIQKEVSKLPFKHKNA
ACACAAAAACGCCTTTGGGGTGGGGGGGACGATCAGAGCGTTGAGTAAGGTATTGATGAAACGCTTTTGTTACCCT GTIRALSKVLMKRFCYPIDSLHG
ATTGATTCTTTGCATGGCTATGAAATAGATGCACATAAAAATTTAGCGTTCATTGAAAAAATCGTCATGCTCAAAGAAG HKNLAFIEKIVMLKEDQLRLLGV
ATCAATTACGGCTTTTAGGGGTGAATGAAGAGCGTTTGGATAGCATCAGGAGCGGGGCGTTGATTTTATCAGTCGTT DSIRSGALILSVVLEHLKTSLMIT
TTGGAGCATTTAAAAACTTCTTTAATGATCACTAGTGGGGTGGGGGTGAGAGAAGGCGTGTTTTTGAGCGATTTATT VREGVFLSDLLRHHYHKFPPNIN
GCGCCATCATTACCATAAATTCCCCCCCAATATCAACCCCTCTCTCATCTCTTTAAAAGATCGCTTTTTGCCCCATGA LKDRFLPHEKHSQKVKKECVKL
AAAGCACAGCCAAAAGGTCAAAAAAGAATGCGTGAAATTGTTTGAAGCCTTATCGCCTTTGCATAAAATAGATGAA PLHKIDE
HP0525 575 AGAAGCTATCAATTACTCGCCTTCAGATGAGATTAGAAACCGCCCCTTATTTGACAATCTAACCCCCCTATTCCCTGA 576 EAINYSPSDEIRNRPLFDNLTP
TGAACAGATCAAATTAGAATACGAACCCACTAAAGTTACCGGCAGAATGCTAGATTTATTCAGCCCTGTGGGGAAAG I KLEYEPTKVTGRMLDLFSPV
GCCAAAGGGCTTTGATCGTCGCGCCACCAAGGACTGGGAAAACGGAGCTGATGAAAGAACTCGCCCAAGGCATCA LIVAPPRTGKTELMKELAQGIT
CTTCTAACCACCCTGAAGTGGAGCTGATTATCCTTCTAGTGGATGAGCGCCCTGAAGAAGTTACGGATATGCAACGA ELIILLVDERPEEVTDMQRSVK
AGCGTTAAGGGTCAAGTTTTTAGCTCCACTTTTGATTTGCCCGCAAACAACCACATAAGAATCGCTGAATTAGTCCTA STFDLPANNHIRIAELVLERAK
GAAAGGGCCAAAAGGCGCGTGGAAATGGGCAAAGATGTGGTGGTTTTATTGGATTCTATCACCCGTTTAGCCAGAG GKDVVVLLDSITRLARAYNAV
CGTATAACGCTGTAACGCCTTCAAGCGGTAAGGTTTTAAGTGGAGGCGTGGATGCAAACGCCTTGCATAGGCCCAA VLSGGVDANALHRPKRFFGA
GCGTTTTTTTGGAGCCGCAAGGAATATTGAAGAAGGCGGGAGCTTGACGATTATCGCTACGGCGTTGATTGAAACG GSLTIIATALIETGSRMDEVIFE
GGATCTAGAATGGATGAGGTGATTTTTGAAGAATTTAAAGGCACCGGGAATAGCGAAATCGTTTTAGCGAGGAATAT NSEIVLARNIADRRIYPAFDILK
TGCGGACAGGCGCATTTACCCGGCCTTTGATATTTTAAAATCCGGCACACGAAAAGACAATATCTTGCTTGGCAAAG NILLGKDRLTKVWVLRNVMQ
ACCGCTTGACTAAAGTGTGGGTTTTAAGGAATGTGATGCAACAAATGGACGACATAGAAGCCTTAAGCTTTGTGTATT LSFVYSKMQQTKD
CTAAAATGCAACAAACTAAGGACAA
HP0525 577 AATACGCTGATCCTAGCACTTCTAAAAAGAGAGCCGATAAGGGATTAAAAAAGGTGTTCAAAGACAGCAAAAAAGAC 578 YADPSTSKKRADKGLKKVFK
GCTTGCGGGTTCATCTATGAGATCAGCGAGTTCATGAAAGCCTATACCGCATTGCTAAAAAAACAAGACCGATACGT CGFIYEISEFMKAYTALLKKQD
CTATTTATTGAGGTATCTCCCCTCTAGGTATTGGGCCAGCATTTTAACGACTGCCCTTTATGTCAAATACCCTGATTTT RYLPSRYWASILTTALYVKYP
GACGCTTTGAAAAAGCTTTTGGTGTCTTATTATTACCAAACTTGGATTGCAGGAGGCACGATCACGCGCATCAAGCA KLLVSYYYQTWIAGGTITRIKQ
AACCAGTATCAACATTATCAAAAACGTTAAAAGCAATAAGAGCGTTGAAACCATCAAAGAGCTTATATTGAATAGCAT VKSNKSVETIKELILNSIDSYN
CGACTCTTATAACACCTTTGATCAATACCTCTATAACTTATGGGATAGCTCTTCTGTTTATCATAGCAAATGGGTGCGT NLWDSSSVYHSKWVRPVLAL
CCTGTCTTAGCCCTAGCTAATTATTTCATGGCAGATGAAGAGAAACCCCATTTTATCGCTATGGATGCCGAAACCCAA DEEKPHFIAMDAETQVEHILP
GTGGAGCATATTTTGCCACAAACGCCCAAAAGAGGCAGTCAATGGAACGCGGATTTTGACAAAGAAAAAAGAGAAGA SQWNADFDKEKREEWVNNIA
ATGGGTAAATAATATCGCGAATTTAACCCTTTTAAAGCGTAAAAAGAACGCGCATGCTTTAAACGGGGATTTTGATGA RKKNAHALNGDFDEKRKIYG
AAAAAGAAAAATTTATGGAGGCAAAGACACGAGCAAAGTGATTAGCTGTTATGACATCACTAAAGAATTGTATAGCAA VISCYDITKELYSNYRKWNEK
TTATAGGAAGTGGAATGAGAAGTCCCTCCAAGAGCGATACAAATCTTTGTATAACACTATCACGCCTGTTTTACACAT KSLYNTITPVLHIEGQEDDFED
AGAGGGGCAAGAAGATGATTTTGAAGATGATTTTGATCT
HP0525 579 CAGACATCAAATTCTCAACACCCAATCAGTTAGAACAATCCCTCAACGCTCTTAAAAACAAGCTAGCCGCATTCTTTT 580 DIKFSTPNQLEQSLNALKNKL
CAAAACACCCTGATAAACATAACGGCATGGAGTTTAATGAAATAGCTAAAACTCAAATAGAAGCGCTTTATATGCCGC PDKHNGMEFNEIAKTQIEALY
AATCTAGCGATGGTTTTGATGATTTTCGCAAGCATCTTGAAGAGAGTATTAAAAGTTTTATCAGAGCGAAAAAAAATC GFDDFRKHLEESIKSFIRAKKN
GTTATGGTTTTCCTAAAATCTTTGATGTTGCAGACATAGAACAAGAAGAGAGAAAAGTCATTGAATGGCGAGAAAAAG KIFDVADIEQEERKVIEWREKE
AGAGAGCGTCAAAACAAAGCTATAAACAAAACCTTCAAATCAACAAAATCGCTAACGATTTAAAGCGTGATAAGGTAG SYKQNLQINKIANDLKRDKVV
TGGATAAAAGAACGATTTTAAGCGTGATAGACGCTGATTTAGATCGTGGTTTTATCCCGCCTAAAGACTTGTTAAAGC VIDADLDRGFIPPKDLLKQLEK
AATTAGAAAAAATCAGCGCTTCTCTTTCTAAAGACATCGTAATAGCGATAAAGCAAGTAGAAAAATTAGAGCTTAGCT DIVIAIKQVEKLELSYALIDNIQ
ATGCGCTAATAGACAATATCCAACACAACACGCTTGATGACACGCTTGATTTTACCTTTATTGTTGGGGATTCTTTGA LDFTFIVGDSLSVQSLYVTFDL
GCGTTCAGTCGCTTTATGTTACCTTTGATCTGGTGATTGACATGGATAGGCCTATGAGCGAGCAGTTCCTCAACCAT PMSEQFLNHIGELGSFESRER
ATTGGGGAATTGGGGAGTTTTGAATCTAGAGAGCGAGCGTTAGAGTGGGTGCGATTATCGCAAGCTAAACTGATCAT RLSQAKLIIETPREALKNAQLS
TGAAACGCCCAGAGAAGCGCTAAAAAATGCGCAATTATCGCAAATTGAAGAAATATTGACTGGCTGTATTTTTAATGG GCIFNGAYRLQNDLKGNRHG
CGCTTACCGCCTTCAAAACGATCTTAAGGGCAATAGACATGGAAATTTTAAATAAAATTCACACCCCAACAAGGAGAA
ACAAGCCATGCCAAAAAGATAAAGAACATTAAAAAACATTGAACGACCGATTATAGCGGAA
HP0088 599 AGGAGCTTTGATGATGACGATGAAAATAGCGTGAGCGATTCTAAAAAAGATGAAGACAACGAAGAAGATGAAGAAAA 600 RSFDDDDENSVSDSKKDEDN
CGAAGAAAGGAAAAAAGTCGTTTCTGAAAAAGACAAGAAGCGTGTAGAAAAGGTTCAAGAAAGCTTTAAAGCCCTAG EERKKVVSEKDKKRVEKVQES
ACAAGGCTAAAAAAGAATGGCTTAAAGCCCTTGAAGCCCCCATAGATGAAAGAGAAGACGAATTGGTGCGTTCATTG KAKKEWLKALEAPIDEREDEL
ACCCTAGCTTACAAACGCCAAACACTCAAAGACAGACTCTATGATTTAGAACCTACCAGCAAACTGATTAATGAATTA AYKRQTLKDRLYDLEPTSKLIN
GTCAAAACGATGGAAACCACTTTAAAAAGCGGCGATGGGTTTGAAAAAGAGTTGAAACGCTTGGAATACAAACTGCC ETTLKSGDGFEKELKRLEYKL
CTTATTCAATGACACTCTCATCGCAAACCATAAAAAAATCCTTGCCAATATCACTAACATGACTAAAGAAGATATTATC LIANHKKILANITNMTKEDIIAQV
GCTCAAGTGCCAGAAGCGACTATGGTGAGCGTGTATATGGATCTTAAAAAGCTTTTTTTGACTAAAGAAGCGAGCGA VSVYMDLKKLFLTKEASEEGF
AGAAGGCTTTGATCTAGCCCCCAACAAGCTAAAAGAAATTTTAGAGCAAATCAAAAGAGGGAAGTTGATTTCCGATC LKEILEQIKRGKLISDRAKNKM
GCGCTAAAAACAAAATGGCTAAATCCAATTTAAGGTTGGTGGTGAGCATCGCTAAACGATTCACGAGCAGAGGCTTA LVVSIAKRFTSRGLPFLDLIQE
CCATTCTTGGATTTGATTCAAGAGGGCAATATTGGCTTGATGAAAGCGGTGGATAAGTTTGAGCATGAAAAGGGCTT KAVDKFEHEKGFKFSTYATW
CAAGTTTTCTACCTATGCGACCTGGTGGATCAAACAAGCTATCAGCAGAGCCATAGCCGATCAGGCCCGCACTATCC RAIADQARTIRIPIHMIDTINRIN
GCATCCCCATTCACATGATTGATACGATTAATCGCATCAATAAAGTCATGCGCAAACACATTCAAGAAAACGGCAAAG HIQENGKEPDLEVVAEEVGLS
AGCCTGATTTAGAAGTGGTGGCTGAAGAAGTGGGGCTTTCGTTAGATAAAGTGAAGAATGTGATTAAGGTGACTAAA VIKVTKEPISLETPVGNDDDGK
GAGCCTATCAGTTTGGAAA EDKNIVSSIDHIMREDLKAQIE
NEREKAVIRMRFGLLDDESDR
KELNVTRERVRQIESSAIKKLR
RILRNYLRI
HP0088 601 GCTCAATGATTTAAAAAGCGTTGAAGAAATCACTAACTACACCAAAGCCGGTGCTTTTTGTAAAAGCTGTGTGAGGCC 602 LNDLKSVEEITNYTKAGAFCKS
TGGAGGGCATGAAAAAAGGGATTATTACTTGGTGGATATTCTTAAAGAAGTGCGCGAAGAAATGGAAGCTGAAAAAC GHEKRDYYLVDILKEVREEME
TTAAAGCGACCGCTAATAAATCCCAAAGCGGAGAATTGGCTTTCAGGGAAATGACTATGGTTCAAAAGATTAAAGCG TANKSQSGELAFREMTMVQKI
GTGGATAAAGTCATTGATGAAAATATCCGCCCGATGCTTATGATGGATGGAGGGGATTTAGAGATTTTAGACATTAAA VIDENIRPMLMMDGGDLEILDI
GAAAGCGATGATTACATTGATGTGTATATCCGCTACATGGGGGCATGTGATGGGTGCATGAGCGCGACTACCGGGA IDVYIRYMGACDGCMSATTGT
CTTTATTTGCCATTGAAAACGCTTTGCAGGAATTATTGGATCGCAGTATCAGGGTGTTACCGATTTGA ALQELLDRSIRVLPI
HP0088 603 GCCCATTTGATTTTAGTGTGTAAATACAGAAAAAAGTTGTTGCAAGGGGATTTGAACAACTTTATTAAGTCTGTTATAG 604 AHLILVCKYRKKLLQGDLNNFI
ATGAGATAGCCACCCAAAGCAATTTCATTATCATTGCGATGGAAAGCGATATAGATCATTTGCACTTAATGGTTCAAT ATQSNFIIIAMESDIDHLHLMVQ
ATATTCCTAGAATGTCTATTAGTTCCATTATTTCTAGAATCAAACAGATCACTACTTATAGAGTTTGGCGTGATAAGAG ISSIISRIKQITTYRVWRDKRFIP
ATTTATCCCCTTATTGCAAAAACACTTTTGGAAAGAAAAAACTTTTTGGACTGATGGTTTTTTTGTTTGCTCTATCGGT WKEKTFWTDGFFVCSIGEANP
GAGGCTAACCCTGAAACGATCAAGGCGTATATAGAAAATCAAGGTTAATTTTACTCATAGGGTTTTTATAGTTCCTAG ENQG
CGGAACTAAAGCATTCATCCCAAACACTAAAGATATTTGGGATTTCTGCTTGGGTGGTTTAAAAATTCCATTCTTTTCT
AACAATCTTATTTTAAGCTCTTGTAATTTTTCTAAATTTGGCTTAGGCGATTCCAAAAGCTCTCTATCTTCTTTGGTAAG
ATTTTCTTTAAACTGGCTTTCTAAAAGATCATAATTACGCATAACTCCATCATCTTCAAACCCCTCTTGTGTGGCGCGT
AAAGCATGGATAATAGATAGGTTGGCGGGCATTTCTTGTTTTTGCTGCCTTGCTTTTAATTGGCTTAGCTTTTCACGC
ATAGAAGAATTAAAAAAAGCTCTCATTTTTTTCTATTTCTTTTAAAAATTCCAATTCACTGATAGGCGAACTATAGTAGC
GTCTAAGGGCTTCTTCTAA
HP0088 605 AAAGAATTAAACCGATTCAATCAAAGCGGTGCCAATAGCGATTCTCATATCAAAGACATGTTTGCGGATCGTAAGACT 606 KELNRFNQSGANSDSHIKDMF TTAGAAGAAGACATTAAAAACGCCTATGATGATCTTTTTGATTACCCCATTGACGATATAGAGGGCATGACTAGCGCC EEDIKNAYDDLFDYPIDDIEGM ATTGTCAGCATGAGCGCAATGAACGAGCTTGTAAAAGTCTCACGCGCCATTAACACGCTCAAAGAGCGCTACAATTT MSAMNELVKVSRAINTLKERY AATCCGCACTTCTAATGATAAAAAAATCCTTTCACTAAAAGAAAAAATTGATATTGAAAAGATCCATAAAATCTCTTCAA NDKKILSLKEKIDIEKIHKISSML TGCTTCATCAAAAAGCCAAACACCTCCATGCGTTAAAGAATATCAATGAGCCTAAAAACCCAAACGATTTAATGATTTT HLHALKNINEPKNPNDLMILED AGAAGACCTCATCGCTCTTTTAGACTTTAAAATAGAGTTTAAAGAACGCAAAG KIEFKERK
HP1298 639 TTTAAGATCAATAAATTGCGGACAGCGCCAAGTAAGGAAGAACAGCCAAGCCATTGGGTGAAATGCCCTAAATGTT 640 FKTNKLRTAPSKEEQPSHWVK ATGCGTTAATGTATCATAAAGAAGTGTTTAGTAAATACAGCGTGTGTTTGAAATGCCATTACCATTTCCGCATGAAAG LMYHKEVFSKYSVCLKCHYH CGGCTGAAAGGATTGAATTTTTATGCGATGTGGGGAGTTTTGAAGAGTTTGACAAGCATTTACGGCCTAATGATCCTT ERIEFLCDVGSFEEFDKHLRP TAAATTTCGTGGATAAAGAGAGCTATAAACAACGCATTAAAAAATACGAAAAAAGGACTAACCGCCCAAGCTCAGTGA VDKESYKQRIKKYEKRTNRPS TCAGCGGTGAGGCTAAAATCAACCGCATGCCTTTGCAGATCGTGGTGTTTGATTTTAGCTTTATGGGGGGGAGTTTA AKINRMPLQIVVFDFSFMGGS GGCTCTGTGGAGGGCGAAAAGATCGTAAGAGCAATCAATCGCGCGGTCGCTAAAAGAGAAGCGTTATTGATTGTTT EKIVRAINRAVAKREALLIVSA CAGCGAGTGGGGGGGCTAGGATGCAAGAATCCACTTATTCGCTCATGCAAATGGCTAAAACGAGCGCGGCTTTGAA QESTYSLMQMAKTSAALNRL CCGATTGAGTGAGGCCAAACTCCCTTTCATTTCGCTCTTAAGCGATCCCACTTATGGGGGCGTTAGCGCATCTTTTG ISLLSDPTYGGVSASFAFLGD CTTTTTTAGGGGATCTCATTATCGCAGAGCCAGGGGCGATGATAGGCTTTGCGGGGCCTAGGGTGATTAAGCAAAC MIGFAGPRVIKQTIGADLPEG TATAGGGGCGGATTTGCCTGAGGGCTTTCAAACAGCGGAATTTTTATTAGAGCATGGCTTGATTGATATGATTGTGC LEHGLIDMIVHRKDLKKTLSDL ACAGGAAGGATTTGAAGAAGACTTTGAGCGATCTCATCGCTATGATGACGCATAAGACTTCAAAGATTTTTTAAAGTT KTSKIF TTAACATTGATGCGTTGCGTGGTGTATTCTAT
HP1298 641 CAAGAAGCGAAGTGGCAATGACGGGCGAATTGACTTTGAGCGGGGAAGTTTTACCCATAGGAGGGTTGAAAGAAAA 642 RSEVAMTGELTLSGEVLPIGG
GTTGATCGCTGCTTTTAAAGCCGGCATTAAAACCGCTCTCATTCCTGTCAAAAATTACGAAAGGGATTTAGACGAGAT AFKAGIKTALIPVKNYERDLDE
CCCTGCTGAAGTGCGAGAAAATTTAAACATCGTTGCGGTGAAAAACATCGCTGAAGTGTTAGAAAAAACTTTGCTTTG ENLNIVAVKNIAEVLEKTLL
AAATTTGGCATGAAAGCAGGCATTATTGGTTTAGGGCTTATGGGGGGGAGTTTAGGGCTAGCCTTGCAAGAATGGG
GGCGTTTTAAAAGCGTTATAGGCTATGATCATAACGCTTTGCATGCTAAATTGGCTTTGACTTTGGGGCTTGTAGATG
AATGCGTGGGATTTGAAAAGATTTTAGAATGCGATGTGATTTTTTTGG
HP1298 643 CTGATGGGCTAAGCATAGACCCTAAAAGCAAGCAAGTCATTAACGATAGCTTCAATCGTTTTGAAGACGGCACACCC 644 DGLSIDPKSKQVINDSFNRFE
CCTGAAAAAAACGGCGATTTCGCTTTTTTGCTCCACATCATCAAATCCTTAAAAAACACAGGCAAAGGGGCAGTGATT KNGDFAFLLHIIKSLKNTGKG
TTACCCCATGGGGTGCTATTTAGGGGGAATGCTGAGGGTGCAATCAGAAAAAATCTTTTAACAAAAGGCTATATTAAA VLFRGNAEGAIRKNLLTKGYI
GGCGTGATAGGCCTAGCCCCTAATCTTTTTTATGGCACTTCCATTCCTGCATGCGTGATCGTTTTAGACAAAGAAAAC PNLFYGTSIPACVIVLDKENA
GCGCGCGCCAGAAAGGGCGTTTTCATGATAGATGCGAGCAAGGATTTTAAAAAAGACGGCAATAAAAACCGCTTGA MIDASKDFKKDGNKNRLREQ
GGGAACAAGATGTCCAAAAAATGATAGACACTTTTAACGCTTACAAAGAAATCCCTTATTATTCCAAAATGGTAAGCC DTFNAYKEIPYYSKMVSLEEI
TAGAAGAAATTAGCGCTAACGACTATAACTTGAATATCCCGCGCTACATTGCCGCCAAACCAGAATCAGAAAAAGAC NIPRYIAAKPESEKDLFALINS
TTGTTCGCTTTAATCAACAGCCACAAAGCTAGTTATTTGCCCAAAAACGAAATAAAAGCCTACGCCCCTTATTTTCAA KNEIKAYAPYFQVFKELKNTL
GTGTTTAAAGAGCTTAAAAACACGCTTTTTAAAAAGAGCGATAAAGAGAGTTATTACGCTTTAAAAACAGAATGCGAA SYYALKTECENIKELIIQSSEF
AACATTAAAGAATTAATCATTCAAAGCTCAGAATTCCAAACCTTCCACGCTTCTGTTTTAAACGCTTTTGACCGATTGG LNAFDRLDLFETFDHLEPGF
ATTTATTTGAAACTTTTGACCATTTAGAGCCAGGTTTTA
HP0072 645 CAAACAAAAGGTTTCAAACTATTTTTTAGGGGTTTTGCCTAAAAGCTATTCTATGAGCGAAGAAAACAACATTTTAGGC 646 KQKVSNYFLGVLPKSYSMSE
TTGTATGATGAGCATTTCTTGCTCACTAAAAACGAAAACTTAGTGGGCATCCTCCGTTTAGAAGGGGTTAGCTACACC YDEHFLLTKNENLVGILRLEG
CATTTAAGCACAGAGCAATTGCAAGATCTTTTCACCGAGCGCCAGATGGCGTTGGATTCTTTAGAAAAAGTCGTGGC TEQLQDLFTERQMALDSLEK
GCGCCTTGTGGTTAAAAGGCGTAAAATTGATCACCAACAAAACATTCAATCTGATTCTCAATACTTGCAAGCGATTTT KRRKIDHQQNIQSDSQYLQAI
GAATCAATTTGAAAACAAAGAAGTGTATGAAAATCAGTATTTTTTAGTTTTAGAAAGCACTCACTCTTTGCATGGCGTT KEVYENQYFLVLESTHSLHG
TTGGAGCATAAGAAAAAATCTCTCATGCATGCTAATAGGGAAAATTTTAAGGATATTCTCTCTTATAAAGCGCATTTTT SLMHANRENFKDILSYKAHFL
TGCAAGAAACTTTAAAAAGCTTAGAAATCCAGCTCAAAAATTATGCCCCCAAACTCTTAAGCTCCAAAGAGGTTTTGA LEIQLKNYAPKLLSSKEVLNFY
ATTTTTATGCGGAATACATTAATGGGTTTGATCTCCCCTTAAAACCCTTAGTAGGGGGGTATTTGAGCGATAGCTATA FDLPLKPLVGGYLSDSYIASSI
TCGCTAGTTCTATCACTTTTGAAAAAGATTATTTCATTCAAGAAAGCTTTAATCAAAAAACCTACAACCGCTTGATTGG FIQESFNQKTYNRLIGIKAYES
CATTAAAGCTTATGAGAGCGAAAGGATCACTTCTATAGCGATCGGAGCGCTTTTATACCAAGAAACGCCTTTAGATAT GALLYQETPLDIIFSIEPMSVH
TATCTTCTCTATAGAGCCTATGAGTGTGCATAAAACGCTGAGTTTTTTAAAAGAGAGGGCCAAGTTTAGCATGTCCAA ERAKFSMSNLVKNELLEYQE
TCTCGTTAAAAACGAGCTATTAGAATACCAAGAACTGGTCAAAACCAAACGCCTATCCATGCAAAAATTCGCCCTAAA SMQKFALNILIKAPSLENLDA
CATTCTTATCAAAGCCCCTAGTTTAGAGAATTTAGACGCTCAAACAAGCTTGGTTTTAGGGCTTTTATTTAAAGAAAAT LLFKENLVGVIETFGLKGGYF
TTAGTG HLNHRLRFLTSKALACLMVF
HP1173 675 TTCCTTTATTTGGTAAAACGCCTGATTTTGGCAACTCTGCACGCCTTATCAATTTCAAAGGGAATACGAATTTTAATCA 676 PLFGKTPDFGNSARLINFKGN
AGCCACGCTCAATTTAAGGGCTAAAAATATCCATATCAATTTCCAAGGCGTTTCTACTTTTAAACAAAACTCTACGATG TLNLRAKNIHINFQGVSTFKQN
AATTTAGCTGAAAGTTCCCAAGCGAGCTTTAACGCTCTTAAAGTGGAAGGGGAAACGAATTTCAATCTCAATAACTCA AESSQASFNALKVEGETNFNL
AGCTTGTTGAATTTCAATGGCAATAGCGTTTTCAACGCTCCTGTGAGTTTTTATGCTAATCATTCTCAAATTTCTTTCA NFNGNSVFNAPVSFYANHSQI
CTAAATTAGCGACTTTTAATTCTGACGCTTCTTTTGATTTAAGCAACAACAGCACCCTGAATTTTCAAAGCGTTCTTTT TFNSDASFDLSNNSTLNFQSV
AAATGGTGCTCTAAACCTTTTAGGCAATGGCAGTAACAATCTAGCGATCAACGCTAAAGGGAATTTTAGTTTTGGGTC NLLGNGSNNLAINAKGNFSFG
TAAAGGGATTTTGAATCTGTCTTATATGAATCTATTTGGGGGGGATAAAAAAACTTCCGTTTATGATGTGTTGCAAGC LSYMNLFGGDKKTSVYDVLQA
CCAAAATATTGATGGCTTAATGGGGAATAACGGCTATGAGAAGATCCGTTTTTATGGCATACAGATTGACAAGGCTG MGNNGYEKIRFYGIQIDKADYS
ATTACTCGTTTGATAACGGCGTTCATTCTTGGAGATTCACTAACCCGCTCAATACGACTGAAACGATTACAGAAACCT HSWRFTNPLNTTETITETLHNN
TGCATAACAACCGCTTGAAAGTGCAGATCTCTCAAAACGGCGTTTCTAATAATAAGATGTTCAATCTCGCTCCTAGCT SQNGVSNNKMFNLAPSLYDY
TGTATGATTACCAAAAAAACCCTTATAATGAAACCGAGAATTCCTATAATTACACAAGCGATAAGGTTGGCACTTATTA ETENSYNYTSDKVGTYYLTSNI
TTTAACGAGCAATATCAAAGGCTTTAATCAAAACAATAAAACACCCGGGACTTATAACGCGCAAAACCAACCCTTACA NNKTPGTYNAQNQPLQALHIY
AGCCTTACACATTTACAATCAGGCTATCACTAAGCAAGATTTGAACATGATCGCCAGTTTGGGTAAGGAGTTTTTGCC QDLNMIASLGKEFLPKIANLLS
TAAAAT LNSPNSFETLFGIFEKYGITLN
LLKIINNFSNTTNYDFSQGNL
GQTNTKSWWFGGEGYKEPC
TCQMFRQTNLGQLLHSSTPYL
FRAKNIYITGTIGSGNAWGSG
FESGTNLVLNQAKIDAQGTDKI
QGGIEKLFGEKGLGNALSNIIY
NAIPKDLANMIPKDFGSKTLSS
VNNLLGVSAFKNAIMEILNSKT
GENGLLNALDPTERKKIDQML
HSSGFEKFIVKTLGI
HP1173 677 CTTCAGACGCTATCAATCAAGACATGATGCTTTATATTGAACGGATCGCTAAAATCATTCAAAAACTCCCTAAAAGGG 678 SDAINQDMMLYIERIAKIIQKLP
TGCATATTAATGTGAGAGGCTTTACGGATGATACGCCTTTAGTTAAAACCCGTTTTAAAAGCCATTATGAATTAGCCG VRGFTDDTPLVKTRFKSHYEL
CCAATCGCGCTTATAGGGTGATGAAAGTCCTTATACAATACGGCGTAAATCCTAACCAATTGTCTTTTTCTTCTTACG RVMKVLIQYGVNPNQLSFSSY
GCTCTACCAACCCTATCGCGCCTAACGACTCCCTAGAGAACAGAATGAAAAACAATCGTGTGGAAATCTTTTTTTCAA APNDSLENRMKNNRVEIFFST
CCGATGCGAACGATTTGAGTAAAATTCATTCTATTTTAGATAATGAGTTCAATCCCCACAAACAGCAAGAATGAATCG KIHSILDNEFNPHKQQE
CATGAATAAAAATTATCTTTTAATCTTTTTGTTGTTAGCGAGTCTTGTTGCTAGAGAGAAGGACGCTTCTTCAAACCTT
TTTGATTTGATTGATAAGGGGATCAACAGAGAACAAGAATTAAAAGAGCAGGAGCAAAAAACGCGCTTAAAACTGGC
TCAAAGCCCTTTAGTAGCGTTAGAGATTGTCCCCCAAGAAACGCCCTATTTAGAATGGCAAGGGGCTAGGGAGTCGT
ATTATTTAAAGGTGAGCGCTGTAGTGGAGAGCGTGGTTATCTTAAAAATTGACATCAATCAAGGGCGTTCTTGCTCG
CTCTACCCCACGCCTAAAAGCGTTTCTTTAGTGAGGAATCAAAGCGTAGCCTATGAAATTTTATGCGAAAACCAACCC
CTATGGATAGAAGTAAGCACCAATTTAGGCAAACGCACCTTTCAGTTTTAACCTGCAACCAACATTAAAGAATGCCTT
TAGCATTTTAAAACCCCTTTATCCAAATAAGTTTGGTTATAATGCTAGGCATGGGCGTTTTTAAACAATTGATCAAAGG
ATTGTATGAATGGTTGCTCCATTCTATAGATGTGGCTACGCAACATTTAGTTGCCATAGTATTAAAAATAAGCGTGGT
AAAATATTTGA
HP0325 707 AAAAGGTGAACACCCTAGCGTTTCCTACAAAAAGGCCATTTCCCAACAAAAGATTCAAGCTAAAATTGAAGAATTAGG 708 KGEHPSVSYKKAISQQKIQAKI
CGAAAACTATGAAAACGCCATTATTGAAGGCAAGATTGTAGGCAAGAATAAAGGGGGTTATATCGTGGAGTCTCAAG YENAIIEGKIVGKNKGGYIVES
GCGTGGAGTATTTCCTCTCCCGCTCGCACTCTTCTTTAAAGAATGACGCAAACCATATCGGCAAACGCGTTAAAGCG LSRSHSSLKNDANHIGKRVKA
TGCATCATTCGTGTGGATAAGGAAAACCATTCTATCAATATTTCTCGCAAACGATTCTTTGAAGTCAATGACAAACGA ENHSINISRKRFFEVNDKRQLE
CAACTTGAGGTTTCTAAGGAATTGTTAGAAGCCACAGAGCCGGTGTTAGGGGTTGTGCGCCAGATCACCCCTTTTGG EATEPVLGVVRQITPFGIFVEA
CATTTTTGTAGAAGCTAAGGGGATTGAGGGCTTGGTCCATTATTCTGAAATCAGCCATAAGGGACCAGTCAATCCTG VHYSEISHKGPVNPEKYYKEG
AAAAATACTACAAAGAGGGCGATGAAGTCTATGTCAAAGCCATCGCTTATGATGCAGAAAAAAGACGCCTTTCACTCT AIAYDAEKRRLSLSIKATIEDP
CCATAAAAGCGACTATAGAAGACCCATGGGAAGAGATTCAAGACAAGCTAAAACCCGGATACGCCATTAAGGTAGTG KLKPGYAIKVVVSNIEHYGVFV
GTGAGCAACATTGAACATTATGGGGTGTTTGTGGATATTGGTAATGATATTGAAGGCTTTTTGCATGTTTCTGAAATC EGFLHVSEISWDKNVSHPNNY
TCTTGGGATAAAAATGTCAGCCACCCTAACAATTACTTGAGCGTGGGGCAAGAGATTGATGTGAAAATCATTGACATT EIDVKIIDIDPKNRRLRVSLKQL
GATCCAAAAAATCGCCGCTTAAGGGTTTCTTTAAAGCAACTCACTAACAGGCCTTTTGATGTTTTTGAATCTAAACAC VFESKHQVGDVLEGKVATLTD
CAAGTGGGGGATGTTTTAGAAGGCAAAGTGGCGACTTTAACGGATTTTGGGGCGTTTTTAAATCTGGGTGGGGTGG LGGVDGLLHNHDAFWDKDKK
ATGGTTTGCTCCACAATCACGACGCTTTTTGGGATAAAGATAAAAAATGCAAAGACCACTATAAAATTGGCGATGTGA IGDVIKVKILK
TCAAAGTGAAAATCCTT
HP0325 709 CAAACTTAAACGCACCCAAACCCTTATTTGAATGTTTTGTAGGAGTTAATCTGGCCAAAGCCAAATATTATTCTAAAAA 710 NLNAPKPLFECFVGVNLAKAK
AGAAGAAAGAGAAAAAGAAAAGATGATCTTGAATTTTTGTAAGATATTTGAAATTATTCTTTTTGAAGCTATCCAAAAA EREKEKMILNFCKIFEIILFEAIQ
CAACCAAAGCCTGATTTTAAAAATAAAGACGAGCTTTTAGGGGATTATCCTAATCTTAAAAATTTAGATTCTTTAAGAG DFKNKDELLGDYPNLKNLDSL
AAGTGAGGGAAGACTTTTTGAAAAGAGCGTTTAAGAATGATGAAGCGAGTTTGGGAGCGTATGTGTTAGTGTTGCTT FLKRAFKNDEASLGAYVLVLLS
AGCTGTAAGTATTTTGAGAGCGTGTTTGAAAAAGTTCAAGAATGGCTAGATTTTATCGCTAGGCTTATTGCTTTGAGA SVFEKVQEWLDFIARLIALRGH
GGCCATGTGCACAAGATAACTA
HP0325 711 TCTTTCAAGCTTACGCCCCCTTAATGGCTAAAATCTGCTCGTATCAATCTAAGTTTGTGAGCGCTTTTTATCTTTATAC 712 FQAYAPLMAKICSYQSKFVSA
GCAACTCAAAAAAGAGCTTAAAACCTCTAAAGACACCCTCTATAAATTGTTACATGCGCTAGAAAAACAACGCATCCT LKKELKTSKDTLYKLLHALEKQ
TTTTTTAGTCCCTAATTTTGAAAACAATAAAACCAAATTGTATCTGTGCGATTTTGCCTTGCCTTATAGCCTGACTCCT PNFENNKTKLYLCDFALPYSLT
AGCCCTTCGCTTTTAAACGTTTTTGAAAACATGGTTTTTTTAGAGCTTTACAAGCAATTCCCAAAATATGAGCTTTACT NVFENMVFLELYKQFPKYELY
CCCATGATAACGGGATTTTTATCTTGCGCGAAAATTCTACCAACAAGCTCGCCCTCATCGCCCACGCTTTCCCCACG FILRENSTNKLALIAHAFPTPHF
CCCCATTTTTTAGAAAAACAGCTTTTATGGTGCCATAAACATGGGTTTTTAAACATTATAGTCGTTTCTATCAACGCCC WCHKHGFLNIIVVSINAPISATN
CTATTTCAGCAACCAATACCCCCTACAAACACCTTAATTTCATTGATTTTTCTTTGGATATTCAATCTATTTTGGTATAA LNFIDFSLDIQSILV
TAGTTATTATGTATTTATGTATTTATGTATTTATGTATTTATGTATTTATGTATTTATGTATTTATGTATTTAATATTATTAC
TAAAATTTTTAATTGATTATTTAATATTTTTTATTTTTTCTTAAGCTAACCCACTATAAGCTATCCCAGTATTTCGCTTTTT
TGAGAAATGCTACCTACCCTAAAAGCGTAGGGTGTTTTTAATGATTTTTTATAGAAAGGAAGCTACAATGAACGCATT
GAAAAAATTAAGTTTCTGCGCCTTGTTATCC
HP0325 713 CTTGCGGGTTCATCTATGAGATCAGCGAGTTCATGAAAGCCTATACCGCATTGCTAAAAAAACAAGACCGATACGTC 714 CGFIYEISEFMKAYTALLKKQD
TATTTATTGAGGTATCTCCCCTCTAGGTATTGGGCCAGCATTTTAACGACTGCCCTTTATGTCAAATACCCTGATTTTG RYLPSRYWASILTTALYVKYPD
ACGCTTTGAAAAAGCTTTTGGTGTCTTATTATTACCAAACTTGGATTGCAGGAGGCACGATCACGCGCATCAAGCAAA KLLVSYYYQTWIAGGTITRIKQ
CCAGTATCAACATTATCAAAAACGTTAAAAGCAATAAGAGCGTTGAAACCATCAAAGAGCTTATATTGAATAGCATCG VKSNKSVETIKELILNSIDSYNT
ACTCTTATAACACCTTTGATCAATACCTCTATAACTTATGGGATAGCTCTTCTGTTTATCATAGCAAATGGGTGCGTCC NLWDSSSVYHSKWVRPVLAL
TGTCTTAGCCCTAGCTAATTATTTCATGGCAGATGAAGAGAAACCCCATTTTATCGCTATGGATGCCGAAACCCAAGT DEEKPHFIAMDAETQVEHILP
GGAGCATATTTTGCCACAAACGCCCAAAAGAGGCAGTCAATGGAACGCGGATTTTGACAAAGAAAAAAGAGAAGAAT SQWNADFDKEKREEWVNNIA
GGGTAAATAATATCGCGAATTTAACCCTTTTAAAGCGTAAAAAGAACGCGCATGCTTTAAACGGGGATTTTGATGAAA RKKNAHALNGDFDEKRKIYGG
AAAGAAAAATTTATGGAGGCAAAGACACGAGCAAAGTGATTAGCTGTTATGACATCACTAAAGAATTGTATAGCAATT VISCYDITKELYSNYRKWNEK
ATAGGAAGTGGAATGAGAAGTCCCTCCAAGAGCGATACAAATCTTTGTATAACACTATCACGCCTGTTTTACACATAG KSLYNTITPVLHIEGQEDDFED
AGGGGCAAGAAGATGATTTTGAAGATGAT
HP1067 727 ATCAAGCGATAACCCGCTAGCCGATGAGCCGGATTTGGATTACGCTAACATGAGCGCTGAAGAAGTGGAAGCAGAG 728 SSDNPLADEPDLDYANMSAEE
ATTGAACGGCTGCTGAACAAACGCCAAGAAGCCGATAAAGAACGAAGAGCTCAAAAAAAACAAGAAGCCAAACCCA RLLNKRQEADKERRAQKKQEA
AACAAGAAGTTACCCCAACAAAAGAAACCCCCAAAGCCCCTAAAACCGAAACTAAAGC VTPTKETPKAPKTETK
HP0650 729 CCTTAAAAAATTGGCTATCGCACCCCATGGCTTTTATTGAGTTTGTTTTAAGGCGCATGGCGGATTCCTATCTTTTAG 730 LKNWLSHPMAFIEFVLRRMAD ACGATCCTTTAGAAAAAGATAAGGCTCTTAAAGAAATGTTAGGGTTTTTAAAAAACTTTTCCTTGCTTTTACAAAGCGA PLEKDKALKEMLGFLKNFSLLL ATACAAGCCCTTAATCGCTACCCTTTTGCAAGCGCCTTTGCATGTTTTAGGGATTAGAGAGCGAGTCTCTTTTCAGCC PLIATLLQAPLHVLGIRERVSFQ TTTTACCCCAAAACAGAAAAACCCAATCGCCCTCAAAGGTTTGCGCATGTTTCTAGCGCGCCCAGTTTGGAATTTTT TEKPNRPQRFAHVSSAPSLEF AGAAAAATTAGTGATCCGCTATCTTTTAGAAGACAGAAGCTTGTTGGATTTAGCGGTGGGCTATATCCATAGTGGGG RYLLEDRSLLDLAVGYIHSGVF TATTCTTGCATAAAAAACAAGAATTTGACGCTTTGTGTCAAGAAAAATTAGACGACCCTAAATTAGTTGCGTTATTATT EFDALCQEKLDDPKLVALLLDA AGATGCGAATTTACCCCTAAAAAAAGGGGGTTTTGAAAAGGAATTGCGTTTGTTGATTTTGCGCTATTTTGAGCGCCA KGGFEKELRLLILRYFERQLKEI ACTCAAAGAAATCCCTAAAAGCTCGCTCCCTTTTAGCGAAAAAATGATCTGTTTAAAAAAGGCTCGCCAGGCCATTAT PFSEKMICLKKARQAIMKLKQG GAAATTAAAACAAGGAGAATTAGTCGCCATATGAAAAAGATTAAAGCTTTAGCCTTATTCAGTGGGGGGTTGGATAGT TTGTTGTCCATGAAATTACTCATTGATCAAGGCATTGAAGTAACCGCTTTGCACTTTAATATAGGGTTTGGGGGGAAT AAAGATAAAAGAGAGTATTTTGAAAACGCCACCGCGCAAATTGGGGCTAAGCTCTTGGTGTGCGATATTAGAGAGCA GTTTTTTAACGATGTGTTGTTCAAGCCCAAATACGGCTATGGGAAATATTTCAACCCTTGCATTGACTGCCATGCCAA CATGTTTAGGAACGCTTTTTATAAAATGCTTGAATTGAACGCGGATTTTGTTTTGAGCGGGGAAGTGCTAGGGCAAC GCCCTAAATCC
HP0650 731 ACAACGCGTCCAGTTTGACGATCCAAGAAAGGCTCTACCGCCATGAAATCAGCCGCTTACAGGTTAAGACTGATGAA 732 NASSLTIQERLYRHEISRLQVK
ACCTTAAAACTCATTAAAGAAGCCAAAAAGCGTTTGAATTATAACGATGATATACGAGATGTTTTGCAAGGGCTTTTG LIKEAKKRLNYNDDIRDVLQGL
AATATTGTGCCGGATTCCATCACTATTAATAGCATTGAAATAGACCAGCAAAGCGTGGTTGTTAGCGGTAAAACCCCT SITINSIEIDQQSVVVSGKTPSK
TCTAAAGAAGCCTTTTATTTTTTGTTTCAAAACAAACTAAACCCCATGTTTGATTATTCTAGGGCGGAATTTTTCCCCTT FQNKLNPMFDYSRAEFFPLSD
AAGCGATGGGTGGTTTAATTTTGTCTCCACCAACTTTTCTAATTCCTTACTGATAAAAAATCCGGAGTCTATTAAATGA VSTNFSNSLLIKNPESIK
AGCCATTGCATTTTTCACACCTGGACAGAGAGCAATCAGGCGATGTGGGGTTTATCATTAAAAACCTCGTTTTTTTAG GGGTTTTTTCCTTATTGGGTTGGTTGAATACCGAGTATTTTCTATGGCCTAGCATGCTGGAATTAAAAAAAATCCTTTT AGAAGAAAATCGTAAAAAAAGCGTTTTAGAATACGCGCAAAGGCATTTTGAAACAGCCTTAACCAATTACCGCAATCA AAAAGAAACAAGCGAATCCTTGTTAAAGATTTTTAATGATGAAGAGTCCAAGCGGATTTTAGAAAAGATTTTAAAAAAA TGTTTTGATGCCTATAAAATCAAACCCTTGCTCTCTCAAAACTCCCTTCAAAAAACCCAGTTTTTTATCATGGCCAGAG CGAGCGAATTAGAAAAGACCTATCTTTTTTTCACCTTAATCAACAAGTATTTACCGAGCGCTCAAAGCCAATTGCCCT TAAAAATTTCTAAAGATGATGAAGGGTTGTTGGTGCAATTTGGCGTGAGTATTGATCTCCAATAGGATTAGGGGTAGT TTAATGCAAGATTTTGATTTTAGTTTTAATCCTAAGGCATGCGAGGGTTGTGGGGCAAAGTGTTGCGTGGGGGAAAG CGGGTA
HP0650 733 GAAAAACAACAATAATGTTAACGAGAAATTAGCAGGATTTGGGAAAGAAGAAGTAATGACCAATTTTGTTAGCGCCTT 734 KNNNNVNEKLAGFGKEEVMTN
TTTGGCAAGCTGCAAAGATGGTGGCACATTGCCTAATGCAGGGGTTACTTCTAACACTTGGGGGGCGGGTTGCGCG LASCKDGGTLPNAGVTSNTWG
TATGTGGGAGAGACGATAAGCGCCCTAACCAACAGCATCGCTCACTTTGGCACTCAAGAGCAGCAGATACAGCAAG VGETISALTNSIAHFGTQEQQI
CCGAAAACATCGCTGACACTCTAGTGAATTTCAAATCTAGATACAGCGAATTAGGCAACACCTATAACAGCATCACCA ADTLVNFKSRYSELGNTYNSIT
CCGCGCTCTCCAAAGTCCCTAACGCGCAAAGCTTGCAAAACGTGGTGAGCAAAAAGAATAACCCCTATAGCCCTCAA PNAQSLQNVVSKKNNPYSPQ
GGCATAGAGACCAATTACTACCTCAATCAAAATTCTTACAACCAAATCCAAACCATCAACCAAGAACTAGGGCGTAAC YLNQNSYNQIQTINQELGRNPF
CCCTTTAGGAAAGTGGGCATCGTCAATTCTCAAACCAACAATGGTGCCATGAATGGGATCGGCATTCAGGTGGGCTA VNSQTNNGAMNGIGIQVGYKQ
TAAGCAATTCTTTGGCCAAAAAAGAAAATGGGGCGCTAGGTATTACGGCTTTTTTGATTACAACCATGCGTTCATCAA RKWGARYYGFFDYNHAFIKSS
ATCCAGCTTTTTCAACTCGGCTTCTGACGTGTGGACTTATGGTTTTGGAGCGGACGCGCTTTATAACTTCATCAACGA SDVWTYGFGADALYNFINDKA
TAAAGCCACCAATTTCTTAGGCAAAAACAACAAGCTTTCTTTGGGGCTTTTTGGCGGGATTGCGTTAGCGGGCACTT NNKLSLGLFGGIALAGTSWLN
CATGGCTCAATTCTGAGTACGTGAATTTAGCCACCGTGAATAACGTCTATAACGCTAAAATGAATGTGGCGAATTTCC ATVNNVYNAKMNVANFQFLFN
AATTCTTATTCAATATGGGAGTGAGGATGAATTTAGCCAGATCCAAGAAAAAAGGCAGCGATCATGCAGCTCAGCAT NLARSKKKGSDHAAQHGIELG
GGGATTGAGTTAGGGCTTAAAATCCCCACCATCAACACGAACTATTATTCCTTTATGGGGGCTGAACTCAAATACAGA TNYYSFMGAELKYRRLYSVYL
AGGCTCTATAGCGTGTAT
HP0868 819 GATGGAAGACTACGCAAGCAGAACCGCTGGAGCACTGGAGCGACTTGATAAGATTGTTGAAACAGAACAGAAGAAT 820 MEDYASRTAGALERLDKIVETE
CAACAAACTAAATTGGACACAGAAAATTTGAAAATAATTATTGAAACTTTGAGAAGTAAAATCAATGGGAATCAGCAAA QTKLDTENLKIIIETLRSKINGNQ
AGATGCTTGATAAAAGTAAAGAAATGAGCAGAAATTTTAAGCTTGATAGCACTAAAAACGAGATAGACGCAATTAAAG KSKEMSRNFKLDSTKNEIDAIKD
ATTTGATTAAAAAGGCTAATGAGCAAATAGCCAATTATAATGAGATGATAAAGGATATTGAAAAACAGAAAAAGAGTT NEQIANYNEMIKDIEKQKKSCKE
GTAAGGAACAAACTTGGAAATTTCTAGTCAATGAATTTAAAAGTGATATACAAGAATATAATAAAAAGTATTGCGGTTT FLVNEFKSDIQEYNKKYCGLEK
GGAGAAAGGAATAAACAATTTAGAGAAAGCAATTAGTGAAAATCAAGAAGAGGTAAAGAAATTAGAAAAT KAISENQEEVKKLEN
HP0748 821 CATGCGTTTGGCTAGGGGGTTTGCCCCCCTTTACCTCACCTTGCCTAAACGCTCTAATGGTTCGCCAAAAAAGATTT 822 MRLARGFAPLYLTLPKRSNGSP
TAGCGCTTGGAGCGCAACAAAAAGGGCATTTTAGCTTATTGGATAGCGGAACTTCCATTCTTTTACTCTCGCCTTTTT GAQQKGHFSLLDSGTSILLLSPF
GTGGGGATTTGAGCGTTTTAGAAAATGAAAAACACTTTAAAGAAACTTTGAATTTTTTCTTAAAAACCTATGATTTTAAA SVLENEKHFKETLNFFLKTYDFK
CCCACCATCTTAGCTTGCGACAAGCATCAAAACTACACCACCACTCAAATGGCTTTTGATTTTAATACGCCCTTGTTG CDKHQNYTTTQMAFDFNTPLLQ
CAAGTCCAGCACCACCATGCCCACTTTTTAGCGAGCGTCTTAGACGCATTGTTACAAGATCCGCATTTAAATCACCC HAHFLASVLDALLQDPHLNHPFI
CTTTATAGGCATTGTCTGGGATGGGAGTGGGGCTTATGAAAATAAGATTTATGGGGCGGAGTGTTTTGTGGGGGATT DGSGAYENKIYGAECFVGDLERI
TGGAACGCATTGAAGAAACCGCCAGGTTTGAAGAATTTTGGCTTTTAGGGGGGCAAAAAGCGATCAAAGAGCCTAG RFEEFWLLGGQKAIKEPRRLVL
ACGCCTGGTTTTAGAAATCGCTTTAAAACACCAACTCAACAAGCTTTTAAAGCGCGTTCAAAAGCATTTTAAAGAAGA QLNKLLKRVQKHFKEDELEIFQQ
CGAATTAGAAATTTTCCAACAAATGCATGACAAAAAAATTCAAAGCATAGCCACCAATTCCATAGGGCGTTTGTTTGA KIQSIATNSIGRLFDIVAFSLDLTG
TATAGTAGCGTTTAGTTTGGATTTAACAGGAACGATTAGCTTTGAAGCAGAGAGCGGGCAGGTTTTAGAAAATCTAG AESGQVLENLALQSDEIAFYPFE
CCTTACAAAGCGATGAGATCGCTTTTTACCCTTTTGAAATCAAAAACAGCGTGGTGTGTTTGAAAGAATTTTATCAAG VCLKEFYQAFEKDLGVLEPERIA
CGTTTGAAAAGGATTTGGGCGTTTTAGAGCCTGAACGCATCGCTAAGAAATTTTTTAACAGCCTAGTAGAAATCATTA NSLVEIITALIVPFKEHVVVCSGG
CCGCTTTAATCGTGCCTTTTAAAGAGCATGTGGTGGTGTGCAGTGGGGGCGTGTTTTGCAACCAATTATTATGCGAA QLLCEQLAKRLRGLKRQYFFHK
CAATTAGCCAAACGA DSSIPIGQALMAYFNPTIIKKG
HP0748 823 GGTTGCCAATTCGCTTTAAAAATCCAAAAAGAATCCCAAAACAGAAGCCCCTTTGAGCTTTACTCGCCTTTAAAACGC 824 GCQFALKIQKESQNRSPFELYS
CAGATTTTACCCGCAAGTATTGAAGAAAGAACGCAAGTTTTTAGAACATTAGACATGGCTAAAAAGGACGCTAATAAG ILPASIEERTQVFRTLDMAKKDA
CCTTTTTTAGCCCAAAAAACGATCGCAACTTACCGCTTATTAAATGGAGGCGTGTGGCTTTCTAAAAATTCAAACCCC AQKTIATYRLLNGGVWLSKNSN
TTGAATTGCTGCATTCTGGCGCGCTCTAAAAGTAAGGCTAAAGTTAGGATCAACGATTTAAGGTGGGTTTTTTCCCAG ILARSKSKAKVRINDLRWVFSQR
CGTTTAAGCGTGCTAGTGGGTTATAGCCAAAGAGATGAAACCTTGTTTTTAACTTTAGAAGGCCTTAATACTCTTATG GYSQRDETLFLTLEGLNTLMAK
GCGAAAAACTACGATAATTTAAAAGAGTTAAACCTTAACCCCTTAAATTATGAAGAAGAGTTGTCTTTAAGGGCGTTA KELNLNPLNYEEELSLRALVSGS
GTGAGCGGGAGCGAGAGTATTAACCCTATTATCGTGCTAGAAGAACGCACAGAAAAGACCCTTTTTGTAGAAATTAA IIVLEERTEKTLFVEIKSVFQEEK
AAGCGTTTTTCAAGAAGAAAAGGTGTTTTATTTGCTGTA
HP0748 825 CTAGGGCCTGACAGCGCGGGGTATAGAGTTGGCACTTGTTGGAAAGACAGCGGGTTAGACACGGTTTCAGTAACCG 826 LGPDSAGYRVGTCWKDSGLDT ATTGCCATATTGTTTTAGGCTATTTGAACCCGGATAATTTCTTAGGCGGTTTGATCAAATTAGATGTGGATAGGGCTA CHIVLGYLNPDNFLGGLIKLDVD AAAAACACATTAAAGAACAAATCGCTGATCCGCTAGGCATTAGCGTAGAAGATGCGGCTGCTGGTGTGATTGAATTG IKEQIADPLGISVEDAAAGVIELL CTTGATTTGGAGCTTAAAGAATACTTGCGATCCAACATTAGCGCTAAAGGG EYLRSNISAKG
HP0748 827 TAAAAAAGCCGCTTTGGCCACTCAAAAAGCCGTTTTAGCTTTAGGTTTAAAGATTTTCCCTAAAAGCCCAAGCTTGAG 828 KKAALATQKAVLALGLKIFPKSP
CATGACAACGATTGTTAATGAGCATGCCAAAGAATTGAGAAACCTTTTAAAAGAAAAATACCAGGTGCAATTTGCGGG TIVNEHAKELRNLLKEKYQVQFA
CGGTCAAGAGCCTTATAAAGATGCGCTCATTCGTATCAACCACATGGGGATCATTCCTGTTTATAAAAGCGCTTACG PYKDALIRINHMGIIPVYKSAYAL
CTTTAAACGCCCTAGAGTTAGCCCTAAACGACTTGGATTTAAGGGAATTTGATGGCGTGGCGAACGCAACTTTTTTAA ALNDLDLREFDGVANATFLKQY
AGCAATATTATGGAATTTAAGGATCACAATGCATTATTCTTATGAAACCTTTTTAAAAGACAGCTTGGAATTAGTCAAA
CAAGTAGAGCAAATTTGCGGTGTCCCAGAAGCCCTTGTGTGCGTGATGCGAGGGGGCATGACTTTAGCGCATTTTTT
GAGTTTGCACTGGAATTTAAGGGAAGTTTATGGCATCAATGCGATTTCTTATGAC
HP0187 889 GCGAATTCGTGTGCTTTGAAGTCTTTAGCTTGAATGAAAAAGATTTTGAAAAAATCGCTCTGGTTTTAGAAGAAATTCT 890 EFVCFEVFSLNEKDFEKIALVLEEI TAATAAAGCTTAAAAATTCGCTATAATAAAATTTCTTTTAAACGCGCCATATCCCCCACAAAACGCTAGAGAATGATAG AAAACGACAGAACATCAATTTAAAGGAACTTAAGAATGGAAAAAATCAGCGATCTTATAGAATGCATTGCGTATGAAA AAAATTTGCCTAAAGAGATGATTTCAAAAGTGATTCAAGGCTGTTTGTTAAAAATGGCGCAAAATGAGTTAGACCCCC TAGCACGCTACTTG
HP0958 891 ATCATTTAATGGCTTTAGAAAAGGAATTGAGCGCGATTGCGATAGATATAGCTAAAGAAGTGATCCTTAAAGAAGTGG 892 HLMALEKELSAIAIDIAKEVILKEV
AAGACAACAGCCAAAAAGTAGCCCTTGCTTTGGCTGAAGAGCTTTTAAAAAACGTTTTAGACGCAACGGATATTCATT QKVALALAEELLKNVLDATDIHLK
TAAAAGTCAATCCCTTGGATTACCCTTATTTAAACGAGCGTTTGCAAAACGCTTCTAAAATCAAATTAGAGAGCAATGA DYPYLNERLQNASKIKLESNEAIS
GGCTATTTCTAAAGGAGGCGTTATGATCACTAGCTCTAATGGGAGCCTTGATGGGAATTTAATGGAGCGCTTTAAAA VMITSSNGSLDGNLMERFKTLKE
CGCTCAAAGAAAGCGTGTTGGAAAATTTTAAGGTGTGATTTTGCAAAATAAAACTTTTGATTTAAACCCTAACGATATT NFKV
GCAGGCTTGGAGTTGGTGTGTCAAACGCTACGGAATCGTATTTTAGAAGTGGTGAGCGCTAATGGGGGGCATTTAA
GCTCTTCTTTAGGGGCTGTGGAGCTGATTGTGGGCATGCATGCCTTATTTGATTGCCAAAAAAACCCTTTCATTTTTG
ACACTTCGCACCAAGCTTACGCCCACAAGCTTTTAACCGGGCGCTTTGAAAGCTTTAGCACTTTAAGGCAATTCAAG
GGTTTGAGCGGCTTTACTAAACCCAGCGAGAGCGCATACGATTATTTCATCGCCGGGCATAGTTCCACTTCGGTGTC
TATAGGCGTTGGGGTGGCTAAAGCTTTTTGTTTGAAACAAGCGCTAGGCATGCCCATAGCTTTATTAGGCGATGGGA
GCATTAGTGCAGGGATTTTTTATGAAGCCTTAAACGAACTGGGCGATAGGAAATACCCCATGATCATGATTTTAAACG
ATAATGAAATGAGTATCAGCACGCCTATTGGAGCCTTATCC
HP0958 893 TAAGTGAAGCCGATAAGGTCATTAAGGCCACTAAAGAAACTAAAGAGACCAAGAAAGAAGCTAAACGACTCAAAAAA 894 SEADKVIKATKETKETKKEAKRLK
GAAGCTAAACAGCGCCAACAGATCCCTGATCATAAGAAACCTCAATATGTCTCTGTTGATGACACAAAAACTCAAGC QRQQIPDHKKPQYVSVDDTKTQA
GCTGTTTGATATATACGACACCTTGAATGTGAATGACAAAAGCTTTGGGGATTGGTTTGGTAATAGCGCTTTGAAAGA YDTLNVNDKSFGDWFGNSALKD
CAAAACCTATCTCTACGCTATGGATCTATTGGATTACAACAACTATTTATCCATAGAAAACCCCATTATCAAAACAAGA YAMDLLDYNNYLSIENPIIKTRAM
GCAATGGGAACTTATGCGGATTTGATTATCATCACAGGCTCATTAGAGCAAGTCAATGGGTATTACAACATTCTAAAA DLIIITGSLEQVNGYYNILKALNKR
GCGCTCAACAAACGCAACGCTAAGTTTGTGTTAAAAATCAATGAGAACATGCCTTATGCCCAAGCGACTTTTTTAAGA VLKINENMPYAQATFLRVPKRSD
GTGCCAAAAAGAAGCGATCCTAATGCCCACACGCTTGATAAGGGAGCGTCAATTGATGAGAATAAGCTTTTTGAACA TLDKGASIDENKLFEQQKKMYFN
ACAAAAGAAAATGTATTTCAATTACGCTAACGATGTGATCTGCAGACCCGATGATGAAGTGTGTTCGCCCCTAAGAG VICRPDDEVCSPLRDEMVAMPTS
ATGAGATGGTAGCTATGCCCACTAGCGATAGCGTTACTCAAAAACCCAATATCATTGCTCCTTATAGCTTGTATAGAC TQKPNIIAPYSLYRLKETNNANEA
TAAAAGAGACAAATAACGCCAATGAGGCCCAACCATCACCTTATGCCACTCAAACCGCTCCCGAAAACAGCAAAGAG YATQTAPENSKEK
AAG
HP0958 895 GAATGCGGTAGAGTTTTACTTGCACCCTAATGGCGATATTACCGATCTTAAAATCATCATTGGCTCTGAATACAAAAT 896 NAVEFYLHPNGDITDLKIIIGSEYK
GCTTGATGACAACACCTTAAAGACCATTCAGATCGCTTATAAGGATTACCCACGCCCCAAAACTAAAACCCTCATTCG NTLKTIQIAYKDYPRPKTKTLIRIR
CATCAGAGTGCGTTATTACTTAGGGGGCAATTAAAAAATGGAAATCACGCTTTTTGACCCCATAGACGCCCACTTGC LGGN
ATGTGCGAGAAAACGCACTTTTAAAAGCGGTGTTAGGATATTCTAGCGAGCCTTTTAGTGCTGCAGTGATCATGCCT
AATCTCAGTAAGCCCTTGATTGACACTCCAACCACCCTTGAATACGAAGAAGAAATTTTAAACCATTCTTCAAACTTCA
AGCCTCTAATGAGTTTGTATTTCAATGATGGCTTGACTTTAGAAGAATTGCAATGCGCAAAAGAAAAAGGCGTCAGGT
TTTTAAAGCTCTACCCCAAAGGCATGACCACAAACGCGCAAAACGGCACTTCGGATTTGTTGGGTGAAAAAACT
HP0958 897 TAAAAGCCTTTTTGAAACCTTAGAAGCTCAAATCATTCCCCCTCTCTTTCCTACTGAAACCTCTCAAAAAATCGCTATG 898 KSLFETLEAQIIPPLFPTETSQKIA GATATTATCAGCGGGTTGAATAATGAGGGGTATTTTGAAGAAAATATTGAAGAAAGGGCTAGAATTTTAGGGGTAGA GLNNEGYFEENIEERARILGVESE GAGCGAAGTTTATGAAAAAGTGCGCAAGCGTTTTAGTTACCTTAATCCCGCTGGCATTGGTGCTAAAGATGTGAAAG KVRKRFSYLNPAGIGAKDVKESF AGAGCTTTTTATTCCAGTTAGAGAGTAGGGAATTAGACGATAATGAGCTTTATGAAGAAACGCGAAAAATCATTTTAA ESRELDDNELYEETRKIILNLEKH ATTTAGAAAAACACCATGAATTTTCTAAAGATTTTTATTATGAAAAGGCTTTAAAGATTTTAAAATCCTTTAAAAACCCC KDFYYEKALKILKSFKNPPAIEFLE CCAGCCATTGAGTTTTTAGAAAAAGAAATAGAAGTCATTCCTGAACTTTTTATTGTAGAAGTGGATAATGGAATCATCG VIPELFIVEVDNGIIVRLNDESYPTI TGCGTTTGAATGATGAGAGCTACCCGACAATCAGTTTGGAAGAAAATCGCTTTAAGGATAGCGGCTATTTAAAAGAAA NRFKDSGYLKEKLKEAKDLIDAL AATTAAAAGAGGCTAAAGATTTGATTGATGCGCTAAATTTGAGAAAAGCCACGATTTATAAAATCGGTCTGATGCTTT ATIYKIGLML
Figures imgf000197_0001 through imgf000261_0001
HP0047 1443 CGGTTGAAAAAGAGCATGGTGCTACGCCCCCAAAAGAAGCCAAAATAGGCGTTAGAAAATTCTATCGGCATAAAAAA 1444 VEKEHGATPPKEAKIGVRKFYRH
TGGGTGGATGCAGATGTGTGGCAAATGGAAAAATTACTGCCTGGAAATGAAGTCATAGGACCTGCGATCGTGGAAT VDADVWQMEKLLPGNEVIGPAI
CAGATGCGACCACTTTCGTGATACCCAAAGGCTTTGCGACAAGACTAGACAAACACCGATTGTTCCACTTGAAAGAA TTFVIPKGFATRLDKHRLFHLKEI
ATTAAATAAAGGAGTTCAAAATGGCAMTTTATTGAAAAACGGCAAAACTTTAAAACAAGCTAGAGATGAAATCCTAG
CCAGGACAGAAAAAACAGGGCATTATAATGGTCTCAAAAAACTAGAGTTTAAAGAAAGAGATCCGATTGGTTATGAG
AAGATGTTCTCTAAATTAAGAGGCGGTATCGTGCATGCCAGAGAAACGGCTAAAAGGATTGCGGCAAGCCCTATTGT
TGAGCAAGAGGGAGAATTGTGCTTCACGCTTTATAACGCTGTGGGCGATAGCGTGCTGACTTCTACAGGTATCATTA
TCCATGTAGGGACTATGGGATCAGCTATCAAATACATGGTAGAGAATAATTGGGAAGATAACCCAGGCATCAATGAC
AAGGATATTTTCACCAATAACGACTGCGCGATTGGGAATGTGCACCCATGCGATATTATGACTCTTGTGCCTATTTTC
CACGATGAAAAATTGATTGGGTGGGTAGGCGGCGTTACGCATGTGATTGATACCGGTTCGGTTACTCCAGGATCGA
TGAGCACTGGACAGGTTCAAAGATTTGGGGATGGATACATGATCACTTGCCGTAAGACAGGAGCGAATGATG
HP1497 1445 ACGAATCTTTAGAA TGGCACGAAATTCAAAAAGAGGTTTGTGTTTGAATTGCAAAATGATTTAATATTTTTGAGAAC 1446 ESLENGTKFKKRFVFELQNDLIF
AAAAAGGAATACCTATAGCGCTAAAACCGAGCATGAAAATTTTCAAGAAAGATTGTTAGCGGCTAAAGACACTTTTAG NTYSAKTEHENFQERLLAAKDT
ATTGATCAAAGCAAATGATTTTAAAACTAAAATTTACCCCTATCAAATCACCCTACGCTTAAAAACAAAACACATCATT NDFKTKIYPYQITLRLKTKHIIVFR
GTTTTTAGATGGTTGAAAAAAGAAATCATCAAACGCTTTGTTAAAAAAGATCTTTTCACCGAGACCCTATCCATAAAAA EIIKRFVKKDLFTETLSIKITDKKG
TAACGGATAAAAAAGGGCAAAAATACGCCTTAATCGCAAACTATAACCATGCAAGCGATATTATTGAGCTAATGCTTG LIANYNHASDIIELMLDDKTYTTT
ATGATAAAACCTACACCACCACCCTCTACTATCAAAAACCGCTTTTTGACCTGATTAAAAATTCTAATTTTAATTTGACA PLFDLIKNSNFNLTIKDNTLEINQ
ATCAAAGACAACACACTAGAAATAAATCAAGCTAAAAAACGCCTATTCGTTTAAAGCGAAAA FV
HP1496 1447 CCTTAGCGATCTTGTGAAAAATACCTCTTCTGTCGCTATCAATTTTGACAAAGAAGAAGAATl TTAAACCTAAACACC 1448 LSDLVKNTSSVAINFDKEEEFLN CTAAAAGACTATGAATTAGCCGTTCAAAT rTAAAAAAGAGGGCCAATGGCTGAAGAAGAAAAAACCGAACTCCCTAG DYELAVQILKKRANG CGCGAAAAAAATCCAAAAAGCCAGAGAAGAAGGCAATGTGCCTAAGAGCATGGAAGTGGTGGGGGTTTTGGGGTTA
TTGGCCGGGCTAATTAGTATΠTTGTTTTTTTTATATGGTGGGTGGATGGCTTTAGCGAAATGTATCGCCATGTGTTG AAAGATTTTTCCCTAGATTTCAGTAAAGAAAGCGTTCAAGAGCTGTTTAACCAACTGGCTAAAGACACTTTTTTATTGC TTTTACCGAT FTAATCAT" AGTCATTGAGCCTAAATTT TAAAATCAACCCTATCAATGGCGTCAAAAACCTTTTTTCTTTAAAAAAGCTCCTTGAT
GGGAGTTTGATCACCTTAAAAG' TTTTTTAGCTTTTTTTCTGGGGTTTTTCATCTTTTCTTTGTTTTTAGGGGAATTAAA CCATGCGGCTCT ΓTGAATTTACAAGGCCAGTTGTTGTGGTTTAAAAATAAGGCGTTATGGCTCATTTCTTCGCTTTTA TTTTTATTTTTTGTCTTGGCTTTTATAGATTTAGCGATCAAACGCCGCCAATACACCAACTCTTTAAAAATGACTAAACA AGAAGTTAAGGACGAATACAAACAGCAAGAAGGGAACCCAGAAATCAAAGCCAAAATCCGCCAAATGATGCTAAAAA ACGCCACGAATAAAATGATGCAAGAAATCCCTAAAGCCAATGTCGTGGTTACTAACCCCACCCATTACGCCGTCGCT CTCAAATTTGATGAAGAACACCCTGTGCCTGTGGTAGTGGCTAAAGGCACGGATTATTTAGCCATTAGGATTAAGGG
CATCGC
HP0045 1449 TGAATGAGATTATTTTAATCACTGGTGCCTATGGCATGGTGGGGCAGMCACGGCGTTGTATTTTAAAAAAAATAAGC 1450 NEIILITGAYGMVGQNTALYFKK
CTGATGTTACTCTACTCACTCCTAAAAAGAGCGAATTGTGTTTGTTGGATAAAGACAACGTTCAAGCTTATTTGAAAG TLLTPKKSELCLLDKDNVQAYLK
AATACAAGCCTACAGGCATTATCCATTGTGCCGGGAGAGTGGGGGGCATTGTCGCTAACATGAATGATCTTTCAACT TGIIHCAGRVGGIVANMNDLSTY
TACATGGTTGAGAATTTACTGATGGGCTTGTACCTCTTTTCTAGCGCTTTAGATTCGGGCGTGAAAAAAGCCATTAAT LLMGLYLFSSALDSGVKKAINLA
CTAGCGAGCTCTTGCGCTTACCCTAAATTCGCCCCCAACCCTTTAAAAGAGAGCGATTTATTGAACGGCTCTTTAGA PKFAPNPLKESDLLNGSLEPTN
GCCAACGAACGAAGGCTACGCTTTAGCCAAACTCTCTGTGATGAAGTATTGCGAGTATGTGAGCGCTGAAAAGGGC AKLSVMKYCEYVSAEKGVFYKT
GTTTTTTATAAAACCTTAGTGCCTTGCAACCTTTATGGCGAGTTTGACAAGTTTGAAGAAAAAATAGCGCACATGATA LYGEFDKFEEKIAHMIPGLIARM
CCAGGGCTTATCGCTAGGATGCACACCGCTAAATTAAAAAATGAAAAAGAGTTTGCGATGTGGGGCGATGGCACGG KNEKEFAMWGDGTARREYLNA
CCAGGAGAGAGTATCTAAACGCTAAAGATTTAGCCAGATTCATTTCTCTAGCTTATGAGAATATCGCTTCAATCCCTA FISLAYENIASIPSVMNVGSGVD
GCGTGATGAATGTTGGCTCTGGCGTGGATTACAGCATTGAAGAGTATTACGAAAAAGTCGCTCAGGTTTTAGACTAT YYEKVAQVLDYKGVFVKDLSKP
AAGGGCGTGTTTGTGAAAGACTTATCCAAACCAGTGGGCATGCAGCAAAAGCTTATGGATATTTCCAAACAAAGGGC QKLMDISKQRALKWELEIPLEQ
TTTAAAATGGGAATTAGAAATCCCTTTAGAGCAGGGCATCAAAGAAGCTTATGAGTATTATTTGAAGCTTTTAGAGGT EYYLKLLEV
TTGAAATAAAATCAAGGCTCTTATGGAGCTATAAAAACGCTCCGCTATTAAGCGTTAGGCTAGTGGTAGTTGCGATAT
TGTGATCGTTTTAGCTTC
Figures imgf000263_0001 through imgf000267_0001
HP1259 1491 ATGCGAGTGCGGAGTTTTTAGAAAGAGAAAAAGAAAAAGCCAAAAATGAAAAACGCCCTTTCAGGTATTCAGACGAG 1492 ASAEFLEREKEKAKNEKRPFRY
TGGGCCACTTTAGAAAAAGACAAGCACCATGCCCCTGTGGTGCGTTTAAAAGCCCCAAATCATGCGGTGTCTTTCAA ATLEKDKHHAPVVRLKAPNHAV
CGATGCGATTAAAAAAGAAGTGAAATTTGAACCTGATGAATTGGATTCTTTTGTGCTTTTGAGACAGGATAAAAGCCC IKKEVKFEPDELDSFVLLRQDKS
TACTTATAATTTCGCTTGCGCATGCGATGATTTGCTTTATAAAATCAGTCTGATTATTAGAGGCGAAGATCATGTGAGT ACACDDLLYKISLIIRGEDHVSNT
AACACCCCCAAACAAATCTTAATCCAGCAAGCTTTAGGCTCCAATGATCCGATTGTTTATGCGCATTTGCCCATTATT IQQALGSNDPIVYAHLPIILDEVS
TTAGATGAAGTAAGCGGTAAAAAGATGAGTAAAAGAGATGAAGCCTCCAGCGTGAAATGGCTTTTGAATCAAGGGTT SKRDEASSVKWLLNQGFLPVAI
TTTACCGGTTGCGATTGCGAATTACCTCATCACTATCGGTAATAAAGTGCCTAAGGAAGTTTTTAGCCTTGATGAAGC IGNKVPKEVFSLDEAIEWFSLEN
GATAGAATGGTTTAGTTTAGAAAATCTTTCCAGTTCTCCGGCTCATTTTAATTTAAAATATTTAAAACACTTAAACCACG AHFNLKYLKHLNHEHLKLLDDD
AGCATTTAAAGCTTTTAGACGATGACAAGTTATTAGAACTCACTTCAATAAAAGATAAAAACCTCTTAGGGCTTTTAAG SIKDKNLLGLLRLFIEECGTLLEL
ATTGTTTATAGAAGAATGCGGCACGCTTTTAGAATTGAGGGAAAAAATTTCGTTGTTTTTAGAGCCAAAGGATATTGTT LFLEPKDIVKTYENEDFKERCLA
AAAACTTATGAAAATGAAGATTTTAAAGAGCGTTGTTTAGCGCTTTTTAACGCTCTAACAAGCATGGATTTTCAAGCGT TSMDFQAYKDFESFKKEAMRLS
ATAAGGATTTTGAAAGTTTTAAAAAAGAAGCCATGCGATTAAGCCAGCTTAAGGGTAAGGATTTTTTCAAACCTTTGC KDFFKPLRILLTGNSHGVELPLIF
GCATCCTTTTAACCGGGAACTCGCATGGCGTTGAATTGCCTTTGATTTTCCCCTATATCCAAAGCCATCATCAAGAAG HHQEVLRLKA
TTTTAAG
HP1259 1493 GAATGGGGTGGAGATTGTAGGGTTGGAGCATTTGGATAAAGTGATTTATTTAGATCAAGCCCCCATAGGCAAAACCC 1494 NGVEIVGLEHLDKVIYLDQAPIGK
CACGAAGCAACCCTGCCACTTACACGGGAGTGATGGATGAAATCAGGATTTTATTTGCCGAGCAAAAAGAAGCTAAA NPATYTGVMDEIRILFAEQKEAKI
ATΠTAGGCTATAGTGCGAGCCGTTTTAGCTTTAATGTTAAAGGAGGGCGGTGCGAGAAATGCCAAGGCGATGGGG ASRFSFNVKGGRCEKCQGDGDI
ACATTAAAATAGAAATGCACTTTTTGCCTGATGTGTTAGTCCAATGCGATAGCTGTAAGGGCGCTAAATACAACCCCC HFLPDVLVQCDSCKGAKYNPQT
AAACTTTAGAAATCAAGGTGAAAGGCAAATCCATTGCCGATGTGTTGAACATGAGCGTGGAAGAGGCTTATGAATTTT KGKSIADVLNMSVEEAYEFFAKF
TTGCTAAATTCCCTAAAATCGCCGTGAAGTTAAAAACGCTTATGGATGTGGGCTTAGGCTATATCACTTTAGGGCAAA KLKTLMDVGLGYITLGQNATTLS
ACGCTACGACTTTAAGTGGGGGGGAGGCTCAAAGGATCAAATTAGCTAAAGAATTGAGTAAAAAAGACACAGGCAAA QRIKLAKELSKKDTGKTLYILDE
ACCCTTTATATTTTAGATGAGCCTACTACCGGTTTGCATTTTGAAGACGTGAATCATCTTTTACAAGTCTTGCATTCTT HFEDVNHLLQVLHSLVALGNSM
TAGTGGCGTTAGGCAATTCTATGCTAGTGATT
HP1259 1495 ACGCTCCTCCCAGTCAAGCACGCCTTTGGGTAGTGCCACCCAGTAAAATGGACGAACAAGAGCTTATTAATGAGGG 1496 APPSQARLWWPPSKMDEQELI CTATTATGCGATTTTTGGGGCTGCCGGGGCAAGGACTGAAGTCCCAGGCTGTAGCTTGTGCATGGGCAATCAAGCG YAIFGAAGARTEVPGCSLCMGN AGGGTTAGGGATAATGCGGTCGTCTTTTCCACTTCCACACGCAATTTTGATAATCGCATGGGTAGAGGGGCTAAAGT RDNAWFSTSTRNFDNRMGRG GTATTTAGGCAGTGCGGAGCTTGGGGCGGCGTGCGCTTTACTAGGGCGAATCCCCACTAAAGAAGAATACATGAAT GSAELGAACALLGRIPTKEEYM TTAGTGAGTGA
HP1259 1497 AACCATGATTAACACGATGTTTTGCGCGACCATGCAAAGGGGAGTGGCGGAAATCGTGGCTGTGGAAGCGACTTTC 1498 TMINTMFCATMQRGVAEIVAVE
ACAAGGGCTTTGCCGGCGTTTGTGATTTCAGGGTTAGCTAATAGCTCTATCCAAGAAGCCAAACAGCGGGTTCAATC ALPAFVISGLANSSIQEAKQRVQ
GGCTTTACAAAATAACGATTTCACTTTCCCGCCTTTAAAAATCACCATCAACCTTTCCCCCTCAGATTTGCCTAAATCC NNDFTFPPLKITINLSPSDLPKSG
GGGAGTCATTTTGATTTGCCTATCGCTCTTTTAATCGCTTTGCAAAAACAAGAGTTGGCTTTTAAAGAGTGGTTTGCTT LPIALLIALQKQELAFKEWFAFG
TTGGGGAGTTAGGGCTTGATGGCAAGATCAAACCCAATCCTAACATTTTCCCCATGCTTTTAGACATTGCCATTAAAC GKIKPNPNIFPMLLDIAIKHPHAKI
ACCCCCATGCTAAGATCATTGCGCCTAAGGCCAATGAAGAGCTrTTTTCGCTTATCCCTAATTTGCAATGCTTTTTTG NEELFSLIPNLQCFFVGHFKEAL
TGGGGCATTTTAAAGAAGCGTTAGAAATCTTGCAAAACCCTGAAACCAAAGCAGACACCCACACGAAAAAACTACCC PETKADTHTKKLPFKTIELNDKE
TTTAAAACGATAGAATTGAACGATAAAGAGTATTATTTTTCAGACGCCTATGCCTTAGATTTTAAAGAAGTTAAGGGGC AYALDFKEVKGQAVAKEAALIAS
AAGCTGTCGCTAAAGAGGCCGCTTTGATCGCTAGCGCTGGGTTTCATAACTTGATTTTAGAGGGAAGTCCAGGGTGT NLILEGSPGCGKSMIINRMRYILP
GGGAAAAGCATGATCATTAATCGCATGCGTTATATCTTGCCTCCATTAAGCCTGAATGAAATCCTAGA NEIL
Figures imgf000269_0001 and imgf000269_0002
HP1259 1505 ATGTCTATCACCCCTACCATTGGACGCCCATTGGTTTTATGGATGATATTCAAAATTGGACTTTAAAAGACATTAAAAA 1506 VYHPYHWTPIGFMDDIQNWTLK
ATTCCATTCGCTCTATTATCAGCCTAAAAACGCTATCGTTTTGGTGGTAGGCGATGTCAATTCCCAAAAGGTTnTGA HSLYYQPKNAIVLWGDVNSQK
ATTGAGTAAAAAGCATTTTGAATCCTTAAAAAACCTTGATGAAAAAGCTATCCCCACCCCTTACATGAAAGAGCCTAA KKHFESLKNLDEKAIPTPYMKEP
GCAAGATGGAGCCAGAACGGCAGTCGTGCATAAAGATGGGGTCCATTTAGAATGGGTGGCCCTTGGGTATAAAGTG ARTAWHKDGVHLEWVALGYK
CCTGCTTTCAAGCATAAAGATCAAGTCGCCTTAGACGCACTAAGTAGGCTTTTAGGCGAAGGCAAAAGCTCGTGGTT HKDQVALDALSRLLGEGKSSWL
GCAAAGCGAATTAGTGGATAAAAAACGCTTGGCTTCTCAAGCTTTCTCGCACAACATGCAATTACAAGATGAAAGCG VDKKRLASQAFSHNMQLQDES
TGTTTTTATTCATTGCGGGGGGTAATCCTAATGTCAAAGCCGAAGCCTTACAAAAAGAAATCGTAGCGCTTTTAGAAA GGNPNVKAEALQKEIVALLEKLK
AGCTGAAAAAAGGCGAAATCACTCAAGCGGAATTAGACAAGCTCAAAATCAATCAAAAAGCTGACTTTATTTCTAATT QAELDKLKINQKADFISNLESSS
TAGAAAGTTCTAGCGATGTTGCGGGGCTTTTTGCGGACTATTTAGTGCAAAACGATATTCAAGGCTTGACGGATTAC FADYLVQNDIQGLTDYQRQFLD
CAGCGACMTTTTTGGATrTAAAAGTGAGCGATTTGGTGCGTGTGGCCAATGAATATTTTAAAGACACCCAATCAACC LVRVANEYFKDTQSTTVFLKP
ACCGTGTTTTTGAAACCTTAAAAGAGCCTTATAACATGCAATTTCATTCATCTAGCGCGTTGATTACGCCTTTTAAAAA
AGATTTGAGCGTTGATGAGGCCGCTTATGAAACCTTGATCAAGCGCCAAATTTTTCAGGGCATGGACGCATGCGTGC
CTGTTGGCACGACAGGAGAATCCGCCACGCTCACCCACAAAGAGCACATGCGTTGCATTGAAATCGCCATAGAAAC
TTGCAAAAACACTAAA
HP1259 1507 ACCTCAAAGCCCCTAAGCCTAACACCCCACAGATCATGGCAGTTTTAAACTTGACTCCGGATAGCTTCTATGAAAAG 1508 LKAPKPNTPQIMAVLNLTPDSFY
AGCCGGTTTGATAGCAAAAAAGCGCTTGAAGAAATCTATCAATGGCTAGAAAAGGGTATCACGCTCATTGATATAGG FDSKKALEEIYQWLEKGITLIDIG
CGCGGCCAGTTCAAGGCCAGAGAGTGAAATCATTGATCCAAAAATAGAGCAAGATCGCTTAAAAGAAATTTTATTAG PESEIIDPKIEQDRLKEILLEIKSQ
AAATCAAATCCCAAAAACTCTACCAATGCGCCAAATTCAGCATAGACACCTACCATGCCACAACCGCCCAAATGGCT CAKFSIDTYHATTAQMALEHYFS
TTGGAGCATTATTTTTCCATCCTTAATGATGTGAGCGGTTTTAATAGCGCTGAAATGCTAGAAGTCGCAAAAGATTAC SGFNSAEMLEVAKDYKPTCILM
MGCCCACTTGCATTTTMTGCACACTCAAAA CCCCCAAAGACATGCMGAAAATGTTTTTTACCACAATTTATTTG PKDMQENVFYHNLFDEMDRFFK
ATGAAATGGATCGCTTCTTTAAGGAAAAACTAGAAGTTTTAGAAAAATACGTGCTCCAAGATATTATTTTAGATATTGG VLEKYVLQDIILDIGFGFAKLKEH
GTTTGGATTCGCTAAATTAAAAGAGCATAATTTAGCCTTAATCAAGCATTTAAGCCACTTTCTCAAATTCAAAAAACCC KHLSHFLKFKKPLLVGASRKNTI
TTATTGGTGGGCGCGAGTCGTAAAAACACGATCGGGCTTATCACTGGGCGTGAAGTTCAAGACCGGCTCGCCGGCA REVQDRLAGTLSLHLMALQNGA
CTTTGAGTTTGCATTTAATGGCGTTGCAAAATGGAGCGAGCGTTTTAAGAGTGCATGATATTGATGAGCATATAGATT VHDIDEHIDLIKVFKSLEETD
TAATCAAAGTGTTTAAGAGTTTGGAAGAAACGGATTGAACCAAAATTTTTAGCTTTGATCCCACTAGGCGTTGGTCAT
CATCAAAACGCTCAAAAGCGTTTTTGATAATCAATCCTTGATTAACCCTAATCTTTAAAGAATCACATCCATGGTTGAA
GGGTTGTTGTAAGCGTTTTGGTAGCGTTGCAAAAAAGCGTTATGGTTGGTAGCGTTAAAGGTTTGTAGGGCTTTTTT
GTGCAAACCATTT
HP1259 1509 GCAACATGAAAAATTTAGTAATCTTAAGCGGGGCTGGTATTTCAGCAGAAAGCGGGATTAAAACCTTTAGAGACGCT 1510 NMKNLVILSGAGISAESGIKTFR GATGGCTTGTGGGAAAGGGCATGACATCATGGAAGTTGCCTCGCCTTATGGCTGGAAAAAGAACCCGCAAAAGGTG WERA TTGGATTTTTACAACCAAAGGCGCCGACAGCTTTTTGAAGTTTATCCTAACAAAGCCCATAAGGCTTTAGCGGAATTG GAAAAACACTATCAAGTCAATATCATCACCCAAAATGTAGATGATTTGCATGAAAGAGCGGGTTCTTCTCGCATTTTG CACTTGCATGGGGAATTATTGAGCGTTCGCAGCGAGAAAGATCCTAATTTAGTTTATAGGTGGGAAAAGGACTTGAA TTTAGGCGACTTGGCCAAAGACAAATCGCAATTACGCCCTGATATTGTGTGGTTTGGCGAAGCGGTGCCTTTGCTTA AAGAAGCGATTTCTTTAGTCAAACAAGCGCATCTTTTAATCATCATTGGCACTTCTTTGCAAGTCTATCCCGCCGCTA GCCTCTACACGCATGCGCATAAAGACGCTCTCA
HP1259 1511 CTCTAAAGAGAGCTTGATGCATGCCATTAATTCAATTAGAGTGGGCATGCATTTTAAAGAGTTGAGTCAGATTTTAGA 1512 SKESLMHAINSIRVGMHFKELS GAGCACTATTACAGAAAGGGGCTTTGTGCCTTTGAAAGGATTTTGCGGGCATGGCATTGGTAAAAAACCCCATGAAG TERGFVPLKGFCGHGIGKKPHE AGCCAGAGATCCCCAACTACCTAGAAAAAGGCGTCAAACCTAATAGCGGCCCTAAAATCAAAGAGGGCATGGTATTT NYLEKGVKPNSGPKIKEGMVFC TGCTTAGAGCCTATGGTGTGTCAAAAACAGGGCGAGCCTAAAATACTAGCGGATAAGTGGAGC VCQKQGEPKILADKWS
HP1259 1513 CTTATTCAGGCGGATCATCATTCTCGCCTACAATACAATTGACATACCATAATAACGCTGAAAACCTTTTGCAACAAG 1514 YSGGSSFSPTIQLTYHNNAENL
CCGCCACTATCATGCAAGTCCTTATTACTCAAAAGCCGCATGTGCAAACGAGCAATGGCGGTAAAGCGTGGGGGTT TIMQVLITQKPHVQTSNGGKAW
GAGTTCTACGCCTGGGAATGTGATGGATATTTTTGGTCCTTCTTTTAACGCTATTAATGAGATGATTAAAAACGCTCA PGNVMDIFGPSFNAINEMIKNA
AACAGCCCTAGCAAAAACCCAACAGCTTAACGCTAATGAAAACGCCCAAATCACGCAACCCAACAATTTCAACCCCT KTQQLNANENAQITQPNNFNPY
ACACCTCTAAAGACAAAGGGTTCGCTCAAGAAATGCTCAATAGAGCTGAAGCTCAAGCAGAGATTTTAAATTTAGCTA GFAQEMLNRAEAQAEILNLAKQ
AGCAAGTAGCGAACAATTTCCACAGCATTCAAGGGCCTATTCAAGGGGATTTAGAAGAATGTAAAGCAGGATCGGCT HSIQGPIQGDLEECKAGSAGVIT r:r:ρ :τ aτ arτaaτΛfl trττf; :rtc
Figure imgf000271_0001
HP0962 1523 AAGAAGCGTGTAGAA GGTTCAAGAAAGCTTTAAAGCCCTAGACAAGGCTAAAAAAGAATGGCTTAAAGCCCTTGA 1524 KKRVEKVQESFKALDKAKKEWL
AGCCCCCATAGATGAAAGAGAAGACGAATTGGTGCGTTCATTGACCCTAGCTTACAAACGCCAAACACTCAAAGACA PIDEREDELVRSLTLAYKRQTLK
GACTCTATGATTTAGAACCTACCAGCAAACTGATTAATGAATTAGTCAAAACGATGGAAACCACTTTAAAAAGCGGCG DLEPTSKLINELVKTMETTLKSG
ATGGGTTTGAAAAAGAGTTGAAACGCTTGGAATACAAACTGCCCTTATTCAATGACACTCTCATCGCAAACCATAAAA ELKRLEYKLPLFNDTLIANHKKIL
AAATCCTTGCCAATATCACTAACATGACTAAAGAAGATATTATCGCTCAAGTGCCAGAAGCGACTATGGTGAGCGTGT MTKEDIIAQVPEATMVSVYMDLK
ATATGGATCTTAAAAAGCTTTTTTTGACTAAAGAAGCGAGCGAAGAAGGCTTTGATCTAGCCCCCAACAAGCTAAAAG KEASEEGFDLAPNKLKEILEQIK
AAATTTTAGAGCAAATCAAMGAGGG GTrGATTTCCGATCGCGCTAAAAACAAAATGGCTAAATCCAATTTAAGGT SDRAKNKMAKSNLRLWSIAKR
TGGTGGTGAGCATCGCTAAACGATTCACGAGCAGAGGCTTACCATTCTTGGATTTGATTCAAGAGGGCAATATTGGC LPFLDLIQEGNIGLMKAVDKFEH
TTGATGAAAGCGGTGGATAAGTTTGAGCATGAAAAGGGCTTCAAGTTTTCTACCTATGCGACCTGGTGGATCAAACA FSTYATWWIKQAISRAIADQART
AGCTATCAGCAGAGCCATAGCCGATCAGGCCCGCACTATCCGCATCCCCATTCACATGATT Ml
HP0962 15251 CAGAT TTTAGAACCCATTAGTTCTAAACGCTTAAAAGAGTTGGCGGACTTGAAAATATCTTGCGCGACGATCAGGAAT 1526 QILEPISSKRLKELADLKISCATIR TATTT "CAAATCCTTTCTAAAGAGGGCATGCTTTATCAAGCCCATTCTAGTGGCGCTAGATTGCCCACTTTTAAGGCG LSKEGMLYQAHSSGARLPTFKA
TTTGAAAACTATTGGCAAAAGTCGTTGCGCTTTGAAACTTTAAAGGTGAATGAAAAACGCCTAAAAAGCGCGAGTGAA WQKSLRFETLKVNEKRLKSASE
AATTTTGGGCTTTTCACGCTGTTAAAAAAACCCAGTTTGGAGCGTTTAGAAAGAGTCATTGAGTGCGAAAAACGCTTT TLLKKPSLERLERVIECEKRFLIL
TTGATTTTGGACTTTTTGGCGTTTTCTTGCGCACTGGGTTACAGCGTTAAAATGGAAAAGTrTTTATTAGAGCTTGTG SCALGYSVKMEKFLLELVGRSV
GGCAGAAGCGTTAAAGAAGTGCGCTCAATCGCTGCTTCTTTCAATGCGTTGAGTTTGGCCAGGCAATTAGAGCGTTT lAASFNALSLARQLERLEYSNTQI
GGAGTATTCCAACACACAAATCACACGCTTTAATCTGATGGGGTTAAAAACGCTTTTAAACAGCCCTTTATTTTTTGAC LMGLKTLLNSPLFFDILGGKVLE
ATTTTAGGGGGTAAGGTTTTAGAGCGTTTGAGTAAGGGTTTGCATTTTATAGAGCCTGATTGCATGCTAGTAACACGC LHFIEPDCMLVTRPVEFQNKRM
CCTGTAGAATTTCAAAACAAGCGGATGCAACTGCTTTGCGTGGGGAAACTAGAATGCGATTATGAAGGGTTTTTTCA GKLECDYEGFFQTISEEE
AACGATTTCTGAGGAGGAATAATGAAAGATGAACACAACCAAGAACACGATCATTTAAGCCAAAAAGAGCCAGAGTT
TTGCGAAAAGGCTTGCAAAGAACAACAATATGAAGAAAAGCAAGAAGCGGGCGAAAAAGAAGGCGAGATCAAGGAA
GATTTTGAGCTTAAATATAAAGAAATGCACGAAAAATACTTAAGAGTGCATGCGGATTTTGAAAACGTGAAAAAGCGC
TTAGAAAGAGACAAGAGCATGGCACTAGAGTATGCGTATGAAAAGATCGCATTGGATCTATTGCCGGTGATTGATGC
ACTTCTTGGGGCTCA
HP0962 1527 CTAGGGCTTTTAAACCAATTTATTTTGCAAAGCTACAAGGTAGAAAAGGAATTTAAAGATTATAAAGCCCTTTATGAAT 1528 LGLLNQFILQSYKVEKEFKDYKA
GGGTCATAGAGATTTTACCTCAAGCCATTTGGGTGGTG TGAAAACGGGAGCTTTTTTTATAAAAATTCTTTAGCCA VIEILPQAIWWNENGSFFYKNSL
ATCAAAGCCATGAGGTGTTCAATAAGGCTAAATTAGAAAATTTTAACACTGAAATTGAACATGAAAATAAAAGCTATTT HEVFNKAKLENFNTEIEHENKSY
AGTCCAGCAAAACAGCATTCAAGGCAAGCAAATCATCACCGCAACCGATATTAGCGCTCAAAAACGCCAAGAGCGG NSIQGKQIITATDISAQKRQERLA
CTCGCTTCTATGGGGAAAATCτCAGCGCATTTAGCCCATGAGATCAGAAACCCTGTAGGTTCTATCTCTCTTTTAGCT ISAHLAHEIRNPVGSISLLASVLL
TCAGTGTTATTAAAGCATGCGAACGAAAAGACTAAGCCCATT KTKPI
HP0962 1529 TTTAAAGGGCGTTTTTGGGCATGACAATAAAGAAMGATT CGCGCTTTTACAAGAAAAAAAGCGTTTTTTTATAGAT 1530 LKGVFGHDNKEKINALLQEKKRF
GACAATTTAGAAAACAAGCACTTAGACACCACGATGGTGAGCGAGTTTGTGGGAAAAACTAGGGCGTTTATTAAGAT NLENKHLDTTMVSEFVGKTRAFI
CCAAGAAGGCTGTGATTTTGATTGCAATTATTGCATTATCCCAAGCGTGAGAGGGAGGGCTAGGAGTTTTGAAGAGA GCDFDCNYCIIPSVRGRARSFEE
GAAAAATTTTAGAGCAAGTGGGCCTTTTATGCTCTAAAGGGGTTCAAGAAGTGGTTTrAACCGGCACCAATGTGGGG QVGLLCSKGVQEWLTGTNVGS
AGCTATGGGAAAGATAGAGGAAGCAATATCGCGCGATTGATTAAAAAATTAAGCCAGATCGCTGGATTAAAACGCAT RGSNIARLIKKLSQIAGLKRIRIGS
AAGGATTGGGAGCTTAGAACCTAATCAAATTAACGATGAATTTTTAGAGCTTTTAGAAGAGGATTTTTTAGAAAAACAT QINDEFLELLEEDFLEKHLHIALQ
TTGCATATCGCTTTACAGCACAGCCATGATCTCATGCTAGAGAGGATGAATCGAAGAAACCGCACTAAAAGCGATAG LMLERMNRRNRTKSDRELLETI
GGAATTATTAGAAACAATCGCTTCTAAGAATTTTGCTATTGGCACGGATTTTATTGTGGGGCATCCGGGCGAGAGCG AIGTDFIVGHPGESGSVFEKAFK
GAAGCGTTTTTGAAAAAGCGTTTAAAAATTTAGAAAGCTTGCCTTTAACGCACATCCACCCTTTTATTTACAGCAAACG PLTHIHPFIYSKRKDTPSSLMTD
AAAAGACACCCCCTCTAGCTTGATGACTGATAGCGTGAGTTTGGAAGATTCTAAAAAGCGTTTGAATGCGATTAAAGA DSKKRLNAIKDLIFHKNKAFRQL
TTTGATTTTTCATAAAAATAAGGCGTTCAGGCAATTGCAGCTCAAGCTCAATACGCCTCTAAAAGCCTTAGTGGAAGT TPLKALVEVQKDGEFKALDQFF
GCAAAAAGACGGCGAATTTAAAGCCTTAGATCAATTTTTCAACCCC
HP0962 1531 CAGCCATTTAGTCGTTTTGATCGAACCTAAAATAGAGATCAATAAAGTTATCCCTGAAAGTTATCAAAAAGAGTTTGAG 1532 SHLWLIEPKIEINKVIPESYQKEF
AAGTCTTTGTTTCTCCAGTTGAGTAGTTTTTTAGAGAGAAAAGGCTATAGCGTTTCGCAATTTAAAGATGCTAGCGAA FLQLSSFLERKGYSVSQFKDAS
ATCCCTCAAGACATCAAAGAAAAAGCGTTGCTCGTTTTACGCATGGATGGGAATGTGGCTATCTTGGAAGATATTGTA KEKALLVLRMDGNVAILEDIVEE
GAAGAGAGCGATGCGCTTAGCGAAGAAAAAGTGATAGACATGTCTTCAGGGTATTTG/WCTTGAATTTTGTTGAGCC EEKVIDMSSGYLNLNFVEPKSED
AAAAAGTGAAGATATTATCCATAGTTTTGGTATTGATGTTTCAAAGATTAAGGCTGTGATTGAAAGAGTGGAATTGCG GIDVSKIKAVIERVELRRT
GCGCACCAA
HP0962 1533 TTGGAATACGCTGATCCTAGCACTTCTAAAAAGAGAGCCGATAAGGGATTAAAAAAGGTGTTCAAAGACAGCAAAAA 1534 LEYADPSTSKKRADKGLKKVFK
AGACGCTTGCGGGTTCATCTATGAGATCAGCGAGTTCATGAAAGCCTATACCGCATTGCTAAAAAAACAAGACCGAT ACGFIYEISEFMKAYTALLKKQD
ACGTCTATTTATTGAGGTATCTCCCCTCTAGGTATTGGGCCAGCATTTTAACGACTGCCCTTTATGTCAAATACCCTG LRYLPSRYWASILTTALYVKYPD
ATTTTGACGCTTTGAAAAAGCTTTTGGTGTCTTATTATTACCAAACTTGGATTGCAGGAGGCACGATCACGCGCATCA KLLVSYYYQTWIAGGTITRIKQTS
AGCAAACCAGTATCAACATTATCAAAAACGTTAAAAGCAATAAGAGCGTTGAAACCATCAAAGAGCTTATATTGAATA VKSNKSVETIKELILNSIDSYNTF
GCATCGACTCTTATAACACCTTTGATCAATACCTCTATAACTTATGGGATAGCTCTTCTGTTTATCATAGCAAATGGGT NLWDSSSVYHSKWVRPVLALAN
GCGTCCTGTCTTAGCCCTAGCTAATTATTTCATGGCAGATGAAGAGAAACCCCATTTTATCGCTATGGATGCCGAAAC DEEKPHFIAMDAETQVEHILPQT
CCAAGTGGAGCATAT rGCCACAAACGCCCAAAAGAGGCAGTCAATGGAACGCGGATTTTGACAAAGAAAAAAGAG SQWNADFDKEKREEWVNNIAN MGAATGGGTAAAT TATCGCGAATTTAACCCTTTTAAAGCGTAAAAAGAACGCGCATGCTTTAAACGGGGATTT G RKKNAHALNGDFDEKRKIYGGK
ATGAAAAAAGAAAAATTTATGGAGGCAAAGACACGAGCAAAGTGATTAGCTGTTATGACATCACTAAAGAATTGTATA VISCYDITKELYSNYRKWNEKSL
GCAATTATAGGAAGTGGAATGAGAAGTCCCTCCAAGAGCGATACAAATCTTTGTATAACACTATCACGCCTGTTTTAC KSLYNTITPVLHIEGQEDDFEDD
ACATAGAGGGGCAAGAAGATGATTTTGAAGATGATTTTGATCTAGAATGATTAAAGATTGCCAAGCATCAAAACAACA
AAGAGGTGATCAATGCCTAAAAAAGAGCTATTAAAGATGTCAAAGAAAAGGATTT ΓAAAGACTTCTTAAAAGAAGCC
AAACAGCACCGC
HP0962 1535 AAGGAGAAAGGTATGATTTATAAAATCGCGCGCCTTAAGGATCATTATGTGATTTGTTACCACAACGAATACACCATT 1536 KEKGMIYKIARLKDHYVICYHNE GAATTGAGCAAGCAGTTCCGCTCCGCTCAAATCCCCTTTGTGGTCGTGGATAATGACCCTAGTTTTGAAGAAGAAGC KQFRSAQIPFVWDNDPSFEEE CATTAAACACAAATACCCCTACTATATCATAGGCGATCCGCACACCAATTTAGCCATGCTAAAAACCCACTTAAGCAG YPYYIIGDPHTNLAMLKTHLSSA CGCTAGGGGTGTTGTGGCTTTGTCTAAGATTTTACCGGTGAATGTGGCGTTAATGGTGAGCGTGCGCTTGTTTGAAA LSKILPVNVALMVSVRLFEKELK AGGAATTAAAACGCAAACCTTACTACATCATTGCGAGCGCGCATAGCGATGAAGGCTTAGAAAAATTAAAAAAATTAG IIASAHSDEGLEKLKKLGADMW GGGCTGATATGGTGGTTTCCCCTACAAAACTCATGGCGCAGAGAGTGAGCGCGATGGCGGTGCGTCCGGATATGG MAQRVSAMAVRPDMENILERFI AAAATATCTTAGAGCGTTTTATCAATAAAAAAGACACGCTTTTAGACTTAGAGGAAGTGATTGTCCCTAAAACTAGCTG LLDLEEVIVPKTSWLVLRKLKEA
GCTTGTGTTAAGGAAATTAAAAGAAGCCCATTTTAGAGAGATCGCTAAAGCGTTT AKAF
HP0962 1537 ATGAGTAACCCACAAAATTTGAGCAATAACAAAAATCTTAGCGAATTTATCAAGCAACAACGAGAAAATGAATTAGAC 1538 MSNPQNLSNNKNLSEFIKQQRE
CAAATGGAACGACTAGAGGACATGCAAGAACAGGCTCAAGCTAATGCGCTCAAACAAATTGAAGAACTCAACAAGAA QMERLEDMQEQAQANALKQIE
ACAAGCTGAAGAGACAATCAAGCAAAGAGCCAAAGATAAAATCAATATTAAGACAGATAAGCCTCAAAAAAGCCCTG QAEETIKQRAKDKINIKTDKPQK
AGGATAACTCCATAGAATTATCTCCTAGCGATAGCGCTTGGAGAACTAATCTTGTTGTGCGGACTAATAAAGCCTTGT SIELSPSDSAWRTNLWRTNKAL
ATCAATTCATTTTGAGMTAGCTCAAAAAGACAATTTTGCTTCAGCGTATCTAACAGTCAAATTAGAATACCCACAAAG RIAQKDNFASAYLTVKLEYPQRH
ACACGAAGTCTCTAGCGTTATTGAAGAGGAGTTAAAAAAGAGAGAAGAAGCAAAGAGGCAGAAAGAATTGATTAAGC VIEEELKKREEAKRQKELIKQEN
AAGAAAATCTTAACACCACAGCCTACATCAATAGAGTGATGATGGCGAGCAATGAACAGATTATCAACAAAGAAAAAA YINRVMMASNEQIINKEKIREEK
TAAGAGAAGAAAAGCAAAAAATTATCTTAGATCAAGCAAAGGCGCTAGAGACTCAATATGTGCATAATGCCTTAAAAA QAKALETQYVHNALKRNPVPRN
GAAACCCCGTGCCTAGAAACTACAACTACTACCAAGCGCCTGAAAAACGCTCTAAACATATTATGCCCTCTGAAATTT QAPEKRSKHIMPSEIFDDGTFTY
TTGATGATGGCACATTCACTTATTTTGGTTTCAAAAACATCACTCTCCAACCTGCTATTTTTGTGGTTCAACCTGATGG NITLQPAIFWQPDGKLSMTDAA
GAAATTGAGCATGACTGATGCCGCCATTGATCCTAACATGACCAATTCAGGATTGAGATGGTATAGAGTTAATGAAAT TNSGLRWYRVNEIAEKFKLIKDK
TGCAGAAAAATTTAAGCTCATTAAAGACAAAGCCCTTGTAACAGTAATCAATAAAGGCTATGGGAAAAATCCATTGAC INKGYGKNPLTKNYNIKNYGELE
AAAAAATTACAATATCAAAAACTATGGTGAATTGGAGCGTGTGATTAAAAAGCTCCCTCTTGTCAGAGATAAATAAAAA LPLVRDK
GGCGTTAAGA
HP0962 1539 TGCAAAAGAAGACGCTCTTAAGGCTCGCAAGAAGCTCTTAAACAATACGCATGATTTCTTAGAAGACTTGATTTTTAG 1540 AKEDALKARKKLLNNTHDFLEDLI
AAAACAAAAAATCAAAGAGCTTATGGATCATAGAGCTAAAGTTCTTTCAGACTTAGAAAACAAATACAAAAAAGAAAAA KIKELMDHRAKVLSDLENKYKKE
GAGGCTCTAGAGAAAGAGACAAGAGGTAAAATCCTTACTGCTAAGTCAAAGGCTTATGGGGATTTGGAGCAAGCCTT EKETRGKILTAKSKAYGDLEQAL
AAAAGATAACCCTCTCTATAGGAAACTTCTTCCTAACCCTTATGCCTATGTTTTAAACCAAGAAACATTCACCAAAGAA LYRKLLPNPYAYVLNQETFTKED
GATAGGGAGCGTTTGAGTTATTACTACCCCCAGGTGAAAACGAGCAGTATTTTTAAAAAAACCACCGCTACCACTAAA SYYYPQVKTSSIFKKTTATTKDK
GATAAGGCTCAGGCTTTGCTTCAAATGGGTGTGTTTTCTTTAGATGAAGAACAAAACAAAAAAGCGAGCCGATTAGCT QMGVFSLDEEQNKKASRLALSY
TTATCTTACAAGCAAGCGATTGAAGAATATTCCAATAACGTTTCTAATTTATTGAGCAGAAAAGAATTGGATAATATAG EYSNNVSNLLSRKELDNIDYYLQ
ATTATTACTTACAGCTTGAAAGAAACAAGTTTGACTCCAAAGCAAAAGATATTGCTCAAAAGGCTACTAACACGCTTAT KFDSKAKDIAQKATNTLIFNSERL
TTTTAATTCGGAACGCTTGGCGTTTAGCATGGCGATTGATAAGATCAATGAGAAATACTTAAGGGGCTATGAGGCTTT AIDKINEKYLRGYEAFSNLLKNVK
TTCTAACTTGTTGAAAAATGTCAAAGATGATGTGGAATTGAATACTCTGACTAAAAATTTTACCAATCAAAAATTGAGT LNTLTKNFTNQKLSFAQ
TTCGCACAA
HP0962 1541 CGTTAAGGGGGCTGATGCGCT ΓAGCCGCAGCGTTTTAAAAGAATTAGAAGAATTTGTGCGCCAATTTGGGGCTA 1542 VKGADALFSRSVLKELEEFVRQF
AAGGCTTAGCGTATTTGCAAATTAAAGAAGATGAAATTAAAGGACCTTTAGTTAAATTTTTAAGCGAAAAGGGGCTTA LAYLQIKEDEIKGPLVKFLSEKGL
AGAATATTTTAGAAAGGACTGATGCGCAAGTTGGGGATATTGTCTTTTTTGGAGCCGGGGATAAAAAAATCGTGTTAG RTDAQVGDIVFFGAGDKKIVLDY
ATTACATGGGGCGTTTGCGCTTGAAGGTGGCTGAAACGCTTGATCTGATTGATAAAGACGCTTTGAATTTCTTATGG RLKVAETLDLIDKDALNFLWVVN
GTAGTCAATTTCCCCATGTTTGAAAAAACCGAAAACGGCTATCATGCCGCGCACCACCCTTTTACGATGCCTAAAAAT EKTENGYHAAHHPFTMPKNIEC
ATAGAATGCGAAGATATAGAAGAGGTTGAAGCGCATGCGTATGATGTGGTGCTTAATGGCGTGGAGCTTGGTGGGG VEAHAYDWLNGVELGGGSIRIH
GGAGCATMGGATTCATAAAGAAGAAATGCAAAAAAAAGTCTTTGAAAAAATCAATATCCATGAAGAAGAAGCGCAAA QKKVFEKINIHEEEAQKKF
AAAAATTT
HP0962 1543 AAATTGAATGTTTTGATTGAAAATTCCAGCGCGCTAGAAAGGGAATTGAAACAAAAGAATGAACATTTAGAGAACGCT 1544 KLNVLIENSSALERELKQKNEHL
TTAAAAGAGCAAGAATATTTGAAAAACGCATGGCTTTTAGAAATGGAAAAACAAAAAGAAATCTTTCACAATAAAAAAT EQEYLKNAWLLEMEKQKEIFHN
TGGAATTGGAAAAATCCTACCAACAAGCCCTAAATATCTTAAAAAGCGAAGTCGCTTCAAAAGATACTAGCTCCATGC EKSYQQALNILKSEVASKDTSSM
ATAAAGAAATCCATAAAGCGAGCGAAArπTAAGCAAACACAAAACAAACCAAGAGATCCCACAAATCATAACGAACT HKASEILSKHKTNQEIPQIITNFQ
TTCAAGCCAACGAAAAAGCGCGCTACAAGAATGAAAGCGTGCTGATTGTACAAATTTTAGACAAGGGCTATTATTGG RYKNESVLIVQILDKGYYWIETEL
ATAGAAACCGAGCTTGGCATGCGTTTAAAAGCGCATGGGAGTTrGrrGAAAAAAATCCAAAAACCCCCTAAAAACAA KAHGSLLKKIQKPPKNKFKPPKT
ATTCAAACCCCCTAAAACAACCATTCCTAAACCTAAAGAAGCGAGCTTGCGCCTTGATTTAAGGGGGCAACGCAGCG KEASLRLDLRGQRSEEALDLLDA
AAGAAGCCCTGGATTTACTAGACGCT TTTTAAACGACGCGCTTTTAGGGGGCTTTGAAGAAGTGCTGATTTGCCAC ALLGGFEEVLICHGKGSGILEKF
GGCAAAGGGAGCGGGA' TT ΓAGAAAAGTTTGTGAAAGA
HP0962 1545 AGTCAAGTCCATTTCTTATGTCGGGCTTTCTTACATGTCTGACATGCTCGCTAATGAAATTGTAAAGATTCGTGTGGG 1546 VKSISYVGLSYMSDMLANEIVKIR
CGATATTGTGGATTCTAAAAAAATAGACACCGCTGTTTTGGCTTTGTTCAATCAAGGGTATTTTAAAGACGTTTATGCC VDSKKIDTAVLALFNQGYFKDVY
ACTTTTGAAGGCGGCATATTAGAGTTTCATTTTGATGAAAAAGCCAGGATTGCCGGGGTAGAAATCAAGGGTTATGG GGILEFHFDEKARIAGVEIKGYG
GACTGAA GGAAAAAGACGGCTTAAAATCCCAAATGGGGATCAAAAAGGGCGACACCTTTGATGAGCAAAAATTAG DGLKSQMGIKKGDTFDEQKLEH
AGCATGCTAAAACGGCTTTAAAAACCGCTTTAGAGGGGCAGGGCTATTATGGGAGCGTGGTGGAGGTGCGCACAGA KTALEGQGYYGSWEVRTEKVS
AAAGGTCAGTGAGGGTGCATTATTGATCGTGTTTGATGTGAATAGGGGGGATAGCATTTATATCAAACAATCCATTTA LIVFDVNRGDSIYIKQSIYEGSAK
TGAGGGAAGCGCGAAATTAAAACGCCGCATGATTGAATCTTTGAGTGCGAACAAGCAACGAGATTTCATGGGCTGG MIESLSANKQRDFMGWMWGLN
ATGTGGGGCTTGAATGACGGGAAATTGCGTTTAGATCAACTAGAATACGATTCTATGCGTATCCAAGATGTGTATATG RLDQLEYDSMRIQDVYMRRGYL
CGTAGGGGTTACTTAGACGCTCATATTTCTTCGCCTTTTTTGAAAACGGATTTTTCTACCCATGACGCTAAGCTTCATT SPFLKTDFSTHDAKLHYKVKEGI
ATAAAGTCAAAGAGGGGATCCAATACAGGATTTCAGACATTTTAATAGAGATTGACAACCCGGTAGTCCCCTTAAAAA DILIEIDNPWPLKTLEKALKVKR
CCTTAGAAA GCGCTTAAAGTGAAAAGGAAAGATGTCTTTAATATTGAGCATTTAAGAGCGGATGCGCAAATTTTAA IEHLRADAQILKTEIADKGYAFA
AAACCGAAATCGCCGATAAGGGTTATGCGTTTGCGGTGGTGAAGCCAGACTTGGATAAAGATGAAAAAAACGGGCT LDKDEKNGLVKVIYRIEVGDMVY
TGTGAAAGTCATTTATCGTATTGAAGTGGGCGATATGGTGTATATCAATGATGTCATCATTTCAGGGAACCAGCGCAC SGNQRTSDRIIRRELLLGPKDKY
GAGCGATAGGATCATT RNSENSLRRLGFFSKVKIEEKRV
Figures imgf000275_0001 through imgf000277_0001
HP0194 1569 ATTTAGCCAAGTTAGGGCTTAATGAAAGAGAGATAAATAAACTTTTCGTTGAATTAGAAAATTCAAACGCCCTAGAATT 1570 LAKLGLNEREINKLFVELENSNAL
TGACGAAACCAACAACGCTTACAAAATCATTGCCCCTATTTGTGAGACGATGCAAAACAATGAAGAAAGGATAAAAGA TNNAYKIIAPICETMQNNEERIKD
CTTTTTAAGCGATGAAGAATACCACGCCGTTTTAAGCGCCTTTAAAATGGCTGAAAATCCAACCAACAAGCGCGATCA EEYHAVLSAFKMAENPTNKRDQI
MTTATA CGCTAACCMCCTCAAGAAAMGTTAAAATCCGCCAAAATTTAGCTAAGGAATTTAAAGAGTTGTGGCA QPQEKVKIRQNLAKEFKELWQTI
AACCATCAACGCGCAAAGCCAATTAAGCTATCAAAATATCCAAAAAACCAAGCTGATAGAATCTATCGCCAAAGCGTT QLSYQNIQKTKLIESIAKAFNESH
TAATGAGAGCCATGTGAGTGCTGAGGTAATCAAATTTGAAAGCAAAAGGTATGACCCACAAACCAACAGGATCATCA VIKFESKRYDPQTNRIITEQSSTL
CAGAGCAATCAAGCACCTTAAAAATCAGAGACTACGCTAACGCTTTACAAAAAGAAATCAACGCGCTTTTGCTTGACT ANALQKEINALLLDFAKDERLPL
TTGCCAAAGATGAGCGCTTACCCTTAAAATTCACGCTTGAACTCTACAACGCTTTAAACAAAGAGCATTTCACAAACT LYNALNKEHFTNSPKKAFKLLKG
CACCCAAAA GCCTTTAAATTGCTTAAAGGCATCATTAAAGATAAGTTGCATGAAAACTTGCTTTCTTGCGTGAGTTA LHENLLSCVSYGFCQNAFSNTA
TGGATTTTGCCAAAACGCCTTTTCTAACACGGCTTTTGATAAAACCGATCCGCTTTATTGTGAAGATGGCTCGCCCAA PLYCEDGSPKNEIEKHKIGKYKS
AAATGAGATTGAAAAACACAAAATAGGCAAATACAAAAGTGCGCAAACTCCAAGCCCAAACTATCTTTATGAGACGAT SPNYLYETIIYDSKIEEEVSEEGV
CATTTATGATTCTAAAATTGAAGAAGAAGTGAGTGAAGAGGGGGTGCAAACACTGGAAGGTAAAAGCATAGAAGTTT GKSIEVFAKLPKFKIPTPYKNYEP
TTGCCAAGCTCCCTAAATTTAAAATCCCAACGCCTTATAAAAACTATGAGCCTGATTTTGCTTATTTGCTTAAAGATGA LLKDEKGAKIFFVCETKGYEKES
AAAGGGTGC EKRKIEYAKKFFETLSQNLKNAK
VFATRINKQDLFNTLKNALKETP
HP0194 1571 GATTTTTGATGTGAAAGCGCCTATTTTGGGGTTTGAAACCATTCATAAAATGCGTTTGCAAAAGATTGATGAAATCTTT 1572 IFDVKAPILGFETIHKMRLQKIDEI
TTGCGTTTGAATAGCACAGAAGAGAATTCCGTGGTGTCTTTCACGCTGGTCAATCCCTTTGCCTTAAGAAAATACGAA STEENSVVSFTLVNPFALRKYEF
TTTGAAGTGCCTACCCCTTTAAAAATCCTTTTAGAATTAGAGGGAGCCAAGAGCGTTCTAGTCGCTAATATCATGGTC PLKILLELEGAKSVLVANIMVVQT
GTTCAAACCCCCATTGAGCTTTCCACCGTGAATTATTTAGCCCCTTTAATTTTCAATTTGGACAAGCAGCTCATGGGG TVNYLAPLIFNLDKQLMGQVVLD
CAAGTGGTTTTGGATTCTAACAAATACCCACACTACCATTTAAGAGAGAATATTCTAAGCCACACGCATGAATGATGC PHYHLRENILSHTHE
ATAAGCGGTTTCTATGCAAAGCGTTTCTTGAAGCTTATGGCGCAATGCATCGGTGTTTTTGCAAATACAAGCGAGCG
GTTTTCTTATGGTATAATAGCGGTTATTTTTTTAGAATAACCAACGCTAAAAATAGCGCTTTAATTGAATCCAAAACAA
GGAGTTTATTTTATGGAACAAAGCCATCAAAACTTGCAATCTCAAT ΓTATAGAGCATATCTTACAAATTCTACCTCA
CCGCTATCCCATGCTTTTAGTGGATAGAATTA
HP0194 1573 CGATGAAGTCAAAAGAGGCTCAAGAGACGAAACGATTAATTCTGCGAGAGACGTTTGGCAAGCAGCCAAATCCCAA 1574 DEVKRGSRDETINSARDVWQAA
GCCACTTTAGCCAAAGAGACTTATAAGCGCGTTCAAGATTTGTATGATAATGGCGTGGCGAGCTTGCAAAAGCGCGA TLAKETYKRVQDLYDNGVASLQ
TGAAGCCTATGCGGCTTATGAAAGCACTAAATACAACGAGAGCGCGGCTTACCAAAAGTATAAAATGGCTTTAGGGG AYAAYESTKYNESAAYQKYKMA
GGGCGAGCTCTGAAAGTAAGATTGCCGCTAAGGCTAAAGAGAGCGCGGCTTTAGGGCAAGTGAATGAAGTGGAGT SSESKIAAKAKESAALGQVNEVE
CTTATTTAAAAGACGTCAAAGCGACAGCCCCAATTGATGGGGAAGTGAGTAACGTGCTTTTAAGCGGTGGCGAGCTT DVKATAPIDGEVSNVLLSGGELS
AGCCCTAAGGGTTTTCCTGTGGTTTTAATGATAGATTTAAAGGATAGTTGGTTAAAAATCAGCGTGCCTGAAAAGTAT PWLMIDLKDSWLKISVPEKYLN
TTGAACGAGTTTAAAGTGGGTAAGGAATTTGAAGGCTATATCCCGGCGTTGAAAAAAAGCACGAAATTCAGGGTCAA KEFEGYIPALKKSTKFRVKYLSV
ATATTTGAGCGTGATGGGGGATTTTGCGACTTGGAAAGCGACGAATAATTCCAACACTTACGACATGAAAAGCTATG ATWKATNNSNTYDMKSYEVEAI
AAGTGGAAGCCATACCCTTAGAAGAGTTGGAAAATTTTAGGGTAGGGATGAGCGTGTTAGTTACCATTAAACCTTAAA ENFRVGMSVLVTIKP
AAGGATTGTTTTGTTCAGATTGATAAGCGCATGGGTTTTACAAGACAAGTTCTTGTTTATCGTGTGTTTTATATTGCCT
TTTTGTTTAGGGGTTTTAGGCACGCAAATCTTTAAACAAGAAATCCCAAGACAGCTCCCTATCGTGGTGGTGGATTTG
GATAAGACCACGACAAGCCATCAAGTAGCGTTTGAATTAGGTGCAACGAGCGCGCTTCAAATCAAACACCAAGTGG
CTAGCCTTTTAGAAGCCAAACGCTTTTTAAACTCCGCCGAAGCGTATGGGGCGTTAGTTTTGCgTAGAGATTTAGAG
AAAAAAATCAAAATGGGGCGAA
Figures imgf000279_0001 and imgf000280_0001
HP0042 1595 GCGTTTAGAAATGATCCCTAAAAAAGCCAACCTTATGATTTTAGATAAAGAAAAATGCGTGATAGAAGCCTTTCGTTTT 1596 RLEMIPKKANLMILDKEKCVIEAF
AATGACAGGGTCGCTAAAAACGATATΠTAGGGGCATTGCCTCCTAATATTTACGAGCATCAAGAAGAGGATTTGGAT RVAKNDILGALPPNIYEHQEEDL
TTTAAGGGATTGTTAGACATATTAGAAAAAGATTTTTTATTTTACCAACACAAAGAATTGGAACACAAAAAAAATCAAAT LDILEKDFLFYQHKELEHKKNQII
CATCMGCGATTAAACGCCCAAAAAGAACGCTTGAAAGAAAAATTAGAAAAACTAGAAGATCCTAAAAATTTACAATT QKERLKEKLEKLEDPKNLQLEAK
AGAAGCGAAAGAATTGCAAACTCAAGCCTCATTGTRGCTCACTTACCAGCATTTAATCCATAAGCATGAAAGCCGCGT QASLLLTYQHLIHKHESRWLKD
GGTTTTAAAGGATTTTGAAGATAAAGAACGCGCAATTGAAATTGATAAGAGCATGCCCTTAAACGCTTTTATCAATAAA ERAIEIDKSMPLNAFINKKFTLSK
AAATTCACTCTCAGTAAAAAAAAGAAACAAAAATCGCAATTTTTGTATTTAGAAGAAGAGAATTTGAAAGAAAAAATCG KSQFLYLEEENLKEKIAFKENQI
CTTTTAAAGAAAATCAAATCAACTATGTTAAAGGAGCGCAAGAAGAMGCGTTTTAGAAATGTTTATGCCCTCTAAAAA AQEESVLEMFMPSKNSKIKRPM
TTCTAAAATCAAACGCCCTATGAGCGGGTATGAAGTGCTGTATTATAAGGATTTTAAAATCGGTT VLYYKDFKIG
HP1493 1597 GTTACGCTTCAAAACTCATGGATAGTTTTGGCATCGCTCGTTTGTATTCTAATCCTGCTTTATTGGATTTGGATAACCA 1598 YASKLMDSFGIARLYSNPALLDL
TGGTATTCCTAATGACATTAATGATAATGAAGAAGCGCCCAAAAACACTCCCTTAAAAAAATACGCTAAAAATTTGAG PNDINDNEEAPKNTPLKKYAKNL
CGCTTTAGCCCAAGACAACGCTTTAGATCCAGTCATTGGCAGAGAAGAAGAGATTTTAAGAGTGATAGAAATTTTAG QDNALDPVIGREEEILRVIEILGR
GGCGCAGAAAAAAGAATAACCCGCTTTTAATTGGCGAAGCGGGCGTAGGGAAAACCTCCATCGCTGAAGCTTTGGC NPLLIGEAGVGKTSIAEALALKIA
TTTAAAAATCGCTCAAAAAGAAGTGCCGGAGTTTTTGCAAGAATATGAAGTCTATTCTTTGGATTTAGCCTTAATGGTG PEFLQEYEVYSLDLALMVAGAK
GCTGGGGCAAAATACAGAGGGGATTTTGAAAAACGCTTGAAAAAAACGCTCAAAGAAATCCAACAAAACGGCCGTAT EKRLKKTLKEIQQNGRIILFIDEIH
CATTTTATTCATTGATGAAATCCACACCCTTTTAGGCACAGGGAGCAGTAACGCTGGGAGCTTGGATGCGGCGAATA GSSNAGSLDAANILKPVLTDGSL
TATTAAAACCGGTTTTAACGGATGGGAGCTTGAAATGTTTAGGAGCGACCACTTTTGAAGAATACCGCAGCGTGTTT ATTFEEYRSVFEKDKAFNRRFS
GAAAAAGACAAGGCTTTTAATAGGCGTTTTTCAGTCATAAAAGTTGAAGAGCCTTCTAAAGAGGCGTGTTACTTGATT EPSKEACYLILKKIAPLYEEHHQV
TTAAAAAAGATCGCTCCCCTTTATGAAGAACACCACCAGGTGCGTTATGATGAGAGCGTGTTTAAGGCATGCGTGGA SVFKACVDLTSDYMHDKFLPDK
TTTAACGAGTGATTACATGCATGATAAATTCTTGCCGGATAAAGCGATTGAATTATTAGATGAGGTGGGATCGAGGAA DEVGSRKKISPKKGKKIGVDDVK
AAAAATCAGCCCTAAAAAGGGCAAAAAAATCGGCGTTGATGATGTGAAAGAAACGCTCGCTCTAAAGCTTAAAATCC LKLKIPKMRLSSDKKALLRNLEK
CTAAAATGCGTTTGAGCAGCGACAAAAAAGCCCTTTTAAGGAATTTGGAAAAATCGCTTAAAAATAAGATTTTTGCCC IFAQAEAISLVSNAIKIQHCGLSA
AAGCAGAAGCGATCAG GSFLFVGPSGVG
HP1493 1599 CAAGAAAACGCATGAAAATTAAAAATATCTTACTGAGTGGGGGGAGCGGGAAACGCCTATGGCCTTTAAGCCGTAGC 1600 QENA
CTATACCCTAAGCAATTTTTAAAGCTTTTTGACCATAAAAGCTTGTTTGAATTGAGTTTTAAAAGAAACGCTTCCTTAGT
AGATGAAACGCTCATTGTGTGCAATGAAAAGCATTATTTTTTAGCCCTAGAAGAGATAAAGAATGAAATCAAAAACAA
AAGCGTGGGTTTTTTATTAGAGAGCTTGAGTAAAAACACCGCTAACGCCATCGCTTTGAGCGCTTTAATGAGCGATA
AAGAAGATTTGCTCATCGTTACGCCAAGCGATCATTTGATTAAAGACCTTCAAGCGTATGAAAACGCGATAAAAAAAG
CGATTGATCTAGCCCAAAAAGGTTTTTTAGTCACTTTTGGGGTGAGTATTGACAAGCCCAACACGGAGTTTGGGTATA
TTGAAAGCCCTAATGGTTTAGATGTCAAGCGATTCATTGAAAAGCCAAGCCTAGACAAAGCGATAGAGTTTCAAAAAA
GCGGGGGTTTTTATTTCAATAGCGGCATGTTTGTTTTCCAAGCGGGCGrπTTTTAGACGAACTAAAAAAGCATGCCC
CCACTATTTTAAAGGGGTGTGAAAGAGCGTTTGMTCTTTAGAAAACGCGTATTTTTTTGAAAAAAAGATCGCTCGTTT
GAGCGAAAAGAGCATGCAAGATTTAGAAGACATGAGTATTGATATAGCCTTAATGCAACAAAGCCACAAAATCAAAAT
GGTAGAATTGAACG
HP1493 1601 AGCCTATCTTTGTTCTAATGAGGGGGCGTTTGAGAGCGATTATTTAAACCCTATCCAATCAAGCTATCAAGCTTTAAT 1602 AYLCSNEGAFESDYLNPIQSSY
GCGAAAAGATAGCCCTAAATTATACAACCATCAAGCCACCAACCACTCGCAAGCCGCTTTAGAGAAATTAAAACTCAT KDSPKLYNHQATNHSQAALEKL
TAACAAAGAACAAGGCAAAGAATGCTTGCCTAAAAACTTGCATGGCAAACAGCAATTCAAAAGCACATGGGGGCGCC EQGKECLPKNLHGKQQFKSTW
TGAATTGGAATAAAATCAGCCCCACCATAGACACACGATTTGACACTCCCAGCAATGGCACCAACTCCCACCCCGAA WNKISPTIDTRFDTPSNGTNSH
TTGCACCGCTCTATCACGCCTAGAGAAGCCGCTAGGATACAAAGTTTTAGCGATAATTATATCTTTTATGGCAATAAA SITPREAARIQSFSDNYIFYGNK
ACGAGCGTTTGTAAGCAAATCGGTAACGCTGTGCCTCCTCTTCTAGCCCTAGCCTTAGGCAAAGCGATCTTAAAA QIGNAVPPLLALALGKAILK
HP0862 1637 TGGGGGCGTGGGAGCAGCAAGCAGAAAGGGCATTTTAATGAAAGGCGTGCATGTTTTAGAGGTGCTTACCCAAGCT 1638 GGVGAASRKGILMKGVHVLEVL
AAAAGCATCGCTTTTGATAAAACCGGCACTTTGACTAAAGGCGTTTTTAAAGTAACGGATATTGTGCCGCAAAATGGG SIAFDKTGTLTKGVFKVTDIVPQN
CATTCTAAAGAAGAAGTTTTGCATTACGCTTCTTGTTCGCAGCTCTTATCCACGCACCCTATCGCTTTATCCATTCAAA EEVLHYASCSQLLSTHPIALSIQK
AAGCATGCGAAGAAATGTTAAAGGACGATAAGCACCAACATGACATTAAAAATTACGAAGAAGTGAGCGGGATGGG MLKDDKHQHDIKNYEEVSGMGV
GGTTAAAGCGCAATGCCATACGGATTTAATCATCGCAGGGAATGAAAAAATGCTCGATCAATTCCATATCGCGCACA HTDLIIAGNEKMLDQFHIAHSPSK
GCCCTTCCAAAGAAAACGGCACGATCGTGCATGTGGCTTTCAATCAAACTTATGTGGGCTATATCGTCATTAGCGAT IVHVAFNQTYVGYIVISDEIKDDAI
GAGATTAAAGATGACGCCATAGAGTGCTTAAGGGATTTAAAAGTGCAAGGGATAGAAAATTTTTGCATTTTGAGCGG DLKVQGIENFCILSGDRKSATESI
GGACAGAAAGAGCGCGACTGAGAGCATCGCTCAAACCCTAGGCTGTGAATACCATGCGAGCTTGTTGCCTGAAGAA GCEYHASLLPEEKTSVFKTFKER
AAAACGAGCGTGTTTAAAACCTTTAAAGAACGCTATAAAGCTCCGGCGATTTTTGTAGGCGATGGCATCAATGACGC AIFVGDGINDAPTLASADVGIGM
TCCGACTCTAGCGAGTGCTGATGTGGGGATTGGCATGGGGAAAGGATCAGAATTGAGCAAGCAAAGCGCAGACATT ELSKQSADIVITNDSLNSLVKVLAI
GTGATCACCAATGACTCTTTAAATTCGTTAGTGAAAGTTTTAGCGATCGCTAAAAAAACTAAAAGCATTATTTGGCAAA KSIIWQNILFALGIKAVFIVLGLMG
ATATCTTGTTCGCTTTAGGGATTAAAGCCGTTTTTATCGTGCTAGGGCTTATGGGGGTAGCGAGCTTGTGGGAAGCG WEAVFGDV
GTCTTTGGCGATGTGG
HP0862 1639 CTTAGAATTAAACCTGCTTGCCACCCAAAGCGTCTTGCCTGCAAAAATCATTCTTTTACTCATAGACAAAGAGGGCTT 1640 LELNLLATQSVLPAKIILLLIDKEG
AAAACAGCGCTTAAGCCTTAAAAGTTTAGATAAAATAGAAAACCAAGGCATAGAAAAATTACTTCATATCCAGCAAAA LSLKSLDKIENQGIEKLLHIQQKL
GCTCAAAACCCACGCTTATGCGTTACAAGAAAAATTTGGGTGCGAAGTTTTGGAATTAGACGCTAAAGAAAGCGTTA ALQEKFGCEVLELDAKESVKNLH
AAAACTTGCACGAAAAAATCGCCGCTTTTATAAAATGCGCTGTTTAACCTGTTTGAAGCTTTCTTTTAAGCCTCTTTGC AFIKCAV
CCAAAT
HP0402 1641 CTTGCGCGACATGCGATGGCTTCTTTTATAAAAATAAAGAAGTAGCGGTGCTTGGTGGAGGCGATACCGCCGTAGAA 1642 CATCDGFFYKNKEVAVLGGGDT
GAGGCGATTTATTTGGCCAACATCTGCAAAAAAGTCTATCTCATCCACAGAAGAGATGGTTTTAGGTGCGCGCCTAT AIYLANICKKVYLIHRRDGFRCAP
CACTTTAGAGCATGCCAAAAACAACGATAAGATTGAGTTTTTAACCCCTTATGTGGTGGAAGAAATCAAGGGCGATG AKNNDKIEFLTPYVVEEIKGDAS
CTTCTGGAGTGTCTTCTTTAAGCATTAAAAACACAGCCACTAACGAAAAAAGAGAATTGGTTGTGCCGGGGTTTTTTA SIKNTATNEKRELVVPGFFIFVGY
TTTTTGTGGGTTATGATGTGAATAACGCCGTGTTGAAACAAGAAGATAACTCCATGCTATGCAAATGCGATGAATACG AVLKQEDNSMLCKCDEYGSIVV
GCTCTATTGTCGTGGATTTTTCCATGAAAACGAATGTTCAAGGCTTGTTTGCGGCAGGAGATATTCGCATTTTTGCCC KTNVQGLFAAGDIRIFAPKQVVC
CTAAACAAGTGGTTTGCGCTGCAAGCGATGGCGCTACGGCAGCCTTAAGCGTGATTTCTTATTTAGAACACCATTAA GATAALSVISYLEHH
ATCAAGCTTATAACCCTAGATTTTAGGGTTATGAGTTCTCGCTTCAAACATTCATCGG
HP0402 1643 CAAGGTGTATGATTTATCGGTTTATGTGCAGATGCAAGTGGATATTTTGTCTTCTAAAAAATAAAGCTTTTAAGGGGAT 1644 KVYDLSVYVQMQVDILSSKK TTTATAAATATCTTAAATAGGGAGTTAGAAGGTTTTTATGAATTACGATAACTATTGGGATGAGGACAAACCAGAACTC AATATCACGCCTTTAGTGGATGTGATGCTCGTTTTGTTGGCTATTCTTATGGTAACGACACCCACTCTTACTTATAAAG AAGAAATTGCTTTGCCTTCTGGCTCAAAAACTGCTAGAGCCACTCAAGATAAAATGATAGAGATCCGCATGGATAAAG ACGCAAAAATCTATATAGATAGTCAAACCTATGAATACAACTCTTTCCCA
HP0402 1645 CTTTAGGGGTGGCTTTGATCTTATGTTCAGGCTTACTCATTGCCTTGCCAGCTCTTTTAAAAGAATTAAAAAAAATTTA 1646 LGVALILCSGLLIALPALLKELKKI AGCGATGCAACTAAGCCCCTTACAAAGCGCTCTGTTATACTTTAGCTATTTTATTTATCCAGAGAAAAAAACAAGAAG CTTTGATTTAAGCGATTTAGTCTTTATTATCATGGTTTTTTTAGTCCTGGCTTTGGGGCTGTTGATGAGTGGAGAAATT TCTATCAGCTACAATGAAGCGAAGGATTTTTTTTATAGCAGCGCATGGTTTGTTCAAATCGCTCAAAAAAGCACTGAA ATTTTAGGCCAAAACGATTTGGCTTTAAGATTGCCTTTTTTGATCGCTCATCTCATTAACATGTTT
HP0402 1647 TTTTGGTTTGCTGTTTTTAAGGCAACACATGAATCTTTTAAAAAGGGCATGCGATTTTTCGGTATTAGATAGCGATGAA 1648 FGLLFLRQHMNLLKRACDFSVL GTCAAAACGCTGTGCAAACAGCTCAAAATTTCAAACTTTAGAGCCAGTATTTCTCAAATCAAAAACGGCATGATGGAT KTLCKQLKISNFRASISQIKNGM TTGAGCATGCAAGATAGCGAATGCTATAAAGCGTATGAGCTTTATCAAAACGCGCTCAAAAAAGACAATTTAGTGGAT MQDSECYKAYELYQNALKKDNL TTTGACGATTTGCTTTTTTTAAGCCTTAAGATTTTACAAGATAATGAAACCATCGCCAAAGAGACTAGCGAG DLLFLSLKILQDNETIAKETSE
HP1255 1659 TTCGTTTTATGCGGCGGCGTCTATGGCCTACATGCTAGGGGCAAAGCATGCGTTTGATGCGGATCACATCGCTTGC 1660 SFYAAASMAYMLGAKHAFDAD
ATAGATAACACCATTAGAAAGCTCACCCAACAAGGCAAAAACGCCTATGGTGTGGGGTTTTACTTTTCTATGGGGCA NTIRKLTQQGKNAYGVGFYFS TTCAAGCGTGGTGATTTTAATGACCATCATCAGCGCGTTTGCGATCGCTTGGGCTAAAGAACACACGCCGATGCTAG VVILMTIISAFAIAWAKEHTPMLE
AAGAAATAGGGGGGGTAGTGGGGACTTTAGTTTCTGGGCTTTTTTTGCTCATTATAGGGCTATTGAATGCGATTATTC VGTLVSGLFLLIIGLLNAIILLDLL TCTTGGATTTATTAAAAATATTCAAAAAATCGCACTCTAATGAAAGCCTAAGCCAGCAACAAAATGAAGAGATCGAGC HSNESLSQQQNEEIERLLTSRG
GGCIC I lAACGAG'IAGGGGCT I GC ICAACCGC I l l l l IAAACCC ITGT I I AAT G I CT CCAAG TCG I GCAT Al IT FKPLFNFVSKSWHIYPIGFLFGL
ATCCTATCGG I TTTCT'I TTTGGGCTGGGl \ I TGATACCGC I'A I GAAA I CGCGCT I T I GCCC I CTCTAGCAGCGCG SEIALLALSSSAIKVSMVGMLSL
ATTAAAG I GAGTATGGTGGGCAI CT CTC I MACCCATT CTT1 1 1 CCGC I GCA I GAG I I'TGT I ΓGACACTTT AGAT GMSLFDTLDGAFMLKAYDWAF GGGGCGTTCATGCTCAAGGCGTATGACTGGGCGTTCAAAACCCCTTTAAGAAAAATCTATTACAATATCTCTATCACG KIYYNISITALSVFIALFIGLIELFQ GCCTTAAGCGTGTTTATCGCGCTCTTTATTGGCTTGATTGAGCTTTTTCAAGTCGTTAGCGAGAAACTCCATTTAAAAT LHLKFENRLLRALQSLEFTDLG TTGAAAACCGCCTTTTAAGAGCCTTACAAAGCCTGGAATTTACAGACTTGGGCTATTACTTGGTGGGCTTATTTGTAA FVIAFLGSFFLWKIKF
TAGCGTTTCTAGGATCGTTCTTTTTATGGAAAATCAAATTT
HP1255 1661 CCGCCTCAAACAACGCACCGAGCATGATTTAGAAATGATTAGCGCGACCGGTGTGTGTAAGGGCATTGAAAATTACG 1662 RLKQRTEHDLEMISATGVCKGI CGCGCCATTTCACCGGTAAAGCCCCTAACGAAACGCCTTTTTGCTTGTTTGATTATTTAGGGATTTTTGAGCGGGAGT HFTGKAPNETPFCLFDYLGIFE
TTTTAGTCATTGTGGATGAAAGCCATGTGAGTTTGCCACAGTTTGGGGGGATGTATGCAGGGGATATGAGCAGGAAA VDESHVSLPQFGGMYAGDMS
AGTGT1 1 I AG 1 GGAATATGG IT 1 I AGAT TGCCTAGCGC M 1 AGACAACCGCCC'M "1 AAAA 1 M GA I GAAT 'I I AT CCA I A EYGFRLPSALDNRPLKFDEFIH AAAATTGCCAGTTCCTTTTTGTGTCCGCTACGCCCAATAAGCTAGAATTAGAGCTTTCCAAAAAGAATGTCGCTGAGC LFVSATPNKLELELSKKNVAEQII AAATCATTCGCCCTACAGGGC IT IT AGACCCTAAATTTGAAGTGCGAGACAGCGATAAGCAAGTCCAGGATTTGTTT LLDPKFEVRDSDKQVQDLFDEI GATGAAATCAAGTTAGTGGTGGCTAGAGGTGAAAGGGTGCTCATCACCACGCTCACTAAAAAAATGGCAGAAGAATT RGERVLITTLTKKMAEELCKYY GTGCAAATATTATGCTGAATGGGGCTTGAAGGCGCGTTACATGCATAGTGAAATTGATGCGATTGAAAGGAATCACA KARYMHSEIDAIERNHIIRSLRL
TCATCCGCTCTTTAAGGCTTAAAGAATTTGACATTTTAATAGGGATCAATCTTTTAAGAGAAGGGCTGGATTTGCCTG GINLLREGLDLPEVSLVAIMDAD AAGTCTCTTTAGTAGCGATCATGGATGCGGATAAAGAAGGGTTTTTAAGGAGTGAAACAAGCCTCATTCAAACCATG RSETSLIQTMGRAARNANGKVL
GGGCGAGCCGCTAGAAACGCTAATGGCAAGGTTTTATTATACGCTAAAAAGATCACTCAAAGCATGCAAAAAGCCTT ITQSMQKAFEITSYRRAKQEEF TGAGATCACTAGTTACAGGCGCGCCAAACAAGAAGAGTTCAATAAAATCCATAACATCACCCCCAAAACCGTTACGC TPKTVTRALEEELKLRDDEIRIA GCGCTTTAGAAGAGGAATTGAAATTAAGAGACGATGAAATTAGAATCGCTAAAGCCTTAAAAAAGGACAAAATGCCTA DKMPKSEREKIIKELDKKMREC AAAGTGAAAGGGAAAAA FEEAMRLRDEIAQLRTL
HP1255 1663 GGAAIAT CG IAT CCAI l A I CG I G I GG I 1 1 I G IAM I I rCAAA I i 1 C 1 ACAGAGAGC 1 I GG I GCAAC I C I I CACAACGAI 1664 NIVSIIVWFCIFQISTESLVQLFTT
CC I A'IACC'I 1 IT'TAT I GGC I I GTATTATM lAACCGG I'GGAAIAAGTCATGCAAGCAG'I A I 1 1 l AGCGAATGGGGAG GLYYFNRWNKSCKQ
TTTCCTAAATCTCAAAAATGCTTAGACCTTTTAAAAAACGCTCCCTTTTTAATCGCATGCGATGGGGCTGTTACCTCAT TACATGCGCTTCAATTCAAACCCAGCGTTGTTATAGGCGATCTAGATAGCATTGATTCGCATTTGAAAGCTTTGTATA
ACCCTATACGCATGAGTGAACAAAACAGCAACGATTTGTCCAAAGCCTTTTTTTATGCTTTAAATAAAGGCTGTGATG
ACTTTATTTTTTTAGGGTTGAATGGCAAGCGAGAAGATCACGCTTTAGCGAACACTTTTTTATTGTTGGAATATTTTAA
AT I I ΓGCCAAAAAAT CCAAGCCATAAGCGACTATGGTCTT TTTAGGGTG lAGAAACCCC I I T CAC IT TGCCCAG I M l
AAAGGGGAACAAATCTCGCTTTTTAGCCTGGATCTTAAAGCCCAATTCACTTCTAAAAACCTCAAATACCCCTTAAAA
AACTTGCGTTTAAAAACGCTCTTTTCTGGCTCGCTCAATGAAGCTACAGATAGTTATTTTAGCCTTAGCTCTACACCTA AATCGGTGGTGTTGGTGTATCAAAAATTCTTATAAGCGGGTTTTGTTAGGCAAGTTTTTGTCTGTATATTGTGTCAAGT TAAGAAGCGTTTGAGTTAGCAAGTAGAGAAAGATTGATTCAGTTTCGCTTTGCGTGGAGGTGGTAGGTTTAGTGTTTT
TΆAT I GCAΠ GI T GGCGGACACAATAAIAGAAGT I CAT AT Π I ΓCAACAAAACCAC ACAGAGCGAACAAAI CAAC CCACTACCCATATCGGCTAAAATCCAAAGTAATTTAACATAAAGGAATAACCAAGATCTAGCATGTCTCAAAATACGC
TTGA
HP1255 1665 GGCGGTGGATGAAGAACACAATAAAGATGCGATGAAAATGACGGATTTAGAGGCTCAAAAATTAGGCAGCGTGTTGT 1666 AVDEEHNKDAMKMTDLEAQKL TGTCTGATGTGGAAATGGGGGGTAAAATCTTGCTCAAAGCGATCCCCATTTTAGATGGCGAAATGCTTACAGATGCG SDVEMGGKILLKAIPILDGEMLT
AAAGTGGTGTATGACCAAAACAACCAGCCGGTGGTGAGCTTCACGCTGGATGCGCAAGGGGCTAAGATTTTTGGGG YDQNNQPVVSFTLDAQGAKIF ATTTCTCAGGTGCGAATGTGGGCAAACGCATGGCGATTGTTTTAGACAATAAGGTCTATTCAGCCCCGGTGATTAGG ANVGKRMAIVLDNKVYSAPVIR GAGCGTATCGGTGGGGGGAGCGGGCAGATTAGCGGGAATTTTAGCGTGGCTCAAGCGAGCGATTTAGCGATCGCT GSGQISGNFSVAQASDLAIALR TTAAGGAGTGGGG
HP0753 1685 GGGTGCGTTGTGTTTAGGGGGGCTAATGGCAGAGCAAGACCCTAAAGAGCTTGTGGGTTTGGGGGCAAAGAGCTA 1686 GALCLGGLMAEQDPKELVGLGA
CAAAGAGAAAGATTTCACTCAAGCGAAGAAATATTTTGAGAAAGCGTGCGATTTGAAAGAAAATAGCGGGTGTTTTAA EKDFTQAKKYFEKACDLKENSG
TTTAGGGGTGCTTTATTATCAAGGGCAAGGGGTGGAAAAGAACTTGAAAAAAGCCGCCTCCTTTTACGCTAAAGCTT GVLYYQGQGVEKNLKKAASFYA
GCGATTTGAATTACAGCAATGGGTGTCATTTGCTAGGGAATTTATATTACAGCGGGCAAGGCGTGTCCCAAAACACC LNYSNGCHLLGNLYYSGQGVSQ
AATAAAGCCCTACAATACTACTCTAAAGCGTGCGATTTGAAATACGCTGAAGGGTGCGCGAGCTTAGGGGGGATTTA ALQYYSKACDLKYAEGCASLGGI
TCATGATGGTAAAGTGGTAACTAGGGATTTTAAAAAAGCGGTGGAATATTTCACTAAAGCGTGCGATTTAAACGATGG KVVTRDFKKAVEYFTKACDLND
CGATGGTTGCACGATATTAGGGAGCTTGTATGATGCAGGCAGAGGTACGCCTAAAGATTTGAAAAAGGCGCTCGCT TILGSLYDAGRGTPKDLKKALAS
TCGTATGATAAAGCTTGCGACTTAAAAGACAGCCCAGGGTGCTTTAACGCAGGGAATATGTATCATCATGGCGAAGG CDLKDSPGCFNAGNMYHHGEG
TGCAACGAAGAATTTTAAAGAGGCTCTCGCTCGTTATTCTAAAGCATGCGAATTGGAAAATGGCGGAGGGTGTTTCA FKEALARYSKACELENGGGCFN
ATTTAGGGGCTATGCAATACAATGGCGAAGGCGTAACAAGGAATGAAAAGCAAGCCATAGAAAACTTTAAAAAAGGC QYNGEGVTRNEKQAIENFKKGC
TGTAAATTGGGCGCTAAAGGGGCGTGCGATATTCTCAAGCAGCTTAAGATCAAAGTTTAGTTTGAATAAGGCTTGAT KGACDILKQLKIKV
CAAATGGCTTTAAAAAGCGCTCTTT GGGGGTTAGGCGGATTTTAGGGGAGGTTTTAGGGGAGAATAATTTTAAAATA
CTTCCTATCCCCTTAAGATAATGAGTTTAAACCACCCAAGCAGAAATCCCAAATATCTTTAGTGTTTGGGATGAATGC
TTTAGTTCCGCTAGGAACTATA
HP0753 1687 CTACGCATAGCGAATCAGCCCTAGACTTGCTCAAGTTATTGAAAAAAAACCAGATGAATGCAAGCGCGATTGAGATC 1688 THSESALDLLKLLKKNQMNASAI
GCTCACTTGCTCCTCAATCAAGATGATGATCTGAAAGCTAAAGAGCAAGCGCTTTATGATTTAGGAGCGTTGTATGCA LLNQDDDLKAKEQALYDLGALY
AGGATCAAGGACTTTAAAAACGCCCACCTTTACAATCTGCAATATTTGCAGGACCATGCGGAACTGGATAAAGCTTCT FKNAHLYNLQYLQDHAELDKAS
GTCGTTAGGGCGCGCGATGAAAAAGCCCTTTTTTCCATGGAGGGGAACACGCAAGAAAAAATCGCCCACTATGACA RDEKALFSMEGNTQEKIAHYDKII
AAATCATTCAAAATTTCCCTAATTCTAATGAAGCCCTAAAGGCTTTAGAATTGAAAGCCCAACTATTGTTTGAAAACAA NSNEALKALELKAQLLFENKRYA
GCGTTATGCTGAAGTGTTAAGCATGCAAAAAAATTTGCCTAAAGATTCCCCTTTGATCCAAAAAACGCTCAATGTCCT MQKNLPKDSPLIQKTLNVLAKTP
TGCTAAAACCCCATTAGAGAACCATCGTTGTGAAGAAGCCTTAAAATATTTATCCCAAATCACAACCTTTGAATTCAG RCEEALKYLSQITTFEFSPKEEIQ
CCCCAAAGAAGAAATCCAAGCCTTTGATTGCTTGTATTTCGCATCGCTCAAAGAAAAAGCGCAAATCATTGCCCTAAA LYFASLKEKAQIIALNAFKTAKAP
CGCTTTTAAAACGGCTAAAGCCCCTAGCGAGAAATTAATATGGCTTTATCGTTTGGGGCGCAATTACTACCGCTTAG WLYRLGRNYYRLGDFKNSTLAS
GGGATTTTAAAAATTCCACTCTGGCCTCTAAAGACGCTTTAATTCTCGCTCAAAGCTTGAATAAAAAAGAATTTTATGA LAQSLNKKEFYDIAFVLFSDYMQ
TATTGCTTTTGTTTTATTTTCAGATTACATGCAAAACAATGAAAAAGAATTGGCTCTCCATTTGTATGCGTTTTTAGAAA ELALHLYAFLEKHFKGDKRMALV
AGCATTTCAAAGGCGATAAACGCATGGCGCTAGTTTATTTTAAATTGCTAGAAAATGAAAAAGATCCTAAAAGCGTCA LENEKDPKSVKIYATSLLKLQDA AAATTTATGCCACAAGCTTGCTCAAACTCCAAGACGCTTATAAGGACTATTCTTACAC . Y
HP0753 1689 TTTACAGCAATCCAAGAGCATGGGGGATTTATTGGCTAAAGCGATGCCTATAGAAAGGATTTTAAAAGCGTATTCTGT 1690 LQQSKSMGDLLAKAMPIERILKA
TCCGGTGGGTTCGTTAGAAAATTATGAAAAAATCTATTATCAAAACGCTTTCAAACCTAAAGTGCAAATCACTTTTGAT VGSLENYEKIYYQNAFKPKVQIT
AACAACGGCGATGCGGAAATCAAAAGCGCTCTCATAAGCGCTTATGCCAGAGTGCTAACCCCTAGTGATGAAGAAAA GDAEIKSALISAYARVLTPSDEE
ACTCTATCAAATCAAAAATGAAGTTTTCACAGACAGTGCTAATGGCATCACGCGCATTAGAGTGGTTGTTAGCGCGA KNEVFTDSANGITRIRVVVSASD
GCGATTGTCAAGGCACGCCTGTATTGAATAGAAGCCTTGAAGTGGATGAAAAGAATAAGAATTTTGCTATCACGCGC PVLNRSLEVDEKNKNFAITRLQS
TTACAATCTTTGCTTTATAAAGAACTGAAAGATTATGCCAATAAAGAAGGGCAAGGCAATACGGGGTTATAAGCAGGA LKDYANKEGQGNTGL
TATCAATCTTGCTTTGTGCTTATTGATTCAGTTTGGTAGCGTTATCATTTTGTTTGGCGTGGCGTTTTGCACCCCTTAA
AATAACTCTCCGAAAGTCTGTGGCTTTTATTCTTGCCAAATAAACTAAAT
HP0753 1691 GCCTTTTGATGATTTTGAAAAAACCCTTTTGCAATTGAAAAAAGAGCATTTTAAAGCCGCGCATTTTGTAACGGCGTT 1692 PFDDFEKTLLQLKKEHFKAAHFV
CCGCTATTCTTTAGAGGGTAAAATCACGGAGGGTTTTAGCGATGATGGCGAGCCTAAAGGGAGTTCAGGCATGCCT YSLEGKITEGFSDDGEPKGSSG
ATGCTTAGCGTTTTAAGGCGAGAGGATTTAATCAATATAGGATTAGTGAGCGTGCGTTATTTTGGAGGCACGCTTTTA SVLRREDLINIGLVSVRYFGGTL
GGGGTGGGGGGCTTGATGAAAGCTTATGCTAAGAGCGCGTTATTGTGCGTAGAAAACGCTCAAAAAGAGGACGCTT GLMKAYAKSALLCVENAQKEDA
TGAAGGATTTTGTGGAGTTGGAAACTTTAAGTGCTCATTATTCTTACAA GAATTAGACGCTCTTCAGCGTGAAATTAA ELETLSAHYSYKELDALQREIKK
GAAATTTAGCTTACAATTAAGCAAAAAGAATTTTTCAAACCAAAGCGTGGAAGTGGAAATCAGCGGCACAAGAGAAAA SKKNFSNQSVEVEISGTRENLQ
TTTGCAAGCGTTTTTACAACAAAATAAGATAAATTAGGGAGAGTGGTATGGGATTTTTGAATGGGTATTTTTTATGGGT NKIN
T
HP1145 1737 TACAGGGCATTATGAAGCGCAAAAGATTTACATCACCGGTAGCATTGAAAGCGGGAATCGCATTTCTAGCGGTGGG 1738 TGHYEAQKIYITGSIESGNRISSG
GGCGCGAGCCTTAATTTTAACGGGCTTCAAGGCATTCTTTTAACGAACGCGACTTTGTATAACCGCGCCGCTGGCAC LNFNGLQGILLTNATLYNRAAGT
GCAAAGCTCGTCTATGAATTTTATCTCTAACAGCGCGAACATTCAGGCTCAAAACTCCTATTTTATAGACGATACCGC MNFISNSANIQAQNSYFIDDTAQ
ACAAAATGGCGGTAACCCTAATTTCAGTTTCAACGCTTTGAATCTGGATTTTTCTAACAGCTCTTTTAGAGGCTATGTG PNFSFNALNLDFSNSSFRGYVG
GGGAAAACGCAATCTGTTTTTAAATTCAATGCCAAGAATGCGATCAGTTTCACCAACAGCACGAATTTAAGCTCTGGT VFKFNAKNAISFTNSTNLSSGLY
TTGTATCAAATGCAAGCTAAAAGCGTGTTGTTTGACAATTCCAATTTAAGCGTTTCAGTGGGGACAAGCAGTATTAAA KSVLFDNSNLSVSVGTSSIKANA
GCCAATGCGATCAATCTTTCTCAAAATGCCTCTATTAATGCGAGCAACCATTCAACCTTAGAACTTCAAGGCGATTTG NASINASNHSTLELQGDLNVND
AATGTGAACGACACCAGCTCGCTCAACCTCAACCAAAGCACGATTAATGTTTCCAATAACGCCACGATCAACGATTAT LNQSTINVSNNATINDYASLIASN
GCGAGCTTGATTGCGAGTAATGGCTCTCACCTTAATTTTAACGGGGCGGTTAATTTCAATTCAGCGAATATTACTACG NFNGAVNFNSANITTSLNNSSIV
AGTTTGAATAATTCCTCTATCGTGTTTAAGGGGGCGGTCTCTTTAGGAGGGCAGTTTAATTTAAGCAATAACTCTTCT SLGGQFNLSNNSSLDFQGSSAI
TTAGATTTCCAAGGCTCTAGCGCTATCACCTCTAACACGGCGTTTAATTTCTATGATAACGCTTTTTCTCAAAGCCCC FNFYDNAFSQSPITFHQALDIKA
ATCACTTTCCATCAAGCCCTTGACATTAAAGCGCCCTTAAGTTTGGGAGGCAACCTTTTAAACCCTAACAACAGCAGC GNLLNPNNSSVLDLKNSQLVFG
GTGCTGGATTTAAAAAACAGCCAGCTTGTTTTTGGCGATCAAGGGAGTTTGAATATCGCTAACATTGATTTACTAAGC LNIANIDLLSDLNDNKNRVYNIIQ
GATCTAAATGAT SNWYERISFFGMHINDGIYDAKN
FTNPLNNALKITESFKDNQLSVT
GIKNTLYNIGSEIFNYQKVYNNA
YSDDAQGVFYLTSNVKGYYNPN
ASGSNNTTKNNNLTSESSIISQT
GNPISALHIYNKGYNFNNIKALG
LYPEIKKVLGNDFSPSSLNALNS
QLTKLITPNDWKNINELIDNANN
FNNGTLIVGATQIGQTDTNSAW
GYQTPCDYTDIVCQKFRGTYLG
HP1145 1739 GAATACTTGAAAAATAAGGGGGTCAAAATCACTGATGCGACTTGCCCGTATGTGATCAAGCCCCAACAAATTGTGGA 1740 EYLKNKGVKITDATCPYVIKPQQI
ATCCATGAGCAAAGAAGGGTATCAAATCGTACTTTTTGGGGATATTAACCACCCTGAAGTCAAGGGCGTGATAAGCT SKEGYQIVLFGDINHPEVKGVIS
ATGCCACTAACCAGGCTTTAGTCGTCAATTCGTTAGAAGAATTGCAAGAAAAAAAGCTCCAACGAAAAGTGGCTTTAG ALVVNSLEELQEKKLQRKVALVS
TCTCCCAAACCACCAAACAAACCCCAAAACTCTTGCAAATCGCTTCTTATTTGGTGGAAAGATGCACTGAAGTGCGTA QTPKLLQIASYLVERCTEVRIFNT
TTTTTAACACGATTTGTAACGCCACTTCTTACAACCAAAAAGCCGCTTTGGATTTGAGTAAGGAAGTGGATATTATGAT SYNQKAALDLSKEVDIMIVVGGK
AGTCGTGGGCGGTAAGACTTCTTCAAACACCAAACAGCTCTTAAGCATCGCCAAACAGCATTGCAAAGACAGCTACT TKQLLSIAKQHCKDSYLVEDENE
TGGTAGAAGACGAAAACGAATTAGAATTAGCATGGTTTAAGGATAAAAAATTGTGTGGGATTACCGCTGGGGCTTCC WFKDKKLCGITAGASTPDWIIEN
ACGCCGGACTGGATTATAGAAAATGTCAAGCAAAAAATCAGCACGATTTAACACATTT STI
HP1145 1741 GACAATCTAACCCCCCTATTCCCTGATGAACAGATCAAATTAGAATACGAACCCACTAAAGTTACCGGCAGAATGCTA 1742 DNLTPLFPDEQIKLEYEPTKVTG GATTTATTCAGCCCTGTGGGGAAAGGCCAAAGGGCTTTGATCGTCGCGCCACCAAGGACTGGGAAAACGGAGCTGA LFSPVGKGQRALIVAPPRTGKTE TGAAAGAACTCGCCCAAGGCATCACTTCTAACCACCCTGAAGTGGAGCTGATTATCCTTCTAGTGGATGAGCGCCCT LAQGITSNHPEVELIILLVDERPE GAAGAAGTTACGGATATGCAACGAAGCGTTAAGGGTCAAGTTTTTAGCTCCAC MQRSVKGQVFSS
HP1145 1743 TCCAGCGCGCTAGAAAGGGAATTGAAACAAAAGAATGAACATTTAGAGAACGCTTTAAAAGAGCAAGAATATTTGAAA 1744 SSALERELKQKNEHLENALKEQ
AACGCATGGCTTTTAGA TGGAAAAACAAAAAGAAATCTTTCACAATAAAAAATTGGAATTGGAAAAATCCTACCAAC AWLLEMEKQKEIFHNKKLELEK
AAGCCCTAAATATCTTAAAAAGCGAAGTCGCTTCAAAAGATACTAGCTCCATGCATAAAGAAATCCATAAAGCGAGCG LNILKSEVASKDTSSMHKEIHKA
AAATTTTAAGCAAACACAAAACAAACCAAGAGATCCCACAAATCATAACGAACTTTCAAGCCAACGAAAAAGCGCGCT KHKTNQEIPQIITNFQANEKARY
ACAAGAATGAAAGCGTGCTGATTGTACAAATTTTAGACAAGGGCTATTATTGGATAGAAACCGAGCTTGGCATGCGT LIVQILDKGYYWIETELGMRLKA
TTAAAAGCGCATGGGAGTTTGTTGAAAAAAATCCAAAAACCCCCTAAAAACAAATTCAAACCCCCTAAAACAACCATT KKIQKPPKNKFKPPKTTIPKPKE
CCTAAACCTAAAGAAGCGAGCTTGCGCCTTGATTTAAGGGGGCAACGCAGCGAAGAAGCCCTGGATTTACTAGACG DLRGQRSEEALDLLDAFLNDAL
CTTTTTTAAACGACGCGCTTTTAGGGGGCTTTGAAGAAGTGCTGATTTGCCACGGCAAAGGGAGCGGGATTTTAGAA EVLICHGKGSGILEKFVKEFLKN
AAGTTTGTGAAAGAATTTTTAAAAAACCACCCCAAAGTGGTGA
HP1145 1745 GCGCACTTCTGGGGAAGTGTCAAAAAAAGAGGGAGGGAAAATCTTTGATAGCGGATCGAGGGCGACGGTAGCGATT 1746 RTSGEVSKKEGGKIFDSGSRAT
ATCTΓ ΓT I I GTGAAAGATAAGAGCACTCCTGATAATACGATTTTTTATTATGAAGTGGAAGATTACTTGAAAAGAGAAG VKDKSTPDNTIFYYEVEDYLKRE CCAAACTCMCTGGCTCGCCAATTTRGAAAATTTGGATTTTGTGCCTTTTGAGAAAATCACCCCGAATGATAAAGGCG WLANFENLDFVPFEKITPNDKGD ATTGGATCAACCAAAGGAATGACGCTTTTGAAAAACTCATCCCTTTAAAAAGAGACAAAACACTCCAAAACGACAGCG RNDAFEKLIPLKRDKTLQNDSVF TTTTTGACATCAATTCTCTTGGCG G
HP1145 1747 TGGGCGCGATGAGATGGTGGTGTGCGGTGGCGTGGAGAACATGAGCGCAGCACCCTATTTGTCGTTTGACATGCG 1748 GRDEMVVCGGVENMSAAPYLS
AGATGGGAAAAGAATGGGGAATGCGAACATGATAGATTCCATGATACATGATGGATTGTGGGATGCGTTCAATAATT DGKRMGNANMIDSMIHDGLWDA
ACCACATGGGGATCACCGCTGATAATGTCGCTCAAGCATACCACATAAGCCGAGAAGATCAAGATAATTTCGCGCTC HMGITADNVAQAYHISREDQDN
CAGTCGCAACTCAAAGCAAGAGCCGCCATTAATGCAGGGAAATTCCAAGAAGAAATCACGCCTATTGAAATAGCGAA QLKARAAINAGKFQEEITPIEIAN
TAAAAAAGGCGTGGTGGTTTTTAAAGAAGACGAATACCCTAGAGAAACGACGCTAGAATCCCTTGCAAAGCTCAAAC VVFKEDEYPRETTLESLAKLKPA
CCGCTTTCAAAAAAGACGGATCGGTAACGGCGGGAAATTCATCAGGGATCAATGATGGCGCGAGTATTATCATTTTA GSVTAGNSSGINDGASIIILCSTK
TGCAGCACTAAAAAAGCGCAAACATTGGGGTTAAAAGCCATGGCGACTATCAAGGGGTTTGGTTTGGGTGGTTGCA LGLKAMATIKGFGLGGCSPDIMG
GTCCGGATATAATGGGTATATGCCCTAGCATCGCGATTA Al
HP1145 1749 AAAGTGATTTATTTAGATCAAGCCCCCATAGGCAAAACCCCACGAAGCAACCCTGCCACTTACACGGGAGTGATGGA 1750 KVIYLDQAPIGKTPRSNPATYTG
TGAAATCAGGATTTTATTTGCCGAGCAAAAAGAAGCTAAAATTTTAGGCTATAGTGCGAGCCGTTTTAGCTTTAATGTT RILFAEQKEAKILGYSASRFSFNV
AAAGGAGGGCGGTGCGAGAAATGCCAAGGCGATGGGGACATTAAAATAGAAATGCACTTTTTGCCTGATGTGTTAG RCEKCQGDGDIKIEMHFLPDVLV
TCCAATGCGATAGCTGTAAGGGCGCTAAATACAACCCCCAAACTTTAGAAATCAAGGTGAAAGGCAAATCCATTGCC CKGAKYNPQTLEIKVKGKSIADV
GATGTGTTGAACATGAGCGTGGAAGAGGCTTATGAATTTTTTGCTAAATTCCCTAAAATCGCCGTGAAGTTAAAAACG VEEAYEFFAKFPKIAVKLKTLMD
CTTATGGATGTGGGCTTAGGCTATATCACTTTAGGGCAAAACGCTACGACTTTAAGTGGGGGGGAGGCTCAAAGGA YITLGQNATTLSGGEAQRIKLAK
TCAAATTAGCTAAAGAATTGAGTAAAAAAGACACAGGCAAAACCCTTTATATTTTAGATGAGCCTACTACCGGTTTGC DTGKTLYILDEPTTGLHFEDVNH
ATTTTGAAGACGTGAATCATCTTTTACAAGTCTTGCATTCTTTAGTGGCGTTAGGCAATTCTATGCTAGTGATTGAGCA HSLVALGNSMLVIEHNLDIIKNAD
TAATTTAGACATTATCAAAAACGCTGACTACATTATAGACATGGGGCCTGATGGGGGGGATAAGGGCGGGAAAGT GPDGGDKGGK
HP1145 1751 GTTAGAATTCCAAAAAATCCAAGCCCTACTCTTTAAAAAAGGGCTTTGTATCACCCCCTATAATGAATTGAATTTAGAG 1752 LEFQKIQALLFKKGLCITPYNELN
CAAAAAGCGAAGGCTAAAACCTATTTTAAAGAGCAGCTTTACGCGTTAGTTTTGCCTTTTAAATTGGATTCTTCACACA AKAKTYFKEQLYALVLPFKLDSS
CTTTCCCGCCTTTAGCGAATTTGACTTTCGCGCTTTTTGCCCGCATCAAAGACAAAGAAACCCAAATTATCTCCTATG PLANLTFALFARIKDKETQIISYAL
CGCTCATCAAACTCCCCTCTTTTATCTTCCGTTTTGTAGAGCTAGAAAAAGGCTTGTTTGTGTTAGCTGAAGAAATCG FIFRFVELEKGLFVLAEEIVEAHL
TGGAAGCGCATTTAGAAGAATTGTTTTTAGAGCATGAGATTTTAGATTGCATGGCGTTTAGGGTAACTTGCGATGCG EHEILDCMAFRVTCDADIAITEDE
GATATTGCTATCACTGAAGATGAAGCGCATGATTATGCAGATTTGATGAGTAAGAGTTTGAGGAAACGCAATCAAGG ADLMSKSLRKRNQGEIVRLQTQ
CGAAATCGTGCGCTTGCAAACCCAAAAAGGGAGTC GAGCTTTTAAAAACCCTCTTAGCGTCTTTAAGGAGTTTTCA ELLKTLLASLRSFQTHSYKKHKL
AACCCACTCTTACAAAAAGCACAAACTCACCGGCATGCATATCTATAAAAGCGCGATCATGCTCAATTTAGGGGATTT YKSAIMLNLGDLWELVNHSDFKA
GTGGGAATTAGTCAATCATAGCGATTTTAAAGCGCTCAAATCGCCCAATTTCACACCCAAAATCCACCCTCATTTCAA NFTPKIHPHFNENDLFKSIEKQD
TGAAAACGATCTTTTCAAATCTATAGAAAAACAGGATCTGTTGCTGTTTCATCCTTATGAAAGTTTTGAGCCTGTGATT PYESFEPVIDLIEQAASDPATLSI
GATTTAATAGAGCAAGCCGCTAGCGATCCAGCCACCCTTTCTATCAAAATGACGCTTTATCGTGTGGGCA RVG
HP0177 1753 GCAAAAAAGACGCTTGCGGGTTCATCTATGAGATCAGCGAGTTCATGAAAGCCTATACCGCATTGCTAAAAAAACAA 1754 KKDACGFIYEISEFMKAYTALLK
GACCGATACGTCTATTTATTGAGGTATCTCCCCTCTAGGTATTGGGCCAGCATTTTAACGACTGCCCTTTATGTCAAA VYLLRYLPSRYWASILTTALYVK
TACCCTGATTTTGACGCTTTGAAAAAGCTTTTGGTGTCTTATTATTACCAAACTTGGATTGCAGGAGGCACGATCACG ALKKLLVSYYYQTWIAGGTITRIK
CGCATCAAGCAAACCAGTATCAACATTATCAAAAACGTTAAAAGCAATAAGAGCGTTGAAACCATCAAAGAGCTTATA IIKNVKSNKSVETIKELILNSIDSY
TTGAATAGCATCGACTCTTATAACACCTTTGATCAATACCTCTATAACTTATGGGATAGCTCTTCTGTTTATCATAGCA YLYNLWDSSSVYHSKWVRPVL
AATGGGTGCGTCCTGTCTTAGCCCTAGCTAATTATTTCATGGCAGATGAAGAGAAACCCCATTTTATCGCTATGGATG FMADEEKPHFIAMDAETQVEHIL
CCGAAACCCAAGTGGAGCATATTTTGCCACAAACGCCCAAAAGAGGCAGTCAATGGAACGCGGATTTTGACAAAGA KRGSQWNADFDKEKREEWVNN
AAAAAGAGAAGAATGGGTAAATAATATCGCGAATTTAACCCTTTTAAAGCGTAAAAAGAACGCGCATGCTTTAAACGG LLKRKKNAHALNGDFDEKRKIY
GGATTTTGATGAAAAAAGAAAAATTTATGGAGGCAAAGACACGAGCAAAGTGATTAGCTGTTATGACATCACTAAAGA SKVISCYDITKELYSNYRKWNEK
ATTGTATAGCAATTATAGGAAGTGGAATGAGAAGTCCCTCCAAGAGCGATACAAATCTTTGTATAACACTATCACGCC RYKSLYNTITPVLHIEGQEDDFE
TGTTTTACACATAGAGGGGCAAGAAGATGATTTTGAAGATGATTTTGATCT
HP0177 1755 GTATGTGGATAACGATTACGTGTTTTTATTCCACAACACGGACAATAAAGACCATGAGTTTTATTTCAAAGTTTTAGGG 1756 YVDNDYVFLFHNTDNKDHEFYF
CAAAAAGACATTCAGATCAAAAAGCCTTTAAATCCTATCGCCATTAAAGCCGGGCAAAAGATTAAAGCGGTAGTGATT QKDIQIKKPLNPIAIKAGQKIKA
TTAAGAAAACCCCTAAAGAGTAACGCCACAGAATACAAGAACGCTAAAGACGCTCTAATCCCCATTACCATACAAGCT LKSNATEYKNAKDALIPITIQAYS
TATAGCGCGGACGATAAGAATATTACGATAGAAAGGGAATCGGTGTTTATTGCACCAAGTGAGGATTGAAGCCTAAA NITIERESVFIAPSED
ACTAGCGTTCAATCACTTCATAAGGCAAGCCTTGTTTGATCAATTCTTCCATAAAGGGATCGGTGTTTAATTCTTCTAT
GTTAAACACCCCGGTCCTAAAAtGATCCGCGCTCCAAGTGTCATTGCAAATCATTTTAGCCGCGCACATCGCTGGCA
CGCCGGTGGTGTAGCTTATGGCTTGCGAACCCACTTCTTCATAGCACTT TATGATCGCACACGTTGTAAATGTAGA
GCGT TGTCTTGGTTGTTTTTAATGCCGGTCATGTAGCACCCGATGTTGGTTTTACCGGTGGTGTCTTTGGCTAGAG TGGCTGGATCCGGMGCAMGTTTΓTAAAAATTGTATCGGCACGATTTTTACGCCTTGATGCTCTATTTCTTTAATGC
CTAGCATGCCGACATTTTCCMGCATTTCATGTGGGTTAAATAATTTTGAGAGAAAGTCATAAAAAACCTCGCCCTCC
HP0177 1757 TTCTTTGATCGCTTCAGTGTTATTATACGCTTATGGCACAGGAGCGATTAAAGGCTTTGCCCTAACTACAGGCATTGG 1758 SLIASVLLYAYGTGAIKGFALTTG
GATTTTAGCCTCTATTATCACCGCTATTGTTGGCACGCAAGGGATTTATCAAGCCCTTTTACCTAAACTCACTCAAAC SIITAIVGTQGIYQALLPKLTQTKS
AAAAAGCCTTTACTTTTGGTTTGGCGTGAATAAAAGAGCTTAGGAGGTTTTATGGAATTATTCAAACGAACTAGAATCT FGVNKRA
TAAGCTTCATGCGTTATTCCAATTATGGGGTGATCGTTTCAGCAATTTTAGCGCTTCTAGCGTTGGGGCTTTTGTTTTT
CAAAGGGTTTTCTTTAGGGATTGATTTTGCGGGGGGGAGTTTGGTGCAAGTGCGCTACACTCAAAACGCCCCCATTA
AAGAAGTGCGCGATCTGTTTGAAAAAGAAGCTCGCTTCAAAGGCGTGCAAGTGAGCGAATTTGGCTCTAAAGAAGAA
ATTTTAATCAAATTCCCTTTTGTAGAAACGGCTGAAAATGAAGATCTGAACGCTATCGTGGCCAACATTCTAAAACCC
AGCGGCGATTTTGAAATCCGTAAATTTGACACCGTGGGCCCTAGAGTGGGGAGCGAATTGAAAGAGAAAGGCATTT
TGTCGCTGATTTTAGCATTAATAGCGATCATGGTTTATGTGAGTTTCCGCTATGAATGGCGTTTTGCTTTAGCGAGCG
TCATTGCGCTTGTGCATGATGTGATTTTAGTGGCAAGCTCGGTGATTGTTTTTAAGATTGATATGAATTTGGAAGTGA
TTGCGGCCTTGCTCACCTTGATTGGGTATTCCATTAATGATACGATCATTATTTTTGACAGGATCAGAGAAGAGATGC
TCTCTCAAAAAACCAAAAACGCCACTCAAGCCATTGATGAAGCCATTTCTAGCACGCTCACGCGCACGCTTTTAACTT
CTTTAACCGTGTTTTTTGTGGTGTTGATTTTGTGCGTGTTTGGGAGTAAGATCATCATTGGCTTTTCATTGCCCATGTT
AATAGGCACGA
HP0291 1759 AGAATAAGCCAAA TCCTTTTTATTTTAGAMCTTCTAAAATCAATAAMCTTACCCCATAGAGCGTTTTAAAGAATTA 1760 NKPKILFILETSKINKTYPIERFKE
GCGTTAATTTTAGAAAATTTTCAAATTTGCTTGTTATGGCATGCTGATGAATATAAAGCCACTACGCTTTATCACGCTT NFQICLLWHADEYKATTLYHALK
TAAAACACCAACGCGATGTGTTATTGCTCCCCAAACTCACTTTAAACGAGGTTAAGGCGTTGCTCTTTAAAATGGATT VLLLPKLTLNEVKALLFKMDLIIG
TGATTATTGGGGGCGATACGGGCATCACGCATTTAGCATGGGCGTTGCAAAAACCCAGCATCACCCTTTATGGCAAC THLAWALQKPSITLYGNTPMER
ACGCCCATGGAGCGTTTTAAATTAGAAAGCCCGATCAATGTTTCGCTCACCGGTAATTCAAACGCCAACTACCATAAA PINVSLTGNSNANYHKKDFSIQN
AAGGATTTTTCTATCCAAAATATAGAGCCTAAAAAAATTAAAGAATGCGTTTTAAACATCTTAAAGGAAAAAGAATGAC IKECVLNILKEKE
TTACAAAGAACGACTCATACACGAAAAAATATTGAAACAAGACGACAAGGGTTTTAAAACAGAACTGCGCATTTTGAG
TATTTTTATCGTGGAATCTTTAGTGAATATTTTGGGGTTTATTTTAGCTAA
HP0291 1761 ATGCGTTGGATAATGAATTAAGCGATCTTTTAGACAAACGCTTAGAAATCGCTTTAAAAATCGCACTCATCAAACAAG 1762 ALDNELSDLLDKRLEIALKIALIK
AAAGCCCCATTTATTGCCCTAAAAGAGAGCAAGAAATTTTAAAACGACTCAGCCAAAGGGATTTCAAGCATTTGAATG CPKREQEILKRLSQRDFKHLNG
GAGAAATTCTTACGGGTTTTTATACAGAGGTTTTTAAGATTTCTAGAAAATTTCAAGAAAACGCCCTGAAAGAGTTAAA YTEVFKISRKFQENALKELKK
AAAATAAAAGAGAGTTGTTATGTTTGAAAAAATTACCCTAGCGCATAAGGACTTGTTTTCAAGGTTTTTAAGCGCTCAA
AAAATCGTTTTATCAGATGTGAGTTTTACGAATTGCTTTTTATGGCAGCACGCAAGGCTCATTOAAGTGGCGGTGATT
AGGGATTGTTTGGTGATTCAAACCACTTATGAAAATCAAAAACCCTTTTATTTCTATCCTATCGGTAAGAATGCGTTTG
AATGCGTAAAAGAGCTTTTGAAATTAGAAAAAAATTTAAGATTCCACTCCCTGACTTTAGAGCAAAAAGACGATTTGAA
AGACAATTTTGTAGGGGTGTTTGATTTCACTTACAACCGAGACAGGAGCGATTACGTTTATTCTATTGAAGAATTGAT
CGCTTTAAMGGGAAAAAATACCATAAGAAAAAAAACCACCTAAACCAGTTTTTAACCAATCATGCGAATTTTGTTTAT
GAAAAAATTTCTCCTCAAAATAAAAAGGAAGTTTTAGAAGCTTCTCAAGCGTGGTTTTTAGAAAGCCAAACCGATGAT
ATAGGGCTAATCAATGAAAATAAGGGCATTCAAAGCGTGTTAGAAAATTATGAAAGCTTGGATGTAAAGGGGGGGCT
TATTAGGGTTAATGGGGAAATAGCCTCGTTTAGTTTTGGAGAAGTTTTAAACGAAGAGAGCGCGCTCATCCACATTG
AAAAAGCCCGCACAGATATTGCAGGCGCGTATCAGATCATCAACCAGCAG
HP0291 1763 AGGGATTAAAAAAGGTGTTCAAAGACAGCAAAAAAGACGCTTGCGGGTTCATCTATGAGATCAGCGAGTTCATGAAA 1764 GLKKVFKDSKKDACGFIYEISEF
GCCTATACCGCATTGCTAAAAAAACAAGACCGATACGTCTATTTATTGAGGTATCTCCCCTCTAGGTATTGGGCCAGC ALLKKQDRYVYLLRYLPSRYWA
ATTTTAACGACTGCCCTTTATGTCAAATACCCTGATTTTGACGCTTTGAAAAAGCTTTTGGTGTCTTATTATTACCAAA LYVKYPDFDALKKLLVSYYYQT
CTTGGATTGCAGGAGGCACGATCACGCGCATCAAGCAAACCAGTATCAACATTATCAAAAACGTTAAAAGCAATAAG TITRIKQTSINIIKNVKSNKSVETIK
AGCGTTGAAACCATCAAAGAGCTTATATTGAATAGCATCGACTCTTATAACACCTTTGATCAATACCTCTATAACTTAT SIDSYNTFDQYLYNLWDSSSVY
GGGATAGCTCTTCTGTTTATCATAGCAAATGGGTGCGTCCTGTCTTAGCCCTAGCTAATTATTTCATGGCAGATGAAG VRPVLALANYFMADEEKPHFIA
AGAAACCCCATTTTATCGCTATGGATGCCGAAACCCAAGTGGAGCATATTTTGCCACAAACGCCCAAAAGAGGCAGT QVEHILPQTPKRGSQWNADFDK
CAATGGAACGCGGATTTTGACAAAGAAAAAAGAGAAGAATGGGTAAATAATATCGCGAATTTAACCCTTTTAAAGCGT EWVNNIANLTLLKRKKNAHALN
AAAAAGAACGCGCATGCTTTAAACGGGGATTTTGATGAAAAAAGAAAAATTTATGGAGGCAAAGACACGAGCAAAGT KRKIYGGKDTSKVISCYDITKELY
GATTAGCTGTTATGACATCACTAAAGAATTGTATAGCAATTATAGGAAGTGGAATGAGAAGTCCCTCCAAGAGCGATA KWNEKSLQERYKSLYNTITPVL
CAAATCTTTGTATAACACTATCACGCCTGTTTTACACATAGAGGGGCAAGAAGATGATTTTGAAGATGA EDDFED
HP0291 1765 GGAGTTGGAAGTCTTTAAATCGCATGCGGGGTTTGAATACACCAGCGATCAAGAAAAGGCTATCGCTGAAATTTCAA 1766 ELEVFKSHAGFEYTSDQEKAIAE
AGGATTTAAGCTCTCACAGGGTGATGGATAGATTATTGAGTGGGGATGTGGGTT TGGGAAAACAGAAGTGGCGAT SSHRVMDRLLSGDVGFGKTEV
GCATGCGATT TT GCGCGTTTTTGAACGGCTTTCAAAGCGCTTTAGTTGTGCCTACCACTTTATTAGCGCACCAGCA FCAFLNGFQSALWPTTLLAHQ
1TTTGAGACTTTAAGGGCGCGTTTTGAAAATTTTGGCGTTAAAGTGGCTCGTTTGGACAGGTATGCGAGCGAAAAAA RARFENFGVKVARLDRYASEKN
ACAAGCTTTTAAAGGCGGTGGAATTAGGGCAAGTTGATGCGCTAATAGGCACGCATGCGATTTTAGGCGCGAAATTC VELGQVDALIGTHAILGAKFKNL
AAAAACCTGGGCTTGGTGGTGGTGGATGAAGAGCATAAATTTGGCGTGAAACAAAAAGAAGCTTTAAAAGAATTGAG DEEHKFGVKQKEALKELSKSVH
TAAGAGCGTGCATTTTTTAAGCATGTCCGCTACGCCTATCCCGCGCACTCTAAACATGGCGCTCTCTCAAATTAAGG ATPIPRTLNMALSQIKGISSLKTP
GCATTAGTTCTTTAAAAACCCCGCCCACAGACAGAAAGCCCAGCCGCACTTTTTTGAAAGAAAAGAATGACGAACTC KPSRTFLKEKNDELLKEIIYRELR
TTAAAAGAGATTATTTACAGAGAATTACGCCGTAACGGGCAAATTTTTTACATCCATAACCACATCGCTAGCATTTTAA IFYIHNHIASILKVKTKLEDLIPKLK
AAGTCAAAACCAAGCTAGAAGATTTAATCCCTAAACTCAAAATCGCTATTTTGCATTCCCAGATTAACGCTAATGAGA SQINANESEEIMLEFAKGNYQVL
GCGAAGAAATCATGCTAGAGTTTGCCAAGGGAAATTATCAGGTTTTATTATGCACTTCTATTGTGGAATCAGGGATTC VESGIHLPNANTIIIDNAQNFGLA
ATTTGCCTAACGCTAACACGATCATTATAGATAATGCGCAAAATTTCGGGCTGGCTGATTTGCACCAATTGAGAGGG RGRVGRGKKEGFCYFLIEDQKS
CGTGTGGGGAGAGGTAAAAAAGAAGGCTTTTGTTATTTCCTCATAGAAGATCAAAAAAGTTTGAATGAACAGGCTTTA ALKRLLALEKNSYLGSGESVAY
AAACGCTTGCTCGCTTTG GGGNLLGQDQSGHIKNIGYALY
DAIYELSGGK
HP0291 1767 TTCTTTGATCGCTTCAGTGTTATTATACGCTTATGGCACAGGAGCGATTAAAGGCTTTGCCCTAACTACAGGCATTGG 1768 SLIASVLLYAYGTGAIKGFALTTG
GATTTTAGCCTCTATTATCACCGCTATTGTTGGCACGCAAGGGATTTATCAAGCCCTTTTACCTAAACTCACTCAAAC SIITAIVGTQGIYQALLPKLTQTK
AAAAAGCCTTTACTTTTGGTTTGGCGTGAATAAAAGAGCTTAGGAGGTTTTATGGAATTATTCAAACGAACTAGAATCT FGVNKRA
TAAGCTTCATGCGTTATTCCAATTATGGGGTGATCGTTTCAGCAATTTTAGCGCTTCTAGCGTTGGGGCTTTTGTTTTT
CAAAGGGTTTTCTTTAGGGATTGATTTTGCGGGGGGGAGTTTGGTGCAAGTGCGCTACACTCAAAACGCCCCCATTA
AAGAAGTGCGCGATCTGTTTGAAAAAGAAGCTCGCTTCAAAGGCGTGCAAGTGAGCGAATTTGGCTCTAAAGAAGAA
ATTTTAATCAAATTCCCTTTTGTAGAAACGGCTGAAAATGAAGATCTGAACGCTATCGTGGCCAACATTCTAAAACCC
AGCGGCGATTTTGAAATCCGTAAATTTGACACCGTGGGCCCTAGAGTGGGGAGCGAATTGAAAGAGAAAGGCATTT
TGTCGCTGATTTTAGCATTAATAGCGATCATGGTTTATGTGAGTTTCCGCTATGAATGGCGTTTTGCTTTAGCGAGCG
TCATTGCGCTTGTGCATGATGTGATTTTAGTGGCAAGCTCGGTGATTGTTTTTAAGATTGATATGAATTTGGAAGTGA
TTGCGGCCTTGCTCACCTTGATTGGGTATTCCATTAATGATACGATCATTATTTTTGACAGGATCAGAGAAGAGATGC TCTCTCAAAAAACCAAAAACGCCACTCAAGCCATTGATGAAGCCATTTCTAGCACGCTCACGCGCACGCTTTTAACTT CTTTAACCGTGTTTTTTGTGGTGTTGATTTTGTGCGTGTTTGGGAGTAAGATCATCATTGGCTTTTCATTGCCCATGTT AATAGGCACGA
HP1559 2179 CACGAAAGCCAAAAACAACGCCACCATTCTCACTAACACCCCAACGCTCATGGATGAAAGAAAAAGCATCATGCAAA 2180 TKAKNNATILTNTPTLMDERKSI
CTTATGATGTCAACCACCCCCTAGAGTGTGGCGTGTGCGATAAGAGTGGGGAGTGCGAATTGCAAGACATGACGCA VNHPLECGVCDKSGECELQDM
TTTAACCGGCGTAGAGCACCAACCCTATGCGGTGGCTGATGATTTTAAAGCACTGGATTTTTGGGCAAAAGCCTTGT VEHQPYAVADDFKALDFWAKAL
ATGATCCTAATTTGTGCATCATGTGTGAAAGGTGCGTAACCACTTGTAAGGACAATGTGGGCGAAAACAACCTTAAA LCIMCERCVTTCKDNVGENNLK
GCCACTAAAGCCGACTTGCATGGTCCGGATAAATTTAAAGACAGCATGTCCAAAGACGCTTTTAGCGTGTGGAGTCG LHAPDKFKDSMSKDAFSVWSR
TAAGCAAAAAGGCATTATTTCTT TGTGGGCAGCGTGCCTTGCTATGATTGCGGGGAATGCATTGCAGTATGCCCTG SFVGSVPCYDCGECIAVCPVGA
TGGGCGCTTTGAGCTATAAAGATTTCGCTTACACGGCTAACGCATGGGAGTTAAAAAAGATCCATTCTACTTGTTCGC FAYTANAWELKKIHSTCSHCSA
ATTGCTCGGCCGGGTGTTTGATTTCTTATGATGTGCGCCATTTTGATACTCTAGGCGAAGAATCTAAAATTTTTAGAG YDVRHFDTLGEESKIFRVLNDFY
TGCTTAATGATTTTTACCATAACCCTATTTGTGGGGCAGGCCGTTTCGCTTTTGATGTGAGCTCTAGCCCTAAAGGCA GAGRFAFDVSSSPKGSANLKEA
GTGCTAATCTTAAAGAAGCGCAAAACGCCCTCAAAGAATGCGAAGCGGTGCGAATAGGTGGGGATATTACGAATGA KECEAVRIGGDITNEEAFLIERLR
AGAGGCGTTTTTAATAGAGCGTTTAAGAAAAGAGCTTGATTTTAAAATCTACAATCAAGAAGCGTATCGTTTCCAGCA FKIYNQEAYRFQQFLKVLGEIKR
ATTCTTAAAAGTATTGGGCGAAATTAAACGCCCCAGCGTTGAAGAGATTAAAACTTCTCATTTAGTCGTTACGATAGG IKTSHLWTIGSSIKTENPLVRYAI
ATCTTCTATCAAAACAGAAAACCCTTTGGTGCGCTATGCCATCAATAACGCTCTCAAACTCAATAAAGCTTCTTTAATC KLNKASLIAMHPIKDNALANLCR
GCTATGCACCCTATTAAG THEVGAEEILLGMLLKMLNIESA
EDSKQNIVDEAALKALEEERKKA
EQGCSIGENKAENQEENKTEAT
NQEENK
HP1559 2181 GGTGGTCCAATCCAACCCTGATTATATTTCTACGCATAGCGAATCAGCCCTAGACTTGCTCAAGTTATTGAAAAAAAA 2182 WQSNPDYISTHSESALDLLKLL
CCAGATGAATGCAAGCGCGATTGAGATCGCTCACTTGCTCCTCAATCAAGATGATGATCTGAAAGCTAAAGAGCAAG MNASAIEIAHLLLNQDDDLKAKE
CGCTTTATGATTTAGGAGCGTTGTATGCAAGGATCAAGGACTTTAAAAACGCCCACCTTTACAATCTGCAATATTTGC LGALYARIKDFKNAHLYNLQYLQ
AGGACCATGCGGAACTGGATAAAGCTTCTGTCGTTAGGGCGCGCGATGAAAAAGCCCTTTTTTCCATGGAGGGGAA LDKASWRARDEKALFSMEGNT
CACGCAAGAAAAAATCGCCCACTATGACAAAATCATTCAAAATTTCCCTAATTCTAATGAAGCCCTAAAGGCTTTAGA HYDKIIQNFPNSNEALKALELKA
ATTGAAAGCCCAACTATTGTTTGAAAACAAGCGTTATGCTGAAGTGTTAAGCATGCAAAAAAATTTGCCTAAAGATTC NKRYAEVLSMQKNLPKDSPLIQ
CCCTTTGATCCAAAAAACGCTCAATGTCCTTGCTAAAACCCCATTAGAGAACCATCGTTGTGAAGAAGCCTTAAAATA LAKTPLENHRCEEALKYLSQITT
TTTATCCCAAATCACAACCTTTGAATTCAGCCCCAAAGAAGAAATCCAAGCCTTTGATTGCTTGTATTTCGCATCGCT KEEIQAFDCLYFASLKEKAQIIAL
CAAAGAAAAAGCGCAAATCATTGCCCTAAACGCTTTTAAAACGGCTAAAGCCCCTAGCGAGAAATTAATATG AKAPSEKL1
HP1559 2183 GAAATTTTAGGGGGTAATTTCAGAGAGAAAAAACTCATCCACCCTAACGATGACGTGAACATGTCTCAAAGCTCCAA 2184 EILGGNFREKKLIHPNDDVNMS
CGACACTTTCCCTACCGCAATGCACATTGTGAGCGTGCTAGAAATCACGCACAGACTGCTCCCTAGTTTGGAGAATC TFPTAMHIVSVLEITHRLLPSLEN
TGTTAAAAACCTTTAAAGAAAAAAGCCAACAATTTAAAGAGATTGTCAAGATCGGACGCACGCATTTACAAGACGCTA KEKSQQFKEIVKIGRTHLQDATP
CGCCTTTAACTTTGGGGCAAGAATTTAGCGGGTATGCGAGCATGCTAGAGCATTCTAAACAACAAATTTTAGAGAGTT QEFSGYASMLEHSKQQILESLE
TGGAGCATTTAAGAGAATTAGCCATAGGCGGGACGGCCGTAGGCACAGGGCTAAACGCTCATAAAGAATTGAGCGA AIGGTAVGTGLNAHKELSEKVAE
AAAAGTGGCTGAAGAATTGAGCCAGTTTAGCGGCGTGAAATTCGTCTCTGCGCCCAATAAGTTCCATGCGCTCACTA FSGVKFVSAPNKFHALTSHDAIA
GCCATGACGCTATCGCTTATGCGCATGGGGCTTTTAAAGCTTTAGCGGCGAATTTAATGAAAATCGCTAACGATATTA AFKALAANLMKIANDIRWLASGP
GATGGCTTGCGAGCGGGCCGCGCTGTGGTTTGGGCGAGCTTAATATCCCTGAAAACGAGCCGGGCAGTTCTATTAT GELNIPENEPGSSIMPGKVNPT
GCCCGGTAAAGTCAATCCCACGCAATGCGAAGCGATGACAATGGTGGCCG TMVA
Figure imgf000355_0001
Figure imgf000356_0001
Figure imgf000357_0001
Figure imgf000358_0001
Figure imgf000359_0001
Figure imgf000360_0001
Figure imgf000361_0001
Figure imgf000362_0001
Figure imgf000363_0001
Figure imgf000364_0001
Figure imgf000365_0001
Figure imgf000366_0001
Figure imgf000367_0001
Figure imgf000368_0001
Figure imgf000369_0001
Figure imgf000370_0001
Figure imgf000371_0001
Figure imgf000372_0002
Figure imgf000372_0001
Figure imgf000373_0001
Figure imgf000374_0001
Figure imgf000375_0001
Figure imgf000376_0001
Figure imgf000377_0001
Figure imgf000378_0001
Figure imgf000379_0001
Figure imgf000380_0001
Figure imgf000381_0001
Figure imgf000382_0001
Figure imgf000383_0001
Figure imgf000384_0001
Figure imgf000384_0002
Figure imgf000384_0003
Figure imgf000385_0001
Figure imgf000386_0002
Figure imgf000386_0001
Figure imgf000387_0001
Figure imgf000388_0001
HP1449 2483 GAAGATTTAGGCTCGTTTTTTGAAGACGCTTTTGGGTTTGGCGCTAGGGGGAGTAAAAGGCAAAAAAGCTCTATCGC 2484 EDLGSFFEDAFGFGARGSKRQ
ACCGGATTATTTGCAAACCCTTG TTGAGTTTCAAAGAAGCGGTTTTTGGCTGTAAAAAAACCATTAAAGTCCAATA DYLQTLELSFKEAVFGCKKTIKV
CCAGAGCGTTTGTGAAAGTTGCGATGGCACGGGCGCTAAAGACAAAGCCCTAGAGACTTGCAAGCAATGCAATGGG VCESCDGTGAKDKALETCKQC
CAGGGGCAGGTGTTTATGCGTCAAGGTTTTATGAGTTTTGCGCAAACTTGTGGGGCGTGTCAAGGCAAGGGCAAGA QVFMRQGFMSFAQTCGACQG
TCGTTAAAACCCCATGCCAAGCGTGCAAGGGTAAAACCTATATCCTTAAAGATGAAGAAATTGATGCGATAATCCCTG TPCQACKGKTYILKDEEIDAIIPE
AGGGCATTGATGATCAAAACCGCATGGTGCTTAAAAATAAAGGCAATGAATACGAGAAGGGAAAAAGAGGGGATTTG NRMVLKNKGNEYEKGKRGDLY
TATTTAGAAGCGCAAGTCAAAGAAGATGAGCATTTCAAG KEDEHFK
HP1202 2485 GTATTATGAGTTTTTCTTTATCTTCCCTAAGGAGCGGGAGCTTTTTGAGAGCTTTCTTTTAGACGCCACGCATCTAGC 2486 YYEFFFIFPKERELFESFLLDAT
ATTAGAAGAATCTAGCTTAGAGAATTTAAAAGCGTTTGACGATAAAGAAACCATTGGGTTTATAAGCCAATCCAATTG ESSLENLKAFDDKETIGFISQSN
GCATTATTTCGCCACTCATGACCCCCTAAAAAAAGATCTGAAAGAAAATTTAAAAGAAAAACCTCCACATCTCAAAAAT ATHDPLKKDLKENLKEKPPHLK
TTCGTTATTTTACGCTCTCAAAAGGATTTGAATAACTCGCTCATTCCAGCATTAGAAGCGTTTTGTTTGAATTTAAAAC SQKDLNNSLIPALEAFCLNLKQN
AAAACCTGCAAAGTGAGTTTGAT TTTTCTATCTTTCACGCAATCTGGCTTCAAAAGATTGGCTAGAAGCCTACAAAC FDFFYLSRNLASKDWLEAYKQAI
AAGCTA' TTGCCGGTGCAATGCACCAAATTTTACATACACCCTAGCTGGCATCAAAAGCCAAGCCATGTTGTTACAA CTKFYIHPSWHQKPSHWTNDC
ATGATTGCATAATGATTGATCCGGCTTTGGCCTTTGGATCAGGCCATCATGAAAGCACTTCTATGTGTTTGGAACTGC ALAFGSGHHESTSMCLELLSDID
TCTCTGACATTGATTTAAAACGCAAAAACGCCTTAGATGTGGGTTGTGGGAGCGGGATTTTAAGCATCGCTTTAAAAA NALDVGCGSGILSIALKKQGVSA C GGCGTTAGCGCTTTAGTAGCTTGCGATACGGATAGTTTAGCCGTTGAAGAAACCCTAAAAAATTTTAGCTTGA DTDSLAVEETLKNFSLNQIPLLV
ATCA TACCCCTATTAGTGCMGATA GTCATTTATGGCTCTACGCAAAAAATTGAAGGGCGTTTTGATGTTATTGT YGSTQKIEGRFDVIVANLVADVI
GGCGAACCTTGTCGCTGATGTGATTAAGAGTTTGTATAGTGAATTTGTGCGGCTTTGTAACCACACTCTTATTTTATC EFVRLCNHTLILSGILETHLNSVL
AGGGATTTTAGAAACCCATTTAAACTCTGTTTTACAGATCTATTATAATGGATTTGAGGTTTTAGAACAGCGACAGCGT GFEVLEQRQRNEWVALKLLKKQ
AACGAATGGGTCGCTCTAAAATTGCTTAAAAAACAACCAATAAATTAAGGATTATAATGAAACCAACGAACGAACCTA
AAAAAC
HP1322 2487 GGATA GAGAACGCTAGGGAATATTTGTTCGCTTGAAAGGGAGTTTTTGAGGGTTITAGGGGTRRTCTTTTTATGCT 2488 DKENAREYLFA
ATAATGGAAAAAGCTCTAATCATTAAGCTATTTGCTAGGAGGTTΓGCATGCAAGAGTTTTΓAGGTTTTGGTGTGGTGG
GGAATTTTGCAGGGCATTTGGAGCAAGCAGGAGAGAGTCATAGTTTTATTAACATGAAAAGCGAAGAAAAGGACGCC
CCT/V\GGGGCTATTCCCTTTTTATATCCCCTATGAAAATTGTTATTTGGGGCGTTGTTGCATTGATAACCATAAGATTA
TTTTGCCTAGTGATCTAGATTTAAGGGTGCAAGCAGAGCCAGAAATCGCTTTAGAATGCGATGTTAAATACGATGAAA
AACATTTGGTTGCAAAGCTCGTGCCTAATTTTTTCATGGCGTTTAATGACGCTTCTGTGCGCAATTTAGACGCCGCAA
AACTCTCCCAAAAAAAGAATTTTTCACCGGCTTCTAAAGGTATAGGGCAGAAATTGCCCATTGACAGGTTTGTTTATG
GGGGGGTGTGTAACAATTTCTCTATCGCGTCTTTTTTGAAATACAATAATGTTTGGCACATTTATGGGGAAAACAGCA
AATTGCTCAAATACGAGTTTTTTTATCAAAAGCTTTTAGATTGGATTAAAGACCAATTAAACCACCAACAAGATGGCGA
CTCTTTAGAGGCTCT GACCTTTTTTAGAGCGCCATMTTTCCCCACTAAAATGATTTTTGCAATAGGGGCTACCCC
TTATATGCCTTTTGCGCAAGAGCATTTTTTGCAAAAAGGCGATGAGGTGGTGATCGTTGCTTACAACCATTTAC
HP1322 2489 GATTCAAGGGGGGGCTGGCACAAGCACGAACATGAACATGAACGAAGTGATTGCTAATTTGGCTTTAGAATACATGG 2490 IQGGAGTSTNMNMNEVIANLAL
GGCATCAAAAGGGCGAGTATCAATTTTGCCACCCAAACGACCATGTCAACCGCTCTCAATCCACTAATGACGCCTAT HQKGEYQFCHPNDHVNRSQST
CCTAGTGCGTTAAAAATCGCTATTTATGAGCGCTTGAGCAATCTAGTCGCCCCCATGAAAGCCTTAAGGGATGCTTT PSALKIAIYERLSNLVAPMKALR
CGCTCAAAAGGCTAAGGAATTCGCTCATGTGATTAAAATGGGGCGCACCCAGCTTCAAGACGCTGTGCCTATGACTT KAKEFAHVIKMGRTQLQDAVPM
TAGGCCAAGAGTTTGAAACTTATGCCTTGATGGTTGATAGGGATATTGAGCAGGTTTTAGACGCTAGGAATTGGGTA EFETYALMVDRDIEQVLDARNW
AGAGAGCTTAATTTAGGCGGCACGGCTATTGGCACAGGGATCAATTCGCACCCGGATTATCGCAGTTTGATTGAAAA NLGGTAIGTGINSHPDYRSLIEK
GAAAATCCAAGAAGTAACGGGCCGCCCCTTTGTCATGGCTAATAACTTGATTGAAGCCACTCAAAGCACGGGGGCG TGRPFVMANNLIEATQSTGAYV
TATGTGCAAGTGAGCGGGGTGTTAAAGCGTATTGCGGTGAAACTTTCTAAGGTTTGTAACGATCTCAGGCTACTCAG VLKRIAVKLSKVCNDLRLLSSGP
TTCAGGCCCTAGAGCCGGGTTGAATGAAATCAATTTGCCTAAAATGCAGCCGGGTAGTTCTATCATGCCCGGTAAAG NEINLPKMQPGSSIMPGKVNPVI
TCAATCCGGTGATCCCTGAAGTGGTCAATCAGGTGTGCTTTGCGGTGATTGGGAATGATTTGAGCGTGGCGTTAGC NQVCFAVIGNDLSVALAAEGGQ
CGCAGAAGGCGGGCAGTTGCAACTCAATGTGTTTGAGCCGGTGATCGCTTACAAGCTATTCCATTCCTTTGTGATTT FEPVIAYKLFHSFVILGRAIETLT
TAGGGCGTGCGATTGAAACTTTAACGACTAAATGTGTGGAAGGCATCACGGCTAATGAAAAGATTTGCCACGATTAT GITANEKICHDYVFNSIGIVTALN
GTCTTTAACAGCATTGGCATTGTTACCGCGCTAAACCCTCATATCGGCTATGAAAAATCCGCTATGATCGCTAAAGAA EKSAMIAKEALKSDRSIYDIALEK
GCCTTAAAAAGCGATCGCTCTATT EQLDDIFKPENMLSPHAFKKHK
Figure imgf000390_0001
Figure imgf000391_0001
Figure imgf000392_0001
Figure imgf000393_0001
Figure imgf000394_0001
Figure imgf000395_0001
Figure imgf000396_0001
Figure imgf000397_0001
HP0243 2571 AGGTGTTCAAAGACAGCAAAAAAGACGCTTGCGGGTTCATCTATGAGATCAGCGAGTTCATGAAAGCCTATACCGCA 2572 VFKDSKKDACGFIYEISEFMK
TTGCTAAAAAAACAAGACCGATACGTCTATTTATTGAGGTATCTCCCCTCTAGGTATTGGGCCAGCATTTTAACGACT KQDRYVYLLRYLPSRYWASIL
GCCCTTTATGTCAAATACCCTGATTTTGACGCTTTGAAAAAGCTTTTGGTGTCTTATTATTACCAAACTTGGATTGCAG KYPDFDALKKLLVSYYYQTWI
GAGGCACGATCACGCGCATCAAGCAAACCAGTATCAACATTATCAAAAACGTTAAAAGCAATAAGAGCGTTGAAACC IKQTSINIIKNVKSNKSVETIKE
ATCAAAGAGCTTATATTGAATAGCATCGACTCTTATAACACCTTTGATCAATACCTCTATAACTTATGGGATAGCTCTT YNTFDQYLYNLWDSSSVYHS
CTGTTTATCATAGCAAATGGGTGCGTCCTGTCTTAGCCCTAGCTAATTATTTCATGGCAGATGAAGAGAAACCCCATT LALANYFMADEEKPHFIAMDA
TTATCGCTATGGATGCCGAAACCCAAGTGGAGCATATTTTGCCACAAACGCCCAAAAGAGGCAGTCAATGGAACGC ILPQTPKRGSQWNADFDKEK
GGATTTTGACAAAGAAAAAAGAGAAGAATGGGTAAATAATATCGCGAATTTAACCCTTTTAAAGCGTAAAAAGAACGC NIANLTLLKRKKNAHALNGDF
GCATGCTTTAAACGGGGATTTTGATGAAAAAAGAAAAATTTATGGAGGCAAAGACACGAGCAAAGTGATTAGCTGTTA GGKDTSKVISCYDITKELYSN
TGACATCACTAAAGAATTGTATAGCAATTATAGGAAGTGGAATGAGAAGTCCCTCCAAGAGCGATACAAATCTTTGTA KSLQERYKSLYNTITPVLHIEG
TAACACTATCACGCCTGTTTTACACATAGAGGGGCAAGAAGATGATTTTGAAGATGATTTTGATCTAGAATGATTAAA DDFDLE
GATrGCCAAGCATCAAAACAACAAAGAGGTGATCAATGCCTAAAAAAGAGCTATTAAAGATGTCAAAGAAAAGGATTT
TTAAAGACTTCTTAAAAGAAGCCAAACAGCACCGCCCTATTGTTTTCTATACAGATAATGATTGTGATGGCATGTTAG
CTGGCAGCGTTT
HP1338 2573 GAAAGTGATTGAGGCGTTGCCTAACGCTACTTTTAAGGTGGAGTTAGACAATAAGCATGTGGTGTTGTGCCGTATTT 2574 KVIEALPNATFKVELDNKHW
CTGGAAAGATGCGCATGCACTATATTAGGATTGCTTTAGGCGATAGGGTTAAGCTAGAGCTTACGCCCTATAGCTTA MRMHYIRIALGDRVKLELTPY
GACAAAGGTCGGATMCTTTTAGATATAAATGAATTTAAGGGTTATTTCAATGAAAATATGTTAATATAAAAGTTTTTAG TFRYK
TAGTGCCTAATTTTTTCAAAGGAGAAAAATCATGAAAGTCAGGCCATCAGTGAAAAAGATGTGCGATAACTGCAAAAT
CATTAAAAGAAGGGGCGTTATTAGAGTGATCTGCGCTACCCCTAAACACAAACAAAGACAAGGATAAAGCATGGCAA
GGATTGCTGGTGTAGATTTACCAAAAAAGAAGAGAGTGGAGTATGCCCTTACCTATATTTATGGGATTGGGCTTAAG
AGTTCCAGAGAGATTTTAGAAGCGGTAGGCATTTCTTTTGACAAGCGCGTGCATGAATTGAGCGAAGATGAAGTGTC
TAGCATCGCTAAAAAAATCCAACAAAGCTACCTAGTAGAGGGCGATTTGCGTAAAAAAGTTCAAATGGATATTAAATC
TTTAATGGACTTGGGGAATTATCGTGGGATCAGGCATCGTAAGGGTCTTCCTGTGAGAGGTCAAACCACTAAAAATA
ACGCTAGGACTCGTAAGGGTAAGAAAAAAACCGTGGGTAGCAAGTAGCGAATAAGGAGATGATGATTTAATGGCTAA
GAGAAATGTAACGGCTAAAAAGAAAGTAGTCAAAAAGAATATTGCGAGAGGGGTTGTTTATATTTCAGCGACCTTT
HP1338 2575 TCCGCTTTTCGGTTTCTTTACAACAAAATTTATTAGACGAATTAGACAACCGCATCATTAAAAACGGCTATTCTTCTCG 2576 RFSVSLQQNLLDELDNRIIKN
ATCAGAATTAGTGCGCGACATGATCAGAGAAAAATTAGTAGAAGACAATTGGGCAGAAGACAACCCTAATGACGAGA ELVRDMIREKLVEDNWAEDN
GCAAAATCGCCGTGCTTGTGGTGATTTATGATCACCACCAAAGGGAATTAAACCAGCGCATGATAGACATTCAGCAT AVLWIYDHHQRELNQRMIDI
GCCAGCGGGACGCATGTTTTATGCACCACGCACATTCACATGGATGAGCATAATTGCTTGGAGACGATTATTTTACA HVLCTTHIHMDEHNCLETIILQ
AGGCAATTCGTTTGAAATCCAACGCTTGCAATTGGAAATTGGGGGGCTTAGGGGGGTTAAATTCGCTAAATTGACTA QRLQLEIGGLRGVKFAKLTK
AGGCGTCTAGCTTTGAATACAATGAATAGCGTCTTAAAATACAAAGAATTAGCGCTCTATGGAGGGAGTTTTGATCCC
TTGCACAAGGCTCATTTAGCCATTATTGAGCAAACTTTAGAATTATTGCCATCCGCTAAGCTCATTGTCTTACCCGCTT
ATCAAAACCCTTTCAAAAAGCCATGTTTTTTGGACGCAC
HP1338 2577 CCGATAAGGGATTAAAAAAGGTGTTCAAAGACAGCAAAAAAGACGCTTGCGGGTTCATCTATGAGATCAGCGAGTTC 2578 DKGLKKVFKDSKKDACGFIY
ATGAAAGCCTATACCGCATTGCTAAAAAAACAAGACCGATACGTCTATTTATTGAGGTATCTCCCCTCTAGGTATTGG AYTALLKKQDRYVYLLRYLP
GCCAGCATTTTAACGACTGCCCTTTATGTCAAATACCCTGATTTTGACGCTTTGAAAAAGCTTTTGGTGTCTTATTATT LTTALYVKYPDFDALKKLLVS
ACCAAACTTGGATTGCAGGAGGCACGATCACGCGCATCAAGCAAACCAGTATCAACATTATCAAAAACGTTAAAAGC AGGTITRIKQTSINIIKNVKSN
AATAAGAGCGTTGAAACCATCAAAGAGCTTATATTGAATAGCATCGACTCTTATAACACCTTTGATCAATACCTCTATA LILNSIDSYNTFDQYLYNLWD
ACTTATGGGATAGCTCTTCTGTTTATCATAGCAAATGGGTGCGTCCTGTCTTAGCCCTAGCTAATTATTTCATGGCAG KWVRPVLALANYFMADEEK
ATGAAGAGAAACCCCATTTTATCGCTATGGATGCCGAAACCCAAGTGGAGCATATTTTGCCACAAACGCCCAAAAGA AETQVEHILPQTPKRGSQW
GGCAGTCAATGGAACGCGGATTTTGACAAAGAAAAAAGAGAAGAATGGGTAAATAATATCGCGAATTTAACCCTTTTA KREEWVNNIANLTLLKRKKN
AAGCGTAAAAAGAACGCGCATGCTTTAAACGGGGATTTTGATGAAAAAAGAAAAATTTATGGAGGCAAAGACACG FDEKRKIYGGKDT
Figure imgf000399_0001
HP0249 2587 CCGAAATCACTTCTAACACCGCCACCGCTGCCGCATTTTTACCGGTGATTGGAGGGGTTGCGATGGGCATGGGTTA 2588|EITSNTATAAAFLPVIGGVAMG
TGAAAACCATCAGAGCTTGTTATTGACCATTCCTGTAGCCTTGAGTGCGACTTGCGCGTTCATGCTCCCTGTGGTCA HQSLLLTIPVALSATCAFMLP
CCCCACCGAATGCAATAGCTTATGGCTCTGGGTATGTTAAAATAACGGACATGATTAAAGCCGGTTTGTGGCTTAAT IAYGSGYVKITDMIKAGLWLNL
CTGGTAGGAGTTGTTTTGATTAGCACGTTTAGCTATTTTTTGGTTTCGTTAATATTTAATTGATTAAGGAAAAAAGTGA STFSYFLVSLIFN
AAGAAGAGTTATTTAAAGAAAAATCTCGTTACATTACAGGGTTTGTTTTAATCATTGTGGCAGACTTGATTTTATATGC
GGACAATTTGTTGTTGTTTTGGGCGGTTTTAGGGGGGATTTATGCGGTAGGGTTTTCTGAAGCGTTAAGATTATTCCA
AGTrAAAGCGAGCTrTAGCCTGTATCTCATTTTAGTGTTGTCATGGGTGGCGGCGTATTTTAACGGGCGTCCTATAG
AATGCGCTCTTATTAGCGCCATGGTCATGGCTAGTGTTATCGCTTATCAAAAAGCGCACCATAGCGAAGCCATTTTAC
CCTTTTTGTATCCGGGCGTrGGG I I I I I TGCGCTTTTTGGGGTTTATAAGGATTTTGGTGCAGTAGCGATCATTTGGC
TTTTAGTCGTGGTGGTTGCAAGCGATGTGGGGGCGTTTTTTGGAGGCAAGCTTTTAGGCAAAACCCCTTTCACGCCC
ACTTCGCCGAATAAAACCTTAGAGGGCGCGTTGATTGGCGTGGTTTTGGCGAGCGTTTTAGGATCGTTTGTGGGCAT
GGGGAMTTGAGCGGAGGCTTTTTTATGGCGCTCTTTTTTAGTTTTTTAATCGCTCTTGTGGCGGTGTTTGGGGATTT
GTATGAAAGCTATTTGAAAAGAAAGGTCGGTATCAAAGATAGCGGTAAGATTTTACCCGGGCATGGGGGCGTTTTAG
ACCGGTTGGATTCCAT
HP0249 2589 TGCTTCCTAAAAACTTGATCATGGAAGTGAATATGGATTTTGTGTTAAACAAAATCAGCAAGGTTTTGCCTTTCACAAC 2590 LPKNLIMEVNMDFVLNKISKVL
CCATAGCTTGCAAGTGAGTAAAATCGTGCTAGCTTTGACGATTTTAGCCTTATTGCTGGGTTTAAGGAAGTTGATCAC LQVSKIVLALTILALLLGLRKLIT
TTGGCTTTTAGCCTTATTGTTAGATCGTATTTTTGAAATCATGCAGCGCAATAAAAAAATGCATGTCAATGTGCAAAAG LDRIFEIMQRNKKMHVNVQKS
AGCATTGTTTCGCCGGTTTCTGTCTTTTTAGCCCTATTTAGTTGCGATGTGGCTTTAGATATTTTCTACTACCCTAACG VFLALFSCDVALDIFYYPNASP
CATCGCCCCCTAAAGTTTCTATGTGGGTGGGCGCGGTGTATATCATGCTTTTAGCATGGTTAGTGATAGCGCTTTTTA WVGAVYIMLLAWLVIALFKGY
AAGGCTATGGGGAAGCGTTAGTTACGAATATGGCTACCAAAAGCACGCACAATTTTAGAAAAGAAGTGATCAACTTG NMATKSTHNFRKEVINLILKW
ATTTTAAAAGTCGTGTATTTTTTGA
HP0249 2591 TGCCACTATTAACGCCACTAATGTTGATGCGGACAAAATAGCTAGCGATAATCCTATTTATGCTTCCATAGAGCCTGA 2592 ATINATNVDADKIASDNPIYASI
TATTGCCAAGCAATACGAAACAGAAAAAACCATTAAGGATAAGAATTTAGAAGCTAAATTAGCTAAGGCTTTAGGTGG QYETEKTIKDKNLEAKLAKAL
CAATAAAAAAGATGACGATAAAGAAAAAAGTAAAAAATCCACAGCAGAAGCTAAAGCAGAAAACAATAAGATAGACAA DDKEKSKKSTAEAKAENNKID
AGATGTCGCAGAAACTGCCAAGAATATCAGTGAAATCGCTCTTAAGAACAAAAAAGAAAAGAGTGGGGAATTTGTAG AKNISEIALKNKKEKSGEFVDE
ATGAAAATGGTAATCCCATTGATGACAAAAAGAAAGCAGAAAAACAAGATGAAACAAGCCCTGTCAAACAGGCCTTTA DKKKAEKQDETSPVKQAFIGK
TAGGC GAGTGATCCCACATTTGTTTTAGCGCAATACACCCCCATTGAAATCACTCTGACTTCTAAAGTAGATGCCA LAQYTPIEITLTSKVDATLTGIV
CTCTCACAGGTATAGTGAGTGGGGTTGTAGCCAAAGATGTATGGAACATGAACGGCACTATGATCTTATTAGACAAA DVWNMNGTMILLDKGTKVYG
GGCACTAAGGTGTATGGGAATTATCAAAGCGTGAAAGGTGGCACACCCATTATGACACGCTTAATGATAGTCTTTAC GGTPIMTRLMIVFTKAITPDGV
TAAAGCCATTACGCCTGATGGTGTGATAATACCTCTAGCAAACGCTCAAGCAGCAGGCATGTTGGGTGAAGCAGGG QAAGMLGEAGVDGYVNNHF
GTAGATGGCTATGTGAATAATCACTTTATGAAGCGCATAGGCTTTGCTGTGATAGCAAGCGTGGTTAATAGCTTCTTG VIASVVNSFLQTAPIIALDKLIG
CAAACTGCGCCTATCATAGCTCTAGATAAACTCATAGGCCTTGGCAAAGGTAGAAGTGAAAGGACACCTGAATTTAA ERTPEFNYALGQAINGSMQS
TTACGCTTTGGGTCAAGCTATCAATGGTAGCATGCAAAGTTCAGCTCAG
HP0249 2593 GGGTAGAGCAAGTGTTAGCCGATCTCAAAAACTTCTCAAAGGAGCAATTGGCTCAACAAGCTCAAAAAAATGAAGAT 2594 VEQVLADLKNFSKEQLAQQA
TTCAATACTGGAAAAAATTCTGAACTATACCAATCCGTTAAGAATAGTGTAAATAAAACCCTAGTCGGTAATGGGTTAT NTGKNSELYQSVKNSVNKTL
CTGGAATAGAGGCCACAGCTCTCGCCAAAAATTTTTCGGATATCAAGAAAGAATTGAATGAGAAATTTAAAAATTTCA GIEATALAKNFSDIKKELNEKF
ATAACAATAATAATGGACTCAAAAACAGCACAGAACCCATTTATGCTAAAGTTAATAAAAAGAAAACAGGACAAGTAG NNGLKNSTEPIYAKVNKKKTG
CTAGCCCTGAAGAACCCATTTATACTCAAGTTGCTAAAAAGGTAAATGCAAAAATTGACCGACTCAATCAAATAGCAA EPIYTQVAKKVNAKIDRLNQIA
GTGGTTTGGGTGGTGTAGGGCAAGCAGCGGGCTTCCCTTTGAAAAGGCATGATAAAGTTGATGATCTCAGTAAGGT GQAAGFPLKRHDKVDDLSKV
AGGGCTTTCAGCTAGCCCTGAACCCATTTACGCTACGATTGATGATCTCGGCGGACCTTTCCCTTTGAAAAGGCATG EPIYATIDDLGGPFPLKRHDK
ATAAAGTTGATGATCTCAGTAAGGTAGGGCGATCAAGGAATCAAGAATTGGCTCAGAAAATTGACAATCTCAATCAAG GRSRNQELAQKIDNLNQAVS
CGGTATCAGAAGCTAAAGCAGGTTTTTTTGGCAACCTAGAGCAAACGATAGACAAGCTCAAAGATTCTACAAAAAAG FGNLEQTIDKLKDSTKKNVM
AATGTTATGAATCTATATGTTGAAAGTGCAAAAAAAGTGCCTGCTAGTTTGTCAGCGAAATTGGACAATTATGCTATTA KKVPASLSAKLDNYAINSHTR
ACAGCCACACACGCATTAATAGCAATATCCAAAATGGAGCAATCAATGAAAAAGCGACCGGTATGCTAACGCAAAAA GAINEKATGMLTQK
Figure imgf000401_0001
HP0249 2603 CGCAAGGCTATAAAGGGCCTAGTGCGGTGAGCGATATTATCACGGCGTTTGGGGAATTTAGCGTGAGCGGGAATTA 2604 QGYKGPSAVSDIITAFGEFSVS
TGTGATTGGGGCGATTATCTTTAGTATTTTAGTGCTAGTGAATCTATTAGTGGTTACTAATGGCTCTACTAGGGTTACT AIIFSILVLVNLLWTNGSTRVTE
GAAGTGAGGGCGCGATTTGCCCTAGATGCTATGCCAGGAAAGCAAATGGCGATTGATGCGGATTTAAACTCAGGAC ALDAMPGKQMAIDADLNSGLID
TTATTGACGATAAGGAAGCCAAAAAACGGCGCGCCGCTCTAAGCCAAGAAGCGGATTTTTATGGCGCGATGGATGG KRRAALSQEADFYGAMDGASK
CGCATCTAAATTCGTCAAAGGCGATGCGATCGCTTCTATCATCATCACGCTTATCAATATCATTGGAGGGTTTTTAGT AIASIIITLINIIGGFLVGVFQRDM
GGGCGTGTTTCAAAGGGATATGAGCTTGAGCTTTAGCGCTAGCACTTTCACTATCTTAACCATTGGCGATGGGCTTG ASTFTILTIGDGLVGQIPALIIATA
TGGGGCAAATCCCTGCCTTAATCATTGCGACAGCGACCGGTATTGTCGCCACTCGCACCACGCAAAATGAAGAAGA TRTTQNEEEDFASKLITQLTNK
GGACTTTGCTTCCAAACTCATCACACAGCTCACCAATAAAAGCAAAACTTTAGTGATTGTGGGAGCGATTTTATTGCT VGAILLLFATIPGLPTFSLAFVGT
TTTTGCCACCATTCCTGGACTCCCTACCTTTTCTTTAGCGTTTGTAGGGACTCTCTTTTTATTCATCGCATGGCTGATT WLISREGKDGLLTKLENYLSQK
AGCAGGGAGGGGAAAGACGGGCTGCTCACTAAATTAGAAAATTATTTGAGTCAAAAATTCGGCTTGGATTTGAGCGA SEKPHSSKIKPHTPTTRAKTQE
AAAACCCCACAGCTCCAAAATCAAACCCCACACCCCAACCACAAGGGCTAAAACCCAAGAAGAGCTTAAAAGAGAAG EEQAIDEVLKIEFLELALGYQLI
AAGAGCAAGCGATTGATGAAGTGTTAAAAATTGAATTTTTAGAACTGGCTTTAGGCTATCAACTCATCAGTCTTGCGG QGGDLLERIRGIRKKIASDYGF
ACATGAAACAAGGGGGCGATTTGTTAGAAAGGATTAGGGGTATTAGAAAAAAGATAGCGAGCGATTATGGTTTTTTG IRDNLQLPPTHYEIKLKGIVIGE
ATGCCTCAAATCCGGATCAG DKFLAMNTGFVNKEIEGIPTKE DALWIETKNKEEAI I QGYTI I DPS TSELVKKYAEDFITKD
HP0249 2605 ATTACTTTTTAAAAGCCCCTATTTTAGGATTTGAGCATATTAACGAAGTGCGTTTGGAAAAAATTGATTCCTTATTCAG 2606 YFLKAPILGFEHINEVRLEKIDS
CCGATTAATTAGCCAAACCAATTCCCCCATGGCGTTGGATATGGTTTTAGTGAATCCTTATTGTTTGAGGGAATACAG QTNSPMALDMVLVNPYCLREY
CTTTGTGATACCCAAATACATAGAATTACTGCTAGAATTAGATTCTCATTCCAAAGTGGAGGTGTATTGCGTGGTCGT YIELLLELDSHSKVEVYCVWL
GTTGCAAAAAAATTTAGAAGATTCTATGGTTAATTTCTTAGCCCCTTTAGTGTTTAATTCCAAAAATGGCTTTGGCGCT SMVNFLAPLVFNSKNGFGAQV
CAAGTGGCGCTTTCTATGATGGATTATCCGGATTTTGGCTTTAGGGATCCTTTAAAAAGCTTTGTGATTCAAGAAAGA DYPDFGFRDPLKSFVIQERERA
GAACGAGCTTAAAGGTTTTTAGCATGAAATTCGCTCTTACAGGGGGAGGCACAGGGGGGCATCTCTCTATCGCTAAA
GCCTTAGCCATAGAATTAGAAAAGCAAGGCATAGAGGCTATTTATTTAGGCTCCACTTACGGGCAAGATAAAGAATG
GTTTGAAAATAGCCCCTTATTTAGCGAACGCTATTTTTTC CACGCAAGGCGTGGTCAATAAAAGCTTTTTTAAAAAA
ATAGGCTCTTTATTCTTGCAAGCTAAAGCCGCTTTTAAGGCTAAAGAGATTTTAAAAAAACACCAGATCACGCACACC
ATTAGCGTGGGGGGTTTTAGCGCAGGGCCGGCAAGTTTTGCAAGCTTGCTCAATAAAATACCCCTTTATATCCATGA
GCAAAATGCGATTAAAGGCTCTCTCAATCGCTACCTTTCCCCTAAAGCTAAGGCGGTGTTTTCAAGCTATGCCTTTAA
AGATAAAGGAAATCATGTTTTAACCTCCTATCCCGTGCAAAACGCTΠTTTTGATΠTGCTAGGACTCGCACGGAAAT
CMGCACATTTTATTTTTAGGCGGTTCGCAAGGGGCAAMGCGATCAATG TTCGCTTTATTAAACGCTCCCAAACT
CACCAAACAA
HP0249 2607 GTAGTAGTGGCACTACTTGCTCCGGTTGGCTTATCAACCTTTTAGGGGCAATCCCCACCAATGGAGTGAGCGATACG 2608 SSGTTCSGWLINLLGAIPTNGV
AATAATTTAATTAATCTGCTCACTGAATTCATTAAAACCGCCGGGTTTATCCAAAATAATGATAGTAGTGTATCTACTA LINLLTEFIKTAGFIQNNDSSVS
GTCTTACAAGCGCTTTTCAAGCCATTACGAGCGCTATTTCTCAAGGGTTTCAAGCCTTACAAAACGATATTAGCCCTA FQAITSAISQGFQALQNDISPN
ATGCGATTTTAACCTTGCTCCAAGAGATTACTTCTAACACCACCACCATTCAGTCATTCTCGCAAACCTTACGGCAGC EITSNTTTIQSFSQTLRQLLGDK
TTTTAGGGGATAAAACATTCTTTATGGCGCAACAAAAGCTCATTGATGCGATGAT QQKLIDAM
HP0249 2609 TATGATTTTTGATGTGAAAGCGCCTATTTTGGGGTTTGAAACCATTCATAAAATGCGTTTGCAAAAGATTGATGAAATC 2610 MIFDVKAPILGFETIHKMRLQKI
TTTTTGCGTTTGAATAGCACAGAAGAGAATTCCGTGGTGTCTTTCACGCTGGTCAATCCCTTTGCCTTAAGAAAATAC LNSTEENSVVSFTLVNPFALRK
GAATTTGAAGTGCCTACCCCTTTAAAAATCCTTTTAGAATTAGAGGGAGCCAAGAGCGTTCTAGTCGCTAATATCATG PTPLKILLELEGAKSVLVANIM
GTCGTTCAAACCCCCATTGAGCTTTCCACCGTGAATTATTTAGCCCCTTTAATTTTCAATTTGGACAAGCAGCTCATG LSTVNYLAPLIFNLDKQLMGQV
GGGCAAGTGGTTTTGGATTCTAACAAATACCCACACTACCATTTAAGAGAGAATATTCTAAGCCACACGCATGAATGA KYPHYHLRENILSHTHE
TGCATAAGCGGTTTCTATGCAAAGCGTTTCTTGAAGCTTATGGCGCAATGCATCGGTGTTTTTGCAAATACAAGCGA
GCGGTTTTCTTATGGTATAATAGCGGT
HP0249 2611 AAGACGCTTGCGGGTTCATCTATGAGATCAGCGAGTTCATGAAAGCCTATACCGCATTGCTAAAAAAACAAGACCGA 2612 DACGFIYEISEFMKAYTALLKK
TACGTCTATTTATTGAGGTATCTCCCCTCTAGGTATTGGGCCAGCATTTTAACGACTGCCCTTTATGTCAAATACCCT LLRYLPSRYWASILTTALYVKY
GATTTTGACGCTTTGAAAAAGCTTTTGGTGTCTTATTATTACCAAACTTGGATTGCAGGAGGCACGATCACGCGCATC KKLLVSYYYQTWIAGGTITRIK
AAGCAAACCAGTATCAACATTATCAAAAACGTTAAAAGCAATAAGAGCGTTGAAACCATCAAAGAGCTTATATTGAAT VKSNKSVETIKELILNSIDSYN
AGCATCGACTCTTATAACACCTTTGATCAATACCTCTATAACTTATGGGATAGCTCTTCTGTTTATCATAGCAAATGGG YNLWDSSSVYHSKWVRPVLA
TGCGTCCTGTCTTAGCCCTAGCTAATTATTTCATGGCAGATGAAGAGAAACCCCATTTTATCGCTATGGATGCCGAAA ADEEKPHFIAMDAETQVEHILP
CCCAAGTGGAGCATATTTTGCCACAAACGCCCAAAAGAGGCAGTCAATGGAACGCGGATTTTGACAAAGAAAAAAGA GSQWNADFDKEKREEWVNNI
GAAGAATGGGTAAATAATATCGCGAATTTAACCCTTTTAAAGCGTAAAAAGAACGCGCATGCTTTAAACGGGGATTTT KRKKNAHALNGDFDEKRKIYG
GATGAAAAAAGAAAAATTTATGGAGGCAAAGACACGAGCAAAGTGATTAGCTGTTATGACATCACTAAAGAATTGTAT KVISCYDITKELYSNYRKWNEK
AGCAATTATAGGAAGTGGAATGAGAAGTCCCTCCAAGAGCGATACAAATCTTTGTATAACACTATCACGCCTGTTTTA YKSLYNTITPVLHIEGQ
CACATAGAGGGGCAA
HP0249 2613 GCTCAAAAACGCCTTCTTACAAACCGCTTTGACTTTAGACGCTAAAACCTTAGCCACCAAAGCTTTAGTGCGAGATTC 2614 LKNAFLQTALTLDAKTLATKAL
CAATTTGATTTCTGCTAAAGCCCAAGCAATGCCTATTGTTTTGCAATTGCATGCCCTATACAATGAAGAAAACAATTAC ISAKAQAMPIVLQLHALYNEEN
ACGCAATACCTTTTAAGCGTGATGTTGCCTTGCATGTGGCTCATTTTTATTGCGATTGGCATGCTCAATTTCATTCAAA LSVMLPCMWLIFIAIGMLNFIQ
AAGCCTCTAACATGCGAGAGCTT TGATCAGTATTGTAGCGAATGTGTGTGTGTTTAGTTTTTGGGGGATGGGCATG ELLISIVANVCVFSFWGMGMA
GCGTTTTA" T TAATCTCATTGGCATGGAAGGGCATTATGCGCATTTG MEGHYAHL
HP0249 2615 CTTTATGCGCGTTATTCAT TTAGAAGAGCGTTTGAAAGCTAACGCTAAAGAAGTCGTTCAGGCΠTACAAAATAAAG 2616 LCALFILEERLKANAKEWQAL GCTTAGAATTAGAGAT RTAAGCGGGGATAATGAAAGCTCGGTTAAGGAGTGCGCGAAAAAATTAGGGATTTCTAATT LEILSGDNESSVKECAKKLGIS
ATCATGCCCATTTGACCCCTGAAGATAAGGCTCAAACCATCAGCTCTTATAAGGGCGTTTGCGCGATGGTAGGCGAT TPEDKAQTISSYKGVCAMVGD
GGCAATAATGATGCGTTAGCCTTAAAACAAGCGAGCGTTTCTTTAGGGTTTGAAAAAAGCGCTTTGAGTAAAAGCGC LALKQASVSLGFEKSALSKSA
ATGCGATATTTTGCTTTTAGAAGAGGATTTGAGTTTGCTAAAAAAAGCGTTTGATAACGCTCAAAAAGTCTATCAAGTG EDLSLLKKAFDNAQKVYQWL
GTGTTGCAAAACATTGTTTTGAGCTTGATTTATAACGCTATTTTAATCCCGGTCGCTATGCTAGGATACATTAACCCTT IYNAILIPVAMLGYINPLIASLSM
TAATAGCGAGTTTGAGCATGAGCGCTAGCTCACTCTTAGTG V
HP0249 2617 AAAAGAGACGAATGAAAAGCTTTCCTTGCAAATGGATGAATTTTTAGACGATTTGCAGCTTTCAGGGGAACGCATCAA 2618 KETNEKLSLQMDEFLDDLQLS
CGATTTAGAAGAAGTGGTGGGAGTGAATAGGCCTGAAGAAGAAAAAGAAGAGGGCAATTTTTCCAGTCGTTTGGATG LEEWGVNRPEEEKEEGNFSS
TCGCTGGGATTACCGGGCTTCAAAAAAGCTTTATCATGCGCCTTATCCCTAATGACTACCCGCTAGAATCCTATCGG GITGLQKSFIMRLIPNDYPLES
CGCGTTTCAGCCGCCTTTAATAAAAGAATCCACCCTATTTTGCATGTGCTGCACAACCATACCGGGCTTGATTTAAGC AFNKRIHPILHVLHNHTGLDLS
ACCGCTATTAACACCCCTGTGTATGCGAGCGCGAGCGGGGTAGTAGGGTTAGCGAGCAAGGGGTGGAATGGGGGG YASASGWGLASKGWNGGYG
TATGGGAATTTGATTAAAGTTTTCCACCCTTTTGGTTTTAAAACCTACTACGCCCATTTGAATAAAATCGTCGTAAAAA HPFGFKTYYAHLNKIWKTGEF
CGGGCGAATTTGTCAAAAAAGGGCAGTTGATTGGGTATAGTGGTAATACAGGGATGAGCACAGGGCCGCATTTGCA LIGYSGNTGMSTGPHLHYEVR
TTATGAAGTGCGTTTTTTAGATCAACCCATAAACCCCATGAGTTTCACCAAATGGAACATGAAAGATTTTGAAGAAGTT NPMSFTKWNMKDFEEVFNKE
TTCAATAAAGAAAGGAGCATCAGATGGCAATCTTTGATAACAATAATAAATCGGCTAATGCAAAAACAGGACCAGCGA SLITIINRLMQKQDQRLSSLKA
CTATCATCGCTCAAGGCACAAAAATAAAGGGGGAGCTTCATTTAGATTACCATTTGCACGTAGATGGCGAATTAGAA
GGGGTGGTGCATTCTAAAAGCACGGTGGTGATCGGGCAAACCGGCTCGGTAGTGGGTGAGATTTTTACTAATAAAT
TAGTGGTCAGTGGCAAGTTCACTGGCACGGTGGAGGCGGAAGTGG
HP0591 2619 CCTCATTCGCATCAGAGTGCGTTATTACTTAGGGGGCAATTAAAAAATGGAAATCACGCTTTTrGACCCCATAGACGC 2620 LIRIRVRYYLGGN
CCACTTGCATGTGCGAGAAAACGCACTTTTAAAAGCGGTGTTAGGATATTCTAGCGAGCCTTTTAGTGCTGCAGTGA
TCATGCCTAATCTCAGTAAGCCCTTGATTGACACTCCAACCACCCTTGAATACGAAGAAGAAATTTTAAACCATTCTT
CAAACTTCAAGCCTCTAATGAGTTTGTATTTCAATGATGGCTTGACTTTAGAAGAATTGCAATGCGCAAAAGAAAAAG
GCGTCAGGTTTTTAAAGCτCTACCCCAAAGGCATGACCACAAACGCGCAAAACGGCACTTCGGATTTGTTGGGTGAA
AAMCTT AGAGGTTTTAGAAMCGCCCAAAAATTAGGCTTTATTTTATGCATCCATGCAGAACAAACCGGGTTTTGTT
TGGATAAAGAATTTTTATGCCATAGCGT
HP0591 2621 AGAATAATTTGCTGTATTTGCATGGCAATTTCAACGCGACTAATATCTTTTTAACGAATAATTTTAATGTCGGCAACCC 2622INNLLYLHGNFNATNIFLTNNFNV
TAACGCTGGCGGTGGGGCGACGATTAATTTTAACGCTGATGAAACCTTGAACGCTGACGGGTTAAATTACACGAATT AGGGATINFNADETLNADGLNY
TCCAAACCGTGGCTTTGGGCTTACAAACCAGTGCGAGCCAGCATTCATGGGCGAATTTTAATTCCAAGCTTTCTATG VALGLQTSASQHSWANFNSKL
GAGATTAAAAATTCTAACTTTAGGGATTTCACATGGGGAGGCTTTAATTTTAATTCAGGGCGTATCACTTTTGAAAACA NSNFRDFTWGGFNFNSGRITF
CCACTTTTAGCGGCTGGACCAATATTAACGGAGCGACTGAGAGCGGCTCATCGTATGTGAATATGGTTGCGAATACG GWTNINGATESGSSYVNMVAN
GATTTGATATTTTCTAATTCCATTTTAGGAGGGGGCATTCGCTATGATTTGAAAGCTAATAACATTATTTTCAATAACTC NSILGGGIRYDLKANNIIFNNSQ
TCAAATGGTTATTGATGTGTCTAAGAATGTGAATCAGTCATCATTGAATGGGAATGTTACTTTCAATAATTCCAGGCTT SKNVNQSSLNGNVTFNNSRLS
TCAGTCAAGCCCAATGCGGCTATTAATATTGGGGATAGCCAAACCCAAACGGCTTTAGAAAACGCTTCAAGCCTTTC AINIGDSQTQTALENASSLSFYN
TTTTTACAACAACAGCGTGGCGAATTTTAACGGCACAACCGCTTTTAACGGGGTGTCTTATTTGAATTTGAACCCTAA NFNGTTAFNGVSYLNLNPNAQ
CGCTCAAGTAAGCTTCAATCAAGTA TTTCAATAACGCTAATGTAACTTTTTATGGCATTCCTTTATTTGGTAAAACG VNFNNANVTFYGIPLFGKTPDF
CCTGATTTTGGCAACTCTGCACGCCTTATCAATTTCAAAGGGAATACGAATTTTAATCAAGCCACGCTCAATTTAAGG LINFKGNTNFNQATLNLRAKNIH
GCTAAAAATATCCATATCAATTTCCAAGGCGTTTCTACTTTTAAACAAAACTCTACGATGAATTTAGCTGAAAGTTCCC VSTFKQNSTMNLAESSQASFN
AAGCGAGCTTTAACGCTCTTAAAGTGGAAGGGGAAACGAATTTCAATCTCAATAACTCAAGCTTGTTGAATTTCAATG GETNFNLNNSSLLNFNGNSVFN
GCAATAG YANHSQISFTKLATFNSDASFD
TLNFQSVLLNGALNLLGNGSNN
KGNFSFGSKGILNLSYMNLFGG
VYDVLQAQNIDGLMGNNGYEKI
QIDKADYSFDNGVHSWRFTNP
ITETLHNNRLKVQISQNGVSNN
APSLYDYQKNPYNETENSYNY
GTYYLTSNIKGFNQNNKTPGTY
QPLQA
HP0591 2623 CTGGGGCGTTTGCTAAAATGCTTTTAAGAGAAATCGGTATTGTTTGTGAAAGCGGGATTATAGAAATTGGGGGTATTA 2624 GAFAKMLLREIGIVCESGIIEIGG
AAGCCAAAAATTATGATTTTAATCACGCCTTAAAAAGCGAGATTTTTGCCCTAGATGAAGAACAAGAAGAAGCGCAAA YDFNHALKSEIFALDEEQEEAQ
AAACAGCCATTCAAAACGCTATCAAAAACCACGATAGCATAGGGGGTGTGGCTTTGATTAGAGCGAGGAGCATAAAA AIKNHDSIGGVALIRARSIKTNQ
ACCAATCAAAAGCTCCCCATTGGCTTAGGTCAAGGGCTATACGCTAAATTAGACGCTAAAATCGCTGAAGCGATGAT GQGLYAKLDAKIAEAMMGLNG
GGGGCTTAATGGGGTGAAAGCGGTTGAAATAGGCAAGGGGGTAGAAAGCTCTTTATTAAAAGGCTCAGAGTATAAT GKGVESSLLKGSEYNDLMDQK
GATTTAATGGATCAAAAGGGGTTTTTGAGCAATCGTAGCGGAGGGGTTTTAGGGGGCATGAGCAATGGGGAAGAAA RSGGVLGGMSNGEEIIVRVHF
TCATTGTTAGAGTGCATTTCAAACCCACGCCAAGCATTTTCCAACCTCAACGAACCATAGACATTAATGGCAATGAGT FQPQRTIDINGNECECLLKGRH
GCGAATGCTTGTTAAAGGGCAGGCATGATCCTTGCATTGCGATTAGAGGGAGCGTGGTGTGCGAGAGTTTGTTAGC RGSWCESLLALVLADMVLLNL
GTTGGTGTTGGCTGATATGGTATTACTCAATTTGACTT
HP0368 2625 CACGATTGAAAGAAAAAATCGCTAACGCTAAGGTTTTGTGCGCGGTGAGTGGGGGCGTGGATTCTACGGTGGTCGC 2626 RLKEKIANAKVLCAVSGGVDST TACGCTGITGCACAGAGCCATTAAGGATAATTTGATCGCTGTTTTTGTGGATCATGGCTTGTTGCGTAAAAATGAAAA LHRAIKDNLIAVFVDHGLLRKN AGAAAGGGTGCAAGCGATGTTTAAGGACTTGAAAATCCCTTTAAACACGATAGACGCTAAAGAAGTCTTT AMFKDLKIPLNTIDAKEVF
HP0368 2627 CT CAATGCGCGTGAGCGCATCAAAGTAGGGGGATTGCCCACTTTTAAATTTGGTGCTAAAAAAAATGCTAAAGAA 2628 NNARERIKVGGLPTFKFGAKK
GAACAATACAAGCAAGATGAAAGATTGAATGAGCGATTGAGAGAGTTTGCTGAAACGCTCATAGAAAGCGTGAGAAA YKQDERLNERLREFAETLIESV
AAAACTGCAAAAATTAGGGGATTATGAAAATATAGAAAAAATTTTGGATTTAGAAGAAGCCCTTAGACGCTACTATAGT KLGDYENIEKILDLEEALRRYYS
TCGCCTATCAGTGAATTGGAATTTTTAAAAGAAATAGAAAAAAATGAGAGCTTTTTTTAATTCTTCTATGCGTGAAAAG EFLKEIEKNESFF
CTAAGCCAATTAAAAGCAAGGCAGCAAAAACAAGAAATGCCCGCCAACC
HP0368 2629 CGTCTGCGTTTAAAAATTTAAAACTCCAACTCAAACGAAGAGAAATAATCAACCGCTATGTTTCTCAAGCTTTGGGGG 2630 SAFKNLKLQLKRREIINRYVSQA ATTTAAAAAAAGGGTTTAGATACGCTAAAGTAGAACACCAAATCCTAAAAATCTATTTCACGCACCCTAGCTATTTGAA KGFRYAKVEHQILKIYFTHPSYL
AGCCTTTAAAATAGAAGAAGCCTATTACACCAACCACCTGAAAGCCCATTTAAAAGAAACGCAAAAAACCCTAAAAGC EAYYTNHLKAHLKETQKTLKAL
CCTAGATTACCCCTTTGATTTTAAGACTATCC GCGAGCGTGAAAAAAAGGGCTTATCAAAAACCAGTTGTTAAAAA FKTIQASVKKRAYQKPWKKEK
AGAAAAACCCCCTAAAAGCGTGAATGTCAATTGCGAAGGTTTGAGCGATTTCACTAAAAAGCAATTTTTAAAGCTCAA NVNCEGLSDFTKKQFLKLKRA
ACGCGCTTGTAACGATAATACGCTGCGCACGCCCCCTTGAGAGCTGACCATGCAACTGCCGATCGGGTTTTGCGGG RTPP
GTGCAAGTTTTGGCAAATAGTGAGCAGTCTAGGGGCTTAGAGCAATTTTCAAAACTCAAACGCGCTTGTAACGATAA
TACGCCGCGCACGCCCCCTCAGAGCTGACCATGCAACTGCCGATCGGGTTTTGTGGGGTGCAAGTTGTAGCGAATA
ACGAGCAGTCTAGGGGCTTAGCGATGCCTΠTAAAATCTCCCCGCACTTGCATGCTTTGTTTTCTTTAGAGGTTTTGT
GGCTTAAATATTCTTTAAAGACTTCTTCAGCGTCATAAGAAGCGAACGCTTCTTTGAGTTTGAGAGCGGATCGTTTGA
TATTCCCCAAACCTCTCCATTCAAAATTTTCCCTAACTTCCATGC
HP0368 2631 TGACCCTTATAAAGCGTTTTTATCAGCGGTGAAAGTGATGAGCAAGCAATTGGGTGTTTTTGGCGAAAGACCCATTG 2632 DPYKAFLSAVKVMSKQLGVFG
CTAACACGGAGTATTCAGGCGATTACGCTCAAAGAGATGACGCTAAAGACTTGAGCGCTAAGATTGAAAGCATGAAT TEYSGDYAQRDDAKDLSAKIES
TTGAGCGCTAGGTGTTTTAATTGCTTGGATAAAATCGGCATCAAGTATGTGGGCGAACTCGTGTTGATGAGCGAAGA RCFNCLDKIGIKYVGELVLMSE
AGAGCTTAAGGGCGTGAAAAACATGGGTAAAAAATCCTATGATGAAATCGCTGAAAAATTGAATGATTTGGGCTATCC VKNMGKKSYDEIAEKLNDLGY
GGTAGGCACAGAATTAAGCCCTGAACAAAGAGAGAGTTTAAAGAAAAGATTAGAAAAATTAGAAGATAAAGGAGGTA SPEQRESLKKRLEKLEDKGGN
ACGACTGATGAGACACAAACACGGATACCGCAAGCTTGGGAGAACCAGCTCGCACAGAAAGGCGTTATTAAAGAAT
TTAGCGATCGCTTTGATTGAGCATAACAAAATT
HP0368 2633 GGCTTACAAAAACCACAGGGCTTACAATACGATTGAATTAAAAGATGAGGTGGCGCGTTTGGCCAAACTCGCTTTAA 2634 AYKNHRAYNTIELKDEVARLAK
CTAAAATGATGGAGTTATCTTAATGGAGATTAGAACCTTTTTAGAACGCGCTTTAAAAGAAGATTTAGGGCATGGGGA MMELS
TTTGTTTGAAAGGGTGTTAGAAAAAGATTTTAAGGCCACAGCGTTTGTTAGGGCTAAACAAGAGGGCGTGTTTTCAG
GCGAAAAATACGCTTTAGAGTTGTTGGAAATGACCGGCATTGAATGCGTTCAAACGATTAAGGATAAAGAACGCTTC
AAGCCTAAAGACGCTTTAATGGAAATTAGGGGGGATTTTAGCATGCTTTTAAAGGTTGAGCGCACCCTTTTAAACCTT
TTGCAACACAGCAGCGGGATTGCTACTTTAACGAGCCGTTTTGTAGAGGCTTTAAATTCTCATAAAGTGCGTTTGTTG
GACACGAGAAAAACGAGACCCCTTTTAAGGATCTTTGAAAAATATTCCGTGCTTAATGGGGGAGCGAGCAACCACCG
CTTAGGGCTAGATGACGCTTTAATGCTTAAAGACACGCATTTAAGGCATGTGAAAGATCTCAAAAGCTTTTTAACGCA
TGCCAGAAAAAACTTGCCTTTCACGGCTAAAATTGAAATTGAATGCGAAAGCTTTGAAGAGGCCAAAAACGCCATGA
ATGCGGGAGCGGATATTGTGATGTGCGATAATTTGAGCGTTTTAGAGACTAAAGAAATTGCCGCTTATAGAGATGCG
CATTATCCCTTTGTΠTACTGGAAGCGAGCGGGAACATTTCACTAGAGAGCATTAACGCTTACGCTAAAAGCGGCGT
GGATGCCATTAGCGTA
HP0368 2635 CCAAACAGCACCGCCCTATTGTTTTCTATACAGATAATGATTGTGATGGCATGTTAGCTGGCAGCGT 2636 KQHRPIVFYTDNDCDGMLAGS
HP0005 2637 GATTTCCGATCGCGCTAAAAACAAAATGGCTAAATCCAATTTAAGGTTGGTGGTGAGCATCGCTAAACGATTCACGA 2638 ISDRAKNKMAKSNLRLWSIAK
GCAGAGGCTTACCATTCTTGGATTTGATTCAAGAGGGCAATATTGGCTTGATGAAAGCGGTGGATAAGTTTGAGCAT GLPFLDLIQEGNIGLMKAVDKF
GAAAAGGGCTTCAAGTTTTCTACCTATGCGACCTGGTGGATCAAACAAGCTATCAGCAGAGCCATAGCCGATCAGGC FKFSTYATWWIKQAISRAIADQ
CCGCACTATCCGCATCCCCATTCACATGATTGATACGATTAATCGCATCAATAAAGTCATGCGCAAACACATTCAAGA IHMIDTINRINKVMRKHIQENGK
AAACGGCAAAGAGCCTGATTTAGAAGTGGTGGCTGAAGAAGTGGGGCTTTCGTTAGATAAAGTGAAGAATGTGATTA WAEEVGLSLDKVKNVIKVTKE
AGGTGACTAAAGAGCCT
HP0005 2639 CTTACATGAGGACCGATAGCTTGAATATCGCTAAAGAGGCTTTAGAAGAAGCGAGGAATAAGATTTTAAAAGACTATG 2640 YMRTDSLNIAKEALEEARNKILK
GCAAAGACTATTTACCCCCTAAAGCCAAAGTCTATTCCAGCAAGAATAAAAACGCCCAAGAAGCCCATGAAGCGATC DYLPPKAKVYSSKNKNAQEAHE
AGGCCCACTTCTATTATTTTAGAGCCAAACGCTTTAAAAGACTACCTTAAGCCTGAAGAATTAAGGCTCTATACCTTAA SIILEPNALKDYLKPEELRLYTLIY
T"RRACAMCGCTRTTTAGCTTCTCAAATGCAAGACGCTCTTTTTGAAAGCCAAAGCGTGGTTGTGGCTTGCGAAAAAG ASQMQDALFESQSVWACEKG
GCGAGTTTAAAGCGAGTGGGAGAAAGCTCCTTTTTGATGGCTATTATAAAATTTTAGGCAATGACGATAAGGACAAAT GRKLLFDGYYKILGNDDKDKLL
TGCTCCCCAATTTGAAAGAAAATGACCCCATTAAATTAGAAAAACTAGAGAGCAACGCCCATGTTACAGAACCTCCAG NDPIKLEKLESNAHVTEPPARY
CACGCTATTCTGAAGCGAGCTTGATTAAAGTTTTAGAAAGTTTAGGCATAGGCAGGCCCAGCACCTACGCCCCAACG KVLESLGIGRPSTYAPTISLLQN
ATTTCCCTTTTACAAAACAGAGACTACATCAAGGTAGAAAAAAAGCAAATCAGTGCTTTAGAGAGCGCTTTTAAGGTG VEKKQISALESAFKVIEILEKHFE
ATAGAAATTTTAGAAAAGCATTTTGAAGAAATCGTGGATTCCAAATTCAGCGCTTCTTTAGAAGAAGAATTGGACAATA KFSASLEEELDNIAQNKADYQQ
TCGCTCAAAATAAAGCCGACTACCAGCAAGTCTTAAAGGACTTTTACTACCCCTTTATGGATAAAATTGAAGCTGGGA YYPFMDKIEAGKKNIISQKVHEK
AAAAGAATATCATCTCTCAAAAAGTGCATGAAAAAACCGGTCAATCATGCCCTAAATGCGGTGGGGAATTAGTCAAAA CPKCGGELVKKNSRYGEFIACN
AAAATAGCCGTTATGGGGAGTTTATCGCTTGCAACAATTACCCTAAATGCAAATATGTCAAACAAACTGAAAGCGCTA CKYVKQTESANDEADQELCEK
ATGATGAAGCCGATCAAGAATTGTGCGAAAAATGCGGAGGGGAAATGGTGCAAAAATTCAGCAGAAACGGGGCGTT VQKFSRNGAFLACNNYPECKN
TTTAGCTTGCAA NTPNAKETIEGVKCPECGGDIA
KGSFYGCNNYPKCNFLSNHKPI
EKCHYLMSERIYRKKKAHECIK
HP0005 2641 CTAAACGGATTATTTCTGAGATTGACAAACAGCCAAAGGCTAAAAAAGAAGCTAAATTCATTGAGTTAGCCAATCGGG 2642 KRIISEIDKQPKAKKEAKFIELAN ATACGATTGATCCTAACAGCAAGAACGCGCAAAATGGCGGTGATTTGGGGAAATTCCAAAAGAACCAAATGGCTCCG PNSKNAQNGGDLGKFQKNQM GATTTTTCTAAAGCCGCTTTCGCTTTAACTCCTGGGGATTACACTAAAACCCCTGTTAAAACAGAGTTTGGTTATCATA KAAFALTPGDYTKTPVKTEFGY TTATCTATTTGATTTCTAAAGATAGCCCTGTAACTTATACTTATGAACAGGCTAAACCTACCATTAAGGGGATGTTACA KDSPVTYTYEQAKPTIKGMLQE AGAAMGCTTTTCCAAGAACGCATGAATCAACGCATTGAGGAACTAAGAAAGCACGCTAAAATTGTTATCAACAAGTA RMNQRIEELRKHAKIVINK ATTGATGAGGTGTTATCATGTTAGTTAAAGGCAATGAAATTTTATTGAAAGCCCATAAAGAAGGTTATGGGGTGGGGG CGTTTAATTTCGTGAATTTTGAAATGCTAAACGCTATTTTTGAAGCAGGAAATGAGGAAAATTCCCCGCTTTTCATTCA AACGAGTGAGGGAGCGATCAAATACATGGGGATTGATATGGCGGTAG
HP0005 2643 CGCTCTATGGCGGGATTGCGTGCGCGAATTTGTTGCATAAAAATTCAGGGATCACGATAGATATTGGAGGGGGTAG 2644 LYGGIACANLLHKNSGITIDIGG
CACCGAGTGCGCGTTGATTGAAAAAGGCAAGATTAAGGACTTAATCTCGCTTGATGTTGGGACGATTCGCATTAAAG ALIEKGKIKDLISLDVGTIRIKEMF
AAATGTTTTTAGACAAAGACTTAGAGGTCAAATTGGCTAAAGCCTTTATCCAAAAAGAAGTCTCTAAACTGCCCTTTAA EVKLAKAFIQKEVSKLPFKHKN
ACACAAAAACGCCTTTGGGGTGGGGGGGACGATCAGAGCGTTGAGTAAGGTATTGATGAAACGCTTTTGTTACCCT GTIRALSKVLMKRFCYPIDSLH
ATTGATTCTTTGCATGGCTATGAAATAGATGCACATAAAAATTTAGCGTTCATTGAAAAAATCGTCATGCTCAAAGAAG HKNLAFIEKIVMLKEDQLRLLGV
ATCAATTACGGCTTTTAGGGGTGAATGAAGAGCGTTTGGATAGCATCAGGAGCGGGGCGTTGATTTTATCAGTCGTT DSIRSGALILSVVLEHLKTSLMIT
TTGGAGCATTTAAAAACTTCTTTAATGATCACTAGTGGGGTGGGGGTGAGAGAAGGCGTGTTTTTGAGCGATTTATT VREGVFLSDLLRHHYHKFPPNI GCGCCATCATTACCATAAATTCCCCCCCAATATCAACCCCTCTCTCATCTCTTTAAAAGATCGCTTTTTGCCCCATGA LKDRFLPHEKHSQKVKKECVK AAAGCACAGCCAAAAGGTCAAAAAAGAATGCGTGAAATTGTTTGAAGCCTTATCGCCTTTGCATAAAATAGATGAAAA PLHKIDEKYLFHLKIAGELASMG ATACCTTTTCCATTTAAAGATTGCGGGGGAATTAGCGAGCATGGGTAAGATTTTAAGCGTCTATTTAGCCCACAAGCA LAHKHS
CAGCGC
HP0005 2645 AACCGCGACAACCACGCAAGACGGCGTAACGATCACCACTACCTATAATAATAACAAAGCCACCGTCAAATTTGACA 2646 TATTTQDGVTITTTYNNNKATV
TCACCAATAACGCTGAACAGCTGTTAAATCAAGCGGCAAACATCATGCAAGTCCTTAATACGCAATGCCCTTTAGTGC NAEQLLNQAANIMQVLNTQCP
GTTCCACGAATAACGAAAACACTCCAGGGGGTGGTCAACCATGGGGTTTAAGCACATCCGGGAATGCGTGCAGCAT NENTPGGGQPWGLSTSGNAC
CTTCCAACAAGAATTTAGCCAGGTTACTAGCATGATCAAAAACGCCCAAGAAATAATCGCGCAAAGCAAAATCGTTAG FSQVTSMIKNAQEIIAQSKIVSE
TGAAAACGCGCAAAATCAAAACAACTTGGATACTGGAAAACCATTCAACCCTTACACGGACGCCAGCTTTGCGCAAA NNLDTGKPFNPYTDASFAQSM
GCATGCTCAAAAACGCTCAAGCGCAAGCAGAGATGTTCAATTTGAGCGAACAAGTGAAAAAGAACTTGGAAGTCATG AQAEMFNLSEQVKKNLEVMKN
AAAAACAACAATAATGTTAACGAGAAATTAGCAGGATTTGGGAAAGAAGAAGTAATGACCAATTTTGTTAGCGCCTTT EKLAGFGKEEVMTNFVSAFLA
TTGGCAAGCTGCAAAGATGGTGGCACATTGCCTAATGCAGGGGTTACTTCTAACACTTGGGGGGCGGGTTGCGCGT GTLPNAGVTSNTWGAGCAYV
ATGTGGGAGAGACGATAAGCGCCCTAACCAACAGCATCGCTCACTTTGGCACTCAAGAGCAGCAGATACAGCAAGC LTNSIAHFGTQEQQIQQA
CG
HP0005 2647 AAAAGCCGTGCAAACCACGCTCACTTTTGAAACGCCTTTTAACAAAACGCCTAAAATCATGGAAGTTGAAGGGCAAA 2648 KAVQTTLTFETPFNKTPKIMEV
AGGTGATCGTCTTAAAAAACGCTAAACTGGATTCTAAAAAAACCATGGATTTTAAAGAAGCCTCTTTGAATGCTTTAGA VLKNAKLDSKKTMDFKEASLNA
AATGTTTTCCTACCAAAATGACATCTACCTCTTGTCTAAAAAAGCTAAAGTGGAATTAGAAATCCAAGCTTCAAACAGC YQNDIYLLSKKAKVELEIQASNS
AAGGATAAAAAACGGCTCCGCTTTCTCTTTTTACCCAAAGGTTTTCATTTAGCCCCACCGCCTAACCTGAAAGAAAAA LRFLFLPKGFHLAPPPNLKEKS
TCTCAGCAAACTAACCTTGCACAAAAAGACACCAACGAGCAACCCCAAAGCCCTTTAAACACTCTAGAGTTAAAACC AQKDTNEQPQSPLNTLELKPP
CCCACTAAATTTAAGCCATGCTTATAAGGCGCTAGCGGTTAT YKALAV
HP0005 2649 CACCAATTCGTTCAATAAAAACCCCCATTTAATGCGGATGTGGAGCTTAGATGATGTGTTCAATCAAAGCGAATTGCA 2650 TNSFNKNPHLMRMWSLDDVF
AGCGTGGTTGCAACGCATTTTAAAAGCCTATCCTAGTGCTTCGTTCGTGTGTTCGCCCAAACTTGATGGGGTTTCGC AWLQRILKAYPSASFVCSPKLD
TCAATCTTTTGTATCAACATGGCAAGCTAGTGAAGGCGACCACTAGGGGCAACGGCTTAGAAGGAGAATTAGTTAGC LLYQHGKLVKATTRGNGLEGE
GCAAACGCTAAACACATCGCTAATATCCCCCACGCTATCGCTTATAATGGAGAAATAGAAATCAGGGGCGAAGTGAT KHIANIPHAIAYNGEIEIRGEVIIS
CATTTCTAAA GGATTTTGACGCTTTGAATCAAGAGCGCTTAAACGCTAATGAACCCCTATTCGCTAACCCCAGAAA ALNQERLNANEPLFANPRNAA
CGCCGCATCAGGGAGTTTGAGGCAACTTGATAGCGAAATCACTAAAAAGCGTAAATTGCAATTCATTCCTTGGGGCG QLDSEITKKRKLQFIPWGVGKH
TGGGCAAGCATTCTTTAAATTTTTTAAGCTTTAAGGAGTGTTTGGATTTTATCGTCTCGTTAGGTTTTAGCGCCATTCA SFKECLDFIVSLGFSAIQYLSLN
ATACTTAAGCCTAAACAAAAACCACCAAGAAATAGAAGACAATTACCACACCCTAATTAGAGAAAGGGAGGGCTTTTT IEDNYHTLIREREGFFALLDGM
TGCCCTTTTAGACGGCATGGTGATCGTTGTGAATGAATTAAATATTCAAAAGGAGCTAGGCTACACGCAAAAATCCC LNIQKELGYTQKSPKFACAYKF
CTAAATTCGCTTGCGCTTATAAATTCCCGGCTTTAGAAAAACACACCAAAATTGTAGGAGTCATTAACCAAGTGGGGC HTKIVGVINQVGRSGAITPVALL
GCAGCGGGGCGATCACACCGGTCGCTCTTTTAGAGCCTGTGGAAATTGCTGGAGCTATGATTAATAGAGCGACCTT GAMINRATLHNYSEIEKKNIML
ACACAATTATTCTGAAATTGAAAAAAAGAATATCATGCTCAGTGATAGGGTCGTTGTCATTAGAAGCGGCGATGTGAT IRSGDVIPKIIKPLESYRDGSQH
CCCTAAAATCATCAAGCCTTTAGAATCTTATAGAGACGGCTCGCAACATAAAATTGAACGCCCCAAGGTTTGCCCTAT VCPICSHELLCEEIFTYCQNLN
ATGTTCGCATGAGCTT ESLIHFASKDALNIQGLGDKV/E
KLIFNALDLYALKLEDLMRLDKF
QNLLDAILKSKNPPLW
HP0005 2651 CAACTTTTAGAATTGAAAAGGGGCATATCAAACACCGAAAGCCTAAAGAATTGGTTTTTAGCGTTCATTTAACAGACG 2652 TFRIEKGHIKHRKPKELVFSVHL
ATTTAAAGCGGCGCGATTTTAGCATGAATGCGATCGCTTATAGCCCTACAAAAGGGCTGATTGATCCTTTTAAAGGG RRDFSMNAIAYSPTKGLIDPFK
CAGAATGCGATTGAAAATCAAATGATTGAATGCGTGGGGGAAGCGCGATTAAGGTTTTTTGAAGACGCTTTAAGGAT NQMIECVGEARLRFFEDALRIL
TTTAAGATCGCTGCGATTCAGTGCAACTTTAGGCTTTAAGATAGCGCCAAACACCAAAGAAGCGGTTTTTGCGTGTAA ATLGFKIAPNTKEAVFACKDLL
GGATTTGTTAAAACACCTTTCTAAAGAACGCTTACAAAGTGAATTGAATAAGCTTCTTATGGGGAAAAACGCCTATGA RLQSELNKLLMGKNAYEVAKE
AGTGGCTAAAGAATATC GAAATTTTAGAGTTGGTTATTCAAGAAAAAATAGAAAATTTAGGGTTTTTAAAAAACGCG LVIQEKIENLGFLKNAPFNLELR
CCTTTCAATCTGGAATTMGATTGTTAGGGTTTTTΓAAGCATCAAAAAAGTTTAGAAAGTTTACGCTACCCTAAAAAAA HQKSLESLRYPKKTIVLFSKAK
CGATCGTTTTATTTTCCA GCTAAAGMTGCCATAAATCTΠTTTAAATATTCATAACAAMCAGAGTTAAAATTTTTA LNIHNKTELKFLLKNYDLEPFN
TTGAAA CTACGATTTAGAGCCTTTTMTTTGGCTTΓAGATTΓTTATGCGCTCAAAAACCCCAAACATGCTTTAAAAA ALKNPKHALKIKGLLKEIFDSNE
TTAAAGGCTTGTTAAAAGAAATCTTTGATTCTAACGAGCCTTTTAAAAAAGAACACTTGGCCCTTAAGGGCGGTGCGC HLALKGGALQSLGYQHQKIGEI
TTCAAAGCTTGGGTTACCAGCACCAAAAAATCGGCGAAATTTTAAACGCATGCTTAGATTTAGTCATCGCTAACCCTA DLVIANPKNNA
AAAATAACGCTTT
HP0005 2653 CCAGCACAGAAGATTTGCAAATCACTTTAGAATTTTTAAAAGAATACGAAGATGAAGCCATTACGCGCTTAAAAGAGC 2654 STEDLQITLEFLKEYEDEAITRL
TTTTAAAATCCCCTAATTTCACGCAAAACGCTTTAGAAAAAGTCAAAACCCAAATGTTAGCCGCACTTTTACAAAAAGA PNFTQNALEKVKTQMLAALLQK
AAGCGATTTTGACTATTTGGCTAAATTGACTTTAAAGCAAGAGCTTTTTGCTAACACCCCTTTAGCTAACGCAGCCTTA YLAKLTLKQELFANTPLANAAL
GGCACTAAAGAGAGCATTCAAAAAATCAAGCTAGACGATTTGAAACAGCAATTTGCTAAGGTCTTTGAACTCAATAAG QKIKLDDLKQQFAKVFELNKL
CTCGTGGTGGTGCTTGGGGGCGATTTGAAAATCGATCAAACCCTTAAGCGTTTGAATAACGCCCTTAATTTCTTGCC DLKIDQTLKRLNNALNFLPQGK
ACAAGGTAAAGCGTATGAAGAGCCTTATTTTGAAACGAGCGATAAAAAAAGCGAAAAAGTCCTCTATAAAGACACTGA YFETSDKKSEKVLYKDTEQAFV
GCAGGCTTTCGTGTATTTTGGTGCGCCCTTTAAAATCAAGGATTTAAAACAGGATTTAGCGAAATCTAAAGTCATGAT FKIKDLKQDLAKSKVMMFVLGG
GTTTGTGCTTGGTGGGGGGTTTGGCTCTCGTTTAATGGAAAAAATCAGGGTTCAAGAGGGATTAGCTTATAGCGTGT RLMEKIRVQEGLAYSVYIRSNF
ATATCCGCTCCAATTTTTCTAAAGTGGCGCATTTTGCGAGCGGGTATTTGCAAACCAAGCTCAGCACTCAAACTAAAA FASGYLQTKLSTQTKSVALVKKI
GCGπGCCTTAGTrAAAAAAATCGTTAAGGAATTTATAGAAAAAGGCATGACGCAACAAGAATTAGACGACGCTAAAA EKGMTQQELDDAKKFLLGSEP
AGTTTTTACTAGGCTCTGAGCCTTTAAGGAATGAAACGATCTCTAGCCGCTTGAACACCACTTACAATTATTTTTATTT SSRLNTTYNYFYLGLPLNFNQT
AGGTTTGCCTTTAAATTTTAACCAAACGCTGCTCAATCAAATCCAAAAAATGAGTTTGAAAGAAATCAATGATTTCATT KMSLKEINDFIKAHTEINDLTFAI
AAAGCCCACACCGAAATCAACGACTTGACTTTTGCTATTGTGAGCAATAAAAAGAAGGACAAATGATGCCAT KDK
HP0005 2655 GGGTTTTGAAACTGGATCCGGACCGATTAGCGGTGTTTAATTACGCGCATGTGCCTTGGGTGAAAAAAACGATGCGT 2656 VLKLDPDRLAVFNYAHVPWVK AAAATTGATGAAACCCTATTGCCAAGCCTTAGAGACAAGCTAGAGATTTTAGAATCTCTTATTAGTTTTTTAGAAAAAG DETLLPSLRDKLEILESLISFLEK CCAACTACCAAATGATAGGCATGGATCATTTCGCTAAAAGCGATAATGAATTGTATCTAGCCCTTCAAAAAGCAGAAT MIGMDHFAKSDNELYLALQKAE TGC
HP0005 2657 GGAAGAAAATCGCTTTAAGGATAGCGGCTATTTAAAAGAAAAATTAAAAGAGGCTAAAGATTTGATTGATGCGCTAAA 2658 EENRFKDSGYLKEKLKEAKDLI
TTTGAGAAAAGCCACGATTTATAAAATCGGTCTGATGCTTTTAGAGTATCAATACGA I M M I TAAGGGTAAGGAATTA RKATIYKIGLMLLEYQYDFFKGK
CGCCCTTTAAAGCTATTAGATTTAGCCAATGAGTTTAACCACTCTGTAAGCACGATTTCAAGGGCCATTTCTAATAAAT KLLDLANEFNHSVSTISRAISNK
ATTTGGCATGCGAAAGGGGGGTT TCCCCATTAAGCATTTCTTTAGCATCGCCTTAGACAATAGCGAGACTTCAAAC RGVFPIKHFFSIALDNSETSNAV
GCTGTGATTAAAGACTATC TAGAATTGATCAAAAACGAAGACAAAAAAGAGCCTTTGAGCGACGCTAAGATT A ELIKNEDKKEPLSDAKILELIEEK
GAACTCATTGAAGAAAAATTCCATTTGAAAATGGTAAGAAGAACGATCACCAAATACCGCCAACTGCTCAACATCGCC VRRTITKYRQLLNIASSSERKRL TCTTCAAGCGAAAGGAAAAGGCTCTATTTGATGCGCGCTTGAAAACCATGATTTTAGGGCTTATTCTTATTGAGTATC ATTTGTCCTTGTATGCGTGTGAGTTCAAACTCTTGGGATACTTTGGCGAGTTGGTATAAGCT
HP0005 2659 GTTTGGAAAAACCTGATTTGAAGGAGTTGGAGGAATACTACCATAAAAACAAGGTGTCTTATTTGGACAAAGAGGGG 2660 LEKPDLKELEEYYHKNKVSYLD AAATTGCAGGATTTTAAAAGCGTTCAAGAGCAAGTCAAGCATGATTTAAGCATGCAAAAAGCGAATGAAAAAGCCTTA QDFKSVQEQVKHDLSMQKAN AGGAGCTATATCGCTCTAAAAAAAGCGAACGCGCAAAACTACACCACACAAGATTTTGAAGAGAACAACTCCCCCTA YIALKKANAQNYTTQDFEENNS TACTGCTGAAATCACGCAAAAACTCACCGCTCTCAAACCCCTTGAAATCCTAAAGCCAGAGCCTTTTAAAGATGGTTT TQKLTALKPLEILKPEPFKDGFI
TATTGTGGTGCAACTCATCTCTCAAATTAAAGACGAATTGCAAAATTTTAATGAAGCTAAAAGCGCTCTTAAAACCCG QIKDELQNFNEAKSALKTRLTQ
CCTAACTCAAGAAAAAACCCTTATGGCGTTGCAAACTTTAGCCAAAGAAAAGCTTAAGGATTTTAAGGGCAAAAGCGT ALQTLAKEKLKDFKGKSVGYV
GGGCTATGTAAGCCCTAATTTTGGAGGCACTATTAGTGAGCTTAACCAAGAAGAAAGTGCTAAGTTTATCAACGCTCT GTISELNQEESAKFINALFNRQ
TTTTAACCGCCAGGAAAAAAAGGGGTTTATCGCTATTAATAATAAAGTGGTGCTCTATCAAATCACAGAACAAAATTTC AINNKVVLYQITEQNFNHSFSA
AACCACTCATTTAGTGCAGAAGAAAGCCAGTATATGCAGCGTTTAGTCAATAACACTAAAACGGATTTTTTT MQRLVNNTKTDFF
HP0005 2661 CTTTTTCTAACGCCAAAGACAAAGAGAGTGCGAGTGAAATCGCGCTTAATTGGGCTGAAGCAGAGATAAACTATCAA 2662 FSNAKDKESASEIALNWAEAEI
AATTTTAATAACGCTAAATACCTCATTGATAAGGTGGTCCAATCCAACCCTGATTATATTTCTACGCATAGCGAATCAG NNAKYLIDKVVQSNPDYISTHS
CCCTAGACTTGCTCAAGTTATTGAAAAAAAACCAGATGAATGCAAGCGCGATTGAGATCGCTCACTTGCTCCTCAATC LKLLKKNQMNASAIEIAHLLLN
AAGATGATGATCTGAAAGCTAAAGAGCAAGCGCTTTATGATTTAGGAGCGTTGTATGCAAGGATCAAGGACTTTAAAA AKEQALYDLGALYARIKDFKNA
ACGCCCACCTTTACAATCTGCAATATTTGCAGGACCATGCGGAACTGGATAAAGCTTCTGTCGTTAGGGCGCGCGAT QYLQDHAELDKASVVRARDE
GAAAAAGCCCTTTTTTCCATGGAGGGGAACACGCAAGAAAAAATCGCCCACTATGACAAAATCATTCAAAATTTCCCT EGNTQEKIAHYDKIIQNFPNSN
AATTCTAATGAAGCCCTAAAGGCTTTAGAATTGAAAGCCCAACTATTGTTTGAAAACAAGCGTTATGCTGAAGTGTTA ELKAQLLFENKRYAEVLSMQK
AGCATGCAAAAAAATTTGCCTAAAGATTCCCCTTTGATCCAAAAAACGCTCAATGTCCTTGCTAAAACCCCATTAGAG PLIQKTLNVLAKTPLENHRCEE
AACCATCGTTGTGAAGAAGCCTTAAAATATTTATCCCAAATCACAACCTTTGAATTCAGCCCCAAAGAAGAAATCCAA QITTFEFSPKEEIQAFDCLYFAS
GCCTTTGATTGCTTGTATTTCGCATCGCTCAAAGAAAAAGCGCAAATCATTGCCCTAAACGCTTTTAAAACGGCTAAA QIIALNAFKTAKAPSEKLIWLYR
GCCCCTAGCGAGAAATTAATATGGCTTTATCGTTTGGGGCGCAATTACTACCGCTTAGGGGATTTTAAAAATTCC YRLGDFKNS
HP0005 2663 GGCTAAAAGGGTGG GCGTTCAGTCGTTTCTCTAATGTGGTTTCAGAAATTGAAAAAAAATATGTGGATAAAATCAG 2664 AKRVEAFSRFSNVVSEIEKKYV
CATTTCTGAGATCATGACTAAAGCGATTGAAGGCTTGCTCTCTAATTTGGACGCGCATTCAGCGTATTTGAATGAAAA EIMTKAIEGLLSNLDAHSAYLNE
GAAGTTTAAGGAATTTCAAGCCCAAACCGAGGGCGAATTTGGGGGGCTTGGGATCACGGTGGGCATGCGCGATGG FQAQTEGEFGGLGITVGMRDG
CGTTTTAACCGTTATTGCCCCTTTAGAAGGCACTCCAGCTTACAAGGCTGGGGTTAAGTCAGGCGATAACATTTTAAA PLEGTPAYKAGVKSGDNILKIN
AATCAATAACGAAAGCACGCTGAGCATGAGCATTGATGATGCGATCAACCTCATGCGCGGCAAGCCAAAAACCCCTA MSIDDAINLMRGKPKTPIQITVV
TTCAGATCACCGTTGTAAGAAAAAACGAGCCAAAACCTTTAGTGTTTAACATCATTAGAGACATCATTAAACTCCCCT KPLVFNIIRDIIKLPSVYVKKIKET
CTGTCTATGTGAAAAAGATTAAAGAAACCCCTTATCTGTATGTGAGAGTGAGTGGTTTTGACAAGAATGTTACCAAAT RVSGFDKNVTKSVLEGLKANP
CGGTTTTAGAAGGCTTAAAAGCTAACCCTAAGGCTAAGGGGATCGTGTTGGATTTAAGGGGCAATCCTGGAGGGCT LDLRGNPGGLLNQAVGLSNLFI
ATTAAACCAAGCGGTGGGCTTGTCTAACCTCTTCATTAAAGAGGGGGT
HP0005 2665 TCATGAAAGCCTATACCGCATTGCTAAAAAAACAAGACCGATACGTCTATTTATTGAGGTATCTCCCCTCTAGGTATT 2666 MKAYTALLKKQDRYVYLLRYLP G
HP0005 2667 CTAAAAACAGCGCCATTGAAGTTTCCAACACGAAAGCTTCTGCTATGAAGAACGAAACGATTGGAAGCGGCGATCTT 2668 KNSAIEVSNTKASAMKNETIGS
AAAAAGGTGTGTGAGAAAATCAAAAGCGCACTACCCTTTGGGATCATCTCAGCCTTTAAACCCTTTAAAGACGCTTTT VCEKIKSALPFGIISAFKPFKDA
TACAGAGATTTCAATCATAATGAGCAAAAGTTACTGATAGGGGCAGCTAAAAGCGGTTGCATTCAATCTAGCGCTGAT NHNEQKLLIGAAKSGCIQSSAD
AAACTGGCTCAGTTAAAAACGCGCTTACTCTACTGGCAAGACAAATCTGTTAAAGTGGATTGGGATAAACCCATTTTA KTRLLYWQDKSVKVDWDKPILI
ATCAAGGACTTCTTTAAAGGCAATAATTACCTTTATAGGAGGTTTTGTTTTTTATTGGGGAAGCATTTTATGGACAGAT GNNYLYRRFCFLLGKHFMDRF
TTTTAAAGAATAACGCTAAGGCGAGCGTGAAAGACTTTATGTCTAGTAAGGAGTTTGTCGCTAAATACCGATACACCC KASVKDFMSSKEFVAKYRYTP
CCAAGCAAAATACAGAAAGAGCGAAAAAGCTGCAATCGTATTTAGAGAATAAGCGCGATTTTATAGGGTTTGTTCAAG RAKKLQSYLENKRDFIGFVQAL
CGCTTAACTCTTTAAAAGACAACCCGCAAGATCCTTTTTTACCCAATGAAGAAACGAGCTTTTTGGTGTTCGCTAATG NPQDPFLPNEETSFLVFANEPT
AGCCTACTATCGTGTTTAATTTAAGGGATTATTTATTGGTGTTAGCGCAAATCTTTAACCAACAAGCGATCTGTTATTG DYLLVLAQIFNQQAICYCES
TGAAAGCAA
HP0247 2669 GCAATTACAATTGCATTTATTGCGAGTTGGGTAAAGCCAAGCCCATTGAACGCATGGAAGAAGTGATCAAAGTGGAA 2670 NYNCIYCELGKAKPIERMEEVIK
ACCTTGATTAACGCCATTCAAAACGCCCTAAACAACCTCACCACCCCCATTGATGTTTTAACCATTACCGCTAATGGC AIQNALNNLTTPIDVLTITANGE
GAACCCACGCTATACCCTCATTTATTAGAGCTTATCCAAAGCATCAAGCCTTTTTTAAAGGGCGTTAAAACTTTGATTT LLELIQSIKPFLKGVKTLILSNGS
TAAGCAATGGCTCGCTCTTTTATGAGCCAAAAGTCCAGCAAGCCTTAAAGGAATTTGACATCGTTAAATTTTCTTTAG KVQQALKEFDIVKFSLDAIDLKA
ACGCTATTGATTTGAAAGCCTTTGAAAGAGTGGATAAACCCTATTCTAAAGACATTAATAAGATTTTAGAGGGGATTTT KPYSKDINKILEGILRFSQIYQG
GCGCTTTTCTCAAATTTATCAAGGGCAATTGGTGGCTGAAGTGTTGTTAATTAAGGGCGTGAATGATAGCGCGAACA LLIKGVNDSANNLKLIAAFLKQI
ACTTAAAACTCATCGCTGCCTTTTTAAAACAAATCAATATAGCCAGAGTGGATTTAAGCACCATAGACAGACC LSTIDR
HP0247 2671 ATGAGCATGAGAGGGCCGAAGCGATCATGCGCCTTTTAGACACCCAAGCACCCAAAAAGAGCATTGTTTTCACGCG 2672 EHERAEAIMRLLDTQAPKKSIV
CACTAAAAAAGAAGCCGATGAATTGCACCAATTCCTTGCTTCTAAAAATTACAAAAGCACCGCCTTGCATGGGGATAT EADELHQFLASKNYKSTALHG
GGATCAAAGGGATCGGCGCTCTTCTATCATGGCGTTTAAAAAAAATGACGCTGATGTGTTGGTGGCTACAGATGTGG DRRSSIMAFKKNDADVLVATD
CGAGTCGTGGGCTAGATATTAGCGGTGTAAGCCATGTGTTTAATTACCACTTGCCCCTAAACACTGAGAGCTATATC DISGVSHVFNYHLPLNTESYIH
CATCGCATCGGGAGAACCGGGCGAGCGGGCAAAAAAGGCATGGCGATCACTTTAGTAACCCCTTTAGAATACAAAG RAGKKGMAITLVTPLEYKELLR
AGCTTTTACGCATGCAAAAAGAAATTGATTCAGAGATTGAACTTTTTGAAATCCCCACCATTAACGAAAATCAGATCAT DSEIELFEIPTINENQIIKTLHDA
CAAAACCTTGCATGACGCTAAAGTGTCTGAAGGGATCATCAGCCTTTATGAACAGCTTACCGAAATTTTTGAGCCGTC SLYEQLTEIFEPSQLVLKLLSL
TCAATTGGTTTTAAAACTTTTGAGTTTGC
HP0247 2673 CAAAGGCTTACTCACTAAGCTTAAAGCAACAGCAGGGAGAAACAATAACGGGCGCATCACCAGCCGCCACAAAGAG 2674 KGLLTKLKATAGRNNNGRITSR
AGAGGGGCTAAAAAACTCTATCGCATTATTGATTTCAAGCGCAATAAATACAATATTGAAGGGAAAGTGGCTGCGATT AKKLYRIIDFKRNKYNIEGKVAAI
GAGTATGATCCTTACAGAAATGCGCGCATCGCTCTTGTAGTCTATCCTGATGGGGACAAACGCTATATTTTACAGCC RNARIALVVYPDGDKRYILQPS
AAGCGGTTTGAAAGTGGGCGATAGCGTTATCGCTGCTGAAGGCGGTTTGGATATTAAAGTGGGCTTTGCGATGAAG DSVIAAEGGLDIKVGFAMKLKNI
TTAAAAAATATCCCCATAGGAACGGTGGTGCATAATATTGAAATGCATCCAGGGGCTGGCGGGCAATTAGCCAGAAG VHNIEMHPGAGGQLARSAGMS
CGCAGGAATGAGCGCTCAAATCATGGGTAGAGAAAATAAATACACCATTATTAGGATGCCAAGCTCTGAAATGCGCT RENKYTIIRMPSSEMRYILSEC
ACATTCTAAGCGAATGTATGGCGAGTGTTGGCGTGGTAGGGAATGAGGATTTTATCAATGTCTCTATCGGTAAGGCA VVGNEDFINVSIGKAGRNRHR
GGGCGTAACCGCCACAGAGGGATCCGCCCACAAACTCGTGGTAGCGCGATGAACCCAGTGGATCACCCGCATGGT RGSAMNPVDHPHGGGEGKTG
GGGGGTGAGGGTAAAACAGGGACAAGCGGTCATCCTGTATCGCCTTGGGGCACTCCAGCTAAGGGCTATAAAACG VSPWGTPAKGYKTRKKKASDK
AGAAAGAAAAAAGCTAGCGACAAGCTCATCATTTCCAGAAAGAAACATAAATAAAGGTTAAAGAATGTCT KHK
HP0247 2675 GCAAAAAAGACGCTTGCGGGTTCATCTATGAGATCAGCGAGTTCATGAAAGCCTATACCGCATTGCTAAAAAAACAA 2676 KKDACGFIYEISEFMKAYTALLK
GACCGATACGTCTATTTATTGAGGTATCTCCCCTCTAGGTATTGGGCCAGCATTTTAACGACTGCCCTTTATGTCAAA VYLLRYLPSRYWASILTTALYV
TACCCTGATTTTGACGCTTTGAAAAAGCTTTTGGTGTCTTATTATTACCAAACTTGGATTGCAGGAGGCACGATCACG ALKKLLVSYYYQTWIAGGTITRI
CGCATCAAGCAAACCAGTATCAACATTATCAAAAACGTTAAAAGCAATAAGAGCGTTGAAACCATCAAAGAGCTTATA IIKNVKSNKSVETIKELILNSIDS
TTGAATAGCATCGACTCTTATAACACCTTTGATCAATACCTCTATAACTTATGGGATAGCTCTTCTGTTTATCATAGCA YLYNLWDSSSVYHSKWVRPVL
AATGGGTGCGTCCTGTCTTAGCCCTAGCTAATTATTTCATGGCAGATGAAGAGAAACCCCATTTTATCGCTATGGATG FMADEEKPHFIAMDAETQVEHI
CCGAAACCCAAGTGGAGCATATTTTGCCACAAACGCCCAAAAGAGGCAGTCAATGGAACGCGGATTTTGACAAAGA KRGSQWNADFDKEKREEWVN
AAAAAGAGAAGAATGGGTAAATAATATCGCGAATTTAACCCTTTTAAAGCGTAAAAAGAACGCGCATGCTTTAAACGG LLKRKKNAHALNGDFDEKRKIY
GGATTTTGATGAAAAAAGAAAAATTTATGGAGGCAAAGACACGAGCAAAGTGATTAGCTGTTATGACATCACTAAAGA SKVISCYDITKELYSNYRKW
ATTGTATAGCAATTATAGGAAGTGGA
HP0247 2677 TGACACCAAAGCGCTAGAGGCTTTTGGGGTTAATGCGGGCTTTTTATCCCAAATGCCCAACGCTTTAAAAAAAATGA 2678 DTKALEAFGVNAGFLSQMPNA
ATAAAGAAGAAGAATGGAAGAGACTTGTCAAAAGATTTGATGTGAATTACCAGTTCATCCCCATCATTAAAAACATGC KEEEWKRLVKRFDVNYQFIPIIK
TCATAGAAGCGAGCGTGCCGCAAGAATTTTTATTTTTAGCCATGGCCGAGTCTAAATTTTCATCAAGGGCTTATAGCA ASVPQEFLFLAMAESKFSSRA
GGAAAAAAGCGGTAGGGATTTGGCAATTCATGCCAAGCACGGCTAAAGAATTAGGGCTTAAGGTCAATCATTACATT VGIWQFMPSTAKELGLKVNHYI
GATGAAAGAAGAGATCCCATTAAAAGCACTCAAGCGGCGATCACTTATTTGAAACGGCTCTACAAGCAAACCGGAGA PIKSTQAAITYLKRLYKQTGEW
GTGGTATTTGGTCGCTATGGCGTATAATTACGGCTTACGCAAGGTTCAAAACGCTATTAAAGCCGCCGGCACTTCGG AYNYGLRKVQNAIKAAGTSDIKI
ACATTAAAATTTTGTTGGATGAAGATAAGAAATACCTCCCTAAAGAAACACGAGAGTATATCCGCTCCATTCTAAGCC KKYLPKETREYIRSILSLALKFN
TAGCGTTAAAATTCAACAGCCTAGACAACCTCAAAGATAAAGAATATCTGCTCAATCGTGGGGCGAGGGTGAGTTTA DKEYLLNRGARVSLVGVPFKR
GTGGGCGTCCCGTTTAAAAGGCGTGCTTCTTTAGTCCAAGTAGCCAAAAATTTGAATTTGAGTTTGGAAACCTTAAAA QVAKNLNLSLETLKSYNHQFR
TCCTACAACCACCAATTCCGTTATAACATTCTGCCTTCTAAAGACCCCACTTATACCATTTATATCCCTTATGAAAAAC KDPTYTIYIPYEKLALFKQRQIK
TCGCTCTTTTCAMCAACGCCAGATCAAACAAAATAAAAACATTC GCCAGπCAAAAAGCCCTTTTATCACCCATGT ASSKSPFITHVVLPKETLSSIAK
GGTCTTACCTAAAGAAACCCTATCTTCTATCGCTAAACGCTATCAAGTCAGTATTTCCAATATCCAATTAGCCAATGAT SNIQLANDLKDSNIFIHQRLIIPT
CTCAAAGATTCTAATATTTTTATCCACCAGCGTTTAATCATCCCCACCAACAAAAAATTACTCGCTACAAGGGAATTTT TREF
AATGGGTTTGG
HP1411 2679 TTAAAAAAATCCGCTTCCCTGAAAGCAGCGGCATAGGGGTAAAACCCATCAGTAAAGAAGGCACAGAGAGGCTAGT 2680 KKIRFPESSGIGVKPISKEGTE
GAGAAAGGCGATTGAATACGCAATTGATAACGACAAGCCAAGCGTGACTTTTGTGCATAAAGGTAACATCATGAAAT EYAIDNDKPSVTFVHKGNIMKY
ACACCGAAGGGGCGTTCATGAAATGGGGCTATGCGCTCGCTCAAAAAGAATTTAACGCTCAAGTCATTGATAAAGGC MKWGYALAQKEFNAQVIDKG
CCATGGTGTTCTTTGAAAAACCCTAAAAACGGTAAAGAAATCATCATTAAAGACATGATCGCTGACGCGTTTTTGCAA NPKNGKEIIIKDMIADAFLQQILL
CAAATCTTATTGCGCCCTAGCGAATACAGCGTCATTGCGACCATGAATTTGAACGGGGATTATATCTCTGATGCGTTA SVIATMNLNGDYISDALAAMVG
GCGGCGATGGTGGGGGGTATTGGTATCGCTCCTGGGGCTAATCTCAATGA GANLN
HP1411 2681 GCAAAAAGAAGCCCTAGCACTCATTATTGTAGATGAAGAAGTTTCTTTGGAAGTTTTAGAAGAGCTTAAAAACATTCC 2682 QKEALALIIVDEEVSLEVLEELK TGCGTGCTTAAGCGTTCATTATGTGGTTATTTAAGGTAGTTGGATGCGAGATTTTTTAAAACTTTTAAAAAAGCATGAT SVHYVVI GAATTAAAAATCATTGACACCCCCCTTGAAGTGGATTTAGAAATCGCTCATCTAGCCTATATAGAAGCGAAAAAACCT AATGGGGGCAAAGCCCTTTTATTCACGCAACCCA
HP1411 2683 GCGATCTTAAAAAGGTGTGTGAGAAAATCAAAAGCGCACTACCCTTTGGGATCATCTCAGCCTTTAAACCCTTTAAAG 2684 DLKKVCEKIKSALPFGIISAFKP
ACGCTTTTTACAGAGATTTCAATCATAATGAGCAAAAGTTACTGATAGGGGCAGCTAAAAGCGGTTGCATTCAATCTA RDFNHNEQKLLIGAAKSGCIQS
GCGCTGATAAACTGGCTCAGTTAAAAACGCGCTTACTCTACTGGCAAGACAAATCTGTTAAAGTGGATTGGGATAAA AQLKTRLLYWQDKSVKVDWD
CCCATTTTAATCAAGGACTTCTTTAAAGGCAATAATTACCTTTATAGGAGGTTTTGTTTTTTATTGGGGAAGCATTTTAT FFKGNNYLYRRFCFLLGKHFM
GGACAGATTTTTAAAGAATAACGCTAAGGCGAGCGTGAAAGACTTTATGTCTAGTAAGGAGTTTGTCGCTAAATACC NAKASVKDFMSSKEFVAKYRY
GATACACCCCCAAGCAAAATACAGAAAGAGCGAAAAAGCTGCAATCGTATTTAGAGAATAAGCGCGATTTTATAGGG TERAKKLQSYLENKRDFIGFV
TTTGTTCAAGCGCTTAACTCTTTAAAAGACAACCCGCAAGATCCTTTTTTACCCAATGAAGAAACGAGCTTTTTGGTGT KDNPQDPFLPNEETSFLVFAN
TCGCTAATGAGCCTACTATCGTGTTTAATTTAAGGGATTATTTATTGGTGTTAGCGCAAATCTTTAACCAACAAGCGAT NLRDYLLVLAQIFNQQAICYCE
CTGTTATTGTGAAAGCAAATGCCCTATAGAATTGATCAACGCTTCACCGGGTAAGGACTTTAACAAGACACAAGACA LINASPGKDFNKTQDSFPDIKF
GCTTCCCAGACATCAAATTCTCAACACCCAATCAGTTAGAACAATCCCTCAACGCTCTTAAAAACAAGCTAGCCGCAT EQSLNALKNKLAAFFSKH
TCTTTTCAAAACACC
HP1411 2685 GCCCAGAAGCGGCTAAAAACGCCTTAATGGAGCGTTTCACTTTGAGCGAGATTCAAAGCAAGGCCATTTTAGAAATG 2686 PEAAKNALMERFTLSEIQSKAI
CGTTTGCAACGCTTAACAGGCCTTGAAAGGGATAAGATCAAAGAAGAATACCAAAACTTGTTGGAGCTTATTGATGAT RLTGLERDKIKEEYQNLLELID
CTCAATGGCATTTTAAAGAGCGAAGATCGCTTGAATGGAGTCGTCAAAACAGAGCTTTTAGAAGTCAAAGAGCAATTT SEDRLNGVVKTELLEVKEQFS
TCTTCTCCAAGGCGCACTGAAATTCAAGAATCTTATGAAAATATTGACATAGAAGATTTGATCGCTAATGAGCCTATG IQESYENIDIEDLIANEPMVVSM
GTAGTGAGCATGAGTTATAAAGGCTATGTGAAAAGAGTGGATTTAAAAGCTTATGAAAAGCAAAATCGTGGTGGTAAA VKRVDLKAYEKQNRGGKGKL
GGCAAGCTTTCAGGCAGCACTTATGAAGACGATTTCATTGAAAACTTTTTTGTGGCTAACACGCATGATATTTTGCTC DDFIENFFVANTHDILLFITNKG
TTTATCACCAATAAGGGGCAATTGTATCATTTGAAAGTCTATAAAATCCCAGAAGCGAGCCGGATCGCTATGGGTAAA VYKIPEASRIAMGKAIVNLISLA
GCCATTGTAAATTTAATCTCGCTCGCTCCGGATGAAAAGATCATGGCGACTCTAAGCACCAAAGACTTTAGCGATGA ATLSTKDFSDERSLAFFTKNG
ACGCTCTTTGGCCTTCTTCACGAAAAATGGCGTGGTGAAGCGCACCAATTTGAGCGAATTTGAAAGCAACAGGAGTT LSEFESNRSCGIRAIVLDEGDE
GTGGTATCAGAGCGATTGTTTTAGATGAAGGCGATGAATTAGTGAGCGCAAAAGTTGTGGATAAAAACGCTAAGCAT VDKNAKHLLIASHLGIFIKFPLE
TTGCTCATCGCATCGCATTTGGGCATTTTCATTAAATTCCCTTTAGAAGAGGTGCGCGAGATCGGAAGAACTACTCGT RTTRGVIGIKLNENDFVVGAVV
GGGGTTATAGGCATCAAGCTGAATGAAAACGATTTTGTTGTCGGTGCGGTCGTTATTAGCGATGATGGCAACAAGCT KLLSVSENGLGKQTLAEAYR
TTTGAGCGTGAGTGAAAACGGGCTTGGCAAGCAAACTTTAGCCGAAGCGTATAGA
HP1411 2687 CTACCTTAAGGCTTCTATTTCGTTAGAATTGAGTAATGAAAAGCTTTTGAATGAAGTCAAGGTTAAAGACACAGCGAT 2688 YLKASISLELSNEKLLNEVKVK
TAAAGACACGATTATAGAGATTCTGTCGTCTAAAAGCGTGGAAGAAGTGGTTACTAACAAAGGCAAAAACAAGCTTAA IIEILSSKSVEEVVTNKGKNKL
AGATGAGATTAAGAGCCATTTGAATTCGTTTTTGATTGATGGCTTTATTAAAAATGTCTTTTTCACTGATTTCATTATCC LNSFLIDGFIKNVFFTDFIIQ
AATAATTTTAAATGATTGGCATAGATATTGTCTCTATTGCTAGGATAGAAAAGTGCGTGAAACGCTTTAAAATGAAGTT
TTTAGAGCG I I I IT TATCGCCAAGCGAGATTGTTTTATGCAAGGATAAATCCAGCAGTATCGCCGGGTTTTTCGCGCT
TAMGAGGCTTGTTCTAAAGCCCTTCAAGTGGGCATTGGTAAGGAATTGAGCTTTTTGGATATAAAAATCTCTAAAAG
CCCTAAAAACGCCCCCTTAATCACCCTTTCCAAAGAAAAAATGGATTATTTCAATATCCAAAGCTTGAGCGCGAGCAT
CAGCCATGACGCTGGTTTTGCGATAGCGGTCGTGGTGGTTTCTTCGTCAAATGAATAAATAACGCATTCTAAAACTAA
CATTCTGTTTTAAAMGCGCTT CGGCTGTTTAAAAMCGCAAAAAACTTTTTTCAAATGGTTTTGATAACAAATTGC
AMTAMTAAMTCAMCTTCAA TTTMGTT TTTTMT TTATTTTTATAGTATGCATCGGTTTGAATTAAATGAG
A GGTTATCACAATGAATGGTTATTTGAGAGTAAAAACCTCTTATTTTTTAGCGTTGAACGCtTTGACTTTTTTGTCTT
TTAACTCTrTGGTGGGCGCGAAAGAACAGCATCACACTTTGCAAAAAGTGACAACCACTGAGCAAAAATTCAATCCA
AGCGCGCCGCTTTCATGGCAAAGCGAAGAGATGCGTAATTCCACAAGCTCTCGCACGGTGATTTCCAACAAGGAAC
TCAA
HP1411 2689 GAGTAAATCTAAGGGCAATGTGGTGTCTTTGGACAAGCTGCTCAAAACGCATGGGAGCGATGTGGTGCGTTTGTGG 2690 SKSKGNWSLDKLLKTHGSDW GTAGCGTTTAATGACTATCAAAACGATTTGAGAGTCTCTCAAACCTTTTTCACTCAAACAGAACAACATTATAAAAAAT AFNDYQNDLRVSQTFFTQTEQ TCCGCAACACCCTGAAATTCTTACTCGCTAATTTTAGCGATATGGATCTCAAGAATTTAGAACGCCCCCATAACTTCA RNTLKFLLANFSDMDLKNLERP GCCCTTTAGATCATTTTATGTTAGAGACTTTAGAAACCATAAGCGCTGGAGTCAATAGCGCGTTTGAAGAGCATGATT LDHFMLETLETISAGVNSAFEE TTGTGAAAGGCTTGAATATTTTAATGGCGTTTGTTACCAATGAATTGAGCGGGATTTATTTAGACGCTTGCAAGGATA GLNILMAFVTNELSGIYLDACKD GCTTGTATTGCGATAGCAAAAACAATGAAAAACGCCAAGCCATTCAAATGGTTTTACTCGCTACAGCTAGTAAGTTGT SKNNEKRQAIQMVLLATASKLC GCTACTTTTTAGCCCCGATTTTAACGCACACGATTGAAGAAGTTTTAGAGCATAGCCAAGCGCTTCGCATTTTTTTAC LTHTIEEVLEHSQALRIFLQAKD AAGCCA GATGTGTTTGATTTAAAAGACATTAGCGTTTCAGAAAAACTCCACCTCAAAGAGTTTAAAAAACCAGAAA DISVSEKLHLKEFKKPENFEAVL ATTTTGAAGCCGTTTTAGCCTTGCGTTCTGCCTTTAATGAAGAGTTAGACCGATTGAAAAA FNEELDRLK
HP1411 2691 CGAAAAAAGGATTTTCATGCTGCTTTGCGCTGGAAGGAATGAGACTTTAAAAAAAGCGGTGCCTATTGGTGTGGGTT 2692 EKRIFMLLCAGRNETLKKAVPI
TGATAGAGAGCGCGATTAATCTAACGAGAATGTGTTTAAAAAACCCTGATACAGAAAGCCTTATTTTTATAGGGAGCG SAINLTRMCLKNPDTESLIFIGS
CGGGGAGTTATAGCCCAGA TGGAGCTTTTAAGCGTGTTTGAAAGCGTTTGCGGCTATCAAATTGAAGAGAGTTTT PEMELLSVFESVCGYQIEESFS
AGCCATTTAAACAGCTACACGCCTTTGGATAATTTCATTCACATAGAAACTGAAGAGCAGGCTCTTTTTGAAAGGGTG TPLDNFIHIETEEQALFERVRVN
CGTGTGAATAGCAGTAACTATATCCACACCAGCGAAATGTTCGCTAAAAAAATGGTTCAAAAGGGCGTTTTATTAGAA HTSEMFAKKMVQKGVLLENME CATGGAGTTTTTTAGCGTTTT GCGTGGCTA GCGTTTTCTTTAAAGGCTAAAGGGATTTTTTGCGTGAGTAATT SVAKAFSLKAKGIFCVSNYVGL
ATGTGGGGCTTAATGCGTATCAGGAATTTAAAGAAAACCACGCCAAAGTCAAACAGATTTTAGAAAACATCATTGATA FKENHAKVKQILENIIDSLII
GTTTAATAATTTAATAGTTTAGCTATCATGGAGCATTCTAAATTAAAGGCGATCACATGTTTGAAAAAATACGCAAGAT
TTTAGCGGATATTGAAGATTCGCAAMTGAAATTGAAATGCTTTTAAAATTAGCGAATTTGAGTTTGGGGGATTTTATT
GAGATTAAAAGAGGGAGCATGGACATGCCAAAGGGCGTGAATGAAGCGTTTTTTACGCAATTAAGCGAAGAAGTGG
AGCGATTGAAGGAGCTrATTAACGCTTTGAATAAAATCAAAAAAGGGTTATTGGTGTTTTAAATGTGTGGGATTGTAG
GTTATATAGGGGATAGCGAGAAAAAATCCGTTCTTTTAGAGGGATTAAAGGAATTGGAATACAGAGGTTATGACAGC
GCGGGTTTAGCCGTATTGAGTAATGATCGTTTGGAAGTGTTTAAAACTCAAGGGAAATTAGAAAACCTTAAACTAGAG
CTTAAAAATAAA
HP0482 2693 GAGGTGAAGTGTTAGAAATTGTGGCTTTGATCGCTTATTTGAATAGCTTGGGTAATTCCAGGATCAACGCCAATCAAA 2694 GEVLEIVALIAYLNSLGNSRINA
ACGCTAAATAAGGGGTGAATGATGGATTTAGAAAGTTTGAGAGGTTTTGCGTATGCGTTT TTACCATTCTTTTTACG
CTCTT ΓTGTATGCCTATATT TTAGCATGTATAGAAAGCAAAAAAAAGGCATTATGGATTATGAGCGATACGGATACT
TAGCGTTAAATGATGCTTTAGAAGACGAGTTGATTGAACCACGCCATAAAAAAGTTCATGATAATGGCATAAAGGAAA
GTTGAAATGGATTTTTTAAACGACCATATAAATGTTΓΓTGGCTTGATTGCAGCGCTTGTGATTTTAGTTTTAACCATCT
ATGAATCCAGTTCGCTCATTAAAGAAATGCGCGACAGCAAATCTCAAGGTGAGCTTGTAGAAAATG
HP0482 2695 GCG GTCAAAACTGCCTTTAAAATCGCTGATGTAGAATACGTGAAAGACAGCACAAAGTTAAATTTCAACTATCTTA 2696 EVKTAFKIADVEYVKDSTKLNF
AGGATTTAAAAGATGAA C TCAACCTTTATCTCAA TATTTTAACTCAAAATGTGGCTAGAGTGTATTTAATTGTA KDENNQPLSQNILTQNVARVYL
GTGAATGGTGAGATTAAAAAAATCGGTGGCTCTCAAGCAGATGGTGGGATTAAAAGCACGCTCAATATTTATAAAGAT EIKKIGGSQADGGIKSTLNIYKD
GGGGGAGTCAAAGGGAGGCCTAATATTAGAAGTTTTGGCGTGTGGTATTTTCTTTATCACACAATACTCACAGGGGC RPNIRSFGVWYFLYHTILTGAKI
TAAAATAGAATTTTACATGATTTATCAGCCTAATTTTGAAACTCAAGTGAAAGGCTTGTTTGGTTTTTGTGCAATCAAA YQPNFETQVKGLFGFCAIKDAS
GATGCAAGTATAAGCTATAAACTTTTAGAGCAAGCTTGCCTGACGGATTACAGAAACAATAGCAATGACGCATTACCC EQACLTDYRNNSNDALPEWN
GAATGGAATGTGCAAGAGCAGGGCAAAGATTAGCCAAATGATATTAAAGATGAGCATGCCAATATCACTCAAAAAGC KD
TCAAAACAGAGAAAAGGCCGTCCATAGAAAAGCGATTGACAAACCTAGCGGAACTTTAAAAGATTAAGAGATAGTCT
TAACCCAATCTCAAAAAAAGAACTTTAAGITTTTACTCCATTTAAAAAGTGTGGGTTTAGATGAAAGGAAAAAATAAAA
CTTGTATAAGGTAGCAGATATTTTTTGTGGTGCTGGAGGATTGAGCTATGGCTTTTCTATGCACCCTTATTTTGAATTG
ATATGGGCTAACGATATAGACAAGGACGCCATTTTAAGCTATCAAGCCAATCATAAAGAGGTGTAAACCATTTTATGC
GATATTGTGCAACTTCATTGCCACAACTTGCCATGCGTTTCAATTGATATTCTACTAGGCGGACCACCATGCCAGAGC
TATTCTACCCTTGGCAAAAGAAAAATGGATGAAAAAGCGAATCTGTTTAAAGAA
HP0482 2697 AATACGCTGATCCTAGCACTTCTAAAAAGAGAGCCGATAAGGGATTAAAAAAGGTGTTCAAAGACAGCAAAAAAGAC 2698 YADPSTSKKRADKGLKKVFKD
GCTTGCGGGTTCATCTATGAGATCAGCGAGTTCATGAAAGCCTATACCGCATTGCTAAAAAAACAAGACCGATACGT CGFIYEISEFMKAYTALLKKQD
CTATTTATTGAGGTATCTCCCCTCTAGGTATTGGGCCAGCATTTTAACGACTGCCCTTTATGTCAAATACCCTGATTTT RYLPSRYWASILTTALYVKYPD
GACGCTTTGAAAAAGCTTTTGGTGTCTTATTATTACCAAACTTGGATTGCAGGAGGCACGATCACGCGCATCAAGCA KLLVSYYYQTWIAGGTITRIKQT
AACCAGTATCAACATTATCAAAAACGTTAAAAGCAATAAGAGCGTTGAAACCATCAAAGAGCTTATATTGAATAGCAT VKSNKSVETIKELILNSIDSYNTF
CGACTCTTATAACACCTTTGATCAATACCTCTATAACTTATGGGATAGCTCTTCTGTTTATCATAGCAAATGGGTGCGT NLWDSSSVYHSKWVRPVLALA
CCTGTCTTAGCCCTAGCTAATTATTTCATGGCAGATGAAGAGAAACCCCATTTTATCGCTATGGATGCCGAAACCCAA DEEKPHFIAMDAETQVEHILPQ
GTGGAGCATATTTTGCCACAAACGCCCAAAAGAGGCAGTCAATGGAACGCGGATTTTGACAAAGAAAAAAGAGAAGA SQWNADFDKEKREEWVNNIAN
ATGGGTAAATAATATCGCGAATTTAACCCTTTTAAAGCGTAAAAAGAACGCGCATGCTTTAAACGGGGATTTTGATGA RKKNAHALNGDFDEKRKIYGG
AAAAAGAAAAATTTATGGAGGCAAAGACACGAGCAAAGTGATTAGCTGTTATGACATCACTAAAGAATTGTATAGCAA VISCYDITKELYSNYRKWNEKS
TTATAGGAAGTGGAATGAGAAGTCCCTCCAAGAGCGATACAAATCTTTGTATAACACTATCACGCCTGTTTTACACAT KSLYNTITPVLHIEGQEDDFED
AGAGGGGCAAGAAGATGATTTTGAAGATGATTTTGATCT
HP0241 2699 CGAGCGTGAGCGAAGTGTTAGAATTGTGCGATGTTTTAAACCCGCGCAACATTAAGGGGCGTTTGAATTTGATCGTG 2700 SVSEVLELCDVLNPRNIKGRLN
CGCATGGGTTCTAAGATGATTAAAGAGCGTTTGCCTAAACTTTTACAAGGGGTGTTGGAAGAAAAACGCCATATTTTA GSKMIKERLPKLLQGVLEEKRH
TGGAGCATTGATCCCATGCATGGCAACACGGTTAAAACCAGCTTGGGGGTTAAAACAAGGGCTTTTGATAGCGTGTT PMHGNTVKTSLGVKTRAFDSV
AGATG GTGAA GCTTTTTTGAAATCCATAGGGCTGAAGGGAGTTTGGCTTCAGGGGTTCATTTGGAAATGACAG SFFEIHRAEGSLASGVHLEMTG
GTGAGAATGTTACAGAATGTATCGGTGGCTCGCAAGCGATCACCGAAGAGGGTTTGAGCTGCCATTACTACACGCA CIGGSQAITEEGLSCHYYTQCD
ATGCGATCCAAGATTAAACGCCACCCAAGCCCTAGAACTCGCCTTCTTAATCGCTGACATGCTCAAAAAACAGCACG TQALELAFLIADMLKKQHA
CTTAGTTAAAAAGAGATTAATCTTTTTTTMCTCTTTTACTTTATMTTATCGTTGGCATTTTAATATTCAAAGGAGCTTG
AAATGAGAATTTCTCTTTrAGCTGTAATTTTAGCGTTATTGTTTGTGGCTTGCCACGAAACTAAAAAACAAATCTTACA
AAACGAAGCCGATAGCACCCCTTCAGAAAAAACCATTTGGCAACCTGAACAAAAATAAAAATTGTAAAAATACTCAAA
GGCATTTTTTAAAATAAACGCAATAAAAAAGCTAGTGGGTGATAGTTTTTTGTAAAGCGATCTTAGGGTTGTAGGGGA
ATGATTTCAAAAACACCCCCTAACTCTTATGAAGTTTATTATAAAACTAA CAAMTTTAAAACAACAAGTTTTTAAAA
GTGAGAGAATGGCTAAAAMTCTCA TTAAAATGGGCGAAAGTTAGAAAGTTAAACTACAAACTCTCTAAAACCTTTT
GAGCATGGCCTTTCGCTTTGACATTGTAGAAACAI I I M CTAAAACGCCTTGCGTGTTGATAATGAAGGTGGAGCGG
ATAATCCCCA
HP0241 2701 CGCAACAAAGCGATCCAACTACAGCTAATATTGCAAACCCTTGCGCGCTTAGCGCCCAAAGCACGAATGGCGCTTCT 2702 QQSDPTTANIANPCALSAQSTN
TCTAATAATGCGTCAAATAACGCGCCAATCGCCTTGAGTAATAACGATGAAAGCTTGATGGTTGCGGCGAATGATTT NASNNAPIALSNNDESLMVAAN
CAATTTTTCAGGCAATATTTACGCTAATGGGGTGGTTGATTTTTCAAAGATTAAAGGCTCTGCAAACATTAAAAACCTG GNIYANGVVDFSKIKGSANIKNL
TATCTTTACAATAACGCTCAATTCCAAGCCAACAATCTCACTATTTCCAATCAAGCGGTGTTAGAAAAAAACGCCAGC AQFQANNLTISNQAVLEKNASF
TTTGTAACGAATMTTΓAAACATTCAAGGAGCGTTTAACAACAACGCCACGCAAAAAATAGAGGTGCTTCAAAATTTA NIQGAFNNNATQKIEVLQNLVIA
GTGATCGCTTCAAACGCTTCTTTAAGCACCGGGATTTATGGGTTAGAAGTAGGGGGGGCTTTGAATAATTCTGGAGC STGIYGLEVGGALNNSGAIHFN
GATCCATTTTAATTTAGAAAATACCCAAACGCCAACGCCGCTCATTCAAGCAGAGGGGATCATTAACCTCAACACCAC TPTPLIQAEGIINLNTTQTPFMN
CCAAACGCCTTTTATGAATGTCAATAACAGCATGGCCAATAATACGACTTACACTTTATTAAAAAGCAGCCGTTACATT ANNTTYTLLKSSRYIDYNINPNS
GATTACAATATCAACCCCAACAGCTTGCAATCGTATTTGAATCTCTACACTTTAATCAATATCAACGGGAACCACATAG NLYTLININGNHIEEKNGALTYL
AGGAAAAAAACGGCGCATTGACTTATTTGGGCCAACGGGTTTTGTTGCAAGATAAGGGGTTATTGTTAAGCGTAGCG LQDKGLLLSVALPNSNNASQN
CTGCCCAACTCAAACAACGCTTCTCAAAACAACATTTTAAGCCTTTCTGTCCTTTATAACCAAGTTAAAATGTCTTGCG VLYNQVKMSCGDKAMDFTPPT
GCGATAAAGCGATGGATTTTACCCCCCCTACCTTACAAGATTACATTGTGGGCATTCAAGGGCAAAGCGCGCTCAAT VGIQGQSALNQIEAVGGNAIKW
CAAATTGAAGCTGTTGGGGGGAACGCTATCAAGTGGCTTTCAACATTGATGATGGAGACTAAAGAAAACCCGTTTTT METKENPFFAPIYLKNHSLNEIL
TGCGCCGATTTA LQNTASLISNPNFRDNATNLLE
QTSRLTKLSDFRSREGESDFSL
KRFSDPNPEVFVKYSQLSKHP
QGVGGASFISGGNGTLYGLNA
VKNVILGGYVAYGYSDFNGNIM
NVDVGMYARAFLKRNEFTLSA
GNATSINSSNSLLSVLNQRYNY
SVNGNYGYDFMFKQKSWLKP
HP0241 2703 TTTGAACAACCTTTTAACGAGATACAGCACCCTAAACACCCTTATCAAATTGTCCGCTGATCCGAGCGCAATTAATGC 2704 LNNLLTRYSTLNTLIKLSADPSAI
GGTGCGGGAAAATCTGGGCGCGAGCACGAAGAATTTGATCGGCGATAAAGCCAACTCCCCGGCGTATCAAGCCGT NLGASTKNLIGDKANSPAYQAV
GTTTTTAGCGATCAACGCGGCGGTAGGGTTGTGGAATACCATCGGCTATGCGGTCATGTGCGGGAACGGGAACGG AVGLWNTIGYAVMCGNGNGTE
CACAGAGAGTGGGCCTGGCAGCGTGATCTTTAATGACCAACCAGGACAGGATTCCACGCAAATTACTTGCAACCGC SVIFNDQPGQDSTQITCNRFES
TTTGAATCAACTGGGCCTGGTAAAAGCATGTCTATTGATGAATTCAAAAAACTCAATGAAGCCTATCAAATCATCCAG SMSIDEFKKLNEAYQIIQQALKN
CAAGCTTTAAAAAATCAAAGTGGGTTTCCTGAATTAGGCGGGAACGGCACAAAAGTGAGTGTTAATTACAATTACGAA ELGGNGTKVSVNYNYECRQTA
TGCAGACAAACTGCTGATATCAACGGCGGTGTGTATCAGTTCTGCAAGGCTAAAAATGGTAGTAGTAGCAGTAGTAA VYQFCKAKNGSSSSSNGGNG
TGGCGGTAATGGCAGTAGCACGCAAACAACCGCGACAACCACGCAAGACGGCGTAACGATCACCACTACCTATAAT ATTTQDGVTITTTYNNNKATVK
AATAACAAAGCCACCGTCAAATTTGACATCACCAATAACGCTGAACAGCTGTTAAATCAAGCGGCAAACATCATGCAA AEQLLNQAANIMQVLNTQCPL
GTCCTTAATACGCAATGCCCTTTAGTGCGTTCCACGAATAACGAAAACACTCCAGGGGGTGGTCAACCATGGGGTTT ENTPGGGQPWGLSTSGNACSI
AAGCACATCCGGGAATGCGTGCAGCATCTTCCAACAAGAATTTAGCCAGGTTACTAGCATGATCAAAAACGCCCAAG SQVTSMIKNAQEIIAQSKIVSEN
AAATAATCGCGCAAAGCAAAATCGTTAGTGAAAACGCGCAAAATCAAAACAACTTGGATACTGGAAAACCATTCAACC NLDTGKPFNPYTDASFAQSML
CTTACACGGACGCCAGCTTTGCGCAAAGCATGCTCAAAAACGCTCAAGCGCAAGCAGAGATGTTCAATTTGAGCGA QAEMFNLSEQVKKNLEVMKNN
ACAAGTGAAAAAGAACTTGGAAGTC KLAGFGKEEVMTNFVSAFLAS
TLPNAGVT
HP0241 2705 TAGCGGTAATTGAAAACGAAATGCACACTTTTTTCACCAAACTCATTGATGGGAATTACCCTGATTATCAAAAAATCCT 2706 AVIENEMHTFFTKLIDGNYPDY CCCTAMGAATACATTTCTTCTTTCACTTTAGGCAAGGAAGAATTTAAAGAGAGCATTAAATTGTGCAGTTCTTTAAGC EYISSFTLGKEEFKESIKLCSSL TCCACCATTAAACTCACTTTAGAAAAAAACAACGCTTTGTTTGAATCTTTGGATTCTGAGCATAGCGAAACGGCTAAA TLEKNNALFESLDSEHSETAKT ACCTCTGTTGAGATTGAAAAAGGTTTGGATATTGAAAAAGCCTTTCATTTGGGCGTGAA GLDIEKAFHLGV
HP0241 2707 CGGTGGGCGATGTTTTTGGTGAAAACGGGCTTTTAAACGCGCTAGATCCTACGGAAAGAAAAAAAATTGATCAAATG 2708 VGDVFGENGLLNALDPTERKKI CTTTTAGAGCAAATCCAAGCCCATTCTTCAGGGTTTGAAAAATTCATCGTGAAAACTTTAGGGATTGAAAATGTAGAG EQIQAHSSGFEKFIVKTLGIENV AATTTCATCAATAACTGGTATGGCAAGCAAAGCTTGAGTTCTTTTGCCAATAATTTTGTGCCTGGAGGCTTGAATCAA NWYGKQSLSSFANNFVPGGL GCCCTTGATAAAATAGGCTCTAGCTCTGATGCCAAAGACTTACAGAACTTCTTGGATAAAACGACTTTTGGGGATATT IGSSSDAKDLQNFLDKTTFGDI TTAAATCAAATGATTGAACAAGCCCCCTTAATCAATAAACTCAT QAPLINKL
Figure imgf000415_0001
Figure imgf000416_0001
HP0448 2729 GCATGTGGGTAAGGATAATTTGCCCTTTTATTTCAAAAAAGTTAAAGAAGTGTTAAAGAGAGGCGGGATGTTTTTGCT 2730 HVGKDNLPFYFKKVKEVLKRGG
CCACTCCATTTTATGCTGTTTTGAAGGCAAGACTAACGCATGGGTGGATAAATACATTTTTCCAGGTGGTTATTTGCC HSILCCFEGKTNAWVDKYIFPG
CTCTTTAAGAGAAGTGATGAGCGTGATGAGCGAATGCGACTTCCACTTGCTCATGGCTGAAAGCTTACGCATCCATT
ACGCTAAGACTTTAGACATTTGGCGAAACAATTTCAACCACAATCTAGACCAAGTGAAAAGACTCAGCTATGACGAAC TLDIWRNNFNHNLDQVKRLSYD
GCTTTATCCGCATGTGGGATCTGTATTTAAGGACTTGCGCATCCGCCTTTAGGGTGGGGAGCGCGGATTTATTCCAA MWDLYLRTCASAFRVGSADLF
TTGCTTTTAACCAACAGCGTGGATAACACTTTCCCCTTAACCAAAGAATACATCTACCAATAATTTCAGATTTTAGCCC SVDNTFPLTKEYIYQ
CTTCTTTCAGGCTAACTTCTGGCAAACTCTCCACATACAAACTCTGTGATGGGAAAGCAAAACTCAAATGGTGCTTTT
CTACAATCCCCATGATTTTTAGCATCACATCTTCTTTGACTTCTAGCCACTCTTCCCAAACCACTGTCTTAGAAAAGCA
ATACACTAA TATT TAGAGCTGTCCGCAAACTGATCTAAAAAGACAAACAAATTGTTΠTATACCCTAAAAAATCA
TCAATAGAMC TATCTTTTTTAAACATGTAGCGGTAATCGCTCACATTTTGCAAAGCGCTATCGGCTCCGTTAGCG
ATTTTAGGGTGGTTTTCTA
HP0819 2731 GTGAAGCCGATAAGGTCATTAAGGCCACTAAAGAAACTAAAGAGACCAAGAAAGAAGCTAAACGACTCAAAAAAGAA 2732 EADKVIKATKETKETKKEAKRL
GCTAAACAGCGCCAACAGATCCCTGATCATAAGAAACCTCAATATGTCTCTGTTGATGACACAAAAACTCAAGCGCT QRQQIPDHKKPQYVSVDDTKT
GTTTGATATATACGACACCTTGAATGTGAATGACAAAAGCTTTGGGGATTGGTTTGGTAATAGCGCTTTGAAAGACAA YDTLNVNDKSFGDWFGNSALK
AACCTATCTCTACGCTATGGATCTATTGGATTACAACAACTATTTATCCATAGAAAACCCCATTATCAAAACAAGAGCA YAMDLLDYNNYLSIENPIIKTRA
ATGGGAACTTATGCGGATTTGATTATCATCACAGGCTCATTAGAGCAAGTCAATGGGTATTACAACATTCTAAAAGCG DLIIITGSLEQVNGYYNILKALNK
CTCAACAAACGCAACGCTAAGTTTGTGTTAAAAATCAATGAGAACATGCCTTATGCCCAAGCGACTTTTTTAAGAGTG VLKINENMPYAQATFLRVPKRS
CCAAAAAGAAGCGATCCTAATGCCCACACGCTTGATAAGGGAGCGTCAATTGATGAGAATAAGCTTTTTGAACAACA TLDKGASIDENKLFEQQKKMYF
AAAGAAAATGTATTTCAATTACGCTAACGATGTGATCTGCAGACCCGATGATGAAGTGTGTTCGCCCCTAAGAGATG VICRPDDEVCSPLRDEMVAMP
AGATGGTAGCTATGCCCACTAGCGATAGCGTTACTCAAAAACCCAATATC TQKPNI
HP1537 2733 CAGAGCCATAGCCGATCAGGCCCGCACTATCCGCATCCCCATTCACATGATTGATACGATTAATCGCATCAATAAAG 2734 RAIADQARTIRIPIHMIDTINRINK TCATGCGCAAACACATTCAAGAAAACGGCAAAGAGCCTGATTTAGAAGTGGTGGCTGAAGAAGTGGGGCTTTCGTTA HIQENGKEPDLEVVAEEVGLSL GATAAAGTGAAGAATGTGATTAAGGTGACTAAAGAGCCTATCAGTTTGGAAACCCCAGTCGGCAATGATGATGATGG VIKVTKEPISLETPVGNDDDGK CAAGTTTGGGGATTTCGTGGAAGATAAGAATATCGTCAGCTCCATTGATCACATCATGCGAGAA EDKNIVSSIDHIMRE
HP1537 2735 TTTAGAAAACGATCTCAAGCAAACTTTCACTTATTTAAACGAAGTGGATGCGATCGGTTTGCCCACCCCTAAAAGCGT 2736 LENDLKQTFTYLNEVDAIGLPT
GAAAGAAAGCGATCTTATTATCATCAAACTCACCAAATTAGGGACGCTCCATTTAGATGAAATTTTTGAGATTGTCAAA SDLIIIKLTKLGTLHLDEIFEIVKR
CGATTGCACTACATTGTCGTTTTACAAAACGCTT TAAAACTTTCACGCATTTAAAATTTCATGAACGCCTTAACGCTA LQNAFKTFTHLKFHERLNAIVLP
TTGTCCTGCCCCCTTTTTTTAACGATCTGATCGCTTTATTTGATGATGAAGGGAAAATCAAACAAGGGGCTAACGCTA LIALFDDEGKIKQGANATLDAL
CCCTAGACGCTTTGAATGAAAGTTTGAACCGCCTTAAAAAAGAGAGCGTAAAAATCATTCACCATTACGCCCGCTCTA LKKESVKIIHHYARSKELAPYLV
AAGAGCTTGCCCCTTATTTAGTGGATACGCAAAGCCATCTTAAGCATGGTTATGAATGCCmTATTGAAAAGCGGGT LKHGYECLLLKSGFSGAIKGW
TTTCTGGCGCGATTAAAGGCGTTGTGCTAGAAAGGAGCGCTAATGGCTATTTCTATCTTTTGCCTGAAAGCGCCCAA NGYFYLLPESAQKIAQKIAQIG
AAAATCGCCCAAAAAATCGCCCAAATTGGTAATGAAATAGATTGTTGCATTGTTGAAATGTGCCAAACTCTAAGCCAT VEMCQTLSHSLQKHLLFLKFLF
AGCTTGCAAAAACACCTTTTATTTTTAAAATTCCTTTTTAMGMTπGATTTTTTAGACAGCTTGCAAGCCCGGCTTAA LDSLQARLNFAKAYNLEFVMP
TTTCGCTAAAGCCTACAATTTAGMTTTGTCATGCCAAGCTTTACACAAAAAAAAATGATT ΓTΓTTAAGGAAAAAAAACCTTTTTTTCACAC MILENFSHPILKEPKPLNLKFEK
CCCATTTTAAAAGAGCCAAAGCCCTTAAATTTGAAGTTTGAAAAATCCATGCTTGCTGTTACCGGCGTGAATGCGGG TGVNAGGKTMLLKSLLSAAFLS CGGGAAAACCATGCTCTTAAAATCGCTTTTAAGCGCGGCTTTTTTAAGCAAGCATCTCATTCCTATGAAAATCAACGC MKINAHHSIIPYFKEIHAIINDPQ CCATCATTCCATTATCCCCTATTTTAAAGAAATCCACGCCATTATTAATGACCCCCAAAACAGCGCGAACAATATCTCT STFAGRMKQFSALLSKENMLL
ACTTTT LGTDADEASSLYKTLLEKLLKQ THHKRLSVLMAENKEVELLAA ERPTYTFLKGVIGK
HP1537 2737 ATGAAATTGTAAAGATTCGTGTGGGCGATATTGTGGATTCTAAAAAAATAGACACCGCTGTTTTGGCTTTGTTCAATC 2738 EIVKIRVGDIVDSKKIDTAVLALF
AAGGGTATTTTAAAGACGTTTATGCCACTTTTGAAGGCGGCATATTAGAGTTTCATTTTGATGAAAAAGCCAGGATTG KDVYATFEGGILEFHFDEKARIA
CCGGGGTAGAAATCAAGGGTTATGGGACTGAAAAGGAAAAAGACGGCTTAAAATCCCAAATGGGGATCAAAAAGGG GYGTEKEKDGLKSQMGIKKGD
CGACACCTTTGATGAGCAAAAATTAGAGCATGCTAAAACGGCTTTAAAAACCGCTTTAGAGGGGCAGGGCTATTATG KLEHAKTALKTALEGQGYYGS
GGAGCGTGGTGGAGGTGCGCACAGAAAAGGTCAGTGAGGGTGCATTATTGATCGTGTTTGATGTGAATAGGGGGG EKVSEGALLIVFDVNRGDSIYIK
ATAGCATTTATATCAAACAATCCATTTATGAGGGAAGCGCGAAATTAAAACGCCGCATGATTGAATCTTTGAGTGCGA GSAKLKRRMIESLSANKQRDFM
ACAAGCAACGAGATTTCATGGGCTGGATGTGGGGCTTGAATGACGGGAAATTGCGTTTAGATCAACTAGAATACGAT WGLNDGKLRLDQLEYDSMRIQ
TCTATGCGTATCCAAGATGTGTATATGCGTAGGGGTTACTTAGACGCTCATATTTCTTCGCCTTTTTTGAAAACGGAT RGYLDAHISSPFLKTDFSTHDA
TTTTCTACCCATGACGCTAAGCTTCATTATAAAGTCAAAGAGGGGATCCAATACAGGATTTCAGACATTTTAATAGAG VKEGIQYRISDILIEIDNPVVPLK
ATTGACAACCCGGTAGTCCCCTTAAAAACCTTAGAAAAAGCGCTTAAAGTGAAAAGGAAAGATGTCTTTAATATTGAG KVKRKDVFNIEHLRADAQILKTE
CATTTAAGAGCGGATGCGCAAATTTTAAAAACCGAAATCGCCGATAAGGGTTATGCGTTTGCGGTGGTGAAGCCAGA YAFAVVKPDLDKDEKNGLVKVI
CTTGGATAAAGATGAAAAAAACGGGCTTGTGAAAGTCATTTATCGTATTGAAGTGGGCGATATGGTGTATATCAATGA DMVYINDVIISGNQRTSDRIIRR
TGTCATCATTTCAGGGAACCAGCGCACGAGCGATAGGATCATTAGAAGGGAGTTATTGTTAGGGCCTAAGGATAAAT KDKYNLTKLRNSENSLRRLGFF
ACAACTTGACCAAACTGAG EEKRVNSSLMDLLVSVEEGRT
GLGYGSYGGLMLNGSVSERNL
SMSLYANIATGGGRSYPGMPK
FAGNLSLTNPRIFDSWYSSTINL
ISYQYIQQGGGFGVNVGRMLG
SLGYNLN
HP1537 2739 AATTTTTACAAAAATTCGCCCAGAATCAAAACGCTTATGCGGGCAGTGAGAATTTAGACGAGCTTTTAAAGCATGCTA 2740 FLQKFAQNQNAYAGSENLDELL
AAATTTCTAGTTTGATGTTTTTAGCCAGAGCGTATTCTAAAGCGGATGTACAAATGAGTATTGAAATCTTAAAAGGGCT SSLMFLARAYSKADVQMSIEILK
TTTGAATCGCTCCTTAAAAGACGAAGAAAAAATCGCTGTTTTAGATTTATTAGCCAAAAATTATTTTAGTGTAGGGTAT SLKDEEKIAVLDLLAKNYFSVGY
TTGCAAAAAACAAAAGACACCGTGAAAGAAATTTTGCGCTTTTCCCCAAGGAATGTGGGAGCTTTGTTGAAACTTTTG DTVKEILRFSPRNVGALLKLLHA
CATGCGTATGAATTAGAAAAAGACTATTCAAAGGCTTTAGAGACTTTGGAATGTTTGGAAGAATTAGAAGTGGTTGAA DYSKALETLECLEELEVVEIETIK
ATTGAAACGATTAAAAATTACCTTTATCTCATGCATTTAATAGAAAATAAAGAAGATGTGGCTAAAATTTTGCATGTTTC MHLIENKEDVAKILHVSKASLDL
AAAGGCGTCGTTAGATTΓGAAAAAAATCGCCCTGMTCACTTAAAATCGCATGATGAAAATCTTTTTTGGCAAGAAATT HLKSHDENLFWQEIDATKRLEN
GATGCAACTAAACGGCTAGAAAATGTGATCGATCTTTTATGGGA W
HP1537 2741 GAATGGGGTGGAGATTGTAGGGTTGGAGCATTTGGATAAAGTGATTTATTTAGATCAAGCCCCCATAGGCAAAACCC 2742 NGVEIVGLEHLDKVIYLDQAPIG
CACGAAGCAACCCTGCCACTTACACGGGAGTGATGGATGAAATCAGGATTTTATTTGCCGAGCAAAAAGAAGCTAAA NPATYTGVMDEIRILFAEQKEA
ATTTTAGGCTATAGTGCGAGCCGTTTTAGCTTTAATGTTAAAGGAGGGCGGTGCGAGAAATGCCAAGGCGATGGGG ASRFSFNVKGGRCEKCQGDG
ACATTAAAATAGAAATGCACTΠTTGCCTGATGTGTTAGTCCAATGCGATAGCTGTAAGGGCGCTAAATACAACCCCC HFLPDVLVQCDSCKGAKYNPQ
AAACTTTAGAAATCAAGGTGAAAGGCAAATCCATTGCCGATGTGTTGAACATGAGCGTGGAAGAGGCTTATGAATTTT KGKSIADVLNMSVEEAYEFFAK
TTGCTAAATTCCCTAAAATCGCCGTGAAGTTAAAAACGCTTATGGATGTGGGCTTAGGCTATATCACTTTAGGGC KLKTLMDVGLGYITLG
HP1537 2743 TATTTTTTGTCTTGGCTTTTATAGATTTAGCGATCAAACGCCGCCAATACACCAACTCTTTAAAAATGACTAAACAAGA 2744 FFVLAFIDLAIKRRQYTNSLKMT
AGTTAAGGACGAATACAAACAGCAAGAAGGGAACCCAGAAATCAAAGCCAAAATCCGCCAAATGATGCTAAAAAACG DEYKQQEGNPEIKAKIRQMML
CCACGAATAAAATGATGCAAGAAATCCCTAAAGCCAATGTCGTGGTTACTAACCCCACCCATTACGCCGTCGCTCTC MMQEIPKANVVVTNPTHYAVA
AAATTTGATGAAGAACACCCTGTGCCTGTGGTAGTGGCTAAAGGCACGGATTATTTAGCCATTAGGATTAAGGGCAT HPVPVVVAKGTDYLAIRIKGIAR
CGCTAGAGAGCATGACATAGAMTTATAGAAAATAAAACGCTCGCCAGAGAGCTTTATAGAGATGTGAAATTAAACG ENKTLARELYRDVKLNAAIPEE
CTGCCATACCAGAAGAATTGTTTGAAGCCGTGGCGATAGTCTTCGCTCAAGTGGCTAAATTAGAGCAAGAACGCCAA IVFAQVAKLEQERQKQKIIKPL
AAACAAAAGATCATTAAACCTCTTTMGATTTTTTAAACCCGCTTTTAAGCCCTAAAAAAACACTCAATCAAAAGGCTT
TAGCTATTCCTATTTGAGCTTTAAATTTGATGTTCCAAACGCCTTAAAGTGTTATGCACATATTCTAAAAGCCACCCTA
ACAGGCCTAAAAAATAAAAACCCAATCGCATGCCATAGTATTCTTTAGTTTGAATGAACATGCTAGAGCATAG
HP1537 2745 TCGCTAAAAAACACCCAGATTGTATTGAATTAAGAGTGCATCCAAGCATGATTAAAAACGAATGCATGCTCTCTAAAG 2746 AKKHPDCIELRVHPSMIKNECM
TGGATGGGGTGATGAACGCTATCAGCGTCATAGGGGATAAGGTGGGCGAGACTTTGTATTATGGGGCTGGGGCTG GVMNAISVIGDKVGETLYYGAG
GGGGAGAGCCTACCGCAAGCGCGGTCATTAGCGATATTATAGAAATCGCAAGGAAAAAAAGCTCTCTAATGCTGGG TASAVISDIIEIARKKSSLMLGFE
CTTTGAAACCCCTCAAAAACTCCCCCTAAAACCCAAAGAAGAAATCCAATGCGCTTATTATGCGCGCTTGTTAGTGAG PLKPKEEIQCAYYARLLVSDEK
CGATGAAAMGGGGTTTΠTCTCAAATCAGCGCGATTTTAGCTCAAAATGATATTTCGCTCAACAATGTCTTGCAAAA SAILAQNDISLNNVLQKEILHSN
AGAAATCTTGCATTCCAACAAGGCTAAAATCTTATTTTCCACGCACACCACCAACGAAAAGTCTATGCTGAACGCCCT STHTTNEKSMLNALKELENLQS
TAAAGAGCTTGAAAATTTACAAAGCGTGTTGGATACCCCCAAGATGATTCGTTTGGAAAATTGAATGCGCTTTTTGAA KMIRLEN
CAACAAACATAGAGAAAAGGGCTTAAAGGCTGAAGAAGAAGCTTGCGGATTTTTAAAATCGTTAGGTTTTGAAATGGT
GGAGAGGAACTT'TRRTTCACAATTTGGCGAAATTGATATTATCGCTTTGAAAAAAGGGGTTTTGCATTTCATTGAAGTC
AAAAGCGGGGAAMTTTTGATCCCATTTATGCGATCACGCCGAGCAAATTAAAAAAGATGATTAAAACGATCCGCTGT
TATTTGTCCCAAAAAGATCCCAATAGCGATTTTTGCATAGACGCTCTTATTGTGAAAAATGGTAAATTTGAGCTTTTAG
AAAATATCACTΠTTAGATTTTTACAGAAAGTAAATGCGA
HP1537 2747 ATTTAAAACGCTTGAAAGATTCCAACCACATTAT TTTAGACACCAAAAACGCCCTTTTAGCATGCGACACTAAAGGCG 2748 LKRLKDSNHIILDTKNALLACDT ATGGGGCGATGGCTGAGCCTTTAGAAATCCTTl TTAAAGCCGCTCAAACGCTCCTAAAAGACGCTTATTTTGAAAACA MAEPLEILFKAAQTLLKDAYFE
GAGAAGTCATAGTCATGGGCGGCGCGAGTATAGAAAAGATTGACAGCGTTCGAACGATTAGCAATCTTTCTAGCGG MGGASIEKIDSVRTISNLSSGIQ GATTCAAGCGAGCGCTTTAGCTTTGGCGTTATATTTTAAGGGAGCCAAAGTTACTTTGATCGC LALYFKGAKVTLI
HP1537 2749 CCGCCTCAAACAACGCACCGAGCATGATTTAGAAATGATTAGCGCGACCGGTGTGTGTAAGGGCATTGAAAATTACG 2750 RLKQRTEHDLEMISATGVCKGI CGCGCCATTTCACCGGTAAAGCCCCTAACGAAACGCCTTTTTGCTTGTTTGATTATTTAGGGATTTTTGAGCGGGAGT HFTGKAPNETPFCLFDYLGIFE TTTTAGTCATTGTGGATGAAAGCCATGTGAGTTTGCCACAGTTTGGGGGGATGTATGCAGGGGATATGAGCAGGAAA VDESHVSLPQFGGMYAGDMS AGTGTTTTAGTGGAATATGGTTTTAGATTGCCTAGCGCTTTAGACAACCGCCCTTTAAAATTTGATGAATTTATCCATA EYGFRLPSALDNRPLKFDEFIH
AAAATTGCCAGTTCCTTTTTGTGTCCGCTACGCCCAATAAGCTAGAATTAGAGCTTTCCAAAAAGAATGTCGCTGAGC LFVSATPNKLELELSKKNVAEQI
AAATCATTCGCCCTACAGGGCTTTTAGACCCTAAATTTGAAGTGCGAGACAGCGATAAGCAAGTCCAGGATTTGTTT LLDPKFEVRDSDKQVQDLFDEI
GATGAAATCAAGTTAGTGGTGGCTAGAGGTGAAAGGGTGCTCATCACCACGCTCACTAAAAAAATGGCAGAAGAATT RGERVLITTLTKKMAEELCKYY
GTGCAAATATTATGCTGAATGGGGCTTGAAGGCGCGTTACATGCATAGTGAAATTGATGCGATTGAAAGGAATCACA KARYMHSEIDAIERNHIIRSLRL
TCATCCGCTCTTTAAGGCTTAAAGAATTTGACATTTTAATAGGGATCAATCTTTTAAGAGAAGGGCTGGATTTGCCTG GINLLREGLDLPEVSLVAIMDA
AAGTCTCTTTAGTAGCGATCATGGATGCGGATAAAGAAGGGTTTTTAAGGAGTGAAACAAGCCTCATTCAAACCATG RSETSLIQTMGRAARNANGKV
GGGCGAGCCGCTAGAAACGCTAATGGCAAGGTTTTATTATACGCTAAAAAGATCACTCAAAGCATGCAAAAAGCCTT ITQSMQKAFEITSYRRAKQEEF
TGAGATCACTAGTTACAGGCGCGCCAAACAAGAAGAGTTCAATAAAATCCATAACATCACCCCCAAAACCGTTACGC TPKTVTRALEEELKLRDDEIRIA
GCGCTlTAGMGAGGAATTGAAATTAAGAGACGATGAAATTAGAATCGCTAAAGCCTtAAAAAAGGACAAAATGCCTA DKMPKSEREKIIKELDKKMREC
AAAGTGAAAGGGAAAAA FEEAMRLRDEIAQLRTL
HP1537 2751 CACCTATCAAGCCGTGAGTGGGGCAGGGAACAAGGGCATAGAGAGTTTAAAAAATGAGTTAAAAACCGCTTTAGAGT 2752 TYQAVSGAGNKGIESLKNELK
GTTTGGAAAAAGACCCCACTATTGATTTAAACCAAGTCTTGCAAGCTGGGGCTTTCGCTTATCCGATCGCTTTCAATG EKDPTIDLNQVLQAGAFAYPIA
CGATCGCTCATATTGATACTΠTAAGGAGAATGGTTACACGAAAGAAGAGCTAAAAATGCTGCATGAAACCCATAAAA DTFKENGYTKEELKMLHETHKI
TCATGGGCGTGGATTTCCCTATCAGCGCGACTTGCGTGCGCGTGCCGGTATTGAGGAGCCATAGCGAGAGTTTGAG PISATCVRVPVLRSHSESLSIA
TATCGCTTTTGAAAAAGAATTCGATCTCAAAGAAGTCTATGAAGTTTTAAAAAACGCCCCTAGCGTGGCTGTTTGCGA LKEVYEVLKNAPSVAVCDDPS
TGATCCCAGTCATAATCTTTACCCCACGCCCCTAAAAGCGAGCCACACGGATAGCGTCTTTATAGGGCGCTTGAGGA PLKASHTDSVFIGRLRKDLFDK
AGGATTTGTTTGATAAGAAAACTTTGCATGGCTTTTGTGTGGCGGATCAATTGAGAGTGGGGGCAGCCACCAACGCA FCVADQLRVGAATNALKIALHY
CTCAAAATCGCTCTGCATTACATTAAGAACGCTTGAGTTTATTCAAAGATAACAAAGATGAATGTGCATGCGCTTGAA
TTAAGGATAAGGATCAATCTAGCCCAATCTCTAAAGAAGAGCTTTAAGTTACTAGGCTTGCGATGTTAAACAAGGGTG
TAATTGAATTTCTCAAGGCTTGATTGAGAGAAATGCTAATATTTGGTAGGTCAAAAGGAATGATAAAGGCTTACACTC
AAGCAGAGAGTGAATGTTTAGCTAGCGAGTTATTACTTATTATTCCCTTTAAAAGGGTGTGAGTTTAAGGTATAAGGA
AAACTTGTATCAAGTTTTGTTGGAATGGATTAGAAAAATCTGATTGGATTGACCCTTACAATTTTTCAAACCAATCGTT
TAATAGCGATTAAATATGGCTATATATACTACAATAATAAGATTTTGAAAGGTTGGTAATGGAATCAGTAAAAACAGGA
AAAACAAATAAAG
HP1537 2753 GAAATATTTGCCTAGCGGTTCGGCTGCGGCTACAATAGGTTTAGCCACAAGCAGGCGTTTTAAAAAACAAGACGGCA 2754 KYLPSGSAAATIGLATSRRFKK
CGCTAGGCGAAGAGGTGTGCTTTATAGATGCGCGTTTGTTTGGGCGAACGGCTGAAATCGCTAACCAGTATTTGAG GEEVCFIDARLFGRTAEIANQY
CAAGGGTTCAAGCGTTTTGATAGAAGGGCGTTTGACTTATGAGAGTTGGATGGATCAAACGGGCAAAAAAAATTCCC SVLIEGRLTYESWMDQTGKKN
GCCACACTATCACAGCGGACTCGTTGCAATTTATGGATAAAAAGTCAGACAATCCCCAAGCAAACGCTATGCAAGAT ADSLQFMDKKSDNPQANAMQ
AGTATAATGCATGAGAATTCCAACAACGCTTATCCCGCTAATCATAACGCTCCCAGCCAAGATCCTTTTAACCAAGCT NSNNAYPANHNAPSQDPFNQA
TATGCGCAAAACGCTTACGCTAAAGAGAATTTACAAGCACAGCCGTCCAAGTATCAAAACAGCGTGCCTGAAATCAA YAKENLQAQPSKYQNSVPEINI
TATTGATGAAGAAGAAATCCCCTTTTAAGGGTTAAAATTAAGGAGACATTATGGAAAGAAAACGCTATTCAAAACGCT
ATTGCAAATACACTGAAGCTAAAATCAGCTTTATTGACTATAAAGATTTAGACATGCTCAAGCACACGCTATCAGAGC
GCTATAAAATCATGCCAAGGAGATTGACAGGCAATAGCAAAAAGTGGCAAGAAAGGGTGGAAGTAGCGATCAAAAG
AGCCCGCCACATGGCTTTAATCCCCTACATTGTGGACAGGAAAAAAGTCGTGGATAGCCCTTTTAAACAGCACTAAA
TGTTTGATTAGGGCTAATAGGGGGCATGCCCTTTTMTCTTATTTAGTTTTGGCTCTATTTTTGTTAAATATTGGTTGT
AAAAGCGTTA
HP1537 2755 CAACGACACTTTCCCTACCGCAATGCACATTGTGAGCGTGCTAGAAATCACGCACAGACTGCTCCCTAGTTTGGAGA 2756 NDTFPTAMHIVSVLEITHRLLPS
ATCTGTTAAAAACCTTTAAAGAAAAAAGCCAACAATTTAAAGAGATTGTCAAGATCGGACGCACGCATTTACAAGACG KTFKEKSQQFKEIVKIGRTHLQ
CTACGCCTTTAACTΓTGGGGCAAGAATTTAGCGGGTATGCGAGCATGCTAGAGCATTCTAAACAACAAATTTTAGAGA LGQEFSGYASMLEHSKQQILE
GTTTGGAGCATTTAAGAGAATTAGCCATAGGCGGGACGGCCGTAGGCACAGGGCTAAACGCTCATAAAGAATTGAG ELAIGGTAVGTGLNAHKELSEK
CGAAAAAGTGGCTGAAGAATTGAGCCAGTTTAGCGGCGTGAAATTCGTCTCTGCGCCCAATAAGTTCCATGCGCTCA SQFSGVKFVSAPNKFHALTSH
CTAGCCATGACGCTATCGCTTATGCGCATGGGGCTTTTAAAGCΠTAGCGGCGAATTTAATGAAAATCGCTAACGAT HGAFKALAANLMKIANDIRWLA
ATTAGATGGCTTGCGAGCGGGCCGCGCTGTGGTTTGGGCGAGCTTAATATCCCTGAAAACGAGCCGGGCAGTTCTA GLGELNIPENEPGSSIMPGKVN
TTATGCCCGGTAAAGTCAATCCCACGCAATGCGAAGCGATGACAATGGTGGCCGTGCAAGTGATGGGGAATGATAC AMTMVAVQVMGNDTAIGIAAS
CGCTATTGGCATTGCGGCCAGTCAGGGTAATTTTGAATTGAACGTGTTCAAGCCGGTGATCATTTATAATTTCTTGCA LNVFKPVIIYNFLQSLRLLSDSM
AAGTTTAAGGCTATTGAGCGATAGCATGGAAAGTTTTAATATCCATTGCGCGAGCGGCATTGAGCCTAATAGAGAAA CASGIEPNREKIDYYLHHSLML
AGATTGATTATTACTTGCACCATTCTTTGATGTTAGTAACCGCCCTAAACCCGCATGTGGGCTATGAAAACGCCGCTA HVGYENAAKIA
AAATCGCTA
HP1537 2757 TATGACGGATAACAACCAAAACAATGAAAACCATGAAAACAGCAGTGAAAATTCAAAAGCTGATGAGATGCGAGCCG 2758 MTDNNQNNENHENSSENSKA
GAGCGTTTGAGCGCTTCACCAACCGCAAAAAGCGTTTCAGAGAAAACGCGCAAAAAAACGCAGAGTATTCAAACCAT GAFERFTNRKKRFRENAQKNA
GAAGCGTCTTCGCACCATAAAAAAGAGCATCGCCCTAACAAAAAACCAAACAACCACCACAAACAAAAACATGCCAA EASSHHKKEHRPNKKPNNHHK
AACACGAAATTACGCCCAAGAAGAATTGGATAGCAACAAAGTAGAGGGCGTTACGGAAATTTTGCATGTGAATGAGA TRNYAQEELDSNKVEGVTEILH
GAGGGACTTTAGGCTTTCATAAGGAGTTAAAAAAGGGCGTTGAAGCGAATAACAAGATCCAAGTGGAGCATTTAAAC TLGFHKELKKGVEANNKIQVE
CCGCATTATAAGATGAACTTAAACTCTAAAGCGAGCGTTAAAATCACGCCTTTAGGGGGCTTGGGTGAGATTGGGGG KMNLNSKASVKITPLGGLGEIG
GAACATGATGGTCATTGAAACCCCAAAAAGCGCGATCGTGATTGATGCGGGCATGAGCTTCCCTAAAGAGGGGCTC IETPKSAIVIDAGMSFPKEGLF
TTTGGCGTGGATATTTTAATCCCGGATTTTTCCTACTTGCACCAAATCAAGGACAAAATCGCTGGCATTATCATCACC DFSYLHQIKDKIAGIIITHAHED
CATGCCCATGAAGATCACATAGGGGCCACGCCTTATTTGTTTAAAGAGCTGCAATTCCCCCTTTATGGCACGCCCTT LFKELQFPLYGTPLSLGLIGSK
GAGTTTGGGGCTGATTGGGAGCMGTTTGATGMCATGGTTTGAAAAMTACCGCTCGTATTTTAAAATCGTAGAAAA KKYRSYFKIVEKRCPISVGEFII
GCGCTGTCCCATTAGCGTGGGCGAATTTATCATTGAATGGATCCACATCACGCATTCTATCATTGACAGCAGCGCTT HSIIDSSALAIQTKAGTIIHTGDF
TAGCGATCCAAACTAAAGCCGGAACGATCATCCACACCGGCGATTTTAAAATCGATCACACCCCGGTGGATAATTTG VDNLPTDLYRLAHYGEKGVML
CCCACGGATTTGTATCGTTTAGCGCACTATGGCGAAAAGGGGGTGATGCTTCTΠTAAGCGATTCCACCAACTCCCA NSHKSGTTPSESTIAPAFDTLF
TAAATCCGGGACTACGCCGAGT RVIMSTFSSNIHRVYQAIQY
HP1537 2759 AAAGCTTTCCACTTTTCAAGAGCTTGTGAGCGTGTATTACGGCATGGTGTTAAACGCAGAAGTGGCTGAAACTTTAG 2760 KLSTFQELVSVYYGMVLNAEVA
AAGAGGTGGAAAAAGGCCATTATMGCATTTCCAAAACGCTTTGAAAATGCAAAAAGTGGGGCAAATCGCTAGGGTA VEKGHYKHFQNALKMQKVGQI
GAAACCTTAGGCGCTCAAGTGGCTTATGATAAGGCCCATATCGCTAGCGTTAAGGCTAAAGACGTGTTAGAAGTTTC LGAQVAYDKAHIASVKAKDVLE
GCAGCTCTCGTTCAATTCCATTTTATCTAGCAAGGACGATTTAGTGCCTTCAAGCAAATTAGAGATCCGCACGGAGAA FNSILSSKDDLVPSSKLEIRTEK
AAATCTGCCCGATCTGAGCTTTTTTGTTTCTTCCACGCTCAATTCCTACCCGGTTTTAAAGACTTTAGAAAATCAGATT SFFVSSTLNSYPVLKTLENQIQI
CAAATCTCTAAAGAAAACACGAAATTACAGATCGCTAAATTCTTGCCCCAAGTGAGTTTTTTTGGCTCTTATATTATGA KLQIAKFLPQVSFFGSYIMKQN
AGCAAAACAATTCGGTGTTTGAAGACATGATCCCTAGTTGGTTTGTGGGCGTGGCCGGGCGCATGCCTATTCTTTCT DMIPSWFVGVAGRMPILSPTGR
CCCACAGGGCGCATTCAAAAATACCAAGCGAGCAAATTAGCGGAGTTGCAAGTGAGTAGCGAACAAATCCAGGCTA ASKLAELQVSSEQIQAKKNMEL
AAAAAAACATGGAATTATTAGTGAATAAGACTTATAAAGAGACGCTTTCTTATTTGAAAGAATACAAAAGCTTGCTTTC YKETLSYLKEYKSLLSSVELAKE
TAGCGTGGAATTAGCCAAGGAAAACTTAAAACTCCAAGAGCAGGCTTTTTTACAAGGCTTAAGCACGAACGCTCAAG EQAFLQGLSTNAQVIDARNTLS
TCATTGATGCGAGGAACACGCTTTCTTCTATCGTCGTGGAGCAAAAAAGCGTGGCTTATAAATACATCGTTTCATTAG QKSVAYKYIVSLANLMALSDHI
CGAATTTMTGGCGTTAAGCGATCATATTGATTΓATTTTATGAATTTGTTTATTAAGGGAAAAAATCATGTCAAATAGCA VY
TGTTGGATAAAAATAAAGCGATTCTTACAGGGGGTGGGGCTTTATTATTAGGGCTAATCGTGCTTTTTTATTTAGCTTA
TCGCCCTAAGG
HP1537 2761 CTCTCACAGGGTGATGGATAGATTATTGAGTGGGGATGTGGGTTTTGGGAAAACAGAAGTGGCGATGCATGCGATT 2762 SHRVMDRLLSGDVGFGKTEVA
TTTTGCGCGTTTTTGAACGGCTTTCAAAGCGCTTTAGTTGTGCCTACCACTTTATTAGCGCACCAGCATTTTGAGACT CAFLNGFQSALVVPTTLLAHQH
TTAAGGGCGCGTTTTGAAAATTTTGGCGTTAAAGTGGCTCGTTTGGACAGGTATGCGAGCGAAAAAAACAAGCTTTT ARFENFGVKVARLDRYASEKN
AAAGGCGGTGGAATTAGGGCAAGTTGATGCGCTAATAGGCACGCATGCGATTTTAGGCGCGAAATTCAAAAACCTG ELGQVDALIGTHAILGAKFKNL
GGCTTGGTGGTGGTGGATGAAGAGCATAAATTTGGCGTGAAACAAAAAGAAGCTTTAAAAGAATTGAGTAAGAGCGT EEHKFGVKQKEALKELSKSVH
GCATTTTTTAAGCATGTCCGCTACGCCTATCCCGCGCACTCTAAACATGGCGCTCTCTCAAATTAAGGGCATTAGTTC TPIPRTLNMALSQIKGISSLKTP
TTTAAAAACCCCGCCCACAGACAGAAAGCCCAGCCGCACTTTTTTGAAAGAAAAGAATGACGAACTCTTAAAAGAGA PSRTFLKEKNDELLKEIIYRELR
TTATTTACAGAGAATTACGCCGTAACGGGCAAATTTTTTACATCCATAACCACATCGCTAGCATTTTAAAAGTCAAAAC FYIHNHIASILKVKTKLEDLIPKL
CAAGCTAGAAGATTTAATCCCTAAACTCAAAATCGCTATTTTGCATTCCCAGATTAACGCTAATGAGAGCGAAGAAAT QINANESEEIMLEFAKGNYQVL
CATGCTAGAGTTTGCCAAGGGAAATTATCAGGTTTTATTATGCACTTCTATTGTGGAATCAGGGATTCATTTGCCTAA ESGIHLPNANTIIIDNAQNFGLA
CGCTAACACGATCATTATAGATAATGCGCAAAATTTCGGGCTGGCTGATTTGCACCAATTGAGAGGGCGTGTGGGGA RGRVGRGKKEGFCYFLIEDQK
GAGGTAAAAAAGAAGGCTTTTGTTATTTCCTCATAGAAGATCAAAAAAGTTTGAATGAACAGGCTTTAAAACGCTTGC ALKRLLALEKNSYLGSGES
TCGCTTTGGAAAAAAATTCATATTTAGGCAGCGGGGAGAGTGT
HP1286 2763 GAGCGAGTGTGTTGAGCGCGTTACTTCTTGTAGGCTTAGGGGCAGCCCCTAAACATTCAGTTTCAGCTAATGACAAA 2764 ASVLSALLLVGLGAAPKHSVSA
CGGATGCAGGATMTTTAGTGAGCGTGATTGAAAAACAGACCAATAAAAAGGTGCGTATTTTAGAAATCAAACCTTTA QDNLVSVIEKQTNKKVRILEIKP
AAATCTAGCCAGGATTTAAAAATGGTCGTTATTGAAGATCCGGACACTAAATACAATATCCCGCTTGTGGTGAGTAAG DLKMVVIEDPDTKYNIPLVVSK
GATGGTAATTTAATCATAGGGCTTAGCAACATATTCTTTAGCAATAAAAGCGATGATGTGCAATTAGTTGCAGAAACC LSNIFFSNKSDDVQLVAETNQK
AATCAAAAAGTTCAAGCTCTTAACGCCACCCAACAAAATAGCGCGAAATTGAACGCTATTTTTAATGAAATACCGGCT ATQQNSAKLNAIFNEIPADYAIE
GATTATGCGATAGAGTTGCCCTCTACTAACGCTGCAAATAAGGATAAAATCCTTTATATTGTCTCTGATCCCATGTGC AANKDKILYIVSDPMCPHCQKE
CCACATTGCCAAAAAGAGCTCACTAAACTTAGGGATCATTTAAAAGAAAACACCGTGAGAATGGTCGTGGTGGGGTG DHLKENTVRMVVVGWLGVNSA
GCTTGGGGTCAATTCAGCTAAAAAAGCGGCTTTAATCCAAGAAGAAATGGCGAAAGCTAGGGCTAGGGGAGCGAGC IQEEMAKARARGASVEDKISILE
GTGGAAGATAAGATCTCTATTCTTGAAAAGATTTATTCC
HP1286 2765 TCGGTGGCGTGGCCATAGGGGGTGATGCTCCCATAAGCACGCAAAGCATGACCTTTAGCAAAACCGCTGATATTGA 2766 GGVAIGGDAPISTQSMTFSKTA
AAGCACTAAAAATCAAATTGACAGACTCAAACTCGCCGGGGCCGATTTAGTGAGGGTGGCGGTGAGTAATGAAAAG NQIDRLKLAGADLVRVAVSNEK
GACGCTCTAGCCTTAAAAG TTGAAAAAAGTGTCCCCTTTGCCTTTAATCGCTGATATTCATTTCCATTATAAATTCG KELKKVSPLPLIADIHFHYKFALI
CTCTCATTGCCGCTCAAAGCGTGGATGCGATCAGGATTAACCCCGGAAACATCGGCTCTAAAGAGAAGATCAAAGC DAIRINPGNIGSKEKIKAVVDAC
GGTGGTTGATGCTTGTAAAGAAAAAAACATTCCTATAAGAATTGGCGTGAATGCTGGGAGTTTAGAAAAGCAGTTTGA IRIGVNAGSLEKQFDQKYGPTP
TCAAAAATACGGACCCACCCCAAAAGGCATGGTAGAAAGCGCTTTGTATAA SALY
HP1286 2767 ACGCCCTAGCTATGTGGATTCGGATTATGAAGTCTTTAGCGAAACGATTTTTTTACAAAACATGGTGTATCAGCCTAC 2768 RPSYVDSDYEVFSETIFLQNMV
AGAAGAAAGAGATTCTTTCGCCCAACTGACTAAAGATGAAAACGATTCTTTTAACCCCGAAACTTCTGTGATTTTATTG ERDSFAQLTKDENDSFNPETS
AATGAACCAAGCGATAGCGATACAAAAAACCCGCCCTTGAACCAAAATGAGTCTAATACTAACACTGCCAATAACGAT PSDSDTKNPPLNQNESNTNTA
ACAAAAAACCCGTTCCTTTACAAACCGAAAAGAAAAACAAAAGATCCAAAGCTCATTGAATATTCCCAACAAAATTTCT NPFLYKPKRKTKDPKLIEYSQQ
ACCCCCTAAAGGATGGGGATATTATGATGAGTAAAGAAGGGGATCAATGGCTGATAGAAATCAAATCCAAAGCCTTG KDGDIMMSKEGDQWLIEIKSKA
AAGC
HP1286 2769 CAGGCGGGGAGCCAAGCTTGTATTTCAATAACCCTATTTTAATCAGCGTTTTAGAGCATTTTTATCGCCAAAAAATCC 2770 GGEPSLYFNNPILISVLEHFYR CTTTATGTGTAGAGAGTAATGGTTCTATTTTTTTTGAATTTAGCCCTATTTTAAAAGAATTGCATTTCACTCTAAGCGTC VESNGSIFFEFSPILKELHFTLS AAACTCTCTTTTTCTTTAGAGGAAGAAAGCAAGCGGATCCATCTTAA LEEESKRIHL
HP1286 2771 ATGGAAAGAATTGTCTAAAAACGCCAAAGACTCCGCTCAAAAACAGGCTCTCGCTCAAAAAACAGAAGCTTTAACGC 2772 WKELSKNAKDSAQKQALAQKT
ACAACATTAAAGACACCAGAGAGAGGTTAACGACCTTACAGCACAAGGCGAGTGAAGAATTAAAAAGCGTCATTAAA NIKDTRERLTTLQHKASEELKS
GAAGTCAATAGCTTGGGTTCTCAAATCGCTGAGATTAACAAACGCATTAAAGAAGTGGAAAACAACAAGAGTTTAAAG SLGSQIAEINKRIKEVENNKSLK
CATGCGAACGAATTAAGGGATAAGCGAGATGAATTGGAATTCCATTTGCGAGAGCTTTTAGGGGGGAATGTTTTTAA RDKRDELEFHLRELLGGNVFK
AAGCAGCATTAAGACTCATTCGCTCACCGATAAAGACTCAGCGGATTTTGATGAGAGCTATAACCTTAATATCGGGC SLTDKDSADFDESYNLNIGHGF
ATGGGTTCAATATCATTGATGGCTCTATTTTCCATCCTTTAGTGGTTAAAGAATCCGAAAATAAAGGGGGTTTGAACC IFHPLVVKESENKGGLNQVYF
AGGTTTATTTTCAAAGCGATGATTTTAAGGTTACTAATATTACTGACAAGCTCAATCAGGGAAGAGTGGGGGCGTTAT VTNITDKLNQGRVGALLNVYND
TGAATGTGTATAATGACGGCTCTAACGGGACTTTAAAGGGCAAATTACAAGATTATATTGATTTGTTGGATTCTTTTGC LKGKLQDYIDLLDSFAKGLIEST
TAAGGGTTTGATAGAATCCACTAATGCGATTTACGCTCAAAGCGCGAGTCATTATATTGAGGGCGAGCCGGTGGAGT QSASHYIEGEPVEFNSDEAFK
TTAATAGCGATGAAGCCTTTAAAGACACTAACTACAATATCAAAAACGGCTCGTTTGACTTAATCGCTTACAACACCG KNGSFDLIAYNTDGKEIARKTIA
ATGGTAAAGAAATCGCTAGAAAAACCATTGCTATCACGCCCATTACAACCATGAACGATATTATCCAAGCCATTAACG MNDIIQAINANTDDNQDNNTEN
CTAACACTGATGACAATCAGGACAATAACACCGAAAAC
HP1286 2773 CTCTGAGGGGACTTTAGGCTTTATTTCAAGCGTGGAATTAGAATGCGTGAAAGACTACGCTTATAAAACTTGCGCGTT 2774 SEGTLGFISSVELECVKDYAYK ATTGTTTTATGAAAATTTAGAGCGATGTGCCAAAGCCGCTCAAATTCTAGCCGCCTTAAAAGCCAAACAACCTGAAAT YENLERCAKAAQILAALKAKQP GATTTCTTCAGCAGAGCTTATGGATTATGCGTGCTTAAAAAGCGTGAAAGGTTTGGAGGGCATGCCTAGCGTGGTTT ELMDYACLKSVKGLEGMPSVV TAGAAATCAAAGAGCCTAACGCATGCTTACTCATTCAAAGCGAAAGCGATGATCCTTTAATTTTAGAAAACAGCATGC NACLLIQSESDDPLILENSMQTI
AAACGATTTTAAACGCTTTGAGTGCGATACCGGTCGTTTTAGATTCTCAAATCAGCAGTGATCCTAGTATTTATCAATC AIPVVLDSQISSDPSIYQSWWKI
GTGGTGGAAGATCAGAAAAGGCATTTTCCCTATCGCAGCGTCAAAAAGAAAAAGCCAAAGCTCTGTGATCATTGAAG PIAASKRKSQSSVIIEDICFSQE
ACATCTGCTTCAGTC GAGGATTTTGTAGAGGGGGCAAAAGCGATTGAAGGGCTTTTAAAAAAACATGGCTTTAAG AKAIEGLLKKHGFKDNGIIFGH
GATAATGGCATTATTTTTGGGCATGCGTTAAGCGGGAATTTGCATTTTGTCGTTACGCCGATTTTAGAAAATGAAGCT HFVVTPILENEAERKAFENLVS
GAAAGAA GCGTΓTGAAAATTTAGTTTCTGAGATGTTTTTAATGGTGAGCAAAAGCTCTGGCTCTATTAAAGCCGAA VSKSSGSIKAEHGTGRMVAPF
CATGGCACAGGCAGGATGGTAGCCCCTTTTGTGGAAATGGAGTGGGGAGAAAAAGCCTACAAAATCCACAAGCAAA GEKAYKIHKQIKELFDPNGILNP
TCAAAGAATTGTTTGATCCTAACGGCATTTTAAACCCTGATGTGATCATCACAAACGA
Figure imgf000423_0001
HP1411 2785 CAGTCAATGAAACAAAAAATATTGTAGAAGTGGGGATTGATTCTTCTATTGAAGAGAGCTATTTAGCTTATTCCATGA 2786 VNETKNIVEVGIDSSIEESYLAYS
GCGTGATCATAGGGCGCGCTTTACCGGACGCTAGAGATGGCTTAAAGCCCGTGCATAGGCGTATTTTGTATGCGAT GRALPDARDGLKPVHRRILYAM
GCATGAATTAGGCCTTACTTCAAAAGTCGCTTACAAAAAAAGCGCTAGGATCGTGGGTGATGTGATTGGTAAATACC TSKVAYKKSARIVGDVIGKYHP
ACCCCCATGGCGATAATGCGGTTTATGATGCGCTAGTGAGAATGGCGCAAGATTTTTCCATGCGTTTGGAATTAGTG VYDALVRMAQDFSMRLELVDG
GATGGGCAGGGCAACTTTGGCTCTATTGATGGCGATAACGCCGCAGCGATGCGTTACACTGAAGCCAGAATGACTA SIDGDNAAAMRYTEARMTKAS
AGGCGAGTGAAGAAATTTTAAGGGATATTGATAAAGACACCATTGATTTTGTGCCTAATTATGACGATACCTTAAAAG DKDTIDFVPNYDDTLKEPDILPS
AGCCAGATATTTTACCAAGCCGTCTGCCTAACCTTTTAGTCAATGGGGCTAATGGGATCGCTGTGGGGATGGCGACT LVNGANGIAVGMATSIPPHRMD
TCTATCCCCCCTCACAGGATGGATGAAATCATAGACGCTTTAGTGCATGTCTTAGAAAACCCTAACGCTGGATTAGAT VHVLENPNAGLDEILEFVKGPD
GA TCTTAG TTTGTCAAAGGGCCTGATTTTCCCACTGGTGGGATCATTTATGGCAAGGCGGGTATTATTGAAGC IYGKAGIIEAYKTGRGRVKVRAK
CTATAAMCGGGGCGAGGACGCGTGA GTGCGGGCCAAAGTGCATGTGGAAAAAACAAAAAATAAAGAAATCATC TKNKEIIVLDEMPFQTNKAKLVE
GTTTTAGATGAAATGCCTTTTCAAACCAATAAAGCCAAATTAGTGGAACAAATCAGCGATTTAGCGCGAGAAAAGCAA AREKQIEGISEVRDESDREGIR
ATTGAAGGCATTAGTGAAGTGCGCGATGAGAGCGATAGAGAGGGCATTAGAGTGGTGATTGAATTAAAAAGAGACG RDAMSEIVLNHLYKLTTMETTF
CGATGAGTGAAATTGTCTTAAACCACCTCTACAAACTCACCACTATGGAAACCACTTTTAGCATCATTTTACTCGCTAT YNKEPKIFTLLELLHLFLNHRKTI
TTACAATAAAGAGCCTAAGAT ELEKAKARAHILEGYLIALDNID
TSQSPEAAKNALMERFTLSEIQ
MRLQRLTGLERDKIKEEYQNLL
NGILKSEDRLNGWKTELLEVK
PRRTEIQESYENIDIEDLIANEP
SYKGWKRVDLKAYEKQNRGG
GSTYEDDFIENFFVANTHDILLFI
QLYHLKVYKIPEASRIAMGKAIV
PDEKIMATLSTKDFSDERSLAF
VVKRTNLSEFESNRSCGIRAIVL
ELVSAKWDKNAKHLLIASHLGI
EEVREIGRTTRGVIGIKLNENDF
VISDDGNKLLSVSENGLGKQTL
HP1411 2787 GGTGAATCTGGTCTCTCAAAATGGCAGACGCTACCTTAAGGCTTCTATTTCGTTAGAATTGAGTAATGAAAAGCTTTT 2788 VNLVSQNGRRYLKASISLELSN
GAATGAAGTCAAGGTTAAAGACACAGCGATTAAAGACACGATTATAGAGATTCTGTCGTCTAAAAGCGTGGAAGAAG EVKVKDTAIKDTIIEILSSKSVEE
TGGTTACT CAAAGGCAAA CMGCTTAAAGATGAGATTMGAGCCATTTG TTCGTTTTTGATTGATGGCTTTAT GKNKLKDEIKSHLNSFLIDGFIK
TAAAMTGTCTTTTTCACTGATTTCATTATCCAATAATTTTAAATGATTGGCATAGATATTGTCTCTATTGCTAGGATAG DFIIQ
AAAAGTGCGTGAAACGCTTTAAAATGAAGTTTTTAGAGCGTTTTTTATCGCCAAGCGAGATTGTTTTATGCAAGGATA
AATCCAGCAGTATCGCCGGGTTTTTCGCGCTTAAAGAGGCTTGTTCTAAAGCCCTTCAAGTGGGCATTGGTAAGGAA
TrGAGCTTTTTGGATATAAAAATCTCTAAAAGCCCTAAAAAGGCCCCCTTAATCACCCTTTCCAAAGAAAAAATGGATT
ATTTCAATATCCAAAGCTTGAGCGCGAGCATCAGCCATGACGCTGGTTTTGCGATAGCGGTCGTGGTGGTTTCTTCG
TCA TGAATAMTAACGCATTCTAAAACTAACATTCTGTTTTAAAAAGCGCTTAACGGCTGTTTAAAAAACGCAAAAA
ACTTTTTTCAAATGGTTTTGATAACAAATTGCAAATAAATAAAATCAAACTTCAAAATTTAAGTTAATTTTAATAATTATT
TTTATAGTATGCATCGGTTTGAATTAAATGAGAAAGGTTATCAC
HP1411 2789 ACGCTTAATGGAGCTTGGAGCGCCAGAAATCATTGTGCGCAATGAAAAAAGGATGTTGCAAGAAGCCGTGGATGTG 2790 RLMELGAPEIIVRNEKRMLQEA
CTTTTTGATAACGGCCGCAGCACTAATGCGGTTAAAGGGGCTAACAAACGCCCTTTAAAATCGCTCAGTGAAATCAT DNGRSTNAVKGANKRPLKSLS
TAAAGGCAAGCAGGGGCGTTTCAGGCAAAACCTTTTAGGTAAGCGCGTGGATTTTTCAGGCAGAAGCGTGATTGTG QGRFRQNLLGKRVDFSGRSVI
GTTGGGCCTAATCTCAAAATGGATGAATGCGGGTTGCCTAAAAACATGGCGTTAGAACTCTTCAAACCGCATTTGTTA LKMDECGLPKNMALELFKPHL
TCCAAGCTTGAAGAGAGAGGCTATGCCACCACGCTCAAACAGGCTAAACGCATGATTGAGCAAAAAAGCAATGAAGT RGYATTLKQAKRMIEQKSNEV
ATGGGAGTGCTTGCAAGAAATCACAGAGGGGTATCCGGTGCTACTCAACCGCGCTCCTACCTTGCACAAGCAATCC EITEGYPVLLNRAPTLHKQSIQ
ATTCAAGCGTTCCATCCAAAGCTGATTGACGGCAAAGCGATCCAATTGCACCCGTTAGTGTGTTCAGCGTTCAACGC DGKAIQLHPLVCSAFNADFDG
CGATTTTGACGGGGACCAAATGGCGGTGCATGTGCCTTTAAGCCAGGAAGCGATCGCTGAATGCAAGGTGCTGATG HVPLSQEAIAECKVLMLSSMNI
CTAAGCTCTATGAATATCCTTTTGCCTGCTAGCGGTAAGGCCGTAGCCATTCCTAGCCAAGATATGGTT KAVAIPSQDMV
HP1411 2791 GCGTGTTGAATTGGATTTTAGGAGTGTAAAAATGCAAGTTTCACAATATCTGTATCAAAATGTGCAATCTATTTGGGG 2792 RVELDFRSVKMQVSQYLYQNV
GGATTGTATTTCCCATCCGTTCGTTCAAGGCATAGGGCGTGGGACTTTAGAAAGAGATAAATTTCGTTTTTATATCAT DCISHPFVQGIGRGTLERDKFR
TCAGGATTATTTGTTCCTTTTAGAATACGCTAAGGTGTTTGCTTTGGGCGTAGTTAAAGCTTGTGATGAGGCGGTGAT YLFLLEYAKVFALGVVKACDEA
GAGGGAGTTTTCTAATGCTATACAAGATATTTTAAATAACGAGATGAGTATCCATAACCATTACATTAGAGGACTTCAA SNAIQDILNNEMSIHNHYIRGLQI
ATCACTCAAAAAGAATTGCAAAACGCGCGCCCCACTCTAGCGAATAAATCCTATACAAGCTACATGCTCGCTGAAGG QNARPTLANKSYTSYMLAEGFK
GTTrAAGGGCTCTATCAAAGAAGTTGCGGCGGCTGTTCTATCCTGTGGTTGGAGCTATTTAGTGATCGCGCAAAATT VAAAVLSCGWSYLVIAQNLSQI
TAAGCCAAATCCCCAACGCTTTAGAACATGCCTTTTATGGGCATTGGATTAAGGGCTATAGTTCCAAAGAATTTCAAG HAFYGHWIKGYSSKEFQACVN
CGTGCGTAAATTGGAATATTAATTTGCTTGATTCCCTCACTCTCACTTCTTCAAAACAAGAAATTGAAAAATTAAAGGA DSLTLTSSKQEIEKLKDIFITTSE
CATTTTTATCACTACAAGCGAATACGAATATCTGT TGGGATATGGCGTATCAAAGTTGAATAAAAGT TTTGTGCTG WDMAYQS
TTAGAAAAACGCAAGAGAAGAAGATATCTATCTCTTGCAAAACGATTATTGGGCGTAAGCTTCTAGTTTGAGCTCAAT CTTTACCTCATCTCCAACGACAGCATCGCTAAAGGTTTTACCGATACCAAAATCCTTGCGGTTGAT TGCCTTCAGC TTGTAACACCATGAATTCTTTTTTATTCATGGGGTT TGTAAGGGGGCTTGGAT ΓGGCTTCCAATACGACAGGCTT
GGTTACGCCACGAAGAGTCAAATCCCCATAGATT TACCATCTTCGTATTTGGTCATTTTAAAGCTCCCTTTGGGGTA TTTTACCAC
HP1411 2793 GCGATCΠAAAAAGGTGTGTGAGAAAATCAAAAGCGCACTACCCTTTGGGATCATCTCAGCCTTTAAACCCTTTAAAG 2794 DLKKVCEKIKSALPFGIISAFKPF
ACGCTTΠTACAGAGATTTCAATCATAATGAGCAAAAGTTACTGATAGGGGCAGCTAAAAGCGGTTGCATTCAATCTA RDFNHNEQKLLIGAAKSGCIQS
GCGCTGATAAACTGGCTCAGTTAAAAACGCGCTTACTCTACTGGCAAGACAAATCTGTTAAAGTGGATTGGGATAAA AQLKTRLLYWQDKSVKVDWDK
CCCATΓTTAATCAAGGACTTCTTTAAAGGCAATAATTACCTTTATAGGAGGTTTTGTTTTTTATTGGGGAAGCATTTTAT FFKGNNYLYRRFCFLLGKHFMD
GGACAGATTTTTAAAGAATAACGCTAAGGCGAGCGTGAAAGACTTTATGTCTAGTAAGGAGTTTGTCGCTAAATACC NAKASVKDFMSSKEFVAKYRY
GATACACCCCCAAGCAAAATACAGAAAGAGCGAAAAAGCTGCAATCGTATTTAGAGAATAAGCGCGATTTTATAGGG TERAKKLQSYLENKRDFIGFVQ
TTTGTTCAAGCGCTTAACTCTTTAAAAGACAACCCGCAAGATCCTTTTTTACCCAATGAAGAAACGAGCTTTTTGGTGT KDNPQDPFLPNEETSFLVFANE
TCGCTAATGAGCCTACTATCGTGTTTAATTTAAGGGATTATTTATTGGTGTTAGCGCAAATCTTTAACCAACAAGCGAT NLRDYLLVLAQIFNQQAICYCES
CTGTTATTGTGAAAGCAAATGCCCTATAGAATTGATCAACGCTTCACCGGGTAAGGACTTTAACAAGACACAAGACA LINASPGKDFNKTQDSFPDIKFS
GCTTCCCAGACATCAAATTCTCAACACCCAATCAGTTAGAACAATCCCTCAACGCTCTTAAAAACAAGCTAGCCGCAT EQSLNALKNKLAAFFSKHPDKH
TCTTTTCAAAACACCCTGATAAACATAACGGCATGGAGTTTAATGAAATAGCTAAAACTCAAATAGA FNEIAKTQI
HP1411 2795 TGCTGCTTTGCGCTGGMGG TGAGACTTTAAAAAAAGCGGTGCCTATTGGTGTGGGTTTGATAGAGAGCGCGATT 2796 LLCAGRNETLKKAVPIGVGLIES
AATCTAACGAGAATGTGTTTAAAAAACCCTGATACAGAAAGCCTTATTTTTATAGGGAGCGCGGGGAGTTATAGCCCA MCLKNPDTESLIFIGSAGSYSPE
GA TGGAGCTTTTAAGCGTGTTTGAMGCGTTTGCGGCTATCAAATTGAAGAGAGTTTTAGCCATTTAAACAGCTAC VFESVCGYQIEESFSHLNSYTP
ACGCCTTTGGATAATTTCATTCACATAGAAACTGAAGAGCAGGCTCTTTTTGAAAGGGTGCGTGTGAATAGCAGTAAC IETEEQALFERVRVNSSNYIHTS
TATATCCACACCAGCGAAATGTTCGCTAAAAAAATGGTTCAAAAGGGCGTT TATTAGAAAACATGGAGT' TAGC KMVQKGVLLENMEFFSVLSVA
GTTTTAAGCGTGGCTAAAGCGT TCTTTAAAGGCTAAAGGGATI TT rGCGTGAGTAATTATGTGGGGCTTAATGCG AKGIFCVSNYVGLNAYQEFKEN TATCAGGAATTTAAAGAAAACCACGCCAAAGTCAAACAGATT ΓTAGAAAACATCATTGATAGTTTAATAATTTAATAGT QILENIIDSLII TTAGCTATCATGGAGCATTCTAAATTAAAGGCGATCACATGTTTGAAAAAATACGCAAGATT TAGCGGATATTGAAG ATTCGCAAAATGAAATTGAAATGCTTTTAAAATTAGCGAATTTGAGTTTGGGGGAT TTATTGAGATTAAAAGAGGGA GCATGGACATGCCAAAGGGCGTGAATGAAGCGTTTTTTACGCAATTAAGCGAAGAAGTGGAGCGATTGAAGGAGCT TATTAACGCTTTGAATAAAATCAAAAAAGGGTTATTGGTGTTTTAAATGTGTGGGATTGTAGGTTATATAGGGGATAG CGAGAAAAAATCCGTTCTTTTAGAGGGATTAAAGGAATTGGAATACAGAGGTTATGACAGCGCGGGTTTAGCCGTAT TGAGTAATGATCGTTTGGAAGTGTTTAAAACTCAAGGGAAATTAGAAAACCTTAAACTAGAGCTTAAAAATAAAGAGTT TTTGGATTTTGG
HP0935 2797 GATCATCAAAGGCGGGTTCATTGCGTTGAGTCAAATGGGTGACGCGAACGCTTCTATCCCTACCCCACAACCAGTTT 2798 IIKGGFIALSQMGDANASIPTPQ
ATTACAGAGAAATGTTCGCTCATCATGGTAAAGCCAAATACGATGCAAACATCACTTTTGTGTCTCAAGCGGCTTATG EMFAHHGKAKYDANITFVSQA
ACAAAGGCATTAAAGAAGAATTAGGGCTrGAAAGACAAGTGTTGCCGGTAAAAAATTGCAGAAACATCACTAAAAAA KEELGLERQVLPVKNCRNITKK
GACATGCAATTCAACGACACTACCGCTCACATTGAAGTCAATCCTGAAACTTACCATGTGTTCGTGGATGGCAAAGA DTTAHIEVNPETYHVFVDGKEV
AGTAACTTCTAAACCAGCCMTAAAGTGAGCTTGGCGCAACTCTTTAGCATTTTCTAGGATTTTTTAGGAGCAACGCT NKVSLAQLFSIF
CCTTAAATCC
HP0935 2799 GCGGACTAATAAAGCCTTGTATCAATTCATTTTGAGAATAGCTCAAAAAGACAATTTTGCTTCAGCGTATCTAACAGTC 2800 RTNKALYQFILRIAQKDNFASAY
AMTTAG TACCCACAAAGACACGAAGTCTCTAGCGTTATTGAAGAGGAGTTAAAAAAGAGAGAAGAAGCAAAGAG EYPQRHEVSSVIEEELKKREEA
GCAGAAAGAATTGATTAAGCAAGAAAATCTTAACACCACAGCCTACATCAATAGAGTGATGATGGCGAGCAATGAAC LIKQENLNTTAYINRVMMASNE
AGATTATCAACAAAGAAAAAATAAGAGAAGAAAAGCAAAAAATTATCTTAGATCAAGCAAAGGCGCTAGAGACTCAAT KIREEKQKIILDQAKALETQYVH
ATGTGCATAATGCCTTAAAAAGAAACCCCGTGCCTAGAAACTACAACTACTACCAAGCGCCTGAAAAACGCTCTAAA NPVPRNYNYYQAPEKRSKHIM
CATATTATGCCCTCTGAAATTTTTGATGATGGCACATTCACTTATTTTGGTTTCAAAAACATCACTCTCCAACCTGCTA DGTFTYFGFKNITLQPAIFVVQP
TTTTTGTGGTTCAACCTGATGGGAAATTGAGCATGACTGATGCCGCCATTGATCCTAACATGACCAATTCAGGATTGA MTDAAIDPNMTNSGLRWYRVN
GATGGTATAGAGTTAATGAAATTGCAGAAAAATTTAAGCTCATTAAAGACAAAGCCCTTGTAACAGTAATCAATAAAG KLIKDKALVTVINKGYGKNPLTK
GCTATGGGAAAAATCCATTGACAAAAAATTACAATATCAAAAACTATGGTGAATTGGAGCGTGTGATTAAAAAGCTCC NYGELERVIKKLPLVRDK
CTCTTGTCAGAGATAMTAAAAAGGCGTTAAGACATGAATGAAGAAAACGATAAACTTGAAACTTCTAAAAAAGCCCA
ACAAGATTCACCCCAAGATTTATCCAATGAAGAAGCAACAGAAGCCAATCA rGAAAATCTTTTAAAAGAATCCAAA
GAAA
HP0935 2801 ACGAAGGCACTAAAGAGCTTGGTGCGGTGGGGTTTGCGCAACTTTTAGAGCAAAAAGCGATCAGTTTGAATGTGGA 2802 EGTKELGAVGFAQLLEQKAISL
TACCAGCACAGAAGATTTGCAMTCACTTTAGAATTTTTAAAAGAATACGAAGATGAAGCCATTACGCGCTTAAAAGA TEDLQITLEFLKEYEDEAITRLK
GCTTTTAAAATCCCCTAATTTCACGCAA CGCTTTAGAAAAAGTCAAAACCCAAATGTTAGCCGCACTTTTACAAAAA NFTQNALEKVKTQMLAALLQK
GAAAGCGATTITGACTAT1TGGCTAAATTGACTTTAAAGCAAGAGCTTTTTGCTAACACCCCTTTAGCTAACGCAGCC LAKLTLKQELFANTPLANAALG
TTAGGCACTAAAGAGAGCATTCAAAAAATCAAGCTAGACGATTTGAAACAGCAATTTGCTAAGGTCTTTGAACTCAAT KIKLDDLKQQFAKVFELNKLVV
AAGCTCGTGGTGGTGCTTGGGGGCGATTTGAAAATCGATCAAACCCTTAAGCGTTTGAATAACGCCCTTAATTTCTT LKIDQTLKRLNNALNFLPQGKA
GCCACAAGGTAAAGCGTATGAAGAGCCTTATTTTGAAACGAGCGATAAAAAAAGCGAAAAAGTCCTCTATAAAGACA FETSDKKSEKVLYKDTEQAFVY
CTGAGCAGGCTTTCGTGTATTTTGGTGCGCCCTTTAAAATCAAGGATTTAAAACAGGATTTAGCGAAATCTAAAGTCA KIKDLKQDLAKSKVMMFVLGG
TGATGTTTGTGCTTGGTGGGGGGTTTGGCTCTCGTTTAATGGAAAAAATCAGGGTTCAAGAGGGATTAGCTTATAGC MEKIRVQEGLAYSVYIRSNFSK
GTGTATATCCGCTCCAATTTTTCTAAAGTGGCGCATTTTGCGAGCGGGTATTTGCAAACCAAGCTCAGCACTCAAACT SGYLQTKLSTQTKSVALVKKIV
AAAAGCGTTGCCTTAGTTAAAAAAATCGTTAAGGAATTTATAGAAAAAGGCATGACGCAACAAGAATTAGACGACGCT GMTQQELDDAKKFLLGSEPLR
AAAAAGTTTTTACTAGGCTCTGAGCCTTTAAGGAATGAAACGATCTCTAGCCGCTTGAACACCACTTACAATTATTTTT RLNTTYNYFYLGLPLNFNQTLL
ATTTAGGTTTGCCTTTAAATTTTMCCA CGCTGCTCMTCAAATCCAAAAAATGAGTTTGAAAGAAATCAATGATTT MSLKEINDFIKAHTEINDLTFAIV
CATTAAAGCCCA DK
HP1530 2803 CGTGGTTTTATCCCGCCTAAAGACTTGTTAAAGCAATTAGAAAAAATCAGCGCTTCTCTTTCTAAAGACATCGTAATA 2804 RGFIPPKDLLKQLEKISASLSKDI GCGATAAAGCAAGTAGAAAAATTAGAGCTTAGCTATGCGCTAATAGACAATATCCAACACAACACGCTTGATGACAC VEKLELSYALIDNIQHNTLDDTL GCTTGATTTTACCTTTATTGTTGGGGATTCTTTGAGCGTTCAGTCGCTTTATGTTACCTTTGATCTGGTGATTGACATG GDSLSVQSLYVTFDLVIDMDRP GATAGGCCTATGAGCGAGCAGTTCCTCAACCATATTGGGGAATTGG LNHIGEL
HP0797 2805 ATAATTTAGGGAATGCAAACAACACCATTTACTATTACGACAAGAGCATTGATTTTTATGCGAGCGGGAAAACTCTAT 2806 NLGNANNTIYYYDKSIDFYASG
TCACTAAAGCGGAATTTTCTCAAACATTCACCGGGCAAAACAGCGCGATCGTTTTTGGGGCTAAAAGCATATGGACG AEFSQTFTGQNSAIVFGAKSIW
AGCTTAAGCGATGCACCGCAGTCTAACACCATCATTCGCTTTGGGGACAATAAGGGAGCAGGGAGTAATGATGCGA APQSNTIIRFGDNKGAGSNDAS
GCGGGCATTGCTGGAATTTGCAATGCATAGGCTTTATTACAGGGCATTATGAAGCGCAAAAGATTTACATCACCGGT
AGCATTGAAAGCGGGAATCGCATTTCTAGCGGTGGGGGCGCGAGCCTTAATTTTAACGGGCTTCAAGGCATTCTTTT
AACGAACGCGACTTTGTATAACCGCGCCGCTGGCACGCAAAGCTCGTCTATGAATTTTATCTCTAACAGCGCGAACA AGTQSSSMNFISNSANIQAQN
TTCAGGCTCAAAACTCCTATTTTATAGACGATACCGCACAAAATGGCGGTAACCCTAATTTCAGTTTCAACGCTTTGA TAQNGGNPNFSFNALNLDFSN
ATCTGGATTΠTCTAACAGCTCTTTTAGAGGCTATGTGGGGAAAACGCAATCTGTTTTTAAATTCAATGCCAAGAATG YVGKTQSVFKFNAKNAISFTNS
CGATCAGTTTCACCAACAGCACG TTTAAGCTCTGGTTTGTATCAAATGCAAGCTAAAAGCGTGTTGTTTGACAATT GLYQMQAKSVLFDNSNLSVSV
CCAATTTAAGCGTTTCAGTGGGGACAAGCAGTATTAAAGCCAATGCGATCAATCTTTCTCAAAATGCCTCTATTAATG ANAINLSQNASINASNHSTLEL
CGAGCAACCATTCAACCTTAGAACTTCAAGGCGATTTGAATGTGAACGACACCAGCTCGCTCAACCTCAACCAAAGC VNDTSSLNLNQSTINVSNNATI
ACGATTAATGTTTCCAATAACGCCACGATCAACGATTATGCGAGCTTGATTGCGAGTAATGGCTCTCACCTTAATTTT IASNGSHLNFNGAVNFNSANIT
AACGGGGCGGTTAATTTCAATTCAGCGAATATTACTACGAGTTTGAATAATTCCTCTATCGTGTTTAAGGGGGCGGTC SSIVFKGAVSLGGQFNLSNNS
TCTTTAGGAGGGCAGTT SSAITSNTAFNFYDNAFSQSPI
DIKAPLSLGGNLLNPNNSSVLD
LVFGDQGSLNIANIDLLSDLND
YNIIQADMNSNWYERISFFGM
DAKNQTYSFTNPLNNALKITES
LSVTLSQIPGIKNTLYNIGSEIF
HP0797 2807 GCGCTGATCGCACCACGAGAGTGGATTTCAACGCTAAAAATATCTTAATTGATAATTTTTTAGAAATCAATAATCGTGT 2808 ADRTTRVDFNAKNILIDNFLEIN
GGGTTCTGGAGCCGGGAGGAAAGCCAGCTCTACGGTTTTAACTTTGCAAGCTTCAGAAGGGATTACTAGCAGTAAA GAGRKASSTVLTLQASEGITSS
AATGCGGAAATTTCTCTTTATGATGGCGCCACGCTCAATTTGGCTTCAAACAGCGTTAAATTAATGGGTAATGTGTGG LYDGATLNLASNSVKLMGNVW
ATGGGCCGTTTGCAATATGTGGGAGCGTATTTGGCCCCTTCATACAGCACGATAAACACTTCAAAAGTGACAGGGGA YVGAYLAPSYSTINTSKVTGEV
AGTGAATTTTAACCATCTCACTGTGGGCGATCACAACGCCGCTCAAGCAGGCATTATCGCTAGTAACAAGACTCATA TVGDHNAAQAGIIASNKTHIGT
TTGGCACACTGGATTTGTGGCAAAGCGCGGGACTAAACATTATCGCCCCTCCAGAAGGCGGTTATAAGGATAAACCT SAGLNIIAPPEGGYKDKPKDKP
AAGGATAAACCTAGTAACACCACGCAAAATAATGCTAACAACAACCAACAAAACAGCGCTCAAAACAATAGTAACACT NNANNNQQNSAQNNSNTQVI
CAGGTTATTAACCCACCCAATAGCGCGCAAAAAACAGAAATTCAACCCACGCAAGTCATTGATGGGCCTTTTGCTGG QKTEIQPTQVIDGPFAGGKDT
TGGCAAAGACACGGTTGTCAATATTGATCGCATCAACACTAACGCTGATGGCACGATTAAAGTGGGAGGGTATAAAG NTNADGTIKVGGYKASLTTNA
CTTCTCTTACCACCAATGCGGCTCATTTGCATATCGGCAAAGGCGGTATCAATCTGTCCAATCAAGCGAGCGGGCGC KGGINLSNQASGRTLLVENLT
ACCCTTTTAGTGGAAAATCTAACCGGGAATATCACCGTTGATGGGCCTTTAAGAGTGAATAATCAAGTGGGTGGTTAT GPLRVNNQVGGYALAGSSAN
GCTTTGGCAGGATCAAGCGCGAATTTTGAGTTTAAGGCTGGTACGGATACCAAAAACGGCACAGCCACTTTTAATAA TDTKNGTATFNNDISLGRFVN
CGATATTAGTTTGGGAAGATTTGTGAATTTAAAAGTGGATGCTCATACAGCTAATTTTAAAGGTATTGATACTGGTAAT TANFKGIDTGNGGFNTLDFSG
GGTGGTTTCAACACCTT INKLITASTNVAVKNFNINELW
SVGEYTHFSEDIGSQSRINTV
RSIFSGGVKFKSGEKLVIDEFY
YFDARNIKNVEITRKFASSTPE
SKLMFNNLTLGQNAVMDYSQ
GDFINNQGTINYLVRGGQVAT
AAMFFSNNVDSATGFYQPLM
DLIKNKEHVLLKAKIIGYGNVS
NVNLIEQFKERLALYNNNNRM
NTDDIKACGTAIGNQSMVNNP
HP0797 2809 AAGTTTTTTGGTAAGCTTTTTGGTTGCTGAAAACGCTCATGAGCCAGAAGAAATCAAGGCTAAAGTGGCTTATGTGAA 2810 SFLVSFLVAENAHEPEEIKAKV
AATCCCCCAATTAGAAGATTTGGAAAACAACCCGGTTTATATCGGTCAAATTATAGGCGTAACTTATGATTTATTGCTG QLEDLENNPVYIGQIIGVTYDLL
TTTGACGCTGAGTTTTTGGAAGCCAAAATCAAAGACGGGTTGGATAAAACCCAAATTGAGCTTTTAAACAAGATGCCT FLEAKIKDGLDKTQIELLNKMPK
AAATGGAAAAAGGTGGAAAAAGAGCTTTTCAGAGCGACTTATTATTACAAGATTAAGGGCATAAAAGCGATTATTCCG EKELFRATYYYKIKGIKAIIPSLE
TCCTTAGAAGTGAGCGCGTTTTCCAATAAAGACAAATACATAGATCATTCCATAGCCCCAAAAGTTACTTTGCAGGTA NKDKYIDHSIAPKVTLQVTDLSK
ACGGATTTGTCCAAAAACCCTCGTTATGCGAATGTCATGGCTAAAGATTTACAAGTCTTGCAATACAAAACCAAAGAT NVMAKDLQVLQYKTKDYDDKN
TATGACGATAAAAACAATATTTTGGTGATGGAAATAGCGTTCAAAGAAGCCACTTGGGAAGATTTTCACATCAAAGAA EIAFKEATWEDFHIKEAIKQGFD
GCGATCAAGCAAGGGTTTGATAACGCCTCTTTAAACCAGATCAAGGCTAAAGAAGGGAGCGTTTTTTATTATTGCGT IQIKAKEGSVFYYCVLPKTIQNLS
GTTGCCTAAGACTATTCAAAACCTTTCTTTTGATTATTTC
HP0933 2811 ATGCCCATGGTTGAGTAATGGTGGTGCAGGCAATGTGGCCGGTGGCAATAGTTTATGGGCCGGAATAGATAAAGGC 2812 CPWLSNGGAGNVAGGNSLWA
GACGGGAGCGCATGCGGGATTTTTAAAAATGAAATCAGCGCGATTCAAGACATGATCAAAAACGCTGAAATAGCCGT DGSACGIFKNEISAIQDMIKNAE
AGAGCAATCCAAAATCGTTACCGCCAACGCGCAAAACCAGCACAACCTAGACACTGGGAAAGCATTCAACCCCTATA SKIVTANAQNQHNLDTGKAFN
AAGACGCCAACTTCGCCCAAAGCATGTTCGCTAACGCTAGAGCGCAAGCGGAGATTTTAAACCGCGCTCAAGCAGT FAQSMFANARAQAEILNRAQA
GGTGAAGGACTTTGAAAGAATCCCTGCAGCGTTCGTGAAAGACTCTTTAGGAGTATGCCATGAAAAGGGTAGCGAC RIPAAFVKDSLGVCHEKGSDG
GGCAATCTCCGTGGCACGCCATCTGGCACGGTTACTTCTAACACTTGGGGAGCCGGCTGCGCGTATGTGGGAGAA SGTVTSNTWGAGCAYVGETV
ACCGTAACGAATCTAAAAAACAGCATCGCTCATTTTGGCGACCAAGCGGAGCGAATCCATAATGCGCGAAATCTCGC IAHFGDQAERIHNARNLAYTLA
CTACACTTTAGCGAATTTCAGCGGCCAGTACAAAAAGCTAGGCGAACACTATGACAGCATCACAGCGGCGCTCTCTA YKKLGEHYDSITAALSSLPDAQ
GCTTGCCTGATGCGCAATCTTTACAAAATGTGGTGAGCAAAAAGACTAACCCTAACAGCCCGCAAGGCATACAGGAT VSKKTNPNSPQGIQDNY
AATTACT
HP0933 2813 ATACGCGCAAGGCATTAACCAACGAAACGCTCAATTCAGGATTTATTTAGATTGGTATTTACACCATATCGGTCTGTT 2814 YAQGINQRNAQFRIYLDWYLH
TAACCCTTATAAAGCGCGAATCGCTGAACATGTTTTTAAAACCACTCTTGCTCATGATGGCATTTATTATAAATTAAAC PYKARIAEHVFKTTLAHDGIYYK
TACCCGCCAACAACAAAGTATCATGGTAATAGCTTTACAGAATGCGCTCATTTTTATTTGAAAAACATTTATCAACAGG TTKYHGNSFTECAHFYLKNIYQ
ATTTAGATGATAAAAGCATTGAAAAATTAAGGGAGCAGTTAGGCTTTATTCAAAAGAGCGAGGAGTTTAGACGAGATA KSIEKLREQLGFIQKSEEFRRD
GCAAAATCATCAATCTTTATCGCCTTTCAACGCCTAATGTTTGCAGTGCATGCTGCGATGATTACGACATTAAAGAAA RLSTPNVCSACCDDYDIKERS
GAAGTTTTCTTTCTTTACCTTTATATCAAATCACTCAAAATCCCGATTCCTACTACACTGAAATACATGATTTCTTTAGG YQITQNPDSYYTEIHDFFRQNQ
CAAAATCAGAGAATTAGATGTTTTAGCAAATCTTGCTAAACTTTGCCCTACTTGTCATAGGGCTTTAAAAAAAGGATCT SKSC
AGCGAAGAGGAGTTTCAAAAACGCTTGATTAGAAACATTCTCAA
HP0933 2815 CTTTGGCTTTAGAGGTGATTAGGCATTCATGCGCGCATTTGCTTGCGCAAAGCTTGAAAGCCCTTTATCCGGACGCG 2816 LALEVIRHSCAHLLAQSLKALY AAATTTTTTGTAGGCCCTGTGGTAGAAGAGGGGTTTTATTACGATTTCAAGACTTCTTCAAAAATCAGCGAAGAGGAT VGPVVEEGFYYDFKTSSKISEE TTGCCTAAAATTGAAGCGAAAATGAAAGAGTTTGCGAAGTTGAAACTCGCTATCACTAAAGAGACTTTAACCAGAGAG AKMKEFAKLKLAITKETLTREQ CAAGCTTTGGAGCGTTTTAAGGGCGATGAATTAAAGCATGCGGTGATGAGTAAAATCGGTGGCGATGCCTTTGGCG GDELKHAVMSKIGGDAFGVYQ TGTATCAACAAGGCGAGTTTGAAGATTTGTGTAAGGGGCCGCATCTCCCAAACACCCGTTTTTTAAACCATTTTAAGC DLCKGPHLPNTRFLNHFKLTKL TCACTAAACTGGCTGGGGCTTATTTGGGCGGCGATGAAAACAATGAAATGCTCATTAGAATCTATGGAATCGCTTTT GGDENNEMLIRIYGIAFATKEG GCCACCAAAGAGGGTTTAAAAGACTATCTTTTCCAAATAGAAGAAGCGAAAAAACGAGATCACAGAAAGCTAGGCGT QIEEAKKRDHRKLGVELGLFS GGAGCTAGGGCTTTTTAGCTTTGATGATGAGATAGGGGCGGGCTTACCTTTATGGCTGCCTAAAGGGGCAAGGCTT AGLPLWLPKGARLRKRIEDLLS 3GAAGCGCATTGAAGATTTATTGAGTCAAGCGTTACTTTTAAGAGGCTATGAGCCGGTTAAAGGTCCTGAGATTTTA RGYEPVKGPEIL
HP0933 2817 CTTATGGGGGGAGTTGCCCGCATGGCGGGGGAGCGTTTAGCGGGAAAGACCCTAGCAAAGTGGATAGGAGCGCG 2818 YGGSCPHGGGAFSGKDPSKV
GCTTATGCGGCCCGCTATGTGGCTAAAAATTTGGTAGCGAGTGGGGTTTGCGATAAAGCGACCGTGCAGCTTGCTT YAARYVAKNLVASGVCDKATV
ATGCGATTGGGGTGATAGAGCCAGTGTCTATTTATGTGAACACGCATAACACGAGCAAGTATTCAAGCGCTGAGTTG GVIEPVSIYVNTHNTSKYSSAE
GAAAAATGCGTGAAATCGGTTTTCAAACTCACGCCAAAAGGCATTATTGAAAGCTTGGATTTATTAAGACCCATTTATT SVFKLTPKGIIESLDLLRPIYSL
CGCTCACTTCAGCTTATGGGCATTTTGGGCGCGAATTAGAGGAATTCACTTGGGAAAAAACCAACAAAGCTGAGGAG FGRELEEFTWEKTNKAEEIKA
ATTAAAGCGTTCTTTMGCGTTAAAAAAATATTTT GGGT TATTTTAAAAAATTTTTGTATAATCAACAATTCACAA
GGAGTTTAAATTGAAACAAAGAACGCTGTCTATTATTAAACCGGATGCACTTAAGAAAAAAGTGGTAGGCAAAATCAT
TGATCGTTTTGAGAGTAACGGCTTGGAAGTGGTTGCTATGAAACGCTTGCATTTGAGCGTTAAAGACGCTGAAAACT
TTTATGCG
HP0933 2819 GAGCGAGTGTGTTGAGCGCGTTACTTCTTGTAGGCTTAGGGGCAGCCCCTAAACATTCAGTTTCAGCTAATGACAAA 2820 ASVLSALLLVGLGAAPKHSVSA
CGGATGCAGGATAATTTAGTGAGCGTGATTGAAAAACAGACCAATAAAAAGGTGCGTATTTTAGAAATCAAACCTTTA QDNLVSVIEKQTNKKVRILEIKP
AAATCTAGCCAGGATTTAAAAATGGTCGTTATTGAAGATCCGGACACTAAATACAATATCCCGCTTGTGGTGAGTAAG KMVVIEDPDTKYNIPLVVSK
GATGGTAATTTAATCATAGGGCTTAGCAACATATTCTTTAGCAATAAAAGCGATGATGTGCAATTAGTTGCAGAAACC LSNIFFSNKSDDVQLVAETNQK
AATCAAAAAGTTCAAGCTCTTAACGCCACCCAACAAAATAGCGCGAAATTGAACGCTATTTTTAATGAAATACCGGCT ATQQNSAKLNAIFNEIPADYAIE
GATTATGCGATAGAGTTGCCCTCTACTAACGCTGCAAATAAGGATAAAATCCTTTATATTGTCTCTGATCCCATGTGC AANKDKILYIVSDPMCPHCQKE
CCACATTGCCAAAAAGAGCTCACTAAACTTAGGGATCATTTAAAAGAAAACACCGTGAGAATGGTCGTGGTGGGGTG DHLKENTVRMVVVGWLGVNS
GCTTGGGGTCAATTCAGCTAAAAAAGCGGCTTTAATCCAAGAAGAAATGGC IQEEM
HP0933 2821 ATGGCGGGATTGCGTGCGCGAATTTGTTGCATAAAAATTCAGGGATCACGATAGATATTGGAGGGGGTAGCACCGA 2822 GGIACANLLHKNSGITIDIGGGS
GTGCGCGTTGATTGAAAAAGGCAAGATTAAGGACTTAATCTCGCTTGATGTTGGGACGATTCGCATTAAAGAAATGTT EKGKIKDLISLDVGTIRIKEMFL
TTTAGACAAAGACTTAGAGGTCAAATTGGCTAAAGCCTTTATCCAAAAAGAAGTCTCTAAACTGCCCTTTAAACACAA KLAKAFIQKEVSKLPFKHKNAF
AAACGCCTTTGGGGTGGGGGGGACGATCAGAGCGTTGAGTAAGGTATTGATGAAACGCTTTTGTTACCCTATTGATT RALSKVLMKRFCYPIDSLHGYE
CTTTGCATGGCTATGAAATAGATGCACATAAAAATTTAGCGTTCATTGAAAAAATCGTCATGCTCAAAGAAGATCAATT NLAFIEKIVMLKEDQLRLLGVNE
ACGGC rTAGGGGTGAATGAAGAGCGTTTGGATAGCATCAGGAGCGGGGCGTTGATT ATCAGTCGTTTTGGAG IRSGALILSWLEHLKTSLMITSG CATTTAAAAACTTCTTTAATGATCACTAGTGGGGTGGGGGTGAGAGAAGGCGTGTT TTGAGCGATTTATTGCGCCA EGVFLSDLLRHHYHKFPPNINP
TCATTACCATAAATTCCCCCCCAATATCAACCCCTCTCTCATCTCTTTAAAAGATCGCTTTTTGCCCCATGAAAAGCAC DRFLPHEKHSQKVKKECVKLF AGCCAAAAGGTCAAAAAAGAATGCGTGAAATTGTTTGAAGCCTTATCGCCTTTGCATAAAATAGATGAAAAATAC HKIDEKY
HP0933 2823 CCCTGAGGATAACTCCATAGAATTATCTCCTAGCGATAGCGCTTGGAGAACTAATCTTGTTGTGCGGACTAATAAAG 2824 PEDNSIELSPSDSAWRTNLVV
CCTTGTATCAATTCATTTTGAGAATAGCTCAAAAAGACAATTTTGCTTCAGCGTATCTAACAGTCAAATTAGAATACCC YQFILRIAQKDNFASAYLTVKLE
ACAAAGACACGAAGTCTCTAGCGTTATTGAAGAGGAGTTAAAAAAGAGAGAAGAAGCAAAGAGGCAGAAAGAATTGA EVSSVIEEELKKREEAKRQKELI
TTAAGCAAGAAAATCTTAACACCACAGCCTACATCAATAGAGTGATGATGGCGAGCAATGAACAGATTATCAACAAAG NTTAYINRVMMASNEQIINKEKI
AAAAAATAAGAGAAGAAAAGCAAAAAATTATCTTAGATCAAGCAAAGGCGCTAGAGACTCAATATGTGCATAATGCCT KIILDQAKALETQYVHNALKRN
TAAAAAGAAACCCCGTGCCTAGAAACTACAACTACTACCAAGCGCCTGAAAAACGCTCTAAACATATTATGCCCTCTG NYYQAPEKRSKHIMPSEIFDDG
AAATTTTTGATGATGGCACATTCACTTATTTTGGTTTCAAAAACATCACTCTCCAACCTGCTATTTTTGTGGTTCAACCT GFKNITLQPAIFVVQPDGKLSM
GATGGGAAATTGAGCATGACTGATGCCGCCATTGATCCTAACATGACCAATTCAGGATTGAGATGGTATAGAGTTAA PNMTNSGLRWYRVNEIAEKFK
TGAAATTGCAGAAAAATTTAAGCTCATTAAAGACAAAGCCCTTGTAACAGTAATCAATAAAGGCTATGGGAAAAATCC LVTVINKGYGKNPLTKNYNIKN
ATTGACAAAAAATTACAATATCAAAAACTATGGTGAATTGGAGCGTGTGATTAAAAAGCTCCCTCTTGTCAGAGATAA VIKKLPLVRDK
ATAAAAAGGCGTTAAGACATGAATGAAGAAAACGATAAACTTGAAACTTCTAAAAAAGCCCAACAAGATTCACCCCAA
GATTTATCCAATGAAGAAGCAACAGAAGCCAATCATT
HP0933 2825 AACAAAAGAACCATGTTTATACGCCTGTGTATAATGAACTGATAGAGAAGTATAGTGAGATACCCTTAAATGACAAAC 2826 QKNHVYTPVYNELIEKYSEIPL TCAAAGACACACCATTCATGGTGCAAGTGAAGTTGCCAAATTACAAGGACTATTTGTTGGATAATAAACAAGTTGTAC TPFMVQVKLPNYKDYLLDNKQ TAACTTTCAAACTTGTTCACCATTCTAAAAAGATTACGCTCATAGGCGATGCCAATAAGATACTTCAATACAAGAATTA LVHHSKKITLIGDANKILQYKNY CTTCCAAGCTAATGGAGCGAGATCTGACATTGATTTTTACTTGCAACCCACTTTGAATCAAAAGGGTGTGGTGATGAT ARSDIDFYLQPTLNQKGWMIA AGCGAGTAACTACAATGATAATCCTAACAGCAAAGAAAAACCACAGACCTTTGATGTGTTGCAAGGAAGTCAGCCAA NPNSKEKPQTFDVLQGSQPM TGCTAGGAGCTAACACAAAAAACTTGCATGGCTATGATGTGAGTGGAGCAAACAACAAGCAAGTGATCAATGAAGTG NLHGYDVSGANNKQVINEVAR GCAAGAGAAAMGCTCAGCTAGAAAAAATCAATCAGTATTACAAAACTCTTTTACAAGACAAGGAACAAGAATATACC EKINQYYKTLLQDKEQEYTTR ACTAGGAAAAATAACCAACGAGAAATTTTAGAAACATTGAGTAATCGTGCAGGTTATCAAATGAGGCAGAATGTGATT ILETLSNRAGYQMRQNVISSEI AGTTCTGAGATTTTTAAGAATGGCAACTTGAACATGCAAGCCAAAGAGGAAGAAGTTAGGGAGAAGCTACAAGAAGA NMQAKEEEVREKLQEERENE AAGAGAGAATGAATACTTGCGAAATCAAATCAGAAGTTTGCTCAGTGGTAAGTGATTGAAAGAAAAGGAGAGAGTAG RSLLSGK ATTTTTCTAAGGGTAGAGAGAAACGCCAACGGGCGTTAGCTTTACTTGATAAGTAAGGCGATCAATTAGGTAATCTTT TTAAAGCACTCATCATTTTTTAATTCTGTCTCTGATTGAGACAAATA I I I I I CAAACTGAATTTTGTTAGTTGCACCGAG AT I I IT I GTTTGTTTTCCACATCAAGCATTTTACACTCC TTCTTTCATGTATTTTTCAAGATCAAATTTCATCAGCAA
ATCCCTAA
HP0933 2827 CTATCCATTCTTCGCTTTTCCACACAGACGCTGATTCTAAGGATATTTGGAGTCAAGTGAGGAAGCAATTTGATTTCA 2828 IHSSLFHTDADSKDIWSQVRKQ
TTCCAGGAAAAACCCCTGTGTGTGTTGGCGTGTGCTATATCGCGCCTTATAAAAATCAAGACCTTATTGGCTCTAGC GKTPVCVGVCYIAPYKNQDLIG
GCTTTTGCGTGGTCGCTGAACTTTGGGGCCACGGTGGTAGGGACTTTGCTTTTAGGGAGCGCTCAAGAAAAAGCCA WSLNFGATVVGTLLLGSAQEK
ATAATAATGGCGGATCGATCTGGTTTGGTAAGAATAATTTGCTGTATTTGCATGGCAATTTCAACGCGACTAATATCTT GSIWFGKNNLLYLHGNFNATNI
TTTAACGAATAATTTTAATGTCGGCAACCCTAACGCTGGCGGTGGGGCGACGATTAATTTTAACGCTGATGAAACCTT FNVGNPNAGGGATINFNADET
GAACGCTGACGGGTTAAATTACACGAATTTCCAAACCGTGGCTTTGGGCTTACAAACCAGTGCGAGCCAGCATTCAT LNYTNFQTVALGLQTSASQHS
GGGCGAATTTTAATTCCAAGCTTTCTATGGAGATTAAAAATTCTAACTTTAGGGATTTCACATGGGGAGGCTTTAATTT SKLSMEIKNSNFRDFTWGGFN
TAATTCAGGGCGTATCACTTTTGAAAACACCACTTTTAGCGGCTGGACCAATATTAACGGAGCGACTGAGAGCGGCT TFENTTFSGWTNINGATESGSS
CATCGTATGTGAATATGGTTGCGAATACGGATTTGATATTTTCTAATTCCATTTTAGGAGGGGGCATTCGCTATGATTT VANTDLIFSNSILGGGIRYDLKA
GAAAGCT TAACATTATTTTCAATAACTCTCAAATGGTTATTGATGTGTCTAAGAATGTGAATCAGTCATCATTGAAT NSQMVIDVSKNVNQSSLNGNV
GGGAATGTTACTTTCAATAATTCCAGGCTTTCAGTCAAGCCCAATGCGGCTATTAATATTGGGGATAGCCAAACCCAA RLSVKPNAAINIGDSQTQTALE
ACGGCTTTAGAAAACGCTTCAAGCCTTTCTTTTTACAACAACAGCGTGGCGAATTTTAACGGCACAACCGCTTTTAAC FYNNSVANFNGTTAFNGVSYL
GGGGTGTCTTATTTGAATTTGAACCCTAACGCTCAAGTAAGCTTCAATCAAGTAAATTTCAATAACGCTAATGTAACTT AQVSFNQVNFNNANVTFYGIPL
TTTATGGCAT DFGNSARLINFKGNTNFNQATL
NIHINFQGVSTFKQNSTMNLAE
FNALKVEGETNFNLNNSSLLNF
FNAPVSFYANHSQISFTKLATF
FDLSNNSTLNFQSVLLNGALNL
NNLAINAKGNFSFGSKGILNLS
GGDKKTSVYDVLQAQNIDGLM
EKIRFYGIQIDKADYSFDNGVH
NPLNTTETITETLHNNRLKVQIS
NNKMFNLAPSLYDYQKNPYNE
NYTSDKVGTYYLTSNIKGFNQN
TYNAQNQPLQALHIYNQAITKQ
SLGKEFLPKIANLLSSGALDNL
ETLFGIFEKYGITLNQENWKSL
SN
HP0933 2829 GCAATACAGAAAAAGCGGCGAGCCTTATATTGTCCATCCTATTTGCGTGGCAAGCTTGGTAGCGTTTTGTGGGGGC 2830 QYRKSGEPYIVHPICVASLVAF
GATGAGGCGATGGTGTGTGCTGCGCTTTTGCATGATGTGGTGGAAGACACGCCTTGTAAGATTGAAACGATTGAGC AMVCAALLHDVVEDTPCKIETI
AAGAATTTGGGCAAGATGTGGCTAATTTAGTGGATGCGCTCACTAAAATCACTGAAATCAGGAAAGAAGAATTAGGC QDVANLVDALTKITEIRKEELG
GTGAGCTCTCAAGATCCCAGAATGGTGGTTTCAGCGCTCACTTTCAGAAAGATTTTAATTAGCGCGATACAAGATCC RMVVSALTFRKILISAIQDPRAL
AAGAGCCTTAGTGGTAAAGATTAGCGACAGGTTGCACAACATGCTCACCTTAGACGCCTTGCCTCATGACAAGCAAG RLHNMLTLDALPHDKQVRISK
TGCGTATTTCTAAAGAGACTCTAGCGGTGTATGCCCCTATAGCGAGCCGATTGGGCATGTCTTCAATCAAAAATGAA APIASRLGMSSIKNE
TT
HP0933 2831 GCTTCTTTTATAAAAATAAAGAAGTAGCGGTGCTTGGTGGAGGCGATACCGCCGTAGAAGAGGCGATTTATTTGGCC 2832 FFYKNKEVAVLGGGDTAVEEA AACATCTGCAAAAAAGTCTATCTCATCCACAGAAGAGATGGTTTTAGGTGCGCGCCTATCACTTTAGAGCATGCCAAA KKVYLIHRRDGFRCAPITLEHA AACAACGATAAGATTGAG ΓΠTAACCCCTTATGTGGTGGAAGAAATCAAGGGCGATGCTTCTGGAGTGTCTTCTTTA EFLTPYVVEEIKGDASGVSSLS AGCATTAAAAACACAGCCACTAACGAAAAAAGAGAATTGGTTGTGCCGGGGT FTA' TTTTGTGGGTTATGATGTG NEKRELWPGFFIFVGYDVNN
AATAACGCCGTGTTGAAACAAGAAGATAACTCCATGCTATGCAAATGCGATGAATACGGCTCTATTGTCGTGGATTTT DNSMLCKCDEYGSIVVDFSMK TCCATGAAAACGAATGTTCAAGGCTTGTTTGCGGCAGGAGATATTCGCATTTTTGCCCCTAAACAAGTGGTTTGCGC LFAAGDIRIFAPKQVVCAASDG TGCAAGCGATGGCGCTACGGCAGCCTTAAGCGTGATTTCTTATTTAGAACACCATTAAATCAAGCTTATAACCCT VISYLEHH
HP0933 2833 GATGTTCATGTGCGCGGATGCGGTGATTATCAGTAAAGCGGATATGGTTGAGGTGTTTAATTTCAGGGTTTCTCAAG 2834 MFMCADAVIISKADMVEVFNFR
TCAAAGAAGACATGCAAAAATTAAAGCCTGAAGCGCCTATTTTTTTAATGAGCTCTAAAGACCCTAAAAGTTTGGAAG EDMQKLKPEAPIFLMSSKDPK
ATTTTAAAAATTTCCTTCTAGAAAAAAAGCGTGAAAATTACCAGTCCACGCATTCGTTTTAATGTGTTTAGCGATCCCC NFLLEKKRENYQSTHSF
TCTAAAGTCATAGCCATTAAGGATAATGTGGTGCTTTTGGAAACTTTGGGCGTTCAAAGAGAGGCGAGCTTGGATTT
AATGGGCGAGTCCGTTAAAGTGGGCGATTATGTGCTGTTGCACATCGGCTATGTGATGAGCAAGATTGATGAAAAAG
AAGCCCTAGAATCCATTGAGCTTTATCAAGAAATGATCGCCAGAATGAACGAAACGCAATAATAACAATGAGCGTTGA
TCACCTCATTTCGCCCTTTAGAGACAAGCGAACCCTTTTAGCGCTCTCTAATGCAATCAAAAAACTCGCTTTCAAACT
TGAAAAAAAATTAGTCATCATGGAAGTGTGCGGAGGGCATACGCATTCTATCATGAAATACGGGCTTTTAGATTTGAT
GCCTAACAATTTAGAGTTTGTGCATGGGCCGGGCTGCCCGGTATGCGTGATGCCAAGAGCGCGCCTTGATGAAGCT
TATGAACTCGCTACTATCAAAGATAGCATTGTTTTGAGTTTAGGGGATATGATGAGAGTCCCCGGGAGCTATGGGAG
TTTGATACAAGCCAGAGAAAAGGGGTTGGATGCACGCTTTTTGTATTCGCCCATGCAAGCTTTAGAGATCGCTAAAG
AAAACCCTACTAAAAAAGTCATTTATATTGCGATCGGTTTTGAAACCACAACGCCCATGAGCGCTAGCGTTTTATGGA
GCGCCAAAA GAAAAAATTAATAACCTTTTTTTCCACATTAACCACATTCTAGTGCCTCCCAGCGTGAGCGCGATTT
TAAAAGATCCAGCATG
HP0933 2835 TAGCATGCTAGTGGGGGTAGCCGCATCGCTCATTCCTTTCTTAGAGCATGATGACGCCAACCGTGCCTTAATGGGG 2836 SMLVGVAASLIPFLEHDDANRA
ACTAACATGCAGCGCCAAGCGGTGCCCTTATTAAGAAGCGACGCTCCCATTGTAGGCACGGGGATTGAAAAAATTAT MQRQAVPLLRSDAPIVGTGIEK
TGCTAGGGATTCTTGGGGAGCGATCAAAGCCAATCGCGCAGGCGTTGTAGAAAAAATTGATTCTAAAAATATTTATAT WGAIKANRAGVVEKIDSKNIYIL
TTTAGGCGAAAGCAAAGAAGAAGCCTATATTGATGCGTATTCTTTGCAAAAAAACTTGCGCACCAACCAAAACACCAG EAYIDAYSLQKNLRTNQNTSF
TTTCAATCAAGTCCCTATCGTTAAAGTGGGCGATAAAGTGGGAGCCGGGCAAATCATCGCTGATGGCCCTAGCATG KVGDKVGAGQIIADGPSMDRG
GATAGAGGCGAGTTGGCGTTAGGGAAAAATGTGCGCGTGGCGTTCATGCCTTGGAATGGCTATAACTTTGAAGACG KNVRVAFMPWNGYNFEDAIVV
CGATCGTGGTGAGTGAGTGCATCACTAAAGATGATATTTTCACTTCCACCCACATTTATGAAAAAGAAGTGGATGCTA DDIFTSTHIYEKEVDARELKHG
GGGAGCTTAAGCATGGTGTGGAAGAATTTACCGCTGATATTCCTGATGTGAAAGAAGAAGCGCTCGCTCATCTTGAT DIPDVKEEALAHLDESGIVKVG
GAAAGCGGGATCGTTAAAGTCGGTACTTATGTGAGCGCTGGCATGATTTTGGTGGGCAAAACTTCTCCTAAAGGCGA GMILVGKTSPKGEIKSTPEERL
GATTAAAAGCACGCCTGAAGAGCGGCTTTTAAGGGCTATTTTTGGGGATAAAGCCGGGCATGTGGTCAATAAGAGTT DKAGHVVNKSLYCPPSLEGTVI
TGTATTGCCCTCCCAGTTTGGAAGGCACGGTGATTGATGTGAAAGTCTTCACTAAAAAAGGCTATGAGAAAGACGCG TKKGYEKDARVLSAYEEEKAK
CGAGTTTTGAGCGCGTATGAAGAAGAAAAAGCCAAGCTTGATATGGAGCATTTTGATCGCTTGACCATGCTCAATAG FDRLTMLNREELLRVSSLLSQ
AGAAGAATTGTTGCGCGTTAGCTCGCTCCTTTCTCAAGCGATTTTAGAAGAGCCTTTCAGCCATAACGGCAAGGATT SHNGKDYKEGDQIPKEEIASIN
ATAAAGAAGGCGATCAAATCCCT LVKKYSKEVQNHYEITKNNFLE
GEEHEEKLSILEKDDILPNGVIK
AT
HP0933 2837 CGAAGAGTTCGCTCTCAAACAAGTGGCCAAACAAGCCACCAGCTCTCTTTTATACCGATTAGGAAAAACCATTATTTT 2838 EEFALKQVAKQATSSLLYRLG
AGCGAGCGTGTGCGTGGAAAGAGAGCCTGTGAGTG GATTTTCTGCCTTTAGTGGTGCAGTTTTTAGAAAAATCTT VCVEREPVSEDFLPLVVQFLE
ATGCAGCCGGAAAGATCCCGGGCGGTTTTGTTAAAAGAGAAGGCAGGGCGCAAGATTTTGAAATCTTAACCTCTAG KIPGGFVKREGRAQDFEILTSR
GCTCATAGACAGGACTTTACGCCCTTTATTCCCTAAAGACTACCGCTACCCTACACAGATCACTTTAATGGTTTTAAG RPLFPKDYRYPTQITLMVLSH
CCATGATATTGAAAATGACTTGCAGGTTTCTGCTTTAAACGCCGCTTCAGCCGCTCTCTTTTTGGCCCATATCGCTCC VSALNAASAALFLAHIAPIKSV
TATTAAAAGCGTGAGCGCTTGCAGGATCGCTAGGATGGATAACGAATTTATCATTAACCCTAGCGCAAGCCTTTTGA MDNEFIINPSASLLNQSSLDLF
ATCAATCCAGTTTGGATTTGTTCGTGTCTGGAACGAAAGAGAGTTTGAACATGATAGAAATGCGCTCTTTGGGGCAA SLNMIEMRSLGQKLNALEEPL
AAATTGAACGCTTTAGAAGAGCCTTTAATGTTAGAAGCTTTAGAATTGGCTCAAAAAAGTTTGGAAGAAACTTGCACG LAQKSLEETCTLYEEIFTPHQN
CTTTATGAAGAGATTTTCACGCCCCACCAAAACGAGCTGTTTTTCAAAGAGAGCCAAGGAATAG SQGI
HP0933 2839 GTGTTTTTTGATGCAAGGGTTTTTAAGAAGCCTGTTTTTTGGGGTTAAAAAGATCCCTAAACCATTCGCTCCTCTAGTA 2840 CFLMQGFLRSLFFGVKKIPKPF
GAAAAGGGCGTTTTAAAAGAAGCGCTTGAATTGAAAAAGGATCGCTATTTTTTAAAAGAAGGCTTTGATATAGGCAAA IKGVLKEALELKKDRYFLKEGFD
GTTGAAA GTAAAAGATAAGGCGTTTTTCATTTCTTTAGCGAAAAATTACCCTAAAGACCCTTTAATCAAAAACTTAC KVKDKAFFISLAKNYPKDPLIKN
CCCCATCTTTTAAAACAGACGCTTTGATTTTATGCCAAATAGAATGTTCTAAAAAACGCCCCATAGCCTTTTTTAAAGC KTDALILCQIECSKKRPIAFFKA
CGCTCTTTTAAATGCAGATCACACGATGATAGCTTACTTGGCTAAAAAAAATAACCAGATTGTGGCTATCCCTTTTAAA HTMIAYLAKKNNQIVAIPFKEPF
GAGCCTTTTAAAAAACCTGTTTCTTTAAAGCACAGCCAAAAATCCTTACTAGAATTGCCCAGGCATTGCGTGGTAAAA LKHSQKSLLELPRHCVVKIDTK
ATTGATACTAAAAAGCGTGAAATCAGCGAGATTTTAGGGGCTTTAGAAGACCCTTTAATAGATGAAAACCTTTCTTTG LGALEDPLIDENLSLSLFDRVK
AGCCTTTTTGACAGGGTGAAGGATTTTTCAAAAGATTGCTTGAATTTAGCGCAACATTACGCGCAACTTAAAGCGAGC LNLAQHYAQLKASDFKDRINYS
GATTTTAAAGACAGGATCAATTATTCTCACATCCCTTTTATCACCATTGACCCCAAAGACGCTAAAGATTTTGACGATG DPKDAKDFDDAIFYDQEKRVLF
CGATTTTTTATGACCAAGAAAAAAGGGTTTTGTTTGTGGCGGTTGCTGATGTGAGCGAATTTGTGCCGAAACATTCCA VSEFVPKHSSLDKEARVRGFS
GTTTGGATAAAGAAGCTAGGGTTAGGGGTTTTAGCGTGTATTTCCCTAATAGCGTCTATCCCATGCTGCCTTTGAGTT VYPMLPLSLSQGACSLKAFEK
TGTCTCAAGGGGCATGCTCATTAAAAGCGTTTGAAAAACGCCTGGCTTTAGTGTATGAAATCCCTTTAGATAATTTGA EIPLDNLKN
AAAAC
HP0938 2841 CGAGATTTTAGTGGGCGAGAGCGCCAAAAGACAAGCGGTAACCAACCCAGAAAAAACCATTTATTCTATTAAAAGAA 2842 EILVGESAKRQAVTNPEKTIYSI
TCATGGGTTTGATGTTTAATGAAGATAAGGCTAAAGAAGCCGAAAAGCGCTTGCCTTATAAGATTGTGGATAGGAAT MFNEDKAKEAEKRLPYKIVDR
GGGGCTTGCGCGATTGAGATTTCGGGTAAAGTTTATACCCCTCAAGAGATTTCAGCCAAAATTTTAATGAAGCTCAAA EISGKVYTPQEISAKILMKLKED
GAAGACGCTGAAAGTTATTTGGGCGAGAGCGTTACGGAAGCGGTGATCACGGTTCCAGCTTATTTTAACGACAGCC GESVTEAVITVPAYFNDSQRK
AAAGGAAAGCGACTAAAGAAGCCGGCACGATTGCAGGGCTTAATGTTTTAAGGATCATCAATGAGCCTACAAGTGCG TIAGLNVLRIINEPTSAALAYGL
GCATTAGCTTATGGCTTGGATAAAAAAGAGAGCGAAAAAATCATGGTTTATGATTTGGGTGGGGGGACTTTTGATGTT KIMVYDLGGGTFDVTVLETGD
ACCGTTTTAGAAACAGGCGATAATGTCGTGGAAGTTTTAGCCACAGGGGGCGATGCGTTTTTAGGGGGCGATGATTT ATGGDAFLGGDDFDNRVIDFL
TGACAATCGTGTGATTGATTTCTTAGCGAGTGAGTTTAAAAGCGAAACGGGCATTGAAATTAAAAACGATGTGATGGC ETGIEIKNDVMALQRLKEAAEN
GTTGCAACGCCTAAAAGAAGCGGCTGAAAACGCTAAAAAAGAACTGAGCTCTGCGATGGAGACTGAAATCAATTTGC SAMETEINLPFITADATGPKHLV
CCTTTATCACCGCGGACGCTACCGGGCCTAAACACTTGGTTAAAAAACTCACTAGGGCTAAATTTGAAAGCTTGACA AKFESLTEDLMKETISKIESVIK
GAAGATCTCATGAAAGAAACGATTTCTAAAATTGAAAGCGTGATCAAAGATGCGGGGCTAACCAAAAATGAGATTTCA NEISEVVMVGGSTRIPKVQER
GAAGTGGTGATGGTGGGAGGCTCTACTCGTATCCCTAAAGTCCAAGAAAGGGTGAAAGCGTTTATTAATAAAGATTT DLNKSVNPDEVVAVGASIQGG
GAATAAAAGCGTCAATCCTGATGAGGTCGTAGCGGTGGGAGCGAGCATTCAAGGGGGCGTGTTGAAAGGCGATGT VKDVLLLDVTPLSLGIETLGGV
GAAAGATGTGCTTTTATTAGAC RGTTIPAKKSQVFSTAEDNQP
LQGERELARDNKSLGKFDLQG
GVPQIEVTFDIDANGILTVSAQ
SQEIKIS
HP0938 2843 CAGACGCTAAGAGCGTGGAATTAGAGGATTTGTATCACGAATTCAGTGAAGATAAGCGTTCTATTTTCTATTTTGCCC 2844 DAKSVELEDLYHEFSEDKRSIF
CCACAAACGCCCACAAAGACATGCTCAAAGCGGTGGATTTTTTCAAAGAAAAAGGTCATACGGCTTATTTAGATGAG NAHKDMLKAVDFFKEKGHTAY
GTGAGGGTCAGCACTGATGAAAAAGATTTTCTTTATGAATTGCACATTATTTAAAGGCTTGTATTGAAAGTTTATATTG VSTDEKDFLYELHII
AAACCATGGGTTGTGCCATGAATTCTAGGGATAGTGAGCATTTATTGAGCGAGCTGTCCAAACTAGACTATAAAGAG
ACCAATGACCCTAAAACAGCGGATTTGATTTTAATCAACACTTGCAGCGTGCGCGAAAAGCCTGAACGAAAATTGTTT
TCAGAAATCGGTCAATTCGCTAAAATCAAAAAACCCAACGCCAAAATCGGGGTTTGCGGGTGCACTGCAAGCCACAT
GGGAGCGGATATTTTGAAAA
HP0938 2845 CGCCAAAAATTCTTTTAAAATCACACCCCCAACAACTATAAAGGAGCCTAACAAGATTTTAAATTGCAACAATTCTGGC 2846 AKNSFKITPPTTIKEPNKILNCN
ACAACCATGCGTTTATACAGCGGGCTTTTAAGCGCTCAAAAAGGGCTTTTTGTTTTAAGCGGGGACAATTCCTTAAAC MRLYSGLLSAQKGLFVLSGDN
GCACGCCCCATGAAAAGAATCATTGAGCCTTTGAAGGCTTTTGGGGCAAAAATTTTAGGGAGAGAGGATAACCATTT PMKRIIEPLKAFGAKILGREDN
CGCCCCCTTAGTGATCTTAGGGAGTCCGTTAAAAGCTTGCCATTATGAAAGCCCTATCGCTTCAGCTCAAGTCAAAA LGSPLKACHYESPIASAQVKS
GCGCTTTTATTTTAAGCGCCTTACAAGCTCAAGGCGCAAGCACTTATAAAGAAAGCGAGCTTAGCCGTAACCACACA QAQGASTYKESELSRNHTEIM
GAAATCATGCTTAAAAGTTTGGGAGCTGATATTCACAATCAAGACGGCGTTTTAAAAATTTCACCCCTAGAAAAACCC DIHNQDGVLKISPLEKPLEAFD
CTAGAAGCCTTTGATTTTACGATAGCTAATGATCCGTCTAGCGCGTTTTTTTTCGCCCTCGCTTGCGCGATTACGCCA PSSAFFFALACAITPKSRLLLKN
AAAAGCCGCCTTCTTTTAAAAAATGTCTTGCTCAACCCCACTCGCATAGAAGCTTTTGAAGTTTTGAAAAAAATGGGT TRIEAFEVLKKMGASIEYAIQSK
GCTTCCATAGAGTATGCGATTCAGTCCAAAGATTTAGAAATGATTGGCGATATTTATGTAGAGCATGCCCCTTTAAAA GDIYVEHAPLKAINIDQNIASLID
GCGATCAATATTGATCAAAATATCGCCAGTCTTATTGATGAAATCCCCGCTTTAAGTATCGCTATGCTTTTTGCAAAAG IAMLFAKGKSMVKNAKDLRAK
GCAAAAGCATGGTTAAAAACGCTAAAGATTTACGAGCTAAAGAAAGCGACAGGATTAAAGCGGTTGTTTCTAATTTCA AWSNFKALGIECEEFEDGFY
AAGCTTTAGGGATTGAGTGCGAAGAGTTTGAAGATGGGTTTTATGTAGAGGGATTAGAAGATATAAGCCCATTAAAA SPLKQRFSRIKPPLIKSFNDHRI
CAGCGCTTTTCTAGGATTAAGCCCCCCCTTATCAAAAGCTTCAATGACCACAGGATTGCGATGAGTTTTGCTGTTTTA VLTLALPLEIDNLECANISFPQF
ACTTTAGCGTTG QFKKGSLNGN
HP0938 2847 ACATTAAAACTTCTGAAGGTGGCACGCATGAGGCGGGCTTTAAAATGGGCTTGTCTAAAGCAATTTTGCAATATATTG 2848 IKTSEGGTHEAGFKMGLSKAIL
GCAATAATATTAAAACCAAAGAGTCACGCCCCATCTCTGAAGATATTAAAGAGGGGTTGATCGCTGTTGTGAGCTTG IKTKESRPISEDIKEGLIAVVSL
AAAATGAGCGAGCCTTTGTTTGAAGGGCAGACTAAATCCAAACTCGGCAGTTCGTATGCGCGCGCGTTGGTTTCAAA FEGQTKSKLGSSYARALVSKL
ATTAGTCTATGATAAAATCCATCAATTTTTAGAAGAAAACCCTAACGAAGCCAAAATCATTGCCAATAAAGCCCTACTA QFLEENPNEAKIIANKALLAAK
GCTGCAAAAGCCAGAGAAGCCAGTAAGAAAGCCAGAGA KAR
HP0938 2849 GAATGGGGTGGAGATTGTAGGGTTGGAGCATTTGGATAAAGTGATTTATTTAGATCAAGCCCCCATAGGCAAAACCC 2850 NGVEIVGLEHLDKVIYLDQAPI
CACGAAGCAACCCTGCCACTTACACGGGAGTGATGGATGAAATCAGGATTTTATTTGCCGAGCAAAAAGAAGCTAAA NPATYTGVMDEIRILFAEQKEA
ATTTTAGGCTATAGTGCGAGCCGTTTTAGCTTTAATGTTAAAGGAGGGCGGTGCGAGAAATGCCAAGGCGATGGGG ASRFSFNVKGGRCEKCQGDG
ACATTAAAATAGAAATGCACTTTTTGCCTGATGTGTTAGTCCAATGCGATAGCTGTAAGGGCGCTAAATACAACCCCC HFLPDVLVQCDSCKGAKYNP
AAACTTTAGAAATCAAGGTGAAAGGCAAATCCATTGCCGATGTGTTGAACATGAGCGTGGAAGAGGCTTATGAATTTT KGKSIADVLNMSVEEAYEFFA
TTGCTAAATTCCCTAAAATCGCCGTGAAGTTAAAAACGCTTATGGATGTGGGCTTAGGCTATATCACTTTAGGGCAAA KLKTLMDVGLGYITLGQNATTL
ACGCTACGACTTTAAGTGGGGGGGAGGCTCAAAGGATCAAATTAGCTAAAGAATTGAGTAAAAAAGACACAGGCAAA QRIKLAKELSKKDTGKTLYILD
ACCCTTTATATTTTAGATGAGCCTACTACCGGTTTGCATTTTGAAGACGTGAATCATCTTTTACAAGTCTTGCATTCTT HFEDVNHLLQVLHSLVALGNS
TAGTGGCGTTAGGCAATTCTATGCTAGTGATTGAGCATAATTTAGACATTATCAAAAACGCTGACTACATTATAGACAT NLDIIKNADYIIDMGPDGGDKG
GGGGCCTGATGGGGGGGATAAGGGCGGGAAAGTCATTGCGAGCGGCACGCCTTTAGAGGTGGCGCAAAATTGCG GTPLEVAQNCEKTQSYTGKFL
AAAAAACCCAAAGCTACACGGGAAAATTTTTAGCTTTG
HP0938 2851 ATACCACCTTTTTGGCTTTACTGGCACGCCCATTTTTGCAGCTAATTGCGATAAAAACAACCCTTTAGGCACGACAGA 2852 YHLFGFTGTPIFAANCDKNNP GCAAAAGTTTGGGAAATGCCTCCACCAATACACCATTATTGATGCGATCAGGGATAAAAACGTTTTGCCCTTTAGAGT KFGKCLHQYTIIDAIRDKNVLP GG TACCACAACACCATTAAAGCTAAAGAGGACATTAAGGATAATAAGGTTAGAGCGGTTGATGAAAAAAACGCCC NTIKAKEDIKDNKVRAVDEKN TTTTGGATACTAGGAGGATCAAAGAAATCACTAAATGCATTTTAGAGCGTTTCAATCAAGCCACTAAAAATAAAAAATT IKEITKCILERFNQATKNKKFN CAATTCCATTCTGGCATGCTCTAGCATAGAAGCGCTGAAAAAATACTACCAAGCCTTTAAAGAAGAAAAACACGATCT EALKKYYQAFKEEKHDLKIAAI TAAAATCGCTGCCATTTTTAGCTATAGCGCTAATGAGGAAATTGACACGCTAGAAGATGAAAACAATGAAAGCGCTTG EEIDTLEDENNESACRLDKSS CCGGCTAGACAAAAGCTCAAGGGATTTTTTAGAGGGCGCGATTGCGGATTATAATGGGATGTTTGGCGTTTCT TTG AIADYNGMFGVSFDTSDQKFQ
ACACTTCGGATCAAAAATTCCAAAGTTATTACAAGGATCTTTCTCA LS
HP1414 2853 CCATGAAAAGGGTAGCGACGGCAATCTCCGTGGCACGCCATCTGGCACGGTTACTTCTAACACTTGGGGAGCCGG 2854 HEKGSDGNLRGTPSGTVTSN
CTGCGCGTATGTGGGAGAAACCGTAACGAATCTAAAAAACAGCATCGCTCATTTTGGCGACCAAGCGGAGCGAATC CAYVGETVTNLKNSIAHFGDQ
CATAATGCGCGAAATCTCGCCTACACTTTAGCGAATTTCAGCGGCCAGTACAAAAAGCTAGGCGAACACTATGACAG ARNLAYTLANFSGQYKKLGEH
CATCACAGCGGCGCTCTCTAGCTTGCCTGATGCGCAATCTTTACAAAATGTGGTGAGCAAAAAGACTAACCCTAACA ALSSLPDAQSLQNVVSKKTNP
GCCCGCAAGGCATACAGGATAATTACTACATTGACTCCAACATCCATTCTCAAGTGCAATCTAGGAGTCAAGAACTC QDNYYIDSNIHSQVQSRSQEL
GGCAGTAACCCTTTCAGACGCGCCGGGCTAATCGCCGCTTCTACCACCAATAACGGCGCGATGAATGGGATTGGCT RAGLIAASTTNNGAMNGIGFQ
TTCAAGTGGGCTATAAGCAATTCTTTGGGAAAAACAAACGATGGGGCGCGAGATACTACGGCTTTGTGGATTACAAC FGKNKRWGARYYGFVDYNH
CACACCTATAACAAGTCCCAATTTTTCAACTCCGATTCTGATGTTTGGACTTA FFNSDSDVWT
HP1414 2855 TTGGCGGGGATGTAGATATAGAGAGTTTGCAAAAATATGAAAACCCCACGATAGTGGAGTTCAGATATAAAGAAAAG 2856 GGDVDIESLQKYENPTIVEFRY
TTTCTTAAAGATTACAACGCTTTAAGCCACCGCCCTTTCACAGAAAAGAACGATATACAAGTGGCGCATGAAGCCCAA DYNALSHRPFTEKNDIQVAHEA
AAGGAACTTTCAAGCGCTCAAGGTAATCATTTTTATCGGTTTTTTTCTCAATTGAAGCAATGGGCTGGAGCGGATGAA SAQGNHFYRFFSQLKQWAGA
AAACGGAATTTTAGGGATCTCATAGAGGATTTTTCTTTAGAAAGCTTCACTAATTGCACGGATTTTAACCCCATAGAAA FRDLIEDFSLESFTNCTDFNPIEI
TCTATGCATACTGCATCGGCCGTTGCATCAACAACATGGAAAATGGCGTGTTTTTGAAATACTTTTTATCCTATCCCAT GRCINNMENGVFLKYFLSYPIK
CAAGTATGAAAAGCATCAGGCTGAAAAAATCAGAGAGAGTTTTGAAAAAGGCTTGAAAAAATCCTTACCCAGGCATG AEKIRESFEKGLKKSLPRHVFD
TTTTTGACGATGAAAAAACGGCTAAAATGTTCAAAGTGGAATTAAAAGCGAGCGAGCCTTGCGCGTATGCCATTAGC KMFKVELKASEPCAYAISALKS
GCTTTAAAAAGCTACGGGTTTGATAAATTTGCAAAATTAGACAAGCCCATTTATTACGGGGTGTTTGATTTTGGGGGC FAKLDKPIYYGVFDFGGGTTDF
GGGACGACGGATTTTGACTTTGGCAAATGGGAAAAAAGCGCTAATCCTAAATTCGCTTACAAAATGACGCATTTTAG WEKSANPKFAYKMTHFSNGG
CAATGGAGGGGATAAGTATTTAG
HP1414 2857 AGCAAGAAAATTTCTGTGGGAAGACATAAAAACTTTAGATGAAAAGAGTGGCGTTCATTTGTTCCCTAAAAATATTGG 2858 ARKFLWEDIKTLDEKSGVHLFP TGAAATCAAGGATAAATTTGAAACAAACAAGGAAAAATTCAAACAAAGCAAAAACTATTCTGAGTTCGCAGAATATTG KDKFETNKEKFKQSKNYSEFA CCGAGAGTGTAACCCTTATACAGCGTTTCAAAACTTAAGAAATAAAGTTCAATTCCCTTTAAGCGGTGGTTTATCTTAT CNPYTAFQNLRNKVQFPLSGG AAGTCTTACAAACTCGTGCCAACCATGAAAGAATACA YKLVPTMKEY
HP1414 2859 TGAAGAAAATTGCGCATGTTATGCTTGCAAACGCTATTCTAAAGCCTATTTGCACCATTTATTTAGGGCTAAAGAACT 2860 EENCACYACKRYSKAYLHHLF CACTTACGCTCGTTTGGCCAGCTTGCACAATTTGCATTTTTATTTAGAGCTGGTGAAGAACGCCAGAAACGCCATTTT YARLASLHNLHFYLELVKNARN AGAA GCGGTTTTTGAGTTTTAAAAAAGAATTTTTGGAGAAATACAACTCCCGCTCTCATTG TGATGGAATGCAA FLSFKKEFLEKYNSRSH AAATACTAAAAAGCGTTTTTTACCATCAATAAAAGTTTTCTTAAAAATAAGGCTTTAGTTTTT TTTTTTGAT
HP1414 2861 TATCCATAGAAAACCCCATTATCAAAACAAGAGCAATGGGAACTTATGCGGATTTGATTATCATCACAGGCTCATTAG 2862 SIENPIIKTRAMGTYADLIIITGSL AGCAAGTCAATGGGTATTACAACATTCTAAAAGCGCTCAACAAACGCAACGCTAAGTTTGTGTTAAAAATCAATGAGA YYNILKALNKRNAKFVLKINEN ACATGCCTTATGCCCAAGCGACTTTTTTAAGAGTGCCAAAAAGAAGCGATCCTAATGCCCACACGCTTGATAAGGGA TFLRVPKRSDPNAHTLDKGASI GCGTCAATTGA
HP1414 2863 ATGAAGTTCAAAGAAATGAAGCTCAAAAAGAAACCCCCCAATCCAATCAAACGCCTAAAGAAATGAAAGTCAAGTCCA 2864 EVQRNEAQKETPQSNQTPKE
TTTCTTATGTCGGGCTTTCTTACATGTCTGACATGCTCGCTAATGAAATTGTAAAGATTCGTGTGGGCGATATTGTGG SYVGLSYMSDMLANEIVKIRVG
ATTCTAAAAAAATAGACACCGCTGTTTTGGCTTTGTTCAATCAAGGGTATTTTAAAGACGTTTATGCCACTTTTGAAGG KIDTAVLALFNQGYFKDVYATF
CGGCATATTAGAGTTTCATTTTGATGAAAAAGCCAGGATTGCCGGGGTAGAAATCAAGGGTTATGGGACTGAAAAGG FHFDEKARIAGVEIKGYGTEKE
AAAAAGACGGCTTAAAATCCCAAATGGGGATCAAAAAGGGCGACACCTTTGATGAGCAAAAATTAGAGCATGCTAAA SQMGIKKGDTFDEQKLEHAKT
ACGGCTTTAAAAACCGCTTTAGAGGGGCAGGGCTATTATGGGAGCGTGGTGGAGGTGCGCACAGAAAAGGTCAGT EGQGYYGSVVEVRTEKVSEG
GAGGGTGCATTATTGATCGTGTTTGATGTGAATAGGGGGGATAGCATTTATATCAAACAATCCATTTATGAGGGAAG VNRGDSIYIKQSIYEGSAKLKR
CGCGAAATTAAAACGCCGCATGATTGAATCTTTGAGTGCGAACAAGCAACGAGATTTCATGGGCTGGATGTGGGGC SANKQRDFMGWMWGLNDGK
TTGAATGACGGGAAATTGCGTTTAGATCAACTAGAATACGATTCTATGCGTATCCAAGATGTGTATATGCGTAGGGGT EYDSMRIQDVYMRRGYLDAHI
TACTTAGACGCTCATATTTCTTCGCCTTTTTTGAAAACGGATTTTTCTACCCATGACGCTAAGCTTCATTATAAAGTCA TDFSTHDAKLHYKVKEGIQYRI
AAGAGGGGATCCAATACAGGATTTCAGACATTTTAATAGAGATTGACAACCCGGTAGTCCCCTTAAAAACCTTAGAAA DNPVVPLKTLEKALKVKRKDVF
AAGCGCTTAAAGTGAAAAGGAAAGATGTCTTTAATATTGAGCATTTAAGAGCGGATGCGCAAATTTTAAAAACCGAAA ADAQILKTEIADKGYAFAVVKP
TCGCCGATAAGGGTTATGCGTTTGCGGTGGTGAAGCCAGACTTGGATAAAGATGAAAAAAACGGGCTTGTGAAAGT KNGLVKVIYRIEVGDMVYINDVI
CATTTATCGTATTGAAGT RTSDRIIRRELLLGPKDKYNLT
NSLRRLGFFSKVKIEEKRVNSS
VSVEEGRTGQLQFGLGYGSY
GSVSERNLFGTGQSMSLYANI
RSYPGMPKGAGRMFAGNLSL
DSWYSSTINLYADYRISYQYIQ
GVNVGRMLGNRTHVSLGYNL
GFSSPLYN
HP1414 2865 AGAATCCTTACGAGCCTTAAAAGCTTCGCAAGAAGTGCAGGCTAACACGCTTAAGCAGCAATCGCAAACTTTAGAGG 2866 ESLRALKASQEVQANTLKQQSQ
ATTTGAGGAATGAGATTCACGCTAACCAGCAAGCTATCCAGCAGTTAGACAAGCAAAATAAAGAGATGAGTGAATTAT RNEIHANQQAIQQLDKQNKEMS
TGACCAAGTTAAGCCAGGATTTGGTTTCACAAATCGCCTTAATCCAAAAAGCTCTCAAAGAACAAGAGGAAAAAGCT LSQDLVSQIALIQKALKEQEEKA
GAAAAGCCGCTCAAATCAAACGCTCCGGCTAATAAAACCCCCTCTTTGAAAGCCGAATCCCCAAAAAATCAAGAGGG IAPANKTPSLKAESPKNQEG
AAAAACTCAAGAAAAGGCGAAAATTGAGTTTGATAAAGACTTGTCTAAGCAAAAAGAGATCTTTCAAGAAGCTCTGTC AKIEFDKDLSKQKEIFQEALSFF
M l I I I l AAAAATAAATCCTATGCAGAAGCCAAAGAGCGTTTGTTGTGGTTAGAAGCCAATAGTTACAGACTTTATTAT AEAKERLLWLEANSYRLYYVRY
GTGCGTTATGTTCTTGGAGAAGTGGCTTATG AY
HP1413 2867 CGATATTTTAAGCTCTATTTTTGGGAAAGGAGGCTTTTCGCAAAGATTTTCTCAAAACTCGCAAGGCTTTTCTGGCTTT 2868 DILSSIFGKGGFSQRFSQNSQG
AATTTTTCCAATTTCGCCCCTGAAAATTTAGACATAACCGCCGCTTTAAATGTCTCTGTTTTAGACACCCTTTTAGGCA FSNFAPENLDITAALNVSVLDTL
ATAAAAAACAAGTGAGCATCAATAATGAGACTTTTAGCCTTAAAATCCCTATTGGCGTGGAAGAGGGCGAAAAGATTA QVSINNETFSLKIPIGVEEGEKIR
GGGTTCGCAACAAGGGGAAAACGGGGCGAACGACTAGGGGCGATTTGCTCTTAGAGATCCATATTGAAGAAGATGA GKTGRTTRGDLLLEIHIEEDEM
AATGTATAGGCGCGAGAAAGATGATATTACCCAAATCTTTGATTTACCCTTAAAAACGGCTCTTTTTGGAGGGAAAAT DDITQIFDLPLKTALFGGKIEIAT
TGAAATCGCTACTTGGCATAAAACCTTAACCCTAACCATTCCCCCTAACACCAAAGCGATGCAAAAATTCCGCATTAA TLTIPPNTKAMQKFRIKEKGIKN
AGAAAAAGGGATCAAAAACAGAAAAACTTCGCATGTGGGGGATTTGTATTTGCAGGCTCGTTTGATTTTGCCTAAAAC VGDLYLQARLILPKTETLSNELK
TGAAACGCTTTCTAATGAGTTGAAAGCGTTATTAGAAAAAGAATTGTAAGGAGGAATCGT EL
HP1413 2869 TTTGGAATACGCTGATCCTAGCACTTCTAAAAAGAGAGCCGATAAGGGATTAAAAAAGGTGTTCAAAGACAGCAAAA 2870 LEYADPSTSKKRADKGLKKVFK
AAGACGCTTGCGGGTTCATCTATGAGATCAGCGAGTTCATGAAAGCCTATACCGCATTGCTAAAAAAACAAGACCGA ACGFIYEISEFMKAYTALLKKQ
TACGTCTATTTATTGAGGTATCTCCCCTCTAGGTATTGGGCCAGCATTTTAACGACTGCCCTTTATGTCAAATACCCT LRYLPSRYWASILTTALYVKYP
GATTTTGACGCTTTGAAAAAGCTTTTGGTGTCTTATTATTACCAAACTTGGATTGCAGGAGGCACGATCACGCGCATC KLLVSYYYQTWIAGGTITRIKQT
AAGCAAACCAGTATCAACATTATCAAAAACGTTAAAAGCAATAAGAGCGTTGAAACCATCAAAGAGCTTATATTGAAT VKSNKSVETIKELILNSIDSYNTF
AGCATCGACTCTTATAACACCTTTGATCAATACCTCTATAACTTATGGGATAGCTCTTCTGTTTATCATAGCAAATGGG NLWDSSSVYHSKWVRPVLALA
TGCGTCCTGTCTTAGCCCTAGCTAATTATTTCATGGCAGATGAAGAGAAACCCCATTTTATCGCTATGGATGCCGAAA DEEKPHFIAMDAETQVEHILPQ
CCCAAGTGGAGCATATTTTGCCACAAACGCCCAAAAGAGGCAGTCAATGGAACGCGGATTTTGACAAAGAAAAAAGA SQWNADFDKEKREEWVNNIAN
GAAGAATGGGTAAATAATATCGCGAATTTAACCCTTTTAAAGCGTAAAAAGAACGCGCATGCTTTAAACGGGGATTTT RKKNAHALNGDFDEKRKIYGG
GATGAAAAAAGAAAAATTTATGGAGGCAAAGACACGAGCAAAGTGATTAGCTGTTATGACATCACTAAAGAATTGTAT VISCYDITKELYSNYRKWNEKS
AGCAATTATAGGAAGTGGAATGAGAAGTCCCTCCAAGAGCGATACAAATCTTTGTATAACACTATCACGCCTGTTTTA KSLYNTITPVLHIEGQEDDFEDD
CACATAGAGGGGCAAGAAGATGATTTTGAAGATGATTTTGATCTAGAATGATTAAAGATTGCCAAGCATCAAAACAAC
A GAGGTGATCAATGCCTAAAAAAGAGCTATTAAAGATGTCAAAGAAAAGGATTTTTAAAGACTTCTTAAAAGAAGC
CAAACAGCACCG
HP1413 2871 GGCGTGATTTTTATCCGCTATATCCCTAAAGATAAAATGGTAGAAAGCAAGTCCTTAAAACTCTATTTGTTCAGTTACA 2872 GVIFIRYIPKDKMVESKSLKLYL GAAACCATGGGAGTTTTCATGAGAGCTGTATCAATACGATTTTGCTAGATTTAGTCCGATTGCTAGAGCCAAAGTATT HGSFHESCINTILLDLVRLLEPK TGGAAGTGTATGGGGATTTTGCCTCTAGGGGTGGGATTGCGATCAAGCCCTTTGTGAATTATGCGATCAAAGAATAC GDFASRGGIAIKPFVNYAIKEYQ CAGGACTTTAAAGAGA CGCCTTTTGAATGCGAAATMCGCGCAAAGATAGGGTTTTAAGATTAAAGCCATTCTAAA RLLNAK TTAAACGGACAGATTAGGGG
HP1533 2873 AGAAGGTTTGAGGAACATCACGGCTGGGGCTAACCCTATTGAAGTGAAACGAGGCATGGATAAAGCCGCTGAAGCC 2874 EGLRNITAGANPIEVKRGMDKA
ATTATTAATGAGCTTAAAAAAGCGAGCAAAAAAGTGGGCGGTAAAGAAGAAATCACCCAAGTGGCGACCATTTCTGC ELKKASKKVGGKEEITQVATIS
AAACTCCGATCACAATATCGGGAAACTCATCGCTGACGCTATGGAAAAAGTGGGTAAAGACGGCGTGATCACCGTT NIGKLIADAMEKVGKDGVITVE
GAAGAAGCTAAGGGCATTGAAGATGAACTAGATGTTGTAGAAGGCATGCAATTTGATAGAGGCTACCTCTCCCCTTA DELDVVEGMQFDRGYLSPYFV
TTTTGTAACAAACGCTGAGAAAATGACCGCTCAATTGGATAACGCTTACATCCTTTTAACGGATAAAAAAATCTCTAG MTAQLDNAYILLTDKKISSMKDI
CATGAAAGACATTCTCCCGCTACTAGAAAAAACCATGAAAGAGGGCAAACCGCTTTTAATCATCGCTGAAG TMKEGKPLLIIAE
HP1533 2875 TCGCTCCCCTTTATGAAGAACACCACCAGGTGCGTTATGATGAGAGCGTGTTTAAGGCATGCGTGGATTTAACGAGT 2876 APLYEEHHQVRYDESVFKACV
GATTACATGCATGATAAATTCTTGCCGGATAAAGCGATTGAATTATTAGATGAGGTGGGATCGAGGAAAAAAATCAGC YMHDKFLPDKAIELLDEVGSRK
CCTAAAAAGGGCAAAAAAATCGGCGTTGATGATGTGAAAGAAACGCTCGCTCTAAAGCTTAAAATCCCTAAAATGCG GKKIGVDDVKETLALKLKIPKM
TTTGAGCAGCGACAAAAAAGCCCTTTTAAGGAATTTGGAAAAATCGCTTAAAAATAAGATTTTTGCCCAAGCAGAAGC KALLRNLEKSLKNKIFAQAEAIS
GATCAGTCTTGTCAGCAATGCGATTAAAATCCAGCATTGCGGGCTTTCTGCAAAAAATAAGCCTGTGGGGAGCTTTT KIQHCGLSAKNKPVGSFLFVGP
TATTCGTGGGGCCTAGTGGGGTGGGGAAAACAGAATTGGCTAAAGAATTGGCCTTGAATTTGAATTTGCATTTTGAA TELAKELALNLNLHFERFDMSE
CGCTTTGACATGAGCGAATACAAAGAAGCCCATAGCGTGGCAAAACTCATCGGGAGTCCTAGCGGTTATGTGGGGT SVAKLIGSPSGYVGFEQGGLL
TTGAACAAGGGGGGTTATTGGTGAATGCGATTAAAAAACACCCGCATTGTTTGCTGCTTTTAGATGAGATAGAAAAAG HPHCLLLLDEIEKAHSNVYDLL
CCCATTCTAACGTGTATGATTTGTTGTTGCAAGTGATGGATAACGCCACTTTGAGCGATAATTTAGGCAATCAGGCGA NATLSDNLGNQASFKHVILIMT
GTTTTAAGCATGTGATTTTGATTATGACTTCAAATGTGGGGAGTAAGGATAAGGATACGCTAGGGTTTTTTAGCGCTA KDKDTLGFFSAKNTKYDKAVK
AAAACACCAAGTATGATAAAGCCGTTAAAGAGCTTTTGACCCCTGAATTACGCTCCAGGATTGATGCGATCGTGCCG LRSRIDAIVPFNALSLEDFERIV
TTT CGCGCTCAGπTGGAGGAπTTGAACGCATTGTTTCTGTAGAATTAGACAAATTAAAAGCCCTAGCGCTAGAG LKALALEQDITLKFHKE
CAAGACATAACCTTAAAATTCCATAAAGAAGT
HP1533 2877 AAGAAAAAGACTACACGCAAGGGGGTTATGGGGTTTTATTTGAAGGTTTAGACTCTAGCGATAACGCTTTAATCTTAC 2878 EKDYTQGGYGVLFEGLDSSDN
AGCACCTCCAGCAAAACCAAATCCCTTATAAAGTCTCAAAGGACGACACCATCCTTATCCCTAAAGATAAAGTGTATG LQQNQIPYKVSKDDTILIPKDKV
AAGAAAGGATCACTCTGGCTTCTCAAGGGATCCCTAAAACGAGTAAAGTGGGCTTTGAAATCTTTGACACTAAAGAC LASQGIPKTSKVGFEIFDTKDF
TTTGGAGCGACTGATTTTGATCAAAATATCAAACTCATTCGCGCCATTGAGGGGGAATTGTCGCGCACGATTGAAAG QNIKLIRAIEGELSRTIESLNPIL
TTTAAACCCCATTTTGAAAGCCAATGTGCATATTGCAATCCCTAAAGACAGCGTGTTTGTGGCTAAAGAAGTCCCTCC AIPKDSVFVAKEVPPSASVMLK
TAGCGCTTCGGTGATGCTCAAACTCAAGCCTGACATG GCTTTCACCCACTCAAATTTTAGGGATTAAAAATTTAAT KLSPTQILGIKNLIAAAVPK
CGCTGCAGCTGTGCCTAAACT
HP1533 2879 GCCCTATATCGCTAGAAATTATCCTTTGGAAAAATCCGTCCTCAAAGAACCGCATGAAGCCCTTTTTGGGGGGGTTA 2880 PYIARNYPLEKSVLKEPHEALF AAGGCGATGAGATCTTAAAAGAAATCGTTTT TTTTTTAAGGCCCCGGCCTTAAAAAATTTTAAAAAAAAAATTCCCCCCTTTTTTTTTGGTTTGTGAAATGGGGTA DEILKEIVFLAAKLKIPFLVCEM ITGACCAGTTGAAAAGCTTGAAAGAATGCTTGGAATTTTGCGGTTATGATGCAGAGTT TTACAAGG SLKECLEFCGYDAEFYK
HP1533 2881 TTCCATAGCCTTAAGCGTGCGTTGCGTGATCCATTCTTTAGAAAAAACCCTGAATGATGAAGAGGTCAATTCAGCCGT 2882 SIALSVRCVIHSLEKTLNDEEVN
GCAAAAAGCACTTGAAATTTTAGAAAAAGAATTTAACGCCCGCCTTAAAGGATAATATAAAGGATAATATGTGATAGA ALEILEKEFNARLKG
GCTTGACATTAACGCTAGCGATAAATCGCTCTCACACAGAGCCGTTATTTTTAGCCTGCTCGCTCAAAAACCTTGTTT
CGTGCGGMTTTTTTAATGGGAGAAGATTGTTTAAGCTCTTTAGAAATCGCTCAAAATTTAGGGGCTAAAGTGGAAAA
TACCGCCAAAAATTCTTTTAAAATCACACCCCCAACAACTATAAAGGAGCCTAACAAGATT ΓAAATTGCAACAATTCT
GGCACAACCATGCGTTTATACAGCGGGCTTTTAAGCGCTCAAAAAGGGCTTTTTGT ΓTTAAGCGGGGACAATTCCTT
AAACGCACGCCCCATGAAAAGAATCATTGAGCCTTTGAAGGCT
HP1533 2883 TCTTGTTTTTGTCTCTGGGCAATTAGGCATTGATGTAAGCACCGGCGAGTTTAAAGGCGCAGACATTCATTCTCAAAC 2884 LVFVSGQLGIDVSTGEFKGADI CACGCAATCCATGGAAAATATCAAAGCGATTTTAAAAGAAGCAGGGTTAGGGATGGATAGCGTGGTTAAAACGACTA QSMENIKAILKEAGLGMDSWK TTTTATTGAAAAGTTTAGACGATTTTGCGGTGGTGAATGGAATCTATGGGAGTTATΠTACAGAGCCTTATCCGGCCA SLDDFAWNGIYGSYFTEPYPA GAGCGACCTTTCAAGTGGCTAAACTGCCTAAAGACGCTTTAGTAGAAATTGAAGCGATAGCC VAKLPKDALVEIEAIA
HP1533 2885 CTTCACCAACCGCAAAAAGCGTTTCAGAGAAAACGCGCAAAAAAACGCAGAGTATTCAAACCATGAAGCGTCTTCGC 2886 FTNRKKRFRENAQKNAEYSN ACCATAAAAAAGAGCATCGCCCTAACAAAAAACCAAACAACCACCACAAACAAAAACATGCCAAAACACGAAATTACG HKKEHRPNKKPNNHHKQKHA CCCAAGAAGAATTGGATAGCAACAAAGTAGAGGGCGTTACGGAAATTTTGCATGTGAATGAGAGAGGGACTTTAGG QEELDSNKVEGVTEILHVNER CTTTCATAAGGAGTTAAAAAAGGGCGTTGAAGCGAATAACAAGATCCAAGTGGAGCATTTAAACCCGCATTATAAGAT KELKKGVEANNKIQVEHLNPH GAACTTAAACTCTAAAGCGAGCGTTAAAATCACGCCTTTAGGGGGCTTGGGTGAGATTGGGGGGAACATGATGGTC SKASVKITPLGGLGEIGGNMM ATTGAAACCCCAAAAAGCGCGATCGTGATTGATGCGGGCATGAGCTTCCCTAAAGAGGGGCTCTTTGGCGTGGATA AIVIDAGMSFPKEGLFGVDILIP TTAATCCCGGATTTTTCCTACTTGCACCAAATCAAGGACAAAATCGCTGGCATTATCATCACCCATGCCCATGAAG QIKDKIAGIIITHAHEDHIGATPY ATCACATAGGGGCCACGCCTTATTTGTTTAAAGAGCTGCAATTCCCCCTTTATGGCACGCCCTTGAGTTTGGGGCTG FPLYGTPLSLGLIGSKFDEHGL ATTGGGAGCAAGTTTGATGAACATGGTTTGAAAAAATACCGCTCGTATTTTAAAATCGTAGAAAAGCGCTGTCCCATT YFKIVEKRCPISVGEFIIEWIHIT AGCGTGGGCGAATTTATCATTGAATGGATCCACATCACGCATrCTATCATTGACAGCAG
Figure imgf000437_0001
HP1224 2895 CGATGAAGTCAAAACGCTGTGCAAACAGCTCAAAATTTCAAACTTTAGAGCCAGTATTTCTCAAATCAAAAACGGCAT 2896 DEVKTLCKQLKISNFRASISQIK
GATGGATTTGAGCATGCAAGATAGCGAATGCTATAAAGCGTATGAGCTTTATCAAAACGCGCTCAAAAAAGACAATTT DLSMQDSECYKAYELYQNALK
AGTGGATTTTGACGATTTGCTTTTTTTAAGCCTTAAGATTTTACAAGATAATGAAACCATCGCCAAAGAGACTAGCGA DFDDLLFLSLKILQDNETIAKET
Figure imgf000438_0001
GCGCTACCACTACATTATGGTAGATGAGTATCAAGACACGAACGCCTTGCAATTGGAGTTTTTAAAAAAATTGAGTTT YIMVDEYQDTNALQLEFLKKLS
CACGCACCATAATTTGTGCGTGGTGGGCGATGACGATCAGAGCATTTATGGGTTTAGGGGGGCTGATATTTCTAACA LCWGDDDQSIYGFRGADISNI
TTTTAAATTTTTCCAAGCATTTTAAAGGGGCTAAAATAGTGAAATTAGAGACCAACTACCGCTCTAGCGCTGAAATTTT FKGAKIVKLETNYRSSAEILAC
AGCGTGCGCTAATTCTCTCATCAGCCACAACCAACACCGCCATATTAAAACGCTTCAAAGTTTCAAAGGCTCGCATAA NQHRHIKTLQSFKGSHKSWC
AAGCGTGGTTTGTAAAGAATATTTGACGCAAAAAGAAGAGAGCCTGGATGTGGCTTATCAAATCAAAGCCCTTTTAAA KEESLDVAYQIKALLKKGENLE
GMGGGCGAG TTTAGAAMTATCGCTATTTTGTATCGCTTAMCGGGCTTAGCCGCAGCATTGAAGAGAGCTTGA LNGLSRSIEESLNALNIPYRLIG
ACGCTTTGAATATCCCTTATAGGCTCATTGGGGCGCTGAGTTTCTATGAAAGAGCCGAGATTAAAGACGCTTTGGCG RAEIKDALAFMHLVAKKDDRFF
TTCATGCATTTAGTGGCTAAAAAAGACGATCGCTTTTTTATTAAGCGCGTTTTAAACAAGCCCCCAAGAGGTCTTGGC KPPRGLGKITQEWIFSLLDEEG GATCACTCAAGAATGGATTTTTTCTCTTTTAGATGAAGAGGGTTTGAATTTAGAAGAGGCGCTAAAACTTGGGGCG LKLGAFKDKLNPKNEYALKQFI
TTTAAAGACAAATTAAACCCTAAAAACGAATACGCTTTGAAACAATTTATCGCTATGATAGGGCGTTTGAGGGAGGCT REAFEISVEEFCSRFLEETNLL
TTTGAAATTTCA DNYEEREGFVKELLTLVKEYFK SLLDFLNESVLDAHNTENAQK HMSKGLEFKHVFVIGLEEGF
HP0134 2897 TCCCCACCCCTTACATGAAAGAGCCTAAGCAAGATGGAGCCAGAACGGCAGTCGTGCATAAAGATGGGGTCCATTT 2898 PTPYMKEPKQDGARTAWHKD
AGAATGGGTGGCCCTTGGGTATAAAGTGCCTGCTTTCAAGCATAAAGATCAAGTCGCCTTAGACGCACTAAGTAGGC WVALGYKVPAFKHKDQVALDA
TTTTAGGCGAAGGCAAAAGCTCGTGGTTGCAAAGCGAATTAGTGGATAAAAAACGCTTGGCTTCTCAAGCTTTCTCG GEGKSSWLQSELVDKKRLASQ
CACAACATGCAATTACAAGATGAAAGCGTGTTTTTATTCATTGCGGGGGGTAATCCTAATGTCAAAGCCGAAGCCTTA MQLQDESVFLFIAGGNPNVKA
CAAAAAGAAATCGTAGCGCTTTTAGAAAAGCTGAAAAAAGGCGAAATCACTCAAGCGGAATTAGACAAGCTCAAAAT IVALLEKLKKGEITQAELDKLKI
CAATCAAAAAGCTGACTTTATTTCTAATTTAGAAAGTTCTAGCGATGTTGCGGGGCTTTTTGCGGACTATTTAGTGCA ISNLESSSDVAGLFADYLVQND
AAACGATATTCAAGGCTTGACGGATTACCAGCGACAATT TTGGATTTAAAAGTGAGCGATTTGGTGCGTGTGGCCA YQRQFLDLKVSDLVRVANEYF
ATGAATATTTTAAAGACACCCAATCAACCACCGTGTT TTGAAACCTTAAAAGAGCCTT TVFLKP
HP1223 2899 CAGACATCCATTTCAAAACCCTTGACAGCAACCAGAGCGTGGAAACGATTGAAGTAGAGATTATATTACCTAGATAGT 2900 DIHFKTLDSNQSVETIEVEIILPR
GAATCAACGAATGAAAAGCCACTTCCAATACAGCACGCTAGAAAATATCCCTAAAGCCTTTGACATTCTCAAAGACCC
CCCTAAAAAACTCTATTGTGTGGGCGATACCAAGCTTTTGGACACGCCTTTAAAAGTGGCGATCATAGGCACAAGAA
GACCCACCCCTTACAGCAAGCAACACACGATCACTCTAGCTAGAGAGCTTGCTAAAAATGGCGCGGTTATTGTGAGT
GGGGGAGCGTTAG
HP1223 2901 GGATTTATGAGATTGAAGTCAAGCTTGGATCGGGGGTTGTGGGCGTGTTTAAAATTGATGTGGTGGCTGAGTAGAAA 2902 IYEIEVKLGSGVVGVFKIDWAE ATGTTTGAAGCGACGACGATTTTAGGCTATAGAGGGGAATTGAATCATAAAAAGTTCGCGCTCATTGGAGGCGATGG GCAGGTAACTTTGGGTAATTGCGTGGTCAAAGCCAATGCGAC
HP1223 2903 CTTCAGCCCTCATTCAATGGGAAAATAACCCAAGCGCTAAAATAGCCACTTATGCGGTGTATCGCTTTGAAGCCAAC 2904 SALIQWENNPSAKIATYAVYRF
TCTAAAACCCCTTTGCGCTTTGGGAATATCACCAAAAACCAATTCGTGGATAAGGACATGAAAGTGGGCGTGGCTTA TPLRFGNITKNQFVDKDMKVG
TCGCTATCAAGTGGTGAGCGTGGATAAAGATGGATTAGAGTCGCACCCAAGCAAAGAAGTGCGTTTGTTTTTAGAGC QVVSVDKDGLESHPSKEVRLF
GCTAAAAGGGTTTTAATGCCCCATTTTTTAGCCAAGCTGGATTTTAAACCTTTAGAATACCCCTTAATTGAAGGGGATT
TTTGTTTTCATAGGGMTTTTTAAGCTTAAAAAACCCCACTAAAAGCTGTGTGTATGCGAGTTTTAAGGATCGTATTTT
TTTATTGCAAAAAATCAGGCGAGCGAATGATTTTTTAATCAAAAGCGAAAAAGCAACGCCTTTAAAAAGAGAGGTTTT
AAAACAAGCTTTAAGGATTTATTCGCAATCTTTTGAGGTCATTTCGCATAATTTGCAAGAAAATTCTAAACATGCGAGC
GGAAAAAAAACCCTTGATTTAGGAACTTTTGAAGACTTTATTCAAAAAAATCAAGCCCCTATTTTAATAGAAATTGGTT
TTGGGAGCGGGAGGCATTTGATAGAATTAGCCAAAAACAACCCCACTAAAACATGCTTAGGGATAGAGATTCACACC
CCGTCTATCGCGCAAGCGTTAAAGCAAATTGAGTTATTGGATTTAAAAAATCTGCACA
HP1223 2905 GGTTTTAGCGGGTTTATTAACTTTTTTAAACAGGAATGAATGCAACATTGTGGGCGTGTCTTATTTGGGCTATAAAGA 2906 VLAGLLTFLNRNECNIVGVSYL
CAAGTATTCTAGCCATTGTGAAGTGAGTTTTGAAATAGCCACAGATAAGGCGGATTGGATCAGAGCCTTAATCAATC
GCAAATATCAGGATAGGATTGTAGAATTATCCAGTCTGGATGACGCTTATGAATCATAATAAGCCCTAATTAAGGAAT DRIVELSSLDDAYES
GAACATGGAACAAAAAATCGCTATCGCCTTAAAAGAGATCGCTAGAGGCACTAATGAAATCATTGGATTAGAATACAT
TGAAAAGTTGGTGAGGAAATATTATGAAACCAATGAACGCTTTATCGTTAAAGCCGGGTTTGATCCTACCGCTCCAGA
TTTGCATTTAGGGCATACGGTATTGATCCAAAAACTGGCTTTATTGCAGCAATATGGGGCTAGGGTTAAGTTTTTGAT
TGGGGATTTTACCGCTATGATAGGCGATCCTACAGGGAAAAATGAAACCAGAAAGCCCTTAAACCGGGAGCAAGTCT
TAGAA CGCTAAAACTTATGAAGAGCAAATCTATAAAATTTTAGATGAAAAACACACCGAAGTGTGCTTTAATTCCAC
TTGGTTGGATGCTTTAGGCGCAAAAGGCATGATAGAATTGTGCGCGAAGTTTTCAGTCGCTAGAATGCTAGAAAGGG
ACGATTrCACTAAACGCTATAAAGAA TCGCCCCATTAGCATCGTGGAATTTTTATACCCTTTGTTGCAG
HP1223 2907 AGGTGTTCAAAGACAGCAAAAAAGACGCTTGCGGGTTCATCTATGAGATCAGCGAGTTCATGAAAGCCTATACCGCA 2908 VFKDSKKDACGFIYEISEFMKA
TTGCTAAAAAAACAAGACCGATACGTCTATTTATTGAGGTATCTCCCCTCTAGGTATTGGGCCAGCATTTTAACGACT KQDRYVYLLRYLPSRYWASILT
GCCCTTTATGTCA TACCCTGATTTTGACGCTTTGAAAAAGCTTTTGGTGTCTTATTATTACCAAACTTGGATTGCAG KYPDFDALKKLLVSYYYQTWIA
GAGGCACGATCACGCGCATCAAGCAAACCAGTATCAACATTATCAAAAACGTTAAAAGCAATAAGAGCGTTGAAACC IKQTSINIIKNVKSNKSVETIKELI
ATCAAAGAGCTTATATTGAATAGCATCGACTCTTATAACACCTTTGATCAATACCTCTATAACTTATGGGATAGCTCTT YNTFDQYLYNLWDSSSVYHSK
CTGTTTATCATAGCAAATGGGTGCGTCCTGTCTTAGCCCTAGCTAATTATTTCATGGCAGATGAAGAGAAACCCCATT LALANYFMADEEKPHFIAMDAE
TTATCGCTATGGATGCCGAAACCCAAGTGGAGCATATTTTGCCACAAACGCCCAAAAGAGGCAGTCAATGGAACGC ILPQTPKRGSQWNADFDKEKR
GGATTTTGACAAAGAAAAAAGAGAAGAATGGGTAAATAATATCGCGAATTTAACCCTTTTAAAGCGTAAAAAGAACGC NIANLTLLKRKKNAHALNGDFD
GCATGCTTTAAACGGGGATTTTGATGAAAAAAGAAAAATTTATGGAGGCAAAGACACGAGCAAAGTGATTAGCTGTTA GGKDTSKVISCYDITKELYSNY
TGACATCACTAAAGAATTGTATAGCAATTATAGGAAGTGGAATGAGAAGTCCCTCCAAGAGCGATACAAATCTTTGTA KSLQERYKSLYN
TAACA
HP1102 2909 GAGCGAGTGTGTTGAGCGCGTTACTTCTTGTAGGCTTAGGGGCAGCCCCTAAACATTCAGTTTCAGCTAATGACAAA 2910 ASVLSALLLVGLGAAPKHSVSA
CGGATGCAGGATAATTTAGTGAGCGTGATTGAAAAACAGACCAATAAAAAGGTGCGTATTTTAGAAATCAAACCTTTA QDNLVSVIEKQTNKKVRILEIKP
AAATCTAGCCAGGATTTAAAAATGGTCGTTATTGAAGATCCGGACACTAAATACAATATCCCGCTTGTGGTGAGTAAG DLKMWIEDPDTKYNIPLWSK
GATGGTAATTTAATCATAGGGCTTAGCAACATATTCTTTAGCAATAAAAGCGATGATGTGCAATTAGTTGCAGAAACC LSNIFFSNKSDDVQLVAETNQK
AATCAAAAAGTTCAAGCTCTTAACGCCACCCAACAAAATAGCGCGAAATTGAACGCTATTTTTAATGAAATACCGGCT ATQQNSAKLNAIFNEIPADYAIE
GATTATGCGATAGAGTTGCCCTCTACTAACGCTGCAAATAAGGATAAAATCCTTTATATTGTCTCTGATCCCATGTGC AANKDKILYIVSDPMCPHCQKE
CCACATTGCCAAAAAGAGCTCACTAAACTTAGGGATCATTTAAAAGAAAACACCGTGAGAATGGTCGTGGTGGGGTG DHLKENTVRMWVGWLGVNSA
GCTTGGGGTCAATTCAGCTAAAAAAGCGGCTTTAATCCAAGAAGAAATGGCGAAAGCTAGGGCTAGGGGAGCGAGC IQEEMAKARARGASVEDKISILE
GTGGAAGATAAGATCTCTATTCTTGAAAAGATTTATTCCACCCAATACGATATTAACGCTCAAAAAGAGCCTGAAGAT QYDINAQKEPEDLRTKVENTTK
TTACGCACTAAAGTGGAAAATACCACTAAAAAGATTTTTGAATCTGGCGTGATTAAGGGTGTGCCTTTCTTATACCATT GVIKGVPFLYHYKA
ATAAGGCATGATATAAGGTTGCTCTCATGAAAAAACCCTATAGGAAGATTTCTGATTATGCGATCGTGGGTGGTTTGA
GCGCGTTAGTGATGGTGAGCATTGTGGGGTGTAAGAGCAA
Figure imgf000439_0001
HP1102 2911 TGCATTGGGGGGGAATGTTAAAATGATCGTGGAAAAACAAAAAATTAACACCCAAACAGAAATCCAAAACATGCAAATi 2912 ALGGNVKMIVEKQKINTQTEIQ
CGCGCTCCAAAAAAATAACGAAATAATCAAGCTCAAAATGAACCAGCAAAACGCTCTCTTAGAAGCGTTAAAAAATAG QKNNEIIKLKMNQQNALLEALK
CTTTGAACCGAGCGTTACTCTAAAAACACAAATGGAAATGCTTTCTCAAGCTCTAGGAAGTTCTTCTGACAACGCTCA SVTLKTQMEMLSQALGSSSDN
ATACATCGCTTACAATACGATTGGTATCAAGGCGTTTGAAGAAACCTTAAAAGGTTTTGAAACATGGTTGAAAGTGGC NTIGIKAFEETLKGFETWLKVA
TATGCAAAAAGCGACCCTTATTGATTATAATTCCCTAACGGGTCAGGCTTTGTTTCAAAGTGCCATCTATGCGCCTGC IDYNSLTGQALFQSAIYAPALSF
TCTTAGTTTTTTTTCAAGCATGGGTGCACCATTTGGAATCATTGAAACATTCACTCTAGCGCCCACAAAATGCCCCTA APFGIIETFTLAPTKCPYLDGLKI
TCTTGATGGGCTAAAAATTTCAGCATGCCTTATGGAACAGGTTATTCAGAATTACAGAATGATTGTAGCCCTTATTCAA EQVIQNYRMIVALIQNKLSDADF
AATAAACTGAGTGATGCAGATTTTCAAAATATCGCTTATTTGAATGGGATCAATGGAGAAATCAAAACCTTAAAAGGAT LNGINGEIKTLKGSVDLNALIEV
CAGTAGATTTGAACGCGCTCATAGAAGTTGCTATCTTAAACGCAGAAAATCATTTAAACTATATAGAGAATCTTGAAAA NHLNYIENLEKKADLWEEQLKL
AAAAGCCGACCTTTGGGAAGAACAACTGAAATTAGAAAGAGAAACGACAGCAAGAAACATTGCTAGCTCTAAAGTTA ARNIASSKVIVK
TTGTCAAATGAAAACACTCGTGAAAAATACCATATCTTCTTTTTTGCTCTTGTCTGTTTTGATGGCAGAAGATATAACA
AGCGGTTTAAAGCAACTGGATAGCACCTACCAAGAGACCAACCAACAAGTGCTCAAAAACTTAGATGAGATTTTTTCA
ACCACTAGCCCTAGTGCTAATAATGAAATGGGTGAAGAAGATGCTCTAAACATCAAAAAAGCGGCCATTGCTTTGAG
AGGAGATT
HP1102 2913 TGACAGCGATTTGAAAAACGACCCTAAGGAATTTTACGAACTCGCTAAGAACGATTTGTATCGTGAAGATATTGTCGT 2914 DSDLKNDPKEFYELAKNDLYR
TTTTTCGCCTCATGGGGACACTTACACTTTACCGGTGGGTGCGATCGCTTTAGATTTTGCTTACATGGTGCATAGCG SPHGDTYTLPVGAIALDFAYMV
ATTTGGGCGATAAAGCCACGGACGCTTATATCAATAGTAAAAAAGCCTTACTCAATCAGGAATTAAGAAGTGGGGAT DKATDAYINSKKALLNQELRSG
GTGGTTAAAATCATTAAAGGCGATAAAATAATACCTCGTTTCATTTGGATGGATCAGCTTAAAACTTCTAAGGCTAAAA KGDKIIPRFIWMDQLKTSKAKN
ACCATTTGCGCATCCAAAGAAGAAACCGCTTGAA RNRL
HP0495 2915 GTAAAATCATTTCTTCAGCGTTTGAAAAAATCCACTCTCTTAATGGGTTTGACACTGATGAAGCGATGAAACAAGCCA 2916 KIISSAFEKIHSLNGFDTDEAMK
TTATCAATCATTACCAATCGCATTTGCCTTTGATGCCAGAACAGATTTTATTGAACGCTTGCTCTAATGAAACGCTTAA YQSHLPLMPEQILLNACSNETL
AGAATTGCAAGAATTTATCTCTCACCAATACTCTAAAAAAATCGCTCTTAGCATTCCTAAAAAAGGCGACAAGCTCGC FISHQYSKKIALSIPKKGDKLALI
TTTAATAGAAATCGCTATGAAAAACGCTCAAGAGATTTTTAGCCAAGAAAAAACCTCTAATGAAGATCTGATTTTAGAA NAQEIFSQEKTSNEDLILEEAR
GAAGCGCGATCGCTCTTTAAATTAGAGTGCATGCCTTATAGGGTAGAAATCTTTGACACAAGCCACCATTCTAGCAG CMPYRVEIFDTSHHSSSQCVG
CCAATGCGTGGGGGGAATGGTCGTGTATGAAAACAACGCCTTCCAAAAAAACTCTTATCGGCGCTACCATTTAAAAG ENNAFQKNSYRRYHLKGSDEY
GCTCTGATGAATACACTCAAATGAGCGAATTGCTCACCAGAAGGGCTTTAGACTTTGCCAAAGAGCCACCGCCTAAT LLTRRALDFAKEPPPNLWVIDG
TTGTGGGTGATCGATGGAGGGAGGGCACAATTAAACATCGCTTTAGAAATTTTAAAAAGCAGCGGGAGTTTTGTGGA NIALEILKSSGSFVEVIAISKEKR
AGTGATCGCTATTTCTAAAGAAAAAAGGGATTCTAAAGCTTATCGCTCTAAAGGGGGCGCTAAAGACATTATCCATAC RSKGGAKDIIHTPSDTFKLLPS
GCCTAGCGATACTTTTAAATTGCTCCCTAGCGACAAACGCTTGCAGTGGGTGCAAAAATTGCGCGATGAAAGCCACC WVQKLRDESHRYAINFHRSTK
GGTATGCGATAAACTTCCATAGATCCACTAAACTTAAAAACATGAAACAAATCGCTCTTTTAAAAGAAAAGGGTATAG QIALLKEKGIGEASVKKLLDYF
GAGAAGCCAGCGTGAAAAAATTATTGGATTATTTTGGGAGTTTTGAAGCGATAGAAAAAGCGAGCGAGCAGGAAAAA KASEQEKNAVLKKRI
AACGCCGTTTTAAAAAAACGAATCTAAAGGAAAAAACATGAAAAAAAGATTGAATAT
HP0495 2917 GCGTGCCTCACATCGTTGTTTTCTTAAACAAACAAGACATGGTAGATGACCAAGAATTGTTAGAACTTGTAGAAATGG 2918 VPHIWFLNKQDMVDDQELLEL AAGTGCGCGAATTGTTGAGCGCGTATGAATTTCCTGGCG RELLSAYEFPG
HP0132 2919 CATGTGCGTTTGCGTTCAAGCTTACGCCGAGCAAGATTACTTTTTTAGGGATTTTAAATCTAGAGATTTGCCCCAAAA 2920 MCVCVQAYAEQDYFFRDFKS
ACTCCATCTTGATAAAAAGCTCTCCCAAACAATACAGCCATGCATGCAACTTAACGCATCAAAACACTACACTTCTAC LHLDKKLSQTIQPCMQLNASK
CGGGGTTAGAGAGCCTGATAAATGCACAAAGAGTTTTAAAAAATCCGCTCTCATGTCCTATGACTTAGCGCTAGGTT VREPDKCTKSFKKSALMSYDL
ATTTGGTGAGTAAGAATAAGCAATACGGCTTAAAGGCTATAGA TTTTAAACGCTTGGGCTAAAGAGCTTCAAAGCG SKNKQYGLKAIEILNAWAKELQ
TGGATACTTATCAGAGCGAGGATAATATCAATTTTTACATGCCTTATATGAACATGGCTTATTGGTTTGTCAAAAAGGC QSEDNINFYMPYMNMAYWFV
GTTTCCTAGCCCAGAATATGAAGATTTCATTAAGCGGATGCGCCAGTATTCTCAATCAGCTCTTAACACTAACCATGG PEYEDFIKRMRQYSQSALNTN
GGCGTGGGGCATTCTTTTTGATGTGAGTTCTGCGCTAGCGTTAGACGATAATGCCCTTTTGCACAATAGCGCTAATC ILFDVSSALALDDNALLHNSAN
Figure imgf000441_0001
Figure imgf000442_0001
Figure imgf000443_0001
Figure imgf000444_0001
Figure imgf000445_0001
Figure imgf000446_0001
Figure imgf000447_0001
Figure imgf000448_0001
Figure imgf000449_0001
Figure imgf000450_0001
Figure imgf000451_0001
HP0137 3003 TGGGCATTGCGGACCTTGGCCGTATTACCAATGCACAGGCACGACTAACGGCACTTATAGCGCCTATCATGTGTATA 3004 GHCGPWPYYQCTGTTNGTYSA
TCACAGCGAATCTGCGTTCTGGCAATCGTATAGGCACCGGTGGGGCAGCTAATCTAATCTTTAATGGGGTAGATAGT TANLRSGNRIGTGGAANLIFNG
ATCAATATCGCTAACGCTACCATCACGCAACATAACGCCGGAATCTATTCAAGCTCTATGACTTTTTCCACGCAAAGC ANATITQHNAGIYSSSMTFSTQS
ATGGATAATTCGCAGAATTTGAATGGTCTAAATTCTAACGGCAAACTTTCGGTGTATGGCACCACTTTCACTAACGAA QNLNGLNSNGKLSVYGTTFTNE
GCTAAAGATGGGAAATTCATTTTCAATGCAGGGCAAGCGGTTTTTGAAAACACCAACTTTAATGGAGGGAGTTACCA KFIFNAGQAVFENTNFNGGSYQ
ATTCAGCGGCGATAGCTTGAATTTTTCAAACAACAACCAGTTCAATAGCGGTTCGTTTGAAATTAGCGCAAAAAACGC SLNFSNNNQFNSGSFEISAKNA
TTCGTTCAATAACGCTAACTTTAACAACAGCGCTTCTTTTAATTTCAATAATTCTAACGCGACCACTTCGTTTGTGGGG NFNNSASFNFNNSNATTSFVGD
GATTTCACTAACGCTAATTCAAATTTGCAAATCGCCGGGAACGCTGTTTTTGGGAACTCTACTAATGGCTCTCAAAAT NSNLQIAGNAVFGNSTNGSQNT
ACCGCTAATTTTAATAATACCGGCTCTGTTAATATTTCAGGGAATGCAACCTTTGATAATGTGGTGTTTAATGGCCCTA NTGSVNISGNATFDNWFNGPT
CGAACACGAGCGTGAAAGGGCAGGTTACTTTAAATAACATCACTTTAAAAAACCTGAACGCCCCTTTGTCTTTTGGC GQVTLNNITLKNLNAPLSFGDG
GATGGGACGATTACTTTTAACGCTCATTCGGTGATTAATATTGCTGAATCTATCACTAATGGCAACCCTATCACTCTT HSVINIAESITNGNPITLVSSSKEI
GTAAGCTCTTCTAAAGAAATTGAATACAACAACGCTTTCAGTAAAAATCTATGGCAGCTCATCAACTACCAAGGGCAT AFSKNLWQLINYQGHGASSEKL
GGGGCAAGCAGTGAAAAGCTCGTCTCTAGCGCGGGTAATGGCGTTTATGATGTGGTGTATTCTTTCAATAACCAAAC GNGVYDWYSFNNQTYNFQEV
CTACAATTTCCAA ISIRRLGVNMVFDYVDMEKSDH
ALGFMTYMPNSYNNNLGNANN
DKSIDFYASGKTLFTKAEFSQTF
HP0137 3005 AGTCAATCAAATGCCCTCGTTTGTTTTCCCTTTAGTGCAGGGGGCTAAGGGTTTGAATTACGGATTAGAGGCTGGGA 3006 VNQMPSFVFPLVQGAKGLNYG
GCAAGTCTGAACTCATCATCGCAATGAGTTACACTAACCCTAAAGCCCCTATCACCGTGAATGGCTTTAAAGACAAA KSELIIAMSYTNPKAPITVNGFKD
GAAATGATTGAGCTTGGCTTTATCGCTAAAAGCATGCAGCATGAGATCACTTTAACGATTGAGGGTTTGAATGAATTG LGFIAKSMQHEITLTIEGLNELKTI
AAAACCATTATCGCCGTGGCTAAACAAAACGAGTTTTTAGCCTGCCCTAAAATTGGCATCCGCATCCGTTTGCACAG QNEFLACPKIGIRIRLHSTGTGV
CACTGGCACTGGCGTTTGGGCAAAGAGTGGGGGGATCAATTCTAAATTTGGTCTTAGCAGCACTGAAGTTTTAGAGG GGINSKFGLSSTEVLEAMRLLEE
CGATGCGCCTTTTAGAAGAAAACGACTTGTTAGAGCATTTCCACATGATACATTTCCATATAGGCTCTCAAATCAGCG EHFHMIHFHIGSQISDISPLKKAL
ATATTTCGCCCTTAAAAAAGGCTTTAAGAGAAGCGGGTAACTTGTATGCAGAATTGCGTAAAATGGGCGCTAAAAATC NLYAELRKMGAKNLNSVNIGGG
TTAATAGCGTGAATATTGGAGGGGGGTTAGCCGTAGAATACACCCAACACAAGCACCACCAAGACAAAAACTACACT TQHKHHQDKNYTLEEFSADWF
TTAGAGGAATTCAGCGCTGATGTGGTGTTTTTATTGAGAGAAATTGTGAAAAATAAGCAGGAAATCGAGCCGGACAT VKNKQEIEPDIFIESGRYISANHA
TTTCATTGAATCAGGCCGTTATATTTCCGCTAACCATGCCGTTTTAGTGGCCCCGGTGTTAGAATTGTTTTCGCATGA VLELFSHEYNEKSLKIKENNNPP
ATACAATGAAAAATCCCTAAAAATCAAAGAAAATAATAACCCCCCTTTGATTGATGAAATGCTAGACTTGCTCGCTAAT LDLLANINEKNAIEYLHDSFDHT
ATCAATGAAAAAAACGCCATTGAATACTTGCATGATAGTTTTGATCACACCGAGTCGCTATTCACGCTTTTTGATCTG FDLGYIDLIDRSNTEVLAHLIVKK
GGCTATATTGATTTGATTGACAGGAGCAACACTGAAGTTTTAGCCCATTTGATCGTCAAAAAAGCGGTGCAATTGCTT YVKDHNDILRIQEQVQERYLLNC
TATGTTAAGGATCAT SLPDYWGLRQNFPVMPLNKLD
SASLWDITCDSDGEIAFDSTKPL
DIDEEEYFLA
HP0137 3007 AAAAGAGAACGCAGAACAAAAATATAACACTCTTTCAGTCAAAAATAAGCAATTAGAAGCTGAGTTAGATATGCTTAA 3008 KENAEQKYNTLSVKNKQLEAEL
CGAAAAATTTGAAAAACTGAAAAATATGTATGCTGGGGTAGAGGATTTTGAAAAACGCCAAAAAAATATCAAAGAACA EKFEKLKNMYAGVEDFEKRQK
AATTGTAAAAACCAACCCCAAAGTCTTAGGCGCACCTTCAAACGAAGTGGAAGAATTAGCGTTCTTAGAGCGTATAG VKTNPKVLGAPSNEVEELAFLE
AAAAGGGCATGCAAGAGTTCAATGTTTTCTATCCCAAGCGTTTATTGTATATGTTCCACACCGCTTTAAAAAGCACGT MQEFNVFYPKRLLYMFHTALKS
CTCTATCGCCATTGAGCGTGCTAAGTGGGGTGAGTGGGACAGGAAAATCTGAACTGCCCAAGCTCTATGTGCA" TTT LSVLSGVSGTGKSELPKLYVHF
GGGGGGTTAAATTTTTTAAGCATTGCTGTGCAGCCTACTTGGGATAGCCCAGAAT LSIAVQPTWDSPE
HP0137 3009 TAAAACCGAATTTGAAAGGCTTTACAAACTCAAACGCGCTTCAATTCTAGCAAGGGAAAATTTATCGCCAAGCTATAA 3010 KTEFERLYKLKRASILARENLSP AGAATATTTGAAAAGAGATTTTGATGAGAGCGGTAATTTACGCGTCCATCAAGGCTATTTTAGCGGCGATAGCGTAG LKRDFDESGNLRVHQGYFSGD CGCTCAATAAAGGCAAAAAAGAAAGCAAAAAAGAAGACATTGAAGCCAACGACATTAAAATGATTTTAAGCGAAAAAG KGKKESKKEDIEANDIKMILSEK
AAAAACTGCTTTCATTTCAAACGCCTTTGCGTTTTATTTTTAGCGTGTGGGCTTTGCAAGAGGGCTGGGATAATCCAA FQTPLRFIFSVWALQEGWDNPN
ACATTTTTACCCTTATTAAACTGGCAAATTCTACCAGCGAAACCAGCCGCCATCAGCAAGTGGGGCGCGGGTTAAGA LANSTSETSRHQQVGRGLRIAI
ATTGCGATTAATCAAGAGGGCAAGCGCGTTACGCATGGATTTTTAAAAGGCAATGACAACGCTTTTTATAAAATAAAC RVTHGFLKGNDNAFYKINYLDM
TACCTTGATATGTTAGTGAGTGGCGAAGAAGTGGGCTTTATAGAGGGTTTGCAAAAAGAGATTGAAGCGAGCAGCTT EVGFIEGLQKEIEASSFIGGGNA
TATTGGTGGTGGC CGCGCTAGACAGAGAGGATTTAGCCAAGTTAGGGCTTAATGAAAGAGAGATAAATAAACTTT LAKLGLNEREINKLFVELENSNA
TCGTTGAATTAGAAAATTCAAACGCCCTAGAATTTGACGAAACCAACAACGCTTACAAAATCATTGCCCCTATTTGTG TNNAYKIIAPICETMQNNEER
AGACGATGCAAAACAATGAAGAAAGG
HP0137 3011 GGCAAAAGACATTTTTATCACAGGGGCTGTTGGATCGGGCAATGAGTGGAAAACCGGTGGGGGGGCGATACTGGTT 3012 AKDIFITGAVGSGNEWKTGGGAI
TTTGAAAGCTCAAACGAATTAAGCGCTAATGGGGCTTATTTTCAAAATAACAGAGCCGGGACGCAAACTTCTTGGATC SSNELSANGAYFQNNRAGTQT
AATTTGATTTCCAATAACAGCGTGAATTTGACAAACACAGATTTTGGCAACCAAACCCCTAATGGGGGCTTTAATGCT SNNSVNLTNTDFGNQTPNGGF
ATGGGGCGAAAGATCACCTATAATGGCGGGATTGTCAATGGCGGGAATTTTGGCTTTGATAACGTGGATAGCAATG RKITYNGGIVNGGNFGFDNVDS
GCGCAACCACCATTAGCGGAGTAACTTTCAACAATAACGGCGCGCTCACTTATAAGGGTGGGAATGGTATTGGGGG ISGVTFNNNGALTYKGGNGIGG
GAGCATCACTTTCACTAACTCTAATATCAATCATTACAAACTCAATCTTAACGCTAATAGCGTTACCTTTAATAACAGT SNINHYKLNLNANSVTFNNSALG
GCTTTAGGGAGTATGCCTAATGGCAACGCTAATACTATAGGAAACGCCTATATTCTTAATGCAAGCAATATTACTTTTA GNANTIGNAYILNASNITFNNLTF
ATAATTTGACCTTTAATGGGGGATGGTTTGTTTTTAATATACCTGATGCTCATGTTAATTTTCAAGGCACAACCACGAT WFVFNIPDAHVNFQGTTTINNPT
CAACAACCCCACTTCGCCTTTTGTCAATATGACCGGTAAAGTTACTATTAATCCTAATGCGATTTTTAACATTCAAAAT NMTGKVTINPNAIFNIQNYTPSIG TACACGCCTAGTATAGGGAGCGCTTACACGCTCTTTAGCATGAAAAATGGCTCTATCACCTATAATGATGTCAATAAC LFSMKNGSITYNDVNNLWNIIRL TTATGGAATATCATCAGGCTTAAAAACACGCAAGCCACAAAAGACGCTGATAAAAATCATACAAGTTCAAATAACAAC TKDADKNHTSSNNNTHTYYVTY ACCCACACTTACTATGTAACCTACAATTTAGGCGGCACGCTTTATAATTTCAGACAAATTTTTAGCCCTGATTCTATTG TLYNFRQIFSPDSIVLQSVYYGA
TTTTGCAATCTGTCTATTATGGCGCGAACAATCTTTACTACACCAATAGCGTGAATATCCATGATAATGTTTTTAATTTA TNSVNIHDNVFNLKNINDDKADT AAAAATATT GLNTWNYTNARFTQTYGGKNS
ATTPWANGSIPKSNSTVRFGGY
WGKTGYITGTFTADRVYITGNM
GAQTGGGATLNF
HP0137 3013 TCATTCTTCAAGCGCGACAGCCCCAACCAATGAAACACTAGAAGCGAATGCGAATAATTTCGCTTTTTTAGGTGCAAT 3014 HSSSATAPTNETLEANANNFAF
TAAGGCTAATGGATTAGTGGATTTTTCAAAAGTTTTACAAAATACTACGATCGGGACTTTAGATTTAGGGCCAAACGC ANGLVDFSKVLQNTTIGTLDLGP
TACTTTTAAAGCGAATCATTTGATCGTGAATAACGCTTTTAACAATAACTCTAATTACAGGGCTGATATTAGCGGTAAT KANHLIVNNAFNNNSNYRADISG
CTCAATGTGGTTAAAGGAGCGGCTCTCAGCACGAATGAAAATGGTTTGAATGTGGGGGGCGATTTCAAGAGCGAAG VKGAALSTNENGLNVGGDFKSE
GGTCATTAATCTTTAATCTTAACAATAAAACCAATCAAACGATTATTAATGTGGCTGGCAATTCTACGATCATGTCTTA NLNNKTNQTIINVAGNSTIMSYN
TAACAATCAAGCTTTAATCCATTTTAATACCCAACTCAAGCAAGGCGCTTACACGCTTATTAATGCGAAACGCATGCT HFNTQLKQGAYTLINAKRMLYG
TTATGGTTATGACAATCAAATCATTCGTGGAGGGAGCTTGAGCGATTACCTCAAGCTTTACACCCTCATTGATTTTAA RGGSLSDYLKLYTLIDFNGKRM
CGGCAAACGCATGCAATTAAACGGCGATTCACTAAGCTATGACAACCAACCGGTCAATATTAAAGATGGGGGTCTTG SLSYDNQPVNIKDGGLVVSFKD
TGGTAAGCTTTAAAGACAATCAGGGGCAAATGGTGTATTCATCTATCCTTTATGATAAAGTTCAAGTTAGCGTCTCTG MVYSSILYDKVQVSVSDKPMDI
ATAAGCCCATGGATATTCATGCCCCTAGTTTGGAGTATTACATTAAATACATTCAAGGCAGTGCTGGTTTGGATGCGA EYYIKYIQGSAGLDAIKSAGNNSI
TCAAATCTGCAGGCAATAATTCCATTCTGTGGTTGAATGAGCTTTTTGTGGCTAAAGGGGGTAATCCCTTGTTCGCTC ELFVAKGGNPLFAPYYLQDNPT
CTTATTATTTGCAAGACAATCCCACTGAACACATTGTTACTTTAATGAAAGATATTACTAGCGCTTTAGGCATGCTTTC LMKDITSALGMLSKPNLKNNST
TAAACCCAATCTTAAAAACAATTCCACCGATGCTTTACAGCTCAACACTTACACGCAACAAATGAGCCGTTTAGCCAA NTYTQQMSRLAKLSNFASFDST
GCTTTCTAAT RLSSLKNQRFADAIPNAMDVILK
DKLKNNLWATGVGGVSFVENG
GVNVGYDRFIKGVIVGGYAAYG
ERITNSKSDNVDVGLYARAFIKK
SVNETWGANKNQISSNDTLLSM
KYSTWTTNAKVNYGYDFMFKN
PQIGLRYYYIGMTGLEGVMHNA
KANADPSKKSVLTIELALENRHY
SYFYAIGGFGRDLLVNSMGDKL
NNTLSYRKGELYN
HP0137 3015 TCCAGCGCGCTAGAAAGGGAATTGAAACAAAAGAATGAACATTTAGAGAACGCTTTAAAAGAGCAAGAATATTTGAAA 3016 SSALERELKQKNEHLENALKEQ
AACGCATGGCTTTTAGAAATGGAAAMCAAAAAGAAATCTTTCACAATAAAAAATTGGAATTGGAAAAATCCTACCAAC AWLLEMEKQKEIFHNKKLELEKS
AAGCCCTAAATATCTTAAAAAGCGAAGTCGCTTCAAAAGATACTAGCTCCATGCATAAAGAAATCCATAAAGCGAGCG LNILKSEVASKDTSSMHKEIHKA
AAATTTTAAGCAAACACAAAACAAACCAAGAGATCCCACAAATCATAACGAACTTTCAAGCCAACGAAAAAGCGCGCT KHKTNQEIPQIITNFQANEKARY
ACAAGAATGAAAGCGTGCTGATTGTACAAATTTTAGACAAGGGCTATTATTGGATAGAAACCGAGCTTGGCATGCGT LIVQILDKGYYWIETELGMRLKA
TTAAAAGCGCATGGGAGTTTGTTGAAAAAAATCCAAAAACCCCCTAAAAACAAATTCAAACCCCCTAAAACAACCATT KKIQKPPKNKFKPPKTTIPKPKE
CCTAAACCTAAAGAAGCGAGCTTGCGCCTTGATTTAAGGGGGCAACGCAGCGAAGAAGCCCTGGATTTACTAGACG DLRGQRSEEALDLLDAFLNDAL
CTTTTTTAAACGACGCGCTTTTAGGGGGCTTTGAAGAAGTGCTGATTTGCCACGGCAAAGGGAGCGGGATTTTAGAA EVLICHGKGSGILEKFVKEFLK
AAGTTTGTGAAAGAATTTTTAAAAA
HP0137 3017 GCCTTTAGGTTTTGGCATTGCCAGAGTGGATATTGCCCCTATTTCCAAAAAGATTTTATGCGCCACTTACCCTGTTTT 3018 PLGFGIARVDIAPISKKILCATYP
GAATTGGAAAGATGAAAATTTAGGCTCTTATGCGGTGTTTTGCAACTCGCTTTCAAAAGAAAAAATCCTAAAAGAGAG DENLGSYAVFCNSLSKEKILKES
CGCGAGCGAGCGCGTTATTGAGATTGATGAAAGTTTTGTGTTAAAAGCGTTGGATTTTTATACGCCCTTTTTGAATGA VIEIDESFVLKALDFYTPFLNEAY
AGCCTATTCTAATAAAATGGCTCATAAAAACATCCAAGTGGTTTTAGAGCTTTTAAAGGCTTTAGAAGAAAATCGTTTG AHKNIQWLELLKALEENRLKNS
AAAAATAGCGATGGGGAGTCTCTTTATCGCTTGGTGATCTTGTATGAAGATAAGCCTTGCGAGAGCGTGGAGAGCG LYRLVILYEDKPCESVESAYMKL
CGTATATGAAACTTTTAGCGCTCTCTTTAGGTAAAGCCCCTTTGAGGAGTTTGAATTTAGAGGGTATTTTTAACCAGC GKAPLRSLNLEGIFNQLSNAAW
TTTCTAATGCGGCCTGGAGCGGTAACAAGCCCTATGAATTAGAATGGCTTAGAATGAACGAAGTGGCTTTAAAAATG YELEWLRMNEVALKMRDHFPSI
CGAGACCATTTCCCTAGCATTGATTTCATAGATAAATTCCCACGCTATTTGATGCAATTAATCCCTGAGTTTGATAATA FPRYLMQLIPEFDNIRLLDSSKT
TCCGCTTATTGGATAGCTCAAAAACGCGCTTTGGGGCGTATTTAGGGACTGGAGGTTATACCCAAATGCCTGGGGCT LGTGGYTQMPGASYVNFNAGA
AGTTATGTGAATTTTAACGCAGGGGCTATGGGAGTGTGCATGAATGAGGGGCGTATTTCTTCATCGGTGGTGGTTGG MNEGRISSSWVGAGTDIGGGA
AGCAGGCACTGATATTGGTGGGGGAGCGAGCGTGTTAGGCGTTTTAAGTGGAGGGAATAACAACCCCATTAGCATC VLSGGNNNPISIGKNCLLGANSV
GGGAAAAATTGTTTGCTAGGGGCTAATAGCGTTACTGGAATTAGTCTAGGCGATGGCTGTATCGTGGATGCAGGCG GDGCIVDAGVAILAGSVIEIEENE
TTGCGATACTAGCCGGGAGCGTGATAGAAATTGAAGAAAATGAGTTTAAAAAGCTTTTAGAAGTGAATAGCGCTTTAG EVNSALEKHANNLYKGKELSGK
AAAAACATGCCAACAAC FRSNSQNGKLIAFRSVKKIELNQ
Figure imgf000455_0001
Figure imgf000456_0001
HP0137 3033 CCAGACAGAAACGCCGGCGATAAAGAAGCCGAAGAAAAATTCAAGCTCATCAATGAAGCCTATGGGGTGTTAAGCG 3034 PDRNAGDKEAEEKFKLINEAYG
ATGAAAAGAAGCGGGCCTTATACGACAGGTATGGTAAAAAAGGCTTAAACCAAGCCGGCGCAAGCCAAGGCGATTT KKRALYDRYGKKGLNQAGASQ
TTCTGATTTTTTTGAAGATTTAGGCTCGTTTTTTGAAGACGCTTTTGGGTTTGGCGCTAGGGGGAGTAAAAGGCAAAA FFEDLGSFFEDAFGFGARGSKR
AAGCTCTATCGCACCGGATTATTTGCAAACCCTTGAATTGAGTTTCAAAGAAGCGGTTTTTGGCTGTAAAAAAACCAT APDYLQTLELSFKEAVFGCKKTI
TAAAGTCCAATACCAGAGCGTTTGTGAAAGTTGCGATGGCACGGGCGCTAAAGACAAAGCCCTAGAGACTTGCAAG QSVCESCDGTGAKDKALETCK
CAATGCAATGGGCAGGGGCAGGTGTTTATGCGTCAAGGTTTTATGAGTTTTGCGCAAACTTGTGGGGCGTGTCAAG QGQVFMRQGFMSFAQTCGAC
GCAAGGGCAAGATCGTTAAAACCCCATGCCAAGCGTGCAAGGGTAAAACCTATATCCTTAAAGATGAAGAAATTGAT IVKTPCQACKGKTYILKDEEIDAII
GCGATAATCCCTGAGGGCATTGATGATCAAAACCGCATGGTGCTTAAAAATAAAGGCAATGAATACGAGAAGGGAAA DQNRMVLKNKGNEYEKGKRGD
AAGAGGGGATTTGTATTTAGAAGCGCAAGTCAAAGAAGATGAGCATTTCAAGCGCGAAGGCTGCGATTTATTCATTA QVKEDEHFKREGCDLFIKAPVF
AAGCGCCGGTGTTTTTCACCACTATCGCTTTAGGGCATACGATTAAAGTGCCGTCTTTAAAAGGGGACGAACTGGAA GHTIKVPSLKGDELELKIPRNAR
TTAAAAATCCCTAGAAACGCCAGAGACAAGCAGACTTTTGCGTTTAGAAACGAGGGCGTGAAACACCCTGAAAGCTC AFRNEGVKHPESSYRGSL
TTATAGAGGGAGTTTGAT
HP0137 3035 TTCAGCCAGAATGAACTCAACGATATTTTAATGCTCTCCTTACTGGATGGCTATATCCAAAACGAAAATAAGGCGTTT 3036 FSQNELNDILMLSLLDGYIQNEN
AGCCCCCTTTTAGGCGCGCTTGAAGAAAAATTCACCCGATTAGAGAAGCTAGAAAAAGAAAGGCGATTGTTAGAGGA LLGALEEKFTRLEKLEKERRLLE
TAAAAAGCGTTTCCAAAAGGATTTAGAAGAACGATTGAATTTTGAAAAAATGAAATTAGAGAGGCTGGATTTAAAAGA QKDLEERLNFEKMKLERLDLKE
AGATGAATACGAACGCCTTTTAGAGCAAAAAAAATTGCTTTCTAGTAAGGAAAAACTGAACGATAAAATCGCTCTGGC LLEQKKLLSSKEKLNDKIALALE
GTTAGAGGTGCTAGAAAATACCCATAAAATCACGCATGCTTTAGAGAGCGTGGGCCATAGCGCGGAGTTTTTAAAAA HKITHALESVGHSAEFLKSALLE
GCGCTTTATTAGAAGCGAGCGCTCTATTGGAAAAAGAGCAGGCTAAATTAGAAGAGTGCGAGCGTTTGGACATTGAA EKEQAKLEECERLDIEKVLERLG
AAAGTTCTAGAAAGGCTTGGCATGCTAAGTGGGATCATTAAGGATTACGGGAGTATTATGCATGCTAAAGAACGATT IKDYGSIMHAKERLGHVKNELH
AGGGCATGTTAAAAACGAATTGCACAACCTAAAAGAAATTGATAGTCATTGCGAAACTTACCACAAAGAAATAGAGCG SHCETYHKEIERLKTECLKLCEEI
ATTAAAAACCGAATGCTTGAAATTGTGCGAAGAAATAAGCGGCTTTAGAAAAGAGTATTTAGCCGGTTTTAACGCTCT KEYLAGFNALLSAKAKDLLLKSP
TTTAAGCGCTAAAGCGAAAGATTTGCTCCTAAAAAGCCCCAGTTTGGTT
HP0137 3037 TGTGCGCAGTGGGGATTGCAAGGAAGATTGCGCTTATTGCACGCAAAGCTCACACCATCAAGGAGCGATCAAGCGC 3038 VRSGDCKEDCAYCTQSSHHQG
TATAAATTTAAAGATGAAAAAGTGGTTTTACAAGAGGCTAGAGCGTTAAGACAATTAGGGGCTTTAGGGTTTTGTCTG KFKDEKVVLQEARALRQLGALG
GTTACTTCAGGGCGCGAATTAGACGATGAAAAATGCGAATACATCGCTAAATTAGCTAAAGCCATCAATCAAGAAGA SGRELDDEKCEYIAKLAKAINQE
ATTGGGCTTGCATCTAATCGCATGCTGCGGGCGCGCGGATTTGGAGCAATTAGAATTTTTAAGAGATGCGGGCATC LIACCGRADLEQLEFLRDAGIHS
CATAGCTATAACCACAATTTAGAGACTTCGCAAAATTTCTTCCCTAAGATTTGTTCCACGCACACATGGGAAGAAAGG LETSQNFFPKICSTHTWEERFIT
TTTATCACATGCGAAAACGCTTTAAGGGCGGGGTTAGGCTTGTGCAGTGGGGGGATTTTTGGGCTTAATGAGAGCT RAGLGLCSGGIFGLNESWEDRI
GGGAAGATCGGATTGAAATGCTTAGAGCGCTAGCTTCGCTCTCTCCGCACACCACGCCGATTAATTTTTTCATTAAAA LASLSPHTTPINFFIKNPVLPIDA
ACCCGGTATTGCCCATTGATGCAGAGACTTTAAGTGCAGATGAAGCCCTAGAATGCGTGCTTTTGGCTAAGGAATTT DEALECVLLAKEFLPNARLMVA
TTGCCTAACGCTAGGCTTATGGTGGCTGGGGGGCGTGAAGTGGTGTTTAAAGATAATGACAAAAAGGAAGCCAAGC VFKDNDKKEAKLFE
TTTTTGAATA
HP0137 3039 TTTGAAAGCGTTTCTAAGGATTTTTACCACGATAATTTGGAATTTAACCACCGCAGTGCGCCTTTAATCATTAATTATG 3040 FESVSKDFYHDNLEFNHRSAPLI
TG CACCATTTTTAAAAAAGCTTATCAAAATTCCCCCACCGCTTATTTGGAGCAAAAATACCCTAAAACTTCTCAAAA TIFKKAYQNSPTAYLEQKYPKTS
TAAACATGTTACAGACGGCTATGTTAAAGTCTCTTTAGTGGCTGATGAAAGAGAATTGTTATTAGATCAGGTCTTACA VTDGYVKVSLVADERELLLDQV
AGAAGCTCAAAACCTTTTAGAACATCGTATTGAGCCTAAAGATATTACCATTTTATGCGCCACTAATGACGACGCTTT NLLEHRIEPKDITILCATNDDALEI
AGAAATCAAAAATTATTTGCAAGAGCGTTTGAGTGCGATTCGCCCAAGCACGGAATCTAGCGCAAAATTGTCTCAATT QERLSAIRPSTESSAKLSQFVES
TGTAGAGTCTAAAATCATTAAGAACGCTTTAGAATACGCTCTAGCGGAAGAACCTTACAAGCCTTTTTATAAGCACAG ALEYALAEEPYKPFYKHSVLKLA
CGTTTTAAAACTCGCTGGATACTTGCATGATGATGCGATCGCTTTAGCTGGTTTTAACCCTAAAAAAGAGAGCGTGG DDAIALAGFNPKKESVAGFVWK
CAGGCTTTGTGTGGAAGGTGATGGAGCAATTTGAGCTTTATGGAGAGCCTGCACAAAGCTGTTTGGAATTAGCGGTT FELYGEPAQSCLELAVGCEDAD
GGGTGCGAAGATGCCGATGGATTTTTAGAAAAATTAGAGACTAAAGCGATCGCTTCTTCCCATTCAAAAGGCGCGCA LETKAIASSHSKGAQIMTIHKSK
GATCATGACCATTCACAAATCTAAAGGCATGCAATTCCCTTATGTGATCGTGTGCGAAC YVIVCE
HP0372 3041 TGAAAGCTTATTAGATTGTATCCAATCGCATTTTGGCAAAAACGCCCACCCAGCCCACAGACTAGACTATGAAACGA 3042 ESLLDCIQSHFGKNAHPAHRLD GCGGGTTGGTTTTAGCGGGCAAGACCTTACAAAGCGTTAAGGATTTAAAAGCGCTTTTCATGCAAAAAAAAGTAAAA LVLAGKTLQSVKDLKALFMQKK AAAACTTATTTAGCACTAGCGCATGGGTTGGTTGAAAAAAGCATGAAAATAGACAAACCCATTCTAACGCCACAAAAC LALAHGLVEKSMKIDKPILTPQNI ATTCAAAAAGATTTGCACATTCGATCTAAAATTTCTCCTTTAGGCAAGCAATCAATCACGCTTGTTGAGCCTTTAAGCT HIRSKISPLGKQSITLVEPLSYNP
ATAACCCTTTTTTGGATATAAGCOTACTCAAAATCACCCCACTCACCGGGCGCACGCACCAGATCCGCTTGCATTTAA LLKITPLTGRTHQIRLHLSSVDH GCAGCGTGGATCATAGGATCGTGGGTGAGGGGCTTTATGGGGTGGCAGATGAAAACGCTAGGGAGTATCTTCAATT GLYGVADENAREYLQLKRENN AAAGCGTGAAAATAACGCCCCATTACTCATGCTCCATGCCGCTAGTTTAGAGTTTGAATTTAAAGGAGCGATCTATAA LHAASLEFEFKGAIYKIASPMPE
AATCGCTTCCCCCATGCCTGAACGCTTCATGCCTTTTTTAAAAGATCTATCCTTTTTTTATTGATTGATCTTTTGGTTGT LKDLSFFY
CCTCTTCTGGGGTTTGGTTGAGGTCGCCTATATGTTCTTTGATGGTTTTTTTCTCCACCACAGGAGCGTTTTTAAAAC TCTTTTGAAATTCTTGGCTGATTTTAACAA
HP0372 3043 TGAGCAGTTACGGGTATGATATTAGAGTGGGGAGTGAGTTCATGCTCTTTGATAACAAAAACGCTTTAATTGACCCTA 3044 SSYGYDIRVGSEFMLFDNKNALI AAAACTTTGACCCTAACAACGCGACTAAAATTGATGCGAGTAAAGAAGGGTATTTTATCTTGCCCGCTAACGCGTTCG FDPNNATKIDASKEGYFILPANA CCCTAGCCCATACGATAGAGTATTTTAAAATGCCTAAAGACACTTTAGCGATTTGTTTAGGCAAAAGCACTTACGCTA TIEYFKMPKDTLAICLGKSTYAR GGTGTGGGATTATTGTGAATGTTACGCCTTTTGAGCCGGAATTTGAAGGCTATATTACGATTGAAATTTCTAACACCA VTPFEPEFEGYITIEISNTTNLPA
CTAATTTACCGGCTAAAGTCTATGCCAATGAGGGGATCGCGCAAGTGGTGTTTTTACAAGGCGATGAAATGTGCGAG EGIAQWFLQGDEMCEQSYKDR CAAAGCTATAAAGACAGAGGCGGTAAGTAT
HP0010 3045 TAAAGAAACGGCGACCACGATCAATCAAGAGATCGCTAAATACCATGAAAAAAGCGATAAAGCCGCTTTGGGGCTTT 3046 KETATTINQEIAKYHEKSDKAAL ATGAATTGCTAAAAGGGGCTACCACTAATCTTAGTTTGCAAGCGCAAGAACTCAGTGTCAAGCAAGCGATGAAAAAC LKGATTNLSLQAQELSVKQAMK
CACACCATCGCCAAAGCGATGTTTTTGCCCACTTTGAACGCGAGTTATAATTTTAAAAATGAAGCTAGGGATACTCCA KAMFLPTLNASYNFKNEARDTP
GAATATAAGCATTATAACACCCAACAACTCCAAGCTCAAGTCACATTGAATGTGTTTAATGGCTTTAGCGATGTGAAT NTQQLQAQVTLNVFNGFSDVN
AATGTCAAGGAAAAGTCTGCGACTTACCGCTCCAATGTGGCTAATTTAGAATATAGCCGCCAGAGCGTGTATTTGCA SATYRSNVANLEYSRQSVYLQ
AGTGGTGCAACAATACTACGAGTATTTTAACAATCTCGCTCGCATGATCGCTTTGCAAAAAAAATTAGAGCAAATCAA YEYFNNLARMIALQKKLEQIKTDI
AACGGACATTAAAAGGGTTACCAAGCTCTATGACAAAGGGCTGACCACGATTGATGATTTGCAAAGCCTAAAAGCGC KLYDKGLTTIDDLQSLKAQGNLS
AAGGGAATTTGAGCGAATACGATATTTTGGACATGCAATTTGCTTTGGAGCAAAACCGCTTGACTTTAGAATACCTTA DMQFALEQNRLTLEYLTNLSVK
CCAATCTCAGTGTGAAAAATTTGAAAAAGACCACGATTGATGCGCCTAATTTGCAATTAAGAGAAAGGCAGGATTTAG TIDAPNLQLRERQDLVSLREQIS
TTTCTTTAAGGGAGCAGATTTCCGCAATCAGATACCAAAACAAGCAACTCAATTATTACCCCAAGATAGATGTGTTTG NKQLNYYPKIDVFDSWLFWIQK
ACTCATGGCTTTTTTGGATCCAAAAACCCGCTTATGCCACAGGGCGTTTTGGGAATTTCTACCCCGGTCAGCAAAAT GRFGNFYPGQQNTAGVTATLNI ACGGCTGGGGTTACTGCGACTTTGAATATTTTTGATGATATAGGCTTGAGCTTGCAAAAACAATCCATCATGCTAGGC LSLQKQSIMLGQLANEKNLAYK CAATTAGCGAATGAAAAGAATTTAGCGTATAAAAAGCTAGAGCAAGAAAAAGACGAACAGCTTTACAGAAAATCGCTT KDEQLYRKSLDIARAKI ESSKAS GATATTGCCAGAGCC LSFANIKRKYDANLVDFTTYLRG DAEVAYNLALNNYEVQKANYIF IDDYVH
Figure imgf000459_0001
Figure imgf000460_0001
HP0371 3059 TCATGTGTATATCACAGCGAATCTGCGTTCTGGCAATCGTATAGGCACCGGTGGGGCAGCTAATCTAATCTTTAATG 3060 HVYITANLRSGNRIGTGGAANLI
GGGTAGATAGTATCAATATCGCTAACGCTACCATCACGCAACATAACGCCGGAATCTATTCAAGCTCTATGACTTTTT SINIANATITQHNAGIYSSSMTFS
CCACGCAAAGCATGGATAATTCGCAGAATTTGAATGGTCTAAATTCTAACGGCAAACTTTCGGTGTATGGCACCACTT DNSQNLNGLNSNGKLSVYGTT
TCACTAACGAAGCTAAAGATGGGAAATTCATTTTCAATGCAGGGCAAGCGGTTTTTGAAAACACCAACTTTAATGGAG KDGKFIFNAGQAVFENTNFNGG
GGAGTTACCAATTCAGCGGCGATAGCTTGAATTTTTCAAACAACAACCAGTTCAATAGCGGTTCGTTTGAAATTAGCG SGDSLNFSNNNQFNSGSFEISA
CAAAAAACGCTTCGTTCAATAACGCTAACTTTAACAACAGCGCTTCTTTTAATTTCAATAATTCTAACGCGACCACTTC NNANFNNSASFNFNNSNATTSF
GTTTGTGGGGGATTTCACTAACGCTAATTCAAATTTGCAAATCGCCGGGAACGCTGTTTTTGGGAACTCTACTAATGG TNANSNLQIAGNAVFGNSTNGS
CTCTCAAAATACCGCTAATTTTAATAATACCGGCTCTGTTAATATTTCAGGGAATGCAACCTTTGATAATGTGGTGTTT NFNNTGSVNISGNATFDNVVFN
AATGGCCCTACGAACACGAGCGTGAAAGGGCAGGTTACTTTAAATAACATCACTTTAAAAAACCTGAACGCCCCTTT SVKGQVTLNNITLKNLNAPLSFG
GTCTTTTGGCGATGGGACGATTACTTTTAACGCTCATTCGGTGATTAATATTGCTGAATCTATCACTAATGGCAACCC FNAHSVINIAESITNGNPITLVSS
TATCACTCTTGTAAGCTCTTCTAAAGAAATTGAATACAACAACGCTTTCAGTAAAAATCTATGGCAGCTCATCAACTAC YNNAFSKNLWQLINYQGHGAS
CAAGGGCATGGGGCAAGCAGTGAAAAGCTCGTCTCTAGCGCGGGTAATGGCGTTTATGATGTGGTGTATTCTTTCA SSAGNGVYDVVYSFNNQTYNF
ATAACCAAACCTACAATTTCCAAGAGGTTTTTTCACAAAACAGCATTTCTATCCGGCGTTTGGGCGTTAACATGGTGT QNSISIRRLGVNMVFDYVDMEK
TTGATTATGTG YQNALGFMTYMPNSYNNNLGN
YYYDKSIDFYASGKTLFTKAE
HP0371 3061 GAATAGATCGTTATGAACCTTTCTGAAATTGAAGAGTTGATCAAAGAATTTAAAGCTTCTGATTTGGGGCATTTGAAAT 3062 IDRYEPF
TAAAGCATGAGCATTTTGAGTTGGTTTTGGATAAAGAATCCGCTTAfGCGAAAAAAAGTGCGTTAAATCCCGCCCATT
CTCCAGCCCCCATTATGGTAGAAGCGAGCATGCCAAGCGTCCAAACCCCTGTGCCTATGGTATGCACCCCTATTGT
GGATAAAAAAGAAGATTTCGTGCTTTCGCCTATGGTAGGCACTTTTTATCATGCACCCTCCCCTGGGGCTGAGCCTT
ATGTCAAAGCGGGCGATACGCTTAAAAAAGGGCAAATCGTGGGCATTGTAGAAGCGATGAAAATCATGAATGAAATT
GAAGTGGAATACCCTTGCAAGGTGGTTTCTGTTGAAGTGGGAGACGCTCAACCGGTAGAATACGGCACAAAACTCA
TCAMGTTGAAMGCTTTAAMTCCATGAATAAAGTGMTAMGAAAATAAAAAGGTAGAAAAAAAAGAGCTTTCACG
CATTTTGATCGCTAATAGAGGCGAGATCGCTTTAAGAGCGATCCAAACCATTCAAGAAATGGGTAAAGAATCCATAG
CCATTTATTCTATCGCTGACAAGGACGCCCACTACCTCAATACGGCTAGCGCAAAAGTGTGTATAGGGGGGGCCAA
ATCCAGCGAGAGCTACTTGAATATCCCTGCGATCATTAGCGCGGCGGAATTGTTTGAAGCGGATGCGATTTTCCCCG
GGTATGGGTTTTTGAGCGAGAATCAGAATTTTGTAGAGATTTGCTCGCACCATTCTTTAGAATTTATTGGTCCAAGCG
CGAAGGTCATGGCTTTAATGAGCGATAAATCCAAGGCCAAAAGCGTGATGAAGGAAGCCGGCATGCCTGTGATTGA
GGGCAGTGATGGGTTGCTTAAAAGCTATCAAGAAGCTGAAGAAATCGCTGATAAAATCGGCTACCCTGTCATCATTA
AAGCAGCCGCTGGTGGGGGCGG
HP0371 3063 AGCGAGTGCTTGGATAACTTAGATGACCCTACTGATCAAGAGGCCATAGAGCAATGTTTAGAGGGCTTGAGCGATAG 3064 SECLDNLDDPTDQEAIEQCLEG
TGAAAGGGCGCTAATTCTAGGAATTAAACGACAAGCTGATGAAGTGGATCTGATTTATAGCGATCTAAGAAACCGTA RALILGIKRQADEVDLIYSDLRN
AAACCTTTGATAACATGGCGGCTAAAGGTTATCCATTGTTACCAATGGATTTCAAAAATGGCGGCGATATTGCCACTA NMAAKGYPLLPMDFKNGGDIA
TTAACGCCACTAATGTTGATGCGGACAAAATAGCTAGCGATAATCCTATTTATGCTTCCATAGAGCCTGATATTGCCA VDADKIASDNPIYASIEPDIAKQY
AGCAATACGAAACAGAAAAAACCATTAAGGATAAGAATTTAGAAGCTAAATTAGCTAAGGCTTTAGGTGGCAATAAAA IKDKNLEAKLAKALGGNKKDDD
AAGATGACGATAAAGAAAAAAGTAAAAAATCCACAGCAGAAGCTAAAGCAGAAAACAATAAGATAGACAAAGATGTC KSTAEAKAENNKIDKDVAETAK
GCAGAAACTGCCAAGAATATCAGTG ATCGCTCTTAAGAACAAAAAAGAAAAGAGTGGGGAATTTGTAGATGAAAA LKNKKEKSGEFVDENGNPIDDK
TGGTAATCCCATTGATGACAAAAAGAAAGCAGAAAAACAAGATGAAACAAGCCCTGTCAAACAGGCCTTTATAGGCA QDETSPVKQAFIGKSDPTFVLA
AGAGTGATCCCACATTTGTTTTAGCGCAATACACCCCCATTGAAATCACTCTGACTTCTAAAGTAGATGCCACTCTCA ITLTSKVDATLTGIVSGVVAKDV
CAGGTATAGTGAGTGGGGTTGTAGCCAAAGATGTATGGAACATGAACGGCACTATGATCTTA GTMIL
HP0371 3065 GAAAATCTCACTAACGCTATGAGTAACCCACAAAATTTGAGCAATAACAAAAATCTTAGCGAATTTATCAAGCAACAA 3066 ENLTNAMSNPQNLSNNKNLSE
CGAGAAAATGAATTAGACCAAATGGAACGACTAGAGGACATGCAAGAACAGGCTCAAGCTAATGCGCTCAAACAAAT ENELDQMERLEDMQEQAQAN
TGAAGAACTCAACAAGAAACAAGCTGAAGAGACAATCAAGCAAAGAGCCAAAGATAAAATCAATATTAAGACAGATAA ELNKKQAEETIKQRAKDKINIKT
GCCTCAAAAAAGCCCTGAGGATAACTCCATAGAATTATCTCCTAGCGATAGCGCTTGGAGAACTAATCTTGTTGTGC SPEDNSIELSPSDSAWRTNLVV
GGACTAATAAAGCCTTGTATCAATTCATTTTGAGAATAGCTCAAAAAGACAATTTTGCTTCAGCGTATCTAACAGTCAA LYQFILRIAQKDNFASAYLTVKL
ATTAGAATACCCACAAAGACACGAAGTCTCTAGCGTTATTGAAGAGGAGTTAAAAAAGAGAGAAGAAGCAAAGAGGC HEVSSVIEEELKKREEAKRQKE AGAAAGAATTGATTAAGCAAGAAAATCTTAACACCACAGCCTACATCAATAGAGTGATGATGGCGAGCAATGAACAG LNTTAYINRVMMASNEQIINKEK ATTATCAACAAAGAAAAAATAAGAGAAGAAAAGCAAAAAATTATCTTAGATCAAGCAAAGGCGCTAGAGACTCAATAT QKIILDQAKALETQYVHNALKRN GTGCATAATGCCTTAAAAAGAAACCCCGTGCCTAGAAACTACAACTACTACCAAGCGCCTGAAAAACGCTCTAAACA NYNYYQAPEKRSKHIMPSEIFD
TATTATGCCCTCTGAAATTTTTGATGATGGCACATTCACTTATTTTGGTTTCAAAAACATCACTCTCCAACCTGCTATTT YFGFKNITLQPAIFVVQPDGKLS
TTGTGGTTCAACCTGATGGGAAATTGAGCATGACTGATGCCGCCATTGATCCTA AIDP
HP0371 3067 TATTACAGGGGTTTTTAAGGCAAAAGACATTTTTATCACAGGGGCTGTTGGATCGGGCAATGAGTGGAAAACCGGTG 3068 ITGVFKAKDIFITGAVGSGNEWK
GGGGGGCGATACTGGTTTTTGAAAGCTCAAACGAATTAAGCGCTAATGGGGCTTATTTTCAAAATAACAGAGCCGGG AILVFESSNELSANGAYFQNNR
ACGCAAACTTCTTGGATCAATTTGATTTCCAATAACAGCGTGAATTTGACAAACACAGATTTTGGCAACCAAACCCCT SWINLISNNSVNLTNTDFGNQT
AATGGGGGCTTTAATGCTATGGGGCGAAAGATCACCTATAATGGCGGGATTGTCAATGGCGGGAATTTTGGCTTTGA NAMGRKITYNGGIVNGGNFGFD
TAACGTGGATAGCAATGGCGCAACCACCATTAGCGGAGTAACTTTCAACAATAACGGCGCGCTCACTTATAAGGGTG NGATTISGVTFNNNGALTYKGG
GGAATGGTATTGGGGGGAGCATCACTTTCACTAACTCTAATATCAATCATTACAAACTCAATCTTAACGCTAATAGCG SITFTNSNINHYKLNLNANSVTF
TTACCTTTAATAACAGTGCTTTAGGGAGTATGCCTAATGGCAACGCTAATACTATAGGAAACGCCTATATTCTTAATG GSMPNGNANTIGNAYILNASNIT
CAAGCAATATTACTTTTAATAATTTGACCTTTAATGGGGGATGGTTTGTTTTTAATATACCTGATGCTCATGTTAATTTT FNGGWFVFNIPDAHVNFQGTT
CAAGGCACAACCACGATCAACAACCCCACTTCGCCTTTTGTCAATATGACCGGTAAAGTTACTATTAATCCTAATGCG SPFVNMTGKVTINPNAIFNIQNY
ATTTTTAACATTCAAAATTACACGCCTAGTATAGGGAGCGCTTACACGCTCTTTAGCATGAAAAATGGCTCTATCACC SAYTLFSMKNGSITYNDVNNLW
TATAATGATGTCAATAACTTATGGAATATCATCAGGCTTAAAAACACGCAAGCCACAAAAGACGCTGATAAAAATCAT NTQATKDADKNHTSSNNNTHT
ACAAGTTCAAATAACAACACCCACACTTACTATGTAACCTACAATTTAGGC NLG
HP0371 3069 CATCCCTAAATCTAACAGCACGGTGCGTTTTGGGGGGTATGAGGGAGTCAATTGGGGGAAAACGGGCTATATTACT 3070 IPKSNSTVRFGGYEGVNWGKT GGCACTTTCACAGCCGATAGGGTTTATATCACCGGTAACATGATGACTGGTAACGGCGCTCAAACCGGTGGGGGGG FTADRVYITGNMMTGNGAQTG CGACTTTGAATTTTGTGGGCGCGACTGAAATTAATATCGCTGGAGCCACTTTTAAAAACCTAAAAACCACTTCACAAA NFVGATEINIAGATFKNLKTTSQ ACTCTTACATGACTTTTATGGCATTAGGGGATAGCTCTGGGAGCGCTAAGATCAATGTTTCTCAATCTGATTTTTACG FMALGDSSGSAKINVSQSDFYD ATTGGACGGGTGGGGGGTATGATTTTACCGGTAATGGCGTTTTTGATAGCGTGAATTTCAACAA GYDFTGNGVFDSVNFN
HP0371 3071 TGGGAAAACTTCAGGGGCTGAATGGGGGTTAGTGGGCTATATTCAAGGCGTTTTTAAAGCCAATCAAATTGACATTA 3072 GKTSGAEWGLVGYIQGVFKAN
CCGGCACGATTCGCTCTGGTAATGGGGCCAAAACCGGTGGGGGCGCGACTTTAGTGTTTAACGCTCAAAAGCGTTT TIRSGNGAKTGGGATLVFNAQK
GAATATCGCTAACGCTCATTTGAATAACGATAAAGCCGGTTTGCAAAATTCATGGATGAATTTCATTGTCAATAATGGT NAHLNNDKAGLQNSWMNFIVN
AATTTGAATGTAACAAACGCAAAATTTAGCAACCAAACCCCACATGGAGGCTTTAACCTTAAGGCCAATAACATTACT VTNAKFSNQTPHGGFNLKANNI
TGGGATAAAGGCTCTGTGAATGGAGGGGGGAATTTTGGCGTGGATAACGCCGATAGCAATGGCGCAACCACCATTA GSVNGGGNFGVDNADSNGATT
GCGGAGTAACTTTCAATAATAACGGCACTTTGATTTATAAAGGGGGTGAAAATAGCGCCGGAAATTCTTTAACCCTAG FNNNGTLIYKGGENSAGNSLTL
AAAACAACACTTTCAATTCCTACAATATCAACGCAAAAGCGCAAAACCTAATTTTTAACAACAACTCGTTTAATGGCGG NSYNINAKAQNLIFNNNSFNGG
TAGCTATTCGTTTAATGACACTAAAAACACCACCTTTAAAGGCACAAACACGCTCATTAACAGCGATCCTTTTAGCCG DTKNTTFKGTNTLINSDPFSRLK
CCTTAAAGGATCAGTTTCTATTGAAAATAATAGTGTTTTTAATATTGAAAGGGATTTGACCGATAAAACCACTTACACG ENNSVFNIERDLTDKTTYTLLSG
CTTTTAAGCGGAAATAGTATCAAATACAATAACCAAGCTTTAGCGGGACAATGCTTTTTCAAAAAATTTATGGAATTTA NNQALAGQCFFKKFMEFNPLW
ATCCATTATGGTGGCGAACAAGGGACTCTATTAAGAGCGGATAACAACACCTTTTTTGTGCAATTCACCCAAAGCAAC DSIKSG iGGCCAAA
HP0371 3073 TACGAATTTTAATCAAGCCACGCTCAATTTAAGGGCTAAAAATATCCATATCAATTTCCAAGGCGTTTCTACTTTTAAA 3074 TNFNQATLNLRAKNIHINFQGVS
CAAAACTCTACGATGAATTTAGCTGAAAGTTCCCAAGCGAGCTTTAACGCTCTTAAAGTGGAAGGGGAAACGAATTT NSTMNLAESSQASFNALKVEGE
CAATCTCAATAACTCAAGCTTGTTGAATTTCAATGGCAATAGCGTTTTCAACGCTCCTGTGAGTTTTTATGCTAATCAT NNSSLLNFNGNSVFNAPVSFYA
TCTCAAATTTCTTTCACTAAATTAGCGACTTTTAATTCTGACGCTTCTTTTGATTTAAGCAACAACAGCACCCTGAATTT SFTKLATFNSDASFDLSNNSTLN
TCAAAGCGTTCTTTTAAATGGTGCTCTAAACCTTTTAGGCAATGGCAGTAACAATCTAGCGATCAACGCTAAAGGGAA LLNGALNLLGNGSNNLAINAKG
TTTTAGTTTTGGGTCTAAAGGGATTTTGAATCTGTCTTATATGAATCTATTTGGGGGGGATAAAAAAACTTCCGTTTAT SKGILNLSYMNLFGGDKKTSVY
GATGTGTTGCAAGCCCAAAATATTGATGGCTTAATGGGGAATAACGGCTATGAGAAGATCCGTTTTTATGGCATACA QNIDGLMGNNGYEKIRFYGIQID
GATTGACAAGGCTGATTACTCGTTTGATAACGGCGTTC SFDNGV
HP0371 3075 GATCAATTTCACGCAGTATCAGGGTAAGCTTTCGTTTATTTCTAAAGATTTTTCTAACATTTCATTAGATACCTTAAACG 3076 INFTQYQGKLSFISKDFSNISLDT
CTACTAACGGATTAACGCTTAATGCTCCTAAAAATGACATTAGCGTTCAAAAAGGTCAGATTTGCGTGAATGTTTTAAA NGLTLNAPKNDISVQKGQICVN
TTGCATGGGCGAGAAAAAAGCTCATTCTTCAAGCGCGACAGCCCCAACCAATGAAACACTAGAAGCGAATGCGAATA GEKKAHSSSATAPTNETLEANA
ATTTCGCTTTTTTAGGTGCAATTAAGGCTAATGGATTAGTGGATTTTTCAAAAGTTTTACAAAATACTACGATCGGGAC LGAIKANGLVDFSKVLQNTTIGT
TTTAGATTTAGGGCCAAACGCTACTTTTAAAGCGAATCATTTGATCGTGAATAACGCTTTTAACAATAACTCTAATTAC NATFKANHLIVNNAFNNNSNYR
AGGGCTGATATTAGCGGTAATCTCAATGTGGTTAAAGGAGCGGCTCTCAGCACGAATGAAAATGGTTTGAATGTGGG NLNVVKGAALSTNENGLNVGG
GGGCGATTTCAAGAGCGAAGGGTCATTAATCTTTAATCTTAACAATAAAACCAATCAAACGATTATTAATGTGGCTGG GSLIFNLNNKTNQTIINVAGNSTI
CAATTCTACGATCATGTCTTATAACAATCAAGCTTTAATCCATTTTAATACCCAACTCAAGCAAGGCGCTTACACGCTT NQALIHFNTQLKQGAYTLINAKR
ATTAATGCGAAACGCATGCTTTATGGTTATGACAATCAAATCATTCGTGGAGGGAGCTTGAGCGATTACCTCAAGCTT YDNQIIRGGSLSDYLKLYTLIDFN
TACACCCTCATTGATTTTAACGGCAAACGCATGCAATTAAACGGCGATTCACTAAGCTATGACAACCAACCGGTCAAT QLNGDSLSYDNQPVNIKDGGL
ATTAAAGATGGGGGTCTTGTGGTAAGCTTTAAAGACAATCAGGGGCAAATGGTGTATTCATCTATCCTTTATGATAAA DNQGQMVYSSILYDKVQVSVSD
GTTCAAGTTAGCGTCTCTGATAAGCCCATGGATATTCATGCCCCTAGTTTGGAGTATTACATTAAATACATTCAAGGC IHAPSLEYYIKYIQGSAGLDAIKS
AGTGCTGGTTTGGATGCGATCAAATCTGCAGGCAATAATTCCATTCTGTGGTTGAATGAGCTTTTTGTGGCTAAAGG SILWLNELFVAKGGNPLFAPYYL
GGGTAAT TEHIVTLMKDITSALGMLSKPNL
DALQLNTYTQQMSRLAKLSNFA
DFSERLSSLKNQRFADAIPNAM
SQRDKLKNNLWATGVGGVSFV
GTLYGVNVGYDRFIKGVIVGGY
SGFYER
HP0371 3077 GCTTGGCAGTAAGGTTTTGAATTACGATGTGATTGACAAGCTTAAGGACGCTGATGAAAAGGCGTTAATCGCCCCCT 3078 LGSKVLNYDVIDKLKDADEKALI
TAGACAAGAAAATGGAGCAAAATGTTGAAAAACAAAAAGCCCTTGTAGAAATTAAAACGCTCCTTTCAGCTCTAAAAG KMEQNVEKQKALVEIKTLLSALK
GCCCGGTTAAAACGCTTTCAGATTATTCCACTTATATCAGCCGAAAAAGCAATGTTACAGGCGATGCGTTGAGTGCG TLSDYSTYISRKSNVTGDALSAS
AGCGTGGGGGTTGGCGTGCCTATTCAAGATATTAAAGTGGATGTGCAAAATTTAGCGCAAGGCGATATTAACGAATT VPIQDIKVDVQNLAQGDINELGA
GGGGGCGAAATTTTCTTCAAGAGACGATATTTTTAGCCAAGTGGATACCACGCTCAAGTTTTACACACAAAACAAAGA DDIFSQVDTTLKFYTQNKDYAV
CTACGCCGTTAATATTAAAGCAGGAATGACTTTAGGCGATGTGGCTCAAAGCATCACGGACGCTACCAATGGCGAAG MTLGDVAQSITDATNGEVMGIV
TGATGGGTATTGTGATGAAAACAGGAGGGAATGACCCCTACCAATTAATGGTGAATACCAAAAACACCGGCGAAGAC GNDPYQLMVNTKNTGEDNRVY
AACCGAGTCTATTTTGGCTCACACCTCCAATCCACGCTCACTAACAAAAACGCCCTTTCTTTGGGGGTTGATGGGAG QSTLTNKNALSLGVDGSGKSEV
CGGAAAAAGTGAAGTGAGTTTGAATTTAAAGGGGGCTGATGGGAACATGCATGAAGTCCCCATCATGCTAGAACTCC GADGNMHEVPIMLELPESASIK
CTGAAAGCGCTTCTATCAAACAAAAAAACACCGCAATCCAAAAAGCGATGGAGCAGGCTTTAGAAAATGACCCTAAT QKAMEQALENDPNFKNLIANGD
TTTAAAAATTTGATCGCTAATGGGGATATTTCCATAGACACTCTTCATGGAGGGGAATCTTTAATCATTAATGACAGG HGGESLIINDRRGGNIEVKGSK
CGTGGGGGAAACATTGAAGTTAAAGGGAGTAAGGCTAAAGAGCTTGGGTTTTTACAAACCACCACCCAAGAAAGCG FLQTTTQESDLLKSSRTIKEGKL
ATTTATTAAAAAGCTCTCGCACGATAAAAGAGGGTAAATTAGAAGGGGTGGTCAGTTTGAATGGCCAAAAACTGGATT LNGQKLDLSALTKESNTSEENT
TGAGTGCTTTAACCAAAGAG
HP0371 3079 CCTTTCGCCCTTACAAATCCCCCAAGATTTGACTTACCAGCCCGTGCTTAGCACGAAAGTGAATATTAGCGTGAATTT 3080 LSPLQIPQDLTYQPVLSTKVNIS AAACCCTAAAGACCATTTAAAAGGCGTGCAAGATTTTTTCTTAAACGATAAGGGCGAGATTATTAAGGAGCGTTTTTT KDHLKGVQDFFLNDKGEIIKERF AAACCAGGACATTAACGCTTTAGCGAATAACGATAACGAGCCCATAGATGCGATCACTAATCGCAAATTAAACATCAG NALANNDNEPIDAITNRKLNIS TAT
HP0634 3091 CCGCTTGGATCCCTGCTTGGTTGCTCTTTATCCAACACTGGGTGTGAGATGATCATAGAGCGTTTAGTTGGCAATCT 3092 AWIPAWLLFIQHWV
AAGGGATTTAAACCCCTTGGATTTCAGCGTGGATCATGTGGATTTGGAATGGTTTGAAACGAGGAAAAAAATCGCTC
GTTTTAAAACCAGGCAAGGCAAAGACATAGCCATACGCCTTAAAGACGCTCCCAAGTTGGGGCTCTCTCAAGGGGA
TATTTTATTTAAAGAAGAGAAGGAAATTATCGCCGTTAATATCTTGGATTCTGAAGTCATTCACATCCAAGCCAAGAGC
GTGGCAGAAGTAGCGAAAATATGCTATGAAATAGGAAACCGCCATGCGGCTTTATACTATGGCGAGTCTCAATTTGA
ATTTAAAACACCATTTGAAAAGCCCACGCTAGCGTTATTAGAAAAGCTAGGGGTTCAAAATCGTGTTTTAAGTTCAAA
ATTGGATTCCAAAGAACGCTTAACCGTGAGCATGCCCCATAGTGAGCCTAATTTTAAGGTCTCACTAGCGAGCGATT
TTAAAGTGGTCGTAA TAGAAAAGAAATAGAAAAATMCAAATGGATAMGGAAAAAGCGTGAAAAGCACTGAAAAA
AGCGTGGGTATGCCCCCAAAAACTCCAAAGACAGACAACAACGCTCATGTGGATAACGAATTTCTGATTCTGCAAGT
CAATGAT
HP0634 3093 TCATTGATGGGCAAAAAGAAGCGCTCTATGGCGGGATTGCGTGCGCGAATTTGTTGCATAAAAATTCAGGGATCACG 3094 IDGQKEALYGGIACANLLHKNSG
ATAGATATTGGAGGGGGTAGCACCGAGTGCGCGTTGATTGAAAAAGGCAAGATTAAGGACTTAATCTCGCTTGATGT GGSTECALIEKGKIKDLISLDVGT
TGGGACGATTCGCATTAAAGAAATGTTTTTAGACAAAGACTTAGAGGTCAAATTGGCTAAAGCCTTTATCCAAAAAGA MFLDKDLEVKLAKAFIQKEVSKL
AGTCTCTAAACTGCCCTTTAAACACAAAAACGCCTTTGGGGTGGGGGGGACGATCAGAGCGTTGAGTAAGGTATTG NAFGVGGTIRALSKVLMKRFCY
ATGAAACGCTTTTGTTACCCTATTGATTCTTTGCATGGCTATGAAATAGATGCACATAAAAATTTAGCGTTCATTGAAA HGYEIDAHKNLAFIEKIVMLKED
AAATCGTCATGCTCAAAGAAGATCAATTACGGCTTTTAGGGGTGAATGAAGAGCGTTTGGATAGCATCAGGAGCGGG GVNEERLDSIRSGALILSVVLEH
GCGTTGATTTTATCAGTCGTTTTGGAGCATTTAAAAACTTCTTTAATGATCACTAGTGGGGTGGGGGTGAGAGAAGG MITSGVGVREGVFLSDLLRHHY
CGTGTTTTTGAGCGATTTATTGCGCCATCATTACCATAAATTCCCCCCCAATATCAACCCCTCTCTCATCTCTTTAAAA NINPSLISLKDRFLPHEKHSQKV
GATCGCTTTTTGCCCCATGAAAAGCACAGCCAAAAGGTCAAAAAAGAATGCGTGAAATTGTTTGAAGCCTTATCGCC KLFEALSPLHKIDEKYLFHLKIAG
TTTGCATAAAATAGATGAAAAATACCTTTTCCATTTAAAGATTGCGGGGGAATTAGCGAGCATGGGTAAGATTTTAAG MGKILSVYLAHKHSAYFI
CGTCTATTTAGCCCACAAGCACAGCGCGTATTTTATTTT
HP0634 3095 CCTTTTCACGCATTCTGGCAAAATTTCTATCAAAAACGCGCCCTATAAATTGGATAATACCCCTATTGAAGAAAATTGC 3096 LFTHSGKISIKNAPYKLDNTPIEE
GCATGTTATGCTTGCAAACGCTATTCTAAAGCCTATTTGCACCATTTATTTAGGGCTAAAGAACTCACTTACGCTCGTT YACKRYSKAYLHHLFRAKELTY
TGGCCAGCTTGCACAATTTGCATTTTTATTTAGAGCTGGTGAAGAACGCCAGAAACGCCATTTTAGAAAAGCGGTTTT LHNLHFYLELVKNARNAILEKRF
TGAGTTTTAAAAAAGAATTTTTGGAGAAATACAACTCCCGCTCTCATTGAATGATGGAATGCAAAAATACTAAAAAGC EFLEKYNSRSH
GTTTTTTACCATCAATAAAAGTTTTCTTAAAAATAAGGCTTTAGTTTTTAATTTTTGATTAAAACAAAACCCTATCTTTGA
TAAATCTCAGGGTGGGTGCTTTTAAAACGCCTATGGAACCAAAAATAACTTTCCGGGTGGTTTCTAATCACCTCTTCG
CACAAACTCGCTTGGGCTTGCGTGCATTCTAAAATATCGTTTTGCGCGTTATCGGTGATTTGAGAGCGGATACTCGG
ATAATAGGTCGCTGTATAATGCGAATAATCGTCATTAAAATCAATGAATACCGGCTGAATATCTATATTATAACGGCG
CGACAAAATAGAAGCGATCGTGGTGTGCGTAGCGTCTCTATCAAAGAATTTCACCACCACCCCATCTTTAGGCACGA
CATTTTGATCCACTAAAATCCCCACCAGACCATTGCCTTGATTATACATTTTAATGAGTTCTTTCATCGCCCCTATTTT
ATTGACAAAACGCACCCCAAACGCCTCTCGCCTACTCATAATCATGTGATTGATAGGGGCAAATTTAGTCAAACGCC
CCAAACACCCCCTACCATAATTTTCATAATATTGCGCTAAAGTCGTGCCTACCGCTTCCCAATAGCCAAAATGCATGC
ATAAAGTGATCGCTTGGCCTTCCTTGTTTAAAGATTTCCACACATTTTCTTCATTGATGAGCGTGAAACGAGCGTCGT
ATTCATC
HP0634 3097 AAAACAAGCACTTAGACACCACGATGGTGAGCGAGTTTGTGGGAAAAACTAGGGCGTTTATTAAGATCCAAGAAGGC 3098 NKHLDTTMVSEFVGKTRAFIKIQ
TGTGATTTTGATTGCAATTATTGCATTATCCCAAGCGTGAGAGGGAGGGCTAGGAGTTTTGAAGAGAGAAAAATTTTA FDCNYCIIPSVRGRARSFEERKI
GAGCAAGTGGGCCTTTTATGCTCTAAAGGGGTTCAAGAAGTGGTTTTAACCGGCACCAATGTGGGGAGCTATGGGA LLCSKGVQEVVLTGTNVGSYGK
AAGATAGAGGAAGCAATATCGCGCGATTGATTAAAAAATTAAGCCAGATCGCTGGATTAAAACGCATAAGGATTGGG NIARLIKKLSQIAGLKRIRIGSLEP
AGCTTAGAACCTAATCAAATTAACGATGAATTTTTAGAGCTTTTAGAAGAGGATTTTTTAGAAAAACATTTGCATATCG EFLELLEEDFLEKHLHIALQHSH
CTTTACAGCACAGCCATGATCTCATGCTAGAGAGGATGAATCGAAGAAACCGCACTAAAAGCGATAGGGAATTATTA RMNRRNRTKSDRELLETIASKN
GAAACAATCGCTTCTAAGAATTTTGCTATTGGCACGGATTTTATTGTGGGGCATCCGGGCGAGAGCGGAAGCGTTTT DFIVGHPGESGSVFEKAFKNLE
TGAAAAAGCGTTTAAAAATTTAGAAAGCTTGCCTTTAACGCACATCCACCCTTTTATTTACAGCAAACGAAAAGACAC HIHPFIYSKRKDTPSSLMTDSVS
CCCCTCTAGCTTGATGACTGATAGCGTGAGTTTGGAAGATTCTAAAAAGCGTTTGAATGCGATTAAAGATTTGATTTT KRLNAIKDLIFHKNKAFRQLQLK
TCATAAAAATAAGGCGTTCAGGCAATTGCAGCTCAAGCTCAATACGCCTCTAAAAGCCTTAGTGGAAGTGCAAAAAG KALVEVQKDGEFKALDQFFNPIK
ACGGCGAATTTAAAGCCTTAGATCAATTTTTCAACCCCATTAAAATCAAAAGCGATAAGCCTCTAAGGGCTAGTTTTTT PLRASFLEIKEYEIKERENHAVF
AGAAATCAAAGAGTATGAAATTAAGGAGAGGGAAAATCATGCCGTTTTCTAAAAATTTAGAAAATCTTACCGCTCCCT
TCAAACGCATTAAAAACCGCTCGCTTGTTTTGGCGTTAGGG
HP0634 3099 GCCCTATATCGCTAGAAATTATCCTTTGGAAAAATCCGTCCTCAAAGAACCGCATGAAGCCCTTTTTGGGGGGGTTA 3100 PYIARNYPLEKSVLKEPHEALFG
AAGGCGATGAGATCTTAAAAGAAATCGTTTTTTTAGCCGCTAAATTAAAAATCCCTTTTTTGGTTTGTGAAATGGGGTA DEILKEIVFLAAKLKIPFLVCEMG
TGACCAGTTGAAAAGCTTGAAAGAATGCTTGGAATTTTGCGGTTATGATGCAGAGTTTTACAAGG SLKECLEFCGYDAEFYK
HP0634 3101 ATGCGGGCGATATTGTAGCGATTGCCGGGTTTAATGCAATGGATGTGGGCGATAGCGTCGTTGATCCTGCTAACCC 3102 AGDIVAIAGFNAMDVGDSVVDP
CATGCCTTTAGATCCCATGCATTTAGAAGAGCCTACGATGAGCGTGTATTTTGCTGTCAATGATTCACCCTTAGCCGG PLDPMHLEEPTMSVYFAVNDSP
GTTAGAAGGAAAGCATGTTACTGCTAATAAATTGAAAGACAGGCTCTTAAAAGAAATGCAAACCAATATCGCTATGAA GKHVTANKLKDRLLKEMQTNIA
ATGCGAAGAAATGGGCGAGGGCAAGTTTAAAGTGAGTGGGCGTGGGGAATTGCAAATCACTATTTTAGCTGAAAACT MGEGKFKVSGRGELQITILAENL
TGCGCCGTGAAGGGTTTGAATTTAGCATTTCACGCCCT FEFSISRP
HP0634 3103 ATATCCAAAAAACCAAGCTGATAGAATCTATCGCCAAAGCGTTTAATGAGAGCCATGTGAGTGCTGAGGTAATCAAAT 3104 IQKTKLIESIAKAFNESHVSAEVI
TTGAAAGCAAAAGGTATGACCCACAAACCAACAGGATCATCACAGAGCAATCAAGCACCTTAAAAATCAGAGACTAC RYDPQTNRIITEQSSTLKIRDYA
GCTAACGCTTTACAAAAAGAAATCAACGCGCTTTTGCTTGACTTTGCCAAAGATGAGCGCTTACCCTTAAAATTCACG EINALLLDFAKDERLPLKFTLELY
CTTGAACTCTACAACGCTTTAAACAAAGAGCATTTCACAAACTCACCCAAAAAAGCCTTTAAATTGCTTAAAGGCATCA KEHFTNSPKKAFKLLKGIIKDKLH
TTAAAGATAAGTTGCATGAA
HP0634 3105 AAGGCGTTTTTAAAGCCAATCAAATTGACATTACCGGCACGATTCGCTCTGGTAATGGGGCCAAAACCGGTGGGGG 3106 GVFKANQIDITGTIRSGNGAKTG
CGCGACTTTAGTGTTTAACGCTCAAAAGCGTTTGAATATCGCTAACGCTCATTTGAATAACGATAAAGCCGGTTTGCA LVFNAQKRLNIANAHLNNDKAG
AAATTCATGGATGAATTTCATTGTCAATAATGGTAATTTGAATGTAACAAACGCAAAATTTAGCAACCAAACCCCACAT WMNFIVNNGNLNVTNAKFSNQT
GGAGGCTTTAACCTTAAGGCCAATAACATTACTTGGGATAAAGGCTCTGTGAATGGAGGGGGGAATTTTGGCGTGG FNLKANNITWDKGSVNGGGNF
ATAACGCCGATAGCAATGGCGCAACCACCATTAGCGGAGTAACTTTCAATAATAACGGCACTTTGATTTATAAAGGG DSNGATTISGVTFNNNGTLIYKG
GGTGAAAATAGCGCCGGAAATTCTTTAACCCTAGAAAACAACACTTTCAATTCCTACAATATCAACGCAAAAGCGCAA AGNSLTLENNTFNSYNINAKAQ
AACCTAATTTTTAACAACAACTCGTTTAATGGCGGTAGCTATTCGTTTAATGACACTAAAAACACCACCTTTAAAGGCA NSFNGGSYSFNDTKNTTFKGTN
CAAACACGCTCATTAACAGCGATCCTTTTAGCCGCCTTAAAGGATCAGTTTCTATTGAAAATAATAGTGTTTTTAATAT DPFSRLKGSVSIENNSVFNIERD
TGAAAGGGATTTGACCGATAAAACCACTTACACGCTTTTAAGCGGAAATAGTATCAAATACAATAACCAAGCTTTAGC TYTLLSGNSIKYNNQALAGQCF
GGGACAATGCTTTTTCAAAAAATTTATGGAATTTAATCCATTATGGTGGCGAACAAGGGACTCTATTAAGAGCGGATA EFNPLWWRTRDSIKSG
ACAACACCTTTTTTGTGCAATTCACCCAAAGCAACGGCCAAAAATTTGTTTTTGAAGAAACTTTTAATCCGGGCTCTAT
CACCTATAAATATTTCACTATCCATTCTTCGCTTTTCCACACAGACGCTGATTCTAAGGATATTTGGAGTCAAGTGAG
GAAGCAATTTGATTTCATTCCAGGAAAAACCCCTGTGTGTGTTGGCGTGTGCTATATCGCGCCTTATAAAAATCAAGA
CCTTATTGGCT
HP0634 3107 ATCATTTTAAAGACGCTCTCACTAAAAGTAACGCTACCCACAGCAACGCGCAAACCTTTTTTATTCTAGGGATTAATG 3108
AAATCTTGCGCAAAAAACCCTCTAAAGCGCTCAAGTATTTTGAACGATCAGAAGCGGTTGTCAAAGACGATGATTTTT KKPSKALKYFERSEAVVKDDDF
CAAAAGACAGAGCGATTTTTTGGCAGTATTTAGTTTCTAAAAAGAAAAAAACTTTAGAACGCCTTTCACAAAGCCCAG IFWQYLVSKKKKTLERLSQSPA
CTTTAAATCTCTATAGTCTTTATGCGAGCCGCAAACTCAAAACCACGCCCAGTTACCGCATCATTTCACGCATCCAGA YASRKLKTTPSYRIISRIQNLSQ
ATTTAAGCCAAGAAGATCCTCCTTTTAACACTTATGACCCTTTTTCGTGGCAAATTTTTAAGGAAAAAACCTTGAGTTT NTYDPFSWQIFKEKTLSLKDEG
GAAAGATGAGGGCGCGTTTAATGCGATGCTAAAAAGCCTGTATTATGAAAAAAGCGCTCCTGAATTGACCTATCTTTT LKSLYYEKSAPELTYLLS
AAGCC
HP0634 3109 ATGAAGTTCAAAGAAATGAAGCTCAAAAAGAAACCCCCCAATCCAATCAAACGCCTAAAGAAATGAAAGTCAAGTCCA 3110 EVQRNEAQKETPQSNQTPKEM
TTTCTTATGTCGGGCTTTCTTACATGTCTGACATGCTCGCTAATGAAATTGTAAAGATTCGTGTGGGCGATATTGTGG SYVGLSYMSDMLANEIVKIRVG
ATTCTAAAAAAATAGACACCGCTGTTTTGGCTTTGTTCAATCAAGGGTATTTTAAAGACGTTTATGCCACTTTTGAAGG KIDTAVLALFNQGYFKDVYATFE
CGGCATATTAGAGTTTCATTTTGATGAAAAAGCCAGGATTGCCGGGGTAGAAATCAAGGGTTATGGGACTGAAAAGG FHFDEKARIAGVEIKGYGTEKEK
AAAAAGACGGCTTAAAATCCCAAATGGGGATCAAAAAGGGCGACACCTTTGATGAGCAAAAATTAGAGCATGCTAAA SQMGIKKGDTFDEQKLEHAKTA
ACGGCTTTAAAAACCGCTTTAGAGGGGCAGGGCTATTATGGGAGCGTGGTGGAGGTGCGCACAGAAAAGGTCAGT EGQGYYGSVVEVRTEKVSEGA
GAGGGTGCATTATTGATCGTGTTTGATGTGAATAGGGGGGATAGCATTTATATCAAACAATCCATTTATGAGGGAAG VNRGDSIYIKQSIYEGSAKLKRR
CGCGAAATTAAAACGCCGCATGATTGAATCTTTGAGTGCGAACAAGCAACGAGATTTCATGGGCTGGATGTGGGGC SANKQRDFMGWMWGLNDGKL
TTGAATGACGGGAAATTGCGTTTAGATCAACTAGAATACGATTCTATGCGTATCCAAGATGTGTATATGCGTAGGGGT EYDSMRIQDVYMRRGYLDAHIS
TACTTAGACGCTCATATTTCTTCGCCTTTTTTGAAAACGGATTTTTCTACCCATGACGCTAAGCTTCATTATAAAGTCA TDFSTHDAKLHYKVKEGIQYRIS
AAGAGGGGATCCAATACAGGATTTCAGACATTTTAATAGAGATTGACAACCCGGTAGTCCCCTTAAAAACCTTAGAAA DNPVVPLKTLEKALKVKRKDVF
AAGCGCTTAAAGTGAAAAGGAAAGATGTCTTTAATATTGAGCATTTAAGAGCGGATGCGCAAATTTTAAAAACCGAAA ADAQILKTEIADKGYAFAVVKPD
TCGCCGATAAGGGTTATGCGTTTGCGGTGGTGAAGCCAGACTTGGATAAAGATGAAAAAAACGGGCTTGTGAAAGT KNGLVKVIYRIEVGDMVYINDVII
CATTTATCGTATTGAAGT RTSDRIIRRELLLGPKDKYNLTK
NSLRRLGFFSKVKIEEKRVNSSL
VSVEEGRTGQLQFGLGYGSYG
GSVSERNLFGTGQSMSLYANIA
RSYPGMPKGAGRMFAGNLSLT
DSWYSSTINLYADYRISYQYIQQ
GVNVGRMLGNRTHVSLGYNLN
GFSSPLYN
HP0634 3111 TTGCCCCATGTTGATAGCGAAATCTAGTAGTGAAAGTAGTGGCGCAGCTACTACAAACGCCCCTTCATGGCAAACAG 3112 CPMLIAKSSSESSGAATTNAPS
CCGGTGGCGGCAAAAATTCATGTGCGACTTTTGGTGCGGAGTTTAGTGCCGCTTCAGACATGATTAATAATGCGCAA GGGKNSCATFGAEFSAASDMIN
AAAATCGTTCAAGAAACCCAACAACTCAGCGCCAACCAACCAAAAAATATCACACAACCCCATAATCTCAACCTTAAC VQETQQLSANQPKNITQPHNLN
ACCCCTAGCAGTCTTACGGCTTTAGCTCAAAAAATGCTCAAAAATGCGCAATCTCAAGCAGAAATTTTAAAACTAGCC SLTALAQKMLKNAQSQAEILKLA
AATCAAGTGGAGAGCGATTTTAACAAACTTTCTTCAGGCCATCTTAAAGACTACATAGGGAAATGCGATGCGAGCGC SDFNKLSSGHLKDYIGKCDASAI
TATAAGCAGTGCGAATATGACAATGCAAAATCAAAAGAACAATTGGGGGAACGGGTGTGCTGGCGTGGAAGAAACT MTMQNQKNNWGNGCAGVEET
CTGTCTTCATTAAAAACAAGTGCCGCTGATTTTAACAACCAAACGCCACAAATCAATCAAGCGCAAAACCTAGCCAAC TSAADFNNQTPQINQAQNLANT
ACCCTTATTCAAGAACTTGGCAACAACCCTTTTAGGAATATGGGCATGATCGCTTCTTCAACCACGAATAACGGCGC GNNPFRNMGMIASSTTNNGAL
CTTGAATGGCCTTGGGGTGCAAGTGGGTTATAAGCAATTTTTTGGGGAAAAGAAAAGATGGGGGTTAAGGTATTATG QVGYKQFFGEKKRWGLRYYGF
GTTTCTTTGATTACAACCACGCCTATATCAAATCCAATTTCTTTAACTCGGCTTCTGATGTGTGGACTTATGGGGTGG HAYIKSNFFNSASDVWTYGVGS
GCAGCGATTTATTGTTTAATTTCATCAATGATAAAAACACCAACTTTTTAGGCAAGAATAACAAGATTTCAGTGGGATT FINDKNTNFLGKNNKISVGFFG
TTTTGGAGGTATCGCCTT
HP0634 3113 TGTGATCGCCAAAAAAAAAGCTTTAGGGGTTTTAGACATCAAAGCTTATGCAGGCCATACGCCCTTTAACACCTATAA 3114 VIAKKKALGVLDIKAYAGHTPFN
AGGCGATGGGCTTATCATTGCCACGCCCTTAGGCTCGACCGCTTATAATTTGAGTGCTCATGGGCCGATTGTGCATG GLIIATPLGSTAYNLSAHGPIVHA
CTTTAAGCCAGAGCTATATTTTAACGCCCTTGTGCGATTTTTCTTTAACGCAACGCCCTTTAGTGTTAGGAGCGGAAT YILTPLCDFSLTQRPLVLGAEFC
TTTGTTTGAATTTTTGCGCTCATGAAGACGCTCTTGTGGTCATTGATGGGCAAGCCACCTATGATTTAAAGGCCAACC HEDALVVIDGQATYDLKANQPL
AACCCCTATACATTCAAAAAAGCCCCACGACCACCAAGCTCTTACAAAAAAATTCAAGGGATTATTTTAAAGTGCTTA PTTTKLLQKNSRDYFKVLKEKLL
AAGAAAAGCTCTTATGGGGGGAAAGCCCTAGCAAAAA GATAAAAAGGGTAAAAAACATGCGAGATTTCAATAACG PSKKR
CTCAAATCACACGCTTAAAAGTGCGTCAAAACGCTGTTTTTGAAAAATTGGATCTGGAGTTTAAAGACGGCTTGAGCG
CGATTAGTGGGGCTAGCGGGGTGGGAAAAAGCGTTCTTATTGCGAGTCTTTTAGGGGCGTTTGGGCTTAAAGAGAG
CAACGCTTCAAACATTGAAGTGGAATTGATTGCGCCTTTTTTAGACACTGAAGAATACGGCATTTTTAGAGAAGATGA
ACATGAACCCTTAGTCATCAGCGTGATTAAAAAAGAAAAAACGCGCTATTTTTTAAACCAAACAAGCCTGTCTAAAAA
CACGCTCAAAGCGTTATTAAAAGGTTTGATTAAACGCTTGTCTAACGATAGATTCAGCCAGAATGAACTCAACGATAT
TTTAATGCTCTCCTTACTGGATGGCTATATCCAAAACGAAAATAAGGCGTTTAGCCCCCTTTTAGGCGCGCTTGAAGA
AAAATTCACCCGATTAGAGAAGCTAGAAAAAGAAAGGCGATTGTTAGAGGATAAAAAGCGTTTCCAAAAGGATTTAGA
AGAACGATTGAATT
HP0634 3115 TCTCTCTAACCCCATTAAAATGGATTTTGCCAGCCAAAAACAACCGGGCGTGCAAAAAGCCACCAACCAGATCCATC 3116 LSNPIKMDFASQKQPGVQKATN
AAGGCATACAAAACATCCAGCAAAATATCCCTTCTCAAGTATTAACCCCTCAAATCCAAGCGGGCATGCAAGGGGTG GIQNIQQNIPSQVLTPQIQAGMQ
ATGCAAGGTTTTGGGGCTTTGAGCAGCACTTTAGAAGCCCCCTTATTGTTTTCTAAGCAAAATGTGGTGATTGGGGC GFGALSSTLEAPLLFSKQNVVIG
TTTGAGCATTATTTATCCCCTTTATATGGGTGGGGCAAGATTCACGATGGTGCGCATTGCGGATTTGATGCAAAAAGA PLYMGGARFTMVRIADLMQKDA
TGCTAATGAAGTGTATCGTTTGAAAAAGCTTTCCACTTTTCAAGAGCTTGTGAGCGTGTATTACGGCATGGTGTTAAA RLKKLSTFQELVSVYYGMVLNA
CGCAGAAGTGGCTGAAACTTTAGAAGAGGTGGAAAAAGGCCATTATAAGCATTTCCAAAACGCTTTGAAAATGCAAA LEEVEKGHYKHFQNALKMQKV
AAGTGGGGCAAATCGCTAGGGTAGAAACCTTAGGCGCTCAAGTGGCTTATGATAAGGCCCATATCGCTAGCGTTAA VETLGAQVAYDKAHIASVKAKD
GGCTAAAGACGTGTTAGAAGTTTCGCAGCTCTCGTTCAATTCCATTTTATCTAGCAAGGACGATTTAGTGCCTTCAAG QLSFNSILSSKDDLVPSSKLEIRT
CAAATTAGAGATCCGCACGGAGAAAAATCTGCCCGATCTGAGCTTTTTTGTTTCTTCCACGCTCAATTCCTACCCGGT PDLSFFVSSTLNSYPVLKTLENQ
TTTAAAGACTTTAGAAAATCAGATTCAAATCTCTAAAGAAAACACGAAATTACAGATCGCTAAATTCTTGCCCCAAGTG ENTKLQIAKFLPQVSFFGSYIMK
AGTTTTTTTGGCTCTTATATTATGAAGCAAAACAATTCGGTGTTTGAAGACATGATCCCTAGTTGGTTTGTGGGCGTG VFEDMIPSWFVGVAGRMPILSP
GCCGGGCGCATGCCTATTCTTTCTCCCACAGGGCGCATTCAAAAATACCAAGCGAGCAAATTAGCGGAGTTGCAAG KYQASKLAELQVSSEQIQAKKN
TGAGTAGCGAACAAATCCAGGCTAAAAAAAACATGGAATTATTAGTGAATAAGACTTATAAAGAGACGCTTTCTTATTT NKTYKETLSYLKEYKSLLSSVEL
GAAAGAATACAAAAGC KLQEQAFLQGLSTNAQVIDARN
VEQKSVAYKYIVSLANLMALSDH
EFVY
HP0634 3117 CAAACAGAGTGGCTTTGGAAGAAATTTTAGCCCTAAAACCATCGCTTTTAAGCTTTAGCGCGGATAAATTCTTTAACA 3118 NRVALEEILALKPSLLSFSADKF
GTGCGCAAGCGGGCATTATTATGGGGCAAAAAGAACGGGTTGAAGCGTTAAAAAACCACCCCCTTTATAGAGTTTTA AGIIMGQKERVEALKNHPLYRVL
AGGGTGGGTAAAATCACGCTCACCTTGCTTTTTTGCAGCCTAAAAGCATGGATAAATCATCAAGAAGACATTACAATC TLTLLFCSLKAWINHQEDITIHAL
CATGCGTTATTGAACCAAACTAAAGACGCATTATTGCAAAAAGCCCTCAAACTCTACGCTCTTTTAAAGCCTTTAGAAT DALLQKALKLYALLKPLELNVSIA
TGAATGTGAGCATAGCCTCTAGCTTTTCTAAAATAGGGAATTTGTTTGGTAGGGAATTAGAATCCTTTTGCGTGAAAA KIGNLFGRELESFCVKIQPKNTR
TCCAGCCCAAAAACACCCGTGCtTTAAATAGTGAGAAACTTTATTTAAAGCTTTTCCAAAAAGGCGTTATCGCAAGGA KLYLKLFQKGVIARISCEFVCFE
TTTCATGCGAATTCGTGTGCTTTGAAGTCTTTAGCTTGAATGAAAAAGATTTTGAAAAAATCGCTCTGGTTTTAGAAGA EKDFEKIALVLEEILNKA
AATTCTTAATAAAGCTTAAAAATTCGCTATAATAAAATTTCTTTTAAACGCGCCATATCCCCCACAAAACGCTAGAGAA
TGATAGAA CGACAGMCATCMTTTAAAGGAACTTAAGAATGGAAAAAATCAGCGATCTTATAGAATGCATTGCGT
ATGAAAAAAATTTGCCTAAAGAGATGATTTCAAAAGTGATTCAAGGCTGTTTGTTAAAAATGGCGCAAAATGAGTTAG
ACCCCCTAGCACGCTACTTGGTGGTTGAAGAAAACAAGCAGCTCCAGCTTATCCAGTTGGTAGAAGTTTTAGAAGAT
GGTGATGAAAGATTGGTTAACGACCCTTCTAAATACATCAGCCTGTCTAAAGCCAAAGAAATGGATCCAAGCGTTAA
GATTAAAGACGAATTGTCCTATAGCTTGAGTTTGGAGAGCATGAAACAAGGAGCGATCAACCGCCTTTTTAAAGATTT
GCAATACCAG
HP0634 3119 ATTAAGGAGTTTTTTAATCATGGCAGATATTCAAAGGCGTGATTTTTTAGGAATGAGCCTTGCTAGTGTTACAGCTATA 3120 LRSFLIMADIQRRDFLGMSLASV GGGGCTATAGCGAGTCTGGTAGCGATGAAAAAGACTTGGGATCCGCTTCCAAGCGTTGTTTCAGCCGGTTTTACGA ASLVAMKKTWDPLPSVVSAGFT CCATAGATGTGGCGAATATGCAAGAAGGGCAGTTTTCCACCGTGGAATGGCGTGGGAAACCGGTCTATATCCTCAA NMQEGQFSTVEWRGKPVYILK GCGTTCTAAAAAAGAGGGCTTTAATGAAAAGCGCGATTTTAAAGTTGGCGAGAGCGTTTTTACCACAGCCATTCAAAT GFNEKRDFKVGESVFTTAIQICT
TTGCACGCATTTAGGGTGTATCCCCACTTATCAAGATGAAGAAAAAGGCTTTTTATGCCCATGCCATGGGGGGCGTT PTYQDEEKGFLCPCHGGRFTS
TCACTTCTGATGGCGTGAATATTGCCGGCACTCCCCCTCCACGCCCTTTTGATATCCCGCCTTTTAAAATTGAAGGC AGTPPPRPFDIPPFKIEGTKITFG
ACTAAGATCACTTTTGGTGAAGCCGGGGCTGAATACAAGAAAATGATGGCTAAAGCGTAAGGAGAGTTTAATGGCAG EYKKMMAKA
AGATAAAAAAAGCGAAAAATTTAGGCGAATGGCTGGACATGCGTCTTGGCACTAACAAGCTTGTTAAAGTGCTAATG
ACAGAATATTGGATCCCTAAAAACATCAATTTTTTATGGGCGATGGGGGTGATTTTATTAACCCTTTTTGGCGTGCTT
GTGGTCTCAGGGATTTTCTTGCTCATGTATTACAAGCCTGATGCGAAAATGGCGTTTGATAGCGTGAATTTCACCATC
ATGCAAGAAGTGGCTTATGGCTGGCTTTGGCGCCACATGCATGCCACGGCAGCGAGCATGATTTTTGTCATCATTTA
TATCCACATGTTTGTTGGCATCTATTATGGCTCTTACAAAAAGGGTCGTGAGATGATTTGGATTAGCGGGATGATTTT
GTTTGTGGTCTTTAGCGCGGAAGCCTTTAGCGGGTATATGCTGCCTTGGGGGCAGATGAGTTATTGGGCCGCAGCG
GTTATCACGAATTTATTTGG
HP0830 3121 AAAAATGACCGGTTTAGAAGAGGTTTTGCTTCTAAGCAGCAGCGGGACAGGGGCTATGGAAGCGAGCGTGATTTCC 3122 KMTGLEEVLLLSSSGTGAMEAS
TTGTGTCAAAAAGAGTTGCTTTTTGTTAATGCGGGCAAGTTTGGCGAAAGGTTTGGCAAGATCGCTAAAGCCCATTCT QKELLFVNAGKFGERFGKIAKAH
ATCAAAGCCCATGAATTAGTCTATGAATGGGACACACCAGCTCAAGTAGATGAAATATTAAGCGTTCTTAAAGCCAAC HELVYEWDTPAQVDEILSVLKAN
CCTAACATTGATGCGTTTTGCATTCAAGCATGCGAGTCTAGTGGGGGGTTACGACACCCTGTGGAAAAAATCGCTCA AFCIQACESSGGLRHPVEKIAQA
AGCGATCAAAGAAACTAACCCGAATGTTTTTGTAATTGTAGATGCTATCACCGCTTTAGGGGTTGAGCCTTTAGAAAT PNVFVIVDAITALGVEPLEITHVD
AACGCATGTTGATGCGCTCATTGGAGGGAGTCAAAAAGCGTTCATGCTGCCTCCTGCGATGAGCCTAGTCGCATTG SQKAFMLPPAMSLVALSQNAIE
AGCCAGAATGCAATTGAGCGTATAGAAGAACGCAATGTGGGGTTTTATTTCAATTTAAAGAGCGAATTGAAAAACCAA NVGFYFNLKSELKNQRNNTTSY
AGGAATAACACCACAAGCTACACCGCTCCTATTTTACACACTTTAGGGTTGCAACGCTATTTTGAATTGGTGCAAAAT HTLGLQRYFELVQNLGGFEALY
TTAGGGGGCTTTGAAGCGCTCTATAGAGA
HP0830 3123 AAAGAATGCAAATTGATGACGCATTATTGCAACGCTTGGAAAAATTGAGCATGCTAGAGATTAAAGATGAGCATAAAG 3124 KECKLMTHYCNAWKN AGAGCGTTAAAGGTCATTTAGCGGAGGTTTTAGGCTTTGTAGAAAACATTTTCGCTTTAGAAACTAGCGCGCTAAAAA CAGATATAGAGCTATGCACCCCCTTAAGAGAAGACGAGCCTAAAAGCCAACCTAACACCGCCAAAGAGATTTTGAGC CAAAACAAACACAGCCAGGATCATTACTTCGTTGTGCCCAAGATCATTGAATAGGTTTTATTGAATAGGTTTCATATCA AGTTAAAAACTTGATTTTTGAAATCAAACAGACAGAAAAAAGCTATCGCTTGATCCAATTCAAGCTTTTTACTA
HP0830 3125 GTTCTGTCTTGTTAGTCACTTTAGGAGCGAGCATGCACGCACAATCTTACTTACCCAAACATGAGAGCGTTACCTTAA 3126 SVLLVTLGASMHAQSYLPKHES
AAAACGGGTTGCAAGTCGTGAGCGTCCCCCTAGAAAATAAAACCGGGGTTATAGAAGTGGATGTGCTTTATAAAGTC GLQVVSVPLENKTGVIEVDVLY
GGCTCTAGAAACGAAACCATGGGAAAGAGCGGGATCGCTCACATGTTAGAGCATTTGAATTTTAAAAGCACCAAAAA NETMGKSGIAHMLEHLNFKSTK
CCTTAAAGCCGGCGAATTTGATAAAATCGTTAAGCGTTTTGGGGGCGTGAGTAACGCTTCTACGAGTTTTGATATTAC EFDKIVKRFGGVSNASTSFDITR
GCGCTACTTCATTAAAACCAGTCAGGCTAACTTGGATAAGTCTTTAGAATTGTTCGCTGAAACCATGGGTTCATTGAA SQANLDKSLELFAETMGSLNLK
TTTAAAAGAAGATGAGTTTTTGCCTGAGCGTCAAGTGGTCGCTGAAGAAAGGCGATGGCGCACTGATAATTCCCCTA PERQVVAEERRWRTDNSPIGM
TCGGCATGCTTTATTTCCGCTTTTTTAACACCGCTTATGTCTATCACCCCTACCATTGGACGCCCATTGGTTTTATGG FNTAYVYHPYHWTPIGFMDDIQ
ATGATATTCAAAATTGGACTTTAAAAGACATTAAAAAATTCCATTCGCTCTATTATCAGCCTAAAAACGCTATCGTTTTG DIKKFHSLYYQPKNAIVLVVGDV
GTGGTAGGCGATGTCAATTCCCAAAAGGTTTTTGAATTGAGTAAAAAGCATTTTGAATCCTTAAAAAACCTTGATGAA VFELSKKHFESLKNLDEKAIPTP
AAAGCTATCCCCACCCCTTACATGAAAGAGCCTAAGCAAGATGGAGCCAGAACGGCAGTCGTGCATAAAGATGGGG KQDGARTAVVHKDGVHLEWVA
TCCATTTAGAATGGGTGGCCCTTGGGTATAAAGTGCCTGCTTTCAAGCATAAAGATCAAGTCGCCTTAGACGCACTA PAFKHKDQVALDALSRLLGEGK
AGTAGGCTTTTAGGCGAAGGCAAAAGCTCGTGGTTGCAAAGCGAATTAGTGGATAAAAAACGCTTGGCTTCTCAAGC QSELVDKKRLASQAFSHNMQL
TTTCTCGCACAACATGCAATTACAAGATGAAAGCGTGTTTTTATTCATTGCGGGGGGTAATCCTAATGTCAAAGCCGA FLFIAGGNPNVKAEALQKEIVAL
AGCCTTACAAAAAGA KGEITQAELDKLKINQKADFISN
DVAGLFADYLVQNDIQGLTDYQ
LKVSDLVRVANEYFKDTQSTTV
HP0830 3127 GGAGCGAGAAAGCCTCTAAAACTAAAAACAGCGCCATTGAAGTTTCCAACACGAAAGCTTCTGCTATGAAGAACGAA 3128 SEKASKTKNSAIEVSNTKASAM
ACGATTGGAAGCGGCGATCTTAAAAAGGTGTGTGAGAAAATCAAAAGCGCACTACCCTTTGGGATCATCTCAGCCTT GSGDLKKVCEKIKSALPFGIISA
TAAACCCTTTAAAGACGCTTTTTACAGAGATTTCAATCATAATGAGCAAAAGTTACTGATAGGGGCAGCTAAAAGCGG DAFYRDFNHNEQKLLIGAAKSG
TTGCATTCAATCTAGCGCTGATAAACTGGCTCAGTTAAAAACGCGCTTACTCTACTGGCAAGACAAATCTGTTAAAGT ADKLAQLKTRLLYWQDKSVKVD
GGATTGGGATAAACCCATTTTAATCAAGGACTTCTTTAAAGGCAATAATTACCTTTATAGGAGGTTTTGTTTTTTATTG ILIKDFFKGNNYLYRRFCFLLGK
GGGAAGCATTTTATGGACAGATTTTTAAAGAATAACGCTAAGGCGAGCGTGAAAGACTTTATGTCTAGTAAGGAGTTT FLKNNAKASVKDFMSSKEFVAK
GTCGCTAAATACCGATACACCCCCAAGCAAAATACAGAAAGAGCGAAAAAGCTGCAATCGTATTTAGAGAATAAGCG KQNTERAKKLQSYLENKRDFIG
CGATTTTATAGGGTTTGTTCAAGCGCTTAACTCTTTAAAAGACAACCCGCAAGATCCTTTTTTACCCAATGAAGAAAC NSLKDNPQDPFLPNEETS
GAGCTT
HP0830 3129 TGATGAGATGCGAGCCGGAGCGTTTGAGCGCTTCACCAACCGCAAAAAGCGTTTCAGAGAAAACGCGCAAAAAAAC 3130 DEMRAGAFERFTNRKKRFREN
GCAGAGTATTCAAACCATGAAGCGTCTTCGCACCATAAAAAAGAGCATCGCCCTAACAAAAAACCAAACAACCACCA EYSNHEASSHHKKEHRPNKKP
CAAACAAAAACATGCCAAAACACGAAATTACGCCCAAGAAGAATTGGATAGCAACAAAGTAGAGGGCGTTACGGAAA QKHAKTRNYAQEELDSNKVEG
TTTTGCATGTGAATGAGAGAGGGACTTTAGGCTTTCATAAGGAGTTAAAAAAGGGCGTTGAAGCGAATAACAAGATC VNERGTLGFHKELKKGVEANN
CAAGTGGAGCATTTAAACCCGCATTATAAGATGAACTTAAACTCTAAAGCGAGCGTTAAAATCACGCCTTTAGGGGG LNPHYKMNLNSKASVKITPLGG
CTTGGGTGAGATTGGGGGGAACATGATGGTCATTGAAACCCCAAAAAGCGCGATCGTGATTGATGCGGGCATGAGC GNMMVIETPKSAIVIDAGMSFP
TTCCCTAAAGAGGGGCTCTTTGGCGTGGATATTTTAATCCCGGATTTTTCCTACTTGCACCAAATCAAGGACAAAATC GVDILIPDFSYLHQIKDKIAGIIIT
GCTGGCATTATCATCACCCATGCCCATGAAGATCACATAGGGGCCACGCCTTATTTGTTTAAAGAGCTGCAATTCCC HIGATPYLFKELQFPLYGTPLSL
CCTTTATGGCACGCCCTTGAGTTTGGGGCTGATTGGGAGCAAGTTTGATGAACATGGTTTGAAAAAATACCGCTCGT KFDEHGLKKYRSYFKIVEKRCPI
ATTTTAAAATCGTAGAAAAGCGCTGTCCCATTAGCGTGGGCGAATTTATCATTGAATGGATCCACATCACGCATTCTA IIEWIHITHSIIDSSALAIQTKAGTI
TCATTGACAGCAGCGCTTTAGCGATCCAAACTAAAGCCGGAACGATCATCCACACCGGCGATTTTAAAATCGATCAC FKIDHTPVDNLPTDLYRLAHYG
ACCCCGGTGGATAATTTGCCCACGGATTTGTATCGTTTAGCGCACTATGGCGAAAAGGGGGTGATGCTTCTTTTAAG LLLSDSTNSHKSGTTPSESTIAP
CGATTCCACCAACTCCCATAAATCCGGGACTACGCCGAGTGAAAGCACCATAGCGCCGGCTTTTGATACCCTTTTTA FKEAQGRVIMSTFSSNIHRVYQ
AAGAAGCGCAAGGGAGGGTGATT KYNRKIAVIGRSMEKNLDIAREL
PYQSFIEANE
HP0830 3131 GAAGCAAAACAATTCGGTGTTTGAAGACATGATCCCTAGTTGGTTTGTGGGCGTGGCCGGGCGCATGCCTATTCTTT 3132 KQNNSVFEDMIPSWFVGVAGR
CTCCCACAGGGCGCATTCAAAAATACCAAGCGAGCAAATTAGCGGAGTTGCAAGTGAGTAGCGAACAAATCCAGGC PTGRIQKYQASKLAELQVSSEQI
TAAAAAAAACATGGAATTATTAGTGAATAAGACTTATAAAGAGACGCTTTCTTATTTGAAAGAATACAAAAGCTTGCTT NMELLVNKTYKETLSYLKEYKSL
TCTAGCGTGGAATTAGCCAAGGAAAACTTAAAACTCCAAGAGCAGGCTTTTTTACAAGGCTTAAGCACGAACGCTCA ELAKENLKLQEQAFLQGLSTNA
AGTCATTGATGCGAGGAACACGGTTTCTTCTATCGTCGTGGAGCAAAAAAGCGTGGCTTATAAATACATCGTTTCATT RNTLSSIVVEQKSVAYKYIVSLA
AGCGAATTT TGGCGTTAAGCGATCATATTGATTTATTTTATGAATTTGTTTATTAAGGGAAAAAATCATGTCAAATA SDHIDLFYEFVY
GCATGTTGGATAAAAATAAAGCGATTCTTACAGGGGGTGGGGCTTTATTATTAGGGCTAATCGTGCTTTTTTATTTAG
CTTATCGCCCTAAGGCTGAAGTGTTGCAAGGGT TTTGGAAGCCAGAGAATACAGCGTGAGCTCCAAAGTCCCTGG CCGCATTGAAAAGGTGTTTGTTAAAAAAGGCGATCACATTAAAAAGGGCGATTTGGl TTT" AGCATTTCTAGCCCTGA
ATTAGAAGCCAAACTCGCTCAAGCTGAAGCCGGGCATAAAGCCGCTAAAGCGCTTAGCGATGAAGTCAAAAGAGGC TCAAGAGACGAAACGATTAATTCTGCGAGAGACGTTTGGCAAGCAGCCAAATCCCAAGCCACTTTAGCCAAAGAGAC TTATAAGCGCGTTCAAGATTTGTATGATAATGGCGTGGCGA
HP0406 3133 GGGATTGAAACTTTAGGGGGCGTGATGACTAAAGTGATTGATAGAGGCACGACTATTCCGGCGAAAAAATCTCAAGT 3134 GIETLGGVMTKVIDRGTTIPAKK GTTCTCAACCGCTGAAGACAACCAGCCCGCTGT TAEDNQPA
HP0406 3135 AGCAATACACAGAAAGATTTGGTTAAAGAACAGAAAGATTTGGTTAAAGAACAGAAAGATTTGGTTAAAGAACAGAAA 3136 SNTQKDLVKEQKDLVKEQKDLV
GATTTGGTTAAAGAACAGAAAGATTTGGTTAAAACACAGAAAGATTTCATTAAATATGTAGAACAAAATTGCCAAGAAA DLVKEQKDLVKTQKDFIKYVEQ
ATCATAATCAATTCTTTATTGAAAAAGGAGGAATTAAGGCTGGTATTGGTATAGAAGTAGAAGCTGAATGCAAAACCC HNQFFIEKGGIKAGIGIEVEAEC
CTAAACCTGCAAAAACCAATCAAACCCCTATCCAGCCAAAACACCTCCCAAACTCTAAACAACCCCGCTCTCAAAGA AKTNQTPIQPKHLPNSKQPRSQ
GGATCAAAAGCGCAAGAGCTTATCGCTTATTTGCAAAAAGAGCTAGAATTTCTGCCCTATTCGCAAAAAGCTATCGCT AQELIAYLQKELEFLPYSQKAIA
AAACAAGTGGATTTTTACAGGCCAAGTTCTATCGCTTATTTAGAACTAGATCCTAGAGATTTTAAGGTTACAGAAGAAT YRPSSIAYLELDPRDFKVTEEW
GGCAAAAAGAAAATCTAAAAATACGCTCTAAAGCTCAAGCTAAAATGCTTGAAATGAGAAACCCACAAGCCCACCTTT KIRSKAQAKMLEMRNPQAHLSN
CAAACTCTCAAAGCCTTTTGTTCGTTCAAAAAATATTTGCTGATGTTAATAAAGAAATAGAAGCAGTTGCTAATACTGA FVQKIFADVNKEIEAVANTEKKA
AAAGAAAGCAGAAAAAGCGGGTTATGGTTATAGTAAAAGGATGTAGCGGTTAAAAACATTGCACCAAGTTTTTAATTA GYSKRM
TCTGTCGGCTTTTGAAAACATTTTTTATGGTAGCGTTATTTGGCAATAAAAGAGATCCTTATTTAAAAGCTTAAGCTAT
CCCTAAAGTAAGGTAAAACTTACAAATGGAGGTTAAAACGCGCAATAAAACTCCAAAGAGCCGACTTTTTTAATCGTA
ATTTGCTTATGGTTTAACAAAGTTTCTAAATGCTTAAAAGCGTTAGAATCTAGAATTAGGGGGGCTTCTTCTGCGCTTA
AAGGGCGCCTGGAAATTAAAGCGAGCAATTCGTCTATACCGCAAGAAATCAATTTTTTAGCTTGAGTAATGGATCGTT
TAGGCA
HP0406 3137 CAAAAGACAGAACAAGAAAAACAAAAGACAGAACAAGAAAAACAAAAGACAGAACAGGAAAAACAAAAGACAAGCAA 3138 QKTEQEKQKTEQEKQKTEQEK
TATAGAGACTAACAATCAAATAAAAGTAGAACAAGAACAACAGAAGACAGAACAAGAAAAACAAAAGACAAACAATAC IETNNQIKVEQEQQKTEQEKQK
GCAAA GATTTGGTTAAATACGCAGAACAAAATTGCCAAGAAAATCATAATCAATTCTTTATTAAAAAATTAGGAATT KDLVKYAEQNCQENHNQFFIKK
AAGGGTGGCATTGCTATAGAAGTAGAAGCTGAATGCAAAACCCCTAAACCTGCAAAAACCAATCAAACCCCTATCCA GIAIEVEAECKTPKPAKTNQTPI
GCCAAAACACCTCCCAAACTCTAAACAACCTCATTCTCAAAGAGGATCAAAAGCGCAAGAGTTTATCGCTTATTTGCA PNSKQPHSQRGSKAQEFIAYLQ
AAAAGAGCTAGAATTTCTGCCCTATTCGCAAAAAGCTATCGCTAAACAAGTGAATTTCTATAAACCAAGTTCTATCGCT LPYSQKAIAKQVNFYKPSSIAYL
TATTTAGAACTAGATCCTAGAGATTTTAAGGTTACAGAAGAATGGCAAAAAGAAAATCTAAAAATACGCTCTAAAGCTC DFKVTEEWQKENLKIRSKAQAK
AAGCTAAAATGCTTGAAATGAGGGATTTAAAACCAGACCCACAAGCCCACCTTCCAACCTCTCAAAGCCTTTTGTTCG RDLKPDPQAHLPTSQSLLFVQKI
TTCAAAAAATATTTGCTGATGTTAATAAAGAAATAGAAGCAGTTGCTAATACTGAAAAGAAAGCAGAAAAAGCGGGTT NKEIEAVANTEKKAEKAGYGYS
ATGGTTATAGTAAAAGGATGTAGGCATAAGAAAATAAGAACACCATAAAATCGTTTTTAGCTTCTAGGAGACATCAGT
CAGTTTCTTGCCATAGAAAAATCGCTTATTAGCTTGCTCCCTTTAAAAGGGTGTGAGTTTAAGGTATAAGGAAAACTT
GTATCAAGTTTTATTGGAATGGATTAGAAAAATCTGATTGGATTGACCCTTACAATTTTTCAAACCAATCGTTTAATAG
CGATTAAATATGXCTATATACACTACAACAATAAGATTTTGAAAGGTTGGTAATGGAATCAGTAAAAACAGGAAAAACA
AATAAGG
Figure imgf000472_0001
HP0406 3149 TCCGAAA GGTGAACACCCTAGCGTTTCCTACAAAAAGGCCATTTCCCAACAAAAGATTCAAGCTAAAATTGAAGAA 3150 SEKGEHPSVSYKKAISQQKIQA
TTAGGCGAAAACTATGAAAACGCCATTATTGAAGGCAAGATTGTAGGCAAGAATAAAGGGGGTTATATCGTGGAGTC ENYENAIIEGKIVGKNKGGYIVE
TCAAGGCGTGGAGTATTTCCTCTCCCGCTCGCACTCTTCTTTAAAGAATGACGCAAACCATATCGGCAAACGCGTTA YFLSRSHSSLKNDANHIGKRVK
AAGCGTGCATCATTCGTGTGGATAAGGAAAACCATTCTATCAATATTTCTCGCAAACGATTCTTTGAAGTCAATGACA DKENHSINISRKRFFEVNDKRQL
AACGACAACTTGAGGTTTCTAAGGAATTGTTAGAAGCCACAGAGCCGGTGTTAGGGGTTGTGCGCCAGATCACCCC ELLEATEPVLGWRQITPFGIFV
TTTTGGCATTTTTGTAGAAGCTAAGGGGATTGAGGGCTTGGTCCATTATTCTGAAATCAGCCATAAGGGACCAGTCA GLVHYSEISHKGPVNPEKYYKE
ATCCTGAAAAATACTACAAAGAGGGCGATGAAGTCTATGTCAAAGCCATCGCTTATGATGCAGAAAAAAGACGCCTT VKAIAYDAEKRRLSLSIKATIEDP
TCACTCTCCATAAAAGCGACTATAGAAGACCCATGGGAAGAGATTCAAGACAAGCTAAAACCCGGATACGCCATTAA DKLKPGYAIKVWSNIEHYGVFV
GGTAGTGGTGAGCAACATTGAACATTATGGGGTGTTTGTGGATATTGGTAATGATATTGAAGGCTTTTTGCATGTTTC IEGFLHVSEISWDKNVSHPNNY
TGAAATCTCTTGGGATAAAAATGTCAGCCACCCTAACAATTACTTGAGCGTGGGGCAAGAGATTGATGTGAAAATCAT EIDVKIIDIDPKNRRLRVSLK
TGACATTGATCCAAAAAATCGCCGCTTAAGGGTTTCTTTAAAG
HP0406 3151 CGTGGATTTTAAGATCGCTCAAAACGAACAAGAAGAGCAGGATCTGTGGTTTTCAAGGCGTAACGCTTCTCAAAGTA 3152 VDFKIAQNEQEEQDLWFSRRNA
TTAGCGTTTATGGTAAAAAGAAATTGAATGAAGATGTAACCGTTCCTAGGGCGAGTTTGCCGAGTTTGTTGCAAGAA VYGKKKLNEDVTVPRASLPSLL
GTCGCCAAAATAAGCCAAAAATACGGCTTTAAAATCCCTTGCTTTGGGCATACGGGCGATGGCAATGTGCATGTGAA ISQKYGFKIPCFGHTGDGNVHV
TATCATGCTAGAAGATCCTAAAAGGGATTTAGAAAAAGGGCATGAGGCTATGGAAGAGATTTTTCAGGCCGCTATCA DPKRDLEKGHEAMEEIFQAAISL
GTTTGGAGGGGACTTTAAGCGGGGAGCATGGCATAGGCTTGTCTAAAGCCAAATTCATGCCTTTAGCGTTCAATCAT SGEHGIGLSKAKFMPLAFNHSE
AGTGAAATGGAGCTTTTTAGGAATATTAAAAAAGCTCTTGATCCTAATAATATTTTAAACCCTTTTAAAATGGGGTTGT NIKKALDPNNILNPFKMGL
AAGATTTAAAGCAAGGAAAGCGCATGAAAATCGGTGTTTATGGAGCGAGCGGTCGTATAGGGAAACTGCTTTTAGAA
GAATTAAAAGGGGGGTATAAGGGATTAGCGCTATCTAGCGTGTTTGTTAGGCAAAAATGCGAAACCGATTTCAGCTC rTTTTCGCACGCCCCTTTAGTAACCAACGATTTAAAAGCGTTTGTGAGGGCATGCGAATGCGTGATTGATTTTTCTTT
ACCTAAAGGCGTGGATAATTTGCTAGAGGCTCTTTTAGAATGCCCTAAAATTTTAGTTTCTGGCACAACCGGTTTAGA
AAAAGAAACGCTAGAAAAAATGCAACAATTAGCCTTAAAAGCACCGCTTTTGCATGCGCACAACATGTCTATAGGGAT
TATGATGCTCAACCAATTAGCCTTTTTAACTTCTTTGAAATTAAAAGATGCGGATATTGAAATTATAGAAACGCACCAC
AATCTCAAAAAAGATATCCCGAGCGGCACGGCGTTGAGTTTGTATGAAACTTGCGCTAAGGCTAGGGGGTATGATGA
AAAAAATGCCCTAATC
HP0406 3153 AGAAGAAATCACGGAGCGTTTGAAAGAACAGCATGCGAGTTTAAATTTAGATAAAAAAGACATTGAGCTCAAACACC 3154 EEITERLKEQHASLNLDKKDIEL CGATTAAAAGCACAGGGATTTATGAGATTGAAGTCAAGCTTGGATCGGGGGTTGTGGGCGTGTTTAAAATTGATGTG TGIYEIEVKLGSGWGVFKIDW GTGGCTGAGTAGAAAATGTTTGAAGCGACGACGATTTTAGGCTATAGAGGGGAATTGAATCATAAAAAGTTCGCGCT CATTGGAGGCGATGGGCAGGTAACTTTGGGT
HP0406 3155 AAAGGTTATCCATTGTTACCAATGGATTTCAAAAATGGCGGCGATATTGCCACTATTAACGCCACTAATGTTGATGCG 3156 KGYPLLPMDFKNGGDIATINATN GACAAAATAGCTAGCGATAATCCTATTTATGCTTCCATAGAGCCTGATATTGCCAAGCAATACGAAACAGAAAAAACC KIASDNPIYASIEPDIAKQYETEK ATTAAGGATAAGAATTTAGAAGCTAAATTAGCTAAGGCTTTAGGTGGCAATAAAAAAGATGACGATAAAGAAAAAAGT NLEAKLAKALGGNKKDDDKEKS AAAAAATCCACAGCAGAAGCTAAAGCAGAAAACAATAAGATAGACAAAGATGTCGCAGAAACTGCCAAGAATATCAG EAKAENNKIDKDVAETAKNISEI TGAAATCGCTCTTAAGAACAAAAAAGAAAAGAGTGGGGAATTTGTAGATGAAAATGGTAATCCCATTGATGACAAAAA KEKSGEFVDENGNPIDDKKKAE GAAAGCAGAAAAACAAGATGAAACAAGCCCTGTCAAACAG TSPVKQ
HP0406 3157 AATGCAAAGCTCTTCAGGCGAGAATGTTTTTTCATACGATCTTAAAACGGAATATGTTTTAGACCCTAACATTTTGATA 3158 MQSSSGENVFSYDLKTEYVLD
GAGACGATGAAAAGGCATGGTTTTGATTTTGTGGATATTAGACGGGTGTCTTTAAAAGAGTGGGAATACGATTTTTCT MKRHGFDFVDIRRVSLKEWEY
TTACAAGAAGTCAAGCTCCCTAACGCGAGAGTCTTAGTTTTGAGTAGCGAACCTGTGGAGTTTAAGGAAGCGAGCG EVKLPNARVLVLSSEPVEFKEA
GGAAATACTGGCTGAGCGTGAATCAAAACGCGTATTTAAAAATAAGCTCTAATAACCCTTTGTGGCAACCCAAAATTA LSVNQNAYLKISSNNPLWQPKII
TTTTTTATGATGAAAACTTAAAGATCATTCAAATCATTGCTAAAGAAAACAGACAACAAGAAATCGCTCTTAATTTGCT LKIIQIIAKENRQQEIALNLLNGV
CAATGGCGTGCGTTTTATCCATATCACTGACGCAAAAAACCCTATCGTTTTAAAAAATGGGATTAGCGTGGTTTTTGA DAKNPIVLKNGISWFDAMP TGCGATGCCTTAAGTAGGGGTAACCCTTATCCAATTGATTGGAATATTAAGAAAATCCGCTTTCAGGCTTGAAAAAAA GAGGGCCTTAATAATAGCGATCCACCCAATATTTACCAAAAAAAATC I I I I GTATCACACAGCCCACCAATAGACCAG CTCCCGCTTCTATAAAAAGGACAGCCCACCAATTAATAAAACCAAACCATACAAAAAACAAATGCGCTGGTAAAGCG
CCAAGCAATAGTGAAACGCCGCTCACAATAGAAATTCCAAAGCGGGTTTTAAAGATCTTACTTTTTTTGCCAAACGCC ATGTCTGCTACAAATTCCCCACTAAAAGAAATCAAAAACCACCCAATCAAATGTTCAGGCATCAATAACCAAGCAACT GCACTAC
HP0406 3159 AACAAAATCCACAACAAATGATGCAAAAATCGGATTAAATTATCAGCATGAGTTTTTACTTTGCTATGCTAAAGATAAA 3160 TKSTTNDAKIGLNYQHEFLLCY AATTATACAAATCTCTTGGGGGGAGAAAAGAATTTAGAAAATTACAAAAACCCCGATAACGACCCTAATGGAGCATGG YTNLLGGEKNLENYKNPDNDP ATTAATGATAATCCTAGCGCAAAAAGTGGGAATATGAAAACGGGGTA I I I I GGAGTTACTAATCCTTACACAAACAAA NDNPSAKSGNMKTGYFGVTNP
GTGGATTATCCGCCTGTGGGTATGTTTTGGCGTTTTTCACAAAATACGATACAAAAACACATTGATGAGGGGCGGAT DYPPVGMFWRFSQNTIQKHIDE TTGCTTTAAAAAAGAGCATAAGGATAATGAAAGGGGCTTTATTTACAAACGCTATTTAAAGGATTTAAAAACCACGCAA KKEHKDNERGFIYKRYLKDLKT AAAACTTTTGATAGTTTGATATTTAGCGATAATTGTTATATGAACCAAGCGGCGACTAAAGAACTTTTAAATTTGGGAA DSLIFSDNCYMNQAATKELLNL TGGGAGAATAI I I l ACTTATCCAAAAGGCGTAGAATTTATGAAAAAAATCATTCTGCATTCAACCACGCCAAACGAGG FTYPKGVEFMKKIILHSTTPNEG GCGACATCATCTTAGATTTTTTTGCTGGGAGCGGGACAACCGTGCATGCGGTGATGGAATTAAACGCAGAAGATAAG FAGSGTTVHAVMELNAEDKGN GGTAATAGGGAATTTAT I I l AGTTCAAATTGATGAAGAGATAAAAGAAGATGAAAGCGCTTATGATTTTTGTAAGAAG QIDEEIKEDESAYDFCKKELKSA GAGTTAAAAAGTGCAAAGCCTGTCATTAGCGACATTACCATAGAAAGGGTTAAAAGAGCCGCCCAAAAAATAAGCCA DITIERVKRAAQKISQLSKDSGL ATTATCAAAAGATAGCGGTTTGGATTTAGGCTTTAAAGTTTATACCTTACAAGACAAAGTGCAAATTATAAACGACAAA VYTLQDKVQIINDKEEITLFNRS
GAAGAAATAACGC I IT IT AACCGA 1 CGGA I "I I'AACGCCCT ITGACAAAGCCCTAAATTTAGCCCTACAATGCGGCAAA DKALNLALQCGKTLNQALEIIIKD ACGCTCAATCAAGCGCTAGAGATTATCATCAAAGACAAACTCTACAAATGCGAAGACGCTTAC I I I I GTATCGTGTGC EDAYFCIVCDEEAQEYLAKSKN GATGAAGAA DGYEEIDLEAFLNLNASFKERLS
HP0406 3161 GCTTTAAAAGAGCAAGAATATTTGAAAMCGCATGGCTTTTAGAAATGGAAAAACAAAAAGAAATCTTTCACAATAAAA 3162 ALKEQEYLKNAWLLEMEKQKEI AATTGGAATTGGAAAAATCCTACCAACMGCCCTAAATATCTTAAAAAGCGAAGTCGCTTCAAAAGATACTAGCTCCA LELEKSYQQALNILKSEVASKDT TGCATAAAGAAATCCATAAAGCGAGCGAAAI 1 1 lAAGCAAACACAAAACAAACCAAGAGATCCCACAAATCATAACGA KEIHKASEILSKHKTNQEIPQIIT ACTTTCAAGCCAACGAAAAAGCGCGCTACAAGAATGAAAGCGTGCTGATTGTACAAATTTTAGACAAGGGCTATTATT EKARYKNESVLIVQILDKGYYWI GGATAGAAACCGAGCTTGGCATGCGTTTAAAAGCGCATGGGAGTTTGπGAAAAAAATCCAAAAACCCCCTAAAAAC MRLKAHGSLLKKIQKPPKNKFK AAATTCAAACCCCCTAAAACAACCATTCCTAAACCTAAAGAAGCGAGCTTGCGCCTTGATTTAAGGGGGCAACGCAG PKPKEASLRLDLRGQRSEEALD
1 CGAAGAAGCCCTGGATTTACTAGACGCTTTTTTAAACGACGCGCTΓTTAGGGGGCTTTG LNDALLGGF
HP0406 3163 TGCCACTTTTGAAGGCGGCATATTAGAGTTTCATTTTGATGAAAAAGCCAGGATTGCCGGGGTAGAAATCAAGGGTT 3164 ATFEGGILEFHFDEKARIAGVEI
ATGGGACTGAAAAGGAAAAAGACGGCTTAAAATCCCAAATGGGGATCAAAAAGGGCGACACCTTTGATGAGCAAAAA EKEKDGLKSQMGIKKGDTFDE
TTAGAGCATGCTAAAACGGCTTTAAAAACCGCTTTAGAGGGGCAGGGCTATTATGGGAGCGTGGTGGAGGTGCGCA AKTALKTALEGQGYYGSWEV
CAGAAAAGGTCAGTGAGGGTGCATTATTGATCGTGTTTGATGTGAATAGGGGGGATAGCATTTATATCAAACAATCC EGALLIVFDVNRGDSIYIKQSIYE
ATTTATGAGGGAAGCGCGAAATTAAAACGCCGCATGATTGAATCTTTGAGTGCGAACAAGCAACGAGATTTCATGGG KRRMIESLSANKQRDFMGWM
CTGGATGTGGGGCTTGAATGACGGGAAATTGCGTTTAGATCAACTAGAATACGATTCTATGCGTATCCAAGATGTGT GKLRLDQLEYDSMRIQDVYMR
ATATGCGTAGGGGTTACTTAGACGCTCATATTTCTTCGCCTTTTTTGAAAACGGATTTTTCTACCCATGACGCTAAGC AHISSPFLKTDFSTHDAKLHYK
TTCATTATAAAGTCAAAGAGGGGATCCAATACAGGATTTCAGACATTTTAATAGAGATTGACAACCCGGTAGTCCCCT YRISDILIEIDNPWPLKTLEKAL
TAAAAACCTTAGAAAAAGCGCTTAAAGTGAAAAGGAAAGATGTCTTTAATATTGAGCATTTAAGAGCGGATGCGCAAA DVFNIEHLRADAQILKTEIADKG
TTTTAAAAACCGAAATCGCCGATAAGGGTTATGCGTTTGCGGTGGTGAAGCCAGACTTGGATAAAGATGAAAAAAAC VKPDLDKDEKNGLVKVIYRIEV
GGGCTTGTGAAAGTCATTTATCGTATTGAAGTGGGCGATATGGTGTATATCAATGATGTCATCATTTCAGGGAACCAG NDVI ISGNQRTSDRI IRRELLLG
CGCACGAGCGATAGGATCATTAGAAGGGAGTTATTGTTAGGGCCTAAGGATAAATACAACTTGACCAAACTGAGAAA NLTKLRNSENSLRRLGFFSKVKI
TTCCGAAAATTCTTTAAGGCGTTTAGGATTCTTCTCTAAAGTCAAAATTGAAGAAAAAAGGGTTAATAGCTCACTCATG VNSSLMDLLVSVEEGRTGQLQ
GATTTATTAGTGAGCGTA GSYGGLMLNGSVSERNLFGTG
YA
HP0406 3165 AAAATCGCTCAAAAAATCGCCCAAAAAGTTAAAATTGATGGCTTTAGAAGAGGTAAAGTCCCCCTTAGCTTAGTGAAA 3166 KIAQKIAQKVKIDGFRRGKVPLS ACCCGTTATCAAGCCCAAATTGAACAAGACGCTCAAGAAGAAATGATTCAAGAGGTTTTGAAAAACGCTTTTAAGGAA YQAQIEQDAQEEMIQEVLKNAF TrAGGGATTGAAAATAAGGATCTCATCGGCAGCCCCAATCTCACTAAATTTGAAAAAAAAGACACGCATTTTGAAATA ENKDLIGSPNLTKFEKKDTHFEI GAAGCGGACATCGGCTTAAAACCCACGATTGTTTTAGACAAGATCAAAGAGTGCGTGCCTAGCGTGGGAGTGGAAG LKPTIVLDKIKECVPSVGVEVPN TTCCAAATGAAGAAAAAATTGATGAGCGTTTGAAACAGCTCGCTAAAGATTATGCGAAATTTGTGGATACCAACACTC ERLKQLAKDYAKFVDTNTQRKA AAAGAAAAGCTCAAAATGACGATAAATTAACGATTGATTTTGAAGGCTTTATAGATAATGCGCCTTTTGAAGGGGGCA KLTIDFEGFIDNAPFEGGKAENF AGGCTGAGAATTTCAATTTGATTTTAGGCAGTAAGCAAATGCTAGAAGATTTTGAAAAGGCTCTTTTAGGCATGCAAG SKQMLEDFEKALLGMQAGEEK CGGGTGAAGAAAAAGAATTCCCTTTGACTTTCCCTAGCAAATACC FPSKY
HP0406 3167 AAAAAAGAAGGGTATTTGGCTGTTGCTATGAATGGCGAAATTGTTTTACGCCCCGATCCTAAAAGGACCATACAGAA 3168 KKEGYLAVAMNGEIVLRPDPKR
AAAATCAGAACCCGGGTTGTTATTCTCCACTGGTTTGGATAAAATGGAAGGGGTTTTAATCCCAGCCGGGTTTGTCA SEPGLLFSTGLDKMEGVLIPAG
AGGTTACCATACTAGAGCCTATGAGTGGGGAATCTTTGGATTCTTTTACGATGGATTTGAGCGAGTTGGACATTCAA LEPMSGESLDSFTMDLSELDIQ
GAAAAATTCTTAAAAACCACCCATTCAAGCCATAGCGGGGGGTTAGTTAGCACTATGGTTAAGGGAACGGATAATTC TTHSSHSGGLVSTMVKGTDNS
TAATGACGCGATCAAGAGCGCTTTGAATAAGATTTTTGCAAATATCATGCAAGAAATAGACAAAAAACTCACTCAAAA ALNKIFANIMQEIDKKLTQKNLE
GAATTTAGAATCTTATCAAAAAGACGCCAAGGAATTAAAAGGCAAAAGAAACCGATAAAGACAAATAACGCATAAAAA AKELKGKRNR
AAGAACGCTTGAACAAACTGCTTAAAAGGGGGTTTTTAGCGTTCTTTTTGAGCGTGTATTTAAGGGCTGATGATTTGG
TTACTTACACCATCATCAAAGAAAAAGATCTAGGATACCAGCGGTTTTTAGCCAAGAAGTGTTTAAGGGGTAAAACCC
ACCCTCCGTGTTTTACTAAGCCTAAAAAGCCTAAAAGAAAACTTTTTAATATAGACAAAAGCTCCCACTATTATGGCAC
AAGCGTGGTGCAAATGTCATGGCTACAGAGTAGGGAAAAATTTGAAAACCATTCAAAATACCGAGACATTCCTT TGC
TGAAGTCAGTTTGATTTATGGCTATAAACAATTTTTTCCTAAAAAAGAGCGCTACGGCTTCCGT TTATGTCTCTTTG
GATTACGCTTATGGGTTTTTTCTTAAAAATAAGGGCGTGTTGGGCGATAGTTTGAGGGAGAGTTCGCAAATCCCTAAA
AGCTATAGAGAAAAATTGCAAAGAAAAGAGACTTTTATTAACGCTATTTTTTATGGCGCGGGAGCTGACTm TATACA AACGCGCTTTT
HP0406 3169 TTTTAGCCTTAAAATCCCTATTGGCGTGGAAGAGGGCGAAAAGATTAGGGTTCGCAACAAGGGGAAAACGGGGCGA 3170 FSLKIPIGVEEGEKIRVRNKGKT
ACGACTAGGGGCGATTTGCTCTTAGAGATCCATATTGAAGAAGATGAAATGTATAGGCGCGAGAAAGATGATATTAC GDLLLEIHIEEDEMYRREKDDIT
CCAAATCTTTGATTTACCCTTAAAAACGGCTCTTTTTGGAGGGAAAATTGAAATCGCTACTTGGCATAAAACCTTAACC LKTALFGGKIEIATWHKTLTLTIP
CTAACCATTCCCCCTAACACCAAAGCGATGCAAAAATTCCGCATTAAAGAAAAAGGGATCAAAAACAGAAAAACTTCG MQKFRIKEKGIKNRKTSHVGDL
CATGTGGGGGATTTGTATTTGCAGGCTCGTTTGATTTTGCCTAAAACTGAAACGCTTTCTAATGAGTTGAAAGCGTTA LILPKTETLSNELKALLEKEL
TTAGAAAAAGAATTGTAAGGAGGAATCGTGTGCGATTATGATGAACCGCTTTATTTAATCAGCGTCGTGGCTAAAATC
TTAGGCGTGCACCCTCAAACCTTGCGCCAATACGAAAAAGAGGGTTTGATAGAGCCTAGCAGGACTGATGGGAAAA
TGCGCTTGTATTCCCAACGAGACATGGACAAAATCAAAACGATTTTACGCCTTACAAGGGATATGGGGGTTAATCTT
GCGGGCGTGGATATTATCTTGCGTTTAAAAGAAAAGCTTGATGAATTAGACAACCTGAATAAAGAGTTGCAAGACGC
TCTGCACAAACACTCTAAAAATACCAAAACCCCMCGAAAAATTTAAACACCCCTACGAATTTTTACGAATTGATTTTA
TTTAAAAAATGAGCCTGACTTCGCTTTTAAACCCAAAAAGCCTAGAAGATTTTTTAGGCCAAGAGCATTTAGTAGGGA
AAGACGCCCCCT
HP0406 3171 ATCAGCATGCTTTTAGACGTGAAATTGAACGTTAAGGTGCGCATCGGGCAAAAAAAAATGATTTTAAAAGACGTGGTC 3172 ISMLLDVKLNVKVRIGQKKMILK
TCTATGGATATAGGGAGCGTGGTAGAGCTGGATCAATTGGTGAATGACCCTTTGGAAATTCTTGTAGATGACAAGGT DIGSWELDQLVNDPLEILVDDK
GATCGCTAAGGGCGAAGTGGTGATTGTGGATGGGAATTTTGGCATTCAAATCACGGATATTGGCACTAAAAAAGAAC EWIVDGNFGIQITDIGTKKERLE
GCTTAGAACAATTGAAACATTAAATCTTTTTATCATAAAAAGGAAAGGGATATGGCTATTTTTGGGGAATTAAGCTCGC
TTGGGCATTTGTTTAAAAAAACGCAAGAATTGGAAATTTTGCATGAGTATTTAAAAGAGGTCATGCAAAAGGGTAGTA
AAGCGAATCAAAGGGTTTTAAACCTCGCTACCAATACGGAATTTCAAGTGCCTTTAGGGCATGGCATCTTTAGCATAG
AGCAGAGTTATTGTTTAGAGCATGCCAAAGAAAGCGAGAAAGGTTTTTTTGAAAGCCACAAAAAATATGTGGATTTCC
AGCTGATTGTCAAAGGCGTTGAGGGGGCTAAAGCGGTGGGTATCAATCAGGCTGTCATTAAAAACCCTTACGATGAA
AAAAGAGATTTGATCGTTTATGAGCCGGTTAGTGAAGCTTCTTTTTTGCGTTTGCATGCGGGCATGCTGGCTATTTTT
TΓTGAAAACGATGCGCATGCGTTGAGGTTTTATGGAGAGTCTTTTGAAAAATATAGGGAAGAGCCGATTTTTAAAGCG
GTCGTTAAAGCGCCCAAAGGATTGATCAAATTAAAATTA
HP0406 3173 AAAGACAAACAAGAGGCTAAAAAAGCCAAAAAACCCAGTAAGCCCAAAGCCACCCCCACCGCCAAAAACAACAAATC 3174 KDKQEAKKAKKPSKPKATPTAK
CCATAAAATTGATTTTAGCGATGCGAGGGATTTTAAGGGCAATGATATTTATGATGATGAAACCGATGAAATCTTATT HKIDFSDARDFKGNDIYDDETD
GTTTGATTTGCATGAACAAGATAATTTCAATAAGGAAGAAGAAGAAAAAGAAATCCGCCAAAATATCAACGACAGGGT LHEQDNFNKEEEEKEIRQNIND
GCGCGTCCAAAGAAAAAACCCTTGGATGAATGAAAGCGGGATCAAACGACAATCCAAGAAAAAACGCGCATTCCGT RKNPWMNESGIKRQSKKKRAF
AACGATAACAGCCAAAAAGTGATCCAAAGCACGACTGCAATCCCTGAAGAAGTGCGCGTCTATGAATTCGCGCAAAA QKVIQSTTAIPEEVRVYEFAQKA
AGCGAATTTGAATCTAGCTGATGTGATTAAAACCCTCTTTAATTTAGGGCTTATGGTAACTAAAAACGACTTTTTGGAT DVIKTLFNLGLMVTKNDFLDKDS
AAGGATAGCATAGAAATTTTAGCCGAAGAGTTCCATTTAGAAATTTCTGTTCAAAACACTTTAGAAGAATTTGAAGTAG EFHLEISVQNTLEEFEVEEVLEG
AAGAAGTGCTAGAGGGGGTGAAAAAAGAGCGCCCGCCTGTGGTTACTATCATGGGGCATGTTGATCATGGTAAAAC RPPWTIMGHVDHGKTSLLDKI
TTCACTGCTGGATAAAATCCGTGATAAAAGAGTCGCTCACACGGAAGCTGGGGGGATCACTCAGCACATTGGCGCT AHTEAGGITQHIGAYMVEKNDK
TACATGGTAGAAAAAAATGACAAATGGGTGTCTTTCATTGACACCCCAGGGCATGAAGCCTTTAGCCAGATGCGTAA DTPGHEAFSQMRNRGAQVTDI
TCGTGGGGCTCAAGTAACAGATATTGCAGTGATTGTGATAGCGGCTGATGATGGGGTGAAGCAGCAGACTATTGAA ADDGVKQQTIEALEHAKAANVP
GCTTTAGAGCATGCAAAGGCCGCTAATGTGCCTGTGATTTTTGCGATGAATAAAATGGATAAGCCTAATGT NKMDKPN
HP0406 3175 CATTAACGCTGAAAAACCCAATGTGCGTTTTAATGACATGGCAGGCAATGAAGAAGCCAAAGAAGAAGTGGTAGAAA 3176 INAEKPNVRFNDMAGNEEAKE
TCGTAGATTTCTTAAAATACCCTGAACGATACGCCAATTTAGGGGCTAAAATCCCTAAAGGCGTGTTATTAGTAGGGC DFLKYPERYANLGAKIPKGVLL
CTCCAGGAACCGGTAAAACCCTTTTAGCCAAAGCGGTAGCCGGCGAAGCGCATGTGCCGTTTTTCTCTATGGGAGG TGKTLLAKAVAGEAHVPFFSM
GAGCAGTTTCATTGAAATGTTTGTGGGCTTAGGGGCAAGCAGGGTTAGGGATTTATTTGAAACCGCTAAAAAACAAG EMFVGLGASRVRDLFETAKKQ
CCCCTAGCATCATTTTTATTGATGAAATTGATGCCATAGGTAAGAGCCGAGCGGCTGGAGGCGTGGTGAGCGGGAA DEIDAIGKSRAAGGWSGNDER
CGATGAAAGAGAGCAAACCTTAAACCAGCTCTTAGCCGAAATGGATGGTTTTGGGAGCGAAAATGCGCCTGTAATTG QLLAEMDGFGSENAPVIVLAAT
TCTTAGCCGCAACGAACCGCCCCGAAATCTTAGATCCGGCGTTAATGCGTCCAGGGCGCTTTGACAGGCAGGTTTT DPALMRPGRFDRQVLVDKPDF
AGTGGATAAGCCTGATTTTAATGGTAGGGTAGAAATCTTAAAAGTGCATATTAAAGGCGTGAAACTTGCTAATGATGT ILKVHIKGVKLANDVNLQEVAKL
GAATTTGCAAGAAGTCGCCAAACTCACTGCAGGGCTTGCAGGAGCGGATTTGGCGAATATTATCAATGAAGCCGCA GADLANIINEAALLAGRNNQKE
CTTTTAGCAGGAAGAAACAACCAAAAAGAAGTCAGGCAGCAACATTTAAAAGAAGCGGTTGAAAGAGGGATTGCAGG LKEAVERGIAGLEKKSRRISPKE
GTTAGAAAAGAAAAGCAGGCGCATCAGTCCTAAAGAAAAGAAAATCGTCGCCTACCATGAAAGCGGGCATGCCGTG YHESGHAVISEMTKGSARVNK
ATTTCTGAAATGACTAAAGGGAGTGCTAGGGTGAATAAAGTCTCTATCATTCCAAGGGGCATGGCGGCTTTGGGCTA GMAALGYTLNTPEENKYLMQK
CACCCTTAACACGCCTGAAGAAAACAAATACTTGATGCAAAAGCACGAACTCATCGCTGAAATTGATGTGCTTTTAGG IDVLLGGRAAEDVFLEEISTGAS
CGGGAGAGCGGCTGAAGATGTCTTT ATDIIKGMVSYYGMSSVSGLMV NAFLGGGYGSSREFSEKTAEE NLLEERYKHVKQTLSDYREAIE
HP0406 3177 GAAAAAGACAGCTCTATTAATGATGATTTAGAGCGTTTGAGATTGAGCGCGACCACCTCACTTTTAGGTTATGATGAT 3178 EKDSSINDDLERLRLSATTSLLG
GTGATCGTGATAGCGAGCGTTTCGGCTAATTATGGTTTGGGTAACCCTGAAGAATATTTAAAAGTCATGGAAAAAATC VIASVSANYGLGNPEEYLKVME
AAAGTGGGCGAGAAGCGCGCTTACAAGAGTTTTTTATTAAAGCTAGTGGAAATGGGCTATAGCCGTAATGAAGTGGT EKRAYKSFLLKLVEMGYSRNE
GTTTGATAGGGGGAGCTTTAGAGCGACCGGAGAATGCGTGGATATTTTCCCCGCTTATAATGACGCTGAATTTATTA GSFRATGECVDIFPAYNDAEFI
GGATTGAATTTTTTGGCGATGAGATAGAAAGGATTGCGGTCTTTGACGCTTTAGAAAAAAATGAAATCAAGCGCTTGG DEIERIAVFDALEKNEIKRLDSV
ATTCTGTCATGCTTTATGCGGCCAGTCAGTTTGCCGTAGGGAGCGAGAGGCTGAATTTAGCCATTAAAAGCATTGAA SQFAVGSERLNLAIKSIEDELAL
GATGAACTCGCTTTAAGGTTGAAATTTTTTAAAGAGCAGGATAAAATGCTTGAATACAACCGCCTCAAACAACGCACC KEQDKMLEYNRLKQRTEHDLE
GAGCATGATTTAGAAATGATTAGCGCGACCGGTGTGTGTAAGGGCATTGAAAATTACGCGCGCCATTTCACCGGTAA GVCKGIENYARHFTGKAPNETP
AGCCCCTAACGAAACGCCTTTTTGCTTGTTTGATTATTTAGGGATTTTTGAGCGGGAGTTTTTAGTCATTGTGGATGA YLGIFEREFLVIVDESHVSLPQF
AAGCCATGTGAGTTTGCCACAGTTTGGGGGGATGTATGCAGGGGATATGAGCAGGAAAAGTGTTTTAGTGGAATAT AGDMSRKSVLVEYGFRLPSAL
GGTTTTAGATTGCCTAGCGCTTTAGACAACCGCCCTTTAAAATTTGATGAATTTATCCATAAAAATTGCCAGTTCCTTT KFDEFIHKNCQFLFVSA
TTGTGTCCGCT
HP0406 3179 TCCAAACAAGCTGAAAAAGAAAATCAAATCAATTGGTGGAAATATTCAGGATTAACAATAGCGACAAGTTTATTATTAG 3180 SKQAEKENQINWWKYSGLTIAT
CCGCTTGTAGTGTTGGTGATATTGATAAACAGATAGAGTTAGAACAAGAAAAAAAGGAAGCTGAAAACGCTAGGGAT ACSVGDIDKQIELEQEKKEAEN
AGAGCGAACAAGAGTGGGATAGAACTGGAACAGGAAAAACAAAAGACCATTAAAGAACAAAAAGATTTAGTTAAAAA NKSGIELEQEKQKTIKEQKDLVK
AGCAGAACAAAATTGCCAAGAAAATCATGGCCAATTCTTTATGAAAAAATTAGGAATTAAGGGTGGCATTGCTATAGA NCQENHGQFFMKKLGIKGGIAI
AGTAGAAGCTGAATGCAAAACCCCTAAACCTGCAAAAACCAATCAAACCCCTATCCAGCCAAAACACCTCCCCAACT CKTPKPAKTNQTPIQPKHLPNS
CTAAACAACCCCACTCTCAAAGAGGATCAAAAGCGCAAGAGCTTATCGCTTATTTGCAAAAAGAGTTAGAATCTCTGC QRGSKAQELIAYLQKELESLPY
CCTATTCACAAAAAGCTATCGCTAAACAAGTGAATTTTTACAGGCCAAGTTCTGTCGCTTATTTAGAACTAGACCCTA KQVNFYRPSSVAYLELDPRDFK
GAGATTTTAAGGTTACAGAAGAATGGCAAAAAGAAAATCTAAAAATACGCTCTAAAGCTCAAGCTAAAATGCTTGGAA WQKENLKIRSKAQAKMLGNEK
ATGAGAAACCCACAAGCCCACCTTTCAACCTCTCAAAGCCTTTTGTTCGTTCAAAAAATATTTGCTGATGTTAATAAAG FNLSKPFVRSKNIC
AAATAGAAGCAGTTGCTAATA
Figure imgf000478_0001
HP0406 3189 AGCGAACAAATCCAGGCTAAAAAAAACATGGAATTATTAGTGAATAAGACTTATAAAGAGACGCTTTCTTATTTGAAA 3190 SEQIQAKKNMELLVNKTYKETL
GAATACAAAAGCTTGCTTTCTAGCGTGGAATTAGCCAAGGAAAACTTAAAACTCCAAGAGCAGGCTTTTTTACAAGGC YKSLLSSVELAKENLKLQEQAF
TTAAGCACGAACGCTCAAGTCATTGATGCGAGGAACACGCTTTCTTCTATCGTCGTGGAGCAAAAAAGCGTGGCTTA TNAQVIDARNTLSSIWEQKSV
TAAATACATCGTTTCATTAGCGAATTTAATGGCGTTAAGCGATCATATTGATTTATTTTATGAATTTGTTTATTAAGGGA SLANLMALSDHIDLFYEFVY
AAAAATCATGTCAAATAGCATGTTGGATAAAAATAAAGCGATTCTTACAGGGGGTGGGGCTTTATTATTAGGGCTAAT
CGTGCTTTTTTATTTAGCTTATCGCCCTAAGGCTGAAGTGTTGCAAGGGTTTTTGGAAGCCAGAGAATACAGCGTGA
GCTCCAAAGTCCCTGGCCGCATTGAAAAGGTGTTTGTTAAAAAAGGCGATCACATTAAAAAGGGCGATTTGGTTTTTA
GCATTTCTAGCCCTGAATTAGAAGCCAAACTCGCTCAAGCTGAAGCCGGGCATAAAGCCGCTAAAGCGCTTAGCGA
TGAAGTCAAAAGAGGCTCAAGAGACGAAACGATTAATTCTGCGAGAGACGTTTGGCAAGCAGCCAAATCCCAAGCC
ACTTTAGCCAAAGAGACTTATAAGCGCGTTCAAGATTTGTATGATAATGGCGTGGCGAGCTTGCAAAAGCGCGATGA
AGCCTATGCGGCTTATGAAAGCACTAAATACAACGAGAGCGCGGCTTACCAAAAGTATAAAATGGCTTTAGGGGGG
GCGAGCTCTGAAAGTAAGATTGCCGCTAAGGCTAAAGAGAGCGCGGCTTTAGGGCAAGTGAATGAAGTGGAGTCTT
ATTTAAAAGACGTCAAAGCGACAGCCCCAATTGATGGGGAAGTGAGTAACGTGCTTTTAAGCGGTGGCGAGCTTAG
CCCTAAGGGTTTTCCTGTGGT
HP0406 3191 GCCCAAGCCAACATGCCTAATCTAGTGATGAGCAAACAAGACACTGCGGCTAGGGGGACTATCTATAGTCAAGACA 3192 AQANMPNLVMSKQDTAARGTI
ACTACAGCCTAGCCACTTCACAAACCCTTTTCAAACTGGGCTTTGATACAAGGTTTTTAAACCCGGATAAAGAAGATT YSLATSQTLFKLGFDTRFLNPD
TTTTCATTGATTTCCTTTCTATTTATAGCAATATCCCTAAAAAGTCCTTAAAAGACGCCATCAATACAAAAGGCTATATC DFLSIYSNIPKKSLKDAINTKGYII
ATTCTAGCCTATGATCTCACGCCCAATATGGCTGCTAATATTAGAGACTTAAATAAGAAATTTTTAGCCTTTGGGGTTT TPNMAANIRDLNKKFLAFGVFQ
TTCAAAATTTCAAAGACGCGCACGATAAGGTGTGGCAAAAGCAAGGGCTAAACATTGAAGTGAGCGGCGTTTCTAGG HDKVWQKQGLNIEVSGVSRHY
CATTACCCTTATCAAAATAGCCTAGAGCCAATCATTGGCTATGTGCAAAAACAAGAAGAAGACAAGCTCACTTTAACT LEPIIGYVQKQEEDKLTLTTGKK
ACCGGTAAAAAAGGCGTTGAAAAATCTCAAGATCACTTGCTTAAAGCCCAACAAAATGGCATAAGAACAGGCAAAAG QDHLLKAQQNGIRTGKRDVSF
AGACGTGAGTTTTAACTTTATCCAAAACCACTCTTATACAGAGGTTGAACGCCTTGATGGCTATGAGGTGTATTTGAG SYTEVERLDGYEVYLSVPLKLQ
CGTTCCTTTAAAACTCCAAAGAGAAATTGAAACCCTATTGGATAAAACTAAAGACAAACTCAAGGCTAAAGAAATCCT LDKTKDKLKAKEILVGIINPKSGE
AGTGGGTATCATTAACCCTAAAAGCGGGGAAATTTTATCGCTAGCTTCAAGCAAGCGCTTCAATCCTAATGCGATTAA SSKRFNPNAIKTSDYESLNLSV
AACCAGCGATTATGAAAGCTTGAATTTGAGCGTTGCTGAAAAGGTTTTTGAGCCAGGCAGCACGATCAAACCCATTG EPGSTIKPIVYSLLLDKNLINPKE
TTTATTCCTTGCTGTTAGACAAGAATTTGATCAACCCCAAAGAACGCATTGATTTAAACCATGGCTATTACCAATTAGG HGYYQLGKYTIKDDFIPSKKA
AAAATACACCATTAAAGACGACTTTATCCCCAGTAAAAAAGCCGTTGTGGAAGACATT
HP0287 3193 TGAAACCCCCGAATACACCGCTAACGCTGATGGTATTGGCACGCTAAGGATTTTAGAAGCCATGCGGATTTTAGGAT 3194 ETPEYTANADGIGTLRILEAMRI
TAGAAAAGAAAACGCGCTTTTATCAAGCCAGCACGAGCGAATTGTATGGCGAAGTCTTAGAAACCCCGCAAAATGAA KTRFYQASTSELYGEVLETPQN
AACACCCCCTTTAACCCACGAAGCCCCTATGCGGTCGCTAAAATGTATGCCTTTTACATCACCAAAAATTACAGAGAG NPRSPYAVAKMYAFYITKNYRE
GCCTATAACTTGTTTGCGGTTAATGGCATTCTTTTTAACCATGAGAGCAGGGTAAGGGGCGAAACTTTTGTAACCCGT AVNGILFNHESRVRGETFVTRKI
AAAATCACACGAGCCGCTAGCGCGATAGCGTATAACTTAACGGATTGCTTGTATTTAGGGAATTTAGACGCTAAAAG AIAYNLTDCLYLGNLDAKRDWG
AGACTGGGGGCATGCCAAAGATTACGTGAAAATGATGCATTTAATGCTCCAAGCGCCCATCCCACAAGATTATGTGA VKMMHLMLQAPIPQDYVIATGK
TCGCCACAGGAAAGACCACAAGCGTGCGCGATTTTGTGAAAATGAGCTTTGAATTTATCGGTATCAATTTAGAATTTC DFVKMSFEFIGINLEFQNTGIKEI
AAAATACAGGGATTAAAGAAATCGGTTTGATTAAAAGCGTTGATGAAAAAAGAGCGAACGCTTTAAAATTGAACTTAA VDEKRANALKLNLSHLKKGQI
GCCATTTAAAAAAAGGCCA TCGTGGTGCGCATAGACGAGCGTTATTTCAGGCCTACCGAAGTGGATTTGCTTTTA YFRPTEVDLLLGDPTKAEKELD
GGCGATCCCACTAAGGCAGAGAAAGAGCTAGACTGGGTTAGGGAATACGATTTAAAAGAGTTGGTTAAGGACATGTT DLKELVKDMLEYDLKECQKNLY
AGAATACGATTTAAAAGAATGCCAAAAAAACCTTTACTTGCAAGATGGGGGTTATATTTTAAGGAATTTTTATGAATGA GYILRNFYE
GATTATTTTAATCACTGGTGCCTATGGCATGGTGGGGCAGAACACGGCGTTGTATTTTAAAAAAAATAAGCCTGATGT
TACTCTACTCACTCCTAAAAAGAGCGAATTGTGTTTGTTGGATAAAGACAACGTTCAAGCTTATTTGAAAGAATACAA
GCCTACAGGC
HP0287 3195 ACTCAAGATAGGAACGCCAAACAAAGCGGGCATACCACTTACCCAAACCACTCCTTTAAAAACGAAAGCGATACCGA 3196 TQDRNAKQSGHTTYPNHSFKN TTTTATTTTAAAAGCCAACCGAGAATGGGCTAAAAAAGTGCGCGAGAAAATGCGTAACGCTCCTATTTTAGAGCTTTA FILKANREWAKKVREKMRNAPI CCCAGAGATGGATGGGAGGTTTGAAGATCCTAATCTAACCCCTTTAGAAGTCTTTGATAGAATCCATCATAAAAAAAT EMDGRFEDPNLTPLEVFDRIHH CGCTAGCGTGCATTTAGCGGATAAGGAAGCGATTTTAAAAGCCCTAGAAGTGGCTAAAAGCGATAAGAGCCGTTTCA VHLADKEAILKALEVAKSDKSRF GTCAAAAAAGCTTTACAGAAATCCATGCCTTAATGAGTCAAACCGCCCAGCTTTTTAGAGAAA FTEIHALMSQTAQLFRE
Figure imgf000480_0001
HP0287 3205 ATGAAAGAGCAAGAATGGGATTTAAGCGCTTTATTTGAAAATAAAGAAAGCGCAGAAGAATTTTTAAAAACCTTACAA 3206 MKEQEWDLSALFENKESAEEFL
ACAGAAGCACAAGAATTTGAGAGTGCTTATCAAAATAACCTTAAGAATTTAGACGCTACAGGATTTGCCAACGCTCTT EAQEFESAYQNNLKNLDATGFA
AAACATTACGAAAATTTGTCAGAAAAGATTTCTAGGGTGATGGCTTACGCTCAATTGCTTTTTGCTAAGAACACCAAA HYENLSEKISRVMAYAQLLFAK
GAAGCGAAGTTTTATTCGCAATGCGAAATGGCTTGCGCGAATATCCAACAACACCTTTTATTCTTTGAAATTGAATTC KFYSQCEMACANIQQHLLFFEIE
AAGAACTTAGACGCCAAAAAACAGCTCGCTTTCATTAAAAAATGCAAAGACCATGCTTTTTATTTGAACAATCTCATAG DAKKQLAFIKKCKDHAFYLNNLI
AAAGGAAAAAGCACACCCTAAATRTAGATGAAGAAAAGATCGCTCTAGCCCTTTCACCTGTGGGAGTGGGTGCGTTT TLNLDEEKIALALSPVGVGAFSR
AGCCGTCTTTTTGATGAGCA I I I I I CTTCTTTGAAAATCCCTTTTGAAGAGCAAAATTTAAGCGAGGAAGAAATTTTAG FSSLKIPFEEQNLSEEEILALLHN
CCCTCTTGCACAACCCCAAACGCAAGATCCGTAAAAAATCTCAAAAAGCTTTCAGTAAAGCGTTAGAAAAATCCCGC RKKSQKAFSKALEKSRPLLTYIL
CCTTTGCTCACTTACATRΠ"AAACATGGTGCGTAAAGATTTGCTCATTGAAACTAGGCTGAGAAAATACGATAAAAAA KDLLIETRLRKYDKKESFRHIDN
GAGAGTTTCCGCCACATTGACAACCAGATCTCCCAAGAGAGCGTGGATAGCATGATAGAGATCGTGAACGCTAATTT SVDSMIEIVNANFSLVHRYYHQ
TRCTTTAGTGCATCGTTATTATCATCAAAAAGCGCAAATTTTAGGGCATAAACTCAAAGATTATGACCGCTACGCCCC HKLKDYDRYAPLNDESITMTYS
TTTAAACGATGAGAGCATCACCATGACTTACTCTCAAGCTTTAGAAGAAGTGCTTAAAACCCTTAAAGCCTTTAGCCC VLKTLKAFSPEFHKIASKAIKEG
TGAATTTCACAAAATCGCTTCTAAAGCGATCAAAGAGGGCTGGGTGGATTCACACCCTAAAGACTTTAAGCAAGGTG PKDFKQGGAFSHGGVPSAHPY
GGGCTΠTAGCC TGNRRDAFTIAHEFGHMIHQEL
VLNMDTPLTTAETASVFSEMLF
KGLKSDELLFML
HP0287 3207 CGAACGAGAGATTTACGATCTAGACTATGCGATCGTCAAAGCGAAAGATTTAAAACCAAGCTTTACCACAGGCGGGA 3208 EREIYDLDYAIVKAKDLKPSFTT
CGCAAAAGAGAACGGACATGAACGAAGAGCAGATTAAAAGCATTGCTGAAAATTTTGATCCTAAAAAGATATTTGGTA RTDMNEEQIKSIAENFDPKKIFG
GCGGAGGGTTTGAAGATTTACCGATCATTCTACATGACGGGCAAGTGATCGCAGGAAACCAGAGAATCCAAGGCAT EDLPIILHDGQVIAGNHRIQGML
GCTAAACTTCACGCCTAAAAGCCGTTTTTCTTACGAGAGAGCGATCAAGGAATACTATCACATAGACTTAAAACCGGA SRFSYERAIKEYYHIDLKPDELL
CGAGTTGTTAGTGAGAGTGCCACACAAGCGCCTAAACAACACCGAGATCAACAATTTAGCGGCTTCATCCAATCAAG KRLNNTEINNLAASSNQGRFNS
GACGCTTCAATAGCGAAAGCGATCACGCTATAGCGGTTTTAAGCCACTATGAAGCCAAGTTAAAAGAATTAGACCAA IAVLSHYEAKLKELDQKLDADSI
AAATTAGACGCTGATAGCATCTACTCATTAAAAAACATTGTTGCTAAAAATTTGAATTTTGATAAGGCTACGCATCCTA VAKNLNFDKATHPNVTDSNLAL
ATGTAACCGATAGTAATTTAGCGTTGCTGATGTTTAACATGCCACGAACCAAAACGCAAGGGATAGAATTACTCAACC PRTKTQGIELLNRWKKEFSNDI
GCTGGAAAAAAGAATTTTCCAACGACATTAAAAGCTATGAAAAAGTAAAAAAAATGTTTGTAGATAACGCGGGCAGTT VKKMFVDNAGSFHNLIHDLNFP
TTCACAATCTCATCCACGATCTGAACTTCCCTAAGGTGAGTTTAAACGCTTATTTAAGCGATATTATGGATCGCAGTTT AYLSDIMDRSFANLKNYQSTSE
TGCGAATTTAAAGAATTACCAAAGCACGAGCGAGAGCCTGAAAGATTTGAGCGAAAAATTCTATAAAACGAGTTCGTT SEKFYKTSSLEMFEKSDQSTSD
AGAGATGTTTGAAAAGAGCGATCAAAGCACGAGCGATATTAGCGAGATTTTAGGAGGAGCGATCGCACGATTTGCA GAIARFARFDDPSKALFEALRS
CGATTTGATGATCCGAGCAAAGCGTTATTTGAAGCCTTAAGAAGCGATAACATTAAAAAAGGCTTGAAAGATTACAAG LKDYKIADVTKDMFNADSKEFK
ATCGCAGATGTTACT FT
HP0287 3209 GGAAATTTAATGAACGAAAACGCGCCTACGCACAAAAGTTCGCACAAAGTCAAAACACACACGCCAGTGAGCGGTTA1 3210' KFNERKRAYAQKFAQSQNTHA
TCACATTGAAGATTTACGCACCTACCCTACTGAAAAGCTTTTAGAAATCGCTAACAAGCTCAAAGTGGAAAACCCCCA H
AGAATTCAAACGACAAGACTTGATGTTTGAAAT rAAAAACCCAAGTTACGCAAGGCGGATACATTCTTTTTACCGG
GA' TTAGAAATCATGCCTGATGGGTATGGCTTT TAAGAGGGTTTGATGGGAGTTTTTCAGACGGGCATAACGACA CATATGTTAGCCCTTCTCAAATCAGGCGTTTTGCTTTAAGGAATGGCGATATTGTTACCGGTCAAGTGCGATCCCCCA AAGATCAAGAAAAATACTACGCCCTTTTAAAAATAGAAGCTATCAATTACTCGCCTTCAGATGAGATTAGAAACCGCC CCTTATTTGACAATCTAACCCCCCTATTCCCTGATGAACAGATCAAATTAGAATACGAACCCACTAAAGTTACCGGCA GAATGCTAGATTTATTCAGCCCTGTGGGGAAAGGCCAAAGGGCTTTGATCGTCGCGCCACCAAGGACTGGGAAAAC GGAGCTGATGAAAGAACTCGCCCAAGGCATCACTTCTAACCACCCTGAAGTGGAGCTGATTATCCTTCTAGTGGATG AGCGCCCTGAAGAAGTTACGGATATGCAACGAAGCGTTAAGGGTCAAGTTTTTAGCTCCACTTTTGATTTGCCCGCA AACAACCACATAAGAATCGCTGAATTAGTCCTAGAAAGGGCCAAAA
HP0287 3211 GGAGCAGATTTCCGCAATCAGATACCAAAACAAGCAACTCAATTATTACCCCAAGATAGATGTGTTTGACTCATGGCT 3212 EQISAIRYQNKQLNYYPKIDVFD
TTTTTGGATCCAAAAACCCGCTTATGCCACAGGGCGTTTTGGGAATTTCTACCCCGGTCAGCAAAATACGGCTGGGG WIQKPAYATGRFGNFYPGQQN
TTACTGCGACTTTGAATATTTTTGATGATATAGGCTTGAGCTTGCAAAAACAATCCATCATGCTAGGCCAATTAGCGA ATLNIFDDIGLSLQKQSIMLGQL
ATGAAAAGAATTTAGCGTATAAAAAGCTAGAGCAAGAAAAAGACGAACAGCTTTACAGAAAATCGCTTGATATTGCCA LAYKKLEQEKDEQLYRKSLDI A
GAGCCAAGATTGAATCTTCAAAGGCTAGTTTGGATGCGGCCAATCTTTCTTTTGCCAATATTAAGAGGAAATACGACG SKASLDAANLSFANIKRKYDAN
CTMTTTAGTGGATTTCACCACC'TATTT GGGGCTTAACCACGCGCTTTGATGCGGAAGTGGCTTACAATTTAGCGC YLRGLTTRFDAEVAYNLALNNY TCAATAATTATGAAGTGCAAAAAGCCAATTACATTTTCAACAGCGGGCATAAAATAGACGACTATGTGCATT NYIFNSGHKIDDWH
HP0287 3213 GAGGGCGTGAGCCAAAGTGAGCTGGAAGTGGATGCGAGGTTGCAATTCAGGGACAAGATCACTACTGAAGAAATCC 3214 EGVSQSELEVDARLQFRDKITT AACAAACTTTGATTAAAACCGCTGTGGATAAGATAGATATTGACACGCCTAATTGGAGTTTTGTCGCCTCAAGGCTTT TLIKTAVDKIDIDTPNWSFVASR ΓΓΓGTATGATTTATACCATAAAGTAAGTGGTTTTACAGGGTATAGGCATTTGAAAGAGTATTTTGAAAACGCTGAAGA YHKVSGFTGYRHLKEYFENAEE AAAGGGCCGCATCCTTAAGGGCTTT GGAAA TTTGATCTAGAGTTTTTAAATAGCCAGATCAAGCCTGAAAGGG KGFKEKFDLEFLNSQIKPERDF ATTTCCAATTC TTATTTAGGGATTAAAACCTTGTATGATCGCTATTTGTTAAAAGACGCTAACAACAACCCTATTGA GIKTLYDRYLLKDANNNPIELPQ ATTGCCCCAACACATGTTTATGAGCATTGCGATGTTTTTAGCACAAAACGAACAAGAACCCAATAAAATCGCCTTAGA SIAMFLAQNEQEPNKIALEFYEV ATTTTATGAAGTTTTGAGCAAGTTTGAAGCGATGTGCGCGACCCCCACTCTAGCGAACGCCCGCACCACCAAACACC AMCATPTLANARTTKHQLSSC AGCTCAGCTCATGCTATATTGGCAGCACGCCGGATAATATTGAGGGGATTTTTGACAGCTATAAGGAAATGGCGCTG DNIEGIFDSYKEMALLSKYGGGI TTGTCCAAATACGGCGGAGGGATTGGCTGGGATTTTTCTTTGGTGCGCTCTATTGGGAGTTATATTGATGGGCATAA SLVRSIGSYIDGHKNASAGTIPF AAATGCGAGCGCTGGCACGATCCCTTTTTTAAAAATCGCTAACGATGTGGCGATTGCGGTGGATCAATTAGGCACAC VAIAVDQLGTRKGAIAVYLEIWH GAAAGGGCGCGATTGCGGTGTATITGGAAATTTGGCACATTGATGTGATGGAGTTCATTGATTTAAGGAAAAATAGC FIDLRKNSGDERRRAH GGCGATGAAAGGCGAAGAGCGCAT
HP0287 3215 ATGTΠTGCGCGACCATGCAAAGGGGAGTGGCGGAAATCGTGGCTGTGGAAGCGACTTTCACAAGGGCTTTGCCG 3216 MFCATMQRGVAEIVAVEATFTR GCGTTTGTGATTTCAGGGTTAGCTAATAGCTCTATCCAAGAAGCCAAACAGCGGGTTCAATCGGCTTTACAAAATAAC VISGLANSSIQEAKQRVQSALQ GATTTCACTTTCCCGCCTTTAAAAATCACCATCAACCTTTCCCCCTCAGATTTGCCTAAATCCGGGAGTCATTTTGATT FPPLKITINLSPSDLPKSGSHFD TGCCTATCGCTCTTTTAATCGCTTTGCAAAAACAAGAGTTGGC TTAAAGAGTGGTTTGCTTTTGGGGAGTTAGGGC ALQKQELAFKEWFAFGELGLD
TTGATGGCAAGATCAAACCCAATCCTAACATTTTCCCCATGCT TAGACATTGCCATTAAACACCCCCATGCTAAGA PNIFPMLLDIAIKHPHAKIIAPKA
TCATTGCGCCTAAGGCCAATGAAGAGC I l l l l I CGCTTATCCCTAATTTGCAATGCTTTT GTGGGGCATTTTAAAGA LIPNLQCFFVGHFKEALEILQNP
AGCGTTAGAAATCTTGCAAAACCCTGAAACCAAAGCAGACACCCACACGAAAAAACTACCCTTTAAAACGATAGAATT THTKKLPFKTIELNDKEYYFSDA
GAACGATAAAGAGTATTAI I I I I CAGACGCCTATGCCTTAGATTTTAAAGAAGTTAAGGGGCAAGCTGTCGCTAAAGA KEVKGQAVAKEAALIASAGFHN
GGCCGCTTTGATCGCTAGCGCTGGGTTTCATAACTTGATTTTAGAGGGAAGTCCAGGGTGTGGGAAAAGCATGATC PGCGKSMIINRMRYILPPLSLNE
ATTAATCGCATGCGTTATATCTTGCCTCCATTAAGCCTGAATGAAATCCTAGAAGCGACAAAATTACGCATTTTAAGC LRILSEQDSAYYPLRSFRNPHQ
GAGCAAGACAGTGCCTATTACCCCTTAAGGAGTTTTAGAAACCCTCACCAAAGCGCTTCAAAATCCAGTATTTTAGGC SILGSSSLREPKPGE
TCAAGCTCTCTAAGAGAGCCAAAACCTGGCGAAA
HP0287 3217 CCAAGAATACCGCCGCCTTAAAGACGCATATGCTGAGTCTATCCCTGATTTTAAAGAACTCACCAAAGATCAAATCAA 3218 QEYRRLKDAYAESIPDFKELTK
AGCCATGCATTTAGAAAAAAGCGCTTTAGATTCGCTCATCAATCAAGCCTTATTGAGAAATCTCGCTTTAGATTTAGG MHLEKSALDSLINQALLRNLAL
GCTTGGCGCTACAAAGCAAGAAGTGGCGAAAGAGATCAGAAAAACGAGCGTTTTCCAAAAAGATGGCGTTTTTGATG TKQEVAKEIRKTSVFQKDGVFD
AAGAATTGTATAAAAATATCTTAAAGCAAAGCCATTACCGCCCCAAACATTTTGAAGAAAGCGTTGAAAGGCTTTTAAT NILKQSHYRPKHFEESVERLLIL
CCTTCAAAAAATCAGCACTCTATTCCCCAAAACCACTACCCCTTTGGAGCAATCCAGCCTATCGCTTTGGGCAAAATT FPKTTTPLEQSSLSLWAKLQDK
GCAAGACAAATTAGACATTCTTATCCTAAACCCTAGTGATGTTAAAATCTCTCTTAATGAAGAAGAGATGAAAAAATAT NPSDVKISLNEEEMKKYYESHK
TACGAGTCCCATAAAAAGGATTTTAAAAAGCCCACGAGCTTTAAAACACGCTCTTTATATTTTGACGCTAGTTTGGAA PTSFKTRSLYFDASLEKPDLKE
AAACCTGATTTGAAGGAGTTGGAGGAATACTACCATAAAAACAAGGTGTCTTATTTGGACAAAGAGGGGAAATTGCA HKNKVSYLDKEGKLQDFKSVQ
GGATTTTAAAAGCGTTCAAGAGCAAGTCAAGCATGATTTAAGCATGCAAAAAGCGAATGAAAAAGCCTTAAGGAGCT DLSMQKANEKALRSYIALKKAN
ATATCGCTCTAAAAAAAGCGAACGCGCAAAACTACACCACACAAGATTTT TQDF
Figure imgf000483_0001
Figure imgf000484_0001
HP0287 3233 CGAAACGATGAAAATCAAGTGGATGCGATCATGAAAAAAGCGAGCCTTTTGTATGAGCAAGGGCAAAAAGATGAAGC 3234 RNDENQVDAIMKKASLLYEQGQ
TTTGCATTTGTTTGACAAAGCCGCTTCTTTTTCGCAAGGGATTGCGAGCCATAATTTAGGGGTGATTAAGTTTAAGGA LHLFDKAASFSQGIASHNLGVIK
AAAGGATTTTAATGGGGCGTTGGATTTGTTTGATTCTAGCATCGCTTCTAAAGAAAATGCGAGCGTGAGCGCGATTG FNGALDLFDSSIASKENASVSAI
ATGCGTTAGTGAGCGCTTATCATTTGCAAGATGAAGATTTGTATTATCATTATCTAAAAATTGCGAGAGACACTTTGTA AYHLQDEDLYYHYLKIARDTLYK
TAAAGACTATAAAAAGTCTTTTTATTCCTACGCTTACGCGCTCAAATCCTATTACGCCGGGGAGTATTTTGAAGCCCT SFYSYAYALKSYYAGEYFEALS
TTCGCCCTTAATGCACCCTAATTCCAACGCCTTTTTAAAGCCTAATACGCGCTTAGCGTCTAAATTGTTTTTGATGTTT NSNAFLKPNTRLASKLFLMFKD
AAAGATGAAACGAACGCTTACGAGCAATTACAAAAAAGCGCAAACGCCCAAGATGAGCTTGCTTTAGGGCTTTTGCA EQLQKSANAQDELALGLLQARL
GGCGCGTTTGGGCCATTACAAGCAGGCTTTGGAGCATTTGCAGCATTATTTGCACAACTACCCTAAAGATTTAAACG QALEHLQHYLHNYPKDLNALMA
CTTTGATGGCTTTGGAATTGGTGAGTTTGAAAAAAGGCGACACCCTTAAAGCGAGCGAAGCCTTAAAATTAGCCAGC LKKGDTLKASEALKLASHTKEDT
CACACCAAAGAAGACACGCTATTAGCCAACTCTTTTTACCCCATCAAGCCCACGATAAAC SFYPIKPTIN
HP0287 3235 AAAGATGCTAATGAAGTGTATCGTTTGAAAAAGCTTTCCACTTTTCAAGAGCTTGTGAGCGTGTATTACGGCATGGTG 3236 KDANEVYRLKKLSTFQELVSVY
TTAAACGCAGAAGTGGCTGAAACTTTAGAAGAGGTGGAAAAAGGCCATTATAAGCATTTCCAAAACGCTTTGAAAATG NAEVAETLEEVEKGHYKHFQNA
CAAAAAGTGGGGCAAATCGCTAGGGTAGAAACCTTAGGCGCTCAAGTGGCTTATGATAAGGCCCATATCGCTAGCG KVGQIARVETLGAQVAYDKAHIA
TTAAGGCTAAAGACGTGTTAGAAGTTTCGCAGCTCTCGTTCAATTCCATTTTATCTAGCAAGGACGATTTAGTGCCTT KDVLEVSQLSFNSILSSKDDLVP
CAAGCAAATTAGAGATCCGCACGGAGAAAAATCTGCCCGATCTGAGCTTrrTTGTTTCTTCCACGCTCAATTCCTACC IRTEKNLPDLSFFVSSTLNSYPV
CGGTTTTAAAGACTTTAGAAAATCAGATrCAAATCTCTAAAGAAAACACGAAATTACAGATCGCTAAATTCTTGCCCCA NQIQISKENTKLQIAKFLPQVSFF
AGTGAGTTTTTTTGGCTCTTATATTATGAAGCAAAACAATTCGGTGTTTGAAGACATGATCCCTAGTTGGTTTGTGGG KQNNSVFED IPSWFVGVAG
CGTGGCCGGGCGCATGCCTATTCTTTCTCCCACAGGGCGCATTCAAAAATACCAAGCGAGCAAATTAGCGGAGTTG SPTGRIQKYQASKLAELQVSSE
CAAGTGAGTAGCGAACAAATCCAGGCTAAAAAAMCATGGAATTATTAGTGAATAAGACTTATAAAGAGACGCTTTCT KNMELLVNKTYKETLSYLKEYKS
TATTTGAAAGAATACAAAAGCTTGCTTTCTAGCGTGGAATTAGCCAAGGAAAACTTAAAACTCCAAGAGCAGGCTT VELAKENLKLQEQA
HP0287 3237 CACTCTTATACAGAGGTTGAACGCCTTGATGGCTATGAGGTGTATTTGAGCGTTCCTTTAAAACTCCAAAGAGAAATT 3238 HSYTEVERLDGYEVYLSVPLKL
GAAACCCTATTGGATAAAACTAAAGACAAACTCAAGGCTAAAGAAATCCTAGTGGGTATCATTAACCCTAAAAGCGG TLLDKTKDKLKAKEILVGIINPKS
GGAAATTTTATCGCTAGCTTCAAGCAAGCGCTTCAATCCTAATGCGATTAAAACCAGCGATTATGAAAGCTTGAATTT ASSKRFNPNAIKTSDYESLNLSV
GAGCGTTGCTGAAAAGGTTTTTGAGCCAGGCAGCACGATC FEPGSTI
HP0975 3239 CATTAGCGGAGTTTGAATGGGTGGTGAGCAATCTTAAATACCAAAGCCTTGCAAAAGTGGAAATCAAACGAAACCAT 3240 LAEFEWVVSNLKYQSLAKVEIKR
AAAGTCAAAGAAGTAACGCTCAAAGTCAATAAGCGTTATGGGGGGT TTTACTCAAAGACACTTTTTTAGAGCGCTAT KEVTLKVNKRYGGFLLKDTFLE
GGCATCGCTTTAGATGAGCGTT rATTATCACTAAAATAGGCGCTCATTTGCCCAAAGGCTTGGATTTTTTAAAGCTT DERFIITKIGAHLPKGLDFLKLGD
GGGGATAGGATTTTATGGGTGAATTATAAAAGCGTGGCGTCCAACCCAAAGGCTTTAAGAGAAGCGTTAAGCGCGC NYKSVASNPKALREALSAPKIEL
CTAAAATTGAATTATTAGTCTTGCGTAAAGGCTTTGAATTTTACATTAAAGTCCGTTGAAGTATTGATGAAAAATGACG GFEFYIKV
CTTATGAAATTATTCTTTCTTGGTTTATCACGCCTCTCACGGCGATTTTAGGGCGTTTCGCTGAATTTTTTCTCTACAC
TTTGCATGCGCAATTGGTGTTTAATAGCGTGGTCGCTTTGGCGTTCATGCTCTTTGCTTATAGGAGTTTGAAAGAACA
GAATTTCTTCAGCGCTAGCGCGCTAACAGAAGCGTTATTGTTTGTGGGGTTΓTTTGCACTTTTCAACTACGCTTTAAA
AAATCCCATGCATTTTTATGAATTTTTCCAAAACGCTATTTTTATTGCGCCTAACATGATCGCGCAAAGCCTCTCTCAA
AGCTTGAGTMCTTTTCTGACCATGCGCTTTCTTTAGATTTTATCTTTAATCATGGTTTTTATGCCCTTAGTΓTCATCAG
CGATTTGAGCCATAATGAMTGTCTGTGTGGCTTTLLLTMGCGTTTTGCAAGGGCTTTTTTTGAGCGTGCTGTTTGC
AATCATCATTTTAGTGTATTTAGAAGTGCATGTGTGGTGCTCTTTAGGGGTGCTGTTTTTAGCGTTTGGGTTTTTTAAA
ACCTGGAGGAGCGTTGTGGTTATATGCCTAAAA GTGCTTCGCTCTTGGGTTTTACAAGCCTTTTTTGTTGTTGGTA
GGGTTTT
HP0975 3241 CAAACTTAAACGCACCCAAACCCTTATTTGAATGTTTTGTAGGAGTTAATCTGGCCAAAGCCAAATATTATTCTAAAAA 3242 NLNAPKPLFECFVGVNLAKAKY
AGAAGAAAGAGAAAMGAAAAGATGATCTTGAATTTTTGTAAGATATTTGAAATTATTCTTTTTGAAGCTATCCAAAAA EREKEKMILNFCKIFEIILFEAIQ
CAACCAAAGCCTGAT TAAAAATAAAGACGAGCTTTTAGGGGATTATCCTAATCTTAAAAATTTAGATTCTTTAAGAG DFKNKDELLGDYPNLKNLDSLR
AAGTGAGGGAAGACTTTTTGAAAAGAGCGTTTAAGAATGATGAAGCGAGTTTGGGAGCGTATGTGTTAGTGTTGCTT FLKRAFKNDEASLGAYVLVLLS
AGCTGTAAGTATTTTGAGAGCGTGTTTGAAAAAGTTCAAGAATGGCTAGAT TTATCGCTAGGCTTATTGCTTTGAGA SVFEKVQEWLDFIARLIALRG
GGC
HP0975 3243 CCCTGAGGATAACTCCATAGAATTATCTCCTAGCGATAGCGCTTGGAGAACTAATCTTGTTGTGCGGACTAATAAAG 3244 PEDNSIELSPSDSAWRTNLVVR
CCTTGTATCAATTCATTTTGAGAATAGCTCAAAAAGACAATTTTGCTTCAGCGTATCTAACAGTCAAATTAGAATACCC YQFILRIAQKDNFASAYLTVKLE
ACAAAGACACGAAGTCTCTAGCGTTATTGAAGAGGAGTTAAAAAAGAGAGAAGAAGCAAAGAGGCAGAAAGAATTGA EVSSVIEEELKKREEAKRQKELI
TTAAGCAAGAAAATCTTAACACCACAGCCTACATCAATAGAGTGATGATGGCGAGCAATGAACAGATTATCAACAAAG NTTAYINRVM ASNEQIINKEKI
AAAAAATAAGAGAAGAAAAGCAAAAAATTATCTTAGATCAAGCAAAGGCGCTAGAGACTCAATATGTGCATAATGCCT KIILDQAKALETQYVHNALKRNP
TAAAAAGAAACCCCGTGCCTAGAAACTACAACTACTACCAAGCGCCTGAAAAACGCTCTAAACATATTATGCCCTCTG NYYQAPEKRSKHIMPSEIFDDG
AAATTTTTGATGATGGCACATTCACTTATTTTGGTTTCAAAAACATCACTCTCCAACCTGCTATTTTTGTGGTTCAACCT GFKNITLQPAIFVVQPDGKLSM
GATGGGAAATTGAGCATGACTGATGCCGCCATTGATCCTAACATGACCAATTCAGGATTGAGATGGTATAGAGTTAA PNMTNSGLRWYRVNEIAEKFKL
TGAAATTGCAGAAAAATTTAAGCTCATTAAAGACAAAGCCCTTGTAACAGTAATCAATAAAGGCTATGGGAAAAATCC LVTVINKGYGKNPLTKNYNIKNY
ATTGACAAAAAATTACAATATCAAAAACTATGGTGAATTGGAGCGTGTGATTAAAAAGCTCCCTCTTGTCAGAGATAA VIKKLPLVRDK
ATAAAAAGGCGTTAAGACATGAATGAAGAAAACGATAAACTTGAAACTTCTAAAAAAGCCCAACAAGATTCAC
HP0975 3245 GGCGGTTTTTTTATGCTGGCTTTTTACCCCATAA GCAAGGAAAGGCGCTTAAAAATCGCTAAAATTTTAAACGCTT 3246 RFFYAGFLPHKSKERRLKIAKIL
TAGCGTATTTGGAAGAAAAAACCCCGGTGGTTTTTTATGAAAGCCCGCACCGATTGTTGGAGACTTTAAAGGATTTAA LEEKTPVVFYESPHRLLETLKD
ACGATCTGGCTAAAGGCATGCATTTGTTTGCGGCTAAAGAGCTTACCAAACTCCACCAACAATATTATTTAGGAGAG G HLFAAKELTKLHQQYYLGE
GTTTCTCAAATCATAGAGCGGTTGCAACAAAGCACTATCCAAGGGGAGTGGGTTTTAGTGCTTTTGAATGAGAAAAA RLQQSTIQGEWVLVLLNEKKI E
AATAGAGCCTTGCATGGGGCTATCGGCGTTATTGGAGTTGGATTTACCTCCTAAAATTAAGGCTAAAATTGAAGCCG SALLELDLPPKIKAKIEAV TQK
TTATGACACAAAAAAACGCTAAAGAGCTTTATTTCCAGCGTTTGTTAGAAGAAAAAAATCAATAAAAGGGGTTTAGCAT YFQRLLEEKNQ
GCAAGCAGTAATTTATGGCAAGCAAGTGATTATGCACCTTCTAAACTCTCATCAAGAAAAATTGCAAGAAATCTATCTT
TCTAAAGAAATAGACAAGAAACTTTTTTTCGCGCTCAAAAAAGCATGCCCTAATATCATCAAAGTGGATAATAAAAAAG
CGCAAAGCTTGGCTAAGGGGGGGAATCATCAAGGGGTTTTGGCTAAGGTGGAACTGCCCTTAGCGGTTTCT
HP0975 3247 CATTTGAAGCTGTAATCGGGCTAGAAGTCCATGTCCAACTCAACACCAAAACCAAAATCTTTTGCTCTTGCTCTACAA 3248 FEAVIGLEVHVQLNTKTKIFCSC
GCTTTGGAG TCCCCTMTTCTAACACCTGCCCTGTGTGTTTGGGTTTACCGGGAGCTTTGCCGGTATTGAATAAA ESPNSNTCPVCLGLPGALPVLN
GAAGTGGTTAAAAAAGCCATCCAATTAGGCACAGCCATTGAAGCCAATATCAACCAATATTCTATTTTTGCGAGGAAA KAIQLGTAIEANINQYSIFARKN
AATTATTTTTACCCTGATTTGCCTAAGGCTTATCAAATTTCGCAGTTTGAAGTCCCTATTGTGAGCGATGGGAAATTAG PKAYQISQFEVPIVSDGKLEIDT
AGATTGACACTAAAGAGGGTGCAAAAATCGTGCGTATTGAAAGGGCCCACATGGAAGAAGACGCCGGTAAAAATAT VRIERAH EEDAGKNIHEGSYS
CCATGAGGGCAGTTATTCTTTAGTGGATTTGAACCGCGCTTGCACCCCTTTATTAGAAATTGTCAGTAAGCCGGACAT RACTPLLEIVSKPD RNSEEAIA
GCGAAATAGTGAAGAAGCTATAGCGTATTTGAAAAAGCTCCATGCTATCGTGCGTTTTATAGGGATTTCTGATGCGAA HAIVRFIGISDANMQEGNFRCD
CATGCAAGAGGGGAATTTCAGGTGCGATGCGAACGTGTCCATTAGACCCAAAGGCGATGAAAAGCTTTATACGAGA PKGDEKLYTRVEIKNLNSFRFIA
GTAGAGATTAAAAATCTAAATAGCTTTAGATTCATTGCTAAAGCGATTGAATACGAGATAGAGCGCCAAAGCGCGGC EIERQSAAWENGRYNEEVVQE
GTGGGAGAACGGGCGCTATAATGAAGAGGTGGTTCAAGAAACGCGCCTTTTTGACACCCATAAAGGGATCACCCTT THKGITLS RNKEESADYRYFK
TCTATGCGCAATAAAGAAGAATCAGCGGATTACCGCTATTTTAAAGATCCGGATTTGTATCCTGTTTTTATTGATGAAA PVFIDEKLLKEAQKINELPSAKKI
AACTTTTAAAAGAAGCTCAAAAGATCAATGAATTGCCTAGCGCGAAAAAAATCCGCTACATGAAAGATTTTAACCTTAA DFNLKEDDANLLVSDPLLAQYF
AGAAGACGATGCGAATTTATTGGTGAGCGATCCTTTATTGGCGCAGTATTTTGAAAGCATGCTCCATCTTGGGGTTAA LGVKAKTSVTWLCVELLGRLKA
GGCTAAAACGAGCGT CGVSTHMLGALAKRIDEGKISG
VLDKLLEEQG
TABLE 3
SA1_HP0001 CUACUACUACTCAATTCAAGGGTTTTTGAG (SEQ ID NO. 3265)
SA2_HP0002 CUACUACUATTTTAACCCTTGAGAGTTTGG (SEQ ID NO. 3266)
SA3_HP0003 CUACUACUATTCCTTTAAAATAAATTTTGG (SEQ ID NO. 3267)
SA4_HP0004 CUACUACUAGGGGTTTTTGTTTTAGAAGTT (SEQ ID NO. 3268)
SA5_HP0005 CUACUACUATTAACAATCCTTTAAAAGCTC (SEQ ID NO. 3269)
SA6_HP0006 CUACUACUAAAACTACACCCATAAATTATC (SEQ ID NO. 3270)
SA7_HP0007 CUACUACUATCAGCTGGTAGAGCAATTCCC (SEQ ID NO. 3271)
SA8_HP0008 CUACUACUACTAATAAGATTTGTTAGATCT (SEQ ID NO. 3272)
SA9_HP0010 CUACUACUAGGGGCTTACATCATGCCACCC (SEQ ID NO. 3273)
SA10_HP0011 CUACUACUAAATAATGTTTTTAGTGTTTTT (SEQ ID NO. 3274)
SA11_HP0012 CUACUACUAATCTTTTTCATATGGCGACTA (SEQ ID NO. 3275)
SA12_HP0013 CUACUACUATTCAAGCGAACAAATATTCCC (SEQ ID NO. 3276)
SB1_HP0014 CUACUACUACTATTCTACGATTTGATAGAC (SEQ ID NO. 3277)
SB2_HP0015 CUACUACUAAATCATTAAAAAATTCCCATA (SEQ ID NO. 3278)
SB3_HP0016 CUACUACUAACATTCTACGCTCTATAACAT (SEQ ID NO. 3279)
SB4_HP0017 CUACUACUACTAGCTCCTCAAAAGTTTCTC (SEQ ID NO. 3280)
SB5_HP0018 CUACUACUATGAAATCATTTTAAACGACTC (SEQ ID NO. 3281)
SB6_HP0019 CUACUACUATCATGCTAATTCCAAAAATTG (SEQ ID NO. 3282)
SB7_HP0020 CUACUACUATTAGTTTCTGTTTTTATAGTC (SEQ ID NO. 3283)
SB8_HP0021 CUACUACUATTAAGGCTTTTTGGGGCTTGT (SEQ ID NO. 3284)
SB9_HP0022 CUACUACUATGTTATTTTACTCTTTTTTGT (SEQ ID NO. 3285)
SB10_HP0023 CUACUACUATTATTTGGATTTGCCCCATAT (SEQ ID NO. 3286)
SB11_HP0024 CUACUACUATTAGCGTTTCATCAAAAACAC (SEQ ID NO. 3287)
SB12_HP0025 CUACUACUACTTTTAGTAAGCAAACACATA (SEQ ID NO. 3288)
SC1_HP0026 CUACUACUAAGCCTTAATCCCCTACATAGA (SEQ ID NO. 3289)
SC2_HP0027 CUACUACUAGCTCTTTCACATGTTTTCAAT (SEQ ID NO. 3290)
SC3_HP0028 CUACUACUATTAATGAGCTTTGCTCTTAAA (SEQ ID NO. 3291)
SC4_HP0029 CUACUACUACGTTCTGTTTGCACTTTATTT (SEQ ID NO. 3292)
SC5_HP0030 CUACUACUAATGGTTTAACTTTTAAAGTGG (SEQ ID NO. 3293)
1SC6_HP0031 CUACUACUACTACTTGGCGATCACAACAGG (SEQ ID NO. 3294)
1SC7_HP0032 CUACUACUATTATTTTATCTCTTCTACCAA (SEQ ID NO. 3295)
1SC8_HP0033 CUACUACUATCAAAAGTCATTTTCTTTCAC (SEQ ID NO. 3296)
1SC9_HP0034 CUACUACUAATCCATCTCTAACCCTTTTCT (SEQ ID NO. 3297)
1SC10_HP0035S34 CUACUACUATGAAACATCACAATTTAGCAA (SEQ ID NO. 3298)
1SC11_HP0036S35 CUACUACUAACTTCAACGGACTTTAATGTA (SEQ ID NO. 3299)
1SC12_HP0037S36 CUACUACUATTTAAATTTCATTTTTTGTAT (SEQ ID NO. 3300)
1SD1_HP0038S37 CUACUACUAGCATCAATCACCCCTATTAGT (SEQ ID NO. 3301)
1SD2_HP0039S38 CUACUACUACTAATGGGTTTTGAAAAAAAC (SEQ ID NO. 3302)
1SD3_HP0040S39 CUACUACUACACTTATTCATCTTTAGCCTT (SEQ ID NO. 3303)
1SD4_HP0041S40 CUACUACUAATGCGAGCGGCTAGTAAATCC (SEQ ID NO. 3304)
1SD5_HP0042S41 CUACUACUATGAGTCACTTCAAAAACTCAG (SEQ ID NO. 3305)
1SD6_HP0043S42 CUACUACUATTTATTAGGCGTTTTGATTTT (SEQ ID NO. 3306)
1SD7_HP0044S43 CUACUACUAATAATCTCATTCATAAAAATT (SEQ ID NO. 3307)
1SD8_HP0045S44 CUACUACUATTCAAACCTCTAAAAGCTTCA (SEQ ID NO. 3308)
1SD9_HP0046S45 CUACUACUAAGTTTGTTTTTAATGCAAAGA (SEQ ID NO. 3309)
1SD10_HP0047S46 CUACUACUATTAACAAATCCTAGGCAATAA (SEQ ID NO. 3310)
1SD11_HP0048S47 CUACUACUATTTATCCTTTTTTGATGATTG (SEQ ID NO. 3311)
1SD12_HP0049S48 CUACUACUATCAATAAAGTTGCATCGTTAC (SEQ ID NO. 3312)
1SE1_HP0050S49 CUACUACUAAATAGCGTTTTTAAAACAGAT (SEQ ID NO. 3313)
1SE2_HP0051S50 CUACUACUATCATTTTCTTAAGCTTTTTAA (SEQ ID NO. 3314)
1SE3_HP0052S51 CUACUACUAAGTAGGGCAAAGTTTAGCAAG (SEQ ID NO. 3315)
1SE4_HP0053S52 CUACUACUAACTTATTAAAATTTTAGCTTG (SEQ ID NO. 3316)
1SE5_HP0054S53 CUACUACUACTTTCATAAGCTACTCCTTAA (SEQ ID NO. 3317)
1SE6_HP0055S54 CUACUACUATTAATGTTTCAAGCTCTCAAT (SEQ ID NO. 3318)
1SE7_HP0056S55 CUACUACUATTATTTTTCAGCACAGCATGA (SEQ ID NO. 3319)
1SE8_HP0057S56 CUACUACUAATAAACCCCTAGAATGTAGAA (SEQ ID NO. 3320)
1SE9_HP0059S57 CUACUACUATTATGGTTTTGGTTGTTTTGA (SEQ ID NO. 3321)
1SE10_HP0060S58 CUACUACUAGTCATCATAATTTTTCTGCCT (SEQ ID NO. 3322)
1SE11_HP0061S59 CUACUACUATTTAAAAATTTTCTTCATCAA (SEQ ID NO. 3323)
1SE12_HP0062S60 CUACUACUACCACCCCTTAATAATCTTCCT (SEQ ID NO. 3324)
1SF1_HP0063S61 CUACUACUAAATTCAGTTCTGAGTTTGCAG (SEQ ID NO. 3325)
SF2_HP0064S62 CUACUACUATTAACCCCTTGATTCCAGTTT (SEQ ID NO. 3326)
SF3_HP0065S63 CUACUACUATCAATCATTTTTATCCTTAGA (SEQ ID NO. 3327)
SF4_HP0066S64 CUACUACUACCCCTTTTTTTGACCCTATAA (SEQ ID NO. 3328)
SF5_HP0067S65 CUACUACUATTTCAAACCTTTTGCGTGGTG (SEQ ID NO. 3329)
SF6_HP0068S66 CUACUACUAAGTGTTCATCAATCTTCCAAT (SEQ ID NO. 3330)
SF7_HP0069S67 CUACUACUAGAGATTAAAATTCAAGACATA (SEQ ID NO. 3331)
SF8_HP0070S68 CUACUACUATCTTTTCTATTTTACGACCAC (SEQ ID NO. 3332)
SF9_HP0071S69 CUACUACUATCACACCCAGTGTTGGATAAA (SEQ ID NO. 3333)
SF10_HP0072S70 CUACUACUACTCCTAAAAAATCCTAGAAAA (SEQ ID NO. 3334)
SF11_HP0073S71 CUACUACUATTACTCCTTAATTGTTTTTAC (SEQ ID NO. 3335)
SF12_HP0074S72 CUACUACUATTATGCCTTAATTTTGTTTTG (SEQ ID NO. 3336)
SG1_HP0075S73 CUACUACUATTAGCACAAATGCCCTTCAAA (SEQ ID NO. 3337)
SG2_HP0076S74 CUACUACUAAAAACTAAGCGAGAGCGAGAG (SEQ ID NO. 3338)
SG3_HP0077S75 CUACUACUAAGTAATGGTCTTATTCAAACT (SEQ ID NO. 3339)
SG4_HP0078S76 CUACUACUAACTTCCTTCTTATTTTTTCAT (SEQ ID NO. 3340)
SG5_HP0079S77 CUACUACUATTAGAAGTTGATCATGTAGTT (SEQ ID NO. 3341)
SG6_HP0080S78 CUACUACUACTTTGATATTGGGGTTAATCT (SEQ ID NO. 3342)
SG7_HP0081S79 CUACUACUATTACCTTAAAAGAATCATTTT (SEQ ID NO. 3343)
SG8_HP0082S80 CUACUACUAAAATCAAAATTTCTTCCTGCT (SEQ ID NO. 3344)
SG9_HP0083S81 CUACUACUAGATTACCTTTTGGAGAATTGT (SEQ ID NO. 3345)
SG10_HP0084S82 CUACUACUATTCTCATTTAGCGTCCTTTTT (SEQ ID NO. 3346)
SG11_HP0085S83 CUACUACUACTTTTTTATTTTTGCTCCTGA (SEQ ID NO. 3347)
SG12_HP0086S84 CUACUACUATTAATTTTCCAATTCTTCTGG (SEQ ID NO. 3348)
SH1_HP0087S85 CUACUACUAAAGAAAATCAATTCGCATTTA (SEQ ID NO. 3349)
SH2_HP0088S86 CUACUACUACATCAAATGCGCAAATAGTTT (SEQ ID NO. 3350)
SH3_HP0089S87 CUACUACUAACAAACCCTAAAGCTCATCCA (SEQ ID NO. 3351)
SH4_HP0090S88 CUACUACUATCACACGTATTCTTCTAAAAA (SEQ ID NO. 3352)
SH5_HP0091S89 CUACUACUAATTACCATTTTTGTGCCTTTG (SEQ ID NO. 3353)
SH6_HP0092S90 CUACUACUACTTTATTGGTGGTTTAAATCT (SEQ ID NO. 3354)
SH7_HP0093S91 CUACUACUACACTTTAAGCGTTATACTTTT (SEQ ID NO. 3355)
SH8_HP0094S92 CUACUACUATCAAAGAAAGCTTGCACTGAT (SEQ ID NO. 3356)
SH9_HP0095S93 CUACUACUAAATCATTTGAATTTTTTATTT (SEQ ID NO. 3357)
1SH10_HP0096S94 CUACUACUATTTTTTGCGTTTATTTTTGAG (SEQ ID NO. 3358)
SH11_HP0097S95 CUACUACUACGATTACATCTTAGAAATATC (SEQ ID NO. 3359)
SH12_HP0098S96 CUACUACUAGGGCTTAAAAAAGATTTAATG (SEQ ID NO. 3360)
RA1_HP0001R1 CAUCAUCAUAGGGTTAAAATGGCGACACGA (SEQ ID NO. 3361)
1RA2_HP0002R2 CAUCAUCAUATGCAAATCATAGAAGGGAAA (SEQ ID NO. 3362)
1RA3_HP0003R3 CAUCAUCAUGGAAAAGTCATGAAAACTTCT (SEQ ID NO. 3363)
1RA4_HP0004R4 CAUCAUCAUTAAAGAGTGAAAGCGTTTTTA (SEQ ID NO. 3364)
1RA5_HP0005R5 CAUCAUCAUGGGCTTTCATGCAATTATGTG (SEQ ID NO. 3365)
RA6_HP0006R6 CAUCAUCAUAGCTTTTAAAGGATTGTTAAA (SEQ ID NO. 3366)
1RA7_HP0007R7 CAUCAUCAUACGATGCTCATGGTGACCCGT (SEQ ID NO. 3367)
1RA8_HP0008R8 CAUCAUCAUGATGCGTTTTTGTTATTGGTT (SEQ ID NO. 3368)
1RA9_HP0010R9 CAUCAUCAUATGGCAAAAGAAATCAAATTT (SEQ ID NO. 3369)
1RA10_HP0011R10 CAUCAUCAUTATGAAGTTTCAGCCATTAGG (SEQ ID NO. 3370)
1RA11_HP0012R11 CAUCAUCAUTGAAAAATGATTCTTAAAAGT (SEQ ID NO. 3371)
1RA12_HP0013R12 CAUCAUCAUAGTCGCCATATGAAAAAGATT (SEQ ID NO. 3372)
1RB1_HP0014R13 CAUCAUCAUAGGTTTGCATGCAAGAGTTTT (SEQ ID NO. 3373)
1RB2_HP0015R14 CAUCAUCAUTTTTGATGTCCGCTCATTTTT (SEQ ID NO. 3374)
1RB3_HP0016R15 CAUCAUCAUATTTTTTAATGATTATCCTGT (SEQ ID NO. 3375)
1RB4_HP0017R16 CAUCAUCAUTTATAGAGCGTAGAATGTTAG (SEQ ID NO. 3376)
1RB5_HP0018R17 CAUCAUCAUTTAGATTGAAAATATTCGTTC (SEQ ID NO. 3377)
1RB6_HP0019R18 CAUCAUCAUGGGAAGTCATGGCTGATAGTT (SEQ ID NO. 3378)
1RB7_HP0020R19 CAUCAUCAUATGAAAAAATACAGCACTATC (SEQ ID NO. 3379)
1RB8_HP0021R20 CAUCAUCAUACACGCATGAAAAAATTCTTA (SEQ ID NO. 3380)
1RB9_HP0022R21 CAUCAUCAUCGTGTTTGGCATCATTATTCC (SEQ ID NO. 3381)
1RB10_HP0023R22 CAUCAUCAUATGGTTTATTTAAAGTGGAAA (SEQ ID NO. 3382)
1RB11_HP0024R23 CAUCAUCAUATGAAGTGTAAAAGTGGCAAA (SEQ ID NO. 3383)
1RB12_HP0025R24 CAUCAUCAUAGGAGAAAACATGAAGAAAAA (SEQ ID NO. 3384)
1RC1_HP0026R25 CAUCAUCAUAATGTCTGTCACTTTAGTCAA (SEQ ID NO. 3385)
1RC2_HP0027R26 CAUCAUCAUAGTGGCTTACAACCCTAAAAT (SEQ ID NO. 3386)
1RC3_HP0028R27 CAUCAUCAUAGGAATGAAGTTGATAAAATT (SEQ ID NO. 3387)
1RC4_HP0029R28 CAUCAUCAUGTTAAACCATGCTCTTTATCA (SEQ ID NO. 3388)
1RC5_HP0030R29 CAUCAUCAUATGCTATTGAATTACGATTTT (SEQ ID NO. 3389)
1RC6_HP0031R30 CAUCAUCAUGGTTGGTTCTTATGAATATTT (SEQ ID NO. 3390)
1RC7_HP0032R31 CAUCAUCAUATAAGCGGCATGAAAATGTAT (SEQ ID NO. 3391)
1RC8_HP0033R32 CAUCAUCAUAGATAAAATAATGGCTAAATT (SEQ ID NO. 3392)
1RC9_HP0034R33 CAUCAUCAUATGACTTTTGAAATGCTTTAT (SEQ ID NO. 3393)
1RC10_HP0035R34 CAUCAUCAUGGTTAGAGATGGATTTTAGTC (SEQ ID NO. 3394)
1RC11_HP0036R35 CAUCAUCAUATGTTTCACAAAGCCCTTATT (SEQ ID NO. 3395)
1RC12_HP0037R36 CAUCAUCAUTGATGAAAAATGACGCTTATG (SEQ ID NO. 3396)
1RD1_HP0038R37 CAUCAUCAUATGAAAGAAAAGCCTTTCAAT (SEQ ID NO. 3397)
1RD2_HP0039R38 CAUCAUCAUGGTGATTGATGCGTAAGGTTT (SEQ ID NO. 3398)
1RD3_HP0040R39 CAUCAUCAUATGGCGACCTTATTGTTTTTT (SEQ ID NO. 3399)
1RD4_HP0041R40 CAUCAUCAUAAAAAGGCTAAAGATGAATAA (SEQ ID NO. 3400)
1RD5_HP0042R41 CAUCAUCAUATGGATAACCCTAAAGGCATT (SEQ ID NO. 3401)
1RD6_HP0043R42 CAUCAUCAUATGAAAATTAAAAATATCTTA (SEQ ID NO. 3402)
1RD7_HP0044R43 CAUCAUCAUATGAAAGAAAAAATCGCTTTA (SEQ ID NO. 3403)
1RD8_HP0045R44 CAUCAUCAUATGAATGAGATTATTTTAATC (SEQ ID NO. 3404)
1RD9_HP0046R45 CAUCAUCAUTTAAAAGGGATTTGGTGTGTG (SEQ ID NO. 3405)
1RD10_HP0047R46 CAUCAUCAUATGGATAGCGTAACTCTAGCA (SEQ ID NO. 3406)
1RD11_HP0048R47 CAUCAUCAUAGAAGTGATAGGGGTTGATTG (SEQ ID NO. 3407)
1RD12_HP0049R48 CAUCAUCAUGGATATTTTGAATGAAAAGAA (SEQ ID NO. 3408)
1RE1_HP0050R49 CAUCAUCAUGCTTAAGAAAATGATACAAAT (SEQ ID NO. 3409)
1RE2_HP0051R50 CAUCAUCAUGAAAGCTTGAAATGAATTATA (SEQ ID NO. 3410)
1RE3_HP0052R51 CAUCAUCAUTTTAATTTAGGGGTGTTGTTG (SEQ ID NO. 3411)
1RE4_HP0053R52 CAUCAUCAUAGCTTATGAAAGAGCAATCAA (SEQ ID NO. 3412)
1RE5_HP0054R53 CAUCAUCAUTATGCTCTTTGATCAAACCTT (SEQ ID NO. 3413)
1RE6_HP0055R54 CAUCAUCAUATGGGACATGTTGTTTTAAGT (SEQ ID NO. 3414)
1RE7_HP0056R55 CAUCAUCAUAGGTAAGCTCATGCAAAAAAT (SEQ ID NO. 3415)
1RE8_HP0057R56 CAUCAUCAUATGAAAACAATTAAAAATGGT (SEQ ID NO. 3416)
1RE9_HP0059R57 CAUCAUCAUATGGGAACATTCATTGAAAAA (SEQ ID NO. 3417)
1RE10_HP0060R58 CAUCAUCAUATGGTAACACCCTTAAAAAGT (SEQ ID NO. 3418)
1RE11_HP0061R59 CAUCAUCAUAGAAAAATTATGATGACTAAG (SEQ ID NO. 3419)
1RE12_HP0062R60 CAUCAUCAUCGATGAGCAGAGTGCAAATGG (SEQ ID NO. 3420)
1RF1_HP0063R61 CAUCAUCAUTGGTTTATGGCTGAATGGAAA (SEQ ID NO. 3421)
1RF2_HP0064R62 CAUCAUCAUATGGAAACAATTCCTGCAAAC (SEQ ID NO. 3422 )
1RF3_HP0065R63 CAUCAUCAUATGTTTTCTCATGAAGTTTAT (SEQ ID NO. 3423 )
1RF4_HP0066R64 CAUCAUCAUCGTGAAGCCAAAGAGCATGAA (SEQ ID NO. 3424 )
1RF5_HP0067R65 CAUCAUCAUTGGAAGATTGATGAACACTTA (SEQ ID NO. 3425 )
1RF6_HP0068R66 CAUCAUCAUATGGTAAAAATTGGAGTTTGT (SEQ ID NO. 3426 )
1RF7_HP0069R67 CAUCAUCAUATAACAAATGGATAAAGGAAA (SEQ ID NO. 3427 )
1RF8_HP0070R68 CAUCAUCAUATGATCATAGAGCGTTTAGTT (SEQ ID NO. 3428 )
1RF9_HP0071R69 CAUCAUCAUGGCAATGCTAGGACTTGTATT (SEQ ID NO. 3429 )
1RF10_HP0072R70 CAUCAUCAUAATGAAAAAGATTAGCAGAAA (SEQ ID NO. 3430)
1RF11_HP0073R71 CAUCAUCAUGATGAAACTCACCCCAAAAGA (SEQ ID NO. 3431 )
1RF12_HP0074R72 CAUCAUCAUTGTGCTAAAAACCACTAAAAA (SEQ ID NO. 3432 )
1RG1_HP0075R73 CAUCAUCAUCGATGAAAATTTTTGGGACTG (SEQ ID NO. 3433 )
1RG2_HP0076R74 CAUCAUCAUGCTATGGCAAATCATAAGTCC (SEQ ID NO. 3434 )
1RG3_HP0077R75 CAUCAUCAUATGTCCATTCTAGCCGAAAAG (SEQ ID NO. 3435 )
1RG4_HP0078R76 CAUCAUCAUTAAATGAAAAAGGTTTTTTTA (SEQ ID NO. 3436 )
1RG5_HP0079R77 CAUCAUCAUATGAAAAAGTCATTCAAAAAA (SEQ ID NO. 3437 )
1RG6_HP0080R78 CAUCAUCAUTATCAATGAAAGCTATAAAAA (SEQ ID NO. 3438 )
1RG7_HP0081R79 CAUCAUCAUAAGAAATGCCCATGCAGGCTT (SEQ ID NO. 3439 )
1RG8_HP0082R80 CAUCAUCAUATGAAATCTACAAGAATTGGT (SEQ ID NO. 3440 )
1RG9_HP0083R81 CAUCAUCAUAAATGAGAAAAATCTATGCTA (SEQ ID NO. 3441 )
1RG10_HP0084R82 CAUCAUCAUTTTATCAAAGGATTCTTATGA (SEQ ID NO. 3442 )
1RG11_HP0085R83 CAUCAUCAUGCATGCAAAAAGAACAAGAAG (SEQ ID NO. 3443 )
1RG12_HP0086R84 CAUCAUCAUGGAGAAAGACAATGAGTATGG (SEQ ID NO. 3444 )
1RH1_HP0087R85 CAUCAUCAUATGCGTTATTTTCTTGTAGTT (SEQ ID NO. 3445 )
1RH2_HP0088R86 CAUCAUCAUATGAAAAAGAAAGCTAACGAA (SEQ ID NO. 3446 )
1RH3_HP0089R87 CAUCAUCAUGGGAGAAAATGGTGCAAAAAA (SEQ ID NO. 3447 )
1RH4_HP0090R88 CAUCAUCAUATGCAATACGCGCTATTATTT (SEQ ID NO. 3448 )
1RH5_HP0091R89 CAUCAUCAUCAAAAATGAAAAACCATCTGC (SEQ ID NO. 3449 )
1RH6_HP0092R90 CAUCAUCAUATGGTAATCGCGCATTCTAAT (SEQ ID NO. 3450 )
1RH7_HP0093R91 CAUCAUCAUAGGCGCTTGAGTATATGGCAA (SEQ ID NO. 3451 )
1RH8_HP0094R92 CAUCAUCAUTGAATGGCTTTTAAGGTGGTG (SEQ ID NO. 3452 )
1RH9_HP0095R93 CAUCAUCAUATGCCAAAACCCAAGAAAAAC (SEQ ID NO. 3453 )
RH10_HP0096R94 CAUCAUCAUATAGAGATGAAAACATTCAAA (SEQ ID NO. 3454)
RH11_HP0097R95 CAUCAUCAUAATGAAAAAAATCGTTTTAGT (SEQ ID NO. 3455)
RH12_HP0098R96 CAUCAUCAUATTTTATGCCTTTTGTCCCCA (SEQ ID NO. 3456)
SA1_HP0099S97 CUACUACUAAGCATTAAAACTGCTTTTTAT (SEQ ID NO. 3457)
SA2_HP0100S98 CUACUACUACTTACAGCAAATAAAACACCT (SEQ ID NO. 3458)
SA3_HP0101S99 CUACUACUACTAAAACGAAACGATTAAAGA (SEQ ID NO. 3459)
SA4_HP0102S100 CUACUACUAAATCTTTCTTTAATCGTTTCG (SEQ ID NO. 3460)
SA5_HP0103S101 CUACUACUACCCTGATTAAGTTTTAAACAA (SEQ ID NO. 3461)
SA6_HP0104S102 CUACUACUATTATTTAACTTCACTCTCTTT (SEQ ID NO. 3462)
SA7_HP0105S103 CUACUACUATCAAACCCCCACTTCAGACCA (SEQ ID NO. 3463)
SA8_HP0106S104 CUACUACUATGTAATGAAACTTTAGCCTAT (SEQ ID NO. 3464)
SA9_HP0107S105 CUACUACUATTATAAATAAATACCTTTTGA (SEQ ID NO. 3465)
SA10_HP0108S106 CUACUACUATTCAAGATGGGCTTTTTAAAT (SEQ ID NO. 3466)
SA11_HP0109S107 CUACUACUAGCGTTTCACTCCACTTCCGCA (SEQ ID NO. 3467)
SA12_HP0110S108 CUACUACUAATGTTTTAATCGTTTTT GCA (SEQ ID NO. 3468)
SB1_HP0111S109 CUACUACUACATCTTTCATTATTCCTCCTC (SEQ ID NO. 3469)
SB2_HP0112S110 CUACUACUATTTATCCTTTTTAGCTTTTTA (SEQ ID NO. 3470)
SB3_HP0113S111 CUACUACUACCCTTTTTAAGCGTATGTGTC (SEQ ID NO. 3471)
SB4_HP0114S112 CUACUACUATTACCATTCTTTTAAAGCCAT (SEQ ID NO. 3472)
SB5_HP0115S113 CUACUACUAGTTATTGTAAAAGCCTTAAGA (SEQ ID NO. 3473)
SB6_HP0116S114 CUACUACUACTTTAGCCATTATCTTCCTCT (SEQ ID NO. 3474)
SB7_HP0117S115 CUACUACUAAGGTTAAAACGCGCAATAAAA (SEQ ID NO. 3475)
SB8_HP0121S119 CUACUACUATATCAATGTTCAGTTAAGCCA (SEQ ID NO. 3476)
SB9_HP0122S120 CUACUACUACAAACCTTAAAATTTAGCAAT (SEQ ID NO. 3477)
SB10_HP0123S121 CUACUACUATTCTACTCAAAAACTAACCTC (SEQ ID NO. 3478)
SB11_HP0124S122 CUACUACUACCGCGATTAGTCTTCATTTTT (SEQ ID NO. 3479)
SB12_HP0125S123 CUACUACUATTAAGCCCTGCAAAGCAACGA (SEQ ID NO. 3480)
SC1_HP0126S124 CUACUACUACCAATAGAGATTAAAGATGCT (SEQ ID NO. 3481)
SC2_HP0127S125 CUACUACUACTTAAAAGCTTACAATGTAAC (SEQ ID NO. 3482)
SC3_HP0128S126 CUACUACUAGATTTTTCAAAGATTTGTTTG (SEQ ID NO. 3483)
SC4_HP0129S127 CUACUACUACTAATGCTTATGATGGCTTGG (SEQ ID NO. 3484)
SC5_HP0130S128 CUACUACUATACTATTTTTCAGACTTACTA (SEQ ID NO. 3485)
SC6_HP0131S129 CUACUACUATCTGCTTTGATATTCTTAAAA (SEQ ID NO. 3486 ) SC7_HP0132S130 CUACUACUATCCCTTAGCATTTTAAGGTTT (SEQ ID NO. 3487 ) SC8_HP0133S131 CUACUACUAGCCATTAAAATAGTCCTAAAA (SEQ ID NO. 3488 ) SC9_HP0134S132 CUACUACUATTTTTAACTAAGCGTGCTGTT (SEQ ID NO. 3489 ) SC10_HP0135S133 CUACUACUAACAATTTTTATTTTTGTTCAG (SEQ ID NO. 3490 ) SC11_HP0136S134 CUACUACUAAAACTACAAACTCTCTAAAAC (SEQ ID NO. 3491 ) SC12_HP0137S135 CUACUACUATCTTATCCTTTAGTAGAGAAT (SEQ ID NO. 3492 ) SD1_HP0138S136 CUACUACUAAAGCTCTTTACTCATAAATCA (SEQ ID NO. 3493 ) SD2_HP0139S137 CUACUACUATTAAAGCCCAAGTCTTGAGGC (SEQ ID NO. 3494 ) SD3_HP0140S138 CUACUACUATTATTTAGGAATGATGGGGAT (SEQ ID NO. 3495 ) SD4_HP0141S139 CUACUACUAATCTAAAATGGCCCTACTTTA (SEQ ID NO. 3496 ) SD5_HP0142S140 CUACUACUAAGGGTTTAACCCCCAAATAAA (SEQ ID NO. 3497 ) SD6_HP0144S141 CUACUACUACCTCCCCTATCTAGACATAGG (SEQ ID NO. 3498 ) SD7_HP014SS142 CUACUACUACCCTTATTTAGCGTTTTGATT (SEQ ID NO. 3499 ) SD8_HP0146S143 CUACUACUACATTTCAACTTTCCTTTATGC (SEQ ID NO. 3500 ) SD9_HP0147S144 CUACUACUATTAGCGATTGGATAAATTCAG (SEQ ID NO. 3501 ) SD10_HP0148S145 CUACUACUATCATTTTTTACTCCCTTGATT (SEQ ID NO. 3502 ) SD11_HP0149S146 CUACUACUACCTCTTAAGATTTTTTAAAAA (SEQ ID NO. 3503 ) SD12_HP0150S147 CUACUACUATTTTTTAAATGCGTTGCTTGA (SEQ ID NO. 3504 ) SE1_HP0151S148 CUACUACUATTCTTAATCCAATTCCAATTT (SEQ ID NO. 3505 ) SE2_HP0152S149 CUACUACUAGATTAAGAAAAGCGGTATTTT (SEQ ID NO. 3506 ) SE3_HP0153S150 CUACUACUAAATCCTTTTTATTCCATTTCT (SEQ ID NO. 3507 ) SE4JHP0154S151 CUACUACUACACTAGCCATGCTTAAACAAC (SEQ ID NO. 3508 ) SE5__HP0155S152 CUACUACUACTACCATTTTCCTTAAATCTA (SEQ ID NO. 3509 ) SE6_HP0156S153 CUACUACUAGCTTTTATTTCAATTTTAATT (SEQ ID NO. 3510 ) SE7_HP0157S154 CUACUACUATTATGCGATGAATTGTAGCAC (SEQ ID NO. 3511 ) SE8_HP0158S155 CUACUACUACTCTTAGCGTTCTCGTTTGGG (SEQ ID NO. 3512 ) SE9_HP0159S156 CUACUACUAGAGAACGCTAAGAGTTAAAAA (SEQ ID NO. 3513 ) SE10_HP0160S157 CUACUACUAGCTTTAACTTATTGCATATCA (SEQ ID NO. 3514 ) SE11_HP0161S158 CUACUACUACCTAAGACCACTTTAAAAAAT (SEQ ID NO. 
3515 ) SE12_HP0162S159 CUACUACUAGCATTTCACTCAATATTGGTA (SEQ ID NO. 3516 ) SF1_HP0163S160 CUACUACUAGTCCCATTCAATTCCTTTGTA (SEQ ID NO. 3517 )
SF2__HP0164S161 CUACUACUAACTTCTCTCAAATTTTCGGGG (SEQ ID NO. 3518 ) SF3_HP0165S162 CUACUACUACAACTCACGCTTTTATCCCCT (SEQ ID NO. 3519 ) SF4_HP0166S163 CUACUACUATCAGTATTCTAATTTATAACC (SEQ ID NO. 3520) SF5_HP0167S164 CUACUACUATCGTTTTTCACTCTGTGTTTT (SEQ ID NO. 3521 ) SF6_HP0168S165 CUACUACUATTTTTCAATTCCATTCTACCA (SEQ ID NO. 3522 ) SF7__HP0169S166 CUACUACUAGCCATTTTTAGACTCTGACTT (SEQ ID NO. 3523 ) SF8_HP0170S167 CUACUACUAACTACTTGGCTCCAAAAGAAG (SEQ ID NO. 3524 ) SF9_HP0171S168 CUACUACUAGCACTTAAGCTTTAGCAATCA (SEQ ID NO. 3525 ) SF10_HP0172S169 CUACUACUAATTTTAATTTTCAAACCTTAA (SEQ ID NO. 3526 ) SF11_HP0173S170 CUACUACUACCTTTAAAAGATTTTGCTAAT (SEQ ID NO. 3527 ) SF12_HP0174S171 CUACUACUATGAATAAAATCTAGCATTCTT (SEQ ID NO. 3528 ) SG1_HP0175S172 CUACUACUATCAATTACTTGTTGATAACAA (SEQ ID NO. 3529 ) SG2_HP0176S173 CUACUACUACTTTCCTTGTTGATTAAATTT (SEQ ID NO. 3530 ) SG3_HP0177Ξ174 CUACUACUACTCACTTCACCTTTTCAAGAT (SEQ ID NO. 3531 ) SG4_HP0178S175 CUACUACUACTACAATGAGCGTTCTATATC (SEQ ID NO. 3532 ) SG5_HP0179S176 CUACUACUATTATTTCTCCTTAATCAAAAC (SEQ ID NO. 3533 ) SG6_HP0180S177 CUACUACUATTTAATCATGATCGTTTCCTA (SEQ ID NO. 3534 ) SG7_HP0181S178 CUACUACUATTTATAGGGGTTCTTTATTAG (SEQ ID NO. 3535 ) SG8JHP0182S179 CUACUACUAATTATTCTTCACTCTCCACAT (SEQ ID NO. 3536 ) SG9_HP0183S180 CUACUACUATTAAAAAATAGGTTGGTGGTA (SEQ ID NO. 3537 ) SG10_HP0184S181 CUACUACUACCTTTCAAAAATGGTTATAAA (SEQ ID NO. 3538 ) SG11_HP0185S182 CUACUACUACTACCGCGCTTCTATGACAAC (SEQ ID NO. 3539 ) SG12_HP0186S183 CUACUACUATTAGTGGGCGGCGAAATCCTT (SEQ ID NO. 3540 ) SH1_HP0187S184 CUACUACUATCAAAACTTCAAAAAGATTTT (SEQ ID NO. 3541 ) SH2JHP0188S185 CUACUACUATCTTACAGATCCCTAAAATTC (SEQ ID NO. 3542 ) SH3_HP0189S186 CUACUACUAAAACTTTTAATGCGCTTTATT (SEQ ID NO. 3543 ) SH4_HP0190S187 CUACUACUATTAAAGCTCTCTTTCAGGAAG (SEQ ID NO. 3544 ) ΞH5_HP0191S188 CUACUACUAAAGAAATTAGCGGCTTTTACC (SEQ ID NO. 3545) SH6_HP0192S189 CUACUACUACATTATCACTCATGGTGTTCT (SEQ ID NO. 3546 ) SH7_HP0193S190 CUACUACUAACCCTTATTCTTTGTGGAATT (SEQ ID NO. 
3547 ) SH8_HP0194S191 CUACUACUATTTATAAAAATGAAATGATTG (SEQ ID NO. 3548 ) SH9_HP0195S192 CUACUACUACCTTATTGTTCTTTATGCAAA (SEQ ID NO. 3549 )
SH10_HP0196S193 CUACUACUAGCTTATTATAGCTTATTTCAT (SEQ ID NO. 3550) SH11_HP0197S194 CUACUACUAAATATTTTTTTAACGCTTAAA (SEQ ID NO. 3551) SH12_HP0198S195 CUACUACUACTTAAATCTTAAAGATCTCTA (SEQ ID NO. 3552) RA1_HP0099R97 CAUCAUCAUTTGTCTAAAGGTTTGAGTATC (SEQ ID NO. 3553) RA2_HP0100R98 CAUCAUCAUATGCTCATTCATATTTGCTGC (SEQ ID NO. 3554) RA3_HP0101R99 CAUCAUCAUTTTGAAAACTCTATTTAGTAT (SEQ ID NO. 3555) RA4_HP0102R100 CAUCAUCAUTGGATTGTTAAAAGTTTCTGT (SEQ ID NO. 3556) RA5_HP0103R101 CAUCAUCAUGGAGATAAATGATGTTTTCTT (SEQ ID NO. 3557) RA6_HP0104R102 CAUCAUCAUCAATGAAAAAATTGGTTTTAG (SEQ ID NO. 3558) RA7_HP0105R103 CAUCAUCAUACATGAAAACACCAAAAATGA (SEQ ID NO. 3559) RA8_HP0106R104 CAUCAUCAUCATGCGCATGCAAACCAAATT (SEQ ID NO. 3560) RA9_HP0107R105 CAUCAUCAUTTTGTTATAGGAGAAATGATG (SEQ ID NO. 3561) RA10_HP0108R106 CAUCAUCAUATGCCTAAACTCGTTAAATGG (SEQ ID NO. 3562) RA11_HP0109R107 CAUCAUCAUATGGGAAAAGTTATTGGAATT (SEQ ID NO. 3563) RA12_HP0110R108 CAUCAUCAUGGAATAATGAAAGATGAACAC (SEQ ID NO. 3564) RB1_HP0111R109 CAUCAUCAUATGGTGATTGACGAGATTTTT (SEQ ID NO. 3565) RB2_HP0112R110 CAUCAUCAUTTGACGCAAGCTCTTTACTAT (SEQ ID NO. 3566) RB3_HP0113R111 CAUCAUCAUACATGACCATCAACACCCATT (SEQ ID NO. 3567) RB4_HP0114RX12 CAUCAUCAUTTATAAGTTAGAATGATGGAT (SEQ ID NO. 3568) RB5_HP0115R113 CAUCAUCAUAACATGAGTTTTAGGATAAAT (SEQ ID NO. 3569) RB6_HP0116R114 CAUCAUCAUCGTTCCATGAAGCACCTTATT (SEQ ID NO. 3570) RB7_HP0117R115 CAUCAUCAUTAATGGCTAAAGAAAATCCGC (SEQ ID NO. 3571) RB8_HP0121R119 CAUCAUCAUTGTGCGATATATCAAGTTTTT (SEQ ID NO. 3572) RB9_HP0122R120 CAUCAUCAUTTGTATAGGCGGTTATTGTTG (SEQ ID NO. 3573) RB10_HP0123R121 CAUCAUCAUATGAGTGCGGAACTGATTGCT (SEQ ID NO. 3574) RB11_HP0124R122 CAUCAUCAUAAGAGGTTAGTTTTTGAGTAG (SEQ ID NO. 3575) RB12_HP0125R123 CAUCAUCAUCATGCCAAAAATGAAGACTAA (SEQ ID NO. 3576) RC1_HP0126R124 CAUCAUCAUATGAGAGTTAAAACAGGCGTT (SEQ ID NO. 3577) RC2_HP0127R125 CAUCAUCAUTACATGAAGAAATCTGTTATA (SEQ ID NO. 3578) RC3_HP0128R126 CAUCAUCAUCTCTTTATGGCCTTTTATGAT (SEQ ID NO. 3579) RC4_HP0129R127 CAUCAUCAUATGAAAAAATTAGCGGTTTCT (SEQ ID NO. 
3580) RC5_HP0130R128 CAUCAUCAUTTGGTGTGAAACGGATTTTAT (SEQ ID NO. 3581)
RC6_HP0131R129 CAUCAUCAUATGCCCTATCCTTTTATGAGT (SEQ ID NO. 3582 ) RC7_HP0132R130 CAUCAUCAUATGGCTAGTTTTTCTATTTTA (SEQ ID NO. 3583 ) RC8_HP0133R131 CAUCAUCAUATGGCACAAGAAAAAGCAGTT (SEQ ID NO. 3584 ) RC9_HP0134R132 CAUCAUCAUAAAGGATTATAGAAATGTCAA (SEQ ID NO. 3585 ) RC10_HP0135R133 CAUCAUCAUTGAAATGAGAATTTCTCTTTT (SEQ ID NO. 3586 ) RC11_HP0X36R134 CAUCAUCAUAGAATGGAAAAATTAGAAGTA (SEQ ID NO. 3587 ) RC12_HP0137R135 CAUCAUCAUTAGAAGGGGTGATTTATGAGT (SEQ ID NO. 3588 ) RD1_HP0138R136 CAUCAUCAUATGATCATGGAAAAATACCAT (SEQ ID NO. 3589 ) RD2_HP0139R137 CAUCAUCAUGCCTTCTTTGAAAGTCAATTT (SEQ ID NO. 3590 ) RD3_HP0140R138 CAUCAUCAUTTATGGAATTTTATCAAGTCT (SEQ ID NO. 3591 ) RD4_HP0141R139 CAUCAUCAUGTGCTAGAATTTCATCAAATT (SEQ ID NO. 3592 ) RD5_HP0142R140 CAUCAUCAUTGTAGCTGGAAACTTTACACA (SEQ ID NO. 3593 ) RD6_HP0X44R141 CAUCAUCAUTATTGCATGCAAGAAAATGTG (SEQ ID NO. 3594 ) RD7_HP0145R142 CAUCAUCAUGGAGGTTGGAAATGTTTAGTT (SEQ ID NO. 3595 ) RD8JHP0146R143 CAUCAUCAUGGTGAATGATGGATTTAGAAA (SEQ ID NO. 3596 ) RD9__HP0147R144 CAUCAUCAUATGGATTTTTTAAACGACCAT (SEQ ID NO. 3597 ) RD10_HP0148R145 CAUCAUCAUAAAGAGATGAAATTTTTAAAC (SEQ ID NO. 3598 ) RD11_HP0149R146 CAUCAUCAUTGTTTAGCACATGAGAATCTT (SEQ ID NO. 3599 ) RD12_HP0150R147 CAUCAUCAUAGAGGGAAAATTAATGAAAGA (SEQ ID NO. 3600 ) RE1_HP0151R148 CAUCAUCAUATGATAAAAAGCCTAAATTCC (SEQ ID NO. 3601 ) RE2_HP0152R149 CAUCAUCAUTTGATTAGTGTCGCTCATAGC (SEQ ID NO. 3602 ) RE3_HP0153R150 CAUCAUCAUGTAATTTAATGGCAATAGATG (SEQ ID NO. 3603 ) RE4_HP0154R151 CAUCAUCAUTTTGATGCTAACCATTAAAGA (SEQ ID NO. 3604 ) RE5_HP0155R152 CAUCAUCAUGTTGTTTAAGCATGGCTAGTG (SEQ ID NO. 3605 ) RE6_HP0156R153 CAUCAUCAUATGGTAGTGTTAAAAAAGATG (SEQ ID NO. 3606 ) RE7_HP0157R154 CAUCAUCAUCATGCAGCATTTAGTCTTAAT (SEQ ID NO. 3607 ) RE8_HP0158R155 CAUCAUCAUGCCTTTTGATGTTAAGTAGAG (SEQ ID NO. 3608 ) RE9_HP0159R156 CAUCAUCAUAATGAGTATTATTATTCCTAT (SEQ ID NO. 3609 ) RE10_HP0160R157 CAUCAUCAUGAATGAGATGATAAAGAGTTG (SEQ ID NO. 3610 ) RE11_HP01S1R158 CAUCAUCAUCCTATGATTAGGGACGCAGAG (SEQ ID NO. 
3611 ) RE12_HP0162R159 CAUCAUCAUTACAAAGGAATTGAATGGGAC (SEQ ID NO. 3612 ) RF1_HP0163R160 CAUCAUCAUCATGTTCAAACGATTGAGAAG (SEQ ID NO. 3613 )
RF2_HP0164R161 CAUCAUCAUGGATAAAAGCGTGAGTTGTAA (SEQ ID NO. 3614) RF3_HP0165R162 CAUCAUCAUTTTTGCGTTTCTCTATCTTTT (SEQ ID NO. 3615) RF4_HP0166R163 CAUCAUCAUATGATAGAAGTTTTAATGATA (SEQ ID NO. 3616) RF5_HP0167R164 CAUCAUCAUAAATGGCTGGTAGAATGGAAT (SEQ ID NO. 3617) RF6_HP0168R165 CAUCAUCAUACGCATGCCCTTAGAAACGAT (SEQ ID NO. 3618) RF7_HP0169R166 CAUCAUCAUTTTGAACCAAGTTGAATTACT (SEQ ID NO. 3619) RF8_HP0170R167 CAUCAUCAUAACGACAATGACACAAGAAGA (SEQ ID NO. 3620) RF9_HP0171R168 CAUCAUCAUGGTTCTGTTTGTGGATAACTA (SEQ ID NO. 3621) RF10_HP0172R169 CAUCAUCAUAACATGATTAGTTTTAAAGAA (SEQ ID NO. 3622) RF11_HP0173R170 CAUCAUCAUGAATGCTAGATTTTATTCAAG (SEQ ID NO. 3623) RF12__HP0174R171 CAUCAUCAUGACTTTGCGCTTAAATTACCC (SEQ ID NO. 3624) RG1_HP0175R172 CAUCAUCAUAACACACAATGAAAAAAAATA (SEQ ID NO. 3625) RG2_HP0176R173 CAUCAUCAUATGTTAGTTAAAGGCAATGAA (SEQ ID NO. 3626) RG3_HP0177R174 CAUCAUCAUACATGGCAATTGGGATGAGCG (SEQ ID NO. 3627) RG4_HP0178R175 CAUCAUCAUAGAAATGTTACAACCCCCTAA (SEQ ID NO. 3628) RG5_HP0179R176 CAUCAUCAUTCATGATTAAAGCGATTAATA (SEQ ID NO. 3629) RG6_HP0180R177 CAUCAUCAUCATGCGTCTTCTTCTGTTCAA (SEQ ID NO. 3630) RG7_HP0181R178 CAUCAUCAUGTGGTGGTGGTAGCCTTTGGG (SEQ ID NO. 3631) RG8_HP0182RX79 CAUCAUCAUCATGTTTTCTAACCAATACAT (SEQ ID NO. 3632) RG9_HP0183R180 CAUCAUCAUGTGGAGAGTGAAGAATAATGG (SEQ ID NO. 3633) RG10_HP0184R181 CAUCAUCAUAGATGACAGAAATGGAATTAA (SEQ ID NO. 3634) RGX1_HP0X85R182 CAUCAUCAUTGAAAGGATTTTTAATGTCAG (SEQ ID NO. 3635) RG12_HP0186R183 CAUCAUCAUAAATTGAGTGAAGAAGAAGTG (SEQ ID NO. 3636) RH1_HP0187R184 CAUCAUCAUATGGTTGTCGCTAAGAATGAA (SEQ ID NO. 3637) RH2_HP0188R185 CAUCAUCAUGAATGAAAACCACTATAAAAG (SEQ ID NO. 3638) RH3_HP0189R186 CAUCAUCAUTTGAGATGTTGGAAAAATTGA (SEQ ID NO. 3639) RH4_HP0190R187 CAUCAUCAUAGTGGGTCGTTTTTGAAAATC (SEQ ID NO. 3640) RH5_HP0191R188 CAUCAUCAUAACACCATGAGTGATAATGAA (SEQ ID NO. 3641) RH6_HP0192RX89 CAUCAUCAUGGTAGAAAATGAAAATAACAT (SEQ ID NO. 3642) RH7_HP0193R190 CAUCAUCAUATGCAACAAGAAGAGATTATA (SEQ ID NO. 3643) RH8_HP0194R191 CAUCAUCAUAAGGATCGTTCAATGACAAAA (SEQ ID NO. 
3644) RH9_HP0195R192 CAUCAUCAUTCATGGGATTTTTAAAAGGTA (SEQ ID NO. 3645)
RH10_HP0196R193 CAUCAUCAUTGATGAAATTAAGCGAATTGT (SEQ ID NO. 3646 ) RH11_HP0197R194 CAUCAUCAUAAGGACAATCAATGAAAGATA (SEQ ID NO. 3647 ) RH12_HP0198R195 CAUCAUCAUGGAGTTTAAATTGAAACAAAG (SEQ ID NO. 3648 ) SA1_HP0199S196 CUACUACUATTTCAATCTGCATAATGGTAG (SEQ ID NO. 3649 ) SA2_HP0200S197 CUACUACUAACTAGTATTCTTTAGTAAATT (SEQ ID NO. 3650 ) SA3_HP0201S198 CUACUACUATAGTCATCTTTAAGCGTCTTG (SEQ ID NO. 3651 ) SA4_HP0202S199 CUACUACUACCTAACTTCCTCCAAAATACA (SEQ ID NO. 3652 ) SA5_HP0203S200 CUACUACUAGGCTTGCTCTAAAGCCATTTG (SEQ ID NO. 3653 ) SA6_HP0204S201 CUACUACUATTTATTTGGAAAATTTATGGT (SEQ ID NO. 3654 ) SA7_HP0205S202 CUACUACUAAATCCTTTCACTTTATAACAT (SEQ ID NO. 3655 ) SA8_HP0206S203 CUACUACUATCCTTTAAATTATCTCTTTAG (SEQ ID NO. 3656 ) SA9_HP0207S204 CUACUACUAACCCTTTTTAAACTAATGCGA (SEQ ID NO. 3657 ) SA10_HP0209S205 CUACUACUAGTTTAATTTTTTAAAAACGAT (SEQ ID NO. 3658 ) SA11_HP0210S206 CUACUACUACCCCTACAACGCTTTCAATAG (SEQ ID NO. 3659 ) SA12_HP0211S207 CUACUACUATTAAAGTTCTATTTTTAATTC (SEQ ID NO. 3660 ) SB1_HP0212S208 CUACUACUATTGTTTATTTTATGCCTCACT (SEQ ID NO. 3661 ) SB2_HP0213S209 CUACUACUATTAAGAGTTTTTCCGCAAATG (SEQ ID NO. 3662 ) SB3_HP0214S210 CUACUACUATAATCAATTAAATATTAACGA (SEQ ID NO. 3663 ) SB4_HP0215S21X CUACUACUAAACCATTCAATCCCCTAAAAA (SEQ ID NO. 3664 ) SB5_HP02X6S212 CUACUACUACTACACTCCCGCTACATTTTT (SEQ ID NO. 3665 ) SB6_HP0217S2X3 CUACUACUACTATCTTTCCGCTAAATTGAA (SEQ ID NO. 3666 ) SB7_HP0218S214 CUACUACUAAACCTATTTCCTCACAAACTG (SEQ ID NO. 3667 ) SB8_HP0219S215 CUACUACUATTAAGGTTTTTTAAGGGGTCT (SEQ ID NO. 3668 ) SB9_HP0220S216 CUACUACUATTAATAAGAGCTTGAAATATT (SEQ ID NO. 3669 ) SB10_HP0221S217 CUACUACUAAAAAAGTTCAAATCGGTAACA (SEQ ID NO. 3670 ) SB11_HP0222S2X8 CUACUACUATTACTCTATTTTTCTTAAAGC (SEQ ID NO. 3671 ) SB12_HP0223S219 CUACUACUATCCAATCACATCCATTCAACA (SEQ ID NO. 3672 ) SCX_HP0224S220 CUACUACUAACCCCTTAATGCGACTTTTTA (SEQ ID NO. 3673 ) SC2_HP0225S221 CUACUACUAAATCATTTAAGGCGGTTTTTA (SEQ ID NO. 3674 ) SC3_HP0226Ξ222 CUACUACUATTACCCCATAATGAGCTTGTG (SEQ ID NO. 
3675 ) SC4_HP0227S223 CUACUACUATTAGTAAGCAAACACATAATT (SEQ ID NO. 3676 ) SC5_HP0228S224 CUACUACUATCATGCAATCCCTTTAGATTT (SEQ ID NO. 3677 )
SC6_HP0229S225 CUACUACUATGAGAACCGATTCAATATCAA (SEQ ID NO. 3678 ) SC7_HP0230S226 CUACUACUATTCACCGCTCAAGGAGATCGG (SEQ ID NO. 3679 ) SC8_HP0231S227 CUACUACUATATATCATGCCTTATAATGGT (SEQ ID NO. 3680 ) SC9_HP0232S228 CUACUACUATGAATCAATTAAATTATGAGT (SEQ ID NO. 3681 ) SC10_HP0233S229 CUACUACUAGCTCATTATTGCAACCTATGG (SEQ ID NO. 3682 ) SC11_HP0234S230 CUACUACUAACCCTACAAAGACCAATAGAT (SEQ ID NO. 3683 ) SC12_HP0235S231 CUACUACUAGGGGGTTGATTAAATTTTCAA (SEQ ID NO. 3684 ) SD1_HP0236S232 CUACUACUAGATTAAGGCTTAGGAGAATCC (SEQ ID NO. 3685 ) SD2_HP0237S233 CUACUACUAAACAAACGCATTAAAACAACT (SEQ ID NO. 3686 ) SD3_HP0238S234 CUACUACUACTTATTCGCTTTCTAACATTT (SEQ ID NO. 3687 ) SD4_HP0239S235 CUACUACUAATTATTCCTCGTAATATTCGC (SEQ ID NO. 3688 ) SD5_HP0240S236 CUACUACUACCATTAAAAAGTCCTATAAAT (SEQ ID NO. 3689 ) SD6_HP0241S237 CUACUACUATTCCTTAAGGCTCTTTAGGAG (SEQ ID NO. 3690 ) SD7_HP0242S238 CUACUACUAGCGGATAAGCTCATTCATTCC (SEQ ID NO. 3691 ) SD8_HP0243S239 CUACUACUATGGTCGCTTAAGCCAAATGGG (SEQ ID NO. 3692 ) SD9_HP0244S240 CUACUACUACCAAATTAAGAAGCGTTAAGA (SEQ ID NO. 3693 ) SD10_HP0245S241 CUACUACUAATTTTTTCATGGTTTTTGTTT (SEQ ID NO. 3694 ) SD11_HP0246S242 CUACUACUATGTTTATCATAGTATCTCCAT (SEQ ID NO. 3695 ) SD12_HP0247S243 CUACUACUATTAACGGCGTTTGGGTTTTTT (SEQ ID NO. 3696 ) SE1_HP0248S244 CUACUACUACGTTTTTAAGGCTCTTTAGTC (SEQ ID NO. 3697 ) SE2_HP0249S245 CUACUACUATTATAAACAAAACAAAGAAAC (SEQ ID NO. 3698 ) SE3_HP0250S246 CUACUACUATTAAAGCCTGGATTCTAACAA (SEQ ID NO. 3699 ) SE4_HP0251S247 CUACUACUACCCCCTTATTTGAGCATGTTA (SEQ ID NO. 3700 ) SE5_HP0252S248 CUACUACUATCCTTAAAAACCGACTGAATA (SEQ ID NO. 3701 ) SE6_HP0253S249 CUACUACUACGCCACGCTTAAAGGAAGGCT (SEQ ID NO. 3702 ) SE7_HP0254S250 CUACUACUATTAAAAACCTATCGTGTAATT (SEQ ID NO. 3703 ) SE8_HP0255S251 CUACUACUATACAGAAGCGAATTTTTTCAT (SEQ ID NO. 3704 ) SE9_HP0256S252 CUACUACUACAGACCCAATAACAAGATTTT (SEQ ID NO. 3705 ) SE10_HP0257S253 CUACUACUACCGCTACAATGAACATCATAC (SEQ ID NO. 3706 ) SE11_HP0258S254 CUACUACUATTTTATAGCAAACGAGTGAGA (SEQ ID NO.
3707 ) SE12_HP0259S255 CUACUACUATCAAACCCCTACACCCTATCC (SEQ ID NO. 3708 ) SF1_HP0260S256 CUACUACUATTTATCTTAAAATTTCGTATT (SEQ ID NO. 3709 )
SF2_HP0261S257 CUACUACUAACTTTATGCAAGATAATTTTC (SEQ ID NO. 3710 ) SF3_HP0262S258 CUACUACUACTAGCCTTTAGGCTTGTCTTT (SEQ ID NO. 3711 ) SF4_HP0263S259 CUACUACUAGCTCACACTAACGACAAACTC (SEQ ID NO. 3712 ) SF5_HP0264S260 CUACUACUAACAAAACCTCACTTAATCTTA (SEQ ID NO. 3713 ) SF6_HP0265S261 CUACUACUACCTCCTATTTTTGCAAGAAAT (SEQ ID NO. 3714 ) SF7_HP0266S262 CUACUACUACTATGATTTCTTGCATGCATT (SEQ ID NO. 3715 ) SF8_HP0267S263 CUACUACUAAATTAGATCACCCTTTTCCCC (SEQ ID NO. 3716 ) SF9_HP0268S264 CUACUACUATTAAATAATGTGCAATTCATA (SEQ ID NO. 3717 ) SF10_HP0269S265 CUACUACUATATTTTTATTAGTTGCCTTTA (SEQ ID NO. 3718 ) SF11_HP0270S266 CUACUACUACCCCTTAAGCACCCAAACCCC (SEQ ID NO. 3719 ) SF12_HP0271S267 CUACUACUACTAAAACGCATGCTTTTCATT (SEQ ID NO. 3720 ) SG1_HP0272S268 CUACUACUATCATTTAATAGACTCCGGATT (SEQ ID NO. 3721 ) SG2_HP0273S269 CUACUACUAATCCTATTGGAGATCAATACT (SEQ ID NO. 3722 ) SG3_HP0274S270 CUACUACUAAATTTTAGCGCACTTTAGTTT (SEQ ID NO. 3723 ) SG4_HP0275S271 CUACUACUACCTATTTCTTGCATTCTTGAA (SEQ ID NO. 3724 ) SG5_HP0276S272 CUACUACUATTAAAATTCAATCTAAAAAGA (SEQ ID NO. 3725 ) SG6_HP0277S273 CUACUACUATAGCCTTTAATCTTGCTCTTT (SEQ ID NO. 3726 ) SG7_HP0278S274 CUACUACUATCAAGCAAACTCTATCGTCAA (SEQ ID NO. 3727 ) SG8_HP0279S275 CUACUACUATCTTTGTAAGTCATTCTTTTT (SEQ ID NO. 3728 ) SG9_HP0280S276 CUACUACUACTATCTTTGATAAATCTCAGG (SEQ ID NO. 3729 ) SG10_HP0281S277 CUACUACUATTCAATGAGAGCGGGAGTTGT (SEQ ID NO. 3730 ) SG11_HP0282S278 CUACUACUACATTCTTTCACTCCGCTACAG (SEQ ID NO. 3731 ) SG12_HP0283S279 CUACUACUACCTTAATGCCATTTTTCTAAC (SEQ ID NO. 3732 ) SH1_HP0284S280 CUACUACUATTTCATTTTTGTTCCCTTTTA (SEQ ID NO. 3733 ) SH2_HP0285S281 CUACUACUATTTTTAGAAAACGGCATGATT (SEQ ID NO. 3734 ) SH3_HP0286S282 CUACUACUATCACCCGTTCAATAAATCATA (SEQ ID NO. 3735 ) SH4_HP0287S283 CUACUACUACTTCCACCTTTTTTAAGATTT (SEQ ID NO. 3736 ) SH5_HP0288S284 CUACUACUAAAGGGACTAAAGCACATTTTT (SEQ ID NO. 3737 ) SH6_HP0289S285 CUACUACUAGAACGAAGCATTATAGCAAAA (SEQ ID NO. 3738 ) SH7_HP0290S286 CUACUACUATCAAACCCCTTTTAAGCCTTC (SEQ ID NO.
3739 ) SH8_HP0291S287 CUACUACUACTTTTATTTTTTTAACTCTTT (SEQ ID NO. 3740 ) SH9_HP0292S288 CUACUACUAAAAATCCCCAAAAATCATTTA (SEQ ID NO. 3741 )
SH10_HP0293S289 CUACUACUAGCTCTGTTTTTATAGTTATTT (SEQ ID NO. 3742 ) SH11_HP0294S290 CUACUACUACCTCCTTTTGCCCTTTATTTC (SEQ ID NO. 3743 ) SH12_HP0295S291 CUACUACUAACCACCAAAAGATTACAAGTA (SEQ ID NO. 3744 ) RA1_HP0199R196 CAUCAUCAUATGCAATTTGAAATGCGTAAA (SEQ ID NO. 3745) RA2_HP0200R197 CAUCAUCAUTTTATGGCAGTACCTGATAGA (SEQ ID NO. 3746 ) RA3_HP0201R198 CAUCAUCAUCGCATGATGAAAATTGTAATA (SEQ ID NO. 3747 ) RA4_HP0202R199 CAUCAUCAUATGGAATTTTACGCCTCTCTT (SEQ ID NO. 3748 ) RA5_HP0203R200 CAUCAUCAUATGAAAAAGGTTGTTTTTTTA (SEQ ID NO. 3749) RA6_HP0204R201 CAUCAUCAUTTTTGATGTTTAAAAAAATGT (SEQ ID NO. 3750 ) RA7_HP0205R202 CAUCAUCAUTTGAGCGAGCTTGTAACTGAA (SEQ ID NO. 3751 ) RA8_HP0206R203 CAUCAUCAUATGGAATTTTATAAGCGTGTT (SEQ ID NO. 3752 ) RA9_HP0207R204 CAUCAUCAUAAATGGTGCAATTTCAAAACA (SEQ ID NO. 3753 ) RA10_HP0209R205 CAUCAUCAUAGGGTGCTTAATCAAAAAGTT (SEQ ID NO. 3754 ) RA11_HP0210R206 CAUCAUCAUGGAAAAGATCAATGTCTAATC (SEQ ID NO. 3755 ) RA12_HP0211R207 CAUCAUCAUATGCTAGGAAACGTTAAAAAA (SEQ ID NO. 3756 ) RB1_HP0212R208 CAUCAUCAUGCATGGACGCTTTAGAAATCA (SEQ ID NO. 3757 ) RB2_HP0213R209 CAUCAUCAUGAGTGGTAAAAGAAAGTGATA (SEQ ID NO. 3758 ) RB3_HP0214R210 CAUCAUCAUAATAAAATGGAAAATCATTCG (SEQ ID NO. 3759 ) RB4_HP0215R211 CAUCAUCAUGGAAAAAAGTGAAAGAAGAGT (SEQ ID NO. 3760 ) RB5_HP0216R212 CAUCAUCAUTAGGGGATTGAATGGTTGTTT (SEQ ID NO. 3761 ) RB6_HP0217R213 CAUCAUCAUTATGGGGTTAAAAAATAAAAT (SEQ ID NO. 3762 ) RB7_HP0218R214 CAUCAUCAUAAAAGGGCTTAAAATGAAAAC (SEQ ID NO. 3763 ) RB8_HP0219R215 CAUCAUCAUTTTATTTTTGGTGGTTAGTAT (SEQ ID NO. 3764) RB9_HP0220R216 CAUCAUCAUACCTTGTTACAACGAATTTAT (SEQ ID NO. 3765 ) RB10_HP0221R217 CAUCAUCAUAAATGGCAAAACATGATTTAG (SEQ ID NO. 3766 ) RB11_HP0222R218 CAUCAUCAUTGATGGAAAAGACAGAAAACA (SEQ ID NO. 3767 ) RB12_HP0223R219 CAUCAUCAUGTGATTTTAAAAAAGAGTGGT (SEQ ID NO. 3768 ) RC1_HP0224R220 CAUCAUCAUCAATGAAGGTATTATCTTATT (SEQ ID NO. 3769 ) RC2_HP0225R221 CAUCAUCAUATGAGGGGTTTAAGCGTTTGG (SEQ ID NO. 3770) RC3_HP0226R222 CAUCAUCAUTTTTTTATGGAAGAATCAACA (SEQ ID NO.
3771 ) RC4_HP0227R223 CAUCAUCAUAACATGAAAAAATCCCTCTTA (SEQ ID NO. 3772 ) RC5_HP0228R224 CAUCAUCAUTAATGAGAAAGAAAGGCATGT (SEQ ID NO. 3773 )
RC6_HP0229R225 CAUCAUCAUTTTATGAAAAAAACGATTTTA (SEQ ID NO. 3774) RC7_HP0230R226 CAUCAUCAUATGATTATCATTCCTGCTAGG (SEQ ID NO. 3775 ) RC8_HP0231R227 CAUCAUCAUAATGATATTAAGAGCGAGTGT (SEQ ID NO. 3776 ) RC9_HP0232R228 CAUCAUCAUATGAAAAAACCCTATAGGAAG (SEQ ID NO. 3777 ) RC10_HP0233R229 CAUCAUCAUTAAAAAATGCAAGTGATTCCT (SEQ ID NO. 3778 ) RC11_HP0234R230 CAUCAUCAUCAATAATGAGCGCAATGATGC (SEQ ID NO. 3779 ) RC12_HP0235R231 CAUCAUCAUATGGGTGTCAAATTTTTAAAA (SEQ ID NO. 3780 ) RD1_HP0236R232 CAUCAUCAUTTGTTTTAATGCGTTTGTTTA (SEQ ID NO. 3781 ) RD2_HP0237R233 CAUCAUCAUAGTGGGAAATTTAGTGATTGG (SEQ ID NO. 3782 ) RD3_HP0238R234 CAUCAUCAUCATGCTATTTTCAAAACTCTT (SEQ ID NO. 3783 ) RD4_HP0239R235 CAUCAUCAUAGGACTTTTTAATGGAGTTAG (SEQ ID NO. 3784 ) RD5_HP0240R236 CAUCAUCAUATGCAAGAAAAACAACTTAAA (SEQ ID NO. 3785) RD6_HP0241R237 CAUCAUCAUTCAGCGGGAATGAATGAGCTT (SEQ ID NO. 3786 ) RD7_HP0242R238 CAUCAUCAUAGCCAGCATGAGAGATTACAG (SEQ ID NO. 3787 ) RD8_HP0243R239 CAUCAUCAUGATGAAAACATTTGAAATTCT (SEQ ID NO. 3788 ) RD9_HP0244R240 CAUCAUCAUATGAAAAAATCCAAGCACTTA (SEQ ID NO. 3789 ) RD10_HP0245R241 CAUCAUCAUGAGATACTATGATAAACAACA (SEQ ID NO. 3790 ) RD11_HP0246R242 CAUCAUCAUTGATTGAAACGGGTGTTTTTA (SEQ ID NO. 3791 ) RD12_HP0247R243 CAUCAUCAUCCCATGGAATTGAATCAACCA (SEQ ID NO. 3792 ) RE1_HP0248R244 CAUCAUCAUATGCCCATTGATTTGAACGAA (SEQ ID NO. 3793 ) RE2_HP0249R245 CAUCAUCAUAACGCATGGCATCTCTTGCCT (SEQ ID NO. 3794) RE3_HP0250R246 CAUCAUCAUGTTATATGCTAGAAATCAAAA (SEQ ID NO. 3795 ) RE4_HP0251R247 CAUCAUCAUGAGCGTGAGCAGTTTGTTTAA (SEQ ID NO. 3796 ) RE5_HP0252R248 CAUCAUCAUTTGAAAAACCACTCCTTTAAA (SEQ ID NO. 3797 ) RE6_HP0253R249 CAUCAUCAUATGAAAAATACCAATACAAAA (SEQ ID NO. 3798 ) RE7_HP0254R250 CAUCAUCAUGTGGCGTTAGCTGAAGACGAT (SEQ ID NO. 3799 ) RE8_HP0255R251 CAUCAUCAUTTTTATGGCAGATGTCGTTGT (SEQ ID NO. 3800 ) RE9_HP0256R252 CAUCAUCAUCTATGAAAAAATTCGCTTCTG (SEQ ID NO. 3801 ) RE10_HP0257R253 CAUCAUCAUATGCGTAAAATCTTGTTATTG (SEQ ID NO. 3802 ) RE11_HP0258R254 CAUCAUCAUGGGGTATGATGTTCATTGTAG (SEQ ID NO.
3803 ) RE12_HP0259R255 CAUCAUCAUGGAGCTGGTGGATGTATTGAG (SEQ ID NO. 3804 ) RF1_HP0260R256 CAUCAUCAUGGGAAAATTATCTTGCATAAA (SEQ ID NO. 3805 )
RF2_HP0261R257 CAUCAUCAUGACATGCAATTTTTAAATCAA (SEQ ID NO. 3806) RF3_HP0262R258 CAUCAUCAUATGGATTTCTATGGATTAAAG (SEQ ID NO. 3807) RF4_HP0263R259 CAUCAUCAUATGAAACCTTATTTCAGTTTA (SEQ ID NO. 3808) RF5_HP0264R260 CAUCAUCAUAAAGGATTGATAATGAATTTA (SEQ ID NO. 3809) RF6_HP0265R261 CAUCAUCAUTTGAAAAAGGATTGAATGATG (SEQ ID NO. 3810) RF7_HP0266R262 CAUCAUCAUAGGGTTTGATGCTGTTAAAAA (SEQ ID NO. 3811) RF8_HP0267R263 CAUCAUCAUATGCAAGAAATCATAGGAGCG (SEQ ID NO. 3812) RF9_HP0268R264 CAUCAUCAUAGAGTCAATGAAACTAGTTTT (SEQ ID NO. 3813) RF10_HP0269R265 CAUCAUCAUCTTGTATTGAAAGTTTATATT (SEQ ID NO. 3814) RF11_HP0270R266 CAUCAUCAUGATGATTGGGTTTGCGTATTT (SEQ ID NO. 3815) RF12_HP0271R267 CAUCAUCAUGGTGCTTAAGGGGTTAAAAAA (SEQ ID NO. 3816) RG1_HP0272R268 CAUCAUCAUAGCATGCGTTTTAGTTACATT (SEQ ID NO. 3817) RG2_HP0273R269 CAUCAUCAUAATGAAGCCATTGCATTTTTC (SEQ ID NO. 3818) RG3_HP0274R270 CAUCAUCAUATGCAAGATTTTGATTTTAGT (SEQ ID NO. 3819) RG4_HP0275R271 CAUCAUCAUATGAATATTCAAATAAAGAAA (SEQ ID NO. 3820) RG5_HP0276R272 CAUCAUCAUGAATGCAAGAAATAGGGATTT (SEQ ID NO. 3821) RG6_HP0277R273 CAUCAUCAUAGGAGTTAGTCATGTCATTAT (SEQ ID NO. 3822) RG7_HP0278R274 CAUCAUCAUGCAATGGCTAAAATCACAACC (SEQ ID NO. 3823) RG8_HP0279R275 CAUCAUCAUGATAGAGTTTGCTTGAAAATA (SEQ ID NO. 3824) RG9_HP0280R276 CAUCAUCAUATGACTTACAAAGAACGACTC (SEQ ID NO. 3825) RG10_HP0281R277 CAUCAUCAUCTTAAACGATGGATTTTCAAC (SEQ ID NO. 3826) RG11_HP0282R278 CAUCAUCAUTTTTAGGAAGTTTGGAATTGA (SEQ ID NO. 3827) RG12_HP0283R279 CAUCAUCAUTGAAAGAATGCAAGAAATTTT (SEQ ID NO. 3828) RH1_HP0284R280 CAUCAUCAUATGGCATTAAGGGTATTGTTA (SEQ ID NO. 3829) RH2_HP0285R281 CAUCAUCAUAGGGAACAAAAATGAAAAAAG (SEQ ID NO. 3830) RH3_HP0286R282 CAUCAUCAUATCATGCCGTTTTCTAAAAAT (SEQ ID NO. 3831) RH4_HP0287R283 CAUCAUCAUATGCGTTTTTTTAGTGGTTTT (SEQ ID NO. 3832) RH5_HP0288R284 CAUCAUCAUGGTGGAAGATGAAAAAATTTG (SEQ ID NO. 3833) RH6_HP0289R285 CAUCAUCAUATGAAAAAGTTTAAAAAGAAA (SEQ ID NO. 3834) RH7_HP0290R286 CAUCAUCAUGGTTTTTCTATGTTTAATTAT (SEQ ID NO. 3835) RH8_HP0291R287 CAUCAUCAUATGCAAAAGAATTTGGATAGT (SEQ ID NO.
3836) RH9_HP0292R288 CAUCAUCAUAGTTGTTATGTTTGAAAAAAT (SEQ ID NO. 3837)
RH10_HP0293R289 CAUCAUCAUATGATTTTTGGGGATTTTAAA (SEQ ID NO. 3838) RH11_HP0294R290 CAUCAUCAUATATGAGACATGGAGATATTA (SEQ ID NO. 3839) RH12_HP0295R291 CAUCAUCAUTCATGCGCGTTACCTTTGGCT (SEQ ID NO. 3840) SA1_HP0296S292 CUACUACUATTATGCTACAATTTTAGTGAT (SEQ ID NO. 3841) SA2_HP0297S293 CUACUACUACTACTCCCCAAAATTTTGACT (SEQ ID NO. 3842) SA3_HP0298S294 CUACUACUATTTATTTTTCTAAATACACCT (SEQ ID NO. 3843) SA4_HP0299S295 CUACUACUAATTATGACAACCTTATTCTAG (SEQ ID NO. 3844) SA5_HP0300S296 CUACUACUACATGCAAGCTCCTTTTAAGAG (SEQ ID NO. 3845) SA6_HP0301S297 CUACUACUATCATCGCAACTCCTTTTGAAA (SEQ ID NO. 3846) SA7_HP0302S298 CUACUACUATTTATTTGGCAAATCTTTGCA (SEQ ID NO. 3847) SA8_HP0303S299 CUACUACUATTAGGGTAACGCTTCCAACAA (SEQ ID NO. 3848) SA9_HP0304S300 CUACUACUATTCATGGCGATCGGAGTTTCA (SEQ ID NO. 3849) SA10_HP0305S301 CUACUACUATTATCCCTTGATCATGCTTTC (SEQ ID NO. 3850) SA11_HP0306S302 CUACUACUAAAATTCACACACCTTTTATGA (SEQ ID NO. 3851) SA12_HP0307S303 CUACUACUATTATTTTGTTTTATTTTGTGT (SEQ ID NO. 3852) SB1_HP0308S304 CUACUACUATTGTGTCAAAACTTACCATTA (SEQ ID NO. 3853) SB2_HP0309S305 CUACUACUAGCTCCTTTTTAAGTGAATTTT (SEQ ID NO. 3854) SB3_HP0310S306 CUACUACUATCGCTATTTTTTTCTAGGGTT (SEQ ID NO. 3855) SB4_HP0311S307 CUACUACUAGGGATTTTGGGCATCAGTGAA (SEQ ID NO. 3856) SB5_HP0312S308 CUACUACUATTATTGGGTTTCAAAACCCTT (SEQ ID NO. 3857) SB6_HP0313S309 CUACUACUATTCTATTTATGGATAGCTGTT (SEQ ID NO. 3858) SB7_HP0314S310 CUACUACUATTAAAATTTTTGGCACTTATA (SEQ ID NO. 3859) SB8_HP0315S311 CUACUACUACTAGGATTTCACAATCTCAGT (SEQ ID NO. 3860) SB9_HP0316S312 CUACUACUATCAAACGCTAAAGCATACATT (SEQ ID NO. 3861) SB10_HP0317S313 CUACUACUAAGCTTAGTAAGCGAACACATA (SEQ ID NO. 3862) SB11_HP0318S314 CUACUACUATTATTTCTTGTGAGCGAAATT (SEQ ID NO. 3863) SB12_HP0319S315 CUACUACUAGCTTAATCTCTAGCGGAAATT (SEQ ID NO. 3864) SC1_HP0320S316 CUACUACUATTAACTTTCTTGTTTGCTTTT (SEQ ID NO. 3865) SC2_HP0321S317 CUACUACUAGTTTTATAGGGTTTCGTTTTT (SEQ ID NO. 3866) SC3_HP0322S318 CUACUACUATTATGCATTAGGTTTTTTATC (SEQ ID NO. 3867) SC4_HP0323S319 CUACUACUATTTAAAACCCAACGCAACTCC (SEQ ID NO.
3868) SC5_HP0324S320 CUACUACUATTTAGAAGTGGTAATTATACC (SEQ ID NO. 3869)
SC6_HP0325S321 CUACUACUATTTTAATAAGGCATTTGGGTT (SEQ ID NO. 3870) SC7_HP0326S322 CUACUACUAGAATAATTTTTTTTCAAATCT (SEQ ID NO. 3871) SC8_HP0327S323 CUACUACUACCCTAAAGTTTTAGAAGAGAT (SEQ ID NO. 3872) SC9_HP0328S324 CUACUACUAGAGTTTATAGACGTTCTTTTG (SEQ ID NO. 3873) SC10_HP0329S325 CUACUACUATCATTCAGGGTTAAATCGTTT (SEQ ID NO. 3874) SC11_HP0330S326 CUACUACUATGATTTATTTTTTATGATTGA (SEQ ID NO. 3875) SC12_HP0331S327 CUACUACUACTCATGAAAATATCCCTTTTA (SEQ ID NO. 3876) SD1_HP0332S328 CUACUACUACTATCTAGGTAATATAATCTC (SEQ ID NO. 3877) SD2_HP0333S329 CUACUACUAAATCACGCTAACACCACAATG (SEQ ID NO. 3878) SD3_HP0334S330 CUACUACUATCCCTAATGATTTTTCAAAAC (SEQ ID NO. 3879) SD4_HP0335S331 CUACUACUATTCACACGCTTTAACATAGTA (SEQ ID NO. 3880) SD5_HP0336S332 CUACUACUATCAAATCTAGTAGTTGTTTAA (SEQ ID NO. 3881) SD6_HP0337S333 CUACUACUACTATTTTCCTTGCTCATTGTC (SEQ ID NO. 3882) SD7_HP0338S334 CUACUACUATTTTTAATTCCCCTTATTATT (SEQ ID NO. 3883) SD8_HP0339S335 CUACUACUATCCAATTTCAAAATTTCCATA (SEQ ID NO. 3884) SD9_HP0340S336 CUACUACUATTATTCCCCAAGCAAAATACA (SEQ ID NO. 3885) SD10_HP0341S337 CUACUACUATTAAAGCCTCCCTAAATTACT (SEQ ID NO. 3886) SD11_HP0342S338 CUACUACUATTATTTTTTATCCCTCTTGTT (SEQ ID NO. 3887) SD12_HP0343S339 CUACUACUATCTAGACATCAGCATCAAATA (SEQ ID NO. 3888) SE1_HP0344S340 CUACUACUACATCATTTTCCATACTCACTC (SEQ ID NO. 3889) SE2_HP0345S341 CUACUACUAATCTTTTATAATATCGGTAGG (SEQ ID NO. 3890) SE3_HP0346S342 CUACUACUATTATTGATTGATCTTTTGGTT (SEQ ID NO. 3891) SE4_HP0347S343 CUACUACUATCAATAAAAAAAGGATAGATC (SEQ ID NO. 3892) SE5_HP0348S344 CUACUACUACACTACAACAAACTTTTAACA (SEQ ID NO. 3893) SE6_HP0349S345 CUACUACUATTGTTAGATTAGGATTTAGAA (SEQ ID NO. 3894) SE7_HP0350S346 CUACUACUATCGTTTAGATCTTTTGAAACA (SEQ ID NO. 3895) SE8_HP0351S347 CUACUACUAACCTTCTATCTTTTAACCTTT (SEQ ID NO. 3896) SE9_HP0352S348 CUACUACUACTATTCAATGACATCTTCCTC (SEQ ID NO. 3897) SE10_HP0353S349 CUACUACUACAAAATCACACCTTAAAATTT (SEQ ID NO. 3898) SE11_HP0354S350 CUACUACUATTTCATCTCTCTTGTCCTAAA (SEQ ID NO. 3899) SE12_HP0355S351 CUACUACUACGCCCTAATCGATCTTTAATA (SEQ ID NO.
3900) SF1_HP0356S352 CUACUACUAGTTATTTTATTTGCCCTATTA (SEQ ID NO. 3901)
SF2_HP0357S353 CUACUACUATTAAGGGTTTTTATGGGTGGG (SEQ ID NO. 3902 ) SF3_HP0358S354 CUACUACUATTAAAAAAGCCCCTTAAGCCC (SEQ ID NO. 3903 ) SF4_HP0359S355 CUACUACUATAGCTCAGCTGGGAGAGCGCC (SEQ ID NO. 3904 ) SF5_HP0360S356 CUACUACUATTATTGAAACCTCAAAAGGTG (SEQ ID NO. 3905 ) SF6_HP0361S357 CUACUACUACCTTTTGAGGTTTCAATAATA (SEQ ID NO. 3906 ) SF7_HP0362S358 CUACUACUATCAATAACGCTTTAAAATGAA (SEQ ID NO. 3907 ) SF8_HP0363S359 CUACUACUATTATTGCACCCCATCTACAAC (SEQ ID NO. 3908 ) SF9_HP0364S360 CUACUACUACCTTAAAAATCATCAAAACTC (SEQ ID NO. 3909 ) SF10_HP0365S361 CUACUACUATAGGGCTTCAAATCTTAATCA (SEQ ID NO. 3910 ) SF11_HP0366S362 CUACUACUAACTCATTCTATTTTAAAACTC (SEQ ID NO. 3911 ) SF12_HP0367S363 CUACUACUATCGCTATAAGCAAACTCTTTC (SEQ ID NO. 3912 ) SG1_HP0368S364 CUACUACUAAAAAATATTAATCTAATCCTA (SEQ ID NO. 3913 ) SG2_HP0369S365 CUACUACUATTGTTCTAAGTCCATCAATTA (SEQ ID NO. 3914 ) SG3_HP0370S366 CUACUACUAAACTTAAAAATTTTCTTCTAA (SEQ ID NO. 3915 ) SG4_HP0371S367 CUACUACUATTAAAGCTTTTCAACTTTGAT (SEQ ID NO. 3916 ) SG5_HP0372S368 CUACUACUATCACTTTAAAATTTTAGGCAA (SEQ ID NO. 3917 ) SG6_HP0373S369 CUACUACUACCTAAAAAATCCACCCGTAAT (SEQ ID NO. 3918 ) SG7_HP0374S370 CUACUACUACCCCCTAAACTTGTGCGATAC (SEQ ID NO. 3919 ) SG8_HP0375S371 CUACUACUATCAAAGGCCCATTAACACAAC (SEQ ID NO. 3920 ) SG9_HP0376S372 CUACUACUATACTATTCCTTGAGGTTTTTA (SEQ ID NO. 3921 ) SG10_HP0377S373 CUACUACUAAGGCTTTCCTAGTTAGACTTG (SEQ ID NO. 3922 ) SG11_HP0378S374 CUACUACUATGGTTATCCTTTAAGCTAATT (SEQ ID NO. 3923 ) SG12_HP0379S375 CUACUACUACCAATTTTACAAACCCAATTT (SEQ ID NO. 3924 ) SH1_HP0380S376 CUACUACUAGGTTTGTAAAATTGGGGGTAA (SEQ ID NO. 3925 ) SH2_HP0381S377 CUACUACUATTTATCTTAAAAAACTTTTTA (SEQ ID NO. 3926 ) SH3_HP0382S378 CUACUACUAAAAGGGTCATTCAATTTCATA (SEQ ID NO. 3927 ) SH4_HP0383S379 CUACUACUATTATTGGATTTTATAAAGCCA (SEQ ID NO. 3928 ) SH5_HP0384S380 CUACUACUACGATTACTTTTCCACCAACAC (SEQ ID NO. 3929 ) SH6_HP0385S381 CUACUACUAATGTTTTTAATGGTTTATGAC (SEQ ID NO. 3930 ) SH7_HP0386S382 CUACUACUAGGAGCGATTAAGTGATAGAAC (SEQ ID NO.
3931 ) SH8_HP0387S383 CUACUACUATTAAAAAATATCCACAGGATC (SEQ ID NO. 3932 ) SH9_HP0388S384 CUACUACUATTTCAAAAAACTTAAGTTTTT (SEQ ID NO. 3933 )
SH10_HP0389S385 CUACUACUAAAATAACGCTTAAGCTTTTTT (SEQ ID NO. 3934 ) SH11_HP0390S386 CUACUACUAGATTTCCTATTTCAACACTTT (SEQ ID NO. 3935 ) SH12_HP0391S387 CUACUACUAGTTTTTAGAAGTCTTTTTTTA (SEQ ID NO. 3936 ) RA1_HP0296R292 CAUCAUCAUAAATGTCATACGCAATATTCA (SEQ ID NO. 3937 ) RA2_HP0297R293 CAUCAUCAUCAATGGCACACAAGAAAGGTC (SEQ ID NO. 3938 ) RA3_HP0298R294 CAUCAUCAUGACTGGCTTATGAATAATGTT (SEQ ID NO. 3939 ) RA4_HP0299R295 CAUCAUCAUATGCTGAGTTTTATCATTAAG (SEQ ID NO. 3940 ) RA5_HP0300R296 CAUCAUCAUATGGAGTCTTTTAGAGAGTTT (SEQ ID NO. 3941 ) RA6_HP0301R297 CAUCAUCAUGCATGATTTTAGAAGTTAAAG (SEQ ID NO. 3942 ) RA7_HP0302R298 CAUCAUCAUGATGAAGCTCTTAGAAATTAA (SEQ ID NO. 3943 ) RA8_HP0303R299 CAUCAUCAUGATTTTTAAGCTGTGTTTGTA (SEQ ID NO. 3944 ) RA9_HP0304R300 CAUCAUCAUATGAAAAGATTTGTTTTGTTT (SEQ ID NO. 3945 ) RA10_HP0305R301 CAUCAUCAUAATGAAAAAAATGGTTTTGGT (SEQ ID NO. 3946 ) RA11_HP0306R302 CAUCAUCAUTCATGGAGTTGTTGCACAGCA (SEQ ID NO. 3947 ) RA12_HP0307R303 CAUCAUCAUAAAGGTGTGTGAATTTTTTGA (SEQ ID NO. 3948 ) RB1_HP0308R304 CAUCAUCAUCAAATCCCATGTGCCAAATCC (SEQ ID NO. 3949 ) RB2_HP0309R305 CAUCAUCAUGGGAGTGGATCTTGAAAACAA (SEQ ID NO. 3950 ) RB3_HP0310R306 CAUCAUCAUTTATGGCAAAAGAAATTTTAG (SEQ ID NO. 3951 ) RB4_HP0311R307 CAUCAUCAUTGGAAGATGTGCGTTTTATGC (SEQ ID NO. 3952 ) RB5_HP0312R308 CAUCAUCAUATGCCCAAAATCCCTATCACG (SEQ ID NO. 3953 ) RB6_HP0313R309 CAUCAUCAUAATGCGCGTGTTTGTTTGCTT (SEQ ID NO. 3954 ) RB7_HP0314R310 CAUCAUCAUATGCGATTGTTGCATTTTTTT (SEQ ID NO. 3955 ) RB8_HP0315R311 CAUCAUCAUAAAGGTTTTGAAAATGTATGC (SEQ ID NO. 3956 ) RB9_HP0316R312 CAUCAUCAUAGGAATAAGCATGCCTAACAC (SEQ ID NO. 3957 ) RB10_HP0317R313 CAUCAUCAUAAAAACATGAAAAAACACATC (SEQ ID NO. 3958 ) RB11_HP0318R314 CAUCAUCAUATGCTTAATCGTATCATAGAA (SEQ ID NO. 3959 ) RB12_HP0319R315 CAUCAUCAUAGCATGCACACTCTCATTAAG (SEQ ID NO. 3960 ) RC1_HP0320R316 CAUCAUCAUGTATGGGCGGATTCACAAGCA (SEQ ID NO. 3961 ) RC2_HP0321R317 CAUCAUCAUCCTAATGCATAATGATTTTAA (SEQ ID NO. 3962 ) RC3_HP0322R318 CAUCAUCAUATGAAAATGATTCTATTCAAC (SEQ ID NO.
3963 ) RC4_HP0323R319 CAUCAUCAUTGATGTTAAACAAGTTTAAAA (SEQ ID NO. 3964 ) RC5_HP0324R320 CAUCAUCAUATGAAACTTAACTTGCAGGAG (SEQ ID NO. 3965 )
RC6_HP0325R321 CAUCAUCAUAGGCTAGTGGATGAAAAAAGC (SEQ ID NO. 3966 ) RC7_HP0326R322 CAUCAUCAUATGAGAGCGATCGCTATTGTT (SEQ ID NO. 3967 ) RC8_HP0327R323 CAUCAUCAUTTTAGAGATTTGAAAAAAAAT (SEQ ID NO. 3968 ) RC9_HP0328R324 CAUCAUCAUCCTGAATGAAAAGCGATAAAC (SEQ ID NO. 3969 ) RC10_HP0329R325 CAUCAUCAUAGGCATGCAAAAAGATTACCA (SEQ ID NO. 3970 ) RC11_HP0330R326 CAUCAUCAUATTTTGGCATTACCCGTTTAT (SEQ ID NO. 3971 ) RC12_HP0331R327 CAUCAUCAUGGAATCATATGGCAATAGTAG (SEQ ID NO. 3972 ) RD1_HP0332R328 CAUCAUCAUTTCATGAGTTTGTTTGATTTT (SEQ ID NO. 3973 ) RD2_HP0333R329 CAUCAUCAUAGTGAATCAACGAATGAAAAG (SEQ ID NO. 3974) RD3_HP0334R330 CAUCAUCAUTGTGGTGTTAGCGTGATTTTG (SEQ ID NO. 3975 ) RD4_HP0335R331 CAUCAUCAUGGGGGGTTAATGGCAGAGCCA (SEQ ID NO. 3976 ) RD5_HP0336R332 CAUCAUCAUCGATGGTAGGGGGTGGAACGG (SEQ ID NO. 3977 ) RD6_HP0337R333 CAUCAUCAUAAAACGGATAACCATGAAAAT (SEQ ID NO. 3978 ) RD7_HP0338R334 CAUCAUCAUCGACAATGAGCAAGGAAAATA (SEQ ID NO. 3979 ) RD8_HP0339R335 CAUCAUCAUGTGGATTCAGAGGGGTTTTCG (SEQ ID NO. 3980 ) RD9_HP0340R336 CAUCAUCAUCAAGGAATATGGAAATTTTGA (SEQ ID NO. 3981 ) RD10_HP0341R337 CAUCAUCAUATGCTAAAGAATTTCAAAAAG (SEQ ID NO. 3982 ) RD11_HP0342R338 CAUCAUCAUGTGAGCGTGGAATTTGCTTCT (SEQ ID NO. 3983 ) RD12_HP0343R339 CAUCAUCAUATGTCTTATTTTTTTAAAATC (SEQ ID NO. 3984) RE1_HP0344R340 CAUCAUCAUGGGGTTTGTTTTTTGCACTAT (SEQ ID NO. 3985 ) RE2_HP0345R341 CAUCAUCAUGAGTGAGTATGGAAAATGATG (SEQ ID NO. 3986 ) RE3_HP0346R342 CAUCAUCAUTGTGGATAAGTTCTTGGATTG (SEQ ID NO. 3987) RE4_HP0347R343 CAUCAUCAUTGTTGTAGTGCCTTTTGTTGA (SEQ ID NO. 3988 ) RE5_HP0348R344 CAUCAUCAUAAATCCTAATCTAACAAATGA (SEQ ID NO. 3989) RE6_HP0349R345 CAUCAUCAUTATGGATAGAGCCAAATTTAT (SEQ ID NO. 3990 ) RE7_HP0350R346 CAUCAUCAUTAAATAATGGCTGAAAATTCT (SEQ ID NO. 3991 ) RE8_HP0351R347 CAUCAUCAUTAATATTGGATTTAAAGGTAT (SEQ ID NO. 3992 ) RE9_HP0352R348 CAUCAUCAUAATGGCAACCAAGCTTACCCC (SEQ ID NO. 3993 ) RE10_HP0353R349 CAUCAUCAUATGTCATTGAATAGCCGTAAG (SEQ ID NO. 3994 ) RE11_HP0354R350 CAUCAUCAUGTGATTTTGCAAAATAAAACT (SEQ ID NO.
3995) RE12_HP0355R351 CAUCAUCAUGACAAGAGAGATGAAAACAAA (SEQ ID NO. 3996 ) RF1_HP0356R352 CAUCAUCAUCTTCTATGGCAAAACATAAGA (SEQ ID NO. 3997 )
RF2_HP0357R353 CAUCAUCAUAATGGCGCACATTTTAGTTAG (SEQ ID NO. 3998) RF3_HP0358R354 CAUCAUCAUATGAAAAAACTTCTTTATACC (SEQ ID NO. 3999) RF4_HP0359R355 CAUCAUCAUATGGTGGAGAATAGCGGGATC (SEQ ID NO. 4000) RF5_HP0360R356 CAUCAUCAUAGGGGTTTTATGGCATTATTA (SEQ ID NO. 4001) RF6_HP0361R357 CAUCAUCAUCGTTATTGAATGCGTTGTTTT (SEQ ID NO. 4002) RF7_HP0362R358 CAUCAUCAUTAAAGCATGATTAAACACTAT (SEQ ID NO. 4003) RF8_HP0363R359 CAUCAUCAUGGGCTTGTTTGAATAGTATTA (SEQ ID NO. 4004) RF9_HP0364R360 CAUCAUCAUGGAACAAAATGGAAGTTTCAC (SEQ ID NO. 4005) RF10_HP0365R361 CAUCAUCAUGAAGTGGCGAAATTGGTAGAC (SEQ ID NO. 4006) RF11_HP0366R362 CAUCAUCAUTTGAAAGAGTTTGCTTATAGC (SEQ ID NO. 4007) RF12_HP0367R363 CAUCAUCAUATGTCTAAGATTTCAAATAAT (SEQ ID NO. 4008) RG1_HP0368R364 CAUCAUCAUGTGCTTAACGCTATAGATGGA (SEQ ID NO. 4009) RG2_HP0369R365 CAUCAUCAUGTGAAGTTATTAAAAAATACG (SEQ ID NO. 4010) RG3_HP0370R366 CAUCAUCAUTCCATGAATAAAGTGAATAAA (SEQ ID NO. 4011) RG4_HP0371R367 CAUCAUCAUAGGAATAGATCGTTATGAACC (SEQ ID NO. 4012) RG5_HP0372R368 CAUCAUCAUAGGCTTTGTGCGTATGGGATT (SEQ ID NO. 4013) RG6_HP0373R369 CAUCAUCAUCATGCTAAGATTCGTTAGTAA (SEQ ID NO. 4014) RG7_HP0374R370 CAUCAUCAUATGCGTTTTGTCTATCACCCT (SEQ ID NO. 4015) RG8_HP0375R371 CAUCAUCAUAAGAAGTTAATGTTAGAGATG (SEQ ID NO. 4016) RG9_HP0376R372 CAUCAUCAUCCAATGGATTTAATCAATGAA (SEQ ID NO. 4017) RG10_HP0377R373 CAUCAUCAUATGTTTTCACTTTCTTATGTT (SEQ ID NO. 4018) RG11_HP0378R374 CAUCAUCAUTAATGAAGAATCTCAAAAGCC (SEQ ID NO. 4019) RG12_HP0379R375 CAUCAUCAUGATAACCATGTTCCAACCCCT (SEQ ID NO. 4020) RH1_HP0380R376 CAUCAUCAUATGTATGTTGAAAAAATTCTC (SEQ ID NO. 4021) RH2_HP0381R377 CAUCAUCAUATGACCCTTTCGCAAGCCCTA (SEQ ID NO. 4022) RH3_HP0382R378 CAUCAUCAUATGCTTGACATATGGATAGAT (SEQ ID NO. 4023) RH4_HP0383R379 CAUCAUCAUTTTGTCATGCCCCATTCTTTA (SEQ ID NO. 4024) RH5_HP0384R380 CAUCAUCAUAATGCAAAAAAATATATTAAA (SEQ ID NO. 4025) RH6_HP0385R381 CAUCAUCAUTATGATGCAATCTTTAAGTTT (SEQ ID NO. 4026) RH7_HP0386R382 CAUCAUCAUGTTGTCATAAACCATTAAAAA (SEQ ID NO. 4027) RH8_HP0387R383 CAUCAUCAUAAAATCTCATCATGTTCTATC (SEQ ID NO.
4028) RH9_HP0388R384 CAUCAUCAUATGAAAGACACTCTGTTTAAT (SEQ ID NO. 4029)
RH10_HP0389R385 CAUCAUCAUCATGTTTACATTACGAGAGTT (SEQ ID NO. 4030 ) RH11_HP0390R386 CAUCAUCAUCATGCAAAAAGTTACTTTTAA (SEQ ID NO. 4031 ) RH12_HP0391R387 CAUCAUCAUAATCGTGAGCAACCAATTAAA (SEQ ID NO. 4032 ) SA1_HP0392S388 CUACUACUACTCACGATTGGTCTCCTTCTA (SEQ ID NO. 4033 ) SA2_HP0393S389 CUACUACUAATTACGCATTCTTGTCTAAAA (SEQ ID NO. 4034 ) SA3_HP0394S390 CUACUACUAATTTAGGCGTCCTTATTTTGA (SEQ ID NO. 4035) SA4_HP0395S391 CUACUACUAGCATCTTACTCTTTGAACAAA (SEQ ID NO. 4036 ) SA5_HP0396S392 CUACUACUATTCTTTATAGATCTTCTTTGT (SEQ ID NO. 4037 ) SA6_HP0397S393 CUACUACUAACCTTAAATAACCACATAATG (SEQ ID NO. 4038 ) SA7_HP0398S394 CUACUACUATGAATTTAAAAAACAGCATGA (SEQ ID NO. 4039 ) SA8_HP0399S395 CUACUACUACTTTAGAGTTTTTCTTTAAGA (SEQ ID NO. 4040 ) SA9_HP0400S396 CUACUACUAGTGTTAAATCGTGCTGATTTT (SEQ ID NO. 4041 ) SA10_HP0401S397 CUACUACUATTAATTTCCATTAAGACTCCC (SEQ ID NO. 4042 ) SA11_HP0402S398 CUACUACUATCTATCACATATTATCCTTTA (SEQ ID NO. 4043 ) SA12_HP0403S399 CUACUACUAATTAAAAGCTCTCCAACACTC (SEQ ID NO. 4044) SB1_HP0404S400 CUACUACUACGCTTTTAATGTTTGTCTCCG (SEQ ID NO. 4045 ) SB2_HP0405S401 CUACUACUATAGCTTAACGCAATTTTTTAA (SEQ ID NO. 4046 ) SB3_HP0406S402 CUACUACUACGTTAATTTAAAACTACTTTT (SEQ ID NO. 4047 ) SB4_HP0407S403 CUACUACUAACTCTTTTTAAGCCCCAATAA (SEQ ID NO. 4048 ) SB5_HP0408S404 CUACUACUATCATCGCTCATTCTTTATTTT (SEQ ID NO. 4049 ) SB6_HP0409S405 CUACUACUATCATTCCCATTCAATCGTTCC (SEQ ID NO. 4050 ) SB7_HP0410S406 CUACUACUAACTACTTTCGTTTTTTCATTT (SEQ ID NO. 4051 ) SB8_HP0411S407 CUACUACUATTAAATGAGATGGGGTGGAAG (SEQ ID NO. 4052 ) SB9_HP0412S408 CUACUACUATCAATCTTTAGAAAAAAAAAC (SEQ ID NO. 4053 ) SB10_HP0413S409 CUACUACUATTTGTTTAGACTTTAACACCT (SEQ ID NO. 4054 ) SB11_HP0414S410 CUACUACUACTAATTTAATGCGGTGTTTCT (SEQ ID NO. 4055 ) SB12_HP0415S411 CUACUACUATCTACCAATAATTTCAGATTT (SEQ ID NO. 4056 ) SC1_HP0416S412 CUACUACUATCTGAAATTATTGGTAGATGT (SEQ ID NO. 4057 ) SC2_HP0417S413 CUACUACUATTAGCTGATCAAACTTCCTGC (SEQ ID NO. 4058 ) SC3_HP0418S414 CUACUACUAAAATGAGCATGATTTACCATA (SEQ ID NO.
4059 ) SC4_HP0419S415 CUACUACUATTATTTAACGCTTGCTTGGTT (SEQ ID NO. 4060 ) SC5_HP0420S416 CUACUACUACTACTCTTCTTTAAAATTGAA (SEQ ID NO. 4061 )
SC6_HP0421S417 CUACUACUATTATGATAAGGTTTTAAAGAG (SEQ ID NO. 4062) SC7_HP0422S418 CUACUACUATTAAGAAATCGTGCGCAAATA (SEQ ID NO. 4063) SC8_HP0423S419 CUACUACUAGAGTGGGGGGGGTTTATTCTA (SEQ ID NO. 4064) SC9_HP0424S420 CUACUACUATGTTGGGGTGTGAATTTTATT (SEQ ID NO. 4065) SC10_HP0425S421 CUACUACUATTATAAGGCTTGTAAGCTTTT (SEQ ID NO. 4066) SC11_HP0426S422 CUACUACUAGCAATCTTTAATCATTCTAGA (SEQ ID NO. 4067) SC12_HP0427S423 CUACUACUATGGGGCTTATCGAATCAAATT (SEQ ID NO. 4068) SD1_HP0428S424 CUACUACUATCATGGAATCCAGTTCAACTT (SEQ ID NO. 4069) SD2_HP0429S425 CUACUACUAAAAAGTCATAATTGAAGGGCT (SEQ ID NO. 4070) SD3_HP0430S426 CUACUACUATTAATCGCATCCACTTTTTTT (SEQ ID NO. 4071) SD4_HP0431S427 CUACUACUACTTCTAACTCCATAAACTACC (SEQ ID NO. 4072) SD5_HP0432S428 CUACUACUATCAAAGCATTACGCTAATAAA (SEQ ID NO. 4073) SD6_HP0433S429 CUACUACUAACTCAGCATTTTTGAACCTTA (SEQ ID NO. 4074) SD7_HP0434S430 CUACUACUATTAGGATTGGCGCTTCAAACC (SEQ ID NO. 4075) SD8_HP0435S431 CUACUACUACACGCATAGAAGAATTAAAAA (SEQ ID NO. 4076) SD9_HP0436S432 CUACUACUATACTCATAGGGTTTTTATAGT (SEQ ID NO. 4077) SD10_HP0437S433 CUACUACUAATTAACCTTGATTTTCTATAT (SEQ ID NO. 4078) SD11_HP0438S434 CUACUACUAAACCGCTAAAGCGATTGGCTT (SEQ ID NO. 4079) SD12_HP0439S435 CUACUACUAGGAGCTATGAAGCAAGAAAAA (SEQ ID NO. 4080) SE1_HP0440S436 CUACUACUATCATTTCTTTGTCTTTTTGTG (SEQ ID NO. 4081) SE2_HP0441S437 CUACUACUACCTACTCCTTTGTCAAGTAAA (SEQ ID NO. 4082) SE3_HP0442S438 CUACUACUAATTTTCTCAAGCATAATAATT (SEQ ID NO. 4083) SE4_HP0443S439 CUACUACUATCTAAAATAACCATTGACTTA (SEQ ID NO. 4084) SE5_HP0444S440 CUACUACUATTTGTTTTCATCTTTCATGTC (SEQ ID NO. 4085) SE6_HP0445S441 CUACUACUAACTTCAATCGCGGTCTACTAC (SEQ ID NO. 4086) SE7_HP0446S442 CUACUACUACTTTATGATGCTAAAATTTCC (SEQ ID NO. 4087) SE8_HP0447S443 CUACUACUATCATAAAATTTTGCCCTCTTT (SEQ ID NO. 4088) SE9_HP0448S444 CUACUACUAACATCAGCCTAGGAAGCCCAA (SEQ ID NO. 4089) SE10_HP0449S445 CUACUACUACTCATCGTTTCTCCTTTAGAT (SEQ ID NO. 4090) SE11_HP0450S446 CUACUACUATTTTAGATAAACTTTTTTTCT (SEQ ID NO. 4091) SE12_HP0452S447 CUACUACUATGCTACTTTTTAAGAAGTTTT (SEQ ID NO.
4092) SF1_HP0453S448 CUACUACUATGATTTATTCATCCTCTTCTG (SEQ ID NO. 4093)
SF2_HP0454S449 CUACUACUATTAGGTTTTAATGTCTTTGCC (SEQ ID NO. 4094) SF3_HP0455S450 CUACUACUAGCGTGAATTTTAAAACAACAT (SEQ ID NO. 4095) SF4_HP0456S451 CUACUACUATCACACAGAACTTGGATAATA (SEQ ID NO. 4096) SF5_HP0457S452 CUACUACUATCCTTTTAAGCGTAAAATGAA (SEQ ID NO. 4097) SF6_HP0458S453 CUACUACUATTAAGGGCGCTTTCTAACATT (SEQ ID NO. 4098) SF7_HP0459S454 CUACUACUAGTTTTTCATTTAAGCACCTTT (SEQ ID NO. 4099) SF8_HP0460S455 CUACUACUATCAAATTTGCGAGCTTAAAGC (SEQ ID NO. 4100) SF9_HP0461S456 CUACUACUATTAGCCTTTGTGTGTTCTTTT (SEQ ID NO. 4101) SF10_HP0462S457 CUACUACUATTTTCATTGGGGTTTGACTTG (SEQ ID NO. 4102) SF11_HP0463S458 CUACUACUACAGGGTTTAGAGAATAATTTT (SEQ ID NO. 4103) SF12_HP0464S459 CUACUACUATTAGGCATGCGGATTTCCTTG (SEQ ID NO. 4104) SG1_HP0465S460 CUACUACUATGTAAAAATAAAAATCATGGT (SEQ ID NO. 4105) SG2_HP0466S461 CUACUACUATTAGGAGTCTTTTAAATCATT (SEQ ID NO. 4106) SG3_HP0467S462 CUACUACUACCTTTATGAGTATTTTTTCAT (SEQ ID NO. 4107) SG4_HP0468S463 CUACUACUAATCTTAAATGAAAGTAACCTT (SEQ ID NO. 4108) SG5_HP0469S464 CUACUACUACTTTAGCTTTTGTGCAGATTT (SEQ ID NO. 4109) SG6_HP0470S465 CUACUACUACTTGTTATTTAAAAGCCTTAA (SEQ ID NO. 4110) SG7_HP0471S466 CUACUACUATTATTTTTTAGCCGGCTCATT (SEQ ID NO. 4111) SG8_HP0472S467 CUACUACUATTAGAAAGTAAAGACATAATC (SEQ ID NO. 4112) SG9_HP0473S468 CUACUACUATTAATCCACAATATAGCCGTA (SEQ ID NO. 4113) SG10_HP0474S469 CUACUACUATTATCATAAAAACGAGCTTTG (SEQ ID NO. 4114) SG11_HP0475S470 CUACUACUATTCTAAAATTTTTTGTTTTGA (SEQ ID NO. 4115) SG12_HP0476S471 CUACUACUAAGAGCCAATTTTTTAGTTTTT (SEQ ID NO. 4116) SH1_HP0477S472 CUACUACUACCTAAAAACTATAAACGTAAC (SEQ ID NO. 4117) SH2_HP0478S473 CUACUACUAATTATCCCCTAAAGCCAAAAG (SEQ ID NO. 4118) SH3_HP0479S474 CUACUACUAGAAGTATAACACAATATTATA (SEQ ID NO. 4119) SH4_HP0480S475 CUACUACUATTATTTTTTCGCCCTTTTCCT (SEQ ID NO. 4120) SH5_HP0481S476 CUACUACUAATTTTAGAATTTTCTAACACA (SEQ ID NO. 4121) SH6_HP0482S477 CUACUACUACTAATCTTTGCCCTGCTCTTG (SEQ ID NO. 4122) SH7_HP0483S478 CUACUACUACATCTTACCCCTTTAAAAAGT (SEQ ID NO. 4123) SH8_HP0484S479 CUACUACUATTTATGAAAGCGGCTTATTAG (SEQ ID NO.
4124) SH9_HP0485S480 CUACUACUATTTATTGCCTTCTGCTAAAAG (SEQ ID NO. 4125)
SH10_HP0486S481 CUACUACUATTAGAAACTATAACTAATGTG (SEQ ID NO. 4126 ) SH11_HP0487S482 CUACUACUATAAAATGCTAGTAAATCCTAA (SEQ ID NO. 4127 ) SH12_HP0488S483 CUACUACUATTTAATTTTTTGTGGTAGGAT (SEQ ID NO. 4128 ) RA1_HP0392R388 CAUCAUCAUATGGATGATTTGCAAGAAATA (SEQ ID NO. 4129 ) RA2_HP0393R389 CAUCAUCAUATGGCAGAAAAAACAGCTAAC (SEQ ID NO. 4130 ) RA3_HP0394R390 CAUCAUCAUCAAAGAGTAAGATGCTAGAAA (SEQ ID NO. 4131 ) RA4_HP0395R391 CAUCAUCAUATGTTAGATTATCGCCAAAAA (SEQ ID NO. 4132 ) RA5_HP0396R392 CAUCAUCAUGGTAGTTGGATGCGAGATTTT (SEQ ID NO. 4133 ) RA6_HP0397R393 CAUCAUCAUAATGTATCAAGTAGCCATTTG (SEQ ID NO. 4134 ) RA7_HP0398R394 CAUCAUCAUATGAGATTTAAGGGTGTTGTT (SEQ ID NO. 4135 ) RA8_HP0399R395 CAUCAUCAUCAATGAGCAAGATAGCAGAGC (SEQ ID NO. 4136 ) RA9_HP0400R396 CAUCAUCAUATGGAAATTAAAATGGCTAAG (SEQ ID NO. 4137 ) RA10_HP0401R397 CAUCAUCAUTGTGATAGAGCTTGACATTAA (SEQ ID NO. 4138 ) RA11_HP0402R398 CAUCAUCAUAATGAAACTGAGCATTAATGA (SEQ ID NO. 4139 ) RA12_HP0403R399 CAUCAUCAUCGTTATTGCACACCTTAATAG (SEQ ID NO. 4140 ) RB1_HP0404R400 CAUCAUCAUATGAATGTGTTTGAAAAAATA (SEQ ID NO. 4141 ) RB2_HP0405R401 CAUCAUCAUGTGCAAGCGTTTTTAAATAGG (SEQ ID NO. 4142 ) RB3_HP0406R402 CAUCAUCAUAATTTGGATATTTTAGATTTG (SEQ ID NO. 4143 ) RB4_HP0407R403 CAUCAUCAUATGTCCATTTCACGCAGAAGT (SEQ ID NO. 4144 ) RB5_HP0408R404 CAUCAUCAUTATAAGGATAAAAGATGAATA (SEQ ID NO. 4145 ) RB6_HP0409R405 CAUCAUCAUCGATGATTTTAGTATTAGATT (SEQ ID NO. 4146 ) RB7_HP0410R406 CAUCAUCAUAGGAATTTAATGAAAAAAGGT (SEQ ID NO. 4147 ) RB8_HP0411R407 CAUCAUCAUTGAGTGCCGCTTTTGATTTTA (SEQ ID NO. 4148 ) RB9_HP0412R408 CAUCAUCAUGGAGTTATGGATTCTTTAGAG (SEQ ID NO. 4149 ) RB10_HP0413R409 CAUCAUCAUTTCTATGAAAGTCAATAAGGG (SEQ ID NO. 4150 ) RB11_HP0414R410 CAUCAUCAUGGTCTATGAAAAAAATTGATG (SEQ ID NO. 4151 ) RB12_HP0415R411 CAUCAUCAUATGCGTTTATTATTGTGGTGG (SEQ ID NO. 4152 ) RC1_HP0416R412 CAUCAUCAUGATGATTTCAAAATTTTTGCT (SEQ ID NO. 4153 ) RC2_HP0417R413 CAUCAUCAUGATTATAAAGATGCAAAAATC (SEQ ID NO. 4154 ) RC3_HP0418R414 CAUCAUCAUATGCGATTAGGGGTGAATGAA (SEQ ID NO.
4155 ) RC4_HP0419R415 CAUCAUCAUGGTAAATCATGCTCATTTGTA (SEQ ID NO. 4156 ) RC5_HP0420R416 CAUCAUCAUAAGATAGTGCAAGAATCAGTC (SEQ ID NO. 4157 )
RC6_HP0421R417 CAUCAUCAUAGAAGAGTAGTTAAATGGTTA (SEQ ID NO. 4158)
RC7_HP0422R418 CAUCAUCAUATGCAAGAAGTCCATGATTAT (SEQ ID NO. 4159)
RC8_HP0423R419 CAUCAUCAUTATGGGTTTTCAAAATGAAAA (SEQ ID NO. 4160)
RC9_HP0424R420 CAUCAUCAUATGTTATTGGATTATGATTTT (SEQ ID NO. 4161)
RC10_HP0425R421 CAUCAUCAUATGCCTAAAAAAGAGCTATTA (SEQ ID NO. 4162)
RC11_HP0426R422 CAUCAUCAUATGATGGCAAAAATAGATGTA (SEQ ID NO. 4163)
RC12_HP0427R423 CAUCAUCAUATACCCCATGTCATCTTTTTT (SEQ ID NO. 4164)
RD1_HP0428R424 CAUCAUCAUATGAGTGCGTCTTTGGGTGAT (SEQ ID NO. 4165)
RD2_HP0429R425 CAUCAUCAUCTATGAATGAAAATGGGAAAA (SEQ ID NO. 4166)
RD3_HP0430R426 CAUCAUCAUTTGTTGTTGAGCGATGGGCAT (SEQ ID NO. 4167)
RD4_HP0431R427 CAUCAUCAUCGATTAAGCAATGAGAGATTT (SEQ ID NO. 4168)
RD5_HP0432R428 CAUCAUCAUTTATGGAGTTAGAAGAAATTG (SEQ ID NO. 4169)
RD6_HP0433R429 CAUCAUCAUTTGAAACGCCCTACTATGCCC (SEQ ID NO. 4170)
RD7_HP0434R430 CAUCAUCAUGGTTCAAAAATGCTGAGTTTT (SEQ ID NO. 4171)
RD8_HP0435R431 CAUCAUCAUACGATGGATGAAGTCTTAAAA (SEQ ID NO. 4172)
RD9_HP0436R432 CAUCAUCAUAAATGAGAGCTTTTTTTAATT (SEQ ID NO. 4173)
RD10_HP0437R433 CAUCAUCAUGGATAGAGGAATGAGGAAAAA (SEQ ID NO. 4174)
RD11_HP0438R434 CAUCAUCAUATTTTTGCTTAACGCTATCAA (SEQ ID NO. 4175)
RD12_HP0439R435 CAUCAUCAUCATAGGCTAGTTGATGTCAAA (SEQ ID NO. 4176)
RE1_HP0440R436 CAUCAUCAUAAGGAGTAGGAATGAAAGATA (SEQ ID NO. 4177)
RE2_HP0441R437 CAUCAUCAUTATGCTTGAGAAAATTTTTAA (SEQ ID NO. 4178)
RE3_HP0442R438 CAUCAUCAUAGATGGCATTGCATATTGTTT (SEQ ID NO. 4179)
RE4_HP0443R439 CAUCAUCAUTGAAAGATGAAAACAAAACAT (SEQ ID NO. 4180)
RE5_HP0444R440 CAUCAUCAUTATGGCTTATATCGCTAATTT (SEQ ID NO. 4181)
RE6_HP0445R441 CAUCAUCAUAATGCAATTCCCACTCAAAAA (SEQ ID NO. 4182)
RE7_HP0446R442 CAUCAUCAUGGACCATGCAAGAAAGAGTTT (SEQ ID NO. 4183)
RE8_HP0447R443 CAUCAUCAUCCCAACGAAGAAGTAGTGTCC (SEQ ID NO. 4184)
RE9_HP0448R444 CAUCAUCAUTATGAAAATTATTAGCTTTGA (SEQ ID NO. 4185)
RE10_HP0449R445 CAUCAUCAUAGTGGAGTTTTTTGCAGACTC (SEQ ID NO. 4186)
RE11_HP0450R446 CAUCAUCAUAACGATGAGCGGGAATGAAGA (SEQ ID NO. 4187)
RE12_HP0452R447 CAUCAUCAUAAATGGCAGATGAAACGAACG (SEQ ID NO. 4188)
RF1_HP0453R448 CAUCAUCAUTGCAATGGATCTAGAAGAACT (SEQ ID NO. 4189)
RF2_HP0454R449 CAUCAUCAUGGTGTTGGAATTGAAATTAAT (SEQ ID NO. 4190) RF3_HP0455R450 CAUCAUCAUAAAATGTTTTTGCAAGTTGTA (SEQ ID NO. 4191) RF4_HP0456R451 CAUCAUCAU7ΛAAGGAAAACAATGATTAAAC (SEQ ID NO. 4192) RF5_HP0457R452 CAUCAUCAUCCATGCAATTAGTTGGTATTT (SEQ ID NO. 4193) RF6_HP0458R453 CAUCAUCAUATGCAAAAAGAAGTCTTAGTA (SEQ ID NO. 4194) RF7_HP0459R454 CAUCAUCAUGAAAGTCTTCTTAATGTTAGA (SEQ ID NO. 4195) RF8_HP0460R455 CAUCAUCAUCATGTCTAATTTGCAAGAACT (SEQ ID NO. 4196) RF9_HP0461R456 CAUCAUCAUGTTAAATGAAGTTTTATTCAA (SEQ ID NO. 4197) RF10__HP0462R457 CAUCAUCAUTTGAGTGAGTGGCAAACATTT (SEQ ID NO. 4198) RF11JHP0463R458 CAUCAUCAUAAATCCGCATGCCTAATAACG (SEQ ID NO. 4199) RF12_HP0464R459 CAUCAUCAUATGCCATACAATGAAATCACA (SEQ ID NO. 4200) RG1_HP0465R460 CAUCAUCAUATGCCCTTTTTAAAAGCCCTA (SEQ ID NO. 4201) RG2_HP0466R46X CAUCAUCAUAATGGAAATCATTTTATTGAT (SEQ ID NO. 4202) RG3_HP0467R462 CAUCAUCAUATGCGCATTATCATAAGGTTA (SEQ ID NO. 4203) RG4_HP0468R463 CAUCAUCAUGATGAGCGTTTTGAAATTGCA (SEQ ID NO. 4204) RG5_HP0469R464 CAUCAUCAUCGCATGCAAAGAGAATTAAGG (SEQ ID NO. 4205) RG6_HP0470R465 CAUCAUCAUCATGAAAGAGCAAGAATGGGA (SEQ ID NO. 4206) RG7_HP047XR466 CAUCAUCAUATGGAAAACAGCACACTTTAT (SEQ ID NO. 4207) RG8_HP0472R467 CAUCAUCAUGAAGGATTTATTATGATTAAA (SEQ ID NO. 4208) RG9_HP0473R468 CAUCAUCAUAGGGAGTTCTATGAAAAATAC (SEQ ID NO. 4209) RG10_HP0474R469 CAUCAUCAUGAGCAATGGATCATGAGTTTT (SEQ ID NO. 4210) RGX1_HP0475R470 CAUCAUCAUATGATAAAAGCGCGGTTTAAA (SEQ ID NO. 4211) RG12_HP0476R471 CAUCAUCAUGAACAATGAGTTTGATCGTTA (SEQ ID NO. 4212) RH1_HP0477R472 CAUCAUCAUATGCAAAAAGCCTTATTACAT (SEQ ID NO. 4213) RH2_HP0478R473 CAUCAUCAUGGGGGTAAATGCCTTCAAACG (SEQ ID NO. 4214) RH3_HP0479R474 CAUCAUCAUAATCATGCATGTTGCTTGTCT (SEQ ID NO. 4215) RH4_HP0480R475 CAUCAUCAUAGGACATTTATGAAAAATATT (SEQ ID NO. 4216) RH5_HP0481R476 CAUCAUCAUGAATGCATGCCAATTTATTCA (SEQ ID NO. 4217) RH6_HP0482R477 CAUCAUCAUATGCATATTAGCGAAGTCAAA (SEQ ID NO. 4218) RH7_HP0483R478 CAUCAUCAUTTTTCTATGCACCCTTATTTT (SEQ ID NO. 4219) RH8_HP0484R479 CAUCAUCAUAGGGGTAAGATGTTTAACAAT (SEQ ID NO. 
4220) RH9_HP0485R480 CAUCAUCAUCAAGAATGAAAAAAATTGGTT (SEQ ID NO. 4221)
RH10_HP0486R48X CAUCAUCAUGAATGAAATTAAAGAAACGAA (SEQ ID NO. 4222) RHX1_HP0487R482 CAUCAUCAUGATATGCCTTTTTGTATTTTT (SEQ ID NO. 4223) RH12_HP0488R483 CAUCAUCAUAGGAGTGAAGTTACCCAAAGC (SEQ ID NO. 4224) SA1_HP0489S484 CUACUACUACTCAAATCCTTGACTTTCTTC (SEQ ID NO. 4225) SA2_HP0490S485 CUACUACUACAATCACAATGAAATGTAATC (SEQ ID NO. 4226) SA3_HP0491S486 CUACUACUAAAAAACCCTAAACGGCTAAGC (SEQ ID NO. 4227) SA4_HP0492S487 CUACUACUAACTACTTTTTTTGAGGCATAG (SEQ ID NO. 4228) SA5_HP0493S488 CUACUACUATTTTCATTTTAACGCACCTTC (SEQ ID NO. 4229) SA6_HP0494S489 CUACUACUACGCTTTAATCCTTTAAAACAA (SEQ ID NO. 4230) SA7_HP0495S490 CUACUACUAGATTAAAGCGTTTGAACCACT (SEQ ID NO. 4231) SA8_HP0496Ξ491 CUACUACUATCAGATGGCATTGAGCAGCTC (SEQ ID NO. 4232) SA9_HP0497S492 CUACUACUATTAATAAATCTTGCTCAACCA (SEQ ID NO. 4233) SAX0_HP0498S493 CUACUACUATATTTCACTTAAATTGCAAGA (SEQ ID NO. 4234) SA1X_HP0499S494 CUACUACUATTAAGGGTTTAAGCGTATTCC (SEQ ID NO. 4235) SAX2_HP0500S495 CUACUACUATCAGCCTTTATAGTGTGATTG (SEQ ID NO. 4236) SB1_HP0501S496 CUACUACUAAAAATCCCTTACACATCTAGT (SEQ ID NO. 4237) SB2_HP0502S497 CUACUACUATCAATATCCGCTCCTATTTTT (SEQ ID NO. 4238) SB3_HP0503S498 CUACUACUACACTAACTTTAATCAAACTCA (SEQ ID NO. 4239) SB4_HP0504S499 CUACUACUATTTTAAACTAAAAGGGATTTT (SEQ ID NO. 4240) SB5_HP0S05S500 CUACUACUATGTTAATTTTGAAGTATTTTT (SEQ ID NO. 4241) SB6_HP0506S501 CUACUACUAATAAGACATTAAAAACCCTCT (SEQ ID NO. 4242) SB7_HP0507S502 CUACUACUATCCTTTTAAACATTTTTAAAC (SEQ ID NO. 4243) SB8_HP0508S503 CUACUACUATTGGTCTAACATGTTATTTTT (SEQ ID NO. 4244) SB9_HP0509S504 CUACUACUATCCTTGCTTTAAATCTTACAA (SEQ ID NO. 4245) SB10_HP0510S505 CUACUACUAAAAACATTCAACCAAACATCT (SEQ ID NO. 4246) SB11_HP0511S506 CUACUACUATGTGCTTCTTTTTCTATAAAC (SEQ ID NO. 4247) SB12_HP0512S507 CUACUACUACATTGTTTTAGCATGAATAAG (SEQ ID NO. 4248) SC1_HP05X3S508 CUACUACUATTATTGCCTATTCCTACCACC (SEQ ID NO. 4249) SC2_HP05X4S509 CUACUACUACTACTCAGCCACCACATCAAT (SEQ ID NO. 4250) SC3_HP05X5S510 CUACUACUATTAAAGCTCCAAAATTTTAAT (SEQ ID NO. 4251) SC4_HP0516S511 CUACUACUACATCATAAAATATACTTCACC (SEQ ID NO. 
4252) SC5_HP0517S512 CUACUACUATCAATCCCTATTCCTTCTATG (SEQ ID NO. 4253)
6SC6_HP0518S513 CUACUACUATTTTTCTTATTTTTCCATTAT (SEQ ID NO. 4254)
6SC7_HP0519S514 CUACUACUACTACAGAGCTTCCATGGCTTT (SEQ ID NO. 4255)
6SC8_HP0520S515 CUACUACUAGTTTTGCCTCAATCTTTAGTC (SEQ ID NO. 4256)
6SC9_HP0522S516 CUACUACUAAACTTACTTTGAATCTTTCAG (SEQ ID NO. 4257)
6SC10_HP0523S517 CUACUACUACTACTCGTTATATCGCACTTG (SEQ ID NO. 4258)
6SC11_HP0524S518 CUACUACUATTGTCTTAATCACAGTTCACT (SEQ ID NO. 4259)
6SC12_HP0525S519 CUACUACUAACTACCTGTGTTTGATATAAA (SEQ ID NO. 4260)
6SD1_HP0526S520 CUACUACUATCTTATTCCAAATTTAATTTT (SEQ ID NO. 4261)
6SD2_HP0527S521 CUACUACUACTTGAATTAATTGCCACCTTT (SEQ ID NO. 4262)
6SD3_HP0528S522 CUACUACUAACGCCTTTTTATTTATCTCTG (SEQ ID NO. 4263)
6SD4_HP0529S523 CUACUACUAGGCTTATCCTTTAAACATAGA (SEQ ID NO. 4264)
6SD5_HP0530S524 CUACUACUATTCTTATTTATTTAATGCCTT (SEQ ID NO. 4265)
6SD6_HP0531S525 CUACUACUATCATTCTGCACCGCCTTGTTT (SEQ ID NO. 4266)
6SD7_HP0532S526 CUACUACUACCTTTTCTTTCAATCACTTAC (SEQ ID NO. 4267)
6SD8_HP0533S527 CUACUACUACACTCATCATTTTTTAATTCT (SEQ ID NO. 4268)
6SD9_HP0534S528 CUACUACUAATTTTACACTCCTTTTTCTTT (SEQ ID NO. 4269)
6SD10_HP0535S529 CUACUACUAAATACACTACTAGAGTCTTAC (SEQ ID NO. 4270)
6SD11_HP0536S530 CUACUACUATCATAAGAACCAATTTTGCCA (SEQ ID NO. 4271)
6SD12_HP0537S531 CUACUACUATCCTATTCAAAGGGATTATTC (SEQ ID NO. 4272)
6SE1_HP0538S532 CUACUACUACGCTATTTTTTCCCATGAGCG (SEQ ID NO. 4273)
6SE2_HP0539S533 CUACUACUAAAAAATCATTTAACAATGATC (SEQ ID NO. 4274)
6SE3_HP0540S534 CUACUACUAAGTGTTTTCATTTGACAATAA (SEQ ID NO. 4275)
6SE4_HP0541S535 CUACUACUATTCCTCACTTCACGATTATTT (SEQ ID NO. 4276)
6SE5_HP0542S536 CUACUACUAGTTAATACCCTAAGATCGGTG (SEQ ID NO. 4277)
6SE6_HP0543S537 CUACUACUATTAACGCCACTCAATCGTTAC (SEQ ID NO. 4278)
6SE7_HP0544S538 CUACUACUACATTTTAATACTCCTTTATTT (SEQ ID NO. 4279)
6SE8_HP0545S539 CUACUACUACCCCTCTCTTTATAGATATAC (SEQ ID NO. 4280)
6SE9_HP0546S540 CUACUACUATCAATTTAGCTAGCTCCTCCG (SEQ ID NO. 4281)
6SE10_HP0547S541 CUACUACUAATCCTTTAAGATTTTTGGAAA (SEQ ID NO. 4282)
6SE11_HP0549S542 CUACUACUACGTTACAATTTAAGCCATTCT (SEQ ID NO. 4283)
6SE12_HP0550S543 CUACUACUATAGGGATTTTATTTTTCATTC (SEQ ID NO. 4284)
6SF1_HP0551S544 CUACUACUAAAGAGTTACTTCAAGTTGTAG (SEQ ID NO. 4285)
SF2_HP0552S545 CUACUACUACTTTTATTGATTTTTTTCTTC (SEQ ID NO. 4286) SF3_HP0553S546 CUACUACUACCTAGTTGATTTTATCCATTA (SEQ ID NO. 4287) SF4_HP0554S547 CUACUACUATTTTCATCGCTGGAGTTCTTT (SEQ ID NO. 4288) SF5_HP0555S548 CUACUACUATTAAGGCATCGCATCAAAAAC (SEQ ID NO. 4289) SF6_HP0556S549 CUACUACUAAAAAAAAGAGGGCCTTAATAA (SEQ ID NO. 4290) SF7_HP0557S550 CUACUACUATCTAATTCATGCTCTCCACAA (SEQ ID NO. 4291) SF8_HP0558S551 CUACUACUACTAGGCTTTTTTGAAAATCAC (SEQ ID NO. 4292) SF9_HP0559S552 CUACUACUATTAAGCCAGTTTATTATCCTC (SEQ ID NO. 4293) SF10_HP0560S553 CUACUACUATTCGTTGGAGCGTTTATTTTT (SEQ ID NO. 4294) SF1X_HP056XS554 CUACUACUATTGTTTAGGACTACATATAAA (SEQ ID NO. 4295) SF12_HP0562S555 CUACUACUAGTCTATTATAGTCTTGACTCA (SEQ ID NO. 4296) SG1_HP0563S556 CUACUACUATCACTTTTTTCTGGCAATTTT (SEQ ID NO. 4297) SG2_HP0564S557 CUACUACUATTAAAGCCTACTAGAAGAATC (SEQ ID NO. 4298) SG3_HP0565S558 CUACUACUATCAGCTCTTAAGAGCAGCACC (SEQ ID NO. 4299) SG4_HP0566S559 CUACUACUAAATGATCAAAAACACCCATTT (SEQ ID NO. 4300) SG5_HP0567S560 CUACUACUACCATAGGGCTTATATCATTCG (SEQ ID NO. 4301) SG6_HP0568S561 CUACUACUATTATTTTTTATAACGAATGAG (SEQ ID NO. 4302) SG7_HP0569S562 CUACUACUAATAAGACAGACTAGATATTGA (SEQ ID NO. 4303) SG8_HP0570S563 CUACUACUACAGACAAGCCCATTTCAAGCC (SEQ ID NO. 4304) SG9_HP0571S564 CUACUACUACTCTAGCGGTTTTTCTTACTA (SEQ ID NO. 4305) SG10_HP0572S565 CUACUACUACCCTTATTCTAGCACTCTAAC (SEQ ID NO. 4306) SG11_HP0573S566 CUACUACUATTTAAAGGAGAGCTTCTATTT (SEQ ID NO. 4307) SGX2__HP0574S567 CUACUACUAAGGTTTATGATTTCAGCGATT (SEQ ID NO. 4308) SHX_HP0575S568 CUACUACUAAAAGATTAAGAGAGGAGTAAA (SEQ ID NO. 4309) SH2_HP0576S569 CUACUACUATTAATGCGTTGCGTTTTCTTT (SEQ ID NO. 4310) SH3_HP0577S570 CUACUACUATTAAAATCCCTTCCTTTGTTG (SEQ ID NO. 4311) SH4_HP0578S571 CUACUACUATAGTTTGTCAATCTTTAAGAT (SEQ ID NO. 4312) SH5_HP0579S572 CUACUACUAAGCGTTAATACAACCTTTTTT (SEQ ID NO. 4313) SH6_HP0580S573 CUACUACUACGTTAATCACTTGAGGAGGGA (SEQ ID NO. 4314) SH7_HP0581S574 CUACUACUATTAGGCGATTTCTTGAAGGTT (SEQ ID NO. 4315) SH8_HP0582S575 CUACUACUATTTTAATTGCCCCCTAAGTAA (SEQ ID NO. 
4316) SH9_HP0583S576 CUACUACUATTAGAATTTTCCGGCATATAA (SEQ ID NO. 4317)
SH10_HP0584S577 CUACUACUATCATGAATTTTTAGCGAGATA (SEQ ID NO. 4318) SH11_HP0585S578 CUACUACUACCACTACGCTTTAAAGCTAGC (SEQ ID NO. 4319) SH12_HP0586S579 CUACUACUAGCGTCATTTTACTTCCTTTTT (SEQ ID NO. 4320) RA1_HP0489R484 CAUCAUCAUACAATGATCAATTACCCTAAT (SEQ ID NO. 4321) RA2_HP0490R485 CAUCAUCAUTTGTTTGAAAAGTTGAAATTT (SEQ ID NO. 4322) RA3_HP0491R486 CAUCAUCAUAGGATACACAATGGCAAAAAG (SEQ ID NO. 4323) RA4_HP0492R487 CAUCAUCAUTTTATGGAGCGTTCGCTTATT (SEQ ID NO. 4324) RA5_HP0493R488 CAUCAUCAUTTTATGCTCTATTCTTTACTA (SEQ ID NO. 4325) RA6_HP0494R489 CAUCAUCAUAGGTGCGTTAAAATGAAAATC (SEQ ID NO. 4326) RA7_HP0495R490 CAUCAUCAUATGCCATCTGATTCAAAAAAA (SEQ ID NO. 4327) RA8_HP0496R491 CAUCAUCAUATGCGCTGTAGGGTATATTAC (SEQ ID NO. 4328) RA9_HP0497R492 CAUCAUCAUGGTAGTTAAGAATGGGTAATC (SEQ ID NO. 4329) RA10_HP0498R493 CAUCAUCAUGCATGGGAAAATTTTCTAAAT (SEQ ID NO. 4330) RA11_HP0499R494 CAUCAUCAUATGAAAAGCATTTTGCTCTTT (SEQ ID NO. 4331) RA12_HP0500R495 CAUCAUCAUAACTATGAAAATCAGTGTTAG (SEQ ID NO. 4332) RB1_HP0501R496 CAUCAUCAUGCTGATTTGAATGCAAAATTA (SEQ ID NO. 4333) RB2_HP0502R497 CAUCAUCAUCGTGAATTTAGGGGTTTATTA (SEQ ID NO. 4334) RB3_HP0503R498 CAUCAUCAUGTGGGAAATCCCCCCTATAAC (SEQ ID NO. 4335) RB4_HP0504R499 CAUCAUCAUTGTGAGTTTGATTAAAGTTAG (SEQ ID NO. 4336) RB5_HP0505R500 CAUCAUCAUTTGAGTGAAATGATTGATTAC (SEQ ID NO. 4337) RB6_HP0506R501 CAUCAUCAUCATGGTATTTTTTCATAAGAA (SEQ ID NO. 4338) RB7_HP0507R502 CAUCAUCAUAGAGGGTTTTTAATGTCTTAT (SEQ ID NO. 4339) RB8_HP0508R503 CAUCAUCAUATGTTAAGGCTTTTGATAGGA (SEQ ID NO. 4340) RB9_HP0509R504 CAUCAUCAUACATGTTAGACCAACAACACA (SEQ ID NO. 4341) RB10_HP0510R505 CAUCAUCAUCATGAAAATCGGTGTTTATGG (SEQ ID NO. 4342) RB11J3P0511R506 CAUCAUCAUTATAAAATGAAAACCATTAGA (SEQ ID NO. 4343) RB12_HP0512R507 CAUCAUCAUTAATGATAGTAAGAACTCAAA (SEQ ID NO. 4344) RC1_HP0513R508 CAUCAUCAUAATGATAAGATGAACGGACAT (SEQ ID NO. 4345) RC2_HP05X4R509 CAUCAUCAUGGGATATGATGAAAGTTCTAT (SEQ ID NO. 4346) RC3_HP0515R510 CAUCAUCAUAAATGTTTGAAGCGACGACGA (SEQ ID NO. 4347) RC4_HP0516R5X1 CAUCAUCAUTTAATGTCTAAATTGAATATG (SEQ ID NO. 
4348) RC5_HP0517R512 CAUCAUCAUAGTATATTTTATGATGAAAAC (SEQ ID NO. 4349)
RC6_HP0518R513 CAUCAUCAUAATAGGGATTGAAAAAGGTAT (SEQ ID NO. 4350)
RC7_HP0519R514 CAUCAUCAUGATCTGACATGTTTAAAGATT (SEQ ID NO. 4351)
RC8_HP0520R515 CAUCAUCAUCTATGGCTGACACAATCAATA (SEQ ID NO. 4352)
RC9_HP0522R516 CAUCAUCAUTAATGTTTAGAAAACTAGCAA (SEQ ID NO. 4353)
RC10_HP0523R517 CAUCAUCAUTTGTTCGAGAAATGGATTGGT (SEQ ID NO. 4354)
RC11_HP0524R518 CAUCAUCAUCAATGGAAGACTTTTTGTATA (SEQ ID NO. 4355)
RC12_HP0525R519 CAUCAUCAUTAAGAACATGACTGAAGACAG (SEQ ID NO. 4356)
RD1_HP0526R520 CAUCAUCAUAGGAAACAGCAATGGAACTCG (SEQ ID NO. 4357)
RD2_HP0527R521 CAUCAUCAUCATGAATGAAGAAAACGATAA (SEQ ID NO. 4358)
RD3_HP0528R522 CAUCAUCAUTGATGGGGCAGGCATTTTTTA (SEQ ID NO. 4359)
RD4_HP0529R523 CAUCAUCAUAACATGTTTAATATTAAAAGG (SEQ ID NO. 4360)
RD5_HP0530R524 CAUCAUCAUTCGCATGTTAGGGAAAAAAAA (SEQ ID NO. 4361)
RD6_HP0531R525 CAUCAUCAUATGAACGATACAACAGAGCAT (SEQ ID NO. 4362)
RD7_HP0532R526 CAUCAUCAUATGAAACTGAGAGCAAGTGTT (SEQ ID NO. 4363)
RD8_HP0533R527 CAUCAUCAUGTAAAATGCTTGATGTGGAAA (SEQ ID NO. 4364)
RD9_HP0534R528 CAUCAUCAUCAAGAGCGATATGAGTAATAA (SEQ ID NO. 4365)
RD10_HP0535R529 CAUCAUCAUGTGAAATGCTTCCTACTAAAA (SEQ ID NO. 4366)
RD11_HP0536R530 CAUCAUCAUACCAAAATGAAACGACCGATT (SEQ ID NO. 4367)
RD12_HP0537R531 CAUCAUCAUAGGTTACAAAAACTATAAGAT (SEQ ID NO. 4368)
RE1_HP0538R532 CAUCAUCAUAGGAGACACTCTTGAAAAGTA (SEQ ID NO. 4369)
RE2_HP0539R533 CAUCAUCAUATGAAAACACTCGTGAAAAAT (SEQ ID NO. 4370)
RE3_HP0540R534 CAUCAUCAUGAGGAAAGAGATGTGAAATGT (SEQ ID NO. 4371)
RE4_HP0541R535 CAUCAUCAUAACATGGCAGGTACACAAGCT (SEQ ID NO. 4372)
RE5_HP0542R536 CAUCAUCAUTATGAAAACGAATTTTTATAA (SEQ ID NO. 4373)
RE6_HP0543R537 CAUCAUCAUAAGGAGTATTAAAATGAAACA (SEQ ID NO. 4374)
RE7_HP0544R538 CAUCAUCAUCTATAAAGAGAGGGGTGTTTG (SEQ ID NO. 4375)
RE8_HP0545R539 CAUCAUCAUAGCTAAATTGATCAATAATAA (SEQ ID NO. 4376)
RE9_HP0546R540 CAUCAUCAUGAAACGCATGAAATTTTTTAC (SEQ ID NO. 4377)
RE10_HP0547R541 CAUCAUCAUGGAGAAACAATGACTAACGAA (SEQ ID NO. 4378)
RE11_HP0549R542 CAUCAUCAUATGAAAATAGGCGTTTTTGAT (SEQ ID NO. 4379)
RE12_HP0550R543 CAUCAUCAUAAGGAAATTTAATGAACGAAA (SEQ ID NO. 4380)
RF1_HP0551R544 CAUCAUCAUGATGAAAAAAGGCATTCACCC (SEQ ID NO. 4381)
RF2_HP0552R545 CAUCAUCAUTGGCTTTTTGTGCTGTATTTT (SEQ ID NO. 4382) RF3_HP0553R546 CAUCAUCAUAGCATGCAAGCAGTAATTTAT (SEQ ID NO. 4383) RF4_HP0554R547 CAUCAUCAUTGAATGGAACAGAATAAAAAA (SEQ ID NO. 4384) RF5_HP0555R548 CAUCAUCAUATGAAAAACCTTTTTTTAGCT (SEQ ID NO. 4385) RF6_HP0556R549 CAUCAUCAUATGGTTACACATGAAAAAATC (SEQ ID NO. 4386) RF7_HP0557R550 CAUCAUCAUTTTGAATGGCCGTTTATTTAG (SEQ ID NO. 4387) RF8_HP0558R551 CAUCAUCAUATTGGTGCGTCGGATTGTAGT (SEQ ID NO. 4388) RF9_HP0559R552 CAUCAUCAUTTATGGCTTTATTTGAAGATA (SEQ ID NO. 4389) RF10_HP0560R553 CAUCAUCAUATGGGGATTATTTACTTAATA (SEQ ID NO. 4390) RFX1_HP0561R554 CAUCAUCAUAATGCAATTCACAGGGAAAAA (SEQ ID NO. 4391) RF12_HP0562R555 CAUCAUCAUATGCCAGGGATTAAGGTTAGA (SEQ ID NO. 4392) RG1_HP0563R556 CAUCAUCAUGGGGTAACAAGTGGCAATAAA (SEQ ID NO. 4393) RG2_HP0564R557 CAUCAUCAUGTGACTTTTTATCTCTCTAAA (SEQ ID NO. 4394) RG3_HP0565R558 CAUCAUCAUAACCATGCAAGCGTTAAAATC (SEQ ID NO. 4395) RG4_HP0566R559 CAUCAUCAUTAGATGGTGTTTTACAAGTAT (SEQ ID NO. 4396) RG5_HP0567R560 CAUCAUCAUCATGAAAGCTCAGTATTTCTT (SEQ ID NO. 4397) RG6_HP0568R561 CAUCAUCAUCGTGATGCCTTTGGAATTATT (SEQ ID NO. 4398) RG7_HP0569R562 CAUCAUCAUAATGGGCTTGTCTGTAGGCAT (SEQ ID NO. 4399) RG8__HP0570R563 CAUCAUCAUGACTTTATGTTAAAAATCAAA (SEQ ID NO. 4400) RG9_HP0571R564 CAUCAUCAUAGGGTGAATTTTGGAAGAATA (SEQ ID NO. 4401) RG10_HP0572R565 CAUCAUCAUATGAATGAAACGCTCAAAGAA (SEQ ID NO. 4402) RG11_HP0573R566 CAUCAUCAUATGTTTACCCAATGGTTTATT (SEQ ID NO. 4403) RG12_HP0574R567 CAUCAUCAUGTTTATGAATAAGCTCTTAAA (SEQ ID NO. 4404) RH1_HP0575R568 CAUCAUCAUGTGCAATTTTTTGATTTCTCT (SEQ ID NO. 4405) RH2__HP0576R569 CAUCAUCAUATGAAATTTTTACGCTCTGTT (SEQ ID NO. 4406) RH3_HP0577R570 CAUCAUCAUAAATGGGCATGCCAAATAGGG (SEQ ID NO. 4407) RH4__HP0578R571 CAUCAUCAUTATGAAATCCCTATCTAATGC (SEQ ID NO. 4408) RH5_HP0579R572 CAUCAUCAUATGCTTATATCTTCTTCTTAC (SEQ ID NO. 4409) RH6_HP0580R573 CAUCAUCAUTATCTTGGAACCTTCAAGAAA (SEQ ID NO. 4410) RH7_HP058XR574 CAUCAUCAUATGGAAATCACGCTTTTTGAC (SEQ ID NO. 4411) RH8_HP0582R575 CAUCAUCAUTATGCCGGAAAATTCTAAACT (SEQ ID NO. 
4412) RH9_HP0583R576 CAUCAUCAUATGAGATTGTTATTCTTGTTA (SEQ ID NO. 4413)
RH10_HP0584R577 CAUCAUCAUAGTTTTATGTCAGAAACAGAA (SEQ ID NO. 4414)
RH11_HP0585R578 CAUCAUCAUGAAAGCGTGTGAAAATGGGTT (SEQ ID NO. 4415)
RH12_HP0586R579 CAUCAUCAUTGATGAATAAAAGAAAACATG (SEQ ID NO. 4416)
SA1_HP0587S580 CUACUACUACTTAAAAATGATTATTAGAAA (SEQ ID NO. 4417)
SA2_HP0588S581 CUACUACUAGCATTATTTGTCTCTCCCTTC (SEQ ID NO. 4418)
SA3_HP0589S582 CUACUACUAGCCATTTTAAAGCTCCTTCAA (SEQ ID NO. 4419)
SA4_HP0590S583 CUACUACUACTTCCATTATTGTTTTCCTTG (SEQ ID NO. 4420)
SA5_HP0591S584 CUACUACUAGCCATTAAGCCCTAACTTTCA (SEQ ID NO. 4421)
SA6_HP0592S585 CUACUACUATCTGTCATGGCGTTTCCTTTA (SEQ ID NO. 4422)
SA7_HP0593S586 CUACUACUATCAATACACCACGCTTAAACG (SEQ ID NO. 4423)
SA8_HP0594S587 CUACUACUACCCCTTATAGATTCATGTTAG (SEQ ID NO. 4424)
SA9_HP0595S588 CUACUACUAAGTTTAAAAATTAAGGGTGTA (SEQ ID NO. 4425)
SA10_HP0596S589 CUACUACUACCTACATGGCTATAGGGACTT (SEQ ID NO. 4426)
SA11_HP0597S590 CUACUACUATTAGAACAACAAGCGTTCCTC (SEQ ID NO. 4427)
SA12_HP0598S591 CUACUACUATTAACCACTCTTAAAAGAAGA (SEQ ID NO. 4428)
SB1_HP0599S592 CUACUACUAATCATTCGCCTTTTTGAATTT (SEQ ID NO. 4429)
SB2_HP0600S593 CUACUACUATTAGTATTGCTGCTTCAAAAA (SEQ ID NO. 4430)
SB3_HP0601S594 CUACUACUAACTAAGTTAAAAGCCTTAAGA (SEQ ID NO. 4431)
SB4_HP0602S595 CUACUACUAAAATCAAAGTTTTAATTCCAA (SEQ ID NO. 4432)
SB5_HP0603S596 CUACUACUATTACCCTAAAAATCTCGTAAA (SEQ ID NO. 4433)
SB6_HP0604S597 CUACUACUACCTATCGTCTGGTTTTAGCAT (SEQ ID NO. 4434)
SB7_HP0605S598 CUACUACUATTAATGCACATAGTCGTCTAT (SEQ ID NO. 4435)
SB8_HP0606S599 CUACUACUAACATCAAAATATCCTATTATT (SEQ ID NO. 4436)
SB9_HP0607S600 CUACUACUATTTTCATTCTAAAGTTTTTTG (SEQ ID NO. 4437)
SB10_HP0608S601 CUACUACUATCTCTTAAAACATATAGCTAT (SEQ ID NO. 4438)
SB11_HP0609S602 CUACUACUATGTTGTTATCCGCTCTTAATA (SEQ ID NO. 4439)
SB12_HP0610S603 CUACUACUATGGCTCAAAAAAGGGGTAAGC (SEQ ID NO. 4440)
SC1_HP0611S604 CUACUACUACTTTAACGCATGCGTTCTTTG (SEQ ID NO. 4441)
SC2_HP0612S605 CUACUACUAACAAGAATATATTATGCGATT (SEQ ID NO. 4442)
SC3_HP0613S606 CUACUACUACCCACTAAATATCCTTTAGTA (SEQ ID NO. 4443)
SC4_HP0614S607 CUACUACUAGTTATTGGTTTTATGGGTGTT (SEQ ID NO. 4444)
SC5_HP0615S608 CUACUACUATAATTTTAATCCAATTCTTTA (SEQ ID NO. 4445)
SC6_HP06X6S609 CUACUACUAATTTATGAAAGCGTTTTTTTA (SEQ ID NO. 4446) SC7_HP0617S610 CUACUACUATTATTTTCTCAAGCGAATGTG (SEQ ID NO. 4447) SC8_HP0618S611 CUACUACUATTAATTACCGAAAGACAAGAT (SEQ ID NO. 4448) SC9_HP0620S612 CUACUACUATTAGCCTTGATAGGCTTTTAT (SEQ ID NO. 4449) SC10_HP0621S613 CUACUACUAAACTACAATTTAACGATTTTA (SEQ ID NO. 4450) SC11_HP0622S614 CUACUACUATCATTAGGGTTGGATTAAGGG (SEQ ID NO. 4451) SC12_HP0623S6X5 CUACUACUATTACATTTCGCCTCTTAATTG (SEQ ID NO. 4452) SD1_HP0624S616 CUACUACUATCAAGCATGATTTTCTCGATA (SEQ ID NO. 4453) SD2_HP0625S617 CUACUACUACTAATCCTTTAAACTTTTTTC (SEQ ID NO. 4454) ΞD3_HP0626S618 CUACUACUATTAATGCAGGTTTTGATTCAA (SEQ ID NO. 4455) SD4_HP0627S619 CUACUACUATACCCCATTAGCATACATCGC (SEQ ID NO. 4456) SD5_HP0628S620 CUACUACUAAATTAGGGTATATCATAAGAA (SEQ ID NO. 4457) SD6_HP0629S621 CUACUACUACTTGTTTATTCCACAATAAAG (SEQ ID NO. 4458) SD7_HP0630S622 CUACUACUAGAAATGGGGGGGTTAAAAAAA (SEQ ID NO. 4459) SD8_HP0631S623 CUACUACUAAATCCCTTACTCTTTGTTTTT (SEQ ID NO. 4460) SD9_HP0632S624 CUACUACUACTTTTTAGAATTTAGCGAAAT (SEQ ID NO. 4461) SD10_HP0633S625 CUACUACUAACTCATTCTTTGGTATAACCA (SEQ ID NO. 4462) SD11_HP0634S626 CUACUACUATTCAAAAACCCTTATAAGAAA (SEQ ID NO. 4463) SD12_HP0635S627 CUACUACUAAGCCCTTATTCATAATCCAAA (SEQ ID NO. 4464) SE1_HP0636S628 CUACUACUATCACCCTTCGGCGATATTGTT (SEQ ID NO. 4465) SE2_HP0637S629 CUACUACUAATCAATAAAAAGCGTAAGGGA (SEQ ID NO. 4466) SE3_HP0638S630 CUACUACUAGTTTTAATGTTTGTTTTTAAA (SEQ ID NO. 4467) SE4_HP0639S631 CUACUACUATTATTTTTTATTAGCTTGAAA (SEQ ID NO. 4468) SE5_HP0640S632 CUACUACUATTTTTAATTTCTTCTGCCTAT (SEQ ID NO. 4469) SE6_HP0641S633 CUACUACUATTAATGTTCTCCTTTTTCTAA (SEQ ID NO. 4470) SE7_HP0642S634 CUACUACUATTATTCAATCACTTCATAAAT (SEQ ID NO. 4471) SE8_HP0643S635 CUACUACUATCATGCTTTGAGCCTTAAAAC (SEQ ID NO. 4472) SE9_HP0644S636 CUACUACUACGCATGATTTTCTTTAAAGGT (SEQ ID NO. 4473) SE10_HP0645S637 CUACUACUATCATGGTTTGTCCTTTGAGTT (SEQ ID NO. 4474) SE11_HP0646S638 CUACUACUAAGATTTATAAGCGTTTTTTAT (SEQ ID NO. 4475) SEX2_HP0647S639 CUACUACUATCTATCCTTTTTCTTTTTTAT (SEQ ID NO. 
4476) SF1_HP0648S640 CUACUACUACAAATCTTGGATTATTTTTCT (SEQ ID NO. 4477)
SF2_HP0649S641 CUACUACUACGTTCAGTCTTTATGCTTTTT (SEQ ID NO. 4478)
SF3_HP0650S642 CUACUACUATTAAGATTGGTTAAGAAATTG (SEQ ID NO. 4479)
SF4_HP0651S643 CUACUACUACCTTACTTTTTAACCCATCTC (SEQ ID NO. 4480)
SF5_HP0652S644 CUACUACUATTTTTAAATCAAAGGCTTGAT (SEQ ID NO. 4481)
SF6_HP0653S645 CUACUACUAACCCAATACCCTAAAATTAAG (SEQ ID NO. 4482)
SF7_HP0654S646 CUACUACUAGCTAAAAATTATTAAAATTTC (SEQ ID NO. 4483)
SF8_HP0655S647 CUACUACUAGCATTTTAAAACCTTGTTCCC (SEQ ID NO. 4484)
SF9_HP0656S648 CUACUACUATTTTTTCATGTTATTTATACC (SEQ ID NO. 4485)
SF10_HP0657S649 CUACUACUAGCATCATTTGTCCTTCTTTTT (SEQ ID NO. 4486)
SF11_HP0658S650 CUACUACUATCATCAACCCAATTTCTCTTT (SEQ ID NO. 4487)
SF12_HP0659S651 CUACUACUATCACTCTCTAATCATCACAAT (SEQ ID NO. 4488)
SG1_HP0660S652 CUACUACUATTCAATTTCTTGCATTCAATC (SEQ ID NO. 4489)
SG2_HP0661S653 CUACUACUATTCCCTTTAAGTGGTCGTTTT (SEQ ID NO. 4490)
SG3_HP0662S654 CUACUACUATCATTTGGCTTCCTTCAGTTT (SEQ ID NO. 4491)
SG4_HP0663S655 CUACUACUACCAATTTCGTTTAATTCTCAT (SEQ ID NO. 4492)
SG5_HP0664S656 CUACUACUAAATCAATGGTTTGCATTAAAA (SEQ ID NO. 4493)
SG6_HP0665S657 CUACUACUATTTCATTCATAGCGTTTTACT (SEQ ID NO. 4494)
SG7_HP0666S658 CUACUACUAGGTTAAAAAGGCTTTTTATTC (SEQ ID NO. 4495)
SG8_HP0667S659 CUACUACUAGATGCTATTTTGATGGAATTT (SEQ ID NO. 4496)
SG9_HP0668S660 CUACUACUATTCATCGTTGTGATTGGTTGC (SEQ ID NO. 4497)
SG10_HP0669S661 CUACUACUATAATCACTCAAACCTCTTTTC (SEQ ID NO. 4498)
SG11_HP0670S662 CUACUACUAAAGGGGTTTTAGCGGTTTGGC (SEQ ID NO. 4499)
SG12_HP0671S663 CUACUACUAAAATCAGAAATTGTAACGATA (SEQ ID NO. 4500)
SH1_HP0672S664 CUACUACUATCTTTTTATCCCTTTGATTTG (SEQ ID NO. 4501)
SH2_HP0673S665 CUACUACUACATCAGGAGTAATAAAACAAT (SEQ ID NO. 4502)
SH3_HP0674S666 CUACUACUATTAGTCATCTATTAAGTGAAT (SEQ ID NO. 4503)
SH4_HP0675S667 CUACUACUAGGGTGTTTTTAATTTTCTTCC (SEQ ID NO. 4504)
SH5_HP0676S668 CUACUACUACCCCATTTTTTAAGAAATTAA (SEQ ID NO. 4505)
SH6_HP0677S669 CUACUACUATCAATTAAAAAAAGAGTTTGT (SEQ ID NO. 4506)
SH7_HP0678S670 CUACUACUATGATGCCTTTATTGGTCTGTT (SEQ ID NO. 4507)
SH8_HP0679S671 CUACUACUATTAAGCCAATTTGACAGACGC (SEQ ID NO. 4508)
SH9_HP0680S672 CUACUACUACTTATATTATTGGCAATTAAA (SEQ ID NO. 4509)
SH10_HP0681S673 CUACUACUAAAATCACCACCCCCATTTAGC (SEQ ID NO. 4510 ) SH11_HP0682S674 CUACUACUATGACTCATTTAAGATCACCAA (SEQ ID NO. 4511 ) SH12_HP0683S675 CUACUACUATTAAGGTTTCTTAAAAAACTT (SEQ ID NO. 4512 ) RA1_HP0587R580 CAUCAUCAUATGACGACTAAAAGAGTGAAT (SEQ ID NO. 4513 ) RA2_HP0588R581 CAUCAUCAUGGAGAATGAATGGCTAAAATG (SEQ ID NO. 4514 ) RA3__HP0589R582 CAUCAUCAUATGCGTGAGATTATTTCTGAT (SEQ ID NO. 4515 ) RA4_HP0590R583 CAUCAUCAUAAATGGCGTTTAATTATGATG (SEQ ID NO. 4516 ) RA5_HP0591R584 CAUCAUCAUTAATGGAAGCGCAATTACGAT (SEQ ID NO. 4517 ) RA6_HP0592R585 CAUCAUCAUTTGGGTTGCGTTTTTACCAAT (SEQ ID NO. 4518 ) RA7_HP0593R586 CAUCAUCAUTGACATGCTTTTAAAGAATTT (SEQ ID NO. 4519 ) RA8_HP0594R587 CAUCAUCAUCATGGAGTTTTTGGGACTGAT (SEQ ID NO. 4520 ) RA9_HP0595R588 CAUCAUCAUGGGATGCATGGATAAAGAAAC (SEQ ID NO. 4521 ) RA10_HP0596R589 CAUCAUCAUGCGTGTTAGAAAAATCTTTTT (SEQ ID NO. 4522 ) RA11_HP0597R590 CAUCAUCAUATGCTAAAAAAGATTTTTTAT (SEQ ID NO. 4523 ) RA12_HP0598R591 CAUCAUCAUAAGTGTTTTGATGTTTTCTAA (SEQ ID NO. 4524 ) RB1_HP0599R592 CAUCAUCAUTGCCATAATGTTTGGGAATAA (SEQ ID NO. 4525 ) RB2_HP0600R593 CAUCAUCAUCATGCAAACACCAATGGATAC (SEQ ID NO. 4526 ) RB3_HP0601R594 CAUCAUCAUCAATGGCTTTTCAGGTCAATA (SEQ ID NO. 4527 ) RB4_HP0602R595 CAUCAUCAUTGTGTTGGATAGTTTTGAGAT (SEQ ID NO. 4528 ) RB5_HP0603R596~ CAUCAUCAUTTTATTGGCCAAAAAGGATTG (SEQ ID NO. 4529 ) RB6_HP0604R597 CAUCAUCAUAGGATTTATGATGATTTTCAT (SEQ ID NO. 4530 ) RB7_HP0605R598 CAUCAUCAUGGGATTGATGAATACTATCAT (SEQ ID NO. 4531 ) RB8_HP0606R599 CAUCAUCAUGAAAATGATACGAAAAATTTT (SEQ ID NO. 4532 ) RB9_HP0607R600 CAUCAUCAUAGGATATTTTGATGTATAAAA (SEQ ID NO. 4533 ) RB10_HP0608R60X CAUCAUCAUATGGTGGGTTTAGCTCCTATT (SEQ ID NO. 4534 ) RB11_HP0609R602 CAUCAUCAUGGTTCAAAATATGACTTATAG (SEQ ID NO. 4535 ) RBX2_HP06X0R603 CAUCAUCAUACACCTTTTTTGTGCAATTCA (SEQ ID NO. 4536) RC1_HP06XXR604 CAUCAUCAUACATGCTATCACCAGCAACTT (SEQ ID NO. 4537 ) RC2_HP06X2R605 CAUCAUCAUGTGGGAGCAATTCTATCTATC (SEQ ID NO. 4538 ) RC3_HP06X3R606 CAUCAUCAUTTATGATTTCTAACATCAGCA (SEQ ID NO. 
4539 ) RC4_HP0614R607 CAUCAUCAUTTTAAGGAGCAATGATGGAAC (SEQ ID NO. 4540 ) RC5_HP0615R608 CAUCAUCAUACATGATAAAAAGCCAAAAAG (SEQ ID NO. 4541 )
RC6_HP0616R609 CAUCAUCAUGGTGGTAAGAGATATTGACAA (SEQ ID NO. 4542) RC7_HP0617R6X0 CAUCAUCAUAAAACATGCGAAGTCATTTTT (SEQ ID NO. 4543) RC8_HP0618R6X1 CAUCAUCAUGGATACGATATGAAACAACTA (SEQ ID NO. 4544) RC9_HP0620R612 CAUCAUCAUGCAATGAATTTAGAGAAGTTA (SEQ ID NO. 4545) RC10_HP0621R613 CAUCAUCAUCTTGTGGATTTGGATTCTTGT (SEQ ID NO. 4546) RCXX_HP0622R614 CAUCAUCAUATGGGTGCAGTGGTTGTTTTA (SEQ ID NO. 4547) RC12_HP0623R615 CAUCAUCAUATCATGCTTGAAACCCCAAAA (SEQ ID NO. 4548) RD1_HP0624R616 CAUCAUCAUATGACCTTTGAGCCTTATCCT (SEQ ID NO. 4549) RD2_HP0625R617 CAUCAUCAUGGTTTAATGCTAGAAAATAGA (SEQ ID NO. 4550) RD3_HP0626R618 CAUCAUCAUATGATCAATAAGTTTAAAAAT (SEQ ID NO. 4551) RD4_HP0627R6X9 CAUCAUCAUGAATGCTCAAAAAAAGTTTGT (SEQ ID NO. 4552) RD5_HP0628R620 CAUCAUCAUTCGCTATTATGAATGGGTTGC (SEQ ID NO. 4553) RD6_HP0629R621 CAUCAUCAUGCTATGGAAGCGAAACAAAGC (SEQ ID NO. 4554) RD7_HP0630RS22 CAUCAUCAUATGAAAAAAGTACTCATCATT (SEQ ID NO. 4555) RD8_HP0631R623 CAUCAUCAUGAGTGGTCATGTTCTACGATG (SEQ ID NO. 4556) RD9_HP0632R624 CAUCAUCAUCATGTCAAAAAAAATCGTAGT (SEQ ID NO. 4557) RD10_HP0633R625 CAUCAUCAUATGGATAAAATGAATAAGGTC (SEQ ID NO. 4558) RDXX_HP0634R626 CAUCAUCAUATGAGTCAAAAAATCCTAATT (SEQ ID NO. 4559) RDX2__HP0635R627 CAUCAUCAUTTGAATTGGTTTTTGTTTTTC (SEQ ID NO. 4560) REX_HP0636R628 CAUCAUCAUTATGCGTATAGTTAGAAATTT (SEQ ID NO. 4561) RE2_HP0637R629 CAUCAUCAUCCATGCATCAAAACAATAAAA (SEQ ID NO. 4562) RE3_HP0638R630 CAUCAUCAUATGAAAAAAGCTCTCTTACTA (SEQ ID NO. 4563) RE4_HP063 R631 CAUCAUCAUAATTTTGATGGAACAAAAAAT (SEQ ID NO. 4564) RE5JHP0640R632 CAUCAUCAUGTGTTTGAGTTAGAAGAAGGA (SEQ ID NO. 4565) RE6_HP0641R633 CAUCAUCAUAGAAAACATGATAACGATGAA (SEQ ID NO. 4566) RE7_HP0642R634 CAUCAUCAUATTAAAATGGACAGAGAACAA (SEQ ID NO. 4567) RE8_HP0S43R635 CAUCAUCAUAGGGTTTTTATGCTTCGTTTT (SEQ ID NO. 4568) RE9_HP0644R636 CAUCAUCAUGCATGATTTTTTCCACTCTTA (SEQ ID NO. 4569) RE10_HP0645R637 CAUCAUCAUATCATGCGTTTTTTTACCTTG (SEQ ID NO. 4570) RE11_HP0646R638 CAUCAUCAUGGACAAACCATGATTAAAAAA (SEQ ID NO. 4571) RE12_HP0647R639 CAUCAUCAUCATGGGCAATTTGACTTATTA (SEQ ID NO. 
4572) RF1_HP0648R640 CAUCAUCAUTTGGATTTTTTAGAGATTGTA (SEQ ID NO. 4573)
RF2_HP0649R641 CAUCAUCAUATGCGTATTGAGCATGATTTC (SEQ ID NO. 4574) RF3_HP0650R642 CAUCAUCAUTTTATATGGAGCGTCTTTTGG (SEQ ID NO. 4575) RF4_HP0651R643 CAUCAUCAUAAAGGATAACCATGTTCCAAC (SEQ ID NO. 4576) RF5_HP0652R644 CAUCAUCAUAGGGTATTGGGTGCAAAAACT (SEQ ID NO. 4577) RF6_HP0653R645 CAUCAUCAUATGTTATCAAAAGACATCATT (SEQ ID NO. 4578) RF7_HP0654R646 CAUCAUCAUCATGGACTTTTTAGAAAAAGT (SEQ ID NO. 4579) RF8_HP0655R647 CAUCAUCAUATTAAAAAATTAATTCTATCC (SEQ ID NO. 4580) RF9_HP0656R648 CAUCAUCAUGGCAATGGCAAGAAATGTAAA (SEQ ID NO. 4581) RF10_HP0657R649 CAUCAUCAUCAAAGGTATAAATAACATGAA (SEQ ID NO. 4582) RF11_HP0658R650 CAUCAUCAUATGATGCCATTTGAAGCTGTA (SEQ ID NO. 4583) RF12_HP0659R651 CAUCAUCAUATGAGGAAAATTTTTTCTTAT (SEQ ID NO. 4584) RG1_HP0660R652 CAUCAUCAUTAGATTGTGCAACACTTCAAT (SEQ ID NO. 4585) RG2_HP0661R653 CAUCAUCAUGAATGCAAGAAATTGAAATTT (SEQ ID NO. 4586) RG3_HP0662R654 CAUCAUCAUAGGGAAAAATGATGAAAAACA (SEQ ID NO. 4587) RG4_HP0663R655 CAUCAUCAUAAGGAAGCCAAATGAACACTT (SEQ ID NO. 4588) RG5_HP0664R656 CAUCAUCAUAAGTGGAAAAATTACCTAAAA (SEQ ID NO. 4589) RG6_HP0665R6S7 CAUCAUCAUATGCAAACCATTGATTTTGAA (SEQ ID NO. 4590) RG7_HP0666R658 CAUCAUCAUAGTAAAACGCTATGAATGAAA (SEQ ID NO. 4591) RG8_HP0667R659 CAUCAUCAUATGAAAAATGCAAGAAATCAG (SEQ ID NO. 4592) RG9_HP0668R660 CAUCAUCAUGTGCAATGCAAATTCCATCAA (SEQ ID NO. 4593) RG10J3P0669R661 CAUCAUCAUCATGACCCAAGCCTGGTTGAT (SEQ ID NO. 4594) RG11_HP0670R662 CAUCAUCAUAAGAGGTTTGAGTGATTACAT (SEQ ID NO. 4595) RG12_HP0671R663 CAUCAUCAUTGTTGTATGAAAAAGTTTGTA (SEQ ID NO. 4596) RH1_HP0672R664 CAUCAUCAUTTAAGCATGTTGTATTCCTCT (SEQ ID NO. 4597) RH2_HP0673R665 CAUCAUCAUAAAATATGGCAGTAAGATTTG (SEQ ID NO. 4598) RH3_HP0674R666 CAUCAUCAUGATCTAAATGCTAGCTTATTG (SEQ ID NO. 4599) RH4_HP0675R667 CAUCAUCAUAATGAAACACCCCCTAGAAGA (SEQ ID NO. 4600) RH5_HP0676R668 CAUCAUCAUATGGTTTTATACCACTACTAT (SEQ ID NO. 4601) RH6_HP0677R669 CAUCAUCAUATGGATATTTATGCGTTATAT (SEQ ID NO. 4602) RH7_HP0678R670 CAUCAUCAUGGGGGTTTGGTTTGGATGACG (SEQ ID NO. 4603) RH8_HP0679R671 CAUCAUCAUATGCTTTTTGCGATGATTGGT (SEQ ID NO. 
4604) RH9_HP0680R672 CAUCAUCAUTTGATTACAGTGGTTAAACGA (SEQ ID NO. 4605)
RHX0_HP068XR673 CAUCAUCAUTATGGATACTACGAAAGAGAA (SEQ ID NO. 4606) RH11_HP0682R674 CAUCAUCAUGTGTGCCAAACATGCCTTGAA (SEQ ID NO. 4607) RHX2_HP0683R675 CAUCAUCAUTTAATGCTTTCTGTCATCATA (SEQ ID NO. 4608) SAX_HP0685S676 CUACUACUACGCTTGAATGCTTGTTAATAT (SEQ ID NO. 4609) SA2_HP0686S677 CUACUACUAGTTTAAAACTCATAATTCAAA (SEQ ID NO. 4610) SA3_HP0687S678 CUACUACUATTTAAACCAAAATTTGAGTCG (SEQ ID NO. 4611) SA4_HP0688S679 CUACUACUACATTTAGATTCAACCCCTAAC (SEQ ID NO. 4612) SA5_HP0689S680 CUACUACUATTTAAGCAAACTCCCTAATCT (SEQ ID NO. 4613) SA6JHP0690S681 CUACUACUATTCTCCTTATTTTTGTTCAAC (SEQ ID NO. 4614) SA7_HP0691S682 CUACUACUACTCTCTCATTTCGTGCTCCTT (SEQ ID NO. 4615) SA8_HP0692S683 CUACUACUACTATAAGCGCACCTCAAATTC (SEQ ID NO. 4616) SA9_HP0693S684 CUACUACUAAAAAACTCACACTAAAAAATA (SEQ ID NO. 4617) SA10_HP0694S685 CUACUACUAGCCCCTACAAAAAATAAATAG (SEQ ID NO. 4618) SA11_HP0695S686 CUACUACUACTCCTTTATTTAATTTCTTTC (SEQ ID NO. 4619) SA12_HP0696S687 CUACUACUATTTATTTCTCATCAACCAACA (SEQ ID NO. 4620) SB1_HP0697S688 CUACUACUACAATCTTAGCGTCTTTCTGGG (SEQ ID NO. 4621) SB2_HP0698S689 CUACUACUAAAGTATAGTGATAGAACTCAT (SEQ ID NO. 4622) SB3_HP0699S690 CUACUACUAATCACTCATGGCTTATTTTGT (SEQ ID NO. 4623) SB4_HP0700S691 CUACUACUAGTGTCTCCCTAAAATTTAATT (SEQ ID NO. 4624) SB5_HP0701S692 CUACUACUAGCATCACTCAAACAAATTTTG (SEQ ID NO. 4625) SB6_HP0702Ξ693 CUACUACUACGATTTTCATTCTTCTTCCTT (SEQ ID NO. 4626) SB7_HP0703S694 CUACUACUATCCCTACCTTTCCAAAAACAA (SEQ ID NO. 4627) SB8_HP0704S695 CUACUACUAGCTAGCGGCAACAATAACCAT (SEQ ID NO. 4628) SB9_HP0705S696 CUACUACUACTATTTCAATTCCAAAGCTAA (SEQ ID NO. 4629) SB10_HP0706S697 CUACUACUATTAAAAAGTGTAGTTATACCC (SEQ ID NO. 4630) SB11_HP0707S698 CUACUACUATACTCATGGCTTGAATTGAAA (SEQ ID NO. 4631) SB12_HP0708S699 CUACUACUATTCACTTAAAACGCAATATCG (SEQ ID NO. 4632) SC1__HP0709S700 CUACUACUATTTACTTAGCGCACTTCTTTA (SEQ ID NO. 4633) SC2 3P0710S701 CUACUACUACTCAAAACACCCACCCATAAT (SEQ ID NO. 4634) SC3_HP0711S702 CUACUACUACATTATTTAAAATCCCTAAAC (SEQ ID NO. 4635) SC4_HP07X2S703 CUACUACUATTATAGCCATAGGCACTTGAA (SEQ ID NO. 
4636) SC5_HP0714S704 CUACUACUATTTCAAGCGCGCATCAAATAG (SEQ ID NO. 4637)
SC6_HP0715S705 CUACUACUAAAGATCGCCATGTCTATACCT (SEQ ID NO. 4638)
SC7_HP0716S706 CUACUACUACCCTTTAAGCGATCTTTATTG (SEQ ID NO. 4639)
SC8_HP0717S707 CUACUACUACGCTCTCATAACAATTCCACG (SEQ ID NO. 4640)
SC9_HP0718S708 CUACUACUACTTTTTAGGTTTTGCTCAATA (SEQ ID NO. 4641)
SC10_HP0719S709 CUACUACUACACTTCTTCACGCATTTTTTT (SEQ ID NO. 4642)
SC11_HP0720S710 CUACUACUATAAATTAAATTTATCTCACTT (SEQ ID NO. 4643)
SC12_HP0721S711 CUACUACUATTTAGTGCTTATCGCTGTGAT (SEQ ID NO. 4644)
SD1_HP0723S712 CUACUACUACTTTCAATACTCTTCAAACAT (SEQ ID NO. 4645)
SD2_HP0724S713 CUACUACUAGGTGATAAAATCTACAAAACT (SEQ ID NO. 4646)
SD3_HP0726S714 CUACUACUATAATGAGAACGAATCCAAACT (SEQ ID NO. 4647)
SD4_HP0727S715 CUACUACUATTCAAACGCTTTTTTGATTCA (SEQ ID NO. 4648)
SD5_HP0728S716 CUACUACUATTTTATAGATCCATTAAACTC (SEQ ID NO. 4649)
SD6_HP0729S717 CUACUACUATTATGAAGTTTCCTTGTGGGT (SEQ ID NO. 4650)
SD7_HP0730S718 CUACUACUATCATGCCTTGCCTTACTTTAA (SEQ ID NO. 4651)
SD8_HP0731S719 CUACUACUATTCAGTTCATGCCTCTTTCAG (SEQ ID NO. 4652)
SD9_HP0732S720 CUACUACUATGTTCTAGCGTTTCATTCTTC (SEQ ID NO. 4653)
SD10_HP0733S721 CUACUACUATCCAACTAATCCTGCAATAAT (SEQ ID NO. 4654)
SD11_HP0734S722 CUACUACUATTTTAAAAAGGGCTTAAAACC (SEQ ID NO. 4655)
SD12_HP0735S723 CUACUACUAGTTAATGGCTTTTTAAGTTTT (SEQ ID NO. 4656)
SE1_HP0736S724 CUACUACUACCTTAAATTCCATAATATTGC (SEQ ID NO. 4657)
SE2_HP0737S725 CUACUACUACTTAAAATTAAAGTTTAATGT (SEQ ID NO. 4658)
SE3_HP0738S726 CUACUACUACATTACTTATTCTTTTGGATT (SEQ ID NO. 4659)
SE4_HP0739S727 CUACUACUACTAAGACTTTGCATGATGATA (SEQ ID NO. 4660)
SE5_HP0740S728 CUACUACUACTAAATGTAATTAGGGGCGTC (SEQ ID NO. 4661)
SE6_HP0741S729 CUACUACUATTCAACCAAGCTCCTTTTCTA (SEQ ID NO. 4662)
SE7_HP0742S730 CUACUACUATTTTTATTCCCTTTTTAAGTG (SEQ ID NO. 4663)
SE8_HP0743S731 CUACUACUACGGTTATGAGCCGAGTGCTCT (SEQ ID NO. 4664)
SE9_HP0745S732 CUACUACUAAACCGCTAATACCACACTACT (SEQ ID NO. 4665)
SE10_HP0746S733 CUACUACUATTTAGCGCTCTAAAAACAAAC (SEQ ID NO. 4666)
SE11_HP0747S734 CUACUACUAACACTCATCCAATTCCCTTTT (SEQ ID NO. 4667)
SE12_HP0748S735 CUACUACUATTAAGAGTATTCATAAACTTC (SEQ ID NO. 4668)
SF1_HP0749S736 CUACUACUATTATACATGCCTAGTCCTCCA (SEQ ID NO. 4669)
SF2_HP0750S737 CUACUACUATTAATTGCGTGCGATGAGTTT (SEQ ID NO. 4670)
SF3_HP0751S738 CUACUACUATTTAGCCTCTTTTATCAAAGA (SEQ ID NO. 4671)
SF4_HP0752S739 CUACUACUATCATTTGCACGGAATTGAATT (SEQ ID NO. 4672)
SF5_HP0753S740 CUACUACUAAAAATGTTAGGCGAGTTCATC (SEQ ID NO. 4673)
SF6_HP0754S741 CUACUACUATTAAGACAAAAATTTTTGAAT (SEQ ID NO. 4674)
SF7_HP0755S742 CUACUACUATCACTTCCTTTTATCGTTGAT (SEQ ID NO. 4675)
SF8_HP0756S743 CUACUACUAATTTTTCACTCTTCTTCATAA (SEQ ID NO. 4676)
SF9_HP0757S744 CUACUACUAATCAAATATAGCGTTTTAACA (SEQ ID NO. 4677)
SF10_HP0758S745 CUACUACUATTGATAGATTAGCCTAATAGT (SEQ ID NO. 4678)
SF11_HP0759S746 CUACUACUACAATAATCAAGCCTTTTTCCC (SEQ ID NO. 4679)
SF12_HP0760S747 CUACUACUAGGGAAAAAGGCTTGATTATTG (SEQ ID NO. 4680)
SG1_HP0761S748 CUACUACUAGCCCTCTAGCGTAATAGATCT (SEQ ID NO. 4681)
SG2_HP0762S749 CUACUACUATTAATCCTTTAAATTGGAAAT (SEQ ID NO. 4682)
SG3_HP0763S750 CUACUACUATTAGAATTTATTGTCCCACAA (SEQ ID NO. 4683)
SG4_HP0764S751 CUACUACUATTCTAAATTATTTCTTAAGAA (SEQ ID NO. 4684)
SG5_HP0765S752 CUACUACUATCCTTTAGATGTTTATGATAT (SEQ ID NO. 4685)
SG6_HP0766S753 CUACUACUACTTTAAATTTCTATTTTTCCA (SEQ ID NO. 4686)
SG7_HP0767S754 CUACUACUAGGTTTAGGGTAGGCTTATAAA (SEQ ID NO. 4687)
SG8_HP0768S755 CUACUACUACCTACCCCCCTGTGTAGTAAA (SEQ ID NO. 4688)
SG9_HP0769S756 CUACUACUACTTCAGCCATTGGCCCTCTTT (SEQ ID NO. 4689)
SG10_HP0770S757 CUACUACUAAAAATCTTAAAGAGGTTTAAT (SEQ ID NO. 4690)
SG11_HP0771S758 CUACUACUATATTTGAGCTTTAAATTTGAT (SEQ ID NO. 4691)
SG12_HP0772S759 CUACUACUACTAATCATTCTTGCTGAAGAA (SEQ ID NO. 4692)
SH1_HP0773S760 CUACUACUACTACAAATTAACCCTCTGTAA (SEQ ID NO. 4693)
SH2_HP0774S761 CUACUACUAGCTCTTTCCTTAATATTAGTT (SEQ ID NO. 4694)
SH3_HP0775S762 CUACUACUATTATGATTCATAAGCGTCATC (SEQ ID NO. 4695)
SH4_HP0776S763 CUACUACUATTATCAATTTCGTTCATCTAT (SEQ ID NO. 4696)
SH5_HP0777S764 CUACUACUATAAAGGGCTTATTTTACCATA (SEQ ID NO. 4697)
SH6_HP0778S765 CUACUACUACTCAGCGAATCACTAAAAACG (SEQ ID NO. 4698)
SH7_HP0779S766 CUACUACUAAAACTAGAGCCTGAAATTCTC (SEQ ID NO. 4699)
SH8_HP0780S767 CUACUACUATATCCCCTAGCGCCAATTATA (SEQ ID NO. 4700)
SH9_HP0781S768 CUACUACUATCTGAATATTATTCTTTTTTT (SEQ ID NO. 4701)
SH10_HP0782S769 CUACUACUAATAGGGTTTAAAAAAAGAATT (SEQ ID NO. 4702)
SH11_HP0783S770 CUACUACUACTATCAATTAGGCTTGGTAGA (SEQ ID NO. 4703)
SH12_HP0784S771 CUACUACUATTAGCAAAAGTTAGCTAAAAT (SEQ ID NO. 4704)
RA1_HP0685R676 CAUCAUCAUGGTGTTTTCTTTTTTAAGGAC (SEQ ID NO. 4705)
RA2_HP0686R677 CAUCAUCAUTGGTAAATGAAAAGAATTTTA (SEQ ID NO. 4706)
RA3_HP0687R678 CAUCAUCAUAACCCAATGAAAGAAATCACT (SEQ ID NO. 4707)
RA4_HP0688R679 CAUCAUCAUAATGTTGTGCGTGTTTGATAT (SEQ ID NO. 4708)
RA5_HP0689R680 CAUCAUCAUCTATGATGAATATTCCTGGTA (SEQ ID NO. 4709)
RA6_HP0690R681 CAUCAUCAUATGAATGAAGTGGTTGTAGTG (SEQ ID NO. 4710)
RA7_HP0691R682 CAUCAUCAUAGGAGAATGAGATGAATAAGG (SEQ ID NO. 4711)
RA8_HP0692R683 CAUCAUCAUAATGAGAGAGGCTATCATTAA (SEQ ID NO. 4712)
RA9_HP0693R684 CAUCAUCAUATGTTTTTATTAAGGCATTTG (SEQ ID NO. 4713)
RA10_HP0694R685 CAUCAUCAUTTTTGTTTTTCAAATTTATTT (SEQ ID NO. 4714)
RA11_HP0695R686 CAUCAUCAUCTAAATGAAAGACGCAAGAGT (SEQ ID NO. 4715)
RA12_HP0696R687 CAUCAUCAUTCAAAATGGCAAATTTATTGA (SEQ ID NO. 4716)
RB1_HP0697R688 CAUCAUCAUGAATGGTCATGTCAAAATACA (SEQ ID NO. 4717)
RB2_HP0698R689 CAUCAUCAUAGGAAAAGCTATGAAAGTAAC (SEQ ID NO. 4718)
RB3_HP0699R690 CAUCAUCAUTTTTTTGACTAAAAAATTCAT (SEQ ID NO. 4719)
RB4_HP0700R691 CAUCAUCAUGCCATGAGTGATTTCGAAGTC (SEQ ID NO. 4720)
RB5_HP0701R692 CAUCAUCAUACACATGCAAGATAATTCAGT (SEQ ID NO. 4721)
RB6_HP0702R693 CAUCAUCAUTGAGTGATGCGTTTCTTTTCA (SEQ ID NO. 4722)
RB7_HP0703R694 CAUCAUCAUATGAAAATCGCCATTGTAGAA (SEQ ID NO. 4723)
RB8_HP0704R695 CAUCAUCAUAATGGTCTCTAACTCCCTTAA (SEQ ID NO. 4724)
RB9_HP0705R696 CAUCAUCAUAAAACCATTATGGATAAGATC (SEQ ID NO. 4725)
RB10_HP0706R697 CAUCAUCAUATGGAATTTATGAAAAAGTTT (SEQ ID NO. 4726)
RB11_HP0707R698 CAUCAUCAUGGGATTGTTTGCAAGAAATAG (SEQ ID NO. 4727)
RB12_HP0708R699 CAUCAUCAUTGAGTAAGCCATGAATATCAA (SEQ ID NO. 4728)
RC1_HP0709R700 CAUCAUCAUAGGAAATAATACAGATGAGAA (SEQ ID NO. 4729)
RC2_HP0710R701 CAUCAUCAUGAGATAACCATGAGAAAACTA (SEQ ID NO. 4730)
RC3_HP0711R702 CAUCAUCAUATGTATGCGGCTCATCCTATT (SEQ ID NO. 4731)
RC4_HP0712R703 CAUCAUCAUAATGAACTATAAAGAATTATT (SEQ ID NO. 4732)
RC5_HP0714R704 CAUCAUCAUGTATAGACATGGCGATCTTAC (SEQ ID NO. 4733)
RC6_HP0715R705 CAUCAUCAUGGGTCGTTTGAATGGATATTT (SEQ ID NO. 4734)
RC7_HP0716R706 CAUCAUCAUTGTTATGAGAGCGAATTTAGA (SEQ ID NO. 4735)
RC8_HP0717R707 CAUCAUCAUATGCAAGTTTTAGCGTTAAAA (SEQ ID NO. 4736)
RC9_HP0718R708 CAUCAUCAUAAGATGTTTGTGGTTTTTATA (SEQ ID NO. 4737)
RC10_HP0719R709 CAUCAUCAUATGGTAAAATGCCAAAATTTG (SEQ ID NO. 4738)
RC11_HP0720R710 CAUCAUCAUCAAAAAAATGCGTGAAGAAGT (SEQ ID NO. 4739)
RC12_HP0721R711 CAUCAUCAUAATGAAAAAAGCGTTGAAAAT (SEQ ID NO. 4740)
RD1_HP0723R712 CAUCAUCAUGGTTATGGCTCAAAATTTACC (SEQ ID NO. 4741)
RD2_HP0724R713 CAUCAUCAUGGTTAATGGTTGATGCCTTTT (SEQ ID NO. 4742)
RD3_HP0726R714 CAUCAUCAUCCGTTATGGGTAGAATTGAAT (SEQ ID NO. 4743)
RD4_HP0727R715 CAUCAUCAUGAGAATAAATGGACTTTAAAA (SEQ ID NO. 4744)
RD5_HP0728R716 CAUCAUCAUAGTGCAAGATTTTAAAACCCA (SEQ ID NO. 4745)
RD6_HP0729R717 CAUCAUCAUATGAACCACCTTTTAATCCTC (SEQ ID NO. 4746)
RD7_HP0730R718 CAUCAUCAUGATGTTTTCTAAAATTTGCTC (SEQ ID NO. 4747)
RD8_HP0731R719 CAUCAUCAUATGAAGAATGAAACGCTAGAA (SEQ ID NO. 4748)
RD9_HP0732R720 CAUCAUCAUAACGCTTGGAATCCTATGGGT (SEQ ID NO. 4749)
RD10_HP0733R721 CAUCAUCAUATGAAAAACATTTATCTTGAT (SEQ ID NO. 4750)
RD11_HP0734R722 CAUCAUCAUACATGCAAGTTAAAGAAAACA (SEQ ID NO. 4751)
RD12_HP0735R723 CAUCAUCAUACAATGCATTATTCTTATGAA (SEQ ID NO. 4752)
RE1_HP0736R724 CAUCAUCAUATTTTAATGTTGCTTTTCACT (SEQ ID NO. 4753)
RE2_HP0737R725 CAUCAUCAUATTGGATAAATTTAGTCTTCG (SEQ ID NO. 4754)
RE3_HP0738R726 CAUCAUCAUGTGGAGTTTTGCGTTTTATTT (SEQ ID NO. 4755)
RE4_HP0739R727 CAUCAUCAUATGGCCAAACGCAGTATCGCT (SEQ ID NO. 4756)
RE5_HP0740R728 CAUCAUCAUTCATCATGCAAAGTCTTAGTT (SEQ ID NO. 4757)
RE6_HP0741R729 CAUCAUCAUGGAAATGAACATGCAACATTT (SEQ ID NO. 4758)
RE7_HP0742R730 CAUCAUCAUCGATGAAAGCGCGTGGGTTTA (SEQ ID NO. 4759)
RE8_HP0743R731 CAUCAUCAUAACGCATGGCATTAGACAAAA (SEQ ID NO. 4760)
RE9_HP0745R732 CAUCAUCAUTTGGATGCAAAAAGTTTTCAT (SEQ ID NO. 4761)
RE10_HP0746R733 CAUCAUCAUTTAGATGAGGTCTTGGATGAA (SEQ ID NO. 4762)
RE11_HP0747R734 CAUCAUCAUAGGGTTTTAATGCCCCATTTT (SEQ ID NO. 4763)
RE12_HP0748R735 CAUCAUCAUAGGGAATTGGATGAGTGTGAT (SEQ ID NO. 4764)
RF1_HP0749R736 CAUCAUCAUTATGAATACTCTTAAAAAGCA (SEQ ID NO. 4765)
RF2_HP0750R737 CAUCAUCAUATGTATAAATTAGGGGTGTTT (SEQ ID NO. 4766)
RF3_HP0751R738 CAUCAUCAUTGGAGGTTTTTATGGTCAATG (SEQ ID NO. 4767)
RF4_HP0752R739 CAUCAUCAUAGGTAAAACATGGCAATAGGT (SEQ ID NO. 4768)
RF5_HP0753R740 CAUCAUCAUGAACATTATGCAATACGCTAA (SEQ ID NO. 4769)
RF6_HP0754R741 CAUCAUCAUGATGAACTCGCCTAACATTTT (SEQ ID NO. 4770)
RF7_HP0755R742 CAUCAUCAUTGGTTGTTTAAAAACGATTTT (SEQ ID NO. 4771)
RF8_HP0756R743 CAUCAUCAUGAGATGAAAGATTACGAAGAC (SEQ ID NO. 4772)
RF9_HP0757R744 CAUCAUCAUGATTATGAAGAAGAGTGAAAA (SEQ ID NO. 4773)
RF10_HP0758R745 CAUCAUCAUATGCTAGAAAATAGCTCTATA (SEQ ID NO. 4774)
RF11_HP0759R746 CAUCAUCAUTTTTAGATGTGCTGGTGGTCG (SEQ ID NO. 4775)
RF12_HP0760R747 CAUCAUCAUAGAATGCAAATTGCAACAGCA (SEQ ID NO. 4776)
RG1_HP0761R748 CAUCAUCAUTTGGCTTGCAAGCTTTTGTTT (SEQ ID NO. 4777)
RG2_HP0762R749 CAUCAUCAUTAATGAGAATTAAGGCTTATT (SEQ ID NO. 4778)
RG3_HP0763R750 CAUCAUCAUATGTTTAATTTTTTCAAAAAA (SEQ ID NO. 4779)
RG4_HP0764R751 CAUCAUCAUACATGCCATTGCCATTTATTA (SEQ ID NO. 4780)
RG5_HP0765R752 CAUCAUCAUGAAATCATGGAAGAACAAAAG (SEQ ID NO. 4781)
RG6_HP0766R753 CAUCAUCAUAGAGGGGGTAATTCATGTTAG (SEQ ID NO. 4782)
RG7_HP0767R754 CAUCAUCAUAGTTCCTTATGGAAAAAGTCC (SEQ ID NO. 4783)
RG8_HP0768R755 CAUCAUCAUAGGGAGTGTTAGTAGATAGTT (SEQ ID NO. 4784)
RG9_HP0769R756 CAUCAUCAUTTGAAAAACCCTATCATTGAT (SEQ ID NO. 4785)
RG10_HP0770R757 CAUCAUCAUAATGGCTGAAGAAGAAAAAAC (SEQ ID NO. 4786)
RG11_HP0771R758 CAUCAUCAUGATTAGGTTGTAGATGAATTT (SEQ ID NO. 4787)
RG12_HP0772R759 CAUCAUCAUAGTGCTTGTGAGGTTAGGGGT (SEQ ID NO. 4788)
RH1_HP0773R760 CAUCAUCAUAGAGCTATGGTATCAACACTC (SEQ ID NO. 4789)
RH2_HP0774R761 CAUCAUCAUGAATGAACATGGAACAAAAAA (SEQ ID NO. 4790)
RH3_HP0775R762 CAUCAUCAUGATAGATGAACGAAATTGATA (SEQ ID NO. 4791)
RH4_HP0776R763 CAUCAUCAUACATTGGATAGAACTCAAAAT (SEQ ID NO. 4792)
RH5_HP0777R764 CAUCAUCAUGGATCATGCAAGCAAAGATAA (SEQ ID NO. 4793)
RH6_HP0778R765 CAUCAUCAUAGTGCGTTTTGGTAAAATTGA (SEQ ID NO. 4794)
RH7_HP0779R766 CAUCAUCAUAGAAATGATGAAAGATTTTTT (SEQ ID NO. 4795)
RH8_HP0780R767 CAUCAUCAUAGAGTGGCGGTGAAAAAAATC (SEQ ID NO. 4796)
RH9_HP0781R768 CAUCAUCAUATGCCATACGCCTTAAGAAAA (SEQ ID NO. 4797)
RH10_HP0782R769 CAUCAUCAUGTGTTAAAATTTCAAAAATTA (SEQ ID NO. 4798)
RH11_HP0783R770 CAUCAUCAUGTGGCGTTGCCCTTTTATTTT (SEQ ID NO. 4799)
RH12_HP0784R771 CAUCAUCAUGTGTTGTTTACAGTTATTTTT (SEQ ID NO. 4800)
SA1_HP0785S772 CUACUACUATCATTGGCGCACAATATCAAT (SEQ ID NO. 4801)
SA2_HP0786S773 CUACUACUAAAGGATCTATTTGGCAAATAA (SEQ ID NO. 4802)
SA3_HP0787S774 CUACUACUATTATTCATTCCTTAACACGCT (SEQ ID NO. 4803)
SA4_HP0788S775 CUACUACUAAGGGTTTAAAGCTTCCACACG (SEQ ID NO. 4804)
SA5_HP0789S776 CUACUACUACTATTTTAGGGCTTGTTTTTC (SEQ ID NO. 4805)
SA6_HP0790S777 CUACUACUATTACGCAAGCTCTTTGCTATT (SEQ ID NO. 4806)
SA7_HP0791S778 CUACUACUACAAGGCTTTAAGCTCTCATCG (SEQ ID NO. 4807)
SA8_HP0792S779 CUACUACUAAATCCTTTTAAGAAATCTTTC (SEQ ID NO. 4808)
SA9_HP0793S780 CUACUACUATGTTACTCGTGTTTTTGTTTT (SEQ ID NO. 4809)
SA10_HP0794S781 CUACUACUATCACTTCACGTTTTTCTGTAA (SEQ ID NO. 4810)
SA11_HP0795S782 CUACUACUATTAACCCGCTTGAATTTTTTG (SEQ ID NO. 4811)
SA12_HP0796S783 CUACUACUAAAAACTAAAAGTCATAAGTTG (SEQ ID NO. 4812)
SB1_HP0797S784 CUACUACUATGTCTTTATCGGTTTCTTTTG (SEQ ID NO. 4813)
SB2_HP0798S785 CUACUACUATTTTCTATTTTTTAGCGTTAT (SEQ ID NO. 4814)
SB3_HP0799S786 CUACUACUACGGCATTTTTTATCCTATTGT (SEQ ID NO. 4815)
SB4_HP0800S787 CUACUACUAACTCCTTTTTAAGCTAAAAGC (SEQ ID NO. 4816)
SB5_HP0801S788 CUACUACUAACCTAGCCCCCACAAACCGGT (SEQ ID NO. 4817)
SB6_HP0802S789 CUACUACUAGCACTCATAATTTTCACAATA (SEQ ID NO. 4818)
SB7_HP0803S790 CUACUACUAAGCTCTTAAAATTTGTAAGTC (SEQ ID NO. 4819)
SB8_HP0804S791 CUACUACUACTTAAAGGCTAATCGTTTCTA (SEQ ID NO. 4820)
SB9_HP0805S792 CUACUACUATCTAATGTTTGCCAATTCTGA (SEQ ID NO. 4821)
SB10_HP0806S793 CUACUACUATTTAGGGTTGTAATGTTTTGA (SEQ ID NO. 4822)
SB11_HP0807S794 CUACUACUAGCTCACATACGCTGTGATGCT (SEQ ID NO. 4823)
SB12_HP0808S795 CUACUACUAATTTATTCATTTGACGAAGAA (SEQ ID NO. 4824)
SC1_HP0809S796 CUACUACUATTATTGGATAATGAAATCAGT (SEQ ID NO. 4825)
SC2_HP0810S797 CUACUACUAATTCCTATTGAAAATAAGTTA (SEQ ID NO. 4826)
SC3_HP0811S798 CUACUACUATTTTTTTACTGGCTGATGATT (SEQ ID NO. 4827)
SC4_HP0812S799 CUACUACUATTACATTTTATTTTTTTTCAC (SEQ ID NO. 4828)
SC5_HP0813S800 CUACUACUACAATTAAGCCATCCTTGAAAC (SEQ ID NO. 4829)
SC6_HP0814S801 CUACUACUACCCCTTAATCTAAATTTCATA (SEQ ID NO. 4830)
SC7_HP0815S802 CUACUACUATTTAGCCCTCAAATTGAGATT (SEQ ID NO. 4831)
SC8_HP0816S803 CUACUACUATCATTCTTGCTGTTTGTGGGG (SEQ ID NO. 4832)
SC9_HP0817S804 CUACUACUAGCAGGTTAAAACTGAAAGGTG (SEQ ID NO. 4833)
SC10_HP0818S805 CUACUACUAATGCTTTATAACCCTAATCTT (SEQ ID NO. 4834)
SC11_HP0819S806 CUACUACUATAAAATCACTGCTCTTGTTTA (SEQ ID NO. 4835)
SC12_HP0820S807 CUACUACUACATTCATTCCTCAAAAAAACT (SEQ ID NO. 4836)
SD1_HP0821S808 CUACUACUATGTTTTTTCCTTTAGATTCGT (SEQ ID NO. 4837)
SD2_HP0822S809 CUACUACUATCAATTTTCCAAACGAATCAT (SEQ ID NO. 4838)
SD3_HP0823S810 CUACUACUAAAATCTAAAAAGTGATATTTT (SEQ ID NO. 4839)
SD4_HP0824S811 CUACUACUAGCTATTAGCCTAAAAGTTTGT (SEQ ID NO. 4840)
SD5_HP0825S812 CUACUACUAGATTTAATGGTGTTCTAAATA (SEQ ID NO. 4841)
SD6_HP0826S813 CUACUACUAATTATACAAACTGCCAATATT (SEQ ID NO. 4842)
SD7_HP0827S814 CUACUACUATGTTACTAAGACTTTTTAGGA (SEQ ID NO. 4843)
SD8_HP0828S815 CUACUACUATTAATGCCCTTCATCGGTTAA (SEQ ID NO. 4844)
SD9_HP0829S816 CUACUACUATTTCTTTTACAATTCACCCAT (SEQ ID NO. 4845)
SD10_HP0830S817 CUACUACUACCTTAATCTAATTTTAAATCT (SEQ ID NO. 4846)
SD11_HP0831S818 CUACUACUAAGATAAAACGAGCTTATAAGA (SEQ ID NO. 4847)
SD12_HP0832S819 CUACUACUAACCATTTTAAGATTTGATATT (SEQ ID NO. 4848)
SE1_HP0833S820 CUACUACUAGCCTAGCTTTGGCATTCTAAA (SEQ ID NO. 4849)
SE2_HP0834S821 CUACUACUAACGCATAAAAAATTTAATTTT (SEQ ID NO. 4850)
SE3_HP0835S822 CUACUACUAATTCACTTACCTTCTTCAACT (SEQ ID NO. 4851)
SE4_HP0836S823 CUACUACUACTTGGCTAGAGCCAGGGCTTG (SEQ ID NO. 4852)
SE5_HP0837S824 CUACUACUAATGATTATAGAGTAAGAGGGT (SEQ ID NO. 4853)
SE6_HP0838S825 CUACUACUATTCATTGGAGCTCGCTGTCAG (SEQ ID NO. 4854)
SE7_HP0839S826 CUACUACUATTACCAGCGATAGCCTAAAGA (SEQ ID NO. 4855)
SE8_HP0840S827 CUACUACUATCATAATAATTTCAACAAATC (SEQ ID NO. 4856)
SE9_HP0841S828 CUACUACUAGCATTAAAGGGCGTTGTCTTT (SEQ ID NO. 4857)
SE10_HP0842S829 CUACUACUACTACCCTATCGTGTCCAACAA (SEQ ID NO. 4858)
SE11_HP0843S830 CUACUACUACTCACGCATTGTTTAAAAGTT (SEQ ID NO. 4859)
SE12_HP0844S831 CUACUACUATCAAACAAGCTCTTTGATACT (SEQ ID NO. 4860)
SF1_HP0845S832 CUACUACUATTTCACCACTCTTCCTTAATA (SEQ ID NO. 4861)
SF2_HP0846S833 CUACUACUACTATTTTTCATTAACCTCATC (SEQ ID NO. 4862)
SF3_HP0847S834 CUACUACUACCCTTTCTAGCTCTGCTTCGC (SEQ ID NO. 4863)
SF4_HP0848S835 CUACUACUAACTCATGTTATTCCTTGTTTT (SEQ ID NO. 4864)
SF5_HP0849S836 CUACUACUATTTTCCTATTGTTCCAGTTCC (SEQ ID NO. 4865)
SF6_HP0850S837 CUACUACUATTGCTCTATTTTATGCATTTT (SEQ ID NO. 4866)
SF7_HP0851S838 CUACUACUAAAAGCCTTTATTGATTATAAG (SEQ ID NO. 4867)
SF8_HP0852S839 CUACUACUATTAAAAAGAAAGCGATGGGTT (SEQ ID NO. 4868)
SF9_HP0853S840 CUACUACUATGCGGTTTCATTTTTTGCTCG (SEQ ID NO. 4869)
SF10_HP0854S841 CUACUACUATTTAGATTGCGTCTCCATTGA (SEQ ID NO. 4870)
SF11_HP0855S842 CUACUACUATCCTAAAAATTAAAATACAAA (SEQ ID NO. 4871)
SF12_HP0856S843 CUACUACUAAGCCCTAATTTTGAATTTCTT (SEQ ID NO. 4872)
SG1_HP0857S844 CUACUACUACTAATTTTTATGAGCGAAATG (SEQ ID NO. 4873)
SG2_HP0858S845 CUACUACUATCAATCATTGCATGTCCTTTT (SEQ ID NO. 4874)
SG3_HP0859S846 CUACUACUATTTTTTCATGCGCACTGTCCT (SEQ ID NO. 4875)
SG4_HP0860S847 CUACUACUATAACGCATTCTTGTCCTTATT (SEQ ID NO. 4876)
SG5_HP0861S848 CUACUACUATTGTTAGTGTTCATGCGAATG (SEQ ID NO. 4877)
SG6_HP0862S849 CUACUACUATCATTTGCATTCTAGTATCCC (SEQ ID NO. 4878)
SG7_HP0863S850 CUACUACUAGCCTACTTCTTATTCATTTCT (SEQ ID NO. 4879)
SG8_HP0864S851 CUACUACUAAATCAATGCGTTCCTTGAATG (SEQ ID NO. 4880)
SG9_HP0865S852 CUACUACUACTTAATGCTCATGCCTTACTC (SEQ ID NO. 4881)
SG10_HP0866S853 CUACUACUAGATTTTAATTTTCATCAAAAC (SEQ ID NO. 4882)
SG11_HP0867S854 CUACUACUAAGTAGAGATTATAGTAAAAAT (SEQ ID NO. 4883)
SG12_HP0868S855 CUACUACUACGTGGGCATTATTTGTCTTTC (SEQ ID NO. 4884)
SH1_HP0869S856 CUACUACUACGGTTATTCCGCTAACATTTC (SEQ ID NO. 4885)
SH2_HP0870S857 CUACUACUACTTATTGCTTAAGATTCAATA (SEQ ID NO. 4886)
SH3_HP0871S858 CUACUACUAACTCGCTTTAGTGCAAAATCG (SEQ ID NO. 4887)
SH4_HP0872S859 CUACUACUATTCCTAAGCTTTTTTGAGGAA (SEQ ID NO. 4888)
SH5_HP0873S860 CUACUACUAATCATTTTTTAGAAAAAATCA (SEQ ID NO. 4889)
SH6_HP0874S861 CUACUACUATTAGACAATCGTATAGGAC C (SEQ ID NO. 4890)
SH7_HP0875S862 CUACUACUATTACTTTTTCTTTTTTGTGTG (SEQ ID NO. 4891)
SH8_HP0876S863 CUACUACUAGCTCCATTAACTACCATTTGT (SEQ ID NO. 4892)
SH9_HP0877S864 CUACUACUAAGTGGCTTACAAATGGTAGTT (SEQ ID NO. 4893)
SH10_HP0878S865 CUACUACUATCAAAGGGTGGGGTGGTATAA (SEQ ID NO. 4894)
SH11_HP0879S866 CUACUACUATTAGTTTAAAGCCCAGTTGGT (SEQ ID NO. 4895)
SH12_HP0880S867 CUACUACUAGTTCATTGAGCCATTAACGCC (SEQ ID NO. 4896)
RA1_HP0785R772 CAUCAUCAUTAATGAGAGCTTTTTTAAAGA (SEQ ID NO. 4897)
RA2_HP0786R773 CAUCAUCAUATGATAAAAGCAATCATTGGA (SEQ ID NO. 4898)
RA3_HP0787R774 CAUCAUCAUTTTGCCAAATAGATCCTTAAT (SEQ ID NO. 4899)
RA4_HP0788R775 CAUCAUCAUATGAACTATAAAGTTGCATCT (SEQ ID NO. 4900)
RA5_HP0789R776 CAUCAUCAUAGGCATAACATGAATTACGAA (SEQ ID NO. 4901)
RA6_HP0790R777 CAUCAUCAUGGGGCAAAATGCATAAAATAG (SEQ ID NO. 4902)
RA7_HP0791R778 CAUCAUCAUATGCAAGAATACCACATTCAT (SEQ ID NO. 4903)
RA8_HP0792R779 CAUCAUCAUCGAGTAACAACCATGATTAAC (SEQ ID NO. 4904)
RA9_HP0793R780 CAUCAUCAUGCATGGCGTTATTAGAGATTA (SEQ ID NO. 4905)
RA10_HP0794R781 CAUCAUCAUGAGAGATGATGGGATACATTC (SEQ ID NO. 4906)
RA11_HP0795R782 CAUCAUCAUAAAAATGAATCTTGAAGTGAA (SEQ ID NO. 4907)
RA12_HP0796R783 CAUCAUCAUAGAACGCTTGAACAAACTGCT (SEQ ID NO. 4908)
RB1_HP0797R784 CAUCAUCAUTAGAACGATGAAAGCAAATAA (SEQ ID NO. 4909)
RB2_HP0798R785 CAUCAUCAUATGCCGCTCACTCATTTGAAT (SEQ ID NO. 4910)
RB3_HP0799R786 CAUCAUCAUAAATGCAAACGATTCATATAG (SEQ ID NO. 4911)
RB4_HP0800R787 CAUCAUCAUGGTGTTAAAAATCATTCAAGG (SEQ ID NO. 4912)
RB5_HP0801R788 CAUCAUCAUAATGATGGTAGAAGTGCGATT (SEQ ID NO. 4913)
RB6_HP0802R789 CAUCAUCAUAGTATCCTTGAAACGATTAGA (SEQ ID NO. 4914)
RB7_HP0803R790 CAUCAUCAUCCTGATGAGAGTAATAATAGA (SEQ ID NO. 4915)
RB8_HP0804R791 CAUCAUCAUTAGAATGATCTTAAAACGAGT (SEQ ID NO. 4916)
RB9_HP0805R792 CAUCAUCAUATGCGTGTTTTTATTATTCAT (SEQ ID NO. 4917)
RB10_HP0806R793 CAUCAUCAUATGTTAGATGATATTCCTATT (SEQ ID NO. 4918)
RB11_HP0807R794 CAUCAUCAUATGAATGGTTATTTGAGAGTA (SEQ ID NO. 4919)
RB12_HP0808R795 CAUCAUCAUATGATTGGCATAGATATTGTC (SEQ ID NO. 4920)
RC1_HP0809R796 CAUCAUCAUCATGGCAGAAGAACAAGAAAA (SEQ ID NO. 4921)
RC2_HP0810R797 CAUCAUCAUATGCCAAATCATCAGCCAGTA (SEQ ID NO. 4922)
RC3_HP0811R798 CAUCAUCAUGGGTTTTTTGAAAATTTTTCT (SEQ ID NO. 4923)
RC4_HP0812R799 CAUCAUCAUGAAAATCGTGCGATCGTTTGG (SEQ ID NO. 4924)
RC5_HP0813R800 CAUCAUCAUGTTTGGAGTTTGGAAATTTTA (SEQ ID NO. 4925)
RC6_HP0814R801 CAUCAUCAUAATTGTTAAGCCGGCTAGAAA (SEQ ID NO. 4926)
RC7_HP0815R802 CAUCAUCAUGTTTTGGATTTATCAACCATA (SEQ ID NO. 4927)
RC8_HP0816R803 CAUCAUCAUAAGATGGCTAAGAAAAACAAA (SEQ ID NO. 4928)
RC9_HP0817R804 CAUCAUCAUGAATGAATCGCATGAATAAAA (SEQ ID NO. 4929)
RC10_HP0818R805 CAUCAUCAUAGTTTGGTTATAATGCTAGGC (SEQ ID NO. 4930)
RC11_HP0819R806 CAUCAUCAUATGAAAGAAATCGTTACAATA (SEQ ID NO. 4931)
RC12_HP0820R807 CAUCAUCAUATGGAACAAAATATTTTCTCC (SEQ ID NO. 4932)
RD1_HP0821R808 CAUCAUCAUTGAATGGCTGATTTATTGTCC (SEQ ID NO. 4933)
RD2_HP0822R809 CAUCAUCAUCATGAAAAAAAGATTGAATAT (SEQ ID NO. 4934)
RD3_HP0823R810 CAUCAUCAUTGAATGCGCTTTTTGAACAAC (SEQ ID NO. 4935)
RD4_HP0824R811 CAUCAUCAUTAATGAGTCACTATATTGAAT (SEQ ID NO. 4936)
RD5_HP0825R812 CAUCAUCAUATGATAGATTGCGCGATTATT (SEQ ID NO. 4937)
RD6_HP0826R813 CAUCAUCAUTTTGCGTGTTTTTGCCATTTC (SEQ ID NO. 4938)
RD7_HP0827R814 CAUCAUCAUGAATTTTATCTTGAGAAACAT (SEQ ID NO. 4939)
RD8_HP0828R815 CAUCAUCAUATGGAACACAGAGTATTTACT (SEQ ID NO. 4940)
RD9_HP0829R816 CAUCAUCAUAAGGATAGAAAATGAGAATTT (SEQ ID NO. 4941)
RD10_HP0830R817 CAUCAUCAUGCTAAAAACATGATCACTTTA (SEQ ID NO. 4942)
RD11_HP0831R818 CAUCAUCAUGACAATATCAAATCTTAAAAT (SEQ ID NO. 4943)
RD12_HP0832R819 CAUCAUCAUTTATGTGGATCACCCAAGAAA (SEQ ID NO. 4944)
RE1_HP0833R820 CAUCAUCAUCGTTGTTTTTAGTCAAAAAAA (SEQ ID NO. 4945)
RE2_HP0834R821 CAUCAUCAUAGATGAATACAAGCCATAAAA (SEQ ID NO. 4946)
RE3_HP0835R822 CAUCAUCAUCATGAACAAAGCGGAATTTAT (SEQ ID NO. 4947)
RE4_HP0836R823 CAUCAUCAUAATGCCCATGCGTTTGCACAC (SEQ ID NO. 4948)
RE5_HP0837R824 CAUCAUCAUCCTAATAATGTCATTGAATTG (SEQ ID NO. 4949)
RE6_HP0838R825 CAUCAUCAUCTCTATAATCATGCGGTATTT (SEQ ID NO. 4950)
RE7_HP0839R826 CAUCAUCAUATGAAAAACTTTTCCCCACTT (SEQ ID NO. 4951)
RE8_HP0840R827 CAUCAUCAUACATGCCAAATCATCAAAACA (SEQ ID NO. 4952)
RE9_HP0841R828 CAUCAUCAUTATTATGAATTTTTTAGAAGA (SEQ ID NO. 4953)
RE10_HP0842R829 CAUCAUCAUCCCTTTAATGCTAGAAGCCCT (SEQ ID NO. 4954)
RE11_HP0843R830 CAUCAUCAUTTGTTTGATGCGGATTGTTTG (SEQ ID NO. 4955)
RE12_HP0844R831 CAUCAUCAUGGAAGAGTGGTGAAAATTTAC (SEQ ID NO. 4956)
RF1_HP0845R832 CAUCAUCAUCGCCGTTTAATGGATTTTTGT (SEQ ID NO. 4957)
RF2_HP0846R833 CAUCAUCAUGCCTTGCAAGTGATCCACCAA (SEQ ID NO. 4958)
RF3_HP0847R834 CAUCAUCAUGGAATAACATGAGTTACGAAA (SEQ ID NO. 4959)
RF4_HP0848R835 CAUCAUCAUATGTTTTGCTTTGAAAATTTG (SEQ ID NO. 4960)
RF5_HP0849R836 CAUCAUCAUGGGGGCAAAATGCATAAAATA (SEQ ID NO. 4961)
RF6_HP0850R837 CAUCAUCAUCATGGAAAACAACCCAAACAA (SEQ ID NO. 4962)
RF7_HP0851R838 CAUCAUCAUTTGAATGGTATTTGACAGAAC (SEQ ID NO. 4963)
RF8_HP0852R839 CAUCAUCAUAATGAAACCGCAAGACATTGA (SEQ ID NO. 4964)
RF9_HP0853R840 CAUCAUCAUAGATGCTACAAACCATCAACT (SEQ ID NO. 4965)
RF10_HP0854R841 CAUCAUCAUCATGTCATTGAAAGTGTTTGA (SEQ ID NO. 4966)
RF11_HP0855R842 CAUCAUCAUAATGCTAGCTTCCATCATCTC (SEQ ID NO. 4967)
RF12_HP0856R843 CAUCAUCAUATGGAATTTTATAAAAAACAA (SEQ ID NO. 4968)
RG1_HP0857R844 CAUCAUCAUGACATGCAATGATTGATAATT (SEQ ID NO. 4969)
RG2_HP0858R845 CAUCAUCAUCGCATGAAAAAAATCTTAGTC (SEQ ID NO. 4970)
RG3_HP0859R846 CAUCAUCAUAGAATGCGTTATATTGATGAT (SEQ ID NO. 4971)
RG4_HP0860R847 CAUCAUCAUCATTCGCATGAACACTAACAA (SEQ ID NO. 4972)
RG5_HP0861R848 CAUCAUCAUGAATGCAAATGATGCACAATT (SEQ ID NO. 4973)
RG6_HP0862R849 CAUCAUCAUATGCCAGCTAGGCAATCTTTT (SEQ ID NO. 4974)
RG7_HP0863R850 CAUCAUCAUATGAATAAACCATTTTTAATC (SEQ ID NO. 4975)
RG8_HP0864R851 CAUCAUCAUGGCATGAGCATTAAGGAAAAT (SEQ ID NO. 4976)
RG9_HP0865R852 CAUCAUCAUATGAAAATTAAAATCCAAAAA (SEQ ID NO. 4977)
RG10_HP0866R853 CAUCAUCAUATGAATAAAGAACCTATGAGT (SEQ ID NO. 4978)
RG11_HP0867R854 CAUCAUCAUATGCCCACGATTTTAGTGAGC (SEQ ID NO. 4979)
RG12_HP0868R855 CAUCAUCAUACCGATGCAAGAAGAATTGAA (SEQ ID NO. 4980)
RH1_HP0869R856 CAUCAUCAUGTTTAGTATGCATGAATACTC (SEQ ID NO. 4981)
RH2_HP0870R857 CAUCAUCAUCATGCTTAGGTCTTTATGGTC (SEQ ID NO. 4982)
RH3_HP0871R858 CAUCAUCAUCAAGGAGCAAAAAAGATGAAA (SEQ ID NO. 4983)
RH4_HP0872R859 CAUCAUCAUCATGCAAGATTTACCCCCATG (SEQ ID NO. 4984)
RH5_HP0873R860 CAUCAUCAUTTGTCTTTATGCGAGAATTTT (SEQ ID NO. 4985)
RH6_HP0874R861 CAUCAUCAUTATGAAACGAAGGGATTTTAT (SEQ ID NO. 4986)
RH7_HP0875R862 CAUCAUCAUAAGATGGTTAATAAAGATGTG (SEQ ID NO. 4987)
RH8_HP0876R863 CAUCAUCAUGTTGTTGGATGTTTTTAAGAG (SEQ ID NO. 4988)
RH9_HP0877R864 CAUCAUCAUATGCGTATTTTAGGAATAGAT (SEQ ID NO. 4989)
9RH10_HP0878R865 CAUCAUCAUAATGAAAGCAAACACAATCAT (SEQ ID NO. 4990)
9RH11_HP0879R866 CAUCAUCAUGGATGATGAATGGGATCAGTC (SEQ ID NO. 4991)
9RH12_HP0880R867 CAUCAUCAUGAAAGATAATCAAAAAATGAA (SEQ ID NO. 4992)
10SA1_HP0881S868 CUACUACUAATTTTTTGATTATCTTTCAAT (SEQ ID NO. 4993)
10SA2_HP0883S869 CUACUACUATTTTAAGAGCGCAGTTGTTGT (SEQ ID NO. 4994)
10SA3_HP0884S870 CUACUACUATTTCTTGTTCTAGCCTTTTAT (SEQ ID NO. 4995)
10SA4_HP0885S871 CUACUACUAAAACATTTAAAAACCTTGAAA (SEQ ID NO. 4996)
10SA5_HP0886S872 CUACUACUAGGAGGTGTTTAAAAAAACTTC (SEQ ID NO. 4997)
10SA6_HP0887S873 CUACUACUATATTTAGAAACTATACCTCAT (SEQ ID NO. 4998)
10SA7_HP0888S874 CUACUACUACCTCTACAACGCATACACCAC (SEQ ID NO. 4999)
10SA8_HP0889S875 CUACUACUATCACACCCCCCTAGTTCTAAA (SEQ ID NO. 5000)
10SA9_HP0890S876 CUACUACUATTAAGCATCACGCTTCCTTTT (SEQ ID NO. 5001)
10SA10_HP0891S877 CUACUACUATTTAAACACCCTCGTGTTTCC (SEQ ID NO. 5002)
10SA11_HP0892S878 CUACUACUAGGTGTGGGCGGTTCAAAACAG (SEQ ID NO. 5003)
10SA12_HP0893S879 CUACUACUATTATAAGTTTTCACAATGCTT (SEQ ID NO. 5004)
10SB1_HP0894S880 CUACUACUAAAGGGGTGTGGGTGGATTAAA (SEQ ID NO. 5005)
10SB2_HP0895S881 CUACUACUATTTTTAAGATTGAGCTTCAAC (SEQ ID NO. 5006)
10SB3_HP0896S882 CUACUACUATTAGTAAGCGAACACATAATT (SEQ ID NO. 5007)
10SB4_HP0897S883 CUACUACUAATGGTCAGCTCTCAAGGGGGC (SEQ ID NO. 5008)
10SB5_HP0898S884 CUACUACUAGGGGCTTAGAGCAATTTTCAA (SEQ ID NO. 5009)
10SB6_HP0899S885 CUACUACUATATTATTGCGTTTCGTTCATT (SEQ ID NO. 5010)
10SB7_HP0900S886 CUACUACUATTAAAACGAATGCGTGGACTG (SEQ ID NO. 5011)
10SB8_HP0901S887 CUACUACUACCGCCTATCCTACCTTATCAT (SEQ ID NO. 5012)
10SB9_HP0902S888 CUACUACUATTTTCCTTATTTTTTACTTAA (SEQ ID NO. 5013)
10SB10_HP0905S889 CUACUACUAGCTTTCAAAATCCTTTCATCT (SEQ ID NO. 5014)
10SB11_HP0906S890 CUACUACUAGCTTAAAATCACGCATAAAGA (SEQ ID NO. 5015)
10SB12_HP0907S891 CUACUACUAGTGTCGTTCATGCTGTCTCCT (SEQ ID NO. 5016)
10SC1_HP0908S892 CUACUACUACCTTTTATTTTTTCAAGCTAA (SEQ ID NO. 5017)
10SC2_HP0909S893 CUACUACUATTATTCAAAAAATTCTCCAAA (SEQ ID NO. 5018)
10SC3_HP0910S894 CUACUACUATTTATCCTAGATTTAAAAAGT (SEQ ID NO. 5019)
10SC4_HP0911S895 CUACUACUAACCAATTTTATTTTGAATCGG (SEQ ID NO. 5020)
10SC5_HP0912S896 CUACUACUACTTAGAATGAATACCCATAAG (SEQ ID NO. 5021)
10SC6_HP0913S897 CUACUACUATTTAGAAGGCGTAGCCATAGA (SEQ ID NO. 5022)
10SC7_HP0914S898 CUACUACUATCAAAACTTCAGCGTGAGGTT (SEQ ID NO. 5023)
10SC8_HP0915S899 CUACUACUAATCCATTATTAAAACTTATAG (SEQ ID NO. 5024)
10SC9_HP0916S900 CUACUACUAGCCATCACATTGTTTTGCTCG (SEQ ID NO. 5025)
10SC10_HP0917S901 CUACUACUATCAAGCGATAATCTCTTGAAA (SEQ ID NO. 5026)
10SC11_HP0918S902 CUACUACUATCTTTTTTATCTTATTCTCAC (SEQ ID NO. 5027)
10SC12_HP0919S903 CUACUACUACGCCATCAAATTACTTCAAAT (SEQ ID NO. 5028)
10SD1_HP0920S904 CUACUACUACTATGTGTCTCTGTCTGAAAA (SEQ ID NO. 5029)
10SD2_HP0921S905 CUACUACUAAAATGACTTAATAATGATACA (SEQ ID NO. 5030)
10SD3_HP0922S906 CUACUACUAATGATACCATTAAAAAGCATA (SEQ ID NO. 5031)
10SD4_HP0923S907 CUACUACUACCCCCTAAAAACTATAAACGT (SEQ ID NO. 5032)
10SD5_HP0924S908 CUACUACUATTTAGTTTTTTTGCCTCAAAT (SEQ ID NO. 5033)
10SD6_HP0925S909 CUACUACUATCATGCTTTGATCCTTGAATT (SEQ ID NO. 5034)
10SD7_HP0926S910 CUACUACUATCAAAATTCGTCATTATTTTC (SEQ ID NO. 5035)
10SD8_HP0927S911 CUACUACUAAAAAAATTTTCCATTTAAATC (SEQ ID NO. 5036)
10SD9_HP0928S912 CUACUACUAGAATTATAACCTAAGATTTTA (SEQ ID NO. 5037)
10SD10_HP0929S913 CUACUACUATCATGTCTTGCCTTTAAATAG (SEQ ID NO. 5038)
10SD11_HP0930S914 CUACUACUAGCTTTTTACTCACTTTAACAT (SEQ ID NO. 5039)
10SD12_HP0931S915 CUACUACUACTCATTAAAATTTGCTCGCTA (SEQ ID NO. 5040)
10SE1_HP0932S916 CUACUACUAACCATTTCATTGCCTATAACA (SEQ ID NO. 5041)
10SE2_HP0933S917 CUACUACUACATTTTTAAGGGATCTGGTGT (SEQ ID NO. 5042)
10SE3_HP0934S918 CUACUACUAACTTTTTAAAACCCTTTTTGA (SEQ ID NO. 5043)
10SE4_HP0935S919 CUACUACUATGGATGCACCAATCAAGACAA (SEQ ID NO. 5044)
10SE5_HP0936S920 CUACUACUACGCATCCTTCATGCTGTGGGC (SEQ ID NO. 5045)
10SE6_HP0937S921 CUACUACUATATTATTCTTTGAGCCATATC (SEQ ID NO. 5046)
10SE7_HP0938S922 CUACUACUAGCGAGGTTAAACGCTGTTGGC (SEQ ID NO. 5047)
10SE8_HP0939S923 CUACUACUACCAACTTAATTTAAGATTTTT (SEQ ID NO. 5048)
10SE9_HP0940S924 CUACUACUAATTAGGGCTGGCTGACATCTT (SEQ ID NO. 5049)
10SE10_HP0941S925 CUACUACUATTGCATGGTTTGATTTTAAAT (SEQ ID NO. 5050)
10SE11_HP0942S926 CUACUACUACTTTTTAACATGCCATTTTAC (SEQ ID NO. 5051)
10SE12_HP0943S927 CUACUACUACTTAATCCCTAAAAAATGCAG (SEQ ID NO. 5052)
10SF1_HP0944S928 CUACUACUATTACTTAATGGCTATCGCTTC (SEQ ID NO. 5053)
10SF2_HP0945S929 CUACUACUAATTCAAGAGGATTTAAGATTG (SEQ ID NO. 5054)
10SF3_HP0946S930 CUACUACUATTCAAGTTTTTAGATTTTCAC (SEQ ID NO. 5055)
10SF4_HP0947S931 CUACUACUATTAAAAACGCACCACCATAGT (SEQ ID NO. 5056)
10SF5_HP0948S932 CUACUACUACCTACAAAAACGCTTCTAACT (SEQ ID NO. 5057)
10SF6_HP0949S933 CUACUACUAGCACCTCCTATTTATGGTATG (SEQ ID NO. 5058)
10SF7_HP0950S934 CUACUACUACTTTAAAAAATCTTTGAAGTC (SEQ ID NO. 5059)
10SF8_HP0951S935 CUACUACUAACCTTTCAAAACCCTAAATTC (SEQ ID NO. 5060)
10SF9_HP0952S936 CUACUACUAGTTTGTGCGTTAAGGTAGTTT (SEQ ID NO. 5061)
10SF10_HP0953S937 CUACUACUAAACTACCTTAACGCACAAACG (SEQ ID NO. 5062)
10SF11_HP0954S938 CUACUACUATAATCACAACCAAGTAATCGC (SEQ ID NO. 5063)
10SF12_HP0955S939 CUACUACUAAATTTCATTGATTTTCCTTTA (SEQ ID NO. 5064)
10SG1_HP0956S940 CUACUACUAAATCCTTAAATTTCAAATTCT (SEQ ID NO. 5065)
10SG2_HP0957S941 CUACUACUATGTTTATCCTTTTTAATGTTT (SEQ ID NO. 5066)
10SG3_HP0958S942 CUACUACUAAACTTAAACAATCAAACTAAT (SEQ ID NO. 5067)
10SG4_HP0959S943 CUACUACUACGCTTTCCTTTAAATGATTTG (SEQ ID NO. 5068)
10SG5_HP0960S944 CUACUACUACTTAACTAACGCCATTTTCAG (SEQ ID NO. 5069)
10SG6_HP0961S945 CUACUACUACCTTTTAAGCGCGTCTGATCA (SEQ ID NO. 5070)
10SG7_HP0962S946 CUACUACUAAGCGTTAAATTAGTTGTTTCT (SEQ ID NO. 5071)
10SG8_HP0963S947 CUACUACUATTCTTCAATCATCGTCCTCTT (SEQ ID NO. 5072)
10SG9_HP0964S948 CUACUACUATAACGCTCATTGAATCCCCTT (SEQ ID NO. 5073)
10SG10_HP0965S949 CUACUACUATTAAGAATAATAAGCTTTCTT (SEQ ID NO. 5074)
10SG11_HP0966S950 CUACUACUAGTTCATTGTTGATCCTTTAAA (SEQ ID NO. 5075)
10SG12_HP0967S951 CUACUACUACAATTTCATAAAACAATCTTG (SEQ ID NO. 5076)
10SH1_HP0968S952 CUACUACUATCAAAGCCTTTTTCAAACGAA (SEQ ID NO. 5077)
10SH2_HP0969S953 CUACUACUAACTCAAACGATTTTAATCTTT (SEQ ID NO. 5078)
10SH3_HP0970S954 CUACUACUAATCATTCCTCCCCTAAATTGT (SEQ ID NO. 5079)
10SH4_HP0971S955 CUACUACUATCAATGCAATTCTCCTAATCT (SEQ ID NO. 5080)
10SH5_HP0972S956 CUACUACUACTAAATCGCAATTTCTTTAAT (SEQ ID NO. 5081)
10SH6_HP0973S957 CUACUACUATTTAGGACTTTTTAAAGCTCT (SEQ ID NO. 5082)
10SH7_HP0974S958 CUACUACUATTAAAATAGGGGTTCGTCCAT (SEQ ID NO. 5083)
10SH8_HP0975S959 CUACUACUACTATTCAATGATCTTGGGCAC (SEQ ID NO. 5084)
10SH9_HP0976S960 CUACUACUATTCAGCCTTTTCTTAACTCAT (SEQ ID NO. 5085)
10SH10_HP0977S961 CUACUACUATTTATTGAATGTATTTGACTA (SEQ ID NO. 5086)
10SH11_HP0978S962 CUACUACUATCTTAAAAGAATTTAGAAATC (SEQ ID NO. 5087)
10SH12_HP0979S963 CUACUACUATCAGTCTTGCTGGATTCTCAT (SEQ ID NO. 5088)
10RA1_HP0881R868 CAUCAUCAUTGTGGGTATAATGCAAAAAGT (SEQ ID NO. 5089)
10RA2_HP0883R869 CAUCAUCAUAATGATAGTGGGTTTGATAGG (SEQ ID NO. 5090)
10RA3_HP0884R870 CAUCAUCAUAAAAAATGAGCTTGGAGCGTT (SEQ ID NO. 5091)
10RA4_HP0885R871 CAUCAUCAUTTTGATGATGGCTAATATTCT (SEQ ID NO. 5092)
10RA5_HP0886R872 CAUCAUCAUAGGTTTTTAAATGTTTATTTA (SEQ ID NO. 5093)
10RA6_HP0887R873 CAUCAUCAUAGAAAGGAAAAAAATGGAAAT (SEQ ID NO. 5094)
10RA7_HP0888R874 CAUCAUCAUGGGTGTGATGGTTTTAGAAGT (SEQ ID NO. 5095)
10RA8_HP0889R875 CAUCAUCAUGCGTGATGCTTAAAACCTATC (SEQ ID NO. 5096)
10RA9_HP0890R876 CAUCAUCAUATGCTGTTAGATCAAGGGTAT (SEQ ID NO. 5097)
10RA10_HP0891R877 CAUCAUCAUAATGCCTCAAATACAATCATC (SEQ ID NO. 5098)
10RA11_HP0892R878 CAUCAUCAUAAAAATGCTGACGATTGAAAC (SEQ ID NO. 5099)
10RA12_HP0893R879 CAUCAUCAUAACATGCCAAACACCACCAAC (SEQ ID NO. 5100)
10RB1_HP0894R880 CAUCAUCAUGGTGTTGAAGCTCAATCTTAA (SEQ ID NO. 5101)
10RB2_HP0895R881 CAUCAUCAUTTGGTATTTTTAAAAAAGGAA (SEQ ID NO. 5102)
10RB3_HP0896R882 CAUCAUCAUAAAACATGAAAAAAACCCTTT (SEQ ID NO. 5103)
10RB4_HP0897R883 CAUCAUCAUGGAATGCCAGGACCAAAACCT (SEQ ID NO. 5104)
10RB5_HP0898R884 CAUCAUCAUACAATGAGCGTTGATCACCTC (SEQ ID NO. 5105)
10RB6_HP0899R885 CAUCAUCAUATGTGTTTAGCGATCCCCTCT (SEQ ID NO. 5106)
10RB7_HP0900R886 CAUCAUCAUGAACAACATGAGCGAACAACG (SEQ ID NO. 5107)
10RB8_HP0901R887 CAUCAUCAUAAATGATGAAAGCCGTTTTTA (SEQ ID NO. 5108)
10RB9_HP0902R888 CAUCAUCAUTTTGATGGAAGTGGTTCATTT (SEQ ID NO. 5109)
10RB10_HP0905R889 CAUCAUCAUGCATGCAAGGTTTATGGATTT (SEQ ID NO. 5110)
10RB11_HP0906R890 CAUCAUCAUAAGGATAACCATGCCATCTCC (SEQ ID NO. 5111)
10RB12_HP0907R891 CAUCAUCAUATGGCTATTGATTTAGCAGAA (SEQ ID NO. 5112)
10RC1_HP0908R892 CAUCAUCAUCATGAACGACACCTTATTAAA (SEQ ID NO. 5113)
10RC2_HP0909R893 CAUCAUCAUTAATGATACCCACACAGCTTA (SEQ ID NO. 5114)
10RC3_HP0910R894 CAUCAUCAUTTTGAATAATTTAGACATTAA (SEQ ID NO. 5115)
10RC4_HP0911R895 CAUCAUCAUCTTGTTAGAAACTTTGCAATT (SEQ ID NO. 5116)
10RC5_HP0912R896 CAUCAUCAUAAATCATGATAAAGAAAAATA (SEQ ID NO. 5117)
10RC6_HP0913R897 CAUCAUCAUGAGACAAACATGAAACAAAAT (SEQ ID NO. 5118)
10RC7_HP0914R898 CAUCAUCAUGAGAGTTTAGTGAGGCAAGAA (SEQ ID NO. 5119)
10RC8_HP0915R899 CAUCAUCAUATGGCTAAGATCAATGGTTAT (SEQ ID NO. 5120)
10RC9_HP0916R900 CAUCAUCAUATGAATGACAAGCGTTTTAGA (SEQ ID NO. 5121)
10RC10_HP0917R901 CAUCAUCAUCGATAATGTCCCCCTTAACCC (SEQ ID NO. 5122)
10RC11_HP0918R902 CAUCAUCAUTGATGGCGTTAGTGTATCTCG (SEQ ID NO. 5123)
10RC12_HP0919R903 CAUCAUCAUACATTCTTTCTTTAAGGATTA (SEQ ID NO. 5124)
10RD1_HP0920R904 CAUCAUCAUGAGTAAAACATGGCATTGTAT (SEQ ID NO. 5125)
10RD2_HP0921R905 CAUCAUCAUCGATGAAAATTTTTATCAATG (SEQ ID NO. 5126)
10RD3_HP0922R906 CAUCAUCAUAATGGCGTTTAAAAAGGCCAG (SEQ ID NO. 5127)
10RD4_HP0923R907 CAUCAUCAUCACCCATGCAATTTCAAAAAG (SEQ ID NO. 5128)
10RD5_HP0924R908 CAUCAUCAUATGCCGTTTATCAATATCAAA (SEQ ID NO. 5129)
10RD6_HP0925R909 CAUCAUCAUAACAGAATGAATACTTATAAA (SEQ ID NO. 5130)
10RD7_HP0926R910 CAUCAUCAUATCAAAGCATGAATTTAAATT (SEQ ID NO. 5131)
10RD8_HP0927R911 CAUCAUCAUATGCGAGCGCGTTGCTCAAAG (SEQ ID NO. 5132)
10RD9_HP0928R912 CAUCAUCAUATGGAAAATTTTTTCAACCAG (SEQ ID NO. 5133)
10RD10_HP0929R913 CAUCAUCAUAGCATGAGTAGCCCTAATTTA (SEQ ID NO. 5134)
10RD11_HP0930R914 CAUCAUCAUCATGAAAAAGATTTTACTCAC (SEQ ID NO. 5135)
10RD12_HP0931R915 CAUCAUCAUGGGAATGTTAAAGTGAGTAAA (SEQ ID NO. 5136)
10RE1_HP0932R916 CAUCAUCAUATAAGCCTTGTTATTTTCTTG (SEQ ID NO. 5137)
10RE2_HP0933R917 CAUCAUCAUATGGTTATCAGGCGATTGTAT (SEQ ID NO. 5138)
10RE3_HP0934R918 CAUCAUCAUTAAAAATGAAACTCCCGGTCG (SEQ ID NO. 5139)
10RE4_HP0935R919 CAUCAUCAUATGACCATCAAAGTTTTTTCG (SEQ ID NO. 5140)
10RE5_HP0936R920 CAUCAUCAUATAGTGATGGCCCACTTTGGG (SEQ ID NO. 5141)
10RE6_HP0937R921 CAUCAUCAUTTGAGCTATTCTGATTTATTA (SEQ ID NO. 5142)
10RE7_HP0938R922 CAUCAUCAUATGGGTGTGTTTTTGGATAAG (SEQ ID NO. 5143)
10RE8_HP0939R923 CAUCAUCAUATGTCAGCCAGCCCTAATCTG (SEQ ID NO. 5144)
10RE9_HP0940R924 CAUCAUCAUTAAAATGAAAAAAGTTTTATT (SEQ ID NO. 5145)
10RE10_HP0941R925 CAUCAUCAUAGTAAAATGGCATGTTAAAAA (SEQ ID NO. 5146)
10RE11_HP0942R926 CAUCAUCAUTATGGAAACGATTGATTCGGT (SEQ ID NO. 5147)
10RE12_HP0943R927 CAUCAUCAUAGCATGAAAAAAGAGGTCGTG (SEQ ID NO. 5148)
10RF1_HP0944R928 CAUCAUCAUCATGAAAGAAGTCATCCATTC (SEQ ID NO. 5149)
10RF2_HP0945R929 CAUCAUCAUGTGGTGTATTTGCCAACAATA (SEQ ID NO. 5150)
10RF3_HP0946R930 CAUCAUCAUGGTGGGATTGTCAGCATCAAG (SEQ ID NO. 5151)
10RF4_HP0947R931 CAUCAUCAUTTTTGTTAAGCGTTTTGAGTG (SEQ ID NO. 5152)
10RF5_HP0948R932 CAUCAUCAUATGCGTTTTTACATTATCTTT (SEQ ID NO. 5153)
10RF6_HP0949R933 CAUCAUCAUATGCGTTGCGTGGTGTATTCT (SEQ ID NO. 5154)
10RF7_HP0950R934 CAUCAUCAUAACATGGGATTTGCAGATTTC (SEQ ID NO. 5155)
10RF8_HP0951R935 CAUCAUCAUGTGATGCAAGGGTTTCTTTTA (SEQ ID NO. 5156)
10RF9_HP0952R936 CAUCAUCAUAATGAAATTTAAATTTTTGAA (SEQ ID NO. 5157)
10RF10_HP0953R937 CAUCAUCAUTTTGTTTAAAAGAATGGTTTT (SEQ ID NO. 5158)
10RF11_HP0954R938 CAUCAUCAUATCAATGAAATTTTTGGATCA (SEQ ID NO. 5159)
10RF12_HP0955R939 CAUCAUCAUGATCATGAACGCTTGGAATAC (SEQ ID NO. 5160)
10RG1_HP0956R940 CAUCAUCAUACATGGAAAAAGCTTATAAAA (SEQ ID NO. 5161)
10RG2_HP0957R941 CAUCAUCAUTTGTTTAAGTTTTTCTACCTT (SEQ ID NO. 5162)
10RG3_HP0958R942 CAUCAUCAUGAAAGCGATGAACACCCACCT (SEQ ID NO. 5163)
10RG4_HP0959R943 CAUCAUCAUCTGAAAATGGCGTTAGTTAAG (SEQ ID NO. 5164)
10RG5_HP0960R944 CAUCAUCAUATGCAAGATTTTTCAAGTTTA (SEQ ID NO. 5165)
10RG6_HP0961R945 CAUCAUCAUATGGAAATTGCAGTATTTGGT (SEQ ID NO. 5166)
10RG7_HP0962R946 CAUCAUCAUAGGACGATGATTGAAGAAAAC (SEQ ID NO. 5167)
10RG8_HP0963R947 CAUCAUCAUATGAGCGTTAATTTTTTTAAG (SEQ ID NO. 5168)
10RG9_HP0964R948 CAUCAUCAUAGAGTGTTACAATGCATGGCG (SEQ ID NO. 5169)
10RG10_HP0965R949 CAUCAUCAUACAATGAACGAGCAAGAACTC (SEQ ID NO. 5170)
10RG11_HP0966R950 CAUCAUCAUTAAAGGATTGATGATGGTTTT (SEQ ID NO. 5171)
10RG12_HP0967R951 CAUCAUCAUAACCAATAAAGGAGTTAAAAT (SEQ ID NO. 5172)
10RH1_HP0968R952 CAUCAUCAUATGCTCGCTTTAGAAATTTAT (SEQ ID NO. 5173)
10RH2_HP0969R953 CAUCAUCAUAATGATGCTCGCTTCCATTAT (SEQ ID NO. 5174)
10RH3_HP0970R954 CAUCAUCAUGATTAGGAGAATTGCATTGAA (SEQ ID NO. 5175)
10RH4_HP0971R955 CAUCAUCAUCTTTTTTAATGAGCATGTTTA (SEQ ID NO. 5176)
10RH5_HP0972R956 CAUCAUCAUTTGCATTCAGATGAATTGTTA (SEQ ID NO. 5177)
10RH6_HP0973R957 CAUCAUCAUTTTGAAACATTTAACCCCACT (SEQ ID NO. 5178)
10RH7_HP0974R958 CAUCAUCAUAAATGGCGCAAAAAACTCTTT (SEQ ID NO. 5179)
10RH8_HP0975R959 CAUCAUCAUAAGGAAAAGAATGCAAATTGA (SEQ ID NO. 5180)
10RH9_HP0976R960 CAUCAUCAUTAACATGAATTTTCAAGAAAA (SEQ ID NO. 5181)
10RH10_HP0977R961 CAUCAUCAUGAGTTCTTATGATTGAATGGA (SEQ ID NO. 5182)
10RH11_HP0978R962 CAUCAUCAUAGGGGAAATCATGGAACATAA (SEQ ID NO. 5183)
10RH12_HP0979R963 CAUCAUCAUTGTGGCTATGGTTCATCAATC (SEQ ID NO. 5184)
11SA1_HP0980S964 CUACUACUAGGTTCTAGCCAAATAGTATCC (SEQ ID NO. 5185)
11SA2_HP0981S965 CUACUACUAATCAGCGATGAAGAGCTTGAC (SEQ ID NO. 5186)
11SA3_HP0982S966 CUACUACUAACTAATAATTTGTAAGACTAA (SEQ ID NO. 5187)
11SA4_HP0983S967 CUACUACUACACACCAATCATTTAGGAGAG (SEQ ID NO. 5188)
11SA5_HP0984S968 CUACUACUATCTAAAGCCCATAATTGATGT (SEQ ID NO. 5189)
11SA6_HP0985S969 CUACUACUACTAAACAAGAGCAAGAGAAAC (SEQ ID NO. 5190)
11SA7_HP0986S970 CUACUACUATTATCAACGCCTAGAGTTATT (SEQ ID NO. 5191)
11SA8_HP0987S971 CUACUACUATACTCATAGGGTTTTTATAGT (SEQ ID NO. 5192)
11SA9_HP0989S973 CUACUACUACTTCATAGCTCCATAACTAGC (SEQ ID NO. 5193)
11SA10_HP0990S974 CUACUACUAACCCTTAATCCCATATACTAA (SEQ ID NO. 5194)
11SA11_HP0991S975 CUACUACUAGGGTTTAGTGATGAGAGATAG (SEQ ID NO. 5195)
11SA12_HP0992S976 CUACUACUACTATATTACTTGCGGAAATTC (SEQ ID NO. 5196)
11SB1_HP0993S977 CUACUACUACTTTTCATTGATAGGGATTGA (SEQ ID NO. 5197)
11SB2_HP0994S978 CUACUACUATTTATCGTCTACGCTTAGGTG (SEQ ID NO. 5198)
11SB3_HP0995S979 CUACUACUACACCTTACTTCTTGTCCTCTA (SEQ ID NO. 5199)
11SB4_HP0996S980 CUACUACUAGCATTATCTATTACGCCCTTT (SEQ ID NO. 5200)
11SB5_HP0997S981 CUACUACUATTTAATAAACTTTGAGTTAGG (SEQ ID NO. 5201)
11SB6_HP0999S983 CUACUACUAAACTCACATCCCCATGTCATA (SEQ ID NO. 5202)
11SB7_HP1000S984 CUACUACUACATCATTTTTTGATCCTTAAA (SEQ ID NO. 5203)
11SB8_HP1001S985 CUACUACUATCACAACTTGCTCGCCTTATC (SEQ ID NO. 5204)
11SB9_HP1002S986 CUACUACUACTTTACTTAGCGTATTTTTTA (SEQ ID NO. 5205)
11SB10_HP1003S987 CUACUACUATTAAAAATCCATTCCATAGTT (SEQ ID NO. 5206)
11SB11_HP1004S988 CUACUACUACTATTTGAGAAGTTGCACTAC (SEQ ID NO. 5207)
11SB12_HP1005S989 CUACUACUAAGTTCTGACTTTAAGCATTAT (SEQ ID NO. 5208)
11SC1_HP1006S990 CUACUACUATAATCCATCAAAACACTTCTT (SEQ ID NO. 5209)
11SC2_HP1008S991 CUACUACUAGCTAATTTAATGCGGTGTTTC (SEQ ID NO. 5210)
11SC3_HP1009S992 CUACUACUAGCGCTAGTGGGATAAAAGATT (SEQ ID NO. 5211)
11SC4_HP1010S993 CUACUACUATTTGAACGATAACCCTTAAAA (SEQ ID NO. 5212)
11SC5_HP1011S994 CUACUACUATGTTTCATCTTAAATCCGCTC (SEQ ID NO. 5213)
11SC6_HP1012S995 CUACUACUAAGGCTCTTTTAAGGTTTCAAA (SEQ ID NO. 5214)
11SC7_HP1013S996 CUACUACUATCATGCAATTACCTCATATTG (SEQ ID NO. 5215)
11SC8_HP1014S997 CUACUACUAAAGAAATATCTTTATTTAAAA (SEQ ID NO. 5216)
11SC9_HP1015S998 CUACUACUATCATGCTTTCAAAAAGACTTG (SEQ ID NO. 5217)
11SC10_HP1016S999 CUACUACUATTAAGATTTATAATATTTAAT (SEQ ID NO. 5218)
11SC11_HP1017S1000 CUACUACUAGAAAAGATTATATCAAATCTT (SEQ ID NO. 5219)
11SC12_HP1018S1001 CUACUACUATTAATAGAATCGTGGTAAGAA (SEQ ID NO. 5220)
11SD1_HP1019S1002 CUACUACUACGACCCACCCCTATCATTTCA (SEQ ID NO. 5221)
11SD2_HP1020S1003 CUACUACUACCTTTTATCACATTTAAAGTT (SEQ ID NO. 5222)
11SD3_HP1021S1004 CUACUACUAGTTATTTGCGCGGTAAGTTAT (SEQ ID NO. 5223)
11SD4_HP1022S1005 CUACUACUATTGTTTTAAGAAATTCTTTTA (SEQ ID NO. 5224)
11SD5_HP1023S1006 CUACUACUATTAAAACAAAAACTTCCCAAA (SEQ ID NO. 5225)
11SD6_HP1024S1007 CUACUACUATCCTTACAATTCTTTTTCTAA (SEQ ID NO. 5226)
11SD7_HP1025S1008 CUACUACUAGCTCATTTTTTAAATAAAATC (SEQ ID NO. 5227)
11SD8_HP1026S1009 CUACUACUAAGATCAATTTCTTATCTTATC (SEQ ID NO. 5228)
11SD9_HP1027S1010 CUACUACUAATCTTTTAACATTCACTCTCT (SEQ ID NO. 5229)
11SD10_HP1028S1011 CUACUACUAGCTTAGATAGGGCTATCTTTA (SEQ ID NO. 5230)
11SD11_HP1029S1012 CUACUACUACCAATCTCATAATTTTAATTT (SEQ ID NO. 5231)
11SD12_HP1030S1013 CUACUACUAAGATTTAATGTTTCAATTGTT (SEQ ID NO. 5232)
11SE1_HP1031S1014 CUACUACUATCTCACTCTTCTTCTATTTTC (SEQ ID NO. 5233)
11SE2_HP1032S1015 CUACUACUATCAGCCATGATCCACTCCTAA (SEQ ID NO. 5234)
11SE3_HP1033S1016 CUACUACUAATCAGATAGCGGGCCTTTTGG (SEQ ID NO. 5235)
11SE4_HP1034S1017 CUACUACUATTCAAGCCTACCCCAAATACT (SEQ ID NO. 5236)
11SE5_HP1035S1018 CUACUACUAATTGTTCATGCTTGTTCCTTA (SEQ ID NO. 5237)
11SE6_HP1036S1019 CUACUACUAATTTCACCACTCCCCTTTTTT (SEQ ID NO. 5238)
11SE7_HP1037S1020 CUACUACUACATCACAAAAGCTCAGACCTA (SEQ ID NO. 5239)
11SE8_HP1038S1021 CUACUACUACCCTTTTATTTTTGATTGTTA (SEQ ID NO. 5240)
11SE9_HP1039S1022 CUACUACUATGCTAAAAAGCACTTTTATCT (SEQ ID NO. 5241)
11SE10_HP1040S1023 CUACUACUATTATCTGTCTTTAATGCCTAA (SEQ ID NO. 5242)
11SE11_HP1041S1024 CUACUACUATTCTCCTTTTTTATCAATTAT (SEQ ID NO. 5243)
11SE12_HP1042S1025 CUACUACUATTAGTTTTAAGCGTTGTTGAA (SEQ ID NO. 5244)
11SF1_HP1043S1026 CUACUACUACCACGGAATTTACTCTTCACA (SEQ ID NO. 5245)
11SF2_HP1044S1027 CUACUACUATTAACTAAGCTTGATTTTTAG (SEQ ID NO. 5246)
11SF3_HP1045S1028 CUACUACUATTTTAGATTTTACTCCTCCAT (SEQ ID NO. 5247)
11SF4_HP1046S1029 CUACUACUATTCGCATTATTTCACCATTCA (SEQ ID NO. 5248)
11SF5_HP1047S1030 CUACUACUACTTCTATTTTTTTAGTCATTA (SEQ ID NO. 5249)
11SF6_HP1048S1031 CUACUACUAGTTCATTAGAGAGTTCTTTTT (SEQ ID NO. 5250)
11SF7_HP1049S1032 CUACUACUATCATGCTATGCTTCTCTCCTT (SEQ ID NO. 5251)
11SF8_HP1050S1033 CUACUACUATTTTAATTTCAGTCTTTCTCA (SEQ ID NO. 5252)
11SF9_HP1051S1034 CUACUACUACTAAATAGGAGGCAAAATGTA (SEQ ID NO. 5253)
11SF10_HP1052S1035 CUACUACUATTCAGGATTAAGCGAAAGCCA (SEQ ID NO. 5254)
11SF11_HP1053S1036 CUACUACUATCATAATACTTCCTTTATGTC (SEQ ID NO. 5255)
11SF12_HP1054S1037 CUACUACUATTAACATGACTATCCTTTCAT (SEQ ID NO. 5256)
11SG1_HP1055S1038 CUACUACUATTAAAAAATAAACGCATAATT (SEQ ID NO. 5257)
11SG2_HP1056S1039 CUACUACUACCTTTACAATAAATAGTTGTA (SEQ ID NO. 5258)
11SG3_HP1057S1040 CUACUACUATTATTTTTTTAGAACACATAT (SEQ ID NO. 5259)
11SG4_HP1058S1041 CUACUACUACATTAATGATAACTTTCTAAT (SEQ ID NO. 5260)
11SG5_HP1059S1042 CUACUACUATTTCTAAACAAGATTAAAATA (SEQ ID NO. 5261)
11SG6_HP1060S1043 CUACUACUATCTTCAAACATTCTCTTTTTC (SEQ ID NO. 5262)
11SG7_HP1061S1044 CUACUACUATCTAAATCAAATTCTTTCAAC (SEQ ID NO. 5263)
11SG8_HP1062S1045 CUACUACUAGTTCATAAAATCAGCATCCCA (SEQ ID NO. 5264)
11SG9_HP1063S1046 CUACUACUACTTAACATAAACTTTCCTTTG (SEQ ID NO. 5265)
11SG10_HP1064S1047 CUACUACUAGCATATTTACCTCTTATCCTT (SEQ ID NO. 5266)
11SG11_HP1065S1048 CUACUACUATTATTTTAAAAGGAGTGAATC (SEQ ID NO. 5267)
11SG12_HP1066S1049 CUACUACUACTCTCTCCGCCATTTAAAAAT (SEQ ID NO. 5268)
11SH1_HP1067S1050 CUACUACUATACATTGGCTTTAACACTCAA (SEQ ID NO. 5269)
11SH2_HP1068S1051 CUACUACUATCCTTAATTTATTGGTTGTTT (SEQ ID NO. 5270)
11SH3_HP1069S1052 CUACUACUACTCTTAACTCGCTTGCTCTTC (SEQ ID NO. 5271)
11SH4_HP1070S1053 CUACUACUAGGTTAATAGGCATTAGAGACT (SEQ ID NO. 5272)
11SH5_HP1071S1054 CUACUACUATCATGCGCTTTTATTTTTATT (SEQ ID NO. 5273)
11SH6_HP1072S1055 CUACUACUATTTCATTCAATGATCCTTAAT (SEQ ID NO. 5274)
11SH7_HP1073S1056 CUACUACUATACTTCTATATCACACCACTT (SEQ ID NO. 5275)
11SH8_HP1074S1057 CUACUACUATTTTTCGCTTTAAACGAATAG (SEQ ID NO. 5276)
11SH9_HP1075S1058 CUACUACUACCTACACAAATTTCACTAAGA (SEQ ID NO. 5277)
11SH10_HP1076S1059 CUACUACUAATTCTAAGCCCTCAAATTATC (SEQ ID NO. 5278)
11SH11_HP1077S1060 CUACUACUATAGAATTCAGCTCTCTAGTTT (SEQ ID NO. 5279)
11SH12_HP1078S1061 CUACUACUATTAGAAAATTTTAGGATTTGT (SEQ ID NO. 5280)
11RA1_HP0980R964 CAUCAUCAUGTGGAGCATGATGTTCACTGT (SEQ ID NO. 5281)
11RA2_HP0981R965 CAUCAUCAUAAAGAAGAAAATGAAGAAAAT (SEQ ID NO. 5282)
11RA3_HP0982R966 CAUCAUCAUCTTGCTACCCCAAAATTCAGG (SEQ ID NO. 5283)
11RA4_HP0983R967 CAUCAUCAUAAAAGGAAAAATTTATGGATG (SEQ ID NO. 5284)
11RA5_HP0984R968 CAUCAUCAUATGGAAAAATATAACGATAAA (SEQ ID NO. 5285)
11RA6_HP0985R969 CAUCAUCAUATGGGGCGTTGGTTCCTTGGA (SEQ ID NO. 5286)
11RA7_HP0986R970 CAUCAUCAUCCCATTAAAGATGTGGAAACT (SEQ ID NO. 5287)
11RA8_HP0987R971 CAUCAUCAUATGCTAGATAGGGCTATATTA (SEQ ID NO. 5288)
11RA9_HP0989R973 CAUCAUCAUTTATTTTTGCTTAACGCTATC (SEQ ID NO. 5289)
11RA10_HP0990R974 CAUCAUCAUAGAAATGTTGTGTAAAATGGT (SEQ ID NO. 5290)
11RA11_HP0991R975 CAUCAUCAUAATTTATGGCTTTTGAAATGA (SEQ ID NO. 5291)
11RA12_HP0992R976 CAUCAUCAUAATAGAAAAATGATTAAATTA (SEQ ID NO. 5292)
11RB1_HP0993R977 CAUCAUCAUATGTCATTTGCCCCTATGTTA (SEQ ID NO. 5293)
11RB2_HP0994R978 CAUCAUCAUATGTCCTTTCTATCAATCCCT (SEQ ID NO. 5294)
11RB3_HP0995R979 CAUCAUCAUGGAGCAATATGCCACATAACA (SEQ ID NO. 5295)
11RB4_HP0996R980 CAUCAUCAUAATGAAACGCTCCCACTTAGA (SEQ ID NO. 5296)
11RB5_HP0997R981 CAUCAUCAUTTATTTTTGCTTAACGCTATC (SEQ ID NO. 5297)
11RB6_HP0999R983 CAUCAUCAUAAATGAGCAAAGACAGAGATT (SEQ ID NO. 5298)
11RB7_HP1000R984 CAUCAUCAUGGAATAAGCGATGATAATCAC (SEQ ID NO. 5299)
11RB8_HP1001R985 CAUCAUCAUATGGAATTTAAAAACACCAAA (SEQ ID NO. 5300)
11RB9_HP1002R986 CAUCAUCAUTAAATGAAAGAAACAAGACTT (SEQ ID NO. 5301)
11RB10_HP1003R987 CAUCAUCAUGTTTTGGGTAGTGCAACTTCT (SEQ ID NO. 5302)
11RB11_HP1004R988 CAUCAUCAUAAATGGCGTTAGAAAAAAGTT (SEQ ID NO. 5303)
11RB12_HP1005R989 CAUCAUCAUTATGTCAAATATTATTACAGA (SEQ ID NO. 5304)
11RC1_HP1006R990 CAUCAUCAUGTGGTTTCACTGAAAGAGAGT (SEQ ID NO. 5305)
11RC2_HP1008R991 CAUCAUCAUCTATGAAAAAAATTGATGATA (SEQ ID NO. 5306)
11RC3_HP1009R992 CAUCAUCAUGTGTATTGTTTTTATTCTTTG (SEQ ID NO. 5307)
11RC4_HP1010R993 CAUCAUCAUATTGAATCGTTTCTTTAACCG (SEQ ID NO. 5308)
11RC5_HP1011R994 CAUCAUCAUATGCTTTATTCATTAGTAAAA (SEQ ID NO. 5309)
11RC6_HP1012R995 CAUCAUCAUGCGGATTTAAGATGAAACATT (SEQ ID NO. 5310)
11RC7_HP1013R996 CAUCAUCAUAGCCTTATAACATGCAATTTC (SEQ ID NO. 5311)
11RC8_HP1014R997 CAUCAUCAUATTGCATGAATGGTTCCAATC (SEQ ID NO. 5312)
11RC9_HP1015R998 CAUCAUCAUCTGTTAGTGCTTTTTATTGTT (SEQ ID NO. 5313)
11RC10_HP1016R999 CAUCAUCAUCTGTGGTTTTAATGTGTGTGG (SEQ ID NO. 5314)
11RC11_HP1017R1000 CAUCAUCAUAAATGCTTTTGATCGTTTTTT (SEQ ID NO. 5315)
11RC12_HP1018R1001 CAUCAUCAUCATGATGAAAAAAACCCTTTT (SEQ ID NO. 5316)
11RD1_HP1019R1002 CAUCAUCAUAGAGCGAGTGAGTGTCCCCTC (SEQ ID NO. 5317)
11RD2_HP1020R1003 CAUCAUCAUATGTCTTTGATTAGAGTGAAT (SEQ ID NO. 5318)
11RD3_HP1021R1004 CAUCAUCAUAAAAGGAAGAATTGATGAAAA (SEQ ID NO. 5319)
11RD4_HP1022R1005 CAUCAUCAUGATTCATGCATTCAAGAACTC (SEQ ID NO. 5320)
11RD5_HP1023R1006 CAUCAUCAUTTTAATGCAATATAAGAAAAA (SEQ ID NO. 5321)
11RD6_HP1024R1007 CAUCAUCAUTCCATGAGTAAGAGTTTATAC (SEQ ID NO. 5322)
11RD7_HP1025R1008 CAUCAUCAUGTGTGCGATTATGATGAACCG (SEQ ID NO. 5323)
11RD8_HP1026R1009 CAUCAUCAUAAAAAATGAGCCTGACTTCGC (SEQ ID NO. 5324)
11RD9_HP1027R1010 CAUCAUCAUTATCAGCATGAAAAGATTAGA (SEQ ID NO. 5325)
11RD10_HP1028R1011 CAUCAUCAUTATGAGATTGGTGAGTCTTGT (SEQ ID NO. 5326)
11RD11_HP1029R1012 CAUCAUCAUATGGCTATTTTTGGGGAATTA (SEQ ID NO. 5327)
11RD12_HP1030R1013 CAUCAUCAUAGTGAGAAATGCAAGATTTTA (SEQ ID NO. 5328)
11RE1_HP1031R1014 CAUCAUCAUAGTGGATCATGGCTGATATTT (SEQ ID NO. 5329)
11RE2_HP1032R1015 CAUCAUCAUTTATGATTTTGATGATGGAAA (SEQ ID NO. 5330)
11RE3_HP1033R1016 CAUCAUCAUACTTCTAATGTATTCTAAAAT (SEQ ID NO. 5331)
11RE4_HP1034R1017 CAUCAUCAUAATAAGGAACAAGCATGAACA (SEQ ID NO. 5332)
11RE5_HP1035R1018 CAUCAUCAUGGGGAGTGGTGAAATTCTATA (SEQ ID NO. 5333)
11RE6_HP1036R1019 CAUCAUCAUGTGATGCGAGAGATCCTTACT (SEQ ID NO. 5334)
11RE7_HP1037R1020 CAUCAUCAUCCCATGAAAGGATTAGAAAGA (SEQ ID NO. 5335)
11RE8_HP1038R1021 CAUCAUCAUATGAAAATTTTAGTGATTCAA (SEQ ID NO. 5336)
11RE9_HP1039R1022 CAUCAUCAUGGTGTGGCGAGTTTGTGTTGA (SEQ ID NO. 5337)
11RE10_HP1040R1023 CAUCAUCAUATGGCTTTGAATCTGGAGAAA (SEQ ID NO. 5338)
11RE11_HP1041R1024 CAUCAUCAUATTTTAGATTATGGCAAACGA (SEQ ID NO. 5339)
11RE12_HP1042R1025 CAUCAUCAUATGATGCAAGTTTACCACCTT (SEQ ID NO. 5340)
11RF1_HP1043R1026 CAUCAUCAUACCATGCGCGTTCTACTGATT (SEQ ID NO. 5341)
11RF2_HP1044R1027 CAUCAUCAUAAGGGTGATATGCTGATCTCC (SEQ ID NO. 5342)
11RF3_HP1045R1028 CAUCAUCAUGGAGATGCGATGCAATTAGAC (SEQ ID NO. 5343)
11RF4_HP1046R1029 CAUCAUCAUATGACTAAAAAAATAGAAGAG (SEQ ID NO. 5344)
11RF5_HP1047R1030 CAUCAUCAUCTAATGAACGCTCATAAAGAA (SEQ ID NO. 5345)
11RF6_HP1048R1031 CAUCAUCAUGAAGCATAGCATGAGTGGTAT (SEQ ID NO. 5346)
11RF7_HP1049R1032 CAUCAUCAUGATAATGATGGGGTTCTTATT (SEQ ID NO. 5347)
11RF8_HP1050R1033 CAUCAUCAUTTGGTAGTGAGTGTTCCTGCA (SEQ ID NO. 5348)
11RF9_HP1051R1034 CAUCAUCAUGGTGTATCAAAACAATTTTTT (SEQ ID NO. 5349)
11RF10_HP1052R1035 CAUCAUCAUTTATGAAACAAACAACCATTA (SEQ ID NO. 5350)
11RF11_HP1053R1036 CAUCAUCAUCTTAATGAGCAATGGATAAAA (SEQ ID NO. 5351)
11RF12_HP1054R1037 CAUCAUCAUTATTTGATTTTTAACGCTTTA (SEQ ID NO. 5352)
11RG1_HP1055R1038 CAUCAUCAUAATGTTGAAATTTAAATATGG (SEQ ID NO. 5353)
11RG2_HP1056R1039 CAUCAUCAUTTGGGTATCAATATGTGTTCT (SEQ ID NO. 5354)
11RG3_HP1057R1040 CAUCAUCAUCGATTTTGGGTCTTAAAAAAT (SEQ ID NO. 5355)
11RG4_HP1058R1041 CAUCAUCAUCACTTAGGATTTTTAATGAGC (SEQ ID NO. 5356)
11RG5_HP1059R1042 CAUCAUCAUATGAAAGAACGGATAGTCAAT (SEQ ID NO. 5357)
11RG6_HP1060R1043 CAUCAUCAUTTATGTTTGGCATGGGCTTTT (SEQ ID NO. 5358)
11RG7_HP1061R1044 CAUCAUCAUAGAGAATGTTTGAAGATTTAA (SEQ ID NO. 5359)
11RG8_HP1062R1045 CAUCAUCAUGTTGAAAGAATTTGATTTAGA (SEQ ID NO. 5360)
11RG9_HP1063R1046 CAUCAUCAUCTGATTTTATGAACCCCTTAT (SEQ ID NO. 5361)
11RG10_HP1064R1047 CAUCAUCAUACTGAATGCTTTATGCATCAA (SEQ ID NO. 5362)
11RG11_HP1065R1048 CAUCAUCAUTGAAGGATAAGAGGTAAATAT (SEQ ID NO. 5363)
11RG12_HP1066R1049 CAUCAUCAUGTGCATTATTTAAGAATTTTA (SEQ ID NO. 5364)
11RH1_HP1067R1050 CAUCAUCAUTTGAAACTACTGGTAGTAGAT (SEQ ID NO. 5365)
11RH2_HP1068R1051 CAUCAUCAUGTGTTAAAGCCAATGTATTAT (SEQ ID NO. 5366)
11RH3_HP1069R1052 CAUCAUCAUTAATGAAACCAACGAACGAAC (SEQ ID NO. 5367)
11RH4_HP1070R1053 CAUCAUCAUGAGTGCTAGCTTACAAACAGA (SEQ ID NO. 5368)
11RH5_HP1071R1054 CAUCAUCAUATGCCTATTAACCCTCTCTAT (SEQ ID NO. 5369)
11RH6_HP1072R1055 CAUCAUCAUCGCATGAAAGAATCTTTTTAC (SEQ ID NO. 5370)
11RH7_HP1073R1056 CAUCAUCAUATGAAAGCAACTTTTCAAGTG (SEQ ID NO. 5371)
11RH8_HP1074R1057 CAUCAUCAUATAAGCATGGGCATCAAAGAA (SEQ ID NO. 5372)
11RH9_HP1075R1058 CAUCAUCAUAGAGAGATGGGATGAAAAAAA (SEQ ID NO. 5373)
11RH10_HP1076R1059 CAUCAUCAUTTTATGGATATTTTAAAAACT (SEQ ID NO. 5374)
11RH11_HP1077R1060 CAUCAUCAUTGTGAAATTGTGGTTTCCTTA (SEQ ID NO. 5375)
11RH12_HP1078R1061 CAUCAUCAUATGGCAGATAAAGAAATACTG (SEQ ID NO. 5376)
12SA1_HP1079S1062 CUACUACUATGCTTTAACCAAAAAGATTCT (SEQ ID NO. 5377)
12SA2_HP1080S1063 CUACUACUAGGTTCATCTCTTTTCGTTCAA (SEQ ID NO. 5378)
12SA3_HP1081S1064 CUACUACUACCTTAAAGCCACTCTGCATTT (SEQ ID NO. 5379)
12SA4_HP1082S1065 CUACUACUATAAAACCCTTAGCATTCTGTC (SEQ ID NO. 5380)
12SA5_HP1083S1066 CUACUACUAGTTAGAATTTAGCGCTAAGAG (SEQ ID NO. 5381)
12SA6_HP1084S1067 CUACUACUACAATCAATCTAATAGCAAAAA (SEQ ID NO. 5382)
12SA7_HP1085S1068 CUACUACUACGCATTAAAATTGAGGTAGGG (SEQ ID NO. 5383)
12SA8_HP1086S1069 CUACUACUATTAGGCTCGCTTGAAATGGAT (SEQ ID NO. 5384)
12SA9_HP1087S1070 CUACUACUAAGTCATTTTAACACAAATTAT (SEQ ID NO. 5385)
12SA10_HP1088S1071 CUACUACUATCATGCGTTGAGTAATCTTTT (SEQ ID NO. 5386)
12SA11_HP1089S1072 CUACUACUATCATCGGTTGCACATGTCTTT (SEQ ID NO. 5387)
12SA12_HP1090S1073 CUACUACUAAGCCTAAAAGTTTTGCAAAAT (SEQ ID NO. 5388)
12SB1_HP1091S1074 CUACUACUATCATTCCAAATAGGTTTTTTT (SEQ ID NO. 5389)
12SB2_HP1092S1075 CUACUACUATTAAGCTTTTGCAGCGAGTTT (SEQ ID NO. 5390)
12SB3_HP1093S1076 CUACUACUATTAAGCTCGTTTGTGGTAGGA (SEQ ID NO. 5391)
12SB4_HP1094S1077 CUACUACUAGAGCTATGAAGCAAGAAAAAG (SEQ ID NO. 5392)
12SB5_HP1095S1078 CUACUACUACCGCTAAAGCGATTGGGCTTT (SEQ ID NO. 5393)
12SB6_HP1096S1079 CUACUACUAATTAACCTTGATTTTCTATAT (SEQ ID NO. 5394)
12SB7_HP1097S1080 CUACUACUATCATTATCTTAAGGGGATAGG (SEQ ID NO. 5395)
12SB8_HP1098S1081 CUACUACUACTAAACTTTGATCTTAAGCTG (SEQ ID NO. 5396)
12SB9_HP1099S1082 CUACUACUATGCCATGATCTTTTTGTTATC (SEQ ID NO. 5397)
12SB10_HP1100S1083 CUACUACUATTTATACCTTTATACCAAAAC (SEQ ID NO. 5398)
12SB11_HP1101S1084 CUACUACUACCCTTTATTGATAGAGTGGTT (SEQ ID NO. 5399)
12SB12_HP1102S1085 CUACUACUATCTTGGGTAAGTTTCAGTTTT (SEQ ID NO. 5400)
12SC1_HP1103S1086 CUACUACUACTATATCTTATCATGCAGTAA (SEQ ID NO. 5401)
12SC2_HP1104S1087 CUACUACUATTAATCAAACGATTTTTTCAT (SEQ ID NO. 5402)
12SC3_HP1105S1088 CUACUACUATTTTGGCTCTAGCTCTTTTTT (SEQ ID NO. 5403)
12SC4_HP1106S1089 CUACUACUAGGTTATTTAGAGGTTTTAGAT (SEQ ID NO. 5404)
12SC5_HP1107S1090 CUACUACUAAAACCTCTAAATAACCATATC (SEQ ID NO. 5405)
12SC6_HP1108S1091 CUACUACUAGTTATTGAACTTCTTCATAAG (SEQ ID NO. 5406)
12SC7_HP1109S1092 CUACUACUATTATGATTTTTTCTTTTCTTG (SEQ ID NO. 5407)
12SC8_HP1110S1093 CUACUACUACCTTTTTAAAAAAAGCTCATT (SEQ ID NO. 5408)
12SC9_HP1111S1094 CUACUACUATTATACTTTAGCTTCTTCTCT (SEQ ID NO. 5409)
12SC10_HP1112S1095 CUACUACUATTATTCAAACACCCTTTTAAA (SEQ ID NO. 5410)
12SC11_HP1113S1096 CUACUACUAGCTTAAAAAATATAAGTGTAG (SEQ ID NO. 5411)
12SC12_HP1114S1097 CUACUACUACGTTTAAAGCGTTCTTAATTG (SEQ ID NO. 5412)
12SD1_HP1115S1098 CUACUACUATCAATGTTTTGATAATGATTG (SEQ ID NO. 5413)
12SD2_HP1116S1099 CUACUACUATTAATTTTTTGTGGTAGGATT (SEQ ID NO. 5414)
12SD3_HP1117S1100 CUACUACUAAGTTATCGGCTTGAAGTGTTC (SEQ ID NO. 5415)
12SD4_HP1118S1101 CUACUACUAACAAAGAATTAAAATTCTTTC (SEQ ID NO. 5416)
12SD5_HP1119S1102 CUACUACUATGGGTAGAAAACTTATTGTTT (SEQ ID NO. 5417)
12SD6_HP1120S1103 CUACUACUATTACGCCTGCACTCTTAAAAA (SEQ ID NO. 5418)
12SD7_HP1121S1104 CUACUACUAATTAAATCGCCTTTAACATTT (SEQ ID NO. 5419)
12SD8_HP1122S1105 CUACUACUACTAGCTTATCCCCAATAAATC (SEQ ID NO. 5420)
12SD9_HP1123S1106 CUACUACUACTTACTACCCATGCGAACATG (SEQ ID NO. 5421)
12SD10_HP1124S1107 CUACUACUATCATGGTTTTGCATGGTGGTG (SEQ ID NO. 5422)
12SD11_HP1125S1108 CUACUACUAATTACTTCATTAATTTGACAT (SEQ ID NO. 5423)
12SD12_HP1126S1109 CUACUACUATTACCAATCAAAGGCTTGTAT (SEQ ID NO. 5424)
12SE1_HP1127S1110 CUACUACUACCTCATTGTTCTTCCTTAGTG (SEQ ID NO. 5425)
12SE2_HP1128S1111 CUACUACUACCTTATCAGTTGGCATTTTTT (SEQ ID NO. 5426)
12SE3_HP1129S1112 CUACUACUACCCTGCTTTTTAAGGACTTGT (SEQ ID NO. 5427)
12SE4_HP1130S1113 CUACUACUACTTTATTTTTTAGAAGACAAA (SEQ ID NO. 5428)
12SE5_HP1131S1114 CUACUACUACCCCTTAAAGAGACTCAATCT (SEQ ID NO. 5429)
12SE6_HP1132S1115 CUACUACUACCCTTAGGAATTTTTCATGTT (SEQ ID NO. 5430)
12SE7_HP1133S1116 CUACUACUATTTTATTTTAGGGCTTCTACG (SEQ ID NO. 5431)
12SE8_HP1134S1117 CUACUACUACTCCTACTCGCTATAAGTGAG (SEQ ID NO. 5432)
12SE9_HP1135S1118 CUACUACUATGATTAAATAGATTGAATAAC (SEQ ID NO. 5433)
12SE10_HP1136S1119 CUACUACUATGCATTTAAAGCCTTTGTTTC (SEQ ID NO. 5434)
12SE11_HP1137S1120 CUACUACUACATTCATCAACTCCCTAAACC (SEQ ID NO. 5435)
12SE12_HP1138S1121 CUACUACUAAAGCGTTAAGACATCATTTTT (SEQ ID NO. 5436)
12SF1_HP1139S1122 CUACUACUACCATCACTACCCCTGAAGAAT (SEQ ID NO. 5437)
12SF2_HP1140S1123 CUACUACUAACTCACATCCTATCATAAATC (SEQ ID NO. 5438)
12SF3_HP1141S1124 CUACUACUATTGTCTCATGCCAAAATACCG (SEQ ID NO. 5439)
12SF4_HP1142S1125 CUACUACUATTACTTCATTCTCATCATCAT (SEQ ID NO. 5440)
12SF5_HP1143S1126 CUACUACUACTAATCTTCTAAATCCTCCCC (SEQ ID NO. 5441)
12SF6_HP1144S1127 CUACUACUACTAGGCGCATGCCGAATAATG (SEQ ID NO. 5442)
12SF7_HP1145S1128 CUACUACUATGTGGGGATCGTGTTAGTTCT (SEQ ID NO. 5443)
12SF8_HP1146S1129 CUACUACUACGGCTTCTAGATCAAGACTTC (SEQ ID NO. 5444)
12SF9_HP1147S1130 CUACUACUACAGAGATTAATGGCGGACTTC (SEQ ID NO. 5445)
12SF10_HP1148S1131 CUACUACUATCATGATTTGTGCTGTTTGAA (SEQ ID NO. 5446)
12SF11_HP1149S1132 CUACUACUATTTCACTCAACTATTCTCTAA (SEQ ID NO. 5447)
12SF12_HP1150S1133 CUACUACUAGCCCACTAAAAGCATAGAAAC (SEQ ID NO. 5448)
12SG1_HP1151S1134 CUACUACUACCTTAGGCTTTTTGAGAAAGT (SEQ ID NO. 5449)
12SG2_HP1152S1135 CUACUACUAAAAAGTTAGCGCATTTTAGGG (SEQ ID NO. 5450)
12SG3_HP1153S1136 CUACUACUACTATCCTTTTATTATGGTTGT (SEQ ID NO. 5451)
12SG4_HP1154S1137 CUACUACUATTAAGCTCGTTCTCTTTCTTT (SEQ ID NO. 5452)
12SG5_HP1155S1138 CUACUACUAATTTTAGGCGCTCAAAATCGT (SEQ ID NO. 5453)
12SG6_HP1156S1139 CUACUACUAGTTTAATCAAAAGCCTATGTT (SEQ ID NO. 5454)
12SG7_HP1157S1140 CUACUACUATTAAAACCCCATGATGTAATT (SEQ ID NO. 5455)
12SG8_HP1158S1141 CUACUACUATTTTTAGAGGCGCATTTTTTT (SEQ ID NO. 5456)
12SG9_HP1159S1142 CUACUACUAAATTTTTCAAAGCCCTTCATA (SEQ ID NO. 5457)
12SG10_HP1160S1143 CUACUACUACCTAATCCTGTGTGCGTTCAA (SEQ ID NO. 5458)
12SG11_HP1161S1144 CUACUACUATAAAAAATTAAGCGAAAGAAC (SEQ ID NO. 5459)
12SG12_HP1162S1145 CUACUACUACATCTTGCTTAAAAACCCATT (SEQ ID NO. 5460)
12SH1_HP1163S1146 CUACUACUATTAATTCTTTTGGCGTTTTTC (SEQ ID NO. 5461)
12SH2_HP1164S1147 CUACUACUAGAGTGAGCGGCTTTAAGAGTG (SEQ ID NO. 5462)
12SH3_HP1165S1148 CUACUACUAGCCGCTCACTCATCAAACGGC (SEQ ID NO. 5463)
12SH4_HP1166S1149 CUACUACUACTATTGGTTGTAATTTTTATA (SEQ ID NO. 5464)
12SH5_HP1167S1150 CUACUACUACATCAAAAAGATGCTACTAGA (SEQ ID NO. 5465)
12SH6_HP1168S1151 CUACUACUATTCAAGAGGCTTTGATGTAGG (SEQ ID NO. 5466)
12SH7_HP1169S1152 CUACUACUACCATTTTAAGCCACTTTCTTT (SEQ ID NO. 5467)
12SH8_HP1170S1153 CUACUACUATTTTACCCTCTAATGTGTTGG (SEQ ID NO. 5468)
12SH9_HP1171S1154 CUACUACUATATTTAACAGCTCCCTAAAAA (SEQ ID NO. 5469)
12SH10_HP1172S1155 CUACUACUATCATTCAAAAATAATTTCTTC (SEQ ID NO. 5470)
12SH11_HP1173S1156 CUACUACUAAATGAAGCGAAGAATAAAGAC (SEQ ID NO. 5471)
12SH12_HP1174S1157 CUACUACUATTTAGGAGTTTTCTTCTTGCT (SEQ ID NO. 5472)
12RA1_HP1079R1062 CAUCAUCAUGGATTTAATGATTCAGTCTGT (SEQ ID NO. 5473)
12RA2_HP1080R1063 CAUCAUCAUATGAAAAAGATTGCATTTTTT (SEQ ID NO. 5474)
12RA3_HP1081R1064 CAUCAUCAUCGTTTGACAGAATGCTAAGGG (SEQ ID NO. 5475)
12RA4_HP1082R1065 CAUCAUCAUAGGTTAAGCATTTGAAACTCT (SEQ ID NO. 5476)
12RA5_HP1083R1066 CAUCAUCAUAAAAGGAAAAAAATGAAAAAT (SEQ ID NO. 5477)
12RA6_HP1084R1067 CAUCAUCAUATGCCAAAAAAATGCCGACAC (SEQ ID NO. 5478)
12RA7_HP1085R1068 CAUCAUCAUGGAATGGGCTTGAAAAATCTC (SEQ ID NO. 5479)
12RA8_HP1086R1069 CAUCAUCAUTTTAATGCGCTTAGATTACGC (SEQ ID NO. 5480)
12RA9_HP1087R1070 CAUCAUCAUGGGAATGTTGAATTTTTTATC (SEQ ID NO. 5481)
12RA10_HP1088R1071 CAUCAUCAUGGAAAAATTTGATGCGATTGA (SEQ ID NO. 5482)
12RA11_HP1089R1072 CAUCAUCAUAACGCATGAACTTAGAAAAAC (SEQ ID NO. 5483)
12RA12_HP1090R1073 CAUCAUCAUGTGCAACCGATGAAATCTAAA (SEQ ID NO. 5484)
12RB1_HP1091R1074 CAUCAUCAUCAAGGATAGCCATGAACCCCC (SEQ ID NO. 5485)
12RB2_HP1092R1075 CAUCAUCAUCATGCAAAATGGGTATTATGC (SEQ ID NO. 5486)
12RB3_HP1093R1076 CAUCAUCAUATGGAACACTCTTTAATCATT (SEQ ID NO. 5487)
12RB4_HP1094R1077 CAUCAUCAUGGTGTGTGAGCGAAAACTTTT (SEQ ID NO. 5488)
12RB5_HP1095R1078 CAUCAUCAUTTTGCTTAACGCTATCAAGTT (SEQ ID NO. 5489)
12RB6_HP1096R1079 CAUCAUCAUGGATAGAGGAATGAGGAAAAA (SEQ ID NO. 5490)
12RB7_HP1097R1080 CAUCAUCAUTTGATCAAATGGCTTTAAAAA (SEQ ID NO. 5491)
12RB8_HP1098R1081 CAUCAUCAUATGTTAGAAAATGTCAAAAAG (SEQ ID NO. 5492)
12RB9_HP1099R1082 CAUCAUCAUAAATGCAAGATAAAATAATAG (SEQ ID NO. 5493)
12RB10_HP1100R1083 CAUCAUCAUTTCATGCCTAAGCATTCTTTA (SEQ ID NO. 5494)
12RB11_HP1101R1084 CAUCAUCAUTTGATGTTAGATTTTGATTTG (SEQ ID NO. 5495)
12RB12_HP1102R1085 CAUCAUCAUCATGGGTTATCAATTGTTTGA (SEQ ID NO. 5496)
12RC1_HP1103R1086 CAUCAUCAUTATGCCAAAAACTGAAACTTA (SEQ ID NO. 5497)
12RC2_HP1104R1087 CAUCAUCAUTTAATGAGAGTTCAATCTAAA (SEQ ID NO. 5498)
12RC3_HP1105R1088 CAUCAUCAUTTTTTTATGACTTCAGCTTCA (SEQ ID NO. 5499)
12RC4_HP1106R1089 CAUCAUCAUGGCTGTGTTTTAGTTTGATTT (SEQ ID NO. 5500)
12RC5_HP1107R1090 CAUCAUCAUTGTGGGGCAGGATAACATAAG (SEQ ID NO. 5501)
12RC6_HP1108R1091 CAUCAUCAUTATTACCATGTTTCAAATTAG (SEQ ID NO. 5502)
12RC7_HP1109R1092 CAUCAUCAUATGAAAGATTGGAACGAATTT (SEQ ID NO. 5503)
12RC8_HP1110R1093 CAUCAUCAUAAAAATATGGCAAAAAGTATT (SEQ ID NO. 5504)
12RC9_HP1111R1094 CAUCAUCAUTATCATGGTAAAAGAAGTCAA (SEQ ID NO. 5505)
12RC10_HP1112R1095 CAUCAUCAUTAAGGATTGTCGGTGTTAGAA (SEQ ID NO. 5506)
12RC11_HP1113R1096 CAUCAUCAUTTGAAACGAGCGTTATACTTA (SEQ ID NO. 5507)
12RC12_HP1114R1097 CAUCAUCAUAAAACATGCCCTTATTTGATT (SEQ ID NO. 5508)
12RD1_HP1115R1098 CAUCAUCAUATGATCAATTACCCTAATCTA (SEQ ID NO. 5509)
12RD2_HP1116R1099 CAUCAUCAUAGGAGTGAAGTTACCCAAAGC (SEQ ID NO. 5510)
12RD3_HP1117R1100 CAUCAUCAUAGTAACAATGGGATACGCAAG (SEQ ID NO. 5511)
12RD4_HP1118R1101 CAUCAUCAUGATGAGACGGAGTTTTTTGAA (SEQ ID NO. 5512)
12RD5_HP1119R1102 CAUCAUCAUAAAATGGGCGGAATCTTATCT (SEQ ID NO. 5513)
12RD6_HP1120R1103 CAUCAUCAUATGGTCGTTTTACATTCTCAT (SEQ ID NO. 5514)
12RD7_HP1121R1104 CAUCAUCAUACTTTTATGGATTTTTGCTCT (SEQ ID NO. 5515)
12RD8_HP1122R1105 CAUCAUCAUATGGAATGAATATCAAATTAA (SEQ ID NO. 5516)
12RD9_HP1123R1106 CAUCAUCAUCATGCAAAACCATGATTTAGA (SEQ ID NO. 5517)
12RD10_HP1124R1107 CAUCAUCAUGGATGAAAAGGCTTTTTTTTA (SEQ ID NO. 5518)
12RD11_HP1125R1108 CAUCAUCAUATGAAGAGATCTTCTGTATTT (SEQ ID NO. 5519)
12RD12_HP1126R1109 CAUCAUCAUAGGAAGAACAATGAGGTATTT (SEQ ID NO. 5520)
12RE1_HP1127R1110 CAUCAUCAUAAAATGCCAACTGATAAGGAG (SEQ ID NO. 5521)
12RE2_HP1128R1111 CAUCAUCAUGCTAATTTGCATGAGTAAGAG (SEQ ID NO. 5522)
12RE3_HP1129R1112 CAUCAUCAUGGTTTTTATGAATTACGATAA (SEQ ID NO. 5523)
12RE4_HP1130R1113 CAUCAUCAUATGTTAGATTCAATCGTTTAT (SEQ ID NO. 5524)
12RE5_HP1131R1114 CAUCAUCAUGTGATGGCTTTGTTGAAAATT (SEQ ID NO. 5525)
12RE6_HP1132R1115 CAUCAUCAUATGAAAGCGATGGAAGGTAAA (SEQ ID NO. 5526)
12RE7_HP1133R1116 CAUCAUCAUATGGCGAATTTAAGAGACATT (SEQ ID NO. 5527)
12RE8_HP1134R1117 CAUCAUCAUGAGATAAATAATGTCCCAACT (SEQ ID NO. 5528)
12RE9_HP1135R1118 CAUCAUCAUATGCAAGATTTAAAAGTGATC (SEQ ID NO. 5529)
12RE10_HP1136R1119 CAUCAUCAUTGATGAATGTTTGTAGTTAAA (SEQ ID NO. 5530)
12RE11_HP1137R1120 CAUCAUCAUATAGGTTATGAATATATCGGT (SEQ ID NO. 5531)
12RE12_HP1138R1121 CAUCAUCAUGGTAGTGATGGCAAAAAATAA (SEQ ID NO. 5532)
12RF1_HP1139R1122 CAUCAUCAUGATGTGAGTATGATGAGTGAA (SEQ ID NO. 5533)
12RF2_HP1140R1123 CAUCAUCAUGCATGAGACAATGTGAAAAAA (SEQ ID NO. 5534)
12RF3_HP1141R1124 CAUCAUCAUATAACATGCGAATCGTATTTA (SEQ ID NO. 5535)
12RF4_HP1142R1125 CAUCAUCAUAGTGAGCGTGAATAGTAATGG (SEQ ID NO. 5536)
12RF5_HP1143R1126 CAUCAUCAUCATGCAAGAAAATCAAACCCG (SEQ ID NO. 5537)
12RF6_HP1144R1127 CAUCAUCAUATGGGAAGAAGTCTTGCATTA (SEQ ID NO. 5538)
12RF7_HP1145R1128 CAUCAUCAUCTGTTGGTGCACGATATTACC (SEQ ID NO. 5539)
12RF8_HP1146R1129 CAUCAUCAUATGAGAACAGAACCTTTATTT (SEQ ID NO. 5540)
12RF9_HP1147R1130 CAUCAUCAUAACACATGAAAAACCGCTACA (SEQ ID NO. 5541)
12RF10_HP1148R1131 CAUCAUCAUGTGAAATTCAGCGTTTTAACC (SEQ ID NO. 5542)
12RF11_HP1149R1132 CAUCAUCAUATGGTTTCTATGCTTTTAGTG (SEQ ID NO. 5543)
12RF12_HP1150R1133 CAUCAUCAUAGGTTAAATGCACGAATCAGT (SEQ ID NO. 5544)
12RG1_HP1151R1134 CAUCAUCAUGGAGATTTTTGTAATGACAGT (SEQ ID NO. 5545)
12RG2_HP1152R1135 CAUCAUCAUAAAAGGATAGAAAATGTTTCA (SEQ ID NO. 5546)
12RG3_HP1153R1136 CAUCAUCAUAATGATAATGAAACAAGAACC (SEQ ID NO. 5547)
12RG4_HP1154R1137 CAUCAUCAUAAAGGATAGTCCAGTGAATTA (SEQ ID NO. 5548)
12RG5_HP1155R1138 CAUCAUCAUATGAAATTCGCTCTTACAGGG (SEQ ID NO. 5549)
12RG6_HP1156R1139 CAUCAUCAUTTGCAAAACTTTGTTTTTAGT (SEQ ID NO. 5550)
12RG7_HP1157R1140 CAUCAUCAUAATAAGGGAAAAATATGATAA (SEQ ID NO. 5551)
12RG8_HP1158R1141 CAUCAUCAUGGTGTGTCATGGAAATCTTAC (SEQ ID NO. 5552)
12RG9_HP1159R1142 CAUCAUCAUGCGCGATGCATTTAGACAGGC (SEQ ID NO. 5553)
12RG10_HP1160R1143 CAUCAUCAUATGCTAGAAATAGACAACCAA (SEQ ID NO. 5554)
12RG11_HP1161R1144 CAUCAUCAUGATATGGGAAAAATTGGTATC (SEQ ID NO. 5555)
12RG12_HP1162R1145 CAUCAUCAUATGAAAATAAAAACGAATCAA (SEQ ID NO. 5556)
12RH1_HP1163R1146 CAUCAUCAUAGATGAATACAGAAATTTTAA (SEQ ID NO. 5557)
12RH2_HP1164R1147 CAUCAUCAUATGAATCAAGAAATTTTAGAC (SEQ ID NO. 5558)
12RH3_HP1165R1148 CAUCAUCAUCAAATGTTAAGGAAAAACATT (SEQ ID NO. 5559)
12RH4_HP1166R1149 CAUCAUCAUATGCTAACCCAATTAAAAACT (SEQ ID NO. 5560)
12RH5_HP1167R1150 CAUCAUCAUCATGAAAAAGGCAAGTCAGGT (SEQ ID NO. 5561)
12RH6_HP1168R1151 CAUCAUCAUAGGACATGCAAAAAAGTTTAG (SEQ ID NO. 5562)
12RH7_HP1169R1152 CAUCAUCAUAGCGTTATGGCATTAGATTGG (SEQ ID NO. 5563)
12RH8_HP1170R1153 CAUCAUCAUATGGGAGTTTTACTAGAATTA (SEQ ID NO. 5564)
12RH9_HP1171R1154 CAUCAUCAUAAATGAGCGTGATTTTAGAAA (SEQ ID NO. 5565)
12RH10_HP1172R1155 CAUCAUCAUATGAAAACAAACGGGCTTTTT (SEQ ID NO. 5566)
12RH11_HP1173R1156 CAUCAUCAUAGGATTTGTTGATGAGTTATT (SEQ ID NO. 5567)
12RH12_HP1174R1157 CAUCAUCAUGGTATGCAAAAAACTTCTAAC (SEQ ID NO. 5568)
13SA1_HP1175S1158 CUACUACUACTTAACGAAAGATAAATACAG (SEQ ID NO. 5569)
13SA2_HP1176S1159 CUACUACUAGCGTTTTTTAAATTTGAAAAG (SEQ ID NO. 5570)
13SA3_HP1177S1160 CUACUACUATTAATAGGCAAACACATAATT (SEQ ID NO. 5571)
13SA4_HP1178S1161 CUACUACUAAGGCTAACTCATCATCTCCAA (SEQ ID NO. 5572)
13SA5_HP1179S1162 CUACUACUAGGGGTCATGGTTGTCCTTTAA (SEQ ID NO. 5573)
13SA6_HP1180S1163 CUACUACUACTTTTAATGAGCGTTTAGCCC (SEQ ID NO. 5574)
13SA7_HP1181S1164 CUACUACUATAATTGTTTTATTTTCTAAAG (SEQ ID NO. 5575)
13SA8_HP1182S1165 CUACUACUAAGCTTAAGCGTCTAAATATTT (SEQ ID NO. 5576)
13SA9_HP1183S1166 CUACUACUATCAAGCTTTTTTGTTGAGTAT (SEQ ID NO. 5577)
13SA10_HP1184S1167 CUACUACUATTATTTATTGGTTAAGTTTTC (SEQ ID NO. 5578)
13SA11_HP1185S1168 CUACUACUACTCTTTATGTTTTTTTAAACT (SEQ ID NO. 5579)
13SA12_HP1186S1169 CUACUACUAGGCTCTTCTAATCACAAACCA (SEQ ID NO. 5580)
13SB1_HP1187S1170 CUACUACUAAACGATTTTATGGTGTTTTCT (SEQ ID NO. 5581)
13SB2_HP1189S1172 CUACUACUATCAAGCGTTCTTAATGTAATG (SEQ ID NO. 5582)
13SB3_HP1190S1173 CUACUACUAAGCGACATTATAAGTCTTCAT (SEQ ID NO. 5583)
13SB4_HP1191S1174 CUACUACUATTAAGGCTCTTCTAAAAGAGT (SEQ ID NO. 5584)
13SB5_HP1192S1175 CUACUACUACATTAGACCACTGAGTTTTTA (SEQ ID NO. 5585)
13SB6_HP1193S1176 CUACUACUACTTTTATTGATTCACCATTTC (SEQ ID NO. 5586)
13SB7_HP1194S1177 CUACUACUACGTTTTTTAAGATTGAATACA (SEQ ID NO. 5587)
13SB8_HP1195S1178 CUACUACUAAATAAGAGAGCGTTATAATTA (SEQ ID NO. 5588)
13SB9_HP1196S1179 CUACUACUATAACTCCAATTACCAGCGGTA (SEQ ID NO. 5589)
13SB10_HP1197S1180 CUACUACUATTTCCTCTTATTTTTTCTTGT (SEQ ID NO. 5590)
13SB11_HP1198S1181 CUACUACUATATCAAAATTTAGAGTTATCC (SEQ ID NO. 5591)
13SB12_HP1199S1182 CUACUACUAGTCTTACTTGACTTCAACCTT (SEQ ID NO. 5592)
13SC1_HP1200S1183 CUACUACUACCATCACAGGCTCTTAGTTTT (SEQ ID NO. 5593)
13SC2_HP1201S1184 CUACUACUATAACGCTATTTAATATCCATC (SEQ ID NO. 5594)
13SC3_HP1202S1185 CUACUACUACCAATCAATCCACAACTTCTA (SEQ ID NO. 5595)
13SC4_HP1203S1186 CUACUACUACATGTTTTCTCCTTAAAAAGT (SEQ ID NO. 5596)
13SC5 HPX204Ξ1187 CUACUACUACTAGCTCTTCAATTTGATTTC (SEQ ID NO. 5597)
13SC6JHP1205S1X88 CUACUACUATTATTCAATAATATTGCTCAC (SEQ ID NO. 5598) 13SC7_HP1206S1189 CUACUACUATTAGCCGAGATTGTCTTTGTG (SEQ ID NO. 5599) 13SC8_HPX207S1190 CUACUACUACTTTACTCTTTTATAAAGTTT (SEQ ID NO. 5600) 13SC9_HP1208S1191 CUACUACUATGACTCTTATTTGATAAGAAT (SEQ ID NO. 5601) 13SC10_HP1209S1192 CUACUACUAGCAACCGCTATAAAGTAGTTT (SEQ ID NO. 5602) X3SC11_HP1210S1193 CUACUACUACAACCGCTTGTTTAAAATTAA (SEQ ID NO. 5603) 13SC12_HPX2X1S1194 CUACUACUATTGTTTATTTTAAATTTTTAG (SEQ ID NO. 5604) 13SD1_HP1212S1195 CUACUACUATTAACTTAAGAATGGGTTACT (SEQ ID NO. 5605) 13SD2_HP1213S1196 CUACUACUACTTTTTAAAATTTTAAGCCAA (SEQ ID NO. 5606) 13SD3_HP1214SXX97 CUACUACUATATTTAATGTTCCTTTTTTAA (SEQ ID NO. 5607) 13SD4_HP1215S1198 CUACUACUATCTCACTTACGCTGTAAAGAG (SEQ ID NO. 5608) X3SD5_HP1216S1X99 CUACUACUAAAAACCCGCCTTTAAATAATC (SEQ ID NO. 5609) 13SD6_HP1217S1200 CUACUACUAATGATTAAAACAATACCAAAA (SEQ ID NO. 5610) 13SD7_HP1218S1201 CUACUACUATTAAGAATATTCTTTCAAATC (SEQ ID NO. 5611) X3SD8_HPX2X9SX202 CUACUACUAAGTCAGTCAGTTTCTCTTAAA (SEQ ID NO. 5612) X3SD9_HP1220S1203 CUACUACUATTATTTCAACCTTTCTTTATA (SEQ ID NO. 5613) X3SD10_HP1221S1204 CUACUACUACTAGCATTTTAATTCCCCGAA (SEQ ID NO. 5614) 13SD11_HP1222S1205 CUACUACUATTAAAGCGTGCAAGCATCCAC (SEQ ID NO. 5615) 13SD12_HP1223S1206 CUACUACUACTCATGCCTAGTTTTTTTTGT (SEQ ID NO. 5616) 13SE1_HP1224S1207 CUACUACUATTTTAACATTCCTTAATTCTC (SEQ ID NO. 5617) X3SE2_HP1225S1208 CUACUACUATTTTTAATGAATGCCTACAAA (SEQ ID NO. 5618) 13SE3_HPX226SX209 CUACUACUAACAATCATAACAGCCACAAAG (SEQ ID NO. 5619) X3SE4_HPX227SX210 CUACUACUATTATTTGAGGGTGGGGATGTA (SEQ ID NO. 5620) 13SE5_HPX228SX211 CUACUACUACTATAAATACCCCTCTCTTTT (SEQ ID NO. 5621) 13SE6_HP1229S1212 CUACUACUATTCATTGATCTAATTGATACA (SEQ ID NO. 5622) 13SE7_HP1230S1213 CUACUACUATTCACAGAGTAACCTTATTGT (SEQ ID NO. 5623) 13SE8_HP1231S1214 CUACUACUATCATGGCGTTTTCTTTTTGGA (SEQ ID NO. 5624) 13SE9_HPX232S1215 CUACUACUAGGTTCAATCCGTTTCTTCCAA (SEQ ID NO. 5625) 13SE10_HPX233SX2X6 CUACUACUATTAAAGAATCACATCCATGGT (SEQ ID NO. 
5626) 13SE11_HP1234SX2X7 CUACUACUACATCGCTTAAATTTTTTTTAA (SEQ ID NO. 5627) 13SE12_HP1235S1218 CUACUACUAAATTTAATGGCGCACCAAACT (SEQ ID NO. 5628) 13SF1 HP1236S1219 CUACUACUAGAGACCATTATAAAAAACCCT (SEQ ID NO. 5629)
13SF2_HP1237S1220 CUACUACUACCCCTAAAAATCCTTTAACAA (SEQ ID NO. 5630) 13SF3_HP1238S1221 CUACUACUAAATGCAAGGTTAGGGATTATT (SEQ ID NO. 5631) 13SF4_HP1239S1222 CUACUACUAGCTAAACATGCCAATTTATGA (SEQ ID NO. 5632) 13SF5_HP1240S1223 CUACUACUAACAGCGTTTATAATGATGAAA (SEQ ID NO. 5633) 13SF6_HPX24XSX224 CUACUACUACTCCATGCTATCCCTCTAAAG (SEQ ID NO. 5634) X3SF7_HPX242SX225 CUACUACUACCTTAAGCGCGCTCAGATTTT (SEQ ID NO. 5635) X3SF8_HPX243SX226 CUACUACUACCAAACATTAGTAAGCGAACA (SEQ ID NO. 5636) X3SF9_HPX244S1227 CUACUACUATCAAACATTTAGTGCTGTTTA (SEQ ID NO. 5637) 13SFX0_HPX245SX228 CUACUACUATTAACCCTTAAAAGGGGATTT (SEQ ID NO. 5638) 13SF11_HPX246S1229 CUACUACUATTATTCGCTATGAGATCCTAC (SEQ ID NO. 5639) 13SFX2_HP1247SX230 CUACUACUATTTTTAGGGGTTAATGATTTT (SEQ ID NO. 5640) 13SG1_HP1248S1231 CUACUACUATTTACGATACATGCTCTTTAA (SEQ ID NO. 5641) 13SG2_HP1249S1232 CUACUACUATCAAAAAACACTTCGCATGAC (SEQ ID NO. 5642) X3SG3_HPX250S1233 CUACUACUATCATTCAGCCTTTTTTAAAAG (SEQ ID NO. 5643) 13SG4_HPX25XS1234 CUACUACUATCAACGCTTTTCAAAATCAAT (SEQ ID NO. 5644) 13SG5__HP1252S1235 CUACUACUATCATTGAAGATCCTTTTCTTT (SEQ ID NO. 5645) X3SG6_HPX253SX236 CUACUACUACTCAAATTAACCCCATCAATT (SEQ ID NO. 5646) X3SG7_HP1254SX237 CUACUACUATTAAGAACGCTTGATCCCAGA (SEQ ID NO. 5647) X3SG8_HPX255SX238 CUACUACUAGTAACATTATTTTTCATCCTT (SEQ ID NO. 5648) X3SG9_HPX256SX239 CUACUACUACCATGGTTTAGACCTTTAAGA (SEQ ID NO. 5649) X3SG10_HPX257S1240 CUACUACUATTTTAGTTGCCACGACTTCCT (SEQ ID NO. 5650) 13SGX1_HP1258S124X CUACUACUATTTTTCATGTTGCCTCCTTGA (SEQ ID NO. 5651) X3SG12_HP1259S1242 CUACUACUATTAAGAAGCCATTTCTATGAG (SEQ ID NO. 5652) 13SH1_HP1260S1243 CUACUACUAGCATTATTTCACCTCTAATTT (SEQ ID NO. 5653) X3SH2_HPX261S1244 CUACUACUATCTTACCATCACACTAACCTT (SEQ ID NO. 5654) 13SH3_HPX262S1245 CUACUACUAGAGCCATTTTATAGCCTCTTT (SEQ ID NO. 5655) 13SH4_HPX263SX246 CUACUACUAGCGTTTCATCTATCCACCTCG (SEQ ID NO. 5656) X3SH5_HPX264SX247 CUACUACUATCAAAATCAAGCCTTTTTTCT (SEQ ID NO. 5657) X3SHS_HPX265SX248 CUACUACUATCATGAAATTCCTTGAGAGTG (SEQ ID NO. 
5658) 13SH7_HP1266S1249 CUACUACUAGCTCATGCTTGCTCCTTTAAA (SEQ ID NO. 5659) 13SH8_HP1267S1250 CUACUACUAAAACCTCCTTTAAATTAAAAT (SEQ ID NO. 5660) 13SH9_HP1268S1251 CUACUACUAGGTTTCAAACATTTTCATCTC (SEQ ID NO. 5661)
13SH10_HP1269S1252 CUACUACUAGTTTAACCCTATCATAGAGAT (SEQ ID NO. 5662)
13SH11_HP1270S1253 CUACUACUACTCAACCTTTCATAGCGTTTA (SEQ ID NO. 5663)
13SH12_HP1271S1254 CUACUACUATTATCTCCCAAAAAAAGCTAC (SEQ ID NO. 5664)
13RA1_HP1175R1158 CAUCAUCAUCATGGGGTTTTTCAAGCTTAA (SEQ ID NO. 5665)
13RA2_HP1176R1159 CAUCAUCAUATAATGCTTTCGCCATTTTTA (SEQ ID NO. 5666)
13RA3_HP1177R1160 CAUCAUCAUATCTTATGAAAAAAACGAAAA (SEQ ID NO. 5667)
13RA4_HP1178R1161 CAUCAUCAUCATGACCCCTCACATCAACGC (SEQ ID NO. 5668)
13RA5_HP1179R1162 CAUCAUCAUAAACATGCAAAAAAGAGTGGT (SEQ ID NO. 5669)
13RA6_HP1180R1163 CAUCAUCAUATGATTTTTAGCTCTCTTTTT (SEQ ID NO. 5670)
13RA7_HP1181R1164 CAUCAUCAUATGTTTAAGAAAATTTTTCCA (SEQ ID NO. 5671)
13RA8_HP1182R1165 CAUCAUCAUAGGATCAAAAATGGCCTATGA (SEQ ID NO. 5672)
13RA9_HP1183R1166 CAUCAUCAUCATGCATGCAGAATTTTTCAC (SEQ ID NO. 5673)
13RA10_HP1184R1167 CAUCAUCAUAAATGCTAATAAAAAAGATTG (SEQ ID NO. 5674)
13RA11_HP1185R1168 CAUCAUCAUCCCTTAAGATTTAAATGATGA (SEQ ID NO. 5675)
13RA12_HP1186R1169 CAUCAUCAUTAAGGATAATTAAAATGAAAA (SEQ ID NO. 5676)
13RB1_HP1187R1170 CAUCAUCAUAATATGACTATATACACTACA (SEQ ID NO. 5677)
13RB2_HP1189R1172 CAUCAUCAUGGAGAAAACGATGAAGACTTA (SEQ ID NO. 5678)
13RB3_HP1190R1173 CAUCAUCAUATGATTACCCCTAAAGTGTTG (SEQ ID NO. 5679)
13RB4_HP1191R1174 CAUCAUCAUCTATTGATGAGCGTAAATGCA (SEQ ID NO. 5680)
13RB5_HP1192R1175 CAUCAUCAUGTATGAAAAAGCAAATCTTGA (SEQ ID NO. 5681)
13RB6_HP1193R1176 CAUCAUCAUATGCAACAGCGTCATTTAGGC (SEQ ID NO. 5682)
13RB7_HP1194R1177 CAUCAUCAUGTAGCAATGCTCTTTAAAAAT (SEQ ID NO. 5683)
13RB8_HP1195R1178 CAUCAUCAUGAGTTAGAATGGCTAGAAAAA (SEQ ID NO. 5684)
13RB9_HP1196R1179 CAUCAUCAUCAAAAACATGAGAAGAAGAAA (SEQ ID NO. 5685)
13RB10_HP1197R1180 CAUCAUCAUAATAGTGCCTACTATCAATCA (SEQ ID NO. 5686)
13RB11_HP1198R1181 CAUCAUCAUATGTCAAAAAAAATTCCCCTA (SEQ ID NO. 5687)
13RB12_HP1199R1182 CAUCAUCAUGGATTATGGCAATTTCAAAAG (SEQ ID NO. 5688)
13RC1_HP1200R1183 CAUCAUCAUAGATGCAAAAACAACATCAAA (SEQ ID NO. 5689)
13RC2_HP1201R1184 CAUCAUCAUATCGTGGCAAAAAAAGTATTT (SEQ ID NO. 5690)
13RC3_HP1202R1185 CAUCAUCAUATGGCTAAAAAAGTAGTCGGA (SEQ ID NO. 5691)
13RC4_HP1203R1186 CAUCAUCAUGAATAATGATGGATTGGTATG (SEQ ID NO. 5692)
13RC5_HP1204R1187 CAUCAUCAUCATTATGAAAGTTAAAATAGG (SEQ ID NO. 5693)
13RC6_HP1205R1188 CAUCAUCAUATGGCAAAAGAAAAGTTTAAC (SEQ ID NO. 5694) 13RC7_HP1206R1189 CAUCAUCAUCACTTAAATTATGGCAAAAAA (SEQ ID NO. 5695) 13RC8_HP1207R1190 CAUCAUCAUAATCATGGCGCTTGAAGTGGT (SEQ ID NO. 5696) 13RC9_HP1208R1191 CAUCAUCAUCAATGAACTACATCGGTTCTA (SEQ ID NO. 5697) 13RC10_HP1209R1192 CAUCAUCAUTTTAATTTGGAGTTTGATAAA (SEQ ID NO. 5698) 13RC11_HP1210R1193 CAUCAUCAUATGCTGGATTTGTCTTATAGC (SEQ ID NO. 5699) 13RC12_HP1211R1194 CAUCAUCAUGATTGAAAAAGATTACTATTA (SEQ ID NO. 5700) 13RD1_HP1212R1195 CAUCAUCAUATGAAATTTTTAGCGTTATTT (SEQ ID NO. 5701) 13RD2_HP1213R1196 CAUCAUCAUATATGGATTTTATCACCATCA (SEQ ID NO. 5702) 13RD3_HP1214R1197 CAUCAUCAUGTGAGAAAAGGAATGCATTTG (SEQ ID NO. 5703) 13RD4_HP1215R1198 CAUCAUCAUCATGAGCGCGGATGTGGGTTA (SEQ ID NO. 5704) 13RD5_HP1216R1199 CAUCAUCAUTAATCATGATTTATTGGTTGT (SEQ ID NO. 5705) 13RD6_HP1217R1200 CAUCAUCAUGAATGCACTCTCCAAATTTAG (SEQ ID NO. 5706) 13RD7_HP1218R1201 CAUCAUCAUATGAAAGATAACAATAACTAT (SEQ ID NO. 5707) 13RD8_HP1219R1202 CAUCAUCAUAAATGCCATCACACAAAAACC (SEQ ID NO. 5708) 13RD9_HP1220R1203 CAUCAUCAUTAAAATGCTAGTAGAAATAGA (SEQ ID NO. 5709) 13RD10_HP1221R1204 CAUCAUCAUGGCATTGGACAACACTCTCAA (SEQ ID NO. 5710) 13RD11_HP1222R1205 CAUCAUCAUGTGCGTGTGGAAGAAAATTAT (SEQ ID NO. 5711) 13RD12_HP1223R1206 CAUCAUCAUGATAAACATGCTTGAAGATTA (SEQ ID NO. 5712) 13RE1_HP1224R1207 CAUCAUCAUATGAGGGAGATTGTATGGGTG (SEQ ID NO. 5713) 13RE2_HP1225R1208 CAUCAUCAUGCATGAATTTTGTCTTTTTAT (SEQ ID NO. 5714) 13RE3_HP1226R1209 CAUCAUCAUTGAAGTGAAAATGAGAGAAAT (SEQ ID NO. 5715) 13RE4_HP1227R1210 CAUCAUCAUACAAACGATGAAAAAGGTTAT (SEQ ID NO. 5716) 13RE5_HP1228R1211 CAUCAUCAUAAGACCTATGCTACATAAAAA (SEQ ID NO. 5717) 13RE6_HP1229R1212 CAUCAUCAUTAGGGTGTTAATCGTTCAAAA (SEQ ID NO. 5718) 13RE7_HP1230R1213 CAUCAUCAUCAATGAAAAATTTCTACGATT (SEQ ID NO. 5719) 13RE8_HP1231R1214 CAUCAUCAUACTCTGTGAAAAACTCCAACC (SEQ ID NO. 5720) 13RE9_HP1232R1215 CAUCAUCAUATGATTGTAAAACGCCTTAAC (SEQ ID NO. 5721) 13RE10_HP1233R1216 CAUCAUCAUATGGCTGTTTCTTCTATCAAT (SEQ ID NO.
5722) 13RE11_HP1234R1217 CAUCAUCAUATGCGTAATACCATTTTATTT (SEQ ID NO. 5723) 13RE12_HP1235R1218 CAUCAUCAUATGCAACTAAGCCCCTTACAA (SEQ ID NO. 5724) 13RF1_HP1236R1219 CAUCAUCAUAAAATCATGAGACTCAAACTA (SEQ ID NO. 5725)
13RF2_HP1237R1220 CAUCAUCAUATGGTCTCTCTCTATTTAGAA (SEQ ID NO. 5726) 13RF3_HP1238R1221 CAUCAUCAUATGGGAAGTATCGGTAGTATG (SEQ ID NO. 5727) 13RF4_HP1239R1222 CAUCAUCAUATGGGGTCTTGGTCTCGTATT (SEQ ID NO. 5728) 13RF5_HP1240R1223 CAUCAUCAUGGGATAGCATGGAGCTTATTT (SEQ ID NO. 5729) 13RF6_HP1241R1224 CAUCAUCAUATGGATATTCGCAACGAATTT (SEQ ID NO. 5730) 13RF7_HP1242R1225 CAUCAUCAUATGTTCCATGAATTTAGAGAC (SEQ ID NO. 5731) 13RF8_HP1243R1226 CAUCAUCAUAAAAACATGAAAAAACACATC (SEQ ID NO. 5732) 13RF9_HP1244R1227 CAUCAUCAUATGGAAAGAAAACGCTATTCA (SEQ ID NO. 5733) 13RF10_HP1245R1228 CAUCAUCAUCATGTTTAATAAAGTGATTAT (SEQ ID NO. 5734) 13RF11_HP1246R1229 CAUCAUCAUTGGATGAGGCATTATGAAACG (SEQ ID NO. 5735) 13RF12_HP1247R1230 CAUCAUCAUATGTATCGTAAAGATTTGGAC (SEQ ID NO. 5736) 13RG1_HP1248R1231 CAUCAUCAUTTGATGCAAGGGTTTTTAAGA (SEQ ID NO. 5737) 13RG2_HP1249R1232 CAUCAUCAUGGCTGAATGAAAGAATAATGA (SEQ ID NO. 5738) 13RG3_HP1250R1233 CAUCAUCAUAGGAATGAAAACTGAGATGAA (SEQ ID NO. 5739) 13RG4_HP1251R1234 CAUCAUCAUGATCTTCAATGATTGCTTACA (SEQ ID NO. 5740) 13RG5_HP1252R1235 CAUCAUCAUGGGGTTAATTTGAGTGGTCTT (SEQ ID NO. 5741) 13RG6_HP1253R1236 CAUCAUCAUGTGGTTACAATAACACTAACT (SEQ ID NO. 5742) 13RG7_HP1254R1237 CAUCAUCAUTTTGGACTCTTTTCATTCATT (SEQ ID NO. 5743) 13RG8_HP1255R1238 CAUCAUCAUTTTAGGCATTTAAGGAATCAG (SEQ ID NO. 5744) 13RG9_HP1256R1239 CAUCAUCAUAAAATAATGTTACAGGCCATT (SEQ ID NO. 5745) 13RG10_HP1257R1240 CAUCAUCAUATGGATATTAAGGCATGTTAT (SEQ ID NO. 5746) 13RG11_HP1258R1241 CAUCAUCAUGTGGCAACTAAAAAAACAAAA (SEQ ID NO. 5747) 13RG12_HP1259R1242 CAUCAUCAUTGATGGCTTGTGGGAAAGGGC (SEQ ID NO. 5748) 13RH1_HP1260R1243 CAUCAUCAUGATGCAACAAGCCACAGAAGC (SEQ ID NO. 5749) 13RH2_HP1261R1244 CAUCAUCAUGTGAAATAATGCAACAAGCAC (SEQ ID NO. 5750) 13RH3_HP1262R1245 CAUCAUCAUAGGTTAGTGTGATGGTAAGAA (SEQ ID NO. 5751) 13RH4_HP1263R1246 CAUCAUCAUAATGGCTCAAAATTTCACGAA (SEQ ID NO. 5752) 13RH5_HP1264R1247 CAUCAUCAUAGATGAAACGCTTTGATTTAC (SEQ ID NO. 5753) 13RH6_HP1265R1248 CAUCAUCAUATTTTGAGCGGCTTTAACCCC (SEQ ID NO. 5754) 13RH7_HP1266R1249 CAUCAUCAUGGAATTTCATGATCACAATGA (SEQ ID NO.
5755) 13RH8_HP1267R1250 CAUCAUCAUAAGCATGAGCGCTTATATCAT (SEQ ID NO. 5756) 13RH9_HP1268R1251 CAUCAUCAUTATGGCCAAACAAGAATACAA (SEQ ID NO. 5757)
13RH10_HP1269R1252 CAUCAUCAUATGTTTGAAACCATTGCCTTT (SEQ ID NO. 5758) 13RH11_HP1270R1253 CAUCAUCAUGAATCTCTATGATAGGGTTAA (SEQ ID NO. 5759) 13RH12_HP1271R1254 CAUCAUCAUATGCAATATTCTTCTTTGCTG (SEQ ID NO. 5760) 14SA1_HP1272S1255 CUACUACUAGACTATCTATTAACATAAGAG (SEQ ID NO. 5761) 14SA2_HP1273S1256 CUACUACUATTTTGACTTAAGCCACAAATT (SEQ ID NO. 5762) 14SA3_HP1274S1257 CUACUACUATTATGACTCCTTGTTTTTGAA (SEQ ID NO. 5763) 14SA4_HP1275S1258 CUACUACUACTCTATTTTAAAGTTTTTCTA (SEQ ID NO. 5764) 14SA5_HP1276S1259 CUACUACUATGTTTAGTCCTTTTCGTCTTC (SEQ ID NO. 5765) 14SA6_HP1277S1260 CUACUACUATTAAAAAATCATTCCTCCTAT (SEQ ID NO. 5766) 14SA7_HP1278S1261 CUACUACUAACCTCATTTTAAACCTCCTTT (SEQ ID NO. 5767) 14SA8_HP1279S1262 CUACUACUATCATCTTAATACTCTCTTAAA (SEQ ID NO. 5768) 14SA9_HP1281S1263 CUACUACUATCTCTTTCATAATAACCCTTC (SEQ ID NO. 5769) 14SA10_HP1282S1264 CUACUACUATTCATAAGCTTGTTTTCCTGA (SEQ ID NO. 5770) 14SA11_HP1283S1265 CUACUACUATCAAACTCGTTCAAAATAGGT (SEQ ID NO. 5771) 14SA12_HP1284S1266 CUACUACUACTACATCTTGCGGTAATTGAT (SEQ ID NO. 5772) 14SB1_HP1285S1267 CUACUACUAGTTCTTATGCGATGGATAAAT (SEQ ID NO. 5773) 14SB2_HP1286S1268 CUACUACUATTATTGGGCGTAAGCTTCTAG (SEQ ID NO. 5774) 14SB3_HP1287S1269 CUACUACUATCAACTTTGATACGCCATATC (SEQ ID NO. 5775) 14SB4_HP1288S1270 CUACUACUACCCAAAGCTACCTGTCATACA (SEQ ID NO. 5776) 14SB5_HP1289S1271 CUACUACUATTAAATCACCACCCTAATTTA (SEQ ID NO. 5777) 14SB6_HP1290S1272 CUACUACUATCACTGCTTGCATGACTTATT (SEQ ID NO. 5778) 14SB7_HP1291S1273 CUACUACUAACCCGCTTATAAGAATTTTTG (SEQ ID NO. 5779) 14SB8_HP1292S1274 CUACUACUATTTCATACAAATTCAATGGTG (SEQ ID NO. 5780) 14SB9_HP1293S1275 CUACUACUATCAGTCGTTACCTCCTTTATC (SEQ ID NO. 5781) 14SB10_HP1294S1276 CUACUACUATTACTTAGAATACAATTCTAC (SEQ ID NO. 5782) 14SB11_HP1295S1277 CUACUACUACCTCCTTACACTCTTCTTCTT (SEQ ID NO. 5783) 14SB12_HP1296S1278 CUACUACUAATCTCCTTATTCGCTACTTGC (SEQ ID NO. 5784) 14SC1_HP1297S1279 CUACUACUAATGCTTTATCCTTGTCTTTGT (SEQ ID NO. 5785) 14SC2_HP1298S1280 CUACUACUAAATTCATTTATATCTAAAAGT (SEQ ID NO.
5786) 14SC3_HP1299S1281 CUACUACUACCATTAACGCTCCGTAAGAAT (SEQ ID NO. 5787) 14SC4_HP1300S1282 CUACUACUAGATTGCCATTAAAAGCCTACC (SEQ ID NO. 5788) 14SC5_HP1301S1283 CUACUACUATTACTTCTGCCCGCTTGTTTT (SEQ ID NO. 5789)
14SC6_HP1302S1284 CUACUACUATTTTTAACTACGCCTTGATTT (SEQ ID NO. 5790) 14SC7_HP1303S1285 CUACUACUAGTCATAGAGCGATCCCGTTCT (SEQ ID NO. 5791) 14SC8_HP1304S1286 CUACUACUATTATTTTTTAGCTGTTTTACC (SEQ ID NO. 5792) 14SC9_HP1305S1287 CUACUACUAAATCCTTTACCAAATGCTACA (SEQ ID NO. 5793) 14SC10_HP1306S1288 CUACUACUACTTACCAACTGGCTTTTCTCA (SEQ ID NO. 5794) 14SC11_HP1307S1289 CUACUACUAATTAACCCCTTATCTCACTTT (SEQ ID NO. 5795) 14SC12_HP1308S1290 CUACUACUAACTCCTTAGGCTTTCTTCACA (SEQ ID NO. 5796) 14SD1_HP1309S1291 CUACUACUACTTTTCATTATACAACCTCCG (SEQ ID NO. 5797) 14SD2_HP1310S1292 CUACUACUAGCTTATACTCCCACTACTAAA (SEQ ID NO. 5798) 14SD3_HP1311S1293 CUACUACUATTACTCAACGCTAGAAGAATA (SEQ ID NO. 5799) 14SD4_HP1312S1294 CUACUACUACTTTTATCTTTCAATTCAGTA (SEQ ID NO. 5800) 14SD5_HP1313S1295 CUACUACUAATTATTGCCTCCCTCTTCTGC (SEQ ID NO. 5801) 14SD6_HP1314S1296 CUACUACUACTACTTACCTTCTGCTTGATT (SEQ ID NO. 5802) 14SD7_HP1315S1297 CUACUACUACCCCTTACTTGCCAATCTTTT (SEQ ID NO. 5803) 14SD8_HP1316S1298 CUACUACUATTTAACCTTTATTTATGTTTC (SEQ ID NO. 5804) 14SD9_HP1317S1299 CUACUACUACCTTTCTTAAGCACCAAGGGC (SEQ ID NO. 5805) 14SD10_HP1318S1300 CUACUACUATTTTTACTCCTCTGTCTTATC (SEQ ID NO. 5806) 14SD11_HP1319S1301 CUACUACUATTATACCGCTCTAATGCGTCC (SEQ ID NO. 5807) 14SD12_HP1320S1302 CUACUACUAAGGACTACTTCGTTTCCATAG (SEQ ID NO. 5808) 14SE1_HP1321S1303 CUACUACUAATTATACCAAAATAGATTGAA (SEQ ID NO. 5809) 14SE2_HP1322S1304 CUACUACUAACTCTTAGAGGTGGAACAACG (SEQ ID NO. 5810) 14SE3_HP1323S1305 CUACUACUACCTTTTAAATAAGGCGTTGTT (SEQ ID NO. 5811) 14SE4_HP1324S1306 CUACUACUATCCCTACTCACATGTTTCATG (SEQ ID NO. 5812) 14SE5_HP1325S1307 CUACUACUATCAAGCCTTAGGCCCGATCAT (SEQ ID NO. 5813) 14SE6_HP1326S1308 CUACUACUATCTTATAAATCCAGGCTTGTT (SEQ ID NO. 5814) 14SE7_HP1327S1309 CUACUACUATCATTCTAATCCTTTAAGGTT (SEQ ID NO. 5815) 14SE8_HP1328S1310 CUACUACUATCATTCAATAATCCCCATTGT (SEQ ID NO. 5816) 14SE9_HP1329S1311 CUACUACUATCATGTTTGATTGCTTTTAAT (SEQ ID NO. 5817) 14SE10_HP1330S1312 CUACUACUATTAGGGATTAAAAAAAGCCTT (SEQ ID NO. 5818) 14SE11_HP1331S1313 CUACUACUAATCAACATTCTAACTGCTTCC (SEQ ID NO.
5819) 14SE12_HP1332S1314 CUACUACUATTATTTGAACCAGTCTTTAAT (SEQ ID NO. 5820) 14SF1_HP1333S1315 CUACUACUATTGTCCTACTCTTGCTTAAAA (SEQ ID NO. 5821)
14SF2_HP1334S1316 CUACUACUATTATTGTTGTTTAAATTGTGG (SEQ ID NO. 5822) 14SF3_HP1335S1317 CUACUACUATAGTTCTTTAGTTTTTAAACA (SEQ ID NO. 5823) 14SF4_HP1336S1318 CUACUACUAGGTTTAGGCGTGTTTTTTAAT (SEQ ID NO. 5824) 14SF5_HP1337S1319 CUACUACUAGCCTAAACCCCTAAACTAGCC (SEQ ID NO. 5825) 14SF6_HP1338S1320 CUACUACUATAAGACGCTATTCATTGTATT (SEQ ID NO. 5826) 14SF7_HP1339S1321 CUACUACUATTTCATTTTTTCATGATCCTG (SEQ ID NO. 5827) 14SF8_HP1340S1322 CUACUACUAACACTCGTTGAAACTTTACTG (SEQ ID NO. 5828) 14SF9_HP1341S1323 CUACUACUAAATCAGTCTTCTTTCAAGCTA (SEQ ID NO. 5829) 14SF10_HP1342S1324 CUACUACUATTAGTAAGCAAACACATAATT (SEQ ID NO. 5830) 14SF11_HP1343S1325 CUACUACUATTTTTAACTTTTTTTCATTTT (SEQ ID NO. 5831) 14SF12_HP1344S1326 CUACUACUACTACAACCAGCCTTTCTTTTT (SEQ ID NO. 5832) 14SG1_HP1345S1327 CUACUACUAAGAGTTAATGGCGTTTGTCCA (SEQ ID NO. 5833) 14SG2_HP1346S1328 CUACUACUATCTTTTAATTTTGTGCTATAT (SEQ ID NO. 5834) 14SG3_HP1347S1329 CUACUACUACATCAATCATAAACTAAAATC (SEQ ID NO. 5835) 14SG4_HP1348S1330 CUACUACUAAGTCAAAAAGCTTCATTGTTC (SEQ ID NO. 5836) 14SG5_HP1349S1331 CUACUACUACGATTGGACTTTTTATTTGAC (SEQ ID NO. 5837) 14SG6_HP1350S1332 CUACUACUAACCCCATGAGTTTTTATTTCT (SEQ ID NO. 5838) 14SG7_HP1351S1333 CUACUACUACTCTATCGCAAAGACAATAAC (SEQ ID NO. 5839) 14SG8_HP1352S1334 CUACUACUATTAAGAGTCCTTTTGGCAGAT (SEQ ID NO. 5840) 14SG9_HP1354S1335 CUACUACUATGGGTGTTTTAGGGTTTGTTG (SEQ ID NO. 5841) 14SG10_HP1355S1336 CUACUACUAGTCTTTAAGCCATTTTCATGT (SEQ ID NO. 5842) 14SG11_HP1356S1337 CUACUACUACCATTAAGATAACTCCATCAT (SEQ ID NO. 5843) 14SG12_HP1357S1338 CUACUACUACGTTATCAGTTGGCATGAAAT (SEQ ID NO. 5844) 14SH1_HP1358S1339 CUACUACUACTACCATTTTAACCCTGCCAA (SEQ ID NO. 5845) 14SH2_HP1359S1340 CUACUACUATCATAATTTTAAGCCTTTTCT (SEQ ID NO. 5846) 14SH3_HP1360S1341 CUACUACUAAACTCATGCATGCTTAAACCC (SEQ ID NO. 5847) 14SH4_HP1361S1342 CUACUACUAACTTTTCTAACCTACAATTAA (SEQ ID NO. 5848) 14SH5_HP1362S1343 CUACUACUATCAAGTTGTAACTATATCATA (SEQ ID NO. 5849) 14SH6_HP1363S1344 CUACUACUATTATCCTTTTATAATTGTTTG (SEQ ID NO.
5850) 14SH7_HP1364S1345 CUACUACUATTATCCTTGAAATTGAACGCA (SEQ ID NO. 5851) 14SH8_HP1366S1347 CUACUACUAATTTTATCAGCCGTTTTTAGA (SEQ ID NO. 5852) 14SH9_HP1367S1348 CUACUACUATTTTTATTCGCATTCATTATA (SEQ ID NO. 5853)
14SH10_HP1368S1349 CUACUACUAATAAATCTTATTCAAAATCAA (SEQ ID NO. 5854) 14SH11_HP1370S1350 CUACUACUAGATTTTCTCCTTACCAAATGA (SEQ ID NO. 5855) 14SH12_HP1371S1351 CUACUACUATAAAATTAGGCTAATTGTTTT (SEQ ID NO. 5856) 14RA1_HP1272R1255 CAUCAUCAUAGATAAGAAATGCAGTTTTTA (SEQ ID NO. 5857) 14RA2_HP1273R1256 CAUCAUCAUTCTCTTATGTTAATAGATAGT (SEQ ID NO. 5858) 14RA3_HP1274R1257 CAUCAUCAUTTTGTGGCTTAAGTCAAAAAT (SEQ ID NO. 5859) 14RA4_HP1275R1258 CAUCAUCAUATAACATGGACATTAGCATTT (SEQ ID NO. 5860) 14RA5_HP1276R1259 CAUCAUCAUATGGATAAAAAGGACAAAAAT (SEQ ID NO. 5861) 14RA6_HP1277R1260 CAUCAUCAUAAGGAGGTTTAAAATGAGGTA (SEQ ID NO. 5862) 14RA7_HP1278R1261 CAUCAUCAUAAGAGAGTATTAAGATGAATA (SEQ ID NO. 5863) 14RA8_HP1279R1262 CAUCAUCAUCATGCCTAGCGTGTTAGAAAA (SEQ ID NO. 5864) 14RA9_HP1281R1263 CAUCAUCAUAGCTTATGAAAATCTTTTTTA (SEQ ID NO. 5865) 14RA10_HP1282R1264 CAUCAUCAUTGATGATCAGTCTCATAGAAA (SEQ ID NO. 5866) 14RA11_HP1283R1265 CAUCAUCAUCTGTTTTTGAAATCTTTACTC (SEQ ID NO. 5867) 14RA12_HP1284R1266 CAUCAUCAUATGGATTTTATAGGGTTTGAA (SEQ ID NO. 5868) 14RB1_HP1285R1267 CAUCAUCAUATGAGTGTTTTAAATGCCAAA (SEQ ID NO. 5869) 14RB2_HP1286R1268 CAUCAUCAUAATGAAAAAAGCGTTAATATC (SEQ ID NO. 5870) 14RB3_HP1287R1269 CAUCAUCAUAAATGCAAGTTTCACAATATC (SEQ ID NO. 5871) 14RB4_HP1288R1270 CAUCAUCAUGTGTGCCAAACATGCCTTGAA (SEQ ID NO. 5872) 14RB5_HP1289R1271 CAUCAUCAUGGAGAGAGCATGAATGTCAAA (SEQ ID NO. 5873) 14RB6_HP1290R1272 CAUCAUCAUGCATGTTAATAACCACCCAAC (SEQ ID NO. 5874) 14RB7_HP1291R1273 CAUCAUCAUGTCATGCAAGCAGTGATTTTA (SEQ ID NO. 5875) 14RB8_HP1292R1274 CAUCAUCAUATGAGACACAAACACGGATAC (SEQ ID NO. 5876) 14RB9_HP1293R1275 CAUCAUCAUGGGTTAGAGCATGAAAGTTAT (SEQ ID NO. 5877) 14RB10_HP1294R1276 CAUCAUCAUATGGCAAGATATAGAGGTGCA (SEQ ID NO. 5878) 14RB11_HP1295R1277 CAUCAUCAUTAATGGCTAAGAGAAATGTAA (SEQ ID NO. 5879) 14RB12_HP1296R1278 CAUCAUCAUATAAAGCATGGCAAGGATTGC (SEQ ID NO. 5880) 14RC1_HP1297R1279 CAUCAUCAUATGAAAGTCAGGCCATCAGTG (SEQ ID NO. 5881) 14RC2_HP1298R1280 CAUCAUCAUTTAATGGCAAGAGATGATGTT (SEQ ID NO.
5882) 14RC3_HP1299R1281 CAUCAUCAUTTAATGGCAATCTCTATTAAA (SEQ ID NO. 5883) 14RC4_HP1300R1282 CAUCAUCAUATGAATAAAGCTATTGCTAGT (SEQ ID NO. 5884) 14RC5_HP1301R1283 CAUCAUCAUATGATAATGGGATTAGAAAAT (SEQ ID NO. 5885)
14RC6_HP1302R1284 CAUCAUCAUTATGACAGAAAGGAAAGGTAT (SEQ ID NO. 5886) 14RC7_HP1303R1285 CAUCAUCAUGTGATGAACGCGAAAGCATTG (SEQ ID NO. 5887) 14RC8_HP1304R1286 CAUCAUCAUATGTCAAGAATCGGGAAAAGA (SEQ ID NO. 5888) 14RC9_HP1305R1287 CAUCAUCAUGGATAAGATATGGTAAATGAT (SEQ ID NO. 5889) 14RC10_HP1306R1288 CAUCAUCAUAAAAGTGAGATAAGGGGTTAA (SEQ ID NO. 5890) 14RC11_HP1307R1289 CAUCAUCAUCATGTTTGGTTTGAAACAATT (SEQ ID NO. 5891) 14RC12_HP1308R1290 CAUCAUCAUATGAAAAGCGAAATCAAAAAA (SEQ ID NO. 5892) 14RD1_HP1309R1291 CAUCAUCAUATGATACAGAGTTTTACAAGA (SEQ ID NO. 5893) 14RD2_HP1310R1292 CAUCAUCAUAGGGTAAGCAATGAATACAAA (SEQ ID NO. 5894) 14RD3_HP1311R1293 CAUCAUCAUATGAAATATACTGAATTGAAA (SEQ ID NO. 5895) 14RD4_HP1312R1294 CAUCAUCAUAATCATGTTAATGCCAAAAAG (SEQ ID NO. 5896) 14RD5_HP1313R1295 CAUCAUCAUGGCATGGGACAAAAAGTTAAT (SEQ ID NO. 5897) 14RD6_HP1314R1296 CAUCAUCAUATGAGTAAAGCGTTATTAAGA (SEQ ID NO. 5898) 14RD7_HP1315R1297 CAUCAUCAUAATGTCTAGGTCAATTAAAAA (SEQ ID NO. 5899) 14RD8_HP1316R1298 CAUCAUCAUATGGCGATTAAAACTTATAAG (SEQ ID NO. 5900) 14RD9_HP1317R1299 CAUCAUCAUATGGCAGACATCATGGATATA (SEQ ID NO. 5901) 14RD10_HP1318R1300 CAUCAUCAUATGAGTAAGGCCATCGTTTTA (SEQ ID NO. 5902) 14RD11_HP1319R1301 CAUCAUCAUGGTTAGATATGGAATTTTTAG (SEQ ID NO. 5903) 14RD12_HP1320R1302 CAUCAUCAUTTAGGAGAGTTTATGGAAAAA (SEQ ID NO. 5904) 14RE1_HP1321R1303 CAUCAUCAUTAAAAAATGCGCTACGATTAT (SEQ ID NO. 5905) 14RE2_HP1322R1304 CAUCAUCAUGGTGTTTTTAATGATTTTTTA (SEQ ID NO. 5906) 14RE3_HP1323R1305 CAUCAUCAUTTGGGTTGCGTATCAATGACT (SEQ ID NO. 5907) 14RE4_HP1324R1306 CAUCAUCAUATGAATCCTGAAAAAGCCACT (SEQ ID NO. 5908) 14RE5_HP1325R1307 CAUCAUCAUATGCAATTTAGAATTGAACAT (SEQ ID NO. 5909) 14RE6_HP1326R1308 CAUCAUCAUCAAAACGATGAAAAAGTTAGC (SEQ ID NO. 5910) 14RE7_HP1327R1309 CAUCAUCAUGAGCATGCTATCTTTTATAAG (SEQ ID NO. 5911) 14RE8_HP1328R1310 CAUCAUCAUGAATGAAACGGATTTTATGGT (SEQ ID NO. 5912) 14RE9_HP1329R1311 CAUCAUCAUGGGATTATTGAATGATAGAAA (SEQ ID NO. 5913) 14RE10_HP1330R1312 CAUCAUCAUTAGAATGTTGATGCATTCTAT (SEQ ID NO.
5914) 14RE11_HP1331R1313 CAUCAUCAUTGTTGATGCATGAGTTTCTAA (SEQ ID NO. 5915) 14RE12_HP1332R1314 CAUCAUCAUTCGTGGAATTGAGTTATTATG (SEQ ID NO. 5916) 14RF1_HP1333R1315 CAUCAUCAUATGAGATTTTTTTGCTTTTTC (SEQ ID NO. 5917)
14RF2_HP1334R1316 CAUCAUCAUATGTCAAAAAAAGTAGCTATA (SEQ ID NO. 5918) 14RF3_HP1335R1317 CAUCAUCAUTGGTTAAAAAAAGGTGGTTAT (SEQ ID NO. 5919) 14RF4_HP1336R1318 CAUCAUCAUGTAATGGCCAAAATTGAATTG (SEQ ID NO. 5920) 14RF5_HP1337R1319 CAUCAUCAUGCGTCTAGCTTTGAATACAAT (SEQ ID NO. 5921) 14RF6_HP1338R1320 CAUCAUCAUAATGGATACACCCAATAAAGA (SEQ ID NO. 5922) 14RF7_HP1339R1321 CAUCAUCAUATGTCTGTATCGCATGTTGCT (SEQ ID NO. 5923) 14RF8_HP1340R1322 CAUCAUCAUAAATGAAAAGCATCAGAAGAG (SEQ ID NO. 5924) 14RF9_HP1341R1323 CAUCAUCAUATGAAAATTTCTCCATCTCCA (SEQ ID NO. 5925) 14RF10_HP1342R1324 CAUCAUCAUAACATGAAAAAATCCCTCTTA (SEQ ID NO. 5926) 14RF11_HP1343R1325 CAUCAUCAUTCATGGAATTTTTATCCTCAC (SEQ ID NO. 5927) 14RF12_HP1344R1326 CAUCAUCAUAAGGATCGCTTATGGTGAACG (SEQ ID NO. 5928) 14RG1_HP1345R1327 CAUCAUCAUTTCATGTTAGCTAAAATGTCG (SEQ ID NO. 5929) 14RG2_HP1346R1328 CAUCAUCAUAAAGGGAAAACATGCCAATTA (SEQ ID NO. 5930) 14RG3_HP1347R1329 CAUCAUCAUGAACAATGAAGCTTTTTGACT (SEQ ID NO. 5931) 14RG4_HP1348R1330 CAUCAUCAUATGAAGTCAAATAAAAAGTCC (SEQ ID NO. 5932) 14RG5_HP1349R1331 CAUCAUCAUTAAAAACTCATGGGGTTTTTA (SEQ ID NO. 5933) 14RG6_HP1350R1332 CAUCAUCAUGGGTGGTGTTATTAACAATGA (SEQ ID NO. 5934) 14RG7_HP1351R1333 CAUCAUCAUAGGACTCTTAATGGACTATCA (SEQ ID NO. 5935) 14RG8_HP1352R1334 CAUCAUCAUATGGATTTTTTAAAAGAAAAC (SEQ ID NO. 5936) 14RG9_HP1354R1335 CAUCAUCAUACATGCTAAAAGAATATTTAG (SEQ ID NO. 5937) 14RG10_HP1355R1336 CAUCAUCAUATGGAGATTAGAACCTTTTTA (SEQ ID NO. 5938) 14RG11_HP1356R1337 CAUCAUCAUGAATTTCATGCCAACTGATAA (SEQ ID NO. 5939) 14RG12_HP1357R1338 CAUCAUCAUATGGCAGGGTTAAAATGGTAG (SEQ ID NO. 5940) 14RH1_HP1358R1339 CAUCAUCAUATGTTATCTTCTAGTGATTTG (SEQ ID NO. 5941) 14RH2_HP1359R1340 CAUCAUCAUATGGCTATTTGGGGGTGGTGT (SEQ ID NO. 5942) 14RH3_HP1360R1341 CAUCAUCAUCAAATCATTGCTTAAAAAAAT (SEQ ID NO. 5943) 14RH4_HP1361R1342 CAUCAUCAUGGTGTGTGGGGTGTTTTTAAG (SEQ ID NO. 5944) 14RH5_HP1362R1343 CAUCAUCAUAATGGATCATTTAAAGCATTT (SEQ ID NO. 5945) 14RH6_HP1363R1344 CAUCAUCAUAAAGATGCTTTCAGTGTATGA (SEQ ID NO. 5946) 14RH7_HP1364R1345 CAUCAUCAUGTTGGCTATCGCTTTAACCCA (SEQ ID NO. 5947) 14RH8_HP1366R1347 CAUCAUCAUGATATTTGACATGCCAAAATT (SEQ ID NO.
5948) 14RH9_HP1367R1348 CAUCAUCAUTGTTTGATTTTGAATAAGATT (SEQ ID NO. 5949)
14RH10_HP1368R1349 CAUCAUCAUTTGAATATCAATAAAGTGTTT (SEQ ID NO. 5950) 14RH11_HP1370R1350 CAUCAUCAUTTGTATAAATTAAAAGATGAA (SEQ ID NO. 5951) 14RH12_HP1371R1351 CAUCAUCAUAATCATGGCAAAGAAAAAAGA (SEQ ID NO. 5952) 15SA1_HP1372S1352 CUACUACUAATATAAGGCTAGTTTTTCACA (SEQ ID NO. 5953) 15SA2_HP1373S1353 CUACUACUAGCATTATTCACTAAAACCCAC (SEQ ID NO. 5954) 15SA3_HP1374S1354 CUACUACUACGTGTTCCTTAAGGAAGAATT (SEQ ID NO. 5955) 15SA4_HP1375S1355 CUACUACUATCATGTTATTCCTCTTGTTTT (SEQ ID NO. 5956) 15SA5_HP1376S1356 CUACUACUATTAATCTCTCTCTGCAATCAT (SEQ ID NO. 5957) 15SA6_HP1377S1357 CUACUACUAATCATTCATGCGTGTGGCTTA (SEQ ID NO. 5958) 15SA7_HP1378S1358 CUACUACUACTACCAATCAAAAATTAACAC (SEQ ID NO. 5959) 15SA8_HP1379S1359 CUACUACUATTCAAAGCAAAGTTTTTTCTA (SEQ ID NO. 5960) 15SA9_HP1380S1360 CUACUACUATACTTTACATGAACTCCTGGA (SEQ ID NO. 5961) 15SA10_HP1381S1361 CUACUACUAATCAAATTTCTAAATGGCTTA (SEQ ID NO. 5962) 15SA11_HP1382S1362 CUACUACUATTAATACTTCTTAGCGTTGTT (SEQ ID NO. 5963) 15SA12_HP1383S1363 CUACUACUACTACCTTGCAAAAGAATTTAG (SEQ ID NO. 5964) 15SB1_HP1384S1364 CUACUACUACCATTTTAATCCCATTTAATT (SEQ ID NO. 5965) 15SB2_HP1385S1365 CUACUACUAGCATCATTCTCCCTTTAAATA (SEQ ID NO. 5966) 15SB3_HP1386S1366 CUACUACUACTTTATGCAAGAGATTGTCTG (SEQ ID NO. 5967) 15SB4_HP1387S1367 CUACUACUAAAATCAAAATACTAGGCAACA (SEQ ID NO. 5968) 15SB5_HP1388S1368 CUACUACUATTCAATACCTAAACATAGGAT (SEQ ID NO. 5969) 15SB6_HP1389S1369 CUACUACUACTTTTCGTTAGGGTTTAGAGC (SEQ ID NO. 5970) 15SB7_HP1390S1370 CUACUACUAGATTAAGAGCAAAAAATTAGA (SEQ ID NO. 5971) 15SB8_HP1391S1371 CUACUACUACTACCCCACAATATCCAACAA (SEQ ID NO. 5972) 15SB9_HP1392S1372 CUACUACUAAAGTGTTTAAGTGTCCTTTAG (SEQ ID NO. 5973) 15SB10_HP1393S1373 CUACUACUATCATTCTTGCGCCTTTAATTT (SEQ ID NO. 5974) 15SB11_HP1394S1374 CUACUACUATTTTTATCTTTTTTTGCTAGG (SEQ ID NO. 5975) 15SB12_HP1395S1375 CUACUACUAGCCCCTAGAAAGTGAACACAT (SEQ ID NO. 5976) 15SC1_HP1396S1376 CUACUACUAAATCAACTTTCTTTTCTTTTT (SEQ ID NO. 5977) 15SC2_HP1397S1377 CUACUACUACTAGAAGTTACCGGATTGAGA (SEQ ID NO.
5978) 15SC3_HP1398S1378 CUACUACUATCAAAGCTCCTTTAAAATCTC (SEQ ID NO. 5979) 15SC4_HP1399S1379 CUACUACUAGTTTGAAAAGCGATCAGTAAC (SEQ ID NO. 5980) 15SC5_HP1400S1380 CUACUACUAAGCCTTTAGAAAGTGTAGTTC (SEQ ID NO. 5981)
15SC6_HP1401S1381 CUACUACUACATTTAAGAACGCTCCAAATA (SEQ ID NO. 5982) 15SC7_HP1402S1382 CUACUACUATTAATGATACTCGTCTATGTG (SEQ ID NO. 5983) 15SC8_HP1403S1383 CUACUACUATCACCCCATTAACCCCAAATC (SEQ ID NO. 5984) 15SC9_HP1404S1384 CUACUACUATCAAAGAATGGCACAATCTCC (SEQ ID NO. 5985) 15SC10_HP1405S1385 CUACUACUATCACAAACCACTTAAATTTTC (SEQ ID NO. 5986) 15SC11_HP1406S1386 CUACUACUAAAAAAGTTCTCTCATTAATGA (SEQ ID NO. 5987) 15SC12_HP1407S1387 CUACUACUACTTTTTCATCAAGTTTCTTTT (SEQ ID NO. 5988) 15SD1_HP1408S1388 CUACUACUATGGGGCTTATCGAATCAAATT (SEQ ID NO. 5989) 15SD2_HP1409S1389 CUACUACUATTTAATCATTCTAGATCAAAA (SEQ ID NO. 5990) 15SD3_HP1410S1390 CUACUACUATTCCTCATGCCCGCCAGCTTG (SEQ ID NO. 5991) 15SD4_HP1411S1391 CUACUACUATGTTGGGGTGTGAATTTTATT (SEQ ID NO. 5992) 15SD5_HP1412S1392 CUACUACUAGGGGGGGGTTTATTCTACATT (SEQ ID NO. 5993) 15SD6_HP1413S1393 CUACUACUACGTTATTTCGCATTCAAAAGG (SEQ ID NO. 5994) 15SD7_HP1414S1394 CUACUACUATTAGGCGTTTTGATAAGGAAG (SEQ ID NO. 5995) 15SD8_HP1415S1395 CUACUACUAAGTTTTGTGTATCCATTAGAT (SEQ ID NO. 5996) 15SD9_HP1416S1396 CUACUACUAATGTCCTTTAAGAGAGTTTGA (SEQ ID NO. 5997) 15SD10_HP1418S1397 CUACUACUACCCTATCTCAAAATCTTCACT (SEQ ID NO. 5998) 15SD11_HP1419S1398 CUACUACUATTTCTAGCCGATGATTTTGGG (SEQ ID NO. 5999) 15SD12_HP1420S1399 CUACUACUATTCCATGATATTCCTTTTATC (SEQ ID NO. 6000) 15SE1_HP1421S1400 CUACUACUAGCATTTAAAGCTCTTTAGTCC (SEQ ID NO. 6001) 15SE2_HP1422S1401 CUACUACUAATCCTTTTATCATCGCTCTTT (SEQ ID NO. 6002) 15SE3_HP1423S1402 CUACUACUAGTCTTTATTCTTTTGTTTTAG (SEQ ID NO. 6003) 15SE4_HP1424S1403 CUACUACUACACTTCATGCTTTATTGTTCC (SEQ ID NO. 6004) 15SE5_HP1425S1404 CUACUACUATTAAAAACGATAAACATAATC (SEQ ID NO. 6005) 15SE6_HP1426S1405 CUACUACUACTATAAAATAACCGATGAAAA (SEQ ID NO. 6006) 15SE7_HP1428S1407 CUACUACUATCAAATTTGCTGAGAGAGTTT (SEQ ID NO. 6007) 15SE8_HP1429S1408 CUACUACUATCATGCTTTAAGCCCTAATTC (SEQ ID NO. 6008) 15SE9_HP1430S1409 CUACUACUATCAAAAAGAATGGGCATGACA (SEQ ID NO. 6009) 15SE10_HP1431S1410 CUACUACUACATTAGCCTTTTAAAAGGTAA (SEQ ID NO.
6010) 15SE11_HP1432S1411 CUACUACUACCACAAACGCCCCAATCAATA (SEQ ID NO. 6011) 15SE12_HP1433S1412 CUACUACUAAAAAATCACTCCCTTAAAAAC (SEQ ID NO. 6012) 15SF1_HP1434S1413 CUACUACUATCAAAACACCACCGTTTTATT (SEQ ID NO. 6013)
15SF2_HP1435S1414 CUACUACUAACATCTTTATCTCATTAATAA (SEQ ID NO. 6014) 15SF3_HP1436S1415 CUACUACUAACAATCTAAAAAACCTCTTCG (SEQ ID NO. 6015) 15SF4_HP1437S1416 CUACUACUACTACAAATTGATTTGTTGTGA (SEQ ID NO. 6016) 15SF5_HP1438S1417 CUACUACUAGATTTATCTGTTAAAGTTGAA (SEQ ID NO. 6017) 15SF6_HP1439S1418 CUACUACUACCTAAACAACCTTTCTAGCAT (SEQ ID NO. 6018) 15SF7_HP1440S1419 CUACUACUATTAGGGAATGTTTGGGAAAAT (SEQ ID NO. 6019) 15SF8_HP1441S1420 CUACUACUATTATAGAGAAGAGCTAAACAC (SEQ ID NO. 6020) 15SF9_HP1442S1421 CUACUACUAATGCGTCAAGGCTTAATGACC (SEQ ID NO. 6021) 15SF10_HP1443S1422 CUACUACUATGAGTTTCATTCGCCCCCTTT (SEQ ID NO. 6022) 15SF11_HP1444S1423 CUACUACUACTACCATTTCTTTCATTATTA (SEQ ID NO. 6023) 15SF12_HP1445S1424 CUACUACUATTAGGCTTTGCTATAAATCTC (SEQ ID NO. 6024) 15SG1_HP1446S1425 CUACUACUACTAAAACTAATTTTTTTGATT (SEQ ID NO. 6025) 15SG2_HP1447S1426 CUACUACUAAGGGTTTAGACGGAAAGCTTT (SEQ ID NO. 6026) 15SG3_HP1448S1427 CUACUACUATTATTGTTTCGCATAGGTATG (SEQ ID NO. 6027) 15SG4_HP1449S1428 CUACUACUATGTGATTAAACCTTAATGATG (SEQ ID NO. 6028) 15SG5_HP1450S1429 CUACUACUATTTGCATCAATGTTCCTTTTT (SEQ ID NO. 6029) 15SG6_HP1451S1430 CUACUACUATCATTCATTGTTAAAGTCGTT (SEQ ID NO. 6030) 15SG7_HP1452S1431 CUACUACUATTCATTTGCCTAAGCAAAATT (SEQ ID NO. 6031) 15SG8_HP1453S1432 CUACUACUATTAGAAAATCCACCCGTAATT (SEQ ID NO. 6032) 15SG9_HP1454S1433 CUACUACUAATCCTGCTTATAACCCCGTAT (SEQ ID NO. 6033) 15SG10_HP1455S1434 CUACUACUATTTCATTATTAATCCTTACTC (SEQ ID NO. 6034) 15SG11_HP1456S1435 CUACUACUACCTACTTTTTAACCATGCCCA (SEQ ID NO. 6035) 15SG12_HP1457S1436 CUACUACUATAAATTTAAAACATACGCTTA (SEQ ID NO. 6036) 15SH1_HP1458S1437 CUACUACUAGATACCCCTTACAATAACGCT (SEQ ID NO. 6037) 15SH2_HP1459S1438 CUACUACUAAAGCTAATCCCCTTTAACATT (SEQ ID NO. 6038) 15SH3_HP1460S1439 CUACUACUACTCAAGCTAAATCCCTCCACT (SEQ ID NO. 6039) 15SH4_HP1461S1440 CUACUACUATAAATCAAAAAGAGGGTTTAG (SEQ ID NO. 6040) 15SH5_HP1462S1441 CUACUACUAACTCAAGGCGTTTTCAAATAA (SEQ ID NO. 6041) 15SH6_HP1463S1442 CUACUACUATTTTTATTTTTTTAACGGGCT (SEQ ID NO. 6042) 15SH7_HP1464S1443 CUACUACUATTATTTCCTTTCTCCAAAAAT (SEQ ID NO.
6043) 15SH8_HP1465S1444 CUACUACUACCCGCCGATTAAAGTGTAATT (SEQ ID NO. 6044) 15SH9_HP1466S1445 CUACUACUAGTTAGTAGCGTTCATTATATG (SEQ ID NO. 6045)
15SH10_HP1467S1446 CUACUACUAAAAAGGTTAAAATTTATAGCT (SEQ ID NO. 6046) 15SH11_HP1468S1447 CUACUACUATTAGCCAACTTCAAAAATCCA (SEQ ID NO. 6047) 15SH12_HP1469S1448 CUACUACUAGGTTAAAAATCCCTCAAGTAA (SEQ ID NO. 6048) 15RA1_HP1372R1352 CAUCAUCAUGCATGCGTTTTTATTTTAAAT (SEQ ID NO. 6049) 15RA2_HP1373R1353 CAUCAUCAUGGCATGATTTTTAGCAAATTG (SEQ ID NO. 6050) 15RA3_HP1374R1354 CAUCAUCAUAGAGGAATAACATGAACGAAA (SEQ ID NO. 6051) 15RA4_HP1375R1355 CAUCAUCAUAGAGAGAGATTAAAAAATGAG (SEQ ID NO. 6052) 15RA5_HP1376R1356 CAUCAUCAUATGGAACAAAGCCATCAAAAC (SEQ ID NO. 6053) 15RA6_HP1377R1357 CAUCAUCAUATGTCTTTAAGGAGAAGATTT (SEQ ID NO. 6054) 15RA7_HP1378R1358 CAUCAUCAUATGCGTTTAAAACATTTTAAA (SEQ ID NO. 6055) 15RA8_HP1379R1359 CAUCAUCAUAAATGACTGAAGATTTTCCTA (SEQ ID NO. 6056) 15RA9_HP1380R1360 CAUCAUCAUGGCTTATGGGGGGGAGTTTAG (SEQ ID NO. 6057) 15RA10_HP1381R1361 CAUCAUCAUTTTAGGGGTGCGAGGAGCGAA (SEQ ID NO. 6058) 15RA11_HP1382R1362 CAUCAUCAUATGACCAATATCACCCCTGAA (SEQ ID NO. 6059) 15RA12_HP1383R1363 CAUCAUCAUGTGTTTTTGGCTTCAGGGGTG (SEQ ID NO. 6060) 15RB1_HP1384R1364 CAUCAUCAUAGGGAGAATGATGCAAAATAG (SEQ ID NO. 6061) 15RB2_HP1385R1365 CAUCAUCAUAGGGTTTTTATGGATTACAAA (SEQ ID NO. 6062) 15RB3_HP1386R1366 CAUCAUCAUAGTATTTTGAAAGTAGCTCCG (SEQ ID NO. 6063) 15RB4_HP1387R1367 CAUCAUCAUGGGGGCGTGAATGATAAAAAT (SEQ ID NO. 6064) 15RB5_HP1388R1368 CAUCAUCAUCACATGAATAATATTTGGTTT (SEQ ID NO. 6065) 15RB6_HP1389R1369 CAUCAUCAUTATGATGACAGAAGAAACCTA (SEQ ID NO. 6066) 15RB7_HP1390R1370 CAUCAUCAUACCAATGAGCGAACCATTAGA (SEQ ID NO. 6067) 15RB8_HP1391R1371 CAUCAUCAUGGTTAGCATGGGTGTTTCCGC (SEQ ID NO. 6068) 15RB9_HP1392R1372 CAUCAUCAUATGAAATTTTTTCTTTTAAAG (SEQ ID NO. 6069) 15RB10_HP1393R1373 CAUCAUCAUAACATGCGAGATTTCAATAAC (SEQ ID NO. 6070) 15RB11_HP1394R1374 CAUCAUCAUATGAAAGATTCACTTCAAACT (SEQ ID NO. 6071) 15RB12_HP1395R1375 CAUCAUCAUCACATGAAAAAAATTTTTTCT (SEQ ID NO. 6072) 15RC1_HP1396R1376 CAUCAUCAUAATGGAAAAAGAGCTCGTTAC (SEQ ID NO. 6073) 15RC2_HP1397R1377 CAUCAUCAUGGGTTTTTATGGCAAAAATTG (SEQ ID NO.
6074) 15RC3_HP1398R1378 CAUCAUCAUCATGACTATTGGGCTAGTTAA (SEQ ID NO. 6075) 15RC4_HP1399R1379 CAUCAUCAUATAAAATGATTTTAGTAGGAT (SEQ ID NO. 6076) 15RC5_HP1400R1380 CAUCAUCAUAGGAGTTTGTGTTGCATAAAA (SEQ ID NO. 6077)
15RC6_HP1401R1381 CAUCAUCAUAATGAACGCTTATTGTTTGAC (SEQ ID NO. 6078) 15RC7_HP1402R1382 CAUCAUCAUTATGATGAAAACAGAAAAAGA (SEQ ID NO. 6079) 15RC8_HP1403R1383 CAUCAUCAUGTTAAAATGGCGATCAAAAAA (SEQ ID NO. 6080) 15RC9_HP1404R1384 CAUCAUCAUGGGTTAATGGGGTGACAGAAC (SEQ ID NO. 6081) 15RC10_HP1405R1385 CAUCAUCAUAAAAATGCCTATGAATATAAG (SEQ ID NO. 6082) 15RC11_HP1406R1386 CAUCAUCAUATTATGCAAGAGATTTTTTTA (SEQ ID NO. 6083) 15RC12_HP1407R1387 CAUCAUCAUGCTGTCATTAATGAGAGAACT (SEQ ID NO. 6084) 15RD1_HP1408R1388 CAUCAUCAUATACCCCATGTCATCTTTTTT (SEQ ID NO. 6085) 15RD2_HP1409R1389 CAUCAUCAUATGATGGCAAAAATAGATGTA (SEQ ID NO. 6086) 15RD3_HP1410R1390 CAUCAUCAUATGCCTAAAAAAGAGCTATTA (SEQ ID NO. 6087) 15RD4_HP1411R1391 CAUCAUCAUATGTTATTGGATTATGATTTT (SEQ ID NO. 6088) 15RD5_HP1412R1392 CAUCAUCAUATGGGTTTTCAAAATGAAAAT (SEQ ID NO. 6089) 15RD6_HP1413R1393 CAUCAUCAUATGACCCCTGAACTAAACCTC (SEQ ID NO. 6090) 15RD7_HP1414R1394 CAUCAUCAUCATGAACCAACGCATAGAAAC (SEQ ID NO. 6091) 15RD8_HP1415R1395 CAUCAUCAUCTTTGAGTATTTATAAAGACA (SEQ ID NO. 6092) 15RD9_HP1416R1396 CAUCAUCAUCTAATGGATACACAAAACTTA (SEQ ID NO. 6093) 15RD10_HP1418R1397 CAUCAUCAUAATGCTAGAAACCACCATTGA (SEQ ID NO. 6094) 15RD11_HP1419R1398 CAUCAUCAUGGAATATCATGGAATCACAAC (SEQ ID NO. 6095) 15RD12_HP1420R1399 CAUCAUCAUATGCCCCTAAAATCCTTAAAA (SEQ ID NO. 6096) 15RE1_HP1421R1400 CAUCAUCAUAGGGCTTTTGAAAACTTTACA (SEQ ID NO. 6097) 15RE2_HP1422R1401 CAUCAUCAUTCAGTGGAAGAATACAAAGAC (SEQ ID NO. 6098) 15RE3_HP1423R1402 CAUCAUCAUCATGCGAATAGACAAATTTTT (SEQ ID NO. 6099) 15RE4_HP1424R1403 CAUCAUCAUATTTGTTTGAAGTTTCAAATT (SEQ ID NO. 6100) 15RE5_HP1425R1404 CAUCAUCAUGTAATATGTTAGCGCAATTAG (SEQ ID NO. 6101) 15RE6_HP1426R1405 CAUCAUCAUTCTATGCTAGGGTTTAATCTT (SEQ ID NO. 6102) 15RE7_HP1428R1407 CAUCAUCAUTAAAGCATGAAAGCTAGTATT (SEQ ID NO. 6103) 15RE8_HP1429R1408 CAUCAUCAUATGCCCATTCTTTTTGATTGT (SEQ ID NO. 6104) 15RE9_HP1430R1409 CAUCAUCAUTATGACGGATAACAACCAAAA (SEQ ID NO. 6105) 15RE10_HP1431R1410 CAUCAUCAUTTATGGTAGTAGCTAAAAAGT (SEQ ID NO.
6106) 15RE11_HP1432R1411 CAUCAUCAUGAGCTAGAATTTAAATTTCAA (SEQ ID NO. 6107) 15RE12_HP1433R1412 CAUCAUCAUATGCTAAGCTATAAGCATTCT (SEQ ID NO. 6108) 15RF1_HP1434R1413 CAUCAUCAUTGAGATAAAGATGTTAGAATT (SEQ ID NO. 6109)
15RF2_HP1435R1414 CAUCAUCAUATGTGGAGTTTCATTCAAAAA (SEQ ID NO. 6110) 15RF3_HP1436R1415 CAUCAUCAUATGGCGTGTAAATTTTGCCCC (SEQ ID NO. 6111) 15RF4_HP1437R1416 CAUCAUCAUATGAGTGAAAACATTAAAGGA (SEQ ID NO. 6112) 15RF5_HP1438R1417 CAUCAUCAUGCATGTTTTTTAAAACTTATC (SEQ ID NO. 6113) 15RF6_HP1439R1418 CAUCAUCAUATGGACAATAGCACAGACAGA (SEQ ID NO. 6114) 15RF7_HP1440R1419 CAUCAUCAUATGGCGTTTTGGCATAAAAGA (SEQ ID NO. 6115) 15RF8_HP1441R1420 CAUCAUCAUTTTATGATGGATCCAATTAAA (SEQ ID NO. 6116) 15RF9_HP1442R1421 CAUCAUCAUTAAGAGATTAAACATGCTCAT (SEQ ID NO. 6117) 15RF10_HP1443R1422 CAUCAUCAUCTTGACGCATGTTTTTGAAGT (SEQ ID NO. 6118) 15RF11_HP1444R1423 CAUCAUCAUAAAAAGGGGGCGAATGAAACT (SEQ ID NO. 6119) 15RF12_HP1445R1424 CAUCAUCAUATGAAAGAAATGGTAGATTAT (SEQ ID NO. 6120) 15RG1_HP1446R1425 CAUCAUCAUGAGCGGTGAAAAAAGTAGAAT (SEQ ID NO. 6121) 15RG2_HP1447R1426 CAUCAUCAUTTTTAAATGAAACGCACTTAC (SEQ ID NO. 6122) 15RG3_HP1448R1427 CAUCAUCAUTAATGCCAGACGAGCTAAGGG (SEQ ID NO. 6123) 15RG4_HP1449R1428 CAUCAUCAUTACCTATGCGAAACAATAAAA (SEQ ID NO. 6124) 15RG5_HP1450R1429 CAUCAUCAUATGGATAAAAACAACAATAAT (SEQ ID NO. 6125) 15RG6_HP1451R1430 CAUCAUCAUTGATGCAAAATTTTATTGAAA (SEQ ID NO. 6126) 15RG7_HP1452R1431 CAUCAUCAUATGAAAAATACATCGTCATCA (SEQ ID NO. 6127) 15RG8_HP1453R1432 CAUCAUCAUGACCATTAAGGAGATTTTATG (SEQ ID NO. 6128) 15RG9_HP1454R1433 CAUCAUCAUGAGTAAGGATTAATAATGAAA (SEQ ID NO. 6129) 15RG10_HP1455R1434 CAUCAUCAUGATAAGCAAATTGTGGATAAA (SEQ ID NO. 6130) 15RG11_HP1456R1435 CAUCAUCAUAACAATGAAAAATCAAGTTAA (SEQ ID NO. 6131) 15RG12_HP1457R1436 CAUCAUCAUCATTATGTTATTGAAAACAAA (SEQ ID NO. 6132) 15RH1_HP1458R1437 CAUCAUCAUATGTCAGAAATGATTAACGGG (SEQ ID NO. 6133) 15RH2_HP1459R1438 CAUCAUCAUAGTGGAGGGATTTAGCTTGAG (SEQ ID NO. 6134) 15RH3_HP1460R1439 CAUCAUCAUGTTAGACAATGAAAGAGAATA (SEQ ID NO. 6135) 15RH4_HP1461R1440 CAUCAUCAUTCATCATGAAAAAATCCATTT (SEQ ID NO. 6136) 15RH5_HP1462R1441 CAUCAUCAUAATGTTTGATAAAAAACTTTC (SEQ ID NO. 6137) 15RH6_HP1463R1442 CAUCAUCAUAAATAATCATGAGAGCTACGG (SEQ ID NO.
6138) 15RH7_HP1464R1443 CAUCAUCAUTTGGAAAGGCATGTGAATTAC (SEQ ID NO. 6139) 15RH8_HP1465R1444 CAUCAUCAUAACATATAATGAACGCTACTA (SEQ ID NO. 6140) 15RH9_HP1466R1445 CAUCAUCAUTTTAATGAAGACAGAGAAACA (SEQ ID NO. 6141)
15RH10_HP1467R1446 CAUCAUCAUAATGTTTAAAAAAATCATTTT (SEQ ID NO. 6142 ) 15RH11_HP1468R1447 CAUCAUCAUGGTGTGAAAATGGCAAATTTA (SEQ ID NO. 6143 ) 15RH12_HP1469R1448 CAUCAUCAUCTGAATGTTGAAAAGAATGAT (SEQ ID NO. 6144 ) X6SA1_HPX470S1449 CUACUACUACTAACCTTTTAATTCATTCCA (SEQ ID NO. 6145 ) 16SA2_HP1471S1450 CUACUACUAACCCCTTTTAAAATAACGAGT (SEQ ID NO. 6146 ) 16SA3J3P1472S1451 CUACUACUATATCGCTTTGGGGGCATTTGG (SEQ ID NO. 6147 ) 16SA4_HP1473S1452 CUACUACUAAAAATAAAATTATAACTCATT (SEQ ID NO. 6148 ) 16SA5_HP1474S1453 CUACUACUAGGTTAAACAGCGCATTTTATA (SEQ ID NO. 6149 ) 16SA6_HP1475S1454 CUACUACUACCTTCTAACACCACATACATT (SEQ ID NO. 6150 ) 16SA7_HP1476S1455 CUACUACUATCAGTTCATTCCCCATCGTGG (SEQ ID NO. 6151 ) 16SA8_HP1477S1456 CUACUACUATTTCATGAATGTCCTTTATAA (SEQ ID NO. 6152 ) 16SA9_HP1478SX457 CUACUACUATCAAAACCCGTTATCCACTTT (SEQ ID NO. 6153 ) 16SA10_HP1479S1458 CUACUACUATCAAAATCCATCATTCTAAAA (SEQ ID NO. 6154 ) 16SA11_HP1480S1459 CUACUACUAGCCTTAAAGGTATTTTTCTAA (SEQ ID NO. 6155 ) 16SA12_HP1481S1460 CUACUACUATCAACCCTCTTTAATCGTTCT (SEQ ID NO. 6156 ) 16SB1_HP1482Ξ1461 CUACUACUACTTTAAGCCTTTTGGTCTGGC (SEQ ID NO. 6157 ) 16SB2_HP1483S1462 CUACUACUAACATTCCTTTAGTTTTTTTTA (SEQ ID NO. 6158 ) 16SB3JHP1484S1463 CUACUACUATTAAAAAGGCTTGACAACCAC (SEQ ID NO. 6159 ) 16SB4_HP1485S1464 CUACUACUATCCCTAATTTATCTTATTTTG (SEQ ID NO. 6160 ) 16SB5_HP1486S1465 CUACUACUATGAGATTTTTAAGCGTTTTCA (SEQ ID NO. 6161 ) 16SB6_HP1487S1466 CUACUACUAAAAAAAATTCATGCACTAGCC (SEQ ID NO. 6162 ) 16SB7_HP1488S1467 CUACUACUATTTTAAGGTTTAATGGTAACT (SEQ ID NO. 6163 ) 16SB8_HP1489S1468 CUACUACUACCTTAATAAACAAATTCATAA (SEQ ID NO. 6164 ) 16SB9_HP1490S1469 CUACUACUATCATGCTTCATTTTCTCCCTG (SEQ ID NO. 6165 ) 16SB10_HP1491S1470 CUACUACUAAAACCCTAGAAATACTTTTCT (SEQ ID NO. 6166 ) 16SB11_HP1492S1471 CUACUACUACCTTAAAGCTTATCAAACTCC (SEQ ID NO. 6167 ) 16SB12_HP1493S1472 CUACUACUATCATTCTTTAAGCCTTATTGG (SEQ ID NO. 6168 ) 16SC1_HP1494S1473 CUACUACUATGTTTTATCCTTGTTTTAAAT (SEQ ID NO. 6169 ) 16SC2_HP1495S1474 CUACUACUAAAACCCTTAAAAATCAAAAAC (SEQ ID NO. 
6170) 16SC3_HP1496S1475 CUACUACUATCACTTCGCTTTAATCACACC (SEQ ID NO. 6171) 16SC4_HP1497S1476 CUACUACUATTTTGATAGAGTGCATTTAAC (SEQ ID NO. 6172) 16SC5_HP1498S1477 CUACUACUAATAAATCCTACAACCTCTTAT (SEQ ID NO. 6173)
16SC6_HP1499S1478 CUACUACUATTAGCACATGGTGGTTGAGTT (SEQ ID NO. 6174 ) 16SC7__HP1500S1479 CUACUACUACAATCAAAATAAAGAATTGAA (SEQ ID NO. 6175 ) 16SC8_HP150XS1480 CUACUACUACTACCAAACTTAAAAAGTATA (SEQ ID NO. 6176 ) 16SC9_HP1502S1481 CUACUACUATGAAACGCTCTTAAACAAACC (SEQ ID NO. 6177 ) 16SC10_HP1503S1482 CUACUACUAGGTTTGTTTAAGAGCGTTTCA (SEQ ID NO. 6178 ) X6SC11_HP1504S1483 CUACUACUAAAGGGCGCTAGTTTAGAAGCG (SEQ ID NO. 6179 ) 16SC12_HP1505S1484 CUACUACUATCTTAAGAGTTTTCTATCCAT (SEQ ID NO. 6180) 16SD1_HP1506S1485 CUACUACUAAGTCTCATGATGGGAAAAAAG (SEQ ID NO. 6181 ) 16SD2_HP1507S1486 CUACUACUACTAGCGTTCAATCACTTCATA (SEQ ID NO. 6182 ) 16SD3_HP1508S1487 CUACUACUAGCTTCAATCCTCACTTGGTGC (SEQ ID NO. 6183 ) 16SD4_HP1509S1488 CUACUACUATCATAAGACTTTCTTTTCCTT (SEQ ID NO. 6184 ) 16SD5_HP1510S1489 CUACUACUAGGCTAAAGATTGCTTTCATAG (SEQ ID NO. 6185 ) 16SD6_HP1511S1490 CUACUACUATTACTGAGCGGCTGAATCAAG (SEQ ID NO. 6186 ) 16SD7JHP1512S1491 CUACUACUATTCATTTTAGAACTGATAGGA (SEQ ID NO. 6187 ) 16SD8JHP1513S1492 CUACUACUACGAATTTTTAAGCTTTATTAA (SEQ ID NO. 6188 ) 16ΞD9_HP1514S1493 CUACUACUATTAATTCTTAAACAAAGACTC (SEQ ID NO. 6189) 16SD10_HP1515S1494 CUACUACUACTTGAAAAACTATCGTTTGAT (SEQ ID NO. 6190 ) 16SD11_HP1516S1495 CUACUACUATTATTTGTTCTGTTGCCAATA (SEQ ID NO. 6191 ) 16SD12_HP1517S1496 CUACUACUAATTCACTGCCCCTCTTCAATG (SEQ ID NO. 6192 ) 16SE1_HP1518S1497 CUACUACUATTAATTCCTTACATCGCTTAA (SEQ ID NO. 6193 ) 16SΞ2_HPX519S1498 CUACUACUATTAAATTTTGACTTTCTCCAA (SEQ ID NO. 6194) 16SE3_HP1520S1499 CUACUACUATCAAACAAGATTTGTAAGATA (SEQ ID NO. 6195 ) 16SE4_HP1521S1500 CUACUACUACTCTAATCTTTACGATTTAAA (SEQ ID NO. 6196 ) 16SE5_HP1523S1501 CUACUACUATTTCAATTTTCAAATGTTCCC (SEQ ID NO. 6197 ) 16SE6_HP1524S1502 CUACUACUAACACTTTAATACTTCCTGTTG (SEQ ID NO. 6198 ) 16SE7_HP1525S1503 CUACUACUAGGTTGGAATTAGTTTAAAGGT (SEQ ID NO. 6199 ) 16SE8_HP1526S1504 CUACUACUATTAAACTAATTCCAACCCTAC (SEQ ID NO. 6200 ) 16SE9_HP1527S1505 CUACUACUACCACCATCTTAAACTTCGTTG (SEQ ID NO. 6201 ) 16SE10_HP1528S1506 CUACUACUACTATAACCTATTTATGACTTT (SEQ ID NO. 
6202) 16SE11_HP1529S1507 CUACUACUACCTTTTTTCATTCACTTGAAT (SEQ ID NO. 6203) 16SE12_HP1530S1508 CUACUACUATTAAATTATTAAACTATCAAT (SEQ ID NO. 6204) 16SF1_HP1531S1509 CUACUACUACACACATTTAAAACACCAATA (SEQ ID NO. 6205)
16SF2_HP1532S1510 CUACUACUACTTTTTATTCCACGGTAACGC (SEQ ID NO. 6206)
16SF3_HP1533S1511 CUACUACUAACTAATGCTTCAAACAATCTT (SEQ ID NO. 6207)
16SF4_HP1534S1512 CUACUACUAACCGCTAAAGCGATTGGGCTT (SEQ ID NO. 6208)
16SF5_HP1535S1513 CUACUACUAATTAACCTTGATTTTCTATAT (SEQ ID NO. 6209)
16SF6_HP1536S1514 CUACUACUACACTAAAGATATTTGGGATTT (SEQ ID NO. 6210)
16SF7_HP1537S1515 CUACUACUAGCTCTCTTAAAATGTGGTTGG (SEQ ID NO. 6211)
16SF8_HP1539S1516 CUACUACUAATCTTAAACTCTTTCATCTAA (SEQ ID NO. 6212)
16SF9_HP1540S1517 CUACUACUATTACGCTTTAGCCATCATTTT (SEQ ID NO. 6213)
16SF10_HP1541S1518 CUACUACUATTAACGCCGCTTTAAAGAAAT (SEQ ID NO. 6214)
16SF11_HP1542S1519 CUACUACUACCTATTTATTTTCAATTTTCT (SEQ ID NO. 6215)
16SF12_HP1543S1520 CUACUACUAGCTCCCCCTTTATTTTTGTGC (SEQ ID NO. 6216)
16SG1_HP1544S1521 CUACUACUATAAAATCCTTACTGGCTTATG (SEQ ID NO. 6217)
16SG2_HP1545S1522 CUACUACUAGCATCAATCCCTCTTCTCTTG (SEQ ID NO. 6218)
16SG3_HP1546S1523 CUACUACUAGGCTATTTTTCATTATTGAAT (SEQ ID NO. 6219)
16SG4_HP1547S1524 CUACUACUAAGCCCTCATGCGATAACAAAA (SEQ ID NO. 6220)
16SG5_HP1548S1525 CUACUACUACCATCACTACCCCTTAATCCG (SEQ ID NO. 6221)
16SG6_HP1549S1526 CUACUACUACATAGAAACTCCTTAAACTTG (SEQ ID NO. 6222)
16SG7_HP1550S1527 CUACUACUACCTCCTAAGCTCTTTTATTCA (SEQ ID NO. 6223)
16SG8_HP1551S1528 CUACUACUACGTTAAAAAGTTTCATTCAGT (SEQ ID NO. 6224)
16SG9_HP1552S1529 CUACUACUAAGCATTTTTAGCTATTTCTTT (SEQ ID NO. 6225)
16SG10_HP1553S1530 CUACUACUATTCAGACCCATAATTTTTCAA (SEQ ID NO. 6226)
16SG11_HP1554S1531 CUACUACUAGACATTATTCGGCTCCTTGAG (SEQ ID NO. 6227)
16SG12_HP1555S1532 CUACUACUATTTCTAAAACCTCACTTCATT (SEQ ID NO. 6228)
16SH1_HP1556S1533 CUACUACUATTATTTTATCTTAAAGCGGTT (SEQ ID NO. 6229)
16SH2_HP1557S1534 CUACUACUACTAGATCTGCGTTCTTAAAAG (SEQ ID NO. 6230)
Figure imgf000581_0001
16SH6_HP1563S1540 CUACUACUACCGATTAAAGCTTAATGGAAT (SEQ ID NO. 6234)
16SH7_HP1564S1541 CUACUACUAGCTTAAAAAGCCGGGATAATC (SEQ ID NO. 6235)
16SH8_HP1565S1542 CUACUACUATTAAAGATAGCCAAGCTCATA (SEQ ID NO. 6236)
16SH9_HP1566S1543 CUACUACUATTATAGCGAAGATTTTTCATA (SEQ ID NO. 6237)
16SH10_HP1567S1544 CUACUACUAGCGGCTTATAATGGGTTAGTG (SEQ ID NO. 6238 16SH11_HP1568S1545 CUACUACUATCATGGTTTTTCGCCTTTCTT (SEQ ID NO. 6239 X6SH12_HPX569SX546 CUACUACUACGCATTAAAAACCTCCTTTGA (SEQ ID NO. 6240 16RA1_HP1470RX449 CAUCAUCAUAAGAGGGTTGAATGATGGAGC (SEQ ID NO. 6241 16RA2_HP1471R1450 CAUCAUCAUGTGTTGTCAAGTAAGAAAATT (SEQ ID NO. 6242 16RA3_HP1472R1451 CAUCAUCAUATGAATAAAGTTCAATCTATT (SEQ ID NO. 6243 16RA4_HP1473RX452 CAUCAUCAUAATGCGCTGTTTAACCTGTTT (SEQ ID NO. 6244 X6RA5_HPX474RX453 CAUCAUCAUCTTAAAATGTATGTGGTGTTA (SEQ ID NO. 6245 X6RA6_HP1475R1454 CAUCAUCAUGATGCAAAAAATCGGCATTTA (SEQ ID NO. 6246 16RA7_HP1476R1455 CAUCAUCAUCGCAAATCTTATAAAGGACAT (SEQ ID NO. 6247 16RA8_HP1477R1456 CAUCAUCAUTTTTGAAAATTTTAATCCTTT (SEQ ID NO. 6248 16RA9_HP1478R1457 CAUCAUCAUAGTTTTTAGAATGATGGATTT (SEQ ID NO. 6249 16RA10_HP1479R1458 CAUCAUCAUGGCTAGTTTGTGCTGAATGAA (SEQ ID NO. 6250 16RA11_HP1480R1459 CAUCAUCAUAGGGTTGAATGATTGATAGAA (SEQ ID NO. 6251 16RA12_HP1481R1460 CAUCAUCAUCTTAAAGCATGCGAGTGTTTG (SEQ ID NO. 6252 16RB1_HP1482R1461 CAUCAUCAUAATGTTATGCAAGATGAATTA (SEQ ID NO. 6253 16RB2_HP1483R1462 CAUCAUCAUATGAAAAAAGAAAAGCATCTC (SEQ ID NO. 6254 16RB3_HP1484R1463 CAUCAUCAUTGGTATGGGATTTTTGAATGG (SEQ ID NO. 6255 16RB4_HP1485R1464 CAUCAUCAUGTGAAAACGCTTAAAAATCTC (SEQ ID NO. 6256 16RB5_HP1486R1465 CAUCAUCAUGGCTAGTGCATGAATTTTTTT (SEQ ID NO. 6257 16RB6_HP1487R1466 CAUCAUCAUGGATTGTTTTGTTCAGATTGA (SEQ ID NO. 6258 16RB7_HP1488R1467 CAUCAUCAUATGTCAAATAGCATGTTGGAT (SEQ ID NO. 6259 16RB8_HP1489R1468 CAUCAUCAUAAGCATGAAAAAAACAACCCT (SEQ ID NO. 6260 16RB9_HP1490R1469 CAUCAUCAUATGGGGAGGAATCAAGGAGCT (SEQ ID NO. 6261 16RB10_HP1491R1470 CAUCAUCAUCGATATGGAAATTAAAAACAT (SEQ ID NO. 6262 16RB11_HP1492R1471 CAUCAUCAUGAGTTTTAAAAAATGATAGAA (SEQ ID NO. 6263 16RB12_HP1493R1472 CAUCAUCAUGCATGCAAAAGTTTGATTATG (SEQ ID NO. 6264 16RC1_HP1494R1473 CAUCAUCAUATGAAACTTAAAAAAACCCTG (SEQ ID NO. 6265 16RC2_HP1495R1474 CAUCAUCAUACATGCAAGAATTCAGTTTGT (SEQ ID NO. 6266 16RC3JHP1496R1475 CAUCAUCAUATGTTAGAAGGCGTTATTAGA (SEQ ID NO. 
6267) 16RC4_HP1497R1476 CAUCAUCAUGTTATGACGCTTTTAGTAGGT (SEQ ID NO. 6268) 16RC5_HP1498R1477 CAUCAUCAUAGTGCGTTTGTTTAGATTTGT (SEQ ID NO. 6269)
16RC6_HP1499R1478 CAUCAUCAUAGGTTTTTGATGAGTTCTGTT (SEQ ID NO. 6270) 16RC7_HP1500R1479 CAUCAUCAUTTATAATGTGTTCTAATTCTT (SEQ ID NO. 6271) 16RC8_HP1501R1480 CAUCAUCAUGAGAGTGATGCTCAATTTTAT (SEQ ID NO. 6272) 16RC9_HP1502R1481 CAUCAUCAUATGGATGCGATTTATCCTTAT (SEQ ID NO. 6273) 16RC10_HP1503R1482 CAUCAUCAUAATTAAGGATTAAAAAAATGA (SEQ ID NO. 6274) 16RC11_HP1504R1483 CAUCAUCAUATGGATAGAAAACTCTTAAGA (SEQ ID NO. 6275) 16RC12_HP1505R1484 CAUCAUCAUATGAGACTTTATGAGAGTTTA (SEQ ID NO. 6276) 16RD1_HP1506R1485 CAUCAUCAUGCAAGTTTGCAAGAAATTAAG (SEQ ID NO. 6277) 16RD2_HP1507R1486 CAUCAUCAUAGTGTGGTGGCACACAAAATG (SEQ ID NO. 6278) 16RD3_HP1508R1487 CAUCAUCAUATGCTTGAAACTTCTAGCCAT (SEQ ID NO. 6279) X6RD4_HPX509RX488 CAUCAUCAUACGATTTTAAAAAATGGCTAG (SEQ ID NO. 6280) 16RD5_HP1510R1489 CAUCAUCAUGAAAGTCTTATGAAAACTAAA (SEQ ID NO. 6281) 16RD6_HP1511R1490 CAUCAUCAUATGAAAGCAATCTTTAGCCTC (SEQ ID NO. 6282) 16RD7_HP1512R1491 CAUCAUCAUATGCTTAGAAATCAATTTCGT (SEQ ID NO. 6283) 16RD8_HP1513R1492 CAUCAUCAUACTCATCATGGCTAAAGAAAC (SEQ ID NO. 6284) 16RD9_HP1514R1493 CAUCAUCAUATGGAAAAAATCAGCGATCTT (SEQ ID NO. 6285) 16RD10_HP1515R1494 CAUCAUCAUCTTCTATGAGAAAAAATCATA (SEQ ID NO. 6286) 16RD11_HP1516R1495 CAUCAUCAUGGGCAGTGAATGGAAAAGTTA (SEQ ID NO. 6287) 16RD12_HP1517R1496 CAUCAUCAUATGGATTATAAAAAATTAGAT (SEQ ID NO. 6288) 16RE1_HP1518R1497 CAUCAUCAUATGATTCGTTTCGCTCACATC (SEQ ID NO. 6289) X6RE2_HPX519R1498 CAUCAUCAUTTGCAAGAATGCAGACAATTT (SEQ ID NO. 6290) 16RE3_HP1520R1499 CAUCAUCAUATGGGGTTAAAAATTATCAAT (SEQ ID NO. 6291) 16RE4_HP1521R1500 CAUCAUCAUTAGGGGGTAGAGGTGAAAATC (SEQ ID NO. 6292) 16RE5_HP1523R1501 CAUCAUCAUTTGCAAGAAACAGATAATTTA (SEQ ID NO. 6293) 16RE6_HP1524R1502 CAUCAUCAUTTGATGAAATCTAAAATCACT (SEQ ID NO. 6294) 16RE7_HP1525R1503 CAUCAUCAUAGGAAGTATTAAAGTGTGAAA (SEQ ID NO. 6295) 16RE8_HP1526R1504 CAUCAUCAUATGAAACTGATTTCATGGAAT (SEQ ID NO. 6296) 16RE9_HP1527R1505 CAUCAUCAUCATGAAAAAATCCCTTTGTCT (SEQ ID NO. 6297) 16RE10_HP1528R1506 CAUCAUCAUATGCTTATTCCTTTTTATTTT (SEQ ID NO. 
6298) 16RE11_HP1529R1507 CAUCAUCAUGCCATGGATACCAACAACAAT (SEQ ID NO. 6299) 16RE12_HP1530R1508 CAUCAUCAUAGGATTTTCATGCTGCTTTGC (SEQ ID NO. 6300) 16RF1_HP1531R1509 CAUCAUCAUATGTTTGAAAAAATACGCAAG (SEQ ID NO. 6301)
16RF2_HP1532R1510 CAUCAUCAUATGTGTGGGATTGTAGGTTAT (SEQ ID NO. 6302) 16RF3_HP1533R1511 CAUCAUCAUGATGTGGATCACCCAAGAAAC (SEQ ID NO. 6303) 16RF4_HP1534R1512 CAUCAUCAUTTGCTTAACGCTATCAAGTTT (SEQ ID NO. 6304) 16RF5_HP1535R1513 CAUCAUCAUGGAATGAGGAAAAACCATTAT (SEQ ID NO. 6305) 16RF6__HP1536R1514 CAUCAUCAUTTTAATGAAAAAATATTGTCG (SEQ ID NO. 6306) 16RF7_HP1537R1515 CAUCAUCAUATGCATGCTTGAAAAAGTGTT (SEQ ID NO. 6307) 16RF8_HP1539R1516 CAUCAUCAUGAGAGTTTAATGGCAGAGATA (SEQ ID NO. 6308) 16RF9_HP1540R1517 CAUCAUCAUTTTTTAATCATGGCAGATATT (SEQ ID NO. 6309) 16RF10_HP1541R1518 CAUCAUCAUAATGATCCAATCCAGCCTTTA (SEQ ID NO. 6310) 16RF11_HP1542R1519 CAUCAUCAUCAGATGGCAATCTTTGATAAC (SEQ ID NO. 6311) 16RF12_HP1543R1520 CAUCAUCAUAGTGTTTTTAGACAGGCGTTT (SEQ ID NO. 6312) 16RG1_HP1544R1521 CAUCAUCAUTTGATGCCGCAAAACCAGCTT (SEQ ID NO. 6313) 16RG2_HP1545R1522 CAUCAUCAUATGAAAAATAGCCCTTTGAAT (SEQ ID NO. 6314) 16RG3_HP1546R1523 CAUCAUCAUCATGAGGGCTTTACTTTTTTT (SEQ ID NO. 6315) 16RG4_HP1547R1524 CAUCAUCAUGTAGTGATGGATTTTATCAAT (SEQ ID NO. 6316) 16RG5_HP1548R1525 CAUCAUCAUATGGATTGGGGTCGGGTCGTT (SEQ ID NO. 6317) 16RG6_HP1549R1526 CAUCAUCAUTTAGGAGGTTTTATGGAATTA (SEQ ID NO. 6318) 16RG7_HP1550R1527 CAUCAUCAUGTGCCTTCTTTACTAGAAACT (SEQ ID NO. 6319) 16RG8_HP1551R1528 CAUCAUCAUAATGGGACAAATTAAAGACAT (SEQ ID NO. 6320) 16RG9_HP1552R1529 CAUCAUCAUGTCTATGAATCTCAAAAAAAC (SEQ ID NO. 6321) 16RG10_HP1553R1530 CAUCAUCAUTCTCAATGGATACCAAAAGAC (SEQ ID NO. 6322) 16RG11_HP1554R1531 CAUCAUCAUATGGTAACCATGAAAGATTTA (SEQ ID NO. 6323) 16RG12__HP1555R1532 CAUCAUCAUAGCCGAATAATGTCAGGAATT (SEQ ID NO. 6324) 16RH1_HP1556R1533 CAUCAUCAUATGGATAATAGGAATATTGAT (SEQ ID NO. 6325) 16RH2_HP1557R1534 CAUCAUCAUGTTCTATGCAAGCCATACACA (SEQ ID NO. 6326) 16RH3_HP1558R1535 CAUCAUCAUCATGTTTTTATCTTCTTTTGA (SEQ ID NO. 6327) 16RH4_HP1559R1536 CAUCAUCAUTTTGAAAAAGGGGTTTTATTA (SEQ ID NO. 6328) 16RH5_HP1560R1537 CAUCAUCAUATGACTACAGACAGAAATTTG (SEQ ID NO. 6329) 16RH6_HP1563R1540 CAUCAUCAUGTTACGATATGTTAGTTACAA (SEQ ID NO. 
6330) 16RH7_HP1564R1541 CAUCAUCAUTTAATAAAGGGGTTTTTATGA (SEQ ID NO. 6331) 16RH8_HP1565R1542 CAUCAUCAUGTATGAAAAATCTTCGCTATA (SEQ ID NO. 6332) 16RH9_HP1566R1543 CAUCAUCAUATGCTCTCTTTAAAACAAGAT (SEQ ID NO. 6333)
16RH10_HP1567R1544 CAUCAUCAUCGAAAAACCATGATTGTCATT (SEQ ID NO. 6334 16RH11_HP1568R1545 CAUCAUCAUATGCGTTGGTGGTGTTTTTTG (SEQ ID NO. 6335 16RH12_HP1569R1546 CAUCAUCAUCAAGATGAAGCGCTCAAGCTT (SEQ ID NO. 6336 17SA1_HP1570S1547 CUACUACUATAAAACGCTATTAGAGGTAAA (SEQ ID NO. 6337 17SA2_HP1571S1548 CUACUACUATTAATCATTCTCTCGTTAAAA (SEQ ID NO. 6338 17SA3_HP1572S1549 CUACUACUACCAAACCCATTAAAATTCCCT (SEQ ID NO. 6339 17SA4_HP1573S1550 CUACUACUAAGCCTTATTTTAGGGGAAACT (SEQ ID NO. 6340 17SA5_HP1574S1551 CUACUACUATCATCTAATACCCTAAAGTCA (SEQ ID NO. 6341 17SA6_HP1575S1552 CUACUACUATTCCGATCACTTGGACATTTG (SEQ ID NO. 6342 17SA7_HP1576S1553 CUACUACUATTTTAATCCTTTAATTTTTCC (SEQ ID NO. 6343 X7SA8_HPX577SX554 CUACUACUAAAACCAAATCCCTAATACTTA (SEQ ID NO. 6344 X7SA9_HPX578S1555 CUACUACUATTCCAACAAATTAGAACTTTT (SEQ ID NO. 6345 X7SA10_HP1579S1556 CUACUACUATAGCCTATGTTAGTTTTTATT (SEQ ID NO. 6346 17SA11_HP1580S1557 CUACUACUATATCCCTTTACCATTGATAAG (SEQ ID NO. 6347 17SA12J3P1581Ξ1558 CUACUACUAACCTTTGAGCTTTTTCATTGA (SEQ ID NO. 6348 17SB1_HPX582S1559 CUACUACUATTTTATCGCTTAATGAGTTGC (SEQ ID NO. 6349 17SB2_HP1583S1560 CUACUACUAATCTTTAAGCCAAATATTTGA (SEQ ID NO. 6350 17SB3_HP1584S1561 CUACUACUATCCATTCACTCAAAACTTTTT (SEQ ID NO. 6351 17SB4_HP1585S1562 CUACUACUACCTTTTAGCGTTTGAGGGAAT (SEQ ID NO. 6352 17SB5_HP1586Ξ1563 CUACUACUAGCTTATAGGGTAACCGTTCCG (SEQ ID NO. 6353 17SB6_HP1587S1564 CUACUACUACTACTTTACCCACCACATAAC (SEQ ID NO. 6354 17SB7_HP1588S1565 CUACUACUAAAAGCTTAAATGGATTCTATT (SEQ ID NO. 6355 17SB8_HP1590S1566 CUACUACUAATTCTGTTGAACTTGTCAATT (SEQ ID NO. 6356 17SB9_HP0009S1567 CUACUACUATTAATAATTGCGAATAAGCGC (SEQ ID NO. 6357 17SB10_HP0058S1568 CUACUACUACGATAGTATAATATGGTGGTG (SEQ ID NO. 6358 17SB11_HP0076S1569 CUACUACUATTTTCACTGAAGCGTTAAGCC (SEQ ID NO. 6359 17SB12_HP0107S1570 CUACUACUATTTTGAGAGATAACGATCGGC (SEQ ID NO. 6360 17SC1_HP0118S1571 CUACUACUACCTTTTACTATAACCATAACC (SEQ ID NO. 6361 17SC2_HP0119S1572 CUACUACUATTTAGCTTGAGCTTTAGAGCG (SEQ ID NO. 6362 17ΞC3_HP0120S1573 CUACUACUATTTACTATAACCATAACCCGC (SEQ ID NO. 
6363) 17SC4_HP0143S1574 CUACUACUACAAATTGACCCCTTGAATGTG (SEQ ID NO. 6364) 17SC5_HP0208S1575 CUACUACUACCTTTTAATCAGTCGCTTCAC (SEQ ID NO. 6365)
17SC6_HP0381S1576 CUACUACUATTTTAAAACGCCCACAAACCC (SEQ ID NO. 6366)
17SC7_HP0451S1577 CUACUACUATCTTCTCTTTGCTTCATGTGG (SEQ ID NO. 6367)
17SC8_HP0521S1578 CUACUACUAATCCATTGCATTTGGGATATT (SEQ ID NO. 6368)
17SC9_HP0544S1579 CUACUACUATTTGCACACAAGCCGCCCAAG (SEQ ID NO. 6369)
17SC10_HP0548S1580 CUACUACUATACAGACTTGCAAAATAGCGC (SEQ ID NO. 6370)
17SC11_HP0550S1581 CUACUACUATCATTGTCCTTAGTTTGTTGC (SEQ ID NO. 6371)
17SC12_HP0619S1582 CUACUACUAAAAGGAGCGTATCGTCTGCTG (SEQ ID NO. 6372)
17SD1_HP0713S1583 CUACUACUAGGCTTATTCTTATTGAGTATC (SEQ ID NO. 6373)
17SD2_HP0715S1584 CUACUACUACCTAAATAATACTTACGCACC (SEQ ID NO. 6374)
17SD3_HP0722S1585 CUACUACUACTATAGAGCCTTCTGTATTGC (SEQ ID NO. 6375)
17SD4_HP0725S1586 CUACUACUACTCATCGCTGCTAAAGCTTGG (SEQ ID NO. 6376)
17SD5_HP0744S1587 CUACUACUAGAGAGAAGTTATCATGCCAAC (SEQ ID NO. 6377)
17SD6_HP0790S1588 CUACUACUATCTCGGTAATATTCGTATTGC (SEQ ID NO. 6378)
17SD7_HP0807S1589 CUACUACUATTGCTGAATTGAGGCGGAATG (SEQ ID NO. 6379)
17SD8_HP0814S1590 CUACUACUAATGCGTGATTTTATGCGTGCC (SEQ ID NO. 6380)
17SD9_HP0882S1591 CUACUACUACTTTGCTTCATTAGGCGTTTG (SEQ ID NO. 6381)
17SD10_HP0884S1592 CUACUACUATTTCGCATAGACCACGCTCCC (SEQ ID NO. 6382)
17SD11_HP0897S1593 CUACUACUAGGCGTGCGCAGCGTATTATCG (SEQ ID NO. 6383)
17SD12_HP0898S1594 CUACUACUACGCGCTTGTAACGATAATACG (SEQ ID NO. 6384)
17SE1_HP0903S1595 CUACUACUATTTGGTTTGCAAAGCGATTTC (SEQ ID NO. 6385)
17SE2_HP0904S1596 CUACUACUATGCGCTTGAAGGGCGCTAATC (SEQ ID NO. 6386)
17SE3_HP0988S1597 CUACUACUAACCTTGATTTTCTATATACGC (SEQ ID NO. 6387)
17SE4_HP0998S1598 CUACUACUATTTCTATATACGCCTTGATCG (SEQ ID NO. 6388)
17SE5_HP1007S1599 CUACUACUATGCATGGATATTTCCTACCCC (SEQ ID NO. 6389)
17SE6_HP1180S1600 CUACUACUAATGAATAACCCGGCGATAGTC (SEQ ID NO. 6390)
17SE7_HP1188S1601 CUACUACUAAGAGGTTGAAAGGTGGGCTTG (SEQ ID NO. 6391)
17SE8_HP1280S1602 CUACUACUAGGCATGGCTTGCCTTATGTTT (SEQ ID NO. 6392)
17SE9_HP1353S1603 CUACUACUAGGATTTTGCAAGATTTCTTGC (SEQ ID NO. 6393)
17SE10_HP1365S1604 CUACUACUATAGTGGGTTAAAGCGATAGCC (SEQ ID NO. 6394)
17SE11_HP1369S1605 CUACUACUAAAACACTTCGTCCATGAGCAC (SEQ ID NO. 6395)
17SE12_HP1372S1606 CUACUACUAATGCCTTAAAAGTTTAGCGCC (SEQ ID NO. 6396)
17SF1_HP1417S1607 CUACUACUACATGCTGCCATAACAAACAGC (SEQ ID NO. 6397)
17SF2_HP1427S1608 CUACUACUACTCGTGATGCCCGTGGCAGCA (SEQ ID NO. 6398)
17SF3_HP1522S1609 CUACUACUACGTTTGGGTTTCCATTCTATC (SEQ ID NO. 6399)
17SF4_HP1538S1610 CUACUACUACCCACGCATAAGCCAAGAACG (SEQ ID NO. 6400)
17SF5_HP1561S1611 CUACUACUAAAAGGGCTCAACCTCTGCATC (SEQ ID NO. 6401)
17SF6_HP1562S1612 CUACUACUAGGTTCAACCTCTGCATCATTC (SEQ ID NO. 6402)
17SF7_HP1589S1613 CUACUACUATTAATTTCGTTTGCTTGTGCC (SEQ ID NO. 6403)
17SF8_HP0527-1S1614 CUACUACUAACAATCTAGAACCTGTTGTTC (SEQ ID NO. 6404)
17SF9_HP0527-2S1615 CUACUACUACTTGGCAGTTTCTGCGACATC (SEQ ID NO. 6405)
17SF10_HP0609-1S1616 CUACUACUAAATCGTGTCAGCTTTATCATC (SEQ ID NO. 6406)
17SF11_HP609-2S1617 CUACUACUATAATAGAGTCCCTTGTTCGCC (SEQ ID NO. 6407)
17SF12_HP610-1S1618 CUACUACUACGTACTAGAATGGAGCAATTG (SEQ ID NO. 6408)
17SG1_HP610-2S1619 CUACUACUACGCATACAAGCCCACATCCAC (SEQ ID NO. 6409)
17SG2_HP887-1S1620 CUACUACUAAGTGCCAGTTTCCAAACGCAC (SEQ ID NO. 6410)
17SG3_HP887-2S1621 CUACUACUACCTCATTCCTAAATTGGAAGC (SEQ ID NO. 6411)
17SG4_HP922-1S1622 CUACUACUAGCGGTGTTTTGATTGTTGGTG (SEQ ID NO. 6412)
17SG5_HP922-2S1623 CUACUACUACGCTTTGAAAGGTTACATCCG (SEQ ID NO. 6413)
17SG6_HP922-3S1624 CUACUACUAGATTGATGATTGAGAGTAGGG (SEQ ID NO. 6414)
17SG7-HP1157-1S1625 CUACUACUAGTTATTGTTAGGTGTCATCTC (SEQ ID NO. 6415)
17SG8_HP1157-2S1626 CUACUACUACCTCCTGTAACTCACATCAGC (SEQ ID NO. 6416)
17SG9_HP1198-1S1627 CUACUACUAGCACGGTTGGCGTCATCATGC (SEQ ID NO. 6417)
17SG10_HP1198-2S1628 CUACUACUACGCTCCAAGCTCCATTAAGCG (SEQ ID NO. 6418)
17SG11-HP1198-3S1629 CUACUACUAGTCGTTTCTTGGAAAGAGGCC (SEQ ID NO. 6419)
17SG12_HP1517-1S1630 CUACUACUAAGCGAATTAGCGCACTTAATG (SEQ ID NO. 6420)
17SH1_HP1517-2S1631 CUACUACUATGATCTTAATTTCTTCATCGG (SEQ ID NO. 6421)
17RA1_HP1570R1547 CAUCAUCAUATGATTAAGTTATTGCTTTTA (SEQ ID NO. 6422)
17RA2_HP1571R1548 CAUCAUCAUAATGGGTTTGGCGTTGGAAAA (SEQ ID NO. 6423)
17RA3_HP1572R1549 CAUCAUCAUGATGTCAAAAAGAATGAAGTG (SEQ ID NO. 6424)
17RA4_HP1573R1550 CAUCAUCAUGGATAAAAGCATGTTTATAGA (SEQ ID NO. 6425)
17RA5_HP1574R1551 CAUCAUCAUTTATGTTCAGCGGTCTAATCC (SEQ ID NO. 6426)
17RA6_HP1575R1552 CAUCAUCAUGGTATTAGATGAATAAAACCA (SEQ ID NO. 6427)
17RA7_HP1576R1553 CAUCAUCAUAAACGATGGTAGTAGAATTAA (SEQ ID NO. 6428)
17RA8_HP1577R1554 CAUCAUCAUAATGATTTCTCAAATGCTCAT (SEQ ID NO. 6429)
17RA9_HP1578R1555 CAUCAUCAUAAATTGATGCAACATGAAATC (SEQ ID NO. 6430) 17RA10_HP1579R1556 CAUCAUCAUAAAGTGCTAAAAAAATTATTA (SEQ ID NO. 6431) 17RA11_HP1580RX557 CAUCAUCAUCATGCTATTTAATGGGCTATG (SEQ ID NO. 6432) X7RA12_HP1581R1558 CAUCAUCAUAGTGTTGTGGGTGCTATATTT (SEQ ID NO. 6433) 17RB1_HP1582R1559 CAUCAUCAUAGTCATGCGTTTTGGATTGAA (SEQ ID NO. 6434) 17RB2_HP1583R1560 CAUCAUCAUCGATAAAATGGCTAAAAAGAA (SEQ ID NO. 6435) 17RB3_HP1584R1561 CAUCAUCAUATGATTTTAAGCATTGAAAGT (SEQ ID NO. 6436) 17RB4_HP1585R1562 CAUCAUCAUGAAAATCATGCTCCGTTCTCT (SEQ ID NO. 6437) 17RB5_HP1586R1563 CAUCAUCAUCTTGGGGTGGTTTTGTGTTTT (SEQ ID NO. 6438) 17RB6_HP1587R1564 CAUCAUCAUGCACTAAGAATGAATGAAGAC (SEQ ID NO. 6439) 17RB7_HP1588R1565 CAUCAUCAUATGGCATACAAATATGATAGA (SEQ ID NO. 6440) 17RB8_HP1590R1566 CAUCAUCAUATATAATATGGCATACAGATA (SEQ ID NO. 6441) 17RB9_HP0009S1567 CAUCAUCAUCAATCAAGCGGTAACGAACGC (SEQ ID NO. 6442) 17RB10_HP0058S1568 CAUCAUCAUAGAGCTTTTAAAATCTGTTGG (SEQ ID NO. 6443) 17RB11_HP0076S1569 CAUCAUCAUATGGCAAATCATAAGTCCGCA . (SEQ ID NO. 6444) X7RBX2_HP0X07S1570 CAUCAUCAUATGATGATTATCACCACAATG (SEQ ID NO. 6445) 17RC1_HP0118S157X CAUCAUCAUATAAGGTTGGCAAGAATACAG (SEQ ID NO. 6446) X7RC2_HP0XX9S1572 CAUCAUCAUAAGAGTTTTAGTAATGCTTGC (SEQ ID NO. 6447) 17RC3_HP0120S1573 CAUCAUCAUAAGGTTGGCAAGAATACAGAG (SEQ ID NO. 6448) 17RC4_HP0143S1574 CAUCAUCAUATGATTAAACAAACCCTCATC (SEQ ID NO. 6449) 17RC5_HP0208S1575 CAUCAUCAUAAGAGATTATCCCTATTGTCG (SEQ ID NO. 6450) 17RC6_HP0381S1576 CAUCAUCAUACCCTTTCGCAAGCCCTAAAC (SEQ ID NO. 6451) 17RC7_HP0451S1577 CAUCAUCAUTAAAGTTTATCTAAAAAACCG (SEQ ID NO. 6452) 17RC8_HP0521S1578 CAUCAUCAUATGATACAAAGAGGATTGAGT (SEQ ID NO. 6453) 17RC9JHP0544S1579 CAUCAUCAUATTGTAGCGATTGTTATTGTG (SEQ ID NO. 6454) X7RCX0_HP0548S1580 CAUCAUCAUAATTTGATGTTACCATCATAG (SEQ ID NO. 6455) 17RC11_HP0550S1581 CAUCAUCAUACGCGCTACGCAAAAGTTCGC (SEQ ID NO. 6456) 17RC12_HP0619S1582 CAUCAUCAUTTTAAAAGAAAGTCAAAGGCσ (SEQ ID NO. 6457) 17RD1_HP0713S1583 CAUCAUCAUCCTATGGCTATAAAAGAATGG (SEQ ID NO. 
6458) 17RD2_HP0715S1584 CAUCAUCAUTGGATATTTTAAAAGCAGAGC (SEQ ID NO. 6459) 17RD3_HP0722S1585 CAUCAUCAUGTCAAAAACACCGGCGAATTG (SEQ ID NO. 6460) 17RD4_HP0725S1586 CAUCAUCAUCGGCTATCAAATCGGCGAAGC (SEQ ID NO. 6461)
17RD5_HP0744S1587 CAUCAUCAUATCATGAGAAATTGAATGGCG (SEQ ID NO 6462) 17RD6JHP0790S1588 CAUCAUCAUCGCTTACTCCAAACTCTAGCG (SEQ ID NO 6463) 17RD7_HP0807S1589 CAUCAUCAUCGCGAAAGAACAGCATCACAC (SEQ ID NO 6464) 17RD8_HP0814S1590 CAUCAUCAUAAGCCGGCTAGAAAAAGAGCG (SEQ ID NO 6465) 17RD9_HP0882S1591 CAUCAUCAUTAAGAATTTAGAATTAAGCGC (SEQ ID NO 6466) 17RD10_HP0884S1592 CAUCAUCAUATGAGCTTGGAGCGTTTTGCC (SEQ ID NO 6467) 17RD11_HP0897S1593 CAUCAUCAUCCAGGACCAAAACCTGGTGCC (SEQ ID NO 6468) 17RD12_HP0898S1594 CAUCAUCAUAGCGTTGATCACCTCATTTCG (SEQ ID NO 6469) 17RE1_HP0903SX595 CAUCAUCAUGTGTTGAATCTGGGCAGTTCG (SEQ ID NO 6470) X7RE2_HP0904SX596 CAUCAUCAUAAAGTGGTTTTACCAGAGAGC (SEQ ID NO 6471) 17RE3_HP0988S1597 CAUCAUCAUATGAGGAAAAACCATTATCCA (SEQ ID NO 6472) 17RE4_HP0998S1598 CAUCAUCAUAACCATTATCCATTAAGGGGG (SEQ ID NO 6473) 17RE5_HP1007S1599 CAUCAUCAUTAAATTCCGCTTGTATCCCAC (SEQ ID NO 6474) 17RE6_HP1180S1600 CAUCAUCAUTTTTAGTGTTGTAGGGATGGC (SEQ ID NO 6475) 17RE7_HP1188S1601 CAUCAUCAUTGAAGAGAGTTAGAGAACTTG (SEQ ID NO 6476) 17RE8_HP1280S1602 CAUCAUCAUAGATTTTAAACGCTCTGTATC (SEQ ID NO 6477) 17RE9_HP1353S1603 CAUCAUCAUTTGTTTCGTGCCAATACCGCC (SEQ ID NO 6478) 17RE10JTP1365S1604 CAUCAUCAUTTTTACTAGAAGACGATTACC (SEQ ID NO 6479) 17RE11_HP1369S1605 CAUCAUCAUACGAAGCGCAATTTTATGAAG (SEQ ID NO 6480) 17RE12_HP1372S1606 CAUCAUCAUAAATTCCTTTGGCTTTTAGGG (SEQ ID NO 6481) 17RF1_HP1417R1607 CAUCAUCAUTGTTTCTCTTGTGTAACGCTC (SEQ ID NO 6482) 17RF2_HP1427S1608 CAUCAUCAUTGGCACACCATGAAGAACAAC (SEQ ID NO 6483) 17RF3_HPX522S1609 CAUCAUCAUCGTTATTTTCCCGGTTGCTTG (SEQ ID NO 6484) 17RF4_HP1538S1610 CAUCAUCAUAGTTTAAGATTCTAATCATCC (SEQ ID NO 6485) 17RF5_HP1561S1611 CAUCAUCAUTTCTTTAGGCGTGCTTGTCGC (SEQ ID NO 6486) 17RF6_HP1562S1612 CAUCAUCAUTCTCTTATTCTTTAGGCGCGC (SEQ ID NO 6487) 17RF7_HP1589S1613 CAUCAUCAUAGTTCAACAGAATACCAAAGG (SEQ ID NO 6488) 17RF8_HP0527-1S1614 CAUCAUCAUAAGAAACCCAAACTCAAATGG (SEQ ID NO 6489) 17RF9_HP0527-2S1615 CAUCAUCAUGCGAGAAATTACTCACCCCTG (SEQ ID NO 6490) 17RF10_HP0609-1S1616 CAUCAUCAUACAGATTTAAAGAATGAACGC (SEQ ID NO 6491) 
17RF11_HP609-2S1617 CAUCAUCAUTTAGTCATGCAGAGCAGACCG (SEQ ID NO. 6492)
17RF12_HP610-1S1618 CAUCAUCAUAATAATGGCGGATCGATCTGG (SEQ ID NO. 6493)
17RG1_HP610-2S1619 CAUCAUCAUGCTTTGGAGGCGTTTATCAGC (SEQ ID NO. 6494)
17RG2_HP887-1S1620 CAUCAUCAUATCGCCCTTTGGTTTCTCTCG (SEQ ID NO. 6495)
17RG3_HP887-2S1621 CAUCAUCAUGGATTTCATCAACAATCAAGG (SEQ ID NO. 6496)
17RG4_HP922-1S1622 CAUCAUCAUGCGTTTAAAAAGGCCAGGTTG (SEQ ID NO. 6497)
17RG5_HP922-2S1623 CAUCAUCAUGGAGGATCAAGCGTTATTAGC (SEQ ID NO. 6498)
17RG6-HP922-3S1624 CAUCAUCAUAGAAAGTCAAGGCTTAGTGAG (SEQ ID NO. 6499)
17RG7_HP1157-1S1625 CAUCAUCAUGGCTGGTATATGTCTGTAGGC (SEQ ID NO. 6500)
17RG8_HP1157-2S1626 CAUCAUCAUGATTTACGCATCCAATTGAGG (SEQ ID NO. 6501)
17RG9_HP1198-1S1627 CAUCAUCAUACAACGAGACAGCTATGATTC (SEQ ID NO. 6502)
17RG10_HP1198-2S1628 CAUCAUCAUGGCTATTTTTGGGGATAAAGC (SEQ ID NO. 6503)
17RG11-HP1198-3S1629 CAUCAUCAUGATGGCAGATAGCGGCGCAAG (SEQ ID NO. 6504)
17RG12_HP1517-1S1630 CAUCAUCAUGATTTACCCAACACAAACTAC (SEQ ID NO. 6505)
17RH1_HP1517-2S1631 CAUCAUCAUCTTAGAAACCCTAACGCACCC (SEQ ID NO. 6506)
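The primer entries listed above follow a regular pattern that can be checked mechanically. The following is a minimal sketch, assuming (this is inferred from the entries themselves and is not stated in the source) that each name reads as plate number, S or R, well row and column, the HP ORF number, and a running index, and that S primers carry a 5' CUACUACUA tail while R primers carry a 5' CAUCAUCAU tail. The names `ENTRY`, `TAILS`, and `parse_primers` are hypothetical.

```python
import re

# Entry pattern inferred from the listing (an assumption, not from the source):
# <plate><S|R><row A-H><column>_HP<ORF>[-part]<S|R><index> <sequence> (SEQ ID NO. <n>)
ENTRY = re.compile(
    r"(?P<plate>\d+)(?P<strand>[SR])(?P<row>[A-H])(?P<col>\d{1,2})"
    r"[_-]HP(?P<orf>\d+(?:-\d+)?)[SR]\d+\s+"
    r"(?P<seq>[ACGTU]+)\s+\(SEQ ID NO\.\s*(?P<seq_id>\d+)\s*\)"
)

# 5' tails observed on every S and R entry in the listing.
TAILS = {"S": "CUACUACUA", "R": "CAUCAUCAU"}


def parse_primers(text):
    """Return one dict per primer entry found in text."""
    primers = []
    for match in ENTRY.finditer(text):
        entry = match.groupdict()
        entry["seq_id"] = int(entry["seq_id"])
        # Flag entries whose sequence does not start with the expected tail,
        # which usually points at OCR damage.
        entry["tail_ok"] = entry["seq"].startswith(TAILS[entry["strand"]])
        primers.append(entry)
    return primers


sample = (
    "17SF3_HP1522S1609 CUACUACUACGTTTGGGTTTCCATTCTATC (SEQ ID NO. 6399) "
    "17RA1_HP1570R1547 CAUCAUCAUATGATTAAGTTATTGCTTTTA (SEQ ID NO. 6422)"
)
for p in parse_primers(sample):
    print(p["plate"], p["strand"], p["row"] + p["col"], "HP" + p["orf"],
          p["seq_id"], p["tail_ok"])
```

Entries that fail to match the pattern, or that match with `tail_ok` set to False, are candidates for manual comparison against the original publication rather than automatic correction.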
Figure imgf000590_0001
TABLE 4
Figure imgf000591_0001
Figure imgf000592_0001
Figure imgf000594_0001
Figure imgf000595_0001
TABLE 5
Figure imgf000596_0001
Figure imgf000597_0001
Figure imgf000598_0001
Figure imgf000599_0001
Table 5: List of the 96 selected ORFs of the genome of strain 26695 analyzed in this study. Grey lines indicate ORFs identified as Putative Essential Genes (PEGs) based on the inability to isolate kanamycin-resistant transformants when the recombinant plasmid corresponding to the Tn3-Km disrupted cloned ORF was used to transform H. pylori strain HAS 141. Bold lines indicate ORFs that are true essential genes. ORF numbers with an asterisk (*): non-ubiquitous ORFs; #: gene known to be essential for which resistant transposon insertion occurred via integration of the plasmid through a single crossing over (transformants where traces of the vector were demonstrated by DNA/DNA hybridization). "spp": H. pylori specific predicted protein encoding gene; "chp": conserved hypothetical protein encoding gene.
(a and b) - Functional categories and sizes are according to the characteristics described in the http://genolist.pasteur.fr/PyloriGene/ database.
(c) - Interacting proteins have been screened through the two-hybrid assay in yeast [Rain, 2001]; protein-protein interactions were scored and classified in categories. The indicated interacting protein corresponds to that with the highest score for a given bait (http://pim.hybrigenics.com/pimrider/).
TABLE 6
Figure imgf000600_0001
Figure imgf000601_0001
Table 6: List of the 42 selected ORFs of the H. pylori genome, strain 26695, analyzed in this study. Grey lines indicate ORFs identified as Putative Essential Genes (PEGs) based on the inability to isolate kanamycin-resistant transformants when the recombinant plasmid corresponding to the Tn3-Km disrupted cloned ORF was used to transform H. pylori strain HAS 141. Bold lines indicate ORFs that are true essential genes. "spp": H. pylori specific predicted protein encoding gene.
TABLE 7
Figure imgf000602_0001
Figure imgf000603_0001
Figure imgf000604_0001
TABLE 8
Figure imgf000605_0001
Figure imgf000606_0001
Figure imgf000607_0001
Figure imgf000608_0001
Figure imgf000609_0001
Figure imgf000610_0001
Figure imgf000611_0001
Figure imgf000612_0001
Figure imgf000613_0001
Figure imgf000614_0001
Figure imgf000615_0001
Nucleotide sequence:
CGTTTGGCTAACGCTTATTTATTCAG
CGGGTTAAGAGGCTCAGGGAAAACC
AGCTCTTCTAGGATTTTTGCTAGGGC
TTTAATGTGTGAAGAAGGGCCAAAG
GCTGTGCCTTGCGATACTTGCATCCA
ATGCCAGAGCGCTTTAAACAACCACC
ACATAGATATTATAGAAATGGATGGG
GCGTCTAATAGGGGGATTGATGATGT
CCGTAATCTCATAGAGCAAACGCGCT
ACAAACCAAGCTTTGGGCGCTATAAA
ATCTTTATCATTGATGAAGTGCATAT
GTTCACCACCGAAGCGTTTAACGCGC
TTTTAAAGACTTTAGAAGAGCCTCCTA
GCCATGTGAAATTCCTTTTAGCGACAA
CAGACGCCTTGAAACTGCCCGCTACCA
TACTCAGCCGCACCCAGCATTTCAGGTT
TAAAAAAATCCCTGAAAATTCCGTTAT
TTCTCATTTAAAAACCATTTTAGAAAA
AGAACAAGTGAGTTATGAAACAAGCG
CGTTAGAAAAACTGGCTCACAG
CGGGCAAGGGAGCCTAAGGGAT
ACGATCACTCTTTTAGAACAAGC
CATCAATTATTGCGATAACGCTAT
CACAGAAAGCAAGGTGGCTGAAAT
GTTAGGAGCGATTGACAGAAGCGT
TTTAGAAGATTTTTTCCAAAGCCTA
ATCAACCAAGATGAAGCGCGATTA
AAAGAGCGTTATGCCATTTTAGAAA
ATTATGAAACCGAGAGCGT
TTAGAAGAAATGATGCTTTT
TTTGAAAGCGAAATTATTGA
GCCCTGATTTTTATTCTATCC
TTTTGATAGAGCGCTTTTTTAAA
ATCATTATGAGCAGTTTGAGCCT
TTTAAAAGAAGGGGCAAATGCC

Amino acid sequence:
RALMCEEGPKAVPCD
TCIQCQSALNNHHIDI
IEMDGASNRGIDDVR
NLIEQTRYKPSFGRYK
IFIIDEVHMFTTEAFNA
LLKTLEEPPSHVKFLL
ATTDALKLPATILSRT
QHFRFKKIPENSVISH
LKTILEKEQVSYETSA
LEKLAHSGQGSLRD
TITLLEQAINYCDNAI
TESKVAEMLGAIDRS
VLEDFFQSLINQDEAR
LKERYAILENYETESV
LEEMMLFLKAKLLSP
DFYSILLIERFFKIIMS
SLSLLKEGANASFVL
LLLKMKFK
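The listing above carries a nucleotide sequence alongside an amino acid sequence. As a sanity check on how the two relate, the following minimal sketch translates a 45-nucleotide stretch taken from the third to fifth lines of the nucleotide text and compares it with the first line of the amino acid text. It assumes the standard genetic code; the choice of fragment and reading frame is an inference from the listing, not something the source states, and the names `CODON_TABLE` and `translate` are hypothetical.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order (an assumption;
# the source does not state which translation table was used).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}


def translate(dna):
    """Translate a DNA string codon by codon, ignoring a trailing partial codon."""
    usable = len(dna) - len(dna) % 3
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))


# Fragment spanning the end of the third line through the start of the
# fifth line of the nucleotide text.
fragment = "AGGGCTTTAATGTGTGAAGAAGGGCCAAAGGCTGTGCCTTGCGAT"
print(translate(fragment))  # RALMCEEGPKAVPCD
```

The output matches the first line of the amino acid text, which suggests the polypeptide corresponds to an in-frame portion of the nucleotide sequence.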
Figure imgf000617_0001
Figure imgf000618_0001
Figure imgf000619_0001
Figure imgf000620_0001

Claims

What is claimed is:
1. A complex of protein-protein interactions in Helicobacter pylori as defined in Table 1 and Table 8.
2. A complex of polynucleotides in Helicobacter pylori encoding for the polypeptides defined in Table 1 and Table 8.
3. A recombinant host cell expressing the interacting polypeptides in Helicobacter pylori defined in Table 1 and Table 8.
4. A method for selecting a modulating compound in Helicobacter pylori comprising:
(a) cultivating a recombinant host cell on a selective medium containing a modulating compound and a reporter gene the expression of which is toxic for said recombinant host cell wherein said recombinant host cell is transformed with two vectors:
(i) wherein said first vector comprises a polynucleotide encoding a first hybrid polypeptide and a DNA binding domain;
(ii) wherein said second vector comprises a polynucleotide encoding a second hybrid polypeptide and an activating domain that activates said toxic reporter gene when the first and second hybrid polypeptides interact;
(b) selecting said modulating compound which inhibits the growth of said recombinant host cell.
5. A modulating compound obtained from the method of Claim 4.
6. A SID® polypeptide in Helicobacter pylori comprising the even SEQ ID Nos. 2 to 3256 in column 3 of Table 2, even SEQ ID Nos. 6590 to 6594 in Table 7 and even SEQ ID Nos. 6596 to 6644 in Table 8.
7. A SID® polynucleotide in Helicobacter pylori comprising the uneven SEQ ID Nos. 1 to 3255 in column 2 of Table 2, uneven SEQ ID Nos. 6589 to 6593 in Table 7 and uneven SEQ ID Nos. 6595 to 6643 in Table 8.
8. A vector comprising the SID® polynucleotide in Helicobacter pylori comprising the uneven SEQ ID Nos. 1 to 3255 in column 2 of Table 2, uneven SEQ ID Nos. 6589 to 6593 in Table 7 and uneven SEQ ID Nos. 6595 to 6643 in Table 8.
9. A fragment of said SID® polypeptide according to Claim 6.
10. A variant of said SID® polypeptide according to Claim 6.
11. A fragment of said SID® polynucleotide according to Claim 7.
12. A variant of said SID® polynucleotide according to Claim 7.
13. A vector comprising the SID® polypeptide according to Claim 11.
14. A recombinant host cell containing the vectors according to Claim 8.
15. A pharmaceutical composition comprising a modulating compound of claim 5 and a pharmaceutically acceptable carrier.
16. A pharmaceutical composition comprising a SID® polypeptide of claim 6, and a pharmaceutically acceptable carrier.
17. A pharmaceutical composition comprising the recombinant host cells of claim 14 and a pharmaceutically acceptable carrier.
18. A protein chip comprising the polypeptides of claim 6.
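Claims 6 and 7 distinguish SID® polypeptides from SID® polynucleotides by the parity of the SEQ ID number within three numeric ranges. A minimal sketch of that rule follows; the range boundaries are taken from the claim wording, while the names `RANGES` and `classify_seq_id` are hypothetical and chosen only for illustration.

```python
# Ranges as recited in claims 6 and 7: uneven numbers are polynucleotides,
# even numbers are polypeptides.
RANGES = [
    ("Table 2", 1, 3256),
    ("Table 7", 6589, 6594),
    ("Table 8", 6595, 6644),
]


def classify_seq_id(n):
    """Return (table, kind) for a SEQ ID number, or None if it falls outside the claimed ranges."""
    for table, low, high in RANGES:
        if low <= n <= high:
            kind = "polynucleotide" if n % 2 else "polypeptide"
            return table, kind
    return None


for seq_id in (1, 3256, 6589, 6644, 6399):
    print(seq_id, classify_seq_id(seq_id))
```

SEQ ID numbers outside the three ranges, such as those of the primers in the sequence listing, return None under this sketch.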
PCT/EP2001/015428 2001-01-02 2001-12-28 PROTEIN-PROTEIN INTERACTIONS IN HELICOBACTER PYLORI WO2002066501A2 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
US25930201P 2001-01-02 2001-01-02
US60/259,302 2001-01-02

Publications (2)

Publication Number Publication Date
WO2002066501A2 true WO2002066501A2 (en) 2002-08-29
WO2002066501A3 WO2002066501A3 (en) 2003-11-20

Family

ID=22984381

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/EP2001/015428 WO2002066501A2 (en) 2001-01-02 2001-12-28 PROTEIN-PROTEIN INTERACTIONS IN HELICOBACTER PYLORI

Country Status (1)

Country Link
WO (1) WO2002066501A2 (en)

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO1999042612A1 (en) * 1998-02-18 1999-08-26 Institut Pasteur A fast and exhaustive method for selecting a prey polypeptide interacting with a bait polypeptide of interest: application to the construction of maps of interactors polypeptides
WO2000066722A1 (en) * 1999-04-30 2000-11-09 Hybrigenics S.A. Collection of prokaryotic dna for two hybrid systems helicobacter pylori protein-protein interactions and application thereof

Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
CHALKER ALISON F ET AL: "Systematic identification of selective essential genes in Helicobacter pylori by genome prioritization and allelic replacement mutagenesis." JOURNAL OF BACTERIOLOGY, vol. 183, no. 4, 2001, pages 1259-1268, XP002245013 ISSN: 0021-9193 cited in the application *
LEGRAIN P ET AL: "Genome-wide protein interaction maps using two-hybrid systems" FEBS LETTERS, ELSEVIER SCIENCE PUBLISHERS, AMSTERDAM, NL, vol. 480, no. 1, 25 August 2000 (2000-08-25), pages 32-36, XP004337490 ISSN: 0014-5793 cited in the application *
TOMB J-F ET AL: "THE COMPLETE GENOME SEQUENCE OF THE GASTRIC PATHOGEN HELICOBACTER PYLORI" NATURE, MACMILLAN JOURNALS LTD. LONDON, GB, vol. 388, no. 6642, 7 August 1997 (1997-08-07), pages 539-547,TABEL, XP002062106 ISSN: 0028-0836 cited in the application *

Cited By (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2007524366A (en) * 2003-04-22 2007-08-30 インターツェル・アクチェンゲゼルシャフト Helicobacter pylori antigen
WO2004111274A1 (en) * 2003-06-10 2004-12-23 bioMérieux B.V. Nucleic acid sequences that can be used as primers and probes in the amplification and detection of sars coronavirus
US8106172B2 (en) 2003-06-10 2012-01-31 Biomerieux, B.V. Nucleic acid sequences that can be used as primers and probes in the amplification and detection of SARS coronavirus
US10828358B2 (en) 2015-12-14 2020-11-10 Technische Universität München Helicobacter pylori vaccines
AU2021212132B2 (en) * 2016-04-06 2023-05-11 Immatics Biotechnologies Gmbh Novel peptides and combination of peptides for use in immunotherapy against aml and other cancers
US11471532B2 (en) 2016-07-20 2022-10-18 Max-Planck-Gesellschaft Zur Förderung Methods for treatment of H. pylori infections

Also Published As

Publication number Publication date
WO2002066501A3 (en) 2003-11-20

Similar Documents

Publication Publication Date Title
JP4469026B2 (en) Streptococcus pneumoniae antigens and vaccines
US6913907B2 (en) Enterococcus faecalis polynucleotides encoding EF059
WO2002057303A2 (en) Protein-protein interactions between shigella flexneri polypeptides and mammalian polypeptides
JP2004135679A (en) Staphylococcus aureus polynucleotide and sequence
JP5340926B2 (en) Rheumatoid fever-related peptide (PARF) and its use as a diagnostic marker
TW201247701A (en) Novel polypeptides and bacteriophages specific to Klebsiella pneumoniae capsular type strains and their applications
JP2002525083A (en) STAPHYLOCOCCUSAUREUS gene and polypeptide
JP2000501621A (en) Nucleic acid and amino acid sequences related to Helicobacter pylori and vaccine compositions thereof
US6916615B2 (en) Collection of prokaryotic DNA for two hybrid systems Helicobacter pylori protein-protein interactions and application thereof
WO2002066501A2 (en) PROTEIN-PROTEIN INTERACTIONS IN HELICOBACTER PYLORI
EP1222255A1 (en) Polypeptide fragments comprising c-terminal portion of helicobacter catalase
JPH11511032A (en) IceA gene and related methods
JP2002272484A (en) dexB
JP2002503445A (en) New pth
JP2002534960A (en) vaccine
JP2000210093A (en) Gid a1
US20030235842A1 (en) Map
JPH11235179A (en) Pth
US20020119512A1 (en) Ups
US6316211B1 (en) ThdF
JP2002522010A (en) nrdF
JP2002525044A (en) topA
JPH11178584A (en) Gid b
JP2002516333A (en) priA
US20120301428A1 (en) Clostridium gene

Legal Events

Date Code Title Description
AK Designated states

Kind code of ref document: A2

Designated state(s): AE AG AL AM AT AU AZ BA BB BG BR BY BZ CA CH CN CO CR CU CZ DE DK DM DZ EC EE ES FI GB GD GE GH GM HR HU ID IL IN IS JP KE KG KP KR KZ LC LK LR LS LT LU LV MA MD MG MK MN MW MX MZ NO NZ OM PH PL PT RO RU SD SE SG SI SK SL TJ TM TN TR TT TZ UA UG US UZ VN YU ZA ZM ZW

AL Designated countries for regional patents

Kind code of ref document: A2

Designated state(s): GH GM KE LS MW MZ SD SL SZ TZ UG ZM ZW AM AZ BY KG KZ MD RU TJ TM AT BE CH CY DE DK ES FI FR GB GR IE IT LU MC NL PT SE TR BF BJ CF CG CI CM GA GN GQ GW ML MR NE SN TD TG

121 Ep: the epo has been informed by wipo that ep was designated in this application
DFPE Request for preliminary examination filed prior to expiration of 19th month from priority date (pct application filed before 20040101)
REG Reference to national code

Ref country code: DE

Ref legal event code: 8642

122 Ep: pct application non-entry in european phase
NENP Non-entry into the national phase

Ref country code: JP

WWW Wipo information: withdrawn in national office

Country of ref document: JP