CN116635086A - Compositions and methods for inhibiting expression of multiple genes - Google Patents

Compositions and methods for inhibiting expression of multiple genes

Info

Publication number
CN116635086A
CN116635086A CN202180078940.0A CN202180078940A CN116635086A CN 116635086 A CN116635086 A CN 116635086A CN 202180078940 A CN202180078940 A CN 202180078940A CN 116635086 A CN116635086 A CN 116635086A
Authority
CN
China
Prior art keywords
site
gene
specific breaker
sequence
specific
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202180078940.0A
Other languages
Chinese (zh)
Inventor
L·M·比奇
J·J·史密斯
R·卡尼克
K·A·戈斯
A·W·谢德格
J·M·肯尼迪
J·D·法雷利
H·贝拉格扎尔
L·A·阮
C·W·奥唐内尔
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Flagship Pioneering Innovations V Inc
Original Assignee
Flagship Pioneering Innovations V Inc
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Flagship Pioneering Innovations V Inc
Priority claimed from PCT/US2021/052720 (WO2022072546A2)
Publication of CN116635086A

Landscapes

  • Medicines That Contain Protein Lipid Enzymes And Other Medicines (AREA)
  • Micro-Organisms Or Cultivation Processes Thereof (AREA)

Abstract

The present disclosure relates to site-specific disruption agents for modulating, e.g., reducing, expression of a target plurality of genes in a cell. In some embodiments, the target plurality of genes comprises pro-inflammatory genes, e.g., CXCL1, CXCL2, CXCL3, CXCL4, CXCL5, CXCL6, CXCL7, and IL-8. In some embodiments, the method includes using a first site-specific disruption agent that targets a first anchor sequence and a second site-specific disruption agent that targets a second anchor sequence.

Description

Compositions and methods for inhibiting expression of multiple genes
Cross Reference to Related Applications
This application claims the benefit of U.S. provisional application 63/085,013, filed on September 29, and of a further application filed on June 29, 2021. The contents of the above-mentioned applications are hereby incorporated by reference in their entirety.
Sequence listing
The present application contains a sequence listing that has been submitted electronically in ASCII format and is incorporated herein by reference in its entirety. The ASCII copy, created on September 29, 2021, is named O2057-7021wo_sl.txt and is 666,818 bytes in size.
Background
The deregulation of gene expression is the root cause of many diseases (e.g., in mammals, such as humans). Many diseases and disorders are associated with multiple related genes. New tools, systems, and methods are needed to alter (e.g., reduce) the expression of multiple related genes.
Disclosure of Invention
The present disclosure provides, inter alia, site-specific disruption agents, or systems comprising site-specific disruption agents, that are useful for modulating, e.g., reducing, expression of a target plurality of genes, e.g., a first gene and a second gene, within an anchor sequence-mediated conjunction (ASMC) comprising a first anchor sequence and a second anchor sequence. In one aspect, the site-specific disruption agent comprises a targeting moiety that specifically binds to the first anchor sequence of the ASMC or to a site proximal to the first anchor sequence. In some embodiments, the site-specific disruption agent binds in an amount sufficient to modulate (e.g., reduce) expression of the target plurality of genes (e.g., a first gene and a second gene). In some embodiments, the site-specific disruption agent further comprises an effector moiety. Typically, modulation of expression of a target plurality of genes by a site-specific disruption agent involves binding of the site-specific disruption agent to the first anchor sequence or proximal to the first anchor sequence. In some embodiments, binding of the site-specific disruption agent to the first anchor sequence can disrupt binding of a nucleating polypeptide, such as CTCF, to the first anchor sequence, e.g., thereby disrupting formation and/or maintenance of the ASMC, e.g., thereby modulating, e.g., reducing, expression of the plurality of genes. In some embodiments, binding of the site-specific disruption agent to the first anchor sequence or proximal to the first anchor sequence can localize the function of the effector moiety to the first anchor sequence and/or the ASMC, e.g., thereby disrupting formation and/or maintenance of the ASMC, e.g., thereby modulating, e.g., reducing, expression of the plurality of genes.
In some embodiments, binding of the site-specific disruption agent to the first anchor sequence or proximal to the first anchor sequence can localize the function of the effector moiety to the first anchor sequence and/or the ASMC, e.g., thereby modulating (e.g., reducing) expression of the plurality of genes. Without wishing to be bound by theory, in some embodiments, targeting multiple genes within the same ASMC is believed to modulate, e.g., reduce, expression of the multiple genes more effectively and/or to achieve a therapeutic effect related to the function of the multiple genes more effectively. For example, in some embodiments, the targeted genes may all be pro-inflammatory genes; modulating, e.g., reducing, expression of multiple pro-inflammatory genes as taught herein may reduce inflammation more effectively than targeting a single gene. Targeting multiple genes contained within the same genomic complex (e.g., an ASMC), e.g., by targeting the ASMC or an anchor sequence of the ASMC, may have an additive or synergistic effect (e.g., with respect to modulation of expression or the stability/duration of that modulation) greater than the effect of targeting individual genes of the plurality.
In some aspects, the disclosure provides a method of reducing expression of a first gene and a second gene in a cell, the method comprising: contacting the cell with a site-specific disruption agent comprising a targeting moiety that specifically binds to the first anchor sequence or to a site located proximal to the first anchor sequence in an amount sufficient to reduce expression of the first and second genes within an anchor sequence-mediated junction comprising the first anchor sequence and the second anchor sequence. In some embodiments, the first gene and the second gene are pro-inflammatory genes.
In some aspects, the disclosure relates to a site-specific breaker comprising: a DNA binding moiety, e.g., a targeting moiety, that specifically binds to or is proximal to the intracellular first anchor sequence. In some embodiments, the first anchor sequence is part of an anchor sequence-mediated junction, the anchor sequence-mediated junction further comprising a second anchor sequence, a first gene, and a second gene. In some embodiments, the first gene and the second gene are pro-inflammatory genes.
In some aspects, the disclosure relates to a site-specific breaker comprising: a targeting moiety that specifically binds to or is proximal to a first anchor sequence within a cell, wherein the first anchor sequence is part of an anchor sequence-mediated junction, the anchor sequence-mediated junction further comprising a second anchor sequence, a first gene, and a second gene, wherein the first gene and the second gene are pro-inflammatory genes.
In another aspect, the present disclosure provides a system comprising: a first site-specific disruption agent comprising a first targeting moiety and optionally a first effector moiety, wherein the first site-specific disruption agent specifically binds to (or proximal to) a first anchor sequence of an anchor sequence-mediated conjunction (ASMC) comprising a first gene and a second gene; and a second site-specific disruption agent comprising a second targeting moiety and optionally a second effector moiety, wherein the second site-specific disruption agent binds to (or proximal to) a second anchor sequence of the ASMC.
In another aspect, the disclosure relates to a method of reducing expression of a first gene and a second gene in a cell, the method comprising: contacting the cell with a site-specific disruption agent comprising a targeting moiety that specifically binds to a first anchor sequence or a site located proximal to the first anchor sequence in an amount sufficient to reduce expression of the first and second genes within an anchor sequence-mediated junction comprising the first anchor sequence and the second anchor sequence, wherein the first gene and the second gene are pro-inflammatory genes; thereby reducing the expression of the first and second genes.
In another aspect, the disclosure relates to a method of reducing expression of a first gene and a second gene in a cell, the method comprising:
contacting the cell with a system in an amount sufficient to reduce expression of the first and second genes, the system comprising: a first site-specific breaker comprising a first targeting moiety and optionally a first effector moiety, the first site-specific breaker specifically binding to the first anchor sequence or a site proximal to the first anchor sequence; and a second site-specific breaker comprising a second targeting moiety and optionally a second effector moiety, wherein the second site-specific breaker binds to (or is proximal to) a second anchor sequence of the ASMC, the first and second genes being within the ASMC, wherein the first and second genes are pro-inflammatory genes.
In another aspect, the disclosure relates to a reaction mixture comprising a cell (e.g., a human cell, e.g., a primary human cell) and a site-specific breaker or system as described herein.
In another aspect, the disclosure relates to a method of treating a subject having an inflammatory disorder, the method comprising administering to the subject a site-specific disruption agent or system as described herein in an amount sufficient to treat the inflammatory disorder.
In another aspect, the disclosure relates to a method of treating inflammation (e.g., localized inflammation) in a subject having an infection (e.g., a viral infection, such as COVID-19), the method comprising administering to the subject a site-specific disruption agent or system as described herein in an amount sufficient to treat the inflammation.
In another aspect, the disclosure relates to a human cell with reduced expression of a first gene and a second gene, wherein the first gene and the second gene are pro-inflammatory genes, wherein the cell comprises a disrupted (e.g., fully disrupted) anchor sequence-mediated junction comprising the first gene and the second gene. In some embodiments, the human cells have been previously contacted with a site-specific disruption agent or system described herein. In some embodiments, the human cells no longer comprise the site-specific disruption agent or system described herein.
In another aspect, the disclosure relates to a human cell comprising a mutation at locus coordinates chr4:74595464-74595486, chr4:74595464-74595486, chr4:5237, or chr4:5237, or within 5, 10, 15, 20, 30, or 50 nucleotides of said region.
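As a rough illustration of how the phrase "within 5, 10, 15, 20, 30, or 50 nucleotides of said region" might be evaluated computationally, the following Python sketch parses a locus string of the form chrN:start-end and tests whether a position falls inside the region padded by a chosen window. The names `Locus`, `parse_locus`, and `within_window` are hypothetical, and treating the coordinates as 1-based and inclusive is an assumption; the disclosure does not specify a coordinate convention or any software.

```python
import re
from typing import NamedTuple


class Locus(NamedTuple):
    """A genomic interval; coordinates are assumed 1-based and inclusive."""
    chrom: str
    start: int
    end: int


# Hypothetical pattern for strings such as "chr4:74595464-74595486".
_LOCUS_RE = re.compile(r"^(chr[0-9XYM]+):(\d+)-(\d+)$")


def parse_locus(text: str) -> Locus:
    """Parse 'chrN:start-end' into a Locus, rejecting malformed input."""
    match = _LOCUS_RE.match(text.strip())
    if match is None:
        raise ValueError(f"unrecognized locus string: {text!r}")
    chrom, start, end = match.group(1), int(match.group(2)), int(match.group(3))
    if start > end:
        raise ValueError(f"start is greater than end in {text!r}")
    return Locus(chrom, start, end)


def within_window(locus: Locus, chrom: str, pos: int, window: int = 0) -> bool:
    """Return True if pos lies in the locus or within `window` nt of either edge."""
    if chrom != locus.chrom:
        return False
    return locus.start - window <= pos <= locus.end + window


region = parse_locus("chr4:74595464-74595486")
print(within_window(region, "chr4", 74595459, window=5))
```

Under these assumptions, a position five nucleotides upstream of the region start counts as inside a 5-nucleotide window, while a position six nucleotides upstream does not.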
Those of ordinary skill in the art will recognize, or be able to ascertain using no more than routine experimentation, many equivalents to the specific embodiments of the disclosure described herein.
All publications, patent applications, patents, and other references mentioned herein (e.g., sequence database reference numbers) are incorporated by reference in their entirety. For example, all GenBank, Unigene, and Entrez sequences mentioned herein (e.g., in any of the tables herein) are incorporated by reference. Unless otherwise indicated, the sequence accession numbers specified herein (including in any of the tables herein) refer to the database entries current as of September 29, 2020. When a gene or protein references multiple sequence accession numbers, all sequence variants are encompassed.
Definitions
Anchor sequence: the term "anchor sequence" as used herein refers to a nucleic acid sequence recognized by a nucleating agent that binds sufficiently to form an anchor sequence-mediated conjunction, e.g., a complex. In some embodiments, the anchor sequence comprises one or more CTCF binding motifs. In some embodiments, the anchor sequence is not located within the coding region of a gene. In some embodiments, the anchor sequence is located within an intergenic region. In some embodiments, the anchor sequence is not located within an enhancer or promoter. In some embodiments, the anchor sequence is located at least 400bp, at least 450bp, at least 500bp, at least 550bp, at least 600bp, at least 650bp, at least 700bp, at least 750bp, at least 800bp, at least 850bp, at least 900bp, at least 950bp, or at least 1kb from any transcription start site. In some embodiments, the anchor sequence is located within a region that is not associated with genomic imprinting, monoallelic expression, and/or monoallelic epigenetic marks. In some embodiments, the anchor sequence has one or more functions selected from the group consisting of: binding an endogenous nucleating polypeptide (e.g., CTCF), interacting with a second anchor sequence to form an anchor sequence-mediated conjunction, and insulating against enhancers outside the anchor sequence-mediated conjunction. In some embodiments of the present disclosure, techniques are provided that can specifically target one or more particular anchor sequences without targeting other anchor sequences (e.g., sequences that may contain a nucleating agent (e.g., CTCF) binding motif in a different context); such targeted anchor sequences may be referred to as "target anchor sequences".
In some embodiments, the sequence and/or activity of the target anchor sequence is modulated, while the sequence and/or activity of one or more other anchor sequences that may be present in the same system as the targeted anchor sequence (e.g., in the same cell and/or on the same nucleic acid molecule (e.g., the same chromosome) in some embodiments) is not modulated. In some embodiments, the anchor sequence comprises or is a nucleation polypeptide binding motif. In some embodiments, the anchor sequence is adjacent to a nucleation polypeptide binding motif.
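The distance criterion in the anchor sequence definition ("at least 400bp ... or at least 1kb from any transcription initiation site") lends itself to a simple computational filter. The Python sketch below is illustrative only: the function names, the representation of an anchor as a start/end pair on a single chromosome, and the example coordinates are all assumptions rather than anything taken from the disclosure.

```python
from typing import Iterable


def min_distance_to_tss(anchor_start: int, anchor_end: int,
                        tss_positions: Iterable[int]) -> int:
    """Smallest distance (bp) from the anchor interval to any TSS.

    Returns 0 when a TSS falls inside the anchor interval. All positions
    are assumed to lie on the same chromosome.
    """
    distances = []
    for tss in tss_positions:
        if tss < anchor_start:
            distances.append(anchor_start - tss)
        elif tss > anchor_end:
            distances.append(tss - anchor_end)
        else:
            distances.append(0)
    if not distances:
        raise ValueError("at least one TSS position is required")
    return min(distances)


def is_far_from_tss(anchor_start: int, anchor_end: int,
                    tss_positions: Iterable[int], min_bp: int = 400) -> bool:
    """True if the anchor is at least `min_bp` from every listed TSS."""
    return min_distance_to_tss(anchor_start, anchor_end, tss_positions) >= min_bp


# Hypothetical coordinates chosen purely for illustration.
example_tss = [9_000, 10_500]
print(min_distance_to_tss(10_000, 10_020, example_tss))
```

In this made-up example the nearest start site is 480 bp from the anchor, so the anchor would pass a 400 bp threshold but fail a 500 bp one.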
Anchor sequence-mediated conjunction: the term "anchor sequence-mediated conjunction" as used herein refers to a DNA structure, in some cases a complex, that arises and/or is maintained via physical interaction with, or binding of, at least two anchor sequences in the DNA by one or more polypeptides (e.g., nucleating polypeptides) or one or more protein and/or nucleic acid entities (e.g., RNA or DNA) that bind the anchor sequences so as to bring the anchor sequences into spatial proximity and functional linkage (see, e.g., fig. 1).
Associated with: two events or entities are "associated" with each other, as that term is used herein, if the presence, level, form, and/or function of one is correlated with that of the other. For example, in some embodiments, a particular entity (e.g., polypeptide, genetic feature, metabolite, microorganism, etc.) is considered to be associated with a particular disease, disorder, or condition if its presence, level, form, and/or function correlates with the incidence of and/or susceptibility to the disease, disorder, or condition (e.g., in a relevant population). In some embodiments, two or more entities are physically "associated" with each other if they interact directly or indirectly such that they are and/or remain physically proximate to each other. In some embodiments, two or more entities physically associated with each other are covalently linked to each other; in some embodiments, two or more entities that are physically associated with each other are not covalently linked to each other, but are non-covalently associated, such as by hydrogen bonding, van der Waals interactions, hydrophobic interactions, magnetism, and combinations thereof. In some embodiments, a DNA sequence is "associated with" a target genomic or transcription complex when the nucleic acid is at least partially located within the target genomic or transcription complex, and expression of a gene in the DNA sequence is affected by formation or disruption of the target genomic or transcription complex.
Site-specific disruption agent: as used herein, the term "site-specific disruption agent" refers to an agent or entity that specifically inhibits, dissociates, degrades, and/or modifies one or more components of a genomic complex (e.g., an ASMC) and thereby modulates (e.g., reduces) expression of a target plurality of genes as described herein. In some embodiments, the site-specific disruption agent interacts with one or more components of the genomic complex. In some embodiments, the site-specific disruption agent binds (e.g., directly or, in some embodiments, indirectly) one or more genomic complex components. In some embodiments, the site-specific disruption agent can bind an anchor sequence, e.g., a first and/or second anchor sequence, that is part of an ASMC comprising the target plurality of genes. In some embodiments, the site-specific disruption agent can bind a site proximal to an anchor sequence (e.g., a first and/or second anchor sequence) that is part of an ASMC comprising the target plurality of genes. In some embodiments, the site-specific disruption agent modifies one or more genomic complex components. In some embodiments, the site-specific disruption agent comprises an oligonucleotide. In some embodiments, the site-specific disruption agent comprises a polypeptide. In some embodiments, the site-specific disruption agent comprises an antibody (e.g., a monospecific or multispecific antibody construct) or antibody fragment. In some embodiments, the site-specific disruption agent is directed to a particular genomic location and/or genomic complex by a targeting moiety, as described herein. In some embodiments, the site-specific disruption agent comprises a genomic complex component or variant thereof. In some embodiments, the site-specific disruption agent comprises a targeting moiety. In some embodiments, the site-specific disruption agent comprises an effector moiety. In some embodiments, the site-specific disruption agent comprises a plurality of effector moieties.
In some embodiments, the site-specific breaker comprises a targeting moiety and one or more effector moieties. In some embodiments, the site-specific disruption agent specifically binds to a first site in the genome with a higher affinity than a second site in the genome (e.g., relative to any other site in the genome). In some embodiments, the site-specific disruption agent preferentially inhibits, dissociates, degrades, and/or modifies one or more components of the first genomic complex relative to the second genomic complex (e.g., relative to any other genomic complex).
Domain: as used herein, the term "domain" refers to a segment or portion of an entity. In some embodiments, a "domain" is associated with a particular structural and/or functional feature of an entity such that when the domain is physically separated from the remainder of its parent entity, it substantially or entirely retains the particular structural and/or functional feature. Alternatively or additionally, in some embodiments, a domain may be or comprise a part of an entity that, when separated from the (parent) entity and connected to a different (receiving) entity, substantially retains and/or confers to the receiving entity one or more structural and/or functional features that characterize it in the parent entity. In some embodiments, the domain is or comprises a segment or portion of a molecule (e.g., a small molecule, a carbohydrate, a lipid, a nucleic acid, a polypeptide, etc.). In some embodiments, the domain is or comprises a segment of a polypeptide. In some such embodiments, the domain is characterized by a particular structural element (e.g., a particular amino acid sequence or sequence motif, an alpha-helical feature, a beta-sheet feature, a helical coil feature, a random coil feature, etc.), and/or a particular functional feature (e.g., binding activity, enzymatic activity, folding activity, signaling activity, etc.).
Effector moiety: as used herein, the term "effector moiety" refers to a domain having one or more functions that, when properly positioned in the nucleus of a cell, regulate, e.g., reduce, expression of a target plurality of genes in the cell. In some embodiments, the effector moiety comprises a polypeptide. In some embodiments, the effector moiety comprises a polypeptide and a nucleic acid. The function associated with the effector moiety may directly affect the expression of the target multiple genes, e.g., block the recruitment of transcription factors that would stimulate the expression of the genes. The functions associated with the effector moiety may indirectly affect the expression of the target multiple genes, e.g., introducing epigenetic modifications or recruiting other factors that introduce epigenetic modifications, thereby inducing changes in chromosome topology, thereby inhibiting expression of the target multiple genes.
Genomic complex: as used herein, the term "genomic complex" refers to a complex that brings together two genomic sequence elements that are separated from each other on one or more chromosomes, via interactions between and among proteins and/or other components (possibly including the genomic sequence elements). In some embodiments, the genomic sequence element is an anchor sequence to which one or more protein components of the complex bind. In some embodiments, the genomic complex may comprise an anchor sequence-mediated conjunction. In some embodiments, the genomic sequence element may be or comprise a CTCF binding motif, a promoter, and/or an enhancer. In some embodiments, the genomic sequence element comprises at least one or both of a promoter and/or a regulatory site (e.g., an enhancer). In some embodiments, complex formation is nucleated at one or more genomic sequence elements and/or by binding of one or more protein components to one or more genomic sequence elements. As will be appreciated by those of skill in the art, in some embodiments, co-localization (e.g., conjunction) of genomic loci via complex formation alters DNA topology at or near (in some embodiments, including between) one or more genomic sequence elements. In some embodiments, the genomic complex comprises an anchor sequence-mediated conjunction comprising one or more loops. In some embodiments, a genomic complex as described herein is nucleated by a nucleating polypeptide (e.g., CTCF and/or cohesin). In some embodiments, a genomic complex as described herein may comprise, for example, one or more of the following: CTCF, cohesin, non-coding RNA (e.g., eRNA), transcription machinery proteins (e.g., RNA polymerase, one or more transcription factors, e.g., selected from the group consisting of TFIIA, TFIIB, TFIID, TFIIE, TFIIF, TFIIH, etc.), transcriptional regulators (e.g., Mediator, P300, enhancer-binding proteins, repressor-binding proteins, histone modifiers, etc.), and the like.
In some embodiments, a genomic complex as described herein comprises one or more polypeptide components and/or one or more nucleic acid components (e.g., one or more RNA components), which in some embodiments can interact with each other and/or with one or more genomic sequence elements (e.g., anchor sequences, promoter sequences, regulatory sequences (e.g., enhancer sequences)) in order to constrain a piece of genomic DNA to a topology (e.g., loop) that is not employed by the genomic DNA when the complex is not formed.
Nucleic acid: as used herein, the term "nucleic acid" in its broadest sense refers to any compound and/or substance that is or can be incorporated into an oligonucleotide chain. In some embodiments, the nucleic acid is a compound and/or substance that is or can be incorporated into an oligonucleotide chain via a phosphodiester linkage. As will be apparent from the context, in some embodiments, "nucleic acid" refers to an individual nucleic acid residue (e.g., a nucleotide and/or nucleoside); in some embodiments, "nucleic acid" refers to an oligonucleotide chain comprising individual nucleic acid residues. In some embodiments, a "nucleic acid" is or comprises RNA; in some embodiments, a "nucleic acid" is or comprises DNA. In some embodiments, the nucleic acid is, comprises, or consists of one or more natural nucleic acid residues. In some embodiments, the nucleic acid is, comprises, or consists of one or more nucleic acid analogs. In some embodiments, the nucleic acid analog differs from a nucleic acid in that it does not utilize a phosphodiester backbone. For example, in some embodiments, a nucleic acid is, comprises, or consists of one or more "peptide nucleic acids", which are known in the art, have peptide bonds rather than phosphodiester bonds in the backbone, and are considered to be within the scope of the present invention. Alternatively or additionally, in some embodiments, the nucleic acid has one or more phosphorothioate and/or 5'-N-phosphoramidite linkages instead of phosphodiester linkages. In some embodiments, the nucleic acid is, comprises, or consists of one or more natural nucleosides (e.g., adenosine, thymidine, guanosine, cytidine, uridine, deoxyadenosine, deoxythymidine, deoxyguanosine, and deoxycytidine).
In some embodiments, the nucleic acid is, comprises, or consists of one or more nucleoside analogs (e.g., 2-aminoadenosine, 2-thiothymidine, inosine, pyrrolopyrimidine, 3-methyladenosine, 5-methylcytidine, C-5 propynyl-cytidine, C-5 propynyl-uridine, 2-aminoadenosine, C5-bromouridine, C5-fluorouridine, C5-iodouridine, C5-propynyl-uridine, C5-propynyl-cytidine, C5-methylcytidine, 2-aminoadenosine, 7-deazaguanosine, 8-oxoadenosine, O(6)-methylguanine, 2-thiocytidine, methylated bases, intercalated bases, and combinations thereof). In some embodiments, the nucleic acid comprises one or more modified sugars (e.g., 2'-fluororibose, ribose, 2'-deoxyribose, arabinose, and hexose) as compared with those in natural nucleic acids. In some embodiments, the nucleic acid has a nucleotide sequence encoding a functional gene product (e.g., RNA or protein). In some embodiments, the nucleic acid comprises one or more introns. In some embodiments, the nucleic acid is prepared by one or more of isolation from a natural source, enzymatic synthesis (in vivo or in vitro) by polymerization based on a complementary template, replication in a recombinant cell or system, and chemical synthesis. In some embodiments, the nucleic acid is at least 3, 4, 5, 6, 7, 8, 9, 10, 15, 20, 25, 30, 35, 40, 45, 50, 55, 60, 65, 70, 75, 80, 85, 90, 95, 100, 110, 120, 130, 140, 150, 160, 170, 180, 190, 200, 225, 250, 275, 300, 325, 350, 375, 400, 425, 450, 475, 500, 600, 700, 800, 900, 1000, 1500, 2000, 2500, 3000, 3500, 4000, 4500, 5000, or more residues in length. In some embodiments, the nucleic acid is partially or fully single stranded; in some embodiments, the nucleic acid is partially or fully double stranded. In some embodiments, the nucleic acid has a nucleotide sequence comprising at least one element that encodes a polypeptide or is the complement of a sequence that encodes a polypeptide. In one embodiment, the nucleic acid has enzymatic activity.
Operably linked: as used herein, the phrase "operably linked" refers to a juxtaposition wherein the components described are in a relationship permitting them to function in their intended manner. A transcription control element "operably linked" to a functional element (e.g., a gene) is associated with it in such a way that expression and/or activity of the functional element (e.g., the gene) is achieved under conditions compatible with the transcription control element. In some embodiments, the "operably linked" transcription control element is contiguous with (e.g., covalently linked to) the coding element of interest (e.g., a gene); in some embodiments, the operably linked transcription control element acts in trans or otherwise at a distance from the functional element of interest (e.g., a gene). In some embodiments, operably linked means that the two nucleic acid sequences are contained on the same nucleic acid molecule. In another embodiment, operably linked may further mean that two nucleic acid sequences are in proximity to each other on the same nucleic acid molecule, e.g., within 1000, 500, 100, 50, or 10 base pairs of each other or immediately adjacent to each other.
Peptides, polypeptides, proteins: as used herein, the terms "peptide," "polypeptide," and "protein" refer to a compound consisting of amino acid residues covalently linked by peptide bonds or by means other than peptide bonds. A protein or peptide must contain at least two amino acids, and there is no limit to the maximum number of amino acids that can make up the sequence of the protein or peptide. Polypeptides include any peptide or protein comprising two or more amino acids linked to each other by peptide bonds or by means other than peptide bonds. As used herein, the term refers both to short chains, which are also commonly referred to in the art as, for example, peptides, oligopeptides, and oligomers, and to longer chains, which are generally referred to in the art as proteins, of which there are many types.
Proximal: as used herein, "proximal" refers to two sites (e.g., nucleic acid sites) being close enough that binding of a site-specific disruption agent at the first site, and/or modification of the first site by the site-specific disruption agent, will produce the same or substantially the same effect as binding and/or modification of the other site. For example, a DNA targeting moiety may bind to a first site proximal to an anchor sequence (the second site), and an effector moiety associated with the DNA targeting moiety may epigenetically modify the first site such that binding of the anchor sequence to an endogenous nucleating polypeptide is altered, substantially as if the second site (the anchor sequence) had been bound and/or modified. In some embodiments, sites proximal to each other are less than 5000, 4000, 3000, 2000, 1000, 900, 800, 700, 600, 500, 400, 300, 200, 100, 50, 40, 30, 20, 10, or 5 base pairs apart from each other.
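The definition of proximal is functional (the same or substantially the same effect), but the base-pair thresholds listed for some embodiments can be expressed as a simple distance test. The following Python sketch is a minimal illustration under stated assumptions: sites are represented as hypothetical (chromosome, start, end) tuples, overlapping sites are given a gap of zero, and the function names are invented for this example. It does not capture the functional part of the definition.

```python
from typing import Tuple

Site = Tuple[str, int, int]  # (chromosome, start, end); hypothetical representation


def gap_between(site_a: Site, site_b: Site) -> int:
    """Number of base pairs separating two sites on the same chromosome.

    Overlapping or abutting sites are treated as having a gap of 0.
    """
    chrom_a, start_a, end_a = site_a
    chrom_b, start_b, end_b = site_b
    if chrom_a != chrom_b:
        raise ValueError("sites are on different chromosomes")
    if end_a < start_b:
        return start_b - end_a
    if end_b < start_a:
        return start_a - end_b
    return 0


def are_proximal(site_a: Site, site_b: Site, max_bp: int = 5000) -> bool:
    """True if the sites are on one chromosome and less than `max_bp` apart."""
    if site_a[0] != site_b[0]:
        return False
    return gap_between(site_a, site_b) < max_bp


# Hypothetical binding site and anchor sequence, for illustration only.
binding_site = ("chr4", 100, 120)
anchor = ("chr4", 150, 170)
print(are_proximal(binding_site, anchor, max_bp=50))
```

Because the thresholds are phrased as "less than" a given number of base pairs, the comparison is strict: two sites exactly 30 bp apart satisfy a 50 bp threshold but not a 30 bp one.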
Sequence-targeting polypeptide: as used herein, the term "sequence-targeting polypeptide" refers to a protein, such as an enzyme, e.g., Cas9 or a TALEN, that recognizes or specifically binds to a target nucleic acid sequence. In some embodiments, the sequence-targeting polypeptide is a catalytically inactive protein lacking endonuclease activity, such as dCas9, a TAL effector molecule, or a Zn finger molecule.
Specific binding: as used herein, the term "specific binding" refers to the ability to distinguish between potential binding partners in the environment in which the binding occurs. In some embodiments, a binding agent that interacts with one particular target when other potential targets are present is said to "specifically bind" to the target with which it interacts. In some embodiments, specific binding is assessed by detecting or determining the extent of binding between the binding agent and its partner; in some embodiments, specific binding is assessed by detecting or determining the extent of dissociation of the binding agent-partner complex. In some embodiments, specific binding is assessed by detecting or determining the ability of the binding agent to compete with the selective interaction between its partner and another entity. In some embodiments, specific binding is assessed by performing such assays or determinations over a range of concentrations.
Subject: as used herein, the term "subject" or "test subject" refers to any organism to which a provided compound or composition is administered according to the present disclosure, e.g., for experimental, diagnostic, prophylactic and/or therapeutic purposes. Typical subjects include animals (e.g., mammals such as mice, rats, rabbits, non-human primates, and humans; insects; worms; and the like) and plants. In some embodiments, the subject may have and/or be susceptible to a disease, disorder, and/or condition.
Substantially: as used herein, the term "substantially" refers to the qualitative condition of exhibiting all or nearly all of the range or degree of a feature or characteristic of interest. Those of ordinary skill in the art will appreciate that biological and chemical phenomena rarely, if ever, go to completion and/or proceed to completeness or achieve or avoid an absolute result. Thus, the term "substantially" may be used in some embodiments herein to capture the potential lack of completeness inherent in many biological and chemical phenomena.
Symptom alleviation: as used herein, the phrase "symptom alleviation" may be used when the degree (e.g., intensity, severity, etc.) and/or frequency of one or more symptoms of a particular disease, disorder, or condition is reduced. In some embodiments, delaying the onset of a particular symptom is considered a form of reducing the frequency of the symptom.
Target: according to the present disclosure, an agent or entity is considered to "target" another agent or entity if it specifically binds to that agent or entity under conditions in which they are in contact with each other. For example, in some embodiments, an antibody (or antigen-binding fragment thereof) targets its cognate epitope or antigen. In some embodiments, a nucleic acid having a specific sequence targets a nucleic acid of a substantially complementary sequence. In some embodiments, a targeting moiety that specifically binds to an anchor sequence targets the anchor sequence, an ASMC comprising the anchor sequence, and/or a plurality of genes within the ASMC.
Target plurality of genes: as used herein, the term "target plurality of genes" refers to a group of more than one gene (e.g., 2, 3, 4, 5, 6, 7, 8, 9, or more genes) that is targeted for modulation (e.g., modulation of expression). In some embodiments, the target plurality of genes is part of a targeted genomic complex. In some embodiments, each gene of the target plurality of genes has at least a portion (e.g., part or all) of its genomic sequence as part of a target genomic complex, e.g., at least partially within an ASMC, that is targeted by a site-specific disruption agent as described herein. In some embodiments, modulating comprises inhibiting expression of a target plurality of genes. In some embodiments, the target plurality of genes is modulated by contacting the target plurality of genes or a transcription control element operably linked to one or more of the target plurality of genes with a site-specific disruption agent as described herein. In some embodiments, one or more of the target plurality of genes is abnormally expressed (e.g., overexpressed) in a cell, e.g., in a cell of a subject (e.g., patient). In some embodiments, the genes of the target plurality of genes have related functions. For example, the genes of a target plurality of genes may all have a pro-inflammatory effect when expressed; such genes may be referred to herein as pro-inflammatory genes or target pro-inflammatory genes. In some embodiments, a gene of the target plurality of genes encodes a protein. In some embodiments, a gene of the target plurality of genes encodes a functional RNA.
Targeting moiety: as used herein, the term "targeting moiety" means an agent or entity that specifically interacts with (e.g., targets) a component or group of components, e.g., one or more components involved in a genomic complex (e.g., an anchor sequence-mediated conjunction) as described herein. In some embodiments, a targeting moiety according to the present disclosure targets one or more target components of a genomic complex as described herein. In some embodiments, the targeting moiety targets a genomic complex component comprising a genomic sequence element (e.g., an anchor sequence). In some embodiments, the targeting moiety targets a genomic complex component other than a genomic sequence element. In some embodiments, the targeting moiety targets a plurality of genomic complex components, or a combination thereof, which in some embodiments may include genomic sequence elements. In some aspects, the contributions of the present disclosure include the insight that inhibition, dissociation, degradation, and/or modification of one or more genomic complexes (e.g., comprising a target anchor sequence proximal to a target gene (e.g., a fusion gene, e.g., a fusion oncogene) and/or breakpoint, as described herein) can be achieved by targeting one or more genomic complex components (including one or more genomic sequence elements) with a site-specific disruption agent. In some aspects, effective inhibition, dissociation, degradation, and/or modification of one or more genomic complexes may be achieved by targeting one or more complex components comprising one or more genomic sequence elements, as described herein.
In some embodiments, the present disclosure contemplates that improved effectiveness of inhibition, dissociation, degradation, or modification (e.g., with respect to the degree of specificity for a particular genomic complex compared to other genomic complexes that may form or be present in a given system [e.g., in terms of the impact on the number of complexes detected in a population]) may be achieved by targeting one or more complex components that are not genomic sequence elements, optionally alternatively or additionally including targeting a genomic sequence element, wherein the improvement is relative to the inhibition, dissociation, degradation, or modification typically achieved by targeting one or more genomic sequence elements alone. In some embodiments, the site-specific disruption agent as described herein facilitates inhibition, dissociation, degradation, or modification of the target genomic complex. As a non-limiting example, in some embodiments, a site-specific breaker as described herein inhibits, dissociates, degrades, and/or modifies at least one component of a given genomic complex (e.g., comprising an anchor sequence-mediated conjunction). In some embodiments, a site-specific disruption agent as described herein does not inhibit, dissociate, degrade, and/or modify at least one other specific genomic complex (i.e., a non-target genomic complex) (e.g., a component thereof), which may be present, for example, in other cells (e.g., in a non-target cell) and/or at a different site in the same cell (i.e., within a target cell).
The site-specific disruption agent as described herein can comprise a targeting moiety. In some embodiments, the targeting moiety also acts as an effector moiety (e.g., a disruption moiety); in some such embodiments, the provided site-specific disruption agent may lack any effector moiety (e.g., disruption, modification, or other effector moiety) that is separate (or meaningfully different) from the targeting moiety.
Therapeutically effective amount: as used herein, the term "therapeutically effective amount" means the amount of a substance (e.g., a therapeutic agent, composition, and/or formulation) that, when administered as part of a therapeutic regimen, causes a desired biological response. In some embodiments, a therapeutically effective amount of a substance is an amount sufficient, when administered to a subject suffering from or susceptible to a disease, disorder, and/or condition, to treat, diagnose, prevent, and/or delay the onset of the disease, disorder, and/or condition. As will be appreciated by one of ordinary skill in the art, the effective amount of a substance may vary depending on such factors as the desired biological endpoint(s), the substance to be delivered, the target cell(s) or tissue(s), etc. For example, in some embodiments, an effective amount of a compound in a formulation for treating a disease, disorder, and/or condition is an amount that alleviates, ameliorates, reduces, inhibits, prevents, delays the onset of, reduces the severity of, and/or reduces the incidence of one or more symptoms or features of the disease, disorder, and/or condition. In some embodiments, the therapeutically effective amount is administered in a single dose; in some embodiments, multiple unit doses are required to deliver a therapeutically effective amount.
Transcription control sequence: as used herein, the term "transcription control sequence" refers to a nucleic acid sequence that increases or decreases transcription of a gene. An "enhancer sequence" increases the likelihood of gene transcription. A "silencing or repressor sequence" reduces the likelihood of gene transcription. Examples of transcription control sequences include promoters and enhancers. In some embodiments, the ASMC comprises a transcription control sequence. Such transcription control sequences are referred to as internal transcription control sequences (e.g., an enhancer sequence contained within an ASMC is referred to as an internal enhancer sequence).
Drawings
The following detailed description of embodiments of the present disclosure will be better understood when read in conjunction with the accompanying drawings. For the purpose of illustrating the disclosure, there is shown in the drawings exemplary embodiments of the invention. It should be understood, however, that the disclosure is not limited to the precise arrangements and instrumentalities of the embodiments shown in the drawings.
FIG. 1 shows an exemplary localization of a gRNA sequence in an anchor sequence. The figure discloses SEQ ID NOs: 244-245, respectively, in order of appearance.
FIG. 2 shows exemplary localization and restriction site information of a gRNA sequence in an anchor sequence. The figure discloses SEQ ID NOs: 246-247, respectively, in order of appearance.
FIG. 3 shows a graph of expression (mRNA) of various chemokines in TNF-treated cells with and without treatment with a site-specific breaker comprising a CRISPR/Cas molecule and a first exemplary gRNA.
FIG. 4 shows a graph of expression (mRNA) of various chemokines in TNF-treated cells with and without treatment with a site-specific breaker comprising a CRISPR/Cas molecule and a second exemplary gRNA.
FIG. 5 shows a graph depicting different types of genomic complexes (e.g., ASMC), such as loops, and models of how the expression of genes contained therein is altered.
Figure 6 shows a plot of cytokine expression measured by RNA levels of CXCL1, CXCL2, CXCL3 and IL-8 in THP-1 cells treated with a site-specific breaker comprising a CRISPR/Cas molecule and an sgRNA targeting an anchor sequence of a genomic complex (e.g., ASMC) comprising a cytokine-encoding gene.
Figure 7 shows a graph of cytokine secretion (CXCL1 and IL-8) of THP-1 cells treated with site-specific disrupters comprising CRISPR/Cas molecules and different sgRNAs targeting anchor sequences of a genomic complex (e.g., ASMC) comprising a cytokine-encoding gene.
Figure 8 shows a graph of cytokine expression (CXCL3) measured by RNA level in THP-1 cells treated with a site-specific breaker comprising a CRISPR/Cas molecule and an sgRNA targeting an anchor sequence of a genomic complex (e.g., ASMC) comprising a cytokine-encoding gene (top), and a flow chart showing how the cells are processed in the experiment (bottom).
Figure 9A shows a graph of cytokine expression (CXCL1) measured by RNA level in THP-1 cells 3 weeks after treatment with a site-specific breaker comprising a CRISPR/Cas molecule and an sgRNA targeting an anchor sequence of a genomic complex (e.g., ASMC) comprising a cytokine-encoding gene (top), and a flow chart showing how the cells are processed in the experiment (bottom). Figure 9B shows a graph of cytokine expression (CXCL3) measured by RNA level in THP-1 cells 3 weeks after treatment with a site-specific breaker comprising a CRISPR/Cas molecule and an sgRNA targeting an anchor sequence of a genomic complex (e.g., ASMC) comprising a cytokine-encoding gene.
Figure 10 shows a graph of cytokine expression (CXCL1) measured by RNA level in THP-1 cells after treatment with a site-specific breaker comprising a catalytically inactive CRISPR/Cas molecule and a transcription repressor (KRAB) and an sgRNA targeting an anchor sequence of a genomic complex (e.g., ASMC) comprising a cytokine-encoding gene.
Figure 11 shows a graph of cytokine expression (CXCL1) measured by RNA level in THP-1 cells after treatment with a site-specific breaker comprising a catalytically inactive CRISPR/Cas molecule and a histone methyltransferase (EZH2) and an sgRNA targeting an anchor sequence of a genomic complex (e.g., ASMC) comprising a cytokine-encoding gene.
Figure 12 shows a graph of cytokine expression (CXCL1) measured by RNA level in THP-1 cells after treatment with a site-specific breaker comprising a catalytically inactive CRISPR/Cas molecule and a DNA methyltransferase (MQ1) and an sgRNA targeting an anchor sequence of a genomic complex (e.g., ASMC) comprising a cytokine-encoding gene.
FIG. 13 shows a graph of cytokine expression (CXCL1) measured by RNA levels in THP-1 cells after 72 hours, 3 weeks or 4 weeks of treatment with different site-specific disruption agents (top) and a flow chart showing how the cells are processed in the experiment (bottom).
Figure 14 shows a graph of cytokine expression (CXCL3) measured by RNA level in THP-1 cells after treatment with different site-specific disruption agents and sgRNAs targeting anchor sequences of a genomic complex (e.g., ASMC) comprising a cytokine-encoding gene (top), and a flow chart showing how the cells were processed in the experiment (bottom).
Figure 15 shows a graph of cytokine expression (CXCL1) measured by RNA level in THP-1 cells after treatment with different site-specific disruption agents and sgRNAs targeting anchor sequences of a genomic complex (e.g., ASMC) comprising a cytokine-encoding gene (top), and a flow chart showing how the cells were processed in the experiment (bottom).
FIG. 16 shows human CXCL IGD and gene cluster organization. FIG. 16A shows a schematic of the Insulated Genomic Domain (IGD), illustrating two loops within the CXCL1-8 gene cluster. The CXCL8, CXCL6 and CXCL1 genes are located on the left loop of the IGD. The CXCL2-5 and CXCL7 genes are located on the right loop of the IGD. Investigation of IGD data from different cell lines indicated that the middle CTCF was only present in CXCL-secreting cells (e.g., not in lymphocytes).
Figure 16B shows the guides designed for four different CTCF targets: left CTCF-2, left CTCF, middle CTCF, and right CTCF.
FIG. 17 shows that the CXCL1-8 genes are down-regulated when dCas9-EZH2 guide 30183 targets the middle CTCF motif within the CXCL1-8 cluster in TNFα-treated human A549 lung cancer epithelial cells. Cells stimulated with TNFα served as controls.
FIG. 18 shows that the CXCL1, 2, 3 and 8 genes are down-regulated when dCas9-EZH2 guide 30183 targets the middle CTCF motif located within the CXCL1-8 cluster in TNFα-treated human IMR-90 normal lung fibroblasts. Cells stimulated with TNFα served as controls.
FIG. 19 shows that CXCL1, 2, 3, 8 genes are down-regulated when control A targets the left CTCF motif within the CXCL1-8 cluster in TNF-alpha treated human monocytes. Cells stimulated with tnfα served as controls.
FIG. 20 shows mouse CXCL IGD and cluster organization. FIG. 20A shows a schematic Insulated Genomic Domain (IGD), illustrating two loops within the CXCL gene cluster. FIG. 20B illustrates two loops within the CXCL1-5, 7 and 15 gene cluster. The CXCL4, CXCL5 and CXCL7 genes are located on the left loop of the IGD. The CXCL1-3 and CXCL15 genes are located on the right loop of the IGD. Guides were designed for four different CTCF targets: left (L), middle 1 (M1), middle 2 (M2), and right (R) CTCF.
Figure 21A shows IGD guides designed for four different CTCF targets: middle 1 (M1), middle 2 (M2), and right (R) CTCF.
FIG. 21B shows the in vitro down-regulation of mouse CXCL IGD in Hep 1.6 using dCas9-MQ1. When dCas9-MQ1 was transfected with a guide targeting either the right CTCF or one of the two middle CTCF motifs in the CXCL gene cluster, none of the seven CXCL genes was down-regulated following TNFα stimulation (orange). When dCas9-MQ1 was transfected with combined guides targeting the middle and right CTCF motifs, the entire gene cluster was down-regulated (blue).
FIG. 22A shows a schematic of the experimental design to determine the effect of dCas9-MQ1 on reducing leukocyte infiltration in inflamed lungs. At the -2 hour time point, each mouse was treated with LNP alone or with 3 mg/kg of dCas9-MQ1 targeting the two middle and the right CTCFs. Mice were stimulated with 5 mg/kg LPS at zero hours, followed by a second dose of either LNP alone or 3 mg/kg dCas9-MQ1 targeting the two middle and the right CTCFs at the +8 hour time point. Dexamethasone was administered intraperitoneally at a dose of 10 mg/kg at 0, 24 and 48 hours. Animals were sacrificed at 72 hours and bronchoalveolar lavage fluid (BALF) was collected from the lungs for flow cytometry staining.
Fig. 22B shows that systemic administration of dCas9-MQ1 reduced leukocyte infiltration in the inflamed lung. Total white blood cell count/mL in bronchoalveolar lavage fluid obtained from dCas9-MQ1 treated mice showed a significant difference compared to LPS+ disease animals.
Fig. 23A shows the composition of infiltrating cells found in bronchoalveolar lavage fluid obtained from inflamed lung in mice. The leukocyte type that makes up the majority of infiltrating cells is neutrophils, followed by B cells, T cells, macrophages and other types of hematopoietic cells.
Figure 23B shows that dCas9-MQ1 reduced the count of neutrophils infiltrating the lung, with a significant difference compared to the LPS+ disease group.
Fig. 24 shows that the reduction of leukocytes in BALF is lung-specific and is not due to a reduction of leukocytes in peripheral blood. The figure illustrates that the effect of treatment with dCas9-MQ1 in reducing white blood cell count in BALF is lung-specific, not a consequence of the mice themselves having a reduced white blood cell population. The hematopoietic cell populations in peripheral blood were similar in all groups.
FIGS. 25A-G show that CXCL1-5, CXCL7 and CXCL15 gene expression is reduced in lung tissue. After treatment of animals with LNP alone or with dCas9-MQ1, lung tissue was processed to examine CXCL gene expression by qPCR. All CXCL genes show down-regulation when treated with dCas9-MQ1. CXCL2 expression was down-regulated the most.
Figure 26 shows that reducing CXCL expression and cell recruitment to the site of inflammation has a beneficial downstream effect of reducing the presence of other cytokines.
The levels of secreted chemokine proteins in BALF showed reduced CXCL1 and CXCL2 protein levels. Reducing CXCL expression and cell recruitment to sites of inflammation has the beneficial downstream effect of reducing the presence of GM-CSF (fig. 26C) and IL-6 (fig. 26D).
Detailed Description
The present disclosure provides techniques for modulating (e.g., reducing) expression of a target plurality of genes in a cell, e.g., a cell of a subject or patient, by using a site-specific breaker or a system comprising two or more site-specific breakers. In some embodiments, the site-specific breaker comprises a targeting moiety. In some embodiments, the site-specific breaker comprises a targeting moiety and an effector moiety. Without wishing to be bound by theory, many diseases and disorders are associated with a group of genes having related functions, which are associated with a common genomic complex (e.g., ASMC). Modulation (e.g., disruption) of a genomic complex (e.g., ASMC) comprising (in whole or in part) a target plurality of genes may be an improved method of altering (e.g., reducing) expression of the target plurality of genes (e.g., with respect to improved efficiency, conformation, and/or altered stability) as compared to modulating individual target genes. The improvements may translate into corresponding improvements in the treatment of diseases and conditions associated with the target plurality of genes. For example, a plurality of genes may be associated with a pro-inflammatory effect, and a site-specific disruption agent may target a genomic complex (e.g., ASMC) comprising (in whole or in part) the plurality of genes to modulate (e.g., reduce) expression of the plurality of genes, thereby achieving an anti-inflammatory effect (e.g., a better anti-inflammatory effect than targeting a single gene of the plurality of genes alone). Examples of site-specific disrupters, targeting moieties, effector moieties, and target pluralities of genes are provided herein.
A site-specific disruption agent can modulate, e.g., reduce, expression of a target plurality of genes by one or more means. In some embodiments, the site-specific breaker binds to a target site, e.g., an anchor sequence, and physically or sterically competes with other genomic complex components, e.g., a nucleating polypeptide, for binding. Without wishing to be bound by theory, physical or steric blocking of the anchor sequence, e.g., such that binding of a genomic complex component (e.g., a nucleating polypeptide) to the anchor sequence is inhibited (e.g., prevented), is a mechanism by which a site-specific disruption agent may modulate (e.g., reduce) expression of a target plurality of genes. Site-specific disruption agents can destabilize interactions of a genomic complex component (e.g., a nucleating polypeptide) with an anchor sequence, for example, by altering (e.g., reducing) the affinity and/or avidity of the genomic complex component for binding to the anchor sequence. Blocking or destabilizing the binding of the genomic complex component (e.g., the nucleating polypeptide) to the anchor sequence may be achieved by one or more means including: an epigenetic modification of the anchor sequence or a sequence proximal thereto, a genetic modification of the anchor sequence or a sequence proximal thereto, or binding of a site-specific breaker to the anchor sequence or a sequence proximal thereto. Inhibiting (e.g., preventing) binding of a component of a genomic complex (e.g., a nucleating polypeptide) to an anchor sequence can inhibit (e.g., disrupt or prevent formation of) the genomic complex, e.g., ASMC. Inhibition of a genomic complex (e.g., ASMC) comprising all or a portion of a target plurality of genes can modulate, e.g., reduce, expression of the genes of the target plurality of genes. In some embodiments, the site-specific breaker comprises a targeting moiety, a first effector moiety, and a second effector moiety.
In some embodiments, the first effector moiety has a sequence that is different from the sequence of the second effector moiety. In some embodiments, the first effector moiety has a sequence that is identical to a sequence of the second effector moiety.
The present disclosure further provides, in part, systems comprising two or more site-specific disruption agents, each comprising a targeting moiety and optionally an effector moiety. In some embodiments, the targeting moieties target two or more different sequences (e.g., each site-specific breaker can target a different sequence). In some embodiments, the first site-specific breaker binds to a transcriptional regulatory element (e.g., a promoter or transcription start site (TSS)) that is operably linked to a target plurality of genes (e.g., human CXCL1-8), and the second site-specific breaker binds to an anchor sequence of an anchor sequence-mediated conjunction (ASMC) comprising the target plurality of genes (e.g., human CXCL1-8). In some embodiments, modulation of expression of a target plurality of genes, such as human CXCL1-8, by the system involves binding of a first site-specific breaker and a second site-specific breaker to first and second DNA sequences, respectively. Binding to the first and second DNA sequences localizes the function of the first and second effector moieties to those sites. Without wishing to be bound by theory, in some embodiments, the functions of the effector moieties of both the first and second site-specific breakers are employed to stably repress the expression of a target plurality of genes associated with or comprising the first and/or second DNA sequence, e.g., wherein the first and/or second DNA sequence is or comprises the sequence of the target plurality of genes or one or more operably linked transcription control elements.
Site-specific disruption agent
In some embodiments, the site-specific breaker comprises a targeting moiety. In some embodiments, the targeting moiety specifically binds to a DNA sequence, e.g., an anchor sequence, thereby modulating, e.g., disrupting, a genomic complex (e.g., ASMC) comprising the DNA sequence. In some embodiments, the site-specific breaker comprises a targeting moiety and an effector moiety. In some embodiments, the targeting moiety specifically binds to a DNA sequence, thereby localizing the function of the effector moiety to the DNA sequence, thereby modulating (e.g., disrupting) a genomic complex (e.g., ASMC) comprising the DNA sequence. In some embodiments, the site-specific breaker comprises one targeting moiety and one effector moiety. In some embodiments, the site-specific breaker comprises one targeting moiety and more than one effector moiety, e.g., two, three, four, or five effector moieties, each of which may be the same as or different from another of the more than one effector moieties. In some embodiments, the site-specific breaker can comprise two effector moieties, wherein a first effector moiety comprises a different function than a second effector moiety. For example, the site-specific disruption agent may comprise two effector moieties, wherein a first effector moiety comprises a histone methyltransferase function (e.g., comprising G9A or EZH2 or a functional fragment or variant thereof) and a second effector moiety comprises a transcriptional repression function (e.g., comprising KRAB or a functional fragment or variant thereof). In some embodiments, the site-specific disruption agent comprises effector moieties whose functions are complementary to each other in reducing expression of the target plurality of genes, wherein the functions together inhibit expression, and optionally, do not inhibit expression or negligibly inhibit expression when present alone.
In some embodiments, the site-specific disruption agent comprises a plurality of effector moieties, wherein each effector moiety is complementary to each other effector moiety, each effector moiety reducing expression of a target plurality of genes.
In some embodiments, the site-specific disruption agent comprises a combination of effector moieties whose functions cooperate in reducing expression of the target plurality of genes. Without wishing to be bound by theory, in some embodiments, it is believed that epigenetic modifications to a genomic locus are cumulative, in that multiple repressive epigenetic markers (e.g., multiple different types of epigenetic markers and/or more extensive marking of a given type) together inhibit expression more effectively (e.g., produce a greater reduction in expression and/or a longer lasting reduction in expression) than a single modification alone. In some embodiments, the site-specific disruption agent comprises a plurality of effector moieties, wherein each effector moiety cooperates with each other effector moiety, e.g., each effector moiety reduces expression of a target plurality of genes. In some embodiments, a site-specific breaker comprising a plurality of effector moieties that cooperate with each other inhibits expression of a target plurality of genes more effectively than a site-specific breaker comprising a single effector moiety. In some embodiments, a site-specific breaker comprising the plurality of effector moieties is at least 1.05x (i.e., 1.05 times), 1.1x, 1.15x, 1.2x, 1.25x, 1.3x, 1.35x, 1.4x, 1.45x, 1.5x, 1.55x, 1.6x, 1.65x, 1.7x, 1.75x, 1.8x, 1.85x, 1.9x, 1.95x, 2x, 3x, 4x, 5x, 6x, 7x, 8x, 9x, 10x, 20x, 30x, 40x, 50x, 60x, 70x, 80x, 90x, or 100x more effective than a site-specific breaker comprising a single effector moiety in reducing expression of a target plurality of genes.
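As an illustrative sketch only, one way to express the fold comparison above is as the ratio of the fractional reduction in expression achieved by a multi-effector agent to that achieved by a single-effector agent. The disclosure does not specify how the fold value is calculated; the function names, the choice of fractional reduction as the measure, and the example expression values below are assumptions made for illustration.

```python
def fractional_reduction(control_expr: float, treated_expr: float) -> float:
    """Fraction by which expression falls relative to an untreated control."""
    if control_expr <= 0:
        raise ValueError("control expression must be positive")
    return (control_expr - treated_expr) / control_expr


def fold_effectiveness(control_expr: float,
                       multi_effector_expr: float,
                       single_effector_expr: float) -> float:
    """Ratio of the reduction with multiple effectors to that with one effector."""
    multi = fractional_reduction(control_expr, multi_effector_expr)
    single = fractional_reduction(control_expr, single_effector_expr)
    if single <= 0:
        raise ValueError("single-effector agent produced no reduction; ratio undefined")
    return multi / single


# Hypothetical values: control = 100, multi-effector = 20, single-effector = 60.
# Reductions are 80% and 40%, so the multi-effector agent is about 2x as effective.
print(fold_effectiveness(100.0, 20.0, 60.0))
```

Under this assumed measure, a result at or above a listed value (e.g., 1.05 or 2) would correspond to the "at least 1.05x" or "at least 2x" embodiments.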
In some embodiments, the site-specific breaker comprises one or more targeting moieties, such as a Cas domain, TAL effector domain, or Zn finger domain. In embodiments, when the system comprises two or more targeting moieties of the same type, e.g., two or more Cas domains, the targeting moieties specifically bind to two or more different sequences. For example, in a site-specific breaker system comprising two or more Cas domains, the two or more Cas domains may be selected or altered such that each significantly binds only the gRNAs corresponding to its own target sequence (e.g., without significantly binding gRNAs corresponding to the target of another Cas domain).
In some embodiments, the site-specific breaker comprises a targeting moiety and an effector moiety covalently linked, e.g., by a peptide bond. In some embodiments, the targeting moiety and the effector moiety are located on the same polypeptide chain, e.g., linked by one or more peptide bonds and/or linkers. In some embodiments, the site-specific breaker comprises a fusion molecule, e.g., comprising a targeting moiety and an effector moiety linked by a peptide bond and/or linker. In some embodiments, the site-specific breaker comprises a targeting moiety located N-terminal to the effector moiety on the same polypeptide chain. In some embodiments, the site-specific breaker comprises a targeting moiety located C-terminal to the effector moiety on the same polypeptide chain. In some embodiments, the site-specific breaker comprises a targeting moiety and an effector moiety covalently linked by a non-peptide bond. In some embodiments, the targeting moiety is conjugated to the effector moiety through a non-peptide bond. In some embodiments, the site-specific breaker comprises a targeting moiety and a plurality of effector moieties, wherein the targeting moiety and the plurality of effector moieties are covalently linked, e.g., by peptide bonds (e.g., the targeting moiety and the plurality of effector moieties are all connected through a series of covalent bonds, although each individual moiety need not share a covalent bond with every other moiety).
In other embodiments, the site-specific breaker comprises a targeting moiety and an effector moiety that are non-covalently linked, e.g., they are non-covalently bound to each other. In some embodiments, the site-specific breaker comprises a targeting moiety that is non-covalently bound to an effector moiety or vice versa. In some embodiments, the site-specific breaker comprises a targeting moiety and a plurality of effector moieties, wherein the targeting moiety and at least one effector moiety are not covalently linked, e.g., are non-covalently associated with each other, and wherein the targeting moiety and at least one other effector moiety are covalently linked, e.g., by a peptide bond.
In some embodiments, the site-specific breaker comprises a first effector moiety comprising G9A and a second effector moiety comprising KRAB. In some embodiments, the site-specific breaker comprises a first effector moiety comprising G9A and a second effector moiety comprising EZH 2. In some embodiments, the site-specific breaker comprises a first effector moiety comprising EZH2 and a second effector moiety comprising KRAB.
In some embodiments, the site-specific breaker comprises a targeting moiety and an effector moiety, wherein the C-terminus of the effector moiety (e.g., an effector moiety selected from EZH2 or G9A or a functional variant or fragment thereof) is covalently linked to the N-terminus of the targeting moiety. In some embodiments, the site-specific breaker comprises a targeting moiety and an effector moiety, wherein the N-terminus of the effector moiety (e.g., an effector moiety selected from HDAC8, MQ1, DNMT3a/3L, KRAB, or a functional variant or fragment thereof) is covalently linked to the C-terminus of the targeting moiety. In some embodiments, the site-specific breaker comprises a targeting moiety, a first effector moiety, and a second effector moiety, wherein the C-terminus of the first effector moiety (e.g., an effector moiety selected from EZH2, G9A, or a functional variant or fragment thereof) and the N-terminus of the targeting moiety are covalently linked, and the C-terminus of the targeting moiety and the N-terminus of the second effector moiety (e.g., an effector moiety selected from HDAC8, MQ1, DNMT3a/3L, KRAB, or a functional variant or fragment thereof) are covalently linked. Covalent attachment may be, for example, through a linker sequence.
In some embodiments, the site-specific breaker comprises a targeting moiety, a first effector moiety, and a second effector moiety, wherein the first effector moiety is EZH2 or a functional variant or fragment thereof, e.g., wherein the first effector moiety comprises the amino acid sequence of SEQ ID NO: 17, or a sequence having at least 80%, 85%, 90%, 95%, 99% or 100% identity thereto or no more than 20, 19, 18, 17, 16, 15, 14, 13, 12, 11, 10, 9, 8, 7, 6, 5, 4, 3, 2 or 1 positions different therefrom, and the first effector moiety is N-terminal to the targeting moiety; and the second effector moiety is KRAB or a functional variant or fragment thereof, e.g., wherein the second effector moiety comprises the amino acid sequence of SEQ ID NO: 13, or a sequence having at least 80%, 85%, 90%, 95%, 99% or 100% identity thereto or no more than 20, 19, 18, 17, 16, 15, 14, 13, 12, 11, 10, 9, 8, 7, 6, 5, 4, 3, 2 or 1 positions different therefrom, and the second effector moiety is C-terminal to the targeting moiety.
In some embodiments, the site-specific breaker comprises a targeting moiety, a first effector moiety, and a second effector moiety, wherein the first effector moiety is EZH2 or a functional variant or fragment thereof, e.g., wherein the first effector moiety comprises the amino acid sequence of SEQ ID NO: 17, or a sequence having at least 80%, 85%, 90%, 95%, 99% or 100% identity thereto or no more than 20, 19, 18, 17, 16, 15, 14, 13, 12, 11, 10, 9, 8, 7, 6, 5, 4, 3, 2 or 1 positions different therefrom, and the first effector moiety is N-terminal to the targeting moiety; and the second effector moiety is HDAC8 or a functional variant or fragment thereof, e.g., wherein the second effector moiety comprises the amino acid sequence of SEQ ID NO: 19, or a sequence having at least 80%, 85%, 90%, 95%, 99% or 100% identity thereto or no more than 20, 19, 18, 17, 16, 15, 14, 13, 12, 11, 10, 9, 8, 7, 6, 5, 4, 3, 2 or 1 positions different therefrom, and the second effector moiety is C-terminal to the targeting moiety.
In some embodiments, the site-specific breaker comprises a targeting moiety, a first effector moiety, and a second effector moiety, wherein the first effector moiety is G9A or a functional variant or fragment thereof, e.g., wherein the first effector moiety comprises the amino acid sequence of SEQ ID NO: 67, or a sequence having at least 80%, 85%, 90%, 95%, 99% or 100% identity thereto or no more than 20, 19, 18, 17, 16, 15, 14, 13, 12, 11, 10, 9, 8, 7, 6, 5, 4, 3, 2 or 1 positions different therefrom, and the first effector moiety is N-terminal to the targeting moiety; and the second effector moiety is KRAB or a functional variant or fragment thereof, e.g., wherein the second effector moiety comprises the amino acid sequence of SEQ ID NO: 13, or a sequence having at least 80%, 85%, 90%, 95%, 99% or 100% identity thereto or no more than 20, 19, 18, 17, 16, 15, 14, 13, 12, 11, 10, 9, 8, 7, 6, 5, 4, 3, 2 or 1 positions different therefrom, and the second effector moiety is C-terminal to the targeting moiety.
In some embodiments, the site-specific breaker comprises a targeting moiety, a first effector moiety, and a second effector moiety, wherein the first effector moiety is G9A or a functional variant or fragment thereof, e.g., wherein the first effector moiety comprises the amino acid sequence of SEQ ID NO: 67, or a sequence having at least 80%, 85%, 90%, 95%, 99% or 100% identity thereto or no more than 20, 19, 18, 17, 16, 15, 14, 13, 12, 11, 10, 9, 8, 7, 6, 5, 4, 3, 2 or 1 positions different therefrom, and the first effector moiety is N-terminal to the targeting moiety; and the second effector moiety is EZH2 or a functional variant or fragment thereof, e.g., wherein the second effector moiety comprises the amino acid sequence of SEQ ID NO: 17, or a sequence having at least 80%, 85%, 90%, 95%, 99% or 100% identity thereto or no more than 20, 19, 18, 17, 16, 15, 14, 13, 12, 11, 10, 9, 8, 7, 6, 5, 4, 3, 2 or 1 positions different therefrom, and the second effector moiety is C-terminal to the targeting moiety.
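The two sequence criteria recited above ("at least 80% ... identity" or "no more than 20 ... positions different") can be illustrated with a minimal sketch. It assumes the candidate and reference sequences have already been aligned to equal length (with "-" for gaps) and that identity is computed over the alignment length; this passage does not specify an alignment method or identity definition, so both are assumptions. The function names and the toy sequences are hypothetical and are not any SEQ ID NO of the disclosure.

```python
def count_differences(aligned_a: str, aligned_b: str) -> int:
    """Count positions at which two pre-aligned sequences differ."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("sequences must be pre-aligned to the same length")
    return sum(1 for x, y in zip(aligned_a, aligned_b) if x != y)


def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent of aligned positions that are identical (assumed definition)."""
    if not aligned_a:
        raise ValueError("sequences must be non-empty")
    diffs = count_differences(aligned_a, aligned_b)
    return 100.0 * (len(aligned_a) - diffs) / len(aligned_a)


def meets_criterion(candidate: str, reference: str,
                    min_identity: float = 80.0,
                    max_differences: int = 20) -> bool:
    """True if either the identity threshold or the difference limit is met."""
    return (percent_identity(candidate, reference) >= min_identity
            or count_differences(candidate, reference) <= max_differences)


# Toy, hypothetical sequences (not taken from the sequence listing).
reference = "MKTAYIAKQR"
candidate = "MKTAYLAKQR"  # one substitution relative to the reference
identity = percent_identity(candidate, reference)
differences = count_differences(candidate, reference)
```

In practice the thresholds would be chosen from the recited values (e.g., 80%, 85%, 90%, 95%, 99% or 100% identity), and a real comparison would first require a pairwise alignment step that is outside this sketch.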
In some embodiments, the first effector moiety comprises histone methyltransferase activity and the second effector moiety comprises a different histone methyltransferase activity. In some embodiments, the first effector moiety comprises histone methyltransferase activity and the second effector moiety comprises the same histone methyltransferase activity. In some embodiments, the first effector moiety comprises histone demethylase activity and the second effector moiety comprises histone deacetylase activity. In some embodiments, the first effector moiety comprises histone demethylase activity and the second effector moiety comprises DNA methyltransferase activity. In some embodiments, the first effector moiety comprises histone demethylase activity and the second effector moiety comprises DNA demethylase activity. In some embodiments, the first effector moiety comprises histone demethylase activity and the second effector moiety comprises transcriptional repressor activity. In some embodiments, the first effector moiety comprises histone demethylase activity and the second effector moiety comprises a different histone demethylase activity. In some embodiments, the first effector moiety comprises histone demethylase activity and the second effector moiety comprises the same histone demethylase activity. In some embodiments, the first effector moiety comprises histone deacetylase activity and the second effector moiety comprises DNA methyltransferase activity. In some embodiments, the first effector moiety comprises histone deacetylase activity and the second effector moiety comprises DNA demethylase activity. In some embodiments, the first effector moiety comprises histone deacetylase activity and the second effector moiety comprises transcriptional repressor activity. In some embodiments, the first effector moiety comprises histone deacetylase activity and the second effector moiety comprises a different histone deacetylase activity. 
In some embodiments, the first effector moiety comprises histone deacetylase activity and the second effector moiety comprises the same histone deacetylase activity. In some embodiments, the first effector moiety comprises DNA methyltransferase activity and the second effector moiety comprises DNA demethylase activity. In some embodiments, the first effector moiety comprises DNA methyltransferase activity and the second effector moiety comprises transcriptional repressor activity. In some embodiments, the first effector moiety comprises a DNA methyltransferase activity and the second effector moiety comprises a different DNA methyltransferase activity. In some embodiments, the first effector moiety comprises DNA methyltransferase activity and the second effector moiety comprises the same DNA methyltransferase activity. In some embodiments, the first effector moiety comprises DNA demethylase activity and the second effector moiety comprises transcription repressor activity. In some embodiments, the first effector moiety comprises a DNA demethylase activity and the second effector moiety comprises a different DNA demethylase activity. In some embodiments, the first effector moiety comprises DNA demethylase activity and the second effector moiety comprises the same DNA demethylase activity. In some embodiments, the first effector moiety comprises a transcriptional repressor activity and the second effector moiety comprises a different transcriptional repressor activity. In some embodiments, the first effector moiety comprises a transcriptional repressor activity and the second effector moiety comprises the same transcriptional repressor activity.
In some embodiments, the first effector moiety comprises DNMT3a/3L, MQ1, KRAB, G9A, HDAC, or EZH2, and the second effector moiety comprises DNMT3a/3L, MQ1, KRAB, G9A, HDAC, or EZH2.
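As a simple illustration of the combinations covered by the preceding paragraph, the following sketch enumerates every ordered (first effector, second effector) pair drawn from the six recited effectors. The list name and the choice to treat the pair as ordered are assumptions made for illustration only.

```python
from itertools import product

# Effector names as recited in the text; the variable name is hypothetical.
EFFECTORS = ["DNMT3a/3L", "MQ1", "KRAB", "G9A", "HDAC", "EZH2"]

# Ordered pairs: the first element is the first effector moiety,
# the second element is the second effector moiety.
effector_pairs = list(product(EFFECTORS, repeat=2))

for first, second in effector_pairs:
    print(f"first effector: {first:10s} second effector: {second}")
```

The enumeration includes pairs in which both effector moieties are the same, as well as the specific pairings named earlier (G9A with KRAB, G9A with EZH2, and EZH2 with KRAB).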
Linker
The site-specific breaker may comprise one or more linkers. The linker may link the targeting moiety to the effector moiety, the effector moiety to another effector moiety, or the targeting moiety to another targeting moiety. The linker may be a chemical bond, such as one or more covalent or non-covalent bonds. In some embodiments, the linker is covalent. In some embodiments, the linker is non-covalent. In some embodiments, the linker is a peptide linker. Such linkers may be between 2-30, 5-30, 10-30, 15-30, 20-30, 25-30, 2-25, 5-25, 10-25, 15-25, 20-25, 2-20, 5-20, 10-20, 15-20, 2-15, 5-15, 10-15, 2-10, 5-10, or 2-5 amino acids in length, or greater than or equal to 2, 5, 10, 15, 20, 25, or 30 amino acids in length (and optionally up to 50, 40, 30, 25, 20, 15, 10, or 5 amino acids in length). In some embodiments, a linker may be used to separate the first moiety from the second moiety, e.g., to separate the targeting moiety from the effector moiety. In some embodiments, for example, a linker may be located between the targeting moiety and the effector moiety, e.g., to provide molecular flexibility of the secondary and tertiary structures. In some embodiments, the site-specific breaker can comprise a first effector moiety linked to the targeting moiety through a first linker and a second effector moiety linked to the targeting moiety through a second linker. In some embodiments, the first linker has a sequence that is identical to the sequence of the second linker. In some embodiments, the first linker has a sequence that is different from the sequence of the second linker. In some embodiments, the first effector moiety is N-terminal to the targeting moiety. In some embodiments, the second effector moiety is C-terminal to the targeting moiety. In some embodiments, the C-terminus of the first effector moiety is linked to the N-terminus of the targeting moiety by a first linker and the N-terminus of the second effector moiety is linked to the C-terminus of the targeting moiety by a second linker.
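The arrangement just described (first effector moiety, first linker, targeting moiety, second linker, second effector moiety, read from N-terminus to C-terminus) can be sketched as a simple concatenation of amino acid sequences. This is a minimal sketch under stated assumptions: the class and function names are hypothetical, the sequences are placeholders rather than sequences from the disclosure, and the 2-30 amino acid check uses only one of the several linker length ranges recited above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Part:
    """One component of a fusion polypeptide (hypothetical helper type)."""
    name: str
    sequence: str  # amino acids written N-terminus to C-terminus


def assemble_fusion(first_effector: Part, first_linker: Part, targeting: Part,
                    second_linker: Part, second_effector: Part) -> str:
    """Concatenate the parts in N-to-C order after checking linker lengths."""
    for linker in (first_linker, second_linker):
        # 2-30 residues is one illustrative range among those recited.
        if not 2 <= len(linker.sequence) <= 30:
            raise ValueError(f"{linker.name}: length outside the 2-30 aa range")
    ordered = [first_effector, first_linker, targeting,
               second_linker, second_effector]
    return "".join(part.sequence for part in ordered)


# Placeholder sequences only; none of these is a real effector or Cas sequence.
fusion = assemble_fusion(
    Part("first effector (placeholder)", "MEEEE"),
    Part("first linker (placeholder)", "GGS"),
    Part("targeting moiety (placeholder)", "TTTTT"),
    Part("second linker (placeholder)", "GSG"),
    Part("second effector (placeholder)", "KKKK"),
)
```

Because the sequences are written N-terminus to C-terminus, joining them in this order places the C-terminus of the first effector next to the first linker and the N-terminus of the second effector next to the second linker.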
The linker may comprise a flexible, rigid, and/or cleavable linker as described herein. In some embodiments, the linker includes at least one glycine, alanine, and serine amino acid to provide flexibility. In some embodiments, the linker is a hydrophobic linker, such as one comprising a negatively charged sulfonate group, a polyethylene glycol (PEG) group, or a pyrophosphate diester group. In some embodiments, the linker is cleavable to selectively release a moiety (e.g., a polypeptide) from the modulator, but stable enough to prevent premature cleavage.
In some embodiments, one or more moieties of the site-specific disruption agents described herein are linked to one or more linkers.
As known to those skilled in the art, the most commonly used flexible linkers have sequences consisting primarily of stretches of Gly and Ser residues ("GS" linkers). Flexible linkers may be useful for joining domains/moieties that require some degree of movement or interaction, and may include small, non-polar (e.g., Gly) or polar (e.g., Ser or Thr) amino acids. The incorporation of Ser or Thr can also maintain the stability of the linker in aqueous solution by forming hydrogen bonds with water molecules, thereby reducing unfavorable interactions between the linker and the moieties/domains.
Rigid linkers are useful for maintaining a fixed distance between domains/moieties and maintaining their independent function. Rigid linkers can also be useful when spatial separation of the domains is critical to maintaining stability or biological activity of one or more components in the fusion. The rigid linker may have an alpha-helical structure or a proline-rich sequence (Pro-rich sequence), (XP)n, where X represents any amino acid, preferably Ala, Lys, or Glu.
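The two linker families described above can be illustrated by generating their sequences from a repeat unit. This is a sketch only: the function names are hypothetical, and the default "GGS" repeat unit and repeat counts are assumptions chosen for illustration, since the text specifies only that flexible linkers consist primarily of Gly and Ser and that a Pro-rich rigid linker follows the (XP)n pattern.

```python
AMINO_ACIDS = set("ACDEFGHIKLMNPQRSTVWY")  # the 20 standard one-letter codes


def flexible_gs_linker(unit: str = "GGS", repeats: int = 3) -> str:
    """Build a Gly/Ser ("GS") linker by repeating a Gly/Ser-only unit."""
    if not unit or not set(unit) <= {"G", "S"}:
        raise ValueError("unit must contain only Gly (G) and Ser (S)")
    if repeats < 1:
        raise ValueError("repeats must be at least 1")
    return unit * repeats


def rigid_xp_linker(x: str = "A", n: int = 5) -> str:
    """Build a proline-rich (XP)n linker; X is preferably Ala, Lys, or Glu."""
    if len(x) != 1 or x not in AMINO_ACIDS:
        raise ValueError("x must be a single standard amino acid letter")
    if n < 1:
        raise ValueError("n must be at least 1")
    return (x + "P") * n


flexible = flexible_gs_linker("GGS", 2)
rigid_ala = rigid_xp_linker("A", 3)
rigid_glu = rigid_xp_linker("E", 4)
```

Either output could then be checked against one of the recited linker length ranges before use in a fusion design.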
The cleavable linker may release the free functional domain/moiety in vivo. In some embodiments, the linker may be cleaved under specific conditions (e.g., in the presence of a reducing agent or protease). In vivo cleavable linkers may take advantage of the reversible nature of the disulfide bond. One example includes a thrombin-sensitive sequence (e.g., PRS) between two Cys residues. In vitro thrombin treatment of CPRSC (SEQ ID NO: 243) results in cleavage of the thrombin-sensitive sequence, while the reversible disulfide bond remains intact. Such linkers are known and described, for example, in Chen et al., 2013. Fusion Protein Linkers: Property, Design and Functionality. Adv Drug Deliv Rev 65(10): 1357-1369. In vivo cleavage of the linker in the fusion protein may also be performed by proteases that are expressed in vivo under certain conditions, in specific cells or tissues, or restricted to certain cellular compartments. The specificity of many proteases provides slow cleavage of the linker in a restricted compartment.
Examples of molecules suitable for use in the linkers described herein include negatively charged sulfonate groups; lipids, e.g., poly(--CH2--) hydrocarbon chains, such as polyethylene glycol (PEG) groups, unsaturated variants thereof, hydroxylated variants thereof, and amidated or other N-containing variants thereof; non-carbon linkers; carbohydrate linkers; phosphodiester linkers; or other molecules capable of covalently linking two or more components of a site-specific breaker. Non-covalent linkers may also be included, such as hydrophobic lipid globules to which the polypeptide is attached, for example through a hydrophobic region of the polypeptide or a hydrophobic extension of the polypeptide, such as a series of residues rich in leucine, isoleucine, valine, or possibly also alanine, phenylalanine, or even tyrosine, methionine, glycine, or other hydrophobic residues. The components of the site-specific breaker may be linked using charge-based chemistry, such that a positively charged component of the site-specific breaker is linked to a negative charge of another component.
Nucleic acid
In one aspect, the disclosure provides nucleic acid sequences encoding the site-specific breakers, systems, targeting moieties and/or effector moieties described herein. The skilled artisan knows that the nucleic acid sequence of an RNA is identical to the corresponding DNA sequence, except that thymine (T) is typically replaced by uracil (U). It is to be understood that when a nucleotide sequence is represented by a DNA sequence (e.g., including A, T, G, C), the present disclosure also provides the corresponding RNA sequence (e.g., including A, U, G, C), in which "U" replaces "T". Polynucleotide sequences are described herein using conventional notation: the left-hand end of a single-stranded polynucleotide sequence is the 5' end; the left-hand direction of a double-stranded polynucleotide sequence is referred to as the 5' direction.
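The DNA-to-RNA correspondence described above amounts to replacing each "T" with "U" while leaving the 5'-to-3' order unchanged. A minimal sketch follows; the function name and the example sequence are hypothetical and are not taken from the sequence listing.

```python
def dna_to_rna(dna: str) -> str:
    """Return the RNA sequence corresponding to a DNA sequence (T -> U)."""
    dna = dna.upper()
    invalid = set(dna) - set("ATGC")
    if invalid:
        raise ValueError(f"unexpected characters in DNA sequence: {sorted(invalid)}")
    # Sequence is written 5' to 3'; the order of bases is unchanged.
    return dna.replace("T", "U")


example_rna = dna_to_rna("ATGCGT")
```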
Those of skill in the art will appreciate that, due to the degeneracy of the genetic code, a large number of nucleotide sequences encoding site-specific disrupters comprising a DNA targeting moiety and/or an effector moiety as described herein may be produced, some of which have similarity, e.g., 90%, 95%, 96%, 97%, 98% or 99% identity, to the nucleic acid sequences disclosed herein. For example, the codons AGA, AGG, CGA, CGC, CGG and CGU all encode the amino acid arginine. Thus, at each position of a nucleic acid molecule of the present disclosure where arginine is specified by a codon, the codon can be changed to any of the corresponding codons described above without changing the encoded polypeptide.
In some embodiments, the nucleic acid sequence encoding a site-specific breaker comprising a targeting moiety and/or one or more effector moieties may be part or all of a codon-optimized coding region, optimized for codon usage in a mammal, such as a human. In some embodiments, the nucleic acid sequences encoding the targeting moiety and/or the one or more effector moieties are codon optimized to increase protein expression and/or increase the duration of protein expression. In some embodiments, the level of protein produced from the codon-optimized nucleic acid sequence is at least 1%, at least 2%, at least 5%, at least 10%, at least 15%, at least 20%, at least 30%, at least 40%, or at least 50% higher than the level of the protein when encoded by the non-codon-optimized nucleic acid sequence.
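The arginine example above can be checked with a small sketch: swapping one arginine codon for any of the other five leaves the translated polypeptide unchanged. The codon table here is deliberately partial (only the six arginine codons plus AUG, GCU, and the UAA stop codon), and the function names and the toy RNA sequence are hypothetical.

```python
# The six arginine codons named in the text (RNA alphabet).
ARG_CODONS = ("AGA", "AGG", "CGA", "CGC", "CGG", "CGU")

# Partial codon table, sufficient only for this toy example.
CODON_TABLE = {codon: "R" for codon in ARG_CODONS}
CODON_TABLE.update({"AUG": "M", "GCU": "A", "UAA": "*"})


def translate(rna: str) -> str:
    """Translate an RNA coding sequence using the partial table above."""
    if len(rna) % 3 != 0:
        raise ValueError("sequence length must be a multiple of 3")
    return "".join(CODON_TABLE[rna[i:i + 3]] for i in range(0, len(rna), 3))


def arginine_variants(rna: str, codon_index: int) -> list:
    """Return all sequences obtained by swapping one Arg codon for a synonym."""
    start = 3 * codon_index
    if rna[start:start + 3] not in ARG_CODONS:
        raise ValueError("the selected codon does not encode arginine")
    return [rna[:start] + codon + rna[start + 3:] for codon in ARG_CODONS]


original = "AUGCGUGCUUAA"  # toy sequence: Met-Arg-Ala-stop
variants = arginine_variants(original, 1)
same_protein = all(translate(v) == translate(original) for v in variants)
```

The same idea extends to every other amino acid encoded by more than one codon, which is why many distinct nucleotide sequences can encode a single polypeptide.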
In some embodiments, the systems described herein or methods described herein include the use of a polypeptide comprising one or more (e.g., one) DNA targeting moieties and one or more effector moieties, e.g., wherein the effector moiety is or comprises MQ1, e.g., bacterial MQ1, or a functional variant or fragment thereof. In some embodiments, MQ1 is Spiroplasma monobiae MQ1, e.g., MQ1 from strain ATCC 33825 and/or corresponding to Uniprot ID P15840. In some embodiments, the MQ1 effector moiety is encoded by the nucleotide sequence of SEQ ID NO: 10. In some embodiments, the nucleotide sequences described herein comprise the sequence of SEQ ID NO: 10, or a sequence having at least 80%, 85%, 90%, 95%, 99% or 100% identity thereto, or a sequence that differs therefrom by no more than 20, 19, 18, 17, 16, 15, 14, 13, 12, 11, 10, 9, 8, 7, 6, 5, 4, 3, 2, or 1 positions.
In some embodiments, MQ1 comprises the amino acid sequence of SEQ ID NO. 11. In some embodiments, MQ1 comprises the amino acid sequence of SEQ ID NO. 12. In some embodiments, the effector domains described herein comprise SEQ ID NO 11 or 12 or a sequence having at least 80%, 85%, 90%, 95%, 99% or 100% identity thereto, or NO more than 20, 19, 18, 17, 16, 15, 14, 13, 12, 11, 10, 9, 8, 7, 6, 5, 4, 3, 2, or 1 positions different therefrom.
In some embodiments, MQ1 for use in the site-specific disruption agents described herein is a variant, e.g., comprising one or more mutations relative to wild-type MQ1 (e.g., SEQ ID NO: 11 or SEQ ID NO: 12). In some embodiments, MQ1 variants comprise one or more amino acid substitutions, deletions, or insertions relative to wild-type MQ1. In some embodiments, the MQ1 variant comprises the K297P substitution. In some embodiments, the MQ1 variant comprises an N299C substitution. In some embodiments, the MQ1 variant comprises the E301Y substitution. In some embodiments, MQ1 variants comprise the Q147L substitution (e.g., and have reduced DNA methyltransferase activity relative to wild-type MQ1). In some embodiments, MQ1 variants comprise K297P, N299C and E301Y substitutions (e.g., and have reduced DNA binding affinity relative to wild-type MQ1). In some embodiments, MQ1 variants comprise Q147L, K297P, N299C and E301Y substitutions (e.g., and have reduced DNA methyltransferase activity and DNA binding affinity relative to wild-type MQ1). In some embodiments, the site-specific disruption agent comprises one or more linkers described herein, e.g., linking a moiety/domain to another moiety/domain. In some embodiments, the site-specific disruption agent comprises a targeting moiety that is or comprises a CRISPR/Cas molecule, e.g., comprising a CRISPR/Cas protein, e.g., dCas9 protein. In some embodiments, the site-specific disruption agent is a fusion protein comprising an effector moiety that is or comprises MQ1 and a DNA targeting moiety that is or comprises a CRISPR/Cas molecule, e.g., comprising a CRISPR/Cas protein, e.g., dCas9 protein; for example, dCas9m4. In some embodiments, the site-specific disruption agent comprises additional moieties described herein. In some embodiments, the site-specific disruption agent reduces expression of a target gene or genes (e.g., a target gene or genes described herein). 
In some embodiments, the site-specific disruption agent can be used in a method of modulating (e.g., reducing) gene expression, a method of treating a disorder, or a method of epigenetic modification of a target gene or transcription control element described herein. In some embodiments, the system comprises two or more site-specific disruption agents.
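The substitution notation used for the MQ1 variants above (e.g., K297P: the lysine at position 297 is replaced by proline) can be illustrated with a short sketch that parses such labels and applies them to a sequence with 1-based numbering. The function name is hypothetical, and the sequence is a placeholder built only so that the expected wild-type residues sit at positions 147, 297, 299, and 301; it is not the MQ1 sequence of SEQ ID NO: 11 or 12.

```python
import re


def apply_substitutions(sequence: str, substitutions: list) -> str:
    """Apply substitutions such as 'K297P' (1-based positions) to a sequence."""
    residues = list(sequence)
    for label in substitutions:
        match = re.fullmatch(r"([A-Z])(\d+)([A-Z])", label)
        if match is None:
            raise ValueError(f"cannot parse substitution: {label}")
        wild_type, position, replacement = (
            match.group(1), int(match.group(2)), match.group(3))
        if position < 1 or position > len(residues):
            raise ValueError(f"{label}: position outside the sequence")
        if residues[position - 1] != wild_type:
            raise ValueError(f"{label}: expected {wild_type} at position {position}")
        residues[position - 1] = replacement
    return "".join(residues)


# Placeholder sequence (NOT real MQ1): poly-Ala with the wild-type residues
# named in the text placed at the corresponding positions.
placeholder = ["A"] * 310
for pos, res in ((147, "Q"), (297, "K"), (299, "N"), (301, "E")):
    placeholder[pos - 1] = res
placeholder = "".join(placeholder)

variant = apply_substitutions(placeholder, ["Q147L", "K297P", "N299C", "E301Y"])
```

Checking the wild-type residue before substituting guards against numbering mismatches, for example when a reference sequence includes or omits an initiator methionine.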
In some embodiments, the systems described herein or the methods described herein include the use of a site-specific breaker or polypeptide comprising one or more (e.g., one) targeting moieties and one or more effector moieties, wherein the effector moiety is or comprises a Krueppel-associated box (KRAB) domain of zinc finger protein 10, e.g., according to NP_056209.2 or the protein encoded by NM_015394.5, or a functional variant or fragment thereof. In some embodiments, the KRAB is a synthetic KRAB construct. In some embodiments, the KRAB used in the site-specific disruption agents described herein is a variant, e.g., comprising one or more mutations relative to wild-type KRAB (e.g., according to NP_056209.2 or the protein encoded by NM_015394.5). In some embodiments, the KRAB variant comprises one or more amino acid substitutions, deletions or insertions relative to wild-type KRAB. In some embodiments, the KRAB variant comprises an L37P substitution. In some embodiments, the KRAB comprises the amino acid sequence of SEQ ID NO: 13.
In some embodiments, the KRAB effector moiety is encoded by the nucleotide sequence of SEQ ID NO: 14. In some embodiments, the nucleotide sequences described herein comprise the sequence of SEQ ID NO: 14, or a sequence having at least 80%, 85%, 90%, 95%, 99% or 100% identity thereto, or a sequence that differs therefrom by no more than 20, 19, 18, 17, 16, 15, 14, 13, 12, 11, 10, 9, 8, 7, 6, 5, 4, 3, 2, or 1 positions.
In some embodiments, the KRAB for a polypeptide or site-specific breaker described herein is a variant, e.g., comprises one or more mutations relative to the KRAB sequence of SEQ ID NO: 13. In some embodiments, the KRAB variant comprises one or more amino acid substitutions, deletions or insertions relative to SEQ ID NO. 13.
In some embodiments, the polypeptide or site-specific breaker is a fusion protein comprising an effector moiety that is or comprises KRAB and a targeting moiety, e.g., a CRISPR/Cas protein. In some embodiments, the polypeptide or site-specific breaker comprises additional moieties described herein. In some embodiments, the polypeptide or site-specific disruption agent reduces expression of the target gene or genes. In some embodiments, the polypeptide or site-specific disruption agent can be used in a method of modulating (e.g., reducing) gene expression, a method of treating a disorder, or a method of epigenetic modification of a target gene or genes described herein, e.g., a transcriptional control element.
In some embodiments, the systems described herein or methods described herein include the use of a site-specific breaker or polypeptide comprising one or more (e.g., one) targeting moieties and one or more effector moieties, wherein the effector moiety is or comprises a DNMT3a/3L complex, or a functional variant or fragment thereof. In some embodiments, the DNMT3a/3L complex is a fusion construct. In some embodiments, the DNMT3a/3L complex comprises DNMT3A, e.g., human DNMT3A (e.g., according to NP_072046.2 or the protein encoded by NM_022552.4), or a functional variant or fragment thereof, e.g., aa 679-912 of human DNMT3A, e.g., according to NP_072046.2 or the protein encoded by NM_022552.4. In some embodiments, the DNMT3a/3L complex comprises human DNMT3L (e.g., according to NP_787063.1 or the protein encoded by NM_175867.3), or a functional variant or fragment thereof, e.g., aa 274-386 of human DNMT3L, e.g., according to NP_787063.1 or the protein encoded by NM_175867.3. In some embodiments, DNMT3a/3L comprises the amino acid sequence of SEQ ID NO: 15. In some embodiments, the effector moiety described herein comprises SEQ ID NO: 15, or a sequence having at least 80%, 85%, 90%, 95%, 99% or 100% identity thereto, or a sequence that differs therefrom by no more than 20, 19, 18, 17, 16, 15, 14, 13, 12, 11, 10, 9, 8, 7, 6, 5, 4, 3, 2, or 1 positions.
In some embodiments, DNMT3a/3L is encoded by the nucleotide sequence of SEQ ID NO: 16. In some embodiments, a nucleic acid described herein comprises the sequence of SEQ ID NO: 16, or a sequence having at least 80%, 85%, 90%, 95%, 99% or 100% identity thereto, or a sequence that differs therefrom by no more than 20, 19, 18, 17, 16, 15, 14, 13, 12, 11, 10, 9, 8, 7, 6, 5, 4, 3, 2 or 1 positions.
In some embodiments, the systems described herein or methods described herein include the use of a site-specific breaker or polypeptide comprising one or more (e.g., one) targeting moieties and one or more effector moieties, wherein the effector moiety is or comprises EZH2, e.g., according to NP_004447.2 or NP_001190176.12 or a protein encoded by NM_004456.5 or NM_001203247.2, or a functional variant or fragment thereof. In some embodiments, the EZH2 for the site-specific disruption agents described herein is a variant, e.g., comprises one or more mutations relative to wild-type EZH2 (e.g., according to NP_004447.2 or NP_001190176.12 or a protein encoded by NM_004456.5 or NM_001203247.2). In some embodiments, the EZH2 variant comprises one or more amino acid substitutions, deletions, or insertions relative to wild-type EZH2. In some embodiments, EZH2 comprises the amino acid sequence of SEQ ID NO: 17.
In some embodiments, the EZH2 effector moiety is encoded by the nucleotide sequence of SEQ ID NO. 18. In some embodiments, the nucleotide sequences described herein comprise the sequence of SEQ ID NO. 18 or a sequence having at least 80%, 85%, 90%, 95%, 99% or 100% identity thereto, or a sequence having NO more than 20, 19, 18, 17, 16, 15, 14, 13, 12, 11, 10, 9, 8, 7, 6, 5, 4, 3, 2 or 1 positions difference thereto.
In some embodiments, the EZH2 for the polypeptides or site-specific disruption agents described herein is a variant, e.g., comprises one or more mutations relative to the EZH2 sequence of SEQ ID NO: 17. In some embodiments, the EZH2 variant comprises one or more amino acid substitutions, deletions, or insertions relative to SEQ ID No. 17.
In some embodiments, the polypeptide or site-specific breaker is a fusion protein comprising an effector moiety that is or comprises EZH2 and a targeting moiety. In some embodiments, the polypeptide or site-specific breaker comprises additional moieties described herein. In some embodiments, the polypeptide or site-specific disruption agent reduces expression of the target gene or genes. In some embodiments, the polypeptide or site-specific disruption agent can be used in a method of modulating (e.g., reducing) gene expression, a method of treating a disorder, or a method of epigenetic modification of a target gene or genes described herein, e.g., a transcriptional control element.
In some embodiments, the systems described herein or methods described herein include the use of a site-specific breaker or polypeptide comprising one or more (e.g., one) targeting moieties and one or more effector moieties, wherein the effector moiety is or comprises HDAC8, e.g., according to NP_001159890 or NP_060956.1 or a protein encoded by NM_001166418 or NM_018486.3, or a functional variant or fragment thereof. In some embodiments, HDAC8 comprises the amino acid sequence of SEQ ID NO: 19.
In some embodiments, the HDAC8 effector moiety is encoded by the nucleotide sequence of SEQ ID NO: 66. In some embodiments, the nucleotide sequence described herein comprises the sequence of SEQ ID NO: 66 or a sequence having at least 80%, 85%, 90%, 95%, 99% or 100% identity thereto, or a sequence that differs therefrom by no more than 20, 19, 18, 17, 16, 15, 14, 13, 12, 11, 10, 9, 8, 7, 6, 5, 4, 3, 2 or 1 positions.
In some embodiments, HDAC8 used in the polypeptides or site-specific disruption agents described herein is a variant, e.g., comprises one or more mutations relative to the HDAC8 sequence of SEQ ID NO: 19. In some embodiments, the HDAC8 variant comprises one or more amino acid substitutions, deletions or insertions relative to SEQ ID NO. 19.
In some embodiments, the polypeptide or site-specific breaker is a fusion protein comprising an effector moiety that is or comprises HDAC8 and a targeting moiety. In some embodiments, the polypeptide or site-specific breaker comprises additional moieties described herein. In some embodiments, the polypeptide or site-specific disruption agent reduces expression of the target gene or genes. In some embodiments, the polypeptide or site-specific disruption agent can be used in a method of modulating (e.g., reducing) gene expression, a method of treating a disorder, or a method of epigenetic modification of a target gene or genes described herein, e.g., a transcriptional control element.
In some embodiments, the systems described herein or methods described herein include the use of a site-specific breaker or polypeptide comprising one or more (e.g., one) targeting moieties and one or more effector moieties, wherein the effector moiety is or comprises G9A, e.g., according to NP_001350618.1 or a protein encoded by NM_001363689.1, or a functional variant or fragment thereof, e.g., comprising aa 967-1250 of G9A, e.g., according to NP_001350618.1 or a protein encoded by NM_001363689.1. In some embodiments, G9A comprises the amino acid sequence of SEQ ID NO: 67.
In some embodiments, the G9A effector moiety is encoded by the nucleotide sequence of SEQ ID NO. 68. In some embodiments, the nucleotide sequences described herein comprise the sequence of SEQ ID NO. 68 or a sequence having at least 80%, 85%, 90%, 95%, 99% or 100% identity thereto, or a sequence having NO more than 20, 19, 18, 17, 16, 15, 14, 13, 12, 11, 10, 9, 8, 7, 6, 5, 4, 3, 2 or 1 positions difference thereto.
In some embodiments, the G9A for the polypeptides or site-specific disruption agents described herein is a variant, e.g., comprises one or more mutations relative to the G9A sequence of SEQ ID NO: 67. In some embodiments, the G9A variant comprises one or more amino acid substitutions, deletions or insertions relative to SEQ ID NO. 67.
In some embodiments, the polypeptide or site-specific breaker is a fusion protein comprising an effector moiety that is or contains G9A and a targeting moiety. In some embodiments, the polypeptide or site-specific breaker comprises additional moieties described herein. In some embodiments, the polypeptide or site-specific disruption agent reduces expression of the target gene or genes. In some embodiments, the polypeptide or site-specific disruption agent can be used in a method of modulating (e.g., reducing) gene expression, a method of treating a disorder, or a method of epigenetic modification of a target gene or genes described herein, e.g., a transcriptional control element.
Systems
The systems of the present disclosure may comprise two or more site-specific disruption agents. In some embodiments, the site-specific breaker system comprises 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12 or more site-specific breakers (and optionally no more than 15, 14, 13, 12, 11, 10, 9, 8, 7, 6, 5, 4, 3 or 2). In some embodiments, the system targets two or more different sequences (e.g., a 1st and a 2nd, 3rd, 4th, 5th, 6th, 7th, 8th, 9th, 10th, 11th, 12th and/or additional DNA sequences, and optionally no more than 20th, 19th, 18th, 17th, 16th, 15th, 14th, 13th, 12th, 11th, 10th, 9th, 8th, 6th, 5th, 4th, 3rd or 2nd sequences). In some embodiments, the system comprises a plurality of site-specific disruption agents, wherein each member of the plurality does not detectably bind, e.g., does not bind, another member of the plurality of site-specific disruption agents. In some embodiments, the system comprises a first site-specific breaker and a second site-specific breaker, wherein the first site-specific breaker does not detectably bind, e.g., does not bind, the second site-specific breaker.
In some embodiments, the systems of the present disclosure comprise two or more site-specific disruption agents, wherein the site-specific disruption agents are present together in a composition, pharmaceutical composition, or mixture. In some embodiments, the systems of the present disclosure comprise two or more site-specific disruption agents, wherein one or more site-specific disruption agents are not mixed with at least one other site-specific disruption agent. In some embodiments, the system may comprise a first site-specific breaker and a second site-specific breaker, wherein the presence of the first site-specific breaker in the nucleus of the cell does not overlap with the presence of the second site-specific breaker in the nucleus of the same cell, wherein the system achieves a reduction in expression of the plurality of genes by the non-overlapping presence of the first and second site-specific breakers. In some embodiments, the first site-specific breaker and the second site-specific breaker can act simultaneously or sequentially.
In some embodiments, the site-specific breakers of the system each comprise a different targeting moiety (e.g., the first, second, third, or additional site-specific breakers each comprise targeting moieties that are different from each other). For example, the system can comprise a first site-specific breaker and a second site-specific breaker, wherein the first site-specific breaker comprises a first targeting moiety (e.g., Cas9 domain, TAL effector domain, or Zn-finger domain) and the second site-specific breaker comprises a second targeting moiety (e.g., Cas9 domain, TAL effector domain, or Zn-finger domain) that is different from the first targeting moiety. In some embodiments, different can refer to comprising different types of targeting moieties, e.g., a first targeting moiety comprises a Cas9 domain and a second targeting moiety comprises a Zn-finger domain. In other embodiments, different can refer to comprising different variants of the same type of targeting moiety, e.g., a first targeting moiety comprises a first Cas9 domain (e.g., from a first species) and a second targeting moiety comprises a second Cas9 domain (e.g., from a second species).
In embodiments, when the system comprises two or more targeting moieties of the same type, e.g., two or more Cas9 or Zn finger domains, the targeting moieties specifically bind to two or more different sequences. For example, in a system comprising two or more Cas9 molecules, the two or more Cas9 domains may be selected or altered such that they only significantly bind gRNAs corresponding to their target DNA sequences (e.g., without significantly binding gRNAs corresponding to targets of another Cas9 domain). In another example, in a system comprising two or more effector moieties, the two or more effector moieties may be selected or altered such that they only bind significantly to their target sequences (e.g., do not bind significantly to the target sequence of another effector moiety).
In some embodiments, the system comprises three or more site-specific disruption agents and two or more site-specific disruption agents comprise the same targeting moiety. For example, the system may comprise three site-specific disruption agents, wherein the first and second site-specific disruption agents each comprise a first targeting moiety and the third site-specific disruption agent comprises a second, different targeting moiety. For another example, the system may comprise four site-specific disruption agents, wherein the first and second site-specific disruption agents each comprise a first targeting moiety and the third and fourth site-specific disruption agents each comprise a second, different targeting moiety. For another example, the system may comprise five site-specific disruption agents, wherein the first and second site-specific disruption agents each comprise a first targeting moiety, the third and fourth site-specific disruption agents each comprise a second, different targeting moiety, and the fifth site-specific disruption agent comprises a third, different targeting moiety. As mentioned above, different may refer to comprising different types of targeting moieties or to comprising different variants of the same type of targeting moiety.
In some embodiments, the site-specific disrupters of the system each bind a different DNA sequence (e.g., the first, second, third, or additional site-specific disrupters each bind a DNA sequence that is different from each other). For example, the system may comprise a first site-specific breaker that binds to a first DNA sequence and a second site-specific breaker that binds to a second DNA sequence. In some embodiments involving different DNA sequences, there is at least one position that is different between the DNA sequence to which one site-specific breaker binds and the DNA sequence to which another site-specific breaker binds, or there is at least one position in the DNA sequence to which one site-specific breaker binds that is not present in the DNA sequence to which another site-specific breaker binds.
In some embodiments, the first DNA sequence may be located on a first genomic DNA strand and the second DNA sequence may be located on a second genomic DNA strand. In some embodiments, the first DNA sequence may be located on the same genomic DNA strand as the second DNA sequence.
In some embodiments, the system comprises three or more site-specific disruption agents and two or more site-specific disruption agents bind the same DNA sequence. For example, the system may comprise three site-specific disrupters, wherein the first and second site-specific disrupters each bind to a first DNA sequence and the third site-specific disrupter binds to a second, different DNA sequence. For another example, the system may comprise four site-specific disrupters, wherein the first and second site-specific disrupters each bind to the first DNA sequence and the third and fourth site-specific disrupters each bind to the second DNA sequence. For another example, the system may comprise five site-specific disrupters, wherein the first and second site-specific disrupters each bind to the first DNA sequence, the third and fourth site-specific disrupters each bind to the second DNA sequence, and the fifth site-specific disrupter binds to the third DNA sequence. As described above, different may mean that at least one position differs between the DNA sequence to which one site-specific breaker binds and the DNA sequence to which another site-specific breaker binds, or that at least one position present in the DNA sequence to which one site-specific breaker binds is not present in the DNA sequence to which another site-specific breaker binds.
In some embodiments, the system comprises two or more (e.g., two) site-specific disruption agents, and the plurality of (e.g., two) site-specific disruption agents comprise targeting moieties that bind different DNA sequences. In such embodiments, the first targeting moiety may bind to a first DNA sequence and the second DNA targeting moiety may bind to a second DNA sequence, wherein the first and second DNA sequences are different and do not overlap. In some such embodiments, the first DNA sequence is separated from the second DNA sequence by at least 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 25, 30, 35, 40, 45, 50, 55, 60, 65, 70, 75, 80, 85, 90, 95, or 100 base pairs (optionally, no more than 500, 400, 300, 200, 100, 95, 90, 85, 80, 75, 70, 65, 60, 55, or 50 base pairs). In some such embodiments, the first DNA sequence is not more than 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 25, 30, 35, 40, 45, 50, 55, 60, 65, 70, 75, 80, 85, 90, 95, or 100 base pairs apart from the second DNA sequence (optionally, no base pairs, e.g., the first and second sequences are immediately adjacent to each other).
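By way of non-limiting illustration only, the spacing between two non-overlapping DNA sequences described above can be sketched in Python as follows. The `Site` type, the function names, the 1-based inclusive coordinate convention, and the default limits are assumptions introduced for this example and are not part of the disclosure.

```python
from typing import NamedTuple


class Site(NamedTuple):
    """A binding site on one chromosome, 1-based inclusive (assumed convention)."""
    start: int
    end: int


def gap_bp(a: Site, b: Site) -> int:
    """Number of base pairs strictly between two sites.

    Returns 0 when the sites are immediately adjacent and a negative
    number when they overlap.
    """
    first, second = sorted((a, b))
    return second.start - first.end - 1


def spacing_ok(a: Site, b: Site, min_gap: int = 0, max_gap: int = 500) -> bool:
    """True if the sites do not overlap and the gap lies in [min_gap, max_gap]."""
    gap = gap_bp(a, b)
    return gap >= 0 and min_gap <= gap <= max_gap
```

Under these assumptions, two 20-base-pair sites at positions 100-119 and 120-139 have a gap of 0 (immediately adjacent), while sites at 100-119 and 130-149 are separated by 10 base pairs.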
In some embodiments, the site-specific breakers of the system each comprise a different effector moiety (e.g., the first, second, third, or additional site-specific breakers each comprise effector moieties that are different from each other). For example, the system can comprise a first site-specific breaker and a second site-specific breaker, wherein the first site-specific breaker comprises a first effector moiety and the second site-specific breaker comprises a second effector moiety different from the first effector moiety. In some embodiments, the different effector moieties comprise different types of effector moieties. In other embodiments, different effector moieties comprise different variants of the same type of effector moiety.
Targeting moiety
The targeting moiety may specifically bind to a DNA sequence, such as a DNA sequence associated with a target plurality of genes, such as an anchor sequence of an ASMC comprising the target plurality of genes. Any molecule or compound that specifically binds to a DNA sequence can be used as the targeting moiety. In some embodiments, the targeting moiety comprises a nucleic acid, e.g., comprising a sequence complementary to an anchor sequence (e.g., an anchor sequence of an ASMC comprising a target plurality of genes). In some embodiments, the nucleic acid is an oligonucleotide that sterically blocks the binding of a component of a genomic complex (e.g., a nucleating polypeptide, e.g., CTCF) to the anchor sequence. In some embodiments, the nucleic acid comprises a guide RNA (gRNA), e.g., compatible with a CRISPR/Cas molecule. In some embodiments, the targeting moiety comprises a CRISPR/Cas molecule, TAL effector molecule, Zn finger molecule, TetR domain, meganuclease, peptide nucleic acid (PNA), or nucleic acid molecule.
In some embodiments, the targeting moiety binds to its target sequence with a K_D of less than or equal to 500, 450, 400, 350, 300, 250, 200, 150, 100, 50, 40, 30, 20, 10, 9, 8, 7, 6, 5, 4, 3, 2, 1, 0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2, 0.1, 0.09, 0.08, 0.07, 0.06, 0.05, 0.04, 0.03, 0.02, 0.01, 0.005, 0.002, or 0.001 nM (and optionally a K_D of at least 50, 40, 30, 20, 10, 9, 8, 7, 6, 5, 4, 3, 2, 1, 0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2, 0.1, 0.09, 0.08, 0.07, 0.06, 0.05, 0.04, 0.03, 0.02, 0.01, 0.005, 0.002 or 0.001 nM). In some embodiments, the targeting moiety binds to its target sequence with a K_D of 0.001 nM to 500 nM, e.g., 0.1 nM to 5 nM, e.g., about 0.5 nM. In some embodiments, the targeting moiety binds a non-target sequence with a K_D of at least 500, 600, 700, 800, 900, 1000, 2000, 5000, 10,000, or 100,000 nM (and optionally, does not significantly bind the non-target sequence). In some embodiments, the targeting moiety does not bind to a non-target sequence.
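Without wishing to be bound by theory, the practical meaning of the dissociation constants recited above can be illustrated with a simple one-site equilibrium binding model, in which the fraction of sites occupied is c / (c + K_D) for a free targeting-moiety concentration c. The model, the function names, and the example concentration in this Python sketch are assumptions made for illustration; they are not measurements or methods of the disclosure.

```python
def fraction_bound(conc_nm: float, kd_nm: float) -> float:
    """Fraction of DNA sites occupied under a one-site equilibrium model.

    conc_nm: free concentration of the targeting moiety, in nM (assumed known).
    kd_nm:   dissociation constant for the site, in nM.
    """
    if conc_nm < 0 or kd_nm <= 0:
        raise ValueError("concentration must be >= 0 and K_D must be > 0")
    return conc_nm / (conc_nm + kd_nm)


def specificity_ratio(kd_target_nm: float, kd_nontarget_nm: float) -> float:
    """Fold difference in affinity between target and non-target sequences."""
    if kd_target_nm <= 0 or kd_nontarget_nm <= 0:
        raise ValueError("K_D values must be > 0")
    return kd_nontarget_nm / kd_target_nm
```

For example, under this model a targeting moiety at a hypothetical free concentration of 5 nM with a target K_D of 0.5 nM and a non-target K_D of 500 nM would occupy about 91% of target sites and about 1% of non-target sites, a 1000-fold difference in affinity.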
In some embodiments, the targeting moiety comprises a nucleic acid comprising a sequence selected from Table 7 or a sequence having at least 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identity thereto, or differing therefrom at no more than 1, 2, 3, 4, or 5 positions.
Table 7: Exemplary sequences
Name Sequence SEQ ID NO.
GD-28481 AGCCCCACCTTGTGGTCAGA 21
GD-28482 AGTGCTGCCTTCTGACCACA 22
GD-28483 GCTGCCTTCTGACCACAAGG 23
GD-28484 CCAGTATAAGCCCCACCTTG 24
GD-28485 CTGCCTGTCCCATAAGGAGG 25
GD-28486 GCACTGCCTGTCCCATAAGG 26
GD-28487 GGTCCTCCTCCTTATGGGAC 27
GD-28488 GCCTTGTTTTCGGCTCTAGA 28
GD-28489 GCCATCTAGAGCCGAAAACA 29
GD-29251 CCAATGAAGATGAAACTGGG 30
GD-29252 AACGTGCTTGCCTAAGATTC 31
GD-29253 AGCCCTTAATCATATCTAGT 32
GD-29254 CAGAGCTTAAGACCTGTACT 33
GD-29255 GCCCACCTTGACCTTCACAA 34
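By way of non-limiting illustration only, the percent identity and the number of differing positions relative to a sequence of Table 7 can be computed as in the following Python sketch. The sketch assumes an ungapped, position-by-position comparison of sequences of equal length; alignment-based identity measures that allow insertions or deletions may give different values. The function names and the one-base variant shown are hypothetical.

```python
def mismatches(a: str, b: str) -> int:
    """Count positions at which two equal-length sequences differ."""
    if len(a) != len(b):
        raise ValueError("ungapped comparison requires equal lengths")
    return sum(x != y for x, y in zip(a.upper(), b.upper()))


def percent_identity(a: str, b: str) -> float:
    """Ungapped percent identity between two equal-length sequences."""
    if not a:
        raise ValueError("sequences must not be empty")
    return 100.0 * (len(a) - mismatches(a, b)) / len(a)


GD_28481 = "AGCCCCACCTTGTGGTCAGA"   # SEQ ID NO: 21 from Table 7
VARIANT = "AGCCCCACCTTGTGGTCAGT"    # hypothetical variant, last base changed
```

Here `mismatches(GD_28481, VARIANT)` is 1 and `percent_identity(GD_28481, VARIANT)` is 95.0. For a 20-nucleotide sequence, each differing position lowers the ungapped identity by 5 percentage points, so 5 differing positions correspond to 75% identity.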
In some embodiments, the targeting moiety comprises a nucleic acid comprising a sequence selected from Table 6 or a sequence having at least 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identity thereto, or differing therefrom at no more than 1, 2, 3, 4, or 5 positions.
Table 6: Exemplary guide sequences
In some embodiments, the targeting moiety comprises a nucleic acid comprising a sequence complementary to a sequence of an anchor sequence (e.g., an anchor sequence of an ASMC comprising a target plurality of genes) or having no more than 1, 2, 3, 4, 5, 6, 7, 8, 9, or 10 positions non-complementary thereto.
In some embodiments, the targeting moiety comprises a nucleic acid comprising a sequence that at least partially overlaps with a region having the following genomic coordinates: chr4: chr4: chr4, chr5, or a sequence within 5, 10, 15, 20, 30, 40 or 50 nucleotides of the region, or a sequence having at least 70%, 80%, 90%, 95%, 96%, or greater identity to the sequence of the genomic region.
In some embodiments, the targeting moiety binds to the sequence of the genomic position: chr4:74595464-74595486, chr4:74595457-74595479, chr4:74595460-74595482, chr4:74595472-74595494, chr4:75000088-75000110, chr4:75000091-75000113, chr4:75000085-75000107, chr4:75000157-75000179, chr4:75000156-75000178, chr4:74595215-7459 74528567-74528589, chr4:74595370-74595392, chr4:74595560-74595582, chr4:74595642-74595664, chr4:74595787-74595809, chr4:74528428-74528450, chr4:74528567-74528589, chr5: 74528567-74528589, chr37, chr5: 74528567-74528589, and chr5.
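By way of non-limiting illustration only, genomic positions written in the form chr4:74595464-74595486 can be parsed and tested for overlap, or for lying within a given number of nucleotides of a region, as in the following Python sketch. The `Region` type, the function names, and the treatment of coordinates as 1-based and inclusive are assumptions made for this example; the genome build and coordinate convention are not stated here.

```python
import re
from typing import NamedTuple


class Region(NamedTuple):
    chrom: str
    start: int   # assumed 1-based, inclusive
    end: int     # assumed inclusive


_COORD = re.compile(r"^(chr\w+):(\d+)-(\d+)$")


def parse_region(text: str) -> Region:
    """Parse a string such as 'chr4:74595464-74595486' into a Region."""
    match = _COORD.match(text.strip())
    if match is None:
        raise ValueError(f"not a chrom:start-end coordinate: {text!r}")
    start, end = int(match.group(2)), int(match.group(3))
    if start > end:
        raise ValueError("start must not exceed end")
    return Region(match.group(1), start, end)


def overlaps(a: Region, b: Region, pad: int = 0) -> bool:
    """True if b overlaps a, or lies within `pad` nucleotides of a."""
    return (a.chrom == b.chrom
            and b.start <= a.end + pad
            and b.end >= a.start - pad)
```

For example, the positions chr4:74595464-74595486 and chr4:74595457-74595479 listed above overlap one another, whereas chr4:75000088-75000110 does not fall within 50 nucleotides of either.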
In some embodiments, the targeting moiety binds to the anchor sequence or a site proximal to the anchor sequence, e.g., the anchor sequence as part of an ASMC that fully or partially comprises the target plurality of genes.
In some embodiments, the targeting moiety comprises a CRISPR/Cas molecule. In some embodiments, the effector moiety comprises a CRISPR/Cas molecule. A CRISPR/Cas molecule comprises a protein involved in a clustered regularly interspaced short palindromic repeats (CRISPR) system, e.g., a Cas protein, and optionally a guide RNA, e.g., a single guide RNA (sgRNA).
CRISPR systems are adaptive defense systems originally discovered in bacteria and archaea. CRISPR systems use RNA-guided nucleases termed CRISPR-associated or "Cas" endonucleases (e.g., Cas9 or Cpf1) to cleave foreign DNA. For example, in a typical CRISPR/Cas system, an endonuclease is directed to a target nucleotide sequence (e.g., a site in the genome that is to be sequence-edited) by sequence-specific, non-coding "guide RNAs" that target single- or double-stranded DNA sequences. Three classes (I-III) of CRISPR systems have been identified. Class II CRISPR systems use a single Cas endonuclease (rather than multiple Cas proteins). One class II CRISPR system includes a type II Cas endonuclease such as Cas9, a CRISPR RNA ("crRNA"), and a trans-activating crRNA ("tracrRNA"). The crRNA contains a "guide RNA", i.e., an RNA sequence, typically of about 20 nucleotides, that corresponds to a target DNA sequence. The crRNA also contains a region that binds to the tracrRNA to form a partially double-stranded structure that is cleaved by RNase III, resulting in a crRNA/tracrRNA hybrid. The crRNA/tracrRNA hybrid then directs the Cas9 endonuclease to recognize and cleave the target DNA sequence. The target DNA sequence must generally be adjacent to a "protospacer adjacent motif" ("PAM") that is specific for a given Cas endonuclease; however, PAM sequences occur throughout a given genome. CRISPR endonucleases identified from different prokaryotic species have unique PAM sequence requirements; examples of PAM sequences include 5'-NGG (Streptococcus pyogenes), 5'-NNAGAA (Streptococcus thermophilus CRISPR1), 5'-NGGNG (Streptococcus thermophilus CRISPR3), and 5'-NNNGATT (Neisseria meningitidis). Some endonucleases (e.g., Cas9 endonucleases) are associated with G-rich PAM sites (e.g., 5'-NGG, e.g., TGG, CGG, or AGG) and perform blunt-end cleavage of the target DNA at a position 3 nucleotides upstream (5') of the PAM site.
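By way of non-limiting illustration only, the relationship between a 5'-NGG PAM, the roughly 20-nucleotide sequence immediately 5' of it, and a blunt cut 3 nucleotides upstream of the PAM can be sketched in Python as follows. The sketch scans only the strand it is given and uses a made-up input sequence; the `Hit` type and function name are assumptions, and practical guide-design tools additionally scan the opposite strand and score off-target sites.

```python
import re
from typing import Iterator, NamedTuple


class Hit(NamedTuple):
    protospacer: str   # bases immediately 5' of the PAM
    pam: str           # the matched 5'-NGG motif
    cut_index: int     # blunt cut falls just before this 0-based index


def find_ngg_sites(seq: str, spacer_len: int = 20) -> Iterator[Hit]:
    """Yield each 5'-NGG PAM on the given strand with its upstream protospacer."""
    seq = seq.upper()
    # A lookahead is used so that overlapping PAM candidates are all reported.
    for match in re.finditer(r"(?=([ACGT]GG))", seq):
        pam_start = match.start()
        if pam_start < spacer_len:
            continue  # not enough sequence 5' of the PAM for a full protospacer
        yield Hit(
            protospacer=seq[pam_start - spacer_len:pam_start],
            pam=match.group(1),
            cut_index=pam_start - 3,  # 3 nucleotides upstream (5') of the PAM
        )


# Made-up 26-nucleotide example: 20-nt protospacer, AGG PAM, 3 trailing bases.
EXAMPLE = "ACGTACGTACGTACGTACGT" + "AGG" + "TTT"
```

For this made-up sequence, a single site is found, with the PAM `AGG` starting at index 20 and the cut falling between indices 16 and 17.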
Another class II CRISPR system includes the type V endonuclease Cpf1, which is smaller than Cas9; examples include AsCpf1 (from Acidaminococcus sp.) and LbCpf1 (from Lachnospiraceae sp.). Cpf1-associated CRISPR arrays are processed into mature crRNAs without the need for a tracrRNA; in other words, a Cpf1 system requires only the Cpf1 nuclease and a crRNA to cleave the target DNA sequence. Cpf1 endonucleases are associated with T-rich PAM sites, e.g., 5'-TTN. Cpf1 can also recognize a 5'-CTA PAM motif. Cpf1 cleaves the target DNA by introducing an offset or staggered double-strand break with a 4- or 5-nucleotide 5' overhang, for example, cleaving a target DNA with a 5-nucleotide offset or staggered cut located 18 nucleotides downstream (3') from the PAM site on the coding strand and 23 nucleotides downstream from the PAM site on the complementary strand; the 5-nucleotide overhang that results from such offset cleavage allows more precise genome editing by DNA insertion through homologous recombination than does DNA insertion at blunt-end cleaved DNA. See, e.g., Zetsche et al. (2015) Cell, 163:759-771.
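By way of non-limiting illustration only, the arithmetic of the staggered cut described above (18 nucleotides downstream of the PAM on one strand and 23 nucleotides downstream on the other, leaving a 5-nucleotide overhang) can be written out in Python as follows. The coordinate system, type, and function names are assumptions for this example, and the input sequence is made up.

```python
from typing import NamedTuple


class StaggeredCut(NamedTuple):
    top_cut: int      # cut falls just before this 0-based index, PAM strand
    bottom_cut: int   # same coordinates, complementary strand
    overhang: int     # length of the single-stranded overhang, in nucleotides


def cpf1_cut(pam_start: int, pam_len: int = 3,
             top_offset: int = 18, bottom_offset: int = 23) -> StaggeredCut:
    """Locate both cut positions relative to a PAM starting at pam_start."""
    pam_end = pam_start + pam_len            # index of first base 3' of the PAM
    top = pam_end + top_offset
    bottom = pam_end + bottom_offset
    return StaggeredCut(top, bottom, bottom - top)


def overhang_bases(seq: str, pam_start: int) -> str:
    """Return the bases of the PAM strand lying between the two cut positions."""
    cut = cpf1_cut(pam_start)
    if cut.bottom_cut > len(seq):
        raise ValueError("sequence too short for both cut positions")
    return seq[cut.top_cut:cut.bottom_cut]


# Made-up example: TTA PAM, 18 downstream bases, 5 overhang bases, 4 trailing bases.
EXAMPLE = "TTA" + "ACGTACGTACGTACGTAC" + "GGCAT" + "AAAA"
```

With the PAM at index 0, `cpf1_cut(0)` places the cuts before indices 21 and 26, and `overhang_bases(EXAMPLE, 0)` returns the 5 bases `GGCAT`.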
A variety of CRISPR-associated (Cas) genes or proteins can be used in the technologies provided by the present disclosure, and the choice of Cas protein will depend on the particular conditions of the method. Specific examples of Cas proteins include those of class II systems, including Cas1, Cas2, Cas3, Cas4, Cas5, Cas6, Cas7, Cas8, Cas9, Cas10, Cpf1, C2c1, or C2c3. In some embodiments, the Cas protein (e.g., Cas9 protein) may be from any of a variety of prokaryotic species. In some embodiments, a particular Cas protein (e.g., a particular Cas9 protein) is selected to recognize a particular protospacer adjacent motif (PAM) sequence. In some embodiments, the targeting moiety comprises a sequence-targeting polypeptide, such as a Cas protein, e.g., Cas9. In certain embodiments, the Cas protein (e.g., Cas9 protein) may be obtained from a bacterium or an archaeon or synthesized using known methods. In certain embodiments, the Cas protein may be from a gram-positive bacterium or a gram-negative bacterium. In certain embodiments, the Cas protein may be from Streptococcus (e.g., Streptococcus pyogenes or Streptococcus thermophilus), Francisella (e.g., Francisella novicida), Staphylococcus (e.g., Staphylococcus aureus), Acidaminococcus (e.g., Acidaminococcus sp. BV3L6), Neisseria (e.g., Neisseria meningitidis), Cryptococcus, Corynebacterium, Haemophilus, Eubacterium, Pasteurella, Prevotella, Veillonella, or Marinobacter.
In some embodiments, the Cas protein requires the presence of a protospacer adjacent motif (PAM) in or adjacent to the target DNA sequence in order for the Cas protein to bind and/or function. In some embodiments, the PAM is or comprises, from 5' to 3', NGG, YG, NNGRRT, NNNRRT, NGA, TYCV, TATV, NTTN, or NNNGATT, wherein N represents any nucleotide, Y represents C or T, R represents A or G, and V represents A, C, or G. In some embodiments, the Cas protein is a protein listed in Table 1. In some embodiments, the Cas protein comprises one or more mutations that alter its PAM. In some embodiments, the Cas protein comprises the E1369R, E1449H and R1556A mutations or similar substitutions of amino acids corresponding to the positions. In some embodiments, the Cas protein comprises the E782K, N968K and R1015H mutations or similar substitutions of amino acids corresponding to the positions. In some embodiments, the Cas protein comprises the D1135V, R1335Q and T1337R mutations or similar substitutions of amino acids corresponding to the positions. In some embodiments, the Cas protein comprises the S542R and K607R mutations or similar substitutions of the amino acids corresponding to the positions. In some embodiments, the Cas protein comprises the S542R, K548V and N552R mutations or similar substitutions of the amino acids corresponding to the positions.
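By way of non-limiting illustration only, the degenerate nucleotide codes used in the PAM motifs above (N, Y, R, and V) can be expanded into a pattern and matched against a candidate sequence, as in the following Python sketch. Only the codes defined in the preceding paragraph are included, and the function names are assumptions for this example.

```python
import re

# Degenerate codes as defined in the text: N = any, Y = C/T, R = A/G, V = A/C/G.
DEGENERATE = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "N": "[ACGT]", "Y": "[CT]", "R": "[AG]", "V": "[ACG]",
}


def pam_regex(pam: str) -> re.Pattern:
    """Translate a degenerate PAM motif such as 'NNGRRT' into a compiled regex."""
    try:
        return re.compile("".join(DEGENERATE[code] for code in pam.upper()))
    except KeyError as err:
        raise ValueError(f"unsupported nucleotide code: {err.args[0]}") from None


def matches_pam(pam: str, candidate: str) -> bool:
    """True if the candidate sequence (5' to 3') satisfies the whole PAM motif."""
    return pam_regex(pam).fullmatch(candidate.upper()) is not None
```

For example, `matches_pam("NNGRRT", "ACGAAT")` is true, whereas `matches_pam("TYCV", "TTCT")` is false because V excludes T.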
TABLE 1
In some embodiments, the Cas protein is catalytically active and cleaves one or both strands of the target DNA site. In some embodiments, an alteration, e.g., an insertion or deletion, is introduced after cleavage of the target DNA site, e.g., by the cellular repair machinery.
In some embodiments, the Cas protein is modified to inactivate its nuclease activity, e.g., is a nuclease-deficient Cas9. Whereas wild-type Cas9 produces a double-strand break (DSB) at the specific DNA sequence targeted by the gRNA, a number of CRISPR endonucleases with modified functionality are available, for example: a "nickase" version of Cas9 produces only a single-strand break; a catalytically inactive Cas9 ("dCas9") does not cleave the target DNA. In some embodiments, binding of dCas9 to a DNA sequence can interfere with transcription at that site by steric hindrance. In some embodiments, binding of dCas9 to an anchor sequence may interfere with (e.g., reduce or prevent) formation and/or maintenance of a genomic complex (e.g., an ASMC). In some embodiments, the targeting moiety comprises a catalytically inactive Cas9, e.g., dCas9, e.g., dCas9m4. Many catalytically inactive Cas9 proteins are known in the art. In some embodiments, dCas9 comprises a mutation in each endonuclease domain of the Cas protein, e.g., the D10A and H840A mutations.
In some embodiments, the catalytically inactive Cas9 protein, e.g., dCas9, comprises a D11A mutation or similar substitution of the amino acid corresponding to the position. In some embodiments, the catalytically inactive Cas9 protein, e.g., dCas9, comprises an H969A mutation or similar substitution of the amino acid corresponding to the position. In some embodiments, the catalytically inactive Cas9 protein, e.g., dCas9, comprises an N995A mutation or similar substitution of the amino acid corresponding to the position. In some embodiments, the catalytically inactive Cas9 protein, e.g., dCas9, comprises D11A, H969A and N995A mutations or similar substitutions of amino acids corresponding to the positions.
In some embodiments, the catalytically inactive Cas9 protein, e.g., dCas9, comprises a D10A mutation or similar substitution of the amino acid corresponding to the position. In some embodiments, the catalytically inactive Cas9 protein, e.g., dCas9, comprises an H557A mutation or similar substitution of the amino acid corresponding to the position. In some embodiments, the catalytically inactive Cas9 protein, e.g., dCas9, comprises D10A and H557A mutations or similar substitutions of amino acids corresponding to the positions.
In some embodiments, the catalytically inactive Cas9 protein, e.g., dCas9, comprises a D839A mutation or similar substitution of the amino acid corresponding to the position. In some embodiments, the catalytically inactive Cas9 protein, e.g., dCas9, comprises the H840A mutation or similar substitution of the amino acid corresponding to the position. In some embodiments, the catalytically inactive Cas9 protein, e.g., dCas9, comprises an N863A mutation or similar substitution of the amino acid corresponding to the position. In some embodiments, the catalytically inactive Cas9 protein, e.g., dCas9, comprises D10A, D839A, H840A and N863A mutations or similar substitutions of amino acids corresponding to the positions.
In some embodiments, the catalytically inactive Cas9 protein, e.g., dCas9, comprises an E993A mutation or similar substitution of the amino acid corresponding to the position.
In some embodiments, the catalytically inactive Cas9 protein, e.g., dCas9, comprises a D917A mutation or a similar substitution of the amino acid corresponding to the position. In some embodiments, the catalytically inactive Cas9 protein, e.g., dCas9, comprises the E1006A mutation or similar substitution of the amino acid corresponding to the position. In some embodiments, the catalytically inactive Cas9 protein, e.g., dCas9, comprises a D1255A mutation or a similar substitution of the amino acid corresponding to the position. In some embodiments, the catalytically inactive Cas9 protein, e.g., dCas9, comprises D917A, E1006A and D1255A mutations or similar substitutions of amino acids corresponding to the positions.
In some embodiments, the catalytically inactive Cas9 protein, e.g., dCas9, comprises a D16A mutation or similar substitution of the amino acid corresponding to the position. In some embodiments, the catalytically inactive Cas9 protein, e.g., dCas9, comprises the D587A mutation or similar substitution of the amino acid corresponding to the position. In some embodiments, the catalytically inactive Cas9 protein, e.g., dCas9, comprises the H588A mutation or similar substitution of the amino acid corresponding to the position. In some embodiments, the catalytically inactive Cas9 protein, e.g., dCas9, comprises an N611A mutation or similar substitution of the amino acid corresponding to the position. In some embodiments, the catalytically inactive Cas9 protein, e.g., dCas9, comprises D16A, D587A, H588A and N611A mutations or similar substitutions of amino acids corresponding to the positions.
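By way of non-limiting illustration only, substitution labels of the form used above (wild-type residue, position, replacement residue, e.g., D10A) can be parsed and applied to an amino acid sequence as in the following Python sketch. The 13-residue sequence and the H13A label are made up for the example and do not correspond to any Cas protein; the type and function names are likewise assumptions.

```python
import re
from typing import Iterable, NamedTuple


class Substitution(NamedTuple):
    wild_type: str   # residue expected in the unmodified protein
    position: int    # 1-based residue number
    mutant: str      # replacement residue


def parse_substitution(label: str) -> Substitution:
    """Parse a label such as 'D10A' into its three components."""
    match = re.fullmatch(r"([A-Z])(\d+)([A-Z])", label.strip())
    if match is None:
        raise ValueError(f"not a substitution label: {label!r}")
    return Substitution(match.group(1), int(match.group(2)), match.group(3))


def apply_substitutions(protein: str, labels: Iterable[str]) -> str:
    """Apply each substitution, checking the wild-type residue first."""
    residues = list(protein)
    for label in labels:
        sub = parse_substitution(label)
        if not 1 <= sub.position <= len(residues):
            raise ValueError(f"{label}: position outside the sequence")
        if residues[sub.position - 1] != sub.wild_type:
            raise ValueError(f"{label}: residue at that position is "
                             f"{residues[sub.position - 1]}, not {sub.wild_type}")
        residues[sub.position - 1] = sub.mutant
    return "".join(residues)


TOY_PROTEIN = "MKTAYIAKQDGSH"   # made-up sequence: D at position 10, H at position 13
```

Checking the wild-type residue before substituting reflects the point made above that the same functional substitution falls at different residue numbers in different Cas9 proteins, so a label is only meaningful relative to a stated reference sequence.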
In some aspects, the systems or methods described herein include the use of a site-specific breaker or polypeptide comprising one or more (e.g., one) targeting moieties and one or more effector moieties (e.g., one or two effector moieties), wherein the one or more targeting moieties are or comprise a CRISPR/Cas molecule comprising a Cas protein, e.g., a catalytically inactive Cas9 protein, e.g., sadCas9, dCas9, e.g., dCas9m4, or a functional variant or fragment thereof. In some embodiments, dCas9 comprises the amino acid sequence of SEQ ID NO: 5, 6, or 7.
In some embodiments, dCas9 is encoded by the nucleic acid sequence of SEQ ID NO: 8 or 9.
Guide RNA (gRNA)
In some embodiments, the targeting moiety can comprise a Cas molecule that comprises or is linked (e.g., covalently linked) to a gRNA. A gRNA is a short synthetic RNA composed of a "scaffold" sequence necessary for Cas protein binding and a user-defined targeting sequence of about 20 nucleotides directed to a genomic target. In practice, guide RNA sequences are generally designed to have a length of 17-24 nucleotides (e.g., 19, 20, or 21 nucleotides) and to be complementary to the target nucleic acid sequence. In some embodiments, the gRNA comprises 3-6 flanking phosphorothioate (PS) linkages, e.g., 3 flanking PS linkages at each end. Custom gRNA generators and algorithms are commercially available for use in designing effective guide RNAs. Gene editing can also be achieved using a chimeric "single guide RNA" ("sgRNA"), an engineered (synthetic) single RNA molecule that mimics a naturally occurring crRNA-tracrRNA complex and contains both a tracrRNA (for binding the nuclease) and at least one crRNA (to guide the nuclease to the sequence targeted for editing). Chemically modified sgRNAs have also been demonstrated to be effective for use with Cas proteins; see, e.g., Hendel et al. (2015) Nature Biotechnol., 985-991.
In some embodiments, the gRNA comprises a nucleic acid sequence complementary to a DNA sequence associated with a target gene. In some embodiments, the DNA sequence is, comprises, or overlaps an expression control element operably linked to a target gene. In some embodiments, the gRNA comprises a nucleic acid sequence that is at least 90%, 95%, 99%, or 100% complementary to a DNA sequence associated with the target gene. In some embodiments, the gRNA used with the targeting moiety comprising a Cas molecule is an sgRNA. In some embodiments, the gRNA comprises a nucleic acid sequence that comprises, or has at least 70%, 75%, 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identity to, a sequence selected from Table 4, Table 5, Table 6, or Table 7, or that differs therefrom at no more than 1, 2, 3, 4, or 5 positions.
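By way of non-limiting illustration only, the length range and percent complementarity discussed above can be checked as in the following Python sketch. The sketch assumes ungapped Watson-Crick pairing between an RNA targeting sequence and a DNA strand of equal length, both written 5' to 3'; the function names are assumptions, and the Table 7 sequence of SEQ ID NO: 21 is used purely as an example string.

```python
_DNA_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(dna: str) -> str:
    """Reverse complement of a DNA sequence written 5' to 3'."""
    return dna.upper().translate(_DNA_COMPLEMENT)[::-1]


def percent_complementarity(guide_rna: str, target_strand: str) -> float:
    """Percent of guide positions that pair with the given DNA strand.

    Both sequences are written 5' to 3'; U in the guide is treated as T.
    """
    guide_as_dna = guide_rna.upper().replace("U", "T")
    if not guide_as_dna or len(guide_as_dna) != len(target_strand):
        raise ValueError("ungapped comparison requires equal, non-zero lengths")
    perfect_partner = reverse_complement(target_strand)
    paired = sum(g == p for g, p in zip(guide_as_dna, perfect_partner))
    return 100.0 * paired / len(guide_as_dna)


def has_typical_length(guide_rna: str, shortest: int = 17, longest: int = 24) -> bool:
    """True if the targeting sequence falls in the 17-24 nucleotide range."""
    return shortest <= len(guide_rna) <= longest


SPACER = "AGCCCCACCTTGTGGTCAGA".replace("T", "U")    # RNA form of an example 20-mer
STRAND = reverse_complement("AGCCCCACCTTGTGGTCAGA")  # DNA strand it would pair with
```

Here `percent_complementarity(SPACER, STRAND)` is 100.0, and changing a single base of the 20-nucleotide targeting sequence lowers the value to 95.0.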
In some embodiments, the gRNA used with the CRISPR/Cas molecule specifically binds to a target sequence associated with β-2-microglobulin expression. In some embodiments, the gRNA used with the CRISPR/Cas molecule specifically binds to a target sequence associated with expression of one or more CXCL1-8 genes. Such a gRNA may comprise a target-binding sequence selected from SEQ ID NOs: 20-62.
In some embodiments, the targeting moiety is or comprises a TAL effector molecule. A TAL effector molecule, e.g., a TAL effector molecule that specifically binds a DNA sequence, comprises a plurality of TAL effector domains or fragments thereof, and optionally one or more additional portions of a naturally occurring TAL effector (e.g., N- and/or C-terminal of the plurality of TAL effector domains). Many TAL effectors are known to those skilled in the art and are commercially available, for example from Thermo Fisher Scientific.
TALEs are natural effector proteins secreted by a variety of bacterial pathogens, including the plant pathogen Xanthomonas, which regulate gene expression in host plants and promote bacterial colonization and survival. Specific binding of TAL effectors is based on a central repeat domain (the repeat variable diresidue, or RVD, domain) of tandemly arranged, nearly identical repeats of typically 33 or 34 amino acids.
Members of the TAL effector family differ primarily in the number and order of their repeat sequences. The number of repeat sequences ranges from 1.5 to 33.5 repeats, and the C-terminal repeat is typically short in length (e.g., about 20 amino acids), and is commonly referred to as a "half-repeat". Each repeat of TAL effectors has a one repeat to one base pair correlation, where different repeat types exhibit different base pair specificities (one repeat recognizes one base pair on the target gene sequence). In general, the fewer the number of repeat sequences, the weaker the protein-DNA interactions. The number of 6.5 repeats has been shown to be sufficient to activate transcription of the reporter gene (Scholze et al, 2010).
Repeat-to-repeat variation occurs predominantly at amino acid positions 12 and 13, which are therefore termed "hypervariable" and which are responsible for the specificity of the interaction with the target DNA promoter sequence, as shown in Table 2, which lists exemplary repeat variable diresidues (RVDs) and their correspondence to nucleobase targets.
TABLE 2 RVD and nucleobase specificity
Thus, it is possible to modify the repeats of a TAL effector to target a specific DNA sequence. Further studies have shown that the RVD NK can target G. The target site of a TAL effector also tends to include a T flanking the 5' base targeted by the first repeat, but the exact mechanism of this recognition is not yet known. More than 113 TAL effector sequences are known to date. Non-limiting examples of TAL effectors from Xanthomonas include Hax2, Hax3, Hax4, AvrXa7, AvrXa10, and AvrBs3.
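By way of non-limiting illustration only, the one-repeat-to-one-base-pair correspondence can be used to write out a series of RVDs for a desired target site, as in the following Python sketch. Because the contents of Table 2 are not reproduced here, the mapping below is an assumption based on the commonly cited RVD code (NI for A, HD for C, NG for T, and NN for G, with NN also reported to recognize A); NK is offered as an alternative for G as noted above. The function name and input convention (a site written 5' to 3' that begins with the flanking T) are likewise assumptions.

```python
from typing import List

# Assumed RVD code; Table 2 of the source is not reproduced in this text.
RVD_FOR_BASE = {"A": "NI", "C": "HD", "T": "NG", "G": "NN"}


def design_rvds(site: str, g_rvd: str = "NN") -> List[str]:
    """Return one RVD per base of the site, excluding the 5' flanking T.

    site:  target site written 5' to 3', beginning with the flanking T.
    g_rvd: RVD used for G, either 'NN' or 'NK'.
    """
    site = site.upper()
    if not site.startswith("T"):
        raise ValueError("site is expected to begin with the 5' flanking T")
    if g_rvd not in ("NN", "NK"):
        raise ValueError("g_rvd must be 'NN' or 'NK'")
    code = dict(RVD_FOR_BASE, G=g_rvd)
    try:
        return [code[base] for base in site[1:]]
    except KeyError as err:
        raise ValueError(f"unsupported base: {err.args[0]}") from None
```

For example, `design_rvds("TACGT")` returns the four RVDs NI, HD, NN, and NG, one for each base following the flanking T.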
Accordingly, the TAL effector domains of the TAL effector molecules of the present disclosure may be derived from a TAL effector from any bacterial species, e.g., a Xanthomonas species, such as an African strain of Xanthomonas oryzae pv. oryzae (Yu et al. 2011), Xanthomonas campestris pv. raphani strain 756C, and Xanthomonas oryzae pv. oryzicola strain BLS256 (Bogdanove et al. 2011). As used herein, a TAL effector domain according to the present disclosure comprises an RVD domain, as well as one or more flanking sequences (sequences on the N-terminal and/or C-terminal side of the RVD domain) also from the naturally occurring TAL effector. It may comprise more or fewer repeats than the RVD domain of the naturally occurring TAL effector. TAL effector molecules of the present disclosure are designed to target a given DNA sequence based on the above-described code and other codes known in the art. The number of TAL effector domains (e.g., repeats (monomers or modules)) and their specific sequences are selected based on the desired DNA target sequence. For example, TAL effector domains, e.g., repeats, may be removed or added to suit a particular target sequence. In one embodiment, a TAL effector molecule of the disclosure comprises 6.5 to 33.5 TAL effector domains, e.g., repeats. In one embodiment, a TAL effector molecule of the disclosure comprises 8 to 33.5 TAL effector domains, e.g., repeats, e.g., 10 to 25 TAL effector domains, e.g., repeats, e.g., 10 to 14 TAL effector domains, e.g., repeats.
In some embodiments, the TAL effector molecule comprises TAL effector domains that correspond to a perfect match to the DNA target sequence. In some embodiments, a mismatch between a repeat and its target base pair on the DNA target sequence is permitted as long as it still allows the function of the site-specific breaker comprising the TAL effector molecule. In general, TALE binding is inversely correlated with the number of mismatches. In some embodiments, the TAL effector molecule of the site-specific breaker of the present disclosure comprises no more than 7 mismatches, 6 mismatches, 5 mismatches, 4 mismatches, 3 mismatches, 2 mismatches, or 1 mismatch with the target DNA sequence, and optionally no mismatches. Without wishing to be bound by theory, in general, the smaller the number of TAL effector domains in a TAL effector molecule, the fewer mismatches will be tolerated while still allowing the function of the site-specific breaker comprising the TAL effector molecule. Binding affinity is believed to depend on the sum of matching repeat-DNA combinations. For example, TAL effector molecules having 25 or more TAL effector domains may be able to tolerate up to 7 mismatches.
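The mismatch counting described above can be illustrated with a minimal Python sketch. It assumes the commonly cited RVD-to-base preferences (NI for A, HD for C, NG for T, NN for G or A, NK for G); of these, only the NK pairing is stated in this section. The names `RVD_CODE`, `count_mismatches`, and `within_tolerance` are hypothetical, and the default limit of 7 simply mirrors the upper bound mentioned in the text.

```python
# Commonly cited RVD-to-base preferences (an assumption for illustration;
# this section itself only states that NK can target G).
RVD_CODE = {
    "NI": {"A"},
    "HD": {"C"},
    "NG": {"T"},
    "NN": {"G", "A"},
    "NK": {"G"},
}


def count_mismatches(rvds, target):
    """Count repeats whose RVD does not match the aligned target base.

    rvds   -- list of two-letter RVD strings, one per repeat
    target -- DNA string of the same length (the bases bound by the repeats,
              excluding the 5' flanking T)
    An RVD that is absent from RVD_CODE is counted as a mismatch.
    """
    if len(rvds) != len(target):
        raise ValueError("one RVD is required per target base")
    return sum(
        base.upper() not in RVD_CODE.get(rvd, set())
        for rvd, base in zip(rvds, target)
    )


def within_tolerance(rvds, target, max_mismatches=7):
    """Return True if the mismatch count does not exceed max_mismatches."""
    return count_mismatches(rvds, target) <= max_mismatches
```

In practice, the acceptable number of mismatches would be chosen according to the number of repeats, since shorter arrays are expected to tolerate fewer mismatches.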
In addition to TAL effector domains, TAL effector molecules of the present disclosure may comprise additional sequences derived from naturally occurring TAL effectors. The length of one or more C-terminal and/or N-terminal sequences contained on each side of the TAL effector domain portion of the TAL effector molecule may vary and is selected by one of skill in the art, for example based on the study of Zhang et al (2011). Zhang et al have characterized many C-terminal and N-terminal truncation mutants in TAL effector-based proteins of Hax3 origin and have identified key elements that contribute to optimal binding to the target sequence and thus activate transcription. In general, transcriptional activity was found to be inversely related to the length of the N-terminus. With respect to the C-terminus, important elements of DNA binding residues within the first 68 amino acids of the Hax3 sequence were identified. Thus, in some embodiments, the first 68 amino acids on the C-terminal side of the TAL effector domain of a naturally occurring TAL effector are included in the TAL effector molecules of the site-specific breaker of the present disclosure. Thus, in one embodiment, a TAL effector molecule of the present disclosure comprises 1) one or more TAL effector domains derived from a naturally occurring TAL effector; 2) At least 70, 80, 90, 100, 110, 120, 130, 140, 150, 170, 180, 190, 200, 220, 230, 240, 250, 260, 270, 280 or more amino acids from a naturally occurring TAL effector on the N-terminal side of the TAL effector domain; and/or 3) at least 68, 80, 90, 100, 110, 120, 130, 140, 150, 170, 180, 190, 200, 220, 230, 240, 250, 260 or more amino acids from a naturally occurring TAL effector on the C-terminal side of the TAL effector domain.
In some embodiments, the targeting moiety is or comprises a Zn finger molecule. A Zn finger molecule comprises a Zn finger protein, e.g., a naturally occurring Zn finger protein or an engineered Zn finger protein, or a fragment thereof. Many Zn finger proteins are known to those skilled in the art and are commercially available, for example, from Sigma-Aldrich.
In some embodiments, the Zn finger molecule comprises a non-naturally occurring Zn finger protein engineered to bind to a selected target DNA sequence. See, for example, Beerli et al. (2002) Nature Biotechnol. [Nature Biotechnology] 20:135-141; Pabo et al. (2001) Ann. Rev. Biochem. [Annual Review of Biochemistry] 70:313-340; Isalan et al. (2001) Nature Biotechnol. [Nature Biotechnology] 19:656-660; Segal et al. (2001) Curr. Opin. Biotechnol. [Current Opinion in Biotechnology] 12:632-637; Choo et al. (2000) Curr. Opin. Struct. Biol. [Current Opinion in Structural Biology] 10:411-416; U.S. Pat. Nos. 6,453,242; 6,534,261; 6,599,692; 6,503,717; 6,689,558; 7,030,215; 6,794,136; 7,067,317; 7,262,054; 7,070,934; 7,361,635; 7,253,273; and U.S. Patent Publication Nos. 2005/0064474; 2007/0218528; 2005/0267061, all of which are incorporated herein by reference in their entirety.
Engineered Zn finger proteins may have a novel binding specificity compared to naturally occurring Zn finger proteins. Engineering methods include, but are not limited to, rational design and various types of selection. Rational design includes, for example, using databases comprising triplet (or quadruplet) nucleotide sequences and individual Zn finger amino acid sequences, in which each triplet or quadruplet nucleotide sequence is associated with one or more amino acid sequences of Zn fingers that bind the particular triplet or quadruplet sequence. See, for example, U.S. Pat. Nos. 6,453,242 and 6,534,261, which are incorporated herein by reference in their entireties.
Exemplary selection methods (including phage display and two-hybrid systems) are disclosed in U.S. Pat. Nos. 5,789,538; 5,925,523; 6,007,988; 6,013,453; 6,410,248; 6,140,466; 6,200,759; and 6,242,568; International Patent Publication Nos. WO 98/37186; WO 98/53057; WO 00/27878; and WO 01/88197; and GB 2,338,237. In addition, enhancement of the binding specificity of zinc finger proteins has been described, for example, in International Patent Publication No. WO 02/077227.
In addition, as disclosed in these and other references, zinc finger domains and/or multi-fingered zinc finger proteins can be linked together using any suitable linker sequence, including, for example, linkers of 5 or more amino acids in length. See also U.S. Pat. Nos. 6,479,626; 6,903,185; and 7,153,949 for exemplary linker sequences of 6 or more amino acids in length. The proteins described herein may include any combination of suitable linkers between the individual zinc fingers of the protein. In addition, enhancement of the binding specificity of zinc finger binding domains has been described, for example, in commonly owned International Patent Publication No. WO 02/077227.
Zn finger proteins and methods for the design and construction of fusion proteins (and polynucleotides encoding them) are known to those of skill in the art and are described in detail in U.S. Pat. Nos. 6,140,081; 5,789,538; 6,453,242; 6,534,261; 5,925,523; 6,007,988; 6,013,453; and 6,200,759; and International Patent Publication Nos. WO 95/19431; WO 96/06166; WO 98/53057; WO 98/54311; WO 00/27878; WO 01/60970; WO 01/88197; WO 02/099084; WO 98/53058; WO 98/53059; WO 98/53060; WO 02/016536; and WO 03/016496.
In addition, as disclosed in these and other references, Zn finger proteins and/or multi-fingered Zn finger proteins can be linked together, e.g., as a fusion protein, using any suitable linker sequence, including, for example, linkers of 5 or more amino acids in length. See also U.S. Pat. Nos. 6,479,626; 6,903,185; and 7,153,949 for exemplary linker sequences of 6 or more amino acids in length. The Zn finger molecules described herein can include any combination of suitable linkers between the individual Zn finger proteins and/or multi-fingered Zn finger proteins of the Zn finger molecule.
In certain embodiments, the targeting moiety comprises a Zn finger molecule comprising an engineered zinc finger protein that binds (in a sequence-specific manner) to a target DNA sequence. In some embodiments, the Zn finger molecule comprises one Zn finger protein or fragment thereof. In other embodiments, the Zn finger molecule comprises a plurality of Zn finger proteins (or fragments thereof), e.g., 2, 3, 4, 5, 6, or more Zn finger proteins (and optionally no more than 12, 11, 10, 9, 8, 7, 6, 5, 4, 3, or 2 Zn finger proteins). In some embodiments, the Zn finger molecule comprises at least three Zn finger proteins. In some embodiments, the Zn finger molecule comprises four, five, or six fingers. In some embodiments, the Zn finger molecule comprises 8, 9, 10, 11, or 12 fingers. In some embodiments, a Zn finger molecule comprising three Zn finger proteins recognizes a target DNA sequence comprising 9 or 10 nucleotides. In some embodiments, a Zn finger molecule comprising four Zn finger proteins recognizes a target DNA sequence comprising 12 to 14 nucleotides. In some embodiments, a Zn finger molecule comprising six Zn finger proteins recognizes a target DNA sequence comprising 18 to 21 nucleotides.
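The finger counts and target lengths given above can be captured in a small lookup, sketched here in Python. Only the three finger counts for which the text states a target length are included; the names `TARGET_LENGTH_BY_FINGERS`, `target_length_range`, and `fits_target` are hypothetical.

```python
# Target-site lengths (in nucleotides) stated in the text for given
# numbers of Zn finger proteins. Other finger counts are not specified.
TARGET_LENGTH_BY_FINGERS = {
    3: (9, 10),
    4: (12, 14),
    6: (18, 21),
}


def target_length_range(n_fingers):
    """Return the (min, max) target length, or None if the text gives none."""
    return TARGET_LENGTH_BY_FINGERS.get(n_fingers)


def fits_target(n_fingers, target):
    """Check whether a target DNA string has a length stated for n_fingers."""
    length_range = target_length_range(n_fingers)
    if length_range is None:
        raise ValueError("no target length is stated for this finger count")
    shortest, longest = length_range
    return shortest <= len(target) <= longest
```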
In some embodiments, the Zn finger molecule comprises a two-handed Zn finger protein. Two-handed zinc finger proteins are proteins in which two clusters of zinc finger proteins are separated by intervening amino acids such that the two zinc finger domains bind to two discontinuous target DNA sequences. An example of a two-handed zinc finger binding protein is SIP1, in which a cluster of four zinc finger proteins is located at the amino terminus of the protein and a cluster of three Zn finger proteins is located at the carboxy terminus (see Remacle et al. (1999) EMBO Journal 18(18):5073-5084). Each cluster of zinc fingers in these proteins is able to bind to a unique target sequence, and the spacing between the two target sequences may comprise many nucleotides.
In some embodiments, the targeting moiety is or comprises a DNA binding domain from a nuclease. For example, the recognition sequences of homing endonucleases and meganucleases (e.g., I-SceI, I-CeuI, PI-PspI, PI-Sce, I-SceIV, I-CsmI, I-PanI, I-SceII, I-PpoI, I-SceIII, I-CreI, I-TevI, I-TevII, and I-TevIII) are known. See also U.S. Pat. Nos. 5,420,032; 6,833,252; Belfort et al. (1997) Nucleic Acids Res. [Nucleic Acids Research] 25:3379-3388; Dujon et al. (1989) Gene 82:115-118; Perler et al. (1994) Nucleic Acids Res. [Nucleic Acids Research] 22:1125-1127; Jasin (1996) Trends Genet. [Trends in Genetics] 12:224-228; Gimble et al. (1996) J. Mol. Biol. [Journal of Molecular Biology] 263:163-180; Argast et al. (1998) J. Mol. Biol. [Journal of Molecular Biology] 280:345-353; and the New England Biolabs catalog. In addition, the DNA binding specificity of homing endonucleases and meganucleases can be engineered to bind non-natural target sites. See, e.g., Chevalier et al. (2002) Molec. Cell [Molecular Cell] 10:895-905; Epinat et al. (2003) Nucleic Acids Res. [Nucleic Acids Research] 31:2952-2962; Ashworth et al. (2006) Nature 441:656-659; Paques et al. (2007) Current Gene Therapy 7:49-66; U.S. Patent Publication No. 2007/017128.
In some embodiments, the targeting moiety comprises a nucleic acid. In some embodiments, a nucleic acid that may be included in the targeting moiety may be or comprise DNA, RNA, and/or an artificial or synthetic nucleic acid or nucleic acid analog or mimic. For example, in some embodiments, the nucleic acid may be or comprise one or more of the following: genomic DNA (gDNA), complementary DNA (cDNA), peptide nucleic acid (PNA), peptide-oligonucleotide conjugates, locked nucleic acid (LNA), bridged nucleic acid (BNA), polyamides, triplex-forming oligonucleotides, antisense oligonucleotides, tRNA, mRNA, rRNA, miRNA, gRNA, siRNA or other RNAi molecules (e.g., molecules that target a non-coding RNA described herein and/or an expression product of a specific gene associated with a target genomic complex described herein), and the like. In some embodiments, the nucleic acid may comprise one or more residues that are not naturally occurring DNA or RNA residues, may comprise one or more linkages that is/are not a phosphodiester linkage (e.g., that may be phosphorothioate linkages, etc.), and/or may comprise one or more modifications, such as a 2'-O modification, e.g., 2'-OMe. A variety of nucleic acid structures useful for preparing synthetic nucleic acids are known in the art (see, e.g., WO 2017/062862l and WO 2014/012581), and those of skill in the art will appreciate that these can be utilized in accordance with the present disclosure.
Nucleic acids suitable for use in a site-specific breaker (e.g., in a targeting moiety) can include, but are not limited to, DNA, RNA, modified oligonucleotides (e.g., with chemical modifications, such as modifications that alter backbone linkages, sugar molecules, and/or nucleobases), and artificial nucleic acids. In some embodiments, the nucleic acid includes, but is not limited to, genomic DNA, cDNA, peptide nucleic acid (PNA) or peptide-oligonucleotide conjugates, locked nucleic acid (LNA), bridged nucleic acid (BNA), polyamides, triplex-forming oligonucleotides, modified DNA, antisense DNA oligonucleotides, tRNA, mRNA, rRNA, modified RNA, miRNA, gRNA, and siRNA or other RNA or DNA molecules.
In some embodiments, the targeting moiety comprises a nucleic acid having a length of about 15-200, 20-200, 30-200, 40-200, 50-200, 60-200, 70-200, 80-200, 90-200, 100-200, 110-200, 120-200, 130-200, 140-200, 150-200, 160-200, 170-200, 180-200, 190-200, 15-190, 20-190, 30-190, 40-190, 50-190, 60-190, 70-190, 80-190, 90-190, 100-190, 110-190, 120-190, 130-190, 140-190, 150-190, 160-190, 170-190, 180-190, 15-180, 20-180, 30-180, 40-180, 50-180, 60-180, 70-180, 80-180, 90-180, 100-180, 110-180, 120-180, 130-180, 140-180, 150-180, 160-180, 170-180, 15-170, 20-170, 30-170, 40-170, 50-170, 60-170, 70-170, 80-170, 90-170, 100-170, 110-170, 120-170, 130-170, 140-170, 150-170, 160-170, 15-160, 20-160, 30-160, 40-160, 50-160, 60-160, 70-160, 80-160, 90-160, 100-160, 110-160, 120-160, 130-160, 140-160, 150-160, 15-150, 20-150, 30-150, 40-150, 50-150, 60-150, 70-150, 80-150, 90-150, 100-150, 110-150, 120-150, 130-150, 140-150, 15-140, 20-140, 30-140, 40-140, 50-140, 60-140, 70-140, 80-140, 90-140, 100-140, 110-140, 120-140, 130-140, 15-130, 20-130, 30-130, 40-130, 50-130, 60-130, 70-130, 80-130, 90-130, 100-130, 110-130, 120-130, 15-120, 20-120, 30-120, 40-120, 50-120, 60-120, 70-120, 80-120, 90-120, 100-120, 110-120, 15-110, 20-110, 30-110, 40-110, 50-110, 60-110, 70-110, 80-110, 90-110, 100-110, 15-100, 20-100, 30-100, 40-100, 50-100, 60-100, 70-100, 80-100, 90-100, 15-90, 20-90, 30-90, 40-90, 50-90, 60-90, 70-90, 80-90, 15-80, 20-80, 30-80, 40-80, 50-80, 60-80, 70-80, 15-70, 20-70, 30-70, 40-70, 50-70, 60-70, 15-60, 20-60, 30-60, 40-60, 50-60, 15-50, 20-50, 30-50, 40-50, 15-40, 20-40, 30-40, 15-30, 20-30, or 15-20 nucleotides, or any range therebetween.
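The list of length ranges above follows a regular pattern: every pairing of a lower bound (15, 20, 30, ..., 190) with a larger upper bound (20, 30, ..., 200). As a minimal Python sketch of that pattern (the function names are hypothetical and the code is only an illustration of how the ranges are organized), the ranges can be enumerated as follows.

```python
def enumerate_length_ranges():
    """List (lower, upper) nucleotide-length ranges in the order used in the text."""
    lower_bounds = [15] + list(range(20, 200, 10))  # 15, 20, 30, ..., 190
    upper_bounds = list(range(200, 10, -10))        # 200, 190, ..., 20
    return [
        (lower, upper)
        for upper in upper_bounds
        for lower in lower_bounds
        if lower < upper
    ]


def length_is_covered(n_nucleotides, ranges=None):
    """Return True if a length falls within at least one enumerated range."""
    if ranges is None:
        ranges = enumerate_length_ranges()
    return any(lower <= n_nucleotides <= upper for lower, upper in ranges)
```

Because the text also recites "any range therebetween", the enumerated pairs are representative rather than exhaustive; taken together they span lengths of about 15 to about 200 nucleotides.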
Effector moieties
The site-specific disruption agent of the present disclosure can include one or more effector moieties. An effector moiety has one or more functions that, when used as part of a site-specific breaker described herein, modulate (e.g., reduce) expression of a target plurality of genes in a cell. In some embodiments, the effector moiety physically or sterically blocks the anchor sequence, e.g., such that binding of a genomic complex component (e.g., a nucleation polypeptide) to the anchor sequence is inhibited (e.g., prevented). In some embodiments, the effector moiety destabilizes the interaction of a genomic complex component (e.g., a nucleation polypeptide) with the anchor sequence, e.g., by altering (e.g., reducing) the affinity and/or avidity with which the genomic complex component binds the anchor sequence. For example, the effector moiety may recruit a factor that inhibits or destabilizes the formation of a genomic complex (e.g., an ASMC), or it may inhibit the recruitment of a factor (e.g., a genomic complex component or transcription factor) necessary to form or maintain a genomic complex (e.g., an ASMC). In some embodiments, the effector moiety has an epigenetic modification function, in that it modulates the epigenetic landscape of the anchor sequence or a sequence proximal to the anchor sequence, e.g., by facilitating (e.g., catalyzing) the application or removal of one or more epigenetic modifications to the DNA or to a histone associated with it, to reduce expression of the target plurality of genes. In some embodiments, the effector moiety has a genetic modification function, e.g., it introduces an alteration (e.g., an insertion, deletion, or substitution) into the anchor sequence or a sequence proximal to it.
In some embodiments, the effector moiety comprises a CRISPR/Cas molecule, TAL effector molecule, Zn finger molecule, TetR domain, or meganuclease. In some embodiments, the effector moiety has a genetic modification function, e.g., is a CRISPR/Cas molecule, TAL effector molecule, or Zn finger molecule with endonuclease activity that is capable of making genetic alterations in the methods described herein.
In some embodiments, the effector moiety comprises a histone modification function, such as histone methyltransferase, histone demethylase, or histone deacetylase activity. In some embodiments, the histone methyltransferase function comprises methyltransferase activity that targets H3K9. In some embodiments, the histone methyltransferase function comprises methyltransferase activity that targets H3K56. In some embodiments, the histone methyltransferase function comprises methyltransferase activity that targets H3K27. In some embodiments, the histone methyltransferase or demethylase function transfers one, two, or three methyl groups. In some embodiments, the histone demethylase function comprises demethylase activity that targets H3K4. In some embodiments, the effector moiety comprises a protein selected from the group consisting of: SETDB1, SETDB2, EHMT2 (i.e., G9A), EHMT1 (i.e., GLP), SUV39H1, EZH2, EZH1, SUV39H2, SETD8, SUV420H1, SUV420H2, or a functional variant or fragment of any thereof, e.g., the SET domain of any thereof. In some embodiments, the effector moiety comprises a protein selected from the group consisting of: KDM1A (i.e., LSD1), KDM1B (i.e., LSD2), KDM2A, KDM2B, KDM5A, KDM5B, KDM5C, KDM5D, KDM4B, NO66, or a functional variant or fragment of any thereof. In some embodiments, the effector moiety comprises a protein selected from the group consisting of: HDAC1, HDAC2, HDAC3, HDAC4, HDAC5, HDAC6, HDAC7, HDAC8, HDAC9, HDAC10, HDAC11, SIRT1, SIRT2, SIRT3, SIRT4, SIRT5, SIRT6, SIRT7, SIRT8, SIRT9, or a functional variant or fragment of any thereof.
In some embodiments, the effector moiety comprises a DNA modification function, such as a DNA methyltransferase. In some embodiments, the effector moiety comprises a protein selected from the group consisting of: MQ1, DNMT3A2, DNMT3B1, DNMT3B2, DNMT3B3, DNMT3B4, DNMT3B5, DNMT3B6, DNMT3L, or a functional variant or fragment of any one thereof.
In some embodiments, the effector moiety comprises a transcriptional repressor. In some embodiments, the transcriptional repressor blocks recruitment of a factor that stimulates or promotes transcription (e.g., transcription of a target gene). In some embodiments, the transcriptional repressor recruits a factor that inhibits transcription (e.g., transcription of a target gene). In some embodiments, the effector moiety, e.g., the transcriptional repressor, is or comprises a protein selected from the group consisting of: KRAB, MeCP2, HP1, RBBP4, REST, FOG1, SUZ12, or a functional variant or fragment of any thereof.
In some embodiments, the effector moiety comprises a protein having the functions described herein. In some embodiments, the effector moiety comprises a protein selected from the group consisting of:
KRAB (e.g., according to NP_056209.2 or the protein encoded by NM_015394.5);
a SET domain (e.g., the SET domain of:
SETDB1 (e.g., according to NP_001353347.1 or the protein encoded by NM_001366418.1);
EZH2 (e.g., according to NP_004447.2 or NP_001190176.1 or the protein encoded by NM_004456.5 or NM_001203247.2);
G9A (e.g., according to NP_001350618.1 or the protein encoded by NM_001363689.1); or
SUV39H1 (e.g., according to NP_003164.1 or the protein encoded by NM_003173.4));
histone demethylase LSD1 (e.g., according to NP_055828.2 or the protein encoded by NM_015013.4);
FOG1 (e.g., N-terminal residues of FOG1) (e.g., according to NP_722520.2 or the protein encoded by NM_153813.3); or
KAP1 (e.g., according to NP_005753.1 or the protein encoded by NM_005762.3);
a functional fragment or variant of any thereof; or
a polypeptide having a sequence at least 80, 85, 90, 91, 92, 93, 94, 95, 96, 97, 98, or 99% identical to any of the above sequences. In some embodiments, the effector moiety comprises a protein selected from the group consisting of:
DNMT3A (e.g., human DNMT3A) (e.g., according to NP_072046.2 or the protein encoded by NM_022552.4);
DNMT3B (e.g., according to NP_008823.1 or the protein encoded by NM_006892.4);
DNMT3L (e.g., according to NP_787063.1 or the protein encoded by NM_175867.3);
DNMT3A/3L complex;
bacterial MQ1 (e.g., according to CAA35058.1 or UniProt ID P15840.3, obtained from strain ATCC 33825);
a functional fragment of any thereof; or
a polypeptide having a sequence at least 80, 85, 90, 91, 92, 93, 94, 95, 96, 97, 98, or 99% identical to any of the above sequences. In some embodiments, the effector moiety comprises mature bacterial MQ1 (e.g., according to CAA35058.1 or UniProt ID P15840.3, obtained from strain ATCC 33825).
Exemplary effector moieties may include, but are not limited to: ubiquitin, bicyclic peptides as ubiquitin ligase inhibitors, transcription factors, DNA and protein modifying enzymes such as topoisomerases, topoisomerase inhibitors such as topotecan, DNA methyltransferases such as the DNMT family (e.g., DNMT3A, DNMT3B, DNMTL), protein methyltransferases (e.g., viral lysine methyltransferase (vSET), protein-lysine N-methyltransferase (SMYD2)), deaminases (e.g., APOBEC, UG1), histone methyltransferases such as enhancer of zeste homolog 2 (EZH2), PRMT1, histone-lysine-N-methyltransferase (Setdb1), histone methyltransferase (SET2), euchromatic histone-lysine N-methyltransferase 2 (G9a), histone-lysine N-methyltransferase (SUV39H1), and G9a, histone deacetylases (e.g., HDAC1, HDAC2, HDAC3), enzymes with a role in DNA demethylation (e.g., the TET family enzymes, which catalyze oxidation of 5-methylcytosine to 5-hydroxymethylcytosine and higher oxidative derivatives), protein demethylases such as KDM1A and lysine-specific histone demethylase 1 (LSD1), helicases such as DHX9, deacetylases (e.g., sirtuin 1, 2, 3, 4, 5, 6, or 7), kinases, phosphatases, DNA-intercalating agents such as ethidium bromide, SYBR Green, and proflavine, efflux pump inhibitors such as peptidomimetics like phenylalanine arginyl β-naphthylamide or quinoline derivatives, nuclear receptor activators and inhibitors, proteasome inhibitors, competitive inhibitors for enzymes such as those involved in lysosomal storage diseases, protein synthesis inhibitors, nucleases (e.g., Cpf1, Cas9, zinc finger nucleases), fusions of one or more thereof (e.g., dCas9-DNMT, dCas9-APOBEC, dCas9-UG1), and specific domains from proteins, such as a KRAB domain.
In some embodiments, a candidate domain may be determined to be suitable for use as an effector moiety by methods known to those of skill in the art. For example, a candidate effector moiety may be tested by determining whether, when the candidate effector moiety is present in the nucleus of a cell and is appropriately localized (e.g., to a target gene or to a transcriptional control element operably linked to the target gene, e.g., via a targeting moiety), it reduces expression of the target gene in the cell, e.g., reduces the level of an RNA transcript encoded by the target gene (e.g., as measured by RNA-Seq or Northern blot) or reduces the level of a protein encoded by the target gene (e.g., as measured by ELISA).
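The comparison underlying such a test can be sketched in Python. This is a minimal illustration under stated assumptions, not a prescribed assay analysis: it assumes a single expression measurement for control cells and for cells treated with the candidate, both in the same units, and the function names and the `min_reduction` threshold are hypothetical and not taken from the text.

```python
def fractional_reduction(control_level, treated_level):
    """Fraction by which expression is reduced relative to the control.

    Both levels are assumed to be in the same units (e.g., normalized
    transcript counts or protein concentration). A negative result
    indicates increased expression.
    """
    if control_level <= 0:
        raise ValueError("control_level must be positive")
    return 1.0 - treated_level / control_level


def reduces_expression(control_level, treated_level, min_reduction=0.0):
    """Return True if the reduction exceeds a hypothetical threshold."""
    return fractional_reduction(control_level, treated_level) > min_reduction
```

A real evaluation would typically use replicate measurements and an appropriate statistical test rather than a single pair of values.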
In some embodiments, the site-specific breaker comprises an effector moiety that does not bind (e.g., does not detectably bind) to another copy of the effector moiety, e.g., the effector moiety is monomeric and does not associate into a multimer. In some embodiments, the site-specific breaker comprises an effector moiety that associates with another copy of the effector moiety to form a multimer, e.g., a dimer, trimer, tetramer, etc. In some embodiments, the site-specific breaker comprises a plurality of effector moieties, wherein each effector moiety does not detectably bind, e.g., does not bind, to another effector moiety. In some embodiments, the effector moiety used in the compositions and methods described herein functions in a monomeric (e.g., non-dimeric) state.
In some embodiments, the effector moiety comprises an epigenetic modifying moiety, e.g., one that modulates the two-dimensional structure of chromatin (i.e., modulates the structure of chromatin in a manner that alters its two-dimensional representation).
Epigenetic modifying moieties useful in the methods and compositions of the present disclosure include agents that affect epigenetic markers, e.g., DNA methylation, histone acetylation, histone glycosylation, histone phosphorylation, and RNA-associated silencing. Exemplary epigenetic enzymes that can be targeted to a genomic sequence element described herein include DNA methylases (e.g., DNMT3a, DNMT3b, DNMTL, MQ1), DNA demethylases (e.g., the TET family), histone methyltransferases, histone deacetylases (e.g., HDAC1, HDAC2, HDAC3), sirtuin 1, 2, 3, 4, 5, 6, or 7, lysine-specific histone demethylase 1 (LSD1), histone-lysine-N-methyltransferase (Setdb1), euchromatic histone-lysine N-methyltransferase 2 (G9a), histone-lysine N-methyltransferase (SUV39H1), enhancer of zeste homolog 2 (EZH2), viral lysine methyltransferase (vSET), histone methyltransferase (SET2), and protein-lysine N-methyltransferase (SMYD2). Examples of such epigenetic modifying agents are described, for example, in de Groote et al. Nuc. Acids Res. [Nucleic Acids Research] (2012): 1-18.
In some embodiments, a site-specific disruption agent useful herein (e.g., one comprising an epigenetic modifying moiety) comprises or is a construct described in Koferle et al. Genome Medicine 7.59 (2015): 1-3, incorporated herein by reference. For example, in some embodiments, the site-specific disruption agent comprises or is a construct found in Table 1 of Koferle et al., e.g., a histone deacetylase, histone methyltransferase, DNA demethylase, or H3K4 and/or H3K9 histone demethylase described in Table 1 (e.g., dCas9-p300, TALE-TET1, ZF-DNMT3A, or TALE-LSD1).
Additional moieties
The site-specific breaker may further comprise one or more additional moieties (e.g., in addition to the one or more targeting moieties and the one or more effector moieties). In some embodiments, the additional moiety is selected from a labeling moiety or a monitoring moiety, a cleavable moiety (e.g., a cleavable moiety located between the targeting moiety and the effector moiety or at the N or C terminus of the polypeptide), a small molecule, a membrane translocation polypeptide, or a pharmaceutical moiety.
Exemplary site-specific disruption agents
The following exemplary site-specific disruption agents are presented for illustrative purposes only and are not intended to be limiting.
In some embodiments, the site-specific disruption agent comprises a targeting moiety (e.g., comprising dCas9, e.g., Staphylococcus aureus dCas9) and an effector moiety comprising MQ1, e.g., bacterial MQ1. In some embodiments, the site-specific breaker is encoded by the nucleic acid sequence of SEQ ID NO: 201 (e.g., a plasmid encoding the site-specific breaker) and/or SEQ ID NO: 202 (e.g., a nucleic acid (e.g., mRNA) encoding the site-specific breaker). In some embodiments, a nucleic acid described herein comprises the nucleic acid sequence of SEQ ID NO: 201 or 202, a sequence having at least 80%, 85%, 90%, 95%, 99%, or 100% identity thereto, or a sequence differing therefrom by no more than 20, 19, 18, 17, 16, 15, 14, 13, 12, 11, 10, 9, 8, 7, 6, 5, 4, 3, 2, or 1 positions. In some embodiments, the targeting moiety is encoded by the nucleic acid sequence of SEQ ID NO: 9 and/or the effector moiety is encoded by the nucleic acid sequence of SEQ ID NO: 10.
Sa-dCAS9-MQ1 (PL-27695) plasmid DNA sequence:
mRNA sequence expressed by Sa-dCAS9-MQ1 (MR-28126):
In some embodiments, the site-specific disruption agent comprises a targeting moiety (e.g., comprising dCas9, e.g., Streptococcus pyogenes dCas9, or a functional variant or mutant thereof, e.g., Cas9m4) and an effector moiety comprising MQ1, e.g., bacterial MQ1. In some embodiments, the site-specific breaker is encoded by the nucleic acid sequence of SEQ ID NO: 207 (e.g., a nucleic acid (e.g., mRNA) encoding the site-specific breaker). In some embodiments, a nucleic acid described herein comprises the nucleic acid sequence of SEQ ID NO: 207, a sequence having at least 80%, 85%, 90%, 95%, 99%, or 100% identity thereto, or a sequence differing therefrom by no more than 20, 19, 18, 17, 16, 15, 14, 13, 12, 11, 10, 9, 8, 7, 6, 5, 4, 3, 2, or 1 positions. In some embodiments, the targeting moiety is encoded by the nucleic acid sequence of SEQ ID NO: 8 and/or the effector moiety is encoded by the nucleic acid sequence of SEQ ID NO: 10.
dCAS9-MQ1 mRNA sequence (MR-28125):
In some embodiments, the site-specific breaker comprises the amino acid sequence of SEQ ID NO: 203, 208, 73, or 74. In some embodiments, the site-specific disruption agent described herein comprises the amino acid sequence of SEQ ID NO: 203, 208, 73, or 74, a sequence having at least 80%, 85%, 90%, 95%, 99%, or 100% identity thereto, or a sequence differing therefrom by no more than 20, 19, 18, 17, 16, 15, 14, 13, 12, 11, 10, 9, 8, 7, 6, 5, 4, 3, 2, or 1 positions.
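The two alternative criteria recited for variant sequences (a minimum percent identity, or a maximum number of differing positions) can be illustrated with a short Python sketch. It assumes the candidate and reference sequences are already aligned and of equal length, with no gaps; the text does not specify an alignment method, and the function names are hypothetical.

```python
def count_differences(candidate, reference):
    """Count positions at which two pre-aligned, equal-length sequences differ."""
    if len(candidate) != len(reference):
        raise ValueError("sequences must be pre-aligned to equal length")
    return sum(a != b for a, b in zip(candidate, reference))


def percent_identity(candidate, reference):
    """Percentage of positions that are identical between the two sequences."""
    if not reference:
        raise ValueError("sequences must be non-empty")
    matches = len(reference) - count_differences(candidate, reference)
    return 100.0 * matches / len(reference)


def meets_either_criterion(candidate, reference,
                           min_identity=80.0, max_differences=20):
    """True if the identity threshold OR the difference limit is satisfied."""
    return (
        percent_identity(candidate, reference) >= min_identity
        or count_differences(candidate, reference) <= max_differences
    )
```

The defaults of 80% and 20 positions correspond to the broadest values recited above; narrower embodiments would use higher identity thresholds or lower difference limits.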
Sa-dCAS9-MQ1 protein sequence:
dCAS9-MQ1 protein sequence (corresponding to MR-28125):
Sa-dCAS9-MQ1 without HA tag:
dCAS9-MQ1 without HA tag:
In some embodiments, the site-specific disruption agent comprises a targeting moiety (e.g., comprising dCas9, e.g., Streptococcus pyogenes dCas9) and an effector moiety comprising KRAB, e.g., a KRAB domain. In some embodiments, the site-specific breaker is encoded by the nucleic acid sequence of SEQ ID NO: 204 (e.g., a plasmid encoding the site-specific breaker) and/or SEQ ID NO: 205 (e.g., a nucleic acid (e.g., mRNA) encoding the site-specific breaker). In some embodiments, a nucleic acid described herein comprises the nucleic acid sequence of SEQ ID NO: 204 or 205, a sequence having at least 80%, 85%, 90%, 95%, 99%, or 100% identity thereto, or a sequence differing therefrom by no more than 20, 19, 18, 17, 16, 15, 14, 13, 12, 11, 10, 9, 8, 7, 6, 5, 4, 3, 2, or 1 positions. In some embodiments, the targeting moiety is encoded by the nucleic acid sequence of SEQ ID NO: 8 and the effector moiety is encoded by the nucleic acid sequence of SEQ ID NO: 14.
Sp-dCAS9-KRAB (PL-27687) plasmid DNA sequence:
Sp-dCAS9-KRAB (MR-28122) expressed mRNA sequence:
In some embodiments, the site-specific breaker comprises the amino acid sequence of SEQ ID NO: 206 or 75. In some embodiments, a site-specific breaker described herein comprises the amino acid sequence of SEQ ID NO: 206 or 75, a sequence having at least 80%, 85%, 90%, 95%, 99%, or 100% identity thereto, or a sequence differing therefrom by no more than 20, 19, 18, 17, 16, 15, 14, 13, 12, 11, 10, 9, 8, 7, 6, 5, 4, 3, 2, or 1 positions.
Sp-dCAS9-KRAB protein sequence:
Sp-dCAs9-KRAB protein sequence without HA tag:
In some embodiments, the site-specific disruption agent comprises a targeting moiety (e.g., comprising dCas9, e.g., Staphylococcus aureus dCas9) and an effector moiety comprising EZH2. In some embodiments, the site-specific disruption agent comprises a targeting moiety (e.g., comprising dCas9, e.g., Streptococcus pyogenes dCas9, e.g., a mutated Streptococcus pyogenes Cas9, e.g., Cas9m4) and an effector moiety comprising EZH2. In some embodiments, the site-specific breaker is encoded by the nucleic acid sequence of SEQ ID NO: 209 (e.g., an mRNA encoding the site-specific breaker). In some embodiments, a nucleic acid described herein comprises the nucleic acid sequence of SEQ ID NO: 209, a sequence having at least 80%, 85%, 90%, 95%, 99%, or 100% identity thereto, or a sequence differing therefrom by no more than 20, 19, 18, 17, 16, 15, 14, 13, 12, 11, 10, 9, 8, 7, 6, 5, 4, 3, 2, or 1 positions. In some embodiments, the targeting moiety is encoded by the nucleic acid sequence of SEQ ID NO: 8 and/or the effector moiety is encoded by the nucleic acid sequence of SEQ ID NO: 18.
In some embodiments, the site-specific breaker comprises the amino acid sequence of SEQ ID NO: 210 or 76. In some embodiments, a site-specific breaker described herein comprises the amino acid sequence of SEQ ID NO: 210 or 76, a sequence having at least 80%, 85%, 90%, 95%, 99%, or 100% identity thereto, or a sequence differing therefrom by no more than 20, 19, 18, 17, 16, 15, 14, 13, 12, 11, 10, 9, 8, 7, 6, 5, 4, 3, 2, or 1 positions.
EZH2-dCAS9 protein sequence (corresponding to MR-28938)
EZH2-dCAS9 without HA tag
In some embodiments, the site-specific disruption agent comprises a targeting moiety (e.g., comprising dCas9, e.g., Staphylococcus aureus dCas9) and an effector moiety comprising DNMT3 (e.g., DNMT3a/3L). In some embodiments, the site-specific disruption agent comprises a targeting moiety (comprising dCas9, e.g., Streptococcus pyogenes dCas9, e.g., mutated Streptococcus pyogenes Cas9, e.g., Cas9m4) and an effector moiety comprising DNMT3 (e.g., DNMT3a/3L). In some embodiments, the site-specific breaker is encoded by the nucleic acid sequence of SEQ ID NO: 211 (e.g., an mRNA encoding the site-specific breaker). In some embodiments, a nucleic acid described herein comprises the nucleic acid sequence of SEQ ID NO: 211 or a sequence having at least 80%, 85%, 90%, 95%, 99% or 100% identity thereto, or a sequence that differs therefrom by no more than 20, 19, 18, 17, 16, 15, 14, 13, 12, 11, 10, 9, 8, 7, 6, 5, 4, 3, 2, or 1 positions.
In some embodiments, the site-specific breaker comprises the amino acid sequence of SEQ ID NO 211 or 77. In some embodiments, the constructs described herein comprise the amino acid sequence of SEQ ID NO 211 or 77 or a sequence having at least 80%, 85%, 90%, 95%, 99% or 100% identity thereto, or a sequence not differing by more than 20, 19, 18, 17, 16, 15, 14, 13, 12, 11, 10, 9, 8, 7, 6, 5, 4, 3, 2 or 1 positions therefrom.
dCAS9-DNMT3a/3L protein sequence (corresponding to MR-29414)
dCAS9-DNMT3a/3L (h) without HA tag
In some embodiments, the site-specific breaker comprises a targeting moiety (e.g., comprising dCas9, e.g., Staphylococcus aureus dCas9) and an effector moiety comprising HDAC8 (e.g., an HDAC8 domain). In some embodiments, the site-specific disruption agent comprises a targeting moiety (comprising dCas9, e.g., Streptococcus pyogenes dCas9, e.g., mutated Streptococcus pyogenes Cas9, e.g., Cas9m4) and an effector moiety comprising HDAC8 (e.g., an HDAC8 domain). In some embodiments, the site-specific breaker is encoded by the nucleic acid sequence of SEQ ID NO: 213 (e.g., an mRNA encoding the site-specific breaker). In some embodiments, a nucleic acid described herein comprises the nucleic acid sequence of SEQ ID NO: 213 or a sequence having at least 80%, 85%, 90%, 95%, 99% or 100% identity thereto, or a sequence that differs therefrom by no more than 20, 19, 18, 17, 16, 15, 14, 13, 12, 11, 10, 9, 8, 7, 6, 5, 4, 3, 2, or 1 positions.
In some embodiments, the site-specific breaker comprises the amino acid sequence of SEQ ID NO:214 or 78. In some embodiments, the site-specific disruption agent described herein comprises the amino acid sequence of SEQ ID NO 214 or 78 or a sequence having at least 80%, 85%, 90%, 95%, 99% or 100% identity thereto, or not more than 20, 19, 18, 17, 16, 15, 14, 13, 12, 11, 10, 9, 8, 7, 6, 5, 4, 3, 2 or 1 positions different therefrom.
dCAS9-HDAC8 protein sequence (corresponding to MR-29439)
dCAS9-HDAC8 without HA tag
In some embodiments, the site-specific breaker comprises a targeting moiety (e.g., comprising dCas9, e.g., Staphylococcus aureus dCas9); a first effector moiety comprising EZH2 (e.g., an EZH2 domain); and a second effector moiety comprising HDAC8 (e.g., an HDAC8 domain). In some embodiments, the site-specific disruption agent comprises a targeting moiety (comprising dCas9, e.g., Streptococcus pyogenes dCas9, e.g., mutated Streptococcus pyogenes Cas9, e.g., Cas9m4); a first effector moiety comprising EZH2 (e.g., an EZH2 domain); and a second effector moiety comprising HDAC8 (e.g., an HDAC8 domain). In some embodiments, the site-specific breaker is encoded by the nucleic acid sequence of SEQ ID NO: 215 (e.g., an mRNA encoding the site-specific breaker). In some embodiments, a nucleic acid described herein comprises the nucleic acid sequence of SEQ ID NO: 215 or a sequence having at least 80%, 85%, 90%, 95%, 99% or 100% identity thereto, or a sequence that differs therefrom by no more than 20, 19, 18, 17, 16, 15, 14, 13, 12, 11, 10, 9, 8, 7, 6, 5, 4, 3, 2, or 1 positions.
In some embodiments, the site-specific breaker comprises the amino acid sequence of SEQ ID NO:216 or 79. In some embodiments, the site-specific disruption agent described herein comprises the amino acid sequence of SEQ ID NO 216 or 79 or a sequence having at least 80%, 85%, 90%, 95%, 99% or 100% identity thereto, or a sequence not differing therefrom by more than 20, 19, 18, 17, 16, 15, 14, 13, 12, 11, 10, 9, 8, 7, 6, 5, 4, 3, 2 or 1 positions.
EZH2-dCAS9-HDAC8 protein sequence (corresponding to MR-29447)
EZH2-dCAS9-HDAC8 protein sequence (corresponding to MR-29447)
In some embodiments, the site-specific disruption agent comprises a targeting moiety (e.g., comprising dCas9, e.g., Staphylococcus aureus dCas9); a first effector moiety comprising G9A (e.g., a G9A domain); and a second effector moiety comprising EZH2 (e.g., an EZH2 domain). In some embodiments, the site-specific disruption agent comprises a targeting moiety (comprising dCas9, e.g., Streptococcus pyogenes dCas9, e.g., mutated Streptococcus pyogenes Cas9, e.g., Cas9m4); a first effector moiety comprising G9A (e.g., a G9A domain); and a second effector moiety comprising EZH2 (e.g., an EZH2 domain). In some embodiments, the site-specific breaker is encoded by the nucleic acid sequence of SEQ ID NO: 69 (e.g., an mRNA encoding the site-specific breaker). In some embodiments, a nucleic acid described herein comprises the nucleic acid sequence of SEQ ID NO: 69 or a sequence having at least 80%, 85%, 90%, 95%, 99% or 100% identity thereto, or a sequence that differs therefrom by no more than 20, 19, 18, 17, 16, 15, 14, 13, 12, 11, 10, 9, 8, 7, 6, 5, 4, 3, 2, or 1 positions.
G9A-dCAS9-EZH2 (MR-29441) mRNA sequence
In some embodiments, the site-specific breaker comprises the amino acid sequence of SEQ ID NO 70 or 80. In some embodiments, the site-specific disruption agent described herein comprises the amino acid sequence of SEQ ID NO 70 or 80 or a sequence having at least 80%, 85%, 90%, 95%, 99% or 100% identity thereto, or a sequence not differing therefrom by more than 20, 19, 18, 17, 16, 15, 14, 13, 12, 11, 10, 9, 8, 7, 6, 5, 4, 3, 2 or 1 positions.
G9A-dCAS9-EZH2 protein
HA tag-free G9A-dCAS9-EZH2 protein
In some embodiments, the site-specific disruption agent comprises a targeting moiety (e.g., comprising dCas9, e.g., Staphylococcus aureus dCas9); a first effector moiety comprising G9A (e.g., a G9A domain); and a second effector moiety comprising KRAB (e.g., a KRAB domain). In some embodiments, the site-specific disruption agent comprises a targeting moiety (comprising dCas9, e.g., Streptococcus pyogenes dCas9, e.g., mutated Streptococcus pyogenes Cas9, e.g., Cas9m4); a first effector moiety comprising G9A (e.g., a G9A domain); and a second effector moiety comprising KRAB (e.g., a KRAB domain). In some embodiments, the site-specific breaker is encoded by the nucleic acid sequence of SEQ ID NO: 71 (e.g., an mRNA encoding the site-specific breaker). In some embodiments, a nucleic acid described herein comprises the nucleic acid sequence of SEQ ID NO: 71 or a sequence having at least 80%, 85%, 90%, 95%, 99% or 100% identity thereto, or a sequence that differs therefrom by no more than 20, 19, 18, 17, 16, 15, 14, 13, 12, 11, 10, 9, 8, 7, 6, 5, 4, 3, 2, or 1 positions.
G9A-dCAS9-KRAB (MR-29942) mRNA sequence
In some embodiments, the site-specific breaker comprises the amino acid sequence of SEQ ID NO:72 or 81. In some embodiments, the site-specific disruption agent described herein comprises the amino acid sequence of SEQ ID NO 72 or 81 or a sequence having at least 80%, 85%, 90%, 95%, 99% or 100% identity thereto, or a sequence not differing therefrom by more than 20, 19, 18, 17, 16, 15, 14, 13, 12, 11, 10, 9, 8, 7, 6, 5, 4, 3, 2 or 1 positions.
G9A-dCAS9-KRAB protein
HA-tag-free G9A-dCAS9-KRAB protein
In some embodiments, the site-specific breaker comprises a targeting moiety (e.g., comprising dCas9, e.g., Staphylococcus aureus dCas9); a first effector moiety comprising EZH2 (e.g., an EZH2 domain); and a second effector moiety comprising KRAB (e.g., a KRAB domain). In some embodiments, the site-specific disruption agent comprises a targeting moiety (comprising dCas9, e.g., Streptococcus pyogenes dCas9, e.g., mutated Streptococcus pyogenes Cas9, e.g., Cas9m4); a first effector moiety comprising EZH2 (e.g., an EZH2 domain); and a second effector moiety comprising KRAB (e.g., a KRAB domain). In some embodiments, the site-specific breaker is encoded by the nucleic acid sequence of SEQ ID NO: 85 (e.g., an mRNA encoding the site-specific breaker). In some embodiments, a nucleic acid described herein comprises the nucleic acid sequence of SEQ ID NO: 85 or a sequence having at least 80%, 85%, 90%, 95%, 99% or 100% identity thereto, or a sequence that differs therefrom by no more than 20, 19, 18, 17, 16, 15, 14, 13, 12, 11, 10, 9, 8, 7, 6, 5, 4, 3, 2, or 1 positions.
EZH2-dCAS9-KRAB (MR-29948) mRNA sequence
In some embodiments, the site-specific breaker comprises the amino acid sequence of SEQ ID NO 86 or 82. In some embodiments, the site-specific disruption agent described herein comprises the amino acid sequence of SEQ ID NO 86 or 82 or a sequence having at least 80%, 85%, 90%, 95%, 99% or 100% identity thereto, or a sequence not differing therefrom by more than 20, 19, 18, 17, 16, 15, 14, 13, 12, 11, 10, 9, 8, 7, 6, 5, 4, 3, 2 or 1 positions.
EZH2-dCAS9-KRAB protein
EZH2-dCAS9-KRAB protein without HA tag
In some embodiments, the site-specific disruption agent comprises a CRISPR/Cas molecule comprising Cas9. In some embodiments, the site-specific breaker is encoded by the nucleic acid sequence of SEQ ID NO: 217 (e.g., an mRNA encoding the site-specific breaker). In some embodiments, a nucleic acid described herein comprises the nucleic acid sequence of SEQ ID NO: 217 or a sequence having at least 80%, 85%, 90%, 95%, 99% or 100% identity thereto, or a sequence that differs therefrom by no more than 20, 19, 18, 17, 16, 15, 14, 13, 12, 11, 10, 9, 8, 7, 6, 5, 4, 3, 2, or 1 positions.
In some embodiments, the site-specific breaker comprises the amino acid sequence of SEQ ID NO:218 or 84. In some embodiments, the site-specific disruption agent described herein comprises the amino acid sequence of SEQ ID NO 218 or 84 or a sequence having at least 80%, 85%, 90%, 95%, 99% or 100% identity thereto, or a sequence not differing therefrom by more than 20, 19, 18, 17, 16, 15, 14, 13, 12, 11, 10, 9, 8, 7, 6, 5, 4, 3, 2 or 1 positions.
Cas9 protein sequence (corresponding to MR-28127)
Cas9 protein sequence without HA tag (corresponding to MR-28127)
In some embodiments, the site-specific breaker comprises a nuclear localization sequence (NLS). In some embodiments, the site-specific breaker comprises an NLS, e.g., an N-terminal SV40 NLS. In some embodiments, the site-specific breaker comprises an NLS, e.g., a C-terminal SV40 NLS. In some embodiments, the site-specific breaker comprises an NLS, e.g., a C-terminal nucleoplasmin NLS. In some embodiments, the site-specific breaker comprises a first NLS at the N-terminus and a second NLS at the C-terminus. In some embodiments, the first and second NLSs have the same sequence. In some embodiments, the first and second NLSs have different sequences. In some embodiments, the site-specific breaker comprises a first NLS at the N-terminus, a second NLS, and a third NLS at the C-terminus. In some embodiments, at least two NLSs have the same sequence. In some embodiments, the first and second NLSs have the same sequence and the third NLS has a different sequence than the first and second NLSs. In some embodiments, the site-specific breaker comprises an SV40 NLS, e.g., the site-specific breaker comprises a sequence according to PKKRK (SEQ ID NO: 63). In some embodiments, the site-specific breaker comprises a nucleoplasmin NLS, e.g., the site-specific breaker comprises the sequence KRPAATKKAGQAKKK (SEQ ID NO: 64). In some embodiments, the site-specific breaker comprises a C-terminal sequence comprising one or more of (e.g., either or both of) a nucleoplasmin nuclear localization sequence and an HA tag. In some embodiments, the site-specific breaker comprises an epitope tag, e.g., an HA tag: YPYDVPDYA (SEQ ID NO: 65). In some embodiments, the site-specific breaker comprises two copies of an epitope tag.
While epitope tags are useful in many research settings, it is sometimes desirable to omit them in a therapeutic setting. Thus, in some embodiments, the site-specific breaker lacks an epitope tag. In some embodiments, the site-specific disruption agent described herein comprises a sequence provided herein (or a sequence having at least 80%, 85%, 90%, 95%, 99% or 100% identity thereto, or a sequence differing therefrom by no more than 20, 19, 18, 17, 16, 15, 14, 13, 12, 11, 10, 9, 8, 7, 6, 5, 4, 3, 2, or 1 positions), but lacks the HA tag of SEQ ID NO: 65. In some embodiments, a nucleic acid described herein comprises a sequence provided herein (or a sequence having at least 80%, 85%, 90%, 95%, 99% or 100% identity thereto, or a sequence differing therefrom by no more than 20, 19, 18, 17, 16, 15, 14, 13, 12, 11, 10, 9, 8, 7, 6, 5, 4, 3, 2, or 1 positions), but lacks a sequence encoding the HA tag of SEQ ID NO: 65. In some embodiments, the site-specific breaker does not comprise an NLS. In some embodiments, the site-specific breaker does not comprise an epitope tag. In some embodiments, the site-specific breaker does not comprise an HA tag. In some embodiments, the site-specific breaker does not comprise an HA tag sequence according to SEQ ID NO: 65.
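As an illustration of the tag-free variants described above, the following minimal Python sketch checks a protein sequence for the HA tag of SEQ ID NO: 65 and removes every copy of it. The function names and the toy input sequence are hypothetical and are not sequences of the disclosure. An actual tag-free construct may also differ in flanking linker residues, which this sketch does not address.

```python
HA_TAG = "YPYDVPDYA"  # HA epitope tag, SEQ ID NO: 65 as given in the text


def count_ha_tags(protein: str) -> int:
    """Count non-overlapping occurrences of the HA tag in a protein sequence."""
    return protein.upper().count(HA_TAG)


def strip_ha_tags(protein: str) -> str:
    """Return the sequence with every copy of the HA tag removed."""
    return protein.upper().replace(HA_TAG, "")


# Hypothetical toy sequence (not from the disclosure): a short stretch of
# residues followed by two copies of the HA tag.
toy = "MKRPAATKK" + HA_TAG + HA_TAG
tag_free = strip_ha_tags(toy)
```

Because the comparison is a plain substring match, it only detects exact copies of the tag; a variant tag with a substitution would need an alignment-based check instead.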
In some embodiments, the site-specific breaker comprises a targeting moiety comprising a Zn-finger molecule and an effector moiety comprising EZH2 or a functional fragment or variant thereof. In some embodiments, the site-specific breaker is encoded by: the nucleic acid sequence of SEQ ID NO 219, 220, 222, 223, 233 or 234 or a sequence which is at least 80%, 85%, 90%, 95%, 99% or 100% identical thereto or which differs therefrom by NO more than 20, 19, 18, 17, 16, 15, 14, 13, 12, 11, 10, 9, 8, 7, 6, 5, 4, 3, 2 or 1 positions.
In some embodiments, the site-specific disruption agent comprises a targeting moiety comprising a Zn-finger molecule and an effector moiety comprising DNMT3 (e.g., DNMT3a or DNMT3L) or a functional fragment or variant thereof. In some embodiments, the site-specific breaker is encoded by: the nucleic acid sequence of SEQ ID NO: 221, 231 or 236-239 or a sequence having at least 80%, 85%, 90%, 95%, 99% or 100% identity thereto or differing therefrom by no more than 20, 19, 18, 17, 16, 15, 14, 13, 12, 11, 10, 9, 8, 7, 6, 5, 4, 3, 2 or 1 positions.
In some embodiments, the site-specific breaker comprises a targeting moiety comprising a Zn-finger molecule and an effector moiety comprising G9A or a functional fragment or variant thereof. In some embodiments, the site-specific breaker is encoded by: the nucleic acid sequence of SEQ ID NO: 224, 225 or 227-230 or a sequence having at least 80%, 85%, 90%, 95%, 99% or 100% identity thereto or differing therefrom by no more than 20, 19, 18, 17, 16, 15, 14, 13, 12, 11, 10, 9, 8, 7, 6, 5, 4, 3, 2 or 1 positions.
In some embodiments, the site-specific breaker comprises a targeting moiety comprising a Zn-finger molecule and an effector moiety comprising HDAC8 or a functional fragment or variant thereof. In some embodiments, the site-specific breaker is encoded by: the nucleic acid sequence of SEQ ID NO. 226, 232, 235 or 240-242 or a sequence having at least 80%, 85%, 90%, 95%, 99% or 100% identity thereto or not differing therefrom by more than 20, 19, 18, 17, 16, 15, 14, 13, 12, 11, 10, 9, 8, 7, 6, 5, 4, 3, 2 or 1 positions.
In some embodiments, a nucleic acid (e.g., a nucleic acid encoding a site-specific breaker) for use in a method or composition described herein comprises a nucleic acid sequence of any one of SEQ ID NOs 69, 71, 85, 201, 202, 204, 205, 207, 209, 211, 213, 215, 217, or 219-242, the complement or reverse complement of any one thereof, or a sequence having at least 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98% or 99% identity thereto. In some embodiments, the site-specific disruption agent for use in the methods or compositions described herein comprises the amino acid sequence of any one of SEQ ID NOs 70, 72-82, 84, 86, 203, 206, 208, 210, 212, 214, 216, or 218, or comprises a sequence having at least 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identity thereto. In some embodiments, a site-specific breaker for use in a method or composition described herein comprises an amino acid sequence encoded by any one of SEQ ID NOs 69, 71, 85, 201, 202, 204, 205, 207, 209, 211, 213, 215, 217, or 219-242, or an amino acid sequence having at least 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identity thereto.
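The identity thresholds and position limits recited above can be checked with a short calculation. The following Python sketch is illustrative only: it assumes the two sequences have already been aligned to equal length (with "-" marking gaps) and counts identity over the full alignment length, which is one common convention but is not stated in this section. The function names and example sequences are hypothetical.

```python
def compare_aligned(ref: str, query: str) -> tuple[float, int]:
    """Return (percent identity, number of differing positions).

    Assumes ref and query are pre-aligned to the same length, with '-'
    marking gaps; any column containing a gap counts as a difference.
    """
    if len(ref) != len(query):
        raise ValueError("sequences must be aligned to the same length")
    if not ref:
        raise ValueError("sequences must not be empty")
    diffs = sum(
        1
        for a, b in zip(ref.upper(), query.upper())
        if a != b or a == "-"
    )
    identity = 100.0 * (len(ref) - diffs) / len(ref)
    return identity, diffs


def meets_criteria(ref: str, query: str,
                   min_identity: float = 80.0, max_diffs: int = 20) -> bool:
    """True if the identity threshold OR the position limit is satisfied."""
    identity, diffs = compare_aligned(ref, query)
    return identity >= min_identity or diffs <= max_diffs


# Hypothetical example: a 9-residue reference and a variant with one change.
identity, diffs = compare_aligned("YPYDVPDYA", "YPYDVPDFA")
```

For unaligned sequences of different lengths, a pairwise alignment step would be needed before this comparison; the choice of alignment tool and scoring parameters can change the reported identity.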
Functional features
The site-specific disruptors or systems of the disclosure can be used to modulate (e.g., reduce) expression of a target plurality of genes in a cell. In some embodiments, modulating expression comprises reducing the level of RNA, e.g., mRNA, encoded by each of the target plurality of genes. In some embodiments, modulating expression comprises reducing the level of a protein encoded by each of the target plurality of genes. In some embodiments, modulating expression comprises reducing the level of mRNA and protein encoded by each of the target plurality of genes. In some embodiments, the expression level of a gene in a target plurality of genes in a cell contacted with or comprising a site-specific breaker is at least 1.05x (i.e., 1.05 times), 1.1x, 1.15x, 1.2x, 1.25x, 1.3x, 1.35x, 1.4x, 1.45x, 1.5x, 1.55x, 1.6x, 1.65x, 1.7x, 1.75x, 1.8x, 1.85x, 1.9x, 1.95x, 2x, 3x, 4x, 5x, 6x, 7x, 8x, 9x, 10x, 20x, 30x, 40x, 50x, 60x, 70x, 80x, 90x, or 100x lower than the expression level of the gene in a cell not contacted with or comprising the site-specific breaker. In some embodiments, the expression level of each gene in the target plurality of genes in a cell contacted with or comprising a site-specific disruption agent or system is at least 1.05x (i.e., 1.05 times), 1.1x, 1.15x, 1.2x, 1.25x, 1.3x, 1.35x, 1.4x, 1.45x, 1.5x, 1.55x, 1.6x, 1.65x, 1.7x, 1.75x, 1.8x, 1.85x, 1.9x, 1.95x, 2x, 3x, 4x, 5x, 6x, 7x, 8x, 9x, 10x, 20x, 30x, 40x, 50x, 60x, 70x, 80x, 90x, or 100x lower than the expression level of the gene in a cell not contacted with or comprising the site-specific disruption agent or system. Expression of a gene can be determined by methods known to those skilled in the art, including RT-PCR, ELISA, western blotting, and the methods of Examples 2 or 4-19.
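A fold reduction of the kind recited above can be computed as the ratio of the expression level in untreated cells to the level in treated cells. The Python sketch below is a minimal illustration under that assumption; the function names and the expression values are hypothetical and are not data from the disclosure.

```python
def fold_reduction(untreated: float, treated: float) -> float:
    """Fold reduction = level in untreated cells / level in treated cells."""
    if untreated <= 0 or treated <= 0:
        raise ValueError("expression levels must be positive")
    return untreated / treated


def all_genes_reduced(levels: dict[str, tuple[float, float]],
                      min_fold: float) -> bool:
    """True if every gene in the plurality is reduced by at least min_fold.

    levels maps gene name -> (untreated level, treated level).
    """
    return all(fold_reduction(u, t) >= min_fold for u, t in levels.values())


# Hypothetical illustrative values in arbitrary units (not measured data).
levels = {
    "CXCL1": (100.0, 20.0),  # 5x lower
    "CXCL2": (80.0, 40.0),   # 2x lower
    "IL8": (50.0, 10.0),     # 5x lower
}
```

With these values, every gene meets a 2x threshold but not a 3x threshold, which mirrors the distinction in the text between a reduction of "a gene" and of "each gene" in the target plurality.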
The site-specific breakers or systems of the present disclosure can be used to reduce binding of a nucleating polypeptide, such as CTCF, to an anchor sequence. In some embodiments, contacting the cell with, or administering, a site-specific breaker or system results in reduced binding of a nucleating polypeptide (e.g., CTCF) to an anchor sequence (e.g., an anchor sequence of an ASMC comprising a target plurality of genes). In some embodiments, contacting the cell with, or administering, the site-specific breaker or system results in a complete loss of binding, or a reduction of at least 50%, 60%, 70%, 80%, 90%, 95% or 99%, relative to the binding of the nucleating polypeptide (e.g., CTCF) to the anchor sequence prior to treatment with the site-specific breaker or system or in the absence of the site-specific breaker or system, e.g., as measured by ChIP and/or quantitative PCR.
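The percent reduction in binding described above can be expressed as the drop in signal relative to the baseline signal. The following Python sketch assumes a single quantitative binding readout per condition (for example, a ChIP-qPCR enrichment value); the function names and numbers are hypothetical and are not data from the disclosure.

```python
BINDING_THRESHOLDS = (50, 60, 70, 80, 90, 95, 99)  # percentages listed above


def percent_reduction(baseline: float, treated: float) -> float:
    """Percent reduction of a binding signal relative to baseline."""
    if baseline <= 0:
        raise ValueError("baseline signal must be positive")
    if treated < 0:
        raise ValueError("treated signal must not be negative")
    return 100.0 * (baseline - treated) / baseline


def thresholds_met(baseline: float, treated: float) -> list[int]:
    """Return the listed thresholds that the observed reduction satisfies."""
    reduction = percent_reduction(baseline, treated)
    return [t for t in BINDING_THRESHOLDS if reduction >= t]


# Hypothetical enrichment values: 2.0 before treatment, 0.5 after treatment.
reduction = percent_reduction(2.0, 0.5)
```

A treated signal of zero corresponds to the complete loss of binding mentioned in the text and gives a reduction of 100%.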
The site-specific disruption agent or system of the present disclosure can be used to disrupt a genomic complex (e.g., ASMC) comprising a target plurality of genes. In some embodiments, contacting the cell with, or administering, the site-specific disruption agent or system results in a decrease in the level of genomic complexes (e.g., ASMC) comprising the target plurality of genes relative to the level of the complexes prior to treatment with the site-specific disruption agent or system or in the absence of the site-specific disruption agent or system. In some embodiments, contacting the cell with, or administering, the site-specific disruption agent or system results in complete loss of the genomic complex (e.g., ASMC) or a reduction of at least 25%, 30%, 35%, 40%, 45%, 50%, 60%, 70%, 80%, 90%, 95% or 99% relative to the level of the complex prior to treatment with the site-specific disruption agent or system or in the absence of the site-specific disruption agent or system, e.g., as measured by ChIA-PET, ELISA (e.g., for assessing changes in gene expression), CUT&RUN, ATAC-seq, ChIP, and/or quantitative PCR.
The site-specific disruptors or systems of the disclosure may be used to modulate (e.g., reduce) expression of a target plurality of genes in a cell for a period of time. In some embodiments, expression of a gene in a target plurality of genes in a cell contacted with or comprising a site-specific disruption agent or system is significantly reduced for at least 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, or 24 hours, or at least 1, 2, 3, 4, 5, 6, 7, 10, or 14 days, or at least 1, 2, 3, 4, or 5 weeks, or at least 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, or 12 months, or at least 1, 2, 3, 4, or 5 years (e.g., indefinitely). In some embodiments, the expression of each of the target plurality of genes in a cell contacted with or comprising a site-specific disruption agent or system is significantly reduced for at least 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, or 24 hours, or at least 1, 2, 3, 4, 5, 6, 7, 10, or 14 days, or at least 1, 2, 3, 4, or 5 weeks, or at least 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, or 12 months, or at least 1, 2, 3, 4, or 5 years (e.g., indefinitely). Optionally, the expression of the gene in the target plurality of genes in the cell contacted with or comprising the site-specific disruption agent or system is significantly reduced for no more than 10, 9, 8, 7, 6, 5, 4, 3, 2, or 1 year. Optionally, the expression of each gene of the target plurality of genes in the cell contacted with or comprising the site-specific disruption agent or system is significantly reduced for no more than 10, 9, 8, 7, 6, 5, 4, 3, 2, or 1 year.
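The duration of reduced expression can be estimated from a time course of paired measurements. The Python sketch below is illustrative only: the text refers to expression being "significantly reduced" without defining a test, so the fold threshold used here is an assumption, as are the function name and the example values.

```python
def sustained_reduction_hours(timecourse, min_fold: float = 1.05) -> float:
    """Return the latest time (hours) up to which every measurement met min_fold.

    timecourse: iterable of (hours, untreated_level, treated_level) tuples.
    Returns 0.0 if the earliest measurement already fails the threshold.
    """
    last_ok = 0.0
    for hours, untreated, treated in sorted(timecourse):
        if untreated <= 0 or treated <= 0:
            raise ValueError("expression levels must be positive")
        if untreated / treated < min_fold:
            break  # reduction no longer maintained from this point on
        last_ok = float(hours)
    return last_ok


# Hypothetical time course (hours, untreated, treated) in arbitrary units.
course = [
    (6, 100.0, 20.0),
    (24, 100.0, 25.0),
    (72, 100.0, 50.0),
    (168, 100.0, 98.0),
]
```

The result is bounded by the sampling schedule: it reports the last sampled time at which the threshold still held, not the exact moment the reduction ended.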
The site-specific breaker or system may comprise a plurality of effector moieties, wherein each effector moiety has a different function than the other effector moieties. For example, a site-specific breaker or system may comprise a first effector moiety comprising histone deacetylase function and a second effector moiety comprising DNA methyltransferase function. In some embodiments, the site-specific disruption agent comprises a combination of effector moieties whose functions are complementary to each other in terms of modulating (e.g., reducing) expression of the target plurality of genes, e.g., wherein the functions together reduce expression, and optionally, do not reduce expression or negligibly reduce expression when present alone.
In some embodiments, the site-specific disruption agent or system comprises a combination of effector moieties whose functions cooperate in modulating (e.g., reducing) expression of a target plurality of genes. Without wishing to be bound by theory, it is believed that epigenetic modifications to genomic loci are cumulative, in that multiple repressive epigenetic marks (e.g., multiple different types of epigenetic marks and/or more extensive marking of a given type) together reduce expression more effectively (e.g., produce a greater reduction in expression and/or a longer-lasting reduction in expression) than a single modification alone. In some embodiments, the site-specific disruption agent or system comprises a plurality of mutually synergistic effector moieties, e.g., each effector moiety reduces expression of a target gene. In some embodiments, a site-specific breaker comprising a plurality of different effector moieties that cooperate with one another is more effective in modulating (e.g., reducing) expression of a target plurality of genes than a site-specific breaker comprising only one of the plurality of different effector moieties or a non-synergistic combination of effector moieties. In some embodiments, such a site-specific breaker is at least 1.05x (i.e., 1.05 times), 1.1x, 1.15x, 1.2x, 1.25x, 1.3x, 1.35x, 1.4x, 1.45x, 1.5x, 1.55x, 1.6x, 1.65x, 1.7x, 1.75x, 1.8x, 1.85x, 1.9x, 1.95x, 2x, 3x, 4x, 5x, 6x, 7x, 8x, 9x, 10x, 20x, 30x, 40x, 50x, 60x, 70x, 80x, 90x, or 100x more effective in modulating (e.g., reducing) expression of a target plurality of genes than a site-specific breaker or system comprising only one of the plurality of different effector moieties or a non-synergistic combination of effector moieties.
Target site
The site-specific disruption agents or systems disclosed herein can be used to modulate (e.g., reduce) expression of a target plurality of genes in a cell (e.g., a cell of a subject or patient). The target plurality of genes may include any genes known to those of skill in the art. The target plurality of genes comprises at least two genes. In some embodiments, the target plurality of genes comprises at least 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, or 20 genes (optionally, no more than 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, or 30 genes), e.g., a first gene and a second gene, and optionally a third gene, a fourth gene, a fifth gene, a sixth gene, a seventh gene, an eighth gene, a ninth gene, a tenth gene, an eleventh gene, a twelfth gene, a thirteenth gene, a fourteenth gene, a fifteenth gene, a sixteenth gene, a seventeenth gene, an eighteenth gene, a nineteenth gene, and/or a twentieth gene. In some embodiments, the target plurality of genes comprises 2-20, 2-18, 2-16, 2-14, 2-12, 2-10, 2-9, 2-8, 2-7, 2-6, 2-5, 2-4, 2-3, 3-20, 3-18, 3-16, 3-14, 3-12, 3-10, 3-9, 3-8, 3-7, 3-6, 3-5, 3-4, 4-20, 4-18, 4-16, 4-14, 4-12, 4-10, 4-9, 4-8, 4-7, 4-6, 4-5, 5-20, 5-18, 5-16, 5-14, 5-12, 5-10, 5-9, 5-8, 5-7, 5-6, 6-20, 6-18, 6-16, 6-14, 6-12, 6-10, 6-9, 6-8, 6-7, 7-20, 7-18, 7-16, 7-14, 7-12, 7-10, 7-9, 7-8, 8-20, 8-18, 8-16, 8-14, 8-12, 8-10, 8-9, 9-20, 9-18, 9-16, 9-14, 9-12, 9-10, 10-20, 10-18, 10-16, 10-14, 10-12, 12-20, 12-18, 12-16, 12-14, 14-20, 14-18, 14-16, 16-20, 16-18 or 18-20 genes.
In some embodiments, two or more (e.g., all) of the target plurality of genes are associated with a disease or disorder in a subject (e.g., a mammal, such as a human, cow, horse, sheep, chicken, rat, mouse, cat, or dog). In some embodiments, the disease or disorder is an inflammatory disease, such as an immune-mediated inflammatory disease. In some embodiments, the disease or disorder is one or more of the following: rheumatoid arthritis, inflammatory arthritis, gout, asthma, neutrophilic skin disease, paw edema, acute respiratory distress syndrome (ARDS), COVID-19, psoriasis, inflammatory bowel disease, infection (e.g., caused by a pathogen, e.g., a bacterium, virus, or fungus), external injury (e.g., abrasion or foreign matter), the effects of radiation or chemical injury, osteoarthritis joint pain, inflammatory pain, acute pain, chronic pain, cystitis, bronchitis, dermatitis, cardiovascular disease, neurodegenerative disease, liver disease, lung disease, kidney disease, pain, swelling, stiffness, tenderness, redness, fever, or elevation of a biomarker associated with a disease state (e.g., a cytokine, immune receptor, or inflammatory marker). In some embodiments, the inflammatory disorder is associated with an infection, e.g., a viral infection, e.g., infection with SARS-CoV-2. In some embodiments, the inflammatory disorder is an autoimmune disorder. In some embodiments, the inflammatory disorder is associated with hypoxia. In some embodiments, the inflammatory disorder is associated with ARDS, hypoxia and/or sepsis. In some embodiments, the infection is an overlapping infection, e.g., caused by more than one pathogen, e.g., a first virus, bacterium, or fungus and a second virus, bacterium, or fungus. In some embodiments, the inflammatory disorder may alter lung cell composition, e.g., reduce AT2 cells and/or increase dendritic cells, macrophages, neutrophils, NK cells, fibroblasts, leukocytes, lymphatic endothelial cells, and/or vascular endothelial cells.
In some embodiments, the disorder is associated with one or more complications, such as respiratory tract infection, obesity, gastroesophageal reflux disease, skin lesions, and/or obstructive sleep apnea.
In some embodiments, two or more (e.g., all) of the genes of the target plurality are aberrantly expressed, e.g., overexpressed, in a cell, e.g., in a subject (e.g., a human subject).
In some embodiments, two or more (e.g., all) of the genes in the target plurality have related functions. Without wishing to be bound by theory, it is believed that genes with related functions are often located close to each other in the genome and are also often found (in whole or in part) in the same genomic complex (e.g., ASMC). Modulation (e.g., reduction) of expression of a target plurality of genes, wherein two or more (e.g., all) of the plurality of genes have related functions, can be efficiently and effectively achieved by targeting a genomic complex (e.g., ASMC) comprising the genes with related functions.
In some embodiments, one, two, three, or more (e.g., all) of the genes of the target plurality encode cytokines (e.g., chemokines), interleukins, transcription factors (e.g., interferon-regulated transcription factors), intercellular adhesion molecules (ICAMs), or interferon receptors. In some embodiments, two or more (e.g., all) of the genes of the target plurality encode cytokines, interleukins, transcription factors (e.g., interferon-regulated transcription factors), intercellular adhesion molecules (ICAMs), or interferon receptors. In some embodiments, the target plurality of genes are mammalian genes, e.g., mouse genes or human genes.
In some embodiments, two or more (e.g., all) of the genes in the target plurality have pro-inflammatory function. In some embodiments, two or more (e.g., all) of the genes in the target plurality can act as chemoattractants for immune cells (e.g., neutrophils). For example, genes having a pro-inflammatory function (also referred to herein as pro-inflammatory genes) include CXCL1, CXCL2, CXCL3, CXCL4, CXCL5, CXCL6, CXCL7, IL8, CXCL15, CCL2, CCL7, CCL9, IL1A, IL1B, CSF, IRF1, ICAM4, ICAM5, IFNAR2, IL10RB, or IFNGR2. In some embodiments, the target plurality of genes comprises two or more of CXCL1, CXCL2, CXCL3, CXCL4, CXCL5, CXCL6, CXCL7, IL8, CXCL15, CCL2, CCL7, CCL9, IL1A, IL1B, CSF2, IRF1, ICAM4, ICAM5, IFNAR2, IL10RB, or IFNGR2. In some embodiments, the plurality of genes comprises one or more genes of the human CXCL family. In some embodiments, the target plurality of genes comprises CXCL1 (e.g., a nucleic acid sequence encoding an RNA according to nm_002089 or a nucleic acid encoding a polypeptide according to P09341, or a mutant thereof), CXCL2 (e.g., a nucleic acid sequence encoding an RNA according to nm_001511 or a nucleic acid encoding a polypeptide according to P19875, or a mutant thereof), CXCL3 (e.g., a nucleic acid sequence encoding an RNA according to nm_002090 or a nucleic acid encoding a polypeptide according to P19876, or a mutant thereof), CXCL4 (e.g., a nucleic acid sequence encoding an RNA according to nm_002619 or nm_001363352, or a nucleic acid encoding a polypeptide according to P02776, or a mutant thereof), CXCL5 (e.g., a nucleic acid sequence encoding an RNA according to nm_002994 or a nucleic acid encoding a polypeptide according to P42830, or a mutant thereof), CXCL6 (e.g., a nucleic acid sequence encoding an RNA according to nm_002993 or a nucleic acid encoding a polypeptide according to P80162, or a mutant thereof), CXCL7 (e.g., a nucleic acid sequence encoding an RNA according to nm_002704 or an RNA according to P or a polypeptide 
according to P02584, or a mutant thereof), CXCL5 (e.g., a nucleic acid encoding a polypeptide according to nm_37584, or a polypeptide according to P001354840, or a mutant thereof). In some embodiments, the plurality of genes comprises one or more genes of the mouse CXCL family. In some embodiments, the target plurality of genes comprises CXCL1 (e.g., a nucleic acid sequence encoding an RNA according to NM_008176.3 or a nucleic acid encoding a polypeptide according to P12850, or a mutant thereof), CXCL2 (e.g., a nucleic acid sequence encoding an RNA according to NM_009140.2 or a nucleic acid encoding a polypeptide according to P10889, or a mutant thereof), CXCL3 (e.g., a nucleic acid sequence encoding an RNA according to NM_203320.3 or a nucleic acid encoding a polypeptide according to Q6W5C0, or a mutant thereof), CXCL4 (e.g., a nucleic acid sequence encoding an RNA according to NM_019932 or a nucleic acid encoding a polypeptide according to Q9Z126, or a mutant thereof), CXCL5 (e.g., a nucleic acid sequence encoding an RNA according to NM_009141.3 or a nucleic acid encoding a polypeptide according to P50228, or a mutant thereof), CXCL7 (e.g., a nucleic acid sequence encoding an RNA according to NM_023785.3 or a nucleic acid encoding a polypeptide according to Q9EQI5, or a mutant thereof), and CXCL15 (e.g., a nucleic acid sequence encoding an RNA according to NM_011339 or a nucleic acid encoding a Q WVL according to Q9 or a mutant thereof). In some embodiments, the target plurality of genes comprises CCL2, CCL7, CCL9, IL1A, and IL1B. In some embodiments, the target plurality of genes comprises CSF2, IRF1, ICAM4, and ICAM5. In some embodiments, the target plurality of genes comprises IFNAR2, IL10RB, and IFNGR2.
In some embodiments, inhibiting expression of two or more (e.g., all) of the target plurality of genes can modulate expression of other genes encoding proteins (e.g., cytokines), e.g., reduce CXCL expression and recruitment of CXCL to the site of inflammation, reduce the presence of GM-CSF and/or IL-6 at the site of inflammation.
In some embodiments, the target plurality of genes is part of a genomic complex (e.g., ASMC). As used herein, reference to a target plurality of genes as part of a genomic complex (e.g., ASMC) means that each gene of the plurality of genes is at least partially contained in the genomic complex (e.g., ASMC). The statement that a target plurality of genes is part of a genomic complex (e.g., ASMC) is used interchangeably with the statement that the genomic complex (e.g., ASMC) comprises the target plurality of genes. For example, a target plurality of genes may consist of two genes located adjacent to each other in the genome, wherein a first anchor sequence is located within the first gene and a second anchor sequence is located outside the second gene, on the side distal to the first gene. The ASMC formed by the association of the first and second anchor sequences will fully comprise the second gene and partially comprise the first gene; the plurality consisting of these two genes is therefore part of the ASMC. In some embodiments, each gene of the target plurality of genes is entirely within the genomic complex (e.g., ASMC) (e.g., no portion of the transcript encoding gene sequence is outside the genomic complex (e.g., ASMC)). In some embodiments, each gene of the target plurality of genes is partially within a genomic complex (e.g., ASMC) (e.g., portions of the transcript encoding gene sequence are outside of the genomic complex (e.g., ASMC)). In some embodiments, at least one gene of the target plurality of genes is entirely within the genomic complex, e.g., ASMC (e.g., no portion of the transcript encoding gene sequence is outside the genomic complex, e.g., ASMC), and at least one gene of the target plurality of genes is partially within the genomic complex, e.g., ASMC (e.g., some portion of the transcript encoding gene sequence is outside the genomic complex, e.g., ASMC).
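The containment categories described above (entirely within, partially within, or outside a genomic complex) can be illustrated with a short sketch. It assumes, purely for illustration, that the complex is approximated by the linear span between its two anchor sequences; the `Interval` type, the function names, and all coordinates are hypothetical and are not taken from the disclosure.

```python
from typing import NamedTuple, Sequence


class Interval(NamedTuple):
    """Half-open genomic interval [start, end) on a single chromosome."""
    start: int
    end: int


def containment(gene: Interval, loop: Interval) -> str:
    """Classify a gene relative to the span between two anchor sequences."""
    if gene.start >= loop.start and gene.end <= loop.end:
        return "entirely within"
    if gene.end <= loop.start or gene.start >= loop.end:
        return "outside"
    return "partially within"


def is_part_of_complex(genes: Sequence[Interval], loop: Interval) -> bool:
    """True when every gene is at least partially contained in the span."""
    return all(containment(g, loop) != "outside" for g in genes)


# Hypothetical coordinates mirroring the two-gene example in the text:
# the first anchor lies inside gene_1; the second lies beyond gene_2.
gene_1 = Interval(1_000, 5_000)
gene_2 = Interval(6_000, 9_000)
loop = Interval(3_000, 10_000)  # anchors assumed at 3,000 and 10,000

print(containment(gene_1, loop))                   # partially within
print(containment(gene_2, loop))                   # entirely within
print(is_part_of_complex([gene_1, gene_2], loop))  # True
```

Under these assumed coordinates, the two-gene plurality counts as part of the complex because neither gene falls wholly outside the anchor-to-anchor span.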
Genes in the target plurality of genes may include coding sequences, such as exons, and/or non-coding sequences, such as introns, 3' UTRs, or 5' UTRs. In some embodiments, a gene of the target plurality of genes is operably linked to a transcriptional control element. In some embodiments, the transcriptional control element of a gene in the target plurality of genes is also part of a genomic complex (e.g., ASMC) of which the gene is a part. Reference to a transcriptional control element operably linked to a gene that is part of a genomic complex (e.g., ASMC) can be understood in the same sense as the reference to a target plurality of genes described above. In some embodiments, each transcriptional control element operably linked to a gene in the target plurality of genes is entirely within the genomic complex (e.g., ASMC) (e.g., no portion of the transcriptional control element sequence is outside of the genomic complex (e.g., ASMC)). In some embodiments, each transcription control element in the target plurality of genes is partially within the genomic complex (e.g., ASMC) (e.g., portions of the transcription control element sequence are outside of the genomic complex (e.g., ASMC)). In some embodiments, each transcriptional control element of the target plurality of genes is entirely outside the genomic complex (e.g., ASMC) (e.g., each transcriptional control element sequence is outside the genomic complex (e.g., ASMC)). In some embodiments, at least one transcriptional control element operably linked to a gene of the target plurality of genes is entirely within the genomic complex (e.g., ASMC) (e.g., no portion of the transcriptional control element sequence is outside of the genomic complex such as ASMC).
In some embodiments, at least one transcriptional control element operably linked to another gene of the target plurality of genes is partially within the genomic complex (e.g., ASMC) (e.g., portions of the transcriptional control element sequence are outside of the genomic complex (e.g., ASMC)). In some embodiments, at least one transcriptional control element operably linked to a gene of the target plurality of genes is entirely outside the genomic complex (e.g., ASMC).
In some embodiments, the site-specific disruption agent or system targets a plurality of genes by binding to an anchor sequence (e.g., an anchor sequence that is part of a genomic complex (e.g., ASMC) comprising the target plurality of genes). In some embodiments, the targeting moiety binds to the anchor sequence. In some embodiments, the binding of a genomic complex component (e.g., a nucleation polypeptide) to an anchor sequence nucleates complex formation, e.g., formation of an anchor sequence-mediated linkage. Each anchor sequence-mediated linkage comprises one or more anchor sequences, e.g., a plurality of anchor sequences. In some embodiments, the anchor sequence-mediated linkage may be disrupted to alter, e.g., inhibit, expression of the target plurality of genes. Such disruption may regulate gene expression, for example, by altering the topology of the DNA, e.g., by regulating the ability of genes in the plurality of genes to interact with transcriptional control elements (e.g., enhancing and silencing/repressing sequences).
Targeting moieties suitable for use in a site-specific disruption agent or system can bind (e.g., specifically bind) to a site proximal to an anchor sequence (e.g., an anchor sequence that is part of a genomic complex (e.g., ASMC) comprising a target plurality of genes). As used herein, proximal refers to the proximity of two sites (e.g., nucleic acid sites) such that binding of a site-specific breaker or system at a first site and/or modification of the first site by a site-specific breaker will produce the same or substantially the same effect as binding and/or modification of the other site. For example, the targeting moiety may bind to a first site proximal to an anchor sequence (second site) that is part of a genomic complex (e.g., ASMC) comprising a target plurality of genes, and an effector moiety associated with the targeting moiety may epigenetically modify the first site such that the genomic complex (e.g., ASMC) comprising the anchor sequence is modified substantially the same as if the second site were bound and/or modified. In some embodiments, the site proximal to the target gene (e.g., an exon, an intron, or a splice site within the target gene), proximal to a transcription control element operably linked to the target gene, or proximal to an anchor sequence is within 5000, 4000, 3000, 2000, 1000, 900, 800, 700, 600, 500, 400, 300, 200, 100, 50, 40, 30, 25, 20, 15, 10, or 5 base pairs of (and optionally at least 5, 10, 20, 25, 50, 100, 200, or 300 base pairs from) the target gene, the transcription control element, or the anchor sequence. In some embodiments, the site proximal to the anchor sequence is a site less than 1000, 900, 800, 700, 600, 500, 400, 300, 200, 100, 50, 40, 30, 20, 10, or 5 base pairs (and optionally at least 5, 10, 20, 25, 50, 100, 200, or 300 base pairs) from the anchor sequence.
In some embodiments, the site proximal to the anchor sequence is a site less than 800, 700, 600, 500, 400, or 300 base pairs (and optionally at least 5, 10, 20, 25, 50, 100, 200, or 300 base pairs) from the anchor sequence.
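The distance criteria above ("less than" an upper bound and, optionally, "at least" a lower bound) can be expressed as a simple check. The following is a minimal sketch under the assumption that both the candidate site and the anchor sequence are given as half-open coordinate pairs on the same chromosome; the function names, default thresholds, and coordinates are illustrative only.

```python
def distance_bp(site_start: int, site_end: int,
                anchor_start: int, anchor_end: int) -> int:
    """Gap in base pairs between two half-open intervals (0 if they overlap)."""
    if site_end <= anchor_start:
        return anchor_start - site_end
    if anchor_end <= site_start:
        return site_start - anchor_end
    return 0


def is_proximal(site: tuple, anchor: tuple,
                max_bp: int = 300, min_bp: int = 0) -> bool:
    """True when the site is at least min_bp and less than max_bp from the anchor."""
    d = distance_bp(*site, *anchor)
    return min_bp <= d < max_bp


# Hypothetical coordinates: a 20 bp target site ending 180 bp upstream of an anchor.
anchor = (10_000, 10_020)
site = (9_800, 9_820)

print(distance_bp(*site, *anchor))            # 180
print(is_proximal(site, anchor, max_bp=300))  # True
print(is_proximal(site, anchor, max_bp=300, min_bp=200))  # False
```

The optional lower bound is modeled as `min_bp`; leaving it at 0 corresponds to embodiments that specify only an upper bound.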
Targeting moieties suitable for use in the site-specific disruption agents or systems described herein can bind, for example, to a site comprising at least 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, or 50 nucleotides or base pairs (and optionally no more than 50, 49, 48, 47, 46, 45, 44, 43, 42, 41, 40, 39, 38, 37, 36, 35, 34, 33, 32, 31, 30, 29, 28, 27, 26, 25, 24, 23, 22, 21, 20, 19, 18, 17, 16, 15, 14, 13, 12, 11, or 10 nucleotides or base pairs). In some embodiments, the targeting moiety binds to a site comprising 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, or 50 nucleotides or base pairs.
Genomic complexes
Genomic complexes related to the present disclosure include stable structures that comprise multiple polypeptide and/or nucleic acid (particularly ribonucleic acid) components and co-localize two or more genomic sequence elements (e.g., anchor sequences, promoters, and/or enhancer elements). In some embodiments, the related genomic complexes comprise an anchor sequence-mediated linkage (e.g., a genomic loop). In some embodiments, the genomic sequence elements co-localized in the genomic complex (i.e., in three-dimensional space) comprise transcriptional promoter and/or regulatory (e.g., enhancer or repressor) sequences. Alternatively or additionally, in some embodiments, the genomic sequence elements in the genomic complex comprise binding sites for one or more of CTCF, YY1, etc. In some embodiments, the genomic complex comprises a target plurality of genes. In some embodiments, two or more (e.g., all) of the genes of the target plurality are located in a single loop of the ASMC. In some embodiments, two or more (e.g., all) of the genes in the target plurality are located in different loops of the ASMC.
In some embodiments, the genomic complexes whose incidence is reduced according to the present disclosure comprise or consist of one or more components selected from the group consisting of: genomic sequence elements (e.g., anchor sequences, such as CTCF binding motifs, YY1 binding motifs, etc., which may be recognized by a nucleation component in some embodiments), one or more polypeptide components (e.g., one or more nucleation polypeptides, one or more transcription machinery proteins, and/or one or more transcription regulatory proteins), and/or one or more non-genomic nucleic acid components (e.g., non-coding RNAs and/or mRNAs, e.g., transcribed from genes associated with a genomic complex).
In some embodiments, the genomic complex component is part of a genomic complex that brings together two genomic sequence elements that are separated from each other on a chromosome, e.g., by interactions between multiple proteins and/or other components.
In some embodiments, the genomic sequence element is an anchor sequence to which one or more protein components of the complex bind; thus, in some embodiments, the genomic complex comprises an anchor sequence-mediated linkage. In some embodiments, the genomic sequence element comprises a CTCF binding motif, promoter, and/or enhancer. In some embodiments, the genomic sequence element comprises at least one or both of a promoter and/or a regulatory site (e.g., an enhancer). In some embodiments, complex formation nucleates at one or more genomic sequence elements and/or by binding of one or more protein components to one or more genomic sequence elements.
Genomic sequence elements involved in a genomic complex as described herein may be discontinuous with respect to each other. In some embodiments having discontinuous genomic sequence elements (e.g., anchor sequences, promoters, and/or transcriptional regulatory sequences), a first genomic sequence element (e.g., an anchor sequence, a promoter, or a transcriptional regulatory sequence) may be separated from a second genomic sequence element (e.g., an anchor sequence, a promoter, or a transcriptional regulatory sequence) by about 500bp to about 500Mb, about 750bp to about 200Mb, about 1kb to about 100Mb, about 25kb to about 50Mb, about 50kb to about 1Mb, about 100kb to about 750kb, about 150kb to about 500kb, or about 175kb to about 500kb. In some embodiments, the first genomic sequence element (e.g., an anchor sequence, a promoter, or a transcriptional regulatory sequence) is separated from the second genomic sequence element (e.g., an anchor sequence, a promoter, or a transcriptional regulatory sequence) by about 500bp, 600bp, 700bp, 800bp, 900bp, 1kb, 5kb, 10kb, 15kb, 20kb, 25kb, 30kb, 35kb, 40kb, 45kb, 50kb, 55kb, 60kb, 65kb, 70kb, 75kb, 80kb, 85kb, 90kb, 95kb, 100kb, 125kb, 150kb, 175kb, 200kb, 225kb, 250kb, 275kb, 300kb, 350kb, 400kb, 500kb, 600kb, 700kb, 800kb, 900kb, 1Mb, 2Mb, 3Mb, 4Mb, 5Mb, 6Mb, 7Mb, 8Mb, 9Mb, 10Mb, 15Mb, 20Mb, 25Mb, 50Mb, 75Mb, 100Mb, 200Mb, 300Mb, 400Mb, 500Mb, or any size therebetween.
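The separation ranges above mix bp, kb, and Mb units. As a hedged illustration, the sketch below converts such size strings to base pairs and tests whether the distance between two elements falls in a stated range. The helper names are hypothetical, the positions are invented, and treating "about X to about Y" as an inclusive range is an assumption made only for this example.

```python
import re

# Unit multipliers for the size notation used in the text (e.g., "500bp", "1kb", "500Mb").
_UNITS = {"bp": 1, "kb": 1_000, "Mb": 1_000_000}


def to_bp(size: str) -> int:
    """Convert a size string such as '175kb' to a number of base pairs."""
    m = re.fullmatch(r"\s*(\d+(?:\.\d+)?)\s*(bp|kb|Mb)\s*", size)
    if m is None:
        raise ValueError(f"unrecognized size: {size!r}")
    return int(float(m.group(1)) * _UNITS[m.group(2)])


def separation_in_range(pos_a: int, pos_b: int, low: str, high: str) -> bool:
    """True when two positions are separated by a distance within [low, high]."""
    return to_bp(low) <= abs(pos_b - pos_a) <= to_bp(high)


# Hypothetical positions of two anchor sequences on one chromosome.
first_anchor, second_anchor = 1_200_000, 1_450_000

print(to_bp("500Mb"))  # 500000000
print(separation_in_range(first_anchor, second_anchor, "150kb", "500kb"))  # True
print(separation_in_range(first_anchor, second_anchor, "500bp", "100kb"))  # False
```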
Anchor sequence-mediated linkage
In some embodiments, the genomic complexes related to the present disclosure are or comprise an anchor sequence-mediated linkage (ASMC). In some embodiments, an anchor sequence-mediated linkage is formed when one or more nucleation polypeptides bind to anchor sequences in the genome, and interactions between and among these proteins and, optionally, one or more other components form a linkage in which the anchor sequences are physically co-localized. In many of the embodiments described herein, one or more genes are associated with an anchor sequence-mediated linkage; in such embodiments, the anchor sequence-mediated linkage typically includes one or more anchor sequences, one or more genes, and one or more transcriptional control sequences, such as an enhancing or silencing sequence. In some embodiments, the transcriptional control sequence is located within, partially within, or outside the anchor sequence-mediated linkage. In some embodiments, the ASMC comprises internal enhancing sequences, such as enhancers. In some embodiments, the ASMC comprises a target plurality of genes.
In some embodiments, a genomic complex (e.g., anchor sequence-mediated linkage) as described herein is or comprises a genomic loop, such as an intrachromosomal loop. In certain embodiments, a genomic complex (e.g., anchor sequence-mediated linkage) as described herein comprises a plurality of genomic loops. The one or more genomic loops can include a first anchor sequence, a nucleic acid sequence, a transcription control sequence, and a second anchor sequence. In some embodiments, at least one genomic loop comprises, in order, a first anchor sequence, a transcription control sequence, and a second anchor sequence; or a first anchor sequence, a nucleic acid sequence, and a second anchor sequence. In still other embodiments, one or both of the nucleic acid sequence and the transcription control sequence are located within a genomic loop. In still other embodiments, one or both of the nucleic acid sequence and the transcription control sequence are located outside the genomic loop. In some embodiments, one or more of the genomic loops comprises a transcriptional control sequence. In some embodiments, the genomic complex (e.g., anchor sequence-mediated linkage) comprises a TATA box, CAAT box, GC box, or CAP site.
In some embodiments, the anchor sequence-mediated linkage comprises a plurality of genomic loops; in some such embodiments, the anchor sequence-mediated linkage comprises at least one of an anchor sequence, a nucleic acid sequence, and a transcription control sequence in one or more genomic loops.
Types of loops
In some embodiments, the genomic loop comprises one or more, e.g., 2, 3, 4, 5, or more genes, e.g., a target plurality of genes. In some embodiments, two or more, e.g., 2, 3, 4, 5, or more genes in the target plurality of genes are transcribed in the same direction. In some embodiments, all genes in the target plurality of genes are transcribed in the same direction.
In some embodiments, the disclosure provides methods of modulating (e.g., reducing) expression of a target plurality of genes in a loop, the methods comprising inhibiting, dissociating, degrading, and/or modifying a genomic complex that achieves co-localization of genomic sequences that are outside of, not part of, or comprised in: (i) Genes whose expression is modulated (e.g., a target plurality of genes); and/or (ii) one or more associated transcriptional control sequences that affect transcription of genes whose expression is modulated.
In some embodiments, the disclosure provides methods of modulating (e.g., reducing) transcription of a target plurality of genes, the methods comprising inhibiting formation of and/or destabilizing complexes that achieve co-localization of genomic sequences that are discontinuous with: (i) Genes whose expression is modulated (e.g., a target plurality of genes); and/or (ii) related transcriptional control sequences that affect transcription of genes whose expression is modulated.
In some embodiments, the anchor sequence-mediated linkage is associated with one or more, e.g., 2, 3, 4, 5, or more transcription control sequences. In some embodiments, a gene of the target plurality of genes (e.g., one, two, or more, such as all, of the target plurality of genes) is discontinuous with one or more transcription control sequences. In some embodiments where the gene is discontinuous with its one or more transcription control sequences, the gene may be separated from the one or more transcription control sequences by about 100bp to about 500Mb, about 500bp to about 200Mb, about 1kb to about 100Mb, about 25kb to about 50Mb, about 50kb to about 1Mb, about 100kb to about 750kb, about 150kb to about 500kb, or about 175kb to about 500kb. In some embodiments, the gene is separated from the transcription control sequence by about 100bp, 300bp, 500bp, 600bp, 700bp, 800bp, 900bp, 1kb, 5kb, 10kb, 15kb, 20kb, 25kb, 30kb, 35kb, 40kb, 45kb, 50kb, 55kb, 60kb, 65kb, 70kb, 75kb, 80kb, 85kb, 90kb, 95kb, 100kb, 125kb, 150kb, 175kb, 200kb, 225kb, 250kb, 275kb, 300kb, 350kb, 400kb, 500kb, 600kb, 700kb, 800kb, 900kb, 1Mb, 2Mb, 3Mb, 4Mb, 5Mb, 6Mb, 7Mb, 8Mb, 9Mb, 10Mb, 15Mb, 20Mb, 25Mb, 50Mb, 75Mb, 100Mb, 200Mb, 300Mb, 400Mb, 500Mb, or any size therebetween.
Anchor sequence
Typically, the anchor sequence is a genomic sequence element to which a component of the genomic complex (e.g., a nucleating polypeptide) specifically binds. In some embodiments, binding of the genomic complex component to the anchor sequence nucleates complex formation.
Each anchor sequence-mediated linkage comprises one or more anchor sequences, e.g., a plurality of anchor sequences. In some embodiments, the anchor sequence can be manipulated or altered to form and/or stabilize a naturally occurring loop, to form one or more new loops (e.g., to form an exogenous loop or to form a non-naturally occurring loop with an exogenous or altered anchor sequence), or to inhibit the formation of, or destabilize, a naturally occurring or exogenous loop. Such changes may be used to regulate gene expression, for example, by altering the topology of the DNA, e.g., by regulating the ability of the target gene to interact with gene regulatory and control factors (e.g., enhancing and silencing/repressing sequences).
In some embodiments, chromatin structure is modified by substitution, addition, or deletion of one or more nucleotides within an anchor sequence-mediated linkage. In some embodiments, the chromatin structure is modified by substitution, addition, or deletion of one or more nucleotides within the anchor sequence of the anchor sequence-mediated linkage.
In some embodiments, the anchor sequence comprises a common nucleotide sequence, such as a CTCF binding motif: N(T/C/G)N(G/A/T)CC(A/T/G)(C/G)(C/T/A)AG(G/A)(G/T)GG(C/A/T)(G/A)(C/G)(C/T/A)(G/A/C) (SEQ ID NO: 1), wherein N is any nucleotide.
The CTCF binding motif may also be in the opposite orientation, e.g., (G/A/C)(C/T/A)(C/G)(G/A)(C/A/T)GG(G/T)(G/A)GA(C/T/A)(C/G)(A/T/G)CC(G/A/T)N(T/C/G)N (SEQ ID NO: 2). In some embodiments, the anchor sequence comprises SEQ ID NO: 1 or SEQ ID NO: 2 or a sequence that is at least 75%, at least 80%, at least 90%, at least 95%, at least 96%, at least 97%, at least 98%, or at least 99% identical to SEQ ID NO: 1 or SEQ ID NO: 2.
In some embodiments, the anchor sequence comprises a CTCF binding motif, a USF1 binding motif, a YY1 binding motif, a TAF3 binding motif, or a ZNF143 binding motif. In some embodiments, the anchor sequence comprises a nucleation polypeptide binding motif, such as a YY1 binding motif: CCGCCATNTT (SEQ ID NO: 3), wherein N is any nucleotide. The YY1 binding motif may also be in the opposite orientation, e.g., AANATGGCGG (SEQ ID NO: 4), wherein N is any nucleotide. In some embodiments, the anchor sequence comprises SEQ ID NO: 3 or SEQ ID NO: 4 or a sequence that is at least 75%, at least 80%, at least 90%, at least 95%, at least 96%, at least 97%, at least 98%, or at least 99% identical to SEQ ID NO: 3 or SEQ ID NO: 4.
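The degenerate motif notation used for the CTCF and YY1 binding motifs, in which N is any nucleotide and a parenthesized group such as (T/C/G) lists the permitted bases, maps naturally onto a regular expression. The sketch below is one illustrative way to scan a sequence for exact matches to such a motif; it does not implement the percent-identity embodiments, the function names are hypothetical, and the test sequences are invented for the example.

```python
import re

# Motifs as written in the text (SEQ ID NO: 1 and SEQ ID NO: 3).
CTCF_MOTIF = ("N(T/C/G)N(G/A/T)CC(A/T/G)(C/G)(C/T/A)AG(G/A)(G/T)GG"
              "(C/A/T)(G/A)(C/G)(C/T/A)(G/A/C)")
YY1_MOTIF = "CCGCCATNTT"

_COMPLEMENT = str.maketrans("ACGTN", "TGCAN")


def motif_to_regex(motif: str) -> re.Pattern:
    """Translate N and (X/Y/Z) groups into regex character classes."""
    parts = []
    for token in re.findall(r"\([ACGT/]+\)|[ACGTN]", motif):
        if token == "N":
            parts.append("[ACGT]")
        elif token.startswith("("):
            parts.append("[" + token[1:-1].replace("/", "") + "]")
        else:
            parts.append(token)
    return re.compile("".join(parts))


def find_motif(motif: str, sequence: str) -> list:
    """Return every 0-based start position where the motif matches (overlaps allowed)."""
    pattern = motif_to_regex(motif)
    seq = sequence.upper()
    return [i for i in range(len(seq)) if pattern.match(seq, i)]


def reverse_complement(seq: str) -> str:
    """Reverse complement of a plain A/C/G/T/N string (no parenthesized groups)."""
    return seq.translate(_COMPLEMENT)[::-1]


# Invented test sequences that embed one motif instance each at position 2.
print(find_motif(YY1_MOTIF, "GGCCGCCATGTTAA"))             # [2]
print(find_motif(CTCF_MOTIF, "TTATAGCCACCAGGGGGCGCCGAA"))  # [2]
print(reverse_complement(YY1_MOTIF))                       # AANATGGCGG
```

Note that the reverse complement of the YY1 motif reproduces the opposite-orientation sequence given in the text as SEQ ID NO: 4.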
In some embodiments, the anchor sequence-mediated linkage comprises at least a first anchor sequence and a second anchor sequence. For example, in some embodiments, the first anchor sequence and the second anchor sequence may each comprise a common nucleotide sequence, e.g., each comprise a CTCF binding motif. In some embodiments, the first anchor sequence and the second anchor sequence may each comprise a USF1 binding motif. In some embodiments, the first anchor sequence and the second anchor sequence may each comprise a YY1 binding motif. In some embodiments, the first anchor sequence and the second anchor sequence may each comprise a TAF3 binding motif. In some embodiments, the first anchor sequence and the second anchor sequence may each comprise a ZNF143 binding motif.
In some embodiments, the first anchor sequence and the second anchor sequence comprise different sequences, e.g., the first anchor sequence comprises a CTCF binding motif and the second anchor sequence comprises an anchor sequence other than a CTCF binding motif. In some embodiments, the first anchor sequence comprises a CTCF binding motif, a USF1 binding motif, a YY1 binding motif, a TAF3 binding motif, or a ZNF143 binding motif, and the second anchor sequence comprises a CTCF binding motif, a USF1 binding motif, a YY1 binding motif, a TAF3 binding motif, or a ZNF143 binding motif, wherein the first and second anchor sequences do not both comprise the same one of these binding motifs. In some embodiments, each anchor sequence comprises a common nucleotide sequence (e.g., CTCF binding motif, USF1 binding motif, YY1 binding motif, TAF3 binding motif, or ZNF143 binding motif) and one or more flanking nucleotides on one or both sides of the common nucleotide sequence.
The two anchor sequences that can form the linkage (e.g., each anchor sequence comprising a CTCF binding motif, USF1 binding motif, YY1 binding motif, TAF3 binding motif, or ZNF143 binding motif) can be present in the genome in any orientation, e.g., in the same orientation (tandem), either 5'-3' (left tandem) or 3'-5' (right tandem), or in a convergent orientation, where one anchor sequence is in a 5'-3' orientation and the other is in a 3'-5' orientation.
The two CTCF binding motifs that can form a linkage (e.g., contiguous or non-contiguous CTCF binding motifs) can be present in the genome in any orientation, e.g., 5'-3' (left tandem, e.g., two CTCF binding motifs comprising SEQ ID NO: 1) or 3'-5' (right tandem, e.g., two CTCF binding motifs comprising SEQ ID NO: 2), or in a convergent orientation, wherein one CTCF binding motif comprises SEQ ID NO: 1 and the other comprises SEQ ID NO: 2. CTCFBSDB 2.0, a database of CTCF binding motifs and genome organization (insulatordb.uthsc.edu/ on the world wide web), can be used to identify CTCF binding motifs associated with target genes.
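The orientation categories described above can be summarized in a small classifier. This is a sketch only: it assumes each anchor sequence has already been assigned a strand symbol, with "+" standing for the 5'-3' orientation (e.g., a match to SEQ ID NO: 1) and "-" for the 3'-5' orientation (e.g., a match to SEQ ID NO: 2). The function name and the strand encoding are hypothetical.

```python
def classify_anchor_pair(first_strand: str, second_strand: str) -> str:
    """Name the relative orientation of two anchor sequences.

    '+' denotes the 5'-3' orientation and '-' the 3'-5' orientation.
    Following the text, any mixed pair is reported as 'convergent'; the
    text does not distinguish which of the two anchors is upstream.
    """
    valid = {"+", "-"}
    if first_strand not in valid or second_strand not in valid:
        raise ValueError("strand must be '+' (5'-3') or '-' (3'-5')")
    if first_strand == "+" and second_strand == "+":
        return "left tandem"
    if first_strand == "-" and second_strand == "-":
        return "right tandem"
    return "convergent"


print(classify_anchor_pair("+", "+"))  # left tandem
print(classify_anchor_pair("-", "-"))  # right tandem
print(classify_anchor_pair("+", "-"))  # convergent
```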
In some embodiments, the anchor sequence comprises a CTCF binding motif associated with a target plurality of genes, wherein the target plurality of genes is associated with a disease, disorder, and/or condition. In some embodiments, the anchor sequence comprises a CTCF binding motif associated with a target plurality of genes, wherein a gene of the target plurality of genes has a related function. In some embodiments, the anchor sequence comprises a CTCF binding motif associated with a target plurality of genes, wherein the target plurality of genes (e.g., two or more of the plurality of genes, e.g., all) are aberrantly expressed in a cell of the subject.
In some embodiments, chromatin structure may be modified by substitution, addition, or deletion of one or more nucleotides within at least one anchor sequence (e.g., a nucleation polypeptide binding motif). One or more nucleotides may be specifically targeted, e.g., targeted changes, for substitution, addition, or deletion within the anchor sequence (e.g., the nucleation polypeptide binding motif).
In some embodiments, the anchor sequence-mediated linkage can be altered by altering the orientation of at least one common nucleotide sequence (e.g., a nucleating polypeptide binding motif). In some embodiments, the anchor sequence comprises a nucleating polypeptide binding motif (e.g., a CTCF binding motif), and the targeting moiety introduces an alteration in at least one nucleating polypeptide binding motif, e.g., an alteration that changes its binding affinity for the nucleating polypeptide.
In some embodiments, the anchor sequence-mediated linkage can be altered by introducing an exogenous anchor sequence. In some embodiments, a non-naturally occurring or exogenous anchor sequence is added to destabilize or inhibit formation of a naturally occurring anchor sequence-mediated linkage, e.g., by inducing non-naturally occurring loop formation, altering (e.g., reducing) transcription of the nucleic acid sequence.
Other compositions
Nucleic acids and vectors
The present disclosure further relates in part to nucleic acids encoding the site-specific disrupters or systems described herein. In some embodiments, the site-specific breaker can be provided by a composition comprising a nucleic acid encoding the site-specific breaker (e.g., a targeting moiety and/or an effector moiety of the site-specific breaker), wherein the nucleic acid is associated with sufficient other sequences to achieve expression of the site-specific breaker in a system of interest (e.g., in a particular cell, tissue, organism, etc.). In some embodiments, the system may be provided by a composition comprising: a first nucleic acid encoding a first site-specific breaker, e.g., a first targeting moiety and/or a first effector moiety of the first site-specific breaker; and a second nucleic acid encoding a second site-specific breaker, e.g., a second targeting moiety and/or a second effector moiety of the second site-specific breaker, wherein the first and/or second nucleic acids are associated with sufficient other sequences to effect expression of the site-specific breaker in a system of interest (e.g., in a particular cell, tissue, organism, etc.).
In some particular embodiments, the disclosure provides nucleic acid compositions encoding a site-specific breaker or polypeptide or nucleic acid portion thereof (e.g., comprising a targeting portion and/or effector portion of a polypeptide and/or nucleic acid). In some particular embodiments, the present disclosure provides compositions of nucleic acids encoding a first site-specific breaker and a second site-specific breaker, or polypeptide or nucleic acid portions thereof (e.g., targeting moieties and/or effector moieties, comprising polypeptides and/or nucleic acids). In some such embodiments, the provided nucleic acids can include DNA, RNA, or any other nucleic acid portion or entity described herein, and can be prepared by any technique (e.g., synthesis, cloning, amplification, in vitro or in vivo transcription, etc.) as described herein or otherwise available in the art. In some embodiments, provided nucleic acids encoding one or more site-specific disruption agents or polypeptides or nucleic acid portions thereof can be operably associated with one or more replication, integration, and/or expression signals suitable and/or sufficient to effect integration, replication, and/or expression of the provided nucleic acids in a system of interest (e.g., in a particular cell, tissue, organism, etc.).
In some embodiments, a composition for delivering a site-specific breaker or system described herein comprises a vector, e.g., a viral vector, comprising one or more nucleic acids encoding the site-specific breaker or polypeptide or nucleic acid portion thereof. In some embodiments, the first vector comprises a first nucleic acid encoding a first site-specific breaker, and the second vector comprises a second nucleic acid encoding a second site-specific breaker. In some embodiments, a single vector comprises a first nucleic acid encoding a first site-specific breaker and a second nucleic acid encoding a second site-specific breaker.
In some embodiments, the composition for delivering a site-specific breaker or system described herein is or comprises RNA, e.g., mRNA, comprising one or more nucleic acids encoding one or more components of the site-specific breaker or polypeptide or nucleic acid portion thereof.
The nucleic acids described herein or nucleic acids encoding the proteins described herein may be incorporated into vectors. Vectors, including those derived from retroviruses such as lentiviruses, are suitable tools for achieving long-term gene transfer, as they allow long-term, stable integration of a transgene and its propagation in daughter cells. Examples of vectors include expression vectors, replication vectors, probe-generating vectors, and sequencing vectors. The expression vector may be provided to the cell in the form of a viral vector. Viral vector technology is well known in the art and is described in various virology and molecular biology manuals. Viruses that may be used as vectors include, but are not limited to, retroviruses, adenoviruses, adeno-associated viruses, herpesviruses, and lentiviruses. In general, suitable vectors contain an origin of replication functional in at least one organism, a promoter sequence, convenient restriction endonuclease sites, and one or more selectable markers.
Expression of natural or synthetic nucleic acids is typically achieved by operably linking the nucleic acid encoding the gene of interest to a promoter and incorporating the construct into an expression vector. Vectors may be suitable for replication and integration in eukaryotes. Typical cloning vectors contain transcription and translation terminators, initiation sequences, and promoters, and can be used for expression of the desired nucleic acid sequence.
Additional promoter elements, such as enhancers, may regulate the frequency of transcription initiation. Typically, these elements are located in the region 30-110 bp upstream of the transcription start site, although a number of promoters have recently been shown to also contain functional elements downstream of the start site. The spacing between promoter elements is frequently flexible, so that promoter function is preserved when elements are inverted or moved relative to one another. In the thymidine kinase (tk) promoter, the spacing between promoter elements may be increased to 50 bp before activity begins to decline. Depending on the promoter, it appears that individual elements may function either cooperatively or independently to activate transcription.
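As a rough illustration of the positional statement above, the region 30-110 bp upstream of a transcription start site can be converted into sequence coordinates as sketched below. The function name `upstream_window`, the 1-based inclusive coordinate convention, and the example start-site position are assumptions made for illustration only and are not taken from the disclosure.

```python
def upstream_window(tss, strand, near=30, far=110):
    """Return (start, end), 1-based inclusive, of the region lying
    `near` to `far` bp upstream of a transcription start site (TSS).

    Assumed convention: "upstream" means lower coordinates on the '+'
    strand and higher coordinates on the '-' strand.
    """
    if strand not in ("+", "-"):
        raise ValueError("strand must be '+' or '-'")
    if not 0 < near <= far:
        raise ValueError("require 0 < near <= far")
    if strand == "+":
        start, end = tss - far, tss - near
    else:
        start, end = tss + near, tss + far
    if start < 1:
        raise ValueError("window extends past the start of the sequence")
    return start, end


# Hypothetical TSS at position 1000 of a reference sequence
print(upstream_window(1000, "+"))  # (890, 970)
print(upstream_window(1000, "-"))  # (1030, 1110)
```

The default bounds simply mirror the 30-110 bp range stated above; for a promoter with functional elements downstream of the start site, a different window would be needed.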
One example of a suitable promoter is the immediate early cytomegalovirus (CMV) promoter sequence. This promoter sequence is a strong constitutive promoter sequence capable of driving high levels of expression of any polynucleotide sequence to which it is operably linked. Another example of a suitable promoter is elongation factor-1α (EF-1α). However, other constitutive promoter sequences may also be used, including, but not limited to, the simian virus 40 (SV40) early promoter, mouse mammary tumor virus (MMTV), the human immunodeficiency virus (HIV) long terminal repeat (LTR) promoter, the MoMuLV promoter, an avian leukemia virus promoter, an Epstein-Barr virus immediate early promoter, and a Rous sarcoma virus promoter, as well as human gene promoters (such as, but not limited to, the actin promoter, the myosin promoter, the hemoglobin promoter, and the creatine kinase promoter).
The present disclosure should not be construed as limited to the use of any particular promoter or class of promoters (e.g., constitutive promoters). For example, in some embodiments, inducible promoters are considered to be part of the disclosure. In some embodiments, the use of an inducible promoter provides a molecular switch capable of turning on the expression of a polynucleotide sequence operably linked thereto (when such expression is desired). In some embodiments, the use of an inducible promoter provides a molecular switch that can turn off expression (when expression is not desired). Examples of inducible promoters include, but are not limited to, metallothionein promoters, glucocorticoid promoters, progesterone promoters, and tetracycline promoters.
In some embodiments, the expression vector to be introduced may also contain a selectable marker gene or a reporter gene or both, to facilitate identification and selection of expressing cells from the population of cells sought to be transfected or infected through viral vectors. In some aspects, the selectable marker may be carried on a separate piece of DNA and used in a co-transfection procedure. Both the selectable marker and the reporter gene may be flanked by appropriate transcriptional control sequences to enable expression in the host cell. Useful selectable markers include, for example, antibiotic resistance genes, such as neo and the like.
In some embodiments, a reporter gene may be used to identify potentially transfected cells and/or to assess the function of transcriptional control sequences. In general, a reporter gene is a gene that is not present in or expressed by the recipient source and that encodes a polypeptide whose expression is manifested by some readily detectable property (e.g., enzymatic activity or visible fluorescence). Expression of the reporter gene is measured at a suitable time after the DNA has been introduced into the recipient cells. Suitable reporter genes may include genes encoding luciferase, beta-galactosidase, chloramphenicol acetyl transferase, secreted alkaline phosphatase, or green fluorescent protein (e.g., Ui-Tei et al., 2000, FEBS Letters 479:79-82). Suitable expression systems are well known and may be prepared using known techniques or obtained commercially. In general, the construct with the minimal 5' flanking region that shows the highest level of reporter gene expression is identified as the promoter. Such promoter regions may be linked to a reporter gene and used to evaluate agents for their ability to modulate promoter-driven transcription.
Cells
The present disclosure further relates in part to cells comprising the site-specific disruption agents or systems described herein. Any cell known to those of skill in the art, such as a cell line, e.g., a cell line suitable for expression of a recombinant polypeptide, is suitable for comprising the site-specific disruption agent described herein. In some embodiments, cells, such as cell lines, may be used to express a site-specific disruption agent, a system comprising one or more site-specific disruption agents, or a nucleic acid or polypeptide portion thereof. In some embodiments, cells, such as cell lines, can be used to express or amplify nucleic acids, such as vectors, encoding the site-specific disruption agent. In some embodiments, cells, such as cell lines, can be used to express or amplify one or more nucleic acids, such as a vector encoding a first site-specific breaker and a vector encoding a second site-specific breaker. In some embodiments, the cell comprises a nucleic acid encoding a site-specific breaker described herein. In some embodiments, the cell comprises a first nucleic acid encoding a first site-specific breaker and a second nucleic acid encoding a second site-specific breaker as described herein.
In some embodiments, the cell comprises a nucleic acid encoding a site-specific breaker or a nucleic acid or polypeptide portion thereof, and the nucleic acid is integrated into the genomic DNA of the cell. In some embodiments, the cell comprises a nucleic acid encoding a site-specific breaker or a nucleic acid or polypeptide portion thereof, and the nucleic acid is placed on a vector. In some embodiments, the cell comprises a first nucleic acid encoding a first site-specific breaker and a second nucleic acid encoding a second site-specific breaker or a nucleic acid or polypeptide portion thereof, and the first and second nucleic acids are integrated into the genomic DNA of the cell. In some embodiments, the cell comprises a first nucleic acid encoding a first site-specific breaker and a second nucleic acid encoding a second site-specific breaker or a nucleic acid or polypeptide portion thereof, and the first and second nucleic acids are placed on a vector.
Examples of cells that may contain and/or express the site-specific disruption agent or system described herein include, but are not limited to, hepatocytes, stellate cells, Kupffer cells, neuronal cells, endothelial cells, alveolar cells, epithelial cells, muscle cells, synoviocytes, chondrocytes, immune cells, and lymphocytes.
The present disclosure further relates in part to cells made by the methods or processes described herein. In some embodiments, the disclosure provides a cell produced by: providing a site-specific breaker as described herein, providing a cell, and contacting the cell with the site-specific breaker (or a nucleic acid encoding the site-specific breaker, or a composition comprising the site-specific breaker or nucleic acid). In some embodiments, the disclosure provides a cell produced by: providing a system as described herein, providing a cell, and contacting the cell with the system (or a first nucleic acid encoding a first site-specific breaker and a second nucleic acid encoding a second site-specific breaker, or a composition comprising the system or nucleic acids). Without wishing to be bound by theory, cells contacted with the site-specific disruption agents or systems described herein may exhibit, as compared to a similar cell that has not been contacted with the site-specific disruption agent: a reduction in expression of a target plurality of genes; a modification of an epigenetic marker associated with the target plurality of genes, with a transcriptional control element operably linked to the target plurality of genes, or with an anchor sequence proximal to the target plurality of genes or associated with an anchor sequence-mediated linkage comprising the target plurality of genes; a genetic modification of a gene of the target plurality of genes, of a transcriptional control element operably linked to the target plurality of genes, or of an anchor sequence proximal to the target plurality of genes or associated with an anchor sequence-mediated linkage comprising the target plurality of genes; and/or a decrease in the level (e.g., absence) of a genomic complex (e.g., ASMC) comprising the target plurality of genes.
In some embodiments, the cells exhibiting the decrease in expression of the target plurality of genes, modification of the epigenetic marker, and/or genetic modification do not comprise a site-specific breaker. The reduction in expression, modification of the epigenetic marker, and/or genetic modification of the target plurality of genes may last, for example, at least 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, or 24 hours, or at least 1, 2, 3, 4, 5, 6, 7, 10, or 14 days, or at least 1, 2, 3, 4, or 5 weeks, or at least 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, or 12 months, or at least 1, 2, 3, 4, or 5 years (e.g., indefinitely) after contact with the site-specific disruption agent. In some embodiments, cells previously contacted with the site-specific disruption agent maintain the decrease in expression of the target plurality of genes, modification of the epigenetic marker, and/or genetic modification for, e.g., at least 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, or 24 hours, or at least 1, 2, 3, 4, 5, 6, 7, 10, or 14 days, or at least 1, 2, 3, 4, or 5 weeks, or at least 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, or 12 months, or at least 1, 2, 3, 4, or 5 years (e.g., indefinitely) after the site-specific disruption agent is no longer present in the cells. In some embodiments, the cell is a mammalian cell, such as a human cell. In some embodiments, the cell is a somatic cell and/or a primary cell.
Kits
The disclosure further relates in part to a kit comprising a site-specific breaker, a system, a nucleic acid encoding a site-specific breaker, or a first nucleic acid encoding a first site-specific breaker and a second nucleic acid encoding a second site-specific breaker as described herein. In some embodiments, the kit comprises a site-specific breaker, system, or nucleic acid encoding the same, and instructions for using the site-specific breaker or system. In some embodiments, the kit comprises a nucleic acid encoding a site-specific breaker or a component thereof (e.g., a polypeptide or nucleic acid portion of a site-specific breaker) and instructions for using the nucleic acid and/or the site-specific breaker. In some embodiments, the kit comprises a cell comprising a nucleic acid encoding a site-specific breaker or a component thereof (e.g., a polypeptide or nucleic acid portion of a site-specific breaker) and instructions for using the cell, nucleic acid, and/or site-specific breaker. In some embodiments, the kit comprises a first nucleic acid encoding a first site-specific breaker and a second nucleic acid encoding a second site-specific breaker or a component thereof (e.g., a polypeptide or nucleic acid portion of the first and second site-specific breakers) and instructions for using the nucleic acids and/or the system. In some embodiments, the kit comprises a cell comprising a first nucleic acid encoding a first site-specific breaker and a second nucleic acid encoding a second site-specific breaker or a component thereof (e.g., a polypeptide or nucleic acid portion of the first and second site-specific breakers).
In some embodiments, the kit comprises a unit dose of a site-specific breaker, or a unit dose of a nucleic acid encoding a site-specific breaker described herein, e.g., a vector. In some embodiments, the kit comprises a unit dose of the system or unit dose of the first and second nucleic acids, e.g., vectors, encoding the first site-specific breaker and the second site-specific breaker described herein.
Method for preparing site-specific disrupters
In some embodiments, the site-specific breaker or system comprises one or more proteins and thus can be produced by methods of producing proteins. As will be appreciated by the skilled artisan, methods of preparing a protein or polypeptide (which may be included in a modulator as described herein) are routine in the art. See, generally, Smales & James (Eds.), Therapeutic Proteins: Methods and Protocols (Methods in Molecular Biology), Humana Press (2005); and Crommelin, Sindelar & Meibohm (Eds.), Pharmaceutical Biotechnology: Fundamentals and Applications, Springer (2013).
Proteins or polypeptides of the compositions of the present disclosure may be biochemically synthesized by employing standard solid phase techniques. Such methods include exclusive solid phase synthesis, partial solid phase synthesis methods, fragment condensation, and classical solution synthesis. These methods can be used when the peptide is relatively short (e.g., 10 kDa) and/or when it cannot be produced by recombinant techniques (i.e., it is not encoded by a nucleic acid sequence) and therefore involves different chemistry.
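To give a sense of scale for the "relatively short (e.g., 10 kDa)" peptides mentioned above, the following sketch converts between molecular mass and approximate chain length. It assumes the common rule-of-thumb average of about 110 Da per amino acid residue; this value, and the helper names `approx_length` and `approx_mass_kda`, are illustrative assumptions rather than figures from the disclosure, and actual masses depend on amino acid composition.

```python
AVG_RESIDUE_MASS_DA = 110.0  # rule-of-thumb average; an assumption, not a measured value


def approx_length(mass_kda, avg_residue_da=AVG_RESIDUE_MASS_DA):
    """Estimate the number of residues in a polypeptide of the given mass (kDa)."""
    if mass_kda <= 0:
        raise ValueError("mass must be positive")
    return round(mass_kda * 1000.0 / avg_residue_da)


def approx_mass_kda(n_residues, avg_residue_da=AVG_RESIDUE_MASS_DA):
    """Estimate the mass (kDa) of a polypeptide with the given number of residues."""
    if n_residues <= 0:
        raise ValueError("residue count must be positive")
    return n_residues * avg_residue_da / 1000.0


# A 10 kDa peptide corresponds to roughly 90 residues under this approximation
print(approx_length(10))
```

Under this approximation a 10 kDa peptide is on the order of 90 residues long, which is toward the upper end of what is usually attempted by stepwise chemical synthesis.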
Solid phase synthesis procedures are well known in the art and are further described by: John Morrow Stewart and Janis Dillaha Young, Solid Phase Peptide Syntheses, 2nd edition, Pierce Chemical Company, 1984; and Coin, I. et al., Nature Protocols, 2:3247-3256, 2007.
For longer peptides, recombinant methods may be used. Methods for preparing recombinant therapeutic polypeptides are routine in the art. See, generally, Smales & James (Eds.), Therapeutic Proteins: Methods and Protocols (Methods in Molecular Biology), Humana Press (2005); and Crommelin, Sindelar & Meibohm (Eds.), Pharmaceutical Biotechnology: Fundamentals and Applications, Springer (2013).
An exemplary method of producing a therapeutic pharmaceutical protein or polypeptide involves expression in mammalian cells, although recombinant proteins may also be produced in insect cells, yeast, bacteria, or other cells under the control of an appropriate promoter. Mammalian expression vectors may contain non-transcribed elements, such as an origin of replication, a suitable promoter, and other 5' or 3' flanking non-transcribed sequences, and 5' or 3' untranslated sequences, such as necessary ribosome binding sites, a polyadenylation site, splice donor and acceptor sites, and termination sequences. DNA sequences derived from the SV40 viral genome, such as the SV40 origin, early promoter, splice sites, and polyadenylation sites, may be used to provide other genetic elements necessary for expression of a heterologous DNA sequence. Suitable cloning and expression vectors for use with bacterial, fungal, yeast, and mammalian cell hosts are described in Green & Sambrook, Molecular Cloning: A Laboratory Manual (Fourth Edition), Cold Spring Harbor Laboratory Press (2012).
Where a large amount of protein or polypeptide is desired, it may be produced using techniques such as those described by: Brian Bray, Nature Reviews Drug Discovery, 2:587-593, 2003; and Weissbach & Weissbach, 1988, Methods for Plant Molecular Biology, Academic Press, New York, Section VIII, pages 421-463.
Various mammalian cell culture systems can be used to express and produce recombinant proteins. Examples of mammalian expression systems include, but are not limited to, CHO cells, COS cells, and HeLa and BHK cell lines. Processes of host cell culture for the production of protein therapeutics are described in Zhou & Kantardjieff (Eds.), Mammalian Cell Cultures for Biologics Manufacturing (Advances in Biochemical Engineering/Biotechnology), Springer (2014). The compositions described herein may include a vector, such as a viral vector, e.g., a lentiviral vector, encoding a recombinant protein. In some embodiments, a vector, such as a viral vector, may comprise a nucleic acid encoding a recombinant protein.
Purification of protein therapeutics is described in Franks, Protein Biotechnology: Isolation, Characterization, and Stabilization, Humana Press (2013); and in Cutler, Protein Purification Protocols (Methods in Molecular Biology), Humana Press (2010). Formulation of protein therapeutics is described in Meyer (Ed.), Therapeutic Protein Drug Products: Practical Approaches to Formulation in the Laboratory, Manufacturing, and the Clinic, Woodhead Publishing Series (2012).
A protein comprises one or more amino acids. Amino acids include any compound and/or substance that can be incorporated into a polypeptide chain, for example, through formation of one or more peptide bonds. In some embodiments, the amino acid has the general structure H2N-C(H)(R)-COOH. In some embodiments, the amino acid is a naturally occurring amino acid. In some embodiments, the amino acid is an unnatural amino acid; in some embodiments, the amino acid is a D-amino acid; in some embodiments, the amino acid is an L-amino acid. "Standard amino acid" refers to any of the twenty standard L-amino acids typically found in naturally occurring peptides. "Non-standard amino acid" refers to any amino acid other than a standard amino acid, whether synthetically prepared or obtained from natural sources. In some embodiments, the amino acids in the polypeptide, including the carboxy- and/or amino-terminal amino acids, may comprise structural modifications as compared to the general structure described above. For example, in some embodiments, an amino acid may be modified by methylation, amidation, acetylation, pegylation, glycosylation, phosphorylation, and/or substitution (e.g., of the amino group, the carboxylic acid group, one or more protons, and/or the hydroxyl group) as compared to the general structure. In some embodiments, such modifications may, for example, alter the circulating half-life of a polypeptide containing the modified amino acid as compared to a polypeptide containing an otherwise identical unmodified amino acid. In some embodiments, such modifications do not significantly alter the relevant activity of a polypeptide containing the modified amino acid as compared to a polypeptide containing an otherwise identical unmodified amino acid. As will be clear from context, in some embodiments, the term "amino acid" may be used to refer to a free amino acid; in some embodiments, it may be used to refer to an amino acid residue of a polypeptide.
Pharmaceutical compositions, formulations, delivery and administration
The present disclosure further relates in part to pharmaceutical compositions comprising the site-specific disruption agents described herein, and to pharmaceutical compositions comprising nucleic acids encoding the site-specific disruption agents or systems described herein.
As used herein, the term "pharmaceutical composition" refers to an active agent (e.g., a site-specific breaker or system, or nucleic acid encoding the same) formulated with one or more pharmaceutically acceptable carriers (e.g., pharmaceutically acceptable carriers known to those of skill in the art). In some embodiments, the active agent is present in a unit dose suitable for administration in a treatment regimen that, when administered to a relevant population, exhibits a statistically significant probability of achieving a predetermined therapeutic effect. In some embodiments, the pharmaceutical composition comprises a site-specific breaker or system of the disclosure.
In some embodiments, the pharmaceutical compositions may be specifically formulated for administration in solid or liquid form, including those suitable for: oral administration, e.g., drenches (aqueous or non-aqueous solutions or suspensions), tablets (such as those intended for buccal, sublingual, and systemic absorption), pills, powders, granules, and pastes for application to the tongue; parenteral administration, e.g., by subcutaneous, intramuscular, intravenous, or epidural injection as, for example, a sterile solution or suspension, or as a sustained release formulation; topical application, e.g., as a cream, ointment, or controlled release patch or spray applied to the skin, lungs, or oral cavity; intravaginal or intrarectal administration, e.g., as a pessary, cream, or foam; sublingual administration; ocular administration; transdermal administration; or nasal or pulmonary administration and/or administration to other mucosal surfaces, e.g., as an aerosol, aqueous solution, or suspension. In some embodiments, the composition may be lyophilized or spray dried. In some embodiments, the composition may be formulated for pulmonary administration and/or intravenous administration.
As used herein, the term "pharmaceutically acceptable" refers to those compounds, materials, compositions, and/or dosage forms which are, within the scope of sound medical judgment, suitable for use in contact with the tissues of human beings and animals without excessive toxicity, irritation, allergic response, or other problem or complication, commensurate with a suitable benefit/risk ratio.
As used herein, the term "pharmaceutically acceptable carrier" means a pharmaceutically acceptable material, composition, or vehicle, such as a liquid or solid filler, diluent, excipient, or solvent-encapsulating material, that is involved in carrying or transporting the subject compound from one organ or portion of the body to another organ or portion of the body. Each carrier must be "acceptable" in the sense of being compatible with the other ingredients of the formulation and not deleterious to the patient. For example, in some embodiments, materials that may be used as pharmaceutically acceptable carriers include: sugars, such as lactose, glucose, and sucrose; starches, such as corn starch and potato starch; cellulose and its derivatives, such as sodium carboxymethyl cellulose, ethyl cellulose, and cellulose acetate; powdered tragacanth; malt; gelatin; talc; excipients, such as cocoa butter and suppository waxes; oils, such as peanut oil, cottonseed oil, safflower oil, sesame oil, olive oil, corn oil, and soybean oil; glycols, such as propylene glycol; polyols, such as glycerol, sorbitol, mannitol, and polyethylene glycol; esters, such as ethyl oleate and ethyl laurate; agar; buffering agents, such as magnesium hydroxide and aluminum hydroxide; alginic acid; pyrogen-free water; isotonic saline; Ringer's solution; ethanol; pH buffered solutions; polyesters, polycarbonates, and/or polyanhydrides; and other non-toxic compatible substances employed in pharmaceutical formulations.
As used herein, the term "pharmaceutically acceptable salt" refers to salts of such compounds that are suitable for use in pharmaceutical contexts, i.e., salts which are, within the scope of sound medical judgment, suitable for use in contact with the tissues of humans and lower animals without undue toxicity, irritation, allergic response, and the like, and are commensurate with a reasonable benefit/risk ratio. Pharmaceutically acceptable salts are well known in the art. For example, pharmaceutically acceptable salts are described in detail in J. Pharmaceutical Sciences, 66:1-19 (1977). In some embodiments, pharmaceutically acceptable salts include, but are not limited to, non-toxic acid addition salts, which are salts of an amino group formed with inorganic acids such as hydrochloric acid, hydrobromic acid, phosphoric acid, sulfuric acid, and perchloric acid, or with organic acids such as acetic acid, maleic acid, tartaric acid, citric acid, succinic acid, or malonic acid, or by using other methods used in the art, such as ion exchange. In some embodiments, pharmaceutically acceptable salts include, but are not limited to, adipate, alginate, ascorbate, aspartate, benzenesulfonate, benzoate, bisulfate, borate, butyrate, camphorate, camphorsulfonate, citrate, cyclopentanepropionate, digluconate, dodecylsulfate, ethanesulfonate, formate, fumarate, glucoheptonate, glycerophosphate, gluconate, hemisulfate, heptanoate, hexanoate, hydroiodide, 2-hydroxy-ethanesulfonate, lactobionate, lactate, laurate, lauryl sulfate, malate, maleate, malonate, methanesulfonate, 2-naphthalenesulfonate, nicotinate, nitrate, oleate, oxalate, palmitate, pamoate, pectinate, persulfate, 3-phenylpropionate, phosphate, picrate, pivalate, propionate, stearate, succinate, sulfate, tartrate, thiocyanate, p-toluenesulfonate, undecanoate, and valerate salts, and the like. Representative alkali or alkaline earth metal salts include sodium, lithium, potassium, calcium, magnesium, and the like.
In some embodiments, pharmaceutically acceptable salts include, where appropriate, nontoxic ammonium, quaternary ammonium, and amine cations formed using counterions such as halide, hydroxide, carboxylate, sulfate, phosphate, nitrate, alkyl having from 1 to 6 carbon atoms, sulfonate, and aryl sulfonate.
In various embodiments, the present disclosure provides a pharmaceutical composition as described herein with a pharmaceutically acceptable excipient. Pharmaceutically acceptable excipients include excipients that can be used to prepare generally safe, non-toxic and desirable pharmaceutical compositions, and include excipients that are acceptable for veterinary use as well as for human pharmaceutical use. Such excipients may be solid, liquid, semi-solid, or, in the case of aerosol compositions, gaseous.
The pharmaceutical formulations may be prepared according to conventional pharmaceutical techniques, including grinding, mixing, granulating, and if necessary, tableting; or milling, mixing and filling for hard gelatin capsule forms. When a liquid carrier is used, the formulation may be in the form of a syrup, elixir, emulsion or aqueous or non-aqueous solution or suspension. Such liquid formulations may be administered orally directly.
In some embodiments, the pharmaceutical composition may be formulated for delivery to cells and/or subjects via any route of administration. The mode of administration to a subject may include injection, infusion, inhalation, intranasal delivery, intraocular delivery, topical delivery, intercannular delivery, or ingestion. Injections include, but are not limited to, intravenous, intramuscular, intraarterial, intrathecal, intraventricular, intracapsular, intraorbital, intracardiac, intradermal, intraperitoneal, transtracheal, subcutaneous, subcuticular, intra-articular, bronchial, subcapsular, subarachnoid, intraspinal, and intrasternal injections and infusions. In some embodiments, administration includes aerosol inhalation, e.g., using nebulization. In some embodiments, administration is systemic (e.g., oral, rectal, intranasal, sublingual, buccal, or parenteral), enteral (e.g., system-wide effect, but delivered through the gastrointestinal tract), or local (e.g., local application on the skin, intravitreal injection). In some embodiments, one or more compositions are administered systemically. In some embodiments, the administration is parenteral administration and the therapeutic agent is a parenteral therapeutic agent. In some embodiments, administration may be bronchial (e.g., by bronchial instillation), buccal, dermal (which may be or include, for example, one or more of topical to the dermis, intradermal, transdermal, etc.), enteral, intra-arterial, intradermal, intragastric, intramedullary, intramuscular, intranasal, intraperitoneal, intrathecal, intravenous, intraventricular, within a specific organ (e.g., intrahepatic), mucosal, nasal, oral, rectal, subcutaneous, sublingual, topical, tracheal (e.g., by intratracheal instillation), vaginal, vitreal, etc. In some embodiments, administration may be a single dose.
In some embodiments, administration may include intermittent administration (e.g., multiple doses separated in time) and/or periodic administration (e.g., individual doses separated by a common period of time). In some embodiments, administration may include continuous administration (e.g., infusion) for at least a selected period of time. In some embodiments, six, eight, ten, 12, 15, or 20 or more administrations may be given to the subject over a period of time as a treatment regimen. In some embodiments, administration may be given as needed, e.g., for as long as symptoms associated with the disease, disorder, or condition persist. In some embodiments, repeated administration may be indicated for the remainder of the subject's life. The treatment period may vary and may be, for example, one day, two days, three days, one week, two weeks, one month, two months, three months, six months, one year, or more.
In some embodiments, administration is provided using a respiratory delivery device, e.g., a nebulizer, a metered dose inhaler, or a dry powder inhaler. Some commercially available dry powder inhalers include the Spinhaler (Fisons Pharmaceuticals, Rochester, New York) and the Rotahaler (GSK, Research Triangle Park (RTP), North Carolina). In some embodiments, the nebulizer may include a jet nebulizer, an ultrasonic nebulizer, and/or a vibrating mesh nebulizer.
Dosage
Pharmaceutical compositions according to the present disclosure may be delivered in therapeutically effective amounts. The precise therapeutically effective amount is the amount of the composition that will produce the most effective result in a given subject in terms of therapeutic efficacy. The amount will vary depending on a variety of factors including, but not limited to, the characteristics of the therapeutic compound (including activity, pharmacokinetics, pharmacodynamics, and bioavailability), the physiological condition of the subject (including age, sex, disease type and stage, general physical condition, responsiveness to a given dose, and drug type), the nature of the one or more pharmaceutically acceptable carriers in the formulation, and/or the route of administration.
In some aspects, the present disclosure provides methods of delivering a therapeutic agent comprising administering to a subject a composition as described herein, wherein the genomic complex modulator is the therapeutic agent and/or wherein delivery of the therapeutic agent causes a change in gene expression relative to gene expression in the absence of the therapeutic agent.
The methods as provided in the various embodiments herein may be used in any of the aspects described herein. In some embodiments, one or more compositions target a particular cell or one or more particular tissues.
For example, in some embodiments, one or more compositions target epithelial, connective, muscle, and/or neural tissue or cells. In some embodiments, the composition targets cells or tissues of a particular organ system, such as the respiratory system (pharynx, larynx, trachea, bronchi, lungs, diaphragm); the cardiovascular system (heart, vasculature); the digestive system (esophagus, stomach, liver, gall bladder, pancreas, intestine, colon, rectum, and anus); the endocrine system (hypothalamus, pituitary, pineal body or pineal gland, thyroid, parathyroid gland, adrenal gland); the excretory system (kidney, ureter, bladder); the lymphatic system (lymph, lymph node, lymphatic vessel, tonsil, adenoid, thymus, spleen); the integumentary system (skin, hair, nails); the muscular system (e.g., skeletal muscle); the nervous system (brain, spinal cord, nerves); the reproductive system (ovary, uterus, breast, testis, vas deferens, seminal vesicle, prostate); the skeletal system (bone, cartilage); and/or combinations thereof. In some embodiments, the composition targets a cell, e.g., an endothelial cell, an alveolar cell, an epithelial cell, a liver cell, a stellate cell, a Kupffer cell, a synovial lining cell, a chondrocyte, a fibroblast, a ductal epithelial cell, an intestinal epithelial cell, a goblet cell, a basal cell, and/or an immune cell. In some embodiments, the composition targets cells of an organ, such as nasal cells, lung cells, ileal cells, cardiomyocytes, optic nerve cells, liver cells, bladder cells, pancreatic cells, kidney cells, nerve cells, prostate cells, or testicular cells. In some embodiments, the composition of the present disclosure crosses the blood-brain barrier, the placental membrane, or the blood-testis barrier. In some embodiments, the composition targets cells that express ACE-2 receptors.
In some embodiments, the pharmaceutical compositions as provided herein are administered systemically.
In some embodiments, the administration is parenteral administration and the therapeutic agent is a parenteral therapeutic agent.
In some embodiments, the pharmaceutical compositions of the present disclosure have improved PK/PD, e.g., increased pharmacokinetics or pharmacodynamics, such as improved targeting, absorption, or transport (e.g., at least 5%, 10%, 15%, 20%, 30%, 40%, 50%, 60%, 75%, 80%, 90% improvement or more) compared to the active agent alone. In some embodiments, the pharmaceutical composition has reduced adverse effects, such as reduced diffusion to non-target sites, off-target activity, or toxic metabolism (e.g., at least 5%, 10%, 15%, 20%, 30%, 40%, 50%, 60%, 75%, 80%, 90% or more reduced compared to the active agent alone). In some embodiments, the composition increases the efficacy of the therapeutic agent and/or decreases the toxicity of the therapeutic agent (e.g., at least 5%, 10%, 15%, 20%, 30%, 40%, 50%, 60%, 75%, 80%, 90% or more) as compared to the active agent alone.
The pharmaceutical compositions described herein can be formulated, for example, to comprise a carrier (e.g., a pharmaceutical carrier and/or a polymeric carrier, e.g., a liposome or vesicle) and delivered by known methods to a subject in need thereof (e.g., a human or a non-human agricultural animal or livestock, e.g., bovine, canine, feline, equine, poultry). Such methods include transfection (e.g., lipid-mediated, cationic polymers, calcium phosphate); electroporation or other methods of disrupting membranes (e.g., nucleofection); and viral delivery (e.g., lentivirus, retrovirus, adenovirus, AAV). Delivery methods are also described, for example, in Gori et al., Delivery and Specificity of CRISPR/Cas9 Genome Editing Technologies for Human Gene Therapy. Human Gene Therapy. July 2015, 26(7): 443-451. doi:10.1089/hum.2015.074; and Zuris et al., Cationic lipid-mediated delivery of proteins enables efficient protein-based genome editing in vitro and in vivo. Nat Biotechnol. 2014 Oct 30; 33(1):73-80.
Lipid nanoparticles
The site-specific breaker as described herein can be delivered using any biological delivery system/formulation, including particles, e.g., nanoparticle delivery systems. Nanoparticles include particles having a size (e.g., diameter) of about 1 to about 1000 nanometers, about 1 to about 500 nanometers, about 1 to about 100nm, about 30nm to about 200nm, about 50nm to about 300nm, about 75nm to about 200nm, about 100nm to about 200nm, and any range therebetween. The nanoparticles have a composite structure of nanoscale dimensions. In some embodiments, the nanoparticles are generally spherical, although different morphologies are possible depending on the composition of the nanoparticles. The portion of the nanoparticle that is in contact with the environment outside the nanoparticle is generally defined as the surface of the nanoparticle. In some embodiments, the nanoparticle has a largest dimension ranging between 25nm and 200 nm. The nanoparticles described herein comprise a delivery system that may be provided in any form, including but not limited to solid, semi-solid, emulsion, or colloidal nanoparticles. Nanoparticle delivery systems may include, but are not limited to, lipid-based systems, liposomes, micelles, microvesicles, exosomes, or gene-guns. In one embodiment, the nanoparticle is a Lipid Nanoparticle (LNP). In some embodiments, the LNP is a particle comprising a plurality of lipid molecules that are physically associated with each other by intermolecular forces. In some embodiments, the LNP may comprise a plurality of components, such as 3-4 components. In one embodiment, a site-specific breaker or a pharmaceutical composition comprising the site-specific breaker (or a nucleic acid encoding the site-specific breaker or a pharmaceutical composition comprising a nucleic acid encoding the site-specific breaker) is encapsulated in the LNP. 
In one embodiment, a system or a pharmaceutical composition comprising the system (or a nucleic acid encoding the system or a pharmaceutical composition comprising a nucleic acid encoding the system) is encapsulated in an LNP. In some embodiments, the nucleic acid encoding the first site-specific breaker and the nucleic acid encoding the second site-specific breaker are present in the same LNP. In some embodiments, the nucleic acid encoding the first site-specific breaker and the nucleic acid encoding the second site-specific breaker are present in different LNPs. Methods for LNP preparation and modulator encapsulation may be used and/or adapted from Rosin et al., Molecular Therapy, vol. 19, no. 12, pages 1286-2200, December 2011. In some embodiments, the lipid nanoparticle compositions disclosed herein can be used to express proteins encoded by mRNA. In some embodiments, when the nucleic acid is present in the lipid nanoparticle, the nucleic acid is resistant to degradation by nucleases in aqueous solution.
In some embodiments, the LNP formulation can include a CCD lipid, a neutral lipid, and/or a helper lipid. In some embodiments, the LNP formulation comprises an ionizable lipid. In some embodiments, the ionizable lipid may be a cationic lipid, an ionizable cationic lipid, or an amine-containing lipid that can be readily protonated. In some embodiments, the lipid is a cationic lipid, which may exist in a positively charged or neutral form depending on pH. In some embodiments, the cationic lipid is a lipid that is capable of being positively charged, for example, under physiological conditions. In some embodiments, the lipid particles comprise cationic lipids formulated with neutral lipids, ionizable amine-containing lipids, biodegradable alkyne lipids, steroids, phospholipids including polyunsaturated lipids, structural lipids (e.g., sterols), PEG, cholesterol, and polymer conjugated lipids.
In some embodiments, the LNP formulation (e.g., MC3 and/or SSOP) includes cholesterol, PEG, and/or helper lipids. LNPs may be, for example, microspheres (including unilamellar and multilamellar vesicles, e.g., lamellar phase lipid bilayers, which in some embodiments are substantially spherical).
In some embodiments, the LNP can comprise an aqueous core, e.g., comprising a nucleic acid encoding a site-specific breaker or system disclosed herein. In some embodiments of the disclosure, the cargo of the LNP formulation comprises at least one guide RNA. In some embodiments, cargo, such as nucleic acid encoding a site-specific breaker or system as disclosed herein, can be adsorbed to the surface of an LNP, such as an LNP comprising a cationic lipid. In some embodiments, cargo, such as nucleic acid encoding a site-specific breaker or system as disclosed herein, can be associated with the LNP. In some embodiments, cargo, such as nucleic acid encoding a site-specific breaker or system as disclosed herein, may be encapsulated, e.g., fully encapsulated and/or partially encapsulated, in an LNP.
In some embodiments, an LNP comprising cargo may be administered for systemic delivery, e.g., delivery of a therapeutically effective dose of cargo, which may result in broad exposure of the active agent within the organism. Systemic delivery of lipid nanoparticles can be, for example, intravenous, pulmonary, bronchial, intra-arterial, subcutaneous, or intraperitoneal delivery. In some embodiments, systemic delivery of the lipid nanoparticle is by intravenous delivery. In some embodiments, the LNP comprising cargo can be administered for local delivery, e.g., delivery of the active agent directly to a target site within an organism. In some embodiments, the LNP may be delivered locally to a disease site, such as a tumor; to another target site, such as an inflammatory site; or to a target organ, such as liver, lung, stomach, colon, pancreas, uterus, breast, lymph node, etc. In some embodiments, the LNPs disclosed herein can be delivered locally to specific cells, such as hepatocytes, stellate cells, Kupffer cells, endothelial cells, alveolar cells, and/or epithelial cells. In some embodiments, the LNPs disclosed herein can be delivered locally to a specific tumor site, e.g., a subcutaneous or orthotopic tumor site.
LNP can be formulated as a dispersed phase in an emulsion, as micelles, or as an internal phase in a suspension. In some embodiments, the LNP is biodegradable. In some embodiments, LNP does not accumulate to cytotoxic levels or cause toxicity in vivo at a therapeutically effective dose. In some embodiments, LNP does not accumulate to cytotoxic levels or cause toxicity in vivo after repeated administration at therapeutically effective doses. In some embodiments, the LNP does not elicit an innate immune response at a therapeutically effective dose that results in significant adverse effects.
In some embodiments, the LNP used comprises (6Z,9Z,28Z,31Z)-heptatriaconta-6,9,28,31-tetraen-19-yl 4-(dimethylamino)butanoate (MC3) or ssPalmO-Phenyl-P4C2 (ssPalmO-Phe, SS-OP). In some embodiments, the LNP formulation comprises: (6Z,9Z,28Z,31Z)-heptatriaconta-6,9,28,31-tetraen-19-yl 4-(dimethylamino)butanoate (MC3), 1,2-dioleoyl-sn-glycero-3-phosphocholine (DOPC), cholesterol, and 1,2-dimyristoyl-rac-glycero-3-methoxypolyethylene glycol-2000 (PEG2k-DMG), e.g., MC3 LNP; or ssPalmO-Phenyl-P4C2 (ssPalmO-Phe, SS-OP), DOPC, cholesterol, and PEG2k-DMG, e.g., SSOP-LNP.
Liposomes are spherical vesicle structures composed of a unilamellar or multilamellar lipid bilayer surrounding inner aqueous compartments and a relatively impermeable outer lipophilic phospholipid bilayer. Liposomes can be anionic, neutral, or cationic. Liposomes are biocompatible and non-toxic, can deliver both hydrophilic and lipophilic drug molecules, protect their cargo from degradation by plasma enzymes, and transport their load across biological membranes and the blood-brain barrier (BBB) (for a review see, e.g., Spuch and Navarro, Journal of Drug Delivery, vol. 2011, Article ID 469679, 12 pages, 2011. doi:10.1155/2011/469679).
Vesicles can be made from several different types of lipids; however, phospholipids are most commonly used to form liposomes as drug carriers. Vesicles may comprise, without limitation, DOTMA, DOTAP, DOTIM, or DDAB, alone or together with cholesterol to yield DOTMA and cholesterol, DOTAP and cholesterol, DOTIM and cholesterol, or DDAB and cholesterol. Methods of preparing multilamellar vesicle lipids are known in the art (see, e.g., U.S. Patent No. 6,693,086, the teachings of which relating to multilamellar vesicle lipid preparation are incorporated herein by reference). Although vesicle formation may be spontaneous when a lipid film is mixed with an aqueous solution, it can also be expedited by applying force in the form of shaking, using a homogenizer, sonicator, or extrusion apparatus (for a review see, e.g., Spuch and Navarro, Journal of Drug Delivery, vol. 2011, Article ID 469679, 12 pages, 2011. doi:10.1155/2011/469679). Extruded lipids may be prepared by extruding through filters of decreasing size, as described in Templeton et al., Nature Biotech, 15:647-652, 1997, the teachings of which relating to extruded lipid preparation are incorporated herein by reference.
Methods and compositions provided herein can include pharmaceutical compositions administered by a regimen sufficient to alleviate symptoms of a disease, disorder, and/or condition. In some aspects, the present disclosure provides methods of delivering a therapeutic agent by administering a composition as described herein.
Uses
The present disclosure further relates to uses of the site-specific disruption agents or systems disclosed herein. In some embodiments, such provided technologies can be used to effect modulation, e.g., repression, of expression of a target plurality of genes and can, among other things, provide control over the activity, delivery, and penetrance of one or more products of the target plurality of genes, e.g., in a cell. In some embodiments, the cell is a mammalian cell, such as a human cell. In some embodiments, the cell is a somatic cell. In some embodiments, the cell is a primary cell. For example, in some embodiments, the cell is a mammalian somatic cell. In some embodiments, the mammalian somatic cell is a primary cell. In some embodiments, the mammalian somatic cell is a non-embryonic cell.
Regulation of gene expression
The disclosure also relates in part to a method of modulating (e.g., reducing) expression of a target plurality of genes, the method comprising providing a site-specific breaker or system described herein (or a nucleic acid encoding the same, or a pharmaceutical composition comprising the same), and contacting the target plurality of genes, anchor sequences associated with the target plurality of genes, and/or a genomic complex (e.g., ASMC) comprising the target plurality of genes with the site-specific breaker or system. In some embodiments, modulating (e.g., reducing) expression of the target plurality of genes comprises modulating transcription of genes in the target plurality of genes compared to a reference value (e.g., transcription of genes in the absence of a site-specific disruption agent or system). In some embodiments, the method of modulating (e.g., reducing) expression of a target plurality of genes is used ex vivo, e.g., on cells from a subject (e.g., a mammalian subject, e.g., a human subject). In some embodiments, the cell is a mammalian cell, such as a human cell. In some embodiments, the cell is a somatic cell. In some embodiments, the cell is a primary cell. In some embodiments, the method of modulating (e.g., reducing) expression of a target plurality of genes is used in vivo, e.g., in a mammalian subject (e.g., a human subject). In some embodiments, the method of modulating (e.g., reducing) expression of a target plurality of genes is used in vitro, e.g., on a cell or cell line as described herein.
Without wishing to be bound by theory, in some embodiments, it is believed that the site-specific disruption agent or system may modulate expression of a target plurality of genes by binding to an anchor sequence of a genomic complex (e.g., ASMC) comprising the target plurality of genes, and have one, two, or all of the following effects: physically or spatially blocking (e.g., competitively inhibiting) binding of a component of the genomic complex (e.g., a nucleation polypeptide) to the anchor sequence; performing epigenetic modification (e.g., thereby reducing and/or eliminating binding of a genomic complex component (e.g., a nucleation polypeptide) to an anchor sequence) to the target plurality of genes, a transcriptional control element operably linked to the target plurality of genes, or an anchor sequence proximal to the target plurality of genes or associated with an anchor sequence-mediated linkage comprising the target plurality of genes; or genetically modifying the target plurality of genes, a transcriptional control element operably linked to the target plurality of genes, or an anchor sequence proximal to or associated with an anchor sequence-mediated linkage comprising the target plurality of genes (e.g., thereby reducing and/or eliminating binding of a genomic complex component (e.g., a nucleation polypeptide) to the anchor sequence).
In some embodiments, the methods described herein modulate, e.g., reduce, expression of two or more genes in a target plurality of genes. In some embodiments, the methods described herein modulate (e.g., reduce) expression of at least 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, or 20 (optionally, no more than 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, or 30) genes in a target plurality of genes. In some embodiments, the methods described herein modulate, e.g., reduce, expression of 2-20, 2-18, 2-16, 2-14, 2-12, 2-10, 2-9, 2-8, 2-7, 2-6, 2-5, 2-4, 2-3, 3-20, 3-18, 3-16, 3-14, 3-12, 3-10, 3-9, 3-8, 3-7, 3-6, 3-5, 3-4, 4-20, 4-18, 4-16, 4-14, 4-12, 4-10, 4-9, 4-8, 4-7, 4-6, 4-5, 5-20, 5-18, 5-16, 5-14, 5-12, 5-10, 5-9, 5-8, 5-7, 5-6, 6-20, 6-18, 6-16, 6-14, 6-12, 6-10, 6-9, 6-8, 6-7, 7-20, 7-18, 7-16, 7-14, 7-12, 7-10, 7-9, 7-8, 8-20, 8-18, 8-16, 8-14, 8-12, 8-10, 8-9, 9-20, 9-18, 9-16, 9-14, 9-12, 9-10, 10-20, 10-18, 10-16, 10-14, 10-12, 12-20, 12-18, 12-16, 12-14, 14-20, 14-18, 14-16, 16-20, 16-18, or 18-20 genes of the target plurality of genes. In some embodiments, the methods described herein modulate, e.g., reduce, expression of each gene (e.g., all genes) in the target plurality of genes.
In some embodiments, the methods described herein modulate (e.g., reduce) expression of genes in a target plurality of genes, wherein one or more (e.g., 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, or more, e.g., all) of the genes is a cytokine, an interleukin, a transcription factor (e.g., an interferon-modulating transcription factor), an intercellular adhesion molecule (ICAM), or an interferon receptor. In some embodiments, the methods described herein modulate (e.g., reduce) the level of RNA, e.g., mRNA, produced by one or more (e.g., 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, or more, e.g., all) genes of the target plurality of genes. In some embodiments, modulating expression comprises reducing the level of protein produced by one or more (e.g., 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, or more, e.g., all) genes of the target plurality of genes. In some embodiments, modulating expression comprises reducing the level of mRNA and protein produced by one or more (e.g., 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, or more, e.g., all) genes in the target plurality of genes. In some embodiments, the decrease is at least a 10%, 20%, 30%, 40%, 50%, 60%, 70%, 80% or 90% decrease compared to the pre-treatment level or the level in the absence of the site-specific breaker.
In some embodiments, the methods described herein modulate (e.g., reduce) expression of one or more (e.g., 1, 2, 3, or all) of human CXCL1, CXCL2, CXCL3, CXCL4, CXCL5, CXCL6, CXCL7, and IL8, e.g., after stimulation of cells with TNF- α, e.g., using the assays of any one of examples 2 or 4-11. In some embodiments, the methods described herein modulate (e.g., reduce) expression of one or more (e.g., 1, 2, 3, or all) of mouse CXCL1, CXCL2, CXCL3, CXCL4, CXCL5, CXCL7, and CXCL15, e.g., after stimulation of cells with TNF- α, e.g., using the assay of example 14.
In some embodiments, the methods described herein reduce binding of a nucleating polypeptide, such as CTCF, to an anchor sequence. In some embodiments, contacting the cell with, or administering, the site-specific breaker results in reduced binding of the nucleating polypeptide (e.g., CTCF) to an anchor sequence (e.g., an anchor sequence of an ASMC comprising a target plurality of genes). In some embodiments, contacting the cell with, or administering, the site-specific breaker results in a complete loss of binding, or a reduction of at least 50%, 60%, 70%, 80%, 90%, 95%, or 99%, relative to the binding of the nucleating polypeptide (e.g., CTCF) to the anchor sequence prior to treatment with the site-specific breaker or system or in the absence of the site-specific breaker or system, e.g., as measured by ChIP and/or quantitative PCR.
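As an illustration only, the reduction in binding described above could be computed from ChIP-qPCR data using the commonly used percent-input normalization. The sketch below is not an assay taken from the present disclosure; the function names, the 1% input fraction, the assumption of 100% amplification efficiency, and all numeric values are hypothetical.

```python
import math


def percent_input(ct_ip: float, ct_input: float, input_fraction: float = 0.01) -> float:
    """Percent-input signal for one ChIP-qPCR target.

    Assumes 100% PCR efficiency (signal doubles per cycle) and that the
    input sample represents `input_fraction` of the chromatin used in the
    immunoprecipitation (hypothetical default: 1%).
    """
    # Adjust the input Ct to what it would be at 100% of the chromatin.
    adjusted_input_ct = ct_input - math.log2(1.0 / input_fraction)
    return 100.0 * 2.0 ** (adjusted_input_ct - ct_ip)


def percent_reduction(treated: float, reference: float) -> float:
    """Percent reduction of a treated signal relative to a reference signal."""
    if reference <= 0:
        raise ValueError("reference signal must be positive")
    return 100.0 * (reference - treated) / reference


# Hypothetical Ct values for CTCF ChIP at one anchor sequence.
untreated = percent_input(ct_ip=26.0, ct_input=25.0)
treated = percent_input(ct_ip=28.0, ct_input=25.0)
reduction = percent_reduction(treated, untreated)
print(f"Reduction in binding: {reduction:.1f}%")
```

With these hypothetical values, the immunoprecipitated signal in the treated sample appears two cycles later than in the untreated sample, which corresponds to a four-fold lower signal under the stated efficiency assumption.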
The present disclosure further relates in part to a method of treating a disorder associated with overexpression of a target plurality of genes in a subject, the method comprising administering to the subject a site-specific disruption agent, system, nucleic acid, vector, cell, or pharmaceutical composition described herein. Conditions associated with overexpression of particular genes are known to those skilled in the art. Such conditions include, but are not limited to, metabolic disorders, neuromuscular disorders, cancer (e.g., solid tumors), fibrosis, diabetes, urea disorders, immune disorders, inflammation, and arthritis. In some embodiments, the disorder is an autoimmune disorder. In some embodiments, the disorder is associated with or caused by an infection, such as a viral infection, e.g., a SARS-CoV-2 viral infection.
The disclosure further relates in part to a method of modulating (e.g., reducing) expression of a target plurality of genes in a cell of a subject (e.g., a human subject). In some embodiments, the subject has a disease or disorder. In some embodiments, the disease is an inflammatory disease, such as an immune-mediated inflammatory disease. In some embodiments, the disease or disorder is one or more of the following: rheumatoid arthritis, gout, asthma, neutrophilic dermatoses, paw edema, acute respiratory distress syndrome (ARDS), COVID-19, psoriasis, inflammatory bowel disease, infection (e.g., caused by a pathogen, such as bacteria, viruses, or fungi), external injury (e.g., abrasion or foreign matter), the effects of radiation or chemical injury, osteoarthritis joint pain, inflammatory pain, acute pain, chronic pain, cystitis, bronchitis, dermatitis, cardiovascular disease, neurodegenerative disease, liver disease, lung disease, kidney disease, pain, swelling, stiffness, tenderness, redness, fever, or a biomarker associated with a disease state (e.g., cytokine, immune receptor, or inflammatory marker). In some embodiments, the inflammatory disorder is associated with an infection, e.g., a viral infection, e.g., a SARS-CoV-2 viral infection. In some embodiments, the inflammatory disorder is an autoimmune disorder.
The methods and compositions provided herein can treat conditions associated with overexpression of a target plurality of genes by stably or transiently altering (e.g., reducing) transcription of the target plurality of genes. In some embodiments, such modulation lasts at least about 1 hour to about 30 days, or at least about 2 hours, 6 hours, 12 hours, 18 hours, 24 hours, 2 days, 3 days, 4 days, 5 days, 6 days, 7 days, 8 days, 9 days, 10 days, 11 days, 12 days, 13 days, 14 days, 15 days, 16 days, 17 days, 18 days, 19 days, 20 days, 21 days, 22 days, 23 days, 24 days, 25 days, 26 days, 27 days, 28 days, 29 days, 30 days, or longer, or any time therebetween. In some embodiments, such modulation lasts at least 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, or 24 hours, or at least 1, 2, 3, 4, 5, 6, or 7 days, or at least 1, 2, 3, 4, or 5 weeks, or at least 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, or 12 months, or at least 1, 2, 3, 4, or 5 years (e.g., indefinitely). Optionally, such modulation is continued for no more than 10, 9, 8, 7, 6, 5, 4, 3, 2, or 1 year.
In some embodiments, the methods or compositions provided herein can reduce the expression of a gene of a target plurality of genes in a cell by at least 10%, 20%, 30%, 40%, 50%, 60%, 70%, 80%, 90% or 100% (and optionally up to 100%) relative to the expression of the gene of the target plurality of genes in a cell that has not been contacted with the composition or treated with the method. In some embodiments, the methods or compositions provided herein can reduce the expression of each gene of the target plurality of genes in the cell by at least 10%, 20%, 30%, 40%, 50%, 60%, 70%, 80%, 90%, or 100% (and optionally up to 100%) relative to the expression of each gene of the target plurality of genes in a cell that has not been contacted with the composition or treated with the method.
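The distinction above between reducing expression of a gene of the target plurality and reducing expression of each gene can be made concrete with a short calculation. The following is a minimal sketch under stated assumptions: the function names are hypothetical, the expression levels are invented normalized values (not data from the present disclosure), and the gene symbols are used only as labels.

```python
def percent_reduction(treated: float, control: float) -> float:
    """Percent reduction in expression relative to an untreated control."""
    if control <= 0:
        raise ValueError("control expression must be positive")
    return 100.0 * (control - treated) / control


def reductions_by_gene(treated: dict, control: dict) -> dict:
    """Percent reduction for every gene present in the control measurements."""
    return {gene: percent_reduction(treated[gene], control[gene]) for gene in control}


def any_gene_reduced(reductions: dict, threshold: float) -> bool:
    """True if at least one gene is reduced by at least `threshold` percent."""
    return any(value >= threshold for value in reductions.values())


def each_gene_reduced(reductions: dict, threshold: float) -> bool:
    """True only if every gene is reduced by at least `threshold` percent."""
    return all(value >= threshold for value in reductions.values())


# Hypothetical normalized expression levels (arbitrary units).
control_levels = {"CXCL1": 100.0, "CXCL2": 100.0, "IL8": 100.0}
treated_levels = {"CXCL1": 20.0, "CXCL2": 50.0, "IL8": 10.0}

reductions = reductions_by_gene(treated_levels, control_levels)
for gene, value in reductions.items():
    print(f"{gene}: {value:.0f}% reduction")
print("each gene reduced by at least 50%:", each_gene_reduced(reductions, 50.0))
print("each gene reduced by at least 60%:", each_gene_reduced(reductions, 60.0))
```

In this hypothetical data set, a 60% threshold is met by some genes but not by every gene, which is the situation that separates the two embodiments.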
In some embodiments, the methods provided herein can modulate, e.g., reduce, expression of a target plurality of genes by disrupting genomic complexes, e.g., anchor sequence-mediated junctions, comprising the target plurality of genes. In some embodiments, the methods described herein disrupt a genomic complex (e.g., ASMC). In some embodiments, contacting the cell with, or administering, the site-specific disruption agent results in a decrease in the level of a genomic complex (e.g., ASMC) comprising the target plurality of genes relative to the level of the complex prior to treatment with the site-specific disruption agent or system or in the absence of the site-specific disruption agent or system. In some embodiments, contacting the cell with, or administering, the site-specific breaker results in complete loss of the genomic complex (e.g., ASMC), or a reduction of at least 25%, 30%, 35%, 40%, 45%, 50%, 60%, 70%, 80%, 90%, 95%, or 99%, relative to the level of the complex prior to treatment with the site-specific breaker or system or in the absence of the site-specific breaker or system, e.g., as measured by ChIA-PET, ELISA (e.g., for assessing changes in gene expression), CUT&RUN, ATAC-seq, ChIP, and/or quantitative PCR.
In some embodiments, the methods and compositions provided herein can treat disorders associated with inflammatory cascades or cytokine storms by reducing recruitment of cytokines at the site of inflammation. In some embodiments, the inflammatory cascade and/or cytokine storm is associated with an inflammatory disorder, such as a virus-mediated inflammatory disorder, e.g., a COVID-19 infection. In some embodiments, the inflammatory disorder is associated with an infection, e.g., a viral infection, e.g., a SARS-CoV-2 viral infection. In some embodiments, the inflammatory disorder is an autoimmune disorder. In some embodiments, the inflammatory disorder is associated with hypoxia. In some embodiments, the inflammatory disorder is associated with ARDS, hypoxia, and/or sepsis. In some embodiments, the infection is an overlapping infection, e.g., caused by more than one pathogen, e.g., a first virus, bacterium, or fungus and a second virus, a second bacterium, or a second fungus.
Epigenetic modification
The disclosure also relates in part to a method of epigenetic modification of: one or more (e.g., all) of a target plurality of genes; a transcription control element operably linked to the target plurality of genes; an anchor sequence proximal to the target plurality of genes or an anchor sequence associated with an anchor sequence-mediated junction comprising the target plurality of genes; or a site proximal to said anchor sequence, the method comprising providing a site-specific breaker, a system, a nucleic acid encoding the site-specific breaker, a nucleic acid encoding a component of the system, or a pharmaceutical composition comprising said site-specific breaker, system, or nucleic acid; and contacting one or more (e.g., all) of the target plurality of genes, a transcriptional control element operably linked to the target plurality of genes, an anchor sequence proximal to the target plurality of genes or associated with an anchor sequence-mediated linkage comprising the target plurality of genes, or a site proximal to the anchor sequence with the site-specific disruption agent or system, thereby epigenetically modifying the target plurality of genes, the transcriptional control element operably linked to the target plurality of genes, the anchor sequence proximal to the target plurality of genes or associated with an anchor sequence-mediated linkage comprising the target plurality of genes, or the site proximal to the anchor sequence.
In some embodiments, the method of epigenetic modification of a target plurality of genes, a transcription control element operably linked to the target plurality of genes, an anchor sequence proximal to the target plurality of genes or associated with an anchor sequence-mediated linkage comprising the target plurality of genes, or a site proximal to the anchor sequence comprises increasing or decreasing DNA methylation of the target plurality of genes, a transcription control element operably linked to the target plurality of genes, an anchor sequence proximal to the target plurality of genes or associated with an anchor sequence-mediated linkage comprising the target plurality of genes, or a site proximal to the anchor sequence. In some embodiments, the method of epigenetic modification of a target plurality of genes, a transcription control element operably linked to the target plurality of genes, an anchor sequence proximal to the target plurality of genes or associated with an anchor sequence-mediated linkage comprising the target plurality of genes, or a site proximal to the anchor sequence comprises increasing or decreasing histone methylation of a histone associated with: the target plurality of genes, a transcriptional control element operably linked to the target plurality of genes, an anchor sequence proximal to the target plurality of genes or associated with an anchor sequence-mediated linkage comprising the target plurality of genes, or a site proximal to the anchor sequence. 
In some embodiments, the method of epigenetic modification of a target plurality of genes, a transcription control element operably linked to the target plurality of genes, an anchor sequence proximal to the target plurality of genes or associated with an anchor sequence-mediated linkage comprising the target plurality of genes, or a site proximal to the anchor sequence comprises reducing histone acetylation of a histone associated with: the target plurality of genes, a transcriptional control element operably linked to the target plurality of genes, an anchor sequence proximal to the target plurality of genes or associated with an anchor sequence-mediated linkage comprising the target plurality of genes, or a site proximal to the anchor sequence. In some embodiments, the method of epigenetic modification of a target plurality of genes, a transcription control element operably linked to the target plurality of genes, an anchor sequence proximal to the target plurality of genes or associated with an anchor sequence-mediated linkage comprising the target plurality of genes, or a site proximal to the anchor sequence comprises increasing or decreasing histone sumoylation of a histone associated with: the target plurality of genes, a transcriptional control element operably linked to the target plurality of genes, an anchor sequence proximal to the target plurality of genes or associated with an anchor sequence-mediated linkage comprising the target plurality of genes, or a site proximal to the anchor sequence.
In some embodiments, the method of epigenetic modification of a target plurality of genes, a transcription control element operably linked to the target plurality of genes, an anchor sequence proximal to the target plurality of genes or associated with an anchor sequence-mediated linkage comprising the target plurality of genes, or a site proximal to the anchor sequence comprises increasing or decreasing histone phosphorylation of a histone associated with: the target plurality of genes, a transcriptional control element operably linked to the target plurality of genes, an anchor sequence proximal to the target plurality of genes or associated with an anchor sequence-mediated linkage comprising the target plurality of genes, or a site proximal to the anchor sequence.
In some embodiments, a method of epigenetic modification of a target plurality of genes, a transcription control element operably linked to the target plurality of genes, an anchor sequence proximal to the target plurality of genes or an anchor sequence associated with an anchor sequence-mediated linkage comprising the target plurality of genes, or a site proximal to said anchor sequence, can reduce the level of epigenetic modification by at least 10%, 20%, 30%, 40%, 50%, 60%, 70%, 80%, 90% or 100% (and optionally up to 100%) relative to the level of epigenetic modification of the site in a cell not contacted with the composition or treated with the method. In some embodiments, a method of epigenetic modification of a target plurality of genes, a transcription control element operably linked to the target plurality of genes, an anchor sequence proximal to the target plurality of genes, or an anchor sequence associated with an anchor sequence-mediated linkage comprising the target plurality of genes, or a site proximal to said anchor sequence, can increase the level of epigenetic modification by at least 10%, 20%, 30%, 40%, 50%, 60%, 70%, 80%, 90%, 100%, 120%, 140%, 160%, 180%, 200%, 300%, 400%, 500%, 600%, 700%, 800%, 900% or 1000% (and optionally, up to 200%, 300%, 400%, 500%, 600%, 700%, 800%, 900%, 1000% or 2000%) relative to the level of epigenetic modification of the site in a cell not contacted with the composition or treated with the method. In some embodiments, epigenetic modifications to a target plurality of genes, a transcription control element operably linked to the target plurality of genes, an anchor sequence proximal to the target plurality of genes or associated with an anchor sequence-mediated linkage comprising the target plurality of genes, or a site proximal to the anchor sequence can modify the expression level of the target plurality of genes, e.g., as described herein.
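By way of illustration only, the percentage thresholds recited above can be read as a relative change against the level measured at the same site in cells not contacted with the composition. The following Python sketch shows that arithmetic; the function name and the example levels (fractions of modified sites) are hypothetical and are not taken from the disclosure.

```python
def percent_change(treated_level, control_level):
    """Relative change (in percent) of a modification level versus an untreated control.

    Negative values indicate a reduction, positive values an increase.
    """
    if control_level <= 0:
        raise ValueError("control_level must be positive")
    return 100.0 * (treated_level - control_level) / control_level

# Hypothetical levels: fraction of modified sites at the locus of interest.
reduction = percent_change(treated_level=0.25, control_level=0.5)   # a 50% reduction
increase = percent_change(treated_level=0.75, control_level=0.25)   # a 200% increase
```

Under this reading, a reduction cannot exceed 100%, whereas an increase has no fixed upper bound, which is consistent with the different ranges recited for reductions and increases.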
In some embodiments, the epigenetic modification produced by the methods described herein lasts at least about 1 hour to about 30 days, or at least about 2 hours, 6 hours, 12 hours, 18 hours, 24 hours, 2 days, 3 days, 4 days, 5 days, 6 days, 7 days, 8 days, 9 days, 10 days, 11 days, 12 days, 13 days, 14 days, 15 days, 16 days, 17 days, 18 days, 19 days, 20 days, 21 days, 22 days, 23 days, 24 days, 25 days, 26 days, 27 days, 28 days, 29 days, 30 days, or more, or any time therebetween. In some embodiments, such modulation lasts at least 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, or 24 hours, or at least 1, 2, 3, 4, 5, 6, or 7 days, or at least 1, 2, 3, 4, or 5 weeks, or at least 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, or 12 months, or at least 1, 2, 3, 4, or 5 years (e.g., indefinitely). Optionally, such modulation is continued for no more than 10, 9, 8, 7, 6, 5, 4, 3, 2, or 1 year.
In some embodiments, a site-specific breaker or system for use in a method of epigenetic modification of a target plurality of genes, a transcriptional control element operably linked to the target plurality of genes, an anchor sequence proximal to the target plurality of genes, or an anchor sequence associated with an anchor sequence-mediated linkage comprising the target plurality of genes, or a site proximal to the anchor sequence, comprises an effector moiety comprising an epigenetic modification moiety. For example, an effector moiety can comprise an epigenetic modification moiety having DNA methyltransferase activity, and an endogenous or naturally occurring target sequence (e.g., a gene in a target plurality of genes, a transcription control element operably linked to the target plurality of genes, an anchor sequence proximal to the target plurality of genes or associated with an anchor sequence-mediated linkage comprising the target plurality of genes, or a site proximal to the anchor sequence) can be altered to increase methylation thereof (e.g., to reduce interaction of a transcription factor with a gene or a portion of a transcription control element in a target plurality of genes, to reduce binding of a nucleation protein to an anchor sequence, and/or to interrupt or prevent anchor sequence-mediated linkage), or altered to reduce methylation thereof (e.g., to increase interaction of a transcription factor with a gene or a portion of a transcription control element in a target plurality of genes, to increase binding of a nucleation protein to an anchor sequence, and/or to promote or increase the strength of an anchor sequence-mediated linkage).
Genetic modification
The present disclosure further relates in part to a method of genetically modifying one or more (e.g., one, two, three, or all) genes of a target plurality of genes, a transcription control element operably linked to the target plurality of genes, or an anchor sequence proximal to the target plurality of genes or associated with an anchor sequence-mediated linkage comprising the target plurality of genes, the method comprising providing a site-specific breaker or system or a nucleic acid encoding the same or a composition comprising the site-specific breaker, system or nucleic acid; and contacting one or more (e.g., one, two, three, or all) genes of the target plurality of genes, a transcription control element operably linked to the target plurality of genes, or an anchor sequence proximal to the target plurality of genes or associated with an anchor sequence-mediated linkage comprising the target plurality of genes with the site-specific disruption agent, thereby genetically modifying the target plurality of genes, the transcription control element operably linked to the target plurality of genes, or an anchor sequence proximal to the target plurality of genes or associated with an anchor sequence-mediated linkage comprising the target plurality of genes.
Genetic modification may include introducing one or more of an insertion, deletion, or substitution into a gene in a target plurality of genes, a transcriptional control element operably linked to the target plurality of genes, or an anchor sequence proximal to or associated with an anchor sequence-mediated linkage comprising the target plurality of genes. In some embodiments, inserting comprises adding at least 150, 200, 250, 300, 400, 500, 600, 700, 800, 900, 1000, 1200, 1400, 1600, 1800, or 2000 nucleotides (and optionally no more than 3000, 2500, 2000, 1500, 1000, 900, 800, 700, 600, 500, 400, 300, or 200 nucleotides). In some embodiments, inserting comprises adding at least 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 15, 20, 25, 30, 35, 40, 50, 60, 70, 80, 90, or 100 nucleotides (optionally no more than 200, 150, 100, 90, 80, 70, 60, 50, 40, 35, 30, 25, 20, 15, 10, 9, 8, 7, 6, 5, 4, 3, 2, or 1 nucleotide). In some embodiments, the insertion includes adding 1-10, 1-9, 1-8, 1-7, 1-6, 1-5, 1-4, 1-3, or 1-2 nucleotides. In some embodiments, the deletion includes the removal of at least 150, 200, 250, 300, 400, 500, 600, 700, 800, 900, 1000, 1200, 1400, 1600, 1800, or 2000 nucleotides (and optionally no more than 3000, 2500, 2000, 1500, 1000, 900, 800, 700, 600, 500, 400, 300, or 200 nucleotides). In some embodiments, the deletion includes removing at least 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 15, 20, 25, 30, 35, 40, 50, 60, 70, 80, 90, or 100 nucleotides (optionally no more than 200, 150, 100, 90, 80, 70, 60, 50, 40, 35, 30, 25, 20, 15, 10, 9, 8, 7, 6, 5, 4, 3, 2, or 1 nucleotide). In some embodiments, the deletion includes the removal of 1-10, 1-9, 1-8, 1-7, 1-6, 1-5, 1-4, 1-3, or 1-2 nucleotides. 
In some embodiments, the substitution comprises a change of at least 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 15, 20, 25, 30, 35, 40, 50, 60, 70, 80, 90, or 100 nucleotides (optionally no more than 200, 150, 100, 90, 80, 70, 60, 50, 40, 35, 30, 25, 20, 15, 10, 9, 8, 7, 6, 5, 4, 3, 2, or 1 nucleotide). In some embodiments, the substitutions include altering 1-10, 1-9, 1-8, 1-7, 1-6, 1-5, 1-4, 1-3, or 1-2 nucleotides.
In some embodiments, the genetic modification comprises an insertion, deletion, or substitution of an anchor sequence (e.g., an anchor sequence associated with an ASMC comprising a target plurality of genes). In some embodiments, the genetic modification alters (e.g., decreases or increases) the binding of a component of the genomic complex (e.g., a nucleation polypeptide) to the anchor sequence. In some embodiments, the genetic modification completely or partially abrogates (e.g., by insertion, deletion, or substitution) the anchor sequence, thereby reducing or eliminating binding of the nucleating polypeptide to the anchor sequence, e.g., and reducing or eliminating the presence of an ASMC comprising the anchor sequence. Without wishing to be bound by theory, the present disclosure contemplates the use of site-specific disruption agents with genetic modification functions to introduce insertions, deletions, or substitutions into an anchor sequence to reduce or eliminate the involvement of the anchor sequence in a genomic complex (e.g., ASMC) comprising a target plurality of genes. As described elsewhere herein, such changes are expected to disrupt the genomic complex (e.g., ASMC) and potentially reduce expression of the target multiple genes.
In some embodiments, the genetic modification comprises insertion of a sequence comprising an anchor sequence. Without wishing to be bound by theory, the present disclosure contemplates the use of site-specific disrupters or systems with genetic modification functions to introduce exogenous anchor sequences into genes in a target plurality of genes, transcription control elements operably linked to the target plurality of genes, or anchor sequences proximal to the target plurality of genes or associated with anchor sequence-mediated linkages comprising the target plurality of genes. It is believed that the presence of the new anchor sequence may disrupt the formation and/or maintenance of a genomic complex (e.g., ASMC) comprising the target plurality of genes, thereby modulating (e.g., reducing) expression of the target plurality of genes.
The following examples are provided to further illustrate some embodiments of the disclosure, but are not intended to limit the scope of the disclosure; it will be appreciated by their exemplary nature that other procedures, methods or techniques known to those skilled in the art may alternatively be used.
Examples
Example 1: reducing expression of exemplary multiple genes
This example describes, in part, experiments demonstrating that site-specific disruption agents comprising a CRISPR/Cas molecule and a guide RNA comprising a given guide sequence reduce the expression of a target plurality of genes comprising CXCL1, CXCL2, CXCL3, and IL8.
Cas9/guide RNP complexes comprising an sgRNA and mRNA encoding a site-specific breaker comprising a CRISPR/Cas molecule (Cas9) were transfected into THP-1 cells by electroporation. Cells were cultured in RPMI + 10% FBS. Parental lines were also analyzed for comparison.
For each edited cell line and parental control, 350k cells were plated in quadruplicate into 24-well plates. After one hour, 10ng/ml of TNFα (Sigma catalog number 654205) was added to 2 wells of each cell line. The remaining 2 wells were untreated as controls.
The edited cells and parental cells were incubated with TNFα for 24 hours. Thereafter, DNA and RNA were isolated using the AllPrep DNA/RNA kit (Qiagen) according to the manufacturer's protocol. RNA samples were reverse transcribed into cDNA using an RT SuperMix kit (New England Biolabs) and analyzed by quantitative PCR using separate CXCL1-, CXCL2-, CXCL3- and IL8-specific TaqMan primer/probe sets and a Fast Advanced Master Mix (Thermo Fisher Scientific).
CXCL1-3 and IL8 gene expression was quantified relative to a human GAPDH reference gene using the ΔΔCt method. The change in gene expression was further quantified by measuring the fold increase in gene expression following TNFα treatment directly compared to the level of gene expression in untreated cells.
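As a non-limiting illustration of the ΔΔCt calculation referred to above, the following Python sketch computes fold induction from threshold cycle (Ct) values. The Ct values are invented for illustration, the function name is hypothetical, and the calculation assumes approximately 100% amplification efficiency (one doubling per cycle); it is not the analysis code used in the examples.

```python
def fold_change_ddct(ct_target_treated, ct_ref_treated,
                     ct_target_control, ct_ref_control):
    """Relative expression by the ΔΔCt method (assumes ~100% PCR efficiency)."""
    # Normalize the target gene to the reference gene (e.g., GAPDH) in each sample.
    d_ct_treated = ct_target_treated - ct_ref_treated
    d_ct_control = ct_target_control - ct_ref_control
    # Compare the treated sample to the untreated control.
    dd_ct = d_ct_treated - d_ct_control
    return 2.0 ** (-dd_ct)

# Hypothetical Ct values: target gene with and without TNFα, GAPDH as reference.
induction_parental = fold_change_ddct(22.0, 18.0, 27.0, 18.0)  # ΔΔCt = -5
induction_edited = fold_change_ddct(24.0, 18.0, 27.0, 18.0)    # ΔΔCt = -3

# Ratio of the two inductions: how much weaker the TNFα response is in edited cells.
relative_induction = induction_edited / induction_parental
```

With these invented numbers the parental cells show a 32-fold induction and the edited cells an 8-fold induction, i.e., the edited cells reach one quarter of the parental response.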
CTCF anchors at the two boundaries of the insulated genomic domain (IGD) were located using ChIP-seq data, and the CTCF anchor sequences were identified computationally using a known CTCF position weight matrix (JASPAR). CRISPR (SpCas9) guides were selected to target the CTCF anchor sequences.
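The computational identification step can be sketched, in simplified form, as a log-odds scan of a sequence with a position weight matrix. The Python sketch below is illustrative only: the four-position count matrix is a toy stand-in and not the JASPAR CTCF matrix, the uniform background frequency and the pseudocount are assumptions, only the forward strand is scanned, and all names are hypothetical.

```python
import math

BASES = "ACGT"

def log_odds_matrix(counts, pseudocount=0.5, background=0.25):
    """Convert per-position base counts into log2 odds scores."""
    matrix = []
    for column in counts:
        total = sum(column[b] for b in BASES) + 4 * pseudocount
        matrix.append({
            b: math.log2(((column[b] + pseudocount) / total) / background)
            for b in BASES
        })
    return matrix

def best_match(sequence, matrix):
    """Return (score, start) of the highest-scoring forward-strand window.

    The sequence is assumed to contain only A, C, G and T.
    """
    width = len(matrix)
    best = None
    for start in range(len(sequence) - width + 1):
        window = sequence[start:start + width]
        score = sum(matrix[i][base] for i, base in enumerate(window))
        if best is None or score > best[0]:
            best = (score, start)
    return best

# Toy 4-position motif that strongly prefers "CGCG" (NOT the real CTCF matrix).
TOY_COUNTS = [
    {"A": 0, "C": 10, "G": 0, "T": 0},
    {"A": 0, "C": 0, "G": 10, "T": 0},
    {"A": 0, "C": 10, "G": 0, "T": 0},
    {"A": 0, "C": 0, "G": 10, "T": 0},
]

best_score, best_start = best_match("ATTACGCGTTAA", log_odds_matrix(TOY_COUNTS))
```

In practice, a full-length matrix would be scanned on both strands within the ChIP-seq peak regions, and a score threshold would be applied before guides are designed against the matching positions.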
The sequences of the guides are listed in the following table.
Table 4.
Example 2: cytokine expression in THP-1 cells was reduced at 72 hours
This example describes, in part, experiments demonstrating that the use of a site-specific breaker comprising a CRISPR/Cas molecule and a guide RNA targeting an anchor sequence of an ASMC comprising a target plurality of genes reduces the expression of the target plurality of genes comprising CXCL1, CXCL2, CXCL3 and IL8.
RNP comprising the sgRNAs and mRNA encoding Cas9 were electroporated into THP-1 cells. The sgRNA sequence (from Example 1) was selected to target one of the CTCF sites of the ASMC comprising CXCL1, CXCL2, CXCL3 and IL8. Transfected cells were incubated with 10 ng/ml TNFα for 24 hours. Thereafter, DNA and RNA were isolated using the AllPrep DNA/RNA kit (Qiagen) according to the manufacturer's protocol. RNA samples were reverse transcribed into cDNA using an RT SuperMix kit (New England Biolabs) and analyzed by quantitative PCR using separate CXCL1-, CXCL2-, CXCL3- and IL8-specific TaqMan primer/probe sets and a Fast Advanced Master Mix (Thermo Fisher Scientific).
CXCL1-3 and IL8 gene expression was quantified relative to a human GAPDH reference gene using the ΔΔCt method. The change in gene expression was further quantified by measuring the fold increase in gene expression following TNFα treatment directly compared to the level of gene expression in untreated cells. The results in FIG. 6 demonstrate that site-specific disruption agents comprising CRISPR/Cas molecules and sgRNAs can be used to reduce CXCL1, CXCL2, CXCL3 and IL8 expression in THP-1 cells, and that expression is reduced 72 hours after treatment. The results also indicate that LNP delivery of the site-specific breaker can be used to deliver an effective amount of the agent to the target cells.
Similar results were seen when the experiments were repeated with sgRNAs targeting the other CTCF site of the ASMC comprising CXCL1, CXCL2, CXCL3 and IL8 (data not shown).
Example 3: site-specific modulators reduce cytokine protein secretion by THP-1 cells
This example describes, in part, experiments demonstrating that secretion of the protein products of two genes of a target plurality of genes, CXCL1 and IL-8, is reduced by treating cells with a site-specific breaker comprising a CRISPR/Cas molecule and a guide RNA targeting the anchor sequence of an ASMC comprising the target plurality of genes.
THP-1 cells were electroporated with RNP comprising sgRNA and mRNA encoding a site-specific breaker comprising an exemplary CRISPR/Cas molecule (Cas9) as in the previous examples. The sgRNAs (from Example 1) target one of the CTCF sites of the ASMC comprising CXCL1 and IL-8. Cells were stimulated with 10 ng/ml TNFα for 24 hours. Cell supernatants were then collected and frozen at -80 °C. Supernatants from cells contacted with 4 different sgRNAs and mRNA encoding the CRISPR/Cas molecule, as well as from untransfected positive controls, were screened for CXCL1 and IL-8 protein levels on a cytokine panel by Myriad Genetics Inc. FIG. 7 shows that reduced CXCL1 and IL-8 levels were observed in each supernatant obtained from cells treated with sgRNAs and CRISPR/Cas molecule RNPs, indicating a phenotypic response to ASMC disruption (e.g., by disruption of the interaction between the anchor sequence and the nucleation polypeptide, e.g., disruption of CTCF binding). These data are consistent with the reduced mRNA expression observed by qPCR in Example 2.
Similar results were seen when the experiments were repeated with sgRNAs targeting the other CTCF site of the ASMC comprising CXCL1 and IL-8 (data not shown).
Example 4: CXCL3 expression reduction as measured by qPCR
This example describes, in part, experiments demonstrating that treatment of THP-1 cells with site-specific disruption agents comprising a CRISPR/Cas molecule and various guide RNAs targeting anchor sequences of ASMC comprising a target plurality of genes reduces CXCL3 expression.
THP-1 cells were grown in RPMI + 10% FBS. Cells were transfected using LNP with mRNA encoding a CRISPR/Cas molecule (Cas9) and sgRNAs targeting either CTCF site of the ASMC comprising the target plurality of genes. The sgRNAs used (from Example 1) target either the left or the right CTCF site, as shown in FIG. 9A. Lipid nanoparticle (LNP) formulation was performed with an SSOP lipid mixture using Spark™ from Precision NanoSystems Inc. For each transfection condition, 350k cells/well, along with parental controls, were seeded into 24-well plates 72 hours after LNP transfection (see the flow chart of FIG. 8). After one hour, 10 ng/ml TNFα (Sigma catalog number 654205) was added to each well. Untreated parental cells were seeded with and without 10 ng/ml TNFα.
Transfected cells were incubated with TNFα for 24 hours. Thereafter, DNA and RNA were isolated using the AllPrep DNA/RNA kit (Qiagen) according to the manufacturer's protocol. RNA samples were reverse transcribed into cDNA using an RT SuperMix kit (New England Biolabs) and analyzed by quantitative PCR using a separate CXCL3-specific TaqMan primer/probe set and a Fast Advanced Master Mix (Thermo Fisher Scientific).
The ΔΔCt method was used to quantify CXCL3 gene expression relative to a human GAPDH reference gene. The change in gene expression was further quantified by measuring the fold increase in gene expression following TNFα treatment directly compared to the level of gene expression in untreated cells. The results show (FIG. 8) that site-specific disruption agents comprising CRISPR/Cas molecules and several different sgRNAs can be used to reduce CXCL3 expression in THP-1 cells. The results also indicate that LNP delivery of the site-specific breaker can be used to deliver an effective amount of the agent to the target cells. The results further indicate that targeting anchor sequences on either side of an ASMC comprising a target plurality of genes can reduce expression of the target plurality of genes.
Similar results were seen when the experiments were repeated with sgRNAs targeting the other CTCF site of the ASMC comprising CXCL1, CXCL2, CXCL3 and IL8 (data not shown).
Example 5: reduced CXCL1 and CXCL3 expression 3 weeks after transfection
This example describes, in part, experiments demonstrating stable reduction of CXCL1 and CXCL3 expression in THP-1 cells three weeks after transfection with site-specific disruption agents comprising a CRISPR/Cas molecule and various guide RNAs targeting the anchor sequences of ASMC comprising a target plurality of genes.
Cells and LNP were prepared and samples were analyzed as in example 4, except that transfected cells were incubated for 3 weeks prior to tnfα stimulation (see flow chart of fig. 9A).
The results show (fig. 9A and 9B) that site-specific disruption agents comprising a CRISPR/Cas molecule and several different sgrnas can be used to stably reduce CXCL1 and CXCL3 expression in THP-1 cells up to and including 3 weeks after treatment with LNP comprising the one or more agents. The results also indicate that targeting anchor sequences on either side of an ASMC comprising a target plurality of genes can stably reduce expression of the target plurality of genes.
Example 6: agents comprising KRAB effector moieties reduce CXCL1 expression
This example describes, in part, experiments demonstrating reduced expression of CXCL1 in THP-1 cells after transfection with site-specific disruption agents comprising a CRISPR/Cas molecule fused to a transcription repressor and various guide RNAs targeting anchor sequences of ASMC comprising a target plurality of genes.
THP-1 cells were grown in RPMI + 10% FBS. Cells were transfected using LNP with mRNA encoding a CRISPR/Cas molecule fused to a transcription repressor (dCas9-KRAB) and sgRNAs (from Example 1) targeting either CTCF site of the ASMC comprising the target plurality of genes. Lipid nanoparticle (LNP) formulation was performed with an SSOP lipid mixture using Spark™ from Precision NanoSystems Inc. For each transfection condition, 350k cells/well, along with parental controls, were seeded into 24-well plates 72 hours after LNP transfection. After one hour, 10 ng/ml TNFα (Sigma catalog number 654205) was added to each well. Untreated parental cells were seeded with and without 10 ng/ml TNFα. Transfection with mRNA encoding a CRISPR/Cas molecule (Cas9) and sgRNA (as in Examples 2, 4 and 5) served as a positive control.
Transfected cells were incubated with TNFα for 24 hours. Thereafter, DNA and RNA were isolated using the AllPrep DNA/RNA kit (Qiagen) according to the manufacturer's protocol. RNA samples were reverse transcribed into cDNA using an RT SuperMix kit (New England Biolabs) and analyzed by quantitative PCR using a separate CXCL1-specific TaqMan primer/probe set and a Fast Advanced Master Mix (Thermo Fisher Scientific).
The ΔΔCt method was used to quantify CXCL1 gene expression relative to a human GAPDH reference gene. The change in gene expression was further quantified by measuring the fold increase in gene expression following TNFα treatment directly compared to the level of gene expression in untreated cells. The results show (FIG. 10) that CXCL1 expression in THP-1 cells can be reduced using a site-specific breaker comprising a catalytically inactive CRISPR/Cas molecule fused to the transcription repressor KRAB and targeted to CTCF sites by different sgRNAs. Reduced expression was observed 72 hours post transfection.
Example 7: agents comprising an EZH2 effector moiety reduce CXCL1 expression
This example describes, in part, experiments demonstrating reduced expression of CXCL1 in THP-1 cells after transfection with a site-specific disruption agent comprising a CRISPR/Cas molecule fused to histone methyltransferase (EZH2) and various guide RNAs targeting anchor sequences of an ASMC comprising a target plurality of genes.
THP-1 cells were grown in RPMI + 10% FBS. Cells were transfected using LNP with mRNA encoding a catalytically inactive CRISPR/Cas molecule (dCas9) fused to histone methyltransferase EZH2 (dCas9-EZH2) and sgRNAs (from Example 1) targeting either CTCF site of the ASMC comprising the target plurality of genes. Lipid nanoparticle (LNP) formulation was performed with an SSOP lipid mixture using Spark™ from Precision NanoSystems Inc. For each transfection condition, 350k cells/well, along with parental controls, were seeded into 24-well plates 72 hours after LNP transfection. After one hour, 10 ng/ml TNFα (Sigma catalog number 654205) was added to each well. Untreated parental cells were seeded with and without 10 ng/ml TNFα.
Transfected cells were incubated with TNFα for 24 hours. Thereafter, DNA and RNA were isolated using the AllPrep DNA/RNA kit (Qiagen) according to the manufacturer's protocol. RNA samples were reverse transcribed into cDNA using an RT SuperMix kit (New England Biolabs) and analyzed by quantitative PCR using a separate CXCL1-specific TaqMan primer/probe set and a Fast Advanced Master Mix (Thermo Fisher Scientific).
The ΔΔCt method was used to quantify CXCL1 gene expression relative to a human GAPDH reference gene. The change in gene expression was further quantified by measuring the fold increase in gene expression following TNFα treatment directly compared to the level of gene expression in untreated cells. The results show (FIG. 11) that CXCL1 expression in THP-1 cells can be reduced using a site-specific breaker comprising a catalytically inactive CRISPR/Cas molecule fused to histone methyltransferase (EZH2) and targeted to CTCF sites by different sgRNAs. Reduced expression was observed 72 hours post transfection.
Example 8: agents comprising MQ1 effector moieties reduce CXCL1 expression
This example describes, in part, experiments demonstrating reduced expression of CXCL1 in THP-1 cells after transfection with site-specific disruption agents comprising a CRISPR/Cas molecule fused to a DNA methyltransferase (MQ 1) and various guide RNAs targeting anchor sequences of ASMC comprising a target plurality of genes.
THP-1 cells were grown in RPMI + 10% FBS. Cells were transfected using LNP with mRNA encoding a catalytically inactive CRISPR/Cas molecule (dCas9) fused to MQ1 (dCas9-MQ1) and sgRNAs (from Example 1) targeting either CTCF site of the ASMC comprising the target plurality of genes. Lipid nanoparticle (LNP) formulation was performed with an SSOP lipid mixture using Spark™ from Precision NanoSystems Inc. For each transfection condition, 350k cells/well, along with parental controls, were seeded into 24-well plates 72 hours after LNP transfection. After one hour, 10 ng/ml TNFα (Sigma catalog number 654205) was added to each well. Untreated parental cells were seeded with and without 10 ng/ml TNFα.
Transfected cells were incubated with TNFα for 24 hours. Thereafter, DNA and RNA were isolated using the AllPrep DNA/RNA kit (Qiagen) according to the manufacturer's protocol. RNA samples were reverse transcribed into cDNA using an RT SuperMix kit (New England Biolabs) and analyzed by quantitative PCR using separate CXCL1- and CXCL3-specific TaqMan primer/probe sets and a Fast Advanced Master Mix (Thermo Fisher Scientific).
The ΔΔCt method was used to quantify CXCL1 and CXCL3 gene expression relative to a human GAPDH reference gene. The change in gene expression was further quantified by measuring the fold increase in gene expression following TNFα treatment directly compared to the level of gene expression in untreated cells. The results show (FIG. 12) that CXCL1 expression in THP-1 cells can be reduced using a site-specific breaker comprising a catalytically inactive CRISPR/Cas molecule fused to a DNA methyltransferase (MQ1) and targeted to CTCF sites by different sgRNAs. Reduced expression was observed 72 hours post transfection. Similar results were observed when CXCL3 expression was measured (data not shown).
Example 9: durable CXCL1 expression reduction after Cas9 or dCas9-EZH2 treatment
This example describes, in part, experiments demonstrating stable reduction of CXCL1 expression in THP-1 cells up to 4 weeks after transfection with a site-specific breaker comprising a CRISPR/Cas molecule, or a catalytically inactive CRISPR/Cas molecule fused to histone methyltransferase (EZH2), and an sgRNA targeting an anchor sequence of an ASMC comprising a target plurality of genes (comprising CXCL1).
Using the ATx™ scalable transfection system (MaxCyte), THP-1 cells grown in RPMI + 10% FBS were electroporated in a processing assembly, at 5 million cells/condition, with sgRNA (from Example 1) and mRNA encoding either of the site-specific breakers (Cas9 or dCas9-EZH2). Samples of transfected cells were harvested and incubated with TNFα for 24 hours. This was repeated once a week for 4 weeks. Thereafter, DNA and RNA were isolated using the AllPrep DNA/RNA kit (Qiagen) according to the manufacturer's protocol. RNA samples were reverse transcribed into cDNA using an RT SuperMix kit (New England Biolabs) and analyzed by quantitative PCR using separate CXCL1-, CXCL2-, CXCL3- and IL8-specific TaqMan primer/probe sets and a Fast Advanced Master Mix (Thermo Fisher Scientific).
The ΔΔCt method was used to quantify CXCL1 gene expression relative to a human GAPDH reference gene. The change in gene expression was further quantified by measuring the fold increase in gene expression following TNFα treatment directly compared to the level of gene expression in untreated cells. The results show (FIG. 13) that site-specific disruption agents comprising a CRISPR/Cas molecule, or a catalytically inactive CRISPR/Cas molecule fused to histone methyltransferase (EZH2), and targeted to CTCF sites by sgRNAs can be used to reduce CXCL1 expression in THP-1 cells. The decrease in expression lasted at least 4 weeks and was also observed 72 hours and 3 weeks after transfection.
Example 10: CXCL3 expression reduction after treatment with EZH2-dCas9-KRAB and sgRNA
This example describes, in part, experiments demonstrating reduced expression of CXCL3 in THP-1 cells after transfection with a site-specific disruption agent comprising a catalytically inactive CRISPR/Cas molecule fused to histone methyltransferase (EZH 2) and transcription repressor (KRAB) and various guide RNAs targeting anchor sequences of ASMC comprising a target plurality of genes.
THP-1 cells were grown in RPMI + 10% FBS. Several different site-specific breakers were tested: G9A-dCas9-EZH2 (G9A fused to dCas9 fused to EZH2), G9A-dCas9-KRAB and EZH2-dCas9-KRAB. Cells were transfected using LNP with mRNA encoding a site-specific breaker and sgRNAs targeting CTCF sites of the ASMC comprising the target plurality of genes. The sgRNAs were selected to target genomic DNA sites proximal to, but at a distance from, the left CTCF site (e.g., 80, 160, 235, or 300 nucleotides from the CTCF site). Exemplary guide sequences targeting genomic DNA sites proximal to, but at a distance from, the left CTCF site are given in Table 5.
TABLE 5
Guide       Guide sequence          Genomic coordinates       SEQ ID NO
GD-29251    CCAATGAAGATGAAACTGGG    chr4:74595215-74595237    30
GD-29252    AACGTGCTTGCCTAAGATTC    chr4:74595370-74595392    31
GD-29253    AGCCCTTAATCATATCTAGT    chr4:74595560-74595582    32
GD-29254    CAGAGCTTAAGACCTGTACT    chr4:74595642-74595664    33
GD-29255    GCCCACCTTGACCTTCACAA    chr4:74595787-74595809    34
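For readers handling Table 5 programmatically, the following Python sketch performs simple consistency checks on the listed guides. The tuple layout and variable names are hypothetical. The coordinates are treated as inclusive, which is an assumption; under that assumption each interval spans 23 positions, which would be consistent with a 20-nucleotide guide plus a 3-nucleotide PAM, although the table itself does not state this.

```python
# (guide ID, guide sequence, chromosome, start, end, SEQ ID NO) as listed in Table 5.
TABLE_5 = [
    ("GD-29251", "CCAATGAAGATGAAACTGGG", "chr4", 74595215, 74595237, 30),
    ("GD-29252", "AACGTGCTTGCCTAAGATTC", "chr4", 74595370, 74595392, 31),
    ("GD-29253", "AGCCCTTAATCATATCTAGT", "chr4", 74595560, 74595582, 32),
    ("GD-29254", "CAGAGCTTAAGACCTGTACT", "chr4", 74595642, 74595664, 33),
    ("GD-29255", "GCCCACCTTGACCTTCACAA", "chr4", 74595787, 74595809, 34),
]

# Length of each guide sequence in nucleotides.
guide_lengths = [len(seq) for _, seq, _, _, _, _ in TABLE_5]

# Length of each genomic interval, treating both coordinates as inclusive (assumption).
interval_lengths = [end - start + 1 for _, _, _, start, end, _ in TABLE_5]

# Offset of each interval start from the first listed guide, in nucleotides.
first_start = TABLE_5[0][3]
start_offsets = [start - first_start for _, _, _, start, _, _ in TABLE_5]

# Every guide should contain only unambiguous DNA bases.
all_valid_bases = all(set(seq) <= set("ACGT") for _, seq, _, _, _, _ in TABLE_5)
```

The offsets between interval starts give the spacing of the tiled guides along the region adjacent to the left CTCF site; they do not by themselves give the distance to the CTCF site, whose coordinates are not listed in the table.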
Lipid nanoparticle (LNP) formulation was performed with an SSOP lipid mixture using Spark™ from Precision NanoSystems Inc. For each transfection condition, 350k cells/well, along with parental controls, were seeded into 24-well plates 72 hours after LNP transfection. After one hour, 10 ng/ml TNFα (Sigma catalog number 654205) was added to each well. Untreated parental cells were seeded with and without 10 ng/ml TNFα.
Transfected cells were incubated with TNFα for 24 hours. Thereafter, DNA and RNA were isolated using the AllPrep DNA/RNA kit (Qiagen) according to the manufacturer's protocol. RNA samples were reverse transcribed into cDNA using an RT SuperMix kit (New England Biolabs) and analyzed by quantitative PCR using separate CXCL1- and CXCL3-specific TaqMan primer/probe sets and a Fast Advanced Master Mix (Thermo Fisher Scientific).
The ΔΔCt method was used to quantify CXCL1 and CXCL3 gene expression relative to a human GAPDH reference gene. The change in gene expression was further quantified by measuring the fold increase in gene expression following TNFα treatment directly compared to the level of gene expression in untreated cells. The results show (FIG. 14) that CXCL3 expression in THP-1 cells can be reduced using a site-specific disruption agent comprising a catalytically inactive CRISPR/Cas molecule fused to histone methyltransferase (EZH2) and transcription repressor (KRAB) and targeted to sites proximal to CTCF by sgRNAs. Similar results were observed when CXCL1 expression was measured (data not shown).
Example 11: CXCL1 expression reduction after treatment with site-specific disruption agent and sgRNA
This example describes, in part, experiments demonstrating reduced expression of CXCL1 in THP-1 cells following transfection with various exemplary site-specific disruption agents, including: a catalytically inactive CRISPR/Cas molecule fused to a DNA methyltransferase (DNMT3a/3l); a catalytically inactive CRISPR/Cas molecule fused to a histone deacetylase (HDAC8); or a catalytically inactive CRISPR/Cas molecule fused to a histone deacetylase (HDAC8) and a histone methyltransferase (EZH2), together with a variety of guide RNAs targeting anchor sequences of the ASMC comprising a target plurality of genes.
THP-1 cells were grown in RPMI+10% FBS. Several different site-specific disruption agents were tested: dCas9-DNMT3a/3l (DNMT3a/3l fused to dCas9), dCas9-HDAC8 and EZH2-dCas9-HDAC8. Cells were transfected, using LNP, with mRNA encoding a site-specific disruption agent and sgRNAs (from Example 1) targeting any CTCF site of the ASMC comprising the target plurality of genes. Lipid nanoparticle (LNP) formulation was performed on the Spark™ instrument from Precision NanoSystems using an SSOP lipid mixture. For each transfection condition, 350k cells/well, along with parental controls, were seeded into 24-well plates 72 hours after LNP transfection. After one hour, 10 ng/ml of TNFα (Sigma catalog number 654205) was added to each well. Untreated parental cells were seeded with and without 10 ng/ml TNFα.
Transfected cells were incubated with TNFα for 24 hours. Thereafter, DNA and RNA were isolated using the AllPrep DNA/RNA kit (Qiagen) according to the manufacturer's protocol. RNA samples were reverse transcribed into cDNA using an RT SuperMix kit (New England Biolabs) and analyzed by quantitative PCR using a separate CXCL1-specific TaqMan primer/probe set and a Fast Advanced Master Mix (Thermo Fisher Scientific).
The ΔΔCt method was used to quantify CXCL1 gene expression relative to a human GAPDH reference gene. The change in gene expression was further quantified by measuring the fold increase in gene expression following TNFα treatment, compared directly to the level of gene expression in untreated cells. The results show (FIG. 15) that site-specific disruption agents comprising dCas9-DNMT3a/3l, dCas9-HDAC8 or EZH2-dCas9-HDAC8 can be used to reduce CXCL1 expression in THP-1 cells, and that these agents can be effective in reducing cytokine expression when CTCF sites are targeted by several different sgRNAs.
Example 12: CXCL Gene cluster expression reduction following treatment with dCas9-EZH2 and guide 30183 in human A549 lung cancer epithelial cells and IMR-90 cells
This example demonstrates reduced CXCL gene cluster expression in human A549 lung cancer epithelial cells and IMR-90 cells when treated with dCas9-EZH2 and guide 30183 (control 1).
Human A549 cells (ATCC CCL-185) and IMR-90 cells (ATCC CCL-186) were seeded in 100 μl of medium in flat-bottom, cell culture-treated plates at a density of 15,000 cells per well. A549 cells received F12/K medium (ATCC 30-2004), and IMR-90 cells received EMEM medium (ATCC 30-2003). Both complete media were made with 10% FBS (VWR catalog No. 97068-085). After 24 hours of attachment to the plate, LNP containing guide 30183 and the EZH2-dCas9 control was added to the medium at a final concentration of 2 μg/ml SSOP lipid mixture. After 6 hours, the medium was changed to 100 μl of the appropriate medium and the cells were incubated for 72 hours. After the 72-hour incubation, TNFα (Sigma catalog number 654205) was added to the designated wells at a final concentration of 10 ng/ml and incubated for 24 hours. After 24 hours, RNA was isolated using a 96 RNA Core kit (Macherey-Nagel Inc., catalog number 740466.4) according to the manufacturer's protocol. RNA samples were reverse transcribed into cDNA using an RT SuperMix kit (New England Biolabs) and analyzed by quantitative PCR using separate gene-specific TaqMan primer/probe sets and a Fast Advanced Master Mix (Thermo Fisher Scientific). Gene expression was quantified relative to a human ABL1 reference gene using the ΔΔCt method. The change in gene expression was further quantified by measuring the fold increase in gene expression following TNFα treatment, compared directly to the level of gene expression in untreated cells.
The data show that the expression levels of genes in the CXCL gene cluster (specifically CXCL1, CXCL2, CXCL3, CXCL4, CXCL5, CXCL6, CXCL7, and IL-8) in human A549 lung cancer epithelial cells were reduced by 40%-70% when treated with dCas9-EZH2 (FIG. 17). When the middle CTCF site was targeted with dCas9-EZH2 and GD-30183, the expression levels of genes of the CXCL gene cluster (specifically CXCL1, CXCL2, CXCL3 and IL-8) in IMR-90 cells were reduced by about 50% (FIG. 18).
Example 13: reduced CXCL Gene cluster expression following treatment with dCAS9-EZH2 and guide GD-28481 in human monocytes
This example demonstrates reduced CXCL gene cluster expression in human monocytes when treated with dCas-based effectors (control a).
The Cas9/guide RNP complex was transfected into THP-1 cells (ATCC-TIB-202) by electroporation by Synthego.
After receiving the edited cell line, the vials were thawed and the cells were cultured in RPMI+10% FBS (VWR catalog No. 97068-085) for one week to allow the cells to recover from the freeze-thaw. The parental unedited THP-1 cell line was also analyzed for comparison.
350,000 cells of the edited cell line, plated in quadruplicate, and parental controls were seeded into 24-well plates. After one hour, 10 ng/ml of TNFα (Sigma catalog number 654205) was added. Untreated control wells were also included to compare fold increases in chemokine expression.
The edited cells and parental cells were incubated with TNFα for 24 hours. Thereafter, DNA and RNA were isolated using the AllPrep DNA/RNA kit (Qiagen) according to the manufacturer's protocol. RNA samples were reverse transcribed into cDNA using an RT SuperMix kit (New England Biolabs) and analyzed by quantitative PCR using separate CXCL1-, CXCL2-, CXCL3- and IL8-specific TaqMan primer/probe sets and a Fast Advanced Master Mix (Thermo Fisher Scientific).
CXCL1-3 and IL8 gene expression was quantified relative to a human GAPDH reference gene using the ΔΔCt method. The change in gene expression was further quantified by measuring the fold increase in gene expression following TNFα treatment, compared directly to the level of gene expression in untreated cells.
The data show that, 24 hours after dosing, the expression of CXCL1, CXCL2, CXCL3 and IL-8 in control A-treated monocytes was reduced by 65%, 55%, 88% and 52%, respectively, compared to the expression of the same genes in untreated monocytes (FIG. 19).
Example 14: Reduced mouse CXCL gene cluster expression in Hepa 1.6 cells following treatment with dCas9-MQ1 and sgRNAs targeting three anchor sequences
This example demonstrates down-regulation of mouse CXCL gene cluster expression in Hepa 1.6 cells when treated with dCas9-MQ1 and sgRNAs targeting three anchor sequences.
Mouse Hepa 1.6 cells (ATCC CRL-1830) were seeded at 10k cells per well in 100 μl of medium (DMEM, Gibco catalog No. 11995-065; 10% FBS, VWR catalog No. 97068-085) in flat-bottom, cell culture-treated plates. After 24 hours of attachment to the plate, the cultures were divided into four treatment groups and three control groups. LNP containing (i) guide GD-30594 targeting the right CTCF and the dCas9-MQ1 effector, (ii) guide GD-30592 targeting middle CTCF 1 and the dCas9-MQ1 effector, (iii) guide GD-30593 targeting the middle CTCF and the dCas9-MQ1 effector, or (iv) a combination of GD-30594, GD-30592 and GD-30593 targeting the middle and right CTCF and the dCas9-MQ1 effector was added to the cell cultures of the treatment groups at a final concentration of 2 μg/mL SSOP lipid mixture. Untreated cells, cells treated with LNP, cells treated with TNF, and LNP containing a transfection control guide were used as controls. After 6 hours, the medium was changed to 100 μl DMEM and the cells were incubated for 72 hours. After the 72-hour incubation, TNFα (Sigma catalog number 654245) was added to the indicated wells at a final concentration of 10 ng/ml and incubated for 24 hours. After 24 hours, RNA was isolated using a 96 RNA Core kit (Macherey-Nagel, catalog No. 740466.4) according to the manufacturer's protocol. RNA samples were reverse transcribed into cDNA using an RT SuperMix kit (New England Biolabs) and analyzed by quantitative PCR using separate gene-specific TaqMan primer/probe sets and a Fast Advanced Master Mix (Thermo Fisher Scientific). Gene expression was quantified relative to the mouse HPRT reference gene using the ΔΔCt method. The change in gene expression was further quantified by measuring the fold increase in gene expression following TNFα treatment, compared directly to the level of gene expression in untreated cells.
The data show that dCas9-MQ1-treated cells transfected with a guide targeting either the right CTCF motif or one of the two middle CTCF motifs in the CXCL gene cluster showed some down-regulation of all seven CXCL genes following TNFα stimulation (FIG. 21B). However, when cells were treated with dCas9-MQ1 and transfected with a combination of guides targeting the middle and right CTCF, the entire CXCL gene cluster was significantly more down-regulated (FIG. 21B).
Example 15: Systemic administration of dCas9-MQ1 demonstrated a significant reduction in leukocyte infiltration of the inflamed lung
This example demonstrates that systemic administration of dCas9-MQ1 reduces leukocyte infiltration in vivo in the mouse lung.
The mouse lipopolysaccharide (LPS) pneumonia model was used to study acute inflammation of the lung. LNP containing DOTAP 1% PEG short ncRNA was used as a control. Each mouse received a 3 mg/kg dose of LNP-DOTAP or dCas9-MQ1 intravenously at -2 hours. Mice were challenged at 0 hours by oral inhalation of 5 mg/kg LPS. A second 3 mg/kg dose of LNP-DOTAP or dCas9-MQ1 was administered at the +8 hour time point. Dexamethasone was administered intraperitoneally at 10 mg/kg at hours 0, 24 and 48. Animals were sacrificed at 72 hours and bronchoalveolar lavage fluid was collected from the lungs for flow cytometry staining. The reduction in neutrophil infiltration in BALF was used as a measure of the severity of the inflammatory response.
Bronchoalveolar lavage fluid (BALF) collected from the lungs of dCas9-MQ1-treated animals showed about 5.0×10^5 white blood cells/mL. The sham group, i.e., untreated healthy mice, did not have any significant number of white blood cells in the BALF. LPS-treated mice, dexamethasone-treated mice and LNP-DOTAP-treated mice showed about 8.0×10^5 white blood cells/mL, about 7.2×10^5 white blood cells/mL and about 6.0×10^5 white blood cells/mL, respectively, in their BALF (FIG. 22B). A 56% reduction in neutrophil infiltration in BALF was also observed in mice 72 hours after treatment with dCas9-MQ1 compared to the disease control.
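The approximate white blood cell counts reported above can be put on a common scale by expressing each group as a percent reduction relative to the LPS disease group. The following is a minimal sketch using only the approximate values stated in the text; the function and variable names are hypothetical. Because the inputs are approximate and are total white blood cell counts, the results are rough and are distinct from the 56% reduction in neutrophil infiltration, which concerns a different cell population.

```python
def percent_reduction(reference, treated):
    """Percent decrease of `treated` relative to `reference`."""
    if reference <= 0:
        raise ValueError("reference count must be positive")
    return 100.0 * (reference - treated) / reference


# Approximate white blood cell counts per mL of BALF, as stated in the text.
balf_wbc_per_ml = {
    "LPS": 8.0e5,
    "dexamethasone": 7.2e5,
    "LNP-DOTAP": 6.0e5,
    "dCas9-MQ1": 5.0e5,
}

reductions = {
    group: percent_reduction(balf_wbc_per_ml["LPS"], count)
    for group, count in balf_wbc_per_ml.items()
    if group != "LPS"
}

for group, value in reductions.items():
    print(f"{group}: {value:.1f}% fewer white blood cells than LPS alone")
```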
Example 16: systemic administration of dCas9-MQ1 showed a significant reduction in neutrophil infiltration of BALF
The BALF obtained in example 15 was analyzed in this example to evaluate cell populations.
Flow cytometry analysis with the following staining panel was used to evaluate the cell populations in the BALF obtained in Example 15, and the percentage of each cell type present in BALF at termination was recorded (FIG. 23A). Neutrophil counts in BALF were also plotted using the following antibody staining panel.
Alveolar macrophages: CD45+, Siglec F+, CD11b-, CD11c+
Neutrophils: CD45+, Siglec F-, CD11b+, CD11c-, Ly-6G+
T cells: CD45+, Siglec F-, CD11c-, CD3+
B cells: CD45+, Siglec F-, CD11c-, B220+ (FIG. 23B)
Analysis showed that the leukocyte type making up the majority of infiltrating cells was neutrophils, followed by B cells, T cells, macrophages and other types of hematopoietic cells (FIG. 23A). dCas9-MQ1 treatment reduced the number of neutrophils infiltrating the lung, with a significant difference compared to the +LPS disease group (FIG. 23B).
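The staining panel listed above amounts to a small set of marker signatures. The following is a minimal sketch of how such signatures could be applied to already-thresholded events, assuming each event has been reduced to positive/negative calls per marker and that unmeasured markers are treated as negative; real gating works on fluorescence intensities with gates set against controls, which is not modeled here. All names are hypothetical.

```python
# Marker signatures taken from the staining panel; True = positive, False = negative.
PANEL = {
    "alveolar macrophage": {"CD45": True, "Siglec F": True, "CD11b": False, "CD11c": True},
    "neutrophil": {"CD45": True, "Siglec F": False, "CD11b": True, "CD11c": False, "Ly-6G": True},
    "T cell": {"CD45": True, "Siglec F": False, "CD11c": False, "CD3": True},
    "B cell": {"CD45": True, "Siglec F": False, "CD11c": False, "B220": True},
}


def classify(event):
    """Return the first panel population whose signature the event matches.

    `event` maps marker name -> bool. Markers missing from the event are
    treated as negative (an assumption made for this sketch).
    """
    for population, signature in PANEL.items():
        if all(event.get(marker, False) == expected
               for marker, expected in signature.items()):
            return population
    return "other"


events = [
    {"CD45": True, "CD11b": True, "Ly-6G": True},
    {"CD45": True, "B220": True},
    {"CD45": True, "Siglec F": True, "CD11c": True},
    {"CD45": False},
]
labels = [classify(e) for e in events]
print(labels)
```

Counting the labels per sample and dividing by the number of CD45+ events would give percentages of the kind recorded at termination.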
Example 17: the reduction of leukocytes in BALF is lung-specific
This example demonstrates that the reduction of leukocytes in BALF is lung-specific, indicating that the reduction is caused by dCas9-MQ1 treatment.
The mouse lipopolysaccharide (LPS) pneumonia model was used to study acute inflammation of the lung. LNP containing DOTAP 1% PEG short ncRNA was used as a control. Each mouse received a 3 mg/kg dose of LNP-DOTAP or dCas9-MQ1 intravenously at -2 hours. Mice were challenged at 0 hours by oral inhalation of 5 mg/kg LPS. A second 3 mg/kg dose of LNP-DOTAP or dCas9-MQ1 was administered at the +8 hour time point. Dexamethasone was administered intraperitoneally at 10 mg/kg at hours 0, 24 and 48. Animals were sacrificed at 72 hours and bronchoalveolar lavage fluid was collected from the lungs for flow cytometry staining. Peripheral blood was collected at the 72-hour endpoint. Flow cytometry analysis with CD45+ antibody staining was used to determine the leukocyte population in peripheral blood in each group. The white blood cell counts obtained for each group were plotted.
The figure shows that the effect of dCas9-MQ1 treatment in reducing the white blood cell count in BALF is lung-specific, indicating that the reduction is due to dCas9-MQ1 treatment rather than to a reduction in the overall white blood cell population of the mice (which would appear as a reduced white blood cell count in peripheral blood). The hematopoietic cell populations in peripheral blood were found to be similar in all groups (FIG. 24).
Example 18: systemic administration of dCas9-MQ1 demonstrated reduced CXCL gene expression in lung tissue
This example demonstrates that CXCL gene cluster expression is down-regulated in lung tissue following systemic administration of dCas9-MQ 1.
BALF was collected using the method described in Example 15. After BALF collection, half of the left lung lobes were flash frozen for qPCR analysis. Lung tissue was homogenized and RNA was extracted for qPCR analysis, specifically quantifying CXCL1-7 and CXCL15. Gene expression was quantified relative to the mouse GAPDH reference gene using the ΔΔCt method.
The data show that CXCL gene cluster expression was down-regulated to varying degrees in lung tissue samples obtained from mice treated with dCas9-MQ1 compared to CXCL gene cluster expression in lung tissue samples obtained from mice not treated with dCas9-MQ1 (FIG. 25).
Example 19: reducing CXCL expression has the beneficial downstream effect of reducing cell recruitment at sites of inflammation and the presence of other cytokines
Overexpression of the CXCL gene cluster produces chemokines that attract neutrophils. Chemokines recruiting inflammatory cells to the lung promote local inflammation, leading to a severe pathogenesis. This example demonstrates that down-regulating CXCL expression has a beneficial downstream effect, i.e., reduced cell recruitment, resulting in reduced presence of other cytokines at the site of inflammation, suggesting that down-regulating CXCL expression is a promising approach to reduce the severity of the pathogenesis of inflammation.
BALF was collected using the method described in Example 15. After BALF collection, half of the left lung lobes were flash frozen for qPCR analysis. Lung tissue was homogenized and RNA was extracted for qPCR analysis. Total counts of CXCL1, CXCL2, GM-CSF and IL-6 proteins in BALF were specifically quantified using a multiplex instrument.
The data show that lung tissue obtained from mice treated with dCas9-MQ1 showed lower expression of CXCL1, CXCL2, GM-CSF and IL-6 than lung tissue obtained from mice not treated with dCas9-MQ1.
Equivalents
It is to be understood that while the invention has been described in conjunction with the specific embodiments thereof, the foregoing description is intended to illustrate and not limit the scope of the invention, which is defined by the scope of the appended claims. Some aspects, advantages and modifications are within the scope of the following claims.
Sequence listing
<110> Flagship Pioneering Innovations V Inc
<120> compositions and methods for inhibiting expression of multiple genes
<130> O2057-7021WO
<140>
<141>
<150> 63/216,487
<151> 2021-06-29
<150> 63/085,013
<151> 2020-09-29
<160> 247
<170> PatentIn version 3.5
<210> 1
<211> 20
<212> DNA
<213> Artificial Sequence
<220>
<223> Description of Artificial Sequence: Synthetic
oligonucleotide
<220>
<221> modified_base
<222> (1)..(1)
<223> a, c, t, g, unknown or other
<220>
<221> modified_base
<222> (3)..(3)
<223> a, c, t, g, unknown or other
<400> 1
nbndccdsha grkgghrshv 20
<210> 2
<211> 20
<212> DNA
<213> Artificial Sequence
<220>
<223> Description of Artificial Sequence: Synthetic
oligonucleotide
<220>
<221> modified_base
<222> (18)..(18)
<223> a, c, t, g, unknown or other
<220>
<221> modified_base
<222> (20)..(20)
<223> a, c, t, g, unknown or other
<400> 2
vhsrhggkrg ahsdccdnbn 20
<210> 3
<211> 10
<212> DNA
<213> Artificial Sequence
<220>
<223> Description of Artificial Sequence: Synthetic
oligonucleotide
<220>
<221> modified_base
<222> (8)..(8)
<223> a, c, t, g, unknown or other
<400> 3
ccgccatntt 10
<210> 4
<211> 10
<212> DNA
<213> Artificial Sequence
<220>
<223> Description of Artificial Sequence: Synthetic
oligonucleotide
<220>
<221> modified_base
<222> (3)..(3)
<223> a, c, t, g, unknown or other
<400> 4
aanatggcgg 10
<210> 5
<211> 1367
<212> PRT
<213> Artificial Sequence
<220>
<223> Description of Artificial Sequence: Synthetic
polypeptide
<400> 5
Asp Lys Lys Tyr Ser Ile Gly Leu Asp Ile Gly Thr Asn Ser Val Gly
1 5 10 15
Trp Ala Val Ile Thr Asp Glu Tyr Lys Val Pro Ser Lys Lys Phe Lys
20 25 30
Val Leu Gly Asn Thr Asp Arg His Ser Ile Lys Lys Asn Leu Ile Gly
35 40 45
Ala Leu Leu Phe Asp Ser Gly Glu Thr Ala Glu Ala Thr Arg Leu Lys
50 55 60
Arg Thr Ala Arg Arg Arg Tyr Thr Arg Arg Lys Asn Arg Ile Cys Tyr
65 70 75 80
Leu Gln Glu Ile Phe Ser Asn Glu Met Ala Lys Val Asp Asp Ser Phe
85 90 95
Phe His Arg Leu Glu Glu Ser Phe Leu Val Glu Glu Asp Lys Lys His
100 105 110
Glu Arg His Pro Ile Phe Gly Asn Ile Val Asp Glu Val Ala Tyr His
115 120 125
Glu Lys Tyr Pro Thr Ile Tyr His Leu Arg Lys Lys Leu Val Asp Ser
130 135 140
Thr Asp Lys Ala Asp Leu Arg Leu Ile Tyr Leu Ala Leu Ala His Met
145 150 155 160
Ile Lys Phe Arg Gly His Phe Leu Ile Glu Gly Asp Leu Asn Pro Asp
165 170 175
Asn Ser Asp Val Asp Lys Leu Phe Ile Gln Leu Val Gln Thr Tyr Asn
180 185 190
Gln Leu Phe Glu Glu Asn Pro Ile Asn Ala Ser Gly Val Asp Ala Lys
195 200 205
Ala Ile Leu Ser Ala Arg Leu Ser Lys Ser Arg Arg Leu Glu Asn Leu
210 215 220
Ile Ala Gln Leu Pro Gly Glu Lys Lys Asn Gly Leu Phe Gly Asn Leu
225 230 235 240
Ile Ala Leu Ser Leu Gly Leu Thr Pro Asn Phe Lys Ser Asn Phe Asp
245 250 255
Leu Ala Glu Asp Ala Lys Leu Gln Leu Ser Lys Asp Thr Tyr Asp Asp
260 265 270
Asp Leu Asp Asn Leu Leu Ala Gln Ile Gly Asp Gln Tyr Ala Asp Leu
275 280 285
Phe Leu Ala Ala Lys Asn Leu Ser Asp Ala Ile Leu Leu Ser Asp Ile
290 295 300
Leu Arg Val Asn Thr Glu Ile Thr Lys Ala Pro Leu Ser Ala Ser Met
305 310 315 320
Ile Lys Arg Tyr Asp Glu His His Gln Asp Leu Thr Leu Leu Lys Ala
325 330 335
Leu Val Arg Gln Gln Leu Pro Glu Lys Tyr Lys Glu Ile Phe Phe Asp
340 345 350
Gln Ser Lys Asn Gly Tyr Ala Gly Tyr Ile Asp Gly Gly Ala Ser Gln
355 360 365
Glu Glu Phe Tyr Lys Phe Ile Lys Pro Ile Leu Glu Lys Met Asp Gly
370 375 380
Thr Glu Glu Leu Leu Val Lys Leu Asn Arg Glu Asp Leu Leu Arg Lys
385 390 395 400
Gln Arg Thr Phe Asp Asn Gly Ser Ile Pro His Gln Ile His Leu Gly
405 410 415
Glu Leu His Ala Ile Leu Arg Arg Gln Glu Asp Phe Tyr Pro Phe Leu
420 425 430
Lys Asp Asn Arg Glu Lys Ile Glu Lys Ile Leu Thr Phe Arg Ile Pro
435 440 445
Tyr Tyr Val Gly Pro Leu Ala Arg Gly Asn Ser Arg Phe Ala Trp Met
450 455 460
Thr Arg Lys Ser Glu Glu Thr Ile Thr Pro Trp Asn Phe Glu Glu Val
465 470 475 480
Val Asp Lys Gly Ala Ser Ala Gln Ser Phe Ile Glu Arg Met Thr Asn
485 490 495
Phe Asp Lys Asn Leu Pro Asn Glu Lys Val Leu Pro Lys His Ser Leu
500 505 510
Leu Tyr Glu Tyr Phe Thr Val Tyr Asn Glu Leu Thr Lys Val Lys Tyr
515 520 525
Val Thr Glu Gly Met Arg Lys Pro Ala Phe Leu Ser Gly Glu Gln Lys
530 535 540
Lys Ala Ile Val Asp Leu Leu Phe Lys Thr Asn Arg Lys Val Thr Val
545 550 555 560
Lys Gln Leu Lys Glu Asp Tyr Phe Lys Lys Ile Glu Cys Phe Asp Ser
565 570 575
Val Glu Ile Ser Gly Val Glu Asp Arg Phe Asn Ala Ser Leu Gly Thr
580 585 590
Tyr His Asp Leu Leu Lys Ile Ile Lys Asp Lys Asp Phe Leu Asp Asn
595 600 605
Glu Glu Asn Glu Asp Ile Leu Glu Asp Ile Val Leu Thr Leu Thr Leu
610 615 620
Phe Glu Asp Arg Glu Met Ile Glu Glu Arg Leu Lys Thr Tyr Ala His
625 630 635 640
Leu Phe Asp Asp Lys Val Met Lys Gln Leu Lys Arg Arg Arg Tyr Thr
645 650 655
Gly Trp Gly Arg Leu Ser Arg Lys Leu Ile Asn Gly Ile Arg Asp Lys
660 665 670
Gln Ser Gly Lys Thr Ile Leu Asp Phe Leu Lys Ser Asp Gly Phe Ala
675 680 685
Asn Arg Asn Phe Met Gln Leu Ile His Asp Asp Ser Leu Thr Phe Lys
690 695 700
Glu Asp Ile Gln Lys Ala Gln Val Ser Gly Gln Gly Asp Ser Leu His
705 710 715 720
Glu His Ile Ala Asn Leu Ala Gly Ser Pro Ala Ile Lys Lys Gly Ile
725 730 735
Leu Gln Thr Val Lys Val Val Asp Glu Leu Val Lys Val Met Gly Arg
740 745 750
His Lys Pro Glu Asn Ile Val Ile Glu Met Ala Arg Glu Asn Gln Thr
755 760 765
Thr Gln Lys Gly Gln Lys Asn Ser Arg Glu Arg Met Lys Arg Ile Glu
770 775 780
Glu Gly Ile Lys Glu Leu Gly Ser Gln Ile Leu Lys Glu His Pro Val
785 790 795 800
Glu Asn Thr Gln Leu Gln Asn Glu Lys Leu Tyr Leu Tyr Tyr Leu Gln
805 810 815
Asn Gly Arg Asp Met Tyr Val Asp Gln Glu Leu Asp Ile Asn Arg Leu
820 825 830
Ser Asp Tyr Asp Val Asp His Ile Val Pro Gln Ser Phe Leu Lys Asp
835 840 845
Asp Ser Ile Asp Asn Lys Val Leu Thr Arg Ser Asp Lys Asn Arg Gly
850 855 860
Lys Ser Asp Asn Val Pro Ser Glu Glu Val Val Lys Lys Met Lys Asn
865 870 875 880
Tyr Trp Arg Gln Leu Leu Asn Ala Lys Leu Ile Thr Gln Arg Lys Phe
885 890 895
Asp Asn Leu Thr Lys Ala Glu Arg Gly Gly Leu Ser Glu Leu Asp Lys
900 905 910
Ala Gly Phe Ile Lys Arg Gln Leu Val Glu Thr Arg Gln Ile Thr Lys
915 920 925
His Val Ala Gln Ile Leu Asp Ser Arg Met Asn Thr Lys Tyr Asp Glu
930 935 940
Asn Asp Lys Leu Ile Arg Glu Val Lys Val Ile Thr Leu Lys Ser Lys
945 950 955 960
Leu Val Ser Asp Phe Arg Lys Asp Phe Gln Phe Tyr Lys Val Arg Glu
965 970 975
Ile Asn Asn Tyr His His Ala His Asp Ala Tyr Leu Asn Ala Val Val
980 985 990
Gly Thr Ala Leu Ile Lys Lys Tyr Pro Lys Leu Glu Ser Glu Phe Val
995 1000 1005
Tyr Gly Asp Tyr Lys Val Tyr Asp Val Arg Lys Met Ile Ala Lys
1010 1015 1020
Ser Glu Gln Glu Ile Gly Lys Ala Thr Ala Lys Tyr Phe Phe Tyr
1025 1030 1035
Ser Asn Ile Met Asn Phe Phe Lys Thr Glu Ile Thr Leu Ala Asn
1040 1045 1050
Gly Glu Ile Arg Lys Arg Pro Leu Ile Glu Thr Asn Gly Glu Thr
1055 1060 1065
Gly Glu Ile Val Trp Asp Lys Gly Arg Asp Phe Ala Thr Val Arg
1070 1075 1080
Lys Val Leu Ser Met Pro Gln Val Asn Ile Val Lys Lys Thr Glu
1085 1090 1095
Val Gln Thr Gly Gly Phe Ser Lys Glu Ser Ile Leu Pro Lys Arg
1100 1105 1110
Asn Ser Asp Lys Leu Ile Ala Arg Lys Lys Asp Trp Asp Pro Lys
1115 1120 1125
Lys Tyr Gly Gly Phe Asp Ser Pro Thr Val Ala Tyr Ser Val Leu
1130 1135 1140
Val Val Ala Lys Val Glu Lys Gly Lys Ser Lys Lys Leu Lys Ser
1145 1150 1155
Val Lys Glu Leu Leu Gly Ile Thr Ile Met Glu Arg Ser Ser Phe
1160 1165 1170
Glu Lys Asn Pro Ile Asp Phe Leu Glu Ala Lys Gly Tyr Lys Glu
1175 1180 1185
Val Lys Lys Asp Leu Ile Ile Lys Leu Pro Lys Tyr Ser Leu Phe
1190 1195 1200
Glu Leu Glu Asn Gly Arg Lys Arg Met Leu Ala Ser Ala Gly Glu
1205 1210 1215
Leu Gln Lys Gly Asn Glu Leu Ala Leu Pro Ser Lys Tyr Val Asn
1220 1225 1230
Phe Leu Tyr Leu Ala Ser His Tyr Glu Lys Leu Lys Gly Ser Pro
1235 1240 1245
Glu Asp Asn Glu Gln Lys Gln Leu Phe Val Glu Gln His Lys His
1250 1255 1260
Tyr Leu Asp Glu Ile Ile Glu Gln Ile Ser Glu Phe Ser Lys Arg
1265 1270 1275
Val Ile Leu Ala Asp Ala Asn Leu Asp Lys Val Leu Ser Ala Tyr
1280 1285 1290
Asn Lys His Arg Asp Lys Pro Ile Arg Glu Gln Ala Glu Asn Ile
1295 1300 1305
Ile His Leu Phe Thr Leu Thr Asn Leu Gly Ala Pro Ala Ala Phe
1310 1315 1320
Lys Tyr Phe Asp Thr Thr Ile Asp Arg Lys Arg Tyr Thr Ser Thr
1325 1330 1335
Lys Glu Val Leu Asp Ala Thr Leu Ile His Gln Ser Ile Thr Gly
1340 1345 1350
Leu Tyr Glu Thr Arg Ile Asp Leu Ser Gln Leu Gly Gly Asp
1355 1360 1365
<210> 6
<211> 1367
<212> PRT
<213> Artificial Sequence
<220>
<223> Description of Artificial Sequence: Synthetic
polypeptide
<400> 6
Asp Lys Lys Tyr Ser Ile Gly Leu Ala Ile Gly Thr Asn Ser Val Gly
1 5 10 15
Trp Ala Val Ile Thr Asp Glu Tyr Lys Val Pro Ser Lys Lys Phe Lys
20 25 30
Val Leu Gly Asn Thr Asp Arg His Ser Ile Lys Lys Asn Leu Ile Gly
35 40 45
Ala Leu Leu Phe Asp Ser Gly Glu Thr Ala Glu Ala Thr Arg Leu Lys
50 55 60
Arg Thr Ala Arg Arg Arg Tyr Thr Arg Arg Lys Asn Arg Ile Cys Tyr
65 70 75 80
Leu Gln Glu Ile Phe Ser Asn Glu Met Ala Lys Val Asp Asp Ser Phe
85 90 95
Phe His Arg Leu Glu Glu Ser Phe Leu Val Glu Glu Asp Lys Lys His
100 105 110
Glu Arg His Pro Ile Phe Gly Asn Ile Val Asp Glu Val Ala Tyr His
115 120 125
Glu Lys Tyr Pro Thr Ile Tyr His Leu Arg Lys Lys Leu Val Asp Ser
130 135 140
Thr Asp Lys Ala Asp Leu Arg Leu Ile Tyr Leu Ala Leu Ala His Met
145 150 155 160
Ile Lys Phe Arg Gly His Phe Leu Ile Glu Gly Asp Leu Asn Pro Asp
165 170 175
Asn Ser Asp Val Asp Lys Leu Phe Ile Gln Leu Val Gln Thr Tyr Asn
180 185 190
Gln Leu Phe Glu Glu Asn Pro Ile Asn Ala Ser Gly Val Asp Ala Lys
195 200 205
Ala Ile Leu Ser Ala Arg Leu Ser Lys Ser Arg Arg Leu Glu Asn Leu
210 215 220
Ile Ala Gln Leu Pro Gly Glu Lys Lys Asn Gly Leu Phe Gly Asn Leu
225 230 235 240
Ile Ala Leu Ser Leu Gly Leu Thr Pro Asn Phe Lys Ser Asn Phe Asp
245 250 255
Leu Ala Glu Asp Ala Lys Leu Gln Leu Ser Lys Asp Thr Tyr Asp Asp
260 265 270
Asp Leu Asp Asn Leu Leu Ala Gln Ile Gly Asp Gln Tyr Ala Asp Leu
275 280 285
Phe Leu Ala Ala Lys Asn Leu Ser Asp Ala Ile Leu Leu Ser Asp Ile
290 295 300
Leu Arg Val Asn Thr Glu Ile Thr Lys Ala Pro Leu Ser Ala Ser Met
305 310 315 320
Ile Lys Arg Tyr Asp Glu His His Gln Asp Leu Thr Leu Leu Lys Ala
325 330 335
Leu Val Arg Gln Gln Leu Pro Glu Lys Tyr Lys Glu Ile Phe Phe Asp
340 345 350
Gln Ser Lys Asn Gly Tyr Ala Gly Tyr Ile Asp Gly Gly Ala Ser Gln
355 360 365
Glu Glu Phe Tyr Lys Phe Ile Lys Pro Ile Leu Glu Lys Met Asp Gly
370 375 380
Thr Glu Glu Leu Leu Val Lys Leu Asn Arg Glu Asp Leu Leu Arg Lys
385 390 395 400
Gln Arg Thr Phe Asp Asn Gly Ser Ile Pro His Gln Ile His Leu Gly
405 410 415
Glu Leu His Ala Ile Leu Arg Arg Gln Glu Asp Phe Tyr Pro Phe Leu
420 425 430
Lys Asp Asn Arg Glu Lys Ile Glu Lys Ile Leu Thr Phe Arg Ile Pro
435 440 445
Tyr Tyr Val Gly Pro Leu Ala Arg Gly Asn Ser Arg Phe Ala Trp Met
450 455 460
Thr Arg Lys Ser Glu Glu Thr Ile Thr Pro Trp Asn Phe Glu Glu Val
465 470 475 480
Val Asp Lys Gly Ala Ser Ala Gln Ser Phe Ile Glu Arg Met Thr Asn
485 490 495
Phe Asp Lys Asn Leu Pro Asn Glu Lys Val Leu Pro Lys His Ser Leu
500 505 510
Leu Tyr Glu Tyr Phe Thr Val Tyr Asn Glu Leu Thr Lys Val Lys Tyr
515 520 525
Val Thr Glu Gly Met Arg Lys Pro Ala Phe Leu Ser Gly Glu Gln Lys
530 535 540
Lys Ala Ile Val Asp Leu Leu Phe Lys Thr Asn Arg Lys Val Thr Val
545 550 555 560
Lys Gln Leu Lys Glu Asp Tyr Phe Lys Lys Ile Glu Cys Phe Asp Ser
565 570 575
Val Glu Ile Ser Gly Val Glu Asp Arg Phe Asn Ala Ser Leu Gly Thr
580 585 590
Tyr His Asp Leu Leu Lys Ile Ile Lys Asp Lys Asp Phe Leu Asp Asn
595 600 605
Glu Glu Asn Glu Asp Ile Leu Glu Asp Ile Val Leu Thr Leu Thr Leu
610 615 620
Phe Glu Asp Arg Glu Met Ile Glu Glu Arg Leu Lys Thr Tyr Ala His
625 630 635 640
Leu Phe Asp Asp Lys Val Met Lys Gln Leu Lys Arg Arg Arg Tyr Thr
645 650 655
Gly Trp Gly Arg Leu Ser Arg Lys Leu Ile Asn Gly Ile Arg Asp Lys
660 665 670
Gln Ser Gly Lys Thr Ile Leu Asp Phe Leu Lys Ser Asp Gly Phe Ala
675 680 685
Asn Arg Asn Phe Met Gln Leu Ile His Asp Asp Ser Leu Thr Phe Lys
690 695 700
Glu Asp Ile Gln Lys Ala Gln Val Ser Gly Gln Gly Asp Ser Leu His
705 710 715 720
Glu His Ile Ala Asn Leu Ala Gly Ser Pro Ala Ile Lys Lys Gly Ile
725 730 735
Leu Gln Thr Val Lys Val Val Asp Glu Leu Val Lys Val Met Gly Arg
740 745 750
His Lys Pro Glu Asn Ile Val Ile Glu Met Ala Arg Glu Asn Gln Thr
755 760 765
Thr Gln Lys Gly Gln Lys Asn Ser Arg Glu Arg Met Lys Arg Ile Glu
770 775 780
Glu Gly Ile Lys Glu Leu Gly Ser Gln Ile Leu Lys Glu His Pro Val
785 790 795 800
Glu Asn Thr Gln Leu Gln Asn Glu Lys Leu Tyr Leu Tyr Tyr Leu Gln
805 810 815
Asn Gly Arg Asp Met Tyr Val Asp Gln Glu Leu Asp Ile Asn Arg Leu
820 825 830
Ser Asp Tyr Asp Val Ala Ala Ile Val Pro Gln Ser Phe Leu Lys Asp
835 840 845
Asp Ser Ile Asp Asn Lys Val Leu Thr Arg Ser Asp Lys Ala Arg Gly
850 855 860
Lys Ser Asp Asn Val Pro Ser Glu Glu Val Val Lys Lys Met Lys Asn
865 870 875 880
Tyr Trp Arg Gln Leu Leu Asn Ala Lys Leu Ile Thr Gln Arg Lys Phe
885 890 895
Asp Asn Leu Thr Lys Ala Glu Arg Gly Gly Leu Ser Glu Leu Asp Lys
900 905 910
Ala Gly Phe Ile Lys Arg Gln Leu Val Glu Thr Arg Gln Ile Thr Lys
915 920 925
His Val Ala Gln Ile Leu Asp Ser Arg Met Asn Thr Lys Tyr Asp Glu
930 935 940
Asn Asp Lys Leu Ile Arg Glu Val Lys Val Ile Thr Leu Lys Ser Lys
945 950 955 960
Leu Val Ser Asp Phe Arg Lys Asp Phe Gln Phe Tyr Lys Val Arg Glu
965 970 975
Ile Asn Asn Tyr His His Ala His Asp Ala Tyr Leu Asn Ala Val Val
980 985 990
Gly Thr Ala Leu Ile Lys Lys Tyr Pro Lys Leu Glu Ser Glu Phe Val
995 1000 1005
Tyr Gly Asp Tyr Lys Val Tyr Asp Val Arg Lys Met Ile Ala Lys
1010 1015 1020
Ser Glu Gln Glu Ile Gly Lys Ala Thr Ala Lys Tyr Phe Phe Tyr
1025 1030 1035
Ser Asn Ile Met Asn Phe Phe Lys Thr Glu Ile Thr Leu Ala Asn
1040 1045 1050
Gly Glu Ile Arg Lys Arg Pro Leu Ile Glu Thr Asn Gly Glu Thr
1055 1060 1065
Gly Glu Ile Val Trp Asp Lys Gly Arg Asp Phe Ala Thr Val Arg
1070 1075 1080
Lys Val Leu Ser Met Pro Gln Val Asn Ile Val Lys Lys Thr Glu
1085 1090 1095
Val Gln Thr Gly Gly Phe Ser Lys Glu Ser Ile Leu Pro Lys Arg
1100 1105 1110
Asn Ser Asp Lys Leu Ile Ala Arg Lys Lys Asp Trp Asp Pro Lys
1115 1120 1125
Lys Tyr Gly Gly Phe Asp Ser Pro Thr Val Ala Tyr Ser Val Leu
1130 1135 1140
Val Val Ala Lys Val Glu Lys Gly Lys Ser Lys Lys Leu Lys Ser
1145 1150 1155
Val Lys Glu Leu Leu Gly Ile Thr Ile Met Glu Arg Ser Ser Phe
1160 1165 1170
Glu Lys Asn Pro Ile Asp Phe Leu Glu Ala Lys Gly Tyr Lys Glu
1175 1180 1185
Val Lys Lys Asp Leu Ile Ile Lys Leu Pro Lys Tyr Ser Leu Phe
1190 1195 1200
Glu Leu Glu Asn Gly Arg Lys Arg Met Leu Ala Ser Ala Gly Glu
1205 1210 1215
Leu Gln Lys Gly Asn Glu Leu Ala Leu Pro Ser Lys Tyr Val Asn
1220 1225 1230
Phe Leu Tyr Leu Ala Ser His Tyr Glu Lys Leu Lys Gly Ser Pro
1235 1240 1245
Glu Asp Asn Glu Gln Lys Gln Leu Phe Val Glu Gln His Lys His
1250 1255 1260
Tyr Leu Asp Glu Ile Ile Glu Gln Ile Ser Glu Phe Ser Lys Arg
1265 1270 1275
Val Ile Leu Ala Asp Ala Asn Leu Asp Lys Val Leu Ser Ala Tyr
1280 1285 1290
Asn Lys His Arg Asp Lys Pro Ile Arg Glu Gln Ala Glu Asn Ile
1295 1300 1305
Ile His Leu Phe Thr Leu Thr Asn Leu Gly Ala Pro Ala Ala Phe
1310 1315 1320
Lys Tyr Phe Asp Thr Thr Ile Asp Arg Lys Arg Tyr Thr Ser Thr
1325 1330 1335
Lys Glu Val Leu Asp Ala Thr Leu Ile His Gln Ser Ile Thr Gly
1340 1345 1350
Leu Tyr Glu Thr Arg Ile Asp Leu Ser Gln Leu Gly Gly Asp
1355 1360 1365
<210> 7
<211> 1053
<212> PRT
<213> Artificial Sequence
<220>
<223> Description of Artificial Sequence: Synthetic
polypeptide
<400> 7
Ala Lys Arg Asn Tyr Ile Leu Gly Leu Ala Ile Gly Ile Thr Ser Val
1 5 10 15
Gly Tyr Gly Ile Ile Asp Tyr Glu Thr Arg Asp Val Ile Asp Ala Gly
20 25 30
Val Arg Leu Phe Lys Glu Ala Asn Val Glu Asn Asn Glu Gly Arg Arg
35 40 45
Ser Lys Arg Gly Ala Arg Arg Leu Lys Arg Arg Arg Arg His Arg Ile
50 55 60
Gln Arg Val Lys Lys Leu Leu Phe Asp Tyr Asn Leu Leu Thr Asp His
65 70 75 80
Ser Glu Leu Ser Gly Ile Asn Pro Tyr Glu Ala Arg Val Lys Gly Leu
85 90 95
Ser Gln Lys Leu Ser Glu Glu Glu Phe Ser Ala Ala Leu Leu His Leu
100 105 110
Ala Lys Arg Arg Gly Val His Asn Val Asn Glu Val Glu Glu Asp Thr
115 120 125
Gly Asn Glu Leu Ser Thr Lys Glu Gln Ile Ser Arg Asn Ser Lys Ala
130 135 140
Leu Glu Glu Lys Tyr Val Ala Glu Leu Gln Leu Glu Arg Leu Lys Lys
145 150 155 160
Asp Gly Glu Val Arg Gly Ser Ile Asn Arg Phe Lys Thr Ser Asp Tyr
165 170 175
Val Lys Glu Ala Lys Gln Leu Leu Lys Val Gln Lys Ala Tyr His Gln
180 185 190
Leu Asp Gln Ser Phe Ile Asp Thr Tyr Ile Asp Leu Leu Glu Thr Arg
195 200 205
Arg Thr Tyr Tyr Glu Gly Pro Gly Glu Gly Ser Pro Phe Gly Trp Lys
210 215 220
Asp Ile Lys Glu Trp Tyr Glu Met Leu Met Gly His Cys Thr Tyr Phe
225 230 235 240
Pro Glu Glu Leu Arg Ser Val Lys Tyr Ala Tyr Asn Ala Asp Leu Tyr
245 250 255
Asn Ala Leu Asn Asp Leu Asn Asn Leu Val Ile Thr Arg Asp Glu Asn
260 265 270
Glu Lys Leu Glu Tyr Tyr Glu Lys Phe Gln Ile Ile Glu Asn Val Phe
275 280 285
Lys Gln Lys Lys Lys Pro Thr Leu Lys Gln Ile Ala Lys Glu Ile Leu
290 295 300
Val Asn Glu Glu Asp Ile Lys Gly Tyr Arg Val Thr Ser Thr Gly Lys
305 310 315 320
Pro Glu Phe Thr Asn Leu Lys Val Tyr His Asp Ile Lys Asp Ile Thr
325 330 335
Ala Arg Lys Glu Ile Ile Glu Asn Ala Glu Leu Leu Asp Gln Ile Ala
340 345 350
Lys Ile Leu Thr Ile Tyr Gln Ser Ser Glu Asp Ile Gln Glu Glu Leu
355 360 365
Thr Asn Leu Asn Ser Glu Leu Thr Gln Glu Glu Ile Glu Gln Ile Ser
370 375 380
Asn Leu Lys Gly Tyr Thr Gly Thr His Asn Leu Ser Leu Lys Ala Ile
385 390 395 400
Asn Leu Ile Leu Asp Glu Leu Trp His Thr Asn Asp Asn Gln Ile Ala
405 410 415
Ile Phe Asn Arg Leu Lys Leu Val Pro Lys Lys Val Asp Leu Ser Gln
420 425 430
Gln Lys Glu Ile Pro Thr Thr Leu Val Asp Asp Phe Ile Leu Ser Pro
435 440 445
Val Val Lys Arg Ser Phe Ile Gln Ser Ile Lys Val Ile Asn Ala Ile
450 455 460
Ile Lys Lys Tyr Gly Leu Pro Asn Asp Ile Ile Ile Glu Leu Ala Arg
465 470 475 480
Glu Lys Asn Ser Lys Asp Ala Gln Lys Met Ile Asn Glu Met Gln Lys
485 490 495
Arg Asn Arg Gln Thr Asn Glu Arg Ile Glu Glu Ile Ile Arg Thr Thr
500 505 510
Gly Lys Glu Asn Ala Lys Tyr Leu Ile Glu Lys Ile Lys Leu His Asp
515 520 525
Met Gln Glu Gly Lys Cys Leu Tyr Ser Leu Glu Ala Ile Pro Leu Glu
530 535 540
Asp Leu Leu Asn Asn Pro Phe Asn Tyr Glu Val Asp Ala Ile Ile Pro
545 550 555 560
Arg Ser Val Ser Phe Asp Asn Ser Phe Asn Asn Lys Val Leu Val Lys
565 570 575
Gln Glu Glu Asn Ser Lys Lys Gly Asn Arg Thr Pro Phe Gln Tyr Leu
580 585 590
Ser Ser Ser Asp Ser Lys Ile Ser Tyr Glu Thr Phe Lys Lys His Ile
595 600 605
Leu Asn Leu Ala Lys Gly Lys Gly Arg Ile Ser Lys Thr Lys Lys Glu
610 615 620
Tyr Leu Leu Glu Glu Arg Asp Ile Asn Arg Phe Ser Val Gln Lys Asp
625 630 635 640
Phe Ile Asn Arg Asn Leu Val Asp Thr Arg Tyr Ala Thr Arg Gly Leu
645 650 655
Met Asn Leu Leu Arg Ser Tyr Phe Arg Val Asn Asn Leu Asp Val Lys
660 665 670
Val Lys Ser Ile Asn Gly Gly Phe Thr Ser Phe Leu Arg Arg Lys Trp
675 680 685
Lys Phe Lys Lys Glu Arg Asn Lys Gly Tyr Lys His His Ala Glu Asp
690 695 700
Ala Leu Ile Ile Ala Asn Ala Asp Phe Ile Phe Lys Glu Trp Lys Lys
705 710 715 720
Leu Asp Lys Ala Lys Lys Val Met Glu Asn Gln Met Phe Glu Glu Lys
725 730 735
Gln Ala Glu Ser Met Pro Glu Ile Glu Thr Glu Gln Glu Tyr Lys Glu
740 745 750
Ile Phe Ile Thr Pro His Gln Ile Lys His Ile Lys Asp Phe Lys Asp
755 760 765
Tyr Lys Tyr Ser His Arg Val Asp Lys Lys Pro Asn Arg Lys Leu Ile
770 775 780
Asn Asp Thr Leu Tyr Ser Thr Arg Lys Asp Asp Lys Gly Asn Thr Leu
785 790 795 800
Ile Val Asn Asn Leu Asn Gly Leu Tyr Asp Lys Asp Asn Asp Lys Leu
805 810 815
Lys Lys Leu Ile Asn Lys Ser Pro Glu Lys Leu Leu Met Tyr His His
820 825 830
Asp Pro Gln Thr Tyr Gln Lys Leu Lys Leu Ile Met Glu Gln Tyr Gly
835 840 845
Asp Glu Lys Asn Pro Leu Tyr Lys Tyr Tyr Glu Glu Thr Gly Asn Tyr
850 855 860
Leu Thr Lys Tyr Ser Lys Lys Asp Asn Gly Pro Val Ile Lys Lys Ile
865 870 875 880
Lys Tyr Tyr Gly Asn Lys Leu Asn Ala His Leu Asp Ile Thr Asp Asp
885 890 895
Tyr Pro Asn Ser Arg Asn Lys Val Val Lys Leu Ser Leu Lys Pro Tyr
900 905 910
Arg Phe Asp Val Tyr Leu Asp Asn Gly Val Tyr Lys Phe Val Thr Val
915 920 925
Lys Asn Leu Asp Val Ile Lys Lys Glu Asn Tyr Tyr Glu Val Asn Ser
930 935 940
Lys Cys Tyr Glu Glu Ala Lys Lys Leu Lys Lys Ile Ser Asn Gln Ala
945 950 955 960
Glu Phe Ile Ala Ser Phe Tyr Lys Asn Asp Leu Ile Lys Ile Asn Gly
965 970 975
Glu Leu Tyr Arg Val Ile Gly Val Asn Asn Asp Leu Leu Asn Arg Ile
980 985 990
Glu Val Asn Met Ile Asp Ile Thr Tyr Arg Glu Tyr Leu Glu Asn Met
995 1000 1005
Asn Asp Lys Arg Pro Pro His Ile Ile Lys Thr Ile Ala Ser Lys
1010 1015 1020
Thr Gln Ser Ile Lys Lys Tyr Ser Thr Asp Ile Leu Gly Asn Leu
1025 1030 1035
Tyr Glu Val Lys Ser Lys Lys His Pro Gln Ile Ile Lys Lys Gly
1040 1045 1050
<210> 8
<211> 4101
<212> DNA
<213> Artificial Sequence
<220>
<223> Description of Artificial Sequence: Synthetic polynucleotide
<400> 8
gacaagaagt acagcatcgg cctggccatc ggcaccaaca gcgtgggctg ggccgtgatc 60
accgacgagt acaaggtgcc cagcaagaag ttcaaggtgc tgggcaacac cgaccggcac 120
agcatcaaga agaacctgat cggcgccctg ctgttcgaca gcggcgagac cgccgaggcc 180
acccggctga agcggaccgc ccggcggcgg tacacccggc ggaagaaccg gatctgctac 240
ctgcaggaga tcttcagcaa cgagatggcc aaggtggacg acagcttctt ccaccggctg 300
gaggagagct tcctggtgga ggaggacaag aagcacgagc ggcaccccat cttcggcaac 360
atcgtggacg aggtggccta ccacgagaag taccccacca tctaccacct gcggaagaag 420
ctggtggaca gcaccgacaa ggccgacctg cggctgatct acctggccct ggcccacatg 480
atcaagttcc ggggccactt cctgatcgag ggcgacctga accccgacaa cagcgacgtg 540
gacaagctgt tcatccagct ggtgcagacc tacaaccagc tgttcgagga gaaccccatc 600
aacgccagcg gcgtggacgc caaggccatc ctgagcgccc ggctgagcaa gagccggcgg 660
ctggagaacc tgatcgccca gctgcccggc gagaagaaga acggcctgtt cggcaacctg 720
atcgccctga gcctgggcct gacccccaac ttcaagagca acttcgacct ggccgaggac 780
gccaagctgc agctgagcaa ggacacctac gacgacgacc tggacaacct gctggcccag 840
atcggcgacc agtacgccga cctgttcctg gccgccaaga acctgagcga cgccatcctg 900
ctgagcgaca tcctgcgggt gaacaccgag atcaccaagg cccccctgag cgccagcatg 960
atcaagcggt acgacgagca ccaccaggac ctgaccctgc tgaaggccct ggtgcggcag 1020
cagctgcccg agaagtacaa ggagatcttc ttcgaccaga gcaagaacgg ctacgccggc 1080
tacatcgacg gcggcgccag ccaggaggag ttctacaagt tcatcaagcc catcctggag 1140
aagatggacg gcaccgagga gctgctggtg aagctgaacc gggaggacct gctgcggaag 1200
cagcggacct tcgacaacgg cagcatcccc caccagatcc acctgggcga gctgcacgcc 1260
atcctgcggc ggcaggagga cttctacccc ttcctgaagg acaaccggga gaagatcgag 1320
aagatcctga ccttccggat cccctactac gtgggccccc tggcccgggg caacagccgg 1380
ttcgcctgga tgacccggaa atccgaggag accatcaccc cctggaactt cgaggaggtg 1440
gtggacaagg gcgccagcgc ccagagcttc atcgagcgga tgaccaactt cgacaagaac 1500
ctgcccaacg agaaggtgct gcccaagcac agcctgctgt acgagtactt caccgtgtac 1560
aacgagctga ccaaggtgaa gtacgtgacc gagggcatgc ggaagcccgc cttcctgagc 1620
ggcgagcaga agaaggccat cgtggacctg ctgttcaaga ccaaccggaa ggtgaccgtg 1680
aagcagctga aggaggacta cttcaagaag atcgagtgct tcgacagcgt ggagatcagc 1740
ggcgtggagg accggttcaa cgccagcctg ggcacctacc acgacctgct gaagatcatc 1800
aaggacaagg acttcctgga caacgaggag aacgaggaca tcctggagga catcgtgctg 1860
accctgaccc tgttcgagga ccgggagatg atcgaggagc ggctgaaaac ctacgcccac 1920
ctgttcgacg acaaggtgat gaagcagctg aagcggcggc ggtacaccgg ctggggccgg 1980
ctgagccgga agctgatcaa cggcatccgg gacaagcaga gcggcaagac catcctggac 2040
ttcctgaaat ccgacggctt cgccaaccgg aacttcatgc agctgatcca cgacgacagc 2100
ctgaccttca aggaggacat ccagaaggcc caggtgagcg gccagggcga cagcctgcac 2160
gagcacatcg ccaacctggc cggcagcccc gccatcaaga agggcatcct gcagaccgtg 2220
aaggtggtgg acgagctggt gaaggtgatg ggccggcaca agcccgagaa catcgtgatc 2280
gagatggccc gggagaacca gaccacccag aagggccaga agaacagccg ggagcggatg 2340
aagcggatcg aggagggcat caaggagctg ggcagccaga tcctgaagga gcaccccgtg 2400
gagaacaccc agctgcagaa cgagaagctg tacctgtact acctgcagaa cggccgggac 2460
atgtacgtgg accaggagct ggacatcaac cggctgagcg actacgacgt ggccgccatc 2520
gtgccccaga gcttcctgaa ggacgacagc atcgacaaca aggtgctgac ccggagcgac 2580
aaggcccggg gcaagagcga caacgtgccc agcgaggagg tggtgaagaa gatgaagaac 2640
tactggcggc agctgctgaa cgccaagctg atcacccagc ggaagttcga caacctgacc 2700
aaggccgagc ggggcggcct gagcgagctg gacaaggccg gcttcatcaa gcggcagctg 2760
gtggagaccc ggcagatcac caagcacgtg gcccagatcc tggacagccg gatgaacacc 2820
aagtacgacg agaacgacaa gctgatccgg gaggtgaagg tgatcaccct gaaatccaag 2880
ctggtgagcg acttccggaa ggacttccag ttctacaagg tgcgggagat caacaactac 2940
caccacgccc acgacgccta cctgaacgcc gtggtgggca ccgccctgat caagaagtac 3000
cccaagctgg agagcgagtt cgtgtacggc gactacaagg tgtacgacgt gcggaagatg 3060
atcgccaaga gcgagcagga gatcggcaag gccaccgcca agtacttctt ctacagcaac 3120
atcatgaact tcttcaagac cgagatcacc ctggccaacg gcgagatccg gaagcggccc 3180
ctgatcgaga ccaacggcga gaccggcgag atcgtgtggg acaagggccg ggacttcgcc 3240
accgtgcgga aggtgctgag catgccccag gtgaacatcg tgaagaaaac cgaggtgcag 3300
accggcggct tcagcaagga gagcatcctg cccaagcgga acagcgacaa gctgatcgcc 3360
cggaagaagg actgggaccc caagaagtac ggcggcttcg acagccccac cgtggcctac 3420
agcgtgctgg tggtggccaa ggtggagaag ggcaagagca agaagctgaa atccgtgaag 3480
gagctgctgg gcatcaccat catggagcgg agcagcttcg agaagaaccc catcgacttc 3540
ctggaggcca agggctacaa ggaggtgaag aaggacctga tcatcaagct gcccaagtac 3600
agcctgttcg agctggagaa cggccggaag cggatgctgg ccagcgccgg cgagctgcag 3660
aagggcaacg agctggccct gcccagcaag tacgtgaact tcctgtacct ggccagccac 3720
tacgagaagc tgaagggcag ccccgaggac aacgagcaga agcagctgtt cgtggagcag 3780
cacaagcact acctggacga gatcatcgag cagatcagcg agttcagcaa gcgggtgatc 3840
ctggccgacg ccaacctgga caaggtgctg agcgcctaca acaagcaccg ggacaagccc 3900
atccgggagc aggccgagaa catcatccac ctgttcaccc tgaccaacct gggcgccccc 3960
gccgccttca agtacttcga caccaccatc gaccggaagc ggtacaccag caccaaggag 4020
gtgctggacg ccaccctgat ccaccagagc atcaccggcc tgtacgagac ccggatcgac 4080
ctgagccagc tgggcggcga c 4101
<210> 9
<211> 3159
<212> DNA
<213> Artificial Sequence
<220>
<223> Description of Artificial Sequence: Synthetic polynucleotide
<400> 9
gccaagcgga actacatcct gggcctggcc atcggcatca ccagcgtggg ctacggcatc 60
atcgactacg agacccggga cgtgatcgac gccggcgtgc ggctgttcaa ggaggccaac 120
gtggagaaca acgagggccg gcggagcaag cggggcgccc ggcggctgaa gcggcggcgg 180
cggcaccgga tccagcgggt gaagaagctg ctgttcgact acaacctgct gaccgaccac 240
agcgagctga gcggcatcaa cccctacgag gcccgggtga agggcctgag ccagaagctg 300
agcgaggagg agttcagcgc cgccctgctg cacctggcca agcggcgggg cgtgcacaac 360
gtgaacgagg tggaggagga caccggcaac gagctgagca ccaaggagca gatcagccgg 420
aacagcaagg ccctggagga gaagtacgtg gccgagctgc agctggagcg gctgaagaag 480
gacggcgagg tgcggggcag catcaaccgg ttcaagacca gcgactacgt gaaggaggcc 540
aagcagctgc tgaaggtgca gaaggcctac caccagctgg accagagctt catcgacacc 600
tacatcgacc tgctggagac ccggcggacc tactacgagg gccccggcga gggcagcccc 660
ttcggctgga aggacatcaa ggagtggtac gagatgctga tgggccactg cacctacttc 720
cccgaggagc tgcggagcgt gaagtacgcc tacaacgccg acctgtacaa cgccctgaac 780
gacctgaaca acctggtgat cacccgggac gagaacgaga agctggagta ctacgagaag 840
ttccagatca tcgagaacgt gttcaagcag aagaagaagc ccaccctgaa gcagatcgcc 900
aaggagatcc tggtgaacga ggaggacatc aagggctacc gggtgaccag caccggcaag 960
cccgagttca ccaacctgaa ggtgtaccac gacatcaagg acatcaccgc ccggaaggag 1020
atcatcgaga acgccgagct gctggaccag atcgccaaga tcctgaccat ctaccagagc 1080
agcgaggaca tccaggagga gctgaccaac ctgaacagcg agctgaccca ggaggagatc 1140
gagcagatca gcaacctgaa gggctacacc ggcacccaca acctgagcct gaaggccatc 1200
aacctgatcc tggacgagct gtggcacacc aacgacaacc agatcgccat cttcaaccgg 1260
ctgaagctgg tgcccaagaa ggtggacctg agccagcaga aggagatccc caccaccctg 1320
gtggacgact tcatcctgag ccccgtggtg aagcggagct tcatccagag catcaaggtg 1380
atcaacgcca tcatcaagaa gtacggcctg cccaacgaca tcatcatcga gctggcccgg 1440
gagaagaaca gcaaggacgc ccagaagatg atcaacgaga tgcagaagcg gaaccggcag 1500
accaacgagc ggatcgagga gatcatccgg accaccggca aggagaacgc caagtacctg 1560
atcgagaaga tcaagctgca cgacatgcag gagggcaagt gcctgtacag cctggaggcc 1620
atccccctgg aggacctgct gaacaacccc ttcaactacg aggtggacgc catcatcccc 1680
cggagcgtga gcttcgacaa cagcttcaac aacaaggtgc tggtgaagca ggaggagaac 1740
agcaagaagg gcaaccggac ccccttccag tacctgagca gcagcgacag caagatcagc 1800
tacgagacct tcaagaagca catcctgaac ctggccaagg gcaagggccg gatcagcaag 1860
accaagaagg agtacctgct ggaggagcgg gacatcaacc ggttcagcgt gcagaaggac 1920
ttcatcaacc ggaacctggt ggacacccgg tacgccaccc ggggcctgat gaacctgctg 1980
cggagctact tccgggtgaa caacctggac gtgaaggtga aatccatcaa cggcggcttc 2040
accagcttcc tgcggcggaa gtggaagttc aagaaggagc ggaacaaggg ctacaagcac 2100
cacgccgagg acgccctgat catcgccaac gccgacttca tcttcaagga gtggaagaag 2160
ctggacaagg ccaagaaggt gatggagaac cagatgttcg aggagaagca ggccgagagc 2220
atgcccgaga tcgagaccga gcaggagtac aaggagatct tcatcacccc ccaccagatc 2280
aagcacatca aggacttcaa ggactacaag tacagccacc gggtggacaa gaagcccaac 2340
cggaagctga tcaacgacac cctgtacagc acccggaagg acgacaaggg caacaccctg 2400
atcgtgaaca acctgaacgg cctgtacgac aaggacaacg acaagctgaa gaagctgatc 2460
aacaagagcc ccgagaagct gctgatgtac caccacgacc cccagaccta ccagaagctg 2520
aagctgatca tggagcagta cggcgacgag aagaaccccc tgtacaagta ctacgaggag 2580
accggcaact acctgaccaa gtacagcaag aaggacaacg gccccgtgat caagaagatc 2640
aagtactacg gcaacaagct gaacgcccac ctggacatca ccgacgacta ccccaacagc 2700
cggaacaagg tggtgaagct gagcctgaag ccctaccggt tcgacgtgta cctggacaac 2760
ggcgtgtaca agttcgtgac cgtgaagaac ctggacgtga tcaagaagga gaactactac 2820
gaggtgaaca gcaagtgcta cgaggaggcc aagaagctga agaagatcag caaccaggcc 2880
gagttcatcg ccagcttcta caagaacgac ctgatcaaga tcaacggcga gctgtaccgg 2940
gtgatcggcg tgaacaacga cctgctgaac cggatcgagg tgaacatgat cgacatcacc 3000
taccgggagt acctggagaa catgaacgac aagcggcccc cccacatcat caagaccatc 3060
gccagcaaga cccagagcat caagaagtac agcaccgaca tcctgggcaa cctgtacgag 3120
gtgaaatcca agaagcaccc ccagatcatc aagaagggc 3159
<210> 10
<211> 1155
<212> DNA
<213> Spiroplasma monobiae
<400> 10
agcaaggtgg agaacaagac caagaagctg cgggtgttcg aggccttcgc cggcatcggc 60
gcccagcgga aggccctgga gaaggtgcgg aaggacgagt acgagatcgt gggcctggcc 120
gagtggtacg tgcccgccat cgtgatgtac caggccatcc acaacaactt ccacaccaag 180
ctggagtaca agagcgtgag ccgggaggag atgatcgact acctggagaa caagaccctg 240
agctggaaca gcaagaaccc cgtgagcaac ggctactgga agcggaagaa ggacgacgag 300
ctgaagatca tctacaacgc catcaagctg agcgagaagg agggcaacat cttcgacatc 360
cgggacctgt acaagcggac cctgaagaac atcgacctgc tgacctacag cttcccctgc 420
caggacctga gccagcaggg catccagaag ggcatgaagc ggggcagcgg cacccggagc 480
ggcctgctgt gggagatcga gcgggccctg gacagcaccg agaagaacga cctgcccaag 540
tacctgctga tggagaacgt gggcgccctg ctgcacaaga agaacgagga ggagctgaac 600
cagtggaagc agaagctgga gagcctgggc taccagaaca gcatcgaggt gctgaacgcc 660
gccgacttcg gcagcagcca ggcccggcgg cgggtgttca tgatcagcac cctgaacgag 720
ttcgtggagc tgcccaaggg cgacaagaag cccaagagca tcaagaaggt gctgaacaag 780
atcgtgagcg agaaggacat cctgaacaac ctgctgaagt acaacctgac cgagttcaag 840
aaaaccaaga gcaacatcaa caaggccagc ctgatcggct acagcaagtt caacagcgag 900
ggctacgtgt acgaccccga gttcaccggc cccaccctga ccgccagcgg cgccaacagc 960
cggatcaaga tcaaggacgg cagcaacatc cggaagatga acagcgacga gaccttcctg 1020
tacatcggct tcgacagcca ggacggcaag cgggtgaacg agatcgagtt cctgaccgag 1080
aaccagaaga tcttcgtgtg cggcaacagc atcagcgtgg aggtgctgga ggccatcatc 1140
gacaagatcg gcggc 1155
<210> 11
<211> 386
<212> PRT
<213> Spiroplasma monobiae
<400> 11
Met Ser Lys Val Glu Asn Lys Thr Lys Lys Leu Arg Val Phe Glu Ala
1 5 10 15
Phe Ala Gly Ile Gly Ala Gln Arg Lys Ala Leu Glu Lys Val Arg Lys
20 25 30
Asp Glu Tyr Glu Ile Val Gly Leu Ala Glu Trp Tyr Val Pro Ala Ile
35 40 45
Val Met Tyr Gln Ala Ile His Asn Asn Phe His Thr Lys Leu Glu Tyr
50 55 60
Lys Ser Val Ser Arg Glu Glu Met Ile Asp Tyr Leu Glu Asn Lys Thr
65 70 75 80
Leu Ser Trp Asn Ser Lys Asn Pro Val Ser Asn Gly Tyr Trp Lys Arg
85 90 95
Lys Lys Asp Asp Glu Leu Lys Ile Ile Tyr Asn Ala Ile Lys Leu Ser
100 105 110
Glu Lys Glu Gly Asn Ile Phe Asp Ile Arg Asp Leu Tyr Lys Arg Thr
115 120 125
Leu Lys Asn Ile Asp Leu Leu Thr Tyr Ser Phe Pro Cys Gln Asp Leu
130 135 140
Ser Gln Gln Gly Ile Gln Lys Gly Met Lys Arg Gly Ser Gly Thr Arg
145 150 155 160
Ser Gly Leu Leu Trp Glu Ile Glu Arg Ala Leu Asp Ser Thr Glu Lys
165 170 175
Asn Asp Leu Pro Lys Tyr Leu Leu Met Glu Asn Val Gly Ala Leu Leu
180 185 190
His Lys Lys Asn Glu Glu Glu Leu Asn Gln Trp Lys Gln Lys Leu Glu
195 200 205
Ser Leu Gly Tyr Gln Asn Ser Ile Glu Val Leu Asn Ala Ala Asp Phe
210 215 220
Gly Ser Ser Gln Ala Arg Arg Arg Val Phe Met Ile Ser Thr Leu Asn
225 230 235 240
Glu Phe Val Glu Leu Pro Lys Gly Asp Lys Lys Pro Lys Ser Ile Lys
245 250 255
Lys Val Leu Asn Lys Ile Val Ser Glu Lys Asp Ile Leu Asn Asn Leu
260 265 270
Leu Lys Tyr Asn Leu Thr Glu Phe Lys Lys Thr Lys Ser Asn Ile Asn
275 280 285
Lys Ala Ser Leu Ile Gly Tyr Ser Lys Phe Asn Ser Glu Gly Tyr Val
290 295 300
Tyr Asp Pro Glu Phe Thr Gly Pro Thr Leu Thr Ala Ser Gly Ala Asn
305 310 315 320
Ser Arg Ile Lys Ile Lys Asp Gly Ser Asn Ile Arg Lys Met Asn Ser
325 330 335
Asp Glu Thr Phe Leu Tyr Ile Gly Phe Asp Ser Gln Asp Gly Lys Arg
340 345 350
Val Asn Glu Ile Glu Phe Leu Thr Glu Asn Gln Lys Ile Phe Val Cys
355 360 365
Gly Asn Ser Ile Ser Val Glu Val Leu Glu Ala Ile Ile Asp Lys Ile
370 375 380
Gly Gly
385
<210> 12
<211> 385
<212> PRT
<213> Spiroplasma monobiae
<400> 12
Ser Lys Val Glu Asn Lys Thr Lys Lys Leu Arg Val Phe Glu Ala Phe
1 5 10 15
Ala Gly Ile Gly Ala Gln Arg Lys Ala Leu Glu Lys Val Arg Lys Asp
20 25 30
Glu Tyr Glu Ile Val Gly Leu Ala Glu Trp Tyr Val Pro Ala Ile Val
35 40 45
Met Tyr Gln Ala Ile His Asn Asn Phe His Thr Lys Leu Glu Tyr Lys
50 55 60
Ser Val Ser Arg Glu Glu Met Ile Asp Tyr Leu Glu Asn Lys Thr Leu
65 70 75 80
Ser Trp Asn Ser Lys Asn Pro Val Ser Asn Gly Tyr Trp Lys Arg Lys
85 90 95
Lys Asp Asp Glu Leu Lys Ile Ile Tyr Asn Ala Ile Lys Leu Ser Glu
100 105 110
Lys Glu Gly Asn Ile Phe Asp Ile Arg Asp Leu Tyr Lys Arg Thr Leu
115 120 125
Lys Asn Ile Asp Leu Leu Thr Tyr Ser Phe Pro Cys Gln Asp Leu Ser
130 135 140
Gln Gln Gly Ile Gln Lys Gly Met Lys Arg Gly Ser Gly Thr Arg Ser
145 150 155 160
Gly Leu Leu Trp Glu Ile Glu Arg Ala Leu Asp Ser Thr Glu Lys Asn
165 170 175
Asp Leu Pro Lys Tyr Leu Leu Met Glu Asn Val Gly Ala Leu Leu His
180 185 190
Lys Lys Asn Glu Glu Glu Leu Asn Gln Trp Lys Gln Lys Leu Glu Ser
195 200 205
Leu Gly Tyr Gln Asn Ser Ile Glu Val Leu Asn Ala Ala Asp Phe Gly
210 215 220
Ser Ser Gln Ala Arg Arg Arg Val Phe Met Ile Ser Thr Leu Asn Glu
225 230 235 240
Phe Val Glu Leu Pro Lys Gly Asp Lys Lys Pro Lys Ser Ile Lys Lys
245 250 255
Val Leu Asn Lys Ile Val Ser Glu Lys Asp Ile Leu Asn Asn Leu Leu
260 265 270
Lys Tyr Asn Leu Thr Glu Phe Lys Lys Thr Lys Ser Asn Ile Asn Lys
275 280 285
Ala Ser Leu Ile Gly Tyr Ser Lys Phe Asn Ser Glu Gly Tyr Val Tyr
290 295 300
Asp Pro Glu Phe Thr Gly Pro Thr Leu Thr Ala Ser Gly Ala Asn Ser
305 310 315 320
Arg Ile Lys Ile Lys Asp Gly Ser Asn Ile Arg Lys Met Asn Ser Asp
325 330 335
Glu Thr Phe Leu Tyr Ile Gly Phe Asp Ser Gln Asp Gly Lys Arg Val
340 345 350
Asn Glu Ile Glu Phe Leu Thr Glu Asn Gln Lys Ile Phe Val Cys Gly
355 360 365
Asn Ser Ile Ser Val Glu Val Leu Glu Ala Ile Ile Asp Lys Ile Gly
370 375 380
Gly
385
<210> 13
<211> 96
<212> PRT
<213> Homo sapiens
<400> 13
Asp Ala Lys Ser Leu Thr Ala Trp Ser Arg Thr Leu Val Thr Phe Lys
1 5 10 15
Asp Val Phe Val Asp Phe Thr Arg Glu Glu Trp Lys Leu Leu Asp Thr
20 25 30
Ala Gln Gln Ile Leu Tyr Arg Asn Val Met Leu Glu Asn Tyr Lys Asn
35 40 45
Leu Val Ser Leu Gly Tyr Gln Leu Thr Lys Pro Asp Val Ile Leu Arg
50 55 60
Leu Glu Lys Gly Glu Glu Pro Trp Leu Val Glu Arg Glu Ile His Gln
65 70 75 80
Glu Thr His Pro Asp Ser Glu Thr Ala Phe Glu Ile Lys Ser Ser Val
85 90 95
<210> 14
<211> 288
<212> DNA
<213> Homo sapiens
<400> 14
gacgccaaga gcctgaccgc ctggagccgg accctggtga ccttcaagga cgtgttcgtg 60
gacttcaccc gggaggagtg gaagctgctg gacaccgccc agcagatcct gtaccggaac 120
gtgatgctgg agaactacaa gaacctggtg agcctgggct accagctgac caagcccgac 180
gtgatcctgc ggctggagaa gggcgaggag ccctggctgg tggagcggga gatccaccag 240
gagacccacc ccgacagcga gaccgccttc gagatcaaga gcagcgtg 288
<210> 15
<211> 475
<212> PRT
<213> Homo sapiens
<400> 15
Gly Lys Ile Met Tyr Val Gly Asp Val Arg Ser Val Thr Gln Lys His
1 5 10 15
Ile Gln Glu Trp Gly Pro Phe Asp Leu Val Ile Gly Gly Ser Pro Cys
20 25 30
Asn Asp Leu Ser Ile Val Asn Pro Ala Arg Lys Gly Leu Tyr Glu Gly
35 40 45
Thr Gly Arg Leu Phe Phe Glu Phe Tyr Arg Leu Leu His Asp Ala Arg
50 55 60
Pro Lys Glu Gly Asp Asp Arg Pro Phe Phe Trp Leu Phe Glu Asn Val
65 70 75 80
Val Ala Met Gly Val Ser Asp Lys Arg Asp Ile Ser Arg Phe Leu Glu
85 90 95
Ser Asn Pro Val Met Ile Asp Ala Lys Glu Val Ser Ala Ala His Arg
100 105 110
Ala Arg Tyr Phe Trp Gly Asn Leu Pro Gly Met Asn Arg Pro Leu Ala
115 120 125
Ser Thr Val Asn Asp Lys Leu Glu Leu Gln Glu Cys Leu Glu His Gly
130 135 140
Arg Ile Ala Lys Phe Ser Lys Val Arg Thr Ile Thr Thr Arg Ser Asn
145 150 155 160
Ser Ile Lys Gln Gly Lys Asp Gln His Phe Pro Val Phe Met Asn Glu
165 170 175
Lys Glu Asp Ile Leu Trp Cys Thr Glu Met Glu Arg Val Phe Gly Phe
180 185 190
Pro Val His Tyr Thr Asp Val Ser Asn Met Ser Arg Leu Ala Arg Gln
195 200 205
Arg Leu Leu Gly Arg Ser Trp Ser Val Pro Val Ile Arg His Leu Phe
210 215 220
Ala Pro Leu Lys Glu Tyr Phe Ala Cys Val Ser Ser Gly Asn Ser Asn
225 230 235 240
Ala Asn Ser Arg Gly Pro Ser Phe Ser Ser Gly Leu Val Pro Leu Ser
245 250 255
Leu Arg Gly Ser His Met Asn Pro Leu Glu Met Phe Glu Thr Val Pro
260 265 270
Val Trp Arg Arg Gln Pro Val Arg Val Leu Ser Leu Phe Glu Asp Ile
275 280 285
Lys Lys Glu Leu Thr Ser Leu Gly Phe Leu Glu Ser Gly Ser Asp Pro
290 295 300
Gly Gln Leu Lys His Val Val Asp Val Thr Asp Thr Val Arg Lys Asp
305 310 315 320
Val Glu Glu Trp Gly Pro Phe Asp Leu Val Tyr Gly Ala Thr Pro Pro
325 330 335
Leu Gly His Thr Cys Asp Arg Pro Pro Ser Trp Tyr Leu Phe Gln Phe
340 345 350
His Arg Leu Leu Gln Tyr Ala Arg Pro Lys Pro Gly Ser Pro Arg Pro
355 360 365
Phe Phe Trp Met Phe Val Asp Asn Leu Val Leu Asn Lys Glu Asp Leu
370 375 380
Asp Val Ala Ser Arg Phe Leu Glu Met Glu Pro Val Thr Ile Pro Asp
385 390 395 400
Val His Gly Gly Ser Leu Gln Asn Ala Val Arg Val Trp Ser Asn Ile
405 410 415
Pro Ala Ile Arg Ser Arg His Trp Ala Leu Val Ser Glu Glu Glu Leu
420 425 430
Ser Leu Leu Ala Gln Asn Lys Gln Ser Ser Lys Leu Ala Ala Lys Trp
435 440 445
Pro Thr Lys Leu Val Lys Asn Cys Phe Leu Pro Leu Arg Glu Tyr Phe
450 455 460
Lys Tyr Phe Ser Thr Glu Leu Thr Ser Ser Leu
465 470 475
<210> 16
<211> 1626
<212> DNA
<213> Homo sapiens
<400> 16
aaccacgacc aggagttcga cccccccaag gtgtaccccc ccgtgcccgc cgagaagcgg 60
aagcccatcc gggtgctgag cctgttcgac ggcatcgcca ccggcctgct ggtgctgaag 120
gacctgggca tccaggtgga ccggtacatc gccagcgagg tgtgcgagga cagcatcacc 180
gtgggcatgg tgcggcacca gggcaagatc atgtacgtgg gcgacgtgcg gagcgtgacc 240
cagaagcaca tccaggagtg gggccccttc gacctggtga tcggcggcag cccctgcaac 300
gacctgagca tcgtgaaccc cgcccggaag ggcctgtacg agggcaccgg ccggctgttc 360
ttcgagttct accggctgct gcacgacgcc cggcccaagg agggcgacga ccggcccttc 420
ttctggctgt tcgagaacgt ggtggccatg ggcgtgagcg acaagcggga catcagccgg 480
ttcctggaga gcaaccccgt gatgatcgac gccaaggagg tgagcgccgc ccaccgggcc 540
cggtacttct ggggcaacct gcccggcatg aaccggcccc tggccagcac cgtgaacgac 600
aagctggagc tgcaggagtg cctggagcac ggccggatcg ccaagttcag caaggtgcgg 660
accatcacca cccggagcaa cagcatcaag cagggcaagg accagcactt ccccgtgttc 720
atgaacgaga aggaggacat cctgtggtgc accgagatgg agcgggtgtt cggcttcccc 780
gtgcactaca ccgacgtgag caacatgagc cggctggccc ggcagcggct gctgggccgg 840
agctggagcg tgcccgtgat ccggcacctg ttcgcccccc tgaaggagta cttcgcctgc 900
gtgagcagcg gcaacagcaa cgccaacagc cggggcccca gcttcagcag cggcctggtg 960
cccctgagcc tgcggggcag ccacatgaat cctctggaga tgttcgagac agtgcccgtg 1020
tggagaaggc aacccgtgag ggtgctgagc ctcttcgagg acattaagaa ggagctgacc 1080
tctctgggct ttctggaatc cggcagcgac cccggccagc tgaaacacgt ggtggacgtg 1140
accgacacag tgaggaagga cgtggaagag tggggcccct ttgacctcgt gtatggagcc 1200
acacctcctc tcggccacac atgcgatagg cctcccagct ggtatctctt ccagttccac 1260
agactgctcc agtacgccag acctaagccc ggcagcccca gacccttctt ctggatgttc 1320
gtggacaatc tggtgctgaa caaggaggat ctggatgtgg ccagcagatt tctggagatg 1380
gaacccgtga caatccccga cgtgcatggc ggctctctgc agaacgccgt gagagtgtgg 1440
tccaacatcc ccgccattag aagcagacac tgggctctgg tgagcgagga ggaactgtct 1500
ctgctggccc agaataagca gtcctccaag ctggccgcca agtggcccac caagctggtg 1560
aagaactgct ttctgcctct gagggagtat ttcaagtatt tcagcaccga actgaccagc 1620
agcctg 1626
<210> 17
<211> 745
<212> PRT
<213> Homo sapiens
<400> 17
Gly Gln Thr Gly Lys Lys Ser Glu Lys Gly Pro Val Cys Trp Arg Lys
1 5 10 15
Arg Val Lys Ser Glu Tyr Met Arg Leu Arg Gln Leu Lys Arg Phe Arg
20 25 30
Arg Ala Asp Glu Val Lys Ser Met Phe Ser Ser Asn Arg Gln Lys Ile
35 40 45
Leu Glu Arg Thr Glu Ile Leu Asn Gln Glu Trp Lys Gln Arg Arg Ile
50 55 60
Gln Pro Val His Ile Leu Thr Ser Val Ser Ser Leu Arg Gly Thr Arg
65 70 75 80
Glu Cys Ser Val Thr Ser Asp Leu Asp Phe Pro Thr Gln Val Ile Pro
85 90 95
Leu Lys Thr Leu Asn Ala Val Ala Ser Val Pro Ile Met Tyr Ser Trp
100 105 110
Ser Pro Leu Gln Gln Asn Phe Met Val Glu Asp Glu Thr Val Leu His
115 120 125
Asn Ile Pro Tyr Met Gly Asp Glu Val Leu Asp Gln Asp Gly Thr Phe
130 135 140
Ile Glu Glu Leu Ile Lys Asn Tyr Asp Gly Lys Val His Gly Asp Arg
145 150 155 160
Glu Cys Gly Phe Ile Asn Asp Glu Ile Phe Val Glu Leu Val Asn Ala
165 170 175
Leu Gly Gln Tyr Asn Asp Asp Asp Asp Asp Asp Asp Gly Asp Asp Pro
180 185 190
Glu Glu Arg Glu Glu Lys Gln Lys Asp Leu Glu Asp His Arg Asp Asp
195 200 205
Lys Glu Ser Arg Pro Pro Arg Lys Phe Pro Ser Asp Lys Ile Phe Glu
210 215 220
Ala Ile Ser Ser Met Phe Pro Asp Lys Gly Thr Ala Glu Glu Leu Lys
225 230 235 240
Glu Lys Tyr Lys Glu Leu Thr Glu Gln Gln Leu Pro Gly Ala Leu Pro
245 250 255
Pro Glu Cys Thr Pro Asn Ile Asp Gly Pro Asn Ala Lys Ser Val Gln
260 265 270
Arg Glu Gln Ser Leu His Ser Phe His Thr Leu Phe Cys Arg Arg Cys
275 280 285
Phe Lys Tyr Asp Cys Phe Leu His Pro Phe His Ala Thr Pro Asn Thr
290 295 300
Tyr Lys Arg Lys Asn Thr Glu Thr Ala Leu Asp Asn Lys Pro Cys Gly
305 310 315 320
Pro Gln Cys Tyr Gln His Leu Glu Gly Ala Lys Glu Phe Ala Ala Ala
325 330 335
Leu Thr Ala Glu Arg Ile Lys Thr Pro Pro Lys Arg Pro Gly Gly Arg
340 345 350
Arg Arg Gly Arg Leu Pro Asn Asn Ser Ser Arg Pro Ser Thr Pro Thr
355 360 365
Ile Asn Val Leu Glu Ser Lys Asp Thr Asp Ser Asp Arg Glu Ala Gly
370 375 380
Thr Glu Thr Gly Gly Glu Asn Asn Asp Lys Glu Glu Glu Glu Lys Lys
385 390 395 400
Asp Glu Thr Ser Ser Ser Ser Glu Ala Asn Ser Arg Cys Gln Thr Pro
405 410 415
Ile Lys Met Lys Pro Asn Ile Glu Pro Pro Glu Asn Val Glu Trp Ser
420 425 430
Gly Ala Glu Ala Ser Met Phe Arg Val Leu Ile Gly Thr Tyr Tyr Asp
435 440 445
Asn Phe Cys Ala Ile Ala Arg Leu Ile Gly Thr Lys Thr Cys Arg Gln
450 455 460
Val Tyr Glu Phe Arg Val Lys Glu Ser Ser Ile Ile Ala Pro Ala Pro
465 470 475 480
Ala Glu Asp Val Asp Thr Pro Pro Arg Lys Lys Lys Arg Lys His Arg
485 490 495
Leu Trp Ala Ala His Cys Arg Lys Ile Gln Leu Lys Lys Asp Gly Ser
500 505 510
Ser Asn His Val Tyr Asn Tyr Gln Pro Cys Asp His Pro Arg Gln Pro
515 520 525
Cys Asp Ser Ser Cys Pro Cys Val Ile Ala Gln Asn Phe Cys Glu Lys
530 535 540
Phe Cys Gln Cys Ser Ser Glu Cys Gln Asn Arg Phe Pro Gly Cys Arg
545 550 555 560
Cys Lys Ala Gln Cys Asn Thr Lys Gln Cys Pro Cys Tyr Leu Ala Val
565 570 575
Arg Glu Cys Asp Pro Asp Leu Cys Leu Thr Cys Gly Ala Ala Asp His
580 585 590
Trp Asp Ser Lys Asn Val Ser Cys Lys Asn Cys Ser Ile Gln Arg Gly
595 600 605
Ser Lys Lys His Leu Leu Leu Ala Pro Ser Asp Val Ala Gly Trp Gly
610 615 620
Ile Phe Ile Lys Asp Pro Val Gln Lys Asn Glu Phe Ile Ser Glu Tyr
625 630 635 640
Cys Gly Glu Ile Ile Ser Gln Asp Glu Ala Asp Arg Arg Gly Lys Val
645 650 655
Tyr Asp Lys Tyr Met Cys Ser Phe Leu Phe Asn Leu Asn Asn Asp Phe
660 665 670
Val Val Asp Ala Thr Arg Lys Gly Asn Lys Ile Arg Phe Ala Asn His
675 680 685
Ser Val Asn Pro Asn Cys Tyr Ala Lys Val Met Met Val Asn Gly Asp
690 695 700
His Arg Ile Gly Ile Phe Ala Lys Arg Ala Ile Gln Thr Gly Glu Glu
705 710 715 720
Leu Phe Phe Asp Tyr Arg Tyr Ser Gln Ala Asp Ala Leu Lys Tyr Val
725 730 735
Gly Ile Glu Arg Glu Met Glu Ile Pro
740 745
<210> 18
<211> 2235
<212> DNA
<213> Homo sapiens
<400> 18
ggccagaccg gcaagaagag cgagaagggc cccgtgtgct ggcggaagcg ggtgaagagc 60
gagtacatgc ggctgcggca gctgaagcgg ttccggcggg ccgacgaggt gaagagcatg 120
ttcagcagca accggcagaa gatcctggag cggaccgaga tcctgaacca ggagtggaag 180
cagcggcgaa tccagcccgt gcacatcctg accagcgtga gcagcctgcg gggcacccgg 240
gagtgcagcg tgaccagcga cctggacttc cccacccagg tgatccccct aaagaccctg 300
aacgccgtgg ccagcgtgcc catcatgtac agctggagcc ccctgcagca gaacttcatg 360
gtggaggacg agaccgtgct gcacaacatc ccctacatgg gcgacgaggt gctggaccag 420
gacggcacct tcatcgagga gctgatcaag aactacgacg gcaaggtgca cggcgaccgg 480
gagtgcggct tcatcaacga cgagatcttc gtggagctgg tgaacgccct gggccagtac 540
aacgacgacg acgacgacga cgacggcgac gaccccgagg agcgggagga gaagcagaag 600
gacctggagg accaccggga cgacaaggag agccggcccc cccggaagtt ccccagcgac 660
aagatcttcg aggccatcag cagcatgttc cccgacaagg gcaccgccga ggagctgaag 720
gagaagtaca aggagctgac cgagcagcag ctgcccggcg ccctgccccc cgagtgcacc 780
cccaacatcg acggccccaa cgccaagagc gtgcagcggg agcagagcct gcacagcttc 840
cacaccctgt tctgccggcg gtgcttcaag tacgactgct tcctgcaccc cttccacgcc 900
acccccaaca cctacaagcg gaagaacacc gagaccgccc tggacaacaa gccctgcggc 960
ccccagtgct accagcacct ggagggcgcc aaggagttcg ccgccgccct gaccgccgag 1020
cggatcaaga ccccccccaa gcggcccggc ggccggcggc ggggccggct gcccaacaac 1080
agcagccggc ccagcacccc caccatcaac gtgctggaga gcaaggacac cgacagcgac 1140
cgggaggccg gcaccgagac cggcggcgag aacaacgaca aggaggagga ggagaagaag 1200
gacgagacca gcagcagcag cgaggccaac agccggtgcc agacccccat caagatgaag 1260
cccaacatcg agccccccga gaacgtggag tggagcggcg ccgaggccag catgttccgg 1320
gtgctgatcg gcacctacta cgacaacttc tgcgccatcg cccggctgat cggcaccaag 1380
acctgccggc aggtgtacga gttccgggtg aaggagagca gcatcatcgc ccccgccccc 1440
gccgaggacg tggacacccc cccccggaag aagaagcgga agcaccggct gtgggccgcc 1500
cactgccgga agatccagct gaagaaggac ggcagcagca accacgtgta caactaccag 1560
ccctgcgacc acccccggca gccctgcgac agcagctgcc cctgcgtgat cgcccagaac 1620
ttctgcgaga agttctgcca gtgcagcagc gagtgccaga accggttccc cggctgccgg 1680
tgcaaggccc agtgcaacac caagcagtgc ccctgctacc tggccgtgcg ggagtgcgac 1740
cccgacctgt gcctgacctg cggcgccgcc gaccactggg acagcaagaa cgtgagctgc 1800
aagaactgca gcatccagcg gggcagcaag aagcacctgc tgctggcccc cagcgacgtg 1860
gccggctggg gcatcttcat caaggacccc gtgcagaaga acgagttcat cagcgagtac 1920
tgcggcgaga tcatcagcca ggacgaggcc gaccggcggg gcaaggtgta cgacaagtac 1980
atgtgcagct tcctgttcaa cctgaacaac gacttcgtgg tggacgccac ccggaagggc 2040
aacaagatcc ggttcgccaa ccacagcgtg aaccccaact gctacgccaa ggtgatgatg 2100
gtgaacggcg accaccggat cggcatcttc gccaagcggg ccatccagac cggcgaggag 2160
ctgttcttcg actaccggta cagccaggcc gacgccctga agtacgtggg catcgagcgg 2220
gagatggaga tcccc 2235
<210> 19
<211> 376
<212> PRT
<213> Homo sapiens
<400> 19
Glu Glu Pro Glu Glu Pro Ala Asp Ser Gly Gln Ser Leu Val Pro Val
1 5 10 15
Tyr Ile Tyr Ser Pro Glu Tyr Val Ser Met Cys Asp Ser Leu Ala Lys
20 25 30
Ile Pro Lys Arg Ala Ser Met Val His Ser Leu Ile Glu Ala Tyr Ala
35 40 45
Leu His Lys Gln Met Arg Ile Val Lys Pro Lys Val Ala Ser Met Glu
50 55 60
Glu Met Ala Thr Phe His Thr Asp Ala Tyr Leu Gln His Leu Gln Lys
65 70 75 80
Val Ser Gln Glu Gly Asp Asp Asp His Pro Asp Ser Ile Glu Tyr Gly
85 90 95
Leu Gly Tyr Asp Cys Pro Ala Thr Glu Gly Ile Phe Asp Tyr Ala Ala
100 105 110
Ala Ile Gly Gly Ala Thr Ile Thr Ala Ala Gln Cys Leu Ile Asp Gly
115 120 125
Met Cys Lys Val Ala Ile Asn Trp Ser Gly Gly Trp His His Ala Lys
130 135 140
Lys Asp Glu Ala Ser Gly Phe Cys Tyr Leu Asn Asp Ala Val Leu Gly
145 150 155 160
Ile Leu Arg Leu Arg Arg Lys Phe Glu Arg Ile Leu Tyr Val Asp Leu
165 170 175
Asp Leu His His Gly Asp Gly Val Glu Asp Ala Phe Ser Phe Thr Ser
180 185 190
Lys Val Met Thr Val Ser Leu His Lys Phe Ser Pro Gly Phe Phe Pro
195 200 205
Gly Thr Gly Asp Val Ser Asp Val Gly Leu Gly Lys Gly Arg Tyr Tyr
210 215 220
Ser Val Asn Val Pro Ile Gln Asp Gly Ile Gln Asp Glu Lys Tyr Tyr
225 230 235 240
Gln Ile Cys Glu Ser Val Leu Lys Glu Val Tyr Gln Ala Phe Asn Pro
245 250 255
Lys Ala Val Val Leu Gln Leu Gly Ala Asp Thr Ile Ala Gly Asp Pro
260 265 270
Met Cys Ser Phe Asn Met Thr Pro Val Gly Ile Gly Lys Cys Leu Lys
275 280 285
Tyr Ile Leu Gln Trp Gln Leu Ala Thr Leu Ile Leu Gly Gly Gly Gly
290 295 300
Tyr Asn Leu Ala Asn Thr Ala Arg Cys Trp Thr Tyr Leu Thr Gly Val
305 310 315 320
Ile Leu Gly Lys Thr Leu Ser Ser Glu Ile Pro Asp His Glu Phe Phe
325 330 335
Thr Ala Tyr Gly Pro Asp Tyr Val Leu Glu Ile Thr Pro Ser Cys Arg
340 345 350
Pro Asp Arg Asn Glu Pro His Arg Ile Gln Gln Ile Leu Asn Tyr Ile
355 360 365
Lys Gly Asn Leu Lys His Val Val
370 375
<210> 20
<211> 20
<212> DNA
<213> artificial sequence
<220>
<223> Description of artificial sequence: synthetic oligonucleotide
<400> 20
ggggccacta gggacaggat 20
<210> 21
<211> 20
<212> DNA
<213> artificial sequence
<220>
<223> Description of artificial sequence: synthetic oligonucleotide
<400> 21
agccccacct tgtggtcaga 20
<210> 22
<211> 20
<212> DNA
<213> artificial sequence
<220>
<223> Description of artificial sequence: synthetic oligonucleotide
<400> 22
agtgctgcct tctgaccaca 20
<210> 23
<211> 20
<212> DNA
<213> artificial sequence
<220>
<223> Description of artificial sequence: synthetic oligonucleotide
<400> 23
gctgccttct gaccacaagg 20
<210> 24
<211> 20
<212> DNA
<213> artificial sequence
<220>
<223> Description of artificial sequence: synthetic oligonucleotide
<400> 24
ccagtataag ccccaccttg 20
<210> 25
<211> 20
<212> DNA
<213> artificial sequence
<220>
<223> Description of artificial sequence: synthetic oligonucleotide
<400> 25
ctgcctgtcc cataaggagg 20
<210> 26
<211> 20
<212> DNA
<213> artificial sequence
<220>
<223> Description of artificial sequence: synthetic oligonucleotide
<400> 26
gcactgcctg tcccataagg 20
<210> 27
<211> 20
<212> DNA
<213> artificial sequence
<220>
<223> Description of artificial sequence: synthetic oligonucleotide
<400> 27
ggtcctcctc cttatgggac 20
<210> 28
<211> 20
<212> DNA
<213> artificial sequence
<220>
<223> Description of artificial sequence: synthetic oligonucleotide
<400> 28
gccttgtttt cggctctaga 20
<210> 29
<211> 20
<212> DNA
<213> artificial sequence
<220>
<223> Description of artificial sequence: synthetic oligonucleotide
<400> 29
gccatctaga gccgaaaaca 20
<210> 30
<211> 20
<212> DNA
<213> artificial sequence
<220>
<223> Description of artificial sequence: synthetic oligonucleotide
<400> 30
ccaatgaaga tgaaactggg 20
<210> 31
<211> 20
<212> DNA
<213> artificial sequence
<220>
<223> Description of artificial sequence: synthetic oligonucleotide
<400> 31
aacgtgcttg cctaagattc 20
<210> 32
<211> 20
<212> DNA
<213> artificial sequence
<220>
<223> Description of artificial sequence: synthetic oligonucleotide
<400> 32
agcccttaat catatctagt 20
<210> 33
<211> 20
<212> DNA
<213> artificial sequence
<220>
<223> Description of artificial sequence: synthetic oligonucleotide
<400> 33
cagagcttaa gacctgtact 20
<210> 34
<211> 20
<212> DNA
<213> artificial sequence
<220>
<223> Description of artificial sequence: synthetic oligonucleotide
<400> 34
gcccaccttg accttcacaa 20
<210> 35
<211> 20
<212> DNA
<213> artificial sequence
<220>
<223> Description of artificial sequence: synthetic oligonucleotide
<400> 35
gttactgcgt aattaccagg 20
<210> 36
<211> 20
<212> DNA
<213> artificial sequence
<220>
<223> Description of artificial sequence: synthetic oligonucleotide
<400> 36
tattacatcc tacctataag 20
<210> 37
<211> 20
<212> DNA
<213> artificial sequence
<220>
<223> Description of artificial sequence: synthetic oligonucleotide
<400> 37
tgggctctgg acttagatcg 20
<210> 38
<211> 20
<212> DNA
<213> artificial sequence
<220>
<223> Description of artificial sequence: synthetic oligonucleotide
<400> 38
taagtgggct atgtatacac 20
<210> 39
<211> 20
<212> DNA
<213> artificial sequence
<220>
<223> Description of artificial sequence: synthetic oligonucleotide
<400> 39
tttctaagtc tgtcacaagg 20
<210> 40
<211> 20
<212> DNA
<213> artificial sequence
<220>
<223> Description of artificial sequence: synthetic oligonucleotide
<400> 40
aaagtaatat gatctaggaa 20
<210> 41
<211> 20
<212> DNA
<213> artificial sequence
<220>
<223> Description of artificial sequence: synthetic oligonucleotide
<400> 41
gttcgagcgg ctgtgcgagg 20
<210> 42
<211> 20
<212> DNA
<213> artificial sequence
<220>
<223> Description of artificial sequence: synthetic oligonucleotide
<400> 42
gctctgtggc tctccgagaa 20
<210> 43
<211> 20
<212> DNA
<213> artificial sequence
<220>
<223> Description of artificial sequence: synthetic oligonucleotide
<400> 43
gtgtgtgtgt ttcaacgtag 20
<210> 44
<211> 20
<212> DNA
<213> artificial sequence
<220>
<223> Description of artificial sequence: synthetic oligonucleotide
<400> 44
ggaagtcact gggagctgcg 20
<210> 45
<211> 20
<212> DNA
<213> artificial sequence
<220>
<223> Description of artificial sequence: synthetic oligonucleotide
<400> 45
ggccacgggt gtgttcccag 20
<210> 46
<211> 20
<212> DNA
<213> artificial sequence
<220>
<223> Description of artificial sequence: synthetic oligonucleotide
<400> 46
atggccattt gcaaaagtca 20
<210> 47
<211> 20
<212> DNA
<213> artificial sequence
<220>
<223> Description of artificial sequence: synthetic oligonucleotide
<400> 47
ccaaactaga cagataaagc 20
<210> 48
<211> 20
<212> DNA
<213> artificial sequence
<220>
<223> Description of artificial sequence: synthetic oligonucleotide
<400> 48
ccagcatgac tctagcatgc 20
<210> 49
<211> 20
<212> DNA
<213> artificial sequence
<220>
<223> Description of artificial sequence: synthetic oligonucleotide
<400> 49
tggccaaggt ctgatatgca 20
<210> 50
<211> 20
<212> DNA
<213> artificial sequence
<220>
<223> Description of artificial sequence: synthetic oligonucleotide
<400> 50
tcatgagtcc cagaacatgt 20
<210> 51
<211> 20
<212> DNA
<213> artificial sequence
<220>
<223> Description of artificial sequence: synthetic oligonucleotide
<400> 51
gcgaaagaag tagtagctaa 20
<210> 52
<211> 20
<212> DNA
<213> artificial sequence
<220>
<223> Description of artificial sequence: synthetic oligonucleotide
<400> 52
gactaagact ggcaaatctg 20
<210> 53
<211> 20
<212> DNA
<213> artificial sequence
<220>
<223> Description of artificial sequence: synthetic oligonucleotide
<400> 53
gactaagagg agccgacatg 20
<210> 54
<211> 20
<212> DNA
<213> artificial sequence
<220>
<223> Description of artificial sequence: synthetic oligonucleotide
<400> 54
gaaaaacggg tgttgtgacg 20
<210> 55
<211> 20
<212> DNA
<213> artificial sequence
<220>
<223> Description of artificial sequence: synthetic oligonucleotide
<400> 55
tttgtgaact aaggattctg 20
<210> 56
<211> 20
<212> DNA
<213> artificial sequence
<220>
<223> Description of artificial sequence: synthetic oligonucleotide
<400> 56
gtccgtgtag agttaccatg 20
<210> 57
<211> 20
<212> DNA
<213> artificial sequence
<220>
<223> Description of artificial sequence: synthetic oligonucleotide
<400> 57
gatgtattca caagaggact 20
<210> 58
<211> 20
<212> DNA
<213> artificial sequence
<220>
<223> Description of artificial sequence: synthetic oligonucleotide
<400> 58
aattactacc tcatagctag 20
<210> 59
<211> 20
<212> DNA
<213> artificial sequence
<220>
<223> Description of artificial sequence: synthetic oligonucleotide
<400> 59
gaaggtagaa atccgccact 20
<210> 60
<211> 20
<212> DNA
<213> artificial sequence
<220>
<223> Description of artificial sequence: synthetic oligonucleotide
<400> 60
gaaacgccga ggtaactcat 20
<210> 61
<211> 20
<212> DNA
<213> artificial sequence
<220>
<223> Description of artificial sequence: synthetic oligonucleotide
<400> 61
caactaaaat ttctagccct 20
<210> 62
<211> 21
<212> DNA
<213> artificial sequence
<220>
<223> Description of artificial sequence: synthetic oligonucleotide
<400> 62
gactccagtc tttctagaag a 21
<210> 63
<211> 6
<212> PRT
<213> artificial sequence
<220>
<223> Description of artificial sequence: synthetic peptide
<400> 63
Pro Lys Lys Lys Arg Lys
1 5
<210> 64
<211> 15
<212> PRT
<213> artificial sequence
<220>
<223> Description of artificial sequence: synthetic peptide
<400> 64
Lys Arg Pro Ala Ala Thr Lys Lys Ala Gly Gln Ala Lys Lys Lys
1 5 10 15
<210> 65
<211> 9
<212> PRT
<213> artificial sequence
<220>
<223> Description of artificial sequence: synthetic peptide
<400> 65
Tyr Pro Tyr Asp Val Pro Asp Tyr Ala
1 5
<210> 66
<211> 1128
<212> DNA
<213> Homo sapiens
<400> 66
gaggagcccg aggagcccgc cgatagcgga caatctctgg tgcccgtcta catctacagc 60
cccgaatatg tgagcatgtg tgattccctc gccaagatcc ctaagagagc cagcatggtg 120
cattctctga tcgaggccta cgctctgcat aagcaaatga ggatcgtgaa gcccaaggtc 180
gccagcatgg aagagatggc cacctttcac accgatgcct acctccaaca tctccagaag 240
gtgtcccaag agggcgacga cgaccacccc gactccattg agtacggact gggctatgat 300
tgccccgcca ccgagggcat ctttgactat gccgccgcta tcggcggagc taccatcaca 360
gccgcccagt gtctgattga tggcatgtgc aaggtcgcca tcaactggtc cggaggctgg 420
catcatgcca agaaggatga ggcctccggc ttctgttatc tgaatgacgc cgtgctgggc 480
attctgagac tgaggaggaa attcgagagg attctgtacg tggatctgga tctgcatcac 540
ggagatggag tcgaagatgc cttcagcttc accagcaagg tgatgacagt ctctctgcac 600
aagttctccc ccggcttctt tcccggaacc ggcgacgtgt ccgacgtggg actgggcaag 660
ggaaggtact acagcgtgaa cgtgcccatt caagacggca tccaagacga gaagtactac 720
cagatctgcg agtccgtgct caaggaggtc taccaagcct tcaatcctaa ggctgtcgtg 780
ctccaactgg gagctgatac cattgctggc gatcccatgt gcagcttcaa tatgacaccc 840
gtcggaatcg gcaagtgcct caagtacatc ctccagtggc agctcgccac cctcattctc 900
ggaggaggcg gatacaatct ggctaatacc gccagatgct ggacctatct gaccggcgtg 960
attctgggca aaacactgag cagcgaaatc cccgaccacg agtttttcac cgcttacggc 1020
cccgactacg tgctggagat cacccccagc tgcagacccg atagaaacga accccataga 1080
atccagcaaa ttctgaacta tatcaagggc aacctcaagc acgtcgtg 1128
<210> 67
<211> 282
<212> PRT
<213> Homo sapiens
<400> 67
Gly Asn Arg Ala Ile Arg Thr Glu Lys Ile Ile Cys Arg Asp Val Ala
1 5 10 15
Arg Gly Tyr Glu Asn Val Pro Ile Pro Cys Val Asn Gly Val Asp Gly
20 25 30
Glu Pro Cys Pro Glu Asp Tyr Lys Tyr Ile Ser Glu Asn Cys Glu Thr
35 40 45
Ser Thr Met Asn Ile Asp Arg Asn Ile Thr His Leu Gln His Cys Thr
50 55 60
Cys Val Asp Asp Cys Ser Ser Ser Asn Cys Leu Cys Gly Gln Leu Ser
65 70 75 80
Ile Arg Cys Trp Tyr Asp Lys Asp Gly Arg Leu Leu Gln Glu Phe Asn
85 90 95
Lys Ile Glu Pro Pro Leu Ile Phe Glu Cys Asn Gln Ala Cys Ser Cys
100 105 110
Trp Arg Asn Cys Lys Asn Arg Val Val Gln Ser Gly Ile Lys Val Arg
115 120 125
Leu Gln Leu Tyr Arg Thr Ala Lys Met Gly Trp Gly Val Arg Ala Leu
130 135 140
Gln Thr Ile Pro Gln Gly Thr Phe Ile Cys Glu Tyr Val Gly Glu Leu
145 150 155 160
Ile Ser Asp Ala Glu Ala Asp Val Arg Glu Asp Asp Ser Tyr Leu Phe
165 170 175
Asp Leu Asp Asn Lys Asp Gly Glu Val Tyr Cys Ile Asp Ala Arg Tyr
180 185 190
Tyr Gly Asn Ile Ser Arg Phe Ile Asn His Leu Cys Asp Pro Asn Ile
195 200 205
Ile Pro Val Arg Val Phe Met Leu His Gln Asp Leu Arg Phe Pro Arg
210 215 220
Ile Ala Phe Phe Ser Ser Arg Asp Ile Arg Thr Gly Glu Glu Leu Gly
225 230 235 240
Phe Asp Tyr Gly Asp Arg Phe Trp Asp Ile Lys Ser Lys Tyr Phe Thr
245 250 255
Cys Gln Cys Gly Ser Glu Lys Cys Lys His Ser Ala Glu Ala Ile Ala
260 265 270
Leu Glu Gln Ser Arg Leu Ala Arg Leu Asp
275 280
<210> 68
<211> 846
<212> DNA
<213> Homo sapiens
<400> 68
ggaaataggg ctatcagaac cgagaagatc atctgtaggg acgtggctag aggctacgag 60
aacgtgccca ttccttgcgt gaatggcgtg gatggcgaac cttgccccga ggactacaaa 120
tacatctccg agaactgcga aaccagcaca atgaacatcg acagaaacat cacccacctc 180
cagcactgca catgtgtgga tgactgctcc tccagcaact gtctgtgcgg ccagctctcc 240
atcagatgct ggtacgacaa ggacggcaga ctgctgcaag agttcaacaa gatcgaaccc 300
cctctcatct tcgagtgtaa ccaagcttgc agctgctgga gaaactgcaa gaatagagtg 360
gtccagagcg gcatcaaggt gagactgcaa ctgtacagaa ccgccaagat gggatgggga 420
gtgagggctc tgcaaaccat tccccaaggc accttcatct gcgaatacgt gggcgaactg 480
atctccgacg ccgaagctga cgtgagagag gacgacagct atctcttcga tctggacaat 540
aaggacggcg aggtgtactg catcgacgct agatattacg gcaacatctc tagattcatc 600
aaccacctct gcgatcccaa catcattccc gtgagggtgt tcatgctgca ccaagatctg 660
aggttcccta gaatcgcctt cttcagctct agagacatca gaaccggcga ggagctgggc 720
ttcgattacg gcgatagatt ctgggacatc aagtccaagt acttcacatg ccagtgcggc 780
agcgagaagt gtaagcacag cgctgaggcc attgctctgg agcagtctag actggccaga 840
ctggat 846
<210> 69
<211> 7987
<212> DNA
<213> artificial sequence
<220>
<223> Description of artificial sequence: synthetic polynucleotide
<400> 69
aggaaataag agagaaaaga agagtaagaa gaaatataag agccaccatg gcccccaaga 60
agaagcggaa ggtgggcatc cacggcgtgc ccgccgcctc gggcgggggt ggctcaggaa 120
gcgccgctat cgccgaagtg ctgctgaatg ccagatgcga tctgcatgcc gtgaactacc 180
acggcgacac ccctctgcat atcgccgcta gagagagcta ccatgactgt gtgctgctgt 240
ttctgagcag aggcgccaat cccgagctca gaaacaaaga gggcgacacc gcttgggatc 300
tgacacccga gagatccgac gtgtggttcg ctctgcaact gaatagaaaa ctgagactgg 360
gcgtcggcaa tagggccatt agaaccgaga agatcatctg tagggacgtg gctaggggct 420
acgagaacgt gcccatccct tgtgtgaacg gagtggatgg agagccttgc cccgaggatt 480
acaaatacat cagcgagaac tgcgaaacct ccaccatgaa tatcgataga aacattacac 540
acctccagca ctgtacatgc gtggacgatt gcagcagcag caactgtctg tgcggccaac 600
tgagcatcag atgctggtac gacaaggatg gcagactgct gcaagagttc aacaagatcg 660
aaccccctct gatcttcgag tgtaaccaag cttgcagctg ttggaggaac tgcaagaata 720
gggtcgtgca gtccggaatc aaggtgagac tgcagctgta tagaacagct aagatgggat 780
ggggagtcag agctctgcag accatccccc aaggcacatt catctgtgag tacgtcggcg 840
aactcatcag cgacgctgag gccgatgtga gggaggacga cagctatctc ttcgacctcg 900
acaacaagga cggcgaggtg tactgcatcg acgctagata ttacggcaac atcagcagat 960
tcatcaacca cctctgcgac cccaatatca tccccgtgag agtgttcatg ctccatcaag 1020
atctgagatt ccctaggatc gccttcttca gctctagaga cattagaacc ggcgaggagc 1080
tgggattcga ctacggcgac aggttctggg acatcaagag caagtacttc acatgccaat 1140
gcggcagcga gaaatgcaag catagcgccg aggccattgc tctggagcag tctagactgg 1200
ctaggctgga ccctcacccc gagctgctgc ccgaactggg atctctgcct cccgtgaata 1260
ccggaggtgg cggatcggga gacaagaagt acagcatcgg cctggccatc ggcaccaaca 1320
gcgtgggctg ggccgtgatc accgacgagt acaaggtgcc cagcaagaag ttcaaggtgc 1380
tgggcaacac cgaccggcac agcatcaaga agaacctgat cggcgccctg ctgttcgaca 1440
gcggcgagac cgccgaggcc acccggctga agcggaccgc ccggcggcgg tacacccggc 1500
ggaagaaccg gatctgctac ctgcaggaga tcttcagcaa cgagatggcc aaggtggacg 1560
acagcttctt ccaccggctg gaggagagct tcctggtgga ggaggacaag aagcacgagc 1620
ggcaccccat cttcggcaac atcgtggacg aggtggccta ccacgagaag taccccacca 1680
tctaccacct gcggaagaag ctggtggaca gcaccgacaa ggccgacctg cggctgatct 1740
acctggccct ggcccacatg atcaagttcc ggggccactt cctgatcgag ggcgacctga 1800
accccgacaa cagcgacgtg gacaagctgt tcatccagct ggtgcagacc tacaaccagc 1860
tgttcgagga gaaccccatc aacgccagcg gcgtggacgc caaggccatc ctgagcgccc 1920
ggctgagcaa gagccggcgg ctggagaacc tgatcgccca gctgcccggc gagaagaaga 1980
acggcctgtt cggcaacctg atcgccctga gcctgggcct gacccccaac ttcaagagca 2040
acttcgacct ggccgaggac gccaagctgc agctgagcaa ggacacctac gacgacgacc 2100
tggacaacct gctggcccag atcggcgacc agtacgccga cctgttcctg gccgccaaga 2160
acctgagcga cgccatcctg ctgagcgaca tcctgcgggt gaacaccgag atcaccaagg 2220
cccccctgag cgccagcatg atcaagcggt acgacgagca ccaccaggac ctgaccctgc 2280
tgaaggccct ggtgcggcag cagctgcccg agaagtacaa ggagatcttc ttcgaccaga 2340
gcaagaacgg ctacgccggc tacatcgacg gcggcgccag ccaggaggag ttctacaagt 2400
tcatcaagcc catcctggag aagatggacg gcaccgagga gctgctggtg aagctgaacc 2460
gggaggacct gctgcggaag cagcggacct tcgacaacgg cagcatcccc caccagatcc 2520
acctgggcga gctgcacgcc atcctgcggc ggcaggagga cttctacccc ttcctgaagg 2580
acaaccggga gaagatcgag aagatcctga ccttccggat cccctactac gtgggccccc 2640
tggcccgggg caacagccgg ttcgcctgga tgacccggaa atccgaggag accatcaccc 2700
cctggaactt cgaggaggtg gtggacaagg gcgccagcgc ccagagcttc atcgagcgga 2760
tgaccaactt cgacaagaac ctgcccaacg agaaggtgct gcccaagcac agcctgctgt 2820
acgagtactt caccgtgtac aacgagctga ccaaggtgaa gtacgtgacc gagggcatgc 2880
ggaagcccgc cttcctgagc ggcgagcaga agaaggccat cgtggacctg ctgttcaaga 2940
ccaaccggaa ggtgaccgtg aagcagctga aggaggacta cttcaagaag atcgagtgct 3000
tcgacagcgt ggagatcagc ggcgtggagg accggttcaa cgccagcctg ggcacctacc 3060
acgacctgct gaagatcatc aaggacaagg acttcctgga caacgaggag aacgaggaca 3120
tcctggagga catcgtgctg accctgaccc tgttcgagga ccgggagatg atcgaggagc 3180
ggctgaaaac ctacgcccac ctgttcgacg acaaggtgat gaagcagctg aagcggcggc 3240
ggtacaccgg ctggggccgg ctgagccgga agctgatcaa cggcatccgg gacaagcaga 3300
gcggcaagac catcctggac ttcctgaaat ccgacggctt cgccaaccgg aacttcatgc 3360
agctgatcca cgacgacagc ctgaccttca aggaggacat ccagaaggcc caggtgagcg 3420
gccagggcga cagcctgcac gagcacatcg ccaacctggc cggcagcccc gccatcaaga 3480
agggcatcct gcagaccgtg aaggtggtgg acgagctggt gaaggtgatg ggccggcaca 3540
agcccgagaa catcgtgatc gagatggccc gggagaacca gaccacccag aagggccaga 3600
agaacagccg ggagcggatg aagcggatcg aggagggcat caaggagctg ggcagccaga 3660
tcctgaagga gcaccccgtg gagaacaccc agctgcagaa cgagaagctg tacctgtact 3720
acctgcagaa cggccgggac atgtacgtgg accaggagct ggacatcaac cggctgagcg 3780
actacgacgt ggccgccatc gtgccccaga gcttcctgaa ggacgacagc atcgacaaca 3840
aggtgctgac ccggagcgac aaggcccggg gcaagagcga caacgtgccc agcgaggagg 3900
tggtgaagaa gatgaagaac tactggcggc agctgctgaa cgccaagctg atcacccagc 3960
ggaagttcga caacctgacc aaggccgagc ggggcggcct gagcgagctg gacaaggccg 4020
gcttcatcaa gcggcagctg gtggagaccc ggcagatcac caagcacgtg gcccagatcc 4080
tggacagccg gatgaacacc aagtacgacg agaacgacaa gctgatccgg gaggtgaagg 4140
tgatcaccct gaaatccaag ctggtgagcg acttccggaa ggacttccag ttctacaagg 4200
tgcgggagat caacaactac caccacgccc acgacgccta cctgaacgcc gtggtgggca 4260
ccgccctgat caagaagtac cccaagctgg agagcgagtt cgtgtacggc gactacaagg 4320
tgtacgacgt gcggaagatg atcgccaaga gcgagcagga gatcggcaag gccaccgcca 4380
agtacttctt ctacagcaac atcatgaact tcttcaagac cgagatcacc ctggccaacg 4440
gcgagatccg gaagcggccc ctgatcgaga ccaacggcga gaccggcgag atcgtgtggg 4500
acaagggccg ggacttcgcc accgtgcgga aggtgctgag catgccccag gtgaacatcg 4560
tgaagaaaac cgaggtgcag accggcggct tcagcaagga gagcatcctg cccaagcgga 4620
acagcgacaa gctgatcgcc cggaagaagg actgggaccc caagaagtac ggcggcttcg 4680
acagccccac cgtggcctac agcgtgctgg tggtggccaa ggtggagaag ggcaagagca 4740
agaagctgaa atccgtgaag gagctgctgg gcatcaccat catggagcgg agcagcttcg 4800
agaagaaccc catcgacttc ctggaggcca agggctacaa ggaggtgaag aaggacctga 4860
tcatcaagct gcccaagtac agcctgttcg agctggagaa cggccggaag cggatgctgg 4920
ccagcgccgg cgagctgcag aagggcaacg agctggccct gcccagcaag tacgtgaact 4980
tcctgtacct ggccagccac tacgagaagc tgaagggcag ccccgaggac aacgagcaga 5040
agcagctgtt cgtggagcag cacaagcact acctggacga gatcatcgag cagatcagcg 5100
agttcagcaa gcgggtgatc ctggccgacg ccaacctgga caaggtgctg agcgcctaca 5160
acaagcaccg ggacaagccc atccgggagc aggccgagaa catcatccac ctgttcaccc 5220
tgaccaacct gggcgccccc gccgccttca agtacttcga caccaccatc gaccggaagc 5280
ggtacaccag caccaaggag gtgctggacg ccaccctgat ccaccagagc atcaccggcc 5340
tgtacgagac ccggatcgac ctgagccagc tgggcggcga cagcggcggc aagcggcccg 5400
ccgccaccaa gaaggccggc caggccaaga agaagaagtc gggcgggggt ggctcaggac 5460
agaccggcaa aaagtccgaa aagggccccg tgtgctggag gaagagggtc aagagcgagt 5520
acatgaggct gagacagctc aagagattta ggagagccga tgaggtgaag tccatgttct 5580
ccagcaacag acaaaagatt ctggagagga ccgagatcct caaccaagag tggaagcaga 5640
gaagaatcca gcccgtgcac attctgacct ccgtgagctc tctgaggggc acaagagaat 5700
gctccgtcac cagcgatctg gacttcccca cacaagtgat ccccctcaag acactgaacg 5760
ctgtggccag cgtgcccatc atgtatagct ggtcccctct gcaacagaac ttcatggtgg 5820
aggacgagac agtgctgcac aatatcccct acatgggaga tgaggtgctg gaccaagacg 5880
gcacctttat tgaggagctg attaaaaact acgatggcaa ggtgcacggc gatagggagt 5940
gtggcttcat caacgacgag atcttcgtcg agctggtgaa tgctctgggc cagtataatg 6000
acgatgatga cgacgatgac ggcgacgacc ccgaagagag agaggagaag caaaaggatc 6060
tggaggacca tagggacgac aaagagtcta gacctcctag aaagttcccc tccgacaaga 6120
tcttcgaagc catctcctcc atgttccccg acaagggcac agccgaggaa ctgaaggaga 6180
agtataagga actcacagag caacagctgc ccggagctct gcctcccgag tgcaccccta 6240
acatcgacgg ccccaacgcc aagagcgtgc agagggagca atccctccac agcttccata 6300
ccctcttctg cagaagatgc tttaaatacg attgctttct ccatcctttc cacgccacac 6360
ccaacaccta caagaggaag aacaccgaaa ccgctctgga caataaacct tgcggacccc 6420
agtgctacca gcatctggaa ggagccaagg aatttgccgc tgctctgaca gccgagagaa 6480
ttaaaacccc tcccaaaaga cccggcggca gaaggagggg cagactgcct aataacagca 6540
gcagacccag cacccctacc attaacgtgc tggaatccaa ggacaccgac agcgatagag 6600
aggccggcac agaaaccggc ggagagaaca acgacaagga ggaggaggag aagaaagacg 6660
agacatcctc cagcagcgag gctaatagca gatgccagac ccctatcaag atgaaaccta 6720
atatcgagcc ccccgagaat gtggagtgga gcggcgctga ggcctccatg tttagagtgc 6780
tgatcggaac ctactacgac aacttctgcg ctatcgctag actgattggc accaagacat 6840
gcagacaagt gtacgagttc agagtcaagg agagctccat tatcgccccc gcccccgccg 6900
aagatgtgga cacccccccc agaaagaaga aaaggaagca tagactgtgg gccgcccact 6960
gtagaaagat ccagctcaaa aaggacggca gcagcaacca cgtgtacaac tatcagcctt 7020
gtgaccaccc cagacaacct tgtgattcca gctgcccttg cgtgatcgcc cagaacttct 7080
gcgagaagtt ctgtcagtgc agcagcgagt gccaaaatag atttcccgga tgtaggtgca 7140
aagcccagtg caataccaag cagtgccctt gctatctggc cgtgagagag tgcgatcccg 7200
atctgtgtct gacatgtgga gctgccgacc attgggacag caagaatgtg agctgcaaga 7260
actgcagcat ccaaagggga agcaaaaaac atctgctgct cgccccttcc gatgtggccg 7320
gatggggaat ctttatcaag gaccccgtcc agaaaaacga gttcatttcc gagtattgcg 7380
gcgagatcat cagccaagac gaagctgata gaagaggcaa agtgtatgac aaatacatgt 7440
gctccttcct cttcaacctc aataatgatt tcgtggtgga cgccacaagg aagggcaaca 7500
agattagatt cgccaaccac agcgtcaatc ctaactgcta tgccaaggtc atgatggtca 7560
acggcgacca cagaattggc atcttcgcta agagggccat ccagaccggc gaggaactgt 7620
tcttcgacta tagatactcc caagccgacg ctctgaagta cgtgggcatc gagagagaga 7680
tggaaatccc cggaggtggc ggatcgggaa agcggcccgc cgccaccaag aaggccggtc 7740
aggccaagaa gaagaagggc agctacccct acgacgtgcc cgactacgcc tgagcggccg 7800
cttaattaag ctgccttctg cggggcttgc cttctggcca tgcccttctt ctctcccttg 7860
cacctgtacc tcttggtctt tgaataaagc ctgagtagga agtctagaaa aaaaaaaaaa 7920
aaaaaaaaaa aaaaaaaaaa aaaaaaaaaa aaaaaaaaaa aaaaaaaaaa aaaaaaaaaa 7980
aaaaaaa 7987
<210> 70
<211> 2581
<212> PRT
<213> artificial sequence
<220>
<223> Description of artificial sequence: synthetic polypeptide
<400> 70
Met Ala Pro Lys Lys Lys Arg Lys Val Gly Ile His Gly Val Pro Ala
1 5 10 15
Ala Ser Gly Gly Gly Gly Ser Gly Ser Ala Ala Ile Ala Glu Val Leu
20 25 30
Leu Asn Ala Arg Cys Asp Leu His Ala Val Asn Tyr His Gly Asp Thr
35 40 45
Pro Leu His Ile Ala Ala Arg Glu Ser Tyr His Asp Cys Val Leu Leu
50 55 60
Phe Leu Ser Arg Gly Ala Asn Pro Glu Leu Arg Asn Lys Glu Gly Asp
65 70 75 80
Thr Ala Trp Asp Leu Thr Pro Glu Arg Ser Asp Val Trp Phe Ala Leu
85 90 95
Gln Leu Asn Arg Lys Leu Arg Leu Gly Val Gly Asn Arg Ala Ile Arg
100 105 110
Thr Glu Lys Ile Ile Cys Arg Asp Val Ala Arg Gly Tyr Glu Asn Val
115 120 125
Pro Ile Pro Cys Val Asn Gly Val Asp Gly Glu Pro Cys Pro Glu Asp
130 135 140
Tyr Lys Tyr Ile Ser Glu Asn Cys Glu Thr Ser Thr Met Asn Ile Asp
145 150 155 160
Arg Asn Ile Thr His Leu Gln His Cys Thr Cys Val Asp Asp Cys Ser
165 170 175
Ser Ser Asn Cys Leu Cys Gly Gln Leu Ser Ile Arg Cys Trp Tyr Asp
180 185 190
Lys Asp Gly Arg Leu Leu Gln Glu Phe Asn Lys Ile Glu Pro Pro Leu
195 200 205
Ile Phe Glu Cys Asn Gln Ala Cys Ser Cys Trp Arg Asn Cys Lys Asn
210 215 220
Arg Val Val Gln Ser Gly Ile Lys Val Arg Leu Gln Leu Tyr Arg Thr
225 230 235 240
Ala Lys Met Gly Trp Gly Val Arg Ala Leu Gln Thr Ile Pro Gln Gly
245 250 255
Thr Phe Ile Cys Glu Tyr Val Gly Glu Leu Ile Ser Asp Ala Glu Ala
260 265 270
Asp Val Arg Glu Asp Asp Ser Tyr Leu Phe Asp Leu Asp Asn Lys Asp
275 280 285
Gly Glu Val Tyr Cys Ile Asp Ala Arg Tyr Tyr Gly Asn Ile Ser Arg
290 295 300
Phe Ile Asn His Leu Cys Asp Pro Asn Ile Ile Pro Val Arg Val Phe
305 310 315 320
Met Leu His Gln Asp Leu Arg Phe Pro Arg Ile Ala Phe Phe Ser Ser
325 330 335
Arg Asp Ile Arg Thr Gly Glu Glu Leu Gly Phe Asp Tyr Gly Asp Arg
340 345 350
Phe Trp Asp Ile Lys Ser Lys Tyr Phe Thr Cys Gln Cys Gly Ser Glu
355 360 365
Lys Cys Lys His Ser Ala Glu Ala Ile Ala Leu Glu Gln Ser Arg Leu
370 375 380
Ala Arg Leu Asp Pro His Pro Glu Leu Leu Pro Glu Leu Gly Ser Leu
385 390 395 400
Pro Pro Val Asn Thr Gly Gly Gly Gly Ser Gly Asp Lys Lys Tyr Ser
405 410 415
Ile Gly Leu Ala Ile Gly Thr Asn Ser Val Gly Trp Ala Val Ile Thr
420 425 430
Asp Glu Tyr Lys Val Pro Ser Lys Lys Phe Lys Val Leu Gly Asn Thr
435 440 445
Asp Arg His Ser Ile Lys Lys Asn Leu Ile Gly Ala Leu Leu Phe Asp
450 455 460
Ser Gly Glu Thr Ala Glu Ala Thr Arg Leu Lys Arg Thr Ala Arg Arg
465 470 475 480
Arg Tyr Thr Arg Arg Lys Asn Arg Ile Cys Tyr Leu Gln Glu Ile Phe
485 490 495
Ser Asn Glu Met Ala Lys Val Asp Asp Ser Phe Phe His Arg Leu Glu
500 505 510
Glu Ser Phe Leu Val Glu Glu Asp Lys Lys His Glu Arg His Pro Ile
515 520 525
Phe Gly Asn Ile Val Asp Glu Val Ala Tyr His Glu Lys Tyr Pro Thr
530 535 540
Ile Tyr His Leu Arg Lys Lys Leu Val Asp Ser Thr Asp Lys Ala Asp
545 550 555 560
Leu Arg Leu Ile Tyr Leu Ala Leu Ala His Met Ile Lys Phe Arg Gly
565 570 575
His Phe Leu Ile Glu Gly Asp Leu Asn Pro Asp Asn Ser Asp Val Asp
580 585 590
Lys Leu Phe Ile Gln Leu Val Gln Thr Tyr Asn Gln Leu Phe Glu Glu
595 600 605
Asn Pro Ile Asn Ala Ser Gly Val Asp Ala Lys Ala Ile Leu Ser Ala
610 615 620
Arg Leu Ser Lys Ser Arg Arg Leu Glu Asn Leu Ile Ala Gln Leu Pro
625 630 635 640
Gly Glu Lys Lys Asn Gly Leu Phe Gly Asn Leu Ile Ala Leu Ser Leu
645 650 655
Gly Leu Thr Pro Asn Phe Lys Ser Asn Phe Asp Leu Ala Glu Asp Ala
660 665 670
Lys Leu Gln Leu Ser Lys Asp Thr Tyr Asp Asp Asp Leu Asp Asn Leu
675 680 685
Leu Ala Gln Ile Gly Asp Gln Tyr Ala Asp Leu Phe Leu Ala Ala Lys
690 695 700
Asn Leu Ser Asp Ala Ile Leu Leu Ser Asp Ile Leu Arg Val Asn Thr
705 710 715 720
Glu Ile Thr Lys Ala Pro Leu Ser Ala Ser Met Ile Lys Arg Tyr Asp
725 730 735
Glu His His Gln Asp Leu Thr Leu Leu Lys Ala Leu Val Arg Gln Gln
740 745 750
Leu Pro Glu Lys Tyr Lys Glu Ile Phe Phe Asp Gln Ser Lys Asn Gly
755 760 765
Tyr Ala Gly Tyr Ile Asp Gly Gly Ala Ser Gln Glu Glu Phe Tyr Lys
770 775 780
Phe Ile Lys Pro Ile Leu Glu Lys Met Asp Gly Thr Glu Glu Leu Leu
785 790 795 800
Val Lys Leu Asn Arg Glu Asp Leu Leu Arg Lys Gln Arg Thr Phe Asp
805 810 815
Asn Gly Ser Ile Pro His Gln Ile His Leu Gly Glu Leu His Ala Ile
820 825 830
Leu Arg Arg Gln Glu Asp Phe Tyr Pro Phe Leu Lys Asp Asn Arg Glu
835 840 845
Lys Ile Glu Lys Ile Leu Thr Phe Arg Ile Pro Tyr Tyr Val Gly Pro
850 855 860
Leu Ala Arg Gly Asn Ser Arg Phe Ala Trp Met Thr Arg Lys Ser Glu
865 870 875 880
Glu Thr Ile Thr Pro Trp Asn Phe Glu Glu Val Val Asp Lys Gly Ala
885 890 895
Ser Ala Gln Ser Phe Ile Glu Arg Met Thr Asn Phe Asp Lys Asn Leu
900 905 910
Pro Asn Glu Lys Val Leu Pro Lys His Ser Leu Leu Tyr Glu Tyr Phe
915 920 925
Thr Val Tyr Asn Glu Leu Thr Lys Val Lys Tyr Val Thr Glu Gly Met
930 935 940
Arg Lys Pro Ala Phe Leu Ser Gly Glu Gln Lys Lys Ala Ile Val Asp
945 950 955 960
Leu Leu Phe Lys Thr Asn Arg Lys Val Thr Val Lys Gln Leu Lys Glu
965 970 975
Asp Tyr Phe Lys Lys Ile Glu Cys Phe Asp Ser Val Glu Ile Ser Gly
980 985 990
Val Glu Asp Arg Phe Asn Ala Ser Leu Gly Thr Tyr His Asp Leu Leu
995 1000 1005
Lys Ile Ile Lys Asp Lys Asp Phe Leu Asp Asn Glu Glu Asn Glu
1010 1015 1020
Asp Ile Leu Glu Asp Ile Val Leu Thr Leu Thr Leu Phe Glu Asp
1025 1030 1035
Arg Glu Met Ile Glu Glu Arg Leu Lys Thr Tyr Ala His Leu Phe
1040 1045 1050
Asp Asp Lys Val Met Lys Gln Leu Lys Arg Arg Arg Tyr Thr Gly
1055 1060 1065
Trp Gly Arg Leu Ser Arg Lys Leu Ile Asn Gly Ile Arg Asp Lys
1070 1075 1080
Gln Ser Gly Lys Thr Ile Leu Asp Phe Leu Lys Ser Asp Gly Phe
1085 1090 1095
Ala Asn Arg Asn Phe Met Gln Leu Ile His Asp Asp Ser Leu Thr
1100 1105 1110
Phe Lys Glu Asp Ile Gln Lys Ala Gln Val Ser Gly Gln Gly Asp
1115 1120 1125
Ser Leu His Glu His Ile Ala Asn Leu Ala Gly Ser Pro Ala Ile
1130 1135 1140
Lys Lys Gly Ile Leu Gln Thr Val Lys Val Val Asp Glu Leu Val
1145 1150 1155
Lys Val Met Gly Arg His Lys Pro Glu Asn Ile Val Ile Glu Met
1160 1165 1170
Ala Arg Glu Asn Gln Thr Thr Gln Lys Gly Gln Lys Asn Ser Arg
1175 1180 1185
Glu Arg Met Lys Arg Ile Glu Glu Gly Ile Lys Glu Leu Gly Ser
1190 1195 1200
Gln Ile Leu Lys Glu His Pro Val Glu Asn Thr Gln Leu Gln Asn
1205 1210 1215
Glu Lys Leu Tyr Leu Tyr Tyr Leu Gln Asn Gly Arg Asp Met Tyr
1220 1225 1230
Val Asp Gln Glu Leu Asp Ile Asn Arg Leu Ser Asp Tyr Asp Val
1235 1240 1245
Ala Ala Ile Val Pro Gln Ser Phe Leu Lys Asp Asp Ser Ile Asp
1250 1255 1260
Asn Lys Val Leu Thr Arg Ser Asp Lys Ala Arg Gly Lys Ser Asp
1265 1270 1275
Asn Val Pro Ser Glu Glu Val Val Lys Lys Met Lys Asn Tyr Trp
1280 1285 1290
Arg Gln Leu Leu Asn Ala Lys Leu Ile Thr Gln Arg Lys Phe Asp
1295 1300 1305
Asn Leu Thr Lys Ala Glu Arg Gly Gly Leu Ser Glu Leu Asp Lys
1310 1315 1320
Ala Gly Phe Ile Lys Arg Gln Leu Val Glu Thr Arg Gln Ile Thr
1325 1330 1335
Lys His Val Ala Gln Ile Leu Asp Ser Arg Met Asn Thr Lys Tyr
1340 1345 1350
Asp Glu Asn Asp Lys Leu Ile Arg Glu Val Lys Val Ile Thr Leu
1355 1360 1365
Lys Ser Lys Leu Val Ser Asp Phe Arg Lys Asp Phe Gln Phe Tyr
1370 1375 1380
Lys Val Arg Glu Ile Asn Asn Tyr His His Ala His Asp Ala Tyr
1385 1390 1395
Leu Asn Ala Val Val Gly Thr Ala Leu Ile Lys Lys Tyr Pro Lys
1400 1405 1410
Leu Glu Ser Glu Phe Val Tyr Gly Asp Tyr Lys Val Tyr Asp Val
1415 1420 1425
Arg Lys Met Ile Ala Lys Ser Glu Gln Glu Ile Gly Lys Ala Thr
1430 1435 1440
Ala Lys Tyr Phe Phe Tyr Ser Asn Ile Met Asn Phe Phe Lys Thr
1445 1450 1455
Glu Ile Thr Leu Ala Asn Gly Glu Ile Arg Lys Arg Pro Leu Ile
1460 1465 1470
Glu Thr Asn Gly Glu Thr Gly Glu Ile Val Trp Asp Lys Gly Arg
1475 1480 1485
Asp Phe Ala Thr Val Arg Lys Val Leu Ser Met Pro Gln Val Asn
1490 1495 1500
Ile Val Lys Lys Thr Glu Val Gln Thr Gly Gly Phe Ser Lys Glu
1505 1510 1515
Ser Ile Leu Pro Lys Arg Asn Ser Asp Lys Leu Ile Ala Arg Lys
1520 1525 1530
Lys Asp Trp Asp Pro Lys Lys Tyr Gly Gly Phe Asp Ser Pro Thr
1535 1540 1545
Val Ala Tyr Ser Val Leu Val Val Ala Lys Val Glu Lys Gly Lys
1550 1555 1560
Ser Lys Lys Leu Lys Ser Val Lys Glu Leu Leu Gly Ile Thr Ile
1565 1570 1575
Met Glu Arg Ser Ser Phe Glu Lys Asn Pro Ile Asp Phe Leu Glu
1580 1585 1590
Ala Lys Gly Tyr Lys Glu Val Lys Lys Asp Leu Ile Ile Lys Leu
1595 1600 1605
Pro Lys Tyr Ser Leu Phe Glu Leu Glu Asn Gly Arg Lys Arg Met
1610 1615 1620
Leu Ala Ser Ala Gly Glu Leu Gln Lys Gly Asn Glu Leu Ala Leu
1625 1630 1635
Pro Ser Lys Tyr Val Asn Phe Leu Tyr Leu Ala Ser His Tyr Glu
1640 1645 1650
Lys Leu Lys Gly Ser Pro Glu Asp Asn Glu Gln Lys Gln Leu Phe
1655 1660 1665
Val Glu Gln His Lys His Tyr Leu Asp Glu Ile Ile Glu Gln Ile
1670 1675 1680
Ser Glu Phe Ser Lys Arg Val Ile Leu Ala Asp Ala Asn Leu Asp
1685 1690 1695
Lys Val Leu Ser Ala Tyr Asn Lys His Arg Asp Lys Pro Ile Arg
1700 1705 1710
Glu Gln Ala Glu Asn Ile Ile His Leu Phe Thr Leu Thr Asn Leu
1715 1720 1725
Gly Ala Pro Ala Ala Phe Lys Tyr Phe Asp Thr Thr Ile Asp Arg
1730 1735 1740
Lys Arg Tyr Thr Ser Thr Lys Glu Val Leu Asp Ala Thr Leu Ile
1745 1750 1755
His Gln Ser Ile Thr Gly Leu Tyr Glu Thr Arg Ile Asp Leu Ser
1760 1765 1770
Gln Leu Gly Gly Asp Ser Gly Gly Lys Arg Pro Ala Ala Thr Lys
1775 1780 1785
Lys Ala Gly Gln Ala Lys Lys Lys Lys Ser Gly Gly Gly Gly Ser
1790 1795 1800
Gly Gln Thr Gly Lys Lys Ser Glu Lys Gly Pro Val Cys Trp Arg
1805 1810 1815
Lys Arg Val Lys Ser Glu Tyr Met Arg Leu Arg Gln Leu Lys Arg
1820 1825 1830
Phe Arg Arg Ala Asp Glu Val Lys Ser Met Phe Ser Ser Asn Arg
1835 1840 1845
Gln Lys Ile Leu Glu Arg Thr Glu Ile Leu Asn Gln Glu Trp Lys
1850 1855 1860
Gln Arg Arg Ile Gln Pro Val His Ile Leu Thr Ser Val Ser Ser
1865 1870 1875
Leu Arg Gly Thr Arg Glu Cys Ser Val Thr Ser Asp Leu Asp Phe
1880 1885 1890
Pro Thr Gln Val Ile Pro Leu Lys Thr Leu Asn Ala Val Ala Ser
1895 1900 1905
Val Pro Ile Met Tyr Ser Trp Ser Pro Leu Gln Gln Asn Phe Met
1910 1915 1920
Val Glu Asp Glu Thr Val Leu His Asn Ile Pro Tyr Met Gly Asp
1925 1930 1935
Glu Val Leu Asp Gln Asp Gly Thr Phe Ile Glu Glu Leu Ile Lys
1940 1945 1950
Asn Tyr Asp Gly Lys Val His Gly Asp Arg Glu Cys Gly Phe Ile
1955 1960 1965
Asn Asp Glu Ile Phe Val Glu Leu Val Asn Ala Leu Gly Gln Tyr
1970 1975 1980
Asn Asp Asp Asp Asp Asp Asp Asp Gly Asp Asp Pro Glu Glu Arg
1985 1990 1995
Glu Glu Lys Gln Lys Asp Leu Glu Asp His Arg Asp Asp Lys Glu
2000 2005 2010
Ser Arg Pro Pro Arg Lys Phe Pro Ser Asp Lys Ile Phe Glu Ala
2015 2020 2025
Ile Ser Ser Met Phe Pro Asp Lys Gly Thr Ala Glu Glu Leu Lys
2030 2035 2040
Glu Lys Tyr Lys Glu Leu Thr Glu Gln Gln Leu Pro Gly Ala Leu
2045 2050 2055
Pro Pro Glu Cys Thr Pro Asn Ile Asp Gly Pro Asn Ala Lys Ser
2060 2065 2070
Val Gln Arg Glu Gln Ser Leu His Ser Phe His Thr Leu Phe Cys
2075 2080 2085
Arg Arg Cys Phe Lys Tyr Asp Cys Phe Leu His Pro Phe His Ala
2090 2095 2100
Thr Pro Asn Thr Tyr Lys Arg Lys Asn Thr Glu Thr Ala Leu Asp
2105 2110 2115
Asn Lys Pro Cys Gly Pro Gln Cys Tyr Gln His Leu Glu Gly Ala
2120 2125 2130
Lys Glu Phe Ala Ala Ala Leu Thr Ala Glu Arg Ile Lys Thr Pro
2135 2140 2145
Pro Lys Arg Pro Gly Gly Arg Arg Arg Gly Arg Leu Pro Asn Asn
2150 2155 2160
Ser Ser Arg Pro Ser Thr Pro Thr Ile Asn Val Leu Glu Ser Lys
2165 2170 2175
Asp Thr Asp Ser Asp Arg Glu Ala Gly Thr Glu Thr Gly Gly Glu
2180 2185 2190
Asn Asn Asp Lys Glu Glu Glu Glu Lys Lys Asp Glu Thr Ser Ser
2195 2200 2205
Ser Ser Glu Ala Asn Ser Arg Cys Gln Thr Pro Ile Lys Met Lys
2210 2215 2220
Pro Asn Ile Glu Pro Pro Glu Asn Val Glu Trp Ser Gly Ala Glu
2225 2230 2235
Ala Ser Met Phe Arg Val Leu Ile Gly Thr Tyr Tyr Asp Asn Phe
2240 2245 2250
Cys Ala Ile Ala Arg Leu Ile Gly Thr Lys Thr Cys Arg Gln Val
2255 2260 2265
Tyr Glu Phe Arg Val Lys Glu Ser Ser Ile Ile Ala Pro Ala Pro
2270 2275 2280
Ala Glu Asp Val Asp Thr Pro Pro Arg Lys Lys Lys Arg Lys His
2285 2290 2295
Arg Leu Trp Ala Ala His Cys Arg Lys Ile Gln Leu Lys Lys Asp
2300 2305 2310
Gly Ser Ser Asn His Val Tyr Asn Tyr Gln Pro Cys Asp His Pro
2315 2320 2325
Arg Gln Pro Cys Asp Ser Ser Cys Pro Cys Val Ile Ala Gln Asn
2330 2335 2340
Phe Cys Glu Lys Phe Cys Gln Cys Ser Ser Glu Cys Gln Asn Arg
2345 2350 2355
Phe Pro Gly Cys Arg Cys Lys Ala Gln Cys Asn Thr Lys Gln Cys
2360 2365 2370
Pro Cys Tyr Leu Ala Val Arg Glu Cys Asp Pro Asp Leu Cys Leu
2375 2380 2385
Thr Cys Gly Ala Ala Asp His Trp Asp Ser Lys Asn Val Ser Cys
2390 2395 2400
Lys Asn Cys Ser Ile Gln Arg Gly Ser Lys Lys His Leu Leu Leu
2405 2410 2415
Ala Pro Ser Asp Val Ala Gly Trp Gly Ile Phe Ile Lys Asp Pro
2420 2425 2430
Val Gln Lys Asn Glu Phe Ile Ser Glu Tyr Cys Gly Glu Ile Ile
2435 2440 2445
Ser Gln Asp Glu Ala Asp Arg Arg Gly Lys Val Tyr Asp Lys Tyr
2450 2455 2460
Met Cys Ser Phe Leu Phe Asn Leu Asn Asn Asp Phe Val Val Asp
2465 2470 2475
Ala Thr Arg Lys Gly Asn Lys Ile Arg Phe Ala Asn His Ser Val
2480 2485 2490
Asn Pro Asn Cys Tyr Ala Lys Val Met Met Val Asn Gly Asp His
2495 2500 2505
Arg Ile Gly Ile Phe Ala Lys Arg Ala Ile Gln Thr Gly Glu Glu
2510 2515 2520
Leu Phe Phe Asp Tyr Arg Tyr Ser Gln Ala Asp Ala Leu Lys Tyr
2525 2530 2535
Val Gly Ile Glu Arg Glu Met Glu Ile Pro Gly Gly Gly Gly Ser
2540 2545 2550
Gly Lys Arg Pro Ala Ala Thr Lys Lys Ala Gly Gln Ala Lys Lys
2555 2560 2565
Lys Lys Gly Ser Tyr Pro Tyr Asp Val Pro Asp Tyr Ala
2570 2575 2580
<210> 71
<211> 6040
<212> DNA
<213> Artificial Sequence
<220>
<223> Description of Artificial Sequence: Synthetic polynucleotide
<400> 71
aggaaataag agagaaaaga agagtaagaa gaaatataag agccaccatg gcccccaaga 60
agaagcggaa ggtgggcatc cacggcgtgc ccgccgcctc gggcgggggt ggctcaggaa 120
gcgccgctat cgccgaagtg ctgctgaatg ccagatgcga tctgcatgcc gtgaactacc 180
acggcgacac ccctctgcat atcgccgcta gagagagcta ccatgactgt gtgctgctgt 240
ttctgagcag aggcgccaat cccgagctca gaaacaaaga gggcgacacc gcttgggatc 300
tgacacccga gagatccgac gtgtggttcg ctctgcaact gaatagaaaa ctgagactgg 360
gcgtcggcaa tagggccatt agaaccgaga agatcatctg tagggacgtg gctaggggct 420
acgagaacgt gcccatccct tgtgtgaacg gagtggatgg agagccttgc cccgaggatt 480
acaaatacat cagcgagaac tgcgaaacct ccaccatgaa tatcgataga aacattacac 540
acctccagca ctgtacatgc gtggacgatt gcagcagcag caactgtctg tgcggccaac 600
tgagcatcag atgctggtac gacaaggatg gcagactgct gcaagagttc aacaagatcg 660
aaccccctct gatcttcgag tgtaaccaag cttgcagctg ttggaggaac tgcaagaata 720
gggtcgtgca gtccggaatc aaggtgagac tgcagctgta tagaacagct aagatgggat 780
ggggagtcag agctctgcag accatccccc aaggcacatt catctgtgag tacgtcggcg 840
aactcatcag cgacgctgag gccgatgtga gggaggacga cagctatctc ttcgacctcg 900
acaacaagga cggcgaggtg tactgcatcg acgctagata ttacggcaac atcagcagat 960
tcatcaacca cctctgcgac cccaatatca tccccgtgag agtgttcatg ctccatcaag 1020
atctgagatt ccctaggatc gccttcttca gctctagaga cattagaacc ggcgaggagc 1080
tgggattcga ctacggcgac aggttctggg acatcaagag caagtacttc acatgccaat 1140
gcggcagcga gaaatgcaag catagcgccg aggccattgc tctggagcag tctagactgg 1200
ctaggctgga ccctcacccc gagctgctgc ccgaactggg atctctgcct cccgtgaata 1260
ccggaggtgg cggatcggga gacaagaagt acagcatcgg cctggccatc ggcaccaaca 1320
gcgtgggctg ggccgtgatc accgacgagt acaaggtgcc cagcaagaag ttcaaggtgc 1380
tgggcaacac cgaccggcac agcatcaaga agaacctgat cggcgccctg ctgttcgaca 1440
gcggcgagac cgccgaggcc acccggctga agcggaccgc ccggcggcgg tacacccggc 1500
ggaagaaccg gatctgctac ctgcaggaga tcttcagcaa cgagatggcc aaggtggacg 1560
acagcttctt ccaccggctg gaggagagct tcctggtgga ggaggacaag aagcacgagc 1620
ggcaccccat cttcggcaac atcgtggacg aggtggccta ccacgagaag taccccacca 1680
tctaccacct gcggaagaag ctggtggaca gcaccgacaa ggccgacctg cggctgatct 1740
acctggccct ggcccacatg atcaagttcc ggggccactt cctgatcgag ggcgacctga 1800
accccgacaa cagcgacgtg gacaagctgt tcatccagct ggtgcagacc tacaaccagc 1860
tgttcgagga gaaccccatc aacgccagcg gcgtggacgc caaggccatc ctgagcgccc 1920
ggctgagcaa gagccggcgg ctggagaacc tgatcgccca gctgcccggc gagaagaaga 1980
acggcctgtt cggcaacctg atcgccctga gcctgggcct gacccccaac ttcaagagca 2040
acttcgacct ggccgaggac gccaagctgc agctgagcaa ggacacctac gacgacgacc 2100
tggacaacct gctggcccag atcggcgacc agtacgccga cctgttcctg gccgccaaga 2160
acctgagcga cgccatcctg ctgagcgaca tcctgcgggt gaacaccgag atcaccaagg 2220
cccccctgag cgccagcatg atcaagcggt acgacgagca ccaccaggac ctgaccctgc 2280
tgaaggccct ggtgcggcag cagctgcccg agaagtacaa ggagatcttc ttcgaccaga 2340
gcaagaacgg ctacgccggc tacatcgacg gcggcgccag ccaggaggag ttctacaagt 2400
tcatcaagcc catcctggag aagatggacg gcaccgagga gctgctggtg aagctgaacc 2460
gggaggacct gctgcggaag cagcggacct tcgacaacgg cagcatcccc caccagatcc 2520
acctgggcga gctgcacgcc atcctgcggc ggcaggagga cttctacccc ttcctgaagg 2580
acaaccggga gaagatcgag aagatcctga ccttccggat cccctactac gtgggccccc 2640
tggcccgggg caacagccgg ttcgcctgga tgacccggaa atccgaggag accatcaccc 2700
cctggaactt cgaggaggtg gtggacaagg gcgccagcgc ccagagcttc atcgagcgga 2760
tgaccaactt cgacaagaac ctgcccaacg agaaggtgct gcccaagcac agcctgctgt 2820
acgagtactt caccgtgtac aacgagctga ccaaggtgaa gtacgtgacc gagggcatgc 2880
ggaagcccgc cttcctgagc ggcgagcaga agaaggccat cgtggacctg ctgttcaaga 2940
ccaaccggaa ggtgaccgtg aagcagctga aggaggacta cttcaagaag atcgagtgct 3000
tcgacagcgt ggagatcagc ggcgtggagg accggttcaa cgccagcctg ggcacctacc 3060
acgacctgct gaagatcatc aaggacaagg acttcctgga caacgaggag aacgaggaca 3120
tcctggagga catcgtgctg accctgaccc tgttcgagga ccgggagatg atcgaggagc 3180
ggctgaaaac ctacgcccac ctgttcgacg acaaggtgat gaagcagctg aagcggcggc 3240
ggtacaccgg ctggggccgg ctgagccgga agctgatcaa cggcatccgg gacaagcaga 3300
gcggcaagac catcctggac ttcctgaaat ccgacggctt cgccaaccgg aacttcatgc 3360
agctgatcca cgacgacagc ctgaccttca aggaggacat ccagaaggcc caggtgagcg 3420
gccagggcga cagcctgcac gagcacatcg ccaacctggc cggcagcccc gccatcaaga 3480
agggcatcct gcagaccgtg aaggtggtgg acgagctggt gaaggtgatg ggccggcaca 3540
agcccgagaa catcgtgatc gagatggccc gggagaacca gaccacccag aagggccaga 3600
agaacagccg ggagcggatg aagcggatcg aggagggcat caaggagctg ggcagccaga 3660
tcctgaagga gcaccccgtg gagaacaccc agctgcagaa cgagaagctg tacctgtact 3720
acctgcagaa cggccgggac atgtacgtgg accaggagct ggacatcaac cggctgagcg 3780
actacgacgt ggccgccatc gtgccccaga gcttcctgaa ggacgacagc atcgacaaca 3840
aggtgctgac ccggagcgac aaggcccggg gcaagagcga caacgtgccc agcgaggagg 3900
tggtgaagaa gatgaagaac tactggcggc agctgctgaa cgccaagctg atcacccagc 3960
ggaagttcga caacctgacc aaggccgagc ggggcggcct gagcgagctg gacaaggccg 4020
gcttcatcaa gcggcagctg gtggagaccc ggcagatcac caagcacgtg gcccagatcc 4080
tggacagccg gatgaacacc aagtacgacg agaacgacaa gctgatccgg gaggtgaagg 4140
tgatcaccct gaaatccaag ctggtgagcg acttccggaa ggacttccag ttctacaagg 4200
tgcgggagat caacaactac caccacgccc acgacgccta cctgaacgcc gtggtgggca 4260
ccgccctgat caagaagtac cccaagctgg agagcgagtt cgtgtacggc gactacaagg 4320
tgtacgacgt gcggaagatg atcgccaaga gcgagcagga gatcggcaag gccaccgcca 4380
agtacttctt ctacagcaac atcatgaact tcttcaagac cgagatcacc ctggccaacg 4440
gcgagatccg gaagcggccc ctgatcgaga ccaacggcga gaccggcgag atcgtgtggg 4500
acaagggccg ggacttcgcc accgtgcgga aggtgctgag catgccccag gtgaacatcg 4560
tgaagaaaac cgaggtgcag accggcggct tcagcaagga gagcatcctg cccaagcgga 4620
acagcgacaa gctgatcgcc cggaagaagg actgggaccc caagaagtac ggcggcttcg 4680
acagccccac cgtggcctac agcgtgctgg tggtggccaa ggtggagaag ggcaagagca 4740
agaagctgaa atccgtgaag gagctgctgg gcatcaccat catggagcgg agcagcttcg 4800
agaagaaccc catcgacttc ctggaggcca agggctacaa ggaggtgaag aaggacctga 4860
tcatcaagct gcccaagtac agcctgttcg agctggagaa cggccggaag cggatgctgg 4920
ccagcgccgg cgagctgcag aagggcaacg agctggccct gcccagcaag tacgtgaact 4980
tcctgtacct ggccagccac tacgagaagc tgaagggcag ccccgaggac aacgagcaga 5040
agcagctgtt cgtggagcag cacaagcact acctggacga gatcatcgag cagatcagcg 5100
agttcagcaa gcgggtgatc ctggccgacg ccaacctgga caaggtgctg agcgcctaca 5160
acaagcaccg ggacaagccc atccgggagc aggccgagaa catcatccac ctgttcaccc 5220
tgaccaacct gggcgccccc gccgccttca agtacttcga caccaccatc gaccggaagc 5280
ggtacaccag caccaaggag gtgctggacg ccaccctgat ccaccagagc atcaccggcc 5340
tgtacgagac ccggatcgac ctgagccagc tgggcggcga cagcggcggc aagcggcccg 5400
ccgccaccaa gaaggccggc caggccaaga agaagaagtc gggcgggggt ggctcagacg 5460
ctaagtctct gaccgcttgg agcagaacac tggtcacctt caaggacgtg ttcgtcgact 5520
tcacaagaga ggagtggaaa ctgctggaca ccgcccagca gatcctctat agaaacgtca 5580
tgctggagaa ctacaagaat ctggtgtctc tgggctacca gctgaccaag cccgacgtga 5640
ttctgaggct ggagaagggc gaggagcctt ggctggtgga gagagagatc caccaagaaa 5700
cccaccccga cagcgaaacc gccttcgaga tcaagagcag cgtgggaggt ggcggatcgg 5760
gaaagcggcc cgccgccacc aagaaggccg gtcaggccaa gaagaagaag ggcagctacc 5820
cctacgacgt gcccgactac gcctgagcgg ccgcttaatt aagctgcctt ctgcggggct 5880
tgccttctgg ccatgccctt cttctctccc ttgcacctgt acctcttggt ctttgaataa 5940
agcctgagta ggaagtctag aaaaaaaaaa aaaaaaaaaa aaaaaaaaaa aaaaaaaaaa 6000
aaaaaaaaaa aaaaaaaaaa aaaaaaaaaa aaaaaaaaaa 6040
<210> 72
<211> 1932
<212> PRT
<213> Artificial Sequence
<220>
<223> Description of Artificial Sequence: Synthetic polypeptide
<400> 72
Met Ala Pro Lys Lys Lys Arg Lys Val Gly Ile His Gly Val Pro Ala
1 5 10 15
Ala Ser Gly Gly Gly Gly Ser Gly Ser Ala Ala Ile Ala Glu Val Leu
20 25 30
Leu Asn Ala Arg Cys Asp Leu His Ala Val Asn Tyr His Gly Asp Thr
35 40 45
Pro Leu His Ile Ala Ala Arg Glu Ser Tyr His Asp Cys Val Leu Leu
50 55 60
Phe Leu Ser Arg Gly Ala Asn Pro Glu Leu Arg Asn Lys Glu Gly Asp
65 70 75 80
Thr Ala Trp Asp Leu Thr Pro Glu Arg Ser Asp Val Trp Phe Ala Leu
85 90 95
Gln Leu Asn Arg Lys Leu Arg Leu Gly Val Gly Asn Arg Ala Ile Arg
100 105 110
Thr Glu Lys Ile Ile Cys Arg Asp Val Ala Arg Gly Tyr Glu Asn Val
115 120 125
Pro Ile Pro Cys Val Asn Gly Val Asp Gly Glu Pro Cys Pro Glu Asp
130 135 140
Tyr Lys Tyr Ile Ser Glu Asn Cys Glu Thr Ser Thr Met Asn Ile Asp
145 150 155 160
Arg Asn Ile Thr His Leu Gln His Cys Thr Cys Val Asp Asp Cys Ser
165 170 175
Ser Ser Asn Cys Leu Cys Gly Gln Leu Ser Ile Arg Cys Trp Tyr Asp
180 185 190
Lys Asp Gly Arg Leu Leu Gln Glu Phe Asn Lys Ile Glu Pro Pro Leu
195 200 205
Ile Phe Glu Cys Asn Gln Ala Cys Ser Cys Trp Arg Asn Cys Lys Asn
210 215 220
Arg Val Val Gln Ser Gly Ile Lys Val Arg Leu Gln Leu Tyr Arg Thr
225 230 235 240
Ala Lys Met Gly Trp Gly Val Arg Ala Leu Gln Thr Ile Pro Gln Gly
245 250 255
Thr Phe Ile Cys Glu Tyr Val Gly Glu Leu Ile Ser Asp Ala Glu Ala
260 265 270
Asp Val Arg Glu Asp Asp Ser Tyr Leu Phe Asp Leu Asp Asn Lys Asp
275 280 285
Gly Glu Val Tyr Cys Ile Asp Ala Arg Tyr Tyr Gly Asn Ile Ser Arg
290 295 300
Phe Ile Asn His Leu Cys Asp Pro Asn Ile Ile Pro Val Arg Val Phe
305 310 315 320
Met Leu His Gln Asp Leu Arg Phe Pro Arg Ile Ala Phe Phe Ser Ser
325 330 335
Arg Asp Ile Arg Thr Gly Glu Glu Leu Gly Phe Asp Tyr Gly Asp Arg
340 345 350
Phe Trp Asp Ile Lys Ser Lys Tyr Phe Thr Cys Gln Cys Gly Ser Glu
355 360 365
Lys Cys Lys His Ser Ala Glu Ala Ile Ala Leu Glu Gln Ser Arg Leu
370 375 380
Ala Arg Leu Asp Pro His Pro Glu Leu Leu Pro Glu Leu Gly Ser Leu
385 390 395 400
Pro Pro Val Asn Thr Gly Gly Gly Gly Ser Gly Asp Lys Lys Tyr Ser
405 410 415
Ile Gly Leu Ala Ile Gly Thr Asn Ser Val Gly Trp Ala Val Ile Thr
420 425 430
Asp Glu Tyr Lys Val Pro Ser Lys Lys Phe Lys Val Leu Gly Asn Thr
435 440 445
Asp Arg His Ser Ile Lys Lys Asn Leu Ile Gly Ala Leu Leu Phe Asp
450 455 460
Ser Gly Glu Thr Ala Glu Ala Thr Arg Leu Lys Arg Thr Ala Arg Arg
465 470 475 480
Arg Tyr Thr Arg Arg Lys Asn Arg Ile Cys Tyr Leu Gln Glu Ile Phe
485 490 495
Ser Asn Glu Met Ala Lys Val Asp Asp Ser Phe Phe His Arg Leu Glu
500 505 510
Glu Ser Phe Leu Val Glu Glu Asp Lys Lys His Glu Arg His Pro Ile
515 520 525
Phe Gly Asn Ile Val Asp Glu Val Ala Tyr His Glu Lys Tyr Pro Thr
530 535 540
Ile Tyr His Leu Arg Lys Lys Leu Val Asp Ser Thr Asp Lys Ala Asp
545 550 555 560
Leu Arg Leu Ile Tyr Leu Ala Leu Ala His Met Ile Lys Phe Arg Gly
565 570 575
His Phe Leu Ile Glu Gly Asp Leu Asn Pro Asp Asn Ser Asp Val Asp
580 585 590
Lys Leu Phe Ile Gln Leu Val Gln Thr Tyr Asn Gln Leu Phe Glu Glu
595 600 605
Asn Pro Ile Asn Ala Ser Gly Val Asp Ala Lys Ala Ile Leu Ser Ala
610 615 620
Arg Leu Ser Lys Ser Arg Arg Leu Glu Asn Leu Ile Ala Gln Leu Pro
625 630 635 640
Gly Glu Lys Lys Asn Gly Leu Phe Gly Asn Leu Ile Ala Leu Ser Leu
645 650 655
Gly Leu Thr Pro Asn Phe Lys Ser Asn Phe Asp Leu Ala Glu Asp Ala
660 665 670
Lys Leu Gln Leu Ser Lys Asp Thr Tyr Asp Asp Asp Leu Asp Asn Leu
675 680 685
Leu Ala Gln Ile Gly Asp Gln Tyr Ala Asp Leu Phe Leu Ala Ala Lys
690 695 700
Asn Leu Ser Asp Ala Ile Leu Leu Ser Asp Ile Leu Arg Val Asn Thr
705 710 715 720
Glu Ile Thr Lys Ala Pro Leu Ser Ala Ser Met Ile Lys Arg Tyr Asp
725 730 735
Glu His His Gln Asp Leu Thr Leu Leu Lys Ala Leu Val Arg Gln Gln
740 745 750
Leu Pro Glu Lys Tyr Lys Glu Ile Phe Phe Asp Gln Ser Lys Asn Gly
755 760 765
Tyr Ala Gly Tyr Ile Asp Gly Gly Ala Ser Gln Glu Glu Phe Tyr Lys
770 775 780
Phe Ile Lys Pro Ile Leu Glu Lys Met Asp Gly Thr Glu Glu Leu Leu
785 790 795 800
Val Lys Leu Asn Arg Glu Asp Leu Leu Arg Lys Gln Arg Thr Phe Asp
805 810 815
Asn Gly Ser Ile Pro His Gln Ile His Leu Gly Glu Leu His Ala Ile
820 825 830
Leu Arg Arg Gln Glu Asp Phe Tyr Pro Phe Leu Lys Asp Asn Arg Glu
835 840 845
Lys Ile Glu Lys Ile Leu Thr Phe Arg Ile Pro Tyr Tyr Val Gly Pro
850 855 860
Leu Ala Arg Gly Asn Ser Arg Phe Ala Trp Met Thr Arg Lys Ser Glu
865 870 875 880
Glu Thr Ile Thr Pro Trp Asn Phe Glu Glu Val Val Asp Lys Gly Ala
885 890 895
Ser Ala Gln Ser Phe Ile Glu Arg Met Thr Asn Phe Asp Lys Asn Leu
900 905 910
Pro Asn Glu Lys Val Leu Pro Lys His Ser Leu Leu Tyr Glu Tyr Phe
915 920 925
Thr Val Tyr Asn Glu Leu Thr Lys Val Lys Tyr Val Thr Glu Gly Met
930 935 940
Arg Lys Pro Ala Phe Leu Ser Gly Glu Gln Lys Lys Ala Ile Val Asp
945 950 955 960
Leu Leu Phe Lys Thr Asn Arg Lys Val Thr Val Lys Gln Leu Lys Glu
965 970 975
Asp Tyr Phe Lys Lys Ile Glu Cys Phe Asp Ser Val Glu Ile Ser Gly
980 985 990
Val Glu Asp Arg Phe Asn Ala Ser Leu Gly Thr Tyr His Asp Leu Leu
995 1000 1005
Lys Ile Ile Lys Asp Lys Asp Phe Leu Asp Asn Glu Glu Asn Glu
1010 1015 1020
Asp Ile Leu Glu Asp Ile Val Leu Thr Leu Thr Leu Phe Glu Asp
1025 1030 1035
Arg Glu Met Ile Glu Glu Arg Leu Lys Thr Tyr Ala His Leu Phe
1040 1045 1050
Asp Asp Lys Val Met Lys Gln Leu Lys Arg Arg Arg Tyr Thr Gly
1055 1060 1065
Trp Gly Arg Leu Ser Arg Lys Leu Ile Asn Gly Ile Arg Asp Lys
1070 1075 1080
Gln Ser Gly Lys Thr Ile Leu Asp Phe Leu Lys Ser Asp Gly Phe
1085 1090 1095
Ala Asn Arg Asn Phe Met Gln Leu Ile His Asp Asp Ser Leu Thr
1100 1105 1110
Phe Lys Glu Asp Ile Gln Lys Ala Gln Val Ser Gly Gln Gly Asp
1115 1120 1125
Ser Leu His Glu His Ile Ala Asn Leu Ala Gly Ser Pro Ala Ile
1130 1135 1140
Lys Lys Gly Ile Leu Gln Thr Val Lys Val Val Asp Glu Leu Val
1145 1150 1155
Lys Val Met Gly Arg His Lys Pro Glu Asn Ile Val Ile Glu Met
1160 1165 1170
Ala Arg Glu Asn Gln Thr Thr Gln Lys Gly Gln Lys Asn Ser Arg
1175 1180 1185
Glu Arg Met Lys Arg Ile Glu Glu Gly Ile Lys Glu Leu Gly Ser
1190 1195 1200
Gln Ile Leu Lys Glu His Pro Val Glu Asn Thr Gln Leu Gln Asn
1205 1210 1215
Glu Lys Leu Tyr Leu Tyr Tyr Leu Gln Asn Gly Arg Asp Met Tyr
1220 1225 1230
Val Asp Gln Glu Leu Asp Ile Asn Arg Leu Ser Asp Tyr Asp Val
1235 1240 1245
Ala Ala Ile Val Pro Gln Ser Phe Leu Lys Asp Asp Ser Ile Asp
1250 1255 1260
Asn Lys Val Leu Thr Arg Ser Asp Lys Ala Arg Gly Lys Ser Asp
1265 1270 1275
Asn Val Pro Ser Glu Glu Val Val Lys Lys Met Lys Asn Tyr Trp
1280 1285 1290
Arg Gln Leu Leu Asn Ala Lys Leu Ile Thr Gln Arg Lys Phe Asp
1295 1300 1305
Asn Leu Thr Lys Ala Glu Arg Gly Gly Leu Ser Glu Leu Asp Lys
1310 1315 1320
Ala Gly Phe Ile Lys Arg Gln Leu Val Glu Thr Arg Gln Ile Thr
1325 1330 1335
Lys His Val Ala Gln Ile Leu Asp Ser Arg Met Asn Thr Lys Tyr
1340 1345 1350
Asp Glu Asn Asp Lys Leu Ile Arg Glu Val Lys Val Ile Thr Leu
1355 1360 1365
Lys Ser Lys Leu Val Ser Asp Phe Arg Lys Asp Phe Gln Phe Tyr
1370 1375 1380
Lys Val Arg Glu Ile Asn Asn Tyr His His Ala His Asp Ala Tyr
1385 1390 1395
Leu Asn Ala Val Val Gly Thr Ala Leu Ile Lys Lys Tyr Pro Lys
1400 1405 1410
Leu Glu Ser Glu Phe Val Tyr Gly Asp Tyr Lys Val Tyr Asp Val
1415 1420 1425
Arg Lys Met Ile Ala Lys Ser Glu Gln Glu Ile Gly Lys Ala Thr
1430 1435 1440
Ala Lys Tyr Phe Phe Tyr Ser Asn Ile Met Asn Phe Phe Lys Thr
1445 1450 1455
Glu Ile Thr Leu Ala Asn Gly Glu Ile Arg Lys Arg Pro Leu Ile
1460 1465 1470
Glu Thr Asn Gly Glu Thr Gly Glu Ile Val Trp Asp Lys Gly Arg
1475 1480 1485
Asp Phe Ala Thr Val Arg Lys Val Leu Ser Met Pro Gln Val Asn
1490 1495 1500
Ile Val Lys Lys Thr Glu Val Gln Thr Gly Gly Phe Ser Lys Glu
1505 1510 1515
Ser Ile Leu Pro Lys Arg Asn Ser Asp Lys Leu Ile Ala Arg Lys
1520 1525 1530
Lys Asp Trp Asp Pro Lys Lys Tyr Gly Gly Phe Asp Ser Pro Thr
1535 1540 1545
Val Ala Tyr Ser Val Leu Val Val Ala Lys Val Glu Lys Gly Lys
1550 1555 1560
Ser Lys Lys Leu Lys Ser Val Lys Glu Leu Leu Gly Ile Thr Ile
1565 1570 1575
Met Glu Arg Ser Ser Phe Glu Lys Asn Pro Ile Asp Phe Leu Glu
1580 1585 1590
Ala Lys Gly Tyr Lys Glu Val Lys Lys Asp Leu Ile Ile Lys Leu
1595 1600 1605
Pro Lys Tyr Ser Leu Phe Glu Leu Glu Asn Gly Arg Lys Arg Met
1610 1615 1620
Leu Ala Ser Ala Gly Glu Leu Gln Lys Gly Asn Glu Leu Ala Leu
1625 1630 1635
Pro Ser Lys Tyr Val Asn Phe Leu Tyr Leu Ala Ser His Tyr Glu
1640 1645 1650
Lys Leu Lys Gly Ser Pro Glu Asp Asn Glu Gln Lys Gln Leu Phe
1655 1660 1665
Val Glu Gln His Lys His Tyr Leu Asp Glu Ile Ile Glu Gln Ile
1670 1675 1680
Ser Glu Phe Ser Lys Arg Val Ile Leu Ala Asp Ala Asn Leu Asp
1685 1690 1695
Lys Val Leu Ser Ala Tyr Asn Lys His Arg Asp Lys Pro Ile Arg
1700 1705 1710
Glu Gln Ala Glu Asn Ile Ile His Leu Phe Thr Leu Thr Asn Leu
1715 1720 1725
Gly Ala Pro Ala Ala Phe Lys Tyr Phe Asp Thr Thr Ile Asp Arg
1730 1735 1740
Lys Arg Tyr Thr Ser Thr Lys Glu Val Leu Asp Ala Thr Leu Ile
1745 1750 1755
His Gln Ser Ile Thr Gly Leu Tyr Glu Thr Arg Ile Asp Leu Ser
1760 1765 1770
Gln Leu Gly Gly Asp Ser Gly Gly Lys Arg Pro Ala Ala Thr Lys
1775 1780 1785
Lys Ala Gly Gln Ala Lys Lys Lys Lys Ser Gly Gly Gly Gly Ser
1790 1795 1800
Asp Ala Lys Ser Leu Thr Ala Trp Ser Arg Thr Leu Val Thr Phe
1805 1810 1815
Lys Asp Val Phe Val Asp Phe Thr Arg Glu Glu Trp Lys Leu Leu
1820 1825 1830
Asp Thr Ala Gln Gln Ile Leu Tyr Arg Asn Val Met Leu Glu Asn
1835 1840 1845
Tyr Lys Asn Leu Val Ser Leu Gly Tyr Gln Leu Thr Lys Pro Asp
1850 1855 1860
Val Ile Leu Arg Leu Glu Lys Gly Glu Glu Pro Trp Leu Val Glu
1865 1870 1875
Arg Glu Ile His Gln Glu Thr His Pro Asp Ser Glu Thr Ala Phe
1880 1885 1890
Glu Ile Lys Ser Ser Val Gly Gly Gly Gly Ser Gly Lys Arg Pro
1895 1900 1905
Ala Ala Thr Lys Lys Ala Gly Gln Ala Lys Lys Lys Lys Gly Ser
1910 1915 1920
Tyr Pro Tyr Asp Val Pro Asp Tyr Ala
1925 1930
<210> 73
<211> 1497
<212> PRT
<213> Artificial Sequence
<220>
<223> Description of Artificial Sequence: Synthetic polypeptide
<400> 73
Met Ala Pro Lys Lys Lys Arg Lys Val Gly Ile His Gly Val Pro Ala
1 5 10 15
Ala Ala Lys Arg Asn Tyr Ile Leu Gly Leu Ala Ile Gly Ile Thr Ser
20 25 30
Val Gly Tyr Gly Ile Ile Asp Tyr Glu Thr Arg Asp Val Ile Asp Ala
35 40 45
Gly Val Arg Leu Phe Lys Glu Ala Asn Val Glu Asn Asn Glu Gly Arg
50 55 60
Arg Ser Lys Arg Gly Ala Arg Arg Leu Lys Arg Arg Arg Arg His Arg
65 70 75 80
Ile Gln Arg Val Lys Lys Leu Leu Phe Asp Tyr Asn Leu Leu Thr Asp
85 90 95
His Ser Glu Leu Ser Gly Ile Asn Pro Tyr Glu Ala Arg Val Lys Gly
100 105 110
Leu Ser Gln Lys Leu Ser Glu Glu Glu Phe Ser Ala Ala Leu Leu His
115 120 125
Leu Ala Lys Arg Arg Gly Val His Asn Val Asn Glu Val Glu Glu Asp
130 135 140
Thr Gly Asn Glu Leu Ser Thr Lys Glu Gln Ile Ser Arg Asn Ser Lys
145 150 155 160
Ala Leu Glu Glu Lys Tyr Val Ala Glu Leu Gln Leu Glu Arg Leu Lys
165 170 175
Lys Asp Gly Glu Val Arg Gly Ser Ile Asn Arg Phe Lys Thr Ser Asp
180 185 190
Tyr Val Lys Glu Ala Lys Gln Leu Leu Lys Val Gln Lys Ala Tyr His
195 200 205
Gln Leu Asp Gln Ser Phe Ile Asp Thr Tyr Ile Asp Leu Leu Glu Thr
210 215 220
Arg Arg Thr Tyr Tyr Glu Gly Pro Gly Glu Gly Ser Pro Phe Gly Trp
225 230 235 240
Lys Asp Ile Lys Glu Trp Tyr Glu Met Leu Met Gly His Cys Thr Tyr
245 250 255
Phe Pro Glu Glu Leu Arg Ser Val Lys Tyr Ala Tyr Asn Ala Asp Leu
260 265 270
Tyr Asn Ala Leu Asn Asp Leu Asn Asn Leu Val Ile Thr Arg Asp Glu
275 280 285
Asn Glu Lys Leu Glu Tyr Tyr Glu Lys Phe Gln Ile Ile Glu Asn Val
290 295 300
Phe Lys Gln Lys Lys Lys Pro Thr Leu Lys Gln Ile Ala Lys Glu Ile
305 310 315 320
Leu Val Asn Glu Glu Asp Ile Lys Gly Tyr Arg Val Thr Ser Thr Gly
325 330 335
Lys Pro Glu Phe Thr Asn Leu Lys Val Tyr His Asp Ile Lys Asp Ile
340 345 350
Thr Ala Arg Lys Glu Ile Ile Glu Asn Ala Glu Leu Leu Asp Gln Ile
355 360 365
Ala Lys Ile Leu Thr Ile Tyr Gln Ser Ser Glu Asp Ile Gln Glu Glu
370 375 380
Leu Thr Asn Leu Asn Ser Glu Leu Thr Gln Glu Glu Ile Glu Gln Ile
385 390 395 400
Ser Asn Leu Lys Gly Tyr Thr Gly Thr His Asn Leu Ser Leu Lys Ala
405 410 415
Ile Asn Leu Ile Leu Asp Glu Leu Trp His Thr Asn Asp Asn Gln Ile
420 425 430
Ala Ile Phe Asn Arg Leu Lys Leu Val Pro Lys Lys Val Asp Leu Ser
435 440 445
Gln Gln Lys Glu Ile Pro Thr Thr Leu Val Asp Asp Phe Ile Leu Ser
450 455 460
Pro Val Val Lys Arg Ser Phe Ile Gln Ser Ile Lys Val Ile Asn Ala
465 470 475 480
Ile Ile Lys Lys Tyr Gly Leu Pro Asn Asp Ile Ile Ile Glu Leu Ala
485 490 495
Arg Glu Lys Asn Ser Lys Asp Ala Gln Lys Met Ile Asn Glu Met Gln
500 505 510
Lys Arg Asn Arg Gln Thr Asn Glu Arg Ile Glu Glu Ile Ile Arg Thr
515 520 525
Thr Gly Lys Glu Asn Ala Lys Tyr Leu Ile Glu Lys Ile Lys Leu His
530 535 540
Asp Met Gln Glu Gly Lys Cys Leu Tyr Ser Leu Glu Ala Ile Pro Leu
545 550 555 560
Glu Asp Leu Leu Asn Asn Pro Phe Asn Tyr Glu Val Asp Ala Ile Ile
565 570 575
Pro Arg Ser Val Ser Phe Asp Asn Ser Phe Asn Asn Lys Val Leu Val
580 585 590
Lys Gln Glu Glu Asn Ser Lys Lys Gly Asn Arg Thr Pro Phe Gln Tyr
595 600 605
Leu Ser Ser Ser Asp Ser Lys Ile Ser Tyr Glu Thr Phe Lys Lys His
610 615 620
Ile Leu Asn Leu Ala Lys Gly Lys Gly Arg Ile Ser Lys Thr Lys Lys
625 630 635 640
Glu Tyr Leu Leu Glu Glu Arg Asp Ile Asn Arg Phe Ser Val Gln Lys
645 650 655
Asp Phe Ile Asn Arg Asn Leu Val Asp Thr Arg Tyr Ala Thr Arg Gly
660 665 670
Leu Met Asn Leu Leu Arg Ser Tyr Phe Arg Val Asn Asn Leu Asp Val
675 680 685
Lys Val Lys Ser Ile Asn Gly Gly Phe Thr Ser Phe Leu Arg Arg Lys
690 695 700
Trp Lys Phe Lys Lys Glu Arg Asn Lys Gly Tyr Lys His His Ala Glu
705 710 715 720
Asp Ala Leu Ile Ile Ala Asn Ala Asp Phe Ile Phe Lys Glu Trp Lys
725 730 735
Lys Leu Asp Lys Ala Lys Lys Val Met Glu Asn Gln Met Phe Glu Glu
740 745 750
Lys Gln Ala Glu Ser Met Pro Glu Ile Glu Thr Glu Gln Glu Tyr Lys
755 760 765
Glu Ile Phe Ile Thr Pro His Gln Ile Lys His Ile Lys Asp Phe Lys
770 775 780
Asp Tyr Lys Tyr Ser His Arg Val Asp Lys Lys Pro Asn Arg Lys Leu
785 790 795 800
Ile Asn Asp Thr Leu Tyr Ser Thr Arg Lys Asp Asp Lys Gly Asn Thr
805 810 815
Leu Ile Val Asn Asn Leu Asn Gly Leu Tyr Asp Lys Asp Asn Asp Lys
820 825 830
Leu Lys Lys Leu Ile Asn Lys Ser Pro Glu Lys Leu Leu Met Tyr His
835 840 845
His Asp Pro Gln Thr Tyr Gln Lys Leu Lys Leu Ile Met Glu Gln Tyr
850 855 860
Gly Asp Glu Lys Asn Pro Leu Tyr Lys Tyr Tyr Glu Glu Thr Gly Asn
865 870 875 880
Tyr Leu Thr Lys Tyr Ser Lys Lys Asp Asn Gly Pro Val Ile Lys Lys
885 890 895
Ile Lys Tyr Tyr Gly Asn Lys Leu Asn Ala His Leu Asp Ile Thr Asp
900 905 910
Asp Tyr Pro Asn Ser Arg Asn Lys Val Val Lys Leu Ser Leu Lys Pro
915 920 925
Tyr Arg Phe Asp Val Tyr Leu Asp Asn Gly Val Tyr Lys Phe Val Thr
930 935 940
Val Lys Asn Leu Asp Val Ile Lys Lys Glu Asn Tyr Tyr Glu Val Asn
945 950 955 960
Ser Lys Cys Tyr Glu Glu Ala Lys Lys Leu Lys Lys Ile Ser Asn Gln
965 970 975
Ala Glu Phe Ile Ala Ser Phe Tyr Lys Asn Asp Leu Ile Lys Ile Asn
980 985 990
Gly Glu Leu Tyr Arg Val Ile Gly Val Asn Asn Asp Leu Leu Asn Arg
995 1000 1005
Ile Glu Val Asn Met Ile Asp Ile Thr Tyr Arg Glu Tyr Leu Glu
1010 1015 1020
Asn Met Asn Asp Lys Arg Pro Pro His Ile Ile Lys Thr Ile Ala
1025 1030 1035
Ser Lys Thr Gln Ser Ile Lys Lys Tyr Ser Thr Asp Ile Leu Gly
1040 1045 1050
Asn Leu Tyr Glu Val Lys Ser Lys Lys His Pro Gln Ile Ile Lys
1055 1060 1065
Lys Gly Lys Arg Pro Ala Ala Thr Lys Lys Ala Gly Gln Ala Lys
1070 1075 1080
Lys Lys Lys Ala Arg Asp Ser Lys Val Glu Asn Lys Thr Lys Lys
1085 1090 1095
Leu Arg Val Phe Glu Ala Phe Ala Gly Ile Gly Ala Gln Arg Lys
1100 1105 1110
Ala Leu Glu Lys Val Arg Lys Asp Glu Tyr Glu Ile Val Gly Leu
1115 1120 1125
Ala Glu Trp Tyr Val Pro Ala Ile Val Met Tyr Gln Ala Ile His
1130 1135 1140
Asn Asn Phe His Thr Lys Leu Glu Tyr Lys Ser Val Ser Arg Glu
1145 1150 1155
Glu Met Ile Asp Tyr Leu Glu Asn Lys Thr Leu Ser Trp Asn Ser
1160 1165 1170
Lys Asn Pro Val Ser Asn Gly Tyr Trp Lys Arg Lys Lys Asp Asp
1175 1180 1185
Glu Leu Lys Ile Ile Tyr Asn Ala Ile Lys Leu Ser Glu Lys Glu
1190 1195 1200
Gly Asn Ile Phe Asp Ile Arg Asp Leu Tyr Lys Arg Thr Leu Lys
1205 1210 1215
Asn Ile Asp Leu Leu Thr Tyr Ser Phe Pro Cys Gln Asp Leu Ser
1220 1225 1230
Gln Gln Gly Ile Gln Lys Gly Met Lys Arg Gly Ser Gly Thr Arg
1235 1240 1245
Ser Gly Leu Leu Trp Glu Ile Glu Arg Ala Leu Asp Ser Thr Glu
1250 1255 1260
Lys Asn Asp Leu Pro Lys Tyr Leu Leu Met Glu Asn Val Gly Ala
1265 1270 1275
Leu Leu His Lys Lys Asn Glu Glu Glu Leu Asn Gln Trp Lys Gln
1280 1285 1290
Lys Leu Glu Ser Leu Gly Tyr Gln Asn Ser Ile Glu Val Leu Asn
1295 1300 1305
Ala Ala Asp Phe Gly Ser Ser Gln Ala Arg Arg Arg Val Phe Met
1310 1315 1320
Ile Ser Thr Leu Asn Glu Phe Val Glu Leu Pro Lys Gly Asp Lys
1325 1330 1335
Lys Pro Lys Ser Ile Lys Lys Val Leu Asn Lys Ile Val Ser Glu
1340 1345 1350
Lys Asp Ile Leu Asn Asn Leu Leu Lys Tyr Asn Leu Thr Glu Phe
1355 1360 1365
Lys Lys Thr Lys Ser Asn Ile Asn Lys Ala Ser Leu Ile Gly Tyr
1370 1375 1380
Ser Lys Phe Asn Ser Glu Gly Tyr Val Tyr Asp Pro Glu Phe Thr
1385 1390 1395
Gly Pro Thr Leu Thr Ala Ser Gly Ala Asn Ser Arg Ile Lys Ile
1400 1405 1410
Lys Asp Gly Ser Asn Ile Arg Lys Met Asn Ser Asp Glu Thr Phe
1415 1420 1425
Leu Tyr Ile Gly Phe Asp Ser Gln Asp Gly Lys Arg Val Asn Glu
1430 1435 1440
Ile Glu Phe Leu Thr Glu Asn Gln Lys Ile Phe Val Cys Gly Asn
1445 1450 1455
Ser Ile Ser Val Glu Val Leu Glu Ala Ile Ile Asp Lys Ile Gly
1460 1465 1470
Gly Pro Ser Ser Gly Gly Lys Arg Pro Ala Ala Thr Lys Lys Ala
1475 1480 1485
Gly Gln Ala Lys Lys Lys Lys Gly Ser
1490 1495
<210> 74
<211> 1811
<212> PRT
<213> Artificial Sequence
<220>
<223> Description of Artificial Sequence: Synthetic polypeptide
<400> 74
Met Ala Pro Lys Lys Lys Arg Lys Val Gly Ile His Gly Val Pro Ala
1 5 10 15
Ala Asp Lys Lys Tyr Ser Ile Gly Leu Ala Ile Gly Thr Asn Ser Val
20 25 30
Gly Trp Ala Val Ile Thr Asp Glu Tyr Lys Val Pro Ser Lys Lys Phe
35 40 45
Lys Val Leu Gly Asn Thr Asp Arg His Ser Ile Lys Lys Asn Leu Ile
50 55 60
Gly Ala Leu Leu Phe Asp Ser Gly Glu Thr Ala Glu Ala Thr Arg Leu
65 70 75 80
Lys Arg Thr Ala Arg Arg Arg Tyr Thr Arg Arg Lys Asn Arg Ile Cys
85 90 95
Tyr Leu Gln Glu Ile Phe Ser Asn Glu Met Ala Lys Val Asp Asp Ser
100 105 110
Phe Phe His Arg Leu Glu Glu Ser Phe Leu Val Glu Glu Asp Lys Lys
115 120 125
His Glu Arg His Pro Ile Phe Gly Asn Ile Val Asp Glu Val Ala Tyr
130 135 140
His Glu Lys Tyr Pro Thr Ile Tyr His Leu Arg Lys Lys Leu Val Asp
145 150 155 160
Ser Thr Asp Lys Ala Asp Leu Arg Leu Ile Tyr Leu Ala Leu Ala His
165 170 175
Met Ile Lys Phe Arg Gly His Phe Leu Ile Glu Gly Asp Leu Asn Pro
180 185 190
Asp Asn Ser Asp Val Asp Lys Leu Phe Ile Gln Leu Val Gln Thr Tyr
195 200 205
Asn Gln Leu Phe Glu Glu Asn Pro Ile Asn Ala Ser Gly Val Asp Ala
210 215 220
Lys Ala Ile Leu Ser Ala Arg Leu Ser Lys Ser Arg Arg Leu Glu Asn
225 230 235 240
Leu Ile Ala Gln Leu Pro Gly Glu Lys Lys Asn Gly Leu Phe Gly Asn
245 250 255
Leu Ile Ala Leu Ser Leu Gly Leu Thr Pro Asn Phe Lys Ser Asn Phe
260 265 270
Asp Leu Ala Glu Asp Ala Lys Leu Gln Leu Ser Lys Asp Thr Tyr Asp
275 280 285
Asp Asp Leu Asp Asn Leu Leu Ala Gln Ile Gly Asp Gln Tyr Ala Asp
290 295 300
Leu Phe Leu Ala Ala Lys Asn Leu Ser Asp Ala Ile Leu Leu Ser Asp
305 310 315 320
Ile Leu Arg Val Asn Thr Glu Ile Thr Lys Ala Pro Leu Ser Ala Ser
325 330 335
Met Ile Lys Arg Tyr Asp Glu His His Gln Asp Leu Thr Leu Leu Lys
340 345 350
Ala Leu Val Arg Gln Gln Leu Pro Glu Lys Tyr Lys Glu Ile Phe Phe
355 360 365
Asp Gln Ser Lys Asn Gly Tyr Ala Gly Tyr Ile Asp Gly Gly Ala Ser
370 375 380
Gln Glu Glu Phe Tyr Lys Phe Ile Lys Pro Ile Leu Glu Lys Met Asp
385 390 395 400
Gly Thr Glu Glu Leu Leu Val Lys Leu Asn Arg Glu Asp Leu Leu Arg
405 410 415
Lys Gln Arg Thr Phe Asp Asn Gly Ser Ile Pro His Gln Ile His Leu
420 425 430
Gly Glu Leu His Ala Ile Leu Arg Arg Gln Glu Asp Phe Tyr Pro Phe
435 440 445
Leu Lys Asp Asn Arg Glu Lys Ile Glu Lys Ile Leu Thr Phe Arg Ile
450 455 460
Pro Tyr Tyr Val Gly Pro Leu Ala Arg Gly Asn Ser Arg Phe Ala Trp
465 470 475 480
Met Thr Arg Lys Ser Glu Glu Thr Ile Thr Pro Trp Asn Phe Glu Glu
485 490 495
Val Val Asp Lys Gly Ala Ser Ala Gln Ser Phe Ile Glu Arg Met Thr
500 505 510
Asn Phe Asp Lys Asn Leu Pro Asn Glu Lys Val Leu Pro Lys His Ser
515 520 525
Leu Leu Tyr Glu Tyr Phe Thr Val Tyr Asn Glu Leu Thr Lys Val Lys
530 535 540
Tyr Val Thr Glu Gly Met Arg Lys Pro Ala Phe Leu Ser Gly Glu Gln
545 550 555 560
Lys Lys Ala Ile Val Asp Leu Leu Phe Lys Thr Asn Arg Lys Val Thr
565 570 575
Val Lys Gln Leu Lys Glu Asp Tyr Phe Lys Lys Ile Glu Cys Phe Asp
580 585 590
Ser Val Glu Ile Ser Gly Val Glu Asp Arg Phe Asn Ala Ser Leu Gly
595 600 605
Thr Tyr His Asp Leu Leu Lys Ile Ile Lys Asp Lys Asp Phe Leu Asp
610 615 620
Asn Glu Glu Asn Glu Asp Ile Leu Glu Asp Ile Val Leu Thr Leu Thr
625 630 635 640
Leu Phe Glu Asp Arg Glu Met Ile Glu Glu Arg Leu Lys Thr Tyr Ala
645 650 655
His Leu Phe Asp Asp Lys Val Met Lys Gln Leu Lys Arg Arg Arg Tyr
660 665 670
Thr Gly Trp Gly Arg Leu Ser Arg Lys Leu Ile Asn Gly Ile Arg Asp
675 680 685
Lys Gln Ser Gly Lys Thr Ile Leu Asp Phe Leu Lys Ser Asp Gly Phe
690 695 700
Ala Asn Arg Asn Phe Met Gln Leu Ile His Asp Asp Ser Leu Thr Phe
705 710 715 720
Lys Glu Asp Ile Gln Lys Ala Gln Val Ser Gly Gln Gly Asp Ser Leu
725 730 735
His Glu His Ile Ala Asn Leu Ala Gly Ser Pro Ala Ile Lys Lys Gly
740 745 750
Ile Leu Gln Thr Val Lys Val Val Asp Glu Leu Val Lys Val Met Gly
755 760 765
Arg His Lys Pro Glu Asn Ile Val Ile Glu Met Ala Arg Glu Asn Gln
770 775 780
Thr Thr Gln Lys Gly Gln Lys Asn Ser Arg Glu Arg Met Lys Arg Ile
785 790 795 800
Glu Glu Gly Ile Lys Glu Leu Gly Ser Gln Ile Leu Lys Glu His Pro
805 810 815
Val Glu Asn Thr Gln Leu Gln Asn Glu Lys Leu Tyr Leu Tyr Tyr Leu
820 825 830
Gln Asn Gly Arg Asp Met Tyr Val Asp Gln Glu Leu Asp Ile Asn Arg
835 840 845
Leu Ser Asp Tyr Asp Val Ala Ala Ile Val Pro Gln Ser Phe Leu Lys
850 855 860
Asp Asp Ser Ile Asp Asn Lys Val Leu Thr Arg Ser Asp Lys Ala Arg
865 870 875 880
Gly Lys Ser Asp Asn Val Pro Ser Glu Glu Val Val Lys Lys Met Lys
885 890 895
Asn Tyr Trp Arg Gln Leu Leu Asn Ala Lys Leu Ile Thr Gln Arg Lys
900 905 910
Phe Asp Asn Leu Thr Lys Ala Glu Arg Gly Gly Leu Ser Glu Leu Asp
915 920 925
Lys Ala Gly Phe Ile Lys Arg Gln Leu Val Glu Thr Arg Gln Ile Thr
930 935 940
Lys His Val Ala Gln Ile Leu Asp Ser Arg Met Asn Thr Lys Tyr Asp
945 950 955 960
Glu Asn Asp Lys Leu Ile Arg Glu Val Lys Val Ile Thr Leu Lys Ser
965 970 975
Lys Leu Val Ser Asp Phe Arg Lys Asp Phe Gln Phe Tyr Lys Val Arg
980 985 990
Glu Ile Asn Asn Tyr His His Ala His Asp Ala Tyr Leu Asn Ala Val
995 1000 1005
Val Gly Thr Ala Leu Ile Lys Lys Tyr Pro Lys Leu Glu Ser Glu
1010 1015 1020
Phe Val Tyr Gly Asp Tyr Lys Val Tyr Asp Val Arg Lys Met Ile
1025 1030 1035
Ala Lys Ser Glu Gln Glu Ile Gly Lys Ala Thr Ala Lys Tyr Phe
1040 1045 1050
Phe Tyr Ser Asn Ile Met Asn Phe Phe Lys Thr Glu Ile Thr Leu
1055 1060 1065
Ala Asn Gly Glu Ile Arg Lys Arg Pro Leu Ile Glu Thr Asn Gly
1070 1075 1080
Glu Thr Gly Glu Ile Val Trp Asp Lys Gly Arg Asp Phe Ala Thr
1085 1090 1095
Val Arg Lys Val Leu Ser Met Pro Gln Val Asn Ile Val Lys Lys
1100 1105 1110
Thr Glu Val Gln Thr Gly Gly Phe Ser Lys Glu Ser Ile Leu Pro
1115 1120 1125
Lys Arg Asn Ser Asp Lys Leu Ile Ala Arg Lys Lys Asp Trp Asp
1130 1135 1140
Pro Lys Lys Tyr Gly Gly Phe Asp Ser Pro Thr Val Ala Tyr Ser
1145 1150 1155
Val Leu Val Val Ala Lys Val Glu Lys Gly Lys Ser Lys Lys Leu
1160 1165 1170
Lys Ser Val Lys Glu Leu Leu Gly Ile Thr Ile Met Glu Arg Ser
1175 1180 1185
Ser Phe Glu Lys Asn Pro Ile Asp Phe Leu Glu Ala Lys Gly Tyr
1190 1195 1200
Lys Glu Val Lys Lys Asp Leu Ile Ile Lys Leu Pro Lys Tyr Ser
1205 1210 1215
Leu Phe Glu Leu Glu Asn Gly Arg Lys Arg Met Leu Ala Ser Ala
1220 1225 1230
Gly Glu Leu Gln Lys Gly Asn Glu Leu Ala Leu Pro Ser Lys Tyr
1235 1240 1245
Val Asn Phe Leu Tyr Leu Ala Ser His Tyr Glu Lys Leu Lys Gly
1250 1255 1260
Ser Pro Glu Asp Asn Glu Gln Lys Gln Leu Phe Val Glu Gln His
1265 1270 1275
Lys His Tyr Leu Asp Glu Ile Ile Glu Gln Ile Ser Glu Phe Ser
1280 1285 1290
Lys Arg Val Ile Leu Ala Asp Ala Asn Leu Asp Lys Val Leu Ser
1295 1300 1305
Ala Tyr Asn Lys His Arg Asp Lys Pro Ile Arg Glu Gln Ala Glu
1310 1315 1320
Asn Ile Ile His Leu Phe Thr Leu Thr Asn Leu Gly Ala Pro Ala
1325 1330 1335
Ala Phe Lys Tyr Phe Asp Thr Thr Ile Asp Arg Lys Arg Tyr Thr
1340 1345 1350
Ser Thr Lys Glu Val Leu Asp Ala Thr Leu Ile His Gln Ser Ile
1355 1360 1365
Thr Gly Leu Tyr Glu Thr Arg Ile Asp Leu Ser Gln Leu Gly Gly
1370 1375 1380
Asp Lys Arg Pro Ala Ala Thr Lys Lys Ala Gly Gln Ala Lys Lys
1385 1390 1395
Lys Lys Ala Arg Asp Ser Lys Val Glu Asn Lys Thr Lys Lys Leu
1400 1405 1410
Arg Val Phe Glu Ala Phe Ala Gly Ile Gly Ala Gln Arg Lys Ala
1415 1420 1425
Leu Glu Lys Val Arg Lys Asp Glu Tyr Glu Ile Val Gly Leu Ala
1430 1435 1440
Glu Trp Tyr Val Pro Ala Ile Val Met Tyr Gln Ala Ile His Asn
1445 1450 1455
Asn Phe His Thr Lys Leu Glu Tyr Lys Ser Val Ser Arg Glu Glu
1460 1465 1470
Met Ile Asp Tyr Leu Glu Asn Lys Thr Leu Ser Trp Asn Ser Lys
1475 1480 1485
Asn Pro Val Ser Asn Gly Tyr Trp Lys Arg Lys Lys Asp Asp Glu
1490 1495 1500
Leu Lys Ile Ile Tyr Asn Ala Ile Lys Leu Ser Glu Lys Glu Gly
1505 1510 1515
Asn Ile Phe Asp Ile Arg Asp Leu Tyr Lys Arg Thr Leu Lys Asn
1520 1525 1530
Ile Asp Leu Leu Thr Tyr Ser Phe Pro Cys Gln Asp Leu Ser Gln
1535 1540 1545
Gln Gly Ile Gln Lys Gly Met Lys Arg Gly Ser Gly Thr Arg Ser
1550 1555 1560
Gly Leu Leu Trp Glu Ile Glu Arg Ala Leu Asp Ser Thr Glu Lys
1565 1570 1575
Asn Asp Leu Pro Lys Tyr Leu Leu Met Glu Asn Val Gly Ala Leu
1580 1585 1590
Leu His Lys Lys Asn Glu Glu Glu Leu Asn Gln Trp Lys Gln Lys
1595 1600 1605
Leu Glu Ser Leu Gly Tyr Gln Asn Ser Ile Glu Val Leu Asn Ala
1610 1615 1620
Ala Asp Phe Gly Ser Ser Gln Ala Arg Arg Arg Val Phe Met Ile
1625 1630 1635
Ser Thr Leu Asn Glu Phe Val Glu Leu Pro Lys Gly Asp Lys Lys
1640 1645 1650
Pro Lys Ser Ile Lys Lys Val Leu Asn Lys Ile Val Ser Glu Lys
1655 1660 1665
Asp Ile Leu Asn Asn Leu Leu Lys Tyr Asn Leu Thr Glu Phe Lys
1670 1675 1680
Lys Thr Lys Ser Asn Ile Asn Lys Ala Ser Leu Ile Gly Tyr Ser
1685 1690 1695
Lys Phe Asn Ser Glu Gly Tyr Val Tyr Asp Pro Glu Phe Thr Gly
1700 1705 1710
Pro Thr Leu Thr Ala Ser Gly Ala Asn Ser Arg Ile Lys Ile Lys
1715 1720 1725
Asp Gly Ser Asn Ile Arg Lys Met Asn Ser Asp Glu Thr Phe Leu
1730 1735 1740
Tyr Ile Gly Phe Asp Ser Gln Asp Gly Lys Arg Val Asn Glu Ile
1745 1750 1755
Glu Phe Leu Thr Glu Asn Gln Lys Ile Phe Val Cys Gly Asn Ser
1760 1765 1770
Ile Ser Val Glu Val Leu Glu Ala Ile Ile Asp Lys Ile Gly Gly
1775 1780 1785
Pro Ser Ser Gly Gly Lys Arg Pro Ala Ala Thr Lys Lys Ala Gly
1790 1795 1800
Gln Ala Lys Lys Lys Lys Gly Ser
1805 1810
<210> 75
<211> 1519
<212> PRT
<213> artificial sequence
<220>
<223> Description of artificial sequence: synthetic polypeptide
<400> 75
Met Ala Pro Lys Lys Lys Arg Lys Val Gly Ile His Gly Val Pro Ala
1 5 10 15
Ala Asp Lys Lys Tyr Ser Ile Gly Leu Ala Ile Gly Thr Asn Ser Val
20 25 30
Gly Trp Ala Val Ile Thr Asp Glu Tyr Lys Val Pro Ser Lys Lys Phe
35 40 45
Lys Val Leu Gly Asn Thr Asp Arg His Ser Ile Lys Lys Asn Leu Ile
50 55 60
Gly Ala Leu Leu Phe Asp Ser Gly Glu Thr Ala Glu Ala Thr Arg Leu
65 70 75 80
Lys Arg Thr Ala Arg Arg Arg Tyr Thr Arg Arg Lys Asn Arg Ile Cys
85 90 95
Tyr Leu Gln Glu Ile Phe Ser Asn Glu Met Ala Lys Val Asp Asp Ser
100 105 110
Phe Phe His Arg Leu Glu Glu Ser Phe Leu Val Glu Glu Asp Lys Lys
115 120 125
His Glu Arg His Pro Ile Phe Gly Asn Ile Val Asp Glu Val Ala Tyr
130 135 140
His Glu Lys Tyr Pro Thr Ile Tyr His Leu Arg Lys Lys Leu Val Asp
145 150 155 160
Ser Thr Asp Lys Ala Asp Leu Arg Leu Ile Tyr Leu Ala Leu Ala His
165 170 175
Met Ile Lys Phe Arg Gly His Phe Leu Ile Glu Gly Asp Leu Asn Pro
180 185 190
Asp Asn Ser Asp Val Asp Lys Leu Phe Ile Gln Leu Val Gln Thr Tyr
195 200 205
Asn Gln Leu Phe Glu Glu Asn Pro Ile Asn Ala Ser Gly Val Asp Ala
210 215 220
Lys Ala Ile Leu Ser Ala Arg Leu Ser Lys Ser Arg Arg Leu Glu Asn
225 230 235 240
Leu Ile Ala Gln Leu Pro Gly Glu Lys Lys Asn Gly Leu Phe Gly Asn
245 250 255
Leu Ile Ala Leu Ser Leu Gly Leu Thr Pro Asn Phe Lys Ser Asn Phe
260 265 270
Asp Leu Ala Glu Asp Ala Lys Leu Gln Leu Ser Lys Asp Thr Tyr Asp
275 280 285
Asp Asp Leu Asp Asn Leu Leu Ala Gln Ile Gly Asp Gln Tyr Ala Asp
290 295 300
Leu Phe Leu Ala Ala Lys Asn Leu Ser Asp Ala Ile Leu Leu Ser Asp
305 310 315 320
Ile Leu Arg Val Asn Thr Glu Ile Thr Lys Ala Pro Leu Ser Ala Ser
325 330 335
Met Ile Lys Arg Tyr Asp Glu His His Gln Asp Leu Thr Leu Leu Lys
340 345 350
Ala Leu Val Arg Gln Gln Leu Pro Glu Lys Tyr Lys Glu Ile Phe Phe
355 360 365
Asp Gln Ser Lys Asn Gly Tyr Ala Gly Tyr Ile Asp Gly Gly Ala Ser
370 375 380
Gln Glu Glu Phe Tyr Lys Phe Ile Lys Pro Ile Leu Glu Lys Met Asp
385 390 395 400
Gly Thr Glu Glu Leu Leu Val Lys Leu Asn Arg Glu Asp Leu Leu Arg
405 410 415
Lys Gln Arg Thr Phe Asp Asn Gly Ser Ile Pro His Gln Ile His Leu
420 425 430
Gly Glu Leu His Ala Ile Leu Arg Arg Gln Glu Asp Phe Tyr Pro Phe
435 440 445
Leu Lys Asp Asn Arg Glu Lys Ile Glu Lys Ile Leu Thr Phe Arg Ile
450 455 460
Pro Tyr Tyr Val Gly Pro Leu Ala Arg Gly Asn Ser Arg Phe Ala Trp
465 470 475 480
Met Thr Arg Lys Ser Glu Glu Thr Ile Thr Pro Trp Asn Phe Glu Glu
485 490 495
Val Val Asp Lys Gly Ala Ser Ala Gln Ser Phe Ile Glu Arg Met Thr
500 505 510
Asn Phe Asp Lys Asn Leu Pro Asn Glu Lys Val Leu Pro Lys His Ser
515 520 525
Leu Leu Tyr Glu Tyr Phe Thr Val Tyr Asn Glu Leu Thr Lys Val Lys
530 535 540
Tyr Val Thr Glu Gly Met Arg Lys Pro Ala Phe Leu Ser Gly Glu Gln
545 550 555 560
Lys Lys Ala Ile Val Asp Leu Leu Phe Lys Thr Asn Arg Lys Val Thr
565 570 575
Val Lys Gln Leu Lys Glu Asp Tyr Phe Lys Lys Ile Glu Cys Phe Asp
580 585 590
Ser Val Glu Ile Ser Gly Val Glu Asp Arg Phe Asn Ala Ser Leu Gly
595 600 605
Thr Tyr His Asp Leu Leu Lys Ile Ile Lys Asp Lys Asp Phe Leu Asp
610 615 620
Asn Glu Glu Asn Glu Asp Ile Leu Glu Asp Ile Val Leu Thr Leu Thr
625 630 635 640
Leu Phe Glu Asp Arg Glu Met Ile Glu Glu Arg Leu Lys Thr Tyr Ala
645 650 655
His Leu Phe Asp Asp Lys Val Met Lys Gln Leu Lys Arg Arg Arg Tyr
660 665 670
Thr Gly Trp Gly Arg Leu Ser Arg Lys Leu Ile Asn Gly Ile Arg Asp
675 680 685
Lys Gln Ser Gly Lys Thr Ile Leu Asp Phe Leu Lys Ser Asp Gly Phe
690 695 700
Ala Asn Arg Asn Phe Met Gln Leu Ile His Asp Asp Ser Leu Thr Phe
705 710 715 720
Lys Glu Asp Ile Gln Lys Ala Gln Val Ser Gly Gln Gly Asp Ser Leu
725 730 735
His Glu His Ile Ala Asn Leu Ala Gly Ser Pro Ala Ile Lys Lys Gly
740 745 750
Ile Leu Gln Thr Val Lys Val Val Asp Glu Leu Val Lys Val Met Gly
755 760 765
Arg His Lys Pro Glu Asn Ile Val Ile Glu Met Ala Arg Glu Asn Gln
770 775 780
Thr Thr Gln Lys Gly Gln Lys Asn Ser Arg Glu Arg Met Lys Arg Ile
785 790 795 800
Glu Glu Gly Ile Lys Glu Leu Gly Ser Gln Ile Leu Lys Glu His Pro
805 810 815
Val Glu Asn Thr Gln Leu Gln Asn Glu Lys Leu Tyr Leu Tyr Tyr Leu
820 825 830
Gln Asn Gly Arg Asp Met Tyr Val Asp Gln Glu Leu Asp Ile Asn Arg
835 840 845
Leu Ser Asp Tyr Asp Val Ala Ala Ile Val Pro Gln Ser Phe Leu Lys
850 855 860
Asp Asp Ser Ile Asp Asn Lys Val Leu Thr Arg Ser Asp Lys Ala Arg
865 870 875 880
Gly Lys Ser Asp Asn Val Pro Ser Glu Glu Val Val Lys Lys Met Lys
885 890 895
Asn Tyr Trp Arg Gln Leu Leu Asn Ala Lys Leu Ile Thr Gln Arg Lys
900 905 910
Phe Asp Asn Leu Thr Lys Ala Glu Arg Gly Gly Leu Ser Glu Leu Asp
915 920 925
Lys Ala Gly Phe Ile Lys Arg Gln Leu Val Glu Thr Arg Gln Ile Thr
930 935 940
Lys His Val Ala Gln Ile Leu Asp Ser Arg Met Asn Thr Lys Tyr Asp
945 950 955 960
Glu Asn Asp Lys Leu Ile Arg Glu Val Lys Val Ile Thr Leu Lys Ser
965 970 975
Lys Leu Val Ser Asp Phe Arg Lys Asp Phe Gln Phe Tyr Lys Val Arg
980 985 990
Glu Ile Asn Asn Tyr His His Ala His Asp Ala Tyr Leu Asn Ala Val
995 1000 1005
Val Gly Thr Ala Leu Ile Lys Lys Tyr Pro Lys Leu Glu Ser Glu
1010 1015 1020
Phe Val Tyr Gly Asp Tyr Lys Val Tyr Asp Val Arg Lys Met Ile
1025 1030 1035
Ala Lys Ser Glu Gln Glu Ile Gly Lys Ala Thr Ala Lys Tyr Phe
1040 1045 1050
Phe Tyr Ser Asn Ile Met Asn Phe Phe Lys Thr Glu Ile Thr Leu
1055 1060 1065
Ala Asn Gly Glu Ile Arg Lys Arg Pro Leu Ile Glu Thr Asn Gly
1070 1075 1080
Glu Thr Gly Glu Ile Val Trp Asp Lys Gly Arg Asp Phe Ala Thr
1085 1090 1095
Val Arg Lys Val Leu Ser Met Pro Gln Val Asn Ile Val Lys Lys
1100 1105 1110
Thr Glu Val Gln Thr Gly Gly Phe Ser Lys Glu Ser Ile Leu Pro
1115 1120 1125
Lys Arg Asn Ser Asp Lys Leu Ile Ala Arg Lys Lys Asp Trp Asp
1130 1135 1140
Pro Lys Lys Tyr Gly Gly Phe Asp Ser Pro Thr Val Ala Tyr Ser
1145 1150 1155
Val Leu Val Val Ala Lys Val Glu Lys Gly Lys Ser Lys Lys Leu
1160 1165 1170
Lys Ser Val Lys Glu Leu Leu Gly Ile Thr Ile Met Glu Arg Ser
1175 1180 1185
Ser Phe Glu Lys Asn Pro Ile Asp Phe Leu Glu Ala Lys Gly Tyr
1190 1195 1200
Lys Glu Val Lys Lys Asp Leu Ile Ile Lys Leu Pro Lys Tyr Ser
1205 1210 1215
Leu Phe Glu Leu Glu Asn Gly Arg Lys Arg Met Leu Ala Ser Ala
1220 1225 1230
Gly Glu Leu Gln Lys Gly Asn Glu Leu Ala Leu Pro Ser Lys Tyr
1235 1240 1245
Val Asn Phe Leu Tyr Leu Ala Ser His Tyr Glu Lys Leu Lys Gly
1250 1255 1260
Ser Pro Glu Asp Asn Glu Gln Lys Gln Leu Phe Val Glu Gln His
1265 1270 1275
Lys His Tyr Leu Asp Glu Ile Ile Glu Gln Ile Ser Glu Phe Ser
1280 1285 1290
Lys Arg Val Ile Leu Ala Asp Ala Asn Leu Asp Lys Val Leu Ser
1295 1300 1305
Ala Tyr Asn Lys His Arg Asp Lys Pro Ile Arg Glu Gln Ala Glu
1310 1315 1320
Asn Ile Ile His Leu Phe Thr Leu Thr Asn Leu Gly Ala Pro Ala
1325 1330 1335
Ala Phe Lys Tyr Phe Asp Thr Thr Ile Asp Arg Lys Arg Tyr Thr
1340 1345 1350
Ser Thr Lys Glu Val Leu Asp Ala Thr Leu Ile His Gln Ser Ile
1355 1360 1365
Thr Gly Leu Tyr Glu Thr Arg Ile Asp Leu Ser Gln Leu Gly Gly
1370 1375 1380
Asp Lys Arg Pro Ala Ala Thr Lys Lys Ala Gly Gln Ala Lys Lys
1385 1390 1395
Lys Lys Ala Ser Asp Ala Lys Ser Leu Thr Ala Trp Ser Arg Thr
1400 1405 1410
Leu Val Thr Phe Lys Asp Val Phe Val Asp Phe Thr Arg Glu Glu
1415 1420 1425
Trp Lys Leu Leu Asp Thr Ala Gln Gln Ile Leu Tyr Arg Asn Val
1430 1435 1440
Met Leu Glu Asn Tyr Lys Asn Leu Val Ser Leu Gly Tyr Gln Leu
1445 1450 1455
Thr Lys Pro Asp Val Ile Leu Arg Leu Glu Lys Gly Glu Glu Pro
1460 1465 1470
Trp Leu Val Glu Arg Glu Ile His Gln Glu Thr His Pro Asp Ser
1475 1480 1485
Glu Thr Ala Phe Glu Ile Lys Ser Ser Val Ser Gly Gly Lys Arg
1490 1495 1500
Pro Ala Ala Thr Lys Lys Ala Gly Gln Ala Lys Lys Lys Lys Gly
1505 1510 1515
Ser
<210> 76
<211> 2168
<212> PRT
<213> artificial sequence
<220>
<223> Description of artificial sequence: synthetic polypeptide
<400> 76
Met Ala Pro Lys Lys Lys Arg Lys Val Gly Gly Ser Gly Gly Ser Gly
1 5 10 15
Gln Thr Gly Lys Lys Ser Glu Lys Gly Pro Val Cys Trp Arg Lys Arg
20 25 30
Val Lys Ser Glu Tyr Met Arg Leu Arg Gln Leu Lys Arg Phe Arg Arg
35 40 45
Ala Asp Glu Val Lys Ser Met Phe Ser Ser Asn Arg Gln Lys Ile Leu
50 55 60
Glu Arg Thr Glu Ile Leu Asn Gln Glu Trp Lys Gln Arg Arg Ile Gln
65 70 75 80
Pro Val His Ile Leu Thr Ser Val Ser Ser Leu Arg Gly Thr Arg Glu
85 90 95
Cys Ser Val Thr Ser Asp Leu Asp Phe Pro Thr Gln Val Ile Pro Leu
100 105 110
Lys Thr Leu Asn Ala Val Ala Ser Val Pro Ile Met Tyr Ser Trp Ser
115 120 125
Pro Leu Gln Gln Asn Phe Met Val Glu Asp Glu Thr Val Leu His Asn
130 135 140
Ile Pro Tyr Met Gly Asp Glu Val Leu Asp Gln Asp Gly Thr Phe Ile
145 150 155 160
Glu Glu Leu Ile Lys Asn Tyr Asp Gly Lys Val His Gly Asp Arg Glu
165 170 175
Cys Gly Phe Ile Asn Asp Glu Ile Phe Val Glu Leu Val Asn Ala Leu
180 185 190
Gly Gln Tyr Asn Asp Asp Asp Asp Asp Asp Asp Gly Asp Asp Pro Glu
195 200 205
Glu Arg Glu Glu Lys Gln Lys Asp Leu Glu Asp His Arg Asp Asp Lys
210 215 220
Glu Ser Arg Pro Pro Arg Lys Phe Pro Ser Asp Lys Ile Phe Glu Ala
225 230 235 240
Ile Ser Ser Met Phe Pro Asp Lys Gly Thr Ala Glu Glu Leu Lys Glu
245 250 255
Lys Tyr Lys Glu Leu Thr Glu Gln Gln Leu Pro Gly Ala Leu Pro Pro
260 265 270
Glu Cys Thr Pro Asn Ile Asp Gly Pro Asn Ala Lys Ser Val Gln Arg
275 280 285
Glu Gln Ser Leu His Ser Phe His Thr Leu Phe Cys Arg Arg Cys Phe
290 295 300
Lys Tyr Asp Cys Phe Leu His Pro Phe His Ala Thr Pro Asn Thr Tyr
305 310 315 320
Lys Arg Lys Asn Thr Glu Thr Ala Leu Asp Asn Lys Pro Cys Gly Pro
325 330 335
Gln Cys Tyr Gln His Leu Glu Gly Ala Lys Glu Phe Ala Ala Ala Leu
340 345 350
Thr Ala Glu Arg Ile Lys Thr Pro Pro Lys Arg Pro Gly Gly Arg Arg
355 360 365
Arg Gly Arg Leu Pro Asn Asn Ser Ser Arg Pro Ser Thr Pro Thr Ile
370 375 380
Asn Val Leu Glu Ser Lys Asp Thr Asp Ser Asp Arg Glu Ala Gly Thr
385 390 395 400
Glu Thr Gly Gly Glu Asn Asn Asp Lys Glu Glu Glu Glu Lys Lys Asp
405 410 415
Glu Thr Ser Ser Ser Ser Glu Ala Asn Ser Arg Cys Gln Thr Pro Ile
420 425 430
Lys Met Lys Pro Asn Ile Glu Pro Pro Glu Asn Val Glu Trp Ser Gly
435 440 445
Ala Glu Ala Ser Met Phe Arg Val Leu Ile Gly Thr Tyr Tyr Asp Asn
450 455 460
Phe Cys Ala Ile Ala Arg Leu Ile Gly Thr Lys Thr Cys Arg Gln Val
465 470 475 480
Tyr Glu Phe Arg Val Lys Glu Ser Ser Ile Ile Ala Pro Ala Pro Ala
485 490 495
Glu Asp Val Asp Thr Pro Pro Arg Lys Lys Lys Arg Lys His Arg Leu
500 505 510
Trp Ala Ala His Cys Arg Lys Ile Gln Leu Lys Lys Asp Gly Ser Ser
515 520 525
Asn His Val Tyr Asn Tyr Gln Pro Cys Asp His Pro Arg Gln Pro Cys
530 535 540
Asp Ser Ser Cys Pro Cys Val Ile Ala Gln Asn Phe Cys Glu Lys Phe
545 550 555 560
Cys Gln Cys Ser Ser Glu Cys Gln Asn Arg Phe Pro Gly Cys Arg Cys
565 570 575
Lys Ala Gln Cys Asn Thr Lys Gln Cys Pro Cys Tyr Leu Ala Val Arg
580 585 590
Glu Cys Asp Pro Asp Leu Cys Leu Thr Cys Gly Ala Ala Asp His Trp
595 600 605
Asp Ser Lys Asn Val Ser Cys Lys Asn Cys Ser Ile Gln Arg Gly Ser
610 615 620
Lys Lys His Leu Leu Leu Ala Pro Ser Asp Val Ala Gly Trp Gly Ile
625 630 635 640
Phe Ile Lys Asp Pro Val Gln Lys Asn Glu Phe Ile Ser Glu Tyr Cys
645 650 655
Gly Glu Ile Ile Ser Gln Asp Glu Ala Asp Arg Arg Gly Lys Val Tyr
660 665 670
Asp Lys Tyr Met Cys Ser Phe Leu Phe Asn Leu Asn Asn Asp Phe Val
675 680 685
Val Asp Ala Thr Arg Lys Gly Asn Lys Ile Arg Phe Ala Asn His Ser
690 695 700
Val Asn Pro Asn Cys Tyr Ala Lys Val Met Met Val Asn Gly Asp His
705 710 715 720
Arg Ile Gly Ile Phe Ala Lys Arg Ala Ile Gln Thr Gly Glu Glu Leu
725 730 735
Phe Phe Asp Tyr Arg Tyr Ser Gln Ala Asp Ala Leu Lys Tyr Val Gly
740 745 750
Ile Glu Arg Glu Met Glu Ile Pro Ser Thr Gly Gly Ser Gly Gly Ser
755 760 765
Gly Gly Ser Gly Gly Ser Gly Gly Ser Gly Arg Pro Asp Lys Lys Tyr
770 775 780
Ser Ile Gly Leu Ala Ile Gly Thr Asn Ser Val Gly Trp Ala Val Ile
785 790 795 800
Thr Asp Glu Tyr Lys Val Pro Ser Lys Lys Phe Lys Val Leu Gly Asn
805 810 815
Thr Asp Arg His Ser Ile Lys Lys Asn Leu Ile Gly Ala Leu Leu Phe
820 825 830
Asp Ser Gly Glu Thr Ala Glu Ala Thr Arg Leu Lys Arg Thr Ala Arg
835 840 845
Arg Arg Tyr Thr Arg Arg Lys Asn Arg Ile Cys Tyr Leu Gln Glu Ile
850 855 860
Phe Ser Asn Glu Met Ala Lys Val Asp Asp Ser Phe Phe His Arg Leu
865 870 875 880
Glu Glu Ser Phe Leu Val Glu Glu Asp Lys Lys His Glu Arg His Pro
885 890 895
Ile Phe Gly Asn Ile Val Asp Glu Val Ala Tyr His Glu Lys Tyr Pro
900 905 910
Thr Ile Tyr His Leu Arg Lys Lys Leu Val Asp Ser Thr Asp Lys Ala
915 920 925
Asp Leu Arg Leu Ile Tyr Leu Ala Leu Ala His Met Ile Lys Phe Arg
930 935 940
Gly His Phe Leu Ile Glu Gly Asp Leu Asn Pro Asp Asn Ser Asp Val
945 950 955 960
Asp Lys Leu Phe Ile Gln Leu Val Gln Thr Tyr Asn Gln Leu Phe Glu
965 970 975
Glu Asn Pro Ile Asn Ala Ser Gly Val Asp Ala Lys Ala Ile Leu Ser
980 985 990
Ala Arg Leu Ser Lys Ser Arg Arg Leu Glu Asn Leu Ile Ala Gln Leu
995 1000 1005
Pro Gly Glu Lys Lys Asn Gly Leu Phe Gly Asn Leu Ile Ala Leu
1010 1015 1020
Ser Leu Gly Leu Thr Pro Asn Phe Lys Ser Asn Phe Asp Leu Ala
1025 1030 1035
Glu Asp Ala Lys Leu Gln Leu Ser Lys Asp Thr Tyr Asp Asp Asp
1040 1045 1050
Leu Asp Asn Leu Leu Ala Gln Ile Gly Asp Gln Tyr Ala Asp Leu
1055 1060 1065
Phe Leu Ala Ala Lys Asn Leu Ser Asp Ala Ile Leu Leu Ser Asp
1070 1075 1080
Ile Leu Arg Val Asn Thr Glu Ile Thr Lys Ala Pro Leu Ser Ala
1085 1090 1095
Ser Met Ile Lys Arg Tyr Asp Glu His His Gln Asp Leu Thr Leu
1100 1105 1110
Leu Lys Ala Leu Val Arg Gln Gln Leu Pro Glu Lys Tyr Lys Glu
1115 1120 1125
Ile Phe Phe Asp Gln Ser Lys Asn Gly Tyr Ala Gly Tyr Ile Asp
1130 1135 1140
Gly Gly Ala Ser Gln Glu Glu Phe Tyr Lys Phe Ile Lys Pro Ile
1145 1150 1155
Leu Glu Lys Met Asp Gly Thr Glu Glu Leu Leu Val Lys Leu Asn
1160 1165 1170
Arg Glu Asp Leu Leu Arg Lys Gln Arg Thr Phe Asp Asn Gly Ser
1175 1180 1185
Ile Pro His Gln Ile His Leu Gly Glu Leu His Ala Ile Leu Arg
1190 1195 1200
Arg Gln Glu Asp Phe Tyr Pro Phe Leu Lys Asp Asn Arg Glu Lys
1205 1210 1215
Ile Glu Lys Ile Leu Thr Phe Arg Ile Pro Tyr Tyr Val Gly Pro
1220 1225 1230
Leu Ala Arg Gly Asn Ser Arg Phe Ala Trp Met Thr Arg Lys Ser
1235 1240 1245
Glu Glu Thr Ile Thr Pro Trp Asn Phe Glu Glu Val Val Asp Lys
1250 1255 1260
Gly Ala Ser Ala Gln Ser Phe Ile Glu Arg Met Thr Asn Phe Asp
1265 1270 1275
Lys Asn Leu Pro Asn Glu Lys Val Leu Pro Lys His Ser Leu Leu
1280 1285 1290
Tyr Glu Tyr Phe Thr Val Tyr Asn Glu Leu Thr Lys Val Lys Tyr
1295 1300 1305
Val Thr Glu Gly Met Arg Lys Pro Ala Phe Leu Ser Gly Glu Gln
1310 1315 1320
Lys Lys Ala Ile Val Asp Leu Leu Phe Lys Thr Asn Arg Lys Val
1325 1330 1335
Thr Val Lys Gln Leu Lys Glu Asp Tyr Phe Lys Lys Ile Glu Cys
1340 1345 1350
Phe Asp Ser Val Glu Ile Ser Gly Val Glu Asp Arg Phe Asn Ala
1355 1360 1365
Ser Leu Gly Thr Tyr His Asp Leu Leu Lys Ile Ile Lys Asp Lys
1370 1375 1380
Asp Phe Leu Asp Asn Glu Glu Asn Glu Asp Ile Leu Glu Asp Ile
1385 1390 1395
Val Leu Thr Leu Thr Leu Phe Glu Asp Arg Glu Met Ile Glu Glu
1400 1405 1410
Arg Leu Lys Thr Tyr Ala His Leu Phe Asp Asp Lys Val Met Lys
1415 1420 1425
Gln Leu Lys Arg Arg Arg Tyr Thr Gly Trp Gly Arg Leu Ser Arg
1430 1435 1440
Lys Leu Ile Asn Gly Ile Arg Asp Lys Gln Ser Gly Lys Thr Ile
1445 1450 1455
Leu Asp Phe Leu Lys Ser Asp Gly Phe Ala Asn Arg Asn Phe Met
1460 1465 1470
Gln Leu Ile His Asp Asp Ser Leu Thr Phe Lys Glu Asp Ile Gln
1475 1480 1485
Lys Ala Gln Val Ser Gly Gln Gly Asp Ser Leu His Glu His Ile
1490 1495 1500
Ala Asn Leu Ala Gly Ser Pro Ala Ile Lys Lys Gly Ile Leu Gln
1505 1510 1515
Thr Val Lys Val Val Asp Glu Leu Val Lys Val Met Gly Arg His
1520 1525 1530
Lys Pro Glu Asn Ile Val Ile Glu Met Ala Arg Glu Asn Gln Thr
1535 1540 1545
Thr Gln Lys Gly Gln Lys Asn Ser Arg Glu Arg Met Lys Arg Ile
1550 1555 1560
Glu Glu Gly Ile Lys Glu Leu Gly Ser Gln Ile Leu Lys Glu His
1565 1570 1575
Pro Val Glu Asn Thr Gln Leu Gln Asn Glu Lys Leu Tyr Leu Tyr
1580 1585 1590
Tyr Leu Gln Asn Gly Arg Asp Met Tyr Val Asp Gln Glu Leu Asp
1595 1600 1605
Ile Asn Arg Leu Ser Asp Tyr Asp Val Ala Ala Ile Val Pro Gln
1610 1615 1620
Ser Phe Leu Lys Asp Asp Ser Ile Asp Asn Lys Val Leu Thr Arg
1625 1630 1635
Ser Asp Lys Ala Arg Gly Lys Ser Asp Asn Val Pro Ser Glu Glu
1640 1645 1650
Val Val Lys Lys Met Lys Asn Tyr Trp Arg Gln Leu Leu Asn Ala
1655 1660 1665
Lys Leu Ile Thr Gln Arg Lys Phe Asp Asn Leu Thr Lys Ala Glu
1670 1675 1680
Arg Gly Gly Leu Ser Glu Leu Asp Lys Ala Gly Phe Ile Lys Arg
1685 1690 1695
Gln Leu Val Glu Thr Arg Gln Ile Thr Lys His Val Ala Gln Ile
1700 1705 1710
Leu Asp Ser Arg Met Asn Thr Lys Tyr Asp Glu Asn Asp Lys Leu
1715 1720 1725
Ile Arg Glu Val Lys Val Ile Thr Leu Lys Ser Lys Leu Val Ser
1730 1735 1740
Asp Phe Arg Lys Asp Phe Gln Phe Tyr Lys Val Arg Glu Ile Asn
1745 1750 1755
Asn Tyr His His Ala His Asp Ala Tyr Leu Asn Ala Val Val Gly
1760 1765 1770
Thr Ala Leu Ile Lys Lys Tyr Pro Lys Leu Glu Ser Glu Phe Val
1775 1780 1785
Tyr Gly Asp Tyr Lys Val Tyr Asp Val Arg Lys Met Ile Ala Lys
1790 1795 1800
Ser Glu Gln Glu Ile Gly Lys Ala Thr Ala Lys Tyr Phe Phe Tyr
1805 1810 1815
Ser Asn Ile Met Asn Phe Phe Lys Thr Glu Ile Thr Leu Ala Asn
1820 1825 1830
Gly Glu Ile Arg Lys Arg Pro Leu Ile Glu Thr Asn Gly Glu Thr
1835 1840 1845
Gly Glu Ile Val Trp Asp Lys Gly Arg Asp Phe Ala Thr Val Arg
1850 1855 1860
Lys Val Leu Ser Met Pro Gln Val Asn Ile Val Lys Lys Thr Glu
1865 1870 1875
Val Gln Thr Gly Gly Phe Ser Lys Glu Ser Ile Leu Pro Lys Arg
1880 1885 1890
Asn Ser Asp Lys Leu Ile Ala Arg Lys Lys Asp Trp Asp Pro Lys
1895 1900 1905
Lys Tyr Gly Gly Phe Asp Ser Pro Thr Val Ala Tyr Ser Val Leu
1910 1915 1920
Val Val Ala Lys Val Glu Lys Gly Lys Ser Lys Lys Leu Lys Ser
1925 1930 1935
Val Lys Glu Leu Leu Gly Ile Thr Ile Met Glu Arg Ser Ser Phe
1940 1945 1950
Glu Lys Asn Pro Ile Asp Phe Leu Glu Ala Lys Gly Tyr Lys Glu
1955 1960 1965
Val Lys Lys Asp Leu Ile Ile Lys Leu Pro Lys Tyr Ser Leu Phe
1970 1975 1980
Glu Leu Glu Asn Gly Arg Lys Arg Met Leu Ala Ser Ala Gly Glu
1985 1990 1995
Leu Gln Lys Gly Asn Glu Leu Ala Leu Pro Ser Lys Tyr Val Asn
2000 2005 2010
Phe Leu Tyr Leu Ala Ser His Tyr Glu Lys Leu Lys Gly Ser Pro
2015 2020 2025
Glu Asp Asn Glu Gln Lys Gln Leu Phe Val Glu Gln His Lys His
2030 2035 2040
Tyr Leu Asp Glu Ile Ile Glu Gln Ile Ser Glu Phe Ser Lys Arg
2045 2050 2055
Val Ile Leu Ala Asp Ala Asn Leu Asp Lys Val Leu Ser Ala Tyr
2060 2065 2070
Asn Lys His Arg Asp Lys Pro Ile Arg Glu Gln Ala Glu Asn Ile
2075 2080 2085
Ile His Leu Phe Thr Leu Thr Asn Leu Gly Ala Pro Ala Ala Phe
2090 2095 2100
Lys Tyr Phe Asp Thr Thr Ile Asp Arg Lys Arg Tyr Thr Ser Thr
2105 2110 2115
Lys Glu Val Leu Asp Ala Thr Leu Ile His Gln Ser Ile Thr Gly
2120 2125 2130
Leu Tyr Glu Thr Arg Ile Asp Leu Ser Gln Leu Gly Gly Asp Ser
2135 2140 2145
Gly Gly Lys Arg Pro Ala Ala Thr Lys Lys Ala Gly Gln Ala Lys
2150 2155 2160
Lys Lys Lys Gly Ser
2165
<210> 77
<211> 1977
<212> PRT
<213> artificial sequence
<220>
<223> Description of artificial sequence: synthetic polypeptide
<400> 77
Met Ala Pro Lys Lys Lys Arg Lys Val Gly Ile His Gly Val Pro Ala
1 5 10 15
Ala Asp Lys Lys Tyr Ser Ile Gly Leu Ala Ile Gly Thr Asn Ser Val
20 25 30
Gly Trp Ala Val Ile Thr Asp Glu Tyr Lys Val Pro Ser Lys Lys Phe
35 40 45
Lys Val Leu Gly Asn Thr Asp Arg His Ser Ile Lys Lys Asn Leu Ile
50 55 60
Gly Ala Leu Leu Phe Asp Ser Gly Glu Thr Ala Glu Ala Thr Arg Leu
65 70 75 80
Lys Arg Thr Ala Arg Arg Arg Tyr Thr Arg Arg Lys Asn Arg Ile Cys
85 90 95
Tyr Leu Gln Glu Ile Phe Ser Asn Glu Met Ala Lys Val Asp Asp Ser
100 105 110
Phe Phe His Arg Leu Glu Glu Ser Phe Leu Val Glu Glu Asp Lys Lys
115 120 125
His Glu Arg His Pro Ile Phe Gly Asn Ile Val Asp Glu Val Ala Tyr
130 135 140
His Glu Lys Tyr Pro Thr Ile Tyr His Leu Arg Lys Lys Leu Val Asp
145 150 155 160
Ser Thr Asp Lys Ala Asp Leu Arg Leu Ile Tyr Leu Ala Leu Ala His
165 170 175
Met Ile Lys Phe Arg Gly His Phe Leu Ile Glu Gly Asp Leu Asn Pro
180 185 190
Asp Asn Ser Asp Val Asp Lys Leu Phe Ile Gln Leu Val Gln Thr Tyr
195 200 205
Asn Gln Leu Phe Glu Glu Asn Pro Ile Asn Ala Ser Gly Val Asp Ala
210 215 220
Lys Ala Ile Leu Ser Ala Arg Leu Ser Lys Ser Arg Arg Leu Glu Asn
225 230 235 240
Leu Ile Ala Gln Leu Pro Gly Glu Lys Lys Asn Gly Leu Phe Gly Asn
245 250 255
Leu Ile Ala Leu Ser Leu Gly Leu Thr Pro Asn Phe Lys Ser Asn Phe
260 265 270
Asp Leu Ala Glu Asp Ala Lys Leu Gln Leu Ser Lys Asp Thr Tyr Asp
275 280 285
Asp Asp Leu Asp Asn Leu Leu Ala Gln Ile Gly Asp Gln Tyr Ala Asp
290 295 300
Leu Phe Leu Ala Ala Lys Asn Leu Ser Asp Ala Ile Leu Leu Ser Asp
305 310 315 320
Ile Leu Arg Val Asn Thr Glu Ile Thr Lys Ala Pro Leu Ser Ala Ser
325 330 335
Met Ile Lys Arg Tyr Asp Glu His His Gln Asp Leu Thr Leu Leu Lys
340 345 350
Ala Leu Val Arg Gln Gln Leu Pro Glu Lys Tyr Lys Glu Ile Phe Phe
355 360 365
Asp Gln Ser Lys Asn Gly Tyr Ala Gly Tyr Ile Asp Gly Gly Ala Ser
370 375 380
Gln Glu Glu Phe Tyr Lys Phe Ile Lys Pro Ile Leu Glu Lys Met Asp
385 390 395 400
Gly Thr Glu Glu Leu Leu Val Lys Leu Asn Arg Glu Asp Leu Leu Arg
405 410 415
Lys Gln Arg Thr Phe Asp Asn Gly Ser Ile Pro His Gln Ile His Leu
420 425 430
Gly Glu Leu His Ala Ile Leu Arg Arg Gln Glu Asp Phe Tyr Pro Phe
435 440 445
Leu Lys Asp Asn Arg Glu Lys Ile Glu Lys Ile Leu Thr Phe Arg Ile
450 455 460
Pro Tyr Tyr Val Gly Pro Leu Ala Arg Gly Asn Ser Arg Phe Ala Trp
465 470 475 480
Met Thr Arg Lys Ser Glu Glu Thr Ile Thr Pro Trp Asn Phe Glu Glu
485 490 495
Val Val Asp Lys Gly Ala Ser Ala Gln Ser Phe Ile Glu Arg Met Thr
500 505 510
Asn Phe Asp Lys Asn Leu Pro Asn Glu Lys Val Leu Pro Lys His Ser
515 520 525
Leu Leu Tyr Glu Tyr Phe Thr Val Tyr Asn Glu Leu Thr Lys Val Lys
530 535 540
Tyr Val Thr Glu Gly Met Arg Lys Pro Ala Phe Leu Ser Gly Glu Gln
545 550 555 560
Lys Lys Ala Ile Val Asp Leu Leu Phe Lys Thr Asn Arg Lys Val Thr
565 570 575
Val Lys Gln Leu Lys Glu Asp Tyr Phe Lys Lys Ile Glu Cys Phe Asp
580 585 590
Ser Val Glu Ile Ser Gly Val Glu Asp Arg Phe Asn Ala Ser Leu Gly
595 600 605
Thr Tyr His Asp Leu Leu Lys Ile Ile Lys Asp Lys Asp Phe Leu Asp
610 615 620
Asn Glu Glu Asn Glu Asp Ile Leu Glu Asp Ile Val Leu Thr Leu Thr
625 630 635 640
Leu Phe Glu Asp Arg Glu Met Ile Glu Glu Arg Leu Lys Thr Tyr Ala
645 650 655
His Leu Phe Asp Asp Lys Val Met Lys Gln Leu Lys Arg Arg Arg Tyr
660 665 670
Thr Gly Trp Gly Arg Leu Ser Arg Lys Leu Ile Asn Gly Ile Arg Asp
675 680 685
Lys Gln Ser Gly Lys Thr Ile Leu Asp Phe Leu Lys Ser Asp Gly Phe
690 695 700
Ala Asn Arg Asn Phe Met Gln Leu Ile His Asp Asp Ser Leu Thr Phe
705 710 715 720
Lys Glu Asp Ile Gln Lys Ala Gln Val Ser Gly Gln Gly Asp Ser Leu
725 730 735
His Glu His Ile Ala Asn Leu Ala Gly Ser Pro Ala Ile Lys Lys Gly
740 745 750
Ile Leu Gln Thr Val Lys Val Val Asp Glu Leu Val Lys Val Met Gly
755 760 765
Arg His Lys Pro Glu Asn Ile Val Ile Glu Met Ala Arg Glu Asn Gln
770 775 780
Thr Thr Gln Lys Gly Gln Lys Asn Ser Arg Glu Arg Met Lys Arg Ile
785 790 795 800
Glu Glu Gly Ile Lys Glu Leu Gly Ser Gln Ile Leu Lys Glu His Pro
805 810 815
Val Glu Asn Thr Gln Leu Gln Asn Glu Lys Leu Tyr Leu Tyr Tyr Leu
820 825 830
Gln Asn Gly Arg Asp Met Tyr Val Asp Gln Glu Leu Asp Ile Asn Arg
835 840 845
Leu Ser Asp Tyr Asp Val Ala Ala Ile Val Pro Gln Ser Phe Leu Lys
850 855 860
Asp Asp Ser Ile Asp Asn Lys Val Leu Thr Arg Ser Asp Lys Ala Arg
865 870 875 880
Gly Lys Ser Asp Asn Val Pro Ser Glu Glu Val Val Lys Lys Met Lys
885 890 895
Asn Tyr Trp Arg Gln Leu Leu Asn Ala Lys Leu Ile Thr Gln Arg Lys
900 905 910
Phe Asp Asn Leu Thr Lys Ala Glu Arg Gly Gly Leu Ser Glu Leu Asp
915 920 925
Lys Ala Gly Phe Ile Lys Arg Gln Leu Val Glu Thr Arg Gln Ile Thr
930 935 940
Lys His Val Ala Gln Ile Leu Asp Ser Arg Met Asn Thr Lys Tyr Asp
945 950 955 960
Glu Asn Asp Lys Leu Ile Arg Glu Val Lys Val Ile Thr Leu Lys Ser
965 970 975
Lys Leu Val Ser Asp Phe Arg Lys Asp Phe Gln Phe Tyr Lys Val Arg
980 985 990
Glu Ile Asn Asn Tyr His His Ala His Asp Ala Tyr Leu Asn Ala Val
995 1000 1005
Val Gly Thr Ala Leu Ile Lys Lys Tyr Pro Lys Leu Glu Ser Glu
1010 1015 1020
Phe Val Tyr Gly Asp Tyr Lys Val Tyr Asp Val Arg Lys Met Ile
1025 1030 1035
Ala Lys Ser Glu Gln Glu Ile Gly Lys Ala Thr Ala Lys Tyr Phe
1040 1045 1050
Phe Tyr Ser Asn Ile Met Asn Phe Phe Lys Thr Glu Ile Thr Leu
1055 1060 1065
Ala Asn Gly Glu Ile Arg Lys Arg Pro Leu Ile Glu Thr Asn Gly
1070 1075 1080
Glu Thr Gly Glu Ile Val Trp Asp Lys Gly Arg Asp Phe Ala Thr
1085 1090 1095
Val Arg Lys Val Leu Ser Met Pro Gln Val Asn Ile Val Lys Lys
1100 1105 1110
Thr Glu Val Gln Thr Gly Gly Phe Ser Lys Glu Ser Ile Leu Pro
1115 1120 1125
Lys Arg Asn Ser Asp Lys Leu Ile Ala Arg Lys Lys Asp Trp Asp
1130 1135 1140
Pro Lys Lys Tyr Gly Gly Phe Asp Ser Pro Thr Val Ala Tyr Ser
1145 1150 1155
Val Leu Val Val Ala Lys Val Glu Lys Gly Lys Ser Lys Lys Leu
1160 1165 1170
Lys Ser Val Lys Glu Leu Leu Gly Ile Thr Ile Met Glu Arg Ser
1175 1180 1185
Ser Phe Glu Lys Asn Pro Ile Asp Phe Leu Glu Ala Lys Gly Tyr
1190 1195 1200
Lys Glu Val Lys Lys Asp Leu Ile Ile Lys Leu Pro Lys Tyr Ser
1205 1210 1215
Leu Phe Glu Leu Glu Asn Gly Arg Lys Arg Met Leu Ala Ser Ala
1220 1225 1230
Gly Glu Leu Gln Lys Gly Asn Glu Leu Ala Leu Pro Ser Lys Tyr
1235 1240 1245
Val Asn Phe Leu Tyr Leu Ala Ser His Tyr Glu Lys Leu Lys Gly
1250 1255 1260
Ser Pro Glu Asp Asn Glu Gln Lys Gln Leu Phe Val Glu Gln His
1265 1270 1275
Lys His Tyr Leu Asp Glu Ile Ile Glu Gln Ile Ser Glu Phe Ser
1280 1285 1290
Lys Arg Val Ile Leu Ala Asp Ala Asn Leu Asp Lys Val Leu Ser
1295 1300 1305
Ala Tyr Asn Lys His Arg Asp Lys Pro Ile Arg Glu Gln Ala Glu
1310 1315 1320
Asn Ile Ile His Leu Phe Thr Leu Thr Asn Leu Gly Ala Pro Ala
1325 1330 1335
Ala Phe Lys Tyr Phe Asp Thr Thr Ile Asp Arg Lys Arg Tyr Thr
1340 1345 1350
Ser Thr Lys Glu Val Leu Asp Ala Thr Leu Ile His Gln Ser Ile
1355 1360 1365
Thr Gly Leu Tyr Glu Thr Arg Ile Asp Leu Ser Gln Leu Gly Gly
1370 1375 1380
Asp Ser Ala Gly Gly Gly Gly Ser Gly Gly Gly Gly Ser Gly Gly
1385 1390 1395
Gly Gly Ser Gly Pro Lys Lys Lys Arg Lys Val Ala Ala Ala Gly
1400 1405 1410
Ser Asn His Asp Gln Glu Phe Asp Pro Pro Lys Val Tyr Pro Pro
1415 1420 1425
Val Pro Ala Glu Lys Arg Lys Pro Ile Arg Val Leu Ser Leu Phe
1430 1435 1440
Asp Gly Ile Ala Thr Gly Leu Leu Val Leu Lys Asp Leu Gly Ile
1445 1450 1455
Gln Val Asp Arg Tyr Ile Ala Ser Glu Val Cys Glu Asp Ser Ile
1460 1465 1470
Thr Val Gly Met Val Arg His Gln Gly Lys Ile Met Tyr Val Gly
1475 1480 1485
Asp Val Arg Ser Val Thr Gln Lys His Ile Gln Glu Trp Gly Pro
1490 1495 1500
Phe Asp Leu Val Ile Gly Gly Ser Pro Cys Asn Asp Leu Ser Ile
1505 1510 1515
Val Asn Pro Ala Arg Lys Gly Leu Tyr Glu Gly Thr Gly Arg Leu
1520 1525 1530
Phe Phe Glu Phe Tyr Arg Leu Leu His Asp Ala Arg Pro Lys Glu
1535 1540 1545
Gly Asp Asp Arg Pro Phe Phe Trp Leu Phe Glu Asn Val Val Ala
1550 1555 1560
Met Gly Val Ser Asp Lys Arg Asp Ile Ser Arg Phe Leu Glu Ser
1565 1570 1575
Asn Pro Val Met Ile Asp Ala Lys Glu Val Ser Ala Ala His Arg
1580 1585 1590
Ala Arg Tyr Phe Trp Gly Asn Leu Pro Gly Met Asn Arg Pro Leu
1595 1600 1605
Ala Ser Thr Val Asn Asp Lys Leu Glu Leu Gln Glu Cys Leu Glu
1610 1615 1620
His Gly Arg Ile Ala Lys Phe Ser Lys Val Arg Thr Ile Thr Thr
1625 1630 1635
Arg Ser Asn Ser Ile Lys Gln Gly Lys Asp Gln His Phe Pro Val
1640 1645 1650
Phe Met Asn Glu Lys Glu Asp Ile Leu Trp Cys Thr Glu Met Glu
1655 1660 1665
Arg Val Phe Gly Phe Pro Val His Tyr Thr Asp Val Ser Asn Met
1670 1675 1680
Ser Arg Leu Ala Arg Gln Arg Leu Leu Gly Arg Ser Trp Ser Val
1685 1690 1695
Pro Val Ile Arg His Leu Phe Ala Pro Leu Lys Glu Tyr Phe Ala
1700 1705 1710
Cys Val Ser Ser Gly Asn Ser Asn Ala Asn Ser Arg Gly Pro Ser
1715 1720 1725
Phe Ser Ser Gly Leu Val Pro Leu Ser Leu Arg Gly Ser His Met
1730 1735 1740
Asn Pro Leu Glu Met Phe Glu Thr Val Pro Val Trp Arg Arg Gln
1745 1750 1755
Pro Val Arg Val Leu Ser Leu Phe Glu Asp Ile Lys Lys Glu Leu
1760 1765 1770
Thr Ser Leu Gly Phe Leu Glu Ser Gly Ser Asp Pro Gly Gln Leu
1775 1780 1785
Lys His Val Val Asp Val Thr Asp Thr Val Arg Lys Asp Val Glu
1790 1795 1800
Glu Trp Gly Pro Phe Asp Leu Val Tyr Gly Ala Thr Pro Pro Leu
1805 1810 1815
Gly His Thr Cys Asp Arg Pro Pro Ser Trp Tyr Leu Phe Gln Phe
1820 1825 1830
His Arg Leu Leu Gln Tyr Ala Arg Pro Lys Pro Gly Ser Pro Arg
1835 1840 1845
Pro Phe Phe Trp Met Phe Val Asp Asn Leu Val Leu Asn Lys Glu
1850 1855 1860
Asp Leu Asp Val Ala Ser Arg Phe Leu Glu Met Glu Pro Val Thr
1865 1870 1875
Ile Pro Asp Val His Gly Gly Ser Leu Gln Asn Ala Val Arg Val
1880 1885 1890
Trp Ser Asn Ile Pro Ala Ile Arg Ser Arg His Trp Ala Leu Val
1895 1900 1905
Ser Glu Glu Glu Leu Ser Leu Leu Ala Gln Asn Lys Gln Ser Ser
1910 1915 1920
Lys Leu Ala Ala Lys Trp Pro Thr Lys Leu Val Lys Asn Cys Phe
1925 1930 1935
Leu Pro Leu Arg Glu Tyr Phe Lys Tyr Phe Ser Thr Glu Leu Thr
1940 1945 1950
Ser Ser Leu Ser Gly Gly Lys Arg Pro Ala Ala Thr Lys Lys Ala
1955 1960 1965
Gly Gln Ala Lys Lys Lys Lys Gly Ser
1970 1975
<210> 78
<211> 1809
<212> PRT
<213> artificial sequence
<220>
<223> Description of artificial sequence: synthetic polypeptide
<400> 78
Met Ala Pro Lys Lys Lys Arg Lys Val Gly Ile His Gly Val Pro Ala
1 5 10 15
Ala Asp Lys Lys Tyr Ser Ile Gly Leu Ala Ile Gly Thr Asn Ser Val
20 25 30
Gly Trp Ala Val Ile Thr Asp Glu Tyr Lys Val Pro Ser Lys Lys Phe
35 40 45
Lys Val Leu Gly Asn Thr Asp Arg His Ser Ile Lys Lys Asn Leu Ile
50 55 60
Gly Ala Leu Leu Phe Asp Ser Gly Glu Thr Ala Glu Ala Thr Arg Leu
65 70 75 80
Lys Arg Thr Ala Arg Arg Arg Tyr Thr Arg Arg Lys Asn Arg Ile Cys
85 90 95
Tyr Leu Gln Glu Ile Phe Ser Asn Glu Met Ala Lys Val Asp Asp Ser
100 105 110
Phe Phe His Arg Leu Glu Glu Ser Phe Leu Val Glu Glu Asp Lys Lys
115 120 125
His Glu Arg His Pro Ile Phe Gly Asn Ile Val Asp Glu Val Ala Tyr
130 135 140
His Glu Lys Tyr Pro Thr Ile Tyr His Leu Arg Lys Lys Leu Val Asp
145 150 155 160
Ser Thr Asp Lys Ala Asp Leu Arg Leu Ile Tyr Leu Ala Leu Ala His
165 170 175
Met Ile Lys Phe Arg Gly His Phe Leu Ile Glu Gly Asp Leu Asn Pro
180 185 190
Asp Asn Ser Asp Val Asp Lys Leu Phe Ile Gln Leu Val Gln Thr Tyr
195 200 205
Asn Gln Leu Phe Glu Glu Asn Pro Ile Asn Ala Ser Gly Val Asp Ala
210 215 220
Lys Ala Ile Leu Ser Ala Arg Leu Ser Lys Ser Arg Arg Leu Glu Asn
225 230 235 240
Leu Ile Ala Gln Leu Pro Gly Glu Lys Lys Asn Gly Leu Phe Gly Asn
245 250 255
Leu Ile Ala Leu Ser Leu Gly Leu Thr Pro Asn Phe Lys Ser Asn Phe
260 265 270
Asp Leu Ala Glu Asp Ala Lys Leu Gln Leu Ser Lys Asp Thr Tyr Asp
275 280 285
Asp Asp Leu Asp Asn Leu Leu Ala Gln Ile Gly Asp Gln Tyr Ala Asp
290 295 300
Leu Phe Leu Ala Ala Lys Asn Leu Ser Asp Ala Ile Leu Leu Ser Asp
305 310 315 320
Ile Leu Arg Val Asn Thr Glu Ile Thr Lys Ala Pro Leu Ser Ala Ser
325 330 335
Met Ile Lys Arg Tyr Asp Glu His His Gln Asp Leu Thr Leu Leu Lys
340 345 350
Ala Leu Val Arg Gln Gln Leu Pro Glu Lys Tyr Lys Glu Ile Phe Phe
355 360 365
Asp Gln Ser Lys Asn Gly Tyr Ala Gly Tyr Ile Asp Gly Gly Ala Ser
370 375 380
Gln Glu Glu Phe Tyr Lys Phe Ile Lys Pro Ile Leu Glu Lys Met Asp
385 390 395 400
Gly Thr Glu Glu Leu Leu Val Lys Leu Asn Arg Glu Asp Leu Leu Arg
405 410 415
Lys Gln Arg Thr Phe Asp Asn Gly Ser Ile Pro His Gln Ile His Leu
420 425 430
Gly Glu Leu His Ala Ile Leu Arg Arg Gln Glu Asp Phe Tyr Pro Phe
435 440 445
Leu Lys Asp Asn Arg Glu Lys Ile Glu Lys Ile Leu Thr Phe Arg Ile
450 455 460
Pro Tyr Tyr Val Gly Pro Leu Ala Arg Gly Asn Ser Arg Phe Ala Trp
465 470 475 480
Met Thr Arg Lys Ser Glu Glu Thr Ile Thr Pro Trp Asn Phe Glu Glu
485 490 495
Val Val Asp Lys Gly Ala Ser Ala Gln Ser Phe Ile Glu Arg Met Thr
500 505 510
Asn Phe Asp Lys Asn Leu Pro Asn Glu Lys Val Leu Pro Lys His Ser
515 520 525
Leu Leu Tyr Glu Tyr Phe Thr Val Tyr Asn Glu Leu Thr Lys Val Lys
530 535 540
Tyr Val Thr Glu Gly Met Arg Lys Pro Ala Phe Leu Ser Gly Glu Gln
545 550 555 560
Lys Lys Ala Ile Val Asp Leu Leu Phe Lys Thr Asn Arg Lys Val Thr
565 570 575
Val Lys Gln Leu Lys Glu Asp Tyr Phe Lys Lys Ile Glu Cys Phe Asp
580 585 590
Ser Val Glu Ile Ser Gly Val Glu Asp Arg Phe Asn Ala Ser Leu Gly
595 600 605
Thr Tyr His Asp Leu Leu Lys Ile Ile Lys Asp Lys Asp Phe Leu Asp
610 615 620
Asn Glu Glu Asn Glu Asp Ile Leu Glu Asp Ile Val Leu Thr Leu Thr
625 630 635 640
Leu Phe Glu Asp Arg Glu Met Ile Glu Glu Arg Leu Lys Thr Tyr Ala
645 650 655
His Leu Phe Asp Asp Lys Val Met Lys Gln Leu Lys Arg Arg Arg Tyr
660 665 670
Thr Gly Trp Gly Arg Leu Ser Arg Lys Leu Ile Asn Gly Ile Arg Asp
675 680 685
Lys Gln Ser Gly Lys Thr Ile Leu Asp Phe Leu Lys Ser Asp Gly Phe
690 695 700
Ala Asn Arg Asn Phe Met Gln Leu Ile His Asp Asp Ser Leu Thr Phe
705 710 715 720
Lys Glu Asp Ile Gln Lys Ala Gln Val Ser Gly Gln Gly Asp Ser Leu
725 730 735
His Glu His Ile Ala Asn Leu Ala Gly Ser Pro Ala Ile Lys Lys Gly
740 745 750
Ile Leu Gln Thr Val Lys Val Val Asp Glu Leu Val Lys Val Met Gly
755 760 765
Arg His Lys Pro Glu Asn Ile Val Ile Glu Met Ala Arg Glu Asn Gln
770 775 780
Thr Thr Gln Lys Gly Gln Lys Asn Ser Arg Glu Arg Met Lys Arg Ile
785 790 795 800
Glu Glu Gly Ile Lys Glu Leu Gly Ser Gln Ile Leu Lys Glu His Pro
805 810 815
Val Glu Asn Thr Gln Leu Gln Asn Glu Lys Leu Tyr Leu Tyr Tyr Leu
820 825 830
Gln Asn Gly Arg Asp Met Tyr Val Asp Gln Glu Leu Asp Ile Asn Arg
835 840 845
Leu Ser Asp Tyr Asp Val Ala Ala Ile Val Pro Gln Ser Phe Leu Lys
850 855 860
Asp Asp Ser Ile Asp Asn Lys Val Leu Thr Arg Ser Asp Lys Ala Arg
865 870 875 880
Gly Lys Ser Asp Asn Val Pro Ser Glu Glu Val Val Lys Lys Met Lys
885 890 895
Asn Tyr Trp Arg Gln Leu Leu Asn Ala Lys Leu Ile Thr Gln Arg Lys
900 905 910
Phe Asp Asn Leu Thr Lys Ala Glu Arg Gly Gly Leu Ser Glu Leu Asp
915 920 925
Lys Ala Gly Phe Ile Lys Arg Gln Leu Val Glu Thr Arg Gln Ile Thr
930 935 940
Lys His Val Ala Gln Ile Leu Asp Ser Arg Met Asn Thr Lys Tyr Asp
945 950 955 960
Glu Asn Asp Lys Leu Ile Arg Glu Val Lys Val Ile Thr Leu Lys Ser
965 970 975
Lys Leu Val Ser Asp Phe Arg Lys Asp Phe Gln Phe Tyr Lys Val Arg
980 985 990
Glu Ile Asn Asn Tyr His His Ala His Asp Ala Tyr Leu Asn Ala Val
995 1000 1005
Val Gly Thr Ala Leu Ile Lys Lys Tyr Pro Lys Leu Glu Ser Glu
1010 1015 1020
Phe Val Tyr Gly Asp Tyr Lys Val Tyr Asp Val Arg Lys Met Ile
1025 1030 1035
Ala Lys Ser Glu Gln Glu Ile Gly Lys Ala Thr Ala Lys Tyr Phe
1040 1045 1050
Phe Tyr Ser Asn Ile Met Asn Phe Phe Lys Thr Glu Ile Thr Leu
1055 1060 1065
Ala Asn Gly Glu Ile Arg Lys Arg Pro Leu Ile Glu Thr Asn Gly
1070 1075 1080
Glu Thr Gly Glu Ile Val Trp Asp Lys Gly Arg Asp Phe Ala Thr
1085 1090 1095
Val Arg Lys Val Leu Ser Met Pro Gln Val Asn Ile Val Lys Lys
1100 1105 1110
Thr Glu Val Gln Thr Gly Gly Phe Ser Lys Glu Ser Ile Leu Pro
1115 1120 1125
Lys Arg Asn Ser Asp Lys Leu Ile Ala Arg Lys Lys Asp Trp Asp
1130 1135 1140
Pro Lys Lys Tyr Gly Gly Phe Asp Ser Pro Thr Val Ala Tyr Ser
1145 1150 1155
Val Leu Val Val Ala Lys Val Glu Lys Gly Lys Ser Lys Lys Leu
1160 1165 1170
Lys Ser Val Lys Glu Leu Leu Gly Ile Thr Ile Met Glu Arg Ser
1175 1180 1185
Ser Phe Glu Lys Asn Pro Ile Asp Phe Leu Glu Ala Lys Gly Tyr
1190 1195 1200
Lys Glu Val Lys Lys Asp Leu Ile Ile Lys Leu Pro Lys Tyr Ser
1205 1210 1215
Leu Phe Glu Leu Glu Asn Gly Arg Lys Arg Met Leu Ala Ser Ala
1220 1225 1230
Gly Glu Leu Gln Lys Gly Asn Glu Leu Ala Leu Pro Ser Lys Tyr
1235 1240 1245
Val Asn Phe Leu Tyr Leu Ala Ser His Tyr Glu Lys Leu Lys Gly
1250 1255 1260
Ser Pro Glu Asp Asn Glu Gln Lys Gln Leu Phe Val Glu Gln His
1265 1270 1275
Lys His Tyr Leu Asp Glu Ile Ile Glu Gln Ile Ser Glu Phe Ser
1280 1285 1290
Lys Arg Val Ile Leu Ala Asp Ala Asn Leu Asp Lys Val Leu Ser
1295 1300 1305
Ala Tyr Asn Lys His Arg Asp Lys Pro Ile Arg Glu Gln Ala Glu
1310 1315 1320
Asn Ile Ile His Leu Phe Thr Leu Thr Asn Leu Gly Ala Pro Ala
1325 1330 1335
Ala Phe Lys Tyr Phe Asp Thr Thr Ile Asp Arg Lys Arg Tyr Thr
1340 1345 1350
Ser Thr Lys Glu Val Leu Asp Ala Thr Leu Ile His Gln Ser Ile
1355 1360 1365
Thr Gly Leu Tyr Glu Thr Arg Ile Asp Leu Ser Gln Leu Gly Gly
1370 1375 1380
Asp Ser Gly Gly Lys Arg Pro Ala Ala Thr Lys Lys Ala Gly Gln
1385 1390 1395
Ala Lys Lys Lys Lys Ser Gly Gly Gly Gly Ser Glu Glu Pro Glu
1400 1405 1410
Glu Pro Ala Asp Ser Gly Gln Ser Leu Val Pro Val Tyr Ile Tyr
1415 1420 1425
Ser Pro Glu Tyr Val Ser Met Cys Asp Ser Leu Ala Lys Ile Pro
1430 1435 1440
Lys Arg Ala Ser Met Val His Ser Leu Ile Glu Ala Tyr Ala Leu
1445 1450 1455
His Lys Gln Met Arg Ile Val Lys Pro Lys Val Ala Ser Met Glu
1460 1465 1470
Glu Met Ala Thr Phe His Thr Asp Ala Tyr Leu Gln His Leu Gln
1475 1480 1485
Lys Val Ser Gln Glu Gly Asp Asp Asp His Pro Asp Ser Ile Glu
1490 1495 1500
Tyr Gly Leu Gly Tyr Asp Cys Pro Ala Thr Glu Gly Ile Phe Asp
1505 1510 1515
Tyr Ala Ala Ala Ile Gly Gly Ala Thr Ile Thr Ala Ala Gln Cys
1520 1525 1530
Leu Ile Asp Gly Met Cys Lys Val Ala Ile Asn Trp Ser Gly Gly
1535 1540 1545
Trp His His Ala Lys Lys Asp Glu Ala Ser Gly Phe Cys Tyr Leu
1550 1555 1560
Asn Asp Ala Val Leu Gly Ile Leu Arg Leu Arg Arg Lys Phe Glu
1565 1570 1575
Arg Ile Leu Tyr Val Asp Leu Asp Leu His His Gly Asp Gly Val
1580 1585 1590
Glu Asp Ala Phe Ser Phe Thr Ser Lys Val Met Thr Val Ser Leu
1595 1600 1605
His Lys Phe Ser Pro Gly Phe Phe Pro Gly Thr Gly Asp Val Ser
1610 1615 1620
Asp Val Gly Leu Gly Lys Gly Arg Tyr Tyr Ser Val Asn Val Pro
1625 1630 1635
Ile Gln Asp Gly Ile Gln Asp Glu Lys Tyr Tyr Gln Ile Cys Glu
1640 1645 1650
Ser Val Leu Lys Glu Val Tyr Gln Ala Phe Asn Pro Lys Ala Val
1655 1660 1665
Val Leu Gln Leu Gly Ala Asp Thr Ile Ala Gly Asp Pro Met Cys
1670 1675 1680
Ser Phe Asn Met Thr Pro Val Gly Ile Gly Lys Cys Leu Lys Tyr
1685 1690 1695
Ile Leu Gln Trp Gln Leu Ala Thr Leu Ile Leu Gly Gly Gly Gly
1700 1705 1710
Tyr Asn Leu Ala Asn Thr Ala Arg Cys Trp Thr Tyr Leu Thr Gly
1715 1720 1725
Val Ile Leu Gly Lys Thr Leu Ser Ser Glu Ile Pro Asp His Glu
1730 1735 1740
Phe Phe Thr Ala Tyr Gly Pro Asp Tyr Val Leu Glu Ile Thr Pro
1745 1750 1755
Ser Cys Arg Pro Asp Arg Asn Glu Pro His Arg Ile Gln Gln Ile
1760 1765 1770
Leu Asn Tyr Ile Lys Gly Asn Leu Lys His Val Val Gly Gly Gly
1775 1780 1785
Gly Ser Gly Lys Arg Pro Ala Ala Thr Lys Lys Ala Gly Gln Ala
1790 1795 1800
Lys Lys Lys Lys Gly Ser
1805
<210> 79
<211> 2572
<212> PRT
<213> Artificial Sequence
<220>
<223> Description of Artificial Sequence: Synthetic polypeptide
<400> 79
Met Ala Pro Lys Lys Lys Arg Lys Val Gly Gly Ser Gly Gly Ser Gly
1 5 10 15
Gln Thr Gly Lys Lys Ser Glu Lys Gly Pro Val Cys Trp Arg Lys Arg
20 25 30
Val Lys Ser Glu Tyr Met Arg Leu Arg Gln Leu Lys Arg Phe Arg Arg
35 40 45
Ala Asp Glu Val Lys Ser Met Phe Ser Ser Asn Arg Gln Lys Ile Leu
50 55 60
Glu Arg Thr Glu Ile Leu Asn Gln Glu Trp Lys Gln Arg Arg Ile Gln
65 70 75 80
Pro Val His Ile Leu Thr Ser Val Ser Ser Leu Arg Gly Thr Arg Glu
85 90 95
Cys Ser Val Thr Ser Asp Leu Asp Phe Pro Thr Gln Val Ile Pro Leu
100 105 110
Lys Thr Leu Asn Ala Val Ala Ser Val Pro Ile Met Tyr Ser Trp Ser
115 120 125
Pro Leu Gln Gln Asn Phe Met Val Glu Asp Glu Thr Val Leu His Asn
130 135 140
Ile Pro Tyr Met Gly Asp Glu Val Leu Asp Gln Asp Gly Thr Phe Ile
145 150 155 160
Glu Glu Leu Ile Lys Asn Tyr Asp Gly Lys Val His Gly Asp Arg Glu
165 170 175
Cys Gly Phe Ile Asn Asp Glu Ile Phe Val Glu Leu Val Asn Ala Leu
180 185 190
Gly Gln Tyr Asn Asp Asp Asp Asp Asp Asp Asp Gly Asp Asp Pro Glu
195 200 205
Glu Arg Glu Glu Lys Gln Lys Asp Leu Glu Asp His Arg Asp Asp Lys
210 215 220
Glu Ser Arg Pro Pro Arg Lys Phe Pro Ser Asp Lys Ile Phe Glu Ala
225 230 235 240
Ile Ser Ser Met Phe Pro Asp Lys Gly Thr Ala Glu Glu Leu Lys Glu
245 250 255
Lys Tyr Lys Glu Leu Thr Glu Gln Gln Leu Pro Gly Ala Leu Pro Pro
260 265 270
Glu Cys Thr Pro Asn Ile Asp Gly Pro Asn Ala Lys Ser Val Gln Arg
275 280 285
Glu Gln Ser Leu His Ser Phe His Thr Leu Phe Cys Arg Arg Cys Phe
290 295 300
Lys Tyr Asp Cys Phe Leu His Pro Phe His Ala Thr Pro Asn Thr Tyr
305 310 315 320
Lys Arg Lys Asn Thr Glu Thr Ala Leu Asp Asn Lys Pro Cys Gly Pro
325 330 335
Gln Cys Tyr Gln His Leu Glu Gly Ala Lys Glu Phe Ala Ala Ala Leu
340 345 350
Thr Ala Glu Arg Ile Lys Thr Pro Pro Lys Arg Pro Gly Gly Arg Arg
355 360 365
Arg Gly Arg Leu Pro Asn Asn Ser Ser Arg Pro Ser Thr Pro Thr Ile
370 375 380
Asn Val Leu Glu Ser Lys Asp Thr Asp Ser Asp Arg Glu Ala Gly Thr
385 390 395 400
Glu Thr Gly Gly Glu Asn Asn Asp Lys Glu Glu Glu Glu Lys Lys Asp
405 410 415
Glu Thr Ser Ser Ser Ser Glu Ala Asn Ser Arg Cys Gln Thr Pro Ile
420 425 430
Lys Met Lys Pro Asn Ile Glu Pro Pro Glu Asn Val Glu Trp Ser Gly
435 440 445
Ala Glu Ala Ser Met Phe Arg Val Leu Ile Gly Thr Tyr Tyr Asp Asn
450 455 460
Phe Cys Ala Ile Ala Arg Leu Ile Gly Thr Lys Thr Cys Arg Gln Val
465 470 475 480
Tyr Glu Phe Arg Val Lys Glu Ser Ser Ile Ile Ala Pro Ala Pro Ala
485 490 495
Glu Asp Val Asp Thr Pro Pro Arg Lys Lys Lys Arg Lys His Arg Leu
500 505 510
Trp Ala Ala His Cys Arg Lys Ile Gln Leu Lys Lys Asp Gly Ser Ser
515 520 525
Asn His Val Tyr Asn Tyr Gln Pro Cys Asp His Pro Arg Gln Pro Cys
530 535 540
Asp Ser Ser Cys Pro Cys Val Ile Ala Gln Asn Phe Cys Glu Lys Phe
545 550 555 560
Cys Gln Cys Ser Ser Glu Cys Gln Asn Arg Phe Pro Gly Cys Arg Cys
565 570 575
Lys Ala Gln Cys Asn Thr Lys Gln Cys Pro Cys Tyr Leu Ala Val Arg
580 585 590
Glu Cys Asp Pro Asp Leu Cys Leu Thr Cys Gly Ala Ala Asp His Trp
595 600 605
Asp Ser Lys Asn Val Ser Cys Lys Asn Cys Ser Ile Gln Arg Gly Ser
610 615 620
Lys Lys His Leu Leu Leu Ala Pro Ser Asp Val Ala Gly Trp Gly Ile
625 630 635 640
Phe Ile Lys Asp Pro Val Gln Lys Asn Glu Phe Ile Ser Glu Tyr Cys
645 650 655
Gly Glu Ile Ile Ser Gln Asp Glu Ala Asp Arg Arg Gly Lys Val Tyr
660 665 670
Asp Lys Tyr Met Cys Ser Phe Leu Phe Asn Leu Asn Asn Asp Phe Val
675 680 685
Val Asp Ala Thr Arg Lys Gly Asn Lys Ile Arg Phe Ala Asn His Ser
690 695 700
Val Asn Pro Asn Cys Tyr Ala Lys Val Met Met Val Asn Gly Asp His
705 710 715 720
Arg Ile Gly Ile Phe Ala Lys Arg Ala Ile Gln Thr Gly Glu Glu Leu
725 730 735
Phe Phe Asp Tyr Arg Tyr Ser Gln Ala Asp Ala Leu Lys Tyr Val Gly
740 745 750
Ile Glu Arg Glu Met Glu Ile Pro Ser Thr Gly Gly Ser Gly Gly Ser
755 760 765
Gly Gly Ser Gly Gly Ser Gly Gly Ser Gly Arg Pro Asp Lys Lys Tyr
770 775 780
Ser Ile Gly Leu Ala Ile Gly Thr Asn Ser Val Gly Trp Ala Val Ile
785 790 795 800
Thr Asp Glu Tyr Lys Val Pro Ser Lys Lys Phe Lys Val Leu Gly Asn
805 810 815
Thr Asp Arg His Ser Ile Lys Lys Asn Leu Ile Gly Ala Leu Leu Phe
820 825 830
Asp Ser Gly Glu Thr Ala Glu Ala Thr Arg Leu Lys Arg Thr Ala Arg
835 840 845
Arg Arg Tyr Thr Arg Arg Lys Asn Arg Ile Cys Tyr Leu Gln Glu Ile
850 855 860
Phe Ser Asn Glu Met Ala Lys Val Asp Asp Ser Phe Phe His Arg Leu
865 870 875 880
Glu Glu Ser Phe Leu Val Glu Glu Asp Lys Lys His Glu Arg His Pro
885 890 895
Ile Phe Gly Asn Ile Val Asp Glu Val Ala Tyr His Glu Lys Tyr Pro
900 905 910
Thr Ile Tyr His Leu Arg Lys Lys Leu Val Asp Ser Thr Asp Lys Ala
915 920 925
Asp Leu Arg Leu Ile Tyr Leu Ala Leu Ala His Met Ile Lys Phe Arg
930 935 940
Gly His Phe Leu Ile Glu Gly Asp Leu Asn Pro Asp Asn Ser Asp Val
945 950 955 960
Asp Lys Leu Phe Ile Gln Leu Val Gln Thr Tyr Asn Gln Leu Phe Glu
965 970 975
Glu Asn Pro Ile Asn Ala Ser Gly Val Asp Ala Lys Ala Ile Leu Ser
980 985 990
Ala Arg Leu Ser Lys Ser Arg Arg Leu Glu Asn Leu Ile Ala Gln Leu
995 1000 1005
Pro Gly Glu Lys Lys Asn Gly Leu Phe Gly Asn Leu Ile Ala Leu
1010 1015 1020
Ser Leu Gly Leu Thr Pro Asn Phe Lys Ser Asn Phe Asp Leu Ala
1025 1030 1035
Glu Asp Ala Lys Leu Gln Leu Ser Lys Asp Thr Tyr Asp Asp Asp
1040 1045 1050
Leu Asp Asn Leu Leu Ala Gln Ile Gly Asp Gln Tyr Ala Asp Leu
1055 1060 1065
Phe Leu Ala Ala Lys Asn Leu Ser Asp Ala Ile Leu Leu Ser Asp
1070 1075 1080
Ile Leu Arg Val Asn Thr Glu Ile Thr Lys Ala Pro Leu Ser Ala
1085 1090 1095
Ser Met Ile Lys Arg Tyr Asp Glu His His Gln Asp Leu Thr Leu
1100 1105 1110
Leu Lys Ala Leu Val Arg Gln Gln Leu Pro Glu Lys Tyr Lys Glu
1115 1120 1125
Ile Phe Phe Asp Gln Ser Lys Asn Gly Tyr Ala Gly Tyr Ile Asp
1130 1135 1140
Gly Gly Ala Ser Gln Glu Glu Phe Tyr Lys Phe Ile Lys Pro Ile
1145 1150 1155
Leu Glu Lys Met Asp Gly Thr Glu Glu Leu Leu Val Lys Leu Asn
1160 1165 1170
Arg Glu Asp Leu Leu Arg Lys Gln Arg Thr Phe Asp Asn Gly Ser
1175 1180 1185
Ile Pro His Gln Ile His Leu Gly Glu Leu His Ala Ile Leu Arg
1190 1195 1200
Arg Gln Glu Asp Phe Tyr Pro Phe Leu Lys Asp Asn Arg Glu Lys
1205 1210 1215
Ile Glu Lys Ile Leu Thr Phe Arg Ile Pro Tyr Tyr Val Gly Pro
1220 1225 1230
Leu Ala Arg Gly Asn Ser Arg Phe Ala Trp Met Thr Arg Lys Ser
1235 1240 1245
Glu Glu Thr Ile Thr Pro Trp Asn Phe Glu Glu Val Val Asp Lys
1250 1255 1260
Gly Ala Ser Ala Gln Ser Phe Ile Glu Arg Met Thr Asn Phe Asp
1265 1270 1275
Lys Asn Leu Pro Asn Glu Lys Val Leu Pro Lys His Ser Leu Leu
1280 1285 1290
Tyr Glu Tyr Phe Thr Val Tyr Asn Glu Leu Thr Lys Val Lys Tyr
1295 1300 1305
Val Thr Glu Gly Met Arg Lys Pro Ala Phe Leu Ser Gly Glu Gln
1310 1315 1320
Lys Lys Ala Ile Val Asp Leu Leu Phe Lys Thr Asn Arg Lys Val
1325 1330 1335
Thr Val Lys Gln Leu Lys Glu Asp Tyr Phe Lys Lys Ile Glu Cys
1340 1345 1350
Phe Asp Ser Val Glu Ile Ser Gly Val Glu Asp Arg Phe Asn Ala
1355 1360 1365
Ser Leu Gly Thr Tyr His Asp Leu Leu Lys Ile Ile Lys Asp Lys
1370 1375 1380
Asp Phe Leu Asp Asn Glu Glu Asn Glu Asp Ile Leu Glu Asp Ile
1385 1390 1395
Val Leu Thr Leu Thr Leu Phe Glu Asp Arg Glu Met Ile Glu Glu
1400 1405 1410
Arg Leu Lys Thr Tyr Ala His Leu Phe Asp Asp Lys Val Met Lys
1415 1420 1425
Gln Leu Lys Arg Arg Arg Tyr Thr Gly Trp Gly Arg Leu Ser Arg
1430 1435 1440
Lys Leu Ile Asn Gly Ile Arg Asp Lys Gln Ser Gly Lys Thr Ile
1445 1450 1455
Leu Asp Phe Leu Lys Ser Asp Gly Phe Ala Asn Arg Asn Phe Met
1460 1465 1470
Gln Leu Ile His Asp Asp Ser Leu Thr Phe Lys Glu Asp Ile Gln
1475 1480 1485
Lys Ala Gln Val Ser Gly Gln Gly Asp Ser Leu His Glu His Ile
1490 1495 1500
Ala Asn Leu Ala Gly Ser Pro Ala Ile Lys Lys Gly Ile Leu Gln
1505 1510 1515
Thr Val Lys Val Val Asp Glu Leu Val Lys Val Met Gly Arg His
1520 1525 1530
Lys Pro Glu Asn Ile Val Ile Glu Met Ala Arg Glu Asn Gln Thr
1535 1540 1545
Thr Gln Lys Gly Gln Lys Asn Ser Arg Glu Arg Met Lys Arg Ile
1550 1555 1560
Glu Glu Gly Ile Lys Glu Leu Gly Ser Gln Ile Leu Lys Glu His
1565 1570 1575
Pro Val Glu Asn Thr Gln Leu Gln Asn Glu Lys Leu Tyr Leu Tyr
1580 1585 1590
Tyr Leu Gln Asn Gly Arg Asp Met Tyr Val Asp Gln Glu Leu Asp
1595 1600 1605
Ile Asn Arg Leu Ser Asp Tyr Asp Val Ala Ala Ile Val Pro Gln
1610 1615 1620
Ser Phe Leu Lys Asp Asp Ser Ile Asp Asn Lys Val Leu Thr Arg
1625 1630 1635
Ser Asp Lys Ala Arg Gly Lys Ser Asp Asn Val Pro Ser Glu Glu
1640 1645 1650
Val Val Lys Lys Met Lys Asn Tyr Trp Arg Gln Leu Leu Asn Ala
1655 1660 1665
Lys Leu Ile Thr Gln Arg Lys Phe Asp Asn Leu Thr Lys Ala Glu
1670 1675 1680
Arg Gly Gly Leu Ser Glu Leu Asp Lys Ala Gly Phe Ile Lys Arg
1685 1690 1695
Gln Leu Val Glu Thr Arg Gln Ile Thr Lys His Val Ala Gln Ile
1700 1705 1710
Leu Asp Ser Arg Met Asn Thr Lys Tyr Asp Glu Asn Asp Lys Leu
1715 1720 1725
Ile Arg Glu Val Lys Val Ile Thr Leu Lys Ser Lys Leu Val Ser
1730 1735 1740
Asp Phe Arg Lys Asp Phe Gln Phe Tyr Lys Val Arg Glu Ile Asn
1745 1750 1755
Asn Tyr His His Ala His Asp Ala Tyr Leu Asn Ala Val Val Gly
1760 1765 1770
Thr Ala Leu Ile Lys Lys Tyr Pro Lys Leu Glu Ser Glu Phe Val
1775 1780 1785
Tyr Gly Asp Tyr Lys Val Tyr Asp Val Arg Lys Met Ile Ala Lys
1790 1795 1800
Ser Glu Gln Glu Ile Gly Lys Ala Thr Ala Lys Tyr Phe Phe Tyr
1805 1810 1815
Ser Asn Ile Met Asn Phe Phe Lys Thr Glu Ile Thr Leu Ala Asn
1820 1825 1830
Gly Glu Ile Arg Lys Arg Pro Leu Ile Glu Thr Asn Gly Glu Thr
1835 1840 1845
Gly Glu Ile Val Trp Asp Lys Gly Arg Asp Phe Ala Thr Val Arg
1850 1855 1860
Lys Val Leu Ser Met Pro Gln Val Asn Ile Val Lys Lys Thr Glu
1865 1870 1875
Val Gln Thr Gly Gly Phe Ser Lys Glu Ser Ile Leu Pro Lys Arg
1880 1885 1890
Asn Ser Asp Lys Leu Ile Ala Arg Lys Lys Asp Trp Asp Pro Lys
1895 1900 1905
Lys Tyr Gly Gly Phe Asp Ser Pro Thr Val Ala Tyr Ser Val Leu
1910 1915 1920
Val Val Ala Lys Val Glu Lys Gly Lys Ser Lys Lys Leu Lys Ser
1925 1930 1935
Val Lys Glu Leu Leu Gly Ile Thr Ile Met Glu Arg Ser Ser Phe
1940 1945 1950
Glu Lys Asn Pro Ile Asp Phe Leu Glu Ala Lys Gly Tyr Lys Glu
1955 1960 1965
Val Lys Lys Asp Leu Ile Ile Lys Leu Pro Lys Tyr Ser Leu Phe
1970 1975 1980
Glu Leu Glu Asn Gly Arg Lys Arg Met Leu Ala Ser Ala Gly Glu
1985 1990 1995
Leu Gln Lys Gly Asn Glu Leu Ala Leu Pro Ser Lys Tyr Val Asn
2000 2005 2010
Phe Leu Tyr Leu Ala Ser His Tyr Glu Lys Leu Lys Gly Ser Pro
2015 2020 2025
Glu Asp Asn Glu Gln Lys Gln Leu Phe Val Glu Gln His Lys His
2030 2035 2040
Tyr Leu Asp Glu Ile Ile Glu Gln Ile Ser Glu Phe Ser Lys Arg
2045 2050 2055
Val Ile Leu Ala Asp Ala Asn Leu Asp Lys Val Leu Ser Ala Tyr
2060 2065 2070
Asn Lys His Arg Asp Lys Pro Ile Arg Glu Gln Ala Glu Asn Ile
2075 2080 2085
Ile His Leu Phe Thr Leu Thr Asn Leu Gly Ala Pro Ala Ala Phe
2090 2095 2100
Lys Tyr Phe Asp Thr Thr Ile Asp Arg Lys Arg Tyr Thr Ser Thr
2105 2110 2115
Lys Glu Val Leu Asp Ala Thr Leu Ile His Gln Ser Ile Thr Gly
2120 2125 2130
Leu Tyr Glu Thr Arg Ile Asp Leu Ser Gln Leu Gly Gly Asp Ser
2135 2140 2145
Gly Gly Lys Arg Pro Ala Ala Thr Lys Lys Ala Gly Gln Ala Lys
2150 2155 2160
Lys Lys Lys Ser Gly Gly Gly Gly Ser Glu Glu Pro Glu Glu Pro
2165 2170 2175
Ala Asp Ser Gly Gln Ser Leu Val Pro Val Tyr Ile Tyr Ser Pro
2180 2185 2190
Glu Tyr Val Ser Met Cys Asp Ser Leu Ala Lys Ile Pro Lys Arg
2195 2200 2205
Ala Ser Met Val His Ser Leu Ile Glu Ala Tyr Ala Leu His Lys
2210 2215 2220
Gln Met Arg Ile Val Lys Pro Lys Val Ala Ser Met Glu Glu Met
2225 2230 2235
Ala Thr Phe His Thr Asp Ala Tyr Leu Gln His Leu Gln Lys Val
2240 2245 2250
Ser Gln Glu Gly Asp Asp Asp His Pro Asp Ser Ile Glu Tyr Gly
2255 2260 2265
Leu Gly Tyr Asp Cys Pro Ala Thr Glu Gly Ile Phe Asp Tyr Ala
2270 2275 2280
Ala Ala Ile Gly Gly Ala Thr Ile Thr Ala Ala Gln Cys Leu Ile
2285 2290 2295
Asp Gly Met Cys Lys Val Ala Ile Asn Trp Ser Gly Gly Trp His
2300 2305 2310
His Ala Lys Lys Asp Glu Ala Ser Gly Phe Cys Tyr Leu Asn Asp
2315 2320 2325
Ala Val Leu Gly Ile Leu Arg Leu Arg Arg Lys Phe Glu Arg Ile
2330 2335 2340
Leu Tyr Val Asp Leu Asp Leu His His Gly Asp Gly Val Glu Asp
2345 2350 2355
Ala Phe Ser Phe Thr Ser Lys Val Met Thr Val Ser Leu His Lys
2360 2365 2370
Phe Ser Pro Gly Phe Phe Pro Gly Thr Gly Asp Val Ser Asp Val
2375 2380 2385
Gly Leu Gly Lys Gly Arg Tyr Tyr Ser Val Asn Val Pro Ile Gln
2390 2395 2400
Asp Gly Ile Gln Asp Glu Lys Tyr Tyr Gln Ile Cys Glu Ser Val
2405 2410 2415
Leu Lys Glu Val Tyr Gln Ala Phe Asn Pro Lys Ala Val Val Leu
2420 2425 2430
Gln Leu Gly Ala Asp Thr Ile Ala Gly Asp Pro Met Cys Ser Phe
2435 2440 2445
Asn Met Thr Pro Val Gly Ile Gly Lys Cys Leu Lys Tyr Ile Leu
2450 2455 2460
Gln Trp Gln Leu Ala Thr Leu Ile Leu Gly Gly Gly Gly Tyr Asn
2465 2470 2475
Leu Ala Asn Thr Ala Arg Cys Trp Thr Tyr Leu Thr Gly Val Ile
2480 2485 2490
Leu Gly Lys Thr Leu Ser Ser Glu Ile Pro Asp His Glu Phe Phe
2495 2500 2505
Thr Ala Tyr Gly Pro Asp Tyr Val Leu Glu Ile Thr Pro Ser Cys
2510 2515 2520
Arg Pro Asp Arg Asn Glu Pro His Arg Ile Gln Gln Ile Leu Asn
2525 2530 2535
Tyr Ile Lys Gly Asn Leu Lys His Val Val Gly Gly Gly Gly Ser
2540 2545 2550
Gly Lys Arg Pro Ala Ala Thr Lys Lys Ala Gly Gln Ala Lys Lys
2555 2560 2565
Lys Lys Gly Ser
2570
<210> 80
<211> 2572
<212> PRT
<213> Artificial Sequence
<220>
<223> Description of Artificial Sequence: Synthetic polypeptide
<400> 80
Met Ala Pro Lys Lys Lys Arg Lys Val Gly Ile His Gly Val Pro Ala
1 5 10 15
Ala Ser Gly Gly Gly Gly Ser Gly Ser Ala Ala Ile Ala Glu Val Leu
20 25 30
Leu Asn Ala Arg Cys Asp Leu His Ala Val Asn Tyr His Gly Asp Thr
35 40 45
Pro Leu His Ile Ala Ala Arg Glu Ser Tyr His Asp Cys Val Leu Leu
50 55 60
Phe Leu Ser Arg Gly Ala Asn Pro Glu Leu Arg Asn Lys Glu Gly Asp
65 70 75 80
Thr Ala Trp Asp Leu Thr Pro Glu Arg Ser Asp Val Trp Phe Ala Leu
85 90 95
Gln Leu Asn Arg Lys Leu Arg Leu Gly Val Gly Asn Arg Ala Ile Arg
100 105 110
Thr Glu Lys Ile Ile Cys Arg Asp Val Ala Arg Gly Tyr Glu Asn Val
115 120 125
Pro Ile Pro Cys Val Asn Gly Val Asp Gly Glu Pro Cys Pro Glu Asp
130 135 140
Tyr Lys Tyr Ile Ser Glu Asn Cys Glu Thr Ser Thr Met Asn Ile Asp
145 150 155 160
Arg Asn Ile Thr His Leu Gln His Cys Thr Cys Val Asp Asp Cys Ser
165 170 175
Ser Ser Asn Cys Leu Cys Gly Gln Leu Ser Ile Arg Cys Trp Tyr Asp
180 185 190
Lys Asp Gly Arg Leu Leu Gln Glu Phe Asn Lys Ile Glu Pro Pro Leu
195 200 205
Ile Phe Glu Cys Asn Gln Ala Cys Ser Cys Trp Arg Asn Cys Lys Asn
210 215 220
Arg Val Val Gln Ser Gly Ile Lys Val Arg Leu Gln Leu Tyr Arg Thr
225 230 235 240
Ala Lys Met Gly Trp Gly Val Arg Ala Leu Gln Thr Ile Pro Gln Gly
245 250 255
Thr Phe Ile Cys Glu Tyr Val Gly Glu Leu Ile Ser Asp Ala Glu Ala
260 265 270
Asp Val Arg Glu Asp Asp Ser Tyr Leu Phe Asp Leu Asp Asn Lys Asp
275 280 285
Gly Glu Val Tyr Cys Ile Asp Ala Arg Tyr Tyr Gly Asn Ile Ser Arg
290 295 300
Phe Ile Asn His Leu Cys Asp Pro Asn Ile Ile Pro Val Arg Val Phe
305 310 315 320
Met Leu His Gln Asp Leu Arg Phe Pro Arg Ile Ala Phe Phe Ser Ser
325 330 335
Arg Asp Ile Arg Thr Gly Glu Glu Leu Gly Phe Asp Tyr Gly Asp Arg
340 345 350
Phe Trp Asp Ile Lys Ser Lys Tyr Phe Thr Cys Gln Cys Gly Ser Glu
355 360 365
Lys Cys Lys His Ser Ala Glu Ala Ile Ala Leu Glu Gln Ser Arg Leu
370 375 380
Ala Arg Leu Asp Pro His Pro Glu Leu Leu Pro Glu Leu Gly Ser Leu
385 390 395 400
Pro Pro Val Asn Thr Gly Gly Gly Gly Ser Gly Asp Lys Lys Tyr Ser
405 410 415
Ile Gly Leu Ala Ile Gly Thr Asn Ser Val Gly Trp Ala Val Ile Thr
420 425 430
Asp Glu Tyr Lys Val Pro Ser Lys Lys Phe Lys Val Leu Gly Asn Thr
435 440 445
Asp Arg His Ser Ile Lys Lys Asn Leu Ile Gly Ala Leu Leu Phe Asp
450 455 460
Ser Gly Glu Thr Ala Glu Ala Thr Arg Leu Lys Arg Thr Ala Arg Arg
465 470 475 480
Arg Tyr Thr Arg Arg Lys Asn Arg Ile Cys Tyr Leu Gln Glu Ile Phe
485 490 495
Ser Asn Glu Met Ala Lys Val Asp Asp Ser Phe Phe His Arg Leu Glu
500 505 510
Glu Ser Phe Leu Val Glu Glu Asp Lys Lys His Glu Arg His Pro Ile
515 520 525
Phe Gly Asn Ile Val Asp Glu Val Ala Tyr His Glu Lys Tyr Pro Thr
530 535 540
Ile Tyr His Leu Arg Lys Lys Leu Val Asp Ser Thr Asp Lys Ala Asp
545 550 555 560
Leu Arg Leu Ile Tyr Leu Ala Leu Ala His Met Ile Lys Phe Arg Gly
565 570 575
His Phe Leu Ile Glu Gly Asp Leu Asn Pro Asp Asn Ser Asp Val Asp
580 585 590
Lys Leu Phe Ile Gln Leu Val Gln Thr Tyr Asn Gln Leu Phe Glu Glu
595 600 605
Asn Pro Ile Asn Ala Ser Gly Val Asp Ala Lys Ala Ile Leu Ser Ala
610 615 620
Arg Leu Ser Lys Ser Arg Arg Leu Glu Asn Leu Ile Ala Gln Leu Pro
625 630 635 640
Gly Glu Lys Lys Asn Gly Leu Phe Gly Asn Leu Ile Ala Leu Ser Leu
645 650 655
Gly Leu Thr Pro Asn Phe Lys Ser Asn Phe Asp Leu Ala Glu Asp Ala
660 665 670
Lys Leu Gln Leu Ser Lys Asp Thr Tyr Asp Asp Asp Leu Asp Asn Leu
675 680 685
Leu Ala Gln Ile Gly Asp Gln Tyr Ala Asp Leu Phe Leu Ala Ala Lys
690 695 700
Asn Leu Ser Asp Ala Ile Leu Leu Ser Asp Ile Leu Arg Val Asn Thr
705 710 715 720
Glu Ile Thr Lys Ala Pro Leu Ser Ala Ser Met Ile Lys Arg Tyr Asp
725 730 735
Glu His His Gln Asp Leu Thr Leu Leu Lys Ala Leu Val Arg Gln Gln
740 745 750
Leu Pro Glu Lys Tyr Lys Glu Ile Phe Phe Asp Gln Ser Lys Asn Gly
755 760 765
Tyr Ala Gly Tyr Ile Asp Gly Gly Ala Ser Gln Glu Glu Phe Tyr Lys
770 775 780
Phe Ile Lys Pro Ile Leu Glu Lys Met Asp Gly Thr Glu Glu Leu Leu
785 790 795 800
Val Lys Leu Asn Arg Glu Asp Leu Leu Arg Lys Gln Arg Thr Phe Asp
805 810 815
Asn Gly Ser Ile Pro His Gln Ile His Leu Gly Glu Leu His Ala Ile
820 825 830
Leu Arg Arg Gln Glu Asp Phe Tyr Pro Phe Leu Lys Asp Asn Arg Glu
835 840 845
Lys Ile Glu Lys Ile Leu Thr Phe Arg Ile Pro Tyr Tyr Val Gly Pro
850 855 860
Leu Ala Arg Gly Asn Ser Arg Phe Ala Trp Met Thr Arg Lys Ser Glu
865 870 875 880
Glu Thr Ile Thr Pro Trp Asn Phe Glu Glu Val Val Asp Lys Gly Ala
885 890 895
Ser Ala Gln Ser Phe Ile Glu Arg Met Thr Asn Phe Asp Lys Asn Leu
900 905 910
Pro Asn Glu Lys Val Leu Pro Lys His Ser Leu Leu Tyr Glu Tyr Phe
915 920 925
Thr Val Tyr Asn Glu Leu Thr Lys Val Lys Tyr Val Thr Glu Gly Met
930 935 940
Arg Lys Pro Ala Phe Leu Ser Gly Glu Gln Lys Lys Ala Ile Val Asp
945 950 955 960
Leu Leu Phe Lys Thr Asn Arg Lys Val Thr Val Lys Gln Leu Lys Glu
965 970 975
Asp Tyr Phe Lys Lys Ile Glu Cys Phe Asp Ser Val Glu Ile Ser Gly
980 985 990
Val Glu Asp Arg Phe Asn Ala Ser Leu Gly Thr Tyr His Asp Leu Leu
995 1000 1005
Lys Ile Ile Lys Asp Lys Asp Phe Leu Asp Asn Glu Glu Asn Glu
1010 1015 1020
Asp Ile Leu Glu Asp Ile Val Leu Thr Leu Thr Leu Phe Glu Asp
1025 1030 1035
Arg Glu Met Ile Glu Glu Arg Leu Lys Thr Tyr Ala His Leu Phe
1040 1045 1050
Asp Asp Lys Val Met Lys Gln Leu Lys Arg Arg Arg Tyr Thr Gly
1055 1060 1065
Trp Gly Arg Leu Ser Arg Lys Leu Ile Asn Gly Ile Arg Asp Lys
1070 1075 1080
Gln Ser Gly Lys Thr Ile Leu Asp Phe Leu Lys Ser Asp Gly Phe
1085 1090 1095
Ala Asn Arg Asn Phe Met Gln Leu Ile His Asp Asp Ser Leu Thr
1100 1105 1110
Phe Lys Glu Asp Ile Gln Lys Ala Gln Val Ser Gly Gln Gly Asp
1115 1120 1125
Ser Leu His Glu His Ile Ala Asn Leu Ala Gly Ser Pro Ala Ile
1130 1135 1140
Lys Lys Gly Ile Leu Gln Thr Val Lys Val Val Asp Glu Leu Val
1145 1150 1155
Lys Val Met Gly Arg His Lys Pro Glu Asn Ile Val Ile Glu Met
1160 1165 1170
Ala Arg Glu Asn Gln Thr Thr Gln Lys Gly Gln Lys Asn Ser Arg
1175 1180 1185
Glu Arg Met Lys Arg Ile Glu Glu Gly Ile Lys Glu Leu Gly Ser
1190 1195 1200
Gln Ile Leu Lys Glu His Pro Val Glu Asn Thr Gln Leu Gln Asn
1205 1210 1215
Glu Lys Leu Tyr Leu Tyr Tyr Leu Gln Asn Gly Arg Asp Met Tyr
1220 1225 1230
Val Asp Gln Glu Leu Asp Ile Asn Arg Leu Ser Asp Tyr Asp Val
1235 1240 1245
Ala Ala Ile Val Pro Gln Ser Phe Leu Lys Asp Asp Ser Ile Asp
1250 1255 1260
Asn Lys Val Leu Thr Arg Ser Asp Lys Ala Arg Gly Lys Ser Asp
1265 1270 1275
Asn Val Pro Ser Glu Glu Val Val Lys Lys Met Lys Asn Tyr Trp
1280 1285 1290
Arg Gln Leu Leu Asn Ala Lys Leu Ile Thr Gln Arg Lys Phe Asp
1295 1300 1305
Asn Leu Thr Lys Ala Glu Arg Gly Gly Leu Ser Glu Leu Asp Lys
1310 1315 1320
Ala Gly Phe Ile Lys Arg Gln Leu Val Glu Thr Arg Gln Ile Thr
1325 1330 1335
Lys His Val Ala Gln Ile Leu Asp Ser Arg Met Asn Thr Lys Tyr
1340 1345 1350
Asp Glu Asn Asp Lys Leu Ile Arg Glu Val Lys Val Ile Thr Leu
1355 1360 1365
Lys Ser Lys Leu Val Ser Asp Phe Arg Lys Asp Phe Gln Phe Tyr
1370 1375 1380
Lys Val Arg Glu Ile Asn Asn Tyr His His Ala His Asp Ala Tyr
1385 1390 1395
Leu Asn Ala Val Val Gly Thr Ala Leu Ile Lys Lys Tyr Pro Lys
1400 1405 1410
Leu Glu Ser Glu Phe Val Tyr Gly Asp Tyr Lys Val Tyr Asp Val
1415 1420 1425
Arg Lys Met Ile Ala Lys Ser Glu Gln Glu Ile Gly Lys Ala Thr
1430 1435 1440
Ala Lys Tyr Phe Phe Tyr Ser Asn Ile Met Asn Phe Phe Lys Thr
1445 1450 1455
Glu Ile Thr Leu Ala Asn Gly Glu Ile Arg Lys Arg Pro Leu Ile
1460 1465 1470
Glu Thr Asn Gly Glu Thr Gly Glu Ile Val Trp Asp Lys Gly Arg
1475 1480 1485
Asp Phe Ala Thr Val Arg Lys Val Leu Ser Met Pro Gln Val Asn
1490 1495 1500
Ile Val Lys Lys Thr Glu Val Gln Thr Gly Gly Phe Ser Lys Glu
1505 1510 1515
Ser Ile Leu Pro Lys Arg Asn Ser Asp Lys Leu Ile Ala Arg Lys
1520 1525 1530
Lys Asp Trp Asp Pro Lys Lys Tyr Gly Gly Phe Asp Ser Pro Thr
1535 1540 1545
Val Ala Tyr Ser Val Leu Val Val Ala Lys Val Glu Lys Gly Lys
1550 1555 1560
Ser Lys Lys Leu Lys Ser Val Lys Glu Leu Leu Gly Ile Thr Ile
1565 1570 1575
Met Glu Arg Ser Ser Phe Glu Lys Asn Pro Ile Asp Phe Leu Glu
1580 1585 1590
Ala Lys Gly Tyr Lys Glu Val Lys Lys Asp Leu Ile Ile Lys Leu
1595 1600 1605
Pro Lys Tyr Ser Leu Phe Glu Leu Glu Asn Gly Arg Lys Arg Met
1610 1615 1620
Leu Ala Ser Ala Gly Glu Leu Gln Lys Gly Asn Glu Leu Ala Leu
1625 1630 1635
Pro Ser Lys Tyr Val Asn Phe Leu Tyr Leu Ala Ser His Tyr Glu
1640 1645 1650
Lys Leu Lys Gly Ser Pro Glu Asp Asn Glu Gln Lys Gln Leu Phe
1655 1660 1665
Val Glu Gln His Lys His Tyr Leu Asp Glu Ile Ile Glu Gln Ile
1670 1675 1680
Ser Glu Phe Ser Lys Arg Val Ile Leu Ala Asp Ala Asn Leu Asp
1685 1690 1695
Lys Val Leu Ser Ala Tyr Asn Lys His Arg Asp Lys Pro Ile Arg
1700 1705 1710
Glu Gln Ala Glu Asn Ile Ile His Leu Phe Thr Leu Thr Asn Leu
1715 1720 1725
Gly Ala Pro Ala Ala Phe Lys Tyr Phe Asp Thr Thr Ile Asp Arg
1730 1735 1740
Lys Arg Tyr Thr Ser Thr Lys Glu Val Leu Asp Ala Thr Leu Ile
1745 1750 1755
His Gln Ser Ile Thr Gly Leu Tyr Glu Thr Arg Ile Asp Leu Ser
1760 1765 1770
Gln Leu Gly Gly Asp Ser Gly Gly Lys Arg Pro Ala Ala Thr Lys
1775 1780 1785
Lys Ala Gly Gln Ala Lys Lys Lys Lys Ser Gly Gly Gly Gly Ser
1790 1795 1800
Gly Gln Thr Gly Lys Lys Ser Glu Lys Gly Pro Val Cys Trp Arg
1805 1810 1815
Lys Arg Val Lys Ser Glu Tyr Met Arg Leu Arg Gln Leu Lys Arg
1820 1825 1830
Phe Arg Arg Ala Asp Glu Val Lys Ser Met Phe Ser Ser Asn Arg
1835 1840 1845
Gln Lys Ile Leu Glu Arg Thr Glu Ile Leu Asn Gln Glu Trp Lys
1850 1855 1860
Gln Arg Arg Ile Gln Pro Val His Ile Leu Thr Ser Val Ser Ser
1865 1870 1875
Leu Arg Gly Thr Arg Glu Cys Ser Val Thr Ser Asp Leu Asp Phe
1880 1885 1890
Pro Thr Gln Val Ile Pro Leu Lys Thr Leu Asn Ala Val Ala Ser
1895 1900 1905
Val Pro Ile Met Tyr Ser Trp Ser Pro Leu Gln Gln Asn Phe Met
1910 1915 1920
Val Glu Asp Glu Thr Val Leu His Asn Ile Pro Tyr Met Gly Asp
1925 1930 1935
Glu Val Leu Asp Gln Asp Gly Thr Phe Ile Glu Glu Leu Ile Lys
1940 1945 1950
Asn Tyr Asp Gly Lys Val His Gly Asp Arg Glu Cys Gly Phe Ile
1955 1960 1965
Asn Asp Glu Ile Phe Val Glu Leu Val Asn Ala Leu Gly Gln Tyr
1970 1975 1980
Asn Asp Asp Asp Asp Asp Asp Asp Gly Asp Asp Pro Glu Glu Arg
1985 1990 1995
Glu Glu Lys Gln Lys Asp Leu Glu Asp His Arg Asp Asp Lys Glu
2000 2005 2010
Ser Arg Pro Pro Arg Lys Phe Pro Ser Asp Lys Ile Phe Glu Ala
2015 2020 2025
Ile Ser Ser Met Phe Pro Asp Lys Gly Thr Ala Glu Glu Leu Lys
2030 2035 2040
Glu Lys Tyr Lys Glu Leu Thr Glu Gln Gln Leu Pro Gly Ala Leu
2045 2050 2055
Pro Pro Glu Cys Thr Pro Asn Ile Asp Gly Pro Asn Ala Lys Ser
2060 2065 2070
Val Gln Arg Glu Gln Ser Leu His Ser Phe His Thr Leu Phe Cys
2075 2080 2085
Arg Arg Cys Phe Lys Tyr Asp Cys Phe Leu His Pro Phe His Ala
2090 2095 2100
Thr Pro Asn Thr Tyr Lys Arg Lys Asn Thr Glu Thr Ala Leu Asp
2105 2110 2115
Asn Lys Pro Cys Gly Pro Gln Cys Tyr Gln His Leu Glu Gly Ala
2120 2125 2130
Lys Glu Phe Ala Ala Ala Leu Thr Ala Glu Arg Ile Lys Thr Pro
2135 2140 2145
Pro Lys Arg Pro Gly Gly Arg Arg Arg Gly Arg Leu Pro Asn Asn
2150 2155 2160
Ser Ser Arg Pro Ser Thr Pro Thr Ile Asn Val Leu Glu Ser Lys
2165 2170 2175
Asp Thr Asp Ser Asp Arg Glu Ala Gly Thr Glu Thr Gly Gly Glu
2180 2185 2190
Asn Asn Asp Lys Glu Glu Glu Glu Lys Lys Asp Glu Thr Ser Ser
2195 2200 2205
Ser Ser Glu Ala Asn Ser Arg Cys Gln Thr Pro Ile Lys Met Lys
2210 2215 2220
Pro Asn Ile Glu Pro Pro Glu Asn Val Glu Trp Ser Gly Ala Glu
2225 2230 2235
Ala Ser Met Phe Arg Val Leu Ile Gly Thr Tyr Tyr Asp Asn Phe
2240 2245 2250
Cys Ala Ile Ala Arg Leu Ile Gly Thr Lys Thr Cys Arg Gln Val
2255 2260 2265
Tyr Glu Phe Arg Val Lys Glu Ser Ser Ile Ile Ala Pro Ala Pro
2270 2275 2280
Ala Glu Asp Val Asp Thr Pro Pro Arg Lys Lys Lys Arg Lys His
2285 2290 2295
Arg Leu Trp Ala Ala His Cys Arg Lys Ile Gln Leu Lys Lys Asp
2300 2305 2310
Gly Ser Ser Asn His Val Tyr Asn Tyr Gln Pro Cys Asp His Pro
2315 2320 2325
Arg Gln Pro Cys Asp Ser Ser Cys Pro Cys Val Ile Ala Gln Asn
2330 2335 2340
Phe Cys Glu Lys Phe Cys Gln Cys Ser Ser Glu Cys Gln Asn Arg
2345 2350 2355
Phe Pro Gly Cys Arg Cys Lys Ala Gln Cys Asn Thr Lys Gln Cys
2360 2365 2370
Pro Cys Tyr Leu Ala Val Arg Glu Cys Asp Pro Asp Leu Cys Leu
2375 2380 2385
Thr Cys Gly Ala Ala Asp His Trp Asp Ser Lys Asn Val Ser Cys
2390 2395 2400
Lys Asn Cys Ser Ile Gln Arg Gly Ser Lys Lys His Leu Leu Leu
2405 2410 2415
Ala Pro Ser Asp Val Ala Gly Trp Gly Ile Phe Ile Lys Asp Pro
2420 2425 2430
Val Gln Lys Asn Glu Phe Ile Ser Glu Tyr Cys Gly Glu Ile Ile
2435 2440 2445
Ser Gln Asp Glu Ala Asp Arg Arg Gly Lys Val Tyr Asp Lys Tyr
2450 2455 2460
Met Cys Ser Phe Leu Phe Asn Leu Asn Asn Asp Phe Val Val Asp
2465 2470 2475
Ala Thr Arg Lys Gly Asn Lys Ile Arg Phe Ala Asn His Ser Val
2480 2485 2490
Asn Pro Asn Cys Tyr Ala Lys Val Met Met Val Asn Gly Asp His
2495 2500 2505
Arg Ile Gly Ile Phe Ala Lys Arg Ala Ile Gln Thr Gly Glu Glu
2510 2515 2520
Leu Phe Phe Asp Tyr Arg Tyr Ser Gln Ala Asp Ala Leu Lys Tyr
2525 2530 2535
Val Gly Ile Glu Arg Glu Met Glu Ile Pro Gly Gly Gly Gly Ser
2540 2545 2550
Gly Lys Arg Pro Ala Ala Thr Lys Lys Ala Gly Gln Ala Lys Lys
2555 2560 2565
Lys Lys Gly Ser
2570
<210> 81
<211> 1923
<212> PRT
<213> Artificial Sequence
<220>
<223> Description of Artificial Sequence: Synthetic polypeptide
<400> 81
Met Ala Pro Lys Lys Lys Arg Lys Val Gly Ile His Gly Val Pro Ala
1 5 10 15
Ala Ser Gly Gly Gly Gly Ser Gly Ser Ala Ala Ile Ala Glu Val Leu
20 25 30
Leu Asn Ala Arg Cys Asp Leu His Ala Val Asn Tyr His Gly Asp Thr
35 40 45
Pro Leu His Ile Ala Ala Arg Glu Ser Tyr His Asp Cys Val Leu Leu
50 55 60
Phe Leu Ser Arg Gly Ala Asn Pro Glu Leu Arg Asn Lys Glu Gly Asp
65 70 75 80
Thr Ala Trp Asp Leu Thr Pro Glu Arg Ser Asp Val Trp Phe Ala Leu
85 90 95
Gln Leu Asn Arg Lys Leu Arg Leu Gly Val Gly Asn Arg Ala Ile Arg
100 105 110
Thr Glu Lys Ile Ile Cys Arg Asp Val Ala Arg Gly Tyr Glu Asn Val
115 120 125
Pro Ile Pro Cys Val Asn Gly Val Asp Gly Glu Pro Cys Pro Glu Asp
130 135 140
Tyr Lys Tyr Ile Ser Glu Asn Cys Glu Thr Ser Thr Met Asn Ile Asp
145 150 155 160
Arg Asn Ile Thr His Leu Gln His Cys Thr Cys Val Asp Asp Cys Ser
165 170 175
Ser Ser Asn Cys Leu Cys Gly Gln Leu Ser Ile Arg Cys Trp Tyr Asp
180 185 190
Lys Asp Gly Arg Leu Leu Gln Glu Phe Asn Lys Ile Glu Pro Pro Leu
195 200 205
Ile Phe Glu Cys Asn Gln Ala Cys Ser Cys Trp Arg Asn Cys Lys Asn
210 215 220
Arg Val Val Gln Ser Gly Ile Lys Val Arg Leu Gln Leu Tyr Arg Thr
225 230 235 240
Ala Lys Met Gly Trp Gly Val Arg Ala Leu Gln Thr Ile Pro Gln Gly
245 250 255
Thr Phe Ile Cys Glu Tyr Val Gly Glu Leu Ile Ser Asp Ala Glu Ala
260 265 270
Asp Val Arg Glu Asp Asp Ser Tyr Leu Phe Asp Leu Asp Asn Lys Asp
275 280 285
Gly Glu Val Tyr Cys Ile Asp Ala Arg Tyr Tyr Gly Asn Ile Ser Arg
290 295 300
Phe Ile Asn His Leu Cys Asp Pro Asn Ile Ile Pro Val Arg Val Phe
305 310 315 320
Met Leu His Gln Asp Leu Arg Phe Pro Arg Ile Ala Phe Phe Ser Ser
325 330 335
Arg Asp Ile Arg Thr Gly Glu Glu Leu Gly Phe Asp Tyr Gly Asp Arg
340 345 350
Phe Trp Asp Ile Lys Ser Lys Tyr Phe Thr Cys Gln Cys Gly Ser Glu
355 360 365
Lys Cys Lys His Ser Ala Glu Ala Ile Ala Leu Glu Gln Ser Arg Leu
370 375 380
Ala Arg Leu Asp Pro His Pro Glu Leu Leu Pro Glu Leu Gly Ser Leu
385 390 395 400
Pro Pro Val Asn Thr Gly Gly Gly Gly Ser Gly Asp Lys Lys Tyr Ser
405 410 415
Ile Gly Leu Ala Ile Gly Thr Asn Ser Val Gly Trp Ala Val Ile Thr
420 425 430
Asp Glu Tyr Lys Val Pro Ser Lys Lys Phe Lys Val Leu Gly Asn Thr
435 440 445
Asp Arg His Ser Ile Lys Lys Asn Leu Ile Gly Ala Leu Leu Phe Asp
450 455 460
Ser Gly Glu Thr Ala Glu Ala Thr Arg Leu Lys Arg Thr Ala Arg Arg
465 470 475 480
Arg Tyr Thr Arg Arg Lys Asn Arg Ile Cys Tyr Leu Gln Glu Ile Phe
485 490 495
Ser Asn Glu Met Ala Lys Val Asp Asp Ser Phe Phe His Arg Leu Glu
500 505 510
Glu Ser Phe Leu Val Glu Glu Asp Lys Lys His Glu Arg His Pro Ile
515 520 525
Phe Gly Asn Ile Val Asp Glu Val Ala Tyr His Glu Lys Tyr Pro Thr
530 535 540
Ile Tyr His Leu Arg Lys Lys Leu Val Asp Ser Thr Asp Lys Ala Asp
545 550 555 560
Leu Arg Leu Ile Tyr Leu Ala Leu Ala His Met Ile Lys Phe Arg Gly
565 570 575
His Phe Leu Ile Glu Gly Asp Leu Asn Pro Asp Asn Ser Asp Val Asp
580 585 590
Lys Leu Phe Ile Gln Leu Val Gln Thr Tyr Asn Gln Leu Phe Glu Glu
595 600 605
Asn Pro Ile Asn Ala Ser Gly Val Asp Ala Lys Ala Ile Leu Ser Ala
610 615 620
Arg Leu Ser Lys Ser Arg Arg Leu Glu Asn Leu Ile Ala Gln Leu Pro
625 630 635 640
Gly Glu Lys Lys Asn Gly Leu Phe Gly Asn Leu Ile Ala Leu Ser Leu
645 650 655
Gly Leu Thr Pro Asn Phe Lys Ser Asn Phe Asp Leu Ala Glu Asp Ala
660 665 670
Lys Leu Gln Leu Ser Lys Asp Thr Tyr Asp Asp Asp Leu Asp Asn Leu
675 680 685
Leu Ala Gln Ile Gly Asp Gln Tyr Ala Asp Leu Phe Leu Ala Ala Lys
690 695 700
Asn Leu Ser Asp Ala Ile Leu Leu Ser Asp Ile Leu Arg Val Asn Thr
705 710 715 720
Glu Ile Thr Lys Ala Pro Leu Ser Ala Ser Met Ile Lys Arg Tyr Asp
725 730 735
Glu His His Gln Asp Leu Thr Leu Leu Lys Ala Leu Val Arg Gln Gln
740 745 750
Leu Pro Glu Lys Tyr Lys Glu Ile Phe Phe Asp Gln Ser Lys Asn Gly
755 760 765
Tyr Ala Gly Tyr Ile Asp Gly Gly Ala Ser Gln Glu Glu Phe Tyr Lys
770 775 780
Phe Ile Lys Pro Ile Leu Glu Lys Met Asp Gly Thr Glu Glu Leu Leu
785 790 795 800
Val Lys Leu Asn Arg Glu Asp Leu Leu Arg Lys Gln Arg Thr Phe Asp
805 810 815
Asn Gly Ser Ile Pro His Gln Ile His Leu Gly Glu Leu His Ala Ile
820 825 830
Leu Arg Arg Gln Glu Asp Phe Tyr Pro Phe Leu Lys Asp Asn Arg Glu
835 840 845
Lys Ile Glu Lys Ile Leu Thr Phe Arg Ile Pro Tyr Tyr Val Gly Pro
850 855 860
Leu Ala Arg Gly Asn Ser Arg Phe Ala Trp Met Thr Arg Lys Ser Glu
865 870 875 880
Glu Thr Ile Thr Pro Trp Asn Phe Glu Glu Val Val Asp Lys Gly Ala
885 890 895
Ser Ala Gln Ser Phe Ile Glu Arg Met Thr Asn Phe Asp Lys Asn Leu
900 905 910
Pro Asn Glu Lys Val Leu Pro Lys His Ser Leu Leu Tyr Glu Tyr Phe
915 920 925
Thr Val Tyr Asn Glu Leu Thr Lys Val Lys Tyr Val Thr Glu Gly Met
930 935 940
Arg Lys Pro Ala Phe Leu Ser Gly Glu Gln Lys Lys Ala Ile Val Asp
945 950 955 960
Leu Leu Phe Lys Thr Asn Arg Lys Val Thr Val Lys Gln Leu Lys Glu
965 970 975
Asp Tyr Phe Lys Lys Ile Glu Cys Phe Asp Ser Val Glu Ile Ser Gly
980 985 990
Val Glu Asp Arg Phe Asn Ala Ser Leu Gly Thr Tyr His Asp Leu Leu
995 1000 1005
Lys Ile Ile Lys Asp Lys Asp Phe Leu Asp Asn Glu Glu Asn Glu
1010 1015 1020
Asp Ile Leu Glu Asp Ile Val Leu Thr Leu Thr Leu Phe Glu Asp
1025 1030 1035
Arg Glu Met Ile Glu Glu Arg Leu Lys Thr Tyr Ala His Leu Phe
1040 1045 1050
Asp Asp Lys Val Met Lys Gln Leu Lys Arg Arg Arg Tyr Thr Gly
1055 1060 1065
Trp Gly Arg Leu Ser Arg Lys Leu Ile Asn Gly Ile Arg Asp Lys
1070 1075 1080
Gln Ser Gly Lys Thr Ile Leu Asp Phe Leu Lys Ser Asp Gly Phe
1085 1090 1095
Ala Asn Arg Asn Phe Met Gln Leu Ile His Asp Asp Ser Leu Thr
1100 1105 1110
Phe Lys Glu Asp Ile Gln Lys Ala Gln Val Ser Gly Gln Gly Asp
1115 1120 1125
Ser Leu His Glu His Ile Ala Asn Leu Ala Gly Ser Pro Ala Ile
1130 1135 1140
Lys Lys Gly Ile Leu Gln Thr Val Lys Val Val Asp Glu Leu Val
1145 1150 1155
Lys Val Met Gly Arg His Lys Pro Glu Asn Ile Val Ile Glu Met
1160 1165 1170
Ala Arg Glu Asn Gln Thr Thr Gln Lys Gly Gln Lys Asn Ser Arg
1175 1180 1185
Glu Arg Met Lys Arg Ile Glu Glu Gly Ile Lys Glu Leu Gly Ser
1190 1195 1200
Gln Ile Leu Lys Glu His Pro Val Glu Asn Thr Gln Leu Gln Asn
1205 1210 1215
Glu Lys Leu Tyr Leu Tyr Tyr Leu Gln Asn Gly Arg Asp Met Tyr
1220 1225 1230
Val Asp Gln Glu Leu Asp Ile Asn Arg Leu Ser Asp Tyr Asp Val
1235 1240 1245
Ala Ala Ile Val Pro Gln Ser Phe Leu Lys Asp Asp Ser Ile Asp
1250 1255 1260
Asn Lys Val Leu Thr Arg Ser Asp Lys Ala Arg Gly Lys Ser Asp
1265 1270 1275
Asn Val Pro Ser Glu Glu Val Val Lys Lys Met Lys Asn Tyr Trp
1280 1285 1290
Arg Gln Leu Leu Asn Ala Lys Leu Ile Thr Gln Arg Lys Phe Asp
1295 1300 1305
Asn Leu Thr Lys Ala Glu Arg Gly Gly Leu Ser Glu Leu Asp Lys
1310 1315 1320
Ala Gly Phe Ile Lys Arg Gln Leu Val Glu Thr Arg Gln Ile Thr
1325 1330 1335
Lys His Val Ala Gln Ile Leu Asp Ser Arg Met Asn Thr Lys Tyr
1340 1345 1350
Asp Glu Asn Asp Lys Leu Ile Arg Glu Val Lys Val Ile Thr Leu
1355 1360 1365
Lys Ser Lys Leu Val Ser Asp Phe Arg Lys Asp Phe Gln Phe Tyr
1370 1375 1380
Lys Val Arg Glu Ile Asn Asn Tyr His His Ala His Asp Ala Tyr
1385 1390 1395
Leu Asn Ala Val Val Gly Thr Ala Leu Ile Lys Lys Tyr Pro Lys
1400 1405 1410
Leu Glu Ser Glu Phe Val Tyr Gly Asp Tyr Lys Val Tyr Asp Val
1415 1420 1425
Arg Lys Met Ile Ala Lys Ser Glu Gln Glu Ile Gly Lys Ala Thr
1430 1435 1440
Ala Lys Tyr Phe Phe Tyr Ser Asn Ile Met Asn Phe Phe Lys Thr
1445 1450 1455
Glu Ile Thr Leu Ala Asn Gly Glu Ile Arg Lys Arg Pro Leu Ile
1460 1465 1470
Glu Thr Asn Gly Glu Thr Gly Glu Ile Val Trp Asp Lys Gly Arg
1475 1480 1485
Asp Phe Ala Thr Val Arg Lys Val Leu Ser Met Pro Gln Val Asn
1490 1495 1500
Ile Val Lys Lys Thr Glu Val Gln Thr Gly Gly Phe Ser Lys Glu
1505 1510 1515
Ser Ile Leu Pro Lys Arg Asn Ser Asp Lys Leu Ile Ala Arg Lys
1520 1525 1530
Lys Asp Trp Asp Pro Lys Lys Tyr Gly Gly Phe Asp Ser Pro Thr
1535 1540 1545
Val Ala Tyr Ser Val Leu Val Val Ala Lys Val Glu Lys Gly Lys
1550 1555 1560
Ser Lys Lys Leu Lys Ser Val Lys Glu Leu Leu Gly Ile Thr Ile
1565 1570 1575
Met Glu Arg Ser Ser Phe Glu Lys Asn Pro Ile Asp Phe Leu Glu
1580 1585 1590
Ala Lys Gly Tyr Lys Glu Val Lys Lys Asp Leu Ile Ile Lys Leu
1595 1600 1605
Pro Lys Tyr Ser Leu Phe Glu Leu Glu Asn Gly Arg Lys Arg Met
1610 1615 1620
Leu Ala Ser Ala Gly Glu Leu Gln Lys Gly Asn Glu Leu Ala Leu
1625 1630 1635
Pro Ser Lys Tyr Val Asn Phe Leu Tyr Leu Ala Ser His Tyr Glu
1640 1645 1650
Lys Leu Lys Gly Ser Pro Glu Asp Asn Glu Gln Lys Gln Leu Phe
1655 1660 1665
Val Glu Gln His Lys His Tyr Leu Asp Glu Ile Ile Glu Gln Ile
1670 1675 1680
Ser Glu Phe Ser Lys Arg Val Ile Leu Ala Asp Ala Asn Leu Asp
1685 1690 1695
Lys Val Leu Ser Ala Tyr Asn Lys His Arg Asp Lys Pro Ile Arg
1700 1705 1710
Glu Gln Ala Glu Asn Ile Ile His Leu Phe Thr Leu Thr Asn Leu
1715 1720 1725
Gly Ala Pro Ala Ala Phe Lys Tyr Phe Asp Thr Thr Ile Asp Arg
1730 1735 1740
Lys Arg Tyr Thr Ser Thr Lys Glu Val Leu Asp Ala Thr Leu Ile
1745 1750 1755
His Gln Ser Ile Thr Gly Leu Tyr Glu Thr Arg Ile Asp Leu Ser
1760 1765 1770
Gln Leu Gly Gly Asp Ser Gly Gly Lys Arg Pro Ala Ala Thr Lys
1775 1780 1785
Lys Ala Gly Gln Ala Lys Lys Lys Lys Ser Gly Gly Gly Gly Ser
1790 1795 1800
Asp Ala Lys Ser Leu Thr Ala Trp Ser Arg Thr Leu Val Thr Phe
1805 1810 1815
Lys Asp Val Phe Val Asp Phe Thr Arg Glu Glu Trp Lys Leu Leu
1820 1825 1830
Asp Thr Ala Gln Gln Ile Leu Tyr Arg Asn Val Met Leu Glu Asn
1835 1840 1845
Tyr Lys Asn Leu Val Ser Leu Gly Tyr Gln Leu Thr Lys Pro Asp
1850 1855 1860
Val Ile Leu Arg Leu Glu Lys Gly Glu Glu Pro Trp Leu Val Glu
1865 1870 1875
Arg Glu Ile His Gln Glu Thr His Pro Asp Ser Glu Thr Ala Phe
1880 1885 1890
Glu Ile Lys Ser Ser Val Gly Gly Gly Gly Ser Gly Lys Arg Pro
1895 1900 1905
Ala Ala Thr Lys Lys Ala Gly Gln Ala Lys Lys Lys Lys Gly Ser
1910 1915 1920
<210> 82
<211> 2292
<212> PRT
<213> Artificial Sequence
<220>
<223> Description of Artificial Sequence: Synthetic polypeptide
<400> 82
Met Ala Pro Lys Lys Lys Arg Lys Val Gly Gly Ser Gly Gly Ser Gly
1 5 10 15
Gln Thr Gly Lys Lys Ser Glu Lys Gly Pro Val Cys Trp Arg Lys Arg
20 25 30
Val Lys Ser Glu Tyr Met Arg Leu Arg Gln Leu Lys Arg Phe Arg Arg
35 40 45
Ala Asp Glu Val Lys Ser Met Phe Ser Ser Asn Arg Gln Lys Ile Leu
50 55 60
Glu Arg Thr Glu Ile Leu Asn Gln Glu Trp Lys Gln Arg Arg Ile Gln
65 70 75 80
Pro Val His Ile Leu Thr Ser Val Ser Ser Leu Arg Gly Thr Arg Glu
85 90 95
Cys Ser Val Thr Ser Asp Leu Asp Phe Pro Thr Gln Val Ile Pro Leu
100 105 110
Lys Thr Leu Asn Ala Val Ala Ser Val Pro Ile Met Tyr Ser Trp Ser
115 120 125
Pro Leu Gln Gln Asn Phe Met Val Glu Asp Glu Thr Val Leu His Asn
130 135 140
Ile Pro Tyr Met Gly Asp Glu Val Leu Asp Gln Asp Gly Thr Phe Ile
145 150 155 160
Glu Glu Leu Ile Lys Asn Tyr Asp Gly Lys Val His Gly Asp Arg Glu
165 170 175
Cys Gly Phe Ile Asn Asp Glu Ile Phe Val Glu Leu Val Asn Ala Leu
180 185 190
Gly Gln Tyr Asn Asp Asp Asp Asp Asp Asp Asp Gly Asp Asp Pro Glu
195 200 205
Glu Arg Glu Glu Lys Gln Lys Asp Leu Glu Asp His Arg Asp Asp Lys
210 215 220
Glu Ser Arg Pro Pro Arg Lys Phe Pro Ser Asp Lys Ile Phe Glu Ala
225 230 235 240
Ile Ser Ser Met Phe Pro Asp Lys Gly Thr Ala Glu Glu Leu Lys Glu
245 250 255
Lys Tyr Lys Glu Leu Thr Glu Gln Gln Leu Pro Gly Ala Leu Pro Pro
260 265 270
Glu Cys Thr Pro Asn Ile Asp Gly Pro Asn Ala Lys Ser Val Gln Arg
275 280 285
Glu Gln Ser Leu His Ser Phe His Thr Leu Phe Cys Arg Arg Cys Phe
290 295 300
Lys Tyr Asp Cys Phe Leu His Pro Phe His Ala Thr Pro Asn Thr Tyr
305 310 315 320
Lys Arg Lys Asn Thr Glu Thr Ala Leu Asp Asn Lys Pro Cys Gly Pro
325 330 335
Gln Cys Tyr Gln His Leu Glu Gly Ala Lys Glu Phe Ala Ala Ala Leu
340 345 350
Thr Ala Glu Arg Ile Lys Thr Pro Pro Lys Arg Pro Gly Gly Arg Arg
355 360 365
Arg Gly Arg Leu Pro Asn Asn Ser Ser Arg Pro Ser Thr Pro Thr Ile
370 375 380
Asn Val Leu Glu Ser Lys Asp Thr Asp Ser Asp Arg Glu Ala Gly Thr
385 390 395 400
Glu Thr Gly Gly Glu Asn Asn Asp Lys Glu Glu Glu Glu Lys Lys Asp
405 410 415
Glu Thr Ser Ser Ser Ser Glu Ala Asn Ser Arg Cys Gln Thr Pro Ile
420 425 430
Lys Met Lys Pro Asn Ile Glu Pro Pro Glu Asn Val Glu Trp Ser Gly
435 440 445
Ala Glu Ala Ser Met Phe Arg Val Leu Ile Gly Thr Tyr Tyr Asp Asn
450 455 460
Phe Cys Ala Ile Ala Arg Leu Ile Gly Thr Lys Thr Cys Arg Gln Val
465 470 475 480
Tyr Glu Phe Arg Val Lys Glu Ser Ser Ile Ile Ala Pro Ala Pro Ala
485 490 495
Glu Asp Val Asp Thr Pro Pro Arg Lys Lys Lys Arg Lys His Arg Leu
500 505 510
Trp Ala Ala His Cys Arg Lys Ile Gln Leu Lys Lys Asp Gly Ser Ser
515 520 525
Asn His Val Tyr Asn Tyr Gln Pro Cys Asp His Pro Arg Gln Pro Cys
530 535 540
Asp Ser Ser Cys Pro Cys Val Ile Ala Gln Asn Phe Cys Glu Lys Phe
545 550 555 560
Cys Gln Cys Ser Ser Glu Cys Gln Asn Arg Phe Pro Gly Cys Arg Cys
565 570 575
Lys Ala Gln Cys Asn Thr Lys Gln Cys Pro Cys Tyr Leu Ala Val Arg
580 585 590
Glu Cys Asp Pro Asp Leu Cys Leu Thr Cys Gly Ala Ala Asp His Trp
595 600 605
Asp Ser Lys Asn Val Ser Cys Lys Asn Cys Ser Ile Gln Arg Gly Ser
610 615 620
Lys Lys His Leu Leu Leu Ala Pro Ser Asp Val Ala Gly Trp Gly Ile
625 630 635 640
Phe Ile Lys Asp Pro Val Gln Lys Asn Glu Phe Ile Ser Glu Tyr Cys
645 650 655
Gly Glu Ile Ile Ser Gln Asp Glu Ala Asp Arg Arg Gly Lys Val Tyr
660 665 670
Asp Lys Tyr Met Cys Ser Phe Leu Phe Asn Leu Asn Asn Asp Phe Val
675 680 685
Val Asp Ala Thr Arg Lys Gly Asn Lys Ile Arg Phe Ala Asn His Ser
690 695 700
Val Asn Pro Asn Cys Tyr Ala Lys Val Met Met Val Asn Gly Asp His
705 710 715 720
Arg Ile Gly Ile Phe Ala Lys Arg Ala Ile Gln Thr Gly Glu Glu Leu
725 730 735
Phe Phe Asp Tyr Arg Tyr Ser Gln Ala Asp Ala Leu Lys Tyr Val Gly
740 745 750
Ile Glu Arg Glu Met Glu Ile Pro Ser Thr Gly Gly Ser Gly Gly Ser
755 760 765
Gly Gly Ser Gly Gly Ser Gly Gly Ser Gly Arg Pro Asp Lys Lys Tyr
770 775 780
Ser Ile Gly Leu Ala Ile Gly Thr Asn Ser Val Gly Trp Ala Val Ile
785 790 795 800
Thr Asp Glu Tyr Lys Val Pro Ser Lys Lys Phe Lys Val Leu Gly Asn
805 810 815
Thr Asp Arg His Ser Ile Lys Lys Asn Leu Ile Gly Ala Leu Leu Phe
820 825 830
Asp Ser Gly Glu Thr Ala Glu Ala Thr Arg Leu Lys Arg Thr Ala Arg
835 840 845
Arg Arg Tyr Thr Arg Arg Lys Asn Arg Ile Cys Tyr Leu Gln Glu Ile
850 855 860
Phe Ser Asn Glu Met Ala Lys Val Asp Asp Ser Phe Phe His Arg Leu
865 870 875 880
Glu Glu Ser Phe Leu Val Glu Glu Asp Lys Lys His Glu Arg His Pro
885 890 895
Ile Phe Gly Asn Ile Val Asp Glu Val Ala Tyr His Glu Lys Tyr Pro
900 905 910
Thr Ile Tyr His Leu Arg Lys Lys Leu Val Asp Ser Thr Asp Lys Ala
915 920 925
Asp Leu Arg Leu Ile Tyr Leu Ala Leu Ala His Met Ile Lys Phe Arg
930 935 940
Gly His Phe Leu Ile Glu Gly Asp Leu Asn Pro Asp Asn Ser Asp Val
945 950 955 960
Asp Lys Leu Phe Ile Gln Leu Val Gln Thr Tyr Asn Gln Leu Phe Glu
965 970 975
Glu Asn Pro Ile Asn Ala Ser Gly Val Asp Ala Lys Ala Ile Leu Ser
980 985 990
Ala Arg Leu Ser Lys Ser Arg Arg Leu Glu Asn Leu Ile Ala Gln Leu
995 1000 1005
Pro Gly Glu Lys Lys Asn Gly Leu Phe Gly Asn Leu Ile Ala Leu
1010 1015 1020
Ser Leu Gly Leu Thr Pro Asn Phe Lys Ser Asn Phe Asp Leu Ala
1025 1030 1035
Glu Asp Ala Lys Leu Gln Leu Ser Lys Asp Thr Tyr Asp Asp Asp
1040 1045 1050
Leu Asp Asn Leu Leu Ala Gln Ile Gly Asp Gln Tyr Ala Asp Leu
1055 1060 1065
Phe Leu Ala Ala Lys Asn Leu Ser Asp Ala Ile Leu Leu Ser Asp
1070 1075 1080
Ile Leu Arg Val Asn Thr Glu Ile Thr Lys Ala Pro Leu Ser Ala
1085 1090 1095
Ser Met Ile Lys Arg Tyr Asp Glu His His Gln Asp Leu Thr Leu
1100 1105 1110
Leu Lys Ala Leu Val Arg Gln Gln Leu Pro Glu Lys Tyr Lys Glu
1115 1120 1125
Ile Phe Phe Asp Gln Ser Lys Asn Gly Tyr Ala Gly Tyr Ile Asp
1130 1135 1140
Gly Gly Ala Ser Gln Glu Glu Phe Tyr Lys Phe Ile Lys Pro Ile
1145 1150 1155
Leu Glu Lys Met Asp Gly Thr Glu Glu Leu Leu Val Lys Leu Asn
1160 1165 1170
Arg Glu Asp Leu Leu Arg Lys Gln Arg Thr Phe Asp Asn Gly Ser
1175 1180 1185
Ile Pro His Gln Ile His Leu Gly Glu Leu His Ala Ile Leu Arg
1190 1195 1200
Arg Gln Glu Asp Phe Tyr Pro Phe Leu Lys Asp Asn Arg Glu Lys
1205 1210 1215
Ile Glu Lys Ile Leu Thr Phe Arg Ile Pro Tyr Tyr Val Gly Pro
1220 1225 1230
Leu Ala Arg Gly Asn Ser Arg Phe Ala Trp Met Thr Arg Lys Ser
1235 1240 1245
Glu Glu Thr Ile Thr Pro Trp Asn Phe Glu Glu Val Val Asp Lys
1250 1255 1260
Gly Ala Ser Ala Gln Ser Phe Ile Glu Arg Met Thr Asn Phe Asp
1265 1270 1275
Lys Asn Leu Pro Asn Glu Lys Val Leu Pro Lys His Ser Leu Leu
1280 1285 1290
Tyr Glu Tyr Phe Thr Val Tyr Asn Glu Leu Thr Lys Val Lys Tyr
1295 1300 1305
Val Thr Glu Gly Met Arg Lys Pro Ala Phe Leu Ser Gly Glu Gln
1310 1315 1320
Lys Lys Ala Ile Val Asp Leu Leu Phe Lys Thr Asn Arg Lys Val
1325 1330 1335
Thr Val Lys Gln Leu Lys Glu Asp Tyr Phe Lys Lys Ile Glu Cys
1340 1345 1350
Phe Asp Ser Val Glu Ile Ser Gly Val Glu Asp Arg Phe Asn Ala
1355 1360 1365
Ser Leu Gly Thr Tyr His Asp Leu Leu Lys Ile Ile Lys Asp Lys
1370 1375 1380
Asp Phe Leu Asp Asn Glu Glu Asn Glu Asp Ile Leu Glu Asp Ile
1385 1390 1395
Val Leu Thr Leu Thr Leu Phe Glu Asp Arg Glu Met Ile Glu Glu
1400 1405 1410
Arg Leu Lys Thr Tyr Ala His Leu Phe Asp Asp Lys Val Met Lys
1415 1420 1425
Gln Leu Lys Arg Arg Arg Tyr Thr Gly Trp Gly Arg Leu Ser Arg
1430 1435 1440
Lys Leu Ile Asn Gly Ile Arg Asp Lys Gln Ser Gly Lys Thr Ile
1445 1450 1455
Leu Asp Phe Leu Lys Ser Asp Gly Phe Ala Asn Arg Asn Phe Met
1460 1465 1470
Gln Leu Ile His Asp Asp Ser Leu Thr Phe Lys Glu Asp Ile Gln
1475 1480 1485
Lys Ala Gln Val Ser Gly Gln Gly Asp Ser Leu His Glu His Ile
1490 1495 1500
Ala Asn Leu Ala Gly Ser Pro Ala Ile Lys Lys Gly Ile Leu Gln
1505 1510 1515
Thr Val Lys Val Val Asp Glu Leu Val Lys Val Met Gly Arg His
1520 1525 1530
Lys Pro Glu Asn Ile Val Ile Glu Met Ala Arg Glu Asn Gln Thr
1535 1540 1545
Thr Gln Lys Gly Gln Lys Asn Ser Arg Glu Arg Met Lys Arg Ile
1550 1555 1560
Glu Glu Gly Ile Lys Glu Leu Gly Ser Gln Ile Leu Lys Glu His
1565 1570 1575
Pro Val Glu Asn Thr Gln Leu Gln Asn Glu Lys Leu Tyr Leu Tyr
1580 1585 1590
Tyr Leu Gln Asn Gly Arg Asp Met Tyr Val Asp Gln Glu Leu Asp
1595 1600 1605
Ile Asn Arg Leu Ser Asp Tyr Asp Val Ala Ala Ile Val Pro Gln
1610 1615 1620
Ser Phe Leu Lys Asp Asp Ser Ile Asp Asn Lys Val Leu Thr Arg
1625 1630 1635
Ser Asp Lys Ala Arg Gly Lys Ser Asp Asn Val Pro Ser Glu Glu
1640 1645 1650
Val Val Lys Lys Met Lys Asn Tyr Trp Arg Gln Leu Leu Asn Ala
1655 1660 1665
Lys Leu Ile Thr Gln Arg Lys Phe Asp Asn Leu Thr Lys Ala Glu
1670 1675 1680
Arg Gly Gly Leu Ser Glu Leu Asp Lys Ala Gly Phe Ile Lys Arg
1685 1690 1695
Gln Leu Val Glu Thr Arg Gln Ile Thr Lys His Val Ala Gln Ile
1700 1705 1710
Leu Asp Ser Arg Met Asn Thr Lys Tyr Asp Glu Asn Asp Lys Leu
1715 1720 1725
Ile Arg Glu Val Lys Val Ile Thr Leu Lys Ser Lys Leu Val Ser
1730 1735 1740
Asp Phe Arg Lys Asp Phe Gln Phe Tyr Lys Val Arg Glu Ile Asn
1745 1750 1755
Asn Tyr His His Ala His Asp Ala Tyr Leu Asn Ala Val Val Gly
1760 1765 1770
Thr Ala Leu Ile Lys Lys Tyr Pro Lys Leu Glu Ser Glu Phe Val
1775 1780 1785
Tyr Gly Asp Tyr Lys Val Tyr Asp Val Arg Lys Met Ile Ala Lys
1790 1795 1800
Ser Glu Gln Glu Ile Gly Lys Ala Thr Ala Lys Tyr Phe Phe Tyr
1805 1810 1815
Ser Asn Ile Met Asn Phe Phe Lys Thr Glu Ile Thr Leu Ala Asn
1820 1825 1830
Gly Glu Ile Arg Lys Arg Pro Leu Ile Glu Thr Asn Gly Glu Thr
1835 1840 1845
Gly Glu Ile Val Trp Asp Lys Gly Arg Asp Phe Ala Thr Val Arg
1850 1855 1860
Lys Val Leu Ser Met Pro Gln Val Asn Ile Val Lys Lys Thr Glu
1865 1870 1875
Val Gln Thr Gly Gly Phe Ser Lys Glu Ser Ile Leu Pro Lys Arg
1880 1885 1890
Asn Ser Asp Lys Leu Ile Ala Arg Lys Lys Asp Trp Asp Pro Lys
1895 1900 1905
Lys Tyr Gly Gly Phe Asp Ser Pro Thr Val Ala Tyr Ser Val Leu
1910 1915 1920
Val Val Ala Lys Val Glu Lys Gly Lys Ser Lys Lys Leu Lys Ser
1925 1930 1935
Val Lys Glu Leu Leu Gly Ile Thr Ile Met Glu Arg Ser Ser Phe
1940 1945 1950
Glu Lys Asn Pro Ile Asp Phe Leu Glu Ala Lys Gly Tyr Lys Glu
1955 1960 1965
Val Lys Lys Asp Leu Ile Ile Lys Leu Pro Lys Tyr Ser Leu Phe
1970 1975 1980
Glu Leu Glu Asn Gly Arg Lys Arg Met Leu Ala Ser Ala Gly Glu
1985 1990 1995
Leu Gln Lys Gly Asn Glu Leu Ala Leu Pro Ser Lys Tyr Val Asn
2000 2005 2010
Phe Leu Tyr Leu Ala Ser His Tyr Glu Lys Leu Lys Gly Ser Pro
2015 2020 2025
Glu Asp Asn Glu Gln Lys Gln Leu Phe Val Glu Gln His Lys His
2030 2035 2040
Tyr Leu Asp Glu Ile Ile Glu Gln Ile Ser Glu Phe Ser Lys Arg
2045 2050 2055
Val Ile Leu Ala Asp Ala Asn Leu Asp Lys Val Leu Ser Ala Tyr
2060 2065 2070
Asn Lys His Arg Asp Lys Pro Ile Arg Glu Gln Ala Glu Asn Ile
2075 2080 2085
Ile His Leu Phe Thr Leu Thr Asn Leu Gly Ala Pro Ala Ala Phe
2090 2095 2100
Lys Tyr Phe Asp Thr Thr Ile Asp Arg Lys Arg Tyr Thr Ser Thr
2105 2110 2115
Lys Glu Val Leu Asp Ala Thr Leu Ile His Gln Ser Ile Thr Gly
2120 2125 2130
Leu Tyr Glu Thr Arg Ile Asp Leu Ser Gln Leu Gly Gly Asp Ser
2135 2140 2145
Gly Gly Lys Arg Pro Ala Ala Thr Lys Lys Ala Gly Gln Ala Lys
2150 2155 2160
Lys Lys Lys Ser Gly Gly Gly Gly Ser Asp Ala Lys Ser Leu Thr
2165 2170 2175
Ala Trp Ser Arg Thr Leu Val Thr Phe Lys Asp Val Phe Val Asp
2180 2185 2190
Phe Thr Arg Glu Glu Trp Lys Leu Leu Asp Thr Ala Gln Gln Ile
2195 2200 2205
Leu Tyr Arg Asn Val Met Leu Glu Asn Tyr Lys Asn Leu Val Ser
2210 2215 2220
Leu Gly Tyr Gln Leu Thr Lys Pro Asp Val Ile Leu Arg Leu Glu
2225 2230 2235
Lys Gly Glu Glu Pro Trp Leu Val Glu Arg Glu Ile His Gln Glu
2240 2245 2250
Thr His Pro Asp Ser Glu Thr Ala Phe Glu Ile Lys Ser Ser Val
2255 2260 2265
Gly Gly Gly Gly Ser Gly Lys Arg Pro Ala Ala Thr Lys Lys Ala
2270 2275 2280
Gly Gln Ala Lys Lys Lys Lys Gly Ser
2285 2290
<210> 83
<400> 83
000
<210> 84
<211> 1405
<212> PRT
<213> Artificial Sequence
<220>
<223> Description of Artificial Sequence: Synthetic polypeptide
<400> 84
Met Ala Pro Lys Lys Lys Arg Lys Val Gly Ile His Gly Val Pro Ala
1 5 10 15
Ala Asp Lys Lys Tyr Ser Ile Gly Leu Asp Ile Gly Thr Asn Ser Val
20 25 30
Gly Trp Ala Val Ile Thr Asp Glu Tyr Lys Val Pro Ser Lys Lys Phe
35 40 45
Lys Val Leu Gly Asn Thr Asp Arg His Ser Ile Lys Lys Asn Leu Ile
50 55 60
Gly Ala Leu Leu Phe Asp Ser Gly Glu Thr Ala Glu Ala Thr Arg Leu
65 70 75 80
Lys Arg Thr Ala Arg Arg Arg Tyr Thr Arg Arg Lys Asn Arg Ile Cys
85 90 95
Tyr Leu Gln Glu Ile Phe Ser Asn Glu Met Ala Lys Val Asp Asp Ser
100 105 110
Phe Phe His Arg Leu Glu Glu Ser Phe Leu Val Glu Glu Asp Lys Lys
115 120 125
His Glu Arg His Pro Ile Phe Gly Asn Ile Val Asp Glu Val Ala Tyr
130 135 140
His Glu Lys Tyr Pro Thr Ile Tyr His Leu Arg Lys Lys Leu Val Asp
145 150 155 160
Ser Thr Asp Lys Ala Asp Leu Arg Leu Ile Tyr Leu Ala Leu Ala His
165 170 175
Met Ile Lys Phe Arg Gly His Phe Leu Ile Glu Gly Asp Leu Asn Pro
180 185 190
Asp Asn Ser Asp Val Asp Lys Leu Phe Ile Gln Leu Val Gln Thr Tyr
195 200 205
Asn Gln Leu Phe Glu Glu Asn Pro Ile Asn Ala Ser Gly Val Asp Ala
210 215 220
Lys Ala Ile Leu Ser Ala Arg Leu Ser Lys Ser Arg Arg Leu Glu Asn
225 230 235 240
Leu Ile Ala Gln Leu Pro Gly Glu Lys Lys Asn Gly Leu Phe Gly Asn
245 250 255
Leu Ile Ala Leu Ser Leu Gly Leu Thr Pro Asn Phe Lys Ser Asn Phe
260 265 270
Asp Leu Ala Glu Asp Ala Lys Leu Gln Leu Ser Lys Asp Thr Tyr Asp
275 280 285
Asp Asp Leu Asp Asn Leu Leu Ala Gln Ile Gly Asp Gln Tyr Ala Asp
290 295 300
Leu Phe Leu Ala Ala Lys Asn Leu Ser Asp Ala Ile Leu Leu Ser Asp
305 310 315 320
Ile Leu Arg Val Asn Thr Glu Ile Thr Lys Ala Pro Leu Ser Ala Ser
325 330 335
Met Ile Lys Arg Tyr Asp Glu His His Gln Asp Leu Thr Leu Leu Lys
340 345 350
Ala Leu Val Arg Gln Gln Leu Pro Glu Lys Tyr Lys Glu Ile Phe Phe
355 360 365
Asp Gln Ser Lys Asn Gly Tyr Ala Gly Tyr Ile Asp Gly Gly Ala Ser
370 375 380
Gln Glu Glu Phe Tyr Lys Phe Ile Lys Pro Ile Leu Glu Lys Met Asp
385 390 395 400
Gly Thr Glu Glu Leu Leu Val Lys Leu Asn Arg Glu Asp Leu Leu Arg
405 410 415
Lys Gln Arg Thr Phe Asp Asn Gly Ser Ile Pro His Gln Ile His Leu
420 425 430
Gly Glu Leu His Ala Ile Leu Arg Arg Gln Glu Asp Phe Tyr Pro Phe
435 440 445
Leu Lys Asp Asn Arg Glu Lys Ile Glu Lys Ile Leu Thr Phe Arg Ile
450 455 460
Pro Tyr Tyr Val Gly Pro Leu Ala Arg Gly Asn Ser Arg Phe Ala Trp
465 470 475 480
Met Thr Arg Lys Ser Glu Glu Thr Ile Thr Pro Trp Asn Phe Glu Glu
485 490 495
Val Val Asp Lys Gly Ala Ser Ala Gln Ser Phe Ile Glu Arg Met Thr
500 505 510
Asn Phe Asp Lys Asn Leu Pro Asn Glu Lys Val Leu Pro Lys His Ser
515 520 525
Leu Leu Tyr Glu Tyr Phe Thr Val Tyr Asn Glu Leu Thr Lys Val Lys
530 535 540
Tyr Val Thr Glu Gly Met Arg Lys Pro Ala Phe Leu Ser Gly Glu Gln
545 550 555 560
Lys Lys Ala Ile Val Asp Leu Leu Phe Lys Thr Asn Arg Lys Val Thr
565 570 575
Val Lys Gln Leu Lys Glu Asp Tyr Phe Lys Lys Ile Glu Cys Phe Asp
580 585 590
Ser Val Glu Ile Ser Gly Val Glu Asp Arg Phe Asn Ala Ser Leu Gly
595 600 605
Thr Tyr His Asp Leu Leu Lys Ile Ile Lys Asp Lys Asp Phe Leu Asp
610 615 620
Asn Glu Glu Asn Glu Asp Ile Leu Glu Asp Ile Val Leu Thr Leu Thr
625 630 635 640
Leu Phe Glu Asp Arg Glu Met Ile Glu Glu Arg Leu Lys Thr Tyr Ala
645 650 655
His Leu Phe Asp Asp Lys Val Met Lys Gln Leu Lys Arg Arg Arg Tyr
660 665 670
Thr Gly Trp Gly Arg Leu Ser Arg Lys Leu Ile Asn Gly Ile Arg Asp
675 680 685
Lys Gln Ser Gly Lys Thr Ile Leu Asp Phe Leu Lys Ser Asp Gly Phe
690 695 700
Ala Asn Arg Asn Phe Met Gln Leu Ile His Asp Asp Ser Leu Thr Phe
705 710 715 720
Lys Glu Asp Ile Gln Lys Ala Gln Val Ser Gly Gln Gly Asp Ser Leu
725 730 735
His Glu His Ile Ala Asn Leu Ala Gly Ser Pro Ala Ile Lys Lys Gly
740 745 750
Ile Leu Gln Thr Val Lys Val Val Asp Glu Leu Val Lys Val Met Gly
755 760 765
Arg His Lys Pro Glu Asn Ile Val Ile Glu Met Ala Arg Glu Asn Gln
770 775 780
Thr Thr Gln Lys Gly Gln Lys Asn Ser Arg Glu Arg Met Lys Arg Ile
785 790 795 800
Glu Glu Gly Ile Lys Glu Leu Gly Ser Gln Ile Leu Lys Glu His Pro
805 810 815
Val Glu Asn Thr Gln Leu Gln Asn Glu Lys Leu Tyr Leu Tyr Tyr Leu
820 825 830
Gln Asn Gly Arg Asp Met Tyr Val Asp Gln Glu Leu Asp Ile Asn Arg
835 840 845
Leu Ser Asp Tyr Asp Val Asp His Ile Val Pro Gln Ser Phe Leu Lys
850 855 860
Asp Asp Ser Ile Asp Asn Lys Val Leu Thr Arg Ser Asp Lys Asn Arg
865 870 875 880
Gly Lys Ser Asp Asn Val Pro Ser Glu Glu Val Val Lys Lys Met Lys
885 890 895
Asn Tyr Trp Arg Gln Leu Leu Asn Ala Lys Leu Ile Thr Gln Arg Lys
900 905 910
Phe Asp Asn Leu Thr Lys Ala Glu Arg Gly Gly Leu Ser Glu Leu Asp
915 920 925
Lys Ala Gly Phe Ile Lys Arg Gln Leu Val Glu Thr Arg Gln Ile Thr
930 935 940
Lys His Val Ala Gln Ile Leu Asp Ser Arg Met Asn Thr Lys Tyr Asp
945 950 955 960
Glu Asn Asp Lys Leu Ile Arg Glu Val Lys Val Ile Thr Leu Lys Ser
965 970 975
Lys Leu Val Ser Asp Phe Arg Lys Asp Phe Gln Phe Tyr Lys Val Arg
980 985 990
Glu Ile Asn Asn Tyr His His Ala His Asp Ala Tyr Leu Asn Ala Val
995 1000 1005
Val Gly Thr Ala Leu Ile Lys Lys Tyr Pro Lys Leu Glu Ser Glu
1010 1015 1020
Phe Val Tyr Gly Asp Tyr Lys Val Tyr Asp Val Arg Lys Met Ile
1025 1030 1035
Ala Lys Ser Glu Gln Glu Ile Gly Lys Ala Thr Ala Lys Tyr Phe
1040 1045 1050
Phe Tyr Ser Asn Ile Met Asn Phe Phe Lys Thr Glu Ile Thr Leu
1055 1060 1065
Ala Asn Gly Glu Ile Arg Lys Arg Pro Leu Ile Glu Thr Asn Gly
1070 1075 1080
Glu Thr Gly Glu Ile Val Trp Asp Lys Gly Arg Asp Phe Ala Thr
1085 1090 1095
Val Arg Lys Val Leu Ser Met Pro Gln Val Asn Ile Val Lys Lys
1100 1105 1110
Thr Glu Val Gln Thr Gly Gly Phe Ser Lys Glu Ser Ile Leu Pro
1115 1120 1125
Lys Arg Asn Ser Asp Lys Leu Ile Ala Arg Lys Lys Asp Trp Asp
1130 1135 1140
Pro Lys Lys Tyr Gly Gly Phe Asp Ser Pro Thr Val Ala Tyr Ser
1145 1150 1155
Val Leu Val Val Ala Lys Val Glu Lys Gly Lys Ser Lys Lys Leu
1160 1165 1170
Lys Ser Val Lys Glu Leu Leu Gly Ile Thr Ile Met Glu Arg Ser
1175 1180 1185
Ser Phe Glu Lys Asn Pro Ile Asp Phe Leu Glu Ala Lys Gly Tyr
1190 1195 1200
Lys Glu Val Lys Lys Asp Leu Ile Ile Lys Leu Pro Lys Tyr Ser
1205 1210 1215
Leu Phe Glu Leu Glu Asn Gly Arg Lys Arg Met Leu Ala Ser Ala
1220 1225 1230
Gly Glu Leu Gln Lys Gly Asn Glu Leu Ala Leu Pro Ser Lys Tyr
1235 1240 1245
Val Asn Phe Leu Tyr Leu Ala Ser His Tyr Glu Lys Leu Lys Gly
1250 1255 1260
Ser Pro Glu Asp Asn Glu Gln Lys Gln Leu Phe Val Glu Gln His
1265 1270 1275
Lys His Tyr Leu Asp Glu Ile Ile Glu Gln Ile Ser Glu Phe Ser
1280 1285 1290
Lys Arg Val Ile Leu Ala Asp Ala Asn Leu Asp Lys Val Leu Ser
1295 1300 1305
Ala Tyr Asn Lys His Arg Asp Lys Pro Ile Arg Glu Gln Ala Glu
1310 1315 1320
Asn Ile Ile His Leu Phe Thr Leu Thr Asn Leu Gly Ala Pro Ala
1325 1330 1335
Ala Phe Lys Tyr Phe Asp Thr Thr Ile Asp Arg Lys Arg Tyr Thr
1340 1345 1350
Ser Thr Lys Glu Val Leu Asp Ala Thr Leu Ile His Gln Ser Ile
1355 1360 1365
Thr Gly Leu Tyr Glu Thr Arg Ile Asp Leu Ser Gln Leu Gly Gly
1370 1375 1380
Asp Ser Gly Gly Lys Arg Pro Ala Ala Thr Lys Lys Ala Gly Gln
1385 1390 1395
Ala Lys Lys Lys Lys Gly Ser
1400 1405
<210> 85
<211> 7147
<212> DNA
<213> Artificial Sequence
<220>
<223> Description of Artificial Sequence: Synthetic polynucleotide
<400> 85
aggaaataag agagaaaaga agagtaagaa gaaatataag agccaccatg gcccccaaga 60
agaagcggaa ggtgggcggc agcggcggca gcggccagac cggcaagaag agcgagaagg 120
gccccgtgtg ctggcggaag cgggtgaaga gcgagtacat gcggctgcgg cagctgaagc 180
ggttccggcg ggccgacgag gtgaagagca tgttcagcag caaccggcag aagatcctgg 240
agcggaccga gatcctgaac caggagtgga agcagcggcg aatccagccc gtgcacatcc 300
tgaccagcgt gagcagcctg cggggcaccc gggagtgcag cgtgaccagc gacctggact 360
tccccaccca ggtgatcccc ctaaagaccc tgaacgccgt ggccagcgtg cccatcatgt 420
acagctggag ccccctgcag cagaacttca tggtggagga cgagaccgtg ctgcacaaca 480
tcccctacat gggcgacgag gtgctggacc aggacggcac cttcatcgag gagctgatca 540
agaactacga cggcaaggtg cacggcgacc gggagtgcgg cttcatcaac gacgagatct 600
tcgtggagct ggtgaacgcc ctgggccagt acaacgacga cgacgacgac gacgacggcg 660
acgaccccga ggagcgggag gagaagcaga aggacctgga ggaccaccgg gacgacaagg 720
agagccggcc cccccggaag ttccccagcg acaagatctt cgaggccatc agcagcatgt 780
tccccgacaa gggcaccgcc gaggagctga aggagaagta caaggagctg accgagcagc 840
agctgcccgg cgccctgccc cccgagtgca cccccaacat cgacggcccc aacgccaaga 900
gcgtgcagcg ggagcagagc ctgcacagct tccacaccct gttctgccgg cggtgcttca 960
agtacgactg cttcctgcac cccttccacg ccacccccaa cacctacaag cggaagaaca 1020
ccgagaccgc cctggacaac aagccctgcg gcccccagtg ctaccagcac ctggagggcg 1080
ccaaggagtt cgccgccgcc ctgaccgccg agcggatcaa gacccccccc aagcggcccg 1140
gcggccggcg gcggggccgg ctgcccaaca acagcagccg gcccagcacc cccaccatca 1200
acgtgctgga gagcaaggac accgacagcg accgggaggc cggcaccgag accggcggcg 1260
agaacaacga caaggaggag gaggagaaga aggacgagac cagcagcagc agcgaggcca 1320
acagccggtg ccagaccccc atcaagatga agcccaacat cgagcccccc gagaacgtgg 1380
agtggagcgg cgccgaggcc agcatgttcc gggtgctgat cggcacctac tacgacaact 1440
tctgcgccat cgcccggctg atcggcacca agacctgccg gcaggtgtac gagttccggg 1500
tgaaggagag cagcatcatc gcccccgccc ccgccgagga cgtggacacc cccccccgga 1560
agaagaagcg gaagcaccgg ctgtgggccg cccactgccg gaagatccag ctgaagaagg 1620
acggcagcag caaccacgtg tacaactacc agccctgcga ccacccccgg cagccctgcg 1680
acagcagctg cccctgcgtg atcgcccaga acttctgcga gaagttctgc cagtgcagca 1740
gcgagtgcca gaaccggttc cccggctgcc ggtgcaaggc ccagtgcaac accaagcagt 1800
gcccctgcta cctggccgtg cgggagtgcg accccgacct gtgcctgacc tgcggcgccg 1860
ccgaccactg ggacagcaag aacgtgagct gcaagaactg cagcatccag cggggcagca 1920
agaagcacct gctgctggcc cccagcgacg tggccggctg gggcatcttc atcaaggacc 1980
ccgtgcagaa gaacgagttc atcagcgagt actgcggcga gatcatcagc caggacgagg 2040
ccgaccggcg gggcaaggtg tacgacaagt acatgtgcag cttcctgttc aacctgaaca 2100
acgacttcgt ggtggacgcc acccggaagg gcaacaagat ccggttcgcc aaccacagcg 2160
tgaaccccaa ctgctacgcc aaggtgatga tggtgaacgg cgaccaccgg atcggcatct 2220
tcgccaagcg ggccatccag accggcgagg agctgttctt cgactaccgg tacagccagg 2280
ccgacgccct gaagtacgtg ggcatcgagc gggagatgga gatccccagc accggcggca 2340
gcggcggcag cggcggcagc ggcggcagcg gcggcagcgg ccgacccgac aagaagtaca 2400
gcatcggcct ggccatcggc accaacagcg tgggctgggc cgtgatcacc gacgagtaca 2460
aggtgcccag caagaagttc aaggtgctgg gcaacaccga ccggcacagc atcaagaaga 2520
acctgatcgg cgccctgctg ttcgacagcg gcgagaccgc cgaggccacc cggctgaagc 2580
ggaccgcccg gcggcggtac acccggcgga agaaccggat ctgctacctg caggagatct 2640
tcagcaacga gatggccaag gtggacgaca gcttcttcca ccggctggag gagagcttcc 2700
tggtggagga ggacaagaag cacgagcggc accccatctt cggcaacatc gtggacgagg 2760
tggcctacca cgagaagtac cccaccatct accacctgcg gaagaagctg gtggacagca 2820
ccgacaaggc cgacctgcgg ctgatctacc tggccctggc ccacatgatc aagttccggg 2880
gccacttcct gatcgagggc gacctgaacc ccgacaacag cgacgtggac aagctgttca 2940
tccagctggt gcagacctac aaccagctgt tcgaggagaa ccccatcaac gccagcggcg 3000
tggacgccaa ggccatcctg agcgcccggc tgagcaagag ccggcggctg gagaacctga 3060
tcgcccagct gcccggcgag aagaagaacg gcctgttcgg caacctgatc gccctgagcc 3120
tgggcctgac ccccaacttc aagagcaact tcgacctggc cgaggacgcc aagctgcagc 3180
tgagcaagga cacctacgac gacgacctgg acaacctgct ggcccagatc ggcgaccagt 3240
acgccgacct gttcctggcc gccaagaacc tgagcgacgc catcctgctg agcgacatcc 3300
tgcgggtgaa caccgagatc accaaggccc ccctgagcgc cagcatgatc aagcggtacg 3360
acgagcacca ccaggacctg accctgctga aggccctggt gcggcagcag ctgcccgaga 3420
agtacaagga gatcttcttc gaccagagca agaacggcta cgccggctac atcgacggcg 3480
gcgccagcca ggaggagttc tacaagttca tcaagcccat cctggagaag atggacggca 3540
ccgaggagct gctggtgaag ctgaaccggg aggacctgct gcggaagcag cggaccttcg 3600
acaacggcag catcccccac cagatccacc tgggcgagct gcacgccatc ctgcggcggc 3660
aggaggactt ctaccccttc ctgaaggaca accgggagaa gatcgagaag atcctgacct 3720
tccggatccc ctactacgtg ggccccctgg cccggggcaa cagccggttc gcctggatga 3780
cccggaaatc cgaggagacc atcaccccct ggaacttcga ggaggtggtg gacaagggcg 3840
ccagcgccca gagcttcatc gagcggatga ccaacttcga caagaacctg cccaacgaga 3900
aggtgctgcc caagcacagc ctgctgtacg agtacttcac cgtgtacaac gagctgacca 3960
aggtgaagta cgtgaccgag ggcatgcgga agcccgcctt cctgagcggc gagcagaaga 4020
aggccatcgt ggacctgctg ttcaagacca accggaaggt gaccgtgaag cagctgaagg 4080
aggactactt caagaagatc gagtgcttcg acagcgtgga gatcagcggc gtggaggacc 4140
ggttcaacgc cagcctgggc acctaccacg acctgctgaa gatcatcaag gacaaggact 4200
tcctggacaa cgaggagaac gaggacatcc tggaggacat cgtgctgacc ctgaccctgt 4260
tcgaggaccg ggagatgatc gaggagcggc tgaaaaccta cgcccacctg ttcgacgaca 4320
aggtgatgaa gcagctgaag cggcggcggt acaccggctg gggccggctg agccggaagc 4380
tgatcaacgg catccgggac aagcagagcg gcaagaccat cctggacttc ctgaaatccg 4440
acggcttcgc caaccggaac ttcatgcagc tgatccacga cgacagcctg accttcaagg 4500
aggacatcca gaaggcccag gtgagcggcc agggcgacag cctgcacgag cacatcgcca 4560
acctggccgg cagccccgcc atcaagaagg gcatcctgca gaccgtgaag gtggtggacg 4620
agctggtgaa ggtgatgggc cggcacaagc ccgagaacat cgtgatcgag atggcccggg 4680
agaaccagac cacccagaag ggccagaaga acagccggga gcggatgaag cggatcgagg 4740
agggcatcaa ggagctgggc agccagatcc tgaaggagca ccccgtggag aacacccagc 4800
tgcagaacga gaagctgtac ctgtactacc tgcagaacgg ccgggacatg tacgtggacc 4860
aggagctgga catcaaccgg ctgagcgact acgacgtggc cgccatcgtg ccccagagct 4920
tcctgaagga cgacagcatc gacaacaagg tgctgacccg gagcgacaag gcccggggca 4980
agagcgacaa cgtgcccagc gaggaggtgg tgaagaagat gaagaactac tggcggcagc 5040
tgctgaacgc caagctgatc acccagcgga agttcgacaa cctgaccaag gccgagcggg 5100
gcggcctgag cgagctggac aaggccggct tcatcaagcg gcagctggtg gagacccggc 5160
agatcaccaa gcacgtggcc cagatcctgg acagccggat gaacaccaag tacgacgaga 5220
acgacaagct gatccgggag gtgaaggtga tcaccctgaa atccaagctg gtgagcgact 5280
tccggaagga cttccagttc tacaaggtgc gggagatcaa caactaccac cacgcccacg 5340
acgcctacct gaacgccgtg gtgggcaccg ccctgatcaa gaagtacccc aagctggaga 5400
gcgagttcgt gtacggcgac tacaaggtgt acgacgtgcg gaagatgatc gccaagagcg 5460
agcaggagat cggcaaggcc accgccaagt acttcttcta cagcaacatc atgaacttct 5520
tcaagaccga gatcaccctg gccaacggcg agatccggaa gcggcccctg atcgagacca 5580
acggcgagac cggcgagatc gtgtgggaca agggccggga cttcgccacc gtgcggaagg 5640
tgctgagcat gccccaggtg aacatcgtga agaaaaccga ggtgcagacc ggcggcttca 5700
gcaaggagag catcctgccc aagcggaaca gcgacaagct gatcgcccgg aagaaggact 5760
gggaccccaa gaagtacggc ggcttcgaca gccccaccgt ggcctacagc gtgctggtgg 5820
tggccaaggt ggagaagggc aagagcaaga agctgaaatc cgtgaaggag ctgctgggca 5880
tcaccatcat ggagcggagc agcttcgaga agaaccccat cgacttcctg gaggccaagg 5940
gctacaagga ggtgaagaag gacctgatca tcaagctgcc caagtacagc ctgttcgagc 6000
tggagaacgg ccggaagcgg atgctggcca gcgccggcga gctgcagaag ggcaacgagc 6060
tggccctgcc cagcaagtac gtgaacttcc tgtacctggc cagccactac gagaagctga 6120
agggcagccc cgaggacaac gagcagaagc agctgttcgt ggagcagcac aagcactacc 6180
tggacgagat catcgagcag atcagcgagt tcagcaagcg ggtgatcctg gccgacgcca 6240
acctggacaa ggtgctgagc gcctacaaca agcaccggga caagcccatc cgggagcagg 6300
ccgagaacat catccacctg ttcaccctga ccaacctggg cgcccccgcc gccttcaagt 6360
acttcgacac caccatcgac cggaagcggt acaccagcac caaggaggtg ctggacgcca 6420
ccctgatcca ccagagcatc accggcctgt acgagacccg gatcgacctg agccagctgg 6480
gcggcgacag cggcggcaag cggcccgccg ccaccaagaa ggccggccag gccaagaaga 6540
agaagtcggg cgggggtggc tcagacgcta agtctctgac cgcttggagc agaacactgg 6600
tcaccttcaa ggacgtgttc gtcgacttca caagagagga gtggaaactg ctggacaccg 6660
cccagcagat cctctataga aacgtcatgc tggagaacta caagaatctg gtgtctctgg 6720
gctaccagct gaccaagccc gacgtgattc tgaggctgga gaagggcgag gagccttggc 6780
tggtggagag agagatccac caagaaaccc accccgacag cgaaaccgcc ttcgagatca 6840
agagcagcgt gggaggtggc ggatcgggaa agcggcccgc cgccaccaag aaggccggtc 6900
aggccaagaa gaagaagggc agctacccct acgacgtgcc cgactacgcc tgagcggccg 6960
cttaattaag ctgccttctg cggggcttgc cttctggcca tgcccttctt ctctcccttg 7020
cacctgtacc tcttggtctt tgaataaagc ctgagtagga agtctagaaa aaaaaaaaaa 7080
aaaaaaaaaa aaaaaaaaaa aaaaaaaaaa aaaaaaaaaa aaaaaaaaaa aaaaaaaaaa 7140
aaaaaaa 7147
<210> 86
<211> 2301
<212> PRT
<213> Artificial Sequence
<220>
<223> Description of Artificial Sequence: Synthetic polypeptide
<400> 86
Met Ala Pro Lys Lys Lys Arg Lys Val Gly Gly Ser Gly Gly Ser Gly
1 5 10 15
Gln Thr Gly Lys Lys Ser Glu Lys Gly Pro Val Cys Trp Arg Lys Arg
20 25 30
Val Lys Ser Glu Tyr Met Arg Leu Arg Gln Leu Lys Arg Phe Arg Arg
35 40 45
Ala Asp Glu Val Lys Ser Met Phe Ser Ser Asn Arg Gln Lys Ile Leu
50 55 60
Glu Arg Thr Glu Ile Leu Asn Gln Glu Trp Lys Gln Arg Arg Ile Gln
65 70 75 80
Pro Val His Ile Leu Thr Ser Val Ser Ser Leu Arg Gly Thr Arg Glu
85 90 95
Cys Ser Val Thr Ser Asp Leu Asp Phe Pro Thr Gln Val Ile Pro Leu
100 105 110
Lys Thr Leu Asn Ala Val Ala Ser Val Pro Ile Met Tyr Ser Trp Ser
115 120 125
Pro Leu Gln Gln Asn Phe Met Val Glu Asp Glu Thr Val Leu His Asn
130 135 140
Ile Pro Tyr Met Gly Asp Glu Val Leu Asp Gln Asp Gly Thr Phe Ile
145 150 155 160
Glu Glu Leu Ile Lys Asn Tyr Asp Gly Lys Val His Gly Asp Arg Glu
165 170 175
Cys Gly Phe Ile Asn Asp Glu Ile Phe Val Glu Leu Val Asn Ala Leu
180 185 190
Gly Gln Tyr Asn Asp Asp Asp Asp Asp Asp Asp Gly Asp Asp Pro Glu
195 200 205
Glu Arg Glu Glu Lys Gln Lys Asp Leu Glu Asp His Arg Asp Asp Lys
210 215 220
Glu Ser Arg Pro Pro Arg Lys Phe Pro Ser Asp Lys Ile Phe Glu Ala
225 230 235 240
Ile Ser Ser Met Phe Pro Asp Lys Gly Thr Ala Glu Glu Leu Lys Glu
245 250 255
Lys Tyr Lys Glu Leu Thr Glu Gln Gln Leu Pro Gly Ala Leu Pro Pro
260 265 270
Glu Cys Thr Pro Asn Ile Asp Gly Pro Asn Ala Lys Ser Val Gln Arg
275 280 285
Glu Gln Ser Leu His Ser Phe His Thr Leu Phe Cys Arg Arg Cys Phe
290 295 300
Lys Tyr Asp Cys Phe Leu His Pro Phe His Ala Thr Pro Asn Thr Tyr
305 310 315 320
Lys Arg Lys Asn Thr Glu Thr Ala Leu Asp Asn Lys Pro Cys Gly Pro
325 330 335
Gln Cys Tyr Gln His Leu Glu Gly Ala Lys Glu Phe Ala Ala Ala Leu
340 345 350
Thr Ala Glu Arg Ile Lys Thr Pro Pro Lys Arg Pro Gly Gly Arg Arg
355 360 365
Arg Gly Arg Leu Pro Asn Asn Ser Ser Arg Pro Ser Thr Pro Thr Ile
370 375 380
Asn Val Leu Glu Ser Lys Asp Thr Asp Ser Asp Arg Glu Ala Gly Thr
385 390 395 400
Glu Thr Gly Gly Glu Asn Asn Asp Lys Glu Glu Glu Glu Lys Lys Asp
405 410 415
Glu Thr Ser Ser Ser Ser Glu Ala Asn Ser Arg Cys Gln Thr Pro Ile
420 425 430
Lys Met Lys Pro Asn Ile Glu Pro Pro Glu Asn Val Glu Trp Ser Gly
435 440 445
Ala Glu Ala Ser Met Phe Arg Val Leu Ile Gly Thr Tyr Tyr Asp Asn
450 455 460
Phe Cys Ala Ile Ala Arg Leu Ile Gly Thr Lys Thr Cys Arg Gln Val
465 470 475 480
Tyr Glu Phe Arg Val Lys Glu Ser Ser Ile Ile Ala Pro Ala Pro Ala
485 490 495
Glu Asp Val Asp Thr Pro Pro Arg Lys Lys Lys Arg Lys His Arg Leu
500 505 510
Trp Ala Ala His Cys Arg Lys Ile Gln Leu Lys Lys Asp Gly Ser Ser
515 520 525
Asn His Val Tyr Asn Tyr Gln Pro Cys Asp His Pro Arg Gln Pro Cys
530 535 540
Asp Ser Ser Cys Pro Cys Val Ile Ala Gln Asn Phe Cys Glu Lys Phe
545 550 555 560
Cys Gln Cys Ser Ser Glu Cys Gln Asn Arg Phe Pro Gly Cys Arg Cys
565 570 575
Lys Ala Gln Cys Asn Thr Lys Gln Cys Pro Cys Tyr Leu Ala Val Arg
580 585 590
Glu Cys Asp Pro Asp Leu Cys Leu Thr Cys Gly Ala Ala Asp His Trp
595 600 605
Asp Ser Lys Asn Val Ser Cys Lys Asn Cys Ser Ile Gln Arg Gly Ser
610 615 620
Lys Lys His Leu Leu Leu Ala Pro Ser Asp Val Ala Gly Trp Gly Ile
625 630 635 640
Phe Ile Lys Asp Pro Val Gln Lys Asn Glu Phe Ile Ser Glu Tyr Cys
645 650 655
Gly Glu Ile Ile Ser Gln Asp Glu Ala Asp Arg Arg Gly Lys Val Tyr
660 665 670
Asp Lys Tyr Met Cys Ser Phe Leu Phe Asn Leu Asn Asn Asp Phe Val
675 680 685
Val Asp Ala Thr Arg Lys Gly Asn Lys Ile Arg Phe Ala Asn His Ser
690 695 700
Val Asn Pro Asn Cys Tyr Ala Lys Val Met Met Val Asn Gly Asp His
705 710 715 720
Arg Ile Gly Ile Phe Ala Lys Arg Ala Ile Gln Thr Gly Glu Glu Leu
725 730 735
Phe Phe Asp Tyr Arg Tyr Ser Gln Ala Asp Ala Leu Lys Tyr Val Gly
740 745 750
Ile Glu Arg Glu Met Glu Ile Pro Ser Thr Gly Gly Ser Gly Gly Ser
755 760 765
Gly Gly Ser Gly Gly Ser Gly Gly Ser Gly Arg Pro Asp Lys Lys Tyr
770 775 780
Ser Ile Gly Leu Ala Ile Gly Thr Asn Ser Val Gly Trp Ala Val Ile
785 790 795 800
Thr Asp Glu Tyr Lys Val Pro Ser Lys Lys Phe Lys Val Leu Gly Asn
805 810 815
Thr Asp Arg His Ser Ile Lys Lys Asn Leu Ile Gly Ala Leu Leu Phe
820 825 830
Asp Ser Gly Glu Thr Ala Glu Ala Thr Arg Leu Lys Arg Thr Ala Arg
835 840 845
Arg Arg Tyr Thr Arg Arg Lys Asn Arg Ile Cys Tyr Leu Gln Glu Ile
850 855 860
Phe Ser Asn Glu Met Ala Lys Val Asp Asp Ser Phe Phe His Arg Leu
865 870 875 880
Glu Glu Ser Phe Leu Val Glu Glu Asp Lys Lys His Glu Arg His Pro
885 890 895
Ile Phe Gly Asn Ile Val Asp Glu Val Ala Tyr His Glu Lys Tyr Pro
900 905 910
Thr Ile Tyr His Leu Arg Lys Lys Leu Val Asp Ser Thr Asp Lys Ala
915 920 925
Asp Leu Arg Leu Ile Tyr Leu Ala Leu Ala His Met Ile Lys Phe Arg
930 935 940
Gly His Phe Leu Ile Glu Gly Asp Leu Asn Pro Asp Asn Ser Asp Val
945 950 955 960
Asp Lys Leu Phe Ile Gln Leu Val Gln Thr Tyr Asn Gln Leu Phe Glu
965 970 975
Glu Asn Pro Ile Asn Ala Ser Gly Val Asp Ala Lys Ala Ile Leu Ser
980 985 990
Ala Arg Leu Ser Lys Ser Arg Arg Leu Glu Asn Leu Ile Ala Gln Leu
995 1000 1005
Pro Gly Glu Lys Lys Asn Gly Leu Phe Gly Asn Leu Ile Ala Leu
1010 1015 1020
Ser Leu Gly Leu Thr Pro Asn Phe Lys Ser Asn Phe Asp Leu Ala
1025 1030 1035
Glu Asp Ala Lys Leu Gln Leu Ser Lys Asp Thr Tyr Asp Asp Asp
1040 1045 1050
Leu Asp Asn Leu Leu Ala Gln Ile Gly Asp Gln Tyr Ala Asp Leu
1055 1060 1065
Phe Leu Ala Ala Lys Asn Leu Ser Asp Ala Ile Leu Leu Ser Asp
1070 1075 1080
Ile Leu Arg Val Asn Thr Glu Ile Thr Lys Ala Pro Leu Ser Ala
1085 1090 1095
Ser Met Ile Lys Arg Tyr Asp Glu His His Gln Asp Leu Thr Leu
1100 1105 1110
Leu Lys Ala Leu Val Arg Gln Gln Leu Pro Glu Lys Tyr Lys Glu
1115 1120 1125
Ile Phe Phe Asp Gln Ser Lys Asn Gly Tyr Ala Gly Tyr Ile Asp
1130 1135 1140
Gly Gly Ala Ser Gln Glu Glu Phe Tyr Lys Phe Ile Lys Pro Ile
1145 1150 1155
Leu Glu Lys Met Asp Gly Thr Glu Glu Leu Leu Val Lys Leu Asn
1160 1165 1170
Arg Glu Asp Leu Leu Arg Lys Gln Arg Thr Phe Asp Asn Gly Ser
1175 1180 1185
Ile Pro His Gln Ile His Leu Gly Glu Leu His Ala Ile Leu Arg
1190 1195 1200
Arg Gln Glu Asp Phe Tyr Pro Phe Leu Lys Asp Asn Arg Glu Lys
1205 1210 1215
Ile Glu Lys Ile Leu Thr Phe Arg Ile Pro Tyr Tyr Val Gly Pro
1220 1225 1230
Leu Ala Arg Gly Asn Ser Arg Phe Ala Trp Met Thr Arg Lys Ser
1235 1240 1245
Glu Glu Thr Ile Thr Pro Trp Asn Phe Glu Glu Val Val Asp Lys
1250 1255 1260
Gly Ala Ser Ala Gln Ser Phe Ile Glu Arg Met Thr Asn Phe Asp
1265 1270 1275
Lys Asn Leu Pro Asn Glu Lys Val Leu Pro Lys His Ser Leu Leu
1280 1285 1290
Tyr Glu Tyr Phe Thr Val Tyr Asn Glu Leu Thr Lys Val Lys Tyr
1295 1300 1305
Val Thr Glu Gly Met Arg Lys Pro Ala Phe Leu Ser Gly Glu Gln
1310 1315 1320
Lys Lys Ala Ile Val Asp Leu Leu Phe Lys Thr Asn Arg Lys Val
1325 1330 1335
Thr Val Lys Gln Leu Lys Glu Asp Tyr Phe Lys Lys Ile Glu Cys
1340 1345 1350
Phe Asp Ser Val Glu Ile Ser Gly Val Glu Asp Arg Phe Asn Ala
1355 1360 1365
Ser Leu Gly Thr Tyr His Asp Leu Leu Lys Ile Ile Lys Asp Lys
1370 1375 1380
Asp Phe Leu Asp Asn Glu Glu Asn Glu Asp Ile Leu Glu Asp Ile
1385 1390 1395
Val Leu Thr Leu Thr Leu Phe Glu Asp Arg Glu Met Ile Glu Glu
1400 1405 1410
Arg Leu Lys Thr Tyr Ala His Leu Phe Asp Asp Lys Val Met Lys
1415 1420 1425
Gln Leu Lys Arg Arg Arg Tyr Thr Gly Trp Gly Arg Leu Ser Arg
1430 1435 1440
Lys Leu Ile Asn Gly Ile Arg Asp Lys Gln Ser Gly Lys Thr Ile
1445 1450 1455
Leu Asp Phe Leu Lys Ser Asp Gly Phe Ala Asn Arg Asn Phe Met
1460 1465 1470
Gln Leu Ile His Asp Asp Ser Leu Thr Phe Lys Glu Asp Ile Gln
1475 1480 1485
Lys Ala Gln Val Ser Gly Gln Gly Asp Ser Leu His Glu His Ile
1490 1495 1500
Ala Asn Leu Ala Gly Ser Pro Ala Ile Lys Lys Gly Ile Leu Gln
1505 1510 1515
Thr Val Lys Val Val Asp Glu Leu Val Lys Val Met Gly Arg His
1520 1525 1530
Lys Pro Glu Asn Ile Val Ile Glu Met Ala Arg Glu Asn Gln Thr
1535 1540 1545
Thr Gln Lys Gly Gln Lys Asn Ser Arg Glu Arg Met Lys Arg Ile
1550 1555 1560
Glu Glu Gly Ile Lys Glu Leu Gly Ser Gln Ile Leu Lys Glu His
1565 1570 1575
Pro Val Glu Asn Thr Gln Leu Gln Asn Glu Lys Leu Tyr Leu Tyr
1580 1585 1590
Tyr Leu Gln Asn Gly Arg Asp Met Tyr Val Asp Gln Glu Leu Asp
1595 1600 1605
Ile Asn Arg Leu Ser Asp Tyr Asp Val Ala Ala Ile Val Pro Gln
1610 1615 1620
Ser Phe Leu Lys Asp Asp Ser Ile Asp Asn Lys Val Leu Thr Arg
1625 1630 1635
Ser Asp Lys Ala Arg Gly Lys Ser Asp Asn Val Pro Ser Glu Glu
1640 1645 1650
Val Val Lys Lys Met Lys Asn Tyr Trp Arg Gln Leu Leu Asn Ala
1655 1660 1665
Lys Leu Ile Thr Gln Arg Lys Phe Asp Asn Leu Thr Lys Ala Glu
1670 1675 1680
Arg Gly Gly Leu Ser Glu Leu Asp Lys Ala Gly Phe Ile Lys Arg
1685 1690 1695
Gln Leu Val Glu Thr Arg Gln Ile Thr Lys His Val Ala Gln Ile
1700 1705 1710
Leu Asp Ser Arg Met Asn Thr Lys Tyr Asp Glu Asn Asp Lys Leu
1715 1720 1725
Ile Arg Glu Val Lys Val Ile Thr Leu Lys Ser Lys Leu Val Ser
1730 1735 1740
Asp Phe Arg Lys Asp Phe Gln Phe Tyr Lys Val Arg Glu Ile Asn
1745 1750 1755
Asn Tyr His His Ala His Asp Ala Tyr Leu Asn Ala Val Val Gly
1760 1765 1770
Thr Ala Leu Ile Lys Lys Tyr Pro Lys Leu Glu Ser Glu Phe Val
1775 1780 1785
Tyr Gly Asp Tyr Lys Val Tyr Asp Val Arg Lys Met Ile Ala Lys
1790 1795 1800
Ser Glu Gln Glu Ile Gly Lys Ala Thr Ala Lys Tyr Phe Phe Tyr
1805 1810 1815
Ser Asn Ile Met Asn Phe Phe Lys Thr Glu Ile Thr Leu Ala Asn
1820 1825 1830
Gly Glu Ile Arg Lys Arg Pro Leu Ile Glu Thr Asn Gly Glu Thr
1835 1840 1845
Gly Glu Ile Val Trp Asp Lys Gly Arg Asp Phe Ala Thr Val Arg
1850 1855 1860
Lys Val Leu Ser Met Pro Gln Val Asn Ile Val Lys Lys Thr Glu
1865 1870 1875
Val Gln Thr Gly Gly Phe Ser Lys Glu Ser Ile Leu Pro Lys Arg
1880 1885 1890
Asn Ser Asp Lys Leu Ile Ala Arg Lys Lys Asp Trp Asp Pro Lys
1895 1900 1905
Lys Tyr Gly Gly Phe Asp Ser Pro Thr Val Ala Tyr Ser Val Leu
1910 1915 1920
Val Val Ala Lys Val Glu Lys Gly Lys Ser Lys Lys Leu Lys Ser
1925 1930 1935
Val Lys Glu Leu Leu Gly Ile Thr Ile Met Glu Arg Ser Ser Phe
1940 1945 1950
Glu Lys Asn Pro Ile Asp Phe Leu Glu Ala Lys Gly Tyr Lys Glu
1955 1960 1965
Val Lys Lys Asp Leu Ile Ile Lys Leu Pro Lys Tyr Ser Leu Phe
1970 1975 1980
Glu Leu Glu Asn Gly Arg Lys Arg Met Leu Ala Ser Ala Gly Glu
1985 1990 1995
Leu Gln Lys Gly Asn Glu Leu Ala Leu Pro Ser Lys Tyr Val Asn
2000 2005 2010
Phe Leu Tyr Leu Ala Ser His Tyr Glu Lys Leu Lys Gly Ser Pro
2015 2020 2025
Glu Asp Asn Glu Gln Lys Gln Leu Phe Val Glu Gln His Lys His
2030 2035 2040
Tyr Leu Asp Glu Ile Ile Glu Gln Ile Ser Glu Phe Ser Lys Arg
2045 2050 2055
Val Ile Leu Ala Asp Ala Asn Leu Asp Lys Val Leu Ser Ala Tyr
2060 2065 2070
Asn Lys His Arg Asp Lys Pro Ile Arg Glu Gln Ala Glu Asn Ile
2075 2080 2085
Ile His Leu Phe Thr Leu Thr Asn Leu Gly Ala Pro Ala Ala Phe
2090 2095 2100
Lys Tyr Phe Asp Thr Thr Ile Asp Arg Lys Arg Tyr Thr Ser Thr
2105 2110 2115
Lys Glu Val Leu Asp Ala Thr Leu Ile His Gln Ser Ile Thr Gly
2120 2125 2130
Leu Tyr Glu Thr Arg Ile Asp Leu Ser Gln Leu Gly Gly Asp Ser
2135 2140 2145
Gly Gly Lys Arg Pro Ala Ala Thr Lys Lys Ala Gly Gln Ala Lys
2150 2155 2160
Lys Lys Lys Ser Gly Gly Gly Gly Ser Asp Ala Lys Ser Leu Thr
2165 2170 2175
Ala Trp Ser Arg Thr Leu Val Thr Phe Lys Asp Val Phe Val Asp
2180 2185 2190
Phe Thr Arg Glu Glu Trp Lys Leu Leu Asp Thr Ala Gln Gln Ile
2195 2200 2205
Leu Tyr Arg Asn Val Met Leu Glu Asn Tyr Lys Asn Leu Val Ser
2210 2215 2220
Leu Gly Tyr Gln Leu Thr Lys Pro Asp Val Ile Leu Arg Leu Glu
2225 2230 2235
Lys Gly Glu Glu Pro Trp Leu Val Glu Arg Glu Ile His Gln Glu
2240 2245 2250
Thr His Pro Asp Ser Glu Thr Ala Phe Glu Ile Lys Ser Ser Val
2255 2260 2265
Gly Gly Gly Gly Ser Gly Lys Arg Pro Ala Ala Thr Lys Lys Ala
2270 2275 2280
Gly Gln Ala Lys Lys Lys Lys Gly Ser Tyr Pro Tyr Asp Val Pro
2285 2290 2295
Asp Tyr Ala
2300
<210> 87
<400> 87
000
<210> 88
<400> 88
000
<210> 89
<400> 89
000
<210> 90
<400> 90
000
<210> 91
<400> 91
000
<210> 92
<400> 92
000
<210> 93
<400> 93
000
<210> 94
<400> 94
000
<210> 95
<400> 95
000
<210> 96
<400> 96
000
<210> 97
<400> 97
000
<210> 98
<400> 98
000
<210> 99
<400> 99
000
<210> 100
<400> 100
000
<210> 101
<400> 101
000
<210> 102
<400> 102
000
<210> 103
<400> 103
000
<210> 104
<400> 104
000
<210> 105
<400> 105
000
<210> 106
<400> 106
000
<210> 107
<400> 107
000
<210> 108
<400> 108
000
<210> 109
<400> 109
000
<210> 110
<400> 110
000
<210> 111
<400> 111
000
<210> 112
<400> 112
000
<210> 113
<400> 113
000
<210> 114
<400> 114
000
<210> 115
<400> 115
000
<210> 116
<400> 116
000
<210> 117
<400> 117
000
<210> 118
<400> 118
000
<210> 119
<400> 119
000
<210> 120
<400> 120
000
<210> 121
<400> 121
000
<210> 122
<400> 122
000
<210> 123
<400> 123
000
<210> 124
<400> 124
000
<210> 125
<400> 125
000
<210> 126
<400> 126
000
<210> 127
<400> 127
000
<210> 128
<400> 128
000
<210> 129
<400> 129
000
<210> 130
<400> 130
000
<210> 131
<400> 131
000
<210> 132
<400> 132
000
<210> 133
<400> 133
000
<210> 134
<400> 134
000
<210> 135
<400> 135
000
<210> 136
<400> 136
000
<210> 137
<400> 137
000
<210> 138
<400> 138
000
<210> 139
<400> 139
000
<210> 140
<400> 140
000
<210> 141
<400> 141
000
<210> 142
<400> 142
000
<210> 143
<400> 143
000
<210> 144
<400> 144
000
<210> 145
<400> 145
000
<210> 146
<400> 146
000
<210> 147
<400> 147
000
<210> 148
<400> 148
000
<210> 149
<400> 149
000
<210> 150
<400> 150
000
<210> 151
<400> 151
000
<210> 152
<400> 152
000
<210> 153
<400> 153
000
<210> 154
<400> 154
000
<210> 155
<400> 155
000
<210> 156
<400> 156
000
<210> 157
<400> 157
000
<210> 158
<400> 158
000
<210> 159
<400> 159
000
<210> 160
<400> 160
000
<210> 161
<400> 161
000
<210> 162
<400> 162
000
<210> 163
<400> 163
000
<210> 164
<400> 164
000
<210> 165
<400> 165
000
<210> 166
<400> 166
000
<210> 167
<400> 167
000
<210> 168
<400> 168
000
<210> 169
<400> 169
000
<210> 170
<400> 170
000
<210> 171
<400> 171
000
<210> 172
<400> 172
000
<210> 173
<400> 173
000
<210> 174
<400> 174
000
<210> 175
<400> 175
000
<210> 176
<400> 176
000
<210> 177
<400> 177
000
<210> 178
<400> 178
000
<210> 179
<400> 179
000
<210> 180
<400> 180
000
<210> 181
<400> 181
000
<210> 182
<400> 182
000
<210> 183
<400> 183
000
<210> 184
<400> 184
000
<210> 185
<400> 185
000
<210> 186
<400> 186
000
<210> 187
<400> 187
000
<210> 188
<400> 188
000
<210> 189
<400> 189
000
<210> 190
<400> 190
000
<210> 191
<400> 191
000
<210> 192
<400> 192
000
<210> 193
<400> 193
000
<210> 194
<400> 194
000
<210> 195
<400> 195
000
<210> 196
<400> 196
000
<210> 197
<400> 197
000
<210> 198
<400> 198
000
<210> 199
<400> 199
000
<210> 200
<400> 200
000
<210> 201
<211> 7433
<212> DNA
<213> Artificial Sequence
<220>
<223> Description of Artificial Sequence: Synthetic polynucleotide
<400> 201
tcgcgcgttt cggtgatgac ggtgaaaacc tctgacacat gcagctcccg gagacggtca 60
cagcttgtct gtaagcggat gccgggagca gacaagcccg tcagggcgcg tcagcgggtg 120
ttggcgggtg tcggggctgg cttaactatg cggcatcaga gcagattgta ctgagagtgc 180
accatatgcg gtgtgaaata ccgcacagat gcgtaaggag aaaataccgc atcaggcgcc 240
attcgccatt caggctgcgc aactgttggg aagggcgatc ggtgcgggcc tcttcgctat 300
tacgccagct ggcgaaaggg ggatgtgctg caaggcgatt aagttgggta acgccagggt 360
tttcccagtc acgacgttgt aaaacgacgg ccagtgaatt gacgcgtatt gggatggtac 420
ctaatacgac tcactataag gaaataagag agaaaagaag agtaagaaga aatataagag 480
ccaccatggc ccccaagaag aagcggaagg tgggcatcca cggcgtgccc gccgccgcca 540
agcggaacta catcctgggc ctggccatcg gcatcaccag cgtgggctac ggcatcatcg 600
actacgagac ccgggacgtg atcgacgccg gcgtgcggct gttcaaggag gccaacgtgg 660
agaacaacga gggccggcgg agcaagcggg gcgcccggcg gctgaagcgg cggcggcggc 720
accggatcca gcgggtgaag aagctgctgt tcgactacaa cctgctgacc gaccacagcg 780
agctgagcgg catcaacccc tacgaggccc gggtgaaggg cctgagccag aagctgagcg 840
aggaggagtt cagcgccgcc ctgctgcacc tggccaagcg gcggggcgtg cacaacgtga 900
acgaggtgga ggaggacacc ggcaacgagc tgagcaccaa ggagcagatc agccggaaca 960
gcaaggccct ggaggagaag tacgtggccg agctgcagct ggagcggctg aagaaggacg 1020
gcgaggtgcg gggcagcatc aaccggttca agaccagcga ctacgtgaag gaggccaagc 1080
agctgctgaa ggtgcagaag gcctaccacc agctggacca gagcttcatc gacacctaca 1140
tcgacctgct ggagacccgg cggacctact acgagggccc cggcgagggc agccccttcg 1200
gctggaagga catcaaggag tggtacgaga tgctgatggg ccactgcacc tacttccccg 1260
aggagctgcg gagcgtgaag tacgcctaca acgccgacct gtacaacgcc ctgaacgacc 1320
tgaacaacct ggtgatcacc cgggacgaga acgagaagct ggagtactac gagaagttcc 1380
agatcatcga gaacgtgttc aagcagaaga agaagcccac cctgaagcag atcgccaagg 1440
agatcctggt gaacgaggag gacatcaagg gctaccgggt gaccagcacc ggcaagcccg 1500
agttcaccaa cctgaaggtg taccacgaca tcaaggacat caccgcccgg aaggagatca 1560
tcgagaacgc cgagctgctg gaccagatcg ccaagatcct gaccatctac cagagcagcg 1620
aggacatcca ggaggagctg accaacctga acagcgagct gacccaggag gagatcgagc 1680
agatcagcaa cctgaagggc tacaccggca cccacaacct gagcctgaag gccatcaacc 1740
tgatcctgga cgagctgtgg cacaccaacg acaaccagat cgccatcttc aaccggctga 1800
agctggtgcc caagaaggtg gacctgagcc agcagaagga gatccccacc accctggtgg 1860
acgacttcat cctgagcccc gtggtgaagc ggagcttcat ccagagcatc aaggtgatca 1920
acgccatcat caagaagtac ggcctgccca acgacatcat catcgagctg gcccgggaga 1980
agaacagcaa ggacgcccag aagatgatca acgagatgca gaagcggaac cggcagacca 2040
acgagcggat cgaggagatc atccggacca ccggcaagga gaacgccaag tacctgatcg 2100
agaagatcaa gctgcacgac atgcaggagg gcaagtgcct gtacagcctg gaggccatcc 2160
ccctggagga cctgctgaac aaccccttca actacgaggt ggacgccatc atcccccgga 2220
gcgtgagctt cgacaacagc ttcaacaaca aggtgctggt gaagcaggag gagaacagca 2280
agaagggcaa ccggaccccc ttccagtacc tgagcagcag cgacagcaag atcagctacg 2340
agaccttcaa gaagcacatc ctgaacctgg ccaagggcaa gggccggatc agcaagacca 2400
agaaggagta cctgctggag gagcgggaca tcaaccggtt cagcgtgcag aaggacttca 2460
tcaaccggaa cctggtggac acccggtacg ccacccgggg cctgatgaac ctgctgcgga 2520
gctacttccg ggtgaacaac ctggacgtga aggtgaaatc catcaacggc ggcttcacca 2580
gcttcctgcg gcggaagtgg aagttcaaga aggagcggaa caagggctac aagcaccacg 2640
ccgaggacgc cctgatcatc gccaacgccg acttcatctt caaggagtgg aagaagctgg 2700
acaaggccaa gaaggtgatg gagaaccaga tgttcgagga gaagcaggcc gagagcatgc 2760
ccgagatcga gaccgagcag gagtacaagg agatcttcat caccccccac cagatcaagc 2820
acatcaagga cttcaaggac tacaagtaca gccaccgggt ggacaagaag cccaaccgga 2880
agctgatcaa cgacaccctg tacagcaccc ggaaggacga caagggcaac accctgatcg 2940
tgaacaacct gaacggcctg tacgacaagg acaacgacaa gctgaagaag ctgatcaaca 3000
agagccccga gaagctgctg atgtaccacc acgaccccca gacctaccag aagctgaagc 3060
tgatcatgga gcagtacggc gacgagaaga accccctgta caagtactac gaggagaccg 3120
gcaactacct gaccaagtac agcaagaagg acaacggccc cgtgatcaag aagatcaagt 3180
actacggcaa caagctgaac gcccacctgg acatcaccga cgactacccc aacagccgga 3240
acaaggtggt gaagctgagc ctgaagccct accggttcga cgtgtacctg gacaacggcg 3300
tgtacaagtt cgtgaccgtg aagaacctgg acgtgatcaa gaaggagaac tactacgagg 3360
tgaacagcaa gtgctacgag gaggccaaga agctgaagaa gatcagcaac caggccgagt 3420
tcatcgccag cttctacaag aacgacctga tcaagatcaa cggcgagctg taccgggtga 3480
tcggcgtgaa caacgacctg ctgaaccgga tcgaggtgaa catgatcgac atcacctacc 3540
gggagtacct ggagaacatg aacgacaagc ggccccccca catcatcaag accatcgcca 3600
gcaagaccca gagcatcaag aagtacagca ccgacatcct gggcaacctg tacgaggtga 3660
aatccaagaa gcacccccag atcatcaaga agggcaagcg gcccgccgcc accaagaagg 3720
ccggccaggc caagaagaag aaggcccggg acagcaaggt ggagaacaag accaagaagc 3780
tgcgggtgtt cgaggccttc gccggcatcg gcgcccagcg gaaggccctg gagaaggtgc 3840
ggaaggacga gtacgagatc gtgggcctgg ccgagtggta cgtgcccgcc atcgtgatgt 3900
accaggccat ccacaacaac ttccacacca agctggagta caagagcgtg agccgggagg 3960
agatgatcga ctacctggag aacaagaccc tgagctggaa cagcaagaac cccgtgagca 4020
acggctactg gaagcggaag aaggacgacg agctgaagat catctacaac gccatcaagc 4080
tgagcgagaa ggagggcaac atcttcgaca tccgggacct gtacaagcgg accctgaaga 4140
acatcgacct gctgacctac agcttcccct gccaggacct gagccagcag ggcatccaga 4200
agggcatgaa gcggggcagc ggcacccgga gcggcctgct gtgggagatc gagcgggccc 4260
tggacagcac cgagaagaac gacctgccca agtacctgct gatggagaac gtgggcgccc 4320
tgctgcacaa gaagaacgag gaggagctga accagtggaa gcagaagctg gagagcctgg 4380
gctaccagaa cagcatcgag gtgctgaacg ccgccgactt cggcagcagc caggcccggc 4440
ggcgggtgtt catgatcagc accctgaacg agttcgtgga gctgcccaag ggcgacaaga 4500
agcccaagag catcaagaag gtgctgaaca agatcgtgag cgagaaggac atcctgaaca 4560
acctgctgaa gtacaacctg accgagttca agaaaaccaa gagcaacatc aacaaggcca 4620
gcctgatcgg ctacagcaag ttcaacagcg agggctacgt gtacgacccc gagttcaccg 4680
gccccaccct gaccgccagc ggcgccaaca gccggatcaa gatcaaggac ggcagcaaca 4740
tccggaagat gaacagcgac gagaccttcc tgtacatcgg cttcgacagc caggacggca 4800
agcgggtgaa cgagatcgag ttcctgaccg agaaccagaa gatcttcgtg tgcggcaaca 4860
gcatcagcgt ggaggtgctg gaggccatca tcgacaagat cggcggcccc agcagcggcg 4920
gcaagcggcc cgccgccacc aagaaggccg gccaggccaa gaagaagaag ggcagctacc 4980
cctacgacgt gcccgactac gcctgagcgg ccgcttaatt aagctgcctt ctgcggggct 5040
tgccttctgg ccatgccctt cttctctccc ttgcacctgt acctcttggt ctttgaataa 5100
agcctgagta ggaagtctag aaaaaaaaaa aaaaaaaaaa aaaaaaaaaa aaaaaaaaaa 5160
aaaaaaaaaa aaaaaaaaaa aaaaaaaaaa aaaaaaaaaa ttgtcttctt catcgcctgc 5220
agatcccaat ggcgcgccga gcttggctcg agcatggtca tagctgtttc ctgtgtgaaa 5280
ttgttatccg ctcacaattc cacacaacat acgagccgga agcataaagt gtaaagcctg 5340
gggtgcctaa tgagtgagct aactcacatt aattgcgttg cgctcactgc ccgctttcca 5400
gtcgggaaac ctgtcgtgcc agctgcatta atgaatcggc caacgcgcgg ggagaggcgg 5460
tttgcgtatt gggcgctctt ccgcttcctc gctcactgac tcgctgcgct cggtcgttcg 5520
gctgcggcga gcggtatcag ctcactcaaa ggcggtaata cggttatcca cagaatcagg 5580
ggataacgca ggaaagaaca tgtgagcaaa aggccagcaa aaggccagga accgtaaaaa 5640
ggccgcgttg ctggcgtttt tccataggct ccgcccccct gacgagcatc acaaaaatcg 5700
acgctcaagt cagaggtggc gaaacccgac aggactataa agataccagg cgtttccccc 5760
tggaagctcc ctcgtgcgct ctcctgttcc gaccctgccg cttaccggat acctgtccgc 5820
ctttctccct tcgggaagcg tggcgctttc tcatagctca cgctgtaggt atctcagttc 5880
ggtgtaggtc gttcgctcca agctgggctg tgtgcacgaa ccccccgttc agcccgaccg 5940
ctgcgcctta tccggtaact atcgtcttga gtccaacccg gtaagacacg acttatcgcc 6000
actggcagca gccactggta acaggattag cagagcgagg tatgtaggcg gtgctacaga 6060
gttcttgaag tggtggccta actacggcta cactagaaga acagtatttg gtatctgcgc 6120
tctgctgaag ccagttacct tcggaaaaag agttggtagc tcttgatccg gcaaacaaac 6180
caccgctggt agcggtggtt tttttgtttg caagcagcag attacgcgca gaaaaaaagg 6240
atctcaagaa gatcctttga tcttttctac ggggtctgac gctcagtgga acgaaaactc 6300
acgttaaggg attttggtca tgagattatc aaaaaggatc ttcacctaga tccttttaaa 6360
ttaaaaatga agttttaaat caatctaaag tatatatgag taaacttggt ctgacagtta 6420
gaaaaactca tcgagcatca aatgaaactg caatttattc atatcaggat tatcaatacc 6480
atatttttga aaaagccgtt tctgtaatga aggagaaaac tcaccgaggc agttccatag 6540
gatggcaaga tcctggtatc ggtctgcgat tccgactcgt ccaacatcaa tacaacctat 6600
taatttcccc tcgtcaaaaa taaggttatc aagtgagaaa tcaccatgag tgacgactga 6660
atccggtgag aatggcaaaa gtttatgcat ttctttccag acttgttcaa caggccagcc 6720
attacgctcg tcatcaaaat cactcgcatc aaccaaaccg ttattcattc gtgattgcgc 6780
ctgagcgaga cgaaatacgc gatcgctgtt aaaaggacaa ttacaaacag gaatcgaatg 6840
caaccggcgc aggaacactg ccagcgcatc aacaatattt tcacctgaat caggatattc 6900
ttctaatacc tggaatgctg ttttcccagg gatcgcagtg gtgagtaacc atgcatcatc 6960
aggagtacgg ataaaatgct tgatggtcgg aagaggcata aattccgtca gccagtttag 7020
tctgaccatc tcatctgtaa catcattggc aacgctacct ttgccatgtt tcagaaacaa 7080
ctctggcgca tcgggcttcc catacaatcg atagattgtc gcacctgatt gcccgacatt 7140
atcgcgagcc catttatacc catataaatc agcatccatg ttggaattta atcgcggcct 7200
agagcaagac gtttcccgtt gaatatggct catactcttc ctttttcaat attattgaag 7260
catttatcag ggttattgtc tcatgagcgg atacatattt gaatgtattt agaaaaataa 7320
acaaataggg gttccgcgca catttccccg aaaagtgcca cctgacgtct aagaaaccat 7380
tattatcatg acattaacct ataaaaatag gcgtatcacg aggccctttc gtc 7433
<210> 202
<211> 4762
<212> DNA
<213> Artificial Sequence
<220>
<223> Description of Artificial Sequence: Synthetic polynucleotide
<400> 202
aggaaataag agagaaaaga agagtaagaa gaaatataag agccaccatg gcccccaaga 60
agaagcggaa ggtgggcatc cacggcgtgc ccgccgccgc caagcggaac tacatcctgg 120
gcctggccat cggcatcacc agcgtgggct acggcatcat cgactacgag acccgggacg 180
tgatcgacgc cggcgtgcgg ctgttcaagg aggccaacgt ggagaacaac gagggccggc 240
ggagcaagcg gggcgcccgg cggctgaagc ggcggcggcg gcaccggatc cagcgggtga 300
agaagctgct gttcgactac aacctgctga ccgaccacag cgagctgagc ggcatcaacc 360
cctacgaggc ccgggtgaag ggcctgagcc agaagctgag cgaggaggag ttcagcgccg 420
ccctgctgca cctggccaag cggcggggcg tgcacaacgt gaacgaggtg gaggaggaca 480
ccggcaacga gctgagcacc aaggagcaga tcagccggaa cagcaaggcc ctggaggaga 540
agtacgtggc cgagctgcag ctggagcggc tgaagaagga cggcgaggtg cggggcagca 600
tcaaccggtt caagaccagc gactacgtga aggaggccaa gcagctgctg aaggtgcaga 660
aggcctacca ccagctggac cagagcttca tcgacaccta catcgacctg ctggagaccc 720
ggcggaccta ctacgagggc cccggcgagg gcagcccctt cggctggaag gacatcaagg 780
agtggtacga gatgctgatg ggccactgca cctacttccc cgaggagctg cggagcgtga 840
agtacgccta caacgccgac ctgtacaacg ccctgaacga cctgaacaac ctggtgatca 900
cccgggacga gaacgagaag ctggagtact acgagaagtt ccagatcatc gagaacgtgt 960
tcaagcagaa gaagaagccc accctgaagc agatcgccaa ggagatcctg gtgaacgagg 1020
aggacatcaa gggctaccgg gtgaccagca ccggcaagcc cgagttcacc aacctgaagg 1080
tgtaccacga catcaaggac atcaccgccc ggaaggagat catcgagaac gccgagctgc 1140
tggaccagat cgccaagatc ctgaccatct accagagcag cgaggacatc caggaggagc 1200
tgaccaacct gaacagcgag ctgacccagg aggagatcga gcagatcagc aacctgaagg 1260
gctacaccgg cacccacaac ctgagcctga aggccatcaa cctgatcctg gacgagctgt 1320
ggcacaccaa cgacaaccag atcgccatct tcaaccggct gaagctggtg cccaagaagg 1380
tggacctgag ccagcagaag gagatcccca ccaccctggt ggacgacttc atcctgagcc 1440
ccgtggtgaa gcggagcttc atccagagca tcaaggtgat caacgccatc atcaagaagt 1500
acggcctgcc caacgacatc atcatcgagc tggcccggga gaagaacagc aaggacgccc 1560
agaagatgat caacgagatg cagaagcgga accggcagac caacgagcgg atcgaggaga 1620
tcatccggac caccggcaag gagaacgcca agtacctgat cgagaagatc aagctgcacg 1680
acatgcagga gggcaagtgc ctgtacagcc tggaggccat ccccctggag gacctgctga 1740
acaacccctt caactacgag gtggacgcca tcatcccccg gagcgtgagc ttcgacaaca 1800
gcttcaacaa caaggtgctg gtgaagcagg aggagaacag caagaagggc aaccggaccc 1860
ccttccagta cctgagcagc agcgacagca agatcagcta cgagaccttc aagaagcaca 1920
tcctgaacct ggccaagggc aagggccgga tcagcaagac caagaaggag tacctgctgg 1980
aggagcggga catcaaccgg ttcagcgtgc agaaggactt catcaaccgg aacctggtgg 2040
acacccggta cgccacccgg ggcctgatga acctgctgcg gagctacttc cgggtgaaca 2100
acctggacgt gaaggtgaaa tccatcaacg gcggcttcac cagcttcctg cggcggaagt 2160
ggaagttcaa gaaggagcgg aacaagggct acaagcacca cgccgaggac gccctgatca 2220
tcgccaacgc cgacttcatc ttcaaggagt ggaagaagct ggacaaggcc aagaaggtga 2280
tggagaacca gatgttcgag gagaagcagg ccgagagcat gcccgagatc gagaccgagc 2340
aggagtacaa ggagatcttc atcacccccc accagatcaa gcacatcaag gacttcaagg 2400
actacaagta cagccaccgg gtggacaaga agcccaaccg gaagctgatc aacgacaccc 2460
tgtacagcac ccggaaggac gacaagggca acaccctgat cgtgaacaac ctgaacggcc 2520
tgtacgacaa ggacaacgac aagctgaaga agctgatcaa caagagcccc gagaagctgc 2580
tgatgtacca ccacgacccc cagacctacc agaagctgaa gctgatcatg gagcagtacg 2640
gcgacgagaa gaaccccctg tacaagtact acgaggagac cggcaactac ctgaccaagt 2700
acagcaagaa ggacaacggc cccgtgatca agaagatcaa gtactacggc aacaagctga 2760
acgcccacct ggacatcacc gacgactacc ccaacagccg gaacaaggtg gtgaagctga 2820
gcctgaagcc ctaccggttc gacgtgtacc tggacaacgg cgtgtacaag ttcgtgaccg 2880
tgaagaacct ggacgtgatc aagaaggaga actactacga ggtgaacagc aagtgctacg 2940
aggaggccaa gaagctgaag aagatcagca accaggccga gttcatcgcc agcttctaca 3000
agaacgacct gatcaagatc aacggcgagc tgtaccgggt gatcggcgtg aacaacgacc 3060
tgctgaaccg gatcgaggtg aacatgatcg acatcaccta ccgggagtac ctggagaaca 3120
tgaacgacaa gcggcccccc cacatcatca agaccatcgc cagcaagacc cagagcatca 3180
agaagtacag caccgacatc ctgggcaacc tgtacgaggt gaaatccaag aagcaccccc 3240
agatcatcaa gaagggcaag cggcccgccg ccaccaagaa ggccggccag gccaagaaga 3300
agaaggcccg ggacagcaag gtggagaaca agaccaagaa gctgcgggtg ttcgaggcct 3360
tcgccggcat cggcgcccag cggaaggccc tggagaaggt gcggaaggac gagtacgaga 3420
tcgtgggcct ggccgagtgg tacgtgcccg ccatcgtgat gtaccaggcc atccacaaca 3480
acttccacac caagctggag tacaagagcg tgagccggga ggagatgatc gactacctgg 3540
agaacaagac cctgagctgg aacagcaaga accccgtgag caacggctac tggaagcgga 3600
agaaggacga cgagctgaag atcatctaca acgccatcaa gctgagcgag aaggagggca 3660
acatcttcga catccgggac ctgtacaagc ggaccctgaa gaacatcgac ctgctgacct 3720
acagcttccc ctgccaggac ctgagccagc agggcatcca gaagggcatg aagcggggca 3780
gcggcacccg gagcggcctg ctgtgggaga tcgagcgggc cctggacagc accgagaaga 3840
acgacctgcc caagtacctg ctgatggaga acgtgggcgc cctgctgcac aagaagaacg 3900
aggaggagct gaaccagtgg aagcagaagc tggagagcct gggctaccag aacagcatcg 3960
aggtgctgaa cgccgccgac ttcggcagca gccaggcccg gcggcgggtg ttcatgatca 4020
gcaccctgaa cgagttcgtg gagctgccca agggcgacaa gaagcccaag agcatcaaga 4080
aggtgctgaa caagatcgtg agcgagaagg acatcctgaa caacctgctg aagtacaacc 4140
tgaccgagtt caagaaaacc aagagcaaca tcaacaaggc cagcctgatc ggctacagca 4200
agttcaacag cgagggctac gtgtacgacc ccgagttcac cggccccacc ctgaccgcca 4260
gcggcgccaa cagccggatc aagatcaagg acggcagcaa catccggaag atgaacagcg 4320
acgagacctt cctgtacatc ggcttcgaca gccaggacgg caagcgggtg aacgagatcg 4380
agttcctgac cgagaaccag aagatcttcg tgtgcggcaa cagcatcagc gtggaggtgc 4440
tggaggccat catcgacaag atcggcggcc ccagcagcgg cggcaagcgg cccgccgcca 4500
ccaagaaggc cggccaggcc aagaagaaga agggcagcta cccctacgac gtgcccgact 4560
acgcctgagc ggccgcttaa ttaagctgcc ttctgcgggg cttgccttct ggccatgccc 4620
ttcttctctc ccttgcacct gtacctcttg gtctttgaat aaagcctgag taggaagtct 4680
agaaaaaaaa aaaaaaaaaa aaaaaaaaaa aaaaaaaaaa aaaaaaaaaa aaaaaaaaaa 4740
aaaaaaaaaa aaaaaaaaaa aa 4762
<210> 203
<211> 1506
<212> PRT
<213> artificial sequence
<220>
<223> Description of Artificial Sequence: Synthetic polypeptide
<400> 203
Met Ala Pro Lys Lys Lys Arg Lys Val Gly Ile His Gly Val Pro Ala
1 5 10 15
Ala Ala Lys Arg Asn Tyr Ile Leu Gly Leu Ala Ile Gly Ile Thr Ser
20 25 30
Val Gly Tyr Gly Ile Ile Asp Tyr Glu Thr Arg Asp Val Ile Asp Ala
35 40 45
Gly Val Arg Leu Phe Lys Glu Ala Asn Val Glu Asn Asn Glu Gly Arg
50 55 60
Arg Ser Lys Arg Gly Ala Arg Arg Leu Lys Arg Arg Arg Arg His Arg
65 70 75 80
Ile Gln Arg Val Lys Lys Leu Leu Phe Asp Tyr Asn Leu Leu Thr Asp
85 90 95
His Ser Glu Leu Ser Gly Ile Asn Pro Tyr Glu Ala Arg Val Lys Gly
100 105 110
Leu Ser Gln Lys Leu Ser Glu Glu Glu Phe Ser Ala Ala Leu Leu His
115 120 125
Leu Ala Lys Arg Arg Gly Val His Asn Val Asn Glu Val Glu Glu Asp
130 135 140
Thr Gly Asn Glu Leu Ser Thr Lys Glu Gln Ile Ser Arg Asn Ser Lys
145 150 155 160
Ala Leu Glu Glu Lys Tyr Val Ala Glu Leu Gln Leu Glu Arg Leu Lys
165 170 175
Lys Asp Gly Glu Val Arg Gly Ser Ile Asn Arg Phe Lys Thr Ser Asp
180 185 190
Tyr Val Lys Glu Ala Lys Gln Leu Leu Lys Val Gln Lys Ala Tyr His
195 200 205
Gln Leu Asp Gln Ser Phe Ile Asp Thr Tyr Ile Asp Leu Leu Glu Thr
210 215 220
Arg Arg Thr Tyr Tyr Glu Gly Pro Gly Glu Gly Ser Pro Phe Gly Trp
225 230 235 240
Lys Asp Ile Lys Glu Trp Tyr Glu Met Leu Met Gly His Cys Thr Tyr
245 250 255
Phe Pro Glu Glu Leu Arg Ser Val Lys Tyr Ala Tyr Asn Ala Asp Leu
260 265 270
Tyr Asn Ala Leu Asn Asp Leu Asn Asn Leu Val Ile Thr Arg Asp Glu
275 280 285
Asn Glu Lys Leu Glu Tyr Tyr Glu Lys Phe Gln Ile Ile Glu Asn Val
290 295 300
Phe Lys Gln Lys Lys Lys Pro Thr Leu Lys Gln Ile Ala Lys Glu Ile
305 310 315 320
Leu Val Asn Glu Glu Asp Ile Lys Gly Tyr Arg Val Thr Ser Thr Gly
325 330 335
Lys Pro Glu Phe Thr Asn Leu Lys Val Tyr His Asp Ile Lys Asp Ile
340 345 350
Thr Ala Arg Lys Glu Ile Ile Glu Asn Ala Glu Leu Leu Asp Gln Ile
355 360 365
Ala Lys Ile Leu Thr Ile Tyr Gln Ser Ser Glu Asp Ile Gln Glu Glu
370 375 380
Leu Thr Asn Leu Asn Ser Glu Leu Thr Gln Glu Glu Ile Glu Gln Ile
385 390 395 400
Ser Asn Leu Lys Gly Tyr Thr Gly Thr His Asn Leu Ser Leu Lys Ala
405 410 415
Ile Asn Leu Ile Leu Asp Glu Leu Trp His Thr Asn Asp Asn Gln Ile
420 425 430
Ala Ile Phe Asn Arg Leu Lys Leu Val Pro Lys Lys Val Asp Leu Ser
435 440 445
Gln Gln Lys Glu Ile Pro Thr Thr Leu Val Asp Asp Phe Ile Leu Ser
450 455 460
Pro Val Val Lys Arg Ser Phe Ile Gln Ser Ile Lys Val Ile Asn Ala
465 470 475 480
Ile Ile Lys Lys Tyr Gly Leu Pro Asn Asp Ile Ile Ile Glu Leu Ala
485 490 495
Arg Glu Lys Asn Ser Lys Asp Ala Gln Lys Met Ile Asn Glu Met Gln
500 505 510
Lys Arg Asn Arg Gln Thr Asn Glu Arg Ile Glu Glu Ile Ile Arg Thr
515 520 525
Thr Gly Lys Glu Asn Ala Lys Tyr Leu Ile Glu Lys Ile Lys Leu His
530 535 540
Asp Met Gln Glu Gly Lys Cys Leu Tyr Ser Leu Glu Ala Ile Pro Leu
545 550 555 560
Glu Asp Leu Leu Asn Asn Pro Phe Asn Tyr Glu Val Asp Ala Ile Ile
565 570 575
Pro Arg Ser Val Ser Phe Asp Asn Ser Phe Asn Asn Lys Val Leu Val
580 585 590
Lys Gln Glu Glu Asn Ser Lys Lys Gly Asn Arg Thr Pro Phe Gln Tyr
595 600 605
Leu Ser Ser Ser Asp Ser Lys Ile Ser Tyr Glu Thr Phe Lys Lys His
610 615 620
Ile Leu Asn Leu Ala Lys Gly Lys Gly Arg Ile Ser Lys Thr Lys Lys
625 630 635 640
Glu Tyr Leu Leu Glu Glu Arg Asp Ile Asn Arg Phe Ser Val Gln Lys
645 650 655
Asp Phe Ile Asn Arg Asn Leu Val Asp Thr Arg Tyr Ala Thr Arg Gly
660 665 670
Leu Met Asn Leu Leu Arg Ser Tyr Phe Arg Val Asn Asn Leu Asp Val
675 680 685
Lys Val Lys Ser Ile Asn Gly Gly Phe Thr Ser Phe Leu Arg Arg Lys
690 695 700
Trp Lys Phe Lys Lys Glu Arg Asn Lys Gly Tyr Lys His His Ala Glu
705 710 715 720
Asp Ala Leu Ile Ile Ala Asn Ala Asp Phe Ile Phe Lys Glu Trp Lys
725 730 735
Lys Leu Asp Lys Ala Lys Lys Val Met Glu Asn Gln Met Phe Glu Glu
740 745 750
Lys Gln Ala Glu Ser Met Pro Glu Ile Glu Thr Glu Gln Glu Tyr Lys
755 760 765
Glu Ile Phe Ile Thr Pro His Gln Ile Lys His Ile Lys Asp Phe Lys
770 775 780
Asp Tyr Lys Tyr Ser His Arg Val Asp Lys Lys Pro Asn Arg Lys Leu
785 790 795 800
Ile Asn Asp Thr Leu Tyr Ser Thr Arg Lys Asp Asp Lys Gly Asn Thr
805 810 815
Leu Ile Val Asn Asn Leu Asn Gly Leu Tyr Asp Lys Asp Asn Asp Lys
820 825 830
Leu Lys Lys Leu Ile Asn Lys Ser Pro Glu Lys Leu Leu Met Tyr His
835 840 845
His Asp Pro Gln Thr Tyr Gln Lys Leu Lys Leu Ile Met Glu Gln Tyr
850 855 860
Gly Asp Glu Lys Asn Pro Leu Tyr Lys Tyr Tyr Glu Glu Thr Gly Asn
865 870 875 880
Tyr Leu Thr Lys Tyr Ser Lys Lys Asp Asn Gly Pro Val Ile Lys Lys
885 890 895
Ile Lys Tyr Tyr Gly Asn Lys Leu Asn Ala His Leu Asp Ile Thr Asp
900 905 910
Asp Tyr Pro Asn Ser Arg Asn Lys Val Val Lys Leu Ser Leu Lys Pro
915 920 925
Tyr Arg Phe Asp Val Tyr Leu Asp Asn Gly Val Tyr Lys Phe Val Thr
930 935 940
Val Lys Asn Leu Asp Val Ile Lys Lys Glu Asn Tyr Tyr Glu Val Asn
945 950 955 960
Ser Lys Cys Tyr Glu Glu Ala Lys Lys Leu Lys Lys Ile Ser Asn Gln
965 970 975
Ala Glu Phe Ile Ala Ser Phe Tyr Lys Asn Asp Leu Ile Lys Ile Asn
980 985 990
Gly Glu Leu Tyr Arg Val Ile Gly Val Asn Asn Asp Leu Leu Asn Arg
995 1000 1005
Ile Glu Val Asn Met Ile Asp Ile Thr Tyr Arg Glu Tyr Leu Glu
1010 1015 1020
Asn Met Asn Asp Lys Arg Pro Pro His Ile Ile Lys Thr Ile Ala
1025 1030 1035
Ser Lys Thr Gln Ser Ile Lys Lys Tyr Ser Thr Asp Ile Leu Gly
1040 1045 1050
Asn Leu Tyr Glu Val Lys Ser Lys Lys His Pro Gln Ile Ile Lys
1055 1060 1065
Lys Gly Lys Arg Pro Ala Ala Thr Lys Lys Ala Gly Gln Ala Lys
1070 1075 1080
Lys Lys Lys Ala Arg Asp Ser Lys Val Glu Asn Lys Thr Lys Lys
1085 1090 1095
Leu Arg Val Phe Glu Ala Phe Ala Gly Ile Gly Ala Gln Arg Lys
1100 1105 1110
Ala Leu Glu Lys Val Arg Lys Asp Glu Tyr Glu Ile Val Gly Leu
1115 1120 1125
Ala Glu Trp Tyr Val Pro Ala Ile Val Met Tyr Gln Ala Ile His
1130 1135 1140
Asn Asn Phe His Thr Lys Leu Glu Tyr Lys Ser Val Ser Arg Glu
1145 1150 1155
Glu Met Ile Asp Tyr Leu Glu Asn Lys Thr Leu Ser Trp Asn Ser
1160 1165 1170
Lys Asn Pro Val Ser Asn Gly Tyr Trp Lys Arg Lys Lys Asp Asp
1175 1180 1185
Glu Leu Lys Ile Ile Tyr Asn Ala Ile Lys Leu Ser Glu Lys Glu
1190 1195 1200
Gly Asn Ile Phe Asp Ile Arg Asp Leu Tyr Lys Arg Thr Leu Lys
1205 1210 1215
Asn Ile Asp Leu Leu Thr Tyr Ser Phe Pro Cys Gln Asp Leu Ser
1220 1225 1230
Gln Gln Gly Ile Gln Lys Gly Met Lys Arg Gly Ser Gly Thr Arg
1235 1240 1245
Ser Gly Leu Leu Trp Glu Ile Glu Arg Ala Leu Asp Ser Thr Glu
1250 1255 1260
Lys Asn Asp Leu Pro Lys Tyr Leu Leu Met Glu Asn Val Gly Ala
1265 1270 1275
Leu Leu His Lys Lys Asn Glu Glu Glu Leu Asn Gln Trp Lys Gln
1280 1285 1290
Lys Leu Glu Ser Leu Gly Tyr Gln Asn Ser Ile Glu Val Leu Asn
1295 1300 1305
Ala Ala Asp Phe Gly Ser Ser Gln Ala Arg Arg Arg Val Phe Met
1310 1315 1320
Ile Ser Thr Leu Asn Glu Phe Val Glu Leu Pro Lys Gly Asp Lys
1325 1330 1335
Lys Pro Lys Ser Ile Lys Lys Val Leu Asn Lys Ile Val Ser Glu
1340 1345 1350
Lys Asp Ile Leu Asn Asn Leu Leu Lys Tyr Asn Leu Thr Glu Phe
1355 1360 1365
Lys Lys Thr Lys Ser Asn Ile Asn Lys Ala Ser Leu Ile Gly Tyr
1370 1375 1380
Ser Lys Phe Asn Ser Glu Gly Tyr Val Tyr Asp Pro Glu Phe Thr
1385 1390 1395
Gly Pro Thr Leu Thr Ala Ser Gly Ala Asn Ser Arg Ile Lys Ile
1400 1405 1410
Lys Asp Gly Ser Asn Ile Arg Lys Met Asn Ser Asp Glu Thr Phe
1415 1420 1425
Leu Tyr Ile Gly Phe Asp Ser Gln Asp Gly Lys Arg Val Asn Glu
1430 1435 1440
Ile Glu Phe Leu Thr Glu Asn Gln Lys Ile Phe Val Cys Gly Asn
1445 1450 1455
Ser Ile Ser Val Glu Val Leu Glu Ala Ile Ile Asp Lys Ile Gly
1460 1465 1470
Gly Pro Ser Ser Gly Gly Lys Arg Pro Ala Ala Thr Lys Lys Ala
1475 1480 1485
Gly Gln Ala Lys Lys Lys Lys Gly Ser Tyr Pro Tyr Asp Val Pro
1490 1495 1500
Asp Tyr Ala
1505
<210> 204
<211> 7499
<212> DNA
<213> artificial sequence
<220>
<223> Description of Artificial Sequence: Synthetic polynucleotide
<400> 204
tcgcgcgttt cggtgatgac ggtgaaaacc tctgacacat gcagctcccg gagacggtca 60
cagcttgtct gtaagcggat gccgggagca gacaagcccg tcagggcgcg tcagcgggtg 120
ttggcgggtg tcggggctgg cttaactatg cggcatcaga gcagattgta ctgagagtgc 180
accatatgcg gtgtgaaata ccgcacagat gcgtaaggag aaaataccgc atcaggcgcc 240
attcgccatt caggctgcgc aactgttggg aagggcgatc ggtgcgggcc tcttcgctat 300
tacgccagct ggcgaaaggg ggatgtgctg caaggcgatt aagttgggta acgccagggt 360
tttcccagtc acgacgttgt aaaacgacgg ccagtgaatt gacgcgtatt gggatggtac 420
ctaatacgac tcactataag gaaataagag agaaaagaag agtaagaaga aatataagag 480
ccaccatggc ccccaagaag aagcggaagg tgggcatcca cggcgtgccc gccgccgaca 540
agaagtacag catcggcctg gccatcggca ccaacagcgt gggctgggcc gtgatcaccg 600
acgagtacaa ggtgcccagc aagaagttca aggtgctggg caacaccgac cggcacagca 660
tcaagaagaa cctgatcggc gccctgctgt tcgacagcgg cgagaccgcc gaggccaccc 720
ggctgaagcg gaccgcccgg cggcggtaca cccggcggaa gaaccggatc tgctacctgc 780
aggagatctt cagcaacgag atggccaagg tggacgacag cttcttccac cggctggagg 840
agagcttcct ggtggaggag gacaagaagc acgagcggca ccccatcttc ggcaacatcg 900
tggacgaggt ggcctaccac gagaagtacc ccaccatcta ccacctgcgg aagaagctgg 960
tggacagcac cgacaaggcc gacctgcggc tgatctacct ggccctggcc cacatgatca 1020
agttccgggg ccacttcctg atcgagggcg acctgaaccc cgacaacagc gacgtggaca 1080
agctgttcat ccagctggtg cagacctaca accagctgtt cgaggagaac cccatcaacg 1140
ccagcggcgt ggacgccaag gccatcctga gcgcccggct gagcaagagc cggcggctgg 1200
agaacctgat cgcccagctg cccggcgaga agaagaacgg cctgttcggc aacctgatcg 1260
ccctgagcct gggcctgacc cccaacttca agagcaactt cgacctggcc gaggacgcca 1320
agctgcagct gagcaaggac acctacgacg acgacctgga caacctgctg gcccagatcg 1380
gcgaccagta cgccgacctg ttcctggccg ccaagaacct gagcgacgcc atcctgctga 1440
gcgacatcct gcgggtgaac accgagatca ccaaggcccc cctgagcgcc agcatgatca 1500
agcggtacga cgagcaccac caggacctga ccctgctgaa ggccctggtg cggcagcagc 1560
tgcccgagaa gtacaaggag atcttcttcg accagagcaa gaacggctac gccggctaca 1620
tcgacggcgg cgccagccag gaggagttct acaagttcat caagcccatc ctggagaaga 1680
tggacggcac cgaggagctg ctggtgaagc tgaaccggga ggacctgctg cggaagcagc 1740
ggaccttcga caacggcagc atcccccacc agatccacct gggcgagctg cacgccatcc 1800
tgcggcggca ggaggacttc taccccttcc tgaaggacaa ccgggagaag atcgagaaga 1860
tcctgacctt ccggatcccc tactacgtgg gccccctggc ccggggcaac agccggttcg 1920
cctggatgac ccggaaatcc gaggagacca tcaccccctg gaacttcgag gaggtggtgg 1980
acaagggcgc cagcgcccag agcttcatcg agcggatgac caacttcgac aagaacctgc 2040
ccaacgagaa ggtgctgccc aagcacagcc tgctgtacga gtacttcacc gtgtacaacg 2100
agctgaccaa ggtgaagtac gtgaccgagg gcatgcggaa gcccgccttc ctgagcggcg 2160
agcagaagaa ggccatcgtg gacctgctgt tcaagaccaa ccggaaggtg accgtgaagc 2220
agctgaagga ggactacttc aagaagatcg agtgcttcga cagcgtggag atcagcggcg 2280
tggaggaccg gttcaacgcc agcctgggca cctaccacga cctgctgaag atcatcaagg 2340
acaaggactt cctggacaac gaggagaacg aggacatcct ggaggacatc gtgctgaccc 2400
tgaccctgtt cgaggaccgg gagatgatcg aggagcggct gaaaacctac gcccacctgt 2460
tcgacgacaa ggtgatgaag cagctgaagc ggcggcggta caccggctgg ggccggctga 2520
gccggaagct gatcaacggc atccgggaca agcagagcgg caagaccatc ctggacttcc 2580
tgaaatccga cggcttcgcc aaccggaact tcatgcagct gatccacgac gacagcctga 2640
ccttcaagga ggacatccag aaggcccagg tgagcggcca gggcgacagc ctgcacgagc 2700
acatcgccaa cctggccggc agccccgcca tcaagaaggg catcctgcag accgtgaagg 2760
tggtggacga gctggtgaag gtgatgggcc ggcacaagcc cgagaacatc gtgatcgaga 2820
tggcccggga gaaccagacc acccagaagg gccagaagaa cagccgggag cggatgaagc 2880
ggatcgagga gggcatcaag gagctgggca gccagatcct gaaggagcac cccgtggaga 2940
acacccagct gcagaacgag aagctgtacc tgtactacct gcagaacggc cgggacatgt 3000
acgtggacca ggagctggac atcaaccggc tgagcgacta cgacgtggcc gccatcgtgc 3060
cccagagctt cctgaaggac gacagcatcg acaacaaggt gctgacccgg agcgacaagg 3120
cccggggcaa gagcgacaac gtgcccagcg aggaggtggt gaagaagatg aagaactact 3180
ggcggcagct gctgaacgcc aagctgatca cccagcggaa gttcgacaac ctgaccaagg 3240
ccgagcgggg cggcctgagc gagctggaca aggccggctt catcaagcgg cagctggtgg 3300
agacccggca gatcaccaag cacgtggccc agatcctgga cagccggatg aacaccaagt 3360
acgacgagaa cgacaagctg atccgggagg tgaaggtgat caccctgaaa tccaagctgg 3420
tgagcgactt ccggaaggac ttccagttct acaaggtgcg ggagatcaac aactaccacc 3480
acgcccacga cgcctacctg aacgccgtgg tgggcaccgc cctgatcaag aagtacccca 3540
agctggagag cgagttcgtg tacggcgact acaaggtgta cgacgtgcgg aagatgatcg 3600
ccaagagcga gcaggagatc ggcaaggcca ccgccaagta cttcttctac agcaacatca 3660
tgaacttctt caagaccgag atcaccctgg ccaacggcga gatccggaag cggcccctga 3720
tcgagaccaa cggcgagacc ggcgagatcg tgtgggacaa gggccgggac ttcgccaccg 3780
tgcggaaggt gctgagcatg ccccaggtga acatcgtgaa gaaaaccgag gtgcagaccg 3840
gcggcttcag caaggagagc atcctgccca agcggaacag cgacaagctg atcgcccgga 3900
agaaggactg ggaccccaag aagtacggcg gcttcgacag ccccaccgtg gcctacagcg 3960
tgctggtggt ggccaaggtg gagaagggca agagcaagaa gctgaaatcc gtgaaggagc 4020
tgctgggcat caccatcatg gagcggagca gcttcgagaa gaaccccatc gacttcctgg 4080
aggccaaggg ctacaaggag gtgaagaagg acctgatcat caagctgccc aagtacagcc 4140
tgttcgagct ggagaacggc cggaagcgga tgctggccag cgccggcgag ctgcagaagg 4200
gcaacgagct ggccctgccc agcaagtacg tgaacttcct gtacctggcc agccactacg 4260
agaagctgaa gggcagcccc gaggacaacg agcagaagca gctgttcgtg gagcagcaca 4320
agcactacct ggacgagatc atcgagcaga tcagcgagtt cagcaagcgg gtgatcctgg 4380
ccgacgccaa cctggacaag gtgctgagcg cctacaacaa gcaccgggac aagcccatcc 4440
gggagcaggc cgagaacatc atccacctgt tcaccctgac caacctgggc gcccccgccg 4500
ccttcaagta cttcgacacc accatcgacc ggaagcggta caccagcacc aaggaggtgc 4560
tggacgccac cctgatccac cagagcatca ccggcctgta cgagacccgg atcgacctga 4620
gccagctggg cggcgacaag cggcccgccg ccaccaagaa ggccggccag gccaagaaga 4680
agaaggccag cgacgccaag agcctgaccg cctggagccg gaccctggtg accttcaagg 4740
acgtgttcgt ggacttcacc cgggaggagt ggaagctgct ggacaccgcc cagcagatcc 4800
tgtaccggaa cgtgatgctg gagaactaca agaacctggt gagcctgggc taccagctga 4860
ccaagcccga cgtgatcctg cggctggaga agggcgagga gccctggctg gtggagcggg 4920
agatccacca ggagacccac cccgacagcg agaccgcctt cgagatcaag agcagcgtga 4980
gcggcggcaa gcggcccgcc gccaccaaga aggccggcca ggccaagaag aagaagggca 5040
gctaccccta cgacgtgccc gactacgcct gagcggccgc ttaattaagc tgccttctgc 5100
ggggcttgcc ttctggccat gcccttcttc tctcccttgc acctgtacct cttggtcttt 5160
gaataaagcc tgagtaggaa gtctagaaaa aaaaaaaaaa aaaaaaaaaa aaaaaaaaaa 5220
aaaaaaaaaa aaaaaaaaaa aaaaaaaaaa aaaaaaaaaa aaaaaattgt cttcttcatc 5280
gcctgcagat cccaatggcg cgccgagctt ggctcgagca tggtcatagc tgtttcctgt 5340
gtgaaattgt tatccgctca caattccaca caacatacga gccggaagca taaagtgtaa 5400
agcctggggt gcctaatgag tgagctaact cacattaatt gcgttgcgct cactgcccgc 5460
tttccagtcg ggaaacctgt cgtgccagct gcattaatga atcggccaac gcgcggggag 5520
aggcggtttg cgtattgggc gctcttccgc ttcctcgctc actgactcgc tgcgctcggt 5580
cgttcggctg cggcgagcgg tatcagctca ctcaaaggcg gtaatacggt tatccacaga 5640
atcaggggat aacgcaggaa agaacatgtg agcaaaaggc cagcaaaagg ccaggaaccg 5700
taaaaaggcc gcgttgctgg cgtttttcca taggctccgc ccccctgacg agcatcacaa 5760
aaatcgacgc tcaagtcaga ggtggcgaaa cccgacagga ctataaagat accaggcgtt 5820
tccccctgga agctccctcg tgcgctctcc tgttccgacc ctgccgctta ccggatacct 5880
gtccgccttt ctcccttcgg gaagcgtggc gctttctcat agctcacgct gtaggtatct 5940
cagttcggtg taggtcgttc gctccaagct gggctgtgtg cacgaacccc ccgttcagcc 6000
cgaccgctgc gccttatccg gtaactatcg tcttgagtcc aacccggtaa gacacgactt 6060
atcgccactg gcagcagcca ctggtaacag gattagcaga gcgaggtatg taggcggtgc 6120
tacagagttc ttgaagtggt ggcctaacta cggctacact agaagaacag tatttggtat 6180
ctgcgctctg ctgaagccag ttaccttcgg aaaaagagtt ggtagctctt gatccggcaa 6240
acaaaccacc gctggtagcg gtggtttttt tgtttgcaag cagcagatta cgcgcagaaa 6300
aaaaggatct caagaagatc ctttgatctt ttctacgggg tctgacgctc agtggaacga 6360
aaactcacgt taagggattt tggtcatgag attatcaaaa aggatcttca cctagatcct 6420
tttaaattaa aaatgaagtt ttaaatcaat ctaaagtata tatgagtaaa cttggtctga 6480
cagttagaaa aactcatcga gcatcaaatg aaactgcaat ttattcatat caggattatc 6540
aataccatat ttttgaaaaa gccgtttctg taatgaagga gaaaactcac cgaggcagtt 6600
ccataggatg gcaagatcct ggtatcggtc tgcgattccg actcgtccaa catcaataca 6660
acctattaat ttcccctcgt caaaaataag gttatcaagt gagaaatcac catgagtgac 6720
gactgaatcc ggtgagaatg gcaaaagttt atgcatttct ttccagactt gttcaacagg 6780
ccagccatta cgctcgtcat caaaatcact cgcatcaacc aaaccgttat tcattcgtga 6840
ttgcgcctga gcgagacgaa atacgcgatc gctgttaaaa ggacaattac aaacaggaat 6900
cgaatgcaac cggcgcagga acactgccag cgcatcaaca atattttcac ctgaatcagg 6960
atattcttct aatacctgga atgctgtttt cccagggatc gcagtggtga gtaaccatgc 7020
atcatcagga gtacggataa aatgcttgat ggtcggaaga ggcataaatt ccgtcagcca 7080
gtttagtctg accatctcat ctgtaacatc attggcaacg ctacctttgc catgtttcag 7140
aaacaactct ggcgcatcgg gcttcccata caatcgatag attgtcgcac ctgattgccc 7200
gacattatcg cgagcccatt tatacccata taaatcagca tccatgttgg aatttaatcg 7260
cggcctagag caagacgttt cccgttgaat atggctcata ctcttccttt ttcaatatta 7320
ttgaagcatt tatcagggtt attgtctcat gagcggatac atatttgaat gtatttagaa 7380
aaataaacaa ataggggttc cgcgcacatt tccccgaaaa gtgccacctg acgtctaaga 7440
aaccattatt atcatgacat taacctataa aaataggcgt atcacgaggc cctttcgtc 7499
<210> 205
<211> 4828
<212> DNA
<213> artificial sequence
<220>
<223> Description of Artificial Sequence: Synthetic polynucleotide
<400> 205
aggaaataag agagaaaaga agagtaagaa gaaatataag agccaccatg gcccccaaga 60
agaagcggaa ggtgggcatc cacggcgtgc ccgccgccga caagaagtac agcatcggcc 120
tggccatcgg caccaacagc gtgggctggg ccgtgatcac cgacgagtac aaggtgccca 180
gcaagaagtt caaggtgctg ggcaacaccg accggcacag catcaagaag aacctgatcg 240
gcgccctgct gttcgacagc ggcgagaccg ccgaggccac ccggctgaag cggaccgccc 300
ggcggcggta cacccggcgg aagaaccgga tctgctacct gcaggagatc ttcagcaacg 360
agatggccaa ggtggacgac agcttcttcc accggctgga ggagagcttc ctggtggagg 420
aggacaagaa gcacgagcgg caccccatct tcggcaacat cgtggacgag gtggcctacc 480
acgagaagta ccccaccatc taccacctgc ggaagaagct ggtggacagc accgacaagg 540
ccgacctgcg gctgatctac ctggccctgg cccacatgat caagttccgg ggccacttcc 600
tgatcgaggg cgacctgaac cccgacaaca gcgacgtgga caagctgttc atccagctgg 660
tgcagaccta caaccagctg ttcgaggaga accccatcaa cgccagcggc gtggacgcca 720
aggccatcct gagcgcccgg ctgagcaaga gccggcggct ggagaacctg atcgcccagc 780
tgcccggcga gaagaagaac ggcctgttcg gcaacctgat cgccctgagc ctgggcctga 840
cccccaactt caagagcaac ttcgacctgg ccgaggacgc caagctgcag ctgagcaagg 900
acacctacga cgacgacctg gacaacctgc tggcccagat cggcgaccag tacgccgacc 960
tgttcctggc cgccaagaac ctgagcgacg ccatcctgct gagcgacatc ctgcgggtga 1020
acaccgagat caccaaggcc cccctgagcg ccagcatgat caagcggtac gacgagcacc 1080
accaggacct gaccctgctg aaggccctgg tgcggcagca gctgcccgag aagtacaagg 1140
agatcttctt cgaccagagc aagaacggct acgccggcta catcgacggc ggcgccagcc 1200
aggaggagtt ctacaagttc atcaagccca tcctggagaa gatggacggc accgaggagc 1260
tgctggtgaa gctgaaccgg gaggacctgc tgcggaagca gcggaccttc gacaacggca 1320
gcatccccca ccagatccac ctgggcgagc tgcacgccat cctgcggcgg caggaggact 1380
tctacccctt cctgaaggac aaccgggaga agatcgagaa gatcctgacc ttccggatcc 1440
cctactacgt gggccccctg gcccggggca acagccggtt cgcctggatg acccggaaat 1500
ccgaggagac catcaccccc tggaacttcg aggaggtggt ggacaagggc gccagcgccc 1560
agagcttcat cgagcggatg accaacttcg acaagaacct gcccaacgag aaggtgctgc 1620
ccaagcacag cctgctgtac gagtacttca ccgtgtacaa cgagctgacc aaggtgaagt 1680
acgtgaccga gggcatgcgg aagcccgcct tcctgagcgg cgagcagaag aaggccatcg 1740
tggacctgct gttcaagacc aaccggaagg tgaccgtgaa gcagctgaag gaggactact 1800
tcaagaagat cgagtgcttc gacagcgtgg agatcagcgg cgtggaggac cggttcaacg 1860
ccagcctggg cacctaccac gacctgctga agatcatcaa ggacaaggac ttcctggaca 1920
acgaggagaa cgaggacatc ctggaggaca tcgtgctgac cctgaccctg ttcgaggacc 1980
gggagatgat cgaggagcgg ctgaaaacct acgcccacct gttcgacgac aaggtgatga 2040
agcagctgaa gcggcggcgg tacaccggct ggggccggct gagccggaag ctgatcaacg 2100
gcatccggga caagcagagc ggcaagacca tcctggactt cctgaaatcc gacggcttcg 2160
ccaaccggaa cttcatgcag ctgatccacg acgacagcct gaccttcaag gaggacatcc 2220
agaaggccca ggtgagcggc cagggcgaca gcctgcacga gcacatcgcc aacctggccg 2280
gcagccccgc catcaagaag ggcatcctgc agaccgtgaa ggtggtggac gagctggtga 2340
aggtgatggg ccggcacaag cccgagaaca tcgtgatcga gatggcccgg gagaaccaga 2400
ccacccagaa gggccagaag aacagccggg agcggatgaa gcggatcgag gagggcatca 2460
aggagctggg cagccagatc ctgaaggagc accccgtgga gaacacccag ctgcagaacg 2520
agaagctgta cctgtactac ctgcagaacg gccgggacat gtacgtggac caggagctgg 2580
acatcaaccg gctgagcgac tacgacgtgg ccgccatcgt gccccagagc ttcctgaagg 2640
acgacagcat cgacaacaag gtgctgaccc ggagcgacaa ggcccggggc aagagcgaca 2700
acgtgcccag cgaggaggtg gtgaagaaga tgaagaacta ctggcggcag ctgctgaacg 2760
ccaagctgat cacccagcgg aagttcgaca acctgaccaa ggccgagcgg ggcggcctga 2820
gcgagctgga caaggccggc ttcatcaagc ggcagctggt ggagacccgg cagatcacca 2880
agcacgtggc ccagatcctg gacagccgga tgaacaccaa gtacgacgag aacgacaagc 2940
tgatccggga ggtgaaggtg atcaccctga aatccaagct ggtgagcgac ttccggaagg 3000
acttccagtt ctacaaggtg cgggagatca acaactacca ccacgcccac gacgcctacc 3060
tgaacgccgt ggtgggcacc gccctgatca agaagtaccc caagctggag agcgagttcg 3120
tgtacggcga ctacaaggtg tacgacgtgc ggaagatgat cgccaagagc gagcaggaga 3180
tcggcaaggc caccgccaag tacttcttct acagcaacat catgaacttc ttcaagaccg 3240
agatcaccct ggccaacggc gagatccgga agcggcccct gatcgagacc aacggcgaga 3300
ccggcgagat cgtgtgggac aagggccggg acttcgccac cgtgcggaag gtgctgagca 3360
tgccccaggt gaacatcgtg aagaaaaccg aggtgcagac cggcggcttc agcaaggaga 3420
gcatcctgcc caagcggaac agcgacaagc tgatcgcccg gaagaaggac tgggacccca 3480
agaagtacgg cggcttcgac agccccaccg tggcctacag cgtgctggtg gtggccaagg 3540
tggagaaggg caagagcaag aagctgaaat ccgtgaagga gctgctgggc atcaccatca 3600
tggagcggag cagcttcgag aagaacccca tcgacttcct ggaggccaag ggctacaagg 3660
aggtgaagaa ggacctgatc atcaagctgc ccaagtacag cctgttcgag ctggagaacg 3720
gccggaagcg gatgctggcc agcgccggcg agctgcagaa gggcaacgag ctggccctgc 3780
ccagcaagta cgtgaacttc ctgtacctgg ccagccacta cgagaagctg aagggcagcc 3840
ccgaggacaa cgagcagaag cagctgttcg tggagcagca caagcactac ctggacgaga 3900
tcatcgagca gatcagcgag ttcagcaagc gggtgatcct ggccgacgcc aacctggaca 3960
aggtgctgag cgcctacaac aagcaccggg acaagcccat ccgggagcag gccgagaaca 4020
tcatccacct gttcaccctg accaacctgg gcgcccccgc cgccttcaag tacttcgaca 4080
ccaccatcga ccggaagcgg tacaccagca ccaaggaggt gctggacgcc accctgatcc 4140
accagagcat caccggcctg tacgagaccc ggatcgacct gagccagctg ggcggcgaca 4200
agcggcccgc cgccaccaag aaggccggcc aggccaagaa gaagaaggcc agcgacgcca 4260
agagcctgac cgcctggagc cggaccctgg tgaccttcaa ggacgtgttc gtggacttca 4320
cccgggagga gtggaagctg ctggacaccg cccagcagat cctgtaccgg aacgtgatgc 4380
tggagaacta caagaacctg gtgagcctgg gctaccagct gaccaagccc gacgtgatcc 4440
tgcggctgga gaagggcgag gagccctggc tggtggagcg ggagatccac caggagaccc 4500
accccgacag cgagaccgcc ttcgagatca agagcagcgt gagcggcggc aagcggcccg 4560
ccgccaccaa gaaggccggc caggccaaga agaagaaggg cagctacccc tacgacgtgc 4620
ccgactacgc ctgagcggcc gcttaattaa gctgccttct gcggggcttg ccttctggcc 4680
atgcccttct tctctccctt gcacctgtac ctcttggtct ttgaataaag cctgagtagg 4740
aagtctagaa aaaaaaaaaa aaaaaaaaaa aaaaaaaaaa aaaaaaaaaa aaaaaaaaaa 4800
aaaaaaaaaa aaaaaaaaaa aaaaaaaa 4828
<210> 206
<211> 1528
<212> PRT
<213> artificial sequence
<220>
<223> Description of Artificial Sequence: Synthetic polypeptide
<400> 206
Met Ala Pro Lys Lys Lys Arg Lys Val Gly Ile His Gly Val Pro Ala
1 5 10 15
Ala Asp Lys Lys Tyr Ser Ile Gly Leu Ala Ile Gly Thr Asn Ser Val
20 25 30
Gly Trp Ala Val Ile Thr Asp Glu Tyr Lys Val Pro Ser Lys Lys Phe
35 40 45
Lys Val Leu Gly Asn Thr Asp Arg His Ser Ile Lys Lys Asn Leu Ile
50 55 60
Gly Ala Leu Leu Phe Asp Ser Gly Glu Thr Ala Glu Ala Thr Arg Leu
65 70 75 80
Lys Arg Thr Ala Arg Arg Arg Tyr Thr Arg Arg Lys Asn Arg Ile Cys
85 90 95
Tyr Leu Gln Glu Ile Phe Ser Asn Glu Met Ala Lys Val Asp Asp Ser
100 105 110
Phe Phe His Arg Leu Glu Glu Ser Phe Leu Val Glu Glu Asp Lys Lys
115 120 125
His Glu Arg His Pro Ile Phe Gly Asn Ile Val Asp Glu Val Ala Tyr
130 135 140
His Glu Lys Tyr Pro Thr Ile Tyr His Leu Arg Lys Lys Leu Val Asp
145 150 155 160
Ser Thr Asp Lys Ala Asp Leu Arg Leu Ile Tyr Leu Ala Leu Ala His
165 170 175
Met Ile Lys Phe Arg Gly His Phe Leu Ile Glu Gly Asp Leu Asn Pro
180 185 190
Asp Asn Ser Asp Val Asp Lys Leu Phe Ile Gln Leu Val Gln Thr Tyr
195 200 205
Asn Gln Leu Phe Glu Glu Asn Pro Ile Asn Ala Ser Gly Val Asp Ala
210 215 220
Lys Ala Ile Leu Ser Ala Arg Leu Ser Lys Ser Arg Arg Leu Glu Asn
225 230 235 240
Leu Ile Ala Gln Leu Pro Gly Glu Lys Lys Asn Gly Leu Phe Gly Asn
245 250 255
Leu Ile Ala Leu Ser Leu Gly Leu Thr Pro Asn Phe Lys Ser Asn Phe
260 265 270
Asp Leu Ala Glu Asp Ala Lys Leu Gln Leu Ser Lys Asp Thr Tyr Asp
275 280 285
Asp Asp Leu Asp Asn Leu Leu Ala Gln Ile Gly Asp Gln Tyr Ala Asp
290 295 300
Leu Phe Leu Ala Ala Lys Asn Leu Ser Asp Ala Ile Leu Leu Ser Asp
305 310 315 320
Ile Leu Arg Val Asn Thr Glu Ile Thr Lys Ala Pro Leu Ser Ala Ser
325 330 335
Met Ile Lys Arg Tyr Asp Glu His His Gln Asp Leu Thr Leu Leu Lys
340 345 350
Ala Leu Val Arg Gln Gln Leu Pro Glu Lys Tyr Lys Glu Ile Phe Phe
355 360 365
Asp Gln Ser Lys Asn Gly Tyr Ala Gly Tyr Ile Asp Gly Gly Ala Ser
370 375 380
Gln Glu Glu Phe Tyr Lys Phe Ile Lys Pro Ile Leu Glu Lys Met Asp
385 390 395 400
Gly Thr Glu Glu Leu Leu Val Lys Leu Asn Arg Glu Asp Leu Leu Arg
405 410 415
Lys Gln Arg Thr Phe Asp Asn Gly Ser Ile Pro His Gln Ile His Leu
420 425 430
Gly Glu Leu His Ala Ile Leu Arg Arg Gln Glu Asp Phe Tyr Pro Phe
435 440 445
Leu Lys Asp Asn Arg Glu Lys Ile Glu Lys Ile Leu Thr Phe Arg Ile
450 455 460
Pro Tyr Tyr Val Gly Pro Leu Ala Arg Gly Asn Ser Arg Phe Ala Trp
465 470 475 480
Met Thr Arg Lys Ser Glu Glu Thr Ile Thr Pro Trp Asn Phe Glu Glu
485 490 495
Val Val Asp Lys Gly Ala Ser Ala Gln Ser Phe Ile Glu Arg Met Thr
500 505 510
Asn Phe Asp Lys Asn Leu Pro Asn Glu Lys Val Leu Pro Lys His Ser
515 520 525
Leu Leu Tyr Glu Tyr Phe Thr Val Tyr Asn Glu Leu Thr Lys Val Lys
530 535 540
Tyr Val Thr Glu Gly Met Arg Lys Pro Ala Phe Leu Ser Gly Glu Gln
545 550 555 560
Lys Lys Ala Ile Val Asp Leu Leu Phe Lys Thr Asn Arg Lys Val Thr
565 570 575
Val Lys Gln Leu Lys Glu Asp Tyr Phe Lys Lys Ile Glu Cys Phe Asp
580 585 590
Ser Val Glu Ile Ser Gly Val Glu Asp Arg Phe Asn Ala Ser Leu Gly
595 600 605
Thr Tyr His Asp Leu Leu Lys Ile Ile Lys Asp Lys Asp Phe Leu Asp
610 615 620
Asn Glu Glu Asn Glu Asp Ile Leu Glu Asp Ile Val Leu Thr Leu Thr
625 630 635 640
Leu Phe Glu Asp Arg Glu Met Ile Glu Glu Arg Leu Lys Thr Tyr Ala
645 650 655
His Leu Phe Asp Asp Lys Val Met Lys Gln Leu Lys Arg Arg Arg Tyr
660 665 670
Thr Gly Trp Gly Arg Leu Ser Arg Lys Leu Ile Asn Gly Ile Arg Asp
675 680 685
Lys Gln Ser Gly Lys Thr Ile Leu Asp Phe Leu Lys Ser Asp Gly Phe
690 695 700
Ala Asn Arg Asn Phe Met Gln Leu Ile His Asp Asp Ser Leu Thr Phe
705 710 715 720
Lys Glu Asp Ile Gln Lys Ala Gln Val Ser Gly Gln Gly Asp Ser Leu
725 730 735
His Glu His Ile Ala Asn Leu Ala Gly Ser Pro Ala Ile Lys Lys Gly
740 745 750
Ile Leu Gln Thr Val Lys Val Val Asp Glu Leu Val Lys Val Met Gly
755 760 765
Arg His Lys Pro Glu Asn Ile Val Ile Glu Met Ala Arg Glu Asn Gln
770 775 780
Thr Thr Gln Lys Gly Gln Lys Asn Ser Arg Glu Arg Met Lys Arg Ile
785 790 795 800
Glu Glu Gly Ile Lys Glu Leu Gly Ser Gln Ile Leu Lys Glu His Pro
805 810 815
Val Glu Asn Thr Gln Leu Gln Asn Glu Lys Leu Tyr Leu Tyr Tyr Leu
820 825 830
Gln Asn Gly Arg Asp Met Tyr Val Asp Gln Glu Leu Asp Ile Asn Arg
835 840 845
Leu Ser Asp Tyr Asp Val Ala Ala Ile Val Pro Gln Ser Phe Leu Lys
850 855 860
Asp Asp Ser Ile Asp Asn Lys Val Leu Thr Arg Ser Asp Lys Ala Arg
865 870 875 880
Gly Lys Ser Asp Asn Val Pro Ser Glu Glu Val Val Lys Lys Met Lys
885 890 895
Asn Tyr Trp Arg Gln Leu Leu Asn Ala Lys Leu Ile Thr Gln Arg Lys
900 905 910
Phe Asp Asn Leu Thr Lys Ala Glu Arg Gly Gly Leu Ser Glu Leu Asp
915 920 925
Lys Ala Gly Phe Ile Lys Arg Gln Leu Val Glu Thr Arg Gln Ile Thr
930 935 940
Lys His Val Ala Gln Ile Leu Asp Ser Arg Met Asn Thr Lys Tyr Asp
945 950 955 960
Glu Asn Asp Lys Leu Ile Arg Glu Val Lys Val Ile Thr Leu Lys Ser
965 970 975
Lys Leu Val Ser Asp Phe Arg Lys Asp Phe Gln Phe Tyr Lys Val Arg
980 985 990
Glu Ile Asn Asn Tyr His His Ala His Asp Ala Tyr Leu Asn Ala Val
995 1000 1005
Val Gly Thr Ala Leu Ile Lys Lys Tyr Pro Lys Leu Glu Ser Glu
1010 1015 1020
Phe Val Tyr Gly Asp Tyr Lys Val Tyr Asp Val Arg Lys Met Ile
1025 1030 1035
Ala Lys Ser Glu Gln Glu Ile Gly Lys Ala Thr Ala Lys Tyr Phe
1040 1045 1050
Phe Tyr Ser Asn Ile Met Asn Phe Phe Lys Thr Glu Ile Thr Leu
1055 1060 1065
Ala Asn Gly Glu Ile Arg Lys Arg Pro Leu Ile Glu Thr Asn Gly
1070 1075 1080
Glu Thr Gly Glu Ile Val Trp Asp Lys Gly Arg Asp Phe Ala Thr
1085 1090 1095
Val Arg Lys Val Leu Ser Met Pro Gln Val Asn Ile Val Lys Lys
1100 1105 1110
Thr Glu Val Gln Thr Gly Gly Phe Ser Lys Glu Ser Ile Leu Pro
1115 1120 1125
Lys Arg Asn Ser Asp Lys Leu Ile Ala Arg Lys Lys Asp Trp Asp
1130 1135 1140
Pro Lys Lys Tyr Gly Gly Phe Asp Ser Pro Thr Val Ala Tyr Ser
1145 1150 1155
Val Leu Val Val Ala Lys Val Glu Lys Gly Lys Ser Lys Lys Leu
1160 1165 1170
Lys Ser Val Lys Glu Leu Leu Gly Ile Thr Ile Met Glu Arg Ser
1175 1180 1185
Ser Phe Glu Lys Asn Pro Ile Asp Phe Leu Glu Ala Lys Gly Tyr
1190 1195 1200
Lys Glu Val Lys Lys Asp Leu Ile Ile Lys Leu Pro Lys Tyr Ser
1205 1210 1215
Leu Phe Glu Leu Glu Asn Gly Arg Lys Arg Met Leu Ala Ser Ala
1220 1225 1230
Gly Glu Leu Gln Lys Gly Asn Glu Leu Ala Leu Pro Ser Lys Tyr
1235 1240 1245
Val Asn Phe Leu Tyr Leu Ala Ser His Tyr Glu Lys Leu Lys Gly
1250 1255 1260
Ser Pro Glu Asp Asn Glu Gln Lys Gln Leu Phe Val Glu Gln His
1265 1270 1275
Lys His Tyr Leu Asp Glu Ile Ile Glu Gln Ile Ser Glu Phe Ser
1280 1285 1290
Lys Arg Val Ile Leu Ala Asp Ala Asn Leu Asp Lys Val Leu Ser
1295 1300 1305
Ala Tyr Asn Lys His Arg Asp Lys Pro Ile Arg Glu Gln Ala Glu
1310 1315 1320
Asn Ile Ile His Leu Phe Thr Leu Thr Asn Leu Gly Ala Pro Ala
1325 1330 1335
Ala Phe Lys Tyr Phe Asp Thr Thr Ile Asp Arg Lys Arg Tyr Thr
1340 1345 1350
Ser Thr Lys Glu Val Leu Asp Ala Thr Leu Ile His Gln Ser Ile
1355 1360 1365
Thr Gly Leu Tyr Glu Thr Arg Ile Asp Leu Ser Gln Leu Gly Gly
1370 1375 1380
Asp Lys Arg Pro Ala Ala Thr Lys Lys Ala Gly Gln Ala Lys Lys
1385 1390 1395
Lys Lys Ala Ser Asp Ala Lys Ser Leu Thr Ala Trp Ser Arg Thr
1400 1405 1410
Leu Val Thr Phe Lys Asp Val Phe Val Asp Phe Thr Arg Glu Glu
1415 1420 1425
Trp Lys Leu Leu Asp Thr Ala Gln Gln Ile Leu Tyr Arg Asn Val
1430 1435 1440
Met Leu Glu Asn Tyr Lys Asn Leu Val Ser Leu Gly Tyr Gln Leu
1445 1450 1455
Thr Lys Pro Asp Val Ile Leu Arg Leu Glu Lys Gly Glu Glu Pro
1460 1465 1470
Trp Leu Val Glu Arg Glu Ile His Gln Glu Thr His Pro Asp Ser
1475 1480 1485
Glu Thr Ala Phe Glu Ile Lys Ser Ser Val Ser Gly Gly Lys Arg
1490 1495 1500
Pro Ala Ala Thr Lys Lys Ala Gly Gln Ala Lys Lys Lys Lys Gly
1505 1510 1515
Ser Tyr Pro Tyr Asp Val Pro Asp Tyr Ala
1520 1525
<210> 207
<211> 5705
<212> RNA
<213> artificial sequence
<220>
<223> Description of artificial sequence: synthetic polynucleotide
<400> 207
aaggaaauaa gagagaaaag aagaguaaga agaaauauaa gagccaccau ggcccccaag 60
aagaagcgga aggugggcau ccacggcgug cccgccgccg acaagaagua cagcaucggc 120
cuggccaucg gcaccaacag cgugggcugg gccgugauca ccgacgagua caaggugccc 180
agcaagaagu ucaaggugcu gggcaacacc gaccggcaca gcaucaagaa gaaccugauc 240
ggcgcccugc uguucgacag cggcgagacc gccgaggcca cccggcugaa gcggaccgcc 300
cggcggcggu acacccggcg gaagaaccgg aucugcuacc ugcaggagau cuucagcaac 360
gagauggcca agguggacga cagcuucuuc caccggcugg aggagagcuu ccugguggag 420
gaggacaaga agcacgagcg gcaccccauc uucggcaaca ucguggacga gguggccuac 480
cacgagaagu accccaccau cuaccaccug cggaagaagc ugguggacag caccgacaag 540
gccgaccugc ggcugaucua ccuggcccug gcccacauga ucaaguuccg gggccacuuc 600
cugaucgagg gcgaccugaa ccccgacaac agcgacgugg acaagcuguu cauccagcug 660
gugcagaccu acaaccagcu guucgaggag aaccccauca acgccagcgg cguggacgcc 720
aaggccaucc ugagcgcccg gcugagcaag agccggcggc uggagaaccu gaucgcccag 780
cugcccggcg agaagaagaa cggccuguuc ggcaaccuga ucgcccugag ccugggccug 840
acccccaacu ucaagagcaa cuucgaccug gccgaggacg ccaagcugca gcugagcaag 900
gacaccuacg acgacgaccu ggacaaccug cuggcccaga ucggcgacca guacgccgac 960
cuguuccugg ccgccaagaa ccugagcgac gccauccugc ugagcgacau ccugcgggug 1020
aacaccgaga ucaccaaggc cccccugagc gccagcauga ucaagcggua cgacgagcac 1080
caccaggacc ugacccugcu gaaggcccug gugcggcagc agcugcccga gaaguacaag 1140
gagaucuucu ucgaccagag caagaacggc uacgccggcu acaucgacgg cggcgccagc 1200
caggaggagu ucuacaaguu caucaagccc auccuggaga agauggacgg caccgaggag 1260
cugcugguga agcugaaccg ggaggaccug cugcggaagc agcggaccuu cgacaacggc 1320
agcauccccc accagaucca ccugggcgag cugcacgcca uccugcggcg gcaggaggac 1380
uucuaccccu uccugaagga caaccgggag aagaucgaga agauccugac cuuccggauc 1440
cccuacuacg ugggcccccu ggcccggggc aacagccggu ucgccuggau gacccggaaa 1500
uccgaggaga ccaucacccc cuggaacuuc gaggaggugg uggacaaggg cgccagcgcc 1560
cagagcuuca ucgagcggau gaccaacuuc gacaagaacc ugcccaacga gaaggugcug 1620
cccaagcaca gccugcugua cgaguacuuc accguguaca acgagcugac caaggugaag 1680
uacgugaccg agggcaugcg gaagcccgcc uuccugagcg gcgagcagaa gaaggccauc 1740
guggaccugc uguucaagac caaccggaag gugaccguga agcagcugaa ggaggacuac 1800
uucaagaaga ucgagugcuu cgacagcgug gagaucagcg gcguggagga ccgguucaac 1860
gccagccugg gcaccuacca cgaccugcug aagaucauca aggacaagga cuuccuggac 1920
aacgaggaga acgaggacau ccuggaggac aucgugcuga cccugacccu guucgaggac 1980
cgggagauga ucgaggagcg gcugaaaacc uacgcccacc uguucgacga caaggugaug 2040
aagcagcuga agcggcggcg guacaccggc uggggccggc ugagccggaa gcugaucaac 2100
ggcauccggg acaagcagag cggcaagacc auccuggacu uccugaaauc cgacggcuuc 2160
gccaaccgga acuucaugca gcugauccac gacgacagcc ugaccuucaa ggaggacauc 2220
cagaaggccc aggugagcgg ccagggcgac agccugcacg agcacaucgc caaccuggcc 2280
ggcagccccg ccaucaagaa gggcauccug cagaccguga agguggugga cgagcuggug 2340
aaggugaugg gccggcacaa gcccgagaac aucgugaucg agauggcccg ggagaaccag 2400
accacccaga agggccagaa gaacagccgg gagcggauga agcggaucga ggagggcauc 2460
aaggagcugg gcagccagau ccugaaggag caccccgugg agaacaccca gcugcagaac 2520
gagaagcugu accuguacua ccugcagaac ggccgggaca uguacgugga ccaggagcug 2580
gacaucaacc ggcugagcga cuacgacgug gccgccaucg ugccccagag cuuccugaag 2640
gacgacagca ucgacaacaa ggugcugacc cggagcgaca aggcccgggg caagagcgac 2700
aacgugccca gcgaggaggu ggugaagaag augaagaacu acuggcggca gcugcugaac 2760
gccaagcuga ucacccagcg gaaguucgac aaccugacca aggccgagcg gggcggccug 2820
agcgagcugg acaaggccgg cuucaucaag cggcagcugg uggagacccg gcagaucacc 2880
aagcacgugg cccagauccu ggacagccgg augaacacca aguacgacga gaacgacaag 2940
cugauccggg aggugaaggu gaucacccug aaauccaagc uggugagcga cuuccggaag 3000
gacuuccagu ucuacaaggu gcgggagauc aacaacuacc accacgccca cgacgccuac 3060
cugaacgccg uggugggcac cgcccugauc aagaaguacc ccaagcugga gagcgaguuc 3120
guguacggcg acuacaaggu guacgacgug cggaagauga ucgccaagag cgagcaggag 3180
aucggcaagg ccaccgccaa guacuucuuc uacagcaaca ucaugaacuu cuucaagacc 3240
gagaucaccc uggccaacgg cgagauccgg aagcggcccc ugaucgagac caacggcgag 3300
accggcgaga ucguguggga caagggccgg gacuucgcca ccgugcggaa ggugcugagc 3360
augccccagg ugaacaucgu gaagaaaacc gaggugcaga ccggcggcuu cagcaaggag 3420
agcauccugc ccaagcggaa cagcgacaag cugaucgccc ggaagaagga cugggacccc 3480
aagaaguacg gcggcuucga cagccccacc guggccuaca gcgugcuggu gguggccaag 3540
guggagaagg gcaagagcaa gaagcugaaa uccgugaagg agcugcuggg caucaccauc 3600
auggagcgga gcagcuucga gaagaacccc aucgacuucc uggaggccaa gggcuacaag 3660
gaggugaaga aggaccugau caucaagcug cccaaguaca gccuguucga gcuggagaac 3720
ggccggaagc ggaugcuggc cagcgccggc gagcugcaga agggcaacga gcuggcccug 3780
cccagcaagu acgugaacuu ccuguaccug gccagccacu acgagaagcu gaagggcagc 3840
cccgaggaca acgagcagaa gcagcuguuc guggagcagc acaagcacua ccuggacgag 3900
aucaucgagc agaucagcga guucagcaag cgggugaucc uggccgacgc caaccuggac 3960
aaggugcuga gcgccuacaa caagcaccgg gacaagccca uccgggagca ggccgagaac 4020
aucauccacc uguucacccu gaccaaccug ggcgcccccg ccgccuucaa guacuucgac 4080
accaccaucg accggaagcg guacaccagc accaaggagg ugcuggacgc cacccugauc 4140
caccagagca ucaccggccu guacgagacc cggaucgacc ugagccagcu gggcggcgac 4200
aagcggcccg ccgccaccaa gaaggccggc caggccaaga agaagaaggc ccgggacagc 4260
aagguggaga acaagaccaa gaagcugcgg guguucgagg ccuucgccgg caucggcgcc 4320
cagcggaagg cccuggagaa ggugcggaag gacgaguacg agaucguggg ccuggccgag 4380
ugguacgugc ccgccaucgu gauguaccag gccauccaca acaacuucca caccaagcug 4440
gaguacaaga gcgugagccg ggaggagaug aucgacuacc uggagaacaa gacccugagc 4500
uggaacagca agaaccccgu gagcaacggc uacuggaagc ggaagaagga cgacgagcug 4560
aagaucaucu acaacgccau caagcugagc gagaaggagg gcaacaucuu cgacauccgg 4620
gaccuguaca agcggacccu gaagaacauc gaccugcuga ccuacagcuu ccccugccag 4680
gaccugagcc agcagggcau ccagaagggc augaagcggg gcagcggcac ccggagcggc 4740
cugcuguggg agaucgagcg ggcccuggac agcaccgaga agaacgaccu gcccaaguac 4800
cugcugaugg agaacguggg cgcccugcug cacaagaaga acgaggagga gcugaaccag 4860
uggaagcaga agcuggagag ccugggcuac cagaacagca ucgaggugcu gaacgccgcc 4920
gacuucggca gcagccaggc ccggcggcgg guguucauga ucagcacccu gaacgaguuc 4980
guggagcugc ccaagggcga caagaagccc aagagcauca agaaggugcu gaacaagauc 5040
gugagcgaga aggacauccu gaacaaccug cugaaguaca accugaccga guucaagaaa 5100
accaagagca acaucaacaa ggccagccug aucggcuaca gcaaguucaa cagcgagggc 5160
uacguguacg accccgaguu caccggcccc acccugaccg ccagcggcgc caacagccgg 5220
aucaagauca aggacggcag caacauccgg aagaugaaca gcgacgagac cuuccuguac 5280
aucggcuucg acagccagga cggcaagcgg gugaacgaga ucgaguuccu gaccgagaac 5340
cagaagaucu ucgugugcgg caacagcauc agcguggagg ugcuggaggc caucaucgac 5400
aagaucggcg gccccagcag cggcggcaag cggcccgccg ccaccaagaa ggccggccag 5460
gccaagaaga agaagggcag cuaccccuac gacgugcccg acuacgccug agcggccgcu 5520
uaauuaagcu gccuucugcg gggcuugccu ucuggccaug cccuucuucu cucccuugca 5580
ccuguaccuc uuggucuuug aauaaagccu gaguaggaag ucuagaaaaa aaaaaaaaaa 5640
aaaaaaaaaa aaaaaaaaaa aaaaaaaaaa aaaaaaaaaa aaaaaaaaaa aaaaaaaaaa 5700
aaaaa 5705
<210> 208
<211> 1820
<212> PRT
<213> artificial sequence
<220>
<223> Description of artificial sequence: synthetic polypeptide
<400> 208
Met Ala Pro Lys Lys Lys Arg Lys Val Gly Ile His Gly Val Pro Ala
1 5 10 15
Ala Asp Lys Lys Tyr Ser Ile Gly Leu Ala Ile Gly Thr Asn Ser Val
20 25 30
Gly Trp Ala Val Ile Thr Asp Glu Tyr Lys Val Pro Ser Lys Lys Phe
35 40 45
Lys Val Leu Gly Asn Thr Asp Arg His Ser Ile Lys Lys Asn Leu Ile
50 55 60
Gly Ala Leu Leu Phe Asp Ser Gly Glu Thr Ala Glu Ala Thr Arg Leu
65 70 75 80
Lys Arg Thr Ala Arg Arg Arg Tyr Thr Arg Arg Lys Asn Arg Ile Cys
85 90 95
Tyr Leu Gln Glu Ile Phe Ser Asn Glu Met Ala Lys Val Asp Asp Ser
100 105 110
Phe Phe His Arg Leu Glu Glu Ser Phe Leu Val Glu Glu Asp Lys Lys
115 120 125
His Glu Arg His Pro Ile Phe Gly Asn Ile Val Asp Glu Val Ala Tyr
130 135 140
His Glu Lys Tyr Pro Thr Ile Tyr His Leu Arg Lys Lys Leu Val Asp
145 150 155 160
Ser Thr Asp Lys Ala Asp Leu Arg Leu Ile Tyr Leu Ala Leu Ala His
165 170 175
Met Ile Lys Phe Arg Gly His Phe Leu Ile Glu Gly Asp Leu Asn Pro
180 185 190
Asp Asn Ser Asp Val Asp Lys Leu Phe Ile Gln Leu Val Gln Thr Tyr
195 200 205
Asn Gln Leu Phe Glu Glu Asn Pro Ile Asn Ala Ser Gly Val Asp Ala
210 215 220
Lys Ala Ile Leu Ser Ala Arg Leu Ser Lys Ser Arg Arg Leu Glu Asn
225 230 235 240
Leu Ile Ala Gln Leu Pro Gly Glu Lys Lys Asn Gly Leu Phe Gly Asn
245 250 255
Leu Ile Ala Leu Ser Leu Gly Leu Thr Pro Asn Phe Lys Ser Asn Phe
260 265 270
Asp Leu Ala Glu Asp Ala Lys Leu Gln Leu Ser Lys Asp Thr Tyr Asp
275 280 285
Asp Asp Leu Asp Asn Leu Leu Ala Gln Ile Gly Asp Gln Tyr Ala Asp
290 295 300
Leu Phe Leu Ala Ala Lys Asn Leu Ser Asp Ala Ile Leu Leu Ser Asp
305 310 315 320
Ile Leu Arg Val Asn Thr Glu Ile Thr Lys Ala Pro Leu Ser Ala Ser
325 330 335
Met Ile Lys Arg Tyr Asp Glu His His Gln Asp Leu Thr Leu Leu Lys
340 345 350
Ala Leu Val Arg Gln Gln Leu Pro Glu Lys Tyr Lys Glu Ile Phe Phe
355 360 365
Asp Gln Ser Lys Asn Gly Tyr Ala Gly Tyr Ile Asp Gly Gly Ala Ser
370 375 380
Gln Glu Glu Phe Tyr Lys Phe Ile Lys Pro Ile Leu Glu Lys Met Asp
385 390 395 400
Gly Thr Glu Glu Leu Leu Val Lys Leu Asn Arg Glu Asp Leu Leu Arg
405 410 415
Lys Gln Arg Thr Phe Asp Asn Gly Ser Ile Pro His Gln Ile His Leu
420 425 430
Gly Glu Leu His Ala Ile Leu Arg Arg Gln Glu Asp Phe Tyr Pro Phe
435 440 445
Leu Lys Asp Asn Arg Glu Lys Ile Glu Lys Ile Leu Thr Phe Arg Ile
450 455 460
Pro Tyr Tyr Val Gly Pro Leu Ala Arg Gly Asn Ser Arg Phe Ala Trp
465 470 475 480
Met Thr Arg Lys Ser Glu Glu Thr Ile Thr Pro Trp Asn Phe Glu Glu
485 490 495
Val Val Asp Lys Gly Ala Ser Ala Gln Ser Phe Ile Glu Arg Met Thr
500 505 510
Asn Phe Asp Lys Asn Leu Pro Asn Glu Lys Val Leu Pro Lys His Ser
515 520 525
Leu Leu Tyr Glu Tyr Phe Thr Val Tyr Asn Glu Leu Thr Lys Val Lys
530 535 540
Tyr Val Thr Glu Gly Met Arg Lys Pro Ala Phe Leu Ser Gly Glu Gln
545 550 555 560
Lys Lys Ala Ile Val Asp Leu Leu Phe Lys Thr Asn Arg Lys Val Thr
565 570 575
Val Lys Gln Leu Lys Glu Asp Tyr Phe Lys Lys Ile Glu Cys Phe Asp
580 585 590
Ser Val Glu Ile Ser Gly Val Glu Asp Arg Phe Asn Ala Ser Leu Gly
595 600 605
Thr Tyr His Asp Leu Leu Lys Ile Ile Lys Asp Lys Asp Phe Leu Asp
610 615 620
Asn Glu Glu Asn Glu Asp Ile Leu Glu Asp Ile Val Leu Thr Leu Thr
625 630 635 640
Leu Phe Glu Asp Arg Glu Met Ile Glu Glu Arg Leu Lys Thr Tyr Ala
645 650 655
His Leu Phe Asp Asp Lys Val Met Lys Gln Leu Lys Arg Arg Arg Tyr
660 665 670
Thr Gly Trp Gly Arg Leu Ser Arg Lys Leu Ile Asn Gly Ile Arg Asp
675 680 685
Lys Gln Ser Gly Lys Thr Ile Leu Asp Phe Leu Lys Ser Asp Gly Phe
690 695 700
Ala Asn Arg Asn Phe Met Gln Leu Ile His Asp Asp Ser Leu Thr Phe
705 710 715 720
Lys Glu Asp Ile Gln Lys Ala Gln Val Ser Gly Gln Gly Asp Ser Leu
725 730 735
His Glu His Ile Ala Asn Leu Ala Gly Ser Pro Ala Ile Lys Lys Gly
740 745 750
Ile Leu Gln Thr Val Lys Val Val Asp Glu Leu Val Lys Val Met Gly
755 760 765
Arg His Lys Pro Glu Asn Ile Val Ile Glu Met Ala Arg Glu Asn Gln
770 775 780
Thr Thr Gln Lys Gly Gln Lys Asn Ser Arg Glu Arg Met Lys Arg Ile
785 790 795 800
Glu Glu Gly Ile Lys Glu Leu Gly Ser Gln Ile Leu Lys Glu His Pro
805 810 815
Val Glu Asn Thr Gln Leu Gln Asn Glu Lys Leu Tyr Leu Tyr Tyr Leu
820 825 830
Gln Asn Gly Arg Asp Met Tyr Val Asp Gln Glu Leu Asp Ile Asn Arg
835 840 845
Leu Ser Asp Tyr Asp Val Ala Ala Ile Val Pro Gln Ser Phe Leu Lys
850 855 860
Asp Asp Ser Ile Asp Asn Lys Val Leu Thr Arg Ser Asp Lys Ala Arg
865 870 875 880
Gly Lys Ser Asp Asn Val Pro Ser Glu Glu Val Val Lys Lys Met Lys
885 890 895
Asn Tyr Trp Arg Gln Leu Leu Asn Ala Lys Leu Ile Thr Gln Arg Lys
900 905 910
Phe Asp Asn Leu Thr Lys Ala Glu Arg Gly Gly Leu Ser Glu Leu Asp
915 920 925
Lys Ala Gly Phe Ile Lys Arg Gln Leu Val Glu Thr Arg Gln Ile Thr
930 935 940
Lys His Val Ala Gln Ile Leu Asp Ser Arg Met Asn Thr Lys Tyr Asp
945 950 955 960
Glu Asn Asp Lys Leu Ile Arg Glu Val Lys Val Ile Thr Leu Lys Ser
965 970 975
Lys Leu Val Ser Asp Phe Arg Lys Asp Phe Gln Phe Tyr Lys Val Arg
980 985 990
Glu Ile Asn Asn Tyr His His Ala His Asp Ala Tyr Leu Asn Ala Val
995 1000 1005
Val Gly Thr Ala Leu Ile Lys Lys Tyr Pro Lys Leu Glu Ser Glu
1010 1015 1020
Phe Val Tyr Gly Asp Tyr Lys Val Tyr Asp Val Arg Lys Met Ile
1025 1030 1035
Ala Lys Ser Glu Gln Glu Ile Gly Lys Ala Thr Ala Lys Tyr Phe
1040 1045 1050
Phe Tyr Ser Asn Ile Met Asn Phe Phe Lys Thr Glu Ile Thr Leu
1055 1060 1065
Ala Asn Gly Glu Ile Arg Lys Arg Pro Leu Ile Glu Thr Asn Gly
1070 1075 1080
Glu Thr Gly Glu Ile Val Trp Asp Lys Gly Arg Asp Phe Ala Thr
1085 1090 1095
Val Arg Lys Val Leu Ser Met Pro Gln Val Asn Ile Val Lys Lys
1100 1105 1110
Thr Glu Val Gln Thr Gly Gly Phe Ser Lys Glu Ser Ile Leu Pro
1115 1120 1125
Lys Arg Asn Ser Asp Lys Leu Ile Ala Arg Lys Lys Asp Trp Asp
1130 1135 1140
Pro Lys Lys Tyr Gly Gly Phe Asp Ser Pro Thr Val Ala Tyr Ser
1145 1150 1155
Val Leu Val Val Ala Lys Val Glu Lys Gly Lys Ser Lys Lys Leu
1160 1165 1170
Lys Ser Val Lys Glu Leu Leu Gly Ile Thr Ile Met Glu Arg Ser
1175 1180 1185
Ser Phe Glu Lys Asn Pro Ile Asp Phe Leu Glu Ala Lys Gly Tyr
1190 1195 1200
Lys Glu Val Lys Lys Asp Leu Ile Ile Lys Leu Pro Lys Tyr Ser
1205 1210 1215
Leu Phe Glu Leu Glu Asn Gly Arg Lys Arg Met Leu Ala Ser Ala
1220 1225 1230
Gly Glu Leu Gln Lys Gly Asn Glu Leu Ala Leu Pro Ser Lys Tyr
1235 1240 1245
Val Asn Phe Leu Tyr Leu Ala Ser His Tyr Glu Lys Leu Lys Gly
1250 1255 1260
Ser Pro Glu Asp Asn Glu Gln Lys Gln Leu Phe Val Glu Gln His
1265 1270 1275
Lys His Tyr Leu Asp Glu Ile Ile Glu Gln Ile Ser Glu Phe Ser
1280 1285 1290
Lys Arg Val Ile Leu Ala Asp Ala Asn Leu Asp Lys Val Leu Ser
1295 1300 1305
Ala Tyr Asn Lys His Arg Asp Lys Pro Ile Arg Glu Gln Ala Glu
1310 1315 1320
Asn Ile Ile His Leu Phe Thr Leu Thr Asn Leu Gly Ala Pro Ala
1325 1330 1335
Ala Phe Lys Tyr Phe Asp Thr Thr Ile Asp Arg Lys Arg Tyr Thr
1340 1345 1350
Ser Thr Lys Glu Val Leu Asp Ala Thr Leu Ile His Gln Ser Ile
1355 1360 1365
Thr Gly Leu Tyr Glu Thr Arg Ile Asp Leu Ser Gln Leu Gly Gly
1370 1375 1380
Asp Lys Arg Pro Ala Ala Thr Lys Lys Ala Gly Gln Ala Lys Lys
1385 1390 1395
Lys Lys Ala Arg Asp Ser Lys Val Glu Asn Lys Thr Lys Lys Leu
1400 1405 1410
Arg Val Phe Glu Ala Phe Ala Gly Ile Gly Ala Gln Arg Lys Ala
1415 1420 1425
Leu Glu Lys Val Arg Lys Asp Glu Tyr Glu Ile Val Gly Leu Ala
1430 1435 1440
Glu Trp Tyr Val Pro Ala Ile Val Met Tyr Gln Ala Ile His Asn
1445 1450 1455
Asn Phe His Thr Lys Leu Glu Tyr Lys Ser Val Ser Arg Glu Glu
1460 1465 1470
Met Ile Asp Tyr Leu Glu Asn Lys Thr Leu Ser Trp Asn Ser Lys
1475 1480 1485
Asn Pro Val Ser Asn Gly Tyr Trp Lys Arg Lys Lys Asp Asp Glu
1490 1495 1500
Leu Lys Ile Ile Tyr Asn Ala Ile Lys Leu Ser Glu Lys Glu Gly
1505 1510 1515
Asn Ile Phe Asp Ile Arg Asp Leu Tyr Lys Arg Thr Leu Lys Asn
1520 1525 1530
Ile Asp Leu Leu Thr Tyr Ser Phe Pro Cys Gln Asp Leu Ser Gln
1535 1540 1545
Gln Gly Ile Gln Lys Gly Met Lys Arg Gly Ser Gly Thr Arg Ser
1550 1555 1560
Gly Leu Leu Trp Glu Ile Glu Arg Ala Leu Asp Ser Thr Glu Lys
1565 1570 1575
Asn Asp Leu Pro Lys Tyr Leu Leu Met Glu Asn Val Gly Ala Leu
1580 1585 1590
Leu His Lys Lys Asn Glu Glu Glu Leu Asn Gln Trp Lys Gln Lys
1595 1600 1605
Leu Glu Ser Leu Gly Tyr Gln Asn Ser Ile Glu Val Leu Asn Ala
1610 1615 1620
Ala Asp Phe Gly Ser Ser Gln Ala Arg Arg Arg Val Phe Met Ile
1625 1630 1635
Ser Thr Leu Asn Glu Phe Val Glu Leu Pro Lys Gly Asp Lys Lys
1640 1645 1650
Pro Lys Ser Ile Lys Lys Val Leu Asn Lys Ile Val Ser Glu Lys
1655 1660 1665
Asp Ile Leu Asn Asn Leu Leu Lys Tyr Asn Leu Thr Glu Phe Lys
1670 1675 1680
Lys Thr Lys Ser Asn Ile Asn Lys Ala Ser Leu Ile Gly Tyr Ser
1685 1690 1695
Lys Phe Asn Ser Glu Gly Tyr Val Tyr Asp Pro Glu Phe Thr Gly
1700 1705 1710
Pro Thr Leu Thr Ala Ser Gly Ala Asn Ser Arg Ile Lys Ile Lys
1715 1720 1725
Asp Gly Ser Asn Ile Arg Lys Met Asn Ser Asp Glu Thr Phe Leu
1730 1735 1740
Tyr Ile Gly Phe Asp Ser Gln Asp Gly Lys Arg Val Asn Glu Ile
1745 1750 1755
Glu Phe Leu Thr Glu Asn Gln Lys Ile Phe Val Cys Gly Asn Ser
1760 1765 1770
Ile Ser Val Glu Val Leu Glu Ala Ile Ile Asp Lys Ile Gly Gly
1775 1780 1785
Pro Ser Ser Gly Gly Lys Arg Pro Ala Ala Thr Lys Lys Ala Gly
1790 1795 1800
Gln Ala Lys Lys Lys Lys Gly Ser Tyr Pro Tyr Asp Val Pro Asp
1805 1810 1815
Tyr Ala
1820
<210> 209
<211> 6775
<212> RNA
<213> artificial sequence
<220>
<223> Description of artificial sequence: synthetic polynucleotide
<400> 209
aggaaauaag agagaaaaga agaguaagaa gaaauauaag agccaccaug gcccccaaga 60
agaagcggaa ggugggcggc agcggcggca gcggccagac cggcaagaag agcgagaagg 120
gccccgugug cuggcggaag cgggugaaga gcgaguacau gcggcugcgg cagcugaagc 180
gguuccggcg ggccgacgag gugaagagca uguucagcag caaccggcag aagauccugg 240
agcggaccga gauccugaac caggagugga agcagcggcg aauccagccc gugcacaucc 300
ugaccagcgu gagcagccug cggggcaccc gggagugcag cgugaccagc gaccuggacu 360
uccccaccca ggugaucccc cuaaagaccc ugaacgccgu ggccagcgug cccaucaugu 420
acagcuggag cccccugcag cagaacuuca ugguggagga cgagaccgug cugcacaaca 480
uccccuacau gggcgacgag gugcuggacc aggacggcac cuucaucgag gagcugauca 540
agaacuacga cggcaaggug cacggcgacc gggagugcgg cuucaucaac gacgagaucu 600
ucguggagcu ggugaacgcc cugggccagu acaacgacga cgacgacgac gacgacggcg 660
acgaccccga ggagcgggag gagaagcaga aggaccugga ggaccaccgg gacgacaagg 720
agagccggcc cccccggaag uuccccagcg acaagaucuu cgaggccauc agcagcaugu 780
uccccgacaa gggcaccgcc gaggagcuga aggagaagua caaggagcug accgagcagc 840
agcugcccgg cgcccugccc cccgagugca cccccaacau cgacggcccc aacgccaaga 900
gcgugcagcg ggagcagagc cugcacagcu uccacacccu guucugccgg cggugcuuca 960
aguacgacug cuuccugcac cccuuccacg ccacccccaa caccuacaag cggaagaaca 1020
ccgagaccgc ccuggacaac aagcccugcg gcccccagug cuaccagcac cuggagggcg 1080
ccaaggaguu cgccgccgcc cugaccgccg agcggaucaa gacccccccc aagcggcccg 1140
gcggccggcg gcggggccgg cugcccaaca acagcagccg gcccagcacc cccaccauca 1200
acgugcugga gagcaaggac accgacagcg accgggaggc cggcaccgag accggcggcg 1260
agaacaacga caaggaggag gaggagaaga aggacgagac cagcagcagc agcgaggcca 1320
acagccggug ccagaccccc aucaagauga agcccaacau cgagcccccc gagaacgugg 1380
aguggagcgg cgccgaggcc agcauguucc gggugcugau cggcaccuac uacgacaacu 1440
ucugcgccau cgcccggcug aucggcacca agaccugccg gcagguguac gaguuccggg 1500
ugaaggagag cagcaucauc gcccccgccc ccgccgagga cguggacacc cccccccgga 1560
agaagaagcg gaagcaccgg cugugggccg cccacugccg gaagauccag cugaagaagg 1620
acggcagcag caaccacgug uacaacuacc agcccugcga ccacccccgg cagcccugcg 1680
acagcagcug ccccugcgug aucgcccaga acuucugcga gaaguucugc cagugcagca 1740
gcgagugcca gaaccgguuc cccggcugcc ggugcaaggc ccagugcaac accaagcagu 1800
gccccugcua ccuggccgug cgggagugcg accccgaccu gugccugacc ugcggcgccg 1860
ccgaccacug ggacagcaag aacgugagcu gcaagaacug cagcauccag cggggcagca 1920
agaagcaccu gcugcuggcc cccagcgacg uggccggcug gggcaucuuc aucaaggacc 1980
ccgugcagaa gaacgaguuc aucagcgagu acugcggcga gaucaucagc caggacgagg 2040
ccgaccggcg gggcaaggug uacgacaagu acaugugcag cuuccuguuc aaccugaaca 2100
acgacuucgu gguggacgcc acccggaagg gcaacaagau ccgguucgcc aaccacagcg 2160
ugaaccccaa cugcuacgcc aaggugauga uggugaacgg cgaccaccgg aucggcaucu 2220
ucgccaagcg ggccauccag accggcgagg agcuguucuu cgacuaccgg uacagccagg 2280
ccgacgcccu gaaguacgug ggcaucgagc gggagaugga gauccccagc accggcggca 2340
gcggcggcag cggcggcagc ggcggcagcg gcggcagcgg ccgacccgac aagaaguaca 2400
gcaucggccu ggccaucggc accaacagcg ugggcugggc cgugaucacc gacgaguaca 2460
aggugcccag caagaaguuc aaggugcugg gcaacaccga ccggcacagc aucaagaaga 2520
accugaucgg cgcccugcug uucgacagcg gcgagaccgc cgaggccacc cggcugaagc 2580
ggaccgcccg gcggcgguac acccggcgga agaaccggau cugcuaccug caggagaucu 2640
ucagcaacga gauggccaag guggacgaca gcuucuucca ccggcuggag gagagcuucc 2700
ugguggagga ggacaagaag cacgagcggc accccaucuu cggcaacauc guggacgagg 2760
uggccuacca cgagaaguac cccaccaucu accaccugcg gaagaagcug guggacagca 2820
ccgacaaggc cgaccugcgg cugaucuacc uggcccuggc ccacaugauc aaguuccggg 2880
gccacuuccu gaucgagggc gaccugaacc ccgacaacag cgacguggac aagcuguuca 2940
uccagcuggu gcagaccuac aaccagcugu ucgaggagaa ccccaucaac gccagcggcg 3000
uggacgccaa ggccauccug agcgcccggc ugagcaagag ccggcggcug gagaaccuga 3060
ucgcccagcu gcccggcgag aagaagaacg gccuguucgg caaccugauc gcccugagcc 3120
ugggccugac ccccaacuuc aagagcaacu ucgaccuggc cgaggacgcc aagcugcagc 3180
ugagcaagga caccuacgac gacgaccugg acaaccugcu ggcccagauc ggcgaccagu 3240
acgccgaccu guuccuggcc gccaagaacc ugagcgacgc cauccugcug agcgacaucc 3300
ugcgggugaa caccgagauc accaaggccc cccugagcgc cagcaugauc aagcgguacg 3360
acgagcacca ccaggaccug acccugcuga aggcccuggu gcggcagcag cugcccgaga 3420
aguacaagga gaucuucuuc gaccagagca agaacggcua cgccggcuac aucgacggcg 3480
gcgccagcca ggaggaguuc uacaaguuca ucaagcccau ccuggagaag auggacggca 3540
ccgaggagcu gcuggugaag cugaaccggg aggaccugcu gcggaagcag cggaccuucg 3600
acaacggcag caucccccac cagauccacc ugggcgagcu gcacgccauc cugcggcggc 3660
aggaggacuu cuaccccuuc cugaaggaca accgggagaa gaucgagaag auccugaccu 3720
uccggauccc cuacuacgug ggcccccugg cccggggcaa cagccgguuc gccuggauga 3780
cccggaaauc cgaggagacc aucacccccu ggaacuucga ggagguggug gacaagggcg 3840
ccagcgccca gagcuucauc gagcggauga ccaacuucga caagaaccug cccaacgaga 3900
aggugcugcc caagcacagc cugcuguacg aguacuucac cguguacaac gagcugacca 3960
aggugaagua cgugaccgag ggcaugcgga agcccgccuu ccugagcggc gagcagaaga 4020
aggccaucgu ggaccugcug uucaagacca accggaaggu gaccgugaag cagcugaagg 4080
aggacuacuu caagaagauc gagugcuucg acagcgugga gaucagcggc guggaggacc 4140
gguucaacgc cagccugggc accuaccacg accugcugaa gaucaucaag gacaaggacu 4200
uccuggacaa cgaggagaac gaggacaucc uggaggacau cgugcugacc cugacccugu 4260
ucgaggaccg ggagaugauc gaggagcggc ugaaaaccua cgcccaccug uucgacgaca 4320
aggugaugaa gcagcugaag cggcggcggu acaccggcug gggccggcug agccggaagc 4380
ugaucaacgg cauccgggac aagcagagcg gcaagaccau ccuggacuuc cugaaauccg 4440
acggcuucgc caaccggaac uucaugcagc ugauccacga cgacagccug accuucaagg 4500
aggacaucca gaaggcccag gugagcggcc agggcgacag ccugcacgag cacaucgcca 4560
accuggccgg cagccccgcc aucaagaagg gcauccugca gaccgugaag gugguggacg 4620
agcuggugaa ggugaugggc cggcacaagc ccgagaacau cgugaucgag auggcccggg 4680
agaaccagac cacccagaag ggccagaaga acagccggga gcggaugaag cggaucgagg 4740
agggcaucaa ggagcugggc agccagaucc ugaaggagca ccccguggag aacacccagc 4800
ugcagaacga gaagcuguac cuguacuacc ugcagaacgg ccgggacaug uacguggacc 4860
aggagcugga caucaaccgg cugagcgacu acgacguggc cgccaucgug ccccagagcu 4920
uccugaagga cgacagcauc gacaacaagg ugcugacccg gagcgacaag gcccggggca 4980
agagcgacaa cgugcccagc gaggaggugg ugaagaagau gaagaacuac uggcggcagc 5040
ugcugaacgc caagcugauc acccagcgga aguucgacaa ccugaccaag gccgagcggg 5100
gcggccugag cgagcuggac aaggccggcu ucaucaagcg gcagcuggug gagacccggc 5160
agaucaccaa gcacguggcc cagauccugg acagccggau gaacaccaag uacgacgaga 5220
acgacaagcu gauccgggag gugaagguga ucacccugaa auccaagcug gugagcgacu 5280
uccggaagga cuuccaguuc uacaaggugc gggagaucaa caacuaccac cacgcccacg 5340
acgccuaccu gaacgccgug gugggcaccg cccugaucaa gaaguacccc aagcuggaga 5400
gcgaguucgu guacggcgac uacaaggugu acgacgugcg gaagaugauc gccaagagcg 5460
agcaggagau cggcaaggcc accgccaagu acuucuucua cagcaacauc augaacuucu 5520
ucaagaccga gaucacccug gccaacggcg agauccggaa gcggccccug aucgagacca 5580
acggcgagac cggcgagauc gugugggaca agggccggga cuucgccacc gugcggaagg 5640
ugcugagcau gccccaggug aacaucguga agaaaaccga ggugcagacc ggcggcuuca 5700
gcaaggagag cauccugccc aagcggaaca gcgacaagcu gaucgcccgg aagaaggacu 5760
gggaccccaa gaaguacggc ggcuucgaca gccccaccgu ggccuacagc gugcuggugg 5820
uggccaaggu ggagaagggc aagagcaaga agcugaaauc cgugaaggag cugcugggca 5880
ucaccaucau ggagcggagc agcuucgaga agaaccccau cgacuuccug gaggccaagg 5940
gcuacaagga ggugaagaag gaccugauca ucaagcugcc caaguacagc cuguucgagc 6000
uggagaacgg ccggaagcgg augcuggcca gcgccggcga gcugcagaag ggcaacgagc 6060
uggcccugcc cagcaaguac gugaacuucc uguaccuggc cagccacuac gagaagcuga 6120
agggcagccc cgaggacaac gagcagaagc agcuguucgu ggagcagcac aagcacuacc 6180
uggacgagau caucgagcag aucagcgagu ucagcaagcg ggugauccug gccgacgcca 6240
accuggacaa ggugcugagc gccuacaaca agcaccggga caagcccauc cgggagcagg 6300
ccgagaacau cauccaccug uucacccuga ccaaccuggg cgcccccgcc gccuucaagu 6360
acuucgacac caccaucgac cggaagcggu acaccagcac caaggaggug cuggacgcca 6420
cccugaucca ccagagcauc accggccugu acgagacccg gaucgaccug agccagcugg 6480
gcggcgacag cggcggcaag cggcccgccg ccaccaagaa ggccggccag gccaagaaga 6540
agaagggcag cuaccccuac gacgugcccg acuacgccug agcggccgcu uaauuaagcu 6600
gccuucugcg gggcuugccu ucuggccaug cccuucuucu cucccuugca ccuguaccuc 6660
uuggucuuug aauaaagccu gaguaggaag ucuagaaaaa aaaaaaaaaa aaaaaaaaaa 6720
aaaaaaaaaa aaaaaaaaaa aaaaaaaaaa aaaaaaaaaa aaaaaaaaaa aaaaa 6775
<210> 210
<211> 2177
<212> PRT
<213> artificial sequence
<220>
<223> Description of artificial sequence: synthetic polypeptide
<400> 210
Met Ala Pro Lys Lys Lys Arg Lys Val Gly Gly Ser Gly Gly Ser Gly
1 5 10 15
Gln Thr Gly Lys Lys Ser Glu Lys Gly Pro Val Cys Trp Arg Lys Arg
20 25 30
Val Lys Ser Glu Tyr Met Arg Leu Arg Gln Leu Lys Arg Phe Arg Arg
35 40 45
Ala Asp Glu Val Lys Ser Met Phe Ser Ser Asn Arg Gln Lys Ile Leu
50 55 60
Glu Arg Thr Glu Ile Leu Asn Gln Glu Trp Lys Gln Arg Arg Ile Gln
65 70 75 80
Pro Val His Ile Leu Thr Ser Val Ser Ser Leu Arg Gly Thr Arg Glu
85 90 95
Cys Ser Val Thr Ser Asp Leu Asp Phe Pro Thr Gln Val Ile Pro Leu
100 105 110
Lys Thr Leu Asn Ala Val Ala Ser Val Pro Ile Met Tyr Ser Trp Ser
115 120 125
Pro Leu Gln Gln Asn Phe Met Val Glu Asp Glu Thr Val Leu His Asn
130 135 140
Ile Pro Tyr Met Gly Asp Glu Val Leu Asp Gln Asp Gly Thr Phe Ile
145 150 155 160
Glu Glu Leu Ile Lys Asn Tyr Asp Gly Lys Val His Gly Asp Arg Glu
165 170 175
Cys Gly Phe Ile Asn Asp Glu Ile Phe Val Glu Leu Val Asn Ala Leu
180 185 190
Gly Gln Tyr Asn Asp Asp Asp Asp Asp Asp Asp Gly Asp Asp Pro Glu
195 200 205
Glu Arg Glu Glu Lys Gln Lys Asp Leu Glu Asp His Arg Asp Asp Lys
210 215 220
Glu Ser Arg Pro Pro Arg Lys Phe Pro Ser Asp Lys Ile Phe Glu Ala
225 230 235 240
Ile Ser Ser Met Phe Pro Asp Lys Gly Thr Ala Glu Glu Leu Lys Glu
245 250 255
Lys Tyr Lys Glu Leu Thr Glu Gln Gln Leu Pro Gly Ala Leu Pro Pro
260 265 270
Glu Cys Thr Pro Asn Ile Asp Gly Pro Asn Ala Lys Ser Val Gln Arg
275 280 285
Glu Gln Ser Leu His Ser Phe His Thr Leu Phe Cys Arg Arg Cys Phe
290 295 300
Lys Tyr Asp Cys Phe Leu His Pro Phe His Ala Thr Pro Asn Thr Tyr
305 310 315 320
Lys Arg Lys Asn Thr Glu Thr Ala Leu Asp Asn Lys Pro Cys Gly Pro
325 330 335
Gln Cys Tyr Gln His Leu Glu Gly Ala Lys Glu Phe Ala Ala Ala Leu
340 345 350
Thr Ala Glu Arg Ile Lys Thr Pro Pro Lys Arg Pro Gly Gly Arg Arg
355 360 365
Arg Gly Arg Leu Pro Asn Asn Ser Ser Arg Pro Ser Thr Pro Thr Ile
370 375 380
Asn Val Leu Glu Ser Lys Asp Thr Asp Ser Asp Arg Glu Ala Gly Thr
385 390 395 400
Glu Thr Gly Gly Glu Asn Asn Asp Lys Glu Glu Glu Glu Lys Lys Asp
405 410 415
Glu Thr Ser Ser Ser Ser Glu Ala Asn Ser Arg Cys Gln Thr Pro Ile
420 425 430
Lys Met Lys Pro Asn Ile Glu Pro Pro Glu Asn Val Glu Trp Ser Gly
435 440 445
Ala Glu Ala Ser Met Phe Arg Val Leu Ile Gly Thr Tyr Tyr Asp Asn
450 455 460
Phe Cys Ala Ile Ala Arg Leu Ile Gly Thr Lys Thr Cys Arg Gln Val
465 470 475 480
Tyr Glu Phe Arg Val Lys Glu Ser Ser Ile Ile Ala Pro Ala Pro Ala
485 490 495
Glu Asp Val Asp Thr Pro Pro Arg Lys Lys Lys Arg Lys His Arg Leu
500 505 510
Trp Ala Ala His Cys Arg Lys Ile Gln Leu Lys Lys Asp Gly Ser Ser
515 520 525
Asn His Val Tyr Asn Tyr Gln Pro Cys Asp His Pro Arg Gln Pro Cys
530 535 540
Asp Ser Ser Cys Pro Cys Val Ile Ala Gln Asn Phe Cys Glu Lys Phe
545 550 555 560
Cys Gln Cys Ser Ser Glu Cys Gln Asn Arg Phe Pro Gly Cys Arg Cys
565 570 575
Lys Ala Gln Cys Asn Thr Lys Gln Cys Pro Cys Tyr Leu Ala Val Arg
580 585 590
Glu Cys Asp Pro Asp Leu Cys Leu Thr Cys Gly Ala Ala Asp His Trp
595 600 605
Asp Ser Lys Asn Val Ser Cys Lys Asn Cys Ser Ile Gln Arg Gly Ser
610 615 620
Lys Lys His Leu Leu Leu Ala Pro Ser Asp Val Ala Gly Trp Gly Ile
625 630 635 640
Phe Ile Lys Asp Pro Val Gln Lys Asn Glu Phe Ile Ser Glu Tyr Cys
645 650 655
Gly Glu Ile Ile Ser Gln Asp Glu Ala Asp Arg Arg Gly Lys Val Tyr
660 665 670
Asp Lys Tyr Met Cys Ser Phe Leu Phe Asn Leu Asn Asn Asp Phe Val
675 680 685
Val Asp Ala Thr Arg Lys Gly Asn Lys Ile Arg Phe Ala Asn His Ser
690 695 700
Val Asn Pro Asn Cys Tyr Ala Lys Val Met Met Val Asn Gly Asp His
705 710 715 720
Arg Ile Gly Ile Phe Ala Lys Arg Ala Ile Gln Thr Gly Glu Glu Leu
725 730 735
Phe Phe Asp Tyr Arg Tyr Ser Gln Ala Asp Ala Leu Lys Tyr Val Gly
740 745 750
Ile Glu Arg Glu Met Glu Ile Pro Ser Thr Gly Gly Ser Gly Gly Ser
755 760 765
Gly Gly Ser Gly Gly Ser Gly Gly Ser Gly Arg Pro Asp Lys Lys Tyr
770 775 780
Ser Ile Gly Leu Ala Ile Gly Thr Asn Ser Val Gly Trp Ala Val Ile
785 790 795 800
Thr Asp Glu Tyr Lys Val Pro Ser Lys Lys Phe Lys Val Leu Gly Asn
805 810 815
Thr Asp Arg His Ser Ile Lys Lys Asn Leu Ile Gly Ala Leu Leu Phe
820 825 830
Asp Ser Gly Glu Thr Ala Glu Ala Thr Arg Leu Lys Arg Thr Ala Arg
835 840 845
Arg Arg Tyr Thr Arg Arg Lys Asn Arg Ile Cys Tyr Leu Gln Glu Ile
850 855 860
Phe Ser Asn Glu Met Ala Lys Val Asp Asp Ser Phe Phe His Arg Leu
865 870 875 880
Glu Glu Ser Phe Leu Val Glu Glu Asp Lys Lys His Glu Arg His Pro
885 890 895
Ile Phe Gly Asn Ile Val Asp Glu Val Ala Tyr His Glu Lys Tyr Pro
900 905 910
Thr Ile Tyr His Leu Arg Lys Lys Leu Val Asp Ser Thr Asp Lys Ala
915 920 925
Asp Leu Arg Leu Ile Tyr Leu Ala Leu Ala His Met Ile Lys Phe Arg
930 935 940
Gly His Phe Leu Ile Glu Gly Asp Leu Asn Pro Asp Asn Ser Asp Val
945 950 955 960
Asp Lys Leu Phe Ile Gln Leu Val Gln Thr Tyr Asn Gln Leu Phe Glu
965 970 975
Glu Asn Pro Ile Asn Ala Ser Gly Val Asp Ala Lys Ala Ile Leu Ser
980 985 990
Ala Arg Leu Ser Lys Ser Arg Arg Leu Glu Asn Leu Ile Ala Gln Leu
995 1000 1005
Pro Gly Glu Lys Lys Asn Gly Leu Phe Gly Asn Leu Ile Ala Leu
1010 1015 1020
Ser Leu Gly Leu Thr Pro Asn Phe Lys Ser Asn Phe Asp Leu Ala
1025 1030 1035
Glu Asp Ala Lys Leu Gln Leu Ser Lys Asp Thr Tyr Asp Asp Asp
1040 1045 1050
Leu Asp Asn Leu Leu Ala Gln Ile Gly Asp Gln Tyr Ala Asp Leu
1055 1060 1065
Phe Leu Ala Ala Lys Asn Leu Ser Asp Ala Ile Leu Leu Ser Asp
1070 1075 1080
Ile Leu Arg Val Asn Thr Glu Ile Thr Lys Ala Pro Leu Ser Ala
1085 1090 1095
Ser Met Ile Lys Arg Tyr Asp Glu His His Gln Asp Leu Thr Leu
1100 1105 1110
Leu Lys Ala Leu Val Arg Gln Gln Leu Pro Glu Lys Tyr Lys Glu
1115 1120 1125
Ile Phe Phe Asp Gln Ser Lys Asn Gly Tyr Ala Gly Tyr Ile Asp
1130 1135 1140
Gly Gly Ala Ser Gln Glu Glu Phe Tyr Lys Phe Ile Lys Pro Ile
1145 1150 1155
Leu Glu Lys Met Asp Gly Thr Glu Glu Leu Leu Val Lys Leu Asn
1160 1165 1170
Arg Glu Asp Leu Leu Arg Lys Gln Arg Thr Phe Asp Asn Gly Ser
1175 1180 1185
Ile Pro His Gln Ile His Leu Gly Glu Leu His Ala Ile Leu Arg
1190 1195 1200
Arg Gln Glu Asp Phe Tyr Pro Phe Leu Lys Asp Asn Arg Glu Lys
1205 1210 1215
Ile Glu Lys Ile Leu Thr Phe Arg Ile Pro Tyr Tyr Val Gly Pro
1220 1225 1230
Leu Ala Arg Gly Asn Ser Arg Phe Ala Trp Met Thr Arg Lys Ser
1235 1240 1245
Glu Glu Thr Ile Thr Pro Trp Asn Phe Glu Glu Val Val Asp Lys
1250 1255 1260
Gly Ala Ser Ala Gln Ser Phe Ile Glu Arg Met Thr Asn Phe Asp
1265 1270 1275
Lys Asn Leu Pro Asn Glu Lys Val Leu Pro Lys His Ser Leu Leu
1280 1285 1290
Tyr Glu Tyr Phe Thr Val Tyr Asn Glu Leu Thr Lys Val Lys Tyr
1295 1300 1305
Val Thr Glu Gly Met Arg Lys Pro Ala Phe Leu Ser Gly Glu Gln
1310 1315 1320
Lys Lys Ala Ile Val Asp Leu Leu Phe Lys Thr Asn Arg Lys Val
1325 1330 1335
Thr Val Lys Gln Leu Lys Glu Asp Tyr Phe Lys Lys Ile Glu Cys
1340 1345 1350
Phe Asp Ser Val Glu Ile Ser Gly Val Glu Asp Arg Phe Asn Ala
1355 1360 1365
Ser Leu Gly Thr Tyr His Asp Leu Leu Lys Ile Ile Lys Asp Lys
1370 1375 1380
Asp Phe Leu Asp Asn Glu Glu Asn Glu Asp Ile Leu Glu Asp Ile
1385 1390 1395
Val Leu Thr Leu Thr Leu Phe Glu Asp Arg Glu Met Ile Glu Glu
1400 1405 1410
Arg Leu Lys Thr Tyr Ala His Leu Phe Asp Asp Lys Val Met Lys
1415 1420 1425
Gln Leu Lys Arg Arg Arg Tyr Thr Gly Trp Gly Arg Leu Ser Arg
1430 1435 1440
Lys Leu Ile Asn Gly Ile Arg Asp Lys Gln Ser Gly Lys Thr Ile
1445 1450 1455
Leu Asp Phe Leu Lys Ser Asp Gly Phe Ala Asn Arg Asn Phe Met
1460 1465 1470
Gln Leu Ile His Asp Asp Ser Leu Thr Phe Lys Glu Asp Ile Gln
1475 1480 1485
Lys Ala Gln Val Ser Gly Gln Gly Asp Ser Leu His Glu His Ile
1490 1495 1500
Ala Asn Leu Ala Gly Ser Pro Ala Ile Lys Lys Gly Ile Leu Gln
1505 1510 1515
Thr Val Lys Val Val Asp Glu Leu Val Lys Val Met Gly Arg His
1520 1525 1530
Lys Pro Glu Asn Ile Val Ile Glu Met Ala Arg Glu Asn Gln Thr
1535 1540 1545
Thr Gln Lys Gly Gln Lys Asn Ser Arg Glu Arg Met Lys Arg Ile
1550 1555 1560
Glu Glu Gly Ile Lys Glu Leu Gly Ser Gln Ile Leu Lys Glu His
1565 1570 1575
Pro Val Glu Asn Thr Gln Leu Gln Asn Glu Lys Leu Tyr Leu Tyr
1580 1585 1590
Tyr Leu Gln Asn Gly Arg Asp Met Tyr Val Asp Gln Glu Leu Asp
1595 1600 1605
Ile Asn Arg Leu Ser Asp Tyr Asp Val Ala Ala Ile Val Pro Gln
1610 1615 1620
Ser Phe Leu Lys Asp Asp Ser Ile Asp Asn Lys Val Leu Thr Arg
1625 1630 1635
Ser Asp Lys Ala Arg Gly Lys Ser Asp Asn Val Pro Ser Glu Glu
1640 1645 1650
Val Val Lys Lys Met Lys Asn Tyr Trp Arg Gln Leu Leu Asn Ala
1655 1660 1665
Lys Leu Ile Thr Gln Arg Lys Phe Asp Asn Leu Thr Lys Ala Glu
1670 1675 1680
Arg Gly Gly Leu Ser Glu Leu Asp Lys Ala Gly Phe Ile Lys Arg
1685 1690 1695
Gln Leu Val Glu Thr Arg Gln Ile Thr Lys His Val Ala Gln Ile
1700 1705 1710
Leu Asp Ser Arg Met Asn Thr Lys Tyr Asp Glu Asn Asp Lys Leu
1715 1720 1725
Ile Arg Glu Val Lys Val Ile Thr Leu Lys Ser Lys Leu Val Ser
1730 1735 1740
Asp Phe Arg Lys Asp Phe Gln Phe Tyr Lys Val Arg Glu Ile Asn
1745 1750 1755
Asn Tyr His His Ala His Asp Ala Tyr Leu Asn Ala Val Val Gly
1760 1765 1770
Thr Ala Leu Ile Lys Lys Tyr Pro Lys Leu Glu Ser Glu Phe Val
1775 1780 1785
Tyr Gly Asp Tyr Lys Val Tyr Asp Val Arg Lys Met Ile Ala Lys
1790 1795 1800
Ser Glu Gln Glu Ile Gly Lys Ala Thr Ala Lys Tyr Phe Phe Tyr
1805 1810 1815
Ser Asn Ile Met Asn Phe Phe Lys Thr Glu Ile Thr Leu Ala Asn
1820 1825 1830
Gly Glu Ile Arg Lys Arg Pro Leu Ile Glu Thr Asn Gly Glu Thr
1835 1840 1845
Gly Glu Ile Val Trp Asp Lys Gly Arg Asp Phe Ala Thr Val Arg
1850 1855 1860
Lys Val Leu Ser Met Pro Gln Val Asn Ile Val Lys Lys Thr Glu
1865 1870 1875
Val Gln Thr Gly Gly Phe Ser Lys Glu Ser Ile Leu Pro Lys Arg
1880 1885 1890
Asn Ser Asp Lys Leu Ile Ala Arg Lys Lys Asp Trp Asp Pro Lys
1895 1900 1905
Lys Tyr Gly Gly Phe Asp Ser Pro Thr Val Ala Tyr Ser Val Leu
1910 1915 1920
Val Val Ala Lys Val Glu Lys Gly Lys Ser Lys Lys Leu Lys Ser
1925 1930 1935
Val Lys Glu Leu Leu Gly Ile Thr Ile Met Glu Arg Ser Ser Phe
1940 1945 1950
Glu Lys Asn Pro Ile Asp Phe Leu Glu Ala Lys Gly Tyr Lys Glu
1955 1960 1965
Val Lys Lys Asp Leu Ile Ile Lys Leu Pro Lys Tyr Ser Leu Phe
1970 1975 1980
Glu Leu Glu Asn Gly Arg Lys Arg Met Leu Ala Ser Ala Gly Glu
1985 1990 1995
Leu Gln Lys Gly Asn Glu Leu Ala Leu Pro Ser Lys Tyr Val Asn
2000 2005 2010
Phe Leu Tyr Leu Ala Ser His Tyr Glu Lys Leu Lys Gly Ser Pro
2015 2020 2025
Glu Asp Asn Glu Gln Lys Gln Leu Phe Val Glu Gln His Lys His
2030 2035 2040
Tyr Leu Asp Glu Ile Ile Glu Gln Ile Ser Glu Phe Ser Lys Arg
2045 2050 2055
Val Ile Leu Ala Asp Ala Asn Leu Asp Lys Val Leu Ser Ala Tyr
2060 2065 2070
Asn Lys His Arg Asp Lys Pro Ile Arg Glu Gln Ala Glu Asn Ile
2075 2080 2085
Ile His Leu Phe Thr Leu Thr Asn Leu Gly Ala Pro Ala Ala Phe
2090 2095 2100
Lys Tyr Phe Asp Thr Thr Ile Asp Arg Lys Arg Tyr Thr Ser Thr
2105 2110 2115
Lys Glu Val Leu Asp Ala Thr Leu Ile His Gln Ser Ile Thr Gly
2120 2125 2130
Leu Tyr Glu Thr Arg Ile Asp Leu Ser Gln Leu Gly Gly Asp Ser
2135 2140 2145
Gly Gly Lys Arg Pro Ala Ala Thr Lys Lys Ala Gly Gln Ala Lys
2150 2155 2160
Lys Lys Lys Gly Ser Tyr Pro Tyr Asp Val Pro Asp Tyr Ala
2165 2170 2175
<210> 211
<211> 6202
<212> RNA
<213> artificial sequence
<220>
<223> Description of Artificial Sequence: Synthetic polynucleotide
<400> 211
aggaaauaag agagaaaaga agaguaagaa gaaauauaag agccaccaug gcccccaaga 60
agaagcggaa ggugggcauc cacggcgugc ccgccgccga caagaaguac agcaucggcc 120
uggccaucgg caccaacagc gugggcuggg ccgugaucac cgacgaguac aaggugccca 180
gcaagaaguu caaggugcug ggcaacaccg accggcacag caucaagaag aaccugaucg 240
gcgcccugcu guucgacagc ggcgagaccg ccgaggccac ccggcugaag cggaccgccc 300
ggcggcggua cacccggcgg aagaaccgga ucugcuaccu gcaggagauc uucagcaacg 360
agauggccaa gguggacgac agcuucuucc accggcugga ggagagcuuc cugguggagg 420
aggacaagaa gcacgagcgg caccccaucu ucggcaacau cguggacgag guggccuacc 480
acgagaagua ccccaccauc uaccaccugc ggaagaagcu gguggacagc accgacaagg 540
ccgaccugcg gcugaucuac cuggcccugg cccacaugau caaguuccgg ggccacuucc 600
ugaucgaggg cgaccugaac cccgacaaca gcgacgugga caagcuguuc auccagcugg 660
ugcagaccua caaccagcug uucgaggaga accccaucaa cgccagcggc guggacgcca 720
aggccauccu gagcgcccgg cugagcaaga gccggcggcu ggagaaccug aucgcccagc 780
ugcccggcga gaagaagaac ggccuguucg gcaaccugau cgcccugagc cugggccuga 840
cccccaacuu caagagcaac uucgaccugg ccgaggacgc caagcugcag cugagcaagg 900
acaccuacga cgacgaccug gacaaccugc uggcccagau cggcgaccag uacgccgacc 960
uguuccuggc cgccaagaac cugagcgacg ccauccugcu gagcgacauc cugcggguga 1020
acaccgagau caccaaggcc ccccugagcg ccagcaugau caagcgguac gacgagcacc 1080
accaggaccu gacccugcug aaggcccugg ugcggcagca gcugcccgag aaguacaagg 1140
agaucuucuu cgaccagagc aagaacggcu acgccggcua caucgacggc ggcgccagcc 1200
aggaggaguu cuacaaguuc aucaagccca uccuggagaa gauggacggc accgaggagc 1260
ugcuggugaa gcugaaccgg gaggaccugc ugcggaagca gcggaccuuc gacaacggca 1320
gcauccccca ccagauccac cugggcgagc ugcacgccau ccugcggcgg caggaggacu 1380
ucuaccccuu ccugaaggac aaccgggaga agaucgagaa gauccugacc uuccggaucc 1440
ccuacuacgu gggcccccug gcccggggca acagccgguu cgccuggaug acccggaaau 1500
ccgaggagac caucaccccc uggaacuucg aggagguggu ggacaagggc gccagcgccc 1560
agagcuucau cgagcggaug accaacuucg acaagaaccu gcccaacgag aaggugcugc 1620
ccaagcacag ccugcuguac gaguacuuca ccguguacaa cgagcugacc aaggugaagu 1680
acgugaccga gggcaugcgg aagcccgccu uccugagcgg cgagcagaag aaggccaucg 1740
uggaccugcu guucaagacc aaccggaagg ugaccgugaa gcagcugaag gaggacuacu 1800
ucaagaagau cgagugcuuc gacagcgugg agaucagcgg cguggaggac cgguucaacg 1860
ccagccuggg caccuaccac gaccugcuga agaucaucaa ggacaaggac uuccuggaca 1920
acgaggagaa cgaggacauc cuggaggaca ucgugcugac ccugacccug uucgaggacc 1980
gggagaugau cgaggagcgg cugaaaaccu acgcccaccu guucgacgac aaggugauga 2040
agcagcugaa gcggcggcgg uacaccggcu ggggccggcu gagccggaag cugaucaacg 2100
gcauccggga caagcagagc ggcaagacca uccuggacuu ccugaaaucc gacggcuucg 2160
ccaaccggaa cuucaugcag cugauccacg acgacagccu gaccuucaag gaggacaucc 2220
agaaggccca ggugagcggc cagggcgaca gccugcacga gcacaucgcc aaccuggccg 2280
gcagccccgc caucaagaag ggcauccugc agaccgugaa ggugguggac gagcugguga 2340
aggugauggg ccggcacaag cccgagaaca ucgugaucga gauggcccgg gagaaccaga 2400
ccacccagaa gggccagaag aacagccggg agcggaugaa gcggaucgag gagggcauca 2460
aggagcuggg cagccagauc cugaaggagc accccgugga gaacacccag cugcagaacg 2520
agaagcugua ccuguacuac cugcagaacg gccgggacau guacguggac caggagcugg 2580
acaucaaccg gcugagcgac uacgacgugg ccgccaucgu gccccagagc uuccugaagg 2640
acgacagcau cgacaacaag gugcugaccc ggagcgacaa ggcccggggc aagagcgaca 2700
acgugcccag cgaggaggug gugaagaaga ugaagaacua cuggcggcag cugcugaacg 2760
ccaagcugau cacccagcgg aaguucgaca accugaccaa ggccgagcgg ggcggccuga 2820
gcgagcugga caaggccggc uucaucaagc ggcagcuggu ggagacccgg cagaucacca 2880
agcacguggc ccagauccug gacagccgga ugaacaccaa guacgacgag aacgacaagc 2940
ugauccggga ggugaaggug aucacccuga aauccaagcu ggugagcgac uuccggaagg 3000
acuuccaguu cuacaaggug cgggagauca acaacuacca ccacgcccac gacgccuacc 3060
ugaacgccgu ggugggcacc gcccugauca agaaguaccc caagcuggag agcgaguucg 3120
uguacggcga cuacaaggug uacgacgugc ggaagaugau cgccaagagc gagcaggaga 3180
ucggcaaggc caccgccaag uacuucuucu acagcaacau caugaacuuc uucaagaccg 3240
agaucacccu ggccaacggc gagauccgga agcggccccu gaucgagacc aacggcgaga 3300
ccggcgagau cgugugggac aagggccggg acuucgccac cgugcggaag gugcugagca 3360
ugccccaggu gaacaucgug aagaaaaccg aggugcagac cggcggcuuc agcaaggaga 3420
gcauccugcc caagcggaac agcgacaagc ugaucgcccg gaagaaggac ugggacccca 3480
agaaguacgg cggcuucgac agccccaccg uggccuacag cgugcuggug guggccaagg 3540
uggagaaggg caagagcaag aagcugaaau ccgugaagga gcugcugggc aucaccauca 3600
uggagcggag cagcuucgag aagaacccca ucgacuuccu ggaggccaag ggcuacaagg 3660
aggugaagaa ggaccugauc aucaagcugc ccaaguacag ccuguucgag cuggagaacg 3720
gccggaagcg gaugcuggcc agcgccggcg agcugcagaa gggcaacgag cuggcccugc 3780
ccagcaagua cgugaacuuc cuguaccugg ccagccacua cgagaagcug aagggcagcc 3840
ccgaggacaa cgagcagaag cagcuguucg uggagcagca caagcacuac cuggacgaga 3900
ucaucgagca gaucagcgag uucagcaagc gggugauccu ggccgacgcc aaccuggaca 3960
aggugcugag cgccuacaac aagcaccggg acaagcccau ccgggagcag gccgagaaca 4020
ucauccaccu guucacccug accaaccugg gcgcccccgc cgccuucaag uacuucgaca 4080
ccaccaucga ccggaagcgg uacaccagca ccaaggaggu gcuggacgcc acccugaucc 4140
accagagcau caccggccug uacgagaccc ggaucgaccu gagccagcug ggcggcgaca 4200
gcgccggcgg cggcggcagc ggcggcggcg gcagcggcgg cggcggcagc ggccccaaga 4260
agaagcggaa gguggccgcc gccggcagca accacgacca ggaguucgac ccccccaagg 4320
uguacccccc cgugcccgcc gagaagcgga agcccauccg ggugcugagc cuguucgacg 4380
gcaucgccac cggccugcug gugcugaagg accugggcau ccagguggac cgguacaucg 4440
ccagcgaggu gugcgaggac agcaucaccg ugggcauggu gcggcaccag ggcaagauca 4500
uguacguggg cgacgugcgg agcgugaccc agaagcacau ccaggagugg ggccccuucg 4560
accuggugau cggcggcagc cccugcaacg accugagcau cgugaacccc gcccggaagg 4620
gccuguacga gggcaccggc cggcuguucu ucgaguucua ccggcugcug cacgacgccc 4680
ggcccaagga gggcgacgac cggcccuucu ucuggcuguu cgagaacgug guggccaugg 4740
gcgugagcga caagcgggac aucagccggu uccuggagag caaccccgug augaucgacg 4800
ccaaggaggu gagcgccgcc caccgggccc gguacuucug gggcaaccug cccggcauga 4860
accggccccu ggccagcacc gugaacgaca agcuggagcu gcaggagugc cuggagcacg 4920
gccggaucgc caaguucagc aaggugcgga ccaucaccac ccggagcaac agcaucaagc 4980
agggcaagga ccagcacuuc cccguguuca ugaacgagaa ggaggacauc cuguggugca 5040
ccgagaugga gcggguguuc ggcuuccccg ugcacuacac cgacgugagc aacaugagcc 5100
ggcuggcccg gcagcggcug cugggccgga gcuggagcgu gcccgugauc cggcaccugu 5160
ucgccccccu gaaggaguac uucgccugcg ugagcagcgg caacagcaac gccaacagcc 5220
ggggccccag cuucagcagc ggccuggugc cccugagccu gcggggcagc cacaugaauc 5280
cucuggagau guucgagaca gugcccgugu ggagaaggca acccgugagg gugcugagcc 5340
ucuucgagga cauuaagaag gagcugaccu cucugggcuu ucuggaaucc ggcagcgacc 5400
ccggccagcu gaaacacgug guggacguga ccgacacagu gaggaaggac guggaagagu 5460
ggggccccuu ugaccucgug uauggagcca caccuccucu cggccacaca ugcgauaggc 5520
cucccagcug guaucucuuc caguuccaca gacugcucca guacgccaga ccuaagcccg 5580
gcagccccag acccuucuuc uggauguucg uggacaaucu ggugcugaac aaggaggauc 5640
uggauguggc cagcagauuu cuggagaugg aacccgugac aauccccgac gugcauggcg 5700
gcucucugca gaacgccgug agaguguggu ccaacauccc cgccauuaga agcagacacu 5760
gggcucuggu gagcgaggag gaacugucuc ugcuggccca gaauaagcag uccuccaagc 5820
uggccgccaa guggcccacc aagcugguga agaacugcuu ucugccucug agggaguauu 5880
ucaaguauuu cagcaccgaa cugaccagca gccugagcgg cggcaagcgg cccgccgcca 5940
ccaagaaggc cggccaggcc aagaagaaga agggcagcua ccccuacgac gugcccgacu 6000
acgccugagc ggccgcuuaa uuaagcugcc uucugcgggg cuugccuucu ggccaugccc 6060
uucuucucuc ccuugcaccu guaccucuug gucuuugaau aaagccugag uaggaagucu 6120
agaaaaaaaa aaaaaaaaaa aaaaaaaaaa aaaaaaaaaa aaaaaaaaaa aaaaaaaaaa 6180
aaaaaaaaaa aaaaaaaaaa aa 6202
<210> 212
<211> 1986
<212> PRT
<213> artificial sequence
<220>
<223> Description of Artificial Sequence: Synthetic polypeptide
<400> 212
Met Ala Pro Lys Lys Lys Arg Lys Val Gly Ile His Gly Val Pro Ala
1 5 10 15
Ala Asp Lys Lys Tyr Ser Ile Gly Leu Ala Ile Gly Thr Asn Ser Val
20 25 30
Gly Trp Ala Val Ile Thr Asp Glu Tyr Lys Val Pro Ser Lys Lys Phe
35 40 45
Lys Val Leu Gly Asn Thr Asp Arg His Ser Ile Lys Lys Asn Leu Ile
50 55 60
Gly Ala Leu Leu Phe Asp Ser Gly Glu Thr Ala Glu Ala Thr Arg Leu
65 70 75 80
Lys Arg Thr Ala Arg Arg Arg Tyr Thr Arg Arg Lys Asn Arg Ile Cys
85 90 95
Tyr Leu Gln Glu Ile Phe Ser Asn Glu Met Ala Lys Val Asp Asp Ser
100 105 110
Phe Phe His Arg Leu Glu Glu Ser Phe Leu Val Glu Glu Asp Lys Lys
115 120 125
His Glu Arg His Pro Ile Phe Gly Asn Ile Val Asp Glu Val Ala Tyr
130 135 140
His Glu Lys Tyr Pro Thr Ile Tyr His Leu Arg Lys Lys Leu Val Asp
145 150 155 160
Ser Thr Asp Lys Ala Asp Leu Arg Leu Ile Tyr Leu Ala Leu Ala His
165 170 175
Met Ile Lys Phe Arg Gly His Phe Leu Ile Glu Gly Asp Leu Asn Pro
180 185 190
Asp Asn Ser Asp Val Asp Lys Leu Phe Ile Gln Leu Val Gln Thr Tyr
195 200 205
Asn Gln Leu Phe Glu Glu Asn Pro Ile Asn Ala Ser Gly Val Asp Ala
210 215 220
Lys Ala Ile Leu Ser Ala Arg Leu Ser Lys Ser Arg Arg Leu Glu Asn
225 230 235 240
Leu Ile Ala Gln Leu Pro Gly Glu Lys Lys Asn Gly Leu Phe Gly Asn
245 250 255
Leu Ile Ala Leu Ser Leu Gly Leu Thr Pro Asn Phe Lys Ser Asn Phe
260 265 270
Asp Leu Ala Glu Asp Ala Lys Leu Gln Leu Ser Lys Asp Thr Tyr Asp
275 280 285
Asp Asp Leu Asp Asn Leu Leu Ala Gln Ile Gly Asp Gln Tyr Ala Asp
290 295 300
Leu Phe Leu Ala Ala Lys Asn Leu Ser Asp Ala Ile Leu Leu Ser Asp
305 310 315 320
Ile Leu Arg Val Asn Thr Glu Ile Thr Lys Ala Pro Leu Ser Ala Ser
325 330 335
Met Ile Lys Arg Tyr Asp Glu His His Gln Asp Leu Thr Leu Leu Lys
340 345 350
Ala Leu Val Arg Gln Gln Leu Pro Glu Lys Tyr Lys Glu Ile Phe Phe
355 360 365
Asp Gln Ser Lys Asn Gly Tyr Ala Gly Tyr Ile Asp Gly Gly Ala Ser
370 375 380
Gln Glu Glu Phe Tyr Lys Phe Ile Lys Pro Ile Leu Glu Lys Met Asp
385 390 395 400
Gly Thr Glu Glu Leu Leu Val Lys Leu Asn Arg Glu Asp Leu Leu Arg
405 410 415
Lys Gln Arg Thr Phe Asp Asn Gly Ser Ile Pro His Gln Ile His Leu
420 425 430
Gly Glu Leu His Ala Ile Leu Arg Arg Gln Glu Asp Phe Tyr Pro Phe
435 440 445
Leu Lys Asp Asn Arg Glu Lys Ile Glu Lys Ile Leu Thr Phe Arg Ile
450 455 460
Pro Tyr Tyr Val Gly Pro Leu Ala Arg Gly Asn Ser Arg Phe Ala Trp
465 470 475 480
Met Thr Arg Lys Ser Glu Glu Thr Ile Thr Pro Trp Asn Phe Glu Glu
485 490 495
Val Val Asp Lys Gly Ala Ser Ala Gln Ser Phe Ile Glu Arg Met Thr
500 505 510
Asn Phe Asp Lys Asn Leu Pro Asn Glu Lys Val Leu Pro Lys His Ser
515 520 525
Leu Leu Tyr Glu Tyr Phe Thr Val Tyr Asn Glu Leu Thr Lys Val Lys
530 535 540
Tyr Val Thr Glu Gly Met Arg Lys Pro Ala Phe Leu Ser Gly Glu Gln
545 550 555 560
Lys Lys Ala Ile Val Asp Leu Leu Phe Lys Thr Asn Arg Lys Val Thr
565 570 575
Val Lys Gln Leu Lys Glu Asp Tyr Phe Lys Lys Ile Glu Cys Phe Asp
580 585 590
Ser Val Glu Ile Ser Gly Val Glu Asp Arg Phe Asn Ala Ser Leu Gly
595 600 605
Thr Tyr His Asp Leu Leu Lys Ile Ile Lys Asp Lys Asp Phe Leu Asp
610 615 620
Asn Glu Glu Asn Glu Asp Ile Leu Glu Asp Ile Val Leu Thr Leu Thr
625 630 635 640
Leu Phe Glu Asp Arg Glu Met Ile Glu Glu Arg Leu Lys Thr Tyr Ala
645 650 655
His Leu Phe Asp Asp Lys Val Met Lys Gln Leu Lys Arg Arg Arg Tyr
660 665 670
Thr Gly Trp Gly Arg Leu Ser Arg Lys Leu Ile Asn Gly Ile Arg Asp
675 680 685
Lys Gln Ser Gly Lys Thr Ile Leu Asp Phe Leu Lys Ser Asp Gly Phe
690 695 700
Ala Asn Arg Asn Phe Met Gln Leu Ile His Asp Asp Ser Leu Thr Phe
705 710 715 720
Lys Glu Asp Ile Gln Lys Ala Gln Val Ser Gly Gln Gly Asp Ser Leu
725 730 735
His Glu His Ile Ala Asn Leu Ala Gly Ser Pro Ala Ile Lys Lys Gly
740 745 750
Ile Leu Gln Thr Val Lys Val Val Asp Glu Leu Val Lys Val Met Gly
755 760 765
Arg His Lys Pro Glu Asn Ile Val Ile Glu Met Ala Arg Glu Asn Gln
770 775 780
Thr Thr Gln Lys Gly Gln Lys Asn Ser Arg Glu Arg Met Lys Arg Ile
785 790 795 800
Glu Glu Gly Ile Lys Glu Leu Gly Ser Gln Ile Leu Lys Glu His Pro
805 810 815
Val Glu Asn Thr Gln Leu Gln Asn Glu Lys Leu Tyr Leu Tyr Tyr Leu
820 825 830
Gln Asn Gly Arg Asp Met Tyr Val Asp Gln Glu Leu Asp Ile Asn Arg
835 840 845
Leu Ser Asp Tyr Asp Val Ala Ala Ile Val Pro Gln Ser Phe Leu Lys
850 855 860
Asp Asp Ser Ile Asp Asn Lys Val Leu Thr Arg Ser Asp Lys Ala Arg
865 870 875 880
Gly Lys Ser Asp Asn Val Pro Ser Glu Glu Val Val Lys Lys Met Lys
885 890 895
Asn Tyr Trp Arg Gln Leu Leu Asn Ala Lys Leu Ile Thr Gln Arg Lys
900 905 910
Phe Asp Asn Leu Thr Lys Ala Glu Arg Gly Gly Leu Ser Glu Leu Asp
915 920 925
Lys Ala Gly Phe Ile Lys Arg Gln Leu Val Glu Thr Arg Gln Ile Thr
930 935 940
Lys His Val Ala Gln Ile Leu Asp Ser Arg Met Asn Thr Lys Tyr Asp
945 950 955 960
Glu Asn Asp Lys Leu Ile Arg Glu Val Lys Val Ile Thr Leu Lys Ser
965 970 975
Lys Leu Val Ser Asp Phe Arg Lys Asp Phe Gln Phe Tyr Lys Val Arg
980 985 990
Glu Ile Asn Asn Tyr His His Ala His Asp Ala Tyr Leu Asn Ala Val
995 1000 1005
Val Gly Thr Ala Leu Ile Lys Lys Tyr Pro Lys Leu Glu Ser Glu
1010 1015 1020
Phe Val Tyr Gly Asp Tyr Lys Val Tyr Asp Val Arg Lys Met Ile
1025 1030 1035
Ala Lys Ser Glu Gln Glu Ile Gly Lys Ala Thr Ala Lys Tyr Phe
1040 1045 1050
Phe Tyr Ser Asn Ile Met Asn Phe Phe Lys Thr Glu Ile Thr Leu
1055 1060 1065
Ala Asn Gly Glu Ile Arg Lys Arg Pro Leu Ile Glu Thr Asn Gly
1070 1075 1080
Glu Thr Gly Glu Ile Val Trp Asp Lys Gly Arg Asp Phe Ala Thr
1085 1090 1095
Val Arg Lys Val Leu Ser Met Pro Gln Val Asn Ile Val Lys Lys
1100 1105 1110
Thr Glu Val Gln Thr Gly Gly Phe Ser Lys Glu Ser Ile Leu Pro
1115 1120 1125
Lys Arg Asn Ser Asp Lys Leu Ile Ala Arg Lys Lys Asp Trp Asp
1130 1135 1140
Pro Lys Lys Tyr Gly Gly Phe Asp Ser Pro Thr Val Ala Tyr Ser
1145 1150 1155
Val Leu Val Val Ala Lys Val Glu Lys Gly Lys Ser Lys Lys Leu
1160 1165 1170
Lys Ser Val Lys Glu Leu Leu Gly Ile Thr Ile Met Glu Arg Ser
1175 1180 1185
Ser Phe Glu Lys Asn Pro Ile Asp Phe Leu Glu Ala Lys Gly Tyr
1190 1195 1200
Lys Glu Val Lys Lys Asp Leu Ile Ile Lys Leu Pro Lys Tyr Ser
1205 1210 1215
Leu Phe Glu Leu Glu Asn Gly Arg Lys Arg Met Leu Ala Ser Ala
1220 1225 1230
Gly Glu Leu Gln Lys Gly Asn Glu Leu Ala Leu Pro Ser Lys Tyr
1235 1240 1245
Val Asn Phe Leu Tyr Leu Ala Ser His Tyr Glu Lys Leu Lys Gly
1250 1255 1260
Ser Pro Glu Asp Asn Glu Gln Lys Gln Leu Phe Val Glu Gln His
1265 1270 1275
Lys His Tyr Leu Asp Glu Ile Ile Glu Gln Ile Ser Glu Phe Ser
1280 1285 1290
Lys Arg Val Ile Leu Ala Asp Ala Asn Leu Asp Lys Val Leu Ser
1295 1300 1305
Ala Tyr Asn Lys His Arg Asp Lys Pro Ile Arg Glu Gln Ala Glu
1310 1315 1320
Asn Ile Ile His Leu Phe Thr Leu Thr Asn Leu Gly Ala Pro Ala
1325 1330 1335
Ala Phe Lys Tyr Phe Asp Thr Thr Ile Asp Arg Lys Arg Tyr Thr
1340 1345 1350
Ser Thr Lys Glu Val Leu Asp Ala Thr Leu Ile His Gln Ser Ile
1355 1360 1365
Thr Gly Leu Tyr Glu Thr Arg Ile Asp Leu Ser Gln Leu Gly Gly
1370 1375 1380
Asp Ser Ala Gly Gly Gly Gly Ser Gly Gly Gly Gly Ser Gly Gly
1385 1390 1395
Gly Gly Ser Gly Pro Lys Lys Lys Arg Lys Val Ala Ala Ala Gly
1400 1405 1410
Ser Asn His Asp Gln Glu Phe Asp Pro Pro Lys Val Tyr Pro Pro
1415 1420 1425
Val Pro Ala Glu Lys Arg Lys Pro Ile Arg Val Leu Ser Leu Phe
1430 1435 1440
Asp Gly Ile Ala Thr Gly Leu Leu Val Leu Lys Asp Leu Gly Ile
1445 1450 1455
Gln Val Asp Arg Tyr Ile Ala Ser Glu Val Cys Glu Asp Ser Ile
1460 1465 1470
Thr Val Gly Met Val Arg His Gln Gly Lys Ile Met Tyr Val Gly
1475 1480 1485
Asp Val Arg Ser Val Thr Gln Lys His Ile Gln Glu Trp Gly Pro
1490 1495 1500
Phe Asp Leu Val Ile Gly Gly Ser Pro Cys Asn Asp Leu Ser Ile
1505 1510 1515
Val Asn Pro Ala Arg Lys Gly Leu Tyr Glu Gly Thr Gly Arg Leu
1520 1525 1530
Phe Phe Glu Phe Tyr Arg Leu Leu His Asp Ala Arg Pro Lys Glu
1535 1540 1545
Gly Asp Asp Arg Pro Phe Phe Trp Leu Phe Glu Asn Val Val Ala
1550 1555 1560
Met Gly Val Ser Asp Lys Arg Asp Ile Ser Arg Phe Leu Glu Ser
1565 1570 1575
Asn Pro Val Met Ile Asp Ala Lys Glu Val Ser Ala Ala His Arg
1580 1585 1590
Ala Arg Tyr Phe Trp Gly Asn Leu Pro Gly Met Asn Arg Pro Leu
1595 1600 1605
Ala Ser Thr Val Asn Asp Lys Leu Glu Leu Gln Glu Cys Leu Glu
1610 1615 1620
His Gly Arg Ile Ala Lys Phe Ser Lys Val Arg Thr Ile Thr Thr
1625 1630 1635
Arg Ser Asn Ser Ile Lys Gln Gly Lys Asp Gln His Phe Pro Val
1640 1645 1650
Phe Met Asn Glu Lys Glu Asp Ile Leu Trp Cys Thr Glu Met Glu
1655 1660 1665
Arg Val Phe Gly Phe Pro Val His Tyr Thr Asp Val Ser Asn Met
1670 1675 1680
Ser Arg Leu Ala Arg Gln Arg Leu Leu Gly Arg Ser Trp Ser Val
1685 1690 1695
Pro Val Ile Arg His Leu Phe Ala Pro Leu Lys Glu Tyr Phe Ala
1700 1705 1710
Cys Val Ser Ser Gly Asn Ser Asn Ala Asn Ser Arg Gly Pro Ser
1715 1720 1725
Phe Ser Ser Gly Leu Val Pro Leu Ser Leu Arg Gly Ser His Met
1730 1735 1740
Asn Pro Leu Glu Met Phe Glu Thr Val Pro Val Trp Arg Arg Gln
1745 1750 1755
Pro Val Arg Val Leu Ser Leu Phe Glu Asp Ile Lys Lys Glu Leu
1760 1765 1770
Thr Ser Leu Gly Phe Leu Glu Ser Gly Ser Asp Pro Gly Gln Leu
1775 1780 1785
Lys His Val Val Asp Val Thr Asp Thr Val Arg Lys Asp Val Glu
1790 1795 1800
Glu Trp Gly Pro Phe Asp Leu Val Tyr Gly Ala Thr Pro Pro Leu
1805 1810 1815
Gly His Thr Cys Asp Arg Pro Pro Ser Trp Tyr Leu Phe Gln Phe
1820 1825 1830
His Arg Leu Leu Gln Tyr Ala Arg Pro Lys Pro Gly Ser Pro Arg
1835 1840 1845
Pro Phe Phe Trp Met Phe Val Asp Asn Leu Val Leu Asn Lys Glu
1850 1855 1860
Asp Leu Asp Val Ala Ser Arg Phe Leu Glu Met Glu Pro Val Thr
1865 1870 1875
Ile Pro Asp Val His Gly Gly Ser Leu Gln Asn Ala Val Arg Val
1880 1885 1890
Trp Ser Asn Ile Pro Ala Ile Arg Ser Arg His Trp Ala Leu Val
1895 1900 1905
Ser Glu Glu Glu Leu Ser Leu Leu Ala Gln Asn Lys Gln Ser Ser
1910 1915 1920
Lys Leu Ala Ala Lys Trp Pro Thr Lys Leu Val Lys Asn Cys Phe
1925 1930 1935
Leu Pro Leu Arg Glu Tyr Phe Lys Tyr Phe Ser Thr Glu Leu Thr
1940 1945 1950
Ser Ser Leu Ser Gly Gly Lys Arg Pro Ala Ala Thr Lys Lys Ala
1955 1960 1965
Gly Gln Ala Lys Lys Lys Lys Gly Ser Tyr Pro Tyr Asp Val Pro
1970 1975 1980
Asp Tyr Ala
1985
<210> 213
<211> 5698
<212> RNA
<213> artificial sequence
<220>
<223> Description of Artificial Sequence: Synthetic polynucleotide
<400> 213
aggaaauaag agagaaaaga agaguaagaa gaaauauaag agccaccaug gcccccaaga 60
agaagcggaa ggugggcauc cacggcgugc ccgccgccga caagaaguac agcaucggcc 120
uggccaucgg caccaacagc gugggcuggg ccgugaucac cgacgaguac aaggugccca 180
gcaagaaguu caaggugcug ggcaacaccg accggcacag caucaagaag aaccugaucg 240
gcgcccugcu guucgacagc ggcgagaccg ccgaggccac ccggcugaag cggaccgccc 300
ggcggcggua cacccggcgg aagaaccgga ucugcuaccu gcaggagauc uucagcaacg 360
agauggccaa gguggacgac agcuucuucc accggcugga ggagagcuuc cugguggagg 420
aggacaagaa gcacgagcgg caccccaucu ucggcaacau cguggacgag guggccuacc 480
acgagaagua ccccaccauc uaccaccugc ggaagaagcu gguggacagc accgacaagg 540
ccgaccugcg gcugaucuac cuggcccugg cccacaugau caaguuccgg ggccacuucc 600
ugaucgaggg cgaccugaac cccgacaaca gcgacgugga caagcuguuc auccagcugg 660
ugcagaccua caaccagcug uucgaggaga accccaucaa cgccagcggc guggacgcca 720
aggccauccu gagcgcccgg cugagcaaga gccggcggcu ggagaaccug aucgcccagc 780
ugcccggcga gaagaagaac ggccuguucg gcaaccugau cgcccugagc cugggccuga 840
cccccaacuu caagagcaac uucgaccugg ccgaggacgc caagcugcag cugagcaagg 900
acaccuacga cgacgaccug gacaaccugc uggcccagau cggcgaccag uacgccgacc 960
uguuccuggc cgccaagaac cugagcgacg ccauccugcu gagcgacauc cugcggguga 1020
acaccgagau caccaaggcc ccccugagcg ccagcaugau caagcgguac gacgagcacc 1080
accaggaccu gacccugcug aaggcccugg ugcggcagca gcugcccgag aaguacaagg 1140
agaucuucuu cgaccagagc aagaacggcu acgccggcua caucgacggc ggcgccagcc 1200
aggaggaguu cuacaaguuc aucaagccca uccuggagaa gauggacggc accgaggagc 1260
ugcuggugaa gcugaaccgg gaggaccugc ugcggaagca gcggaccuuc gacaacggca 1320
gcauccccca ccagauccac cugggcgagc ugcacgccau ccugcggcgg caggaggacu 1380
ucuaccccuu ccugaaggac aaccgggaga agaucgagaa gauccugacc uuccggaucc 1440
ccuacuacgu gggcccccug gcccggggca acagccgguu cgccuggaug acccggaaau 1500
ccgaggagac caucaccccc uggaacuucg aggagguggu ggacaagggc gccagcgccc 1560
agagcuucau cgagcggaug accaacuucg acaagaaccu gcccaacgag aaggugcugc 1620
ccaagcacag ccugcuguac gaguacuuca ccguguacaa cgagcugacc aaggugaagu 1680
acgugaccga gggcaugcgg aagcccgccu uccugagcgg cgagcagaag aaggccaucg 1740
uggaccugcu guucaagacc aaccggaagg ugaccgugaa gcagcugaag gaggacuacu 1800
ucaagaagau cgagugcuuc gacagcgugg agaucagcgg cguggaggac cgguucaacg 1860
ccagccuggg caccuaccac gaccugcuga agaucaucaa ggacaaggac uuccuggaca 1920
acgaggagaa cgaggacauc cuggaggaca ucgugcugac ccugacccug uucgaggacc 1980
gggagaugau cgaggagcgg cugaaaaccu acgcccaccu guucgacgac aaggugauga 2040
agcagcugaa gcggcggcgg uacaccggcu ggggccggcu gagccggaag cugaucaacg 2100
gcauccggga caagcagagc ggcaagacca uccuggacuu ccugaaaucc gacggcuucg 2160
ccaaccggaa cuucaugcag cugauccacg acgacagccu gaccuucaag gaggacaucc 2220
agaaggccca ggugagcggc cagggcgaca gccugcacga gcacaucgcc aaccuggccg 2280
gcagccccgc caucaagaag ggcauccugc agaccgugaa ggugguggac gagcugguga 2340
aggugauggg ccggcacaag cccgagaaca ucgugaucga gauggcccgg gagaaccaga 2400
ccacccagaa gggccagaag aacagccggg agcggaugaa gcggaucgag gagggcauca 2460
aggagcuggg cagccagauc cugaaggagc accccgugga gaacacccag cugcagaacg 2520
agaagcugua ccuguacuac cugcagaacg gccgggacau guacguggac caggagcugg 2580
acaucaaccg gcugagcgac uacgacgugg ccgccaucgu gccccagagc uuccugaagg 2640
acgacagcau cgacaacaag gugcugaccc ggagcgacaa ggcccggggc aagagcgaca 2700
acgugcccag cgaggaggug gugaagaaga ugaagaacua cuggcggcag cugcugaacg 2760
ccaagcugau cacccagcgg aaguucgaca accugaccaa ggccgagcgg ggcggccuga 2820
gcgagcugga caaggccggc uucaucaagc ggcagcuggu ggagacccgg cagaucacca 2880
agcacguggc ccagauccug gacagccgga ugaacaccaa guacgacgag aacgacaagc 2940
ugauccggga ggugaaggug aucacccuga aauccaagcu ggugagcgac uuccggaagg 3000
acuuccaguu cuacaaggug cgggagauca acaacuacca ccacgcccac gacgccuacc 3060
ugaacgccgu ggugggcacc gcccugauca agaaguaccc caagcuggag agcgaguucg 3120
uguacggcga cuacaaggug uacgacgugc ggaagaugau cgccaagagc gagcaggaga 3180
ucggcaaggc caccgccaag uacuucuucu acagcaacau caugaacuuc uucaagaccg 3240
agaucacccu ggccaacggc gagauccgga agcggccccu gaucgagacc aacggcgaga 3300
ccggcgagau cgugugggac aagggccggg acuucgccac cgugcggaag gugcugagca 3360
ugccccaggu gaacaucgug aagaaaaccg aggugcagac cggcggcuuc agcaaggaga 3420
gcauccugcc caagcggaac agcgacaagc ugaucgcccg gaagaaggac ugggacccca 3480
agaaguacgg cggcuucgac agccccaccg uggccuacag cgugcuggug guggccaagg 3540
uggagaaggg caagagcaag aagcugaaau ccgugaagga gcugcugggc aucaccauca 3600
uggagcggag cagcuucgag aagaacccca ucgacuuccu ggaggccaag ggcuacaagg 3660
aggugaagaa ggaccugauc aucaagcugc ccaaguacag ccuguucgag cuggagaacg 3720
gccggaagcg gaugcuggcc agcgccggcg agcugcagaa gggcaacgag cuggcccugc 3780
ccagcaagua cgugaacuuc cuguaccugg ccagccacua cgagaagcug aagggcagcc 3840
ccgaggacaa cgagcagaag cagcuguucg uggagcagca caagcacuac cuggacgaga 3900
ucaucgagca gaucagcgag uucagcaagc gggugauccu ggccgacgcc aaccuggaca 3960
aggugcugag cgccuacaac aagcaccggg acaagcccau ccgggagcag gccgagaaca 4020
ucauccaccu guucacccug accaaccugg gcgcccccgc cgccuucaag uacuucgaca 4080
ccaccaucga ccggaagcgg uacaccagca ccaaggaggu gcuggacgcc acccugaucc 4140
accagagcau caccggccug uacgagaccc ggaucgaccu gagccagcug ggcggcgaca 4200
gcggcggcaa gcggcccgcc gccaccaaga aggccggcca ggccaagaag aagaagucgg 4260
gcgggggugg cucagaggag cccgaggagc ccgccgauag cggacaaucu cuggugcccg 4320
ucuacaucua cagccccgaa uaugugagca ugugugauuc ccucgccaag aucccuaaga 4380
gagccagcau ggugcauucu cugaucgagg ccuacgcucu gcauaagcaa augaggaucg 4440
ugaagcccaa ggucgccagc auggaagaga uggccaccuu ucacaccgau gccuaccucc 4500
aacaucucca gaaggugucc caagagggcg acgacgacca ccccgacucc auugaguacg 4560
gacugggcua ugauugcccc gccaccgagg gcaucuuuga cuaugccgcc gcuaucggcg 4620
gagcuaccau cacagccgcc cagugucuga uugauggcau gugcaagguc gccaucaacu 4680
gguccggagg cuggcaucau gccaagaagg augaggccuc cggcuucugu uaucugaaug 4740
acgccgugcu gggcauucug agacugagga ggaaauucga gaggauucug uacguggauc 4800
uggaucugca ucacggagau ggagucgaag augccuucag cuucaccagc aaggugauga 4860
cagucucucu gcacaaguuc ucccccggcu ucuuucccgg aaccggcgac guguccgacg 4920
ugggacuggg caagggaagg uacuacagcg ugaacgugcc cauucaagac ggcauccaag 4980
acgagaagua cuaccagauc ugcgaguccg ugcucaagga ggucuaccaa gccuucaauc 5040
cuaaggcugu cgugcuccaa cugggagcug auaccauugc uggcgauccc augugcagcu 5100
ucaauaugac acccgucgga aucggcaagu gccucaagua cauccuccag uggcagcucg 5160
ccacccucau ucucggagga ggcggauaca aucuggcuaa uaccgccaga ugcuggaccu 5220
aucugaccgg cgugauucug ggcaaaacac ugagcagcga aauccccgac cacgaguuuu 5280
ucaccgcuua cggccccgac uacgugcugg agaucacccc cagcugcaga cccgauagaa 5340
acgaacccca uagaauccag caaauucuga acuauaucaa gggcaaccuc aagcacgucg 5400
ugggaggugg cggaucggga aagcggcccg ccgccaccaa gaaggccggu caggccaaga 5460
agaagaaggg cagcuacccc uacgacgugc ccgacuacgc cugagcggcc gcuuaauuaa 5520
gcugccuucu gcggggcuug ccuucuggcc augcccuucu ucucucccuu gcaccuguac 5580
cucuuggucu uugaauaaag ccugaguagg aagucuagaa aaaaaaaaaa aaaaaaaaaa 5640
aaaaaaaaaa aaaaaaaaaa aaaaaaaaaa aaaaaaaaaa aaaaaaaaaa aaaaaaaa 5698
<210> 214
<211> 1818
<212> PRT
<213> artificial sequence
<220>
<223> Description of Artificial Sequence: Synthetic polypeptide
<400> 214
Met Ala Pro Lys Lys Lys Arg Lys Val Gly Ile His Gly Val Pro Ala
1 5 10 15
Ala Asp Lys Lys Tyr Ser Ile Gly Leu Ala Ile Gly Thr Asn Ser Val
20 25 30
Gly Trp Ala Val Ile Thr Asp Glu Tyr Lys Val Pro Ser Lys Lys Phe
35 40 45
Lys Val Leu Gly Asn Thr Asp Arg His Ser Ile Lys Lys Asn Leu Ile
50 55 60
Gly Ala Leu Leu Phe Asp Ser Gly Glu Thr Ala Glu Ala Thr Arg Leu
65 70 75 80
Lys Arg Thr Ala Arg Arg Arg Tyr Thr Arg Arg Lys Asn Arg Ile Cys
85 90 95
Tyr Leu Gln Glu Ile Phe Ser Asn Glu Met Ala Lys Val Asp Asp Ser
100 105 110
Phe Phe His Arg Leu Glu Glu Ser Phe Leu Val Glu Glu Asp Lys Lys
115 120 125
His Glu Arg His Pro Ile Phe Gly Asn Ile Val Asp Glu Val Ala Tyr
130 135 140
His Glu Lys Tyr Pro Thr Ile Tyr His Leu Arg Lys Lys Leu Val Asp
145 150 155 160
Ser Thr Asp Lys Ala Asp Leu Arg Leu Ile Tyr Leu Ala Leu Ala His
165 170 175
Met Ile Lys Phe Arg Gly His Phe Leu Ile Glu Gly Asp Leu Asn Pro
180 185 190
Asp Asn Ser Asp Val Asp Lys Leu Phe Ile Gln Leu Val Gln Thr Tyr
195 200 205
Asn Gln Leu Phe Glu Glu Asn Pro Ile Asn Ala Ser Gly Val Asp Ala
210 215 220
Lys Ala Ile Leu Ser Ala Arg Leu Ser Lys Ser Arg Arg Leu Glu Asn
225 230 235 240
Leu Ile Ala Gln Leu Pro Gly Glu Lys Lys Asn Gly Leu Phe Gly Asn
245 250 255
Leu Ile Ala Leu Ser Leu Gly Leu Thr Pro Asn Phe Lys Ser Asn Phe
260 265 270
Asp Leu Ala Glu Asp Ala Lys Leu Gln Leu Ser Lys Asp Thr Tyr Asp
275 280 285
Asp Asp Leu Asp Asn Leu Leu Ala Gln Ile Gly Asp Gln Tyr Ala Asp
290 295 300
Leu Phe Leu Ala Ala Lys Asn Leu Ser Asp Ala Ile Leu Leu Ser Asp
305 310 315 320
Ile Leu Arg Val Asn Thr Glu Ile Thr Lys Ala Pro Leu Ser Ala Ser
325 330 335
Met Ile Lys Arg Tyr Asp Glu His His Gln Asp Leu Thr Leu Leu Lys
340 345 350
Ala Leu Val Arg Gln Gln Leu Pro Glu Lys Tyr Lys Glu Ile Phe Phe
355 360 365
Asp Gln Ser Lys Asn Gly Tyr Ala Gly Tyr Ile Asp Gly Gly Ala Ser
370 375 380
Gln Glu Glu Phe Tyr Lys Phe Ile Lys Pro Ile Leu Glu Lys Met Asp
385 390 395 400
Gly Thr Glu Glu Leu Leu Val Lys Leu Asn Arg Glu Asp Leu Leu Arg
405 410 415
Lys Gln Arg Thr Phe Asp Asn Gly Ser Ile Pro His Gln Ile His Leu
420 425 430
Gly Glu Leu His Ala Ile Leu Arg Arg Gln Glu Asp Phe Tyr Pro Phe
435 440 445
Leu Lys Asp Asn Arg Glu Lys Ile Glu Lys Ile Leu Thr Phe Arg Ile
450 455 460
Pro Tyr Tyr Val Gly Pro Leu Ala Arg Gly Asn Ser Arg Phe Ala Trp
465 470 475 480
Met Thr Arg Lys Ser Glu Glu Thr Ile Thr Pro Trp Asn Phe Glu Glu
485 490 495
Val Val Asp Lys Gly Ala Ser Ala Gln Ser Phe Ile Glu Arg Met Thr
500 505 510
Asn Phe Asp Lys Asn Leu Pro Asn Glu Lys Val Leu Pro Lys His Ser
515 520 525
Leu Leu Tyr Glu Tyr Phe Thr Val Tyr Asn Glu Leu Thr Lys Val Lys
530 535 540
Tyr Val Thr Glu Gly Met Arg Lys Pro Ala Phe Leu Ser Gly Glu Gln
545 550 555 560
Lys Lys Ala Ile Val Asp Leu Leu Phe Lys Thr Asn Arg Lys Val Thr
565 570 575
Val Lys Gln Leu Lys Glu Asp Tyr Phe Lys Lys Ile Glu Cys Phe Asp
580 585 590
Ser Val Glu Ile Ser Gly Val Glu Asp Arg Phe Asn Ala Ser Leu Gly
595 600 605
Thr Tyr His Asp Leu Leu Lys Ile Ile Lys Asp Lys Asp Phe Leu Asp
610 615 620
Asn Glu Glu Asn Glu Asp Ile Leu Glu Asp Ile Val Leu Thr Leu Thr
625 630 635 640
Leu Phe Glu Asp Arg Glu Met Ile Glu Glu Arg Leu Lys Thr Tyr Ala
645 650 655
His Leu Phe Asp Asp Lys Val Met Lys Gln Leu Lys Arg Arg Arg Tyr
660 665 670
Thr Gly Trp Gly Arg Leu Ser Arg Lys Leu Ile Asn Gly Ile Arg Asp
675 680 685
Lys Gln Ser Gly Lys Thr Ile Leu Asp Phe Leu Lys Ser Asp Gly Phe
690 695 700
Ala Asn Arg Asn Phe Met Gln Leu Ile His Asp Asp Ser Leu Thr Phe
705 710 715 720
Lys Glu Asp Ile Gln Lys Ala Gln Val Ser Gly Gln Gly Asp Ser Leu
725 730 735
His Glu His Ile Ala Asn Leu Ala Gly Ser Pro Ala Ile Lys Lys Gly
740 745 750
Ile Leu Gln Thr Val Lys Val Val Asp Glu Leu Val Lys Val Met Gly
755 760 765
Arg His Lys Pro Glu Asn Ile Val Ile Glu Met Ala Arg Glu Asn Gln
770 775 780
Thr Thr Gln Lys Gly Gln Lys Asn Ser Arg Glu Arg Met Lys Arg Ile
785 790 795 800
Glu Glu Gly Ile Lys Glu Leu Gly Ser Gln Ile Leu Lys Glu His Pro
805 810 815
Val Glu Asn Thr Gln Leu Gln Asn Glu Lys Leu Tyr Leu Tyr Tyr Leu
820 825 830
Gln Asn Gly Arg Asp Met Tyr Val Asp Gln Glu Leu Asp Ile Asn Arg
835 840 845
Leu Ser Asp Tyr Asp Val Ala Ala Ile Val Pro Gln Ser Phe Leu Lys
850 855 860
Asp Asp Ser Ile Asp Asn Lys Val Leu Thr Arg Ser Asp Lys Ala Arg
865 870 875 880
Gly Lys Ser Asp Asn Val Pro Ser Glu Glu Val Val Lys Lys Met Lys
885 890 895
Asn Tyr Trp Arg Gln Leu Leu Asn Ala Lys Leu Ile Thr Gln Arg Lys
900 905 910
Phe Asp Asn Leu Thr Lys Ala Glu Arg Gly Gly Leu Ser Glu Leu Asp
915 920 925
Lys Ala Gly Phe Ile Lys Arg Gln Leu Val Glu Thr Arg Gln Ile Thr
930 935 940
Lys His Val Ala Gln Ile Leu Asp Ser Arg Met Asn Thr Lys Tyr Asp
945 950 955 960
Glu Asn Asp Lys Leu Ile Arg Glu Val Lys Val Ile Thr Leu Lys Ser
965 970 975
Lys Leu Val Ser Asp Phe Arg Lys Asp Phe Gln Phe Tyr Lys Val Arg
980 985 990
Glu Ile Asn Asn Tyr His His Ala His Asp Ala Tyr Leu Asn Ala Val
995 1000 1005
Val Gly Thr Ala Leu Ile Lys Lys Tyr Pro Lys Leu Glu Ser Glu
1010 1015 1020
Phe Val Tyr Gly Asp Tyr Lys Val Tyr Asp Val Arg Lys Met Ile
1025 1030 1035
Ala Lys Ser Glu Gln Glu Ile Gly Lys Ala Thr Ala Lys Tyr Phe
1040 1045 1050
Phe Tyr Ser Asn Ile Met Asn Phe Phe Lys Thr Glu Ile Thr Leu
1055 1060 1065
Ala Asn Gly Glu Ile Arg Lys Arg Pro Leu Ile Glu Thr Asn Gly
1070 1075 1080
Glu Thr Gly Glu Ile Val Trp Asp Lys Gly Arg Asp Phe Ala Thr
1085 1090 1095
Val Arg Lys Val Leu Ser Met Pro Gln Val Asn Ile Val Lys Lys
1100 1105 1110
Thr Glu Val Gln Thr Gly Gly Phe Ser Lys Glu Ser Ile Leu Pro
1115 1120 1125
Lys Arg Asn Ser Asp Lys Leu Ile Ala Arg Lys Lys Asp Trp Asp
1130 1135 1140
Pro Lys Lys Tyr Gly Gly Phe Asp Ser Pro Thr Val Ala Tyr Ser
1145 1150 1155
Val Leu Val Val Ala Lys Val Glu Lys Gly Lys Ser Lys Lys Leu
1160 1165 1170
Lys Ser Val Lys Glu Leu Leu Gly Ile Thr Ile Met Glu Arg Ser
1175 1180 1185
Ser Phe Glu Lys Asn Pro Ile Asp Phe Leu Glu Ala Lys Gly Tyr
1190 1195 1200
Lys Glu Val Lys Lys Asp Leu Ile Ile Lys Leu Pro Lys Tyr Ser
1205 1210 1215
Leu Phe Glu Leu Glu Asn Gly Arg Lys Arg Met Leu Ala Ser Ala
1220 1225 1230
Gly Glu Leu Gln Lys Gly Asn Glu Leu Ala Leu Pro Ser Lys Tyr
1235 1240 1245
Val Asn Phe Leu Tyr Leu Ala Ser His Tyr Glu Lys Leu Lys Gly
1250 1255 1260
Ser Pro Glu Asp Asn Glu Gln Lys Gln Leu Phe Val Glu Gln His
1265 1270 1275
Lys His Tyr Leu Asp Glu Ile Ile Glu Gln Ile Ser Glu Phe Ser
1280 1285 1290
Lys Arg Val Ile Leu Ala Asp Ala Asn Leu Asp Lys Val Leu Ser
1295 1300 1305
Ala Tyr Asn Lys His Arg Asp Lys Pro Ile Arg Glu Gln Ala Glu
1310 1315 1320
Asn Ile Ile His Leu Phe Thr Leu Thr Asn Leu Gly Ala Pro Ala
1325 1330 1335
Ala Phe Lys Tyr Phe Asp Thr Thr Ile Asp Arg Lys Arg Tyr Thr
1340 1345 1350
Ser Thr Lys Glu Val Leu Asp Ala Thr Leu Ile His Gln Ser Ile
1355 1360 1365
Thr Gly Leu Tyr Glu Thr Arg Ile Asp Leu Ser Gln Leu Gly Gly
1370 1375 1380
Asp Ser Gly Gly Lys Arg Pro Ala Ala Thr Lys Lys Ala Gly Gln
1385 1390 1395
Ala Lys Lys Lys Lys Ser Gly Gly Gly Gly Ser Glu Glu Pro Glu
1400 1405 1410
Glu Pro Ala Asp Ser Gly Gln Ser Leu Val Pro Val Tyr Ile Tyr
1415 1420 1425
Ser Pro Glu Tyr Val Ser Met Cys Asp Ser Leu Ala Lys Ile Pro
1430 1435 1440
Lys Arg Ala Ser Met Val His Ser Leu Ile Glu Ala Tyr Ala Leu
1445 1450 1455
His Lys Gln Met Arg Ile Val Lys Pro Lys Val Ala Ser Met Glu
1460 1465 1470
Glu Met Ala Thr Phe His Thr Asp Ala Tyr Leu Gln His Leu Gln
1475 1480 1485
Lys Val Ser Gln Glu Gly Asp Asp Asp His Pro Asp Ser Ile Glu
1490 1495 1500
Tyr Gly Leu Gly Tyr Asp Cys Pro Ala Thr Glu Gly Ile Phe Asp
1505 1510 1515
Tyr Ala Ala Ala Ile Gly Gly Ala Thr Ile Thr Ala Ala Gln Cys
1520 1525 1530
Leu Ile Asp Gly Met Cys Lys Val Ala Ile Asn Trp Ser Gly Gly
1535 1540 1545
Trp His His Ala Lys Lys Asp Glu Ala Ser Gly Phe Cys Tyr Leu
1550 1555 1560
Asn Asp Ala Val Leu Gly Ile Leu Arg Leu Arg Arg Lys Phe Glu
1565 1570 1575
Arg Ile Leu Tyr Val Asp Leu Asp Leu His His Gly Asp Gly Val
1580 1585 1590
Glu Asp Ala Phe Ser Phe Thr Ser Lys Val Met Thr Val Ser Leu
1595 1600 1605
His Lys Phe Ser Pro Gly Phe Phe Pro Gly Thr Gly Asp Val Ser
1610 1615 1620
Asp Val Gly Leu Gly Lys Gly Arg Tyr Tyr Ser Val Asn Val Pro
1625 1630 1635
Ile Gln Asp Gly Ile Gln Asp Glu Lys Tyr Tyr Gln Ile Cys Glu
1640 1645 1650
Ser Val Leu Lys Glu Val Tyr Gln Ala Phe Asn Pro Lys Ala Val
1655 1660 1665
Val Leu Gln Leu Gly Ala Asp Thr Ile Ala Gly Asp Pro Met Cys
1670 1675 1680
Ser Phe Asn Met Thr Pro Val Gly Ile Gly Lys Cys Leu Lys Tyr
1685 1690 1695
Ile Leu Gln Trp Gln Leu Ala Thr Leu Ile Leu Gly Gly Gly Gly
1700 1705 1710
Tyr Asn Leu Ala Asn Thr Ala Arg Cys Trp Thr Tyr Leu Thr Gly
1715 1720 1725
Val Ile Leu Gly Lys Thr Leu Ser Ser Glu Ile Pro Asp His Glu
1730 1735 1740
Phe Phe Thr Ala Tyr Gly Pro Asp Tyr Val Leu Glu Ile Thr Pro
1745 1750 1755
Ser Cys Arg Pro Asp Arg Asn Glu Pro His Arg Ile Gln Gln Ile
1760 1765 1770
Leu Asn Tyr Ile Lys Gly Asn Leu Lys His Val Val Gly Gly Gly
1775 1780 1785
Gly Ser Gly Lys Arg Pro Ala Ala Thr Lys Lys Ala Gly Gln Ala
1790 1795 1800
Lys Lys Lys Lys Gly Ser Tyr Pro Tyr Asp Val Pro Asp Tyr Ala
1805 1810 1815
<210> 215
<211> 7987
<212> RNA
<213> artificial sequence
<220>
<223> Description of Artificial Sequence: Synthetic polynucleotide
<400> 215
aggaaauaag agagaaaaga agaguaagaa gaaauauaag agccaccaug gcccccaaga 60
agaagcggaa ggugggcggc agcggcggca gcggccagac cggcaagaag agcgagaagg 120
gccccgugug cuggcggaag cgggugaaga gcgaguacau gcggcugcgg cagcugaagc 180
gguuccggcg ggccgacgag gugaagagca uguucagcag caaccggcag aagauccugg 240
agcggaccga gauccugaac caggagugga agcagcggcg aauccagccc gugcacaucc 300
ugaccagcgu gagcagccug cggggcaccc gggagugcag cgugaccagc gaccuggacu 360
uccccaccca ggugaucccc cuaaagaccc ugaacgccgu ggccagcgug cccaucaugu 420
acagcuggag cccccugcag cagaacuuca ugguggagga cgagaccgug cugcacaaca 480
uccccuacau gggcgacgag gugcuggacc aggacggcac cuucaucgag gagcugauca 540
agaacuacga cggcaaggug cacggcgacc gggagugcgg cuucaucaac gacgagaucu 600
ucguggagcu ggugaacgcc cugggccagu acaacgacga cgacgacgac gacgacggcg 660
acgaccccga ggagcgggag gagaagcaga aggaccugga ggaccaccgg gacgacaagg 720
agagccggcc cccccggaag uuccccagcg acaagaucuu cgaggccauc agcagcaugu 780
uccccgacaa gggcaccgcc gaggagcuga aggagaagua caaggagcug accgagcagc 840
agcugcccgg cgcccugccc cccgagugca cccccaacau cgacggcccc aacgccaaga 900
gcgugcagcg ggagcagagc cugcacagcu uccacacccu guucugccgg cggugcuuca 960
aguacgacug cuuccugcac cccuuccacg ccacccccaa caccuacaag cggaagaaca 1020
ccgagaccgc ccuggacaac aagcccugcg gcccccagug cuaccagcac cuggagggcg 1080
ccaaggaguu cgccgccgcc cugaccgccg agcggaucaa gacccccccc aagcggcccg 1140
gcggccggcg gcggggccgg cugcccaaca acagcagccg gcccagcacc cccaccauca 1200
acgugcugga gagcaaggac accgacagcg accgggaggc cggcaccgag accggcggcg 1260
agaacaacga caaggaggag gaggagaaga aggacgagac cagcagcagc agcgaggcca 1320
acagccggug ccagaccccc aucaagauga agcccaacau cgagcccccc gagaacgugg 1380
aguggagcgg cgccgaggcc agcauguucc gggugcugau cggcaccuac uacgacaacu 1440
ucugcgccau cgcccggcug aucggcacca agaccugccg gcagguguac gaguuccggg 1500
ugaaggagag cagcaucauc gcccccgccc ccgccgagga cguggacacc cccccccgga 1560
agaagaagcg gaagcaccgg cugugggccg cccacugccg gaagauccag cugaagaagg 1620
acggcagcag caaccacgug uacaacuacc agcccugcga ccacccccgg cagcccugcg 1680
acagcagcug ccccugcgug aucgcccaga acuucugcga gaaguucugc cagugcagca 1740
gcgagugcca gaaccgguuc cccggcugcc ggugcaaggc ccagugcaac accaagcagu 1800
gccccugcua ccuggccgug cgggagugcg accccgaccu gugccugacc ugcggcgccg 1860
ccgaccacug ggacagcaag aacgugagcu gcaagaacug cagcauccag cggggcagca 1920
agaagcaccu gcugcuggcc cccagcgacg uggccggcug gggcaucuuc aucaaggacc 1980
ccgugcagaa gaacgaguuc aucagcgagu acugcggcga gaucaucagc caggacgagg 2040
ccgaccggcg gggcaaggug uacgacaagu acaugugcag cuuccuguuc aaccugaaca 2100
acgacuucgu gguggacgcc acccggaagg gcaacaagau ccgguucgcc aaccacagcg 2160
ugaaccccaa cugcuacgcc aaggugauga uggugaacgg cgaccaccgg aucggcaucu 2220
ucgccaagcg ggccauccag accggcgagg agcuguucuu cgacuaccgg uacagccagg 2280
ccgacgcccu gaaguacgug ggcaucgagc gggagaugga gauccccagc accggcggca 2340
gcggcggcag cggcggcagc ggcggcagcg gcggcagcgg ccgacccgac aagaaguaca 2400
gcaucggccu ggccaucggc accaacagcg ugggcugggc cgugaucacc gacgaguaca 2460
aggugcccag caagaaguuc aaggugcugg gcaacaccga ccggcacagc aucaagaaga 2520
accugaucgg cgcccugcug uucgacagcg gcgagaccgc cgaggccacc cggcugaagc 2580
ggaccgcccg gcggcgguac acccggcgga agaaccggau cugcuaccug caggagaucu 2640
ucagcaacga gauggccaag guggacgaca gcuucuucca ccggcuggag gagagcuucc 2700
ugguggagga ggacaagaag cacgagcggc accccaucuu cggcaacauc guggacgagg 2760
uggccuacca cgagaaguac cccaccaucu accaccugcg gaagaagcug guggacagca 2820
ccgacaaggc cgaccugcgg cugaucuacc uggcccuggc ccacaugauc aaguuccggg 2880
gccacuuccu gaucgagggc gaccugaacc ccgacaacag cgacguggac aagcuguuca 2940
uccagcuggu gcagaccuac aaccagcugu ucgaggagaa ccccaucaac gccagcggcg 3000
uggacgccaa ggccauccug agcgcccggc ugagcaagag ccggcggcug gagaaccuga 3060
ucgcccagcu gcccggcgag aagaagaacg gccuguucgg caaccugauc gcccugagcc 3120
ugggccugac ccccaacuuc aagagcaacu ucgaccuggc cgaggacgcc aagcugcagc 3180
ugagcaagga caccuacgac gacgaccugg acaaccugcu ggcccagauc ggcgaccagu 3240
acgccgaccu guuccuggcc gccaagaacc ugagcgacgc cauccugcug agcgacaucc 3300
ugcgggugaa caccgagauc accaaggccc cccugagcgc cagcaugauc aagcgguacg 3360
acgagcacca ccaggaccug acccugcuga aggcccuggu gcggcagcag cugcccgaga 3420
aguacaagga gaucuucuuc gaccagagca agaacggcua cgccggcuac aucgacggcg 3480
gcgccagcca ggaggaguuc uacaaguuca ucaagcccau ccuggagaag auggacggca 3540
ccgaggagcu gcuggugaag cugaaccggg aggaccugcu gcggaagcag cggaccuucg 3600
acaacggcag caucccccac cagauccacc ugggcgagcu gcacgccauc cugcggcggc 3660
aggaggacuu cuaccccuuc cugaaggaca accgggagaa gaucgagaag auccugaccu 3720
uccggauccc cuacuacgug ggcccccugg cccggggcaa cagccgguuc gccuggauga 3780
cccggaaauc cgaggagacc aucacccccu ggaacuucga ggagguggug gacaagggcg 3840
ccagcgccca gagcuucauc gagcggauga ccaacuucga caagaaccug cccaacgaga 3900
aggugcugcc caagcacagc cugcuguacg aguacuucac cguguacaac gagcugacca 3960
aggugaagua cgugaccgag ggcaugcgga agcccgccuu ccugagcggc gagcagaaga 4020
aggccaucgu ggaccugcug uucaagacca accggaaggu gaccgugaag cagcugaagg 4080
aggacuacuu caagaagauc gagugcuucg acagcgugga gaucagcggc guggaggacc 4140
gguucaacgc cagccugggc accuaccacg accugcugaa gaucaucaag gacaaggacu 4200
uccuggacaa cgaggagaac gaggacaucc uggaggacau cgugcugacc cugacccugu 4260
ucgaggaccg ggagaugauc gaggagcggc ugaaaaccua cgcccaccug uucgacgaca 4320
aggugaugaa gcagcugaag cggcggcggu acaccggcug gggccggcug agccggaagc 4380
ugaucaacgg cauccgggac aagcagagcg gcaagaccau ccuggacuuc cugaaauccg 4440
acggcuucgc caaccggaac uucaugcagc ugauccacga cgacagccug accuucaagg 4500
aggacaucca gaaggcccag gugagcggcc agggcgacag ccugcacgag cacaucgcca 4560
accuggccgg cagccccgcc aucaagaagg gcauccugca gaccgugaag gugguggacg 4620
agcuggugaa ggugaugggc cggcacaagc ccgagaacau cgugaucgag auggcccggg 4680
agaaccagac cacccagaag ggccagaaga acagccggga gcggaugaag cggaucgagg 4740
agggcaucaa ggagcugggc agccagaucc ugaaggagca ccccguggag aacacccagc 4800
ugcagaacga gaagcuguac cuguacuacc ugcagaacgg ccgggacaug uacguggacc 4860
aggagcugga caucaaccgg cugagcgacu acgacguggc cgccaucgug ccccagagcu 4920
uccugaagga cgacagcauc gacaacaagg ugcugacccg gagcgacaag gcccggggca 4980
agagcgacaa cgugcccagc gaggaggugg ugaagaagau gaagaacuac uggcggcagc 5040
ugcugaacgc caagcugauc acccagcgga aguucgacaa ccugaccaag gccgagcggg 5100
gcggccugag cgagcuggac aaggccggcu ucaucaagcg gcagcuggug gagacccggc 5160
agaucaccaa gcacguggcc cagauccugg acagccggau gaacaccaag uacgacgaga 5220
acgacaagcu gauccgggag gugaagguga ucacccugaa auccaagcug gugagcgacu 5280
uccggaagga cuuccaguuc uacaaggugc gggagaucaa caacuaccac cacgcccacg 5340
acgccuaccu gaacgccgug gugggcaccg cccugaucaa gaaguacccc aagcuggaga 5400
gcgaguucgu guacggcgac uacaaggugu acgacgugcg gaagaugauc gccaagagcg 5460
agcaggagau cggcaaggcc accgccaagu acuucuucua cagcaacauc augaacuucu 5520
ucaagaccga gaucacccug gccaacggcg agauccggaa gcggccccug aucgagacca 5580
acggcgagac cggcgagauc gugugggaca agggccggga cuucgccacc gugcggaagg 5640
ugcugagcau gccccaggug aacaucguga agaaaaccga ggugcagacc ggcggcuuca 5700
gcaaggagag cauccugccc aagcggaaca gcgacaagcu gaucgcccgg aagaaggacu 5760
gggaccccaa gaaguacggc ggcuucgaca gccccaccgu ggccuacagc gugcuggugg 5820
uggccaaggu ggagaagggc aagagcaaga agcugaaauc cgugaaggag cugcugggca 5880
ucaccaucau ggagcggagc agcuucgaga agaaccccau cgacuuccug gaggccaagg 5940
gcuacaagga ggugaagaag gaccugauca ucaagcugcc caaguacagc cuguucgagc 6000
uggagaacgg ccggaagcgg augcuggcca gcgccggcga gcugcagaag ggcaacgagc 6060
uggcccugcc cagcaaguac gugaacuucc uguaccuggc cagccacuac gagaagcuga 6120
agggcagccc cgaggacaac gagcagaagc agcuguucgu ggagcagcac aagcacuacc 6180
uggacgagau caucgagcag aucagcgagu ucagcaagcg ggugauccug gccgacgcca 6240
accuggacaa ggugcugagc gccuacaaca agcaccggga caagcccauc cgggagcagg 6300
ccgagaacau cauccaccug uucacccuga ccaaccuggg cgcccccgcc gccuucaagu 6360
acuucgacac caccaucgac cggaagcggu acaccagcac caaggaggug cuggacgcca 6420
cccugaucca ccagagcauc accggccugu acgagacccg gaucgaccug agccagcugg 6480
gcggcgacag cggcggcaag cggcccgccg ccaccaagaa ggccggccag gccaagaaga 6540
agaagucggg cggggguggc ucagaggagc ccgaggagcc cgccgauagc ggacaaucuc 6600
uggugcccgu cuacaucuac agccccgaau augugagcau gugugauucc cucgccaaga 6660
ucccuaagag agccagcaug gugcauucuc ugaucgaggc cuacgcucug cauaagcaaa 6720
ugaggaucgu gaagcccaag gucgccagca uggaagagau ggccaccuuu cacaccgaug 6780
ccuaccucca acaucuccag aagguguccc aagagggcga cgacgaccac cccgacucca 6840
uugaguacgg acugggcuau gauugccccg ccaccgaggg caucuuugac uaugccgccg 6900
cuaucggcgg agcuaccauc acagccgccc agugucugau ugauggcaug ugcaaggucg 6960
ccaucaacug guccggaggc uggcaucaug ccaagaagga ugaggccucc ggcuucuguu 7020
aucugaauga cgccgugcug ggcauucuga gacugaggag gaaauucgag aggauucugu 7080
acguggaucu ggaucugcau cacggagaug gagucgaaga ugccuucagc uucaccagca 7140
aggugaugac agucucucug cacaaguucu cccccggcuu cuuucccgga accggcgacg 7200
uguccgacgu gggacugggc aagggaaggu acuacagcgu gaacgugccc auucaagacg 7260
gcauccaaga cgagaaguac uaccagaucu gcgaguccgu gcucaaggag gucuaccaag 7320
ccuucaaucc uaaggcuguc gugcuccaac ugggagcuga uaccauugcu ggcgauccca 7380
ugugcagcuu caauaugaca cccgucggaa ucggcaagug ccucaaguac auccuccagu 7440
ggcagcucgc cacccucauu cucggaggag gcggauacaa ucuggcuaau accgccagau 7500
gcuggaccua ucugaccggc gugauucugg gcaaaacacu gagcagcgaa auccccgacc 7560
acgaguuuuu caccgcuuac ggccccgacu acgugcugga gaucaccccc agcugcagac 7620
ccgauagaaa cgaaccccau agaauccagc aaauucugaa cuauaucaag ggcaaccuca 7680
agcacgucgu gggagguggc ggaucgggaa agcggcccgc cgccaccaag aaggccgguc 7740
aggccaagaa gaagaagggc agcuaccccu acgacgugcc cgacuacgcc ugagcggccg 7800
cuuaauuaag cugccuucug cggggcuugc cuucuggcca ugcccuucuu cucucccuug 7860
caccuguacc ucuuggucuu ugaauaaagc cugaguagga agucuagaaa aaaaaaaaaa 7920
aaaaaaaaaa aaaaaaaaaa aaaaaaaaaa aaaaaaaaaa aaaaaaaaaa aaaaaaaaaa 7980
aaaaaaa 7987
<210> 216
<211> 2581
<212> PRT
<213> artificial sequence
<220>
<223> Description of Artificial Sequence: Synthetic polypeptide
<400> 216
Met Ala Pro Lys Lys Lys Arg Lys Val Gly Gly Ser Gly Gly Ser Gly
1 5 10 15
Gln Thr Gly Lys Lys Ser Glu Lys Gly Pro Val Cys Trp Arg Lys Arg
20 25 30
Val Lys Ser Glu Tyr Met Arg Leu Arg Gln Leu Lys Arg Phe Arg Arg
35 40 45
Ala Asp Glu Val Lys Ser Met Phe Ser Ser Asn Arg Gln Lys Ile Leu
50 55 60
Glu Arg Thr Glu Ile Leu Asn Gln Glu Trp Lys Gln Arg Arg Ile Gln
65 70 75 80
Pro Val His Ile Leu Thr Ser Val Ser Ser Leu Arg Gly Thr Arg Glu
85 90 95
Cys Ser Val Thr Ser Asp Leu Asp Phe Pro Thr Gln Val Ile Pro Leu
100 105 110
Lys Thr Leu Asn Ala Val Ala Ser Val Pro Ile Met Tyr Ser Trp Ser
115 120 125
Pro Leu Gln Gln Asn Phe Met Val Glu Asp Glu Thr Val Leu His Asn
130 135 140
Ile Pro Tyr Met Gly Asp Glu Val Leu Asp Gln Asp Gly Thr Phe Ile
145 150 155 160
Glu Glu Leu Ile Lys Asn Tyr Asp Gly Lys Val His Gly Asp Arg Glu
165 170 175
Cys Gly Phe Ile Asn Asp Glu Ile Phe Val Glu Leu Val Asn Ala Leu
180 185 190
Gly Gln Tyr Asn Asp Asp Asp Asp Asp Asp Asp Gly Asp Asp Pro Glu
195 200 205
Glu Arg Glu Glu Lys Gln Lys Asp Leu Glu Asp His Arg Asp Asp Lys
210 215 220
Glu Ser Arg Pro Pro Arg Lys Phe Pro Ser Asp Lys Ile Phe Glu Ala
225 230 235 240
Ile Ser Ser Met Phe Pro Asp Lys Gly Thr Ala Glu Glu Leu Lys Glu
245 250 255
Lys Tyr Lys Glu Leu Thr Glu Gln Gln Leu Pro Gly Ala Leu Pro Pro
260 265 270
Glu Cys Thr Pro Asn Ile Asp Gly Pro Asn Ala Lys Ser Val Gln Arg
275 280 285
Glu Gln Ser Leu His Ser Phe His Thr Leu Phe Cys Arg Arg Cys Phe
290 295 300
Lys Tyr Asp Cys Phe Leu His Pro Phe His Ala Thr Pro Asn Thr Tyr
305 310 315 320
Lys Arg Lys Asn Thr Glu Thr Ala Leu Asp Asn Lys Pro Cys Gly Pro
325 330 335
Gln Cys Tyr Gln His Leu Glu Gly Ala Lys Glu Phe Ala Ala Ala Leu
340 345 350
Thr Ala Glu Arg Ile Lys Thr Pro Pro Lys Arg Pro Gly Gly Arg Arg
355 360 365
Arg Gly Arg Leu Pro Asn Asn Ser Ser Arg Pro Ser Thr Pro Thr Ile
370 375 380
Asn Val Leu Glu Ser Lys Asp Thr Asp Ser Asp Arg Glu Ala Gly Thr
385 390 395 400
Glu Thr Gly Gly Glu Asn Asn Asp Lys Glu Glu Glu Glu Lys Lys Asp
405 410 415
Glu Thr Ser Ser Ser Ser Glu Ala Asn Ser Arg Cys Gln Thr Pro Ile
420 425 430
Lys Met Lys Pro Asn Ile Glu Pro Pro Glu Asn Val Glu Trp Ser Gly
435 440 445
Ala Glu Ala Ser Met Phe Arg Val Leu Ile Gly Thr Tyr Tyr Asp Asn
450 455 460
Phe Cys Ala Ile Ala Arg Leu Ile Gly Thr Lys Thr Cys Arg Gln Val
465 470 475 480
Tyr Glu Phe Arg Val Lys Glu Ser Ser Ile Ile Ala Pro Ala Pro Ala
485 490 495
Glu Asp Val Asp Thr Pro Pro Arg Lys Lys Lys Arg Lys His Arg Leu
500 505 510
Trp Ala Ala His Cys Arg Lys Ile Gln Leu Lys Lys Asp Gly Ser Ser
515 520 525
Asn His Val Tyr Asn Tyr Gln Pro Cys Asp His Pro Arg Gln Pro Cys
530 535 540
Asp Ser Ser Cys Pro Cys Val Ile Ala Gln Asn Phe Cys Glu Lys Phe
545 550 555 560
Cys Gln Cys Ser Ser Glu Cys Gln Asn Arg Phe Pro Gly Cys Arg Cys
565 570 575
Lys Ala Gln Cys Asn Thr Lys Gln Cys Pro Cys Tyr Leu Ala Val Arg
580 585 590
Glu Cys Asp Pro Asp Leu Cys Leu Thr Cys Gly Ala Ala Asp His Trp
595 600 605
Asp Ser Lys Asn Val Ser Cys Lys Asn Cys Ser Ile Gln Arg Gly Ser
610 615 620
Lys Lys His Leu Leu Leu Ala Pro Ser Asp Val Ala Gly Trp Gly Ile
625 630 635 640
Phe Ile Lys Asp Pro Val Gln Lys Asn Glu Phe Ile Ser Glu Tyr Cys
645 650 655
Gly Glu Ile Ile Ser Gln Asp Glu Ala Asp Arg Arg Gly Lys Val Tyr
660 665 670
Asp Lys Tyr Met Cys Ser Phe Leu Phe Asn Leu Asn Asn Asp Phe Val
675 680 685
Val Asp Ala Thr Arg Lys Gly Asn Lys Ile Arg Phe Ala Asn His Ser
690 695 700
Val Asn Pro Asn Cys Tyr Ala Lys Val Met Met Val Asn Gly Asp His
705 710 715 720
Arg Ile Gly Ile Phe Ala Lys Arg Ala Ile Gln Thr Gly Glu Glu Leu
725 730 735
Phe Phe Asp Tyr Arg Tyr Ser Gln Ala Asp Ala Leu Lys Tyr Val Gly
740 745 750
Ile Glu Arg Glu Met Glu Ile Pro Ser Thr Gly Gly Ser Gly Gly Ser
755 760 765
Gly Gly Ser Gly Gly Ser Gly Gly Ser Gly Arg Pro Asp Lys Lys Tyr
770 775 780
Ser Ile Gly Leu Ala Ile Gly Thr Asn Ser Val Gly Trp Ala Val Ile
785 790 795 800
Thr Asp Glu Tyr Lys Val Pro Ser Lys Lys Phe Lys Val Leu Gly Asn
805 810 815
Thr Asp Arg His Ser Ile Lys Lys Asn Leu Ile Gly Ala Leu Leu Phe
820 825 830
Asp Ser Gly Glu Thr Ala Glu Ala Thr Arg Leu Lys Arg Thr Ala Arg
835 840 845
Arg Arg Tyr Thr Arg Arg Lys Asn Arg Ile Cys Tyr Leu Gln Glu Ile
850 855 860
Phe Ser Asn Glu Met Ala Lys Val Asp Asp Ser Phe Phe His Arg Leu
865 870 875 880
Glu Glu Ser Phe Leu Val Glu Glu Asp Lys Lys His Glu Arg His Pro
885 890 895
Ile Phe Gly Asn Ile Val Asp Glu Val Ala Tyr His Glu Lys Tyr Pro
900 905 910
Thr Ile Tyr His Leu Arg Lys Lys Leu Val Asp Ser Thr Asp Lys Ala
915 920 925
Asp Leu Arg Leu Ile Tyr Leu Ala Leu Ala His Met Ile Lys Phe Arg
930 935 940
Gly His Phe Leu Ile Glu Gly Asp Leu Asn Pro Asp Asn Ser Asp Val
945 950 955 960
Asp Lys Leu Phe Ile Gln Leu Val Gln Thr Tyr Asn Gln Leu Phe Glu
965 970 975
Glu Asn Pro Ile Asn Ala Ser Gly Val Asp Ala Lys Ala Ile Leu Ser
980 985 990
Ala Arg Leu Ser Lys Ser Arg Arg Leu Glu Asn Leu Ile Ala Gln Leu
995 1000 1005
Pro Gly Glu Lys Lys Asn Gly Leu Phe Gly Asn Leu Ile Ala Leu
1010 1015 1020
Ser Leu Gly Leu Thr Pro Asn Phe Lys Ser Asn Phe Asp Leu Ala
1025 1030 1035
Glu Asp Ala Lys Leu Gln Leu Ser Lys Asp Thr Tyr Asp Asp Asp
1040 1045 1050
Leu Asp Asn Leu Leu Ala Gln Ile Gly Asp Gln Tyr Ala Asp Leu
1055 1060 1065
Phe Leu Ala Ala Lys Asn Leu Ser Asp Ala Ile Leu Leu Ser Asp
1070 1075 1080
Ile Leu Arg Val Asn Thr Glu Ile Thr Lys Ala Pro Leu Ser Ala
1085 1090 1095
Ser Met Ile Lys Arg Tyr Asp Glu His His Gln Asp Leu Thr Leu
1100 1105 1110
Leu Lys Ala Leu Val Arg Gln Gln Leu Pro Glu Lys Tyr Lys Glu
1115 1120 1125
Ile Phe Phe Asp Gln Ser Lys Asn Gly Tyr Ala Gly Tyr Ile Asp
1130 1135 1140
Gly Gly Ala Ser Gln Glu Glu Phe Tyr Lys Phe Ile Lys Pro Ile
1145 1150 1155
Leu Glu Lys Met Asp Gly Thr Glu Glu Leu Leu Val Lys Leu Asn
1160 1165 1170
Arg Glu Asp Leu Leu Arg Lys Gln Arg Thr Phe Asp Asn Gly Ser
1175 1180 1185
Ile Pro His Gln Ile His Leu Gly Glu Leu His Ala Ile Leu Arg
1190 1195 1200
Arg Gln Glu Asp Phe Tyr Pro Phe Leu Lys Asp Asn Arg Glu Lys
1205 1210 1215
Ile Glu Lys Ile Leu Thr Phe Arg Ile Pro Tyr Tyr Val Gly Pro
1220 1225 1230
Leu Ala Arg Gly Asn Ser Arg Phe Ala Trp Met Thr Arg Lys Ser
1235 1240 1245
Glu Glu Thr Ile Thr Pro Trp Asn Phe Glu Glu Val Val Asp Lys
1250 1255 1260
Gly Ala Ser Ala Gln Ser Phe Ile Glu Arg Met Thr Asn Phe Asp
1265 1270 1275
Lys Asn Leu Pro Asn Glu Lys Val Leu Pro Lys His Ser Leu Leu
1280 1285 1290
Tyr Glu Tyr Phe Thr Val Tyr Asn Glu Leu Thr Lys Val Lys Tyr
1295 1300 1305
Val Thr Glu Gly Met Arg Lys Pro Ala Phe Leu Ser Gly Glu Gln
1310 1315 1320
Lys Lys Ala Ile Val Asp Leu Leu Phe Lys Thr Asn Arg Lys Val
1325 1330 1335
Thr Val Lys Gln Leu Lys Glu Asp Tyr Phe Lys Lys Ile Glu Cys
1340 1345 1350
Phe Asp Ser Val Glu Ile Ser Gly Val Glu Asp Arg Phe Asn Ala
1355 1360 1365
Ser Leu Gly Thr Tyr His Asp Leu Leu Lys Ile Ile Lys Asp Lys
1370 1375 1380
Asp Phe Leu Asp Asn Glu Glu Asn Glu Asp Ile Leu Glu Asp Ile
1385 1390 1395
Val Leu Thr Leu Thr Leu Phe Glu Asp Arg Glu Met Ile Glu Glu
1400 1405 1410
Arg Leu Lys Thr Tyr Ala His Leu Phe Asp Asp Lys Val Met Lys
1415 1420 1425
Gln Leu Lys Arg Arg Arg Tyr Thr Gly Trp Gly Arg Leu Ser Arg
1430 1435 1440
Lys Leu Ile Asn Gly Ile Arg Asp Lys Gln Ser Gly Lys Thr Ile
1445 1450 1455
Leu Asp Phe Leu Lys Ser Asp Gly Phe Ala Asn Arg Asn Phe Met
1460 1465 1470
Gln Leu Ile His Asp Asp Ser Leu Thr Phe Lys Glu Asp Ile Gln
1475 1480 1485
Lys Ala Gln Val Ser Gly Gln Gly Asp Ser Leu His Glu His Ile
1490 1495 1500
Ala Asn Leu Ala Gly Ser Pro Ala Ile Lys Lys Gly Ile Leu Gln
1505 1510 1515
Thr Val Lys Val Val Asp Glu Leu Val Lys Val Met Gly Arg His
1520 1525 1530
Lys Pro Glu Asn Ile Val Ile Glu Met Ala Arg Glu Asn Gln Thr
1535 1540 1545
Thr Gln Lys Gly Gln Lys Asn Ser Arg Glu Arg Met Lys Arg Ile
1550 1555 1560
Glu Glu Gly Ile Lys Glu Leu Gly Ser Gln Ile Leu Lys Glu His
1565 1570 1575
Pro Val Glu Asn Thr Gln Leu Gln Asn Glu Lys Leu Tyr Leu Tyr
1580 1585 1590
Tyr Leu Gln Asn Gly Arg Asp Met Tyr Val Asp Gln Glu Leu Asp
1595 1600 1605
Ile Asn Arg Leu Ser Asp Tyr Asp Val Ala Ala Ile Val Pro Gln
1610 1615 1620
Ser Phe Leu Lys Asp Asp Ser Ile Asp Asn Lys Val Leu Thr Arg
1625 1630 1635
Ser Asp Lys Ala Arg Gly Lys Ser Asp Asn Val Pro Ser Glu Glu
1640 1645 1650
Val Val Lys Lys Met Lys Asn Tyr Trp Arg Gln Leu Leu Asn Ala
1655 1660 1665
Lys Leu Ile Thr Gln Arg Lys Phe Asp Asn Leu Thr Lys Ala Glu
1670 1675 1680
Arg Gly Gly Leu Ser Glu Leu Asp Lys Ala Gly Phe Ile Lys Arg
1685 1690 1695
Gln Leu Val Glu Thr Arg Gln Ile Thr Lys His Val Ala Gln Ile
1700 1705 1710
Leu Asp Ser Arg Met Asn Thr Lys Tyr Asp Glu Asn Asp Lys Leu
1715 1720 1725
Ile Arg Glu Val Lys Val Ile Thr Leu Lys Ser Lys Leu Val Ser
1730 1735 1740
Asp Phe Arg Lys Asp Phe Gln Phe Tyr Lys Val Arg Glu Ile Asn
1745 1750 1755
Asn Tyr His His Ala His Asp Ala Tyr Leu Asn Ala Val Val Gly
1760 1765 1770
Thr Ala Leu Ile Lys Lys Tyr Pro Lys Leu Glu Ser Glu Phe Val
1775 1780 1785
Tyr Gly Asp Tyr Lys Val Tyr Asp Val Arg Lys Met Ile Ala Lys
1790 1795 1800
Ser Glu Gln Glu Ile Gly Lys Ala Thr Ala Lys Tyr Phe Phe Tyr
1805 1810 1815
Ser Asn Ile Met Asn Phe Phe Lys Thr Glu Ile Thr Leu Ala Asn
1820 1825 1830
Gly Glu Ile Arg Lys Arg Pro Leu Ile Glu Thr Asn Gly Glu Thr
1835 1840 1845
Gly Glu Ile Val Trp Asp Lys Gly Arg Asp Phe Ala Thr Val Arg
1850 1855 1860
Lys Val Leu Ser Met Pro Gln Val Asn Ile Val Lys Lys Thr Glu
1865 1870 1875
Val Gln Thr Gly Gly Phe Ser Lys Glu Ser Ile Leu Pro Lys Arg
1880 1885 1890
Asn Ser Asp Lys Leu Ile Ala Arg Lys Lys Asp Trp Asp Pro Lys
1895 1900 1905
Lys Tyr Gly Gly Phe Asp Ser Pro Thr Val Ala Tyr Ser Val Leu
1910 1915 1920
Val Val Ala Lys Val Glu Lys Gly Lys Ser Lys Lys Leu Lys Ser
1925 1930 1935
Val Lys Glu Leu Leu Gly Ile Thr Ile Met Glu Arg Ser Ser Phe
1940 1945 1950
Glu Lys Asn Pro Ile Asp Phe Leu Glu Ala Lys Gly Tyr Lys Glu
1955 1960 1965
Val Lys Lys Asp Leu Ile Ile Lys Leu Pro Lys Tyr Ser Leu Phe
1970 1975 1980
Glu Leu Glu Asn Gly Arg Lys Arg Met Leu Ala Ser Ala Gly Glu
1985 1990 1995
Leu Gln Lys Gly Asn Glu Leu Ala Leu Pro Ser Lys Tyr Val Asn
2000 2005 2010
Phe Leu Tyr Leu Ala Ser His Tyr Glu Lys Leu Lys Gly Ser Pro
2015 2020 2025
Glu Asp Asn Glu Gln Lys Gln Leu Phe Val Glu Gln His Lys His
2030 2035 2040
Tyr Leu Asp Glu Ile Ile Glu Gln Ile Ser Glu Phe Ser Lys Arg
2045 2050 2055
Val Ile Leu Ala Asp Ala Asn Leu Asp Lys Val Leu Ser Ala Tyr
2060 2065 2070
Asn Lys His Arg Asp Lys Pro Ile Arg Glu Gln Ala Glu Asn Ile
2075 2080 2085
Ile His Leu Phe Thr Leu Thr Asn Leu Gly Ala Pro Ala Ala Phe
2090 2095 2100
Lys Tyr Phe Asp Thr Thr Ile Asp Arg Lys Arg Tyr Thr Ser Thr
2105 2110 2115
Lys Glu Val Leu Asp Ala Thr Leu Ile His Gln Ser Ile Thr Gly
2120 2125 2130
Leu Tyr Glu Thr Arg Ile Asp Leu Ser Gln Leu Gly Gly Asp Ser
2135 2140 2145
Gly Gly Lys Arg Pro Ala Ala Thr Lys Lys Ala Gly Gln Ala Lys
2150 2155 2160
Lys Lys Lys Ser Gly Gly Gly Gly Ser Glu Glu Pro Glu Glu Pro
2165 2170 2175
Ala Asp Ser Gly Gln Ser Leu Val Pro Val Tyr Ile Tyr Ser Pro
2180 2185 2190
Glu Tyr Val Ser Met Cys Asp Ser Leu Ala Lys Ile Pro Lys Arg
2195 2200 2205
Ala Ser Met Val His Ser Leu Ile Glu Ala Tyr Ala Leu His Lys
2210 2215 2220
Gln Met Arg Ile Val Lys Pro Lys Val Ala Ser Met Glu Glu Met
2225 2230 2235
Ala Thr Phe His Thr Asp Ala Tyr Leu Gln His Leu Gln Lys Val
2240 2245 2250
Ser Gln Glu Gly Asp Asp Asp His Pro Asp Ser Ile Glu Tyr Gly
2255 2260 2265
Leu Gly Tyr Asp Cys Pro Ala Thr Glu Gly Ile Phe Asp Tyr Ala
2270 2275 2280
Ala Ala Ile Gly Gly Ala Thr Ile Thr Ala Ala Gln Cys Leu Ile
2285 2290 2295
Asp Gly Met Cys Lys Val Ala Ile Asn Trp Ser Gly Gly Trp His
2300 2305 2310
His Ala Lys Lys Asp Glu Ala Ser Gly Phe Cys Tyr Leu Asn Asp
2315 2320 2325
Ala Val Leu Gly Ile Leu Arg Leu Arg Arg Lys Phe Glu Arg Ile
2330 2335 2340
Leu Tyr Val Asp Leu Asp Leu His His Gly Asp Gly Val Glu Asp
2345 2350 2355
Ala Phe Ser Phe Thr Ser Lys Val Met Thr Val Ser Leu His Lys
2360 2365 2370
Phe Ser Pro Gly Phe Phe Pro Gly Thr Gly Asp Val Ser Asp Val
2375 2380 2385
Gly Leu Gly Lys Gly Arg Tyr Tyr Ser Val Asn Val Pro Ile Gln
2390 2395 2400
Asp Gly Ile Gln Asp Glu Lys Tyr Tyr Gln Ile Cys Glu Ser Val
2405 2410 2415
Leu Lys Glu Val Tyr Gln Ala Phe Asn Pro Lys Ala Val Val Leu
2420 2425 2430
Gln Leu Gly Ala Asp Thr Ile Ala Gly Asp Pro Met Cys Ser Phe
2435 2440 2445
Asn Met Thr Pro Val Gly Ile Gly Lys Cys Leu Lys Tyr Ile Leu
2450 2455 2460
Gln Trp Gln Leu Ala Thr Leu Ile Leu Gly Gly Gly Gly Tyr Asn
2465 2470 2475
Leu Ala Asn Thr Ala Arg Cys Trp Thr Tyr Leu Thr Gly Val Ile
2480 2485 2490
Leu Gly Lys Thr Leu Ser Ser Glu Ile Pro Asp His Glu Phe Phe
2495 2500 2505
Thr Ala Tyr Gly Pro Asp Tyr Val Leu Glu Ile Thr Pro Ser Cys
2510 2515 2520
Arg Pro Asp Arg Asn Glu Pro His Arg Ile Gln Gln Ile Leu Asn
2525 2530 2535
Tyr Ile Lys Gly Asn Leu Lys His Val Val Gly Gly Gly Gly Ser
2540 2545 2550
Gly Lys Arg Pro Ala Ala Thr Lys Lys Ala Gly Gln Ala Lys Lys
2555 2560 2565
Lys Lys Gly Ser Tyr Pro Tyr Asp Val Pro Asp Tyr Ala
2570 2575 2580
<210> 217
<211> 4486
<212> RNA
<213> artificial sequence
<220>
<223> Description of Artificial Sequence: Synthetic polynucleotide
<400> 217
aggaaauaag agagaaaaga agaguaagaa gaaauauaag agccaccaug gcccccaaga 60
agaagcggaa ggugggcauc cacggcgugc ccgccgccga caagaaguac agcaucggcc 120
uggacaucgg caccaacagc gugggcuggg ccgugaucac cgacgaguac aaggugccca 180
gcaagaaguu caaggugcug ggcaacaccg accggcacag caucaagaag aaccugaucg 240
gcgcccugcu guucgacagc ggcgagaccg ccgaggccac ccggcugaag cggaccgccc 300
ggcggcggua cacccggcgg aagaaccgga ucugcuaccu gcaggagauc uucagcaacg 360
agauggccaa gguggacgac agcuucuucc accggcugga ggagagcuuc cugguggagg 420
aggacaagaa gcacgagcgg caccccaucu ucggcaacau cguggacgag guggccuacc 480
acgagaagua ccccaccauc uaccaccugc ggaagaagcu gguggacagc accgacaagg 540
ccgaccugcg gcugaucuac cuggcccugg cccacaugau caaguuccgg ggccacuucc 600
ugaucgaggg cgaccugaac cccgacaaca gcgacgugga caagcuguuc auccagcugg 660
ugcagaccua caaccagcug uucgaggaga accccaucaa cgccagcggc guggacgcca 720
aggccauccu gagcgcccgg cugagcaaga gccggcggcu ggagaaccug aucgcccagc 780
ugcccggcga gaagaagaac ggccuguucg gcaaccugau cgcccugagc cugggccuga 840
cccccaacuu caagagcaac uucgaccugg ccgaggacgc caagcugcag cugagcaagg 900
acaccuacga cgacgaccug gacaaccugc uggcccagau cggcgaccag uacgccgacc 960
uguuccuggc cgccaagaac cugagcgacg ccauccugcu gagcgacauc cugcggguga 1020
acaccgagau caccaaggcc ccccugagcg ccagcaugau caagcgguac gacgagcacc 1080
accaggaccu gacccugcug aaggcccugg ugcggcagca gcugcccgag aaguacaagg 1140
agaucuucuu cgaccagagc aagaacggcu acgccggcua caucgacggc ggcgccagcc 1200
aggaggaguu cuacaaguuc aucaagccca uccuggagaa gauggacggc accgaggagc 1260
ugcuggugaa gcugaaccgg gaggaccugc ugcggaagca gcggaccuuc gacaacggca 1320
gcauccccca ccagauccac cugggcgagc ugcacgccau ccugcggcgg caggaggacu 1380
ucuaccccuu ccugaaggac aaccgggaga agaucgagaa gauccugacc uuccggaucc 1440
ccuacuacgu gggcccccug gcccggggca acagccgguu cgccuggaug acccggaaau 1500
ccgaggagac caucaccccc uggaacuucg aggagguggu ggacaagggc gccagcgccc 1560
agagcuucau cgagcggaug accaacuucg acaagaaccu gcccaacgag aaggugcugc 1620
ccaagcacag ccugcuguac gaguacuuca ccguguacaa cgagcugacc aaggugaagu 1680
acgugaccga gggcaugcgg aagcccgccu uccugagcgg cgagcagaag aaggccaucg 1740
uggaccugcu guucaagacc aaccggaagg ugaccgugaa gcagcugaag gaggacuacu 1800
ucaagaagau cgagugcuuc gacagcgugg agaucagcgg cguggaggac cgguucaacg 1860
ccagccuggg caccuaccac gaccugcuga agaucaucaa ggacaaggac uuccuggaca 1920
acgaggagaa cgaggacauc cuggaggaca ucgugcugac ccugacccug uucgaggacc 1980
gggagaugau cgaggagcgg cugaaaaccu acgcccaccu guucgacgac aaggugauga 2040
agcagcugaa gcggcggcgg uacaccggcu ggggccggcu gagccggaag cugaucaacg 2100
gcauccggga caagcagagc ggcaagacca uccuggacuu ccugaaaucc gacggcuucg 2160
ccaaccggaa cuucaugcag cugauccacg acgacagccu gaccuucaag gaggacaucc 2220
agaaggccca ggugagcggc cagggcgaca gccugcacga gcacaucgcc aaccuggccg 2280
gcagccccgc caucaagaag ggcauccugc agaccgugaa ggugguggac gagcugguga 2340
aggugauggg ccggcacaag cccgagaaca ucgugaucga gauggcccgg gagaaccaga 2400
ccacccagaa gggccagaag aacagccggg agcggaugaa gcggaucgag gagggcauca 2460
aggagcuggg cagccagauc cugaaggagc accccgugga gaacacccag cugcagaacg 2520
agaagcugua ccuguacuac cugcagaacg gccgggacau guacguggac caggagcugg 2580
acaucaaccg gcugagcgac uacgacgugg accacaucgu gccccagagc uuccugaagg 2640
acgacagcau cgacaacaag gugcugaccc ggagcgacaa gaaccggggc aagagcgaca 2700
acgugcccag cgaggaggug gugaagaaga ugaagaacua cuggcggcag cugcugaacg 2760
ccaagcugau cacccagcgg aaguucgaca accugaccaa ggccgagcgg ggcggccuga 2820
gcgagcugga caaggccggc uucaucaagc ggcagcuggu ggagacccgg cagaucacca 2880
agcacguggc ccagauccug gacagccgga ugaacaccaa guacgacgag aacgacaagc 2940
ugauccggga ggugaaggug aucacccuga aauccaagcu ggugagcgac uuccggaagg 3000
acuuccaguu cuacaaggug cgggagauca acaacuacca ccacgcccac gacgccuacc 3060
ugaacgccgu ggugggcacc gcccugauca agaaguaccc caagcuggag agcgaguucg 3120
uguacggcga cuacaaggug uacgacgugc ggaagaugau cgccaagagc gagcaggaga 3180
ucggcaaggc caccgccaag uacuucuucu acagcaacau caugaacuuc uucaagaccg 3240
agaucacccu ggccaacggc gagauccgga agcggccccu gaucgagacc aacggcgaga 3300
ccggcgagau cgugugggac aagggccggg acuucgccac cgugcggaag gugcugagca 3360
ugccccaggu gaacaucgug aagaaaaccg aggugcagac cggcggcuuc agcaaggaga 3420
gcauccugcc caagcggaac agcgacaagc ugaucgcccg gaagaaggac ugggacccca 3480
agaaguacgg cggcuucgac agccccaccg uggccuacag cgugcuggug guggccaagg 3540
uggagaaggg caagagcaag aagcugaaau ccgugaagga gcugcugggc aucaccauca 3600
uggagcggag cagcuucgag aagaacccca ucgacuuccu ggaggccaag ggcuacaagg 3660
aggugaagaa ggaccugauc aucaagcugc ccaaguacag ccuguucgag cuggagaacg 3720
gccggaagcg gaugcuggcc agcgccggcg agcugcagaa gggcaacgag cuggcccugc 3780
ccagcaagua cgugaacuuc cuguaccugg ccagccacua cgagaagcug aagggcagcc 3840
ccgaggacaa cgagcagaag cagcuguucg uggagcagca caagcacuac cuggacgaga 3900
ucaucgagca gaucagcgag uucagcaagc gggugauccu ggccgacgcc aaccuggaca 3960
aggugcugag cgccuacaac aagcaccggg acaagcccau ccgggagcag gccgagaaca 4020
ucauccaccu guucacccug accaaccugg gcgcccccgc cgccuucaag uacuucgaca 4080
ccaccaucga ccggaagcgg uacaccagca ccaaggaggu gcuggacgcc acccugaucc 4140
accagagcau caccggccug uacgagaccc ggaucgaccu gagccagcug ggcggcgaca 4200
gcggcggcaa gcggcccgcc gccaccaaga aggccggcca ggccaagaag aagaagggca 4260
gcuaccccua cgacgugccc gacuacgccu gagcggccgc uuaauuaagc ugccuucugc 4320
ggggcuugcc uucuggccau gcccuucuuc ucucccuugc accuguaccu cuuggucuuu 4380
gaauaaagcc ugaguaggaa gucuagaaaa aaaaaaaaaa aaaaaaaaaa aaaaaaaaaa 4440
aaaaaaaaaa aaaaaaaaaa aaaaaaaaaa aaaaaaaaaa aaaaaa 4486
<210> 218
<211> 1414
<212> PRT
<213> artificial sequence
<220>
<223> Description of Artificial Sequence: Synthetic polypeptide
<400> 218
Met Ala Pro Lys Lys Lys Arg Lys Val Gly Ile His Gly Val Pro Ala
1 5 10 15
Ala Asp Lys Lys Tyr Ser Ile Gly Leu Asp Ile Gly Thr Asn Ser Val
20 25 30
Gly Trp Ala Val Ile Thr Asp Glu Tyr Lys Val Pro Ser Lys Lys Phe
35 40 45
Lys Val Leu Gly Asn Thr Asp Arg His Ser Ile Lys Lys Asn Leu Ile
50 55 60
Gly Ala Leu Leu Phe Asp Ser Gly Glu Thr Ala Glu Ala Thr Arg Leu
65 70 75 80
Lys Arg Thr Ala Arg Arg Arg Tyr Thr Arg Arg Lys Asn Arg Ile Cys
85 90 95
Tyr Leu Gln Glu Ile Phe Ser Asn Glu Met Ala Lys Val Asp Asp Ser
100 105 110
Phe Phe His Arg Leu Glu Glu Ser Phe Leu Val Glu Glu Asp Lys Lys
115 120 125
His Glu Arg His Pro Ile Phe Gly Asn Ile Val Asp Glu Val Ala Tyr
130 135 140
His Glu Lys Tyr Pro Thr Ile Tyr His Leu Arg Lys Lys Leu Val Asp
145 150 155 160
Ser Thr Asp Lys Ala Asp Leu Arg Leu Ile Tyr Leu Ala Leu Ala His
165 170 175
Met Ile Lys Phe Arg Gly His Phe Leu Ile Glu Gly Asp Leu Asn Pro
180 185 190
Asp Asn Ser Asp Val Asp Lys Leu Phe Ile Gln Leu Val Gln Thr Tyr
195 200 205
Asn Gln Leu Phe Glu Glu Asn Pro Ile Asn Ala Ser Gly Val Asp Ala
210 215 220
Lys Ala Ile Leu Ser Ala Arg Leu Ser Lys Ser Arg Arg Leu Glu Asn
225 230 235 240
Leu Ile Ala Gln Leu Pro Gly Glu Lys Lys Asn Gly Leu Phe Gly Asn
245 250 255
Leu Ile Ala Leu Ser Leu Gly Leu Thr Pro Asn Phe Lys Ser Asn Phe
260 265 270
Asp Leu Ala Glu Asp Ala Lys Leu Gln Leu Ser Lys Asp Thr Tyr Asp
275 280 285
Asp Asp Leu Asp Asn Leu Leu Ala Gln Ile Gly Asp Gln Tyr Ala Asp
290 295 300
Leu Phe Leu Ala Ala Lys Asn Leu Ser Asp Ala Ile Leu Leu Ser Asp
305 310 315 320
Ile Leu Arg Val Asn Thr Glu Ile Thr Lys Ala Pro Leu Ser Ala Ser
325 330 335
Met Ile Lys Arg Tyr Asp Glu His His Gln Asp Leu Thr Leu Leu Lys
340 345 350
Ala Leu Val Arg Gln Gln Leu Pro Glu Lys Tyr Lys Glu Ile Phe Phe
355 360 365
Asp Gln Ser Lys Asn Gly Tyr Ala Gly Tyr Ile Asp Gly Gly Ala Ser
370 375 380
Gln Glu Glu Phe Tyr Lys Phe Ile Lys Pro Ile Leu Glu Lys Met Asp
385 390 395 400
Gly Thr Glu Glu Leu Leu Val Lys Leu Asn Arg Glu Asp Leu Leu Arg
405 410 415
Lys Gln Arg Thr Phe Asp Asn Gly Ser Ile Pro His Gln Ile His Leu
420 425 430
Gly Glu Leu His Ala Ile Leu Arg Arg Gln Glu Asp Phe Tyr Pro Phe
435 440 445
Leu Lys Asp Asn Arg Glu Lys Ile Glu Lys Ile Leu Thr Phe Arg Ile
450 455 460
Pro Tyr Tyr Val Gly Pro Leu Ala Arg Gly Asn Ser Arg Phe Ala Trp
465 470 475 480
Met Thr Arg Lys Ser Glu Glu Thr Ile Thr Pro Trp Asn Phe Glu Glu
485 490 495
Val Val Asp Lys Gly Ala Ser Ala Gln Ser Phe Ile Glu Arg Met Thr
500 505 510
Asn Phe Asp Lys Asn Leu Pro Asn Glu Lys Val Leu Pro Lys His Ser
515 520 525
Leu Leu Tyr Glu Tyr Phe Thr Val Tyr Asn Glu Leu Thr Lys Val Lys
530 535 540
Tyr Val Thr Glu Gly Met Arg Lys Pro Ala Phe Leu Ser Gly Glu Gln
545 550 555 560
Lys Lys Ala Ile Val Asp Leu Leu Phe Lys Thr Asn Arg Lys Val Thr
565 570 575
Val Lys Gln Leu Lys Glu Asp Tyr Phe Lys Lys Ile Glu Cys Phe Asp
580 585 590
Ser Val Glu Ile Ser Gly Val Glu Asp Arg Phe Asn Ala Ser Leu Gly
595 600 605
Thr Tyr His Asp Leu Leu Lys Ile Ile Lys Asp Lys Asp Phe Leu Asp
610 615 620
Asn Glu Glu Asn Glu Asp Ile Leu Glu Asp Ile Val Leu Thr Leu Thr
625 630 635 640
Leu Phe Glu Asp Arg Glu Met Ile Glu Glu Arg Leu Lys Thr Tyr Ala
645 650 655
His Leu Phe Asp Asp Lys Val Met Lys Gln Leu Lys Arg Arg Arg Tyr
660 665 670
Thr Gly Trp Gly Arg Leu Ser Arg Lys Leu Ile Asn Gly Ile Arg Asp
675 680 685
Lys Gln Ser Gly Lys Thr Ile Leu Asp Phe Leu Lys Ser Asp Gly Phe
690 695 700
Ala Asn Arg Asn Phe Met Gln Leu Ile His Asp Asp Ser Leu Thr Phe
705 710 715 720
Lys Glu Asp Ile Gln Lys Ala Gln Val Ser Gly Gln Gly Asp Ser Leu
725 730 735
His Glu His Ile Ala Asn Leu Ala Gly Ser Pro Ala Ile Lys Lys Gly
740 745 750
Ile Leu Gln Thr Val Lys Val Val Asp Glu Leu Val Lys Val Met Gly
755 760 765
Arg His Lys Pro Glu Asn Ile Val Ile Glu Met Ala Arg Glu Asn Gln
770 775 780
Thr Thr Gln Lys Gly Gln Lys Asn Ser Arg Glu Arg Met Lys Arg Ile
785 790 795 800
Glu Glu Gly Ile Lys Glu Leu Gly Ser Gln Ile Leu Lys Glu His Pro
805 810 815
Val Glu Asn Thr Gln Leu Gln Asn Glu Lys Leu Tyr Leu Tyr Tyr Leu
820 825 830
Gln Asn Gly Arg Asp Met Tyr Val Asp Gln Glu Leu Asp Ile Asn Arg
835 840 845
Leu Ser Asp Tyr Asp Val Asp His Ile Val Pro Gln Ser Phe Leu Lys
850 855 860
Asp Asp Ser Ile Asp Asn Lys Val Leu Thr Arg Ser Asp Lys Asn Arg
865 870 875 880
Gly Lys Ser Asp Asn Val Pro Ser Glu Glu Val Val Lys Lys Met Lys
885 890 895
Asn Tyr Trp Arg Gln Leu Leu Asn Ala Lys Leu Ile Thr Gln Arg Lys
900 905 910
Phe Asp Asn Leu Thr Lys Ala Glu Arg Gly Gly Leu Ser Glu Leu Asp
915 920 925
Lys Ala Gly Phe Ile Lys Arg Gln Leu Val Glu Thr Arg Gln Ile Thr
930 935 940
Lys His Val Ala Gln Ile Leu Asp Ser Arg Met Asn Thr Lys Tyr Asp
945 950 955 960
Glu Asn Asp Lys Leu Ile Arg Glu Val Lys Val Ile Thr Leu Lys Ser
965 970 975
Lys Leu Val Ser Asp Phe Arg Lys Asp Phe Gln Phe Tyr Lys Val Arg
980 985 990
Glu Ile Asn Asn Tyr His His Ala His Asp Ala Tyr Leu Asn Ala Val
995 1000 1005
Val Gly Thr Ala Leu Ile Lys Lys Tyr Pro Lys Leu Glu Ser Glu
1010 1015 1020
Phe Val Tyr Gly Asp Tyr Lys Val Tyr Asp Val Arg Lys Met Ile
1025 1030 1035
Ala Lys Ser Glu Gln Glu Ile Gly Lys Ala Thr Ala Lys Tyr Phe
1040 1045 1050
Phe Tyr Ser Asn Ile Met Asn Phe Phe Lys Thr Glu Ile Thr Leu
1055 1060 1065
Ala Asn Gly Glu Ile Arg Lys Arg Pro Leu Ile Glu Thr Asn Gly
1070 1075 1080
Glu Thr Gly Glu Ile Val Trp Asp Lys Gly Arg Asp Phe Ala Thr
1085 1090 1095
Val Arg Lys Val Leu Ser Met Pro Gln Val Asn Ile Val Lys Lys
1100 1105 1110
Thr Glu Val Gln Thr Gly Gly Phe Ser Lys Glu Ser Ile Leu Pro
1115 1120 1125
Lys Arg Asn Ser Asp Lys Leu Ile Ala Arg Lys Lys Asp Trp Asp
1130 1135 1140
Pro Lys Lys Tyr Gly Gly Phe Asp Ser Pro Thr Val Ala Tyr Ser
1145 1150 1155
Val Leu Val Val Ala Lys Val Glu Lys Gly Lys Ser Lys Lys Leu
1160 1165 1170
Lys Ser Val Lys Glu Leu Leu Gly Ile Thr Ile Met Glu Arg Ser
1175 1180 1185
Ser Phe Glu Lys Asn Pro Ile Asp Phe Leu Glu Ala Lys Gly Tyr
1190 1195 1200
Lys Glu Val Lys Lys Asp Leu Ile Ile Lys Leu Pro Lys Tyr Ser
1205 1210 1215
Leu Phe Glu Leu Glu Asn Gly Arg Lys Arg Met Leu Ala Ser Ala
1220 1225 1230
Gly Glu Leu Gln Lys Gly Asn Glu Leu Ala Leu Pro Ser Lys Tyr
1235 1240 1245
Val Asn Phe Leu Tyr Leu Ala Ser His Tyr Glu Lys Leu Lys Gly
1250 1255 1260
Ser Pro Glu Asp Asn Glu Gln Lys Gln Leu Phe Val Glu Gln His
1265 1270 1275
Lys His Tyr Leu Asp Glu Ile Ile Glu Gln Ile Ser Glu Phe Ser
1280 1285 1290
Lys Arg Val Ile Leu Ala Asp Ala Asn Leu Asp Lys Val Leu Ser
1295 1300 1305
Ala Tyr Asn Lys His Arg Asp Lys Pro Ile Arg Glu Gln Ala Glu
1310 1315 1320
Asn Ile Ile His Leu Phe Thr Leu Thr Asn Leu Gly Ala Pro Ala
1325 1330 1335
Ala Phe Lys Tyr Phe Asp Thr Thr Ile Asp Arg Lys Arg Tyr Thr
1340 1345 1350
Ser Thr Lys Glu Val Leu Asp Ala Thr Leu Ile His Gln Ser Ile
1355 1360 1365
Thr Gly Leu Tyr Glu Thr Arg Ile Asp Leu Ser Gln Leu Gly Gly
1370 1375 1380
Asp Ser Gly Gly Lys Arg Pro Ala Ala Thr Lys Lys Ala Gly Gln
1385 1390 1395
Ala Lys Lys Lys Lys Gly Ser Tyr Pro Tyr Asp Val Pro Asp Tyr
1400 1405 1410
Ala
<210> 219
<211> 3262
<212> DNA
<213> artificial sequence
<220>
<223> Description of Artificial Sequence: Synthetic polynucleotide
<400> 219
aggaaataag agagaaaaga agagtaagaa gaaatataag agccaccatg gcccccaaga 60
agaagcggaa ggtgggcggc agcggcggca gcggccagac cggcaagaag agcgagaagg 120
gccccgtgtg ctggcggaag cgggtgaaga gcgagtacat gcggctgcgg cagctgaagc 180
ggttccggcg ggccgacgag gtgaagagca tgttcagcag caaccggcag aagatcctgg 240
agcggaccga gatcctgaac caggagtgga agcagcggcg aatccagccc gtgcacatcc 300
tgaccagcgt gagcagcctg cggggcaccc gggagtgcag cgtgaccagc gacctggact 360
tccccaccca ggtgatcccc ctaaagaccc tgaacgccgt ggccagcgtg cccatcatgt 420
acagctggag ccccctgcag cagaacttca tggtggagga cgagaccgtg ctgcacaaca 480
tcccctacat gggcgacgag gtgctggacc aggacggcac cttcatcgag gagctgatca 540
agaactacga cggcaaggtg cacggcgacc gggagtgcgg cttcatcaac gacgagatct 600
tcgtggagct ggtgaacgcc ctgggccagt acaacgacga cgacgacgac gacgacggcg 660
acgaccccga ggagcgggag gagaagcaga aggacctgga ggaccaccgg gacgacaagg 720
agagccggcc cccccggaag ttccccagcg acaagatctt cgaggccatc agcagcatgt 780
tccccgacaa gggcaccgcc gaggagctga aggagaagta caaggagctg accgagcagc 840
agctgcccgg cgccctgccc cccgagtgca cccccaacat cgacggcccc aacgccaaga 900
gcgtgcagcg ggagcagagc ctgcacagct tccacaccct gttctgccgg cggtgcttca 960
agtacgactg cttcctgcac cccttccacg ccacccccaa cacctacaag cggaagaaca 1020
ccgagaccgc cctggacaac aagccctgcg gcccccagtg ctaccagcac ctggagggcg 1080
ccaaggagtt cgccgccgcc ctgaccgccg agcggatcaa gacccccccc aagcggcccg 1140
gcggccggcg gcggggccgg ctgcccaaca acagcagccg gcccagcacc cccaccatca 1200
acgtgctgga gagcaaggac accgacagcg accgggaggc cggcaccgag accggcggcg 1260
agaacaacga caaggaggag gaggagaaga aggacgagac cagcagcagc agcgaggcca 1320
acagccggtg ccagaccccc atcaagatga agcccaacat cgagcccccc gagaacgtgg 1380
agtggagcgg cgccgaggcc agcatgttcc gggtgctgat cggcacctac tacgacaact 1440
tctgcgccat cgcccggctg atcggcacca agacctgccg gcaggtgtac gagttccggg 1500
tgaaggagag cagcatcatc gcccccgccc ccgccgagga cgtggacacc cccccccgga 1560
agaagaagcg gaagcaccgg ctgtgggccg cccactgccg gaagatccag ctgaagaagg 1620
acggcagcag caaccacgtg tacaactacc agccctgcga ccacccccgg cagccctgcg 1680
acagcagctg cccctgcgtg atcgcccaga acttctgcga gaagttctgc cagtgcagca 1740
gcgagtgcca gaaccggttc cccggctgcc ggtgcaaggc ccagtgcaac accaagcagt 1800
gcccctgcta cctggccgtg cgggagtgcg accccgacct gtgcctgacc tgcggcgccg 1860
ccgaccactg ggacagcaag aacgtgagct gcaagaactg cagcatccag cggggcagca 1920
agaagcacct gctgctggcc cccagcgacg tggccggctg gggcatcttc atcaaggacc 1980
ccgtgcagaa gaacgagttc atcagcgagt actgcggcga gatcatcagc caggacgagg 2040
ccgaccggcg gggcaaggtg tacgacaagt acatgtgcag cttcctgttc aacctgaaca 2100
acgacttcgt ggtggacgcc acccggaagg gcaacaagat ccggttcgcc aaccacagcg 2160
tgaaccccaa ctgctacgcc aaggtgatga tggtgaacgg cgaccaccgg atcggcatct 2220
tcgccaagcg ggccatccag accggcgagg agctgttctt cgactaccgg tacagccagg 2280
ccgacgccct gaagtacgtg ggcatcgagc gggagatgga gatccccggc agcagcggat 2340
ccctggagcc cggcgaaaag ccttacaagt gtcccgagtg cggaaagagc ttcagcagag 2400
ccgataatct gaccgagcac caaaggaccc acaccggaga gaagccttat aagtgtcccg 2460
aatgcggcaa aagcttttct agaagcgatc atctgaccaa ccaccagagg acacacaccg 2520
gagaaaaacc ttacaaatgc cccgagtgcg gcaaaagctt ctcccagagc agcaatctgg 2580
tgagacacca aaggacccac accggcgaaa aaccctataa atgccccgaa tgtggcaaga 2640
gctttagcac atccggcgag ctggtgaggc atcaaagaac acataccggc gagaagccct 2700
acaagtgccc cgagtgtgga aaaagcttca gcacccacct cgatctgatc agacaccaga 2760
ggacccatac cggagagaaa ccctacaaat gtcccgagtg cggaaagtcc tttagccagc 2820
tggcccatct gagagctcat caaaggacac acaccggcga gaagccttac aagtgtcccg 2880
agtgcggaaa atccttctcc caactggccc atctgagggc ccaccagaga acccacaccg 2940
gcaaaaagac ctccgctagc ggcagcggcg gcggcagcgg cggcaagcgg cccgccgcca 3000
ccaagaaggc cggccaggcc aagaagaaga agggcagcta cccctacgac gtgcccgact 3060
acgcctgagc ggccgcttaa ttaagctgcc ttctgcgggg cttgccttct ggccatgccc 3120
ttcttctctc ccttgcacct gtacctcttg gtctttgaat aaagcctgag taggaagtct 3180
agaaaaaaaa aaaaaaaaaa aaaaaaaaaa aaaaaaaaaa aaaaaaaaaa aaaaaaaaaa 3240
aaaaaaaaaa aaaaaaaaaa aa 3262
<210> 220
<211> 3262
<212> DNA
<213> artificial sequence
<220>
<223> Description of Artificial Sequence: Synthetic polynucleotide
<400> 220
aggaaataag agagaaaaga agagtaagaa gaaatataag agccaccatg gcccccaaga 60
agaagcggaa ggtgggcggc agcggcggca gcggccagac cggcaagaag agcgagaagg 120
gccccgtgtg ctggcggaag cgggtgaaga gcgagtacat gcggctgcgg cagctgaagc 180
ggttccggcg ggccgacgag gtgaagagca tgttcagcag caaccggcag aagatcctgg 240
agcggaccga gatcctgaac caggagtgga agcagcggcg aatccagccc gtgcacatcc 300
tgaccagcgt gagcagcctg cggggcaccc gggagtgcag cgtgaccagc gacctggact 360
tccccaccca ggtgatcccc ctaaagaccc tgaacgccgt ggccagcgtg cccatcatgt 420
acagctggag ccccctgcag cagaacttca tggtggagga cgagaccgtg ctgcacaaca 480
tcccctacat gggcgacgag gtgctggacc aggacggcac cttcatcgag gagctgatca 540
agaactacga cggcaaggtg cacggcgacc gggagtgcgg cttcatcaac gacgagatct 600
tcgtggagct ggtgaacgcc ctgggccagt acaacgacga cgacgacgac gacgacggcg 660
acgaccccga ggagcgggag gagaagcaga aggacctgga ggaccaccgg gacgacaagg 720
agagccggcc cccccggaag ttccccagcg acaagatctt cgaggccatc agcagcatgt 780
tccccgacaa gggcaccgcc gaggagctga aggagaagta caaggagctg accgagcagc 840
agctgcccgg cgccctgccc cccgagtgca cccccaacat cgacggcccc aacgccaaga 900
gcgtgcagcg ggagcagagc ctgcacagct tccacaccct gttctgccgg cggtgcttca 960
agtacgactg cttcctgcac cccttccacg ccacccccaa cacctacaag cggaagaaca 1020
ccgagaccgc cctggacaac aagccctgcg gcccccagtg ctaccagcac ctggagggcg 1080
ccaaggagtt cgccgccgcc ctgaccgccg agcggatcaa gacccccccc aagcggcccg 1140
gcggccggcg gcggggccgg ctgcccaaca acagcagccg gcccagcacc cccaccatca 1200
acgtgctgga gagcaaggac accgacagcg accgggaggc cggcaccgag accggcggcg 1260
agaacaacga caaggaggag gaggagaaga aggacgagac cagcagcagc agcgaggcca 1320
acagccggtg ccagaccccc atcaagatga agcccaacat cgagcccccc gagaacgtgg 1380
agtggagcgg cgccgaggcc agcatgttcc gggtgctgat cggcacctac tacgacaact 1440
tctgcgccat cgcccggctg atcggcacca agacctgccg gcaggtgtac gagttccggg 1500
tgaaggagag cagcatcatc gcccccgccc ccgccgagga cgtggacacc cccccccgga 1560
agaagaagcg gaagcaccgg ctgtgggccg cccactgccg gaagatccag ctgaagaagg 1620
acggcagcag caaccacgtg tacaactacc agccctgcga ccacccccgg cagccctgcg 1680
acagcagctg cccctgcgtg atcgcccaga acttctgcga gaagttctgc cagtgcagca 1740
gcgagtgcca gaaccggttc cccggctgcc ggtgcaaggc ccagtgcaac accaagcagt 1800
gcccctgcta cctggccgtg cgggagtgcg accccgacct gtgcctgacc tgcggcgccg 1860
ccgaccactg ggacagcaag aacgtgagct gcaagaactg cagcatccag cggggcagca 1920
agaagcacct gctgctggcc cccagcgacg tggccggctg gggcatcttc atcaaggacc 1980
ccgtgcagaa gaacgagttc atcagcgagt actgcggcga gatcatcagc caggacgagg 2040
ccgaccggcg gggcaaggtg tacgacaagt acatgtgcag cttcctgttc aacctgaaca 2100
acgacttcgt ggtggacgcc acccggaagg gcaacaagat ccggttcgcc aaccacagcg 2160
tgaaccccaa ctgctacgcc aaggtgatga tggtgaacgg cgaccaccgg atcggcatct 2220
tcgccaagcg ggccatccag accggcgagg agctgttctt cgactaccgg tacagccagg 2280
ccgacgccct gaagtacgtg ggcatcgagc gggagatgga gatccccggc agcagcggat 2340
ccctggagcc cggcgaaaag ccttacaaat gtcccgaatg cggaaagagc ttcagcagag 2400
ccgacaatct gaccgaacat cagagaaccc ataccggaga aaaaccttac aaatgtcccg 2460
agtgcggcaa aagcttctcc caagccggac atctggccag ccaccaaagg acacataccg 2520
gcgagaaacc ctacaagtgc cccgagtgcg gcaagtcctt ctctagatcc gatgagctgg 2580
tcagacatca gagaacccat accggcgaga agccttataa gtgccccgaa tgtggcaagt 2640
ccttcagcca gagagctcat ctggagaggc atcaaagaac acacaccgga gagaaacctt 2700
acaagtgtcc cgagtgtgga aagagcttct ccagaaggga cgagctgaac gtccaccaaa 2760
gaacccatac cggcgaaaag ccctataaat gccccgagtg tggaaaatcc ttttctagat 2820
ccgaccatct gacaacccac cagaggaccc ataccggaga gaagccctac aaatgccccg 2880
agtgtggaaa aagcttctct agaaacgatg ctctgacaga gcaccaaagg acccacaccg 2940
gcaaaaagac cagcgctagc ggcagcggcg gcggcagcgg cggcaagcgg cccgccgcca 3000
ccaagaaggc cggccaggcc aagaagaaga agggcagcta cccctacgac gtgcccgact 3060
acgcctgagc ggccgcttaa ttaagctgcc ttctgcgggg cttgccttct ggccatgccc 3120
ttcttctctc ccttgcacct gtacctcttg gtctttgaat aaagcctgag taggaagtct 3180
agaaaaaaaa aaaaaaaaaa aaaaaaaaaa aaaaaaaaaa aaaaaaaaaa aaaaaaaaaa 3240
aaaaaaaaaa aaaaaaaaaa aa 3262
<210> 221
<211> 2668
<212> DNA
<213> artificial sequence
<220>
<223> Description of Artificial Sequence: Synthetic polynucleotide
<400> 221
aggaaataag agagaaaaga agagtaagaa gaaatataag agccaccatg gcccccaaga 60
agaagcggaa ggtgggcatc cacggcgtgc ccgccgccgg cagcagcgga tccctggagc 120
ccggcgaaaa gccttacaag tgccccgagt gtggcaaatc ctttagcacc accggaaatc 180
tgaccgtcca ccagagaaca cataccggcg agaaacccta caagtgtccc gagtgcggca 240
aatccttcag ccagctggcc catctgagag cccatcaaag gacccatacc ggcgagaaac 300
cttacaagtg tcccgaatgt ggaaagtcct ttagcagccc cgccgatctg acaagacatc 360
aaagaaccca caccggcgag aagccctata aatgtcccga gtgtggaaag tccttcagcc 420
agagcggcaa tctgaccgag catcaaagaa cccataccgg cgaaaagccc tataagtgcc 480
ccgaatgcgg aaaaagcttc tccacaagcg gcgagctggt gagacaccaa aggacacata 540
ccggcgaaaa gccttataaa tgccccgagt gcggcaagag cttctctaga aaggacaatc 600
tgaagaacca ccaaagaaca cacaccggcg agaagcccta caaatgcccc gagtgcggca 660
agagctttag ccagtccagc aacctcgtga gacatcagag gacacatacc ggaaaaaaga 720
ccagcgctag cggcagcggc ggcggcagcg gcggcaacca cgaccaggag ttcgaccccc 780
ccaaggtgta cccccccgtg cccgccgaga agcggaagcc catccgggtg ctgagcctgt 840
tcgacggcat cgccaccggc ctgctggtgc tgaaggacct gggcatccag gtggaccggt 900
acatcgccag cgaggtgtgc gaggacagca tcaccgtggg catggtgcgg caccagggca 960
agatcatgta cgtgggcgac gtgcggagcg tgacccagaa gcacatccag gagtggggcc 1020
ccttcgacct ggtgatcggc ggcagcccct gcaacgacct gagcatcgtg aaccccgccc 1080
ggaagggcct gtacgagggc accggccggc tgttcttcga gttctaccgg ctgctgcacg 1140
acgcccggcc caaggagggc gacgaccggc ccttcttctg gctgttcgag aacgtggtgg 1200
ccatgggcgt gagcgacaag cgggacatca gccggttcct ggagagcaac cccgtgatga 1260
tcgacgccaa ggaggtgagc gccgcccacc gggcccggta cttctggggc aacctgcccg 1320
gcatgaaccg gcccctggcc agcaccgtga acgacaagct ggagctgcag gagtgcctgg 1380
agcacggccg gatcgccaag ttcagcaagg tgcggaccat caccacccgg agcaacagca 1440
tcaagcaggg caaggaccag cacttccccg tgttcatgaa cgagaaggag gacatcctgt 1500
ggtgcaccga gatggagcgg gtgttcggct tccccgtgca ctacaccgac gtgagcaaca 1560
tgagccggct ggcccggcag cggctgctgg gccggagctg gagcgtgccc gtgatccggc 1620
acctgttcgc ccccctgaag gagtacttcg cctgcgtgag cagcggcaac agcaacgcca 1680
acagccgggg ccccagcttc agcagcggcc tggtgcccct gagcctgcgg ggcagccaca 1740
tgaatcctct ggagatgttc gagacagtgc ccgtgtggag aaggcaaccc gtgagggtgc 1800
tgagcctctt cgaggacatt aagaaggagc tgacctctct gggctttctg gaatccggca 1860
gcgaccccgg ccagctgaaa cacgtggtgg acgtgaccga cacagtgagg aaggacgtgg 1920
aagagtgggg cccctttgac ctcgtgtatg gagccacacc tcctctcggc cacacatgcg 1980
ataggcctcc cagctggtat ctcttccagt tccacagact gctccagtac gccagaccta 2040
agcccggcag ccccagaccc ttcttctgga tgttcgtgga caatctggtg ctgaacaagg 2100
aggatctgga tgtggccagc agatttctgg agatggaacc cgtgacaatc cccgacgtgc 2160
atggcggctc tctgcagaac gccgtgagag tgtggtccaa catccccgcc attagaagca 2220
gacactgggc tctggtgagc gaggaggaac tgtctctgct ggcccagaat aagcagtcct 2280
ccaagctggc cgccaagtgg cccaccaagc tggtgaagaa ctgctttctg cctctgaggg 2340
agtatttcaa gtatttcagc accgaactga ccagcagcct gagcggcggc aagcggcccg 2400
ccgccaccaa gaaggccggc caggccaaga agaagaaggg cagctacccc tacgacgtgc 2460
ccgactacgc ctgagcggcc gcttaattaa gctgccttct gcggggcttg ccttctggcc 2520
atgcccttct tctctccctt gcacctgtac ctcttggtct ttgaataaag cctgagtagg 2580
aagtctagaa aaaaaaaaaa aaaaaaaaaa aaaaaaaaaa aaaaaaaaaa aaaaaaaaaa 2640
aaaaaaaaaa aaaaaaaaaa aaaaaaaa 2668
<210> 222
<211> 3262
<212> DNA
<213> artificial sequence
<220>
<223> Description of Artificial Sequence: Synthetic polynucleotide
<400> 222
aggaaataag agagaaaaga agagtaagaa gaaatataag agccaccatg gcccccaaga 60
agaagcggaa ggtgggcggc agcggcggca gcggccagac cggcaagaag agcgagaagg 120
gccccgtgtg ctggcggaag cgggtgaaga gcgagtacat gcggctgcgg cagctgaagc 180
ggttccggcg ggccgacgag gtgaagagca tgttcagcag caaccggcag aagatcctgg 240
agcggaccga gatcctgaac caggagtgga agcagcggcg aatccagccc gtgcacatcc 300
tgaccagcgt gagcagcctg cggggcaccc gggagtgcag cgtgaccagc gacctggact 360
tccccaccca ggtgatcccc ctaaagaccc tgaacgccgt ggccagcgtg cccatcatgt 420
acagctggag ccccctgcag cagaacttca tggtggagga cgagaccgtg ctgcacaaca 480
tcccctacat gggcgacgag gtgctggacc aggacggcac cttcatcgag gagctgatca 540
agaactacga cggcaaggtg cacggcgacc gggagtgcgg cttcatcaac gacgagatct 600
tcgtggagct ggtgaacgcc ctgggccagt acaacgacga cgacgacgac gacgacggcg 660
acgaccccga ggagcgggag gagaagcaga aggacctgga ggaccaccgg gacgacaagg 720
agagccggcc cccccggaag ttccccagcg acaagatctt cgaggccatc agcagcatgt 780
tccccgacaa gggcaccgcc gaggagctga aggagaagta caaggagctg accgagcagc 840
agctgcccgg cgccctgccc cccgagtgca cccccaacat cgacggcccc aacgccaaga 900
gcgtgcagcg ggagcagagc ctgcacagct tccacaccct gttctgccgg cggtgcttca 960
agtacgactg cttcctgcac cccttccacg ccacccccaa cacctacaag cggaagaaca 1020
ccgagaccgc cctggacaac aagccctgcg gcccccagtg ctaccagcac ctggagggcg 1080
ccaaggagtt cgccgccgcc ctgaccgccg agcggatcaa gacccccccc aagcggcccg 1140
gcggccggcg gcggggccgg ctgcccaaca acagcagccg gcccagcacc cccaccatca 1200
acgtgctgga gagcaaggac accgacagcg accgggaggc cggcaccgag accggcggcg 1260
agaacaacga caaggaggag gaggagaaga aggacgagac cagcagcagc agcgaggcca 1320
acagccggtg ccagaccccc atcaagatga agcccaacat cgagcccccc gagaacgtgg 1380
agtggagcgg cgccgaggcc agcatgttcc gggtgctgat cggcacctac tacgacaact 1440
tctgcgccat cgcccggctg atcggcacca agacctgccg gcaggtgtac gagttccggg 1500
tgaaggagag cagcatcatc gcccccgccc ccgccgagga cgtggacacc cccccccgga 1560
agaagaagcg gaagcaccgg ctgtgggccg cccactgccg gaagatccag ctgaagaagg 1620
acggcagcag caaccacgtg tacaactacc agccctgcga ccacccccgg cagccctgcg 1680
acagcagctg cccctgcgtg atcgcccaga acttctgcga gaagttctgc cagtgcagca 1740
gcgagtgcca gaaccggttc cccggctgcc ggtgcaaggc ccagtgcaac accaagcagt 1800
gcccctgcta cctggccgtg cgggagtgcg accccgacct gtgcctgacc tgcggcgccg 1860
ccgaccactg ggacagcaag aacgtgagct gcaagaactg cagcatccag cggggcagca 1920
agaagcacct gctgctggcc cccagcgacg tggccggctg gggcatcttc atcaaggacc 1980
ccgtgcagaa gaacgagttc atcagcgagt actgcggcga gatcatcagc caggacgagg 2040
ccgaccggcg gggcaaggtg tacgacaagt acatgtgcag cttcctgttc aacctgaaca 2100
acgacttcgt ggtggacgcc acccggaagg gcaacaagat ccggttcgcc aaccacagcg 2160
tgaaccccaa ctgctacgcc aaggtgatga tggtgaacgg cgaccaccgg atcggcatct 2220
tcgccaagcg ggccatccag accggcgagg agctgttctt cgactaccgg tacagccagg 2280
ccgacgccct gaagtacgtg ggcatcgagc gggagatgga gatccccggc agcagcggat 2340
ccctggagcc cggcgaaaag ccttacaagt gccccgagtg tggcaaatcc tttagcacca 2400
ccggaaatct gaccgtccac cagagaacac ataccggcga gaaaccctac aagtgtcccg 2460
agtgcggcaa atccttcagc cagctggccc atctgagagc ccatcaaagg acccataccg 2520
gcgagaaacc ttacaagtgt cccgaatgtg gaaagtcctt tagcagcccc gccgatctga 2580
caagacatca aagaacccac accggcgaga agccctataa atgtcccgag tgtggaaagt 2640
ccttcagcca gagcggcaat ctgaccgagc atcaaagaac ccataccggc gaaaagccct 2700
ataagtgccc cgaatgcgga aaaagcttct ccacaagcgg cgagctggtg agacaccaaa 2760
ggacacatac cggcgaaaag ccttataaat gccccgagtg cggcaagagc ttctctagaa 2820
aggacaatct gaagaaccac caaagaacac acaccggcga gaagccctac aaatgccccg 2880
agtgcggcaa gagctttagc cagtccagca acctcgtgag acatcagagg acacataccg 2940
gaaaaaagac cagcgctagc ggcagcggcg gcggcagcgg cggcaagcgg cccgccgcca 3000
ccaagaaggc cggccaggcc aagaagaaga agggcagcta cccctacgac gtgcccgact 3060
acgcctgagc ggccgcttaa ttaagctgcc ttctgcgggg cttgccttct ggccatgccc 3120
ttcttctctc ccttgcacct gtacctcttg gtctttgaat aaagcctgag taggaagtct 3180
agaaaaaaaa aaaaaaaaaa aaaaaaaaaa aaaaaaaaaa aaaaaaaaaa aaaaaaaaaa 3240
aaaaaaaaaa aaaaaaaaaa aa 3262
<210> 223
<211> 3262
<212> DNA
<213> artificial sequence
<220>
<223> Description of Artificial Sequence: Synthetic polynucleotide
<400> 223
aggaaataag agagaaaaga agagtaagaa gaaatataag agccaccatg gcccccaaga 60
agaagcggaa ggtgggcggc agcggcggca gcggccagac cggcaagaag agcgagaagg 120
gccccgtgtg ctggcggaag cgggtgaaga gcgagtacat gcggctgcgg cagctgaagc 180
ggttccggcg ggccgacgag gtgaagagca tgttcagcag caaccggcag aagatcctgg 240
agcggaccga gatcctgaac caggagtgga agcagcggcg aatccagccc gtgcacatcc 300
tgaccagcgt gagcagcctg cggggcaccc gggagtgcag cgtgaccagc gacctggact 360
tccccaccca ggtgatcccc ctaaagaccc tgaacgccgt ggccagcgtg cccatcatgt 420
acagctggag ccccctgcag cagaacttca tggtggagga cgagaccgtg ctgcacaaca 480
tcccctacat gggcgacgag gtgctggacc aggacggcac cttcatcgag gagctgatca 540
agaactacga cggcaaggtg cacggcgacc gggagtgcgg cttcatcaac gacgagatct 600
tcgtggagct ggtgaacgcc ctgggccagt acaacgacga cgacgacgac gacgacggcg 660
acgaccccga ggagcgggag gagaagcaga aggacctgga ggaccaccgg gacgacaagg 720
agagccggcc cccccggaag ttccccagcg acaagatctt cgaggccatc agcagcatgt 780
tccccgacaa gggcaccgcc gaggagctga aggagaagta caaggagctg accgagcagc 840
agctgcccgg cgccctgccc cccgagtgca cccccaacat cgacggcccc aacgccaaga 900
gcgtgcagcg ggagcagagc ctgcacagct tccacaccct gttctgccgg cggtgcttca 960
agtacgactg cttcctgcac cccttccacg ccacccccaa cacctacaag cggaagaaca 1020
ccgagaccgc cctggacaac aagccctgcg gcccccagtg ctaccagcac ctggagggcg 1080
ccaaggagtt cgccgccgcc ctgaccgccg agcggatcaa gacccccccc aagcggcccg 1140
gcggccggcg gcggggccgg ctgcccaaca acagcagccg gcccagcacc cccaccatca 1200
acgtgctgga gagcaaggac accgacagcg accgggaggc cggcaccgag accggcggcg 1260
agaacaacga caaggaggag gaggagaaga aggacgagac cagcagcagc agcgaggcca 1320
acagccggtg ccagaccccc atcaagatga agcccaacat cgagcccccc gagaacgtgg 1380
agtggagcgg cgccgaggcc agcatgttcc gggtgctgat cggcacctac tacgacaact 1440
tctgcgccat cgcccggctg atcggcacca agacctgccg gcaggtgtac gagttccggg 1500
tgaaggagag cagcatcatc gcccccgccc ccgccgagga cgtggacacc cccccccgga 1560
agaagaagcg gaagcaccgg ctgtgggccg cccactgccg gaagatccag ctgaagaagg 1620
acggcagcag caaccacgtg tacaactacc agccctgcga ccacccccgg cagccctgcg 1680
acagcagctg cccctgcgtg atcgcccaga acttctgcga gaagttctgc cagtgcagca 1740
gcgagtgcca gaaccggttc cccggctgcc ggtgcaaggc ccagtgcaac accaagcagt 1800
gcccctgcta cctggccgtg cgggagtgcg accccgacct gtgcctgacc tgcggcgccg 1860
ccgaccactg ggacagcaag aacgtgagct gcaagaactg cagcatccag cggggcagca 1920
agaagcacct gctgctggcc cccagcgacg tggccggctg gggcatcttc atcaaggacc 1980
ccgtgcagaa gaacgagttc atcagcgagt actgcggcga gatcatcagc caggacgagg 2040
ccgaccggcg gggcaaggtg tacgacaagt acatgtgcag cttcctgttc aacctgaaca 2100
acgacttcgt ggtggacgcc acccggaagg gcaacaagat ccggttcgcc aaccacagcg 2160
tgaaccccaa ctgctacgcc aaggtgatga tggtgaacgg cgaccaccgg atcggcatct 2220
tcgccaagcg ggccatccag accggcgagg agctgttctt cgactaccgg tacagccagg 2280
ccgacgccct gaagtacgtg ggcatcgagc gggagatgga gatccccggc agcagcggat 2340
ccctggagcc cggcgaaaaa ccctataagt gccccgagtg cggcaagagc tttagcgatc 2400
ccggccatct ggtgaggcat cagaggaccc acaccggcga aaagccttac aaatgccccg 2460
agtgtggaaa aagcttcagc agaagcgatc atctgaccac ccatcagagg acacataccg 2520
gcgagaagcc ttataaatgc cccgaatgtg gaaagagctt ctccagaagc gaccatctga 2580
ccaaccacca gaggacccat accggagaaa aaccttacaa atgccccgag tgtggaaagt 2640
ccttcagctc ccccgccgat ctgacaagac atcagagaac ccacaccggc gaaaaacctt 2700
ataaatgtcc cgagtgtggc aaaagcttct ccgacaagaa ggatctgaca agacaccaaa 2760
ggacccacac cggcgagaaa ccttataaat gtcccgaatg cggaaaaagc tttagcagaa 2820
acgacgctct gaccgaacac cagagaacac ataccggaga gaaaccctat aaatgtcccg 2880
agtgcggaaa atccttcagc accaccggcg ctctgacaga gcatcagagg acacacaccg 2940
gcaaaaagac ctccgctagc ggcagcggcg gcggcagcgg cggcaagcgg cccgccgcca 3000
ccaagaaggc cggccaggcc aagaagaaga agggcagcta cccctacgac gtgcccgact 3060
acgcctgagc ggccgcttaa ttaagctgcc ttctgcgggg cttgccttct ggccatgccc 3120
ttcttctctc ccttgcacct gtacctcttg gtctttgaat aaagcctgag taggaagtct 3180
agaaaaaaaa aaaaaaaaaa aaaaaaaaaa aaaaaaaaaa aaaaaaaaaa aaaaaaaaaa 3240
aaaaaaaaaa aaaaaaaaaa aa 3262
<210> 224
<211> 1897
<212> DNA
<213> artificial sequence
<220>
<223> Description of Artificial Sequence: Synthetic polynucleotide
<400> 224
aggaaataag agagaaaaga agagtaagaa gaaatataag agccaccatg gcccccaaga 60
agaagcggaa ggtgggcatc cacggcgtgc ccgccgcctc gggcgggggt ggctcaggaa 120
atagggctat cagaaccgag aagatcatct gtagggacgt ggctagaggc tacgagaacg 180
tgcccattcc ttgcgtgaat ggcgtggatg gcgaaccttg ccccgaggac tacaaataca 240
tctccgagaa ctgcgaaacc agcacaatga acatcgacag aaacatcacc cacctccagc 300
actgcacatg tgtggatgac tgctcctcca gcaactgtct gtgcggccag ctctccatca 360
gatgctggta cgacaaggac ggcagactgc tgcaagagtt caacaagatc gaaccccctc 420
tcatcttcga gtgtaaccaa gcttgcagct gctggagaaa ctgcaagaat agagtggtcc 480
agagcggcat caaggtgaga ctgcaactgt acagaaccgc caagatggga tggggagtga 540
gggctctgca aaccattccc caaggcacct tcatctgcga atacgtgggc gaactgatct 600
ccgacgccga agctgacgtg agagaggacg acagctatct cttcgatctg gacaataagg 660
acggcgaggt gtactgcatc gacgctagat attacggcaa catctctaga ttcatcaacc 720
acctctgcga tcccaacatc attcccgtga gggtgttcat gctgcaccaa gatctgaggt 780
tccctagaat cgccttcttc agctctagag acatcagaac cggcgaggag ctgggcttcg 840
attacggcga tagattctgg gacatcaagt ccaagtactt cacatgccag tgcggcagcg 900
agaagtgtaa gcacagcgct gaggccattg ctctggagca gtctagactg gccagactgg 960
atggcagcag cggatccctg gagcccggag aaaagcctta caaatgcccc gagtgcggca 1020
agtccttcag ccagctggct catctgagag ctcatcaaag gacccacacc ggcgagaagc 1080
cctataagtg ccccgagtgc ggaaaatcct tctcccagag cagcaatctc gtcagacacc 1140
agaggaccca caccggcgag aaaccttaca agtgtcccga atgtggaaag tccttctccc 1200
aaaagagctc tctgatcgcc catcagagaa cacataccgg cgaaaaaccc tacaagtgcc 1260
ccgagtgtgg caaaagcttt tccaccaccg gcaatctgac cgtgcatcaa agaacccaca 1320
ccggcgaaaa accctacaaa tgccccgagt gtggcaaatc cttctccgac cccggccatc 1380
tggtgaggca ccagaggaca cacaccggcg agaaacctta taaatgtccc gaatgcggca 1440
agtcctttag caccagcggc tctctggtga gacatcagag gacacatacc ggcgaaaagc 1500
cttacaagtg tcccgagtgt ggcaaaagct tcagccagaa cagcacactg acagagcatc 1560
agagaaccca taccggcaaa aagaccagcg ctagcggcag cggcggcggc agcggcggca 1620
agcggcccgc cgccaccaag aaggccggcc aggccaagaa gaagaagggc agctacccct 1680
acgacgtgcc cgactacgcc tgagcggccg cttaattaag ctgccttctg cggggcttgc 1740
cttctggcca tgcccttctt ctctcccttg cacctgtacc tcttggtctt tgaataaagc 1800
ctgagtagga agtctagaaa aaaaaaaaaa aaaaaaaaaa aaaaaaaaaa aaaaaaaaaa 1860
aaaaaaaaaa aaaaaaaaaa aaaaaaaaaa aaaaaaa 1897
<210> 225
<211> 1897
<212> DNA
<213> artificial sequence
<220>
<223> Description of Artificial Sequence: Synthetic polynucleotide
<400> 225
aggaaataag agagaaaaga agagtaagaa gaaatataag agccaccatg gcccccaaga 60
agaagcggaa ggtgggcatc cacggcgtgc ccgccgcctc gggcgggggt ggctcaggaa 120
atagggctat cagaaccgag aagatcatct gtagggacgt ggctagaggc tacgagaacg 180
tgcccattcc ttgcgtgaat ggcgtggatg gcgaaccttg ccccgaggac tacaaataca 240
tctccgagaa ctgcgaaacc agcacaatga acatcgacag aaacatcacc cacctccagc 300
actgcacatg tgtggatgac tgctcctcca gcaactgtct gtgcggccag ctctccatca 360
gatgctggta cgacaaggac ggcagactgc tgcaagagtt caacaagatc gaaccccctc 420
tcatcttcga gtgtaaccaa gcttgcagct gctggagaaa ctgcaagaat agagtggtcc 480
agagcggcat caaggtgaga ctgcaactgt acagaaccgc caagatggga tggggagtga 540
gggctctgca aaccattccc caaggcacct tcatctgcga atacgtgggc gaactgatct 600
ccgacgccga agctgacgtg agagaggacg acagctatct cttcgatctg gacaataagg 660
acggcgaggt gtactgcatc gacgctagat attacggcaa catctctaga ttcatcaacc 720
acctctgcga tcccaacatc attcccgtga gggtgttcat gctgcaccaa gatctgaggt 780
tccctagaat cgccttcttc agctctagag acatcagaac cggcgaggag ctgggcttcg 840
attacggcga tagattctgg gacatcaagt ccaagtactt cacatgccag tgcggcagcg 900
agaagtgtaa gcacagcgct gaggccattg ctctggagca gtctagactg gccagactgg 960
atggcagcag cggatccctg gagcccggcg aaaagcctta caagtgcccc gagtgtggca 1020
aatcctttag caccaccgga aatctgaccg tccaccagag aacacatacc ggcgagaaac 1080
cctacaagtg tcccgagtgc ggcaaatcct tcagccagct ggcccatctg agagcccatc 1140
aaaggaccca taccggcgag aaaccttaca agtgtcccga atgtggaaag tcctttagca 1200
gccccgccga tctgacaaga catcaaagaa cccacaccgg cgagaagccc tataaatgtc 1260
ccgagtgtgg aaagtccttc agccagagcg gcaatctgac cgagcatcaa agaacccata 1320
ccggcgaaaa gccctataag tgccccgaat gcggaaaaag cttctccaca agcggcgagc 1380
tggtgagaca ccaaaggaca cataccggcg aaaagcctta taaatgcccc gagtgcggca 1440
agagcttctc tagaaaggac aatctgaaga accaccaaag aacacacacc ggcgagaagc 1500
cctacaaatg ccccgagtgc ggcaagagct ttagccagtc cagcaacctc gtgagacatc 1560
agaggacaca taccggaaaa aagaccagcg ctagcggcag cggcggcggc agcggcggca 1620
agcggcccgc cgccaccaag aaggccggcc aggccaagaa gaagaagggc agctacccct 1680
acgacgtgcc cgactacgcc tgagcggccg cttaattaag ctgccttctg cggggcttgc 1740
cttctggcca tgcccttctt ctctcccttg cacctgtacc tcttggtctt tgaataaagc 1800
ctgagtagga agtctagaaa aaaaaaaaaa aaaaaaaaaa aaaaaaaaaa aaaaaaaaaa 1860
aaaaaaaaaa aaaaaaaaaa aaaaaaaaaa aaaaaaa 1897
<210> 226
<211> 2179
<212> DNA
<213> artificial sequence
<220>
<223> Description of Artificial Sequence: Synthetic polynucleotide
<400> 226
aggaaataag agagaaaaga agagtaagaa gaaatataag agccaccatg gcccccaaga 60
agaagcggaa ggtgggcatc cacggcgtgc ccgccgccgg cagcagcgga tccctggagc 120
ccggagaaaa gccttacaaa tgccccgagt gcggcaagtc cttcagccag ctggctcatc 180
tgagagctca tcaaaggacc cacaccggcg agaagcccta taagtgcccc gagtgcggaa 240
aatccttctc ccagagcagc aatctcgtca gacaccagag gacccacacc ggcgagaaac 300
cttacaagtg tcccgaatgt ggaaagtcct tctcccaaaa gagctctctg atcgcccatc 360
agagaacaca taccggcgaa aaaccctaca agtgccccga gtgtggcaaa agcttttcca 420
ccaccggcaa tctgaccgtg catcaaagaa cccacaccgg cgaaaaaccc tacaaatgcc 480
ccgagtgtgg caaatccttc tccgaccccg gccatctggt gaggcaccag aggacacaca 540
ccggcgagaa accttataaa tgtcccgaat gcggcaagtc ctttagcacc agcggctctc 600
tggtgagaca tcagaggaca cataccggcg aaaagcctta caagtgtccc gagtgtggca 660
aaagcttcag ccagaacagc acactgacag agcatcagag aacccatacc ggcaaaaaga 720
ccagcgctag cggcagcggc ggcggcagcg gcggcgagga gcccgaggag cccgccgata 780
gcggacaatc tctggtgccc gtctacatct acagccccga atatgtgagc atgtgtgatt 840
ccctcgccaa gatccctaag agagccagca tggtgcattc tctgatcgag gcctacgctc 900
tgcataagca aatgaggatc gtgaagccca aggtcgccag catggaagag atggccacct 960
ttcacaccga tgcctacctc caacatctcc agaaggtgtc ccaagagggc gacgacgacc 1020
accccgactc cattgagtac ggactgggct atgattgccc cgccaccgag ggcatctttg 1080
actatgccgc cgctatcggc ggagctacca tcacagccgc ccagtgtctg attgatggca 1140
tgtgcaaggt cgccatcaac tggtccggag gctggcatca tgccaagaag gatgaggcct 1200
ccggcttctg ttatctgaat gacgccgtgc tgggcattct gagactgagg aggaaattcg 1260
agaggattct gtacgtggat ctggatctgc atcacggaga tggagtcgaa gatgccttca 1320
gcttcaccag caaggtgatg acagtctctc tgcacaagtt ctcccccggc ttctttcccg 1380
gaaccggcga cgtgtccgac gtgggactgg gcaagggaag gtactacagc gtgaacgtgc 1440
ccattcaaga cggcatccaa gacgagaagt actaccagat ctgcgagtcc gtgctcaagg 1500
aggtctacca agccttcaat cctaaggctg tcgtgctcca actgggagct gataccattg 1560
ctggcgatcc catgtgcagc ttcaatatga cacccgtcgg aatcggcaag tgcctcaagt 1620
acatcctcca gtggcagctc gccaccctca ttctcggagg aggcggatac aatctggcta 1680
ataccgccag atgctggacc tatctgaccg gcgtgattct gggcaaaaca ctgagcagcg 1740
aaatccccga ccacgagttt ttcaccgctt acggccccga ctacgtgctg gagatcaccc 1800
ccagctgcag acccgataga aacgaacccc atagaatcca gcaaattctg aactatatca 1860
agggcaacct caagcacgtc gtgggaggtg gcggatcggg aaagcggccc gccgccacca 1920
agaaggccgg tcaggccaag aagaagaagg gcagctaccc ctacgacgtg cccgactacg 1980
cctgagcggc cgcttaatta agctgccttc tgcggggctt gccttctggc catgcccttc 2040
ttctctccct tgcacctgta cctcttggtc tttgaataaa gcctgagtag gaagtctaga 2100
aaaaaaaaaa aaaaaaaaaa aaaaaaaaaa aaaaaaaaaa aaaaaaaaaa aaaaaaaaaa 2160
aaaaaaaaaa aaaaaaaaa 2179
<210> 227
<211> 1897
<212> DNA
<213> Artificial Sequence
<220>
<223> Description of Artificial Sequence: Synthetic polynucleotide
<400> 227
aggaaataag agagaaaaga agagtaagaa gaaatataag agccaccatg gcccccaaga 60
agaagcggaa ggtgggcatc cacggcgtgc ccgccgcctc gggcgggggt ggctcaggaa 120
atagggctat cagaaccgag aagatcatct gtagggacgt ggctagaggc tacgagaacg 180
tgcccattcc ttgcgtgaat ggcgtggatg gcgaaccttg ccccgaggac tacaaataca 240
tctccgagaa ctgcgaaacc agcacaatga acatcgacag aaacatcacc cacctccagc 300
actgcacatg tgtggatgac tgctcctcca gcaactgtct gtgcggccag ctctccatca 360
gatgctggta cgacaaggac ggcagactgc tgcaagagtt caacaagatc gaaccccctc 420
tcatcttcga gtgtaaccaa gcttgcagct gctggagaaa ctgcaagaat agagtggtcc 480
agagcggcat caaggtgaga ctgcaactgt acagaaccgc caagatggga tggggagtga 540
gggctctgca aaccattccc caaggcacct tcatctgcga atacgtgggc gaactgatct 600
ccgacgccga agctgacgtg agagaggacg acagctatct cttcgatctg gacaataagg 660
acggcgaggt gtactgcatc gacgctagat attacggcaa catctctaga ttcatcaacc 720
acctctgcga tcccaacatc attcccgtga gggtgttcat gctgcaccaa gatctgaggt 780
tccctagaat cgccttcttc agctctagag acatcagaac cggcgaggag ctgggcttcg 840
attacggcga tagattctgg gacatcaagt ccaagtactt cacatgccag tgcggcagcg 900
agaagtgtaa gcacagcgct gaggccattg ctctggagca gtctagactg gccagactgg 960
atggcagcag cggatccctg gagcccggcg aaaagcctta caagtgtccc gagtgcggaa 1020
agagcttcag cagagccgat aatctgaccg agcaccaaag gacccacacc ggagagaagc 1080
cttataagtg tcccgaatgc ggcaaaagct tttctagaag cgatcatctg accaaccacc 1140
agaggacaca caccggagaa aaaccttaca aatgccccga gtgcggcaaa agcttctccc 1200
agagcagcaa tctggtgaga caccaaagga cccacaccgg cgaaaaaccc tataaatgcc 1260
ccgaatgtgg caagagcttt agcacatccg gcgagctggt gaggcatcaa agaacacata 1320
ccggcgagaa gccctacaag tgccccgagt gtggaaaaag cttcagcacc cacctcgatc 1380
tgatcagaca ccagaggacc cataccggag agaaacccta caaatgtccc gagtgcggaa 1440
agtcctttag ccagctggcc catctgagag ctcatcaaag gacacacacc ggcgagaagc 1500
cttacaagtg tcccgagtgc ggaaaatcct tctcccaact ggcccatctg agggcccacc 1560
agagaaccca caccggcaaa aagacctccg ctagcggcag cggcggcggc agcggcggca 1620
agcggcccgc cgccaccaag aaggccggcc aggccaagaa gaagaagggc agctacccct 1680
acgacgtgcc cgactacgcc tgagcggccg cttaattaag ctgccttctg cggggcttgc 1740
cttctggcca tgcccttctt ctctcccttg cacctgtacc tcttggtctt tgaataaagc 1800
ctgagtagga agtctagaaa aaaaaaaaaa aaaaaaaaaa aaaaaaaaaa aaaaaaaaaa 1860
aaaaaaaaaa aaaaaaaaaa aaaaaaaaaa aaaaaaa 1897
<210> 228
<211> 1897
<212> DNA
<213> Artificial Sequence
<220>
<223> Description of Artificial Sequence: Synthetic polynucleotide
<400> 228
aggaaataag agagaaaaga agagtaagaa gaaatataag agccaccatg gcccccaaga 60
agaagcggaa ggtgggcatc cacggcgtgc ccgccgcctc gggcgggggt ggctcaggaa 120
atagggctat cagaaccgag aagatcatct gtagggacgt ggctagaggc tacgagaacg 180
tgcccattcc ttgcgtgaat ggcgtggatg gcgaaccttg ccccgaggac tacaaataca 240
tctccgagaa ctgcgaaacc agcacaatga acatcgacag aaacatcacc cacctccagc 300
actgcacatg tgtggatgac tgctcctcca gcaactgtct gtgcggccag ctctccatca 360
gatgctggta cgacaaggac ggcagactgc tgcaagagtt caacaagatc gaaccccctc 420
tcatcttcga gtgtaaccaa gcttgcagct gctggagaaa ctgcaagaat agagtggtcc 480
agagcggcat caaggtgaga ctgcaactgt acagaaccgc caagatggga tggggagtga 540
gggctctgca aaccattccc caaggcacct tcatctgcga atacgtgggc gaactgatct 600
ccgacgccga agctgacgtg agagaggacg acagctatct cttcgatctg gacaataagg 660
acggcgaggt gtactgcatc gacgctagat attacggcaa catctctaga ttcatcaacc 720
acctctgcga tcccaacatc attcccgtga gggtgttcat gctgcaccaa gatctgaggt 780
tccctagaat cgccttcttc agctctagag acatcagaac cggcgaggag ctgggcttcg 840
attacggcga tagattctgg gacatcaagt ccaagtactt cacatgccag tgcggcagcg 900
agaagtgtaa gcacagcgct gaggccattg ctctggagca gtctagactg gccagactgg 960
atggcagcag cggatccctg gagcccggcg aaaaacccta taaatgcccc gagtgtggca 1020
agagcttttc cgaccccgga cacctcgtga ggcatcagag aacacatacc ggcgagaaac 1080
cctacaagtg ccccgaatgc ggcaaatcct tctctagaaa ggacaatctg aaaaaccatc 1140
aaagaaccca taccggcgag aagccctata aatgtcccga gtgtggaaag agcttcagcc 1200
acaagaacgc tctgcagaac catcagagga cccataccgg cgaaaagcct tataagtgcc 1260
ccgagtgcgg aaaatccttt tctagaaggg acgagctgaa tgtgcaccaa aggacacata 1320
ccggagagaa accctacaaa tgccccgagt gcggcaagtc cttcagcacc tccggcaatc 1380
tggtgaggca ccaaaggaca cacaccggcg aaaaacctta caagtgtccc gagtgcggaa 1440
aaagcttttc ccagaacagc acactgaccg aacaccaaag gacccacacc ggagagaaac 1500
cttataaatg tcccgagtgt ggaaagtcct ttagccagtc cggcaatctg acagagcatc 1560
aaagaaccca caccggcaaa aagacctccg ctagcggcag cggcggcggc agcggcggca 1620
agcggcccgc cgccaccaag aaggccggcc aggccaagaa gaagaagggc agctacccct 1680
acgacgtgcc cgactacgcc tgagcggccg cttaattaag ctgccttctg cggggcttgc 1740
cttctggcca tgcccttctt ctctcccttg cacctgtacc tcttggtctt tgaataaagc 1800
ctgagtagga agtctagaaa aaaaaaaaaa aaaaaaaaaa aaaaaaaaaa aaaaaaaaaa 1860
aaaaaaaaaa aaaaaaaaaa aaaaaaaaaa aaaaaaa 1897
<210> 229
<211> 1897
<212> DNA
<213> Artificial Sequence
<220>
<223> Description of Artificial Sequence: Synthetic polynucleotide
<400> 229
aggaaataag agagaaaaga agagtaagaa gaaatataag agccaccatg gcccccaaga 60
agaagcggaa ggtgggcatc cacggcgtgc ccgccgcctc gggcgggggt ggctcaggaa 120
atagggctat cagaaccgag aagatcatct gtagggacgt ggctagaggc tacgagaacg 180
tgcccattcc ttgcgtgaat ggcgtggatg gcgaaccttg ccccgaggac tacaaataca 240
tctccgagaa ctgcgaaacc agcacaatga acatcgacag aaacatcacc cacctccagc 300
actgcacatg tgtggatgac tgctcctcca gcaactgtct gtgcggccag ctctccatca 360
gatgctggta cgacaaggac ggcagactgc tgcaagagtt caacaagatc gaaccccctc 420
tcatcttcga gtgtaaccaa gcttgcagct gctggagaaa ctgcaagaat agagtggtcc 480
agagcggcat caaggtgaga ctgcaactgt acagaaccgc caagatggga tggggagtga 540
gggctctgca aaccattccc caaggcacct tcatctgcga atacgtgggc gaactgatct 600
ccgacgccga agctgacgtg agagaggacg acagctatct cttcgatctg gacaataagg 660
acggcgaggt gtactgcatc gacgctagat attacggcaa catctctaga ttcatcaacc 720
acctctgcga tcccaacatc attcccgtga gggtgttcat gctgcaccaa gatctgaggt 780
tccctagaat cgccttcttc agctctagag acatcagaac cggcgaggag ctgggcttcg 840
attacggcga tagattctgg gacatcaagt ccaagtactt cacatgccag tgcggcagcg 900
agaagtgtaa gcacagcgct gaggccattg ctctggagca gtctagactg gccagactgg 960
atggcagcag cggatccctg gagcccggcg aaaaacccta taagtgcccc gagtgcggca 1020
agagctttag cgatcccggc catctggtga ggcatcagag gacccacacc ggcgaaaagc 1080
cttacaaatg ccccgagtgt ggaaaaagct tcagcagaag cgatcatctg accacccatc 1140
agaggacaca taccggcgag aagccttata aatgccccga atgtggaaag agcttctcca 1200
gaagcgacca tctgaccaac caccagagga cccataccgg agaaaaacct tacaaatgcc 1260
ccgagtgtgg aaagtccttc agctcccccg ccgatctgac aagacatcag agaacccaca 1320
ccggcgaaaa accttataaa tgtcccgagt gtggcaaaag cttctccgac aagaaggatc 1380
tgacaagaca ccaaaggacc cacaccggcg agaaacctta taaatgtccc gaatgcggaa 1440
aaagctttag cagaaacgac gctctgaccg aacaccagag aacacatacc ggagagaaac 1500
cctataaatg tcccgagtgc ggaaaatcct tcagcaccac cggcgctctg acagagcatc 1560
agaggacaca caccggcaaa aagacctccg ctagcggcag cggcggcggc agcggcggca 1620
agcggcccgc cgccaccaag aaggccggcc aggccaagaa gaagaagggc agctacccct 1680
acgacgtgcc cgactacgcc tgagcggccg cttaattaag ctgccttctg cggggcttgc 1740
cttctggcca tgcccttctt ctctcccttg cacctgtacc tcttggtctt tgaataaagc 1800
ctgagtagga agtctagaaa aaaaaaaaaa aaaaaaaaaa aaaaaaaaaa aaaaaaaaaa 1860
aaaaaaaaaa aaaaaaaaaa aaaaaaaaaa aaaaaaa 1897
<210> 230
<211> 1897
<212> DNA
<213> Artificial Sequence
<220>
<223> Description of Artificial Sequence: Synthetic polynucleotide
<400> 230
aggaaataag agagaaaaga agagtaagaa gaaatataag agccaccatg gcccccaaga 60
agaagcggaa ggtgggcatc cacggcgtgc ccgccgcctc gggcgggggt ggctcaggaa 120
atagggctat cagaaccgag aagatcatct gtagggacgt ggctagaggc tacgagaacg 180
tgcccattcc ttgcgtgaat ggcgtggatg gcgaaccttg ccccgaggac tacaaataca 240
tctccgagaa ctgcgaaacc agcacaatga acatcgacag aaacatcacc cacctccagc 300
actgcacatg tgtggatgac tgctcctcca gcaactgtct gtgcggccag ctctccatca 360
gatgctggta cgacaaggac ggcagactgc tgcaagagtt caacaagatc gaaccccctc 420
tcatcttcga gtgtaaccaa gcttgcagct gctggagaaa ctgcaagaat agagtggtcc 480
agagcggcat caaggtgaga ctgcaactgt acagaaccgc caagatggga tggggagtga 540
gggctctgca aaccattccc caaggcacct tcatctgcga atacgtgggc gaactgatct 600
ccgacgccga agctgacgtg agagaggacg acagctatct cttcgatctg gacaataagg 660
acggcgaggt gtactgcatc gacgctagat attacggcaa catctctaga ttcatcaacc 720
acctctgcga tcccaacatc attcccgtga gggtgttcat gctgcaccaa gatctgaggt 780
tccctagaat cgccttcttc agctctagag acatcagaac cggcgaggag ctgggcttcg 840
attacggcga tagattctgg gacatcaagt ccaagtactt cacatgccag tgcggcagcg 900
agaagtgtaa gcacagcgct gaggccattg ctctggagca gtctagactg gccagactgg 960
atggcagcag cggatccctg gagcccggcg aaaagcctta caaatgtccc gaatgcggaa 1020
agagcttcag cagagccgac aatctgaccg aacatcagag aacccatacc ggagaaaaac 1080
cttacaaatg tcccgagtgc ggcaaaagct tctcccaagc cggacatctg gccagccacc 1140
aaaggacaca taccggcgag aaaccctaca agtgccccga gtgcggcaag tccttctcta 1200
gatccgatga gctggtcaga catcagagaa cccataccgg cgagaagcct tataagtgcc 1260
ccgaatgtgg caagtccttc agccagagag ctcatctgga gaggcatcaa agaacacaca 1320
ccggagagaa accttacaag tgtcccgagt gtggaaagag cttctccaga agggacgagc 1380
tgaacgtcca ccaaagaacc cataccggcg aaaagcccta taaatgcccc gagtgtggaa 1440
aatccttttc tagatccgac catctgacaa cccaccagag gacccatacc ggagagaagc 1500
cctacaaatg ccccgagtgt ggaaaaagct tctctagaaa cgatgctctg acagagcacc 1560
aaaggaccca caccggcaaa aagaccagcg ctagcggcag cggcggcggc agcggcggca 1620
agcggcccgc cgccaccaag aaggccggcc aggccaagaa gaagaagggc agctacccct 1680
acgacgtgcc cgactacgcc tgagcggccg cttaattaag ctgccttctg cggggcttgc 1740
cttctggcca tgcccttctt ctctcccttg cacctgtacc tcttggtctt tgaataaagc 1800
ctgagtagga agtctagaaa aaaaaaaaaa aaaaaaaaaa aaaaaaaaaa aaaaaaaaaa 1860
aaaaaaaaaa aaaaaaaaaa aaaaaaaaaa aaaaaaa 1897
<210> 231
<211> 2668
<212> DNA
<213> Artificial Sequence
<220>
<223> Description of Artificial Sequence: Synthetic polynucleotide
<400> 231
aggaaataag agagaaaaga agagtaagaa gaaatataag agccaccatg gcccccaaga 60
agaagcggaa ggtgggcatc cacggcgtgc ccgccgccgg cagcagcgga tccctggagc 120
ccggcgaaaa accctataag tgccccgagt gcggcaagag ctttagcgat cccggccatc 180
tggtgaggca tcagaggacc cacaccggcg aaaagcctta caaatgcccc gagtgtggaa 240
aaagcttcag cagaagcgat catctgacca cccatcagag gacacatacc ggcgagaagc 300
cttataaatg ccccgaatgt ggaaagagct tctccagaag cgaccatctg accaaccacc 360
agaggaccca taccggagaa aaaccttaca aatgccccga gtgtggaaag tccttcagct 420
cccccgccga tctgacaaga catcagagaa cccacaccgg cgaaaaacct tataaatgtc 480
ccgagtgtgg caaaagcttc tccgacaaga aggatctgac aagacaccaa aggacccaca 540
ccggcgagaa accttataaa tgtcccgaat gcggaaaaag ctttagcaga aacgacgctc 600
tgaccgaaca ccagagaaca cataccggag agaaacccta taaatgtccc gagtgcggaa 660
aatccttcag caccaccggc gctctgacag agcatcagag gacacacacc ggcaaaaaga 720
cctccgctag cggcagcggc ggcggcagcg gcggcaacca cgaccaggag ttcgaccccc 780
ccaaggtgta cccccccgtg cccgccgaga agcggaagcc catccgggtg ctgagcctgt 840
tcgacggcat cgccaccggc ctgctggtgc tgaaggacct gggcatccag gtggaccggt 900
acatcgccag cgaggtgtgc gaggacagca tcaccgtggg catggtgcgg caccagggca 960
agatcatgta cgtgggcgac gtgcggagcg tgacccagaa gcacatccag gagtggggcc 1020
ccttcgacct ggtgatcggc ggcagcccct gcaacgacct gagcatcgtg aaccccgccc 1080
ggaagggcct gtacgagggc accggccggc tgttcttcga gttctaccgg ctgctgcacg 1140
acgcccggcc caaggagggc gacgaccggc ccttcttctg gctgttcgag aacgtggtgg 1200
ccatgggcgt gagcgacaag cgggacatca gccggttcct ggagagcaac cccgtgatga 1260
tcgacgccaa ggaggtgagc gccgcccacc gggcccggta cttctggggc aacctgcccg 1320
gcatgaaccg gcccctggcc agcaccgtga acgacaagct ggagctgcag gagtgcctgg 1380
agcacggccg gatcgccaag ttcagcaagg tgcggaccat caccacccgg agcaacagca 1440
tcaagcaggg caaggaccag cacttccccg tgttcatgaa cgagaaggag gacatcctgt 1500
ggtgcaccga gatggagcgg gtgttcggct tccccgtgca ctacaccgac gtgagcaaca 1560
tgagccggct ggcccggcag cggctgctgg gccggagctg gagcgtgccc gtgatccggc 1620
acctgttcgc ccccctgaag gagtacttcg cctgcgtgag cagcggcaac agcaacgcca 1680
acagccgggg ccccagcttc agcagcggcc tggtgcccct gagcctgcgg ggcagccaca 1740
tgaatcctct ggagatgttc gagacagtgc ccgtgtggag aaggcaaccc gtgagggtgc 1800
tgagcctctt cgaggacatt aagaaggagc tgacctctct gggctttctg gaatccggca 1860
gcgaccccgg ccagctgaaa cacgtggtgg acgtgaccga cacagtgagg aaggacgtgg 1920
aagagtgggg cccctttgac ctcgtgtatg gagccacacc tcctctcggc cacacatgcg 1980
ataggcctcc cagctggtat ctcttccagt tccacagact gctccagtac gccagaccta 2040
agcccggcag ccccagaccc ttcttctgga tgttcgtgga caatctggtg ctgaacaagg 2100
aggatctgga tgtggccagc agatttctgg agatggaacc cgtgacaatc cccgacgtgc 2160
atggcggctc tctgcagaac gccgtgagag tgtggtccaa catccccgcc attagaagca 2220
gacactgggc tctggtgagc gaggaggaac tgtctctgct ggcccagaat aagcagtcct 2280
ccaagctggc cgccaagtgg cccaccaagc tggtgaagaa ctgctttctg cctctgaggg 2340
agtatttcaa gtatttcagc accgaactga ccagcagcct gagcggcggc aagcggcccg 2400
ccgccaccaa gaaggccggc caggccaaga agaagaaggg cagctacccc tacgacgtgc 2460
ccgactacgc ctgagcggcc gcttaattaa gctgccttct gcggggcttg ccttctggcc 2520
atgcccttct tctctccctt gcacctgtac ctcttggtct ttgaataaag cctgagtagg 2580
aagtctagaa aaaaaaaaaa aaaaaaaaaa aaaaaaaaaa aaaaaaaaaa aaaaaaaaaa 2640
aaaaaaaaaa aaaaaaaaaa aaaaaaaa 2668
<210> 232
<211> 2179
<212> DNA
<213> Artificial Sequence
<220>
<223> Description of Artificial Sequence: Synthetic polynucleotide
<400> 232
aggaaataag agagaaaaga agagtaagaa gaaatataag agccaccatg gcccccaaga 60
agaagcggaa ggtgggcatc cacggcgtgc ccgccgccgg cagcagcgga tccctggagc 120
ccggcgaaaa gccttacaag tgtcccgagt gcggaaagag cttcagcaga gccgataatc 180
tgaccgagca ccaaaggacc cacaccggag agaagcctta taagtgtccc gaatgcggca 240
aaagcttttc tagaagcgat catctgacca accaccagag gacacacacc ggagaaaaac 300
cttacaaatg ccccgagtgc ggcaaaagct tctcccagag cagcaatctg gtgagacacc 360
aaaggaccca caccggcgaa aaaccctata aatgccccga atgtggcaag agctttagca 420
catccggcga gctggtgagg catcaaagaa cacataccgg cgagaagccc tacaagtgcc 480
ccgagtgtgg aaaaagcttc agcacccacc tcgatctgat cagacaccag aggacccata 540
ccggagagaa accctacaaa tgtcccgagt gcggaaagtc ctttagccag ctggcccatc 600
tgagagctca tcaaaggaca cacaccggcg agaagcctta caagtgtccc gagtgcggaa 660
aatccttctc ccaactggcc catctgaggg cccaccagag aacccacacc ggcaaaaaga 720
cctccgctag cggcagcggc ggcggcagcg gcggcgagga gcccgaggag cccgccgata 780
gcggacaatc tctggtgccc gtctacatct acagccccga atatgtgagc atgtgtgatt 840
ccctcgccaa gatccctaag agagccagca tggtgcattc tctgatcgag gcctacgctc 900
tgcataagca aatgaggatc gtgaagccca aggtcgccag catggaagag atggccacct 960
ttcacaccga tgcctacctc caacatctcc agaaggtgtc ccaagagggc gacgacgacc 1020
accccgactc cattgagtac ggactgggct atgattgccc cgccaccgag ggcatctttg 1080
actatgccgc cgctatcggc ggagctacca tcacagccgc ccagtgtctg attgatggca 1140
tgtgcaaggt cgccatcaac tggtccggag gctggcatca tgccaagaag gatgaggcct 1200
ccggcttctg ttatctgaat gacgccgtgc tgggcattct gagactgagg aggaaattcg 1260
agaggattct gtacgtggat ctggatctgc atcacggaga tggagtcgaa gatgccttca 1320
gcttcaccag caaggtgatg acagtctctc tgcacaagtt ctcccccggc ttctttcccg 1380
gaaccggcga cgtgtccgac gtgggactgg gcaagggaag gtactacagc gtgaacgtgc 1440
ccattcaaga cggcatccaa gacgagaagt actaccagat ctgcgagtcc gtgctcaagg 1500
aggtctacca agccttcaat cctaaggctg tcgtgctcca actgggagct gataccattg 1560
ctggcgatcc catgtgcagc ttcaatatga cacccgtcgg aatcggcaag tgcctcaagt 1620
acatcctcca gtggcagctc gccaccctca ttctcggagg aggcggatac aatctggcta 1680
ataccgccag atgctggacc tatctgaccg gcgtgattct gggcaaaaca ctgagcagcg 1740
aaatccccga ccacgagttt ttcaccgctt acggccccga ctacgtgctg gagatcaccc 1800
ccagctgcag acccgataga aacgaacccc atagaatcca gcaaattctg aactatatca 1860
agggcaacct caagcacgtc gtgggaggtg gcggatcggg aaagcggccc gccgccacca 1920
agaaggccgg tcaggccaag aagaagaagg gcagctaccc ctacgacgtg cccgactacg 1980
cctgagcggc cgcttaatta agctgccttc tgcggggctt gccttctggc catgcccttc 2040
ttctctccct tgcacctgta cctcttggtc tttgaataaa gcctgagtag gaagtctaga 2100
aaaaaaaaaa aaaaaaaaaa aaaaaaaaaa aaaaaaaaaa aaaaaaaaaa aaaaaaaaaa 2160
aaaaaaaaaa aaaaaaaaa 2179
<210> 233
<211> 3262
<212> DNA
<213> Artificial Sequence
<220>
<223> Description of Artificial Sequence: Synthetic polynucleotide
<400> 233
aggaaataag agagaaaaga agagtaagaa gaaatataag agccaccatg gcccccaaga 60
agaagcggaa ggtgggcggc agcggcggca gcggccagac cggcaagaag agcgagaagg 120
gccccgtgtg ctggcggaag cgggtgaaga gcgagtacat gcggctgcgg cagctgaagc 180
ggttccggcg ggccgacgag gtgaagagca tgttcagcag caaccggcag aagatcctgg 240
agcggaccga gatcctgaac caggagtgga agcagcggcg aatccagccc gtgcacatcc 300
tgaccagcgt gagcagcctg cggggcaccc gggagtgcag cgtgaccagc gacctggact 360
tccccaccca ggtgatcccc ctaaagaccc tgaacgccgt ggccagcgtg cccatcatgt 420
acagctggag ccccctgcag cagaacttca tggtggagga cgagaccgtg ctgcacaaca 480
tcccctacat gggcgacgag gtgctggacc aggacggcac cttcatcgag gagctgatca 540
agaactacga cggcaaggtg cacggcgacc gggagtgcgg cttcatcaac gacgagatct 600
tcgtggagct ggtgaacgcc ctgggccagt acaacgacga cgacgacgac gacgacggcg 660
acgaccccga ggagcgggag gagaagcaga aggacctgga ggaccaccgg gacgacaagg 720
agagccggcc cccccggaag ttccccagcg acaagatctt cgaggccatc agcagcatgt 780
tccccgacaa gggcaccgcc gaggagctga aggagaagta caaggagctg accgagcagc 840
agctgcccgg cgccctgccc cccgagtgca cccccaacat cgacggcccc aacgccaaga 900
gcgtgcagcg ggagcagagc ctgcacagct tccacaccct gttctgccgg cggtgcttca 960
agtacgactg cttcctgcac cccttccacg ccacccccaa cacctacaag cggaagaaca 1020
ccgagaccgc cctggacaac aagccctgcg gcccccagtg ctaccagcac ctggagggcg 1080
ccaaggagtt cgccgccgcc ctgaccgccg agcggatcaa gacccccccc aagcggcccg 1140
gcggccggcg gcggggccgg ctgcccaaca acagcagccg gcccagcacc cccaccatca 1200
acgtgctgga gagcaaggac accgacagcg accgggaggc cggcaccgag accggcggcg 1260
agaacaacga caaggaggag gaggagaaga aggacgagac cagcagcagc agcgaggcca 1320
acagccggtg ccagaccccc atcaagatga agcccaacat cgagcccccc gagaacgtgg 1380
agtggagcgg cgccgaggcc agcatgttcc gggtgctgat cggcacctac tacgacaact 1440
tctgcgccat cgcccggctg atcggcacca agacctgccg gcaggtgtac gagttccggg 1500
tgaaggagag cagcatcatc gcccccgccc ccgccgagga cgtggacacc cccccccgga 1560
agaagaagcg gaagcaccgg ctgtgggccg cccactgccg gaagatccag ctgaagaagg 1620
acggcagcag caaccacgtg tacaactacc agccctgcga ccacccccgg cagccctgcg 1680
acagcagctg cccctgcgtg atcgcccaga acttctgcga gaagttctgc cagtgcagca 1740
gcgagtgcca gaaccggttc cccggctgcc ggtgcaaggc ccagtgcaac accaagcagt 1800
gcccctgcta cctggccgtg cgggagtgcg accccgacct gtgcctgacc tgcggcgccg 1860
ccgaccactg ggacagcaag aacgtgagct gcaagaactg cagcatccag cggggcagca 1920
agaagcacct gctgctggcc cccagcgacg tggccggctg gggcatcttc atcaaggacc 1980
ccgtgcagaa gaacgagttc atcagcgagt actgcggcga gatcatcagc caggacgagg 2040
ccgaccggcg gggcaaggtg tacgacaagt acatgtgcag cttcctgttc aacctgaaca 2100
acgacttcgt ggtggacgcc acccggaagg gcaacaagat ccggttcgcc aaccacagcg 2160
tgaaccccaa ctgctacgcc aaggtgatga tggtgaacgg cgaccaccgg atcggcatct 2220
tcgccaagcg ggccatccag accggcgagg agctgttctt cgactaccgg tacagccagg 2280
ccgacgccct gaagtacgtg ggcatcgagc gggagatgga gatccccggc agcagcggat 2340
ccctggagcc cggagaaaag ccttacaaat gccccgagtg cggcaagtcc ttcagccagc 2400
tggctcatct gagagctcat caaaggaccc acaccggcga gaagccctat aagtgccccg 2460
agtgcggaaa atccttctcc cagagcagca atctcgtcag acaccagagg acccacaccg 2520
gcgagaaacc ttacaagtgt cccgaatgtg gaaagtcctt ctcccaaaag agctctctga 2580
tcgcccatca gagaacacat accggcgaaa aaccctacaa gtgccccgag tgtggcaaaa 2640
gcttttccac caccggcaat ctgaccgtgc atcaaagaac ccacaccggc gaaaaaccct 2700
acaaatgccc cgagtgtggc aaatccttct ccgaccccgg ccatctggtg aggcaccaga 2760
ggacacacac cggcgagaaa ccttataaat gtcccgaatg cggcaagtcc tttagcacca 2820
gcggctctct ggtgagacat cagaggacac ataccggcga aaagccttac aagtgtcccg 2880
agtgtggcaa aagcttcagc cagaacagca cactgacaga gcatcagaga acccataccg 2940
gcaaaaagac cagcgctagc ggcagcggcg gcggcagcgg cggcaagcgg cccgccgcca 3000
ccaagaaggc cggccaggcc aagaagaaga agggcagcta cccctacgac gtgcccgact 3060
acgcctgagc ggccgcttaa ttaagctgcc ttctgcgggg cttgccttct ggccatgccc 3120
ttcttctctc ccttgcacct gtacctcttg gtctttgaat aaagcctgag taggaagtct 3180
agaaaaaaaa aaaaaaaaaa aaaaaaaaaa aaaaaaaaaa aaaaaaaaaa aaaaaaaaaa 3240
aaaaaaaaaa aaaaaaaaaa aa 3262
<210> 234
<211> 3262
<212> DNA
<213> Artificial Sequence
<220>
<223> Description of Artificial Sequence: Synthetic polynucleotide
<400> 234
aggaaataag agagaaaaga agagtaagaa gaaatataag agccaccatg gcccccaaga 60
agaagcggaa ggtgggcggc agcggcggca gcggccagac cggcaagaag agcgagaagg 120
gccccgtgtg ctggcggaag cgggtgaaga gcgagtacat gcggctgcgg cagctgaagc 180
ggttccggcg ggccgacgag gtgaagagca tgttcagcag caaccggcag aagatcctgg 240
agcggaccga gatcctgaac caggagtgga agcagcggcg aatccagccc gtgcacatcc 300
tgaccagcgt gagcagcctg cggggcaccc gggagtgcag cgtgaccagc gacctggact 360
tccccaccca ggtgatcccc ctaaagaccc tgaacgccgt ggccagcgtg cccatcatgt 420
acagctggag ccccctgcag cagaacttca tggtggagga cgagaccgtg ctgcacaaca 480
tcccctacat gggcgacgag gtgctggacc aggacggcac cttcatcgag gagctgatca 540
agaactacga cggcaaggtg cacggcgacc gggagtgcgg cttcatcaac gacgagatct 600
tcgtggagct ggtgaacgcc ctgggccagt acaacgacga cgacgacgac gacgacggcg 660
acgaccccga ggagcgggag gagaagcaga aggacctgga ggaccaccgg gacgacaagg 720
agagccggcc cccccggaag ttccccagcg acaagatctt cgaggccatc agcagcatgt 780
tccccgacaa gggcaccgcc gaggagctga aggagaagta caaggagctg accgagcagc 840
agctgcccgg cgccctgccc cccgagtgca cccccaacat cgacggcccc aacgccaaga 900
gcgtgcagcg ggagcagagc ctgcacagct tccacaccct gttctgccgg cggtgcttca 960
agtacgactg cttcctgcac cccttccacg ccacccccaa cacctacaag cggaagaaca 1020
ccgagaccgc cctggacaac aagccctgcg gcccccagtg ctaccagcac ctggagggcg 1080
ccaaggagtt cgccgccgcc ctgaccgccg agcggatcaa gacccccccc aagcggcccg 1140
gcggccggcg gcggggccgg ctgcccaaca acagcagccg gcccagcacc cccaccatca 1200
acgtgctgga gagcaaggac accgacagcg accgggaggc cggcaccgag accggcggcg 1260
agaacaacga caaggaggag gaggagaaga aggacgagac cagcagcagc agcgaggcca 1320
acagccggtg ccagaccccc atcaagatga agcccaacat cgagcccccc gagaacgtgg 1380
agtggagcgg cgccgaggcc agcatgttcc gggtgctgat cggcacctac tacgacaact 1440
tctgcgccat cgcccggctg atcggcacca agacctgccg gcaggtgtac gagttccggg 1500
tgaaggagag cagcatcatc gcccccgccc ccgccgagga cgtggacacc cccccccgga 1560
agaagaagcg gaagcaccgg ctgtgggccg cccactgccg gaagatccag ctgaagaagg 1620
acggcagcag caaccacgtg tacaactacc agccctgcga ccacccccgg cagccctgcg 1680
acagcagctg cccctgcgtg atcgcccaga acttctgcga gaagttctgc cagtgcagca 1740
gcgagtgcca gaaccggttc cccggctgcc ggtgcaaggc ccagtgcaac accaagcagt 1800
gcccctgcta cctggccgtg cgggagtgcg accccgacct gtgcctgacc tgcggcgccg 1860
ccgaccactg ggacagcaag aacgtgagct gcaagaactg cagcatccag cggggcagca 1920
agaagcacct gctgctggcc cccagcgacg tggccggctg gggcatcttc atcaaggacc 1980
ccgtgcagaa gaacgagttc atcagcgagt actgcggcga gatcatcagc caggacgagg 2040
ccgaccggcg gggcaaggtg tacgacaagt acatgtgcag cttcctgttc aacctgaaca 2100
acgacttcgt ggtggacgcc acccggaagg gcaacaagat ccggttcgcc aaccacagcg 2160
tgaaccccaa ctgctacgcc aaggtgatga tggtgaacgg cgaccaccgg atcggcatct 2220
tcgccaagcg ggccatccag accggcgagg agctgttctt cgactaccgg tacagccagg 2280
ccgacgccct gaagtacgtg ggcatcgagc gggagatgga gatccccggc agcagcggat 2340
ccctggagcc cggcgaaaaa ccctataaat gccccgagtg tggcaagagc ttttccgacc 2400
ccggacatct cgtgaggcat cagagaacac ataccggcga gaaaccctac aagtgccccg 2460
aatgcggcaa atccttctct agaaaggaca atctgaaaaa ccatcaaaga acccataccg 2520
gcgagaagcc ctataaatgt cccgagtgtg gaaagagctt cagccacaag aacgctctgc 2580
agaaccatca gaggacccat accggcgaaa agccttataa gtgccccgag tgcggaaaat 2640
ccttttctag aagggacgag ctgaatgtgc accaaaggac acataccgga gagaaaccct 2700
acaaatgccc cgagtgcggc aagtccttca gcacctccgg caatctggtg aggcaccaaa 2760
ggacacacac cggcgaaaaa ccttacaagt gtcccgagtg cggaaaaagc ttttcccaga 2820
acagcacact gaccgaacac caaaggaccc acaccggaga gaaaccttat aaatgtcccg 2880
agtgtggaaa gtcctttagc cagtccggca atctgacaga gcatcaaaga acccacaccg 2940
gcaaaaagac ctccgctagc ggcagcggcg gcggcagcgg cggcaagcgg cccgccgcca 3000
ccaagaaggc cggccaggcc aagaagaaga agggcagcta cccctacgac gtgcccgact 3060
acgcctgagc ggccgcttaa ttaagctgcc ttctgcgggg cttgccttct ggccatgccc 3120
ttcttctctc ccttgcacct gtacctcttg gtctttgaat aaagcctgag taggaagtct 3180
agaaaaaaaa aaaaaaaaaa aaaaaaaaaa aaaaaaaaaa aaaaaaaaaa aaaaaaaaaa 3240
aaaaaaaaaa aaaaaaaaaa aa 3262
<210> 235
<211> 2179
<212> DNA
<213> Artificial Sequence
<220>
<223> Description of Artificial Sequence: Synthetic polynucleotide
<400> 235
aggaaataag agagaaaaga agagtaagaa gaaatataag agccaccatg gcccccaaga 60
agaagcggaa ggtgggcatc cacggcgtgc ccgccgccgg cagcagcgga tccctggagc 120
ccggcgaaaa accctataaa tgccccgagt gtggcaagag cttttccgac cccggacacc 180
tcgtgaggca tcagagaaca cataccggcg agaaacccta caagtgcccc gaatgcggca 240
aatccttctc tagaaaggac aatctgaaaa accatcaaag aacccatacc ggcgagaagc 300
cctataaatg tcccgagtgt ggaaagagct tcagccacaa gaacgctctg cagaaccatc 360
agaggaccca taccggcgaa aagccttata agtgccccga gtgcggaaaa tccttttcta 420
gaagggacga gctgaatgtg caccaaagga cacataccgg agagaaaccc tacaaatgcc 480
ccgagtgcgg caagtccttc agcacctccg gcaatctggt gaggcaccaa aggacacaca 540
ccggcgaaaa accttacaag tgtcccgagt gcggaaaaag cttttcccag aacagcacac 600
tgaccgaaca ccaaaggacc cacaccggag agaaacctta taaatgtccc gagtgtggaa 660
agtcctttag ccagtccggc aatctgacag agcatcaaag aacccacacc ggcaaaaaga 720
cctccgctag cggcagcggc ggcggcagcg gcggcgagga gcccgaggag cccgccgata 780
gcggacaatc tctggtgccc gtctacatct acagccccga atatgtgagc atgtgtgatt 840
ccctcgccaa gatccctaag agagccagca tggtgcattc tctgatcgag gcctacgctc 900
tgcataagca aatgaggatc gtgaagccca aggtcgccag catggaagag atggccacct 960
ttcacaccga tgcctacctc caacatctcc agaaggtgtc ccaagagggc gacgacgacc 1020
accccgactc cattgagtac ggactgggct atgattgccc cgccaccgag ggcatctttg 1080
actatgccgc cgctatcggc ggagctacca tcacagccgc ccagtgtctg attgatggca 1140
tgtgcaaggt cgccatcaac tggtccggag gctggcatca tgccaagaag gatgaggcct 1200
ccggcttctg ttatctgaat gacgccgtgc tgggcattct gagactgagg aggaaattcg 1260
agaggattct gtacgtggat ctggatctgc atcacggaga tggagtcgaa gatgccttca 1320
gcttcaccag caaggtgatg acagtctctc tgcacaagtt ctcccccggc ttctttcccg 1380
gaaccggcga cgtgtccgac gtgggactgg gcaagggaag gtactacagc gtgaacgtgc 1440
ccattcaaga cggcatccaa gacgagaagt actaccagat ctgcgagtcc gtgctcaagg 1500
aggtctacca agccttcaat cctaaggctg tcgtgctcca actgggagct gataccattg 1560
ctggcgatcc catgtgcagc ttcaatatga cacccgtcgg aatcggcaag tgcctcaagt 1620
acatcctcca gtggcagctc gccaccctca ttctcggagg aggcggatac aatctggcta 1680
ataccgccag atgctggacc tatctgaccg gcgtgattct gggcaaaaca ctgagcagcg 1740
aaatccccga ccacgagttt ttcaccgctt acggccccga ctacgtgctg gagatcaccc 1800
ccagctgcag acccgataga aacgaacccc atagaatcca gcaaattctg aactatatca 1860
agggcaacct caagcacgtc gtgggaggtg gcggatcggg aaagcggccc gccgccacca 1920
agaaggccgg tcaggccaag aagaagaagg gcagctaccc ctacgacgtg cccgactacg 1980
cctgagcggc cgcttaatta agctgccttc tgcggggctt gccttctggc catgcccttc 2040
ttctctccct tgcacctgta cctcttggtc tttgaataaa gcctgagtag gaagtctaga 2100
aaaaaaaaaa aaaaaaaaaa aaaaaaaaaa aaaaaaaaaa aaaaaaaaaa aaaaaaaaaa 2160
aaaaaaaaaa aaaaaaaaa 2179
<210> 236
<211> 2668
<212> DNA
<213> Artificial Sequence
<220>
<223> Description of Artificial Sequence: Synthetic polynucleotide
<400> 236
aggaaataag agagaaaaga agagtaagaa gaaatataag agccaccatg gcccccaaga 60
agaagcggaa ggtgggcatc cacggcgtgc ccgccgccgg cagcagcgga tccctggagc 120
ccggcgaaaa gccttacaag tgtcccgagt gcggaaagag cttcagcaga gccgataatc 180
tgaccgagca ccaaaggacc cacaccggag agaagcctta taagtgtccc gaatgcggca 240
aaagcttttc tagaagcgat catctgacca accaccagag gacacacacc ggagaaaaac 300
cttacaaatg ccccgagtgc ggcaaaagct tctcccagag cagcaatctg gtgagacacc 360
aaaggaccca caccggcgaa aaaccctata aatgccccga atgtggcaag agctttagca 420
catccggcga gctggtgagg catcaaagaa cacataccgg cgagaagccc tacaagtgcc 480
ccgagtgtgg aaaaagcttc agcacccacc tcgatctgat cagacaccag aggacccata 540
ccggagagaa accctacaaa tgtcccgagt gcggaaagtc ctttagccag ctggcccatc 600
tgagagctca tcaaaggaca cacaccggcg agaagcctta caagtgtccc gagtgcggaa 660
aatccttctc ccaactggcc catctgaggg cccaccagag aacccacacc ggcaaaaaga 720
cctccgctag cggcagcggc ggcggcagcg gcggcaacca cgaccaggag ttcgaccccc 780
ccaaggtgta cccccccgtg cccgccgaga agcggaagcc catccgggtg ctgagcctgt 840
tcgacggcat cgccaccggc ctgctggtgc tgaaggacct gggcatccag gtggaccggt 900
acatcgccag cgaggtgtgc gaggacagca tcaccgtggg catggtgcgg caccagggca 960
agatcatgta cgtgggcgac gtgcggagcg tgacccagaa gcacatccag gagtggggcc 1020
ccttcgacct ggtgatcggc ggcagcccct gcaacgacct gagcatcgtg aaccccgccc 1080
ggaagggcct gtacgagggc accggccggc tgttcttcga gttctaccgg ctgctgcacg 1140
acgcccggcc caaggagggc gacgaccggc ccttcttctg gctgttcgag aacgtggtgg 1200
ccatgggcgt gagcgacaag cgggacatca gccggttcct ggagagcaac cccgtgatga 1260
tcgacgccaa ggaggtgagc gccgcccacc gggcccggta cttctggggc aacctgcccg 1320
gcatgaaccg gcccctggcc agcaccgtga acgacaagct ggagctgcag gagtgcctgg 1380
agcacggccg gatcgccaag ttcagcaagg tgcggaccat caccacccgg agcaacagca 1440
tcaagcaggg caaggaccag cacttccccg tgttcatgaa cgagaaggag gacatcctgt 1500
ggtgcaccga gatggagcgg gtgttcggct tccccgtgca ctacaccgac gtgagcaaca 1560
tgagccggct ggcccggcag cggctgctgg gccggagctg gagcgtgccc gtgatccggc 1620
acctgttcgc ccccctgaag gagtacttcg cctgcgtgag cagcggcaac agcaacgcca 1680
acagccgggg ccccagcttc agcagcggcc tggtgcccct gagcctgcgg ggcagccaca 1740
tgaatcctct ggagatgttc gagacagtgc ccgtgtggag aaggcaaccc gtgagggtgc 1800
tgagcctctt cgaggacatt aagaaggagc tgacctctct gggctttctg gaatccggca 1860
gcgaccccgg ccagctgaaa cacgtggtgg acgtgaccga cacagtgagg aaggacgtgg 1920
aagagtgggg cccctttgac ctcgtgtatg gagccacacc tcctctcggc cacacatgcg 1980
ataggcctcc cagctggtat ctcttccagt tccacagact gctccagtac gccagaccta 2040
agcccggcag ccccagaccc ttcttctgga tgttcgtgga caatctggtg ctgaacaagg 2100
aggatctgga tgtggccagc agatttctgg agatggaacc cgtgacaatc cccgacgtgc 2160
atggcggctc tctgcagaac gccgtgagag tgtggtccaa catccccgcc attagaagca 2220
gacactgggc tctggtgagc gaggaggaac tgtctctgct ggcccagaat aagcagtcct 2280
ccaagctggc cgccaagtgg cccaccaagc tggtgaagaa ctgctttctg cctctgaggg 2340
agtatttcaa gtatttcagc accgaactga ccagcagcct gagcggcggc aagcggcccg 2400
ccgccaccaa gaaggccggc caggccaaga agaagaaggg cagctacccc tacgacgtgc 2460
ccgactacgc ctgagcggcc gcttaattaa gctgccttct gcggggcttg ccttctggcc 2520
atgcccttct tctctccctt gcacctgtac ctcttggtct ttgaataaag cctgagtagg 2580
aagtctagaa aaaaaaaaaa aaaaaaaaaa aaaaaaaaaa aaaaaaaaaa aaaaaaaaaa 2640
aaaaaaaaaa aaaaaaaaaa aaaaaaaa 2668
<210> 237
<211> 2668
<212> DNA
<213> Artificial Sequence
<220>
<223> Description of Artificial Sequence: Synthetic polynucleotide
<400> 237
aggaaataag agagaaaaga agagtaagaa gaaatataag agccaccatg gcccccaaga 60
agaagcggaa ggtgggcatc cacggcgtgc ccgccgccgg cagcagcgga tccctggagc 120
ccggagaaaa gccttacaaa tgccccgagt gcggcaagtc cttcagccag ctggctcatc 180
tgagagctca tcaaaggacc cacaccggcg agaagcccta taagtgcccc gagtgcggaa 240
aatccttctc ccagagcagc aatctcgtca gacaccagag gacccacacc ggcgagaaac 300
cttacaagtg tcccgaatgt ggaaagtcct tctcccaaaa gagctctctg atcgcccatc 360
agagaacaca taccggcgaa aaaccctaca agtgccccga gtgtggcaaa agcttttcca 420
ccaccggcaa tctgaccgtg catcaaagaa cccacaccgg cgaaaaaccc tacaaatgcc 480
ccgagtgtgg caaatccttc tccgaccccg gccatctggt gaggcaccag aggacacaca 540
ccggcgagaa accttataaa tgtcccgaat gcggcaagtc ctttagcacc agcggctctc 600
tggtgagaca tcagaggaca cataccggcg aaaagcctta caagtgtccc gagtgtggca 660
aaagcttcag ccagaacagc acactgacag agcatcagag aacccatacc ggcaaaaaga 720
ccagcgctag cggcagcggc ggcggcagcg gcggcaacca cgaccaggag ttcgaccccc 780
ccaaggtgta cccccccgtg cccgccgaga agcggaagcc catccgggtg ctgagcctgt 840
tcgacggcat cgccaccggc ctgctggtgc tgaaggacct gggcatccag gtggaccggt 900
acatcgccag cgaggtgtgc gaggacagca tcaccgtggg catggtgcgg caccagggca 960
agatcatgta cgtgggcgac gtgcggagcg tgacccagaa gcacatccag gagtggggcc 1020
ccttcgacct ggtgatcggc ggcagcccct gcaacgacct gagcatcgtg aaccccgccc 1080
ggaagggcct gtacgagggc accggccggc tgttcttcga gttctaccgg ctgctgcacg 1140
acgcccggcc caaggagggc gacgaccggc ccttcttctg gctgttcgag aacgtggtgg 1200
ccatgggcgt gagcgacaag cgggacatca gccggttcct ggagagcaac cccgtgatga 1260
tcgacgccaa ggaggtgagc gccgcccacc gggcccggta cttctggggc aacctgcccg 1320
gcatgaaccg gcccctggcc agcaccgtga acgacaagct ggagctgcag gagtgcctgg 1380
agcacggccg gatcgccaag ttcagcaagg tgcggaccat caccacccgg agcaacagca 1440
tcaagcaggg caaggaccag cacttccccg tgttcatgaa cgagaaggag gacatcctgt 1500
ggtgcaccga gatggagcgg gtgttcggct tccccgtgca ctacaccgac gtgagcaaca 1560
tgagccggct ggcccggcag cggctgctgg gccggagctg gagcgtgccc gtgatccggc 1620
acctgttcgc ccccctgaag gagtacttcg cctgcgtgag cagcggcaac agcaacgcca 1680
acagccgggg ccccagcttc agcagcggcc tggtgcccct gagcctgcgg ggcagccaca 1740
tgaatcctct ggagatgttc gagacagtgc ccgtgtggag aaggcaaccc gtgagggtgc 1800
tgagcctctt cgaggacatt aagaaggagc tgacctctct gggctttctg gaatccggca 1860
gcgaccccgg ccagctgaaa cacgtggtgg acgtgaccga cacagtgagg aaggacgtgg 1920
aagagtgggg cccctttgac ctcgtgtatg gagccacacc tcctctcggc cacacatgcg 1980
ataggcctcc cagctggtat ctcttccagt tccacagact gctccagtac gccagaccta 2040
agcccggcag ccccagaccc ttcttctgga tgttcgtgga caatctggtg ctgaacaagg 2100
aggatctgga tgtggccagc agatttctgg agatggaacc cgtgacaatc cccgacgtgc 2160
atggcggctc tctgcagaac gccgtgagag tgtggtccaa catccccgcc attagaagca 2220
gacactgggc tctggtgagc gaggaggaac tgtctctgct ggcccagaat aagcagtcct 2280
ccaagctggc cgccaagtgg cccaccaagc tggtgaagaa ctgctttctg cctctgaggg 2340
agtatttcaa gtatttcagc accgaactga ccagcagcct gagcggcggc aagcggcccg 2400
ccgccaccaa gaaggccggc caggccaaga agaagaaggg cagctacccc tacgacgtgc 2460
ccgactacgc ctgagcggcc gcttaattaa gctgccttct gcggggcttg ccttctggcc 2520
atgcccttct tctctccctt gcacctgtac ctcttggtct ttgaataaag cctgagtagg 2580
aagtctagaa aaaaaaaaaa aaaaaaaaaa aaaaaaaaaa aaaaaaaaaa aaaaaaaaaa 2640
aaaaaaaaaa aaaaaaaaaa aaaaaaaa 2668
<210> 238
<211> 2668
<212> DNA
<213> artificial sequence
<220>
<223> Description of Artificial Sequence: Synthetic
polynucleotide
<400> 238
aggaaataag agagaaaaga agagtaagaa gaaatataag agccaccatg gcccccaaga 60
agaagcggaa ggtgggcatc cacggcgtgc ccgccgccgg cagcagcgga tccctggagc 120
ccggcgaaaa accctataaa tgccccgagt gtggcaagag cttttccgac cccggacacc 180
tcgtgaggca tcagagaaca cataccggcg agaaacccta caagtgcccc gaatgcggca 240
aatccttctc tagaaaggac aatctgaaaa accatcaaag aacccatacc ggcgagaagc 300
cctataaatg tcccgagtgt ggaaagagct tcagccacaa gaacgctctg cagaaccatc 360
agaggaccca taccggcgaa aagccttata agtgccccga gtgcggaaaa tccttttcta 420
gaagggacga gctgaatgtg caccaaagga cacataccgg agagaaaccc tacaaatgcc 480
ccgagtgcgg caagtccttc agcacctccg gcaatctggt gaggcaccaa aggacacaca 540
ccggcgaaaa accttacaag tgtcccgagt gcggaaaaag cttttcccag aacagcacac 600
tgaccgaaca ccaaaggacc cacaccggag agaaacctta taaatgtccc gagtgtggaa 660
agtcctttag ccagtccggc aatctgacag agcatcaaag aacccacacc ggcaaaaaga 720
cctccgctag cggcagcggc ggcggcagcg gcggcaacca cgaccaggag ttcgaccccc 780
ccaaggtgta cccccccgtg cccgccgaga agcggaagcc catccgggtg ctgagcctgt 840
tcgacggcat cgccaccggc ctgctggtgc tgaaggacct gggcatccag gtggaccggt 900
acatcgccag cgaggtgtgc gaggacagca tcaccgtggg catggtgcgg caccagggca 960
agatcatgta cgtgggcgac gtgcggagcg tgacccagaa gcacatccag gagtggggcc 1020
ccttcgacct ggtgatcggc ggcagcccct gcaacgacct gagcatcgtg aaccccgccc 1080
ggaagggcct gtacgagggc accggccggc tgttcttcga gttctaccgg ctgctgcacg 1140
acgcccggcc caaggagggc gacgaccggc ccttcttctg gctgttcgag aacgtggtgg 1200
ccatgggcgt gagcgacaag cgggacatca gccggttcct ggagagcaac cccgtgatga 1260
tcgacgccaa ggaggtgagc gccgcccacc gggcccggta cttctggggc aacctgcccg 1320
gcatgaaccg gcccctggcc agcaccgtga acgacaagct ggagctgcag gagtgcctgg 1380
agcacggccg gatcgccaag ttcagcaagg tgcggaccat caccacccgg agcaacagca 1440
tcaagcaggg caaggaccag cacttccccg tgttcatgaa cgagaaggag gacatcctgt 1500
ggtgcaccga gatggagcgg gtgttcggct tccccgtgca ctacaccgac gtgagcaaca 1560
tgagccggct ggcccggcag cggctgctgg gccggagctg gagcgtgccc gtgatccggc 1620
acctgttcgc ccccctgaag gagtacttcg cctgcgtgag cagcggcaac agcaacgcca 1680
acagccgggg ccccagcttc agcagcggcc tggtgcccct gagcctgcgg ggcagccaca 1740
tgaatcctct ggagatgttc gagacagtgc ccgtgtggag aaggcaaccc gtgagggtgc 1800
tgagcctctt cgaggacatt aagaaggagc tgacctctct gggctttctg gaatccggca 1860
gcgaccccgg ccagctgaaa cacgtggtgg acgtgaccga cacagtgagg aaggacgtgg 1920
aagagtgggg cccctttgac ctcgtgtatg gagccacacc tcctctcggc cacacatgcg 1980
ataggcctcc cagctggtat ctcttccagt tccacagact gctccagtac gccagaccta 2040
agcccggcag ccccagaccc ttcttctgga tgttcgtgga caatctggtg ctgaacaagg 2100
aggatctgga tgtggccagc agatttctgg agatggaacc cgtgacaatc cccgacgtgc 2160
atggcggctc tctgcagaac gccgtgagag tgtggtccaa catccccgcc attagaagca 2220
gacactgggc tctggtgagc gaggaggaac tgtctctgct ggcccagaat aagcagtcct 2280
ccaagctggc cgccaagtgg cccaccaagc tggtgaagaa ctgctttctg cctctgaggg 2340
agtatttcaa gtatttcagc accgaactga ccagcagcct gagcggcggc aagcggcccg 2400
ccgccaccaa gaaggccggc caggccaaga agaagaaggg cagctacccc tacgacgtgc 2460
ccgactacgc ctgagcggcc gcttaattaa gctgccttct gcggggcttg ccttctggcc 2520
atgcccttct tctctccctt gcacctgtac ctcttggtct ttgaataaag cctgagtagg 2580
aagtctagaa aaaaaaaaaa aaaaaaaaaa aaaaaaaaaa aaaaaaaaaa aaaaaaaaaa 2640
aaaaaaaaaa aaaaaaaaaa aaaaaaaa 2668
<210> 239
<211> 2668
<212> DNA
<213> artificial sequence
<220>
<223> Description of Artificial Sequence: Synthetic
polynucleotide
<400> 239
aggaaataag agagaaaaga agagtaagaa gaaatataag agccaccatg gcccccaaga 60
agaagcggaa ggtgggcatc cacggcgtgc ccgccgccgg cagcagcgga tccctggagc 120
ccggcgaaaa gccttacaaa tgtcccgaat gcggaaagag cttcagcaga gccgacaatc 180
tgaccgaaca tcagagaacc cataccggag aaaaacctta caaatgtccc gagtgcggca 240
aaagcttctc ccaagccgga catctggcca gccaccaaag gacacatacc ggcgagaaac 300
cctacaagtg ccccgagtgc ggcaagtcct tctctagatc cgatgagctg gtcagacatc 360
agagaaccca taccggcgag aagccttata agtgccccga atgtggcaag tccttcagcc 420
agagagctca tctggagagg catcaaagaa cacacaccgg agagaaacct tacaagtgtc 480
ccgagtgtgg aaagagcttc tccagaaggg acgagctgaa cgtccaccaa agaacccata 540
ccggcgaaaa gccctataaa tgccccgagt gtggaaaatc cttttctaga tccgaccatc 600
tgacaaccca ccagaggacc cataccggag agaagcccta caaatgcccc gagtgtggaa 660
aaagcttctc tagaaacgat gctctgacag agcaccaaag gacccacacc ggcaaaaaga 720
ccagcgctag cggcagcggc ggcggcagcg gcggcaacca cgaccaggag ttcgaccccc 780
ccaaggtgta cccccccgtg cccgccgaga agcggaagcc catccgggtg ctgagcctgt 840
tcgacggcat cgccaccggc ctgctggtgc tgaaggacct gggcatccag gtggaccggt 900
acatcgccag cgaggtgtgc gaggacagca tcaccgtggg catggtgcgg caccagggca 960
agatcatgta cgtgggcgac gtgcggagcg tgacccagaa gcacatccag gagtggggcc 1020
ccttcgacct ggtgatcggc ggcagcccct gcaacgacct gagcatcgtg aaccccgccc 1080
ggaagggcct gtacgagggc accggccggc tgttcttcga gttctaccgg ctgctgcacg 1140
acgcccggcc caaggagggc gacgaccggc ccttcttctg gctgttcgag aacgtggtgg 1200
ccatgggcgt gagcgacaag cgggacatca gccggttcct ggagagcaac cccgtgatga 1260
tcgacgccaa ggaggtgagc gccgcccacc gggcccggta cttctggggc aacctgcccg 1320
gcatgaaccg gcccctggcc agcaccgtga acgacaagct ggagctgcag gagtgcctgg 1380
agcacggccg gatcgccaag ttcagcaagg tgcggaccat caccacccgg agcaacagca 1440
tcaagcaggg caaggaccag cacttccccg tgttcatgaa cgagaaggag gacatcctgt 1500
ggtgcaccga gatggagcgg gtgttcggct tccccgtgca ctacaccgac gtgagcaaca 1560
tgagccggct ggcccggcag cggctgctgg gccggagctg gagcgtgccc gtgatccggc 1620
acctgttcgc ccccctgaag gagtacttcg cctgcgtgag cagcggcaac agcaacgcca 1680
acagccgggg ccccagcttc agcagcggcc tggtgcccct gagcctgcgg ggcagccaca 1740
tgaatcctct ggagatgttc gagacagtgc ccgtgtggag aaggcaaccc gtgagggtgc 1800
tgagcctctt cgaggacatt aagaaggagc tgacctctct gggctttctg gaatccggca 1860
gcgaccccgg ccagctgaaa cacgtggtgg acgtgaccga cacagtgagg aaggacgtgg 1920
aagagtgggg cccctttgac ctcgtgtatg gagccacacc tcctctcggc cacacatgcg 1980
ataggcctcc cagctggtat ctcttccagt tccacagact gctccagtac gccagaccta 2040
agcccggcag ccccagaccc ttcttctgga tgttcgtgga caatctggtg ctgaacaagg 2100
aggatctgga tgtggccagc agatttctgg agatggaacc cgtgacaatc cccgacgtgc 2160
atggcggctc tctgcagaac gccgtgagag tgtggtccaa catccccgcc attagaagca 2220
gacactgggc tctggtgagc gaggaggaac tgtctctgct ggcccagaat aagcagtcct 2280
ccaagctggc cgccaagtgg cccaccaagc tggtgaagaa ctgctttctg cctctgaggg 2340
agtatttcaa gtatttcagc accgaactga ccagcagcct gagcggcggc aagcggcccg 2400
ccgccaccaa gaaggccggc caggccaaga agaagaaggg cagctacccc tacgacgtgc 2460
ccgactacgc ctgagcggcc gcttaattaa gctgccttct gcggggcttg ccttctggcc 2520
atgcccttct tctctccctt gcacctgtac ctcttggtct ttgaataaag cctgagtagg 2580
aagtctagaa aaaaaaaaaa aaaaaaaaaa aaaaaaaaaa aaaaaaaaaa aaaaaaaaaa 2640
aaaaaaaaaa aaaaaaaaaa aaaaaaaa 2668
<210> 240
<211> 2179
<212> DNA
<213> artificial sequence
<220>
<223> Description of Artificial Sequence: Synthetic
polynucleotide
<400> 240
aggaaataag agagaaaaga agagtaagaa gaaatataag agccaccatg gcccccaaga 60
agaagcggaa ggtgggcatc cacggcgtgc ccgccgccgg cagcagcgga tccctggagc 120
ccggcgaaaa gccttacaag tgccccgagt gtggcaaatc ctttagcacc accggaaatc 180
tgaccgtcca ccagagaaca cataccggcg agaaacccta caagtgtccc gagtgcggca 240
aatccttcag ccagctggcc catctgagag cccatcaaag gacccatacc ggcgagaaac 300
cttacaagtg tcccgaatgt ggaaagtcct ttagcagccc cgccgatctg acaagacatc 360
aaagaaccca caccggcgag aagccctata aatgtcccga gtgtggaaag tccttcagcc 420
agagcggcaa tctgaccgag catcaaagaa cccataccgg cgaaaagccc tataagtgcc 480
ccgaatgcgg aaaaagcttc tccacaagcg gcgagctggt gagacaccaa aggacacata 540
ccggcgaaaa gccttataaa tgccccgagt gcggcaagag cttctctaga aaggacaatc 600
tgaagaacca ccaaagaaca cacaccggcg agaagcccta caaatgcccc gagtgcggca 660
agagctttag ccagtccagc aacctcgtga gacatcagag gacacatacc ggaaaaaaga 720
ccagcgctag cggcagcggc ggcggcagcg gcggcgagga gcccgaggag cccgccgata 780
gcggacaatc tctggtgccc gtctacatct acagccccga atatgtgagc atgtgtgatt 840
ccctcgccaa gatccctaag agagccagca tggtgcattc tctgatcgag gcctacgctc 900
tgcataagca aatgaggatc gtgaagccca aggtcgccag catggaagag atggccacct 960
ttcacaccga tgcctacctc caacatctcc agaaggtgtc ccaagagggc gacgacgacc 1020
accccgactc cattgagtac ggactgggct atgattgccc cgccaccgag ggcatctttg 1080
actatgccgc cgctatcggc ggagctacca tcacagccgc ccagtgtctg attgatggca 1140
tgtgcaaggt cgccatcaac tggtccggag gctggcatca tgccaagaag gatgaggcct 1200
ccggcttctg ttatctgaat gacgccgtgc tgggcattct gagactgagg aggaaattcg 1260
agaggattct gtacgtggat ctggatctgc atcacggaga tggagtcgaa gatgccttca 1320
gcttcaccag caaggtgatg acagtctctc tgcacaagtt ctcccccggc ttctttcccg 1380
gaaccggcga cgtgtccgac gtgggactgg gcaagggaag gtactacagc gtgaacgtgc 1440
ccattcaaga cggcatccaa gacgagaagt actaccagat ctgcgagtcc gtgctcaagg 1500
aggtctacca agccttcaat cctaaggctg tcgtgctcca actgggagct gataccattg 1560
ctggcgatcc catgtgcagc ttcaatatga cacccgtcgg aatcggcaag tgcctcaagt 1620
acatcctcca gtggcagctc gccaccctca ttctcggagg aggcggatac aatctggcta 1680
ataccgccag atgctggacc tatctgaccg gcgtgattct gggcaaaaca ctgagcagcg 1740
aaatccccga ccacgagttt ttcaccgctt acggccccga ctacgtgctg gagatcaccc 1800
ccagctgcag acccgataga aacgaacccc atagaatcca gcaaattctg aactatatca 1860
agggcaacct caagcacgtc gtgggaggtg gcggatcggg aaagcggccc gccgccacca 1920
agaaggccgg tcaggccaag aagaagaagg gcagctaccc ctacgacgtg cccgactacg 1980
cctgagcggc cgcttaatta agctgccttc tgcggggctt gccttctggc catgcccttc 2040
ttctctccct tgcacctgta cctcttggtc tttgaataaa gcctgagtag gaagtctaga 2100
aaaaaaaaaa aaaaaaaaaa aaaaaaaaaa aaaaaaaaaa aaaaaaaaaa aaaaaaaaaa 2160
aaaaaaaaaa aaaaaaaaa 2179
<210> 241
<211> 2179
<212> DNA
<213> artificial sequence
<220>
<223> Description of Artificial Sequence: Synthetic
polynucleotide
<400> 241
aggaaataag agagaaaaga agagtaagaa gaaatataag agccaccatg gcccccaaga 60
agaagcggaa ggtgggcatc cacggcgtgc ccgccgccgg cagcagcgga tccctggagc 120
ccggcgaaaa accctataag tgccccgagt gcggcaagag ctttagcgat cccggccatc 180
tggtgaggca tcagaggacc cacaccggcg aaaagcctta caaatgcccc gagtgtggaa 240
aaagcttcag cagaagcgat catctgacca cccatcagag gacacatacc ggcgagaagc 300
cttataaatg ccccgaatgt ggaaagagct tctccagaag cgaccatctg accaaccacc 360
agaggaccca taccggagaa aaaccttaca aatgccccga gtgtggaaag tccttcagct 420
cccccgccga tctgacaaga catcagagaa cccacaccgg cgaaaaacct tataaatgtc 480
ccgagtgtgg caaaagcttc tccgacaaga aggatctgac aagacaccaa aggacccaca 540
ccggcgagaa accttataaa tgtcccgaat gcggaaaaag ctttagcaga aacgacgctc 600
tgaccgaaca ccagagaaca cataccggag agaaacccta taaatgtccc gagtgcggaa 660
aatccttcag caccaccggc gctctgacag agcatcagag gacacacacc ggcaaaaaga 720
cctccgctag cggcagcggc ggcggcagcg gcggcgagga gcccgaggag cccgccgata 780
gcggacaatc tctggtgccc gtctacatct acagccccga atatgtgagc atgtgtgatt 840
ccctcgccaa gatccctaag agagccagca tggtgcattc tctgatcgag gcctacgctc 900
tgcataagca aatgaggatc gtgaagccca aggtcgccag catggaagag atggccacct 960
ttcacaccga tgcctacctc caacatctcc agaaggtgtc ccaagagggc gacgacgacc 1020
accccgactc cattgagtac ggactgggct atgattgccc cgccaccgag ggcatctttg 1080
actatgccgc cgctatcggc ggagctacca tcacagccgc ccagtgtctg attgatggca 1140
tgtgcaaggt cgccatcaac tggtccggag gctggcatca tgccaagaag gatgaggcct 1200
ccggcttctg ttatctgaat gacgccgtgc tgggcattct gagactgagg aggaaattcg 1260
agaggattct gtacgtggat ctggatctgc atcacggaga tggagtcgaa gatgccttca 1320
gcttcaccag caaggtgatg acagtctctc tgcacaagtt ctcccccggc ttctttcccg 1380
gaaccggcga cgtgtccgac gtgggactgg gcaagggaag gtactacagc gtgaacgtgc 1440
ccattcaaga cggcatccaa gacgagaagt actaccagat ctgcgagtcc gtgctcaagg 1500
aggtctacca agccttcaat cctaaggctg tcgtgctcca actgggagct gataccattg 1560
ctggcgatcc catgtgcagc ttcaatatga cacccgtcgg aatcggcaag tgcctcaagt 1620
acatcctcca gtggcagctc gccaccctca ttctcggagg aggcggatac aatctggcta 1680
ataccgccag atgctggacc tatctgaccg gcgtgattct gggcaaaaca ctgagcagcg 1740
aaatccccga ccacgagttt ttcaccgctt acggccccga ctacgtgctg gagatcaccc 1800
ccagctgcag acccgataga aacgaacccc atagaatcca gcaaattctg aactatatca 1860
agggcaacct caagcacgtc gtgggaggtg gcggatcggg aaagcggccc gccgccacca 1920
agaaggccgg tcaggccaag aagaagaagg gcagctaccc ctacgacgtg cccgactacg 1980
cctgagcggc cgcttaatta agctgccttc tgcggggctt gccttctggc catgcccttc 2040
ttctctccct tgcacctgta cctcttggtc tttgaataaa gcctgagtag gaagtctaga 2100
aaaaaaaaaa aaaaaaaaaa aaaaaaaaaa aaaaaaaaaa aaaaaaaaaa aaaaaaaaaa 2160
aaaaaaaaaa aaaaaaaaa 2179
<210> 242
<211> 2179
<212> DNA
<213> artificial sequence
<220>
<223> Description of Artificial Sequence: Synthetic
polynucleotide
<400> 242
aggaaataag agagaaaaga agagtaagaa gaaatataag agccaccatg gcccccaaga 60
agaagcggaa ggtgggcatc cacggcgtgc ccgccgccgg cagcagcgga tccctggagc 120
ccggcgaaaa gccttacaaa tgtcccgaat gcggaaagag cttcagcaga gccgacaatc 180
tgaccgaaca tcagagaacc cataccggag aaaaacctta caaatgtccc gagtgcggca 240
aaagcttctc ccaagccgga catctggcca gccaccaaag gacacatacc ggcgagaaac 300
cctacaagtg ccccgagtgc ggcaagtcct tctctagatc cgatgagctg gtcagacatc 360
agagaaccca taccggcgag aagccttata agtgccccga atgtggcaag tccttcagcc 420
agagagctca tctggagagg catcaaagaa cacacaccgg agagaaacct tacaagtgtc 480
ccgagtgtgg aaagagcttc tccagaaggg acgagctgaa cgtccaccaa agaacccata 540
ccggcgaaaa gccctataaa tgccccgagt gtggaaaatc cttttctaga tccgaccatc 600
tgacaaccca ccagaggacc cataccggag agaagcccta caaatgcccc gagtgtggaa 660
aaagcttctc tagaaacgat gctctgacag agcaccaaag gacccacacc ggcaaaaaga 720
ccagcgctag cggcagcggc ggcggcagcg gcggcgagga gcccgaggag cccgccgata 780
gcggacaatc tctggtgccc gtctacatct acagccccga atatgtgagc atgtgtgatt 840
ccctcgccaa gatccctaag agagccagca tggtgcattc tctgatcgag gcctacgctc 900
tgcataagca aatgaggatc gtgaagccca aggtcgccag catggaagag atggccacct 960
ttcacaccga tgcctacctc caacatctcc agaaggtgtc ccaagagggc gacgacgacc 1020
accccgactc cattgagtac ggactgggct atgattgccc cgccaccgag ggcatctttg 1080
actatgccgc cgctatcggc ggagctacca tcacagccgc ccagtgtctg attgatggca 1140
tgtgcaaggt cgccatcaac tggtccggag gctggcatca tgccaagaag gatgaggcct 1200
ccggcttctg ttatctgaat gacgccgtgc tgggcattct gagactgagg aggaaattcg 1260
agaggattct gtacgtggat ctggatctgc atcacggaga tggagtcgaa gatgccttca 1320
gcttcaccag caaggtgatg acagtctctc tgcacaagtt ctcccccggc ttctttcccg 1380
gaaccggcga cgtgtccgac gtgggactgg gcaagggaag gtactacagc gtgaacgtgc 1440
ccattcaaga cggcatccaa gacgagaagt actaccagat ctgcgagtcc gtgctcaagg 1500
aggtctacca agccttcaat cctaaggctg tcgtgctcca actgggagct gataccattg 1560
ctggcgatcc catgtgcagc ttcaatatga cacccgtcgg aatcggcaag tgcctcaagt 1620
acatcctcca gtggcagctc gccaccctca ttctcggagg aggcggatac aatctggcta 1680
ataccgccag atgctggacc tatctgaccg gcgtgattct gggcaaaaca ctgagcagcg 1740
aaatccccga ccacgagttt ttcaccgctt acggccccga ctacgtgctg gagatcaccc 1800
ccagctgcag acccgataga aacgaacccc atagaatcca gcaaattctg aactatatca 1860
agggcaacct caagcacgtc gtgggaggtg gcggatcggg aaagcggccc gccgccacca 1920
agaaggccgg tcaggccaag aagaagaagg gcagctaccc ctacgacgtg cccgactacg 1980
cctgagcggc cgcttaatta agctgccttc tgcggggctt gccttctggc catgcccttc 2040
ttctctccct tgcacctgta cctcttggtc tttgaataaa gcctgagtag gaagtctaga 2100
aaaaaaaaaa aaaaaaaaaa aaaaaaaaaa aaaaaaaaaa aaaaaaaaaa aaaaaaaaaa 2160
aaaaaaaaaa aaaaaaaaa 2179
<210> 243
<211> 5
<212> PRT
<213> unknown
<220>
<223> Description of Unknown:
thrombin-sensitive sequence
<400> 243
Cys Pro Arg Ser Cys
1 5
<210> 244
<211> 51
<212> DNA
<213> unknown
<220>
<223> Description of Unknown:
anchor sequence
<400> 244
tggatgggag tgtgacagtg ctgccttctg accacaaggt ggggcttata c 51
<210> 245
<211> 51
<212> DNA
<213> unknown
<220>
<223> Description of Unknown:
anchor sequence
<400> 245
gtataagccc caccttgtgg tcagaaggca gcactgtcac actcccatcc a 51
<210> 246
<211> 102
<212> DNA
<213> unknown
<220>
<223> Description of Unknown:
anchor sequence
<400> 246
ttgttttcgg ctctagatgg cgccataagc ctatgtttac cttgactttt gcaaatggcc 60
atggcactgc ctgtcccata aggaggagga ccaaagggat ta 102
<210> 247
<211> 51
<212> DNA
<213> unknown
<220>
<223> Description of Unknown:
anchor sequence
<400> 247
caaaagtcaa ggtaaacata ggcttatggc gccatctaga gccgaaaaca a 51
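
The four anchor sequences listed above (SEQ ID NOs 244-247) can be compared strand against strand. The following is a minimal sketch, not part of the disclosure, that tests whether one listed anchor sequence is the reverse complement of another. The sequences are transcribed from the listing; the helper name `reverse_complement` and the dictionary `SEQS` are illustrative assumptions.

```python
# Anchor sequences transcribed from the listing above (SEQ ID NOs 244-247).
# The listing prints bases in blocks of ten; spaces are stripped before use.
SEQS = {
    244: "tggatgggag tgtgacagtg ctgccttctg accacaaggt ggggcttata c",
    245: "gtataagccc caccttgtgg tcagaaggca gcactgtcac actcccatcc a",
    246: ("ttgttttcgg ctctagatgg cgccataagc ctatgtttac cttgactttt gcaaatggcc "
          "atggcactgc ctgtcccata aggaggagga ccaaagggat ta"),
    247: "caaaagtcaa ggtaaacata ggcttatggc gccatctaga gccgaaaaca a",
}
SEQS = {seq_id: seq.replace(" ", "") for seq_id, seq in SEQS.items()}

# Translation table mapping each lowercase DNA base to its complement.
_COMPLEMENT = str.maketrans("acgt", "tgca")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a lowercase DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


if __name__ == "__main__":
    # Lengths should agree with the <211> fields of the listing.
    for seq_id, seq in SEQS.items():
        print(seq_id, len(seq))
    # Strand relationships between the listed anchor sequences.
    print(reverse_complement(SEQS[244]) == SEQS[245])
    print(reverse_complement(SEQS[246][:51]) == SEQS[247])
```

With the sequences as transcribed here, SEQ ID NO 245 is the reverse complement of SEQ ID NO 244, and SEQ ID NO 247 is the reverse complement of the first 51 nucleotides of SEQ ID NO 246. In other words, each pair describes the same locus read from opposite strands.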

Claims (232)

1. A method of reducing expression of a first gene and a second gene in a cell, the method comprising:
contacting the cell with a site-specific disruption agent comprising a targeting moiety that specifically binds to a first anchor sequence or a site proximal to the first anchor sequence, in an amount sufficient to reduce expression of the first and second genes,
wherein the first and second genes are within an anchor sequence-mediated junction comprising the first anchor sequence and a second anchor sequence,
wherein optionally, the first gene and the second gene are pro-inflammatory genes;
thereby reducing the expression of the first and second genes.
2. A site-specific breaker comprising:
a DNA-binding moiety, e.g., a targeting moiety, that specifically binds to a first anchor sequence in a cell or to a site proximal to the first anchor sequence,
wherein the first anchor sequence is part of an anchor sequence-mediated junction, the anchor sequence-mediated junction further comprising a second anchor sequence, a first gene, and a second gene,
wherein optionally, the first gene and the second gene are pro-inflammatory genes.
3. The site-specific breaker of claim 2, wherein the first or second anchor sequence is located between IL-8 and RASSF6; between the IL-8 enhancer and RASSF6; between CXCL1 and CXCL4; between CXCL2 and EPGN; or between the E2 enhancer and EPGN.
4. The site-specific breaker of claim 2 or 3, wherein the site-specific breaker further comprises an effector moiety.
5. The site-specific breaker of any one of claims 2-4, wherein the targeting moiety comprises a TAL effector molecule, a CRISPR/Cas molecule (e.g., a catalytically inactive CRISPR/Cas protein), a zinc finger domain, a tetR domain, a meganuclease, or an oligonucleotide.
6. The site-specific breaker of any one of claims 2-5, wherein the effector moiety comprises an effector as described herein, e.g., MQ1, DNMT3A2, DNMT3B1, DNMT3B2, DNMT3B3, DNMT3B4, DNMT3B5, DNMT3B6, DNMT3L, EZH, HDAC8, KRAB, MeCP2, HP1, RBBP4, REST, FOG1, SUZ12, SETDB1, SETDB2, EHMT2 (i.e., G9A), EHMT1 (i.e., GLP), SUV39H1, HDAC2, HDAC3, HDAC4, HDAC5, HDAC6, HDAC7, HDAC8, HDAC9, HDAC10, HDAC11, SIRT1, SIRT2, SIRT3, SIRT4, SIRT5, SIRT6, SIRT7, t8, t9, ezb 1, tdv 39, td8, tdv 420, or a functional fragment thereof, or any one of the SUV 420.
7. The site-specific breaker of any one of claims 2-6, wherein the effector moiety is linked to the targeting moiety by a linker.
8. The site-specific breaker of any one of claims 2-7, wherein the effector moiety is C-terminal to the targeting moiety.
9. The site-specific breaker of any one of claims 2-7, wherein the effector moiety is N-terminal to the targeting moiety.
10. The site-specific breaker of any one of claims 2-9, wherein the effector moiety is encoded by a nucleotide sequence selected from any one of SEQ ID NOs 10, 14, 16, 18, 66, 68, or a sequence having at least 80%, 85%, 90%, 95%, 99% or 100% identity thereto or differing therefrom at no more than 20, 19, 18, 17, 16, 15, 14, 13, 12, 11, 10, 9, 8, 7, 6, 5, 4, 3, 2 or 1 positions.
11. The site-specific breaker of any one of claims 2-10, wherein the effector moiety comprises an amino acid sequence according to any one of SEQ ID NOs 11, 12, 13, 15, 17, 19, 67, or a sequence having at least 80%, 85%, 90%, 95%, 99% or 100% identity thereto or differing therefrom at no more than 20, 19, 18, 17, 16, 15, 14, 13, 12, 11, 10, 9, 8, 7, 6, 5, 4, 3, 2 or 1 positions.
12. The site-specific breaker of any one of claims 2-11, wherein the effector moiety is MQ1 or a functional variant or fragment thereof, e.g., wherein the effector moiety comprises the amino acid sequence of SEQ ID NO: 11 or 12, or a sequence having at least 80%, 85%, 90%, 95%, 99% or 100% identity thereto or differing therefrom at no more than 20, 19, 18, 17, 16, 15, 14, 13, 12, 11, 10, 9, 8, 7, 6, 5, 4, 3, 2 or 1 positions, wherein optionally the effector moiety is at the C-terminus of the targeting moiety.
13. The site-specific breaker of any one of claims 2-11, wherein the effector moiety is KRAB or a functional variant or fragment thereof, e.g., wherein the effector moiety comprises the amino acid sequence of SEQ ID NO: 13, or a sequence having at least 80%, 85%, 90%, 95%, 99% or 100% identity thereto or differing therefrom at no more than 20, 19, 18, 17, 16, 15, 14, 13, 12, 11, 10, 9, 8, 7, 6, 5, 4, 3, 2 or 1 positions, wherein optionally the effector moiety is at the C-terminus of the targeting moiety.
14. The site-specific breaker of any one of claims 2-11, wherein the effector moiety is DNMT3a/3L or a functional variant or fragment thereof, e.g., wherein the effector moiety comprises the amino acid sequence of SEQ ID NO: 15, or a sequence having at least 80%, 85%, 90%, 95%, 99% or 100% identity thereto or differing therefrom at no more than 20, 19, 18, 17, 16, 15, 14, 13, 12, 11, 10, 9, 8, 7, 6, 5, 4, 3, 2 or 1 positions, wherein optionally the effector moiety is at the C-terminus of the targeting moiety.
15. The site-specific breaker of any one of claims 2-11, wherein the effector moiety is EZH2 or a functional variant or fragment thereof, e.g., wherein the effector moiety comprises the amino acid sequence of SEQ ID NO: 17, or a sequence having at least 80%, 85%, 90%, 95%, 99% or 100% identity thereto or differing therefrom at no more than 20, 19, 18, 17, 16, 15, 14, 13, 12, 11, 10, 9, 8, 7, 6, 5, 4, 3, 2 or 1 positions.
16. The site-specific breaker of any one of claims 2-11, wherein the effector moiety is HDAC8 or a functional variant or fragment thereof, e.g., wherein the effector moiety comprises the amino acid sequence of SEQ ID NO: 19, or a sequence having at least 80%, 85%, 90%, 95%, 99% or 100% identity thereto or differing therefrom at no more than 20, 19, 18, 17, 16, 15, 14, 13, 12, 11, 10, 9, 8, 7, 6, 5, 4, 3, 2 or 1 positions, wherein optionally the effector moiety is at the C-terminus of the targeting moiety.
17. The site-specific breaker of any one of claims 2-11, wherein the effector moiety is G9A or a functional variant or fragment thereof, e.g., wherein the effector moiety comprises the amino acid sequence of SEQ ID NO: 67, or a sequence having at least 80%, 85%, 90%, 95%, 99% or 100% identity thereto or differing therefrom at no more than 20, 19, 18, 17, 16, 15, 14, 13, 12, 11, 10, 9, 8, 7, 6, 5, 4, 3, 2 or 1 positions, wherein optionally the effector moiety is N-terminal to the targeting moiety.
18. The site-specific breaker of any one of claims 2-17, further comprising a second effector moiety.
19. The site-specific breaker of claim 18, wherein the targeting moiety is located between the first effector moiety and the second effector moiety.
20. The site-specific breaker of any one of claims 2-19, wherein the effector moiety comprises a polymer, e.g., an oligonucleotide, e.g., a gRNA.
21. The site-specific breaker of claim 20, wherein the oligonucleotide has a sequence comprising a complement of the anchor sequence or a complement of a sequence proximal to the anchor sequence.
22. The site-specific breaker of any one of claims 2-21, wherein the targeting moiety further comprises a gRNA, e.g., a gRNA that binds to a genomic locus comprising at least 14, 15, 16, 17, 18, 19, or 20 nucleotides of the sequence of any one of SEQ ID NOs 20-62, e.g., wherein the gRNA comprises a sequence comprising at least 14, 15, 16, 17, 18, 19, or 20 nucleotides of the sequence of any one of SEQ ID NOs 20-62.
23. The site-specific breaker of any one of claims 2-22, wherein the targeting domain comprises a CRISPR/Cas molecule, e.g., a catalytically inactive CRISPR/Cas protein, and a gRNA, e.g., a gRNA that binds to a genomic locus comprising at least 14, 15, 16, 17, 18, 19 or 20 nucleotides of the sequence of any one of SEQ ID NOs 20-62, e.g., wherein the gRNA comprises a sequence comprising at least 14, 15, 16, 17, 18, 19 or 20 nucleotides of the sequence of any one of SEQ ID NOs 20-62, and the effector moiety comprises an effector selected from the group consisting of DNMT3a/3L, MQ1, KRAB, G9A, HDAC, and EZH2.
24. The site-specific breaker of claim 23, wherein the targeting domain comprises a CRISPR/Cas molecule, e.g., a catalytically inactive CRISPR/Cas protein, and a gRNA, e.g., a gRNA that binds to a genomic locus comprising at least 14, 15, 16, 17, 18, 19, or 20 nucleotides of the sequence of any one of SEQ ID NOs 20-62, e.g., wherein the gRNA comprises a sequence comprising at least 14, 15, 16, 17, 18, 19, or 20 nucleotides of the sequence of any one of SEQ ID NOs 20-62, the first effector moiety comprises an effector selected from DNMT3a/3L, MQ1, KRAB, G9A, HDAC, or EZH2, and the second effector moiety comprises an effector selected from DNMT3a/3L, MQ1, KRAB, G9A, HDAC, or EZH2.
25. The site-specific breaker of any one of claims 2-24, wherein the targeting domain binds to a genomic locus comprising at least 14, 15, 16, 17, 18, 19 or 20 nucleotides of the sequence of any one of SEQ ID NOs 20-62.
26. The site-specific breaker of any one of claims 2-25, wherein the targeting domain binds to a genomic locus within 50 nucleotides (e.g., upstream or downstream) of the sequence of any one of SEQ ID NOs 20-62.
27. The site-specific breaker of any one of claims 2-26, wherein the targeting domain binds to a genomic locus within 100 nucleotides (e.g., upstream or downstream) of the sequence of any one of SEQ ID NOs 20-62.
28. The site-specific breaker of any one of claims 2-27, wherein the targeting domain binds to a genomic locus within 200 nucleotides (e.g., upstream or downstream) of the sequence of any one of SEQ ID NOs 20-62.
29. The site-specific breaker of any one of claims 2-28, wherein the targeting domain binds to a genomic locus within 300 nucleotides (e.g., upstream or downstream) of the sequence of any one of SEQ ID NOs 20-62.
30. The site-specific breaker of any one of claims 2-29, which: (i) comprises one or more nuclear localization signal (NLS) sequences, or (ii) does not comprise an NLS, optionally wherein the NLS comprises the amino acid sequence of SEQ ID NO: 63 and/or 64.
31. The site-specific breaker of any one of claims 18-30, wherein the first and/or second effector moiety comprises a DNA methyltransferase, a histone deacetylase, a histone demethylase, or a recruiter of a histone modification complex.
32. The site-specific breaker of any one of claims 2-31, wherein the anchor sequence-mediated junction (ASMC) comprises two loops.
33. The site-specific breaker of any one of claims 2-32 or the method of claim 1, wherein the first gene is located in a first loop of the ASMC and the second gene is located in a second loop of the ASMC.
34. The site-specific breaker or method of claim 33, wherein the first anchor sequence is located between the first loop and the second loop.
35. A nucleic acid encoding the site-specific breaker of any one of claims 2-34.
36. The method of claim 1 or the site-specific breaker of any one of claims 2-36, wherein the anchor sequence-mediated junction further comprises a third gene, and optionally wherein the method results in reduced expression of the third gene.
37. The method or site-specific breaker of claim 36, wherein the anchor sequence-mediated junction further comprises a fourth gene, and optionally wherein the method results in reduced expression of the fourth gene.
38. The method or site-specific breaker of claim 37, wherein the anchor sequence-mediated junction further comprises a fifth gene, and optionally wherein the method results in reduced expression of the fifth gene.
39. The method or site-specific breaker of claim 38, wherein the anchor sequence-mediated junction further comprises a sixth gene, and optionally wherein the method results in reduced expression of the sixth gene.
40. The method or site-specific breaker of claim 39, wherein the anchor sequence-mediated junction further comprises a seventh gene, and optionally wherein the method results in reduced expression of the seventh gene.
41. The method or site-specific breaker of claim 40, wherein the anchor sequence-mediated junction further comprises an eighth gene, and optionally wherein the method results in reduced expression of the eighth gene.
42. A human cell having reduced expression of a first gene and a second gene,
wherein the first gene and the second gene are pro-inflammatory genes,
wherein the cell comprises a disrupted (e.g., a fully disrupted) anchor sequence-mediated junction comprising the first gene and the second gene.
43. The human cell of claim 42, having reduced CTCF binding to an anchor sequence of the anchor sequence-mediated junction, e.g., reduced by at least 20%, 30%, 40%, 50%, 60%, 70%, 80%, 90% or 100%.
44. The human cell of claim 42 or 43, wherein the expression of a third pro-inflammatory gene of the human cell is reduced.
45. The human cell of claim 44, wherein the expression of a fourth pro-inflammatory gene of the human cell is reduced.
46. The human cell of claim 45, wherein the expression of a fifth pro-inflammatory gene of the human cell is reduced.
47. The human cell of claim 46, wherein the expression of a sixth pro-inflammatory gene of the human cell is reduced.
48. The human cell of claim 47, wherein the expression of a seventh pro-inflammatory gene of the human cell is reduced.
49. The human cell of claim 48, wherein the expression of an eighth pro-inflammatory gene of the human cell is reduced.
50. The human cell of any one of claims 42-49, wherein the human cell comprises chr4:74595464-74595486, chr4:74595464-74595486, chr4:5237, or a mutation at chr4:5237, or within 5, 10, 15, 20, 30, or 50 nucleotides of said region.
51. A human cell comprising chr4:74595464-74595486, chr4:74595464-74595486, chr4:5237, or a mutation at chr4:5237, or within 5, 10, 15, 20, 30, or 50 nucleotides of said region.
52. The human cell of claim 27 or 28, wherein the mutation comprises a deletion, substitution, or insertion (e.g., 1-10, 1-9, 1-8, 1-7, 1-6, 1-5, 1-4, 1-3, or 1-2 nucleotides).
53. The human cell of any one of claims 50-52, which has reduced CTCF binding at the mutation, e.g., reduced by at least 20%, 30%, 40%, 50%, 60%, 70%, 80%, 90% or 100% compared to a human cell with an undisrupted ASMC.
54. The human cell of any one of claims 42-53, wherein the expression of the first and second genes is reduced by at least 10%, 20%, 30%, 40%, 50%, 60%, 70%, 80% or 90% as compared to a human cell with an undisrupted ASMC.
55. A system, comprising:
a first site-specific breaker comprising a first targeting moiety and optionally a first effector moiety, wherein the first site-specific breaker specifically binds to a first anchor sequence of an anchor sequence-mediated conjunction (ASMC), wherein the ASMC comprises a first gene and a second gene, and
a second site-specific breaker comprising a second targeting moiety and optionally a second effector moiety, wherein the second site-specific breaker binds to a second anchor sequence of the ASMC.
56. The system of claim 55, wherein the first anchor sequence is between IL-8 and RASSF6; between the IL-8 enhancer and RASSF6; between CXCL1 and CXCL4; between CXCL2 and EPGN; or between the E2 enhancer and EPGN.
57. The system of claim 55 or 56, wherein the second anchor sequence is between IL-8 and RASSF6; between the IL-8 enhancer and RASSF6; between CXCL1 and CXCL4; between CXCL2 and EPGN; or between the E2 enhancer and EPGN.
58. The system of any one of claims 55-57, wherein the first anchor sequence is between the IL-8 enhancer and RASSF6 and the second anchor sequence is between CXCL1 and CXCL4.
59. The system of any one of claims 55-58, wherein the first anchor sequence is between the IL-8 enhancer and RASSF6 and the second anchor sequence is between the E2 enhancer and EPGN.
60. The system of any one of claims 55-59, wherein the first anchor sequence is between CXCL1 and CXCL4 and the second anchor sequence is between the E2 enhancer and EPGN.
61. The system of any one of claims 55-60, wherein the first site-specific breaker is a site-specific breaker as described herein, e.g., a site-specific breaker of any one of claims 2-9.
62. The system of any one of claims 55-61, wherein the second site-specific breaker is a site-specific breaker as described herein, e.g., a site-specific breaker of any one of claims 2-9.
63. The system of any one of claims 55-62, wherein the first targeting moiety and the second targeting moiety each independently comprise a TAL effector molecule, a CRISPR/Cas molecule, a zinc finger domain, a tetR domain, a meganuclease, or an oligonucleotide.
64. The system of any one of claims 55-63, wherein the first effector and the second effector each independently comprise an effector described herein, such as MQ1, EZH2, HDAC8, KRAB, G9A, or DNMT3a/3L, or a functional variant or fragment of any one thereof.
65. The system of any one of claims 55-62, wherein the first effector and the second effector each independently comprise a protein selected from the group consisting of: SETDB1, SETDB2, EHMT2 (i.e., G9A), EHMT1 (i.e., GLP), SUV39H1, EZH2, EZH1, SUV39H2, SETD8, SUV420H1, SUV420H2, or functional variants or fragments of any of these.
66. The system of any one of claims 55-65, wherein the first effector and the second effector each independently comprise a protein selected from the group consisting of: HDAC1, HDAC2, HDAC3, HDAC4, HDAC5, HDAC6, HDAC7, HDAC8, HDAC9, HDAC10, HDAC11, SIRT1, SIRT2, SIRT3, SIRT4, SIRT5, SIRT6, SIRT7, SIRT8, SIRT9, or a functional variant or fragment of any of these.
67. The system of any one of claims 55-66, wherein the first effector and the second effector each independently comprise a protein selected from the group consisting of: MQ1, DNMT3A2, DNMT3B1, DNMT3B2, DNMT3B3, DNMT3B4, DNMT3B5, DNMT3B6, DNMT3L, DNMT3a/3L, or a functional variant or fragment of any one thereof.
68. The system of any one of claims 55-67, wherein the first effector and the second effector each independently comprise a protein selected from the group consisting of: KRAB, MeCP2, HP1, RBBP4, REST, FOG1, SUZ12, or a functional variant or fragment of any one thereof.
69. The system of any one of claims 55-68, wherein the first effector and the second effector each independently comprise a polymer, such as an oligonucleotide.
70. The system of any one of claims 55-69, wherein the first oligonucleotide and the second oligonucleotide are the same.
71. The system of any one of claims 55-70, wherein the first oligonucleotide and the second oligonucleotide are different.
72. The system of any one of claims 55-71, wherein the first oligonucleotide has a sequence comprising a complement of the first anchor sequence or a complement of a sequence proximal to the first anchor sequence and the second oligonucleotide has a sequence comprising a complement of the second anchor sequence or a complement of a sequence proximal to the second anchor sequence.
73. The system of any one of claims 55-72, wherein the anchor sequence-mediated conjunction further comprises a third gene.
74. The system of any one of claims 55-73, wherein the anchor sequence-mediated conjunction further comprises a fourth gene.
75. The system of any one of claims 55-74, wherein the anchor sequence-mediated conjunction further comprises a fifth gene.
76. The system of any one of claims 55-75, wherein the anchor sequence-mediated conjunction further comprises a sixth gene.
77. The system of any one of claims 55-76, wherein the anchor sequence-mediated conjunction further comprises a seventh gene.
78. The system of any one of claims 55-77, wherein the anchor sequence-mediated conjunction further comprises an eighth gene.
79. The system of any one of claims 55-78, wherein the ASMC comprises two loops.
80. A nucleic acid composition encoding the system of any one of claims 55-79.
81. The nucleic acid composition of claim 80, wherein a single nucleic acid encodes the first site-specific breaker and the second site-specific breaker.
82. The nucleic acid composition of claim 80, wherein a first nucleic acid encodes the first site-specific breaker and a second nucleic acid encodes the second site-specific breaker.
83. A method of reducing expression of a first gene and a second gene in a cell, the method comprising contacting the cell with the system of any one of claims 55-79 or the nucleic acid composition of any one of claims 80-82.
84. The method of claim 83, wherein the cell is contacted with the first site-specific breaker and the second site-specific breaker simultaneously.
85. The method of claim 83, wherein the cell is contacted with the first site-specific breaker and the second site-specific breaker sequentially.
86. The method, human cell, site-specific breaker or system of any one of claims 1-85, wherein the first gene is CXCL1 and the second gene is CXCL2.
87. The method, human cell, site-specific breaker or system of any one of claims 1-85, wherein the first gene is CXCL1 and the second gene is CXCL3.
88. The method, human cell, site-specific breaker or system of any one of claims 1-85, wherein the first gene is CXCL1 and the second gene is IL-8.
89. The method, human cell, site-specific breaker or system of any one of claims 1-85, wherein the first gene is CXCL1 and the second gene is CXCL4.
90. The method, human cell, site-specific breaker or system of any one of claims 1-85, wherein the first gene is CXCL1 and the second gene is CXCL5.
91. The method, human cell, site-specific breaker or system of any one of claims 1-85, wherein the first gene is CXCL1 and the second gene is CXCL6.
92. The method, human cell, site-specific breaker or system of any one of claims 1-85, wherein the first gene is CXCL1 and the second gene is CXCL7.
93. The method, human cell, site-specific breaker or system of any one of claims 1-85, wherein the first gene is CXCL2 and the second gene is CXCL3.
94. The method, human cell, site-specific breaker or system of any one of claims 1-85, wherein the first gene is CXCL2 and the second gene is IL-8.
95. The method, human cell, site-specific breaker or system of any one of claims 1-85, wherein the first gene is CXCL2 and the second gene is CXCL4.
96. The method, human cell, site-specific breaker or system of any one of claims 1-85, wherein the first gene is CXCL2 and the second gene is CXCL4.
97. The method, human cell, site-specific breaker or system of any one of claims 1-85, wherein the first gene is CXCL2 and the second gene is CXCL5.
98. The method, human cell, site-specific breaker or system of any one of claims 1-85, wherein the first gene is CXCL2 and the second gene is CXCL6.
99. The method, human cell, site-specific breaker or system of any one of claims 1-85, wherein the first gene is CXCL2 and the second gene is CXCL7.
100. The method, human cell, site-specific breaker or system of any one of claims 1-85, wherein the first gene is CXCL3 and the second gene is IL-8.
101. The method, human cell, site-specific breaker or system of any one of claims 1-85, wherein the first gene is CXCL3 and the second gene is CXCL4.
102. The method, human cell, site-specific breaker or system of any one of claims 1-85, wherein the first gene is CXCL3 and the second gene is CXCL5.
103. The method, human cell, site-specific breaker or system of any one of claims 1-85, wherein the first gene is CXCL3 and the second gene is CXCL6.
104. The method, human cell, site-specific breaker or system of any one of claims 1-85, wherein the first gene is CXCL3 and the second gene is CXCL7.
105. The method, human cell, site-specific breaker or system of any one of claims 1-85, wherein the first gene is CXCL4 and the second gene is CXCL5.
106. The method, human cell, site-specific breaker or system of any one of claims 1-85, wherein the first gene is CXCL4 and the second gene is CXCL6.
107. The method, human cell, site-specific breaker or system of any one of claims 1-85, wherein the first gene is CXCL4 and the second gene is CXCL7.
108. The method, human cell, site-specific breaker or system of any one of claims 1-85, wherein the first gene is CXCL4 and the second gene is IL-8.
109. The method, human cell, site-specific breaker or system of any one of claims 1-85, wherein the first gene is CXCL5 and the second gene is CXCL6.
110. The method, human cell, site-specific breaker or system of any one of claims 1-85, wherein the first gene is CXCL5 and the second gene is CXCL7.
111. The method, human cell, site-specific breaker or system of any one of claims 1-85, wherein the first gene is CXCL5 and the second gene is IL-8.
112. The method, human cell, site-specific breaker or system of any one of claims 1-85, wherein the first gene is CXCL6 and the second gene is CXCL7.
113. The method, human cell, site-specific breaker or system of any one of claims 1-85, wherein the first gene is CXCL6 and the second gene is IL-8.
114. The method, human cell, site-specific breaker or system of any one of claims 1-85, wherein the first gene is CXCL7 and the second gene is IL-8.
115. The method, human cell, site-specific breaker or system of any one of claims 36-85, wherein the first gene is CXCL1, the second gene is CXCL2, and the third gene is CXCL3.
116. The method, human cell, site-specific breaker or system of any one of claims 36-85, wherein the first, second and third genes are selected from CXCL1, CXCL2, CXCL3, CXCL4, CXCL5, CXCL6, CXCL7 or IL-8.
117. The method, human cell, site-specific breaker or system of any one of claims 36-85, wherein the first, second, third and fourth genes are selected from CXCL1, CXCL2, CXCL3, CXCL4, CXCL5, CXCL6, CXCL7 or IL-8.
118. The method, human cell, site-specific breaker or system of any one of claims 36-85, wherein the first, second, third, fourth and fifth genes are selected from CXCL1, CXCL2, CXCL3, CXCL4, CXCL5, CXCL6, CXCL7 or IL-8.
119. The method, human cell, site-specific breaker or system of any one of claims 36-85, wherein the first, second, third, fourth, fifth and sixth genes are selected from CXCL1, CXCL2, CXCL3, CXCL4, CXCL5, CXCL6, CXCL7 or IL-8.
120. The method, human cell, site-specific breaker or system of any one of claims 36-85, wherein the first, second, third, fourth, fifth, sixth and seventh genes are selected from CXCL1, CXCL2, CXCL3, CXCL4, CXCL5, CXCL6, CXCL7 or IL-8.
121. The method, human cell, site-specific breaker or system of any one of claims 36-85, wherein the first, second, third, fourth, fifth, sixth, seventh and eighth genes are CXCL1, CXCL2, CXCL3, CXCL4, CXCL5, CXCL6, CXCL7 and IL-8.
122. The method, human cell, site-specific breaker or system of any one of the preceding claims, wherein the first gene is a cytokine.
123. The method, human cell, site-specific breaker or system of any one of the preceding claims, wherein the second gene is a cytokine.
124. The method, human cell, site-specific breaker or system of any one of the preceding claims, wherein the third gene is a cytokine.
125. The method, human cell, site-specific breaker or system of any of the preceding claims, wherein the fourth gene is a cytokine.
126. The method, human cell, site-specific breaker or system of any of the preceding claims, wherein the fifth gene is a cytokine.
127. The method, human cell, site-specific breaker or system of any of the preceding claims, wherein the sixth gene is a cytokine.
128. The method, human cell, site-specific breaker or system of any of the preceding claims, wherein the seventh gene is a cytokine.
129. The method, human cell, site-specific breaker or system of any of the preceding claims, wherein the eighth gene is a cytokine.
130. The method, human cell, site-specific breaker or system of any of the preceding claims, wherein the anchor sequence-mediated conjunction comprises 3, 4 or 5 pro-inflammatory genes.
131. The method, site-specific breaker or system of any preceding claim, wherein the site-specific breaker comprises a nucleic acid (e.g. DNA or RNA) comprising a nucleotide sequence selected from SEQ ID NOs 20-62, or a sequence having at least 90%, 95%, 98% or 99% identity thereto or differing therefrom by no more than 1, 2, 3, 4 or 5 positions.
132. The method or site-specific breaker of any preceding claim, wherein the site-specific breaker comprises a nucleic acid (e.g. DNA or RNA) comprising a nucleotide sequence selected from SEQ ID NOs 21, 22, 24, or 40, or a sequence having at least 90%, 95%, 98% or 99% identity thereto or differing therefrom by no more than 1, 2, 3, 4 or 5 positions.
133. The method or site-specific breaker of any one of the preceding claims, wherein the site-specific breaker binds to a sequence that at least partially overlaps with a region having genomic coordinates selected from Tables 4, 5, 6, or 7, or a sequence within 5, 10, 15, 20, 30, 40, 50, 100, 200, 300, 400, 500, 600, 700, 800, 900 or 1000 nucleotides of said region.
134. The method of any one of the preceding claims, which results in a decrease in the level of a cytokine such as a chemokine, e.g. after stimulation of the cell with TNF-α, e.g. as measured in examples 2-11.
135. The method or human cell of any one of the preceding claims, wherein the level of a cytokine (e.g., a chemokine) is reduced, e.g., upon stimulation of the cell with TNF-α, e.g., as measured in examples 2-11.
136. The method or human cell of any one of the preceding claims, wherein transcript levels of one or more (e.g., 2, 3, or all) of CXCL1, CXCL2, CXCL3, and IL8 are reduced, e.g., upon stimulation of the cell with TNF-α, e.g., as measured in examples 2 or 4-11.
137. The method or human cell of any one of the preceding claims, wherein the transcript level of one or more (e.g., 2, 3 or all) of CXCL1, CXCL2, CXCL3, CXCL4, CXCL5, CXCL6, CXCL7 and IL8 is reduced, e.g., upon stimulation of the cell with TNF-α.
138. The method or human cell of any one of claims 132-137, wherein the reduction is at least 10%, 20%, 30%, 40%, 50%, 60%, 70%, 80% or 90% reduction compared to a pre-treatment level or a human cell with undamaged ASMC.
139. The method or human cell of any one of the preceding claims, wherein the protein level (e.g., secreted protein level) of one or more (e.g., 2, 3, or all) of CXCL1, CXCL2, CXCL3, and IL8 is reduced, e.g., upon stimulation of the cell with TNF-α, e.g., as measured in example 3.
140. The method or human cell of any one of the preceding claims, wherein the protein level (e.g., secreted protein level) of one or more (e.g., 2, 3, or all) of CXCL1, CXCL2, CXCL3, CXCL4, CXCL5, CXCL6, CXCL7, and IL8 is reduced, e.g., upon stimulation of the cell with TNF-α.
141. The method or human cell of claim 140, wherein the reduction is at least 10%, 20%, 30%, 40%, 50%, 60%, 70%, 80% or 90% reduction compared to a pre-treatment level or a human cell with undamaged ASMC.
142. The method of any one of the preceding claims, which results in reduced binding of CTCF to the first anchor sequence, e.g., complete loss of binding or at least 20%, 30%, 40%, 50%, 60%, 70%, 80%, 90% or 100% loss compared to human cells with undamaged ASMC, e.g., as measured by ChIP and quantitative PCR.
143. The method of any one of the preceding claims, which results in disruption of the anchor sequence-mediated conjunction.
144. The method of any one of the preceding claims, wherein a population of cells is contacted with the site-specific breaker, and wherein the first anchor sequence is edited in at least 50%, 60%, 70%, 80%, 90% or 95% of the cells in the population.
145. The method of any one of the preceding claims, wherein the effect (e.g., a decrease in cytokine level) is additive or synergistic compared to the effect of inhibiting the first gene or the second gene alone.
146. The method of any one of the preceding claims, wherein the decrease in expression persists for at least 1, 2, 3, 4, 5, 6, 7, 10, or 14 days, or at least 1, 2, 3, 4, or 5 weeks.
147. The method, human cell, site-specific breaker or system of any one of the preceding claims, wherein the cell is a cell of a subject suffering from an inflammatory disease, such as an immune-mediated inflammatory disease.
148. The method, human cell, site-specific breaker or system of any one of the preceding claims, wherein the inflammatory disease is an autoimmune disorder, such as rheumatoid arthritis.
149. The method, human cell, site-specific breaker or system of any of the preceding claims, wherein the inflammatory disease is associated with a pathogen infection, such as a viral infection, e.g. a SARS-CoV2 infection.
150. The method, human cell, site-specific breaker or system of any of the preceding claims, wherein the inflammatory disease is associated with an overlapping infection, e.g., an infection caused by two or more pathogens, e.g., viruses and bacteria (e.g., SARS-CoV2 and Streptococcus pneumoniae), e.g., viruses and fungi (e.g., SARS-CoV2 and mucormycosis).
151. The method, human cell, site-specific breaker or system of any one of the preceding claims, wherein the cell is a cell of a subject having: rheumatoid arthritis, inflammatory arthritis, gout, asthma, neutrophilic skin disease, paw edema, acute respiratory distress syndrome (ARDS), COVID-19, psoriasis, inflammatory bowel disease, infection (e.g., caused by a pathogen, such as a bacterium, virus, or fungus), external injury (e.g., an abrasion or foreign object), the effects of radiation or chemical injury, osteoarthritis joint pain, inflammatory pain, acute pain, chronic pain, cystitis, bronchitis, dermatitis, skin disorders, cardiovascular disease, neurodegenerative disease, liver disease, lung disease, kidney disease, pain, swelling, stiffness, tenderness, redness, fever, or a biomarker associated with a disease state (e.g., a cytokine, chemokine, growth factor, immune receptor, infection marker, or inflammation marker).
152. The method, human cell, site-specific breaker, or system of any one of the preceding claims, wherein the cell is a cell of a subject having rheumatoid arthritis, psoriasis, or inflammatory bowel disease.
153. The method, human cell, site-specific breaker or system of any one of the preceding claims, wherein the cell is a cell of a subject having rheumatoid arthritis, gout, neutrophilic asthma, neutrophilic skin disease, acute respiratory distress syndrome (ARDS) or COVID-19.
154. The method, site-specific breaker or system of any preceding claim, wherein the anchor sequence-mediated conjunction comprises an internal enhancing sequence.
155. The method, site-specific disruption agent or system of any one of the preceding claims, wherein the second gene (and optionally third, fourth, fifth, sixth, seventh or eighth gene) is transcribed in the same direction as the first gene.
156. The method, site-specific breaker or system of any preceding claim, wherein the first anchor sequence comprises a binding motif selected from CTCF binding motif, USF1 binding motif, YY1 binding motif, TAF3 binding motif or ZNF143 binding motif.
157. The method, site-specific breaker or system of any preceding claim, wherein the first anchor sequence comprises a CTCF binding motif.
158. The method, site-specific breaker or system of any preceding claim, wherein the site-specific breaker specifically binds to or proximal to the first anchor sequence with sufficient affinity to compete for binding with endogenous nucleation polypeptides (e.g., CTCF, USF1, YY1, TAF3 or ZNF143) within the cell.
159. The method, site-specific breaker or system of any preceding claim, wherein the site-specific breaker adds, deletes or replaces one or more nucleotides within or proximal to the first anchor sequence.
160. The method or site-specific breaker of any one of the preceding claims, wherein the site-specific breaker comprises a targeting moiety or effector moiety comprising a first CRISPR/Cas molecule comprising a first CRISPR/Cas protein and a first guide RNA.
161. The method or system of any of the preceding claims, wherein the first site-specific breaker comprises a first targeting moiety or first effector moiety comprising a first CRISPR/Cas molecule comprising a first CRISPR/Cas protein and a first guide RNA and the second site-specific breaker comprises a second targeting moiety or second effector moiety comprising a second CRISPR/Cas molecule comprising a second CRISPR/Cas protein and a second guide RNA.
162. The method or site-specific breaker of any one of the preceding claims, wherein the site-specific breaker comprises a targeting moiety or effector moiety comprising a TAL effector molecule, CRISPR/Cas molecule, zinc finger domain, tetR domain, meganuclease or oligonucleotide.
163. The method or system of any one of the preceding claims, wherein the first site-specific breaker comprises a first targeting moiety or first effector moiety comprising a TAL effector molecule, CRISPR/Cas molecule, zinc finger domain, tetR domain, meganuclease or oligonucleotide and the second site-specific breaker comprises a second targeting moiety or second effector moiety comprising a TAL effector molecule, CRISPR/Cas molecule, zinc finger domain, tetR domain, meganuclease or oligonucleotide.
164. The method or site-specific breaker of any one of the preceding claims, wherein the site-specific breaker comprises an effector moiety comprising a histone modification function, such as histone methyltransferase, histone demethylase or histone deacetylase activity.
165. The method or system of any one of the preceding claims, wherein the first and/or the second site-specific breaker comprises an effector moiety comprising a histone modification function, such as histone methyltransferase, histone demethylase or histone deacetylase activity.
166. The method, site-specific breaker, or system of claim 164 or 165, wherein the effector moiety comprises a protein selected from the group consisting of: SETDB1, SETDB2, EHMT2 (i.e., G9A), EHMT1 (i.e., GLP), SUV39H1, EZH2, EZH1, SUV39H2, SETD8, SUV420H1, SUV420H2, or functional variants or fragments of any of these.
167. The method, site-specific breaker, or system of claim 164 or 165, wherein the effector moiety comprises a protein selected from the group consisting of: HDAC1, HDAC2, HDAC3, HDAC4, HDAC5, HDAC6, HDAC7, HDAC8, HDAC9, HDAC10, HDAC11, SIRT1, SIRT2, SIRT3, SIRT4, SIRT5, SIRT6, SIRT7, SIRT8, SIRT9, or a functional variant or fragment of any of these.
168. The method, site-specific breaker or system of claim 164 or 165, wherein the effector moiety comprises EZH2 or a functional variant or fragment thereof.
169. The method, site-specific breaker or system of claim 164 or 165, wherein the effector moiety comprises HDAC8 or a functional variant or fragment thereof.
170. The method or site-specific breaker of any one of the preceding claims, wherein the site-specific breaker comprises an effector moiety comprising a DNA modification function, such as a DNA methyltransferase.
171. The method or system of any one of the preceding claims, wherein the first and/or the second site-specific breaker comprises an effector moiety comprising a DNA modification function, such as a DNA methyltransferase.
172. The method, site-specific breaker, or system of claim 170 or 171, wherein the effector moiety comprises a protein selected from the group consisting of: MQ1, DNMT3A2, DNMT3B1, DNMT3B2, DNMT3B3, DNMT3B4, DNMT3B5, DNMT3B6, DNMT3L, DNMT3a/3L, or a functional variant or fragment of any one thereof.
173. The method, site-specific breaker or system of claim 170 or 171, wherein the effector moiety comprises MQ1 or a functional variant or fragment thereof.
174. The method, site-specific breaker or system of claim 170 or 171, wherein the effector moiety comprises DNMT3 (e.g., DNMT3a, DNMT3L, DNMT3a/3L, DNMT3B1, DNMT3B2, DNMT3B3, DNMT3B4, DNMT3B5 or DNMT3B6) or a functional variant or fragment of any of them.
175. The method or site-specific breaker of any one of the preceding claims, wherein the site-specific breaker comprises an effector moiety comprising a transcriptional repressor.
176. The method or system of any one of the preceding claims, wherein the first and/or the second site-specific breaker comprises an effector moiety comprising a transcriptional repressor.
177. The method, site-specific breaker, or system of claim 175 or 176, wherein the effector moiety comprises a protein selected from the group consisting of: KRAB, MeCP2, HP1, RBBP4, REST, FOG1, SUZ12, or a functional variant or fragment of any one thereof.
178. The method, site-specific breaker, or system of claim 177, wherein the effector moiety comprises KRAB or a functional variant or fragment thereof.
179. The method or site-specific breaker of any one of the preceding claims, wherein the site-specific breaker comprises a polymer.
180. The method or system of any of the preceding claims, wherein the first and/or the second site-specific breaker comprises a polymer.
181. The method, site-specific breaker, or system of claim 179 or 180, wherein the polymer comprises polyamide.
182. The method, site-specific breaker, or system of claim 179 or 180, wherein the polymer is an oligonucleotide.
183. The method, site-specific breaker, or system of claim 182, wherein the oligonucleotide has a sequence comprising the complement of the first anchor sequence or the complement of the sequence proximal to the first anchor sequence.
184. The method, site-specific breaker, or system of claim 182, wherein the oligonucleotide has a sequence comprising the complement of the second anchor sequence or the complement of the sequence proximal to the second anchor sequence.
185. The method, site-specific breaker or system of any one of claims 182-184, wherein the oligonucleotide comprises a chemical modification.
186. The method, site-specific breaker or system of claim 179 or 180, wherein the polymer is a peptide nucleic acid.
187. The method, site-specific breaker or system of any preceding claim, wherein the site-specific breaker comprises a peptide-nucleic acid mixture.
188. The method, site-specific breaker, or system of any preceding claim, wherein the site-specific breaker (e.g., a targeting moiety or effector moiety of the site-specific breaker) comprises a peptide or polypeptide.
189. The method, site-specific breaker, or system of claim 188, wherein the polypeptide is a zinc finger polypeptide.
190. The method, site-specific breaker or system of claim 188, wherein the polypeptide is or comprises a transcription activator-like effector nuclease (TALEN) polypeptide.
191. The method or site-specific breaker of any preceding claim, wherein the site-specific breaker comprises a small molecule.
192. The method or system of any preceding claim, wherein the first and/or the second site-specific breaker comprises a small molecule.
193. The method or site-specific breaker of any one of the preceding claims, wherein the site-specific breaker further comprises an effector moiety, such as an epigenetic modifier, e.g., a DNA methyltransferase, a histone deacetylase, or a histone methyltransferase.
194. The method or system of any preceding claim, wherein the first and/or the second site-specific breaker further comprises an effector moiety, such as an epigenetic modifier, e.g., a DNA methyltransferase, a histone deacetylase, or a histone methyltransferase.
195. The method or site-specific breaker of any one of the preceding claims, wherein the site-specific breaker comprises a fusion molecule.
196. The method or system of any preceding claim, wherein the first and/or the second site-specific breaker comprises a fusion molecule.
197. The method or site-specific breaker of any preceding claim, wherein the site-specific breaker comprises a targeting moiety comprising a CRISPR/Cas molecule and an effector moiety comprising a transcriptional repressor, e.g. as a fusion molecule.
198. The method or system of any preceding claim, wherein the first and/or the second site-specific breaker comprises a targeting moiety comprising a CRISPR/Cas molecule and an effector moiety comprising a transcriptional repressor, e.g. as a fusion molecule.
199. The method or site-specific breaker of claim 198, wherein the targeting moiety comprises dCas9 and the effector moiety comprises KRAB or a functional variant or portion thereof.
200. The method or system of any preceding claim, wherein the first and/or the second targeting moiety comprises dCas9 and the effector moiety comprises KRAB or a functional variant or portion thereof.
201. The method or site-specific breaker of any one of claims 1-177, wherein the site-specific breaker comprises a targeting moiety comprising a CRISPR/Cas molecule and an effector moiety comprising a histone methyltransferase, e.g., as a fusion molecule.
202. The method or system of any preceding claim, wherein the first and/or the second site-specific breaker comprises a targeting moiety comprising a CRISPR/Cas molecule and an effector moiety comprising a histone methyltransferase, e.g. as a fusion molecule.
203. The method, site-specific breaker or system of claim 201, wherein the targeting moiety comprises dCas9 and the effector moiety comprises EZH2 or a functional variant or portion thereof.
204. The method, site-specific breaker, or system of any one of claims 1-196, wherein the site-specific breaker comprises a targeting moiety comprising a CRISPR/Cas molecule and an effector moiety comprising a DNA methyltransferase, e.g., as a fusion molecule.
205. The method, site-specific breaker, or system of claim 204, wherein the targeting moiety comprises dCas9 and the effector moiety comprises MQ1 or a functional variant or portion thereof.
206. The method, site-specific breaker or system of claim 203, wherein the targeting moiety comprises dCas9 and the effector moiety comprises DNMT3, such as DNMT3a/3L or a functional variant or portion thereof.
207. The method, site-specific breaker or system of any one of the preceding claims, wherein the site-specific breaker comprises a targeting moiety comprising a CRISPR/Cas molecule, a first effector moiety comprising a histone methyltransferase and a second effector moiety comprising a transcription repressor, e.g. as a fusion molecule.
208. The method, site-specific breaker, or system of claim 207, wherein the targeting moiety comprises dCas9, the first effector moiety comprises EZH2 or a functional variant or portion thereof, and the second effector moiety comprises KRAB or a functional variant or portion thereof.
209. The method, site-specific breaker or system of any preceding claim, wherein the site-specific breaker comprises a targeting moiety comprising a CRISPR/Cas molecule and an effector moiety comprising a histone deacetylase, e.g. as a fusion molecule.
210. The method, site-specific breaker or system of claim 209, wherein the targeting moiety comprises dCas9 and the effector moiety comprises HDAC8 or a functional variant or portion thereof.
211. The method, site-specific breaker or system of any one of the preceding claims, wherein the site-specific breaker comprises a targeting moiety comprising a CRISPR/Cas molecule, a first effector moiety comprising a histone methyltransferase and a second effector moiety comprising a histone deacetylase, e.g. as a fusion molecule.
212. The method, site-specific breaker or system of claim 211, wherein the targeting moiety comprises dCas9, the first effector moiety comprises EZH2 or a functional variant or portion thereof, and the second effector moiety comprises HDAC8 or a functional variant or portion thereof.
213. The method, site-specific breaker, or system of any one of claims 195-212, wherein the site-specific breaker comprises an amino acid sequence encoded by, or comprises, a sequence having at least 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identity to a nucleic acid sequence selected from any one of SEQ ID NOs 69, 71, 85, 201, 202, 204, 205, 207, 209, 211, 213, 215, 217, or 219-242, or to the complement or reverse complement of any thereof.
214. The method, site-specific breaker, or system of any one of claims 195-213, wherein the site-specific breaker comprises an amino acid sequence selected from any one of SEQ ID NOs 70, 72, 82, 84, 86, 203, 206, 208, 210, 212, 214, 216, or 218 or encoded by a sequence selected from any one of SEQ ID NOs 219-242, or a sequence having at least 80%, 85%, 90%, 91%, 92%, 93%, 94%, 95%, 96%, 97%, 98%, or 99% identity thereto.
215. The method, site-specific disruption agent, or system of any one of the preceding claims, wherein the cell is located in a subject.
216. The method or site-specific breaker of any one of claims 1-215, wherein the cell is ex vivo.
217. The method or site-specific breaker or system of any preceding claim, wherein the cell is a mammalian cell, such as a human cell.
218. The method, site-specific disruption agent, or system of any one of the preceding claims, wherein the cell is a somatic cell.
219. The method, site-specific breaker or system of any one of the preceding claims, wherein the cell is a primary cell.
220. The method of any one of the preceding claims, wherein the contacting step is performed ex vivo.
221. The method of claim 220, further comprising the step of removing the cell (e.g., mammalian cell) from the subject prior to the contacting step.
222. The method of claim 220 or 221, wherein the method further comprises administering the cells (e.g., mammalian cells) to the subject at step (b) after the contacting step.
223. The method of any one of claims 1-222, wherein the contacting step comprises administering to a subject a composition comprising the site-specific breaker.
224. The method of claim 223, wherein the site-specific disruption agent is administered as monotherapy.
225. The method of claim 223, wherein the site-specific breaker is administered in combination with a second therapeutic agent.
226. A reaction mixture comprising cells (e.g., human cells, e.g., primary human cells) and the site-specific breaker or system of any one of the preceding claims.
227. A method of treating a subject having an inflammatory disorder, the method comprising:
administering to the subject the site-specific breaker, system, or reaction mixture of any preceding claim in an amount sufficient to treat the inflammatory disorder,
thereby treating the inflammatory disorder.
228. The method of claim 227, wherein the inflammatory disorder is rheumatoid arthritis, psoriasis, or inflammatory bowel disease.
229. The method of claim 227 or 228, wherein the inflammatory disorder is rheumatoid arthritis, gout, neutrophilic asthma, neutrophilic skin disease, acute respiratory distress syndrome (ARDS), or COVID-19.
230. The method of any one of claims 227-229, wherein the inflammatory disorder is an autoimmune disorder, such as rheumatoid arthritis.
231. The method of any one of claims 227-229, wherein the inflammatory disorder is associated with a pathogen infection, such as a viral infection, e.g., a SARS-CoV-2 infection.
232. The method of any one of claims 227-229, wherein the inflammatory disorder is associated with a superinfection, such as an infection caused by two or more pathogens, such as viruses and bacteria (e.g., SARS-CoV-2 and Streptococcus pneumoniae), or such as viruses and fungi (e.g., SARS-CoV-2 and mucormycosis).
CN202180078940.0A 2020-09-29 2021-09-29 Compositions and methods for inhibiting expression of multiple genes Pending CN116635086A (en)

Applications Claiming Priority (4)

Application Number Priority Date Filing Date Title
US63/085,013 2020-09-29
US202163216487P 2021-06-29 2021-06-29
US63/216,487 2021-06-29
PCT/US2021/052720 WO2022072546A2 (en) 2020-09-29 2021-09-29 Compositions and methods for inhibiting the expression of multiple genes

Publications (1)

Publication Number Publication Date
CN116635086A (en) 2023-08-22

Family

ID=87638643

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202180078940.0A Pending CN116635086A (en) 2020-09-29 2021-09-29 Compositions and methods for inhibiting expression of multiple genes

Country Status (1)

Country Link
CN (1) CN116635086A (en)

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination