CN116670152A

CN116670152A - Novel compositions having tissue-specific targeting motifs and compositions containing the same

Info

Publication number: CN116670152A
Application number: CN202180085871.6A
Authority: CN
Inventors: J·M·威尔逊; J·J·西姆斯; Y·元
Original assignee: University of Pennsylvania Penn
Current assignee: University of Pennsylvania Penn
Priority date: 2020-12-01
Filing date: 2021-12-01
Publication date: 2023-08-29
Also published as: IL303236A; CA3200014A1; WO2022119871A2; AU2021390480A1; US20240024507A1; WO2022119871A3; JP2023551903A; EP4256065A2; TW202237850A; KR20230117157A; AU2021390480A9; AR124216A1

Abstract

Provided herein are compositions comprising a targeting peptide linked to or inserted into a targeting protein of a recombinant vector having at least one exogenous peptide comprising the amino acid sequence N-x-(T/I/V/A)-(K/R) (SEQ ID NO: 47). Compositions providing such conjugates, targeting peptides, or recombinant vectors with mutant capsid or envelope proteins, and uses thereof, are also provided.

Description

Novel compositions having tissue-specific targeting motifs and compositions containing the same

Background

Adeno-associated virus (AAV) is currently the gene therapy vector of choice. This is because AAV can deliver stable transgene expression from a non-integrating genome for decades, and because AAV is very safe and non-immunogenic. However, AAV gene therapy is currently limited to a few diseases because of delivery and tropism challenges. This is especially true for central nervous system (CNS) disorders. AAV gene therapy vectors can be delivered by injecting the vector directly into the cerebrospinal fluid (CSF), but this approach typically transduces 1% or less of brain cells. In addition, transduction is mostly confined to cells in direct contact with the CSF. Cells in the "deep brain" are rarely transduced. This limits the number of CNS disorders that can be treated by gene therapy.

In contrast to the CSF network, the vascular system of the brain reaches almost every cell in the CNS, because these tissues have a great demand for glucose, oxygen, and other nutrients. However, cells in the brain and spinal cord are protected from the circulatory system by specialized vascular units known as the blood-brain barrier (BBB). Through a complex network of tightly connected cells surrounding the vessels of the brain and spinal cord, the BBB limits the diffusion of macromolecules such as viral vectors and proteins, and even of many small-molecule drugs. Thus, engineering AAV variants that can efficiently cross the BBB and transduce deep brain cells has become a major challenge in delivering gene therapies to the CNS.

An AAV capsid developed at the California Institute of Technology (CalTech) has a seven amino acid peptide inserted in hypervariable loop 8 (HVR8) of the AAV9 capsid to produce a rAAV called AAV9-PHP.B. The insertion has been reported to mediate an interaction with Ly6a, a GPI-anchored receptor on the brain vasculature of some mouse strains (U.S. Patent Publication No. 2017/0166926 A1). This interaction drives transport of AAV9-PHP.B across the BBB, resulting in about 50-fold greater transduction of brain cells than AAV9. However, this finding has not translated to larger animals or humans.

There remains a need for vectors that can specifically target selected tissues and cell types.

Disclosure of Invention

In certain embodiments, a recombinant adeno-associated virus particle (rAAV) is provided having a capsid comprising an amino acid sequence comprising the motif N-x-(T/I/V/A)-(K/R) (SEQ ID NO: 47). Suitably, the amino acid sequence is part of at least the AAV vp3 proteins in the capsid. The rAAV further has a vector genome packaged in the capsid, said vector genome comprising a nucleic acid sequence encoding a gene product under the control of sequences that direct expression of said gene product, provided that said capsid is not a mutant AAV2 capsid comprising an NDVRAVS (SEQ ID NO: 48) sequence. In certain embodiments, an amino acid sequence comprising the N-x-(T/I/V/A)-(K/R) motif is inserted into the vp3 region of the AAV capsid, optionally flanked by two to seven amino acids at the amino terminus and/or carboxy terminus of the motif. In certain embodiments, the sequence inserted into the capsid comprises: (a) SSNTVKLTSGH (SEQ ID NO: 40); (b) EFSSNTVKLTS (SEQ ID NO: 38); (c) GGVLTNIARGEYMRGG (SEQ ID NO: 46); (d) GGIEINATRAGTNLGG (SEQ ID NO: 43); (e) GGSSNTVKLTSGHGG (SEQ ID NO: 39); (f) IEINATRAGTNL (SEQ ID NO: 42); or (g) SANFIKPTSY (SEQ ID NO: 41). In certain embodiments, the amino acid sequence of the motif is NTVK, optionally flanked by two to seven amino acids at its carboxy and/or amino terminus and inserted between amino acids 588 and 589 of the AAV9 capsid protein, based on the numbering of the amino acid sequence of SEQ ID NO: 44.

In certain embodiments, the rAAV has an NTVK sequence inserted in its capsid, optionally flanked by two to seven amino acids at its carboxy and/or amino terminus and inserted between amino acids 588 and 589 of the AAV9 capsid protein, based on the numbering of the amino acid sequence of SEQ ID NO: 44.

In certain embodiments, the compositions include a rAAV having an insertion motif and optionally flanking sequences, and one or more of a physiologically compatible carrier, excipient, and/or aqueous suspension matrix.

In certain embodiments, an endothelial cell targeting peptide is provided, the peptide comprising an amino acid sequence of N-x- (T/I/V/A) - (K/R) (SEQ ID NO: 47), optionally flanked by two to seven amino acids at the amino-and/or carboxy-terminus of the motif and optionally further conjugated to a nanoparticle, a second molecule or a viral capsid protein. In certain embodiments, the endothelial cell targeting peptide comprises: (a) SSNTVKLTSGH (SEQ ID NO: 40); (b) EFSSNTVKLTS (SEQ ID NO: 38); (c) GGVLTNIARGEYMRGG (SEQ ID NO: 46); (d) GGIEINATRAGTNLGG (SEQ ID NO: 43); (e) GGSSNTVKLTSGHGG (SEQ ID NO: 39); (f) IEINATRAGTNL (SEQ ID NO: 42); or (g) SANFIKPTSY (SEQ ID NO: 41). In certain embodiments, the amino acid sequence of the motif is NTVK. In certain embodiments, a composition is provided that includes an endothelial cell targeting peptide and one or more of a physiologically compatible carrier, excipient, and/or aqueous suspension matrix.

In certain embodiments, provided herein is a fusion polypeptide or protein comprising a brain endothelial cell targeting peptide and a fusion partner comprising at least one polypeptide or protein. In certain embodiments, a composition comprises a fusion polypeptide or protein according to claim 11, and one or more of a physiologically compatible carrier, excipient, and/or aqueous suspension matrix.

Provided herein are compositions and methods for delivering therapy to a patient in need thereof using rAAV, endothelial cell targeting peptides, fusion polypeptides or proteins and/or compositions described herein. In certain embodiments, the treatment is targeted to brain endothelial cells.

In certain embodiments, provided are compositions and methods for treating Allan-Herndon-Dudley disease by delivering a rAAV as described herein to a subject in need thereof, wherein the encoded gene product is an MCT8 protein.

In certain embodiments, a method for targeted therapy of the lung is provided, comprising administering to a patient in need thereof a rAAV as described herein.

In certain embodiments, a method is provided for treating a pulmonary disease by delivering to a subject in need thereof a rAAV having a capsid with an inserted targeting peptide and encoding a therapeutic gene product, wherein the encoded gene product is a soluble ACE2 protein, an anti-SARS antibody, an anti-SARS-CoV-2 antibody, an anti-influenza antibody, or a cystic fibrosis transmembrane protein.

In certain embodiments, a method for increasing in vitro transduction of AAV producer cells is provided, comprising inserting an N-x-(T/I/V/A)-(K/R) motif into an AAV capsid. In certain embodiments, the producer cell is a 293 cell.

These and other embodiments and advantages of the present invention will be apparent from the specification, including but not limited to the detailed description of the invention.

Drawings

Figures 1A to 1B show the enrichment scores of the best performing peptide hits and reference peptides in the brains of the screened mice. FIG. 1A shows enrichment scores for C57BL/6J mice. FIG. 1B shows enrichment scores for Balb/c mice.

FIGS. 2A and 2B show the enrichment score for the best hit in NHP tissues in the screen. Fig. 2A shows enrichment scores for NHP brains. Fig. 2B shows enrichment scores for NHP spinal cord tissue.

Figures 3A to 3D show a secondary validation of transduction levels for the best-performing peptide hits in AAV capsids carrying a GFP reporter transgene. Results are plotted relative to AAV9 transduction. FIG. 3A shows the secondary validation screen for selected peptides targeting brain tissue in Balb/c mice. FIG. 3B shows the secondary validation screen for selected peptides targeting brain tissue in C57BL/6 mice. FIG. 3C shows the secondary validation screen for selected peptides targeting liver tissue in Balb/c mice. FIG. 3D shows the secondary validation screen for selected peptides targeting liver tissue in C57BL/6 mice.

Figure 4 shows aligned regions of the amino acid sequences of the capsid proteins of AAV9, AAV8, AAV7, AAV6, AAV5, AAV4, AAV3B, AAV2 and AAV1, focusing on the HVRVIII region where targeting peptides may be inserted (based on structural analysis).

Figure 5 shows that the "NxTK" motif is a key motif for brain biodistribution in SAN inserts and shows the mean impact of substitution (fold change compared to the original sequence).

FIG. 6 shows that the "NxTK" motif controls plasmid-to-AAV conversion in SAN peptide inserts, and shows the average effect of substitution (fold change over original sequence).

Figures 7A to 7H show that the "NxTK" motif confers a broad transduction advantage across cell lines. Fig. 7A shows transduction levels relative to the AAV9 capsid in 293 cells. Fig. 7B shows transduction levels relative to the AAV9 capsid in NIH3T3 cells. Fig. 7C shows transduction levels relative to the AAV9 capsid in HUH7 cells. Fig. 7D shows transduction levels in macaque primary airway cells on day 3 post transduction (3 DPT) and day 7 post transduction (7 DPT). Fig. 7E shows microscopic analysis of macaque primary airway epithelial cells in a control sample treated with vehicle (i.e., no vector). Fig. 7F shows microscopic analysis of macaque primary airway epithelial cells transduced with an AAV9-GFP vector. Fig. 7G shows microscopic analysis of macaque primary airway epithelial cells transduced with an AAV9-GFP vector including the EFS peptide insert. Fig. 7H shows microscopic analysis of cynomolgus monkey primary airway epithelial cells transduced with an AAV9-GFP vector including the SAN peptide insert.

Figure 8 shows preliminary transduction tests with GFP vector in cultured human cells (nose, bronchi and trachea) plotted as mRNA copy number versus total mRNA in micrograms.

Detailed Description

Provided herein is a targeting peptide sequence. Fusion proteins, modified proteins, mutant viral capsids, and other moieties operably linked to an exogenous targeting peptide motif N-x- (T/I/V/A) - (K/R) (SEQ ID NO: 47) are also provided herein. In certain embodiments, such exogenous motifs confer upon these compositions a modulation of native tissue specificity of the source (parent) protein, viral vector, or other portion. In certain embodiments, the targeting peptide in such a motif provides enhanced or altered endothelial cell targeting. In certain embodiments, the targeting peptide in such motifs provides enhanced or altered lung, bronchial, tracheal and/or nasal epithelial targeting. In certain embodiments, viral vectors having modified capsids with such motifs exhibit increased transduction of AAV producer cells in vitro.

The targeting peptide can be linked to a recombinant protein (e.g., for enzyme replacement therapy) or polypeptide (e.g., an immunoglobulin) to target a desired tissue (e.g., CNS or lung) to form a fusion protein or conjugate. In addition, the targeting peptide can be linked to liposomes and/or nanoparticles (lipid nanoparticles, LNP) to form peptide coated liposomes and/or LNP to target the desired tissue. The sequence encoding at least one copy of the targeting peptide and optionally the linking sequence may be fused in-frame with the coding sequence of the recombinant protein and co-expressed with the protein or polypeptide to provide a fusion protein or conjugate. Alternatively, other synthetic methods may be used to form conjugates with proteins, polypeptides, or other moieties (e.g., DNA, RNA, or small molecules). In certain embodiments, multiple copies of the targeting peptide are in the fusion protein/conjugate. Suitable methods of conjugating a targeting peptide to a recombinant protein include modifying the amino (N) terminus and one or more residues on a recombinant human protein (e.g., an enzyme) with a first crosslinking agent to produce a first crosslinking agent modified recombinant human protein, modifying the amino (N) terminus of a short extension linker region in front of the targeting peptide with a second crosslinking agent to produce a second crosslinking agent modified variant targeting peptide, and then conjugating the first crosslinking agent modified recombinant human protein to the second crosslinking agent modified variant targeting peptide containing the short extension linker. 
Other suitable methods of conjugating a targeting peptide to a recombinant protein include conjugating a first cross-linker modified recombinant human protein to one or more second cross-linker modified variant targeting peptides, wherein the first cross-linker modified recombinant protein comprises a recombinant protein characterized by having a chemically modified N-terminus and one or more modified lysine residues, and the one or more second cross-linker modified variant targeting peptides comprise one or more variant targeting peptides comprising a modified N-terminal amino acid of a short extension linker in front of the targeting peptide. Other suitable methods for conjugating the targeting peptide to a protein, polypeptide, nanoparticle, or other biologically useful chemical moiety may be selected. See, for example, U.S. Pat. No. 9,545,450 B2 (NHS-phosphine crosslinker; NHS-azide crosslinker); U.S. published patent application No. US 2018/0185503 A1 (aldehyde-hydrazide crosslinking).

In certain embodiments, the targeting peptide may be inserted at a suitable site within a protein or polypeptide (e.g., a viral capsid protein). In some of these embodiments, and in some other embodiments, the targeting peptide may be flanked by short extension linkers at its carboxy (COO-) and/or amino (N) terminus. Such linkers may be from 1 to 20 amino acid residues in length, or may be from about 2 to 20 amino acid residues, or from about 1 to 15 amino acid residues, or from about 2 to 12 amino acid residues, or from 2 to 7 amino acid residues in length. The short extension linker may also be about 10 amino acids in length. The presence and length of the N-terminal linker is selected independently of the carboxy-terminal linker, and the presence and length of the carboxy-terminal linker is selected independently of the N-terminal linker. A flexible GS extension linker of 5 amino acids (glycine-serine), a 10 amino acid extension linker comprising 2 flexible GS linkers, a 15 amino acid extension linker comprising 3 flexible GS linkers, a 20 amino acid extension linker comprising 4 flexible GS linkers, or any combination thereof may be used to provide a suitable short extension linker.

In certain embodiments, a composition is provided that can be used to target endothelial cells. The composition is a mutant capsid, a fusion protein, or another conjugate comprising at least one exogenous targeting peptide comprising the amino acid sequence N-x-(T/I/V/A)-(K/R) (SEQ ID NO: 47), optionally flanked by two to seven amino acids at the amino and/or carboxy terminus of the motif and optionally further conjugated to a nanoparticle, a second molecule, or a viral capsid protein. The targeting peptide comprises one of the following sequences, with optional linking sequences:

(a) SSNTVKLTSGH (SEQ ID NO: 40);

(b) EFSSNTVKLTS (SEQ ID NO: 38);

(c) GGVLTNIARGEYMRGG (SEQ ID NO: 46);

(d) GGIEINATRAGTNLGG (SEQ ID NO: 43);

(e) GGSSNTVKLTSGHGG (SEQ ID NO: 39);

(f) IEINATRAGTNL (SEQ ID NO: 42); or

(g) SANFIKPTSY (SEQ ID NO: 41).
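As an illustration only, the motif N-x-(T/I/V/A)-(K/R) can be written as a regular expression and checked against the peptides listed above. This is a minimal sketch and not part of the disclosure: the names `MOTIF`, `PEPTIDES`, and `find_motif` are hypothetical, and "x" is assumed to mean any single amino acid.

```python
import re
from typing import Optional

# Assumption: "x" in N-x-(T/I/V/A)-(K/R) stands for any one amino acid,
# written here in one-letter code.
MOTIF = re.compile(r"N[A-Z][TIVA][KR]")

# Peptides (a) to (g) as listed in the text, keyed by SEQ ID NO.
PEPTIDES = {
    40: "SSNTVKLTSGH",
    38: "EFSSNTVKLTS",
    46: "GGVLTNIARGEYMRGG",
    43: "GGIEINATRAGTNLGG",
    39: "GGSSNTVKLTSGHGG",
    42: "IEINATRAGTNL",
    41: "SANFIKPTSY",
}


def find_motif(peptide: str) -> Optional[str]:
    """Return the first four-residue motif match in the peptide, or None."""
    match = MOTIF.search(peptide.upper())
    return match.group(0) if match else None


# Map each SEQ ID NO to the motif instance it contains.
hits = {seq_id: find_motif(seq) for seq_id, seq in PEPTIDES.items()}
```

Under these assumptions every listed peptide contains one instance of the motif (NTVK, NIAR, NATR, or NFIK). The same pattern also matches NDVR in NDVRAVS (SEQ ID NO: 48), the AAV2 mutant sequence that the text explicitly excludes, so a motif match alone does not decide whether a capsid falls within the described embodiments.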

In certain embodiments, the targeting peptide motif is encoded by a nucleic acid sequence selected from the group consisting of:

(a) agcagcaacaccgtgaagctgaccagcggacac (SEQ ID NO: 54);

(b) gagttcagcagcaacaccgtgaagctgaccagc (SEQ ID NO: 50);

(c) ggaggagtgctgaccaacatcgctagaggagagtacatgagaggagga (SEQ ID NO: 56);

(d) ggaggaatcgagatcaacgctaccagagctggaaccaacctgggagga (SEQ ID NO: 52);

(e) ggaggaagcagcaacaccgtgaagctgaccagcggacacggagga (SEQ ID NO: 55);

(f) atcgagatcaacgctaccagagctggaaccaacctg (SEQ ID NO: 51); or

(g) agcgctaacttcatcaagcctaccagctac (SEQ ID NO: 53).
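The correspondence between the coding sequences above and the listed peptides can be checked by translating each sequence with the standard genetic code. The sketch below is illustrative only: the pairing of nucleic acid and peptide SEQ ID NOs is assumed from the matching (a) to (g) order of the two lists, and the names `CODON_TABLE`, `translate`, and `PAIRS` are hypothetical.

```python
from itertools import product

# Standard genetic code, with codons enumerated in t, c, a, g order.
BASES = "tcag"
AMINO = ("FFLLSSSSYY**CC*W" "LLLLPPPPHHQQRRRR"
         "IIIMTTTTNNKKSSRR" "VVVVAAAADDEEGGGG")
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def translate(dna: str) -> str:
    """Translate a DNA coding sequence, read in frame from its first base."""
    dna = dna.lower()
    if len(dna) % 3 != 0:
        raise ValueError("sequence length is not a multiple of three")
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna), 3))


# (nucleic acid SEQ ID NO, coding sequence, peptide SEQ ID NO, peptide).
# Assumption: pairs follow the (a) to (g) order of the two lists in the text.
PAIRS = [
    (54, "agcagcaacaccgtgaagctgaccagcggacac", 40, "SSNTVKLTSGH"),
    (50, "gagttcagcagcaacaccgtgaagctgaccagc", 38, "EFSSNTVKLTS"),
    (56, "ggaggagtgctgaccaacatcgctagaggagagtacatgagaggagga", 46, "GGVLTNIARGEYMRGG"),
    (52, "ggaggaatcgagatcaacgctaccagagctggaaccaacctgggagga", 43, "GGIEINATRAGTNLGG"),
    (55, "ggaggaagcagcaacaccgtgaagctgaccagcggacacggagga", 39, "GGSSNTVKLTSGHGG"),
    (51, "atcgagatcaacgctaccagagctggaaccaacctg", 42, "IEINATRAGTNL"),
    (53, "agcgctaacttcatcaagcctaccagctac", 41, "SANFIKPTSY"),
]

# True for each pair whose translation equals the listed peptide.
consistent = {nt_id: translate(dna) == pep for nt_id, dna, _, pep in PAIRS}
```

With the sequences as printed, each coding sequence translates to the peptide in the same list position, which supports the assumed pairing.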

In certain embodiments, the targeting peptide is encoded by the nucleic acid sequence of SEQ ID NO: 50 or a sequence at least about 70% identical thereto. In certain embodiments, the targeting peptide is encoded by the nucleic acid sequence of SEQ ID NO: 51 or a sequence at least about 70% identical thereto. In certain embodiments, the targeting peptide is encoded by the nucleic acid sequence of SEQ ID NO: 52 or a sequence at least about 70% identical thereto. In certain embodiments, the targeting peptide is encoded by the nucleic acid sequence of SEQ ID NO: 53 or a sequence at least about 70% identical thereto. In certain embodiments, the targeting peptide is encoded by the nucleic acid sequence of SEQ ID NO: 54 or a sequence at least about 70% identical thereto. In certain embodiments, the targeting peptide is encoded by the nucleic acid sequence of SEQ ID NO: 55 or a sequence at least about 70% identical thereto. In certain embodiments, the targeting peptide is encoded by the nucleic acid sequence of SEQ ID NO: 56 or a sequence at least about 70% identical thereto. In some embodiments, the nucleic acid sequence encoding the targeting peptide motif is optionally flanked at its 5' and/or 3' end by extension linkers of six to twenty-one nucleotides.

In certain embodiments, the targeting peptide is NTVK. In certain embodiments, the targeting peptide is NTVR. In certain embodiments, more than one copy of the targeting peptide within such motif is provided in a conjugate or modified protein (e.g., parvoviral capsid). In certain embodiments, there are two or more different targeting peptides.

In certain embodiments, a composition is provided that can be used to target nasal and/or lung epithelial cells. The composition is a mutant capsid, a fusion protein, or another conjugate comprising at least one exogenous targeting peptide comprising the amino acid sequence N-x-(T/I/V/A)-(K/R) (SEQ ID NO: 47), optionally flanked by two to seven amino acids at the amino and/or carboxy terminus of the motif and optionally further conjugated to a nanoparticle, a second molecule, or a viral capsid protein. The targeting peptides include: (a) SSNTVKLTSGH (SEQ ID NO: 40); (b) EFSSNTVKLTS (SEQ ID NO: 38); (c) GGVLTNIARGEYMRGG (SEQ ID NO: 46); (d) GGIEINATRAGTNLGG (SEQ ID NO: 43); (e) GGSSNTVKLTSGHGG (SEQ ID NO: 39); (f) IEINATRAGTNL (SEQ ID NO: 42); or (g) SANFIKPTSY (SEQ ID NO: 41).

In certain embodiments, the targeting peptide is NTVK. In certain embodiments, the targeting peptide is NTVR, optionally flanked by spacer amino acids as described herein. In certain embodiments, more than one copy of the targeting peptide within such motif is provided in a conjugate or modified protein (e.g., a parvovirus capsid). In certain embodiments, there are two or more different targeting peptides.

Examples of suitable proteins for targeting, including enzymes, immunoglobulins, therapeutic proteins, immunogenic polypeptides, nanoparticles, DNA, RNA, and other moieties (e.g., small molecules, etc.), are described in more detail below. These and other biological and chemical moieties are suitable for use with the targeting peptides provided herein.

In certain embodiments, the composition comprises a targeting peptide motif linked to a nucleic acid molecule, wherein the nucleic acid molecule is a DNA molecule or an RNA molecule, e.g., naked DNA, naked plasmid DNA, or messenger RNA (mRNA). In some embodiments, the nucleic acid molecules are further coupled to various compositions and nanoparticles, including, for example, micelles, liposomes, cationic lipid-nucleic acid compositions, polysaccharide compositions and other polymers, lipid and/or cholesterol-nucleic acid conjugates, and other constructs such as are described herein. See, for example, WO 2014/089486, US 2018/0353616 A1, US 2013/0037977 A1, WO 2015/074085 A1, US 9,670,152 B2, and US 8,853,377 B2; X. Su et al., Mol. Pharmaceutics, 2011, 8(3), pp. 774-787 (web publication: March 21, 2011); WO 2013/182683, WO 2010/053572, and WO 2012/170930, all of which are incorporated herein by reference. In certain embodiments, the targeting peptide motif is chemically linked to the nanoparticle surface, wherein the nanoparticle encapsulates the nucleic acid molecule. In some embodiments, nanoparticles comprising targeting peptides attached to their surface are designed for targeted, tissue-specific delivery. In some embodiments, two or more different targeting peptides are attached to the nanoparticle surface. Suitable chemical linkages or crosslinks include those known to those skilled in the art.

Capsid

In certain embodiments, a recombinant parvovirus is provided having a modified parvovirus capsid with at least one exogenous peptide from the N-x-(T/I/V/A)-(K/R) targeting motif. Such recombinant parvoviruses may be hybrid bocavirus/AAV or recombinant AAV vectors. In other embodiments, other viral vectors may be generated that have one or more exogenous targeting peptides from the N-x-(T/I/V/A)-(K/R) motif in an exposed capsid protein (the peptides may be the same, different, or a combination thereof) to modulate and/or alter the targeting specificity of the viral vector compared to the parental vector.

The targeting peptide may be inserted at any suitable position within hypervariable loop (HVR) VIII. For example, the peptide, with linkers of various lengths, is inserted between amino acids 588 and 589 (Q-A) of the AAV9 capsid protein, based on the numbering of the AAV9 VP1 amino acid sequence of SEQ ID NO: 44. See also WO 2019/168961, published September 6, 2019, including Table G, which provides the AAV9 deamidation pattern, and WO 2018/160582, published September 7, 2018. The amino acid residue positions are the same in AAVhu68 (SEQ ID NO: 45). However, another site may be selected within HVRVIII. Alternatively, another exposed HVR loop (e.g., HVRIV) may be selected for insertion. Comparable HVR regions can be selected in other capsids. In certain embodiments, the locations of HVRVIII and HVRIV are determined using the algorithms and/or alignment techniques described in U.S. Pat. No. 9,737,618 B2 (column 15, lines 3-23) and U.S. Pat. No. 10,308,958 B2 (column 15, line 46 to column 16, line 6), each of which is incorporated herein by reference in its entirety.

In certain embodiments, an AAV1 capsid protein is selected as the parent capsid, wherein the targeting peptide with linkers of various lengths is inserted at a suitable position in the HVRVIII region of amino acids 582 to 585 or the HVRIV region of amino acids 456 to 459, based on vp1 numbering (Gurda, B.L. et al., Capsid Antibodies to Different Adeno-Associated Virus Serotypes Bind Common Regions, Journal of Virology, June 12, 2013, 87(16): 9111-9124). In certain embodiments, AAV8 is selected as the parent capsid, wherein the targeting peptide with linkers of various lengths is inserted at a suitable position in the HVRVIII region of amino acids 586 to 591 (e.g., 590-591 (N-T)) or the HVRIV region of amino acids 456 to 460, based on vp1 numbering (Gurda, B.L. et al., Mapping a Neutralizing Epitope onto the Capsid of Adeno-Associated Virus Serotype 8, Journal of Virology, May 16, 2012, 86(15): 7739-7751). In certain embodiments, AAV7 is selected as the parent capsid, wherein the targeting peptide with linkers of various lengths is inserted at a suitable position between amino acids 589 and 590 (N-T). In certain embodiments, AAV6 is selected as the parent capsid, wherein the targeting peptide with linkers of various lengths is inserted at a suitable position between amino acids 588 and 589 (S-T). In certain embodiments, AAV5 is selected as the parent capsid, wherein the targeting peptide with linkers of various lengths is inserted at a suitable position between amino acids 577 and 578 (T-T). In certain embodiments, AAV4 is selected as the parent capsid, wherein the targeting peptide with linkers of various lengths is inserted at a suitable position between amino acids 586 and 587 (S-N). In certain embodiments, AAV3B is selected as the parent capsid, wherein the targeting peptide with linkers of various lengths is inserted at a suitable position between amino acids 588 and 589 (N-T). In certain embodiments, AAV2 is selected as the parent capsid, wherein the targeting peptide with linkers of various lengths is inserted at a suitable position between amino acids 587 and 588 (N-R). In certain embodiments, AAV1 is selected as the parent capsid, wherein the targeting peptide with linkers of various lengths is inserted at a suitable position between amino acids 588 and 589 (S-T). See also FIG. 4.
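The insertion described above amounts to splicing a peptide, with optional flanking linkers, between two adjacent residues of the parental vp1 sequence. The following is a minimal sketch under stated assumptions: `insert_peptide` and `INSERT_AFTER` are hypothetical names, the site table holds only a subset of the positions given in the text, and `toy_capsid` is a placeholder string rather than the actual AAV9 sequence of SEQ ID NO: 44, which is not reproduced here.

```python
# Residue after which the peptide is inserted (1-based vp1 numbering),
# a subset of the sites stated in the text.
INSERT_AFTER = {
    "AAV9": 588, "AAV8": 590, "AAV7": 589, "AAV6": 588,
    "AAV5": 577, "AAV4": 586, "AAV3B": 588, "AAV2": 587,
}


def insert_peptide(capsid: str, after_residue: int, peptide: str,
                   n_linker: str = "", c_linker: str = "") -> str:
    """Insert n_linker + peptide + c_linker after the given 1-based residue."""
    if not 1 <= after_residue < len(capsid):
        raise ValueError("insertion site must lie between two residues")
    return (capsid[:after_residue] + n_linker + peptide + c_linker
            + capsid[after_residue:])


# Placeholder sequence (NOT the real AAV9 vp1): 736 residues with Q at
# position 588 and A at position 589, matching the Q-A site in the text.
toy_capsid = "G" * 587 + "QA" + "G" * 147

# SSNTVKLTSGH (SEQ ID NO: 40) flanked by two-residue GG linkers gives the
# insert GGSSNTVKLTSGHGG, which is the sequence listed as SEQ ID NO: 39.
mutant = insert_peptide(toy_capsid, INSERT_AFTER["AAV9"], "SSNTVKLTSGH",
                        n_linker="GG", c_linker="GG")
```

Residue 588 of the parental sequence keeps its position, while every residue from 589 onward shifts by the length of the insert plus linkers, so positions quoted after insertion should state which numbering they use.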

In certain embodiments, the parental capsid modified to contain the N-x-(T/I/V/A)-(K/R) motif and optional flanking sequences is selected from a parvovirus capsid (e.g., a Clade F AAV such as AAVhu68 or AAV9, a Clade E AAV such as AAV8, or certain Clade A AAV such as AAV1 or AAVrh91) or a non-parvovirus capsid (e.g., herpes simplex virus), in order to enhance expression in CNS-targeted cells and/or otherwise modulate the type of CNS cell targeted. In other embodiments, the capsid is selected from a parvovirus capsid that does not natively target the CNS (e.g., a Clade F AAV such as AAVhu68 or AAV9, or certain Clade A AAV such as AAV1 or AAVrh91) or a non-parvovirus capsid (e.g., herpes simplex virus (HSV)). See, for example, WO 2020/223231 (rh91, including forms having deamidation patterns), published November 5, 2020; U.S. Provisional Patent Application No. 63/065,616, filed August 14, 2020; and U.S. Provisional Patent Application No. 63/109,734, filed November 4, 2020. In certain embodiments, the capsid is selected from the Clade F AAVhu95 and AAVhu96 capsids. See, for example, U.S. Provisional Application No. 63/251,599, filed October 2, 2021.

In certain embodiments, the parental capsid modified to contain an N-x-(T/I/V/A)-(K/R) motif is selected from a virus (e.g., an AAV) that natively targets nasal epithelial cells, nasopharyngeal cells, and/or lung cells, so as to enhance targeting as compared to the parental capsid, e.g., a Clade A AAV (such as AAV1, AAVrh32.33, AAV6.2, AAV6, or AAVrh91), AAV5, or certain Clade F AAV (such as AAVhu68 or AAV9) capsid, or a non-parvovirus capsid (e.g., adenovirus, HSV, RSV). See, for example, WO 2020/223231 (rh91, including forms having deamidation patterns), published November 5, 2020; U.S. Provisional Patent Application No. 63/065,616, filed August 14, 2020; and U.S. Provisional Patent Application No. 63/109,734, filed November 4, 2020.

In certain embodiments, the AAV capsid is not a mutant AAV2 capsid comprising the NDVRAVS (SEQ ID NO: 48) sequence.

For example, a capsid from a Clade F AAV, such as AAVhu68 or AAV9, may be selected. Methods of generating vectors having an AAV9 capsid or an AAVhu68 capsid, and/or a chimeric capsid derived from AAV9, have been described. See, for example, US 7,906,111, which is incorporated herein by reference. Other AAV serotypes that transduce nasal cells or another suitable target (e.g., muscle or lung) can be selected as the capsid source for an AAV viral vector, including, for example, AAV1, AAV2, AAV3, AAV4, AAV5, AAV6, AAV6.2, AAV7, AAV8, AAV9, rh10, AAVrh64R1, AAVrh64R2, rh8, and AAVrh32.33 (see, e.g., U.S. Published Patent Application No. 2007-0036760-A1; U.S. Published Patent Application No. 2009-0197338-A1; and EP 1310571). See also WO 2003/042397 (AAV7 and other simian AAV), U.S. Patent 7,790,449 and U.S. Patent 7,282,199 (AAV8), WO 2005/033321 (AAV9), and WO 2006/110689. An AAV yet to be discovered, or a recombinant AAV based thereon, may also be used as a source of the AAV capsid. See, for example, International Application No. PCT/US21/45945, filed August 13, 2021 (AAV rh 91), WO 2020/223236 A1 (AAV rh 90), WO 2020/223231 A1, and WO 2020/223236 A1 (AAV rh92, AAV rh93, AAV rh 9193), all of which are incorporated herein by reference in their entirety. These documents also describe other AAV that may be selected for generating AAV and are incorporated by reference. In some embodiments, an AAV capsid (cap) for use in the viral vector may be generated by mutagenesis (i.e., by insertion, deletion, or substitution) of one of the aforementioned AAV caps or its encoding nucleic acid. In some embodiments, the AAV capsid is chimeric, comprising domains from two, three, four, or more of the aforementioned AAV capsid proteins. In some embodiments, the AAV capsid is a mosaic of vp1, vp2, and vp3 monomers from two or three different AAV or recombinant AAV. In some embodiments, the rAAV composition comprises more than one of the aforementioned caps.

As used herein, the term "clade", in relation to groups of AAV, refers to a group of AAV that are phylogenetically related to one another as determined using the Neighbor-Joining algorithm, by a bootstrap value of at least 75% (of at least 1000 replicates) and a Poisson correction distance measurement of no more than 0.05, based on alignment of the AAV vp1 amino acid sequences. The Neighbor-Joining algorithm has been described in the literature. See, e.g., M. Nei and S. Kumar, Molecular Evolution and Phylogenetics (Oxford University Press, New York (2000)). Computer programs are available that can be used to implement this algorithm. For example, the MEGA v2.1 program implements the modified Nei-Gojobori method. Using these techniques and computer programs, and the sequence of an AAV vp1 capsid protein, one of skill in the art can readily determine whether a selected AAV falls within one of the clades identified herein, within another clade, or outside these clades. See, e.g., G. Gao et al., J. Virol., June 2004; 78: 6381-6388, which identifies Clades A, B, C, D, E and F and provides nucleic acid sequences of novel AAV, GenBank accession numbers AY530553 to AY530629. See also WO 2005/033321.
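For orientation, the Poisson correction distance between two aligned amino acid sequences is commonly computed as d = -ln(1 - p), where p is the proportion of aligned sites that differ. The sketch below illustrates only that distance and the 0.05 threshold from the definition above; it does not perform Neighbor-Joining tree building or bootstrap analysis. The function name, the choice to skip gapped columns, and the toy sequences are assumptions made for illustration.

```python
import math


def poisson_distance(seq_a: str, seq_b: str, gap: str = "-") -> float:
    """Poisson correction distance d = -ln(1 - p) for two aligned sequences.

    Assumption: alignment columns containing a gap in either sequence are
    ignored when computing the proportion p of differing sites.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    columns = [(a, b) for a, b in zip(seq_a, seq_b) if a != gap and b != gap]
    if not columns:
        raise ValueError("no ungapped columns to compare")
    p = sum(a != b for a, b in columns) / len(columns)
    if p >= 1.0:
        return math.inf  # every site differs: the distance is undefined
    return -math.log(1.0 - p)


# Toy aligned sequences (not real vp1 proteins): 2 of 100 sites differ.
toy_a = "A" * 100
toy_b = "A" * 98 + "GG"
distance = poisson_distance(toy_a, toy_b)
within_threshold = distance <= 0.05
```

For small p the corrected distance is close to p itself, so a threshold of 0.05 corresponds to vp1 sequences that differ at a little under 5% of aligned sites.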

As used herein, an "AAV9 capsid" is a self-assembled AAV capsid composed of multiple AAV9 vp proteins. The AAV9 vp proteins are typically expressed as alternative splice variants encoded by a nucleic acid sequence that encodes the vp1 amino acid sequence of GenBank accession number AAS99264. These splice variants result in proteins of different lengths. In certain embodiments, an "AAV9 capsid" includes an AAV having an amino acid sequence that is 99% identical to AAS99264. See also WO 2019/168961, published September 6, 2019, including Table G, which provides the deamidation pattern of AAV9. See also US 7,906,111 and WO 2005/033321. As used herein, "AAV9 variants" include those described in, for example, WO 2016/049230, US 8,927,514, US 2015/0344911, and US 8,734,809.

rAAVhu68 is composed of an AAVhu68 capsid and a vector genome. The AAVhu68 capsid is an assembly of a heterogeneous population of vp1 proteins, a heterogeneous population of vp2 proteins, and a heterogeneous population of vp3 proteins. As used herein, the term "heterogeneous" or any grammatical variation thereof, when used in reference to vp capsid proteins, refers to a population consisting of elements that are not the same, for example, vp1, vp2, or vp3 monomers (proteins) having different modified amino acid sequences. See also PCT/US2018/019992, WO 2018/160582, entitled "Adeno-Associated Virus (AAV) Clade F Vector and Uses Therefor", which is incorporated herein by reference in its entirety.

For other recombinant viral vectors, a suitable exposed portion of the viral capsid or envelope protein responsible for targeting specificity is selected for insertion of the targeting peptide. For example, in adenoviruses, it may be desirable to modify the hexon protein. In lentiviruses, the envelope fusion protein may be modified to include one or more copies of the targeting motif. For vaccinia viruses, the major glycoprotein may be modified to include one or more copies of the targeting motif. Suitably, for safety, these recombinant viral vectors are replication-defective.

Expression cassette and vector

The genomic sequence of a vector that is packaged into an AAV capsid and delivered to a host cell typically consists of at least the transgene and its regulatory sequences and AAV Inverted Terminal Repeats (ITRs). Both single stranded AAV and self-complementary (sc) AAV are encompassed within the rAAV. A transgene is a nucleic acid coding sequence heterologous to the vector sequence that encodes a polypeptide, protein, functional RNA molecule (e.g., miRNA inhibitor), or other gene product of interest. The nucleic acid coding sequence is operably linked to the regulatory component in a manner that allows transcription, translation and/or expression of the transgene in cells of the target tissue.

The AAV sequences of the vector typically include the cis-acting 5' and 3' inverted terminal repeat (ITR) sequences (see, e.g., B. J. Carter, in "Handbook of Parvoviruses", ed. P. Tijssen, CRC Press, pp. 155 (1990)). The ITR sequences are about 145 base pairs (bp) in length. Preferably, substantially the entire sequences encoding the ITRs are used in the molecule, although some degree of minor modification of these sequences is permissible. The ability to modify these ITR sequences is within the skill of the art. (See, e.g., texts such as Sambrook et al., "Molecular Cloning: A Laboratory Manual", 2nd ed., Cold Spring Harbor Laboratory, New York (1989); and K. Fisher et al., J. Virol., 70:520 (1996).) An example of such a molecule employed in the present invention is a "cis-acting" plasmid containing the transgene, in which the selected transgene sequence and associated regulatory elements are flanked by the 5' and 3' AAV ITR sequences. In one embodiment, the ITRs are from an AAV different from the AAV supplying the capsid. In one embodiment, the ITR sequences are from AAV2. A shortened version of the 5' ITR, termed ΔITR, has been described in which the D sequence and the terminal resolution site (trs) are deleted. In certain embodiments, the vector genome (e.g., of a plasmid) comprises a shortened AAV2 ITR of 130 base pairs, in which the external A element is deleted. The shortened ITR reverts to the wild-type length of 145 base pairs during amplification of the vector DNA, using the internal A element as a template, and packaging into the capsid to form the viral particle. In other embodiments, full-length AAV 5' and 3' ITRs are used. However, ITRs from other AAV sources may be selected. Where the source of the ITRs is AAV2 and the AAV capsid is from another AAV source, the resulting vector may be termed pseudotyped. However, other configurations of these elements may be suitable.

In addition to the major elements identified above for recombinant AAV vectors, AAV vectors also comprise the necessary conventional control elements operably linked to the transgene in a manner that allows for its transcription, translation, and/or expression in cells transfected with the plasmid vector or infected with the virus produced by the invention. As used herein, an "operably linked" sequence comprises an expression control sequence that is contiguous with the gene of interest and an expression control sequence that acts in trans or remotely to control the gene of interest.

Regulatory control elements typically contain a promoter sequence as part of the expression control sequences, e.g., located between the selected 5' ITR sequence and the coding sequence. Constitutive promoters, regulatable promoters [see, e.g., WO 2011/126808 and WO 2013/04943], tissue-specific promoters, or promoters responsive to physiological cues may be used in the vectors described herein.

Examples of constitutive promoters suitable for controlling expression of the therapeutic products include, but are not limited to, the chicken β-actin (CB) promoter, human cytomegalovirus (CMV) promoter, ubiquitin C promoter (UbC), the early and late promoters of simian virus 40 (SV40), U6 promoter, metallothionein promoter, EF1α promoter, ubiquitin promoter, hypoxanthine phosphoribosyl transferase (HPRT) promoter, dihydrofolate reductase (DHFR) promoter (Scharfmann et al., Proc. Natl. Acad. Sci. USA 88:4626-4630 (1991)), adenosine deaminase promoter, phosphoglycerate kinase (PGK) promoter, pyruvate kinase promoter, phosphoglycerate mutase promoter, and β-actin promoter (Lai et al.). Examples of tissue- or cell-specific promoters suitable for use in the present invention include, but are not limited to, endothelin-I (ET-I) and Flt-I, which are specific for endothelial cells, and FoxJ1, which targets ciliated cells. Other examples of tissue-specific promoters suitable for use in the present invention include, but are not limited to, liver-specific promoters. Examples of liver-specific promoters include thyroxine-binding globulin (TBG); albumin, Miyatake et al., (1997) J. Virol., 71:5124; hepatitis B virus core promoter, Sandig et al., (1996) Gene Ther., 3:1002-9; and human alpha 1-antitrypsin, phosphoenolpyruvate carboxykinase (PEPCK), or alpha-fetoprotein (AFP), Arbuthnot et al., (1996) Hum. Gene Ther., 7:150114. Preferably, such promoters are of human origin.

Inducible promoters suitable for controlling expression of the therapeutic product include promoters responsive to exogenous agents (e.g., pharmacological agents) or to physiological cues. These response elements include, but are not limited to, a hypoxia response element (HRE) that binds HIF-1α and β; a metal-ion response element such as described by Mayo et al. (1982, Cell 29:99-108), Brinster et al. (1982, Nature 296:39-42), and Searle et al. (1985, Mol. Cell. Biol. 5:1480-1489); or a heat shock response element such as described by Nover et al. (in: Heat Shock Response, ed. Nover, L., CRC, Boca Raton, Fla., ppI-220, 1991).

In one embodiment, expression of the gene product is controlled by a regulatable promoter that provides tight control over transcription of the sequence encoding the gene product, e.g., a promoter regulated by a pharmacological agent, by transcription factors activated by a pharmacological agent or, in alternative embodiments, by physiological cues. Promoter systems that are non-leaky and can be tightly controlled are preferred.

Examples of regulatable promoters that are ligand-dependent transcription factor complexes useful in the present invention include, but are not limited to, members of the nuclear receptor superfamily activated by their respective ligands (e.g., glucocorticoid, estrogen, progestin, retinoid, ecdysone, and analogs and mimetics thereof) and rTTA activated by tetracycline. In one aspect of the invention, the gene switch is an EcR-based gene switch. Examples of such systems include, but are not limited to, the systems described in U.S. Patent Nos. 6,258,603 and 7,045,315, U.S. Published Patent Application Nos. 2006/0014711 and 2007/0161086, and International Published Application No. WO 01/70816. Examples of chimeric ecdysone receptor systems are described in U.S. Patent No. 7,091,038, U.S. Published Patent Application Nos. 2002/0110861, 2004/0033600, 2004/0096942, 2005/0266457 and 2006/0100416, and International Published Application Nos. WO 01/70816, WO 02/066612, WO 02/066613, WO 02/066614, WO 02/066615, WO 02/29075 and WO 2005/108617, each of which is incorporated by reference in its entirety. An example of a non-steroidal ecdysone agonist-regulated system is the Mammalian Inducible Expression System (New England Biolabs, Ipswich, MA).

Still other promoter systems may contain response elements including, but not limited to, a tetracycline (tet) response element (e.g., as described by Gossen & Bujard (1992, Proc. Natl. Acad. Sci. USA 89:5547-551)) or a hormone response element, e.g., as described by Lee et al. (1981, Nature 294:228-232), Hynes et al. (1981, Proc. Natl. Acad. Sci. USA 78:2038-2042), Klock et al. (1987, Nature 329:734-736), and Israel & Kaufman (1989, Nucl. Acids Res. 17:2589-2604), as well as other inducible promoters known in the art. Using such promoters, expression of the soluble hACE2 construct can be controlled, for example, by the Tet-on/off system (Gossen et al., 1995, Science 268:1766-9; Gossen et al., 1992, Proc. Natl. Acad. Sci. USA 89(12):5547-51); the TetR-KRAB system (Urrutia R., 2003, Genome Biol. 4(10):231; Deuschle U et al., 1995, Mol. Cell. Biol. (4):1907-14); the mifepristone (RU486) regulatable system (Geneswitch; Wang Y et al., 1994, Proc. Natl. Acad. Sci. USA 91(17):8180-4; Schillinger et al., 2005, Proc. Natl. Acad. Sci. USA 102(39):13789-94); and the humanized tamoxifen-dep regulatable system (Roscilli et al., 2002, Mol. Ther. 6(5):653-63).

In another aspect, the gene switch is based on heterodimerization of FK506 binding protein (FKBP) with FKBP rapamycin associated protein (FRAP) and is regulated by rapamycin or a non-immunosuppressive analog thereof. Examples of such systems include, but are not limited to, the ARGENT™ Transcriptional Technology (ARIAD Pharmaceuticals, Cambridge, Mass.) and the systems described in U.S. Patent Nos. 6,015,709, 6,117,680, 6,479,653, 6,187,757 and 6,649,595, U.S. Publication No. 2002/0173474, U.S. Publication No. 2009/0100535, U.S. Patent Nos. 5,834,266, 7,109,317, 7,485,441, 5,830,462, 5,869,337, 5,871,753, 6,011,018, 6,043,082, 6,046,047, 6,063,625, 6,140,120, 6,165,787, 6,972,193 and 6,326,166, WO 94/18347, WO 96/20951, WO 96/06097, WO 97/31898, WO 96/41865, WO 98/02441, WO 95/33052, WO 99/10508, WO 99/10510, WO 99/36553, WO 99/41258, WO 01/14387, the ARGENT™ Regulated Transcription Retrovirus Kit, Version 2.0 (9/09/02), and the ARGENT™ Regulated Transcription Plasmid Kit, Version 2.0 (9/09/02), each of which is incorporated herein by reference in its entirety. The Ariad system is designed to be induced by rapamycin and its analogs, referred to as "rapalogs". Examples of suitable rapamycins are provided in the documents listed above in connection with the description of the ARGENT™ system. In one embodiment, the molecule is rapamycin [e.g., marketed by Pfizer as Rapamune™]. In another embodiment, a rapalog known as AP21967 [ARIAD] is used.
Examples of such dimerizer molecules useful in the present invention include, but are not limited to, rapamycin, FK506, FK1012 (a homodimer of FK506), and rapamycin analogs ("rapalogs"), which are readily prepared by chemical modification of the natural product to add a "bump" that reduces or eliminates affinity for endogenous FKBP and/or FRAP. Examples of rapalogs include, but are not limited to, AP26113 (Ariad), AP1510 (Amara, J.F. et al., 1997, Proc. Natl. Acad. Sci. USA 94(20):10618-23), AP22660, AP22594, AP21370, AP23054, AP1855, AP1856, AP1701, AP1861, AP1692 and AP1889, with designed "bumps" that minimize interactions with endogenous FKBP. Still other rapalogs may be selected, e.g., AP23573 [Merck]. In certain embodiments, rapamycin or a suitable analog may be delivered locally to the AAV-transfected cells of the nasopharynx. This local delivery may be by intranasal injection, or topically to the cells via bolus, cream, or gel. See U.S. Patent Application No. 2019/0216841 A1, which is incorporated herein by reference.

Other suitable enhancers include those appropriate for the desired target tissue indication. In one embodiment, the expression cassette comprises one or more expression enhancers. In one embodiment, the expression cassette contains two or more expression enhancers. These enhancers may be the same as or different from one another. For example, an enhancer may include a CMV immediate early enhancer. This enhancer may be present in two copies located adjacent to one another. Alternatively, the dual copies of the enhancer may be separated by one or more sequences. In yet another embodiment, the expression cassette further contains an intron, e.g., the chicken β-actin intron. Other suitable introns include those known in the art, e.g., those described in WO 2011/126808. Examples of suitable polyadenylation (polyA) sequences include, e.g., rabbit β-globin (rBG), SV40, SV50, bovine growth hormone (bGH), human growth hormone, and synthetic polyAs. Optionally, one or more sequences may be selected to stabilize mRNA. An example of such a sequence is a modified WPRE sequence, which may be engineered upstream of the polyA sequence and downstream of the coding sequence (see, e.g., MA Zanta-Boussif et al., Gene Therapy (2009) 16:605-619).

An AAV viral vector may include multiple transgenes. In some cases, a different transgene may be used to encode each subunit of a protein (e.g., an immunoglobulin domain, an immunoglobulin heavy chain, an immunoglobulin light chain). In one embodiment, a cell produces the multi-subunit protein following infection/transfection with the virus containing each of the different subunits. In another embodiment, different subunits of a protein may be encoded by the same transgene, with the DNA encoding each subunit separated by an internal ribosome entry site (IRES). An IRES is desirable when the size of the DNA encoding each of the subunits is small, e.g., the total size of the DNA encoding the subunits and the IRES is less than five kilobases. As an alternative to an IRES, the DNA may be separated by sequences encoding a 2A peptide, which self-cleaves in a post-translational event. See, e.g., ML Donnelly et al. (January 1997) J. Gen. Virol., 78(Pt 1):13-21; Furler, S. et al. (June 2001) Gene Ther., 8(11):864-873; Klump et al. (May 2001) Gene Ther., 8(10):811-817. The 2A peptide is significantly smaller than an IRES, making it well suited for use when space is a limiting factor. More often, when the transgene is large, consists of multiple subunits, or two transgenes are co-delivered, rAAV carrying the desired transgenes or subunits are co-administered to allow them to concatamerize in vivo to form a single vector genome. In such an embodiment, a first AAV may carry an expression cassette that expresses a single transgene, and a second AAV may carry an expression cassette that expresses a different transgene for co-expression in the host cell. However, the selected transgene may encode any biologically active product or other product, e.g., a product desired for study.

In addition to the elements identified above for the expression cassette, the vector also includes conventional control elements that are operably linked to the coding sequence in a manner that permits transcription, translation and/or expression of the encoded product (e.g., a soluble hACE2 construct, an anti-influenza antibody, an anti-COVID-19 antibody) in a cell transfected with the plasmid vector or infected with the virus produced by the invention. Examples of other suitable transgenes are provided herein. As used herein, "operably linked" sequences include both expression control sequences that are contiguous with the gene of interest and expression control sequences that act in trans or at a distance to control the gene of interest.

Expression control sequences include appropriate enhancers; transcription factors; transcription terminators; promoters; efficient RNA processing signals such as splicing and polyadenylation (polyA) signals; sequences that stabilize cytoplasmic mRNA, for example the woodchuck hepatitis virus (WHP) post-transcriptional regulatory element (WPRE); sequences that enhance translation efficiency (i.e., Kozak consensus sequences); sequences that enhance protein stability; and, when desired, sequences that enhance secretion of the encoded product.

In one embodiment, the regulatory sequences are selected such that the total rAAV vector genome is about 2.0 to about 5.5 kilobases in size. In one embodiment, it is desirable that the rAAV vector genome approximate the size of the native AAV genome. Thus, in one embodiment, the regulatory sequences are selected such that the total rAAV vector genome is about 4.7 kb in size. In another embodiment, the total rAAV vector genome is less than about 5.2 kb in size. The size of the vector genome may be manipulated based on the size of the regulatory sequences, including the promoter, enhancer, intron, polyA, etc. See Wu et al., Mol. Ther., January 2010, 18(1):80-6, which is incorporated herein by reference.

Thus, in one embodiment, an intron is included in the vector. Suitable introns include the chicken β-actin intron; the human β-globin IVS2 (Kelly et al., Nucleic Acids Research 43(9):4721-32 (2015)); the Promega chimeric intron (Almond, B. and Schenborn, E.T., A Comparison of pCI-neo Vector and pcDNA4/HisMax Vector); and the hFIX intron. Various introns suitable for use herein are known in the art and include, but are not limited to, introns found on bpg. See also Shepelev V., Fedorov A., Advances in the Exon-Intron Database, Briefings in Bioinformatics, 2006, 7:178-185, which is incorporated herein by reference.

Several different viral genomes were generated in the studies described herein. However, one of skill in the art will appreciate that other genomic configurations may be used, including substitution of other regulatory sequences for the promoters and enhancers, and that other coding sequences may be selected.

rAAV vector production

For use in the production of AAV viral vectors (e.g., recombinant (r) AAV), the expression cassette may be carried on any suitable vector (e.g., plasmid) for delivery to packaging host cells. Plasmids useful in the present invention can be engineered so that they are suitable for in vitro replication and packaging in prokaryotic cells, insect cells, mammalian cells, and other cells. Suitable transfection techniques and packaging host cells are known and/or can be readily designed by those skilled in the art.

In certain embodiments, incorporating at least one copy of the N-x-(T/I/V/A)-(K/R) motif into an AAV capsid provides a production advantage over methods that do not incorporate at least one copy of the motif into the AAV capsid, wherein the producer cell is a 293 cell.
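As an illustration of how the motif pattern referred to above might be located in a peptide or capsid sequence, the following is a minimal sketch. It assumes that "x" stands for any of the twenty standard amino acids and that the third position is T, I, V, or A (reading the lowercase "a" of the motif as alanine). The function name and the example sequences are hypothetical and are not taken from the sequence listing.

```python
import re

# Twenty standard amino acids in one-letter code (assumption for position "x")
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

# A lookahead is used so that overlapping occurrences of the motif are all reported
MOTIF_PATTERN = re.compile(rf"(?=(N[{AMINO_ACIDS}][TIVA][KR]))")


def find_targeting_motifs(sequence: str) -> list[tuple[int, str]]:
    """Return (0-based start index, matched tetrapeptide) for each motif occurrence."""
    return [(m.start(), m.group(1)) for m in MOTIF_PATTERN.finditer(sequence.upper())]


if __name__ == "__main__":
    # Hypothetical peptide used only to exercise the pattern
    print(find_targeting_motifs("AAQNATKAA"))  # [(3, 'NATK')]
```

The sketch only scans for the four-residue pattern; it says nothing about whether a given insertion site is surface-exposed or functional.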

Methods of preparing AAV-based vectors (e.g., having an AAV9 or another AAV capsid) are known. See, e.g., U.S. Published Patent Application No. 2007/0036760 (February 15, 2007), which is incorporated herein by reference. The invention is not limited to the use of AAV9 or other clade F AAV amino acid sequences, but encompasses peptides and/or proteins containing the terminal β-galactose binding generated by other methods known in the art, including, e.g., by chemical synthesis, by other synthetic techniques, or by other methods. The sequence of any of the AAV capsids provided herein can be readily generated using a variety of techniques. Suitable production techniques are well known to those of skill in the art. See, e.g., Sambrook et al., Molecular Cloning: A Laboratory Manual, Cold Spring Harbor Press (Cold Spring Harbor, N.Y.). Alternatively, peptides can also be synthesized by the well-known solid phase peptide synthesis methods (Merrifield, (1962) J. Am. Chem. Soc., 85:2149; Stewart and Young, Solid Phase Peptide Synthesis (San Francisco, 1969) pp. 27-62). These methods may involve, e.g., culturing a host cell which contains a nucleic acid sequence encoding an AAV capsid; a functional rep gene; a minigene composed of, at a minimum, AAV inverted terminal repeats (ITRs) and a transgene; and sufficient helper functions to permit packaging of the minigene into the AAV capsid protein. These and other suitable production methods are within the knowledge of those of skill in the art and are not a limitation of the present invention.

The components required to be cultured in the host cell to package an AAV minigene in an AAV capsid may be provided to the host cell in trans. Alternatively, any one or more of the required components (e.g., minigene, rep sequences, cap sequences, and/or helper functions) may be provided by a stable host cell that has been engineered to contain one or more of the required components using methods known to those of skill in the art. Most suitably, such a stable host cell will contain the required component(s) under the control of an inducible promoter. However, the required component(s) may be under the control of a constitutive promoter. Examples of suitable inducible and constitutive promoters are provided herein, in the discussion of regulatory elements suitable for use with the transgene. In still another alternative, a selected stable host cell may contain selected component(s) under the control of a constitutive promoter and other selected component(s) under the control of one or more inducible promoters. For example, a stable host cell may be generated which is derived from 293 cells (which contain E1 helper functions under the control of a constitutive promoter), but which contains the rep and/or cap proteins under the control of inducible promoters. Still other stable host cells may be generated by one of skill in the art.

These rAAV are particularly well suited to gene delivery for therapeutic purposes and for preventing infection. Further, the compositions of the invention may also be used for production of a desired gene product in vitro. For in vitro production, a desired product (e.g., a protein) may be obtained from a desired culture following transfection of host cells with a rAAV containing the molecule encoding the desired product and culturing the cell culture under conditions which permit expression. The expressed product may then be purified and isolated, as desired. Suitable techniques for transfection, cell culturing, purification, and isolation are known to those of skill in the art. Methods for generating and isolating AAV suitable for use as vectors are known in the art. See, e.g., Grieger & Samulski, 2005, "Adeno-associated virus as a gene therapy vector: vector development, production and clinical applications," Adv. Biochem. Engin/Biotechnol. 99:119-145; Buning et al., 2008, "Recent developments in adeno-associated virus vector technology," J. Gene Med. 10:717-733; and the references cited below, each of which is incorporated herein by reference in its entirety. For packaging a transgene into virions, the ITRs are the only AAV components required in cis in the same construct as the nucleic acid molecule containing the expression cassette. The cap and rep genes can be supplied in trans.

In one embodiment, the expression cassettes described herein are engineered into a genetic element (e.g., a shuttle plasmid) which transfers the immunoglobulin construct sequences carried thereon into a packaging host cell for production of a viral vector. In one embodiment, the selected genetic element may be delivered to an AAV packaging cell by any suitable method, including transfection, electroporation, liposome delivery, membrane fusion techniques, high velocity DNA-coated pellets, viral infection, and protoplast fusion. Stable AAV packaging cells can also be made. Alternatively, the expression cassettes may be used to generate a viral vector other than AAV, or for production of mixtures of antibodies in vitro. The methods used to make such constructs are known to those with skill in nucleic acid manipulation and include genetic engineering, recombinant engineering, and synthetic techniques. See, e.g., Molecular Cloning: A Laboratory Manual, ed. Green and Sambrook, Cold Spring Harbor Press, Cold Spring Harbor, NY (2012).

The term "AAV intermediate" or "AAV vector intermediate" refers to an assembled rAAV capsid which lacks the desired genomic sequences packaged therein. These may also be termed "empty" capsids. Such a capsid may contain no detectable genomic sequences of an expression cassette, or only partially packaged genomic sequences which are insufficient to achieve expression of the gene product. These empty capsids are non-functional for transferring the gene of interest to a host cell.

The recombinant AAV described herein may be generated using techniques which are known. See, e.g., WO 2003/042397; WO 2005/033321; WO 2006/110689; US 7588772 B2. Such a method involves culturing a host cell which contains a nucleic acid sequence encoding an AAV capsid; a functional rep gene; an expression cassette composed of, at a minimum, AAV inverted terminal repeats (ITRs) and a transgene; and sufficient helper functions to permit packaging of the expression cassette into the AAV capsid protein. Methods of generating the capsid, coding sequences therefor, and methods for production of rAAV viral vectors have been described. See, e.g., Gao et al., Proc. Natl. Acad. Sci. U.S.A. 100(10), 6081-6086 (2003) and US 2013/0045186A1.

In one embodiment, the cells are manufactured in a suitable cell culture (e.g., HEK 293 cells). Methods for manufacturing the gene therapy vectors described herein include methods well known in the art, such as generation of plasmid DNA used for production of the gene therapy vectors, generation of the vectors, and purification of the vectors. In some embodiments, the gene therapy vector is an AAV vector, and the plasmids generated are an AAV cis-plasmid encoding the AAV genome and the gene of interest for packaging into the capsid, an AAV trans-plasmid containing the AAV rep and cap genes, and an adenovirus helper plasmid. The vector generation process can include method steps such as initiation of cell culture, passage of cells, seeding of cells, transfection of cells with the plasmid DNA, post-transfection medium exchange to serum-free medium, and harvest of vector-containing cells and culture media. The harvested vector-containing cells and culture media are referred to herein as crude cell harvest. In yet another system, the gene therapy vectors are introduced into insect cells by infection with baculovirus-based vectors. For reviews of these production systems, see generally, e.g., Zhang et al., 2009, "Adenovirus-adeno-associated virus hybrid for large-scale recombinant adeno-associated virus production," Human Gene Therapy 20:922-929, which is incorporated herein by reference in its entirety. Methods of making and using these and other AAV production systems are also described in the following U.S. patents, the contents of each of which are incorporated herein by reference in their entirety: 5,139,941; 5,741,683; 6,057,152; 6,204,059; 6,268,213; 6,491,907; 6,660,514; 6,951,753; 7,094,604; 7,172,893; 7,201,898; 7,229,823; and 7,439,065. In certain embodiments, methods of making and using an AAV production system include methods using pseudorabies virus (rPRV) as described in U.S. Patent Application No. 63/016,894, filed April 28, 2020, which is incorporated herein by reference.

The crude cell harvest may thereafter be subjected to method steps such as concentration of the vector harvest, diafiltration of the vector harvest, microfluidization of the vector harvest, nuclease digestion of the vector harvest, filtration of the microfluidized intermediate, crude purification by chromatography, crude purification by ultracentrifugation, buffer exchange by tangential flow filtration, and/or formulation and filtration to prepare bulk vector.

A two-step affinity chromatography purification at high salt concentration, followed by anion exchange resin chromatography, is used to purify the vector drug product and to remove empty capsids. These methods are described in more detail in International Patent Application No. PCT/US2016/065970, entitled "Scalable Purification Method for AAV9", filed December 9, 2016, which is incorporated herein by reference. Purification methods for AAV8 (International Patent Application No. PCT/US2016/065976, filed December 9, 2016), for rh10 (International Patent Application No. PCT/US16/66013, entitled "Scalable Purification Method for AAVrh10", filed December 11, 2016), and for AAV1 (International Patent Application No. PCT/US2016/065974, entitled "Scalable Purification Method for AAV1", filed December 9, 2016, and the AAV1 purification method filed December 11, 2015) are all incorporated herein by reference.

To calculate empty and full particle content, vp3 band volumes for a selected sample (e.g., in the examples herein, an iodixanol gradient-purified preparation where the number of GC equals the number of particles) are plotted against the GC particles loaded. The resulting linear equation (y = mx + c) is used to calculate the number of particles in the band volumes of the test article peaks. The number of particles (pt) per 20 μL loaded is then multiplied by 50 to give particles (pt)/mL. Pt/mL divided by GC/mL gives the ratio of particles to genome copies (pt/GC). Pt/mL minus GC/mL gives empty pt/mL. Empty pt/mL divided by pt/mL and multiplied by 100 gives the percentage of empty particles.
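The arithmetic in the preceding paragraph can be sketched as follows. This is a minimal illustration only: the function names and the numerical values in the example are hypothetical, an ordinary least-squares fit is assumed for the linear equation y = mx + c, and the factor of 50 reflects the 20 μL load volume stated in the text.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = m*x + c; returns (m, c)."""
    n = len(xs)
    if n < 2 or n != len(ys):
        raise ValueError("need at least two (x, y) pairs of equal length")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("x values must not all be identical")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    m = sxy / sxx
    c = mean_y - m * mean_x
    return m, c


def particles_from_band_volume(band_volume, m, c):
    """Invert y = m*x + c to get particles loaded from a vp3 band volume."""
    return (band_volume - c) / m


def empty_particle_stats(particles_per_20ul, gc_per_ml):
    """Apply the pt/mL, pt/GC, empty pt/mL and percent-empty steps from the text."""
    pt_per_ml = particles_per_20ul * 50          # 20 uL load -> per mL
    pt_to_gc = pt_per_ml / gc_per_ml             # particle-to-genome-copy ratio
    empty_pt_per_ml = pt_per_ml - gc_per_ml      # particles without a genome
    percent_empty = empty_pt_per_ml / pt_per_ml * 100
    return {
        "pt_per_ml": pt_per_ml,
        "pt_to_gc": pt_to_gc,
        "empty_pt_per_ml": empty_pt_per_ml,
        "percent_empty": percent_empty,
    }


if __name__ == "__main__":
    # Hypothetical standard curve: GC particles loaded vs. vp3 band volume
    m, c = fit_line([1e9, 2e9, 4e9], [10.0, 20.0, 40.0])
    particles = particles_from_band_volume(30.0, m, c)
    print(empty_particle_stats(particles, gc_per_ml=1e11))
```

With these illustrative inputs the test article works out to roughly one third empty particles; real values depend entirely on the measured band volumes and the GC titer.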

Generally, methods for assaying for empty capsids and AAV vector particles with packaged genomes are known in the art. See, e.g., Grimm et al., Gene Therapy (1999) 6:1322-1330; and Sommer et al., Molec. Ther. (2003) 7:122-128. To test for denatured capsid, the methods include subjecting the treated AAV stock to SDS-polyacrylamide gel electrophoresis, consisting of any gel capable of separating the three capsid proteins (for example, a gradient gel containing 3-8% Tris-acetate in the buffer), then running the gel until the sample material is separated, and blotting the gel onto nylon or nitrocellulose membranes, preferably nylon. Anti-AAV capsid antibodies are then used as the primary antibodies that bind to the denatured capsid proteins, preferably an anti-AAV capsid monoclonal antibody, most preferably the B1 anti-AAV2 monoclonal antibody (Wobus et al., J. Virol. (2000) 74:9281-9293). A secondary antibody is then used, one that binds to the primary antibody and contains a means for detecting binding with the primary antibody, more preferably an anti-IgG antibody containing a detection molecule covalently bound to it, most preferably a sheep anti-mouse IgG antibody covalently linked to horseradish peroxidase. A method for detecting binding is used to semi-quantitatively determine binding between the primary and secondary antibodies, preferably a detection method capable of detecting radioactive isotope emissions, electromagnetic radiation, or colorimetric changes, most preferably a chemiluminescence detection kit. For example, for SDS-PAGE, samples from column fractions can be taken and heated in SDS-PAGE loading buffer containing a reducing agent (e.g., DTT), and the capsid proteins resolved on pre-cast gradient polyacrylamide gels (e.g., Novex). Silver staining may be performed using SilverXpress (Invitrogen, CA) according to the manufacturer's instructions, or another suitable staining method (e.g., SYPRO ruby or Coomassie stains) may be used.
In one embodiment, the concentration of AAV vector genomes (vg) in column fractions can be measured by quantitative real-time PCR (Q-PCR). Samples are diluted and digested with DNase I (or another suitable nuclease) to remove exogenous DNA. After inactivation of the nuclease, the samples are further diluted and amplified using primers and a TaqMan™ fluorogenic probe specific for the DNA sequence between the primers. The number of cycles required to reach a defined level of fluorescence (threshold cycle, Ct) is measured for each sample on an Applied Biosystems Prism 7700 Sequence Detection System. Plasmid DNA containing the same sequences as those contained in the AAV vector is used to generate a standard curve in the Q-PCR reaction. The Ct values obtained from the samples are used to determine the vector genome titer by normalizing them to the Ct values of the plasmid standard curve. End-point assays based on digital PCR may also be used.

In addition, other methods of measuring the empty-to-full particle ratio are known in the art. Sedimentation velocity, as measured in an analytical ultracentrifuge (AUC), can detect aggregates and other minor components, and provides good quantitation of the relative amounts of different particle species based on their different sedimentation coefficients. This is an absolute method based on fundamental units of length and time, requiring no standard molecules as references. Vector samples are loaded into cells with 2-channel charcoal-epon centerpieces with a 12 mm optical path length. The supplied dilution buffer is loaded into the reference channel of each cell. The loaded cells are then placed into an AN-60Ti analytical rotor and loaded into a Beckman-Coulter ProteomeLab XL-I analytical ultracentrifuge equipped with both absorbance and RI detectors. After full temperature equilibration at 20 ℃, the rotor is brought to the final run speed of 12,000 rpm. A₂₈₀ scans are recorded approximately every 3 minutes for about 5.5 hours (110 total scans per sample). The raw data are analyzed using the c(s) method as implemented in the analysis program SEDFIT. The resulting size distributions are plotted and the peaks integrated. The percentage value associated with each peak represents the peak area fraction of the total area under all peaks and is based on the raw data generated at 280 nm; many laboratories use these values to calculate empty-to-full particle ratios. However, because empty and full particles have different extinction coefficients at this wavelength, the raw data can be adjusted accordingly. The ratio of the empty particle and full monomer peak values, both before and after extinction coefficient adjustment, is used to determine the empty-to-full particle ratio.
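
The text does not spell out how the extinction coefficient adjustment is made. One plausible reading, sketched below in Python, is that each integrated A280 peak area is divided by the extinction coefficient of its particle species (absorbance being proportional to concentration times extinction coefficient) before the fractions are recomputed. The function name and the coefficient values in the example are hypothetical; real coefficients depend on the capsid and the packaged genome.

```python
def adjust_fractions(area_empty, area_full, ext_empty, ext_full):
    """Convert A280 peak areas into relative particle-number fractions.

    Assumes peak area is proportional to (particle number x extinction coefficient),
    so dividing by the coefficient recovers a quantity proportional to particle number.
    Returns (fraction_empty, fraction_full).
    """
    if ext_empty <= 0 or ext_full <= 0:
        raise ValueError("extinction coefficients must be positive")
    n_empty = area_empty / ext_empty
    n_full = area_full / ext_full
    total = n_empty + n_full
    if total <= 0:
        raise ValueError("peak areas must sum to a positive value")
    return n_empty / total, n_full / total


# Raw peak areas of 30% empty and 70% full; full particles are assumed here
# to absorb twice as strongly at 280 nm (a hypothetical ratio).
raw_empty = 30.0 / (30.0 + 70.0)
adj_empty, adj_full = adjust_fractions(30.0, 70.0, ext_empty=1.0, ext_full=2.0)
```

Because full particles absorb more strongly at 280 nm, the unadjusted area fraction understates the proportion of empty particles; in this toy case the empty fraction rises from 0.30 to about 0.46 after adjustment.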

In one aspect, an optimized q-PCR method is used which utilizes a broad-spectrum serine protease, e.g., proteinase K (such as is commercially available from Qiagen). More particularly, the optimized qPCR genome titer assay is similar to a standard assay, except that after the DNase I digestion, samples are diluted with proteinase K buffer and treated with proteinase K, followed by heat inactivation. Suitably, samples are diluted with proteinase K buffer in an amount equal to the sample size. The proteinase K buffer may be concentrated 2-fold or higher. Typically, proteinase K treatment is at about 0.2 mg/mL, but may be varied from 0.1 mg/mL to about 1 mg/mL. The treatment step is generally conducted at about 55 ℃ for about 15 minutes, but may be performed at a lower temperature (e.g., about 37 ℃ to about 50 ℃) over a longer time period (e.g., about 20 minutes to about 30 minutes), or at a higher temperature (e.g., up to about 60 ℃) for a shorter time period (e.g., about 5 to 10 minutes). Similarly, heat inactivation is generally at about 95 ℃ for about 15 minutes, but the temperature may be lowered (e.g., about 70 ℃ to about 90 ℃) and the time extended (e.g., about 20 minutes to about 30 minutes). Samples are then diluted (e.g., 1000-fold) and subjected to TaqMan analysis as described in the standard assay. Quantification may also be performed using ViroCyt or flow cytometry.

Additionally or alternatively, droplet digital PCR (ddPCR) may be used. For example, methods for determining single-stranded and self-complementary AAV vector genome titers by ddPCR have been described. See, e.g., M. Lock et al., Hum. Gene Ther. Methods, April 2014; 25(2):115-25. doi: 10.1089/hgtb.2013.131. Epub February 14, 2014.

Therapeutic proteins and delivery systems

Fusion partners, conjugation partners, and recombinant vectors comprising the targeting motif provided herein (i.e., the N-x-(T/I/V/A)-(K/R) motif) can be used with a variety of different therapeutic proteins, polypeptides, nanoparticles, and delivery systems. Examples of proteins and compounds useful in the compositions and targeted delivery provided herein include the following. It will be appreciated that viral vectors, nanoparticles, and other delivery systems contain sequences encoding the selected protein (or conjugate) for expression in vivo.
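
For readers who want to check a peptide or capsid sequence for the N-x-(T/I/V/A)-(K/R) motif, a minimal Python sketch follows. It assumes that "x" stands for any one of the 20 standard amino acids and that sequences are given in one-letter code; the function name and the test sequences are hypothetical and are not sequences from this disclosure.

```python
import re

# N, then any standard amino acid (assumed meaning of "x"),
# then one of T/I/V/A, then one of K/R.
MOTIF = re.compile(r"N[ACDEFGHIKLMNPQRSTVWY][TIVA][KR]")


def find_motifs(sequence):
    """Return (1-based start position, matched tetrapeptide) for each motif hit."""
    seq = sequence.upper()
    return [(m.start() + 1, m.group(0)) for m in MOTIF.finditer(seq)]


# Hypothetical test peptides.
hits_single = find_motifs("aanstkgg")
hits_double = find_motifs("NQVRAAANGAK")
hits_none = find_motifs("NSSK")
```

Two occurrences of this four-residue motif cannot overlap, since a second N would have to sit at a position already fixed to T/I/V/A or K/R (or would shift K/R onto a T/I/V/A position), so a plain left-to-right scan finds every hit.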

In certain embodiments, the proteins include the MCT8 protein (SLC16A2 gene) and other compounds useful in treating Allan-Herndon-Dudley disease and its symptoms.

In certain embodiments, the protein is selected from those associated with transport-defective diseases such as, e.g., cystic fibrosis transmembrane regulator (cystic fibrosis), alpha-1-antitrypsin (hereditary emphysema), HFE (hereditary hemochromatosis), tyrosinase (oculocutaneous albinism), protein C (protein C deficiency), complement C1 inhibitor (type I hereditary angioedema), alpha-D-galactosidase (Fabry disease), beta-hexosaminidase (Tay-Sachs disease), sucrase-isomaltase (congenital sucrase-isomaltase deficiency), UDP-glucuronosyltransferase (Crigler-Najjar type II), insulin receptor (diabetes mellitus), growth hormone receptor (Laron syndrome), and the like. Examples of other genes and proteins are those associated with, e.g., spinal muscular atrophy (SMA, SMN1), Huntington's disease, Rett syndrome (e.g., methyl-CpG-binding protein 2 (MeCP2); UniProtKB-P51608), amyotrophic lateral sclerosis (ALS), Duchenne muscular dystrophy, Friedreich's ataxia (e.g., frataxin), ATXN2 associated with spinocerebellar ataxia type 2 (SCA2)/ALS; TDP-43 associated with ALS; and progranulin (PRGN) (associated with non-Alzheimer's cerebral degenerations, including frontotemporal dementia (FTD), progressive non-fluent aphasia (PNFA), and semantic dementia), among others. See, e.g., www.orpha.net/consor/cgi-bin/disease_search_list.php; rarediseases.info.nih.gov/diseases.
Further exemplary genes that may be delivered via the rAAV include, but are not limited to: glucose-6-phosphatase, associated with glycogen storage disease type 1A (GSD1); phosphoenolpyruvate carboxykinase (PEPCK), associated with PEPCK deficiency; cyclin-dependent kinase-like 5 (CDKL5), also known as serine/threonine kinase 9 (STK9), associated with seizures and severe neurodevelopmental impairment; galactose-1-phosphate uridyl transferase, associated with galactosemia; phenylalanine hydroxylase (PAH), associated with phenylketonuria (PKU); gene products associated with primary hyperoxaluria type 1, including hydroxyacid oxidase 1 (GO/HAO1) and AGXT; branched-chain alpha-ketoacid dehydrogenase, including BCKDH, BCKDH E2, BAKDH E1a, and BAKDH E1b, associated with maple syrup urine disease; fumarylacetoacetase, associated with tyrosinemia type 1; methylmalonyl-CoA mutase, associated with methylmalonic acidemia; medium-chain acyl-CoA dehydrogenase, associated with medium-chain acyl-CoA dehydrogenase deficiency; ornithine transcarbamylase (OTC), associated with ornithine transcarbamylase deficiency; argininosuccinate synthetase (ASS1), associated with citrullinemia; lecithin-cholesterol acyltransferase (LCAT) deficiency; methylmalonic acidemia (MMA); NPC1, associated with Niemann-Pick disease type C1; propionic acidemia (PA); transthyretin (TTR)-related hereditary amyloidosis; low-density lipoprotein receptor (LDLR) protein, associated with familial hypercholesterolemia (FH), and LDLR variants such as those described in WO 2015/164778; PCSK9; ApoE and ApoC proteins, associated with dementia; UDP-glucuronosyltransferase, associated with Crigler-Najjar disease; adenosine deaminase, associated with severe combined immunodeficiency disease; hypoxanthine guanine phosphoribosyl transferase, associated with gout and Lesch-Nyhan syndrome; biotinidase, associated with biotinidase deficiency; alpha-galactosidase A (a-Gal A), associated with Fabry disease; beta-galactosidase (GLB1), associated with GM1 gangliosidosis; ATP7B, associated with Wilson's disease; beta-glucocerebrosidase, associated with Gaucher disease types 2 and 3; peroxisome membrane protein 70 kDa, associated with Zellweger syndrome; arylsulfatase A (ARSA), associated with metachromatic leukodystrophy; galactocerebrosidase (GALC), associated with Krabbe disease; alpha-glucosidase (GAA), associated with Pompe disease; the sphingomyelinase (SMPD1) gene, associated with Niemann-Pick disease type A; argininosuccinate synthase, associated with adult-onset type II citrullinemia (CTLN2); carbamoyl-phosphate synthase 1 (CPS1), associated with urea cycle disorders; survival motor neuron (SMN) protein, associated with spinal muscular atrophy; ceramidase, associated with Farber lipogranulomatosis; β-hexosaminidase, associated with GM2 gangliosidosis and Tay-Sachs and Sandhoff diseases; aspartylglucosaminidase, associated with aspartylglucosaminuria; α-fucosidase, associated with fucosidosis; α-mannosidase, associated with alpha-mannosidosis; porphobilinogen deaminase, associated with acute intermittent porphyria (AIP); alpha-1 antitrypsin, for the treatment of alpha-1 antitrypsin deficiency (emphysema); erythropoietin, for the treatment of anemia due to thalassemia or to renal failure; vascular endothelial growth factor, angiopoietin-1, and fibroblast growth factor, for the treatment of ischemic diseases; thrombomodulin and tissue factor pathway inhibitor, for the treatment of occluded blood vessels as seen in, for example, atherosclerosis, thrombosis, or embolisms; and aromatic amino acid decarboxylase (AADC) and tyrosine hydroxylase (TH), for the treatment of Parkinson's disease.

Examples of proteins and compounds useful in the compositions and targeted delivery provided herein include therapeutic proteins and other compounds and vaccine protein derivatives of the following respiratory-related infectious diseases, as well as passive immunoglobulins directed against these infectious diseases. Examples of suitable therapeutic proteins include, for example, alpha-1-antitrypsin, cystic fibrosis transmembrane protein, and variants thereof, surfactant-B, bone morphogenic protein receptor type II (associated with pulmonary hypertension), and various cancer treatments.

Examples of suitable vaccines or passive immunization include proteins derived from airborne pathogens, including the human respiratory coronaviruses, which have been implicated in severe acute respiratory syndrome (SARS-CoV1), the common cold, and non-A, B or C hepatitis. SARS-CoV2 is the causative agent of COVID-19, and antibodies specific for this virus have been described. Examples of IgG antibodies that have been described as useful for binding to human ACE2 of SARS-CoV2 and as having neutralizing activity include, e.g., LY-CoV555 (Eli Lilly), TY027 (Tychon), STI-1499 and STI-2020 (COVI-GUARD; Sorrento), 80R, ADI055689/56046 (Adimab) (Renn et al., Trends in Pharmacological Sciences, 2020); BD-217, BD-218, BD-236 (Cao et al., Cell, 182, 73-84 (2020)). Examples of IgG antibodies that have been described as useful for binding to the receptor binding domain (RBD) of human ACE2 of SARS-CoV2 and as having neutralizing activity include, e.g., COV2-2196, COV2-2130, COV2-2165 (Zost et al., Nature, 584, 443-465 (2020)); BD-361, BD-368-2 (Cao et al., Cell, 182, 73-84 (2020)); B38, H4 (Y. Wu et al., Science 10.1126/science.abc2241 (2020); Jahanshahlu and Rezaei, Biomedicine and Pharmacotherapy, 129 (2020)); S309, S315, S304 (Pinto et al., Nature, 583, 290-311 (2020)); CC6.29, CC6.30, CC6.33, CC12.1, CC12.3 (Rogers et al., Science, 369, 956-963 (2020)); JS016 (Eli Lilly), CA1, CB6-LALA, P2C-1F11/P2B-2F6/P2A-1A3, 311mab-31B5311/32D4, COVA 2-15, 414-1 (Renn et al., Trends in Pharmacological Sciences, 2020). Examples of IgG antibodies that have been described as useful for binding to the spike protein of human ACE2 of SARS-CoV1 and as having neutralizing activity include, e.g., m396 and CR3104 (Prabakaran et al., Journal of Biological Chemistry, 281, 15829-15836 (2006); ter Meulen et al., PLoS Medicine, 3, 7 (2006)).
Examples of IgG antibodies that have been described as useful for binding to the RBD or spike proteins of human ACE2 of SARS-CoV1 and SARS-CoV2 and as having neutralizing activity include, e.g., CR3022 and 47D11 (Wang et al., Nature Communications, 11, nature.com/naturecommunications (2020)).

Examples of other target viruses include influenza viruses from the Orthomyxoviridae family, which includes influenza A, influenza B, and influenza C. The type A viruses are the most virulent human pathogens. The serotypes of influenza A that have been associated with pandemics include H1N1, which caused Spanish Flu in 1918 and Swine Flu in 2009; H2N2, which caused Asian Flu in 1957; H3N2, which caused Hong Kong Flu in 1968; H5N1, which caused Bird Flu in 2004; H7N7; H1N2; H9N2; H7N2; H7N3; and H10N7. Broadly neutralizing antibodies against influenza A have been described. As used herein, a "broadly neutralizing antibody" refers to a neutralizing antibody that can neutralize multiple strains from multiple subtypes. For example, CR6261 [The Scripps Institute/Crucell] has been described as a monoclonal antibody that binds to a broad range of influenza viruses, including the 1918 "Spanish flu" (SC1918/H1) and a virus of the H5N1 class of avian influenza that was transmitted from chickens to a human in Vietnam in 2004 (Viet04/H5). CR6261 recognizes a highly conserved helical region in the membrane-proximal stem of hemagglutinin, the predominant protein on the surface of the influenza virus. This antibody is described in WO 2010/130636, which is incorporated herein by reference. Another neutralizing antibody, F10 [XOMA Ltd], has been described as useful against H1N1 and H5N1 [Sui et al., Nature Structural and Molecular Biology, 2009, 16(3):265-73]. Other antibodies against influenza, e.g., Fab28 and Fab49, may be selected. See, e.g., WO 2010/140114 and WO 2009/115972, which are incorporated herein by reference. Still other antibodies may be readily selected, such as those described in WO 2010/010466, U.S. Published Patent Publication US/2011/076265, and WO 2008/156763.

Other target pathogenic viruses include arenaviruses (including Junin, Machupo, and Lassa), filoviruses (including Marburg and Ebola), hantaviruses, picornaviruses (including rhinoviruses and echoviruses), coronaviruses, paramyxoviruses, measles viruses, respiratory syncytial viruses, togaviruses, coxsackieviruses, parvovirus B19, parainfluenza viruses, adenoviruses, reoviruses, smallpox virus (Variola major) and vaccinia (cowpox) from the poxvirus family, and varicella-zoster (pseudorabies). Viral hemorrhagic fevers are caused by members of the arenavirus family (Lassa fever), which is also associated with lymphocytic choriomeningitis (LCM), by filoviruses (Ebola virus), and by hantaviruses (Puumala virus). Members of the picornavirus family (the rhinovirus subfamily) are associated with the common cold in humans. The coronavirus family includes a number of non-human viruses, such as infectious bronchitis virus (poultry), porcine transmissible gastroenteritis virus (pig), porcine hemagglutinating encephalomyelitis virus (pig), feline infectious peritonitis virus (cat), feline enteric coronavirus (cat), and canine coronavirus (dog). The paramyxovirus family includes parainfluenza virus type 1, parainfluenza virus type 3, bovine parainfluenza virus type 3, rubulavirus (mumps virus), parainfluenza virus type 2, parainfluenza virus type 4, Newcastle disease virus (chickens), rinderpest, morbillivirus (which includes measles and canine distemper), and pneumovirus (which includes respiratory syncytial virus (RSV)). The parvovirus family includes feline parvovirus (feline enteritis), feline panleukopenia virus, canine parvovirus, and porcine parvovirus. The adenovirus family includes viruses (EX, AD7, ARD, O.B.) that cause respiratory disease.

Neutralizing antibody constructs against bacterial pathogens may also be selected for use in the present invention. In one embodiment, the neutralizing antibody construct is directed against the bacterium itself. In another embodiment, the neutralizing antibody construct is directed against a toxin produced by the bacterium. Examples of airborne bacterial pathogens include, e.g., Neisseria meningitidis (meningitis), Klebsiella pneumonia (pneumonia), Pseudomonas aeruginosa (pneumonia), Pseudomonas pseudomallei (pneumonia), Pseudomonas mallei (pneumonia), Acinetobacter (pneumonia), Moraxella catarrhalis, Moraxella lacunata, Alcaligenes, Cardiobacterium, Haemophilus influenzae (flu), Haemophilus parainfluenzae, Bordetella pertussis (whooping cough), Francisella tularensis (pneumonia/fever), Legionella pneumonia (Legionnaires' disease), Chlamydia psittaci (pneumonia), Chlamydia pneumoniae (pneumonia), Mycobacterium tuberculosis (tuberculosis (TB)), Mycobacterium kansasii, Mycobacterium avium (pneumonia), Bacillus anthracis (anthrax), Streptococcus pyogenes (scarlet fever), Streptococcus pneumoniae (pneumonia), Corynebacteria diphtheria (diphtheria), and Mycoplasma pneumoniae (pneumonia). The causative agent of anthrax is a toxin produced by Bacillus anthracis. Neutralizing antibodies against protective antigen (PA), one of the three peptides that form the toxin, have been described.
The other two polypeptides are lethal factor (LF) and edema factor (EF). Anti-PA neutralizing antibodies have been described as effective in passive immunization against anthrax. See, e.g., U.S. Patent No. 7,442,373; Sawada-Hirai et al., J Immune Based Ther Vaccines, 2004; 2:5 (online May 12, 2004). Still other anti-anthrax toxin neutralizing antibodies have been described and/or may be generated. Similarly, neutralizing antibodies against other bacteria and/or bacterial toxins may be used to generate non-IgG antibodies as described herein.

Other infectious diseases may be caused by airborne fungi including, e.g., Aspergillus species, Absidia corymbifera, Rhizopus stolonifer, Mucor plumbeus, Cryptococcus neoformans, Histoplasma capsulatum, Blastomyces dermatitidis, Coccidioides immitis, Penicillium species, Micropolyspora faeni, Thermoactinomyces vulgaris, Alternaria alternata, Cladosporium species, Helminthosporium, and Stachybotrys species.

In addition to the airborne infectious conditions affecting humans (many of which are described above), passive immunization according to the invention may be used to prevent conditions associated with direct inoculation of the nasal passages, e.g., conditions that may be transmitted by direct contact of the fingers with the nasal passages. These conditions may include fungal infections (e.g., athlete's foot), ringworm, or viruses, bacteria, parasites, fungi, and other pathogens that may be transmitted by direct contact. In addition, a variety of conditions affect household pets, cattle and other livestock, and other animals. For example, in dogs, sinonasal aspergillosis of the upper respiratory tract causes significant disease. In cats, upper respiratory disease originating in the nose, or feline respiratory disease complex, can cause morbidity and mortality if left untreated. Cattle are susceptible to infectious bovine rhinotracheitis (commonly called IBR or red nose), an acute, contagious viral disease of cattle. In addition, cattle are susceptible to bovine respiratory syncytial virus (BRSV), which causes mild to severe respiratory disease and can impair resistance to other diseases. Still other pathogens and diseases will be apparent to those skilled in the art.

Antibodies against pathogens, and in particular neutralizing antibodies such as those specifically identified herein (e.g., anti-SARS-CoV2, anti-SARS-CoV1, anti-influenza, anti-Ebola, anti-RSV), may be used to generate class-switched or non-IgG antibodies. Monoclonal antibodies (mAbs) with broad neutralizing capacity may be identified using antibody phage display to screen libraries from donors recently vaccinated with the seasonal influenza vaccine, from non-immune humans, or from survivors of natural infection. In the case of influenza, antibodies that neutralize more than one influenza subtype by blocking fusion of the virus with the host cell have been identified. This technique may be used with other infections to obtain neutralizing monoclonal antibodies. See, e.g., US 5,811,524, which describes the generation of neutralizing antibodies against respiratory syncytial virus (RSV). The techniques described therein are applicable to other pathogens. Such an antibody may be used intact, or its sequences (scaffold) may be modified to generate an artificial or recombinant neutralizing antibody construct. Such methods have been described [see, e.g., WO 2010/13036; WO 2009/115972; WO 2010/140114]. In one embodiment, a mouse, rat, hamster, or other host animal is immunized with an immunizing agent to generate lymphocytes that produce antibodies that bind the immunizing antigen. In an alternative approach, the lymphocytes may be immunized in vitro. Human antibodies can be produced using techniques such as phage display libraries (Hoogenboom and Winter, J. Mol. Biol., 1991, 227:381; Marks et al., J. Mol. Biol., 1991, 222:581).

Composition and use

Provided herein are compositions containing at least one rAAV stock (e.g., an rAAV9 or rAAVhu68 mutant stock) and, optionally, a carrier, excipient, and/or preservative. An rAAV stock refers to a plurality of rAAV vectors that are the same, e.g., in amounts such as those described below in the discussion of concentrations and dosage units.

In certain embodiments, the composition may contain at least a second, different rAAV stock. This second vector stock may be different from the first vector stock by having a different AAV capsid and/or a different vector genome. In certain embodiments, the compositions described herein may contain a different vector expressing an expression cassette described herein, or another active component (e.g., an antibody construct, another biological agent, and/or a small molecule drug).

As used herein, "carrier" includes any and all solvents, dispersion media, vehicles, coatings, diluents, antibacterial and antifungal agents, isotonic and absorption delaying agents, buffers, carrier solutions, suspensions, gums, and the like. The use of such media and agents for pharmaceutically active substances is well known in the art. Supplementary active ingredients may also be incorporated into the compositions. The phrase "pharmaceutically acceptable" refers to molecular entities and compositions that do not produce allergic or similar untoward reactions when administered to a host. Delivery vehicles such as liposomes, nanocapsules, microparticles, microspheres, lipid particles, vesicles, and the like may be used to introduce the compositions of the invention into suitable host cells. In particular, the transgene delivered by the rAAV vector may be formulated for delivery or encapsulation in a lipid particle, liposome, vesicle, nanosphere, nanoparticle, or the like.

In one embodiment, the composition comprises a final formulation suitable for delivery to a subject, the composition being, for example, an aqueous liquid suspension buffered to a physiologically compatible pH and salt concentration. Optionally, one or more surfactants are present in the formulation. In another embodiment, the composition may be transported as a concentrate that is diluted for administration to a subject. In other embodiments, the composition may be lyophilized and reconstituted at the time of administration.

A suitable surfactant, or combination of surfactants, may be selected from among non-toxic nonionic surfactants. In one embodiment, a difunctional block copolymer surfactant terminating in primary hydroxyl groups is selected, e.g., Pluronic F68 [BASF], also known as Poloxamer 188, which has a neutral pH and an average molecular weight of 8400. Other surfactants and other poloxamers may be selected, i.e., nonionic triblock copolymers composed of a central hydrophobic chain of polyoxypropylene (poly(propylene oxide)) flanked by two hydrophilic chains of polyoxyethylene (poly(ethylene oxide)), SOLUTOL HS 15 (polyethylene glycol-15 hydroxystearate), LABRASOL (polyoxy caprylic glyceride), polyoxy 10 oleyl ether, TWEEN (polyoxyethylene sorbitan fatty acid esters), ethanol, and polyethylene glycol. In one embodiment, the formulation contains a poloxamer. These copolymers are commonly named with the letter "P" (for poloxamer) followed by three digits: the first two digits x 100 give the approximate molecular mass of the polyoxypropylene core, and the last digit x 10 gives the percentage polyoxyethylene content. In one embodiment, Poloxamer 188 is selected. The surfactant may be present in an amount up to about 0.0005% to about 0.001% of the suspension.
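
The naming rule in the preceding paragraph can be illustrated with a short Python sketch. The function name is hypothetical; the decoding itself follows the rule as stated (first two digits x 100 for the approximate polyoxypropylene core mass, last digit x 10 for the percentage polyoxyethylene content).

```python
def decode_poloxamer(code):
    """Decode a poloxamer designation such as "P188" or "188".

    Returns (approximate polyoxypropylene core mass in g/mol,
             polyoxyethylene content in percent).
    """
    text = code.strip().upper()
    if text.startswith("P"):
        text = text[1:]
    if len(text) != 3 or not (text.isascii() and text.isdigit()):
        raise ValueError("expected the letter P followed by three digits")
    core_mass = int(text[:2]) * 100       # first two digits x 100
    poe_percent = int(text[2]) * 10       # last digit x 10
    return core_mass, poe_percent


core_mass_188, poe_percent_188 = decode_poloxamer("P188")
```

For Poloxamer 188 this gives a polyoxypropylene core of about 1800 g/mol and about 80% polyoxyethylene content.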

In one embodiment, the formulation buffer is Phosphate Buffered Saline (PBS) (final formulation buffer, FFB) with a total salt concentration of 200mM, containing 0.001% (w/v) Pluronic F68.

The vectors are administered in sufficient amounts to transfect the cells and to provide sufficient levels of gene transfer and expression to provide a therapeutic benefit without undue adverse effects, or with medically acceptable physiological effects, as can be determined by those skilled in the medical arts. In certain embodiments, the vectors are formulated for delivery via an intranasal delivery device, for targeted delivery to nasal and/or nasopharyngeal epithelial cells. In certain embodiments, the vectors are formulated for an aerosol delivery device, e.g., via a nebulizer or by other suitable means. Other conventional and pharmaceutically acceptable routes of administration include, but are not limited to, direct delivery to a desired organ (e.g., the lung), oral inhalation, intrathecal, intratracheal, intraarterial, intraocular, intravenous, intramuscular, subcutaneous, intradermal, and other parenteral routes of administration. In one embodiment, the vector is administered intranasally using an intranasal mucosal atomization device (MAD Nasal™ MAD 110). In another embodiment, the vector is administered to the lung in aerosolized form using a vibrating mesh nebulizer (Solo) or a MADgic™ Laryngeal Mucosal Atomizer. Routes of administration may be combined, if desired. Routes of administration and their use for delivering rAAV vectors are also described in the following published U.S. patent applications, each of which is incorporated herein by reference in its entirety: US 2018/0155412A1, US 2018/0243156 A1, US 2014/0031418 A1, and US 2019/0216841A1.

The dose of the viral vector will depend primarily on factors such as the condition being treated and the age, weight, and health of the patient, and may thus vary among patients. For example, a therapeutically effective human dosage of the viral vector is generally in the range of from about 25 to about 1000 microliters to about 5 mL of aqueous suspension containing doses of from about 10⁹ to 4x10¹⁴ GC of the AAV vector. The dosage will be adjusted to balance the therapeutic benefit against any side effects, and such dosages may vary depending on the therapeutic application for which the recombinant vector is employed. The levels of expression of the transgene can be monitored to determine the frequency of dosage of the resulting viral vectors, preferably AAV vectors containing the transgene. Optionally, dosage regimens similar to those described for therapeutic purposes may be used for immunization with the compositions of the invention.

Replication-defective virus compositions may be formulated in dosage units containing an amount of replication-defective virus in the range of about 10⁹ GC to about 10¹⁶ GC (to treat a subject with an average weight of 70 kg), including all integers or fractional amounts within the range, and preferably 10¹² GC to 10¹⁴ GC for a human patient. In one embodiment, the composition is formulated to contain at least 10⁹, 2x10⁹, 3x10⁹, 4x10⁹, 5x10⁹, 6x10⁹, 7x10⁹, 8x10⁹, or 9x10⁹ GC per dose, including all integers or fractional amounts within the range. In one embodiment, the composition is formulated to contain at least 10¹⁰, 2x10¹⁰, 3x10¹⁰, 4x10¹⁰, 5x10¹⁰, 6x10¹⁰, 7x10¹⁰, 8x10¹⁰, or 9x10¹⁰ GC per dose, including all integers or fractional amounts within the range. In one embodiment, the composition is formulated to contain at least 10¹¹, 2x10¹¹, 3x10¹¹, 4x10¹¹, 5x10¹¹, 6x10¹¹, 7x10¹¹, 8x10¹¹, or 9x10¹¹ GC per dose, including all integers or fractional amounts within the range. In one embodiment, the composition is formulated to contain at least 10¹², 2x10¹², 3x10¹², 4x10¹², 5x10¹², 6x10¹², 7x10¹², 8x10¹², or 9x10¹² GC per dose, including all integers or fractional amounts within the range. In one embodiment, the composition is formulated to contain at least 10¹³, 2x10¹³, 3x10¹³, 4x10¹³, 5x10¹³, 6x10¹³, 7x10¹³, 8x10¹³, or 9x10¹³ GC per dose, including all integers or fractional amounts within the range. In one embodiment, the composition is formulated to contain at least 10¹⁴, 2x10¹⁴, 3x10¹⁴, 4x10¹⁴, 5x10¹⁴, 6x10¹⁴, 7x10¹⁴, 8x10¹⁴, or 9x10¹⁴ GC per dose, including all integers or fractional amounts within the range. In one embodiment, the composition is formulated to contain at least 10¹⁵, 2x10¹⁵, 3x10¹⁵, 4x10¹⁵, 5x10¹⁵, 6x10¹⁵, 7x10¹⁵, 8x10¹⁵, or 9x10¹⁵ GC per dose, including all integers or fractional amounts within the range.
In one embodiment, for human use, the dose may range from 10¹⁰ to about 10¹² GC per dose, including all integers or fractional amounts within the range. In one embodiment, for human use, the dose may range from 10⁹ to about 7x10¹³ GC per dose, including all integers or fractional amounts within the range. In one embodiment, for human use, the dose ranges from 6.25x10¹² GC to 5.00x10¹³ GC. In a further embodiment, the dose is about 6.25x10¹² GC, about 1.25x10¹³ GC, about 2.50x10¹³ GC, or about 5.00x10¹³ GC. In certain embodiments, the dose is divided equally into two halves, one administered to each nostril. In certain embodiments, for human use, the dose ranges from 6.25x10¹² GC to 5.00x10¹³ GC, administered as two aliquots of 0.2 mL per nostril, for a total volume of 0.8 mL delivered per subject.
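
The split-dose arithmetic in the last embodiment above (two 0.2 mL aliquots per nostril, 0.8 mL in total) can be sketched as follows. The function name and returned field names are hypothetical, and the sketch assumes the total dose is distributed evenly across all aliquots.

```python
def intranasal_split(total_gc, aliquots_per_nostril=2, aliquot_ml=0.2, nostrils=2):
    """Break a total intranasal dose into equal aliquots.

    Assumes every aliquot carries the same share of the total GC dose.
    """
    if total_gc <= 0 or aliquot_ml <= 0:
        raise ValueError("dose and aliquot volume must be positive")
    n_aliquots = aliquots_per_nostril * nostrils
    total_ml = n_aliquots * aliquot_ml
    return {
        "aliquots": n_aliquots,
        "total_ml": total_ml,
        "gc_per_aliquot": total_gc / n_aliquots,
        "gc_per_ml": total_gc / total_ml,   # concentration the suspension would need
    }


# Highest dose in the stated range, 5.00x10^13 GC.
plan = intranasal_split(5.0e13)
```

For a 5.00x10¹³ GC dose this corresponds to four aliquots of 1.25x10¹³ GC each and a suspension concentration of 6.25x10¹³ GC/mL.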

The above-described doses may be administered in a variety of volumes of carrier, excipient or buffer formulation, ranging from about 25 to about 1000 microliters or more, including all numbers within the range, depending on the size of the area to be treated, the viral titer used, the route of administration, and the desired effect of the method. In one embodiment, the volume of carrier, excipient, or buffer is at least about 25 μL. In one embodiment, the volume is about 50 μL. In another embodiment, the volume is about 75 μL. In another embodiment, the volume is about 100 μL. In another embodiment, the volume is about 125 μL. In another embodiment, the volume is about 150 μL. In another embodiment, the volume is about 175 μL. In yet another embodiment, the volume is about 200 μL. In another embodiment, the volume is about 225 μL. In yet another embodiment, the volume is about 250 μL. In yet another embodiment, the volume is about 275 μL. In yet another embodiment, the volume is about 300 μL. In yet another embodiment, the volume is about 325 μL. In another embodiment, the volume is about 350 μL. In another embodiment, the volume is about 375 μL. In another embodiment, the volume is about 400 μL. In another embodiment, the volume is about 450 μL. In another embodiment, the volume is about 500 μL. In another embodiment, the volume is about 550 μL. In another embodiment, the volume is about 600 μL. In another embodiment, the volume is about 650 μL. In another embodiment, the volume is about 700 μL. In another embodiment, the volume is between about 700 and about 1000 μL.

In certain embodiments, the recombinant vector may be administered intranasally using two sprays to each nostril. In one embodiment, the two sprays are administered by alternating between nostrils, e.g., left nostril spray, right nostril spray, then left nostril spray, right nostril spray. In some embodiments, there may be a delay between alternating sprays. For example, each nostril may receive multiple sprays separated by an interval of about 10 to 60 seconds, or 20 to 40 seconds, or about 30 seconds, up to minutes or more. Such sprays can deliver, for example, about 150 to 300 μL or about 250 μL in each spray to achieve a total dosing volume of about 200 to about 600 μL, 400 to 700 μL, or 450 to 1000 μL.

In certain embodiments, the recombinant AAV vector may be administered intranasally to achieve a concentration of the transgene expression product of 5-20ng/ml measured in nasal wash after administration (e.g., one week to four weeks or about two weeks after administration of the vector). Methods for obtaining nasal washes from subjects are conventional.

For other routes of administration, such as intravenous or intramuscular, dose levels will be higher than for intranasal delivery. For example, such suspensions may have a dose volume of about 1 mL to about 25 mL, with doses of up to about 2.5x10¹⁵ GC.

In certain embodiments, the intranasal delivery device provides a spray atomizer that delivers a mist of particles having an average size ranging from about 30 microns to about 100 microns. In certain embodiments, the average size ranges from about 10 microns to about 50 microns. Suitable devices have been described in the literature and some are commercially available, e.g., the LMA MAD NASAL™ (Teleflex Medical; Ireland); the Teleflex VaxINator™ (Teleflex Medical; Ireland); and Controlled Particle Dispersion (CPD) from Kurve Technologies. See also PG Djupesland, Drug Deliv and Transl Res (2013) 3:42-62. In certain embodiments, the delivered particle size and volume are controlled so as to preferentially target nasal epithelial cells and minimize targeting to the lung. In other embodiments, the particle mist is about 0.1 microns to about 20 microns or less for delivery to lung cells. Such smaller particle sizes may minimize retention in the nasal epithelium.

One device atomizes into particles with an average diameter of about 16 microns to about 22 microns. The mist may be delivered directly to the tracheobronchial tree by a device inserted through the suction channel of a 3.5-mm flexible fiberoptic bronchoscope (Olympus, Melville, NY). Other suitable delivery devices may include a laryngo-tracheal mucosal atomizer, which provides administration across the upper airway past the vocal cords. It passes through the vocal cords and down a laryngeal mask or into the nasal cavity. The droplets are atomized at an average diameter of about 30 microns to about 100 microns. The tip of the standard device is about 0.18 in (4.6 mm) in diameter and about 4.5-8.5 inches in length, and is inserted through the suction channel and advanced about 3 mm past the distal tip of the scope. The dose may be administered as 10 aliquots (approximately 150 μL each) of saline control or rAAV sprayed into the right main bronchus.

In one embodiment, a frozen composition is provided that contains an rAAV in a buffer solution as described herein, in frozen form. Optionally, one or more surfactants (e.g., Pluronic F68), stabilizers, or preservatives are present in this composition. Suitably, at the time of use, the composition is thawed and titrated to the desired dose with a suitable diluent (e.g., sterile saline or buffered saline).

In one embodiment, a composition is provided that includes one or more exogenous endothelial cell targeting peptides having the following motif: N-x-(T/I/V/A)-(K/R) (SEQ ID NO: 47) and optional flanking linker sequences, and one or more of a physiologically compatible carrier, excipient and/or aqueous suspension base. Further provided are compositions comprising nucleic acid sequences encoding the targeting peptides. In certain embodiments, the targeting peptide is the targeting peptide of SEQ ID NO: 40 and is encoded by the nucleic acid sequence of SEQ ID NO: 54 or a sequence at least about 70% identical thereto. In certain embodiments, the targeting peptide is the targeting peptide of SEQ ID NO: 38 and is encoded by the nucleic acid sequence of SEQ ID NO: 50 or a sequence at least about 70% identical thereto. In certain embodiments, the targeting peptide is the targeting peptide of SEQ ID NO: 46 and is encoded by the nucleic acid sequence of SEQ ID NO: 56 or a sequence at least about 70% identical thereto. In certain embodiments, the targeting peptide is the targeting peptide of SEQ ID NO: 43 and is encoded by the nucleic acid sequence of SEQ ID NO: 52 or a sequence at least about 70% identical thereto. In certain embodiments, the targeting peptide is the targeting peptide of SEQ ID NO: 39 and is encoded by the nucleic acid sequence of SEQ ID NO: 55 or a sequence at least about 70% identical thereto. In certain embodiments, the targeting peptide is the targeting peptide of SEQ ID NO: 42 and is encoded by the nucleic acid sequence of SEQ ID NO: 51 or a sequence at least about 70% identical thereto. In certain embodiments, the targeting peptide is the targeting peptide of SEQ ID NO: 41 and is encoded by the nucleic acid sequence of SEQ ID NO: 53 or a sequence at least about 70% identical thereto.
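
To make the four-residue motif N-x-(T/I/V/A)-(K/R) concrete, the sketch below scans a peptide string for occurrences of it. This is an illustration under stated assumptions: "x" is taken to mean any one of the 20 standard amino acids, the function name `find_motifs` is hypothetical, and the example peptides in the test values are invented strings, not any of the SEQ ID NO sequences of this description.

```python
import re

# Assumption: "x" in the motif stands for any one of the 20 standard amino acids.
STANDARD_AA = "ACDEFGHIKLMNPQRSTVWY"

# N, then any residue, then T/I/V/A, then K/R (one-letter amino acid codes).
MOTIF = re.compile(rf"N[{STANDARD_AA}][TIVA][KR]")


def find_motifs(peptide):
    """Return (0-based start index, matched tetrapeptide) for each motif hit.

    Two hits can never overlap: the residue constraints at positions 1, 3
    and 4 of the motif are mutually exclusive (N versus T/I/V/A versus K/R),
    so a plain left-to-right scan finds every occurrence.
    """
    return [(m.start(), m.group(0)) for m in MOTIF.finditer(peptide.upper())]


if __name__ == "__main__":
    # Invented example strings, used only to exercise the pattern.
    for seq in ("AQNATKGG", "NGIRNSVK", "NATE"):
        print(seq, find_motifs(seq))
```

A scan of this kind only reports where the motif occurs in a sequence; it says nothing about whether a given peptide actually targets endothelial cells.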

In another embodiment, a fusion polypeptide or protein is provided comprising one or more exogenous brain endothelial cell targeting peptides having the following motif: N-x-(T/I/V/A)-(K/R) (SEQ ID NO: 47) and a fusion partner comprising at least one polypeptide or protein. Further provided are nucleic acid sequences encoding the fusion polypeptides or proteins.

In certain embodiments, a composition is provided that includes a fusion polypeptide or protein, or a nucleic acid sequence encoding the fusion polypeptide or protein, or a nanoparticle containing the fusion polypeptide or protein. The composition may further comprise one or more of a physiologically compatible carrier, excipient, and/or aqueous suspension matrix.

In certain embodiments, the nucleic acid sequence encoding the fusion polypeptide or protein is encapsulated in a lipid nanoparticle (LNP). As used herein, the phrase "lipid nanoparticle" or "nanoparticle" refers to a transfer vehicle that includes one or more lipids (e.g., cationic lipids, non-cationic lipids, and PEG-modified lipids). Preferably, the lipid nanoparticle is formulated to deliver one or more nucleic acid sequences to one or more target cells (e.g., liver and/or muscle). Examples of suitable lipids include, for example, phosphatidyl compounds (e.g., phosphatidylglycerol, phosphatidylcholine, phosphatidylserine, phosphatidylethanolamine, sphingolipids, cerebrosides, and gangliosides). The use of polymers as transfer vehicles, whether alone or in combination with other transfer vehicles, is also contemplated. Suitable polymers may include, for example, polyacrylates, polyalkylcyanoacrylates, polylactide-polyglycolide copolymers, polycaprolactone, dextran, albumin, gelatin, alginate, collagen, chitosan, cyclodextrin, dendrimers, and polyethyleneimine. In one embodiment, the transfer vehicle is selected based on its ability to facilitate transfection of the nucleic acid sequence encapsulated therein into the target cell. Useful lipid nanoparticles for nucleic acid sequences include cationic lipids to encapsulate such nucleic acid sequences and/or enhance their delivery to target cells that will act as depots for protein production. As used herein, the phrase "cationic lipid" refers to any of a variety of lipid species that carry a net positive charge at a selected pH, such as physiological pH. The contemplated lipid nanoparticles may be prepared from multicomponent lipid mixtures of varying ratios employing one or more cationic lipids, non-cationic lipids, and PEG-modified lipids. Several cationic lipids have been described in the literature, many of which are commercially available. See, for example, WO 2014/089486, US 2018/0353616 A1 and US 8,853,377 B2, which are incorporated herein by reference. In certain embodiments, the LNP formulation is prepared using conventional procedures and includes cholesterol, ionizable lipids, helper lipids, PEG-lipids, and polymers, which form a lipid bilayer around the encapsulated nucleic acid sequence (Kowalski et al., 2019, Mol. Ther. 27(4):710-728). In some embodiments, the LNP comprises a cationic lipid (i.e., N-[1-(2,3-dioleoyloxy)propyl]-N,N,N-trimethylammonium chloride (DOTMA) or 1,2-dioleoyl-3-trimethylammonium propane (DOTAP)) with the helper lipid DOPE. In some embodiments, the LNP comprises the ionizable lipid DLin-MC3-DMA or a diketopiperazine-based ionizable lipid (cKK-E12). In some embodiments, the polymer comprises polyethylenimine (PEI) or a poly(β-amino) ester (PBAE). See, e.g., WO 2014/089486, US 2018/0353616 A1, US 2013/0037977 A1, WO 2015/074085 A1, US 9,670,152 B2 and US 8,853,377 B2, which are incorporated herein by reference.

In certain embodiments, compositions such as an rAAV having a modified capsid, a fusion polypeptide or protein, or a conjugate comprising a nanoparticle or chemical moiety, each having an N-x-(T/I/V/A)-(K/R) (SEQ ID NO: 47) peptide and optional linker sequences, can be used to deliver a therapy to a patient in need thereof. In certain embodiments, the methods are used for targeted therapy of brain endothelial cells. In certain embodiments, the methods are used to treat Allan-Herndon-Dudley disease by delivering the MCT8 protein (e.g., UniProt ID No: P36021) or a gene that expresses MCT8 in vivo. In other embodiments, the methods are used for targeted therapy of the lung. In certain embodiments, the delivered product is a soluble Ace2 protein (e.g., an hAce2 decoy or an hAce2 decoy fusion), an anti-SARS antibody, an anti-SARS-CoV2 antibody, an anti-influenza antibody, or a cystic fibrosis transmembrane protein. See also omim. See also U.S. provisional application number 63/143,614, filed January 29, 2021, U.S. provisional application number 63/165,511, filed March 12, 2021, U.S. patent application number 63/166,686, filed March 26, 2021, U.S. provisional application number 63/215,159, filed October 8, 2021, filed June 25, 2021, all of which are incorporated herein by reference.

In certain embodiments, rAAV having the modified capsids described herein can be delivered in a combination therapeutic regimen that further includes one or more additional active components. In certain embodiments, the regimen may involve co-administration of an immunomodulatory component. Such immunomodulatory regimens may include, for example, but are not limited to, immunosuppressants such as glucocorticoids, steroids, antimetabolites, T-cell inhibitors, macrolides (e.g., rapamycin or rapamycin analogs (rapalogs)), and cytostatic agents including alkylating agents, antimetabolites, cytotoxic antibiotics, antibodies, or agents active on immunophilin. Immunosuppressants may include nitrogen mustards, nitrosoureas, platinum compounds, methotrexate, azathioprine, mercaptopurine, fluorouracil, dactinomycin, anthracyclines, mitomycin C, bleomycin, mithramycin, IL-2 receptor (CD25) or CD3 directed antibodies, anti-IL-2 antibodies, cyclosporines, tacrolimus, sirolimus, IFN-β, IFN-γ, opioids or TNF-α (tumor necrosis factor-α) binding agents. In certain embodiments, immunosuppressive therapy can begin prior to administration of the gene therapy. Such therapy may involve co-administration of two or more drugs (e.g., prednisone, mycophenolate mofetil (MMF), and/or sirolimus (i.e., rapamycin)) on the same day. One or more of these drugs may be continued after administration of the gene therapy, at the same dose or at an adjusted dose. Such therapy may last for about 1 week, about 15 days, about 30 days, about 45 days, 60 days, or longer, as needed. Still other co-therapeutics may include, for example, an anti-IgG enzyme, which has been described as useful for depleting anti-AAV antibodies (and thus may permit administration to patients who test above a threshold level of antibodies to the selected AAV capsid), and/or delivery of an anti-FcRn antibody, as described, for example, in U.S. provisional patent application No. 63/040,381, entitled "Compositions and Methods for Treating Gene Therapy Patients," filed June 17, 2020, and/or one or more of (a) a steroid or steroid combination; (b) an IgG-clearing enzyme; (c) an Fc-IgE binding inhibitor; (d) an Fc-IgM binding inhibitor; (e) an Fc-IgA binding inhibitor; and/or (f) gamma interferon.

An antibody "Fc region" refers to the fragment crystallizable region, which is the region of an antibody that interacts with cell surface receptors (Fc receptors). In one embodiment, the Fc region is a human IgG1 Fc. In one embodiment, the Fc region is a human IgG2 Fc. In one embodiment, the Fc region is a human IgG4 Fc. In one embodiment, the Fc region is an engineered Fc fragment; see, e.g., Lobner, Elisabeth, et al. "Engineered IgG1-Fc-one fragment to bind them all." Immunological Reviews 270.1 (2016): 113-131; Saxena, Abhishek, and Donghui Wu. "Advances in therapeutic Fc engineering-modulation of IgG-associated effector functions and serum half-life." Frontiers in Immunology 7 (2016); Irani, Vashti, et al. "Molecular properties of human IgG subclasses and their implications for designing therapeutic monoclonal antibodies against infectious diseases." Molecular Immunology 67.2 (2015): 171-182; Rath, Timo, et al. "Fc-fusion proteins and FcRn: structural insights for longer-lasting and more effective therapeutics." Critical Reviews in Biotechnology 35.2 (2015): 235-254; and Invivogen, IgG-Fc Engineering For Therapeutic Use, invivogen.com/docs/Insight200605.pdf, April 2006; each of which is incorporated herein by reference.

The antibody "hinge region" is the flexible amino acid portion of the heavy chains of the IgG and IgA immunoglobulin classes that connects the two chains by disulfide bonds.

An "immunoglobulin molecule" is a protein that contains the immunologically active portions of an immunoglobulin heavy chain and an immunoglobulin light chain that are covalently coupled together and capable of specifically binding an antigen. Immunoglobulin molecules may be of any type (e.g., IgG, IgE, IgM, IgD, IgA and IgY), class (e.g., IgG1, IgG2, IgG3, IgG4, IgA1, and IgA2), or subclass. The terms "antibody" and "immunoglobulin" are used interchangeably herein.

An "immunoglobulin heavy chain" is a polypeptide comprising at least a portion of an immunoglobulin antigen binding domain and at least a portion of an immunoglobulin heavy chain variable region or at least a portion of an immunoglobulin heavy chain constant region. Thus, immunoglobulin derived heavy chains have regions of significant amino acid sequence homology with members of the immunoglobulin gene superfamily. For example, the heavy chain in a Fab fragment is an immunoglobulin derived heavy chain.

An "immunoglobulin light chain" is a polypeptide comprising at least a portion of an immunoglobulin antigen binding domain and at least a portion of an immunoglobulin light chain variable region or at least a portion of an immunoglobulin light chain constant region. Thus, immunoglobulin derived light chains have regions of significant amino acid sequence homology with members of the immunoglobulin gene superfamily.

"neutralizing antibody titer" (NAb titer) is a measure of how much neutralizing antibody (e.g., anti-AAV NAb) is produced to neutralize the physiological effects of its target epitope (e.g., AAV). anti-AAV NAb titers can be measured as described, for example, in Calcedo, R et al, "worldwide epidemic of neutralizing antibodies against adeno-Associated viruses (Worldwide Epidemiology of Neutralizing Antibodies to Adeno-Associated viruses)", journal of infectious diseases (Journal of Infectious Diseases), 2009,199 (3): p.381-390, which is incorporated herein by reference.

As used herein, unless otherwise indicated, a "sub-population" of vp proteins refers to a group of vp proteins that have at least one defined characteristic in common and that consists of at least one group member to less than all members of the reference group. For example, unless otherwise specified, a "sub-population" of vp1 proteins is at least one (1) vp1 protein and less than all of the vp1 proteins in the assembled AAV capsid. Unless otherwise indicated, a "sub-population" of vp3 proteins may be one (1) vp3 protein to less than all of the vp3 proteins in the assembled AAV capsid. For example, the vp1 proteins may be a sub-population of vp proteins; the vp2 proteins may be a separate sub-population of vp proteins, and vp3 is yet another sub-population of vp proteins in the assembled AAV capsid. In another example, vp1, vp2, and vp3 proteins may contain sub-populations with different modifications, e.g., at least one, two, three, or four highly deamidated asparagines, e.g., at an asparagine-glycine pair. Unless otherwise indicated, highly deamidated refers to at least 45% deamidation, at least 50% deamidation, at least 60% deamidation, at least 65% deamidation, at least 70%, at least 75%, at least 80%, at least 85%, at least 90%, at least 95%, 97%, 99%, up to about 100% deamidation at the reference amino acid position, as compared to the predicted amino acid sequence at the reference amino acid position. Such percentages may be determined using 2D gel, mass spectrometry techniques, or other suitable techniques.
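
The two definitions above reduce to simple numeric conditions: a sub-population has at least one member and fewer than all members of the reference group, and "highly deamidated" starts at 45% deamidation at the reference position. The sketch below encodes just those two conditions. The function names and the example counts are hypothetical; the percentages themselves would come from 2D gel, mass spectrometry, or another suitable measurement.

```python
HIGHLY_DEAMIDATED_MIN_PERCENT = 45.0  # lowest threshold given in the definition


def is_subpopulation(n_members, n_reference):
    """True if the group has at least one member and fewer than all members
    of the reference group (e.g., some but not all vp1 proteins in a capsid)."""
    return 1 <= n_members < n_reference


def is_highly_deamidated(percent_deamidated):
    """True if deamidation at the reference position is at least 45%."""
    if not 0.0 <= percent_deamidated <= 100.0:
        raise ValueError("percent deamidation must be between 0 and 100")
    return percent_deamidated >= HIGHLY_DEAMIDATED_MIN_PERCENT


if __name__ == "__main__":
    # Hypothetical counts and percentages, for illustration only.
    print(is_subpopulation(3, 10))     # some but not all members
    print(is_subpopulation(10, 10))    # all members: not a sub-population
    print(is_highly_deamidated(72.5))
    print(is_highly_deamidated(12.0))
```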

As used herein, a "stock" of rAAV refers to a population of rAAV. Despite heterogeneity in their capsid proteins due to deamidation, the rAAV in a stock are expected to share an identical vector genome. A stock may include rAAV having capsids with, for example, heterogeneous deamidation patterns characteristic of the selected AAV capsid proteins and the selected production system. The stock may be produced from a single production system or pooled from multiple runs of the production system. A variety of production systems may be selected, including but not limited to those described herein. See, e.g., WO 2019/168961, published in September 2019, which includes Table G providing the deamidation pattern of AAV9, and WO 2020/160582, filed in July 2018. See also, for example, WO 2020/223231 (rh91, including tables with deamidation patterns), published November 5, 2020, U.S. provisional patent application No. 63/065,616, filed August 14, 2020, U.S. provisional patent application No. 63/109,734, filed November 4, 2021, and international patent application No. PCT/US21/45945, filed August 13, all of which are incorporated herein by reference in their entirety.

The abbreviation "sc" refers to self-complementary. "Self-complementary AAV" refers to a construct in which a coding region carried by a recombinant AAV nucleic acid sequence has been designed to form an intramolecular double-stranded DNA template. Upon infection, rather than waiting for cell-mediated synthesis of the second strand, the two complementary halves of the scAAV will associate to form one double-stranded DNA (dsDNA) unit that is ready for immediate replication and transcription. See, e.g., D M McCarty et al, "Self-complementary recombinant adeno-associated virus (scAAV) vectors promote efficient transduction independently of DNA synthesis", Gene Therapy, (August 2001), Vol 8, Number 16, Pages 1248-1254. Self-complementary AAVs are described, for example, in U.S. Patent Nos. 6,596,535; 7,125,717; and 7,456,683, each of which is incorporated herein by reference in its entirety.

As used herein, the term "operably linked" refers to both expression control sequences that are contiguous with the gene of interest and expression control sequences that function in trans or remotely to control the gene of interest.

The term "heterologous" when used in connection with a protein or nucleic acid indicates that the protein or nucleic acid includes two or more sequences or subsequences that are not found in the same relationship to each other in nature. For example, nucleic acids are typically recombinantly produced, having two or more sequences from unrelated genes arranged to produce new functional nucleic acids. For example, in one embodiment, the nucleic acid has a promoter from one gene arranged to direct expression of coding sequences from a different gene. Thus, with respect to the coding sequence, the promoter is heterologous.

"replication defective virus" or "viral vector" refers to a synthetic or artificial viral particle in which an expression cassette containing a gene of interest is packaged in a viral capsid or envelope, wherein any viral genomic sequence that is also packaged within the viral capsid or envelope is replication defective; that is, it is unable to produce progeny virus particles, but retains the ability to infect target cells. In one embodiment, the genome of the viral vector does not contain genes encoding enzymes required for replication (the genome may be engineered to be "gut-free" -only contain the transgene of interest flanking the signals required to amplify and package the artificial genome), but these genes may be supplied during production. Thus, this is considered to be safe for use in gene therapy because replication and infection by progeny virions does not occur unless the viral enzymes required for replication are present.

A "recombinant AAV" or "rAAV" is a DNase-resistant viral particle comprising two elements: an AAV capsid and a vector genome, packaged within the AAV capsid, that contains at least non-AAV coding sequences. In certain embodiments, the capsid contains about 60 proteins consisting of vp1 proteins, vp2 proteins, and vp3 proteins, which self-assemble to form the capsid. Unless otherwise indicated, "recombinant AAV" or "rAAV" may be used interchangeably with the phrase "rAAV vector". The rAAV is a "replication-defective virus" or "viral vector" in that it lacks any functional AAV rep gene or functional AAV cap gene and cannot generate progeny. In certain embodiments, the only AAV sequences are the AAV inverted terminal repeats (ITRs), typically located at the extreme 5' and 3' ends of the vector genome, to allow the gene and regulatory sequences located between the ITRs to be packaged within the AAV capsid.

The term "nuclease-resistant" means that the AAV capsid has assembled around the expression cassette designed to deliver the transgene to the host cell, and protects the packaged genomic sequences from degradation (digestion) during a nuclease incubation step designed to remove contaminating nucleic acids that may be present from the production process.

As used herein, "vector genome" refers to the nucleic acid sequence packaged inside the parvoviral (e.g., rAAV) capsid that forms a viral particle. Such nucleic acid sequences comprise AAV inverted terminal repeats (ITRs). In the examples herein, the vector genome contains, at a minimum, from 5' to 3', an AAV 5' ITR, coding sequence(s) (i.e., transgene(s)), and an AAV 3' ITR. ITRs from AAV2, from an AAV source different from that of the capsid, or ITRs other than full-length ITRs may be selected. In certain embodiments, the ITRs are from the same AAV source as the AAV that provides rep function during production, or from a trans-complementing AAV. Further, other ITRs, such as self-complementary (scAAV) ITRs, may be used. Both single-stranded AAV and self-complementary (sc) AAV are encompassed within the rAAV. A transgene is a nucleic acid coding sequence, heterologous to the vector sequences, that encodes a polypeptide, protein, functional RNA molecule (e.g., miRNA inhibitor), or other gene product of interest. The nucleic acid coding sequence is operably linked to regulatory components in a manner that permits transcription, translation and/or expression of the transgene in cells of the target tissue. Suitable components of the vector genome are discussed in more detail herein. In one example, a "vector genome" contains, at a minimum, from 5' to 3', a vector-specific sequence and a nucleic acid sequence encoding a protein of interest operably linked to regulatory control sequences (which direct its expression in a target cell), where the vector-specific sequence may be a terminal repeat sequence that specifically packages the vector genome into a viral vector capsid or envelope protein. For example, AAV inverted terminal repeats are used for packaging into AAV and certain other parvoviral capsids.

As used herein, an "operably linked" sequence comprises an expression control sequence that is contiguous with the gene of interest and an expression control sequence that acts in trans or remotely to control the gene of interest.

In certain embodiments, the non-viral genetic elements used to make the rAAV will be referred to as vectors (e.g., production vectors). In certain embodiments, these vectors are plasmids, but other suitable genetic elements are contemplated. Such production plasmids may encode sequences that are expressed during rAAV production, such as AAV capsids or rep proteins that are not packaged into rAAV that are required for the production of rAAV. Alternatively, such production plasmids may carry the vector genome packaged into the rAAV.

As used herein, "parental capsid" refers to a non-mutated or non-modified capsid selected from parvoviruses or other viruses (e.g., AAV, adenovirus, HSV, RSV, etc.). In certain embodiments, the parental capsid comprises any naturally occurring AAV capsid, including a wild-type genome encoding the capsid proteins (i.e., vp proteins), wherein the capsid proteins direct AAV transduction and/or tissue-specific tropism. In some embodiments, the parental capsid is selected from AAV that naturally target the CNS. In other embodiments, the parental capsid is selected from AAV that do not naturally target the CNS.

As used herein, "variant capsid" or "variant AAV capsid" refers to a modified capsid or a mutated capsid, wherein the capsid protein comprises the insertion of a tissue specific targeting peptide.

As used herein, an "expression cassette" refers to a nucleic acid molecule that includes a biologically useful nucleic acid sequence (e.g., a gene cDNA, mRNA, etc., encoding a protein, enzyme, or other useful gene product) and regulatory sequences operably linked thereto that direct or regulate transcription, translation, and/or expression of the nucleic acid sequence and its gene product. As used herein, "operably linked" sequences include both regulatory sequences that are contiguous or non-contiguous with the nucleic acid sequence and regulatory sequences that act in trans or in cis on the nucleic acid sequence. Such regulatory sequences typically comprise, for example, one or more of a promoter, enhancer, intron, Kozak sequence, polyadenylation sequence, and TATA signal. The expression cassette may contain regulatory sequences upstream (5') of the gene sequence, such as one or more of a promoter, enhancer, intron, etc., and/or regulatory sequences downstream (3') of the gene sequence, such as a 3' untranslated region (3' UTR) including a polyadenylation site, among other elements. In certain embodiments, the regulatory sequence is operably linked to the nucleic acid sequence of the gene product, wherein the regulatory sequence is separated from the nucleic acid sequence of the gene product by an intervening nucleic acid sequence, i.e., a 5' untranslated region (5' UTR). In certain embodiments, the expression cassette comprises the nucleic acid sequence of one or more gene products. In some embodiments, the expression cassette may be a monocistronic expression cassette or a bicistronic expression cassette. In other embodiments, the term "transgene" refers to one or more DNA sequences from an external source that are inserted into a target cell.

In the context of the present invention, the term "translation" relates to the process of ribosomes, in which the mRNA chain controls the assembly of amino acid sequences to produce proteins or peptides.

The term "expression" is used herein in its broadest sense and includes the production of RNA or RNA and proteins. Expression may be transient or may be stable.

When referring to a nucleic acid or fragment thereof, the term "substantial homology" or "substantial similarity" means that when optimally aligned with the appropriate nucleotide insertion or deletion of another nucleic acid (or its complementary strand), at least about 95 to 99% of the aligned sequences have nucleotide sequence identity. Preferably, the homology is over the full length sequence, or an open reading frame or another suitable fragment thereof of at least 15 nucleotides in length. Examples of suitable fragments are described herein.

In the context of nucleic acid sequences, the terms "sequence identity", "percent sequence identity" or "percent identity" refer to the residues in two sequences that are the same when aligned for maximum correspondence. The length of the sequence identity comparison may be over the full length of the genome, the full length of a gene coding sequence, or a fragment of at least about 500 to 5000 nucleotides, as desired. However, identity among smaller fragments may also be desired, e.g., at least about nine nucleotides, typically at least about 20 to 24 nucleotides, at least about 28 to 32 nucleotides, or at least about 36 or more nucleotides. Similarly, for amino acid sequences, the "percent sequence identity" can be readily determined over the full length of the protein or a fragment thereof. Suitably, the fragment is at least about 8 amino acids in length, and may be up to about 700 amino acids in length. Examples of suitable fragments are described herein.
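
As an illustration of "percent identity" over already-aligned sequences, the sketch below counts identical residues column by column. It is one possible convention rather than a method prescribed here: it assumes the two sequences were aligned beforehand by some alignment program, uses "-" as the gap character, counts a residue facing a gap as a mismatch, and ignores columns where both sequences have a gap. Alignment programs may use other denominators. The function name `percent_identity` is hypothetical.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent of alignment columns in which both sequences carry the same residue.

    Assumed convention (other tools may differ): a residue opposite a gap
    counts as a mismatch, and columns that are gaps in both sequences
    are skipped.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    columns = 0
    matches = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == gap and b == gap:
            continue  # column carries no residue in either sequence
        columns += 1
        if a == b:
            matches += 1
    if columns == 0:
        raise ValueError("alignment contains no comparable columns")
    return 100.0 * matches / columns


if __name__ == "__main__":
    # Invented toy alignment: 5 identical columns out of 6.
    print(f"{percent_identity('ACGT-A', 'ACGTTA'):.1f}%")
```

The same function works for aligned amino acid sequences, since it compares characters without regard to the alphabet.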

When referring to an amino acid sequence or fragment thereof, the term "substantial homology" or "substantial similarity" means that, when optimally aligned with appropriate amino acid insertions or deletions with another amino acid sequence, there is amino acid sequence identity in at least about 95 to 99% of the aligned sequences. Preferably, the homology is over the full-length sequence, or a protein thereof, e.g., an immunoglobulin region or domain, an AAV cap protein, or a fragment thereof that is at least 8 amino acids, or more desirably at least 15 amino acids, in length. Examples of suitable fragments are described herein.

The term "highly conserved" means at least 80% identical, preferably at least 90% identical, and more preferably more than 97% identical. Identity can be readily determined by those skilled in the art using algorithms and computer programs known to those skilled in the art.

In general, when referring to "identity", "homology", or "similarity" between two different adeno-associated viruses, "identity", "homology", or "similarity" is determined in reference to "aligned" sequences. "Aligned" sequences or "alignments" refer to multiple nucleic acid sequences or protein (amino acid) sequences, often containing corrections for missing or additional bases or amino acids as compared to a reference sequence. Alignments are performed using any of a variety of publicly or commercially available multiple sequence alignment programs. Examples of such programs include "Clustal Omega", "Clustal W", "CAP Sequence Assembly", "MAP", and "MEME", which are accessible through web servers on the Internet. Other sources for such programs are known to those of skill in the art. Alternatively, Vector NTI utilities are also used. A number of algorithms known in the art can also be used to measure nucleotide sequence identity, including those contained in the programs described above. As another example, polynucleotide sequences can be compared using Fasta™, a program in GCG Version 6.1. Fasta™ provides alignments and percent sequence identity of the regions of best overlap between the query and search sequences. For instance, percent sequence identity between nucleic acid sequences can be determined using Fasta™ with its default parameters (a word size of 6 and the NOPAM factor for the scoring matrix) as provided in GCG Version 6.1, which is incorporated herein by reference. Multiple sequence alignment programs are also available for amino acid sequences, e.g., the "Clustal Omega", "Clustal X", "MAP", "PIMA", "MSA", "BLOCKMAKER", "MEME", and "Match-Box" programs. Generally, any of these programs is used at default settings, although one of skill in the art can alter these settings as needed. Alternatively, one of skill in the art can utilize another algorithm or computer program that provides at least the level of identity or alignment provided by the referenced algorithms and programs. See, e.g., J. D. Thomson et al., Nucl. Acids Res., "A comprehensive comparison of multiple sequence alignments", 27(13):2682-2690 (1999).
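By way of illustration only, percent sequence identity over an existing alignment can be sketched as below. This is a minimal sketch, not the method of any of the programs named above: the function name `percent_identity` is hypothetical, the two inputs are assumed to be already aligned (equal length, with "-" marking gaps), and identity is assumed to be counted only over columns in which both sequences have a residue. Alignment programs differ in how they treat gap columns, so the denominator here is one convention among several.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity over columns where both aligned sequences have a residue.

    Assumption: inputs are pre-aligned strings of equal length using '-' for gaps.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    identical = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == "-" or b == "-":
            continue  # gap columns are skipped (an assumed convention)
        compared += 1
        if a == b:
            identical += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * identical / compared


# Two peptide inserts from Table 1, aligned by hand for this example.
print(percent_identity("--SSNTVKLTSGH", "EFSSNTVKLTS--"))
# A single substitution in a 10-residue peptide.
print(percent_identity("SANFIKPTSY", "SANFIRPTSY"))
```

The same function applies to nucleotide or amino acid alignments, since it only compares characters column by column.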

The effective amount may be determined based on an animal model rather than a human patient.

As described above, unless otherwise indicated, the term "about" when used in reference to a numerical value means a variation of ±10% from the given reference (for example, ±1%, ±2%, ±3%, ±4%, ±5%, ±6%, ±7%, ±8%, ±9%, ±10%, or a value therebetween).

In some cases, the term "E+#" or the term "e+#" is used to refer to an exponent. For example, "5E10" or "5e10" is 5 × 10¹⁰. These terms may be used interchangeably.
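As a small worked illustration of the two numeric conventions above, the following sketch computes the ±10% range implied by "about" and parses the exponent notation. The function names `about_range` and `parse_exponent_notation` are hypothetical and chosen only for this example.

```python
def about_range(value: float, tolerance: float = 0.10) -> tuple[float, float]:
    """Return the (low, high) bounds implied by 'about' at the given tolerance."""
    delta = abs(value) * tolerance
    return (value - delta, value + delta)


def parse_exponent_notation(text: str) -> float:
    """Parse strings such as '5E10' or '5e10'; Python's float() accepts both."""
    return float(text)


low, high = about_range(50.0)
print(low, high)
print(parse_exponent_notation("5E10") == parse_exponent_notation("5e10"))
```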

The terms "comprise", "comprises", and "comprising", and variations thereof, as used throughout the specification and claims, are inclusive of other components, elements, integers, steps, and the like. The terms "consist of" and "consisting of" are exclusive of other components, elements, integers, steps, and the like.

It should be noted that the term "a" or "an" means one or more; e.g., "an enhancer" is to be understood as representing one or more enhancers. As such, the terms "a" (or "an"), "one or more", and "at least one" are used interchangeably herein.

With respect to the description of these inventions, it is intended that each of the compositions described herein is useful, in another embodiment, in the methods of the present application. In addition, it is also intended that each of the compositions described herein as useful in the methods is, in another embodiment, itself an embodiment of the application.

Unless defined otherwise in the present specification, technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this application belongs and by reference to published texts, which provide one skilled in the art with a general guide to many of the terms used in this application.

Examples

The following examples are illustrative only and are not limiting of the application described herein.

EXAMPLE 1 primary screening

It has been shown that small peptides inserted into flexible loops on the AAV capsid surface can mediate interactions with new cellular receptors. In one case discovered at the California Institute of Technology (AAV9-PHP.B), a seven-amino-acid peptide inserted into the HVR8 loop of AAV9 mediates an interaction with Ly6a, a GPI-anchored receptor on the brain vasculature of some mouse strains. This interaction drives transport of AAV9-PHP.B across the blood-brain barrier (BBB), resulting in about 50-fold greater transduction of brain cells than AAV9. In this work, we sought peptide inserts that are able to bind cell membrane targets on the BBB and thus have the potential to drive AAV9 capsids across the BBB.

We approached the AAV-BBB problem by first surveying the current academic and patent literature for peptide sequences with potential to interact with cerebrovascular cells. We identified the following sources of such peptides:

results of published phage display experiments in which phage display libraries were panned against primary brain endothelial cells;

natural peptide ligands of known BBB-resident membrane proteins;

CDRs of antibodies targeting BBB-resident membrane proteins;

coat proteins of flaviviruses that cause encephalitis; and

bacterial toxins having cell-binding activity against GPI anchors.

We generated a library of AAV9 insertion mutants containing hundreds of peptides from these sources, each individually inserted at the HVR8 site (between positions 588 and 589, numbering based on the AAV9 capsid amino acid sequence of SEQ ID NO: 44). Each peptide is typically present in the library in several forms that differ in: 1) the length of the inserted peptide; and 2) the presence of flexible GSG or GG linker sequences flanking the peptide. Peptides were also encoded using multiple synonymous codons, allowing us to observe replicate activity independently in the screen.
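The library design described above can be sketched in code as follows. This is an illustrative sketch only: the function names are hypothetical, the capsid string used in the example is a placeholder rather than the AAV9 sequence of SEQ ID NO: 44, and the linker is assumed to flank the peptide on both sides, as in the "-L" entries of Table 1 (e.g., GG + SSNTVKLTSGH + GG).

```python
# Flanking options assumed from the text: no linker, GG, or GSG on both sides.
LINKERS = ("", "GG", "GSG")


def linker_variants(peptide: str) -> list[str]:
    """Return the peptide without a linker and flanked by each linker."""
    return [f"{linker}{peptide}{linker}" for linker in LINKERS]


def insert_after(capsid: str, insert: str, position: int = 588) -> str:
    """Insert a peptide after a 1-based residue position (default: between 588 and 589)."""
    if position < 1 or position >= len(capsid):
        raise ValueError("position must fall inside the capsid sequence")
    return capsid[:position] + insert + capsid[position:]


# Placeholder capsid for demonstration only; not a real capsid sequence.
toy_capsid = "A" * 588 + "Q" * 12

for variant in linker_variants("SSNTVKLTSGH"):
    mutant = insert_after(toy_capsid, variant)
    print(variant, len(mutant))
```

Encoding each variant with several synonymous codon choices, as the text describes, would multiply the number of DNA constructs per peptide without changing the amino acid sequences generated here.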

Furthermore, we generated libraries of HVR8 insertion variants of the AAVhu68 capsid containing known or suspected peptide ligands of blood-brain barrier (BBB) receptors (inserted between positions 588 and 589, numbering based on the AAVhu68 capsid amino acid sequence of SEQ ID NO: 45). These are:

peptides that bind mammalian brain endothelium (published phage display data);

classical RMT receptor ligands (e.g., Tf);

CDRs of mAbs against RMT receptors (e.g., anti-TfR); and

coat proteins of flaviviruses that cause encephalitis.

As a control, the PHP.B peptide (a positive control in C57BL/6 mice and a negative control in Balb/c mice and NHPs) was also included. Each peptide was encoded in several ways (with and without linkers, and with several synonymous DNA sequences).

We injected this library intravenously (IV) at high dose into two mouse strains and one non-human primate (NHP). After a 2-3 week survival period, the animals were necropsied and tissues were collected. We extracted AAV vector genome DNA from the CNS and other tissues and analyzed it by next-generation sequencing (NGS). The vector variants encapsidate their own capsid gene variants, allowing us to track capsid activity by the relative abundance of capsid gene variants in the tissue of interest. We scored the BBB activity ("enrichment score") of each variant in the library by calculating its abundance in the CNS normalized to its abundance in the injected library mixture.
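The enrichment score described above can be sketched as a ratio of read fractions. This is a minimal sketch under stated assumptions: the function names are hypothetical, abundance is assumed to mean the fraction of NGS reads assigned to a variant, and the read counts in the example are invented for illustration and are not data from this study.

```python
def read_fractions(counts: dict[str, int]) -> dict[str, float]:
    """Convert raw read counts per variant into fractions of total reads."""
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no reads")
    return {variant: n / total for variant, n in counts.items()}


def enrichment_scores(tissue_counts: dict[str, int],
                      library_counts: dict[str, int]) -> dict[str, float]:
    """Tissue read fraction divided by injected-library read fraction, per variant."""
    tissue = read_fractions(tissue_counts)
    library = read_fractions(library_counts)
    scores = {}
    for variant, lib_fraction in library.items():
        if lib_fraction == 0:
            continue  # variant absent from the injected library; score undefined
        scores[variant] = tissue.get(variant, 0.0) / lib_fraction
    return scores


# Invented counts for three hypothetical variants.
library = {"variant_1": 100, "variant_2": 100, "variant_3": 200}
brain = {"variant_1": 30, "variant_2": 10, "variant_3": 10}
print(enrichment_scores(brain, library))
```

A score above 1 indicates that a variant makes up a larger share of the tissue reads than of the injected library. Reads from synonymous-codon encodings of one peptide could be scored separately to serve as independent replicates.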

In the mouse study, the most brain-enriched HVR8 insert in C57BL/6 mice was TLAVPFK (SEQ ID NO: 49) (PHP.B). This positive control appeared independently three times as the most enriched hit: three PHP.B peptides encoded with synonymous codons were each enriched independently. Several other peptides were also enriched in the brain. FIGS. 1A and 1B show the enrichment scores of the best mouse brain hits in the screen alongside the reference peptide (FIG. 1A, C57BL/6J mice; FIG. 1B, Balb/c mice). FIGS. 2A and 2B show the enrichment scores of the best-performing hits in NHP brain (FIG. 2A) and spinal cord (FIG. 2B) tissue.

Table 1. Hit peptide list for primary screening (as identified in NHP and mouse screening).

Abbreviation	Peptide amino acid sequence	SEQ ID NO
EFS	EFSSNTVKLTS	38
SSN-L	GGSSNTVKLTSGHGG	39
SSN	SSNTVKLTSGH	40
SAN	SANFIKPTSY	41
VLT-L	GGVLTNIARGEYMRGG	46
IEI	IEINATRAGTNL	42
IEI-L	GGIEINATRAGTNLGG	43

EXAMPLE 2 Secondary verification

We followed up the primary mouse screen by generating GFP reporter vectors for several hit capsids. The vectors were injected IV into C57BL/6J mice at high dose. After 2 weeks, we necropsied the mice and collected GFP images of brain sections (data not shown). All hit vectors tested in the GFP study were detargeted from the liver, as evident from liver GFP staining (data not shown).

Imaging at higher magnification revealed that the SSN and SAN capsids localized markedly to the brain but were restricted to the endothelium. RCA-lectin was used as a co-staining marker of brain endothelial cells in these sections (data not shown). These results indicate that the AAV capsids identified in the screen bind BBB receptors but do not cross the BBB. A mutant series showed a range of affinities for the Ly6a receptor. Reduced brain transduction correlated with tighter binding of the peptide to the Ly6a receptor. Tight binding may reduce transduction because the genome may become trapped in the endosome.

Such endothelial localization may be useful for certain diseases. In addition, this activity can be optimized to convert these brain-localized vectors into vectors that cross the BBB.

We confirmed these BBB-crossing and brain-localization activities in barcoded vector studies. Briefly, each capsid was used to individually produce a vector containing a GFP reporter gene with a unique DNA barcode. The barcoded capsid preparations were mixed in equal ratios and injected into C57BL/6J or Balb/c mice (FIGS. 3A-3D). After the survival period, mouse tissues were analyzed by NGS to calculate the abundance of each barcode among the vector genomes extracted from the tissues. The results confirm brain localization of the vector genomes of all hit capsids identified in the primary screen. In Balb/c mice, the secondary validation screen showed brain targeting for all hit sequences found in the primary screen (FIG. 3A). In C57BL/6 mice, the secondary validation screen likewise showed brain targeting for all hit sequences found in the primary screen (FIG. 3B). All hit sequences showed liver detargeting relative to AAV9 in Balb/c and C57BL/6 mice, consistent with their affinity for the brain vasculature (FIGS. 3C and 3D).

EXAMPLE 3 endothelial targeting sequence

For NHP secondary validation, a barcode study was performed in two NHPs injected with a mixture of 27 barcoded vectors comprising 4.5 × 10¹³ GC/kg AAV9, followed by a 21-day survival period. Whole-brain tissue homogenates were analyzed. While some vectors showed modest improvements in vector biodistribution, none showed improved whole-brain transduction. Accumulation of vector genomes (DNA) in the brain correlated poorly with expression of vector-derived transcripts (mRNA) (Table 2). In particular, we observed that mRNA levels of the endothelial-targeting vectors correlated poorly with their DNA levels.

Table 2.

Table 3 below shows the average relative localization scores of the two NHPs in the barcode study, normalized to AAV9 (set to 1). Consistent with the brain-focused library design, most vectors showed no notable localization to non-brain tissues. One exception is the vector PMK, which was significantly retargeted to the spleen in both NHPs relative to AAV9.

Table 3.

AAV	Eye	Kidney	Liver	Lung	Pancreas	Spleen	Testis	Diaphragm	Quadriceps	Heart
AAV9	1.0	1.0	1.0	1.0	1.0	1.0	1.0	1.0	1.0	1.0
SSN	1.5	0.3	0.3	0.3	0.2	0.0	0.4	1.1	1.0	0.6
SSN-L	1.3	0.2	0.5	0.2	0.2	0.1	0.6	0.9	0.9	0.6
EFS	2.6	0.4	0.4	0.7	0.2	0.1	0.9	1.6	1.4	1.0
VLT-L	0.9	0.2	0.5	0.3	0.2	0.1	0.4	0.6	0.6	0.4
IEI	0.6	0.2	0.1	0.5	0.1	0.0	0.2	0.3	0.5	0.2
IEI-L	1.0	0.2	0.1	0.4	0.1	0.6	0.8	0.5	0.7	0.5
SAN	1.0	0.3	0.3	0.5	0.1	0.1	0.3	0.7	0.8	0.4
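The normalization used for Table 3 (scores averaged over two animals and expressed relative to AAV9 = 1) can be sketched as follows. This is a hedged illustration: the function name is hypothetical, the order of operations (normalize within each animal, then average) is an assumption not stated in the text, and the input numbers are invented rather than taken from the study.

```python
from statistics import mean


def normalize_to_reference(scores_by_animal: list[dict[str, float]],
                           reference: str = "AAV9") -> dict[str, float]:
    """Average, across animals, each vector's score divided by the reference score."""
    normalized = {}
    for vector in scores_by_animal[0]:
        # Assumed order: normalize within each animal first, then average.
        per_animal = [animal[vector] / animal[reference] for animal in scores_by_animal]
        normalized[vector] = mean(per_animal)
    return normalized


# Invented raw localization scores for one tissue in two animals.
animal_1 = {"AAV9": 2.0, "vector_x": 1.0}
animal_2 = {"AAV9": 4.0, "vector_x": 1.0}
print(normalize_to_reference([animal_1, animal_2]))
```

With this convention the reference vector is always 1.0, and values below 1.0 indicate less localization to that tissue than AAV9.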

Table 4 below shows that the brain endothelial hits share a common in vitro transduction profile (as measured by transduction of 293 cells) and a common production profile. Importantly, vectors with endothelial targeting activity showed a significant increase in relative abundance between the plasmid library and the AAV library, as measured by NGS.

Table 4.

A mutant library of the SAN peptide demonstrates the role of the "NxTK" motif in brain targeting. In this study, every possible single amino acid change was made in the SAN insert, the optimized library of variants was injected into mice, and the biodistribution f-score and yield of each variant were measured. The "NxTK" motif is the key motif for brain biodistribution in the SAN insert (Table 5 and FIG. 5). The "NxTK" motif also controls plasmid-to-AAV conversion of SAN peptide inserts (Table 6 and FIG. 6). Furthermore, the "NxTK" motif links three properties shared by the endothelial vectors of the "NxTK" class: endothelial cell transduction, improved 293 cell transduction, and the ability to spread during library production. FIGS. 7A to 7D show that the "NxTK" motif confers a broad transduction advantage across cell lines. Relative transduction levels were improved compared with the AAV9 capsid in 293 cells (FIG. 7A), NIH3T3 cells (FIG. 7B), and HUH7 cells (FIG. 7C). FIG. 7D shows a significant early improvement in transduction at day 3 post transduction (3 DPT) and an approximately 10-fold improvement at day 7 post transduction (7 DPT). AAV-GFP vectors with EFS and SAN peptide inserts showed improved transduction of primary macaque airway epithelial cells (FIGS. 7E-7H).
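The single-substitution scan of the SAN insert described above can be enumerated as in the sketch below. The function name `single_substitutions` is hypothetical, and the sketch covers only the peptide-level enumeration (the 20 standard amino acids at each of the 10 positions of SANFIKPTSY, SEQ ID NO: 41); codon choice and library cloning are outside its scope.

```python
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard amino acids, one-letter codes


def single_substitutions(peptide: str) -> list[tuple[int, str, str, str]]:
    """List every single-residue substitution as (position, wild_type, new, sequence).

    Positions are 1-based; the wild-type residue itself is excluded at each position.
    """
    variants = []
    for index, wild_type in enumerate(peptide):
        for residue in AMINO_ACIDS:
            if residue == wild_type:
                continue
            mutated = peptide[:index] + residue + peptide[index + 1:]
            variants.append((index + 1, wild_type, residue, mutated))
    return variants


scan = single_substitutions("SANFIKPTSY")
print(len(scan))  # 10 positions x 19 alternative residues
print(scan[0])
```

Each tuple corresponds to one non-wild-type cell of a position-by-residue grid of the kind shown in Table 6.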

TABLE 5 brain biodistribution (average from top to bottom)

TABLE 6 plasmid to AAV yield conversion

	S	A	N	F	I	K	P	T	S	Y
A	0.6	0.4	-1.1	1.1	1.3	-1.4	1.3	-0.1	0.4	1.2
C	-3.1	11.0	-6.2	-2.6	-6.0	-6.8	-4.3	-5.4	-4.4	-1.7
D	-0.1	0.6	-0.7	0.3	-0.7	-1.1	1.0	0.2	0.8	0.9
E	0.3	-1.2	-1.0	0.7	-0.6	-0.6	0.9	0.1	0.8	0.8
F	-2.8	-5.5	-5.1	0.4	-3.7	-7.2	-5.5	-3.7	-3.2	0.2
G	0.4	0.7	-0.9	1.2	-0.4	-0.5	1.2	0.9	0.7	1.1
H	0.2	-0.9	-1.6	1.3	-0.6	-1.7	1.0	0.0	0.1	0.9
I	-0.9	-1.4	-2.9	1.1	0.4	-3.3	-1.9	-0.7	-0.9	1.0
K	-0.5	-0.5	-1.3	0.3	-1.6	-3	-1.2	-2.5	-1.8	-1.1
L	-0.5	-1.1	-2.6	1.1	-1.1	-4.5	-1.7	-1.2	-1.2	0.6
M	-0.6	-1.5	-2.4	1.3	-0.6	-3.0	-0.4	-0.9	-0.8	1.0
N	1.0	0.5	0.4	1.5	-0.1	-1.0	1.5	0.6	0.7	1.2
P	0.7	-0.2	-1.1	-0.4	-0.4	-0.9	0.4	-0.4	0.4	0.7
Q	0.8	1.2	-0.6	1.6	-0.1	-1.2	1.5	0.0	0.4	1.0
R	-2.1	-2.9	-4.6	-1.0	-5.2	-1.3	-3.2	-4.2	-3.9	-1.3
S	0.4	0.5	-1.5	1.0	1.3	-0.9	1.3	0.3	0.4	1.2
T	0.5	0.4	-1.0	1.3	1.1	-1.0	1.3	0.4	0.5	1.1
V	0.3	-0.3	-2.3	1.1	0.8	-3.7	-0.1	-0.2	0.2	1.1
W	-4.4	-5.6	-11.0	-2.0	-5.5	-5.8	-11.6	-4.9	-7.3	-1.1
Y	-1.8	-2.9	-6.3	0.2	-3.8	-6.8	-4.3	-2.9	-3.1	0.4

As summarized above, the selected amino acid sequences all contain the functional motif N-x-(T/I/V/A)-(K/R) (SEQ ID NO: 47), as set forth in Table 7. In addition to the selected sequences shown in Table 7, other sequences were also identified during screening. We have data indicating that many substitutions within these insert sequences also support, or even improve, endothelial targeting activity. Furthermore, we have found thousands of sequences that fit this motif in large random insert libraries, all of which may share the improved transduction properties.

TABLE 7 selected endothelial targeting sequences

Vector name	Insert sequence	SEQ ID NO
SSN	SSNTVKLTSGH	40
EFS	EFSSNTVKLTS	38
VLT-L	GGVLTNIARGEYMRGG	46
IEI-L	GGIEINATRAGTNLGG	43
SSN-L	GGSSNTVKLTSGHGG	39
IEI	IEINATRAGTNL	42
SAN	SANFIKPTSY	41
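The motif N-x-(T/I/V/A)-(K/R) can be expressed as a regular expression and checked against the insert sequences of Table 7. This is an illustrative sketch: the names `MOTIF` and `find_motif` are hypothetical, and "x" is assumed to mean any one of the 20 standard amino acids.

```python
import re

# N, then any standard amino acid, then T/I/V/A, then K/R.
MOTIF = re.compile(r"N[ACDEFGHIKLMNPQRSTVWY][TIVA][KR]")


def find_motif(peptide: str):
    """Return the first motif occurrence in the peptide, or None if absent."""
    match = MOTIF.search(peptide.upper())
    return match.group(0) if match else None


table_7 = {
    "SSN": "SSNTVKLTSGH",
    "EFS": "EFSSNTVKLTS",
    "VLT-L": "GGVLTNIARGEYMRGG",
    "IEI-L": "GGIEINATRAGTNLGG",
    "SSN-L": "GGSSNTVKLTSGHGG",
    "IEI": "IEINATRAGTNL",
    "SAN": "SANFIKPTSY",
}

for name, sequence in table_7.items():
    print(name, find_motif(sequence))

# The PHP.B control peptide (SEQ ID NO: 49) contains no N and does not match.
print("PHP.B", find_motif("TLAVPFK"))
```

Each Table 7 insert contains one match (NTVK, NIAR, NATR, or NFIK), which is consistent with the statement that all selected sequences share the motif.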

We completed barcoded evaluation of the primary screening hits in NHPs. Brain localization is the most prominent feature in this library, whereas targeting of peripheral tissues is not significantly enhanced, with the possible exception of spleen targeting by AAV9-PMK. We defined a sequence motif common to all peptide inserts with brain endothelial targeting activity, and we demonstrated the activity of this motif in brain endothelial targeting as well as in conferring a broad transduction advantage in vitro. The single sequence motif "NxTK" defines inserts from 4 unrelated sources that share three characteristics:

(1) Significantly improved in vitro transduction of 293 cells and other cell lines;

(2) Parasitic spread during library production (the "spreading" phenotype); and

(3) Endothelial biodistribution in mice and NHPs.

The "NxTK" motif was mapped by systematic mutational screening of the SAN and EFS vector inserts. In this screening, the "NxTK" motif was shown to be critical for brain endothelial biodistribution and for abundance in library production. Plasmid-to-AAV conversion refers to capsid yield during library production and is governed by two factors: the presence of a parasitic spreading motif (major factor) and the intrinsic production fitness of the capsid (minor factor). One spreading motif was identified as "NxTK"; it likely interacts with a 293 cell surface receptor and confers a transduction advantage. This transduction advantage in 293 cells may lead to transfer of the vector genome (Cap) to neighboring cells during the production phase: because library production is performed with limiting Cap, most cells initially have everything except a Cap gene. The minor factor is revealed only after vectors carrying parasitic spreading motifs are filtered out computationally.

EXAMPLE 4 engineering strategies for AAV capsid development for airway delivery

Current AAV vectors transduce cells of the nasal passages and upper respiratory tract poorly, which limits AAV-based prophylaxis strategies against upper respiratory infections such as influenza or COVID-19. This project aims to engineer AAV capsids with improved transduction of these tissues, specifically by pursuing directed evolution and receptor targeting strategies. The engineering strategy for AAV airway-specific capsid selection includes, but is not limited to, the following steps: generating a diverse library of constructs with inserts (10³ to 10⁹ initial diversity); selecting in primary primate or human airway cells; identifying the genomic signatures (DNA or RNA) of the selected constructs; performing additional screening to focus the hits; and validating improved capsid hits in NHPs. Two approaches are pursued as sources of capsid diversity for airway delivery: unbiased and biased. In the unbiased approach, random peptides are inserted into the AAV capsid surface to generate large libraries with a diversity of more than 10⁷ individual variants. In the biased approach, peptides with known or suspected airway cell binding activity are inserted into the AAV capsid to generate small libraries with a diversity of about 10³ individual variants. Sources of such known or suspected airway cell binding peptides are: published phage display results, peptides from viral receptor binding domains, known ligands of airway receptors, and peptides generated in previous in vitro library screens. To screen the resulting capsid libraries in vitro, assays using primary airway cells in air-liquid interface (ALI) cultures are used. The cells are of human and cynomolgus origin and include cells of nasal, tracheal, and bronchial origin.

Preliminary transduction tests with GFP vectors in cynomolgus primary airway epithelial cell cultures showed a significant early improvement in transduction for AAV capsids carrying the EFS and SAN insert peptides (FIGS. 7D and 7E-7H). FIG. 7E shows microscopic analysis of macaque primary airway epithelial cells in a control sample treated with vehicle (i.e., no vector). FIG. 7F shows microscopic analysis of macaque primary airway epithelial cells transduced with the AAV9-GFP vector. FIG. 7G shows microscopic analysis of macaque primary airway epithelial cells transduced with an AAV9-GFP vector including the EFS peptide insert. FIG. 7H shows microscopic analysis of cynomolgus primary airway epithelial cells transduced with an AAV9-GFP vector including the SAN peptide insert. Preliminary transduction tests with GFP vectors in cultured human cells showed lower transduction overall: on day 7, the ratio of mRNA copy number to micrograms of total mRNA was 1 × 10⁴ in cultured human cells (FIG. 8), compared with 1 × 10⁶ in cultured macaque primary airway epithelial cells (FIG. 7D). The SAN motif showed a transduction advantage over AAV9 on day 7 (FIG. 8). The EFS motif showed poor transduction in cultured human bronchial and tracheal cells (FIG. 8).

In addition, we have developed, through in vitro selection protocols, a number of insert sequences that confer significantly improved cell binding and transduction activity. The resulting AAV-insert vectors were tested as a barcoded pool on ALI cultures. All of the AAV9-insert vectors selected in vitro outperformed the AAV9 vector (results not shown). In particular, the AAV9-insert vector Spr3L (NxTK) performed best, with transduction about 50-fold that of the AAV9 capsid.

(Sequence Listing Free Text)

The following information is provided for sequences containing free text under numeric identifier <223>.

All documents cited in this specification are incorporated herein by reference. U.S. Provisional Patent Application No. 63/119,863, filed December 1, 2020, is incorporated herein by reference in its entirety. The Sequence Listing named "20-9409PCT_ST25" is hereby incorporated by reference as if set forth in full. Although the invention has been described with reference to specific embodiments, it will be understood that modifications may be made without departing from the spirit of the invention. Such modifications are intended to fall within the scope of the appended claims.

Sequence listing

<110> The Trustees of the University of Pennsylvania

<120> novel AAV capsids having endothelial tissue specific targeting motif and compositions containing the same

<130> UPN-20-9409.PCT

<150> US 63/119,863

<151> 2020-12-01

<160> 56

<170> patent in version 3.5

<210> 1

<211> 4725

<212> DNA

<213> artificial sequence

<220>

<223> AAV2/9 n.588.EFS nucleic acid sequence expression cassettes

<220>

<221> misc_feature

<222> (1)..(36)

<223> truncated promoter

<220>

<221> promoter

<222> (1)..(7)

<223> p5 promoter

<220>

<221> CDS

<222> (37)..(1899)

<223> AAV2-Rep

<220>

<221> CDS

<222> (1919)..(4162)

<223> AAV9 Cap

<220>

<221> misc_feature

<222> (3683)..(3715)

<223> EFS

<220>

<221> misc_feature

<222> (4253)..(4383)

<223> p5 promoter

<220>

<221> misc_feature

<222> (4511)..(4725)

<223> LacZ promoter

<400> 1

ccattttgaa gcgggaggtt tgaacgcgca gccgcc atg ccg ggg ttt tac gag 54

Met Pro Gly Phe Tyr Glu

1 5

att gtg att aag gtc ccc agc gac ctt gac gag cat ctg ccc ggc att 102

Ile Val Ile Lys Val Pro Ser Asp Leu Asp Glu His Leu Pro Gly Ile

10 15 20

tct gac agc ttt gtg aac tgg gtg gcc gag aag gaa tgg gag ttg ccg 150

Ser Asp Ser Phe Val Asn Trp Val Ala Glu Lys Glu Trp Glu Leu Pro

25 30 35

cca gat tct gac atg gat ctg aat ctg att gag cag gca ccc ctg acc 198

Pro Asp Ser Asp Met Asp Leu Asn Leu Ile Glu Gln Ala Pro Leu Thr

40 45 50

gtg gcc gag aag ctg cag cgc gac ttt ctg acg gaa tgg cgc cgt gtg 246

Val Ala Glu Lys Leu Gln Arg Asp Phe Leu Thr Glu Trp Arg Arg Val

55 60 65 70

agt aag gcc ccg gag gct ctt ttc ttt gtg caa ttt gag aag gga gag 294

Ser Lys Ala Pro Glu Ala Leu Phe Phe Val Gln Phe Glu Lys Gly Glu

75 80 85

agc tac ttc cac atg cac gtg ctc gtg gaa acc acc ggg gtg aaa tcc 342

Ser Tyr Phe His Met His Val Leu Val Glu Thr Thr Gly Val Lys Ser

90 95 100

atg gtt ttg gga cgt ttc ctg agt cag att cgc gaa aaa ctg att cag 390

Met Val Leu Gly Arg Phe Leu Ser Gln Ile Arg Glu Lys Leu Ile Gln

105 110 115

aga att tac cgc ggg atc gag ccg act ttg cca aac tgg ttc gcg gtc 438

Arg Ile Tyr Arg Gly Ile Glu Pro Thr Leu Pro Asn Trp Phe Ala Val

120 125 130

aca aag acc aga aat ggc gcc gga ggc ggg aac aag gtg gtg gat gag 486

Thr Lys Thr Arg Asn Gly Ala Gly Gly Gly Asn Lys Val Val Asp Glu

135 140 145 150

tgc tac atc ccc aat tac ttg ctc ccc aaa acc cag cct gag ctc cag 534

Cys Tyr Ile Pro Asn Tyr Leu Leu Pro Lys Thr Gln Pro Glu Leu Gln

155 160 165

tgg gcg tgg act aat atg gaa cag tat tta agc gcc tgt ttg aat ctc 582

Trp Ala Trp Thr Asn Met Glu Gln Tyr Leu Ser Ala Cys Leu Asn Leu

170 175 180

acg gag cgt aaa cgg ttg gtg gcg cag cat ctg acg cac gtg tcg cag 630

Thr Glu Arg Lys Arg Leu Val Ala Gln His Leu Thr His Val Ser Gln

185 190 195

acg cag gag cag aac aaa gag aat cag aat ccc aat tct gat gcg ccg 678

Thr Gln Glu Gln Asn Lys Glu Asn Gln Asn Pro Asn Ser Asp Ala Pro

200 205 210

gtg atc aga tca aaa act tca gcc agg tac atg gag ctg gtc ggg tgg 726

Val Ile Arg Ser Lys Thr Ser Ala Arg Tyr Met Glu Leu Val Gly Trp

215 220 225 230

ctc gtg gac aag ggg att acc tcg gag aag cag tgg atc cag gag gac 774

Leu Val Asp Lys Gly Ile Thr Ser Glu Lys Gln Trp Ile Gln Glu Asp

235 240 245

cag gcc tca tac atc tcc ttc aat gcg gcc tcc aac tcg cgg tcc caa 822

Gln Ala Ser Tyr Ile Ser Phe Asn Ala Ala Ser Asn Ser Arg Ser Gln

250 255 260

atc aag gct gcc ttg gac aat gcg gga aag att atg agc ctg act aaa 870

Ile Lys Ala Ala Leu Asp Asn Ala Gly Lys Ile Met Ser Leu Thr Lys

265 270 275

acc gcc ccc gac tac ctg gtg ggc cag cag ccc gtg gag gac att tcc 918

Thr Ala Pro Asp Tyr Leu Val Gly Gln Gln Pro Val Glu Asp Ile Ser

280 285 290

agc aat cgg att tat aaa att ttg gaa cta aac ggg tac gat ccc caa 966

Ser Asn Arg Ile Tyr Lys Ile Leu Glu Leu Asn Gly Tyr Asp Pro Gln

295 300 305 310

tat gcg gct tcc gtc ttt ctg gga tgg gcc acg aaa aag ttc ggc aag 1014

Tyr Ala Ala Ser Val Phe Leu Gly Trp Ala Thr Lys Lys Phe Gly Lys

315 320 325

agg aac acc atc tgg ctg ttt ggg cct gca act acc ggg aag acc aac 1062

Arg Asn Thr Ile Trp Leu Phe Gly Pro Ala Thr Thr Gly Lys Thr Asn

330 335 340

atc gcg gag gcc ata gcc cac act gtg ccc ttc tac ggg tgc gta aac 1110

Ile Ala Glu Ala Ile Ala His Thr Val Pro Phe Tyr Gly Cys Val Asn

345 350 355

tgg acc aat gag aac ttt ccc ttc aac gac tgt gtc gac aag atg gtg 1158

Trp Thr Asn Glu Asn Phe Pro Phe Asn Asp Cys Val Asp Lys Met Val

360 365 370

atc tgg tgg gag gag ggg aag atg acc gcc aag gtc gtg gag tcg gcc 1206

Ile Trp Trp Glu Glu Gly Lys Met Thr Ala Lys Val Val Glu Ser Ala

375 380 385 390

aaa gcc att ctc gga gga agc aag gtg cgc gtg gac cag aaa tgc aag 1254

Lys Ala Ile Leu Gly Gly Ser Lys Val Arg Val Asp Gln Lys Cys Lys

395 400 405

tcc tcg gcc cag ata gac ccg act ccc gtg atc gtc acc tcc aac acc 1302

Ser Ser Ala Gln Ile Asp Pro Thr Pro Val Ile Val Thr Ser Asn Thr

410 415 420

aac atg tgc gcc gtg att gac ggg aac tca acg acc ttc gaa cac cag 1350

Asn Met Cys Ala Val Ile Asp Gly Asn Ser Thr Thr Phe Glu His Gln

425 430 435

cag ccg ttg caa gac cgg atg ttc aaa ttt gaa ctc acc cgc cgt ctg 1398

Gln Pro Leu Gln Asp Arg Met Phe Lys Phe Glu Leu Thr Arg Arg Leu

440 445 450

gat cat gac ttt ggg aag gtc acc aag cag gaa gtc aaa gac ttt ttc 1446

Asp His Asp Phe Gly Lys Val Thr Lys Gln Glu Val Lys Asp Phe Phe

455 460 465 470

cgg tgg gca aag gat cac gtg gtt gag gtg gag cat gaa ttc tac gtc 1494

Arg Trp Ala Lys Asp His Val Val Glu Val Glu His Glu Phe Tyr Val

475 480 485

aaa aag ggt gga gcc aag aaa aga ccc gcc ccc agt gac gca gat ata 1542

Lys Lys Gly Gly Ala Lys Lys Arg Pro Ala Pro Ser Asp Ala Asp Ile

490 495 500

agt gag ccc aaa cgg gtg cgc gag tca gtt gcg cag cca tcg acg tca 1590

Ser Glu Pro Lys Arg Val Arg Glu Ser Val Ala Gln Pro Ser Thr Ser

505 510 515

gac gcg gaa gct tcg atc aac tac gca gac agg tac caa aac aaa tgt 1638

Asp Ala Glu Ala Ser Ile Asn Tyr Ala Asp Arg Tyr Gln Asn Lys Cys

520 525 530

tct cgt cac gtg ggc atg aat ctg atg ctg ttt ccc tgc aga caa tgc 1686

Ser Arg His Val Gly Met Asn Leu Met Leu Phe Pro Cys Arg Gln Cys

535 540 545 550

gag aga atg aat cag aat tca aat atc tgc ttc act cac gga cag aaa 1734

Glu Arg Met Asn Gln Asn Ser Asn Ile Cys Phe Thr His Gly Gln Lys

555 560 565

gac tgt tta gag tgc ttt ccc gtg tca gaa tct caa ccc gtt tct gtc 1782

Asp Cys Leu Glu Cys Phe Pro Val Ser Glu Ser Gln Pro Val Ser Val

570 575 580

gtc aaa aag gcg tat cag aaa ctg tgc tac att cat cat atc atg gga 1830

Val Lys Lys Ala Tyr Gln Lys Leu Cys Tyr Ile His His Ile Met Gly

585 590 595

aag gtg cca gac gct tgc act gcc tgc gat ctg gtc aat gtg gat ttg 1878

Lys Val Pro Asp Ala Cys Thr Ala Cys Asp Leu Val Asn Val Asp Leu

600 605 610

gat gac tgc atc ttt gaa caa taaatgattt aaatcaggt atg gct gcc gat 1930

Asp Asp Cys Ile Phe Glu Gln Met Ala Ala Asp

615 620 625

ggt tat ctt cca gat tgg ctc gag gac aac ctt agt gaa gga att cgc 1978

Gly Tyr Leu Pro Asp Trp Leu Glu Asp Asn Leu Ser Glu Gly Ile Arg

630 635 640

gag tgg tgg gct ttg aaa cct gga gcc cct caa ccc aag gca aat caa 2026

Glu Trp Trp Ala Leu Lys Pro Gly Ala Pro Gln Pro Lys Ala Asn Gln

645 650 655

caa cat caa gac aac gct cga ggt ctt gtg ctt ccg ggt tac aaa tac 2074

Gln His Gln Asp Asn Ala Arg Gly Leu Val Leu Pro Gly Tyr Lys Tyr

660 665 670

ctt gga ccc ggc aac gga ctc gac aag ggg gag ccg gtc aac gca gca 2122

Leu Gly Pro Gly Asn Gly Leu Asp Lys Gly Glu Pro Val Asn Ala Ala

675 680 685

gac gcg gcg gcc ctc gag cac gac aag gcc tac gac cag cag ctc aag 2170

Asp Ala Ala Ala Leu Glu His Asp Lys Ala Tyr Asp Gln Gln Leu Lys

690 695 700 705

gcc gga gac aac ccg tac ctc aag tac aac cac gcc gac gcc gag ttc 2218

Ala Gly Asp Asn Pro Tyr Leu Lys Tyr Asn His Ala Asp Ala Glu Phe

710 715 720

cag gag cgg ctc aaa gaa gat acg tct ttt ggg ggc aac ctc ggg cga 2266

Gln Glu Arg Leu Lys Glu Asp Thr Ser Phe Gly Gly Asn Leu Gly Arg

725 730 735

gca gtc ttc cag gcc aaa aag agg ctt ctt gaa cct ctt ggt ctg gtt 2314

Ala Val Phe Gln Ala Lys Lys Arg Leu Leu Glu Pro Leu Gly Leu Val

740 745 750

gag gaa gcg gct aag acg gct cct gga aag aag agg cct gta gag cag 2362

Glu Glu Ala Ala Lys Thr Ala Pro Gly Lys Lys Arg Pro Val Glu Gln

755 760 765

tct cct cag gaa ccg gac tcc tcc gcg ggt att ggc aaa tcg ggt gca 2410

Ser Pro Gln Glu Pro Asp Ser Ser Ala Gly Ile Gly Lys Ser Gly Ala

770 775 780 785

cag ccc gct aaa aag aga ctc aat ttc ggt cag act ggc gac aca gag 2458

Gln Pro Ala Lys Lys Arg Leu Asn Phe Gly Gln Thr Gly Asp Thr Glu

790 795 800

tca gtc cca gac cct caa cca atc gga gaa cct ccc gca gcc ccc tca 2506

Ser Val Pro Asp Pro Gln Pro Ile Gly Glu Pro Pro Ala Ala Pro Ser

805 810 815

ggt gtg gga tct ctt aca atg gct tca ggt ggt ggc gca cca gtg gca 2554

Gly Val Gly Ser Leu Thr Met Ala Ser Gly Gly Gly Ala Pro Val Ala

820 825 830

gac aat aac gaa ggt gcc gat gga gtg ggt agt tcc tcg gga aat tgg 2602

Asp Asn Asn Glu Gly Ala Asp Gly Val Gly Ser Ser Ser Gly Asn Trp

835 840 845

cat tgc gat tcc caa tgg ctg ggg gac aga gtc atc acc acc agc acc 2650

His Cys Asp Ser Gln Trp Leu Gly Asp Arg Val Ile Thr Thr Ser Thr

850 855 860 865

cga acc tgg gcc ctg ccc acc tac aac aat cac ctc tac aag caa atc 2698

Arg Thr Trp Ala Leu Pro Thr Tyr Asn Asn His Leu Tyr Lys Gln Ile

870 875 880

tcc aac agc aca tct gga gga tct tca aat gac aac gcc tac ttc ggc 2746

Ser Asn Ser Thr Ser Gly Gly Ser Ser Asn Asp Asn Ala Tyr Phe Gly

885 890 895

tac agc acc ccc tgg ggg tat ttt gac ttc aac aga ttc cac tgc cac 2794

Tyr Ser Thr Pro Trp Gly Tyr Phe Asp Phe Asn Arg Phe His Cys His

900 905 910

ttc tca cca cgt gac tgg cag cga ctc atc aac aac aac tgg gga ttc 2842

Phe Ser Pro Arg Asp Trp Gln Arg Leu Ile Asn Asn Asn Trp Gly Phe

915 920 925

cgg cct aag cga ctc aac ttc aag ctc ttc aac att cag gtc aaa gag 2890

Arg Pro Lys Arg Leu Asn Phe Lys Leu Phe Asn Ile Gln Val Lys Glu

930 935 940 945

gtt acg gac aac aat gga gtc aag acc atc gcc aat aac ctt acc agc 2938

Val Thr Asp Asn Asn Gly Val Lys Thr Ile Ala Asn Asn Leu Thr Ser

950 955 960

acg gtc cag gtc ttc acg gac tca gac tat cag ctc ccg tac gtg ctc 2986

Thr Val Gln Val Phe Thr Asp Ser Asp Tyr Gln Leu Pro Tyr Val Leu

965 970 975

ggg tcg gct cac gag ggc tgc ctc ccg ccg ttc cca gcg gac gtt ttc 3034

Gly Ser Ala His Glu Gly Cys Leu Pro Pro Phe Pro Ala Asp Val Phe

980 985 990

atg att cct cag tac ggg tat ctg acg ctt aat gat gga agc cag gcc 3082

Met Ile Pro Gln Tyr Gly Tyr Leu Thr Leu Asn Asp Gly Ser Gln Ala

995 1000 1005

gtg ggt cgt tcg tcc ttt tac tgc ctg gaa tat ttc ccg tcg caa 3127

Val Gly Arg Ser Ser Phe Tyr Cys Leu Glu Tyr Phe Pro Ser Gln

1010 1015 1020

atg cta aga acg ggt aac aac ttc cag ttc agc tac gag ttt gag 3172

Met Leu Arg Thr Gly Asn Asn Phe Gln Phe Ser Tyr Glu Phe Glu

1025 1030 1035

aac gta cct ttc cat agc agc tac gct cac agc caa agc ctg gac 3217

Asn Val Pro Phe His Ser Ser Tyr Ala His Ser Gln Ser Leu Asp

1040 1045 1050

cga cta atg aat cca ctc atc gac caa tac ttg tac tat ctc tca 3262

Arg Leu Met Asn Pro Leu Ile Asp Gln Tyr Leu Tyr Tyr Leu Ser

1055 1060 1065

aag act att aac ggt tct gga cag aat caa caa acg cta aaa ttc 3307

Lys Thr Ile Asn Gly Ser Gly Gln Asn Gln Gln Thr Leu Lys Phe

1070 1075 1080

agt gtg gcc gga ccc agc aac atg gct gtc cag gga aga aac tac 3352

Ser Val Ala Gly Pro Ser Asn Met Ala Val Gln Gly Arg Asn Tyr

1085 1090 1095

ata cct gga ccc agc tac cga caa caa cgt gtc tca acc act gtg 3397

Ile Pro Gly Pro Ser Tyr Arg Gln Gln Arg Val Ser Thr Thr Val

1100 1105 1110

act caa aac aac aac agc gaa ttt gct tgg cct gga gct tct tct 3442

Thr Gln Asn Asn Asn Ser Glu Phe Ala Trp Pro Gly Ala Ser Ser

1115 1120 1125

tgg gct ctc aat gga cgt aat agc ttg atg aat cct gga cct gct 3487

Trp Ala Leu Asn Gly Arg Asn Ser Leu Met Asn Pro Gly Pro Ala

1130 1135 1140

atg gcc agc cac aaa gaa gga gag gac cgt ttc ttt cct ttg tct 3532

Met Ala Ser His Lys Glu Gly Glu Asp Arg Phe Phe Pro Leu Ser

1145 1150 1155

gga tct tta att ttt ggc aaa caa gga act gga aga gac aac gtg 3577

Gly Ser Leu Ile Phe Gly Lys Gln Gly Thr Gly Arg Asp Asn Val

1160 1165 1170

gat gcg gac aaa gtc atg ata acc aac gaa gaa gaa att aaa act 3622

Asp Ala Asp Lys Val Met Ile Thr Asn Glu Glu Glu Ile Lys Thr

1175 1180 1185

act aac ccg gta gca acg gag tcc tat gga caa gtg gcc aca aac 3667

Thr Asn Pro Val Ala Thr Glu Ser Tyr Gly Gln Val Ala Thr Asn

1190 1195 1200

cac cag agt gcc caa gag ttc agc agc aac acc gtg aag ctg acc 3712

His Gln Ser Ala Gln Glu Phe Ser Ser Asn Thr Val Lys Leu Thr

1205 1210 1215

agc gca cag gcg cag acc ggc tgg gtt caa aac caa gga ata ctt 3757

Ser Ala Gln Ala Gln Thr Gly Trp Val Gln Asn Gln Gly Ile Leu

1220 1225 1230

ccg ggt atg gtt tgg cag gac aga gat gtg tac ctg caa gga ccc 3802

Pro Gly Met Val Trp Gln Asp Arg Asp Val Tyr Leu Gln Gly Pro

1235 1240 1245

att tgg gcc aaa att cct cac acg gac ggc aac ttt cac cct tct 3847

Ile Trp Ala Lys Ile Pro His Thr Asp Gly Asn Phe His Pro Ser

1250 1255 1260

ccg ctg atg gga ggg ttt gga atg aag cac ccg cct cct cag atc 3892

Pro Leu Met Gly Gly Phe Gly Met Lys His Pro Pro Pro Gln Ile

1265 1270 1275

ctc atc aaa aac aca cct gta cct gcg gat cct cca acg gcc ttc 3937

Leu Ile Lys Asn Thr Pro Val Pro Ala Asp Pro Pro Thr Ala Phe

1280 1285 1290

aac aag gac aag ctg aac tct ttc atc acc cag tat tct act ggc 3982

Asn Lys Asp Lys Leu Asn Ser Phe Ile Thr Gln Tyr Ser Thr Gly

1295 1300 1305

caa gtc agc gtg gag atc gag tgg gag ctg cag aag gaa aac agc 4027

Gln Val Ser Val Glu Ile Glu Trp Glu Leu Gln Lys Glu Asn Ser

1310 1315 1320

aag cgc tgg aac ccg gag atc cag tac act tcc aac tat tac aag 4072

Lys Arg Trp Asn Pro Glu Ile Gln Tyr Thr Ser Asn Tyr Tyr Lys

1325 1330 1335

tct aat aat gtt gaa ttt gct gtt aat act gaa ggt gta tat agt 4117

Ser Asn Asn Val Glu Phe Ala Val Asn Thr Glu Gly Val Tyr Ser

1340 1345 1350

gaa ccc cgc ccc att ggc acc aga tac ctg act cgt aat ctg taa 4162

Glu Pro Arg Pro Ile Gly Thr Arg Tyr Leu Thr Arg Asn Leu

1355 1360 1365

ttgcttgtta atcaataaac cgtttaattc gtttcagttg aactttggtc tctgcgaagg 4222

gcgaattcgt ttaaacctgc aggactagag gtcctgtatt agaggtcacg tgagtgtttt 4282

gcgacatttt gcgacaccat gtggtcacgc tgggtattta agcccgagtg agcacgcagg 4342

gtctccattt tgaagcggga ggtttgaacg cgcagccgcc aagccgaatt ctgcagatat 4402

ccatcacact ggcggccgct cgactagagc ggccgccacc gcggtggagc tccagctttt 4462

gttcccttta gtgagggtta attgcgcgct tggcgtaatc atggtcatag ctgtttcctg 4522

tgtgaaattg ttatccgctc acaattccac acaacatacg agccggaagc ataaagtgta 4582

aagcctgggg tgcctaatga gtgagctaac tcacattaat tgcgttgcgc tcactgcccg 4642

ctttccagtc gggaaacctg tcgtgccagc tgcattaatg aatcggccaa cgcgcgggga 4702

gaggcggttt gcgtattggg cgc 4725

<210> 2

<211> 621

<212> PRT

<213> artificial sequence

<220>

<223> synthetic construct

<400> 2

Met Pro Gly Phe Tyr Glu Ile Val Ile Lys Val Pro Ser Asp Leu Asp

1 5 10 15

Glu His Leu Pro Gly Ile Ser Asp Ser Phe Val Asn Trp Val Ala Glu

20 25 30

Lys Glu Trp Glu Leu Pro Pro Asp Ser Asp Met Asp Leu Asn Leu Ile

35 40 45

Glu Gln Ala Pro Leu Thr Val Ala Glu Lys Leu Gln Arg Asp Phe Leu

50 55 60

Thr Glu Trp Arg Arg Val Ser Lys Ala Pro Glu Ala Leu Phe Phe Val

65 70 75 80

Gln Phe Glu Lys Gly Glu Ser Tyr Phe His Met His Val Leu Val Glu

85 90 95

Thr Thr Gly Val Lys Ser Met Val Leu Gly Arg Phe Leu Ser Gln Ile

100 105 110

Arg Glu Lys Leu Ile Gln Arg Ile Tyr Arg Gly Ile Glu Pro Thr Leu

115 120 125

Pro Asn Trp Phe Ala Val Thr Lys Thr Arg Asn Gly Ala Gly Gly Gly

130 135 140

Asn Lys Val Val Asp Glu Cys Tyr Ile Pro Asn Tyr Leu Leu Pro Lys

145 150 155 160

Thr Gln Pro Glu Leu Gln Trp Ala Trp Thr Asn Met Glu Gln Tyr Leu

165 170 175

Ser Ala Cys Leu Asn Leu Thr Glu Arg Lys Arg Leu Val Ala Gln His

180 185 190

Leu Thr His Val Ser Gln Thr Gln Glu Gln Asn Lys Glu Asn Gln Asn

195 200 205

Pro Asn Ser Asp Ala Pro Val Ile Arg Ser Lys Thr Ser Ala Arg Tyr

210 215 220

Met Glu Leu Val Gly Trp Leu Val Asp Lys Gly Ile Thr Ser Glu Lys

225 230 235 240

Gln Trp Ile Gln Glu Asp Gln Ala Ser Tyr Ile Ser Phe Asn Ala Ala

245 250 255

Ser Asn Ser Arg Ser Gln Ile Lys Ala Ala Leu Asp Asn Ala Gly Lys

260 265 270

Ile Met Ser Leu Thr Lys Thr Ala Pro Asp Tyr Leu Val Gly Gln Gln

275 280 285

Pro Val Glu Asp Ile Ser Ser Asn Arg Ile Tyr Lys Ile Leu Glu Leu

290 295 300

Asn Gly Tyr Asp Pro Gln Tyr Ala Ala Ser Val Phe Leu Gly Trp Ala

305 310 315 320

Thr Lys Lys Phe Gly Lys Arg Asn Thr Ile Trp Leu Phe Gly Pro Ala

325 330 335

Thr Thr Gly Lys Thr Asn Ile Ala Glu Ala Ile Ala His Thr Val Pro

340 345 350

Phe Tyr Gly Cys Val Asn Trp Thr Asn Glu Asn Phe Pro Phe Asn Asp

355 360 365

Cys Val Asp Lys Met Val Ile Trp Trp Glu Glu Gly Lys Met Thr Ala

370 375 380

Lys Val Val Glu Ser Ala Lys Ala Ile Leu Gly Gly Ser Lys Val Arg

385 390 395 400

Val Asp Gln Lys Cys Lys Ser Ser Ala Gln Ile Asp Pro Thr Pro Val

405 410 415

Ile Val Thr Ser Asn Thr Asn Met Cys Ala Val Ile Asp Gly Asn Ser

420 425 430

Thr Thr Phe Glu His Gln Gln Pro Leu Gln Asp Arg Met Phe Lys Phe

435 440 445

Glu Leu Thr Arg Arg Leu Asp His Asp Phe Gly Lys Val Thr Lys Gln

450 455 460

Glu Val Lys Asp Phe Phe Arg Trp Ala Lys Asp His Val Val Glu Val

465 470 475 480

Glu His Glu Phe Tyr Val Lys Lys Gly Gly Ala Lys Lys Arg Pro Ala

485 490 495

Pro Ser Asp Ala Asp Ile Ser Glu Pro Lys Arg Val Arg Glu Ser Val

500 505 510

Ala Gln Pro Ser Thr Ser Asp Ala Glu Ala Ser Ile Asn Tyr Ala Asp

515 520 525

Arg Tyr Gln Asn Lys Cys Ser Arg His Val Gly Met Asn Leu Met Leu

530 535 540

Phe Pro Cys Arg Gln Cys Glu Arg Met Asn Gln Asn Ser Asn Ile Cys

545 550 555 560

Phe Thr His Gly Gln Lys Asp Cys Leu Glu Cys Phe Pro Val Ser Glu

565 570 575

Ser Gln Pro Val Ser Val Val Lys Lys Ala Tyr Gln Lys Leu Cys Tyr

580 585 590

Ile His His Ile Met Gly Lys Val Pro Asp Ala Cys Thr Ala Cys Asp

595 600 605

Leu Val Asn Val Asp Leu Asp Asp Cys Ile Phe Glu Gln

610 615 620

<210> 3

<211> 747

<212> PRT

<213> artificial sequence

<220>

<223> synthetic construct

<400> 3

Met Ala Ala Asp Gly Tyr Leu Pro Asp Trp Leu Glu Asp Asn Leu Ser

1 5 10 15

Glu Gly Ile Arg Glu Trp Trp Ala Leu Lys Pro Gly Ala Pro Gln Pro

20 25 30

Lys Ala Asn Gln Gln His Gln Asp Asn Ala Arg Gly Leu Val Leu Pro

35 40 45

Gly Tyr Lys Tyr Leu Gly Pro Gly Asn Gly Leu Asp Lys Gly Glu Pro

50 55 60

Val Asn Ala Ala Asp Ala Ala Ala Leu Glu His Asp Lys Ala Tyr Asp

65 70 75 80

Gln Gln Leu Lys Ala Gly Asp Asn Pro Tyr Leu Lys Tyr Asn His Ala

85 90 95

Asp Ala Glu Phe Gln Glu Arg Leu Lys Glu Asp Thr Ser Phe Gly Gly

100 105 110

Asn Leu Gly Arg Ala Val Phe Gln Ala Lys Lys Arg Leu Leu Glu Pro

115 120 125

Leu Gly Leu Val Glu Glu Ala Ala Lys Thr Ala Pro Gly Lys Lys Arg

130 135 140

Pro Val Glu Gln Ser Pro Gln Glu Pro Asp Ser Ser Ala Gly Ile Gly

145 150 155 160

Lys Ser Gly Ala Gln Pro Ala Lys Lys Arg Leu Asn Phe Gly Gln Thr

165 170 175

Gly Asp Thr Glu Ser Val Pro Asp Pro Gln Pro Ile Gly Glu Pro Pro

180 185 190

Ala Ala Pro Ser Gly Val Gly Ser Leu Thr Met Ala Ser Gly Gly Gly

195 200 205

Ala Pro Val Ala Asp Asn Asn Glu Gly Ala Asp Gly Val Gly Ser Ser

210 215 220

Ser Gly Asn Trp His Cys Asp Ser Gln Trp Leu Gly Asp Arg Val Ile

225 230 235 240

Thr Thr Ser Thr Arg Thr Trp Ala Leu Pro Thr Tyr Asn Asn His Leu

245 250 255

Tyr Lys Gln Ile Ser Asn Ser Thr Ser Gly Gly Ser Ser Asn Asp Asn

260 265 270

Ala Tyr Phe Gly Tyr Ser Thr Pro Trp Gly Tyr Phe Asp Phe Asn Arg

275 280 285

Phe His Cys His Phe Ser Pro Arg Asp Trp Gln Arg Leu Ile Asn Asn

290 295 300

Asn Trp Gly Phe Arg Pro Lys Arg Leu Asn Phe Lys Leu Phe Asn Ile

305 310 315 320

Gln Val Lys Glu Val Thr Asp Asn Asn Gly Val Lys Thr Ile Ala Asn

325 330 335

Asn Leu Thr Ser Thr Val Gln Val Phe Thr Asp Ser Asp Tyr Gln Leu

340 345 350

Pro Tyr Val Leu Gly Ser Ala His Glu Gly Cys Leu Pro Pro Phe Pro

355 360 365

Ala Asp Val Phe Met Ile Pro Gln Tyr Gly Tyr Leu Thr Leu Asn Asp

370 375 380

Gly Ser Gln Ala Val Gly Arg Ser Ser Phe Tyr Cys Leu Glu Tyr Phe

385 390 395 400

Pro Ser Gln Met Leu Arg Thr Gly Asn Asn Phe Gln Phe Ser Tyr Glu

405 410 415

Phe Glu Asn Val Pro Phe His Ser Ser Tyr Ala His Ser Gln Ser Leu

420 425 430

Asp Arg Leu Met Asn Pro Leu Ile Asp Gln Tyr Leu Tyr Tyr Leu Ser

435 440 445

Lys Thr Ile Asn Gly Ser Gly Gln Asn Gln Gln Thr Leu Lys Phe Ser

450 455 460

Val Ala Gly Pro Ser Asn Met Ala Val Gln Gly Arg Asn Tyr Ile Pro

465 470 475 480

Gly Pro Ser Tyr Arg Gln Gln Arg Val Ser Thr Thr Val Thr Gln Asn

485 490 495

Asn Asn Ser Glu Phe Ala Trp Pro Gly Ala Ser Ser Trp Ala Leu Asn

500 505 510

Gly Arg Asn Ser Leu Met Asn Pro Gly Pro Ala Met Ala Ser His Lys

515 520 525

Glu Gly Glu Asp Arg Phe Phe Pro Leu Ser Gly Ser Leu Ile Phe Gly

530 535 540

Lys Gln Gly Thr Gly Arg Asp Asn Val Asp Ala Asp Lys Val Met Ile

545 550 555 560

Thr Asn Glu Glu Glu Ile Lys Thr Thr Asn Pro Val Ala Thr Glu Ser

565 570 575

Tyr Gly Gln Val Ala Thr Asn His Gln Ser Ala Gln Glu Phe Ser Ser

580 585 590

Asn Thr Val Lys Leu Thr Ser Ala Gln Ala Gln Thr Gly Trp Val Gln

595 600 605

Asn Gln Gly Ile Leu Pro Gly Met Val Trp Gln Asp Arg Asp Val Tyr

610 615 620

Leu Gln Gly Pro Ile Trp Ala Lys Ile Pro His Thr Asp Gly Asn Phe

625 630 635 640

His Pro Ser Pro Leu Met Gly Gly Phe Gly Met Lys His Pro Pro Pro

645 650 655

Gln Ile Leu Ile Lys Asn Thr Pro Val Pro Ala Asp Pro Pro Thr Ala

660 665 670

Phe Asn Lys Asp Lys Leu Asn Ser Phe Ile Thr Gln Tyr Ser Thr Gly

675 680 685

Gln Val Ser Val Glu Ile Glu Trp Glu Leu Gln Lys Glu Asn Ser Lys

690 695 700

Arg Trp Asn Pro Glu Ile Gln Tyr Thr Ser Asn Tyr Tyr Lys Ser Asn

705 710 715 720

Asn Val Glu Phe Ala Val Asn Thr Glu Gly Val Tyr Ser Glu Pro Arg

725 730 735

Pro Ile Gly Thr Arg Tyr Leu Thr Arg Asn Leu

740 745

<210> 4

<211> 4728

<212> DNA

<213> artificial sequence

<220>

<223> AAV2/9 n.588.IEI nucleic acid sequence expression cassettes

<220>

<221> misc_feature

<222> (1)..(36)

<223> truncated promoter

<220>

<221> promoter

<222> (1)..(7)

<223> p5 promoter

<220>

<221> CDS

<222> (37)..(1899)

<223> AAV2 Rep

<220>

<221> CDS

<222> (1919)..(4165)

<223> AAV9 Cap

<220>

<221> misc_feature

<222> (3683)..(3718)

<223> IEI

<220>

<221> misc_feature

<222> (4256)..(4386)

<223> p5 promoter

<220>

<221> misc_feature

<222> (4514)..(4728)

<223> LacZ promoter

<400> 4

ccattttgaa gcgggaggtt tgaacgcgca gccgcc atg ccg ggg ttt tac gag 54

Met Pro Gly Phe Tyr Glu

1 5

att gtg att aag gtc ccc agc gac ctt gac gag cat ctg ccc ggc att 102

Ile Val Ile Lys Val Pro Ser Asp Leu Asp Glu His Leu Pro Gly Ile

10 15 20

tct gac agc ttt gtg aac tgg gtg gcc gag aag gaa tgg gag ttg ccg 150

Ser Asp Ser Phe Val Asn Trp Val Ala Glu Lys Glu Trp Glu Leu Pro

25 30 35

cca gat tct gac atg gat ctg aat ctg att gag cag gca ccc ctg acc 198

Pro Asp Ser Asp Met Asp Leu Asn Leu Ile Glu Gln Ala Pro Leu Thr

40 45 50

gtg gcc gag aag ctg cag cgc gac ttt ctg acg gaa tgg cgc cgt gtg 246

Val Ala Glu Lys Leu Gln Arg Asp Phe Leu Thr Glu Trp Arg Arg Val

55 60 65 70

agt aag gcc ccg gag gct ctt ttc ttt gtg caa ttt gag aag gga gag 294

Ser Lys Ala Pro Glu Ala Leu Phe Phe Val Gln Phe Glu Lys Gly Glu

75 80 85

agc tac ttc cac atg cac gtg ctc gtg gaa acc acc ggg gtg aaa tcc 342

Ser Tyr Phe His Met His Val Leu Val Glu Thr Thr Gly Val Lys Ser

90 95 100

atg gtt ttg gga cgt ttc ctg agt cag att cgc gaa aaa ctg att cag 390

Met Val Leu Gly Arg Phe Leu Ser Gln Ile Arg Glu Lys Leu Ile Gln

105 110 115

aga att tac cgc ggg atc gag ccg act ttg cca aac tgg ttc gcg gtc 438

Arg Ile Tyr Arg Gly Ile Glu Pro Thr Leu Pro Asn Trp Phe Ala Val

120 125 130

aca aag acc aga aat ggc gcc gga ggc ggg aac aag gtg gtg gat gag 486

Thr Lys Thr Arg Asn Gly Ala Gly Gly Gly Asn Lys Val Val Asp Glu

135 140 145 150

tgc tac atc ccc aat tac ttg ctc ccc aaa acc cag cct gag ctc cag 534

Cys Tyr Ile Pro Asn Tyr Leu Leu Pro Lys Thr Gln Pro Glu Leu Gln

155 160 165

tgg gcg tgg act aat atg gaa cag tat tta agc gcc tgt ttg aat ctc 582

Trp Ala Trp Thr Asn Met Glu Gln Tyr Leu Ser Ala Cys Leu Asn Leu

170 175 180

acg gag cgt aaa cgg ttg gtg gcg cag cat ctg acg cac gtg tcg cag 630

Thr Glu Arg Lys Arg Leu Val Ala Gln His Leu Thr His Val Ser Gln

185 190 195

acg cag gag cag aac aaa gag aat cag aat ccc aat tct gat gcg ccg 678

Thr Gln Glu Gln Asn Lys Glu Asn Gln Asn Pro Asn Ser Asp Ala Pro

200 205 210

gtg atc aga tca aaa act tca gcc agg tac atg gag ctg gtc ggg tgg 726

Val Ile Arg Ser Lys Thr Ser Ala Arg Tyr Met Glu Leu Val Gly Trp

215 220 225 230

ctc gtg gac aag ggg att acc tcg gag aag cag tgg atc cag gag gac 774

Leu Val Asp Lys Gly Ile Thr Ser Glu Lys Gln Trp Ile Gln Glu Asp

235 240 245

cag gcc tca tac atc tcc ttc aat gcg gcc tcc aac tcg cgg tcc caa 822

Gln Ala Ser Tyr Ile Ser Phe Asn Ala Ala Ser Asn Ser Arg Ser Gln

250 255 260

atc aag gct gcc ttg gac aat gcg gga aag att atg agc ctg act aaa 870

Ile Lys Ala Ala Leu Asp Asn Ala Gly Lys Ile Met Ser Leu Thr Lys

265 270 275

acc gcc ccc gac tac ctg gtg ggc cag cag ccc gtg gag gac att tcc 918

Thr Ala Pro Asp Tyr Leu Val Gly Gln Gln Pro Val Glu Asp Ile Ser

280 285 290

agc aat cgg att tat aaa att ttg gaa cta aac ggg tac gat ccc caa 966

Ser Asn Arg Ile Tyr Lys Ile Leu Glu Leu Asn Gly Tyr Asp Pro Gln

295 300 305 310

tat gcg gct tcc gtc ttt ctg gga tgg gcc acg aaa aag ttc ggc aag 1014

Tyr Ala Ala Ser Val Phe Leu Gly Trp Ala Thr Lys Lys Phe Gly Lys

315 320 325

agg aac acc atc tgg ctg ttt ggg cct gca act acc ggg aag acc aac 1062

Arg Asn Thr Ile Trp Leu Phe Gly Pro Ala Thr Thr Gly Lys Thr Asn

330 335 340

atc gcg gag gcc ata gcc cac act gtg ccc ttc tac ggg tgc gta aac 1110

Ile Ala Glu Ala Ile Ala His Thr Val Pro Phe Tyr Gly Cys Val Asn

345 350 355

tgg acc aat gag aac ttt ccc ttc aac gac tgt gtc gac aag atg gtg 1158

Trp Thr Asn Glu Asn Phe Pro Phe Asn Asp Cys Val Asp Lys Met Val

360 365 370

atc tgg tgg gag gag ggg aag atg acc gcc aag gtc gtg gag tcg gcc 1206

Ile Trp Trp Glu Glu Gly Lys Met Thr Ala Lys Val Val Glu Ser Ala

375 380 385 390

aaa gcc att ctc gga gga agc aag gtg cgc gtg gac cag aaa tgc aag 1254

Lys Ala Ile Leu Gly Gly Ser Lys Val Arg Val Asp Gln Lys Cys Lys

395 400 405

tcc tcg gcc cag ata gac ccg act ccc gtg atc gtc acc tcc aac acc 1302

Ser Ser Ala Gln Ile Asp Pro Thr Pro Val Ile Val Thr Ser Asn Thr

410 415 420

aac atg tgc gcc gtg att gac ggg aac tca acg acc ttc gaa cac cag 1350

Asn Met Cys Ala Val Ile Asp Gly Asn Ser Thr Thr Phe Glu His Gln

425 430 435

cag ccg ttg caa gac cgg atg ttc aaa ttt gaa ctc acc cgc cgt ctg 1398

Gln Pro Leu Gln Asp Arg Met Phe Lys Phe Glu Leu Thr Arg Arg Leu

440 445 450

gat cat gac ttt ggg aag gtc acc aag cag gaa gtc aaa gac ttt ttc 1446

Asp His Asp Phe Gly Lys Val Thr Lys Gln Glu Val Lys Asp Phe Phe

455 460 465 470

cgg tgg gca aag gat cac gtg gtt gag gtg gag cat gaa ttc tac gtc 1494

Arg Trp Ala Lys Asp His Val Val Glu Val Glu His Glu Phe Tyr Val

475 480 485

aaa aag ggt gga gcc aag aaa aga ccc gcc ccc agt gac gca gat ata 1542

Lys Lys Gly Gly Ala Lys Lys Arg Pro Ala Pro Ser Asp Ala Asp Ile

490 495 500

agt gag ccc aaa cgg gtg cgc gag tca gtt gcg cag cca tcg acg tca 1590

Ser Glu Pro Lys Arg Val Arg Glu Ser Val Ala Gln Pro Ser Thr Ser

505 510 515

gac gcg gaa gct tcg atc aac tac gca gac agg tac caa aac aaa tgt 1638

Asp Ala Glu Ala Ser Ile Asn Tyr Ala Asp Arg Tyr Gln Asn Lys Cys

520 525 530

tct cgt cac gtg ggc atg aat ctg atg ctg ttt ccc tgc aga caa tgc 1686

Ser Arg His Val Gly Met Asn Leu Met Leu Phe Pro Cys Arg Gln Cys

535 540 545 550

gag aga atg aat cag aat tca aat atc tgc ttc act cac gga cag aaa 1734

Glu Arg Met Asn Gln Asn Ser Asn Ile Cys Phe Thr His Gly Gln Lys

555 560 565

gac tgt tta gag tgc ttt ccc gtg tca gaa tct caa ccc gtt tct gtc 1782

Asp Cys Leu Glu Cys Phe Pro Val Ser Glu Ser Gln Pro Val Ser Val

570 575 580

gtc aaa aag gcg tat cag aaa ctg tgc tac att cat cat atc atg gga 1830

Val Lys Lys Ala Tyr Gln Lys Leu Cys Tyr Ile His His Ile Met Gly

585 590 595

aag gtg cca gac gct tgc act gcc tgc gat ctg gtc aat gtg gat ttg 1878

Lys Val Pro Asp Ala Cys Thr Ala Cys Asp Leu Val Asn Val Asp Leu

600 605 610

gat gac tgc atc ttt gaa caa taaatgattt aaatcaggt atg gct gcc gat 1930

Asp Asp Cys Ile Phe Glu Gln Met Ala Ala Asp

615 620 625

ggt tat ctt cca gat tgg ctc gag gac aac ctt agt gaa gga att cgc 1978

Gly Tyr Leu Pro Asp Trp Leu Glu Asp Asn Leu Ser Glu Gly Ile Arg

630 635 640

gag tgg tgg gct ttg aaa cct gga gcc cct caa ccc aag gca aat caa 2026

Glu Trp Trp Ala Leu Lys Pro Gly Ala Pro Gln Pro Lys Ala Asn Gln

645 650 655

caa cat caa gac aac gct cga ggt ctt gtg ctt ccg ggt tac aaa tac 2074

Gln His Gln Asp Asn Ala Arg Gly Leu Val Leu Pro Gly Tyr Lys Tyr

660 665 670

ctt gga ccc ggc aac gga ctc gac aag ggg gag ccg gtc aac gca gca 2122

Leu Gly Pro Gly Asn Gly Leu Asp Lys Gly Glu Pro Val Asn Ala Ala

675 680 685

gac gcg gcg gcc ctc gag cac gac aag gcc tac gac cag cag ctc aag 2170

Asp Ala Ala Ala Leu Glu His Asp Lys Ala Tyr Asp Gln Gln Leu Lys

690 695 700 705

gcc gga gac aac ccg tac ctc aag tac aac cac gcc gac gcc gag ttc 2218

Ala Gly Asp Asn Pro Tyr Leu Lys Tyr Asn His Ala Asp Ala Glu Phe

710 715 720

cag gag cgg ctc aaa gaa gat acg tct ttt ggg ggc aac ctc ggg cga 2266

Gln Glu Arg Leu Lys Glu Asp Thr Ser Phe Gly Gly Asn Leu Gly Arg

725 730 735

gca gtc ttc cag gcc aaa aag agg ctt ctt gaa cct ctt ggt ctg gtt 2314

Ala Val Phe Gln Ala Lys Lys Arg Leu Leu Glu Pro Leu Gly Leu Val

740 745 750

gag gaa gcg gct aag acg gct cct gga aag aag agg cct gta gag cag 2362

Glu Glu Ala Ala Lys Thr Ala Pro Gly Lys Lys Arg Pro Val Glu Gln

755 760 765

tct cct cag gaa ccg gac tcc tcc gcg ggt att ggc aaa tcg ggt gca 2410

Ser Pro Gln Glu Pro Asp Ser Ser Ala Gly Ile Gly Lys Ser Gly Ala

770 775 780 785

cag ccc gct aaa aag aga ctc aat ttc ggt cag act ggc gac aca gag 2458

Gln Pro Ala Lys Lys Arg Leu Asn Phe Gly Gln Thr Gly Asp Thr Glu

790 795 800

tca gtc cca gac cct caa cca atc gga gaa cct ccc gca gcc ccc tca 2506

Ser Val Pro Asp Pro Gln Pro Ile Gly Glu Pro Pro Ala Ala Pro Ser

805 810 815

ggt gtg gga tct ctt aca atg gct tca ggt ggt ggc gca cca gtg gca 2554

Gly Val Gly Ser Leu Thr Met Ala Ser Gly Gly Gly Ala Pro Val Ala

820 825 830

gac aat aac gaa ggt gcc gat gga gtg ggt agt tcc tcg gga aat tgg 2602

Asp Asn Asn Glu Gly Ala Asp Gly Val Gly Ser Ser Ser Gly Asn Trp

835 840 845

cat tgc gat tcc caa tgg ctg ggg gac aga gtc atc acc acc agc acc 2650

His Cys Asp Ser Gln Trp Leu Gly Asp Arg Val Ile Thr Thr Ser Thr

850 855 860 865

cga acc tgg gcc ctg ccc acc tac aac aat cac ctc tac aag caa atc 2698

Arg Thr Trp Ala Leu Pro Thr Tyr Asn Asn His Leu Tyr Lys Gln Ile

870 875 880

tcc aac agc aca tct gga gga tct tca aat gac aac gcc tac ttc ggc 2746

Ser Asn Ser Thr Ser Gly Gly Ser Ser Asn Asp Asn Ala Tyr Phe Gly

885 890 895

tac agc acc ccc tgg ggg tat ttt gac ttc aac aga ttc cac tgc cac 2794

Tyr Ser Thr Pro Trp Gly Tyr Phe Asp Phe Asn Arg Phe His Cys His

900 905 910

ttc tca cca cgt gac tgg cag cga ctc atc aac aac aac tgg gga ttc 2842

Phe Ser Pro Arg Asp Trp Gln Arg Leu Ile Asn Asn Asn Trp Gly Phe

915 920 925

cgg cct aag cga ctc aac ttc aag ctc ttc aac att cag gtc aaa gag 2890

Arg Pro Lys Arg Leu Asn Phe Lys Leu Phe Asn Ile Gln Val Lys Glu

930 935 940 945

gtt acg gac aac aat gga gtc aag acc atc gcc aat aac ctt acc agc 2938

Val Thr Asp Asn Asn Gly Val Lys Thr Ile Ala Asn Asn Leu Thr Ser

950 955 960

acg gtc cag gtc ttc acg gac tca gac tat cag ctc ccg tac gtg ctc 2986

Thr Val Gln Val Phe Thr Asp Ser Asp Tyr Gln Leu Pro Tyr Val Leu

965 970 975

ggg tcg gct cac gag ggc tgc ctc ccg ccg ttc cca gcg gac gtt ttc 3034

Gly Ser Ala His Glu Gly Cys Leu Pro Pro Phe Pro Ala Asp Val Phe

980 985 990

atg att cct cag tac ggg tat ctg acg ctt aat gat gga agc cag gcc 3082

Met Ile Pro Gln Tyr Gly Tyr Leu Thr Leu Asn Asp Gly Ser Gln Ala

995 1000 1005

gtg ggt cgt tcg tcc ttt tac tgc ctg gaa tat ttc ccg tcg caa 3127

Val Gly Arg Ser Ser Phe Tyr Cys Leu Glu Tyr Phe Pro Ser Gln

1010 1015 1020

atg cta aga acg ggt aac aac ttc cag ttc agc tac gag ttt gag 3172

Met Leu Arg Thr Gly Asn Asn Phe Gln Phe Ser Tyr Glu Phe Glu

1025 1030 1035

aac gta cct ttc cat agc agc tac gct cac agc caa agc ctg gac 3217

Asn Val Pro Phe His Ser Ser Tyr Ala His Ser Gln Ser Leu Asp

1040 1045 1050

cga cta atg aat cca ctc atc gac caa tac ttg tac tat ctc tca 3262

Arg Leu Met Asn Pro Leu Ile Asp Gln Tyr Leu Tyr Tyr Leu Ser

1055 1060 1065

aag act att aac ggt tct gga cag aat caa caa acg cta aaa ttc 3307

Lys Thr Ile Asn Gly Ser Gly Gln Asn Gln Gln Thr Leu Lys Phe

1070 1075 1080

agt gtg gcc gga ccc agc aac atg gct gtc cag gga aga aac tac 3352

Ser Val Ala Gly Pro Ser Asn Met Ala Val Gln Gly Arg Asn Tyr

1085 1090 1095

ata cct gga ccc agc tac cga caa caa cgt gtc tca acc act gtg 3397

Ile Pro Gly Pro Ser Tyr Arg Gln Gln Arg Val Ser Thr Thr Val

1100 1105 1110

act caa aac aac aac agc gaa ttt gct tgg cct gga gct tct tct 3442

Thr Gln Asn Asn Asn Ser Glu Phe Ala Trp Pro Gly Ala Ser Ser

1115 1120 1125

tgg gct ctc aat gga cgt aat agc ttg atg aat cct gga cct gct 3487

Trp Ala Leu Asn Gly Arg Asn Ser Leu Met Asn Pro Gly Pro Ala

1130 1135 1140

atg gcc agc cac aaa gaa gga gag gac cgt ttc ttt cct ttg tct 3532

Met Ala Ser His Lys Glu Gly Glu Asp Arg Phe Phe Pro Leu Ser

1145 1150 1155

gga tct tta att ttt ggc aaa caa gga act gga aga gac aac gtg 3577

Gly Ser Leu Ile Phe Gly Lys Gln Gly Thr Gly Arg Asp Asn Val

1160 1165 1170

gat gcg gac aaa gtc atg ata acc aac gaa gaa gaa att aaa act 3622

Asp Ala Asp Lys Val Met Ile Thr Asn Glu Glu Glu Ile Lys Thr

1175 1180 1185

act aac ccg gta gca acg gag tcc tat gga caa gtg gcc aca aac 3667

Thr Asn Pro Val Ala Thr Glu Ser Tyr Gly Gln Val Ala Thr Asn

1190 1195 1200

cac cag agt gcc caa atc gag atc aac gct acc aga gct gga acc 3712

His Gln Ser Ala Gln Ile Glu Ile Asn Ala Thr Arg Ala Gly Thr

1205 1210 1215

aac ctg gca cag gcg cag acc ggc tgg gtt caa aac caa gga ata 3757

Asn Leu Ala Gln Ala Gln Thr Gly Trp Val Gln Asn Gln Gly Ile

1220 1225 1230

ctt ccg ggt atg gtt tgg cag gac aga gat gtg tac ctg caa gga 3802

Leu Pro Gly Met Val Trp Gln Asp Arg Asp Val Tyr Leu Gln Gly

1235 1240 1245

ccc att tgg gcc aaa att cct cac acg gac ggc aac ttt cac cct 3847

Pro Ile Trp Ala Lys Ile Pro His Thr Asp Gly Asn Phe His Pro

1250 1255 1260

tct ccg ctg atg gga ggg ttt gga atg aag cac ccg cct cct cag 3892

Ser Pro Leu Met Gly Gly Phe Gly Met Lys His Pro Pro Pro Gln

1265 1270 1275

atc ctc atc aaa aac aca cct gta cct gcg gat cct cca acg gcc 3937

Ile Leu Ile Lys Asn Thr Pro Val Pro Ala Asp Pro Pro Thr Ala

1280 1285 1290

ttc aac aag gac aag ctg aac tct ttc atc acc cag tat tct act 3982

Phe Asn Lys Asp Lys Leu Asn Ser Phe Ile Thr Gln Tyr Ser Thr

1295 1300 1305

ggc caa gtc agc gtg gag atc gag tgg gag ctg cag aag gaa aac 4027

Gly Gln Val Ser Val Glu Ile Glu Trp Glu Leu Gln Lys Glu Asn

1310 1315 1320

agc aag cgc tgg aac ccg gag atc cag tac act tcc aac tat tac 4072

Ser Lys Arg Trp Asn Pro Glu Ile Gln Tyr Thr Ser Asn Tyr Tyr

1325 1330 1335

aag tct aat aat gtt gaa ttt gct gtt aat act gaa ggt gta tat 4117

Lys Ser Asn Asn Val Glu Phe Ala Val Asn Thr Glu Gly Val Tyr

1340 1345 1350

agt gaa ccc cgc ccc att ggc acc aga tac ctg act cgt aat ctg 4162

Ser Glu Pro Arg Pro Ile Gly Thr Arg Tyr Leu Thr Arg Asn Leu

1355 1360 1365

taa ttgcttgtta atcaataaac cgtttaattc gtttcagttg aactttggtc 4215

tctgcgaagg gcgaattcgt ttaaacctgc aggactagag gtcctgtatt agaggtcacg 4275

tgagtgtttt gcgacatttt gcgacaccat gtggtcacgc tgggtattta agcccgagtg 4335

agcacgcagg gtctccattt tgaagcggga ggtttgaacg cgcagccgcc aagccgaatt 4395

ctgcagatat ccatcacact ggcggccgct cgactagagc ggccgccacc gcggtggagc 4455

tccagctttt gttcccttta gtgagggtta attgcgcgct tggcgtaatc atggtcatag 4515

ctgtttcctg tgtgaaattg ttatccgctc acaattccac acaacatacg agccggaagc 4575

ataaagtgta aagcctgggg tgcctaatga gtgagctaac tcacattaat tgcgttgcgc 4635

tcactgcccg ctttccagtc gggaaacctg tcgtgccagc tgcattaatg aatcggccaa 4695

cgcgcgggga gaggcggttt gcgtattggg cgc 4728
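
As an illustrative cross-check (not part of the sequence listing as filed), the 36 nucleotides annotated as the IEI misc_feature at (3683)..(3718) of SEQ ID NO: 4 can be translated and scanned for the N-x-(T/I/V/A)-(K/R) motif recited in the abstract. The following is a minimal Python sketch. The codons are copied from the listing above, the helper and variable names are hypothetical, and reading the motif's third position as allowing alanine (printed as lowercase "a" in the abstract) is an assumption.

```python
import re

# Standard genetic code, built from the conventional TCAG codon ordering.
BASES = "tcag"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}

# Codons at positions 3683..3718 of SEQ ID NO: 4, copied from the listing.
IEI_CODONS = "atc gag atc aac gct acc aga gct gga acc aac ctg".split()


def translate(codons):
    """Translate a list of lowercase DNA codons into one-letter amino acids."""
    return "".join(CODON_TABLE[codon] for codon in codons)


# Motif N-x-(T/I/V/A)-(K/R); allowing 'A' at the third position is an
# assumed reading of the abstract.
MOTIF = re.compile(r"N.[TIVA][KR]")

peptide = translate(IEI_CODONS)
match = MOTIF.search(peptide)
print(peptide, match.group() if match else None)
```

Under these assumptions the translated insert agrees with the residues printed beneath those codons in the listing (Ile Glu Ile Asn Ala Thr Arg Ala Gly Thr Asn Leu), and the motif search finds Asn-Ala-Thr-Arg within it.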

<210> 5

<211> 621

<212> PRT

<213> artificial sequence

<220>

<223> synthetic construct

<400> 5

Met Pro Gly Phe Tyr Glu Ile Val Ile Lys Val Pro Ser Asp Leu Asp

1 5 10 15

Glu His Leu Pro Gly Ile Ser Asp Ser Phe Val Asn Trp Val Ala Glu

20 25 30

Lys Glu Trp Glu Leu Pro Pro Asp Ser Asp Met Asp Leu Asn Leu Ile

35 40 45

Glu Gln Ala Pro Leu Thr Val Ala Glu Lys Leu Gln Arg Asp Phe Leu

50 55 60

Thr Glu Trp Arg Arg Val Ser Lys Ala Pro Glu Ala Leu Phe Phe Val

65 70 75 80

Gln Phe Glu Lys Gly Glu Ser Tyr Phe His Met His Val Leu Val Glu

85 90 95

Thr Thr Gly Val Lys Ser Met Val Leu Gly Arg Phe Leu Ser Gln Ile

100 105 110

Arg Glu Lys Leu Ile Gln Arg Ile Tyr Arg Gly Ile Glu Pro Thr Leu

115 120 125

Pro Asn Trp Phe Ala Val Thr Lys Thr Arg Asn Gly Ala Gly Gly Gly

130 135 140

Asn Lys Val Val Asp Glu Cys Tyr Ile Pro Asn Tyr Leu Leu Pro Lys

145 150 155 160

Thr Gln Pro Glu Leu Gln Trp Ala Trp Thr Asn Met Glu Gln Tyr Leu

165 170 175

Ser Ala Cys Leu Asn Leu Thr Glu Arg Lys Arg Leu Val Ala Gln His

180 185 190

Leu Thr His Val Ser Gln Thr Gln Glu Gln Asn Lys Glu Asn Gln Asn

195 200 205

Pro Asn Ser Asp Ala Pro Val Ile Arg Ser Lys Thr Ser Ala Arg Tyr

210 215 220

Met Glu Leu Val Gly Trp Leu Val Asp Lys Gly Ile Thr Ser Glu Lys

225 230 235 240

Gln Trp Ile Gln Glu Asp Gln Ala Ser Tyr Ile Ser Phe Asn Ala Ala

245 250 255

Ser Asn Ser Arg Ser Gln Ile Lys Ala Ala Leu Asp Asn Ala Gly Lys

260 265 270

Ile Met Ser Leu Thr Lys Thr Ala Pro Asp Tyr Leu Val Gly Gln Gln

275 280 285

Pro Val Glu Asp Ile Ser Ser Asn Arg Ile Tyr Lys Ile Leu Glu Leu

290 295 300

Asn Gly Tyr Asp Pro Gln Tyr Ala Ala Ser Val Phe Leu Gly Trp Ala

305 310 315 320

Thr Lys Lys Phe Gly Lys Arg Asn Thr Ile Trp Leu Phe Gly Pro Ala

325 330 335

Thr Thr Gly Lys Thr Asn Ile Ala Glu Ala Ile Ala His Thr Val Pro

340 345 350

Phe Tyr Gly Cys Val Asn Trp Thr Asn Glu Asn Phe Pro Phe Asn Asp

355 360 365

Cys Val Asp Lys Met Val Ile Trp Trp Glu Glu Gly Lys Met Thr Ala

370 375 380

Lys Val Val Glu Ser Ala Lys Ala Ile Leu Gly Gly Ser Lys Val Arg

385 390 395 400

Val Asp Gln Lys Cys Lys Ser Ser Ala Gln Ile Asp Pro Thr Pro Val

405 410 415

Ile Val Thr Ser Asn Thr Asn Met Cys Ala Val Ile Asp Gly Asn Ser

420 425 430

Thr Thr Phe Glu His Gln Gln Pro Leu Gln Asp Arg Met Phe Lys Phe

435 440 445

Glu Leu Thr Arg Arg Leu Asp His Asp Phe Gly Lys Val Thr Lys Gln

450 455 460

Glu Val Lys Asp Phe Phe Arg Trp Ala Lys Asp His Val Val Glu Val

465 470 475 480

Glu His Glu Phe Tyr Val Lys Lys Gly Gly Ala Lys Lys Arg Pro Ala

485 490 495

Pro Ser Asp Ala Asp Ile Ser Glu Pro Lys Arg Val Arg Glu Ser Val

500 505 510

Ala Gln Pro Ser Thr Ser Asp Ala Glu Ala Ser Ile Asn Tyr Ala Asp

515 520 525

Arg Tyr Gln Asn Lys Cys Ser Arg His Val Gly Met Asn Leu Met Leu

530 535 540

Phe Pro Cys Arg Gln Cys Glu Arg Met Asn Gln Asn Ser Asn Ile Cys

545 550 555 560

Phe Thr His Gly Gln Lys Asp Cys Leu Glu Cys Phe Pro Val Ser Glu

565 570 575

Ser Gln Pro Val Ser Val Val Lys Lys Ala Tyr Gln Lys Leu Cys Tyr

580 585 590

Ile His His Ile Met Gly Lys Val Pro Asp Ala Cys Thr Ala Cys Asp

595 600 605

Leu Val Asn Val Asp Leu Asp Asp Cys Ile Phe Glu Gln

610 615 620

<210> 6

<211> 748

<212> PRT

<213> artificial sequence

<220>

<223> synthetic construct

<400> 6

Met Ala Ala Asp Gly Tyr Leu Pro Asp Trp Leu Glu Asp Asn Leu Ser

1 5 10 15

Glu Gly Ile Arg Glu Trp Trp Ala Leu Lys Pro Gly Ala Pro Gln Pro

20 25 30

Lys Ala Asn Gln Gln His Gln Asp Asn Ala Arg Gly Leu Val Leu Pro

35 40 45

Gly Tyr Lys Tyr Leu Gly Pro Gly Asn Gly Leu Asp Lys Gly Glu Pro

50 55 60

Val Asn Ala Ala Asp Ala Ala Ala Leu Glu His Asp Lys Ala Tyr Asp

65 70 75 80

Gln Gln Leu Lys Ala Gly Asp Asn Pro Tyr Leu Lys Tyr Asn His Ala

85 90 95

Asp Ala Glu Phe Gln Glu Arg Leu Lys Glu Asp Thr Ser Phe Gly Gly

100 105 110

Asn Leu Gly Arg Ala Val Phe Gln Ala Lys Lys Arg Leu Leu Glu Pro

115 120 125

Leu Gly Leu Val Glu Glu Ala Ala Lys Thr Ala Pro Gly Lys Lys Arg

130 135 140

Pro Val Glu Gln Ser Pro Gln Glu Pro Asp Ser Ser Ala Gly Ile Gly

145 150 155 160

Lys Ser Gly Ala Gln Pro Ala Lys Lys Arg Leu Asn Phe Gly Gln Thr

165 170 175

Gly Asp Thr Glu Ser Val Pro Asp Pro Gln Pro Ile Gly Glu Pro Pro

180 185 190

Ala Ala Pro Ser Gly Val Gly Ser Leu Thr Met Ala Ser Gly Gly Gly

195 200 205

Ala Pro Val Ala Asp Asn Asn Glu Gly Ala Asp Gly Val Gly Ser Ser

210 215 220

Ser Gly Asn Trp His Cys Asp Ser Gln Trp Leu Gly Asp Arg Val Ile

225 230 235 240

Thr Thr Ser Thr Arg Thr Trp Ala Leu Pro Thr Tyr Asn Asn His Leu

245 250 255

Tyr Lys Gln Ile Ser Asn Ser Thr Ser Gly Gly Ser Ser Asn Asp Asn

260 265 270

Ala Tyr Phe Gly Tyr Ser Thr Pro Trp Gly Tyr Phe Asp Phe Asn Arg

275 280 285

Phe His Cys His Phe Ser Pro Arg Asp Trp Gln Arg Leu Ile Asn Asn

290 295 300

Asn Trp Gly Phe Arg Pro Lys Arg Leu Asn Phe Lys Leu Phe Asn Ile

305 310 315 320

Gln Val Lys Glu Val Thr Asp Asn Asn Gly Val Lys Thr Ile Ala Asn

325 330 335

Asn Leu Thr Ser Thr Val Gln Val Phe Thr Asp Ser Asp Tyr Gln Leu

340 345 350

Pro Tyr Val Leu Gly Ser Ala His Glu Gly Cys Leu Pro Pro Phe Pro

355 360 365

Ala Asp Val Phe Met Ile Pro Gln Tyr Gly Tyr Leu Thr Leu Asn Asp

370 375 380

Gly Ser Gln Ala Val Gly Arg Ser Ser Phe Tyr Cys Leu Glu Tyr Phe

385 390 395 400

Pro Ser Gln Met Leu Arg Thr Gly Asn Asn Phe Gln Phe Ser Tyr Glu

405 410 415

Phe Glu Asn Val Pro Phe His Ser Ser Tyr Ala His Ser Gln Ser Leu

420 425 430

Asp Arg Leu Met Asn Pro Leu Ile Asp Gln Tyr Leu Tyr Tyr Leu Ser

435 440 445

Lys Thr Ile Asn Gly Ser Gly Gln Asn Gln Gln Thr Leu Lys Phe Ser

450 455 460

Val Ala Gly Pro Ser Asn Met Ala Val Gln Gly Arg Asn Tyr Ile Pro

465 470 475 480

Gly Pro Ser Tyr Arg Gln Gln Arg Val Ser Thr Thr Val Thr Gln Asn

485 490 495

Asn Asn Ser Glu Phe Ala Trp Pro Gly Ala Ser Ser Trp Ala Leu Asn

500 505 510

Gly Arg Asn Ser Leu Met Asn Pro Gly Pro Ala Met Ala Ser His Lys

515 520 525

Glu Gly Glu Asp Arg Phe Phe Pro Leu Ser Gly Ser Leu Ile Phe Gly

530 535 540

Lys Gln Gly Thr Gly Arg Asp Asn Val Asp Ala Asp Lys Val Met Ile

545 550 555 560

Thr Asn Glu Glu Glu Ile Lys Thr Thr Asn Pro Val Ala Thr Glu Ser

565 570 575

Tyr Gly Gln Val Ala Thr Asn His Gln Ser Ala Gln Ile Glu Ile Asn

580 585 590

Ala Thr Arg Ala Gly Thr Asn Leu Ala Gln Ala Gln Thr Gly Trp Val

595 600 605

Gln Asn Gln Gly Ile Leu Pro Gly Met Val Trp Gln Asp Arg Asp Val

610 615 620

Tyr Leu Gln Gly Pro Ile Trp Ala Lys Ile Pro His Thr Asp Gly Asn

625 630 635 640

Phe His Pro Ser Pro Leu Met Gly Gly Phe Gly Met Lys His Pro Pro

645 650 655

Pro Gln Ile Leu Ile Lys Asn Thr Pro Val Pro Ala Asp Pro Pro Thr

660 665 670

Ala Phe Asn Lys Asp Lys Leu Asn Ser Phe Ile Thr Gln Tyr Ser Thr

675 680 685

Gly Gln Val Ser Val Glu Ile Glu Trp Glu Leu Gln Lys Glu Asn Ser

690 695 700

Lys Arg Trp Asn Pro Glu Ile Gln Tyr Thr Ser Asn Tyr Tyr Lys Ser

705 710 715 720

Asn Asn Val Glu Phe Ala Val Asn Thr Glu Gly Val Tyr Ser Glu Pro

725 730 735

Arg Pro Ile Gly Thr Arg Tyr Leu Thr Arg Asn Leu

740 745

<210> 7

<211> 4740

<212> DNA

<213> artificial sequence

<220>

<223> AAV2/9 n.588.IEI-L expression cassette nucleic acid sequence

<220>

<221> misc_feature

<222> (1)..(36)

<223> truncated promoter

<220>

<221> promoter

<222> (1)..(7)

<223> p5 promoter

<220>

<221> CDS

<222> (37)..(1899)

<223> AAV2 Rep

<220>

<221> CDS

<222> (1919)..(4177)

<223> AAV9 Cap

<220>

<221> misc_feature

<222> (3683)..(3739)

<223> IEI-L

<220>

<221> misc_feature

<222> (4268)..(4398)

<223> p5 promoter

<220>

<221> misc_feature

<222> (4526)..(4740)

<223> LacZ promoter

<400> 7

ccattttgaa gcgggaggtt tgaacgcgca gccgcc atg ccg ggg ttt tac gag 54

Met Pro Gly Phe Tyr Glu

1 5

att gtg att aag gtc ccc agc gac ctt gac gag cat ctg ccc ggc att 102

Ile Val Ile Lys Val Pro Ser Asp Leu Asp Glu His Leu Pro Gly Ile

10 15 20

tct gac agc ttt gtg aac tgg gtg gcc gag aag gaa tgg gag ttg ccg 150

Ser Asp Ser Phe Val Asn Trp Val Ala Glu Lys Glu Trp Glu Leu Pro

25 30 35

cca gat tct gac atg gat ctg aat ctg att gag cag gca ccc ctg acc 198

Pro Asp Ser Asp Met Asp Leu Asn Leu Ile Glu Gln Ala Pro Leu Thr

40 45 50

gtg gcc gag aag ctg cag cgc gac ttt ctg acg gaa tgg cgc cgt gtg 246

Val Ala Glu Lys Leu Gln Arg Asp Phe Leu Thr Glu Trp Arg Arg Val

55 60 65 70

agt aag gcc ccg gag gct ctt ttc ttt gtg caa ttt gag aag gga gag 294

Ser Lys Ala Pro Glu Ala Leu Phe Phe Val Gln Phe Glu Lys Gly Glu

75 80 85

agc tac ttc cac atg cac gtg ctc gtg gaa acc acc ggg gtg aaa tcc 342

Ser Tyr Phe His Met His Val Leu Val Glu Thr Thr Gly Val Lys Ser

90 95 100

atg gtt ttg gga cgt ttc ctg agt cag att cgc gaa aaa ctg att cag 390

Met Val Leu Gly Arg Phe Leu Ser Gln Ile Arg Glu Lys Leu Ile Gln

105 110 115

aga att tac cgc ggg atc gag ccg act ttg cca aac tgg ttc gcg gtc 438

Arg Ile Tyr Arg Gly Ile Glu Pro Thr Leu Pro Asn Trp Phe Ala Val

120 125 130

aca aag acc aga aat ggc gcc gga ggc ggg aac aag gtg gtg gat gag 486

Thr Lys Thr Arg Asn Gly Ala Gly Gly Gly Asn Lys Val Val Asp Glu

135 140 145 150

tgc tac atc ccc aat tac ttg ctc ccc aaa acc cag cct gag ctc cag 534

Cys Tyr Ile Pro Asn Tyr Leu Leu Pro Lys Thr Gln Pro Glu Leu Gln

155 160 165

tgg gcg tgg act aat atg gaa cag tat tta agc gcc tgt ttg aat ctc 582

Trp Ala Trp Thr Asn Met Glu Gln Tyr Leu Ser Ala Cys Leu Asn Leu

170 175 180

acg gag cgt aaa cgg ttg gtg gcg cag cat ctg acg cac gtg tcg cag 630

Thr Glu Arg Lys Arg Leu Val Ala Gln His Leu Thr His Val Ser Gln

185 190 195

acg cag gag cag aac aaa gag aat cag aat ccc aat tct gat gcg ccg 678

Thr Gln Glu Gln Asn Lys Glu Asn Gln Asn Pro Asn Ser Asp Ala Pro

200 205 210

gtg atc aga tca aaa act tca gcc agg tac atg gag ctg gtc ggg tgg 726

Val Ile Arg Ser Lys Thr Ser Ala Arg Tyr Met Glu Leu Val Gly Trp

215 220 225 230

ctc gtg gac aag ggg att acc tcg gag aag cag tgg atc cag gag gac 774

Leu Val Asp Lys Gly Ile Thr Ser Glu Lys Gln Trp Ile Gln Glu Asp

235 240 245

cag gcc tca tac atc tcc ttc aat gcg gcc tcc aac tcg cgg tcc caa 822

Gln Ala Ser Tyr Ile Ser Phe Asn Ala Ala Ser Asn Ser Arg Ser Gln

250 255 260

atc aag gct gcc ttg gac aat gcg gga aag att atg agc ctg act aaa 870

Ile Lys Ala Ala Leu Asp Asn Ala Gly Lys Ile Met Ser Leu Thr Lys

265 270 275

acc gcc ccc gac tac ctg gtg ggc cag cag ccc gtg gag gac att tcc 918

Thr Ala Pro Asp Tyr Leu Val Gly Gln Gln Pro Val Glu Asp Ile Ser

280 285 290

agc aat cgg att tat aaa att ttg gaa cta aac ggg tac gat ccc caa 966

Ser Asn Arg Ile Tyr Lys Ile Leu Glu Leu Asn Gly Tyr Asp Pro Gln

295 300 305 310

tat gcg gct tcc gtc ttt ctg gga tgg gcc acg aaa aag ttc ggc aag 1014

Tyr Ala Ala Ser Val Phe Leu Gly Trp Ala Thr Lys Lys Phe Gly Lys

315 320 325

agg aac acc atc tgg ctg ttt ggg cct gca act acc ggg aag acc aac 1062

Arg Asn Thr Ile Trp Leu Phe Gly Pro Ala Thr Thr Gly Lys Thr Asn

330 335 340

atc gcg gag gcc ata gcc cac act gtg ccc ttc tac ggg tgc gta aac 1110

Ile Ala Glu Ala Ile Ala His Thr Val Pro Phe Tyr Gly Cys Val Asn

345 350 355

tgg acc aat gag aac ttt ccc ttc aac gac tgt gtc gac aag atg gtg 1158

Trp Thr Asn Glu Asn Phe Pro Phe Asn Asp Cys Val Asp Lys Met Val

360 365 370

atc tgg tgg gag gag ggg aag atg acc gcc aag gtc gtg gag tcg gcc 1206

Ile Trp Trp Glu Glu Gly Lys Met Thr Ala Lys Val Val Glu Ser Ala

375 380 385 390

aaa gcc att ctc gga gga agc aag gtg cgc gtg gac cag aaa tgc aag 1254

Lys Ala Ile Leu Gly Gly Ser Lys Val Arg Val Asp Gln Lys Cys Lys

395 400 405

tcc tcg gcc cag ata gac ccg act ccc gtg atc gtc acc tcc aac acc 1302

Ser Ser Ala Gln Ile Asp Pro Thr Pro Val Ile Val Thr Ser Asn Thr

410 415 420

aac atg tgc gcc gtg att gac ggg aac tca acg acc ttc gaa cac cag 1350

Asn Met Cys Ala Val Ile Asp Gly Asn Ser Thr Thr Phe Glu His Gln

425 430 435

cag ccg ttg caa gac cgg atg ttc aaa ttt gaa ctc acc cgc cgt ctg 1398

Gln Pro Leu Gln Asp Arg Met Phe Lys Phe Glu Leu Thr Arg Arg Leu

440 445 450

gat cat gac ttt ggg aag gtc acc aag cag gaa gtc aaa gac ttt ttc 1446

Asp His Asp Phe Gly Lys Val Thr Lys Gln Glu Val Lys Asp Phe Phe

455 460 465 470

cgg tgg gca aag gat cac gtg gtt gag gtg gag cat gaa ttc tac gtc 1494

Arg Trp Ala Lys Asp His Val Val Glu Val Glu His Glu Phe Tyr Val

475 480 485

aaa aag ggt gga gcc aag aaa aga ccc gcc ccc agt gac gca gat ata 1542

Lys Lys Gly Gly Ala Lys Lys Arg Pro Ala Pro Ser Asp Ala Asp Ile

490 495 500

agt gag ccc aaa cgg gtg cgc gag tca gtt gcg cag cca tcg acg tca 1590

Ser Glu Pro Lys Arg Val Arg Glu Ser Val Ala Gln Pro Ser Thr Ser

505 510 515

gac gcg gaa gct tcg atc aac tac gca gac agg tac caa aac aaa tgt 1638

Asp Ala Glu Ala Ser Ile Asn Tyr Ala Asp Arg Tyr Gln Asn Lys Cys

520 525 530

tct cgt cac gtg ggc atg aat ctg atg ctg ttt ccc tgc aga caa tgc 1686

Ser Arg His Val Gly Met Asn Leu Met Leu Phe Pro Cys Arg Gln Cys

535 540 545 550

gag aga atg aat cag aat tca aat atc tgc ttc act cac gga cag aaa 1734

Glu Arg Met Asn Gln Asn Ser Asn Ile Cys Phe Thr His Gly Gln Lys

555 560 565

gac tgt tta gag tgc ttt ccc gtg tca gaa tct caa ccc gtt tct gtc 1782

Asp Cys Leu Glu Cys Phe Pro Val Ser Glu Ser Gln Pro Val Ser Val

570 575 580

gtc aaa aag gcg tat cag aaa ctg tgc tac att cat cat atc atg gga 1830

Val Lys Lys Ala Tyr Gln Lys Leu Cys Tyr Ile His His Ile Met Gly

585 590 595

aag gtg cca gac gct tgc act gcc tgc gat ctg gtc aat gtg gat ttg 1878

Lys Val Pro Asp Ala Cys Thr Ala Cys Asp Leu Val Asn Val Asp Leu

600 605 610

gat gac tgc atc ttt gaa caa taaatgattt aaatcaggt atg gct gcc gat 1930

Asp Asp Cys Ile Phe Glu Gln Met Ala Ala Asp

615 620 625

ggt tat ctt cca gat tgg ctc gag gac aac ctt agt gaa gga att cgc 1978

Gly Tyr Leu Pro Asp Trp Leu Glu Asp Asn Leu Ser Glu Gly Ile Arg

630 635 640

gag tgg tgg gct ttg aaa cct gga gcc cct caa ccc aag gca aat caa 2026

Glu Trp Trp Ala Leu Lys Pro Gly Ala Pro Gln Pro Lys Ala Asn Gln

645 650 655

caa cat caa gac aac gct cga ggt ctt gtg ctt ccg ggt tac aaa tac 2074

Gln His Gln Asp Asn Ala Arg Gly Leu Val Leu Pro Gly Tyr Lys Tyr

660 665 670

ctt gga ccc ggc aac gga ctc gac aag ggg gag ccg gtc aac gca gca 2122

Leu Gly Pro Gly Asn Gly Leu Asp Lys Gly Glu Pro Val Asn Ala Ala

675 680 685

gac gcg gcg gcc ctc gag cac gac aag gcc tac gac cag cag ctc aag 2170

Asp Ala Ala Ala Leu Glu His Asp Lys Ala Tyr Asp Gln Gln Leu Lys

690 695 700 705

gcc gga gac aac ccg tac ctc aag tac aac cac gcc gac gcc gag ttc 2218

Ala Gly Asp Asn Pro Tyr Leu Lys Tyr Asn His Ala Asp Ala Glu Phe

710 715 720

cag gag cgg ctc aaa gaa gat acg tct ttt ggg ggc aac ctc ggg cga 2266

Gln Glu Arg Leu Lys Glu Asp Thr Ser Phe Gly Gly Asn Leu Gly Arg

725 730 735

gca gtc ttc cag gcc aaa aag agg ctt ctt gaa cct ctt ggt ctg gtt 2314

Ala Val Phe Gln Ala Lys Lys Arg Leu Leu Glu Pro Leu Gly Leu Val

740 745 750

gag gaa gcg gct aag acg gct cct gga aag aag agg cct gta gag cag 2362

Glu Glu Ala Ala Lys Thr Ala Pro Gly Lys Lys Arg Pro Val Glu Gln

755 760 765

tct cct cag gaa ccg gac tcc tcc gcg ggt att ggc aaa tcg ggt gca 2410

Ser Pro Gln Glu Pro Asp Ser Ser Ala Gly Ile Gly Lys Ser Gly Ala

770 775 780 785

cag ccc gct aaa aag aga ctc aat ttc ggt cag act ggc gac aca gag 2458

Gln Pro Ala Lys Lys Arg Leu Asn Phe Gly Gln Thr Gly Asp Thr Glu

790 795 800

tca gtc cca gac cct caa cca atc gga gaa cct ccc gca gcc ccc tca 2506

Ser Val Pro Asp Pro Gln Pro Ile Gly Glu Pro Pro Ala Ala Pro Ser

805 810 815

ggt gtg gga tct ctt aca atg gct tca ggt ggt ggc gca cca gtg gca 2554

Gly Val Gly Ser Leu Thr Met Ala Ser Gly Gly Gly Ala Pro Val Ala

820 825 830

gac aat aac gaa ggt gcc gat gga gtg ggt agt tcc tcg gga aat tgg 2602

Asp Asn Asn Glu Gly Ala Asp Gly Val Gly Ser Ser Ser Gly Asn Trp

835 840 845

cat tgc gat tcc caa tgg ctg ggg gac aga gtc atc acc acc agc acc 2650

His Cys Asp Ser Gln Trp Leu Gly Asp Arg Val Ile Thr Thr Ser Thr

850 855 860 865

cga acc tgg gcc ctg ccc acc tac aac aat cac ctc tac aag caa atc 2698

Arg Thr Trp Ala Leu Pro Thr Tyr Asn Asn His Leu Tyr Lys Gln Ile

870 875 880

tcc aac agc aca tct gga gga tct tca aat gac aac gcc tac ttc ggc 2746

Ser Asn Ser Thr Ser Gly Gly Ser Ser Asn Asp Asn Ala Tyr Phe Gly

885 890 895

tac agc acc ccc tgg ggg tat ttt gac ttc aac aga ttc cac tgc cac 2794

Tyr Ser Thr Pro Trp Gly Tyr Phe Asp Phe Asn Arg Phe His Cys His

900 905 910

ttc tca cca cgt gac tgg cag cga ctc atc aac aac aac tgg gga ttc 2842

Phe Ser Pro Arg Asp Trp Gln Arg Leu Ile Asn Asn Asn Trp Gly Phe

915 920 925

cgg cct aag cga ctc aac ttc aag ctc ttc aac att cag gtc aaa gag 2890

Arg Pro Lys Arg Leu Asn Phe Lys Leu Phe Asn Ile Gln Val Lys Glu

930 935 940 945

gtt acg gac aac aat gga gtc aag acc atc gcc aat aac ctt acc agc 2938

Val Thr Asp Asn Asn Gly Val Lys Thr Ile Ala Asn Asn Leu Thr Ser

950 955 960

acg gtc cag gtc ttc acg gac tca gac tat cag ctc ccg tac gtg ctc 2986

Thr Val Gln Val Phe Thr Asp Ser Asp Tyr Gln Leu Pro Tyr Val Leu

965 970 975

ggg tcg gct cac gag ggc tgc ctc ccg ccg ttc cca gcg gac gtt ttc 3034

Gly Ser Ala His Glu Gly Cys Leu Pro Pro Phe Pro Ala Asp Val Phe

980 985 990

atg att cct cag tac ggg tat ctg acg ctt aat gat gga agc cag gcc 3082

Met Ile Pro Gln Tyr Gly Tyr Leu Thr Leu Asn Asp Gly Ser Gln Ala

995 1000 1005

gtg ggt cgt tcg tcc ttt tac tgc ctg gaa tat ttc ccg tcg caa 3127

Val Gly Arg Ser Ser Phe Tyr Cys Leu Glu Tyr Phe Pro Ser Gln

1010 1015 1020

atg cta aga acg ggt aac aac ttc cag ttc agc tac gag ttt gag 3172

Met Leu Arg Thr Gly Asn Asn Phe Gln Phe Ser Tyr Glu Phe Glu

1025 1030 1035

aac gta cct ttc cat agc agc tac gct cac agc caa agc ctg gac 3217

Asn Val Pro Phe His Ser Ser Tyr Ala His Ser Gln Ser Leu Asp

1040 1045 1050

cga cta atg aat cca ctc atc gac caa tac ttg tac tat ctc tca 3262

Arg Leu Met Asn Pro Leu Ile Asp Gln Tyr Leu Tyr Tyr Leu Ser

1055 1060 1065

aag act att aac ggt tct gga cag aat caa caa acg cta aaa ttc 3307

Lys Thr Ile Asn Gly Ser Gly Gln Asn Gln Gln Thr Leu Lys Phe

1070 1075 1080

agt gtg gcc gga ccc agc aac atg gct gtc cag gga aga aac tac 3352

Ser Val Ala Gly Pro Ser Asn Met Ala Val Gln Gly Arg Asn Tyr

1085 1090 1095

ata cct gga ccc agc tac cga caa caa cgt gtc tca acc act gtg 3397

Ile Pro Gly Pro Ser Tyr Arg Gln Gln Arg Val Ser Thr Thr Val

1100 1105 1110

act caa aac aac aac agc gaa ttt gct tgg cct gga gct tct tct 3442

Thr Gln Asn Asn Asn Ser Glu Phe Ala Trp Pro Gly Ala Ser Ser

1115 1120 1125

tgg gct ctc aat gga cgt aat agc ttg atg aat cct gga cct gct 3487

Trp Ala Leu Asn Gly Arg Asn Ser Leu Met Asn Pro Gly Pro Ala

1130 1135 1140

atg gcc agc cac aaa gaa gga gag gac cgt ttc ttt cct ttg tct 3532

Met Ala Ser His Lys Glu Gly Glu Asp Arg Phe Phe Pro Leu Ser

1145 1150 1155

gga tct tta att ttt ggc aaa caa gga act gga aga gac aac gtg 3577

Gly Ser Leu Ile Phe Gly Lys Gln Gly Thr Gly Arg Asp Asn Val

1160 1165 1170

gat gcg gac aaa gtc atg ata acc aac gaa gaa gaa att aaa act 3622

Asp Ala Asp Lys Val Met Ile Thr Asn Glu Glu Glu Ile Lys Thr

1175 1180 1185

act aac ccg gta gca acg gag tcc tat gga caa gtg gcc aca aac 3667

Thr Asn Pro Val Ala Thr Glu Ser Tyr Gly Gln Val Ala Thr Asn

1190 1195 1200

cac cag agt gcc caa gga gga atc gag atc aac gct acc aga gct 3712

His Gln Ser Ala Gln Gly Gly Ile Glu Ile Asn Ala Thr Arg Ala

1205 1210 1215

gga acc aac ctg gga gga gca cag gcg cag acc ggc tgg gtt caa 3757

Gly Thr Asn Leu Gly Gly Ala Gln Ala Gln Thr Gly Trp Val Gln

1220 1225 1230

aac caa gga ata ctt ccg ggt atg gtt tgg cag gac aga gat gtg 3802

Asn Gln Gly Ile Leu Pro Gly Met Val Trp Gln Asp Arg Asp Val

1235 1240 1245

tac ctg caa gga ccc att tgg gcc aaa att cct cac acg gac ggc 3847

Tyr Leu Gln Gly Pro Ile Trp Ala Lys Ile Pro His Thr Asp Gly

1250 1255 1260

aac ttt cac cct tct ccg ctg atg gga ggg ttt gga atg aag cac 3892

Asn Phe His Pro Ser Pro Leu Met Gly Gly Phe Gly Met Lys His

1265 1270 1275

ccg cct cct cag atc ctc atc aaa aac aca cct gta cct gcg gat 3937

Pro Pro Pro Gln Ile Leu Ile Lys Asn Thr Pro Val Pro Ala Asp

1280 1285 1290

cct cca acg gcc ttc aac aag gac aag ctg aac tct ttc atc acc 3982

Pro Pro Thr Ala Phe Asn Lys Asp Lys Leu Asn Ser Phe Ile Thr

1295 1300 1305

cag tat tct act ggc caa gtc agc gtg gag atc gag tgg gag ctg 4027

Gln Tyr Ser Thr Gly Gln Val Ser Val Glu Ile Glu Trp Glu Leu

1310 1315 1320

cag aag gaa aac agc aag cgc tgg aac ccg gag atc cag tac act 4072

Gln Lys Glu Asn Ser Lys Arg Trp Asn Pro Glu Ile Gln Tyr Thr

1325 1330 1335

tcc aac tat tac aag tct aat aat gtt gaa ttt gct gtt aat act 4117

Ser Asn Tyr Tyr Lys Ser Asn Asn Val Glu Phe Ala Val Asn Thr

1340 1345 1350

gaa ggt gta tat agt gaa ccc cgc ccc att ggc acc aga tac ctg 4162

Glu Gly Val Tyr Ser Glu Pro Arg Pro Ile Gly Thr Arg Tyr Leu

1355 1360 1365

act cgt aat ctg taa ttgcttgtta atcaataaac cgtttaattc gtttcagttg 4217

Thr Arg Asn Leu

1370

aactttggtc tctgcgaagg gcgaattcgt ttaaacctgc aggactagag gtcctgtatt 4277

agaggtcacg tgagtgtttt gcgacatttt gcgacaccat gtggtcacgc tgggtattta 4337

agcccgagtg agcacgcagg gtctccattt tgaagcggga ggtttgaacg cgcagccgcc 4397

aagccgaatt ctgcagatat ccatcacact ggcggccgct cgactagagc ggccgccacc 4457

gcggtggagc tccagctttt gttcccttta gtgagggtta attgcgcgct tggcgtaatc 4517

atggtcatag ctgtttcctg tgtgaaattg ttatccgctc acaattccac acaacatacg 4577

agccggaagc ataaagtgta aagcctgggg tgcctaatga gtgagctaac tcacattaat 4637

tgcgttgcgc tcactgcccg ctttccagtc gggaaacctg tcgtgccagc tgcattaatg 4697

aatcggccaa cgcgcgggga gaggcggttt gcgtattggg cgc 4740

<210> 8

<211> 621

<212> PRT

<213> artificial sequence

<220>

<223> synthetic construct

<400> 8

Met Pro Gly Phe Tyr Glu Ile Val Ile Lys Val Pro Ser Asp Leu Asp

1 5 10 15

Glu His Leu Pro Gly Ile Ser Asp Ser Phe Val Asn Trp Val Ala Glu

20 25 30

Lys Glu Trp Glu Leu Pro Pro Asp Ser Asp Met Asp Leu Asn Leu Ile

35 40 45

Glu Gln Ala Pro Leu Thr Val Ala Glu Lys Leu Gln Arg Asp Phe Leu

50 55 60

Thr Glu Trp Arg Arg Val Ser Lys Ala Pro Glu Ala Leu Phe Phe Val

65 70 75 80

Gln Phe Glu Lys Gly Glu Ser Tyr Phe His Met His Val Leu Val Glu

85 90 95

Thr Thr Gly Val Lys Ser Met Val Leu Gly Arg Phe Leu Ser Gln Ile

100 105 110

Arg Glu Lys Leu Ile Gln Arg Ile Tyr Arg Gly Ile Glu Pro Thr Leu

115 120 125

Pro Asn Trp Phe Ala Val Thr Lys Thr Arg Asn Gly Ala Gly Gly Gly

130 135 140

Asn Lys Val Val Asp Glu Cys Tyr Ile Pro Asn Tyr Leu Leu Pro Lys

145 150 155 160

Thr Gln Pro Glu Leu Gln Trp Ala Trp Thr Asn Met Glu Gln Tyr Leu

165 170 175

Ser Ala Cys Leu Asn Leu Thr Glu Arg Lys Arg Leu Val Ala Gln His

180 185 190

Leu Thr His Val Ser Gln Thr Gln Glu Gln Asn Lys Glu Asn Gln Asn

195 200 205

Pro Asn Ser Asp Ala Pro Val Ile Arg Ser Lys Thr Ser Ala Arg Tyr

210 215 220

Met Glu Leu Val Gly Trp Leu Val Asp Lys Gly Ile Thr Ser Glu Lys

225 230 235 240

Gln Trp Ile Gln Glu Asp Gln Ala Ser Tyr Ile Ser Phe Asn Ala Ala

245 250 255

Ser Asn Ser Arg Ser Gln Ile Lys Ala Ala Leu Asp Asn Ala Gly Lys

260 265 270

Ile Met Ser Leu Thr Lys Thr Ala Pro Asp Tyr Leu Val Gly Gln Gln

275 280 285

Pro Val Glu Asp Ile Ser Ser Asn Arg Ile Tyr Lys Ile Leu Glu Leu

290 295 300

Asn Gly Tyr Asp Pro Gln Tyr Ala Ala Ser Val Phe Leu Gly Trp Ala

305 310 315 320

Thr Lys Lys Phe Gly Lys Arg Asn Thr Ile Trp Leu Phe Gly Pro Ala

325 330 335

Thr Thr Gly Lys Thr Asn Ile Ala Glu Ala Ile Ala His Thr Val Pro

340 345 350

Phe Tyr Gly Cys Val Asn Trp Thr Asn Glu Asn Phe Pro Phe Asn Asp

355 360 365

Cys Val Asp Lys Met Val Ile Trp Trp Glu Glu Gly Lys Met Thr Ala

370 375 380

Lys Val Val Glu Ser Ala Lys Ala Ile Leu Gly Gly Ser Lys Val Arg

385 390 395 400

Val Asp Gln Lys Cys Lys Ser Ser Ala Gln Ile Asp Pro Thr Pro Val

405 410 415

Ile Val Thr Ser Asn Thr Asn Met Cys Ala Val Ile Asp Gly Asn Ser

420 425 430

Thr Thr Phe Glu His Gln Gln Pro Leu Gln Asp Arg Met Phe Lys Phe

435 440 445

Glu Leu Thr Arg Arg Leu Asp His Asp Phe Gly Lys Val Thr Lys Gln

450 455 460

Glu Val Lys Asp Phe Phe Arg Trp Ala Lys Asp His Val Val Glu Val

465 470 475 480

Glu His Glu Phe Tyr Val Lys Lys Gly Gly Ala Lys Lys Arg Pro Ala

485 490 495

Pro Ser Asp Ala Asp Ile Ser Glu Pro Lys Arg Val Arg Glu Ser Val

500 505 510

Ala Gln Pro Ser Thr Ser Asp Ala Glu Ala Ser Ile Asn Tyr Ala Asp

515 520 525

Arg Tyr Gln Asn Lys Cys Ser Arg His Val Gly Met Asn Leu Met Leu

530 535 540

Phe Pro Cys Arg Gln Cys Glu Arg Met Asn Gln Asn Ser Asn Ile Cys

545 550 555 560

Phe Thr His Gly Gln Lys Asp Cys Leu Glu Cys Phe Pro Val Ser Glu

565 570 575

Ser Gln Pro Val Ser Val Val Lys Lys Ala Tyr Gln Lys Leu Cys Tyr

580 585 590

Ile His His Ile Met Gly Lys Val Pro Asp Ala Cys Thr Ala Cys Asp

595 600 605

Leu Val Asn Val Asp Leu Asp Asp Cys Ile Phe Glu Gln

610 615 620

<210> 9

<211> 752

<212> PRT

<213> artificial sequence

<220>

<223> synthetic construct

<400> 9

Met Ala Ala Asp Gly Tyr Leu Pro Asp Trp Leu Glu Asp Asn Leu Ser

1 5 10 15

Glu Gly Ile Arg Glu Trp Trp Ala Leu Lys Pro Gly Ala Pro Gln Pro

20 25 30

Lys Ala Asn Gln Gln His Gln Asp Asn Ala Arg Gly Leu Val Leu Pro

35 40 45

Gly Tyr Lys Tyr Leu Gly Pro Gly Asn Gly Leu Asp Lys Gly Glu Pro

50 55 60

Val Asn Ala Ala Asp Ala Ala Ala Leu Glu His Asp Lys Ala Tyr Asp

65 70 75 80

Gln Gln Leu Lys Ala Gly Asp Asn Pro Tyr Leu Lys Tyr Asn His Ala

85 90 95

Asp Ala Glu Phe Gln Glu Arg Leu Lys Glu Asp Thr Ser Phe Gly Gly

100 105 110

Asn Leu Gly Arg Ala Val Phe Gln Ala Lys Lys Arg Leu Leu Glu Pro

115 120 125

Leu Gly Leu Val Glu Glu Ala Ala Lys Thr Ala Pro Gly Lys Lys Arg

130 135 140

Pro Val Glu Gln Ser Pro Gln Glu Pro Asp Ser Ser Ala Gly Ile Gly

145 150 155 160

Lys Ser Gly Ala Gln Pro Ala Lys Lys Arg Leu Asn Phe Gly Gln Thr

165 170 175

Gly Asp Thr Glu Ser Val Pro Asp Pro Gln Pro Ile Gly Glu Pro Pro

180 185 190

Ala Ala Pro Ser Gly Val Gly Ser Leu Thr Met Ala Ser Gly Gly Gly

195 200 205

Ala Pro Val Ala Asp Asn Asn Glu Gly Ala Asp Gly Val Gly Ser Ser

210 215 220

Ser Gly Asn Trp His Cys Asp Ser Gln Trp Leu Gly Asp Arg Val Ile

225 230 235 240

Thr Thr Ser Thr Arg Thr Trp Ala Leu Pro Thr Tyr Asn Asn His Leu

245 250 255

Tyr Lys Gln Ile Ser Asn Ser Thr Ser Gly Gly Ser Ser Asn Asp Asn

260 265 270

Ala Tyr Phe Gly Tyr Ser Thr Pro Trp Gly Tyr Phe Asp Phe Asn Arg

275 280 285

Phe His Cys His Phe Ser Pro Arg Asp Trp Gln Arg Leu Ile Asn Asn

290 295 300

Asn Trp Gly Phe Arg Pro Lys Arg Leu Asn Phe Lys Leu Phe Asn Ile

305 310 315 320

Gln Val Lys Glu Val Thr Asp Asn Asn Gly Val Lys Thr Ile Ala Asn

325 330 335

Asn Leu Thr Ser Thr Val Gln Val Phe Thr Asp Ser Asp Tyr Gln Leu

340 345 350

Pro Tyr Val Leu Gly Ser Ala His Glu Gly Cys Leu Pro Pro Phe Pro

355 360 365

Ala Asp Val Phe Met Ile Pro Gln Tyr Gly Tyr Leu Thr Leu Asn Asp

370 375 380

Gly Ser Gln Ala Val Gly Arg Ser Ser Phe Tyr Cys Leu Glu Tyr Phe

385 390 395 400

Pro Ser Gln Met Leu Arg Thr Gly Asn Asn Phe Gln Phe Ser Tyr Glu

405 410 415

Phe Glu Asn Val Pro Phe His Ser Ser Tyr Ala His Ser Gln Ser Leu

420 425 430

Asp Arg Leu Met Asn Pro Leu Ile Asp Gln Tyr Leu Tyr Tyr Leu Ser

435 440 445

Lys Thr Ile Asn Gly Ser Gly Gln Asn Gln Gln Thr Leu Lys Phe Ser

450 455 460

Val Ala Gly Pro Ser Asn Met Ala Val Gln Gly Arg Asn Tyr Ile Pro

465 470 475 480

Gly Pro Ser Tyr Arg Gln Gln Arg Val Ser Thr Thr Val Thr Gln Asn

485 490 495

Asn Asn Ser Glu Phe Ala Trp Pro Gly Ala Ser Ser Trp Ala Leu Asn

500 505 510

Gly Arg Asn Ser Leu Met Asn Pro Gly Pro Ala Met Ala Ser His Lys

515 520 525

Glu Gly Glu Asp Arg Phe Phe Pro Leu Ser Gly Ser Leu Ile Phe Gly

530 535 540

Lys Gln Gly Thr Gly Arg Asp Asn Val Asp Ala Asp Lys Val Met Ile

545 550 555 560

Thr Asn Glu Glu Glu Ile Lys Thr Thr Asn Pro Val Ala Thr Glu Ser

565 570 575

Tyr Gly Gln Val Ala Thr Asn His Gln Ser Ala Gln Gly Gly Ile Glu

580 585 590

Ile Asn Ala Thr Arg Ala Gly Thr Asn Leu Gly Gly Ala Gln Ala Gln

595 600 605

Thr Gly Trp Val Gln Asn Gln Gly Ile Leu Pro Gly Met Val Trp Gln

610 615 620

Asp Arg Asp Val Tyr Leu Gln Gly Pro Ile Trp Ala Lys Ile Pro His

625 630 635 640

Thr Asp Gly Asn Phe His Pro Ser Pro Leu Met Gly Gly Phe Gly Met

645 650 655

Lys His Pro Pro Pro Gln Ile Leu Ile Lys Asn Thr Pro Val Pro Ala

660 665 670

Asp Pro Pro Thr Ala Phe Asn Lys Asp Lys Leu Asn Ser Phe Ile Thr

675 680 685

Gln Tyr Ser Thr Gly Gln Val Ser Val Glu Ile Glu Trp Glu Leu Gln

690 695 700

Lys Glu Asn Ser Lys Arg Trp Asn Pro Glu Ile Gln Tyr Thr Ser Asn

705 710 715 720

Tyr Tyr Lys Ser Asn Asn Val Glu Phe Ala Val Asn Thr Glu Gly Val

725 730 735

Tyr Ser Glu Pro Arg Pro Ile Gly Thr Arg Tyr Leu Thr Arg Asn Leu

740 745 750

<210> 10

<211> 4722

<212> DNA

<213> artificial sequence

<220>

<223> AAV2/9 n.588.SAN expression cassette nucleic acid sequence

<220>

<221> misc_feature

<222> (1)..(36)

<223> truncated promoter

<220>

<221> promoter

<222> (1)..(7)

<223> p5 promoter

<220>

<221> CDS

<222> (37)..(1899)

<223> AAV2 Rep

<220>

<221> CDS

<222> (1919)..(4159)

<223> AAV9 Cap

<220>

<221> misc_feature

<222> (3683)..(3712)

<223> SAN

<220>

<221> misc_feature

<222> (4250)..(4380)

<223> p5 promoter

<220>

<221> misc_feature

<222> (4508)..(4722)

<223> LacZ promoter

<400> 10

ccattttgaa gcgggaggtt tgaacgcgca gccgcc atg ccg ggg ttt tac gag 54

Met Pro Gly Phe Tyr Glu

1 5

att gtg att aag gtc ccc agc gac ctt gac gag cat ctg ccc ggc att 102

Ile Val Ile Lys Val Pro Ser Asp Leu Asp Glu His Leu Pro Gly Ile

10 15 20

tct gac agc ttt gtg aac tgg gtg gcc gag aag gaa tgg gag ttg ccg 150

Ser Asp Ser Phe Val Asn Trp Val Ala Glu Lys Glu Trp Glu Leu Pro

25 30 35

cca gat tct gac atg gat ctg aat ctg att gag cag gca ccc ctg acc 198

Pro Asp Ser Asp Met Asp Leu Asn Leu Ile Glu Gln Ala Pro Leu Thr

40 45 50

gtg gcc gag aag ctg cag cgc gac ttt ctg acg gaa tgg cgc cgt gtg 246

Val Ala Glu Lys Leu Gln Arg Asp Phe Leu Thr Glu Trp Arg Arg Val

55 60 65 70

agt aag gcc ccg gag gct ctt ttc ttt gtg caa ttt gag aag gga gag 294

Ser Lys Ala Pro Glu Ala Leu Phe Phe Val Gln Phe Glu Lys Gly Glu

75 80 85

agc tac ttc cac atg cac gtg ctc gtg gaa acc acc ggg gtg aaa tcc 342

Ser Tyr Phe His Met His Val Leu Val Glu Thr Thr Gly Val Lys Ser

90 95 100

atg gtt ttg gga cgt ttc ctg agt cag att cgc gaa aaa ctg att cag 390

Met Val Leu Gly Arg Phe Leu Ser Gln Ile Arg Glu Lys Leu Ile Gln

105 110 115

aga att tac cgc ggg atc gag ccg act ttg cca aac tgg ttc gcg gtc 438

Arg Ile Tyr Arg Gly Ile Glu Pro Thr Leu Pro Asn Trp Phe Ala Val

120 125 130

aca aag acc aga aat ggc gcc gga ggc ggg aac aag gtg gtg gat gag 486

Thr Lys Thr Arg Asn Gly Ala Gly Gly Gly Asn Lys Val Val Asp Glu

135 140 145 150

tgc tac atc ccc aat tac ttg ctc ccc aaa acc cag cct gag ctc cag 534

Cys Tyr Ile Pro Asn Tyr Leu Leu Pro Lys Thr Gln Pro Glu Leu Gln

155 160 165

tgg gcg tgg act aat atg gaa cag tat tta agc gcc tgt ttg aat ctc 582

Trp Ala Trp Thr Asn Met Glu Gln Tyr Leu Ser Ala Cys Leu Asn Leu

170 175 180

acg gag cgt aaa cgg ttg gtg gcg cag cat ctg acg cac gtg tcg cag 630

Thr Glu Arg Lys Arg Leu Val Ala Gln His Leu Thr His Val Ser Gln

185 190 195

acg cag gag cag aac aaa gag aat cag aat ccc aat tct gat gcg ccg 678

Thr Gln Glu Gln Asn Lys Glu Asn Gln Asn Pro Asn Ser Asp Ala Pro

200 205 210

gtg atc aga tca aaa act tca gcc agg tac atg gag ctg gtc ggg tgg 726

Val Ile Arg Ser Lys Thr Ser Ala Arg Tyr Met Glu Leu Val Gly Trp

215 220 225 230

ctc gtg gac aag ggg att acc tcg gag aag cag tgg atc cag gag gac 774

Leu Val Asp Lys Gly Ile Thr Ser Glu Lys Gln Trp Ile Gln Glu Asp

235 240 245

cag gcc tca tac atc tcc ttc aat gcg gcc tcc aac tcg cgg tcc caa 822

Gln Ala Ser Tyr Ile Ser Phe Asn Ala Ala Ser Asn Ser Arg Ser Gln

250 255 260

atc aag gct gcc ttg gac aat gcg gga aag att atg agc ctg act aaa 870

Ile Lys Ala Ala Leu Asp Asn Ala Gly Lys Ile Met Ser Leu Thr Lys

265 270 275

acc gcc ccc gac tac ctg gtg ggc cag cag ccc gtg gag gac att tcc 918

Thr Ala Pro Asp Tyr Leu Val Gly Gln Gln Pro Val Glu Asp Ile Ser

280 285 290

agc aat cgg att tat aaa att ttg gaa cta aac ggg tac gat ccc caa 966

Ser Asn Arg Ile Tyr Lys Ile Leu Glu Leu Asn Gly Tyr Asp Pro Gln

295 300 305 310

tat gcg gct tcc gtc ttt ctg gga tgg gcc acg aaa aag ttc ggc aag 1014

Tyr Ala Ala Ser Val Phe Leu Gly Trp Ala Thr Lys Lys Phe Gly Lys

315 320 325

agg aac acc atc tgg ctg ttt ggg cct gca act acc ggg aag acc aac 1062

Arg Asn Thr Ile Trp Leu Phe Gly Pro Ala Thr Thr Gly Lys Thr Asn

330 335 340

atc gcg gag gcc ata gcc cac act gtg ccc ttc tac ggg tgc gta aac 1110

Ile Ala Glu Ala Ile Ala His Thr Val Pro Phe Tyr Gly Cys Val Asn

345 350 355

tgg acc aat gag aac ttt ccc ttc aac gac tgt gtc gac aag atg gtg 1158

Trp Thr Asn Glu Asn Phe Pro Phe Asn Asp Cys Val Asp Lys Met Val

360 365 370

atc tgg tgg gag gag ggg aag atg acc gcc aag gtc gtg gag tcg gcc 1206

Ile Trp Trp Glu Glu Gly Lys Met Thr Ala Lys Val Val Glu Ser Ala

375 380 385 390

aaa gcc att ctc gga gga agc aag gtg cgc gtg gac cag aaa tgc aag 1254

Lys Ala Ile Leu Gly Gly Ser Lys Val Arg Val Asp Gln Lys Cys Lys

395 400 405

tcc tcg gcc cag ata gac ccg act ccc gtg atc gtc acc tcc aac acc 1302

Ser Ser Ala Gln Ile Asp Pro Thr Pro Val Ile Val Thr Ser Asn Thr

410 415 420

aac atg tgc gcc gtg att gac ggg aac tca acg acc ttc gaa cac cag 1350

Asn Met Cys Ala Val Ile Asp Gly Asn Ser Thr Thr Phe Glu His Gln

425 430 435

cag ccg ttg caa gac cgg atg ttc aaa ttt gaa ctc acc cgc cgt ctg 1398

Gln Pro Leu Gln Asp Arg Met Phe Lys Phe Glu Leu Thr Arg Arg Leu

440 445 450

gat cat gac ttt ggg aag gtc acc aag cag gaa gtc aaa gac ttt ttc 1446

Asp His Asp Phe Gly Lys Val Thr Lys Gln Glu Val Lys Asp Phe Phe

455 460 465 470

cgg tgg gca aag gat cac gtg gtt gag gtg gag cat gaa ttc tac gtc 1494

Arg Trp Ala Lys Asp His Val Val Glu Val Glu His Glu Phe Tyr Val

475 480 485

aaa aag ggt gga gcc aag aaa aga ccc gcc ccc agt gac gca gat ata 1542

Lys Lys Gly Gly Ala Lys Lys Arg Pro Ala Pro Ser Asp Ala Asp Ile

490 495 500

agt gag ccc aaa cgg gtg cgc gag tca gtt gcg cag cca tcg acg tca 1590

Ser Glu Pro Lys Arg Val Arg Glu Ser Val Ala Gln Pro Ser Thr Ser

505 510 515

gac gcg gaa gct tcg atc aac tac gca gac agg tac caa aac aaa tgt 1638

Asp Ala Glu Ala Ser Ile Asn Tyr Ala Asp Arg Tyr Gln Asn Lys Cys

520 525 530

tct cgt cac gtg ggc atg aat ctg atg ctg ttt ccc tgc aga caa tgc 1686

Ser Arg His Val Gly Met Asn Leu Met Leu Phe Pro Cys Arg Gln Cys

535 540 545 550

gag aga atg aat cag aat tca aat atc tgc ttc act cac gga cag aaa 1734

Glu Arg Met Asn Gln Asn Ser Asn Ile Cys Phe Thr His Gly Gln Lys

555 560 565

gac tgt tta gag tgc ttt ccc gtg tca gaa tct caa ccc gtt tct gtc 1782

Asp Cys Leu Glu Cys Phe Pro Val Ser Glu Ser Gln Pro Val Ser Val

570 575 580

gtc aaa aag gcg tat cag aaa ctg tgc tac att cat cat atc atg gga 1830

Val Lys Lys Ala Tyr Gln Lys Leu Cys Tyr Ile His His Ile Met Gly

585 590 595

aag gtg cca gac gct tgc act gcc tgc gat ctg gtc aat gtg gat ttg 1878

Lys Val Pro Asp Ala Cys Thr Ala Cys Asp Leu Val Asn Val Asp Leu

600 605 610

gat gac tgc atc ttt gaa caa taaatgattt aaatcaggt atg gct gcc gat 1930

Asp Asp Cys Ile Phe Glu Gln Met Ala Ala Asp

615 620 625

ggt tat ctt cca gat tgg ctc gag gac aac ctt agt gaa gga att cgc 1978

Gly Tyr Leu Pro Asp Trp Leu Glu Asp Asn Leu Ser Glu Gly Ile Arg

630 635 640

gag tgg tgg gct ttg aaa cct gga gcc cct caa ccc aag gca aat caa 2026

Glu Trp Trp Ala Leu Lys Pro Gly Ala Pro Gln Pro Lys Ala Asn Gln

645 650 655

caa cat caa gac aac gct cga ggt ctt gtg ctt ccg ggt tac aaa tac 2074

Gln His Gln Asp Asn Ala Arg Gly Leu Val Leu Pro Gly Tyr Lys Tyr

660 665 670

ctt gga ccc ggc aac gga ctc gac aag ggg gag ccg gtc aac gca gca 2122

Leu Gly Pro Gly Asn Gly Leu Asp Lys Gly Glu Pro Val Asn Ala Ala

675 680 685

gac gcg gcg gcc ctc gag cac gac aag gcc tac gac cag cag ctc aag 2170

Asp Ala Ala Ala Leu Glu His Asp Lys Ala Tyr Asp Gln Gln Leu Lys

690 695 700 705

gcc gga gac aac ccg tac ctc aag tac aac cac gcc gac gcc gag ttc 2218

Ala Gly Asp Asn Pro Tyr Leu Lys Tyr Asn His Ala Asp Ala Glu Phe

710 715 720

cag gag cgg ctc aaa gaa gat acg tct ttt ggg ggc aac ctc ggg cga 2266

Gln Glu Arg Leu Lys Glu Asp Thr Ser Phe Gly Gly Asn Leu Gly Arg

725 730 735

gca gtc ttc cag gcc aaa aag agg ctt ctt gaa cct ctt ggt ctg gtt 2314

Ala Val Phe Gln Ala Lys Lys Arg Leu Leu Glu Pro Leu Gly Leu Val

740 745 750

gag gaa gcg gct aag acg gct cct gga aag aag agg cct gta gag cag 2362

Glu Glu Ala Ala Lys Thr Ala Pro Gly Lys Lys Arg Pro Val Glu Gln

755 760 765

tct cct cag gaa ccg gac tcc tcc gcg ggt att ggc aaa tcg ggt gca 2410

Ser Pro Gln Glu Pro Asp Ser Ser Ala Gly Ile Gly Lys Ser Gly Ala

770 775 780 785

cag ccc gct aaa aag aga ctc aat ttc ggt cag act ggc gac aca gag 2458

Gln Pro Ala Lys Lys Arg Leu Asn Phe Gly Gln Thr Gly Asp Thr Glu

790 795 800

tca gtc cca gac cct caa cca atc gga gaa cct ccc gca gcc ccc tca 2506

Ser Val Pro Asp Pro Gln Pro Ile Gly Glu Pro Pro Ala Ala Pro Ser

805 810 815

ggt gtg gga tct ctt aca atg gct tca ggt ggt ggc gca cca gtg gca 2554

Gly Val Gly Ser Leu Thr Met Ala Ser Gly Gly Gly Ala Pro Val Ala

820 825 830

gac aat aac gaa ggt gcc gat gga gtg ggt agt tcc tcg gga aat tgg 2602

Asp Asn Asn Glu Gly Ala Asp Gly Val Gly Ser Ser Ser Gly Asn Trp

835 840 845

cat tgc gat tcc caa tgg ctg ggg gac aga gtc atc acc acc agc acc 2650

His Cys Asp Ser Gln Trp Leu Gly Asp Arg Val Ile Thr Thr Ser Thr

850 855 860 865

cga acc tgg gcc ctg ccc acc tac aac aat cac ctc tac aag caa atc 2698

Arg Thr Trp Ala Leu Pro Thr Tyr Asn Asn His Leu Tyr Lys Gln Ile

870 875 880

tcc aac agc aca tct gga gga tct tca aat gac aac gcc tac ttc ggc 2746

Ser Asn Ser Thr Ser Gly Gly Ser Ser Asn Asp Asn Ala Tyr Phe Gly

885 890 895

tac agc acc ccc tgg ggg tat ttt gac ttc aac aga ttc cac tgc cac 2794

Tyr Ser Thr Pro Trp Gly Tyr Phe Asp Phe Asn Arg Phe His Cys His

900 905 910

ttc tca cca cgt gac tgg cag cga ctc atc aac aac aac tgg gga ttc 2842

Phe Ser Pro Arg Asp Trp Gln Arg Leu Ile Asn Asn Asn Trp Gly Phe

915 920 925

cgg cct aag cga ctc aac ttc aag ctc ttc aac att cag gtc aaa gag 2890

Arg Pro Lys Arg Leu Asn Phe Lys Leu Phe Asn Ile Gln Val Lys Glu

930 935 940 945

gtt acg gac aac aat gga gtc aag acc atc gcc aat aac ctt acc agc 2938

Val Thr Asp Asn Asn Gly Val Lys Thr Ile Ala Asn Asn Leu Thr Ser

950 955 960

acg gtc cag gtc ttc acg gac tca gac tat cag ctc ccg tac gtg ctc 2986

Thr Val Gln Val Phe Thr Asp Ser Asp Tyr Gln Leu Pro Tyr Val Leu

965 970 975

ggg tcg gct cac gag ggc tgc ctc ccg ccg ttc cca gcg gac gtt ttc 3034

Gly Ser Ala His Glu Gly Cys Leu Pro Pro Phe Pro Ala Asp Val Phe

980 985 990

atg att cct cag tac ggg tat ctg acg ctt aat gat gga agc cag gcc 3082

Met Ile Pro Gln Tyr Gly Tyr Leu Thr Leu Asn Asp Gly Ser Gln Ala

995 1000 1005

gtg ggt cgt tcg tcc ttt tac tgc ctg gaa tat ttc ccg tcg caa 3127

Val Gly Arg Ser Ser Phe Tyr Cys Leu Glu Tyr Phe Pro Ser Gln

1010 1015 1020

atg cta aga acg ggt aac aac ttc cag ttc agc tac gag ttt gag 3172

Met Leu Arg Thr Gly Asn Asn Phe Gln Phe Ser Tyr Glu Phe Glu

1025 1030 1035

aac gta cct ttc cat agc agc tac gct cac agc caa agc ctg gac 3217

Asn Val Pro Phe His Ser Ser Tyr Ala His Ser Gln Ser Leu Asp

1040 1045 1050

cga cta atg aat cca ctc atc gac caa tac ttg tac tat ctc tca 3262

Arg Leu Met Asn Pro Leu Ile Asp Gln Tyr Leu Tyr Tyr Leu Ser

1055 1060 1065

aag act att aac ggt tct gga cag aat caa caa acg cta aaa ttc 3307

Lys Thr Ile Asn Gly Ser Gly Gln Asn Gln Gln Thr Leu Lys Phe

1070 1075 1080

agt gtg gcc gga ccc agc aac atg gct gtc cag gga aga aac tac 3352

Ser Val Ala Gly Pro Ser Asn Met Ala Val Gln Gly Arg Asn Tyr

1085 1090 1095

ata cct gga ccc agc tac cga caa caa cgt gtc tca acc act gtg 3397

Ile Pro Gly Pro Ser Tyr Arg Gln Gln Arg Val Ser Thr Thr Val

1100 1105 1110

act caa aac aac aac agc gaa ttt gct tgg cct gga gct tct tct 3442

Thr Gln Asn Asn Asn Ser Glu Phe Ala Trp Pro Gly Ala Ser Ser

1115 1120 1125

tgg gct ctc aat gga cgt aat agc ttg atg aat cct gga cct gct 3487

Trp Ala Leu Asn Gly Arg Asn Ser Leu Met Asn Pro Gly Pro Ala

1130 1135 1140

atg gcc agc cac aaa gaa gga gag gac cgt ttc ttt cct ttg tct 3532

Met Ala Ser His Lys Glu Gly Glu Asp Arg Phe Phe Pro Leu Ser

1145 1150 1155

gga tct tta att ttt ggc aaa caa gga act gga aga gac aac gtg 3577

Gly Ser Leu Ile Phe Gly Lys Gln Gly Thr Gly Arg Asp Asn Val

1160 1165 1170

gat gcg gac aaa gtc atg ata acc aac gaa gaa gaa att aaa act 3622

Asp Ala Asp Lys Val Met Ile Thr Asn Glu Glu Glu Ile Lys Thr

1175 1180 1185

act aac ccg gta gca acg gag tcc tat gga caa gtg gcc aca aac 3667

Thr Asn Pro Val Ala Thr Glu Ser Tyr Gly Gln Val Ala Thr Asn

1190 1195 1200

cac cag agt gcc caa agc gct aac ttc atc aag cct acc agc tac 3712

His Gln Ser Ala Gln Ser Ala Asn Phe Ile Lys Pro Thr Ser Tyr

1205 1210 1215

gca cag gcg cag acc ggc tgg gtt caa aac caa gga ata ctt ccg 3757

Ala Gln Ala Gln Thr Gly Trp Val Gln Asn Gln Gly Ile Leu Pro

1220 1225 1230

ggt atg gtt tgg cag gac aga gat gtg tac ctg caa gga ccc att 3802

Gly Met Val Trp Gln Asp Arg Asp Val Tyr Leu Gln Gly Pro Ile

1235 1240 1245

tgg gcc aaa att cct cac acg gac ggc aac ttt cac cct tct ccg 3847

Trp Ala Lys Ile Pro His Thr Asp Gly Asn Phe His Pro Ser Pro

1250 1255 1260

ctg atg gga ggg ttt gga atg aag cac ccg cct cct cag atc ctc 3892

Leu Met Gly Gly Phe Gly Met Lys His Pro Pro Pro Gln Ile Leu

1265 1270 1275

atc aaa aac aca cct gta cct gcg gat cct cca acg gcc ttc aac 3937

Ile Lys Asn Thr Pro Val Pro Ala Asp Pro Pro Thr Ala Phe Asn

1280 1285 1290

aag gac aag ctg aac tct ttc atc acc cag tat tct act ggc caa 3982

Lys Asp Lys Leu Asn Ser Phe Ile Thr Gln Tyr Ser Thr Gly Gln

1295 1300 1305

gtc agc gtg gag atc gag tgg gag ctg cag aag gaa aac agc aag 4027

Val Ser Val Glu Ile Glu Trp Glu Leu Gln Lys Glu Asn Ser Lys

1310 1315 1320

cgc tgg aac ccg gag atc cag tac act tcc aac tat tac aag tct 4072

Arg Trp Asn Pro Glu Ile Gln Tyr Thr Ser Asn Tyr Tyr Lys Ser

1325 1330 1335

aat aat gtt gaa ttt gct gtt aat act gaa ggt gta tat agt gaa 4117

Asn Asn Val Glu Phe Ala Val Asn Thr Glu Gly Val Tyr Ser Glu

1340 1345 1350

ccc cgc ccc att ggc acc aga tac ctg act cgt aat ctg taa 4159

Pro Arg Pro Ile Gly Thr Arg Tyr Leu Thr Arg Asn Leu

1355 1360 1365

ttgcttgtta atcaataaac cgtttaattc gtttcagttg aactttggtc tctgcgaagg 4219

gcgaattcgt ttaaacctgc aggactagag gtcctgtatt agaggtcacg tgagtgtttt 4279

gcgacatttt gcgacaccat gtggtcacgc tgggtattta agcccgagtg agcacgcagg 4339

gtctccattt tgaagcggga ggtttgaacg cgcagccgcc aagccgaatt ctgcagatat 4399

ccatcacact ggcggccgct cgactagagc ggccgccacc gcggtggagc tccagctttt 4459

gttcccttta gtgagggtta attgcgcgct tggcgtaatc atggtcatag ctgtttcctg 4519

tgtgaaattg ttatccgctc acaattccac acaacatacg agccggaagc ataaagtgta 4579

aagcctgggg tgcctaatga gtgagctaac tcacattaat tgcgttgcgc tcactgcccg 4639

ctttccagtc gggaaacctg tcgtgccagc tgcattaatg aatcggccaa cgcgcgggga 4699

gaggcggttt gcgtattggg cgc 4722

<210> 11

<211> 621

<212> PRT

<213> artificial sequence

<220>

<223> synthetic construct

<400> 11

Met Pro Gly Phe Tyr Glu Ile Val Ile Lys Val Pro Ser Asp Leu Asp

1 5 10 15

Glu His Leu Pro Gly Ile Ser Asp Ser Phe Val Asn Trp Val Ala Glu

20 25 30

Lys Glu Trp Glu Leu Pro Pro Asp Ser Asp Met Asp Leu Asn Leu Ile

35 40 45

Glu Gln Ala Pro Leu Thr Val Ala Glu Lys Leu Gln Arg Asp Phe Leu

50 55 60

Thr Glu Trp Arg Arg Val Ser Lys Ala Pro Glu Ala Leu Phe Phe Val

65 70 75 80

Gln Phe Glu Lys Gly Glu Ser Tyr Phe His Met His Val Leu Val Glu

85 90 95

Thr Thr Gly Val Lys Ser Met Val Leu Gly Arg Phe Leu Ser Gln Ile

100 105 110

Arg Glu Lys Leu Ile Gln Arg Ile Tyr Arg Gly Ile Glu Pro Thr Leu

115 120 125

Pro Asn Trp Phe Ala Val Thr Lys Thr Arg Asn Gly Ala Gly Gly Gly

130 135 140

Asn Lys Val Val Asp Glu Cys Tyr Ile Pro Asn Tyr Leu Leu Pro Lys

145 150 155 160

Thr Gln Pro Glu Leu Gln Trp Ala Trp Thr Asn Met Glu Gln Tyr Leu

165 170 175

Ser Ala Cys Leu Asn Leu Thr Glu Arg Lys Arg Leu Val Ala Gln His

180 185 190

Leu Thr His Val Ser Gln Thr Gln Glu Gln Asn Lys Glu Asn Gln Asn

195 200 205

Pro Asn Ser Asp Ala Pro Val Ile Arg Ser Lys Thr Ser Ala Arg Tyr

210 215 220

Met Glu Leu Val Gly Trp Leu Val Asp Lys Gly Ile Thr Ser Glu Lys

225 230 235 240

Gln Trp Ile Gln Glu Asp Gln Ala Ser Tyr Ile Ser Phe Asn Ala Ala

245 250 255

Ser Asn Ser Arg Ser Gln Ile Lys Ala Ala Leu Asp Asn Ala Gly Lys

260 265 270

Ile Met Ser Leu Thr Lys Thr Ala Pro Asp Tyr Leu Val Gly Gln Gln

275 280 285

Pro Val Glu Asp Ile Ser Ser Asn Arg Ile Tyr Lys Ile Leu Glu Leu

290 295 300

Asn Gly Tyr Asp Pro Gln Tyr Ala Ala Ser Val Phe Leu Gly Trp Ala

305 310 315 320

Thr Lys Lys Phe Gly Lys Arg Asn Thr Ile Trp Leu Phe Gly Pro Ala

325 330 335

Thr Thr Gly Lys Thr Asn Ile Ala Glu Ala Ile Ala His Thr Val Pro

340 345 350

Phe Tyr Gly Cys Val Asn Trp Thr Asn Glu Asn Phe Pro Phe Asn Asp

355 360 365

Cys Val Asp Lys Met Val Ile Trp Trp Glu Glu Gly Lys Met Thr Ala

370 375 380

Lys Val Val Glu Ser Ala Lys Ala Ile Leu Gly Gly Ser Lys Val Arg

385 390 395 400

Val Asp Gln Lys Cys Lys Ser Ser Ala Gln Ile Asp Pro Thr Pro Val

405 410 415

Ile Val Thr Ser Asn Thr Asn Met Cys Ala Val Ile Asp Gly Asn Ser

420 425 430

Thr Thr Phe Glu His Gln Gln Pro Leu Gln Asp Arg Met Phe Lys Phe

435 440 445

Glu Leu Thr Arg Arg Leu Asp His Asp Phe Gly Lys Val Thr Lys Gln

450 455 460

Glu Val Lys Asp Phe Phe Arg Trp Ala Lys Asp His Val Val Glu Val

465 470 475 480

Glu His Glu Phe Tyr Val Lys Lys Gly Gly Ala Lys Lys Arg Pro Ala

485 490 495

Pro Ser Asp Ala Asp Ile Ser Glu Pro Lys Arg Val Arg Glu Ser Val

500 505 510

Ala Gln Pro Ser Thr Ser Asp Ala Glu Ala Ser Ile Asn Tyr Ala Asp

515 520 525

Arg Tyr Gln Asn Lys Cys Ser Arg His Val Gly Met Asn Leu Met Leu

530 535 540

Phe Pro Cys Arg Gln Cys Glu Arg Met Asn Gln Asn Ser Asn Ile Cys

545 550 555 560

Phe Thr His Gly Gln Lys Asp Cys Leu Glu Cys Phe Pro Val Ser Glu

565 570 575

Ser Gln Pro Val Ser Val Val Lys Lys Ala Tyr Gln Lys Leu Cys Tyr

580 585 590

Ile His His Ile Met Gly Lys Val Pro Asp Ala Cys Thr Ala Cys Asp

595 600 605

Leu Val Asn Val Asp Leu Asp Asp Cys Ile Phe Glu Gln

610 615 620

<210> 12

<211> 746

<212> PRT

<213> artificial sequence

<220>

<223> synthetic construct

<400> 12

Met Ala Ala Asp Gly Tyr Leu Pro Asp Trp Leu Glu Asp Asn Leu Ser

1 5 10 15

Glu Gly Ile Arg Glu Trp Trp Ala Leu Lys Pro Gly Ala Pro Gln Pro

20 25 30

Lys Ala Asn Gln Gln His Gln Asp Asn Ala Arg Gly Leu Val Leu Pro

35 40 45

Gly Tyr Lys Tyr Leu Gly Pro Gly Asn Gly Leu Asp Lys Gly Glu Pro

50 55 60

Val Asn Ala Ala Asp Ala Ala Ala Leu Glu His Asp Lys Ala Tyr Asp

65 70 75 80

Gln Gln Leu Lys Ala Gly Asp Asn Pro Tyr Leu Lys Tyr Asn His Ala

85 90 95

Asp Ala Glu Phe Gln Glu Arg Leu Lys Glu Asp Thr Ser Phe Gly Gly

100 105 110

Asn Leu Gly Arg Ala Val Phe Gln Ala Lys Lys Arg Leu Leu Glu Pro

115 120 125

Leu Gly Leu Val Glu Glu Ala Ala Lys Thr Ala Pro Gly Lys Lys Arg

130 135 140

Pro Val Glu Gln Ser Pro Gln Glu Pro Asp Ser Ser Ala Gly Ile Gly

145 150 155 160

Lys Ser Gly Ala Gln Pro Ala Lys Lys Arg Leu Asn Phe Gly Gln Thr

165 170 175

Gly Asp Thr Glu Ser Val Pro Asp Pro Gln Pro Ile Gly Glu Pro Pro

180 185 190

Ala Ala Pro Ser Gly Val Gly Ser Leu Thr Met Ala Ser Gly Gly Gly

195 200 205

Ala Pro Val Ala Asp Asn Asn Glu Gly Ala Asp Gly Val Gly Ser Ser

210 215 220

Ser Gly Asn Trp His Cys Asp Ser Gln Trp Leu Gly Asp Arg Val Ile

225 230 235 240

Thr Thr Ser Thr Arg Thr Trp Ala Leu Pro Thr Tyr Asn Asn His Leu

245 250 255

Tyr Lys Gln Ile Ser Asn Ser Thr Ser Gly Gly Ser Ser Asn Asp Asn

260 265 270

Ala Tyr Phe Gly Tyr Ser Thr Pro Trp Gly Tyr Phe Asp Phe Asn Arg

275 280 285

Phe His Cys His Phe Ser Pro Arg Asp Trp Gln Arg Leu Ile Asn Asn

290 295 300

Asn Trp Gly Phe Arg Pro Lys Arg Leu Asn Phe Lys Leu Phe Asn Ile

305 310 315 320

Gln Val Lys Glu Val Thr Asp Asn Asn Gly Val Lys Thr Ile Ala Asn

325 330 335

Asn Leu Thr Ser Thr Val Gln Val Phe Thr Asp Ser Asp Tyr Gln Leu

340 345 350

Pro Tyr Val Leu Gly Ser Ala His Glu Gly Cys Leu Pro Pro Phe Pro

355 360 365

Ala Asp Val Phe Met Ile Pro Gln Tyr Gly Tyr Leu Thr Leu Asn Asp

370 375 380

Gly Ser Gln Ala Val Gly Arg Ser Ser Phe Tyr Cys Leu Glu Tyr Phe

385 390 395 400

Pro Ser Gln Met Leu Arg Thr Gly Asn Asn Phe Gln Phe Ser Tyr Glu

405 410 415

Phe Glu Asn Val Pro Phe His Ser Ser Tyr Ala His Ser Gln Ser Leu

420 425 430

Asp Arg Leu Met Asn Pro Leu Ile Asp Gln Tyr Leu Tyr Tyr Leu Ser

435 440 445

Lys Thr Ile Asn Gly Ser Gly Gln Asn Gln Gln Thr Leu Lys Phe Ser

450 455 460

Val Ala Gly Pro Ser Asn Met Ala Val Gln Gly Arg Asn Tyr Ile Pro

465 470 475 480

Gly Pro Ser Tyr Arg Gln Gln Arg Val Ser Thr Thr Val Thr Gln Asn

485 490 495

Asn Asn Ser Glu Phe Ala Trp Pro Gly Ala Ser Ser Trp Ala Leu Asn

500 505 510

Gly Arg Asn Ser Leu Met Asn Pro Gly Pro Ala Met Ala Ser His Lys

515 520 525

Glu Gly Glu Asp Arg Phe Phe Pro Leu Ser Gly Ser Leu Ile Phe Gly

530 535 540

Lys Gln Gly Thr Gly Arg Asp Asn Val Asp Ala Asp Lys Val Met Ile

545 550 555 560

Thr Asn Glu Glu Glu Ile Lys Thr Thr Asn Pro Val Ala Thr Glu Ser

565 570 575

Tyr Gly Gln Val Ala Thr Asn His Gln Ser Ala Gln Ser Ala Asn Phe

580 585 590

Ile Lys Pro Thr Ser Tyr Ala Gln Ala Gln Thr Gly Trp Val Gln Asn

595 600 605

Gln Gly Ile Leu Pro Gly Met Val Trp Gln Asp Arg Asp Val Tyr Leu

610 615 620

Gln Gly Pro Ile Trp Ala Lys Ile Pro His Thr Asp Gly Asn Phe His

625 630 635 640

Pro Ser Pro Leu Met Gly Gly Phe Gly Met Lys His Pro Pro Pro Gln

645 650 655

Ile Leu Ile Lys Asn Thr Pro Val Pro Ala Asp Pro Pro Thr Ala Phe

660 665 670

Asn Lys Asp Lys Leu Asn Ser Phe Ile Thr Gln Tyr Ser Thr Gly Gln

675 680 685

Val Ser Val Glu Ile Glu Trp Glu Leu Gln Lys Glu Asn Ser Lys Arg

690 695 700

Trp Asn Pro Glu Ile Gln Tyr Thr Ser Asn Tyr Tyr Lys Ser Asn Asn

705 710 715 720

Val Glu Phe Ala Val Asn Thr Glu Gly Val Tyr Ser Glu Pro Arg Pro

725 730 735

Ile Gly Thr Arg Tyr Leu Thr Arg Asn Leu

740 745

<210> 13

<211> 4725

<212> DNA

<213> artificial sequence

<220>

<223> nucleic acid sequence of the AAV2/9 n.588.SSN expression cassette

<220>

<221> misc_feature

<222> (1)..(36)

<223> truncated promoter

<220>

<221> promoter

<222> (1)..(7)

<223> p5 promoter

<220>

<221> CDS

<222> (37)..(1899)

<223> AAV2 Rep

<220>

<221> CDS

<222> (1919)..(4162)

<223> AAV9 Cap

<220>

<221> misc_feature

<222> (3683)..(3715)

<223> SSN

<220>

<221> misc_feature

<222> (4253)..(4383)

<223> p5 promoter

<220>

<221> misc_feature

<222> (4511)..(4725)

<223> LacZ promoter

<400> 13

ccattttgaa gcgggaggtt tgaacgcgca gccgcc atg ccg ggg ttt tac gag 54

Met Pro Gly Phe Tyr Glu

1 5

att gtg att aag gtc ccc agc gac ctt gac gag cat ctg ccc ggc att 102

Ile Val Ile Lys Val Pro Ser Asp Leu Asp Glu His Leu Pro Gly Ile

10 15 20

tct gac agc ttt gtg aac tgg gtg gcc gag aag gaa tgg gag ttg ccg 150

Ser Asp Ser Phe Val Asn Trp Val Ala Glu Lys Glu Trp Glu Leu Pro

25 30 35

cca gat tct gac atg gat ctg aat ctg att gag cag gca ccc ctg acc 198

Pro Asp Ser Asp Met Asp Leu Asn Leu Ile Glu Gln Ala Pro Leu Thr

40 45 50

gtg gcc gag aag ctg cag cgc gac ttt ctg acg gaa tgg cgc cgt gtg 246

Val Ala Glu Lys Leu Gln Arg Asp Phe Leu Thr Glu Trp Arg Arg Val

55 60 65 70

agt aag gcc ccg gag gct ctt ttc ttt gtg caa ttt gag aag gga gag 294

Ser Lys Ala Pro Glu Ala Leu Phe Phe Val Gln Phe Glu Lys Gly Glu

75 80 85

agc tac ttc cac atg cac gtg ctc gtg gaa acc acc ggg gtg aaa tcc 342

Ser Tyr Phe His Met His Val Leu Val Glu Thr Thr Gly Val Lys Ser

90 95 100

atg gtt ttg gga cgt ttc ctg agt cag att cgc gaa aaa ctg att cag 390

Met Val Leu Gly Arg Phe Leu Ser Gln Ile Arg Glu Lys Leu Ile Gln

105 110 115

aga att tac cgc ggg atc gag ccg act ttg cca aac tgg ttc gcg gtc 438

Arg Ile Tyr Arg Gly Ile Glu Pro Thr Leu Pro Asn Trp Phe Ala Val

120 125 130

aca aag acc aga aat ggc gcc gga ggc ggg aac aag gtg gtg gat gag 486

Thr Lys Thr Arg Asn Gly Ala Gly Gly Gly Asn Lys Val Val Asp Glu

135 140 145 150

tgc tac atc ccc aat tac ttg ctc ccc aaa acc cag cct gag ctc cag 534

Cys Tyr Ile Pro Asn Tyr Leu Leu Pro Lys Thr Gln Pro Glu Leu Gln

155 160 165

tgg gcg tgg act aat atg gaa cag tat tta agc gcc tgt ttg aat ctc 582

Trp Ala Trp Thr Asn Met Glu Gln Tyr Leu Ser Ala Cys Leu Asn Leu

170 175 180

acg gag cgt aaa cgg ttg gtg gcg cag cat ctg acg cac gtg tcg cag 630

Thr Glu Arg Lys Arg Leu Val Ala Gln His Leu Thr His Val Ser Gln

185 190 195

acg cag gag cag aac aaa gag aat cag aat ccc aat tct gat gcg ccg 678

Thr Gln Glu Gln Asn Lys Glu Asn Gln Asn Pro Asn Ser Asp Ala Pro

200 205 210

gtg atc aga tca aaa act tca gcc agg tac atg gag ctg gtc ggg tgg 726

Val Ile Arg Ser Lys Thr Ser Ala Arg Tyr Met Glu Leu Val Gly Trp

215 220 225 230

ctc gtg gac aag ggg att acc tcg gag aag cag tgg atc cag gag gac 774

Leu Val Asp Lys Gly Ile Thr Ser Glu Lys Gln Trp Ile Gln Glu Asp

235 240 245

cag gcc tca tac atc tcc ttc aat gcg gcc tcc aac tcg cgg tcc caa 822

Gln Ala Ser Tyr Ile Ser Phe Asn Ala Ala Ser Asn Ser Arg Ser Gln

250 255 260

atc aag gct gcc ttg gac aat gcg gga aag att atg agc ctg act aaa 870

Ile Lys Ala Ala Leu Asp Asn Ala Gly Lys Ile Met Ser Leu Thr Lys

265 270 275

acc gcc ccc gac tac ctg gtg ggc cag cag ccc gtg gag gac att tcc 918

Thr Ala Pro Asp Tyr Leu Val Gly Gln Gln Pro Val Glu Asp Ile Ser

280 285 290

agc aat cgg att tat aaa att ttg gaa cta aac ggg tac gat ccc caa 966

Ser Asn Arg Ile Tyr Lys Ile Leu Glu Leu Asn Gly Tyr Asp Pro Gln

295 300 305 310

tat gcg gct tcc gtc ttt ctg gga tgg gcc acg aaa aag ttc ggc aag 1014

Tyr Ala Ala Ser Val Phe Leu Gly Trp Ala Thr Lys Lys Phe Gly Lys

315 320 325

agg aac acc atc tgg ctg ttt ggg cct gca act acc ggg aag acc aac 1062

Arg Asn Thr Ile Trp Leu Phe Gly Pro Ala Thr Thr Gly Lys Thr Asn

330 335 340

atc gcg gag gcc ata gcc cac act gtg ccc ttc tac ggg tgc gta aac 1110

Ile Ala Glu Ala Ile Ala His Thr Val Pro Phe Tyr Gly Cys Val Asn

345 350 355

tgg acc aat gag aac ttt ccc ttc aac gac tgt gtc gac aag atg gtg 1158

Trp Thr Asn Glu Asn Phe Pro Phe Asn Asp Cys Val Asp Lys Met Val

360 365 370

atc tgg tgg gag gag ggg aag atg acc gcc aag gtc gtg gag tcg gcc 1206

Ile Trp Trp Glu Glu Gly Lys Met Thr Ala Lys Val Val Glu Ser Ala

375 380 385 390

aaa gcc att ctc gga gga agc aag gtg cgc gtg gac cag aaa tgc aag 1254

Lys Ala Ile Leu Gly Gly Ser Lys Val Arg Val Asp Gln Lys Cys Lys

395 400 405

tcc tcg gcc cag ata gac ccg act ccc gtg atc gtc acc tcc aac acc 1302

Ser Ser Ala Gln Ile Asp Pro Thr Pro Val Ile Val Thr Ser Asn Thr

410 415 420

aac atg tgc gcc gtg att gac ggg aac tca acg acc ttc gaa cac cag 1350

Asn Met Cys Ala Val Ile Asp Gly Asn Ser Thr Thr Phe Glu His Gln

425 430 435

cag ccg ttg caa gac cgg atg ttc aaa ttt gaa ctc acc cgc cgt ctg 1398

Gln Pro Leu Gln Asp Arg Met Phe Lys Phe Glu Leu Thr Arg Arg Leu

440 445 450

gat cat gac ttt ggg aag gtc acc aag cag gaa gtc aaa gac ttt ttc 1446

Asp His Asp Phe Gly Lys Val Thr Lys Gln Glu Val Lys Asp Phe Phe

455 460 465 470

cgg tgg gca aag gat cac gtg gtt gag gtg gag cat gaa ttc tac gtc 1494

Arg Trp Ala Lys Asp His Val Val Glu Val Glu His Glu Phe Tyr Val

475 480 485

aaa aag ggt gga gcc aag aaa aga ccc gcc ccc agt gac gca gat ata 1542

Lys Lys Gly Gly Ala Lys Lys Arg Pro Ala Pro Ser Asp Ala Asp Ile

490 495 500

agt gag ccc aaa cgg gtg cgc gag tca gtt gcg cag cca tcg acg tca 1590

Ser Glu Pro Lys Arg Val Arg Glu Ser Val Ala Gln Pro Ser Thr Ser

505 510 515

gac gcg gaa gct tcg atc aac tac gca gac agg tac caa aac aaa tgt 1638

Asp Ala Glu Ala Ser Ile Asn Tyr Ala Asp Arg Tyr Gln Asn Lys Cys

520 525 530

tct cgt cac gtg ggc atg aat ctg atg ctg ttt ccc tgc aga caa tgc 1686

Ser Arg His Val Gly Met Asn Leu Met Leu Phe Pro Cys Arg Gln Cys

535 540 545 550

gag aga atg aat cag aat tca aat atc tgc ttc act cac gga cag aaa 1734

Glu Arg Met Asn Gln Asn Ser Asn Ile Cys Phe Thr His Gly Gln Lys

555 560 565

gac tgt tta gag tgc ttt ccc gtg tca gaa tct caa ccc gtt tct gtc 1782

Asp Cys Leu Glu Cys Phe Pro Val Ser Glu Ser Gln Pro Val Ser Val

570 575 580

gtc aaa aag gcg tat cag aaa ctg tgc tac att cat cat atc atg gga 1830

Val Lys Lys Ala Tyr Gln Lys Leu Cys Tyr Ile His His Ile Met Gly

585 590 595

aag gtg cca gac gct tgc act gcc tgc gat ctg gtc aat gtg gat ttg 1878

Lys Val Pro Asp Ala Cys Thr Ala Cys Asp Leu Val Asn Val Asp Leu

600 605 610

gat gac tgc atc ttt gaa caa taaatgattt aaatcaggt atg gct gcc gat 1930

Asp Asp Cys Ile Phe Glu Gln Met Ala Ala Asp

615 620 625

ggt tat ctt cca gat tgg ctc gag gac aac ctt agt gaa gga att cgc 1978

Gly Tyr Leu Pro Asp Trp Leu Glu Asp Asn Leu Ser Glu Gly Ile Arg

630 635 640

gag tgg tgg gct ttg aaa cct gga gcc cct caa ccc aag gca aat caa 2026

Glu Trp Trp Ala Leu Lys Pro Gly Ala Pro Gln Pro Lys Ala Asn Gln

645 650 655

caa cat caa gac aac gct cga ggt ctt gtg ctt ccg ggt tac aaa tac 2074

Gln His Gln Asp Asn Ala Arg Gly Leu Val Leu Pro Gly Tyr Lys Tyr

660 665 670

ctt gga ccc ggc aac gga ctc gac aag ggg gag ccg gtc aac gca gca 2122

Leu Gly Pro Gly Asn Gly Leu Asp Lys Gly Glu Pro Val Asn Ala Ala

675 680 685

gac gcg gcg gcc ctc gag cac gac aag gcc tac gac cag cag ctc aag 2170

Asp Ala Ala Ala Leu Glu His Asp Lys Ala Tyr Asp Gln Gln Leu Lys

690 695 700 705

gcc gga gac aac ccg tac ctc aag tac aac cac gcc gac gcc gag ttc 2218

Ala Gly Asp Asn Pro Tyr Leu Lys Tyr Asn His Ala Asp Ala Glu Phe

710 715 720

cag gag cgg ctc aaa gaa gat acg tct ttt ggg ggc aac ctc ggg cga 2266

Gln Glu Arg Leu Lys Glu Asp Thr Ser Phe Gly Gly Asn Leu Gly Arg

725 730 735

gca gtc ttc cag gcc aaa aag agg ctt ctt gaa cct ctt ggt ctg gtt 2314

Ala Val Phe Gln Ala Lys Lys Arg Leu Leu Glu Pro Leu Gly Leu Val

740 745 750

gag gaa gcg gct aag acg gct cct gga aag aag agg cct gta gag cag 2362

Glu Glu Ala Ala Lys Thr Ala Pro Gly Lys Lys Arg Pro Val Glu Gln

755 760 765

tct cct cag gaa ccg gac tcc tcc gcg ggt att ggc aaa tcg ggt gca 2410

Ser Pro Gln Glu Pro Asp Ser Ser Ala Gly Ile Gly Lys Ser Gly Ala

770 775 780 785

cag ccc gct aaa aag aga ctc aat ttc ggt cag act ggc gac aca gag 2458

Gln Pro Ala Lys Lys Arg Leu Asn Phe Gly Gln Thr Gly Asp Thr Glu

790 795 800

tca gtc cca gac cct caa cca atc gga gaa cct ccc gca gcc ccc tca 2506

Ser Val Pro Asp Pro Gln Pro Ile Gly Glu Pro Pro Ala Ala Pro Ser

805 810 815

ggt gtg gga tct ctt aca atg gct tca ggt ggt ggc gca cca gtg gca 2554

Gly Val Gly Ser Leu Thr Met Ala Ser Gly Gly Gly Ala Pro Val Ala

820 825 830

gac aat aac gaa ggt gcc gat gga gtg ggt agt tcc tcg gga aat tgg 2602

Asp Asn Asn Glu Gly Ala Asp Gly Val Gly Ser Ser Ser Gly Asn Trp

835 840 845

cat tgc gat tcc caa tgg ctg ggg gac aga gtc atc acc acc agc acc 2650

His Cys Asp Ser Gln Trp Leu Gly Asp Arg Val Ile Thr Thr Ser Thr

850 855 860 865

cga acc tgg gcc ctg ccc acc tac aac aat cac ctc tac aag caa atc 2698

Arg Thr Trp Ala Leu Pro Thr Tyr Asn Asn His Leu Tyr Lys Gln Ile

870 875 880

tcc aac agc aca tct gga gga tct tca aat gac aac gcc tac ttc ggc 2746

Ser Asn Ser Thr Ser Gly Gly Ser Ser Asn Asp Asn Ala Tyr Phe Gly

885 890 895

tac agc acc ccc tgg ggg tat ttt gac ttc aac aga ttc cac tgc cac 2794

Tyr Ser Thr Pro Trp Gly Tyr Phe Asp Phe Asn Arg Phe His Cys His

900 905 910

ttc tca cca cgt gac tgg cag cga ctc atc aac aac aac tgg gga ttc 2842

Phe Ser Pro Arg Asp Trp Gln Arg Leu Ile Asn Asn Asn Trp Gly Phe

915 920 925

cgg cct aag cga ctc aac ttc aag ctc ttc aac att cag gtc aaa gag 2890

Arg Pro Lys Arg Leu Asn Phe Lys Leu Phe Asn Ile Gln Val Lys Glu

930 935 940 945

gtt acg gac aac aat gga gtc aag acc atc gcc aat aac ctt acc agc 2938

Val Thr Asp Asn Asn Gly Val Lys Thr Ile Ala Asn Asn Leu Thr Ser

950 955 960

acg gtc cag gtc ttc acg gac tca gac tat cag ctc ccg tac gtg ctc 2986

Thr Val Gln Val Phe Thr Asp Ser Asp Tyr Gln Leu Pro Tyr Val Leu

965 970 975

ggg tcg gct cac gag ggc tgc ctc ccg ccg ttc cca gcg gac gtt ttc 3034

Gly Ser Ala His Glu Gly Cys Leu Pro Pro Phe Pro Ala Asp Val Phe

980 985 990

atg att cct cag tac ggg tat ctg acg ctt aat gat gga agc cag gcc 3082

Met Ile Pro Gln Tyr Gly Tyr Leu Thr Leu Asn Asp Gly Ser Gln Ala

995 1000 1005

gtg ggt cgt tcg tcc ttt tac tgc ctg gaa tat ttc ccg tcg caa 3127

Val Gly Arg Ser Ser Phe Tyr Cys Leu Glu Tyr Phe Pro Ser Gln

1010 1015 1020

atg cta aga acg ggt aac aac ttc cag ttc agc tac gag ttt gag 3172

Met Leu Arg Thr Gly Asn Asn Phe Gln Phe Ser Tyr Glu Phe Glu

1025 1030 1035

aac gta cct ttc cat agc agc tac gct cac agc caa agc ctg gac 3217

Asn Val Pro Phe His Ser Ser Tyr Ala His Ser Gln Ser Leu Asp

1040 1045 1050

cga cta atg aat cca ctc atc gac caa tac ttg tac tat ctc tca 3262

Arg Leu Met Asn Pro Leu Ile Asp Gln Tyr Leu Tyr Tyr Leu Ser

1055 1060 1065

aag act att aac ggt tct gga cag aat caa caa acg cta aaa ttc 3307

Lys Thr Ile Asn Gly Ser Gly Gln Asn Gln Gln Thr Leu Lys Phe

1070 1075 1080

agt gtg gcc gga ccc agc aac atg gct gtc cag gga aga aac tac 3352

Ser Val Ala Gly Pro Ser Asn Met Ala Val Gln Gly Arg Asn Tyr

1085 1090 1095

ata cct gga ccc agc tac cga caa caa cgt gtc tca acc act gtg 3397

Ile Pro Gly Pro Ser Tyr Arg Gln Gln Arg Val Ser Thr Thr Val

1100 1105 1110

act caa aac aac aac agc gaa ttt gct tgg cct gga gct tct tct 3442

Thr Gln Asn Asn Asn Ser Glu Phe Ala Trp Pro Gly Ala Ser Ser

1115 1120 1125

tgg gct ctc aat gga cgt aat agc ttg atg aat cct gga cct gct 3487

Trp Ala Leu Asn Gly Arg Asn Ser Leu Met Asn Pro Gly Pro Ala

1130 1135 1140

atg gcc agc cac aaa gaa gga gag gac cgt ttc ttt cct ttg tct 3532

Met Ala Ser His Lys Glu Gly Glu Asp Arg Phe Phe Pro Leu Ser

1145 1150 1155

gga tct tta att ttt ggc aaa caa gga act gga aga gac aac gtg 3577

Gly Ser Leu Ile Phe Gly Lys Gln Gly Thr Gly Arg Asp Asn Val

1160 1165 1170

gat gcg gac aaa gtc atg ata acc aac gaa gaa gaa att aaa act 3622

Asp Ala Asp Lys Val Met Ile Thr Asn Glu Glu Glu Ile Lys Thr

1175 1180 1185

act aac ccg gta gca acg gag tcc tat gga caa gtg gcc aca aac 3667

Thr Asn Pro Val Ala Thr Glu Ser Tyr Gly Gln Val Ala Thr Asn

1190 1195 1200

cac cag agt gcc caa agc agc aac acc gtg aag ctg acc agc gga 3712

His Gln Ser Ala Gln Ser Ser Asn Thr Val Lys Leu Thr Ser Gly

1205 1210 1215

cac gca cag gcg cag acc ggc tgg gtt caa aac caa gga ata ctt 3757

His Ala Gln Ala Gln Thr Gly Trp Val Gln Asn Gln Gly Ile Leu

1220 1225 1230

ccg ggt atg gtt tgg cag gac aga gat gtg tac ctg caa gga ccc 3802

Pro Gly Met Val Trp Gln Asp Arg Asp Val Tyr Leu Gln Gly Pro

1235 1240 1245

att tgg gcc aaa att cct cac acg gac ggc aac ttt cac cct tct 3847

Ile Trp Ala Lys Ile Pro His Thr Asp Gly Asn Phe His Pro Ser

1250 1255 1260

ccg ctg atg gga ggg ttt gga atg aag cac ccg cct cct cag atc 3892

Pro Leu Met Gly Gly Phe Gly Met Lys His Pro Pro Pro Gln Ile

1265 1270 1275

ctc atc aaa aac aca cct gta cct gcg gat cct cca acg gcc ttc 3937

Leu Ile Lys Asn Thr Pro Val Pro Ala Asp Pro Pro Thr Ala Phe

1280 1285 1290

aac aag gac aag ctg aac tct ttc atc acc cag tat tct act ggc 3982

Asn Lys Asp Lys Leu Asn Ser Phe Ile Thr Gln Tyr Ser Thr Gly

1295 1300 1305

caa gtc agc gtg gag atc gag tgg gag ctg cag aag gaa aac agc 4027

Gln Val Ser Val Glu Ile Glu Trp Glu Leu Gln Lys Glu Asn Ser

1310 1315 1320

aag cgc tgg aac ccg gag atc cag tac act tcc aac tat tac aag 4072

Lys Arg Trp Asn Pro Glu Ile Gln Tyr Thr Ser Asn Tyr Tyr Lys

1325 1330 1335

tct aat aat gtt gaa ttt gct gtt aat act gaa ggt gta tat agt 4117

Ser Asn Asn Val Glu Phe Ala Val Asn Thr Glu Gly Val Tyr Ser

1340 1345 1350

gaa ccc cgc ccc att ggc acc aga tac ctg act cgt aat ctg taa 4162

Glu Pro Arg Pro Ile Gly Thr Arg Tyr Leu Thr Arg Asn Leu

1355 1360 1365

ttgcttgtta atcaataaac cgtttaattc gtttcagttg aactttggtc tctgcgaagg 4222

gcgaattcgt ttaaacctgc aggactagag gtcctgtatt agaggtcacg tgagtgtttt 4282

gcgacatttt gcgacaccat gtggtcacgc tgggtattta agcccgagtg agcacgcagg 4342

gtctccattt tgaagcggga ggtttgaacg cgcagccgcc aagccgaatt ctgcagatat 4402

ccatcacact ggcggccgct cgactagagc ggccgccacc gcggtggagc tccagctttt 4462

gttcccttta gtgagggtta attgcgcgct tggcgtaatc atggtcatag ctgtttcctg 4522

tgtgaaattg ttatccgctc acaattccac acaacatacg agccggaagc ataaagtgta 4582

aagcctgggg tgcctaatga gtgagctaac tcacattaat tgcgttgcgc tcactgcccg 4642

ctttccagtc gggaaacctg tcgtgccagc tgcattaatg aatcggccaa cgcgcgggga 4702

gaggcggttt gcgtattggg cgc 4725
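
The feature coordinates given for SEQ ID NO: 13 can be cross-checked against the lengths stated for the translated sequences (SEQ ID NO: 14, 621 residues; SEQ ID NO: 15, 747 residues). The following is a minimal sketch of that arithmetic, not part of the filed listing. The coordinates and the eleven codons are copied from the `<222>` lines and from positions 3683..3715 as printed above; the helper names (`span_length`, `first_residue`) are hypothetical and introduced only for illustration.

```python
# Feature coordinates copied from the <222> lines of SEQ ID NO: 13
# (1-based, inclusive, as is usual for sequence listings).
FEATURES = {
    "AAV2 Rep": (37, 1899),
    "AAV9 Cap": (1919, 4162),
    "SSN": (3683, 3715),
}

# Codons printed in the listing across positions 3683..3715.
SSN_CODONS = "agc agc aac acc gtg aag ctg acc agc gga cac".split()

# Partial codon table: only the codons that occur in the SSN feature.
CODON_TABLE = {
    "agc": "S", "aac": "N", "acc": "T", "gtg": "V",
    "aag": "K", "ctg": "L", "gga": "G", "cac": "H",
}


def span_length(start: int, end: int) -> int:
    """Length in nucleotides of a 1-based inclusive range."""
    return end - start + 1


def first_residue(cds_start: int, feature_start: int) -> int:
    """1-based residue number, within the CDS, of the codon starting at feature_start."""
    offset = feature_start - cds_start
    if offset % 3 != 0:
        raise ValueError("feature is not in frame with the CDS")
    return offset // 3 + 1


# As printed, the Rep range ends on the last sense codon (caa at 1897..1899),
# while the Cap range ends on the taa stop codon (4160..4162).
rep_codons = span_length(*FEATURES["AAV2 Rep"]) // 3
cap_codons = span_length(*FEATURES["AAV9 Cap"]) // 3
cap_residues = cap_codons - 1  # drop the stop codon

ssn_peptide = "".join(CODON_TABLE[c] for c in SSN_CODONS)
ssn_first = first_residue(FEATURES["AAV9 Cap"][0], FEATURES["SSN"][0])

print(rep_codons, cap_residues, ssn_peptide, ssn_first)
```

Under these assumptions the Rep range corresponds to 621 codons and the Cap range to 747 codons plus a stop, in line with the `<211>` values of SEQ ID NO: 14 and SEQ ID NO: 15, and the SSN feature begins at Cap residue 589, that is, immediately after residue 588.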

<210> 14

<211> 621

<212> PRT

<213> artificial sequence

<220>

<223> synthetic construct

<400> 14

Met Pro Gly Phe Tyr Glu Ile Val Ile Lys Val Pro Ser Asp Leu Asp

1 5 10 15

Glu His Leu Pro Gly Ile Ser Asp Ser Phe Val Asn Trp Val Ala Glu

20 25 30

Lys Glu Trp Glu Leu Pro Pro Asp Ser Asp Met Asp Leu Asn Leu Ile

35 40 45

Glu Gln Ala Pro Leu Thr Val Ala Glu Lys Leu Gln Arg Asp Phe Leu

50 55 60

Thr Glu Trp Arg Arg Val Ser Lys Ala Pro Glu Ala Leu Phe Phe Val

65 70 75 80

Gln Phe Glu Lys Gly Glu Ser Tyr Phe His Met His Val Leu Val Glu

85 90 95

Thr Thr Gly Val Lys Ser Met Val Leu Gly Arg Phe Leu Ser Gln Ile

100 105 110

Arg Glu Lys Leu Ile Gln Arg Ile Tyr Arg Gly Ile Glu Pro Thr Leu

115 120 125

Pro Asn Trp Phe Ala Val Thr Lys Thr Arg Asn Gly Ala Gly Gly Gly

130 135 140

Asn Lys Val Val Asp Glu Cys Tyr Ile Pro Asn Tyr Leu Leu Pro Lys

145 150 155 160

Thr Gln Pro Glu Leu Gln Trp Ala Trp Thr Asn Met Glu Gln Tyr Leu

165 170 175

Ser Ala Cys Leu Asn Leu Thr Glu Arg Lys Arg Leu Val Ala Gln His

180 185 190

Leu Thr His Val Ser Gln Thr Gln Glu Gln Asn Lys Glu Asn Gln Asn

195 200 205

Pro Asn Ser Asp Ala Pro Val Ile Arg Ser Lys Thr Ser Ala Arg Tyr

210 215 220

Met Glu Leu Val Gly Trp Leu Val Asp Lys Gly Ile Thr Ser Glu Lys

225 230 235 240

Gln Trp Ile Gln Glu Asp Gln Ala Ser Tyr Ile Ser Phe Asn Ala Ala

245 250 255

Ser Asn Ser Arg Ser Gln Ile Lys Ala Ala Leu Asp Asn Ala Gly Lys

260 265 270

Ile Met Ser Leu Thr Lys Thr Ala Pro Asp Tyr Leu Val Gly Gln Gln

275 280 285

Pro Val Glu Asp Ile Ser Ser Asn Arg Ile Tyr Lys Ile Leu Glu Leu

290 295 300

Asn Gly Tyr Asp Pro Gln Tyr Ala Ala Ser Val Phe Leu Gly Trp Ala

305 310 315 320

Thr Lys Lys Phe Gly Lys Arg Asn Thr Ile Trp Leu Phe Gly Pro Ala

325 330 335

Thr Thr Gly Lys Thr Asn Ile Ala Glu Ala Ile Ala His Thr Val Pro

340 345 350

Phe Tyr Gly Cys Val Asn Trp Thr Asn Glu Asn Phe Pro Phe Asn Asp

355 360 365

Cys Val Asp Lys Met Val Ile Trp Trp Glu Glu Gly Lys Met Thr Ala

370 375 380

Lys Val Val Glu Ser Ala Lys Ala Ile Leu Gly Gly Ser Lys Val Arg

385 390 395 400

Val Asp Gln Lys Cys Lys Ser Ser Ala Gln Ile Asp Pro Thr Pro Val

405 410 415

Ile Val Thr Ser Asn Thr Asn Met Cys Ala Val Ile Asp Gly Asn Ser

420 425 430

Thr Thr Phe Glu His Gln Gln Pro Leu Gln Asp Arg Met Phe Lys Phe

435 440 445

Glu Leu Thr Arg Arg Leu Asp His Asp Phe Gly Lys Val Thr Lys Gln

450 455 460

Glu Val Lys Asp Phe Phe Arg Trp Ala Lys Asp His Val Val Glu Val

465 470 475 480

Glu His Glu Phe Tyr Val Lys Lys Gly Gly Ala Lys Lys Arg Pro Ala

485 490 495

Pro Ser Asp Ala Asp Ile Ser Glu Pro Lys Arg Val Arg Glu Ser Val

500 505 510

Ala Gln Pro Ser Thr Ser Asp Ala Glu Ala Ser Ile Asn Tyr Ala Asp

515 520 525

Arg Tyr Gln Asn Lys Cys Ser Arg His Val Gly Met Asn Leu Met Leu

530 535 540

Phe Pro Cys Arg Gln Cys Glu Arg Met Asn Gln Asn Ser Asn Ile Cys

545 550 555 560

Phe Thr His Gly Gln Lys Asp Cys Leu Glu Cys Phe Pro Val Ser Glu

565 570 575

Ser Gln Pro Val Ser Val Val Lys Lys Ala Tyr Gln Lys Leu Cys Tyr

580 585 590

Ile His His Ile Met Gly Lys Val Pro Asp Ala Cys Thr Ala Cys Asp

595 600 605

Leu Val Asn Val Asp Leu Asp Asp Cys Ile Phe Glu Gln

610 615 620

<210> 15

<211> 747

<212> PRT

<213> artificial sequence

<220>

<223> synthetic construct

<400> 15

Met Ala Ala Asp Gly Tyr Leu Pro Asp Trp Leu Glu Asp Asn Leu Ser

1 5 10 15

Glu Gly Ile Arg Glu Trp Trp Ala Leu Lys Pro Gly Ala Pro Gln Pro

20 25 30

Lys Ala Asn Gln Gln His Gln Asp Asn Ala Arg Gly Leu Val Leu Pro

35 40 45

Gly Tyr Lys Tyr Leu Gly Pro Gly Asn Gly Leu Asp Lys Gly Glu Pro

50 55 60

Val Asn Ala Ala Asp Ala Ala Ala Leu Glu His Asp Lys Ala Tyr Asp

65 70 75 80

Gln Gln Leu Lys Ala Gly Asp Asn Pro Tyr Leu Lys Tyr Asn His Ala

85 90 95

Asp Ala Glu Phe Gln Glu Arg Leu Lys Glu Asp Thr Ser Phe Gly Gly

100 105 110

Asn Leu Gly Arg Ala Val Phe Gln Ala Lys Lys Arg Leu Leu Glu Pro

115 120 125

Leu Gly Leu Val Glu Glu Ala Ala Lys Thr Ala Pro Gly Lys Lys Arg

130 135 140

Pro Val Glu Gln Ser Pro Gln Glu Pro Asp Ser Ser Ala Gly Ile Gly

145 150 155 160

Lys Ser Gly Ala Gln Pro Ala Lys Lys Arg Leu Asn Phe Gly Gln Thr

165 170 175

Gly Asp Thr Glu Ser Val Pro Asp Pro Gln Pro Ile Gly Glu Pro Pro

180 185 190

Ala Ala Pro Ser Gly Val Gly Ser Leu Thr Met Ala Ser Gly Gly Gly

195 200 205

Ala Pro Val Ala Asp Asn Asn Glu Gly Ala Asp Gly Val Gly Ser Ser

210 215 220

Ser Gly Asn Trp His Cys Asp Ser Gln Trp Leu Gly Asp Arg Val Ile

225 230 235 240

Thr Thr Ser Thr Arg Thr Trp Ala Leu Pro Thr Tyr Asn Asn His Leu

245 250 255

Tyr Lys Gln Ile Ser Asn Ser Thr Ser Gly Gly Ser Ser Asn Asp Asn

260 265 270

Ala Tyr Phe Gly Tyr Ser Thr Pro Trp Gly Tyr Phe Asp Phe Asn Arg

275 280 285

Phe His Cys His Phe Ser Pro Arg Asp Trp Gln Arg Leu Ile Asn Asn

290 295 300

Asn Trp Gly Phe Arg Pro Lys Arg Leu Asn Phe Lys Leu Phe Asn Ile

305 310 315 320

Gln Val Lys Glu Val Thr Asp Asn Asn Gly Val Lys Thr Ile Ala Asn

325 330 335

Asn Leu Thr Ser Thr Val Gln Val Phe Thr Asp Ser Asp Tyr Gln Leu

340 345 350

Pro Tyr Val Leu Gly Ser Ala His Glu Gly Cys Leu Pro Pro Phe Pro

355 360 365

Ala Asp Val Phe Met Ile Pro Gln Tyr Gly Tyr Leu Thr Leu Asn Asp

370 375 380

Gly Ser Gln Ala Val Gly Arg Ser Ser Phe Tyr Cys Leu Glu Tyr Phe

385 390 395 400

Pro Ser Gln Met Leu Arg Thr Gly Asn Asn Phe Gln Phe Ser Tyr Glu

405 410 415

Phe Glu Asn Val Pro Phe His Ser Ser Tyr Ala His Ser Gln Ser Leu

420 425 430

Asp Arg Leu Met Asn Pro Leu Ile Asp Gln Tyr Leu Tyr Tyr Leu Ser

435 440 445

Lys Thr Ile Asn Gly Ser Gly Gln Asn Gln Gln Thr Leu Lys Phe Ser

450 455 460

Val Ala Gly Pro Ser Asn Met Ala Val Gln Gly Arg Asn Tyr Ile Pro

465 470 475 480

Gly Pro Ser Tyr Arg Gln Gln Arg Val Ser Thr Thr Val Thr Gln Asn

485 490 495

Asn Asn Ser Glu Phe Ala Trp Pro Gly Ala Ser Ser Trp Ala Leu Asn

500 505 510

Gly Arg Asn Ser Leu Met Asn Pro Gly Pro Ala Met Ala Ser His Lys

515 520 525

Glu Gly Glu Asp Arg Phe Phe Pro Leu Ser Gly Ser Leu Ile Phe Gly

530 535 540

Lys Gln Gly Thr Gly Arg Asp Asn Val Asp Ala Asp Lys Val Met Ile

545 550 555 560

Thr Asn Glu Glu Glu Ile Lys Thr Thr Asn Pro Val Ala Thr Glu Ser

565 570 575

Tyr Gly Gln Val Ala Thr Asn His Gln Ser Ala Gln Ser Ser Asn Thr

580 585 590

Val Lys Leu Thr Ser Gly His Ala Gln Ala Gln Thr Gly Trp Val Gln

595 600 605

Asn Gln Gly Ile Leu Pro Gly Met Val Trp Gln Asp Arg Asp Val Tyr

610 615 620

Leu Gln Gly Pro Ile Trp Ala Lys Ile Pro His Thr Asp Gly Asn Phe

625 630 635 640

His Pro Ser Pro Leu Met Gly Gly Phe Gly Met Lys His Pro Pro Pro

645 650 655

Gln Ile Leu Ile Lys Asn Thr Pro Val Pro Ala Asp Pro Pro Thr Ala

660 665 670

Phe Asn Lys Asp Lys Leu Asn Ser Phe Ile Thr Gln Tyr Ser Thr Gly

675 680 685

Gln Val Ser Val Glu Ile Glu Trp Glu Leu Gln Lys Glu Asn Ser Lys

690 695 700

Arg Trp Asn Pro Glu Ile Gln Tyr Thr Ser Asn Tyr Tyr Lys Ser Asn

705 710 715 720

Asn Val Glu Phe Ala Val Asn Thr Glu Gly Val Tyr Ser Glu Pro Arg

725 730 735

Pro Ile Gly Thr Arg Tyr Leu Thr Arg Asn Leu

740 745

<210> 16

<211> 4737

<212> DNA

<213> artificial sequence

<220>

<223> AAV2/9 n.588.SSN-L expression cassette nucleic acid sequence

<220>

<221> misc_feature

<222> (1)..(36)

<223> truncated promoter

<220>

<221> promoter

<222> (1)..(7)

<223> p5 promoter

<220>

<221> CDS

<222> (37)..(1899)

<223> AAV2 Rep

<220>

<221> CDS

<222> (1919)..(4174)

<223> AAV9 Cap

<220>

<221> misc_feature

<222> (3683)..(3727)

<223> SSN-L

<220>

<221> misc_feature

<222> (4253)..(4737)

<223> LacZ promoter

<220>

<221> misc_feature

<222> (4265)..(4395)

<223> p5 promoter

<400> 16

ccattttgaa gcgggaggtt tgaacgcgca gccgcc atg ccg ggg ttt tac gag 54

Met Pro Gly Phe Tyr Glu

1 5

att gtg att aag gtc ccc agc gac ctt gac gag cat ctg ccc ggc att 102

Ile Val Ile Lys Val Pro Ser Asp Leu Asp Glu His Leu Pro Gly Ile

10 15 20

tct gac agc ttt gtg aac tgg gtg gcc gag aag gaa tgg gag ttg ccg 150

Ser Asp Ser Phe Val Asn Trp Val Ala Glu Lys Glu Trp Glu Leu Pro

25 30 35

cca gat tct gac atg gat ctg aat ctg att gag cag gca ccc ctg acc 198

Pro Asp Ser Asp Met Asp Leu Asn Leu Ile Glu Gln Ala Pro Leu Thr

40 45 50

gtg gcc gag aag ctg cag cgc gac ttt ctg acg gaa tgg cgc cgt gtg 246

Val Ala Glu Lys Leu Gln Arg Asp Phe Leu Thr Glu Trp Arg Arg Val

55 60 65 70

agt aag gcc ccg gag gct ctt ttc ttt gtg caa ttt gag aag gga gag 294

Ser Lys Ala Pro Glu Ala Leu Phe Phe Val Gln Phe Glu Lys Gly Glu

75 80 85

agc tac ttc cac atg cac gtg ctc gtg gaa acc acc ggg gtg aaa tcc 342

Ser Tyr Phe His Met His Val Leu Val Glu Thr Thr Gly Val Lys Ser

90 95 100

atg gtt ttg gga cgt ttc ctg agt cag att cgc gaa aaa ctg att cag 390

Met Val Leu Gly Arg Phe Leu Ser Gln Ile Arg Glu Lys Leu Ile Gln

105 110 115

aga att tac cgc ggg atc gag ccg act ttg cca aac tgg ttc gcg gtc 438

Arg Ile Tyr Arg Gly Ile Glu Pro Thr Leu Pro Asn Trp Phe Ala Val

120 125 130

aca aag acc aga aat ggc gcc gga ggc ggg aac aag gtg gtg gat gag 486

Thr Lys Thr Arg Asn Gly Ala Gly Gly Gly Asn Lys Val Val Asp Glu

135 140 145 150

tgc tac atc ccc aat tac ttg ctc ccc aaa acc cag cct gag ctc cag 534

Cys Tyr Ile Pro Asn Tyr Leu Leu Pro Lys Thr Gln Pro Glu Leu Gln

155 160 165

tgg gcg tgg act aat atg gaa cag tat tta agc gcc tgt ttg aat ctc 582

Trp Ala Trp Thr Asn Met Glu Gln Tyr Leu Ser Ala Cys Leu Asn Leu

170 175 180

acg gag cgt aaa cgg ttg gtg gcg cag cat ctg acg cac gtg tcg cag 630

Thr Glu Arg Lys Arg Leu Val Ala Gln His Leu Thr His Val Ser Gln

185 190 195

acg cag gag cag aac aaa gag aat cag aat ccc aat tct gat gcg ccg 678

Thr Gln Glu Gln Asn Lys Glu Asn Gln Asn Pro Asn Ser Asp Ala Pro

200 205 210

gtg atc aga tca aaa act tca gcc agg tac atg gag ctg gtc ggg tgg 726

Val Ile Arg Ser Lys Thr Ser Ala Arg Tyr Met Glu Leu Val Gly Trp

215 220 225 230

ctc gtg gac aag ggg att acc tcg gag aag cag tgg atc cag gag gac 774

Leu Val Asp Lys Gly Ile Thr Ser Glu Lys Gln Trp Ile Gln Glu Asp

235 240 245

cag gcc tca tac atc tcc ttc aat gcg gcc tcc aac tcg cgg tcc caa 822

Gln Ala Ser Tyr Ile Ser Phe Asn Ala Ala Ser Asn Ser Arg Ser Gln

250 255 260

atc aag gct gcc ttg gac aat gcg gga aag att atg agc ctg act aaa 870

Ile Lys Ala Ala Leu Asp Asn Ala Gly Lys Ile Met Ser Leu Thr Lys

265 270 275

acc gcc ccc gac tac ctg gtg ggc cag cag ccc gtg gag gac att tcc 918

Thr Ala Pro Asp Tyr Leu Val Gly Gln Gln Pro Val Glu Asp Ile Ser

280 285 290

agc aat cgg att tat aaa att ttg gaa cta aac ggg tac gat ccc caa 966

Ser Asn Arg Ile Tyr Lys Ile Leu Glu Leu Asn Gly Tyr Asp Pro Gln

295 300 305 310

tat gcg gct tcc gtc ttt ctg gga tgg gcc acg aaa aag ttc ggc aag 1014

Tyr Ala Ala Ser Val Phe Leu Gly Trp Ala Thr Lys Lys Phe Gly Lys

315 320 325

agg aac acc atc tgg ctg ttt ggg cct gca act acc ggg aag acc aac 1062

Arg Asn Thr Ile Trp Leu Phe Gly Pro Ala Thr Thr Gly Lys Thr Asn

330 335 340

atc gcg gag gcc ata gcc cac act gtg ccc ttc tac ggg tgc gta aac 1110

Ile Ala Glu Ala Ile Ala His Thr Val Pro Phe Tyr Gly Cys Val Asn

345 350 355

tgg acc aat gag aac ttt ccc ttc aac gac tgt gtc gac aag atg gtg 1158

Trp Thr Asn Glu Asn Phe Pro Phe Asn Asp Cys Val Asp Lys Met Val

360 365 370

atc tgg tgg gag gag ggg aag atg acc gcc aag gtc gtg gag tcg gcc 1206

Ile Trp Trp Glu Glu Gly Lys Met Thr Ala Lys Val Val Glu Ser Ala

375 380 385 390

aaa gcc att ctc gga gga agc aag gtg cgc gtg gac cag aaa tgc aag 1254

Lys Ala Ile Leu Gly Gly Ser Lys Val Arg Val Asp Gln Lys Cys Lys

395 400 405

tcc tcg gcc cag ata gac ccg act ccc gtg atc gtc acc tcc aac acc 1302

Ser Ser Ala Gln Ile Asp Pro Thr Pro Val Ile Val Thr Ser Asn Thr

410 415 420

aac atg tgc gcc gtg att gac ggg aac tca acg acc ttc gaa cac cag 1350

Asn Met Cys Ala Val Ile Asp Gly Asn Ser Thr Thr Phe Glu His Gln

425 430 435

cag ccg ttg caa gac cgg atg ttc aaa ttt gaa ctc acc cgc cgt ctg 1398

Gln Pro Leu Gln Asp Arg Met Phe Lys Phe Glu Leu Thr Arg Arg Leu

440 445 450

gat cat gac ttt ggg aag gtc acc aag cag gaa gtc aaa gac ttt ttc 1446

Asp His Asp Phe Gly Lys Val Thr Lys Gln Glu Val Lys Asp Phe Phe

455 460 465 470

cgg tgg gca aag gat cac gtg gtt gag gtg gag cat gaa ttc tac gtc 1494

Arg Trp Ala Lys Asp His Val Val Glu Val Glu His Glu Phe Tyr Val

475 480 485

aaa aag ggt gga gcc aag aaa aga ccc gcc ccc agt gac gca gat ata 1542

Lys Lys Gly Gly Ala Lys Lys Arg Pro Ala Pro Ser Asp Ala Asp Ile

490 495 500

agt gag ccc aaa cgg gtg cgc gag tca gtt gcg cag cca tcg acg tca 1590

Ser Glu Pro Lys Arg Val Arg Glu Ser Val Ala Gln Pro Ser Thr Ser

505 510 515

gac gcg gaa gct tcg atc aac tac gca gac agg tac caa aac aaa tgt 1638

Asp Ala Glu Ala Ser Ile Asn Tyr Ala Asp Arg Tyr Gln Asn Lys Cys

520 525 530

tct cgt cac gtg ggc atg aat ctg atg ctg ttt ccc tgc aga caa tgc 1686

Ser Arg His Val Gly Met Asn Leu Met Leu Phe Pro Cys Arg Gln Cys

535 540 545 550

gag aga atg aat cag aat tca aat atc tgc ttc act cac gga cag aaa 1734

Glu Arg Met Asn Gln Asn Ser Asn Ile Cys Phe Thr His Gly Gln Lys

555 560 565

gac tgt tta gag tgc ttt ccc gtg tca gaa tct caa ccc gtt tct gtc 1782

Asp Cys Leu Glu Cys Phe Pro Val Ser Glu Ser Gln Pro Val Ser Val

570 575 580

gtc aaa aag gcg tat cag aaa ctg tgc tac att cat cat atc atg gga 1830

Val Lys Lys Ala Tyr Gln Lys Leu Cys Tyr Ile His His Ile Met Gly

585 590 595

aag gtg cca gac gct tgc act gcc tgc gat ctg gtc aat gtg gat ttg 1878

Lys Val Pro Asp Ala Cys Thr Ala Cys Asp Leu Val Asn Val Asp Leu

600 605 610

gat gac tgc atc ttt gaa caa taaatgattt aaatcaggt atg gct gcc gat 1930

Asp Asp Cys Ile Phe Glu Gln Met Ala Ala Asp

615 620 625

ggt tat ctt cca gat tgg ctc gag gac aac ctt agt gaa gga att cgc 1978

Gly Tyr Leu Pro Asp Trp Leu Glu Asp Asn Leu Ser Glu Gly Ile Arg

630 635 640

gag tgg tgg gct ttg aaa cct gga gcc cct caa ccc aag gca aat caa 2026

Glu Trp Trp Ala Leu Lys Pro Gly Ala Pro Gln Pro Lys Ala Asn Gln

645 650 655

caa cat caa gac aac gct cga ggt ctt gtg ctt ccg ggt tac aaa tac 2074

Gln His Gln Asp Asn Ala Arg Gly Leu Val Leu Pro Gly Tyr Lys Tyr

660 665 670

ctt gga ccc ggc aac gga ctc gac aag ggg gag ccg gtc aac gca gca 2122

Leu Gly Pro Gly Asn Gly Leu Asp Lys Gly Glu Pro Val Asn Ala Ala

675 680 685

gac gcg gcg gcc ctc gag cac gac aag gcc tac gac cag cag ctc aag 2170

Asp Ala Ala Ala Leu Glu His Asp Lys Ala Tyr Asp Gln Gln Leu Lys

690 695 700 705

gcc gga gac aac ccg tac ctc aag tac aac cac gcc gac gcc gag ttc 2218

Ala Gly Asp Asn Pro Tyr Leu Lys Tyr Asn His Ala Asp Ala Glu Phe

710 715 720

cag gag cgg ctc aaa gaa gat acg tct ttt ggg ggc aac ctc ggg cga 2266

Gln Glu Arg Leu Lys Glu Asp Thr Ser Phe Gly Gly Asn Leu Gly Arg

725 730 735

gca gtc ttc cag gcc aaa aag agg ctt ctt gaa cct ctt ggt ctg gtt 2314

Ala Val Phe Gln Ala Lys Lys Arg Leu Leu Glu Pro Leu Gly Leu Val

740 745 750

gag gaa gcg gct aag acg gct cct gga aag aag agg cct gta gag cag 2362

Glu Glu Ala Ala Lys Thr Ala Pro Gly Lys Lys Arg Pro Val Glu Gln

755 760 765

tct cct cag gaa ccg gac tcc tcc gcg ggt att ggc aaa tcg ggt gca 2410

Ser Pro Gln Glu Pro Asp Ser Ser Ala Gly Ile Gly Lys Ser Gly Ala

770 775 780 785

cag ccc gct aaa aag aga ctc aat ttc ggt cag act ggc gac aca gag 2458

Gln Pro Ala Lys Lys Arg Leu Asn Phe Gly Gln Thr Gly Asp Thr Glu

790 795 800

tca gtc cca gac cct caa cca atc gga gaa cct ccc gca gcc ccc tca 2506

Ser Val Pro Asp Pro Gln Pro Ile Gly Glu Pro Pro Ala Ala Pro Ser

805 810 815

ggt gtg gga tct ctt aca atg gct tca ggt ggt ggc gca cca gtg gca 2554

Gly Val Gly Ser Leu Thr Met Ala Ser Gly Gly Gly Ala Pro Val Ala

820 825 830

gac aat aac gaa ggt gcc gat gga gtg ggt agt tcc tcg gga aat tgg 2602

Asp Asn Asn Glu Gly Ala Asp Gly Val Gly Ser Ser Ser Gly Asn Trp

835 840 845

cat tgc gat tcc caa tgg ctg ggg gac aga gtc atc acc acc agc acc 2650

His Cys Asp Ser Gln Trp Leu Gly Asp Arg Val Ile Thr Thr Ser Thr

850 855 860 865

cga acc tgg gcc ctg ccc acc tac aac aat cac ctc tac aag caa atc 2698

Arg Thr Trp Ala Leu Pro Thr Tyr Asn Asn His Leu Tyr Lys Gln Ile

870 875 880

tcc aac agc aca tct gga gga tct tca aat gac aac gcc tac ttc ggc 2746

Ser Asn Ser Thr Ser Gly Gly Ser Ser Asn Asp Asn Ala Tyr Phe Gly

885 890 895

tac agc acc ccc tgg ggg tat ttt gac ttc aac aga ttc cac tgc cac 2794

Tyr Ser Thr Pro Trp Gly Tyr Phe Asp Phe Asn Arg Phe His Cys His

900 905 910

ttc tca cca cgt gac tgg cag cga ctc atc aac aac aac tgg gga ttc 2842

Phe Ser Pro Arg Asp Trp Gln Arg Leu Ile Asn Asn Asn Trp Gly Phe

915 920 925

cgg cct aag cga ctc aac ttc aag ctc ttc aac att cag gtc aaa gag 2890

Arg Pro Lys Arg Leu Asn Phe Lys Leu Phe Asn Ile Gln Val Lys Glu

930 935 940 945

gtt acg gac aac aat gga gtc aag acc atc gcc aat aac ctt acc agc 2938

Val Thr Asp Asn Asn Gly Val Lys Thr Ile Ala Asn Asn Leu Thr Ser

950 955 960

acg gtc cag gtc ttc acg gac tca gac tat cag ctc ccg tac gtg ctc 2986

Thr Val Gln Val Phe Thr Asp Ser Asp Tyr Gln Leu Pro Tyr Val Leu

965 970 975

ggg tcg gct cac gag ggc tgc ctc ccg ccg ttc cca gcg gac gtt ttc 3034

Gly Ser Ala His Glu Gly Cys Leu Pro Pro Phe Pro Ala Asp Val Phe

980 985 990

atg att cct cag tac ggg tat ctg acg ctt aat gat gga agc cag gcc 3082

Met Ile Pro Gln Tyr Gly Tyr Leu Thr Leu Asn Asp Gly Ser Gln Ala

995 1000 1005

gtg ggt cgt tcg tcc ttt tac tgc ctg gaa tat ttc ccg tcg caa 3127

Val Gly Arg Ser Ser Phe Tyr Cys Leu Glu Tyr Phe Pro Ser Gln

1010 1015 1020

atg cta aga acg ggt aac aac ttc cag ttc agc tac gag ttt gag 3172

Met Leu Arg Thr Gly Asn Asn Phe Gln Phe Ser Tyr Glu Phe Glu

1025 1030 1035

aac gta cct ttc cat agc agc tac gct cac agc caa agc ctg gac 3217

Asn Val Pro Phe His Ser Ser Tyr Ala His Ser Gln Ser Leu Asp

1040 1045 1050

cga cta atg aat cca ctc atc gac caa tac ttg tac tat ctc tca 3262

Arg Leu Met Asn Pro Leu Ile Asp Gln Tyr Leu Tyr Tyr Leu Ser

1055 1060 1065

aag act att aac ggt tct gga cag aat caa caa acg cta aaa ttc 3307

Lys Thr Ile Asn Gly Ser Gly Gln Asn Gln Gln Thr Leu Lys Phe

1070 1075 1080

agt gtg gcc gga ccc agc aac atg gct gtc cag gga aga aac tac 3352

Ser Val Ala Gly Pro Ser Asn Met Ala Val Gln Gly Arg Asn Tyr

1085 1090 1095

ata cct gga ccc agc tac cga caa caa cgt gtc tca acc act gtg 3397

Ile Pro Gly Pro Ser Tyr Arg Gln Gln Arg Val Ser Thr Thr Val

1100 1105 1110

act caa aac aac aac agc gaa ttt gct tgg cct gga gct tct tct 3442

Thr Gln Asn Asn Asn Ser Glu Phe Ala Trp Pro Gly Ala Ser Ser

1115 1120 1125

tgg gct ctc aat gga cgt aat agc ttg atg aat cct gga cct gct 3487

Trp Ala Leu Asn Gly Arg Asn Ser Leu Met Asn Pro Gly Pro Ala

1130 1135 1140

atg gcc agc cac aaa gaa gga gag gac cgt ttc ttt cct ttg tct 3532

Met Ala Ser His Lys Glu Gly Glu Asp Arg Phe Phe Pro Leu Ser

1145 1150 1155

gga tct tta att ttt ggc aaa caa gga act gga aga gac aac gtg 3577

Gly Ser Leu Ile Phe Gly Lys Gln Gly Thr Gly Arg Asp Asn Val

1160 1165 1170

gat gcg gac aaa gtc atg ata acc aac gaa gaa gaa att aaa act 3622

Asp Ala Asp Lys Val Met Ile Thr Asn Glu Glu Glu Ile Lys Thr

1175 1180 1185

act aac ccg gta gca acg gag tcc tat gga caa gtg gcc aca aac 3667

Thr Asn Pro Val Ala Thr Glu Ser Tyr Gly Gln Val Ala Thr Asn

1190 1195 1200

cac cag agt gcc caa gga gga agc agc aac acc gtg aag ctg acc 3712

His Gln Ser Ala Gln Gly Gly Ser Ser Asn Thr Val Lys Leu Thr

1205 1210 1215

agc gga cac gga gga gca cag gcg cag acc ggc tgg gtt caa aac 3757

Ser Gly His Gly Gly Ala Gln Ala Gln Thr Gly Trp Val Gln Asn

1220 1225 1230

caa gga ata ctt ccg ggt atg gtt tgg cag gac aga gat gtg tac 3802

Gln Gly Ile Leu Pro Gly Met Val Trp Gln Asp Arg Asp Val Tyr

1235 1240 1245

ctg caa gga ccc att tgg gcc aaa att cct cac acg gac ggc aac 3847

Leu Gln Gly Pro Ile Trp Ala Lys Ile Pro His Thr Asp Gly Asn

1250 1255 1260

ttt cac cct tct ccg ctg atg gga ggg ttt gga atg aag cac ccg 3892

Phe His Pro Ser Pro Leu Met Gly Gly Phe Gly Met Lys His Pro

1265 1270 1275

cct cct cag atc ctc atc aaa aac aca cct gta cct gcg gat cct 3937

Pro Pro Gln Ile Leu Ile Lys Asn Thr Pro Val Pro Ala Asp Pro

1280 1285 1290

cca acg gcc ttc aac aag gac aag ctg aac tct ttc atc acc cag 3982

Pro Thr Ala Phe Asn Lys Asp Lys Leu Asn Ser Phe Ile Thr Gln

1295 1300 1305

tat tct act ggc caa gtc agc gtg gag atc gag tgg gag ctg cag 4027

Tyr Ser Thr Gly Gln Val Ser Val Glu Ile Glu Trp Glu Leu Gln

1310 1315 1320

aag gaa aac agc aag cgc tgg aac ccg gag atc cag tac act tcc 4072

Lys Glu Asn Ser Lys Arg Trp Asn Pro Glu Ile Gln Tyr Thr Ser

1325 1330 1335

aac tat tac aag tct aat aat gtt gaa ttt gct gtt aat act gaa 4117

Asn Tyr Tyr Lys Ser Asn Asn Val Glu Phe Ala Val Asn Thr Glu

1340 1345 1350

ggt gta tat agt gaa ccc cgc ccc att ggc acc aga tac ctg act 4162

Gly Val Tyr Ser Glu Pro Arg Pro Ile Gly Thr Arg Tyr Leu Thr

1355 1360 1365

cgt aat ctg taa ttgcttgtta atcaataaac cgtttaattc gtttcagttg 4214

Arg Asn Leu

1370

aactttggtc tctgcgaagg gcgaattcgt ttaaacctgc aggactagag gtcctgtatt 4274

agaggtcacg tgagtgtttt gcgacatttt gcgacaccat gtggtcacgc tgggtattta 4334

agcccgagtg agcacgcagg gtctccattt tgaagcggga ggtttgaacg cgcagccgcc 4394

aagccgaatt ctgcagatat ccatcacact ggcggccgct cgactagagc ggccgccacc 4454

gcggtggagc tccagctttt gttcccttta gtgagggtta attgcgcgct tggcgtaatc 4514

atggtcatag ctgtttcctg tgtgaaattg ttatccgctc acaattccac acaacatacg 4574

agccggaagc ataaagtgta aagcctgggg tgcctaatga gtgagctaac tcacattaat 4634

tgcgttgcgc tcactgcccg ctttccagtc gggaaacctg tcgtgccagc tgcattaatg 4694

aatcggccaa cgcgcgggga gaggcggttt gcgtattggg cgc 4737
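The feature coordinates annotated for SEQ ID NO: 16 can be cross-checked with simple arithmetic. The following is a minimal sketch and not part of the sequence listing itself. The helper names `span_length` and `residue_number` are hypothetical, introduced only for illustration; the coordinates are copied from the <222> fields of SEQ ID NO: 16.

```python
# Coordinates (1-based, inclusive) copied from the <222> fields of SEQ ID NO: 16.
REP_CDS = (37, 1899)    # <223> AAV2 Rep
CAP_CDS = (1919, 4174)  # <223> AAV9 Cap
SSN_L = (3683, 3727)    # <223> SSN-L


def span_length(span):
    """Number of nucleotides in a 1-based, inclusive (start, end) span."""
    start, end = span
    return end - start + 1


def residue_number(nt_pos, cds_start):
    """1-based residue index of the codon that contains nucleotide nt_pos."""
    return (nt_pos - cds_start) // 3 + 1


# Hypothetical helper names; the arithmetic only restates the annotated ranges.
rep_codons = span_length(REP_CDS) // 3   # 621
cap_codons = span_length(CAP_CDS) // 3   # 752
insert_first = residue_number(SSN_L[0], CAP_CDS[0])  # 589
insert_last = residue_number(SSN_L[1], CAP_CDS[0])   # 603

print(rep_codons, cap_codons, insert_first, insert_last)
```

Under these assumptions, the Rep range spans 621 codons, matching the length given for SEQ ID NO: 17 (the annotated range ends just before the taa stop codon). The Cap range spans 752 codons, that is, the 751 residues of SEQ ID NO: 18 plus the terminal taa stop codon. The SSN-L feature maps to Cap residues 589 to 603, immediately after residue 588, which appears consistent with the "n.588" designation in the <223> description.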

<210> 17

<211> 621

<212> PRT

<213> artificial sequence

<220>

<223> synthetic construct

<400> 17

Met Pro Gly Phe Tyr Glu Ile Val Ile Lys Val Pro Ser Asp Leu Asp

1 5 10 15

Glu His Leu Pro Gly Ile Ser Asp Ser Phe Val Asn Trp Val Ala Glu

20 25 30

Lys Glu Trp Glu Leu Pro Pro Asp Ser Asp Met Asp Leu Asn Leu Ile

35 40 45

Glu Gln Ala Pro Leu Thr Val Ala Glu Lys Leu Gln Arg Asp Phe Leu

50 55 60

Thr Glu Trp Arg Arg Val Ser Lys Ala Pro Glu Ala Leu Phe Phe Val

65 70 75 80

Gln Phe Glu Lys Gly Glu Ser Tyr Phe His Met His Val Leu Val Glu

85 90 95

Thr Thr Gly Val Lys Ser Met Val Leu Gly Arg Phe Leu Ser Gln Ile

100 105 110

Arg Glu Lys Leu Ile Gln Arg Ile Tyr Arg Gly Ile Glu Pro Thr Leu

115 120 125

Pro Asn Trp Phe Ala Val Thr Lys Thr Arg Asn Gly Ala Gly Gly Gly

130 135 140

Asn Lys Val Val Asp Glu Cys Tyr Ile Pro Asn Tyr Leu Leu Pro Lys

145 150 155 160

Thr Gln Pro Glu Leu Gln Trp Ala Trp Thr Asn Met Glu Gln Tyr Leu

165 170 175

Ser Ala Cys Leu Asn Leu Thr Glu Arg Lys Arg Leu Val Ala Gln His

180 185 190

Leu Thr His Val Ser Gln Thr Gln Glu Gln Asn Lys Glu Asn Gln Asn

195 200 205

Pro Asn Ser Asp Ala Pro Val Ile Arg Ser Lys Thr Ser Ala Arg Tyr

210 215 220

Met Glu Leu Val Gly Trp Leu Val Asp Lys Gly Ile Thr Ser Glu Lys

225 230 235 240

Gln Trp Ile Gln Glu Asp Gln Ala Ser Tyr Ile Ser Phe Asn Ala Ala

245 250 255

Ser Asn Ser Arg Ser Gln Ile Lys Ala Ala Leu Asp Asn Ala Gly Lys

260 265 270

Ile Met Ser Leu Thr Lys Thr Ala Pro Asp Tyr Leu Val Gly Gln Gln

275 280 285

Pro Val Glu Asp Ile Ser Ser Asn Arg Ile Tyr Lys Ile Leu Glu Leu

290 295 300

Asn Gly Tyr Asp Pro Gln Tyr Ala Ala Ser Val Phe Leu Gly Trp Ala

305 310 315 320

Thr Lys Lys Phe Gly Lys Arg Asn Thr Ile Trp Leu Phe Gly Pro Ala

325 330 335

Thr Thr Gly Lys Thr Asn Ile Ala Glu Ala Ile Ala His Thr Val Pro

340 345 350

Phe Tyr Gly Cys Val Asn Trp Thr Asn Glu Asn Phe Pro Phe Asn Asp

355 360 365

Cys Val Asp Lys Met Val Ile Trp Trp Glu Glu Gly Lys Met Thr Ala

370 375 380

Lys Val Val Glu Ser Ala Lys Ala Ile Leu Gly Gly Ser Lys Val Arg

385 390 395 400

Val Asp Gln Lys Cys Lys Ser Ser Ala Gln Ile Asp Pro Thr Pro Val

405 410 415

Ile Val Thr Ser Asn Thr Asn Met Cys Ala Val Ile Asp Gly Asn Ser

420 425 430

Thr Thr Phe Glu His Gln Gln Pro Leu Gln Asp Arg Met Phe Lys Phe

435 440 445

Glu Leu Thr Arg Arg Leu Asp His Asp Phe Gly Lys Val Thr Lys Gln

450 455 460

Glu Val Lys Asp Phe Phe Arg Trp Ala Lys Asp His Val Val Glu Val

465 470 475 480

Glu His Glu Phe Tyr Val Lys Lys Gly Gly Ala Lys Lys Arg Pro Ala

485 490 495

Pro Ser Asp Ala Asp Ile Ser Glu Pro Lys Arg Val Arg Glu Ser Val

500 505 510

Ala Gln Pro Ser Thr Ser Asp Ala Glu Ala Ser Ile Asn Tyr Ala Asp

515 520 525

Arg Tyr Gln Asn Lys Cys Ser Arg His Val Gly Met Asn Leu Met Leu

530 535 540

Phe Pro Cys Arg Gln Cys Glu Arg Met Asn Gln Asn Ser Asn Ile Cys

545 550 555 560

Phe Thr His Gly Gln Lys Asp Cys Leu Glu Cys Phe Pro Val Ser Glu

565 570 575

Ser Gln Pro Val Ser Val Val Lys Lys Ala Tyr Gln Lys Leu Cys Tyr

580 585 590

Ile His His Ile Met Gly Lys Val Pro Asp Ala Cys Thr Ala Cys Asp

595 600 605

Leu Val Asn Val Asp Leu Asp Asp Cys Ile Phe Glu Gln

610 615 620

<210> 18

<211> 751

<212> PRT

<213> artificial sequence

<220>

<223> synthetic construct

<400> 18

Met Ala Ala Asp Gly Tyr Leu Pro Asp Trp Leu Glu Asp Asn Leu Ser

1 5 10 15

Glu Gly Ile Arg Glu Trp Trp Ala Leu Lys Pro Gly Ala Pro Gln Pro

20 25 30

Lys Ala Asn Gln Gln His Gln Asp Asn Ala Arg Gly Leu Val Leu Pro

35 40 45

Gly Tyr Lys Tyr Leu Gly Pro Gly Asn Gly Leu Asp Lys Gly Glu Pro

50 55 60

Val Asn Ala Ala Asp Ala Ala Ala Leu Glu His Asp Lys Ala Tyr Asp

65 70 75 80

Gln Gln Leu Lys Ala Gly Asp Asn Pro Tyr Leu Lys Tyr Asn His Ala

85 90 95

Asp Ala Glu Phe Gln Glu Arg Leu Lys Glu Asp Thr Ser Phe Gly Gly

100 105 110

Asn Leu Gly Arg Ala Val Phe Gln Ala Lys Lys Arg Leu Leu Glu Pro

115 120 125

Leu Gly Leu Val Glu Glu Ala Ala Lys Thr Ala Pro Gly Lys Lys Arg

130 135 140

Pro Val Glu Gln Ser Pro Gln Glu Pro Asp Ser Ser Ala Gly Ile Gly

145 150 155 160

Lys Ser Gly Ala Gln Pro Ala Lys Lys Arg Leu Asn Phe Gly Gln Thr

165 170 175

Gly Asp Thr Glu Ser Val Pro Asp Pro Gln Pro Ile Gly Glu Pro Pro

180 185 190

Ala Ala Pro Ser Gly Val Gly Ser Leu Thr Met Ala Ser Gly Gly Gly

195 200 205

Ala Pro Val Ala Asp Asn Asn Glu Gly Ala Asp Gly Val Gly Ser Ser

210 215 220

Ser Gly Asn Trp His Cys Asp Ser Gln Trp Leu Gly Asp Arg Val Ile

225 230 235 240

Thr Thr Ser Thr Arg Thr Trp Ala Leu Pro Thr Tyr Asn Asn His Leu

245 250 255

Tyr Lys Gln Ile Ser Asn Ser Thr Ser Gly Gly Ser Ser Asn Asp Asn

260 265 270

Ala Tyr Phe Gly Tyr Ser Thr Pro Trp Gly Tyr Phe Asp Phe Asn Arg

275 280 285

Phe His Cys His Phe Ser Pro Arg Asp Trp Gln Arg Leu Ile Asn Asn

290 295 300

Asn Trp Gly Phe Arg Pro Lys Arg Leu Asn Phe Lys Leu Phe Asn Ile

305 310 315 320

Gln Val Lys Glu Val Thr Asp Asn Asn Gly Val Lys Thr Ile Ala Asn

325 330 335

Asn Leu Thr Ser Thr Val Gln Val Phe Thr Asp Ser Asp Tyr Gln Leu

340 345 350

Pro Tyr Val Leu Gly Ser Ala His Glu Gly Cys Leu Pro Pro Phe Pro

355 360 365

Ala Asp Val Phe Met Ile Pro Gln Tyr Gly Tyr Leu Thr Leu Asn Asp

370 375 380

Gly Ser Gln Ala Val Gly Arg Ser Ser Phe Tyr Cys Leu Glu Tyr Phe

385 390 395 400

Pro Ser Gln Met Leu Arg Thr Gly Asn Asn Phe Gln Phe Ser Tyr Glu

405 410 415

Phe Glu Asn Val Pro Phe His Ser Ser Tyr Ala His Ser Gln Ser Leu

420 425 430

Asp Arg Leu Met Asn Pro Leu Ile Asp Gln Tyr Leu Tyr Tyr Leu Ser

435 440 445

Lys Thr Ile Asn Gly Ser Gly Gln Asn Gln Gln Thr Leu Lys Phe Ser

450 455 460

Val Ala Gly Pro Ser Asn Met Ala Val Gln Gly Arg Asn Tyr Ile Pro

465 470 475 480

Gly Pro Ser Tyr Arg Gln Gln Arg Val Ser Thr Thr Val Thr Gln Asn

485 490 495

Asn Asn Ser Glu Phe Ala Trp Pro Gly Ala Ser Ser Trp Ala Leu Asn

500 505 510

Gly Arg Asn Ser Leu Met Asn Pro Gly Pro Ala Met Ala Ser His Lys

515 520 525

Glu Gly Glu Asp Arg Phe Phe Pro Leu Ser Gly Ser Leu Ile Phe Gly

530 535 540

Lys Gln Gly Thr Gly Arg Asp Asn Val Asp Ala Asp Lys Val Met Ile

545 550 555 560

Thr Asn Glu Glu Glu Ile Lys Thr Thr Asn Pro Val Ala Thr Glu Ser

565 570 575

Tyr Gly Gln Val Ala Thr Asn His Gln Ser Ala Gln Gly Gly Ser Ser

580 585 590

Asn Thr Val Lys Leu Thr Ser Gly His Gly Gly Ala Gln Ala Gln Thr

595 600 605

Gly Trp Val Gln Asn Gln Gly Ile Leu Pro Gly Met Val Trp Gln Asp

610 615 620

Arg Asp Val Tyr Leu Gln Gly Pro Ile Trp Ala Lys Ile Pro His Thr

625 630 635 640

Asp Gly Asn Phe His Pro Ser Pro Leu Met Gly Gly Phe Gly Met Lys

645 650 655

His Pro Pro Pro Gln Ile Leu Ile Lys Asn Thr Pro Val Pro Ala Asp

660 665 670

Pro Pro Thr Ala Phe Asn Lys Asp Lys Leu Asn Ser Phe Ile Thr Gln

675 680 685

Tyr Ser Thr Gly Gln Val Ser Val Glu Ile Glu Trp Glu Leu Gln Lys

690 695 700

Glu Asn Ser Lys Arg Trp Asn Pro Glu Ile Gln Tyr Thr Ser Asn Tyr

705 710 715 720

Tyr Lys Ser Asn Asn Val Glu Phe Ala Val Asn Thr Glu Gly Val Tyr

725 730 735

Ser Glu Pro Arg Pro Ile Gly Thr Arg Tyr Leu Thr Arg Asn Leu

740 745 750
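As an illustrative check, the nucleotides annotated as SSN-L in SEQ ID NO: 16 (positions 3683 to 3727) can be translated and compared against the N-x-(T/I/V/A)-(K/R) motif recited in the abstract (SEQ ID NO: 47). This is a minimal sketch under stated assumptions: the standard genetic code is used, the motif is expressed as the regular expression `N.[TIVA][KR]`, and the names `translate`, `CODON_TABLE`, and `MOTIF` are hypothetical.

```python
import re
from itertools import product

# Standard genetic code in compact form, codons enumerated in t, c, a, g order.
BASES = "tcag"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def translate(dna):
    """Translate a DNA string (spaces ignored) codon by codon, one-letter output."""
    dna = dna.replace(" ", "").lower()
    usable = len(dna) - len(dna) % 3  # ignore any trailing partial codon
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))


# Nucleotides 3683..3727 of SEQ ID NO: 16, copied from the listing.
SSN_L_NT = "gga gga agc agc aac acc gtg aag ctg acc agc gga cac gga gga"

# Assumed regex rendering of the motif N-x-(T/I/V/A)-(K/R); x is any residue.
MOTIF = re.compile(r"N.[TIVA][KR]")

peptide = translate(SSN_L_NT)
match = MOTIF.search(peptide)
print(peptide, match.group(), match.start())
```

With these inputs the translated span reads GGSSNTVKLTSGHGG, which agrees with residues 589 to 603 of SEQ ID NO: 18, and the motif search finds NTVK (Asn-Thr-Val-Lys) starting at the fifth residue of the span.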

<210> 19

<211> 4740

<212> DNA

<213> artificial sequence

<220>

<223> AAV2/9 n.588.VLT-L expression cassette nucleic acid sequence

<220>

<221> misc_feature

<222> (1)..(36)

<223> truncated promoter

<220>

<221> promoter

<222> (1)..(7)

<223> p5 promoter

<220>

<221> CDS

<222> (37)..(1899)

<223> AAV2 Rep

<220>

<221> CDS

<222> (1919)..(4177)

<223> AAV9 Cap

<220>

<221> misc_feature

<222> (3683)..(3730)

<223> VLT-L

<220>

<221> misc_feature

<222> (4268)..(4398)

<223> p5 promoter

<220>

<221> misc_feature

<222> (4526)..(4740)

<223> LacZ promoter

<400> 19

ccattttgaa gcgggaggtt tgaacgcgca gccgcc atg ccg ggg ttt tac gag 54

Met Pro Gly Phe Tyr Glu

1 5

att gtg att aag gtc ccc agc gac ctt gac gag cat ctg ccc ggc att 102

Ile Val Ile Lys Val Pro Ser Asp Leu Asp Glu His Leu Pro Gly Ile

10 15 20

tct gac agc ttt gtg aac tgg gtg gcc gag aag gaa tgg gag ttg ccg 150

Ser Asp Ser Phe Val Asn Trp Val Ala Glu Lys Glu Trp Glu Leu Pro

25 30 35

cca gat tct gac atg gat ctg aat ctg att gag cag gca ccc ctg acc 198

Pro Asp Ser Asp Met Asp Leu Asn Leu Ile Glu Gln Ala Pro Leu Thr

40 45 50

gtg gcc gag aag ctg cag cgc gac ttt ctg acg gaa tgg cgc cgt gtg 246

Val Ala Glu Lys Leu Gln Arg Asp Phe Leu Thr Glu Trp Arg Arg Val

55 60 65 70

agt aag gcc ccg gag gct ctt ttc ttt gtg caa ttt gag aag gga gag 294

Ser Lys Ala Pro Glu Ala Leu Phe Phe Val Gln Phe Glu Lys Gly Glu

75 80 85

agc tac ttc cac atg cac gtg ctc gtg gaa acc acc ggg gtg aaa tcc 342

Ser Tyr Phe His Met His Val Leu Val Glu Thr Thr Gly Val Lys Ser

90 95 100

atg gtt ttg gga cgt ttc ctg agt cag att cgc gaa aaa ctg att cag 390

Met Val Leu Gly Arg Phe Leu Ser Gln Ile Arg Glu Lys Leu Ile Gln

105 110 115

aga att tac cgc ggg atc gag ccg act ttg cca aac tgg ttc gcg gtc 438

Arg Ile Tyr Arg Gly Ile Glu Pro Thr Leu Pro Asn Trp Phe Ala Val

120 125 130

aca aag acc aga aat ggc gcc gga ggc ggg aac aag gtg gtg gat gag 486

Thr Lys Thr Arg Asn Gly Ala Gly Gly Gly Asn Lys Val Val Asp Glu

135 140 145 150

tgc tac atc ccc aat tac ttg ctc ccc aaa acc cag cct gag ctc cag 534

Cys Tyr Ile Pro Asn Tyr Leu Leu Pro Lys Thr Gln Pro Glu Leu Gln

155 160 165

tgg gcg tgg act aat atg gaa cag tat tta agc gcc tgt ttg aat ctc 582

Trp Ala Trp Thr Asn Met Glu Gln Tyr Leu Ser Ala Cys Leu Asn Leu

170 175 180

acg gag cgt aaa cgg ttg gtg gcg cag cat ctg acg cac gtg tcg cag 630

Thr Glu Arg Lys Arg Leu Val Ala Gln His Leu Thr His Val Ser Gln

185 190 195

acg cag gag cag aac aaa gag aat cag aat ccc aat tct gat gcg ccg 678

Thr Gln Glu Gln Asn Lys Glu Asn Gln Asn Pro Asn Ser Asp Ala Pro

200 205 210

gtg atc aga tca aaa act tca gcc agg tac atg gag ctg gtc ggg tgg 726

Val Ile Arg Ser Lys Thr Ser Ala Arg Tyr Met Glu Leu Val Gly Trp

215 220 225 230

ctc gtg gac aag ggg att acc tcg gag aag cag tgg atc cag gag gac 774

Leu Val Asp Lys Gly Ile Thr Ser Glu Lys Gln Trp Ile Gln Glu Asp

235 240 245

cag gcc tca tac atc tcc ttc aat gcg gcc tcc aac tcg cgg tcc caa 822

Gln Ala Ser Tyr Ile Ser Phe Asn Ala Ala Ser Asn Ser Arg Ser Gln

250 255 260

atc aag gct gcc ttg gac aat gcg gga aag att atg agc ctg act aaa 870

Ile Lys Ala Ala Leu Asp Asn Ala Gly Lys Ile Met Ser Leu Thr Lys

265 270 275

acc gcc ccc gac tac ctg gtg ggc cag cag ccc gtg gag gac att tcc 918

Thr Ala Pro Asp Tyr Leu Val Gly Gln Gln Pro Val Glu Asp Ile Ser

280 285 290

agc aat cgg att tat aaa att ttg gaa cta aac ggg tac gat ccc caa 966

Ser Asn Arg Ile Tyr Lys Ile Leu Glu Leu Asn Gly Tyr Asp Pro Gln

295 300 305 310

tat gcg gct tcc gtc ttt ctg gga tgg gcc acg aaa aag ttc ggc aag 1014

Tyr Ala Ala Ser Val Phe Leu Gly Trp Ala Thr Lys Lys Phe Gly Lys

315 320 325

agg aac acc atc tgg ctg ttt ggg cct gca act acc ggg aag acc aac 1062

Arg Asn Thr Ile Trp Leu Phe Gly Pro Ala Thr Thr Gly Lys Thr Asn

330 335 340

atc gcg gag gcc ata gcc cac act gtg ccc ttc tac ggg tgc gta aac 1110

Ile Ala Glu Ala Ile Ala His Thr Val Pro Phe Tyr Gly Cys Val Asn

345 350 355

tgg acc aat gag aac ttt ccc ttc aac gac tgt gtc gac aag atg gtg 1158

Trp Thr Asn Glu Asn Phe Pro Phe Asn Asp Cys Val Asp Lys Met Val

360 365 370

atc tgg tgg gag gag ggg aag atg acc gcc aag gtc gtg gag tcg gcc 1206

Ile Trp Trp Glu Glu Gly Lys Met Thr Ala Lys Val Val Glu Ser Ala

375 380 385 390

aaa gcc att ctc gga gga agc aag gtg cgc gtg gac cag aaa tgc aag 1254

Lys Ala Ile Leu Gly Gly Ser Lys Val Arg Val Asp Gln Lys Cys Lys

395 400 405

tcc tcg gcc cag ata gac ccg act ccc gtg atc gtc acc tcc aac acc 1302

Ser Ser Ala Gln Ile Asp Pro Thr Pro Val Ile Val Thr Ser Asn Thr

410 415 420

aac atg tgc gcc gtg att gac ggg aac tca acg acc ttc gaa cac cag 1350

Asn Met Cys Ala Val Ile Asp Gly Asn Ser Thr Thr Phe Glu His Gln

425 430 435

cag ccg ttg caa gac cgg atg ttc aaa ttt gaa ctc acc cgc cgt ctg 1398

Gln Pro Leu Gln Asp Arg Met Phe Lys Phe Glu Leu Thr Arg Arg Leu

440 445 450

gat cat gac ttt ggg aag gtc acc aag cag gaa gtc aaa gac ttt ttc 1446

Asp His Asp Phe Gly Lys Val Thr Lys Gln Glu Val Lys Asp Phe Phe

455 460 465 470

cgg tgg gca aag gat cac gtg gtt gag gtg gag cat gaa ttc tac gtc 1494

Arg Trp Ala Lys Asp His Val Val Glu Val Glu His Glu Phe Tyr Val

475 480 485

aaa aag ggt gga gcc aag aaa aga ccc gcc ccc agt gac gca gat ata 1542

Lys Lys Gly Gly Ala Lys Lys Arg Pro Ala Pro Ser Asp Ala Asp Ile

490 495 500

agt gag ccc aaa cgg gtg cgc gag tca gtt gcg cag cca tcg acg tca 1590

Ser Glu Pro Lys Arg Val Arg Glu Ser Val Ala Gln Pro Ser Thr Ser

505 510 515

gac gcg gaa gct tcg atc aac tac gca gac agg tac caa aac aaa tgt 1638

Asp Ala Glu Ala Ser Ile Asn Tyr Ala Asp Arg Tyr Gln Asn Lys Cys

520 525 530

tct cgt cac gtg ggc atg aat ctg atg ctg ttt ccc tgc aga caa tgc 1686

Ser Arg His Val Gly Met Asn Leu Met Leu Phe Pro Cys Arg Gln Cys

535 540 545 550

gag aga atg aat cag aat tca aat atc tgc ttc act cac gga cag aaa 1734

Glu Arg Met Asn Gln Asn Ser Asn Ile Cys Phe Thr His Gly Gln Lys

555 560 565

gac tgt tta gag tgc ttt ccc gtg tca gaa tct caa ccc gtt tct gtc 1782

Asp Cys Leu Glu Cys Phe Pro Val Ser Glu Ser Gln Pro Val Ser Val

570 575 580

gtc aaa aag gcg tat cag aaa ctg tgc tac att cat cat atc atg gga 1830

Val Lys Lys Ala Tyr Gln Lys Leu Cys Tyr Ile His His Ile Met Gly

585 590 595

aag gtg cca gac gct tgc act gcc tgc gat ctg gtc aat gtg gat ttg 1878

Lys Val Pro Asp Ala Cys Thr Ala Cys Asp Leu Val Asn Val Asp Leu

600 605 610

gat gac tgc atc ttt gaa caa taaatgattt aaatcaggt atg gct gcc gat 1930

Asp Asp Cys Ile Phe Glu Gln Met Ala Ala Asp

615 620 625

ggt tat ctt cca gat tgg ctc gag gac aac ctt agt gaa gga att cgc 1978

Gly Tyr Leu Pro Asp Trp Leu Glu Asp Asn Leu Ser Glu Gly Ile Arg

630 635 640

gag tgg tgg gct ttg aaa cct gga gcc cct caa ccc aag gca aat caa 2026

Glu Trp Trp Ala Leu Lys Pro Gly Ala Pro Gln Pro Lys Ala Asn Gln

645 650 655

caa cat caa gac aac gct cga ggt ctt gtg ctt ccg ggt tac aaa tac 2074

Gln His Gln Asp Asn Ala Arg Gly Leu Val Leu Pro Gly Tyr Lys Tyr

660 665 670

ctt gga ccc ggc aac gga ctc gac aag ggg gag ccg gtc aac gca gca 2122

Leu Gly Pro Gly Asn Gly Leu Asp Lys Gly Glu Pro Val Asn Ala Ala

675 680 685

gac gcg gcg gcc ctc gag cac gac aag gcc tac gac cag cag ctc aag 2170

Asp Ala Ala Ala Leu Glu His Asp Lys Ala Tyr Asp Gln Gln Leu Lys

690 695 700 705

gcc gga gac aac ccg tac ctc aag tac aac cac gcc gac gcc gag ttc 2218

Ala Gly Asp Asn Pro Tyr Leu Lys Tyr Asn His Ala Asp Ala Glu Phe

710 715 720

cag gag cgg ctc aaa gaa gat acg tct ttt ggg ggc aac ctc ggg cga 2266

Gln Glu Arg Leu Lys Glu Asp Thr Ser Phe Gly Gly Asn Leu Gly Arg

725 730 735

gca gtc ttc cag gcc aaa aag agg ctt ctt gaa cct ctt ggt ctg gtt 2314

Ala Val Phe Gln Ala Lys Lys Arg Leu Leu Glu Pro Leu Gly Leu Val

740 745 750

gag gaa gcg gct aag acg gct cct gga aag aag agg cct gta gag cag 2362

Glu Glu Ala Ala Lys Thr Ala Pro Gly Lys Lys Arg Pro Val Glu Gln

755 760 765

tct cct cag gaa ccg gac tcc tcc gcg ggt att ggc aaa tcg ggt gca 2410

Ser Pro Gln Glu Pro Asp Ser Ser Ala Gly Ile Gly Lys Ser Gly Ala

770 775 780 785

cag ccc gct aaa aag aga ctc aat ttc ggt cag act ggc gac aca gag 2458

Gln Pro Ala Lys Lys Arg Leu Asn Phe Gly Gln Thr Gly Asp Thr Glu

790 795 800

tca gtc cca gac cct caa cca atc gga gaa cct ccc gca gcc ccc tca 2506

Ser Val Pro Asp Pro Gln Pro Ile Gly Glu Pro Pro Ala Ala Pro Ser

805 810 815

ggt gtg gga tct ctt aca atg gct tca ggt ggt ggc gca cca gtg gca 2554

Gly Val Gly Ser Leu Thr Met Ala Ser Gly Gly Gly Ala Pro Val Ala

820 825 830

gac aat aac gaa ggt gcc gat gga gtg ggt agt tcc tcg gga aat tgg 2602

Asp Asn Asn Glu Gly Ala Asp Gly Val Gly Ser Ser Ser Gly Asn Trp

835 840 845

cat tgc gat tcc caa tgg ctg ggg gac aga gtc atc acc acc agc acc 2650

His Cys Asp Ser Gln Trp Leu Gly Asp Arg Val Ile Thr Thr Ser Thr

850 855 860 865

cga acc tgg gcc ctg ccc acc tac aac aat cac ctc tac aag caa atc 2698

Arg Thr Trp Ala Leu Pro Thr Tyr Asn Asn His Leu Tyr Lys Gln Ile

870 875 880

tcc aac agc aca tct gga gga tct tca aat gac aac gcc tac ttc ggc 2746

Ser Asn Ser Thr Ser Gly Gly Ser Ser Asn Asp Asn Ala Tyr Phe Gly

885 890 895

tac agc acc ccc tgg ggg tat ttt gac ttc aac aga ttc cac tgc cac 2794

Tyr Ser Thr Pro Trp Gly Tyr Phe Asp Phe Asn Arg Phe His Cys His

900 905 910

ttc tca cca cgt gac tgg cag cga ctc atc aac aac aac tgg gga ttc 2842

Phe Ser Pro Arg Asp Trp Gln Arg Leu Ile Asn Asn Asn Trp Gly Phe

915 920 925

cgg cct aag cga ctc aac ttc aag ctc ttc aac att cag gtc aaa gag 2890

Arg Pro Lys Arg Leu Asn Phe Lys Leu Phe Asn Ile Gln Val Lys Glu

930 935 940 945

gtt acg gac aac aat gga gtc aag acc atc gcc aat aac ctt acc agc 2938

Val Thr Asp Asn Asn Gly Val Lys Thr Ile Ala Asn Asn Leu Thr Ser

950 955 960

acg gtc cag gtc ttc acg gac tca gac tat cag ctc ccg tac gtg ctc 2986

Thr Val Gln Val Phe Thr Asp Ser Asp Tyr Gln Leu Pro Tyr Val Leu

965 970 975

ggg tcg gct cac gag ggc tgc ctc ccg ccg ttc cca gcg gac gtt ttc 3034

Gly Ser Ala His Glu Gly Cys Leu Pro Pro Phe Pro Ala Asp Val Phe

980 985 990

atg att cct cag tac ggg tat ctg acg ctt aat gat gga agc cag gcc 3082

Met Ile Pro Gln Tyr Gly Tyr Leu Thr Leu Asn Asp Gly Ser Gln Ala

995 1000 1005

gtg ggt cgt tcg tcc ttt tac tgc ctg gaa tat ttc ccg tcg caa 3127

Val Gly Arg Ser Ser Phe Tyr Cys Leu Glu Tyr Phe Pro Ser Gln

1010 1015 1020

atg cta aga acg ggt aac aac ttc cag ttc agc tac gag ttt gag 3172

Met Leu Arg Thr Gly Asn Asn Phe Gln Phe Ser Tyr Glu Phe Glu

1025 1030 1035

aac gta cct ttc cat agc agc tac gct cac agc caa agc ctg gac 3217

Asn Val Pro Phe His Ser Ser Tyr Ala His Ser Gln Ser Leu Asp

1040 1045 1050

cga cta atg aat cca ctc atc gac caa tac ttg tac tat ctc tca 3262

Arg Leu Met Asn Pro Leu Ile Asp Gln Tyr Leu Tyr Tyr Leu Ser

1055 1060 1065

aag act att aac ggt tct gga cag aat caa caa acg cta aaa ttc 3307

Lys Thr Ile Asn Gly Ser Gly Gln Asn Gln Gln Thr Leu Lys Phe

1070 1075 1080

agt gtg gcc gga ccc agc aac atg gct gtc cag gga aga aac tac 3352

Ser Val Ala Gly Pro Ser Asn Met Ala Val Gln Gly Arg Asn Tyr

1085 1090 1095

ata cct gga ccc agc tac cga caa caa cgt gtc tca acc act gtg 3397

Ile Pro Gly Pro Ser Tyr Arg Gln Gln Arg Val Ser Thr Thr Val

1100 1105 1110

act caa aac aac aac agc gaa ttt gct tgg cct gga gct tct tct 3442

Thr Gln Asn Asn Asn Ser Glu Phe Ala Trp Pro Gly Ala Ser Ser

1115 1120 1125

tgg gct ctc aat gga cgt aat agc ttg atg aat cct gga cct gct 3487

Trp Ala Leu Asn Gly Arg Asn Ser Leu Met Asn Pro Gly Pro Ala

1130 1135 1140

atg gcc agc cac aaa gaa gga gag gac cgt ttc ttt cct ttg tct 3532

Met Ala Ser His Lys Glu Gly Glu Asp Arg Phe Phe Pro Leu Ser

1145 1150 1155

gga tct tta att ttt ggc aaa caa gga act gga aga gac aac gtg 3577

Gly Ser Leu Ile Phe Gly Lys Gln Gly Thr Gly Arg Asp Asn Val

1160 1165 1170

gat gcg gac aaa gtc atg ata acc aac gaa gaa gaa att aaa act 3622

Asp Ala Asp Lys Val Met Ile Thr Asn Glu Glu Glu Ile Lys Thr

1175 1180 1185

act aac ccg gta gca acg gag tcc tat gga caa gtg gcc aca aac 3667

Thr Asn Pro Val Ala Thr Glu Ser Tyr Gly Gln Val Ala Thr Asn

1190 1195 1200

cac cag agt gcc caa gga gga gtg ctg acc aac atc gct aga gga 3712

His Gln Ser Ala Gln Gly Gly Val Leu Thr Asn Ile Ala Arg Gly

1205 1210 1215

gag tac atg aga gga gga gca cag gcg cag acc ggc tgg gtt caa 3757

Glu Tyr Met Arg Gly Gly Ala Gln Ala Gln Thr Gly Trp Val Gln

1220 1225 1230

aac caa gga ata ctt ccg ggt atg gtt tgg cag gac aga gat gtg 3802

Asn Gln Gly Ile Leu Pro Gly Met Val Trp Gln Asp Arg Asp Val

1235 1240 1245

tac ctg caa gga ccc att tgg gcc aaa att cct cac acg gac ggc 3847

Tyr Leu Gln Gly Pro Ile Trp Ala Lys Ile Pro His Thr Asp Gly

1250 1255 1260

aac ttt cac cct tct ccg ctg atg gga ggg ttt gga atg aag cac 3892

Asn Phe His Pro Ser Pro Leu Met Gly Gly Phe Gly Met Lys His

1265 1270 1275

ccg cct cct cag atc ctc atc aaa aac aca cct gta cct gcg gat 3937

Pro Pro Pro Gln Ile Leu Ile Lys Asn Thr Pro Val Pro Ala Asp

1280 1285 1290

cct cca acg gcc ttc aac aag gac aag ctg aac tct ttc atc acc 3982

Pro Pro Thr Ala Phe Asn Lys Asp Lys Leu Asn Ser Phe Ile Thr

1295 1300 1305

cag tat tct act ggc caa gtc agc gtg gag atc gag tgg gag ctg 4027

Gln Tyr Ser Thr Gly Gln Val Ser Val Glu Ile Glu Trp Glu Leu

1310 1315 1320

cag aag gaa aac agc aag cgc tgg aac ccg gag atc cag tac act 4072

Gln Lys Glu Asn Ser Lys Arg Trp Asn Pro Glu Ile Gln Tyr Thr

1325 1330 1335

tcc aac tat tac aag tct aat aat gtt gaa ttt gct gtt aat act 4117

Ser Asn Tyr Tyr Lys Ser Asn Asn Val Glu Phe Ala Val Asn Thr

1340 1345 1350

gaa ggt gta tat agt gaa ccc cgc ccc att ggc acc aga tac ctg 4162

Glu Gly Val Tyr Ser Glu Pro Arg Pro Ile Gly Thr Arg Tyr Leu

1355 1360 1365

act cgt aat ctg taa ttgcttgtta atcaataaac cgtttaattc gtttcagttg 4217

Thr Arg Asn Leu

1370

aactttggtc tctgcgaagg gcgaattcgt ttaaacctgc aggactagag gtcctgtatt 4277

agaggtcacg tgagtgtttt gcgacatttt gcgacaccat gtggtcacgc tgggtattta 4337

agcccgagtg agcacgcagg gtctccattt tgaagcggga ggtttgaacg cgcagccgcc 4397

aagccgaatt ctgcagatat ccatcacact ggcggccgct cgactagagc ggccgccacc 4457

gcggtggagc tccagctttt gttcccttta gtgagggtta attgcgcgct tggcgtaatc 4517

atggtcatag ctgtttcctg tgtgaaattg ttatccgctc acaattccac acaacatacg 4577

agccggaagc ataaagtgta aagcctgggg tgcctaatga gtgagctaac tcacattaat 4637

tgcgttgcgc tcactgcccg ctttccagtc gggaaacctg tcgtgccagc tgcattaatg 4697

aatcggccaa cgcgcgggga gaggcggttt gcgtattggg cgc 4740

<210> 20

<211> 621

<212> PRT

<213> artificial sequence

<220>

<223> synthetic construct

<400> 20

Met Pro Gly Phe Tyr Glu Ile Val Ile Lys Val Pro Ser Asp Leu Asp

1 5 10 15

Glu His Leu Pro Gly Ile Ser Asp Ser Phe Val Asn Trp Val Ala Glu

20 25 30

Lys Glu Trp Glu Leu Pro Pro Asp Ser Asp Met Asp Leu Asn Leu Ile

35 40 45

Glu Gln Ala Pro Leu Thr Val Ala Glu Lys Leu Gln Arg Asp Phe Leu

50 55 60

Thr Glu Trp Arg Arg Val Ser Lys Ala Pro Glu Ala Leu Phe Phe Val

65 70 75 80

Gln Phe Glu Lys Gly Glu Ser Tyr Phe His Met His Val Leu Val Glu

85 90 95

Thr Thr Gly Val Lys Ser Met Val Leu Gly Arg Phe Leu Ser Gln Ile

100 105 110

Arg Glu Lys Leu Ile Gln Arg Ile Tyr Arg Gly Ile Glu Pro Thr Leu

115 120 125

Pro Asn Trp Phe Ala Val Thr Lys Thr Arg Asn Gly Ala Gly Gly Gly

130 135 140

Asn Lys Val Val Asp Glu Cys Tyr Ile Pro Asn Tyr Leu Leu Pro Lys

145 150 155 160

Thr Gln Pro Glu Leu Gln Trp Ala Trp Thr Asn Met Glu Gln Tyr Leu

165 170 175

Ser Ala Cys Leu Asn Leu Thr Glu Arg Lys Arg Leu Val Ala Gln His

180 185 190

Leu Thr His Val Ser Gln Thr Gln Glu Gln Asn Lys Glu Asn Gln Asn

195 200 205

Pro Asn Ser Asp Ala Pro Val Ile Arg Ser Lys Thr Ser Ala Arg Tyr

210 215 220

Met Glu Leu Val Gly Trp Leu Val Asp Lys Gly Ile Thr Ser Glu Lys

225 230 235 240

Gln Trp Ile Gln Glu Asp Gln Ala Ser Tyr Ile Ser Phe Asn Ala Ala

245 250 255

Ser Asn Ser Arg Ser Gln Ile Lys Ala Ala Leu Asp Asn Ala Gly Lys

260 265 270

Ile Met Ser Leu Thr Lys Thr Ala Pro Asp Tyr Leu Val Gly Gln Gln

275 280 285

Pro Val Glu Asp Ile Ser Ser Asn Arg Ile Tyr Lys Ile Leu Glu Leu

290 295 300

Asn Gly Tyr Asp Pro Gln Tyr Ala Ala Ser Val Phe Leu Gly Trp Ala

305 310 315 320

Thr Lys Lys Phe Gly Lys Arg Asn Thr Ile Trp Leu Phe Gly Pro Ala

325 330 335

Thr Thr Gly Lys Thr Asn Ile Ala Glu Ala Ile Ala His Thr Val Pro

340 345 350

Phe Tyr Gly Cys Val Asn Trp Thr Asn Glu Asn Phe Pro Phe Asn Asp

355 360 365

Cys Val Asp Lys Met Val Ile Trp Trp Glu Glu Gly Lys Met Thr Ala

370 375 380

Lys Val Val Glu Ser Ala Lys Ala Ile Leu Gly Gly Ser Lys Val Arg

385 390 395 400

Val Asp Gln Lys Cys Lys Ser Ser Ala Gln Ile Asp Pro Thr Pro Val

405 410 415

Ile Val Thr Ser Asn Thr Asn Met Cys Ala Val Ile Asp Gly Asn Ser

420 425 430

Thr Thr Phe Glu His Gln Gln Pro Leu Gln Asp Arg Met Phe Lys Phe

435 440 445

Glu Leu Thr Arg Arg Leu Asp His Asp Phe Gly Lys Val Thr Lys Gln

450 455 460

Glu Val Lys Asp Phe Phe Arg Trp Ala Lys Asp His Val Val Glu Val

465 470 475 480

Glu His Glu Phe Tyr Val Lys Lys Gly Gly Ala Lys Lys Arg Pro Ala

485 490 495

Pro Ser Asp Ala Asp Ile Ser Glu Pro Lys Arg Val Arg Glu Ser Val

500 505 510

Ala Gln Pro Ser Thr Ser Asp Ala Glu Ala Ser Ile Asn Tyr Ala Asp

515 520 525

Arg Tyr Gln Asn Lys Cys Ser Arg His Val Gly Met Asn Leu Met Leu

530 535 540

Phe Pro Cys Arg Gln Cys Glu Arg Met Asn Gln Asn Ser Asn Ile Cys

545 550 555 560

Phe Thr His Gly Gln Lys Asp Cys Leu Glu Cys Phe Pro Val Ser Glu

565 570 575

Ser Gln Pro Val Ser Val Val Lys Lys Ala Tyr Gln Lys Leu Cys Tyr

580 585 590

Ile His His Ile Met Gly Lys Val Pro Asp Ala Cys Thr Ala Cys Asp

595 600 605

Leu Val Asn Val Asp Leu Asp Asp Cys Ile Phe Glu Gln

610 615 620

<210> 21

<211> 752

<212> PRT

<213> artificial sequence

<220>

<223> synthetic construct

<400> 21

Met Ala Ala Asp Gly Tyr Leu Pro Asp Trp Leu Glu Asp Asn Leu Ser

1 5 10 15

Glu Gly Ile Arg Glu Trp Trp Ala Leu Lys Pro Gly Ala Pro Gln Pro

20 25 30

Lys Ala Asn Gln Gln His Gln Asp Asn Ala Arg Gly Leu Val Leu Pro

35 40 45

Gly Tyr Lys Tyr Leu Gly Pro Gly Asn Gly Leu Asp Lys Gly Glu Pro

50 55 60

Val Asn Ala Ala Asp Ala Ala Ala Leu Glu His Asp Lys Ala Tyr Asp

65 70 75 80

Gln Gln Leu Lys Ala Gly Asp Asn Pro Tyr Leu Lys Tyr Asn His Ala

85 90 95

Asp Ala Glu Phe Gln Glu Arg Leu Lys Glu Asp Thr Ser Phe Gly Gly

100 105 110

Asn Leu Gly Arg Ala Val Phe Gln Ala Lys Lys Arg Leu Leu Glu Pro

115 120 125

Leu Gly Leu Val Glu Glu Ala Ala Lys Thr Ala Pro Gly Lys Lys Arg

130 135 140

Pro Val Glu Gln Ser Pro Gln Glu Pro Asp Ser Ser Ala Gly Ile Gly

145 150 155 160

Lys Ser Gly Ala Gln Pro Ala Lys Lys Arg Leu Asn Phe Gly Gln Thr

165 170 175

Gly Asp Thr Glu Ser Val Pro Asp Pro Gln Pro Ile Gly Glu Pro Pro

180 185 190

Ala Ala Pro Ser Gly Val Gly Ser Leu Thr Met Ala Ser Gly Gly Gly

195 200 205

Ala Pro Val Ala Asp Asn Asn Glu Gly Ala Asp Gly Val Gly Ser Ser

210 215 220

Ser Gly Asn Trp His Cys Asp Ser Gln Trp Leu Gly Asp Arg Val Ile

225 230 235 240

Thr Thr Ser Thr Arg Thr Trp Ala Leu Pro Thr Tyr Asn Asn His Leu

245 250 255

Tyr Lys Gln Ile Ser Asn Ser Thr Ser Gly Gly Ser Ser Asn Asp Asn

260 265 270

Ala Tyr Phe Gly Tyr Ser Thr Pro Trp Gly Tyr Phe Asp Phe Asn Arg

275 280 285

Phe His Cys His Phe Ser Pro Arg Asp Trp Gln Arg Leu Ile Asn Asn

290 295 300

Asn Trp Gly Phe Arg Pro Lys Arg Leu Asn Phe Lys Leu Phe Asn Ile

305 310 315 320

Gln Val Lys Glu Val Thr Asp Asn Asn Gly Val Lys Thr Ile Ala Asn

325 330 335

Asn Leu Thr Ser Thr Val Gln Val Phe Thr Asp Ser Asp Tyr Gln Leu

340 345 350

Pro Tyr Val Leu Gly Ser Ala His Glu Gly Cys Leu Pro Pro Phe Pro

355 360 365

Ala Asp Val Phe Met Ile Pro Gln Tyr Gly Tyr Leu Thr Leu Asn Asp

370 375 380

Gly Ser Gln Ala Val Gly Arg Ser Ser Phe Tyr Cys Leu Glu Tyr Phe

385 390 395 400

Pro Ser Gln Met Leu Arg Thr Gly Asn Asn Phe Gln Phe Ser Tyr Glu

405 410 415

Phe Glu Asn Val Pro Phe His Ser Ser Tyr Ala His Ser Gln Ser Leu

420 425 430

Asp Arg Leu Met Asn Pro Leu Ile Asp Gln Tyr Leu Tyr Tyr Leu Ser

435 440 445

Lys Thr Ile Asn Gly Ser Gly Gln Asn Gln Gln Thr Leu Lys Phe Ser

450 455 460

Val Ala Gly Pro Ser Asn Met Ala Val Gln Gly Arg Asn Tyr Ile Pro

465 470 475 480

Gly Pro Ser Tyr Arg Gln Gln Arg Val Ser Thr Thr Val Thr Gln Asn

485 490 495

Asn Asn Ser Glu Phe Ala Trp Pro Gly Ala Ser Ser Trp Ala Leu Asn

500 505 510

Gly Arg Asn Ser Leu Met Asn Pro Gly Pro Ala Met Ala Ser His Lys

515 520 525

Glu Gly Glu Asp Arg Phe Phe Pro Leu Ser Gly Ser Leu Ile Phe Gly

530 535 540

Lys Gln Gly Thr Gly Arg Asp Asn Val Asp Ala Asp Lys Val Met Ile

545 550 555 560

Thr Asn Glu Glu Glu Ile Lys Thr Thr Asn Pro Val Ala Thr Glu Ser

565 570 575

Tyr Gly Gln Val Ala Thr Asn His Gln Ser Ala Gln Gly Gly Val Leu

580 585 590

Thr Asn Ile Ala Arg Gly Glu Tyr Met Arg Gly Gly Ala Gln Ala Gln

595 600 605

Thr Gly Trp Val Gln Asn Gln Gly Ile Leu Pro Gly Met Val Trp Gln

610 615 620

Asp Arg Asp Val Tyr Leu Gln Gly Pro Ile Trp Ala Lys Ile Pro His

625 630 635 640

Thr Asp Gly Asn Phe His Pro Ser Pro Leu Met Gly Gly Phe Gly Met

645 650 655

Lys His Pro Pro Pro Gln Ile Leu Ile Lys Asn Thr Pro Val Pro Ala

660 665 670

Asp Pro Pro Thr Ala Phe Asn Lys Asp Lys Leu Asn Ser Phe Ile Thr

675 680 685

Gln Tyr Ser Thr Gly Gln Val Ser Val Glu Ile Glu Trp Glu Leu Gln

690 695 700

Lys Glu Asn Ser Lys Arg Trp Asn Pro Glu Ile Gln Tyr Thr Ser Asn

705 710 715 720

Tyr Tyr Lys Ser Asn Asn Val Glu Phe Ala Val Asn Thr Glu Gly Val

725 730 735

Tyr Ser Glu Pro Arg Pro Ile Gly Thr Arg Tyr Leu Thr Arg Asn Leu

740 745 750

<210> 22

<211> 1863

<212> DNA

<213> artificial sequence

<220>

<223> AAV2 Rep nucleic acid sequence

<400> 22

atgccggggt tttacgagat tgtgattaag gtccccagcg accttgacga gcatctgccc 60

ggcatttctg acagctttgt gaactgggtg gccgagaagg aatgggagtt gccgccagat 120

tctgacatgg atctgaatct gattgagcag gcacccctga ccgtggccga gaagctgcag 180

cgcgactttc tgacggaatg gcgccgtgtg agtaaggccc cggaggctct tttctttgtg 240

caatttgaga agggagagag ctacttccac atgcacgtgc tcgtggaaac caccggggtg 300

aaatccatgg ttttgggacg tttcctgagt cagattcgcg aaaaactgat tcagagaatt 360

taccgcggga tcgagccgac tttgccaaac tggttcgcgg tcacaaagac cagaaatggc 420

gccggaggcg ggaacaaggt ggtggatgag tgctacatcc ccaattactt gctccccaaa 480

acccagcctg agctccagtg ggcgtggact aatatggaac agtatttaag cgcctgtttg 540

aatctcacgg agcgtaaacg gttggtggcg cagcatctga cgcacgtgtc gcagacgcag 600

gagcagaaca aagagaatca gaatcccaat tctgatgcgc cggtgatcag atcaaaaact 660

tcagccaggt acatggagct ggtcgggtgg ctcgtggaca aggggattac ctcggagaag 720

cagtggatcc aggaggacca ggcctcatac atctccttca atgcggcctc caactcgcgg 780

tcccaaatca aggctgcctt ggacaatgcg ggaaagatta tgagcctgac taaaaccgcc 840

cccgactacc tggtgggcca gcagcccgtg gaggacattt ccagcaatcg gatttataaa 900

attttggaac taaacgggta cgatccccaa tatgcggctt ccgtctttct gggatgggcc 960

acgaaaaagt tcggcaagag gaacaccatc tggctgtttg ggcctgcaac taccgggaag 1020

accaacatcg cggaggccat agcccacact gtgcccttct acgggtgcgt aaactggacc 1080

aatgagaact ttcccttcaa cgactgtgtc gacaagatgg tgatctggtg ggaggagggg 1140

aagatgaccg ccaaggtcgt ggagtcggcc aaagccattc tcggaggaag caaggtgcgc 1200

gtggaccaga aatgcaagtc ctcggcccag atagacccga ctcccgtgat cgtcacctcc 1260

aacaccaaca tgtgcgccgt gattgacggg aactcaacga ccttcgaaca ccagcagccg 1320

ttgcaagacc ggatgttcaa atttgaactc acccgccgtc tggatcatga ctttgggaag 1380

gtcaccaagc aggaagtcaa agactttttc cggtgggcaa aggatcacgt ggttgaggtg 1440

gagcatgaat tctacgtcaa aaagggtgga gccaagaaaa gacccgcccc cagtgacgca 1500

gatataagtg agcccaaacg ggtgcgcgag tcagttgcgc agccatcgac gtcagacgcg 1560

gaagcttcga tcaactacgc agacaggtac caaaacaaat gttctcgtca cgtgggcatg 1620

aatctgatgc tgtttccctg cagacaatgc gagagaatga atcagaattc aaatatctgc 1680

ttcactcacg gacagaaaga ctgtttagag tgctttcccg tgtcagaatc tcaacccgtt 1740

tctgtcgtca aaaaggcgta tcagaaactg tgctacattc atcatatcat gggaaaggtg 1800

ccagacgctt gcactgcctg cgatctggtc aatgtggatt tggatgactg catctttgaa 1860

caa 1863

<210> 23

<211> 621

<212> PRT

<213> artificial sequence

<220>

<223> AAV2 Rep amino acid sequence

<400> 23

Met Pro Gly Phe Tyr Glu Ile Val Ile Lys Val Pro Ser Asp Leu Asp

1 5 10 15

Glu His Leu Pro Gly Ile Ser Asp Ser Phe Val Asn Trp Val Ala Glu

20 25 30

Lys Glu Trp Glu Leu Pro Pro Asp Ser Asp Met Asp Leu Asn Leu Ile

35 40 45

Glu Gln Ala Pro Leu Thr Val Ala Glu Lys Leu Gln Arg Asp Phe Leu

50 55 60

Thr Glu Trp Arg Arg Val Ser Lys Ala Pro Glu Ala Leu Phe Phe Val

65 70 75 80

Gln Phe Glu Lys Gly Glu Ser Tyr Phe His Met His Val Leu Val Glu

85 90 95

Thr Thr Gly Val Lys Ser Met Val Leu Gly Arg Phe Leu Ser Gln Ile

100 105 110

Arg Glu Lys Leu Ile Gln Arg Ile Tyr Arg Gly Ile Glu Pro Thr Leu

115 120 125

Pro Asn Trp Phe Ala Val Thr Lys Thr Arg Asn Gly Ala Gly Gly Gly

130 135 140

Asn Lys Val Val Asp Glu Cys Tyr Ile Pro Asn Tyr Leu Leu Pro Lys

145 150 155 160

Thr Gln Pro Glu Leu Gln Trp Ala Trp Thr Asn Met Glu Gln Tyr Leu

165 170 175

Ser Ala Cys Leu Asn Leu Thr Glu Arg Lys Arg Leu Val Ala Gln His

180 185 190

Leu Thr His Val Ser Gln Thr Gln Glu Gln Asn Lys Glu Asn Gln Asn

195 200 205

Pro Asn Ser Asp Ala Pro Val Ile Arg Ser Lys Thr Ser Ala Arg Tyr

210 215 220

Met Glu Leu Val Gly Trp Leu Val Asp Lys Gly Ile Thr Ser Glu Lys

225 230 235 240

Gln Trp Ile Gln Glu Asp Gln Ala Ser Tyr Ile Ser Phe Asn Ala Ala

245 250 255

Ser Asn Ser Arg Ser Gln Ile Lys Ala Ala Leu Asp Asn Ala Gly Lys

260 265 270

Ile Met Ser Leu Thr Lys Thr Ala Pro Asp Tyr Leu Val Gly Gln Gln

275 280 285

Pro Val Glu Asp Ile Ser Ser Asn Arg Ile Tyr Lys Ile Leu Glu Leu

290 295 300

Asn Gly Tyr Asp Pro Gln Tyr Ala Ala Ser Val Phe Leu Gly Trp Ala

305 310 315 320

Thr Lys Lys Phe Gly Lys Arg Asn Thr Ile Trp Leu Phe Gly Pro Ala

325 330 335

Thr Thr Gly Lys Thr Asn Ile Ala Glu Ala Ile Ala His Thr Val Pro

340 345 350

Phe Tyr Gly Cys Val Asn Trp Thr Asn Glu Asn Phe Pro Phe Asn Asp

355 360 365

Cys Val Asp Lys Met Val Ile Trp Trp Glu Glu Gly Lys Met Thr Ala

370 375 380

Lys Val Val Glu Ser Ala Lys Ala Ile Leu Gly Gly Ser Lys Val Arg

385 390 395 400

Val Asp Gln Lys Cys Lys Ser Ser Ala Gln Ile Asp Pro Thr Pro Val

405 410 415

Ile Val Thr Ser Asn Thr Asn Met Cys Ala Val Ile Asp Gly Asn Ser

420 425 430

Thr Thr Phe Glu His Gln Gln Pro Leu Gln Asp Arg Met Phe Lys Phe

435 440 445

Glu Leu Thr Arg Arg Leu Asp His Asp Phe Gly Lys Val Thr Lys Gln

450 455 460

Glu Val Lys Asp Phe Phe Arg Trp Ala Lys Asp His Val Val Glu Val

465 470 475 480

Glu His Glu Phe Tyr Val Lys Lys Gly Gly Ala Lys Lys Arg Pro Ala

485 490 495

Pro Ser Asp Ala Asp Ile Ser Glu Pro Lys Arg Val Arg Glu Ser Val

500 505 510

Ala Gln Pro Ser Thr Ser Asp Ala Glu Ala Ser Ile Asn Tyr Ala Asp

515 520 525

Arg Tyr Gln Asn Lys Cys Ser Arg His Val Gly Met Asn Leu Met Leu

530 535 540

Phe Pro Cys Arg Gln Cys Glu Arg Met Asn Gln Asn Ser Asn Ile Cys

545 550 555 560

Phe Thr His Gly Gln Lys Asp Cys Leu Glu Cys Phe Pro Val Ser Glu

565 570 575

Ser Gln Pro Val Ser Val Val Lys Lys Ala Tyr Gln Lys Leu Cys Tyr

580 585 590

Ile His His Ile Met Gly Lys Val Pro Asp Ala Cys Thr Ala Cys Asp

595 600 605

Leu Val Asn Val Asp Leu Asp Asp Cys Ile Phe Glu Gln

610 615 620

<210> 24

<211> 2244

<212> DNA

<213> artificial sequence

<220>

<223> AAV9 Cap n.588.EFS nucleic acid sequence

<220>

<221> misc_feature

<222> (1765)..(1797)

<223> EFS

<400> 24

atggctgccg atggttatct tccagattgg ctcgaggaca accttagtga aggaattcgc 60

gagtggtggg ctttgaaacc tggagcccct caacccaagg caaatcaaca acatcaagac 120

aacgctcgag gtcttgtgct tccgggttac aaataccttg gacccggcaa cggactcgac 180

aagggggagc cggtcaacgc agcagacgcg gcggccctcg agcacgacaa ggcctacgac 240

cagcagctca aggccggaga caacccgtac ctcaagtaca accacgccga cgccgagttc 300

caggagcggc tcaaagaaga tacgtctttt gggggcaacc tcgggcgagc agtcttccag 360

gccaaaaaga ggcttcttga acctcttggt ctggttgagg aagcggctaa gacggctcct 420

ggaaagaaga ggcctgtaga gcagtctcct caggaaccgg actcctccgc gggtattggc 480

aaatcgggtg cacagcccgc taaaaagaga ctcaatttcg gtcagactgg cgacacagag 540

tcagtcccag accctcaacc aatcggagaa cctcccgcag ccccctcagg tgtgggatct 600

cttacaatgg cttcaggtgg tggcgcacca gtggcagaca ataacgaagg tgccgatgga 660

gtgggtagtt cctcgggaaa ttggcattgc gattcccaat ggctggggga cagagtcatc 720

accaccagca cccgaacctg ggccctgccc acctacaaca atcacctcta caagcaaatc 780

tccaacagca catctggagg atcttcaaat gacaacgcct acttcggcta cagcaccccc 840

tgggggtatt ttgacttcaa cagattccac tgccacttct caccacgtga ctggcagcga 900

ctcatcaaca acaactgggg attccggcct aagcgactca acttcaagct cttcaacatt 960

caggtcaaag aggttacgga caacaatgga gtcaagacca tcgccaataa ccttaccagc 1020

acggtccagg tcttcacgga ctcagactat cagctcccgt acgtgctcgg gtcggctcac 1080

gagggctgcc tcccgccgtt cccagcggac gttttcatga ttcctcagta cgggtatctg 1140

acgcttaatg atggaagcca ggccgtgggt cgttcgtcct tttactgcct ggaatatttc 1200

ccgtcgcaaa tgctaagaac gggtaacaac ttccagttca gctacgagtt tgagaacgta 1260

cctttccata gcagctacgc tcacagccaa agcctggacc gactaatgaa tccactcatc 1320

gaccaatact tgtactatct ctcaaagact attaacggtt ctggacagaa tcaacaaacg 1380

ctaaaattca gtgtggccgg acccagcaac atggctgtcc agggaagaaa ctacatacct 1440

ggacccagct accgacaaca acgtgtctca accactgtga ctcaaaacaa caacagcgaa 1500

tttgcttggc ctggagcttc ttcttgggct ctcaatggac gtaatagctt gatgaatcct 1560

ggacctgcta tggccagcca caaagaagga gaggaccgtt tctttccttt gtctggatct 1620

ttaatttttg gcaaacaagg aactggaaga gacaacgtgg atgcggacaa agtcatgata 1680

accaacgaag aagaaattaa aactactaac ccggtagcaa cggagtccta tggacaagtg 1740

gccacaaacc accagagtgc ccaagagttc agcagcaaca ccgtgaagct gaccagcgca 1800

caggcgcaga ccggctgggt tcaaaaccaa ggaatacttc cgggtatggt ttggcaggac 1860

agagatgtgt acctgcaagg acccatttgg gccaaaattc ctcacacgga cggcaacttt 1920

cacccttctc cgctgatggg agggtttgga atgaagcacc cgcctcctca gatcctcatc 1980

aaaaacacac ctgtacctgc ggatcctcca acggccttca acaaggacaa gctgaactct 2040

ttcatcaccc agtattctac tggccaagtc agcgtggaga tcgagtggga gctgcagaag 2100

gaaaacagca agcgctggaa cccggagatc cagtacactt ccaactatta caagtctaat 2160

aatgttgaat ttgctgttaa tactgaaggt gtatatagtg aaccccgccc cattggcacc 2220

agatacctga ctcgtaatct gtaa 2244

<210> 25

<211> 747

<212> PRT

<213> artificial sequence

<220>

<223> AAV9 Cap n.588.EFS amino acid sequence

<220>

<221> MISC_FEATURE

<222> (589)..(599)

<223> EFS

<400> 25

Met Ala Ala Asp Gly Tyr Leu Pro Asp Trp Leu Glu Asp Asn Leu Ser

1 5 10 15

Glu Gly Ile Arg Glu Trp Trp Ala Leu Lys Pro Gly Ala Pro Gln Pro

20 25 30

Lys Ala Asn Gln Gln His Gln Asp Asn Ala Arg Gly Leu Val Leu Pro

35 40 45

Gly Tyr Lys Tyr Leu Gly Pro Gly Asn Gly Leu Asp Lys Gly Glu Pro

50 55 60

Val Asn Ala Ala Asp Ala Ala Ala Leu Glu His Asp Lys Ala Tyr Asp

65 70 75 80

Gln Gln Leu Lys Ala Gly Asp Asn Pro Tyr Leu Lys Tyr Asn His Ala

85 90 95

Asp Ala Glu Phe Gln Glu Arg Leu Lys Glu Asp Thr Ser Phe Gly Gly

100 105 110

Asn Leu Gly Arg Ala Val Phe Gln Ala Lys Lys Arg Leu Leu Glu Pro

115 120 125

Leu Gly Leu Val Glu Glu Ala Ala Lys Thr Ala Pro Gly Lys Lys Arg

130 135 140

Pro Val Glu Gln Ser Pro Gln Glu Pro Asp Ser Ser Ala Gly Ile Gly

145 150 155 160

Lys Ser Gly Ala Gln Pro Ala Lys Lys Arg Leu Asn Phe Gly Gln Thr

165 170 175

Gly Asp Thr Glu Ser Val Pro Asp Pro Gln Pro Ile Gly Glu Pro Pro

180 185 190

Ala Ala Pro Ser Gly Val Gly Ser Leu Thr Met Ala Ser Gly Gly Gly

195 200 205

Ala Pro Val Ala Asp Asn Asn Glu Gly Ala Asp Gly Val Gly Ser Ser

210 215 220

Ser Gly Asn Trp His Cys Asp Ser Gln Trp Leu Gly Asp Arg Val Ile

225 230 235 240

Thr Thr Ser Thr Arg Thr Trp Ala Leu Pro Thr Tyr Asn Asn His Leu

245 250 255

Tyr Lys Gln Ile Ser Asn Ser Thr Ser Gly Gly Ser Ser Asn Asp Asn

260 265 270

Ala Tyr Phe Gly Tyr Ser Thr Pro Trp Gly Tyr Phe Asp Phe Asn Arg

275 280 285

Phe His Cys His Phe Ser Pro Arg Asp Trp Gln Arg Leu Ile Asn Asn

290 295 300

Asn Trp Gly Phe Arg Pro Lys Arg Leu Asn Phe Lys Leu Phe Asn Ile

305 310 315 320

Gln Val Lys Glu Val Thr Asp Asn Asn Gly Val Lys Thr Ile Ala Asn

325 330 335

Asn Leu Thr Ser Thr Val Gln Val Phe Thr Asp Ser Asp Tyr Gln Leu

340 345 350

Pro Tyr Val Leu Gly Ser Ala His Glu Gly Cys Leu Pro Pro Phe Pro

355 360 365

Ala Asp Val Phe Met Ile Pro Gln Tyr Gly Tyr Leu Thr Leu Asn Asp

370 375 380

Gly Ser Gln Ala Val Gly Arg Ser Ser Phe Tyr Cys Leu Glu Tyr Phe

385 390 395 400

Pro Ser Gln Met Leu Arg Thr Gly Asn Asn Phe Gln Phe Ser Tyr Glu

405 410 415

Phe Glu Asn Val Pro Phe His Ser Ser Tyr Ala His Ser Gln Ser Leu

420 425 430

Asp Arg Leu Met Asn Pro Leu Ile Asp Gln Tyr Leu Tyr Tyr Leu Ser

435 440 445

Lys Thr Ile Asn Gly Ser Gly Gln Asn Gln Gln Thr Leu Lys Phe Ser

450 455 460

Val Ala Gly Pro Ser Asn Met Ala Val Gln Gly Arg Asn Tyr Ile Pro

465 470 475 480

Gly Pro Ser Tyr Arg Gln Gln Arg Val Ser Thr Thr Val Thr Gln Asn

485 490 495

Asn Asn Ser Glu Phe Ala Trp Pro Gly Ala Ser Ser Trp Ala Leu Asn

500 505 510

Gly Arg Asn Ser Leu Met Asn Pro Gly Pro Ala Met Ala Ser His Lys

515 520 525

Glu Gly Glu Asp Arg Phe Phe Pro Leu Ser Gly Ser Leu Ile Phe Gly

530 535 540

Lys Gln Gly Thr Gly Arg Asp Asn Val Asp Ala Asp Lys Val Met Ile

545 550 555 560

Thr Asn Glu Glu Glu Ile Lys Thr Thr Asn Pro Val Ala Thr Glu Ser

565 570 575

Tyr Gly Gln Val Ala Thr Asn His Gln Ser Ala Gln Glu Phe Ser Ser

580 585 590

Asn Thr Val Lys Leu Thr Ser Ala Gln Ala Gln Thr Gly Trp Val Gln

595 600 605

Asn Gln Gly Ile Leu Pro Gly Met Val Trp Gln Asp Arg Asp Val Tyr

610 615 620

Leu Gln Gly Pro Ile Trp Ala Lys Ile Pro His Thr Asp Gly Asn Phe

625 630 635 640

His Pro Ser Pro Leu Met Gly Gly Phe Gly Met Lys His Pro Pro Pro

645 650 655

Gln Ile Leu Ile Lys Asn Thr Pro Val Pro Ala Asp Pro Pro Thr Ala

660 665 670

Phe Asn Lys Asp Lys Leu Asn Ser Phe Ile Thr Gln Tyr Ser Thr Gly

675 680 685

Gln Val Ser Val Glu Ile Glu Trp Glu Leu Gln Lys Glu Asn Ser Lys

690 695 700

Arg Trp Asn Pro Glu Ile Gln Tyr Thr Ser Asn Tyr Tyr Lys Ser Asn

705 710 715 720

Asn Val Glu Phe Ala Val Asn Thr Glu Gly Val Tyr Ser Glu Pro Arg

725 730 735

Pro Ile Gly Thr Arg Tyr Leu Thr Arg Asn Leu

740 745

<210> 26

<211> 2247

<212> DNA

<213> artificial sequence

<220>

<223> AAV9 Cap n.588.IEI nucleic acid sequence

<220>

<221> misc_feature

<222> (1765)..(1800)

<223> IEI

<400> 26

atggctgccg atggttatct tccagattgg ctcgaggaca accttagtga aggaattcgc 60

gagtggtggg ctttgaaacc tggagcccct caacccaagg caaatcaaca acatcaagac 120

aacgctcgag gtcttgtgct tccgggttac aaataccttg gacccggcaa cggactcgac 180

aagggggagc cggtcaacgc agcagacgcg gcggccctcg agcacgacaa ggcctacgac 240

cagcagctca aggccggaga caacccgtac ctcaagtaca accacgccga cgccgagttc 300

caggagcggc tcaaagaaga tacgtctttt gggggcaacc tcgggcgagc agtcttccag 360

gccaaaaaga ggcttcttga acctcttggt ctggttgagg aagcggctaa gacggctcct 420

ggaaagaaga ggcctgtaga gcagtctcct caggaaccgg actcctccgc gggtattggc 480

aaatcgggtg cacagcccgc taaaaagaga ctcaatttcg gtcagactgg cgacacagag 540

tcagtcccag accctcaacc aatcggagaa cctcccgcag ccccctcagg tgtgggatct 600

cttacaatgg cttcaggtgg tggcgcacca gtggcagaca ataacgaagg tgccgatgga 660

gtgggtagtt cctcgggaaa ttggcattgc gattcccaat ggctggggga cagagtcatc 720

accaccagca cccgaacctg ggccctgccc acctacaaca atcacctcta caagcaaatc 780

tccaacagca catctggagg atcttcaaat gacaacgcct acttcggcta cagcaccccc 840

tgggggtatt ttgacttcaa cagattccac tgccacttct caccacgtga ctggcagcga 900

ctcatcaaca acaactgggg attccggcct aagcgactca acttcaagct cttcaacatt 960

caggtcaaag aggttacgga caacaatgga gtcaagacca tcgccaataa ccttaccagc 1020

acggtccagg tcttcacgga ctcagactat cagctcccgt acgtgctcgg gtcggctcac 1080

gagggctgcc tcccgccgtt cccagcggac gttttcatga ttcctcagta cgggtatctg 1140

acgcttaatg atggaagcca ggccgtgggt cgttcgtcct tttactgcct ggaatatttc 1200

ccgtcgcaaa tgctaagaac gggtaacaac ttccagttca gctacgagtt tgagaacgta 1260

cctttccata gcagctacgc tcacagccaa agcctggacc gactaatgaa tccactcatc 1320

gaccaatact tgtactatct ctcaaagact attaacggtt ctggacagaa tcaacaaacg 1380

ctaaaattca gtgtggccgg acccagcaac atggctgtcc agggaagaaa ctacatacct 1440

ggacccagct accgacaaca acgtgtctca accactgtga ctcaaaacaa caacagcgaa 1500

tttgcttggc ctggagcttc ttcttgggct ctcaatggac gtaatagctt gatgaatcct 1560

ggacctgcta tggccagcca caaagaagga gaggaccgtt tctttccttt gtctggatct 1620

ttaatttttg gcaaacaagg aactggaaga gacaacgtgg atgcggacaa agtcatgata 1680

accaacgaag aagaaattaa aactactaac ccggtagcaa cggagtccta tggacaagtg 1740

gccacaaacc accagagtgc ccaaatcgag atcaacgcta ccagagctgg aaccaacctg 1800

gcacaggcgc agaccggctg ggttcaaaac caaggaatac ttccgggtat ggtttggcag 1860

gacagagatg tgtacctgca aggacccatt tgggccaaaa ttcctcacac ggacggcaac 1920

tttcaccctt ctccgctgat gggagggttt ggaatgaagc acccgcctcc tcagatcctc 1980

atcaaaaaca cacctgtacc tgcggatcct ccaacggcct tcaacaagga caagctgaac 2040

tctttcatca cccagtattc tactggccaa gtcagcgtgg agatcgagtg ggagctgcag 2100

aaggaaaaca gcaagcgctg gaacccggag atccagtaca cttccaacta ttacaagtct 2160

aataatgttg aatttgctgt taatactgaa ggtgtatata gtgaaccccg ccccattggc 2220

accagatacc tgactcgtaa tctgtaa 2247

<210> 27

<211> 748

<212> PRT

<213> artificial sequence

<220>

<223> AAV9 Cap n.588.IEI amino acid sequence

<220>

<221> MISC_FEATURE

<222> (589)..(600)

<223> IEI

<400> 27

Met Ala Ala Asp Gly Tyr Leu Pro Asp Trp Leu Glu Asp Asn Leu Ser

1 5 10 15

Glu Gly Ile Arg Glu Trp Trp Ala Leu Lys Pro Gly Ala Pro Gln Pro

20 25 30

Lys Ala Asn Gln Gln His Gln Asp Asn Ala Arg Gly Leu Val Leu Pro

35 40 45

Gly Tyr Lys Tyr Leu Gly Pro Gly Asn Gly Leu Asp Lys Gly Glu Pro

50 55 60

Val Asn Ala Ala Asp Ala Ala Ala Leu Glu His Asp Lys Ala Tyr Asp

65 70 75 80

Gln Gln Leu Lys Ala Gly Asp Asn Pro Tyr Leu Lys Tyr Asn His Ala

85 90 95

Asp Ala Glu Phe Gln Glu Arg Leu Lys Glu Asp Thr Ser Phe Gly Gly

100 105 110

Asn Leu Gly Arg Ala Val Phe Gln Ala Lys Lys Arg Leu Leu Glu Pro

115 120 125

Leu Gly Leu Val Glu Glu Ala Ala Lys Thr Ala Pro Gly Lys Lys Arg

130 135 140

Pro Val Glu Gln Ser Pro Gln Glu Pro Asp Ser Ser Ala Gly Ile Gly

145 150 155 160

Lys Ser Gly Ala Gln Pro Ala Lys Lys Arg Leu Asn Phe Gly Gln Thr

165 170 175

Gly Asp Thr Glu Ser Val Pro Asp Pro Gln Pro Ile Gly Glu Pro Pro

180 185 190

Ala Ala Pro Ser Gly Val Gly Ser Leu Thr Met Ala Ser Gly Gly Gly

195 200 205

Ala Pro Val Ala Asp Asn Asn Glu Gly Ala Asp Gly Val Gly Ser Ser

210 215 220

Ser Gly Asn Trp His Cys Asp Ser Gln Trp Leu Gly Asp Arg Val Ile

225 230 235 240

Thr Thr Ser Thr Arg Thr Trp Ala Leu Pro Thr Tyr Asn Asn His Leu

245 250 255

Tyr Lys Gln Ile Ser Asn Ser Thr Ser Gly Gly Ser Ser Asn Asp Asn

260 265 270

Ala Tyr Phe Gly Tyr Ser Thr Pro Trp Gly Tyr Phe Asp Phe Asn Arg

275 280 285

Phe His Cys His Phe Ser Pro Arg Asp Trp Gln Arg Leu Ile Asn Asn

290 295 300

Asn Trp Gly Phe Arg Pro Lys Arg Leu Asn Phe Lys Leu Phe Asn Ile

305 310 315 320

Gln Val Lys Glu Val Thr Asp Asn Asn Gly Val Lys Thr Ile Ala Asn

325 330 335

Asn Leu Thr Ser Thr Val Gln Val Phe Thr Asp Ser Asp Tyr Gln Leu

340 345 350

Pro Tyr Val Leu Gly Ser Ala His Glu Gly Cys Leu Pro Pro Phe Pro

355 360 365

Ala Asp Val Phe Met Ile Pro Gln Tyr Gly Tyr Leu Thr Leu Asn Asp

370 375 380

Gly Ser Gln Ala Val Gly Arg Ser Ser Phe Tyr Cys Leu Glu Tyr Phe

385 390 395 400

Pro Ser Gln Met Leu Arg Thr Gly Asn Asn Phe Gln Phe Ser Tyr Glu

405 410 415

Phe Glu Asn Val Pro Phe His Ser Ser Tyr Ala His Ser Gln Ser Leu

420 425 430

Asp Arg Leu Met Asn Pro Leu Ile Asp Gln Tyr Leu Tyr Tyr Leu Ser

435 440 445

Lys Thr Ile Asn Gly Ser Gly Gln Asn Gln Gln Thr Leu Lys Phe Ser

450 455 460

Val Ala Gly Pro Ser Asn Met Ala Val Gln Gly Arg Asn Tyr Ile Pro

465 470 475 480

Gly Pro Ser Tyr Arg Gln Gln Arg Val Ser Thr Thr Val Thr Gln Asn

485 490 495

Asn Asn Ser Glu Phe Ala Trp Pro Gly Ala Ser Ser Trp Ala Leu Asn

500 505 510

Gly Arg Asn Ser Leu Met Asn Pro Gly Pro Ala Met Ala Ser His Lys

515 520 525

Glu Gly Glu Asp Arg Phe Phe Pro Leu Ser Gly Ser Leu Ile Phe Gly

530 535 540

Lys Gln Gly Thr Gly Arg Asp Asn Val Asp Ala Asp Lys Val Met Ile

545 550 555 560

Thr Asn Glu Glu Glu Ile Lys Thr Thr Asn Pro Val Ala Thr Glu Ser

565 570 575

Tyr Gly Gln Val Ala Thr Asn His Gln Ser Ala Gln Ile Glu Ile Asn

580 585 590

Ala Thr Arg Ala Gly Thr Asn Leu Ala Gln Ala Gln Thr Gly Trp Val

595 600 605

Gln Asn Gln Gly Ile Leu Pro Gly Met Val Trp Gln Asp Arg Asp Val

610 615 620

Tyr Leu Gln Gly Pro Ile Trp Ala Lys Ile Pro His Thr Asp Gly Asn

625 630 635 640

Phe His Pro Ser Pro Leu Met Gly Gly Phe Gly Met Lys His Pro Pro

645 650 655

Pro Gln Ile Leu Ile Lys Asn Thr Pro Val Pro Ala Asp Pro Pro Thr

660 665 670

Ala Phe Asn Lys Asp Lys Leu Asn Ser Phe Ile Thr Gln Tyr Ser Thr

675 680 685

Gly Gln Val Ser Val Glu Ile Glu Trp Glu Leu Gln Lys Glu Asn Ser

690 695 700

Lys Arg Trp Asn Pro Glu Ile Gln Tyr Thr Ser Asn Tyr Tyr Lys Ser

705 710 715 720

Asn Asn Val Glu Phe Ala Val Asn Thr Glu Gly Val Tyr Ser Glu Pro

725 730 735

Arg Pro Ile Gly Thr Arg Tyr Leu Thr Arg Asn Leu

740 745

<210> 28

<211> 2259

<212> DNA

<213> artificial sequence

<220>

<223> AAV9 Cap n.588.IEI-L nucleic acid sequence

<220>

<221> misc_feature

<222> (1765)..(1812)

<223> IEI-L

<400> 28

atggctgccg atggttatct tccagattgg ctcgaggaca accttagtga aggaattcgc 60

gagtggtggg ctttgaaacc tggagcccct caacccaagg caaatcaaca acatcaagac 120

aacgctcgag gtcttgtgct tccgggttac aaataccttg gacccggcaa cggactcgac 180

aagggggagc cggtcaacgc agcagacgcg gcggccctcg agcacgacaa ggcctacgac 240

cagcagctca aggccggaga caacccgtac ctcaagtaca accacgccga cgccgagttc 300

caggagcggc tcaaagaaga tacgtctttt gggggcaacc tcgggcgagc agtcttccag 360

gccaaaaaga ggcttcttga acctcttggt ctggttgagg aagcggctaa gacggctcct 420

ggaaagaaga ggcctgtaga gcagtctcct caggaaccgg actcctccgc gggtattggc 480

aaatcgggtg cacagcccgc taaaaagaga ctcaatttcg gtcagactgg cgacacagag 540

tcagtcccag accctcaacc aatcggagaa cctcccgcag ccccctcagg tgtgggatct 600

cttacaatgg cttcaggtgg tggcgcacca gtggcagaca ataacgaagg tgccgatgga 660

gtgggtagtt cctcgggaaa ttggcattgc gattcccaat ggctggggga cagagtcatc 720

accaccagca cccgaacctg ggccctgccc acctacaaca atcacctcta caagcaaatc 780

tccaacagca catctggagg atcttcaaat gacaacgcct acttcggcta cagcaccccc 840

tgggggtatt ttgacttcaa cagattccac tgccacttct caccacgtga ctggcagcga 900

ctcatcaaca acaactgggg attccggcct aagcgactca acttcaagct cttcaacatt 960

caggtcaaag aggttacgga caacaatgga gtcaagacca tcgccaataa ccttaccagc 1020

acggtccagg tcttcacgga ctcagactat cagctcccgt acgtgctcgg gtcggctcac 1080

gagggctgcc tcccgccgtt cccagcggac gttttcatga ttcctcagta cgggtatctg 1140

acgcttaatg atggaagcca ggccgtgggt cgttcgtcct tttactgcct ggaatatttc 1200

ccgtcgcaaa tgctaagaac gggtaacaac ttccagttca gctacgagtt tgagaacgta 1260

cctttccata gcagctacgc tcacagccaa agcctggacc gactaatgaa tccactcatc 1320

gaccaatact tgtactatct ctcaaagact attaacggtt ctggacagaa tcaacaaacg 1380

ctaaaattca gtgtggccgg acccagcaac atggctgtcc agggaagaaa ctacatacct 1440

ggacccagct accgacaaca acgtgtctca accactgtga ctcaaaacaa caacagcgaa 1500

tttgcttggc ctggagcttc ttcttgggct ctcaatggac gtaatagctt gatgaatcct 1560

ggacctgcta tggccagcca caaagaagga gaggaccgtt tctttccttt gtctggatct 1620

ttaatttttg gcaaacaagg aactggaaga gacaacgtgg atgcggacaa agtcatgata 1680

accaacgaag aagaaattaa aactactaac ccggtagcaa cggagtccta tggacaagtg 1740

gccacaaacc accagagtgc ccaaggagga atcgagatca acgctaccag agctggaacc 1800

aacctgggag gagcacaggc gcagaccggc tgggttcaaa accaaggaat acttccgggt 1860

atggtttggc aggacagaga tgtgtacctg caaggaccca tttgggccaa aattcctcac 1920

acggacggca actttcaccc ttctccgctg atgggagggt ttggaatgaa gcacccgcct 1980

cctcagatcc tcatcaaaaa cacacctgta cctgcggatc ctccaacggc cttcaacaag 2040

gacaagctga actctttcat cacccagtat tctactggcc aagtcagcgt ggagatcgag 2100

tgggagctgc agaaggaaaa cagcaagcgc tggaacccgg agatccagta cacttccaac 2160

tattacaagt ctaataatgt tgaatttgct gttaatactg aaggtgtata tagtgaaccc 2220

cgccccattg gcaccagata cctgactcgt aatctgtaa 2259

<210> 29

<211> 752

<212> PRT

<213> artificial sequence

<220>

<223> AAV9 Cap n.588.IEI-L amino acid sequence

<220>

<221> MISC_FEATURE

<222> (589)..(604)

<223> IEI-L

<400> 29

Met Ala Ala Asp Gly Tyr Leu Pro Asp Trp Leu Glu Asp Asn Leu Ser

1 5 10 15

Glu Gly Ile Arg Glu Trp Trp Ala Leu Lys Pro Gly Ala Pro Gln Pro

20 25 30

Lys Ala Asn Gln Gln His Gln Asp Asn Ala Arg Gly Leu Val Leu Pro

35 40 45

Gly Tyr Lys Tyr Leu Gly Pro Gly Asn Gly Leu Asp Lys Gly Glu Pro

50 55 60

Val Asn Ala Ala Asp Ala Ala Ala Leu Glu His Asp Lys Ala Tyr Asp

65 70 75 80

Gln Gln Leu Lys Ala Gly Asp Asn Pro Tyr Leu Lys Tyr Asn His Ala

85 90 95

Asp Ala Glu Phe Gln Glu Arg Leu Lys Glu Asp Thr Ser Phe Gly Gly

100 105 110

Asn Leu Gly Arg Ala Val Phe Gln Ala Lys Lys Arg Leu Leu Glu Pro

115 120 125

Leu Gly Leu Val Glu Glu Ala Ala Lys Thr Ala Pro Gly Lys Lys Arg

130 135 140

Pro Val Glu Gln Ser Pro Gln Glu Pro Asp Ser Ser Ala Gly Ile Gly

145 150 155 160

Lys Ser Gly Ala Gln Pro Ala Lys Lys Arg Leu Asn Phe Gly Gln Thr

165 170 175

Gly Asp Thr Glu Ser Val Pro Asp Pro Gln Pro Ile Gly Glu Pro Pro

180 185 190

Ala Ala Pro Ser Gly Val Gly Ser Leu Thr Met Ala Ser Gly Gly Gly

195 200 205

Ala Pro Val Ala Asp Asn Asn Glu Gly Ala Asp Gly Val Gly Ser Ser

210 215 220

Ser Gly Asn Trp His Cys Asp Ser Gln Trp Leu Gly Asp Arg Val Ile

225 230 235 240

Thr Thr Ser Thr Arg Thr Trp Ala Leu Pro Thr Tyr Asn Asn His Leu

245 250 255

Tyr Lys Gln Ile Ser Asn Ser Thr Ser Gly Gly Ser Ser Asn Asp Asn

260 265 270

Ala Tyr Phe Gly Tyr Ser Thr Pro Trp Gly Tyr Phe Asp Phe Asn Arg

275 280 285

Phe His Cys His Phe Ser Pro Arg Asp Trp Gln Arg Leu Ile Asn Asn

290 295 300

Asn Trp Gly Phe Arg Pro Lys Arg Leu Asn Phe Lys Leu Phe Asn Ile

305 310 315 320

Gln Val Lys Glu Val Thr Asp Asn Asn Gly Val Lys Thr Ile Ala Asn

325 330 335

Asn Leu Thr Ser Thr Val Gln Val Phe Thr Asp Ser Asp Tyr Gln Leu

340 345 350

Pro Tyr Val Leu Gly Ser Ala His Glu Gly Cys Leu Pro Pro Phe Pro

355 360 365

Ala Asp Val Phe Met Ile Pro Gln Tyr Gly Tyr Leu Thr Leu Asn Asp

370 375 380

Gly Ser Gln Ala Val Gly Arg Ser Ser Phe Tyr Cys Leu Glu Tyr Phe

385 390 395 400

Pro Ser Gln Met Leu Arg Thr Gly Asn Asn Phe Gln Phe Ser Tyr Glu

405 410 415

Phe Glu Asn Val Pro Phe His Ser Ser Tyr Ala His Ser Gln Ser Leu

420 425 430

Asp Arg Leu Met Asn Pro Leu Ile Asp Gln Tyr Leu Tyr Tyr Leu Ser

435 440 445

Lys Thr Ile Asn Gly Ser Gly Gln Asn Gln Gln Thr Leu Lys Phe Ser

450 455 460

Val Ala Gly Pro Ser Asn Met Ala Val Gln Gly Arg Asn Tyr Ile Pro

465 470 475 480

Gly Pro Ser Tyr Arg Gln Gln Arg Val Ser Thr Thr Val Thr Gln Asn

485 490 495

Asn Asn Ser Glu Phe Ala Trp Pro Gly Ala Ser Ser Trp Ala Leu Asn

500 505 510

Gly Arg Asn Ser Leu Met Asn Pro Gly Pro Ala Met Ala Ser His Lys

515 520 525

Glu Gly Glu Asp Arg Phe Phe Pro Leu Ser Gly Ser Leu Ile Phe Gly

530 535 540

Lys Gln Gly Thr Gly Arg Asp Asn Val Asp Ala Asp Lys Val Met Ile

545 550 555 560

Thr Asn Glu Glu Glu Ile Lys Thr Thr Asn Pro Val Ala Thr Glu Ser

565 570 575

Tyr Gly Gln Val Ala Thr Asn His Gln Ser Ala Gln Gly Gly Ile Glu

580 585 590

Ile Asn Ala Thr Arg Ala Gly Thr Asn Leu Gly Gly Ala Gln Ala Gln

595 600 605

Thr Gly Trp Val Gln Asn Gln Gly Ile Leu Pro Gly Met Val Trp Gln

610 615 620

Asp Arg Asp Val Tyr Leu Gln Gly Pro Ile Trp Ala Lys Ile Pro His

625 630 635 640

Thr Asp Gly Asn Phe His Pro Ser Pro Leu Met Gly Gly Phe Gly Met

645 650 655

Lys His Pro Pro Pro Gln Ile Leu Ile Lys Asn Thr Pro Val Pro Ala

660 665 670

Asp Pro Pro Thr Ala Phe Asn Lys Asp Lys Leu Asn Ser Phe Ile Thr

675 680 685

Gln Tyr Ser Thr Gly Gln Val Ser Val Glu Ile Glu Trp Glu Leu Gln

690 695 700

Lys Glu Asn Ser Lys Arg Trp Asn Pro Glu Ile Gln Tyr Thr Ser Asn

705 710 715 720

Tyr Tyr Lys Ser Asn Asn Val Glu Phe Ala Val Asn Thr Glu Gly Val

725 730 735

Tyr Ser Glu Pro Arg Pro Ile Gly Thr Arg Tyr Leu Thr Arg Asn Leu

740 745 750

<210> 30

<211> 2241

<212> DNA

<213> artificial sequence

<220>

<223> AAV9 Cap n.588.SAN nucleic acid sequence

<220>

<221> misc_feature

<222> (1765)..(1794)

<223> SAN

<400> 30

atggctgccg atggttatct tccagattgg ctcgaggaca accttagtga aggaattcgc 60

gagtggtggg ctttgaaacc tggagcccct caacccaagg caaatcaaca acatcaagac 120

aacgctcgag gtcttgtgct tccgggttac aaataccttg gacccggcaa cggactcgac 180

aagggggagc cggtcaacgc agcagacgcg gcggccctcg agcacgacaa ggcctacgac 240

cagcagctca aggccggaga caacccgtac ctcaagtaca accacgccga cgccgagttc 300

caggagcggc tcaaagaaga tacgtctttt gggggcaacc tcgggcgagc agtcttccag 360

gccaaaaaga ggcttcttga acctcttggt ctggttgagg aagcggctaa gacggctcct 420

ggaaagaaga ggcctgtaga gcagtctcct caggaaccgg actcctccgc gggtattggc 480

aaatcgggtg cacagcccgc taaaaagaga ctcaatttcg gtcagactgg cgacacagag 540

tcagtcccag accctcaacc aatcggagaa cctcccgcag ccccctcagg tgtgggatct 600

cttacaatgg cttcaggtgg tggcgcacca gtggcagaca ataacgaagg tgccgatgga 660

gtgggtagtt cctcgggaaa ttggcattgc gattcccaat ggctggggga cagagtcatc 720

accaccagca cccgaacctg ggccctgccc acctacaaca atcacctcta caagcaaatc 780

tccaacagca catctggagg atcttcaaat gacaacgcct acttcggcta cagcaccccc 840

tgggggtatt ttgacttcaa cagattccac tgccacttct caccacgtga ctggcagcga 900

ctcatcaaca acaactgggg attccggcct aagcgactca acttcaagct cttcaacatt 960

caggtcaaag aggttacgga caacaatgga gtcaagacca tcgccaataa ccttaccagc 1020

acggtccagg tcttcacgga ctcagactat cagctcccgt acgtgctcgg gtcggctcac 1080

gagggctgcc tcccgccgtt cccagcggac gttttcatga ttcctcagta cgggtatctg 1140

acgcttaatg atggaagcca ggccgtgggt cgttcgtcct tttactgcct ggaatatttc 1200

ccgtcgcaaa tgctaagaac gggtaacaac ttccagttca gctacgagtt tgagaacgta 1260

cctttccata gcagctacgc tcacagccaa agcctggacc gactaatgaa tccactcatc 1320

gaccaatact tgtactatct ctcaaagact attaacggtt ctggacagaa tcaacaaacg 1380

ctaaaattca gtgtggccgg acccagcaac atggctgtcc agggaagaaa ctacatacct 1440

ggacccagct accgacaaca acgtgtctca accactgtga ctcaaaacaa caacagcgaa 1500

tttgcttggc ctggagcttc ttcttgggct ctcaatggac gtaatagctt gatgaatcct 1560

ggacctgcta tggccagcca caaagaagga gaggaccgtt tctttccttt gtctggatct 1620

ttaatttttg gcaaacaagg aactggaaga gacaacgtgg atgcggacaa agtcatgata 1680

accaacgaag aagaaattaa aactactaac ccggtagcaa cggagtccta tggacaagtg 1740

gccacaaacc accagagtgc ccaaagcgct aacttcatca agcctaccag ctacgcacag 1800

gcgcagaccg gctgggttca aaaccaagga atacttccgg gtatggtttg gcaggacaga 1860

gatgtgtacc tgcaaggacc catttgggcc aaaattcctc acacggacgg caactttcac 1920

ccttctccgc tgatgggagg gtttggaatg aagcacccgc ctcctcagat cctcatcaaa 1980

aacacacctg tacctgcgga tcctccaacg gccttcaaca aggacaagct gaactctttc 2040

atcacccagt attctactgg ccaagtcagc gtggagatcg agtgggagct gcagaaggaa 2100

aacagcaagc gctggaaccc ggagatccag tacacttcca actattacaa gtctaataat 2160

gttgaatttg ctgttaatac tgaaggtgta tatagtgaac cccgccccat tggcaccaga 2220

tacctgactc gtaatctgta a 2241

<210> 31

<211> 746

<212> PRT

<213> artificial sequence

<220>

<223> AAV9 Cap n.588.SAN amino acid sequence

<220>

<221> MISC_FEATURE

<222> (589)..(598)

<223> SAN

<400> 31

Met Ala Ala Asp Gly Tyr Leu Pro Asp Trp Leu Glu Asp Asn Leu Ser

1 5 10 15

Glu Gly Ile Arg Glu Trp Trp Ala Leu Lys Pro Gly Ala Pro Gln Pro

20 25 30

Lys Ala Asn Gln Gln His Gln Asp Asn Ala Arg Gly Leu Val Leu Pro

35 40 45

Gly Tyr Lys Tyr Leu Gly Pro Gly Asn Gly Leu Asp Lys Gly Glu Pro

50 55 60

Val Asn Ala Ala Asp Ala Ala Ala Leu Glu His Asp Lys Ala Tyr Asp

65 70 75 80

Gln Gln Leu Lys Ala Gly Asp Asn Pro Tyr Leu Lys Tyr Asn His Ala

85 90 95

Asp Ala Glu Phe Gln Glu Arg Leu Lys Glu Asp Thr Ser Phe Gly Gly

100 105 110

Asn Leu Gly Arg Ala Val Phe Gln Ala Lys Lys Arg Leu Leu Glu Pro

115 120 125

Leu Gly Leu Val Glu Glu Ala Ala Lys Thr Ala Pro Gly Lys Lys Arg

130 135 140

Pro Val Glu Gln Ser Pro Gln Glu Pro Asp Ser Ser Ala Gly Ile Gly

145 150 155 160

Lys Ser Gly Ala Gln Pro Ala Lys Lys Arg Leu Asn Phe Gly Gln Thr

165 170 175

Gly Asp Thr Glu Ser Val Pro Asp Pro Gln Pro Ile Gly Glu Pro Pro

180 185 190

Ala Ala Pro Ser Gly Val Gly Ser Leu Thr Met Ala Ser Gly Gly Gly

195 200 205

Ala Pro Val Ala Asp Asn Asn Glu Gly Ala Asp Gly Val Gly Ser Ser

210 215 220

Ser Gly Asn Trp His Cys Asp Ser Gln Trp Leu Gly Asp Arg Val Ile

225 230 235 240

Thr Thr Ser Thr Arg Thr Trp Ala Leu Pro Thr Tyr Asn Asn His Leu

245 250 255

Tyr Lys Gln Ile Ser Asn Ser Thr Ser Gly Gly Ser Ser Asn Asp Asn

260 265 270

Ala Tyr Phe Gly Tyr Ser Thr Pro Trp Gly Tyr Phe Asp Phe Asn Arg

275 280 285

Phe His Cys His Phe Ser Pro Arg Asp Trp Gln Arg Leu Ile Asn Asn

290 295 300

Asn Trp Gly Phe Arg Pro Lys Arg Leu Asn Phe Lys Leu Phe Asn Ile

305 310 315 320

Gln Val Lys Glu Val Thr Asp Asn Asn Gly Val Lys Thr Ile Ala Asn

325 330 335

Asn Leu Thr Ser Thr Val Gln Val Phe Thr Asp Ser Asp Tyr Gln Leu

340 345 350

Pro Tyr Val Leu Gly Ser Ala His Glu Gly Cys Leu Pro Pro Phe Pro

355 360 365

Ala Asp Val Phe Met Ile Pro Gln Tyr Gly Tyr Leu Thr Leu Asn Asp

370 375 380

Gly Ser Gln Ala Val Gly Arg Ser Ser Phe Tyr Cys Leu Glu Tyr Phe

385 390 395 400

Pro Ser Gln Met Leu Arg Thr Gly Asn Asn Phe Gln Phe Ser Tyr Glu

405 410 415

Phe Glu Asn Val Pro Phe His Ser Ser Tyr Ala His Ser Gln Ser Leu

420 425 430

Asp Arg Leu Met Asn Pro Leu Ile Asp Gln Tyr Leu Tyr Tyr Leu Ser

435 440 445

Lys Thr Ile Asn Gly Ser Gly Gln Asn Gln Gln Thr Leu Lys Phe Ser

450 455 460

Val Ala Gly Pro Ser Asn Met Ala Val Gln Gly Arg Asn Tyr Ile Pro

465 470 475 480

Gly Pro Ser Tyr Arg Gln Gln Arg Val Ser Thr Thr Val Thr Gln Asn

485 490 495

Asn Asn Ser Glu Phe Ala Trp Pro Gly Ala Ser Ser Trp Ala Leu Asn

500 505 510

Gly Arg Asn Ser Leu Met Asn Pro Gly Pro Ala Met Ala Ser His Lys

515 520 525

Glu Gly Glu Asp Arg Phe Phe Pro Leu Ser Gly Ser Leu Ile Phe Gly

530 535 540

Lys Gln Gly Thr Gly Arg Asp Asn Val Asp Ala Asp Lys Val Met Ile

545 550 555 560

Thr Asn Glu Glu Glu Ile Lys Thr Thr Asn Pro Val Ala Thr Glu Ser

565 570 575

Tyr Gly Gln Val Ala Thr Asn His Gln Ser Ala Gln Ser Ala Asn Phe

580 585 590

Ile Lys Pro Thr Ser Tyr Ala Gln Ala Gln Thr Gly Trp Val Gln Asn

595 600 605

Gln Gly Ile Leu Pro Gly Met Val Trp Gln Asp Arg Asp Val Tyr Leu

610 615 620

Gln Gly Pro Ile Trp Ala Lys Ile Pro His Thr Asp Gly Asn Phe His

625 630 635 640

Pro Ser Pro Leu Met Gly Gly Phe Gly Met Lys His Pro Pro Pro Gln

645 650 655

Ile Leu Ile Lys Asn Thr Pro Val Pro Ala Asp Pro Pro Thr Ala Phe

660 665 670

Asn Lys Asp Lys Leu Asn Ser Phe Ile Thr Gln Tyr Ser Thr Gly Gln

675 680 685

Val Ser Val Glu Ile Glu Trp Glu Leu Gln Lys Glu Asn Ser Lys Arg

690 695 700

Trp Asn Pro Glu Ile Gln Tyr Thr Ser Asn Tyr Tyr Lys Ser Asn Asn

705 710 715 720

Val Glu Phe Ala Val Asn Thr Glu Gly Val Tyr Ser Glu Pro Arg Pro

725 730 735

Ile Gly Thr Arg Tyr Leu Thr Arg Asn Leu

740 745

<210> 32

<211> 2244

<212> DNA

<213> artificial sequence

<220>

<223> AAV9 Cap n.588.SSN nucleic acid sequence

<220>

<221> misc_feature

<222> (1765)..(1797)

<223> SSN

<400> 32

atggctgccg atggttatct tccagattgg ctcgaggaca accttagtga aggaattcgc 60

gagtggtggg ctttgaaacc tggagcccct caacccaagg caaatcaaca acatcaagac 120

aacgctcgag gtcttgtgct tccgggttac aaataccttg gacccggcaa cggactcgac 180

aagggggagc cggtcaacgc agcagacgcg gcggccctcg agcacgacaa ggcctacgac 240

cagcagctca aggccggaga caacccgtac ctcaagtaca accacgccga cgccgagttc 300

caggagcggc tcaaagaaga tacgtctttt gggggcaacc tcgggcgagc agtcttccag 360

gccaaaaaga ggcttcttga acctcttggt ctggttgagg aagcggctaa gacggctcct 420

ggaaagaaga ggcctgtaga gcagtctcct caggaaccgg actcctccgc gggtattggc 480

aaatcgggtg cacagcccgc taaaaagaga ctcaatttcg gtcagactgg cgacacagag 540

tcagtcccag accctcaacc aatcggagaa cctcccgcag ccccctcagg tgtgggatct 600

cttacaatgg cttcaggtgg tggcgcacca gtggcagaca ataacgaagg tgccgatgga 660

gtgggtagtt cctcgggaaa ttggcattgc gattcccaat ggctggggga cagagtcatc 720

accaccagca cccgaacctg ggccctgccc acctacaaca atcacctcta caagcaaatc 780

tccaacagca catctggagg atcttcaaat gacaacgcct acttcggcta cagcaccccc 840

tgggggtatt ttgacttcaa cagattccac tgccacttct caccacgtga ctggcagcga 900

ctcatcaaca acaactgggg attccggcct aagcgactca acttcaagct cttcaacatt 960

caggtcaaag aggttacgga caacaatgga gtcaagacca tcgccaataa ccttaccagc 1020

acggtccagg tcttcacgga ctcagactat cagctcccgt acgtgctcgg gtcggctcac 1080

gagggctgcc tcccgccgtt cccagcggac gttttcatga ttcctcagta cgggtatctg 1140

acgcttaatg atggaagcca ggccgtgggt cgttcgtcct tttactgcct ggaatatttc 1200

ccgtcgcaaa tgctaagaac gggtaacaac ttccagttca gctacgagtt tgagaacgta 1260

cctttccata gcagctacgc tcacagccaa agcctggacc gactaatgaa tccactcatc 1320

gaccaatact tgtactatct ctcaaagact attaacggtt ctggacagaa tcaacaaacg 1380

ctaaaattca gtgtggccgg acccagcaac atggctgtcc agggaagaaa ctacatacct 1440

ggacccagct accgacaaca acgtgtctca accactgtga ctcaaaacaa caacagcgaa 1500

tttgcttggc ctggagcttc ttcttgggct ctcaatggac gtaatagctt gatgaatcct 1560

ggacctgcta tggccagcca caaagaagga gaggaccgtt tctttccttt gtctggatct 1620

ttaatttttg gcaaacaagg aactggaaga gacaacgtgg atgcggacaa agtcatgata 1680

accaacgaag aagaaattaa aactactaac ccggtagcaa cggagtccta tggacaagtg 1740

gccacaaacc accagagtgc ccaaagcagc aacaccgtga agctgaccag cggacacgca 1800

caggcgcaga ccggctgggt tcaaaaccaa ggaatacttc cgggtatggt ttggcaggac 1860

agagatgtgt acctgcaagg acccatttgg gccaaaattc ctcacacgga cggcaacttt 1920

cacccttctc cgctgatggg agggtttgga atgaagcacc cgcctcctca gatcctcatc 1980

aaaaacacac ctgtacctgc ggatcctcca acggccttca acaaggacaa gctgaactct 2040

ttcatcaccc agtattctac tggccaagtc agcgtggaga tcgagtggga gctgcagaag 2100

gaaaacagca agcgctggaa cccggagatc cagtacactt ccaactatta caagtctaat 2160

aatgttgaat ttgctgttaa tactgaaggt gtatatagtg aaccccgccc cattggcacc 2220

agatacctga ctcgtaatct gtaa 2244

<210> 33

<211> 747

<212> PRT

<213> artificial sequence

<220>

<223> AAV9 Cap n.588.SSN amino acid sequence

<220>

<221> MISC_FEATURE

<222> (589)..(599)

<223> SSN

<400> 33

Met Ala Ala Asp Gly Tyr Leu Pro Asp Trp Leu Glu Asp Asn Leu Ser

1 5 10 15

Glu Gly Ile Arg Glu Trp Trp Ala Leu Lys Pro Gly Ala Pro Gln Pro

20 25 30

Lys Ala Asn Gln Gln His Gln Asp Asn Ala Arg Gly Leu Val Leu Pro

35 40 45

Gly Tyr Lys Tyr Leu Gly Pro Gly Asn Gly Leu Asp Lys Gly Glu Pro

50 55 60

Val Asn Ala Ala Asp Ala Ala Ala Leu Glu His Asp Lys Ala Tyr Asp

65 70 75 80

Gln Gln Leu Lys Ala Gly Asp Asn Pro Tyr Leu Lys Tyr Asn His Ala

85 90 95

Asp Ala Glu Phe Gln Glu Arg Leu Lys Glu Asp Thr Ser Phe Gly Gly

100 105 110

Asn Leu Gly Arg Ala Val Phe Gln Ala Lys Lys Arg Leu Leu Glu Pro

115 120 125

Leu Gly Leu Val Glu Glu Ala Ala Lys Thr Ala Pro Gly Lys Lys Arg

130 135 140

Pro Val Glu Gln Ser Pro Gln Glu Pro Asp Ser Ser Ala Gly Ile Gly

145 150 155 160

Lys Ser Gly Ala Gln Pro Ala Lys Lys Arg Leu Asn Phe Gly Gln Thr

165 170 175

Gly Asp Thr Glu Ser Val Pro Asp Pro Gln Pro Ile Gly Glu Pro Pro

180 185 190

Ala Ala Pro Ser Gly Val Gly Ser Leu Thr Met Ala Ser Gly Gly Gly

195 200 205

Ala Pro Val Ala Asp Asn Asn Glu Gly Ala Asp Gly Val Gly Ser Ser

210 215 220

Ser Gly Asn Trp His Cys Asp Ser Gln Trp Leu Gly Asp Arg Val Ile

225 230 235 240

Thr Thr Ser Thr Arg Thr Trp Ala Leu Pro Thr Tyr Asn Asn His Leu

245 250 255

Tyr Lys Gln Ile Ser Asn Ser Thr Ser Gly Gly Ser Ser Asn Asp Asn

260 265 270

Ala Tyr Phe Gly Tyr Ser Thr Pro Trp Gly Tyr Phe Asp Phe Asn Arg

275 280 285

Phe His Cys His Phe Ser Pro Arg Asp Trp Gln Arg Leu Ile Asn Asn

290 295 300

Asn Trp Gly Phe Arg Pro Lys Arg Leu Asn Phe Lys Leu Phe Asn Ile

305 310 315 320

Gln Val Lys Glu Val Thr Asp Asn Asn Gly Val Lys Thr Ile Ala Asn

325 330 335

Asn Leu Thr Ser Thr Val Gln Val Phe Thr Asp Ser Asp Tyr Gln Leu

340 345 350

Pro Tyr Val Leu Gly Ser Ala His Glu Gly Cys Leu Pro Pro Phe Pro

355 360 365

Ala Asp Val Phe Met Ile Pro Gln Tyr Gly Tyr Leu Thr Leu Asn Asp

370 375 380

Gly Ser Gln Ala Val Gly Arg Ser Ser Phe Tyr Cys Leu Glu Tyr Phe

385 390 395 400

Pro Ser Gln Met Leu Arg Thr Gly Asn Asn Phe Gln Phe Ser Tyr Glu

405 410 415

Phe Glu Asn Val Pro Phe His Ser Ser Tyr Ala His Ser Gln Ser Leu

420 425 430

Asp Arg Leu Met Asn Pro Leu Ile Asp Gln Tyr Leu Tyr Tyr Leu Ser

435 440 445

Lys Thr Ile Asn Gly Ser Gly Gln Asn Gln Gln Thr Leu Lys Phe Ser

450 455 460

Val Ala Gly Pro Ser Asn Met Ala Val Gln Gly Arg Asn Tyr Ile Pro

465 470 475 480

Gly Pro Ser Tyr Arg Gln Gln Arg Val Ser Thr Thr Val Thr Gln Asn

485 490 495

Asn Asn Ser Glu Phe Ala Trp Pro Gly Ala Ser Ser Trp Ala Leu Asn

500 505 510

Gly Arg Asn Ser Leu Met Asn Pro Gly Pro Ala Met Ala Ser His Lys

515 520 525

Glu Gly Glu Asp Arg Phe Phe Pro Leu Ser Gly Ser Leu Ile Phe Gly

530 535 540

Lys Gln Gly Thr Gly Arg Asp Asn Val Asp Ala Asp Lys Val Met Ile

545 550 555 560

Thr Asn Glu Glu Glu Ile Lys Thr Thr Asn Pro Val Ala Thr Glu Ser

565 570 575

Tyr Gly Gln Val Ala Thr Asn His Gln Ser Ala Gln Ser Ser Asn Thr

580 585 590

Val Lys Leu Thr Ser Gly His Ala Gln Ala Gln Thr Gly Trp Val Gln

595 600 605

Asn Gln Gly Ile Leu Pro Gly Met Val Trp Gln Asp Arg Asp Val Tyr

610 615 620

Leu Gln Gly Pro Ile Trp Ala Lys Ile Pro His Thr Asp Gly Asn Phe

625 630 635 640

His Pro Ser Pro Leu Met Gly Gly Phe Gly Met Lys His Pro Pro Pro

645 650 655

Gln Ile Leu Ile Lys Asn Thr Pro Val Pro Ala Asp Pro Pro Thr Ala

660 665 670

Phe Asn Lys Asp Lys Leu Asn Ser Phe Ile Thr Gln Tyr Ser Thr Gly

675 680 685

Gln Val Ser Val Glu Ile Glu Trp Glu Leu Gln Lys Glu Asn Ser Lys

690 695 700

Arg Trp Asn Pro Glu Ile Gln Tyr Thr Ser Asn Tyr Tyr Lys Ser Asn

705 710 715 720

Asn Val Glu Phe Ala Val Asn Thr Glu Gly Val Tyr Ser Glu Pro Arg

725 730 735

Pro Ile Gly Thr Arg Tyr Leu Thr Arg Asn Leu

740 745

<210> 34

<211> 2256

<212> DNA

<213> artificial sequence

<220>

<223> AAV9 Cap n.588.SSN-L nucleic acid sequence

<220>

<221> misc_feature

<222> (1765)..(1809)

<223> SSN-L

<400> 34

atggctgccg atggttatct tccagattgg ctcgaggaca accttagtga aggaattcgc 60

gagtggtggg ctttgaaacc tggagcccct caacccaagg caaatcaaca acatcaagac 120

aacgctcgag gtcttgtgct tccgggttac aaataccttg gacccggcaa cggactcgac 180

aagggggagc cggtcaacgc agcagacgcg gcggccctcg agcacgacaa ggcctacgac 240

cagcagctca aggccggaga caacccgtac ctcaagtaca accacgccga cgccgagttc 300

caggagcggc tcaaagaaga tacgtctttt gggggcaacc tcgggcgagc agtcttccag 360

gccaaaaaga ggcttcttga acctcttggt ctggttgagg aagcggctaa gacggctcct 420

ggaaagaaga ggcctgtaga gcagtctcct caggaaccgg actcctccgc gggtattggc 480

aaatcgggtg cacagcccgc taaaaagaga ctcaatttcg gtcagactgg cgacacagag 540

tcagtcccag accctcaacc aatcggagaa cctcccgcag ccccctcagg tgtgggatct 600

cttacaatgg cttcaggtgg tggcgcacca gtggcagaca ataacgaagg tgccgatgga 660

gtgggtagtt cctcgggaaa ttggcattgc gattcccaat ggctggggga cagagtcatc 720

accaccagca cccgaacctg ggccctgccc acctacaaca atcacctcta caagcaaatc 780

tccaacagca catctggagg atcttcaaat gacaacgcct acttcggcta cagcaccccc 840

tgggggtatt ttgacttcaa cagattccac tgccacttct caccacgtga ctggcagcga 900

ctcatcaaca acaactgggg attccggcct aagcgactca acttcaagct cttcaacatt 960

caggtcaaag aggttacgga caacaatgga gtcaagacca tcgccaataa ccttaccagc 1020

acggtccagg tcttcacgga ctcagactat cagctcccgt acgtgctcgg gtcggctcac 1080

gagggctgcc tcccgccgtt cccagcggac gttttcatga ttcctcagta cgggtatctg 1140

acgcttaatg atggaagcca ggccgtgggt cgttcgtcct tttactgcct ggaatatttc 1200

ccgtcgcaaa tgctaagaac gggtaacaac ttccagttca gctacgagtt tgagaacgta 1260

cctttccata gcagctacgc tcacagccaa agcctggacc gactaatgaa tccactcatc 1320

gaccaatact tgtactatct ctcaaagact attaacggtt ctggacagaa tcaacaaacg 1380

ctaaaattca gtgtggccgg acccagcaac atggctgtcc agggaagaaa ctacatacct 1440

ggacccagct accgacaaca acgtgtctca accactgtga ctcaaaacaa caacagcgaa 1500

tttgcttggc ctggagcttc ttcttgggct ctcaatggac gtaatagctt gatgaatcct 1560

ggacctgcta tggccagcca caaagaagga gaggaccgtt tctttccttt gtctggatct 1620

ttaatttttg gcaaacaagg aactggaaga gacaacgtgg atgcggacaa agtcatgata 1680

accaacgaag aagaaattaa aactactaac ccggtagcaa cggagtccta tggacaagtg 1740

gccacaaacc accagagtgc ccaaggagga agcagcaaca ccgtgaagct gaccagcgga 1800

cacggaggag cacaggcgca gaccggctgg gttcaaaacc aaggaatact tccgggtatg 1860

gtttggcagg acagagatgt gtacctgcaa ggacccattt gggccaaaat tcctcacacg 1920

gacggcaact ttcacccttc tccgctgatg ggagggtttg gaatgaagca cccgcctcct 1980

cagatcctca tcaaaaacac acctgtacct gcggatcctc caacggcctt caacaaggac 2040

aagctgaact ctttcatcac ccagtattct actggccaag tcagcgtgga gatcgagtgg 2100

gagctgcaga aggaaaacag caagcgctgg aacccggaga tccagtacac ttccaactat 2160

tacaagtcta ataatgttga atttgctgtt aatactgaag gtgtatatag tgaaccccgc 2220

cccattggca ccagatacct gactcgtaat ctgtaa 2256

<210> 35

<211> 751

<212> PRT

<213> artificial sequence

<220>

<223> AAV9 Cap n.588.SSN-L amino acid sequence

<220>

<221> MISC_FEATURE

<222> (589)..(603)

<223> SSN-L

<400> 35

Met Ala Ala Asp Gly Tyr Leu Pro Asp Trp Leu Glu Asp Asn Leu Ser

1 5 10 15

Glu Gly Ile Arg Glu Trp Trp Ala Leu Lys Pro Gly Ala Pro Gln Pro

20 25 30

Lys Ala Asn Gln Gln His Gln Asp Asn Ala Arg Gly Leu Val Leu Pro

35 40 45

Gly Tyr Lys Tyr Leu Gly Pro Gly Asn Gly Leu Asp Lys Gly Glu Pro

50 55 60

Val Asn Ala Ala Asp Ala Ala Ala Leu Glu His Asp Lys Ala Tyr Asp

65 70 75 80

Gln Gln Leu Lys Ala Gly Asp Asn Pro Tyr Leu Lys Tyr Asn His Ala

85 90 95

Asp Ala Glu Phe Gln Glu Arg Leu Lys Glu Asp Thr Ser Phe Gly Gly

100 105 110

Asn Leu Gly Arg Ala Val Phe Gln Ala Lys Lys Arg Leu Leu Glu Pro

115 120 125

Leu Gly Leu Val Glu Glu Ala Ala Lys Thr Ala Pro Gly Lys Lys Arg

130 135 140

Pro Val Glu Gln Ser Pro Gln Glu Pro Asp Ser Ser Ala Gly Ile Gly

145 150 155 160

Lys Ser Gly Ala Gln Pro Ala Lys Lys Arg Leu Asn Phe Gly Gln Thr

165 170 175

Gly Asp Thr Glu Ser Val Pro Asp Pro Gln Pro Ile Gly Glu Pro Pro

180 185 190

Ala Ala Pro Ser Gly Val Gly Ser Leu Thr Met Ala Ser Gly Gly Gly

195 200 205

Ala Pro Val Ala Asp Asn Asn Glu Gly Ala Asp Gly Val Gly Ser Ser

210 215 220

Ser Gly Asn Trp His Cys Asp Ser Gln Trp Leu Gly Asp Arg Val Ile

225 230 235 240

Thr Thr Ser Thr Arg Thr Trp Ala Leu Pro Thr Tyr Asn Asn His Leu

245 250 255

Tyr Lys Gln Ile Ser Asn Ser Thr Ser Gly Gly Ser Ser Asn Asp Asn

260 265 270

Ala Tyr Phe Gly Tyr Ser Thr Pro Trp Gly Tyr Phe Asp Phe Asn Arg

275 280 285

Phe His Cys His Phe Ser Pro Arg Asp Trp Gln Arg Leu Ile Asn Asn

290 295 300

Asn Trp Gly Phe Arg Pro Lys Arg Leu Asn Phe Lys Leu Phe Asn Ile

305 310 315 320

Gln Val Lys Glu Val Thr Asp Asn Asn Gly Val Lys Thr Ile Ala Asn

325 330 335

Asn Leu Thr Ser Thr Val Gln Val Phe Thr Asp Ser Asp Tyr Gln Leu

340 345 350

Pro Tyr Val Leu Gly Ser Ala His Glu Gly Cys Leu Pro Pro Phe Pro

355 360 365

Ala Asp Val Phe Met Ile Pro Gln Tyr Gly Tyr Leu Thr Leu Asn Asp

370 375 380

Gly Ser Gln Ala Val Gly Arg Ser Ser Phe Tyr Cys Leu Glu Tyr Phe

385 390 395 400

Pro Ser Gln Met Leu Arg Thr Gly Asn Asn Phe Gln Phe Ser Tyr Glu

405 410 415

Phe Glu Asn Val Pro Phe His Ser Ser Tyr Ala His Ser Gln Ser Leu

420 425 430

Asp Arg Leu Met Asn Pro Leu Ile Asp Gln Tyr Leu Tyr Tyr Leu Ser

435 440 445

Lys Thr Ile Asn Gly Ser Gly Gln Asn Gln Gln Thr Leu Lys Phe Ser

450 455 460

Val Ala Gly Pro Ser Asn Met Ala Val Gln Gly Arg Asn Tyr Ile Pro

465 470 475 480

Gly Pro Ser Tyr Arg Gln Gln Arg Val Ser Thr Thr Val Thr Gln Asn

485 490 495

Asn Asn Ser Glu Phe Ala Trp Pro Gly Ala Ser Ser Trp Ala Leu Asn

500 505 510

Gly Arg Asn Ser Leu Met Asn Pro Gly Pro Ala Met Ala Ser His Lys

515 520 525

Glu Gly Glu Asp Arg Phe Phe Pro Leu Ser Gly Ser Leu Ile Phe Gly

530 535 540

Lys Gln Gly Thr Gly Arg Asp Asn Val Asp Ala Asp Lys Val Met Ile

545 550 555 560

Thr Asn Glu Glu Glu Ile Lys Thr Thr Asn Pro Val Ala Thr Glu Ser

565 570 575

Tyr Gly Gln Val Ala Thr Asn His Gln Ser Ala Gln Gly Gly Ser Ser

580 585 590

Asn Thr Val Lys Leu Thr Ser Gly His Gly Gly Ala Gln Ala Gln Thr

595 600 605

Gly Trp Val Gln Asn Gln Gly Ile Leu Pro Gly Met Val Trp Gln Asp

610 615 620

Arg Asp Val Tyr Leu Gln Gly Pro Ile Trp Ala Lys Ile Pro His Thr

625 630 635 640

Asp Gly Asn Phe His Pro Ser Pro Leu Met Gly Gly Phe Gly Met Lys

645 650 655

His Pro Pro Pro Gln Ile Leu Ile Lys Asn Thr Pro Val Pro Ala Asp

660 665 670

Pro Pro Thr Ala Phe Asn Lys Asp Lys Leu Asn Ser Phe Ile Thr Gln

675 680 685

Tyr Ser Thr Gly Gln Val Ser Val Glu Ile Glu Trp Glu Leu Gln Lys

690 695 700

Glu Asn Ser Lys Arg Trp Asn Pro Glu Ile Gln Tyr Thr Ser Asn Tyr

705 710 715 720

Tyr Lys Ser Asn Asn Val Glu Phe Ala Val Asn Thr Glu Gly Val Tyr

725 730 735

Ser Glu Pro Arg Pro Ile Gly Thr Arg Tyr Leu Thr Arg Asn Leu

740 745 750

<210> 36

<211> 2259

<212> DNA

<213> artificial sequence

<220>

<223> AAV9 Cap n.588.VLT-L nucleic acid sequence

<220>

<221> misc_feature

<222> (1765)..(1812)

<223> VLT-L

<400> 36

atggctgccg atggttatct tccagattgg ctcgaggaca accttagtga aggaattcgc 60

gagtggtggg ctttgaaacc tggagcccct caacccaagg caaatcaaca acatcaagac 120

aacgctcgag gtcttgtgct tccgggttac aaataccttg gacccggcaa cggactcgac 180

aagggggagc cggtcaacgc agcagacgcg gcggccctcg agcacgacaa ggcctacgac 240

cagcagctca aggccggaga caacccgtac ctcaagtaca accacgccga cgccgagttc 300

caggagcggc tcaaagaaga tacgtctttt gggggcaacc tcgggcgagc agtcttccag 360

gccaaaaaga ggcttcttga acctcttggt ctggttgagg aagcggctaa gacggctcct 420

ggaaagaaga ggcctgtaga gcagtctcct caggaaccgg actcctccgc gggtattggc 480

aaatcgggtg cacagcccgc taaaaagaga ctcaatttcg gtcagactgg cgacacagag 540

tcagtcccag accctcaacc aatcggagaa cctcccgcag ccccctcagg tgtgggatct 600

cttacaatgg cttcaggtgg tggcgcacca gtggcagaca ataacgaagg tgccgatgga 660

gtgggtagtt cctcgggaaa ttggcattgc gattcccaat ggctggggga cagagtcatc 720

accaccagca cccgaacctg ggccctgccc acctacaaca atcacctcta caagcaaatc 780

tccaacagca catctggagg atcttcaaat gacaacgcct acttcggcta cagcaccccc 840

tgggggtatt ttgacttcaa cagattccac tgccacttct caccacgtga ctggcagcga 900

ctcatcaaca acaactgggg attccggcct aagcgactca acttcaagct cttcaacatt 960

caggtcaaag aggttacgga caacaatgga gtcaagacca tcgccaataa ccttaccagc 1020

acggtccagg tcttcacgga ctcagactat cagctcccgt acgtgctcgg gtcggctcac 1080

gagggctgcc tcccgccgtt cccagcggac gttttcatga ttcctcagta cgggtatctg 1140

acgcttaatg atggaagcca ggccgtgggt cgttcgtcct tttactgcct ggaatatttc 1200

ccgtcgcaaa tgctaagaac gggtaacaac ttccagttca gctacgagtt tgagaacgta 1260

cctttccata gcagctacgc tcacagccaa agcctggacc gactaatgaa tccactcatc 1320

gaccaatact tgtactatct ctcaaagact attaacggtt ctggacagaa tcaacaaacg 1380

ctaaaattca gtgtggccgg acccagcaac atggctgtcc agggaagaaa ctacatacct 1440

ggacccagct accgacaaca acgtgtctca accactgtga ctcaaaacaa caacagcgaa 1500

tttgcttggc ctggagcttc ttcttgggct ctcaatggac gtaatagctt gatgaatcct 1560

ggacctgcta tggccagcca caaagaagga gaggaccgtt tctttccttt gtctggatct 1620

ttaatttttg gcaaacaagg aactggaaga gacaacgtgg atgcggacaa agtcatgata 1680

accaacgaag aagaaattaa aactactaac ccggtagcaa cggagtccta tggacaagtg 1740

gccacaaacc accagagtgc ccaaggagga gtgctgacca acatcgctag aggagagtac 1800

atgagaggag gagcacaggc gcagaccggc tgggttcaaa accaaggaat acttccgggt 1860

atggtttggc aggacagaga tgtgtacctg caaggaccca tttgggccaa aattcctcac 1920

acggacggca actttcaccc ttctccgctg atgggagggt ttggaatgaa gcacccgcct 1980

cctcagatcc tcatcaaaaa cacacctgta cctgcggatc ctccaacggc cttcaacaag 2040

gacaagctga actctttcat cacccagtat tctactggcc aagtcagcgt ggagatcgag 2100

tgggagctgc agaaggaaaa cagcaagcgc tggaacccgg agatccagta cacttccaac 2160

tattacaagt ctaataatgt tgaatttgct gttaatactg aaggtgtata tagtgaaccc 2220

cgccccattg gcaccagata cctgactcgt aatctgtaa 2259

<210> 37

<211> 752

<212> PRT

<213> artificial sequence

<220>

<223> AAV9 Cap n.588.VLT-L amino acid sequence

<220>

<221> MISC_FEATURE

<222> (589)..(604)

<223> VLT-L

<400> 37

Met Ala Ala Asp Gly Tyr Leu Pro Asp Trp Leu Glu Asp Asn Leu Ser

1 5 10 15

Glu Gly Ile Arg Glu Trp Trp Ala Leu Lys Pro Gly Ala Pro Gln Pro

20 25 30

Lys Ala Asn Gln Gln His Gln Asp Asn Ala Arg Gly Leu Val Leu Pro

35 40 45

Gly Tyr Lys Tyr Leu Gly Pro Gly Asn Gly Leu Asp Lys Gly Glu Pro

50 55 60

Val Asn Ala Ala Asp Ala Ala Ala Leu Glu His Asp Lys Ala Tyr Asp

65 70 75 80

Gln Gln Leu Lys Ala Gly Asp Asn Pro Tyr Leu Lys Tyr Asn His Ala

85 90 95

Asp Ala Glu Phe Gln Glu Arg Leu Lys Glu Asp Thr Ser Phe Gly Gly

100 105 110

Asn Leu Gly Arg Ala Val Phe Gln Ala Lys Lys Arg Leu Leu Glu Pro

115 120 125

Leu Gly Leu Val Glu Glu Ala Ala Lys Thr Ala Pro Gly Lys Lys Arg

130 135 140

Pro Val Glu Gln Ser Pro Gln Glu Pro Asp Ser Ser Ala Gly Ile Gly

145 150 155 160

Lys Ser Gly Ala Gln Pro Ala Lys Lys Arg Leu Asn Phe Gly Gln Thr

165 170 175

Gly Asp Thr Glu Ser Val Pro Asp Pro Gln Pro Ile Gly Glu Pro Pro

180 185 190

Ala Ala Pro Ser Gly Val Gly Ser Leu Thr Met Ala Ser Gly Gly Gly

195 200 205

Ala Pro Val Ala Asp Asn Asn Glu Gly Ala Asp Gly Val Gly Ser Ser

210 215 220

Ser Gly Asn Trp His Cys Asp Ser Gln Trp Leu Gly Asp Arg Val Ile

225 230 235 240

Thr Thr Ser Thr Arg Thr Trp Ala Leu Pro Thr Tyr Asn Asn His Leu

245 250 255

Tyr Lys Gln Ile Ser Asn Ser Thr Ser Gly Gly Ser Ser Asn Asp Asn

260 265 270

Ala Tyr Phe Gly Tyr Ser Thr Pro Trp Gly Tyr Phe Asp Phe Asn Arg

275 280 285

Phe His Cys His Phe Ser Pro Arg Asp Trp Gln Arg Leu Ile Asn Asn

290 295 300

Asn Trp Gly Phe Arg Pro Lys Arg Leu Asn Phe Lys Leu Phe Asn Ile

305 310 315 320

Gln Val Lys Glu Val Thr Asp Asn Asn Gly Val Lys Thr Ile Ala Asn

325 330 335

Asn Leu Thr Ser Thr Val Gln Val Phe Thr Asp Ser Asp Tyr Gln Leu

340 345 350

Pro Tyr Val Leu Gly Ser Ala His Glu Gly Cys Leu Pro Pro Phe Pro

355 360 365

Ala Asp Val Phe Met Ile Pro Gln Tyr Gly Tyr Leu Thr Leu Asn Asp

370 375 380

Gly Ser Gln Ala Val Gly Arg Ser Ser Phe Tyr Cys Leu Glu Tyr Phe

385 390 395 400

Pro Ser Gln Met Leu Arg Thr Gly Asn Asn Phe Gln Phe Ser Tyr Glu

405 410 415

Phe Glu Asn Val Pro Phe His Ser Ser Tyr Ala His Ser Gln Ser Leu

420 425 430

Asp Arg Leu Met Asn Pro Leu Ile Asp Gln Tyr Leu Tyr Tyr Leu Ser

435 440 445

Lys Thr Ile Asn Gly Ser Gly Gln Asn Gln Gln Thr Leu Lys Phe Ser

450 455 460

Val Ala Gly Pro Ser Asn Met Ala Val Gln Gly Arg Asn Tyr Ile Pro

465 470 475 480

Gly Pro Ser Tyr Arg Gln Gln Arg Val Ser Thr Thr Val Thr Gln Asn

485 490 495

Asn Asn Ser Glu Phe Ala Trp Pro Gly Ala Ser Ser Trp Ala Leu Asn

500 505 510

Gly Arg Asn Ser Leu Met Asn Pro Gly Pro Ala Met Ala Ser His Lys

515 520 525

Glu Gly Glu Asp Arg Phe Phe Pro Leu Ser Gly Ser Leu Ile Phe Gly

530 535 540

Lys Gln Gly Thr Gly Arg Asp Asn Val Asp Ala Asp Lys Val Met Ile

545 550 555 560

Thr Asn Glu Glu Glu Ile Lys Thr Thr Asn Pro Val Ala Thr Glu Ser

565 570 575

Tyr Gly Gln Val Ala Thr Asn His Gln Ser Ala Gln Gly Gly Val Leu

580 585 590

Thr Asn Ile Ala Arg Gly Glu Tyr Met Arg Gly Gly Ala Gln Ala Gln

595 600 605

Thr Gly Trp Val Gln Asn Gln Gly Ile Leu Pro Gly Met Val Trp Gln

610 615 620

Asp Arg Asp Val Tyr Leu Gln Gly Pro Ile Trp Ala Lys Ile Pro His

625 630 635 640

Thr Asp Gly Asn Phe His Pro Ser Pro Leu Met Gly Gly Phe Gly Met

645 650 655

Lys His Pro Pro Pro Gln Ile Leu Ile Lys Asn Thr Pro Val Pro Ala

660 665 670

Asp Pro Pro Thr Ala Phe Asn Lys Asp Lys Leu Asn Ser Phe Ile Thr

675 680 685

Gln Tyr Ser Thr Gly Gln Val Ser Val Glu Ile Glu Trp Glu Leu Gln

690 695 700

Lys Glu Asn Ser Lys Arg Trp Asn Pro Glu Ile Gln Tyr Thr Ser Asn

705 710 715 720

Tyr Tyr Lys Ser Asn Asn Val Glu Phe Ala Val Asn Thr Glu Gly Val

725 730 735

Tyr Ser Glu Pro Arg Pro Ile Gly Thr Arg Tyr Leu Thr Arg Asn Leu

740 745 750

<210> 38

<211> 11

<212> PRT

<213> artificial sequence

<220>

<223> EFS peptide sequence

<400> 38

Glu Phe Ser Ser Asn Thr Val Lys Leu Thr Ser

1 5 10

<210> 39

<211> 15

<212> PRT

<213> artificial sequence

<220>

<223> SSN-L peptide sequence

<400> 39

Gly Gly Ser Ser Asn Thr Val Lys Leu Thr Ser Gly His Gly Gly

1 5 10 15

<210> 40

<211> 11

<212> PRT

<213> artificial sequence

<220>

<223> SSN peptide sequence

<400> 40

Ser Ser Asn Thr Val Lys Leu Thr Ser Gly His

1 5 10

<210> 41

<211> 10

<212> PRT

<213> artificial sequence

<220>

<223> SAN peptide sequence

<400> 41

Ser Ala Asn Phe Ile Lys Pro Thr Ser Tyr

1 5 10

<210> 42

<211> 12

<212> PRT

<213> artificial sequence

<220>

<223> IEI peptide sequence

<400> 42

Ile Glu Ile Asn Ala Thr Arg Ala Gly Thr Asn Leu

1 5 10

<210> 43

<211> 16

<212> PRT

<213> artificial sequence

<220>

<223> IEI-L peptide sequence

<400> 43

Gly Gly Ile Glu Ile Asn Ala Thr Arg Ala Gly Thr Asn Leu Gly Gly

1 5 10 15

<210> 44

<211> 736

<212> PRT

<213> artificial sequence

<220>

<223> AAV9 capsid

<400> 44

Met Ala Ala Asp Gly Tyr Leu Pro Asp Trp Leu Glu Asp Asn Leu Ser

1 5 10 15

Glu Gly Ile Arg Glu Trp Trp Ala Leu Lys Pro Gly Ala Pro Gln Pro

20 25 30

Lys Ala Asn Gln Gln His Gln Asp Asn Ala Arg Gly Leu Val Leu Pro

35 40 45

Gly Tyr Lys Tyr Leu Gly Pro Gly Asn Gly Leu Asp Lys Gly Glu Pro

50 55 60

Val Asn Ala Ala Asp Ala Ala Ala Leu Glu His Asp Lys Ala Tyr Asp

65 70 75 80

Gln Gln Leu Lys Ala Gly Asp Asn Pro Tyr Leu Lys Tyr Asn His Ala

85 90 95

Asp Ala Glu Phe Gln Glu Arg Leu Lys Glu Asp Thr Ser Phe Gly Gly

100 105 110

Asn Leu Gly Arg Ala Val Phe Gln Ala Lys Lys Arg Leu Leu Glu Pro

115 120 125

Leu Gly Leu Val Glu Glu Ala Ala Lys Thr Ala Pro Gly Lys Lys Arg

130 135 140

Pro Val Glu Gln Ser Pro Gln Glu Pro Asp Ser Ser Ala Gly Ile Gly

145 150 155 160

Lys Ser Gly Ala Gln Pro Ala Lys Lys Arg Leu Asn Phe Gly Gln Thr

165 170 175

Gly Asp Thr Glu Ser Val Pro Asp Pro Gln Pro Ile Gly Glu Pro Pro

180 185 190

Ala Ala Pro Ser Gly Val Gly Ser Leu Thr Met Ala Ser Gly Gly Gly

195 200 205

Ala Pro Val Ala Asp Asn Asn Glu Gly Ala Asp Gly Val Gly Ser Ser

210 215 220

Ser Gly Asn Trp His Cys Asp Ser Gln Trp Leu Gly Asp Arg Val Ile

225 230 235 240

Thr Thr Ser Thr Arg Thr Trp Ala Leu Pro Thr Tyr Asn Asn His Leu

245 250 255

Tyr Lys Gln Ile Ser Asn Ser Thr Ser Gly Gly Ser Ser Asn Asp Asn

260 265 270

Ala Tyr Phe Gly Tyr Ser Thr Pro Trp Gly Tyr Phe Asp Phe Asn Arg

275 280 285

Phe His Cys His Phe Ser Pro Arg Asp Trp Gln Arg Leu Ile Asn Asn

290 295 300

Asn Trp Gly Phe Arg Pro Lys Arg Leu Asn Phe Lys Leu Phe Asn Ile

305 310 315 320

Gln Val Lys Glu Val Thr Asp Asn Asn Gly Val Lys Thr Ile Ala Asn

325 330 335

Asn Leu Thr Ser Thr Val Gln Val Phe Thr Asp Ser Asp Tyr Gln Leu

340 345 350

Pro Tyr Val Leu Gly Ser Ala His Glu Gly Cys Leu Pro Pro Phe Pro

355 360 365

Ala Asp Val Phe Met Ile Pro Gln Tyr Gly Tyr Leu Thr Leu Asn Asp

370 375 380

Gly Ser Gln Ala Val Gly Arg Ser Ser Phe Tyr Cys Leu Glu Tyr Phe

385 390 395 400

Pro Ser Gln Met Leu Arg Thr Gly Asn Asn Phe Gln Phe Ser Tyr Glu

405 410 415

Phe Glu Asn Val Pro Phe His Ser Ser Tyr Ala His Ser Gln Ser Leu

420 425 430

Asp Arg Leu Met Asn Pro Leu Ile Asp Gln Tyr Leu Tyr Tyr Leu Ser

435 440 445

Lys Thr Ile Asn Gly Ser Gly Gln Asn Gln Gln Thr Leu Lys Phe Ser

450 455 460

Val Ala Gly Pro Ser Asn Met Ala Val Gln Gly Arg Asn Tyr Ile Pro

465 470 475 480

Gly Pro Ser Tyr Arg Gln Gln Arg Val Ser Thr Thr Val Thr Gln Asn

485 490 495

Asn Asn Ser Glu Phe Ala Trp Pro Gly Ala Ser Ser Trp Ala Leu Asn

500 505 510

Gly Arg Asn Ser Leu Met Asn Pro Gly Pro Ala Met Ala Ser His Lys

515 520 525

Glu Gly Glu Asp Arg Phe Phe Pro Leu Ser Gly Ser Leu Ile Phe Gly

530 535 540

Lys Gln Gly Thr Gly Arg Asp Asn Val Asp Ala Asp Lys Val Met Ile

545 550 555 560

Thr Asn Glu Glu Glu Ile Lys Thr Thr Asn Pro Val Ala Thr Glu Ser

565 570 575

Tyr Gly Gln Val Ala Thr Asn His Gln Ser Ala Gln Ala Gln Ala Gln

580 585 590

Thr Gly Trp Val Gln Asn Gln Gly Ile Leu Pro Gly Met Val Trp Gln

595 600 605

Asp Arg Asp Val Tyr Leu Gln Gly Pro Ile Trp Ala Lys Ile Pro His

610 615 620

Thr Asp Gly Asn Phe His Pro Ser Pro Leu Met Gly Gly Phe Gly Met

625 630 635 640

Lys His Pro Pro Pro Gln Ile Leu Ile Lys Asn Thr Pro Val Pro Ala

645 650 655

Asp Pro Pro Thr Ala Phe Asn Lys Asp Lys Leu Asn Ser Phe Ile Thr

660 665 670

Gln Tyr Ser Thr Gly Gln Val Ser Val Glu Ile Glu Trp Glu Leu Gln

675 680 685

Lys Glu Asn Ser Lys Arg Trp Asn Pro Glu Ile Gln Tyr Thr Ser Asn

690 695 700

Tyr Tyr Lys Ser Asn Asn Val Glu Phe Ala Val Asn Thr Glu Gly Val

705 710 715 720

Tyr Ser Glu Pro Arg Pro Ile Gly Thr Arg Tyr Leu Thr Arg Asn Leu

725 730 735

<210> 45

<211> 736

<212> PRT

<213> artificial sequence

<220>

<223> AAVhu68 capsid

<400> 45

Met Ala Ala Asp Gly Tyr Leu Pro Asp Trp Leu Glu Asp Asn Leu Ser

1 5 10 15

Glu Gly Ile Arg Glu Trp Trp Ala Leu Lys Pro Gly Ala Pro Gln Pro

20 25 30

Lys Ala Asn Gln Gln His Gln Asp Asn Ala Arg Gly Leu Val Leu Pro

35 40 45

Gly Tyr Lys Tyr Leu Gly Pro Gly Asn Gly Leu Asp Lys Gly Glu Pro

50 55 60

Val Asn Glu Ala Asp Ala Ala Ala Leu Glu His Asp Lys Ala Tyr Asp

65 70 75 80

Gln Gln Leu Lys Ala Gly Asp Asn Pro Tyr Leu Lys Tyr Asn His Ala

85 90 95

Asp Ala Glu Phe Gln Glu Arg Leu Lys Glu Asp Thr Ser Phe Gly Gly

100 105 110

Asn Leu Gly Arg Ala Val Phe Gln Ala Lys Lys Arg Leu Leu Glu Pro

115 120 125

Leu Gly Leu Val Glu Glu Ala Ala Lys Thr Ala Pro Gly Lys Lys Arg

130 135 140

Pro Val Glu Gln Ser Pro Gln Glu Pro Asp Ser Ser Val Gly Ile Gly

145 150 155 160

Lys Ser Gly Ala Gln Pro Ala Lys Lys Arg Leu Asn Phe Gly Gln Thr

165 170 175

Gly Asp Thr Glu Ser Val Pro Asp Pro Gln Pro Ile Gly Glu Pro Pro

180 185 190

Ala Ala Pro Ser Gly Val Gly Ser Leu Thr Met Ala Ser Gly Gly Gly

195 200 205

Ala Pro Val Ala Asp Asn Asn Glu Gly Ala Asp Gly Val Gly Ser Ser

210 215 220

Ser Gly Asn Trp His Cys Asp Ser Gln Trp Leu Gly Asp Arg Val Ile

225 230 235 240

Thr Thr Ser Thr Arg Thr Trp Ala Leu Pro Thr Tyr Asn Asn His Leu

245 250 255

Tyr Lys Gln Ile Ser Asn Ser Thr Ser Gly Gly Ser Ser Asn Asp Asn

260 265 270

Ala Tyr Phe Gly Tyr Ser Thr Pro Trp Gly Tyr Phe Asp Phe Asn Arg

275 280 285

Phe His Cys His Phe Ser Pro Arg Asp Trp Gln Arg Leu Ile Asn Asn

290 295 300

Asn Trp Gly Phe Arg Pro Lys Arg Leu Asn Phe Lys Leu Phe Asn Ile

305 310 315 320

Gln Val Lys Glu Val Thr Asp Asn Asn Gly Val Lys Thr Ile Ala Asn

325 330 335

Asn Leu Thr Ser Thr Val Gln Val Phe Thr Asp Ser Asp Tyr Gln Leu

340 345 350

Pro Tyr Val Leu Gly Ser Ala His Glu Gly Cys Leu Pro Pro Phe Pro

355 360 365

Ala Asp Val Phe Met Ile Pro Gln Tyr Gly Tyr Leu Thr Leu Asn Asp

370 375 380

Gly Ser Gln Ala Val Gly Arg Ser Ser Phe Tyr Cys Leu Glu Tyr Phe

385 390 395 400

Pro Ser Gln Met Leu Arg Thr Gly Asn Asn Phe Gln Phe Ser Tyr Glu

405 410 415

Phe Glu Asn Val Pro Phe His Ser Ser Tyr Ala His Ser Gln Ser Leu

420 425 430

Asp Arg Leu Met Asn Pro Leu Ile Asp Gln Tyr Leu Tyr Tyr Leu Ser

435 440 445

Lys Thr Ile Asn Gly Ser Gly Gln Asn Gln Gln Thr Leu Lys Phe Ser

450 455 460

Val Ala Gly Pro Ser Asn Met Ala Val Gln Gly Arg Asn Tyr Ile Pro

465 470 475 480

Gly Pro Ser Tyr Arg Gln Gln Arg Val Ser Thr Thr Val Thr Gln Asn

485 490 495

Asn Asn Ser Glu Phe Ala Trp Pro Gly Ala Ser Ser Trp Ala Leu Asn

500 505 510

Gly Arg Asn Ser Leu Met Asn Pro Gly Pro Ala Met Ala Ser His Lys

515 520 525

Glu Gly Glu Asp Arg Phe Phe Pro Leu Ser Gly Ser Leu Ile Phe Gly

530 535 540

Lys Gln Gly Thr Gly Arg Asp Asn Val Asp Ala Asp Lys Val Met Ile

545 550 555 560

Thr Asn Glu Glu Glu Ile Lys Thr Thr Asn Pro Val Ala Thr Glu Ser

565 570 575

Tyr Gly Gln Val Ala Thr Asn His Gln Ser Ala Gln Ala Gln Ala Gln

580 585 590

Thr Gly Trp Val Gln Asn Gln Gly Ile Leu Pro Gly Met Val Trp Gln

595 600 605

Asp Arg Asp Val Tyr Leu Gln Gly Pro Ile Trp Ala Lys Ile Pro His

610 615 620

Thr Asp Gly Asn Phe His Pro Ser Pro Leu Met Gly Gly Phe Gly Met

625 630 635 640

Lys His Pro Pro Pro Gln Ile Leu Ile Lys Asn Thr Pro Val Pro Ala

645 650 655

Asp Pro Pro Thr Ala Phe Asn Lys Asp Lys Leu Asn Ser Phe Ile Thr

660 665 670

Gln Tyr Ser Thr Gly Gln Val Ser Val Glu Ile Glu Trp Glu Leu Gln

675 680 685

Lys Glu Asn Ser Lys Arg Trp Asn Pro Glu Ile Gln Tyr Thr Ser Asn

690 695 700

Tyr Tyr Lys Ser Asn Asn Val Glu Phe Ala Val Asn Thr Glu Gly Val

705 710 715 720

Tyr Ser Glu Pro Arg Pro Ile Gly Thr Arg Tyr Leu Thr Arg Asn Leu

725 730 735

<210> 46

<211> 16

<212> PRT

<213> artificial sequence

<220>

<223> VLT-L peptide sequence

<400> 46

Gly Gly Val Leu Thr Asn Ile Ala Arg Gly Glu Tyr Met Arg Gly Gly

1 5 10 15

<210> 47

<211> 4

<212> PRT

<213> artificial sequence

<220>

<223> N-X-(T/I/V/A)-(K/R) motif

<220>

<221> MISC_FEATURE

<222> (2)..(2)

<223> any amino acid

<220>

<221> MISC_FEATURE

<222> (3)..(3)

<223> Xaa is selected from threonine (T), isoleucine (I), valine (V) or alanine (A)

<220>

<221> MISC_FEATURE

<222> (4)..(4)

<223> Xaa is selected from lysine (K) or arginine (R)

<400> 47

Asn Xaa Xaa Xaa

1
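The degenerate motif of SEQ ID NO: 47 can be read as a four-position pattern: Asn, then any residue, then one of T/I/V/A, then K or R. The following is a minimal illustrative sketch, not part of the original listing, that expresses this pattern as a regular expression and scans the short peptides of SEQ ID NOs: 38-43, 46, 48 and 49 in one-letter code. The names `MOTIF`, `PEPTIDES` and `find_motif` are hypothetical, and restricting position 2 to the 20 standard amino acids is an assumption.

```python
import re

# Position 1: Asn (N); position 2: any standard residue (assumption);
# position 3: T, I, V or A; position 4: K or R.
MOTIF = re.compile(r"N[ACDEFGHIKLMNPQRSTVWY][TIVA][KR]")

# One-letter transcriptions of the peptides given in the sequence listing.
PEPTIDES = {
    38: "EFSSNTVKLTS",       # EFS
    39: "GGSSNTVKLTSGHGG",   # SSN-L
    40: "SSNTVKLTSGH",       # SSN
    41: "SANFIKPTSY",        # SAN
    42: "IEINATRAGTNL",      # IEI
    43: "GGIEINATRAGTNLGG",  # IEI-L
    46: "GGVLTNIARGEYMRGG",  # VLT-L
    48: "NDVRAVS",           # AAV2 variant peptide (excluded by proviso)
    49: "TLAVPFK",           # PHP.B peptide insert
}


def find_motif(peptide):
    """Return (1-based start, matched tetrapeptide) of the first hit, or None."""
    match = MOTIF.search(peptide)
    if match is None:
        return None
    return match.start() + 1, match.group()


if __name__ == "__main__":
    for seq_id, peptide in PEPTIDES.items():
        print(f"SEQ ID NO: {seq_id}\t{peptide}\t{find_motif(peptide)}")
```

Under these assumptions, the scan reports a motif occurrence in each of SEQ ID NOs: 38-43, 46 and 48, and none in SEQ ID NO: 49, which contains no asparagine.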

<210> 48

<211> 7

<212> PRT

<213> artificial sequence

<220>

<223> AAV2 variant peptide NDVRAVS

<400> 48

Asn Asp Val Arg Ala Val Ser

1 5

<210> 49

<211> 7

<212> PRT

<213> artificial sequence

<220>

<223> PHP.B peptide insert

<400> 49

Thr Leu Ala Val Pro Phe Lys

1 5

<210> 50

<211> 33

<212> DNA

<213> artificial sequence

<220>

<223> nucleic acid sequence EFS

<400> 50

gagttcagca gcaacaccgt gaagctgacc agc 33

<210> 51

<211> 36

<212> DNA

<213> artificial sequence

<220>

<223> nucleic acid sequence IEI

<400> 51

atcgagatca acgctaccag agctggaacc aacctg 36

<210> 52

<211> 48

<212> DNA

<213> artificial sequence

<220>

<223> nucleic acid sequence IEI-L

<400> 52

ggaggaatcg agatcaacgc taccagagct ggaaccaacc tgggagga 48

<210> 53

<211> 30

<212> DNA

<213> artificial sequence

<220>

<223> nucleic acid sequence SAN

<400> 53

agcgctaact tcatcaagcc taccagctac 30

<210> 54

<211> 33

<212> DNA

<213> artificial sequence

<220>

<223> nucleic acid sequence SSN

<400> 54

agcagcaaca ccgtgaagct gaccagcgga cac 33

<210> 55

<211> 45

<212> DNA

<213> artificial sequence

<220>

<223> nucleic acid sequence SSN-L

<400> 55

ggaggaagca gcaacaccgt gaagctgacc agcggacacg gagga 45

<210> 56

<211> 48

<212> DNA

<213> artificial sequence

<220>

<223> nucleic acid sequence VLT-L

<400> 56

ggaggagtgc tgaccaacat cgctagagga gagtacatga gaggagga 48
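The nucleic acid sequences of SEQ ID NOs: 50-56 are labeled as encoding the EFS, IEI, IEI-L, SAN, SSN, SSN-L and VLT-L peptides. The following is a minimal sketch, not part of the original listing, that translates each DNA sequence with the standard genetic code and compares the result with the corresponding peptide (SEQ ID NOs: 38-43 and 46). The pairing of DNA and peptide records is inferred from the <223> labels, and the names `CODON_TABLE`, `translate` and `PAIRS` are hypothetical.

```python
from itertools import product

# Standard genetic code, codons ordered T, C, A, G at each position.
_BASES = "TCAG"
_AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(_BASES, repeat=3), _AMINO)
}


def translate(dna):
    """Translate a DNA string (spaces and lowercase allowed) in frame 1."""
    seq = dna.replace(" ", "").upper()
    if len(seq) % 3 != 0:
        raise ValueError("sequence length is not a multiple of three")
    return "".join(CODON_TABLE[seq[i:i + 3]] for i in range(0, len(seq), 3))


# (DNA SEQ ID NO, DNA as listed, peptide SEQ ID NO, peptide in one-letter code)
PAIRS = [
    (50, "gagttcagca gcaacaccgt gaagctgacc agc", 38, "EFSSNTVKLTS"),
    (51, "atcgagatca acgctaccag agctggaacc aacctg", 42, "IEINATRAGTNL"),
    (52, "ggaggaatcg agatcaacgc taccagagct ggaaccaacc tgggagga", 43,
     "GGIEINATRAGTNLGG"),
    (53, "agcgctaact tcatcaagcc taccagctac", 41, "SANFIKPTSY"),
    (54, "agcagcaaca ccgtgaagct gaccagcgga cac", 40, "SSNTVKLTSGH"),
    (55, "ggaggaagca gcaacaccgt gaagctgacc agcggacacg gagga", 39,
     "GGSSNTVKLTSGHGG"),
    (56, "ggaggagtgc tgaccaacat cgctagagga gagtacatga gaggagga", 46,
     "GGVLTNIARGEYMRGG"),
]

if __name__ == "__main__":
    for dna_id, dna, pep_id, peptide in PAIRS:
        status = "match" if translate(dna) == peptide else "MISMATCH"
        print(f"SEQ ID NO: {dna_id} -> SEQ ID NO: {pep_id}: {status}")
```

A check of this kind is a simple way to confirm that the declared lengths and the peptide and nucleic acid records of a listing agree with one another.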

Claims

1. A recombinant adeno-associated viral particle (rAAV) having a capsid comprising an amino acid sequence comprising the motif N-X-(T/I/V/A)-(K/R) (SEQ ID NO: 47), wherein the amino acid sequence is at least a portion of an AAV vp3 protein in the capsid, and a vector genome packaged in the capsid, the vector genome comprising a nucleic acid sequence encoding a gene product under the control of a sequence that directs expression of the gene product, with the proviso that the capsid is not a mutant AAV2 capsid comprising an NDVRAVS (SEQ ID NO: 48) sequence.

2. The rAAV of claim 1, wherein the amino acid sequence comprising the N-X-(T/I/V/A)-(K/R) motif is inserted into an AAV capsid vp3 region, optionally flanked by two to seven amino acids at the amino-terminus and/or carboxy-terminus of the motif.

3. The rAAV of claim 1 or 2, wherein the sequence inserted into the capsid comprises:

(a) SSNTVKLTSGH (SEQ ID NO: 40);

(b) EFSSNTVKLTS (SEQ ID NO: 38);

(c) GGVLTNIARGEYMRGG (SEQ ID NO: 46);

(d) GGIEINATRAGTNLGG (SEQ ID NO: 43);

(e) GGSSNTVKLTSGHGG (SEQ ID NO: 39);

(f) IEINATRAGTNL (SEQ ID NO: 42); or

(g) SANFIKPTSY (SEQ ID NO: 41).

4. The rAAV of any one of claims 1-3, wherein the amino acid sequence of the motif is NTVK.

5. The rAAV of claim 1, wherein the motif N-X-(T/I/V/A)-(K/R) (SEQ ID NO: 47) is optionally flanked by two to seven amino acids at the carboxy- and/or amino-terminus and is inserted between amino acids 588 and 589 of the AAV9 capsid protein, based on the numbering of the amino acid sequence of SEQ ID NO: 44.
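The insertion position recited in claim 5 can be illustrated with a short sketch, which is not part of the claims. It uses only residues 577-592 of SEQ ID NO: 44 as they appear in the sequence listing (Tyr577 through Gln592, with Gln588 followed by Ala589) and places a peptide between positions 588 and 589. The function name `insert_after`, the use of a window rather than the full 736-residue capsid, and the choice of the SSN peptide (SEQ ID NO: 40) as the example insert are assumptions made for illustration.

```python
# Residues 577-592 of the AAV9 capsid (SEQ ID NO: 44), one-letter code.
AAV9_WINDOW = "YGQVATNHQSAQAQAQ"
WINDOW_START = 577  # capsid numbering of the first residue in the window


def insert_after(window, window_start, position, peptide):
    """Insert `peptide` immediately after capsid residue `position`.

    `position` uses the 1-based numbering of the full capsid sequence;
    `window_start` is the capsid position of window[0].
    """
    if not window_start <= position < window_start + len(window):
        raise ValueError("position lies outside the supplied window")
    split = position - window_start + 1
    return window[:split] + peptide + window[split:]


if __name__ == "__main__":
    ssn = "SSNTVKLTSGH"  # SEQ ID NO: 40, chosen here only as an example
    modified = insert_after(AAV9_WINDOW, WINDOW_START, 588, ssn)
    print(modified)
```

In this sketch the residues up to and including Gln588 are unchanged, the inserted peptide follows, and the original Ala589 and later residues are shifted toward the carboxy-terminus by the length of the insert.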

6. A composition comprising a stock solution of a rAAV according to any one of claims 1 to 5 and one or more of a physiologically compatible carrier, excipient and/or aqueous suspension matrix.

7. An endothelial cell targeting peptide, wherein the endothelial cell targeting peptide comprises a motif comprising an amino acid sequence of N-X-(T/I/V/A)-(K/R) (SEQ ID NO: 47), optionally flanked by two to seven amino acids at the amino- and/or carboxy-terminus of the motif and optionally further conjugated to a nanoparticle, a second molecule or a viral capsid protein.

8. The endothelial cell targeting peptide according to claim 7, wherein the endothelial cell targeting peptide comprises:

(a) SSNTVKLTSGH (SEQ ID NO: 40);

(b) EFSSNTVKLTS (SEQ ID NO: 38);

(c) GGVLTNIARGEYMRGG (SEQ ID NO: 46);

(d) GGIEINATRAGTNLGG (SEQ ID NO: 43);

(e) GGSSNTVKLTSGHGG (SEQ ID NO: 39);

(f) IEINATRAGTNL (SEQ ID NO: 42); or

(g) SANFIKPTSY (SEQ ID NO: 41).

9. The endothelial cell targeting peptide according to claim 7 or 8, wherein the amino acid sequence of the motif is NTVK.

10. A composition comprising an endothelial cell targeting peptide according to any one of claims 7 to 9 and one or more of a physiologically compatible carrier, excipient and/or aqueous suspension matrix.

11. A fusion polypeptide or protein comprising the brain endothelial cell targeting peptide according to any one of claims 7 to 9 and a fusion partner comprising at least one polypeptide or protein.

12. A composition comprising the fusion polypeptide or protein of claim 11 and one or more of a physiologically compatible carrier, excipient, and/or aqueous suspension matrix.

13. Use of the stock solution of rAAV according to any one of claims 1 to 5, the endothelial cell targeting peptide according to any one of claims 7 to 9, or the fusion polypeptide or protein according to claim 11, or the composition according to any one of claims 6, 10 or 12, for delivering a treatment to a patient in need thereof.

14. A method for targeted therapy of brain endothelial cells, the method comprising administering to a patient in need thereof a stock solution of rAAV according to any one of claims 1 to 5.

15. A method for treating Allan-Herndon-Dudley disease by delivering a stock solution of a rAAV according to any one of claims 1 to 5 to a subject in need thereof, wherein the encoded gene product is an MCT8 protein.

16. A method for targeted therapy of the lung comprising administering to a patient in need thereof a stock solution of rAAV according to any one of claims 1 to 5.

17. A method for treating a pulmonary disease by delivering a stock solution of a rAAV according to any one of claims 1 to 5 to a subject in need thereof, wherein the encoded gene product is a soluble Ace2 protein, an anti-SARS antibody, an anti-CoV 2 antibody, an anti-influenza antibody, or a cystic fibrosis transmembrane protein.

18. A method for increasing transduction of AAV producer cells in vitro, the method comprising inserting an N-X-(T/I/V/A)-(K/R) (SEQ ID NO: 47) motif into an AAV capsid.

19. The method of claim 18, wherein the producer cell is a 293 cell.