WO2024050547A2 - Compact bidirectional promoters for gene expression - Google Patents
Compact bidirectional promoters for gene expression
- Publication number
- WO2024050547A2 (PCT/US2023/073367)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- variant
- promoter
- functional fragment
- cell
- coding sequence
- Prior art date
- 230000002457 bidirectional effect Effects 0.000 title claims abstract description 282
- 230000014509 gene expression Effects 0.000 title claims description 144
- 108090000623 proteins and genes Proteins 0.000 claims abstract description 187
- 208000037265 diseases, disorders, signs and symptoms Diseases 0.000 claims abstract description 13
- 201000010099 disease Diseases 0.000 claims abstract description 12
- 108091026890 Coding region Proteins 0.000 claims description 396
- 239000012634 fragment Substances 0.000 claims description 322
- 210000004027 cell Anatomy 0.000 claims description 281
- 108020004705 Codon Proteins 0.000 claims description 219
- 238000000034 method Methods 0.000 claims description 219
- 239000013598 vector Substances 0.000 claims description 141
- 150000007523 nucleic acids Chemical class 0.000 claims description 115
- -1 FMRI Proteins 0.000 claims description 105
- 102000039446 nucleic acids Human genes 0.000 claims description 85
- 108020004707 nucleic acids Proteins 0.000 claims description 85
- 150000001413 amino acids Chemical class 0.000 claims description 78
- 238000013518 transcription Methods 0.000 claims description 48
- 230000035897 transcription Effects 0.000 claims description 48
- 241000282414 Homo sapiens Species 0.000 claims description 46
- 230000001225 therapeutic effect Effects 0.000 claims description 42
- 108010079245 Cystic Fibrosis Transmembrane Conductance Regulator Proteins 0.000 claims description 37
- 101000997662 Homo sapiens Lysosomal acid glucosylceramidase Proteins 0.000 claims description 34
- 101000575454 Homo sapiens Major facilitator superfamily domain-containing protein 8 Proteins 0.000 claims description 34
- 102100033342 Lysosomal acid glucosylceramidase Human genes 0.000 claims description 34
- 102100025613 Major facilitator superfamily domain-containing protein 8 Human genes 0.000 claims description 34
- 102100037632 Progranulin Human genes 0.000 claims description 34
- 108091028043 Nucleic acid sequence Proteins 0.000 claims description 32
- 108091032973 (ribonucleotides)n+m Proteins 0.000 claims description 25
- 102100028734 1,4-alpha-glucan-branching enzyme Human genes 0.000 claims description 25
- 102100025683 Alkaline phosphatase, tissue-nonspecific isozyme Human genes 0.000 claims description 25
- 102100022794 Bestrophin-1 Human genes 0.000 claims description 25
- 102100031504 Beta-1,4 N-acetylgalactosaminyltransferase 2 Human genes 0.000 claims description 25
- 102100026189 Beta-galactosidase Human genes 0.000 claims description 25
- 102000020038 Cholesterol 24-Hydroxylase Human genes 0.000 claims description 25
- 108091022871 Cholesterol 24-Hydroxylase Proteins 0.000 claims description 25
- 102100035432 Complement factor H Human genes 0.000 claims description 25
- 102100035431 Complement factor I Human genes 0.000 claims description 25
- 102100029142 Cyclic nucleotide-gated cation channel alpha-3 Human genes 0.000 claims description 25
- 108010032606 Fragile X Mental Retardation Protein Proteins 0.000 claims description 25
- 102100028496 Galactocerebrosidase Human genes 0.000 claims description 25
- 101001058479 Homo sapiens 1,4-alpha-glucan-branching enzyme Proteins 0.000 claims description 25
- 101000574445 Homo sapiens Alkaline phosphatase, tissue-nonspecific isozyme Proteins 0.000 claims description 25
- 101000903449 Homo sapiens Bestrophin-1 Proteins 0.000 claims description 25
- 101000729812 Homo sapiens Beta-1,4 N-acetylgalactosaminyltransferase 2 Proteins 0.000 claims description 25
- 101000765010 Homo sapiens Beta-galactosidase Proteins 0.000 claims description 25
- 101000771071 Homo sapiens Cyclic nucleotide-gated cation channel alpha-3 Proteins 0.000 claims description 25
- 101000860395 Homo sapiens Galactocerebrosidase Proteins 0.000 claims description 25
- 101000651201 Homo sapiens N-sulphoglucosamine sulphohydrolase Proteins 0.000 claims description 25
- 101001027324 Homo sapiens Progranulin Proteins 0.000 claims description 25
- 101000846198 Homo sapiens Ribitol 5-phosphate transferase FKRP Proteins 0.000 claims description 25
- 102100029199 Iduronate 2-sulfatase Human genes 0.000 claims description 25
- 108010009491 Lysosomal-Associated Membrane Protein 2 Proteins 0.000 claims description 25
- 102100038225 Lysosome-associated membrane glycoprotein 2 Human genes 0.000 claims description 25
- 102100027661 N-sulphoglucosamine sulphohydrolase Human genes 0.000 claims description 25
- 102100031774 Ribitol 5-phosphate transferase FKRP Human genes 0.000 claims description 25
- 102100034197 Tripeptidyl-peptidase 1 Human genes 0.000 claims description 25
- 201000007640 neuronal ceroid lipofuscinosis 7 Diseases 0.000 claims description 25
- 108020004414 DNA Proteins 0.000 claims description 24
- 102100022207 E3 ubiquitin-protein ligase parkin Human genes 0.000 claims description 24
- 101000619542 Homo sapiens E3 ubiquitin-protein ligase parkin Proteins 0.000 claims description 24
- 101001018717 Homo sapiens Mitofusin-2 Proteins 0.000 claims description 24
- 101000729271 Homo sapiens Retinoid isomerohydrolase Proteins 0.000 claims description 24
- 101150083522 MECP2 gene Proteins 0.000 claims description 24
- 102100039124 Methyl-CpG-binding protein 2 Human genes 0.000 claims description 24
- 102100033703 Mitofusin-2 Human genes 0.000 claims description 24
- 102100031176 Retinoid isomerohydrolase Human genes 0.000 claims description 24
- 102000006601 Thymidine Kinase Human genes 0.000 claims description 22
- 108020004440 Thymidine kinase Proteins 0.000 claims description 22
- 102100034561 Alpha-N-acetylglucosaminidase Human genes 0.000 claims description 21
- 102100022146 Arylsulfatase A Human genes 0.000 claims description 21
- 102100022548 Beta-hexosaminidase subunit alpha Human genes 0.000 claims description 21
- 108091022930 Glutamate decarboxylase Proteins 0.000 claims description 21
- 102100035902 Glutamate decarboxylase 1 Human genes 0.000 claims description 21
- 101001045440 Homo sapiens Beta-hexosaminidase subunit alpha Proteins 0.000 claims description 21
- 101001008411 Homo sapiens Lebercilin Proteins 0.000 claims description 21
- 101001126977 Homo sapiens Methylmalonyl-CoA mutase, mitochondrial Proteins 0.000 claims description 21
- 101000595489 Homo sapiens Phosphatidylinositol N-acetylglucosaminyltransferase subunit A Proteins 0.000 claims description 21
- 102100027443 Lebercilin Human genes 0.000 claims description 21
- 102100030979 Methylmalonyl-CoA mutase, mitochondrial Human genes 0.000 claims description 21
- 102100036050 Phosphatidylinositol N-acetylglucosaminyltransferase subunit A Human genes 0.000 claims description 21
- 102100022881 Rab proteins geranylgeranyltransferase component A 1 Human genes 0.000 claims description 21
- 108010009380 alpha-N-acetyl-D-glucosaminidase Proteins 0.000 claims description 21
- 239000013607 AAV vector Substances 0.000 claims description 20
- 101000840540 Homo sapiens Iduronate 2-sulfatase Proteins 0.000 claims description 19
- 108700009124 Transcription Initiation Site Proteins 0.000 claims description 19
- 239000013612 plasmid Substances 0.000 claims description 19
- 239000013603 viral vector Substances 0.000 claims description 19
- 210000000349 chromosome Anatomy 0.000 claims description 18
- 101001109052 Homo sapiens NADH-ubiquinone oxidoreductase chain 4 Proteins 0.000 claims description 17
- 239000005089 Luciferase Substances 0.000 claims description 17
- 102100021506 NADH-ubiquinone oxidoreductase chain 4 Human genes 0.000 claims description 17
- 101000901140 Homo sapiens Arylsulfatase A Proteins 0.000 claims description 16
- 101000620777 Homo sapiens Rab proteins geranylgeranyltransferase component A 1 Proteins 0.000 claims description 16
- 101001041393 Homo sapiens Serine protease HTRA1 Proteins 0.000 claims description 16
- 108060001084 Luciferase Proteins 0.000 claims description 16
- 102100021119 Serine protease HTRA1 Human genes 0.000 claims description 16
- 108010039203 Tripeptidyl-Peptidase 1 Proteins 0.000 claims description 16
- 230000002207 retinal effect Effects 0.000 claims description 16
- 241000701161 unidentified adenovirus Species 0.000 claims description 16
- 102100035028 Alpha-L-iduronidase Human genes 0.000 claims description 15
- 102100021295 Bardet-Biedl syndrome 1 protein Human genes 0.000 claims description 15
- 102000055157 Complement C1 Inhibitor Human genes 0.000 claims description 15
- 108700040183 Complement C1 Inhibitor Proteins 0.000 claims description 15
- 101001019502 Homo sapiens Alpha-L-iduronidase Proteins 0.000 claims description 15
- 101000737574 Homo sapiens Complement factor H Proteins 0.000 claims description 15
- 101150097162 SERPING1 gene Proteins 0.000 claims description 15
- 102000005028 SLC6A1 Human genes 0.000 claims description 15
- 108060007759 SLC6A1 Proteins 0.000 claims description 15
- 210000003292 kidney cell Anatomy 0.000 claims description 15
- 210000000663 muscle cell Anatomy 0.000 claims description 15
- 210000002569 neuron Anatomy 0.000 claims description 15
- 230000001737 promoting effect Effects 0.000 claims description 15
- 102100022712 Alpha-1-antitrypsin Human genes 0.000 claims description 14
- 101000855412 Homo sapiens Carbamoyl-phosphate synthase [ammonia], mitochondrial Proteins 0.000 claims description 14
- 101000983292 Homo sapiens N-fatty-acyl-amino acid synthase/hydrolase PM20D1 Proteins 0.000 claims description 14
- 101000861263 Homo sapiens Steroid 21-hydroxylase Proteins 0.000 claims description 14
- 102100026873 N-fatty-acyl-amino acid synthase/hydrolase PM20D1 Human genes 0.000 claims description 14
- 108010059385 chemotactic factor inactivator Proteins 0.000 claims description 14
- 210000002889 endothelial cell Anatomy 0.000 claims description 14
- 210000002919 epithelial cell Anatomy 0.000 claims description 14
- 210000005229 liver cell Anatomy 0.000 claims description 14
- 210000005265 lung cell Anatomy 0.000 claims description 14
- 210000004498 neuroglial cell Anatomy 0.000 claims description 14
- 108010022637 Copper-Transporting ATPases Proteins 0.000 claims description 13
- 102100027587 Copper-transporting ATPase 1 Human genes 0.000 claims description 13
- 108010067937 Cyanuric acid amidohydrolase Proteins 0.000 claims description 13
- 108091006587 SLC13A5 Proteins 0.000 claims description 13
- 102100035210 Solute carrier family 13 member 5 Human genes 0.000 claims description 13
- 102100040894 Amylo-alpha-1,6-glucosidase Human genes 0.000 claims description 12
- 101000690509 Aspergillus oryzae (strain ATCC 42149 / RIB 40) Alpha-glucosidase Proteins 0.000 claims description 12
- 102100027591 Copper-transporting ATPase 2 Human genes 0.000 claims description 12
- 101000893559 Homo sapiens Amylo-alpha-1,6-glucosidase Proteins 0.000 claims description 12
- 101000936280 Homo sapiens Copper-transporting ATPase 2 Proteins 0.000 claims description 12
- 210000002845 virion Anatomy 0.000 claims description 12
- 101000823116 Homo sapiens Alpha-1-antitrypsin Proteins 0.000 claims description 11
- 102100038223 Phenylalanine-4-hydroxylase Human genes 0.000 claims description 10
- 101710125939 Phenylalanine-4-hydroxylase Proteins 0.000 claims description 10
- 241000702421 Dependoparvovirus Species 0.000 claims description 9
- 101001132874 Homo sapiens Myotubularin Proteins 0.000 claims description 9
- 102100033817 Myotubularin Human genes 0.000 claims description 9
- 108020003589 5' Untranslated Regions Proteins 0.000 claims description 8
- 101000894722 Homo sapiens Bardet-Biedl syndrome 1 protein Proteins 0.000 claims description 8
- 239000013609 scAAV vector Substances 0.000 claims description 8
- 108091035707 Consensus sequence Proteins 0.000 claims description 7
- 101150026630 FOXG1 gene Proteins 0.000 claims description 6
- 102100020871 Forkhead box protein G1 Human genes 0.000 claims description 6
- 108091008109 Pseudogenes Proteins 0.000 claims description 6
- 102000057361 Pseudogenes Human genes 0.000 claims description 6
- 241000713666 Lentivirus Species 0.000 claims description 5
- 241000829100 Macaca mulatta polyomavirus 1 Species 0.000 claims description 5
- 241001529453 unidentified herpesvirus Species 0.000 claims description 4
- 108091026898 Leader sequence (mRNA) Proteins 0.000 claims description 3
- 108091023045 Untranslated Region Proteins 0.000 claims description 3
- 241000712079 Measles morbillivirus Species 0.000 claims description 2
- 241000700618 Vaccinia virus Species 0.000 claims description 2
- 102100023419 Cystic fibrosis transmembrane conductance regulator Human genes 0.000 claims 5
- 102100023532 Synaptic functional regulator FMR1 Human genes 0.000 claims 5
- 125000003729 nucleotide group Chemical group 0.000 description 109
- 239000002773 nucleotide Substances 0.000 description 108
- 235000001014 amino acid Nutrition 0.000 description 82
- 229940024606 amino acid Drugs 0.000 description 80
- 210000001519 tissue Anatomy 0.000 description 56
- 102000004169 proteins and genes Human genes 0.000 description 49
- 235000018102 proteins Nutrition 0.000 description 46
- 102000040430 polynucleotide Human genes 0.000 description 44
- 108091033319 polynucleotide Proteins 0.000 description 44
- 239000002157 polynucleotide Substances 0.000 description 44
- 230000006870 function Effects 0.000 description 32
- 102000008371 intracellularly ATP-gated chloride channel activity proteins Human genes 0.000 description 32
- 238000004519 manufacturing process Methods 0.000 description 32
- 239000002245 particle Substances 0.000 description 32
- 241000700605 Viruses Species 0.000 description 29
- 239000013608 rAAV vector Substances 0.000 description 24
- 230000001105 regulatory effect Effects 0.000 description 24
- 230000000694 effects Effects 0.000 description 23
- 238000004806 packaging method and process Methods 0.000 description 21
- 108090000765 processed proteins & peptides Proteins 0.000 description 21
- 102000007338 Fragile X Mental Retardation Protein Human genes 0.000 description 20
- 102000004196 processed proteins & peptides Human genes 0.000 description 20
- 239000000203 mixture Substances 0.000 description 19
- 238000006467 substitution reaction Methods 0.000 description 18
- 108700019146 Transgenes Proteins 0.000 description 17
- 210000000234 capsid Anatomy 0.000 description 17
- 210000004962 mammalian cell Anatomy 0.000 description 17
- 229920001184 polypeptide Polymers 0.000 description 17
- 230000003612 virological effect Effects 0.000 description 17
- 239000003623 enhancer Substances 0.000 description 16
- 239000008194 pharmaceutical composition Substances 0.000 description 16
- 101000979731 Homo sapiens NADH dehydrogenase [ubiquinone] 1 beta subcomplex subunit 9 Proteins 0.000 description 15
- 101000835998 Homo sapiens SRA stem-loop-interacting RNA-binding protein, mitochondrial Proteins 0.000 description 14
- 101000623076 Homo sapiens 40S ribosomal protein S28 Proteins 0.000 description 13
- 108090000565 Capsid Proteins Proteins 0.000 description 12
- 102000053602 DNA Human genes 0.000 description 12
- 238000003780 insertion Methods 0.000 description 12
- 230000037431 insertion Effects 0.000 description 12
- 238000001890 transfection Methods 0.000 description 12
- 241001556567 Acanthamoeba polyphaga mimivirus Species 0.000 description 11
- 102100023321 Ceruloplasmin Human genes 0.000 description 11
- 241000699670 Mus sp. Species 0.000 description 11
- 238000012217 deletion Methods 0.000 description 11
- 230000037430 deletion Effects 0.000 description 11
- 102100024978 NADH dehydrogenase [ubiquinone] 1 beta subcomplex subunit 9 Human genes 0.000 description 10
- 210000000170 cell membrane Anatomy 0.000 description 10
- 239000013604 expression vector Substances 0.000 description 10
- 102100030685 Alpha-sarcoglycan Human genes 0.000 description 9
- 102000004888 Aquaporin 1 Human genes 0.000 description 9
- 108090001004 Aquaporin 1 Proteins 0.000 description 9
- 102100020999 Argininosuccinate synthase Human genes 0.000 description 9
- 102100038238 Aromatic-L-amino-acid decarboxylase Human genes 0.000 description 9
- 101710151768 Aromatic-L-amino-acid decarboxylase Proteins 0.000 description 9
- 102100030686 Beta-sarcoglycan Human genes 0.000 description 9
- 102100028282 Bile salt export pump Human genes 0.000 description 9
- 102100022002 CD59 glycoprotein Human genes 0.000 description 9
- 102100032539 Calpain-3 Human genes 0.000 description 9
- 201000008890 Charcot-Marie-Tooth disease type 4J Diseases 0.000 description 9
- 102100031060 Clarin-1 Human genes 0.000 description 9
- 102100029140 Cyclic nucleotide-gated cation channel beta-3 Human genes 0.000 description 9
- 102100034746 Cyclin-dependent kinase-like 5 Human genes 0.000 description 9
- 241000701022 Cytomegalovirus Species 0.000 description 9
- 102100038694 DNA-binding protein SMUBP-2 Human genes 0.000 description 9
- 102100032248 Dysferlin Human genes 0.000 description 9
- 102100021792 Gamma-sarcoglycan Human genes 0.000 description 9
- 102100037156 Gap junction beta-2 protein Human genes 0.000 description 9
- 108091010837 Glial cell line-derived neurotrophic factor Proteins 0.000 description 9
- 102000034615 Glial cell line-derived neurotrophic factor Human genes 0.000 description 9
- 102100036264 Glucose-6-phosphatase catalytic subunit 1 Human genes 0.000 description 9
- 101000703500 Homo sapiens Alpha-sarcoglycan Proteins 0.000 description 9
- 101000784014 Homo sapiens Argininosuccinate synthase Proteins 0.000 description 9
- 101000703495 Homo sapiens Beta-sarcoglycan Proteins 0.000 description 9
- 101000897400 Homo sapiens CD59 glycoprotein Proteins 0.000 description 9
- 101000867715 Homo sapiens Calpain-3 Proteins 0.000 description 9
- 101000992973 Homo sapiens Clarin-1 Proteins 0.000 description 9
- 101000771083 Homo sapiens Cyclic nucleotide-gated cation channel beta-3 Proteins 0.000 description 9
- 101000945692 Homo sapiens Cyclin-dependent kinase-like 5 Proteins 0.000 description 9
- 101000665135 Homo sapiens DNA-binding protein SMUBP-2 Proteins 0.000 description 9
- 101001016184 Homo sapiens Dysferlin Proteins 0.000 description 9
- 101000616435 Homo sapiens Gamma-sarcoglycan Proteins 0.000 description 9
- 101000954092 Homo sapiens Gap junction beta-2 protein Proteins 0.000 description 9
- 101000930910 Homo sapiens Glucose-6-phosphatase catalytic subunit 1 Proteins 0.000 description 9
- 101000633984 Homo sapiens Influenza virus NS1A-binding protein Proteins 0.000 description 9
- 101000634196 Homo sapiens Neurotrophin-3 Proteins 0.000 description 9
- 101000812677 Homo sapiens Nucleotide pyrophosphatase Proteins 0.000 description 9
- 101001134169 Homo sapiens Otoferlin Proteins 0.000 description 9
- 101000574223 Homo sapiens Palmitoyl-protein thioesterase 1 Proteins 0.000 description 9
- 101000955481 Homo sapiens Phosphatidylcholine translocator ABCB4 Proteins 0.000 description 9
- 101000827703 Homo sapiens Polyphosphoinositide phosphatase Proteins 0.000 description 9
- 101000801643 Homo sapiens Retinal-specific phospholipid-transporting ATPase ABCA4 Proteins 0.000 description 9
- 101000742938 Homo sapiens Retinol dehydrogenase 12 Proteins 0.000 description 9
- 101000631760 Homo sapiens Sodium channel protein type 1 subunit alpha Proteins 0.000 description 9
- 101000585180 Homo sapiens Stereocilin Proteins 0.000 description 9
- 101000617738 Homo sapiens Survival motor neuron protein Proteins 0.000 description 9
- 101000801040 Homo sapiens Transmembrane channel-like protein 1 Proteins 0.000 description 9
- 101001104102 Homo sapiens X-linked retinitis pigmentosa GTPase regulator Proteins 0.000 description 9
- 101710151321 Melanostatin Proteins 0.000 description 9
- 108010093662 Member 11 Subfamily B ATP Binding Cassette Transporter Proteins 0.000 description 9
- 102400000064 Neuropeptide Y Human genes 0.000 description 9
- 102100029268 Neurotrophin-3 Human genes 0.000 description 9
- 102100039306 Nucleotide pyrophosphatase Human genes 0.000 description 9
- 102100034198 Otoferlin Human genes 0.000 description 9
- 102100039032 Phosphatidylcholine translocator ABCB4 Human genes 0.000 description 9
- 102100023591 Polyphosphoinositide phosphatase Human genes 0.000 description 9
- 101710114165 Progranulin Proteins 0.000 description 9
- 102100033617 Retinal-specific phospholipid-transporting ATPase ABCA4 Human genes 0.000 description 9
- 102100038054 Retinol dehydrogenase 12 Human genes 0.000 description 9
- 102100040756 Rhodopsin Human genes 0.000 description 9
- 102100030681 SH3 and multiple ankyrin repeat domains protein 3 Human genes 0.000 description 9
- 101710101741 SH3 and multiple ankyrin repeat domains protein 3 Proteins 0.000 description 9
- 102100025491 SRA stem-loop-interacting RNA-binding protein, mitochondrial Human genes 0.000 description 9
- 102100028910 Sodium channel protein type 1 subunit alpha Human genes 0.000 description 9
- 102100029924 Stereocilin Human genes 0.000 description 9
- 102100021947 Survival motor neuron protein Human genes 0.000 description 9
- 102100033690 Transmembrane channel-like protein 1 Human genes 0.000 description 9
- 102100040092 X-linked retinitis pigmentosa GTPase regulator Human genes 0.000 description 9
- URPYMXQQVHTUDU-OFGSCBOVSA-N nucleopeptide y Chemical compound C([C@@H](C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CCC(N)=O)C(=O)N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CC=1C=CC(O)=CC=1)C(N)=O)NC(=O)[C@H](CC=1NC=NC=1)NC(=O)[C@H](CCCNC(N)=N)NC(=O)[C@H](CC(C)C)NC(=O)[C@H](C)NC(=O)[C@H](CO)NC(=O)[C@H](CC=1C=CC(O)=CC=1)NC(=O)[C@H](CC=1C=CC(O)=CC=1)NC(=O)[C@H](CCCNC(N)=N)NC(=O)[C@H](C)NC(=O)[C@H](CC(C)C)NC(=O)[C@H](CC(O)=O)NC(=O)[C@H](CCC(O)=O)NC(=O)[C@H](C)NC(=O)[C@H]1N(CCC1)C(=O)[C@H](C)NC(=O)[C@H](CC(O)=O)NC(=O)[C@H](CCC(O)=O)NC(=O)CNC(=O)[C@H]1N(CCC1)C(=O)[C@H](CC(N)=O)NC(=O)[C@H](CC(O)=O)NC(=O)[C@H]1N(CCC1)C(=O)[C@H](CCCCN)NC(=O)[C@H](CO)NC(=O)[C@H]1N(CCC1)C(=O)[C@@H](N)CC=1C=CC(O)=CC=1)C1=CC=C(O)C=C1 URPYMXQQVHTUDU-OFGSCBOVSA-N 0.000 description 9
- 239000000047 product Substances 0.000 description 9
- 102100023679 40S ribosomal protein S28 Human genes 0.000 description 8
- 208000002267 Anti-neutrophil cytoplasmic antibody-associated vasculitis Diseases 0.000 description 8
- 101001078886 Homo sapiens Retinaldehyde-binding protein 1 Proteins 0.000 description 8
- 101000936922 Homo sapiens Sarcoplasmic/endoplasmic reticulum calcium ATPase 2 Proteins 0.000 description 8
- 102100028001 Retinaldehyde-binding protein 1 Human genes 0.000 description 8
- 102100027732 Sarcoplasmic/endoplasmic reticulum calcium ATPase 2 Human genes 0.000 description 8
- 241000700584 Simplexvirus Species 0.000 description 8
- FAPWRFPIFSIZLT-UHFFFAOYSA-M Sodium chloride Chemical compound [Na+].[Cl-] FAPWRFPIFSIZLT-UHFFFAOYSA-M 0.000 description 8
- 238000001415 gene therapy Methods 0.000 description 8
- 238000001727 in vivo Methods 0.000 description 8
- 230000004048 modification Effects 0.000 description 8
- 238000012986 modification Methods 0.000 description 8
- 230000035772 mutation Effects 0.000 description 8
- 239000002953 phosphate buffered saline Substances 0.000 description 8
- 230000010076 replication Effects 0.000 description 8
- 125000006850 spacer group Chemical group 0.000 description 8
- 102100034505 Ceroid-lipofuscinosis neuronal protein 5 Human genes 0.000 description 7
- 101710177611 DNA polymerase II large subunit Proteins 0.000 description 7
- 101710184669 DNA polymerase II small subunit Proteins 0.000 description 7
- 102100027525 Frataxin, mitochondrial Human genes 0.000 description 7
- 101150103820 Fxn gene Proteins 0.000 description 7
- 102100037410 Gigaxonin Human genes 0.000 description 7
- 101000710208 Homo sapiens Ceroid-lipofuscinosis neuronal protein 5 Proteins 0.000 description 7
- 101001025761 Homo sapiens Gigaxonin Proteins 0.000 description 7
- 101000573220 Homo sapiens NADH dehydrogenase [ubiquinone] 1 alpha subcomplex subunit 7 Proteins 0.000 description 7
- 101000604411 Homo sapiens NADH-ubiquinone oxidoreductase chain 1 Proteins 0.000 description 7
- 101001109579 Homo sapiens NPC intracellular cholesterol transporter 2 Proteins 0.000 description 7
- 101000996052 Homo sapiens Nicotinamide/nicotinic acid mononucleotide adenylyltransferase 1 Proteins 0.000 description 7
- 101000609211 Homo sapiens Polyadenylate-binding protein 2 Proteins 0.000 description 7
- 101000687060 Homo sapiens Protein phosphatase 1 regulatory subunit 1A Proteins 0.000 description 7
- 101000579423 Homo sapiens Regulator of nonsense transcripts 1 Proteins 0.000 description 7
- 101000899806 Homo sapiens Retinal guanylyl cyclase 1 Proteins 0.000 description 7
- 101000611338 Homo sapiens Rhodopsin Proteins 0.000 description 7
- 101000772888 Homo sapiens Ubiquitin-protein ligase E3A Proteins 0.000 description 7
- 102100029241 Influenza virus NS1A-binding protein Human genes 0.000 description 7
- 241001465754 Metazoa Species 0.000 description 7
- 241000699666 Mus <mouse, genus> Species 0.000 description 7
- 102100022737 NPC intracellular cholesterol transporter 2 Human genes 0.000 description 7
- 102100034451 Nicotinamide/nicotinic acid mononucleotide adenylyltransferase 1 Human genes 0.000 description 7
- 102100025824 Palmitoyl-protein thioesterase 1 Human genes 0.000 description 7
- 102100039427 Polyadenylate-binding protein 2 Human genes 0.000 description 7
- 102100024606 Protein phosphatase 1 regulatory subunit 1A Human genes 0.000 description 7
- 102000009572 RNA Polymerase II Human genes 0.000 description 7
- 108010009460 RNA Polymerase II Proteins 0.000 description 7
- 101100388570 Rattus norvegicus Ebp gene Proteins 0.000 description 7
- 102100028287 Regulator of nonsense transcripts 1 Human genes 0.000 description 7
- 102100022663 Retinal guanylyl cyclase 1 Human genes 0.000 description 7
- 101000942604 Sphingomonas wittichii (strain DC-6 / KACC 16600) Chloroacetanilide N-alkylformylase, oxygenase component Proteins 0.000 description 7
- 108060007963 Surf-1 Proteins 0.000 description 7
- 102000046669 Surf-1 Human genes 0.000 description 7
- 108091023040 Transcription factor Proteins 0.000 description 7
- 102000040945 Transcription factor Human genes 0.000 description 7
- 102100030434 Ubiquitin-protein ligase E3A Human genes 0.000 description 7
- 239000003814 drug Substances 0.000 description 7
- 239000003937 drug carrier Substances 0.000 description 7
- 238000001476 gene delivery Methods 0.000 description 7
- 230000001939 inductive effect Effects 0.000 description 7
- 230000010354 integration Effects 0.000 description 7
- 239000003550 marker Substances 0.000 description 7
- 239000000243 solution Substances 0.000 description 7
- 235000000346 sugar Nutrition 0.000 description 7
- 241001655883 Adeno-associated virus - 1 Species 0.000 description 6
- 241000702423 Adeno-associated virus - 2 Species 0.000 description 6
- 241001164825 Adeno-associated virus - 8 Species 0.000 description 6
- 102100036524 Anoctamin-5 Human genes 0.000 description 6
- 102100032948 Aspartoacylase Human genes 0.000 description 6
- PEDCQBHIVMGVHV-UHFFFAOYSA-N Glycerine Chemical compound OCC(O)CO PEDCQBHIVMGVHV-UHFFFAOYSA-N 0.000 description 6
- 101000928364 Homo sapiens Anoctamin-5 Proteins 0.000 description 6
- 101001137074 Homo sapiens Long-wave-sensitive opsin 1 Proteins 0.000 description 6
- 101000701142 Homo sapiens Transcription factor ATOH1 Proteins 0.000 description 6
- 102100035576 Long-wave-sensitive opsin 1 Human genes 0.000 description 6
- 102100026374 NADH dehydrogenase [ubiquinone] 1 alpha subcomplex subunit 7 Human genes 0.000 description 6
- 102100029373 Transcription factor ATOH1 Human genes 0.000 description 6
- 125000003275 alpha amino acid group Chemical group 0.000 description 6
- 210000004727 amygdala Anatomy 0.000 description 6
- 210000004227 basal ganglia Anatomy 0.000 description 6
- 125000002091 cationic group Chemical group 0.000 description 6
- 210000001638 cerebellum Anatomy 0.000 description 6
- 210000003710 cerebral cortex Anatomy 0.000 description 6
- 239000003795 chemical substances by application Substances 0.000 description 6
- 210000001320 hippocampus Anatomy 0.000 description 6
- 239000002502 liposome Substances 0.000 description 6
- 108700043045 nanoluc Proteins 0.000 description 6
- 241000894007 species Species 0.000 description 6
- 241000202702 Adeno-associated virus - 3 Species 0.000 description 5
- 241000580270 Adeno-associated virus - 4 Species 0.000 description 5
- 241001634120 Adeno-associated virus - 5 Species 0.000 description 5
- 241000972680 Adeno-associated virus - 6 Species 0.000 description 5
- 241001164823 Adeno-associated virus - 7 Species 0.000 description 5
- 102100022440 Battenin Human genes 0.000 description 5
- 101150044789 Cap gene Proteins 0.000 description 5
- 101710132601 Capsid protein Proteins 0.000 description 5
- 101710197658 Capsid protein VP1 Proteins 0.000 description 5
- 101000901683 Homo sapiens Battenin Proteins 0.000 description 5
- 101000941325 Homo sapiens Copper homeostasis protein cutC homolog Proteins 0.000 description 5
- 101000573234 Homo sapiens NADH dehydrogenase [ubiquinone] 1 alpha subcomplex subunit 8 Proteins 0.000 description 5
- 101000836620 Homo sapiens Nucleic acid dioxygenase ALKBH1 Proteins 0.000 description 5
- 102100027051 Nucleic acid dioxygenase ALKBH1 Human genes 0.000 description 5
- 101710118046 RNA-directed RNA polymerase Proteins 0.000 description 5
- 101710108545 Viral protein 1 Proteins 0.000 description 5
- 239000000969 carrier Substances 0.000 description 5
- 230000001419 dependent effect Effects 0.000 description 5
- 229940079593 drug Drugs 0.000 description 5
- 230000001965 increasing effect Effects 0.000 description 5
- 210000003734 kidney Anatomy 0.000 description 5
- 239000007788 liquid Substances 0.000 description 5
- 238000003468 luciferase reporter gene assay Methods 0.000 description 5
- 239000002609 medium Substances 0.000 description 5
- 238000010369 molecular cloning Methods 0.000 description 5
- 229920001983 poloxamer Polymers 0.000 description 5
- 238000003752 polymerase chain reaction Methods 0.000 description 5
- 238000012360 testing method Methods 0.000 description 5
- 238000013519 translation Methods 0.000 description 5
- 241000649045 Adeno-associated virus 10 Species 0.000 description 4
- 102100031397 Copper homeostasis protein cutC homolog Human genes 0.000 description 4
- 102100040259 Deoxyribonuclease TATDN1 Human genes 0.000 description 4
- 108700007698 Genetic Terminator Regions Proteins 0.000 description 4
- 101000718525 Homo sapiens Alpha-galactosidase A Proteins 0.000 description 4
- 101000797251 Homo sapiens Aspartoacylase Proteins 0.000 description 4
- 101000770637 Homo sapiens Cytochrome c oxidase assembly protein COX15 homolog Proteins 0.000 description 4
- 101000891564 Homo sapiens Deoxyribonuclease TATDN1 Proteins 0.000 description 4
- 101000982032 Homo sapiens Myosin-binding protein C, cardiac-type Proteins 0.000 description 4
- 101000589519 Homo sapiens N-acetyltransferase 8 Proteins 0.000 description 4
- 241000254158 Lampyridae Species 0.000 description 4
- 241000124008 Mammalia Species 0.000 description 4
- 101710081079 Minor spike protein H Proteins 0.000 description 4
- 102100026771 Myosin-binding protein C, cardiac-type Human genes 0.000 description 4
- 102100026377 NADH dehydrogenase [ubiquinone] 1 alpha subcomplex subunit 8 Human genes 0.000 description 4
- 241000714474 Rous sarcoma virus Species 0.000 description 4
- 230000002378 acidificating effect Effects 0.000 description 4
- 230000008901 benefit Effects 0.000 description 4
- UCMIRNVEIXFBKS-UHFFFAOYSA-N beta-alanine Chemical compound NCCC(O)=O UCMIRNVEIXFBKS-UHFFFAOYSA-N 0.000 description 4
- 210000000601 blood cell Anatomy 0.000 description 4
- 239000000872 buffer Substances 0.000 description 4
- 210000003169 central nervous system Anatomy 0.000 description 4
- 238000010276 construction Methods 0.000 description 4
- 238000004520 electroporation Methods 0.000 description 4
- 238000005516 engineering process Methods 0.000 description 4
- 230000002068 genetic effect Effects 0.000 description 4
- 238000003306 harvesting Methods 0.000 description 4
- 229920001519 homopolymer Polymers 0.000 description 4
- 102000051615 human NDUFB9 Human genes 0.000 description 4
- 102000054335 human RPS28 Human genes 0.000 description 4
- 102000051475 human SLIRP Human genes 0.000 description 4
- 210000003016 hypothalamus Anatomy 0.000 description 4
- 238000010348 incorporation Methods 0.000 description 4
- 239000000463 material Substances 0.000 description 4
- 210000001767 medulla oblongata Anatomy 0.000 description 4
- 210000001259 mesencephalon Anatomy 0.000 description 4
- 239000002086 nanomaterial Substances 0.000 description 4
- 210000000956 olfactory bulb Anatomy 0.000 description 4
- 238000005457 optimization Methods 0.000 description 4
- 101150066583 rep gene Proteins 0.000 description 4
- 102000004346 ribosomal protein L9 Human genes 0.000 description 4
- 108090000907 ribosomal protein L9 Proteins 0.000 description 4
- 239000011780 sodium chloride Substances 0.000 description 4
- 210000000278 spinal cord Anatomy 0.000 description 4
- 239000000126 substance Substances 0.000 description 4
- 210000001103 thalamus Anatomy 0.000 description 4
- 238000011282 treatment Methods 0.000 description 4
- 210000004885 white matter Anatomy 0.000 description 4
- 241000649046 Adeno-associated virus 11 Species 0.000 description 3
- 241000649047 Adeno-associated virus 12 Species 0.000 description 3
- 241000300529 Adeno-associated virus 13 Species 0.000 description 3
- 241000958487 Adeno-associated virus 3B Species 0.000 description 3
- 241000283690 Bos taurus Species 0.000 description 3
- 102100029079 Cytochrome c oxidase assembly protein COX15 homolog Human genes 0.000 description 3
- 102100035426 DnaJ homolog subfamily B member 7 Human genes 0.000 description 3
- 101100285903 Drosophila melanogaster Hsc70-2 gene Proteins 0.000 description 3
- 101100178718 Drosophila melanogaster Hsc70-4 gene Proteins 0.000 description 3
- 101100178723 Drosophila melanogaster Hsc70-5 gene Proteins 0.000 description 3
- 102100036291 Galactose-1-phosphate uridylyltransferase Human genes 0.000 description 3
- 108700028146 Genetic Enhancer Elements Proteins 0.000 description 3
- 101000804114 Homo sapiens DnaJ homolog subfamily B member 7 Proteins 0.000 description 3
- 101150090950 Hsc70-1 gene Proteins 0.000 description 3
- PMMYEEVYMWASQN-DMTCNVIQSA-N Hydroxyproline Chemical compound O[C@H]1CN[C@H](C(O)=O)C1 PMMYEEVYMWASQN-DMTCNVIQSA-N 0.000 description 3
- KDXKERNSBIXSRK-YFKPBYRVSA-N L-lysine Chemical compound NCCCC[C@H](N)C(O)=O KDXKERNSBIXSRK-YFKPBYRVSA-N 0.000 description 3
- LRQKBLKVPFOOQJ-YFKPBYRVSA-N L-norleucine Chemical compound CCCC[C@H]([NH3+])C([O-])=O LRQKBLKVPFOOQJ-YFKPBYRVSA-N 0.000 description 3
- KDXKERNSBIXSRK-UHFFFAOYSA-N Lysine Natural products NCCCCC(N)C(O)=O KDXKERNSBIXSRK-UHFFFAOYSA-N 0.000 description 3
- 108700026244 Open Reading Frames Proteins 0.000 description 3
- 101100150366 Schizosaccharomyces pombe (strain 972 / ATCC 24843) sks2 gene Proteins 0.000 description 3
- 108010034546 Serratia marcescens nuclease Proteins 0.000 description 3
- 108020004682 Single-Stranded DNA Proteins 0.000 description 3
- 230000023445 activated T cell autonomous cell death Effects 0.000 description 3
- 125000000129 anionic group Chemical group 0.000 description 3
- 125000003118 aryl group Chemical group 0.000 description 3
- 229910052796 boron Inorganic materials 0.000 description 3
- 238000004113 cell culture Methods 0.000 description 3
- 238000012512 characterization method Methods 0.000 description 3
- 238000004587 chromatography analysis Methods 0.000 description 3
- 239000000356 contaminant Substances 0.000 description 3
- PMMYEEVYMWASQN-UHFFFAOYSA-N dl-hydroxyproline Natural products OC1C[NH2+]C(C([O-])=O)C1 PMMYEEVYMWASQN-UHFFFAOYSA-N 0.000 description 3
- 210000003527 eukaryotic cell Anatomy 0.000 description 3
- 230000007717 exclusion Effects 0.000 description 3
- 238000001914 filtration Methods 0.000 description 3
- 238000009472 formulation Methods 0.000 description 3
- 229960002591 hydroxyproline Drugs 0.000 description 3
- 230000005847 immunogenicity Effects 0.000 description 3
- 208000015181 infectious disease Diseases 0.000 description 3
- 238000002347 injection Methods 0.000 description 3
- 239000007924 injection Substances 0.000 description 3
- 238000001638 lipofection Methods 0.000 description 3
- 210000004185 liver Anatomy 0.000 description 3
- 210000004072 lung Anatomy 0.000 description 3
- 108020004999 messenger RNA Proteins 0.000 description 3
- 229910052751 metal Inorganic materials 0.000 description 3
- 239000002184 metal Substances 0.000 description 3
- 150000002739 metals Chemical class 0.000 description 3
- 150000003904 phospholipids Chemical class 0.000 description 3
- 230000008488 polyadenylation Effects 0.000 description 3
- 230000008569 process Effects 0.000 description 3
- 238000000746 purification Methods 0.000 description 3
- 230000022532 regulation of transcription, DNA-dependent Effects 0.000 description 3
- 238000011160 research Methods 0.000 description 3
- FSYKKLYZXJSNPZ-UHFFFAOYSA-N sarcosine Chemical compound C[NH2+]CC([O-])=O FSYKKLYZXJSNPZ-UHFFFAOYSA-N 0.000 description 3
- 238000010561 standard procedure Methods 0.000 description 3
- 150000008163 sugars Chemical class 0.000 description 3
- 230000008685 targeting Effects 0.000 description 3
- 238000002560 therapeutic procedure Methods 0.000 description 3
- 230000002103 transcriptional effect Effects 0.000 description 3
- 238000003151 transfection method Methods 0.000 description 3
- 239000003981 vehicle Substances 0.000 description 3
- XLYOFNOQVPJJNP-UHFFFAOYSA-N water Substances O XLYOFNOQVPJJNP-UHFFFAOYSA-N 0.000 description 3
- FUOOLUPWFVMBKG-UHFFFAOYSA-N 2-Aminoisobutyric acid Chemical compound CC(C)(N)C(O)=O FUOOLUPWFVMBKG-UHFFFAOYSA-N 0.000 description 2
- 101100279849 Arabidopsis thaliana EPF1 gene Proteins 0.000 description 2
- 108010045171 Cyclic AMP Response Element-Binding Protein Proteins 0.000 description 2
- 102000005636 Cyclic AMP Response Element-Binding Protein Human genes 0.000 description 2
- 108090000626 DNA-directed RNA polymerases Proteins 0.000 description 2
- 102000004163 DNA-directed RNA polymerases Human genes 0.000 description 2
- RTZKZFJDLAIYFH-UHFFFAOYSA-N Diethyl ether Chemical compound CCOCC RTZKZFJDLAIYFH-UHFFFAOYSA-N 0.000 description 2
- 102100025177 Dimethylglycine dehydrogenase, mitochondrial Human genes 0.000 description 2
- 239000006144 Dulbecco's modified Eagle's medium Substances 0.000 description 2
- 102100023947 Dynein light chain Tctex-type protein 2 Human genes 0.000 description 2
- 108090000331 Firefly luciferases Proteins 0.000 description 2
- WQZGKKKJIJFFOK-GASJEMHNSA-N Glucose Natural products OC[C@H]1OC(O)[C@H](O)[C@@H](O)[C@@H]1O WQZGKKKJIJFFOK-GASJEMHNSA-N 0.000 description 2
- 102100027738 Heterogeneous nuclear ribonucleoprotein H Human genes 0.000 description 2
- 241000238631 Hexapoda Species 0.000 description 2
- 101001005618 Homo sapiens Dimethylglycine dehydrogenase, mitochondrial Proteins 0.000 description 2
- 101000904044 Homo sapiens Dynein light chain Tctex-type protein 2 Proteins 0.000 description 2
- 101001021379 Homo sapiens Galactose-1-phosphate uridylyltransferase Proteins 0.000 description 2
- 101001081149 Homo sapiens Heterogeneous nuclear ribonucleoprotein H Proteins 0.000 description 2
- 101001003584 Homo sapiens Prelamin-A/C Proteins 0.000 description 2
- 101000933386 Homo sapiens S-methylmethionine-homocysteine S-methyltransferase BHMT2 Proteins 0.000 description 2
- UQSXHKLRYXJYBZ-UHFFFAOYSA-N Iron oxide Chemical compound [Fe]=O UQSXHKLRYXJYBZ-UHFFFAOYSA-N 0.000 description 2
- 241000699660 Mus musculus Species 0.000 description 2
- KSPIYJQBLVDRRI-UHFFFAOYSA-N N-methylisoleucine Chemical compound CCC(C)C(NC)C(O)=O KSPIYJQBLVDRRI-UHFFFAOYSA-N 0.000 description 2
- 206010028980 Neoplasm Diseases 0.000 description 2
- 101710163270 Nuclease Proteins 0.000 description 2
- 239000002202 Polyethylene glycol Substances 0.000 description 2
- 102100026531 Prelamin-A/C Human genes 0.000 description 2
- 241000125945 Protoparvovirus Species 0.000 description 2
- 102000014450 RNA Polymerase III Human genes 0.000 description 2
- 108010078067 RNA Polymerase III Proteins 0.000 description 2
- 238000003559 RNA-seq method Methods 0.000 description 2
- 241000712907 Retroviridae Species 0.000 description 2
- 240000004808 Saccharomyces cerevisiae Species 0.000 description 2
- 108700026226 TATA Box Proteins 0.000 description 2
- 206010046865 Vaccinia virus infection Diseases 0.000 description 2
- NRAUADCLPJTGSF-ZPGVOIKOSA-N [(2r,3s,4r,5r,6r)-6-[[(3as,7r,7as)-7-hydroxy-4-oxo-1,3a,5,6,7,7a-hexahydroimidazo[4,5-c]pyridin-2-yl]amino]-5-[[(3s)-3,6-diaminohexanoyl]amino]-4-hydroxy-2-(hydroxymethyl)oxan-3-yl] carbamate Chemical compound NCCC[C@H](N)CC(=O)N[C@@H]1[C@@H](O)[C@H](OC(N)=O)[C@@H](CO)O[C@H]1\N=C/1N[C@H](C(=O)NC[C@H]2O)[C@@H]2N\1 NRAUADCLPJTGSF-ZPGVOIKOSA-N 0.000 description 2
- DZBUGLKDJFMEHC-UHFFFAOYSA-N acridine Chemical compound C1=CC=CC2=CC3=CC=CC=C3N=C21 DZBUGLKDJFMEHC-UHFFFAOYSA-N 0.000 description 2
- 238000007792 addition Methods 0.000 description 2
- 210000000577 adipose tissue Anatomy 0.000 description 2
- 210000004100 adrenal gland Anatomy 0.000 description 2
- 150000001412 amines Chemical group 0.000 description 2
- 125000000539 amino acid group Chemical group 0.000 description 2
- 229960000723 ampicillin Drugs 0.000 description 2
- AVKUERGKIZMTKX-NJBDSQKTSA-N ampicillin Chemical compound C1([C@@H](N)C(=O)N[C@H]2[C@H]3SC([C@@H](N3C2=O)C(O)=O)(C)C)=CC=CC=C1 AVKUERGKIZMTKX-NJBDSQKTSA-N 0.000 description 2
- 238000004458 analytical method Methods 0.000 description 2
- 239000003242 anti bacterial agent Substances 0.000 description 2
- 229940088710 antibiotic agent Drugs 0.000 description 2
- 230000004888 barrier function Effects 0.000 description 2
- 210000000481 breast Anatomy 0.000 description 2
- 239000001506 calcium phosphate Substances 0.000 description 2
- 229910000389 calcium phosphate Inorganic materials 0.000 description 2
- 235000011010 calcium phosphates Nutrition 0.000 description 2
- 201000011510 cancer Diseases 0.000 description 2
- 229910052799 carbon Inorganic materials 0.000 description 2
- 230000002490 cerebral effect Effects 0.000 description 2
- 210000003679 cervix uteri Anatomy 0.000 description 2
- 238000006243 chemical reaction Methods 0.000 description 2
- 229960005091 chloramphenicol Drugs 0.000 description 2
- WIIZWVCIJKGZOK-RKDXNWHRSA-N chloramphenicol Chemical compound ClC(Cl)C(=O)N[C@H](CO)[C@H](O)C1=CC=C([N+]([O-])=O)C=C1 WIIZWVCIJKGZOK-RKDXNWHRSA-N 0.000 description 2
- 238000002487 chromatin immunoprecipitation Methods 0.000 description 2
- 238000010367 cloning Methods 0.000 description 2
- 210000001072 colon Anatomy 0.000 description 2
- 230000021615 conjugation Effects 0.000 description 2
- 238000009295 crossflow filtration Methods 0.000 description 2
- 238000013461 design Methods 0.000 description 2
- 238000009826 distribution Methods 0.000 description 2
- 241001493065 dsRNA viruses Species 0.000 description 2
- 230000005684 electric field Effects 0.000 description 2
- 230000009881 electrostatic interaction Effects 0.000 description 2
- 210000004696 endometrium Anatomy 0.000 description 2
- 210000003238 esophagus Anatomy 0.000 description 2
- 238000002474 experimental method Methods 0.000 description 2
- 102000037865 fusion proteins Human genes 0.000 description 2
- 108020001507 fusion proteins Proteins 0.000 description 2
- BTCSSZJGUNDROE-UHFFFAOYSA-N gamma-aminobutyric acid Chemical compound NCCCC(O)=O BTCSSZJGUNDROE-UHFFFAOYSA-N 0.000 description 2
- 238000012239 gene modification Methods 0.000 description 2
- 230000005017 genetic modification Effects 0.000 description 2
- 235000013617 genetically modified food Nutrition 0.000 description 2
- 230000002209 hydrophobic effect Effects 0.000 description 2
- 125000002887 hydroxy group Chemical group [H]O* 0.000 description 2
- 238000003384 imaging method Methods 0.000 description 2
- 230000002458 infectious effect Effects 0.000 description 2
- 230000003993 interaction Effects 0.000 description 2
- 230000003834 intracellular effect Effects 0.000 description 2
- 229960000318 kanamycin Drugs 0.000 description 2
- 229930027917 kanamycin Natural products 0.000 description 2
- SBUJHOSQTJFQJX-NOAMYHISSA-N kanamycin Chemical compound O[C@@H]1[C@@H](O)[C@H](O)[C@@H](CN)O[C@@H]1O[C@H]1[C@H](O)[C@@H](O[C@@H]2[C@@H]([C@@H](N)[C@H](O)[C@@H](CO)O2)O)[C@H](N)C[C@@H]1N SBUJHOSQTJFQJX-NOAMYHISSA-N 0.000 description 2
- 229930182823 kanamycin A Natural products 0.000 description 2
- 238000002372 labelling Methods 0.000 description 2
- 125000005647 linker group Chemical group 0.000 description 2
- 239000002122 magnetic nanoparticle Substances 0.000 description 2
- 239000012092 media component Substances 0.000 description 2
- 210000004165 myocardium Anatomy 0.000 description 2
- XEAQPCFJFQRDDZ-UHFFFAOYSA-N n-(2,2-dimethylpropyl)-n-methylnitrous amide Chemical compound O=NN(C)CC(C)(C)C XEAQPCFJFQRDDZ-UHFFFAOYSA-N 0.000 description 2
- 230000007935 neutral effect Effects 0.000 description 2
- 230000030147 nuclear export Effects 0.000 description 2
- 210000004940 nucleus Anatomy 0.000 description 2
- 210000003101 oviduct Anatomy 0.000 description 2
- 108010079892 phosphoglycerol kinase Proteins 0.000 description 2
- 229920001223 polyethylene glycol Polymers 0.000 description 2
- 239000011148 porous material Substances 0.000 description 2
- 230000000069 prophylactic effect Effects 0.000 description 2
- 125000006239 protecting group Chemical group 0.000 description 2
- ZCCUUQDIBDJBTK-UHFFFAOYSA-N psoralen Chemical compound C1=C2OC(=O)C=CC2=CC2=C1OC=C2 ZCCUUQDIBDJBTK-UHFFFAOYSA-N 0.000 description 2
- 238000010188 recombinant method Methods 0.000 description 2
- 210000000880 retinal rod photoreceptor cell Anatomy 0.000 description 2
- 210000003705 ribosome Anatomy 0.000 description 2
- 238000003196 serial analysis of gene expression Methods 0.000 description 2
- 210000002966 serum Anatomy 0.000 description 2
- 238000001542 size-exclusion chromatography Methods 0.000 description 2
- 210000003491 skin Anatomy 0.000 description 2
- 239000007787 solid Substances 0.000 description 2
- 239000003381 stabilizer Substances 0.000 description 2
- 230000010473 stable expression Effects 0.000 description 2
- 239000000758 substrate Substances 0.000 description 2
- 208000024891 symptom Diseases 0.000 description 2
- 238000003786 synthesis reaction Methods 0.000 description 2
- 210000001550 testis Anatomy 0.000 description 2
- 231100000419 toxicity Toxicity 0.000 description 2
- 230000001988 toxicity Effects 0.000 description 2
- 238000012546 transfer Methods 0.000 description 2
- 230000009466 transformation Effects 0.000 description 2
- QORWJWZARLRLPR-UHFFFAOYSA-H tricalcium bis(phosphate) Chemical compound [Ca+2].[Ca+2].[Ca+2].[O-]P([O-])([O-])=O.[O-]P([O-])([O-])=O QORWJWZARLRLPR-UHFFFAOYSA-H 0.000 description 2
- 241000701447 unidentified baculovirus Species 0.000 description 2
- 241001430294 unidentified retrovirus Species 0.000 description 2
- 208000007089 vaccinia Diseases 0.000 description 2
- GMKMEZVLHJARHF-UHFFFAOYSA-N (2R,6R)-form-2.6-Diaminoheptanedioic acid Natural products OC(=O)C(N)CCCC(N)C(O)=O GMKMEZVLHJARHF-UHFFFAOYSA-N 0.000 description 1
- VEVRNHHLCPGNDU-MUGJNUQGSA-N (2s)-2-amino-5-[1-[(5s)-5-amino-5-carboxypentyl]-3,5-bis[(3s)-3-amino-3-carboxypropyl]pyridin-1-ium-4-yl]pentanoate Chemical compound OC(=O)[C@@H](N)CCCC[N+]1=CC(CC[C@H](N)C(O)=O)=C(CCC[C@H](N)C([O-])=O)C(CC[C@H](N)C(O)=O)=C1 VEVRNHHLCPGNDU-MUGJNUQGSA-N 0.000 description 1
- 102000040650 (ribonucleotides)n+m Human genes 0.000 description 1
- BLCJBICVQSYOIF-UHFFFAOYSA-N 2,2-diaminobutanoic acid Chemical compound CCC(N)(N)C(O)=O BLCJBICVQSYOIF-UHFFFAOYSA-N 0.000 description 1
- JKMHFZQWWAIEOD-UHFFFAOYSA-N 2-[4-(2-hydroxyethyl)piperazin-1-yl]ethanesulfonic acid Chemical compound OCC[NH+]1CCN(CCS([O-])(=O)=O)CC1 JKMHFZQWWAIEOD-UHFFFAOYSA-N 0.000 description 1
- OYIFNHCXNCRBQI-UHFFFAOYSA-N 2-aminoadipic acid Chemical compound OC(=O)C(N)CCCC(O)=O OYIFNHCXNCRBQI-UHFFFAOYSA-N 0.000 description 1
- RDFMDVXONNIGBC-UHFFFAOYSA-N 2-aminoheptanoic acid Chemical compound CCCCCC(N)C(O)=O RDFMDVXONNIGBC-UHFFFAOYSA-N 0.000 description 1
- JUQLUIFNNFIIKC-UHFFFAOYSA-N 2-aminopimelic acid Chemical compound OC(=O)C(N)CCCCC(O)=O JUQLUIFNNFIIKC-UHFFFAOYSA-N 0.000 description 1
- 108020005345 3' Untranslated Regions Proteins 0.000 description 1
- BRMWTNUJHUMWMS-UHFFFAOYSA-N 3-Methylhistidine Natural products CN1C=NC(CC(N)C(O)=O)=C1 BRMWTNUJHUMWMS-UHFFFAOYSA-N 0.000 description 1
- PECYZEOJVXMISF-UHFFFAOYSA-N 3-aminoalanine Chemical compound [NH3+]CC(N)C([O-])=O PECYZEOJVXMISF-UHFFFAOYSA-N 0.000 description 1
- VXGRJERITKFWPL-UHFFFAOYSA-N 4',5'-Dihydropsoralen Natural products C1=C2OC(=O)C=CC2=CC2=C1OCC2 VXGRJERITKFWPL-UHFFFAOYSA-N 0.000 description 1
- 229940117976 5-hydroxylysine Drugs 0.000 description 1
- 208000035657 Abasia Diseases 0.000 description 1
- 241000649044 Adeno-associated virus 9 Species 0.000 description 1
- 208000010370 Adenoviridae Infections Diseases 0.000 description 1
- 206010060931 Adenovirus infection Diseases 0.000 description 1
- 241001664176 Alpharetrovirus Species 0.000 description 1
- 241000710929 Alphavirus Species 0.000 description 1
- 206010002091 Anaesthesia Diseases 0.000 description 1
- 241000271566 Aves Species 0.000 description 1
- 241000714230 Avian leukemia virus Species 0.000 description 1
- 241001485018 Baboon endogenous virus Species 0.000 description 1
- 241000894006 Bacteria Species 0.000 description 1
- 101000805768 Banna virus (strain Indonesia/JKT-6423/1980) mRNA (guanine-N(7))-methyltransferase Proteins 0.000 description 1
- 102100027883 Bardet-Biedl syndrome 2 protein Human genes 0.000 description 1
- ZOXJGFHDIHLPTG-UHFFFAOYSA-N Boron Chemical compound [B] ZOXJGFHDIHLPTG-UHFFFAOYSA-N 0.000 description 1
- 241000714266 Bovine leukemia virus Species 0.000 description 1
- 206010006187 Breast cancer Diseases 0.000 description 1
- QCMYYKRYFNMIEC-UHFFFAOYSA-N COP(O)=O Chemical class COP(O)=O QCMYYKRYFNMIEC-UHFFFAOYSA-N 0.000 description 1
- 101100172118 Caenorhabditis elegans eif-2Bgamma gene Proteins 0.000 description 1
- 241000282832 Camelidae Species 0.000 description 1
- 241000282465 Canis Species 0.000 description 1
- 241000283707 Capra Species 0.000 description 1
- OKTJSMMVPCPJKN-UHFFFAOYSA-N Carbon Chemical compound [C] OKTJSMMVPCPJKN-UHFFFAOYSA-N 0.000 description 1
- 241000282693 Cercopithecidae Species 0.000 description 1
- 101000686790 Chaetoceros protobacilladnavirus 2 Replication-associated protein Proteins 0.000 description 1
- 101000864475 Chlamydia phage 1 Internal scaffolding protein VP3 Proteins 0.000 description 1
- 108091033380 Coding strand Proteins 0.000 description 1
- 241000711573 Coronaviridae Species 0.000 description 1
- IGXWBGJHJZYPQS-SSDOTTSWSA-N D-Luciferin Chemical compound OC(=O)[C@H]1CSC(C=2SC3=CC=C(O)C=C3N=2)=N1 IGXWBGJHJZYPQS-SSDOTTSWSA-N 0.000 description 1
- HMFHBZSHGGEWLO-SOOFDHNKSA-N D-ribofuranose Chemical compound OC[C@H]1OC(O)[C@H](O)[C@@H]1O HMFHBZSHGGEWLO-SOOFDHNKSA-N 0.000 description 1
- 230000004543 DNA replication Effects 0.000 description 1
- 230000006820 DNA synthesis Effects 0.000 description 1
- 241000450599 DNA viruses Species 0.000 description 1
- 102000016928 DNA-directed DNA polymerase Human genes 0.000 description 1
- 108010014303 DNA-directed DNA polymerase Proteins 0.000 description 1
- 108010053770 Deoxyribonucleases Proteins 0.000 description 1
- 102000016911 Deoxyribonucleases Human genes 0.000 description 1
- 229920002307 Dextran Polymers 0.000 description 1
- 241000196324 Embryophyta Species 0.000 description 1
- 102100030207 Endoplasmic reticulum membrane-associated RNA degradation protein Human genes 0.000 description 1
- 102000004190 Enzymes Human genes 0.000 description 1
- 108090000790 Enzymes Proteins 0.000 description 1
- 101710100588 Erythroid transcription factor Proteins 0.000 description 1
- 102100031690 Erythroid transcription factor Human genes 0.000 description 1
- 241000206602 Eukaryota Species 0.000 description 1
- 101000803553 Eumenes pomiformis Venom peptide 3 Proteins 0.000 description 1
- 108091029865 Exogenous DNA Proteins 0.000 description 1
- 108700024394 Exon Proteins 0.000 description 1
- 241000714165 Feline leukemia virus Species 0.000 description 1
- 241000714174 Feline sarcoma virus Species 0.000 description 1
- 241000282324 Felis Species 0.000 description 1
- 241000710831 Flavivirus Species 0.000 description 1
- 208000000666 Fowlpox Diseases 0.000 description 1
- 101710082961 GATA-binding factor 2 Proteins 0.000 description 1
- 102000016669 GATA1 Transcription Factor Human genes 0.000 description 1
- 108010028165 GATA1 Transcription Factor Proteins 0.000 description 1
- 102000011852 GATA2 Transcription Factor Human genes 0.000 description 1
- 108010075641 GATA2 Transcription Factor Proteins 0.000 description 1
- 241001663880 Gammaretrovirus Species 0.000 description 1
- 108700039691 Genetic Promoter Regions Proteins 0.000 description 1
- 241000713813 Gibbon ape leukemia virus Species 0.000 description 1
- 102000003886 Glycoproteins Human genes 0.000 description 1
- 108090000288 Glycoproteins Proteins 0.000 description 1
- 241000941423 Grom virus Species 0.000 description 1
- 239000007995 HEPES buffer Substances 0.000 description 1
- 101000583961 Halorubrum pleomorphic virus 1 Matrix protein Proteins 0.000 description 1
- 102100021519 Hemoglobin subunit beta Human genes 0.000 description 1
- 108091005904 Hemoglobin subunit beta Proteins 0.000 description 1
- 108010005336 Histone H2a Dioxygenase AlkB Homolog 1 Proteins 0.000 description 1
- 241000282412 Homo Species 0.000 description 1
- 101100269633 Homo sapiens ALKBH1 gene Proteins 0.000 description 1
- 101000697700 Homo sapiens Bardet-Biedl syndrome 2 protein Proteins 0.000 description 1
- 101001011818 Homo sapiens Endoplasmic reticulum membrane-associated RNA degradation protein Proteins 0.000 description 1
- 101000878605 Homo sapiens Low affinity immunoglobulin epsilon Fc receptor Proteins 0.000 description 1
- 101000579123 Homo sapiens Phosphoglycerate kinase 1 Proteins 0.000 description 1
- 101000650649 Homo sapiens Small EDRK-rich factor 1 Proteins 0.000 description 1
- 206010020460 Human T-cell lymphotropic virus type I infection Diseases 0.000 description 1
- 241000714260 Human T-lymphotropic virus 1 Species 0.000 description 1
- 241000713673 Human foamy virus Species 0.000 description 1
- 241000701044 Human gammaherpesvirus 4 Species 0.000 description 1
- 241000701806 Human papillomavirus Species 0.000 description 1
- 108700005091 Immunoglobulin Genes Proteins 0.000 description 1
- 108091092195 Intron Proteins 0.000 description 1
- PIWKPBJCKXDKJR-UHFFFAOYSA-N Isoflurane Chemical compound FC(F)OC(Cl)C(F)(F)F PIWKPBJCKXDKJR-UHFFFAOYSA-N 0.000 description 1
- SNDPXSYFESPGGJ-BYPYZUCNSA-N L-2-aminopentanoic acid Chemical compound CCC[C@H](N)C(O)=O SNDPXSYFESPGGJ-BYPYZUCNSA-N 0.000 description 1
- QNAYBMKLOCPYGJ-REOHCLBHSA-N L-alanine Chemical compound C[C@H](N)C(O)=O QNAYBMKLOCPYGJ-REOHCLBHSA-N 0.000 description 1
- AGPKZVBTJJNPAG-UHNVWZDZSA-N L-allo-Isoleucine Chemical compound CC[C@@H](C)[C@H](N)C(O)=O AGPKZVBTJJNPAG-UHNVWZDZSA-N 0.000 description 1
- RHGKLRLOHDJJDR-BYPYZUCNSA-N L-citrulline Chemical compound NC(=O)NCCC[C@H]([NH3+])C([O-])=O RHGKLRLOHDJJDR-BYPYZUCNSA-N 0.000 description 1
- SNDPXSYFESPGGJ-UHFFFAOYSA-N L-norVal-OH Natural products CCCC(N)C(O)=O SNDPXSYFESPGGJ-UHFFFAOYSA-N 0.000 description 1
- 102100038007 Low affinity immunoglobulin epsilon Fc receptor Human genes 0.000 description 1
- 201000005505 Measles Diseases 0.000 description 1
- 241000713862 Moloney murine sarcoma virus Species 0.000 description 1
- 241001529936 Murinae Species 0.000 description 1
- 241000714177 Murine leukemia virus Species 0.000 description 1
- JDHILDINMRGULE-LURJTMIESA-N N(pros)-methyl-L-histidine Chemical compound CN1C=NC=C1C[C@H](N)C(O)=O JDHILDINMRGULE-LURJTMIESA-N 0.000 description 1
- JJIHLJJYMXLCOY-BYPYZUCNSA-N N-acetyl-L-serine Chemical compound CC(=O)N[C@@H](CO)C(O)=O JJIHLJJYMXLCOY-BYPYZUCNSA-N 0.000 description 1
- YPIGGYHFMKJNKV-UHFFFAOYSA-N N-ethylglycine Chemical compound CC[NH2+]CC([O-])=O YPIGGYHFMKJNKV-UHFFFAOYSA-N 0.000 description 1
- 108010065338 N-ethylglycine Proteins 0.000 description 1
- PYUSHNKNPOHWEZ-YFKPBYRVSA-N N-formyl-L-methionine Chemical compound CSCC[C@@H](C(O)=O)NC=O PYUSHNKNPOHWEZ-YFKPBYRVSA-N 0.000 description 1
- AKCRVYNORCOYQT-YFKPBYRVSA-N N-methyl-L-valine Chemical compound CN[C@@H](C(C)C)C(O)=O AKCRVYNORCOYQT-YFKPBYRVSA-N 0.000 description 1
- RHGKLRLOHDJJDR-UHFFFAOYSA-N Ndelta-carbamoyl-DL-ornithine Natural products OC(=O)C(N)CCCNC(N)=O RHGKLRLOHDJJDR-UHFFFAOYSA-N 0.000 description 1
- 101100030361 Neurospora crassa (strain ATCC 24698 / 74-OR23-1A / CBS 708.71 / DSM 1257 / FGSC 987) pph-3 gene Proteins 0.000 description 1
- 108091092724 Noncoding DNA Proteins 0.000 description 1
- 241000714209 Norwalk virus Species 0.000 description 1
- 108091093105 Nuclear DNA Proteins 0.000 description 1
- 102000015636 Oligopeptides Human genes 0.000 description 1
- 108010038807 Oligopeptides Proteins 0.000 description 1
- 241000702244 Orthoreovirus Species 0.000 description 1
- 241000283973 Oryctolagus cuniculus Species 0.000 description 1
- KJWZYMMLVHIVSU-IYCNHOCDSA-N PGK1 Chemical compound CCCCC[C@H](O)\C=C\[C@@H]1[C@@H](CCCCCCC(O)=O)C(=O)CC1=O KJWZYMMLVHIVSU-IYCNHOCDSA-N 0.000 description 1
- 229910019142 PO4 Inorganic materials 0.000 description 1
- 101150087778 PPE65 gene Proteins 0.000 description 1
- 235000019483 Peanut oil Nutrition 0.000 description 1
- 108091005804 Peptidases Proteins 0.000 description 1
- 102000035195 Peptidases Human genes 0.000 description 1
- 102100028251 Phosphoglycerate kinase 1 Human genes 0.000 description 1
- ABLZXFCXXLZCGV-UHFFFAOYSA-N Phosphorous acid Chemical group OP(O)=O ABLZXFCXXLZCGV-UHFFFAOYSA-N 0.000 description 1
- 241000709664 Picornaviridae Species 0.000 description 1
- 229920002873 Polyethylenimine Polymers 0.000 description 1
- 229920001213 Polysorbate 20 Polymers 0.000 description 1
- 239000004365 Protease Substances 0.000 description 1
- 108010076504 Protein Sorting Signals Proteins 0.000 description 1
- 238000011529 RT qPCR Methods 0.000 description 1
- 206010037742 Rabies Diseases 0.000 description 1
- 241000711798 Rabies lyssavirus Species 0.000 description 1
- 241000700159 Rattus Species 0.000 description 1
- 108020004511 Recombinant DNA Proteins 0.000 description 1
- 102000007056 Recombinant Fusion Proteins Human genes 0.000 description 1
- 108010008281 Recombinant Fusion Proteins Proteins 0.000 description 1
- 101150069016 Rep40 gene Proteins 0.000 description 1
- 101150100379 Rep52 gene Proteins 0.000 description 1
- 101150051517 Rep68 gene Proteins 0.000 description 1
- 101150076399 Rep78 gene Proteins 0.000 description 1
- 108700008625 Reporter Genes Proteins 0.000 description 1
- 108091028664 Ribonucleotide Proteins 0.000 description 1
- PYMYPHUHKUWMLA-LMVFSUKVSA-N Ribose Natural products OC[C@@H](O)[C@@H](O)[C@@H](O)C=O PYMYPHUHKUWMLA-LMVFSUKVSA-N 0.000 description 1
- 241000283984 Rodentia Species 0.000 description 1
- 102100025992 S-methylmethionine-homocysteine S-methyltransferase BHMT2 Human genes 0.000 description 1
- 206010039491 Sarcoma Diseases 0.000 description 1
- 108010077895 Sarcosine Proteins 0.000 description 1
- 241000713311 Simian immunodeficiency virus Species 0.000 description 1
- 101150085022 Slirp gene Proteins 0.000 description 1
- 102100027693 Small EDRK-rich factor 1 Human genes 0.000 description 1
- 241000713675 Spumavirus Species 0.000 description 1
- 208000000389 T-cell leukemia Diseases 0.000 description 1
- 208000028530 T-cell lymphoblastic leukemia/lymphoma Diseases 0.000 description 1
- RYYWUUFWQRZTIU-UHFFFAOYSA-N Thiophosphoric acid Chemical class OP(O)(S)=O RYYWUUFWQRZTIU-UHFFFAOYSA-N 0.000 description 1
- 108091036066 Three prime untranslated region Proteins 0.000 description 1
- 108010022394 Threonine synthase Proteins 0.000 description 1
- 108091034131 VA RNA Proteins 0.000 description 1
- 241000711975 Vesicular stomatitis virus Species 0.000 description 1
- 108010067390 Viral Proteins Proteins 0.000 description 1
- 108091093126 WHP Posttranscriptional Response Element Proteins 0.000 description 1
- 241001492404 Woodchuck hepatitis virus Species 0.000 description 1
- 241000714205 Woolly monkey sarcoma virus Species 0.000 description 1
- 241000021375 Xenogenes Species 0.000 description 1
- 101150093411 ZNF143 gene Proteins 0.000 description 1
- 108010084455 Zeocin Proteins 0.000 description 1
- MZVQCMJNVPIDEA-UHFFFAOYSA-N [CH2]CN(CC)CC Chemical group [CH2]CN(CC)CC MZVQCMJNVPIDEA-UHFFFAOYSA-N 0.000 description 1
- 230000021736 acetylation Effects 0.000 description 1
- 238000006640 acetylation reaction Methods 0.000 description 1
- 239000002253 acid Substances 0.000 description 1
- 125000002015 acyclic group Chemical group 0.000 description 1
- 208000011589 adenoviridae infectious disease Diseases 0.000 description 1
- 239000002671 adjuvant Substances 0.000 description 1
- 238000001042 affinity chromatography Methods 0.000 description 1
- 230000002776 aggregation Effects 0.000 description 1
- 238000004220 aggregation Methods 0.000 description 1
- 235000004279 alanine Nutrition 0.000 description 1
- 125000003342 alkenyl group Chemical group 0.000 description 1
- 125000000217 alkyl group Chemical group 0.000 description 1
- 239000002168 alkylating agent Substances 0.000 description 1
- HMFHBZSHGGEWLO-UHFFFAOYSA-N alpha-D-Furanose-Ribose Natural products OCC1OC(O)C(O)C1O HMFHBZSHGGEWLO-UHFFFAOYSA-N 0.000 description 1
- 229940059260 amidate Drugs 0.000 description 1
- 229940124277 aminobutyric acid Drugs 0.000 description 1
- 230000003321 amplification Effects 0.000 description 1
- 230000037005 anaesthesia Effects 0.000 description 1
- 210000004102 animal cell Anatomy 0.000 description 1
- 238000010171 animal model Methods 0.000 description 1
- 210000000612 antigen-presenting cell Anatomy 0.000 description 1
- 239000003963 antioxidant agent Substances 0.000 description 1
- 229910052586 apatite Inorganic materials 0.000 description 1
- 238000013459 approach Methods 0.000 description 1
- PYMYPHUHKUWMLA-WDCZJNDASA-N arabinose Chemical compound OC[C@@H](O)[C@@H](O)[C@H](O)C=O PYMYPHUHKUWMLA-WDCZJNDASA-N 0.000 description 1
- PYMYPHUHKUWMLA-UHFFFAOYSA-N arabinose Natural products OCC(O)C(O)C(O)C=O PYMYPHUHKUWMLA-UHFFFAOYSA-N 0.000 description 1
- 238000003491 array Methods 0.000 description 1
- 210000004507 artificial chromosome Anatomy 0.000 description 1
- 208000004668 avian leukosis Diseases 0.000 description 1
- 230000001580 bacterial effect Effects 0.000 description 1
- 239000003855 balanced salt solution Substances 0.000 description 1
- 239000011324 bead Substances 0.000 description 1
- 230000009286 beneficial effect Effects 0.000 description 1
- SRBFZHDQGSBBOR-UHFFFAOYSA-N beta-D-Pyranose-Lyxose Natural products OC1COC(O)C(O)C1O SRBFZHDQGSBBOR-UHFFFAOYSA-N 0.000 description 1
- 229940000635 beta-alanine Drugs 0.000 description 1
- 230000004071 biological effect Effects 0.000 description 1
- 210000004369 blood Anatomy 0.000 description 1
- 239000008280 blood Substances 0.000 description 1
- 238000006664 bond formation reaction Methods 0.000 description 1
- 210000000988 bone and bone Anatomy 0.000 description 1
- 210000004556 brain Anatomy 0.000 description 1
- 238000009395 breeding Methods 0.000 description 1
- 230000001488 breeding effect Effects 0.000 description 1
- 239000007975 buffered saline Substances 0.000 description 1
- 238000004422 calculation algorithm Methods 0.000 description 1
- 150000004657 carbamic acid derivatives Chemical class 0.000 description 1
- 125000002837 carbocyclic group Chemical group 0.000 description 1
- 125000004432 carbon atom Chemical group C* 0.000 description 1
- 239000002134 carbon nanofiber Substances 0.000 description 1
- 239000002041 carbon nanotube Substances 0.000 description 1
- 229910021393 carbon nanotube Inorganic materials 0.000 description 1
- 230000015556 catabolic process Effects 0.000 description 1
- 239000006143 cell culture medium Substances 0.000 description 1
- 230000022131 cell cycle Effects 0.000 description 1
- 238000002659 cell therapy Methods 0.000 description 1
- 229920002301 cellulose acetate Polymers 0.000 description 1
- 238000005119 centrifugation Methods 0.000 description 1
- 239000002738 chelating agent Substances 0.000 description 1
- 239000003153 chemical reaction reagent Substances 0.000 description 1
- 229960002173 citrulline Drugs 0.000 description 1
- 235000013477 citrulline Nutrition 0.000 description 1
- 238000005352 clarification Methods 0.000 description 1
- 239000008395 clarifying agent Substances 0.000 description 1
- 238000003776 cleavage reaction Methods 0.000 description 1
- 230000004186 co-expression Effects 0.000 description 1
- 238000004891 communication Methods 0.000 description 1
- 230000000295 complement effect Effects 0.000 description 1
- 150000001875 compounds Chemical class 0.000 description 1
- 238000007796 conventional method Methods 0.000 description 1
- 239000012228 culture supernatant Substances 0.000 description 1
- 238000012258 culturing Methods 0.000 description 1
- 125000000392 cycloalkenyl group Chemical group 0.000 description 1
- 125000000753 cycloalkyl group Chemical group 0.000 description 1
- 230000009089 cytolysis Effects 0.000 description 1
- 230000002950 deficient Effects 0.000 description 1
- 238000006731 degradation reaction Methods 0.000 description 1
- YSMODUONRAFBET-UHFFFAOYSA-N delta-DL-hydroxylysine Natural products NCC(O)CCC(N)C(O)=O YSMODUONRAFBET-UHFFFAOYSA-N 0.000 description 1
- 239000000412 dendrimer Substances 0.000 description 1
- 229920000736 dendritic polymer Polymers 0.000 description 1
- 239000005547 deoxyribonucleotide Substances 0.000 description 1
- 125000002637 deoxyribonucleotide group Chemical group 0.000 description 1
- 238000009795 derivation Methods 0.000 description 1
- 239000003599 detergent Substances 0.000 description 1
- 239000008121 dextrose Substances 0.000 description 1
- 230000029087 digestion Effects 0.000 description 1
- 102000004419 dihydrofolate reductase Human genes 0.000 description 1
- 239000003085 diluting agent Substances 0.000 description 1
- LOKCTEFSRHRXRJ-UHFFFAOYSA-I dipotassium trisodium dihydrogen phosphate hydrogen phosphate dichloride Chemical compound P(=O)(O)(O)[O-].[K+].P(=O)(O)([O-])[O-].[Na+].[Na+].[Cl-].[K+].[Cl-].[Na+] LOKCTEFSRHRXRJ-UHFFFAOYSA-I 0.000 description 1
- 208000035475 disorder Diseases 0.000 description 1
- NAGJZTKCGNOGPW-UHFFFAOYSA-N dithiophosphoric acid Chemical class OP(O)(S)=S NAGJZTKCGNOGPW-UHFFFAOYSA-N 0.000 description 1
- 239000002552 dosage form Substances 0.000 description 1
- 230000005670 electromagnetic radiation Effects 0.000 description 1
- 238000005421 electrostatic potential Methods 0.000 description 1
- 238000005538 encapsulation Methods 0.000 description 1
- 230000012202 endocytosis Effects 0.000 description 1
- 230000002708 enhancing effect Effects 0.000 description 1
- 238000006911 enzymatic reaction Methods 0.000 description 1
- YSMODUONRAFBET-UHNVWZDZSA-N erythro-5-hydroxy-L-lysine Chemical compound NC[C@H](O)CC[C@H](N)C(O)=O YSMODUONRAFBET-UHNVWZDZSA-N 0.000 description 1
- NPUKDXXFDDZOKR-LLVKDONJSA-N etomidate Chemical compound CCOC(=O)C1=CN=CN1[C@H](C)C1=CC=CC=C1 NPUKDXXFDDZOKR-LLVKDONJSA-N 0.000 description 1
- 230000002349 favourable effect Effects 0.000 description 1
- 239000000945 filler Substances 0.000 description 1
- 125000000524 functional group Chemical group 0.000 description 1
- 230000004927 fusion Effects 0.000 description 1
- 230000030279 gene silencing Effects 0.000 description 1
- 238000010353 genetic engineering Methods 0.000 description 1
- 239000008103 glucose Substances 0.000 description 1
- 229960001031 glucose Drugs 0.000 description 1
- 235000001727 glucose Nutrition 0.000 description 1
- 230000013595 glycosylation Effects 0.000 description 1
- 238000006206 glycosylation reaction Methods 0.000 description 1
- 239000003102 growth factor Substances 0.000 description 1
- 239000001963 growth medium Substances 0.000 description 1
- 208000006454 hepatitis Diseases 0.000 description 1
- 231100000283 hepatitis Toxicity 0.000 description 1
- 108010051779 histone H3 trimethyl Lys4 Proteins 0.000 description 1
- 239000012510 hollow fiber Substances 0.000 description 1
- 210000005260 human cell Anatomy 0.000 description 1
- 238000004191 hydrophobic interaction chromatography Methods 0.000 description 1
- 238000000530 impalefection Methods 0.000 description 1
- 230000002779 inactivation Effects 0.000 description 1
- 239000004615 ingredient Substances 0.000 description 1
- 230000000977 initiatory effect Effects 0.000 description 1
- 230000002452 interceptive effect Effects 0.000 description 1
- 238000007918 intramuscular administration Methods 0.000 description 1
- 238000001990 intravenous administration Methods 0.000 description 1
- RGXCTRIQQODGIZ-UHFFFAOYSA-O isodesmosine Chemical compound OC(=O)C(N)CCCC[N+]1=CC(CCC(N)C(O)=O)=CC(CCC(N)C(O)=O)=C1CCCC(N)C(O)=O RGXCTRIQQODGIZ-UHFFFAOYSA-O 0.000 description 1
- 229960002725 isoflurane Drugs 0.000 description 1
- 238000001738 isopycnic centrifugation Methods 0.000 description 1
- 239000000644 isotonic solution Substances 0.000 description 1
- 101150066555 lacZ gene Proteins 0.000 description 1
- 239000003446 ligand Substances 0.000 description 1
- 230000029226 lipidation Effects 0.000 description 1
- 150000002632 lipids Chemical class 0.000 description 1
- 244000144972 livestock Species 0.000 description 1
- 238000011068 loading method Methods 0.000 description 1
- 230000007774 longterm Effects 0.000 description 1
- 238000003670 luciferase enzyme activity assay Methods 0.000 description 1
- 238000004020 luminiscence type Methods 0.000 description 1
- 210000004698 lymphocyte Anatomy 0.000 description 1
- 239000006166 lysate Substances 0.000 description 1
- 230000002934 lysing effect Effects 0.000 description 1
- 150000002671 lyxoses Chemical class 0.000 description 1
- 239000006249 magnetic particle Substances 0.000 description 1
- 239000011159 matrix material Substances 0.000 description 1
- 238000005259 measurement Methods 0.000 description 1
- 239000012528 membrane Substances 0.000 description 1
- GMKMEZVLHJARHF-SYDPRGILSA-N meso-2,6-diaminopimelic acid Chemical compound [O-]C(=O)[C@@H]([NH3+])CCC[C@@H]([NH3+])C([O-])=O GMKMEZVLHJARHF-SYDPRGILSA-N 0.000 description 1
- VNWKTOKETHGBQD-UHFFFAOYSA-N methane Chemical class C VNWKTOKETHGBQD-UHFFFAOYSA-N 0.000 description 1
- 230000011987 methylation Effects 0.000 description 1
- 238000007069 methylation reaction Methods 0.000 description 1
- 244000005700 microbiome Species 0.000 description 1
- 238000000520 microinjection Methods 0.000 description 1
- 239000002480 mineral oil Substances 0.000 description 1
- 235000010446 mineral oil Nutrition 0.000 description 1
- 230000002438 mitochondrial effect Effects 0.000 description 1
- 238000001823 molecular biology technique Methods 0.000 description 1
- 210000003205 muscle Anatomy 0.000 description 1
- 238000001728 nano-filtration Methods 0.000 description 1
- 239000002070 nanowire Substances 0.000 description 1
- 238000003199 nucleic acid amplification method Methods 0.000 description 1
- 238000011330 nucleic acid test Methods 0.000 description 1
- 239000002777 nucleoside Substances 0.000 description 1
- 150000003833 nucleoside derivatives Chemical class 0.000 description 1
- 239000003921 oil Substances 0.000 description 1
- 235000019198 oils Nutrition 0.000 description 1
- 238000002515 oligonucleotide synthesis Methods 0.000 description 1
- 230000003287 optical effect Effects 0.000 description 1
- 210000000056 organ Anatomy 0.000 description 1
- 230000001590 oxidative effect Effects 0.000 description 1
- 210000000496 pancreas Anatomy 0.000 description 1
- 239000000312 peanut oil Substances 0.000 description 1
- VSIIXMUUUJUKCM-UHFFFAOYSA-D pentacalcium;fluoride;triphosphate Chemical compound [F-].[Ca+2].[Ca+2].[Ca+2].[Ca+2].[Ca+2].[O-]P([O-])([O-])=O.[O-]P([O-])([O-])=O.[O-]P([O-])([O-])=O VSIIXMUUUJUKCM-UHFFFAOYSA-D 0.000 description 1
- YVBBRRALBYAZBM-UHFFFAOYSA-N perfluorooctane Chemical compound FC(F)(F)C(F)(F)C(F)(F)C(F)(F)C(F)(F)C(F)(F)C(F)(F)C(F)(F)F YVBBRRALBYAZBM-UHFFFAOYSA-N 0.000 description 1
- 230000035699 permeability Effects 0.000 description 1
- 239000003208 petroleum Substances 0.000 description 1
- 239000008177 pharmaceutical agent Substances 0.000 description 1
- CWCMIVBLVUHDHK-ZSNHEYEWSA-N phleomycin D1 Chemical compound N([C@H](C(=O)N[C@H](C)[C@@H](O)[C@H](C)C(=O)N[C@@H]([C@H](O)C)C(=O)NCCC=1SC[C@@H](N=1)C=1SC=C(N=1)C(=O)NCCCCNC(N)=N)[C@@H](O[C@H]1[C@H]([C@@H](O)[C@H](O)[C@H](CO)O1)O[C@@H]1[C@H]([C@@H](OC(N)=O)[C@H](O)[C@@H](CO)O1)O)C=1N=CNC=1)C(=O)C1=NC([C@H](CC(N)=O)NC[C@H](N)C(N)=O)=NC(N)=C1C CWCMIVBLVUHDHK-ZSNHEYEWSA-N 0.000 description 1
- NBIIXXVUZAFLBC-UHFFFAOYSA-K phosphate Chemical compound [O-]P([O-])([O-])=O NBIIXXVUZAFLBC-UHFFFAOYSA-K 0.000 description 1
- 239000010452 phosphate Substances 0.000 description 1
- 125000002467 phosphate group Chemical group [H]OP(=O)(O[H])O[*] 0.000 description 1
- 150000004713 phosphodiesters Chemical class 0.000 description 1
- 230000026731 phosphorylation Effects 0.000 description 1
- 238000006366 phosphorylation reaction Methods 0.000 description 1
- BZQFBWGGLXLEPQ-REOHCLBHSA-N phosphoserine Chemical compound OC(=O)[C@@H](N)COP(O)(O)=O BZQFBWGGLXLEPQ-REOHCLBHSA-N 0.000 description 1
- 229920000729 poly(L-lysine) polymer Polymers 0.000 description 1
- 238000006116 polymerization reaction Methods 0.000 description 1
- 239000000256 polyoxyethylene sorbitan monolaurate Substances 0.000 description 1
- 235000010486 polyoxyethylene sorbitan monolaurate Nutrition 0.000 description 1
- 229920000136 polysorbate Polymers 0.000 description 1
- 230000029279 positive regulation of transcription, DNA-dependent Effects 0.000 description 1
- 230000001124 posttranscriptional effect Effects 0.000 description 1
- 239000000843 powder Substances 0.000 description 1
- 238000001556 precipitation Methods 0.000 description 1
- 238000002360 preparation method Methods 0.000 description 1
- 239000003755 preservative agent Substances 0.000 description 1
- 210000001236 prokaryotic cell Anatomy 0.000 description 1
- 238000011321 prophylaxis Methods 0.000 description 1
- 238000011002 quantification Methods 0.000 description 1
- 238000003762 quantitative reverse transcription PCR Methods 0.000 description 1
- 230000002285 radioactive effect Effects 0.000 description 1
- 238000003753 real-time PCR Methods 0.000 description 1
- 108010054624 red fluorescent protein Proteins 0.000 description 1
- 230000009467 reduction Effects 0.000 description 1
- 230000003252 repetitive effect Effects 0.000 description 1
- 230000003362 replicative effect Effects 0.000 description 1
- 230000004044 response Effects 0.000 description 1
- 230000001177 retroviral effect Effects 0.000 description 1
- 239000002336 ribonucleotide Substances 0.000 description 1
- 125000002652 ribonucleotide group Chemical group 0.000 description 1
- 150000003839 salts Chemical class 0.000 description 1
- 229940043230 sarcosine Drugs 0.000 description 1
- 230000007017 scission Effects 0.000 description 1
- 238000012216 screening Methods 0.000 description 1
- 150000003341 sedoheptuloses Chemical class 0.000 description 1
- 238000002864 sequence alignment Methods 0.000 description 1
- 238000012163 sequencing technique Methods 0.000 description 1
- 239000007790 solid phase Substances 0.000 description 1
- 239000002904 solvent Substances 0.000 description 1
- 238000000527 sonication Methods 0.000 description 1
- 239000003549 soybean oil Substances 0.000 description 1
- 235000012424 soybean oil Nutrition 0.000 description 1
- 238000001228 spectrum Methods 0.000 description 1
- 230000035882 stress Effects 0.000 description 1
- 238000007920 subcutaneous administration Methods 0.000 description 1
- 239000004094 surface-active agent Substances 0.000 description 1
- 230000002459 sustained effect Effects 0.000 description 1
- 230000009885 systemic effect Effects 0.000 description 1
- 230000002123 temporal effect Effects 0.000 description 1
- 229940124597 therapeutic agent Drugs 0.000 description 1
- YSMODUONRAFBET-WHFBIAKZSA-N threo-5-hydroxy-L-lysine Chemical compound NC[C@@H](O)CC[C@H](N)C(O)=O YSMODUONRAFBET-WHFBIAKZSA-N 0.000 description 1
- 239000003053 toxin Substances 0.000 description 1
- 231100000765 toxin Toxicity 0.000 description 1
- 108700012359 toxins Proteins 0.000 description 1
- FGMPLJWBKKVCDB-UHFFFAOYSA-N trans-L-hydroxy-proline Natural products ON1CCCC1C(O)=O FGMPLJWBKKVCDB-UHFFFAOYSA-N 0.000 description 1
- 230000005030 transcription termination Effects 0.000 description 1
- 238000010361 transduction Methods 0.000 description 1
- 230000026683 transduction Effects 0.000 description 1
- 230000001131 transforming effect Effects 0.000 description 1
- 230000014616 translation Effects 0.000 description 1
- 241000712461 unidentified influenza virus Species 0.000 description 1
- 235000013311 vegetables Nutrition 0.000 description 1
- 230000017613 viral reproduction Effects 0.000 description 1
- 230000010464 virion assembly Effects 0.000 description 1
- 229940088594 vitamin Drugs 0.000 description 1
- 229930003231 vitamin Natural products 0.000 description 1
- 239000011782 vitamin Substances 0.000 description 1
- 235000013343 vitamin Nutrition 0.000 description 1
- 238000001262 western blot Methods 0.000 description 1
- 238000009736 wetting Methods 0.000 description 1
- 239000000080 wetting agent Substances 0.000 description 1
- 150000003742 xyloses Chemical class 0.000 description 1
Classifications
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12N—MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
- C12N15/00—Mutation or genetic engineering; DNA or RNA concerning genetic engineering, vectors, e.g. plasmids, or their isolation, preparation or purification; Use of hosts therefor
- C12N15/09—Recombinant DNA-technology
- C12N15/63—Introduction of foreign genetic material using vectors; Vectors; Use of hosts therefor; Regulation of expression
- C12N15/79—Vectors or expression systems specially adapted for eukaryotic hosts
- C12N15/85—Vectors or expression systems specially adapted for eukaryotic hosts for animal cells
- C12N15/86—Viral vectors
-
- A—HUMAN NECESSITIES
- A61—MEDICAL OR VETERINARY SCIENCE; HYGIENE
- A61K—PREPARATIONS FOR MEDICAL, DENTAL OR TOILETRY PURPOSES
- A61K48/00—Medicinal preparations containing genetic material which is inserted into cells of the living body to treat genetic diseases; Gene therapy
- A61K48/005—Medicinal preparations containing genetic material which is inserted into cells of the living body to treat genetic diseases; Gene therapy characterised by an aspect of the 'active' part of the composition delivered, i.e. the nucleic acid delivered
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12N—MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
- C12N15/00—Mutation or genetic engineering; DNA or RNA concerning genetic engineering, vectors, e.g. plasmids, or their isolation, preparation or purification; Use of hosts therefor
- C12N15/09—Recombinant DNA-technology
- C12N15/63—Introduction of foreign genetic material using vectors; Vectors; Use of hosts therefor; Regulation of expression
- C12N15/79—Vectors or expression systems specially adapted for eukaryotic hosts
- C12N15/85—Vectors or expression systems specially adapted for eukaryotic hosts for animal cells
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12N—MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
- C12N2750/00—MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA ssDNA viruses
- C12N2750/00011—Details
- C12N2750/14011—Parvoviridae
- C12N2750/14111—Dependovirus, e.g. adenoassociated viruses
- C12N2750/14141—Use of virus, viral particle or viral elements as a vector
- C12N2750/14143—Use of virus, viral particle or viral elements as a vector viral genome or elements thereof as genetic vector
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12N—MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
- C12N2830/00—Vector systems having a special element relevant for transcription
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12N—MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
- C12N2830/00—Vector systems having a special element relevant for transcription
- C12N2830/20—Vector systems having a special element relevant for transcription transcription of more than one cistron
- C12N2830/205—Vector systems having a special element relevant for transcription transcription of more than one cistron bidirectional
Abstract
The invention relates generally to compact bidirectional promoters and their use in expressing genes, e.g., for treating disease.
Description
COMPACT BIDIRECTIONAL PROMOTERS FOR GENE EXPRESSION
CROSS REFERENCE TO RELATED APPLICATIONS
[0001] This application claims the benefit of and priority to U.S. Provisional Application No. 63/403,571, filed September 2, 2022, the entire disclosure of which is hereby incorporated by reference in its entirety for all purposes.
FIELD OF THE INVENTION
[0002] The invention relates generally to compact bidirectional promoters and their use in expressing genes, e.g., for treating disease.
BACKGROUND
[0003] Adeno-associated viruses (AAV) provide a safe means of therapeutic gene delivery; however, a significant technical obstacle limits an AAV vector’s utility: its small payload capacity. The large size of certain genes, in addition to a promoter, terminator, and two inverted terminal repeats (ITRs), presents a significant barrier to AAV packaging. In particular, due to the large size of current promoters, there is less space in vectors for regulatory elements that can improve safety, thereby making manufacturing less efficient. Initially, efforts were aimed at fitting the expression cassette within a single AAV by eliminating the promoter entirely. More recent attempts at overcoming the limited payload capacity of AAVs have focused on a combination of small synthetic promoters and/or a truncated payload gene. There exists an outstanding need for compositions and methods for packaging larger genes in vectors, such as AAV, which are suitable for gene delivery.
[0004] In addition to the above, viral promoters with ubiquitous expression (e.g., CMV, CBA, and CAG) have been the standard for decades. The reliance on novel capsid technologies has failed to address the necessity of tissue-specificity as a feature in successful gene therapy. Further, existing promoters strongly overexpress proteins, leading to cell stress, toxicity, immunogenicity, and silencing, while existing enhancers are known to increase the risk of oncogenicity. Therefore, there exists an additional outstanding need for compositions that may provide a spectrum of gene expression.
SUMMARY OF THE INVENTION
[0005] The invention is based, at least in part, upon the surprising discovery that compact bidirectional promoters can effectively drive expression of one or more genes (e.g., by RNA polymerase II) useful in, for example, gene therapy applications. Adeno-associated viruses (AAV) are a promising delivery vehicle for nucleic acids for gene therapy, but the small size of AAV is a barrier to delivery of genes, such as those having coding sequences above about 4000 bp, and vector components. Here, the disclosure provides a solution to this problem using a compact bidirectional promoter to deliver sufficient and sustained expression of genes, e.g., by RNA polymerase II, via AAV. In some embodiments, the bidirectional promoter is capable of promoting transcription (e.g., by RNA polymerase II) of two coding sequences positioned on opposite sides of the promoter. Accordingly, the compact bidirectional promoters of the invention provide at least four notable advantages over the prior art, including 1) providing space for regulatory elements that can improve safety of a vector, as well as 2) increased tissue-specificity and 3) tunable expression profiles to overcome issues of lack of tissue- and expression-sensitivity. Further, 4) the compact bidirectional promoters of the invention are derived from mammalian promoters, enabling increased durability as compared to viral promoters that have a propensity to be silenced. As yet another advantage, the nucleic acid molecules of the invention provide the notable advantages of lower oncogenicity, for example, due to omission of enhancers, as well as lower immunogenicity, as provided by adjusting tissue- and expression-specificity such that antigen-presenting cells are reduced compared to expression driven by canonical nucleic acid molecules and promoters, respectively.
[0006] Accordingly, in one aspect, the disclosure relates to a nucleic acid including a compact bidirectional promoter, or a functional fragment or variant thereof, operably linked to at least one heterologous coding sequence, wherein the compact bidirectional promoter is less than about 1000 bp, and wherein the bidirectional promoter is capable of promoting transcription of two coding sequences positioned on opposite sides of the promoter.
[0007] In another aspect, the disclosure relates to an expression construct including the nucleic acid of the foregoing aspect.
[0008] In another aspect, the disclosure relates to a vector including the expression construct of the foregoing aspect, optionally wherein the vector is a plasmid, a DNA vector, an RNA vector, a virion, or a viral vector.
[0009] In some embodiments of the foregoing aspect, the vector is a viral vector. In some embodiments, the viral vector is an AAV, lentivirus, adenovirus, simian virus 40, vaccinia virus, measles virus, herpes virus, or poxvirus. In some embodiments, the viral vector is an AAV vector. For example, in some embodiments, the AAV is a single-stranded AAV (ssAAV) vector. In some embodiments, the AAV is a self-complementary AAV (scAAV) vector.
[0010] In another aspect, the disclosure relates to a method of expressing a heterologous coding sequence in a cell, the method including transfecting the cell with the expression construct or the vector of any one of the foregoing aspects.
[0011] In another aspect, the disclosure relates to a method of treating a disease in a subject in need thereof, the method including administering to the subject the vector of any one of the foregoing aspects.
[0012] In another aspect, the disclosure relates to a method of expressing at least one heterologous coding sequence in a target cell, the method including introducing into a subject a nucleic acid including a compact bidirectional promoter, or a functional fragment or variant thereof, operably linked to at least one heterologous coding sequence, wherein the compact bidirectional promoter is less than about 1000 bp, and wherein the bidirectional promoter is capable of promoting transcription of two coding sequences positioned on opposite sides of the promoter in the cell.
[0013] In another aspect, the disclosure relates to a method of expressing two heterologous coding sequences in different target cells, the method including introducing into a subject a nucleic acid including a compact bidirectional promoter, or a functional fragment or variant thereof, operably linked to the two heterologous coding sequences positioned on opposite sides of the compact bidirectional promoter in the cell, wherein the compact bidirectional promoter is less than about 1000 bp, and wherein the compact bidirectional promoter promotes transcription of one of the coding sequences in a first target cell and promotes transcription of the other coding sequence in a second target cell.
[0014] In some embodiments of any of the foregoing aspects, the compact bidirectional promoter, or the functional fragment or the variant thereof, expresses the at least one heterologous coding sequence in a target cell.
[0015] In some embodiments of any of the foregoing aspects, the compact bidirectional promoter, or the functional fragment or the variant thereof, is capable of expressing each of the two heterologous coding sequences in a partially overlapping set of target cells.
[0016] In some embodiments of any of the foregoing aspects, the at least one coding sequence is codon optimized. In some embodiments, the codon optimized coding sequence comprises a nucleic acid sequence selected from any one of SEQ ID NOs: 819-836, or a nucleic acid sequence having at least about 90%, at least about 95%, at least about 98%, at least about 99%, or at least about 99.5% sequence identity thereto.
[0017] In another aspect, the disclosure relates to a method of administering an scAAV vector including a therapeutic coding sequence at a reduced dose for treating a disease treatable by the therapeutic coding sequence, the method including administering to a subject an scAAV vector including a compact bidirectional promoter, or a functional fragment or variant thereof, operably linked to the therapeutic coding sequence, wherein the compact bidirectional promoter, or the functional fragment or the variant thereof, is less than about 1000 bp and is heterologous to the therapeutic coding sequence, wherein the scAAV vector is administered at a reduced dose as compared to the therapeutically effective dose for an ssAAV vector including the therapeutic coding sequence.
[0018] In some embodiments of the foregoing aspect, the reduced dose is between about 10-fold and about 600-fold lower than the therapeutically effective dose for an ssAAV vector. For example, in some embodiments, the reduced dose is about 10-fold lower than the therapeutically effective dose for an ssAAV vector.
[0019] In some embodiments of any of the foregoing aspects, the bidirectional promoter, or the functional fragment or the variant thereof, is capable of promoting transcription of two coding sequences positioned on opposite sides of the promoter.
[0020] In some embodiments of any of the foregoing aspects, the compact bidirectional promoter includes a nucleic acid sequence selected from any one of SEQ ID NOs: 1-800, or a nucleic acid sequence having at least about 80%, at least about 85%, at least about 90%, at least about 95%, at least about 98%, at least about 99%, or at least about 99.5% sequence identity thereto.
[0021] In some embodiments of any of the foregoing aspects, the compact bidirectional promoter, or the functional fragment or the variant thereof, includes at least 95%, at least 96%, at least 97%, at least 98%, at least 99%, at least 99.5%, or 100% sequence identity to a naturally occurring mammalian promoter.
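The percent-identity thresholds recited above (e.g., at least about 80% or at least about 95%) can be checked against a pairwise alignment. The following is a minimal sketch only, assuming the two sequences have already been aligned to equal length with `-` marking gaps, and assuming identity is counted as matching columns divided by all alignment columns that are not gap-only. Other conventions for the denominator exist, and the disclosure does not specify one here. The function name `percent_identity` is hypothetical.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity of two pre-aligned sequences ('-' marks a gap).

    Assumed convention: matches / alignment columns, skipping columns
    where both sequences have a gap.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")

    columns = 0
    matches = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == "-" and b == "-":
            continue  # gap-only column carries no information
        columns += 1
        if a == b:
            matches += 1  # identical residues (a gap never equals a base here)

    if columns == 0:
        raise ValueError("alignment has no informative columns")
    return 100.0 * matches / columns


# Ten aligned positions with one mismatch at the last position.
identity = percent_identity("ACGTACGTAC", "ACGTACGTAA")
meets_90_percent = identity >= 90.0
```

A candidate sequence would then be compared against a reference sequence and tested against whichever threshold applies.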
[0022] In some embodiments of any of the foregoing aspects, the compact bidirectional promoter, or the functional fragment or the variant thereof, expresses the therapeutic coding sequence in a target cell.
[0023] In some embodiments of any of the foregoing aspects, the therapeutic coding sequence encodes A1AT, ALPL, ARSA, BBS1, BEST1, CAH, CFH, CFI, CHM, CLN2, CLN7, CNGA3, CYP46A1, F9, FKRP, FMR1, FMRP, FOXG1, GAD, GALC, GALGT2, GBA1, GBE1, GLB1, GRN, HEXA, HTRA1, IDS, IDUA, LAMP2, LCA5, MECP2, MFN2, MMUT, MTM1, NAGLU, ND4, PAH, PIGA, PRKN, RPE65, SERPING1, SGSH, SLC13A5, SLC6A1, or a functional fragment or variant thereof.
[0024] In some embodiments of any of the foregoing aspects, the therapeutic coding sequence is codon optimized. In some embodiments, the codon optimized coding sequence comprises a nucleic acid sequence selected from any one of SEQ ID NOs: 819-836, or a nucleic acid sequence having at least about 90%, at least about 95%, at least about 98%, at least about 99%, or at least about 99.5% sequence identity thereto.
[0025] In some embodiments of a foregoing aspect, the therapeutic coding sequence is less than about 750 amino acids. For example, in some embodiments, the therapeutic coding sequence is from about 350 amino acids to about 750 amino acids.
[0026] In another aspect, the disclosure relates to a method including: obtaining a genome file including information about the location of transcription start sites on the plus and minus strands of a chromosome; and identifying regions between a transcription start site on the minus strand of the chromosome and a transcription start site on the plus strand of the chromosome, thereby identifying one or more bidirectional promoters.
[0027] In some embodiments of the foregoing aspect, the genome file includes annotations categorized by chromosome, wherein the annotations include indices, wherein the indices include genes, pseudogenes, and coding regions for protein-coding genes, wherein each coding region includes a transcription start site.
[0028] In some embodiments of the foregoing aspect, the one or more bidirectional promoters are identified by obtaining a non-transitory computer readable medium including instructions that, when executed by a processor, cause the processor to identify the regions between the transcription start site on the minus strand of a chromosome and the transcription start site on the plus strand of the chromosome.
[0029] In some embodiments of the foregoing aspect, the genome file including annotations includes mammalian annotations. For example, in some embodiments, the mammalian annotations include human annotations or mouse annotations.
[0030] In some embodiments of the foregoing aspect, the genome file including annotations is GRCh38_latest_genomic.gff or GRCm39_vM27.gff3. For example, in some embodiments, the genome file is GRCm39_vM27.gff3.
[0031] In some embodiments of any of the foregoing aspects, the one or more bidirectional promoters are less than about 1000 bp. For example, in some embodiments, the one or more bidirectional promoters are between about 30 bp and about 800 bp. In some embodiments, the one or more bidirectional promoters are between about 30 bp and about 600 bp. In some embodiments, the one or more bidirectional promoters are between about 30 bp and about 400 bp. In some embodiments, the one or more bidirectional promoters are between about 30 bp and about 200 bp.
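The identification method described above (scanning an annotation file for regions that lie between a minus-strand transcription start site and a plus-strand transcription start site, then keeping those under a size limit) can be sketched as follows. This is an illustrative sketch under stated assumptions, not the implementation used in the disclosure: it assumes a tab-delimited GFF3-style input with nine columns, takes the TSS of a `gene` feature to be its start coordinate on the plus strand and its end coordinate on the minus strand, and pairs only TSSs that are adjacent after sorting. The names `Candidate`, `find_divergent_regions`, `feature_type`, and `max_len`, and the example records, are hypothetical.

```python
from collections import defaultdict
from typing import Iterable, NamedTuple


class Candidate(NamedTuple):
    seqid: str
    start: int   # first base after the minus-strand TSS (1-based)
    end: int     # last base before the plus-strand TSS (1-based)
    length: int  # number of bases between the two TSSs


def find_divergent_regions(gff_lines: Iterable[str],
                           feature_type: str = "gene",
                           max_len: int = 1000) -> list[Candidate]:
    """Return regions between a minus-strand TSS and the next plus-strand TSS."""
    tss_by_seq = defaultdict(list)
    for line in gff_lines:
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and GFF comments/directives
        cols = line.rstrip("\n").split("\t")
        if len(cols) < 9 or cols[2] != feature_type or cols[6] not in ("+", "-"):
            continue
        start, end, strand = int(cols[3]), int(cols[4]), cols[6]
        # Assumption: the TSS is the 5' end of the annotated feature.
        tss = start if strand == "+" else end
        tss_by_seq[cols[0]].append((tss, strand))

    candidates = []
    for seqid, sites in tss_by_seq.items():
        sites.sort()
        for (left_pos, left_strand), (right_pos, right_strand) in zip(sites, sites[1:]):
            # Divergent pair: minus-strand TSS upstream of a plus-strand TSS.
            if left_strand == "-" and right_strand == "+":
                length = right_pos - left_pos - 1
                if 0 < length < max_len:
                    candidates.append(
                        Candidate(seqid, left_pos + 1, right_pos - 1, length))
    return candidates


# Made-up example records (not taken from any real annotation file).
EXAMPLE = [
    "##gff-version 3",
    "\t".join(["chr1", ".", "gene", "100", "500", ".", "-", ".", "ID=geneA"]),
    "\t".join(["chr1", ".", "gene", "701", "1500", ".", "+", ".", "ID=geneB"]),
    "\t".join(["chr1", ".", "gene", "5000", "6000", ".", "+", ".", "ID=geneC"]),
    "\t".join(["chr1", ".", "gene", "9000", "9500", ".", "-", ".", "ID=geneD"]),
]
candidates = find_divergent_regions(EXAMPLE)
```

In this example only geneA (minus strand, TSS at 500) and geneB (plus strand, TSS at 701) form a divergent pair, so the single candidate spans positions 501 to 700. To apply a narrower range such as about 30 bp to about 400 bp, the candidates could be filtered further on `length`.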
[0032] In some embodiments of a foregoing aspect, the method further includes linking the one or more bidirectional promoters to at least one heterologous coding sequence.
[0033] In some embodiments of a foregoing aspect, the method further includes linking the one or more bidirectional promoters to two heterologous coding sequences. In some embodiments, the one or more bidirectional promoters are capable of promoting transcription of two coding sequences positioned on opposite sides of the promoter.
[0034] In some embodiments of any of the foregoing aspects, the compact promoter is operably linked to a 5' UTR.
[0035] In some embodiments of any of the foregoing aspects, the compact bidirectional promoter, or the functional fragment or the variant thereof, is operably linked to a Kozak consensus sequence.
[0036] In some embodiments of a foregoing aspect, the method further includes linking each of the one or more bidirectional promoters to only one heterologous coding sequence. In other embodiments, the method further includes linking each of the one or more bidirectional promoters to two heterologous coding sequences positioned on opposite sides of the promoter.
[0037] In some embodiments of any of the foregoing aspects, the two heterologous coding sequences include the same coding sequence. In some embodiments, the two heterologous coding sequences include different coding sequences.
[0038] In some embodiments of any of the foregoing aspects, the one or more bidirectional promoters are capable of expressing the at least one heterologous coding sequence in a target cell. For example, in some embodiments, the target cell is a lung cell, a pancreatic cell, a kidney cell, a muscle cell, a liver cell, a retinal cell, a neuron, a glial cell, an endothelial cell, or an epithelial cell.
[0039] In some embodiments of any of the foregoing aspects, the one or more bidirectional promoters are capable of expressing each of the two heterologous coding sequences: (a) in the same target cell or cells, (b) in different target cells, or (c) in a partially overlapping set of target cells.
[0040] In some embodiments of any of the foregoing aspects, the compact bidirectional promoter expresses a luciferase reporter at a higher level than does a herpes simplex virus (HSV) thymidine kinase (TK) promoter.
[0041] In some embodiments of a foregoing aspect, the at least one coding sequence encodes CFTR, ATP7B, ATP7A, AGL, CPS1, A1AT, ALPL, ARSA, BBS1, BEST1, CAH, CFH, CFI, CHM, CLN2, CLN7, CNGA3, CYP46A1, F9, FKRP, FMR1, FMRP, FOXG1, GAD, GALC, GALGT2, GBA1, GBE1, GLB1, GRN, HEXA, HTRA1, IDS, IDUA, LAMP2, LCA5, MECP2, MFN2, MMUT, MTM1, NAGLU, ND4, PAH, PIGA, PRKN, RPE65, SERPING1, SGSH, SLC13A5, or SLC6A1.
[0042] These and other aspects and features of the invention are described in the following detailed description and claims.
BRIEF DESCRIPTION OF THE DRAWINGS
[0043] The invention can be more completely understood with reference to the following drawings.
[0044] FIG. 1 is a schematic showing an exemplary nucleic acid that includes a bidirectional promoter (“Promoter Region;” e.g., a compact bidirectional promoter) operably linked to two heterologous coding regions, each of which is transcribed by RNA polymerase II.
[0045] FIG. 2 is a graph showing the number of genes in the human genome as a function of their length in base pairs (bp). The subset of genes that can be packaged into an ssAAV using the compact bidirectional promoters identified herein is highlighted in grey.
[0046] FIG. 3 is a graph showing the number of genes in the human genome as a function of their length in bp. The subset of genes that can be packaged into an scAAV using the compact bidirectional promoters identified herein is highlighted in grey.
[0047] FIGs. 4A-4B are a set of graphs depicting the unique tissue expression profiles of two genes, COX15 and CUTC, that flank a bidirectional promoter identified in Example 1. The tissue expression data is plotted as a function of normalized protein-coding transcripts per million (nTPM; y-axis) and was obtained using the Human Protein Atlas (HPA) and the Genotype-Tissue Expression (GTEx) databases, with expression data from HPA shown in FIG. 4A and consensus expression data from HPA and GTEx shown in FIG. 4B.
[0048] FIGs. 5A-5H are a set of radar plots depicting the unique liver-, hepatocyte-, neuronal-, kidney tubular-, skeletal muscle-, cerebral cortex-, retina-, and rod photoreceptor-specific expression profiles of the compact bidirectional promoters of the disclosure (e.g., a promoter having less than 300 bp). Each radar plot reflects a single promoter, with specific tissues indicated at the vertices. This provides a y-axis for each tissue (with zero at the center) and with increasing promoter activity radiating from the center, such that the value of the number indicates nTPM levels from the GTEx transcriptomics dataset.
[0049] FIGs. 6A-6D are a set of radar plots, as described in FIGs. 5A-5H, depicting cell subtype expression profiles in the lung for four exemplary compact bidirectional promoters of the disclosure.
[0050] FIG. 7 is a schematic outline of a method of the disclosure used to identify a bidirectional promoter (e.g., a compact bidirectional promoter). In brief, the schematic depicts, from top to bottom, the steps of (a) obtaining a genome file (experimental data set) including database-derived annotations categorized by chromosome, wherein the annotations are indexed by, for example, genes, pseudogenes, and coding regions for protein-coding genes, wherein each coding region includes a transcription start site; and (b) obtaining a non-transitory computer readable medium comprising instructions that, when executed by a processor, cause the processor to: determine transcription start sites and orientations, identify divergent transcription and the genomic coordinates thereof, and extract the sequence between the divergently transcribed start sites, thereby identifying one or more bidirectional promoters.
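The final extraction step outlined for FIG. 7 can be sketched as follows. This is an illustrative Python fragment under stated assumptions, not the instructions of the disclosure: the chromosome sequence is assumed to be already loaded as a string, coordinates are assumed to be 1-based and inclusive (as in GFF3), and the helper names `extract_region` and `reverse_complement` are hypothetical.

```python
# Translation table used to complement DNA bases (case is preserved).
COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")

def extract_region(chrom_seq, start, end):
    """Return the 1-based, inclusive interval [start, end] of chrom_seq."""
    return chrom_seq[start - 1:end]

def reverse_complement(seq):
    """Return the sequence as read 5'->3' on the opposite strand."""
    return seq.translate(COMPLEMENT)[::-1]

# Hypothetical toy chromosome; real input would come from a genome FASTA file.
chrom = "AACCGGTTAC"
promoter_plus = extract_region(chrom, 3, 7)            # plus-strand orientation
promoter_minus = reverse_complement(promoter_plus)     # minus-strand orientation
print(promoter_plus, promoter_minus)
```

Because a bidirectional promoter drives transcription in both directions, the same interval can be read in either orientation; the reverse complement gives the sequence as seen from the minus-strand gene.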
[0051] FIG. 8 is a graph depicting the expression profiles of the thymidine kinase (TK; “p322”), human H1 (control pol II/pol III promoter; “p096”), human MORN5 (“p387”), human RPL9 (“p389”), human NDUFB9 (“p390”), human RPS28 (“p391”), and human SLIRP (“p392”) promoters in HeLa cells using a luciferase reporter assay. Data was obtained from n > 3 technical replicates and n > 3 biological replicates, with error bars indicating mean ± SEM, where SEM = SD/√N.
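The error-bar statistic used in the reporter assays, SEM = SD/√N, can be computed as in the following minimal Python sketch. The readings are hypothetical values, not data from the disclosure, and the use of the sample standard deviation (N - 1 denominator) is an assumption, since the text does not state which form of SD was used.

```python
import math
import statistics

def mean_and_sem(values):
    """Return (mean, SEM) where SEM = SD / sqrt(N)."""
    n = len(values)
    sd = statistics.stdev(values)  # sample SD; an assumption, see lead-in
    return statistics.mean(values), sd / math.sqrt(n)

# Hypothetical normalized luciferase readings from three replicates.
readings = [2.0, 4.0, 6.0]
mean, sem = mean_and_sem(readings)
print(f"mean = {mean:.3f}, SEM = {sem:.3f}")
```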
[0052] FIG. 9 is a graph depicting the expression profiles of the TK (“p322”), human H1 (control pol II/pol III promoter; “p096”), human MORN5 (“p387”), human RPL9 (“p389”), human NDUFB9 (“p390”), human RPS28 (“p391”), and human SLIRP (“p392”) promoters in A549 cells using a luciferase reporter assay. Data was obtained from n > 3 technical replicates and n > 3 biological replicates, with error bars indicating mean ± SEM, where SEM = SD/√N.
[0053] FIG. 10 is a graph depicting the expression profiles of the TK (“p322”), human H1 (control pol II/pol III promoter; “p096”), human MORN5 (“p387”), human RPL9 (“p389”), human NDUFB9 (“p390”), human RPS28 (“p391”), and human SLIRP (“p392”) promoters in CFBE cells using a luciferase reporter assay. Data was obtained from n > 3 technical replicates and n > 3 biological replicates, with error bars indicating mean ± SEM, where SEM = SD/√N.
[0054] FIGs. 11A-11B are a set of graphs depicting the unique tissue expression profiles of two genes, MORN5 and NDUFA8, that flank the bidirectional promoter MORN5 described in Example 2. The tissue expression data is plotted as a function of nTPM (y-axis) and was obtained using the HPA and GTEx databases, with expression data from HPA shown in FIG. 11A and consensus expression data from HPA and GTEx shown in FIG. 11B.
[0055] FIGs. 12A-12M are a set of graphs depicting the unique central nervous system tissue expression profiles of two genes, MORN5 and NDUFA8, that flank the bidirectional promoter MORN5 described in Example 2. The data was obtained using the HPA and GTEx databases, with expression data from the cerebral cortex shown in FIG. 12A, olfactory bulb shown in FIG. 12B, hippocampal formation shown in FIG. 12C, amygdala shown in FIG. 12D, basal ganglia shown in FIG. 12E, thalamus shown in FIG. 12F, hypothalamus shown in FIG. 12G, cerebellum shown in FIG. 12H, midbrain shown in FIG. 12I, pons shown in FIG. 12J, medulla oblongata shown in FIG. 12K, spinal cord shown in FIG. 12L, and white matter shown in FIG. 12M.
[0056] FIG. 13 is a set of graphs depicting the unique single cell RNA expression profiles of two genes, MORN5 and NDUFA8, that flank the bidirectional promoter MORN5 described in Example 2. The tissue expression data is plotted as a function of nTPM (y-axis).
[0057] FIG. 14 is a graph depicting the unique blood cell RNA expression profiles of two genes, MORN5 and NDUFA8, that flank the bidirectional promoter MORN5 described in Example 2. The tissue expression data is plotted as a function of nTPM (y-axis).
[0058] FIGs. 15A-15B are a set of graphs depicting the unique tissue expression profiles of two genes, NDUFB9 and TATDN1, that flank the bidirectional promoter NDUFB9 described in Example 2. The tissue expression data is plotted as a function of nTPM (y-axis) and was obtained using the HPA and GTEx databases, with expression data from HPA shown in FIG. 15A and consensus expression data from HPA and GTEx shown in FIG. 15B.
[0059] FIGs. 16A-16M are a set of graphs depicting the unique central nervous system tissue expression profiles of two genes, NDUFB9 and TATDN1, that flank the bidirectional promoter NDUFB9 described in Example 2. The data was obtained using the HPA and GTEx databases, with expression data from the cerebral cortex shown in FIG. 16A, olfactory bulb shown in FIG. 16B, hippocampal formation shown in FIG. 16C, amygdala shown in FIG. 16D, basal ganglia shown in FIG. 16E, thalamus shown in FIG. 16F, hypothalamus shown in FIG. 16G, cerebellum shown in FIG. 16H, midbrain shown in FIG. 16I, pons shown in FIG. 16J, medulla oblongata shown in FIG. 16K, spinal cord shown in FIG. 16L, and white matter shown in FIG. 16M.
[0060] FIG. 17 is a set of graphs depicting the unique single cell RNA expression profiles of two genes, NDUFB9 and TATDN1, that flank the bidirectional promoter NDUFB9 described in Example 2. The tissue expression data is plotted as a function of nTPM (y-axis).
[0061] FIG. 18 is a graph depicting the unique blood cell RNA expression profiles of two genes, NDUFB9 and TATDN1, that flank the bidirectional promoter NDUFB9 described in Example 2. The tissue expression data is plotted as a function of nTPM (y-axis).
[0062] FIGs. 19A-19B are a set of graphs depicting the unique tissue expression profiles of two genes, NDUFA7 and RPS28, that flank the bidirectional promoter RPS28 described in Example 2. The tissue expression data is plotted as a function of nTPM (y-axis) and was obtained using the HPA and GTEx databases, with expression data from HPA shown in FIG. 19A and consensus expression data from HPA and GTEx shown in FIG. 19B.
[0063] FIGs. 20A-20M are a set of graphs depicting the unique central nervous system tissue expression profiles of two genes, NDUFA7 and RPS28, that flank the bidirectional promoter RPS28 described in Example 2. The data was obtained using the HPA and GTEx databases, with expression data from the cerebral cortex shown in FIG. 20A, olfactory bulb shown in FIG. 20B, hippocampal formation shown in FIG. 20C, amygdala shown in FIG. 20D, basal ganglia shown in FIG. 20E, thalamus shown in FIG. 20F, hypothalamus shown in FIG. 20G, cerebellum shown in FIG. 20H, midbrain shown in FIG. 20I, pons shown in FIG. 20J, medulla oblongata shown in FIG. 20K, spinal cord shown in FIG. 20L, and white matter shown in FIG. 20M.
[0064] FIG. 21 is a set of graphs depicting the unique single cell RNA expression profiles of two genes, NDUFA7 and RPS28, that flank the bidirectional promoter RPS28 described in Example 2. The tissue expression data is plotted as a function of nTPM (y-axis).
[0065] FIG. 22 is a graph depicting the unique blood cell RNA expression profiles of two genes, NDUFA7 and RPS28, that flank the bidirectional promoter RPS28 described in Example 2. The tissue expression data is plotted as a function of nTPM (y-axis).
[0066] FIGs. 23A-23B are a set of graphs depicting the unique tissue expression profiles of two genes, ALKBH1 and SLIRP, that flank the bidirectional promoter SLIRP described in Example 2. The tissue expression data is plotted as a function of nTPM (y-axis) and was obtained using the HPA and GTEx databases, with expression data from HPA shown in FIG. 23A and consensus expression data from HPA and GTEx shown in FIG. 23B.
[0067] FIGs. 24A-24M are a set of graphs depicting the unique central nervous system tissue expression profiles of two genes, ALKBH1 and SLIRP, that flank the bidirectional promoter SLIRP described in Example 2. The data was obtained using the HPA and GTEx databases, with expression data from the cerebral cortex shown in FIG. 24A, olfactory bulb shown in FIG. 24B, hippocampal formation shown in FIG. 24C, amygdala shown in FIG. 24D, basal ganglia shown in FIG. 24E, thalamus shown in FIG. 24F, hypothalamus shown in FIG. 24G, cerebellum shown in FIG. 24H, midbrain shown in FIG. 24I, pons shown in FIG. 24J, medulla oblongata shown in FIG. 24K, spinal cord shown in FIG. 24L, and white matter shown in FIG. 24M.
[0068] FIG. 25 is a set of graphs depicting the unique single cell RNA expression profiles of two genes, ALKBH1 and SLIRP, that flank the bidirectional promoter SLIRP described in Example 2. The tissue expression data is plotted as a function of nTPM (y-axis).
[0069] FIG. 26 is a graph depicting the unique blood cell RNA expression profiles of two genes, ALKBH1 and SLIRP, that flank the bidirectional promoter SLIRP described in Example 2. The tissue expression data is plotted as a function of nTPM (y-axis).
DETAILED DESCRIPTION
[0070] Various features and aspects of the invention are discussed in more detail below.
[0071] In particular, the disclosure provides nucleic acids, expression constructs, and vectors including a compact bidirectional promoter and a gene, wherein the compact bidirectional promoter is small enough to allow for the inclusion of a heterologous coding sequence in a vector, such as an AAV vector, having a size limit that makes expression of genes difficult using conventional promoters. The disclosure herein also provides methods of identifying and using the same. Unless otherwise defined herein, scientific and technical terms used in this application shall have the meanings that are commonly understood by those of ordinary skill in the art.
[0072] Generally, nomenclature used in connection with, and techniques of, pharmacology, cell and tissue culture, molecular biology, cell and cancer biology, neurobiology, neurochemistry, virology, immunology, microbiology, genetics, and protein and nucleic acid chemistry, described herein, are those well-known and commonly used in the art. In case of conflict, the present specification, including definitions, will control.
[0073] The practice of the present disclosure will employ, unless otherwise indicated, conventional techniques of molecular biology (including recombinant techniques), microbiology, cell biology, biochemistry and immunology, which are within the skill of the art. Such techniques are explained fully in the literature, such as, Molecular Cloning: A Laboratory Manual, second edition (Sambrook et al., 1989) Cold Spring Harbor Press; Oligonucleotide Synthesis (M.J. Gait, ed., 1984); Methods in Molecular Biology, Humana Press; Cell Biology: A Laboratory Notebook (J.E. Cellis, ed., 1998) Academic Press; Animal Cell Culture (R.I. Freshney, ed., 1987); Introduction to Cell and Tissue Culture (J.P. Mather and P.E. Roberts, 1998) Plenum Press; Cell and Tissue Culture: Laboratory Procedures (A. Doyle, J.B. Griffiths, and D.G. Newell, eds., 1993-1998) J. Wiley and Sons; Methods in Enzymology (Academic Press, Inc.); Gene Transfer Vectors for Mammalian Cells (J.M. Miller and M.P. Calos, eds., 1987); Current Protocols in Molecular Biology (F.M. Ausubel et al., eds., 1987); PCR: The Polymerase Chain Reaction (Mullis et al., eds., 1994); Sambrook and Russell, Molecular Cloning: A Laboratory Manual, 3rd ed., Cold Spring Harbor Laboratory Press, Cold Spring Harbor, NY (2001); Ausubel et al., Current Protocols in Molecular Biology, John Wiley & Sons, NY (2002); Harlow and Lane, Using Antibodies: A Laboratory Manual, Cold Spring Harbor Laboratory Press, Cold Spring Harbor, NY (1998); Coligan et al., Short Protocols in Protein Science, John Wiley & Sons, NY (2003); Short Protocols in Molecular Biology (Wiley and Sons, 1999).
[0074] Enzymatic reactions and purification techniques are performed according to manufacturer’s specifications, as commonly accomplished in the art or as described herein. The nomenclatures used in connection with, and the laboratory procedures and techniques of, analytical chemistry, biochemistry, immunology, molecular biology, synthetic organic chemistry, and medicinal and pharmaceutical chemistry described herein are those well-known and commonly used in the art. Standard techniques are used for chemical syntheses, and chemical analyses.
[0075] Throughout this specification and embodiments, the word “comprise,” or variations such as “comprises” or “comprising,” will be understood to imply the inclusion of a stated integer or group of integers but not the exclusion of any other integer or group of integers.
[0076] It is understood that wherever embodiments are described herein with the language “comprising,” otherwise analogous embodiments described in terms of “consisting of” and/or “consisting essentially of” are also provided.
[0077] The term “including” is used to mean “including but not limited to.” “Including” and “including but not limited to” are used interchangeably.
[0078] Any example(s) following the term “e.g.” or “for example” is not meant to be exhaustive or limiting.
[0079] Unless otherwise required by context, singular terms shall include pluralities and plural terms shall include the singular.
[0080] Notwithstanding that the numerical ranges and parameters setting forth the broad scope of the disclosure are approximations, the numerical values set forth in the specific examples are reported as precisely as possible. Any numerical value, however, inherently contains certain errors necessarily resulting from the standard deviation found in their respective testing measurements. Moreover, all ranges disclosed herein are to be understood to encompass any and all subranges subsumed therein. For example, a stated range of “1 to 10” should be considered to include any and all subranges between (and inclusive of) the minimum value of 1 and the maximum value of 10; that is, all subranges beginning with a minimum value of 1 or more, e.g., 1 to 6.1, and ending with a maximum value of 10 or less, e.g., 5.5 to 10.
[0081] Where aspects or embodiments of the disclosure are described in terms of a Markush group or other grouping of alternatives, the present disclosure encompasses not only the entire group listed as a whole, but also each member of the group individually and all possible subgroups of the main group, as well as the main group absent one or more of the group members. The present disclosure also envisages the explicit exclusion of one or more of any of the group members in an embodiment of the disclosure.
[0082] Exemplary methods and materials are described herein, although methods and materials similar or equivalent to those described herein can also be used in the practice or testing of the present disclosure. The materials, methods, and examples are illustrative only and not intended to be limiting.
I. Definitions
[0083] The following terms, unless otherwise indicated, shall be understood to have the following meanings:
[0084] The articles “a” and “an” are used herein to refer to one or to more than one (i.e., to at least one) of the grammatical object of the article. By way of example, “an element” means one element or more than one element. Reference to “about” a value or parameter herein includes (and describes) embodiments that are directed to that value or parameter per se. For example, description referring to “about X” includes description of “X.” Numeric ranges are inclusive of the numbers defining the range. Where the use of the term “about” is before a quantitative value, the present invention also includes the specific quantitative value itself, unless specifically stated otherwise. As used herein, the term “about” refers to a ± 10% variation from the nominal value unless otherwise indicated or inferred.
[0085] As used herein, the term “adeno-associated virus” (AAV) refers to a vector derived from an adeno-associated virus serotype, including without limitation, AAV1, AAV2, AAV3, AAV4, AAV5, AAV6, AAV7, AAV8, AAV9, AAV10, AAV11, AAV12, AAV13, AAV14, AAV15, AAV16, AAV.rh8, AAV.rh10, AAV.rh20, AAV.rh39, AAV.Rh74, AAV.RHM4-1, AAV.hu37, AAV.Anc80, AAV.Anc80L65, AAV.7m8, AAV.PHP.B, AAV.PHP.EB, AAV2.5, AAV2tYF, AAV3B, AAV.LK03, AAV.HSC1, AAV.HSC2, AAV.HSC3, AAV.HSC4, AAV.HSC5, AAV.HSC6, AAV.HSC7, AAV.HSC8, AAV.HSC9, AAV.HSC10, AAV.HSC11, AAV.HSC12, AAV.HSC13, AAV.HSC14, AAV.HSC15, AAV-TT, AAV-DJ8, and AAV.HSC16. AAV vectors can have one or more of the AAV wild-type genes deleted in whole or part, e.g., the Rep and/or Cap genes, but retain functional flanking inverted terminal repeat (ITR) sequences. Functional ITR sequences promote the rescue, replication, and packaging of the AAV virion. Thus, an AAV vector is defined herein to include at least those sequences required in cis for replication and packaging (e.g., functional ITRs) of the virus. ITRs do not need to be the wild-type polynucleotide sequences and may be altered, e.g., by the insertion, deletion, or substitution of nucleotides, so long as the sequences provide for functional rescue, replication, and packaging. AAV expression vectors are constructed using known techniques to at least provide as operatively linked components in the direction of transcription, control elements including a transcriptional initiation region, the DNA of interest (e.g., a polynucleotide encoding a nucleic acid molecule of the disclosure) and a transcriptional termination region. The terms “adeno-associated virus inverted terminal repeats” and “AAV ITRs” refer to art-recognized regions flanking each end of the AAV genome which function together in cis as origins of DNA replication and as packaging signals for the virus. AAV ITRs, together with the AAV Rep coding region, provide for the efficient excision and integration of a polynucleotide sequence interposed between two flanking ITRs into a mammalian genome. The polynucleotide sequences of AAV ITR regions are known. As used herein, an “AAV ITR” does not necessarily include the wild-type polynucleotide sequence, which may be altered, e.g., by the insertion, deletion, or substitution of nucleotides. Additionally, the AAV ITR may be derived from any of several AAV serotypes, including without limitation AAV1, AAV2, AAV3, AAV4, AAV5, AAV6, AAV7, AAV8, AAV9, AAV10, AAV11, AAV12, AAV13, AAV14, AAV15, AAV16, AAV.rh8, AAV.rh10, AAV.rh20, AAV.rh39, AAV.Rh74, AAV.RHM4-1, AAV.hu37, AAV.Anc80, AAV.Anc80L65, AAV.7m8, AAV.PHP.B, AAV.PHP.EB, AAV2.5, AAV2tYF, AAV3B, AAV.LK03, AAV.HSC1, AAV.HSC2, AAV.HSC3, AAV.HSC4, AAV.HSC5, AAV.HSC6, AAV.HSC7, AAV.HSC8, AAV.HSC9, AAV.HSC10, AAV.HSC11, AAV.HSC12, AAV.HSC13, AAV.HSC14, AAV.HSC15, AAV-TT, AAV-DJ8, and AAV.HSC16, among others.
Furthermore, 5' and 3' ITRs which flank a selected polynucleotide sequence in an AAV vector need not be identical or derived from the same AAV serotype or isolate, so long as they function as intended, e.g., to allow for excision and rescue of the sequence of interest from a host cell genome or vector, and to allow integration of the heterologous sequence into the recipient cell genome when AAV Rep gene products are present in the cell. Additionally, AAV ITRs may be derived from any of several AAV serotypes, including without limitation, AAV1, AAV2, AAV3, AAV4, AAV5, AAV6, AAV7, AAV8, AAV9, AAV10, AAV11, AAV12, AAV13, AAV14, AAV15, AAV16, AAV.rh8, AAV.rh10, AAV.rh20, AAV.rh39, AAV.Rh74, AAV.RHM4-1, AAV.hu37, AAV.Anc80, AAV.Anc80L65, AAV.7m8, AAV.PHP.B, AAV.PHP.EB, AAV2.5, AAV2tYF, AAV3B, AAV.LK03, AAV.HSC1, AAV.HSC2, AAV.HSC3, AAV.HSC4, AAV.HSC5, AAV.HSC6, AAV.HSC7, AAV.HSC8, AAV.HSC9, AAV.HSC10, AAV.HSC11, and AAV.HSC12.
[0086] An “AAV inverted terminal repeat (ITR)” sequence, a term well-understood in the art, is an approximately 145-nucleotide sequence that is present at both termini of the native single-stranded AAV genome. The outermost 125 nucleotides of the ITR can be present in either of two alternative orientations, leading to heterogeneity between different AAV genomes and between the two ends of a single AAV genome. The outermost 125 nucleotides also contain several shorter regions of self-complementarity (designated A, A', B, B', C, C' and D regions), allowing intrastrand base-pairing to occur within this portion of the ITR.
[0087] “Administering” or “administration” of a substance, a compound, or an agent to a subject can be carried out using one of a variety of methods known to those skilled in the art. In some embodiments, administration may be local. In other embodiments, administration may be systemic. Administering can also be performed, for example, once, a plurality of times, and/or over one or more extended periods. In some aspects, the administration includes both direct administration, including self-administration, and indirect administration, including the act of prescribing a drug. For example, as used herein, a physician who instructs a subject to self-administer a drug, or to have the drug administered by another and/or who provides a subject with a prescription for a drug is administering the drug to the subject.
[0088] It should be understood that the expression of “at least one of” includes individually each of the recited objects after the expression and the various combinations of two or more of the recited objects unless otherwise understood from the context and use. The expression “and/or” in connection with three or more recited objects should be understood to have the same meaning unless otherwise understood from the context.
[0089] As used herein, a “coding sequence” is a portion of a nucleic acid that contains codons that can be translated into amino acids. Although a “stop codon” (TAG, TGA, and TAA) is not translated into an amino acid, it may be considered to be part of a coding region, if present, but any flanking sequences, for example, promoters, ribosome binding sites, transcriptional terminators, introns, 5' and 3' untranslated regions, and the like, are not part of the coding region.
[0090] As used herein, “codon optimization” refers to the process of modifying a nucleic acid sequence in accordance with the principle that the frequency of occurrence of synonymous codons (e.g., codons that code for the same amino acid) in coding DNA is biased in different species. Such codon degeneracy allows an identical polypeptide to be encoded by a variety of nucleotide sequences. Sequences modified in this way are referred to herein as “codon optimized.” This process may be performed on any of the sequences described in this specification to enhance expression or stability. Codon optimization may be performed in a manner, such as that described in, e.g., U.S. Patent Nos. 7,561,972; 7,561,973; and 7,888,112, the entire contents of each of which is incorporated herein by reference. The sequence surrounding the translational start site can be converted to a consensus Kozak sequence according to known methods. See, e.g., Kozak et al. (Nucleic Acids Res 15(20): 8125-8148, 1987), the entire contents of which is hereby incorporated by reference. In some embodiments, codon optimization includes the incorporation of multiple stop codons.
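The statement that codon degeneracy allows an identical polypeptide to be encoded by different nucleotide sequences can be illustrated with a short Python sketch. It uses the standard genetic code; the `preferred` codon choices and the helper names `translate` and `recode` are hypothetical and are not species codon-usage data or the methods of the patents cited above.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order (third base fastest).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

def translate(cds):
    """Translate a coding sequence, stopping at the first stop codon."""
    protein = []
    for i in range(0, len(cds) - 2, 3):
        aa = CODON_TABLE[cds[i:i + 3]]
        if aa == "*":
            break
        protein.append(aa)
    return "".join(protein)

def recode(cds, preferred):
    """Swap each codon for a preferred synonymous codon when one is listed."""
    codons = [cds[i:i + 3] for i in range(0, len(cds) - 2, 3)]
    return "".join(preferred.get(CODON_TABLE[c], c) for c in codons)

# Hypothetical preference table: amino acid (or '*' for stop) -> chosen codon.
preferred = {"L": "CTG", "S": "AGC", "*": "TGA"}

original = "ATGTTATCATAA"            # Met-Leu-Ser-stop
optimized = recode(original, preferred)
print(optimized, translate(original), translate(optimized))
```

The two nucleotide sequences differ at every codon except the start codon, yet both translate to the same polypeptide.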
[0091] Throughout this specification and embodiments, the word “include,” or variations such as “includes” or “including,” will be understood to imply the inclusion of a stated integer or group of integers but not the exclusion of any other integer or group of integers.
It is understood that wherever embodiments are described herein with the language “including,” otherwise analogous embodiments described in terms of “consisting of” and/or “consisting essentially of” are also provided.
[0092] The term “consensus sequence,” as used herein in the context of nucleic acid sequences, refers to a calculated sequence representing the most frequent nucleotide residues found at each position in a plurality of similar sequences. Typically, a consensus sequence is determined by sequence alignment in which similar sequences are compared to each other and similar sequence motifs are calculated.
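As an illustration of this definition, the following Python sketch derives a consensus by taking the most frequent residue at each column of sequences that are assumed to be already aligned and of equal length. The function name `consensus` and the input sequences are hypothetical; ties are resolved in favor of the residue seen first, which is an arbitrary choice made for this sketch.

```python
from collections import Counter

def consensus(aligned):
    """Return the most frequent residue at each position of aligned sequences."""
    if len({len(s) for s in aligned}) != 1:
        raise ValueError("sequences must be non-empty and of equal aligned length")
    # zip(*aligned) iterates over alignment columns.
    return "".join(Counter(col).most_common(1)[0][0] for col in zip(*aligned))

# Hypothetical aligned sequences surrounding a start codon.
seqs = ["GCCACCATGG", "GCCGCCATGG", "ACCACCATGG"]
print(consensus(seqs))
```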
[0093] A “deletion” may include the deletion of subject amino acids, deletion of small groups of amino acids such as 2, 3, 4, or 5 amino acids, or deletion of larger amino acid regions, such as the deletion of specific amino acid domains or other features.
[0094] Any example(s) following the terms “e.g.” or “for example” are not meant to be exhaustive or limiting.
[0095] As used herein, the term “functional fragment” refers to a fragment of (a) a promoter or (b) a gene or coding sequence (e.g., an mRNA) that encodes a protein that retains, for example, at least 10%, at least 20%, at least 30%, at least 40%, at least 50%, at least 60%, at least 70%, at least 80%, at least 90%, at least 95%, or 100% of at least one activity of the corresponding full-length, naturally occurring promoter or protein. The term “fragment of,” or “fragment thereof,” as used herein, refers to a segment (e.g., a segment of at least about 10%, at least about 15%, at least about 20%, at least about 25%, at least about 30%, at least about 35%, at least about 40%, at least about 45%, at least about 50%, at least about 55%, at least about 60%, at least about 65%, at least about 70%, at least about 75%, at least about 80%, at least about 85%, at least about 90%, at least about 95%, at least about 97%, at least about 98%, at least about 99%, at least about 99.5%, or at least about 99.9%) of the full length gene(s) or nucleic acid molecule(s) of interest.
[0096] A “helper virus” for AAV refers to a virus that allows AAV (which is a defective parvovirus) to be replicated and packaged by a host cell. A number of such helper viruses are known in the art.
[0097] As used herein, the term “heterologous” refers to regions that are not normally associated with a particular nucleic acid in nature. For example, a “coding region heterologous to a promoter” is a coding region that is not normally associated with the promoter in nature.
[0098] As used herein, a “host cell” includes an individual cell or cell culture that can be or has been a recipient for vector(s) for incorporation of polynucleotide inserts. The term host cell may refer to the packaging cell line in which a recombinant AAV (rAAV) is produced from a plasmid. In the alternative, the term “host cell” may refer to a target cell in which expression of a transgene is desired.
[0099] The use of the terms “include,” “includes,” “including,” “have,” “has,” “having,” “contain,” “contains,” or “containing,” including grammatical equivalents thereof, should be understood generally as open-ended and non-limiting, for example, not excluding additional unrecited elements or steps, unless otherwise specifically stated or understood from the context.
[0100] An “insertion” may include the insertion of subject amino acids, insertion of small groups of amino acids such as 2, 3, 4, or 5 amino acids, or insertion of larger amino acid regions, such as the insertion of specific amino acid domains or other features.
[0101] An “inverted terminal repeat” or “ITR” sequence is a term well understood in the art and refers to relatively short sequences found at the termini of viral genomes which are in opposite orientation.
[0102] As used herein, “isolated molecule” (where the molecule is, for example, a polypeptide, a polynucleotide, or fragment thereof) is a molecule that by virtue of its origin or source of derivation (1) is not associated with one or more naturally-associated components that accompany it in its native state, (2) is substantially free of one or more other molecules from the same species, (3) is expressed by a cell from a different species, or (4) does not occur in nature.
[0103] “Operably linked” refers to a juxtaposition wherein the components so described are in a relationship permitting them to function in their intended manner. For instance, a promoter is operably linked to a coding sequence if the promoter affects its transcription or expression.
[0104] The terms “patient,” “subject,” and “individual” are used interchangeably herein and refer to either a human or a non-human animal. These terms include mammals, such as humans, non-human primates, laboratory animals, livestock animals (including bovines, porcines, camels, etc.), companion animals (e.g., canines, felines, other domesticated animals, etc.) and rodents (e.g., mice and rats). In some embodiments, the subject is a human that is at least 40, 45, 50, 55, 60, 65, 70, 75, 80, 85, 90, or 95 years of age.
[0105] “Percent (%) sequence identity” or “percent (%) identical to” with respect to a reference polypeptide (or nucleotide) sequence is defined as the percentage of amino acid residues (or nucleic acids) in a candidate sequence that are identical with the amino acid residues (or nucleic acids) in the reference polypeptide (nucleotide) sequence, after aligning the sequences and introducing gaps, if necessary, to achieve the maximum percent sequence identity, and not considering any conservative substitutions as part of the sequence identity. Alignment for purposes of determining percent amino acid sequence identity can be achieved in various ways that are within the skill in the art, for instance, using publicly available computer software such as BLAST, BLAST-2, ALIGN, or Megalign (DNASTAR) software. Those skilled in the art can determine appropriate parameters for aligning sequences, including any algorithms needed to achieve maximal alignment over the full length of the sequences being compared.
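The identity calculation described above can be sketched in a few lines. This is a minimal illustration, not the method used by BLAST, ALIGN, or Megalign: it assumes the two sequences have already been aligned by such a tool (gaps written as "-"), and it assumes the denominator is the number of residues in the candidate sequence, which is one reading of the definition. The function name `percent_identity` is hypothetical.

```python
def percent_identity(aligned_ref: str, aligned_cand: str) -> float:
    """Percent of candidate residues identical to the aligned reference residue.

    Both inputs must come from the same pairwise alignment, so they have
    equal length and use '-' for gaps. Conservative substitutions are not
    counted as identities. The choice of denominator (ungapped candidate
    length) is an assumption made for this sketch.
    """
    if len(aligned_ref) != len(aligned_cand):
        raise ValueError("aligned sequences must have equal length")
    ref = aligned_ref.upper()
    cand = aligned_cand.upper()
    # Count columns where both sequences carry the same residue (not a gap).
    matches = sum(1 for r, c in zip(ref, cand) if r == c and r != "-")
    # Count candidate residues, ignoring gap characters.
    cand_len = sum(1 for c in cand if c != "-")
    if cand_len == 0:
        raise ValueError("candidate sequence contains no residues")
    return 100.0 * matches / cand_len


# Eight aligned positions with one mismatch: 7 of 8 residues are identical.
print(percent_identity("ACGTACGT", "ACGTTCGT"))
```

Other conventions divide by the alignment length or by the reference length, so thresholds such as "at least 85%" should be evaluated with the same convention throughout.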
[0106] As known in the art, “polynucleotide,” or “nucleic acid,” are used interchangeably herein and refer to chains of nucleotides of any length, and include DNA and RNA. The nucleotides can be deoxyribonucleotides, ribonucleotides, modified nucleotides or bases, and/or their analogs, or any substrate that can be incorporated into a chain by DNA or RNA polymerase. A polynucleotide may include modified nucleotides, such as methylated nucleotides and their analogs. If present, modification to the nucleotide structure may be imparted before or after assembly of the chain. The sequence of nucleotides may be interrupted by non-nucleotide components. A polynucleotide may be further modified after polymerization, such as by conjugation with a labeling component. Other types of modifications include, for example, “caps,” substitution of one or more of the naturally occurring nucleotides with an analog, internucleotide modifications such as, for example, those with uncharged linkages (e.g., methyl phosphonates, phosphotriesters, phosphoamidates, carbamates) and with charged linkages (e.g., phosphorothioates, phosphorodithioates), those containing pendant moieties, such as, for example, proteins (e.g., nucleases, toxins, antibodies, signal peptides, poly-L-lysine), those with intercalators (e.g., acridine, psoralen), those containing chelators (e.g., metals, radioactive metals, boron, oxidative metals), those containing alkylators, those with modified linkages (e.g., alpha anomeric nucleic acids), as well as unmodified forms of the polynucleotide(s). Further, any of the hydroxyl groups ordinarily present in the sugars may be replaced, for example, by phosphonate groups, phosphate groups, protected by standard protecting groups, or activated to prepare additional linkages to additional nucleotides, or may be conjugated to solid supports. The 5' and 3' terminal OH can be phosphorylated or substituted with amines or organic capping group moieties of from 1 to 20 carbon atoms. Other hydroxyls may also be derivatized to standard protecting groups. Polynucleotides can also contain analogous forms of ribose or deoxyribose sugars that are generally known in the art, including, for example, 2’-O-methyl-, 2’-O-allyl, 2’-fluoro- or 2’-azido-ribose, carbocyclic sugar analogs, alpha- or beta-anomeric sugars, epimeric sugars such as arabinose, xyloses or lyxoses, pyranose sugars, furanose sugars, sedoheptuloses, acyclic analogs, and abasic nucleoside analogs, such as methyl riboside. One or more phosphodiester linkages may be replaced by alternative linking groups. These alternative linking groups include, but are not limited to, embodiments wherein phosphate is replaced by P(O)S (“thioate”), P(S)S (“dithioate”), (O)NR2 (“amidate”), P(O)R, P(O)OR’, CO or CH2 (“formacetal”), in which each R or R' is independently H or substituted or unsubstituted alkyl (1-20 C) optionally containing an ether (-O-) linkage, aryl, alkenyl, cycloalkyl, cycloalkenyl or araldyl. Not all linkages in a polynucleotide need be identical. The preceding description applies to all polynucleotides referred to herein, including RNA and DNA.
[0107] IUPAC nucleotide code is used throughout. IUPAC nucleotide code is provided in Table 1, below.
Table 1. IUPAC nucleotide code
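For reference, the standard IUPAC nucleotide codes can be written as a lookup table. The sketch below is illustrative only and is not a reproduction of Table 1 as filed; the names `IUPAC_CODES` and `matches_iupac` are hypothetical, and mapping U to T is a simplifying assumption for comparing DNA and RNA sequences.

```python
# Standard IUPAC nucleotide codes, each mapped to the bases it can represent.
IUPAC_CODES = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "U": "T",      # uracil, treated as T here (assumption for this sketch)
    "R": "AG",     # purine
    "Y": "CT",     # pyrimidine
    "S": "CG",     # strong (three hydrogen bonds)
    "W": "AT",     # weak (two hydrogen bonds)
    "K": "GT",     # keto
    "M": "AC",     # amino
    "B": "CGT",    # not A
    "D": "AGT",    # not C
    "H": "ACT",    # not G
    "V": "ACG",    # not T
    "N": "ACGT",   # any base
}


def matches_iupac(pattern: str, sequence: str) -> bool:
    """Return True if a concrete sequence is consistent with an IUPAC pattern.

    The sequence is expected to contain only A, C, G, T (or U); the pattern
    may contain any IUPAC code. Lengths must match position for position.
    """
    if len(pattern) != len(sequence):
        return False
    concrete = sequence.upper().replace("U", "T")
    return all(
        base in IUPAC_CODES[code]
        for code, base in zip(pattern.upper(), concrete)
    )


print(matches_iupac("TATAWAWR", "TATAAAAG"))
print(matches_iupac("RY", "CT"))
```

The first call checks a degenerate motif against one concrete sequence that satisfies it; the second fails because C is not a purine.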
[0108] The terms “polypeptide,” “oligopeptide,” “peptide,” and “protein” are used interchangeably herein to refer to chains of amino acids of any length. The chain may be linear or branched, it may include modified amino acids and/or may be interrupted by non-amino acids. The terms also encompass an amino acid chain that has been modified naturally or by intervention; for example, disulfide bond formation, glycosylation, lipidation, acetylation, phosphorylation, or any other manipulation or modification, such as conjugation with a labeling component. Also included within the definition are, for example, polypeptides containing one or more analogs of an amino acid (e.g., unnatural amino acids), as well as other modifications known in the art. It is understood that the polypeptides can occur as single chains or associated chains.
[0109] As used herein, the term “promoter” refers to a recognition site on DNA that is bound by an RNA polymerase. The polymerase drives transcription of a transgene. Exemplary promoters suitable for use with the compositions and methods described herein are described herein. Additionally, the term “promoter” may refer to a synthetic promoter, such as a regulatory DNA sequence that does not occur naturally in a biological system. Synthetic promoters contain parts of naturally occurring promoters combined with polynucleotide sequences that do not occur in nature and can be optimized to express recombinant DNA.
[0110] A “recombinant adeno-associated virus (rAAV virus)” or “rAAV viral particle” refers to a viral particle composed of at least one AAV capsid protein and an encapsidated rAAV vector genome.
[0111] A “recombinant AAV vector (rAAV vector)” refers to a polynucleotide vector based on an AAV including one or more heterologous sequences (i.e., nucleic acid sequence not of AAV origin) that are flanked by at least one AAV ITR. Such rAAV vectors can be replicated and packaged into infectious viral particles when present in a host cell that has been infected with a suitable helper virus (or that is expressing suitable helper functions) and that is expressing AAV Rep and Cap gene products (i.e., AAV Rep and Cap proteins). When a rAAV vector is incorporated into a larger polynucleotide (e.g., in a chromosome or in another vector such as a plasmid used for cloning or transfection), then the rAAV vector may be referred to as a “provector” which can be “rescued” by replication and encapsidation in the presence of AAV packaging functions and suitable helper functions. An rAAV vector can be in any of a number of forms, including, but not limited to, plasmids, linear artificial chromosomes, complexed with lipids, encapsulated within liposomes, and encapsidated in a viral particle, e.g., an AAV particle.
An rAAV vector can be packaged into an AAV virus capsid to generate a “recombinant adeno- associated viral particle (rAAV particle)”.
[0112] The term “regulatory element” or “regulatory sequence” is intended to include promoters, enhancers, internal ribosomal entry sites (IRES), and other expression control elements (e.g., transcription termination signals, such as polyadenylation signals and poly-U sequences). Such regulatory sequences are described, for example, in Goeddel (1990) Gene Expression Technology: Methods in Enzymology 185, Academic Press, San Diego, Calif. Regulatory sequences include those that direct constitutive expression of a nucleotide sequence in many types of host cell and those that direct expression of the nucleotide sequence only in certain host cells (e.g., tissue-specific regulatory sequences). A tissue-specific promoter may direct expression primarily in a desired tissue of interest, such as muscle, neuron, bone, skin, blood, specific organs (e.g., liver or pancreas), or particular cell types (e.g., lymphocytes). Regulatory sequences may also direct expression in a temporal-dependent manner, such as in a cell cycle-dependent or developmental stage-dependent manner, which may or may not also be tissue- or cell type-specific. Also encompassed by the term “regulatory element” are enhancer elements, such as WPRE; CMV enhancers; the R-U5' segment in LTR of HTLV-I (Takebe et al. (1988) MOL. CELL. BIOL. 8:466-472); SV40 enhancer; and the intron sequence between exons 2 and 3 of rabbit β-globin (O'Hare et al. (1981) PROC. NATL. ACAD. SCI. USA. 78(3):1527-31). It will be appreciated by those skilled in the art that the design of the expression vector can depend on such factors as the choice of the host cell to be transformed, the level of expression desired, etc.
[0113] As used herein, “residue” refers to a position in a protein and its associated amino acid identity.
[0114] A “substitution” includes replacing a wild-type amino acid with another (e.g., a non-wild-type amino acid). In some embodiments, the another (e.g., non-wild-type) or inserted amino acid is Ala (A), His (H), Lys (K), Phe (F), Met (M), Thr (T), Gln (Q), Asp (D), or Glu (E). In some embodiments, the another (e.g., non-wild-type) or inserted amino acid is A. In some embodiments, the another (e.g., non-wild-type) amino acid is Arg (R), Asn (N), Cys (C), Gly (G), Ile (I), Leu (L), Pro (P), Ser (S), Trp (W), Tyr (Y), or Val (V). Conventional or naturally occurring amino acids are divided into the following basic groups based on common side-chain properties: (1) non-polar: Norleucine, Met, Ala, Val, Leu, and Ile; (2) polar without charge: Cys, Ser, Thr, Asn, and Gln; (3) acidic (negatively charged): Asp and Glu; (4) basic (positively charged): Lys and Arg; (5) residues that influence chain orientation: Gly and Pro; and (6) aromatic: Trp, Tyr, Phe, and His. Conventional amino acids include L or D stereochemistry. In some embodiments, the another (e.g., non-wild-type) amino acid is a member of a different group (e.g., an aromatic amino acid is substituted for a non-polar amino acid). Substantial modifications in the biological properties of the polypeptide are accomplished by selecting substitutions that differ significantly in their effect on maintaining (a) the structure of the polypeptide backbone in the area of the substitution, for example, as a β-sheet or helical conformation, (b) the charge or hydrophobicity of the molecule at the target site, or (c) the bulk of the side chain. Naturally occurring residues are divided into groups based on common side-chain properties: (1) non-polar: Norleucine, Met, Ala, Val, Leu, and Ile; (2) polar without charge: Cys, Ser, Thr, Asn, and Gln; (3) acidic (negatively charged): Asp and Glu; (4) basic (positively charged): Lys and Arg; (5) residues that influence chain orientation: Gly and Pro; and (6) aromatic: Trp, Tyr, Phe, and His. In some embodiments, the another (e.g., non-wild-type) amino acid is a member of a different group (e.g., a hydrophobic amino acid for a hydrophilic amino acid, a charged amino acid for a neutral amino acid, or an acidic amino acid for a basic amino acid). In some embodiments, the another (e.g., non-wild-type) amino acid is a member of the same group (e.g., another basic amino acid, another acidic amino acid, another neutral amino acid, another charged amino acid, another hydrophilic amino acid, another hydrophobic amino acid, another polar amino acid, another aromatic amino acid, or another aliphatic amino acid). In some embodiments, the another (e.g., non-wild-type) amino acid is an unconventional amino acid. Unconventional amino acids are non-naturally occurring amino acids.
Examples of an unconventional amino acid include, but are not limited to, aminoadipic acid, beta-alanine, beta-aminopropionic acid, aminobutyric acid, piperidinic acid, aminocaproic acid, aminoheptanoic acid, aminoisobutyric acid, aminopimelic acid, citrulline, diaminobutyric acid, desmosine, diaminopimelic acid, diaminopropionic acid, N-ethylglycine, N-ethylasparagine, hydroxylysine, allo-hydroxylysine, hydroxyproline, isodesmosine, allo-isoleucine, N-methylglycine, sarcosine, N-methylisoleucine, N-methylvaline, norvaline, norleucine, ornithine, 4-hydroxyproline, γ-carboxyglutamate, ε-N,N,N-trimethyllysine, ε-N-acetyllysine, O-phosphoserine, N-acetylserine, N-formylmethionine, 3-methylhistidine, 5-hydroxylysine, σ-N-methylarginine, and other similar amino acids and imino acids (e.g., 4-hydroxyproline).
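The six side-chain groups listed in paragraph [0114] lend themselves to a simple lookup for deciding whether a substitution stays within a group or crosses groups. The following is a minimal sketch using the standard three-letter abbreviations; the names `SIDE_CHAIN_GROUPS`, `group_of`, and `is_same_group` are hypothetical, and "Nle" is used here as shorthand for norleucine.

```python
# Side-chain groups as enumerated in paragraph [0114].
SIDE_CHAIN_GROUPS = {
    "non-polar": {"Nle", "Met", "Ala", "Val", "Leu", "Ile"},
    "polar without charge": {"Cys", "Ser", "Thr", "Asn", "Gln"},
    "acidic": {"Asp", "Glu"},
    "basic": {"Lys", "Arg"},
    "chain orientation": {"Gly", "Pro"},
    "aromatic": {"Trp", "Tyr", "Phe", "His"},
}


def group_of(residue: str) -> str:
    """Return the name of the side-chain group containing the residue."""
    for name, members in SIDE_CHAIN_GROUPS.items():
        if residue in members:
            return name
    raise KeyError(f"{residue!r} is not in any listed group")


def is_same_group(wild_type: str, replacement: str) -> bool:
    """True if a substitution keeps the residue within its side-chain group."""
    return group_of(wild_type) == group_of(replacement)


print(is_same_group("Lys", "Arg"))   # both basic
print(is_same_group("Phe", "Ala"))   # aromatic replaced by non-polar
```

Unconventional amino acids are not covered by this table, so `group_of` raises `KeyError` for them rather than guessing a group.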
[0115] The term “transgene” refers to a polynucleotide that is introduced into a cell and is capable of being transcribed into RNA and optionally, translated and/or expressed under appropriate conditions. In aspects, it confers a desired property to a cell into which it was introduced, or otherwise leads to a desired therapeutic or diagnostic outcome.
[0116] “Treating” a condition or subject refers to taking steps to obtain beneficial or desired results, including clinical results. With respect to a disease or condition, treatment refers to the reduction or amelioration of the progression, severity, and/or duration of one or more symptoms of the disease, or the amelioration of one or more symptoms resulting from the administration of one or more therapies (including, but not limited to, the administration of one or more prophylactic or therapeutic agents).
[0117] As used herein, the term “variant” refers to a variant of (a) a promoter or (b) a gene or coding sequence (e.g., an mRNA) that encodes a protein that retains, for example, at least 10%, at least 20%, at least 30%, at least 40%, at least 50%, at least 60%, at least 70%, at least 80%, at least 90%, at least 95%, or 100% of at least one activity of the corresponding full-length, naturally occurring promoter or protein. For example, a variant can include a splice variant or a gene including a mutation such as an insertion, deletion, or substitution.
[0118] As used herein, the term “vector” includes a nucleic acid vector, e.g., a DNA vector, such as a plasmid, an RNA vector, or another suitable replicon (e.g., viral vector). A variety of vectors have been developed for the delivery of polynucleotides encoding exogenous polynucleotides or proteins into a prokaryotic or eukaryotic cell. Examples of such expression vectors are disclosed in, e.g., WO 1994/011026; incorporated herein by reference as it pertains to vectors suitable for the expression of a nucleic acid molecule of interest. Expression vectors suitable for use with the compositions and methods described herein contain a polynucleotide sequence as well as, e.g., additional sequence elements used for the expression of heterologous nucleic acid materials (e.g., a nucleic acid molecule) in a mammalian cell. Certain vectors that can be used for the expression of the nucleic acid molecules described herein include plasmids that contain regulatory sequences, such as promoter and enhancer regions, which direct gene transcription. In some embodiments, the compact bidirectional promoters do not contain an enhancer. Other useful vectors for expression of nucleic acid molecule agents disclosed herein contain polynucleotide sequences that enhance the rate of translation of these polynucleotides or improve the stability or nuclear export of the RNA that results from gene transcription. These sequence elements include, e.g., 5' and 3' untranslated regions, an internal ribosomal entry site (IRES), and polyadenylation signal (poly A) in order to direct efficient transcription of the gene carried on the expression vector. The expression vectors suitable for use with the compositions and methods described herein may also contain a polynucleotide encoding a marker for selection of cells that contain such a vector. Examples of a suitable marker are genes that encode resistance to antibiotics, such as ampicillin, chloramphenicol, kanamycin, nourseothricin, or zeocin.
[0119] In some embodiments, a vector comprises one or more pol II promoters. Examples of pol II promoters include, but are not limited to, the retroviral Rous sarcoma virus (RSV) LTR promoter (optionally with the RSV enhancer), the cytomegalovirus (CMV) promoter (optionally with the CMV enhancer; see, e.g., Boshart et al. (1985) CELL 41:521-530), the SV40 promoter, the dihydrofolate reductase promoter, the β-actin promoter, the phosphoglycerol kinase (PGK) promoter, and the EF1α promoter.
[0120] A vector can be introduced into host cells to thereby produce transcripts, proteins, or peptides, including fusion proteins or peptides, encoded by nucleic acids as described herein (e.g., clustered regularly interspersed short palindromic repeats (CRISPR) transcripts, proteins, enzymes, mutant forms thereof, fusion proteins thereof, etc.). Advantageous vectors include lentiviruses and AAVs, and types of such vectors can also be selected for targeting particular types of cells.
[0121] The term “vector genome (vg)” as used herein may refer to one or more polynucleotides comprising a set of the polynucleotide sequences of a vector, e.g., a viral vector. A vector genome may be encapsidated in a viral particle. Depending on the particular viral vector, a vector genome may comprise single-stranded DNA, double-stranded DNA, single-stranded RNA, or double-stranded RNA. A vector genome may include endogenous sequences associated with a particular viral vector and/or any heterologous sequences inserted into a particular viral vector through recombinant techniques. For example, a recombinant AAV vector genome may include at least one ITR sequence flanking a promoter, a stuffer, a sequence of interest, and a polyadenylation sequence. A complete vector genome may include a complete set of the polynucleotide sequences of a vector. In some embodiments, the nucleic acid titer of a viral vector may be measured in terms of vg/mL. Methods suitable for measuring this titer are known in the art (e.g., quantitative PCR).
[0122] As used herein the term “wild-type” is a term of the art understood by skilled persons and means the typical form of an organism, strain, gene, or characteristic as it occurs in nature as distinguished from mutant or variant forms.
[0123] Each embodiment described herein may be used individually or in combination with any other embodiment described herein.
II. Compact Bidirectional Promoters
[0124] The present disclosure provides, among other things, compact bidirectional promoters that can effectively drive expression of genes useful in, for example, gene therapy applications such as those involving AAV. In some embodiments, the bidirectional promoter is capable of promoting transcription of two coding sequences positioned on opposite sides of the promoter.
[0125] In some embodiments, the compact bidirectional promoter is operably linked to at least one (e.g., two) heterologous coding sequence. For example, in some embodiments, the compact bidirectional promoter is operably linked to two heterologous coding sequences.
[0126] In some embodiments, the compact bidirectional promoter promotes transcription of a heterologous coding sequence by an RNA polymerase II (“pol II”). For example, in some embodiments, the compact bidirectional promoter promotes transcription of a first heterologous coding sequence in one direction (e.g., on one strand of a DNA molecule), and a second heterologous coding sequence in another direction (e.g., on the opposite strand of the DNA molecule), as shown in FIG. 1. In some embodiments, the heterologous promoter does not promote transcription by an RNA polymerase III (“pol III”) (i.e., the promoter is not a pol III promoter).
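The opposite-strand arrangement described above can be illustrated by assembling a toy construct in which one coding sequence is placed upstream of the promoter as its reverse complement. This is a hedged sketch with made-up placeholder sequences, not any of SEQ ID NOs: 1-800; the function names `reverse_complement` and `assemble_bidirectional` are hypothetical.

```python
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return seq.translate(_COMPLEMENT)[::-1]


def assemble_bidirectional(promoter: str, cds_left: str, cds_right: str) -> str:
    """Return the top strand (5'->3') of a bidirectional cassette.

    cds_right reads left-to-right on the top strand, downstream of the
    promoter. cds_left is stored as its reverse complement so that it reads
    5'->3' on the bottom strand, away from the promoter in the other
    direction.
    """
    return reverse_complement(cds_left) + promoter + cds_right


# Placeholder sequences chosen only to keep the example short.
promoter = "GGGCGG"
cds_a = "ATGAAATAA"   # hypothetical first coding sequence
cds_b = "ATGCCCTGA"   # hypothetical second coding sequence

construct = assemble_bidirectional(promoter, cds_a, cds_b)
print(construct)
# Reading the bottom strand 5'->3' recovers the first coding sequence.
print(reverse_complement(construct).endswith(cds_a))
```

Whether a given promoter actually drives transcription in both directions is an experimental question; the sketch only shows the sequence orientation.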
[0127] In some embodiments, the compact bidirectional promoter is less than about 1000 base pairs (bp) (e.g., less than about 800 bp, less than about 600 bp, less than about 400 bp, or less than about 200 bp). For example, in some embodiments, the promoter is less than about 800 bp. In some embodiments, the promoter is less than about 600 bp. In some embodiments, the promoter is less than about 400 bp. In some embodiments, the promoter is less than about 200 bp.
[0128] In some embodiments, the compact bidirectional promoter is between about 30 bp and about 800 bp (e.g., between about 31 bp and about 750 bp, between about 32 bp and about 700 bp, between about 33 bp and about 600 bp, between about 34 bp and about 500 bp, between about 35 bp and about 400 bp, between about 36 bp and about 300 bp, between about 37 bp and about 250 bp, between about 40 bp and about 200 bp, or between about 50 bp and about 100 bp). For example, in some embodiments, the promoter is between about 31 bp and about 750 bp. In some embodiments, the promoter is between about 32 bp and about 700 bp. In some embodiments, the promoter is between about 33 bp and about 600 bp. In some embodiments, the promoter is between about 34 bp and about 500 bp. In some embodiments, the promoter is between about 35 bp and about 400 bp. In some embodiments, the promoter is between about 36 bp and about 300 bp. In some embodiments, the promoter is between about 37 bp and about 250 bp. In some embodiments, the promoter is between about 40 bp and about 200 bp. In some embodiments, the promoter is between about 50 bp and about 100 bp.
[0129] In some embodiments, the compact bidirectional promoter is smaller than a CMV promoter.
[0130] In some embodiments, the compact bidirectional promoter is capable of promoting transcription of two coding sequences positioned on opposite sides of the promoter, as shown in FIG. 1.
[0131] In some embodiments, the promoter includes a nucleotide sequence having at least 85%, at least 90%, at least 95%, at least 96%, at least 97%, at least 98%, at least 99%, or 100% sequence identity to any one of SEQ ID NOs: 1-800 or a functional fragment or variant thereof. For example, in some embodiments, the promoter includes a nucleotide sequence having at least 86% sequence identity to any one of SEQ ID NOs: 1-800 or a functional fragment or variant thereof. In some embodiments, the promoter includes a nucleotide sequence having at least 87% sequence identity to any one of SEQ ID NOs: 1-800 or a functional fragment or variant thereof. In some embodiments, the promoter includes a nucleotide sequence having at least 88% sequence identity to any one of SEQ ID NOs: 1-800 or a functional fragment or variant thereof. In some embodiments, the promoter includes a nucleotide sequence having at least 89% sequence identity to any one of SEQ ID NOs: 1-800 or a functional fragment or variant thereof. In some embodiments, the promoter includes a nucleotide sequence having at least 90% sequence identity to any one of SEQ ID NOs: 1-800 or a functional fragment or variant thereof. In some embodiments, the promoter includes a nucleotide sequence having at least 91% sequence identity to any one of SEQ ID NOs: 1-800 or a functional fragment or variant thereof. In some embodiments, the promoter includes a nucleotide sequence having at least 92% sequence identity to any one of SEQ ID NOs: 1-800 or a functional fragment or variant thereof. In some embodiments, the promoter includes a nucleotide sequence having at least 93% sequence identity to any one of SEQ ID NOs: 1-800 or a functional fragment or variant thereof. In some embodiments, the promoter includes a nucleotide sequence having at least 94% sequence identity to any one of SEQ ID NOs: 1-800 or a functional fragment or variant thereof. 
In some embodiments, the promoter includes a nucleotide sequence having at least 95% sequence identity to any one of SEQ ID NOs: 1-800 or a functional fragment or variant thereof. In some embodiments, the promoter includes a nucleotide sequence having at least 96% sequence identity to any one of SEQ ID NOs: 1-800 or a functional fragment or variant thereof. In some embodiments, the promoter includes a nucleotide sequence having at least 97% sequence identity to any one of SEQ ID NOs: 1-800 or a functional fragment or variant thereof. In some embodiments, the promoter includes a nucleotide sequence having at least 98% sequence identity to any one of SEQ ID NOs: 1-800 or a functional fragment or variant thereof. In some embodiments, the promoter includes a nucleotide sequence having at least 99% sequence identity to any one of SEQ ID NOs: 1-800 or a functional fragment or variant thereof.
In some embodiments, the promoter includes a nucleotide sequence having at least 99.5% sequence identity to any one of SEQ ID NOs: 1-800 or a functional fragment or variant thereof. In some embodiments, the promoter includes a nucleotide sequence having 100% sequence identity to any one of SEQ ID NOs: 1-800 or a functional fragment or variant thereof. In some embodiments, the promoter includes the nucleotide sequence of any one of SEQ ID NOs: 1-800 or a functional fragment or variant thereof.
[0132] In some embodiments, the promoter includes a nucleotide sequence derived from an origin species, such as Homo sapiens or Mus musculus. For example, in some embodiments, the promoter includes a nucleotide sequence derived from a Homo sapiens promoter, such as a sequence having at least 85%, at least 90%, at least 95%, at least 96%, at least 97%, at least 98%, at least 99%, or 100% sequence identity to any one of SEQ ID NOs: 1-400 or a functional fragment or variant thereof. Alternatively, for example, in some embodiments, the promoter includes a nucleotide sequence derived from a Mus musculus promoter, such as a sequence having at least 85%, at least 90%, at least 95%, at least 96%, at least 97%, at least 98%, at least 99%, or 100% sequence identity to any one of SEQ ID NOs: 401-800 or a functional fragment or variant thereof.
[0133] In some embodiments, the promoter includes a nucleotide sequence having at least 85% sequence identity to any one of SEQ ID NOs: 1-400 or a functional fragment or variant thereof. For example, in some embodiments, the promoter includes a nucleotide sequence having at least 86% sequence identity to any one of SEQ ID NOs: 1-400 or a functional fragment or variant thereof. In some embodiments, the promoter includes a nucleotide sequence having at least 87% sequence identity to any one of SEQ ID NOs: 1-400 or a functional fragment or variant thereof. In some embodiments, the promoter includes a nucleotide sequence having at least 88% sequence identity to any one of SEQ ID NOs: 1-400 or a functional fragment or variant thereof. In some embodiments, the promoter includes a nucleotide sequence having at least 89% sequence identity to any one of SEQ ID NOs: 1-400 or a functional fragment or variant thereof. In some embodiments, the promoter includes a nucleotide sequence having at least 90% sequence identity to any one of SEQ ID NOs: 1-400 or a functional fragment or variant thereof. In some embodiments, the promoter includes a nucleotide sequence having at least 91% sequence identity to any one of SEQ ID NOs: 1-400 or a functional fragment or variant thereof. In some embodiments, the promoter includes a nucleotide sequence having at least 92% sequence identity to any one of SEQ ID NOs: 1-400 or a functional fragment or variant thereof. In some embodiments, the promoter includes a nucleotide sequence having at least 93% sequence identity to any one of SEQ ID NOs: 1-400 or a functional fragment or variant thereof.
In some embodiments, the promoter includes a nucleotide sequence having at least 94% sequence identity to any one of SEQ ID NOs: 1-400 or a functional fragment or variant thereof. In some embodiments, the promoter includes a nucleotide sequence having at least 95% sequence identity to any one of SEQ ID NOs: 1-400 or a functional fragment or variant thereof. In some embodiments, the promoter includes a nucleotide sequence having at least 96% sequence identity to any one of SEQ ID NOs: 1-400 or a functional fragment or variant thereof. In some embodiments, the promoter includes a nucleotide sequence having at least 97% sequence identity to any one of SEQ ID NOs: 1-400 or a functional fragment or variant thereof. In some embodiments, the promoter includes a nucleotide sequence having at least 98% sequence identity to any one of SEQ ID NOs: 1-400 or a functional fragment or variant thereof. In some embodiments, the promoter includes a nucleotide sequence having at least 99% sequence identity to any one of SEQ ID NOs: 1-400 or a functional fragment or variant thereof. In some embodiments, the promoter includes a nucleotide sequence having at least 99.5% sequence identity to any one of SEQ ID NOs: 1-400 or a functional fragment or variant thereof. In some embodiments, the promoter includes a nucleotide sequence having 100% sequence identity to any one of SEQ ID NOs: 1-400 or a functional fragment or variant thereof. In some embodiments, the promoter includes the nucleotide sequence of any one of SEQ ID NOs: 1-400 or a functional fragment or variant thereof.
[0134] In some embodiments, the promoter includes a nucleotide sequence having at least 85% sequence identity to any one of SEQ ID NOs: 401-800 or a functional fragment or variant thereof. For example, in some embodiments, the promoter includes a nucleotide sequence having at least 86% sequence identity to any one of SEQ ID NOs: 401-800 or a functional fragment or variant thereof. In some embodiments, the promoter includes a nucleotide sequence having at least 87% sequence identity to any one of SEQ ID NOs: 401-800 or a functional fragment or variant thereof. In some embodiments, the promoter includes a nucleotide sequence having at least 88% sequence identity to any one of SEQ ID NOs: 401-800 or a functional fragment or variant thereof. In some embodiments, the promoter includes a nucleotide sequence having at least 89% sequence identity to any one of SEQ ID NOs: 401-800 or a functional fragment or variant thereof. In some embodiments, the promoter includes a nucleotide sequence having at least 90% sequence identity to any one of SEQ ID NOs: 401-800 or a functional fragment or variant thereof. In some embodiments, the promoter includes a nucleotide sequence having at least 91% sequence identity to any one of SEQ ID NOs: 401-800 or a functional fragment or variant thereof. In some embodiments, the promoter includes a nucleotide sequence having at least 92% sequence identity to any one of SEQ ID NOs: 401-800 or a functional fragment or
variant thereof. In some embodiments, the promoter includes a nucleotide sequence having at least 93% sequence identity to any one of SEQ ID NOs: 401-800 or a functional fragment or variant thereof. In some embodiments, the promoter includes a nucleotide sequence having at least 94% sequence identity to any one of SEQ ID NOs: 401-800 or a functional fragment or variant thereof. In some embodiments, the promoter includes a nucleotide sequence having at least 95% sequence identity to any one of SEQ ID NOs: 401-800 or a functional fragment or variant thereof. In some embodiments, the promoter includes a nucleotide sequence having at least 96% sequence identity to any one of SEQ ID NOs: 401-800 or a functional fragment or variant thereof. In some embodiments, the promoter includes a nucleotide sequence having at least 97% sequence identity to any one of SEQ ID NOs: 401-800 or a functional fragment or variant thereof. In some embodiments, the promoter includes a nucleotide sequence having at least 98% sequence identity to any one of SEQ ID NOs: 401-800 or a functional fragment or variant thereof. In some embodiments, the promoter includes a nucleotide sequence having at least 99% sequence identity to any one of SEQ ID NOs: 401-800 or a functional fragment or variant thereof. In some embodiments, the promoter includes a nucleotide sequence having at least 99.5% sequence identity to any one of SEQ ID NOs: 401-800 or a functional fragment or variant thereof. In some embodiments, the promoter includes a nucleotide sequence having 100% sequence identity to any one of SEQ ID NOs: 401-800 or a functional fragment or variant thereof. In some embodiments, the promoter includes the nucleotide sequence of any one of SEQ ID NOs: 401-800 or a functional fragment or variant thereof.
[0135] In some embodiments, a functional fragment includes a truncation of from about 10 to about 70 (e.g., about 20, 30, 40, 50, or 60) bp at the 5' end, at the 3' end, or at each of the 5' and 3' ends of any one of SEQ ID NOs: 1-800 or a variant thereof (e.g., a variant having at least 85%, at least 90%, at least 95%, at least 96%, at least 97%, at least 98%, at least 99%, or 100% sequence identity to any one of SEQ ID NOs: 1-800). For example, in some embodiments, a functional fragment includes a truncation of about 20 bp at the 5' end, at the 3' end, or at each of the 5' and 3' ends of any one of SEQ ID NOs: 1-800 or a variant thereof. In some embodiments, a functional fragment includes a truncation of about 30 bp at the 5' end, at the 3' end, or at each of the 5' and 3' ends of any one of SEQ ID NOs: 1-800 or a variant thereof. In some embodiments, a functional fragment includes a truncation of about 40 bp at the 5' end, at the 3' end, or at each of the 5' and 3' ends of any one of SEQ ID NOs: 1-800 or a variant thereof. In some embodiments, a functional fragment includes a truncation of about 50 bp at the 5' end, at the 3' end, or at each of the 5' and 3' ends of any one of SEQ ID NOs: 1-800 or a variant thereof. In some embodiments, a functional fragment includes a truncation of about 60 bp at the 5' end, at the 3' end, or at each of the 5' and 3' ends of any one of SEQ ID NOs: 1-800 or a variant thereof. In some embodiments, a functional fragment includes a truncation of about 70 bp at the 5' end, at the 3' end, or at each of the 5' and 3' ends of any one of SEQ ID NOs: 1-800 or a variant thereof.
[0136] In some embodiments, the compact bidirectional promoter includes at least about 95%, at least about 96%, at least about 97%, at least about 98%, at least about 99%, at least about 99.5%, or 100% sequence identity to a naturally occurring mammalian promoter. For example, in some embodiments, the compact bidirectional promoter includes at least about 96% sequence identity to a naturally occurring mammalian promoter. In some embodiments, the compact bidirectional promoter includes at least about 97% sequence identity to a naturally occurring mammalian promoter. In some embodiments, the compact bidirectional promoter includes at least about 98% sequence identity to a naturally occurring mammalian promoter. In some embodiments, the compact bidirectional promoter includes at least about 99% sequence identity to a naturally occurring mammalian promoter. In some embodiments, the compact bidirectional promoter includes at least about 99.5% sequence identity to a naturally occurring mammalian promoter. In some embodiments, the compact bidirectional promoter includes 100% sequence identity to a naturally occurring mammalian promoter.
[0137] For example, in some embodiments, the compact bidirectional promoter includes at least about 95%, at least about 96%, at least about 97%, at least about 98%, at least about 99%, at least about 99.5%, or 100% sequence identity to a naturally occurring human promoter. In some embodiments, the compact bidirectional promoter includes at least about 96% sequence identity to a naturally occurring human promoter. In some embodiments, the compact bidirectional promoter includes at least about 97% sequence identity to a naturally occurring human promoter. In some embodiments, the compact bidirectional promoter includes at least about 98% sequence identity to a naturally occurring human promoter. In some embodiments, the compact bidirectional promoter includes at least about 99% sequence identity to a naturally occurring human promoter. In some embodiments, the compact bidirectional promoter includes at least about 99.5% sequence identity to a naturally occurring human promoter. In some embodiments, the compact bidirectional promoter includes 100% sequence identity to a naturally occurring human promoter.
[0138] In some embodiments, the compact bidirectional promoter or a functional fragment or variant thereof (e.g., a variant having at least 85%, at least 90%, at least 95%, at least 96%, at least 97%, at least 98%, at least 99%, or 100% sequence identity to any one of SEQ ID NOs: 1-800) has higher activity than standard promoters (e.g., higher activity than a herpes simplex virus (HSV) thymidine kinase (TK) promoter). For example, in some embodiments, the compact
bidirectional promoter or a functional fragment or variant thereof (e.g., a variant having at least 85%, at least 90%, at least 95%, at least 96%, at least 97%, at least 98%, at least 99%, or 100% sequence identity to any one of SEQ ID NOs: 1-800) is capable of expressing a luciferase reporter at a higher level than is a HSV TK promoter. The expression level of a compact bidirectional promoter can be determined, for example, by expressing a reporter molecule in a cell, e.g., a human embryonic kidney (HEK) cell line or an N2A cell line.
[0139] In some embodiments, the compact bidirectional promoter or a functional fragment or variant thereof (e.g., a variant having at least 85%, at least 90%, at least 95%, at least 96%, at least 97%, at least 98%, at least 99%, or 100% sequence identity to any one of SEQ ID NOs: 1-800) is capable of promoting expression of a gene in a tissue or a subset of tissues as identified in the Human Protein Atlas (HPA), FANTOM, or Genotype-Tissue Expression (GTEx) databases, for example, as shown in FIGs. 11A-25, and/or as shown in Appendix A of U.S. Provisional Application No. 63/403,571, the entire disclosure of which is hereby incorporated by reference in its entirety for all purposes. For example, the compact bidirectional promoter of SEQ ID NO: 17 (flanking genes ALKBH1 and SLIRP) can express a heterologous coding sequence at a low level in adipose tissue, adrenal glands, amygdala, basal ganglia, breast, cerebellum, cerebral cortex, cervix, uterine tissue, colon, endometrium, esophagus, fallopian tube, heart muscle, hippocampal formation, etc., as identified in the HPA, FANTOM, or GTEx databases, for example, as shown in FIGs. 11A-25, and/or as shown in Appendix A of U.S. Provisional Application No. 63/403,571, the entire disclosure of which is hereby incorporated by reference in its entirety for all purposes, if positioned on the side of the promoter where the ALKBH1 gene naturally occurs. In addition, the compact bidirectional promoter of SEQ ID NO: 17 can express a heterologous coding sequence at varying levels in adipose tissue, adrenal glands, amygdala, basal ganglia, breast, cerebellum, cerebral cortex, cervix, uterine tissue, colon, endometrium, esophagus, fallopian tube, heart muscle, hippocampal formation, etc., as identified in the HPA, FANTOM, or GTEx databases, for example, as shown in FIGs. 11A-25, and/or as shown in Appendix A of U.S. Provisional Application No. 63/403,571, the entire disclosure of which is hereby incorporated by reference in its entirety for all purposes, if positioned on the side of the promoter where the SLIRP gene naturally occurs. Expression data for promoters of SEQ ID NOs: 1-800 are as identified in the HPA, FANTOM, or GTEx databases, for example, as shown in FIGs. 11A-25, and/or as shown in Appendix A of U.S. Provisional Application No. 63/403,571, the entire disclosure of which is hereby incorporated by reference in its entirety for all purposes, with promoters identified by flanking gene names (Table 2). Expression data is shown for the human promoter (a promoter selected from SEQ ID NOs: 1-400), except where indicated as mouse expression data, which refers to the corresponding promoter in SEQ ID NOs: 401-800. Accordingly, the present disclosure includes a method of expressing one or two heterologous coding sequences using a compact bidirectional promoter or functional fragment or variant thereof, as disclosed herein, wherein the bidirectional promoter or functional fragment or variant thereof promotes expression of the one or two heterologous coding sequences in the tissues identified in the HPA, FANTOM, or GTEx databases, for example, as shown in FIGs. 11A-25, and/or as shown in Appendix A of U.S. Provisional Application No. 63/403,571, the entire disclosure of which is hereby incorporated by reference in its entirety for all purposes.
Table 2. Promoters of SEQ ID NOs: 1-800 identified by flanking gene names
[0140] In some embodiments, the compact bidirectional promoter is operably linked to a 5' untranslated region (UTR). For example, in some embodiments, the 5' UTR includes at least a portion of a beta-globin 5' UTR sequence. For example, in some embodiments, the 5' UTR includes the nucleotide sequence 5'-GCCGCCRCC-3', or a 6 bp, 7 bp, or 8 bp fragment thereof. In some embodiments, the 6 bp fragment is 5'-GCCACC-3'.
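The relationship between the 9 bp sequence and its shorter fragments can be sketched as follows. The sketch assumes that "fragment" means a contiguous subsequence and that R is the standard IUPAC code for a purine (A or G). The helper names `expand` and `fragments` are hypothetical.

```python
# IUPAC ambiguity codes needed here: R stands for a purine (A or G).
IUPAC = {"R": "AG"}


def expand(seq: str) -> list[str]:
    """List every concrete sequence encoded by `seq` (resolving R to A or G)."""
    results = [""]
    for base in seq:
        options = IUPAC.get(base, base)
        results = [prefix + option for prefix in results for option in options]
    return results


def fragments(seq: str, length: int) -> set[str]:
    """Return all concrete contiguous fragments of `seq` with the given length."""
    found: set[str] = set()
    for start in range(len(seq) - length + 1):
        found.update(expand(seq[start:start + length]))
    return found


if __name__ == "__main__":
    for n in (6, 7, 8):
        print(n, sorted(fragments("GCCGCCRCC", n)))
```

Under these assumptions, 5'-GCCACC-3' corresponds to the last six positions of 5'-GCCGCCRCC-3' with R resolved to A.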
[0141] In some embodiments, the compact promoter is operably linked to a Kozak consensus sequence.
[0142] In some embodiments, the compact bidirectional promoter includes a TATA mutation. For example, in some embodiments, the TATA mutation is a TATAA to TCGAA mutation.
[0143] In some embodiments, the compact bidirectional promoter is coupled with a viral intron (e.g., an SV40i intron, a MVM intron, a Mv2 intron, an HNRNPH1 intron, a chimeric intron, or a synthetic intron). For example, in some embodiments, the compact bidirectional promoter is coupled with an SV40i intron. In some embodiments, the compact bidirectional promoter is coupled with a MVM intron. In some embodiments, the compact bidirectional promoter is coupled with a Mv2 intron. In some embodiments, the compact bidirectional promoter is coupled with an HNRNPH1 intron. In some embodiments, the compact bidirectional promoter is coupled with a chimeric intron. In some embodiments, the compact bidirectional promoter is coupled with a synthetic intron.
[0144] In some embodiments, the compact bidirectional promoter does not include a viral promoter or a synthetic promoter. For example, in some embodiments, the compact bidirectional promoter does not include a viral promoter. In some embodiments, the compact bidirectional promoter does not include a synthetic promoter.
[0145] In some embodiments, the functional fragment of a compact bidirectional promoter described herein includes a transcription factor binding site. Identification of transcription factor binding sites can be determined, for example, by consensus or by using a differential distance matrix or multidimensional scaling (De Bleser P. et al. (2007) Genome Biol 8(5):R83). In some embodiments, a functional fragment of a compact bidirectional promoter described herein includes a transcription factor binding site selected from Staf, DSE, PSE, c-REL, GATA-1, GATA-2, and CREB. For example, in some embodiments, a functional fragment of a compact bidirectional promoter described herein includes a Staf transcription factor binding site. In some embodiments, a functional fragment of a compact bidirectional promoter described herein includes a DSE transcription factor binding site. In some embodiments, a functional fragment of a compact bidirectional promoter described herein includes a PSE transcription factor binding site. In some embodiments, a functional fragment of a compact bidirectional promoter described herein includes a c-REL transcription factor binding site. In some embodiments, a functional fragment of a compact bidirectional promoter described herein includes a GATA-1 transcription factor binding site. In some embodiments, a functional fragment of a compact bidirectional promoter described herein includes a GATA-2 transcription factor binding site. In some embodiments, a functional fragment of a compact bidirectional promoter described herein includes a CREB transcription factor binding site.
[0146] In some embodiments, a functional fragment of a compact bidirectional promoter described herein can include a B recognition sequence (BRE) or TATA box. For example, in some embodiments, a functional fragment of a compact bidirectional promoter described herein can include a BRE. In some embodiments, a functional fragment of a compact bidirectional promoter described herein can include a TATA box.
[0147] In some embodiments, a nucleic acid including a compact bidirectional promoter described herein further includes a terminator sequence. In some embodiments, the terminator sequence includes one of the exemplary, non-limiting terminator sequences in Table 3, below.
III. Methods of Identifying Bidirectional Promoters
[0148] The present disclosure also provides, among other things, methods of identifying bidirectional promoters of the disclosure. For example, in some embodiments, a bidirectional promoter (e.g., a compact bidirectional promoter) of the disclosure is identified as such by a method including identifying regions between a transcription start site on the minus strand and a transcription start site on the plus strand. For example, the disclosure provides a method including: (a) obtaining a genome file including annotations categorized by chromosome, wherein the annotations include indices, wherein the indices include genes, pseudogenes, and coding regions for protein-coding genes, wherein each coding region includes a transcription start site; and (b) obtaining a non-transitory computer readable medium including instructions that, when executed by a processor, cause the processor to: identify regions between a transcription start site on the minus strand and a transcription start site on the plus strand.
[0149] In some embodiments, the instructions, when executed by a processor, further cause the processor to: save annotations and/or sort indices by chromosome.
[0150] For example, the methods of the disclosure include developing a script (e.g., a Python script) to identify bidirectional promoters (e.g., compact bidirectional promoters) from genomic annotation files, including, for example, mammalian (e.g., human) annotations. In some embodiments, the script can be applied to genome-wide transcription data files. In an exemplary method, an input data file is obtained (e.g., GRCh38_latest_genomic.gff or GRCm39_vM27.gff3). The file can then be, for example, categorized by chromosome, with each line pertaining to a region of interest in the genome, examples of which include genes, pseudogenes, and coding regions for protein-coding genes. The script can, for example, iterate through every line in the file and store the type of annotation. The genes can be, for example, sorted by index on a per-chromosome basis, and/or the script may identify regions in between transcription on the minus strand and transcription on the plus strand, thereby defining the intervening region as a bidirectional promoter (e.g., a compact bidirectional promoter). In some embodiments, the transcripts are filtered for those that are oriented in opposite directions (divergent transcription). Promoter boundaries can be, for example, further refined using the coding sequence (CDS) start for protein-coding genes.
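The steps described above can be sketched in Python as follows. This is only a minimal illustration under stated assumptions, not the script of the disclosure: it assumes a tab-delimited GFF3 input with nine columns, uses `gene` feature boundaries as a proxy for transcription start sites, considers only adjacent genes after sorting by start coordinate, and omits the CDS-based boundary refinement. The names `Gene`, `parse_genes`, `divergent_regions`, and `max_len` are hypothetical.

```python
from collections import defaultdict
from typing import Iterable, NamedTuple


class Gene(NamedTuple):
    chrom: str
    start: int   # 1-based, inclusive (GFF3 convention)
    end: int     # 1-based, inclusive
    strand: str  # "+" or "-"
    name: str


def parse_genes(lines: Iterable[str]) -> dict:
    """Collect 'gene' features from GFF3 lines, grouped by chromosome."""
    genes = defaultdict(list)
    for line in lines:
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines, directives, and comments
        cols = line.rstrip("\n").split("\t")
        if len(cols) != 9 or cols[2] != "gene":
            continue  # only gene-level annotations are kept in this sketch
        attrs = dict(f.split("=", 1) for f in cols[8].split(";") if "=" in f)
        name = attrs.get("Name", attrs.get("ID", "?"))
        genes[cols[0]].append(Gene(cols[0], int(cols[3]), int(cols[4]), cols[6], name))
    return genes


def divergent_regions(genes: dict, max_len: int = 1000) -> list:
    """Find regions between a minus-strand gene and the next plus-strand gene.

    For a minus-strand gene the transcription start lies at its `end`
    coordinate; for a plus-strand gene it lies at its `start` coordinate.
    The region strictly between the two is reported if it is shorter than
    `max_len` bp.
    """
    regions = []
    for chrom, items in genes.items():
        ordered = sorted(items, key=lambda g: g.start)  # sort per chromosome
        for left, right in zip(ordered, ordered[1:]):
            if left.strand == "-" and right.strand == "+" and right.start > left.end:
                length = right.start - left.end - 1
                if 0 < length < max_len:
                    regions.append((chrom, left.end + 1, right.start - 1, left.name, right.name))
    return regions
```

With a real annotation file, `parse_genes` could be called on the open file handle, since a file object iterates over its lines. Overlapping or nested genes are not handled by the adjacent-pair simplification and would need additional logic.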
[0151] In some embodiments, the annotations include mammalian annotations, such as, for example, human or mouse annotations. For example, in some embodiments, the annotations include human annotations (e.g., the genome file including annotations is GRCh38_latest_genomic.gff). In some embodiments, the annotations include mouse annotations (e.g., the genome file including annotations is GRCm39_vM27.gff3).
[0152] In some embodiments, the genome file including annotations is GRCh38_latest_genomic.gff or GRCm39_vM27.gff3. For example, in some embodiments, the genome file including annotations is GRCh38_latest_genomic.gff. In some embodiments, the genome file including annotations is GRCm39_vM27.gff3.
[0153] In some embodiments, the genome file includes experimentally-derived annotations. For example, in some embodiments, the genome file includes annotations derived from serial analysis of gene expression (SAGE). In some embodiments, the genome file includes annotations derived from RNA sequencing (RNAseq). In some embodiments, the genome file includes annotations derived from H3K4me1 chromatin immunoprecipitation (ChIP) sequencing (ChIP-seq). In some embodiments, the genome file includes annotations derived from H3K4me3 ChIP-seq. In some embodiments, the genome file includes annotations derived from RNA polymerase II ChIP-seq. In some embodiments, the genome file includes annotations derived from Cap Analysis of Gene Expression (CAGE).
[0154] In some embodiments, a bidirectional promoter (e.g., a compact bidirectional promoter) identified by a method of the disclosure is operably linked to at least one (e.g., two) heterologous coding sequence. For example, in some embodiments, a bidirectional promoter (e.g., a compact bidirectional promoter) identified by a method of the disclosure is operably linked to two heterologous coding sequences.
[0155] In some embodiments, a bidirectional promoter (e.g., a compact bidirectional promoter) identified by a method of the disclosure is capable of promoting transcription of two coding sequences positioned on opposite sides of the promoter.
[0156] In some embodiments, the compact bidirectional promoter promotes transcription of a heterologous coding sequence by an RNA polymerase II (“pol II”). For example, in some embodiments, the compact bidirectional promoter promotes transcription of a first heterologous coding sequence in one direction (e.g., on one strand of a DNA molecule), and a second heterologous coding sequence in another direction (e.g., on the opposite strand of the DNA molecule), as shown in FIG. 1. In some embodiments, the heterologous promoter does not
promote transcription by an RNA polymerase III (“pol III”) (i.e., the promoter is not a pol III promoter).
[0157] In some embodiments, a bidirectional promoter (e.g., a compact bidirectional promoter) identified by a method of the disclosure is less than about 1000 bp (e.g., less than about 800 bp, less than about 600 bp, less than about 400 bp, or less than about 200 bp). For example, in some embodiments, the promoter is less than about 800 bp. In some embodiments, the promoter is less than about 600 bp. In some embodiments, the promoter is less than about 400 bp. In some embodiments, the promoter is less than about 200 bp.
[0158] In some embodiments, a bidirectional promoter (e.g., a compact bidirectional promoter) identified by a method of the disclosure is between about 30 bp and about 800 bp (e.g., between about 31 bp and about 750 bp, about 32 bp and about 700 bp, about 33 bp and about 600 bp, about 34 bp and about 500 bp, about 35 bp and about 400 bp, about 36 bp and about 300 bp, about 37 bp and about 250 bp, about 40 bp and about 200 bp, or about 50 bp and about 100 bp).
For example, in some embodiments, the promoter is between about 31 bp and about 750 bp. In some embodiments, the promoter is between about 32 bp and about 700 bp. In some embodiments, the promoter is between about 33 bp and about 600 bp. In some embodiments, the promoter is between about 34 bp and about 500 bp. In some embodiments, the promoter is between about 35 bp and about 400 bp. In some embodiments, the promoter is between about 36 bp and about 300 bp. In some embodiments, the promoter is between about 37 bp and about 250 bp. In some embodiments, the promoter is between about 40 bp and about 200 bp. In some embodiments, the promoter is between about 50 bp and about 100 bp.
[0159] In some embodiments, a bidirectional promoter (e.g., a compact bidirectional promoter) identified by a method of the disclosure is smaller than a CMV promoter.
[0160] In some embodiments, the bidirectional promoter (e.g., a compact bidirectional promoter) has higher activity than standard promoters (e.g., higher activity than a HSV TK promoter). For example, in some embodiments, the bidirectional promoter (e.g., a compact bidirectional promoter) is capable of expressing a luciferase reporter at a higher level than is a HSV TK promoter. The expression level of a bidirectional promoter (e.g., a compact bidirectional promoter) can be determined, for example, by expressing a reporter molecule in a cell, e.g., a HEK cell line or an N2A cell line.
IV. Coding Sequences
[0161] In some embodiments, a compact bidirectional promoter of the disclosure is operably linked to at least one (e.g., two) heterologous coding sequence. For example, in some
embodiments, the compact bidirectional promoter is operably linked to only one heterologous coding sequence.
[0162] In some embodiments, the bidirectional promoter is capable of promoting transcription of two coding sequences positioned on opposite sides of the promoter.
[0163] In some embodiments, the compact bidirectional promoter of the disclosure is operably linked to two heterologous coding sequences. In some embodiments, the two heterologous coding sequences include the same coding sequence. Alternatively, for example, in some embodiments, the two heterologous coding sequences include different coding sequences.
[0164] In some embodiments, the compact bidirectional promoter is capable of expressing the at least one (e.g., two) heterologous coding sequence in a target cell (e.g., a lung cell, a pancreatic cell, a kidney cell, a muscle cell, a liver cell, a retinal cell, a neuron, a glial cell, an endothelial cell, or an epithelial cell). For example, in some embodiments, the compact bidirectional promoter is capable of expressing each of the two heterologous coding sequences: (a) in the same target cell or cells, (b) in different target cells, or (c) in a partially overlapping set of target cells. In some embodiments, the compact bidirectional promoter is capable of expressing each of the two heterologous coding sequences in the same target cell or cells. In some embodiments, the compact bidirectional promoter is capable of expressing each of the two heterologous coding sequences in different target cells. In some embodiments, the compact bidirectional promoter is capable of expressing each of the two heterologous coding sequences in a partially overlapping set of cells.
[0165] In some embodiments, a coding sequence encodes one or more genes selected from the non-limiting list of: CFTR, ATP2B, ATP7A, AGL, CPS1, A1AT, ALPL, ARSA, BBS1, BEST1, CAH, CFH, CFI, CHM, CLN2, CLN7, CNGA3, CYP46A1, F9, FKRP, FMR1, FMRP, FOXG1, GAD, GALC, GALGT2, GBA1, GBE1, GLB1, GRN, HEXA, HTRA1, IDS, IDUA, LAMP2, LCA5, MECP2, MFN2, MMUT, MTM1, NAGLU, ND4, PAH, PIGA, PRKN, RPE65, SERPING1, SGSH, SLC13A5, and SLC6A1 or a functional fragment or variant thereof. For example, in some embodiments, a coding sequence encodes CFTR or a functional fragment or variant thereof. In some embodiments, a coding sequence encodes ATP2B or a functional fragment or variant thereof. In some embodiments, a coding sequence encodes ATP7A or a functional fragment or variant thereof. In some embodiments, a coding sequence encodes AGL or a functional fragment or variant thereof. In some embodiments, a coding sequence encodes CPS1 or a functional fragment or variant thereof. In some embodiments, a coding sequence encodes A1AT or a functional fragment or variant thereof. In some embodiments, a coding sequence encodes ALPL or a functional fragment or variant thereof. In some embodiments, a coding sequence encodes
ARSA or a functional fragment or variant thereof. In some embodiments, a coding sequence encodes BBS1 or a functional fragment or variant thereof. In some embodiments, a coding sequence encodes BEST1 or a functional fragment or variant thereof. In some embodiments, a coding sequence encodes CAH or a functional fragment or variant thereof. In some embodiments, a coding sequence encodes CFH or a functional fragment or variant thereof. In some embodiments, a coding sequence encodes CFI or a functional fragment or variant thereof. In some embodiments, a coding sequence encodes CHM or a functional fragment or variant thereof. In some embodiments, a coding sequence encodes CLN2 or a functional fragment or variant thereof. In some embodiments, a coding sequence encodes CLN7 or a functional fragment or variant thereof. In some embodiments, a coding sequence encodes CNGA3 or a functional fragment or variant thereof. In some embodiments, a coding sequence encodes CYP46A1 or a functional fragment or variant thereof. In some embodiments, a coding sequence encodes F9 or a functional fragment or variant thereof. In some embodiments, a coding sequence encodes FKRP or a functional fragment or variant thereof. In some embodiments, a coding sequence encodes FMR1 or a functional fragment or variant thereof. In some embodiments, a coding sequence encodes FMRP or a functional fragment or variant thereof. In some embodiments, a coding sequence encodes FOXG1 or a functional fragment or variant thereof. In some embodiments, a coding sequence encodes GAD or a functional fragment or variant thereof. In some embodiments, a coding sequence encodes GALC or a functional fragment or variant thereof. In some embodiments, a coding sequence encodes GALGT2 or a functional fragment or variant thereof. In some embodiments, a coding sequence encodes GBA1 or a functional fragment or variant thereof. In some embodiments, a coding sequence encodes GBE1 or a functional fragment or variant thereof.
In some embodiments, a coding sequence encodes GLB1 or a functional fragment or variant thereof. In some embodiments, a coding sequence encodes GRN or a functional fragment or variant thereof. In some embodiments, a coding sequence encodes HEXA or a functional fragment or variant thereof. In some embodiments, a coding sequence encodes HTRA1 or a functional fragment or variant thereof. In some embodiments, a coding sequence encodes IDS or a functional fragment or variant thereof. In some embodiments, a coding sequence encodes IDUA or a functional fragment or variant thereof. In some embodiments, a coding sequence encodes LAMP2 or a functional fragment or variant thereof. In some embodiments, a coding sequence encodes LCA5 or a functional fragment or variant thereof. In some embodiments, a coding sequence encodes MECP2 or a functional fragment or variant thereof. In some embodiments, a coding sequence encodes MFN2 or a functional fragment or variant thereof. In some embodiments, a coding sequence encodes MMUT or a
functional fragment or variant thereof. In some embodiments, a coding sequence encodes MTM1 or a functional fragment or variant thereof. In some embodiments, a coding sequence encodes NAGLU or a functional fragment or variant thereof. In some embodiments, a coding sequence encodes ND4 or a functional fragment or variant thereof. In some embodiments, a coding sequence encodes PAH or a functional fragment or variant thereof. In some embodiments, a coding sequence encodes PIGA or a functional fragment or variant thereof. In some embodiments, a coding sequence encodes PRKN or a functional fragment or variant thereof. In some embodiments, a coding sequence encodes RPE65 or a functional fragment or variant thereof. In some embodiments, a coding sequence encodes SERPING1 or a functional fragment or variant thereof. In some embodiments, a coding sequence encodes SGSH or a functional fragment or variant thereof. In some embodiments, a coding sequence encodes SLC13A5 or a functional fragment or variant thereof. In some embodiments, a coding sequence encodes SLC6A1 or a functional fragment or variant thereof.
[0166] In some embodiments, a coding sequence encodes one or more genes selected from the non-limiting list of: F8, F9, PIGA, SGSH, G6PC, NAGLU, CLN3, GBA, IDS, GAA, OTC, GLA, CAH, IDUA, LAMP2, CLN1, ATP7B, A1AT, GALT, EMNA, ENPP1, CLN2, CLN5, CLN7/MFSD8, AGU, MMUT, NPC2, ABCB11, ABCB4, ASS1, SMN1, AADC, MTM1, GBA1, GRN, GAD, GALGT2, SGCB, GDNF, ASPA, GLB1, GALC, SGCA, DYSF, HEXA, GAN, FXN, ARSA, MECP2, IGHMBP2, UBE3A, CDKL5, PGRN, FKRP, CYP46A1, OPMD, Cav1, neuropeptide Y/Y2, SCN1A, SHANK3, APOE2(R158C), FMR1, UPF1, CMT4J, MFN2, PRKN, CAPN3, NTF3, ANO5, SGCG, EMD, SURF1, GBE1, FMRP, RPE65, RPGR, CHM, ND4, CNGB3, PDE6b, CFI, CNGA3, GUCY2D, RLBP1, CD59, OPN1LW, CFH, MYO7A, RS1, ABCA4, ND1, BEST1, RHO, LCA5, RDH12, NMNAT1, SERPING1, AQP1, PPP1R1A, IL-1Ra, CFTR, OTOF, CLRN1, GJB2, ALPL, TMC1, STRC, ATOH1, and MYBPC3, or a functional fragment or variant thereof.
[0167] For example, in some embodiments, a coding sequence encodes F8. In some embodiments, a coding sequence encodes F9. In some embodiments, a coding sequence encodes PIGA. In some embodiments, a coding sequence encodes SGSH. In some embodiments, a coding sequence encodes G6PC. In some embodiments, a coding sequence encodes NAGLU. In some embodiments, a coding sequence encodes CLN3. In some embodiments, a coding sequence encodes GBA. In some embodiments, a coding sequence encodes IDS. In some embodiments, a coding sequence encodes GAA. In some embodiments, a coding sequence encodes OTC. In some embodiments, a coding sequence encodes GLA. In some embodiments, a coding sequence encodes CAH. In some embodiments, a coding sequence encodes IDUA. In some embodiments,
a coding sequence encodes LAMP2. In some embodiments, a coding sequence encodes CLN1. In some embodiments, a coding sequence encodes A TP7B. In some embodiments, a coding sequence encodes A1AT. In some embodiments, a coding sequence encodes GALT. In some embodiments, a coding sequence encodes EMNA. In some embodiments, a coding sequence encodes ENPP1. In some embodiments, a coding sequence encodes CLN2. In some embodiments, a coding sequence encodes CLN5. In some embodiments, a coding sequence encodes CLN7/MFSD8. In some embodiments, a coding sequence encodes AGU. In some embodiments, a coding sequence encodes MMUT. In some embodiments, a coding sequence encodes NPC2. In some embodiments, a coding sequence encodes ABCB11. In some embodiments, a coding sequence encodes ABCB4. In some embodiments, a coding sequence encodes ASS1. In some embodiments, a coding sequence encodes SMN1. In some embodiments, a coding sequence encodes AADC. In some embodiments, a coding sequence encodes MTM1. In some embodiments, a coding sequence encodes GBA1. In some embodiments, a coding sequence encodes GRN In some embodiments, a coding sequence encodes GAD. In some embodiments, a coding sequence encodes GALGT2. In some embodiments, a coding sequence encodes SGCB. In some embodiments, a coding sequence encodes GDNF. In some embodiments, a coding sequence encodes ASPA. In some embodiments, a coding sequence encodes GLB1. In some embodiments, a coding sequence encodes GALC. In some embodiments, a coding sequence encodes SGCA. In some embodiments, a coding sequence encodes DYSF. In some embodiments, a coding sequence encodes HEXA. In some embodiments, a coding sequence encodes GAN. In some embodiments, a coding sequence encodes FXN. In some embodiments, a coding sequence encodes ARSA. In some embodiments, a coding sequence encodes MECP2. In some embodiments, a coding sequence encodes IGHMBP2. In some embodiments, a coding sequence encodes UBE3A. 
In some embodiments, a coding sequence encodes CDKL5. In some embodiments, a coding sequence encodes PGRN. In some embodiments, a coding sequence encodes FKRP. In some embodiments, a coding sequence encodes CYP46A1. In some embodiments, a coding sequence encodes OPMD. In some embodiments, a coding sequence encodes Cavl. In some embodiments, a coding sequence encodes neuropeptide Y/Y2. In some embodiments, a coding sequence encodes SCN1A. In some embodiments, a coding sequence encodes SHANK3. In some embodiments, a coding sequence encodes APOE2(R158C). In some embodiments, a coding sequence encodes FMRI. In some embodiments, a coding sequence encodes UPF1. In some embodiments, a coding sequence encodes CMT4J. In some embodiments, a coding sequence encodes MFN2. In some embodiments, a coding sequence encodes PRKN. In some embodiments, a coding sequence encodes CAPN3. In some
embodiments, a coding sequence encodes NTF3. In some embodiments, a coding sequence encodes AN05. In some embodiments, a coding sequence encodes SGCG. In some embodiments, a coding sequence encodes EMD. In some embodiments, a coding sequence encodes SURF1. In some embodiments, a coding sequence encodes GBE1. In some embodiments, a coding sequence encodes FMRP. In some embodiments, a coding sequence encodes RPE65. In some embodiments, a coding sequence encodes RPGR. In some embodiments, a coding sequence encodes CHM. In some embodiments, a coding sequence encodes ND4. In some embodiments, a coding sequence encodes CNGB3. In some embodiments, a coding sequence encodes PDE6b. In some embodiments, a coding sequence encodes CFI. In some embodiments, a coding sequence encodes CNGA3. In some embodiments, a coding sequence encodes GUCY2D. In some embodiments, a coding sequence encodes RLBP1. In some embodiments, a coding sequence encodes CD59. In some embodiments, a coding sequence encodes 0PN1LW. In some embodiments, a coding sequence encodes CFH. In some embodiments, a coding sequence encodes MYO 7 A. In some embodiments, a coding sequence encodes RSI. In some embodiments, a coding sequence encodes ABCA4. In some embodiments, a coding sequence encodes ND1. In some embodiments, a coding sequence encodes BEST1. In some embodiments, a coding sequence encodes RHO. In some embodiments, a coding sequence encodes LCA5. In some embodiments, a coding sequence encodes RDH12. In some embodiments, a coding sequence encodes NMNA Tl. In some embodiments, a coding sequence encodes SERPING1. In some embodiments, a coding sequence encodes AQP1. In some embodiments, a coding sequence encodes PPP1R1A. In some embodiments, a coding sequence encodes IL-lRa. In some embodiments, a coding sequence encodes CFTR. In some embodiments, a coding sequence encodes OTOF. In some embodiments, a coding sequence encodes CLRN1. In some embodiments, a coding sequence encodes GJB2. 
In some embodiments, a coding sequence encodes ALPL. In some embodiments, a coding sequence encodes TMC1. In some embodiments, a coding sequence encodes STRC. In some embodiments, a coding sequence encodes ATOH1. In some embodiments, a coding sequence encodes MYBPC3.
[0168] In some embodiments, the therapeutic coding sequence is less than about 750 (e.g., less than about 700, less than about 600, less than about 500, or less than about 400) amino acids. For example, in some embodiments, the therapeutic coding sequence is less than about 700 amino acids. In some embodiments, the therapeutic coding sequence is less than about 600 amino acids. In some embodiments, the therapeutic coding sequence is less than about 500 amino acids. In some embodiments, the therapeutic coding sequence is less than about 400 amino acids.
[0169] In some embodiments, the therapeutic coding sequence is from about 350 amino acids to about 750 amino acids (e.g., from about 400 amino acids to about 700 amino acids or from about 500 amino acids to about 600 amino acids). For example, in some embodiments, the therapeutic coding sequence is from about 400 amino acids to about 700 amino acids. In some embodiments, the therapeutic coding sequence is from about 500 amino acids to about 600 amino acids.
[0170] For example, in some embodiments, any such coding sequence may be provided in an expression construct and the construct itself may be provided as a transgene in a vector, such as the exemplary vectors of the disclosure (e.g., rAAV). The transgene is a nucleic acid sequence, heterologous to the vector sequences flanking the transgene, which encodes a polypeptide, protein, or other product of interest. The nucleic acid coding sequence may be operatively linked to regulatory components in a manner which permits transgene transcription, translation, and/or expression in a target cell. The heterologous nucleic acid sequence (e.g., transgene) can be derived from any organism. In some embodiments, the transgene is derived from a mammal, such as a human.
[0171] In some embodiments, the expression construct includes, in addition to a compact bidirectional promoter and a coding sequence, a second coding sequence positioned on the opposite side of the promoter that encodes an RNA molecule or a protein. For example, in some embodiments, the second coding sequence encodes a molecule (e.g., an RNA molecule or a second protein) smaller than a molecule encoded by the first coding sequence. In some embodiments, the second coding sequence encodes a molecule (e.g., an RNA molecule or a second protein) larger than a molecule encoded by the first coding sequence. In some embodiments, the second coding sequence encodes a molecule (e.g., an RNA molecule or a second protein) having a substantially equal size to a molecule encoded by the first coding sequence.
[0172] In some embodiments, the coding sequence is expressed in a target cell. In some embodiments, the target cell is a lung cell, a pancreatic cell, a kidney cell, a muscle cell, a liver cell, a retinal cell, a neuron, a glial cell, an endothelial cell, or an epithelial cell. For example, in some embodiments, the target cell is a lung cell. In some embodiments, the target cell is a pancreatic cell. In some embodiments, the target cell is a kidney cell. In some embodiments, the target cell is a muscle cell. In some embodiments, the target cell is a liver cell. In some embodiments, the target cell is a retinal cell. In some embodiments, the target cell is a neuron. In some embodiments, the target cell is a glial cell. In some embodiments, the target cell is an endothelial cell. In some embodiments, the target cell is an epithelial cell.
A. Codon Optimization
[0173] The coding sequences described herein can be codon optimized variants of a nucleic acid sequence of a gene or RNA equivalent thereof encoding a protein of interest so as to achieve, for instance, enhanced expression of the protein in a particular cell type (e.g., a lung cell, a pancreatic cell, a kidney cell, a muscle cell, a liver cell, a retinal cell, a neuron, a glial cell, an endothelial cell, or an epithelial cell). For example, genes and RNA equivalents thereof can be optimized for tissue-specific expression of an encoded protein. Optimized genes and RNA equivalents thereof can be synthesized by methods known in the art, such as chemical synthesis techniques, and may be amplified, for instance, using polymerase chain reaction (PCR)-based amplification methods or by transfection of the gene into a cell, such as a bacterial cell or mammalian cell capable of replicating exogenous nucleic acids.
(i) Increasing quantity of high-frequency codons
[0174] For example, one of skill in the art can design variants of the target gene that contain greater quantities of high-frequency codons within the target organism of interest. For instance, after enhancing the protein-encoding gene sequence by incorporating codon substitutions that minimize the sequence identity of the coding strand of the target gene relative to the coding strands of genes expressed at high levels within the target cell, one of skill in the art can subsequently modify the designed coding sequence so as to increase the quantity of codons that frequently occur in endogenous genes within the target organism (e.g., a mammal, such as a human). For example, codons that have increased GC content tend to be employed more frequently in protein-coding genes.
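As a non-limiting illustration of increasing the quantity of high-frequency codons, the following minimal Python sketch counts codon usage in a user-supplied set of reference coding sequences and re-encodes a target coding sequence with the most frequent synonymous codon for each amino acid. The function names (`codon_counts`, `preferred_codons`, `recode`) and the choice of a simple "most frequent codon" rule are assumptions made for this example and are not taken from the disclosure; the reference set is a placeholder that a user would replace with coding sequences from the target organism or cell type.

```python
from collections import Counter
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def split_codons(cds):
    """Split an in-frame DNA coding sequence into codons."""
    cds = cds.upper()
    if len(cds) % 3 != 0:
        raise ValueError("coding sequence length must be a multiple of 3")
    return [cds[i:i + 3] for i in range(0, len(cds), 3)]


def translate(cds):
    """Translate a coding sequence; '*' marks a stop codon."""
    return "".join(CODON_TABLE[c] for c in split_codons(cds))


def codon_counts(reference_cds):
    """Count codon occurrences across reference coding sequences."""
    counts = Counter()
    for cds in reference_cds:
        counts.update(split_codons(cds))
    return counts


def preferred_codons(counts):
    """Pick, for each amino acid, the codon seen most often in the reference."""
    best = {}
    for codon, aa in CODON_TABLE.items():
        if aa not in best or counts[codon] > counts[best[aa]]:
            best[aa] = codon
    return best


def recode(cds, counts):
    """Re-encode a coding sequence with the preferred synonymous codons."""
    best = preferred_codons(counts)
    return "".join(best[CODON_TABLE[c]] for c in split_codons(cds))


if __name__ == "__main__":
    reference = ["ATGCTGCTGGCCGCCTAA"]   # placeholder reference set
    target = "ATGTTAGCTTAA"              # encodes M-L-A-stop
    optimized = recode(target, codon_counts(reference))
    print(optimized, translate(optimized) == translate(target))
```

Because only synonymous codons are exchanged, the encoded amino acid sequence is unchanged; in practice the rule would typically be combined with other constraints, such as those on CpG and homopolymer content.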
(ii) Reducing CpG content and homopolymer content
[0175] Alternatively, or in addition to the above, one of skill in the art can manipulate the protein-encoding gene sequence of a target gene by incorporating codon substitutions that diminish the CpG content and/or homopolymer content of the gene. For instance, one can begin with a wild-type gene sequence and introduce substitutions (e.g., single-nucleotide substitutions) that reduce the CpG content and/or homopolymer content of the gene while preserving the identity of the encoded protein sequence. One can then, for example, obtain a gene sequence that minimally resembles the genes encoded in a cell type of interest. Alternatively, one can begin with a sequence that has been codon optimized and subsequently can be manipulated by the introduction of mutations (e.g., single-nucleotide substitutions) that reduce the CpG content
and/or homopolymer content of the gene. Once designed, the final codon optimized gene can be prepared, for instance, by solid phase nucleic acid procedures known in the art. Additionally, the prepared gene can be amplified, for instance, using PCR-based techniques described herein or known in the art, and/or by transformation of cells with a plasmid containing the designed gene.
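As a non-limiting illustration of reducing CpG content and homopolymer content while preserving the encoded protein sequence, the following minimal Python sketch measures both quantities and applies a greedy, codon-by-codon synonymous substitution. The greedy strategy and the function names (`cpg_count`, `longest_homopolymer`, `reduce_cpg`) are assumptions made for this example; the disclosure does not prescribe a particular algorithm.

```python
from itertools import groupby, product

# Standard genetic code, codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def cpg_count(seq):
    """Number of CpG dinucleotides (a C immediately followed by a G)."""
    return seq.upper().count("CG")


def longest_homopolymer(seq):
    """Length of the longest run of a single repeated nucleotide."""
    return max((len(list(run)) for _, run in groupby(seq.upper())), default=0)


def translate(cds):
    """Translate an in-frame coding sequence; '*' marks a stop codon."""
    return "".join(CODON_TABLE[cds[i:i + 3]] for i in range(0, len(cds), 3))


def reduce_cpg(cds):
    """Greedily swap synonymous codons to lower CpG count, then run length."""
    cds = cds.upper()
    if len(cds) % 3 != 0:
        raise ValueError("coding sequence length must be a multiple of 3")
    codons = [cds[i:i + 3] for i in range(0, len(cds), 3)]
    for i, codon in enumerate(codons):
        aa = CODON_TABLE[codon]
        # Keep the current codon first so ties leave the sequence unchanged.
        options = [codon] + [c for c, a in CODON_TABLE.items()
                             if a == aa and c != codon]

        def score(candidate):
            trial = "".join(codons[:i] + [candidate] + codons[i + 1:])
            return (cpg_count(trial), longest_homopolymer(trial))

        codons[i] = min(options, key=score)
    return "".join(codons)


if __name__ == "__main__":
    original = "ATGGCGCGTTAA"            # encodes M-A-R-stop, two CpG sites
    edited = reduce_cpg(original)
    print(cpg_count(original), cpg_count(edited))
    print(translate(original) == translate(edited))
```

Scoring the whole trial sequence, rather than the codon alone, lets the sketch account for CpG dinucleotides that span a codon boundary. A single greedy pass is not guaranteed to reach the global minimum.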
(iii) Exemplary codon optimized coding sequences
[0176] In some embodiments, the one or more (e.g., two) coding sequences of the disclosure encodes one or more codon optimized genes selected from the non-limiting list of: CFTR, ATP2B, ATP7A, AGL, CPS1, A1AT, ALPL, ARSA, BBS1, BEST1, CAH, CFH, CFI, CHM, CLN2, CLN7, CNGA3, CYP46A1, F9, FKRP, FMR1, FMRP, FOXG1, GAD, GALC, GALGT2, GBA1, GBE1, GLB1, GRN, HEXA, HTRA1, IDS, IDUA, LAMP2, LCA5, MECP2, MFN2, MMUT, MTM1, NAGLU, ND4, PAH, PIGA, PRKN, RPE65, SERPING1, SGSH, SLC13A5, and SLC6A1. For example, in some embodiments, a coding sequence encodes a codon optimized variant of CFTR. In some embodiments, a coding sequence encodes a codon optimized variant of ATP2B. In some embodiments, a coding sequence encodes a codon optimized variant of ATP7A. In some embodiments, a coding sequence encodes a codon optimized variant of AGL. In some embodiments, a coding sequence encodes a codon optimized variant of CPS1. In some embodiments, a coding sequence encodes a codon optimized variant of A1AT. In some embodiments, a coding sequence encodes a codon optimized variant of ALPL. In some embodiments, a coding sequence encodes a codon optimized variant of ARSA. In some embodiments, a coding sequence encodes a codon optimized variant of BBS1. In some embodiments, a coding sequence encodes a codon optimized variant of BEST1. In some embodiments, a coding sequence encodes a codon optimized variant of CAH. In some embodiments, a coding sequence encodes a codon optimized variant of CFH. In some embodiments, a coding sequence encodes a codon optimized variant of CFI. In some embodiments, a coding sequence encodes a codon optimized variant of CHM. In some embodiments, a coding sequence encodes a codon optimized variant of CLN2. In some embodiments, a coding sequence encodes a codon optimized variant of CLN7. In some embodiments, a coding sequence encodes a codon optimized variant of CNGA3. In some embodiments, a coding sequence encodes a codon optimized variant of CYP46A1.
In some embodiments, a coding sequence encodes a codon optimized variant of F9. In some embodiments, a coding sequence encodes a codon optimized variant of FKRP. In some embodiments, a coding sequence encodes a codon optimized variant of FMR1. In some embodiments, a coding sequence encodes a codon optimized variant of FMRP. In some
embodiments, a coding sequence encodes a codon optimized variant of FOXG1. In some embodiments, a coding sequence encodes a codon optimized variant of GAD. In some embodiments, a coding sequence encodes a codon optimized variant of GALC. In some embodiments, a coding sequence encodes a codon optimized variant of GALGT2. In some embodiments, a coding sequence encodes a codon optimized variant of GBA1. In some embodiments, a coding sequence encodes a codon optimized variant of GBE1. In some embodiments, a coding sequence encodes a codon optimized variant of GLB1. In some embodiments, a coding sequence encodes a codon optimized variant of GRN. In some embodiments, a coding sequence encodes a codon optimized variant of HEXA. In some embodiments, a coding sequence encodes a codon optimized variant of HTRA1. In some embodiments, a coding sequence encodes a codon optimized variant of IDS. In some embodiments, a coding sequence encodes a codon optimized variant of IDUA. In some embodiments, a coding sequence encodes a codon optimized variant of LAMP2. In some embodiments, a coding sequence encodes a codon optimized variant of LCA5. In some embodiments, a coding sequence encodes a codon optimized variant of MECP2. In some embodiments, a coding sequence encodes a codon optimized variant of MFN2. In some embodiments, a coding sequence encodes a codon optimized variant of MMUT. In some embodiments, a coding sequence encodes a codon optimized variant of MTM1. In some embodiments, a coding sequence encodes a codon optimized variant of NAGLU. In some embodiments, a coding sequence encodes a codon optimized variant of ND4. In some embodiments, a coding sequence encodes a codon optimized variant of PAH. In some embodiments, a coding sequence encodes a codon optimized variant of PIGA. In some embodiments, a coding sequence encodes a codon optimized variant of PRKN. In some embodiments, a coding sequence encodes a codon optimized variant of RPE65.
In some embodiments, a coding sequence encodes a codon optimized variant of SERPING1. In some embodiments, a coding sequence encodes a codon optimized variant of SGSH. In some embodiments, a coding sequence encodes a codon optimized variant of SLC13A5. In some embodiments, a coding sequence encodes a codon optimized variant of SLC6A1.
[0177] In some embodiments, a coding sequence encoding a codon optimized variant of CFTR has a nucleotide sequence having at least 85%, at least 90%, at least 91%, at least 92%, at least 93%, at least 94%, at least 95%, at least 96%, at least 97%, at least 98%, at least 99%, or 100% sequence identity to any one of SEQ ID NOs: 819-826. In some embodiments, a coding sequence encoding a codon optimized variant of CFTR has a nucleotide sequence having at least 85% sequence identity to any one of SEQ ID NOs: 819-826. In some embodiments, a coding
sequence encoding a codon optimized variant of CFTR has a nucleotide sequence having at least 90% sequence identity to any one of SEQ ID NOs: 819-826. In some embodiments, a coding sequence encoding a codon optimized variant of CFTR has a nucleotide sequence having at least 91% sequence identity to any one of SEQ ID NOs: 819-826. In some embodiments, a coding sequence encoding a codon optimized variant of CFTR has a nucleotide sequence having at least 92% sequence identity to any one of SEQ ID NOs: 819-826. In some embodiments, a coding sequence encoding a codon optimized variant of CFTR has a nucleotide sequence having at least 93% sequence identity to any one of SEQ ID NOs: 819-826. In some embodiments, a coding sequence encoding a codon optimized variant of CFTR has a nucleotide sequence having at least 94% sequence identity to any one of SEQ ID NOs: 819-826. In some embodiments, a coding sequence encoding a codon optimized variant of CFTR has a nucleotide sequence having at least 95% sequence identity to any one of SEQ ID NOs: 819-826. In some embodiments, a coding sequence encoding a codon optimized variant of CFTR has a nucleotide sequence having at least 96% sequence identity to any one of SEQ ID NOs: 819-826. In some embodiments, a coding sequence encoding a codon optimized variant of CFTR has a nucleotide sequence having at least 97% sequence identity to any one of SEQ ID NOs: 819-826. In some embodiments, a coding sequence encoding a codon optimized variant of CFTR has a nucleotide sequence having at least 98% sequence identity to any one of SEQ ID NOs: 819-826. In some embodiments, a coding sequence encoding a codon optimized variant of CFTR has a nucleotide sequence having at least 99% sequence identity to any one of SEQ ID NOs: 819-826. In some embodiments, a coding sequence encoding a codon optimized variant of CFTR has a nucleotide sequence having 100% sequence identity to any one of SEQ ID NOs: 819-826.
[0178] In some embodiments, a coding sequence encoding a codon optimized variant of MTM1 has a nucleotide sequence having at least 85%, at least 90%, at least 91%, at least 92%, at least 93%, at least 94%, at least 95%, at least 96%, at least 97%, at least 98%, at least 99%, or 100% sequence identity to any one of SEQ ID NOs: 827-836. In some embodiments, a coding sequence encoding a codon optimized variant of MTM1 has a nucleotide sequence having at least 85% sequence identity to any one of SEQ ID NOs: 827-836. In some embodiments, a coding sequence encoding a codon optimized variant of MTM1 has a nucleotide sequence having at least 90% sequence identity to any one of SEQ ID NOs: 827-836. In some embodiments, a coding sequence encoding a codon optimized variant of MTM1 has a nucleotide sequence having at least 91% sequence identity to any one of SEQ ID NOs: 827-836. In some embodiments, a coding sequence encoding a codon optimized variant of MTM1 has a nucleotide sequence having at least 92% sequence identity to any one of SEQ ID NOs: 827-836. In some embodiments, a
coding sequence encoding a codon optimized variant of MTM1 has a nucleotide sequence having at least 93% sequence identity to any one of SEQ ID NOs: 827-836. In some embodiments, a coding sequence encoding a codon optimized variant of MTM1 has a nucleotide sequence having at least 94% sequence identity to any one of SEQ ID NOs: 827-836. In some embodiments, a coding sequence encoding a codon optimized variant of MTM1 has a nucleotide sequence having at least 95% sequence identity to any one of SEQ ID NOs: 827-836. In some embodiments, a coding sequence encoding a codon optimized variant of MTM1 has a nucleotide sequence having at least 96% sequence identity to any one of SEQ ID NOs: 827-836. In some embodiments, a coding sequence encoding a codon optimized variant of MTM1 has a nucleotide sequence having at least 97% sequence identity to any one of SEQ ID NOs: 827-836. In some embodiments, a coding sequence encoding a codon optimized variant of MTM1 has a nucleotide sequence having at least 98% sequence identity to any one of SEQ ID NOs: 827-836. In some embodiments, a coding sequence encoding a codon optimized variant of MTM1 has a nucleotide sequence having at least 99% sequence identity to any one of SEQ ID NOs: 827-836. In some embodiments, a coding sequence encoding a codon optimized variant of MTM1 has a nucleotide sequence having 100% sequence identity to any one of SEQ ID NOs: 827-836.
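As a non-limiting illustration of how a percent sequence identity threshold such as those recited above might be evaluated, the following minimal Python sketch aligns two nucleotide sequences globally (Needleman-Wunsch) and reports identical positions as a percentage of alignment columns. The scoring parameters, the choice of alignment length as the denominator, and the function names (`percent_identity`, `meets_threshold`) are assumptions made for this example; other definitions of sequence identity and other alignment tools may give different values, and the definition set out elsewhere in the disclosure controls.

```python
def percent_identity(a, b, match=1, mismatch=-1, gap=-1):
    """Percent identity over a global alignment of sequences a and b."""
    a, b = a.upper(), b.upper()
    n, m = len(a), len(b)

    def sub(i, j):
        # Substitution score for aligning a[i-1] with b[j-1].
        return match if a[i - 1] == b[j - 1] else mismatch

    # Fill the dynamic-programming score matrix.
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            score[i][j] = max(score[i - 1][j - 1] + sub(i, j),
                              score[i - 1][j] + gap,
                              score[i][j - 1] + gap)

    # Trace back one optimal alignment, counting matches and columns.
    i, j, matches, columns = n, m, 0, 0
    while i > 0 or j > 0:
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + sub(i, j):
            matches += a[i - 1] == b[j - 1]
            i, j = i - 1, j - 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            i -= 1
        else:
            j -= 1
        columns += 1
    return 100.0 * matches / columns if columns else 100.0


def meets_threshold(candidate, reference, threshold=85.0):
    """True if candidate is at least `threshold` percent identical to reference."""
    return percent_identity(candidate, reference) >= threshold


if __name__ == "__main__":
    reference = "ATGGCCTAA"   # placeholder, not a sequence from the disclosure
    candidate = "ATGGCTTAA"   # one substitution relative to the reference
    print(round(percent_identity(candidate, reference), 2))
    print(meets_threshold(candidate, reference, 85.0))
```

The full score matrix uses memory proportional to the product of the two sequence lengths, so for coding sequences of several kilobases an established alignment tool would normally be preferred over this sketch.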
[0179] In some embodiments, the one or more (e.g., two) coding sequences of the disclosure encodes one or more codon optimized genes selected from the non-limiting list of: F8, F9, PIGA, SGSH, G6PC, NAGLU, CLN3, GBA, IDS, GAA, OTC, GLA, CAH, IDUA, LAMP2, CLN1, ATP7B, A1AT, GALT, LMNA, ENPP1, CLN2, CLN5, CLN7/MFSD8, AGU, MMUT, NPC2, ABCB11, ABCB4, ASS1, SMN1, AADC, MTM1, GBA1, GRN, GAD, GALGT2, SGCB, GDNF, ASPA, GLB1, GALC, SGCA, DYSF, HEXA, GAN, FXN, ARSA, MECP2, IGHMBP2, UBE3A, CDKL5, PGRN, FKRP, CYP46A1, OPMD, Cav1, neuropeptide Y/Y2, SCN1A, SHANK3, APOE2(R158C), FMR1, UPF1, CMT4J, MFN2, PRKN, CAPN3, NTF3, ANO5, SGCG, EMD, SURF1, GBE1, FMRP, RPE65, RPGR, CHM, ND4, CNGB3, PDE6b, CFI, CNGA3, GUCY2D, RLBP1, CD59, OPN1LW, CFH, MYO7A, RS1, ABCA4, ND1, BEST1, RHO, LCA5, RDH12, NMNAT1, SERPING1, AQP1, PPP1R1A, IL-1Ra, CFTR, OTOF, CLRN1, GJB2, ALPL, TMC1, STRC, ATOH1, and MYBPC3, or a functional fragment of a variant thereof.
[0180] For example, in some embodiments, the one or more (e.g., two) coding sequences of the disclosure encodes codon optimized F8 or a functional fragment of a variant thereof. In some embodiments, the one or more coding sequences of the disclosure encodes codon optimized F9 or a functional fragment of a variant thereof. In some embodiments, the one or more coding sequences of the disclosure encodes codon optimized PIGA or a functional fragment of a variant thereof. In some embodiments, the one or more coding sequences of the disclosure encodes
codon optimized SGSH or a functional fragment of a variant thereof. In some embodiments, the one or more coding sequences of the disclosure encodes codon optimized G6PC or a functional fragment of a variant thereof. In some embodiments, the one or more coding sequences of the disclosure encodes codon optimized NAGLU or a functional fragment of a variant thereof. In some embodiments, the one or more coding sequences of the disclosure encodes codon optimized CLN3 or a functional fragment of a variant thereof. In some embodiments, the one or more coding sequences of the disclosure encodes codon optimized GBA or a functional fragment of a variant thereof. In some embodiments, the one or more coding sequences of the disclosure encodes codon optimized IDS or a functional fragment of a variant thereof. In some embodiments, the one or more coding sequences of the disclosure encodes codon optimized GAA or a functional fragment of a variant thereof. In some embodiments, the one or more coding sequences of the disclosure encodes codon optimized OTC or a functional fragment of a variant thereof. In some embodiments, the one or more coding sequences of the disclosure encodes codon optimized GLA or a functional fragment of a variant thereof. In some embodiments, the one or more coding sequences of the disclosure encodes codon optimized CAH or a functional fragment of a variant thereof. In some embodiments, the one or more coding sequences of the disclosure encodes codon optimized IDUA or a functional fragment of a variant thereof. In some embodiments, the one or more coding sequences of the disclosure encodes codon optimized LAMP2 or a functional fragment of a variant thereof. In some embodiments, the one or more coding sequences of the disclosure encodes codon optimized CLN1 or a functional fragment of a variant thereof. In some embodiments, the one or more coding sequences of the disclosure encodes codon optimized ATP7B or a functional fragment of a variant thereof.
In some embodiments, the one or more coding sequences of the disclosure encodes codon optimized A1AT or a functional fragment of a variant thereof. In some embodiments, the one or more coding sequences of the disclosure encodes codon optimized GALT or a functional fragment of a variant thereof. In some embodiments, the one or more coding sequences of the disclosure encodes codon optimized LMNA or a functional fragment of a variant thereof. In some embodiments, the one or more coding sequences of the disclosure encodes codon optimized ENPP1 or a functional fragment of a variant thereof. In some embodiments, the one or more coding sequences of the disclosure encodes codon optimized CLN2 or a functional fragment of a variant thereof. In some embodiments, the one or more coding sequences of the disclosure encodes codon optimized CLN5 or a functional fragment of a variant thereof. In some embodiments, the one or more coding sequences of the disclosure encodes codon optimized CLN7/MFSD8 or a functional fragment of a variant thereof. In some embodiments, the one or
more coding sequences of the disclosure encodes codon optimized AGU or a functional fragment of a variant thereof. In some embodiments, the one or more coding sequences of the disclosure encodes codon optimized MMUT or a functional fragment of a variant thereof. In some embodiments, the one or more coding sequences of the disclosure encodes codon optimized NPC2 or a functional fragment of a variant thereof. In some embodiments, the one or more coding sequences of the disclosure encodes codon optimized ABCB11 or a functional fragment of a variant thereof. In some embodiments, the one or more coding sequences of the disclosure encodes codon optimized ABCB4 or a functional fragment of a variant thereof. In some embodiments, the one or more coding sequences of the disclosure encodes codon optimized ASS1 or a functional fragment of a variant thereof. In some embodiments, the one or more coding sequences of the disclosure encodes codon optimized SMN1 or a functional fragment of a variant thereof. In some embodiments, the one or more coding sequences of the disclosure encodes codon optimized AADC or a functional fragment of a variant thereof. In some embodiments, the one or more coding sequences of the disclosure encodes codon optimized MTM1 or a functional fragment of a variant thereof (e.g., a nucleotide sequence having at least 85%, at least 90%, at least 91%, at least 92%, at least 93%, at least 94%, at least 95%, at least 96%, at least 97%, at least 98%, at least 99%, or 100% sequence identity to any one of SEQ ID NOs: 827-836) . In some embodiments, the one or more coding sequences of the disclosure encodes codon optimized GBA1 or a functional fragment of a variant thereof. In some embodiments, the one or more coding sequences of the disclosure encodes codon optimized GRN or a functional fragment of a variant thereof. In some embodiments, the one or more coding sequences of the disclosure encodes codon optimized GAD or a functional fragment of a variant thereof. 
In some embodiments, the one or more coding sequences of the disclosure encodes codon optimized GALGT2 or a functional fragment of a variant thereof. In some embodiments, the one or more coding sequences of the disclosure encodes codon optimized SGCB or a functional fragment of a variant thereof. In some embodiments, the one or more coding sequences of the disclosure encodes codon optimized GDNF or a functional fragment of a variant thereof. In some embodiments, the one or more coding sequences of the disclosure encodes codon optimized ASPA or a functional fragment of a variant thereof. In some embodiments, the one or more coding sequences of the disclosure encodes codon optimized GLB1 or a functional fragment of a variant thereof. In some embodiments, the one or more coding sequences of the disclosure encodes codon optimized GALC or a functional fragment of a variant thereof. In some embodiments, the one or more coding sequences of the disclosure encodes codon optimized SGCA or a functional fragment of a variant thereof. In some
embodiments, the one or more coding sequences of the disclosure encodes codon optimized DYSF or a functional fragment of a variant thereof. In some embodiments, the one or more coding sequences of the disclosure encodes codon optimized HEXA or a functional fragment of a variant thereof. In some embodiments, the one or more coding sequences of the disclosure encodes codon optimized GAN or a functional fragment of a variant thereof. In some embodiments, the one or more coding sequences of the disclosure encodes codon optimized FXN or a functional fragment of a variant thereof. In some embodiments, the one or more coding sequences of the disclosure encodes codon optimized ARSA or a functional fragment of a variant thereof. In some embodiments, the one or more coding sequences of the disclosure encodes codon optimized MECP2 or a functional fragment of a variant thereof. In some embodiments, the one or more coding sequences of the disclosure encodes codon optimized IGHMBP2 or a functional fragment of a variant thereof. In some embodiments, the one or more coding sequences of the disclosure encodes codon optimized UBE3A or a functional fragment of a variant thereof. In some embodiments, the one or more coding sequences of the disclosure encodes codon optimized CDKL5 or a functional fragment of a variant thereof. In some embodiments, the one or more coding sequences of the disclosure encodes codon optimized PGRN or a functional fragment of a variant thereof. In some embodiments, the one or more coding sequences of the disclosure encodes codon optimized FKRP or a functional fragment of a variant thereof. In some embodiments, the one or more coding sequences of the disclosure encodes codon optimized CYP46A1 or a functional fragment of a variant thereof. In some embodiments, the one or more coding sequences of the disclosure encodes codon optimized OPMD or a functional fragment of a variant thereof. 
In some embodiments, the one or more coding sequences of the disclosure encodes codon optimized Cav1 or a functional fragment of a variant thereof. In some embodiments, the one or more coding sequences of the disclosure encodes codon optimized neuropeptide Y/Y2 or a functional fragment of a variant thereof. In some embodiments, the one or more coding sequences of the disclosure encodes codon optimized SCN1A or a functional fragment of a variant thereof. In some embodiments, the one or more coding sequences of the disclosure encodes codon optimized SHANK3 or a functional fragment of a variant thereof. In some embodiments, the one or more coding sequences of the disclosure encodes codon optimized APOE2(R158C) or a functional fragment of a variant thereof. In some embodiments, the one or more coding sequences of the disclosure encodes codon optimized FMR1 or a functional fragment of a variant thereof. In some embodiments, the one or more coding sequences of the disclosure encodes codon optimized UPF1 or a functional fragment of a variant thereof. In some embodiments, the one or more coding sequences of the
disclosure encodes codon optimized CMT4J or a functional fragment of a variant thereof. In some embodiments, the one or more coding sequences of the disclosure encodes codon optimized MFN2 or a functional fragment of a variant thereof. In some embodiments, the one or more coding sequences of the disclosure encodes codon optimized PRKN or a functional fragment of a variant thereof. In some embodiments, the one or more coding sequences of the disclosure encodes codon optimized CAPN3 or a functional fragment of a variant thereof. In some embodiments, the one or more coding sequences of the disclosure encodes codon optimized NTF3 or a functional fragment of a variant thereof. In some embodiments, the one or more coding sequences of the disclosure encodes codon optimized ANO5 or a functional fragment of a variant thereof. In some embodiments, the one or more coding sequences of the disclosure encodes codon optimized SGCG or a functional fragment of a variant thereof. In some embodiments, the one or more coding sequences of the disclosure encodes codon optimized EMD or a functional fragment of a variant thereof. In some embodiments, the one or more coding sequences of the disclosure encodes codon optimized SURF1 or a functional fragment of a variant thereof. In some embodiments, the one or more coding sequences of the disclosure encodes codon optimized GBE1 or a functional fragment of a variant thereof. In some embodiments, the one or more coding sequences of the disclosure encodes codon optimized FMRP or a functional fragment of a variant thereof. In some embodiments, the one or more coding sequences of the disclosure encodes codon optimized RPE65 or a functional fragment of a variant thereof. In some embodiments, the one or more coding sequences of the disclosure encodes codon optimized RPGR or a functional fragment of a variant thereof.
In some embodiments, the one or more coding sequences of the disclosure encodes codon optimized CHM or a functional fragment of a variant thereof. In some embodiments, the one or more coding sequences of the disclosure encodes codon optimized ND4 or a functional fragment of a variant thereof. In some embodiments, the one or more coding sequences of the disclosure encodes codon optimized CNGB3 or a functional fragment of a variant thereof. In some embodiments, the one or more coding sequences of the disclosure encodes codon optimized PDE6b or a functional fragment of a variant thereof. In some embodiments, the one or more coding sequences of the disclosure encodes codon optimized CFI or a functional fragment of a variant thereof. In some embodiments, the one or more coding sequences of the disclosure encodes codon optimized CNGA3 or a functional fragment of a variant thereof. In some embodiments, the one or more coding sequences of the disclosure encodes codon optimized
GUCY2D or a functional fragment of a variant thereof. In some embodiments, the one or more coding sequences of the disclosure encodes codon optimized RLBP1 or a functional fragment of
a variant thereof. In some embodiments, the one or more coding sequences of the disclosure encodes codon optimized CD59 or a functional fragment of a variant thereof. In some embodiments, the one or more coding sequences of the disclosure encodes codon optimized OPN1LW or a functional fragment of a variant thereof. In some embodiments, the one or more coding sequences of the disclosure encodes codon optimized CFH or a functional fragment of a variant thereof. In some embodiments, the one or more coding sequences of the disclosure encodes codon optimized MYO7A or a functional fragment of a variant thereof. In some embodiments, the one or more coding sequences of the disclosure encodes codon optimized RS1 or a functional fragment of a variant thereof. In some embodiments, the one or more coding sequences of the disclosure encodes codon optimized ABCA4 or a functional fragment of a variant thereof. In some embodiments, the one or more coding sequences of the disclosure encodes codon optimized ND1 or a functional fragment of a variant thereof. In some embodiments, the one or more coding sequences of the disclosure encodes codon optimized BEST1 or a functional fragment of a variant thereof. In some embodiments, the one or more coding sequences of the disclosure encodes codon optimized RHO or a functional fragment of a variant thereof. In some embodiments, the one or more coding sequences of the disclosure encodes codon optimized LCA5 or a functional fragment of a variant thereof. In some embodiments, the one or more coding sequences of the disclosure encodes codon optimized RDH12 or a functional fragment of a variant thereof. In some embodiments, the one or more coding sequences of the disclosure encodes codon optimized NMNAT1 or a functional fragment of a variant thereof. In some embodiments, the one or more coding sequences of the disclosure encodes codon optimized SERPING1 or a functional fragment of a variant thereof.
In some embodiments, the one or more coding sequences of the disclosure encodes codon optimized AQP1 or a functional fragment of a variant thereof. In some embodiments, the one or more coding sequences of the disclosure encodes codon optimized PPP 1R1A or a functional fragment of a variant thereof. In some embodiments, the one or more coding sequences of the disclosure encodes codon optimized IL-IRa or a functional fragment of a variant thereof. In some embodiments, the one or more coding sequences of the disclosure encodes codon optimized CFTR or a functional fragment of a variant thereof (e.g., a nucleotide sequence having at least 85%, at least 90%, at least 91%, at least 92%, at least 93%, at least 94%, at least 95%, at least 96%, at least 97%, at least 98%, at least 99%, or 100% sequence identity to any one of SEQ ID NOs: 819-826). In some embodiments, the one or more coding sequences of the disclosure encodes codon optimized OTOF or a functional fragment of a variant thereof. In some embodiments, the one or more coding sequences of the disclosure encodes codon optimized
CLRN1 or a functional fragment of a variant thereof. In some embodiments, the one or more coding sequences of the disclosure encodes codon optimized GJB2 or a functional fragment of a variant thereof. In some embodiments, the one or more coding sequences of the disclosure encodes codon optimized ALPL or a functional fragment of a variant thereof. In some embodiments, the one or more coding sequences of the disclosure encodes codon optimized TMC1 or a functional fragment of a variant thereof. In some embodiments, the one or more coding sequences of the disclosure encodes codon optimized STRC or a functional fragment of a variant thereof. In some embodiments, the one or more coding sequences of the disclosure encodes codon optimized A T0H1 or a functional fragment of a variant thereof. In some embodiments, the one or more coding sequences of the disclosure encodes codon optimized MYBPC3.
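The sequence identity thresholds recited above for codon optimized CFTR (e.g., at least 85% identity to any one of SEQ ID NOs: 819-826) can be illustrated with a minimal sketch. The sketch assumes the two nucleotide sequences have already been aligned to equal length, with "-" marking gaps, and computes identity over all alignment columns. The disclosure does not specify an alignment algorithm or identity convention at this point, so the function names and the toy sequences below are hypothetical and are not taken from the sequence listing.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity of two pre-aligned sequences of equal length.

    Assumption: identity is counted over every alignment column, and a
    column containing a gap ('-') never counts as a match.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("sequences must be aligned to the same length")
    if not aligned_a:
        raise ValueError("sequences must not be empty")
    matches = sum(
        1
        for x, y in zip(aligned_a.upper(), aligned_b.upper())
        if x == y and x != "-"
    )
    return 100.0 * matches / len(aligned_a)


def meets_identity_threshold(aligned_a: str, aligned_b: str, threshold: float = 85.0) -> bool:
    """True if the aligned pair reaches the given percent identity."""
    return percent_identity(aligned_a, aligned_b) >= threshold


# Toy example (hypothetical sequences): 9 of 10 columns match.
reference = "ATGCATGCAT"
candidate = "ATGCATGCAA"
identity = percent_identity(reference, candidate)  # 90.0
```

In practice an alignment tool would produce the aligned strings; the sketch only shows how a recited percentage maps onto matched positions.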
V. Methods for the Delivery of Exogenous Nucleic Acids to Target Cells
[0181] A compact bidirectional promoter provided herein can be selected to express the selected coding sequence in a desired target cell. For example, the disclosure herein provides a method of expressing a heterologous coding sequence in a cell (e.g., a lung cell, a pancreatic cell, a kidney cell, a muscle cell, a liver cell, a retinal cell, a neuron, a glial cell, an endothelial cell, or an epithelial cell), the method including transfecting the cell with any of the described expression constructs, such as with the methods described herein.
[0182] The disclosure also provides a method of expressing at least one heterologous coding sequence in a target cell, the method including introducing into a subject a nucleic acid (e.g., with the methods described in this section) including a compact bidirectional promoter operably linked to at least one heterologous coding sequence, wherein the compact bidirectional promoter is less than about 1000 bp (e.g., less than about 800 bp, less than about 600 bp, less than about 400 bp, or less than about 200 bp), and wherein the bidirectional promoter is capable of promoting transcription of two coding sequences positioned on opposite sides of the promoter in the cell.
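As a rough illustration of the geometry described in [0182], the following sketch lays out a construct in which two coding sequences sit on opposite sides of a promoter and checks the promoter against a size limit. It assumes the upstream coding sequence is written in reverse-complement orientation on the top strand, which is the usual arrangement for divergent transcription. The function names, the toy sequences, and the treatment of "less than about 1000 bp" as a strict cutoff at 1000 bp are all assumptions made for illustration and are not taken from the disclosure.

```python
# Translation table for complementing DNA bases (upper and lower case).
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def build_bidirectional_construct(
    cds_a: str, promoter: str, cds_b: str, max_promoter_bp: int = 1000
) -> str:
    """Top-strand layout: revcomp(cds_a) + promoter + cds_b.

    cds_a is transcribed leftward from the promoter (so it appears as its
    reverse complement on the top strand); cds_b is transcribed rightward.
    Assumption: the size limit is applied as a strict "< max_promoter_bp".
    """
    if len(promoter) >= max_promoter_bp:
        raise ValueError(
            f"promoter is {len(promoter)} bp; expected < {max_promoter_bp} bp"
        )
    return reverse_complement(cds_a) + promoter + cds_b


# Toy example with hypothetical sequences.
toy_cds_a = "ATGAAATAA"
toy_cds_b = "ATGCCCTGA"
toy_promoter = "GGCC" * 10  # 40 bp placeholder, not a real promoter
toy_construct = build_bidirectional_construct(toy_cds_a, toy_promoter, toy_cds_b)
```

Reading the toy construct left to right, the first coding sequence appears inverted (its start codon is adjacent to the promoter on the bottom strand), which is what lets a single promoter drive both transcripts.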
[0183] In yet another embodiment, the disclosure provides a method of expressing two heterologous coding sequences in different target cells (e.g., a combination of two cell types selected from a lung cell, a pancreatic cell, a kidney cell, a muscle cell, a liver cell, a retinal cell, a neuron, a glial cell, an endothelial cell, and an epithelial cell), the method including introducing into a subject a nucleic acid (e.g., with the methods described in this section) including a compact bidirectional promoter operably linked to the two heterologous coding sequences positioned on opposite sides of the compact bidirectional promoter in the cell,
wherein the compact bidirectional promoter is less than about 1000 bp (e.g., less than about 800 bp, less than about 600 bp, less than about 400 bp, or less than about 200 bp), and wherein the compact bidirectional promoter promotes transcription of one of the coding sequences in a first target cell (e.g., a kidney cell) and promotes transcription of the other coding sequence in a second target cell (e.g., a muscle cell).
[0184] In some embodiments, the promoter comprises a COX15 bidirectional promoter (SEQ ID NO. 80) that expresses one coding sequence in one or more of the tissues shown for gene “COX15” in FIG. 4A-FIG. 4B and the other coding sequence in one or more of the tissues shown for gene “CUTC” in FIG. 4A-FIG. 4B.
[0185] In some embodiments, the promoter comprises a MORN5 bidirectional promoter (SEQ ID NO. 221) that expresses one coding sequence in one or more of the tissues shown for gene “MORN5” in FIG. 11A-FIG. 14 and the other coding sequence in one or more of the tissues shown for gene “NDUFA8” in FIG. 11A-FIG. 14.
[0186] In some embodiments, the promoter comprises an NDUFB9 bidirectional promoter (SEQ ID NO. 339) that expresses one coding sequence in one or more of the tissues shown for gene “NDUFB9” in FIG. 15A-FIG. 18 and the other coding sequence in one or more of the tissues shown for gene “TATDN1” in FIG. 15A-FIG. 18.
[0187] In some embodiments, the promoter comprises an NDUFA7 bidirectional promoter (SEQ ID NO. 220) that expresses one coding sequence in one or more of the tissues shown for gene “NDUFA7” in FIG. 19A-FIG. 22 and the other coding sequence in one or more of the tissues shown for gene “RPS28” in FIG. 19A-FIG. 22.
[0188] In some embodiments, the promoter comprises an ALKBH1 bidirectional promoter (SEQ ID NO. 17) that expresses one coding sequence in one or more of the tissues shown for gene “ALKBH1” in FIG. 23A-FIG. 26 and the other coding sequence in one or more of the tissues shown for gene “SLIRP” in FIG. 23A-FIG. 26. In some embodiments, a coding sequence is expressed in a target cell. In some embodiments, the target cell is a lung cell, a pancreatic cell, a kidney cell, a muscle cell, a liver cell, a retinal cell, a neuron, a glial cell, an endothelial cell, or an epithelial cell. For example, in some embodiments, the target cell is a lung cell. In some embodiments, the target cell is a pancreatic cell. In some embodiments, the target cell is a kidney cell. In some embodiments, the target cell is a muscle cell. In some embodiments, the target cell is a liver cell. In some embodiments, the target cell is a retinal cell. In some embodiments, the target cell is a neuron. In some embodiments, the target cell is a glial cell. In some embodiments, the target cell is an endothelial cell. In some embodiments, the target cell is an epithelial cell. In some embodiments, the target cell is any cell in FIG. 4A-FIG. 6D and FIG. 11A-FIG. 26.
[0189] Techniques that can be used to introduce a nucleic acid molecule into a mammalian cell are well known in the art. For example, electroporation can be used to permeabilize mammalian cells (e.g., human target cells) by the application of an electrostatic potential to the cell of interest. Mammalian cells, such as human cells, subjected to an external electric field in this manner are subsequently predisposed to the uptake of exogenous nucleic acids. Electroporation of mammalian cells is described in detail, e.g., in Chu et al., NUCLEIC ACIDS RESEARCH 15: 1311 (1987), the disclosure of which is incorporated herein by reference. A similar technique, Nucleofection™, utilizes an applied electric field in order to stimulate the uptake of exogenous polynucleotides into the nucleus of a eukaryotic cell.
[0190] Nucleofection™ and protocols useful for performing this technique are described in detail, e.g., in Distler et al., EXPERIMENTAL DERMATOLOGY 14:315 (2005), as well as in US 2010/0317114, the disclosures of each of which are incorporated herein by reference.
[0191] An additional technique useful for the transfection of target cells is squeeze-poration. This technique induces the rapid mechanical deformation of cells in order to stimulate the uptake of exogenous DNA through membranous pores that form in response to the applied stress. This technology is advantageous in that a vector is not required for delivery of nucleic acids into a cell, such as a human target cell. Squeeze-poration is described in detail, e.g., in Sharei et al., JoVE 81:e50980 (2013), the disclosure of which is incorporated herein by reference.
[0192] Lipofection represents another technique useful for transfection of target cells. This method involves the loading of nucleic acids into a liposome, which often presents cationic functional groups, such as quaternary or protonated amines, towards the liposome exterior. This promotes electrostatic interactions between the liposome and a cell due to the anionic nature of the cell membrane, which ultimately leads to uptake of the exogenous nucleic acids, for example, by direct fusion of the liposome with the cell membrane or by endocytosis of the complex. Lipofection is described in detail, for example, in U.S. Patent No. 7,442,386, the disclosure of which is incorporated herein by reference.
[0193] Similar techniques that exploit ionic interactions with the cell membrane to provoke the uptake of foreign nucleic acids include contacting a cell with a cationic polymer-nucleic acid complex. Exemplary cationic molecules that associate with polynucleotides so as to impart a positive charge favorable for interaction with the cell membrane are activated dendrimers (described, e.g., in Dennig, TOPICS IN CURRENT CHEMISTRY 228:227 (2003), the disclosure of
which is incorporated herein by reference), polyethylenimine, and diethylaminoethyl (DEAE)-dextran, the use of which as a transfection agent is described in detail, for example, in Gulick et al., CURRENT PROTOCOLS IN MOLECULAR BIOLOGY 40: 1 :9.2:9.2.1 (1997), the disclosure of which is incorporated herein by reference. Magnetic beads are another tool that can be used to transfect target cells in a mild and efficient manner, as this methodology utilizes an applied magnetic field in order to direct the uptake of nucleic acids. This technology is described in detail, for example, in US 2010/0227406, the disclosure of which is incorporated herein by reference.
[0194] Another useful tool for inducing the uptake of exogenous nucleic acids by target cells is laserfection, also called optical transfection, a technique that involves exposing a cell to electromagnetic radiation of a particular wavelength in order to gently permeabilize the cells and allow polynucleotides to penetrate the cell membrane. The bioactivity of this technique is similar to, and in some cases found superior to, that of electroporation.
[0195] Impalefection is another technique that can be used to deliver genetic material to target cells. It relies on the use of nanomaterials, such as carbon nanofibers, carbon nanotubes, and nanowires.
[0196] Needle-like nanostructures are synthesized perpendicular to the surface of a substrate. DNA containing the gene, intended for intracellular delivery, is attached to the nanostructure surface. A chip with arrays of these needles is then pressed against cells or tissue. Cells that are impaled by nanostructures can express the delivered gene(s). An example of this technique is described in Shalek et al., PNAS 107: 1870 (2010), the disclosure of which is incorporated herein by reference.
[0197] Magnetofection can also be used to deliver nucleic acids to target cells. The magnetofection principle is to associate nucleic acids with cationic magnetic nanoparticles. The magnetic nanoparticles are made of iron oxide, which is fully biodegradable, and coated with specific proprietary cationic molecules that vary with the application. Their association with the gene vectors (DNA, viral vector) is achieved by salt-induced colloidal aggregation and electrostatic interaction. The magnetic particles are then concentrated on the target cells by the influence of an external magnetic field generated by magnets. This technique is described in detail in Scherer et al., GENE THERAPY 9:102 (2002), the disclosure of which is incorporated herein by reference.
[0198] Another useful tool for inducing the uptake of exogenous nucleic acids by target cells is sonoporation, a technique that involves the use of sound (typically ultrasonic frequencies) to modify the permeability of the cell plasma membrane so as to permeabilize the cells and allow
polynucleotides to penetrate the cell membrane. This technique is described in detail, e.g., in Rhodes et al., METHODS IN CELL BIOLOGY 82:309 (2007), the disclosure of which is incorporated herein by reference.
[0199] Microvesicles represent another potential vehicle that can be used to modify the genome of a target cell according to the methods described herein. For example, microvesicles that have been induced by the co-overexpression of the glycoprotein VSV-G with, e.g., a genome-modifying protein, such as a nuclease, can be used to efficiently deliver proteins into a cell that subsequently catalyze the site-specific cleavage of an endogenous polynucleotide sequence so as to prepare the genome of the cell for the covalent incorporation of a polynucleotide of interest, such as a gene or regulatory sequence. The use of such vesicles, also referred to as Gesicles, for the genetic modification of eukaryotic cells is described in detail, e.g., in Quinn et al., Genetic Modification of Target Cells by Direct Delivery of Active Protein [abstract]. In: Methylation changes in early embryonic genes in cancer [abstract], in: Proceedings of the 18th Annual Meeting of the American Society of Gene and Cell Therapy; 2015 May 13, Abstract No. 122.
VI. Nucleic Acid Vectors
[0200] Effective intracellular concentrations of a coding sequence (e.g., a gene) disclosed herein can be achieved via the stable expression of a vector encoding a coding sequence (e.g., by integration into the nuclear or mitochondrial genome of a mammalian cell). In order to introduce such a gene into a mammalian cell, the gene can be incorporated into a vector.
[0201] Vectors can be introduced into a cell by a variety of methods, including transformation, transfection, direct uptake, projectile bombardment, and encapsulation of the vector in a liposome. Examples of suitable methods of transfecting or transforming cells are calcium phosphate precipitation, electroporation, microinjection, infection, lipofection, and direct uptake. Such methods are described in more detail, for example, in Green et al., Molecular Cloning: A Laboratory Manual, Fourth Edition (Cold Spring Harbor University Press, New York (2014)); and Ausubel et al., Current Protocols in Molecular Biology (John Wiley & Sons, New York (2015)), the disclosures of each of which are incorporated herein by reference.
[0202] The genes disclosed herein can also be introduced into a mammalian cell by targeting a vector containing a polynucleotide encoding such a gene to cell membrane phospholipids. For example, vectors can be targeted to the phospholipids on the extracellular surface of the cell membrane by linking the vector molecule to a VSV-G protein, a viral protein with affinity for all cell membrane phospholipids. Such a construct can be produced using conventional and routine methods of the art. In addition to achieving high rates of transcription and translation, stable
expression of an exogenous polynucleotide in a mammalian cell can be achieved by integration of the polynucleotide containing the gene into the nuclear genome of the mammalian cell. A variety of vectors for the delivery and integration of polynucleotides encoding exogenous proteins into the nuclear DNA of a mammalian cell have been developed. Examples of expression vectors are disclosed in, e.g., WO 1994/011026 and are incorporated herein by reference. Expression vectors for use in the compositions and methods described herein contain a polynucleotide sequence that encodes a gene as well as, e.g., additional sequence elements used for the expression of these genes and/or the integration of these polynucleotide sequences into the genome of a mammalian cell. Certain vectors that can be used include plasmids that contain regulatory sequences, such as promoter and enhancer regions, which direct gene transcription. Other useful vectors contain polynucleotide sequences that enhance the rate of translation of these genes or improve the stability or nuclear export of the mRNA that results from gene transcription. These sequence elements include, e.g., 5' and 3' UTR regions, an IRES, and polyA in order to direct efficient transcription of the gene carried on the expression vector. The expression vectors suitable for use with the compositions and methods described herein may also contain a polynucleotide encoding a marker for selection of cells that contain such a vector. Examples of a suitable marker are genes that encode resistance to antibiotics, such as ampicillin, chloramphenicol, kanamycin, and nourseothricin.
[0203] In some embodiments, any of the vectors disclosed herein are capable of inducing at least 20%, at least 50%, at least 100%, at least 150%, at least 200%, at least 250%, at least 300%, at least 400%, at least 500%, at least 700%, at least 900%, at least 1000%, at least 1100%, at least 1500%, or at least 2000% higher expression of CFTR, ATP2B, ATP7A, AGL, CPS1, A1AT, ALPL, ARSA, BBS1, BEST1, CAH, CFH, CFI, CHM, CLN2, CLN7, CNGA3, CYP46A1, F9, FKRP, FMR1, FMRP, FOXG1, GAD, GALC, GALGT2, GBA1, GBE1, GLB1, GRN, HEXA, HTRA1, IDS, IDUA, LAMP2, LCA5, MECP2, MFN2, MMUT, MTM1, NAGLU, ND4, PAH, PIGA, PRKN, RPE65, SERPING1, SGSH, SLC13A5, or SLC6A1 or a functional fragment or variant thereof in a target cell, as compared to the endogenous expression of CFTR, ATP2B, ATP7A, AGL, CPS1, A1AT, ALPL, ARSA, BBS1, BEST1, CAH, CFH, CFI, CHM, CLN2, CLN7, CNGA3, CYP46A1, F9, FKRP, FMR1, FMRP, FOXG1, GAD, GALC, GALGT2, GBA1, GBE1, GLB1, GRN, HEXA, HTRA1, IDS, IDUA, LAMP2, LCA5, MECP2, MFN2, MMUT, MTM1, NAGLU, ND4, PAH, PIGA, PRKN, RPE65, SERPING1, SGSH, SLC13A5, or SLC6A1, respectively, in the target cell.
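The percentage thresholds in [0203] can be read as relative increases over the endogenous level. The following is a minimal sketch of that arithmetic, assuming expression is measured in the same arbitrary units in the vector-treated and untreated target cell. The function names and the example values are hypothetical, and the disclosure does not prescribe a particular assay or normalization at this point.

```python
def percent_higher(vector_level: float, endogenous_level: float) -> float:
    """Relative increase over the endogenous level, in percent.

    Assumption: both levels are in the same units and the endogenous
    level is strictly positive (a zero baseline has no defined percent).
    """
    if endogenous_level <= 0:
        raise ValueError("endogenous level must be positive")
    return (vector_level - endogenous_level) / endogenous_level * 100.0


def meets_expression_threshold(
    vector_level: float, endogenous_level: float, threshold_percent: float
) -> bool:
    """True if the increase is at least the given percent threshold."""
    return percent_higher(vector_level, endogenous_level) >= threshold_percent


# Hypothetical measurements in arbitrary units.
increase = percent_higher(3.0, 1.0)  # 200.0, i.e., a three-fold level
```

Under this reading, "at least 100% higher" corresponds to at least a two-fold level relative to endogenous expression, and "at least 2000% higher" corresponds to at least a 21-fold level.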
[0204] In some embodiments, expression of any of the vectors disclosed herein in a target cell results in at least 20%, at least 50%, at least 100%, at least 150%, at least 200%, at least 250%,
at least 300%, at least 400%, at least 500%, at least 700%, at least 900%, at least 1000%, at least 1100%, at least 1500%, or at least 2000% higher activity levels of CFTR, ATP2B, ATP7A, AGL, CPS1, A1AT, ALPL, ARSA, BBS1, BEST1, CAH, CFH, CFI, CHM, CLN2, CLN7, CNGA3, CYP46A1, F9, FKRP, FMR1, FMRP, FOXG1, GAD, GALC, GALGT2, GBA1, GBE1, GLB1, GRN, HEXA, HTRA1, IDS, IDUA, LAMP2, LCA5, MECP2, MFN2, MMUT, MTM1, NAGLU, ND4, PAH, PIGA, PRKN, RPE65, SERPING1, SGSH, SLC13A5, or SLC6A1 or a functional fragment or variant thereof in the target cell as compared to endogenous activity levels of CFTR, ATP2B, ATP7A, AGL, CPS1, A1AT, ALPL, ARSA, BBS1, BEST1, CAH, CFH, CFI, CHM, CLN2, CLN7, CNGA3, CYP46A1, F9, FKRP, FMR1, FMRP, FOXG1, GAD, GALC, GALGT2, GBA1, GBE1, GLB1, GRN, HEXA, HTRA1, IDS, IDUA, LAMP2, LCA5, MECP2, MFN2, MMUT, MTM1, NAGLU, ND4, PAH, PIGA, PRKN, RPE65, SERPING1, SGSH, SLC13A5, or SLC6A1, respectively, in the target cell.
[0205] In some embodiments, any of the vectors disclosed herein are capable of inducing at least 20%, at least 50%, at least 100%, at least 150%, at least 200%, at least 250%, at least 300%, at least 400%, at least 500%, at least 700%, at least 900%, at least 1000%, at least 1100%, at least 1500%, or at least 2000% higher expression of F8, F9, PIGA, SGSH, G6PC, NAGLU, CLN3, GBA, IDS, GAA, OTC, GLA, CAH, IDUA, LAMP2, CLN1, ATP7B, A1AT, GALT, EMNA, ENPP1, CLN2, CLN5, CLN7/MFSD8, AGU, MMUT, NPC2, ABCB11, ABCB4, ASS1, SMN1, AADC, MTM1, GBA1, GRN, GAD, GALGT2, SGCB, GDNF, ASPA, GLB1, GALC, SGCA, DYSF, HEXA, GAN, FXN, ARSA, MECP2, IGHMBP2, UBE3A, CDKL5, PGRN, FKRP, CYP46A1, OPMD, Cavl, neuropeptide Y/Y2, SCN1A, SHANK3, APOE2(R158C), FMR1, UPF1, CMT4J, MFN2, PRKN, CAPN3, NTF3, ANO5, SGCG, EMD, SURF1, GBE1, FMRP, RPE65, RPGR, CHM, ND4, CNGB3, PDE6b, CFI, CNGA3, GUCY2D, RLBP1, CD59, OPN1LW, CFH, MYO7A, RS1, ABCA4, ND1, BEST1, RHO, LCA5, RDH12, NMNAT1, SERPING1, AQP1, PPP1R1A, IL-1Ra, CFTR, OTOF, CLRN1, GJB2, ALPL, TMC1, STRC, ATOH1, or MYBPC3 or a functional fragment or variant thereof in a target cell, as compared to the endogenous expression of F8, F9, PIGA, SGSH, G6PC, NAGLU, CLN3, GBA, IDS, GAA, OTC, GLA, CAH, IDUA, LAMP2, CLN1, ATP7B, A1AT, GALT, EMNA, ENPP1, CLN2, CLN5, CLN7/MFSD8, AGU, MMUT, NPC2, ABCB11, ABCB4, ASS1, SMN1, AADC, MTM1, GBA1, GRN, GAD, GALGT2, SGCB, GDNF, ASPA, GLB1, GALC, SGCA, DYSF, HEXA, GAN, FXN, ARSA, MECP2, IGHMBP2, UBE3A, CDKL5, PGRN, FKRP, CYP46A1, OPMD, Cavl, neuropeptide Y/Y2, SCN1A, SHANK3, APOE2(R158C), FMR1, UPF1, CMT4J, MFN2, PRKN, CAPN3, NTF3, ANO5, SGCG, EMD, SURF1, GBE1, FMRP, RPE65, RPGR, CHM, ND4, CNGB3, PDE6b, CFI, CNGA3, GUCY2D, RLBP1, CD59, OPN1LW, CFH, MYO7A, RS1, ABCA4, ND1, BEST1, RHO,
LCA5, RDH12, NMNAT1, SERPING1, AQP1, PPP1R1A, IL-1Ra, CFTR, OTOF, CLRN1, GJB2, ALPL, TMC1, STRC, ATOH1, or MYBPC3, respectively, in the target cell.
[0206] In some embodiments, expression of any of the vectors disclosed herein in a target cell results in at least 20%, at least 50%, at least 100%, at least 150%, at least 200%, at least 250%, at least 300%, at least 400%, at least 500%, at least 700%, at least 900%, at least 1000%, at least 1100%, at least 1500%, or at least 2000% higher activity levels of F8, F9, PIGA, SGSH, G6PC, NAGLU, CLN3, GBA, IDS, GAA, OTC, GLA, CAH, IDUA, LAMP2, CLN1, ATP7B, A1AT, GALT, EMNA, ENPP1, CLN2, CLN5, CLN7/MFSD8, AGU, MMUT, NPC2, ABCB11, ABCB4, ASS1, SMN1, AADC, MTM1, GBA1, GRN, GAD, GALGT2, SGCB, GDNF, ASPA, GLB1, GALC, SGCA, DYSF, HEXA, GAN, FXN, ARSA, MECP2, IGHMBP2, UBE3A, CDKL5, PGRN, FKRP, CYP46A1, OPMD, Cavl, neuropeptide Y/Y2, SCN1A, SHANK3, APOE2(R158C), FMR1, UPF1, CMT4J, MFN2, PRKN, CAPN3, NTF3, ANO5, SGCG, EMD, SURF1, GBE1, FMRP, RPE65, RPGR, CHM, ND4, CNGB3, PDE6b, CFI, CNGA3, GUCY2D, RLBP1, CD59, OPN1LW, CFH, MYO7A, RS1, ABCA4, ND1, BEST1, RHO, LCA5, RDH12, NMNAT1, SERPING1, AQP1, PPP1R1A, IL-1Ra, CFTR, OTOF, CLRN1, GJB2, ALPL, TMC1, STRC, ATOH1, or MYBPC3 or a functional fragment or variant thereof in the target cell as compared to endogenous activity levels of F8, F9, PIGA, SGSH, G6PC, NAGLU, CLN3, GBA, IDS, GAA, OTC, GLA, CAH, IDUA, LAMP2, CLN1, ATP7B, A1AT, GALT, EMNA, ENPP1, CLN2, CLN5, CLN7/MFSD8, AGU, MMUT, NPC2, ABCB11, ABCB4, ASS1, SMN1, AADC, MTM1, GBA1, GRN, GAD, GALGT2, SGCB, GDNF, ASPA, GLB1, GALC, SGCA, DYSF, HEXA, GAN, FXN, ARSA, MECP2, IGHMBP2, UBE3A, CDKL5, PGRN, FKRP, CYP46A1, OPMD, Cavl, neuropeptide Y/Y2, SCN1A, SHANK3, APOE2(R158C), FMR1, UPF1, CMT4J, MFN2, PRKN, CAPN3, NTF3, ANO5, SGCG, EMD, SURF1, GBE1, FMRP, RPE65, RPGR, CHM, ND4, CNGB3, PDE6b, CFI, CNGA3, GUCY2D, RLBP1, CD59, OPN1LW, CFH, MYO7A, RS1, ABCA4, ND1, BEST1, RHO, LCA5, RDH12, NMNAT1, SERPING1, AQP1, PPP1R1A, IL-1Ra, CFTR, OTOF, CLRN1, GJB2, ALPL, TMC1, STRC, ATOH1, or MYBPC3, respectively, in the target cell.
VII. Viral Vectors
[0207] Viral genomes provide a rich source of vectors that can be used for the efficient delivery of exogenous polynucleotides into a mammalian cell. Viral genomes are particularly useful vectors for gene delivery as the polynucleotides contained within such genomes are typically incorporated into the nuclear genome of a mammalian cell by generalized or specialized transduction. These processes occur as part of the natural viral replication cycle, and do not
require added proteins or reagents in order to induce gene integration. Examples of viral vectors are a parvovirus (e.g., AAV), retrovirus (e.g., Retroviridae family viral vector), adenovirus (e.g., Ad5, Ad26, Ad34, Ad35, and Ad48), coronavirus, negative strand RNA viruses such as orthomyxovirus (e.g., influenza virus), rhabdovirus (e.g., rabies and vesicular stomatitis virus), paramyxovirus (e.g., measles and Sendai), positive strand RNA viruses, such as picornavirus and alphavirus, and double stranded DNA viruses including adenovirus, herpesvirus (e.g., Herpes Simplex virus types 1 and 2, Epstein-Barr virus, cytomegalovirus), and poxvirus (e.g., vaccinia, modified vaccinia Ankara (MVA), fowlpox, and canarypox). Other viruses include Norwalk virus, togavirus, flavivirus, reoviruses, papovavirus, hepadnavirus, human papilloma virus, human foamy virus, and hepatitis virus, for example. Examples of retroviruses are avian leukosis-sarcoma, avian C-type viruses, mammalian C-type, B-type viruses, D-type viruses, oncoretroviruses, HTLV-BLV group, lentivirus, alpharetrovirus, gammaretrovirus, spumavirus (Coffin, J. M., Retroviridae: The viruses and their replication, Virology, Third Edition (Lippincott-Raven, Philadelphia, (1996))). Other examples are murine leukemia viruses, murine sarcoma viruses, murine mammary tumor virus, bovine leukemia virus, feline leukemia virus, feline sarcoma virus, avian leukemia virus, human T-cell leukemia virus, baboon endogenous virus, Gibbon ape leukemia virus, Mason-Pfizer monkey virus, simian immunodeficiency virus, simian sarcoma virus, Rous sarcoma virus, and lentiviruses. Other examples of vectors are described, for example, in McVey et al., (U.S. Patent No. 5,801,030), the teachings of which are incorporated herein by reference.
A. Regulatory Sequences
[0208] A nucleic acid of the disclosure may be operably linked to a regulatory sequence. For example, in some embodiments, regulatory sequences are operably linked to a transgene including a heterologous coding sequence or a functional fragment or variant thereof. The regulatory sequences may include conventional control elements which permit the coding sequence’s transcription, translation, and/or expression in a cell transfected with the vector or infected with the virus produced by the disclosure.
[0209] The regulatory sequences useful in the constructs of the present disclosure may include an intron, such as an intron located between the compact bidirectional promoter and the coding sequence. In some embodiments, the intron sequence is derived from SV-40 and is a 100 bp mini-intron splice donor/splice acceptor referred to as SD-SA.
[0210] In some embodiments, a vector of the disclosure may include a woodchuck hepatitis virus post-transcriptional element. (See, e.g., L. Wang and I. Verma, 1999 PROC. NATL. ACAD. SCI., USA, 96:3906-3910).
[0211] In some embodiments, a vector of the disclosure may include a polyA signal, such as a polyA signal derived from any of many suitable sources, including, without limitation, SV-40, human, and bovine.
[0212] Another regulatory component of the rAAV useful in the method of the disclosure is an IRES. An IRES sequence, or another suitable system, may be used to produce more than one polypeptide from a single gene transcript (for example, to produce more polypeptides). An IRES may be used to produce a protein that contains more than one polypeptide chain or to express two different proteins from or within the same cell. In some embodiments, the IRES is located 3' to the transgene in the rAAV vector.
[0213] Other regulatory sequences useful in the vectors of the disclosure include enhancer sequences. Enhancer sequences useful in the disclosure include the IRBP enhancer, immediate early cytomegalovirus enhancer, an enhancer derived from an immunoglobulin gene, an enhancer derived from the SV40 enhancer, or an enhancer identified in a cis-acting element in a mouse proximal promoter.
[0214] Selection of these and other common vector and regulatory elements is well known in the art and many such sequences are available (see, e.g., Sambrook et al., and references cited therein at, for example, pages 3.18-3.26 and 16.17-16.27 and Ausubel et al., Current Protocols in Molecular Biology, John Wiley & Sons, New York, 1989).
[0215] A vector herein may also contain a reporter sequence for co-expression, such as but not limited to lacZ, GFP, CFP, YFP, RFP, mCherry, and tdTomato. In some embodiments, the rAAV vector may include a selectable marker.
B. AAV Vectors
[0216] Genes described herein can be incorporated into rAAV vectors in order to facilitate their introduction into a cell, such as a target cell. rAAV vectors useful in conjunction with the compositions and methods described herein include recombinant nucleic acid constructs that contain (1) a gene and (2) nucleic acids that facilitate expression of the heterologous genes. The viral nucleic acids may include those sequences of AAV that are required in cis for replication and packaging (e.g., functional ITRs) of the DNA into a virion. Such rAAV vectors may also contain marker or reporter genes.
[0217] Useful rAAV vectors include those that have one or more of the naturally occurring AAV genes deleted in whole or in part but that retain functional flanking ITR sequences. The AAV ITRs may be of any serotype suitable for a particular application. Methods for using rAAV vectors are described, for example, in Tai et al., J. BIOMED. SCI. 7:279-291 (2000), and Monahan and Samulski, GENE DELIVERY 7:24-30 (2000), the disclosures of each of which are incorporated herein by reference as they pertain to AAV vectors for gene delivery.
[0218] In some embodiments, the AAV includes two ITRs.
[0219] The genes described herein can be incorporated into a rAAV virion in order to facilitate introduction of the nucleic acid or vector into a cell. The capsid proteins of AAV compose the exterior, non-nucleic acid portion of the virion and are encoded by the AAV Cap gene. The Cap gene encodes three viral coat proteins, VP1, VP2 and VP3, which are required for virion assembly. The construction of rAAV virions has been described, for example, in US Patent Nos. 5,173,414; 5,139,941; 5,863,541; 5,869,305; 6,057,152; and 6,376,237; as well as in Rabinowitz et al., J. VIROL. 76:791-801 (2002) and Bowles et al., J. VIROL. 77:423-432 (2003), the disclosures of each of which are incorporated herein by reference as they pertain to AAV vectors for gene delivery.
[0220] In some embodiments, the recombinant AAV vector, including rep sequences, cap sequences, and helper functions required for producing the rAAV of the disclosure may be delivered to the packaging host cell using any appropriate genetic element (e.g., vector). In some embodiments, a single nucleic acid encoding all three capsid proteins (e.g., VP1, VP2 and VP3) is delivered into the packaging host cell in a single vector. In some embodiments, nucleic acids encoding the capsid proteins are delivered into the packaging host cell by two vectors: a first vector including a first nucleic acid encoding two capsid proteins (e.g., VP1 and VP2) and a second vector including a second nucleic acid encoding a single capsid protein (e.g., VP3). In some embodiments, three vectors, each including a nucleic acid encoding a different capsid protein, are delivered to the packaging host cell. The selected genetic element may be delivered by any suitable method, including those described herein. The methods used to construct any embodiment of this disclosure are known to those with skill in nucleic acid manipulation and include genetic engineering, recombinant engineering, and synthetic techniques. See, e.g., Sambrook et al., Molecular Cloning: A Laboratory Manual, Cold Spring Harbor Press, Cold Spring Harbor, N.Y. Similarly, methods of generating rAAV virions are well known and the selection of a suitable method is not a limitation on the present disclosure. See, e.g., K. Fisher et al., J. VIROL., 70:520-532 (1993) and U.S. Pat. No. 5,478,745. These publications are incorporated by reference herein.
[0221] rAAV virions useful in conjunction with the compositions and methods described herein include those derived from a variety of AAV serotypes including AAV 1, 2, 3, 4, 5, 6, 7, 8, and 9. Construction and use of AAV vectors and AAV proteins of different serotypes are described, for example, in Chao et al., MOL. THER. 2:619-623 (2000); Davidson et al., PROC. NATL. ACAD. SCI. USA 97:3428-3432 (2000); Xiao et al., J. VIROL. 72:2224-2232 (1998); Halbert et al., J. VIROL. 74:1524-1532 (2000); Halbert et al., J. VIROL. 75:6615-6624 (2001); and Auricchio et al., HUM. MOLEC. GENET. 10:3075-3081 (2001), the disclosures of each of which are incorporated herein by reference as they pertain to AAV vectors for gene delivery.
[0222] Also useful in conjunction with the compositions and methods described herein are pseudotyped rAAV vectors. Pseudotyped vectors include AAV vectors of a given serotype pseudotyped with a capsid gene derived from a serotype other than the given serotype (e.g., AAV1, AAV3, AAV4, AAV5, AAV6, AAV7, AAV8, or AAV9, among others). For example, a representative pseudotyped vector is an AAV2 vector encoding a therapeutic protein pseudotyped with a capsid gene derived from AAV serotype 8 or AAV serotype 9. Techniques involving the construction and use of pseudotyped rAAV virions are known in the art and are described, for example, in Duan et al., J. VIROL. 75:7662-7671 (2001); Halbert et al., J. VIROL. 74:1524-1532 (2000); Zolotukhin et al., METHODS, 28:158-167 (2002); and Auricchio et al., HUM. MOLEC. GENET., 10:3075-3081 (2001).
[0223] AAV virions that have mutations within the virion capsid may be used to infect particular cell types more effectively than non-mutated capsid virions. For example, suitable AAV mutants may have ligand insertion mutations for the facilitation of targeting AAV to specific cell types. The construction and characterization of AAV capsid mutants including insertion mutants, alanine screening mutants, and epitope tag mutants is described in Wu et al., J. VIROL. 74:8635-45 (2000).
[0224] As used herein, artificial AAV capsids may be used. Such an artificial capsid may be generated by any suitable technique using a selected AAV sequence (e.g., a fragment of a VP1 capsid protein) in combination with heterologous sequences which may be obtained from a different selected AAV serotype, non-contiguous portions of the same AAV serotype, from a non-AAV viral source, or from a non-viral source. An artificial AAV serotype may be, without limitation, a pseudotyped AAV, a chimeric AAV capsid, a recombinant AAV capsid, or a “humanized” AAV capsid.
[0225] Other rAAV virions that can be used in methods of the invention include those capsid hybrids that are generated by molecular breeding of viruses as well as by exon shuffling. See,
e.g., Soong et al., NAT. GENET., 25:436-439 (2000); and Kolman and Stemmer, Nat. Biotechnol. 19:423-428 (2001).
[0226] In some embodiments, the capsid is modified to improve therapy. The capsid may be modified using conventional molecular biology techniques. For example, in some embodiments, the capsid is modified for minimized immunogenicity, better stability and particle lifetime, efficient degradation, and/or accurate delivery of the heterologous coding sequence or a functional fragment or variant thereof to the nucleus. In some embodiments, the modification or mutation is an amino acid deletion, insertion, substitution, or any combination thereof in a capsid protein. A modified polypeptide may include 1, 2, 3, 4, 5, up to 10, or more amino acid substitutions and/or deletions and/or insertions. In some embodiments, one or more amino acid substitutions are introduced into one or more of VP1, VP2, and VP3. In one aspect, a modified capsid protein includes 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, or 15 conservative or nonconservative substitutions relative to the wild-type polypeptide.
[0227] In another aspect, the modified capsid polypeptide of the disclosure includes modified sequences, wherein such modifications can include both conservative and non-conservative substitutions, deletions, and/or additions, and typically include peptides that share at least 60%, at least 65%, at least 70%, at least 75%, at least 80%, at least 85%, at least 87%, at least 89%, at least 90%, at least 91%, at least 92%, at least 93%, at least 94%, at least 95%, at least 96%, at least 97%, at least 98%, or at least 99% sequence identity to the corresponding wild-type capsid protein.
[0228] In some embodiments, the vector includes a “stuffer” or “filler” sequence to bring the total size of the nucleic acid sequence between the two ITRs to between 2 and 5 kb. For example, in some embodiments, any of the vectors disclosed herein may include a spacer, e.g., a DNA sequence interposed between the promoter and the Rep gene ATG start site. In some embodiments, the spacer may be a random sequence of nucleotides, or alternatively, it may encode a gene product, such as a marker gene. In some embodiments, the spacer may contain genes which typically incorporate start/stop and poly A sites. In some embodiments, the spacer may be a non-coding DNA sequence from a prokaryote or eukaryote, a repetitive non-coding sequence, a coding sequence without transcriptional controls or a coding sequence with transcriptional controls. In some embodiments, the spacer is a phage ladder sequence or a yeast ladder sequence. In some embodiments, the spacer is of a size sufficient to reduce expression of the Rep78 and Rep68 gene products, leaving the Rep52, Rep40 and Cap gene products expressed at normal levels. In some embodiments, the length of the spacer may therefore range
from about 10 bp to about 10.0 kbp, such as in the range of about 100 bp to about 8.0 kbp. In some embodiments, the spacer is less than 2 kbp in length.
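The sizing rule for the stuffer in paragraph [0228] reduces to simple arithmetic, sketched below in Python. This is an illustration only: the function name `stuffer_length_range` and the 1,200 bp example insert are hypothetical, and 1 kb is taken as 1,000 bp. The 2 kb and 5 kb bounds come from the paragraph above.

```python
def stuffer_length_range(insert_bp, min_total_bp=2000, max_total_bp=5000):
    """Return (shortest, longest) stuffer length in bp that keeps the
    sequence between the two ITRs within [min_total_bp, max_total_bp].

    insert_bp is the length of everything else between the ITRs
    (promoter, coding sequence, poly A, and so on); it is an assumed input.
    """
    if insert_bp < 0:
        raise ValueError("insert length cannot be negative")
    if insert_bp > max_total_bp:
        raise ValueError("insert already exceeds the maximum total size")
    shortest = max(0, min_total_bp - insert_bp)  # pad up to the lower bound
    longest = max_total_bp - insert_bp           # do not exceed the upper bound
    return shortest, longest


# Hypothetical 1,200 bp insert: needs at least 800 bp and at most 3,800 bp of stuffer.
example_range = stuffer_length_range(1200)
```

An insert that already falls inside the window returns a shortest length of 0, meaning no stuffer is required.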
[0229] The rAAV vector may also contain additional sequences, for example, from an adenovirus, which assist in effecting a desired function for the vector. Such sequences include, for example, those which assist in packaging the rAAV vector in adenovirus-associated virus particles.
[0230] rAAV vectors useful in the methods of the disclosure are further described in PCT publication No. WO 2015/168666 and PCT publication no. WO 2014/011210, the contents of which are incorporated by reference herein.
[0231] In some embodiments, the rAAV particle is a single stranded AAV (ssAAV). Accordingly, in some embodiments, the compact bidirectional promoters described herein allow for the use of ssAAV vectors with genes previously thought to be too large to fit into an ssAAV (FIG. 2). Alternatively, for example, in some embodiments, the rAAV particle is a self-complementary AAV (scAAV) (see e.g., US 2012/0141422 which is incorporated herein by reference). Self-complementary vectors package an inverted repeat genome that can fold into dsDNA without the requirement for DNA synthesis or base-pairing between multiple vector genomes. Because scAAV have no need to convert the single-stranded DNA (ssDNA) genome into double-stranded DNA (dsDNA) prior to expression, they are more efficient vectors. However, the trade-off for this efficiency is the loss of half the coding capacity of the vector. scAAV are useful for small protein-coding genes (e.g., up to about 1.7 kb) and any currently available RNA-based therapy.
(i) Production of rAAV vectors
[0232] Numerous methods are known in the art for production of rAAV vectors, including transfection, stable cell line production, and infectious hybrid virus production systems which include adenovirus-AAV hybrids, herpesvirus-AAV hybrids (Conway, JE et al., (1997). VIROLOGY 71(11):8780-8789) and baculovirus-AAV hybrids. rAAV production cultures for the production of rAAV virus particles all require: 1) suitable host cells, including, for example, human-derived cell lines such as HeLa, A549, or 293 cells, or insect-derived cell lines such as SF-9, in the case of baculovirus production systems; 2) suitable helper virus function, provided by wild-type or mutant adenovirus (such as temperature sensitive adenovirus), herpes virus, baculovirus, or a plasmid construct providing helper functions; 3) AAV Rep and Cap genes and gene products; 4) a transgene (such as a transgene including a heterologous coding sequence (e.g., CFTR, ATP2B, ATP7A, AGL, CPS1, A1AT, ALPL, ARSA, BBS1, BEST1, CAH, CFH,
CFI, CHM, CLN2, CLN7, CNGA3, CYP46A1, F9, FKRP, FMR1, FMRP, FOXG1, GAD, GALC, GALGT2, GBA1, GBE1, GLB1, GRN, HEXA, HTRA1, IDS, IDUA, LAMP2, LCA5, MECP2, MFN2, MMUT, MTM1, NAGLU, ND4, PAH, PIGA, PRKN, RPE65, SERPING1, SGSH, SLC13A5, and SLC6A1, or a functional fragment or variant thereof)) flanked by at least one AAV ITR sequence; and 5) suitable media and media components to support rAAV production. Suitable media known in the art may be used for the production of rAAV vectors. These media include, without limitation, media produced by Hyclone Laboratories and JRH including Modified Eagle Medium (MEM), Dulbecco’s Modified Eagle Medium (DMEM), custom formulations such as those described in U.S. Patent No. 6,566,118, and Sf-900 II SFM media as described in U.S. Patent No. 6,723,551, each of which is incorporated herein by reference in its entirety, particularly with respect to custom media formulations for use in production of recombinant AAV vectors.
[0233] The rAAV particles can be produced using methods known in the art. See, e.g., U.S. Pat. Nos. 6,566,118; 6,989,264; and 6,995,006. In practicing the disclosure, host cells for producing rAAV particles include mammalian cells, insect cells, plant cells, microorganisms, and yeast. Host cells can also be packaging cells in which the AAV Rep and Cap genes are stably maintained in the host cell or producer cells in which the AAV vector genome is stably maintained. Exemplary packaging and producer cells are derived from 293, A549, or HeLa cells. AAV vectors are purified and formulated using standard techniques known in the art.
[0234] Recombinant AAV particles are generated by transfecting producer cells with a plasmid (cis-plasmid) containing a rAAV genome including a transgene flanked by the 145 nucleotide-long AAV ITRs and a separate construct expressing the AAV Rep and Cap genes in trans. In addition, adenovirus helper factors such as E1A, E1B, E2A, E4ORF6, and VA RNAs may be provided by either adenovirus infection or by transfecting a third plasmid providing adenovirus helper genes into the producer cells. Producer cells may be HEK293 cells. Packaging cell lines suitable for producing AAV vectors may be readily generated using readily available techniques (see e.g., U.S. Pat. No. 5,872,005). The helper factors provided will vary depending on the producer cells used and whether the producer cells already carry some of these helper factors.
[0235] In some embodiments, rAAV particles may be produced by a triple transfection method, such as the exemplary triple transfection method provided infra. Briefly, a plasmid containing a Rep gene and a Cap gene, along with a helper adenoviral plasmid, may be transfected (e.g., using the calcium phosphate method) into a cell line (e.g., HEK-293 cells), and virus may be collected and optionally purified.
[0236] In some embodiments, rAAV particles may be produced by a producer cell line method, such as the exemplary producer cell line method provided infra (see also Martin et al., (2013) HUMAN GENE THERAPY METHODS 24:253-269). Briefly, a cell line (e.g., a HeLa cell line) may be stably transfected with a plasmid containing a Rep gene, a Cap gene, and a promoter-transgene sequence. Cell lines may be screened to select a lead clone for rAAV production, which may then be expanded to a production bioreactor and infected with an adenovirus (e.g., a wild-type adenovirus) as helper to initiate rAAV production. Virus may subsequently be harvested, adenovirus may be inactivated (e.g., by heat) and/or removed, and the rAAV particles may be purified.
[0237] In some aspects, a method is provided for producing any rAAV particle as disclosed herein including: (a) culturing a host cell under a condition that rAAV particles are produced, wherein the host cell includes (i) one or more AAV packaging genes, wherein each said AAV packaging gene encodes an AAV replication and/or encapsidation protein; (ii) a rAAV provector including a nucleic acid encoding a therapeutic polypeptide and/or nucleic acid as described herein flanked by at least one AAV ITR; and (iii) an AAV helper function; and (b) recovering the rAAV particles produced by the host cell. In some embodiments, said at least one AAV ITR is selected from the group consisting of AAV1, AAV2, AAV3, AAV4, AAV5, AAV6, AAV7, AAV8, AAVrh8, AAVrh8R, AAV9, AAV10, AAVrh10, AAV11, AAV12, AAV2R471A, AAV DJ, a goat AAV, bovine AAV, or mouse AAV or the like. In some embodiments, the encapsidation protein is an AAV2 encapsidation protein.
[0238] Suitable rAAV production culture media of the present disclosure may be supplemented with serum or serum-derived recombinant proteins at a level of 0.5%-20% (v/v or w/v). Alternatively, as is known in the art, rAAV vectors may be produced in serum-free conditions which may also be referred to as media with no animal-derived products. One of ordinary skill in the art may appreciate that commercial or custom media designed to support production of rAAV vectors may also be supplemented with one or more cell culture components known in the art, including without limitation glucose, vitamins, amino acids, and/or growth factors, in order to increase the titer of rAAV in production cultures.
[0239] rAAV production cultures can be grown under a variety of conditions (over a wide temperature range, for varying lengths of time, and the like) suitable to the particular host cell being utilized. As is known in the art, rAAV production cultures include attachment-dependent cultures which can be cultured in suitable attachment-dependent vessels such as, for example, roller bottles, hollow fiber filters, microcarriers, and packed-bed or fluidized-bed bioreactors. rAAV vector production cultures may also include suspension-adapted host cells such as HeLa,
293, and SF-9 cells which can be cultured in a variety of ways including, for example, spinner flasks, stirred tank bioreactors, and disposable systems such as the Wave bag system.
[0240] rAAV vector particles of the disclosure may be harvested from rAAV production cultures by lysis of the host cells of the production culture or by harvest of the spent media from the production culture, provided the cells are cultured under conditions known in the art to cause release of rAAV particles into the media from intact cells, as described more fully in U.S. Patent No. 6,566,118. Suitable methods of lysing cells are also known in the art and include, for example, multiple freeze/thaw cycles, sonication, microfluidization, and treatment with chemicals, such as detergents and/or proteases.
[0241] In a further embodiment, the rAAV particles are purified. The term “purified” as used herein includes a preparation of rAAV particles devoid of at least some of the other components that may also be present where the rAAV particles naturally occur or are initially prepared from. Thus, for example, isolated rAAV particles may be prepared using a purification technique to enrich it from a source mixture, such as a culture lysate or production culture supernatant. Enrichment can be measured in a variety of ways, such as, for example, by the proportion of DNase-resistant particles (DRPs) or genome copies (gc) present in a solution, or by infectivity, or it can be measured in relation to a second, potentially interfering substance present in the source mixture, such as contaminants, including production culture contaminants or in-process contaminants, including helper virus, media components, and the like.
[0242] In some embodiments, the rAAV production culture harvest is clarified to remove host cell debris. In some embodiments, the production culture harvest is clarified by filtration through a series of depth filters including, for example, a grade D0HC Millipore Millistak+ HC Pod Filter, a grade A1HC Millipore Millistak+ HC Pod Filter, and a 0.2 μm Filter Opticap XL 10 Millipore Express SHC Hydrophilic Membrane filter. Clarification can also be achieved by a variety of other standard techniques known in the art, such as centrifugation or filtration through any cellulose acetate filter of 0.2 μm or greater pore size known in the art.
[0243] In some embodiments, the rAAV production culture harvest is further treated with Benzonase® to digest any high molecular weight DNA present in the production culture. In some embodiments, the Benzonase® digestion is performed under standard conditions known in the art including, for example, a final concentration of 1-2.5 units/mL of Benzonase® at a temperature ranging from ambient to 37 °C for a period of 30 minutes to several hours.
[0244] rAAV particles may be isolated or purified using one or more of the following purification steps: equilibrium centrifugation; flow-through anionic exchange filtration; tangential flow filtration (TFF) for concentrating the rAAV particles; rAAV capture by apatite
chromatography; heat inactivation of helper virus; rAAV capture by hydrophobic interaction chromatography; buffer exchange by size exclusion chromatography (SEC); nanofiltration; and rAAV capture by anionic exchange chromatography, cationic exchange chromatography, or affinity chromatography. These steps may be used alone, in various combinations, or in different orders. In some embodiments, the method includes all the steps in the order as described below. Methods to purify rAAV particles are found, for example, in Xiao et al., (1998) Journal of Virology 72:2224-2232; U.S. Patent Numbers 6,989,264 and 8,137,948; and WO 2010/148143.
[0245] Cells may also be transfected with a vector (e.g., helper vector) which provides helper functions to the AAV. The vector providing helper functions may provide adenovirus functions, including, e.g., E1a, E1b, E2a, and E4ORF6. The sequences of adenovirus gene providing these functions may be obtained from any known adenovirus serotype, such as serotypes 2, 3, 4, 7, 12, and 40, and further including any of the presently identified human types known in the art. Thus, in some embodiments, the methods involve transfecting the cell with a vector expressing one or more genes necessary for AAV replication, AAV gene transcription, and/or AAV packaging.
[0246] In some embodiments, such a stable host cell will contain the required component(s) under the control of an inducible promoter. Alternatively, the required component(s) may be under the control of a constitutive promoter. In still another alternative, a selected stable host cell may contain selected component(s) under the control of a constitutive promoter and other selected component(s) under the control of one or more inducible promoters.
For example, a stable host cell may be generated which is derived from 293 cells (which contain E1 helper functions under the control of a constitutive promoter), but which contains the Rep and/or Cap proteins under the control of inducible promoters. Still other stable host cells may be generated by one of skill in the art.
[0247] The minigene, Rep sequences, Cap sequences, and helper functions required for producing the rAAV of the disclosure may be delivered to the packaging host cell in the form of any genetic element which transfers the sequences. The selected genetic element may be delivered by any suitable method known in the art. See, e.g., Sambrook et al., Molecular Cloning: A Laboratory Manual, Cold Spring Harbor Press, Cold Spring Harbor, NY.
VIII. Pharmaceutical Compositions
[0248] The present disclosure provides, among other things, pharmaceutical compositions including a nucleic acid including a compact bidirectional promoter, or a functional fragment or variant thereof, as described herein and a heterologous coding sequence, or a functional fragment
or variant thereof, and a pharmaceutically acceptable carrier. The pharmaceutical compositions may be suitable for any mode of administration described herein.
[0249] In some embodiments, the pharmaceutical composition including a nucleic acid described herein and a pharmaceutically acceptable carrier is suitable for administration to a human subject. Such carriers are well known in the art (see, e.g., Remington’s Pharmaceutical Sciences, 15th Edition, pp. 1035-1038 and 1570-1580). Such pharmaceutically acceptable carriers can be sterile liquids, such as water and oil, including those of petroleum, animal, vegetable or synthetic origin, such as peanut oil, soybean oil, mineral oil, and the like. Saline solutions and aqueous dextrose, polyethylene glycol (PEG) and glycerol solutions can also be employed as liquid carriers, particularly for injectable solutions. The pharmaceutical composition may further include additional ingredients, for example preservatives, buffers, tonicity agents, antioxidants and stabilizers, nonionic wetting or clarifying agents, viscosity-increasing agents, and the like. The pharmaceutical compositions described herein can be packaged in single unit dosages or in multi-dosage forms. The compositions are generally formulated as sterile and substantially isotonic solutions.
[0250] For example, in some embodiments, the pharmaceutical compositions of the disclosure include a pharmaceutically acceptable carrier. For example, in some embodiments, the pharmaceutical compositions of the disclosure include PBS. In some embodiments, the pharmaceutical compositions of the disclosure include pluronic. In some embodiments, the pharmaceutical compositions of the disclosure include PBS, NaCl, and pluronic. In some embodiments, the vectors are administered by intravitreal injection in a solution of PBS, with additional NaCl and pluronic.
[0251] In one embodiment, the nucleic acid including the desired compact bidirectional promoter, or a functional fragment or variant thereof, as described herein and the desired heterologous coding sequence or a functional fragment or variant thereof for use in target cells, as detailed above, is formulated into a pharmaceutical composition intended for oral, inhalation, intranasal, intratracheal, intravenous, intramuscular, subcutaneous, intradermal, or other parenteral routes of administration. Such formulation involves the use of a pharmaceutically and/or physiologically acceptable vehicle or carrier, such as buffered saline or other buffers, e.g., HEPES, to maintain pH at appropriate physiological levels, and, optionally, other medicinal agents, pharmaceutical agents, stabilizing agents, buffers, carriers, adjuvants, or diluents. For injection, the carrier will typically be a liquid. Exemplary physiologically acceptable carriers include sterile, pyrogen-free water and sterile, pyrogen-free phosphate buffered saline. A variety of such known carriers are provided in U.S. Patent No. 7,629,322, incorporated herein by
reference. In some embodiments, the carrier is an isotonic sodium chloride solution. In some embodiments, the carrier is balanced salt solution. In some embodiments, the carrier includes tween. If the virus is to be stored long-term, it may be frozen in the presence of glycerol or Tween20. In some embodiments, the pharmaceutically acceptable carrier includes a surfactant, such as perfluorooctane (Perfluoron liquid). Routes of administration may be combined, if desired.
[0252] Pharmaceutical compositions useful in the methods of the disclosure are further described in PCT publication No. WO 2015/168666 and PCT publication No. WO 2014/011210, the contents of which are incorporated by reference herein.
IX. Methods of Treatment or Prophylaxis
[0253] Provided herein are various methods of preventing, treating, arresting progression of or ameliorating diseases and disorders. Generally, the methods include administering to a subject, e.g., a mammalian subject, in need thereof, an effective amount of a composition including a vector described above (e.g., an rAAV), carrying a heterologous coding sequence or a functional fragment or variant thereof under the control of a compact bidirectional promoter and, optionally, regulatory sequences which express the product of the gene in target cells of a subject, and a pharmaceutically acceptable carrier. Any of the vectors, such as AAV (e.g., ssAAV or scAAV) described herein, are useful in the methods described below.
[0254] The disclosure also provides a method of treating a subject having a disease, including the step of administering to the subject a vector of the disclosure.
[0255] In some embodiments, the disclosure provides a method of treating a subject having a disease as described herein, comprising the step of administering to the subject a vector of the disclosure. In some embodiments, the vector is administered at a dose between 2.5 x 10^10 vg/kg and 1.4 x 10^11 vg/kg. In some embodiments, the vectors are administered at a dose between 1.0 x 10^11 vg/kg and 1.5 x 10^13 vg/kg. In some embodiments, the vectors are administered at a dose between 1.0 x 10^11 vg/kg and 1.5 x 10^12 vg/kg. In some embodiments, the vectors are administered at a dose of about 1.4 x 10^12. In some embodiments, the vectors are administered at a dose of 1.4 x 10^12 vg/kg.
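Because the doses above are expressed per kilogram of body weight, the total number of vector genomes administered scales with the subject's weight. A minimal Python sketch of that arithmetic follows, assuming the total is simply dose times weight. The function name `total_vector_genomes` and the 70 kg body weight are illustrative assumptions and do not come from the disclosure.

```python
def total_vector_genomes(dose_vg_per_kg, body_weight_kg):
    """Total vector genomes (vg) for a weight-based dose given in vg/kg."""
    if dose_vg_per_kg <= 0 or body_weight_kg <= 0:
        raise ValueError("dose and body weight must be positive")
    return dose_vg_per_kg * body_weight_kg


# Dose of 1.4 x 10^12 vg/kg from the paragraph above; the 70 kg weight is hypothetical.
total_vg = total_vector_genomes(1.4e12, 70.0)
```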
[0256] In some embodiments, the pharmaceutical compositions of the disclosure comprise a pharmaceutically acceptable carrier. In some embodiments, the pharmaceutical compositions of the disclosure comprise PBS. In some embodiments, the pharmaceutical compositions of the disclosure comprise pluronic. In some embodiments, the pharmaceutical compositions of the disclosure comprise PBS, NaCl, and pluronic.
[0257] In some embodiments, any of the treatment and/or prophylactic methods disclosed herein are applied to a subject. In some embodiments, the subject is a mammal. In some embodiments, the subject is a human.
[0258] In some embodiments, the human is a newborn, an infant, child, pre-adolescent, adolescent, or adult.
A. Reduced Dosing Methods Using scAAV
[0259] It has additionally been discovered that the compact bidirectional promoters described herein allow for the use of scAAV vectors with genes previously thought to be too large to fit into an scAAV (see, FIG. 3). scAAV vectors are about half the size of wild-type vectors and can package a double-stranded, hairpin-like genome that is self-complementary. (See, e.g., Wang et al. (2003) GENE THERAPY 10:2105-2111.) Because the genome is self-complementary, the vector is able to circumvent the single-stranded to double-stranded conversion that takes place for transcriptional activation to occur. This conversion is time-consuming, and causes a delay in expression of a transgene (e.g., a therapeutic coding sequence) which is disadvantageous for applications that require immediate activity. Use of scAAV vectors can reduce the amount of vector (i.e., the dosing) needed, thereby reducing toxicity which can be caused by large doses of AAV.
[0260] Accordingly, the disclosure also provides a method of administering an scAAV vector including a therapeutic coding sequence at a reduced dose for treating a disease treatable by the therapeutic coding sequence. For example, the method may include administering to a subject a scAAV including a compact bidirectional promoter operably linked to the therapeutic coding sequence, wherein the compact bidirectional promoter is less than about 1000 bp (e.g., less than about 800 bp, less than about 600 bp, less than about 400 bp, or less than about 200 bp) and is heterologous to the therapeutic coding sequence, wherein the scAAV vector is administered at a reduced dose as compared to the therapeutically effective dose for an ssAAV vector including the therapeutic coding sequence.
[0261] In some embodiments, the therapeutic coding sequence encodes a protein that is from about 450 amino acids to about 750 amino acids in size. For example, the therapeutic coding sequence can encode a protein from about 450 to about 550 amino acids, about 450 to about 650 amino acids, about 550 to about 650 amino acids, about 550 to about 750 amino acids, or about 650 to about 750 amino acids in size. In some embodiments, the therapeutic coding sequence comprises F8, F9, PIGA, SGSH, G6PC, NAGLU, CLN3, GBA, IDS, GAA, OTC, GLA, CAH, IDUA, LAMP2, CLN1, ATP7B, A1AT, GALT, LMNA, ENPP1, CLN2, CLN5, CLN7/MFSD8,
AGU, MMUT, NPC2, ABCB11, ABCB4, ASS1, SMN1, AADC, MTM1, GBA1, GRN, GAD, GALGT2, SGCB, GDNF, ASPA, GLB1, GALC, SGCA, DYSF, HEXA, GAN, FXN, ARSA, MECP2, IGHMBP2, UBE3A, CDKL5, PGRN, FKRP, CYP46A1, OPMD, Cav1, neuropeptide Y/Y2, SCN1A, SHANK3, APOE2(R158C), FMR1, UPF1, CMT4J, MFN2, PRKN, CAPN3, NTF3, ANO5, SGCG, EMD, SURF1, GBE1, FMRP, RPE65, RPGR, CHM, ND4, CNGB3, PDE6b, CFI, CNGA3, GUCY2D, RLBP1, CD59, OPN1LW, CFH, MYO7A, RS1, ABCA4, ND1, BEST1, RHO, LCA5, RDH12, NMNAT1, SERPING1, AQP1, PPP1R1A, IL-1Ra, CFTR, OTOF, CLRN1, GJB2, ALPL, TMC1, STRC, ATOH1, or MYBPC3.
[0262] For example, in some embodiments, the reduced dose is between about 10-fold and about 600-fold (e.g., about 11-fold and about 550-fold, about 12-fold and about 500-fold, about 13-fold and about 400-fold, about 14-fold and about 300-fold, about 15-fold and about 200-fold, about 20-fold and about 100-fold, or about 50-fold) lower than the therapeutically effective dose for an ssAAV vector. In some embodiments, the reduced dose is between about 11-fold and about 550-fold lower than the therapeutically effective dose for an ssAAV vector.
[0263] In some embodiments, the reduced dose is between about 12-fold and about 500-fold lower than the therapeutically effective dose for an ssAAV vector. In some embodiments, the reduced dose is between about 13-fold and about 400-fold lower than the therapeutically effective dose for an ssAAV vector. In some embodiments, the reduced dose is between about 14-fold and about 300-fold lower than the therapeutically effective dose for an ssAAV vector. In some embodiments, the reduced dose is between about 15-fold and about 200-fold lower than the therapeutically effective dose for an ssAAV vector. In some embodiments, the reduced dose is between about 20-fold and about 100-fold lower than the therapeutically effective dose for an ssAAV vector. In some embodiments, the reduced dose is about 50-fold lower than the therapeutically effective dose for an ssAAV vector.
[0264] In some embodiments, the reduced dose is about 10-fold lower than the therapeutically effective dose for an ssAAV vector.
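The fold reductions recited above translate into a dose window by dividing the ssAAV reference dose by the fold values. The Python sketch below illustrates this under stated assumptions: the function name `reduced_dose_window` is hypothetical, and the reference dose of 1.5 x 10^13 vg/kg is chosen only as an illustrative input, not as a stated therapeutically effective ssAAV dose.

```python
def reduced_dose_window(ssaav_dose_vg_per_kg, min_fold=10, max_fold=600):
    """Return (lowest, highest) scAAV dose in vg/kg for a fold-reduction range.

    A dose that is N-fold lower than the reference equals reference / N, so the
    largest fold gives the lowest dose and the smallest fold gives the highest.
    """
    if ssaav_dose_vg_per_kg <= 0:
        raise ValueError("reference dose must be positive")
    if not 0 < min_fold <= max_fold:
        raise ValueError("fold values must satisfy 0 < min_fold <= max_fold")
    return ssaav_dose_vg_per_kg / max_fold, ssaav_dose_vg_per_kg / min_fold


# Illustrative reference dose only (assumption); default 10-fold to 600-fold range.
low_dose, high_dose = reduced_dose_window(1.5e13)
```

Passing the same value for `min_fold` and `max_fold` (for example, 50) collapses the window to a single dose.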
X. Kits
[0265] Any of the vectors disclosed herein may be assembled into a pharmaceutical or diagnostic or research kit to facilitate their use in therapeutic, diagnostic or research applications. A kit may include one or more containers housing any of the vectors disclosed herein and instructions for use.
[0266] The kit may be designed to facilitate use of the methods described herein by researchers and can take many forms. Each of the compositions of the kit, where applicable, may be
provided in liquid form (e.g., in solution), or in solid form (e.g., a dry powder). In certain cases, some of the compositions may be constitutable or otherwise processable (e.g., to an active form), for example, by the addition of a suitable solvent or other species (e.g., water or a cell culture medium), which may or may not be provided with the kit. As used herein, “instructions” can define a component of instruction and/or promotion, and typically involve written instructions on or associated with packaging of the disclosure. Instructions also can include any oral or electronic instructions provided in any manner such that a user will clearly recognize that the instructions are to be associated with the kit, for example, audiovisual (e.g., videotape or DVD), internet, and/or web-based communications. The written instructions may be in a form prescribed by a governmental agency regulating the manufacture, use or sale of pharmaceuticals or biological products, which instructions can also reflect approval by the agency of manufacture, use, or sale for animal administration.
EXAMPLES
[0267] The following Examples are merely illustrative and are not intended to limit the scope or content of the invention in any way.
Example 1. Identification of Bidirectional Promoters
[0268] This Example describes the identification of compact bidirectional promoters (see exemplary promoter in FIG. 1) from genomic databases.
[0269] A custom Python script was developed to identify bidirectional promoters from genomic annotation files (as outlined in FIG. 7). The steps below specify human annotations, but the script was used to identify bidirectional promoters from other genome annotations and can similarly be applied to genome-wide transcription data files. First, the input data file was obtained: GRCh38_latest_genomic.gff was used for the human input file, which was an annotated file of the GRCh38 genome and GRCm39_vM27.gff3 was used for the mouse genome. The file was categorized by chromosome with each line pertaining to each region of interest in the genome with examples including genes, pseudogenes, and coding regions for protein-coding genes. The custom script iterated through every line in the file and stored the type of annotation. Once the relevant information had been stored from the input file, the genes were sorted by index on a per-chromosome basis. After sorting, the custom script identified regions in between transcription on the minus strand and transcription on the plus strand, defining the intervening region as a bidirectional promoter. Promoter boundaries can be further refined using
the coding sequence (CDS) start for protein coding genes that are capable of expressing the at least one heterologous coding sequence in a target cell.
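The following is a minimal sketch of the kind of scan described in paragraph [0269], not the custom script itself, which is not reproduced in this disclosure. It assumes a tab-delimited GFF3 input with nine columns, treats only rows of type "gene" as transcription units, and calls the gap between a minus-strand gene and the next plus-strand gene on the same chromosome a candidate bidirectional promoter. The names `parse_genes`, `find_bidirectional_promoters`, `Gene`, and `Promoter` are hypothetical, and overlapping or nested genes are ignored for simplicity.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, NamedTuple


class Gene(NamedTuple):
    chrom: str
    start: int   # 1-based, inclusive (GFF convention)
    end: int     # 1-based, inclusive
    strand: str  # "+" or "-"
    name: str


class Promoter(NamedTuple):
    chrom: str
    start: int
    end: int
    minus_gene: str
    plus_gene: str

    @property
    def length(self) -> int:
        return self.end - self.start + 1


def parse_genes(lines: Iterable[str], feature_types=("gene",)) -> Dict[str, List[Gene]]:
    """Collect gene records per chromosome from GFF3-style lines."""
    genes: Dict[str, List[Gene]] = defaultdict(list)
    for line in lines:
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines, directives, and comments
        cols = line.rstrip("\n").split("\t")
        if len(cols) != 9 or cols[2] not in feature_types:
            continue  # keep only the annotation types of interest
        attrs = dict(kv.split("=", 1) for kv in cols[8].split(";") if "=" in kv)
        name = attrs.get("Name") or attrs.get("ID", "?")
        genes[cols[0]].append(Gene(cols[0], int(cols[3]), int(cols[4]), cols[6], name))
    return genes


def find_bidirectional_promoters(genes_by_chrom, max_len: int = 1000) -> List[Promoter]:
    """Report gaps between a minus-strand gene and the next plus-strand gene."""
    promoters: List[Promoter] = []
    for chrom, genes in genes_by_chrom.items():
        ordered = sorted(genes, key=lambda g: (g.start, g.end))
        for left, right in zip(ordered, ordered[1:]):
            # Head-to-head pair: the minus-strand gene is transcribed leftward
            # from its end coordinate, the plus-strand gene rightward from its start.
            if left.strand == "-" and right.strand == "+" and right.start > left.end + 1:
                candidate = Promoter(chrom, left.end + 1, right.start - 1, left.name, right.name)
                if candidate.length <= max_len:
                    promoters.append(candidate)
    return promoters
```

In use, an open annotation file could be passed directly to `parse_genes`, and the `max_len` argument could be lowered to restrict the output to more compact candidates. Refinement of boundaries using CDS starts, as mentioned above, is not implemented in this sketch.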
[0270] Using this approach, more than 9000 bidirectional promoters were identified, 332 of which had a length of no more than 1000 bp. Of these, 291 were no more than 800 bp, 234 were no more than 600 bp, 137 were no more than 400 bp, and 34 were no more than 200 bp.
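Cumulative counts of the kind reported in paragraph [0270] can be tallied from a list of promoter lengths. The sketch below is illustrative only: the function name `count_at_most` is hypothetical, the thresholds mirror those recited above, and any lengths passed in would come from the user's own scan rather than from this disclosure.

```python
from bisect import bisect_right
from typing import Dict, Iterable, Sequence


def count_at_most(lengths: Iterable[int],
                  thresholds: Sequence[int] = (1000, 800, 600, 400, 200)) -> Dict[int, int]:
    """For each threshold, count how many lengths are no more than that threshold."""
    ordered = sorted(lengths)
    # bisect_right returns the number of elements <= t in a sorted list.
    return {t: bisect_right(ordered, t) for t in thresholds}
```

Because each count is cumulative, a promoter of 150 bp contributes to every bin, which is why the counts in such a tally decrease monotonically with the threshold.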
[0271] Compact bidirectional promoters identified using the method of Example 1 are provided at SEQ ID NOs: 1-800.
Example 2. Tissue expression of compact bidirectional promoters
[0272] Tissue expression for an exemplary compact bidirectional promoter identified in Example 1 was determined using expression databases for each protein coding gene flanking the bidirectional promoter. Specifically, tissue expression data were obtained from the Human Protein Atlas (HPA) and the Genotype-Tissue Expression (GTEx) databases. As shown in FIG. 4A and FIG. 4B, a compact bidirectional promoter flanked by COX15 and CUTC drives expression of CUTC in skin and tongue. Another exemplary bidirectional promoter is flanked by DYNLT2 and ERMARD and drives expression of DYNLT2 in the testes. Yet another exemplary bidirectional promoter is flanked by BHMT2 and DMGDH and shows the same tissue specificity in both orientations, with expression in both the kidney and liver.
[0273] Tissue expression for a number of compact bidirectional promoters identified in Example 1 was determined as above, and the exemplary expression profiles are provided in FIGs. 5A-H and FIGs. 11A-26. For example, FIGs. 5A-H provide a set of graphs depicting the unique liver-, hepatocyte-, neuronal-, kidney tubular-, skeletal muscle-, cerebral cortex-, retina-, and rod photoreceptor-specific expression profiles of compact bidirectional promoters of less than 300 bp identified in Example 1.
[0274] FIGs. 6A-6D are a set of graphs depicting cell sub-type expression profiles in the lung for four exemplary compact bidirectional promoters of the disclosure.
Example 3. Transgene expression driven by compact bidirectional promoters
[0275] This Example describes the characterization of a library of compact bidirectional promoters for their capacity to drive gene expression using luciferase reporters (e.g., Firefly luciferase and NANOLUC®) in cell lines. Normalized luciferase expression was quantified for compact bidirectional promoters of the disclosure and benchmarked against a control thymidine kinase (TK) promoter.
[0276] Promoter expression activity was assessed using a luciferase reporter assay. Characterization of the luciferase assay was performed, for example, by co-transfecting cells with a plasmid encoding Firefly luciferase and with a plasmid encoding NANOLUC® reporters. The luciferase reporters were under transcriptional control of standard promoters (e.g., TK). A standard curve of the normalized luciferase signal (Firefly signal/NANOLUC® signal) was generated using transfection ratios such as the following exemplary ratios: 90 ng Firefly:10 ng NANOLUC®, 99 ng Firefly:1 ng NANOLUC®, and 100 ng Firefly:0.1 ng NANOLUC®. Establishing such a ratiometric luciferase reporter assay allowed the determination of promoter expression activity without cross-signal interference.
[0277] Compact bidirectional promoters of the disclosure (e.g., any one or more of the promoters having the nucleic acid sequence of SEQ ID NOs: 1-800), including human MORN5 (“p387;” e.g., SEQ ID NOs. 221 and/or 621), human RPL9 (“p389;” e.g., SEQ ID NOs. 300 and/or 700), human NDUFB9 (“p390;” e.g., SEQ ID NOs. 339 and/or 739), human RPS28 (“p391;” e.g., SEQ ID NOs. 220 and/or 620), and human SLIRP (“p392;” e.g., SEQ ID NOs. 17 and/or 417), were evaluated for reporter expression in HeLa (FIG. 8), A549 (FIG. 9), and CFBE (FIG. 10) cell lines. Activity of the compact promoters, along with activity of a control H1 promoter (“p096”) and the standard TK promoter (“p322”), was plotted (FIGs. 8-10), showing that the strongest promoters exceed TK-controlled expression activity.
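The ratiometric normalization and TK benchmark described in this Example can be illustrated with a short calculation. The sketch below is offered under stated assumptions rather than as the analysis actually performed: it assumes that the Firefly signal reports the promoter under test, that the co-transfected NANOLUC® signal serves as the normalizer, and that replicate wells are averaged after normalization. The function names are hypothetical.

```python
from statistics import mean
from typing import Iterable, Tuple

Well = Tuple[float, float]  # (Firefly signal, NanoLuc signal) for one well


def normalized_signal(firefly: float, nanoluc: float) -> float:
    """Normalized luciferase signal, defined in the text as Firefly / NanoLuc."""
    if nanoluc <= 0:
        raise ValueError("NanoLuc signal must be positive to normalize")
    return firefly / nanoluc


def fold_over_control(sample_wells: Iterable[Well], control_wells: Iterable[Well]) -> float:
    """Mean normalized signal of a test promoter relative to a control promoter (e.g., TK)."""
    sample = mean(normalized_signal(f, n) for f, n in sample_wells)
    control = mean(normalized_signal(f, n) for f, n in control_wells)
    return sample / control
```

A value greater than 1 from `fold_over_control` would correspond to a promoter whose normalized activity exceeds that of the control in the same cell line.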
Example 4: In vivo Promoter Expression
[0278] This Example describes assessment of promoter activity and payload expression in vivo in mice. To demonstrate promoter activity, in vivo luminescence driven by the candidate promoter is examined. For example, a promoter-Luciferase reporter construct that is flanked by ITR sequences can be constructed, packaged into an AAV (e.g., scAAV), and delivered via intranasal administration to mice. Exemplary scAAV comprising a compact bidirectional promoter for testing include SEQ ID NOs. 812-818, having the CUTC promoter (SEQ ID NO. 80 and SEQ ID NO. 480 (e.g., SEQ ID NO. 812)), NDUFA7 promoter (SEQ ID NO. 220 and SEQ ID NO. 620 (e.g., SEQ ID NO. 813)), and NDUFB9 promoter (SEQ ID NO. 339 and SEQ ID NO. 739 (e.g., SEQ ID NOs. 814-818)), respectively, and each comprising an MTM1 heterologous coding sequence. A time course of in vivo luciferase imaging can provide a direct readout of promoter activity and transgene expression in specific tissues of the mice.
Cloning and AAV6 Virus Production
[0279] A luciferase-AAV reporter construct (e.g., a luciferase-scAAV reporter construct) including a compact bidirectional promoter of the disclosure is generated using a plasmid transfection method, as known in the art. At 8 weeks of age, a group of mice will receive, for example, a single 50 µL intranasal instillation of either 2 x 10^14 vg/kg AAV or sterile PBS.
In vivo Luciferase Activity
[0280] Mice are monitored for 32 weeks post-transfection to comprehensively assess peak luciferase expression and vector durability. In such an experiment, for example, mice can be injected intraperitoneally with 75 mg/kg D-luciferin in 100 µL of PBS and placed in a chamber of an imaging system under isoflurane anesthesia. Luminescent images can be acquired 10 minutes post-injection (Xenogen IVIS). In vivo luciferase expression enables following the kinetics of expression onset along with quantification of promoter activity without having to sacrifice the mice. A control vector driving luciferase expression from a control promoter (e.g., PGK1) can be used to compare tissue distribution and expression level. Tissue distribution can be examined over time to confirm that expression is not silenced as compared with the control promoter. As yet another demonstration of in vivo expression of a payload by a compact bidirectional promoter of the disclosure, relevant tissue samples (e.g., lungs, testes, and brain) from the mice may be collected and RT-qPCR or a Western Blot may be performed to validate gene and protein expression, respectively. For example, the kidney and liver of mice may be collected, and it may be determined that the genes BHMT2 and DMGDH, and their respectively encoded proteins, show elevated levels of expression following transfection with an AAV encoding a compact bidirectional promoter of the disclosure, as compared to control mice. Such experiments can be used to confirm in vivo payload expression.
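Weight-based doses such as the 75 mg/kg D-luciferin dose and the AAV dose expressed in vg/kg can be converted to per-animal amounts with simple arithmetic. The sketch below is illustrative only: the 20 g body weight used in the usage note is a hypothetical value not stated in this disclosure, and the function names are likewise hypothetical.

```python
def per_animal_amount(dose_per_kg: float, body_weight_g: float) -> float:
    """Scale a per-kilogram dose (mg/kg or vg/kg) to one animal of the given weight in grams."""
    if body_weight_g <= 0:
        raise ValueError("body weight must be positive")
    return dose_per_kg * body_weight_g / 1000.0


def concentration_per_ml(amount: float, volume_ul: float) -> float:
    """Concentration (amount per mL) needed to deliver `amount` in `volume_ul` microliters."""
    if volume_ul <= 0:
        raise ValueError("volume must be positive")
    return amount / (volume_ul / 1000.0)
```

For a hypothetical 20 g mouse, `per_animal_amount(75, 20)` gives the D-luciferin mass in mg, `concentration_per_ml` of that mass with a 100 microliter injection volume gives the required stock concentration in mg/mL, and `per_animal_amount(2e14, 20)` gives the vector genomes per animal.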
INCORPORATION BY REFERENCE
[0281] The entire disclosure of each of the patent and scientific documents referred to herein is incorporated by reference for all purposes.
EQUIVALENTS
[0282] The invention may be embodied in other specific forms without departing from the spirit or essential characteristics thereof. The foregoing embodiments are therefore to be considered in all respects illustrative rather than limiting on the invention described herein. Scope of the invention is thus indicated by the appended claims rather than by the foregoing description, and all changes that come within the meaning and range of equivalency of the claims are intended to be embraced therein.
Claims
WHAT IS CLAIMED IS:
1. A nucleic acid comprising a compact bidirectional promoter, or a functional fragment or variant thereof, operably linked to at least one heterologous coding sequence, wherein the compact bidirectional promoter is less than about 1000 bp, and wherein the bidirectional promoter is capable of promoting transcription of two coding sequences positioned on opposite sides of the promoter.
2. The nucleic acid of claim 1, wherein the compact bidirectional promoter, or the functional fragment or the variant thereof, is between about 30 bp and about 800 bp.
3. The nucleic acid of claim 1, wherein the compact bidirectional promoter, or the functional fragment or the variant thereof, is between about 30 bp and about 600 bp.
4. The nucleic acid of claim 1, wherein the compact bidirectional promoter, or the functional fragment or the variant thereof, is between about 30 bp and about 400 bp.
5. The nucleic acid of claim 1, wherein the compact bidirectional promoter, or the functional fragment or the variant thereof, is between about 30 bp and about 200 bp.
6. The nucleic acid of claim 1, wherein the compact bidirectional promoter comprises a nucleic acid sequence selected from any one of SEQ ID NOs: 1-800, or a nucleic acid sequence having at least about 90%, at least about 95%, at least about 98%, at least about 99%, or at least about 99.5% sequence identity thereto.
7. The nucleic acid of any one of claims 1-6, wherein the compact promoter, or the functional fragment or the variant thereof, is operably linked to a 5' untranslated region (UTR).
8. The nucleic acid of any one of claims 1-7, wherein the compact promoter, or a functional fragment or variant thereof, is operably linked to a Kozak consensus sequence.
9. The nucleic acid of any one of claims 1-8, wherein the compact bidirectional promoter, or the functional fragment or the variant thereof, comprises at least 95%, at least 98%, at least 99%, at least 99.5% or 100% sequence identity to a naturally occurring mammalian promoter.
10. The nucleic acid of any one of claims 1-9, wherein the compact bidirectional promoter, or the functional fragment or the variant thereof, is operably linked to only one heterologous coding sequence.
11. The nucleic acid of any one of claims 1-9, wherein the compact bidirectional promoter, or the functional fragment or the variant thereof, is operably linked to two heterologous coding sequences positioned on opposite sides of the promoter.
12. The nucleic acid of claim 11, wherein the two heterologous coding sequences comprise the same coding sequence.
13. The nucleic acid of claim 11, wherein the two heterologous coding sequences comprise different coding sequences.
14. The nucleic acid of any one of claims 1-13, wherein the compact bidirectional promoter, or the functional fragment or the variant thereof, is capable of expressing the at least one heterologous coding sequence in a target cell.
15. The nucleic acid of claim 14, wherein the target cell is a lung cell, a pancreatic cell, a kidney cell, a muscle cell, a liver cell, a retinal cell, a neuron, a glial cell, an endothelial cell, or an epithelial cell.
16. The nucleic acid of any one of claims 11-13, wherein the compact bidirectional promoter, or the functional fragment or the variant thereof, is capable of expressing each of the two heterologous coding sequences:
(a) in the same target cell or cells,
(b) in different target cells, or
(c) in a partially overlapping set of target cells.
17. The nucleic acid of any one of claims 1-16, wherein the compact bidirectional promoter, or the functional fragment or the variant thereof, is capable of expressing a luciferase reporter at a higher level than is a HSV thymidine kinase (TK) promoter.
18. The nucleic acid of any one of claims 1-17, wherein the at least one coding sequence encodes cystic fibrosis transmembrane conductance regulator (CFTR), ATP7B, ATP7A, AGL, CPS1, A1AT, ALPL, ARSA, BBS1, BEST1, CAH, CFH, CFI, CHM, CLN2, CLN7, CNGA3, CYP46A1, F9, FKRP, FMR1, FMRP, FOXG1, GAD, GALC, GALGT2, GBA1, GBE1, GLB1, GRN, HEXA, HTRA1, IDS, IDUA, LAMP2, LCA5, MECP2, MFN2, MMUT, MTM1, NAGLU, ND4, PAH, PIGA, PRKN, RPE65, SERPING1, SGSH, SLC13A5, SLC6A1, or a functional fragment or variant thereof.
19. The nucleic acid of claim 18, wherein the at least one coding sequence is codon optimized, optionally wherein the codon optimized coding sequence comprises a nucleic acid sequence selected from any one of SEQ ID NOs: 819-836, or a nucleic acid sequence having at least about 90%, at least about 95%, at least about 98%, at least about 99%, or at least about 99.5% sequence identity thereto.
20. An expression construct comprising the nucleic acid of any one of claims 1-19.
21. A vector comprising the expression construct of claim 20, optionally wherein the vector is a plasmid, a DNA vector, an RNA vector, a virion, or a viral vector.
22. The vector of claim 21, wherein the vector is a viral vector.
23. The viral vector of claim 22, wherein the viral vector is an adeno-associated virus (AAV), lentivirus, adenovirus, simian virus 40, vaccinia virus, measles virus, herpes virus, or poxvirus.
24. The vector of claim 23, wherein the viral vector is an AAV vector.
25. The vector of claim 24, wherein the AAV is a single-stranded AAV (ssAAV) vector.
26. The vector of claim 25, wherein the AAV is a self-complementary AAV (scAAV) vector.
27. A method of expressing a heterologous coding sequence in a cell, the method comprising transfecting the cell with the expression construct of claim 20 or the vector of any one of claims 21-26.
28. A method of treating a disease in a subject in need thereof, the method comprising administering to the subject the vector of any one of claims 21-26.
29. A method of expressing at least one heterologous coding sequence in a target cell, the method comprising introducing into a subject a nucleic acid comprising a compact bidirectional promoter, or a functional fragment or variant thereof, operably linked to at least one heterologous coding sequence, wherein the compact bidirectional promoter is less than about 1000 bp, and wherein the bidirectional promoter is capable of promoting transcription of two coding sequences positioned on opposite sides of the promoter in the cell.
30. The method of claim 29, wherein the compact bidirectional promoter, or the functional fragment or the variant thereof, is between about 30 bp and about 800 bp.
31. The method of claim 29, wherein the compact bidirectional promoter, or the functional fragment or the variant thereof, is between about 30 bp and about 600 bp.
32. The method of claim 29, wherein the compact bidirectional promoter, or the functional fragment or the variant thereof, is between about 30 bp and about 400 bp.
33. The method of claim 29, wherein the compact bidirectional promoter, or the functional fragment or the variant thereof, is between about 30 bp and about 200 bp.
34. The method of claim 29, wherein the compact bidirectional promoter, or the functional fragment or the variant thereof, is operably linked to a 5' UTR.
35. The method of claim 29, wherein the compact bidirectional promoter, or the functional fragment or the variant thereof, is operably linked to a Kozak consensus sequence.
36. The method of claim 29, wherein the compact bidirectional promoter comprises a nucleic acid sequence selected from any one of SEQ ID NOs: 1-800, or a nucleic acid sequence having at least about 90%, at least about 95%, at least about 98%, at least about 99%, or at least about 99.5% sequence identity thereto.
37. The method of any one of claims 29-36, wherein the compact bidirectional promoter, or the functional fragment or the variant thereof, comprises at least 95%, at least 98%, at least 99%, at least 99.5% or 100% sequence identity to a naturally occurring mammalian promoter.
38. The method of any one of claims 29-37, wherein the compact bidirectional promoter, or the functional fragment or the variant thereof, is operably linked to only one heterologous coding sequence.
39. The method of any one of claims 29-37, wherein the compact bidirectional promoter, or the functional fragment or the variant thereof, is operably linked to two heterologous coding sequences positioned on opposite sides of the promoter.
40. The method of claim 39, wherein the two heterologous coding sequences comprise the same coding sequence.
41. The method of claim 39, wherein the two heterologous coding sequences comprise different coding sequences.
42. The method of any one of claims 29-41, wherein the compact bidirectional promoter, or the functional fragment or the variant thereof, expresses the at least one heterologous coding sequence in a target cell.
43. The method of claim 42, wherein the target cell is a lung cell, a pancreatic cell, a kidney cell, a muscle cell, a liver cell, a retinal cell, a neuron, a glial cell, an endothelial cell, or an epithelial cell.
44. The method of any one of claims 39-43, wherein the compact bidirectional promoter, or the functional fragment or the variant thereof, is capable of expressing each of the two heterologous coding sequences:
(a) in the same target cell or cells,
(b) in different target cells, or
(c) in a partially overlapping set of target cells.
45. The method of any one of claims 29-44, wherein the compact bidirectional promoter expresses a luciferase reporter at a higher level than is a HSV TK promoter.
46. The method of any one of claims 29-45, wherein the at least one coding sequence encodes CFTR, ATP7B, ATP7A, AGL, CPS1, A1AT, ALPL, ARSA, BBS1, BEST1, CAH, CFH, CFI, CHM, CLN2, CLN7, CNGA3, CYP46A1, F9, FKRP, FMR1, FMRP, FOXG1, GAD, GALC, GALGT2, GBA1, GBE1, GLB1, GRN, HEXA, HTRA1, IDS, IDUA, LAMP2, LCA5, MECP2, MFN2, MMUT, MTM1, NAGLU, ND4, PAH, PIGA, PRKN, RPE65, SERPING1, SGSH, SLC13A5, SLC6A1, or a functional fragment or variant thereof.
47. The method of any one of claims 29-46, wherein the at least one coding sequence is codon optimized, optionally wherein the codon optimized coding sequence comprises a nucleic acid sequence selected from any one of SEQ ID NOs: 819-836, or a nucleic acid sequence having at least about 90%, at least about 95%, at least about 98%, at least about 99%, or at least about 99.5% sequence identity thereto.
48. A method of expressing two heterologous coding sequences in different target cells, the method comprising introducing into a subject a nucleic acid comprising a compact bidirectional promoter, or a functional fragment or variant thereof, operably linked to the two heterologous coding sequences positioned on opposite sides of the compact bidirectional promoter in the cell, wherein the compact bidirectional promoter is less than about 1000 bp, and wherein the compact bidirectional promoter promotes transcription of one of the coding sequences in a first target cell and promotes transcription of the other coding sequence in a second target cell.
49. The method of claim 48, wherein the compact bidirectional promoter, or the functional fragment or the variant thereof, is between about 30 bp and about 800 bp.
50. The method of claim 48, wherein the compact bidirectional promoter, or the functional fragment or the variant thereof, is between about 30 bp and about 600 bp.
51. The method of claim 48, wherein the compact bidirectional promoter, or the functional fragment or the variant thereof, is between about 30 bp and about 400 bp.
52. The method of claim 48, wherein the compact bidirectional promoter, or the functional fragment or the variant thereof, is between about 30 bp and about 200 bp.
53. The method of claim 48, wherein the compact promoter, or the functional fragment or the variant thereof, is operably linked to a 5' UTR.
54. The method of claim 48, wherein the compact promoter, or the functional fragment or the variant thereof, is operably linked to a Kozak consensus sequence.
55. The method of any one of claims 48-54, wherein the compact bidirectional promoter comprises a nucleic acid sequence selected from any one of SEQ ID NOs: 1-800, or a nucleic acid sequence having at least about 90%, at least about 95%, at least about 98%, at least about 99%, or at least about 99.5% sequence identity thereto.
56. The method of any one of claims 48-55, wherein the compact bidirectional promoter, or the functional fragment or the variant thereof, comprises at least 95%, at least 98%, at least 99%, at least 99.5% or 100% sequence identity to a naturally occurring mammalian promoter.
57. The method of any one of claims 48-56, wherein the two heterologous coding sequences comprise the same coding sequence.
58. The method of any one of claims 48-57, wherein the two heterologous coding sequences comprise different coding sequences.
59. The method of any one of claims 48-58, wherein the compact bidirectional promoter, or the functional fragment or the variant thereof, expresses the at least one heterologous coding sequence in a target cell.
60. The method of claim 59, wherein the target cell is a lung cell, a pancreatic cell, a kidney cell, a muscle cell, a liver cell, a retinal cell, a neuron, a glial cell, an endothelial cell, or an epithelial cell.
61. The method of any one of claims 48-60, wherein the compact bidirectional promoter, or the functional fragment or the variant thereof, is capable of expressing each of the two heterologous coding sequences in a partially overlapping set of target cells.
62. The method of any one of claims 48-61, wherein the compact bidirectional promoter, or the functional fragment or the variant thereof, expresses a luciferase reporter at a higher level than is a HSV TK promoter.
63. The method of any one of claims 48-62, wherein the at least one coding sequence encodes CFTR, ATP7B, ATP7A, AGL, CPS1, A1AT, ALPL, ARSA, BBS1, BEST1, CAH, CFH, CFI, CHM, CLN2, CLN7, CNGA3, CYP46A1, F9, FKRP, FMR1, FMRP, FOXG1, GAD, GALC, GALGT2, GBA1, GBE1, GLB1, GRN, HEXA, HTRA1, IDS, IDUA, LAMP2, LCA5, MECP2, MFN2, MMUT, MTM1, NAGLU, ND4, PAH, PIGA, PRKN, RPE65, SERPING1, SGSH, SLC13A5, SLC6A1, or a functional fragment or variant thereof.
64. The method of any one of claims 48-63, wherein the at least one coding sequence is codon optimized, optionally wherein the codon optimized coding sequence comprises a nucleic acid sequence selected from any one of SEQ ID NOs: 819-836, or a nucleic acid sequence having at least about 90%, at least about 95%, at least about 98%, at least about 99%, or at least about 99.5% sequence identity thereto.
65. A method of administering an scAAV vector comprising a therapeutic coding sequence at a reduced dose for treating a disease treatable by the therapeutic coding sequence, the method comprising,
administering to a subject a scAAV comprising a compact bidirectional promoter, or a functional fragment or variant thereof, operably linked to the therapeutic coding sequence, wherein the compact bidirectional promoter, or the functional fragment or the variant thereof, is less than about 1000 bp and is heterologous to the therapeutic coding sequence, wherein the scAAV vector is administered at a reduced dose as compared to the therapeutically effective dose for an ssAAV vector comprising the therapeutic coding sequence.
66. The method of claim 65, wherein the reduced dose is between about 10-fold and about 600-fold lower than the therapeutically effective dose for an ssAAV vector.
67. The method of claim 65, wherein the reduced dose is about 10-fold lower than the therapeutically effective dose for an ssAAV vector.
68. The method of claim 65, wherein the bidirectional promoter, or the functional fragment or the variant thereof, is capable of promoting transcription of two coding sequences positioned on opposite sides of the promoter.
69. The method of any one of claims 65-68, wherein the compact bidirectional promoter, or the functional fragment or the variant thereof, is between about 30 bp and about 800 bp.
70. The method of any one of claims 65-68, wherein the compact bidirectional promoter, or the functional fragment or the variant thereof, is between about 30 bp and about 600 bp.
71. The method of any one of claims 65-68, wherein the compact bidirectional promoter, or the functional fragment or the variant thereof, is between about 30 bp and about 400 bp.
72. The method of any one of claims 65-68, wherein the compact bidirectional promoter, or the functional fragment or the variant thereof, is between about 30 bp and about 200 bp.
73. The method of any one of claims 65-68, wherein the compact bidirectional promoter comprises a nucleic acid sequence selected from any one of SEQ ID NOs: 1-800, or a nucleic acid sequence having at least about 90%, at least about 95%, at least about 98%, at least about 99%, or at least about 99.5% sequence identity thereto.
74. The method of any one of claims 65-73, wherein the compact bidirectional promoter, or the functional fragment or the variant thereof, comprises at least 95%, at least 98%, at least 99%, at least 99.5% or 100% sequence identity to a naturally occurring mammalian promoter.
75. The method of any one of claims 65-74, wherein the compact bidirectional promoter, or the functional fragment or the variant thereof, expresses the therapeutic coding sequence in a target cell.
76. The method of claim 75, wherein the target cell is a lung cell, a pancreatic cell, a kidney cell, a muscle cell, a liver cell, a retinal cell, a neuron, a glial cell, an endothelial cell, or an epithelial cell.
77. The method of any one of claims 65-76, wherein the compact bidirectional promoter expresses a luciferase reporter at a higher level than is a HSV TK promoter.
78. The method of any one of claims 65-77, wherein the therapeutic coding sequence encodes A1AT, ALPL, ARSA, BBS1, BEST1, CAH, CFH, CFI, CHM, CLN2, CLN7, CNGA3, CYP46A1, F9, FKRP, FMR1, FMRP, FOXG1, GAD, GALC, GALGT2, GBA1, GBE1, GLB1, GRN, HEXA, HTRA1, IDS, IDUA, LAMP2, LCA5, MECP2, MFN2, MMUT, MTM1, NAGLU, ND4, PAH, PIGA, PRKN, RPE65, SERPING1, SGSH, SLC13A5, SLC6A1, or a functional fragment or variant thereof.
79. The method of any one of claims 65-78, wherein the therapeutic coding sequence is codon optimized, optionally wherein the codon optimized coding sequence comprises a nucleic acid sequence selected from any one of SEQ ID NOs: 819-836, or a nucleic acid sequence having at least about 90%, at least about 95%, at least about 98%, at least about 99%, or at least about 99.5% sequence identity thereto.
80. The method of any one of claims 65-79, wherein the therapeutic coding sequence is less than about 750 amino acids.
81. The method of any one of claims 65-80, wherein the therapeutic coding sequence is from about 350 amino acids to about 750 amino acids.
82. A method comprising: obtaining a genome file comprising information about the location of transcription start sites on the plus and minus strands of a chromosome; and identifying regions between a transcription start site on the minus strand of the chromosome and a transcription start site on the plus strand of the chromosome, thereby identifying one or more bidirectional promoters.
83. The method of claim 82, wherein the genome file comprises annotations categorized by chromosome, wherein the annotations comprise indices, wherein the indices comprise genes, pseudogenes, and coding regions for protein-coding genes, wherein each coding region comprises a transcription start site.
84. The method of claim 82 or 83, wherein the one or more bidirectional promoters are identified by obtaining a non-transitory computer readable medium comprising instructions that, when executed by a processor, cause the processor to identify the regions between the transcription start site on the minus strand of a chromosome and the transcription start site on the plus strand of the chromosome.
85. The method of any one of claims 82-84, wherein the genome file comprising annotations comprises mammalian annotations.
86. The method of claim 83, wherein the mammalian annotations comprise human annotations or mouse annotations.
87. The method of claim 83, wherein the genome file comprising annotations is GRCh38_latest_genomic.gff or GRCm39_vM27.gff3.
88. The method of claim 87, wherein the genome file is GRCm39_vM27.gff3.
89. The method of claim 82, wherein the one or more bidirectional promoters are less than about 1000 bp.
90. The method of claim 89, wherein the one or more bidirectional promoters are between about 30 bp and about 800 bp.
91. The method of claim 89, wherein the one or more bidirectional promoters are between about 30 bp and about 600 bp.
92. The method of claim 89, wherein the one or more bidirectional promoters are between about 30 bp and about 400 bp.
93. The method of claim 89, wherein the one or more bidirectional promoters are between about 30 bp and about 200 bp.
94. The method of any one of claims 90-93, further comprising linking the one or more bidirectional promoters to at least one heterologous coding sequence.
95. The method of any one of claims 90-93, further comprising linking the one or more bidirectional promoters to two heterologous coding sequences.
96. The method of any one of claims 85-95, wherein the one or more bidirectional promoters is capable of promoting transcription of two coding sequences positioned on opposite sides of the promoter.
97. The method of claim 89, wherein the compact promoter is operably linked to a 5' UTR.
98. The method of claim 89, further comprising linking each of the one or more bidirectional promoters to only one heterologous coding sequence.
99. The method of claim 89, further comprising linking each of the one or more bidirectional promoters to two heterologous coding sequences positioned on opposite sides of the promoter.
100. The method of claim 99, wherein the two heterologous coding sequences comprise the same coding sequence.
101. The method of claim 99, wherein the two heterologous coding sequences comprise different coding sequences.
102. The method of claim 94, wherein the one or more bidirectional promoters are capable of expressing the at least one heterologous coding sequence in a target cell.
103. The method of claim 102, wherein the target cell is a lung cell, a pancreatic cell, a kidney cell, a muscle cell, a liver cell, a retinal cell, a neuron, a glial cell, an endothelial cell, or an epithelial cell.
104. The method of claim 94, wherein the one or more bidirectional promoters are capable of expressing each of the two heterologous coding sequences:
(a) in the same target cell or cells,
(b) in different target cells, or
(c) in a partially overlapping set of target cells.
105. The method of claim 89, wherein the one or more bidirectional promoters are capable of expressing a luciferase reporter at a higher level than is a HSV TK promoter.
106. The method of claim 94, wherein the at least one coding sequence encodes CFTR, ATP7B, ATP7A, AGL, CPS1, A1AT, ALPL, ARSA, BBS1, BEST1, CAH, CFH, CFI, CHM, CLN2, CLN7, CNGA3, CYP46A1, F9, FKRP, FMR1, FMRP, FOXG1, GAD, GALC, GALGT2, GBA1, GBE1, GLB1, GRN, HEXA, HTRA1, IDS, IDUA, LAMP2, LCA5, MECP2, MFN2, MMUT, MTM1, NAGLU, ND4, PAH, PIGA, PRKN, RPE65, SERPING1, SGSH, SLC13A5, or SLC6A1.
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US202263403571P | 2022-09-02 | 2022-09-02 | |
US63/403,571 | 2022-09-02 |
Publications (2)
Publication Number | Publication Date |
---|---|
WO2024050547A2 (en) | 2024-03-07 |
WO2024050547A3 (en) | 2024-05-16 |
Family
ID=90098786
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
PCT/US2023/073367 WO2024050547A2 (en) | 2022-09-02 | 2023-09-01 | Compact bidirectional promoters for gene expression |
Country Status (1)
Country | Link |
---|---|
WO (1) | WO2024050547A2 (en) |
-
2023
- 2023-09-01 WO PCT/US2023/073367 patent/WO2024050547A2/en unknown
Similar Documents
Publication | Publication Date | Title |
---|---|---|
US11034974B2 (en) | Hairpin MRNA elements and methods for the regulation of protein translation | |
AU2018337833B2 (en) | Adeno-associated virus variant capsids and methods of use thereof | |
AU2016362317B2 (en) | Scalable methods for producing recombinant Adeno-Associated Viral (AAV) vector in serum-free suspension cell culture system suitable for clinical use | |
KR102373765B1 (en) | Capsid-free aav vectors, compositions, and methods for vector production and gene delivery | |
JP5911069B2 (en) | Adeno-associated virus (AAV) syngeneic strains (clades), sequences, vectors containing them and uses thereof | |
US20200165632A1 (en) | ENHANCING AGENTS FOR IMPROVED CELL TRANSFECTION AND/OR rAAV VECTOR PRODUCTION | |
JP2017510264A (en) | Further improved AAV vectors produced in insect cells | |
CN110606874A (en) | Variant AAV and compositions, methods and uses for gene transfer into cells, organs and tissues | |
CN106884014B (en) | Adeno-associated virus inverted terminal repeat sequence mutant and application thereof | |
JP2021514659A (en) | AAV chimera | |
WO2021113634A1 (en) | Transgene cassettes designed to express a human mecp2 gene | |
TW201837173A (en) | shRNA expression cassette, polynucleotide sequence carrying same and application thereof sequentially containing a DNA sequence for expressing shRNA and a filling sequence according to a sequence 5'-3' | |
JP6929230B2 (en) | Nucleic acid molecules containing spacers and methods of their use | |
US20210301305A1 (en) | Engineered untranslated regions (utr) for aav production | |
WO2021246909A1 (en) | Codon-optimized nucleic acid encoding smn1 protein | |
WO2024050547A2 (en) | Compact bidirectional promoters for gene expression | |
US20230049066A1 (en) | Novel aav3b variants that target human hepatocytes in the liver of humanized mice | |
JP2023518415A (en) | Compositions and methods for reducing reverse packaging of CAP and REP sequences in recombinant AAV | |
US20220177529A1 (en) | Fusion protein for enhancing gene editing and use thereof | |
OA21075A (en) | Codon-optimized nucleic acid that encodes SMN1 protein, and use thereof | |
WO2023025920A1 (en) | Insect cell-produced high potency aav vectors with cns-tropism | |
WO2023144565A1 (en) | Recombinant optimized mecp2 cassettes and methods for treating rett syndrome and related disorders | |
WO2024015877A2 (en) | Novel aav3b capsid variants with enhanced hepatocyte tropism | |
JP2024506681A (en) | Use of histidine-rich peptides as transfection reagents for rAAV and rBV production | |
CN117377500A (en) | Adeno-associated viral vector capsids with improved tissue tropism |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
 | 121 | EP: the EPO has been informed by WIPO that EP was designated in this application | Ref document number: 23861621; Country of ref document: EP; Kind code of ref document: A2 |