WO2024055013A1 - Systems and methods for transposing cargo nucleotide sequences - Google Patents
- Publication number
- WO2024055013A1 (application PCT/US2023/073796)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- cell
- transposase
- nucleic acid
- sequence
- cargo
- Prior art date
Links
- 238000000034 method Methods 0.000 title claims abstract description 46
- 108091028043 Nucleic acid sequence Proteins 0.000 title description 10
- 108010020764 Transposases Proteins 0.000 claims abstract description 175
- 102000008579 Transposases Human genes 0.000 claims abstract description 174
- 150000007523 nucleic acids Chemical class 0.000 claims abstract description 110
- 102000039446 nucleic acids Human genes 0.000 claims abstract description 108
- 108020004707 nucleic acids Proteins 0.000 claims abstract description 108
- 125000003729 nucleotide group Chemical group 0.000 claims abstract description 73
- 239000002773 nucleotide Substances 0.000 claims abstract description 69
- 210000004027 cell Anatomy 0.000 claims description 187
- 108020004414 DNA Proteins 0.000 claims description 77
- 102000053602 DNA Human genes 0.000 claims description 77
- 238000009739 binding Methods 0.000 claims description 13
- 230000027455 binding Effects 0.000 claims description 12
- 230000002538 fungal effect Effects 0.000 claims description 10
- 210000004962 mammalian cell Anatomy 0.000 claims description 7
- 210000003527 eukaryotic cell Anatomy 0.000 claims description 6
- 210000003958 hematopoietic stem cell Anatomy 0.000 claims description 6
- 210000001236 prokaryotic cell Anatomy 0.000 claims description 5
- 241000238631 Hexapoda Species 0.000 claims description 4
- 101000600434 Homo sapiens Putative uncharacterized protein encoded by MIR7-3HG Proteins 0.000 claims description 4
- 102100037401 Putative uncharacterized protein encoded by MIR7-3HG Human genes 0.000 claims description 4
- 210000005253 yeast cell Anatomy 0.000 claims description 4
- 108020000946 Bacterial DNA Proteins 0.000 claims description 3
- 210000001744 T-lymphocyte Anatomy 0.000 claims description 3
- 108020005202 Viral DNA Proteins 0.000 claims description 3
- 230000030648 nucleus localization Effects 0.000 claims description 3
- 241000700605 Viruses Species 0.000 description 81
- 108090000623 proteins and genes Proteins 0.000 description 59
- 102000004169 proteins and genes Human genes 0.000 description 33
- 235000018102 proteins Nutrition 0.000 description 32
- 102000040430 polynucleotide Human genes 0.000 description 29
- 108091033319 polynucleotide Proteins 0.000 description 29
- 239000002157 polynucleotide Substances 0.000 description 28
- 108090000765 processed proteins & peptides Proteins 0.000 description 27
- 102000004196 processed proteins & peptides Human genes 0.000 description 23
- 229920001184 polypeptide Polymers 0.000 description 22
- 230000000694 effects Effects 0.000 description 20
- 239000013598 vector Substances 0.000 description 19
- 241000588724 Escherichia coli Species 0.000 description 18
- 235000001014 amino acid Nutrition 0.000 description 18
- 229940024606 amino acid Drugs 0.000 description 17
- 150000001413 amino acids Chemical class 0.000 description 17
- 238000003776 cleavage reaction Methods 0.000 description 17
- 230000007017 scission Effects 0.000 description 17
- 239000013612 plasmid Substances 0.000 description 16
- 229920002477 rna polymer Polymers 0.000 description 14
- 238000006467 substitution reaction Methods 0.000 description 13
- 230000017105 transposition Effects 0.000 description 13
- PEDCQBHIVMGVHV-UHFFFAOYSA-N Glycerine Chemical compound OCC(O)CO PEDCQBHIVMGVHV-UHFFFAOYSA-N 0.000 description 12
- 102100034343 Integrase Human genes 0.000 description 12
- 108091005804 Peptidases Proteins 0.000 description 11
- 239000004365 Protease Substances 0.000 description 11
- 102100037486 Reverse transcriptase/ribonuclease H Human genes 0.000 description 11
- 241000196324 Embryophyta Species 0.000 description 10
- 108700026244 Open Reading Frames Proteins 0.000 description 10
- 230000010354 integration Effects 0.000 description 10
- 239000002502 liposome Substances 0.000 description 10
- 229950010342 uridine triphosphate Drugs 0.000 description 10
- 230000002068 genetic effect Effects 0.000 description 9
- 238000010362 genome editing Methods 0.000 description 9
- 230000037431 insertion Effects 0.000 description 9
- 238000003780 insertion Methods 0.000 description 9
- 150000002632 lipids Chemical class 0.000 description 9
- 241000702421 Dependoparvovirus Species 0.000 description 8
- 238000001597 immobilized metal affinity chromatography Methods 0.000 description 8
- 238000000338 in vitro Methods 0.000 description 8
- 239000003550 marker Substances 0.000 description 8
- 230000007246 mechanism Effects 0.000 description 8
- 230000035772 mutation Effects 0.000 description 8
- 102000004190 Enzymes Human genes 0.000 description 7
- 108090000790 Enzymes Proteins 0.000 description 7
- 108010061833 Integrases Proteins 0.000 description 7
- 230000003197 catalytic effect Effects 0.000 description 7
- 230000000295 complement effect Effects 0.000 description 7
- -1 diTP Chemical compound 0.000 description 7
- 229940088598 enzyme Drugs 0.000 description 7
- 108020004999 messenger RNA Proteins 0.000 description 7
- 229920000642 polymer Polymers 0.000 description 7
- 230000001105 regulatory effect Effects 0.000 description 7
- FAPWRFPIFSIZLT-UHFFFAOYSA-M Sodium chloride Chemical compound [Na+].[Cl-] FAPWRFPIFSIZLT-UHFFFAOYSA-M 0.000 description 6
- 239000012634 fragment Substances 0.000 description 6
- 239000000203 mixture Substances 0.000 description 6
- 230000004048 modification Effects 0.000 description 6
- 238000012986 modification Methods 0.000 description 6
- 238000013518 transcription Methods 0.000 description 6
- 230000035897 transcription Effects 0.000 description 6
- 230000003612 virological effect Effects 0.000 description 6
- 108010092799 RNA-directed DNA polymerase Proteins 0.000 description 5
- 238000004458 analytical method Methods 0.000 description 5
- 230000003115 biocidal effect Effects 0.000 description 5
- VYXSBFYARXAAKO-UHFFFAOYSA-N ethyl 2-[3-(ethylamino)-6-ethylimino-2,7-dimethylxanthen-9-yl]benzoate;hydron;chloride Chemical compound [Cl-].C1=2C=C(C)C(NCC)=CC=2OC2=CC(=[NH+]CC)C(C)=CC2=C1C1=CC=CC=C1C(=O)OCC VYXSBFYARXAAKO-UHFFFAOYSA-N 0.000 description 5
- 238000001638 lipofection Methods 0.000 description 5
- 239000011159 matrix material Substances 0.000 description 5
- 244000005700 microbiome Species 0.000 description 5
- 238000002887 multiple sequence alignment Methods 0.000 description 5
- 230000010076 replication Effects 0.000 description 5
- OAKPWEUQDVLTCN-NKWVEPMBSA-N 2',3'-Dideoxyadenosine-5-triphosphate Chemical compound C1=NC=2C(N)=NC=NC=2N1[C@H]1CC[C@@H](CO[P@@](O)(=O)O[P@](O)(=O)OP(O)(O)=O)O1 OAKPWEUQDVLTCN-NKWVEPMBSA-N 0.000 description 4
- 108091026890 Coding region Proteins 0.000 description 4
- 241001465754 Metazoa Species 0.000 description 4
- 241000283984 Rodentia Species 0.000 description 4
- 241000723792 Tobacco etch virus Species 0.000 description 4
- ARLKCWCREKRROD-POYBYMJQSA-N [[(2s,5r)-5-(4-amino-2-oxopyrimidin-1-yl)oxolan-2-yl]methoxy-hydroxyphosphoryl] phosphono hydrogen phosphate Chemical compound O=C1N=C(N)C=CN1[C@@H]1O[C@H](COP(O)(=O)OP(O)(=O)OP(O)(O)=O)CC1 ARLKCWCREKRROD-POYBYMJQSA-N 0.000 description 4
- 125000003275 alpha amino acid group Chemical group 0.000 description 4
- 230000001580 bacterial effect Effects 0.000 description 4
- RGWHQCVHVJXOKC-SHYZEUOFSA-J dCTP(4-) Chemical compound O=C1N=C(N)C=CN1[C@@H]1O[C@H](COP([O-])(=O)OP([O-])(=O)OP([O-])([O-])=O)[C@@H](O)C1 RGWHQCVHVJXOKC-SHYZEUOFSA-J 0.000 description 4
- 230000001939 inductive effect Effects 0.000 description 4
- PYWVYCXTNDRMGF-UHFFFAOYSA-N rhodamine B Chemical compound [Cl-].C=12C=CC(=[N+](CC)CC)C=C2OC2=CC(N(CC)CC)=CC=C2C=1C1=CC=CC=C1C(O)=O PYWVYCXTNDRMGF-UHFFFAOYSA-N 0.000 description 4
- 239000004055 small Interfering RNA Substances 0.000 description 4
- ABZLKHKQJHEPAX-UHFFFAOYSA-N tetramethylrhodamine Chemical compound C=12C=CC(N(C)C)=CC2=[O+]C2=CC(N(C)C)=CC=C2C=1C1=CC=CC=C1C([O-])=O ABZLKHKQJHEPAX-UHFFFAOYSA-N 0.000 description 4
- 239000001226 triphosphate Substances 0.000 description 4
- 235000011178 triphosphate Nutrition 0.000 description 4
- ZKHQWZAMYRWXGA-KQYNXXCUSA-J ATP(4-) Chemical compound C1=NC=2C(N)=NC=NC=2N1[C@@H]1O[C@H](COP([O-])(=O)OP([O-])(=O)OP([O-])([O-])=O)[C@@H](O)[C@H]1O ZKHQWZAMYRWXGA-KQYNXXCUSA-J 0.000 description 3
- ZKHQWZAMYRWXGA-UHFFFAOYSA-N Adenosine triphosphate Natural products C1=NC=2C(N)=NC=NC=2N1C1OC(COP(O)(=O)OP(O)(=O)OP(O)(O)=O)C(O)C1O ZKHQWZAMYRWXGA-UHFFFAOYSA-N 0.000 description 3
- AHCYMLUZIRLXAA-SHYZEUOFSA-N Deoxyuridine 5'-triphosphate Chemical compound O1[C@H](COP(O)(=O)OP(O)(=O)OP(O)(O)=O)[C@@H](O)C[C@@H]1N1C(=O)NC(=O)C=C1 AHCYMLUZIRLXAA-SHYZEUOFSA-N 0.000 description 3
- 241000713666 Lentivirus Species 0.000 description 3
- 108020004682 Single-Stranded DNA Proteins 0.000 description 3
- 239000007983 Tris buffer Substances 0.000 description 3
- HDRRAMINWIWTNU-NTSWFWBYSA-N [[(2s,5r)-5-(2-amino-6-oxo-3h-purin-9-yl)oxolan-2-yl]methoxy-hydroxyphosphoryl] phosphono hydrogen phosphate Chemical compound C1=2NC(N)=NC(=O)C=2N=CN1[C@H]1CC[C@@H](COP(O)(=O)OP(O)(=O)OP(O)(O)=O)O1 HDRRAMINWIWTNU-NTSWFWBYSA-N 0.000 description 3
- 125000000539 amino acid group Chemical group 0.000 description 3
- 210000004102 animal cell Anatomy 0.000 description 3
- XMQFTWRPUQYINF-UHFFFAOYSA-N bensulfuron-methyl Chemical compound COC(=O)C1=CC=CC=C1CS(=O)(=O)NC(=O)NC1=NC(OC)=CC(OC)=N1 XMQFTWRPUQYINF-UHFFFAOYSA-N 0.000 description 3
- 230000004071 biological effect Effects 0.000 description 3
- SUYVUBYJARFZHO-UHFFFAOYSA-N dATP Natural products C1=NC=2C(N)=NC=NC=2N1C1CC(O)C(COP(O)(=O)OP(O)(=O)OP(O)(O)=O)O1 SUYVUBYJARFZHO-UHFFFAOYSA-N 0.000 description 3
- URGJWIFLBWJRMF-JGVFFNPUSA-N ddTTP Chemical compound O=C1NC(=O)C(C)=CN1[C@@H]1O[C@H](COP(O)(=O)OP(O)(=O)OP(O)(O)=O)CC1 URGJWIFLBWJRMF-JGVFFNPUSA-N 0.000 description 3
- 230000037430 deletion Effects 0.000 description 3
- 238000012217 deletion Methods 0.000 description 3
- 238000004520 electroporation Methods 0.000 description 3
- 239000000499 gel Substances 0.000 description 3
- 230000012010 growth Effects 0.000 description 3
- 210000005260 human cell Anatomy 0.000 description 3
- 238000001727 in vivo Methods 0.000 description 3
- 229920002521 macromolecule Polymers 0.000 description 3
- 238000004519 manufacturing process Methods 0.000 description 3
- 238000007481 next generation sequencing Methods 0.000 description 3
- 230000008569 process Effects 0.000 description 3
- 238000011160 research Methods 0.000 description 3
- 238000010845 search algorithm Methods 0.000 description 3
- 239000013049 sediment Substances 0.000 description 3
- 238000012163 sequencing technique Methods 0.000 description 3
- 239000011780 sodium chloride Substances 0.000 description 3
- 241000894007 species Species 0.000 description 3
- 238000003786 synthesis reaction Methods 0.000 description 3
- 230000001225 therapeutic effect Effects 0.000 description 3
- 101150065732 tir gene Proteins 0.000 description 3
- 238000010361 transduction Methods 0.000 description 3
- 230000026683 transduction Effects 0.000 description 3
- 125000002264 triphosphate group Chemical class [H]OP(=O)(O[H])OP(=O)(O[H])OP(=O)(O[H])O* 0.000 description 3
- 241001529453 unidentified herpesvirus Species 0.000 description 3
- YBJHBAHKTGYVGT-ZKWXMUAHSA-N (+)-Biotin Chemical compound N1C(=O)N[C@@H]2[C@H](CCCCC(=O)O)SC[C@@H]21 YBJHBAHKTGYVGT-ZKWXMUAHSA-N 0.000 description 2
- VGIRNWJSIRVFRT-UHFFFAOYSA-N 2',7'-difluorofluorescein Chemical compound OC(=O)C1=CC=CC=C1C1=C2C=C(F)C(=O)C=C2OC2=CC(O)=C(F)C=C21 VGIRNWJSIRVFRT-UHFFFAOYSA-N 0.000 description 2
- JKMHFZQWWAIEOD-UHFFFAOYSA-N 2-[4-(2-hydroxyethyl)piperazin-1-yl]ethanesulfonic acid Chemical compound OCC[NH+]1CCN(CCS([O-])(=O)=O)CC1 JKMHFZQWWAIEOD-UHFFFAOYSA-N 0.000 description 2
- WCKQPPQRFNHPRJ-UHFFFAOYSA-N 4-[[4-(dimethylamino)phenyl]diazenyl]benzoic acid Chemical compound C1=CC(N(C)C)=CC=C1N=NC1=CC=C(C(O)=O)C=C1 WCKQPPQRFNHPRJ-UHFFFAOYSA-N 0.000 description 2
- SJQRQOKXQKVJGJ-UHFFFAOYSA-N 5-(2-aminoethylamino)naphthalene-1-sulfonic acid Chemical compound C1=CC=C2C(NCCN)=CC=CC2=C1S(O)(=O)=O SJQRQOKXQKVJGJ-UHFFFAOYSA-N 0.000 description 2
- 241001655883 Adeno-associated virus - 1 Species 0.000 description 2
- 241000702423 Adeno-associated virus - 2 Species 0.000 description 2
- 241000202702 Adeno-associated virus - 3 Species 0.000 description 2
- 241000580270 Adeno-associated virus - 4 Species 0.000 description 2
- 241001634120 Adeno-associated virus - 5 Species 0.000 description 2
- 241000972680 Adeno-associated virus - 6 Species 0.000 description 2
- 241001164823 Adeno-associated virus - 7 Species 0.000 description 2
- 241001164825 Adeno-associated virus - 8 Species 0.000 description 2
- 241000649045 Adeno-associated virus 10 Species 0.000 description 2
- 241000710929 Alphavirus Species 0.000 description 2
- 241001339993 Anelloviridae Species 0.000 description 2
- 241000124740 Bocaparvovirus Species 0.000 description 2
- 102000004657 Calcium-Calmodulin-Dependent Protein Kinase Type 2 Human genes 0.000 description 2
- 108010003721 Calcium-Calmodulin-Dependent Protein Kinase Type 2 Proteins 0.000 description 2
- 241000222120 Candida <Saccharomycetales> Species 0.000 description 2
- 241001337994 Cryptococcus <scale insect> Species 0.000 description 2
- 241000725619 Dengue virus Species 0.000 description 2
- KCXVZYZYPLLWCC-UHFFFAOYSA-N EDTA Chemical compound OC(=O)CN(CC(O)=O)CCN(CC(O)=O)CC(O)=O KCXVZYZYPLLWCC-UHFFFAOYSA-N 0.000 description 2
- 102100031780 Endonuclease Human genes 0.000 description 2
- 108010042407 Endonucleases Proteins 0.000 description 2
- 108010013369 Enteropeptidase Proteins 0.000 description 2
- 102100029727 Enteropeptidase Human genes 0.000 description 2
- 108010074860 Factor Xa Proteins 0.000 description 2
- 102000005720 Glutathione transferase Human genes 0.000 description 2
- 108010070675 Glutathione transferase Proteins 0.000 description 2
- DHMQDGOQFOQNFH-UHFFFAOYSA-N Glycine Chemical compound NCC(O)=O DHMQDGOQFOQNFH-UHFFFAOYSA-N 0.000 description 2
- XKMLYUALXHKNFT-UUOKFMHZSA-N Guanosine-5'-triphosphate Chemical compound C1=2NC(N)=NC(=O)C=2N=CN1[C@@H]1O[C@H](COP(O)(=O)OP(O)(=O)OP(O)(O)=O)[C@@H](O)[C@H]1O XKMLYUALXHKNFT-UUOKFMHZSA-N 0.000 description 2
- 239000007995 HEPES buffer Substances 0.000 description 2
- 101000829506 Homo sapiens Rhodopsin kinase GRK1 Proteins 0.000 description 2
- 241000700588 Human alphaherpesvirus 1 Species 0.000 description 2
- 241000701074 Human alphaherpesvirus 2 Species 0.000 description 2
- 241000701041 Human betaherpesvirus 7 Species 0.000 description 2
- 241001502974 Human gammaherpesvirus 8 Species 0.000 description 2
- 241000701027 Human herpesvirus 6 Species 0.000 description 2
- CKLJMWTZIZZHCS-REOHCLBHSA-N L-aspartic acid Chemical compound OC(=O)[C@@H](N)CC(O)=O CKLJMWTZIZZHCS-REOHCLBHSA-N 0.000 description 2
- WHUUTDBJXJRKMK-VKHMYHEASA-N L-glutamic acid Chemical compound OC(=O)[C@@H](N)CCC(O)=O WHUUTDBJXJRKMK-VKHMYHEASA-N 0.000 description 2
- FFEARJCKVFRZRR-BYPYZUCNSA-N L-methionine Chemical compound CSCC[C@H](N)C(O)=O FFEARJCKVFRZRR-BYPYZUCNSA-N 0.000 description 2
- 101710175625 Maltose/maltodextrin-binding periplasmic protein Proteins 0.000 description 2
- 108060004795 Methyltransferase Proteins 0.000 description 2
- 108091034117 Oligonucleotide Proteins 0.000 description 2
- 229910019142 PO4 Inorganic materials 0.000 description 2
- 241000125945 Protoparvovirus Species 0.000 description 2
- 102100023742 Rhodopsin kinase GRK1 Human genes 0.000 description 2
- 240000004808 Saccharomyces cerevisiae Species 0.000 description 2
- 235000014680 Saccharomyces cerevisiae Nutrition 0.000 description 2
- 108091027967 Small hairpin RNA Proteins 0.000 description 2
- 108020004459 Small interfering RNA Proteins 0.000 description 2
- 244000057717 Streptococcus lactis Species 0.000 description 2
- 235000014897 Streptococcus lactis Nutrition 0.000 description 2
- 102000001435 Synapsin Human genes 0.000 description 2
- 108050009621 Synapsin Proteins 0.000 description 2
- PZBFGYYEXUXCOF-UHFFFAOYSA-N TCEP Chemical compound OC(=O)CCP(CCC(O)=O)CCC(O)=O PZBFGYYEXUXCOF-UHFFFAOYSA-N 0.000 description 2
- 108090000190 Thrombin Proteins 0.000 description 2
- 108020004566 Transfer RNA Proteins 0.000 description 2
- 108091023045 Untranslated Region Proteins 0.000 description 2
- ISAKRJDGNUQOIC-UHFFFAOYSA-N Uracil Chemical compound O=C1C=CNC(=O)N1 ISAKRJDGNUQOIC-UHFFFAOYSA-N 0.000 description 2
- 241000700618 Vaccinia virus Species 0.000 description 2
- 108020000999 Viral RNA Proteins 0.000 description 2
- PGAVKCOVUIYSFO-UHFFFAOYSA-N [[5-(2,4-dioxopyrimidin-1-yl)-3,4-dihydroxyoxolan-2-yl]methoxy-hydroxyphosphoryl] phosphono hydrogen phosphate Chemical compound OC1C(O)C(COP(O)(=O)OP(O)(=O)OP(O)(O)=O)OC1N1C(=O)NC(=O)C=C1 PGAVKCOVUIYSFO-UHFFFAOYSA-N 0.000 description 2
- 230000004913 activation Effects 0.000 description 2
- 238000013459 approach Methods 0.000 description 2
- 210000004436 artificial bacterial chromosome Anatomy 0.000 description 2
- 210000004507 artificial chromosome Anatomy 0.000 description 2
- 210000001106 artificial yeast chromosome Anatomy 0.000 description 2
- 238000003556 assay Methods 0.000 description 2
- 230000008901 benefit Effects 0.000 description 2
- 230000015572 biosynthetic process Effects 0.000 description 2
- 108091092259 cell-free RNA Proteins 0.000 description 2
- 239000013043 chemical agent Substances 0.000 description 2
- 239000003795 chemical substances by application Substances 0.000 description 2
- 210000000349 chromosome Anatomy 0.000 description 2
- OPTASPLRGRRNAP-UHFFFAOYSA-N cytosine Chemical compound NC=1C=CNC(=O)N=1 OPTASPLRGRRNAP-UHFFFAOYSA-N 0.000 description 2
- SUYVUBYJARFZHO-RRKCRQDMSA-N dATP Chemical compound C1=NC=2C(N)=NC=NC=2N1[C@H]1C[C@H](O)[C@@H](COP(O)(=O)OP(O)(=O)OP(O)(O)=O)O1 SUYVUBYJARFZHO-RRKCRQDMSA-N 0.000 description 2
- 230000003247 decreasing effect Effects 0.000 description 2
- 230000002950 deficient Effects 0.000 description 2
- 201000010099 disease Diseases 0.000 description 2
- 208000037265 diseases, disorders, signs and symptoms Diseases 0.000 description 2
- 238000002337 electrophoretic mobility shift assay Methods 0.000 description 2
- 230000002255 enzymatic effect Effects 0.000 description 2
- GNBHRKFJIUUOQI-UHFFFAOYSA-N fluorescein Chemical compound O1C(=O)C2=CC=CC=C2C21C1=CC=C(O)C=C1OC1=CC(O)=CC=C21 GNBHRKFJIUUOQI-UHFFFAOYSA-N 0.000 description 2
- 210000000688 human artificial chromosome Anatomy 0.000 description 2
- 150000002500 ions Chemical class 0.000 description 2
- 210000003734 kidney Anatomy 0.000 description 2
- 239000007788 liquid Substances 0.000 description 2
- 230000033001 locomotion Effects 0.000 description 2
- 239000006166 lysate Substances 0.000 description 2
- 102000016470 mariner transposase Human genes 0.000 description 2
- 108060004631 mariner transposase Proteins 0.000 description 2
- 230000001404 mediated effect Effects 0.000 description 2
- 229930182817 methionine Natural products 0.000 description 2
- 108091070501 miRNA Proteins 0.000 description 2
- 239000002679 microRNA Substances 0.000 description 2
- 238000010369 molecular cloning Methods 0.000 description 2
- 210000003205 muscle Anatomy 0.000 description 2
- 239000002105 nanoparticle Substances 0.000 description 2
- 235000015097 nutrients Nutrition 0.000 description 2
- 101150093139 ompT gene Proteins 0.000 description 2
- 239000010452 phosphate Substances 0.000 description 2
- 238000010839 reverse transcription Methods 0.000 description 2
- 108020004418 ribosomal RNA Proteins 0.000 description 2
- 150000003384 small molecules Chemical class 0.000 description 2
- 239000002689 soil Substances 0.000 description 2
- 239000000243 solution Substances 0.000 description 2
- 239000000126 substance Substances 0.000 description 2
- 230000008685 targeting Effects 0.000 description 2
- 229960004072 thrombin Drugs 0.000 description 2
- RWQNBRDOKXIBIV-UHFFFAOYSA-N thymine Chemical compound CC1=CNC(=O)NC1=O RWQNBRDOKXIBIV-UHFFFAOYSA-N 0.000 description 2
- 238000012546 transfer Methods 0.000 description 2
- 230000001052 transient effect Effects 0.000 description 2
- 238000013519 translation Methods 0.000 description 2
- LENZDBCJOHFCAS-UHFFFAOYSA-N tris Chemical compound OCC(N)(CO)CO LENZDBCJOHFCAS-UHFFFAOYSA-N 0.000 description 2
- 241000701161 unidentified adenovirus Species 0.000 description 2
- 241000701447 unidentified baculovirus Species 0.000 description 2
- 241001430294 unidentified retrovirus Species 0.000 description 2
- 108091064702 1 family Proteins 0.000 description 1
- QKNYBSVHEMOAJP-UHFFFAOYSA-N 2-amino-2-(hydroxymethyl)propane-1,3-diol;hydron;chloride Chemical compound Cl.OCC(N)(CO)CO QKNYBSVHEMOAJP-UHFFFAOYSA-N 0.000 description 1
- ZLOIGESWDJYCTF-XVFCMESISA-N 4-thiouridine Chemical compound O[C@@H]1[C@H](O)[C@@H](CO)O[C@H]1N1C(=O)NC(=S)C=C1 ZLOIGESWDJYCTF-XVFCMESISA-N 0.000 description 1
- LQLQRFGHAALLLE-UHFFFAOYSA-N 5-bromouracil Chemical compound BrC1=CNC(=O)NC1=O LQLQRFGHAALLLE-UHFFFAOYSA-N 0.000 description 1
- NJYVEMPWNAYQQN-UHFFFAOYSA-N 5-carboxyfluorescein Chemical compound C12=CC=C(O)C=C2OC2=CC(O)=CC=C2C21OC(=O)C1=CC(C(=O)O)=CC=C21 NJYVEMPWNAYQQN-UHFFFAOYSA-N 0.000 description 1
- WQZIDRAQTRIQDX-UHFFFAOYSA-N 6-carboxy-x-rhodamine Chemical compound OC(=O)C1=CC=C(C([O-])=O)C=C1C(C1=CC=2CCCN3CCCC(C=23)=C1O1)=C2C1=C(CCC1)C3=[N+]1CCCC3=C2 WQZIDRAQTRIQDX-UHFFFAOYSA-N 0.000 description 1
- FVFVNNKYKYZTJU-UHFFFAOYSA-N 6-chloro-1,3,5-triazine-2,4-diamine Chemical compound NC1=NC(N)=NC(Cl)=N1 FVFVNNKYKYZTJU-UHFFFAOYSA-N 0.000 description 1
- HRPVXLWXLXDGHG-UHFFFAOYSA-N Acrylamide Chemical compound NC(=O)C=C HRPVXLWXLXDGHG-UHFFFAOYSA-N 0.000 description 1
- 241000649046 Adeno-associated virus 11 Species 0.000 description 1
- 241000649047 Adeno-associated virus 12 Species 0.000 description 1
- 241000300529 Adeno-associated virus 13 Species 0.000 description 1
- 108091093088 Amplicon Proteins 0.000 description 1
- 241000203069 Archaea Species 0.000 description 1
- 239000004475 Arginine Substances 0.000 description 1
- DCXYFEDJOCDNAF-UHFFFAOYSA-N Asparagine Natural products OC(=O)C(N)CC(N)=O DCXYFEDJOCDNAF-UHFFFAOYSA-N 0.000 description 1
- 101100007857 Bacillus subtilis (strain 168) cspB gene Proteins 0.000 description 1
- 241000894006 Bacteria Species 0.000 description 1
- 241000186018 Bifidobacterium adolescentis Species 0.000 description 1
- 241001134770 Bifidobacterium animalis Species 0.000 description 1
- 241000901050 Bifidobacterium animalis subsp. lactis Species 0.000 description 1
- 241000186012 Bifidobacterium breve Species 0.000 description 1
- 241001608472 Bifidobacterium longum Species 0.000 description 1
- 241000186015 Bifidobacterium longum subsp. infantis Species 0.000 description 1
- 108090000994 Catalytic RNA Proteins 0.000 description 1
- 102000053642 Catalytic RNA Human genes 0.000 description 1
- 108020004638 Circular DNA Proteins 0.000 description 1
- KQLDDLUWUFBQHP-UHFFFAOYSA-N Cordycepin Natural products C1=NC=2C(N)=NC=NC=2N1C1OCC(CO)C1O KQLDDLUWUFBQHP-UHFFFAOYSA-N 0.000 description 1
- 108091029523 CpG island Proteins 0.000 description 1
- 241000699800 Cricetinae Species 0.000 description 1
- 241000699802 Cricetulus griseus Species 0.000 description 1
- 150000008574 D-amino acids Chemical class 0.000 description 1
- 102000004594 DNA Polymerase I Human genes 0.000 description 1
- 108010017826 DNA Polymerase I Proteins 0.000 description 1
- 230000007018 DNA scission Effects 0.000 description 1
- 108090000626 DNA-directed RNA polymerases Proteins 0.000 description 1
- 102000004163 DNA-directed RNA polymerases Human genes 0.000 description 1
- 241000255601 Drosophila melanogaster Species 0.000 description 1
- 241000701959 Escherichia virus Lambda Species 0.000 description 1
- 241000206602 Eukaryota Species 0.000 description 1
- 108700024394 Exon Proteins 0.000 description 1
- 241000233866 Fungi Species 0.000 description 1
- 108091092584 GDNA Proteins 0.000 description 1
- 108700028146 Genetic Enhancer Elements Proteins 0.000 description 1
- WHUUTDBJXJRKMK-UHFFFAOYSA-N Glutamic acid Natural products OC(=O)C(N)CCC(O)=O WHUUTDBJXJRKMK-UHFFFAOYSA-N 0.000 description 1
- 239000004471 Glycine Substances 0.000 description 1
- NYHBQMYGNKIUIF-UUOKFMHZSA-N Guanosine Chemical compound C1=NC=2C(=O)NC(N)=NC=2N1[C@@H]1O[C@H](CO)[C@@H](O)[C@H]1O NYHBQMYGNKIUIF-UUOKFMHZSA-N 0.000 description 1
- HVLSXIKZNLPZJJ-TXZCQADKSA-N HA peptide Chemical compound C([C@@H](C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H](C(C)C)C(=O)N1[C@@H](CCC1)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H](CC=1C=CC(O)=CC=1)C(=O)N[C@@H](C)C(O)=O)NC(=O)[C@H]1N(CCC1)C(=O)[C@@H](N)CC=1C=CC(O)=CC=1)C1=CC=C(O)C=C1 HVLSXIKZNLPZJJ-TXZCQADKSA-N 0.000 description 1
- 241000256244 Heliothis virescens Species 0.000 description 1
- 108091006054 His-tagged proteins Proteins 0.000 description 1
- 101150090950 Hsc70-1 gene Proteins 0.000 description 1
- 108090000144 Human Proteins Proteins 0.000 description 1
- 102000003839 Human Proteins Human genes 0.000 description 1
- 241000701085 Human alphaherpesvirus 3 Species 0.000 description 1
- 229930010555 Inosine Natural products 0.000 description 1
- UGQMRVRMYYASKQ-KQYNXXCUSA-N Inosine Chemical compound O[C@@H]1[C@H](O)[C@@H](CO)O[C@H]1N1C2=NC=NC(O)=C2N=C1 UGQMRVRMYYASKQ-KQYNXXCUSA-N 0.000 description 1
- 108091092195 Intron Proteins 0.000 description 1
- QNAYBMKLOCPYGJ-REOHCLBHSA-N L-alanine Chemical compound C[C@H](N)C(O)=O QNAYBMKLOCPYGJ-REOHCLBHSA-N 0.000 description 1
- 150000008575 L-amino acids Chemical class 0.000 description 1
- DCXYFEDJOCDNAF-REOHCLBHSA-N L-asparagine Chemical compound OC(=O)[C@@H](N)CC(N)=O DCXYFEDJOCDNAF-REOHCLBHSA-N 0.000 description 1
- AGPKZVBTJJNPAG-WHFBIAKZSA-N L-isoleucine Chemical compound CC[C@H](C)[C@H](N)C(O)=O AGPKZVBTJJNPAG-WHFBIAKZSA-N 0.000 description 1
- ROHFNLRQFUQHCH-YFKPBYRVSA-N L-leucine Chemical compound CC(C)C[C@H](N)C(O)=O ROHFNLRQFUQHCH-YFKPBYRVSA-N 0.000 description 1
- COLNVLDHVKWLRT-QMMMGPOBSA-N L-phenylalanine Chemical compound OC(=O)[C@@H](N)CC1=CC=CC=C1 COLNVLDHVKWLRT-QMMMGPOBSA-N 0.000 description 1
- QIVBCDIJIAJPQS-VIFPVBQESA-N L-tryptophane Chemical compound C1=CC=C2C(C[C@H](N)C(O)=O)=CNC2=C1 QIVBCDIJIAJPQS-VIFPVBQESA-N 0.000 description 1
- OUYCCCASQSFEME-QMMMGPOBSA-N L-tyrosine Chemical compound OC(=O)[C@@H](N)CC1=CC=C(O)C=C1 OUYCCCASQSFEME-QMMMGPOBSA-N 0.000 description 1
- KZSNJWFQEVHDMF-BYPYZUCNSA-N L-valine Chemical compound CC(C)[C@H](N)C(O)=O KZSNJWFQEVHDMF-BYPYZUCNSA-N 0.000 description 1
- 240000001046 Lactobacillus acidophilus Species 0.000 description 1
- 235000013956 Lactobacillus acidophilus Nutrition 0.000 description 1
- 244000199885 Lactobacillus bulgaricus Species 0.000 description 1
- 235000013960 Lactobacillus bulgaricus Nutrition 0.000 description 1
- 244000199866 Lactobacillus casei Species 0.000 description 1
- 235000013958 Lactobacillus casei Nutrition 0.000 description 1
- 241000186673 Lactobacillus delbrueckii Species 0.000 description 1
- 241000186840 Lactobacillus fermentum Species 0.000 description 1
- 240000002605 Lactobacillus helveticus Species 0.000 description 1
- 235000013967 Lactobacillus helveticus Nutrition 0.000 description 1
- 241001468157 Lactobacillus johnsonii Species 0.000 description 1
- 241000186605 Lactobacillus paracasei Species 0.000 description 1
- 240000006024 Lactobacillus plantarum Species 0.000 description 1
- 235000013965 Lactobacillus plantarum Nutrition 0.000 description 1
- 241000186604 Lactobacillus reuteri Species 0.000 description 1
- 241000218588 Lactobacillus rhamnosus Species 0.000 description 1
- 241000186869 Lactobacillus salivarius Species 0.000 description 1
- 241000194036 Lactococcus Species 0.000 description 1
- 241000194034 Lactococcus lactis subsp. cremoris Species 0.000 description 1
- GUBGYTABKSRVRQ-QKKXKWKRSA-N Lactose Natural products OC[C@H]1O[C@@H](O[C@H]2[C@H](O)[C@@H](O)C(O)O[C@@H]2CO)[C@H](O)[C@@H](O)[C@H]1O GUBGYTABKSRVRQ-QKKXKWKRSA-N 0.000 description 1
- 108091026898 Leader sequence (mRNA) Proteins 0.000 description 1
- ROHFNLRQFUQHCH-UHFFFAOYSA-N Leucine Natural products CC(C)CC(N)C(O)=O ROHFNLRQFUQHCH-UHFFFAOYSA-N 0.000 description 1
- 239000000232 Lipid Bilayer Substances 0.000 description 1
- 239000004472 Lysine Substances 0.000 description 1
- KDXKERNSBIXSRK-UHFFFAOYSA-N Lysine Natural products NCCCCC(N)C(O)=O KDXKERNSBIXSRK-UHFFFAOYSA-N 0.000 description 1
- 102000016397 Methyltransferase Human genes 0.000 description 1
- KWYHDKDOAIKMQN-UHFFFAOYSA-N N,N,N',N'-tetramethylethylenediamine Chemical compound CN(C)CCN(C)C KWYHDKDOAIKMQN-UHFFFAOYSA-N 0.000 description 1
- 108091061960 Naked DNA Proteins 0.000 description 1
- 229930193140 Neomycin Natural products 0.000 description 1
- 108010077850 Nuclear Localization Signals Proteins 0.000 description 1
- 101710163270 Nuclease Proteins 0.000 description 1
- 108020004711 Nucleic Acid Probes Proteins 0.000 description 1
- 240000007019 Oxalis corniculata Species 0.000 description 1
- 108091093037 Peptide nucleic acid Proteins 0.000 description 1
- 108091000080 Phosphotransferase Proteins 0.000 description 1
- 206010035226 Plasma cell myeloma Diseases 0.000 description 1
- 241000288906 Primates Species 0.000 description 1
- 229930185560 Pseudouridine Natural products 0.000 description 1
- PTJWIQPHWPFNBW-UHFFFAOYSA-N Pseudouridine C Natural products OC1C(O)C(CO)OC1C1=CNC(=O)NC1=O PTJWIQPHWPFNBW-UHFFFAOYSA-N 0.000 description 1
- 101000902592 Pyrococcus furiosus (strain ATCC 43587 / DSM 3638 / JCM 8422 / Vc1) DNA polymerase Proteins 0.000 description 1
- 102000009572 RNA Polymerase II Human genes 0.000 description 1
- 108010009460 RNA Polymerase II Proteins 0.000 description 1
- 102000014450 RNA Polymerase III Human genes 0.000 description 1
- 108010078067 RNA Polymerase III Proteins 0.000 description 1
- 230000004570 RNA-binding Effects 0.000 description 1
- 238000011529 RT qPCR Methods 0.000 description 1
- 108020004511 Recombinant DNA Proteins 0.000 description 1
- 108091028664 Ribonucleotide Proteins 0.000 description 1
- 101100150366 Schizosaccharomyces pombe (strain 972 / ATCC 24843) sks2 gene Proteins 0.000 description 1
- MTCFGRXMJLQNBG-UHFFFAOYSA-N Serine Natural products OCC(N)C(O)=O MTCFGRXMJLQNBG-UHFFFAOYSA-N 0.000 description 1
- 241000256251 Spodoptera frugiperda Species 0.000 description 1
- 108010090804 Streptavidin Proteins 0.000 description 1
- 241000194017 Streptococcus Species 0.000 description 1
- 235000014962 Streptococcus cremoris Nutrition 0.000 description 1
- 241000194020 Streptococcus thermophilus Species 0.000 description 1
- 241000187747 Streptomyces Species 0.000 description 1
- 108700026226 TATA Box Proteins 0.000 description 1
- 108091036066 Three prime untranslated region Proteins 0.000 description 1
- AYFVYJQAPQTCCC-UHFFFAOYSA-N Threonine Natural products CC(O)C(N)C(O)=O AYFVYJQAPQTCCC-UHFFFAOYSA-N 0.000 description 1
- 239000004473 Threonine Substances 0.000 description 1
- 108091023040 Transcription factor Proteins 0.000 description 1
- 102000040945 Transcription factor Human genes 0.000 description 1
- 241000255993 Trichoplusia ni Species 0.000 description 1
- QIVBCDIJIAJPQS-UHFFFAOYSA-N Tryptophan Natural products C1=CC=C2C(CC(N)C(O)=O)=CNC2=C1 QIVBCDIJIAJPQS-UHFFFAOYSA-N 0.000 description 1
- KZSNJWFQEVHDMF-UHFFFAOYSA-N Valine Natural products CC(C)C(N)C(O)=O KZSNJWFQEVHDMF-UHFFFAOYSA-N 0.000 description 1
- JCZSFCLRSONYLH-UHFFFAOYSA-N Wyosine Natural products N=1C(C)=CN(C(C=2N=C3)=O)C=1N(C)C=2N3C1OC(CO)C(O)C1O JCZSFCLRSONYLH-UHFFFAOYSA-N 0.000 description 1
- NOXMCJDDSWCSIE-DAGMQNCNSA-N [[(2R,3S,4R,5R)-5-(2-amino-4-oxo-3H-pyrrolo[2,3-d]pyrimidin-7-yl)-3,4-dihydroxyoxolan-2-yl]methoxy-hydroxyphosphoryl] phosphono hydrogen phosphate Chemical compound C1=2NC(N)=NC(=O)C=2C=CN1[C@@H]1O[C@H](COP(O)(=O)OP(O)(=O)OP(O)(O)=O)[C@@H](O)[C@H]1O NOXMCJDDSWCSIE-DAGMQNCNSA-N 0.000 description 1
- AZRNEVJSOSKAOC-VPHBQDTQSA-N [[(2r,3s,5r)-5-[5-[(e)-3-[6-[5-[(3as,4s,6ar)-2-oxo-1,3,3a,4,6,6a-hexahydrothieno[3,4-d]imidazol-4-yl]pentanoylamino]hexanoylamino]prop-1-enyl]-2,4-dioxopyrimidin-1-yl]-3-hydroxyoxolan-2-yl]methoxy-hydroxyphosphoryl] phosphono hydrogen phosphate Chemical compound O1[C@H](COP(O)(=O)OP(O)(=O)OP(O)(O)=O)[C@@H](O)C[C@@H]1N1C(=O)NC(=O)C(\C=C\CNC(=O)CCCCCNC(=O)CCCC[C@H]2[C@H]3NC(=O)N[C@H]3CS2)=C1 AZRNEVJSOSKAOC-VPHBQDTQSA-N 0.000 description 1
- ZXZIQGYRHQJWSY-NKWVEPMBSA-N [hydroxy-[[(2s,5r)-5-(6-oxo-3h-purin-9-yl)oxolan-2-yl]methoxy]phosphoryl] phosphono hydrogen phosphate Chemical compound O1[C@H](COP(O)(=O)OP(O)(=O)OP(O)(=O)O)CC[C@@H]1N1C(NC=NC2=O)=C2N=C1 ZXZIQGYRHQJWSY-NKWVEPMBSA-N 0.000 description 1
- 230000021736 acetylation Effects 0.000 description 1
- 238000006640 acetylation reaction Methods 0.000 description 1
- 102000005421 acetyltransferase Human genes 0.000 description 1
- 108020002494 acetyltransferase Proteins 0.000 description 1
- 238000001042 affinity chromatography Methods 0.000 description 1
- 235000004279 alanine Nutrition 0.000 description 1
- ODKSFYDXXFIFQN-UHFFFAOYSA-N arginine Natural products OC(=O)C(N)CCCNC(N)=N ODKSFYDXXFIFQN-UHFFFAOYSA-N 0.000 description 1
- 235000009582 asparagine Nutrition 0.000 description 1
- 229960001230 asparagine Drugs 0.000 description 1
- 229940009098 aspartate Drugs 0.000 description 1
- 235000003704 aspartic acid Nutrition 0.000 description 1
- WGDUUQDYDIIBKT-UHFFFAOYSA-N beta-Pseudouridine Natural products OC1OC(CN2C=CC(=O)NC2=O)C(O)C1O WGDUUQDYDIIBKT-UHFFFAOYSA-N 0.000 description 1
- OQFSQFPPLPISGP-UHFFFAOYSA-N beta-carboxyaspartic acid Natural products OC(=O)C(N)C(C(O)=O)C(O)=O OQFSQFPPLPISGP-UHFFFAOYSA-N 0.000 description 1
- 229940118852 bifidobacterium animalis Drugs 0.000 description 1
- 229940004120 bifidobacterium infantis Drugs 0.000 description 1
- 229940009289 bifidobacterium lactis Drugs 0.000 description 1
- 229940009291 bifidobacterium longum Drugs 0.000 description 1
- 239000012148 binding buffer Substances 0.000 description 1
- 239000011616 biotin Substances 0.000 description 1
- 229960002685 biotin Drugs 0.000 description 1
- 235000020958 biotin Nutrition 0.000 description 1
- 239000005420 bog Substances 0.000 description 1
- 238000006664 bond formation reaction Methods 0.000 description 1
- 239000000872 buffer Substances 0.000 description 1
- 125000002091 cationic group Chemical group 0.000 description 1
- 239000006143 cell culture medium Substances 0.000 description 1
- 108091092356 cellular DNA Proteins 0.000 description 1
- 230000008859 change Effects 0.000 description 1
- 238000006243 chemical reaction Methods 0.000 description 1
- 239000003153 chemical reaction reagent Substances 0.000 description 1
- 238000004587 chromatography analysis Methods 0.000 description 1
- 238000004891 communication Methods 0.000 description 1
- 239000002299 complementary DNA Substances 0.000 description 1
- 230000021615 conjugation Effects 0.000 description 1
- OFEZSBMBBKLLBJ-BAJZRUMYSA-N cordycepin Chemical compound C1=NC=2C(N)=NC=NC=2N1[C@@H]1O[C@H](CO)C[C@H]1O OFEZSBMBBKLLBJ-BAJZRUMYSA-N 0.000 description 1
- OFEZSBMBBKLLBJ-UHFFFAOYSA-N cordycepine Natural products C1=NC=2C(N)=NC=NC=2N1C1OC(CO)CC1O OFEZSBMBBKLLBJ-UHFFFAOYSA-N 0.000 description 1
- 101150110403 cspA gene Proteins 0.000 description 1
- 101150068339 cspLA gene Proteins 0.000 description 1
- 235000018417 cysteine Nutrition 0.000 description 1
- XUJNEKJLAYXESH-UHFFFAOYSA-N cysteine Natural products SCC(N)C(O)=O XUJNEKJLAYXESH-UHFFFAOYSA-N 0.000 description 1
- 238000004163 cytometry Methods 0.000 description 1
- 229940104302 cytosine Drugs 0.000 description 1
- HAAZLUGHYHWQIW-KVQBGUIXSA-N dGTP Chemical compound C1=NC=2C(=O)NC(N)=NC=2N1[C@H]1C[C@H](O)[C@@H](COP(O)(=O)OP(O)(=O)OP(O)(O)=O)O1 HAAZLUGHYHWQIW-KVQBGUIXSA-N 0.000 description 1
- NHVNXKFIZYSCEB-XLPZGREQSA-N dTTP Chemical compound O=C1NC(=O)C(C)=CN1[C@@H]1O[C@H](COP(O)(=O)OP(O)(=O)OP(O)(O)=O)[C@@H](O)C1 NHVNXKFIZYSCEB-XLPZGREQSA-N 0.000 description 1
- 238000002716 delivery method Methods 0.000 description 1
- 238000000326 densiometry Methods 0.000 description 1
- 239000005549 deoxyribonucleoside Substances 0.000 description 1
- 239000005547 deoxyribonucleotide Substances 0.000 description 1
- 125000002637 deoxyribonucleotide group Chemical group 0.000 description 1
- 239000005546 dideoxynucleotide Substances 0.000 description 1
- ZPTBLXKRQACLCR-XVFCMESISA-N dihydrouridine Chemical compound O[C@@H]1[C@H](O)[C@@H](CO)O[C@H]1N1C(=O)NC(=O)CC1 ZPTBLXKRQACLCR-XVFCMESISA-N 0.000 description 1
- 238000009826 distribution Methods 0.000 description 1
- 239000003814 drug Substances 0.000 description 1
- 238000005516 engineering process Methods 0.000 description 1
- 239000003623 enhancer Substances 0.000 description 1
- 230000007613 environmental effect Effects 0.000 description 1
- LYCAIKOWRPUZTN-UHFFFAOYSA-N ethylene glycol Natural products OCCO LYCAIKOWRPUZTN-UHFFFAOYSA-N 0.000 description 1
- 239000013613 expression plasmid Substances 0.000 description 1
- 230000004927 fusion Effects 0.000 description 1
- 238000010441 gene drive Methods 0.000 description 1
- 238000010353 genetic engineering Methods 0.000 description 1
- 229930195712 glutamate Natural products 0.000 description 1
- 235000013922 glutamic acid Nutrition 0.000 description 1
- 239000004220 glutamic acid Substances 0.000 description 1
- ZDXPYRJPNDTMRX-UHFFFAOYSA-N glutamine Natural products OC(=O)C(N)CCC(N)=O ZDXPYRJPNDTMRX-UHFFFAOYSA-N 0.000 description 1
- 230000013595 glycosylation Effects 0.000 description 1
- 238000006206 glycosylation reaction Methods 0.000 description 1
- 239000001963 growth medium Substances 0.000 description 1
- 229940029575 guanosine Drugs 0.000 description 1
- 244000005702 human microbiome Species 0.000 description 1
- 125000002887 hydroxy group Chemical group [H]O* 0.000 description 1
- WGCNASOHLSPBMP-UHFFFAOYSA-N hydroxyacetaldehyde Natural products OCC=O WGCNASOHLSPBMP-UHFFFAOYSA-N 0.000 description 1
- 230000000415 inactivating effect Effects 0.000 description 1
- 238000010348 incorporation Methods 0.000 description 1
- 206010022000 influenza Diseases 0.000 description 1
- 229960003786 inosine Drugs 0.000 description 1
- 229960000310 isoleucine Drugs 0.000 description 1
- AGPKZVBTJJNPAG-UHFFFAOYSA-N isoleucine Natural products CCC(C)C(N)C(O)=O AGPKZVBTJJNPAG-UHFFFAOYSA-N 0.000 description 1
- 125000001449 isopropyl group Chemical group [H]C([H])([H])C([H])(*)C([H])([H])[H] 0.000 description 1
- 238000002372 labelling Methods 0.000 description 1
- 229940039695 lactobacillus acidophilus Drugs 0.000 description 1
- 229940004208 lactobacillus bulgaricus Drugs 0.000 description 1
- 229940017800 lactobacillus casei Drugs 0.000 description 1
- 229940012969 lactobacillus fermentum Drugs 0.000 description 1
- 229940054346 lactobacillus helveticus Drugs 0.000 description 1
- 229940072205 lactobacillus plantarum Drugs 0.000 description 1
- 229940001882 lactobacillus reuteri Drugs 0.000 description 1
- 239000008101 lactose Substances 0.000 description 1
- 230000029226 lipidation Effects 0.000 description 1
- 239000012160 loading buffer Substances 0.000 description 1
- 230000002934 lysing effect Effects 0.000 description 1
- 210000001161 mammalian embryo Anatomy 0.000 description 1
- 238000005259 measurement Methods 0.000 description 1
- 239000002609 medium Substances 0.000 description 1
- 230000037353 metabolic pathway Effects 0.000 description 1
- 239000000693 micelle Substances 0.000 description 1
- 238000000520 microinjection Methods 0.000 description 1
- 239000003068 molecular probe Substances 0.000 description 1
- 125000004573 morpholin-4-yl group Chemical group N1(CCOCC1)* 0.000 description 1
- 201000000050 myeloid neoplasm Diseases 0.000 description 1
- 229960004927 neomycin Drugs 0.000 description 1
- 230000007935 neutral effect Effects 0.000 description 1
- 239000002853 nucleic acid probe Substances 0.000 description 1
- 210000001672 ovary Anatomy 0.000 description 1
- 230000003647 oxidation Effects 0.000 description 1
- 238000007254 oxidation reaction Methods 0.000 description 1
- 238000004806 packaging method and process Methods 0.000 description 1
- 239000003415 peat Substances 0.000 description 1
- COLNVLDHVKWLRT-UHFFFAOYSA-N phenylalanine Natural products OC(=O)C(N)CC1=CC=CC=C1 COLNVLDHVKWLRT-UHFFFAOYSA-N 0.000 description 1
- NBIIXXVUZAFLBC-UHFFFAOYSA-K phosphate Chemical compound [O-]P([O-])([O-])=O NBIIXXVUZAFLBC-UHFFFAOYSA-K 0.000 description 1
- 230000026731 phosphorylation Effects 0.000 description 1
- 238000006366 phosphorylation reaction Methods 0.000 description 1
- 102000020233 phosphotransferase Human genes 0.000 description 1
- 230000008488 polyadenylation Effects 0.000 description 1
- 229920002704 polyhistidine Polymers 0.000 description 1
- 230000023603 positive regulation of transcription initiation, DNA-dependent Effects 0.000 description 1
- 239000000843 powder Substances 0.000 description 1
- 238000002360 preparation method Methods 0.000 description 1
- 235000004252 protein component Nutrition 0.000 description 1
- PTJWIQPHWPFNBW-GBNDHIKLSA-N pseudouridine Chemical compound O[C@@H]1[C@H](O)[C@@H](CO)O[C@H]1C1=CNC(=O)NC1=O PTJWIQPHWPFNBW-GBNDHIKLSA-N 0.000 description 1
- 238000000746 purification Methods 0.000 description 1
- 239000002096 quantum dot Substances 0.000 description 1
- QQXQGKSPIMGUIZ-AEZJAUAXSA-N queuosine Chemical compound C1=2C(=O)NC(N)=NC=2N([C@H]2[C@@H]([C@H](O)[C@@H](CO)O2)O)C=C1CN[C@H]1C=C[C@H](O)[C@@H]1O QQXQGKSPIMGUIZ-AEZJAUAXSA-N 0.000 description 1
- 230000002285 radioactive effect Effects 0.000 description 1
- 239000011535 reaction buffer Substances 0.000 description 1
- 238000010188 recombinant method Methods 0.000 description 1
- 230000002207 retinal effect Effects 0.000 description 1
- 230000002441 reversible effect Effects 0.000 description 1
- 239000002342 ribonucleoside Substances 0.000 description 1
- 239000002336 ribonucleotide Substances 0.000 description 1
- 125000002652 ribonucleotide group Chemical group 0.000 description 1
- 108091092562 ribozyme Proteins 0.000 description 1
- 238000005096 rolling process Methods 0.000 description 1
- 239000000523 sample Substances 0.000 description 1
- 229930000044 secondary metabolite Natural products 0.000 description 1
- 239000010865 sewage Substances 0.000 description 1
- 238000002415 sodium dodecyl sulfate polyacrylamide gel electrophoresis Methods 0.000 description 1
- 244000000000 soil microbiome Species 0.000 description 1
- 239000007787 solid Substances 0.000 description 1
- 239000002904 solvent Substances 0.000 description 1
- 238000000527 sonication Methods 0.000 description 1
- 230000009870 specific binding Effects 0.000 description 1
- 239000012536 storage buffer Substances 0.000 description 1
- 239000000725 suspension Substances 0.000 description 1
- 238000012360 testing method Methods 0.000 description 1
- MPLHNVLQVRSVEE-UHFFFAOYSA-N texas red Chemical compound [O-]S(=O)(=O)C1=CC(S(Cl)(=O)=O)=CC=C1C(C1=CC=2CCCN3CCCC(C=23)=C1O1)=C2C1=C(CCC1)C3=[N+]1CCCC3=C2 MPLHNVLQVRSVEE-UHFFFAOYSA-N 0.000 description 1
- IBVCSSOEYUMRLC-GABYNLOESA-N texas red-5-dutp Chemical compound O1[C@H](COP(O)(=O)OP(O)(=O)OP(O)(O)=O)[C@@H](O)C[C@@H]1N1C(=O)NC(=O)C(C#CCNS(=O)(=O)C=2C=C(C(C=3C4=CC=5CCCN6CCCC(C=56)=C4OC4=C5C6=[N+](CCC5)CCCC6=CC4=3)=CC=2)S([O-])(=O)=O)=C1 IBVCSSOEYUMRLC-GABYNLOESA-N 0.000 description 1
- 150000003573 thiols Chemical class 0.000 description 1
- ANRHNWWPFJCPAZ-UHFFFAOYSA-M thionine Chemical compound [Cl-].C1=CC(N)=CC2=[S+]C3=CC(N)=CC=C3N=C21 ANRHNWWPFJCPAZ-UHFFFAOYSA-M 0.000 description 1
- 229940113082 thymine Drugs 0.000 description 1
- 210000001519 tissue Anatomy 0.000 description 1
- 238000001890 transfection Methods 0.000 description 1
- UNXRWKVEANCORM-UHFFFAOYSA-N triphosphoric acid Chemical compound OP(O)(=O)OP(O)(=O)OP(O)(O)=O UNXRWKVEANCORM-UHFFFAOYSA-N 0.000 description 1
- OUYCCCASQSFEME-UHFFFAOYSA-N tyrosine Natural products OC(=O)C(N)CC1=CC=C(O)C=C1 OUYCCCASQSFEME-UHFFFAOYSA-N 0.000 description 1
- 229940035893 uracil Drugs 0.000 description 1
- 239000004474 valine Substances 0.000 description 1
- 238000012795 verification Methods 0.000 description 1
- 239000013603 viral vector Substances 0.000 description 1
- 210000002845 virion Anatomy 0.000 description 1
- 239000000277 virosome Substances 0.000 description 1
- XLYOFNOQVPJJNP-UHFFFAOYSA-N water Substances O XLYOFNOQVPJJNP-UHFFFAOYSA-N 0.000 description 1
- JCZSFCLRSONYLH-QYVSTXNMSA-N wyosin Chemical compound N=1C(C)=CN(C(C=2N=C3)=O)C=1N(C)C=2N3[C@@H]1O[C@H](CO)[C@@H](O)[C@H]1O JCZSFCLRSONYLH-QYVSTXNMSA-N 0.000 description 1
Classifications
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12N—MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
- C12N15/00—Mutation or genetic engineering; DNA or RNA concerning genetic engineering, vectors, e.g. plasmids, or their isolation, preparation or purification; Use of hosts therefor
- C12N15/09—Recombinant DNA-technology
- C12N15/87—Introduction of foreign genetic material using processes not otherwise provided for, e.g. co-transformation
- C12N15/90—Stable introduction of foreign DNA into chromosome
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12N—MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
- C12N15/00—Mutation or genetic engineering; DNA or RNA concerning genetic engineering, vectors, e.g. plasmids, or their isolation, preparation or purification; Use of hosts therefor
- C12N15/09—Recombinant DNA-technology
- C12N15/10—Processes for the isolation, preparation or purification of DNA or RNA
- C12N15/102—Mutagenizing nucleic acids
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12N—MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
- C12N9/00—Enzymes; Proenzymes; Compositions thereof; Processes for preparing, activating, inhibiting, separating or purifying enzymes
- C12N9/10—Transferases (2.)
- C12N9/12—Transferases (2.) transferring phosphorus containing groups, e.g. kinases (2.7)
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12N—MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
- C12N2800/00—Nucleic acids vectors
- C12N2800/90—Vectors containing a transposable element
Landscapes
- Health & Medical Sciences (AREA)
- Life Sciences & Earth Sciences (AREA)
- Genetics & Genomics (AREA)
- Chemical & Material Sciences (AREA)
- Engineering & Computer Science (AREA)
- Organic Chemistry (AREA)
- Zoology (AREA)
- Wood Science & Technology (AREA)
- Bioinformatics & Cheminformatics (AREA)
- General Engineering & Computer Science (AREA)
- Biotechnology (AREA)
- Biomedical Technology (AREA)
- Molecular Biology (AREA)
- Microbiology (AREA)
- Biochemistry (AREA)
- General Health & Medical Sciences (AREA)
- Biophysics (AREA)
- Physics & Mathematics (AREA)
- Plant Pathology (AREA)
- Mycology (AREA)
- Medicinal Chemistry (AREA)
- Crystallography & Structural Chemistry (AREA)
- Enzymes And Modification Thereof (AREA)
Abstract
The present disclosure provides systems and methods for transposing a cargo nucleotide sequence to a target nucleic acid site. These systems and methods may comprise a first double-stranded nucleic acid comprising the cargo nucleotide sequence, wherein the cargo nucleotide sequence is configured to interact with a transposase, and the transposase, wherein said transposase is configured to transpose the cargo nucleotide sequence to the target nucleic acid site.
Description
SYSTEMS AND METHODS FOR TRANSPOSING CARGO NUCLEOTIDE SEQUENCES
CROSS-REFERENCE
[0001] This application claims the benefit of and priority to U.S. Provisional Patent Application No. 63/404,859 filed September 8, 2022, which is incorporated by reference in its entirety herein.
SUMMARY
[0002] Transposons are mobile genetic elements evolved to execute highly efficient integration of their genes into the genomes of their host cells. The ability of transposons to naturally transfer DNA throughout the genome has been harnessed for a wide variety of research and therapeutic applications including gene editing applications.
[0003] Described herein, in certain embodiments, are engineered transposase systems, comprising: (a) a double-stranded nucleic acid comprising a cargo nucleotide sequence; and (b) a transposase configured to interact with the double-stranded nucleic acid to transpose the cargo nucleotide sequence to a target nucleic acid site, the transposase comprising a sequence having at least 75% sequence identity to any one of SEQ ID NOs: 1-38. In some embodiments, the cargo nucleotide sequence is flanked by a left-hand transposase recognition sequence and a right-hand transposase recognition sequence recognized by the transposase. In some embodiments, the transposase is configured to transpose the cargo nucleotide sequence as a double-stranded deoxyribonucleic acid polynucleotide. In some embodiments, the transposase comprises one or more nuclear localization sequences (NLSs) proximal to an N- or C-terminus of the transposase. In some embodiments, the NLS comprises a sequence according to any one of SEQ ID NOs: 1480-1495.
[0004] Described herein, in certain embodiments, are methods for binding, nicking, cleaving, marking, modifying, or transposing a double-stranded deoxyribonucleic acid polynucleotide, comprising contacting the double-stranded deoxyribonucleic acid polynucleotide with a transposase configured to transpose a cargo nucleotide sequence to a target nucleic acid site, the transposase comprising a sequence having at least 75% sequence identity to any one of SEQ ID NOs: 1-38.
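By way of illustration only, the "at least 75% sequence identity" criterion can be sketched in code. This passage does not specify how identity is computed, so the sketch below rests on stated assumptions: the two sequences are already aligned to equal length, gap columns are excluded from the comparison, and the function names and toy peptide strings are hypothetical and are not taken from the Sequence Listing.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent identity over aligned columns in which neither sequence has a gap.

    Assumption: skipping gap columns is one of several possible conventions;
    the source text does not state which convention applies.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a, aligned_b):
        if a == gap or b == gap:
            continue  # ignore columns containing a gap
        compared += 1
        if a.upper() == b.upper():
            matches += 1
    if compared == 0:
        return 0.0
    return 100.0 * matches / compared


def meets_identity_threshold(aligned_a: str, aligned_b: str,
                             threshold: float = 75.0) -> bool:
    """True when the pair reaches the threshold (75% by default)."""
    return percent_identity(aligned_a, aligned_b) >= threshold
```

With the hypothetical pair "MKTA" and "MKTV", three of four aligned residues match, so the pair sits exactly at the 75% threshold. A real comparison against SEQ ID NOs: 1-38 would first need an alignment step, which this sketch deliberately leaves out.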
[0005] Described herein, in certain embodiments, are methods of modifying a target nucleic acid site, comprising contacting the target nucleic acid site with the engineered transposase system described herein. In some embodiments, modifying the target nucleic acid site comprises binding, nicking, cleaving, marking, modifying, or transposing the target nucleic acid site. In some embodiments, the target nucleic acid site comprises deoxyribonucleic acid (DNA). In some embodiments, the target nucleic acid site comprises genomic DNA, viral DNA, or bacterial DNA.
[0006] Described herein, in certain embodiments, are methods for transposing a cargo nucleotide sequence into a target nucleic acid site comprising introducing the engineered transposase system described herein to a cell.
[0007] Described herein, in certain embodiments, are cells comprising the engineered transposase system described herein. In some embodiments, the cell is a eukaryotic cell. In some embodiments, the cell is a mammalian cell. In some embodiments, the cell is an immortalized cell. In some embodiments, the cell is an insect cell. In some embodiments, the cell is a yeast cell. In some embodiments, the cell is a plant cell. In some embodiments, the cell is a fungal cell. In some embodiments, the cell is a prokaryotic cell. In some embodiments, the cell is an A549, HEK-293, HEK-293T, BHK, CHO, HeLa, MRC5, Sf9, Cos-1, Cos-7, Vero, BSC 1, BSC 40, BMT 10, WI38, HeLa, Saos, C2C12, L cell, HT1080, HepG2, Huh7, K562, primary cell, or a derivative thereof. In some embodiments, the cell is an engineered cell. In some embodiments, the cell is a stable cell. In some embodiments, the cell is a primary cell. In some embodiments, the cell is a T cell. In some embodiments, the cell is a hematopoietic stem cell (HSC).
BRIEF DESCRIPTION OF THE DRAWINGS
[0008] The novel features of the disclosure are set forth with particularity in the appended claims. A better understanding of the features and advantages of the present disclosure will be obtained by reference to the following detailed description that sets forth illustrative embodiments, in which the principles of the disclosure are utilized, and the accompanying drawings of which:
[0009] FIGs. 1A-1C depict MG63-961. FIG. 1A depicts the genomic context of a bacterial Tc1-Mariner superfamily transposase. The transposase encodes a predicted DDE superfamily endonuclease domain. FIG. 1B depicts alignment of identified, imperfect terminal inverted repeats. FIG. 1B discloses SEQ ID NOs: 55-56, respectively, in order of appearance. FIG. 1C depicts a 3D structure prediction in which an MG Tc1/Mariner-like superfamily transposase folds best after a eukaryotic Mos1 transposase.
[0010] FIGs. 2A-2B depict multiple sequence alignments (MSA) that identified key transposon features. FIG. 2A depicts MSA of transposase proteins vs. the catalytic domain of Sleeping Beauty, which identified conserved catalytic residues DDE (boxes). FIG. 2A discloses SEQ ID NOs: 57-66, respectively, in order of appearance. FIG. 2B depicts the length distribution of the distance between the second aspartate (D) and the glutamate (E) catalytic residues of the canonical DDE transposase motif. The x-axis represents the sequence length between the D and E catalytic residues, inclusive.
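As a minimal sketch of the inclusive D-to-E length plotted on the x-axis of FIG. 2B, the function below counts residues from the second catalytic aspartate through the catalytic glutamate, both ends included. The function name, the 0-based index convention, and the toy sequence are assumptions made for illustration; how the catalytic residues were actually located is not described in this passage.

```python
def d_to_e_length(protein: str, second_d_index: int, e_index: int) -> int:
    """Inclusive residue count from the second catalytic D to the catalytic E.

    Indices are 0-based positions in `protein` (an illustrative convention).
    """
    if protein[second_d_index] != "D":
        raise ValueError("second_d_index does not point at an aspartate (D)")
    if protein[e_index] != "E":
        raise ValueError("e_index does not point at a glutamate (E)")
    if e_index <= second_d_index:
        raise ValueError("the glutamate must lie C-terminal to the aspartate")
    # Inclusive count: both the D and the E are part of the measured span.
    return e_index - second_d_index + 1


# Hypothetical toy sequence: second D at index 5, E at index 10.
toy = "MDAAADKKKKEG"
span = d_to_e_length(toy, 5, 10)  # covers the substring "DKKKKE"
```

Because the count is inclusive, it equals the length of the substring that starts at the D and ends at the E, which is one more than the simple difference of the two positions.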
BRIEF DESCRIPTION OF THE SEQUENCE LISTING
[0011] The Sequence Listing filed herewith provides exemplary polynucleotide and polypeptide sequences for use in methods, compositions, and systems according to the disclosure. Below are exemplary descriptions of sequences therein.
[0012] MG63
[0013] SEQ ID NOs: 1-38 show the full-length peptide sequences of MG63 transposition proteins.
DETAILED DESCRIPTION
[0014] While various embodiments of the invention have been shown and described herein, it will be obvious to those skilled in the art that such embodiments are provided by way of example only. Numerous variations, changes, and substitutions may occur to those skilled in the art without departing from the invention. It should be understood that various alternatives to the embodiments of the invention described herein may be employed.
[0015] The practice of some methods disclosed herein employs, unless otherwise indicated, techniques of immunology, biochemistry, chemistry, molecular biology, microbiology, cell biology, genomics, and recombinant DNA. See for example Sambrook and Green, Molecular Cloning: A Laboratory Manual, 4th Edition (2012); the series Current Protocols in Molecular Biology (F. M. Ausubel, et al. eds.); the series Methods In Enzymology (Academic Press, Inc.), PCR 2: A Practical Approach (M.J. MacPherson, B.D. Hames and G.R. Taylor eds. (1995)), Harlow and Lane, eds. (1988) Antibodies, A Laboratory Manual, and Culture of Animal Cells: A Manual of Basic Technique and Specialized Applications, 6th Edition (R.I. Freshney, ed. (2010)).
[0016] As used herein, the singular forms “a”, “an” and “the” are intended to include the plural forms as well, unless the context clearly indicates otherwise. Furthermore, to the extent that the terms “including”, “includes”, “having”, “has”, “with”, or variants thereof are used in either the detailed description and/or the claims, such terms are intended to be inclusive in a manner similar to the term “comprising”.
[0017] The term “about” or “approximately” means within an acceptable error range for the particular value as determined by one of ordinary skill in the art, which will depend in part on how the value is measured or determined, i.e., the limitations of the measurement system. For example, “about” can mean within one or more than one standard deviation, per the practice in the art. Alternatively, “about” can mean a range of up to 20%, up to 15%, up to 10%, up to 5%, or up to 1% of a given value.
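The percentage-based reading of “about” can be illustrated with a short sketch. It covers only the alternative in which “about” means a range of up to a stated percentage of a given value; the standard-deviation reading is not modeled. The function name and the default tolerance are hypothetical choices, since which percentage applies depends on context.

```python
def is_about(measured: float, nominal: float,
             tolerance_fraction: float = 0.10) -> bool:
    """True when `measured` lies within +/- tolerance_fraction of `nominal`.

    tolerance_fraction = 0.20, 0.15, 0.10, 0.05, or 0.01 corresponds to the
    "up to 20%, 15%, 10%, 5%, or 1%" alternatives named in the text.
    """
    if tolerance_fraction < 0:
        raise ValueError("tolerance_fraction must be non-negative")
    return abs(measured - nominal) <= tolerance_fraction * abs(nominal)
```

For example, a measured value of 108 is "about 100" under the 10% reading but not under the 5% or 1% readings.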
[0018] The term “nucleotide,” as used herein, refers to a base-sugar-phosphate combination. Contemplated nucleotides include naturally occurring nucleotides and synthetic nucleotides. Nucleotides are monomeric units of a nucleic acid sequence (e.g., deoxyribonucleic acid (DNA) and ribonucleic acid (RNA)). The term nucleotide includes ribonucleoside triphosphates adenosine triphosphate (ATP), uridine triphosphate (UTP), cytidine triphosphate (CTP), guanosine triphosphate (GTP) and deoxyribonucleoside triphosphates such as dATP, dCTP, dITP, dUTP, dGTP, dTTP, or derivatives thereof. Such derivatives include, for example, [αS]dATP, 7-deaza-dGTP and 7-deaza-dATP, and nucleotide derivatives that confer nuclease resistance on the nucleic acid molecule containing them. The term nucleotide as used herein encompasses dideoxyribonucleoside triphosphates (ddNTPs) and their derivatives. Illustrative examples of ddNTPs include, but are not limited to, ddATP, ddCTP, ddGTP, ddITP, and ddTTP. A nucleotide may be unlabeled or detectably labeled, such as using moieties comprising optically detectable moieties (e.g., fluorophores) or quantum dots. Detectable labels include, for example, radioactive isotopes, fluorescent labels, chemiluminescent labels, bioluminescent labels, and enzyme labels. Fluorescent labels of nucleotides include but are not limited to fluorescein, 5-carboxyfluorescein (FAM), 2',7'-dimethoxy-4',5'-dichloro-6-carboxyfluorescein (JOE), rhodamine, 6-carboxyrhodamine (R6G), N,N,N',N'-tetramethyl-6-carboxyrhodamine (TAMRA), 6-carboxy-X-rhodamine (ROX), 4-(4'-dimethylaminophenylazo)benzoic acid (DABCYL), Cascade Blue, Oregon Green, Texas Red, Cyanine and 5-(2'-aminoethyl)aminonaphthalene-1-sulfonic acid (EDANS). Specific examples of fluorescently labeled nucleotides include [R6G]dUTP, [TAMRA]dUTP, [R110]dCTP, [R6G]dCTP, [TAMRA]dCTP, [JOE]ddATP, [R6G]ddATP, [FAM]ddCTP, [R110]ddCTP, [TAMRA]ddGTP, [ROX]ddTTP, [dR6G]ddATP, [dR110]ddCTP, [dTAMRA]ddGTP, and [dROX]ddTTP available from Perkin Elmer, Foster City, Calif.; FluoroLink DeoxyNucleotides, FluoroLink Cy3-dCTP, FluoroLink Cy5-dCTP, FluoroLink Fluor X-dCTP, FluoroLink Cy3-dUTP, and FluoroLink Cy5-dUTP available from Amersham, Arlington Heights, IL; Fluorescein-15-dATP, Fluorescein-12-dUTP, Tetramethyl-rhodamine-6-dUTP, IR770-9-dATP, Fluorescein-12-ddUTP, Fluorescein-12-UTP, and Fluorescein-15-2'-dATP available from Boehringer Mannheim, Indianapolis, Ind.; and Chromosome Labeled Nucleotides, BODIPY-FL-14-UTP, BODIPY-FL-4-UTP, BODIPY-TMR-14-UTP, BODIPY-TMR-14-dUTP, BODIPY-TR-14-UTP, BODIPY-TR-14-dUTP, Cascade Blue-7-UTP, Cascade Blue-7-dUTP, fluorescein-12-UTP, fluorescein-12-dUTP, Oregon Green 488-5-dUTP, Rhodamine Green-5-UTP, Rhodamine Green-5-dUTP, tetramethylrhodamine-6-UTP, tetramethylrhodamine-6-dUTP, Texas Red-5-UTP, Texas Red-5-dUTP, and Texas Red-12-dUTP available from Molecular Probes, Eugene, Oreg. The term nucleotide encompasses chemically modified nucleotides. An exemplary chemically-modified nucleotide is biotin-dNTP. Non-limiting examples of biotinylated dNTPs include biotin-dATP (e.g., bio-N6-ddATP, biotin-14-dATP), biotin-dCTP (e.g., biotin-11-dCTP, biotin-14-dCTP), and biotin-dUTP (e.g., biotin-11-dUTP, biotin-16-dUTP, biotin-20-dUTP).
[0019] The terms “polynucleotide,” “oligonucleotide,” and “nucleic acid” are used interchangeably to refer to a polymeric form of nucleotides of any length, either deoxyribonucleotides or ribonucleotides, or analogs thereof, either in single-, double-, or multi-stranded form. Contemplated polynucleotides include a gene or fragment thereof. Exemplary polynucleotides include, but are not limited to, DNA, RNA, coding or non-coding regions of a gene or gene fragment, loci (locus) defined from linkage analysis, exons, introns, messenger RNA (mRNA), transfer RNA (tRNA), ribosomal RNA (rRNA), short interfering RNA (siRNA), short-hairpin RNA (shRNA), micro-RNA (miRNA), ribozymes, cDNA, recombinant polynucleotides, branched polynucleotides, plasmids, vectors, cell-free polynucleotides including cell-free DNA (cfDNA) and cell-free RNA (cfRNA), nucleic acid probes, and primers. In a polynucleotide when referring to a T, a T means U (Uracil) in RNA and T (Thymine) in DNA. A polynucleotide can be exogenous or endogenous to a cell and/or exist in a cell-free environment. The term polynucleotide encompasses modified polynucleotides (e.g., altered backbone, sugar, or nucleobase). If present, modifications to the nucleotide structure are imparted before or after assembly of the polymer. Non-limiting examples of modifications include: 5-bromouracil, peptide nucleic acid, xeno nucleic acid, morpholinos, locked nucleic acids, glycol nucleic acids, threose nucleic acids, dideoxynucleotides, cordycepin, 7-deaza-GTP, fluorophores (e.g., rhodamine or fluorescein linked to the sugar), thiol-containing nucleotides, biotin-linked nucleotides, fluorescent base analogs, CpG islands, methyl-7-guanosine, methylated nucleotides, inosine, thiouridine, pseudouridine, dihydrouridine, queuosine, and wyosine. The sequence of nucleotides may be interrupted by non-nucleotide components.
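The convention that a written T denotes U (Uracil) in RNA and T (Thymine) in DNA can be sketched as a simple notation change. The function name and the example string below are hypothetical. The sketch only rewrites the letters of a sequence given in DNA notation and does not model transcription, strand complementarity, or modified bases.

```python
# Translation table mapping thymine letters to uracil letters, preserving case.
_T_TO_U = str.maketrans("Tt", "Uu")


def as_rna_notation(sequence: str) -> str:
    """Return `sequence` with every T read as U, as for an RNA polynucleotide."""
    return sequence.translate(_T_TO_U)


example = as_rna_notation("ATGCT")
```

Reading the same listed sequence as DNA leaves it unchanged, so one written sequence can describe either kind of polynucleotide.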
[0020] The terms “transfection” or “transfected” refer to introduction of a polynucleotide into a cell by non-viral or viral-based methods. The polynucleotides may be gene sequences encoding complete proteins or functional portions thereof. See, e.g., Sambrook et al., 1989, Molecular Cloning: A Laboratory Manual, 18.1-18.88.
[0021] The terms “peptide,” “polypeptide,” and “protein” are used interchangeably herein to refer to a polymer of at least two amino acid residues joined by peptide bond(s). This term does not connote a specific length of polymer, nor is it intended to imply or distinguish whether the peptide is produced using recombinant techniques, chemical or enzymatic synthesis, or is naturally occurring. The terms apply to naturally occurring amino acid polymers as well as amino acid polymers comprising at least one modified amino acid. In some cases, the polymer is interrupted by non-amino acids. The terms include amino acid chains of any length, including full length proteins, and proteins with or without secondary or tertiary structure (e.g., domains). The terms also encompass an amino acid polymer that has been modified, for example, by disulfide bond formation, glycosylation, lipidation, acetylation, phosphorylation, oxidation, and any other manipulation such as conjugation with a labeling component. The terms “amino acid” and “amino acids,” as used herein, refer to natural and non-natural amino acids, including, but not limited to, modified amino acids. Modified amino acids include amino acids that have been chemically modified to include a group or a chemical moiety not naturally present on the amino acid. The term “amino acid” includes both D-amino acids and L-amino acids.
[0022] As used herein, the term “non-native” refers to a nucleic acid or polypeptide sequence that is non-naturally occurring. Non-native refers to a non-naturally occurring nucleic acid or polypeptide sequence that comprises modifications such as mutations, insertions, or deletions. The term non-native encompasses fusion nucleic acids or polypeptides that encode or exhibit an activity (e.g., enzymatic activity, methyltransferase activity, acetyltransferase activity, kinase activity, ubiquitinating activity, etc.) of the nucleic acid or polypeptide sequence to which the non-native sequence is fused. A non-native nucleic acid or polypeptide sequence includes those linked to a naturally-occurring nucleic acid or polypeptide sequence (or a variant thereof) by genetic engineering to generate a chimeric nucleic acid or polypeptide sequence encoding a chimeric nucleic acid or polypeptide.
[0023] The term “promoter”, as used herein, refers to the regulatory DNA region which controls transcription or expression of a polynucleotide (e.g., a gene) and which may be located adjacent to or overlapping a nucleotide or region of nucleotides at which RNA transcription is initiated. A promoter may contain specific DNA sequences which bind protein factors, often referred to as transcription factors, which facilitate binding of RNA polymerase to the DNA leading to gene transcription. Eukaryotic basal promoters typically, though not necessarily, contain a TATA-box and/or a CAAT box.
[0024] The term “expression”, as used herein, refers to the process by which a nucleic acid sequence or a polynucleotide is transcribed from a DNA template (such as into mRNA or other RNA transcript) and/or the process by which a transcribed mRNA is subsequently translated into peptides, polypeptides, or proteins. Transcripts and encoded polypeptides may be collectively referred to as “gene product.” If the polynucleotide is derived from genomic DNA, the term expression includes splicing of the mRNA in a eukaryotic cell.
[0025] As used herein, “operably linked”, “operable linkage”, “operatively linked”, or grammatical equivalents thereof refer to an arrangement of genetic elements, e.g., a promoter, an enhancer, a polyadenylation sequence, etc., wherein an operation (e.g., movement or activation) of a first genetic element has some effect on the second genetic element. The effect on the second genetic element can be, but need not be, of the same type as operation of the first genetic element. For example, two genetic elements are operably linked if movement of the first element causes an activation of the
second element. For instance, a regulatory element, which may comprise promoter and/or enhancer sequences, is operatively linked to a coding region if the regulatory element helps initiate transcription of the coding sequence. There may be intervening residues between the regulatory element and coding region so long as this functional relationship is maintained.
[0026] A “vector” as used herein, refers to a macromolecule or association of macromolecules that comprises or associates with a polynucleotide and which mediates delivery of the polynucleotide to a cell. Examples of vectors include nucleic acid-based vectors (e.g., plasmids and viral vectors) and liposomes. An exemplary nucleic acid-based vector comprises genetic elements, e.g., regulatory elements, operatively linked to a gene to facilitate expression of the gene in a target.
[0027] As used herein, “expression cassette” and “nucleic acid cassette” are used interchangeably to refer to a component of a vector comprising a combination of nucleic acid sequences or elements (e.g., therapeutic gene, promoter, and a terminator) that are expressed together or are operably linked for expression. The terms encompass an expression cassette including a combination of regulatory elements and a gene or genes to which they are operably linked for expression.
[0028] A “functional fragment” of a DNA or protein sequence refers to a fragment that retains a biological activity (either functional or structural) that is substantially similar to a biological activity of the full-length DNA or protein sequence. A biological activity of a DNA sequence includes its ability to influence expression in a manner attributed to the full-length sequence.
[0029] The terms “engineered,” “synthetic,” and “artificial” are used interchangeably herein to refer to an object that has been modified by human intervention. For example, the terms refer to a polynucleotide or polypeptide that is non-naturally occurring. An engineered peptide has, but does not require, low sequence identity (e.g., less than 50% sequence identity, less than 25% sequence identity, less than 10% sequence identity, less than 5% sequence identity, less than 1% sequence identity) to a naturally occurring human protein. For example, VPR and VP64 domains are synthetic transactivation domains. Non-limiting examples include the following: a nucleic acid modified by changing its sequence to a sequence that does not occur in nature; a nucleic acid modified by ligating it to a nucleic acid that it does not associate with in nature such that the ligated product possesses a function not present in the original nucleic acid; an engineered nucleic acid synthesized in vitro with a sequence that does not exist in nature; a protein modified by changing its amino acid sequence to a sequence that does not exist in nature; an engineered protein acquiring a new function or property. An “engineered” system comprises at least one engineered component.
[0030] As used herein, the term “transposable element” refers to a DNA sequence that can move from one location in the genome to another (i.e., they can be “transposed”). Transposable elements can be generally divided into two classes. Class I transposable elements, or “retrotransposons”, are
transposed via transcription and translation of an RNA intermediate which is subsequently reincorporated at its new location in the genome via reverse transcription (a process mediated by a reverse transcriptase). Class II transposable elements, or “DNA transposons”, are transposed via a complex of single- or double-stranded DNA flanked on either side by a transposase.
[0031] As used herein, the term “Tc1/Mariner” refers to a class and superfamily of DNA transposons. Tc1/Mariner transposons consist of a transposase gene flanked by terminal inverted repeats (“TIRs”) and short target site duplications (“TSDs”). Transposition, which occurs by a “cut and paste” mechanism, is initiated by two transposases’ recognition and binding of the TIR sequences. The transposases join together and promote double-stranded DNA cleavage, after which the DNA-transposase complex inserts the DNA at the target sequence. The Tc1/Mariner superfamily exhibits a characteristic DDE catalytic triad.
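By way of non-limiting illustration only, the “cut and paste” mechanism described above can be sketched as a toy string model. The sequences, the 2-bp duplication length, and the function names below are hypothetical and are not taken from the present disclosure; the sketch ignores donor repair and strand chemistry and only shows how TIRs delimit the excised element and how a short target site duplication comes to flank the insertion.

```python
# Toy string model of "cut and paste" transposition. Illustrative only:
# the sequences, the 2-bp duplication length, and all names are hypothetical.

COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an uppercase DNA string."""
    return "".join(COMPLEMENT[base] for base in reversed(seq))


def cut_and_paste(donor: str, tir: str, target: str, site: int, tsd_len: int = 2):
    """Excise a TIR-flanked element from `donor` and insert it into `target`.

    The element runs from the first copy of `tir` to the next copy of its
    reverse complement. The `tsd_len` bases starting at `site` in the target
    end up on both sides of the inserted element (a target site duplication).
    Returns (donor_after, target_after, element).
    """
    left = donor.find(tir)
    if left < 0:
        raise ValueError("left TIR not found in donor")
    right_tir = reverse_complement(tir)
    right = donor.find(right_tir, left + len(tir))
    if right < 0:
        raise ValueError("right TIR not found in donor")
    end = right + len(right_tir)

    element = donor[left:end]                 # "cut": TIR + internal sequence + TIR
    donor_after = donor[:left] + donor[end:]  # donor with the element removed

    if not 0 <= site <= len(target) - tsd_len:
        raise ValueError("insertion site out of range")
    tsd = target[site:site + tsd_len]
    # "paste": the duplicated target bases flank the element on both sides
    target_after = target[:site + tsd_len] + element + tsd + target[site + tsd_len:]
    return donor_after, target_after, element


# Hypothetical donor: flank + TIR + internal sequence + inverted TIR + flank
donor = "TTTT" + "CAGTG" + "ATGAAACCC" + "CACTG" + "GGGG"
target = "GGGGTACCCC"
donor_after, target_after, element = cut_and_paste(donor, "CAGTG", target, site=4)
print(element)
print(donor_after)
print(target_after)
```

In this sketch the two target bases at the insertion site appear on both sides of the pasted element, which is the signature a target site duplication leaves in a real genome.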
[0032] The term “sequence identity” or “percent identity” in the context of two or more nucleic acids or polypeptide sequences, refers to two (e.g., in a pairwise alignment) or more (e.g., in a multiple sequence alignment) sequences that are the same or have a specified percentage of amino acid residues or nucleotides that are the same, when compared and aligned for maximum correspondence over a local or global comparison window, as measured using a sequence comparison algorithm. Suitable sequence comparison algorithms for polypeptide sequences include, e.g., BLASTP using parameters of a wordlength (W) of 3, an expectation (E) of 10, and the BLOSUM62 scoring matrix setting gap costs at existence of 11, extension of 1, and using a conditional compositional score matrix adjustment for polypeptide sequences longer than 30 residues; BLASTP using parameters of a wordlength (W) of 2, an expectation (E) of 1000000, and the PAM30 scoring matrix setting gap costs at 9 to open gaps and 1 to extend gaps for sequences of less than 30 residues (these are the default parameters for BLASTP in the BLAST suite available at https://blast.ncbi.nlm.nih.gov); CLUSTALW with the Smith-Waterman homology search algorithm parameters with a match of 2, a mismatch of -1, and a gap of -1; MUSCLE with default parameters; MAFFT with parameters of a retree of 2 and max iterations of 1000; Novafold with default parameters; HMMER hmmalign with default parameters.
[0033] The term “optimally aligned” in the context of two or more nucleic acids or polypeptide sequences, refers to two (e.g., in a pairwise alignment) or more (e.g., in a multiple sequence alignment) sequences that have been aligned to maximal correspondence of amino acid residues or nucleotides, for example, as determined by the alignment producing a highest or “optimized” percent identity score.
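By way of non-limiting illustration only, the following minimal sketch shows one way a percent identity could be computed from a pairwise local alignment. It implements a basic Smith-Waterman alignment with the match of 2, mismatch of -1, and gap of -1 recited above, but with a simple linear gap penalty and no substitution matrix, so it is not a reimplementation of BLASTP, CLUSTALW, MUSCLE, or MAFFT. The function names, the example sequences, and the choice of alignment length as the denominator are assumptions made for illustration.

```python
# Minimal Smith-Waterman local alignment and percent identity (illustrative sketch).
# Scoring follows the match/mismatch/gap values recited in the text; the linear gap
# model, function names, and example sequences are assumptions.

def smith_waterman(a: str, b: str, match: int = 2, mismatch: int = -1, gap: int = -1):
    """Return (aligned_a, aligned_b, best_score) for the best local alignment."""
    rows, cols = len(a) + 1, len(b) + 1
    score = [[0] * cols for _ in range(rows)]
    best, best_pos = 0, (0, 0)
    for i in range(1, rows):
        for j in range(1, cols):
            diag = score[i - 1][j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            cell = max(0, diag, score[i - 1][j] + gap, score[i][j - 1] + gap)
            score[i][j] = cell
            if cell > best:
                best, best_pos = cell, (i, j)

    # Trace back from the highest-scoring cell until a zero score is reached.
    i, j = best_pos
    top, bottom = [], []
    while i > 0 and j > 0 and score[i][j] > 0:
        diag = score[i - 1][j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
        if score[i][j] == diag:
            top.append(a[i - 1]); bottom.append(b[j - 1]); i -= 1; j -= 1
        elif score[i][j] == score[i - 1][j] + gap:
            top.append(a[i - 1]); bottom.append("-"); i -= 1
        else:
            top.append("-"); bottom.append(b[j - 1]); j -= 1
    return "".join(reversed(top)), "".join(reversed(bottom)), best


def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Identical, non-gap columns divided by alignment length, times 100."""
    if not aligned_a or len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must be non-empty and of equal length")
    same = sum(1 for x, y in zip(aligned_a, aligned_b) if x == y and x != "-")
    return 100.0 * same / len(aligned_a)


# Hypothetical 10-residue peptides differing at one position.
top, bottom, best_score = smith_waterman("MKTAYIAKQR", "MKTAYLAKQR")
print(top)
print(bottom)
print(best_score, percent_identity(top, bottom))
```

Different tools normalize percent identity differently (for example, by the shorter sequence length rather than the alignment length), so a value from this sketch should not be expected to match the output of any particular program.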
[0034] Included in the current disclosure are variants of any of the enzymes described herein with one or more conservative amino acid substitutions. Such conservative substitutions can be made in
the amino acid sequence of a polypeptide without disrupting the three-dimensional structure or function of the polypeptide. Conservative substitutions can be accomplished by substituting amino acids with similar hydrophobicity, polarity, and R chain length for one another. Additionally, or alternatively, by comparing aligned sequences of homologous proteins from different species, conservative substitutions can be identified by locating amino acid residues that have been mutated between species (e.g., non-conserved residues) without altering the basic functions of the encoded proteins. Such conservatively substituted variants may include variants with at least about 20%, at least about 25%, at least about 30%, at least about 35%, at least about 40%, at least about 45%, at least about 50%, at least about 55%, at least about 60%, at least about 65%, at least about 70%, at least about 75%, at least about 80%, at least about 85%, at least about 90%, at least about 91%, at least about 92%, at least about 93%, at least about 94%, at least about 95%, at least about 96%, at least about 97%, at least about 98%, or at least about 99% identity to any one of the transposase protein sequences described herein (e.g., MG63 family transposases described herein, or any other family transposase described herein). In some embodiments, such conservatively substituted variants are functional variants. Such functional variants can encompass sequences with substitutions such that the activity of one or more critical active site residues of the transposase is not disrupted. In some embodiments, a functional variant of any of the proteins described herein lacks substitution of at least one of the conserved or functional residues called out in FIG. 2A. In some embodiments, a functional variant of any of the proteins described herein lacks substitution of all of the conserved or functional residues called out in FIG. 2A.
[0035] Also included in the current disclosure are variants of any of the enzymes described herein with substitution of one or more catalytic residues to decrease or eliminate activity of the enzyme (e.g., decreased-activity variants). In some embodiments, a decreased-activity variant of a protein described herein comprises a disrupting substitution of at least one, at least two, or all three catalytic residues called out in FIG. 2A.
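By way of non-limiting illustration only, the distinction drawn above between variants that leave the catalytic residues untouched and variants that substitute one or more of them can be sketched as a simple position check. The reference sequence, the catalytic positions, and the function names below are hypothetical placeholders; they are not the residues called out in FIG. 2A, and whether a given substitution actually disrupts activity is an experimental question that this sketch does not answer.

```python
# Flag which designated catalytic positions differ between a reference protein and a
# substitution-only variant. Sequences and positions are hypothetical placeholders.

def substituted_positions(reference: str, variant: str, positions) -> list:
    """Return the 1-based positions (sorted) at which `variant` differs from `reference`."""
    if len(reference) != len(variant):
        raise ValueError("this sketch handles substitution-only variants of equal length")
    changed = []
    for pos in sorted(positions):
        if not 1 <= pos <= len(reference):
            raise ValueError(f"position {pos} is outside the sequence")
        if reference[pos - 1] != variant[pos - 1]:
            changed.append(pos)
    return changed


def describe_variant(reference: str, variant: str, catalytic_positions) -> str:
    """Summarize how many of the designated catalytic residues are substituted."""
    hits = substituted_positions(reference, variant, catalytic_positions)
    if not hits:
        return "catalytic residues unchanged"
    return (f"candidate decreased-activity variant: {len(hits)} of "
            f"{len(catalytic_positions)} catalytic residues substituted")


reference = "MADKDLEQ"      # hypothetical; D3, D5, and E7 stand in for a DDE triad
catalytic = (3, 5, 7)
print(describe_variant(reference, "MADKDLEQ", catalytic))
print(describe_variant(reference, "MAAKDLEQ", catalytic))
print(describe_variant(reference, "MAAKALAQ", catalytic))
```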
[0036] Conservative substitution tables providing functionally similar amino acids are available from a variety of references (see, e.g., Creighton, Proteins: Structures and Molecular Properties (W H Freeman & Co.; 2nd edition (December 1993))). The following eight groups each contain amino acids that are conservative substitutions for one another:
1) Alanine (A), Glycine (G);
2) Aspartic acid (D), Glutamic acid (E);
3) Asparagine (N), Glutamine (Q);
4) Arginine (R), Lysine (K);
5) Isoleucine (I), Leucine (L), Methionine (M), Valine (V);
6) Phenylalanine (F), Tyrosine (Y), Tryptophan (W);
7) Serine (S), Threonine (T); and
8) Cysteine (C), Methionine (M).
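By way of non-limiting illustration only, the eight groups listed above can be encoded as a lookup so that each substitution between two equal-length sequences is labeled conservative or non-conservative. The groups are taken from the list above (methionine appears in both group 5 and group 8); the function names and example sequences are hypothetical.

```python
# The eight conservative substitution groups from the list above, as sets of
# one-letter codes. Function names and example sequences are hypothetical.

CONSERVATIVE_GROUPS = [
    set("AG"),    # 1) Alanine, Glycine
    set("DE"),    # 2) Aspartic acid, Glutamic acid
    set("NQ"),    # 3) Asparagine, Glutamine
    set("RK"),    # 4) Arginine, Lysine
    set("ILMV"),  # 5) Isoleucine, Leucine, Methionine, Valine
    set("FYW"),   # 6) Phenylalanine, Tyrosine, Tryptophan
    set("ST"),    # 7) Serine, Threonine
    set("CM"),    # 8) Cysteine, Methionine
]


def is_conservative(original: str, replacement: str) -> bool:
    """True if two different residues share at least one of the eight groups."""
    original, replacement = original.upper(), replacement.upper()
    if original == replacement:
        return False  # identical residues are not a substitution
    return any(original in g and replacement in g for g in CONSERVATIVE_GROUPS)


def list_substitutions(reference: str, variant: str) -> list:
    """Return (position, ref_residue, var_residue, is_conservative) per difference."""
    if len(reference) != len(variant):
        raise ValueError("this sketch compares equal-length sequences only")
    return [
        (pos, r, v, is_conservative(r, v))
        for pos, (r, v) in enumerate(zip(reference, variant), start=1)
        if r != v
    ]


for row in list_substitutions("MKDLSA", "MRDISW"):
    print(row)
```

Because methionine belongs to two groups, the relation is not transitive in this encoding: leucine to methionine and methionine to cysteine are each conservative, while leucine to cysteine is not.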
Overview
[0037] The discovery of new transposable elements with unique functionality and structure may offer the potential to further disrupt deoxyribonucleic acid (DNA) editing technologies, improving speed, specificity, functionality, and ease of use. Relative to the predicted prevalence of transposable elements in microbes and the sheer diversity of microbial species, relatively few functionally characterized transposable elements exist in the literature. This is partly because a huge number of microbial species may not be readily cultivated in laboratory conditions. Metagenomic sequencing from natural environmental niches containing large numbers of microbial species may offer the potential to drastically increase the number of new transposable elements documented and speed the discovery of new oligonucleotide editing functionalities.
[0038] Transposable elements are deoxyribonucleic acid sequences that can change position within a genome, often resulting in the generation or amelioration of mutations. In eukaryotes, a great proportion of the genome, and a large share of the mass of cellular DNA, is attributable to transposable elements. Although transposable elements are “selfish genes” which propagate themselves at the expense of other genes, they have been found to serve various important functions and to be crucial to genome evolution. Based on their mechanism, transposable elements are classified as either Class I “retrotransposons” or Class II “DNA transposons.”
[0039] Class I transposable elements, also referred to as retrotransposons, function according to a two-part “copy and paste” mechanism involving an RNA intermediate. First, the retrotransposon is transcribed. The resulting RNA is subsequently converted back to DNA by reverse transcriptase (generally encoded by the retrotransposon itself), and the reverse transcribed retrotransposon is finally integrated into its new position in the genome by integrase. Retrotransposons are further classified into three orders. Retrotransposons with long terminal repeats (“LTRs”) encode reverse transcriptase and are flanked by long strands of repeating DNA. Retrotransposons with long interspersed nuclear elements (“LINEs”) encode reverse transcriptase, lack LTRs, and are transcribed by RNA polymerase II. Retrotransposons with short interspersed nuclear elements (“SINEs”) are transcribed by RNA polymerase III but lack reverse transcriptase, instead relying on the reverse transcription machinery of other transposable elements (e.g., LINEs).
[0040] Class II transposable elements, also referred to as DNA transposons, function according to mechanisms that do not involve an RNA intermediate. Many DNA transposons display a “cut and paste” mechanism in which transposase binds terminal inverted repeats (“TIRs”) flanking the transposon, cleaves the transposon from the donor region, and inserts it into the target region of the genome. Others, referred to as “helitrons,” display a “rolling circle” mechanism involving a single-stranded DNA intermediate and mediated by an undocumented protein believed to possess HUH endonuclease function and 5’ to 3’ helicase activity. First, a circular strand of DNA is nicked to create two single DNA strands. The protein remains attached to the 5’ phosphate of the nicked strand, leaving the 3’ hydroxyl end of the complementary strand exposed and thus allowing a polymerase to replicate the non-nicked strand. Once replication is complete, the new strand disassociates and is itself replicated along with the original template strand. Still other DNA transposons, “Polintons,” are theorized to undergo a “self-synthesis” mechanism. The transposition is initiated by an integrase’s excision of a single-stranded extra-chromosomal Polinton element, which forms a racket-like structure. The Polinton undergoes replication with DNA polymerase B, and the double-stranded Polinton is inserted into the genome by the integrase. Finally, some DNA transposons, such as those in the IS200/IS605 family, proceed via a “peel and paste” mechanism in which TnpA excises a piece of single-stranded DNA (as a circular “transposon joint”) from the lagging strand template of the donor gene and reinserts it into the replication fork of the target gene.
[0041] While transposable elements have found some use as biological tools, documented transposable elements do not encompass the full range of possible biodiversity and targetability, and may not represent all possible activities. Here, thousands of genomic fragments were mined from numerous metagenomes for transposable elements. The documented diversity of transposable elements may have been expanded and novel systems may have been developed into highly targetable, compact, and precise gene editing agents.
Gene Editing Systems
MG Enzymes
[0042] Described herein, in certain embodiments, are engineered transposase systems, comprising (a) a double-stranded nucleic acid comprising a cargo nucleotide sequence; and (b) a transposase configured to interact with the double-stranded nucleic acid to transpose the cargo nucleotide sequence to a target nucleic acid site. In some embodiments, the transposase is an MG63 transposase (i.e., SEQ ID NOs: 1-38). See FIGs. 1A-1C.
[0043] In some embodiments, the engineered transposase system is discovered through metagenomic sequencing. In some embodiments, the metagenomic sequencing is conducted on samples collected from various environments. In some embodiments, the environment is a human microbiome, an
animal microbiome, an environment with high temperatures, an environment with low temperatures, or sediment.
[0044] In some embodiments, the transposase is an MG63 transposase (i.e., SEQ ID NOs: 1-38). In some embodiments, the transposase comprises a sequence having at least about 70% sequence identity to any one of SEQ ID NOs: 1-38. In some embodiments, the transposase has at least about 20%, at least about 25%, at least about 30%, at least about 35%, at least about 40%, at least about 45%, at least about 50%, at least about 55%, at least about 60%, at least about 65%, at least about 70%, at least about 75%, at least about 80%, at least about 85%, at least about 90%, at least about 91%, at least about 92%, at least about 93%, at least about 94%, at least about 95%, at least about 96%, at least about 97%, at least about 98%, or at least about 99% identity to any one of SEQ ID NOs: 1-38. In some embodiments, the transposase comprises a sequence having at least about 70% identity to any one of SEQ ID NOs: 1-38. In some embodiments, the transposase comprises a sequence having at least about 75% identity to any one of SEQ ID NOs: 1-38. In some embodiments, the transposase comprises a sequence having at least about 80% identity to any one of SEQ ID NOs: 1-38. In some embodiments, the transposase comprises a sequence having at least about 85% identity to any one of SEQ ID NOs: 1-38. In some embodiments, the transposase comprises a sequence having at least about 90% identity to any one of SEQ ID NOs: 1-38. In some embodiments, the transposase comprises a sequence having at least about 95% identity to any one of SEQ ID NOs: 1-38. In some embodiments, the transposase comprises a sequence having at least about 96% identity to any one of SEQ ID NOs: 1-38. In some embodiments, the transposase comprises a sequence having at least about 97% identity to any one of SEQ ID NOs: 1-38. In some embodiments, the transposase comprises a sequence having at least about 98% identity to any one of SEQ ID NOs: 1-38. In some embodiments, the transposase comprises a sequence having at least about 99% identity to any one of SEQ ID NOs: 1-38. In some embodiments, the transposase comprises a sequence having 100% identity to any one of SEQ ID NOs: 1-38.
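By way of non-limiting illustration only, the tiered “at least about X% identity to any one of” language above can be read as a lookup over precomputed percent identities: take the best-matching reference sequence and report the highest recited tier that it meets. The identity values and the function name below are hypothetical, and the sketch treats each tier as an exact cutoff, which is an assumption because the text recites “about” each value.

```python
from typing import Dict, Optional, Tuple

# Tiers recited in the text: 20% to 90% in steps of 5, then 91% to 99% in steps of 1.
TIERS = tuple(range(20, 91, 5)) + tuple(range(91, 100))


def highest_tier(identities: Dict[str, float]) -> Tuple[str, Optional[int]]:
    """Return (best reference label, highest tier met) for precomputed identities.

    `identities` maps a reference label to the candidate's percent identity to it.
    The tier is None when the best identity is below the lowest tier.
    """
    if not identities:
        raise ValueError("no identities supplied")
    best_label = max(identities, key=identities.get)
    best_value = identities[best_label]
    met = [tier for tier in TIERS if best_value >= tier]
    return best_label, (met[-1] if met else None)


# Hypothetical identities of one candidate against three reference sequences.
example = {"SEQ ID NO: 1": 72.4, "SEQ ID NO: 2": 96.5, "SEQ ID NO: 3": 41.0}
print(highest_tier(example))
```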
[0045] In some embodiments, the transposase is not a Tc1/Mariner transposase. In some embodiments, the transposase has less than about 90%, less than about 85%, less than about 80%, less than about 75%, less than about 70%, less than about 65%, less than about 60%, less than about 55%, less than about 50%, less than about 45%, less than about 40%, less than about 35%, less than about 30%, less than about 25%, less than about 20%, less than about 15%, less than about 10%, or less than about 5% sequence identity to a Tc1/Mariner transposase.
[0046] In some embodiments, the transposase comprises a sequence complementary to a eukaryotic, fungal, plant, mammalian, or human genomic polynucleotide sequence. In some embodiments, the transposase comprises a sequence complementary to a eukaryotic genomic polynucleotide sequence.
In some embodiments, the transposase comprises a sequence complementary to a fungal genomic polynucleotide sequence. In some embodiments, the transposase comprises a sequence complementary to a plant genomic polynucleotide sequence. In some embodiments, the transposase comprises a sequence complementary to a mammalian genomic polynucleotide sequence. In some embodiments, the transposase comprises a sequence complementary to a human genomic polynucleotide sequence.
[0047] In some embodiments, the transposase comprises a nuclear localization sequence (NLS). In some embodiments, the NLS is at an N-terminus of the transposase. In some embodiments, the NLS is at a C-terminus of the transposase. In some embodiments, the NLS is at an N-terminus and a C-terminus of the transposase.
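By way of non-limiting illustration only, the N-terminal, C-terminal, and dual placements described above can be sketched as string concatenation on a protein sequence. The SV40 large T antigen NLS (PKKKRKV) is used as a well-known example and is an assumption here; it may or may not correspond to one of the NLS sequences of the present disclosure. The protein sequence, the optional linker, and the function name are hypothetical placeholders, and start-codon handling is outside the scope of the sketch.

```python
# Append a nuclear localization sequence (NLS) to one or both termini of a protein
# sequence. The SV40 NLS is a well-known example used here as an assumption; the
# protein, linker, and function name are hypothetical.

SV40_NLS = "PKKKRKV"


def add_nls(protein: str, nls: str = SV40_NLS, n_term: bool = False,
            c_term: bool = False, linker: str = "") -> str:
    """Return `protein` with `nls` fused at the requested terminus or termini."""
    if not (n_term or c_term):
        raise ValueError("choose at least one terminus")
    fused = protein
    if n_term:
        fused = nls + linker + fused   # NLS, optional linker, then the protein
    if c_term:
        fused = fused + linker + nls   # protein, optional linker, then the NLS
    return fused


placeholder = "MTEST"  # hypothetical stand-in for a transposase sequence
print(add_nls(placeholder, n_term=True))
print(add_nls(placeholder, c_term=True))
print(add_nls(placeholder, n_term=True, c_term=True, linker="GS"))
```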
[0048] In some embodiments, the NLS comprises a sequence of any one of SEQ ID NOs: 39-54, or a sequence having at least about 20%, at least about 25%, at least about 30%, at least about 35%, at least about 40%, at least about 45%, at least about 50%, at least about 55%, at least about 60%, at least about 65%, at least about 70%, at least about 75%, at least about 80%, at least about 85%, at least about 90%, at least about 91%, at least about 92%, at least about 93%, at least about 94%, at least about 95%, at least about 96%, at least about 97%, at least about 98%, or at least about 99% identity to any one of SEQ ID NOs: 39-54. In some embodiments, the NLS comprises a sequence having at least about 80% identity to any one of SEQ ID NOs: 39-54. In some embodiments, the NLS comprises a sequence having at least about 85% identity to any one of SEQ ID NOs: 39-54. In some embodiments, the NLS comprises a sequence having at least about 90% identity to any one of SEQ ID NOs: 39-54. In some embodiments, the NLS comprises a sequence having at least about 91% identity to any one of SEQ ID NOs: 39-54. In some embodiments, the NLS comprises a sequence having at least about 92% identity to any one of SEQ ID NOs: 39-54. In some embodiments, the NLS comprises a sequence having at least about 93% identity to any one of SEQ ID NOs: 39-54. In some embodiments, the NLS comprises a sequence having at least about 94% identity to any one of SEQ ID NOs: 39-54. In some embodiments, the NLS comprises a sequence having at least about 95% identity to any one of SEQ ID NOs: 39-54. In some embodiments, the NLS comprises a sequence having at least about 96% identity to any one of SEQ ID NOs: 39-54. In some embodiments, the NLS comprises a sequence having at least about 97% identity to any one of SEQ ID NOs: 39-54. In some embodiments, the NLS comprises a sequence having at least about 98% identity to any one of SEQ ID NOs: 39-54. In some embodiments, the NLS comprises a sequence having at least about 99% identity to any one of SEQ ID NOs: 39-54. In some embodiments, the NLS comprises a sequence having 100% identity to any one of SEQ ID NOs: 39-54.
Table 1: Example NLS Sequences that can be used with the transposases according to the present disclosure
[0049] In some embodiments, sequence identity to a transposase sequence described herein is determined by a BLASTP, CLUSTALW, MUSCLE, or MAFFT algorithm, or a CLUSTALW algorithm with the Smith-Waterman homology search algorithm parameters. In some embodiments, sequence identity to a transposase sequence is determined by the BLASTP homology search algorithm using parameters of a wordlength (W) of 3, an expectation (E) of 10, and a BLOSUM62 scoring matrix setting gap costs at existence of 11, extension of 1, and using a conditional compositional score matrix adjustment.
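By way of non-limiting illustration only, the BLASTP parameters recited above can be expressed as a command line for the NCBI BLAST+ `blastp` program. The sketch only builds the argument list and does not run the program. The mapping of “conditional compositional score matrix adjustment” to `-comp_based_stats 2`, the tabular output format, and the file names are assumptions that should be checked against the `blastp -help` output of the installed BLAST+ version.

```python
# Build (but do not run) a blastp command line carrying the parameters recited in the
# text. File names are hypothetical; the flag mapping is an assumption to verify
# against the installed BLAST+ version.

def blastp_args(query_fasta: str, subject_fasta: str) -> list:
    """Return an argument list for a pairwise blastp comparison."""
    params = {
        "word_size": 3,          # wordlength (W) of 3
        "evalue": 10,            # expectation (E) of 10
        "matrix": "BLOSUM62",    # scoring matrix
        "gapopen": 11,           # gap existence cost
        "gapextend": 1,          # gap extension cost
        "comp_based_stats": 2,   # conditional compositional score matrix adjustment
    }
    args = ["blastp", "-query", query_fasta, "-subject", subject_fasta]
    for flag, value in params.items():
        args += [f"-{flag}", str(value)]
    args += ["-outfmt", "6"]     # tabular output; its third column is percent identity
    return args


print(" ".join(blastp_args("candidate.faa", "reference.faa")))
```

The resulting list could be passed to `subprocess.run` on a machine where BLAST+ is installed; that step is omitted here so that the sketch stays self-contained.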
Cargo Polynucleotides
[0050] Described herein, in certain embodiments, are engineered transposase systems comprising a transposase and a cargo nucleotide sequence. In some embodiments, the transposase is configured to interact with the cargo nucleotide sequence and transpose the cargo nucleotide sequence to a target nucleic acid site. In some embodiments, the cargo nucleotide sequence is flanked by a left-hand transposase recognition sequence recognized by the transposase and a right-hand transposase recognition sequence recognized by the transposase.
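By way of non-limiting illustration only, a cargo nucleotide sequence flanked by left-hand and right-hand transposase recognition sequences can be represented as a small data structure. The class name, field names, and example sequences below are hypothetical placeholders and are not recognition sequences of the present disclosure.

```python
from dataclasses import dataclass

DNA_ALPHABET = set("ACGT")


@dataclass(frozen=True)
class DonorConstruct:
    """Cargo flanked by left-hand and right-hand recognition sequences (hypothetical model)."""
    left_recognition: str
    cargo: str
    right_recognition: str

    def __post_init__(self):
        # Reject empty parts and anything outside the plain A/C/G/T alphabet.
        for name in ("left_recognition", "cargo", "right_recognition"):
            value = getattr(self, name)
            if not value or not set(value) <= DNA_ALPHABET:
                raise ValueError(f"{name} must be a non-empty uppercase DNA string")

    def sequence(self) -> str:
        """Full donor sequence: left recognition sequence, cargo, right recognition sequence."""
        return self.left_recognition + self.cargo + self.right_recognition

    def cargo_span(self) -> tuple:
        """Half-open (start, end) indices of the cargo within sequence()."""
        start = len(self.left_recognition)
        return start, start + len(self.cargo)


donor = DonorConstruct(left_recognition="CAGTG", cargo="ATGAAACCC", right_recognition="CACTG")
start, end = donor.cargo_span()
print(donor.sequence())
print(donor.sequence()[start:end])
```

Keeping the three parts separate, rather than storing one concatenated string, makes it straightforward to swap the cargo while reusing the same pair of recognition sequences.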
[0051] In some embodiments, the cargo nucleotide sequence is double stranded. In some embodiments, the cargo nucleotide sequence is double stranded DNA. In some embodiments, the cargo nucleotide sequence is single stranded. In some embodiments, the cargo nucleotide sequence is a eukaryotic, plant, fungal, mammalian, rodent, or human double-stranded deoxyribonucleic acid polynucleotide.
[0052] In some embodiments, the target nucleic acid is double stranded. In some embodiments, the target nucleic acid is double stranded DNA. In some embodiments, the target nucleic acid is single stranded. In some embodiments, the target nucleic acid polynucleotide is a eukaryotic, plant, fungal, mammalian, rodent, or human double-stranded deoxyribonucleic acid polynucleotide.
MG Systems
[0053] Described herein, in certain embodiments, are engineered transposase systems, comprising (a) a double-stranded nucleic acid comprising a cargo nucleotide sequence; and (b) a transposase configured to interact with the double-stranded nucleic acid to transpose the cargo nucleotide sequence to a target nucleic acid site.
[0054] In some embodiments, the engineered transposase system comprises (a) a transposase comprising a sequence having at least about 70% identity to any one of SEQ ID NOs: 1-38, and (b) a cargo nucleotide sequence. In some embodiments, the engineered transposase system comprises (a) a transposase comprising a sequence having at least about 75% identity to any one of SEQ ID NOs: 1-38, and (b) a cargo nucleotide sequence. In some embodiments, the engineered transposase system comprises (a) a transposase comprising a sequence having at least about 80% identity to any one of SEQ ID NOs: 1-38, and (b) a cargo nucleotide sequence. In some embodiments, the engineered transposase system comprises (a) a transposase comprising a sequence having at least about 85% identity to any one of SEQ ID NOs: 1-38, and (b) a cargo nucleotide sequence. In some embodiments, the engineered transposase system comprises (a) a transposase comprising a sequence having at least about 90% identity to any one of SEQ ID NOs: 1-38, and (b) a cargo nucleotide sequence. In some embodiments, the engineered transposase system comprises (a) a transposase comprising a sequence having at least about 95% identity to any one of SEQ ID NOs: 1-38, and (b) a cargo nucleotide sequence. In some embodiments, the engineered transposase system comprises (a) a transposase comprising a sequence having at least about 96% identity to any one of SEQ ID NOs: 1-38, and (b) a cargo nucleotide sequence. In some embodiments, the engineered transposase system comprises (a) a transposase comprising a sequence having at least about 97% identity to any one of SEQ ID NOs: 1-38, and (b) a cargo nucleotide sequence. In some embodiments, the engineered transposase system comprises (a) a transposase comprising a sequence having at least about 98% identity to any one of SEQ ID NOs: 1-38, and (b) a cargo nucleotide sequence. In some embodiments, the engineered transposase system comprises (a) a transposase comprising a sequence having at least about 99% identity to any one of SEQ ID NOs: 1-38, and (b) a cargo nucleotide sequence. In some embodiments, the engineered transposase system comprises (a) a transposase comprising a sequence having 100% identity to any one of SEQ ID NOs: 1-38, and (b) a cargo nucleotide sequence.
Delivery and Vectors
[0055] Disclosed herein, in some embodiments, are nucleic acid sequences encoding an engineered transposase system disclosed herein.
[0056] In some embodiments, the nucleic acid encoding the engineered transposase system is a DNA, for example a linear DNA, a plasmid DNA, or a minicircle DNA. In some embodiments, the nucleic acid encoding the engineered transposase system is an RNA, for example a mRNA.
[0057] In some embodiments, the nucleic acid encoding the engineered transposase system is delivered by a nucleic acid-based vector. In some embodiments, the nucleic acid-based vector is a plasmid (e.g., circular DNA molecules that can autonomously replicate inside a cell), cosmid (e.g., pWE or sCos vectors), artificial chromosome, human artificial chromosome (HAC), yeast artificial chromosomes (YAC), bacterial artificial chromosome (BAC), P1-derived artificial chromosomes (PAC), phagemid, phage derivative, bacmid, or virus. In some embodiments, the nucleic acid-based vector is selected from the list consisting of: pSF-CMV-NEO-NH2-PPT-3XFLAG, pSF-CMV-NEO-COOH-3XFLAG, pSF-CMV-PURO-NH2-GST-TEV, pSF-OXB20-COOH-TEV-FLAG(R)-6His, pCEP4, pDEST27, pSF-CMV-Ub-KrYFP, pSF-CMV-FMDV-daGFP, pEF1a-mCherry-N1 vector, pEF1a-tdTomato vector, pSF-CMV-FMDV-Hygro, pSF-CMV-PGK-Puro, pMCP-tag(m), pSF-CMV-PURO-NH2-CMYC, pSF-OXB20-BetaGal, pSF-OXB20-Fluc, pSF-OXB20, pSF-Tac, pRI 101-AN DNA, pCambia2301, pTYB21, pKLAC2, pAc5.1/V5-His A, and pDEST8.
[0058] In some embodiments, the nucleic acid-based vector comprises a promoter. In some embodiments, the promoter is selected from the group consisting of a mini promoter, an inducible promoter, a constitutive promoter, and derivatives thereof. In some embodiments, the promoter is selected from the group consisting of CMV, CBA, EF1a, CAG, PGK, TRE, U6, UAS, T7, Sp6, lac, araBAD, trp, Ptac, p5, p19, p40, Synapsin, CaMKII, GRK1, and derivatives thereof. In some embodiments, the promoter is a U6 promoter. In some embodiments, the promoter is a CAG promoter.
[0059] In some embodiments, the nucleic acid-based vector is a virus. In some embodiments, the virus is an alphavirus, a parvovirus, an adenovirus, an AAV, a baculovirus, a Dengue virus, a lentivirus, a herpesvirus, a poxvirus, an anellovirus, a bocavirus, a vaccinia virus, or a retrovirus. In some embodiments, the virus is an alphavirus. In some embodiments, the virus is a parvovirus. In
some embodiments, the virus is an adenovirus. In some embodiments, the virus is an AAV. In some embodiments, the virus is a baculovirus. In some embodiments, the virus is a Dengue virus. In some embodiments, the virus is a lentivirus. In some embodiments, the virus is a herpesvirus. In some embodiments, the virus is a poxvirus. In some embodiments, the virus is an anellovirus. In some embodiments, the virus is a bocavirus. In some embodiments, the virus is a vaccinia virus. In some embodiments, the virus is a retrovirus.
[0060] In some embodiments, the AAV is AAV1, AAV2, AAV3, AAV4, AAV5, AAV6, AAV7, AAV8, AAV9, AAV10, AAV11, AAV12, AAV13, AAV14, AAV15, AAV16, AAV-rh8, AAV-rh10, AAV-rh20, AAV-rh39, AAV-rh74, AAV-rhM4-1, AAV-hu37, AAV-Anc80, AAV-Anc80L65, AAV-7m8, AAV-PHP-B, AAV-PHP-EB, AAV-2.5, AAV-2tYF, AAV-3B, AAV-LK03, AAV-HSC1, AAV-HSC2, AAV-HSC3, AAV-HSC4, AAV-HSC5, AAV-HSC6, AAV-HSC7, AAV-HSC8, AAV-HSC9, AAV-HSC10, AAV-HSC11, AAV-HSC12, AAV-HSC13, AAV-HSC14, AAV-HSC15, AAV-TT, AAV-DJ/8, AAV-Myo, AAV-NP40, AAV-NP59, AAV-NP22, AAV-NP66, AAV-HSC16, or a derivative thereof. In some embodiments, the herpesvirus is HSV type 1, HSV-2, VZV, EBV, CMV, HHV-6, HHV-7, or HHV-8.
[0061] In some embodiments, the virus is AAV1 or a derivative thereof. In some embodiments, the virus is AAV2 or a derivative thereof. In some embodiments, the virus is AAV3 or a derivative thereof. In some embodiments, the virus is AAV4 or a derivative thereof. In some embodiments, the virus is AAV5 or a derivative thereof. In some embodiments, the virus is AAV6 or a derivative thereof. In some embodiments, the virus is AAV7 or a derivative thereof. In some embodiments, the virus is AAV8 or a derivative thereof. In some embodiments, the virus is AAV9 or a derivative thereof. In some embodiments, the virus is AAV10 or a derivative thereof. In some embodiments, the virus is AAV11 or a derivative thereof. In some embodiments, the virus is AAV12 or a derivative thereof. In some embodiments, the virus is AAV13 or a derivative thereof. In some embodiments, the virus is AAV14 or a derivative thereof. In some embodiments, the virus is AAV15 or a derivative thereof. In some embodiments, the virus is AAV16 or a derivative thereof. In some embodiments, the virus is AAV-rh8 or a derivative thereof. In some embodiments, the virus is AAV-rh10 or a derivative thereof. In some embodiments, the virus is AAV-rh20 or a derivative thereof. In some embodiments, the virus is AAV-rh39 or a derivative thereof. In some embodiments, the virus is AAV-rh74 or a derivative thereof. In some embodiments, the virus is AAV-rhM4-1 or a derivative thereof. In some embodiments, the virus is AAV-hu37 or a derivative thereof. In some embodiments, the virus is AAV-Anc80 or a derivative thereof. In some embodiments, the virus is AAV-Anc80L65 or a derivative thereof. In some embodiments, the virus is AAV-7m8 or a derivative thereof. In some embodiments, the virus is AAV-PHP-B or a derivative thereof. In some embodiments, the virus is AAV-PHP-EB or a derivative thereof. In some embodiments, the virus is AAV-2.5 or a derivative thereof. In some embodiments, the virus is AAV-2tYF or a derivative thereof. In some embodiments, the virus is AAV-3B or a derivative thereof. In some embodiments, the virus is AAV-LK03 or a derivative thereof. In some embodiments, the virus is AAV-HSC1 or a derivative thereof. In some embodiments, the virus is AAV-HSC2 or a derivative thereof. In some embodiments, the virus is AAV-HSC3 or a derivative thereof. In some embodiments, the virus is AAV-HSC4 or a derivative thereof. In some embodiments, the virus is AAV-HSC5 or a derivative thereof. In some embodiments, the virus is AAV-HSC6 or a derivative thereof. In some embodiments, the virus is AAV-HSC7 or a derivative thereof. In some embodiments, the virus is AAV-HSC8 or a derivative thereof. In some embodiments, the virus is AAV-HSC9 or a derivative thereof. In some embodiments, the virus is AAV-HSC10 or a derivative thereof. In some embodiments, the virus is AAV-HSC11 or a derivative thereof. In some embodiments, the virus is AAV-HSC12 or a derivative thereof. In some embodiments, the virus is AAV-HSC13 or a derivative thereof. In some embodiments, the virus is AAV-HSC14 or a derivative thereof. In some embodiments, the virus is AAV-HSC15 or a derivative thereof. In some embodiments, the virus is AAV-TT or a derivative thereof. In some embodiments, the virus is AAV-DJ/8 or a derivative thereof. In some embodiments, the virus is AAV-Myo or a derivative thereof. In some embodiments, the virus is AAV-NP40 or a derivative thereof. In some embodiments, the virus is AAV-NP59 or a derivative thereof. In some embodiments, the virus is AAV-NP22 or a derivative thereof. In some embodiments, the virus is AAV-NP66 or a derivative thereof. In some embodiments, the virus is AAV-HSC16 or a derivative thereof.
[0062] In some embodiments, the virus is HSV-1 or a derivative thereof. In some embodiments, the virus is HSV-2 or a derivative thereof. In some embodiments, the virus is VZV or a derivative thereof. In some embodiments, the virus is EBV or a derivative thereof. In some embodiments, the virus is CMV or a derivative thereof. In some embodiments, the virus is HHV-6 or a derivative thereof. In some embodiments, the virus is HHV-7 or a derivative thereof. In some embodiments, the virus is HHV-8 or a derivative thereof.
[0063] In some embodiments, the nucleic acid encoding the engineered transposase system is delivered by a non-nucleic acid-based delivery system (e.g., a non-viral delivery system). In some embodiments, the non-viral delivery system is a liposome. In some embodiments, the nucleic acid is associated with a lipid. The nucleic acid associated with a lipid, in some embodiments, is encapsulated in the aqueous interior of a liposome, interspersed within the lipid bilayer of a liposome, attached to a liposome via a linking molecule that is associated with both the liposome and the nucleic acid, entrapped in a liposome, complexed with a liposome, dispersed in a solution containing a lipid, mixed with a lipid, combined with a lipid, contained as a suspension in a lipid, contained or complexed with a micelle, or otherwise associated with a lipid. In some embodiments, the nucleic acid is comprised in a lipid nanoparticle (LNP).
[0064] In some embodiments, the engineered transposase system is introduced into the cell in any suitable way, either stably or transiently. In some embodiments, an engineered transposase system is transfected into the cell. In some embodiments, the cell is transduced or transfected with a nucleic acid construct that encodes an engineered transposase system. For example, a cell is transduced (e.g., with a virus encoding an engineered transposase system), or transfected (e.g., with a plasmid encoding an engineered transposase system) with a nucleic acid that encodes an engineered transposase system, or the translated engineered transposase system. In some embodiments, the transduction is a stable or transient transduction. In some embodiments, a plasmid expressing an engineered transposase system is introduced into cells through electroporation, transient (e.g., lipofection) and stable genome integration (e.g., piggybac) and viral transduction (for example lentivirus or AAV) or other methods known to those of skill in the art. In some embodiments, the gene editing system is introduced into the cell as one or more polypeptides. In some embodiments, delivery is achieved through the use of RNP complexes. Delivery methods to cells for polypeptides and/or RNPs are known in the art, for example by electroporation or by cell squeezing.
[0065] Exemplary methods of delivery of nucleic acids include lipofection, nucleofection, electroporation, stable genome integration (e.g., piggybac), microinjection, biolistics, virosomes, liposomes, immunoliposomes, polycation or lipid nucleic acid conjugates, naked DNA, artificial virions, and agent-enhanced uptake of DNA. Lipofection is described in, e.g., U.S. Pat. Nos. 5,049,386; 4,946,787; and 4,897,355, and lipofection reagents are sold commercially (e.g., Transfectam™, Lipofectin™ and SF Cell Line 4D-Nucleofector X Kit™ (Lonza)). Cationic and neutral lipids that are suitable for efficient receptor-recognition lipofection of polynucleotides include those of WO 91/17424 and WO 91/16024. In some embodiments, the delivery is to cells (e.g., in vitro or ex vivo administration) or target tissues (e.g., in vivo administration). In some embodiments, the nucleic acid is comprised in a liposome or a nanoparticle that specifically targets a host cell.
[0066] Additional methods for the delivery of nucleic acids to cells are known to those skilled in the art. See, for example, US 2003/0087817.
Cells
[0067] Described herein, in certain embodiments, is a cell comprising the engineered transposase system described herein.
[0068] In some embodiments, the cell is a eukaryotic cell (e.g., a plant cell, an animal cell, a protist cell, or a fungal cell), a mammalian cell (e.g., a Chinese hamster ovary (CHO) cell, a baby hamster kidney (BHK) cell, a human embryo kidney (HEK) cell, a mouse myeloma (NS0) cell, or a human retinal cell), an immortalized cell (e.g., a HeLa cell, a COS cell, a HEK-293T cell, an MDCK cell, a 3T3 cell, a PC12 cell, a Huh7 cell, a HepG2 cell, a K562 cell, an N2a cell, or an SY5Y cell), an insect cell (e.g., a Spodoptera frugiperda cell, a Trichoplusia ni cell, a Drosophila melanogaster cell, an S2 cell, or a Heliothis virescens cell), a yeast cell (e.g., a Saccharomyces cerevisiae cell, a Cryptococcus cell, or a Candida cell), a plant cell (e.g., a parenchyma cell, a collenchyma cell, or a sclerenchyma cell), a fungal cell (e.g., a Saccharomyces cerevisiae cell, a Cryptococcus cell, or a Candida cell), or a prokaryotic cell (e.g., an E. coli cell, a streptococcus bacterium cell, a streptomyces soil bacteria cell, or an archaea cell). In some embodiments, the cell is a eukaryotic cell. In some embodiments, the cell is a mammalian cell. In some embodiments, the cell is an immortalized cell. In some embodiments, the cell is an insect cell. In some embodiments, the cell is a yeast cell. In some embodiments, the cell is a plant cell. In some embodiments, the cell is a fungal cell. In some embodiments, the cell is a prokaryotic cell.
[0069] In some embodiments, the cell is an A549, HEK-293, HEK-293T, BHK, CHO, HeLa, MRC5, Sf9, Cos-1, Cos-7, Vero, BSC 1, BSC 40, BMT 10, WI38, HeLa, Saos, C2C12, L cell, HT1080, HepG2, Huh7, K562, a primary cell, or derivative thereof.
Methods of Use
[0070] Described herein, in certain embodiments, are methods for modifying a target nucleic acid comprising providing an engineered transposase system disclosed herein. In some embodiments, the engineered transposase system comprises a transposase and cargo nucleotide sequence. In some embodiments, the target nucleic acid is double stranded. In some embodiments, the target nucleic acid is double stranded DNA. In some embodiments, the target nucleic acid is single stranded.
[0071] In some embodiments, the methods are used to introduce a modification in the genome of a cell. In some embodiments, the modification is an insertion, deletion, or mutation. In some embodiments, the methods are used to introduce site-directed insertions, deletions, and/or mutations in the genome of a cell (for example an insertion and a mutation). In some embodiments, the methods are used in combination with a nucleic acid template to facilitate site-directed insertions into the genome of a cell.
[0072] In some embodiments, the cell is a human cell. In some embodiments, the cell genome or a vector comprised in the cell is modified. In some embodiments, the cell genome is modified ex vivo. In some embodiments, the cell genome is modified in vivo.
[0073] Systems of the present disclosure may be used for various applications, such as, for example, nucleic acid editing (e.g., gene editing) or binding to a nucleic acid molecule (e.g., sequence-specific binding). Such systems may be used, for example, for addressing (e.g., removing or replacing) a genetically inherited mutation that may cause a disease in a subject, inactivating a gene in order to ascertain its function in a cell, as a diagnostic tool to detect disease-causing genetic elements (e.g., via cleavage of reverse-transcribed viral RNA or an amplified DNA sequence encoding a disease-causing mutation), as deactivated enzymes in combination with a probe to target and detect a specific nucleotide sequence (e.g., a sequence encoding antibiotic resistance in bacteria), to render viruses inactive or incapable of infecting host cells by targeting viral genomes, to add genes or amend metabolic pathways to engineer organisms to produce valuable small molecules, macromolecules, or secondary metabolites, to establish a gene drive element for evolutionary selection, or to detect cell perturbations by foreign small molecules and nucleotides as a biosensor.
[0074] Described herein, in certain embodiments, are methods for modifying a target nucleic acid comprising providing an engineered transposase system. In some embodiments, the present disclosure provides a method for binding, nicking, cleaving, marking, modifying, or transposing a double-stranded deoxyribonucleic acid polynucleotide. In some embodiments, the method comprises contacting the double-stranded deoxyribonucleic acid polynucleotide with a transposase. In some embodiments, the transposase induces a single-stranded break or a double-stranded break at or proximal to the target nucleic acid site. In some embodiments, the transposase induces a staggered single-stranded break within or 5’ to the target site.
[0075] In some embodiments, the double-stranded deoxyribonucleic acid polynucleotide is a eukaryotic, plant, fungal, mammalian, rodent, or human double-stranded deoxyribonucleic acid polynucleotide.
[0076] In some embodiments, the transposase is configured to transpose the cargo nucleotide sequence as a single-stranded deoxyribonucleic acid polynucleotide. In some embodiments, the transposase is configured to transpose the cargo nucleotide sequence as a double-stranded deoxyribonucleic acid polynucleotide. In some embodiments, the transposase is configured to transpose the cargo nucleotide sequence via a ribonucleic acid polynucleotide intermediate. In some embodiments, the cargo nucleotide sequence is flanked by a 3’ untranslated region (UTR) and a 5’ untranslated region (UTR).
[0077] In some embodiments, the present disclosure provides a method of modifying a target nucleic acid site. In some embodiments, the method comprises delivering to the target nucleic acid site the engineered transposase system described herein. In some embodiments, the engineered transposase system is configured such that upon binding of the engineered transposase system to the target nucleic acid site, the engineered transposase system modifies the target nucleic acid site.
[0078] In some embodiments, modifying the target nucleic acid site comprises binding, nicking, cleaving, marking, modifying, or transposing the target nucleic acid site. In some embodiments, the target nucleic acid site comprises deoxyribonucleic acid (DNA) or ribonucleic acid (RNA). In some embodiments, the target nucleic acid comprises genomic DNA, viral DNA, viral RNA, or bacterial DNA. In some embodiments, the target nucleic acid site is in vitro. In some embodiments, the target nucleic acid site is within a cell. In some embodiments, the cell is a prokaryotic cell, a bacterial cell, a eukaryotic cell, a fungal cell, a plant cell, an animal cell, a mammalian cell, a rodent cell, a primate cell, or a human cell. In some embodiments, the cell is a primary cell. In some embodiments, the primary cell is a T cell. In some embodiments, the primary cell is a hematopoietic stem cell (HSC). In some embodiments, the cell is a human cell. In some embodiments, the cell is genome edited ex vivo. In some embodiments, the cell is genome edited in vivo.
[0079] In some embodiments, delivery of the engineered transposase system to the target nucleic acid site comprises delivering the nucleic acid described herein or the vector described herein. In some embodiments, delivery of the engineered transposase system to the target nucleic acid site comprises delivering a nucleic acid comprising an open reading frame encoding the transposase. In some embodiments, the nucleic acid comprises a promoter. In some embodiments, the open reading frame encoding the transposase is operably linked to the promoter.
[0080] In some embodiments, delivery of the engineered transposase system to the target nucleic acid site comprises delivering a capped mRNA containing the open reading frame encoding the transposase. In some embodiments, delivery of the engineered transposase system to the target nucleic acid site comprises delivering a translated polypeptide.
[0081] In some embodiments, the transposase does not induce a break at or proximal to the target nucleic acid site.
[0082] In some embodiments, the transposition activity is measured in vitro by introducing the transposase to cells comprising the target nucleic acid site and detecting transposition of the target nucleic acid site in the cells. In some embodiments, the composition comprises 20 pmoles or less of the transposase. In some embodiments, the composition comprises 1 pmol or less of the transposase.
[0083] Further described herein, in certain embodiments, are methods of manufacturing a transposase. In some embodiments, the method comprises cultivating a host cell with the engineered transposase system described herein.
[0084] In some embodiments, the host cell is a bacterial cell. In some embodiments, the bacterial cell is Bifidobacterium longum, Bifidobacterium lactis, Bifidobacterium animalis, Bifidobacterium breve, Bifidobacterium infantis, Bifidobacterium adolescentis, Lactobacillus acidophilus, Lactobacillus casei, Lactobacillus paracasei, Lactobacillus salivarius, Lactobacillus reuteri, Lactobacillus rhamnosus, Lactobacillus johnsonii, Lactobacillus plantarum, Lactobacillus fermentum, Lactococcus lactis, Streptococcus thermophilus, Lactococcus lactis, Lactococcus diacetylactis, Lactococcus cremoris, Lactobacillus bulgaricus, Lactobacillus helveticus, Lactobacillus delbrueckii, or Escherichia coli. In some embodiments, the host cell is an E. coli cell. In some embodiments, the E. coli cell is a λDE3 lysogen or a BL21(DE3) strain. In some embodiments, the E. coli cell has an ompT lon genotype.
[0085] In some embodiments, the host cell is an E. coli cell. In some embodiments, the E. coli cell is a λDE3 lysogen or the E. coli cell is a BL21(DE3) strain. In some embodiments, the E. coli cell has an ompT lon genotype.
[0086] In some embodiments, the open reading frame is operably linked to a promoter sequence. In some embodiments, the promoter is selected from the group consisting of a mini promoter, an inducible promoter, a constitutive promoter, and derivatives thereof. In some embodiments, the promoter is selected from the group consisting of CMV, CBA, EF1a, CAG, PGK, TRE, U6, UAS, T7, Sp6, lac, araBAD, trp, Ptac, p5, p19, p40, Synapsin, CaMKII, GRK1, and derivatives thereof.
[0087] In some embodiments, the open reading frame is operably linked to a T7 promoter sequence, a T7-lac promoter sequence, a lac promoter sequence, a tac promoter sequence, a trc promoter sequence, a ParaBAD promoter sequence, a PrhaBAD promoter sequence, a T5 promoter sequence, a cspA promoter sequence, an araPBAD promoter, a strong leftward promoter from phage lambda (pL promoter), or any combination thereof.
[0088] In some embodiments, the open reading frame comprises a sequence encoding an affinity tag linked in-frame to a sequence encoding the transposase. In some embodiments, the affinity tag is an immobilized metal affinity chromatography (IMAC) tag. In some embodiments, the IMAC tag is a polyhistidine tag. In some embodiments, the affinity tag is a myc tag, a human influenza hemagglutinin (HA) tag, a maltose binding protein (MBP) tag, a glutathione S-transferase (GST) tag, a streptavidin tag, a FLAG tag, or any combination thereof. In some embodiments, the affinity tag is linked in-frame to the sequence encoding the transposase via a linker sequence encoding a protease cleavage site. In some embodiments, the protease cleavage site is a tobacco etch virus (TEV) protease cleavage site, a PreScission® protease cleavage site, a Thrombin cleavage site, a Factor Xa cleavage site, an enterokinase cleavage site, or any combination thereof.
[0089] In some embodiments, the open reading frame is codon-optimized for expression in the host cell. In some embodiments, the open reading frame is provided on a vector. In some embodiments, the open reading frame is integrated into a genome of the host cell.
[0090] In some embodiments, the present disclosure provides a culture comprising a host cell described herein in compatible liquid medium.
[0091] In some embodiments, the present disclosure provides a method of producing a transposase, comprising cultivating a host cell described herein in compatible growth medium. In some embodiments, the method further comprises inducing expression of the transposase by addition of an additional chemical agent or an increased amount of a nutrient. In some embodiments, the additional chemical agent or increased amount of a nutrient comprises isopropyl β-D-1-thiogalactopyranoside (IPTG) or additional amounts of lactose. In some embodiments, the method further comprises isolating the host cell after the cultivation and lysing the host cell to produce a protein extract. In some embodiments, the method further comprises subjecting the protein extract to IMAC or ion-affinity chromatography. In some embodiments, the open reading frame comprises a sequence encoding an IMAC affinity tag linked in-frame to a sequence encoding the transposase. In some embodiments, the IMAC affinity tag is linked in-frame to the sequence encoding the transposase via a linker sequence encoding a protease cleavage site. In some embodiments, the protease cleavage site comprises a tobacco etch virus (TEV) protease cleavage site, a PreScission® protease cleavage site, a Thrombin cleavage site, a Factor Xa cleavage site, an enterokinase cleavage site, or any combination thereof. In some embodiments, the method further comprises cleaving the IMAC affinity tag by contacting a protease corresponding to the protease cleavage site to the transposase. In some embodiments, the method further comprises performing subtractive IMAC affinity chromatography to remove the affinity tag from a composition comprising the transposase.
Kits
[0092] In some embodiments, this disclosure provides kits comprising one or more nucleic acid constructs encoding the various components of the transposase or gene editing system described herein, e.g., comprising a nucleotide sequence encoding the components of the transposase or gene editing system capable of modifying a target DNA sequence. In some embodiments, the nucleotide sequence comprises a heterologous promoter that drives expression of the gene editing system components.
[0093] In some embodiments, any of the transposase or gene editing systems disclosed herein is assembled into a pharmaceutical, diagnostic, or research kit to facilitate its use in therapeutic, diagnostic, or research applications. A kit may include one or more containers housing any of the vectors disclosed herein and instructions for use.
[0094] The kit may be designed to facilitate use of the methods described herein by researchers and can take many forms. Each of the compositions of the kit, where applicable, may be provided in liquid form (e.g., in solution), or in solid form (e.g., a dry powder). In certain cases, some of the compositions may be constitutable or otherwise processable (e.g., to an active form), for example, by the addition of a suitable solvent or other species (for example, water or a cell culture medium), which may or may not be provided with the kit. As used herein, "instructions" can define a component of instruction and/or promotion, and typically involve written instructions on or associated with packaging of the disclosure. Instructions also can include any oral or electronic instructions provided in any manner such that a user will clearly recognize that the instructions are to be associated with the kit, for example, audiovisual (e.g., videotape, DVD, etc.), Internet, and/or web-based communications, etc. The written instructions, in some embodiments, are in a form prescribed by a governmental agency regulating the manufacture, use, or sale of pharmaceuticals or biological products, which instructions can also reflect approval by the agency of manufacture, use, or sale for animal administration.
EXAMPLES
Example 1 - A method of metagenomic analysis for new proteins
[0095] Metagenomic samples were collected from sediment, soil, and animals. Deoxyribonucleic acid (DNA) was extracted and sequenced. Additional raw sequence data from public sources included animal microbiomes, sediment, soil, hot springs, hydrothermal vents, marine, peat bogs, permafrost, and sewage sequences. Metagenomic sequence data was searched using Hidden Markov Models generated based on known transposase protein sequences to identify new transposases. Transposase proteins identified by the search were aligned to known proteins to identify potential active sites. This metagenomic workflow resulted in the delineation of the MG63 family described herein.
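The filtering step that follows a profile search of this kind can be sketched as below. This is a minimal illustration only: the source does not name the search software, so the sketch assumes a tool that writes HMMER-style tabular output (target name, accession, query name, accession, full-sequence E-value, score, ...). The function name `filter_hits`, the E-value cutoff, and the contig and profile names are hypothetical and do not come from the disclosure.

```python
def filter_hits(tblout_text, max_evalue=1e-5):
    """Return (target, query, evalue, score) tuples for hits passing the cutoff.

    Assumes whitespace-delimited rows in which field 0 is the target name,
    field 2 the query (profile) name, field 4 the full-sequence E-value and
    field 5 the bit score. Comment lines start with '#'.
    """
    hits = []
    for line in tblout_text.splitlines():
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and header/comment lines
        fields = line.split()
        target, query = fields[0], fields[2]
        evalue, score = float(fields[4]), float(fields[5])
        if evalue <= max_evalue:
            hits.append((target, query, evalue, score))
    hits.sort(key=lambda h: h[2])  # most significant hit first
    return hits


# Made-up rows for illustration; not data from the disclosure.
EXAMPLE = """\
# target name  accession  query name  accession  E-value  score  bias
contig_17_orf3  -  transposase_profile  -  2.1e-40  135.2  0.1
contig_02_orf9  -  transposase_profile  -  3.0e-02  12.4  0.0
contig_88_orf1  -  transposase_profile  -  7.5e-12  45.0  0.3
"""

if __name__ == "__main__":
    for hit in filter_hits(EXAMPLE):
        print(hit)
```

Candidates that pass such a cutoff would then go on to the alignment against known proteins described above; the appropriate threshold depends on the profile and database size and is not specified in the source.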
Example 2 - Discovery of MG63 Family of Transposases
[0096] Analysis of the data from the metagenomic analysis of Example 1 revealed a new cluster of previously undescribed putative transposase systems comprising 1 family (MG63). The corresponding protein sequences for these new enzymes and their exemplary subdomains are presented as SEQ ID NOs: 1-38.
Example 3 - Integrase in vitro activity (prophetic)
[0097] Integrase activity is preferentially conducted via expression in an E. coli lysate-based expression system. The required components for in vitro testing are three plasmids: an expression plasmid with the transposon gene(s) under a T7 promoter, a target plasmid, and a donor plasmid which contains the required left end (LE) and right end (RE) DNA sequences for transposition around a cargo gene (e.g., Tet resistance gene). The lysate-based expression products, target DNA, and donor DNA are incubated to allow for transposition to occur. Transposition is detected via PCR. In addition, the transposition product will be tagmented with Tn5 and sequenced via NGS to determine the insertion sites on a population of transposition events. Alternatively, the in vitro transposition products can be transformed into E. coli under antibiotic (e.g., Tet) selection, where growth requires the transposition cargo to be stably inserted into a plasmid. Either single colonies or a population of E. coli can be sequenced to determine the insertion sites.
[0098] Integration efficiency can be measured via ddPCR or qPCR of the experimental output of target DNA with integrated cargo, normalized to the amount of unmodified target DNA also measured via ddPCR.
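The normalization described above can be written out as a short calculation. The sketch below assumes that the ddPCR or qPCR readout has already been converted to copy numbers (or concentrations in the same units) for the integrated and unmodified target; the function names and the example values are hypothetical. The second function, which reports integrated copies as a fraction of all target copies, is an alternative convention and is not stated in the source.

```python
def integration_efficiency(integrated_copies, unmodified_copies):
    """Integrated-target signal normalized to unmodified-target signal."""
    if unmodified_copies <= 0:
        raise ValueError("unmodified target signal must be positive")
    return integrated_copies / unmodified_copies


def integrated_fraction(integrated_copies, unmodified_copies):
    """Alternative convention: integrated copies as a fraction of all target copies."""
    total = integrated_copies + unmodified_copies
    if total <= 0:
        raise ValueError("total target signal must be positive")
    return integrated_copies / total


if __name__ == "__main__":
    # Illustrative numbers only: 50 integrated and 950 unmodified copies per uL.
    print(integration_efficiency(50, 950))
    print(integrated_fraction(50, 950))
```

The two conventions give nearly the same value when integration is rare and diverge as the integrated fraction grows, so a report should state which one was used.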
[0099] This assay may also be conducted with purified protein components rather than from lysate-based expression. In this case, the proteins are expressed in E. coli protease-deficient B strain under T7 inducible promoter, the cells are lysed using sonication, and the His-tagged protein of interest is purified. Purity is determined using densitometry of the protein bands resolved on SDS-PAGE and Coomassie stained acrylamide gels. The protein is desalted in storage buffer composed of 50 mM Tris-HCl, 300 mM NaCl, 1 mM TCEP, 5% glycerol; pH 7.5 (or other buffers as determined for maximum stability) and stored at -80°C. After purification the transposon gene(s) are added to the target DNA and donor DNA as described above in a reaction buffer, for example 26 mM HEPES pH 7.5, 4.2 mM TRIS pH 8, 50 ug/mL BSA, 2 mM ATP, 2.1 mM DTT, 0.05 mM EDTA, 0.2 mM MgCl2, 28 mM NaCl, 21 mM KCl, 1.35% glycerol, (final pH 7.5) supplemented with 15 mM MgOAc2.
Example 4 - Transposon end verification via gel shift (prophetic)
[00100] The transposon ends are tested for transposase binding via an electrophoretic mobility shift assay (EMSA). In this case, the potential LE or RE is synthesized as a DNA fragment (100-500 bp) and end-labeled with FAM via PCR with FAM-labeled primers. The transposase protein is synthesized in an in vitro transcription/translation system. After synthesis, 1 uL of protein is added to 50 nM of the labeled RE or LE in a 10 uL reaction in binding buffer (e.g., 20 mM HEPES pH 7.5, 2.5 mM Tris pH 7.5, 10 mM NaCl, 0.0625 mM EDTA, 5 mM TCEP, 0.005% BSA, 1 ug/mL poly(dI-dC), and 5% glycerol). The binding is incubated at 30°C for 40 minutes, then 2 uL of 6X loading buffer (60 mM KCl, 10 mM Tris pH 7.6, 50% glycerol) is added. The binding reaction is separated on a 5% TBE gel and visualized. Shifts of the LE or RE in the presence of transposase protein can be attributed to successful binding and are indicative of transposase activity. This assay can also be performed with transposase truncations or mutations, as well as using E. coli extract or purified protein.
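The quantities in the binding reaction above follow from simple unit arithmetic, sketched here as a check. The helper names are hypothetical; the only inputs taken from the text are the 50 nM labeled fragment, the 10 uL reaction volume, and the 2 uL of 6X loading buffer.

```python
def picomoles(conc_nM, volume_uL):
    """Amount of substance in pmol: nM x uL gives fmol, and 1000 fmol = 1 pmol."""
    return conc_nM * volume_uL / 1000.0


def diluted(stock, stock_volume_uL, other_volume_uL):
    """Concentration (or fold strength) of a stock after mixing with another volume."""
    return stock * stock_volume_uL / (stock_volume_uL + other_volume_uL)


if __name__ == "__main__":
    # Labeled LE or RE fragment present in the 10 uL binding reaction.
    print(picomoles(50, 10), "pmol labeled DNA")
    # Strength of the 6X loading buffer after 2 uL is added to the 10 uL reaction.
    print(diluted(6, 2, 10), "X loading buffer")
    # Labeled DNA concentration in the 12 uL sample that is loaded on the gel.
    print(diluted(50, 10, 2), "nM labeled DNA at loading")
```

Under these assumptions the reaction holds 0.5 pmol of labeled fragment, and adding 2 uL of 6X buffer to 10 uL brings the loading buffer to 1X in a 12 uL sample.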
Example 5 - Integrase activity in E. coli (prophetic)
[00101] Engineered E. coli strains are transformed with a plasmid expressing the transposon genes and a plasmid containing a temperature-sensitive origin of replication with a selectable marker flanked by left end (LE) and right end (RE) transposon motifs for integration. Transformants induced for expression of these genes are then screened for transfer of the marker to a genomic target by selection at restrictive temperature for plasmid replication and the marker integration in the genome is confirmed by PCR.
[00102] Integrations are screened using an unbiased approach. In brief, purified gDNA is tagmented with Tn5, and DNA of interest is then PCR amplified using primers specific to the Tn5 tagmentation and the selectable marker. The amplicons are then prepared for NGS sequencing. For analysis, the resulting reads are trimmed of the transposon sequences, and the flanking sequences are mapped to the genome to determine insertion positions and insertion rates.
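The trim-and-map analysis described above can be sketched on toy data as follows. This is an illustration under stated assumptions, not the analysis pipeline of the disclosure: exact string matching stands in for a real read aligner, only one strand is considered, and the transposon end sequence, reference sequence, reads, and function names are all made up for the example.

```python
from collections import Counter


def trim_transposon_end(read, end_seq):
    """Return the sequence following the transposon end, or None if it is absent."""
    idx = read.find(end_seq)
    if idx == -1:
        return None
    return read[idx + len(end_seq):]


def map_insertions(reads, end_seq, genome, min_flank=12):
    """Count reads per insertion position (0-based index of the first flank base).

    A flank is kept only if it matches the reference exactly and at a single
    location; a real analysis would use an aligner and handle both strands.
    """
    sites = Counter()
    for read in reads:
        flank = trim_transposon_end(read, end_seq)
        if flank is None or len(flank) < min_flank:
            continue  # no transposon end found, or flank too short to place
        pos = genome.find(flank)
        if pos != -1 and genome.find(flank, pos + 1) == -1:
            sites[pos] += 1
    return sites


# Toy inputs, invented for illustration.
GENOME = "ATGGCTAGCAAAGGAGAAGAACTTTTCACTGGAGTTGTCCCAATTCTTGTTGAATTAGATGGTGATGTT"
END = "CAGACCGGGG"  # hypothetical transposon end sequence
READS = [
    END + GENOME[20:35],
    END + GENOME[20:35],
    "NN" + END + GENOME[36:51],
    "ACGTACGTACGTACGT",  # lacks the transposon end and is discarded
]

if __name__ == "__main__":
    sites = map_insertions(READS, END, GENOME)
    total = sum(sites.values())
    for pos, count in sorted(sites.items()):
        print(pos, count, count / total)
```

Dividing each position's read count by the total number of mapped reads, as in the usage lines, gives one simple per-site insertion rate; PCR duplicates and amplification bias would need to be accounted for before treating such rates as quantitative.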
[00103] Alternatively, a polA mutant E. coli strain, MM383, which produces a DNA polymerase I (Pol I) that is defective at 42°C, is used to detect integration as described previously. Resistance to a selectable marker after growth at 42°C indicates incorporation of donor DNA into the chromosome. The pUC19 plasmid without donor is used as a control following growth for 24 hours at 42°C without antibiotic selection.
[00104] E. coli strains that successfully grow in selection media are presumed to have integrated the donor DNA encoding the cargo resistance gene. Colonies growing in antibiotic selection plates are genotyped for cargo presence and NGS of whole genome sequence is performed.
Example 6 - Integrase activity in mammalian cells (prophetic)
[00105] To show targeting and cleavage activity in mammalian cells, each of the transposon proteins is purified with 2 NLS peptides on either terminus of the protein sequence. A plasmid containing a selectable neomycin resistance marker (NeoR) or a fluorescent marker flanked by the left end (LE) and right end (RE) motifs is synthesized. Cells are then transfected with the plasmid, recovered for 4-6 hours, and subsequently electroporated with transposon proteins. Antibiotic resistance integration into the genome is quantified by G418-resistant colony counts, and positive transposition by the fluorescent marker is assayed by fluorescence activated cell cytometry. 72 hours after cotransfection, genomic DNA is extracted and used for the preparation of an NGS-library. Integration frequency is assayed by Tn5 tagmentation.
Table 2 - Protein and nucleic acid sequences referred to herein
[00106] While preferred embodiments of the present invention have been shown and described herein, it will be obvious to those skilled in the art that such embodiments are provided by way of example only. It is not intended that the invention be limited by the specific examples provided within the specification. While the invention has been described with reference to the aforementioned specification, the descriptions and illustrations of the embodiments herein are not meant to be construed in a limiting sense. Numerous variations, changes, and substitutions will now occur to those skilled in the art without departing from the invention. Furthermore, it shall be understood that all aspects of the invention are not limited to the specific depictions, configurations or relative proportions set forth herein which depend upon a variety of conditions and variables. It should be understood that various alternatives to the embodiments of the invention described herein may be employed in practicing the invention. It is therefore contemplated that the invention shall also cover any such alternatives, modifications, variations or equivalents. It is intended that the following claims define the scope of the invention and that methods and structures within the scope of these claims and their equivalents be covered thereby.
Claims
1. An engineered transposase system, comprising:
(a) a double-stranded nucleic acid and comprising a cargo nucleotide sequence; and
(b) a transposase configured to interact with the double-stranded nucleic acid to transpose the cargo nucleotide sequence to a target nucleic acid site; and comprising a sequence having at least 75% sequence identity to any one of SEQ ID NOs: 1-38.
2. The engineered transposase system of claim 1, wherein the cargo nucleotide sequence is flanked by a left-hand transposase recognition sequence and a right-hand transposase recognition sequence recognized by the transposase.
3. The engineered transposase system of any one of claims 1-2, wherein the transposase is configured to transpose the cargo nucleotide sequence as double-stranded deoxyribonucleic acid polynucleotide.
4. The engineered transposase system of any one of claims 1-3, wherein the transposase comprises one or more nuclear localization sequences (NLSs) proximal to an N- or C-terminus of the transposase.
5. The engineered transposase system of claim 4, wherein the NLS comprises a sequence according to any one of SEQ ID NOs: 1480-1495.
6. A method for binding, nicking, cleaving, marking, modifying, or transposing a double-stranded deoxyribonucleic acid polynucleotide, comprising contacting the double-stranded deoxyribonucleic acid polynucleotide with a transposase configured to transpose the cargo nucleotide sequence to a target nucleic acid site; and comprising a sequence having at least 75% sequence identity to any one of SEQ ID NOs: 1-38.
7. A method of modifying a target nucleic acid site, comprising contacting the target nucleic acid site with the engineered transposase system of any one of claims 1-5.
8. The method of claim 7, wherein modifying the target nucleic acid site comprises binding, nicking, cleaving, marking, modifying, or transposing the target nucleic acid site.
9. The method of any one of claims 7-8, wherein the target nucleic acid site comprises deoxyribonucleic acid (DNA).
10. The method of claim 9, wherein the target nucleic acid site comprises genomic DNA, viral DNA, or bacterial DNA.
11. A method for transposing a cargo nucleotide sequence into a target nucleic acid site comprising introducing the engineered transposase system of any one of claims 1-5 to a cell.
12. A cell comprising the engineered transposase system of any one of claims 1-5.
13. The cell of claim 12, wherein the cell is a eukaryotic cell.
14. The cell of claim 12, wherein the cell is a mammalian cell.
15. The cell of claim 12, wherein the cell is an immortalized cell.
16. The cell of claim 12, wherein the cell is an insect cell.
17. The cell of claim 12, wherein the cell is a yeast cell.
18. The cell of claim 12, wherein the cell is a plant cell.
19. The cell of claim 12, wherein the cell is a fungal cell.
20. The cell of claim 12, wherein the cell is a prokaryotic cell.
21. The cell of claim 12, wherein the cell is an A549, HEK-293, HEK-293T, BHK, CHO, HeLa, MRC5, Sf9, Cos-1, Cos-7, Vero, BSC 1, BSC 40, BMT 10, WI38, HeLa, Saos, C2C12, L cell, HT1080, HepG2, Huh7, K562, primary cell, or a derivative thereof.
22. The cell of claim 12, wherein the cell is an engineered cell.
23. The cell of claim 12, wherein the cell is a stable cell.
24. The cell of claim 12, wherein the cell is a primary cell.
25. The cell of claim 12, wherein the cell is a T cell.
26. The cell of claim 12, wherein the cell is a hematopoietic stem cell (HSC).
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US202263404859P | 2022-09-08 | 2022-09-08 | |
US63/404,859 | 2022-09-08 | | |
Publications (1)
Publication Number | Publication Date |
---|---|
WO2024055013A1 (en) | 2024-03-14 |
Family
ID=90191980
Family Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
PCT/US2023/073796 WO2024055013A1 (en) | 2022-09-08 | 2023-09-08 | Systems and methods for transposing cargo nucleotide sequences |
Country Status (1)
Country | Link |
---|---|
WO (1) | WO2024055013A1 (en) |
Citations (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20110296543A1 (en) * | 2006-06-01 | 2011-12-01 | The University Of California | Nucleic acids and proteins and methods for making and using them |
US20210139919A1 (en) * | 2015-10-08 | 2021-05-13 | Dna Twopointo Inc. | Dna vectors, transposons and transposases for eukaryotic genome modification |
WO2022066335A1 (en) * | 2020-09-24 | 2022-03-31 | Metagenomi Ip Technologies, Llc | Systems and methods for transposing cargo nucleotide sequences |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
| 121 | EP: the EPO has been informed by WIPO that EP was designated in this application | Ref document number: 23864064; Country of ref document: EP; Kind code of ref document: A1 |