WO2021110119A1 - Highly active transposase and application thereof - Google Patents
Highly active transposase and application thereof
- Publication number
- WO2021110119A1 (application PCT/CN2020/133796)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- transposase
- nucleic acid
- amino acid
- sequence
- seq
Classifications
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12N—MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
- C12N9/00—Enzymes; Proenzymes; Compositions thereof; Processes for preparing, activating, inhibiting, separating or purifying enzymes
- C12N9/10—Transferases (2.)
- C12N9/12—Transferases (2.) transferring phosphorus containing groups, e.g. kinases (2.7)
- C12N9/1241—Nucleotidyltransferases (2.7.7)
-
- C—CHEMISTRY; METALLURGY
- C07—ORGANIC CHEMISTRY
- C07K—PEPTIDES
- C07K14/00—Peptides having more than 20 amino acids; Gastrins; Somatostatins; Melanotropins; Derivatives thereof
- C07K14/435—Peptides having more than 20 amino acids; Gastrins; Somatostatins; Melanotropins; Derivatives thereof from animals; from humans
-
- C—CHEMISTRY; METALLURGY
- C07—ORGANIC CHEMISTRY
- C07K—PEPTIDES
- C07K19/00—Hybrid peptides, i.e. peptides covalently bound to nucleic acids, or non-covalently bound protein-protein complexes
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12N—MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
- C12N15/00—Mutation or genetic engineering; DNA or RNA concerning genetic engineering, vectors, e.g. plasmids, or their isolation, preparation or purification; Use of hosts therefor
- C12N15/09—Recombinant DNA-technology
- C12N15/11—DNA or RNA fragments; Modified forms thereof; Non-coding nucleic acids having a biological activity
- C12N15/62—DNA sequences coding for fusion proteins
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12N—MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
- C12N15/00—Mutation or genetic engineering; DNA or RNA concerning genetic engineering, vectors, e.g. plasmids, or their isolation, preparation or purification; Use of hosts therefor
- C12N15/09—Recombinant DNA-technology
- C12N15/63—Introduction of foreign genetic material using vectors; Vectors; Use of hosts therefor; Regulation of expression
- C12N15/79—Vectors or expression systems specially adapted for eukaryotic hosts
- C12N15/80—Vectors or expression systems specially adapted for eukaryotic hosts for fungi
- C12N15/81—Vectors or expression systems specially adapted for eukaryotic hosts for fungi for yeasts
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12N—MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
- C12N15/00—Mutation or genetic engineering; DNA or RNA concerning genetic engineering, vectors, e.g. plasmids, or their isolation, preparation or purification; Use of hosts therefor
- C12N15/09—Recombinant DNA-technology
- C12N15/63—Introduction of foreign genetic material using vectors; Vectors; Use of hosts therefor; Regulation of expression
- C12N15/79—Vectors or expression systems specially adapted for eukaryotic hosts
- C12N15/85—Vectors or expression systems specially adapted for eukaryotic hosts for animal cells
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12N—MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
- C12N5/00—Undifferentiated human, animal or plant cells, e.g. cell lines; Tissues; Cultivation or maintenance thereof; Culture media therefor
- C12N5/10—Cells modified by introduction of foreign genetic material
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12N—MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
- C12N9/00—Enzymes; Proenzymes; Compositions thereof; Processes for preparing, activating, inhibiting, separating or purifying enzymes
- C12N9/10—Transferases (2.)
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12N—MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
- C12N9/00—Enzymes; Proenzymes; Compositions thereof; Processes for preparing, activating, inhibiting, separating or purifying enzymes
- C12N9/10—Transferases (2.)
- C12N9/12—Transferases (2.) transferring phosphorus containing groups, e.g. kinases (2.7)
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12N—MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
- C12N9/00—Enzymes; Proenzymes; Compositions thereof; Processes for preparing, activating, inhibiting, separating or purifying enzymes
- C12N9/14—Hydrolases (3)
- C12N9/16—Hydrolases (3) acting on ester bonds (3.1)
- C12N9/22—Ribonucleases RNAses, DNAses
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12N—MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
- C12N2800/00—Nucleic acids vectors
- C12N2800/10—Plasmid DNA
- C12N2800/106—Plasmid DNA for vertebrates
- C12N2800/107—Plasmid DNA for vertebrates for mammalian
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12N—MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
- C12N2800/00—Nucleic acids vectors
- C12N2800/22—Vectors comprising a coding region that has been codon optimised for expression in a respective host
Definitions
- the invention belongs to the field of molecular biology and biomedicine, and specifically relates to a high-activity transposase and its application.
- a DNA transposon is a mobile DNA sequence that can be transposed from one position in the genome to another through a series of processes such as cutting and reintegration.
- PiggyBac (PB) transposon is a DNA transposon isolated from the Trichoplusia ni TN368 cell line. It inserts specifically at "TTAA" target sites. With the help of the transposase, the PiggyBac transposon can be accurately excised from the host, together with the target gene it carries, without rearranging the host chromosome.
- PB transposon has no potential viral genotoxicity, can carry a long foreign gene fragment (up to 150kb), and has strong transformability.
- the transgene mediated by PB transposase has the characteristics of high integration efficiency, stable integration, long-term expression, single-copy integration, locatable insertion sites, and easy manipulation. It is often used in the production of transgenic mice, the genetic manipulation of mouse embryonic stem cells, gene mutagenesis and other genetic manipulation, pluripotent stem cell induction and other fields.
- the transposition activity of PB transposase is the highest among existing mammalian DNA transposons, and it has a very broad application prospect.
- there have been many studies at home and abroad that use the PB transposon system as a method of gene editing to carry out transgenesis and gene mutation in a variety of organisms, including insect cells, protists, plants and vertebrates.
- hDHFR human dihydrofolate reductase
- the DNA transposon system consists of two parts, the transposons with inverted repeats (IRs) at both ends that can carry the target DNA fragments, and the transposase that can catalyze the "cut and paste" of the transposon.
- the transposase first binds the IRs sequences on both sides of the transposon, and then removes the transposon from the host DNA site accurately and seamlessly, and finally integrates the DNA fragment into the new site.
- the establishment of an efficient transposition system can achieve targeted knockout of target genes or targeted introduction of target genes, providing an effective vector tool for gene editing in mammalian cells.
- the transposition efficiency of the transposon system determines the efficiency of gene editing, and a large part of the transposition efficiency depends on the expression level of the transposase. Therefore, increasing the transposase activity is a key technical point for increasing the transposition efficiency of transposons.
- the transposition activity of a transposase is affected by its binding sites, active sites, structure and other factors. At present, the crystal structure of the transposase has not been clearly resolved, but some domains are considered to be important structures, and experiments have shown that the activity of the transposase can be affected even by amino acids that have no special known role.
- "A hyperactive piggyBac transposase for mammals" discloses a high-activity PiggyBac transposase whose transposition efficiency is 10-fold that of mPBase (the wild-type PiggyBac transposase optimized with mammalian codons) and which carries amino acid mutations at the following positions (hereinafter referred to as the existing high-activity transposase hyPBase, shown in SEQ ID NO: 1): I30V, G165S, S103P, M282V, S509G, N570S and N538K.
- PiggyBac transposon mutants and their applications are reapplications based on the priority of U.S. Provisional Application No. 61/155206.
- the present invention provides a new high-activity transposase, which exhibits extremely high transposition activity in E. coli, insect cells, yeast cells, mammalian cells and other cells, higher than that of the existing high-activity transposase hyPBase. It is applicable to a broad spectrum of host cells and also has high transposition activity in mammalian cells, especially in human cells, providing new clues and a basis for the exploration of transposases, especially in human cells.
- the present invention also provides amino acid sequences and peptides that are the basis of the new highly active transposase of the present invention, as well as nucleotide sequences encoding the amino acid sequences, peptides and proteins of the highly active transposase of the present invention, nucleic acids, nucleic acid constructs, recombinant vectors and host cells based on these nucleotide sequences, and gene transfer systems and applications based on the above peptides, proteins, nucleic acids, nucleic acid constructs, recombinant vectors and host cell components.
- in the amino acid sequence of the existing highly active transposase hyPBase (shown in SEQ ID NO: 1), isoleucine at position 92 is mutated to asparagine, valine at position 119 is mutated to alanine, and glutamine at position 601 is mutated to arginine, to obtain the target mutant amino acid sequence, as shown in SEQ ID NO: 2.
- compared with the baseline transposition efficiency (30.9%), the transposition efficiency (51.7%) of the target high-activity enzyme bz-hyPBase generated based on the amino acid sequence of SEQ ID NO: 2 is higher by nearly 21 percentage points; in PBMC cells, compared with the transposition efficiency (9.81%) of the existing high-activity transposase hyPBase that has been codon-optimized and fused with a nuclear localization signal, the transposition efficiency (19.4%) of the target high-activity enzyme bz-hyPBase generated based on the amino acid sequence of SEQ ID NO: 2 is higher by nearly 10 percentage points. This shows that the target high-activity enzyme based on the mutant amino acid sequence of the present invention exhibits better transposition activity than the existing high-activity transposase hyPBase, especially in mammalian cells and human-derived cells.
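The two comparisons quoted above (30.9% versus 51.7%, and 9.81% versus 19.4% in PBMC cells) can be read either as a gain in percentage points or as a fold change. The following is a minimal sketch of that arithmetic only; the function name `efficiency_gain` is a hypothetical helper and is not part of the patent.

```python
def efficiency_gain(baseline_pct, improved_pct):
    """Return (percentage-point gain, fold change) for two efficiencies given in percent."""
    if baseline_pct <= 0:
        raise ValueError("baseline efficiency must be positive")
    points = improved_pct - baseline_pct
    fold = improved_pct / baseline_pct
    return points, fold


# Efficiencies quoted in the text (percent of cells showing transposition).
first_points, first_fold = efficiency_gain(30.9, 51.7)
pbmc_points, pbmc_fold = efficiency_gain(9.81, 19.4)

print(f"first comparison: +{first_points:.2f} points, {first_fold:.2f}-fold")
print(f"PBMC comparison:  +{pbmc_points:.2f} points, {pbmc_fold:.2f}-fold")
```

Read this way, "nearly 21%" and "nearly 10%" in the text are absolute differences in percentage points, not relative increases.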
- some embodiments of the present invention provide a new highly active transposase, which contains one or more amino acid sequences shown in SEQ ID NO: 2. The highly active transposase shows extremely high transposition activity in Escherichia coli, insect cells, yeast cells and mammalian cells, and in particular meets the high transposition activity requirements of mammalian and human-derived cells.
- Target mutant amino acid sequence containing nuclear localization sequence SEQ ID NO: 2
- amino acid sequences obtained by performing the above amino acid mutations at position 92, 119 or 601 alone, or at any two of these positions, of the amino acid sequence of the existing high-activity transposase hyPBase (shown in SEQ ID NO: 1), and enzymes formed on the basis of one or more such mutant amino acid sequences, have a transposition efficiency that is the same as or similar to that of the target high-activity transposase bz-hyPBase described in the Examples of the present invention, or the same as or similar to that of the existing hyPBase, and are also protected by the present invention.
- such a sequence is a mutant amino acid sequence of the new highly active transposase, and the enzyme formed based on the mutant amino acid sequence also belongs to the new highly active transposase to be protected by the present invention.
- mutant amino acid sequences obtained by performing the above amino acid mutations at position 92, 119 or 601 of the existing highly active transposase hyPBase (shown in SEQ ID NO: 1) alone, at any two of these positions or at all three, and amino acid sequences obtained from them by one or more amino acid deletion, substitution, insertion or addition operations that still maintain or improve the enzyme activity, also belong to alternatives to the technical scheme of the present invention with the same or similar technical effects.
- these sequences are also included among the mutant amino acid sequences of the new highly active transposase to be protected by the present invention, and enzymes formed on the basis of one or more of these mutant amino acid sequences also belong to the new highly active transposase to be protected by the present invention.
- the mutant amino acid sequence obtained by performing the above amino acid mutations at position 92, 119 or 601 of the existing highly active transposase hyPBase (shown in SEQ ID NO: 1) alone, at any two of these positions or at all three also contains the amino acid sequence of the functional protein.
- a functional protein is added to the new high-activity transposase to improve or extend its function, for example the amino acid sequence of a nuclear localization signal, the amino acid sequence of EGFP green fluorescent protein, the amino acid sequence of a tag protein or the amino acid sequence of an antibody.
- These functional proteins can improve the transposition activity of new highly active transposases.
- for example, nuclear localization signals can help improve the transposition activity of transposases; EGFP green fluorescent protein or a tag protein can enhance the transposition monitoring function of highly active transposases by facilitating the qualitative and/or quantitative monitoring of transposase activity; and antibodies can add new functions to new highly active transposases, such as additionally increasing immune activity.
- the present invention also protects peptides, that is, chain compounds formed by dehydration condensation of amino acids connected by peptide bonds, that have the mutant amino acid sequence obtained by performing the above amino acid mutations at position 92, 119 or 601 of the existing high-activity transposase hyPBase (shown in SEQ ID NO: 1) alone, at any two of these positions or at all three, or that have a derivative amino acid sequence obtained by performing one or more amino acid deletion, substitution, insertion or addition operations on the basis of the mutant amino acid sequence while still maintaining or improving the enzyme activity.
- the number of peptides containing the above-mentioned mutant amino acids or the above-mentioned derived amino acid sequences can be one or more.
- the peptide is also connected with a peptide segment of the functional protein, likewise formed by dehydration condensation of amino acids connected by peptide bonds, such as a nuclear localization signal peptide segment, an EGFP green fluorescent protein peptide segment, a tag protein peptide segment or an antibody peptide segment.
- proteins formed on the basis of peptide fragments that have the mutant amino acid sequence obtained by performing the above amino acid mutations at position 92, 119 or 601 of the existing high-activity transposase hyPBase (shown in SEQ ID NO: 1) alone, at any two of these positions or at all three, or that have the derived amino acid sequence, belong to the new highly active transposase protected by the present invention.
- the number of the above-mentioned mutant amino acid sequence, derivative amino acid sequence, and peptide segments formed on the basis of the above-mentioned mutant amino acid sequence and derivative amino acid sequence in the new highly active transposase is one or more.
- a mutant nucleotide sequence encoding the above-mentioned new highly active transposase, peptide fragment or amino acid sequence of the present invention; a nucleotide sequence complementary to, hybridizing with or overlapping the mutant nucleotide sequence; a nucleotide sequence derived from the mutant nucleotide sequence by base substitution, deletion or addition operations that encodes a new highly active transposase; or a nucleotide sequence that has at least 80% homology, preferably at least 90% homology, and more preferably at least 96% homology with the mutant nucleotide sequence, all belong to the present invention.
- the protected mutant nucleotide sequences encoding the new highly active transposase, peptides and amino acid sequences of the present invention can be present as one copy or as multiple repeated copies.
- the nucleotide sequence encoding the amino acid sequence of the existing high-activity enzyme hyPBase (SEQ ID NO: 1) is optimized with human codons to obtain a human codon-optimized nucleotide sequence (SEQ ID NO: 4). On the basis of this sequence, the following base mutations are made: base T at position 276 is mutated to base C, base T at position 356 is mutated to base C, base G at position 900 is mutated to base A, and base A at position 1802 is mutated to base G. This gives the mutant nucleotide sequence encoding the amino acid sequence of the new highly active transposase bz-hyPBase (shown in SEQ ID NO: 2), as shown in SEQ ID NO: 3.
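To relate the base positions listed above (276, 356, 900 and 1802) to amino acid positions, each base can be mapped to its codon. The sketch below assumes that base numbering starts at the first base of the start codon of the coding sequence and that the reading frame is uninterrupted; this is an assumption for illustration, not something stated in the text, and `codon_of_base` is a hypothetical helper.

```python
def codon_of_base(base_position):
    """Map a 1-based base position in a coding sequence to
    (1-based codon number, position within that codon from 1 to 3)."""
    if base_position < 1:
        raise ValueError("base positions are 1-based")
    codon_number = (base_position - 1) // 3 + 1
    position_in_codon = (base_position - 1) % 3 + 1
    return codon_number, position_in_codon


# Base mutations as listed in the text: position -> (original base, new base).
BASE_MUTATIONS = {276: ("T", "C"), 356: ("T", "C"), 900: ("G", "A"), 1802: ("A", "G")}

for position, (old, new) in BASE_MUTATIONS.items():
    codon, offset = codon_of_base(position)
    print(f"{old}{position}{new}: codon {codon}, base {offset} of the codon")
```

Under this assumption, positions 276, 356 and 1802 fall in codons 92, 119 and 601, which are the three amino acid positions named in the text, while position 900 falls in codon 300.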
- nucleotide sequence (SEQ ID NO: 4) of the existing high-activity enzyme hyPBase with nuclear localization sequence optimized by human-derived codons:
- a nucleotide sequence derived from the mutant nucleotide sequence (shown in SEQ ID NO: 3) by base substitution, deletion or addition operations that encodes the new highly active transposase bz-hyPBase;
- a nucleotide sequence complementary to the mutant nucleotide sequence shown in SEQ ID NO: 3, or to a base substitution, deletion or addition variant of it that encodes the new highly active transposase bz-hyPBase;
- a nucleotide sequence that has more than 80% homology with the mutant nucleotide sequence and encodes the new highly active transposase bz-hyPBase; preferably, a nucleotide sequence that has more than 90% homology with the mutant nucleotide sequence and encodes the new highly active transposase bz-hyPBase; more preferably, a nucleotide sequence that has more than 96% homology with the mutant nucleotide sequence (SEQ ID NO: 3) and encodes the new highly active transposase bz-hyPBase;
- the mutant nucleotide sequence encoding it also contains a nucleotide sequence encoding the functional protein, such as a nucleotide sequence encoding a nuclear localization signal, a nucleotide sequence expressing EGFP green fluorescent protein, a nucleotide sequence encoding the peptide of the tag protein or a nucleotide sequence encoding the antibody.
- the present invention also provides the above-mentioned nucleic acid polymerized from the mutant nucleotide sequence encoding the new highly active transposase of the present invention, or its peptide fragment, or its amino acid sequence.
- the nucleic acid also contains a nucleotide sequence encoding the functional protein (nuclear localization signal, EGFP green fluorescent protein, tag protein or antibody).
- the present invention also provides a nucleic acid construct to which one or more regulatory sequences are operably linked, and the regulatory sequences direct the target sequence to be expressed and coded in a host cell.
- expression includes any step involved in the production of a protein or polypeptide, including but not limited to transcription, post-transcriptional modification, translation, post-translational modification and secretion.
- the nucleic acid construct also contains the above-mentioned mutant nucleotide sequence encoding the new highly active transposase of the present invention, or its peptide fragment, or its amino acid sequence, or a nucleic acid polymerized from the mutant nucleotide sequence.
- the present invention also provides a recombinant vector containing the above-mentioned mutant nucleotide sequence encoding the new highly active transposase of the present invention, or its peptide fragment, or its amino acid sequence, or a nucleic acid polymerized from the mutant nucleotide sequence.
- the recombinant vector includes a recombinant cloning vector, a recombinant eukaryotic expression vector or a recombinant viral vector.
- the recombinant cloning vector includes a pRS vector, a T vector or a pUC vector, etc.
- the recombinant eukaryotic expression vector includes pEGFP, pCMVp-NEO-BAN or pSV2, etc.
- the recombinant virus vector includes a recombinant adenovirus vector or a lentivirus vector.
- the present invention also provides a host cell, which contains the above-mentioned mutant nucleotide sequence encoding the new highly active transposase of the present invention, or its peptide fragment, or its amino acid sequence, or a nucleic acid polymerized from the mutant nucleotide sequence.
- the host cells include E. coli cells, insect cells, yeast cells, mammalian cells, and the like.
- the present invention provides, for use in a transposition system to improve the transposition activity of transposons, a new high-activity transposase, or a peptide segment constituting the new high-activity transposase, or a nucleic acid construct encoding the new high-activity transposase, or a recombinant vector encoding the new high-activity transposase, or host cells containing the new high-activity transposase and/or a nucleic acid construct encoding the new high-activity transposase and/or a recombinant vector encoding the new high-activity transposase (E.
- the stable expression of the original host genes can be used to construct new gene transfer systems, and can also be used to prepare or use as drugs and/or preparations for genome research, gene therapy, cell therapy, or the induction and/or differentiation of pluripotent stem cells. It can be prepared or used as a tool for genome research, gene therapy, cell therapy, or pluripotent stem cell induction and/or differentiation.
- the gene transfer system also contains a transposon gene. The nucleic acid or nucleic acid construct encoding the new highly active transposase is integrated with the transposon gene; or the nucleic acid or nucleic acid construct encoding the new highly active transposase is independent of the transposon gene; or the nucleic acid or nucleic acid construct encoding the new highly active transposase is located on the same recombinant vector as the transposon gene; or the nucleic acid or nucleic acid construct encoding the new highly active transposase and the transposon gene are located on different recombinant vectors; or the transposon gene is integrated into the nucleic acid construct encoding the new highly active transposase; or the transposon gene is integrated into the recombinant vector encoding the new highly active transposase; or the transposon gene is independent of the recombinant vector encoding the new high-activity transposase.
- the nucleic acid construct or the recombinant vector encoding the new high-activity transposase or the nucleic acid construct containing the new high-activity transposase and/or the nucleic acid construct encoding the new high-activity transposase and/or the new high-activity transposase The host cell of the recombinant vector of the transposase, or the above-mentioned gene transfer system.
- the medicine used for genome research, gene therapy, cell therapy, or pluripotent stem cell induction and/or differentiation also contains pharmaceutically acceptable excipients, can be prepared into any pharmaceutically feasible dosage form, and can also be supplemented with auxiliary treatment components.
- a tool for genome research, gene therapy, cell therapy, or induction and/or differentiation of pluripotent stem cells contains the new highly active transposase of the present invention, or a nucleic acid construct encoding the new highly active transposase, or a recombinant vector encoding the new high-activity transposase, or a host cell containing the new high-activity transposase and/or a nucleic acid construct encoding the new high-activity transposase and/or a recombinant vector encoding the new high-activity transposase, or the above-mentioned gene transfer system.
- Figure 1 is a vector map of PRS316-URA-PBase in step (3) of Example 1.
- Figure 2 is a schematic diagram of the multi-round cumulative error-prone PCR mutagenesis of the transposase in step (3) of Example 1 (top), and a schematic diagram of the transformation of the transposase fragments recovered from the error-prone PCR and the linearized vector, at a 10:1 molar ratio, into the ura-deficient yeast strain (bottom).
- Figure 3 is a schematic diagram of the mutant library and screening of high-efficiency transposase in step (3) of Example 1.
- Figure 4 shows the plasmid PRS316-URA-PBase in Example 2 and the working principle of the plasmid (A); a visualization of the transposition of WT PBase, hyPBase, optimized hyPBase and bz-hyPBase in yeast (B); a statistical graph of the transposition of WT PBase, hyPBase, optimized hyPBase and bz-hyPBase in yeast (C); and a statistical histogram of the transposition of WT PBase, hyPBase, optimized hyPBase and bz-hyPBase in yeast (D).
- Figure 5 is a schematic diagram of the structure of ploxP-bz-HyPB plasmid in Example 3.
- Figure 6 is a schematic diagram of the pSAD-EGFP plasmid structure in Example 3.
- FIG. 7 is a comparison diagram of the efficiency of editing CHO cell genome using optimized hyPBase and bz-hyPBase transposase in Example 3. It can be seen that the transposition efficiency of bz-hyPBase in CHO cells is significantly increased.
- FIG. 8 is a comparison diagram of the efficiency of preparing CAR T cells using optimized hyPBase and bz-hyPBase transposase in Example 4. It can be seen that the transposition efficiency of bz-hyPBase in multiple PBMC donors is significantly increased.
- compared with the transposase shown in SEQ ID NO: 1, the highly active transposase provided by the present invention has amino acid mutations, including amino acid insertions, deletions or substitutions, at one, any two or all three positions selected from positions 92, 119 and 601; or, compared with the transposase shown in SEQ ID NO: 11, it has amino acid insertions, deletions or substitutions at one, any two or all three positions selected from positions 82, 109 and 591.
- a preferred mutation is a substitution mutation.
- the highly active transposase of the present invention has mutations in the above three positions, especially amino acid substitutions have occurred.
- the amino acid residues at the remaining positions of the transposase of the present invention are the same as the amino acid residues at the corresponding positions of SEQ ID NO: 1 or 11 except for the mutation at the position.
- the amino acid sequence of the highly active transposase of the present invention has one, any two or all three of the following substitution mutations compared with the sequence shown in SEQ ID NO: 1: isoleucine at position 92 is mutated to asparagine, valine at position 119 is mutated to alanine, and glutamine at position 601 is mutated to arginine; further preferably, the highly active transposase of the present invention has the substitution mutations at all three of the above positions.
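The three substitutions described above can be written in the common shorthand I92N, V119A and Q601R. The following is a minimal sketch of applying such substitutions to a protein string while checking that the expected original residue is present. The placeholder sequence is artificial filler and is NOT SEQ ID NO: 1; the function name `apply_substitutions` is likewise hypothetical.

```python
import re


def apply_substitutions(sequence, substitutions):
    """Apply point substitutions such as 'I92N' (1-based positions) and
    return the mutated sequence; fail if the original residue does not match."""
    residues = list(sequence)
    for substitution in substitutions:
        match = re.fullmatch(r"([A-Z])(\d+)([A-Z])", substitution)
        if match is None:
            raise ValueError(f"cannot parse substitution {substitution!r}")
        old, position, new = match.group(1), int(match.group(2)), match.group(3)
        if not 1 <= position <= len(residues):
            raise ValueError(f"position {position} is outside the sequence")
        if residues[position - 1] != old:
            raise ValueError(
                f"expected {old} at position {position}, found {residues[position - 1]}"
            )
        residues[position - 1] = new
    return "".join(residues)


# Artificial placeholder (glycine filler), used only so the example runs.
placeholder = ["G"] * 610
for position, residue in {92: "I", 119: "V", 601: "Q"}.items():
    placeholder[position - 1] = residue
placeholder = "".join(placeholder)

mutant = apply_substitutions(placeholder, ["I92N", "V119A", "Q601R"])
print(mutant[91], mutant[118], mutant[600])
```

Checking the original residue before replacing it guards against numbering mix-ups, for example between a sequence that includes the nuclear localization sequence and one that does not.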
- the amino acid sequence of the highly active transposase of the present invention has one, any two or all three of the following substitution mutations compared with the sequence shown in SEQ ID NO: 11: isoleucine at position 82 is mutated to asparagine, valine at position 109 is mutated to alanine, and glutamine at position 591 is mutated to arginine; further preferably, the highly active transposase of the present invention has the substitution mutations at all three of the above positions.
- the amino acid residues at the remaining positions of the transposase of the present invention are the same as the amino acid residues at the corresponding positions of SEQ ID NO: 1 or 11 except for the mutation at the position.
- the amino acid sequence of the highly active transposase of the present invention is shown in SEQ ID NO: 12. In a particularly preferred embodiment, the amino acid sequence of the highly active transposase of the present invention is shown in SEQ ID NO: 2.
- the amino acid sequence of the transposase shown in SEQ ID NO: 11 and 12 herein does not contain a nuclear localization sequence.
- the present invention also includes the following transposases: a transposase that, compared with SEQ ID NO: 1, in addition to the mutations described in any of the embodiments herein at one, any two or all three of positions 92, 119 and 601, has one or more insertion, deletion and/or substitution mutations at one or more other amino acid positions of SEQ ID NO: 1; or a transposase that, compared with SEQ ID NO: 11, in addition to the mutations described in any of the embodiments herein at one, any two or all three of positions 82, 109 and 591, has one or more insertion, deletion and/or substitution mutations at one or more other amino acid positions of SEQ ID NO: 11; in each case the transposase still has the transposase activity described herein.
- preferred mutations are substitution mutations, and more preferred are conservative substitutions.
- substitution of amino acid residues with the same or similar properties usually does not significantly change the transposase activity of the resulting mutant.
- amino acids whose side chain groups have the same polarity can be used for substitution.
- amino acids can be divided into non-polar amino acids (hydrophobic amino acids) and polar amino acids (hydrophilic amino acids); among them, non-polar amino acids include alanine, valine, leucine, isoleucine, proline, phenylalanine, tryptophan and methionine; polar amino acids include neutral amino acids, basic amino acids and acidic amino acids, among which neutral amino acids include serine, threonine, cysteine, tyrosine, asparagine and glutamine, basic amino acids include lysine, arginine and histidine, and acidic amino acids include aspartic acid and glutamic acid.
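The polarity classes listed above can be turned into a simple lookup to ask whether a substitution stays within one class. This is only a rough sketch: "same class" is used here as a crude proxy for a conservative substitution, glycine is left unclassified because the text does not list it, and the names `AMINO_ACID_CLASSES`, `classify` and `is_same_class` are hypothetical.

```python
# One-letter codes grouped by the classes named in the text.
AMINO_ACID_CLASSES = {
    "nonpolar": set("AVLIPFWM"),
    "polar_neutral": set("STCYNQ"),
    "basic": set("KRH"),
    "acidic": set("DE"),
}


def classify(residue):
    """Return the class name for a one-letter residue, or None if unlisted."""
    for name, members in AMINO_ACID_CLASSES.items():
        if residue in members:
            return name
    return None


def is_same_class(old, new):
    """True only when both residues are listed and fall in the same class."""
    old_class = classify(old)
    return old_class is not None and old_class == classify(new)


# The three substitutions named in the text, shown for illustration.
for old, position, new in [("I", 92, "N"), ("V", 119, "A"), ("Q", 601, "R")]:
    print(f"{old}{position}{new}: {classify(old)} -> {classify(new)}, "
          f"same class: {is_same_class(old, new)}")
```

By this coarse grouping, V119A stays within the non-polar class, while I92N and Q601R cross class boundaries.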
- this type of transposase has at least 80%, at least 85%, at least 90%, at least 95%, at least 96%, at least 97%, at least 98%, or at least 99% sequence identity with SEQ ID NO: 1, and has the substitution mutations described in any of the embodiments herein at at least one, any two or all three of positions 92, 119 and 601; or is defined analogously with respect to SEQ ID NO: 11.
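Sequence identity thresholds such as those above are normally evaluated on an alignment produced by a dedicated tool. As a simplified illustration only, the sketch below computes percent identity for two sequences that are assumed to be already aligned to equal length, with "-" marking gaps; the choice of denominator (all alignment columns) is one of several conventions and is an assumption here, as is the helper name `percent_identity`.

```python
def percent_identity(aligned_a, aligned_b):
    """Percent of alignment columns holding the same residue in both sequences.
    Inputs must be pre-aligned to the same length; '-' marks a gap."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    if not aligned_a:
        raise ValueError("sequences must not be empty")
    matches = sum(
        1 for a, b in zip(aligned_a, aligned_b) if a == b and a != "-"
    )
    return 100.0 * matches / len(aligned_a)


# Toy sequences, not taken from any SEQ ID NO.
reference = "ACDEFGHIKL"
variant = "ACDEFGHIKV"
identity = percent_identity(reference, variant)
print(f"{identity:.1f}% identity; meets an 80% threshold: {identity >= 80.0}")
```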
- the present invention provides a fusion protein, which contains the highly active transposase described in any embodiment of the present invention and a functional protein, or consists of the highly active transposase and the functional protein. It should be understood that the fusion protein should at least retain the transposition activity of the highly active transposase described herein.
- the functional protein is used to improve or increase the biological activity or biological function of the highly active transposase of the present invention.
- Exemplary functional proteins include, but are not limited to, functional proteins used to increase the transposition activity of transposases, used to monitor the transposition function of transposases, and/or used to add new functions to transposases.
- functional proteins include, but are not limited to: nuclear localization signal proteins/sequences, which can guide the transposase to accumulate in the nucleus, thereby helping to improve the transposition efficiency of the transposase; labeled proteins (such as fluorescent labeled proteins, for example green fluorescent protein (such as EGFP), red fluorescent protein, blue fluorescent protein, yellow fluorescent protein, etc.) or tag proteins (such as His6, Flag, GST, MBP, HA, Myc, His-Myc, etc.), which are used to enhance the transposition monitoring function of the transposase and facilitate the qualitative and/or quantitative monitoring of transposase activity; and antibodies of interest, which are used to add a new function to the transposase, such as increasing immunogenicity.
- An exemplary nuclear localization signal protein or sequence is the c-myc nuclear localization signal sequence, and its sequence may be as shown in the amino acid residues 3-11 of SEQ ID NO:1.
- the amino acid sequence of the transposase has one, any two or all three of the following substitution mutations as compared with the sequence shown in SEQ ID NO: 1: isoleucine at position 92 is mutated to asparagine, valine at position 119 is mutated to alanine, and glutamine at position 601 is mutated to arginine; further preferably, the transposase has the substitution mutations at all three of the above positions; and further preferably, the amino acid residue at each remaining position of the transposase is the same as the amino acid residue at the corresponding position of SEQ ID NO: 1.
- the amino acid sequence of the transposase has one, any two or all three of the following substitution mutations as compared with the sequence shown in SEQ ID NO: 11: isoleucine at position 82 is mutated to asparagine, valine at position 109 is mutated to alanine, and glutamine at position 591 is mutated to arginine; further preferably, the transposase has the substitution mutations at all three of the above positions; and further preferably, the amino acid residue at each remaining position of the transposase is the same as the amino acid residue at the corresponding position of SEQ ID NO:1.
- the amino acid sequence of the transposase in the fusion protein of the present invention is shown in SEQ ID NO: 2 or 12.
- the transposase and the functional protein can be connected via a linker sequence.
- the linker sequence may be a conventional linker, such as a linker sequence containing glycine and serine.
- the transposase can be located at the N-terminal or C-terminal of the fusion protein; or, when the fusion protein has two or more functional proteins, the transposase can also be located between two or more functional proteins.
- the present invention includes nucleic acid molecules whose polynucleotide sequence is the coding sequence of the transposase described herein or the complementary sequence of the coding sequence, or the coding sequence of the fusion protein described herein or the complementary sequence thereof.
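The relationship between a coding sequence and its complementary sequence can be illustrated with a minimal Python sketch. The function name `reverse_complement` and the example fragment are hypothetical and are not taken from any SEQ ID NO in this document; the sketch only handles the four unambiguous DNA bases.

```python
# Complement table for the four unambiguous DNA bases (upper and lower case).
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq: str) -> str:
    """Return the complementary strand of `seq`, written 5'->3'."""
    # Complement every base, then reverse so the result also reads 5'->3'.
    return seq.translate(_COMPLEMENT)[::-1]


if __name__ == "__main__":
    # Hypothetical fragment used only for illustration.
    print(reverse_complement("ATGGCCTTAA"))
    # The PiggyBac target site TTAA reads the same on both strands.
    print(reverse_complement("TTAA"))
```

Because TTAA is its own reverse complement, the target site is the same whichever strand is read.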
- the coding sequence of the transposase of the present invention has a base mutation at one, any two, or all three of positions 276, 356, and 1802, and optionally a base mutation at position 900.
- base T at position 276 is mutated to base C
- base T at position 356 is mutated to base C
- base G at position 900 is mutated to base A
- base A at position 1802 is mutated to base G.
- the polynucleotide sequence of the nucleic acid molecule of the present invention is shown in SEQ ID NO: 3.
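To relate the base positions 276, 356, 900 and 1802 listed above to codon numbers, a small Python sketch may help. It assumes that positions are 1-based and that position 1 is the first base of the open reading frame; the text does not state this explicitly, and the helper name `codon_of` is hypothetical.

```python
def codon_of(base_position: int) -> tuple[int, int]:
    """Map a 1-based base position in a coding sequence to
    (1-based codon number, 1-based position inside that codon)."""
    if base_position < 1:
        raise ValueError("base positions are 1-based")
    codon_index, offset = divmod(base_position - 1, 3)
    return codon_index + 1, offset + 1


# Base changes as listed in the text: position -> (original base, new base).
MUTATIONS = {276: ("T", "C"), 356: ("T", "C"), 900: ("G", "A"), 1802: ("A", "G")}

if __name__ == "__main__":
    for position, (old, new) in MUTATIONS.items():
        codon, within = codon_of(position)
        print(f"{old}{position}{new}: codon {codon}, base {within} of 3")
```

Under this assumption the four positions fall in codons 92, 119, 300 and 601, which lines up with the amino acid positions 92, 119 and 601 named for SEQ ID NO:1.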
- the polynucleotide sequence of the nucleic acid molecule of the present invention has a base mutation at one, any two, or all three of positions 246, 326, and 1772, and optionally a base mutation at position 870; preferably, the mutation at position 246 is base T to base C, the mutation at position 326 is base T to base C, the mutation at position 870 is base G to base A, and the mutation at position 1772 is base A to base G; more preferably, the polynucleotide sequence is as shown in SEQ ID NO: 14. Here, the polynucleotide sequences shown in SEQ ID NOs: 13 and 14 do not contain the coding sequence of the nuclear localization sequence.
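The two numbering schemes used in the text (sequences with and without the nuclear localization sequence) differ by a constant shift, which can be sketched in Python. The offsets of 10 residues and 30 bases are inferred here from the paired positions given in the text (92/82, 119/109, 601/591 and 276/246, 356/326, 900/870, 1802/1772); they are an assumption of this sketch, as is placing the removed segment at the N-terminus. The function names are hypothetical.

```python
# Offsets inferred from the paired positions in the text (assumption).
RESIDUE_OFFSET = 10               # e.g. 92 - 82
BASE_OFFSET = 3 * RESIDUE_OFFSET  # e.g. 276 - 246


def to_nls_free(position: int, offset: int) -> int:
    """Convert a 1-based position in the NLS-containing sequence to the
    numbering of the sequence without the nuclear localization sequence."""
    if position <= offset:
        raise ValueError("position lies inside the removed N-terminal segment")
    return position - offset


def to_nls_containing(position: int, offset: int) -> int:
    """Inverse conversion: NLS-free numbering to NLS-containing numbering."""
    if position < 1:
        raise ValueError("positions are 1-based")
    return position + offset


if __name__ == "__main__":
    for base in (276, 356, 900, 1802):
        print(base, "->", to_nls_free(base, BASE_OFFSET))
    for residue in (92, 119, 601):
        print(residue, "->", to_nls_free(residue, RESIDUE_OFFSET))
```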
- the polynucleotide sequence of the nucleic acid molecule of the present invention has at least 80% homology, preferably at least 90% homology, and more than the polynucleotide sequence described in SEQ ID NO: 4 or 13.
- the base mutation is present at three positions, and the base mutation is optionally present at the 900th or 870th base.
- nucleic acid construct containing the coding sequence of the transposase described in any embodiment herein or its complement, or the coding sequence of the fusion protein described in any embodiment herein or its complement.
- the nucleic acid construct is an expression cassette, and in addition to the coding sequence, the expression cassette also contains a transcription termination sequence such as a PolyA tailing signal sequence and a promoter.
- promoters are well known in the art, and those skilled in the art can select a suitable promoter capable of promoting the expression of the transposase described herein or its fusion protein in the host according to the host used for expression.
- the nucleic acid construct sequentially includes the following elements: a transposon 5' terminal repeat sequence (5'ITR), a multiple cloning site, a polyA tailing signal sequence, a transposon 3' terminal repeat sequence (3'ITR), the nucleic acid molecule described in any of the embodiments herein, and a promoter that controls the expression of the nucleic acid molecule.
- the direction and/or order referred to in “sequentially” in the “sequentially including the following elements” refers to from upstream to downstream. In the present invention, unless otherwise specified, the direction along the aforementioned “forward direction” is from upstream to downstream, and the direction along the aforementioned “reverse direction” is from downstream to upstream.
- the 5' terminal repeat sequence of the transposon is the 5' terminal repeat sequence of the PiggyBac transposon, and its nucleotide sequence is, for example, as shown in SEQ ID NO: 15; the 3' terminal repeat sequence of the transposon is the 3' terminal repeat sequence of the PiggyBac transposon, and its nucleotide sequence is, for example, as shown in SEQ ID NO: 16.
- the polyA tailing signal sequence has a polyA tailing signal function in both forward and reverse directions.
- each of the above 6 elements is independently a single copy or multiple copies.
- the above-mentioned 6 elements may be directly connected, or may contain other sequences such as linker or restriction site.
- the above-mentioned "the polyA tailing signal sequence has a polyA tailing signal function in both forward and reverse directions" includes but is not limited to the following situations:
- a polyA tailing signal sequence which has the function of polyA tailing signal in both forward and reverse directions;
- the solution in 1) above is adopted.
- the exogenous gene expression cassette and the PiggyBac transposase expression cassette can share one polyA tailing signal sequence, which removes one polyA tailing signal sequence from the construct, keeps the design compact, reduces the size of the plasmid, and helps increase the capacity for the foreign gene expression cassette while maintaining transfection efficiency.
- the PB expression cassette is placed in the same direction as the exogenous gene expression cassette, and two polyA tailing signal sequences are used, where the PB expression cassette is in front and a polyA tailing signal sequence is placed between one of the ITRs and the exogenous gene promoter. For example: the promoter that controls the expression of PB transposase, the PB transposase coding sequence, the transposon 5' terminal repeat sequence, polyA tailing signal sequence 1, the foreign gene promoter and foreign gene (multiple cloning site), polyA tailing signal sequence 2, and the transposon 3' terminal repeat sequence; the direction of the expression cassette of the PB transposase is the same as the direction of the expression cassette of the foreign gene.
- the position of the 5'end repeat of the transposon and the 3'end of the transposon can be interchanged.
- the nucleotide sequence of the multiple cloning site is shown in SEQ ID NO: 17;
- the nucleotide sequence of the polyA tailing signal sequence is shown in SEQ ID NO: 18; the sequence shown in SEQ ID NO: 18 has a polyA tailing signal function in both forward and reverse directions.
- Exemplary promoters include, but are not limited to, the CMV promoter, EF1α promoter, SV40 promoter, Ubiquitin B promoter, CAG promoter, HSP70 promoter, PGK-1 promoter, β-actin promoter, TK promoter and GRP78 promoter.
- One or more identical or different foreign genes of interest, and optionally a promoter that controls the expression of the foreign gene, can be operably inserted into the multiple cloning site of the nucleic acid construct of the present invention, or the multiple cloning site can be replaced with one or more identical or different exogenous gene coding sequences and optionally a promoter that controls the expression of the exogenous gene; each exogenous gene is independently a single copy or multiple copies.
- the direction of the expression cassette of the transposase is opposite to the direction of the expression cassette of the foreign gene.
- the exogenous gene is selected from one or more of fluorescent protein reporter genes (such as green fluorescent protein, red fluorescent protein, yellow fluorescent protein, etc.), luciferase genes (such as firefly luciferase, Renilla luciferase, etc.), natural functional protein genes (such as TP53, GM-CSF, OCT4, SOX2, Nanog, KLF4, c-Myc), RNAi genes and artificial chimeric genes (such as chimeric antigen receptor genes, Fc fusion protein genes, full-length antibody genes, Nanobody genes).
- expression cassette refers to the complete elements required to express a gene, including promoters, gene coding sequences, and PolyA tailing signal sequences.
- nucleic acid construct is defined herein as a single-stranded or double-stranded nucleic acid molecule, and preferably refers to an artificially constructed nucleic acid molecule.
- the nucleic acid construct further comprises one or more control sequences operably linked, and the control sequences can direct the coding sequence to be expressed in a suitable host cell under compatible conditions. Expression should be understood to include any steps involved in the production of a protein or polypeptide, including, but not limited to, transcription, post-transcriptional modification, translation, post-translational modification, and secretion.
- operably inserted/linked is defined herein as a conformation in which the regulatory sequence is located at an appropriate position relative to the coding sequence of the DNA sequence so that the regulatory sequence directs the expression of the protein or polypeptide.
- the foreign gene promoter and the foreign gene coding sequence are placed at the multiple cloning site by DNA recombination technology.
- the "operably linked” can be achieved by means of DNA recombination, specifically, the nucleic acid construct is a recombinant nucleic acid construct.
- coding sequence is defined herein as the part of a nucleic acid sequence that directly determines the amino acid sequence of its protein product.
- the boundary of the coding sequence is usually determined by the ribosome binding site immediately upstream of the 5' end of the open reading frame of the mRNA (for prokaryotic cells) and the transcription termination sequence immediately downstream of the 3' end of the open reading frame of the mRNA.
- Coding sequences can include, but are not limited to DNA, cDNA, and recombinant nucleic acid sequences.
- regulatory sequence herein is defined as including all components necessary or advantageous for expressing the peptide of the present invention.
- Each control sequence may be naturally contained or foreign to the nucleic acid sequence encoding the protein or polypeptide.
- regulatory sequences include, but are not limited to, leader sequences, polyadenylation sequences, propeptide sequences, promoters, signal sequences, and transcription terminator. At a minimum, regulatory sequences should include promoters and termination signals for transcription and translation.
- a regulatory sequence with a linker can be provided.
- the control sequence may be a suitable promoter sequence, that is, a nucleic acid sequence recognized by the host cell expressing the nucleic acid sequence.
- the promoter sequence contains transcriptional regulatory sequences that mediate the expression of the protein or polypeptide.
- the promoter can be any nucleic acid sequence that is transcriptionally active in the host cell of choice, including mutant, truncated and hybrid promoters, and can be derived from genes encoding extracellular or intracellular proteins or polypeptides that are homologous or heterologous to the host cell.
- the regulatory sequence can also be a suitable transcription termination sequence, that is, a sequence that can be recognized by the host cell to terminate transcription.
- the termination sequence can be operably linked to the 3'end of the nucleic acid sequence encoding the protein or polypeptide. Any terminator that can function in the host cell of choice can be used in the present invention.
- the control sequence can also be a suitable leader sequence, that is, an untranslated region of mRNA that is important for translation by the host cell.
- the leader sequence is operably linked to the 5'end of the nucleic acid sequence encoding the polypeptide. Any leader sequence that can function in the host cell of choice can be used in the present invention.
- the control sequence can also be a signal peptide coding region, which encodes an amino acid sequence linked to the amino terminus of a protein or polypeptide, which can guide the encoded polypeptide into the secretory pathway of cells.
- the 5'end of the coding region of the nucleic acid sequence may naturally contain a signal peptide coding region in which the translation reading frame is naturally linked to the fragment of the coding region of the secreted polypeptide.
- the 5'end of the coding region may contain a signal peptide coding region that is foreign to the coding sequence.
- the coding sequence normally does not contain a signal peptide coding region, it may be necessary to add a foreign signal peptide coding region.
- the natural signal peptide coding region can be simply replaced with a foreign signal peptide coding region to enhance polypeptide secretion.
- any signal peptide coding region that can guide the expressed polypeptide into the secretory pathway of the host cell used can be used in the present invention.
- the control sequence can also be a propeptide coding region, which encodes an amino acid sequence located at the amino terminus of the polypeptide.
- the resulting polypeptide is called a zymogen or a propolypeptide.
- the pro-polypeptide is usually inactive and can be converted into a mature active polypeptide by cleaving the pro-peptide from the pro-polypeptide through catalysis or autocatalysis.
- the pro-peptide region is adjacent to the amino terminus of the polypeptide, and the signal peptide region is adjacent to the amino terminus of the pro-peptide region.
- regulatory sequences that can regulate the expression of the polypeptide according to the growth of the host cell.
- regulatory systems are those that respond to chemical or physical stimuli (including in the presence of regulatory compounds) to turn on or turn off gene expression.
- regulatory sequences are those that enable gene amplification.
- the nucleic acid sequence encoding the protein or polypeptide should be operably linked to the regulatory sequence.
- the invention provides recombinant vectors.
- the recombinant vector may contain the nucleic acid molecule or nucleic acid construct described in any of the embodiments herein.
- the recombinant vector can be a recombinant cloning vector, a recombinant eukaryotic expression vector or a recombinant virus vector.
- the recombinant vector may contain other regulatory elements, including but not limited to leader sequence, polyadenylation sequence, propeptide sequence, enhancer, transcription terminator, resistance gene, etc.
- the corresponding recombinant vector can be selected and constructed according to different purposes, so that it contains the required regulatory elements.
- the recombinant cloning vector is preferably a pRS vector, a T vector or a pUC vector
- the recombinant eukaryotic expression vector is preferably pEGFP, pCMVp-NEO-BAN or pSV2
- the recombinant viral vector is preferably a recombinant adenovirus vector or a lentiviral vector.
- the recombinant cloning vector is a recombinant vector obtained by recombination of the nucleic acid construct according to any one of the embodiments of the present invention with a pUC18, pUC19, pMD18-T, pMD19-T, pGM-T, pUC57, pMAX or pDC315 series vector;
- the recombinant expression vector is a recombinant vector obtained by recombination of the nucleic acid construct according to any embodiment of the present invention with a pCDNA3 series vector, pCDNA4 series vector, pCDNA5 series vector, pCDNA6 series vector, pRL series vector, pUC57 vector, pMAX vector or pDC315 series vector;
- the recombinant virus vector is a recombinant adenovirus vector, a recombinant adeno-associated virus vector, a recombinant retrovirus vector, a recomb
- nucleic acid constructs and recombinant vectors can be constructed by methods well known in the art, and expressed by conventional methods, so as to prepare the transposase and fusion proteins described herein.
- the present invention also provides a host cell, which contains the nucleic acid molecule, nucleic acid construct and/or recombinant vector described in any of the embodiments herein, or expresses the transposase and/or fusion protein described in any of the embodiments herein.
- the host cell of the present invention is preferably an E. coli cell, an insect cell, a yeast cell or a mammalian cell.
- the host cell is a recombinant mammalian cell; for example, a recombinant primary culture T cell, Jurkat cell, K562 cell, tumor cell, HEK293 cell or CHO cell.
- the present invention also provides a gene transfer system, which contains the transposase, fusion protein, nucleic acid molecule, nucleic acid construct, recombinant vector or host cell described in any of the embodiments herein.
- the gene transfer system further contains a transposon gene.
- the nucleic acid molecule or nucleic acid construct described in any of the embodiments herein is integrated with a transposon gene; in some embodiments, the nucleic acid molecule or nucleic acid construct is relatively independent of the transposon gene; in some embodiments, the nucleic acid molecule or nucleic acid construct and the transposon gene are located on the same recombinant vector; in some embodiments, the nucleic acid molecule or nucleic acid construct and the transposon gene are located on different recombinant vectors; in some embodiments, the transposon gene is integrated into the nucleic acid construct; in some embodiments, the transposon gene is integrated into the recombinant vector described in any of the embodiments herein; in some embodiments, the transposon gene is transferred into the host cell described in any of the embodiments herein; in some embodiments, the transposon gene is located outside the host cell described in any of the embodiments herein.
- the present invention also provides the use of the transposase, fusion protein, nucleic acid molecule, nucleic acid construct, recombinant vector, host cell or gene transfer system described in any of the embodiments herein in any of the following:
- the present invention also provides a medicine and/or preparation for genome research, gene therapy, cell therapy, or induction and/or differentiation of pluripotent stem cells, containing the transposase, fusion protein, nucleic acid molecule, nucleic acid construct, recombinant vector, host cell or gene transfer system described in any of the embodiments herein.
- the present invention also provides a tool for genome research, gene therapy, cell therapy, or induction and/or differentiation of pluripotent stem cells, which contains the transposase, fusion protein, nucleic acid molecule, nucleic acid construct, recombinant vector, host cell or gene transfer system described in any of the embodiments herein.
- the present invention includes the following items 1 to 18:
- amino acid sequence of a highly active transposase containing one or more of the following amino acid sequences: (1) Amino acid mutations at the following positions of the amino acid sequence shown in SEQ ID NO:1 have transposase activity Amino acid sequence: at least one of amino acid 92, amino acid 119, or amino acid 601; preferably amino acid 92, amino acid 119, and amino acid 601 are simultaneously subjected to amino acid mutations; more preferably the isoleucine at 92 position Mutations to asparagine, valine at position 119 to alanine, and glutamine at position 601 to arginine; (2) In (1) the amino acid at position 92, amino acid 119 or amino acid 601 One or more amino acids other than amino acid mutations are deleted, substituted, inserted or added to obtain an amino acid sequence with transposase activity; preferably one or more of amino acid mutations other than amino acid 92, amino acid 119 and amino acid 601 are simultaneously undergone mutation The amino acid sequence with transposas:
- amino acid sequence according to item 1 wherein the amino acid sequence also contains the amino acid sequence of a functional protein; the amino acid sequence of the functional protein is preferably an amino acid sequence for nuclear localization signal, an amino acid sequence for expressing EGFP green fluorescent protein , Tag protein amino acid sequence or antibody amino acid sequence, etc.
- amino acid sequence of a highly active transposase containing one or more of the amino acid sequence shown in SEQ ID NO: 2 or the amino acid sequence shown in SEQ ID NO: 2 at amino acid 92, amino acid 119, and amino acid 601
- the amino acid sequence with transposase activity is obtained by deleting, replacing, inserting or adding one or more other amino acids.
- Base C base G at position 900 is mutated to base A, base A at position 1802 is mutated to base G; or (2) a nucleotide sequence complementary to the mutated nucleotide sequence in (1); Or (3) a nucleotide sequence that overlaps with the mutated nucleotide sequence in (1) and has the same coding function; or (4) hybridizes with the mutated nucleotide sequence in (1) and has the same coding function (5) Substitution, deletion or addition of one or more bases in the nucleotide sequence of (1), (2), (3) or (4) except for the gene mutation site Nucleotides with the same coding function; or (6) Nucleosides that have at least 80% homology with the nucleotide sequence in (1), (2), (3) or (4) and have the same coding function Acid sequence; preferably a nucleotide sequence with at least 90% homology and the same coding function; more preferably a nucleotide sequence with at least 96% homology and the same coding function.
- the nucleotide sequence described in item 6 or 7 also contains a nucleotide sequence encoding a functional protein, preferably a nucleotide sequence encoding a nuclear localization signal, a nucleotide sequence expressing EGFP green fluorescent protein, The nucleotide sequence encoding the peptide of the tag protein or the nucleotide sequence encoding the antibody.
- a nucleic acid construct encoding the amino acid sequence described in any one of items 1 to 3 or the peptide fragment described in item 4 or the protein described in item 5.
- a nucleic acid construct according to item 10 which contains the nucleotide sequence according to any one of items 6 to 8, or contains the nucleic acid according to item 9.
- a recombinant vector containing the nucleotide sequence of any one of items 6-8, or the nucleic acid of item 9, or the nucleic acid construct of any one of items 10-11. The recombinant vector is preferably a recombinant cloning vector, a recombinant eukaryotic expression vector or a recombinant viral vector; the recombinant cloning vector is preferably a pRS vector, a T vector or a pUC vector, the recombinant eukaryotic expression vector is preferably pEGFP, pCMVp-NEO-BAN or pSV2, and the recombinant viral vector is preferably a recombinant adenovirus vector or a lentivirus vector.
- a gene transfer system characterized in that it contains the peptide of item 4, or the protein of item 5, or the nucleic acid of item 9, or the nucleic acid construct of any one of items 10-11, or the recombinant vector of item 12, or the host cell of item 13.
- a gene transfer system characterized in that it further contains a transposon gene, wherein the nucleic acid of item 9 or the nucleic acid construct of any one of items 10-11 is integrated with the transposon gene; or the nucleic acid of item 9 or the nucleic acid construct of any one of items 10-11 and the transposon gene are relatively independent; or the nucleic acid of item 9 or the nucleic acid construct of any one of items 10-11 and the transposon gene are located on the same recombinant vector; or the nucleic acid of item 9 or the nucleic acid construct of any one of items 10-11 and the transposon gene are located on different recombinant vectors; or the transposon gene is integrated into the nucleic acid construct of any one of items 10-11; or the transposon gene is integrated into the recombinant vector of item 12; or the transposon gene is transferred into the host cell of item 13; or the transposon gene is located outside the host
- a drug and/or preparation for genome research, gene therapy, cell therapy, or induction and/or differentiation of pluripotent stem cells, containing the peptide of item 4, or the protein of item 5, or the nucleic acid of item 9, or the nucleic acid construct of any one of items 10-11, or the recombinant vector of item 12, or the host cell of item 13, or the gene transfer system of any one of items 14-15.
- a tool for genome research, gene therapy, cell therapy, or induction and/or differentiation of pluripotent stem cells, containing the peptide of item 4, or the protein of item 5, or the nucleic acid of item 9, or the nucleic acid construct of any one of items 10-11, or the recombinant vector of item 12, or the host cell of item 13, or the gene transfer system of any one of items 14-15.
- a human c-myc nuclear localization signal is added after the start codon to improve the integration efficiency of foreign genes in the host cell;
- the resistance gene G418 is inserted between the 5'IR and 3'IR of the original transposon by means of gene synthesis to form the transposon G418-IR.
- the transposon was inserted into the TTAA site in the URA3 gene by recombination after PCR, and the transposase with an inducible promoter was inserted into the multiple cloning site of PRS316 to finally constitute the screening reporter vector PRS316-URA-PBase.
- the specific operations are as follows:
- PCR was performed on the template PRS316 using primers pURA-F (SEQ ID NO: 5: aagccgctaaaggcattatccgcc) and pURA-R (SEQ ID NO: 6: aactgtgccctccatggaaaaatcagtc) to obtain linearized fragment 1 of plasmid PRS316.
- transposase ORF open reading frame
- the transposase ORF has a homologous sequence of about 50 bp at both ends.
- the transposase is mutated using Clontech's error-prone PCR kit, and mutations can be accumulated by using the recovered PCR fragments as the template for further rounds of mutation (as shown in the upper flow chart in Figure 2).
- the screening report vector PRS316-URA-PBase uses XbaI and EcoRI for linearization, and removes the original unmutated transposase.
- the transposase fragments recovered from PCR and the linearized vector are transformed into a ura-deficient yeast strain at a molar ratio of 10:1 (shown in the lower flow chart in Figure 2 and in Figure 3); the yeast uses its own homologous recombination repair mechanism to insert the exogenous target fragment, via the homology arms, into the gapped plasmid DNA, thereby automatically assembling a complete plasmid carrying the target fragment in the yeast cell.
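The 10:1 molar ratio of fragment to linearized vector mentioned above can be converted into masses with a short Python sketch. It relies on the standard approximation that, for double-stranded DNA, the number of moles is proportional to mass divided by length. The vector mass and both lengths in the example are hypothetical values chosen only for illustration, and the function name `insert_mass_ng` is not from the source.

```python
def insert_mass_ng(vector_ng: float, vector_bp: int, insert_bp: int,
                   molar_ratio: float = 10.0) -> float:
    """Mass of insert (ng) giving `molar_ratio` moles of insert per mole of
    vector, assuming moles of double-stranded DNA scale as mass / length."""
    if vector_bp <= 0 or insert_bp <= 0:
        raise ValueError("lengths must be positive")
    return molar_ratio * vector_ng * insert_bp / vector_bp


if __name__ == "__main__":
    # Hypothetical numbers: 100 ng of a 7000 bp linearized vector and an
    # 1800 bp transposase fragment at a 10:1 insert-to-vector molar ratio.
    print(round(insert_mass_ng(100.0, 7000, 1800), 1), "ng of insert")
```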
- the screening process is divided into two screenings.
- the first screening all mutants were screened on a large scale, and mutants with significantly higher transposition efficiency than those in the unmutated control group were obtained.
- the second screening was carried out on the yeast obtained in the first screening, and the exact transposition efficiency was calculated.
- bz-hyPBase SEQ ID NO: 2 amino acid sequence, SEQ ID NO: 3 nucleotide sequence.
- the transformed mutant library is picked into a 96-well plate and activated in YPD medium containing G418 antibiotic. After 24 hours of activation, it is transferred with a replicator and inoculated into YPD medium containing 2% galactose for induction. After 24 hours of induction, the culture is diluted to 10^-2 or 10^-3 (determined according to the growth of the yeast), 10 µl is spotted onto ura-deficient solid medium, and the growth of the mutants is observed after 48 hours of cultivation. Compared with the clones without mutations, the clones with significantly higher transposition efficiency are screened out and carried into the second screening.
- Second screening: activate the suspected mutants obtained in the first screening for 24 hours, adjust the OD600 values after activation to be consistent, and inoculate them at a ratio of 1:100 into YPD medium containing 2% galactose for induction for 24 hours. After induction, adjust the OD600 values to be consistent again and dilute to 10^-2, 10^-3 and 10^-4. Take 20 µl of the 10^-2 and 10^-3 dilutions, spread on ura-deficient solid medium and culture for 24 hours. Count the number of clones; the clones grown on the ura-deficient solid medium are the clones that have undergone transposition. At the same time, take 20 µl of the 10^-3 and 10^-4 dilutions and spread on YPD complete solid medium as a parallel control.
- the grown clones are the total number of yeast.
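The source does not give an explicit formula for the transposition efficiency. One plausible reading, sketched below in Python, is the ratio of dilution-corrected colony counts on ura-deficient medium (clones that have undergone transposition) to dilution-corrected counts on YPD complete medium (total yeast). The formula, the function names and the colony counts in the example are assumptions made for illustration only.

```python
def colonies_per_ml(colonies: int, dilution: float, plated_ul: float) -> float:
    """Back-calculate colony-forming units per ml of undiluted culture.

    `dilution` is the dilution as a fraction (e.g. 1e-3), and `plated_ul`
    is the plated volume in microlitres."""
    if dilution <= 0 or plated_ul <= 0:
        raise ValueError("dilution and plated volume must be positive")
    return colonies / (dilution * plated_ul / 1000.0)


def transposition_efficiency(ura_colonies: int, ura_dilution: float,
                             ypd_colonies: int, ypd_dilution: float,
                             plated_ul: float = 20.0) -> float:
    """Assumed definition: transposed clones divided by total viable clones."""
    transposed = colonies_per_ml(ura_colonies, ura_dilution, plated_ul)
    total = colonies_per_ml(ypd_colonies, ypd_dilution, plated_ul)
    return transposed / total


if __name__ == "__main__":
    # Hypothetical counts: 50 colonies from the 10^-2 dilution on ura-deficient
    # medium and 200 colonies from the 10^-4 dilution on YPD, 20 µl plated each.
    efficiency = transposition_efficiency(50, 1e-2, 200, 1e-4)
    print(f"transposition efficiency = {efficiency:.4%}")
```

Because the same volume is plated on both media, the volume cancels in the ratio; it is kept as a parameter so that the intermediate counts per ml remain meaningful.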
- amino acid sequence of bz-hyPBase with nuclear localization sequence SEQ ID NO: 2
- in the amino acid sequence of the existing highly active transposase hyPBase (SEQ ID NO:1), the isoleucine at position 92 was mutated to asparagine, the valine at position 119 was mutated to alanine, and the glutamine at position 601 was mutated to arginine, to obtain the amino acid sequence of bz-hyPBase as shown in SEQ ID NO: 2.
- nucleotide sequence of human codon-optimized hyPBase transposase containing nuclear localization sequence SEQ ID NO: 4.
- Nucleotide sequence of bz-hyPBase transposase containing nuclear localization sequence SEQ ID NO: 3:
- the nucleotide sequence of the existing highly active enzyme hyPBase was optimized with human codons to obtain a human codon-optimized nucleotide sequence (SEQ ID NO: 4).
- Based on the human codon-optimized nucleotide sequence (SEQ ID NO: 4), base mutations were made at the following positions: base T at position 276 was mutated to base C, base T at position 356 was mutated to base C, base G at position 900 was mutated to base A, and base A at position 1802 was mutated to base G, to obtain a mutated nucleotide sequence that encodes the new highly active transposase bz-hyPBase of the present invention, as shown in SEQ ID NO: 3.
- Example 2 bz-hyPBase has higher transposition efficiency in yeast
- the transposase is expressed under the regulation of the inducer galactose and promotes transposition of the transposon; once the transposon has transposed, the URA gene is expressed normally, and the clones that have undergone transposition resume normal growth in the ura-deficient medium.
- the transposable efficiency of transposase in Saccharomyces cerevisiae can be calculated.
- WT PBase is a plasmid carrying a mammalian codon-optimized piggyBac transposase
- hyPBase is a plasmid carrying the existing highly active piggyBac transposase (obtained by mutating 7 amino acid sites of WT PBase, as described in the background art)
- optimized hyPBase is a plasmid carrying the transposase obtained from the existing highly active piggyBac transposase by human codon optimization and addition of the nuclear localization signal; bz-hyPBase is a plasmid carrying the new highly active transposase screened in the present invention (i.e., optimized hyPBase carrying the three amino acid site mutations described in the Examples of the present invention).
- Example 3 bz-hyPBase has higher gene editing efficiency in CHO cells
- the transposon carrying the EGFP gene was cloned into the vector pSAD-EGFP ( Figure 6) to express green fluorescent protein.
- the two plasmids expressing transposase and transposon are jointly electrotransformed into CHO cells.
- the transposon with EGFP will be inserted into the genome under the action of transposase to make it stably express green fluorescent protein.
- the cells expressing green fluorescent protein were counted by flow cytometry technology. The more cells that can express the fluorescent protein, the higher the efficiency of transposase transposition. From the statistical results in Figure 7, the transposition activity of bz-hyPBase is significantly better than hyPBase.
- Example 4 bz-hyPBase has higher gene editing efficiency in T cells
Abstract
Provided are a highly active transposase and an application thereof. The amino acid sequence of the transposase is as shown in SEQ ID NO: 2 or 12, and the transposase, when used in a transposon system, can significantly improve the gene transfer activity of the transposon. The transposase and its coding nucleotide sequence can be used to construct a gene transfer system, or to prepare or serve as a drug, preparation, or tool for genome research, gene therapy, cell therapy, or pluripotent stem cell induction and/or differentiation.
Description
The present invention belongs to the fields of molecular biology and biomedicine, and specifically relates to a highly active transposase and applications thereof.
A DNA transposon is a mobile DNA sequence that can move from one position in the genome to another through a series of steps such as excision and reintegration. The PiggyBac (PB) transposon is a DNA transposon isolated from the Trichoplusia ni TN368 cell line. It inserts specifically at "TTAA" target sites and, with the aid of its transposase, can excise the gene of interest from the host precisely without causing rearrangement of the host chromosome. The PB transposon carries no potential viral genotoxicity, can carry long exogenous gene fragments (up to 150 kb), and is highly amenable to engineering. Transgenesis mediated by the PB transposase is characterized by high integration efficiency, stable integration, long-term expression, single-copy integration, mappable insertion sites, and ease of manipulation. It is commonly used in the production of transgenic mice, the genetic manipulation of mouse embryonic stem cells, gene mutagenesis and other genetic manipulations, the induction of pluripotent stem cells, and other fields.
The transposition activity of the PB transposase is the highest among existing mammalian DNA transposons, and it has broad application prospects. Many studies in China and abroad have used the PB transposon system as a gene editing method for transgenesis and gene mutation in a variety of organisms, including insect cells, protists, plants, and vertebrates. In 2003, Tomita fused human type III collagen to enhanced green fluorescent protein (EGFP) and used the PB transposon to integrate the fusion into the silkworm silk protein gene, obtaining transgenic silkworms that stably express human collagen. In 2005, Balu inserted human dihydrofolate reductase (hDHFR) into the Plasmodium genome using the PB transposon system. In 2014, Eric T obtained stable transgenic lines in which the PB transposon can transpose in vivo. In 2005, Sheng Ding used the PB transposon to introduce exogenous gene fragments efficiently into cultured human cells and mouse cell lines, achieved stable expression, and bred transgenic fluorescent mice with stable traits, demonstrating that the PB transposon system could serve as an effective tool for studying gene function in other vertebrates.
A DNA transposon system consists of two parts: a transposon, which is flanked by inverted repeats (IRs) at both ends and can carry the DNA fragment of interest, and a transposase, which catalyzes "cut and paste" of the transposon. The transposase first binds the IR sequences flanking the transposon, then excises the transposon from the host DNA site precisely and without leaving a footprint, and finally integrates the DNA fragment at a new site. Establishing an efficient transposition system enables targeted knockout of a target gene or targeted introduction of a gene of interest, providing an effective vector tool for gene editing in mammalian cells. The transposition efficiency of the system determines the efficiency of gene editing, and depends to a large extent on the expression level of the transposase. Increasing transposase activity is therefore a key technical point for increasing transposon transposition efficiency.
The transposition activity of a transposase is affected by factors such as its binding sites, active site, and structure. The crystal structure of the transposase has not yet been clearly resolved, but some domains are considered structurally important, and experiments have shown that transposase activity can be affected by any non-special amino acid.
"A hyperactive piggyBac transposase for mammalian applications" (PNAS, January 25, 2011, vol. 108, no. 4, 1531–1536) discloses a highly active PiggyBac transposase (the existing highly active transposase hyPBase referred to below, shown in SEQ ID NO: 1) whose transposition efficiency is 10 times that of mPBase (the mammalian codon-optimized wild-type PiggyBac transposase) and which carries amino acid mutations at the following positions: I30V, G165S, S103P, M282V, S509G, N570S and N538K.
"PiggyBac transposon variants and methods of use" (US9670503B2) and "PiggyBac transposon variants and methods of use" (CN102421902A) are applications claiming priority from U.S. Provisional Application No. 61/155206. Both disclose: starting from an integration-deficient PiggyBac mutant, further mutating and selecting mutants whose integrase activity is higher than that of the integration-deficient PiggyBac mutant; and, starting from normal wild-type PiggyBac, mutating and selecting mutants whose integration activity is higher than that of normal wild-type PiggyBac.
Although existing PiggyBac transposase mutants have higher enzymatic activity than the wild-type PiggyBac transposase, they still cannot meet higher and more stringent activity requirements. Research on PiggyBac transposases with high enzymatic activity therefore remains necessary.
Summary of the invention
The present invention provides a new highly active transposase that exhibits extremely high transposition activity in E. coli, insect cells, yeast cells, mammalian cells, and other cells. Compared with the existing highly active transposase hyPBase, it is applicable to a broad spectrum of host cells and also has high transposition activity in mammalian cells, especially in human cells, providing new clues and a basis for the exploration of transposases, particularly transposases in human cells.
The present invention also provides the amino acid sequences and peptides on which the new highly active transposase is based; nucleotide sequences encoding the amino acid sequences, peptides, and proteins of the highly active transposase; nucleic acids, nucleic acid constructs, recombinant vectors, and host cells based on those nucleotide sequences; and gene transfer systems and applications built on the above peptides, proteins, nucleic acids, nucleic acid constructs, recombinant vectors, and host cells.
In some embodiments of the present invention, the amino acid sequence of the existing highly active transposase hyPBase (SEQ ID NO: 1) is mutated at three positions simultaneously: isoleucine at position 92 to asparagine, valine at position 119 to alanine, and glutamine at position 601 to arginine. This gives the target mutant amino acid sequence shown in SEQ ID NO: 2. In CHO cells, the transposition efficiency of the existing highly active transposase hyPBase after codon optimization and addition of a nuclear localization signal is 30.9%, whereas that of the target highly active enzyme bz-hyPBase, generated on the basis of the amino acid sequence of SEQ ID NO: 2, is 51.7%, an increase of nearly 21 percentage points. In PBMCs, the corresponding transposition efficiencies are 9.81% and 19.4%, an increase of nearly 10 percentage points. This shows that the target highly active enzyme based on the mutant amino acid sequence of the present invention has better transposition activity than the existing highly active transposase hyPBase, especially in mammalian cells and human cells.
Therefore, some embodiments of the present invention provide a new highly active transposase that contains one or more amino acid sequences shown in SEQ ID NO: 2. This highly active transposase exhibits extremely high transposition activity in E. coli, insect cells, yeast cells, and mammalian cells, and in particular meets the high transposition activity requirements of mammalian and human cells.
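The efficiency figures quoted above can be compared in two ways: as an absolute gain in percentage points or as a fold change. The following minimal sketch only re-does that arithmetic on the four values stated in the text; the dictionary layout and the function name `compare` are illustrative choices, not part of the patent.

```python
# Transposition efficiencies (percent of cells) as quoted in the text.
REPORTED = {
    "CHO": {"optimized hyPBase": 30.9, "bz-hyPBase": 51.7},
    "PBMC": {"optimized hyPBase": 9.81, "bz-hyPBase": 19.4},
}


def compare(baseline: float, improved: float) -> tuple[float, float]:
    """Return (absolute gain in percentage points, fold change)."""
    return improved - baseline, improved / baseline


if __name__ == "__main__":
    for cell_type, eff in REPORTED.items():
        gain, fold = compare(eff["optimized hyPBase"], eff["bz-hyPBase"])
        print(f"{cell_type}: +{gain:.2f} percentage points, {fold:.2f}-fold")
```

Read this way, the "nearly 21" and "nearly 10" figures in the text are absolute differences between the two efficiencies, not relative increases.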
Amino acid sequence of the hyPBase transposase containing a nuclear localization sequence (SEQ ID NO: 1):
Target mutant amino acid sequence containing a nuclear localization sequence (SEQ ID NO: 2):
A mutant amino acid sequence obtained by introducing the above amino acid mutations at position 92, 119, or 601 alone, or at any two of these positions, of the amino acid sequence of the existing highly active transposase hyPBase (SEQ ID NO: 1) likewise belongs to the mutant amino acid sequences of the new highly active transposase claimed by the present invention. An enzyme formed on the basis of one or more such mutant amino acid sequences has a transposition efficiency the same as or similar to that of the target highly active transposase bz-hyPBase described in the Examples of the present invention, or to that of the existing hyPBase, and likewise belongs to the new highly active transposases claimed by the present invention.
As described above, an amino acid sequence that still maintains or improves enzyme activity, obtained by first introducing the above amino acid mutations at position 92, 119, or 601 alone, at any two of these positions, or at all three positions of the amino acid sequence of the existing highly active transposase hyPBase (SEQ ID NO: 1), and then deleting, substituting, inserting, or adding one or more amino acids, is an alternative to the technical solution of the present invention with the same or similar technical effect. It falls within the scope of protection of the present invention and likewise belongs to the mutant amino acid sequences of the new highly active transposase claimed by the present invention. An enzyme formed on the basis of one or more such mutant amino acid sequences likewise belongs to the new highly active transposases claimed by the present invention.
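The single, double, and triple substitution variants described above can be enumerated mechanically. The sketch below lists all seven combinations of the three substitutions (position 92 I to N, position 119 V to A, position 601 Q to R, in SEQ ID NO: 1 numbering) and applies them to a sequence. The helper names and the placeholder sequence are hypothetical; the placeholder is not SEQ ID NO: 1, which is not reproduced in this section.

```python
from itertools import combinations

# The three substitutions named in the text: 1-based position -> (original, replacement).
SUBSTITUTIONS = {92: ("I", "N"), 119: ("V", "A"), 601: ("Q", "R")}


def variant_sets():
    """All non-empty subsets of the three positions: singles, pairs, and the triple."""
    positions = sorted(SUBSTITUTIONS)
    return [combo
            for k in range(1, len(positions) + 1)
            for combo in combinations(positions, k)]


def label(positions):
    """Conventional label such as 'I92N/V119A' for a set of positions."""
    return "/".join(f"{SUBSTITUTIONS[p][0]}{p}{SUBSTITUTIONS[p][1]}" for p in positions)


def apply_substitutions(seq, positions):
    """Apply the chosen substitutions, checking the expected original residue first."""
    residues = list(seq)
    for pos in positions:
        old, new = SUBSTITUTIONS[pos]
        if residues[pos - 1] != old:
            raise ValueError(f"expected {old} at position {pos}, found {residues[pos - 1]}")
        residues[pos - 1] = new
    return "".join(residues)


def toy_sequence(length=610):
    """Placeholder sequence (an assumption, NOT SEQ ID NO: 1): glycine filler
    with the expected original residues at the three positions."""
    residues = ["G"] * length
    for pos, (old, _) in SUBSTITUTIONS.items():
        residues[pos - 1] = old
    return "".join(residues)


if __name__ == "__main__":
    toy = toy_sequence()
    for combo in variant_sets():
        mutated = apply_substitutions(toy, combo)
        print(label(combo), [mutated[p - 1] for p in sorted(SUBSTITUTIONS)])
```

Checking the original residue before substituting is a deliberate choice: it catches numbering mismatches, for example when a sequence without the nuclear localization sequence is passed in by mistake.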
As described above, the mutant amino acid sequence obtained by introducing the above amino acid mutations at position 92, 119, or 601 alone, at any two of these positions, or at all three positions of the amino acid sequence of the existing highly active transposase hyPBase (SEQ ID NO: 1) may further contain the amino acid sequence of a functional protein. Adding a functional protein to the new highly active transposase improves or extends its function; examples include the amino acid sequence of a nuclear localization signal, of EGFP green fluorescent protein, of a tag protein, or of an antibody. Such functional proteins may increase the transposition activity of the new highly active transposase (for example, a nuclear localization signal can help increase transposition activity); may enhance monitoring of transposition (for example, EGFP green fluorescent protein or a tag protein facilitates qualitative and/or quantitative monitoring of transposition activity); or may add new functions to the new highly active transposase (for example, an antibody can additionally confer immunological activity).
The present invention also claims peptides, that is, chain compounds linked by peptide bonds formed through dehydration condensation of amino acids, of the following: the mutant amino acid sequence obtained by introducing the above amino acid mutations at position 92, 119, or 601 alone, at any two of these positions, or at all three positions of the amino acid sequence of the existing highly active transposase hyPBase (SEQ ID NO: 1); and derivative amino acid sequences that still maintain or improve enzyme activity, obtained by further deleting, substituting, inserting, or adding one or more amino acids in that mutant amino acid sequence. A peptide may contain one or more of the above mutant or derivative amino acid sequences. The peptide may also be linked to the peptide of a functional protein, formed in the same way from the amino acid sequence of that protein, such as a nuclear localization signal peptide, an EGFP green fluorescent protein peptide, a tag protein peptide, or an antibody peptide.
Proteins formed on the basis of any of the following belong to the new highly active transposases claimed by the present invention: the mutant amino acid sequence obtained by introducing the above amino acid mutations at position 92, 119, or 601 alone, at any two of these positions, or at all three positions of the amino acid sequence of the existing highly active transposase hyPBase (SEQ ID NO: 1); peptides formed on the basis of that mutant amino acid sequence; derivative amino acid sequences that still maintain or improve enzyme activity, obtained by further deleting, substituting, inserting, or adding one or more amino acids in that mutant amino acid sequence; and peptides formed on the basis of those derivative amino acid sequences. The new highly active transposase contains one or more of the above mutant amino acid sequences, derivative amino acid sequences, or peptides formed on their basis.
The following all belong to the mutant nucleotide sequences claimed by the present invention as encoding the new highly active transposase, its peptides, and its amino acid sequences: a mutant nucleotide sequence encoding the above new highly active transposase, peptides, or amino acid sequences; a nucleotide sequence complementary to, hybridizing with, or overlapping that mutant nucleotide sequence; a nucleotide sequence derived from that mutant nucleotide sequence by base substitution, deletion, or addition that still encodes the new highly active transposase; and a nucleotide sequence with at least 80% homology to that mutant nucleotide sequence, preferably at least 90% homology, and most preferably at least 96% homology. Such a sequence may be present as a single copy or as multiple repeated copies. Specifically:
The nucleotide sequence encoding the amino acid sequence of the existing highly active enzyme hyPBase (SEQ ID NO: 1) is human codon-optimized to give a human codon-optimized nucleotide sequence (SEQ ID NO: 4). Starting from this sequence, the following base mutations are introduced: T to C at position 276, T to C at position 356, G to A at position 900, and A to G at position 1802. This gives the mutant nucleotide sequence shown in SEQ ID NO: 3, which encodes the amino acid sequence of the new highly active transposase bz-hyPBase of the present invention (SEQ ID NO: 2).
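To relate the four base positions listed above to amino acid numbering, each 1-based nucleotide position can be mapped to a codon number and a position within that codon. The sketch below does only this arithmetic. It assumes that the coding sequence starts in frame at nucleotide 1 of SEQ ID NO: 4, which this section does not state explicitly, and it makes no claim about which amino acid each codon encodes, since the sequences themselves are not reproduced here.

```python
def codon_location(nt_pos: int) -> tuple[int, int]:
    """Map a 1-based nucleotide position to (1-based codon number, position 1-3 in codon).

    Assumes the reading frame starts at nucleotide 1 (an assumption, see lead-in).
    """
    if nt_pos < 1:
        raise ValueError("nucleotide positions are 1-based")
    return (nt_pos - 1) // 3 + 1, (nt_pos - 1) % 3 + 1


# Base changes listed in the text: position -> (original base, new base).
BASE_CHANGES = {276: ("T", "C"), 356: ("T", "C"), 900: ("G", "A"), 1802: ("A", "G")}

if __name__ == "__main__":
    for pos, (old, new) in BASE_CHANGES.items():
        codon, offset = codon_location(pos)
        print(f"{old}{pos}{new}: codon {codon}, base {offset} of 3")
```

Under that reading-frame assumption, positions 276, 356, and 1802 fall in codons 92, 119, and 601, the same numbers as the three amino acid positions named in the text, while position 900 falls in codon 300.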
Human codon-optimized nucleotide sequence of the existing highly active enzyme hyPBase containing a nuclear localization sequence (SEQ ID NO: 4):
Mutant nucleotide sequence containing a nuclear localization sequence (SEQ ID NO: 3):
Alternatively, a nucleotide sequence derived from the mutant nucleotide sequence (SEQ ID NO: 3) by base substitution, deletion, or addition that encodes the new highly active transposase bz-hyPBase;
or, according to the principle of complementary base pairing, a nucleotide sequence complementary to the mutant nucleotide sequence (SEQ ID NO: 3), or a sequence derived from it by base substitution, deletion, or addition, that has the nucleotide sequence of the new highly active transposase bz-hyPBase;
or a nucleotide sequence that overlaps the mutant nucleotide sequence (SEQ ID NO: 3) and encodes the new highly active transposase bz-hyPBase;
or a nucleotide sequence that hybridizes with the mutant nucleotide sequence (SEQ ID NO: 3) and encodes the new highly active transposase bz-hyPBase;
or a nucleotide sequence that has 80% or more homology with the mutant nucleotide sequence (SEQ ID NO: 3) and encodes the new highly active transposase bz-hyPBase; preferably one that has 90% or more homology with the mutant nucleotide sequence (SEQ ID NO: 3) and encodes the new highly active transposase bz-hyPBase; more preferably one that has 96% or more homology with the mutant nucleotide sequence (SEQ ID NO: 3) and encodes the new highly active transposase bz-hyPBase;
all belong to the mutant nucleotide sequences claimed by the present invention as encoding the new highly active transposase bz-hyPBase, its peptides, or its amino acid sequence.
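The 80%, 90%, and 96% homology tiers above can be illustrated with a simple identity calculation. The text does not say how homology is to be measured, so the sketch below makes a loud assumption: it treats homology as the percentage of identical positions between two sequences that are already aligned and of equal length. Real comparisons would normally use an alignment tool; the function names here are hypothetical.

```python
def percent_identity(a: str, b: str) -> float:
    """Percent of identical positions between two pre-aligned, equal-length sequences.

    Assumption: ungapped position-by-position identity stands in for 'homology'.
    """
    if not a or len(a) != len(b):
        raise ValueError("sequences must be non-empty and of equal length")
    matches = sum(x == y for x, y in zip(a.upper(), b.upper()))
    return 100.0 * matches / len(a)


def homology_tier(identity: float) -> str:
    """Classify an identity value against the three thresholds named in the text."""
    if identity >= 96:
        return "96% or more (more preferred)"
    if identity >= 90:
        return "90% or more (preferred)"
    if identity >= 80:
        return "80% or more"
    return "below 80%"


if __name__ == "__main__":
    reference = "ATGCATGCAT"   # toy sequences, not taken from the sequence listing
    candidate = "ATGCATGCAA"
    identity = percent_identity(reference, candidate)
    print(f"{identity:.1f}% identical: {homology_tier(identity)}")
```

Note that sequence identity alone does not establish the second condition in each tier, namely that the sequence still encodes the new highly active transposase.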
If a functional protein is also linked to the new highly active transposase of the present invention, the mutant nucleotide sequence encoding it also contains a nucleotide sequence encoding the functional protein, such as a nucleotide sequence encoding a nuclear localization signal, a nucleotide sequence expressing EGFP green fluorescent protein, a nucleotide sequence encoding a tag protein peptide, or a nucleotide sequence encoding an antibody.
The present invention also provides a nucleic acid polymerized from the above mutant nucleotide sequence encoding the new highly active transposase of the present invention, its peptides, or its amino acid sequence. When a functional protein is linked to the new highly active transposase of the present invention, the nucleic acid also contains a nucleotide sequence encoding the functional protein (a nuclear localization signal, EGFP green fluorescent protein, a tag protein, or an antibody).
The present invention also provides a nucleic acid construct to which one or more regulatory sequences are operably linked. The regulatory sequences direct expression of the target sequence in a host cell, where expression includes any step involved in the production of a protein or polypeptide, including but not limited to transcription, post-transcriptional modification, translation, post-translational modification, and secretion. The nucleic acid construct also contains the above mutant nucleotide sequence encoding the new highly active transposase of the present invention, its peptides, or its amino acid sequence, or a nucleic acid polymerized from that mutant nucleotide sequence.
The present invention also provides a recombinant vector containing the above mutant nucleotide sequence encoding the new highly active transposase of the present invention, its peptides, or its amino acid sequence, or a nucleic acid polymerized from that mutant nucleotide sequence, or the above nucleic acid construct. The recombinant vector includes a recombinant cloning vector, a recombinant eukaryotic expression vector, or a recombinant viral vector. Recombinant cloning vectors include pRS vectors, T vectors, pUC vectors, and the like; recombinant eukaryotic expression vectors include pEGFP, pCMVp-NEO-BAN, pSV2, and the like; and recombinant viral vectors include recombinant adenovirus vectors, lentivirus vectors, and the like.
The present invention also provides a host cell containing the above mutant nucleotide sequence encoding the new highly active transposase of the present invention, its peptides, or its amino acid sequence, or a nucleic acid polymerized from that mutant nucleotide sequence, or the above nucleic acid construct, or the above recombinant vector. The host cell includes E. coli cells, insect cells, yeast cells, mammalian cells, and the like.
The present invention provides, for use in a transposition system to increase transposon transposition activity: the new highly active transposase; a peptide constituting the new highly active transposase; a nucleic acid construct encoding the new highly active transposase; a recombinant vector encoding the new highly active transposase; or a host cell (E. coli, insect, yeast, or mammalian cells, etc.) containing the new highly active transposase and/or such a nucleic acid construct and/or such a recombinant vector. These integrate exogenous genes into the host cell genome in a site-specific, stable, and efficient manner and achieve long-term, stable expression without affecting the stable expression of the host's original genes. They can be used to construct new gene transfer systems; to prepare, or serve as, drugs and/or preparations for genome research, gene therapy, cell therapy, or pluripotent stem cell induction and/or differentiation; and to prepare, or serve as, tools for genome research, gene therapy, cell therapy, or pluripotent stem cell induction and/or differentiation.
A gene transfer system contains the new highly active transposase of the present invention, or a nucleic acid construct encoding the new highly active transposase, or a recombinant vector encoding the new highly active transposase, or a host cell containing the new highly active transposase and/or a nucleic acid construct encoding it and/or a recombinant vector encoding it.
The gene transfer system also contains a transposon gene, arranged in one of the following ways: the nucleic acid or nucleic acid construct encoding the new highly active transposase is integrated with the transposon gene; or the nucleic acid or nucleic acid construct encoding the new highly active transposase is independent of the transposon gene; or the nucleic acid or nucleic acid construct encoding the new highly active transposase and the transposon gene are located on the same recombinant vector; or the nucleic acid or nucleic acid construct encoding the new highly active transposase and the transposon gene are located on different recombinant vectors; or the transposon gene is integrated into the nucleic acid construct encoding the new highly active transposase; or the transposon gene is integrated into the recombinant vector encoding the new highly active transposase; or the transposon gene is independent of the recombinant vector encoding the new highly active transposase; or the transposon gene is introduced into a host cell containing the new highly active transposase and/or a nucleic acid construct encoding it and/or a recombinant vector encoding it; or the transposon gene is located outside a host cell containing the new highly active transposase and/or a nucleic acid construct encoding it and/or a recombinant vector encoding it.
A drug and/or preparation for genome research, gene therapy, cell therapy, or pluripotent stem cell induction and/or differentiation contains the new highly active transposase of the present invention, or a nucleic acid construct encoding the new highly active transposase, or a recombinant vector encoding the new highly active transposase, or a host cell containing the new highly active transposase and/or a nucleic acid construct encoding it and/or a recombinant vector encoding it, or the above gene transfer system.
The drug for genome research, gene therapy, cell therapy, or pluripotent stem cell induction and/or differentiation also contains pharmaceutically acceptable excipients, can be prepared in any pharmaceutically feasible dosage form, and can also be combined with adjuvant therapeutic components.
A tool for genome research, gene therapy, cell therapy, or pluripotent stem cell induction and/or differentiation contains the new highly active transposase of the present invention, or a nucleic acid construct encoding the new highly active transposase, or a recombinant vector encoding the new highly active transposase, or a host cell containing the new highly active transposase and/or a nucleic acid construct encoding it and/or a recombinant vector encoding it, or the above gene transfer system.
Figure 1 is the vector map of PRS316-URA-PBase in step (3) of Example 1.
Figure 2 shows, for step (3) of Example 1, a schematic of the multiple rounds of cumulative error-prone PCR mutagenesis of the transposase (upper panel) and a schematic of the transformation of the transposase fragments recovered from error-prone PCR and the linearized vector, at a molar ratio of 10:1, into a ura-deficient yeast strain (lower panel).
Figure 3 is a schematic of the mutant library and the screening for highly efficient transposases in step (3) of Example 1.
图4为实施例2中质粒PRS316-URA-PBase图谱及质粒工作原理图(A)、WT PBase、hyPBase、optimized hyPBase、bz-hyPBase在酵母中转座情况直观图(B)、WT PBase、hyPBase、optimized hyPBase、bz-hyPBase在酵母中转座情况统计图(C)WT PBase、hyPBase、optimized hyPBase、bz-hyPBase在酵母中转座情况统计柱形直方图(D)。Figure 4 is a diagram of the plasmid PRS316-URA-PBase in Example 2 and the working principle diagram of the plasmid (A), WT PBase, hyPBase, optimized hyPBase, and bz-hyPBase in the yeast transposition visual diagram (B), WT PBase, hyPBase, Statistical graph of the transposition of optimized hyPBase, bz-hyPBase in yeast (C) Statistic histogram of the transposition of WT PBase, hyPBase, optimized hyPBase, and bz-hyPBase in yeast (D).
图5为实施例3中ploxP-bz-HyPB质粒结构示意图。5 is a schematic diagram of the structure of ploxP-bz-HyPB plasmid in Example 3.
图6为实施例3中pSAD-EGFP质粒结构示意图。Figure 6 is a schematic diagram of the pSAD-EGFP plasmid structure in Example 3.
图7为实施例3中使用optimized hyPBase和bz-hyPBase转座酶编辑CHO细胞基因组效率对比图。可知,bz-hyPBase在CHO细胞中的转座效率显著增高。FIG. 7 is a comparison diagram of the efficiency of editing CHO cell genome using optimized hyPBase and bz-hyPBase transposase in Example 3. It can be seen that the transposition efficiency of bz-hyPBase in CHO cells is significantly increased.
图8为实施例4中使用optimized hyPBase和bz-hyPBase转座酶制备CAR T细胞效率对比图。可知,bz-hyPBase在PBMC多个供体中的转座效率显著增高。A:7天的结果;B:14天的结果。FIG. 8 is a comparison diagram of the efficiency of preparing CAR T cells using optimized hyPBase and bz-hyPBase transposase in Example 4. It can be seen that the transposition efficiency of bz-hyPBase in multiple PBMC donors is significantly increased. A: Results of 7 days; B: Results of 14 days.
Relative to the transposase shown in SEQ ID NO: 1, the highly active transposase provided by the present invention has an amino acid mutation (an insertion, deletion, or substitution) at one, any two, or all three positions selected from positions 92, 119, and 601; or, relative to the transposase shown in SEQ ID NO: 11, it has an amino acid insertion, deletion, or substitution at one, any two, or all three positions selected from positions 82, 109, and 591. The preferred mutation is a substitution. Preferably, the highly active transposase of the present invention is mutated at all three of these positions, in particular by an amino acid substitution at each. Preferably, apart from the mutations at these positions, the amino acid residues at the remaining positions of the transposase of the present invention are identical to those at the corresponding positions of SEQ ID NO: 1 or 11.
In a preferred embodiment, the amino acid sequence of the highly active transposase of the present invention has, compared with the sequence shown in SEQ ID NO: 1, one, any two, or all three of the following substitutions: isoleucine at position 92 to asparagine, valine at position 119 to alanine, and glutamine at position 601 to arginine; more preferably, the substitutions are present at all three positions. In a preferred embodiment, the amino acid sequence of the highly active transposase of the present invention has, compared with the sequence shown in SEQ ID NO: 11, one, any two, or all three of the following substitutions: isoleucine at position 82 to asparagine, valine at position 109 to alanine, and glutamine at position 591 to arginine; more preferably, the substitutions are present at all three positions. Preferably, apart from the mutations at these positions, the amino acid residues at the remaining positions of the transposase of the present invention are identical to those at the corresponding positions of SEQ ID NO: 1 or 11.
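The relationship between the two numbering schemes and the three named substitutions can be sketched as follows. This is only an illustration: the sequences of SEQ ID NO: 1 and 11 are not reproduced in this passage, so the sketch builds a placeholder sequence, and the offset of 10 residues is an assumption inferred from the paired positions given above (92/82, 119/109, 601/591). All function names are hypothetical.

```python
# Substitutions named in the text, in SEQ ID NO: 1 numbering (1-based):
# position -> (original residue, replacement residue)
SUBSTITUTIONS = {92: ("I", "N"), 119: ("V", "A"), 601: ("Q", "R")}

# Assumption: inferred from the paired positions 92/82, 119/109, 601/591.
NUMBERING_OFFSET = 10


def to_seq11_numbering(pos_in_seq1):
    """Convert a SEQ ID NO: 1 position to the SEQ ID NO: 11 numbering."""
    return pos_in_seq1 - NUMBERING_OFFSET


def apply_substitutions(sequence, substitutions):
    """Apply 1-based substitutions, checking the original residue first."""
    residues = list(sequence)
    for pos, (original, replacement) in substitutions.items():
        if residues[pos - 1] != original:
            raise ValueError(f"expected {original} at position {pos}")
        residues[pos - 1] = replacement
    return "".join(residues)


def make_placeholder(length, substitutions, filler="G"):
    """Build a placeholder sequence (NOT the real transposase sequence)."""
    residues = [filler] * length
    for pos, (original, _) in substitutions.items():
        residues[pos - 1] = original
    return "".join(residues)


placeholder = make_placeholder(601, SUBSTITUTIONS)
mutant = apply_substitutions(placeholder, SUBSTITUTIONS)
```

With a real sequence in place of the placeholder, the check on the original residue guards against applying a substitution under the wrong numbering scheme.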
In some embodiments, the amino acid sequence of the highly active transposase of the present invention is as shown in SEQ ID NO: 12. In a particularly preferred embodiment, it is as shown in SEQ ID NO: 2. The transposase amino acid sequences shown in SEQ ID NOs: 11 and 12 herein do not contain a nuclear localization sequence.
The present invention also includes the following transposases: a transposase which, compared with SEQ ID NO: 1, has the mutation described in any embodiment herein at one, any two, or all three of positions 92, 119, and 601 and additionally has one or more insertion, deletion, and/or substitution mutations at one or more other amino acid positions of SEQ ID NO: 1; or a transposase which, compared with SEQ ID NO: 11, has the mutation described in any embodiment herein at one, any two, or all three of positions 82, 109, and 591 and additionally has one or more insertion, deletion, and/or substitution mutations at one or more other amino acid positions of SEQ ID NO: 11; in each case the transposase still has the transposase activity described herein. The preferred mutations are substitutions, more preferably conservative substitutions. For example, substitution with amino acid residues of the same or similar properties generally does not significantly change the transposase activity of the resulting mutant. For example, the substitution can use an amino acid whose side chain has the same polarity. Based on side-chain polarity, amino acids can be divided into non-polar (hydrophobic) and polar (hydrophilic) amino acids. Non-polar amino acids include alanine, valine, leucine, isoleucine, proline, phenylalanine, tryptophan, and methionine. Polar amino acids include neutral, basic, and acidic amino acids: neutral amino acids include serine, threonine, cysteine, tyrosine, asparagine, and glutamine; basic amino acids include lysine, arginine, and histidine; acidic amino acids include aspartic acid and glutamic acid. In some embodiments, such a transposase has at least 80%, at least 85%, at least 90%, at least 95%, at least 96%, at least 97%, at least 98%, or at least 99% sequence identity to SEQ ID NO: 1 and has the substitution mutation described in any embodiment herein at at least one, any two, or all three of positions 92, 119, and 601; or it has at least 80%, at least 85%, at least 90%, at least 95%, at least 96%, at least 97%, at least 98%, or at least 99% sequence identity to SEQ ID NO: 11 and has the substitution mutation described in any embodiment herein at at least one, any two, or all three of positions 82, 109, and 591. Sequence identity between two amino acid sequences can be calculated with tools well known in the art, such as BLASTP.
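The side-chain polarity grouping described above can be written down as a small lookup, sketched here under two assumptions: one-letter amino acid codes are used, and a substitution is treated as "same polarity" when both residues fall in the same listed subgroup. Glycine is not assigned to any group in the text, so the sketch leaves it unclassified. The function names are hypothetical.

```python
# Groups exactly as listed in the text (one-letter codes).
POLARITY_GROUPS = {
    "nonpolar": set("AVLIPFWM"),       # Ala Val Leu Ile Pro Phe Trp Met
    "polar_neutral": set("STCYNQ"),    # Ser Thr Cys Tyr Asn Gln
    "polar_basic": set("KRH"),         # Lys Arg His
    "polar_acidic": set("DE"),         # Asp Glu
}


def side_chain_group(residue):
    """Return the group name for a residue, or None if the text lists none."""
    for name, members in POLARITY_GROUPS.items():
        if residue in members:
            return name
    return None


def same_polarity_group(original, replacement):
    """True if both residues are classified and share a subgroup."""
    group = side_chain_group(original)
    return group is not None and group == side_chain_group(replacement)


# The three preferred substitutions named in the description.
for original, replacement in [("I", "N"), ("V", "A"), ("Q", "R")]:
    print(original, replacement, same_polarity_group(original, replacement))
```

A coarser test that only separates polar from non-polar residues would classify some pairs differently; which reading of "same polarity" applies is not settled by the text.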
In some embodiments, the present invention provides a fusion protein that contains, or is formed by or consists of, the highly active transposase of any embodiment of the present invention and a functional protein. It should be understood that the fusion protein should at least retain the transposition activity of the highly active transposase described herein. The functional protein serves to improve or extend the biological activity or biological function of the highly active transposase of the present invention. Exemplary functional proteins include, but are not limited to, those used to increase the transposition activity of the transposase, to monitor its transposition function, and/or to add new functions to it. For example, functional proteins include, but are not limited to: nuclear localization signal proteins or sequences, which direct the transposase to accumulate in the nucleus and thereby help increase its transposition efficiency; marker proteins (for example fluorescent proteins such as green fluorescent protein (e.g., EGFP), red fluorescent protein, blue fluorescent protein, or yellow fluorescent protein) or tag proteins (such as His6, Flag, GST, MBP, HA, Myc, or His-Myc), which facilitate qualitative and/or quantitative monitoring of transposition activity; and antibodies of interest, which add new functions to the transposase, such as increased immunogenicity. An exemplary nuclear localization signal protein or sequence is the c-myc nuclear localization signal sequence, whose sequence may be as shown in amino acid residues 3-11 of SEQ ID NO: 1.
In a preferred embodiment, in the fusion protein of the present invention, the amino acid sequence of the transposase has, compared with the sequence shown in SEQ ID NO: 1, one, any two, or all three of the following substitutions: isoleucine at position 92 to asparagine, valine at position 119 to alanine, and glutamine at position 601 to arginine; more preferably, the substitutions are present at all three positions; and still more preferably, the amino acid residues at the remaining positions of the transposase are identical to those at the corresponding positions of SEQ ID NO: 1. Alternatively, in the fusion protein of the present invention, the amino acid sequence of the transposase has, compared with the sequence shown in SEQ ID NO: 11, one, any two, or all three of the following substitutions: isoleucine at position 82 to asparagine, valine at position 109 to alanine, and glutamine at position 591 to arginine; more preferably, the substitutions are present at all three positions; and still more preferably, the amino acid residues at the remaining positions of the transposase are identical to those at the corresponding positions of SEQ ID NO: 1. In a particularly preferred embodiment, the amino acid sequence of the transposase in the fusion protein of the present invention is as shown in SEQ ID NO: 2 or 12.
In the fusion protein of the present invention, the transposase and the functional protein can be joined through a linker sequence where needed. The linker can be a conventional linker, such as one containing glycine and serine. The transposase can be located at the N-terminus or at the C-terminus of the fusion protein; alternatively, when the fusion protein contains two or more functional proteins, the transposase can also be located between them.
The present invention includes a nucleic acid molecule whose polynucleotide sequence is the coding sequence of the transposase described herein or the complement of that coding sequence, or the coding sequence of the fusion protein described herein or its complement. In some embodiments, the coding sequence of the transposase of the present invention has, compared with SEQ ID NO: 4, a base mutation at one, any two, or all three of positions 276, 356, and 1802, and optionally also a base mutation at position 900. Preferably, where mutated, the T at position 276 is changed to C, the T at position 356 is changed to C, the G at position 900 is changed to A, and the A at position 1802 is changed to G. The polynucleotide sequence of the nucleic acid molecule of the present invention is as shown in SEQ ID NO: 3. In some embodiments, the polynucleotide sequence of the nucleic acid molecule of the present invention has, compared with SEQ ID NO: 13, a base mutation at one, any two, or all three of positions 246, 326, and 1772, and optionally also a base mutation at position 870; preferably, the mutation at position 246 is T to C, the mutation at position 326 is T to C, the mutation at position 870 is G to A, and the mutation at position 1772 is A to G; more preferably, the polynucleotide sequence is as shown in SEQ ID NO: 14. The polynucleotide sequences shown in SEQ ID NOs: 13 and 14 herein do not contain the coding sequence of a nuclear localization sequence.
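To relate the nucleotide positions above to codon numbers, the following sketch maps a 1-based nucleotide position to its codon and to its place within that codon. It assumes that nucleotide 1 of each coding sequence is the first base of codon 1; the function name is hypothetical. Whether a given base change alters the encoded residue depends on the actual codons in SEQ ID NO: 4 and 13, which are not reproduced in this passage.

```python
def codon_of(nt_pos):
    """Map a 1-based nucleotide position to (codon number, position in codon).

    Assumes the reading frame starts at nucleotide 1.
    """
    if nt_pos < 1:
        raise ValueError("nucleotide positions are 1-based")
    return (nt_pos - 1) // 3 + 1, (nt_pos - 1) % 3 + 1


# Positions named in the text for SEQ ID NO: 4 and SEQ ID NO: 13.
SEQ4_POSITIONS = [276, 356, 900, 1802]
SEQ13_POSITIONS = [246, 326, 870, 1772]

for p4, p13 in zip(SEQ4_POSITIONS, SEQ13_POSITIONS):
    print(p4, codon_of(p4), p13, codon_of(p13), "offset:", p4 - p13)
```

Under this assumption the paired positions differ by 30 nucleotides, that is, by 10 codons, which agrees with the difference of 10 between the amino acid numbering of SEQ ID NO: 1 and SEQ ID NO: 11.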
In some embodiments, the polynucleotide sequence of the nucleic acid molecule of the present invention has at least 80% homology, preferably at least 90% homology, more preferably at least 96% homology, to the polynucleotide sequence of SEQ ID NO: 4 or 13 and has the same coding function, while carrying the base mutation at one, any two, or all three of position 276 or 246, position 356 or 326, and position 1802 or 1772, and optionally carrying the base mutation at position 900 or 870.
Also provided herein is a nucleic acid construct containing the coding sequence of the transposase of any embodiment herein or its complement, or the coding sequence of the fusion protein of any embodiment herein or its complement. In some embodiments, the nucleic acid construct is an expression cassette which, in addition to the coding sequence, contains a promoter and a transcription termination sequence such as a polyA tailing signal sequence. Suitable promoters are well known in the art, and a person skilled in the art can select, according to the host used for expression, a promoter capable of driving expression of the transposase described herein or its fusion protein in that host.
In some embodiments, the nucleic acid construct comprises the following elements in order: a transposon 5' terminal repeat (5'ITR), a multiple cloning insertion site, a polyA tailing signal sequence, a transposon 3' terminal repeat (3'ITR), the nucleic acid molecule of any embodiment herein, and a promoter controlling expression of that nucleic acid molecule. "In order" here means from upstream to downstream. In the present invention, unless otherwise specified, the "forward" direction runs from upstream to downstream and the "reverse" direction runs from downstream to upstream.
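The upstream-to-downstream element order stated above can be represented as a simple list and checked against a candidate construct. This is a minimal sketch: the element labels and the function name are hypothetical, and the check covers only the one order listed in this paragraph, not other arrangements described elsewhere in the description.

```python
# Element order from upstream to downstream, as listed in the text.
# The labels are illustrative names, not identifiers from the patent.
REQUIRED_ORDER = [
    "5'ITR",
    "multiple_cloning_site",
    "polyA_signal",
    "3'ITR",
    "transposase_coding_sequence",
    "transposase_promoter",
]


def contains_in_order(construct, required=REQUIRED_ORDER):
    """True if every required element occurs in the construct in the given
    order; other sequences may sit between them."""
    remaining = iter(construct)
    # `element in remaining` advances the iterator past the match, so each
    # later element must be found after the earlier ones.
    return all(element in remaining for element in required)


example_construct = [
    "5'ITR", "linker", "multiple_cloning_site", "polyA_signal",
    "restriction_site", "3'ITR", "transposase_coding_sequence",
    "transposase_promoter",
]
print(contains_in_order(example_construct))
```

The check treats the required elements as an ordered subsequence, so extra sequences between them do not cause it to fail.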
In some embodiments, the transposon 5' terminal repeat is the PiggyBac transposon 5' terminal repeat, whose nucleotide sequence is, for example, as shown in SEQ ID NO: 15; the transposon 3' terminal repeat is the PiggyBac transposon 3' terminal repeat, whose nucleotide sequence is, for example, as shown in SEQ ID NO: 16.
In some embodiments, the polyA tailing signal sequence functions as a polyA tailing signal in both the forward and reverse directions.
In some embodiments, each of the six elements above is independently present in single or multiple copies.
The six elements above can be joined directly, or other sequences such as linkers or restriction sites can be present between them.
In the present invention, unless otherwise specified, the statement above that "the polyA tailing signal sequence functions as a polyA tailing signal in both the forward and reverse directions" includes, but is not limited to, the following cases:
1) a single polyA tailing signal sequence that functions as a polyA tailing signal in both the forward and reverse directions;
2) two polyA tailing signal sequences, one functioning as a polyA tailing signal in the forward direction and the other in the reverse direction.
Preferably, option 1) above is used. Without being bound by theory, the exogenous gene expression cassette and the PiggyBac transposase expression cassette can then share one polyA tailing signal sequence. This removes one polyA tailing signal sequence, makes the design more economical, and reduces plasmid size, which helps increase the capacity available for the exogenous gene expression cassette while maintaining transfection efficiency.
In some embodiments, the PB expression cassette and the exogenous gene expression cassette are placed in the same orientation and two polyA tailing signal sequences are used; the PB expression cassette comes first, and its polyA tailing signal sequence is placed between one of the ITRs and the exogenous gene promoter. For example: the promoter controlling expression of the PB transposase, the PB transposase coding sequence, the transposon 5' terminal repeat, polyA tailing signal sequence 1, the exogenous gene promoter and the exogenous gene (multiple cloning insertion site), polyA tailing signal sequence 2, and the transposon 3' terminal repeat; the PB transposase expression cassette is in the same orientation as the exogenous gene expression cassette.
In some embodiments, the positions of the transposon 5' terminal repeat and the transposon 3' terminal repeat can be interchanged.
In some embodiments, the nucleotide sequence of the multiple cloning insertion site is as shown in SEQ ID NO: 17.
In some embodiments, the nucleotide sequence of the polyA tailing signal sequence is as shown in SEQ ID NO: 18; the sequence shown in SEQ ID NO: 18 functions as a polyA tailing signal in both the forward and reverse directions.
Exemplary promoters include, but are not limited to, the CMV promoter, EF1α promoter, SV40 promoter, Ubiquitin B promoter, CAG promoter, HSP70 promoter, PGK-1 promoter, β-actin promoter, TK promoter, and GRP78 promoter.
One or more identical or different exogenous genes of interest, and optionally a promoter controlling their expression, can be operably inserted into the multiple cloning site of the nucleic acid construct of the present invention; alternatively, the multiple cloning site can be replaced by one or more identical or different exogenous gene coding sequences and optionally a promoter controlling their expression. Each exogenous gene is independently present in single or multiple copies.
In some embodiments, the transposase expression cassette is in the opposite orientation to the exogenous gene expression cassette.
In some embodiments, the exogenous gene is selected from one or more of: fluorescent reporter genes (e.g., green fluorescent protein, red fluorescent protein, yellow fluorescent protein), luciferase genes (e.g., firefly luciferase, Renilla luciferase), natural functional protein genes (e.g., TP53, GM-CSF, OCT4, SOX2, Nanog, KLF4, c-Myc), RNAi genes, and artificial chimeric genes (e.g., chimeric antigen receptor genes, Fc fusion protein genes, full-length antibody genes, nanobody genes).
As used herein, the term "expression cassette" refers to the complete set of elements required to express a gene, including a promoter, a gene coding sequence, and a polyA tailing signal sequence.
The term "nucleic acid construct" is defined herein as a single-stranded or double-stranded nucleic acid molecule, preferably an artificially constructed one. Optionally, the nucleic acid construct further comprises one or more operably linked regulatory sequences which, under compatible conditions, can direct expression of the coding sequence in a suitable host cell. Expression is understood to include any step involved in the production of a protein or polypeptide, including, but not limited to, transcription, post-transcriptional modification, translation, post-translational modification, and secretion.
The term "operably inserted/linked" is defined herein as a configuration in which a regulatory sequence is placed at an appropriate position relative to the coding sequence of a DNA sequence so that the regulatory sequence directs expression of the protein or polypeptide. In the nucleic acid construct of the present invention, for example, the exogenous gene promoter and the exogenous gene coding sequence are placed at the multiple cloning site by recombinant DNA techniques. The "operable linkage" can be achieved by recombinant DNA methods; specifically, the nucleic acid construct is a recombinant nucleic acid construct.
The term "coding sequence" is defined herein as the part of a nucleic acid sequence that directly determines the amino acid sequence of its protein product. The boundaries of a coding sequence are generally determined by the ribosome binding site immediately upstream of the open reading frame at the 5' end of the mRNA (for prokaryotic cells) and the transcription termination sequence immediately downstream of the open reading frame at the 3' end of the mRNA. Coding sequences can include, but are not limited to, DNA, cDNA, and recombinant nucleic acid sequences.
The term "regulatory sequence" is defined herein to include all components necessary or advantageous for expressing the peptide of the present invention. Each regulatory sequence may be native or foreign to the nucleic acid sequence encoding the protein or polypeptide. Such regulatory sequences include, but are not limited to, leader sequences, polyadenylation sequences, propeptide sequences, promoters, signal sequences, and transcription terminators. At a minimum, the regulatory sequences include a promoter and transcriptional and translational stop signals. Regulatory sequences can be provided with linkers in order to introduce specific restriction sites for ligating the regulatory sequences to the coding region of the nucleic acid sequence encoding the protein or polypeptide.
The regulatory sequence can be a suitable promoter sequence, that is, a nucleic acid sequence recognized by the host cell in which the nucleic acid sequence is expressed. The promoter sequence contains transcriptional regulatory sequences that mediate expression of the protein or polypeptide. The promoter can be any nucleic acid sequence that is transcriptionally active in the chosen host cell, including mutant, truncated, and hybrid promoters, and can be obtained from genes encoding extracellular or intracellular proteins or polypeptides that are homologous or heterologous to the host cell.
The regulatory sequence can also be a suitable transcription termination sequence, that is, a sequence recognized by the host cell to terminate transcription. The termination sequence is operably linked to the 3' end of the nucleic acid sequence encoding the protein or polypeptide. Any terminator that functions in the chosen host cell can be used in the present invention.
The regulatory sequence can also be a suitable leader sequence, that is, an untranslated region of the mRNA that is important for translation by the host cell. The leader sequence is operably linked to the 5' end of the nucleic acid sequence encoding the polypeptide. Any leader sequence that functions in the chosen host cell can be used in the present invention.
The regulatory sequence can also be a signal peptide coding region, which encodes an amino acid sequence linked to the amino terminus of the protein or polypeptide and directs the encoded polypeptide into the cell's secretory pathway. The 5' end of the coding region of the nucleic acid sequence may naturally contain a signal peptide coding region that is naturally linked, in translation reading frame, to the segment of the coding region encoding the secreted polypeptide. Alternatively, the 5' end of the coding region may contain a signal peptide coding region that is foreign to the coding sequence. A foreign signal peptide coding region may be required where the coding sequence does not normally contain one. Alternatively, the native signal peptide coding region can simply be replaced with a foreign one to enhance secretion of the polypeptide. In any case, any signal peptide coding region that directs the expressed polypeptide into the secretory pathway of the host cell used can be used in the present invention.
The regulatory sequence can also be a propeptide coding region, which encodes an amino acid sequence located at the amino terminus of a polypeptide. The resulting polypeptide is called a proenzyme or propolypeptide. A propolypeptide is generally inactive and can be converted into the mature, active polypeptide by catalytic or autocatalytic cleavage of the propeptide from the propolypeptide.
Where both a signal peptide region and a propeptide region are present at the amino terminus of a polypeptide, the propeptide region is immediately adjacent to the amino terminus of the polypeptide and the signal peptide region is immediately adjacent to the amino terminus of the propeptide region.
It may also be desirable to add regulatory sequences that regulate expression of the polypeptide according to the growth of the host cell. Examples of regulatory systems are those that switch gene expression on or off in response to a chemical or physical stimulus, including the presence of a regulatory compound. Other examples of regulatory sequences are those that allow gene amplification. In these cases, the nucleic acid sequence encoding the protein or polypeptide should be operably linked to the regulatory sequence.
In some embodiments, the present invention provides a recombinant vector. The recombinant vector may contain the nucleic acid molecule or nucleic acid construct described in any of the embodiments herein. The recombinant vector may be a recombinant cloning vector, a recombinant eukaryotic expression vector or a recombinant viral vector. The recombinant vector may contain other regulatory elements, including but not limited to a leader sequence, a polyadenylation sequence, a propeptide sequence, an enhancer, a transcription terminator and a resistance gene. A suitable recombinant vector can be selected and constructed for the intended purpose so that it contains the required regulatory elements. The recombinant cloning vector is preferably a pRS vector, a T vector or a pUC vector; the recombinant eukaryotic expression vector is preferably pEGFP, pCMVp-NEO-BAN or pSV2; and the recombinant viral vector is preferably a recombinant adenovirus vector or a lentiviral vector. In some embodiments, the recombinant cloning vector is obtained by recombining the nucleic acid construct described in any of the embodiments of the present invention with a pUC18, pUC19, pMD18-T, pMD19-T, pGM-T, pUC57, pMAX or pDC315 series vector; the recombinant expression vector is obtained by recombining the nucleic acid construct described in any of the embodiments of the present invention with a pCDNA3 series, pCDNA4 series, pCDNA5 series, pCDNA6 series, pRL series, pUC57, pMAX or pDC315 series vector; and the recombinant viral vector is a recombinant adenovirus vector, a recombinant adeno-associated virus vector, a recombinant retrovirus vector, a recombinant herpes simplex virus vector or a recombinant vaccinia virus vector.
The nucleic acid constructs and recombinant vectors can be constructed by methods well known in the art and expressed by conventional methods to prepare the transposase and fusion proteins described herein.
In some embodiments, the present invention also provides a host cell that contains the nucleic acid molecule, nucleic acid construct and/or recombinant vector described in any of the embodiments herein, or that expresses the transposase and/or fusion protein described in any of the embodiments herein. The host cell of the present invention is preferably an E. coli cell, an insect cell, a yeast cell or a mammalian cell. In some embodiments, the host cell is a recombinant mammalian cell, for example a recombinant primary cultured T cell, Jurkat cell, K562 cell, tumor cell, HEK293 cell or CHO cell.
In some embodiments, the present invention also provides a gene transfer system that contains the transposase, fusion protein, nucleic acid molecule, nucleic acid construct, recombinant vector or host cell described in any of the embodiments herein. In some embodiments, the gene transfer system further contains a transposon gene. In some embodiments, the nucleic acid molecule or nucleic acid construct described in any of the embodiments herein is integrated with the transposon gene; in some embodiments, the nucleic acid molecule or nucleic acid construct is independent of the transposon gene; in some embodiments, the nucleic acid molecule or nucleic acid construct and the transposon gene are located on the same recombinant vector; in some embodiments, the nucleic acid molecule or nucleic acid construct and the transposon gene are located on different recombinant vectors; in some embodiments, the transposon gene is integrated into the nucleic acid construct; in some embodiments, the transposon gene is integrated into the recombinant vector described in any of the embodiments herein; in some embodiments, the transposon gene is introduced into the host cell described in any of the embodiments herein; and in some embodiments, the transposon gene is located outside the host cell described in any of the embodiments herein.
The present invention also provides use of the transposase, fusion protein, nucleic acid molecule, nucleic acid construct, recombinant vector, host cell or gene transfer system described in any of the embodiments herein in any of the following:
(1) preparing, or serving as, a drug and/or preparation for genome research, gene therapy, cell therapy, or pluripotent stem cell induction and/or differentiation; preferably preparing, or serving as, a drug and/or preparation for integrating a foreign gene into the genome of a host cell, the host cell preferably being an E. coli cell, an insect cell, a yeast cell or a mammalian cell;
(2) preparing, or serving as, a tool for genome research, gene therapy, cell therapy, or pluripotent stem cell induction and/or differentiation; preferably preparing, or serving as, a tool for integrating a foreign gene into the genome of a host cell, the host cell preferably being an E. coli cell, an insect cell, a yeast cell or a mammalian cell.
The present invention also provides a drug and/or preparation for genome research, gene therapy, cell therapy, or pluripotent stem cell induction and/or differentiation, containing the transposase, fusion protein, nucleic acid molecule, nucleic acid construct, recombinant vector, host cell or gene transfer system described in any of the embodiments herein.
The present invention also provides a tool for genome research, gene therapy, cell therapy, or pluripotent stem cell induction and/or differentiation, containing the transposase, fusion protein, nucleic acid molecule, nucleic acid construct, recombinant vector, host cell or gene transfer system described in any of the embodiments herein.
In some embodiments, the present invention includes the following items 1 to 18:
1. An amino acid sequence of a highly active transposase, containing one or more of the following amino acid sequences: (1) an amino acid sequence having transposase activity obtained by mutating the amino acid sequence shown in SEQ ID NO: 1 at at least one of amino acid positions 92, 119 and 601; preferably, positions 92, 119 and 601 are mutated simultaneously; more preferably, the isoleucine at position 92 is mutated to asparagine, the valine at position 119 is mutated to alanine, and the glutamine at position 601 is mutated to arginine; (2) an amino acid sequence having transposase activity obtained by deleting, substituting, inserting or adding one or more amino acids other than the mutated amino acid at position 92, 119 or 601 in (1); preferably, an amino acid sequence having transposase activity obtained by deleting, substituting, inserting or adding one or more amino acids other than the simultaneously mutated amino acids at positions 92, 119 and 601; more preferably, an amino acid sequence having transposase activity obtained by deleting, substituting, inserting or adding one or more amino acids other than the isoleucine-to-asparagine mutation at position 92, the valine-to-alanine mutation at position 119 and the glutamine-to-arginine mutation at position 601.
2. The amino acid sequence according to item 1, further containing the amino acid sequence of a functional protein; the amino acid sequence of the functional protein is preferably the amino acid sequence of a nuclear localization signal, the amino acid sequence of EGFP green fluorescent protein, the amino acid sequence of a tag protein, the amino acid sequence of an antibody, or the like.
3. An amino acid sequence of a highly active transposase, containing one or more of the amino acid sequence shown in SEQ ID NO: 2, or an amino acid sequence having transposase activity obtained by deleting, substituting, inserting or adding one or more amino acids of the amino acid sequence shown in SEQ ID NO: 2 other than those at positions 92, 119 and 601.
4. A peptide containing one or more of the amino acid sequences described in any one of items 1-3.
5. A protein having transposase activity, containing one or more of the amino acid sequences described in any one of items 1-3 or one or more of the peptides described in item 4.
6. A nucleotide sequence encoding the amino acid sequence described in any one of items 1-3, the peptide described in item 4 or the protein described in item 5, containing one or more of the following nucleotide sequences: (1) the nucleotide sequence shown in SEQ ID NO: 4 with a base mutation at at least one of base positions 276, 356, 900 and 1802; preferably with simultaneous base mutations at positions 275, 356, 900 and 1802; more preferably with base T at position 275 mutated to C, base T at position 356 mutated to C, base G at position 900 mutated to A, and base A at position 1802 mutated to G; or (2) a nucleotide sequence complementary to the mutated nucleotide sequence in (1); or (3) a nucleotide sequence that overlaps the mutated nucleotide sequence in (1) and has the same coding function; or (4) a nucleotide sequence that hybridizes with the mutated nucleotide sequence in (1) and has the same coding function; or (5) a nucleotide sequence having the same coding function obtained by substituting, deleting or adding one or more bases of the nucleotide sequence in (1), (2), (3) or (4) other than at the mutation sites; or (6) a nucleotide sequence that has at least 80% homology with the nucleotide sequence in (1), (2), (3) or (4) and has the same coding function; preferably at least 90% homology and the same coding function; more preferably at least 96% homology and the same coding function.
7. A nucleotide sequence encoding the amino acid sequence described in any one of items 1-3, the peptide described in item 4 or the protein described in item 5, containing one or more of the following nucleotide sequences: (1) the nucleotide sequence shown in SEQ ID NO: 3; or (2) a nucleotide sequence complementary to the mutated nucleotide sequence in (1); or (3) a nucleotide sequence that overlaps the mutated nucleotide sequence in (1) and has the same coding function; or (4) a nucleotide sequence that hybridizes with the mutated nucleotide sequence in (1) and has the same coding function; or (5) a nucleotide sequence having the same coding function obtained by substituting, deleting or adding one or more bases of the nucleotide sequence in (1), (2), (3) or (4) other than at the mutation sites; or (6) a nucleotide sequence that has at least 80% homology with the nucleotide sequence in (1), (2), (3) or (4) and has the same coding function; preferably at least 90% homology and the same coding function; more preferably at least 96% homology and the same coding function.
8. The nucleotide sequence according to item 6 or item 7, further containing a nucleotide sequence encoding a functional protein, preferably a nucleotide sequence encoding a nuclear localization signal, a nucleotide sequence expressing EGFP green fluorescent protein, a nucleotide sequence encoding a tag protein peptide, or a nucleotide sequence encoding an antibody.
9. A nucleic acid containing the nucleotide sequence described in any one of items 6-8.
10. A nucleic acid construct encoding the amino acid sequence described in any one of items 1-3, the peptide described in item 4 or the protein described in item 5.
11. The nucleic acid construct according to item 10, containing the nucleotide sequence described in any one of items 6-8 or the nucleic acid described in item 9.
12. A recombinant vector containing the nucleotide sequence described in any one of items 6-8, the nucleic acid described in item 9, or the nucleic acid construct described in any one of items 10-11; the recombinant vector is preferably a recombinant cloning vector, a recombinant eukaryotic expression vector or a recombinant viral vector; the recombinant cloning vector is preferably a pRS vector, a T vector or a pUC vector; the recombinant eukaryotic expression vector is preferably pEGFP, pCMVp-NEO-BAN or pSV2; and the recombinant viral vector is preferably a recombinant adenovirus vector or a lentiviral vector.
13. A host cell containing the nucleic acid construct described in any one of items 10-11 or the recombinant vector described in item 12; the host cell is preferably an E. coli cell, an insect cell, a yeast cell or a mammalian cell.
14. A gene transfer system, characterized in that it contains the peptide described in item 4, the protein described in item 5, the nucleic acid described in item 9, the nucleic acid construct described in any one of items 10-11, the recombinant vector described in item 12, or the host cell described in item 13.
15. The gene transfer system according to item 14, characterized in that it further contains a transposon gene, wherein the nucleic acid described in item 9 or the nucleic acid construct described in any one of items 10-11 is integrated with the transposon gene; or the nucleic acid described in item 9 or the nucleic acid construct described in any one of items 10-11 is independent of the transposon gene; or the nucleic acid described in item 9 or the nucleic acid construct described in any one of items 10-11 and the transposon gene are located on the same recombinant vector; or the nucleic acid described in item 9 or the nucleic acid construct described in any one of items 10-11 and the transposon gene are located on different recombinant vectors; or the transposon gene is integrated into the nucleic acid construct described in any one of items 10-11; or the transposon gene is integrated into the recombinant vector described in item 12; or the transposon gene is introduced into the host cell described in item 13; or the transposon gene is located outside the host cell described in item 13.
16. Use of the peptide described in item 4, the protein described in item 5, the nucleic acid described in item 9, the nucleic acid construct described in any one of items 10-11, the recombinant vector described in item 12, the host cell described in item 13, or the gene transfer system described in any one of items 14-15 in any of the following:
(1) preparing, or serving as, a drug and/or preparation for genome research, gene therapy, cell therapy, or pluripotent stem cell induction and/or differentiation; preferably preparing, or serving as, a drug and/or preparation for integrating a foreign gene into the genome of a host cell, the host cell preferably being an E. coli cell, an insect cell, a yeast cell or a mammalian cell;
(2) preparing, or serving as, a tool for genome research, gene therapy, cell therapy, or pluripotent stem cell induction and/or differentiation; preferably preparing, or serving as, a tool for integrating a foreign gene into the genome of a host cell, the host cell preferably being an E. coli cell, an insect cell, a yeast cell or a mammalian cell.
17. A drug and/or preparation for genome research, gene therapy, cell therapy, or pluripotent stem cell induction and/or differentiation, containing the peptide described in item 4, the protein described in item 5, the nucleic acid described in item 9, the nucleic acid construct described in any one of items 10-11, the recombinant vector described in item 12, the host cell described in item 13, or the gene transfer system described in any one of items 14-15.
18. A tool for genome research, gene therapy, cell therapy, or pluripotent stem cell induction and/or differentiation, containing the peptide described in item 4, the protein described in item 5, the nucleic acid described in item 9, the nucleic acid construct described in any one of items 10-11, the recombinant vector described in item 12, the host cell described in item 13, or the gene transfer system described in any one of items 14-15.
The present invention is described more clearly below with reference to the accompanying drawings and specific examples. The specific examples serve only to explain the invention and do not limit it in any way. Unless otherwise specified, the experimental methods and conditions in the examples are conventional, and reagents and the like were used according to the manufacturer's instructions.
Example 1 Obtaining the highly active bz-hyPBase mutant
Starting from the original sequence of the existing highly active piggyBac transposase (hyPBase; amino acid sequence shown in SEQ ID NO: 1), we made the following changes to obtain the sequence of the claimed baize piggyBac transposase (bz-hyPBase):
(1) Based on human codon usage preference, we codon-optimized the existing highly active piggyBac transposase to obtain the nucleotide sequence shown in SEQ ID NO: 4, in order to increase the expression level of the transposase;
(2) A human c-myc nuclear localization signal was added after the start codon to improve the efficiency with which foreign genes are integrated in the host cell;
(3) The nucleotide sequence shown in SEQ ID NO: 4 was randomly mutated by the method below, yielding a mutant whose transposition efficiency is clearly higher than that of the existing highly active piggyBac transposase. We named it bz-hyPBase (amino acid sequence shown in SEQ ID NO: 2, nucleotide sequence shown in SEQ ID NO: 3). The details are as follows:
a. Construction of the screening reporter vector
The G418 resistance gene was inserted between the transposon elements 5'IR and 3'IR by gene synthesis to form the transposon G418-IR. This transposon was inserted at a TTAA site in the URA3 gene by PCR followed by recombination, and the transposase under an inducible promoter was inserted into the multiple cloning site of PRS316, giving the final screening reporter vector PRS316-URA-PBase. The specific steps are as follows:
(1) PCR was performed on the template PRS316 with primers pURA-F (SEQ ID NO: 5: aagccgctaaaggcattatccgcc) and pURA-R (SEQ ID NO: 6: aactgtgccctccatggaaaaatcagtc) to obtain linearized fragment 1 of plasmid PRS316.
(2) PCR was performed on the synthesized transposon G418-IR with primers pURA-IR-F (SEQ ID NO: 7) and pURA-IR-R (SEQ ID NO: 8) to obtain linearized fragment 2, the transposon flanked by sequences homologous to PRS316.
pURA-IR-F (SEQ ID NO: 7):
pURA-IR-R (SEQ ID NO: 8):
(3) Fragment 1 and fragment 2 were joined with NEBuilder homologous recombinase to form plasmid PRS316-URA.
(4) A PB transposase gene under the inducible GALS promoter was synthesized and cloned into the vector PRS316-URA using SacI and EcoRI, giving the final plasmid PRS316-URA-PBase. The map of the PRS316-URA-PBase vector is shown in Figure 1.
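As a quick sanity check on the primers printed in step (1), the following minimal Python sketch reports their length, GC fraction and reverse complement. The helper names `reverse_complement` and `gc_fraction` are illustrative and not part of the described method; the primers for SEQ ID NO: 7 and 8 are not included because their sequences are not printed in this section.

```python
# Primer sequences copied from step (1); helper names are illustrative only.
PRIMERS = {
    "pURA-F": "aagccgctaaaggcattatccgcc",
    "pURA-R": "aactgtgccctccatggaaaaatcagtc",
}

# Translation table for complementing DNA bases in either letter case.
_COMPLEMENT = str.maketrans("acgtACGT", "tgcaTGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def gc_fraction(seq: str) -> float:
    """Return the fraction of G and C bases in a non-empty DNA sequence."""
    s = seq.lower()
    return (s.count("g") + s.count("c")) / len(s)


if __name__ == "__main__":
    for name, seq in PRIMERS.items():
        print(name, len(seq), round(gc_fraction(seq), 2), reverse_complement(seq))
```

The same helpers can be applied to any other primer once its sequence is available.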
b. Construction of the mutant library
PCR primers GR-F (SEQ ID NO: 9: taatcagcgaagcgatga) and GR-R (SEQ ID NO: 10: cagcatgcctgctattgtcttcc) were designed outside the transposase open reading frame (ORF), so that the amplified fragment carries homologous sequences of about 50 bp flanking each end of the transposase ORF on the PRS316-URA-PBase vector. The transposase was mutated with the Clontech error-prone PCR kit; the number of mutations can be increased by recovering the PCR fragment and using it as the template for further rounds of mutagenesis (upper flow chart of Figure 2), finally yielding transposase fragments containing point mutations. The screening reporter vector PRS316-URA-PBase was linearized with XbaI and EcoRI, which also removes the original unmutated transposase. The recovered PCR transposase fragments and the linearized vector were transformed into a ura-deficient yeast strain at a molar ratio of 10:1 (lower flow chart of Figure 2, and Figure 3). Yeast uses its own homologous recombination repair machinery to insert the exogenous fragment, through its homology arms, into the gapped plasmid DNA, so that a complete plasmid carrying the target fragment is assembled automatically inside the yeast cell. This method clones the DNA fragments into the yeast strain in one step, and it reduces the frequent duplication of mutants that occurs when plasmids are first constructed and amplified in E. coli and then transferred into yeast. The clones obtained on the plate after transformation are the mutants, and a mutant library of the desired size can be obtained by picking single colonies.
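The 10:1 molar ratio of fragment to linearized vector can be converted into DNA masses with a short calculation. The sketch below assumes both molecules are linear double-stranded DNA, so that the number of moles is proportional to mass divided by length in base pairs; the function name and the fragment and vector sizes in the usage lines are hypothetical and are not taken from this document.

```python
def insert_mass_ng(vector_ng: float, vector_bp: int, insert_bp: int,
                   molar_ratio: float = 10.0) -> float:
    """Mass of insert (ng) that gives `molar_ratio` insert molecules per vector molecule.

    Assumes linear double-stranded DNA, where moles are proportional to mass / length.
    """
    if min(vector_ng, vector_bp, insert_bp, molar_ratio) <= 0:
        raise ValueError("all arguments must be positive")
    return molar_ratio * vector_ng * insert_bp / vector_bp


if __name__ == "__main__":
    # Hypothetical sizes: 100 ng of a 5000 bp linearized vector and a 2500 bp fragment.
    print(insert_mass_ng(vector_ng=100, vector_bp=5000, insert_bp=2500))
```

Because the ratio is molar, a fragment shorter than the vector needs proportionally less mass than ten times the vector mass.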
c. Screening procedure for highly efficient transposases
As shown in Figure 3, screening is carried out in two rounds. The first round screens all mutants broadly to find those whose transposition efficiency is clearly higher than that of the unmutated control. The second round is carried out on the yeast selected in the first round: the exact transposition efficiency is calculated to identify the mutant with increased transposition efficiency in yeast, namely bz-hyPBase (amino acid sequence SEQ ID NO: 2, nucleotide sequence SEQ ID NO: 3).
First screening: single colonies picked from the transformed mutant library were inoculated into 96-well plates containing YPD medium with G418 and activated for 24 hours. They were then transferred with a replicator into YPD medium containing 2% galactose for induction. After 24 hours of induction, the cultures were diluted to 10⁻² or 10⁻³ (depending on yeast growth), and 10 μl was spotted onto ura-deficient solid medium. After 48 hours of culture, growth of the mutants was observed and compared with the unmutated clone, and clones with clearly increased transposition efficiency were selected for the second screening.
Second screening: the candidate mutants from the first screening were activated for 24 hours, adjusted to the same OD600, and inoculated at a ratio of 1:100 into YPD medium containing 2% galactose for 24 hours of induction. After induction, the cultures were again adjusted to the same OD600 and serially diluted to 10⁻², 10⁻³ and 10⁻⁴. From the 10⁻² and 10⁻³ dilutions, 20 μl was spread on ura-deficient solid medium and cultured for 24 hours, and the colonies were counted; colonies that grow on ura-deficient solid medium are clones in which transposition has occurred. In parallel, 20 μl of the 10⁻³ and 10⁻⁴ dilutions was spread on YPD complete solid medium as the control; the colonies that grow there give the total number of yeast cells. Transposition efficiency of the transposase = number of clones with transposition / total number of clones = (number of colonies on ura-deficient medium × dilution factor) / (number of colonies on YPD medium × dilution factor) × 100%. This method allows high-throughput screening: a single operator can screen 96-960 mutants in one run, which greatly increases the probability of finding a highly active transposase.
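The efficiency formula above can be written as a small function. This is a minimal sketch that assumes the same volume (20 μl) is plated on both media, so the plated volume cancels, and that each dilution factor is given as a fold dilution (100 for a 10⁻² dilution). The function name and the colony counts in the usage lines are hypothetical, not measured values.

```python
def transposition_efficiency_percent(ura_colonies: int, ura_dilution_fold: float,
                                     ypd_colonies: int, ypd_dilution_fold: float) -> float:
    """Percentage of cells in which transposition occurred.

    ura_colonies / ura_dilution_fold: count and fold dilution on ura-deficient medium.
    ypd_colonies / ypd_dilution_fold: count and fold dilution on YPD complete medium.
    Assumes equal plated volumes on both media.
    """
    if ypd_colonies <= 0 or ura_dilution_fold <= 0 or ypd_dilution_fold <= 0:
        raise ValueError("YPD count and dilution factors must be positive")
    transposed = ura_colonies * ura_dilution_fold  # back-calculated transposed cells
    total = ypd_colonies * ypd_dilution_fold       # back-calculated total cells
    return 100.0 * transposed / total


if __name__ == "__main__":
    # Hypothetical counts: 50 colonies at 10^-2 on ura-deficient medium,
    # 200 colonies at 10^-4 on YPD medium.
    print(transposition_efficiency_percent(50, 1e2, 200, 1e4))
```

Comparing the value for each mutant with that of the unmutated control processed in the same run gives the relative improvement used for ranking.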
The calculation above gives the exact transposition efficiency of each mutant, and strains with increased transposition efficiency were selected for mutation site analysis. Yeast from the initially activated 96-well plate was inoculated for expansion culture, the yeast plasmid was extracted and sent to a sequencing company for analysis, and the mutation sites of the mutant were identified by alignment with the original sequence.
Amino acid sequence of hyPBase containing the nuclear localization sequence (SEQ ID NO: 1):
Amino acid sequence of bz-hyPBase containing the nuclear localization sequence (SEQ ID NO: 2):
In the amino acid sequence of the existing highly active transposase hyPBase (SEQ ID NO: 1), the isoleucine at position 92 is mutated to asparagine, the valine at position 119 is mutated to alanine, and the glutamine at position 601 is mutated to arginine, giving the amino acid sequence of bz-hyPBase shown in SEQ ID NO: 2.
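The three substitutions can be expressed in the usual shorthand as I92N, V119A and Q601R. The sketch below applies such substitutions to a protein string and checks that the expected wild-type residue is present at each 1-based position. Because the sequences of SEQ ID NO: 1 and 2 are not printed in this section, the usage lines build a toy placeholder sequence; the function name and the placeholder are assumptions for illustration only.

```python
import re

# Matches shorthand such as "I92N": wild-type residue, 1-based position, new residue.
_MUTATION = re.compile(r"^([A-Z])(\d+)([A-Z])$")


def apply_substitutions(protein: str, mutations: list[str]) -> str:
    """Apply point substitutions, verifying the wild-type residue at each position."""
    residues = list(protein)
    for mutation in mutations:
        match = _MUTATION.match(mutation)
        if match is None:
            raise ValueError(f"unrecognized mutation: {mutation!r}")
        wild_type, position, new = match.group(1), int(match.group(2)), match.group(3)
        if not 1 <= position <= len(residues):
            raise ValueError(f"position {position} is outside the sequence")
        if residues[position - 1] != wild_type:
            raise ValueError(f"expected {wild_type} at position {position}, "
                             f"found {residues[position - 1]}")
        residues[position - 1] = new
    return "".join(residues)


if __name__ == "__main__":
    # Toy placeholder, NOT the real hyPBase sequence: glycine everywhere except
    # the three positions of interest.
    toy = list("G" * 610)
    toy[91], toy[118], toy[600] = "I", "V", "Q"
    mutant = apply_substitutions("".join(toy), ["I92N", "V119A", "Q601R"])
    print(mutant[91], mutant[118], mutant[600])
```

The wild-type check makes a numbering mismatch, for example a sequence without the nuclear localization signal, fail loudly instead of silently mutating the wrong residue.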
Nucleotide sequence of the human codon-optimized hyPBase transposase containing the nuclear localization sequence (SEQ ID NO: 4):
Nucleotide sequence of the bz-hyPBase transposase containing the nuclear localization sequence (SEQ ID NO: 3):
The nucleotide sequence of the existing highly active enzyme hyPBase was optimized for human codon usage to give the human codon-optimized nucleotide sequence (SEQ ID NO: 4). Starting from this sequence, the following base mutations were made: base T at position 276 was mutated to C, base T at position 356 was mutated to C, base G at position 900 was mutated to A, and base A at position 1802 was mutated to G. This gives the mutated nucleotide sequence shown in SEQ ID NO: 3, which encodes the new highly active transposase bz-hyPBase of the present invention.
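To relate the base positions to the amino acid positions, each 1-based base position can be mapped to its codon number and its position within that codon. The sketch below assumes that base 1 is the first base of the start codon and that the whole sequence is read in a single frame; it uses only arithmetic and makes no claim about which bases SEQ ID NO: 3 or 4 actually contain.

```python
def codon_of(base_position: int) -> tuple[int, int]:
    """Map a 1-based base position to (codon number, position within codon 1..3).

    Assumes numbering starts at the first base of the start codon.
    """
    if base_position < 1:
        raise ValueError("base positions are 1-based")
    codon = (base_position - 1) // 3 + 1
    offset = (base_position - 1) % 3 + 1
    return codon, offset


if __name__ == "__main__":
    for position in (275, 276, 356, 900, 1802):
        codon, offset = codon_of(position)
        print(f"base {position}: codon {codon}, position {offset} in codon")
```

Under this assumption, bases 275 and 276 both fall in codon 92, base 356 in codon 119 and base 1802 in codon 601, while base 900 falls in codon 300, which is not one of the three amino acid positions listed.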
Example 2 bz-hyPBase has higher transposition efficiency in yeast
We inserted a transposon carrying the G418 resistance gene into the URA3 gene of the yeast plasmid PRS316, disrupting expression of the URA gene, and cloned a transposase under an inducible promoter into the same PRS316 plasmid to generate plasmid PRS316-URA-Pbase. Plasmids carrying the different transposases WT PBase, hyPBase, optimized hyPBase and bz-hyPBase were prepared in parallel. Each plasmid was transformed into the ura-deficient Saccharomyces cerevisiae strain BJ2168, which cannot survive in ura-deficient medium. Transposase expression is switched on by the inducer galactose and promotes transposition of the transposon; once the transposon has transposed, the URA gene is expressed normally and clones in which transposition occurred regain normal growth in ura-deficient medium. By counting the number of clones in which transposition occurred among a given number of yeast cells, the transposition efficiency of the transposase in Saccharomyces cerevisiae can be calculated. Using this method, we compared the transposition efficiency of the wild-type piggybac transposase WT PBase, the existing highly active piggybac transposase hyPBase, the codon-optimized transposase with an added nuclear localization signal (optimized hyPBase), and bz-hyPBase. The results in Figure 4 show that the transposition efficiency of bz-hyPBase is 3 times that of hyPBase, demonstrating that bz-hyPBase has higher transposition efficiency in yeast.
WT PBase is a plasmid carrying the mammalian codon-optimized piggybac transposase. hyPBase is a plasmid carrying the existing highly active piggybac transposase (obtained by mutating 7 amino acid sites of the WT PBase described in the background art). optimized hyPBase is a plasmid carrying the transposase obtained from the existing highly active piggybac transposase after human codon optimization and addition of a nuclear localization signal. bz-hyPBase is a plasmid carrying the new highly active transposase screened in the present invention (that is, the transposase obtained by introducing into optimized hyPBase the three amino acid site mutations described in the examples of the present invention).
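The colony-counting readout of Example 2 can be sketched as a ratio followed by a fold comparison. This is an illustration under stated assumptions: the exact formula used for the calculation is not given in this section, so efficiency is assumed here to be colonies growing on ura-deficient medium divided by cells plated. The function names are hypothetical and the counts are placeholders chosen only to exercise the code, not data from Figure 4.

```python
import math


def transposition_efficiency(revertant_colonies, cells_plated):
    """Assumed definition: colonies growing on ura-deficient medium per cell plated."""
    if cells_plated <= 0:
        raise ValueError("cells_plated must be positive")
    if revertant_colonies < 0:
        raise ValueError("revertant_colonies cannot be negative")
    return revertant_colonies / cells_plated


def fold_change(efficiency, reference_efficiency):
    """Efficiency of one transposase relative to a reference transposase."""
    if reference_efficiency <= 0:
        raise ValueError("reference efficiency must be positive")
    return efficiency / reference_efficiency


# Placeholder counts for two unnamed enzymes (hypothetical, not measurements).
reference = transposition_efficiency(revertant_colonies=50, cells_plated=100_000)
candidate = transposition_efficiency(revertant_colonies=200, cells_plated=100_000)
ratio = fold_change(candidate, reference)
print(f"reference={reference:.2e} candidate={candidate:.2e} fold={ratio:.1f}")
```

Normalizing by the number of cells plated matters because the plates being compared may not receive the same number of cells; comparing raw colony counts would otherwise confound plating density with enzyme activity.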
Example 3 bz-hyPBase has higher gene editing efficiency in CHO cells
We cloned optimized hyPBase and bz-hyPBase into a mammalian cell expression vector to generate the plasmids ploxP-optimized hyPBase (same structure as Figure 5, except that the transposase bz-hyPBase in Figure 5 is replaced by optimized hyPBase) and ploxP-bz-HyPB (Figure 5), so that they express the transposase. In both optimized hyPBase and bz-hyPBase, a human c-myc nuclear localization signal is attached downstream of the promoter. The transposon carrying the EGFP gene was cloned into the vector pSAD-EGFP (Figure 6) so that it expresses green fluorescent protein. The two plasmids, one expressing the transposase and one carrying the transposon, were co-electroporated into CHO cells. Under the action of the transposase, the EGFP-bearing transposon is inserted into the genome, giving stable expression of green fluorescent protein. After two passages, cells expressing green fluorescent protein were counted by flow cytometry on day 7 and day 14; the more cells that express the fluorescent protein, the higher the transposition efficiency of the transposase. According to the statistics in Figure 7, the transposition activity of bz-hyPBase is significantly better than that of hyPBase.
Example 4 bz-hyPBase has higher gene editing efficiency in T cells
We co-electroporated the ploxP-optimized hyPBase and ploxP-bz-HyPB plasmids from Example 3 into peripheral blood mononuclear cells (PBMCs) to edit the T cell genome. The transposon carrying the EGFP green fluorescent protein gene edits the T cell genome under the action of the transposase, so the editing efficiency in T cells reflects the strength of the transposase activity. We ran multiple groups of experiments using PBMCs from 3 different healthy donors and measured gene editing efficiency by flow cytometry on day 5; the higher the EGFP-positive rate, the higher the transposase activity. As shown in Figure 8, the transposition activity of bz-hyPBase was better than that of optimized hyPBase in PBMCs from every donor.
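The per-donor comparison described above amounts to computing an EGFP-positive rate for each donor and enzyme, then checking that one enzyme is higher in every donor. The sketch below is a hedged illustration: the function names, donor labels and event counts are all hypothetical placeholders, not values from Figure 8.

```python
def egfp_positive_rate(positive_events, total_events):
    """Percentage of flow cytometry events scored as EGFP-positive."""
    if total_events <= 0:
        raise ValueError("total_events must be positive")
    if not 0 <= positive_events <= total_events:
        raise ValueError("positive_events must lie between 0 and total_events")
    return 100.0 * positive_events / total_events


def higher_in_every_donor(rates_a, rates_b):
    """True if enzyme A has a higher EGFP-positive rate than enzyme B in each donor."""
    if rates_a.keys() != rates_b.keys():
        raise ValueError("both enzymes must be measured in the same donors")
    return all(rates_a[donor] > rates_b[donor] for donor in rates_a)


# Placeholder (positive events, total events) per donor; hypothetical numbers only.
events_a = {"donor_1": (3000, 10000), "donor_2": (2500, 10000), "donor_3": (4000, 10000)}
events_b = {"donor_1": (2000, 10000), "donor_2": (1500, 10000), "donor_3": (3500, 10000)}

rates_a = {donor: egfp_positive_rate(*counts) for donor, counts in events_a.items()}
rates_b = {donor: egfp_positive_rate(*counts) for donor, counts in events_b.items()}
print(rates_a, rates_b, higher_in_every_donor(rates_a, rates_b))
```

Comparing within each donor, rather than pooling all cells, keeps donor-to-donor variation in baseline editing efficiency from masking or exaggerating the difference between enzymes.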
Finally, it should be noted that the above examples only illustrate the technical solutions of the present invention and do not limit its scope of protection. Although the present invention has been described in detail with reference to the foregoing examples, those of ordinary skill in the art should understand that several improvements can be made without departing from the principle of the present invention, and these improvements should also be regarded as falling within the scope of protection of the present invention.
Claims (15)
- A transposase, characterized in that the transposase is a mutant of SEQ ID NO: 11 whose amino acid sequence, compared with SEQ ID NO: 11, has an amino acid mutation at one, any two, or all three positions selected from positions 82, 109 and 591, and optionally has one or more amino acid mutations at positions of SEQ ID NO: 11 other than positions 82, 109 and 591; or the transposase is a mutant of SEQ ID NO: 1 whose amino acid sequence, compared with SEQ ID NO: 1, has an amino acid mutation at one, any two, or all three positions selected from positions 92, 119 and 601, and optionally has one or more amino acid mutations at positions of SEQ ID NO: 1 other than positions 92, 119 and 601.
- The transposase of claim 1, characterized in that, compared with the sequence shown in SEQ ID NO: 11, the amino acid sequence of the transposase has one, any two, or all three of the following substitution mutations: isoleucine at position 82 mutated to asparagine, valine at position 109 mutated to alanine, and glutamine at position 591 mutated to arginine, and optionally the transposase has one or more amino acid mutations at positions of SEQ ID NO: 11 other than positions 82, 109 and 591; preferably, the amino acid residues at the remaining positions of the transposase are identical to SEQ ID NO: 11; or, compared with the sequence shown in SEQ ID NO: 1, the amino acid sequence of the transposase has one, any two, or all three of the following substitution mutations: isoleucine at position 92 mutated to asparagine, valine at position 119 mutated to alanine, and glutamine at position 601 mutated to arginine, and optionally the transposase has one or more amino acid mutations at positions of SEQ ID NO: 1 other than positions 92, 119 and 601; preferably, the amino acid residues at the remaining positions of the transposase are identical to SEQ ID NO: 1; preferably, the amino acid sequence of the transposase is as shown in SEQ ID NO: 2 or 12.
- A fusion protein comprising the transposase of any one of claims 1-2 and a functional protein, or formed by or consisting of the transposase of any one of claims 1-2 and a functional protein.
- The fusion protein of claim 3, characterized in that the functional protein is a functional protein used to increase the transposition activity of the transposase, to monitor the transposition function of the transposase, and/or to add a new function to the transposase; preferably, the functional protein is selected from: a nuclear localization signal protein, a marker protein or tag protein, and an antibody of interest.
- A nucleic acid molecule whose polynucleotide sequence is: (1) a polynucleotide sequence encoding the transposase of any one of claims 1-2; (2) a polynucleotide sequence encoding the fusion protein of any one of claims 3-4; or (3) the complementary sequence of the polynucleotide sequence of (1) or (2).
- The nucleic acid molecule of claim 5, characterized in that its polynucleotide sequence, compared with SEQ ID NO: 4, has a base mutation at one, any two, or all three of positions 276, 356 and 1802, and optionally also has a base mutation at position 900; preferably, the mutation at position 276 is base T to base C, the mutation at position 356 is base T to base C, the mutation at position 900 is base G to base A, and the mutation at position 1802 is base A to base G; more preferably, the polynucleotide sequence is as shown in SEQ ID NO: 3; or its polynucleotide sequence, compared with SEQ ID NO: 13, has a base mutation at one, any two, or all three of positions 246, 326 and 1772, and optionally also has a base mutation at position 870; preferably, the mutation at position 246 is base T to base C, the mutation at position 326 is base T to base C, the mutation at position 870 is base G to base A, and the mutation at position 1772 is base A to base G; more preferably, the polynucleotide sequence is as shown in SEQ ID NO: 14.
- A nucleic acid construct containing the nucleic acid molecule of claim 5 or 6; preferably, the nucleic acid construct is an expression cassette.
- The nucleic acid construct of claim 7, characterized in that the nucleic acid construct comprises, in order: a transposon 5' terminal repeat sequence, a multiple cloning site, a polyA tailing signal sequence, a transposon 3' terminal repeat sequence, the nucleic acid molecule of claim 5 or 6, and a promoter controlling expression of said nucleic acid molecule.
- A recombinant vector containing the nucleic acid molecule of claim 5 or 6 or the nucleic acid construct of claim 7 or 8; preferably, the recombinant vector is a recombinant cloning vector, a recombinant eukaryotic expression vector or a recombinant viral vector.
- The recombinant vector of claim 9, characterized in that the recombinant cloning vector is a pRS vector, a T vector or a pUC vector, the recombinant eukaryotic expression vector is pEGFP, pCMVp-NEO-BAN or pSV2, and the recombinant viral vector is a recombinant adenovirus vector or a lentivirus vector.
- A host cell which contains the nucleic acid molecule of claim 5 or 6, the nucleic acid construct of claim 7 or 8, or the recombinant vector of claim 9 or 10, and/or which expresses the transposase of any one of claims 1-2 and/or the fusion protein of claim 3 or 4; preferably, the host cell is an E. coli cell, an insect cell, a yeast cell or a mammalian cell.
- A gene transfer system containing the transposase of any one of claims 1-2, the fusion protein of claim 3 or 4, the nucleic acid molecule of claim 5 or 6, the nucleic acid construct of claim 7 or 8, the recombinant vector of claim 9 or 10, or the host cell of claim 11.
- The gene transfer system of claim 12, characterized in that the gene transfer system further contains a transposon gene; preferably, the nucleic acid molecule of claim 5 or 6 or the nucleic acid construct of claim 7 or 8 is integrated with the transposon gene, or the nucleic acid molecule of claim 5 or 6 or the nucleic acid construct of claim 7 or 8 is separate from the transposon gene, or the nucleic acid molecule of claim 5 or 6 or the nucleic acid construct of claim 7 or 8 is located on the same recombinant vector as the transposon gene, or the nucleic acid molecule of claim 5 or 6 or the nucleic acid construct of claim 7 or 8 is located on a different recombinant vector from the transposon gene, or the transposon gene is integrated into the nucleic acid construct of claim 7 or 8, or the transposon gene is integrated into the recombinant vector of claim 9 or 10, or the transposon gene is transferred into the host cell of claim 11, or the transposon gene is located outside the host cell of claim 11.
- Use of the transposase of any one of claims 1-2, the fusion protein of claim 3 or 4, the nucleic acid molecule of claim 5 or 6, the nucleic acid construct of claim 7 or 8, the recombinant vector of claim 9 or 10, the host cell of claim 11, or the gene transfer system of claim 12 or 13 in any of the following: (1) preparing, or serving as, a drug and/or preparation for genome research, gene therapy, cell therapy, or pluripotent stem cell induction and/or differentiation; preferably preparing, or serving as, a drug and/or preparation for integrating a foreign gene into a host cell genome, the host cell preferably being an E. coli cell, an insect cell, a yeast cell or a mammalian cell; (2) preparing, or serving as, a tool for genome research, gene therapy, cell therapy, or pluripotent stem cell induction and/or differentiation; preferably preparing, or serving as, a tool for integrating a foreign gene into a host cell genome, the host cell preferably being an E. coli cell, an insect cell, a yeast cell or a mammalian cell.
- A drug, preparation or tool for genome research, gene therapy, cell therapy, or pluripotent stem cell induction and/or differentiation, containing the transposase of any one of claims 1-2, the fusion protein of claim 3 or 4, the nucleic acid molecule of claim 5 or 6, the nucleic acid construct of claim 7 or 8, the recombinant vector of claim 9 or 10, the host cell of claim 11, or the gene transfer system of claim 12 or 13.
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201911227263 | 2019-12-04 | ||
CN201911227263.5 | 2019-12-04 |
Publications (1)
Publication Number | Publication Date |
---|---|
WO2021110119A1 (en) | 2021-06-10 |
Family
ID=76110908
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
PCT/CN2020/133796 WO2021110119A1 (en) | 2019-12-04 | 2020-12-04 | Highly active transposase and application thereof |
Country Status (2)
Country | Link |
---|---|
CN (1) | CN112899252A (en) |
WO (1) | WO2021110119A1 (en) |
Families Citing this family (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN113481174B (en) * | 2021-07-01 | 2022-08-19 | 温州医科大学 | Nucleic acid ligase |
CN116286713A (en) * | 2022-05-10 | 2023-06-23 | 翌圣生物科技(上海)股份有限公司 | High activity Tn5 transposase mutant |
Citations (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
WO2010099296A1 (en) * | 2009-02-26 | 2010-09-02 | Transposagen Biopharmaceuticals, Inc. | Hyperactive piggybac transposases |
CN102421902A (en) * | 2009-02-25 | 2012-04-18 | 约翰·霍普金斯大学 | Piggybac transposon variants and methods of use |
WO2012074758A1 (en) * | 2010-11-16 | 2012-06-07 | Transposagen Bioharmaceuticals, Inc. | Hyperactive piggybac transposases |
-
2020
- 2020-09-09 CN CN202010940595.4A patent/CN112899252A/en active Pending
- 2020-12-04 WO PCT/CN2020/133796 patent/WO2021110119A1/en active Application Filing
Non-Patent Citations (2)
Title |
---|
WEN WEN, SONG SHANSHAN, HAN YUCHUN, CHEN HAIBIN, LIU XIANGZHEN, QIAN QIJUN: "An efficient Screening System in Yeast to Select a Hyperactive piggyBac Transposase for Mammalian Applications", INTERNATIONAL JOURNAL OF MOLECULAR SCIENCES, vol. 21, no. 9, 26 April 2020 (2020-04-26), XP055819123, DOI: 10.3390/ijms21093064 * |
ZHOU QINQIAN, ZHOU MINGBING: "Modification and decoration of transposase: a review", CHINESE JOURNAL OF BIOTECHNOLOGY, vol. 30, no. 10, 25 October 2014 (2014-10-25), pages 1504 - 1514, XP055819126, DOI: 10.13345/j.cjb.130653 * |
Also Published As
Publication number | Publication date |
---|---|
CN112899252A (en) | 2021-06-04 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
121 | Ep: the epo has been informed by wipo that ep was designated in this application |
Ref document number: 20895297 Country of ref document: EP Kind code of ref document: A1 |
|
NENP | Non-entry into the national phase |
Ref country code: DE |
|
122 | Ep: pct application non-entry in european phase |
Ref document number: 20895297 Country of ref document: EP Kind code of ref document: A1 |