CN116355878B - Novel TnpB programmable nuclease and application thereof - Google Patents
Novel TnpB programmable nuclease and application thereof
- Publication number
- CN116355878B (application CN202310177144.3A)
- Authority
- CN
- China
- Prior art keywords
- tnpb
- protein
- sequence
- nucleic acid
- composition
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Classifications
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12N—MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
- C12N9/00—Enzymes; Proenzymes; Compositions thereof; Processes for preparing, activating, inhibiting, separating or purifying enzymes
- C12N9/14—Hydrolases (3)
- C12N9/16—Hydrolases (3) acting on ester bonds (3.1)
- C12N9/22—Ribonucleases RNAses, DNAses
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12P—FERMENTATION OR ENZYME-USING PROCESSES TO SYNTHESISE A DESIRED CHEMICAL COMPOUND OR COMPOSITION OR TO SEPARATE OPTICAL ISOMERS FROM A RACEMIC MIXTURE
- C12P19/00—Preparation of compounds containing saccharide radicals
- C12P19/26—Preparation of nitrogen-containing carbohydrates
- C12P19/28—N-glycosides
- C12P19/30—Nucleotides
- C12P19/34—Polynucleotides, e.g. nucleic acids, oligoribonucleotides
-
- Y—GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
- Y02—TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
- Y02A—TECHNOLOGIES FOR ADAPTATION TO CLIMATE CHANGE
- Y02A50/00—TECHNOLOGIES FOR ADAPTATION TO CLIMATE CHANGE in human health protection, e.g. against extreme weather
- Y02A50/30—Against vector-borne diseases, e.g. mosquito-borne, fly-borne, tick-borne or waterborne diseases whose impact is exacerbated by climate change
Landscapes
- Life Sciences & Earth Sciences (AREA)
- Health & Medical Sciences (AREA)
- Chemical & Material Sciences (AREA)
- Organic Chemistry (AREA)
- Molecular Biology (AREA)
- Engineering & Computer Science (AREA)
- Zoology (AREA)
- Wood Science & Technology (AREA)
- Bioinformatics & Cheminformatics (AREA)
- Genetics & Genomics (AREA)
- Microbiology (AREA)
- Biotechnology (AREA)
- Biochemistry (AREA)
- General Engineering & Computer Science (AREA)
- General Health & Medical Sciences (AREA)
- Biomedical Technology (AREA)
- Medicinal Chemistry (AREA)
- Chemical Kinetics & Catalysis (AREA)
- General Chemical & Material Sciences (AREA)
- Micro-Organisms Or Cultivation Processes Thereof (AREA)
Abstract
The invention relates to a novel TnpB programmable nuclease and belongs to the field of nucleic acid editing. Specifically, the invention provides a novel TnpB programmable nuclease that has low homology with previously reported TnpB enzymes, differs markedly from them in its properties, shows nuclease activity both inside and outside cells, and has broad application prospects.
Description
Technical Field
The invention relates to a novel TnpB programmable nuclease and belongs to the field of nucleic acid editing.
Background
TnpB is a transposon-encoded protein belonging to the Ω (OMEGA) system. In recent years it has also been found to be a novel RNA-guided programmable nuclease with nucleic acid cleavage and editing activity (Karvelis T, Druteika G, Bigelyte G, et al. Transposon-associated TnpB is a programmable RNA-guided DNA endonuclease [J]. Nature, 2021, 599(7886): 692-696; Altae-Tran H, Kannan S, Demircioglu FE, et al. The widespread IS200/IS605 transposon family encodes diverse programmable RNA-guided endonucleases [J]. Science, 2021, 374(6563): 57-65). Because TnpB is smaller than Cas9, it has an advantage in cellular delivery efficiency. It can be used for genome editing and modification in vivo and in vitro and, as a new nucleic acid editing tool, shows great promise in fields such as gene therapy, species genome modification and trait improvement. Developing different types of TnpB proteins will broaden the use of this enzyme in nucleic acid editing. However, TnpB is a transposon-encoded protein that is ubiquitous in microorganisms: more than 400,000 TnpB proteins are annotated in the NCBI database, and not every one of them can serve as a tool for nucleic acid cleavage and editing.
ISDra2 TnpB from Deinococcus radiodurans is a relatively well-studied TnpB programmable nuclease. It recognizes 5'-TTGAT as its TAM sequence and, guided by a gRNA with a single-hairpin structure, cleaves the non-target and target strands at the target site to produce staggered cohesive ends, thereby enabling recognition, cleavage and editing of target nucleic acid sequences in vitro and in vivo.
To discover more TnpB programmable nucleases whose properties differ from those of ISDra2, the invention isolates several novel TnpB programmable nucleases from archaea. They are evolutionarily distant from ISDra2, with very low sequence similarity (less than 30%), and were further found to have distinctive properties in terms of the TAM sites recognized, omega RNA structure, enzyme active sites, temperature sensitivity, metal ion dependence and so on, giving them great application prospects.
Disclosure of Invention
The invention provides a composition, which is characterized by comprising the following components:
a) A TnpB protein or one or more nucleotide sequences encoding the TnpB protein;
b) An omega RNA molecule or one or more nucleotide sequences encoding the omega RNA molecule, the omega RNA molecule being capable of forming a complex with the TnpB protein of a) and directing TnpB to recognize a target sequence;
wherein the TnpB protein is selected from any one of sequences No. 1-22 listed in Table 1;
wherein the omega RNA molecule has a double-hairpin structure; optionally, the omega RNA molecule has the sequence: UUAAGAAGGACUUGACUUUGGCUGACCGUGUGUUUGUAUGUCCUAAAUGUGGUUGGACUGUAGAUCGUGACUAUAAUGCUUCUCUAAAUAUUCUUCGUGCGGGGUCGGGACUGCCCUUAGAGCCUGUGGACAGGGGACCUCUGCUAUACAUUCCCUUCUCAGAAGGGGUGUAUAGUAAGUUUCUUGGAAGAAGCAGGAAAUCUCCAUCGUGAGGUGGAGAUGCCACGUCCGUAAGGGCGGGGUUGUUCAC.
In some embodiments, the TAM (Transposon-Associated Motif) sequence 5' of the above target sequence is 5'-TTTAA, 5'-ATTAA, 5'-TCTAA, 5'-TGTAA, 5'-TATAA, 5'-TTCAA, 5'-TTAAA, 5'-TTTCA, 5'-TTTGA, 5'-TTTTA, 5'-TTTAC, 5'-TTTAG or 5'-TTTAT.
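As an illustration of how the TAM list above might be used when choosing candidate target sites, the following minimal Python sketch scans one strand of a DNA sequence for any of the listed 5-nt TAMs and reports the sequence immediately 3' of each hit. The function name find_tam_sites, the 20-nt target length and the example DNA are assumptions made for illustration only; the text above does not specify a target length, and the sketch does not scan the reverse strand or predict cleavage.

```python
# TAM variants listed in the text (located 5' of the target sequence).
TAMS = {
    "TTTAA", "ATTAA", "TCTAA", "TGTAA", "TATAA", "TTCAA", "TTAAA",
    "TTTCA", "TTTGA", "TTTTA", "TTTAC", "TTTAG", "TTTAT",
}
TAM_LEN = 5


def find_tam_sites(dna, target_len=20):
    """Return (tam_start, tam, target) for every listed TAM on the given
    strand that is followed by a full-length candidate target.

    target_len is a hypothetical parameter; 20 nt is an assumed default.
    tam_start is a 0-based index into dna.
    """
    dna = dna.upper()
    hits = []
    # Stop early enough that a full target still fits after the TAM.
    for i in range(len(dna) - TAM_LEN - target_len + 1):
        tam = dna[i:i + TAM_LEN]
        if tam in TAMS:
            target = dna[i + TAM_LEN:i + TAM_LEN + target_len]
            hits.append((i, tam, target))
    return hits


# Made-up example sequence: 3 nt of flank, one TAM, a 20-nt target, 2 nt of flank.
example = "GGC" + "TTTAA" + "CGTACGTACGTACGTACGTA" + "CC"
sites = find_tam_sites(example)
for start, tam, target in sites:
    print(start, tam, target)
```

A set lookup keeps the scan linear in the sequence length; overlapping TAMs (for example 5'-TTTAA followed by A, which also contains 5'-TTAAA) are reported as separate candidates.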
In some embodiments, the above composition further comprises one or more metal ions; the metal ions include magnesium ions, manganese ions or calcium ions; the concentration of the metal ions is 10 mM. Adding these metal ions can enhance the activity of the above TnpB.
The invention also provides a vector system capable of encoding the above composition.
The invention also provides an engineered host cell containing the above composition; in some embodiments, the host cell is a microbial cell, such as a Pediococcus acidilactici, Lactobacillus reuteri or E. coli cell; an animal cell, such as a HEK293T cell; or a plant cell.
The invention also provides use of the composition, the vector system and the host cell in the field of nucleic acid recognition or modification.
The invention also provides use of the composition, the vector system and the host cell in the field of resistance to phage infection.
The invention also provides a nucleic acid recognition or modification method, characterized in that a target sequence and the above composition are placed in an environment at 37-85 ℃; in some embodiments, the ambient temperature is 37 ℃, 42 ℃, 55 ℃, 65 ℃, 75 ℃ or 85 ℃; in some embodiments, the ambient temperature is 75 ℃. The TnpB provided by the invention tolerates a broad temperature range and can therefore be used in host cells cultured at different temperatures. Its enzymatic activity is highest at 75 ℃.
The invention also provides a TnpB mutant protein, characterized in that it comprises any one of sequences No. 1-22 listed in Table 1 and carries a mutation at the position corresponding to D187 and/or E271 of sequence No. 6. D187 and E271 of protein No. 6 (SiRe_0632) are key sites for enzyme activity, and mutating them severely impairs activity. Both sites are conserved across the 22 SiRe proteins provided by the invention, so the corresponding site in any of these proteins can be located by protein sequence alignment and then mutated to obtain the mutant protein.
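Locating the residue that corresponds to a reference position such as D187 in another protein amounts to walking along a pairwise alignment and counting non-gap characters in both sequences. The Python sketch below shows this bookkeeping under stated assumptions: the function name map_position is hypothetical, the alignment is assumed to have been produced beforehand by any alignment tool, and the short aligned strings are toy data, not sequences from Table 1.

```python
def map_position(ref_aln, qry_aln, ref_pos):
    """Map a 1-based residue position in the ungapped reference onto the
    query, given two gapped sequences of equal length from one alignment.

    Returns (query_position, query_residue), or None when the reference
    residue is aligned to a gap in the query.
    """
    if len(ref_aln) != len(qry_aln):
        raise ValueError("aligned sequences must have equal length")
    ref_count = 0  # residues seen so far in the reference
    qry_count = 0  # residues seen so far in the query
    for ref_char, qry_char in zip(ref_aln, qry_aln):
        if ref_char != "-":
            ref_count += 1
        if qry_char != "-":
            qry_count += 1
        if ref_char != "-" and ref_count == ref_pos:
            if qry_char == "-":
                return None
            return (qry_count, qry_char)
    raise ValueError("ref_pos is beyond the end of the reference")


# Toy alignment (illustrative only): the query has one extra residue.
ref_aln = "MK-DLE"
qry_aln = "MKADLE"
print(map_position(ref_aln, qry_aln, 3))  # reference D3 maps to query D4
```

In practice one would also check that the mapped query residue is the expected conserved amino acid (D or E here) before designing the mutation.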
The invention also provides use of the mutant protein in the field of nucleic acid recognition or modification. The mutant protein loses its enzymatic activity, so it can still recognize the target sequence under the direction of the ωRNA but cannot cleave it. Combined with an adenine or cytosine deaminase, it can be developed into a base editor that achieves single-base editing of a target sequence.
The beneficial effects of the invention are as follows: the invention provides a novel TnpB enzyme that has low homology with the reported TnpB enzyme ISDra2, differs markedly from it in its properties, shows nuclease activity both inside and outside cells, has distinctive properties in terms of the TAM sites recognized, omega RNA structure, enzyme active sites, temperature sensitivity, metal ion dependence and so on, and has broad application prospects.
Drawings
FIG. 1 Structure and evolutionary relationships of the newly isolated TnpB proteins. A: Schematic diagram of the TnpB structure. LE: left element sequence; RE: right element sequence. B: Evolutionary relationship between the 22 SiRe-class TnpB proteins and the published ISDra2 TnpB protein.
FIG. 2 TAM sequences, omega RNA sequences and structures of the SiRe-class TnpB proteins. A: TAM and LE. B: Omega RNA scaffold and guide sequences. C: Omega RNA structure. D: SiRe protein purification.
FIG. 3 Enzyme activity at different temperatures. A: Cleavage of plasmid by SisTnpB1 at different temperatures. L: molecular weight standard; Ctrl: blank control; OC: open-circular (nicked) plasmid; FLL: linear plasmid; SC: supercoiled plasmid; 37-85: temperatures in ℃. B: Enzyme activity at different temperatures, quantified from the FLL band. C: Specificity of cleavage. Target: the target sequence. D: Metal ion specificity. -: no metal ions. E: Enzyme active site analysis. F: Cleavage pattern on double-stranded DNA. Arrows indicate the cleavage positions.
FIG. 4. SisTnpB1 cleaves both dsDNA and ssDNA. A, B: SisTnpB1 cleaves dsDNA carrying TAM and the target sequence at 75 ℃. C, D: SisTnpB1 cleaves dsDNA carrying TAM and the target sequence at 37 ℃. E, F (75 ℃) and G, H (37 ℃): SisTnpB1 cleavage of dsDNA carrying the target sequence but without TAM. I, J: SisTnpB1 cleavage of dsDNA without target sequence and TAM at 75 ℃. K, L: SisTnpB1 cleaves ssDNA carrying the target sequence with TAM (K) and without TAM (L) at 75 ℃. M, N: SisTnpB1 cleaves ssDNA carrying the target sequence with TAM (M) and without TAM (N) at 37 ℃. O, P: SisTnpB1 cleavage of ssDNA without the target sequence and TAM at 75 ℃. FAM: fluorescent label.
FIG. 5. Exploring the seed sequence and TAM diversity by base mutation. A: Schematic of the pairing between the SisTnpB1 guide RNA and the target sequence. B: SisTnpB1 cleavage after mutation of blocks of 5 consecutive bases in the target sequence. C: SisTnpB1 cleavage after single-base mutations at positions +1 to +10 of the target sequence. D: SisTnpB1 cleavage after single-base mutations at positions -1 to -5 of the TAM. Ctrl: control without SisTnpB1.
FIG. 6. Editing of the bacterial genome using SisTnpB1.
A: Schematic of interference with two endogenous plasmids of Pediococcus acidilactici by an exogenous plasmid carrying SisTnpB1 and its guide RNA. B: SisTnpB1 interference with the plasmids carrying GE00037 and GE00033, respectively; GE00014+GE00039 are genes common to both plasmids. C: Schematic of gene knockout at two target sites in the pyrE gene of the Pediococcus acidilactici genome by an exogenous plasmid carrying SisTnpB1 and its guide RNA. D: Results of SisTnpB1 gene knockout at the two target sites in the pyrE gene. E: Sequencing of the edited transformants.
Detailed Description
The following definitions and methods are provided to better define the present application and to guide those of ordinary skill in the art in the practice of the present application. Unless otherwise indicated, terms are to be construed according to conventional usage by those of ordinary skill in the relevant art. All patent documents, academic papers, industry standards, and other publications cited herein are incorporated by reference in their entirety.
Unless otherwise indicated, nucleic acids are written in the 5' to 3' direction from left to right; amino acid sequences are written in the amino to carboxyl direction from left to right. Amino acids may be represented herein by their commonly known three-letter symbols or by the single-letter symbols recommended by the IUPAC-IUB Biochemical Nomenclature Commission. Likewise, nucleotides may be referred to by their commonly accepted single-letter codes. Numerical ranges include the numbers defining the range. As used herein, "nucleic acid" includes reference to deoxyribonucleotide or ribonucleotide polymers in either single- or double-stranded form and, unless otherwise limited, includes known analogs (e.g., peptide nucleic acids) having the basic properties of natural nucleotides that hybridize to single-stranded nucleic acids in a manner similar to naturally occurring nucleotides. As used herein, the term "encode" or "encoded", when used in the context of a particular nucleic acid, means that the nucleic acid contains the necessary information to direct translation of the nucleotide sequence into a particular protein. The information encoding the protein is represented using codons. As used herein, reference to a "full-length sequence" of a particular polynucleotide or the protein encoded thereby refers to an entire nucleic acid sequence or an entire amino acid sequence having a natural (non-synthetic) endogenous sequence. The full-length polynucleotide encodes the full-length, catalytically active form of the particular protein. The terms "polypeptide", "peptide" and "protein" are used interchangeably herein to refer to a polymer of amino acid residues. The terms apply to amino acid polymers in which one or more amino acid residues are artificial chemical analogs of the corresponding naturally occurring amino acid, as well as to naturally occurring amino acid polymers.
The terms "residue" or "amino acid" are used interchangeably herein to refer to an amino acid that is incorporated into a protein, polypeptide, or peptide (collectively, "protein"). Amino acids may be naturally occurring amino acids, and unless otherwise limited, may include known analogs of natural amino acids, which analogs may function in a similar manner to naturally occurring amino acids.
In the present application, the terms "comprises," "comprising," or variations thereof, are to be understood to encompass other elements, numbers, or steps in addition to those described. "subject plant" or "subject plant cell" refers to a plant or plant cell in which genetic engineering has been effected, or a progeny cell of a plant or cell so engineered, which progeny cell comprises the engineering. "control" or "control plants" provide a reference point for measuring phenotypic changes in a subject plant.
Those skilled in the art will readily recognize that advances in the field of molecular biology, such as site-specific and random mutagenesis, polymerase chain reaction methods, and protein engineering techniques, provide a wide range of suitable tools and procedures for altering or engineering the amino acid sequences and underlying genetic sequences of proteins of interest.
In some embodiments, the nucleotide sequences of the present application may be altered to make conservative amino acid substitutions. The principles and examples of conservative amino acid substitutions are described further below. In certain embodiments, substitutions may be made to the nucleotide sequences of the present application in accordance with published species codon preferences without altering the amino acid sequence. In some embodiments, a portion of the nucleotide sequence in the present application is replaced with different codons encoding the same amino acids, so that the nucleotide sequence changes while the encoded amino acid sequence does not. Conservative variants include those sequences that, due to the degeneracy of the genetic code, encode the amino acid sequence of one of the proteins of an embodiment. Those skilled in the art will recognize that amino acid additions and/or substitutions are generally based on the relative similarity of the amino acid side chain substituents, e.g., their hydrophobicity, charge, size, etc. Exemplary groups of amino acids with similar properties that may be substituted for one another are well known to those skilled in the art and include arginine and lysine; glutamic acid and aspartic acid; serine and threonine; glutamine and asparagine; and valine, leucine and isoleucine. Guidelines for suitable amino acid substitutions that do not affect the biological activity of the protein of interest can be found in the model of Dayhoff et al. (1978), Atlas of Protein Sequence and Structure (Natl. Biomed. Res. Foundation, Washington, D.C.), incorporated herein by reference. Conservative substitutions, such as substitution of one amino acid for another with similar properties, may be made. Identification of sequence identity includes hybridization techniques.
For example, all or part of a known nucleotide sequence is used as a probe for selective hybridization with other corresponding nucleotide sequences present in a cloned genomic DNA fragment or population of cDNA fragments (i.e., a genomic library or cDNA library) from a selected organism. The hybridization probes may be genomic DNA fragments, cDNA fragments, RNA fragments, or other oligonucleotides, and may be labeled with a detectable group such as 32P or other detectable marker. Thus, for example, hybridization probes can be prepared by labeling synthetic oligonucleotides based on the sequences of the embodiments. Methods for preparing hybridization probes and constructing cDNA and genomic libraries are generally known in the art. Hybridization of the sequences may be performed under stringent conditions. As used herein, the term "stringent conditions" or "stringent hybridization conditions" refers to conditions under which a probe will hybridize to its target sequence to a detectably greater extent (e.g., at least 2-fold, 5-fold, or 10-fold over background) relative to hybridization to other sequences. Stringent conditions are sequence-dependent and will be different in different circumstances. By controlling the hybridization stringency and/or controlling the washing conditions, target sequences 100% complementary to the probes can be identified (homologous probe method). Alternatively, stringent conditions can be adjusted to allow for some sequence mismatches in order to detect lower similarity (heterologous probe method). Typically, the probe is less than about 1000 or 500 nucleotides in length. 
Typically, stringent conditions are those in which the salt concentration is less than about 1.5 M Na ion, typically about 0.01 M to 1.0 M Na ion concentration (or other salt) at pH 7.0 to 8.3, and the temperature is at least about 30 ℃ for short probes (e.g., 10 to 50 nucleotides) and at least about 60 ℃ for long probes (e.g., greater than 50 nucleotides). Stringent conditions can also be achieved with the addition of destabilizing agents such as formamide. Exemplary low stringency conditions include hybridization at 37 ℃ in a buffer of 30% to 35% formamide, 1 M NaCl, 1% SDS (sodium dodecyl sulfate), and washing in 1× to 2× SSC (20× SSC = 3.0 M NaCl/0.3 M trisodium citrate) at 50 ℃ to 55 ℃. Exemplary moderately stringent conditions include hybridization in 40% to 45% formamide, 1.0 M NaCl, 1% SDS at 37 ℃ and washing in 0.5× to 1× SSC at 55 ℃ to 60 ℃. Exemplary high stringency conditions include hybridization in 50% formamide, 1 M NaCl, 1% SDS at 37 ℃ and a final wash in 0.1× SSC at 60 ℃ to 65 ℃ for at least about 20 minutes. Optionally, the wash buffer may comprise about 0.1% to about 1% SDS. The duration of hybridization is generally less than about 24 hours, usually from about 4 hours to about 12 hours. Specificity generally depends on the post-hybridization washes, the key factors being the ionic strength and temperature of the final wash solution. The Tm (thermal melting point) of DNA-DNA hybrids can be approximated from the formula of Meinkoth and Wahl (1984) Anal. Biochem. 138:267-284: Tm = 81.5 ℃ + 16.6 (log M) + 0.41 (%GC) - 0.61 (% formamide) - 500/L, where M is the molar concentration of monovalent cations, %GC is the percentage of guanosine and cytosine nucleotides in the DNA, % formamide is the percentage of formamide in the hybridization solution, and L is the length of the hybrid in base pairs.
Tm is the temperature (at a defined ionic strength and pH) at which 50% of the complementary target sequence hybridizes to a perfectly matched probe. Washing is typically performed at least until equilibrium is reached and a low hybridization background level is achieved, such as for 2 hours, 1 hour, or 30 minutes. Each 1% of mismatch corresponds to a decrease in Tm of about 1 ℃; thus, Tm, hybridization, and/or wash conditions can be adjusted to hybridize to sequences of the desired identity. For example, if sequences with ≥90% identity are desired, the Tm can be reduced by 10 ℃. Typically, stringent conditions are selected to be about 5 ℃ lower than the Tm for the specific sequence and its complement at a defined ionic strength and pH. However, under very stringent conditions, hybridization and/or washing may be performed at 4 ℃ below the Tm; under moderately stringent conditions, at 6 ℃ below the Tm; and under low stringency conditions, at 11 ℃ below the Tm.
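The Tm approximation of Meinkoth and Wahl and the rule of about 1 ℃ per 1% mismatch can be sketched in a few lines of Python. This is only an illustrative calculation; the function names and the example probe parameters are assumptions chosen for demonstration and are not taken from the invention.

```python
import math


def tm_dna_dna(monovalent_molar, percent_gc, percent_formamide, length_bp):
    """Approximate Tm (in deg C) of a DNA-DNA hybrid.

    Implements Tm = 81.5 + 16.6*log10(M) + 0.41*(%GC)
                    - 0.61*(% formamide) - 500/L
    """
    if monovalent_molar <= 0 or length_bp <= 0:
        raise ValueError("salt concentration and length must be positive")
    return (81.5
            + 16.6 * math.log10(monovalent_molar)
            + 0.41 * percent_gc
            - 0.61 * percent_formamide
            - 500.0 / length_bp)


def tm_for_identity(tm_perfect, percent_identity):
    """Lower the Tm by about 1 deg C for every 1% of mismatch."""
    return tm_perfect - (100.0 - percent_identity)


if __name__ == "__main__":
    # Hypothetical probe: 500 bp, 50% GC, 1 M monovalent cation, no formamide.
    tm = tm_dna_dna(1.0, 50.0, 0.0, 500)
    print(f"Tm for a perfect match: {tm:.1f} C")
    print(f"Tm when 90% identity is accepted: {tm_for_identity(tm, 90.0):.1f} C")
```

Hybridization or wash temperatures can then be chosen a fixed number of degrees below the returned value, according to the stringency desired.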
In some embodiments, fragments of the nucleotide sequence and the amino acid sequence encoded thereby are also included. As used herein, the term "fragment" refers to a portion of the nucleotide sequence of a polynucleotide or a portion of the amino acid sequence of a polypeptide of an embodiment. Fragments of a nucleotide sequence may encode protein fragments that retain the biological activity of the native or corresponding full-length protein and thus have protein activity. Mutant proteins include biologically active fragments of a native protein that comprise consecutive amino acid residues that retain the biological activity of the native protein.
In the present invention, a "target sequence", "target polynucleotide" or "target nucleic acid" may be any polynucleotide that is endogenous or exogenous to a cell (e.g., a prokaryotic or eukaryotic cell). For example, the target polynucleotide may be a polynucleotide that is present in the nucleus of a eukaryotic cell. The target polynucleotide may be a sequence encoding a gene product (e.g., a protein) or a non-coding sequence (e.g., a regulatory polynucleotide or unwanted DNA). In some cases, the target sequence should be associated with a protospacer adjacent motif (PAM) or a transposon-associated motif (TAM).
"Target sequence" refers to a polynucleotide targeted by a guide sequence in a gRNA, e.g., a sequence that has complementarity to the guide sequence, wherein hybridization between the target sequence and the guide sequence will promote the formation of a CRISPR or TnpB complex (including Cas or TnpB protein and gRNA). Complete complementarity is not necessary so long as sufficient complementarity exists to cause hybridization and promote the formation of a complex. The target sequence may comprise any polynucleotide, such as DNA or RNA. In some cases, the target sequence is located either inside or outside the cell. In some cases, the target sequence is located in the nucleus or cytoplasm of the cell. In some cases, the target sequence may be located within an organelle of a eukaryotic cell, such as a mitochondria or chloroplast.
The following examples are illustrative of the application and are not intended to limit the scope of the application. Modifications and substitutions to methods, procedures, or conditions of the present application without departing from the spirit and nature of the application are intended to be within the scope of the present application. Unless otherwise indicated, the examples follow conventional experimental conditions, such as those in the molecular cloning laboratory manual of Sambrook et al. (Sambrook J & Russell DW, Molecular Cloning: A Laboratory Manual, 2001), or the conditions recommended in the manufacturer's instructions. Unless otherwise indicated, all chemical reagents used in the examples were conventional commercial reagents, and the technical means used in the examples were conventional means well known to those skilled in the art.
EXAMPLE 1 isolation of TnpB protein from archaea
Because TnpB is smaller than Cas9, it can be delivered into cells more efficiently and can be used for genome editing in vivo and in vitro. It may therefore serve as a novel nucleic acid editing tool with great application prospects in gene therapy, genome modification of species, trait improvement and other fields. Developing different types of TnpB proteins facilitates the use of this enzyme in nucleic acid editing. However, TnpB is a transposon-encoded protein that is ubiquitous in microorganisms; 40 or more kinds of TnpB proteins are annotated in the NCBI database, and not every TnpB can be used as a tool for nucleic acid cleavage and editing.
So far, RNA-guided endonucleases for genome editing have been identified only from bacteria. Bacterial insertion sequences of the IS200/IS605 and IS607 families encode RNA-guided TnpB endonucleases that have been reprogrammed for genome editing. These families of insertion sequences are also widely distributed in archaea. However, it is unclear whether archaea contain TnpB proteins that can be used for genome editing and, if so, which ones.
The present invention identified 149194 IS200/605 elements from sequenced and assembled archaeal genomes and screened out 8574 IS200/605 elements that encode both the TnpA and TnpB genes. Alignment of conserved catalytic amino acid residues showed that the TnpB proteins encoded by some of these IS200/605 elements may have programmable nuclease activity. They are evolutionarily distant from the published ISDra2 TnpB protein, with very low similarity (< 30%), and thus form a novel class of TnpB proteins that may have properties different from those of ISDra2 TnpB.
By aligning the left element (LE) sequences of IS200/605, conserved TAM sequences were identified immediately outside the conserved left element; guide RNA sequences and the guide sequences immediately adjacent at the 3' end were predicted from the conserved right element (RE) sequences. Activity experiments then showed that 22 proteins have cleavage activity, whereas other proteins, such as WP_198539375.1, WP_240570379.1, WP_240570379.1, MCL4379344.1, WP_014512934.1, WP_014513195.1 and ADX85008, have none. These 22 proteins (listed in Table 1) are therefore expected to be developed into valuable TnpB programmable nucleases.
Table 1. Isolated TnpB proteins with activity
The structure, sequence similarity and evolutionary relationship of these 22 proteins all differ significantly from those of ISDra2 (WP_010887311.1) (FIG. 1).
RNA alignment and bioinformatic analysis showed that the TAM (Transposon Associated Motif) also differs from that of ISDra2 and is 5'-TTTAA. The sequence and structure of the ωRNA likewise differ from those of ISDra2 (see FIG. 2 for details), with a unique double hairpin structure. The sequence of the ωRNA obtained is as follows:
UUAAGAAGGACUUGACUUUGGCUGACCGUGUGUUUGUAUGUCCUAAAUGUGGUUGGACUGUAGAUCGUGACUAUAAUGCUUCUCUAAAUAUUCUUCGUGCGGGGUCGGGACUGCCCUUAGAGCCUGUGGACAGGGGACCUCUGCUAUACAUUCCCUUCUCAGAAGGGGUGUAUAGUAAGUUUCUUGGAAGAAGCAGGAAAUCUCCAUCGUGAGGUGGAGAUGCCACGUCCGUAAGGGCGGGGUUGUUCAC.
Example 2 analysis of SiRe protein Properties
The invention further analyzed the properties of the SiRe proteins, mainly activity at different temperatures, cleavage specificity, dependence on metal ions, enzyme active sites and the double-stranded DNA cleavage pattern.
1. Activity at different temperatures.
One of the SiRe proteins, SiRe_0632 (designated SisTnpB1), was tested in vitro at different temperatures for cleavage of a plasmid carrying its specific target site and TAM. The specific steps were as follows:
5.4 nM SisTnpB1 RNP complex (the complex formed by SisTnpB1 and ωRNA) and 110 ng of pUC19 plasmid DNA, into which double-stranded oligonucleotides carrying different target sequences and TAM sequences had been cloned, were added to a buffer containing 10 mM Tris-HCl (pH 7.5), 1 mM DTT, 1 mM EDTA, 100 mM NaCl and 10 mM MgCl2 and reacted at different temperatures (37-85 ℃) for 60 minutes. The reaction was stopped by adding 20 mM proteinase K and 4% SDS solution and incubating at 37 ℃ for 1 hour. A 2× loading dye was then added and the cleavage reaction was analyzed by agarose or denaturing PAGE electrophoresis. The DNA fragments in the agarose gel were visualized by ethidium bromide staining, and the DNA fragments in denaturing PAGE were detected using a FUJIFILM scanner (FLA-5100).
The results showed that the programmable nuclease had specific cleavage activity from 37 to 85 ℃, with optimal activity at 75 ℃ (FIG. 3A, B).
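Enzyme activity in these assays is reported as the percentage of linear plasmid (FLL). A minimal Python sketch of that quantification is given below. It assumes that the percentage is taken relative to the summed intensity of the nicked (OC), linear (FLL) and supercoiled (SC) bands in a lane, which the text does not state explicitly; the band intensities in the example are made-up placeholder values, not measured data.

```python
def percent_linear(oc, fll, sc):
    """Percentage of plasmid in the linear (FLL) form within one gel lane.

    oc, fll, sc: background-corrected band intensities (arbitrary units)
    for the nicked, linear and supercoiled plasmid forms.
    """
    total = oc + fll + sc
    if total <= 0:
        raise ValueError("total band intensity must be positive")
    return 100.0 * fll / total


if __name__ == "__main__":
    # Placeholder intensities for illustration only: (OC, FLL, SC) per lane.
    lanes = {"lane 1": (10.0, 30.0, 60.0), "lane 2": (15.0, 70.0, 15.0)}
    for name, (oc, fll, sc) in lanes.items():
        print(f"{name}: {percent_linear(oc, fll, sc):.1f}% linear")
```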
2. Specificity of cleavage.
Cleavage specificity was measured by reacting the SisTnpB1 RNP complex, following the method described in 1 with the reaction temperature set at 75 ℃, with pUC19 plasmid DNA carrying neither the target sequence nor the TAM sequence, carrying the target sequence but no TAM sequence, or carrying both the target and TAM sequences. The programmable nuclease was found to be highly specific, showing cleavage activity only when both the TAM sequence and the target sequence were present (FIG. 3C).
3. Metal ion dependence.
The reaction temperature was set at 37 ℃, the 10 mM MgCl2 in the buffer was replaced with MnCl2, CaCl2, ZnCl2 or NiCl2, and the metal ion dependence was examined following the method described in 1. The programmable nuclease was found to accept a broad spectrum of metal ions, with the highest activity in the presence of manganese ions and lower activity in the presence of zinc or nickel (FIG. 3D). The enzyme activity can therefore be increased by adding metal ions such as magnesium, manganese and calcium, manganese ions in particular. Specific data are shown in Table 2 (enzyme activity is reflected by the percentage of linear plasmid).
Table 2. Quantified enzyme activity of SisTnpB1 with different metal ions
4. Enzyme active site.
Through sequence alignment and analysis of the SiRe proteins, the invention identified several candidate nuclease active sites in the RuvC domain. These sites were mutated individually to obtain mutant proteins, and their enzymatic activity was measured following the method described in 1 with the reaction temperature set at 37 ℃. Enzyme activity decreased significantly after mutation at the two sites D187 and E271 (FIG. 3E), demonstrating the important role of these two sites in enzyme activity. The mutant proteins lose enzymatic activity and can therefore recognize the target sequence under the direction of ωRNA but cannot cleave it. The SiRe proteins can thus be modified by mutating these two sites and, combined with an adenine or cytosine deaminase, developed into base editors that achieve single-base editing of a target sequence.
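Locating the residue in another SiRe protein that corresponds to D187 or E271 of SiRe_0632 amounts to reading a position across a sequence alignment. The Python sketch below shows one way this could be done for a pairwise alignment supplied as two gapped strings. The function name and the short toy sequences are hypothetical and serve only to illustrate the idea; they are not sequences from Table 1, and the alignment itself is assumed to come from any standard alignment tool.

```python
def map_position(ref_aln, query_aln, ref_pos):
    """Map a 1-based residue position of the reference onto the query.

    ref_aln, query_aln: the two rows of a pairwise alignment, of equal
    length, with '-' marking gaps.
    Returns (query_position, query_residue), or None if the reference
    residue is aligned to a gap in the query.
    """
    if len(ref_aln) != len(query_aln):
        raise ValueError("alignment rows must have equal length")
    ref_i = 0  # residues of the reference seen so far
    qry_i = 0  # residues of the query seen so far
    for r, q in zip(ref_aln, query_aln):
        if r != "-":
            ref_i += 1
        if q != "-":
            qry_i += 1
        if r != "-" and ref_i == ref_pos:
            return (qry_i, q) if q != "-" else None
    raise IndexError("ref_pos lies beyond the end of the reference")


if __name__ == "__main__":
    # Toy alignment (hypothetical): the query has one extra residue.
    ref = "MK-DLE"
    qry = "MKADLE"
    print(map_position(ref, qry, 3))  # reference residue 3 is D
```

With a real alignment, the same call with `ref_pos=187` or `ref_pos=271` would return the position in the second protein at which a mutation could be introduced, provided the returned residue is the conserved D or E.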
5. Double-stranded DNA cleavage scheme.
The linearized double-stranded plasmid formed after cleavage was further recovered and sequenced. The sequencing results showed that the double-stranded DNA is cut in a staggered manner, 15-18 nt from the TAM on the non-target strand and 20-28 nt from the TAM on the target strand (FIG. 3F), producing cohesive ends with 5' overhangs.
These results show that the SiRe proteins differ from ISDra2 in their activity at different temperatures, cleavage specificity, dependence on metal ions, enzyme active site and double-stranded DNA cleavage pattern, revealing some unique properties of such proteins.
Example 3 Analysis of the dependence of SisTnpB1 on target and TAM sequences
To investigate whether SisTnpB1 can cleave both double-stranded DNA (dsDNA) and single-stranded DNA (ssDNA), the present invention tested its cleavage capacity at different temperatures using short double-stranded or single-stranded oligonucleotides, with or without TAM and target sequences, as substrates. SisTnpB1 was found to rapidly cleave dsDNA carrying TAM and a 20 bp target sequence at 75 ℃, whereas cleavage of both the target strand and the non-target strand (NTS) was strongly reduced at 37 ℃. In the absence of a sequence matching the ωRNA, SisTnpB1 had no cleavage activity, showing strong specificity. On dsDNA substrates without a TAM sequence, SisTnpB1 showed only very weak cleavage of the target strand and no cleavage of the non-target strand at the optimal temperature of 75 ℃, and almost no cleavage activity at 37 ℃, indicating a more stringent TAM dependence at 37 ℃, consistent with the plasmid cleavage results. SisTnpB1 also cleaved matched single-stranded DNA at 75 ℃ whether or not TAM was present, although the presence of a TAM sequence on ssDNA substrates resulted in higher cleavage efficiency (FIG. 4).
These results indicate that SisTnpB1 is highly target-dependent over a broad range of temperatures when cleaving double-stranded DNA, and is also strongly target-dependent when cleaving single-stranded DNA.
Example 4 development of TAM site diversity
To investigate whether SisTnpB1 recognizes more diverse TAM sites, the invention introduced base mutations into the target sequence and the adjacent TAM and studied their effect on SisTnpB1 endonuclease activity (FIG. 5A). Mutations M1-M5 converted +1 GCCAA +5 of the target sequence to +1 CGGTT +5, which almost abolished DNA cleavage by SisTnpB1 (FIG. 5B), whereas the M11-M15 and M16-M20 mutations had less effect on target DNA cleavage, indicating that the seed sequence is located at positions +1 to +10 of the target sequence. In addition, M-1 to M-5, in which the TAM sequence was mutated, almost eliminated DNA cleavage (FIG. 5B).
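The M1-M5 example replaces each of the five bases with its complement (GCCAA becomes CGGTT). A short Python sketch of how such block mutants, and the single-base mutants used in the follow-up experiment, could be generated is shown below. It assumes that every mutation follows the same complement-substitution rule as M1-M5. Only the first five bases of the toy target come from the text; the remaining bases and the function names are hypothetical.

```python
# Complement substitution: A<->T and G<->C, as in GCCAA -> CGGTT.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def complement_block(seq, start, length=5):
    """Replace `length` bases beginning at 1-based `start` by their complements."""
    if start < 1 or start - 1 + length > len(seq):
        raise ValueError("block lies outside the sequence")
    i = start - 1
    return seq[:i] + seq[i:i + length].translate(COMPLEMENT) + seq[i + length:]


def block_mutants(seq, block=5):
    """Return consecutive block mutants keyed like 'M1-M5', 'M6-M10', ..."""
    mutants = {}
    for start in range(1, len(seq) - block + 2, block):
        name = f"M{start}-M{start + block - 1}"
        mutants[name] = complement_block(seq, start, block)
    return mutants


if __name__ == "__main__":
    # Positions +1 to +5 are from the text; the other 15 bases are made up.
    target = "GCCAA" + "ACGTACGTACGTACG"
    for name, mutant in block_mutants(target).items():
        print(name, mutant)
    # Single-base mutant at position +1:
    print("M1", complement_block(target, 1, 1))
```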
We then introduced single-base mutations into the seed and TAM and found that mutation of the +1 nucleotide of the target sequence strongly inhibited SisTnpB1 cleavage, while the other single-base mutations had less effect on SisTnpB1 cleavage of the target DNA (FIG. 5C), indicating that SisTnpB1 is highly tolerant of mutations in the target DNA sequence.
More importantly, SisTnpB1 can recognize a wider variety of TAM sequences, including 5'-TTTAA, 5'-ATTAA, 5'-TCTAA, 5'-TGTAA, 5'-TATAA, 5'-TTCAA, 5'-TTAAA, 5'-TTTCA, 5'-TTTGA, 5'-TTTTA, 5'-TTTAC, 5'-TTTAG and 5'-TTTAT, which greatly expands the application range of the protein.
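The list of recognized TAM sequences can be used to search a DNA sequence for candidate target sites. The Python sketch below is a simplified illustration that scans one strand only and assumes that the target lies immediately 3' of the TAM (positions +1 onward) and is 20 nt long, as in the oligonucleotide substrates described above. The function name and the example sequence are hypothetical.

```python
# TAM sequences reported as recognized by SisTnpB1 (written 5' to 3').
TAMS = (
    "TTTAA", "ATTAA", "TCTAA", "TGTAA", "TATAA", "TTCAA", "TTAAA",
    "TTTCA", "TTTGA", "TTTTA", "TTTAC", "TTTAG", "TTTAT",
)


def find_targets(dna, target_len=20, tams=TAMS):
    """List candidate sites on the given strand.

    Returns tuples (tam_start, tam, target), where tam_start is the
    0-based index of the TAM and target is the target_len bases that
    immediately follow it.
    """
    dna = dna.upper()
    tam_len = 5
    hits = []
    for i in range(len(dna) - tam_len - target_len + 1):
        tam = dna[i:i + tam_len]
        if tam in tams:
            hits.append((i, tam, dna[i + tam_len:i + tam_len + target_len]))
    return hits


if __name__ == "__main__":
    # Made-up sequence containing a single 5'-TTTAA TAM.
    example = "GG" + "TTTAA" + "GCCAA" + "A" * 15 + "CC"
    for start, tam, target in find_targets(example):
        print(start, tam, target)
```

A complete search would also scan the reverse complement strand, which is omitted here for brevity.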
EXAMPLE 5 editing of microbial, animal, plant cell genomes Using SisTnpB1
The invention selected Pediococcus acidilactici as a representative to test the effect of SisTnpB1 on editing the bacterial genome. The specific steps were as follows:
An interference plasmid encoding the SisTnpB1 RNP, carrying a guide RNA designed to target an endogenous plasmid of the bacterium (with 5'-TTTAA as TAM), was transferred into Pediococcus acidilactici. The results after transformation indicated that the endogenous plasmid was targeted, cleaved and depleted (FIG. 6A, B).
In addition, a gene knockout plasmid for the protein was designed, and two target sites with 5'-ATTTAA or 5'-TTTAT as TAM were selected in the pyrE gene. After transformation, cultures were grown at 37 ℃ and 45 ℃, respectively, and PCR amplification with primers designed around the target sites showed that the gene was knocked out with high efficiency (FIG. 6C, D); deletion of the sequence at these sites was verified by sequencing (FIG. 6E).
These results show that SisTnpB1 of the invention can be applied to interference and genome editing in bacteria; the same experimental design also gave high editing efficiency when used in Lactobacillus reuteri and Escherichia coli.
In addition, the effect of SisTnpB1 on editing animal cells was tested, with HEK293T cells as a representative. Target sequences following the optimal TAM 5'-TTTAA were selected at three sites, AGBL, EMX1 and AAVS1. High-throughput sequencing of the target sequences revealed DNA double-strand break (DSB) repair at all three SisTnpB1 targets, whereas no editing of the targets could be detected in the control HEK293T cells. These results show that SisTnpB1 can successfully edit the animal cell genome.
Likewise, maize was selected to test the effect of the protein on editing a plant genome. The maize editing system was designed to target regions adjacent to the optimal TAM in the ms26 and wax genes. Starting 24 h after transformation, the immature embryos were incubated at 45 ℃ for 4 hours per day over 3 days. Embryos incubated at 37 ℃ served as a control; these were treated for the same length of time as the 45 ℃ group and were kept at 37 ℃ throughout the experiment. After the treatment, embryos were collected and the target regions were subjected to high-throughput sequencing, which showed that SisTnpB1 generated targeted mutations at both the ms26 and wax sites in the 45 ℃ treatment, indicating successful editing in plant cells.
Example 6 Use of SisTnpB1 to combat phage infection
SisTnpB1 may also be used to combat phage infection.
An interference plasmid with a guide RNA specifically targeting the E. coli phage genome was designed. The plasmid was transferred into the E. coli Rosetta strain, which was grown in liquid culture to an OD of 0.6-0.8 and plated. Ten-fold serial dilutions of E. coli phages T5 and T7 were spotted on different areas of the plates; after about 8 hours, E. coli containing SisTnpB1 and the guide RNA was found to be resistant to phage infection compared with the control.
Tests showed that the other SiRe proteins besides SisTnpB1 have the same characteristics, functions and technical effects.
While the invention has been described in detail in the foregoing general description and with reference to specific embodiments thereof, it will be apparent to one skilled in the art that modifications and improvements can be made thereto. Accordingly, such modifications or improvements may be made without departing from the spirit of the invention and are intended to be within the scope of the invention as claimed.
Claims (12)
1. A composition, comprising:
a) A TnpB protein or one or more nucleic acid molecules encoding the TnpB protein;
b) An omega RNA molecule or one or more nucleic acid molecules encoding the omega RNA molecule, the omega RNA molecule capable of forming a complex with a) the TnpB protein and directing TnpB to recognize a target nucleic acid;
wherein the TnpB protein is the protein with sequence number 6 in Table 1;
wherein the omega RNA molecule has a double hairpin structure and has the sequence: UUAAGAAGGACUUGACUUUGGCUGACCGUGUGUUUGUAUGUCCUAAAUGUGGUUGGACUGUAGAUCGUGACUAUAAUGCUUCUCUAAAUAUUCUUCGUGCGGGGUCGGGACUGCCCUUAGAGCCUGUGGACAGGGGACCUCUGCUAUACAUUCCCUUCUCAGAAGGGGUGUAUAGUAAGUUUCUUGGAAGAAGCAGGAAAUCUCCAUCGUGAGGUGGAGAUGCCACGUCCGUAAGGGCGGGGUUGUUCAC.
2. The composition of claim 1, wherein the composition further comprises one or more metal ions.
3. The composition of claim 2, wherein the metal ions comprise magnesium ions or manganese ions or calcium ions.
4. A composition according to claim 3, wherein the concentration of the metal ions is 10mM.
5. A vector system capable of encoding the composition of any one of claims 1-4.
6. An engineered host cell comprising the composition of any one of claims 1-4, said host cell being a microbial cell or an animal cell.
7. Use of the composition of any one of claims 1-4, or the vector system of claim 5, or the host cell of claim 6 in the field of nucleic acid recognition or modification for non-disease diagnosis and therapeutic purposes.
8. The use of claim 7, wherein the use comprises targeted cleavage of double-stranded DNA, single-stranded DNA, or targeted recognition of a target nucleic acid.
9. The use of claim 6, further comprising combating phage infection.
10. A method of nucleic acid recognition or modification, characterized in that a target nucleic acid and the composition of any one of claims 1-4 are placed in an environment of 37-85 ℃, said method being a method for non-disease diagnosis and treatment purposes.
11. The method of claim 10, wherein the ambient temperature is 37 ℃ or 42 ℃ or 55 ℃ or 65 ℃ or 75 ℃ or 85 ℃.
12. The method of claim 11, wherein the ambient temperature is 75 ℃.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202310177144.3A CN116355878B (en) | 2023-02-28 | 2023-02-28 | Novel TnpB programmable nuclease and application thereof |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202310177144.3A CN116355878B (en) | 2023-02-28 | 2023-02-28 | Novel TnpB programmable nuclease and application thereof |
Publications (2)
Publication Number | Publication Date |
---|---|
CN116355878A CN116355878A (en) | 2023-06-30 |
CN116355878B CN116355878B (en) | 2024-04-26 |
Family
ID=86939038
Family Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202310177144.3A (Active) CN116355878B (en) | 2023-02-28 | 2023-02-28 | Novel TnpB programmable nuclease and application thereof |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN116355878B (en) |
Families Citing this family (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN117512071B (en) * | 2023-09-15 | 2024-06-04 | Hubei University | High temperature-resistant TnpB protein and application thereof in nucleic acid detection |
CN116970590B (en) * | 2023-09-22 | 2024-01-30 | Beijing Kefulande Bioscience Co., Ltd. (北京科芙兰德生物科学有限责任公司) | Super mini-gene editor smaller than 380 amino acids and application thereof |
Citations (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN108513582A (en) * | 2015-06-18 | 2018-09-07 | Broad Institute, Inc. | Novel CRISPR enzymes and systems |
CN108738326A (en) * | 2015-12-29 | 2018-11-02 | Monsanto Technology LLC | Novel CRISPR-associated transposase and use thereof |
CN113881652A (en) * | 2020-11-11 | 2022-01-04 | Shandong Shunfeng Biotechnology Co., Ltd. | Novel Cas enzymes and systems and uses |
CN114517190A (en) * | 2021-02-05 | 2022-05-20 | Shandong Shunfeng Biotechnology Co., Ltd. | CRISPR enzymes and systems and uses |
CA3204429A1 (en) * | 2021-01-07 | 2022-07-14 | Feng Zhang | Dna nuclease guided transposase compositions and methods of use thereof |
Patent Citations (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN108513582A (en) * | 2015-06-18 | 2018-09-07 | Broad Institute, Inc. | Novel CRISPR enzymes and systems |
CN108738326A (en) * | 2015-12-29 | 2018-11-02 | Monsanto Technology LLC | Novel CRISPR-associated transposase and use thereof |
CN115216459A (en) * | 2015-12-29 | 2022-10-21 | Monsanto Technology LLC | Novel CRISPR-associated transposase and use thereof |
CN113881652A (en) * | 2020-11-11 | 2022-01-04 | Shandong Shunfeng Biotechnology Co., Ltd. | Novel Cas enzymes and systems and uses |
CA3204429A1 (en) * | 2021-01-07 | 2022-07-14 | Feng Zhang | Dna nuclease guided transposase compositions and methods of use thereof |
CN114517190A (en) * | 2021-02-05 | 2022-05-20 | Shandong Shunfeng Biotechnology Co., Ltd. | CRISPR enzymes and systems and uses |
Non-Patent Citations (2)
Title |
---|
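CRISPR-associated transposases and their application in bacterial genome editing; Zhou Xiaojie; Biotechnology Bulletin (生物技术通报); 2022-12-31; Vol. 39 (No. 4); full text * |
Transposon-associated TnpB is a programmable RNA-guided DNA endonuclease; Tautvydas Karvelis; Nature; 2021-10-07; Vol. 599; full text * |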
Also Published As
Publication number | Publication date |
---|---|
CN116355878A (en) | 2023-06-30 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN116355878B (en) | Novel TnpB programmable nuclease and application thereof | |
US20210017507A1 (en) | Methods and compositions for sequences guiding cas9 targeting | |
JP7429057B2 (en) | Methods and compositions for sequences that guide CAS9 targeting | |
KR102084186B1 (en) | Method of identifying genome-wide off-target sites of base editors by detecting single strand breaks in genomic DNA | |
CA2975166C (en) | Crispr hybrid dna/rna polynucleotides and methods of use | |
ES2713503T3 (en) | Use of RNA-guided FOKI nucleases (RFN) to increase the specificity for editing the RNA-guided genome | |
CN110799525A (en) | Variants of CPF1(CAS12a) with altered PAM specificity | |
CN109804066A (en) | Programmable CAS9- recombination enzyme fusion proteins and application thereof | |
WO2015168404A1 (en) | Toehold-gated guide rna for programmable cas9 circuitry with rna input | |
JP7308380B2 (en) | Methods for in vitro site-directed mutagenesis using gene editing technology | |
US11761001B2 (en) | Mbp_Argonaute proteins from prokaryotes and applications thereof | |
KR20210042130A (en) | ACIDAMINOCOCCUS SP. A novel mutation that enhances the DNA cleavage activity of CPF1 | |
WO2018035466A1 (en) | Targeted mutagenesis | |
CN108998406B (en) | Human primary culture cell genome editing and site-specific gene knock-in method | |
CN116606839A (en) | Nucleic acid cutting system based on Argonaute protein Tcago and application thereof | |
CN117737034A (en) | TnpB editing system and application thereof | |
AU2020261071A1 (en) | Engineered Cas9 with broadened DNA targeting range | |
US20220186254A1 (en) | Argonaute proteins from prokaryotes and applications thereof | |
RU2749307C1 (en) | New compact type ii cas9 nuclease from anoxybacillus flavithermus | |
WO2024017189A1 (en) | Tnpb-based genome editor | |
Esquerra et al. | Identification of the EH CRISPR-Cas9 system on a metagenome and its application to genome engineering | |
US20230075913A1 (en) | Codon-optimized cas9 endonuclease encoding polynucleotide | |
CN115210380A (en) | Thermostable mismatch endonuclease variants | |
WO2023039434A1 (en) | Systems and methods for transposing cargo nucleotide sequences | |
CN117925694A (en) | Method for improving salt tolerance and drought resistance of soybean and application thereof |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||