CN111088234B - Double-stranded DNA peptide ligase dDPlaseII and use method thereof
- Publication number
- CN111088234B (application number CN202010077328.9A)
- Authority
- CN
- China
- Prior art keywords
- leu
- ligase
- ser
- arg
- dna
- Legal status
- Expired - Fee Related
- 108090000765 processed proteins & peptides Proteins 0.000 title claims abstract description 214
- 108020004414 DNA Proteins 0.000 title claims abstract description 177
- 108090000364 Ligases Proteins 0.000 title claims abstract description 158
- 102000003960 Ligases Human genes 0.000 title claims abstract description 157
- 102000053602 DNA Human genes 0.000 title claims abstract description 71
- 238000000034 method Methods 0.000 title abstract description 23
- 125000003275 alpha amino acid group Chemical group 0.000 claims abstract 2
- 102000004196 processed proteins & peptides Human genes 0.000 abstract description 148
- 229920001184 polypeptide Polymers 0.000 abstract description 144
- 238000006243 chemical reaction Methods 0.000 abstract description 93
- 102000004190 Enzymes Human genes 0.000 abstract description 48
- 108090000790 Enzymes Proteins 0.000 abstract description 48
- 108090000623 proteins and genes Proteins 0.000 abstract description 45
- 239000005547 deoxyribonucleotide Substances 0.000 abstract description 44
- 238000010521 absorption reaction Methods 0.000 abstract description 41
- 125000002637 deoxyribonucleotide group Chemical group 0.000 abstract description 38
- 239000000872 buffer Substances 0.000 abstract description 36
- 230000035484 reaction time Effects 0.000 abstract description 14
- QKNYBSVHEMOAJP-UHFFFAOYSA-N 2-amino-2-(hydroxymethyl)propane-1,3-diol;hydron;chloride Chemical compound Cl.OCC(N)(CO)CO QKNYBSVHEMOAJP-UHFFFAOYSA-N 0.000 abstract description 13
- FAPWRFPIFSIZLT-UHFFFAOYSA-M Sodium chloride Chemical compound [Na+].[Cl-] FAPWRFPIFSIZLT-UHFFFAOYSA-M 0.000 abstract description 12
- 230000026731 phosphorylation Effects 0.000 abstract description 12
- 238000006366 phosphorylation reaction Methods 0.000 abstract description 12
- 108091026890 Coding region Proteins 0.000 abstract description 9
- 238000010353 genetic engineering Methods 0.000 abstract description 6
- 239000011780 sodium chloride Substances 0.000 abstract description 6
- 229920004890 Triton X-100 Polymers 0.000 abstract description 5
- 239000013504 Triton X-100 Substances 0.000 abstract description 5
- 230000000694 effects Effects 0.000 abstract description 4
- 238000012252 genetic analysis Methods 0.000 abstract description 4
- 108091032973 (ribonucleotides)n+m Proteins 0.000 description 66
- 229940088598 enzyme Drugs 0.000 description 45
- 239000000047 product Substances 0.000 description 42
- 108091028664 Ribonucleotide Proteins 0.000 description 31
- 239000002336 ribonucleotide Substances 0.000 description 31
- 239000012634 fragment Substances 0.000 description 29
- 239000000758 substrate Substances 0.000 description 26
- 125000002652 ribonucleotide group Chemical group 0.000 description 25
- 108010059892 Cellulase Proteins 0.000 description 23
- 150000001413 amino acids Chemical group 0.000 description 23
- 150000007523 nucleic acids Chemical class 0.000 description 21
- 239000007795 chemical reaction product Substances 0.000 description 20
- 108020004682 Single-Stranded DNA Proteins 0.000 description 19
- YBJHBAHKTGYVGT-ZKWXMUAHSA-N (+)-Biotin Chemical compound N1C(=O)N[C@@H]2[C@H](CCCCC(=O)O)SC[C@@H]21 YBJHBAHKTGYVGT-ZKWXMUAHSA-N 0.000 description 18
- 108020004999 messenger RNA Proteins 0.000 description 17
- 230000014509 gene expression Effects 0.000 description 16
- 102000039446 nucleic acids Human genes 0.000 description 16
- 108020004707 nucleic acids Proteins 0.000 description 16
- 239000002773 nucleotide Substances 0.000 description 16
- 101150003160 X gene Proteins 0.000 description 15
- 125000003729 nucleotide group Chemical group 0.000 description 15
- 230000004048 modification Effects 0.000 description 14
- 238000012986 modification Methods 0.000 description 14
- 125000001433 C-terminal amino-acid group Chemical group 0.000 description 12
- 238000002835 absorbance Methods 0.000 description 12
- 238000004458 analytical method Methods 0.000 description 11
- 239000007853 buffer solution Substances 0.000 description 11
- 230000000295 complement effect Effects 0.000 description 10
- 238000004949 mass spectrometry Methods 0.000 description 10
- 229960002685 biotin Drugs 0.000 description 9
- 235000020958 biotin Nutrition 0.000 description 9
- 239000011616 biotin Substances 0.000 description 9
- 230000018044 dehydration Effects 0.000 description 9
- 238000006297 dehydration reaction Methods 0.000 description 9
- 230000002255 enzymatic effect Effects 0.000 description 9
- 102000004169 proteins and genes Human genes 0.000 description 9
- XLYOFNOQVPJJNP-UHFFFAOYSA-N water Chemical compound O XLYOFNOQVPJJNP-UHFFFAOYSA-N 0.000 description 9
- ROHFNLRQFUQHCH-YFKPBYRVSA-N L-leucine Chemical compound CC(C)C[C@H](N)C(O)=O ROHFNLRQFUQHCH-YFKPBYRVSA-N 0.000 description 8
- UCOCBWDBHCUPQP-DCAQKATOSA-N Leu-Arg-Ser Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CO)C(O)=O UCOCBWDBHCUPQP-DCAQKATOSA-N 0.000 description 8
- ROHFNLRQFUQHCH-UHFFFAOYSA-N Leucine Natural products CC(C)CC(N)C(O)=O ROHFNLRQFUQHCH-UHFFFAOYSA-N 0.000 description 8
- 108091028043 Nucleic acid sequence Proteins 0.000 description 8
- 238000006482 condensation reaction Methods 0.000 description 8
- 239000008367 deionised water Substances 0.000 description 8
- 229910021641 deionized water Inorganic materials 0.000 description 8
- RWSXRVCMGQZWBV-WDSKDSINSA-N glutathione Chemical compound OC(=O)[C@@H](N)CCC(=O)N[C@@H](CS)C(=O)NCC(O)=O RWSXRVCMGQZWBV-WDSKDSINSA-N 0.000 description 8
- 229910052799 carbon Inorganic materials 0.000 description 7
- 229940106157 cellulase Drugs 0.000 description 7
- 238000006911 enzymatic reaction Methods 0.000 description 7
- 238000007169 ligase reaction Methods 0.000 description 7
- 238000000746 purification Methods 0.000 description 7
- 108010038745 tryptophylglycine Proteins 0.000 description 7
- 101710173438 Late L2 mu core protein Proteins 0.000 description 6
- KZNQNBZMBZJQJO-UHFFFAOYSA-N N-glycyl-L-proline Natural products NCC(=O)N1CCCC1C(O)=O KZNQNBZMBZJQJO-UHFFFAOYSA-N 0.000 description 6
- NHCKESBLOMHIIE-IRXDYDNUSA-N Phe-Gly-Phe Chemical compound C([C@H](N)C(=O)NCC(=O)N[C@@H](CC=1C=CC=CC=1)C(O)=O)C1=CC=CC=C1 NHCKESBLOMHIIE-IRXDYDNUSA-N 0.000 description 6
- 101710188315 Protein X Proteins 0.000 description 6
- BLGHHPHXVJWCNK-GUBZILKMSA-N Ala-Gln-Leu Chemical compound [H]N[C@@H](C)C(=O)N[C@@H](CCC(N)=O)C(=O)N[C@@H](CC(C)C)C(O)=O BLGHHPHXVJWCNK-GUBZILKMSA-N 0.000 description 5
- 241000588724 Escherichia coli Species 0.000 description 5
- 108010090804 Streptavidin Proteins 0.000 description 5
- 108010047495 alanylglycine Proteins 0.000 description 5
- 239000000306 component Substances 0.000 description 5
- 150000004712 monophosphates Chemical class 0.000 description 5
- 108010031719 prolyl-serine Proteins 0.000 description 5
- 108010061238 threonyl-glycine Proteins 0.000 description 5
- 239000001226 triphosphate Substances 0.000 description 5
- 235000011178 triphosphate Nutrition 0.000 description 5
- UNXRWKVEANCORM-UHFFFAOYSA-N triphosphoric acid Chemical compound OP(O)(=O)OP(O)(=O)OP(O)(O)=O UNXRWKVEANCORM-UHFFFAOYSA-N 0.000 description 5
- KDRNOBUWMVLVFH-UHFFFAOYSA-N 2-methyl-n-(2,2,6,6-tetramethylpiperidin-4-yl)prop-2-enamide Chemical compound CC(=C)C(=O)NC1CC(C)(C)NC(C)(C)C1 KDRNOBUWMVLVFH-UHFFFAOYSA-N 0.000 description 4
- 229930024421 Adenine Natural products 0.000 description 4
- OTOXOKCIIQLMFH-KZVJFYERSA-N Arg-Ala-Thr Chemical compound C[C@@H](O)[C@@H](C(O)=O)NC(=O)[C@H](C)NC(=O)[C@@H](N)CCCN=C(N)N OTOXOKCIIQLMFH-KZVJFYERSA-N 0.000 description 4
- GHNDBBVSWOWYII-LPEHRKFASA-N Arg-Asn-Pro Chemical compound C1C[C@@H](N(C1)C(=O)[C@H](CC(=O)N)NC(=O)[C@H](CCCN=C(N)N)N)C(=O)O GHNDBBVSWOWYII-LPEHRKFASA-N 0.000 description 4
- CRCCTGPNZUCAHE-DCAQKATOSA-N Arg-His-Ser Chemical compound NC(N)=NCCC[C@H](N)C(=O)N[C@H](C(=O)N[C@@H](CO)C(O)=O)CC1=CN=CN1 CRCCTGPNZUCAHE-DCAQKATOSA-N 0.000 description 4
- HCIUUZGFTDTEGM-NAKRPEOUSA-N Arg-Ile-Cys Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CS)C(=O)O)NC(=O)[C@H](CCCN=C(N)N)N HCIUUZGFTDTEGM-NAKRPEOUSA-N 0.000 description 4
- RYQSYXFGFOTJDJ-RHYQMDGZSA-N Arg-Thr-Leu Chemical compound [H]N[C@@H](CCCNC(N)=N)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CC(C)C)C(O)=O RYQSYXFGFOTJDJ-RHYQMDGZSA-N 0.000 description 4
- WONGRTVAMHFGBE-WDSKDSINSA-N Asn-Gly-Gln Chemical compound C(CC(=O)N)[C@@H](C(=O)O)NC(=O)CNC(=O)[C@H](CC(=O)N)N WONGRTVAMHFGBE-WDSKDSINSA-N 0.000 description 4
- CHRCKSPMGYDLIA-SRVKXCTJSA-N Cys-Phe-Ser Chemical compound [H]N[C@@H](CS)C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CO)C(O)=O CHRCKSPMGYDLIA-SRVKXCTJSA-N 0.000 description 4
- NGWIXHCFVSSVHX-IHPCNDPISA-N Cys-Tyr-Trp Chemical compound [H]N[C@@H](CS)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](CC1=CNC2=C1C=CC=C2)C(O)=O NGWIXHCFVSSVHX-IHPCNDPISA-N 0.000 description 4
- MQANCSUBSBJNLU-KKUMJFAQSA-N Gln-Arg-Tyr Chemical compound [H]N[C@@H](CCC(N)=O)C(=O)N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(O)=O MQANCSUBSBJNLU-KKUMJFAQSA-N 0.000 description 4
- WOMUDRVDJMHTCV-DCAQKATOSA-N Glu-Arg-Arg Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CCCNC(N)=N)C(O)=O WOMUDRVDJMHTCV-DCAQKATOSA-N 0.000 description 4
- 108010024636 Glutathione Proteins 0.000 description 4
- KTSZUNRRYXPZTK-BQBZGAKWSA-N Gly-Gln-Glu Chemical compound NCC(=O)N[C@@H](CCC(N)=O)C(=O)N[C@@H](CCC(O)=O)C(O)=O KTSZUNRRYXPZTK-BQBZGAKWSA-N 0.000 description 4
- KMSGYZQRXPUKGI-BYPYZUCNSA-N Gly-Gly-Asn Chemical compound NCC(=O)NCC(=O)N[C@H](C(O)=O)CC(N)=O KMSGYZQRXPUKGI-BYPYZUCNSA-N 0.000 description 4
- IRJWAYCXIYUHQE-WHFBIAKZSA-N Gly-Ser-Ala Chemical compound OC(=O)[C@H](C)NC(=O)[C@H](CO)NC(=O)CN IRJWAYCXIYUHQE-WHFBIAKZSA-N 0.000 description 4
- VYUXYMRNGALHEA-DLOVCJGASA-N His-Leu-Ala Chemical compound [H]N[C@@H](CC1=CNC=N1)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](C)C(O)=O VYUXYMRNGALHEA-DLOVCJGASA-N 0.000 description 4
- YAALVYQFVJNXIV-KKUMJFAQSA-N His-Leu-Leu Chemical compound CC(C)C[C@@H](C(O)=O)NC(=O)[C@H](CC(C)C)NC(=O)[C@@H](N)CC1=CN=CN1 YAALVYQFVJNXIV-KKUMJFAQSA-N 0.000 description 4
- 101000740112 Homo sapiens Membrane-associated transporter protein Proteins 0.000 description 4
- 108700039609 IRW peptide Proteins 0.000 description 4
- MLSUZXHSNRBDCI-CYDGBPFRSA-N Ile-Pro-Val Chemical compound CC[C@H](C)[C@@H](C(=O)N1CCC[C@H]1C(=O)N[C@@H](C(C)C)C(=O)O)N MLSUZXHSNRBDCI-CYDGBPFRSA-N 0.000 description 4
- IBMVEYRWAWIOTN-UHFFFAOYSA-N L-Leucyl-L-Arginyl-L-Proline Natural products CC(C)CC(N)C(=O)NC(CCCN=C(N)N)C(=O)N1CCCC1C(O)=O IBMVEYRWAWIOTN-UHFFFAOYSA-N 0.000 description 4
- WSGXUIQTEZDVHJ-GARJFASQSA-N Leu-Ala-Pro Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H](C)C(=O)N1CCC[C@@H]1C(O)=O WSGXUIQTEZDVHJ-GARJFASQSA-N 0.000 description 4
- IBMVEYRWAWIOTN-RWMBFGLXSA-N Leu-Arg-Pro Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H](CCCN=C(N)N)C(=O)N1CCC[C@@H]1C(O)=O IBMVEYRWAWIOTN-RWMBFGLXSA-N 0.000 description 4
- MYGQXVYRZMKRDB-SRVKXCTJSA-N Leu-Asp-Lys Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@H](C(O)=O)CCCCN MYGQXVYRZMKRDB-SRVKXCTJSA-N 0.000 description 4
- YORLGJINWYYIMX-KKUMJFAQSA-N Leu-Cys-Phe Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CS)C(=O)N[C@@H](CC1=CC=CC=C1)C(O)=O YORLGJINWYYIMX-KKUMJFAQSA-N 0.000 description 4
- QNBVTHNJGCOVFA-AVGNSLFASA-N Leu-Leu-Glu Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H](CC(C)C)C(=O)N[C@H](C(O)=O)CCC(O)=O QNBVTHNJGCOVFA-AVGNSLFASA-N 0.000 description 4
- BRTVHXHCUSXYRI-CIUDSAMLSA-N Leu-Ser-Ser Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H](CO)C(=O)N[C@@H](CO)C(O)=O BRTVHXHCUSXYRI-CIUDSAMLSA-N 0.000 description 4
- QWWPYKKLXWOITQ-VOAKCMCISA-N Leu-Thr-Leu Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@H](C(O)=O)CC(C)C QWWPYKKLXWOITQ-VOAKCMCISA-N 0.000 description 4
- KLSUAWUZBMAZCL-RHYQMDGZSA-N Leu-Thr-Pro Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H]([C@@H](C)O)C(=O)N1CCC[C@H]1C(O)=O KLSUAWUZBMAZCL-RHYQMDGZSA-N 0.000 description 4
- QESXLSQLQHHTIX-RHYQMDGZSA-N Leu-Val-Thr Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H](C(C)C)C(=O)N[C@@H]([C@@H](C)O)C(O)=O QESXLSQLQHHTIX-RHYQMDGZSA-N 0.000 description 4
- ONPDTSFZAIWMDI-AVGNSLFASA-N Lys-Leu-Gln Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CCC(N)=O)C(O)=O ONPDTSFZAIWMDI-AVGNSLFASA-N 0.000 description 4
- 102100037258 Membrane-associated transporter protein Human genes 0.000 description 4
- AHZNUGRZHMZGFL-GUBZILKMSA-N Met-Arg-Ser Chemical compound CSCC[C@H](N)C(=O)N[C@H](C(=O)N[C@@H](CO)C(O)=O)CCCNC(N)=N AHZNUGRZHMZGFL-GUBZILKMSA-N 0.000 description 4
- NKDSBBBPGIVWEI-RCWTZXSCSA-N Met-Arg-Thr Chemical compound [H]N[C@@H](CCSC)C(=O)N[C@@H](CCCNC(N)=N)C(=O)N[C@@H]([C@@H](C)O)C(O)=O NKDSBBBPGIVWEI-RCWTZXSCSA-N 0.000 description 4
- YBAFDPFAUTYYRW-UHFFFAOYSA-N N-L-alpha-glutamyl-L-leucine Natural products CC(C)CC(C(O)=O)NC(=O)C(N)CCC(O)=O YBAFDPFAUTYYRW-UHFFFAOYSA-N 0.000 description 4
- OPEVYHFJXLCCRT-AVGNSLFASA-N Phe-Gln-Ser Chemical compound [H]N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CCC(N)=O)C(=O)N[C@@H](CO)C(O)=O OPEVYHFJXLCCRT-AVGNSLFASA-N 0.000 description 4
- UXQFHEKRGHYJRA-STQMWFEESA-N Phe-Met-Gly Chemical compound [H]N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CCSC)C(=O)NCC(O)=O UXQFHEKRGHYJRA-STQMWFEESA-N 0.000 description 4
- VZKBJNBZMZHKRC-XUXIUFHCSA-N Pro-Ile-Leu Chemical compound [H]N1CCC[C@H]1C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H](CC(C)C)C(O)=O VZKBJNBZMZHKRC-XUXIUFHCSA-N 0.000 description 4
- 108010003201 RGH 0205 Proteins 0.000 description 4
- AZWNCEBQZXELEZ-FXQIFTODSA-N Ser-Pro-Ser Chemical compound OC[C@H](N)C(=O)N1CCC[C@H]1C(=O)N[C@@H](CO)C(O)=O AZWNCEBQZXELEZ-FXQIFTODSA-N 0.000 description 4
- RXUOAOOZIWABBW-XGEHTFHBSA-N Ser-Thr-Arg Chemical compound OC[C@H](N)C(=O)N[C@@H]([C@H](O)C)C(=O)N[C@H](C(O)=O)CCCN=C(N)N RXUOAOOZIWABBW-XGEHTFHBSA-N 0.000 description 4
- LAIUAVGWZYTBKN-VHWLVUOQSA-N Trp-Asn-Ile Chemical compound CC[C@H](C)[C@H](NC(=O)[C@H](CC(N)=O)NC(=O)[C@@H](N)Cc1c[nH]c2ccccc12)C(O)=O LAIUAVGWZYTBKN-VHWLVUOQSA-N 0.000 description 4
- NOBINHCGDUHOBV-NAZCDGGXSA-N Trp-His-Thr Chemical compound [H]N[C@@H](CC1=CNC2=C1C=CC=C2)C(=O)N[C@@H](CC1=CNC=N1)C(=O)N[C@@H]([C@@H](C)O)C(O)=O NOBINHCGDUHOBV-NAZCDGGXSA-N 0.000 description 4
- ZLFHAAGHGQBQQN-AEJSXWLSSA-N Val-Ala-Pro Chemical compound C[C@@H](C(=O)N1CCC[C@@H]1C(=O)O)NC(=O)[C@H](C(C)C)N ZLFHAAGHGQBQQN-AEJSXWLSSA-N 0.000 description 4
- ZLFHAAGHGQBQQN-GUBZILKMSA-N Val-Ala-Pro Natural products CC(C)[C@H](N)C(=O)N[C@@H](C)C(=O)N1CCC[C@H]1C(O)=O ZLFHAAGHGQBQQN-GUBZILKMSA-N 0.000 description 4
- 101710086987 X protein Proteins 0.000 description 4
- 229960000643 adenine Drugs 0.000 description 4
- 108010018691 arginyl-threonyl-arginine Proteins 0.000 description 4
- 108010010430 asparagine-proline-alanine Proteins 0.000 description 4
- 239000011324 bead Substances 0.000 description 4
- 230000015572 biosynthetic process Effects 0.000 description 4
- 210000004899 c-terminal region Anatomy 0.000 description 4
- 150000001721 carbon Chemical group 0.000 description 4
- 239000001913 cellulose Substances 0.000 description 4
- 229920002678 cellulose Polymers 0.000 description 4
- 238000010276 construction Methods 0.000 description 4
- 230000029087 digestion Effects 0.000 description 4
- 108010078144 glutaminyl-glycine Proteins 0.000 description 4
- 229960003180 glutathione Drugs 0.000 description 4
- 108010072405 glycyl-aspartyl-glycine Proteins 0.000 description 4
- 108010082286 glycyl-seryl-alanine Proteins 0.000 description 4
- 238000001819 mass spectrum Methods 0.000 description 4
- 108010073025 phenylalanylphenylalanine Proteins 0.000 description 4
- 125000002467 phosphate group Chemical group [H]OP(=O)(O[H])O[*] 0.000 description 4
- 230000008569 process Effects 0.000 description 4
- 239000011535 reaction buffer Substances 0.000 description 4
- 230000009257 reactivity Effects 0.000 description 4
- 238000011160 research Methods 0.000 description 4
- 108010026333 seryl-proline Proteins 0.000 description 4
- 238000001228 spectrum Methods 0.000 description 4
- 238000013518 transcription Methods 0.000 description 4
- 230000035897 transcription Effects 0.000 description 4
- FWMNVWWHGCHHJJ-SKKKGAJSSA-N 4-amino-1-[(2r)-6-amino-2-[[(2r)-2-[[(2r)-2-[[(2r)-2-amino-3-phenylpropanoyl]amino]-3-phenylpropanoyl]amino]-4-methylpentanoyl]amino]hexanoyl]piperidine-4-carboxylic acid Chemical compound C([C@H](C(=O)N[C@H](CC(C)C)C(=O)N[C@H](CCCCN)C(=O)N1CCC(N)(CC1)C(O)=O)NC(=O)[C@H](N)CC=1C=CC=CC=1)C1=CC=CC=C1 FWMNVWWHGCHHJJ-SKKKGAJSSA-N 0.000 description 3
- JBGSZRYCXBPWGX-BQBZGAKWSA-N Ala-Arg-Gly Chemical compound OC(=O)CNC(=O)[C@@H](NC(=O)[C@@H](N)C)CCCN=C(N)N JBGSZRYCXBPWGX-BQBZGAKWSA-N 0.000 description 3
- CXZFXHGJJPVUJE-CIUDSAMLSA-N Ala-Cys-Leu Chemical compound C[C@@H](C(=O)N[C@@H](CS)C(=O)N[C@@H](CC(C)C)C(=O)O)N CXZFXHGJJPVUJE-CIUDSAMLSA-N 0.000 description 3
- RMAWDDRDTRSZIR-ZLUOBGJFSA-N Ala-Ser-Asp Chemical compound [H]N[C@@H](C)C(=O)N[C@@H](CO)C(=O)N[C@@H](CC(O)=O)C(O)=O RMAWDDRDTRSZIR-ZLUOBGJFSA-N 0.000 description 3
- AUFACLFHBAGZEN-ZLUOBGJFSA-N Ala-Ser-Cys Chemical compound N[C@@H](C)C(=O)N[C@@H](CO)C(=O)N[C@@H](CS)C(=O)O AUFACLFHBAGZEN-ZLUOBGJFSA-N 0.000 description 3
- YYAVDNKUWLAFCV-ACZMJKKPSA-N Ala-Ser-Gln Chemical compound [H]N[C@@H](C)C(=O)N[C@@H](CO)C(=O)N[C@@H](CCC(N)=O)C(O)=O YYAVDNKUWLAFCV-ACZMJKKPSA-N 0.000 description 3
- AUIJUTGLPVHIRT-FXQIFTODSA-N Arg-Ser-Cys Chemical compound C(C[C@@H](C(=O)N[C@@H](CO)C(=O)N[C@@H](CS)C(=O)O)N)CN=C(N)N AUIJUTGLPVHIRT-FXQIFTODSA-N 0.000 description 3
- ISJWBVIYRBAXEB-CIUDSAMLSA-N Arg-Ser-Glu Chemical compound [H]N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CO)C(=O)N[C@@H](CCC(O)=O)C(O)=O ISJWBVIYRBAXEB-CIUDSAMLSA-N 0.000 description 3
- DNLQVHBBMPZUGJ-BQBZGAKWSA-N Arg-Ser-Gly Chemical compound [H]N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CO)C(=O)NCC(O)=O DNLQVHBBMPZUGJ-BQBZGAKWSA-N 0.000 description 3
- JPAWCMXVNZPJLO-IHRRRGAJSA-N Arg-Ser-Phe Chemical compound [H]N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CO)C(=O)N[C@@H](CC1=CC=CC=C1)C(O)=O JPAWCMXVNZPJLO-IHRRRGAJSA-N 0.000 description 3
- OGZBJJLRKQZRHL-KJEVXHAQSA-N Arg-Thr-Tyr Chemical compound NC(N)=NCCC[C@H](N)C(=O)N[C@@H]([C@H](O)C)C(=O)N[C@H](C(O)=O)CC1=CC=C(O)C=C1 OGZBJJLRKQZRHL-KJEVXHAQSA-N 0.000 description 3
- QCVXMEHGFUMKCO-YUMQZZPRSA-N Asp-Gly-Leu Chemical compound CC(C)C[C@@H](C(O)=O)NC(=O)CNC(=O)[C@@H](N)CC(O)=O QCVXMEHGFUMKCO-YUMQZZPRSA-N 0.000 description 3
- KYQNAIMCTRZLNP-QSFUFRPTSA-N Asp-Ile-Val Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H](C(C)C)C(O)=O KYQNAIMCTRZLNP-QSFUFRPTSA-N 0.000 description 3
- XWKBWZXGNXTDKY-ZKWXMUAHSA-N Asp-Val-Ala Chemical compound OC(=O)[C@H](C)NC(=O)[C@H](C(C)C)NC(=O)[C@@H](N)CC(O)=O XWKBWZXGNXTDKY-ZKWXMUAHSA-N 0.000 description 3
- OKTJSMMVPCPJKN-UHFFFAOYSA-N Carbon Chemical compound [C] OKTJSMMVPCPJKN-UHFFFAOYSA-N 0.000 description 3
- UXBYDFKFHMYYPL-XIRDDKMYSA-N Cys-His-Trp Chemical compound [H]N[C@@H](CS)C(=O)N[C@@H](CC1=CNC=N1)C(=O)N[C@@H](CC1=CNC2=C1C=CC=C2)C(O)=O UXBYDFKFHMYYPL-XIRDDKMYSA-N 0.000 description 3
- CUVBTVWFVIIDOC-YEPSODPASA-N Gly-Thr-Val Chemical compound CC(C)[C@@H](C(O)=O)NC(=O)[C@H]([C@@H](C)O)NC(=O)CN CUVBTVWFVIIDOC-YEPSODPASA-N 0.000 description 3
- OCRQUYDOYKCOQG-IRXDYDNUSA-N Gly-Tyr-Phe Chemical compound C([C@H](NC(=O)CN)C(=O)N[C@@H](CC=1C=CC=CC=1)C(O)=O)C1=CC=C(O)C=C1 OCRQUYDOYKCOQG-IRXDYDNUSA-N 0.000 description 3
- SDTPKSOWFXBACN-GUBZILKMSA-N His-Glu-Asp Chemical compound [H]N[C@@H](CC1=CNC=N1)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CC(O)=O)C(O)=O SDTPKSOWFXBACN-GUBZILKMSA-N 0.000 description 3
- BXOLYFJYQQRQDJ-MXAVVETBSA-N His-Leu-Ile Chemical compound CC[C@H](C)[C@@H](C(=O)O)NC(=O)[C@H](CC(C)C)NC(=O)[C@H](CC1=CN=CN1)N BXOLYFJYQQRQDJ-MXAVVETBSA-N 0.000 description 3
- 101500025353 Homo sapiens Insulin A chain Proteins 0.000 description 3
- 101500025354 Homo sapiens Insulin B chain Proteins 0.000 description 3
- ZXJFURYTPZMUNY-VKOGCVSHSA-N Ile-Arg-Trp Chemical compound C1=CC=C2C(C[C@H](NC(=O)[C@H](CCCN=C(N)N)NC(=O)[C@@H](N)[C@@H](C)CC)C(O)=O)=CNC2=C1 ZXJFURYTPZMUNY-VKOGCVSHSA-N 0.000 description 3
- SITWEMZOJNKJCH-UHFFFAOYSA-N L-alanine-L-arginine Natural products CC(N)C(=O)NC(C(O)=O)CCCNC(N)=N SITWEMZOJNKJCH-UHFFFAOYSA-N 0.000 description 3
- YOZCKMXHBYKOMQ-IHRRRGAJSA-N Leu-Arg-Lys Chemical compound CC(C)C[C@@H](C(=O)N[C@@H](CCCN=C(N)N)C(=O)N[C@@H](CCCCN)C(=O)O)N YOZCKMXHBYKOMQ-IHRRRGAJSA-N 0.000 description 3
- VKOAHIRLIUESLU-ULQDDVLXSA-N Leu-Arg-Phe Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CC1=CC=CC=C1)C(O)=O VKOAHIRLIUESLU-ULQDDVLXSA-N 0.000 description 3
- OGCQGUIWMSBHRZ-CIUDSAMLSA-N Leu-Asn-Ser Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H](CO)C(O)=O OGCQGUIWMSBHRZ-CIUDSAMLSA-N 0.000 description 3
- DZQMXBALGUHGJT-GUBZILKMSA-N Leu-Glu-Ala Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](C)C(O)=O DZQMXBALGUHGJT-GUBZILKMSA-N 0.000 description 3
- XWEVVRRSIOBJOO-SRVKXCTJSA-N Leu-Pro-Gln Chemical compound [H]N[C@@H](CC(C)C)C(=O)N1CCC[C@H]1C(=O)N[C@@H](CCC(N)=O)C(O)=O XWEVVRRSIOBJOO-SRVKXCTJSA-N 0.000 description 3
- CNWDWAMPKVYJJB-NUTKFTJISA-N Leu-Trp-Ala Chemical compound C1=CC=C2C(C[C@H](NC(=O)[C@@H](N)CC(C)C)C(=O)N[C@@H](C)C(O)=O)=CNC2=C1 CNWDWAMPKVYJJB-NUTKFTJISA-N 0.000 description 3
- SEOXPEFQEOYURL-PMVMPFDFSA-N Leu-Tyr-Trp Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](CC1=CNC2=C1C=CC=C2)C(O)=O SEOXPEFQEOYURL-PMVMPFDFSA-N 0.000 description 3
- DBXMFHGGHMXYHY-DCAQKATOSA-N Met-Leu-Ser Chemical compound CSCC[C@H](N)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CO)C(O)=O DBXMFHGGHMXYHY-DCAQKATOSA-N 0.000 description 3
- SZZBUDVXWZZPDH-BQBZGAKWSA-N Pro-Cys-Gly Chemical compound OC(=O)CNC(=O)[C@H](CS)NC(=O)[C@@H]1CCCN1 SZZBUDVXWZZPDH-BQBZGAKWSA-N 0.000 description 3
- LQZZPNDMYNZPFT-KKUMJFAQSA-N Pro-Gln-Phe Chemical compound [H]N1CCC[C@H]1C(=O)N[C@@H](CCC(N)=O)C(=O)N[C@@H](CC1=CC=CC=C1)C(O)=O LQZZPNDMYNZPFT-KKUMJFAQSA-N 0.000 description 3
- 102000006382 Ribonucleases Human genes 0.000 description 3
- 108010083644 Ribonucleases Proteins 0.000 description 3
- FMDHKPRACUXATF-ACZMJKKPSA-N Ser-Gln-Ser Chemical compound OC[C@H](N)C(=O)N[C@@H](CCC(N)=O)C(=O)N[C@@H](CO)C(O)=O FMDHKPRACUXATF-ACZMJKKPSA-N 0.000 description 3
- XNXRTQZTFVMJIJ-DCAQKATOSA-N Ser-Met-Leu Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CCSC)C(=O)N[C@@H](CC(C)C)C(O)=O XNXRTQZTFVMJIJ-DCAQKATOSA-N 0.000 description 3
- BEBVVQPDSHHWQL-NRPADANISA-N Ser-Val-Glu Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](C(C)C)C(=O)N[C@@H](CCC(O)=O)C(O)=O BEBVVQPDSHHWQL-NRPADANISA-N 0.000 description 3
- GLQFKOVWXPPFTP-VEVYYDQMSA-N Thr-Arg-Asp Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CC(O)=O)C(O)=O GLQFKOVWXPPFTP-VEVYYDQMSA-N 0.000 description 3
- XGFYGMKZKFRGAI-RCWTZXSCSA-N Thr-Val-Arg Chemical compound C[C@@H](O)[C@H](N)C(=O)N[C@@H](C(C)C)C(=O)N[C@H](C(O)=O)CCCN=C(N)N XGFYGMKZKFRGAI-RCWTZXSCSA-N 0.000 description 3
- PWONLXBUSVIZPH-RHYQMDGZSA-N Thr-Val-Lys Chemical compound C[C@H]([C@@H](C(=O)N[C@@H](C(C)C)C(=O)N[C@@H](CCCCN)C(=O)O)N)O PWONLXBUSVIZPH-RHYQMDGZSA-N 0.000 description 3
- LHTGRUZSZOIAKM-SOUVJXGZSA-N Tyr-Glu-Pro Chemical compound C1C[C@@H](N(C1)C(=O)[C@H](CCC(=O)O)NC(=O)[C@H](CC2=CC=C(C=C2)O)N)C(=O)O LHTGRUZSZOIAKM-SOUVJXGZSA-N 0.000 description 3
- XUIOBCQESNDTDE-FQPOAREZSA-N Tyr-Thr-Ala Chemical compound C[C@H]([C@@H](C(=O)N[C@@H](C)C(=O)O)NC(=O)[C@H](CC1=CC=C(C=C1)O)N)O XUIOBCQESNDTDE-FQPOAREZSA-N 0.000 description 3
- HJSLDXZAZGFPDK-ULQDDVLXSA-N Val-Phe-Leu Chemical compound CC(C)C[C@@H](C(=O)O)NC(=O)[C@H](CC1=CC=CC=C1)NC(=O)[C@H](C(C)C)N HJSLDXZAZGFPDK-ULQDDVLXSA-N 0.000 description 3
- UJMCYJKPDFQLHX-XGEHTFHBSA-N Val-Ser-Thr Chemical compound C[C@H]([C@@H](C(=O)O)NC(=O)[C@H](CO)NC(=O)[C@H](C(C)C)N)O UJMCYJKPDFQLHX-XGEHTFHBSA-N 0.000 description 3
- GUIYPEKUEMQBIK-JSGCOSHPSA-N Val-Tyr-Gly Chemical compound CC(C)[C@H](N)C(=O)N[C@@H](Cc1ccc(O)cc1)C(=O)NCC(O)=O GUIYPEKUEMQBIK-JSGCOSHPSA-N 0.000 description 3
- 108010040443 aspartyl-aspartic acid Proteins 0.000 description 3
- 108010093581 aspartyl-proline Proteins 0.000 description 3
- 230000003197 catalytic effect Effects 0.000 description 3
- 108010042598 glutamyl-aspartyl-glycine Proteins 0.000 description 3
- VPZXBVLAVMBEQI-UHFFFAOYSA-N glycyl-DL-alpha-alanine Natural products OC(=O)C(C)NC(=O)CN VPZXBVLAVMBEQI-UHFFFAOYSA-N 0.000 description 3
- 108010020688 glycylhistidine Proteins 0.000 description 3
- 238000002372 labelling Methods 0.000 description 3
- 244000005700 microbiome Species 0.000 description 3
- 108010004914 prolylarginine Proteins 0.000 description 3
- 238000012216 screening Methods 0.000 description 3
- 238000012163 sequencing technique Methods 0.000 description 3
- 239000010902 straw Substances 0.000 description 3
- PCIFXPRIFWKWLK-YUMQZZPRSA-N Ala-Gly-Leu Chemical compound CC(C)C[C@@H](C(O)=O)NC(=O)CNC(=O)[C@H](C)N PCIFXPRIFWKWLK-YUMQZZPRSA-N 0.000 description 2
- BLIMFWGRQKRCGT-YUMQZZPRSA-N Ala-Gly-Lys Chemical compound C[C@H](N)C(=O)NCC(=O)N[C@H](C(O)=O)CCCCN BLIMFWGRQKRCGT-YUMQZZPRSA-N 0.000 description 2
- DHBKYZYFEXXUAK-ONGXEEELSA-N Ala-Phe-Gly Chemical compound OC(=O)CNC(=O)[C@@H](NC(=O)[C@@H](N)C)CC1=CC=CC=C1 DHBKYZYFEXXUAK-ONGXEEELSA-N 0.000 description 2
- HOVPGJUNRLMIOZ-CIUDSAMLSA-N Ala-Ser-Leu Chemical compound CC(C)C[C@@H](C(O)=O)NC(=O)[C@H](CO)NC(=O)[C@H](C)N HOVPGJUNRLMIOZ-CIUDSAMLSA-N 0.000 description 2
- IOFVWPYSRSCWHI-JXUBOQSCSA-N Ala-Thr-Leu Chemical compound CC(C)C[C@@H](C(O)=O)NC(=O)[C@H]([C@@H](C)O)NC(=O)[C@H](C)N IOFVWPYSRSCWHI-JXUBOQSCSA-N 0.000 description 2
- ASQYTJJWAMDISW-BPUTZDHNSA-N Arg-Asp-Trp Chemical compound C1=CC=C2C(=C1)C(=CN2)C[C@@H](C(=O)O)NC(=O)[C@H](CC(=O)O)NC(=O)[C@H](CCCN=C(N)N)N ASQYTJJWAMDISW-BPUTZDHNSA-N 0.000 description 2
- AQPVUEJJARLJHB-BQBZGAKWSA-N Arg-Gly-Ala Chemical compound OC(=O)[C@H](C)NC(=O)CNC(=O)[C@@H](N)CCCN=C(N)N AQPVUEJJARLJHB-BQBZGAKWSA-N 0.000 description 2
- UHFUZWSZQKMDSX-DCAQKATOSA-N Arg-Leu-Asn Chemical compound CC(C)C[C@@H](C(=O)N[C@@H](CC(=O)N)C(=O)O)NC(=O)[C@H](CCCN=C(N)N)N UHFUZWSZQKMDSX-DCAQKATOSA-N 0.000 description 2
- FSNVAJOPUDVQAR-AVGNSLFASA-N Arg-Lys-Arg Chemical compound NC(=N)NCCC[C@H](N)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CCCNC(N)=N)C(O)=O FSNVAJOPUDVQAR-AVGNSLFASA-N 0.000 description 2
- MJINRRBEMOLJAK-DCAQKATOSA-N Arg-Lys-Asp Chemical compound OC(=O)C[C@@H](C(O)=O)NC(=O)[C@H](CCCCN)NC(=O)[C@@H](N)CCCN=C(N)N MJINRRBEMOLJAK-DCAQKATOSA-N 0.000 description 2
- OVQJAKFLFTZDNC-GUBZILKMSA-N Arg-Pro-Asp Chemical compound [H]N[C@@H](CCCNC(N)=N)C(=O)N1CCC[C@H]1C(=O)N[C@@H](CC(O)=O)C(O)=O OVQJAKFLFTZDNC-GUBZILKMSA-N 0.000 description 2
- INOIAEUXVVNJKA-XGEHTFHBSA-N Arg-Thr-Ser Chemical compound [H]N[C@@H](CCCNC(N)=N)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CO)C(O)=O INOIAEUXVVNJKA-XGEHTFHBSA-N 0.000 description 2
- ULBHWNVWSCJLCO-NHCYSSNCSA-N Arg-Val-Glu Chemical compound OC(=O)CC[C@@H](C(O)=O)NC(=O)[C@H](C(C)C)NC(=O)[C@@H](N)CCCN=C(N)N ULBHWNVWSCJLCO-NHCYSSNCSA-N 0.000 description 2
- SRUUBQBAVNQZGJ-LAEOZQHASA-N Asn-Gln-Val Chemical compound CC(C)[C@@H](C(=O)O)NC(=O)[C@H](CCC(=O)N)NC(=O)[C@H](CC(=O)N)N SRUUBQBAVNQZGJ-LAEOZQHASA-N 0.000 description 2
- AITGTTNYKAWKDR-CIUDSAMLSA-N Asn-His-Ser Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H](CC1=CNC=N1)C(=O)N[C@@H](CO)C(O)=O AITGTTNYKAWKDR-CIUDSAMLSA-N 0.000 description 2
- NCFJQJRLQJEECD-NHCYSSNCSA-N Asn-Leu-Val Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](C(C)C)C(O)=O NCFJQJRLQJEECD-NHCYSSNCSA-N 0.000 description 2
- PLTGTJAZQRGMPP-FXQIFTODSA-N Asn-Pro-Ala Chemical compound OC(=O)[C@H](C)NC(=O)[C@@H]1CCCN1C(=O)[C@@H](N)CC(N)=O PLTGTJAZQRGMPP-FXQIFTODSA-N 0.000 description 2
- ZKAOJVJQGVUIIU-GUBZILKMSA-N Asp-Pro-Arg Chemical compound OC(=O)C[C@H](N)C(=O)N1CCC[C@H]1C(=O)N[C@@H](CCCNC(N)=N)C(O)=O ZKAOJVJQGVUIIU-GUBZILKMSA-N 0.000 description 2
- XOASPVGNFAMYBD-WFBYXXMGSA-N Asp-Trp-Ala Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H](CC1=CNC2=C1C=CC=C2)C(=O)N[C@@H](C)C(O)=O XOASPVGNFAMYBD-WFBYXXMGSA-N 0.000 description 2
- 241000894006 Bacteria Species 0.000 description 2
- PORWNQWEEIOIRH-XHNCKOQMSA-N Cys-Gln-Pro Chemical compound C1C[C@@H](N(C1)C(=O)[C@H](CCC(=O)N)NC(=O)[C@H](CS)N)C(=O)O PORWNQWEEIOIRH-XHNCKOQMSA-N 0.000 description 2
- VPQZSNQICFCCSO-BJDJZHNGSA-N Cys-Leu-Ile Chemical compound [H]N[C@@H](CS)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O VPQZSNQICFCCSO-BJDJZHNGSA-N 0.000 description 2
- ALNKNYKSZPSLBD-ZDLURKLDSA-N Cys-Thr-Gly Chemical compound [H]N[C@@H](CS)C(=O)N[C@@H]([C@@H](C)O)C(=O)NCC(O)=O ALNKNYKSZPSLBD-ZDLURKLDSA-N 0.000 description 2
- 108010061982 DNA Ligases Proteins 0.000 description 2
- 108010014303 DNA-directed DNA polymerase Proteins 0.000 description 2
- 102000016928 DNA-directed DNA polymerase Human genes 0.000 description 2
- 102000004163 DNA-directed RNA polymerases Human genes 0.000 description 2
- 108090000626 DNA-directed RNA polymerases Proteins 0.000 description 2
- 108060002716 Exonuclease Proteins 0.000 description 2
- XFKUFUJECJUQTQ-CIUDSAMLSA-N Gln-Gln-Glu Chemical compound NC(=O)CC[C@H](N)C(=O)N[C@@H](CCC(N)=O)C(=O)N[C@@H](CCC(O)=O)C(O)=O XFKUFUJECJUQTQ-CIUDSAMLSA-N 0.000 description 2
- WOSRKEJQESVHGA-CIUDSAMLSA-N Glu-Arg-Ser Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CO)C(O)=O WOSRKEJQESVHGA-CIUDSAMLSA-N 0.000 description 2
- QNJNPKSWAHPYGI-JYJNAYRXSA-N Glu-Phe-Leu Chemical compound OC(=O)CC[C@H](N)C(=O)N[C@H](C(=O)N[C@@H](CC(C)C)C(O)=O)CC1=CC=CC=C1 QNJNPKSWAHPYGI-JYJNAYRXSA-N 0.000 description 2
- HHSKZJZWQFPSKN-AVGNSLFASA-N Glu-Tyr-Asp Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](CC(O)=O)C(O)=O HHSKZJZWQFPSKN-AVGNSLFASA-N 0.000 description 2
- FMNHBTKMRFVGRO-FOHZUACHSA-N Gly-Asn-Thr Chemical compound C[C@@H](O)[C@@H](C(O)=O)NC(=O)[C@H](CC(N)=O)NC(=O)CN FMNHBTKMRFVGRO-FOHZUACHSA-N 0.000 description 2
- PAWIVEIWWYGBAM-YUMQZZPRSA-N Gly-Leu-Ala Chemical compound NCC(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](C)C(O)=O PAWIVEIWWYGBAM-YUMQZZPRSA-N 0.000 description 2
- MHXKHKWHPNETGG-QWRGUYRKSA-N Gly-Lys-Leu Chemical compound [H]NCC(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CC(C)C)C(O)=O MHXKHKWHPNETGG-QWRGUYRKSA-N 0.000 description 2
- JBCLFWXMTIKCCB-UHFFFAOYSA-N H-Gly-Phe-OH Natural products NCC(=O)NC(C(O)=O)CC1=CC=CC=C1 JBCLFWXMTIKCCB-UHFFFAOYSA-N 0.000 description 2
- HXKZJLWGSWQKEA-LSJOCFKGSA-N His-Ala-Val Chemical compound CC(C)[C@@H](C(O)=O)NC(=O)[C@H](C)NC(=O)[C@@H](N)CC1=CN=CN1 HXKZJLWGSWQKEA-LSJOCFKGSA-N 0.000 description 2
- VGYOLSOFODKLSP-IHPCNDPISA-N His-Leu-Trp Chemical compound C([C@H](N)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CC=1C2=CC=CC=C2NC=1)C(O)=O)C1=CN=CN1 VGYOLSOFODKLSP-IHPCNDPISA-N 0.000 description 2
- ZDNORQNHCJUVOV-KBIXCLLPSA-N Ile-Gln-Ala Chemical compound CC[C@H](C)[C@H](N)C(=O)N[C@@H](CCC(N)=O)C(=O)N[C@@H](C)C(O)=O ZDNORQNHCJUVOV-KBIXCLLPSA-N 0.000 description 2
- MVHXGBZUJLWZOH-BJDJZHNGSA-N Leu-Ser-Ile Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CO)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O MVHXGBZUJLWZOH-BJDJZHNGSA-N 0.000 description 2
- SVBJIZVVYJYGLA-DCAQKATOSA-N Leu-Ser-Val Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CO)C(=O)N[C@@H](C(C)C)C(O)=O SVBJIZVVYJYGLA-DCAQKATOSA-N 0.000 description 2
- BTSXLXFPMZXVPR-DLOVCJGASA-N Lys-Ala-His Chemical compound C[C@@H](C(=O)N[C@@H](CC1=CN=CN1)C(=O)O)NC(=O)[C@H](CCCCN)N BTSXLXFPMZXVPR-DLOVCJGASA-N 0.000 description 2
- VHXMZJGOKIMETG-CQDKDKBSSA-N Lys-Ala-Tyr Chemical compound C[C@@H](C(=O)N[C@@H](CC1=CC=C(C=C1)O)C(=O)O)NC(=O)[C@H](CCCCN)N VHXMZJGOKIMETG-CQDKDKBSSA-N 0.000 description 2
- VKCPHIOZDWUFSW-ONGXEEELSA-N Lys-Val-Gly Chemical compound OC(=O)CNC(=O)[C@H](C(C)C)NC(=O)[C@@H](N)CCCCN VKCPHIOZDWUFSW-ONGXEEELSA-N 0.000 description 2
- KLFPZIUIXZNEKY-DCAQKATOSA-N Met-Gln-Met Chemical compound [H]N[C@@H](CCSC)C(=O)N[C@@H](CCC(N)=O)C(=O)N[C@@H](CCSC)C(O)=O KLFPZIUIXZNEKY-DCAQKATOSA-N 0.000 description 2
- RZJOHSFAEZBWLK-CIUDSAMLSA-N Met-Gln-Ser Chemical compound CSCC[C@@H](C(=O)N[C@@H](CCC(=O)N)C(=O)N[C@@H](CO)C(=O)O)N RZJOHSFAEZBWLK-CIUDSAMLSA-N 0.000 description 2
- FWAHLGXNBLWIKB-NAKRPEOUSA-N Met-Ile-Ser Chemical compound OC[C@@H](C(O)=O)NC(=O)[C@H]([C@@H](C)CC)NC(=O)[C@@H](N)CCSC FWAHLGXNBLWIKB-NAKRPEOUSA-N 0.000 description 2
- PESQCPHRXOFIPX-UHFFFAOYSA-N N-L-methionyl-L-tyrosine Natural products CSCCC(N)C(=O)NC(C(O)=O)CC1=CC=C(O)C=C1 PESQCPHRXOFIPX-UHFFFAOYSA-N 0.000 description 2
- 125000001429 N-terminal alpha-amino-acid group Chemical group 0.000 description 2
- 108700026244 Open Reading Frames Proteins 0.000 description 2
- 108020002230 Pancreatic Ribonuclease Proteins 0.000 description 2
- 102000005891 Pancreatic ribonuclease Human genes 0.000 description 2
- UHRNIXJAGGLKHP-DLOVCJGASA-N Phe-Ala-Ser Chemical compound [H]N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](C)C(=O)N[C@@H](CO)C(O)=O UHRNIXJAGGLKHP-DLOVCJGASA-N 0.000 description 2
- RAGOJJCBGXARPO-XVSYOHENSA-N Phe-Thr-Asp Chemical compound OC(=O)C[C@@H](C(O)=O)NC(=O)[C@H]([C@H](O)C)NC(=O)[C@@H](N)CC1=CC=CC=C1 RAGOJJCBGXARPO-XVSYOHENSA-N 0.000 description 2
- IHCXPSYCHXFXKT-DCAQKATOSA-N Pro-Arg-Glu Chemical compound [H]N1CCC[C@H]1C(=O)N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CCC(O)=O)C(O)=O IHCXPSYCHXFXKT-DCAQKATOSA-N 0.000 description 2
- OGRYXQOUFHAMPI-DCAQKATOSA-N Pro-Cys-His Chemical compound C1C[C@H](NC1)C(=O)N[C@@H](CS)C(=O)N[C@@H](CC2=CN=CN2)C(=O)O OGRYXQOUFHAMPI-DCAQKATOSA-N 0.000 description 2
- 102000007056 Recombinant Fusion Proteins Human genes 0.000 description 2
- 108010008281 Recombinant Fusion Proteins Proteins 0.000 description 2
- FIXILCYTSAUERA-FXQIFTODSA-N Ser-Ala-Arg Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](C)C(=O)N[C@@H](CCCNC(N)=N)C(O)=O FIXILCYTSAUERA-FXQIFTODSA-N 0.000 description 2
- HBTCFCHYALPXME-HTFCKZLJSA-N Ser-Ile-Ile Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O HBTCFCHYALPXME-HTFCKZLJSA-N 0.000 description 2
- RIAKPZVSNBBNRE-BJDJZHNGSA-N Ser-Ile-Leu Chemical compound OC[C@H](N)C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H](CC(C)C)C(O)=O RIAKPZVSNBBNRE-BJDJZHNGSA-N 0.000 description 2
- MOINZPRHJGTCHZ-MMWGEVLESA-N Ser-Ile-Pro Chemical compound CC[C@H](C)[C@@H](C(=O)N1CCC[C@@H]1C(=O)O)NC(=O)[C@H](CO)N MOINZPRHJGTCHZ-MMWGEVLESA-N 0.000 description 2
- QYSFWUIXDFJUDW-DCAQKATOSA-N Ser-Leu-Arg Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CCCNC(N)=N)C(O)=O QYSFWUIXDFJUDW-DCAQKATOSA-N 0.000 description 2
- XUDRHBPSPAPDJP-SRVKXCTJSA-N Ser-Lys-Leu Chemical compound CC(C)C[C@@H](C(O)=O)NC(=O)[C@H](CCCCN)NC(=O)[C@@H](N)CO XUDRHBPSPAPDJP-SRVKXCTJSA-N 0.000 description 2
- 108091081024 Start codon Proteins 0.000 description 2
- FHDLKMFZKRUQCE-HJGDQZAQSA-N Thr-Glu-Arg Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CCCNC(N)=N)C(O)=O FHDLKMFZKRUQCE-HJGDQZAQSA-N 0.000 description 2
- MSIYNSBKKVMGFO-BHNWBGBOSA-N Thr-Gly-Pro Chemical compound C[C@H]([C@@H](C(=O)NCC(=O)N1CCC[C@@H]1C(=O)O)N)O MSIYNSBKKVMGFO-BHNWBGBOSA-N 0.000 description 2
- VFJIWSJKZJTQII-SRVKXCTJSA-N Tyr-Asp-Ser Chemical compound [H]N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H](CO)C(O)=O VFJIWSJKZJTQII-SRVKXCTJSA-N 0.000 description 2
- YYZPVPJCOGGQPC-JYJNAYRXSA-N Tyr-His-Gln Chemical compound [H]N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](CC1=CNC=N1)C(=O)N[C@@H](CCC(N)=O)C(O)=O YYZPVPJCOGGQPC-JYJNAYRXSA-N 0.000 description 2
- QSFJHIRIHOJRKS-ULQDDVLXSA-N Tyr-Leu-Arg Chemical compound [H]N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CCCNC(N)=N)C(O)=O QSFJHIRIHOJRKS-ULQDDVLXSA-N 0.000 description 2
- SLLKXDSRVAOREO-KZVJFYERSA-N Val-Ala-Thr Chemical compound C[C@H]([C@@H](C(=O)O)NC(=O)[C@H](C)NC(=O)[C@H](C(C)C)N)O SLLKXDSRVAOREO-KZVJFYERSA-N 0.000 description 2
- 230000009471 action Effects 0.000 description 2
- 238000000137 annealing Methods 0.000 description 2
- 230000033228 biological regulation Effects 0.000 description 2
- 238000010367 cloning Methods 0.000 description 2
- 239000002299 complementary DNA Substances 0.000 description 2
- 108010016616 cysteinylglycine Proteins 0.000 description 2
- OPTASPLRGRRNAP-UHFFFAOYSA-N cytosine Natural products NC=1C=CNC(=O)N=1 OPTASPLRGRRNAP-UHFFFAOYSA-N 0.000 description 2
- 229940104302 cytosine Drugs 0.000 description 2
- 238000013461 design Methods 0.000 description 2
- 238000001514 detection method Methods 0.000 description 2
- 238000010586 diagram Methods 0.000 description 2
- 102000013165 exonuclease Human genes 0.000 description 2
- XBGGUPMXALFZOT-UHFFFAOYSA-N glycyl-L-tyrosine hemihydrate Natural products NCC(=O)NC(C(O)=O)CC1=CC=C(O)C=C1 XBGGUPMXALFZOT-UHFFFAOYSA-N 0.000 description 2
- 108010050848 glycylleucine Proteins 0.000 description 2
- 108010081551 glycylphenylalanine Proteins 0.000 description 2
- 108010085325 histidylproline Proteins 0.000 description 2
- 229910052739 hydrogen Inorganic materials 0.000 description 2
- 239000001257 hydrogen Substances 0.000 description 2
- 238000002955 isolation Methods 0.000 description 2
- 108010027338 isoleucylcysteine Proteins 0.000 description 2
- 238000005259 measurement Methods 0.000 description 2
- 230000001404 mediated effect Effects 0.000 description 2
- 229910052760 oxygen Inorganic materials 0.000 description 2
- 239000013612 plasmid Substances 0.000 description 2
- 238000002360 preparation method Methods 0.000 description 2
- 108091008146 restriction endonucleases Proteins 0.000 description 2
- 230000000717 retained effect Effects 0.000 description 2
- 238000000926 separation method Methods 0.000 description 2
- 108010048818 seryl-histidine Proteins 0.000 description 2
- 238000003786 synthesis reaction Methods 0.000 description 2
- 238000013519 translation Methods 0.000 description 2
- 108010080629 tryptophan-leucine Proteins 0.000 description 2
- 229910021642 ultra pure water Inorganic materials 0.000 description 2
- 239000012498 ultrapure water Substances 0.000 description 2
- 241001515965 unidentified phage Species 0.000 description 2
- FAAHJOLJYDXKKU-ZHDGNLTBSA-N (2s)-6-amino-2-[[(2s)-1-[(2s,3r)-2-[[(2s)-2-[[(2s)-2-[[(2s)-2-[(2-aminoacetyl)amino]-3-phenylpropanoyl]amino]-3-phenylpropanoyl]amino]-3-(4-hydroxyphenyl)propanoyl]amino]-3-hydroxybutanoyl]pyrrolidine-2-carbonyl]amino]hexanoic acid Chemical compound C([C@@H](C(=O)N[C@@H]([C@H](O)C)C(=O)N1[C@@H](CCC1)C(=O)N[C@@H](CCCCN)C(O)=O)NC(=O)[C@H](CC=1C=CC=CC=1)NC(=O)[C@H](CC=1C=CC=CC=1)NC(=O)CN)C1=CC=C(O)C=C1 FAAHJOLJYDXKKU-ZHDGNLTBSA-N 0.000 description 1
- TTXMOJWKNRJWQJ-FXQIFTODSA-N Ala-Arg-Ser Chemical compound OC[C@@H](C(O)=O)NC(=O)[C@@H](NC(=O)[C@@H](N)C)CCCN=C(N)N TTXMOJWKNRJWQJ-FXQIFTODSA-N 0.000 description 1
- RXTBLQVXNIECFP-FXQIFTODSA-N Ala-Gln-Gln Chemical compound C[C@H](N)C(=O)N[C@@H](CCC(N)=O)C(=O)N[C@@H](CCC(N)=O)C(O)=O RXTBLQVXNIECFP-FXQIFTODSA-N 0.000 description 1
- KMGOBAQSCKTBGD-DLOVCJGASA-N Ala-His-Leu Chemical compound CC(C)C[C@@H](C(O)=O)NC(=O)[C@@H](NC(=O)[C@H](C)N)CC1=CN=CN1 KMGOBAQSCKTBGD-DLOVCJGASA-N 0.000 description 1
- HUUOZYZWNCXTFK-INTQDDNPSA-N Ala-His-Pro Chemical compound C[C@@H](C(=O)N[C@@H](CC1=CN=CN1)C(=O)N2CCC[C@@H]2C(=O)O)N HUUOZYZWNCXTFK-INTQDDNPSA-N 0.000 description 1
- WNHNMKOFKCHKKD-BFHQHQDPSA-N Ala-Thr-Gly Chemical compound [H]N[C@@H](C)C(=O)N[C@@H]([C@@H](C)O)C(=O)NCC(O)=O WNHNMKOFKCHKKD-BFHQHQDPSA-N 0.000 description 1
- XKXAZPSREVUCRT-BPNCWPANSA-N Ala-Tyr-Met Chemical compound CSCC[C@@H](C(O)=O)NC(=O)[C@@H](NC(=O)[C@H](C)N)CC1=CC=C(O)C=C1 XKXAZPSREVUCRT-BPNCWPANSA-N 0.000 description 1
- XSLGWYYNOSUMRM-ZKWXMUAHSA-N Ala-Val-Asn Chemical compound [H]N[C@@H](C)C(=O)N[C@@H](C(C)C)C(=O)N[C@@H](CC(N)=O)C(O)=O XSLGWYYNOSUMRM-ZKWXMUAHSA-N 0.000 description 1
- 102000002260 Alkaline Phosphatase Human genes 0.000 description 1
- 108020004774 Alkaline Phosphatase Proteins 0.000 description 1
- DBKNLHKEVPZVQC-LPEHRKFASA-N Arg-Ala-Pro Chemical compound NC(N)=NCCC[C@H](N)C(=O)N[C@@H](C)C(=O)N1CCC[C@@H]1C(O)=O DBKNLHKEVPZVQC-LPEHRKFASA-N 0.000 description 1
- GIVATXIGCXFQQA-FXQIFTODSA-N Arg-Ala-Ser Chemical compound OC[C@@H](C(O)=O)NC(=O)[C@H](C)NC(=O)[C@@H](N)CCCN=C(N)N GIVATXIGCXFQQA-FXQIFTODSA-N 0.000 description 1
- MUXONAMCEUBVGA-DCAQKATOSA-N Arg-Arg-Gln Chemical compound NC(N)=NCCC[C@H](N)C(=O)N[C@@H](CCCN=C(N)N)C(=O)N[C@@H](CCC(N)=O)C(O)=O MUXONAMCEUBVGA-DCAQKATOSA-N 0.000 description 1
- RCAUJZASOAFTAJ-FXQIFTODSA-N Arg-Asp-Cys Chemical compound C(C[C@@H](C(=O)N[C@@H](CC(=O)O)C(=O)N[C@@H](CS)C(=O)O)N)CN=C(N)N RCAUJZASOAFTAJ-FXQIFTODSA-N 0.000 description 1
- PNIGSVZJNVUVJA-BQBZGAKWSA-N Arg-Gly-Asn Chemical compound NC(N)=NCCC[C@H](N)C(=O)NCC(=O)N[C@@H](CC(N)=O)C(O)=O PNIGSVZJNVUVJA-BQBZGAKWSA-N 0.000 description 1
- NVCIXQYNWYTLDO-IHRRRGAJSA-N Arg-His-Leu Chemical compound CC(C)C[C@@H](C(=O)O)NC(=O)[C@H](CC1=CN=CN1)NC(=O)[C@H](CCCN=C(N)N)N NVCIXQYNWYTLDO-IHRRRGAJSA-N 0.000 description 1
- NPZJLGMWMDNQDD-GHCJXIJMSA-N Asn-Ser-Ile Chemical compound [H]N[C@@H](CC(N)=O)C(=O)N[C@@H](CO)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O NPZJLGMWMDNQDD-GHCJXIJMSA-N 0.000 description 1
- JXMREEPBRANWBY-VEVYYDQMSA-N Asn-Thr-Arg Chemical compound NC(=O)C[C@H](N)C(=O)N[C@@H]([C@H](O)C)C(=O)N[C@@H](CCCNC(N)=N)C(O)=O JXMREEPBRANWBY-VEVYYDQMSA-N 0.000 description 1
- AXXCUABIFZPKPM-BQBZGAKWSA-N Asp-Arg-Gly Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H](CCCNC(N)=N)C(=O)NCC(O)=O AXXCUABIFZPKPM-BQBZGAKWSA-N 0.000 description 1
- UGKZHCBLMLSANF-CIUDSAMLSA-N Asp-Asn-Leu Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H](CC(C)C)C(O)=O UGKZHCBLMLSANF-CIUDSAMLSA-N 0.000 description 1
- DZQKLNLLWFQONU-LKXGYXEUSA-N Asp-Cys-Thr Chemical compound C[C@H]([C@@H](C(=O)O)NC(=O)[C@H](CS)NC(=O)[C@H](CC(=O)O)N)O DZQKLNLLWFQONU-LKXGYXEUSA-N 0.000 description 1
- DWOSGXZMLQNDBN-FXQIFTODSA-N Asp-Pro-Cys Chemical compound C1C[C@H](N(C1)C(=O)[C@H](CC(=O)O)N)C(=O)N[C@@H](CS)C(=O)O DWOSGXZMLQNDBN-FXQIFTODSA-N 0.000 description 1
- KNDCWFXCFKSEBM-AVGNSLFASA-N Asp-Tyr-Glu Chemical compound [H]N[C@@H](CC(O)=O)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](CCC(O)=O)C(O)=O KNDCWFXCFKSEBM-AVGNSLFASA-N 0.000 description 1
- 102000053642 Catalytic RNA Human genes 0.000 description 1
- 108090000994 Catalytic RNA Proteins 0.000 description 1
- 108020004635 Complementary DNA Proteins 0.000 description 1
- UWXFFVQPAMBETM-ZLUOBGJFSA-N Cys-Asp-Asn Chemical compound [H]N[C@@H](CS)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H](CC(N)=O)C(O)=O UWXFFVQPAMBETM-ZLUOBGJFSA-N 0.000 description 1
- DZIGZIIJIGGANI-FXQIFTODSA-N Cys-Glu-Gln Chemical compound SC[C@H](N)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CCC(N)=O)C(O)=O DZIGZIIJIGGANI-FXQIFTODSA-N 0.000 description 1
- UUOYKFNULIOCGJ-GUBZILKMSA-N Cys-Glu-His Chemical compound C1=C(NC=N1)C[C@@H](C(=O)O)NC(=O)[C@H](CCC(=O)O)NC(=O)[C@H](CS)N UUOYKFNULIOCGJ-GUBZILKMSA-N 0.000 description 1
- XTHUKRLJRUVVBF-WHFBIAKZSA-N Cys-Gly-Ser Chemical compound SC[C@H](N)C(=O)NCC(=O)N[C@@H](CO)C(O)=O XTHUKRLJRUVVBF-WHFBIAKZSA-N 0.000 description 1
- WTNLLMQAFPOCTJ-GARJFASQSA-N Cys-His-Pro Chemical compound C1C[C@@H](N(C1)C(=O)[C@H](CC2=CN=CN2)NC(=O)[C@H](CS)N)C(=O)O WTNLLMQAFPOCTJ-GARJFASQSA-N 0.000 description 1
- HBHMVBGGHDMPBF-GARJFASQSA-N Cys-Leu-Pro Chemical compound CC(C)C[C@@H](C(=O)N1CCC[C@@H]1C(=O)O)NC(=O)[C@H](CS)N HBHMVBGGHDMPBF-GARJFASQSA-N 0.000 description 1
- OZHXXYOHPLLLMI-CIUDSAMLSA-N Cys-Lys-Ala Chemical compound [H]N[C@@H](CS)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](C)C(O)=O OZHXXYOHPLLLMI-CIUDSAMLSA-N 0.000 description 1
- WTXCNOPZMQRTNN-BWBBJGPYSA-N Cys-Thr-Ser Chemical compound C[C@H]([C@@H](C(=O)N[C@@H](CO)C(=O)O)NC(=O)[C@H](CS)N)O WTXCNOPZMQRTNN-BWBBJGPYSA-N 0.000 description 1
- 102000012410 DNA Ligases Human genes 0.000 description 1
- 230000004544 DNA amplification Effects 0.000 description 1
- 102000007260 Deoxyribonuclease I Human genes 0.000 description 1
- 102000016911 Deoxyribonucleases Human genes 0.000 description 1
- 108010053770 Deoxyribonucleases Proteins 0.000 description 1
- 102100031780 Endonuclease Human genes 0.000 description 1
- HHWQMFIGMMOVFK-WDSKDSINSA-N Gln-Ala-Gly Chemical compound OC(=O)CNC(=O)[C@H](C)NC(=O)[C@@H](N)CCC(N)=O HHWQMFIGMMOVFK-WDSKDSINSA-N 0.000 description 1
- LLRJEFPKIIBGJP-DCAQKATOSA-N Gln-Glu-His Chemical compound C1=C(NC=N1)C[C@@H](C(=O)O)NC(=O)[C@H](CCC(=O)O)NC(=O)[C@H](CCC(=O)N)N LLRJEFPKIIBGJP-DCAQKATOSA-N 0.000 description 1
- JXFLPKSDLDEOQK-JHEQGTHGSA-N Gln-Gly-Thr Chemical compound C[C@@H](O)[C@@H](C(O)=O)NC(=O)CNC(=O)[C@@H](N)CCC(N)=O JXFLPKSDLDEOQK-JHEQGTHGSA-N 0.000 description 1
- IWUFOVSLWADEJC-AVGNSLFASA-N Gln-His-Leu Chemical compound [H]N[C@@H](CCC(N)=O)C(=O)N[C@@H](CC1=CNC=N1)C(=O)N[C@@H](CC(C)C)C(O)=O IWUFOVSLWADEJC-AVGNSLFASA-N 0.000 description 1
- HWEINOMSWQSJDC-SRVKXCTJSA-N Gln-Leu-Arg Chemical compound [H]N[C@@H](CCC(N)=O)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CCCNC(N)=N)C(O)=O HWEINOMSWQSJDC-SRVKXCTJSA-N 0.000 description 1
- MLSKFHLRFVGNLL-WDCWCFNPSA-N Gln-Leu-Thr Chemical compound [H]N[C@@H](CCC(N)=O)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H]([C@@H](C)O)C(O)=O MLSKFHLRFVGNLL-WDCWCFNPSA-N 0.000 description 1
- NMYFPKCIGUJMIK-GUBZILKMSA-N Gln-Met-Gln Chemical compound CSCC[C@@H](C(=O)N[C@@H](CCC(=O)N)C(=O)O)NC(=O)[C@H](CCC(=O)N)N NMYFPKCIGUJMIK-GUBZILKMSA-N 0.000 description 1
- FALJZCPMTGJOHX-SRVKXCTJSA-N Gln-Met-Leu Chemical compound [H]N[C@@H](CCC(N)=O)C(=O)N[C@@H](CCSC)C(=O)N[C@@H](CC(C)C)C(O)=O FALJZCPMTGJOHX-SRVKXCTJSA-N 0.000 description 1
- GQTNWYFWSUFFRA-KKUMJFAQSA-N Gln-Met-Tyr Chemical compound [H]N[C@@H](CCC(N)=O)C(=O)N[C@@H](CCSC)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(O)=O GQTNWYFWSUFFRA-KKUMJFAQSA-N 0.000 description 1
- MFORDNZDKAVNSR-SRVKXCTJSA-N Gln-Pro-Lys Chemical compound NCCCC[C@@H](C(O)=O)NC(=O)[C@@H]1CCCN1C(=O)[C@@H](N)CCC(N)=O MFORDNZDKAVNSR-SRVKXCTJSA-N 0.000 description 1
- KUBFPYIMAGXGBT-ACZMJKKPSA-N Gln-Ser-Ala Chemical compound [H]N[C@@H](CCC(N)=O)C(=O)N[C@@H](CO)C(=O)N[C@@H](C)C(O)=O KUBFPYIMAGXGBT-ACZMJKKPSA-N 0.000 description 1
- VEYGCDYMOXHJLS-GVXVVHGQSA-N Gln-Val-Leu Chemical compound [H]N[C@@H](CCC(N)=O)C(=O)N[C@@H](C(C)C)C(=O)N[C@@H](CC(C)C)C(O)=O VEYGCDYMOXHJLS-GVXVVHGQSA-N 0.000 description 1
- ITYRYNUZHPNCIK-GUBZILKMSA-N Glu-Ala-Leu Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](C)C(=O)N[C@@H](CC(C)C)C(O)=O ITYRYNUZHPNCIK-GUBZILKMSA-N 0.000 description 1
- ZJICFHQSPWFBKP-AVGNSLFASA-N Glu-Asn-Tyr Chemical compound [H]N[C@@H](CCC(O)=O)C(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(O)=O ZJICFHQSPWFBKP-AVGNSLFASA-N 0.000 description 1
- ZZIFPJZQHRJERU-WDSKDSINSA-N Glu-Cys-Gly Chemical compound OC(=O)CC[C@H](N)C(=O)N[C@@H](CS)C(=O)NCC(O)=O ZZIFPJZQHRJERU-WDSKDSINSA-N 0.000 description 1
- XMVLTPMCUJTJQP-FXQIFTODSA-N Glu-Gln-Cys Chemical compound C(CC(=O)O)[C@@H](C(=O)N[C@@H](CCC(=O)N)C(=O)N[C@@H](CS)C(=O)O)N XMVLTPMCUJTJQP-FXQIFTODSA-N 0.000 description 1
- VXQOONWNIWFOCS-HGNGGELXSA-N Glu-His-Ala Chemical compound C[C@@H](C(=O)O)NC(=O)[C@H](CC1=CN=CN1)NC(=O)[C@H](CCC(=O)O)N VXQOONWNIWFOCS-HGNGGELXSA-N 0.000 description 1
- JJSVALISDCNFCU-SZMVWBNQSA-N Glu-Leu-Trp Chemical compound CC(C)C[C@H](NC(=O)[C@@H](N)CCC(O)=O)C(=O)N[C@@H](Cc1c[nH]c2ccccc12)C(O)=O JJSVALISDCNFCU-SZMVWBNQSA-N 0.000 description 1
- DAHLWSFUXOHMIA-FXQIFTODSA-N Glu-Ser-Gln Chemical compound OC(=O)CC[C@H](N)C(=O)N[C@@H](CO)C(=O)N[C@@H](CCC(N)=O)C(O)=O DAHLWSFUXOHMIA-FXQIFTODSA-N 0.000 description 1
- GQGAFTPXAPKSCF-WHFBIAKZSA-N Gly-Ala-Cys Chemical compound NCC(=O)N[C@@H](C)C(=O)N[C@@H](CS)C(=O)O GQGAFTPXAPKSCF-WHFBIAKZSA-N 0.000 description 1
- GZUKEVBTYNNUQF-WDSKDSINSA-N Gly-Ala-Gln Chemical compound NCC(=O)N[C@@H](C)C(=O)N[C@@H](CCC(N)=O)C(O)=O GZUKEVBTYNNUQF-WDSKDSINSA-N 0.000 description 1
- DHDOADIPGZTAHT-YUMQZZPRSA-N Gly-Glu-Arg Chemical compound NCC(=O)N[C@@H](CCC(O)=O)C(=O)N[C@H](C(O)=O)CCCN=C(N)N DHDOADIPGZTAHT-YUMQZZPRSA-N 0.000 description 1
- ORXZVPZCPMKHNR-IUCAKERBSA-N Gly-His-Glu Chemical compound OC(=O)CC[C@@H](C(O)=O)NC(=O)[C@@H](NC(=O)CN)CC1=CNC=N1 ORXZVPZCPMKHNR-IUCAKERBSA-N 0.000 description 1
- COVXELOAORHTND-LSJOCFKGSA-N Gly-Ile-Val Chemical compound NCC(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H](C(C)C)C(O)=O COVXELOAORHTND-LSJOCFKGSA-N 0.000 description 1
- YTSVAIMKVLZUDU-YUMQZZPRSA-N Gly-Leu-Asp Chemical compound [H]NCC(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CC(O)=O)C(O)=O YTSVAIMKVLZUDU-YUMQZZPRSA-N 0.000 description 1
- PCPOYRCAHPJXII-UWVGGRQHSA-N Gly-Lys-Met Chemical compound [H]NCC(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CCSC)C(O)=O PCPOYRCAHPJXII-UWVGGRQHSA-N 0.000 description 1
- OQQKUTVULYLCDG-ONGXEEELSA-N Gly-Lys-Val Chemical compound CC(C)[C@H](NC(=O)[C@H](CCCCN)NC(=O)CN)C(O)=O OQQKUTVULYLCDG-ONGXEEELSA-N 0.000 description 1
- HJARVELKOSZUEW-YUMQZZPRSA-N Gly-Pro-Gln Chemical compound [H]NCC(=O)N1CCC[C@H]1C(=O)N[C@@H](CCC(N)=O)C(O)=O HJARVELKOSZUEW-YUMQZZPRSA-N 0.000 description 1
- NSVOVKWEKGEOQB-LURJTMIESA-N Gly-Pro-Gly Chemical compound NCC(=O)N1CCC[C@H]1C(=O)NCC(O)=O NSVOVKWEKGEOQB-LURJTMIESA-N 0.000 description 1
- HAOUOFNNJJLVNS-BQBZGAKWSA-N Gly-Pro-Ser Chemical compound NCC(=O)N1CCC[C@H]1C(=O)N[C@@H](CO)C(O)=O HAOUOFNNJJLVNS-BQBZGAKWSA-N 0.000 description 1
- KBBFOULZCHWGJX-KBPBESRZSA-N Gly-Tyr-His Chemical compound C1=CC(=CC=C1C[C@@H](C(=O)N[C@@H](CC2=CN=CN2)C(=O)O)NC(=O)CN)O KBBFOULZCHWGJX-KBPBESRZSA-N 0.000 description 1
- UYTPUPDQBNUYGX-UHFFFAOYSA-N Guanine Natural products O=C1NC(N)=NC2=C1N=CN2 UYTPUPDQBNUYGX-UHFFFAOYSA-N 0.000 description 1
- LPZUKJALYGXBIE-SRVKXCTJSA-N His-Gln-Met Chemical compound CSCC[C@@H](C(=O)O)NC(=O)[C@H](CCC(=O)N)NC(=O)[C@H](CC1=CN=CN1)N LPZUKJALYGXBIE-SRVKXCTJSA-N 0.000 description 1
- TWROVBNEHJSXDG-IHRRRGAJSA-N His-Leu-Val Chemical compound [H]N[C@@H](CC1=CNC=N1)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](C(C)C)C(O)=O TWROVBNEHJSXDG-IHRRRGAJSA-N 0.000 description 1
- HBGKOLSGLYMWSW-DCAQKATOSA-N His-Pro-Cys Chemical compound C1C[C@H](N(C1)C(=O)[C@H](CC2=CN=CN2)N)C(=O)N[C@@H](CS)C(=O)O HBGKOLSGLYMWSW-DCAQKATOSA-N 0.000 description 1
- FFKJUTZARGRVTH-KKUMJFAQSA-N His-Ser-Tyr Chemical compound [H]N[C@@H](CC1=CNC=N1)C(=O)N[C@@H](CO)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(O)=O FFKJUTZARGRVTH-KKUMJFAQSA-N 0.000 description 1
- JGFWUKYIQAEYAH-DCAQKATOSA-N His-Ser-Val Chemical compound [H]N[C@@H](CC1=CNC=N1)C(=O)N[C@@H](CO)C(=O)N[C@@H](C(C)C)C(O)=O JGFWUKYIQAEYAH-DCAQKATOSA-N 0.000 description 1
- XHQYFGPIRUHQIB-PBCZWWQYSA-N His-Thr-Asp Chemical compound OC(=O)C[C@@H](C(O)=O)NC(=O)[C@H]([C@H](O)C)NC(=O)[C@@H](N)CC1=CN=CN1 XHQYFGPIRUHQIB-PBCZWWQYSA-N 0.000 description 1
- UIRUVUUGUYCMBY-KCTSRDHCSA-N His-Trp-Ala Chemical compound C[C@@H](C(=O)O)NC(=O)[C@H](CC1=CNC2=CC=CC=C21)NC(=O)[C@H](CC3=CN=CN3)N UIRUVUUGUYCMBY-KCTSRDHCSA-N 0.000 description 1
- DURWCDDDAWVPOP-JBDRJPRFSA-N Ile-Cys-Ser Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CS)C(=O)N[C@@H](CO)C(=O)O)N DURWCDDDAWVPOP-JBDRJPRFSA-N 0.000 description 1
- WIZPFZKOFZXDQG-HTFCKZLJSA-N Ile-Ile-Ala Chemical compound CC[C@H](C)[C@H](N)C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H](C)C(O)=O WIZPFZKOFZXDQG-HTFCKZLJSA-N 0.000 description 1
- HPCFRQWLTRDGHT-AJNGGQMLSA-N Ile-Leu-Leu Chemical compound CC[C@H](C)[C@H](N)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CC(C)C)C(O)=O HPCFRQWLTRDGHT-AJNGGQMLSA-N 0.000 description 1
- XHBYEMIUENPZLY-GMOBBJLQSA-N Ile-Pro-Asn Chemical compound CC[C@H](C)[C@H](N)C(=O)N1CCC[C@H]1C(=O)N[C@@H](CC(N)=O)C(O)=O XHBYEMIUENPZLY-GMOBBJLQSA-N 0.000 description 1
- ZNOBVZFCHNHKHA-KBIXCLLPSA-N Ile-Ser-Glu Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H](CO)C(=O)N[C@@H](CCC(=O)O)C(=O)O)N ZNOBVZFCHNHKHA-KBIXCLLPSA-N 0.000 description 1
- QHUREMVLLMNUAX-OSUNSFLBSA-N Ile-Thr-Val Chemical compound CC[C@H](C)[C@@H](C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@@H](C(C)C)C(=O)O)N QHUREMVLLMNUAX-OSUNSFLBSA-N 0.000 description 1
- LHSGPCFBGJHPCY-UHFFFAOYSA-N L-leucine-L-tyrosine Natural products CC(C)CC(N)C(=O)NC(C(O)=O)CC1=CC=C(O)C=C1 LHSGPCFBGJHPCY-UHFFFAOYSA-N 0.000 description 1
- CZCSUZMIRKFFFA-CIUDSAMLSA-N Leu-Ala-Asn Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](C)C(=O)N[C@@H](CC(N)=O)C(O)=O CZCSUZMIRKFFFA-CIUDSAMLSA-N 0.000 description 1
- BQSLGJHIAGOZCD-CIUDSAMLSA-N Leu-Ala-Ser Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H](C)C(=O)N[C@@H](CO)C(O)=O BQSLGJHIAGOZCD-CIUDSAMLSA-N 0.000 description 1
- QKIBIXAQKAFZGL-GUBZILKMSA-N Leu-Cys-Gln Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CS)C(=O)N[C@@H](CCC(N)=O)C(O)=O QKIBIXAQKAFZGL-GUBZILKMSA-N 0.000 description 1
- AVEGDIAXTDVBJS-XUXIUFHCSA-N Leu-Ile-Arg Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H](CCCNC(N)=N)C(O)=O AVEGDIAXTDVBJS-XUXIUFHCSA-N 0.000 description 1
- KOSWSHVQIVTVQF-ZPFDUUQYSA-N Leu-Ile-Asp Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H](CC(O)=O)C(O)=O KOSWSHVQIVTVQF-ZPFDUUQYSA-N 0.000 description 1
- ORWTWZXGDBYVCP-BJDJZHNGSA-N Leu-Ile-Cys Chemical compound SC[C@@H](C(O)=O)NC(=O)[C@H]([C@@H](C)CC)NC(=O)[C@@H](N)CC(C)C ORWTWZXGDBYVCP-BJDJZHNGSA-N 0.000 description 1
- ZDJQVSIPFLMNOX-RHYQMDGZSA-N Leu-Thr-Arg Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@H](C(O)=O)CCCN=C(N)N ZDJQVSIPFLMNOX-RHYQMDGZSA-N 0.000 description 1
- LFSQWRSVPNKJGP-WDCWCFNPSA-N Leu-Thr-Glu Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H]([C@@H](C)O)C(=O)N[C@H](C(O)=O)CCC(O)=O LFSQWRSVPNKJGP-WDCWCFNPSA-N 0.000 description 1
- VHTIZYYHIUHMCA-JYJNAYRXSA-N Leu-Tyr-Gln Chemical compound [H]N[C@@H](CC(C)C)C(=O)N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](CCC(N)=O)C(O)=O VHTIZYYHIUHMCA-JYJNAYRXSA-N 0.000 description 1
- TUIOUEWKFFVNLH-DCAQKATOSA-N Leu-Val-Cys Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H](C(C)C)C(=O)N[C@@H](CS)C(O)=O TUIOUEWKFFVNLH-DCAQKATOSA-N 0.000 description 1
- VKVDRTGWLVZJOM-DCAQKATOSA-N Leu-Val-Ser Chemical compound CC(C)C[C@H](N)C(=O)N[C@@H](C(C)C)C(=O)N[C@@H](CO)C(O)=O VKVDRTGWLVZJOM-DCAQKATOSA-N 0.000 description 1
- GAOJCVKPIGHTGO-UWVGGRQHSA-N Lys-Arg-Gly Chemical compound [H]N[C@@H](CCCCN)C(=O)N[C@@H](CCCNC(N)=N)C(=O)NCC(O)=O GAOJCVKPIGHTGO-UWVGGRQHSA-N 0.000 description 1
- FUKDBQGFSJUXGX-RWMBFGLXSA-N Lys-Arg-Pro Chemical compound C1C[C@@H](N(C1)C(=O)[C@H](CCCN=C(N)N)NC(=O)[C@H](CCCCN)N)C(=O)O FUKDBQGFSJUXGX-RWMBFGLXSA-N 0.000 description 1
- GPJGFSFYBJGYRX-YUMQZZPRSA-N Lys-Gly-Asp Chemical compound NCCCC[C@H](N)C(=O)NCC(=O)N[C@H](C(O)=O)CC(O)=O GPJGFSFYBJGYRX-YUMQZZPRSA-N 0.000 description 1
- PINHPJWGVBKQII-SRVKXCTJSA-N Lys-Leu-Cys Chemical compound CC(C)C[C@@H](C(=O)N[C@@H](CS)C(=O)O)NC(=O)[C@H](CCCCN)N PINHPJWGVBKQII-SRVKXCTJSA-N 0.000 description 1
- LMKSBGIUPVRHEH-FXQIFTODSA-N Met-Ala-Asn Chemical compound CSCC[C@H](N)C(=O)N[C@@H](C)C(=O)N[C@H](C(O)=O)CC(N)=O LMKSBGIUPVRHEH-FXQIFTODSA-N 0.000 description 1
- FZDOBWIKRQORAC-ULQDDVLXSA-N Met-Tyr-Leu Chemical compound CC(C)C[C@@H](C(=O)O)NC(=O)[C@H](CC1=CC=C(C=C1)O)NC(=O)[C@H](CCSC)N FZDOBWIKRQORAC-ULQDDVLXSA-N 0.000 description 1
- XMBSYZWANAQXEV-UHFFFAOYSA-N N-alpha-L-glutamyl-L-phenylalanine Natural products OC(=O)CCC(N)C(=O)NC(C(O)=O)CC1=CC=CC=C1 XMBSYZWANAQXEV-UHFFFAOYSA-N 0.000 description 1
- 108010079364 N-glycylalanine Proteins 0.000 description 1
- BQVUABVGYYSDCJ-UHFFFAOYSA-N Nalpha-L-Leucyl-L-tryptophan Natural products C1=CC=C2C(CC(NC(=O)C(N)CC(C)C)C(O)=O)=CNC2=C1 BQVUABVGYYSDCJ-UHFFFAOYSA-N 0.000 description 1
- 108020004711 Nucleic Acid Probes Proteins 0.000 description 1
- 108091005461 Nucleic proteins Proteins 0.000 description 1
- 108090000119 Nucleotidyltransferases Proteins 0.000 description 1
- 102000003832 Nucleotidyltransferases Human genes 0.000 description 1
- 102000035195 Peptidases Human genes 0.000 description 1
- 108091005804 Peptidases Proteins 0.000 description 1
- JEGFCFLCRSJCMA-IHRRRGAJSA-N Phe-Arg-Ser Chemical compound C1=CC=C(C=C1)C[C@@H](C(=O)N[C@@H](CCCN=C(N)N)C(=O)N[C@@H](CO)C(=O)O)N JEGFCFLCRSJCMA-IHRRRGAJSA-N 0.000 description 1
- YCCUXNNKXDGMAM-KKUMJFAQSA-N Phe-Leu-Ser Chemical compound [H]N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CO)C(O)=O YCCUXNNKXDGMAM-KKUMJFAQSA-N 0.000 description 1
- GKRCCTYAGQPMMP-IHRRRGAJSA-N Phe-Ser-Met Chemical compound [H]N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](CO)C(=O)N[C@@H](CCSC)C(O)=O GKRCCTYAGQPMMP-IHRRRGAJSA-N 0.000 description 1
- MHNBYYFXWDUGBW-RPTUDFQQSA-N Phe-Tyr-Thr Chemical compound C[C@H]([C@@H](C(=O)O)NC(=O)[C@H](CC1=CC=C(C=C1)O)NC(=O)[C@H](CC2=CC=CC=C2)N)O MHNBYYFXWDUGBW-RPTUDFQQSA-N 0.000 description 1
- GOUWCZRDTWTODO-YDHLFZDLSA-N Phe-Val-Asn Chemical compound [H]N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H](C(C)C)C(=O)N[C@@H](CC(N)=O)C(O)=O GOUWCZRDTWTODO-YDHLFZDLSA-N 0.000 description 1
- 108010021757 Polynucleotide 5'-Hydroxyl-Kinase Proteins 0.000 description 1
- 102000008422 Polynucleotide 5'-hydroxyl-kinase Human genes 0.000 description 1
- KIZQGKLMXKGDIV-BQBZGAKWSA-N Pro-Ala-Gly Chemical compound OC(=O)CNC(=O)[C@H](C)NC(=O)[C@@H]1CCCN1 KIZQGKLMXKGDIV-BQBZGAKWSA-N 0.000 description 1
- CYQQWUPHIZVCNY-GUBZILKMSA-N Pro-Arg-Ser Chemical compound [H]N1CCC[C@H]1C(=O)N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CO)C(O)=O CYQQWUPHIZVCNY-GUBZILKMSA-N 0.000 description 1
- SBYVDRLQAGENMY-DCAQKATOSA-N Pro-Asn-His Chemical compound C1C[C@H](NC1)C(=O)N[C@@H](CC(=O)N)C(=O)N[C@@H](CC2=CN=CN2)C(=O)O SBYVDRLQAGENMY-DCAQKATOSA-N 0.000 description 1
- JARJPEMLQAWNBR-GUBZILKMSA-N Pro-Asp-Arg Chemical compound [H]N1CCC[C@H]1C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H](CCCNC(N)=N)C(O)=O JARJPEMLQAWNBR-GUBZILKMSA-N 0.000 description 1
- DXTOOBDIIAJZBJ-BQBZGAKWSA-N Pro-Gly-Ser Chemical compound [H]N1CCC[C@H]1C(=O)NCC(=O)N[C@@H](CO)C(O)=O DXTOOBDIIAJZBJ-BQBZGAKWSA-N 0.000 description 1
- OFGUOWQVEGTVNU-DCAQKATOSA-N Pro-Lys-Ala Chemical compound [H]N1CCC[C@H]1C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](C)C(O)=O OFGUOWQVEGTVNU-DCAQKATOSA-N 0.000 description 1
- KLOQCCRTPHPIFN-DCAQKATOSA-N Pro-Met-Gln Chemical compound CSCC[C@@H](C(=O)N[C@@H](CCC(=O)N)C(=O)O)NC(=O)[C@@H]1CCCN1 KLOQCCRTPHPIFN-DCAQKATOSA-N 0.000 description 1
- SXJOPONICMGFCR-DCAQKATOSA-N Pro-Ser-Lys Chemical compound C1C[C@H](NC1)C(=O)N[C@@H](CO)C(=O)N[C@@H](CCCCN)C(=O)O SXJOPONICMGFCR-DCAQKATOSA-N 0.000 description 1
- VDHGTOHMHHQSKG-JYJNAYRXSA-N Pro-Val-Phe Chemical compound CC(C)[C@H](NC(=O)[C@@H]1CCCN1)C(=O)N[C@@H](Cc1ccccc1)C(O)=O VDHGTOHMHHQSKG-JYJNAYRXSA-N 0.000 description 1
- FIODMZKLZFLYQP-GUBZILKMSA-N Pro-Val-Ser Chemical compound [H]N1CCC[C@H]1C(=O)N[C@@H](C(C)C)C(=O)N[C@@H](CO)C(O)=O FIODMZKLZFLYQP-GUBZILKMSA-N 0.000 description 1
- 239000004365 Protease Substances 0.000 description 1
- 102000055027 Protein Methyltransferases Human genes 0.000 description 1
- 108700040121 Protein Methyltransferases Proteins 0.000 description 1
- 108010092799 RNA-directed DNA polymerase Proteins 0.000 description 1
- 108020004511 Recombinant DNA Proteins 0.000 description 1
- 238000012300 Sequence Analysis Methods 0.000 description 1
- MMGJPDWSIOAGTH-ACZMJKKPSA-N Ser-Ala-Gln Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](C)C(=O)N[C@@H](CCC(N)=O)C(O)=O MMGJPDWSIOAGTH-ACZMJKKPSA-N 0.000 description 1
- JPIDMRXXNMIVKY-VZFHVOOUSA-N Ser-Ala-Thr Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](C)C(=O)N[C@@H]([C@@H](C)O)C(O)=O JPIDMRXXNMIVKY-VZFHVOOUSA-N 0.000 description 1
- QVOGDCQNGLBNCR-FXQIFTODSA-N Ser-Arg-Ser Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CO)C(O)=O QVOGDCQNGLBNCR-FXQIFTODSA-N 0.000 description 1
- OYEDZGNMSBZCIM-XGEHTFHBSA-N Ser-Arg-Thr Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CCCNC(N)=N)C(=O)N[C@@H]([C@@H](C)O)C(O)=O OYEDZGNMSBZCIM-XGEHTFHBSA-N 0.000 description 1
- RDFQNDHEHVSONI-ZLUOBGJFSA-N Ser-Asn-Ser Chemical compound OC[C@H](N)C(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H](CO)C(O)=O RDFQNDHEHVSONI-ZLUOBGJFSA-N 0.000 description 1
- DSGYZICNAMEJOC-AVGNSLFASA-N Ser-Glu-Phe Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CC1=CC=CC=C1)C(O)=O DSGYZICNAMEJOC-AVGNSLFASA-N 0.000 description 1
- XERQKTRGJIKTRB-CIUDSAMLSA-N Ser-His-Asp Chemical compound OC(=O)C[C@@H](C(O)=O)NC(=O)[C@@H](NC(=O)[C@H](CO)N)CC1=CN=CN1 XERQKTRGJIKTRB-CIUDSAMLSA-N 0.000 description 1
- KCNSGAMPBPYUAI-CIUDSAMLSA-N Ser-Leu-Asn Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CC(N)=O)C(O)=O KCNSGAMPBPYUAI-CIUDSAMLSA-N 0.000 description 1
- FPCGZYMRFFIYIH-CIUDSAMLSA-N Ser-Lys-Ser Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CO)C(O)=O FPCGZYMRFFIYIH-CIUDSAMLSA-N 0.000 description 1
- KZPRPBLHYMZIMH-MXAVVETBSA-N Ser-Phe-Ile Chemical compound [H]N[C@@H](CO)C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O KZPRPBLHYMZIMH-MXAVVETBSA-N 0.000 description 1
- SGZVZUCRAVSPKQ-FXQIFTODSA-N Ser-Val-Cys Chemical compound CC(C)[C@@H](C(=O)N[C@@H](CS)C(=O)O)NC(=O)[C@H](CO)N SGZVZUCRAVSPKQ-FXQIFTODSA-N 0.000 description 1
- WFUAUEQXPVNAEF-ZJDVBMNYSA-N Thr-Arg-Thr Chemical compound C[C@@H](O)[C@H](N)C(=O)N[C@H](C(=O)N[C@@H]([C@@H](C)O)C(O)=O)CCCN=C(N)N WFUAUEQXPVNAEF-ZJDVBMNYSA-N 0.000 description 1
- CTONFVDJYCAMQM-IUKAMOBKSA-N Thr-Asn-Ile Chemical compound CC[C@H](C)[C@@H](C(=O)O)NC(=O)[C@H](CC(=O)N)NC(=O)[C@H]([C@@H](C)O)N CTONFVDJYCAMQM-IUKAMOBKSA-N 0.000 description 1
- NLSNVZAREYQMGR-HJGDQZAQSA-N Thr-Asp-Leu Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H](CC(C)C)C(O)=O NLSNVZAREYQMGR-HJGDQZAQSA-N 0.000 description 1
- MPUMPERGHHJGRP-WEDXCCLWSA-N Thr-Gly-Lys Chemical compound C[C@H]([C@@H](C(=O)NCC(=O)N[C@@H](CCCCN)C(=O)O)N)O MPUMPERGHHJGRP-WEDXCCLWSA-N 0.000 description 1
- VTVVYQOXJCZVEB-WDCWCFNPSA-N Thr-Leu-Glu Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CCC(O)=O)C(O)=O VTVVYQOXJCZVEB-WDCWCFNPSA-N 0.000 description 1
- AHERARIZBPOMNU-KATARQTJSA-N Thr-Ser-Leu Chemical compound [H]N[C@@H]([C@@H](C)O)C(=O)N[C@@H](CO)C(=O)N[C@@H](CC(C)C)C(O)=O AHERARIZBPOMNU-KATARQTJSA-N 0.000 description 1
- RERIQEJUYCLJQI-QRTARXTBSA-N Trp-Asp-Val Chemical compound CC(C)[C@@H](C(=O)O)NC(=O)[C@H](CC(=O)O)NC(=O)[C@H](CC1=CNC2=CC=CC=C21)N RERIQEJUYCLJQI-QRTARXTBSA-N 0.000 description 1
- AIISTODACBDQLW-WDSOQIARSA-N Trp-Leu-Arg Chemical compound C1=CC=C2C(C[C@H](N)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CCCN=C(N)N)C(O)=O)=CNC2=C1 AIISTODACBDQLW-WDSOQIARSA-N 0.000 description 1
- QMNWABHLJOHGDS-IHRRRGAJSA-N Tyr-Met-Cys Chemical compound SC[C@@H](C(O)=O)NC(=O)[C@H](CCSC)NC(=O)[C@@H](N)CC1=CC=C(O)C=C1 QMNWABHLJOHGDS-IHRRRGAJSA-N 0.000 description 1
- CYTJBBNFJIWKGH-STECZYCISA-N Tyr-Met-Ile Chemical compound [H]N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](CCSC)C(=O)N[C@@H]([C@@H](C)CC)C(O)=O CYTJBBNFJIWKGH-STECZYCISA-N 0.000 description 1
- VBFVQTPETKJCQW-RPTUDFQQSA-N Tyr-Phe-Thr Chemical compound [H]N[C@@H](CC1=CC=C(O)C=C1)C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)N[C@@H]([C@@H](C)O)C(O)=O VBFVQTPETKJCQW-RPTUDFQQSA-N 0.000 description 1
- NUQZCPSZHGIYTA-HKUYNNGSSA-N Tyr-Trp-Gly Chemical compound C1=CC=C2C(=C1)C(=CN2)C[C@@H](C(=O)NCC(=O)O)NC(=O)[C@H](CC3=CC=C(C=C3)O)N NUQZCPSZHGIYTA-HKUYNNGSSA-N 0.000 description 1
- HZDQUVQEVVYDDA-ACRUOGEOSA-N Tyr-Tyr-Leu Chemical compound C([C@@H](C(=O)N[C@@H](CC(C)C)C(O)=O)NC(=O)[C@@H](N)CC=1C=CC(O)=CC=1)C1=CC=C(O)C=C1 HZDQUVQEVVYDDA-ACRUOGEOSA-N 0.000 description 1
- NVJCMGGZHOJNBU-UFYCRDLUSA-N Tyr-Val-Phe Chemical compound CC(C)[C@@H](C(=O)N[C@@H](CC1=CC=CC=C1)C(=O)O)NC(=O)[C@H](CC2=CC=C(C=C2)O)N NVJCMGGZHOJNBU-UFYCRDLUSA-N 0.000 description 1
- ISAKRJDGNUQOIC-UHFFFAOYSA-N Uracil Natural products O=C1C=CNC(=O)N1 ISAKRJDGNUQOIC-UHFFFAOYSA-N 0.000 description 1
- IDKGBVZGNTYYCC-QXEWZRGKSA-N Val-Asn-Pro Chemical compound CC(C)[C@H](N)C(=O)N[C@@H](CC(N)=O)C(=O)N1CCC[C@H]1C(O)=O IDKGBVZGNTYYCC-QXEWZRGKSA-N 0.000 description 1
- XLDYBRXERHITNH-QSFUFRPTSA-N Val-Asp-Ile Chemical compound CC[C@H](C)[C@@H](C(O)=O)NC(=O)[C@H](CC(O)=O)NC(=O)[C@@H](N)C(C)C XLDYBRXERHITNH-QSFUFRPTSA-N 0.000 description 1
- BRPKEERLGYNCNC-NHCYSSNCSA-N Val-Glu-Arg Chemical compound CC(C)[C@H](N)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@H](C(O)=O)CCCN=C(N)N BRPKEERLGYNCNC-NHCYSSNCSA-N 0.000 description 1
- URIRWLJVWHYLET-ONGXEEELSA-N Val-Gly-Leu Chemical compound CC(C)C[C@@H](C(O)=O)NC(=O)CNC(=O)[C@@H](N)C(C)C URIRWLJVWHYLET-ONGXEEELSA-N 0.000 description 1
- 238000000862 absorption spectrum Methods 0.000 description 1
- 108010044940 alanylglutamine Proteins 0.000 description 1
- 108010070944 alanylhistidine Proteins 0.000 description 1
- 108010070783 alanyltyrosine Proteins 0.000 description 1
- 230000003321 amplification Effects 0.000 description 1
- 108010062796 arginyllysine Proteins 0.000 description 1
- 238000003556 assay Methods 0.000 description 1
- 238000005842 biochemical reaction Methods 0.000 description 1
- 238000007622 bioinformatic analysis Methods 0.000 description 1
- 230000000903 blocking effect Effects 0.000 description 1
- 238000010804 cDNA synthesis Methods 0.000 description 1
- 125000003178 carboxy group Chemical group [H]OC(*)=O 0.000 description 1
- 230000015556 catabolic process Effects 0.000 description 1
- 210000004027 cell Anatomy 0.000 description 1
- 150000001875 compounds Chemical class 0.000 description 1
- 238000009833 condensation Methods 0.000 description 1
- 230000005494 condensation Effects 0.000 description 1
- 239000008358 core component Substances 0.000 description 1
- 108010004073 cysteinylcysteine Proteins 0.000 description 1
- 238000006731 degradation reaction Methods 0.000 description 1
- 230000000593 degrading effect Effects 0.000 description 1
- 238000001962 electrophoresis Methods 0.000 description 1
- 238000005516 engineering process Methods 0.000 description 1
- 238000001976 enzyme digestion Methods 0.000 description 1
- 102000037865 fusion proteins Human genes 0.000 description 1
- 108020001507 fusion proteins Proteins 0.000 description 1
- 108010063718 gamma-glutamylaspartic acid Proteins 0.000 description 1
- 238000012215 gene cloning Methods 0.000 description 1
- 108010049041 glutamylalanine Proteins 0.000 description 1
- 108010074027 glycyl-seryl-phenylalanine Proteins 0.000 description 1
- 108010087823 glycyltyrosine Proteins 0.000 description 1
- HNDVDQJCIGZPNO-UHFFFAOYSA-N histidine Natural products OC(=O)C(N)CC1=CN=CN1 HNDVDQJCIGZPNO-UHFFFAOYSA-N 0.000 description 1
- 108010040030 histidinoalanine Proteins 0.000 description 1
- 108010025306 histidylleucine Proteins 0.000 description 1
- 108010018006 histidylserine Proteins 0.000 description 1
- 238000000338 in vitro Methods 0.000 description 1
- 238000001727 in vivo Methods 0.000 description 1
- 238000010921 in-depth analysis Methods 0.000 description 1
- 238000011534 incubation Methods 0.000 description 1
- 230000003993 interaction Effects 0.000 description 1
- 238000011835 investigation Methods 0.000 description 1
- 108010078274 isoleucylvaline Proteins 0.000 description 1
- 108010012058 leucyltyrosine Proteins 0.000 description 1
- 108010017391 lysylvaline Proteins 0.000 description 1
- 239000003550 marker Substances 0.000 description 1
- 230000002503 metabolic effect Effects 0.000 description 1
- 230000000813 microbial effect Effects 0.000 description 1
- 238000003199 nucleic acid amplification method Methods 0.000 description 1
- 239000002853 nucleic acid probe Substances 0.000 description 1
- 125000004437 phosphorous atom Chemical group 0.000 description 1
- 229910052698 phosphorus Inorganic materials 0.000 description 1
- 238000011533 pre-incubation Methods 0.000 description 1
- 230000013777 protein digestion Effects 0.000 description 1
- 238000003753 real-time PCR Methods 0.000 description 1
- 230000009467 reduction Effects 0.000 description 1
- 230000001105 regulatory effect Effects 0.000 description 1
- 230000008439 repair process Effects 0.000 description 1
- -1 respectively Proteins 0.000 description 1
- 238000010839 reverse transcription Methods 0.000 description 1
- 108091092562 ribozyme Proteins 0.000 description 1
- 108010048397 seryl-lysyl-leucine Proteins 0.000 description 1
- 241000894007 species Species 0.000 description 1
- 238000002798 spectrophotometry method Methods 0.000 description 1
- 239000000126 substance Substances 0.000 description 1
- 108010078580 tyrosylleucine Proteins 0.000 description 1
- 108010003137 tyrosyltyrosine Proteins 0.000 description 1
- 238000011144 upstream manufacturing Methods 0.000 description 1
- 229940035893 uracil Drugs 0.000 description 1
- 108010073969 valyllysine Proteins 0.000 description 1
- 238000012795 verification Methods 0.000 description 1
- 238000001262 western blot Methods 0.000 description 1
Classifications
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12N—MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
- C12N9/00—Enzymes; Proenzymes; Compositions thereof; Processes for preparing, activating, inhibiting, separating or purifying enzymes
- C12N9/93—Ligases (6)
-
- C—CHEMISTRY; METALLURGY
- C07—ORGANIC CHEMISTRY
- C07K—PEPTIDES
- C07K7/00—Peptides having 5 to 20 amino acids in a fully defined sequence; Derivatives thereof
- C07K7/04—Linear peptides containing only normal peptide links
- C07K7/06—Linear peptides containing only normal peptide links having 5 to 11 amino acids
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12P—FERMENTATION OR ENZYME-USING PROCESSES TO SYNTHESISE A DESIRED CHEMICAL COMPOUND OR COMPOSITION OR TO SEPARATE OPTICAL ISOMERS FROM A RACEMIC MIXTURE
- C12P21/00—Preparation of peptides or proteins
- C12P21/02—Preparation of peptides or proteins having a known sequence of two or more amino acids, e.g. glutathione
Landscapes
- Chemical & Material Sciences (AREA)
- Life Sciences & Earth Sciences (AREA)
- Organic Chemistry (AREA)
- Health & Medical Sciences (AREA)
- Zoology (AREA)
- Wood Science & Technology (AREA)
- Genetics & Genomics (AREA)
- Engineering & Computer Science (AREA)
- General Health & Medical Sciences (AREA)
- Biochemistry (AREA)
- Molecular Biology (AREA)
- Bioinformatics & Cheminformatics (AREA)
- Biotechnology (AREA)
- Microbiology (AREA)
- Proteomics, Peptides & Aminoacids (AREA)
- Medicinal Chemistry (AREA)
- General Engineering & Computer Science (AREA)
- Chemical Kinetics & Catalysis (AREA)
- General Chemical & Material Sciences (AREA)
- Biophysics (AREA)
- Biomedical Technology (AREA)
- Enzymes And Modification Thereof (AREA)
- Measuring Or Testing Involving Enzymes Or Micro-Organisms (AREA)
Abstract
The invention provides a novel double-stranded DNA peptide ligase, dDPlaseII, which catalyzes the covalent linkage of a specific double-stranded DNA to a specific polypeptide, together with its enzymatic characteristics and method of use. The amino acid sequence of the dDPlaseII enzyme is shown in SEQ ID NO: 8, and the corresponding gene coding sequence is shown in SEQ ID NO: 7. The dDPlaseII enzyme recognizes a specific DNA duplex whose 5' end begins with the double-stranded sequence 5'-CTGGATCAT-3' and whose 5'-terminal deoxyribonucleotide C is phosphorylated, and a specific polypeptide chain whose C-terminal sequence is the heptapeptide N'-MANCEHL-C', and catalyzes their covalent linkage. The ligation product of the double-stranded DNA and the polypeptide has a characteristic absorption peak at a wavelength of 372 nm, which can be used to determine the reaction activity of the dDPlaseII ligase. The dDPlaseII ligase buffer consists of 450 mM Tris-HCl (pH 7.8), 100 mM Mg2+, 80 mM NaCl, 20 mM ATP and 8 mM Triton X-100; the optimal reaction temperature range is 35-45 °C, and the ligation reaction time is 3-10 min. The novel double-stranded DNA polypeptide ligase provided by the invention can be used as a tool enzyme for genetic engineering and genetic analysis.
Description
Technical Field
The invention relates to the field of molecular biology, and in particular to a brand-new ligase capable of linking a specific double-stranded DNA fragment to a specific polypeptide, and to a method of using it.
Background
Genetic engineering is inseparable from molecular biology tool enzymes. These enzymes enable in vitro nucleic acid amplification, transcription or reverse transcription, digestion or excision, ligation and modification, as well as protein digestion or excision, and are widely used in target gene amplification, nucleic acid sequence analysis, recombinant DNA preparation, vector construction, nucleic acid probe labeling, protein analysis and related fields. For example, DNA polymerase is a core component of a PCR system, and the construction of a cDNA library is inseparable from RNA polymerase. The tool enzymes commonly used in genetic engineering mainly include DNA polymerase, restriction endonuclease, DNA ligase, RNA polymerase, reverse transcriptase, exonuclease, DNA methylase, ribonuclease, deoxyribonuclease, polynucleotide kinase, alkaline phosphatase, terminal nucleotidyl transferase, and various proteases. Most current commercial molecular tool enzymes are derived from microorganisms, mainly because microorganisms grow and proliferate quickly and metabolize vigorously, which also facilitates expression, separation and purification of the enzymes. Microorganisms in nature are diverse in species and in the micro-ecological niches they occupy, and therefore in their metabolic forms and processes; these depend on corresponding enzymes capable of performing a wide variety of biochemical reactions, which makes microorganisms a huge resource pool of molecular enzymes. For example, T4 ligase, derived from the T4 bacteriophage, is originally used by the bacteriophage to repair DNA cleaved by a restriction endonuclease of the host cell and has high ligation efficiency. As more and more enzyme molecules are explored, the number and uses of molecular tool enzymes keep increasing, which greatly expands the field of genetic engineering research and applications.
In previous research on the microbial ecosystem of degrading dried straw, the applicant found that a specific single-stranded RNA peptide ligase exists in the system; it recognizes a specific sequence at the 5' end of a specific mRNA and covalently links it to a specific polypeptide, thereby regulating the translation of that mRNA into protein. Further, with the aid of bioinformatic design, the applicant modified the single-stranded RNA peptide ligase gene to develop one specific single-stranded DNA peptide ligase and two specific double-stranded DNA peptide ligases, which recognize, respectively, a specific single-stranded DNA sequence at the 5' end of a specific single-stranded DNA or a specific double-stranded DNA sequence at the 5' end of a specific double-stranded DNA, and covalently join it to a specific polypeptide. On the basis of exogenous expression of these nucleic acid peptide ligases, the applicant explored their enzymatic properties and methods of use. Many DNA ligases (such as T4 DNA ligase from Thermo Fisher, USA) and modifying enzymes (such as DNA methylase from NEB) are commercially available, but they are used only for joining or specifically modifying nucleic acid fragments, and no ligase that covalently links nucleic acid fragments to polypeptide fragments has been reported. The invention provides a brand-new nucleic acid polypeptide ligase that can be used as a tool enzyme for genetic engineering and genetic analysis and has great application prospects in fields such as nucleic acid labeling and nucleic acid analysis.
Disclosure of Invention
The present invention provides novel nucleic acid polypeptide ligases useful for ligation of specific nucleic acid fragments and specific polypeptide fragments, as well as enzymatic properties and methods of use thereof.
The technical scheme adopted by the invention for solving the technical problems is as follows:
Object (1) of the present invention is to provide a single-stranded RNA peptide ligase, sRPlaseI, which catalyzes the covalent ligation of a specific single-stranded RNA to a specific polypeptide, together with its enzymatic properties and methods of use. The amino acid sequence of the sRPlaseI enzyme is shown in SEQ ID NO: 2, and the corresponding gene coding sequence is shown in SEQ ID NO: 1. The sRPlaseI enzyme recognizes a specific RNA single strand whose 5' end begins with the sequence 5'-AUGAUCCAG-3' and whose 5'-terminal ribonucleotide A is phosphorylated, and a specific polypeptide chain whose C-terminal sequence is the heptapeptide N'-MANCEHL-C'. It catalyzes a dehydration condensation reaction between the phosphorylated adenine ribonucleotide A at the 5' end of the RNA single strand and the leucine L at the C-terminus of the polypeptide chain, forming a phosphorus-carbon bond with an O=P-C=O structure, so that the single-stranded RNA and the polypeptide are covalently linked. The ligation product RP of the single-stranded RNA and the polypeptide has a maximum absorption peak at a wavelength of 351 nm; this peak is a characteristic absorption peak and can be used to measure the reaction activity of sRPlaseI ligase. The reaction system for sRPlaseI ligase is: (1) 1-100 ng/μL of the specific RNA single strand; (2) 1-100 ng/μL of the specific polypeptide; (3) ligase buffer consisting of 500 mM Tris-HCl (pH 7.5), 100 mM Mg2+, 10 mM ATP and 10 mM DTT, with 0.1 μL of ligase buffer added per 1 μL of reaction system; (4) 1 μg of sRPlaseI enzyme; (5) deionized water to make up the required volume.
Here, the specific RNA single strand is an RNA single strand whose 5' end begins with the sequence 5'-AUGAUCCAG-3' and whose 5'-terminal ribonucleotide A is phosphorylated, and the specific polypeptide is a polypeptide chain whose C-terminal sequence is the heptapeptide N'-MANCEHL-C'. The optimal reaction temperature range of sRPlaseI ligase is 30-40 °C and the ligation reaction time is 10-15 min. After the ligation reaction is finished, the reaction system can be held at 80 °C for 3 min to terminate the reaction completely.
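The buffer rule in the reaction system above (0.1 μL of ligase buffer per 1 μL of reaction) amounts to a 1:10 dilution of the buffer stock. The following is a minimal sketch of that arithmetic, assuming the stock concentrations listed above; the function and variable names are illustrative and do not come from the patent.

```python
# Illustrative sketch; names are hypothetical, concentrations are those
# listed for the sRPlaseI ligase buffer in the text.
SRPLASE_BUFFER_STOCK_MM = {
    "Tris-HCl (pH 7.5)": 500.0,
    "Mg2+": 100.0,
    "ATP": 10.0,
    "DTT": 10.0,
}

DILUTION_FACTOR = 10.0  # 0.1 uL buffer per 1 uL reaction


def buffer_volume_ul(reaction_volume_ul):
    """Volume of ligase buffer to add for a given total reaction volume."""
    return reaction_volume_ul / DILUTION_FACTOR


def final_concentrations_mm(stock_mm):
    """Concentration of each buffer component in the assembled reaction."""
    return {name: conc / DILUTION_FACTOR for name, conc in stock_mm.items()}


if __name__ == "__main__":
    print("buffer for a 10 uL reaction (uL):", buffer_volume_ul(10.0))
    for name, conc in final_concentrations_mm(SRPLASE_BUFFER_STOCK_MM).items():
        print(f"{name}: {conc:g} mM final")
```

The same two functions apply to the other ligase buffers described in this section if their stock compositions are substituted.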
Object (2) of the present invention is to provide a single-stranded DNA peptide ligase, sDPlaseI, which catalyzes the covalent ligation of a specific single-stranded DNA to a specific polypeptide, together with its enzymatic properties and methods of use. The amino acid sequence of the sDPlaseI enzyme is shown in SEQ ID NO: 4, and the corresponding gene coding sequence is shown in SEQ ID NO: 3. The sDPlaseI enzyme recognizes a specific DNA single strand whose 5' end begins with the sequence 5'-ATGATCCAG-3' and whose 5'-terminal deoxyribonucleotide A is phosphorylated, and a specific polypeptide chain whose C-terminal sequence is the heptapeptide N'-MANCEHL-C'. It catalyzes a dehydration condensation reaction between the phosphorylated adenine deoxyribonucleotide A at the 5' end of the DNA single strand and the leucine L at the C-terminus of the polypeptide chain, forming a phosphorus-carbon bond with an O=P-C=O structure, so that the single-stranded DNA and the polypeptide are covalently linked. The ligation product sDP of the single-stranded DNA and the polypeptide has a maximum absorption peak at a wavelength of 358 nm; this peak is a characteristic absorption peak and can be used to measure the reaction activity of sDPlaseI ligase. The reaction system for sDPlaseI ligase is: (1) 1-100 ng/μL of the specific DNA single strand; (2) 1-100 ng/μL of the specific polypeptide; (3) ligase buffer consisting of 500 mM Tris-HCl (pH 7.5), 100 mM Mg2+, 10 mM ATP and 10 mM DTT, with 0.1 μL of ligase buffer added per 1 μL of reaction system; (4) 1 μg of sDPlaseI enzyme; (5) deionized water to make up the required volume.
Here, the specific DNA single strand is a DNA single strand whose 5' end begins with the sequence 5'-ATGATCCAG-3' and whose 5'-terminal deoxyribonucleotide A is phosphorylated, and the specific polypeptide is a polypeptide chain whose C-terminal sequence is the heptapeptide N'-MANCEHL-C'. The optimal reaction temperature range of sDPlaseI ligase is 30-40 °C and the ligation reaction time is 5-15 min. After the ligation reaction is finished, the reaction system can be held at 80 °C for 3 min to terminate the reaction completely.
Object (3) of the present invention is to provide a double-stranded DNA peptide ligase, dDPlaseI, which catalyzes the covalent ligation of a specific double-stranded DNA to a specific polypeptide, together with its enzymatic properties and methods of use. The amino acid sequence of the dDPlaseI enzyme is shown in SEQ ID NO: 6, and the corresponding gene coding sequence is shown in SEQ ID NO: 5. The dDPlaseI enzyme recognizes a specific DNA duplex whose 5' end begins with the double-stranded sequence 5'-ATGATCCAG-3' and whose 5'-terminal deoxyribonucleotide A is phosphorylated, and a specific polypeptide chain whose C-terminal sequence is the heptapeptide N'-MANCEHL-C'. It catalyzes a dehydration condensation reaction between the phosphorylated adenine deoxyribonucleotide A at the 5' end of the DNA duplex and the leucine L at the C-terminus of the polypeptide chain, forming a phosphorus-carbon bond with an O=P-C=O structure, so that the double-stranded DNA and the polypeptide are covalently linked. The ligation product dDPI of the double-stranded DNA and the polypeptide has a maximum absorption peak at a wavelength of 360 nm; this peak is a characteristic absorption peak and can be used to measure the reaction activity of dDPlaseI ligase. The reaction system for dDPlaseI ligase is: (1) 1-100 ng/μL of the specific DNA duplex; (2) 1-100 ng/μL of the specific polypeptide; (3) ligase buffer consisting of 450 mM Tris-HCl (pH 7.8), 100 mM Mg2+, 80 mM NaCl, 20 mM ATP and 8 mM Triton X-100, with 0.1 μL of ligase buffer added per 1 μL of reaction system; (4) 1 μg of dDPlaseI enzyme; (5) deionized water to make up the required volume.
Here, the specific DNA duplex is a DNA duplex whose 5' end begins with the double-stranded sequence 5'-ATGATCCAG-3' and whose 5'-terminal deoxyribonucleotide A is phosphorylated, and the specific polypeptide is a polypeptide chain whose C-terminal sequence is the heptapeptide N'-MANCEHL-C'. The optimal reaction temperature range of dDPlaseI ligase is 35-45 °C and the ligation reaction time is 3-10 min. After the ligation reaction is finished, the reaction system can be held at 80 °C for 3 min to terminate the reaction completely.
Object (4) of the present invention is to provide another double-stranded DNA peptide ligase, dDPlaseII, which catalyzes the covalent ligation of a specific double-stranded DNA to a specific polypeptide, together with its enzymatic properties and methods of use. The amino acid sequence of the dDPlaseII enzyme is shown in SEQ ID NO: 8, and the corresponding gene coding sequence is shown in SEQ ID NO: 7. The dDPlaseII enzyme recognizes a specific DNA duplex whose 5' end begins with the double-stranded sequence 5'-CTGGATCAT-3' and whose 5'-terminal deoxyribonucleotide C is phosphorylated, and a specific polypeptide chain whose C-terminal sequence is the heptapeptide N'-MANCEHL-C'. It catalyzes a dehydration condensation reaction between the phosphorylated cytosine deoxyribonucleotide C at the 5' end of the DNA duplex and the leucine L at the C-terminus of the polypeptide chain, forming a phosphorus-carbon bond with an O=P-C=O structure, so that the double-stranded DNA and the polypeptide are covalently linked. The ligation product dDPII of the double-stranded DNA and the polypeptide has a maximum absorption peak at a wavelength of 372 nm; this peak is a characteristic absorption peak and can be used to measure the reaction activity of dDPlaseII ligase. The reaction system for dDPlaseII ligase is: (1) 1-100 ng/μL of the specific DNA duplex; (2) 1-100 ng/μL of the specific polypeptide; (3) ligase buffer consisting of 450 mM Tris-HCl (pH 7.8), 100 mM Mg2+, 80 mM NaCl, 20 mM ATP and 8 mM Triton X-100, with 0.1 μL of ligase buffer added per 1 μL of reaction system; (4) 1 μg of dDPlaseII enzyme; (5) deionized water to make up the required volume.
Here, the specific DNA duplex is a DNA duplex whose 5' end begins with the double-stranded sequence 5'-CTGGATCAT-3' and whose 5'-terminal deoxyribonucleotide C is phosphorylated, and the specific polypeptide is a polypeptide chain whose C-terminal sequence is the heptapeptide N'-MANCEHL-C'. The optimal reaction temperature range of dDPlaseII ligase is 35-45 °C and the ligation reaction time is 3-10 min. After the ligation reaction is finished, the reaction system can be held at 80 °C for 3 min to terminate the reaction completely.
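The substrate requirements stated above for dDPlaseII (a 5'-phosphorylated duplex beginning with 5'-CTGGATCAT-3' and a polypeptide ending in MANCEHL) can be written as a simple sequence check. This is only a sketch: the function names and the boolean phosphorylation flag are hypothetical conveniences, and passing the check says nothing about actual enzymatic activity. Running it also shows, as an observation from the two sequences rather than a statement in the text, that the dDPlaseII site is the reverse complement of the dDPlaseI site.

```python
# Illustrative sketch; identifiers are hypothetical.
COMPLEMENT = str.maketrans("ACGT", "TGCA")

DDPLASE_I_SITE = "ATGATCCAG"    # 5'->3', from the text
DDPLASE_II_SITE = "CTGGATCAT"   # 5'->3', from the text
C_TERMINAL_HEPTAPEPTIDE = "MANCEHL"  # N'->C', from the text


def reverse_complement(dna_5_to_3):
    """Reverse complement of a DNA strand, returned 5'->3'."""
    return dna_5_to_3.upper().translate(COMPLEMENT)[::-1]


def is_candidate_substrate_pair(dna_5_to_3, peptide_n_to_c,
                                five_prime_phosphate,
                                site=DDPLASE_II_SITE):
    """True if the strand starts with the site, carries a 5' phosphate,
    and the peptide ends with the heptapeptide."""
    return bool(
        five_prime_phosphate
        and dna_5_to_3.upper().startswith(site)
        and peptide_n_to_c.upper().endswith(C_TERMINAL_HEPTAPEPTIDE)
    )


if __name__ == "__main__":
    print(reverse_complement(DDPLASE_I_SITE) == DDPLASE_II_SITE)
    # Made-up example sequences, for demonstration only.
    print(is_candidate_substrate_pair("CTGGATCATGGCC", "GSMANCEHL", True))
```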
The novel nucleic acid polypeptide ligases provided by the invention, together with their enzymatic reaction characteristics and methods of use, lay a foundation for their use as new tool enzymes for genetic engineering and genetic analysis and for their application in fields such as nucleic acid analysis and nucleic acid labeling.
Drawings
FIG. 1 is a schematic diagram of the special DNA structure obtained from metagenome sequencing analysis of the dried straw micro-ecosystem, which mainly comprises promoter I, the X gene, the short peptide gene and the leader DNA fragment.
FIG. 2 shows the verification that protein X ligates R9 and P7.
FIG. 3 is a schematic diagram of the phosphorus-carbon bond with an O=P-C=O structure.
FIG. 4 is a full-wavelength scanning spectrum of the R9 and P7 ligation product RP formed by the single-stranded RNA peptide ligase sRPlaseI, with a characteristic absorption peak at a wavelength of 351 nm.
FIG. 5 is a full-wavelength scanning spectrum of the sD9 and P7 ligation product sDP formed by the single-stranded DNA peptide ligase sDPlaseI, with a characteristic absorption peak at a wavelength of 358 nm.
FIG. 6 is a full-wavelength scanning spectrum of the dD9 and P7 ligation product dDPI formed by the double-stranded DNA peptide ligase dDPlaseI, with a characteristic absorption peak at a wavelength of 360 nm.
FIG. 7 is a full-wavelength scanning spectrum of the dD9 and P7 ligation product dDPII formed by the double-stranded DNA peptide ligase dDPlaseII, with a characteristic absorption peak at a wavelength of 372 nm.
FIG. 8 shows the difference between the double-stranded DNA peptide ligases dDPlaseI and dDPlaseII in the dD9 and P7 ligation reaction.
Detailed Description
The invention is further illustrated below with reference to specific examples.
Example 1 Enzymes catalyzing the ligation of specific RNA single strands and specific polypeptides
In earlier work, the applicant used metagenome sequencing to study the functional genome of a dried straw micro-ecosystem in the Fujian region. Assembly and analysis of the sequencing data revealed that similar special DNA structures are present upstream of more than ten different genes; the structure is shown in FIG. 1, taking the cellulase gene (specifically endoglucanase, EC 3.2.1.4, a main component of the cellulase system) as an example. The regions immediately upstream of these more than ten genes, including cellulase, all contain two ORFs (open reading frames). In the cellulase example, the special DNA structure comprises an X gene of 1362 bp (sequence shown in SEQ ID NO: 1) and a short peptide gene of 24 bp (sequence shown in SEQ ID NO: 9). The polypeptide encoded by the short peptide gene is N'-MANCEHL-C' (shown in SEQ ID NO: 10), corresponding in order to the 7 amino acids N'-methionine-alanine-asparagine-cysteine-glutamic acid-histidine-leucine-C', where N' denotes the N-terminus of the polypeptide chain and C' denotes the C-terminus. The functions of the X gene and the short peptide gene are unknown. They share one promoter (promoter I in the figure), and because this gene structure lies immediately upstream of, and close to, the cellulase gene, its products were presumed to be components of the cellulose-degrading enzyme complex.
In addition, between the cellulase gene promoter (promoter II in the figure) and the start codon there is a short 9-nucleotide DNA duplex, 5'-ATGATCCAG-3' (complementary strand 3'-TACTAGGTC-5'; the corresponding transcript is the short RNA sequence 5'-AUGAUCCAG-3'). Like the X gene and the short peptide gene, a similar short DNA duplex is found before the start codon of each of the more than ten genes; it is referred to as the leader DNA fragment.
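The relationship between the leader DNA fragment, its complementary strand and its transcript, as given above, can be reproduced with a few lines of string handling. This is a minimal sketch with illustrative function names. The last helper also checks that a 7-residue peptide is consistent with a 24 bp gene, under the assumption (not stated in the text) that the 24 bp count includes a stop codon.

```python
# Illustrative sketch; function names are hypothetical.
DNA_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def complement_3_to_5(dna_5_to_3):
    """Base-by-base complement, read 3'->5' beneath the input strand."""
    return dna_5_to_3.translate(DNA_COMPLEMENT)


def transcript(coding_strand_5_to_3):
    """RNA with the coding-strand sequence, T replaced by U."""
    return coding_strand_5_to_3.replace("T", "U")


def orf_length_bp(n_residues, with_stop=True):
    """Length of an ORF encoding n_residues amino acids.
    Assumption: one 3-nt stop codon is counted when with_stop is True."""
    return 3 * (n_residues + (1 if with_stop else 0))


if __name__ == "__main__":
    leader = "ATGATCCAG"
    print("5'-" + leader + "-3'")
    print("3'-" + complement_3_to_5(leader) + "-5'")
    print("5'-" + transcript(leader) + "-3'")
    print(orf_length_bp(len("MANCEHL")), "bp")
```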
To explore the role of the special gene structure shown in FIG. 1 (mainly comprising promoter I, the X gene, the short peptide gene and the leader DNA fragment) in the cellulase complex system, it was cloned and expressed together with the cellulase gene. Primers were designed to amplify the gene structure shown in FIG. 1 (including the cellulase gene), a recombinant plasmid was constructed and transferred into an engineered Escherichia coli strain for exogenous expression, and the product was separated and purified. The expression product had essentially no cellulose-degrading ability (including endoglucanase activity). The transcription product of the cellulase gene, i.e. its mRNA, was detected by real-time fluorescent quantitative PCR, but the corresponding cellulase protein could not be detected by Western blot; the size of the protein band that was detected matched the X gene product. In addition, the N'-MANCEHL-C' polypeptide was detected, indicating that the short peptide gene was also successfully expressed. Overall, in the constructed recombinant E. coli exogenous expression system the cellulase gene was transcribed into mRNA but not translated into cellulase protein. This indicates that translation of the cellulase mRNA was blocked, whereas the adjacent X gene and short peptide gene successfully expressed the corresponding protein and polypeptide.
Further analysis showed that, in the engineered E. coli system carrying the recombinant plasmid with the cellulase complex gene structure shown in FIG. 1, the length of the extracted cellulase gene mRNA, as judged against an RNA marker on denaturing RNA electrophoresis, did not match the theoretical length: the actual length was larger. In-depth analysis (Qubit 4.0) revealed that the mRNA was actually a complex of mRNA and polypeptide, with a short peptide linked to the 5' end of the mRNA; RNase digestion and mass spectrometry showed that this short peptide is the product encoded by the short peptide gene shown in FIG. 1 (see Example 2 for the related analysis methods and procedures). Normally, mRNA and polypeptide do not ligate spontaneously, so it was presumed that the X gene expression product is an enzyme that catalyzes the ligation of a specific mRNA and a specific polypeptide. To explore its characteristics, the X gene needed to be cloned, expressed and purified separately.
Example 2 Cloning and expression of the X gene and exploration of X protein properties
Primers 5'-ATGCGGACGCGCCACAGC-3' and 5'-CTACATCTGACGTCGAAGG-3' were designed from the X gene sequence (SEQ ID NO: 1) to amplify the X gene (conventional PCR system and conditions, annealing temperature 51 °C). The X gene was cloned and expressed using the recombinant protein E. coli expression and purification system (pET Express & Purify Kits) from Takara, Japan, according to the instructions, and the X protein (the expression product encoded by the X gene) was isolated and purified via the histidine tag attached to the fusion protein. The final concentration of purified X protein was 0.18 μg/ml as determined by Qubit 4.0, and its amino acid sequence is shown in SEQ ID NO: 2. As Example 1 shows, protein X may be a ligase that catalyzes the ligation of a specific RNA to a specific polypeptide; since protein X was successfully expressed exogenously, isolated and purified, its properties could be further verified and studied. Because the short peptide sequence N'-MANCEHL-C' and the short leader RNA single-stranded sequence 5'-AUGAUCCAG-3' are ubiquitous (as described above, the leader DNA fragment is transcribed together with the cellulase gene, so the resulting mRNA carries at its 5' end the short leader RNA sequence transcribed from the leader DNA fragment shown in FIG. 1), it was presumed that the X ligase mainly recognizes and ligates the short peptide sequence and the short leader RNA sequence, thereby blocking translation into protein of the mRNA attached to the short leader RNA (e.g., the mRNA of the cellulase gene in Example 1).
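A quick sanity check of the two primers quoted above can be sketched as follows. The Wallace rule used here (2 °C per A/T, 4 °C per G/C) is a rough rule of thumb chosen for illustration; it is not the method used in the patent, and the function names are hypothetical.

```python
# Illustrative sketch; the melting-temperature estimate is the simple
# Wallace rule, an assumption of this example rather than the source's method.
COMPLEMENT = str.maketrans("ACGT", "TGCA")

FORWARD = "ATGCGGACGCGCCACAGC"    # 5'->3', from the text
REVERSE = "CTACATCTGACGTCGAAGG"   # 5'->3', from the text


def gc_fraction(primer):
    """Fraction of G and C bases in the primer."""
    return sum(base in "GC" for base in primer) / len(primer)


def wallace_tm_c(primer):
    """Rule-of-thumb melting temperature: 2*(A+T) + 4*(G+C), in deg C."""
    at = sum(base in "AT" for base in primer)
    gc = sum(base in "GC" for base in primer)
    return 2 * at + 4 * gc


def reverse_complement(dna_5_to_3):
    """Reverse complement of a DNA strand, returned 5'->3'."""
    return dna_5_to_3.translate(COMPLEMENT)[::-1]


if __name__ == "__main__":
    for name, primer in (("forward", FORWARD), ("reverse", REVERSE)):
        print(name, len(primer), "nt",
              f"GC={gc_fraction(primer):.2f}",
              f"Tm~{wallace_tm_c(primer)} C")
    # The forward primer starts at a start codon; the reverse primer's
    # reverse complement shows which coding-strand bases it covers.
    print(FORWARD[:3], reverse_complement(REVERSE)[-3:])
```

Under this rough estimate the primers come out at 62 °C and 58 °C, so the stated annealing temperature of 51 °C lies a few degrees below both. The forward primer begins with ATG, and the reverse primer is complementary to a coding-strand stretch ending in TAG.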
The 7-amino-acid short peptide P7, N'-MANCEHL-C' (the product encoded by the short peptide gene in FIG. 1), was synthesized by Kinsley Biotechnology Ltd at a purity of not less than 95%. The 9-ribonucleotide short leader RNA single strand R9, 5'-P-AUGAUCCAG-B-3', was synthesized at the same time (the transcript of the leader DNA fragment in FIG. 1; its 5'-terminal ribonucleotide A carries a monophosphate modification, denoted P; in this text, A in an RNA strand is adenine ribonucleotide, U is uracil ribonucleotide, C is cytosine ribonucleotide and G is guanine ribonucleotide), and its 3'-terminal ribonucleotide G was modified with biotin (denoted B) so that it could be isolated with streptavidin-coated magnetic beads. All phosphorylation modifications referred to herein mean that the phosphate group is attached to the 5th carbon atom of the five-carbon sugar of the ribonucleotide (or deoxyribonucleotide), as in a natural mononucleotide monophosphate or triphosphate. Triphosphate modification may also be applicable, but monophosphate is used for practical convenience, and for simplicity this is not restated below. As used herein, the 5'-terminal or 3'-terminal nucleotide refers to the first nucleotide at the corresponding end, i.e., the terminal-most nucleotide. The binding of biotin to streptavidin is the strongest non-covalent interaction currently known in nature and is commonly used for the isolation, purification and analysis of specific nucleic acids or proteins. A mixed reaction system for protein X was prepared in a PCR tube according to Table 1 and incubated in a PCR instrument at 37 °C for 30 min. All components of the reaction system were prepared with RNase-free water, and all consumables were treated to remove RNase.
Because the 3'-terminal ribonucleotide G of the short RNA single strand R9 carries a biotin modification, R9 in the reaction system was captured after the reaction with streptavidin magnetic beads (Dynabeads MyOne T1, Thermo Fisher, USA) according to the manufacturer's instructions. After R9 was captured, RNase I (Thermo Fisher; it can completely digest RNA into single nucleotides) was added and the RNA was digested according to the protocol. The RNA digestion product was purified and sent to Beijing Baitacg Biotechnology Limited for mass spectrometry analysis. If protein X can recognize and ligate the short leader RNA single strand R9 and the short peptide P7, digestion of the ligation complex with RNase I yields a conjugate of the short peptide P7 and the 5'-terminal nucleotide A of R9, whose molecular weight can then be verified by mass spectrometry. The principle and procedure are shown in FIG. 2 (B in FIG. 2 denotes the biotin modification).
The mass spectrum showed a major peak corresponding to the molecular weight of N'-MANCEHL-A (relative molecular weight about 1146.1; the A in -A represents ribonucleotide A) and a minor peak corresponding to the molecular weight of C'-L-A (relative molecular weight about 460.4; C'-L represents the C-terminal amino acid, i.e., amino acid L, and the A in -A represents ribonucleotide A), but no peak corresponding to the molecular weight of N'-M-A. The major peak indicates that R9 and P7 are linked together. The minor peak confirms that the linkage is between the 5'-terminal ribonucleotide A of R9 and the C-terminal amino acid L of P7, through a P-C phosphorus-carbon bond. Specifically, the structure is O=P-C=O: the phosphorus atom of the phosphorus-oxygen double bond P=O and the carbon atom of the carbon-oxygen double bond O=C are covalently linked (see FIG. 3), formed by dehydration condensation of the phosphate group of ribonucleotide A and the carboxyl group of amino acid L. In conclusion, protein X catalyzes the covalent linkage between R9 and P7 and is a newly discovered single-stranded RNA peptide ligase, abbreviated sRPlaseI (i.e., ligase I for single-stranded RNA and polypeptide); the RNA single strand-polypeptide ligation product it catalyzes is denoted RP.
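The two reported molecular weights can be cross-checked from the dehydration condensation described above (one water molecule is lost when the peptide or amino acid is joined to the nucleotide). The following is a minimal sketch; the average residue, water and AMP masses are standard reference values supplied here as assumptions and are not taken from this document.

```python
# Average residue masses in Da for the amino acids of P7.
# ASSUMPTION: standard reference values, not given in the source text.
RESIDUE_MASS = {
    "M": 131.19, "A": 71.08, "N": 114.10, "C": 103.14,
    "E": 129.12, "H": 137.14, "L": 113.16,
}
WATER = 18.02   # ASSUMPTION: average mass of H2O
AMP = 347.22    # ASSUMPTION: average mass of adenosine 5'-monophosphate


def peptide_mass(seq: str) -> float:
    """Mass of a free peptide (or single amino acid): residues plus one water."""
    return sum(RESIDUE_MASS[aa] for aa in seq) + WATER


def adduct_mass(seq: str, nucleotide_mass: float) -> float:
    """Mass after dehydration condensation of peptide and nucleotide."""
    return peptide_mass(seq) + nucleotide_mass - WATER


full_adduct = adduct_mass("MANCEHL", AMP)  # compare with the reported ~1146.1
leu_adduct = adduct_mass("L", AMP)         # compare with the reported ~460.4
print(f"MANCEHL-A: {full_adduct:.2f}  L-A: {leu_adduct:.2f}")
```

Under these assumed masses both values fall within about 0.1 of the figures reported from the mass spectrum.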
TABLE 1 Protein reaction system
Component | Amount added
1 μg/μL synthetic RNA single strand R9 | 1 μL
1 μg/μL synthetic short peptide P7 | 1 μL
Ligase buffer (pH 7.5; 500 mM Tris-HCl, 100 mM Mg2+, 10 mM ATP and 10 mM DTT) | 1 μL
X protein | 1 μg
Ultrapure water | Make up to 10 μL
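The final concentrations implied by Table 1 follow from simple dilution into the 10 μL reaction volume. The sketch below is illustrative only and assumes that the buffer composition listed in the table is that of the stock solution added (1 μL per 10 μL); the variable and function names are hypothetical.

```python
TOTAL_VOLUME_UL = 10.0

# Composition of the ligase buffer as listed in Table 1, in mM.
# ASSUMPTION: these are stock concentrations before dilution.
BUFFER_STOCK_MM = {"Tris-HCl": 500.0, "Mg2+": 100.0, "ATP": 10.0, "DTT": 10.0}


def final_concentration(stock: float, added_ul: float,
                        total_ul: float = TOTAL_VOLUME_UL) -> float:
    """Concentration after diluting `added_ul` of stock into `total_ul`."""
    return stock * added_ul / total_ul


# 1 uL of buffer in a 10 uL reaction.
buffer_final_mm = {name: final_concentration(conc, 1.0)
                   for name, conc in BUFFER_STOCK_MM.items()}

# 1 uL of a 1 ug/uL (1000 ng/uL) substrate stock in a 10 uL reaction.
substrate_final_ng_per_ul = final_concentration(1000.0, 1.0)

# 1 ug (1000 ng) of X protein in a 10 uL reaction.
enzyme_final_ng_per_ul = 1000.0 / TOTAL_VOLUME_UL

print(buffer_final_mm, substrate_final_ng_per_ul, enzyme_final_ng_per_ul)
```

On this reading each substrate is present at 100 ng/μL and the buffer components at one tenth of the listed values.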
Example 3 Detection of the reaction product of single-stranded RNA peptide ligase sRPlaseI and exploration of its enzymatic properties
Examples 1 and 2 show that sRPlaseI is a ligase that catalyzes the ligation of a specific single-stranded RNA to a specific polypeptide. To explore its enzymatic properties further, a simple ligation assay needed to be established. Since the reaction product catalyzed by sRPlaseI ligase is a complex in which an RNA single strand and a polypeptide are covalently linked (through a phosphorus-carbon bond of O=P-C=O configuration), its absorption spectrum was measured to see whether the reaction could be followed by simple spectrophotometry. Using an Epoch microplate reader (Biotek, USA), 3 μL each of the reaction product of Table 1 in Example 2, R9, P7 and the blank reaction system were scanned over the full wavelength range of 200 nm to 1000 nm; the result for the reaction product is shown in FIG. 4. The ligation complex of the RNA single strand and the polypeptide (designated RP) shows a maximum absorption peak at 351 nm, whereas R9, P7 and the blank reaction system do not show this peak. The 351 nm peak is therefore the characteristic absorption peak of the ligation complex RP and is attributed to the specific absorption of the phosphorus-carbon bond of O=P-C=O configuration. Because RP is a ligation product of RNA and polypeptide, absorption peaks also appear at about 215 nm, 260 nm and 280 nm, but these wavelengths are affected by the reaction system and are not suitable for measuring RP. Measuring the absorbance at 351 nm is thus a simple and effective method for assaying the sRPlaseI ligase reaction (product analysis method).
To explore the substrate specificity of sRPlaseI ligase, the R9 and/or P7 fragments were altered. When the R9 fragment was altered by deleting and/or replacing some of its ribonucleotides, the system showed no significant absorption peak at 351 nm after the ligation reaction, indicating that no phosphorus-carbon bond product of O=P-C=O configuration was formed, i.e., no ligation took place. Likewise, when the P7 fragment was altered by deleting and/or replacing some of its amino acids, the system showed no significant absorption peak at 351 nm after the ligation reaction, indicating that no phosphorus-carbon bond product of O=P-C=O configuration was formed, i.e., no ligation took place. The ligation catalyzed by sRPlaseI ligase therefore depends on strict recognition of the R9 and P7 fragments. This also agrees with the finding that the more than ten genes mentioned in Example 1 appear in gene constructs similar to those of R9 and P7: for sRPlaseI ligase the R9 and P7 sequences are conserved, and no other sequence forms of the genes corresponding to R9 and P7 were found linked to the ligase gene. However, since sRPlaseI ligase mediates ligation between the 5'-terminal ribonucleotide of R9 and the C-terminal amino acid of P7, extending the ribonucleotide chain at the 3' end of R9 and/or the amino acid chain at the N-terminus of P7 should in theory not affect the ligation reaction, provided that R9 and P7 are retained.
To test this, 5'-P-AUGAUCCAG-Rn-3' (where P represents the monophosphate modification) and N'-Pm-MANCEHL-C' were used as substrates. Rn is a single-stranded RNA fragment linked to the 3' end of R9; three fragments of different lengths and sequences, R1, R2 and R3, were tried. In view of sequence synthesis techniques and cost, R1-R3 were chosen as the mRNA regions corresponding to glutathione, human insulin A chain and human insulin B chain, respectively (the glutathione sequence being a virtual mRNA sequence); their ribonucleotide sequences are shown in SEQ ID NO: 11 to SEQ ID NO: 13 (9 nt, 63 nt and 90 nt in length, respectively). Pm is a polypeptide fragment linked to the N-terminus of P7; three fragments of different lengths and sequences, P1, P2 and P3, were tried (P1-P3 correspond to R1-R3 and are glutathione, human insulin A chain and human insulin B chain, respectively); their amino acid sequences are shown in SEQ ID NO: 14 to SEQ ID NO: 16 (3 aa, 21 aa and 30 aa in length, respectively). In total, 9 combinations of the two substrates (RNA single strand and polypeptide) were tested with reference to Table 1. In every sRPlaseI ligase-mediated reaction system a significant absorption peak was detected at 351 nm, indicating that the phosphorus-carbon bond product of O=P-C=O configuration was formed, i.e., ligation took place. This is also consistent with the observation in Example 1 that the cellulase gene is transcribed but not translated into protein: the short leader RNA single strand is located at the 5' end of the long mRNA transcribed from the cellulase gene, and sRPlaseI ligase (the expression product of the X gene immediately adjacent to the cellulase gene in the specific gene configuration of FIG. 1) links P7 to the short leader RNA single strand R9 at that 5' end, so that the cellulase mRNA cannot be recognized and translated.
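The 3 x 3 substrate design above can be enumerated as follows. This is only an illustrative sketch built from the lengths stated in the text (R9 is 9 nt, P7 is 7 aa); the dictionary and variable names are hypothetical. It also checks that each Rn is three nucleotides per amino acid of the matching Pm, as expected for a coding region.

```python
from itertools import product

# Lengths stated in the text for the extension fragments.
RN_LENGTH_NT = {"R1": 9, "R2": 63, "R3": 90}
PM_LENGTH_AA = {"P1": 3, "P2": 21, "P3": 30}
R9_NT = 9   # length of the leader AUGAUCCAG
P7_AA = 7   # length of the peptide MANCEHL

# Every RNA substrate (R9 + Rn) paired with every peptide substrate (Pm + P7).
combos = [
    (rn, pm, R9_NT + RN_LENGTH_NT[rn], PM_LENGTH_AA[pm] + P7_AA)
    for rn, pm in product(RN_LENGTH_NT, PM_LENGTH_AA)
]

# Rn is described as the mRNA region for the matching Pm: 3 nt per amino acid.
codon_consistent = all(
    RN_LENGTH_NT[f"R{i}"] == 3 * PM_LENGTH_AA[f"P{i}"] for i in (1, 2, 3)
)

for rn, pm, rna_len, pep_len in combos:
    print(f"R9-{rn} ({rna_len} nt) + {pm}-P7 ({pep_len} aa)")
```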
As can be seen, sRPlaseI ligase can use as substrates an RNA single strand whose 5' end begins with intact R9 and whose 5'-terminal ribonucleotide A is phosphorylated (i.e., 5'-P-R9 followed by any other RNA sequence at the 3' end, where P indicates that the 5'-terminal ribonucleotide of R9 is monophosphorylated) and a polypeptide chain whose C-terminus ends with intact P7 (i.e., any other polypeptide sequence at the N-terminus followed by P7-C'). It catalyzes the covalent ligation of the RNA single strand and the polypeptide chain, specifically the formation of a phosphorus-carbon bond of O=P-C=O configuration between the 5'-terminal ribonucleotide A of the RNA and the C-terminal amino acid L of the polypeptide. The 5'-terminal ribonucleotide refers to the first ribonucleotide at the 5' end. The phosphorylated state of the nucleotide means that a modified or naturally occurring phosphate group is attached to the 5th carbon atom of the five-carbon sugar of the ribonucleotide, as in a natural mononucleotide monophosphate or triphosphate. Triphosphate modification may also be applicable, but monophosphate is more convenient in practice, so for simplicity this is not specifically indicated below.
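The substrate rule stated above can be written as a simple sequence check. This is a hypothetical helper for illustration only: it encodes the stated sequence requirements (5'-phosphorylated RNA beginning with R9, polypeptide ending with P7) and makes no prediction about actual enzyme activity.

```python
R9 = "AUGAUCCAG"  # leader sequence required at the 5' end of the RNA
P7 = "MANCEHL"    # heptapeptide required at the C-terminus of the polypeptide


def is_substrate_pair(rna_5_to_3: str, peptide_n_to_c: str,
                      five_prime_phosphorylated: bool = True) -> bool:
    """Return True if the pair meets the sequence rule described in the text.

    rna_5_to_3      -- RNA sequence written 5' to 3'
    peptide_n_to_c  -- peptide sequence written N-terminus to C-terminus
    """
    return (
        five_prime_phosphorylated
        and rna_5_to_3.upper().startswith(R9)
        and peptide_n_to_c.upper().endswith(P7)
    )


# The extensions "GGC" and "AA" below are arbitrary placeholders.
print(is_substrate_pair(R9 + "GGC", "AA" + P7))   # extended substrates
print(is_substrate_pair("AUGAUCCA", P7))          # truncated R9
```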
With reference to Table 1 and using R9 and P7 as substrates, the optimal reaction buffer system and temperature of sRPlaseI ligase were investigated with the different buffers and temperatures listed in Table 2. Buffers A and B are two commonly used ligase buffers: buffer A (pH 7.5) consists of 500 mM Tris-HCl, 100 mM Mg2+, 10 mM ATP and 10 mM DTT; buffer B (pH 7.8) consists of 450 mM Tris-HCl, 100 mM Mg2+, 80 mM NaCl, 20 mM ATP and 8 mM Tris-100. The amount of sRPlaseI ligase added was 1 μg and the reaction time was 30 min. After the reaction, 3 μL of the reaction product was taken and its absorbance at 351 nm was measured with an Epoch microplate reader (Biotek, USA); the results are shown in Table 2. As Table 2 shows, at every temperature the absorbance of the reaction product was higher in buffer A than in buffer B, indicating that buffer A is more suitable for sRPlaseI ligase. In buffer A, the absorbance of the reaction product was highest at 37 °C (1.28), indicating that 37 °C is the optimal reaction temperature for sRPlaseI ligase. The results in Table 2 further show that the optimal reaction temperature range of sRPlaseI ligase is 30 °C to 40 °C. When the aforementioned 5'-P-AUGAUCCAG-Rn-3' (P indicates that the 5'-terminal ribonucleotide A is phosphorylated) and N'-Pm-MANCEHL-C' were used as substrates, the optimal buffer and temperature were the same as for the R9 and P7 substrates.
TABLE 2 Reactivity of sRPlaseI ligase in different reaction buffer systems and temperature gradients
In buffer A at 37 °C, extending the reaction time to 60 min and increasing the amount of sRPlaseI ligase to 2 μg again gave an absorbance of 1.28 at 351 nm for the reaction product, indicating that the reaction is already complete after 30 min in buffer A at 37 °C. With the original system and conditions otherwise unchanged, shortening the reaction time to 10 min likewise gave an absorbance of 1.28 at 351 nm, so the reaction proceeds to completion within 10 min. With the original system and conditions unchanged, the sRPlaseI ligase reaction system was held at 80 °C for 3 min (the basic reaction system was incubated at 80 °C for 15 min and then the sRPlaseI ligase was added) and the reaction was then continued; the absorbance of the reaction product at 351 nm did not differ significantly from that of the blank control (0.08 for the blank control, 0.13 for the system pre-incubated at 80 °C), indicating that the reaction had been terminated. Holding the system at 80 °C for 3 min after the reaction is finished therefore terminates the reaction completely.
In summary, the present invention provides a single-stranded RNA peptide ligase sRPlaseI that catalyzes covalent linkage between a specific single-stranded RNA and a specific polypeptide. Its amino acid sequence is shown in SEQ ID NO: 2 and the corresponding gene coding sequence is shown in SEQ ID NO: 1. The sRPlaseI enzyme recognizes a specific RNA single strand whose 5' end begins with the sequence 5'-AUGAUCCAG-3' and whose 5'-terminal ribonucleotide A is phosphorylated, and a specific polypeptide chain whose C-terminus ends with the heptapeptide N'-MANCEHL-C'. It catalyzes a dehydration condensation between the adenine ribonucleotide A at the 5' end of the RNA single strand and the leucine L at the C-terminus of the polypeptide chain to form a phosphorus-carbon bond of O=P-C=O configuration, so that the single-stranded RNA and the polypeptide are covalently linked. The ligation product RP of the single-stranded RNA and the polypeptide has a maximum absorption peak at 351 nm, which is its characteristic absorption peak. The reaction system for sRPlaseI ligase is: (1) 1 ng/μL to 100 ng/μL of the specific RNA single strand; (2) 1 ng/μL to 100 ng/μL of the specific polypeptide; (3) ligase buffer (pH 7.5) consisting of 500 mM Tris-HCl, 100 mM Mg2+, 10 mM ATP and 10 mM DTT, added at 0.1 μL per 1 μL of reaction system; (4) deionized water to make up to the required volume. Here, the specific RNA single strand is an RNA single strand whose 5' end begins with the sequence 5'-AUGAUCCAG-3' and whose 5'-terminal ribonucleotide A is phosphorylated, and the specific polypeptide is a polypeptide chain whose C-terminus ends with the heptapeptide N'-MANCEHL-C'.
The optimal reaction temperature range of sRPlaseI ligase is 30 °C to 40 °C and the ligation reaction time is 10 min to 15 min; after the ligation reaction, holding the reaction system at 80 °C for 3 min terminates the reaction completely.
Example 4 DNA peptide ligases that catalyze ligation of specific DNA single strands or specific DNA double strands to specific polypeptides
In addition to single-stranded RNA, the nucleic acid forms common in vivo and in molecular biology include double-stranded DNA and single-stranded DNA. The gene encoding the single-stranded RNA peptide ligase sRPlaseI (i.e., the X gene in FIG. 1, whose sequence is shown in SEQ ID NO: 1) was therefore further engineered by bioinformatic design in order to construct specific DNA peptide ligases. The substrate binding site and the catalytic center of sRPlaseI were predicted by bioinformatic analysis; a series of modified genes, focused on changes to the coding sequences of these key sites, was obtained by custom synthesis; and the modified enzymes were obtained by cloning, expression, isolation and purification with the recombinant protein E. coli expression and purification system (pET Express & Purify Kits) of Takara, Japan, according to the instructions and with reference to the method in Example 2. Given the stringency of the ligase toward its substrates, P7 was retained as the polypeptide substrate, and the nucleic acid substrate was the DNA counterpart of R9: either the DNA single strand sD9 (5'-P-ATGATCCAG-B-3') or the DNA double strand dD9 (5'-P-ATGATCCAG-B-3' with complementary strand 3'-TACTAGGTC-P-5'), where P represents the monophosphate modification and B the biotin modification.
The short single-stranded DNA sD9 was synthesized directly by Kinsley Biotechnology GmbH. For the short double-stranded DNA fragment dD9, Kinsley Biotechnology Limited first synthesized the two complementary strands separately, then mixed equal amounts (1.5 μg/μL) of the two complementary single-stranded DNAs in annealing buffer, held the mixture at 25 °C for 16 h, purified the double-stranded product and determined its concentration with a Qubit 4.0; the preparation of the double-stranded DNA was carried out entirely by the company.
An enzyme reaction system was prepared according to Table 3 and incubated in a PCR instrument at 37 °C for 30 min. The reaction products of single-stranded DNA and of double-stranded DNA with P7 were designated sDP and dDP, respectively (collectively, DP). The single-stranded RNA peptide ligase sRPlaseI catalyzes R9 and P7 to form the product RP, which contains a phosphorus-carbon bond of O=P-C=O configuration with characteristic absorption at 351 nm. Accordingly, the reaction products of the DNA strand and P7 in each engineered enzyme system were scanned over the full wavelength range to see whether they had a characteristic absorption peak at 351 nm or at a nearby wavelength, which allowed simple screening for engineered enzymes able to link sD9 (or dD9) to P7. For candidate engineered enzyme systems identified in this primary full-wavelength screen, the biotin-labelled DNA strand (sD9 or dD9, i.e., D9) was captured with streptavidin-coated magnetic beads with reference to the method in Example 2, the ligation reaction product was digested with a DNA exonuclease (able to digest single-stranded and double-stranded DNA) according to the instructions, and the purified product was sent to a company for mass spectrometry analysis.
TABLE 3 Engineered enzyme reaction system
Component | Amount added
μg/μL DNA single strand sD9 or DNA double strand dD9 | 1 μL
1 μg/μL synthetic short peptide P7 | 1 μL
Ligase buffer (pH 7.5; 500 mM Tris-HCl, 100 mM Mg2+, 10 mM ATP and 10 mM DTT) | 1 μL
Engineered enzyme | 1 μg
Ultrapure water | Make up to 10 μL
Screening of 34 modified enzymes identified 1 candidate highly reactive specific single-stranded DNA polypeptide ligase, designated sDPlaseI (i.e., ligase I for single-stranded DNA and polypeptide), whose gene sequence is shown in SEQ ID NO: 3 and whose amino acid sequence is shown in SEQ ID NO: 4, and 2 candidate highly reactive specific double-stranded DNA polypeptide ligases, designated dDPlaseI (i.e., ligase I for double-stranded DNA and polypeptide) and dDPlaseII (i.e., ligase II for double-stranded DNA and polypeptide), whose gene sequences are shown in SEQ ID NO: 5 and SEQ ID NO: 7 and whose amino acid sequences are shown in SEQ ID NO: 6 and SEQ ID NO: 8, respectively. Full-wavelength scans of the sDPlaseI, dDPlaseI and dDPlaseII enzyme reaction systems are shown in FIGS. 5 to 7, respectively. The products of all 3 DNA polypeptide ligase reaction systems have absorption peaks at about 215 nm, 260 nm and 280 nm. These 3 peaks are still present when the biotin-labelled material is first captured with streptavidin-coated magnetic beads and then scanned over the full wavelength range; since they are the characteristic absorption peaks of DNA and polypeptide, this indirectly supports the presence of a DNA-polypeptide complex.
In addition, the sDPlaseI, dDPlaseI and dDPlaseII enzyme reaction systems each exhibited a unique characteristic absorption peak, at 358 nm, 360 nm and 372 nm, respectively. These wavelengths are very close to the 351 nm characteristic absorption wavelength of the single-stranded RNA polypeptide complex RP in Example 3 and are likely to be the characteristic absorption peaks of the phosphorus-carbon bond of O=P-C=O configuration; the slight differences are attributed to differences in the nucleotide (ribonucleotide or deoxyribonucleotide) that forms the phosphorus-carbon bond with amino acid L and in its surrounding environment. After the candidate DNA polypeptide ligase systems had been screened by their characteristic absorption peaks in the full-wavelength scan, they were verified by mass spectrometry.
The mass spectrum of the sDPlaseI enzyme reaction product showed a major peak corresponding to the molecular weight of N'-MANCEHL-A (relative molecular mass 1130.2; the A in -A represents the 5'-terminal nucleotide of sD9, i.e., deoxyribonucleotide A) and, among the minor peaks, a peak corresponding to the molecular weight of C'-L-A (relative molecular mass 444.4; C'-L represents the C-terminal amino acid of P7, i.e., amino acid L, and the A in -A represents the 5'-terminal nucleotide of sD9, i.e., deoxyribonucleotide A). The mass spectrometry results show that sD9 and P7 are linked, and that the linkage is a phosphorus-carbon bond of O=P-C=O configuration between the 5'-terminal deoxyribonucleotide A of sD9 and the C-terminal amino acid L of P7. The sDPlaseI enzyme therefore catalyzes the covalent linkage of sD9 and P7 and is a novel single-stranded DNA peptide ligase; the DNA single strand-polypeptide ligation product it catalyzes is denoted sDP.
The mass spectrum of the dDPlaseI enzyme reaction product showed a major peak corresponding to the molecular weight of N'-MANCEHL-A (relative molecular mass 1130.2; the A in -A represents the 5'-terminal nucleotide of the linked strand of dD9, i.e., deoxyribonucleotide A) and a minor peak corresponding to the molecular weight of C'-L-A (relative molecular mass 444.4; C'-L represents the C-terminal amino acid of P7, i.e., amino acid L, and the A in -A represents the 5'-terminal nucleotide of the linked strand of dD9, i.e., deoxyribonucleotide A). It should be noted that although the short double-stranded DNA fragment dD9 has an A=T nucleotide pair at one end, the hydrogen bonds of the A=T base pair are broken under mass spectrometry conditions, and only the deoxyribonucleotide A covalently linked to P7 is retained. The mass spectrometry results show that dD9 and P7 are linked, and that the linkage is a phosphorus-carbon bond of O=P-C=O configuration between the 5'-terminal deoxyribonucleotide A on the linked strand of dD9 and the C-terminal amino acid L of P7. dDPlaseI therefore catalyzes the covalent linkage of dD9 and P7 and is a novel double-stranded DNA peptide ligase; the DNA double strand-polypeptide ligation product it catalyzes is denoted dDPI.
The mass spectrum of the dDPlaseII enzyme reaction product showed a major peak corresponding to the molecular weight of N'-MANCEHL-C (relative molecular mass 1106.1; the C in -C represents the 5'-terminal nucleotide of the linked strand of dD9, i.e., deoxyribonucleotide C) and, among the minor peaks, a peak corresponding to the molecular weight of C'-L-C (relative molecular mass 420.4; C'-L represents the C-terminal amino acid of P7, i.e., amino acid L, and the C in -C represents the 5'-terminal nucleotide of the linked strand of dD9, i.e., deoxyribonucleotide C). It should be noted that although the short double-stranded DNA fragment dD9 has a C≡G nucleotide pair at one end, the hydrogen bonds of the C≡G base pair are broken under mass spectrometry conditions, and only the deoxyribonucleotide C covalently linked to P7 is retained. The mass spectrometry results show that dD9 and P7 are linked, and that the linkage is a phosphorus-carbon bond of O=P-C=O configuration between the 5'-terminal deoxyribonucleotide C on one strand of dD9 and the C-terminal amino acid L of P7. dDPlaseII therefore catalyzes the covalent linkage of dD9 and P7 and is a novel double-stranded DNA peptide ligase; the DNA double strand-polypeptide ligation product it catalyzes is denoted dDPII.
Both dDPlaseI and dDPlaseII catalyze the covalent linkage of dD9 and P7 through a phosphorus-carbon bond of O=P-C=O configuration. They differ in the strand that is linked: dDPlaseI links the 5'-terminal deoxyribonucleotide A on one strand of dD9 (the 5'-P-ATGATCCAG-3' strand) to the C-terminal amino acid L of P7, whereas dDPlaseII links the 5'-terminal deoxyribonucleotide C on the other strand of dD9 (the 5'-P-CTGGATCAT-3' strand, i.e., the complementary strand of the 5'-P-ATGATCCAG-3' strand) to the C-terminal amino acid L of P7, as shown in FIG. 8. In practical applications such as nucleic acid labeling, either strand of a DNA double strand can therefore be labeled using the ligase dDPlaseI or dDPlaseII of the present invention; likewise, for blocking gene expression or regulation, transcription or regulation of the expression product of either strand of the DNA double strand can be blocked using the ligase dDPlaseI or dDPlaseII of the present invention.
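The relationship between the two dD9 strands can be verified with a reverse complement. The following short sketch is illustrative only (the function and variable names are hypothetical); it derives the second strand from the first and reads off the 5'-terminal nucleotide that each enzyme is described as linking to P7.

```python
# Translation table for Watson-Crick complements of DNA bases.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq_5_to_3: str) -> str:
    """Return the complementary strand, also written 5' to 3'."""
    return seq_5_to_3.translate(COMPLEMENT)[::-1]


top_strand = "ATGATCCAG"                       # strand linked by dDPlaseI
bottom_strand = reverse_complement(top_strand)  # strand linked by dDPlaseII

# 5'-terminal deoxyribonucleotide linked to the C-terminal L of P7.
ligated_nucleotide = {
    "dDPlaseI": top_strand[0],
    "dDPlaseII": bottom_strand[0],
}
print(bottom_strand, ligated_nucleotide)
```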
Example 5 Exploration of the enzymatic properties of 3 DNA polypeptide ligases
To explore the substrate specificity of the 3 DNA polypeptide ligases sDPlaseI, dDPlaseI and dDPlaseII, the sD9 (or dD9) and/or P7 fragments were altered. When the sD9 (or dD9) fragment was altered by deleting and/or replacing some of its deoxyribonucleotides, the system showed no significant absorption peak at 358 nm (or 360 nm or 372 nm) after the ligation reaction, indicating that no phosphorus-carbon bond product of O=P-C=O configuration was formed, i.e., no ligation took place. Likewise, when the P7 fragment was altered by deleting and/or replacing some of its amino acids, the system showed no significant absorption peak at 358 nm (or 360 nm or 372 nm) after the ligation reaction, indicating that no phosphorus-carbon bond product of O=P-C=O configuration was formed, i.e., no ligation took place. The ligation catalyzed by sDPlaseI, dDPlaseI and dDPlaseII therefore depends on strict recognition of the D9 (i.e., sD9 or dD9) and P7 fragments, consistent with the stringent substrate sequence requirements of the single-stranded RNA polypeptide ligase sRPlaseI; that is, the 3 DNA polypeptide ligases require the D9 and P7 recognition sequences to be conserved. However, since all 3 DNA polypeptide ligases mediate ligation between the 5'-terminal deoxyribonucleotide of D9 and the C-terminal amino acid of P7, extending the deoxyribonucleotide chain at the 3' end on the non-ligation side of D9 and/or the amino acid chain at the N-terminus of P7 should in theory not affect the ligation reaction, provided that D9 and P7 are retained.
For the single-stranded DNA polypeptide ligase sDPlaseI system, a DNA single strand 5'-P-ATGATCCAG-Dn-3' (P represents the phosphorylation modification) and N'-Pm-MANCEHL-C' were used as substrates. For the double-stranded DNA polypeptide ligase dDPlaseI system, a DNA double strand of 5'-P-ATGATCCAG-Dn-3' (complementary strand 3'-TACTAGGTC-Dn'-5', where Dn' is the complementary strand of Dn) and N'-Pm-MANCEHL-C' were used as substrates. For the double-stranded DNA polypeptide ligase dDPlaseII system, a DNA double strand of 5'-P-CTGGATCAT-Dn-3' (complementary strand 3'-GACCTAGTA-Dn'-5', where Dn' is the complementary strand of Dn) and N'-Pm-MANCEHL-C' were used as substrates. Dn is a DNA fragment linked to the 3' end on the non-ligation side of D9. As in the study of the single-stranded RNA polypeptide ligase, three fragments of different lengths and sequences, D1, D2 and D3, were tried (D1, D2 and D3 correspond to R1, R2 and R3, respectively, i.e., Dn is the DNA form of Rn); their sequences are shown in SEQ ID NO: 17 to SEQ ID NO: 19, respectively. Pm is a polypeptide fragment linked to the N-terminus of P7; three fragments of different lengths and sequences, P1, P2 and P3, were tried (the polypeptides encoded by D1-D3, i.e., glutathione, human insulin A chain and human insulin B chain, respectively); their sequences are shown in SEQ ID NO: 14 to SEQ ID NO: 16, respectively. For each of the 3 DNA polypeptide ligases, 9 combinations of the two substrates were tried, and significant absorption peaks were consistently detected at 358 nm, 360 nm and 372 nm in the sDPlaseI, dDPlaseI and dDPlaseII ligase-mediated reaction systems, respectively, indicating that the phosphorus-carbon bond product of O=P-C=O configuration was formed, i.e., ligation took place.
In conclusion, sDPlaseI ligase catalyzes the covalent ligation of a DNA single strand and a polypeptide chain, using as substrates a DNA single strand whose 5' end begins with intact sD9 and whose 5'-terminal deoxyribonucleotide A is phosphorylated (i.e., 5'-P-ATGATCCAG followed by any other DNA sequence at the 3' end) and a polypeptide chain whose C-terminus ends with intact P7. dDPlaseI ligase catalyzes the covalent ligation of a DNA double strand and a polypeptide chain, using as substrates a DNA double strand whose 5' end begins with the intact dD9 structure and whose 5'-terminal deoxyribonucleotide A is phosphorylated (i.e., the DNA double strand corresponding to 5'-P-ATGATCCAG followed by any other DNA sequence at the 3' end) and a polypeptide chain whose C-terminus ends with intact P7. dDPlaseII ligase catalyzes the covalent ligation of a DNA double strand and a polypeptide chain, using as substrates a DNA double strand whose 5' end begins with the intact dD9 structure and whose 5'-terminal deoxyribonucleotide C is phosphorylated (i.e., the DNA double strand corresponding to 5'-P-CTGGATCAT followed by any other DNA sequence at the 3' end) and a polypeptide chain whose C-terminus ends with intact P7. The 5'-terminal deoxyribonucleotide refers to the first deoxyribonucleotide at the 5' end. The phosphorylated state of the deoxyribonucleotide means that a modified or naturally occurring phosphate group is attached to the 5th carbon atom of the five-carbon sugar of the deoxyribonucleotide, as in a natural monodeoxyribonucleotide monophosphate or triphosphate. Triphosphate modification may also be applicable, but monophosphate is more convenient in practice, so for simplicity this is not specifically indicated below.
With reference to Table 3 and using D9 (sD9 or dD9) and P7 as substrates, the optimal reaction buffer system and temperature of the 3 ligases sDPlaseI, dDPlaseI and dDPlaseII were investigated at their characteristic absorption wavelengths with the different buffers and temperatures listed in Table 4. As in the investigation of the single-stranded RNA polypeptide ligase sRPlaseI, the two commonly used ligase buffers, buffer A and buffer B, were used. The amount of DNA polypeptide ligase added was 1 μg and the reaction time was 30 min. After the reaction, 3 μL of the reaction product was taken and its absorbance was measured with an Epoch microplate reader (Biotek, USA) at 358 nm (sDPlaseI ligase reaction system), 360 nm (dDPlaseI ligase reaction system) or 372 nm (dDPlaseII ligase reaction system); the results are shown in Table 4. As Table 4 shows, for sDPlaseI ligase the absorbance of the reaction product was higher in buffer A than in buffer B at every temperature, indicating that buffer A is more suitable for sDPlaseI ligase; for dDPlaseI and dDPlaseII ligases the absorbance of the reaction product was higher in buffer B than in buffer A at every temperature, indicating that buffer B is more suitable for both dDPlaseI and dDPlaseII ligases. For sDPlaseI ligase in buffer A, the absorbance of the reaction product was highest at 37 °C (1.37), indicating that 37 °C is the optimal reaction temperature for sDPlaseI ligase. For dDPlaseI and dDPlaseII ligases in buffer B, the absorbance of the reaction product was highest at 40 °C (1.44 and 1.48, respectively), indicating that 40 °C is the optimal reaction temperature for dDPlaseI and dDPlaseII ligases. The results in Table 4 further show that the optimal reaction temperature range is 30 °C to 40 °C for sDPlaseI ligase and 35 °C to 45 °C for dDPlaseI and dDPlaseII ligases.
As in the substrate specificity study, the extended substrates were also tested: for the single-stranded DNA polypeptide ligase sDPlaseI system, a DNA single strand 5'-P-ATGATCCAG-Dn-3' and N'-Pm-MANCEHL-C'; for the double-stranded DNA polypeptide ligase dDPlaseI system, a DNA double strand of 5'-P-ATGATCCAG-Dn-3' and N'-Pm-MANCEHL-C'; and for the double-stranded DNA polypeptide ligase dDPlaseII system, a DNA double strand of 5'-P-CTGGATCAT-Dn-3' and N'-Pm-MANCEHL-C'. For all 3 DNA polypeptide ligases, the optimal buffer and temperature were the same as with the D9 and P7 substrates.
Table 4 Reactivity of 3 DNA polypeptide ligases in different reaction buffer systems and temperature gradients
As in the investigation of the reaction conditions of sRPlaseI ligase in Example 3, the amount of enzyme added and the reaction time in the reaction system were varied. It was found that in buffer A at 37 ℃ (see the system in Table 3), 1 μg of sDPlaseI ligase needs 10 min for the ligation reaction to proceed sufficiently; in buffer B at 40 ℃ (see the system in Table 3), 1 μg of dDPlaseI ligase (or dDPlaseII ligase) needs 8 min for the ligation reaction to proceed sufficiently. For the sDPlaseI, dDPlaseI and dDPlaseII ligases, the reaction was completely stopped by incubation at 80 ℃ for 3 min.
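The conditions reported above can be collected into a small lookup, which may be convenient when planning a run. The following Python sketch is illustrative only: the names `LigaseConditions`, `CONDITIONS` and `protocol_steps` are assumptions introduced here, and the values are simply those stated in the text (1 μg of enzyme, buffer A or B, temperature, time for sufficient ligation, characteristic readout wavelength, and 80 ℃ for 3 min to stop the reaction).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LigaseConditions:
    """Conditions reported in the text for 1 ug of enzyme (illustrative container)."""
    buffer: str          # "A" or "B", as named in the text
    temperature_c: int   # reported optimal temperature
    ligation_min: int    # time reported for sufficient ligation
    readout_nm: int      # characteristic absorption wavelength of the product


# Values copied from the description; the dictionary itself is a hypothetical helper.
CONDITIONS = {
    "sDPlaseI": LigaseConditions("A", 37, 10, 358),
    "dDPlaseI": LigaseConditions("B", 40, 8, 360),
    "dDPlaseII": LigaseConditions("B", 40, 8, 372),
}

STOP_TEMPERATURE_C = 80  # heat step reported to stop all three ligases
STOP_MIN = 3


def protocol_steps(ligase: str) -> list[str]:
    """Return the incubation steps for one ligase as plain text."""
    c = CONDITIONS[ligase]
    return [
        f"Add 1 ug {ligase} to the Table 3 system in buffer {c.buffer}",
        f"Incubate at {c.temperature_c} C for {c.ligation_min} min",
        f"Stop at {STOP_TEMPERATURE_C} C for {STOP_MIN} min",
    ]


for step in protocol_steps("dDPlaseII"):
    print(step)
```

The readout wavelength is kept as a field rather than a step, because the text does not say whether absorbance is read before or after the heat step.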
In summary, the present invention provides a single-stranded DNA peptide ligase sDPlaseI for catalyzing the covalent linkage of a specific single-stranded DNA and a specific polypeptide. Its amino acid sequence is shown in SEQ ID NO: 4, and the corresponding gene coding sequence is shown in SEQ ID NO: 3. The sDPlaseI enzyme recognizes a specific DNA single strand whose 5' end begins with the sequence 5'-ATGATCCAG-3' and whose 5'-terminal deoxyribonucleotide A is phosphorylated, together with a specific polypeptide chain whose C-terminal end is the 7-peptide N'-MANCEHL-C'. It catalyzes the dehydration condensation reaction between the adenine deoxyribonucleotide A at the 5' end of the DNA single strand and the leucine L at the C-terminus of the polypeptide chain to form a phosphorus-carbon bond with an O=P-C=O structure, so that the single-stranded DNA and the polypeptide are covalently linked. The single-stranded DNA and polypeptide ligation product sDP has a maximum absorption peak at a wavelength of 358 nm, which is its characteristic absorption peak. The reaction system for the sDPlaseI ligase is: (1) 1 ng/μL to 100 ng/μL of the specific DNA single strand; (2) 1 ng/μL to 100 ng/μL of the specific polypeptide; (3) ligase buffer consisting of 500 mM Tris-HCl (pH 7.5), 100 mM Mg²⁺, 10 mM ATP and 10 mM DTT, with 0.1 μL of ligase buffer added per 1 μL of reaction system; (4) deionized water to make up the required volume. Here, the specific DNA single strand is a DNA single strand whose 5' end begins with the sequence 5'-ATGATCCAG-3' and whose 5'-terminal deoxyribonucleotide A is phosphorylated, and the specific polypeptide is a polypeptide chain whose C-terminal end is the 7-peptide N'-MANCEHL-C'.
The optimal reaction temperature range of the sDPlaseI ligase is 30-40 ℃, the ligation reaction time is 5-15 min, and the reaction system can be placed at 80 ℃ for 3min after the ligation reaction is finished, so that the reaction can be completely terminated.
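The composition rules of the reaction system (substrates at 1 to 100 ng/μL final concentration, 0.1 μL of ligase buffer per 1 μL of reaction, deionized water to volume) amount to simple dilution arithmetic. The Python sketch below is a minimal illustration under stated assumptions: the function name `reaction_mix`, the stock concentrations and the optional `enzyme_ul` argument are hypothetical and not taken from the patent.

```python
def reaction_mix(total_ul, dna_stock, dna_final, pep_stock, pep_final, enzyme_ul=0.0):
    """Return component volumes in uL for one ligation reaction.

    Concentrations are in ng/uL. Stock concentrations and enzyme_ul are
    hypothetical inputs chosen by the user; the 1-100 ng/uL window and the
    0.1 uL buffer per 1 uL reaction come from the description.
    """
    for name, final in (("DNA", dna_final), ("polypeptide", pep_final)):
        if not 1 <= final <= 100:
            raise ValueError(f"{name} final concentration must be 1-100 ng/uL")

    mix = {
        # C1 * V1 = C2 * V2  ->  V1 = V2 * C2 / C1
        "dna_ul": total_ul * dna_final / dna_stock,
        "polypeptide_ul": total_ul * pep_final / pep_stock,
        # 0.1 uL of ligase buffer per 1 uL of reaction system
        "ligase_buffer_ul": total_ul / 10,
        "enzyme_ul": enzyme_ul,
    }
    water = total_ul - sum(mix.values())
    if water < 0:
        raise ValueError("components exceed the total volume; use more concentrated stocks")
    mix["water_ul"] = water  # deionized water makes up the required volume
    return mix


if __name__ == "__main__":
    for component, volume in reaction_mix(20, 500, 50, 1000, 50).items():
        print(f"{component}: {volume:.2f} uL")
```

The same arithmetic applies to the dDPlaseI and dDPlaseII systems, which differ only in the buffer recipe and the DNA substrate.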
The invention also provides a double-stranded DNA peptide ligase dDPlaseI for catalyzing the covalent linkage of a specific double-stranded DNA and a specific polypeptide. Its amino acid sequence is shown in SEQ ID NO: 6, and the corresponding gene coding sequence is shown in SEQ ID NO: 5. The dDPlaseI enzyme recognizes a specific DNA double strand whose 5' end begins with the double-stranded sequence 5'-ATGATCCAG-3' and whose 5'-terminal deoxyribonucleotide A is phosphorylated, together with a specific polypeptide chain whose C-terminal end is the 7-peptide N'-MANCEHL-C'. It catalyzes the dehydration condensation reaction between the adenine deoxyribonucleotide A at the 5' end of the DNA double strand and the leucine L at the C-terminus of the polypeptide chain to form a phosphorus-carbon bond with an O=P-C=O structure, so that the double-stranded DNA and the polypeptide are covalently linked. The double-stranded DNA and polypeptide ligation product dDPI has a maximum absorption peak at a wavelength of 360 nm, which is its characteristic absorption peak. The reaction system for dDPlaseI ligase is: (1) 1 ng/μL to 100 ng/μL of the specific DNA double strand; (2) 1 ng/μL to 100 ng/μL of the specific polypeptide; (3) ligase buffer consisting of 450 mM Tris-HCl (pH 7.8), 100 mM Mg²⁺, 80 mM NaCl, 20 mM ATP and 8 mM Triton X-100, with 0.1 μL of ligase buffer added per 1 μL of reaction system; (4) deionized water to make up the required volume. Here, the specific DNA double strand is a DNA double strand whose 5' end begins with the double-stranded sequence 5'-ATGATCCAG-3' and whose 5'-terminal deoxyribonucleotide A is phosphorylated, and the specific polypeptide is a polypeptide chain whose C-terminal end is the 7-peptide N'-MANCEHL-C'.
The optimal reaction temperature range of dDPlaseI ligase is 35-45 ℃, the ligation reaction time is 3-10 min, and the reaction system can be kept at 80 ℃ for 3min after the ligation reaction is finished, so that the reaction can be completely stopped.
The invention also provides another double-stranded DNA peptide ligase, dDPlaseII, for catalyzing the covalent linkage of a specific double-stranded DNA and a specific polypeptide. Its amino acid sequence is shown in SEQ ID NO: 8, and the corresponding gene coding sequence is shown in SEQ ID NO: 7. The dDPlaseII enzyme recognizes a specific DNA double strand whose 5' end begins with the double-stranded sequence 5'-CTGGATCAT-3' and whose 5'-terminal deoxyribonucleotide C is phosphorylated, together with a specific polypeptide chain whose C-terminal end is the 7-peptide N'-MANCEHL-C'. It catalyzes a dehydration condensation reaction between the cytosine deoxyribonucleotide C at the 5' end of the DNA double strand and the leucine L at the C-terminus of the polypeptide chain to form a phosphorus-carbon bond with an O=P-C=O structure, so that the double-stranded DNA and the polypeptide are covalently linked. The double-stranded DNA and polypeptide ligation product dDPII has a maximum absorption peak at a wavelength of 372 nm, which is its characteristic absorption peak. The reaction system for dDPlaseII ligase is: (1) 1 ng/μL to 100 ng/μL of the specific DNA double strand; (2) 1 ng/μL to 100 ng/μL of the specific polypeptide; (3) ligase buffer consisting of 450 mM Tris-HCl (pH 7.8), 100 mM Mg²⁺, 80 mM NaCl, 20 mM ATP and 8 mM Triton X-100, with 0.1 μL of ligase buffer added per 1 μL of reaction system; (4) deionized water to make up the required volume. Here, the specific DNA double strand is a DNA double strand whose 5' end begins with the double-stranded sequence 5'-CTGGATCAT-3' and whose 5'-terminal deoxyribonucleotide C is phosphorylated, and the specific polypeptide is a polypeptide chain whose C-terminal end is the 7-peptide N'-MANCEHL-C'.
The optimal reaction temperature range of the dDPlaseII ligase is 35-45 ℃, the ligation reaction time is 3-10 min, and the reaction system can be placed at 80 ℃ for 3min after the ligation reaction is finished, so that the reaction can be completely terminated.
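The substrate requirements described for the three DNA peptide ligases (a 5'-phosphorylated DNA beginning with a fixed 9-nt motif, and a polypeptide ending in MANCEHL) can be expressed as a simple sequence check. The Python sketch below is only an illustration: `DNA_MOTIFS`, `reverse_complement` and `is_candidate_substrate` are names introduced here, and the check looks at sequence motifs only; it does not model strandedness or any enzymatic behavior. It also shows one observation that follows from base pairing and is not stated in the text: CTGGATCAT is the reverse complement of ATGATCCAG.

```python
# 5'-terminal DNA motifs and C-terminal peptide motif as given in the description.
DNA_MOTIFS = {
    "sDPlaseI": "ATGATCCAG",   # single-stranded substrate
    "dDPlaseI": "ATGATCCAG",   # double-stranded substrate
    "dDPlaseII": "CTGGATCAT",  # double-stranded substrate
}
PEPTIDE_MOTIF = "MANCEHL"


def reverse_complement(seq: str) -> str:
    """Reverse complement of a DNA sequence written 5'->3'."""
    return seq.upper().translate(str.maketrans("ACGT", "TGCA"))[::-1]


def is_candidate_substrate(ligase: str, dna_5to3: str, peptide_n_to_c: str,
                           five_prime_phosphate: bool = True) -> bool:
    """Motif-level check only: 5'-P-<motif>-Dn-3' and N'-Pm-MANCEHL-C'."""
    return (
        five_prime_phosphate
        and dna_5to3.upper().startswith(DNA_MOTIFS[ligase])
        and peptide_n_to_c.upper().endswith(PEPTIDE_MOTIF)
    )


if __name__ == "__main__":
    print(reverse_complement("ATGATCCAG"))
    # The DNA tail and peptide prefix below are arbitrary illustrative sequences.
    print(is_candidate_substrate("dDPlaseII", "CTGGATCATGGCATT", "GGSMANCEHL"))
    print(is_candidate_substrate("dDPlaseI", "CTGGATCATGGCATT", "GGSMANCEHL"))
```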
In conclusion, the invention provides 4 novel nucleic acid polypeptide ligases: 1 specific single-stranded RNA peptide ligase sRPlaseI, 1 specific single-stranded DNA peptide ligase sDPlaseI, and 2 specific double-stranded DNA peptide ligases dDPlaseI and dDPlaseII. They can be used as tool enzymes for genetic engineering and genetic analysis and have broad application prospects in fields such as nucleic acid labeling and nucleic acid analysis.
Sequence listing
<110> Fujian Chengxienke Biotech Co., Ltd
<120> Single-stranded RNA peptide ligase sRPeseI and method of use thereof
<160> 19
<170> SIPOSequenceListing 1.0
<210> 1
<211> 1362
<212> DNA
<213> Artificial Sequence (Artificial Sequence)
<400> 1
atgcggacgc gccacagcag aactctgcta ctcgagcatc ttgccacaga aagacgacgt 60
ttagctccgt ggaacatagg gtccgcttca acaagacact tgcttctaga caagagtccc 120
tctagaaatc ccttccagtc tatatgttac tggggccaag aattggtcac cggcgggaat 180
cttacattga gtaacgggca gctccgatca cagagatatt ggcacacctt gagttcgacg 240
cccatcctcc gaatctgtat gagatcctgc ttctccttac gttctccgaa actacagttg 300
cgtccgatac cagttttatg tttcttcatg ggagcgacca atatcagttt acgcttcgca 360
agccaaggga cggtggatat agttcatagt gttgaacgtt ccgagagtca gtcccgcacc 420
tacgtgtttt tggaactgtg ggccttcggg ttttacacag cctttggttt cagcatgcta 480
ttagtctcca ccttggaggc acacctaatc acggtaaagg gtgatggcct aatacgttgg 540
gatgtagcgc gctccttccg gtcaggtcac gaggatggag cgtgtcttac acgcgatccg 600
tgcggtccgc agtttgcgtc tgacgactac gagccgcgtt cttgcctacc ccagatgcta 660
tcggcgagag ggggtcccgg ttcgtttatc gttgtgtatg gttgtcattg ggctcaattg 720
cggattcagg cggggctagc aaaccaagtg ttgagtgttt gtcttatttg taaggcatat 780
atgatctcag agtttttgtc catacctaac cattcctatt acttgcgcgc gccatgtgaa 840
caaggtaaaa tgttgataga tgcgaggcac ctttggctac gggtagagcg gctgaattct 900
atcattgcag gtctggcatc acttcgtaag cgaggtaata ctcgcaccag cttgaactca 960
atccttttat ttagtaagga tcaacagtat aaaatgcggc gcgccgcact gagtatacta 1020
ctatattggg gctatttcac agtccgcgca tcctgcgata atcttgtggc cactctacgg 1080
aaagaccccc gggaatacga ttcggcgact gggccgtcga aactttgtca gcccaaagct 1140
catccctgtc atccgatgca aatgtacctg agggactggg caggcaaatt aagagcaacg 1200
aagcggccag acaggggcgc ccaacaagaa catgcggtga accccgccgg ctaccatcaa 1260
atgcagagtg caaggttggt tgcgccttta accccgtccg cccagcttac tgagcgtgat 1320
tgtacaggaa aggttgggct tgaccttcga cgtcagatgt ag 1362
<210> 2
<211> 453
<212> PRT
<213> Artificial Sequence (Artificial Sequence)
<400> 2
Met Arg Thr Arg His Ser Arg Thr Leu Leu Leu Glu His Leu Ala Thr
1 5 10 15
Glu Arg Arg Arg Leu Ala Pro Trp Asn Ile Gly Ser Ala Ser Thr Arg
20 25 30
His Leu Leu Leu Asp Lys Ser Pro Ser Arg Asn Pro Phe Gln Ser Ile
35 40 45
Cys Tyr Trp Gly Gln Glu Leu Val Thr Gly Gly Asn Leu Thr Leu Ser
50 55 60
Asn Gly Gln Leu Arg Ser Gln Arg Tyr Trp His Thr Leu Ser Ser Thr
65 70 75 80
Pro Ile Leu Arg Ile Cys Met Arg Ser Cys Phe Ser Leu Arg Ser Pro
85 90 95
Lys Leu Gln Leu Arg Pro Ile Pro Val Leu Cys Phe Phe Met Gly Ala
100 105 110
Thr Asn Ile Ser Leu Arg Phe Ala Ser Gln Gly Thr Val Asp Ile Val
115 120 125
His Ser Val Glu Arg Ser Glu Ser Gln Ser Arg Thr Tyr Val Phe Leu
130 135 140
Glu Leu Trp Ala Phe Gly Phe Tyr Thr Ala Phe Gly Phe Ser Met Leu
145 150 155 160
Leu Val Ser Thr Leu Glu Ala His Leu Ile Thr Val Lys Gly Asp Gly
165 170 175
Leu Ile Arg Trp Asp Val Ala Arg Ser Phe Arg Ser Gly His Glu Asp
180 185 190
Gly Ala Cys Leu Thr Arg Asp Pro Cys Gly Pro Gln Phe Ala Ser Asp
195 200 205
Asp Tyr Glu Pro Arg Ser Cys Leu Pro Gln Met Leu Ser Ala Arg Gly
210 215 220
Gly Pro Gly Ser Phe Ile Val Val Tyr Gly Cys His Trp Ala Gln Leu
225 230 235 240
Arg Ile Gln Ala Gly Leu Ala Asn Gln Val Leu Ser Val Cys Leu Ile
245 250 255
Cys Lys Ala Tyr Met Ile Ser Glu Phe Leu Ser Ile Pro Asn His Ser
260 265 270
Tyr Tyr Leu Arg Ala Pro Cys Glu Gln Gly Lys Met Leu Ile Asp Ala
275 280 285
Arg His Leu Trp Leu Arg Val Glu Arg Leu Asn Ser Ile Ile Ala Gly
290 295 300
Leu Ala Ser Leu Arg Lys Arg Gly Asn Thr Arg Thr Ser Leu Asn Ser
305 310 315 320
Ile Leu Leu Phe Ser Lys Asp Gln Gln Tyr Lys Met Arg Arg Ala Ala
325 330 335
Leu Ser Ile Leu Leu Tyr Trp Gly Tyr Phe Thr Val Arg Ala Ser Cys
340 345 350
Asp Asn Leu Val Ala Thr Leu Arg Lys Asp Pro Arg Glu Tyr Asp Ser
355 360 365
Ala Thr Gly Pro Ser Lys Leu Cys Gln Pro Lys Ala His Pro Cys His
370 375 380
Pro Met Gln Met Tyr Leu Arg Asp Trp Ala Gly Lys Leu Arg Ala Thr
385 390 395 400
Lys Arg Pro Asp Arg Gly Ala Gln Gln Glu His Ala Val Asn Pro Ala
405 410 415
Gly Tyr His Gln Met Gln Ser Ala Arg Leu Val Ala Pro Leu Thr Pro
420 425 430
Ser Ala Gln Leu Thr Glu Arg Asp Cys Thr Gly Lys Val Gly Leu Asp
435 440 445
Leu Arg Arg Gln Met
450
<210> 3
<211> 1359
<212> DNA
<213> Artificial Sequence (Artificial Sequence)
<400> 3
atgcggacgc gccacagcag aactctgcta ctcgagcatc ttgccacaga aagacgacgt 60
ttagctccgt ggaacatagg gtccgcttca acaagacact tgcttctaga caagagtccc 120
tctagaaatc ccttccagtc tatatgttac tggggccaag aattggtcac cggcgggaat 180
cttacattga gtaacgggca gctccgatca cagagatatt ggcacacctt gagttcgacg 240
cccatcctcc gaatctgtat gagatcctgc ttctccttac gttctccgaa actacagttg 300
cgtccgatac cagttttatg tttcttcatg ggagcgagca acagtttacg cttcgcaagc 360
caagggacgg tggatatagt tcatagtgtt gaacgttccg agagtcagtc ccgcacctac 420
gtgtttttgg aactgtgggc cttcgggttt tacacagcct ttggtttcag catgctatta 480
gtctccacct tggaggcaca cctaatcacg gtaaagggtg atggcctaat acgttgggat 540
gtagcgcgct ccttccggtc aggtcacgag gatggagcgt gtcttacacg cgatccgtgc 600
ggtccgcagt ttgcgtctga cgactacgag ccgcgttctt gcctacccca gatgctatcg 660
gcgagagggg gtcccggttc gtttaccgat gtgtatggtt gtcattgggc tcaattgcgg 720
attcaggcgg ggctagcaaa ccaagtgttg agtgtttgtc ttatttgtaa ggcatatatg 780
atctcagagt ttttgtccat acctaaccat tcctattact tgcgcgcgcc atgtgaacaa 840
ggtaaaatgt tgatagatgc gaggcacctt tggctacggg tagagcggct gaattctatc 900
attgcaggtc tggcatcact tcgtaagcga ggtaatactc gcaccagctt gaactcaatc 960
cttttattta gtaaggatca acagtataaa atgcggcgcg ccgcactgag tatactacta 1020
tattggggct atttcacagt ccgcgcatcc tgcgataatc ttgtggccac tctacggaaa 1080
gacccccggg aatacgattc ggcgactggg ccgtcgaaac tttgtcagcc caaagctcat 1140
ccctgtcatc cgatgcaaat gtacctgagg gactgggcag gcaaattaag agcaacgaag 1200
cggccagaca ggggcgccca acaagaacat gcggtgaacc ccgccggcta ccatcaaatg 1260
cagagtgcaa ggttggttgc gcctttaacc ccgtccgccc agcttactga gcgtgattgt 1320
acaggaaagg ttgggcttga ccttcgacgt cagatgtag 1359
<210> 4
<211> 452
<212> PRT
<213> Artificial Sequence (Artificial Sequence)
<400> 4
Met Arg Thr Arg His Ser Arg Thr Leu Leu Leu Glu His Leu Ala Thr
1 5 10 15
Glu Arg Arg Arg Leu Ala Pro Trp Asn Ile Gly Ser Ala Ser Thr Arg
20 25 30
His Leu Leu Leu Asp Lys Ser Pro Ser Arg Asn Pro Phe Gln Ser Ile
35 40 45
Cys Tyr Trp Gly Gln Glu Leu Val Thr Gly Gly Asn Leu Thr Leu Ser
50 55 60
Asn Gly Gln Leu Arg Ser Gln Arg Tyr Trp His Thr Leu Ser Ser Thr
65 70 75 80
Pro Ile Leu Arg Ile Cys Met Arg Ser Cys Phe Ser Leu Arg Ser Pro
85 90 95
Lys Leu Gln Leu Arg Pro Ile Pro Val Leu Cys Phe Phe Met Gly Ala
100 105 110
Ser Asn Ser Leu Arg Phe Ala Ser Gln Gly Thr Val Asp Ile Val His
115 120 125
Ser Val Glu Arg Ser Glu Ser Gln Ser Arg Thr Tyr Val Phe Leu Glu
130 135 140
Leu Trp Ala Phe Gly Phe Tyr Thr Ala Phe Gly Phe Ser Met Leu Leu
145 150 155 160
Val Ser Thr Leu Glu Ala His Leu Ile Thr Val Lys Gly Asp Gly Leu
165 170 175
Ile Arg Trp Asp Val Ala Arg Ser Phe Arg Ser Gly His Glu Asp Gly
180 185 190
Ala Cys Leu Thr Arg Asp Pro Cys Gly Pro Gln Phe Ala Ser Asp Asp
195 200 205
Tyr Glu Pro Arg Ser Cys Leu Pro Gln Met Leu Ser Ala Arg Gly Gly
210 215 220
Pro Gly Ser Phe Thr Asp Val Tyr Gly Cys His Trp Ala Gln Leu Arg
225 230 235 240
Ile Gln Ala Gly Leu Ala Asn Gln Val Leu Ser Val Cys Leu Ile Cys
245 250 255
Lys Ala Tyr Met Ile Ser Glu Phe Leu Ser Ile Pro Asn His Ser Tyr
260 265 270
Tyr Leu Arg Ala Pro Cys Glu Gln Gly Lys Met Leu Ile Asp Ala Arg
275 280 285
His Leu Trp Leu Arg Val Glu Arg Leu Asn Ser Ile Ile Ala Gly Leu
290 295 300
Ala Ser Leu Arg Lys Arg Gly Asn Thr Arg Thr Ser Leu Asn Ser Ile
305 310 315 320
Leu Leu Phe Ser Lys Asp Gln Gln Tyr Lys Met Arg Arg Ala Ala Leu
325 330 335
Ser Ile Leu Leu Tyr Trp Gly Tyr Phe Thr Val Arg Ala Ser Cys Asp
340 345 350
Asn Leu Val Ala Thr Leu Arg Lys Asp Pro Arg Glu Tyr Asp Ser Ala
355 360 365
Thr Gly Pro Ser Lys Leu Cys Gln Pro Lys Ala His Pro Cys His Pro
370 375 380
Met Gln Met Tyr Leu Arg Asp Trp Ala Gly Lys Leu Arg Ala Thr Lys
385 390 395 400
Arg Pro Asp Arg Gly Ala Gln Gln Glu His Ala Val Asn Pro Ala Gly
405 410 415
Tyr His Gln Met Gln Ser Ala Arg Leu Val Ala Pro Leu Thr Pro Ser
420 425 430
Ala Gln Leu Thr Glu Arg Asp Cys Thr Gly Lys Val Gly Leu Asp Leu
435 440 445
Arg Arg Gln Met
450
<210> 5
<211> 1365
<212> DNA
<213> Artificial Sequence (Artificial Sequence)
<400> 5
atgcggacgc gccacagcag aactctgcta ctcgagcatc ttgccacaga aagacgacgt 60
ttagctccgt ggaacatagg gtccgcttca acaagacact tgcttctaga caagagtccc 120
tctagaaatc ccttccagtc tatatgttac tggggccaag aattggtcac cggcgggaat 180
cttacattga gtaacgggca gctccgatca cagagatatt ggcacacctt gagttcgacg 240
cccatcctcc gaatctgtat gagatcctgc ttctccttac gttctccgaa actacagttg 300
cgtccgatac cagttttatg tttcttcatg ggaccgagca aaagtttacg cttcgcaagc 360
caagggacgg tggatatagt tcatagtgtt gaacgttccg agagtcagtc ccgcacctac 420
gtgtttttgg aactgtgggc cttcgggttt tacacagcct ttggtttcag catgctatta 480
gtctccacct tggaggcaca cctaatcacg gtaaagggtg atggcctaat acgttgggat 540
gtagcgcgct ccttccggtc aggtcacgag gatggagcgt gtcttacacg cgatccgtgc 600
ggtccgcagt ttgcgtctga cgactacgag ccgcgttctt gcctacccca gatgctatcg 660
gcgagagggg gtcccgtttc gtttaccgat gtgtatggtt gtcattgggc tcaattgcgg 720
attcaggcgg ggctagcaaa ccaagtgttg agtgtttgtc ttatttgtaa ggcatatatg 780
atctcagagt ttttgtccat acctaaccat tcctattact tgcgcgcgcc atgtgaacaa 840
ggtaaaatgt tgatagatgc gaggcacctt tggctacggg tagagcggct gaattctatc 900
attgcaggtc tggcatcact tcgtaagcga ggtaatactc gcaccagctt gaactcaatc 960
cttttattta gtaaggatca acagtataaa atgcggcgcg ccgcactgag tatactacta 1020
tattggggct atttcacagt ccgcgcatcc tgcgataatc ttgtggccac tctacggaaa 1080
gacccccggg aatacgattc ggcgactggg ccgtcgaaac tttgtcagcc caaagctcat 1140
ccctgtcatc cgatgcaaat gtacctgagg gactgggcag gcaaattaag agcaacgaag 1200
cggccagaca ggggcgccca acaagaacat gcggtgaacc ccgccggcta ccatcaaatg 1260
cagagtgcaa ggttggttgc gcctttaacc ccgtccgccc agcttactga gcgcagccac 1320
gattgtacag gaaaggttgg gcttgacctt cgacgtcaga tgtag 1365
<210> 6
<211> 454
<212> PRT
<213> Artificial Sequence (Artificial Sequence)
<400> 6
Met Arg Thr Arg His Ser Arg Thr Leu Leu Leu Glu His Leu Ala Thr
1 5 10 15
Glu Arg Arg Arg Leu Ala Pro Trp Asn Ile Gly Ser Ala Ser Thr Arg
20 25 30
His Leu Leu Leu Asp Lys Ser Pro Ser Arg Asn Pro Phe Gln Ser Ile
35 40 45
Cys Tyr Trp Gly Gln Glu Leu Val Thr Gly Gly Asn Leu Thr Leu Ser
50 55 60
Asn Gly Gln Leu Arg Ser Gln Arg Tyr Trp His Thr Leu Ser Ser Thr
65 70 75 80
Pro Ile Leu Arg Ile Cys Met Arg Ser Cys Phe Ser Leu Arg Ser Pro
85 90 95
Lys Leu Gln Leu Arg Pro Ile Pro Val Leu Cys Phe Phe Met Gly Pro
100 105 110
Ser Lys Ser Leu Arg Phe Ala Ser Gln Gly Thr Val Asp Ile Val His
115 120 125
Ser Val Glu Arg Ser Glu Ser Gln Ser Arg Thr Tyr Val Phe Leu Glu
130 135 140
Leu Trp Ala Phe Gly Phe Tyr Thr Ala Phe Gly Phe Ser Met Leu Leu
145 150 155 160
Val Ser Thr Leu Glu Ala His Leu Ile Thr Val Lys Gly Asp Gly Leu
165 170 175
Ile Arg Trp Asp Val Ala Arg Ser Phe Arg Ser Gly His Glu Asp Gly
180 185 190
Ala Cys Leu Thr Arg Asp Pro Cys Gly Pro Gln Phe Ala Ser Asp Asp
195 200 205
Tyr Glu Pro Arg Ser Cys Leu Pro Gln Met Leu Ser Ala Arg Gly Gly
210 215 220
Pro Val Ser Phe Thr Asp Val Tyr Gly Cys His Trp Ala Gln Leu Arg
225 230 235 240
Ile Gln Ala Gly Leu Ala Asn Gln Val Leu Ser Val Cys Leu Ile Cys
245 250 255
Lys Ala Tyr Met Ile Ser Glu Phe Leu Ser Ile Pro Asn His Ser Tyr
260 265 270
Tyr Leu Arg Ala Pro Cys Glu Gln Gly Lys Met Leu Ile Asp Ala Arg
275 280 285
His Leu Trp Leu Arg Val Glu Arg Leu Asn Ser Ile Ile Ala Gly Leu
290 295 300
Ala Ser Leu Arg Lys Arg Gly Asn Thr Arg Thr Ser Leu Asn Ser Ile
305 310 315 320
Leu Leu Phe Ser Lys Asp Gln Gln Tyr Lys Met Arg Arg Ala Ala Leu
325 330 335
Ser Ile Leu Leu Tyr Trp Gly Tyr Phe Thr Val Arg Ala Ser Cys Asp
340 345 350
Asn Leu Val Ala Thr Leu Arg Lys Asp Pro Arg Glu Tyr Asp Ser Ala
355 360 365
Thr Gly Pro Ser Lys Leu Cys Gln Pro Lys Ala His Pro Cys His Pro
370 375 380
Met Gln Met Tyr Leu Arg Asp Trp Ala Gly Lys Leu Arg Ala Thr Lys
385 390 395 400
Arg Pro Asp Arg Gly Ala Gln Gln Glu His Ala Val Asn Pro Ala Gly
405 410 415
Tyr His Gln Met Gln Ser Ala Arg Leu Val Ala Pro Leu Thr Pro Ser
420 425 430
Ala Gln Leu Thr Glu Arg Ser His Asp Cys Thr Gly Lys Val Gly Leu
435 440 445
Asp Leu Arg Arg Gln Met
450
<210> 7
<211> 1365
<212> DNA
<213> Artificial Sequence (Artificial Sequence)
<400> 7
atgcggacgc gccacagcag aactctgcta ctcgagcatc ttgccacaga aagacgacgt 60
ttagctccgt ggaacatagg gtccgcttca acaagacact tgcttctaga caagagtccc 120
tctagaaatc ccttccagtc tatatgttac tggggccaag aattggtcac cggcgggaat 180
cttacattga gtaacgggca gctccgatca cagagatatt ggcacacctt gagttcgacg 240
cccatcctcc gaatctgtat gagatcctgc ttctccttac gttctccgaa actacagttg 300
cgtccgatac cagttttatg tttcttcatg ggaccgagca gaagtttacg cttcgcaagc 360
caagggacgg tggatatagt tcatagtgtt gaacgttccg agagtcagtc ccgcacctac 420
gtgtttttgg aactgtgggc cttcgggttt tacacagcct ttggtttcag catgctatta 480
gtctccacct tggaggcaca cctaatcacg gtaaagggtg atggcctaat acgttgggat 540
gtagcgcgct ccttccggtc aggtcacgag gatggagcgt gtcttacacg cgatccgtgc 600
ggtccgcagt ttgcgtctga cgactacgag ccgcgttctt gcctacccca gatgctatcg 660
gcgagagggg gtcccgtttt taccgatctt tatatgtgtc attgggctca attgcggatt 720
caggcggggc tagcaaacca agtgttgagt gtttgtctta tttgtaaggc atatatgatc 780
tcagagtttt tgtccatacc taaccattcc tattacttgc gcgcgccatg tgaacaaggt 840
aaaatgttga tagatgcgag gcacctttgg ctacgggtag agcggctgaa ttctatcatt 900
gcaggtctgg catcacttcg taagcgaggt aatactcgca ccagcttgaa ctcaatcctt 960
ttatttagta aggatcaaca gtataaaatg cggcgcgccg cactgagtat actactatat 1020
tggggctatt tcacagtccg cgcatcctgc gataatcttg tggccactct acggaaagac 1080
ccccgggaat acgattcggc gactgggccg tcgaaacttt gtcagcccaa agctcatccc 1140
tgtcatccga tgcaaatgta cctgagggac tgggcaggca aattaagagc aacgaagcgg 1200
ccagacaggg gcgcccaaca agaacatgcg gtgaaccccg ccggctacca tcaaatgcag 1260
agtgcaaggt tggttgcgcc tttaaccccg tccgcccagc ttactgagcg cagccacaca 1320
gattgtacag gaaaggttgg gcttgacctt cgacgtcaga tgtag 1365
<210> 8
<211> 454
<212> PRT
<213> Artificial Sequence (Artificial Sequence)
<400> 8
Met Arg Thr Arg His Ser Arg Thr Leu Leu Leu Glu His Leu Ala Thr
1 5 10 15
Glu Arg Arg Arg Leu Ala Pro Trp Asn Ile Gly Ser Ala Ser Thr Arg
20 25 30
His Leu Leu Leu Asp Lys Ser Pro Ser Arg Asn Pro Phe Gln Ser Ile
35 40 45
Cys Tyr Trp Gly Gln Glu Leu Val Thr Gly Gly Asn Leu Thr Leu Ser
50 55 60
Asn Gly Gln Leu Arg Ser Gln Arg Tyr Trp His Thr Leu Ser Ser Thr
65 70 75 80
Pro Ile Leu Arg Ile Cys Met Arg Ser Cys Phe Ser Leu Arg Ser Pro
85 90 95
Lys Leu Gln Leu Arg Pro Ile Pro Val Leu Cys Phe Phe Met Gly Pro
100 105 110
Ser Arg Ser Leu Arg Phe Ala Ser Gln Gly Thr Val Asp Ile Val His
115 120 125
Ser Val Glu Arg Ser Glu Ser Gln Ser Arg Thr Tyr Val Phe Leu Glu
130 135 140
Leu Trp Ala Phe Gly Phe Tyr Thr Ala Phe Gly Phe Ser Met Leu Leu
145 150 155 160
Val Ser Thr Leu Glu Ala His Leu Ile Thr Val Lys Gly Asp Gly Leu
165 170 175
Ile Arg Trp Asp Val Ala Arg Ser Phe Arg Ser Gly His Glu Asp Gly
180 185 190
Ala Cys Leu Thr Arg Asp Pro Cys Gly Pro Gln Phe Ala Ser Asp Asp
195 200 205
Tyr Glu Pro Arg Ser Cys Leu Pro Gln Met Leu Ser Ala Arg Gly Gly
210 215 220
Pro Val Phe Thr Asp Leu Tyr Met Cys His Trp Ala Gln Leu Arg Ile
225 230 235 240
Gln Ala Gly Leu Ala Asn Gln Val Leu Ser Val Cys Leu Ile Cys Lys
245 250 255
Ala Tyr Met Ile Ser Glu Phe Leu Ser Ile Pro Asn His Ser Tyr Tyr
260 265 270
Leu Arg Ala Pro Cys Glu Gln Gly Lys Met Leu Ile Asp Ala Arg His
275 280 285
Leu Trp Leu Arg Val Glu Arg Leu Asn Ser Ile Ile Ala Gly Leu Ala
290 295 300
Ser Leu Arg Lys Arg Gly Asn Thr Arg Thr Ser Leu Asn Ser Ile Leu
305 310 315 320
Leu Phe Ser Lys Asp Gln Gln Tyr Lys Met Arg Arg Ala Ala Leu Ser
325 330 335
Ile Leu Leu Tyr Trp Gly Tyr Phe Thr Val Arg Ala Ser Cys Asp Asn
340 345 350
Leu Val Ala Thr Leu Arg Lys Asp Pro Arg Glu Tyr Asp Ser Ala Thr
355 360 365
Gly Pro Ser Lys Leu Cys Gln Pro Lys Ala His Pro Cys His Pro Met
370 375 380
Gln Met Tyr Leu Arg Asp Trp Ala Gly Lys Leu Arg Ala Thr Lys Arg
385 390 395 400
Pro Asp Arg Gly Ala Gln Gln Glu His Ala Val Asn Pro Ala Gly Tyr
405 410 415
His Gln Met Gln Ser Ala Arg Leu Val Ala Pro Leu Thr Pro Ser Ala
420 425 430
Gln Leu Thr Glu Arg Ser His Thr Asp Cys Thr Gly Lys Val Gly Leu
435 440 445
Asp Leu Arg Arg Gln Met
450
<210> 9
<211> 7
<212> PRT
<213> Artificial Sequence (Artificial Sequence)
<400> 9
Met Ala Asn Cys Glu His Leu
1 5
<210> 10
<211> 24
<212> DNA
<213> Artificial Sequence (Artificial Sequence)
<400> 10
atggccaact gtgaacatct gtga 24
<210> 11
<211> 9
<212> RNA
<213> Artificial Sequence (Artificial Sequence)
<400> 11
gaauguggu 9
<210> 12
<211> 63
<212> RNA
<213> Artificial Sequence (Artificial Sequence)
<400> 12
ggcauugugg aacaaugcug uaccagcauc ugcucccucu accagcugga gaacuacugc 60
aac 63
<210> 13
<211> 90
<212> RNA
<213> Artificial Sequence (Artificial Sequence)
<400> 13
uuugugaacc aacaccugug cggcucacac cugguggaag cucucuaccu agugugcggg 60
gaacgaggcu ucuucuacac acccaagacc 90
<210> 14
<211> 3
<212> PRT
<213> Artificial Sequence (Artificial Sequence)
<400> 14
Glu Cys Gly
1
<210> 15
<211> 21
<212> PRT
<213> Artificial Sequence (Artificial Sequence)
<400> 15
Gly Ile Val Glu Gln Cys Cys Thr Ser Ile Cys Ser Leu Tyr Gln Leu
1 5 10 15
Glu Asn Tyr Cys Asn
20
<210> 16
<211> 30
<212> PRT
<213> Artificial Sequence (Artificial Sequence)
<400> 16
Phe Val Asn Gln His Leu Cys Gly Ser His Leu Val Glu Ala Leu Tyr
1 5 10 15
Leu Val Cys Gly Glu Arg Gly Phe Phe Tyr Thr Pro Lys Thr
20 25 30
<210> 17
<211> 9
<212> DNA
<213> Artificial Sequence (Artificial Sequence)
<400> 17
gaatgtggt 9
<210> 18
<211> 63
<212> DNA
<213> Artificial Sequence (Artificial Sequence)
<400> 18
ggcattgtgg aacaatgctg taccagcatc tgctccctct accagctgga gaactactgc 60
aac 63
<210> 19
<211> 90
<212> DNA
<213> Artificial Sequence (Artificial Sequence)
<400> 19
tttgtgaacc aacacctgtg cggctcacac ctggtggaag ctctctacct agtgtgcggg 60
gaacgaggct tcttctacac acccaagacc 90
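As a consistency check on the listing above, the short coding sequences can be translated with the standard genetic code and compared with the corresponding peptide entries. The Python sketch below is illustrative; the helper names `CODON_TABLE` and `translate` are introduced here, and the pairing of nucleotide entries with peptide entries (SEQ ID NO: 10 with 9, SEQ ID NO: 11 and 17 with 14, SEQ ID NO: 18 with 15) is inferred from the sequences themselves rather than stated in this section.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order (TTT, TTC, TTA, ...).
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(seq: str) -> str:
    """Translate a DNA or RNA coding sequence (5'->3') up to the first stop codon."""
    seq = seq.upper().replace("U", "T").replace(" ", "")
    protein = []
    for i in range(0, len(seq) - 2, 3):
        aa = CODON_TABLE[seq[i:i + 3]]
        if aa == "*":  # stop codon ends translation
            break
        protein.append(aa)
    return "".join(protein)


if __name__ == "__main__":
    print(translate("atggccaact gtgaacatct gtga"))  # SEQ ID NO: 10
    print(translate("gaauguggu"))                   # SEQ ID NO: 11 (RNA)
    print(translate("gaatgtggt"))                   # SEQ ID NO: 17 (DNA)
```

Under the standard code, SEQ ID NO: 10 translates to MANCEHL (SEQ ID NO: 9), and SEQ ID NO: 11 and 17 both translate to ECG (SEQ ID NO: 14).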
Claims (1)
1. A double-stranded DNA peptide ligase dDPlaseII, characterized in that the amino acid sequence of the dDPlaseII ligase is as shown in SEQ ID NO: 8.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202010077328.9A CN111088234B (en) | 2020-01-27 | 2020-01-27 | Double-stranded DNA peptide ligase dDPlaseII and use method thereof |
Publications (2)
Publication Number | Publication Date |
---|---|
CN111088234A (en) | 2020-05-01 |
CN111088234B (en) | 2022-07-15 |
Family
ID=70399822
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202010077328.9A Expired - Fee Related CN111088234B (en) | 2020-01-27 | 2020-01-27 | Double-stranded DNA peptide ligase dDPlaseII and use method thereof |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN111088234B (en) |
Citations (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US4587044A (en) * | 1983-09-01 | 1986-05-06 | The Johns Hopkins University | Linkage of proteins to nucleic acids |
CN110452303A (en) * | 2019-08-08 | 2019-11-15 | 中国科学院武汉病毒研究所 | It is covalently attached the method and application of nucleic acid and peptide or protein |
CN110637086A (en) * | 2017-03-17 | 2019-12-31 | 珂璧斯塔斯株式会社 | Method for producing complex of RNA molecule and peptide, and use thereof |
Non-Patent Citations (2)
Title |
---|
RNA-peptide fusions for the in vitro selection of peptides and proteins; Roberts, RW et al.; Proceedings of the National Academy of Sciences of the United States of America; 19971111; Vol. 94 (No. 23); pp. 12297-12302 * |
Synthesis of a peptide-universal nucleotide antigen: towards next-generation antibodies to detect topoisomerase I-DNA covalent complexes; Perkins, AL et al.; Organic & Biomolecular Chemistry; 20160413; Vol. 14 (No. 17); pp. 4103-4109 * |
Also Published As
Publication number | Publication date |
---|---|
CN111088234A (en) | 2020-05-01 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
TA01 | Transfer of patent application right | Effective date of registration: 20220629. Address after: 362300 room 114, building 5, yangguangyoujia, Meilin street, Nan'an City, Quanzhou City, Fujian Province. Applicant after: Huang Zhongshan. Address before: 350005 No. 178 on the east side of the second floor of the complex building, Lianpan 34, Xiangyuan street, Jin'an District, Fuzhou City, Fujian Province. Applicant before: Fujian chenxinke Biotechnology Co.,Ltd. |
GR01 | Patent grant | ||
CF01 | Termination of patent right due to non-payment of annual fee | Granted publication date: 20220715 |