CN111088234A

CN111088234A - Double-stranded DNA peptide ligase dDPlaseII and use method thereof

Info

Publication number: CN111088234A
Application number: CN202010077328.9A
Authority: CN
Inventors: 黄志玲; 黄种山; other inventors requested that their names not be disclosed
Original assignee: Fujian Chenxinke Biotechnology Co Ltd
Current assignee: Huang Zhongshan
Priority date: 2020-01-27
Filing date: 2020-01-27
Publication date: 2020-05-01
Anticipated expiration: 2040-01-27
Also published as: CN111088234B

Abstract

The invention provides a novel double-stranded DNA peptide ligase, dDPlaseII, which can catalyze the covalent linkage of a specific double-stranded DNA to a specific polypeptide, together with its enzymological characteristics and method of use. The amino acid sequence of the dDPlaseII enzyme is shown in SEQ ID NO: 8, and the corresponding gene coding sequence is shown in SEQ ID NO: 7. The dDPlaseII enzyme recognizes a specific double-stranded DNA whose 5' end begins with the double-stranded sequence 5'-CTGGATCAT-3' and whose 5'-terminal deoxyribonucleotide C is phosphorylated, and a specific polypeptide chain whose C-terminal end is the heptapeptide N'-MANCEHL-C', and catalyzes the covalent linkage of the two chains. The ligation product of the double-stranded DNA and the polypeptide has a characteristic absorption peak at a wavelength of 372 nm, which can be used to measure the reaction activity of the dDPlaseII ligase. The dDPlaseII ligase buffer consists of 450 mM Tris-HCl (pH 7.8), 100 mM Mg²⁺, 80 mM NaCl, 20 mM ATP and 8 mM Triton X-100; the optimal reaction temperature range is 35-45 ℃, and the ligation reaction time is 3-10 min. The novel double-stranded DNA polypeptide ligase provided by the invention can be used as a tool enzyme for genetic engineering and genetic analysis.

Description

Double-stranded DNA peptide ligase dDPlaseII and use method thereof

Technical Field

The invention relates to the field of molecular biology, in particular to a brand-new ligase capable of connecting a specific double-stranded DNA fragment and a specific polypeptide and a use method thereof.

Background

Genetic engineering depends on a variety of molecular biology tool enzymes. These enzymes carry out in vitro nucleic acid amplification, transcription or reverse transcription, digestion or excision, ligation and modification, as well as protein digestion or excision, and they are widely used in target gene amplification, nucleic acid sequence analysis, recombinant DNA preparation, vector construction, nucleic acid probe labeling, protein analysis and similar fields. For example, DNA polymerase is a core component of a PCR system, and the construction of a cDNA library depends on RNA polymerase. The tool enzymes commonly used in genetic engineering mainly include DNA polymerase, restriction endonuclease, DNA ligase, RNA polymerase, reverse transcriptase, exonuclease, DNA methylase, ribonuclease, deoxyribonuclease, polynucleotide kinase, alkaline phosphatase, terminal nucleotidyl transferase, and various proteases. Most current commercial molecular tool enzymes are derived from microorganisms, mainly because microorganisms grow and proliferate quickly and have vigorous metabolism, which facilitates expression, separation and purification of the enzymes. Microorganisms in nature occupy diverse micro-ecological niches and therefore have diverse metabolic forms and processes; these depend on corresponding enzymes capable of performing various biochemical reactions, so microorganisms are a vast resource pool of molecular enzymes. For example, T4 ligase, derived from the T4 bacteriophage, is originally used by the bacteriophage to repair DNA cleaved by a restriction endonuclease of the host cell and has high ligation efficiency. As more and more enzyme molecules are explored, the number and uses of molecular tool enzymes keep increasing, which greatly broadens the field of genetic engineering research and applications.

In previous research on microbial ecosystems that degrade dried straw, the applicant found that a specific single-stranded RNA peptide ligase exists in the system. It recognizes a specific sequence at the 5' end of a specific mRNA and covalently links it to a specific polypeptide, thereby regulating the translation of that mRNA into protein. Further, by means of bioinformatic design, the applicant engineered the single-stranded RNA peptide ligase gene to develop 1 specific single-stranded DNA peptide ligase and 2 specific double-stranded DNA peptide ligases, which recognize, respectively, a specific single-stranded sequence at the 5' end of a specific single-stranded DNA or a specific double-stranded sequence at the 5' end of a specific double-stranded DNA, and covalently link it to a specific polypeptide. On the basis of exogenous expression of these nucleic acid peptide ligases, the applicant explored their enzymatic properties and methods of use. Many DNA ligases (such as T4 DNA ligase from Thermo Fisher, USA) and modifying enzymes (such as DNA methylase from NEB) are commercially available, but they are used only for ligating and specifically modifying nucleic acid fragments, and no ligase that can covalently link nucleic acid fragments to polypeptide fragments has been reported. The invention provides brand-new nucleic acid polypeptide ligases, which can be used as tool enzymes for genetic engineering and genetic analysis and have great application prospects in nucleic acid labeling, nucleic acid analysis and related fields.

Disclosure of Invention

The present invention provides novel nucleic acid polypeptide ligases useful for ligating specific nucleic acid fragments to specific polypeptide fragments, together with their enzymatic properties and methods of use.

The technical scheme adopted by the invention for solving the technical problems is as follows:

Object (1) of the present invention is to provide a single-stranded RNA peptide ligase, sRPlaseI, which can catalyze the covalent ligation of a specific single-stranded RNA to a specific polypeptide, together with its enzymatic properties and method of use. The amino acid sequence of the sRPlaseI enzyme is shown in SEQ ID NO: 2, and the corresponding gene coding sequence is shown in SEQ ID NO: 1. The sRPlaseI enzyme recognizes a specific single-stranded RNA whose 5' end begins with the sequence 5'-AUGAUCCAG-3' and whose 5'-terminal ribonucleotide A is phosphorylated, and a specific polypeptide chain whose C-terminal end is the heptapeptide N'-MANCEHL-C'. It catalyzes a dehydration condensation reaction between the phosphorylated adenine ribonucleotide A at the 5' end of the RNA single strand and the leucine L at the C-terminus of the polypeptide chain, forming a phosphorus-carbon bond with an O=P-C=O structure, so that the single-stranded RNA and the polypeptide are covalently linked. The ligation product RP of the single-stranded RNA and the polypeptide has a maximum absorption peak at a wavelength of 351 nm; this is a characteristic absorption peak and can be used to measure the reaction activity of the sRPlaseI ligase. The reaction system for the sRPlaseI ligase is: (1) 1 ng/μL-100 ng/μL of the specific RNA single strand; (2) 1 ng/μL-100 ng/μL of the specific polypeptide; (3) ligase buffer consisting of 500 mM Tris-HCl (pH 7.5), 100 mM Mg²⁺, 10 mM ATP and 10 mM DTT, with 0.1 μL of ligase buffer added per 1 μL of reaction system; (4) 1 μg of sRPlaseI enzyme; (5) deionized water to make up the required volume.
Here, the specific RNA single strand is an RNA single strand whose 5' end begins with the sequence 5'-AUGAUCCAG-3' and whose 5'-terminal ribonucleotide A is phosphorylated, and the specific polypeptide is a polypeptide chain whose C-terminal end is the heptapeptide N'-MANCEHL-C'. The optimal reaction temperature range of the sRPlaseI ligase is 30-40 ℃ and the ligation reaction time is 10-15 min; after the ligation reaction is finished, the reaction system can be held at 80 ℃ for 3 min to terminate the reaction completely.
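The buffer rule stated for sRPlaseI (0.1 μL of ligase buffer per 1 μL of reaction system) implies a ten-fold dilution of the buffer components. The following is a minimal sketch of that arithmetic; the function and variable names are illustrative assumptions and are not part of the original disclosure.

```python
# Stock ligase buffer for sRPlaseI as listed in the text (concentrations in mM).
STOCK_BUFFER_MM = {
    "Tris-HCl (pH 7.5)": 500.0,
    "Mg2+": 100.0,
    "ATP": 10.0,
    "DTT": 10.0,
}

# The text specifies 0.1 uL of ligase buffer per 1 uL of reaction system.
BUFFER_FRACTION = 0.1


def buffer_volume_ul(total_volume_ul):
    """Volume of stock ligase buffer (uL) for a given total reaction volume (uL)."""
    return total_volume_ul * BUFFER_FRACTION


def final_concentrations_mm(stock):
    """Final concentration (mM) of each buffer component in the assembled reaction."""
    return {name: conc * BUFFER_FRACTION for name, conc in stock.items()}


if __name__ == "__main__":
    print("Buffer for a 10 uL reaction:", buffer_volume_ul(10.0), "uL")
    for name, conc in final_concentrations_mm(STOCK_BUFFER_MM).items():
        print(f"{name}: {conc:g} mM")
```

Under this reading, a 10 μL reaction takes 1 μL of buffer, and the working concentrations are one tenth of the stock values.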

Object (2) of the present invention is to provide a single-stranded DNA peptide ligase, sDPlaseI, which can catalyze the covalent linkage of a specific single-stranded DNA to a specific polypeptide, together with its enzymatic properties and method of use. The amino acid sequence of the sDPlaseI enzyme is shown in SEQ ID NO: 4, and the corresponding gene coding sequence is shown in SEQ ID NO: 3. The sDPlaseI enzyme recognizes a specific single-stranded DNA whose 5' end begins with the sequence 5'-ATGATCCAG-3' and whose 5'-terminal deoxyribonucleotide A is phosphorylated, and a specific polypeptide chain whose C-terminal end is the heptapeptide N'-MANCEHL-C'. It catalyzes a dehydration condensation reaction between the phosphorylated adenine deoxyribonucleotide A at the 5' end of the DNA single strand and the leucine L at the C-terminus of the polypeptide chain, forming a phosphorus-carbon bond with an O=P-C=O structure, so that the single-stranded DNA and the polypeptide are covalently linked. The ligation product sDP of the single-stranded DNA and the polypeptide has a maximum absorption peak at 358 nm; this is a characteristic absorption peak and can be used to measure the reaction activity of the sDPlaseI ligase. The reaction system for the sDPlaseI ligase is: (1) 1 ng/μL-100 ng/μL of the specific DNA single strand; (2) 1 ng/μL-100 ng/μL of the specific polypeptide; (3) ligase buffer consisting of 500 mM Tris-HCl (pH 7.5), 100 mM Mg²⁺, 10 mM ATP and 10 mM DTT, with 0.1 μL of ligase buffer added per 1 μL of reaction system; (4) 1 μg of sDPlaseI enzyme; (5) deionized water to make up the required volume.
Here, the specific DNA single strand is a DNA single strand whose 5' end begins with the sequence 5'-ATGATCCAG-3' and whose 5'-terminal deoxyribonucleotide A is phosphorylated, and the specific polypeptide is a polypeptide chain whose C-terminal end is the heptapeptide N'-MANCEHL-C'. The optimal reaction temperature range of the sDPlaseI ligase is 30-40 ℃ and the ligation reaction time is 5-15 min; after the ligation reaction is finished, the reaction system can be held at 80 ℃ for 3 min to terminate the reaction completely.

Object (3) of the present invention is to provide a double-stranded DNA peptide ligase, dDPlaseI, which can catalyze the covalent linkage of a specific double-stranded DNA to a specific polypeptide, together with its enzymatic properties and method of use. The amino acid sequence of the dDPlaseI enzyme is shown in SEQ ID NO: 6, and the corresponding gene coding sequence is shown in SEQ ID NO: 5. The dDPlaseI enzyme recognizes a specific double-stranded DNA whose 5' end begins with the double-stranded sequence 5'-ATGATCCAG-3' and whose 5'-terminal deoxyribonucleotide A is phosphorylated, and a specific polypeptide chain whose C-terminal end is the heptapeptide N'-MANCEHL-C'. It catalyzes a dehydration condensation reaction between the phosphorylated adenine deoxyribonucleotide A at the 5' end of the DNA double strand and the leucine L at the C-terminus of the polypeptide chain, forming a phosphorus-carbon bond with an O=P-C=O structure, so that the double-stranded DNA and the polypeptide are covalently linked. The ligation product dDPI of the double-stranded DNA and the polypeptide has a maximum absorption peak at a wavelength of 360 nm; this is a characteristic absorption peak and can be used to measure the reaction activity of the dDPlaseI ligase. The reaction system for the dDPlaseI ligase is: (1) 1 ng/μL-100 ng/μL of the specific DNA double strand; (2) 1 ng/μL-100 ng/μL of the specific polypeptide; (3) ligase buffer consisting of 450 mM Tris-HCl (pH 7.8), 100 mM Mg²⁺, 80 mM NaCl, 20 mM ATP and 8 mM Triton X-100, with 0.1 μL of ligase buffer added per 1 μL of reaction system; (4) 1 μg of dDPlaseI enzyme; (5) deionized water to make up the required volume.
Here, the specific DNA double strand is a DNA double strand whose 5' end begins with the double-stranded sequence 5'-ATGATCCAG-3' and whose 5'-terminal deoxyribonucleotide A is phosphorylated, and the specific polypeptide is a polypeptide chain whose C-terminal end is the heptapeptide N'-MANCEHL-C'. The optimal reaction temperature range of the dDPlaseI ligase is 35-45 ℃ and the ligation reaction time is 3-10 min; after the ligation reaction is finished, the reaction system can be held at 80 ℃ for 3 min to terminate the reaction completely.

Object (4) of the present invention is to provide another double-stranded DNA peptide ligase, dDPlaseII, which can catalyze the covalent ligation of a specific double-stranded DNA to a specific polypeptide, together with its enzymatic properties and method of use. The amino acid sequence of the dDPlaseII enzyme is shown in SEQ ID NO: 8, and the corresponding gene coding sequence is shown in SEQ ID NO: 7. The dDPlaseII enzyme recognizes a specific double-stranded DNA whose 5' end begins with the double-stranded sequence 5'-CTGGATCAT-3' and whose 5'-terminal deoxyribonucleotide C is phosphorylated, and a specific polypeptide chain whose C-terminal end is the heptapeptide N'-MANCEHL-C'. It catalyzes a dehydration condensation reaction between the phosphorylated cytosine deoxyribonucleotide C at the 5' end of the DNA double strand and the leucine L at the C-terminus of the polypeptide chain, forming a phosphorus-carbon bond with an O=P-C=O structure, so that the double-stranded DNA and the polypeptide are covalently linked. The ligation product dDPII of the double-stranded DNA and the polypeptide has a maximum absorption peak at 372 nm; this is a characteristic absorption peak and can be used to measure the reaction activity of the dDPlaseII ligase. The reaction system for the dDPlaseII ligase is: (1) 1 ng/μL-100 ng/μL of the specific DNA double strand; (2) 1 ng/μL-100 ng/μL of the specific polypeptide; (3) ligase buffer consisting of 450 mM Tris-HCl (pH 7.8), 100 mM Mg²⁺, 80 mM NaCl, 20 mM ATP and 8 mM Triton X-100, with 0.1 μL of ligase buffer added per 1 μL of reaction system; (4) 1 μg of dDPlaseII enzyme; (5) deionized water to make up the required volume.
Here, the specific DNA double strand is a DNA double strand whose 5' end begins with the double-stranded sequence 5'-CTGGATCAT-3' and whose 5'-terminal deoxyribonucleotide C is phosphorylated, and the specific polypeptide is a polypeptide chain whose C-terminal end is the heptapeptide N'-MANCEHL-C'. The optimal reaction temperature range of the dDPlaseII ligase is 35-45 ℃ and the ligation reaction time is 3-10 min; after the ligation reaction is finished, the reaction system can be held at 80 ℃ for 3 min to terminate the reaction completely.
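The substrate requirements stated for dDPlaseII can be expressed as a simple sequence check. The sketch below is illustrative only: the function name and inputs are assumptions, it inspects just the strand that carries the recognition sequence, and it does not model double-strandedness or any enzymatic behavior.

```python
# Recognition features for dDPlaseII as stated in the text.
DNA_RECOGNITION_SITE = "CTGGATCAT"  # 5'->3' start of the recognized strand
PEPTIDE_TAG = "MANCEHL"             # heptapeptide at the C-terminal end, N'->C'


def is_ddplase2_substrate_pair(dna_5to3, five_prime_phosphate, peptide):
    """Return True if the DNA strand and peptide meet the stated sequence criteria.

    dna_5to3: DNA strand written 5'->3' (hypothetical input format).
    five_prime_phosphate: whether the 5'-terminal nucleotide is phosphorylated.
    peptide: peptide sequence written N'->C' in one-letter code.
    """
    dna = dna_5to3.upper()
    pep = peptide.upper()
    return (
        five_prime_phosphate
        and dna.startswith(DNA_RECOGNITION_SITE)
        and pep.endswith(PEPTIDE_TAG)
    )


if __name__ == "__main__":
    print(is_ddplase2_substrate_pair("CTGGATCAT", True, "MANCEHL"))
    print(is_ddplase2_substrate_pair("CTGGATCAT", False, "MANCEHL"))
    print(is_ddplase2_substrate_pair("ATGATCCAG", True, "MANCEHL"))
```

The same pattern applies to the other three ligases by swapping in their recognition sequences from the text.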

The novel nucleic acid polypeptide ligases provided by the invention, together with their enzymatic reaction characteristics and methods of use, lay a foundation for their use as new tool enzymes for genetic engineering and genetic analysis and for their application in nucleic acid analysis, nucleic acid labeling and related fields.

Drawings

FIG. 1 is a schematic diagram of the special DNA structure identified by metagenomic sequencing analysis of a dried-straw microecosystem; it mainly comprises promoter I, the X gene, a short peptide gene and a leader DNA fragment.

FIG. 2 shows the verification of the product formed when protein X ligates R9 and P7.

FIG. 3 is a schematic diagram of the phosphorus-carbon bond in the O=P-C=O configuration.

FIG. 4 is a full-wavelength scanning spectrum of the R9 and P7 ligation product RP formed by the single-stranded RNA peptide ligase sRPlaseI, with a characteristic absorption peak at 351 nm.

FIG. 5 is a full-wavelength scanning spectrum of the sD9 and P7 ligation product sDP formed by the single-stranded DNA peptide ligase sDPlaseI, with a characteristic absorption peak at 358 nm.

FIG. 6 is a full-wavelength scanning spectrum of the dD9 and P7 ligation product dDPI formed by the double-stranded DNA peptide ligase dDPlaseI, with a characteristic absorption peak at 360 nm.

FIG. 7 is a full-wavelength scanning spectrum of the dD9 and P7 ligation product dDPII formed by the double-stranded DNA peptide ligase dDPlaseII, with a characteristic absorption peak at 372 nm.

FIG. 8 shows the difference between the double-stranded DNA peptide ligases dDPlaseI and dDPlaseII in the dD9 and P7 ligation reactions.

Detailed Description

The invention is further illustrated below with reference to specific examples.

Example 1 Enzymes catalyzing the ligation of specific RNA single strands and specific polypeptides

In earlier work, the applicant used metagenome sequencing to study the functional genome of a dried-straw microecosystem in a region of Fujian. Assembly and analysis of the sequencing data showed that the upstream regions of more than ten different genes contain similar special DNA structures; FIG. 1 shows this structure using the cellulase gene (specifically endoglucanase, EC 3.2.1.4, the main component of the cellulase system) as an example. Immediately upstream of each of these more than ten genes, including the cellulase gene, there are two ORFs (open reading frames). In the cellulase example, the special DNA structure comprises an X gene 1362 bp in length (sequence shown in SEQ ID NO: 1) and a short peptide gene 24 bp in length (sequence shown in SEQ ID NO: 9). The polypeptide encoded by the short peptide gene is N'-MANCEHL-C' (shown in SEQ ID NO: 10), which corresponds in order to the 7 amino acids N'-methionine-alanine-asparagine-cysteine-glutamic acid-histidine-leucine-C', where N' denotes the N-terminus and C' the C-terminus of the polypeptide chain. The functions of the X gene and the short peptide gene are unknown. They share one promoter (promoter I in the figure), and because they lie immediately upstream of, and close to, the cellulase gene, their products were presumed to be components of the cellulose-degrading enzyme complex.
In addition, between the cellulase gene promoter (promoter II in the figure) and the start codon there is a short 9-nucleotide double-stranded DNA, 5'-ATGATCCAG-3' (complementary strand 3'-TACTAGGTC-5'; the corresponding transcript is the short RNA sequence 5'-AUGAUCCAG-3'). The same short double-stranded DNA precedes the start codon of each of the more than ten genes that are accompanied by the X gene and the short peptide gene, and it is referred to as the leader DNA fragment.
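The relationship between the leader DNA fragment, its complementary strand and its transcript can be checked with a few lines of code. This is a minimal sketch with illustrative helper names. It also shows that the complementary strand, read 5' to 3', is CTGGATCAT, the same 9-nucleotide sequence that the Disclosure lists as the dDPlaseII recognition sequence.

```python
# Translation table for DNA base pairing.
DNA_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def complement(dna):
    """Base-by-base complement, in the same left-to-right order as the input."""
    return dna.translate(DNA_COMPLEMENT)


def reverse_complement(dna):
    """Complementary strand written 5'->3'."""
    return complement(dna)[::-1]


def transcribe(coding_strand):
    """RNA with the same sequence as the coding strand (T replaced by U)."""
    return coding_strand.replace("T", "U")


LEADER_DNA = "ATGATCCAG"  # 5'->3', as given in the text

if __name__ == "__main__":
    print("complementary strand (3'->5'):", complement(LEADER_DNA))
    print("complementary strand (5'->3'):", reverse_complement(LEADER_DNA))
    print("transcript (5'->3'):", transcribe(LEADER_DNA))
```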

To explore the role of the special gene structure shown in FIG. 1 (mainly comprising promoter I, the X gene, the short peptide gene and the leader DNA fragment) in the cellulose-degrading enzyme complex, it was cloned and expressed together with the cellulase gene. Primers were designed to amplify the gene structure shown in FIG. 1 (including the cellulase gene), a recombinant plasmid was constructed and transferred into engineered Escherichia coli for exogenous expression, and the product was then separated and purified. The expression product had essentially no cellulose-degrading ability (including endoglucanase activity). The transcription product of the cellulase gene, namely its mRNA, was detected by real-time fluorescent quantitative PCR, but the corresponding cellulase protein could not be detected by Western blot, and the size of the protein band that was detected was consistent with that of the X gene product. In addition, the N'-MANCEHL-C' polypeptide was also detected, indicating that the short peptide gene was also successfully expressed. Overall, in the recombinant Escherichia coli exogenous expression system the cellulase gene is transcribed into mRNA but not translated into cellulase protein. This indicates that translation of the cellulase mRNA is blocked, whereas the adjacent X gene and short peptide gene successfully express the corresponding protein and polypeptide.

Further analysis of the engineered Escherichia coli carrying the recombinant plasmid with the gene structure shown in FIG. 1 showed that, when the extracted cellulase gene mRNA was analyzed by denaturing RNA electrophoresis against an RNA marker, its length did not match the theoretical value: the actual length was larger than the theoretical length. In-depth analysis (Qubit 4.0) revealed that the mRNA is actually a complex of mRNA and polypeptide, with a short peptide linked to the 5' end of the mRNA; RNase digestion and mass spectrometry showed that this short peptide is the product encoded by the short peptide gene in FIG. 1 (see Example 2 for the related analysis methods and procedures). mRNA and polypeptide do not normally link spontaneously, so it was presumed that the X gene expression product is an enzyme that catalyzes the linkage of a particular mRNA to a particular polypeptide. To explore its characteristics, the X gene needs to be cloned, expressed and purified.

Example 2 Cloning and expression of the X gene and exploration of the X protein properties

Primers 5'-ATGCGGACGCGCCACAGC-3' and 5'-CTACATCTGACGTCGAAGG-3' were designed based on the X gene sequence (SEQ ID NO: 1) to amplify the X gene (using a conventional PCR system and conditions, annealing temperature 51 ℃). The X gene was cloned and expressed with the recombinant protein E. coli expression and purification system (pET Express & Purify Kits) from Takara, Japan, according to the instructions, and the X protein (the product encoded and expressed by the X gene) was isolated and purified by means of the histidine tag on the fusion protein. The final concentration of the purified X protein, determined with Qubit 4.0, was 0.18 μg/mL, and its amino acid sequence is shown in SEQ ID NO: 2. As Example 1 suggests, protein X may be a ligase that catalyzes the ligation of a specific RNA to a specific polypeptide; now that protein X has been successfully expressed exogenously, isolated and purified, its properties can be further verified and studied. Since the short peptide sequence N'-MANCEHL-C' and the short leader RNA single-stranded sequence 5'-AUGAUCCAG-3' are both consistently present (as described above, the leader DNA fragment is transcribed together with the cellulase gene, so that the resulting mRNA carries at its 5' end the short leader RNA sequence transcribed from the leader DNA fragment shown in FIG. 1), it was presumed that the X ligase mainly recognizes and ligates the short peptide sequence and the short leader RNA sequence, thereby blocking the translation into protein of any mRNA that carries the short leader RNA (e.g., the mRNA of the cellulase gene in Example 1).
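A quick consistency check on the two primers and the stated gene length can be sketched as follows. It assumes, as a labeled assumption rather than a statement from the text, that the 1362 bp X gene is a single open reading frame running from the start codon to one terminal stop codon; helper names are illustrative.

```python
FORWARD_PRIMER = "ATGCGGACGCGCCACAGC"   # 5'->3', from the text
REVERSE_PRIMER = "CTACATCTGACGTCGAAGG"  # 5'->3', from the text
X_GENE_LENGTH_BP = 1362                 # from Example 1

STOP_CODONS = {"TAA", "TAG", "TGA"}
DNA_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(dna):
    """Complementary strand written 5'->3'."""
    return dna.translate(DNA_COMPLEMENT)[::-1]


def encoded_residues(orf_length_bp):
    """Amino acids encoded by an ORF of this length, assuming one terminal stop codon."""
    if orf_length_bp % 3 != 0:
        raise ValueError("ORF length is not a multiple of 3")
    return orf_length_bp // 3 - 1


if __name__ == "__main__":
    # The forward primer should begin with the start codon.
    print("forward primer starts with ATG:", FORWARD_PRIMER.startswith("ATG"))
    # The reverse primer anneals to the coding strand, so its reverse complement
    # gives the 3' end of the coding sequence.
    gene_end = reverse_complement(REVERSE_PRIMER)
    print("3' end of coding strand:", gene_end)
    print("ends with a stop codon:", gene_end[-3:] in STOP_CODONS)
    print("residues encoded (assumed single ORF):", encoded_residues(X_GENE_LENGTH_BP))
```

Under these assumptions the forward primer starts at ATG, the reverse primer covers a TAG stop codon, and a 1362 bp ORF would encode 453 amino acids.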

King of King Shi Biotechnology Ltd was commissioned to synthesize the 7-amino-acid short peptide P7, N'-MANCEHL-C' (the product encoded by the short peptide gene in FIG. 1), at a purity of not less than 95%. The short leader RNA single strand R9 of 9 ribonucleotides, 5'-P-AUGAUCCAG-B-3', was synthesized at the same time (it is the transcript of the leader DNA fragment in FIG. 1; its 5'-terminal ribonucleotide A carries a monophosphate modification, denoted P; throughout this text, A in an RNA strand is adenine ribonucleotide, U is uracil ribonucleotide, C is cytosine ribonucleotide and G is guanine ribonucleotide), and biotin (denoted B) was attached to its 3'-terminal ribonucleotide G so that it could be isolated with streptavidin-coated magnetic beads. All references herein to a phosphorylation modification mean that the modifying phosphate group is attached to the 5th carbon atom of the pentose of a ribonucleotide (or deoxyribonucleotide), as in a natural mononucleotide monophosphate or triphosphate. Triphosphate modification may also be applicable, but because monophosphate is more convenient in practice, this is not repeated each time below for the sake of simplicity. As used herein, the 5'-terminal or 3'-terminal nucleotide is the first nucleotide at the corresponding end, i.e., the outermost nucleotide. The binding of biotin to streptavidin is the strongest non-covalent interaction currently known in nature and is commonly used for the isolation, purification and analysis of specific nucleic acids or proteins. A mixed reaction system for protein X was prepared in a PCR tube according to Table 1 and incubated in a PCR instrument at 37 ℃ for 30 min. All components of the reaction system were prepared with RNase-free water, and all consumables were treated to remove RNase.
Since the 3'-terminal ribonucleotide G of the short RNA single strand R9 carries a biotin modification, R9 in the reaction system was captured after the reaction with streptavidin magnetic beads (Dynabeads MyOne T1) from Thermo Fisher, USA, according to the product instructions. After R9 was captured, RNase I (Thermo Fisher; it can completely digest RNA into single nucleotides) was added and RNA digestion was performed according to the protocol. The RNA digestion product was purified and sent to Beijing Baitacg Biotechnology Limited for mass spectrometry analysis. If protein X can recognize the short leader RNA single strand R9 and the short peptide P7 and link the two, digestion of the ligation complex with RNase I yields a conjugate of the short peptide P7 and the 5'-terminal nucleotide A of R9, whose molecular weight is then verified by mass spectrometry. The principle and procedure are shown in detail in FIG. 2 (B in FIG. 2 represents the biotin modification).

The mass spectrum showed a major peak corresponding to the molecular weight of N'-MANCEHL-A (relative molecular weight about 1146.1; A denotes the adenine ribonucleotide) and a minor peak corresponding to the molecular weight of C'-L-A (relative molecular weight about 460.4; C'-L denotes the C-terminal amino acid leucine and A the adenine ribonucleotide), but no peak corresponding to the molecular weight of N'-M-A. The major peak indicates that R9 and P7 are linked together. The minor peak confirms that R9 and P7 are linked through a P-C phosphorus-carbon bond between the 5'-terminal ribonucleotide A of R9 and the C-terminal amino acid L of P7. Specifically, the bond has the O=P-C=O configuration, in which the phosphorus atom of the phosphorus-oxygen double bond P=O is covalently linked to the carbon atom of the carbon-oxygen double bond C=O (see FIG. 3); it is formed by dehydration condensation of the phosphate group of the A ribonucleotide with the carboxyl group of the L amino acid. In conclusion, protein X is a newly discovered single-stranded RNA peptide ligase, abbreviated sRPlaseI (for single-stranded RNA and polypeptide ligase), which catalyzes the covalent linkage of R9 and P7; the specific ligation product of the RNA single strand and the polypeptide is denoted RP.
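The two reported molecular weights can be reproduced with a short calculation. The sketch below assumes standard average residue masses, the free-acid mass of adenosine 5'-monophosphate (AMP), and the loss of exactly one water molecule in the dehydration condensation; these reference values and the helper names are assumptions added for illustration, not data from the text.

```python
# Average residue masses (Da) for the amino acids in P7 (standard reference values).
RESIDUE_MASS = {
    "M": 131.193,
    "A": 71.079,
    "N": 114.104,
    "C": 103.139,
    "E": 129.116,
    "H": 137.141,
    "L": 113.159,
}
WATER = 18.015   # Da
AMP = 347.221    # Da, adenosine 5'-monophosphate (free acid)


def peptide_mass(sequence):
    """Average mass of a free peptide: sum of residue masses plus one water."""
    return sum(RESIDUE_MASS[aa] for aa in sequence) + WATER


def amp_conjugate_mass(sequence):
    """Mass of the peptide joined to AMP with loss of one water (dehydration condensation)."""
    return peptide_mass(sequence) + AMP - WATER


if __name__ == "__main__":
    print(f"MANCEHL-A: {amp_conjugate_mass('MANCEHL'):.1f}")
    print(f"L-A:       {amp_conjugate_mass('L'):.1f}")
```

With these inputs the calculation gives values within about 0.1 of the reported 1146.1 and 460.4.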

Table 1 Protein X reaction system

Component	Amount added
1 μg/μL synthetic RNA single strand R9	1 μL
1 μg/μL synthetic short peptide P7	1 μL
Ligase buffer (500 mM Tris-HCl pH 7.5, 100 mM Mg²⁺, 10 mM ATP and 10 mM DTT)	1 μL
X protein	1 μg
Ultrapure water	Make up to 10 μL

Example 3 Detection of the reaction product of the single-stranded RNA peptide ligase sRPlaseI and exploration of its enzymatic properties

Examples 1 and 2 show that sRPlaseI is a ligase that catalyzes the ligation of a specific single-stranded RNA to a specific polypeptide. To explore its enzymatic properties further, a simple ligation assay needs to be established. Since the product of the reaction catalyzed by the sRPlaseI ligase is a complex of an RNA single strand and a polypeptide covalently linked through a phosphorus-carbon bond of O=P-C=O structure, its absorption spectrum was examined with a view to measuring the reaction by simple spectrophotometry. The reaction product of Table 1 in Example 2, R9, P7 and the blank reaction system (3 μL each) were scanned over the full wavelength range of 200 nm to 1000 nm using an Epoch microplate reader (Biotek, USA); the result for the reaction product is shown in FIG. 4. The ligation complex of the RNA single strand and the polypeptide (designated RP) shows an absorption maximum at 351 nm, whereas R9, P7 and the blank reaction system do not show this peak. This indicates that the 351 nm peak is the characteristic absorption peak of the ligation complex RP, arising from the specific absorption of the phosphorus-carbon bond in the O=P-C=O configuration. Because RP is a ligation product of RNA and polypeptide, absorption peaks also appear at wavelengths of about 215 nm, 260 nm and 280 nm, but owing to interference from the reaction system these wavelengths are not suitable for measuring RP. Measuring absorbance at 351 nm is therefore a simple and effective method for assaying the sRPlaseI ligase reaction (a product analysis method).
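The logic of identifying a characteristic peak (present in the product, absent from the substrates and the blank) can be sketched as below. The spectra are made-up toy values chosen only to illustrate the comparison; they are not measurements from this document, and the 300 nm cutoff and the 3-fold ratio threshold are arbitrary assumptions.

```python
def peak_wavelength(spectrum, min_nm=300):
    """Wavelength (nm) with the highest absorbance at or above min_nm.

    The cutoff skips the 215/260/280 nm region shared by nucleic acids and peptides.
    """
    candidates = {wl: a for wl, a in spectrum.items() if wl >= min_nm}
    return max(candidates, key=candidates.get)


def is_characteristic(product, controls, wavelength, min_ratio=3.0):
    """True if the product absorbs at least min_ratio times more than every control."""
    return all(product[wavelength] >= min_ratio * c.get(wavelength, 0.0) for c in controls)


# Toy spectra (wavelength in nm -> absorbance); illustrative values only.
TOY_PRODUCT = {260: 0.90, 280: 0.50, 330: 0.10, 351: 1.20, 370: 0.15}
TOY_CONTROL = {260: 0.90, 280: 0.50, 330: 0.05, 351: 0.05, 370: 0.04}

if __name__ == "__main__":
    wl = peak_wavelength(TOY_PRODUCT)
    print("candidate peak:", wl, "nm")
    print("characteristic:", is_characteristic(TOY_PRODUCT, [TOY_CONTROL], wl))
```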

To explore the substrate specificity of the sRPlaseI ligase, the R9 and/or P7 fragments were altered. When the R9 fragment was altered by removing and/or replacing some of its ribonucleotides, the system showed no significant absorption peak at 351 nm after the ligation reaction, indicating that no O=P-C=O phosphorus-carbon bond product was formed, i.e., no ligation took place. Likewise, when the P7 fragment was altered by removing and/or replacing some of its amino acids, the system showed no significant absorption peak at 351 nm after the ligation reaction, again indicating that no ligation took place. Ligation catalyzed by the sRPlaseI ligase therefore depends on strict recognition of the R9 and P7 fragments. This is consistent with the observation in Example 1 that the more than ten genes are all preceded by gene structures corresponding to R9 and P7: for the sRPlaseI ligase the R9 and P7 sequences are conserved, and no other sequence forms of the genes corresponding to R9 and P7 were found linked to those genes. However, since the sRPlaseI ligase mediates ligation between the 5'-terminal ribonucleotide of R9 and the C-terminal amino acid of P7, extending the ribonucleotide chain at the 3' end of R9 and/or the amino acid chain at the N-terminus of P7, while keeping R9 and P7 intact, should in theory not affect the ligation reaction.
5'-P-AUGAUCCAG-Rn-3' (where P represents the monophosphate modification) and N'-Pm-MANCEHL-C' were used as substrates. Rn is a single-stranded RNA fragment attached to the 3' end of R9; 3 fragments of different lengths and sequences, R1, R2 and R3, were tried (in view of sequence synthesis techniques and cost, R1-R3 were chosen as the mRNA regions corresponding to glutathione, the human insulin A chain and the human insulin B chain, respectively, the glutathione mRNA being a virtual sequence), with ribonucleotide sequences shown in SEQ ID NO: 11-SEQ ID NO: 13 (9 nt, 63 nt and 90 nt in length, respectively). Pm is a polypeptide fragment attached to the N-terminus of P7; 3 fragments of different lengths and sequences, P1, P2 and P3, were tried (P1-P3 correspond to R1-R3 and are the peptide chains of glutathione, the human insulin A chain and the human insulin B chain, respectively), with amino acid sequences shown in SEQ ID NO: 14-SEQ ID NO: 16 (3 aa, 21 aa and 30 aa in length, respectively). This gives a total of 9 combinations of the two substrates (RNA single strand and polypeptide). With the reaction system set up as in Table 1, a significant absorption peak at 351 nm was detected in the sRPlaseI ligase-mediated reaction systems, indicating that a phosphorus-carbon bond product with O=P-C=O structure was formed, i.e., ligation took place. This is also consistent with the observation in Example 1 that the cellulase gene is transcribed but not translated into protein: the short leader RNA single strand lies at the 5' end of the long mRNA transcribed from the cellulase gene, and P7 is attached to this 5' short leader RNA R9 by the sRPlaseI ligase (the expression product of the X gene adjacent to the cellulase gene in the special gene structure of FIG. 1), so that the cellulase mRNA cannot be recognized and translated.
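The 9 substrate combinations follow from pairing each of the three RNA extensions with each of the three peptide extensions. The sketch below enumerates them using only the lengths given in the text (the sequences themselves are in the sequence listing and are not reproduced here); the data structures and names are illustrative assumptions.

```python
from itertools import product

R9_LENGTH_NT = 9  # leader RNA AUGAUCCAG
P7_LENGTH_AA = 7  # heptapeptide MANCEHL

# Extension lengths as stated in the text.
RNA_EXTENSIONS_NT = {"R1": 9, "R2": 63, "R3": 90}      # attached to the 3' end of R9
PEPTIDE_EXTENSIONS_AA = {"P1": 3, "P2": 21, "P3": 30}  # attached to the N-terminus of P7


def substrate_combinations():
    """List (RNA name, peptide name, total RNA length in nt, total peptide length in aa)."""
    return [
        (r_name, p_name, R9_LENGTH_NT + r_len, P7_LENGTH_AA + p_len)
        for (r_name, r_len), (p_name, p_len) in product(
            RNA_EXTENSIONS_NT.items(), PEPTIDE_EXTENSIONS_AA.items()
        )
    ]


if __name__ == "__main__":
    combos = substrate_combinations()
    print(len(combos), "combinations")
    for r_name, p_name, rna_nt, pep_aa in combos:
        print(f"R9-{r_name} ({rna_nt} nt) + {p_name}-P7 ({pep_aa} aa)")
```

Each Rn length is three times the corresponding Pm length, which matches the statement that R1-R3 are the mRNA regions of the peptides P1-P3.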

Thus, sRPlaseI ligase uses as substrates an RNA single strand whose 5' end begins with intact R9 and whose 5'-terminal ribonucleotide A is phosphorylated (i.e., 5'-P-R9 followed by any further RNA sequence toward the 3' end, where P indicates that the 5'-terminal ribonucleotide of R9 is monophosphorylated) and a polypeptide chain whose C-terminal end consists of intact P7 (i.e., any further polypeptide sequence on the N-terminal side, followed by P7). It catalyzes covalent ligation of the RNA single strand and the polypeptide chain, specifically a dehydration condensation between the 5'-terminal ribonucleotide A of the RNA and the C-terminal amino acid L of the polypeptide that forms a phosphorus-carbon bond with the O=P-C=O structure. The 5'-terminal ribonucleotide is the first ribonucleotide at the 5' end. The phosphorylated state of the nucleotide means that, as in natural mononucleotide monophosphates or triphosphates, a modified or naturally occurring phosphate group is attached to the 5th carbon atom of the pentose of the ribonucleotide. Triphosphate modification may also be applicable; however, monophosphate is used in practice for convenience and, for simplicity, is not indicated specifically below.
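The substrate rule described above can be illustrated with a minimal sketch. The function name and the boolean flag for the 5' phosphate are hypothetical and introduced only for illustration; the sketch merely checks the sequence pattern stated in the text and says nothing about actual enzyme activity.

```python
R9 = "AUGAUCCAG"  # 5'-terminal RNA recognition sequence given in the text
P7 = "MANCEHL"    # C-terminal heptapeptide given in the text


def matches_srplasei_rule(rna_5_to_3: str, peptide_n_to_c: str,
                          five_prime_phosphate: bool) -> bool:
    """Return True if the pair fits the stated pattern:
    RNA begins with intact R9 and carries a 5' phosphate,
    peptide ends with intact P7 (hypothetical helper)."""
    rna_ok = rna_5_to_3.upper().startswith(R9)
    peptide_ok = peptide_n_to_c.upper().endswith(P7)
    return five_prime_phosphate and rna_ok and peptide_ok


# Extensions at the 3' end of R9 and the N-terminus of P7 are tolerated.
print(matches_srplasei_rule("AUGAUCCAG" + "GGC", "ECG" + "MANCEHL", True))
```

A truncated R9 (for example "AUGAUCCA") or a missing 5' phosphate makes the check fail, mirroring the loss of the 351 nm peak reported for altered substrates.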

With reference to Table 1 and using R9 and P7 as substrates, the optimal reaction buffer system and temperature for sRPlaseI ligase were investigated with the buffers and temperatures listed in Table 2. Buffers A and B are 2 commonly used ligase buffers: buffer A consists of 500 mM Tris-HCl (pH 7.5), 100 mM Mg²⁺, 10 mM ATP and 10 mM DTT; buffer B consists of 450 mM Tris-HCl (pH 7.8), 100 mM Mg²⁺, 80 mM NaCl, 20 mM ATP and 8 mM Triton X-100. The amount of sRPlaseI ligase added was 1 μg and the reaction time was 30 min. After the reaction, 3 μL of the reaction product was taken and its absorbance at 351 nm was measured with an Epoch microplate reader (BioTek, USA); the results are shown in Table 2. At every temperature the absorbance of the reaction product was higher in buffer A than in buffer B, indicating that buffer A is more suitable for sRPlaseI ligase. Within the buffer A group the absorbance was highest at 37 ℃ (1.28), indicating that 37 ℃ is the optimal reaction temperature for sRPlaseI ligase. The results in Table 2 further show that the optimal reaction temperature range for sRPlaseI ligase is 30 ℃ to 40 ℃. With 5'-P-AUGAUCCAG-Rn-3' (P indicates that the 5'-terminal ribonucleotide A is phosphorylated) and N'-Pm-MANCEHL-C' as substrates, the optimal buffer and temperature were the same as for the R9 and P7 substrates.

Table 2 Reactivity of sRPlaseI ligase in different reaction buffer systems and at different temperatures

With buffer A at 37 ℃, extending the reaction time to 60 min and increasing the amount of sRPlaseI ligase to 2 μg again gave an absorbance of 1.28 at 351 nm, indicating that the reaction is already complete after 30 min under these conditions. Shortening the reaction time to 10 min, with the system and conditions otherwise unchanged, also gave an absorbance of 1.28 at 351 nm, so 10 min is sufficient for the reaction to go to completion. With the system and conditions otherwise unchanged, holding the sRPlaseI ligase reaction system at 80 ℃ for 3 min (the basic reaction system was incubated at 80 ℃ for 15 min, after which sRPlaseI ligase was added) and then continuing the reaction gave an absorbance at 351 nm that was not significantly different from the blank control (0.08 for the blank control, 0.13 for the system pre-incubated at 80 ℃), indicating that the reaction had been terminated. The reaction can therefore be terminated completely by holding the system at 80 ℃ for 3 min after ligation.
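The comparison with the blank control can be made explicit with a small calculation. The following is only an illustrative sketch: the helper names are hypothetical, and expressing the heat-treated reading as a fraction of the full signal is an assumption made here, since the text itself only states that the difference from the blank is not significant.

```python
def blank_corrected(sample_abs: float, blank_abs: float) -> float:
    """Absorbance at 351 nm after subtracting the blank control."""
    return sample_abs - blank_abs


def residual_fraction(treated_abs: float, full_abs: float,
                      blank_abs: float) -> float:
    """Blank-corrected signal of a treated sample relative to the
    blank-corrected signal of the complete reaction (hypothetical metric)."""
    return (blank_corrected(treated_abs, blank_abs)
            / blank_corrected(full_abs, blank_abs))


# Values reported in the text: complete reaction 1.28, blank 0.08,
# system pre-incubated at 80 degrees C 0.13.
print(round(residual_fraction(0.13, 1.28, 0.08), 3))
```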

In conclusion, the invention provides a single-stranded RNA peptide ligase, sRPlaseI, which catalyzes covalent ligation of a specific single-stranded RNA and a specific polypeptide. Its amino acid sequence is shown in SEQ ID NO: 2 and the corresponding gene coding sequence in SEQ ID NO: 1. sRPlaseI recognizes a specific RNA single strand whose 5' end begins with the sequence 5'-AUGAUCCAG-3' and whose 5'-terminal ribonucleotide A is phosphorylated, and a specific polypeptide chain whose C-terminal end consists of the heptapeptide N'-MANCEHL-C', and catalyzes a dehydration condensation between the adenine ribonucleotide A at the 5' end of the RNA single strand and leucine L at the C-terminus of the polypeptide chain to form a phosphorus-carbon bond with the O=P-C=O structure, so that the single-stranded RNA and the polypeptide are covalently linked. The ligation product of single-stranded RNA and polypeptide, RP, has its maximum absorption peak at 351 nm, which is its characteristic absorption peak. The reaction system for sRPlaseI ligase is: (1) 1 ng/μL-100 ng/μL of the specific RNA single strand; (2) 1 ng/μL-100 ng/μL of the specific polypeptide; (3) ligase buffer consisting of 500 mM Tris-HCl (pH 7.5), 100 mM Mg²⁺, 10 mM ATP and 10 mM DTT, added at 0.1 μL per 1 μL of reaction system; (4) deionized water to the required volume. Here the specific RNA single strand is an RNA single strand whose 5' end begins with the sequence 5'-AUGAUCCAG-3' and whose 5'-terminal ribonucleotide A is phosphorylated, and the specific polypeptide is a polypeptide chain whose C-terminal end consists of the heptapeptide N'-MANCEHL-C'.
The optimal reaction temperature range of sRPlaseI ligase is 30-40 ℃ and the ligation reaction time is 10-15 min; after the ligation reaction, the reaction system can be held at 80 ℃ for 3 min to terminate the reaction completely.
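Because the ligase buffer is added at 0.1 μL per 1 μL of reaction system, its components are diluted tenfold in the final reaction. A minimal sketch of that arithmetic for buffer A follows; the dictionary and function names are hypothetical, and the stock concentrations are those stated above.

```python
# Stock concentrations of buffer A as stated in the text (mM).
BUFFER_A_STOCK_MM = {"Tris-HCl": 500, "Mg2+": 100, "ATP": 10, "DTT": 10}


def final_concentrations_mm(stock_mm: dict, dilution_factor: float = 10.0) -> dict:
    """Final concentration of each component after diluting the buffer
    into the reaction (0.1 uL buffer per 1 uL reaction = tenfold)."""
    return {name: conc / dilution_factor for name, conc in stock_mm.items()}


def buffer_volume_ul(reaction_volume_ul: float) -> float:
    """Volume of ligase buffer for a reaction of the given total volume."""
    return reaction_volume_ul / 10.0


print(final_concentrations_mm(BUFFER_A_STOCK_MM))
print(buffer_volume_ul(10.0))
```

For a 10 μL reaction this gives 1 μL of buffer, which matches the amount used in the reaction systems of this document.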

Example 4 DNA peptide ligase that catalyzes ligation of specific DNA Single strands or specific DNA double strands to specific Polypeptides

In addition to single-stranded RNA, the nucleic acid forms common in living organisms and in molecular biology include double-stranded DNA and single-stranded DNA. To construct specific DNA peptide ligases, the gene encoding the single-stranded RNA peptide ligase sRPlaseI (i.e., the X gene in Fig. 1, whose sequence is shown in SEQ ID NO: 1) was therefore further engineered by bioinformatic design. The substrate binding site and the catalytic center of sRPlaseI were predicted by bioinformatic analysis, and a series of modified genes with changes concentrated in the coding sequences of these key sites was obtained by custom synthesis. The modified enzymes were then cloned, expressed, isolated and purified as in Example 2, using the recombinant protein Escherichia coli expression and purification system (pET Express & Purify Kits) of Takara, Japan, according to the manufacturer's instructions. Given the stringency of the ligase toward its substrates, P7 was retained as the polypeptide substrate, and the DNA counterpart of R9 was used as the nucleic acid substrate, either as single-stranded DNA (5'-P-ATGATCCAG-B-3', sD9; P denotes monophosphorylation and B denotes biotin modification) or as double-stranded DNA (5'-P-ATGATCCAG-B-3' in double-stranded form, with complementary strand 3'-TACTAGGTC-P-5', dD9; P denotes monophosphorylation and B denotes biotin modification).
The short single-stranded DNA sD9 was synthesized directly by Kinsley Biotechnology Co., Ltd. For the short double-stranded DNA fragment dD9, the same company first synthesized the two complementary strands separately, then mixed equal amounts (1.5 μg/μL) of the two complementary single-stranded DNAs in annealing buffer, held the mixture at 25 ℃ for 16 h, purified the double-stranded product and measured its concentration with a Qubit 4.0; the entire preparation of the double-stranded DNA was carried out by the company.

The modified enzyme reaction systems were prepared according to Table 3 and incubated in a PCR instrument at 37 ℃ for 30 min. The reaction products of single-stranded DNA and of double-stranded DNA with P7 are designated sDP and dDP, respectively (collectively, DP). The single-stranded RNA peptide ligase sRPlaseI catalyzes R9 and P7 to form the product RP, which contains a phosphorus-carbon bond with the O=P-C=O structure and absorbs characteristically at 351 nm. Accordingly, the products obtained when each modified enzyme acted on P7 and the DNA strand were scanned over the full wavelength range to see whether they showed a characteristic absorption peak at 351 nm or a nearby wavelength, which allows convenient preliminary screening for modified enzymes able to ligate sD9 (or dD9) to P7. For candidate modified enzyme systems that passed this primary screen, the method of Example 2 was followed: the biotin-labeled DNA strand (sD9 or dD9, i.e., D9) was captured with streptavidin-coated magnetic beads, the ligation products were digested with a DNA exonuclease (able to digest both single-stranded and double-stranded DNA) according to its instructions, and the purified product was sent to a company for mass spectrometric detection and analysis.

Table 3 Modified enzyme reaction system

Component	Amount added
μg/μL DNA single strand sD9 or DNA double-strand fragment dD9	1 μL
1 μg/μL synthetic short peptide P7	1 μL
Ligase buffer (500 mM Tris-HCl (pH 7.5), 100 mM Mg²⁺, 10 mM ATP and 10 mM DTT)	1 μL
Modified enzyme	1 μg
Ultrapure water	Make up to 10 μL
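The water volume in Table 3 depends on the volume occupied by 1 μg of modified enzyme, which the table does not state. The sketch below computes it under the assumption of a known enzyme stock concentration; that concentration, the parameter names and the function name are hypothetical and serve only to illustrate the arithmetic.

```python
def water_volume_ul(enzyme_stock_ug_per_ul: float,
                    total_ul: float = 10.0,
                    dna_ul: float = 1.0,
                    peptide_ul: float = 1.0,
                    buffer_ul: float = 1.0,
                    enzyme_ug: float = 1.0) -> float:
    """Ultrapure water needed to bring the Table 3 system to total_ul.

    enzyme_stock_ug_per_ul is an assumed stock concentration; the
    other defaults are the amounts listed in Table 3.
    """
    enzyme_ul = enzyme_ug / enzyme_stock_ug_per_ul
    water = total_ul - (dna_ul + peptide_ul + buffer_ul + enzyme_ul)
    if water < 0:
        raise ValueError("components exceed the total reaction volume")
    return water


# With an assumed 1 ug/uL enzyme stock, 1 ug of enzyme occupies 1 uL.
print(water_volume_ul(1.0))
```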

Screening of 34 modified enzymes identified 1 candidate highly reactive specific single-stranded DNA polypeptide ligase, designated sDPlaseI (i.e., ligase I for single-stranded DNA and polypeptide), whose gene sequence is shown in SEQ ID NO: 3 and whose amino acid sequence is shown in SEQ ID NO: 4, and 2 candidate highly reactive specific double-stranded DNA polypeptide ligases, designated dDPlaseI (i.e., ligase I for double-stranded DNA and polypeptide) and dDPlaseII (i.e., ligase II for double-stranded DNA and polypeptide), whose gene sequences are shown in SEQ ID NO: 5 and SEQ ID NO: 7 and whose amino acid sequences are shown in SEQ ID NO: 6 and SEQ ID NO: 8, respectively. Full-wavelength scan spectra of the sDPlaseI, dDPlaseI and dDPlaseII reaction systems are shown in FIGS. 5-7, respectively. The products of all 3 DNA polypeptide ligase reaction systems have absorption peaks near 215 nm, 260 nm and 280 nm, and these 3 peaks persist when the biotin-labeled material is captured with streptavidin-coated magnetic beads before the full-wavelength scan. They are the characteristic absorption peaks of DNA and polypeptide, which indirectly supports the presence of a DNA-polypeptide complex.
In addition, the sDPlaseI, dDPlaseI and dDPlaseII reaction systems each show a unique characteristic absorption wavelength, at 358 nm, 360 nm and 372 nm, respectively. These are close to the 351 nm characteristic absorption wavelength of the single-stranded RNA polypeptide complex RP in Example 3 and are likely the characteristic absorption peaks of the O=P-C=O phosphorus-carbon bond; the slight differences probably arise because the nucleotide (ribonucleotide or deoxyribonucleotide) that forms the phosphorus-carbon bond with amino acid L, and its surrounding environment, differ. Once the characteristic absorption peaks had been determined by full-wavelength scanning, the candidate DNA polypeptide ligase systems were verified by mass spectrometry.

The mass spectrum of the sDPlaseI reaction product showed a major peak corresponding to the molecular weight of N'-MANCEHL-A (relative molecular mass 1130.2; the A in -A denotes the 5'-terminal nucleotide of sD9, i.e., deoxyribonucleotide A) and, among the minor peaks, a peak corresponding to the molecular weight of C'-L-A (relative molecular mass 444.4; C'-L denotes the C-terminal amino acid of P7, i.e., amino acid L, and the A in -A denotes the 5'-terminal nucleotide of sD9, i.e., deoxyribonucleotide A). The mass spectrometry results show that sD9 and P7 were linked, and that the linkage is a phosphorus-carbon bond with the O=P-C=O structure between the 5'-terminal deoxyribonucleotide A of sD9 and the C-terminal amino acid L of P7. sDPlaseI therefore catalyzes the covalent ligation of sD9 and P7 and is a novel single-stranded DNA peptide ligase; the ligation product of the DNA single strand and the polypeptide that it catalyzes is denoted sDP.

The mass spectrum of the dDPlaseI reaction product showed a major peak corresponding to the molecular weight of N'-MANCEHL-A (relative molecular mass 1130.2; the A in -A denotes the 5'-terminal nucleotide of the ligated strand of dD9, i.e., deoxyribonucleotide A) and, among the minor peaks, a peak corresponding to the molecular weight of C'-L-A (relative molecular mass 444.4; C'-L denotes the C-terminal amino acid of P7, i.e., amino acid L, and the A in -A denotes the 5'-terminal nucleotide of the ligated strand of dD9, i.e., deoxyribonucleotide A). Note that although the ligated end of the short double-stranded DNA fragment dD9 is an A=T nucleotide pair, the hydrogen bonds of the A=T base pair open under mass spectrometry conditions, so only the deoxyribonucleotide A covalently linked to P7 is retained. The mass spectrometry results show that dD9 and P7 were linked by a phosphorus-carbon bond with the O=P-C=O structure between the 5'-terminal deoxyribonucleotide A on the ligated strand of dD9 and the C-terminal amino acid L of P7. dDPlaseI therefore catalyzes the covalent ligation of dD9 and P7 and is a novel double-stranded DNA peptide ligase; the ligation product of the DNA double strand and the polypeptide that it catalyzes is denoted dDPI.

The mass spectrum of the dDPlaseII reaction product showed a major peak corresponding to the molecular weight of N'-MANCEHL-C (relative molecular mass 1106.1; the C in -C denotes the 5'-terminal nucleotide of the ligated strand of dD9, i.e., deoxyribonucleotide C) and, among the minor peaks, a peak corresponding to the molecular weight of C'-L-C (relative molecular mass 420.4; C'-L denotes the C-terminal amino acid of P7, i.e., amino acid L, and the C in -C denotes the 5'-terminal nucleotide of the ligated strand of dD9, i.e., deoxyribonucleotide C). Note that although the ligated end of the short double-stranded DNA fragment dD9 is a C≡G nucleotide pair, the hydrogen bonds of the C≡G base pair open under mass spectrometry conditions, so only the deoxyribonucleotide C covalently linked to P7 is retained. The mass spectrometry results show that dD9 and P7 were linked by a phosphorus-carbon bond with the O=P-C=O structure between the 5'-terminal deoxyribonucleotide C on one strand of dD9 and the C-terminal amino acid L of P7. dDPlaseII therefore catalyzes the covalent ligation of dD9 and P7 and is a novel double-stranded DNA peptide ligase; the ligation product of the DNA double strand and the polypeptide that it catalyzes is denoted dDPII.
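The reported masses for the A-linked and C-linked products can be compared as a simple consistency check. Deoxyadenosine and deoxycytidine nucleotides differ only in their base, so the mass difference between corresponding products would be expected to equal the mass difference between adenine (C5H5N5) and cytosine (C4H5N3O). The sketch below, which uses standard average atomic masses and hypothetical helper names, only illustrates this arithmetic and does not confirm the proposed bond structure.

```python
# Standard average atomic masses (g/mol).
ATOMIC_MASS = {"C": 12.011, "H": 1.008, "N": 14.007, "O": 15.999}

ADENINE = {"C": 5, "H": 5, "N": 5}           # C5H5N5
CYTOSINE = {"C": 4, "H": 5, "N": 3, "O": 1}  # C4H5N3O


def formula_mass(formula: dict) -> float:
    """Sum of average atomic masses for an elemental composition."""
    return sum(ATOMIC_MASS[element] * count for element, count in formula.items())


expected_diff = formula_mass(ADENINE) - formula_mass(CYTOSINE)

# Relative molecular masses reported in the text.
major_peak_diff = 1130.2 - 1106.1  # N'-MANCEHL-A vs N'-MANCEHL-C
minor_peak_diff = 444.4 - 420.4    # C'-L-A vs C'-L-C

print(round(expected_diff, 2), round(major_peak_diff, 1), round(minor_peak_diff, 1))
```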

Both dDPlaseI and dDPlaseII catalyze the ligation of dD9 and P7 through a covalent phosphorus-carbon bond with the O=P-C=O structure. They differ in the strand that is ligated: dDPlaseI links the 5'-terminal deoxyribonucleotide A on one strand of dD9 (the 5'-P-ATGATCCAG-3' strand) to the C-terminal amino acid L of P7, whereas dDPlaseII links the 5'-terminal deoxyribonucleotide C on the other strand of dD9 (the 5'-P-CTGGATCAT-3' strand, i.e., the complementary strand of the 5'-P-ATGATCCAG-3' strand) to the C-terminal amino acid L of P7, as shown in Fig. 8. In practical applications such as nucleic acid labeling, either strand of a DNA double strand can therefore be labeled by choosing dDPlaseI or dDPlaseII of the invention; similarly, for blocking gene expression or regulation, the transcription or the regulatory function of either strand of a DNA double strand can be blocked by choosing dDPlaseI or dDPlaseII.
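That the two recognition sequences are the two strands of the same duplex can be checked by taking the reverse complement. The following minimal sketch uses a hypothetical helper function and only the sequences given in the text.

```python
COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}


def reverse_complement(seq_5_to_3: str) -> str:
    """Return the complementary DNA strand, written 5'->3'."""
    return "".join(COMPLEMENT[base] for base in reversed(seq_5_to_3.upper()))


DDPLASE_I_STRAND = "ATGATCCAG"   # strand ligated by dDPlaseI
DDPLASE_II_STRAND = "CTGGATCAT"  # strand ligated by dDPlaseII

print(reverse_complement(DDPLASE_I_STRAND))  # CTGGATCAT
```

The 5'-terminal nucleotides of the two strands are A and C, respectively, which correspond to the nucleotides found attached to P7 in the mass spectra of the dDPlaseI and dDPlaseII products.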

Example 5 Exploration of the enzymatic properties of the 3 DNA polypeptide ligases

To explore the substrate specificity of the 3 DNA polypeptide ligases sDPlaseI, dDPlaseI and dDPlaseII, the sD9 (or dD9) and/or P7 fragments were altered. When the sD9 (or dD9) fragment was altered by deleting and/or replacing some of its deoxyribonucleotides, the system showed no significant absorption peak at 358 nm (or 360 nm or 372 nm) after the ligation reaction, indicating that no phosphorus-carbon bond product with the O=P-C=O structure was formed, i.e., no ligation occurred. Likewise, when the P7 fragment was altered by deleting and/or replacing some of its amino acids, the system showed no significant absorption peak at 358 nm (or 360 nm or 372 nm) after the ligation reaction, again indicating that no ligation occurred. The ligation reactions catalyzed by sDPlaseI, dDPlaseI and dDPlaseII thus depend on strict recognition of the D9 (i.e., sD9 or dD9) and P7 fragments, consistent with the strict substrate sequence requirements of the single-stranded RNA polypeptide ligase sRPlaseI; that is, the 3 DNA polypeptide ligases require the D9 and P7 recognition sequences to be conserved. However, because all 3 DNA polypeptide ligases mediate ligation between the 5'-terminal deoxyribonucleotide of D9 and the C-terminal amino acid of P7, extending the deoxyribonucleotide chain at the 3' end on the non-ligated side of D9 and/or the amino acid chain at the N-terminus of P7 should in theory not affect the ligation reaction, provided that D9 and P7 themselves are retained.
The following substrates were used. For the single-stranded DNA polypeptide ligase sDPlaseI system: the DNA single strand 5'-P-ATGATCCAG-Dn-3' (P denotes phosphorylation) and N'-Pm-MANCEHL-C'. For the double-stranded DNA polypeptide ligase dDPlaseI system: the DNA double strand of 5'-P-ATGATCCAG-Dn-3' (complementary strand 3'-TACTAGGTC-Dn'-5', where Dn' is the complementary strand of Dn) and N'-Pm-MANCEHL-C'. For the double-stranded DNA polypeptide ligase dDPlaseII system: the DNA double strand of 5'-P-CTGGATCAT-Dn-3' (complementary strand 3'-GACCTAGTA-Dn'-5', where Dn' is the complementary strand of Dn) and N'-Pm-MANCEHL-C'. Dn is a DNA fragment attached to the 3' end on the non-ligated side of D9; as in the study of the single-stranded RNA polypeptide ligase, 3 fragments of different lengths and sequences, D1, D2 and D3, were tried (D1, D2 and D3 correspond to R1, R2 and R3, respectively, i.e., Dn is the DNA form of Rn), whose sequences are shown in SEQ ID NO: 17-19. Pm is a polypeptide fragment attached to the N-terminus of P7; 3 fragments of different lengths and sequences, P1, P2 and P3, were tried (corresponding to the polypeptides encoded by D1-D3, namely glutathione, human insulin A chain and human insulin B chain, respectively), whose sequences are shown in SEQ ID NO: 14-16. For each of the 3 DNA polypeptide ligases, 9 combinations of the two substrates were tried, and significant absorption peaks were consistently detected at 358 nm, 360 nm and 372 nm in the sDPlaseI, dDPlaseI and dDPlaseII ligase-mediated reaction systems, respectively, indicating that a phosphorus-carbon bond product with the O=P-C=O structure was formed, i.e., ligation occurred.

Thus, sDPlaseI ligase catalyzes covalent ligation of a DNA single strand and a polypeptide chain, using as substrates a DNA single strand whose 5' end begins with intact sD9 and whose 5'-terminal deoxyribonucleotide A is phosphorylated (i.e., 5'-P-ATGATCCAG followed by any further DNA sequence toward the 3' end) and a polypeptide chain whose C-terminal end consists of intact P7. dDPlaseI ligase catalyzes covalent ligation of a DNA double strand and a polypeptide chain, using as substrates a DNA double strand whose 5' end begins with the intact dD9 structure and whose 5'-terminal deoxyribonucleotide A is phosphorylated (i.e., the DNA double strand corresponding to 5'-P-ATGATCCAG followed by any further DNA sequence toward the 3' end) and a polypeptide chain whose C-terminal end consists of intact P7. dDPlaseII ligase catalyzes covalent ligation of a DNA double strand and a polypeptide chain, using as substrates a DNA double strand whose 5' end begins with the intact dD9 structure and whose 5'-terminal deoxyribonucleotide C is phosphorylated (i.e., the DNA double strand corresponding to 5'-P-CTGGATCAT followed by any further DNA sequence toward the 3' end) and a polypeptide chain whose C-terminal end consists of intact P7. The 5'-terminal deoxyribonucleotide is the first deoxyribonucleotide at the 5' end. The phosphorylated state of the deoxyribonucleotide means that, as in natural deoxyribonucleotide monophosphates or triphosphates, a modified or naturally occurring phosphate group is attached to the 5th carbon atom of the pentose of the deoxyribonucleotide. Triphosphate modification may also be applicable; however, monophosphate is used in practice for convenience and, for simplicity, is not indicated specifically below.

With reference to Table 3 and using D9 (sD9 or dD9) and P7 as substrates, the optimal reaction buffer systems and temperatures of the 3 ligases sDPlaseI, dDPlaseI and dDPlaseII were investigated at their characteristic absorption wavelengths with the buffers and temperatures listed in Table 4. As in the study of the single-stranded RNA polypeptide ligase sRPlaseI, the 2 commonly used ligase buffers A and B were used. The amount of DNA polypeptide ligase added was 1 μg and the reaction time was 30 min. After the reaction, 3 μL of the reaction product was taken and its absorbance was measured at 358 nm (sDPlaseI reaction system), 360 nm (dDPlaseI reaction system) or 372 nm (dDPlaseII reaction system) with an Epoch microplate reader (BioTek, USA); the results are shown in Table 4. For sDPlaseI ligase, the absorbance of the reaction product was higher in buffer A than in buffer B at every temperature, indicating that buffer A is more suitable for sDPlaseI ligase. For dDPlaseI and dDPlaseII ligases, the absorbance was higher in buffer B than in buffer A at every temperature, indicating that buffer B is more suitable for both. For sDPlaseI ligase the absorbance was highest in the buffer A group at 37 ℃ (1.37), indicating that 37 ℃ is its optimal reaction temperature. For dDPlaseI and dDPlaseII ligases the absorbance was highest in the buffer B group at 40 ℃ (1.44 and 1.48, respectively), indicating that 40 ℃ is their optimal reaction temperature. The results in Table 4 further show that the optimal reaction temperature range is 30 ℃ to 40 ℃ for sDPlaseI ligase and 35 ℃ to 45 ℃ for dDPlaseI and dDPlaseII ligases.

As in the substrate specificity study, extended substrates were also tested: the DNA single strand 5'-P-ATGATCCAG-Dn-3' and N'-Pm-MANCEHL-C' for the sDPlaseI system; the DNA double strand of 5'-P-ATGATCCAG-Dn-3' and N'-Pm-MANCEHL-C' for the dDPlaseI system; and the DNA double strand of 5'-P-CTGGATCAT-Dn-3' and N'-Pm-MANCEHL-C' for the dDPlaseII system. For all 3 DNA polypeptide ligases, the optimal buffer and temperature were the same as for the D9 and P7 substrates.

Table 4 Reactivity of the 3 DNA polypeptide ligases in different reaction buffer systems and at different temperatures

Following the study of the sRPlaseI reaction conditions in Example 3, the amount of enzyme added and the reaction time in the reaction system (see Table 3) were varied. With buffer A at 37 ℃, 1 μg of sDPlaseI ligase required 10 min for the ligation reaction to go to completion; with buffer B at 40 ℃, 1 μg of dDPlaseI ligase (or dDPlaseII ligase) required 8 min. For sDPlaseI, dDPlaseI and dDPlaseII ligases alike, holding the system at 80 ℃ for 3 min terminated the reaction completely.

In summary, the present invention provides a single-stranded DNA peptide ligase, sDPlaseI, which catalyzes covalent ligation of a specific single-stranded DNA and a specific polypeptide. Its amino acid sequence is shown in SEQ ID NO: 4 and the corresponding gene coding sequence in SEQ ID NO: 3. sDPlaseI recognizes a specific DNA single strand whose 5' end begins with the sequence 5'-ATGATCCAG-3' and whose 5'-terminal deoxyribonucleotide A is phosphorylated, and a specific polypeptide chain whose C-terminal end consists of the heptapeptide N'-MANCEHL-C', and catalyzes a dehydration condensation between the adenine deoxyribonucleotide A at the 5' end of the DNA single strand and leucine L at the C-terminus of the polypeptide chain to form a phosphorus-carbon bond with the O=P-C=O structure, so that the single-stranded DNA and the polypeptide are covalently linked. The ligation product of single-stranded DNA and polypeptide, sDP, has its maximum absorption peak at 358 nm, which is its characteristic absorption peak. The reaction system for sDPlaseI ligase is: (1) 1 ng/μL-100 ng/μL of the specific DNA single strand; (2) 1 ng/μL-100 ng/μL of the specific polypeptide; (3) ligase buffer consisting of 500 mM Tris-HCl (pH 7.5), 100 mM Mg²⁺, 10 mM ATP and 10 mM DTT, added at 0.1 μL per 1 μL of reaction system; (4) deionized water to the required volume. Here the specific DNA single strand is a DNA single strand whose 5' end begins with the sequence 5'-ATGATCCAG-3' and whose 5'-terminal deoxyribonucleotide A is phosphorylated, and the specific polypeptide is a polypeptide chain whose C-terminal end consists of the heptapeptide N'-MANCEHL-C'.
The optimal reaction temperature range of the sDPlaseI ligase is 30-40 ℃, the ligation reaction time is 5-15 min, and the reaction system can be placed at 80 ℃ for 3min after the ligation reaction is finished, so that the reaction can be completely terminated.

The invention also provides a double-stranded DNA peptide ligase, dDPlaseI, which catalyzes covalent ligation of a specific double-stranded DNA and a specific polypeptide. Its amino acid sequence is shown in SEQ ID NO: 6 and the corresponding gene coding sequence in SEQ ID NO: 5. dDPlaseI recognizes a specific DNA double strand whose 5' end begins with the double-stranded sequence 5'-ATGATCCAG-3' and whose 5'-terminal deoxyribonucleotide A is phosphorylated, and a specific polypeptide chain whose C-terminal end consists of the heptapeptide N'-MANCEHL-C', and catalyzes a dehydration condensation between the adenine deoxyribonucleotide A at the 5' end of the DNA double strand and leucine L at the C-terminus of the polypeptide chain to form a phosphorus-carbon bond with the O=P-C=O structure, so that the double-stranded DNA and the polypeptide are covalently linked. The ligation product of double-stranded DNA and polypeptide, dDPI, has its maximum absorption peak at 360 nm, which is its characteristic absorption peak. The reaction system for dDPlaseI ligase is: (1) 1 ng/μL-100 ng/μL of the specific DNA double strand; (2) 1 ng/μL-100 ng/μL of the specific polypeptide; (3) ligase buffer consisting of 450 mM Tris-HCl (pH 7.8), 100 mM Mg²⁺, 80 mM NaCl, 20 mM ATP and 8 mM Triton X-100, added at 0.1 μL per 1 μL of reaction system; (4) deionized water to the required volume. Here the specific DNA double strand is a DNA double strand whose 5' end begins with the double-stranded sequence 5'-ATGATCCAG-3' and whose 5'-terminal deoxyribonucleotide A is phosphorylated, and the specific polypeptide is a polypeptide chain whose C-terminal end consists of the heptapeptide N'-MANCEHL-C'.
The optimal reaction temperature range of the dDPlaseI ligase is 35-45 ℃, the ligation reaction time is 3-10 min, and the reaction system can be placed at 80 ℃ for 3min after the ligation reaction is finished, so that the reaction can be completely terminated.

The invention also provides another double-stranded DNA peptide ligase, dDPlaseII, which catalyzes covalent ligation of a specific double-stranded DNA and a specific polypeptide. Its amino acid sequence is shown in SEQ ID NO: 8 and the corresponding gene coding sequence in SEQ ID NO: 7. dDPlaseII recognizes a specific DNA double strand whose 5' end begins with the double-stranded sequence 5'-CTGGATCAT-3' and whose 5'-terminal deoxyribonucleotide C is phosphorylated, and a specific polypeptide chain whose C-terminal end consists of the heptapeptide N'-MANCEHL-C', and catalyzes a dehydration condensation between the cytosine deoxyribonucleotide C at the 5' end of the DNA double strand and leucine L at the C-terminus of the polypeptide chain to form a phosphorus-carbon bond with the O=P-C=O structure, so that the double-stranded DNA and the polypeptide are covalently linked. The ligation product of double-stranded DNA and polypeptide, dDPII, has its maximum absorption peak at 372 nm, which is its characteristic absorption peak. The reaction system for dDPlaseII ligase is: (1) 1 ng/μL-100 ng/μL of the specific DNA double strand; (2) 1 ng/μL-100 ng/μL of the specific polypeptide; (3) ligase buffer consisting of 450 mM Tris-HCl (pH 7.8), 100 mM Mg²⁺, 80 mM NaCl, 20 mM ATP and 8 mM Triton X-100, added at 0.1 μL per 1 μL of reaction system; (4) deionized water to the required volume. Here the specific DNA double strand is a DNA double strand whose 5' end begins with the double-stranded sequence 5'-CTGGATCAT-3' and whose 5'-terminal deoxyribonucleotide C is phosphorylated, and the specific polypeptide is a polypeptide chain whose C-terminal end consists of the heptapeptide N'-MANCEHL-C'.
The optimal reaction temperature of the dDPlaseII ligase is 35-45 ℃ and the ligation reaction time is 3-10 min; once the ligation reaction is finished, the reaction system can be held at 80 ℃ for 3 min to terminate the reaction completely.
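As a worked illustration of the reaction system, the following minimal sketch computes component volumes for a chosen total volume. It assumes that the listed buffer concentrations describe the buffer stock, so that adding 0.1 μL per 1 μL of reaction dilutes them tenfold. The substrate stock concentrations and all function names are hypothetical, and the volume of the enzyme itself is not modelled.

```python
# Buffer composition as listed in the text, treated here as the stock (assumption).
BUFFER_STOCK_MM = {
    "Tris-HCl (pH 7.8)": 450.0,
    "Mg2+": 100.0,
    "NaCl": 80.0,
    "ATP": 20.0,
    "Triton X-100": 8.0,
}
BUFFER_FRACTION = 0.1  # 0.1 uL of ligase buffer per 1 uL of reaction system


def final_buffer_mm() -> dict:
    """Final concentration (mM) of each buffer component in the reaction."""
    return {name: conc * BUFFER_FRACTION for name, conc in BUFFER_STOCK_MM.items()}


def plan_reaction(total_ul, dna_stock, dna_final, pep_stock, pep_final) -> dict:
    """Volumes (uL) of each component; concentrations are in ng/uL."""
    for label, final in (("DNA", dna_final), ("polypeptide", pep_final)):
        if not 1.0 <= final <= 100.0:
            raise ValueError(f"{label} must be 1-100 ng/uL in the reaction")
    volumes = {
        "ligase buffer": BUFFER_FRACTION * total_ul,
        "DNA": total_ul * dna_final / dna_stock,
        "polypeptide": total_ul * pep_final / pep_stock,
    }
    water = total_ul - sum(volumes.values())
    if water < 0:
        raise ValueError("stocks are too dilute for this total volume")
    volumes["deionized water"] = water
    return volumes


if __name__ == "__main__":
    # Hypothetical 20 uL reaction with 500 and 1000 ng/uL substrate stocks.
    for name, ul in plan_reaction(20.0, 500.0, 50.0, 1000.0, 100.0).items():
        print(f"{name}: {ul:.2f} uL")
```

Under this reading, the working concentrations in the reaction would be one tenth of the listed buffer values, for example 45 mM Tris-HCl.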

In summary, the present invention provides 4 novel nucleic acid-polypeptide ligases: one specific single-stranded RNA peptide ligase, sRPlaseI; one specific single-stranded DNA peptide ligase, sDPlaseI; and two specific double-stranded DNA peptide ligases, dDPlaseI and dDPlaseII. They can be used as tool enzymes for genetic engineering and genetic analysis, and have great application prospects in fields such as nucleic acid labeling and nucleic acid analysis.

Sequence listing

<110> Fujian Chengxi Biotech Co., Ltd

<120> single-stranded RNA peptide ligase sRPPlaseI and use method thereof

<160>19

<170>SIPOSequenceListing 1.0

<210>1

<211>1362

<212>DNA

<213> Artificial Sequence (Artificial Sequence)

<400>1

atgcggacgc gccacagcag aactctgcta ctcgagcatc ttgccacaga aagacgacgt 60

ttagctccgt ggaacatagg gtccgcttca acaagacact tgcttctaga caagagtccc 120

tctagaaatc ccttccagtc tatatgttac tggggccaag aattggtcac cggcgggaat 180

cttacattga gtaacgggca gctccgatca cagagatatt ggcacacctt gagttcgacg 240

cccatcctcc gaatctgtat gagatcctgc ttctccttac gttctccgaa actacagttg 300

cgtccgatac cagttttatg tttcttcatg ggagcgacca atatcagttt acgcttcgca 360

agccaaggga cggtggatat agttcatagt gttgaacgtt ccgagagtca gtcccgcacc 420

tacgtgtttt tggaactgtg ggccttcggg ttttacacag cctttggttt cagcatgcta 480

ttagtctcca ccttggaggc acacctaatc acggtaaagg gtgatggcct aatacgttgg 540

gatgtagcgc gctccttccg gtcaggtcac gaggatggag cgtgtcttac acgcgatccg 600

tgcggtccgc agtttgcgtc tgacgactac gagccgcgtt cttgcctacc ccagatgcta 660

tcggcgagag ggggtcccgg ttcgtttatc gttgtgtatg gttgtcattg ggctcaattg 720

cggattcagg cggggctagc aaaccaagtg ttgagtgttt gtcttatttg taaggcatat 780

atgatctcag agtttttgtc catacctaac cattcctatt acttgcgcgc gccatgtgaa 840

caaggtaaaa tgttgataga tgcgaggcac ctttggctac gggtagagcg gctgaattct 900

atcattgcag gtctggcatc acttcgtaag cgaggtaata ctcgcaccag cttgaactca 960

atccttttat ttagtaagga tcaacagtat aaaatgcggc gcgccgcact gagtatacta 1020

ctatattggg gctatttcac agtccgcgca tcctgcgata atcttgtggc cactctacgg 1080

aaagaccccc gggaatacga ttcggcgact gggccgtcga aactttgtca gcccaaagct 1140

catccctgtc atccgatgca aatgtacctg agggactggg caggcaaatt aagagcaacg 1200

aagcggccag acaggggcgc ccaacaagaa catgcggtga accccgccgg ctaccatcaa 1260

atgcagagtg caaggttggt tgcgccttta accccgtccg cccagcttac tgagcgtgat 1320

tgtacaggaa aggttgggct tgaccttcga cgtcagatgt ag 1362

<210>2

<211>453

<212>PRT

<213> Artificial Sequence (Artificial Sequence)

<400>2

Met Arg Thr Arg His Ser Arg Thr Leu Leu Leu Glu His Leu Ala Thr

1 5 10 15

Glu Arg Arg Arg Leu Ala Pro Trp Asn Ile Gly Ser Ala Ser Thr Arg

20 25 30

His Leu Leu Leu Asp Lys Ser Pro Ser Arg Asn Pro Phe Gln Ser Ile

35 40 45

Cys Tyr Trp Gly Gln Glu Leu Val Thr Gly Gly Asn Leu Thr Leu Ser

50 55 60

Asn Gly Gln Leu Arg Ser Gln Arg Tyr Trp His Thr Leu Ser Ser Thr

65 70 75 80

Pro Ile Leu Arg Ile Cys Met Arg Ser Cys Phe Ser Leu Arg Ser Pro

85 90 95

Lys Leu Gln Leu Arg Pro Ile Pro Val Leu Cys Phe Phe Met Gly Ala

100 105 110

Thr Asn Ile Ser Leu Arg Phe Ala Ser Gln Gly Thr Val Asp Ile Val

115 120 125

His Ser Val Glu Arg Ser Glu Ser Gln Ser Arg Thr Tyr Val Phe Leu

130 135 140

Glu Leu Trp Ala Phe Gly Phe Tyr Thr Ala Phe Gly Phe Ser Met Leu

145 150 155 160

Leu Val Ser Thr Leu Glu Ala His Leu Ile Thr Val Lys Gly Asp Gly

165 170 175

Leu Ile Arg Trp Asp Val Ala Arg Ser Phe Arg Ser Gly His Glu Asp

180 185 190

Gly Ala Cys Leu Thr Arg Asp Pro Cys Gly Pro Gln Phe Ala Ser Asp

195 200 205

Asp Tyr Glu Pro Arg Ser Cys Leu Pro Gln Met Leu Ser Ala Arg Gly

210 215 220

Gly Pro Gly Ser Phe Ile Val Val Tyr Gly Cys His Trp Ala Gln Leu

225 230 235 240

Arg Ile Gln Ala Gly Leu Ala Asn Gln Val Leu Ser Val Cys Leu Ile

245 250 255

Cys Lys Ala Tyr Met Ile Ser Glu Phe Leu Ser Ile Pro Asn His Ser

260 265 270

Tyr Tyr Leu Arg Ala Pro Cys Glu Gln Gly Lys Met Leu Ile Asp Ala

275 280 285

Arg His Leu Trp Leu Arg Val Glu Arg Leu Asn Ser Ile Ile Ala Gly

290 295 300

Leu Ala Ser Leu Arg Lys Arg Gly Asn Thr Arg Thr Ser Leu Asn Ser

305 310 315 320

Ile Leu Leu Phe Ser Lys Asp Gln Gln Tyr Lys Met Arg Arg Ala Ala

325 330 335

Leu Ser Ile Leu Leu Tyr Trp Gly Tyr Phe Thr Val Arg Ala Ser Cys

340 345 350

Asp Asn Leu Val Ala Thr Leu Arg Lys Asp Pro Arg Glu Tyr Asp Ser

355 360 365

Ala Thr Gly Pro Ser Lys Leu Cys Gln Pro Lys Ala His Pro Cys His

370 375 380

Pro Met Gln Met Tyr Leu Arg Asp Trp Ala Gly Lys Leu Arg Ala Thr

385 390 395 400

Lys Arg Pro Asp Arg Gly Ala Gln Gln Glu His Ala Val Asn Pro Ala

405 410 415

Gly Tyr His Gln Met Gln Ser Ala Arg Leu Val Ala Pro Leu Thr Pro

420 425 430

Ser Ala Gln Leu Thr Glu Arg Asp Cys Thr Gly Lys Val Gly Leu Asp

435 440 445

Leu Arg Arg Gln Met

450

<210>3

<211>1359

<212>DNA

<213> Artificial Sequence (Artificial Sequence)

<400>3

atgcggacgc gccacagcag aactctgcta ctcgagcatc ttgccacaga aagacgacgt 60

ttagctccgt ggaacatagg gtccgcttca acaagacact tgcttctaga caagagtccc 120

tctagaaatc ccttccagtc tatatgttac tggggccaag aattggtcac cggcgggaat 180

cttacattga gtaacgggca gctccgatca cagagatatt ggcacacctt gagttcgacg 240

cccatcctcc gaatctgtat gagatcctgc ttctccttac gttctccgaa actacagttg 300

cgtccgatac cagttttatg tttcttcatg ggagcgagca acagtttacg cttcgcaagc 360

caagggacgg tggatatagt tcatagtgtt gaacgttccg agagtcagtc ccgcacctac 420

gtgtttttgg aactgtgggc cttcgggttt tacacagcct ttggtttcag catgctatta 480

gtctccacct tggaggcaca cctaatcacg gtaaagggtg atggcctaat acgttgggat 540

gtagcgcgct ccttccggtc aggtcacgag gatggagcgt gtcttacacg cgatccgtgc 600

ggtccgcagt ttgcgtctga cgactacgag ccgcgttctt gcctacccca gatgctatcg 660

gcgagagggg gtcccggttc gtttaccgat gtgtatggtt gtcattgggc tcaattgcgg 720

attcaggcgg ggctagcaaa ccaagtgttg agtgtttgtc ttatttgtaa ggcatatatg 780

atctcagagt ttttgtccat acctaaccat tcctattact tgcgcgcgcc atgtgaacaa 840

ggtaaaatgt tgatagatgc gaggcacctt tggctacggg tagagcggct gaattctatc 900

attgcaggtc tggcatcact tcgtaagcga ggtaatactc gcaccagctt gaactcaatc 960

cttttattta gtaaggatca acagtataaa atgcggcgcg ccgcactgag tatactacta 1020

tattggggct atttcacagt ccgcgcatcc tgcgataatc ttgtggccac tctacggaaa 1080

gacccccggg aatacgattc ggcgactggg ccgtcgaaac tttgtcagcc caaagctcat 1140

ccctgtcatc cgatgcaaat gtacctgagg gactgggcag gcaaattaag agcaacgaag 1200

cggccagaca ggggcgccca acaagaacat gcggtgaacc ccgccggcta ccatcaaatg 1260

cagagtgcaa ggttggttgc gcctttaacc ccgtccgccc agcttactga gcgtgattgt 1320

acaggaaagg ttgggcttga ccttcgacgt cagatgtag 1359

<210>4

<211>452

<212>PRT

<213> Artificial Sequence (Artificial Sequence)

<400>4

Met Arg Thr Arg His Ser Arg Thr Leu Leu Leu Glu His Leu Ala Thr

1 5 10 15

Glu Arg Arg Arg Leu Ala Pro Trp Asn Ile Gly Ser Ala Ser Thr Arg

20 25 30

His Leu Leu Leu Asp Lys Ser Pro Ser Arg Asn Pro Phe Gln Ser Ile

35 40 45

Cys Tyr Trp Gly Gln Glu Leu Val Thr Gly Gly Asn Leu Thr Leu Ser

50 55 60

Asn Gly Gln Leu Arg Ser Gln Arg Tyr Trp His Thr Leu Ser Ser Thr

65 70 75 80

Pro Ile Leu Arg Ile Cys Met Arg Ser Cys Phe Ser Leu Arg Ser Pro

85 90 95

Lys Leu Gln Leu Arg Pro Ile Pro Val Leu Cys Phe Phe Met Gly Ala

100 105 110

Ser Asn Ser Leu Arg Phe Ala Ser Gln Gly Thr Val Asp Ile Val His

115 120 125

Ser Val Glu Arg Ser Glu Ser Gln Ser Arg Thr Tyr Val Phe Leu Glu

130 135 140

Leu Trp Ala Phe Gly Phe Tyr Thr Ala Phe Gly Phe Ser Met Leu Leu

145 150 155 160

Val Ser Thr Leu Glu Ala His Leu Ile Thr Val Lys Gly Asp Gly Leu

165 170 175

Ile Arg Trp Asp Val Ala Arg Ser Phe Arg Ser Gly His Glu Asp Gly

180 185 190

Ala Cys Leu Thr Arg Asp Pro Cys Gly Pro Gln Phe Ala Ser Asp Asp

195 200 205

Tyr Glu Pro Arg Ser Cys Leu Pro Gln Met Leu Ser Ala Arg Gly Gly

210 215 220

Pro Gly Ser Phe Thr Asp Val Tyr Gly Cys His Trp Ala Gln Leu Arg

225 230 235 240

Ile Gln Ala Gly Leu Ala Asn Gln Val Leu Ser Val Cys Leu Ile Cys

245 250 255

Lys Ala Tyr Met Ile Ser Glu Phe Leu Ser Ile Pro Asn His Ser Tyr

260 265 270

Tyr Leu Arg Ala Pro Cys Glu Gln Gly Lys Met Leu Ile Asp Ala Arg

275 280 285

His Leu Trp Leu Arg Val Glu Arg Leu Asn Ser Ile Ile Ala Gly Leu

290 295 300

Ala Ser Leu Arg Lys Arg Gly Asn Thr Arg Thr Ser Leu Asn Ser Ile

305 310 315 320

Leu Leu Phe Ser Lys Asp Gln Gln Tyr Lys Met Arg Arg Ala Ala Leu

325 330 335

Ser Ile Leu Leu Tyr Trp Gly Tyr Phe Thr Val Arg Ala Ser Cys Asp

340 345 350

Asn Leu Val Ala Thr Leu Arg Lys Asp Pro Arg Glu Tyr Asp Ser Ala

355 360 365

Thr Gly Pro Ser Lys Leu Cys Gln Pro Lys Ala His Pro Cys His Pro

370 375 380

Met Gln Met Tyr Leu Arg Asp Trp Ala Gly Lys Leu Arg Ala Thr Lys

385 390 395 400

Arg Pro Asp Arg Gly Ala Gln Gln Glu His Ala Val Asn Pro Ala Gly

405 410 415

Tyr His Gln Met Gln Ser Ala Arg Leu Val Ala Pro Leu Thr Pro Ser

420 425 430

Ala Gln Leu Thr Glu Arg Asp Cys Thr Gly Lys Val Gly Leu Asp Leu

435 440 445

Arg Arg Gln Met

450

<210>5

<211>1365

<212>DNA

<213> Artificial Sequence (Artificial Sequence)

<400>5

atgcggacgc gccacagcag aactctgcta ctcgagcatc ttgccacaga aagacgacgt 60

ttagctccgt ggaacatagg gtccgcttca acaagacact tgcttctaga caagagtccc 120

tctagaaatc ccttccagtc tatatgttac tggggccaag aattggtcac cggcgggaat 180

cttacattga gtaacgggca gctccgatca cagagatatt ggcacacctt gagttcgacg 240

cccatcctcc gaatctgtat gagatcctgc ttctccttac gttctccgaa actacagttg 300

cgtccgatac cagttttatg tttcttcatg ggaccgagca aaagtttacg cttcgcaagc 360

caagggacgg tggatatagt tcatagtgtt gaacgttccg agagtcagtc ccgcacctac 420

gtgtttttgg aactgtgggc cttcgggttt tacacagcct ttggtttcag catgctatta 480

gtctccacct tggaggcaca cctaatcacg gtaaagggtg atggcctaat acgttgggat 540

gtagcgcgct ccttccggtc aggtcacgag gatggagcgt gtcttacacg cgatccgtgc 600

ggtccgcagt ttgcgtctga cgactacgag ccgcgttctt gcctacccca gatgctatcg 660

gcgagagggg gtcccgtttc gtttaccgat gtgtatggtt gtcattgggc tcaattgcgg 720

attcaggcgg ggctagcaaa ccaagtgttg agtgtttgtc ttatttgtaa ggcatatatg 780

atctcagagt ttttgtccat acctaaccat tcctattact tgcgcgcgcc atgtgaacaa 840

ggtaaaatgt tgatagatgc gaggcacctt tggctacggg tagagcggct gaattctatc 900

attgcaggtc tggcatcact tcgtaagcga ggtaatactc gcaccagctt gaactcaatc 960

cttttattta gtaaggatca acagtataaa atgcggcgcg ccgcactgag tatactacta 1020

tattggggct atttcacagt ccgcgcatcc tgcgataatc ttgtggccac tctacggaaa 1080

gacccccggg aatacgattc ggcgactggg ccgtcgaaac tttgtcagcc caaagctcat 1140

ccctgtcatc cgatgcaaat gtacctgagg gactgggcag gcaaattaag agcaacgaag 1200

cggccagaca ggggcgccca acaagaacat gcggtgaacc ccgccggcta ccatcaaatg 1260

cagagtgcaa ggttggttgc gcctttaacc ccgtccgccc agcttactga gcgcagccac 1320

gattgtacag gaaaggttgg gcttgacctt cgacgtcaga tgtag 1365

<210>6

<211>454

<212>PRT

<213> Artificial Sequence (Artificial Sequence)

<400>6

Met Arg Thr Arg His Ser Arg Thr Leu Leu Leu Glu His Leu Ala Thr

1 5 10 15

Glu Arg Arg Arg Leu Ala Pro Trp Asn Ile Gly Ser Ala Ser Thr Arg

20 25 30

His Leu Leu Leu Asp Lys Ser Pro Ser Arg Asn Pro Phe Gln Ser Ile

35 40 45

Cys Tyr Trp Gly Gln Glu Leu Val Thr Gly Gly Asn Leu Thr Leu Ser

50 55 60

Asn Gly Gln Leu Arg Ser Gln Arg Tyr Trp His Thr Leu Ser Ser Thr

65 70 75 80

Pro Ile Leu Arg Ile Cys Met Arg Ser Cys Phe Ser Leu Arg Ser Pro

85 90 95

Lys Leu Gln Leu Arg Pro Ile Pro Val Leu Cys Phe Phe Met Gly Pro

100 105 110

Ser Lys Ser Leu Arg Phe Ala Ser Gln Gly Thr Val Asp Ile Val His

115 120 125

Ser Val Glu Arg Ser Glu Ser Gln Ser Arg Thr Tyr Val Phe Leu Glu

130 135 140

Leu Trp Ala Phe Gly Phe Tyr Thr Ala Phe Gly Phe Ser Met Leu Leu

145 150 155 160

Val Ser Thr Leu Glu Ala His Leu Ile Thr Val Lys Gly Asp Gly Leu

165 170 175

Ile Arg Trp Asp Val Ala Arg Ser Phe Arg Ser Gly His Glu Asp Gly

180 185 190

Ala Cys Leu Thr Arg Asp Pro Cys Gly Pro Gln Phe Ala Ser Asp Asp

195 200 205

Tyr Glu Pro Arg Ser Cys Leu Pro Gln Met Leu Ser Ala Arg Gly Gly

210 215 220

Pro Val Ser Phe Thr Asp Val Tyr Gly Cys His Trp Ala Gln Leu Arg

225 230 235 240

Ile Gln Ala Gly Leu Ala Asn Gln Val Leu Ser Val Cys Leu Ile Cys

245 250 255

Lys Ala Tyr Met Ile Ser Glu Phe Leu Ser Ile Pro Asn His Ser Tyr

260 265 270

Tyr Leu Arg Ala Pro Cys Glu Gln Gly Lys Met Leu Ile Asp Ala Arg

275 280 285

His Leu Trp Leu Arg Val Glu Arg Leu Asn Ser Ile Ile Ala Gly Leu

290 295 300

Ala Ser Leu Arg Lys Arg Gly Asn Thr Arg Thr Ser Leu Asn Ser Ile

305 310 315 320

Leu Leu Phe Ser Lys Asp Gln Gln Tyr Lys Met Arg Arg Ala Ala Leu

325 330 335

Ser Ile Leu Leu Tyr Trp Gly Tyr Phe Thr Val Arg Ala Ser Cys Asp

340 345 350

Asn Leu Val Ala Thr Leu Arg Lys Asp Pro Arg Glu Tyr Asp Ser Ala

355 360 365

Thr Gly Pro Ser Lys Leu Cys Gln Pro Lys Ala His Pro Cys His Pro

370 375 380

Met Gln Met Tyr Leu Arg Asp Trp Ala Gly Lys Leu Arg Ala Thr Lys

385 390 395 400

Arg Pro Asp Arg Gly Ala Gln Gln Glu His Ala Val Asn Pro Ala Gly

405 410 415

Tyr His Gln Met Gln Ser Ala Arg Leu Val Ala Pro Leu Thr Pro Ser

420 425 430

Ala Gln Leu Thr Glu Arg Ser His Asp Cys Thr Gly Lys Val Gly Leu

435 440 445

Asp Leu Arg Arg Gln Met

450

<210>7

<211>1365

<212>DNA

<213> Artificial Sequence (Artificial Sequence)

<400>7

atgcggacgc gccacagcag aactctgcta ctcgagcatc ttgccacaga aagacgacgt 60

ttagctccgt ggaacatagg gtccgcttca acaagacact tgcttctaga caagagtccc 120

tctagaaatc ccttccagtc tatatgttac tggggccaag aattggtcac cggcgggaat 180

cttacattga gtaacgggca gctccgatca cagagatatt ggcacacctt gagttcgacg 240

cccatcctcc gaatctgtat gagatcctgc ttctccttac gttctccgaa actacagttg 300

cgtccgatac cagttttatg tttcttcatg ggaccgagca gaagtttacg cttcgcaagc 360

caagggacgg tggatatagt tcatagtgtt gaacgttccg agagtcagtc ccgcacctac 420

gtgtttttgg aactgtgggc cttcgggttt tacacagcct ttggtttcag catgctatta 480

gtctccacct tggaggcaca cctaatcacg gtaaagggtg atggcctaat acgttgggat 540

gtagcgcgct ccttccggtc aggtcacgag gatggagcgt gtcttacacg cgatccgtgc 600

ggtccgcagt ttgcgtctga cgactacgag ccgcgttctt gcctacccca gatgctatcg 660

gcgagagggg gtcccgtttt taccgatctt tatatgtgtc attgggctca attgcggatt 720

caggcggggc tagcaaacca agtgttgagt gtttgtctta tttgtaaggc atatatgatc 780

tcagagtttt tgtccatacc taaccattcc tattacttgc gcgcgccatg tgaacaaggt 840

aaaatgttga tagatgcgag gcacctttgg ctacgggtag agcggctgaa ttctatcatt 900

gcaggtctgg catcacttcg taagcgaggt aatactcgca ccagcttgaa ctcaatcctt 960

ttatttagta aggatcaaca gtataaaatg cggcgcgccg cactgagtat actactatat 1020

tggggctatt tcacagtccg cgcatcctgc gataatcttg tggccactct acggaaagac 1080

ccccgggaat acgattcggc gactgggccg tcgaaacttt gtcagcccaa agctcatccc 1140

tgtcatccga tgcaaatgta cctgagggac tgggcaggca aattaagagc aacgaagcgg 1200

ccagacaggg gcgcccaaca agaacatgcg gtgaaccccg ccggctacca tcaaatgcag 1260

agtgcaaggt tggttgcgcc tttaaccccg tccgcccagc ttactgagcg cagccacaca 1320

gattgtacag gaaaggttgg gcttgacctt cgacgtcaga tgtag 1365

<210>8

<211>454

<212>PRT

<213> Artificial Sequence (Artificial Sequence)

<400>8

Met Arg Thr Arg His Ser Arg Thr Leu Leu Leu Glu His Leu Ala Thr

1 5 10 15

Glu Arg Arg Arg Leu Ala Pro Trp Asn Ile Gly Ser Ala Ser Thr Arg

20 25 30

His Leu Leu Leu Asp Lys Ser Pro Ser Arg Asn Pro Phe Gln Ser Ile

35 40 45

Cys Tyr Trp Gly Gln Glu Leu Val Thr Gly Gly Asn Leu Thr Leu Ser

50 55 60

Asn Gly Gln Leu Arg Ser Gln Arg Tyr Trp His Thr Leu Ser Ser Thr

65 70 75 80

Pro Ile Leu Arg Ile Cys Met Arg Ser Cys Phe Ser Leu Arg Ser Pro

85 90 95

Lys Leu Gln Leu Arg Pro Ile Pro Val Leu Cys Phe Phe Met Gly Pro

100 105 110

Ser Arg Ser Leu Arg Phe Ala Ser Gln Gly Thr Val Asp Ile Val His

115 120 125

Ser Val Glu Arg Ser Glu Ser Gln Ser Arg Thr Tyr Val Phe Leu Glu

130 135 140

Leu Trp Ala Phe Gly Phe Tyr Thr Ala Phe Gly Phe Ser Met Leu Leu

145 150 155 160

Val Ser Thr Leu Glu Ala His Leu Ile Thr Val Lys Gly Asp Gly Leu

165 170 175

Ile Arg Trp Asp Val Ala Arg Ser Phe Arg Ser Gly His Glu Asp Gly

180 185 190

Ala Cys Leu Thr Arg Asp Pro Cys Gly Pro Gln Phe Ala Ser Asp Asp

195 200 205

Tyr Glu Pro Arg Ser Cys Leu Pro Gln Met Leu Ser Ala Arg Gly Gly

210 215 220

Pro Val Phe Thr Asp Leu Tyr Met Cys His Trp Ala Gln Leu Arg Ile

225 230 235 240

Gln Ala Gly Leu Ala Asn Gln Val Leu Ser Val Cys Leu Ile Cys Lys

245 250 255

Ala Tyr Met Ile Ser Glu Phe Leu Ser Ile Pro Asn His Ser Tyr Tyr

260 265 270

Leu Arg Ala Pro Cys Glu Gln Gly Lys Met Leu Ile Asp Ala Arg His

275 280 285

Leu Trp Leu Arg Val Glu Arg Leu Asn Ser Ile Ile Ala Gly Leu Ala

290 295 300

Ser Leu Arg Lys Arg Gly Asn Thr Arg Thr Ser Leu Asn Ser Ile Leu

305 310 315 320

Leu Phe Ser Lys Asp Gln Gln Tyr Lys Met Arg Arg Ala Ala Leu Ser

325 330 335

Ile Leu Leu Tyr Trp Gly Tyr Phe Thr Val Arg Ala Ser Cys Asp Asn

340 345 350

Leu Val Ala Thr Leu Arg Lys Asp Pro Arg Glu Tyr Asp Ser Ala Thr

355 360 365

Gly Pro Ser Lys Leu Cys Gln Pro Lys Ala His Pro Cys His Pro Met

370 375 380

Gln Met Tyr Leu Arg Asp Trp Ala Gly Lys Leu Arg Ala Thr Lys Arg

385 390 395 400

Pro Asp Arg Gly Ala Gln Gln Glu His Ala Val Asn Pro Ala Gly Tyr

405 410 415

His Gln Met Gln Ser Ala Arg Leu Val Ala Pro Leu Thr Pro Ser Ala

420 425 430

Gln Leu Thr Glu Arg Ser His Thr Asp Cys Thr Gly Lys Val Gly Leu

435 440 445

Asp Leu Arg Arg Gln Met

450

<210>9

<211>7

<212>PRT

<213> Artificial Sequence (Artificial Sequence)

<400>9

Met Ala Asn Cys Glu His Leu

1 5

<210>10

<211>24

<212>DNA

<213> Artificial Sequence (Artificial Sequence)

<400>10

atggccaact gtgaacatct gtga 24

<210>11

<211>9

<212>RNA

<213> Artificial Sequence (Artificial Sequence)

<400>11

gaauguggu 9

<210>12

<211>63

<212>RNA

<213> Artificial Sequence (Artificial Sequence)

<400>12

ggcauugugg aacaaugcug uaccagcauc ugcucccucu accagcugga gaacuacugc 60

aac 63

<210>13

<211>90

<212>RNA

<213> Artificial Sequence (Artificial Sequence)

<400>13

uuugugaacc aacaccugug cggcucacac cugguggaag cucucuaccu agugugcggg 60

gaacgaggcu ucuucuacac acccaagacc 90

<210>14

<211>3

<212>PRT

<213> Artificial Sequence (Artificial Sequence)

<400>14

Glu Cys Gly

1

<210>15

<211>21

<212>PRT

<213> Artificial Sequence (Artificial Sequence)

<400>15

Gly Ile Val Glu Gln Cys Cys Thr Ser Ile Cys Ser Leu Tyr Gln Leu

1 5 10 15

Glu Asn Tyr Cys Asn

20

<210>16

<211>30

<212>PRT

<213> Artificial Sequence (Artificial Sequence)

<400>16

Phe Val Asn Gln His Leu Cys Gly Ser His Leu Val Glu Ala Leu Tyr

1 5 10 15

Leu Val Cys Gly Glu Arg Gly Phe Phe Tyr Thr Pro Lys Thr

20 25 30

<210>17

<211>9

<212>DNA

<213> Artificial Sequence (Artificial Sequence)

<400>17

gaatgtggt 9

<210>18

<211>63

<212>DNA

<213> Artificial Sequence (Artificial Sequence)

<400>18

ggcattgtgg aacaatgctg taccagcatc tgctccctct accagctgga gaactactgc 60

aac 63

<210>19

<211>90

<212>DNA

<213> Artificial Sequence (Artificial Sequence)

<400>19

tttgtgaacc aacacctgtg cggctcacac ctggtggaag ctctctacct agtgtgcggg 60

gaacgaggct tcttctacac acccaagacc 90
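The short entries in the listing can be cross-checked against one another by translation. The sketch below assumes the standard genetic code and uses hypothetical helper names. It translates SEQ ID NO: 10 and SEQ ID NO: 12 and compares the results with the peptides given as SEQ ID NO: 9 and SEQ ID NO: 15, and it shows that the RNA entry SEQ ID NO: 11 corresponds to the DNA entry SEQ ID NO: 17.

```python
# Standard genetic code, built in the conventional TCAG ordering.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def to_dna(seq: str) -> str:
    """Strip the listing's spacing and convert RNA (U) to DNA (T)."""
    return seq.replace(" ", "").upper().replace("U", "T")


def translate(seq: str) -> str:
    """Translate a coding sequence into one-letter amino acids, stopping at a stop codon."""
    dna = to_dna(seq)
    if len(dna) % 3:
        raise ValueError("sequence length is not a multiple of 3")
    protein = []
    for pos in range(0, len(dna), 3):
        residue = CODON_TABLE[dna[pos:pos + 3]]
        if residue == "*":
            break
        protein.append(residue)
    return "".join(protein)


SEQ_10 = "atggccaact gtgaacatct gtga"
SEQ_11 = "gaauguggu"
SEQ_12 = "ggcauugugg aacaaugcug uaccagcauc ugcucccucu accagcugga gaacuacugc aac"
SEQ_17 = "gaatgtggt"

if __name__ == "__main__":
    print(translate(SEQ_10))                    # MANCEHL (SEQ ID NO: 9)
    print(translate(SEQ_12))                    # GIVEQCCTSICSLYQLENYCN (SEQ ID NO: 15)
    print(to_dna(SEQ_11) == to_dna(SEQ_17))     # True
```

The same arithmetic applies to the declared lengths of the enzyme entries: a 1365-nucleotide coding sequence (SEQ ID NO: 7) corresponds to 454 codons plus one stop codon, matching the 454 residues of SEQ ID NO: 8.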

Claims

1. A double-stranded DNA peptide ligase dDPlaseII, characterized in that the amino acid sequence of the dDPlaseII ligase is as set forth in SEQ ID NO: 8.

2. The double-stranded DNA peptide ligase dDPlaseII of claim 1, wherein the gene coding sequence of the dDPlaseII ligase is as set forth in SEQ ID NO: 7.

3. The double-stranded DNA peptide ligase dDPlaseII according to claim 1 or 2, wherein the dDPlaseII ligase recognizes a specific DNA double strand whose 5' end starts with the double-stranded sequence 5'-CTGGATCAT-3' and whose 5'-terminal deoxyribonucleotide C is phosphorylated, and a specific polypeptide chain whose C end starts with the 7-peptide N'-MANCEHL-C', and catalyzes a dehydration condensation reaction between the phosphorylated cytosine deoxyribonucleotide C at the 5' end of the DNA double strand and the leucine L at the C end of the polypeptide chain to form a phosphorus-carbon bond with an O=P-C=O structure, so that the double-stranded DNA and the polypeptide are covalently joined to each other; the ligation product of the double-stranded DNA and the polypeptide has its maximum absorption peak at 372 nm, which is a characteristic absorption peak and can be used for determining the reaction activity of the dDPlaseII ligase.

4. A method of using the double-stranded DNA peptide ligase dDPlaseII according to claim 1 or 2, wherein the reaction system for the dDPlaseII ligase is: (1) 1 ng/μL-100 ng/μL of a specific DNA double strand; (2) 1 ng/μL-100 ng/μL of a specific polypeptide; (3) ligase buffer consisting of 450 mM Tris-HCl (pH 7.8), 100 mM Mg²⁺, 80 mM NaCl, 20 mM ATP and 8 mM Triton X-100, added at 0.1 μL of ligase buffer per 1 μL of reaction system; (4) 1 μg of dDPlaseII enzyme; (5) deionized water to make up the required volume;

the reaction conditions for the dDPlaseII ligase are: the optimal reaction temperature is 35-45 ℃ and the ligation reaction time is 3-10 min; once the ligation reaction is finished, the reaction system can be held at 80 ℃ for 3 min to terminate the reaction completely.

5. The method of using the double-stranded DNA peptide ligase dDPlaseII according to claim 4, wherein the specific DNA double strand is a DNA double strand whose 5' end starts with the double-stranded sequence 5'-CTGGATCAT-3' and whose 5'-terminal deoxyribonucleotide C is phosphorylated, and the specific polypeptide is a polypeptide chain whose C end starts with the 7-peptide N'-MANCEHL-C'.