CN113717256A - Fusion protein and application thereof - Google Patents

Publication number
CN113717256A
Authority
CN
China
Legal status
Granted
Application number
CN202011308076.2A
Other languages
Chinese (zh)
Other versions
CN113717256B (en)
Inventor
姚红杰
李尧益
王新秀
黄赛南
Current Assignee
Guangzhou Institute of Biomedicine and Health of CAS
Bioisland Laboratory
Original Assignee
Guangzhou Institute of Biomedicine and Health of CAS
Bioisland Laboratory
Application filed by Guangzhou Institute of Biomedicine and Health of CAS and Bioisland Laboratory
Priority to CN202011308076.2A
Publication of CN113717256A
Application granted
Publication of CN113717256B
Legal status: Active

Classifications

    • C: CHEMISTRY; METALLURGY
    • C07: ORGANIC CHEMISTRY
    • C07K: PEPTIDES
    • C07K14/00: Peptides having more than 20 amino acids; Gastrins; Somatostatins; Melanotropins; Derivatives thereof
    • C: CHEMISTRY; METALLURGY
    • C12: BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12N: MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N9/00: Enzymes; Proenzymes; Compositions thereof; Processes for preparing, activating, inhibiting, separating or purifying enzymes
    • C12N9/10: Transferases (2.)
    • C12N9/12: Transferases (2.) transferring phosphorus containing groups, e.g. kinases (2.7)
    • C12N9/1241: Nucleotidyltransferases (2.7.7)
    • C: CHEMISTRY; METALLURGY
    • C12: BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12N: MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N9/00: Enzymes; Proenzymes; Compositions thereof; Processes for preparing, activating, inhibiting, separating or purifying enzymes
    • C12N9/14: Hydrolases (3)
    • C12N9/16: Hydrolases (3) acting on ester bonds (3.1)
    • C12N9/22: Ribonucleases RNAses, DNAses
    • C: CHEMISTRY; METALLURGY
    • C12: BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12Q: MEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
    • C12Q1/00: Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions
    • C12Q1/68: Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions involving nucleic acids
    • C12Q1/6869: Methods for sequencing
    • C: CHEMISTRY; METALLURGY
    • C40: COMBINATORIAL TECHNOLOGY
    • C40B: COMBINATORIAL CHEMISTRY; LIBRARIES, e.g. CHEMICAL LIBRARIES
    • C40B50/00: Methods of creating libraries, e.g. combinatorial synthesis
    • C40B50/06: Biochemical methods, e.g. using enzymes or whole viable microorganisms


Abstract

The invention relates to a fusion protein for preparing a single-cell in-situ active R-loop library and to its application. The fusion protein involves the R-loop-specific binding protein HBD, MNase nuclease and Tn5 transposase. It is mainly used for R-loop detection and high-throughput library construction, where it enables in-situ detection of active R-loops, improves library construction efficiency, reduces library background, improves the accuracy of R-loop detection, and simplifies the R-loop detection process.

Description

Fusion protein and application thereof
Technical Field
The invention relates to the technical field of biology, in particular to a fusion protein and application thereof.
Background
An R-loop is a three-stranded nucleic acid structure in which an RNA strand hybridizes with one strand of double-stranded DNA to form an RNA-DNA hybrid, while the other DNA strand is displaced as a loop. R-loops were long considered harmful to cells and received little attention. In recent years, R-loops have been recognized as important elements of the cell genome that are involved in many biological functions, and they have become an important research field in epigenetics. Currently, most methods for detecting R-loops are DRIP-seq, which is based on the S9.6 antibody, or methods derived from DRIP-seq.
DRIP-seq (DNA:RNA hybrid immunoprecipitation and sequencing) is an immunoprecipitation and high-throughput sequencing technology based on an antibody that specifically recognizes DNA:RNA hybrid strands, and it has been used to detect genome-wide R-loop distribution in model organisms such as human, mouse and yeast. DRIP-seq and its derivative methods roughly comprise the following steps: collecting cells or other biological samples; extracting genomic DNA; cutting the genomic DNA into fragments of a certain size with restriction endonucleases; enriching R-loop-containing DNA fragments by immunoprecipitation with the S9.6 antibody; purifying and recovering the DNA fragments; and constructing a library by one of several library construction methods. However, DRIP-seq and its derivative methods suffer from poor resolution of the detection signal and from the limited specificity of the S9.6 antibody; in particular, S9.6 can also recognize double-stranded RNA. Moreover, means of detecting active R-loops at the single-cell level are scarce.
In general, the existing R-loop detection methods have several defects, mainly: 1. a large number of cells is required; 2. detection is not in situ; 3. strand-specific information is lacking; 4. the procedure takes a long time.
Establishing a novel single-cell active R-loop detection method is therefore of great significance for studying the biological functions of R-loops, and the key is to find a more suitable polypeptide that makes such a method possible.
Disclosure of Invention
The invention aims to provide a fusion protein for preparing a single-cell in-situ active R-loop library. The fusion protein can be used in a novel single-cell active R-loop detection method that, among other advantages, requires fewer cells and a shorter detection time.
The technical solution for achieving this aim is as follows.
A fusion protein comprises a dimer formed by a first functional region and a second functional region, wherein the first functional region comprises the R-loop-specific binding protein (HBD);
the second functional region comprises MNase nuclease or Tn5 transposase;
and a linker connects the first functional region and the second functional region.
In some embodiments, the HBD is an R-loop-specific recognition protein whose amino acid sequence is shown in SEQ ID NO.1, or an amino acid sequence that has one or more amino acids substituted, deleted or added on the basis of the sequence shown in SEQ ID NO.1 and has the same function.
In some embodiments, the MNase nuclease is a truncated form of wild-type MNase, and more preferably has the amino acid sequence shown in SEQ ID NO.2, or an amino acid sequence that has one or more amino acids substituted, deleted or added on the basis of the sequence shown in SEQ ID NO.2 and has the same function.
In some embodiments, the Tn5 transposase is a mutant of wild-type Tn5 transposase whose amino acid sequence is shown in SEQ ID NO.3, or an amino acid sequence that has one or more amino acids substituted, deleted or added on the basis of the sequence shown in SEQ ID NO.3 and has the same function.
In some embodiments, the amino acid sequence of the linker is shown in SEQ ID NO.4.
In some embodiments, the fusion protein further comprises a protein purification tag, such as a His tag, a GST tag, an MBP tag, a SUMO tag, or another affinity chromatography purification tag.
In some embodiments, the N-terminus (amino terminus) of the second functional region is linked, via the linker, to the C-terminus of the first functional region.
In some embodiments, the monomer of the fusion protein is HBD-MNase or HBD-Tn5. The amino acid sequence of the fusion protein HBD-MNase is shown in SEQ ID NO.5, or is an amino acid sequence that has one or more amino acids substituted, deleted or added on the basis of the sequence shown in SEQ ID NO.5 and has the same function. The amino acid sequence of the fusion protein HBD-Tn5 is shown in SEQ ID NO.6, or is an amino acid sequence that has one or more amino acids substituted, deleted or added on the basis of the sequence shown in SEQ ID NO.6 and has the same function.
The invention also aims to provide the application of the fusion protein in preparing an R-loop high-throughput sequencing library of a biological sample.
In some embodiments, the fusion protein is used for preparing an R-loop high-throughput sequencing library of a biological sample, wherein the biological sample includes, but is not limited to, cultured cell samples and tissue samples, whether or not they have been crosslinked, fixed or frozen; the high-throughput sequencing library is a high-throughput sequencing library for detecting R-loops.
Another objective of the invention is to provide a method for preparing a high-throughput sequencing library of a biological sample.
A method of preparing a high-throughput sequencing library of a biological sample comprises the following steps:
(1) collecting and processing a biological sample to obtain a single-cell suspension;
(2) washing the biological sample with a buffer solution, adding an appropriate amount of the fusion protein and allowing it to bind fully, and then washing away the unbound protein;
(3) activating the fusion protein and allowing it to react fully, to obtain small DNA fragments or tagged DNA fragments that are recognized and cut by the fusion protein;
(4) adding a stop solution to terminate the reaction, and purifying and recovering the DNA fragments;
(5) carrying out PCR amplification to complete library construction.
The fusion protein is mainly used for constructing R-loop high-throughput detection libraries. Compared with existing methods, it has the following beneficial effects:
The invention links a suitable MNase nuclease or Tn5 transposase to the specific binding protein HBD through a linker to form a novel fusion protein. This fusion protein markedly reduces the number of cells required for high-throughput R-loop detection, even down to the single-cell level: traditional methods such as DRIP-seq require on the order of ten million cells, whereas the present detection method requires at most one hundred thousand cells.
The fusion protein constructed by the invention markedly improves the specificity of R-loop detection. In the traditional approach, DRIP-seq uses the S9.6 antibody to capture R-loops from the fragmented genome, but the specificity of S9.6 in this setting is weak: it captures the hybrid strand in R-loops but can also bind double-stranded RNA, so the detected signal may not represent genuine R-loops. RNase H specifically digests R-loops, and its specific binding capacity for R-loops is hundreds of times stronger than that of S9.6. Using the specific binding of the RNase H binding domain (HBD) to R-loops therefore effectively improves their specific recognition and capture.
The fusion protein constructed by the invention markedly simplifies the experimental workflow of high-throughput R-loop detection, improves library construction efficiency and reduces library background. Library construction takes as little as half an hour, and the hands-on operation can be completed within five minutes. Tn5-based library construction markedly reduces library background: the Tn5 in the HBD-Tn5 fusion protein adds specific adapters to both ends of the cleavage site while cutting the genome. Once the adapter-tagged positive fragments are released, the conventional end repair, A-tailing and adapter ligation steps are unnecessary, and the library can be prepared directly and efficiently by PCR, which markedly simplifies the workflow and improves library construction efficiency. The fusion protein thus enables rapid, high-throughput single-cell library preparation for R-loop detection: library construction with this detection method takes at most two hours, and HBD-Tn5-based library construction takes as little as half an hour.
The fusion protein enables in-situ detection during high-throughput R-loop detection and thereby effectively preserves the spatial in-situ information of the sample. The detection requires neither conventional chemical crosslinking nor conventional enzymatic digestion or sonication that disrupts subcellular structures. After the cells are permeabilized, the fusion protein is incubated with the cells and is activated by adding Ca2+ or Mg2+; the R-loops are then cut in situ and the genomic fragments are released, so the native structure of the living cell is preserved throughout the process. Traditional DRIP-seq, by contrast, requires crosslinking on the order of tens of millions of cells and fragmenting the genome by enzymatic digestion or sonication, so in-situ information is lost.
Drawings
FIG. 1: Construction map of the HBD-MNase expression vector.
FIG. 2: Construction map of the HBD-Tn5 expression vector.
FIG. 3: Purification result of high-purity HBD-MNase.
FIG. 4: Purification result of high-purity HBD-Tn5.
FIG. 5: Activity assay results of HBD-MNase.
FIG. 6: Activity assay results of HBD-Tn5.
FIG. 7: Analysis of R-mapping results obtained with the HBD-MNase fusion protein of the invention, with track plots compared against traditional DRIP-seq.
FIG. 8: Analysis of R-mapping results obtained with the HBD-Tn5 fusion protein of the invention, with track plots compared against the traditional DRIP-seq detection method.
Detailed Description
In order that the invention may be more readily understood, reference will now be made to the following more particular description of the invention, examples of which are set forth below. This invention may, however, be embodied in many different forms and should not be construed as limited to the embodiments set forth herein. These embodiments are provided so that this disclosure will be thorough and complete. It is to be understood that the experimental procedures in the following examples, where specific conditions are not noted, are generally in accordance with conventional conditions, or with conditions recommended by the manufacturer. The various reagents used in the examples are commercially available.
Unless defined otherwise, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this invention belongs. The terminology used in the description of the invention herein is for the purpose of describing particular embodiments only and is not intended to be limiting of the invention. As used herein, the term "and/or" includes any and all combinations of one or more of the associated listed items.
The present invention is further illustrated by the following specific examples, which are not intended to limit the scope of the invention.
In the following embodiments, the fusion protein of the invention can be used for high-throughput detection of R-loops.
The fusion protein comprises a dimer formed by a first functional region and a second functional region, wherein the first functional region comprises the R-loop-specific binding protein HBD, whose amino acid sequence is shown in SEQ ID NO.1;
the second functional region comprises MNase nuclease (amino acid sequence shown in SEQ ID NO.2) or Tn5 transposase (amino acid sequence shown in SEQ ID NO.3);
a linker connects the first functional region and the second functional region, with the amino acid sequence shown in SEQ ID NO.4, and the protein purification tag is a 6×His tag.
The monomer of the fusion protein is HBD-MNase or HBD-Tn5; the amino acid sequence of the fusion protein HBD-MNase is shown in SEQ ID NO.5, and the amino acid sequence of the fusion protein HBD-Tn5 is shown in SEQ ID NO.6.
SEQ ID NO.1
>HBD
MFYAVRRGRRTGVFLSWSECKAQVDRFPAARFKKFATEDEAWAF
SEQ ID NO.2
>MNase
ATSTKKLHKEPATLIKAIDGDTVKLMYKGQPMTFRLLLVDTPETKHPKKGVEKYGPEASAFTKKMVENAKKIEVEFDKGQRTDKYGRGLAYIYADGKMVNEALVRQGLAKVAYVYKPNNTHEQHLRKSEAQAKKEKLNIWSEDNADSGQ
SEQ ID NO.3
>Tn5
MITSALHRAADWAKSVFSSAALGDPRRTARLVNVAAQLAKYSGKSITISSEGSKAMQEGAYRFIRNPNVSAEAIRKAGAMQTVKLAQEFPELLAIEDTTSLSYRHQVAEELGKLGSIQDKSRGWWVHSVLLLEATTFRTVGLLHQEWWMRPDDPADADEKESGKWLAAAATSRLRMGSMMSNVIAVCDREADIHAYLQDKLAHNERFVVRSKHPRKDVESGLYLYDHLKNQPELGGYQISIPQKGVVDKRGKRKNRPARKASLSLRSGRITLKQGNITLNAVLAEEINPPKGETPLKWLLLTSEPVESLAQALRVIDIYTHRWRIEEFHKAWKTGAGAERQRMEEPDNLERMVSILSFVAVRLLQLRESFTPPQALRAQGLLKEAEHVESQSAETVLTPDECQLLGYLDKGKRKRKEKAGSLQWAYMAIARLGGFMDSKRTGIASWGALWEGWEALQSKLDGFLAAKDLMAQGIKI
Linker1 (for HBD-MNase)
DDDKEF
Linker2 (for HBD-Tn5; SEQ ID NO.4)
DDDKEFGGGGS
6×His Tag
HHHHHH
SEQ ID NO.5
>HBD-MNase
MFYAVRRGRRTGVFLSWSECKAQVDRFPAARFKKFATEDEAWAFDDDKEFGGGGSATSTKKLHKEPATLIKAIDGDTVKLMYKGQPMTFRLLLVDTPETKHPKKGVEKYGPEASAFTKKMVENAKKIEVEFDKGQRTDKYGRGLAYIYADGKMVNEALVRQGLAKVAYVYKPNNTHEQHLRKSEAQAKKEKLNIWSEDNADSGQ
SEQ ID NO.6
>HBD-Tn5
MFYAVRRGRRTGVFLSWSECKAQVDRFPAARFKKFATEDEAWAFDDDKEFGGGGSMITSALHRAADWAKSVFSSAALGDPRRTARLVNVAAQLAKYSGKSITISSEGSKAMQEGAYRFIRNPNVSAEAIRKAGAMQTVKLAQEFPELLAIEDTTSLSYRHQVAEELGKLGSIQDKSRGWWVHSVLLLEATTFRTVGLLHQEWWMRPDDPADADEKESGKWLAAAATSRLRMGSMMSNVIAVCDREADIHAYLQDKLAHNERFVVRSKHPRKDVESGLYLYDHLKNQPELGGYQISIPQKGVVDKRGKRKNRPARKASLSLRSGRITLKQGNITLNAVLAEEINPPKGETPLKWLLLTSEPVESLAQALRVIDIYTHRWRIEEFHKAWKTGAGAERQRMEEPDNLERMVSILSFVAVRLLQLRESFTPPQALRAQGLLKEAEHVESQSAETVLTPDECQLLGYLDKGKRKRKEKAGSLQWAYMAIARLGGFMDSKRTGIASWGALWEGWEALQSKLDGFLAAKDLMAQGIKI
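As an illustration of how the listed sequences fit together, the following minimal Python sketch assembles a fusion in N-to-C order (first functional region, linker, second functional region) and prints the residues around the junction. The helper names `assemble_fusion` and `junction` are hypothetical and not part of the patent; only the first ten residues of the Tn5 sequence are used, so the result is a fragment rather than the full SEQ ID NO.6.

```python
# Sketch only: helper names are illustrative assumptions, not from the patent.
HBD = "MFYAVRRGRRTGVFLSWSECKAQVDRFPAARFKKFATEDEAWAF"  # SEQ ID NO.1
LINKER2 = "DDDKEFGGGGS"                                # SEQ ID NO.4
TN5_START = "MITSALHRAA"                               # first 10 residues of SEQ ID NO.3


def assemble_fusion(n_domain: str, linker: str, c_domain: str) -> str:
    """Concatenate the parts in N-to-C order, dropping layout whitespace."""
    return "".join("".join(part.split()) for part in (n_domain, linker, c_domain))


def junction(fusion: str, n_domain: str, linker: str, flank: int = 4) -> str:
    """Return the linker with `flank` residues of context on each side."""
    start = len(n_domain) - flank
    end = len(n_domain) + len(linker) + flank
    return fusion[start:end]


fusion_fragment = assemble_fusion(HBD, LINKER2, TN5_START)
print(junction(fusion_fragment, HBD, LINKER2))
```

The printed junction can be compared by eye with the corresponding stretch of the listed SEQ ID NO.6.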
Example 1
Design, expression and purification of HBD-MNase fusion protein
1. Construction of HBD-MNase fusion protein expression vector
(1) The first functional region (SEQ ID NO.1) and the second functional region (SEQ ID NO.2) were each amplified by PCR with the following primers:
First functional region forward primer:
ATGGGTCGCGGATCCGAATTCATGTTCTATGCGGTGAGGAG (SEQ ID NO.7)
First functional region reverse primer:
GAACTCCTTATCGTCATCAAAGGCCCAGGCCTCATCTT (SEQ ID NO.8)
Second functional region forward primer:
GATGACGATAAGGAGTTCGCAACTTCAACTAAAAAATTACA (SEQ ID NO.9)
Second functional region reverse primer:
GGTGGTGGTGGTGGTGCTCGAGTTATTGACCTGAATCAGCGTTGTC (SEQ ID NO.10)
(2) The two DNA fragments were joined into one fragment by bridge PCR (overlap-extension PCR): the first functional region was amplified with its forward and reverse primers, and the second functional region was amplified with its forward and reverse primers; the PCR products of the first and second functional regions were then used together as templates, and the spliced fragment of the two regions was amplified with the forward primer of the first functional region and the reverse primer of the second functional region.
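Bridge PCR of this kind depends on the two fragments sharing a complementary end. The following minimal Python sketch checks that for the primers listed above: it takes the reverse complement of the first-region reverse primer (SEQ ID NO.8) and looks for the longest stretch it shares with the start of the second-region forward primer (SEQ ID NO.9). The function names are illustrative assumptions.

```python
def reverse_complement(seq: str) -> str:
    """Reverse complement of a DNA sequence written 5'->3'."""
    comp = {"A": "T", "C": "G", "G": "C", "T": "A"}
    return "".join(comp[base] for base in reversed(seq.upper()))


def longest_overlap(left: str, right: str) -> str:
    """Longest suffix of `left` that is also a prefix of `right`."""
    for n in range(min(len(left), len(right)), 0, -1):
        if left[-n:] == right[:n]:
            return left[-n:]
    return ""


REGION1_REV = "GAACTCCTTATCGTCATCAAAGGCCCAGGCCTCATCTT"     # SEQ ID NO.8
REGION2_FWD = "GATGACGATAAGGAGTTCGCAACTTCAACTAAAAAATTACA"  # SEQ ID NO.9

# Top-strand 3' end of fragment 1 versus 5' end of fragment 2.
overlap = longest_overlap(reverse_complement(REGION1_REV), REGION2_FWD)
print(len(overlap), overlap)
```

The shared stretch is what lets the two PCR products anneal to each other and prime the spliced product in the second round.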
Cloning of the first functional region (source of template sequence: NCBI Reference Sequence NM_011275.3):
PCR reaction system (50 μL; enzyme: KOD-Plus, Toyobo, Cat# KOD-201):
10× KOD Plus Buffer 5 μL, dNTP 4 μL, 25 mM MgSO4 2 μL, F+R Primer (10 mM) 2 μL, template 1 μL (500 ng), KOD-Plus 1 μL, ddH2O 35 μL;
Program: initial denaturation at 95 °C for 3 min; 35 cycles of denaturation at 95 °C for 30 s, annealing at 60 °C for 30 s and extension at 68 °C for 15 s; final extension at 68 °C for 5 min; hold at 12 °C.
Cloning of the second functional region (source of template sequence: GenBank V01281.1):
PCR reaction system:
10× KOD Plus Buffer 5 μL, dNTP 4 μL, 25 mM MgSO4 2 μL, F+R Primer (10 mM) 2 μL, template 1 μL (500 ng), KOD-Plus 1 μL, ddH2O 35 μL;
Program: initial denaturation at 95 °C for 3 min; 35 cycles of denaturation at 95 °C for 30 s, annealing at 60 °C for 30 s and extension at 68 °C for 30 s; final extension at 68 °C for 5 min; hold at 12 °C.
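A reaction recipe like the one above can be sanity-checked by summing the component volumes and computing final concentrations from the stock concentrations. The Python sketch below does this for the 50 μL system; the dictionary layout and function names are illustrative assumptions, and the 25 mM MgSO4 stock concentration is taken from the recipe as listed.

```python
# Component volumes in microlitres, as listed for the 50 uL reaction system.
MIX_UL = {
    "10x KOD Plus Buffer": 5,
    "dNTP": 4,
    "25 mM MgSO4": 2,
    "F+R primer": 2,
    "template": 1,
    "KOD-Plus": 1,
    "ddH2O": 35,
}


def total_volume(mix: dict) -> float:
    """Sum of all component volumes (uL)."""
    return sum(mix.values())


def final_concentration(stock_conc: float, added_ul: float, total_ul: float) -> float:
    """Final concentration after dilution, in the same unit as stock_conc."""
    return stock_conc * added_ul / total_ul


total = total_volume(MIX_UL)
mg_final_mM = final_concentration(25.0, MIX_UL["25 mM MgSO4"], total)
buffer_final_x = final_concentration(10.0, MIX_UL["10x KOD Plus Buffer"], total)
print(total, mg_final_mM, buffer_final_x)
```

The same check applies to the bridge PCR system, where the template volume is 2 μL and the water volume is 34 μL.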
The PCR products were checked by 1% agarose gel electrophoresis, and the target fragments were recovered with the Biospin Gel Extraction Kit (BIO FLUX, Cat# BSC02M1).
Bridge PCR reaction system:
10× KOD Plus Buffer 5 μL, dNTP 4 μL, 25 mM MgSO4 2 μL, F+R Primer (10 mM) 2 μL, template 2 μL (molar ratio of the two functional fragments 1:1), KOD-Plus 1 μL, ddH2O 34 μL;
Program: initial denaturation at 95 °C for 3 min; 35 cycles of denaturation at 95 °C for 30 s, annealing at 60 °C for 30 s and extension at 68 °C for 40 s; final extension at 68 °C for 5 min; hold at 12 °C.
The PCR products were checked by 1% agarose gel electrophoresis, and the target fragment was recovered with the Biospin Gel Extraction Kit (BIO FLUX, Cat# BSC02M1).
(3) The fragment was cloned into the expression vector pET-28a, as shown in FIG. 1.
Homologous recombination was used, with the following reaction system:
20 ng of pET-28a (EcoRI + XhoI double-digestion product), 60 ng of the target fragment recovered in the second step, 2 μL of 5× Ligation-Free Cloning (ABM), and ddH2O to 10 μL; incubate on ice for 30 min. The homologous recombination product was transformed into DH5α (TransGen).
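The recombination recipe is given in mass units, while cloning reactions are often reasoned about as an insert-to-vector molar ratio. The Python sketch below converts masses to femtomoles using an approximate average mass of 650 g/mol per base pair of double-stranded DNA, which is an assumption, as are the function names. The fragment lengths in the usage lines (5369 bp for the vector, 650 bp for the insert) are placeholder values chosen for illustration and are not stated in the text.

```python
BP_MASS_G_PER_MOL = 650.0  # assumed average mass of one dsDNA base pair


def dsdna_fmol(mass_ng: float, length_bp: int) -> float:
    """Convert a dsDNA mass in ng to an amount in fmol."""
    return mass_ng * 1e6 / (length_bp * BP_MASS_G_PER_MOL)


def insert_to_vector_ratio(insert_ng: float, insert_bp: int,
                           vector_ng: float, vector_bp: int) -> float:
    """Molar ratio of insert to vector for the given masses and lengths."""
    return dsdna_fmol(insert_ng, insert_bp) / dsdna_fmol(vector_ng, vector_bp)


# Masses from the recipe (60 ng insert, 20 ng vector); lengths are placeholders.
ratio = insert_to_vector_ratio(insert_ng=60, insert_bp=650,
                               vector_ng=20, vector_bp=5369)
print(round(ratio, 1))
```

With real fragment lengths substituted, the same function shows how far the 60 ng : 20 ng mass ratio sits from an equimolar mix.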
Sanger sequencing was used to confirm the integrity of the vector sequence.
Single clones were sequenced in one direction with the T7 promoter primer.
2. Expression and purification of HBD-MNase fusion protein
(1) The sequence-verified HBD-MNase fusion protein expression plasmid was transformed into the BL21(DE3) expression strain.
(2) A single colony expressing the HBD-MNase fusion protein was inoculated into LB medium containing kanamycin and cultured overnight at 37 °C and 220 rpm.
(3) The culture from step (2) was inoculated into 100 mL of LB medium and cultured at 37 °C and 220 rpm for 3 h until the OD reached 0.6-0.8.
(4) The culture flask from step (3) was placed in a 4 °C refrigerator for 15 min, and IPTG was added to a final concentration of 0.5 mM.
(5) After culturing at 20 °C and 160 rpm for 2 h, the cells were collected and lysed by sonication or high pressure to obtain the protein solution.
(6) The protein solution from step (5) was centrifuged at 13000 g for 30 min at 4 °C; PEI (final concentration 0.05%) was added to the supernatant to precipitate the bacterial genome; the solution was centrifuged again at 13000 g for 30 min at 4 °C, and the supernatant was filtered through a 0.45 μm filter.
(7) Affinity purification was performed on a Ni column (Ni-NTA 1ml Pre-Packed gradient Column Industrial Cat # C600791-0010). The Ni column was first equilibrated: after the buffer in the column had run through, pre-cooled 20 mM imidazole (prepared in 50 mM Tris-HCl, pH 7.5, 0.8 M NaCl, 0.2% Triton X-100 and 10% glycerol) was added. The protein solution was then loaded onto the column 5 times. The column was washed several times with 150 mL of pre-cooled 20 mM imidazole to remove unbound proteins, then with 10 mL of 50 mM imidazole, and the target protein was finally eluted with 5 mL of pre-cooled 300 mM imidazole. The eluted target protein was dialyzed overnight at 4 °C against dialysis buffer (20 mM Tris-HCl, pH 7.5, 150 mM NaCl, 10% glycerol). HBD-MNase was concentrated with a 10 kDa protein concentration column; after concentration, the protein concentration was measured by the BCA method, and the residual bacterial genome was measured with Qubit (residual genome kept below 0.5 ng/μL).
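Several of the steps above specify final concentrations (0.5 mM IPTG, 0.05% PEI) without stating stock concentrations. The Python sketch below applies the usual C1·V1 = C2·V2 relation to estimate how much stock to add. The stock concentrations (1 M IPTG, 5% PEI) and the 30 mL lysate volume are assumptions chosen for illustration, and the function name is hypothetical.

```python
def stock_volume(final_conc: float, final_volume: float, stock_conc: float) -> float:
    """Volume of stock needed so that final_volume reaches final_conc.

    Uses C1*V1 = C2*V2; concentrations share one unit, and the result is in
    the unit of final_volume. The small volume change from adding the stock
    is ignored, which is reasonable when the stock is much more concentrated.
    """
    if stock_conc <= final_conc:
        raise ValueError("stock must be more concentrated than the target")
    return final_conc * final_volume / stock_conc


# IPTG: 0.5 mM final in the 100 mL culture, from an assumed 1 M (1000 mM) stock.
iptg_ml = stock_volume(final_conc=0.5, final_volume=100.0, stock_conc=1000.0)

# PEI: 0.05% final in an assumed 30 mL of lysate, from an assumed 5% stock.
pei_ml = stock_volume(final_conc=0.05, final_volume=30.0, stock_conc=5.0)

print(f"IPTG stock: {iptg_ml * 1000:.0f} uL, PEI stock: {pei_ml * 1000:.0f} uL")
```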
The purified protein was analyzed by SDS-PAGE and stained with Coomassie Brilliant Blue, as shown in FIG. 3. FT denotes the column flow-through; the imidazole lanes show the fractions obtained with imidazole eluents of different concentrations (50 mM imidazole washes off non-specifically or weakly bound protein; 300 mM imidazole is the finally eluted target protein solution); M denotes the protein marker; and the band marked with a red asterisk in the 300 mM imidazole lane is the purified target protein. The amino acid sequence of the HBD-MNase fusion protein is shown in SEQ ID NO.5. Based on common knowledge in the art, a skilled person can also realize the function of the HBD-MNase fusion protein with an amino acid sequence that has one or more amino acids substituted, deleted or added on the basis of the sequence shown in SEQ ID NO.5 and has the same function.
Example 2 design, expression and purification of HBD-Tn5 fusion protein
1. Construction of HBD-Tn5 fusion protein expression vector
(1) The first functional region (SEQ ID NO.1) and the second functional region (SEQ ID NO.3) were each amplified by PCR with the following primers:
First functional region forward primer:
ATGGGTCGCGGATCCGAATTCATGTTCTATGCGGTGAGGAG (SEQ ID NO.7)
First functional region reverse primer:
GGCTTTAGCCGCTGCCTCCTTTGCGGCAGCAAAGGCCCAGGCCTCATCTT (SEQ ID NO.11)
Second functional region forward primer:
GCTGCCGCAAAGGAGGCAGCGGCTAAAGCCATGATTACCAGTGCACTGCA (SEQ ID NO.12)
Second functional region reverse primer:
GGTGGTGGTGGTGGTGCTCGAGTTAGATTTTAATGCCCTGCGCCATC (SEQ ID NO.13)
(2) The two DNA fragments were joined into one fragment by bridge PCR (overlap-extension PCR): the first functional region was amplified with its forward and reverse primers, and the second functional region was amplified with its forward and reverse primers; the PCR products of the first and second functional regions were then used together as templates, and the spliced fragment of the two regions was amplified with the forward primer of the first functional region and the reverse primer of the second functional region.
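Because the forward primer of the second functional region carries the junction between the two regions, translating it in frame shows which residues the primer tail encodes ahead of the Tn5 start. The Python sketch below does this for SEQ ID NO.12 with the standard genetic code; the `translate` helper is an illustrative assumption, and reading frame 0 is assumed because the primer begins at the junction.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(codon): aa
               for codon, aa in zip(product(BASES, repeat=3), AMINO)}


def translate(dna: str, frame: int = 0) -> str:
    """Translate complete codons of `dna` starting at offset `frame`."""
    dna = dna.upper()
    codons = (dna[i:i + 3] for i in range(frame, len(dna) - 2, 3))
    return "".join(CODON_TABLE[codon] for codon in codons)


REGION2_FWD = "GCTGCCGCAAAGGAGGCAGCGGCTAAAGCCATGATTACCAGTGCACTGCA"  # SEQ ID NO.12

peptide = translate(REGION2_FWD)
print(peptide)
```

The residues before the first methionine are the junction encoded by the primer tail, and the residues from the methionine onward can be compared with the start of SEQ ID NO.3; the junction residues can likewise be compared with the linker listed as SEQ ID NO.4.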
Cloning of the first functional region:
PCR reaction system (50 μL; source of template sequence: NCBI Reference Sequence NM_011275.3):
10× KOD Plus Buffer 5 μL, dNTP 4 μL, 25 mM MgSO4 2 μL, F+R Primer (10 mM) 2 μL, template 1 μL (500 ng), KOD-Plus 1 μL, ddH2O 35 μL;
Program: initial denaturation at 95 °C for 3 min; 35 cycles of denaturation at 95 °C for 30 s, annealing at 60 °C for 30 s and extension at 68 °C for 15 s; final extension at 68 °C for 5 min; hold at 12 °C.
Cloning of the second functional region:
PCR reaction system:
10× KOD Plus Buffer 5 μL, dNTP 4 μL, 25 mM MgSO4 2 μL, F+R Primer (10 mM) 2 μL, template 1 μL (500 ng), KOD-Plus 1 μL, ddH2O 35 μL;
Program: initial denaturation at 95 °C for 3 min; 35 cycles of denaturation at 95 °C for 30 s, annealing at 60 °C for 30 s and extension at 68 °C for 1 min 30 s; final extension at 68 °C for 5 min; hold at 12 °C.
The PCR products were checked by 1% agarose gel electrophoresis, and the target fragments were recovered with the Biospin Gel Extraction Kit (BIO FLUX, Cat# BSC02M1).
Bridge PCR reaction system (enzyme: KOD-Plus, Toyobo, Cat# KOD-201):
10× KOD Plus Buffer 5 μL, dNTP 4 μL, 25 mM MgSO4 2 μL, F+R Primer (10 mM) 2 μL, template 2 μL (molar ratio of the two functional fragments 1:1), KOD-Plus 1 μL, ddH2O 34 μL;
Program: initial denaturation at 95 °C for 3 min; 35 cycles of denaturation at 95 °C for 30 s, annealing at 60 °C for 30 s and extension at 68 °C for 1 min 45 s; final extension at 68 °C for 5 min; hold at 12 °C.
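The extension times in these programs differ between fragments, which is consistent with scaling the extension step to amplicon length. The Python sketch below estimates a minimum extension time from the length of the coding region, assuming an extension rate of about 1 kb per minute. That rate is an assumption and should be checked against the polymerase manual; the function names are hypothetical, and primer tails that add a few tens of base pairs are not counted.

```python
import math


def coding_length_bp(n_residues: int, stop_codon: bool = True) -> int:
    """Length in bp of the DNA encoding `n_residues` amino acids."""
    return 3 * n_residues + (3 if stop_codon else 0)


def extension_seconds(amplicon_bp: int, rate_bp_per_min: float = 1000.0) -> int:
    """Minimum extension time in whole seconds at the assumed rate."""
    return math.ceil(amplicon_bp / rate_bp_per_min * 60)


# HBD (SEQ ID NO.1) is 44 residues long; the region carries no stop codon.
hbd_bp = coding_length_bp(44, stop_codon=False)
print(hbd_bp, extension_seconds(hbd_bp))

# A hypothetical 1.5 kb amplicon for comparison.
print(extension_seconds(1500))
```

In practice the programmed time is usually rounded up from such an estimate to leave a margin.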
The PCR products were checked by 1% agarose gel electrophoresis, and the target fragment was recovered with the Biospin Gel Extraction Kit (BIO FLUX, Cat# BSC02M1).
(3) The fragment was cloned into the expression vector pET-28a, as shown in FIG. 2.
Homologous recombination was used, with the following reaction system:
20 ng of pET-28a (EcoRI + XhoI double-digestion product); 60 ng of the recovered target fragment; 2 μL of 5× Ligation-Free Cloning (ABM); ddH2O to 10 μL; incubate on ice for 30 min. The homologous recombination product was transformed into DH5α (TransGen).
Sanger sequencing was used to confirm the integrity of the vector sequence.
Single clones were sequenced bidirectionally with the T7 promoter and T7 terminator primers.
2. Expression and purification of HBD-Tn5 fusion protein
First, the sequence-verified HBD-Tn5 fusion protein expression plasmid was transformed into the BL21(DE3) expression strain.
Second, a single colony expressing the HBD-Tn5 fusion protein was inoculated into LB medium containing kanamycin and cultured overnight at 37 ℃ and 220 rpm.
Third, the culture from the second step was inoculated into 100 mL of LB medium and cultured at 37 ℃ and 220 rpm for 3 h until the OD reached 0.6-0.8.
Fourth, the culture flask from the third step was placed in a refrigerator at 4 ℃ for 15 min, and IPTG was added to a final concentration of 0.5 mM.
Fifth, the culture was incubated at 37 ℃ and 160 rpm for 2 h, and the cells were collected and disrupted by sonication or high pressure to obtain a protein solution.
Sixth, the protein solution from the fifth step was centrifuged at 13000 g for 30 min, bacterial genomic DNA was precipitated with PEI (final concentration 0.05%), and the solution was centrifuged again at 13000 g at 4 ℃ and filtered through a 0.45 μm filter.
Seventh, affinity purification was performed on a Ni column.
The Ni column was equilibrated first: after the buffer in the Ni column had drained completely, pre-cooled 20 mM imidazole buffer (prepared in 50 mM Tris-HCl, pH 7.5, 0.8 M NaCl, 0.2% Triton X-100 and 10% glycerol) was added to equilibrate the column. The protein was then loaded onto the column 5 times, the column was washed several times with 150 mL of pre-cooled 20 mM imidazole and then with 10 mL of 35 mM imidazole, and finally the protein was eluted with 5 mL of pre-cooled 300 mM imidazole. The eluted protein was dialyzed directly in buffer (20 mM Tris-HCl, pH 7.5, 150 mM NaCl, 10% glycerol) at 4 ℃ overnight. HBD-Tn5 was concentrated with a 30 kDa protein concentrator; after concentration, the protein concentration was measured by the BCA method and the residual bacterial genomic DNA was measured with Qubit (residual genomic DNA was kept below 0.5 ng/μL).
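The clarification step in this procedure specifies centrifugation as relative centrifugal force (13000 g), whereas many centrifuges are set in rpm. The conversion depends on the rotor radius, which the text does not give. The sketch below uses the standard relation RCF = 1.118e-5 × r(cm) × rpm²; the 8 cm radius in the usage example is a hypothetical value.

```python
import math

RCF_CONSTANT = 1.118e-5  # standard constant for radius in cm and speed in rpm


def rcf_from_rpm(rpm, radius_cm):
    """Relative centrifugal force (in multiples of g) for a given speed and rotor radius."""
    return RCF_CONSTANT * radius_cm * rpm ** 2


def rpm_from_rcf(rcf, radius_cm):
    """Rotor speed (rpm) needed to reach a given relative centrifugal force."""
    return math.sqrt(rcf / (RCF_CONSTANT * radius_cm))


if __name__ == "__main__":
    radius_cm = 8.0  # hypothetical rotor radius; check the rotor's data sheet
    print(f"13000 g at r = {radius_cm} cm: about {rpm_from_rcf(13000, radius_cm):.0f} rpm")
```

The same helper applies to the low-speed spins (100 g, 600 g) used in the cell-handling steps of Example 5.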
The purified protein was analyzed by SDS-PAGE and stained with Coomassie Brilliant Blue, as shown in FIG. 4. Lane FT shows the flow-through; the imidazole lanes show the fractions eluted with imidazole at different concentrations (the 35 mM imidazole lane shows unbound protein that was washed away, and the 300 mM imidazole lane shows the finally eluted target protein solution); M is the protein marker; the band marked with a red asterisk in the 300 mM imidazole lane is the purified target protein.
The amino acid sequence of the HBD-Tn5 fusion protein is shown in SEQ ID NO.6. Based on common knowledge in the art, a person skilled in the art can also realize the function of the HBD-Tn5 fusion protein with an amino acid sequence that is derived from the sequence shown in SEQ ID NO.6 by substitution, deletion or addition of one or more amino acids and that has the same function.
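The sequence listing gives the proteins in three-letter code, while claim 5 gives the linker in one-letter code. As an illustrative consistency check, the sketch below converts three-letter sequences to one-letter code and verifies that the declared lengths of the fusion proteins (SEQ ID NO.5 and NO.6) equal HBD plus linker plus enzyme. The HBD and linker strings are transcribed from SEQ ID NO.1 and NO.4; the helper names are assumptions made for this example.

```python
THREE_TO_ONE = {
    "Ala": "A", "Arg": "R", "Asn": "N", "Asp": "D", "Cys": "C",
    "Gln": "Q", "Glu": "E", "Gly": "G", "His": "H", "Ile": "I",
    "Leu": "L", "Lys": "K", "Met": "M", "Phe": "F", "Pro": "P",
    "Ser": "S", "Thr": "T", "Trp": "W", "Tyr": "Y", "Val": "V",
}


def to_one_letter(three_letter_seq):
    """Convert a whitespace-separated three-letter sequence to one-letter code."""
    return "".join(THREE_TO_ONE[res] for res in three_letter_seq.split())


# SEQ ID NO.1 (HBD) and SEQ ID NO.4 (linker), transcribed from the sequence listing.
HBD = (
    "Met Phe Tyr Ala Val Arg Arg Gly Arg Arg Thr Gly Val Phe Leu Ser "
    "Trp Ser Glu Cys Lys Ala Gln Val Asp Arg Phe Pro Ala Ala Arg Phe "
    "Lys Lys Phe Ala Thr Glu Asp Glu Ala Trp Ala Phe"
)
LINKER = "Asp Asp Asp Lys Glu Phe Gly Gly Gly Gly Ser"

# Lengths declared in the <211> fields of the sequence listing.
DECLARED_LENGTHS = {1: 44, 2: 149, 3: 476, 4: 11, 5: 204, 6: 531}


def fusion_length(enzyme_seq_id):
    """Expected fusion length: HBD (NO.1) + linker (NO.4) + enzyme (NO.2 or NO.3)."""
    return DECLARED_LENGTHS[1] + DECLARED_LENGTHS[4] + DECLARED_LENGTHS[enzyme_seq_id]


if __name__ == "__main__":
    print("Linker:", to_one_letter(LINKER))
    print("HBD length:", len(to_one_letter(HBD)))
    print("HBD-MNase expected length:", fusion_length(2))
    print("HBD-Tn5 expected length:", fusion_length(3))
```

The check only compares lengths and the linker string; confirming full identity would require transcribing the complete sequences.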
Example 3 detection of HBD-MNase Activity
Different amounts of HBD-MNase (input amounts shown in FIG. 5) were mixed with 550 ng of genomic DNA, Ca2+ was added to a final concentration of 10 mM, and the reaction was carried out in an ice bath for 10 min. Agarose gel electrophoresis showed that HBD-MNase can cut genomic DNA into fragments of 100 bp in size; the result is shown in FIG. 5. With the same genomic DNA input (550 ng), as the input of the fusion protein HBD-MNase increased (from 0 ng to 578.4 ng), the genomic band clearly narrowed and eventually disappeared, while an increasingly strong smear of fragmented genomic DNA appeared lower in the lane. This in vitro digestion result shows that the fusion protein HBD-MNase has high enzyme activity.
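Bringing Ca2+ to a final concentration of 10 mM is a simple C1V1 = C2V2 dilution. The text gives neither the stock concentration nor the reaction volume, so the values in the sketch below (a 100 mM CaCl2 stock and a 20 μL reaction) are hypothetical and serve only to illustrate the calculation.

```python
def stock_volume_ul(stock_mM, final_mM, final_volume_ul):
    """Volume of stock (uL) needed to reach final_mM in final_volume_ul (C1*V1 = C2*V2)."""
    if final_mM > stock_mM:
        raise ValueError("final concentration cannot exceed the stock concentration")
    return final_mM * final_volume_ul / stock_mM


if __name__ == "__main__":
    # Hypothetical stock and reaction volume; only the 10 mM target comes from the text.
    volume = stock_volume_ul(stock_mM=100.0, final_mM=10.0, final_volume_ul=20.0)
    print(f"Add {volume} uL of 100 mM CaCl2 to a 20 uL reaction")
```

The same function covers the 10 mM Mg2+ used to activate HBD-Tn5.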
Example 4 HBD-Tn5 Activity assay
15 pmol of HBD-Tn5 was mixed with 200 ng of genomic DNA, Mg2+ was added to a final concentration of 10 mM, the reaction was carried out at 55 ℃ for 10 min, and the products were analyzed by agarose gel electrophoresis, as shown in FIG. 6. Compared with the control group not treated with HBD-Tn5, addition of HBD-Tn5 effectively fragmented the genomic DNA: the distinct genomic band disappeared and a smear appeared lower in the lane, with most fragments below 1000 bp. This in vitro genome digestion experiment shows that the fusion protein HBD-Tn5 of the invention has high enzyme activity.
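Example 3 states enzyme input in ng while this example states it in pmol. To compare the two, a molar amount can be converted to mass with an approximate molecular weight. The sketch below estimates the molecular weight from the 531 residues of SEQ ID NO.6 using a rough average of 110 Da per residue. Both the average residue mass and the neglect of any purification tag are assumptions, so the result is only an order-of-magnitude estimate.

```python
AVG_RESIDUE_MASS_DA = 110.0  # rough average mass per amino acid residue (assumption)


def approx_mw_kda(n_residues):
    """Rough molecular weight (kDa) from residue count; ignores tags and exact composition."""
    return n_residues * AVG_RESIDUE_MASS_DA / 1000.0


def pmol_to_ng(pmol, mw_kda):
    """Convert picomoles to nanograms (1 pmol of a 1 kDa molecule weighs 1 ng)."""
    return pmol * mw_kda


if __name__ == "__main__":
    mw = approx_mw_kda(531)  # HBD-Tn5, SEQ ID NO.6
    print(f"Approximate MW: {mw:.1f} kDa")
    print(f"15 pmol is roughly {pmol_to_ng(15, mw):.0f} ng")
```

An estimate near 58 kDa is also consistent with retaining the protein on a 30 kDa concentrator.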
Example 5
The HBD-MNase and HBD-Tn5 obtained by the invention were each applied to non-crosslinked cells in their native state for in situ detection of R-loops. The detection method comprises the following steps:
first, 100000 HEK 293T cells cultured in vitro were collected, washed once with PBS (pH 7.2) and centrifuged at 600 g for 3 min at room temperature; the supernatant was removed, and the cells were washed once with 200 μL of wash buffer (10 mM HEPES (4-hydroxyethyl piperazine ethanesulfonic acid), 150 mM NaCl, 0.5 mM spermidine) and 5% digitonin;
second, binding buffer (10 mM HEPES, 10 mM KCl, 1 mM CaCl2, 1 mM MnCl2) was added to 8 μL of Con A beads (Bangslabs) to wash them; the washed beads were mixed and incubated with the cells from the first step for 15 min, briefly centrifuged at 100 g at 4 ℃, and the supernatant was discarded (on a magnetic rack); the fusion protein HBD-MNase or HBD-Tn5 was then added in 100 μL of wash buffer to a final concentration of 1 μM, together with digitonin and 1 μL of PIC (100×), and the cells were incubated at 4 ℃ for 2-12 h;
third, the cells were washed three times with 200 μL of wash buffer to remove excess protein, centrifuged at 100 g at 4 ℃, and the supernatant was discarded (on a magnetic rack);
fourth, 10 mM Ca2+ or Mg2+ was added to activate the fusion protein HBD-MNase or HBD-Tn5 to cut the DNA (R-mapping); the reaction conditions were 0 ± 0.5 ℃ for 30 min for the fusion protein HBD-MNase and 37-55 ℃ for 1 h for the fusion protein HBD-Tn5;
fifth, the cutting reaction was terminated by adding 10 mM EDTA, and the DNA was extracted with phenol:chloroform:isoamyl alcohol;
sixth, for library construction based on HBD-MNase (R-mapping), the extracted genomic fragments were end-repaired and ligated to adapters (Vazyme VAHTS Adapter-S for Illumina), followed by PCR library construction (Vazyme index for Illumina).
PCR system: 24 μL of DNA, 1 μL of i5 (10 mM), 1 μL of i7 (10 mM), 10 μL of 5× KAPA HiFi Fidelity buffer, 1 μL of KAPA HiFi HotStart, 11.5 μL of ddH2O, 1.5 μL of dNTP; 50 μL in total.
The program is as follows: pre-denaturation at 98 ℃ for 45 s; denaturation at 98 ℃ for 15 s, annealing at 60 ℃ for 30 s and extension at 72 ℃ for 1 min, for 13 cycles; final extension at 72 ℃ for 1 min; hold at 12 ℃. The library products were sequenced on the Illumina platform.
For library construction based on HBD-Tn5 (R-mapping), the purified genomic DNA was used for PCR library construction (Vazyme index for Illumina).
PCR system: 23 μL of DNA, 1 μL of i5 (10 mM), 1 μL of i7 (10 mM), 25 μL of 2× Mix buffer (NEB Cat # M0541S). The program is as follows: 72 ℃ for 5 min; pre-denaturation at 98 ℃ for 30 s; denaturation at 98 ℃ for 15 s, annealing at 60 ℃ for 30 s and extension at 72 ℃ for 1 min, for 13 cycles; final extension at 72 ℃ for 5 min; hold at 16 ℃. The library products were sequenced on the Illumina platform.
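The two library PCR programs above can be written as simple data structures to compare their programmed hold times. This is a minimal sketch: the step durations are taken from the text (with the 13 cycles applied to the denaturation, annealing and extension steps), while the class and variable names are illustrative. Ramp times between temperatures are instrument-dependent and are not included.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class PcrProgram:
    """A thermocycler program described by step durations in seconds."""
    pre_cycle_s: Tuple[int, ...]   # steps run once before cycling
    cycle_s: Tuple[int, ...]       # steps repeated in every cycle
    cycles: int
    post_cycle_s: Tuple[int, ...]  # steps run once after cycling (final hold excluded)

    def hold_time_s(self):
        """Total programmed hold time, excluding ramping and the final storage hold."""
        return (sum(self.pre_cycle_s)
                + self.cycles * sum(self.cycle_s)
                + sum(self.post_cycle_s))


# HBD-MNase library: 98 C 45 s; 13 x (98 C 15 s, 60 C 30 s, 72 C 60 s); 72 C 1 min.
MNASE_LIBRARY = PcrProgram((45,), (15, 30, 60), 13, (60,))
# HBD-Tn5 library: 72 C 5 min; 98 C 30 s; 13 x (15 s, 30 s, 60 s); 72 C 5 min.
TN5_LIBRARY = PcrProgram((300, 30), (15, 30, 60), 13, (300,))

if __name__ == "__main__":
    for name, prog in (("HBD-MNase", MNASE_LIBRARY), ("HBD-Tn5", TN5_LIBRARY)):
        print(f"{name}: {prog.hold_time_s() / 60:.1f} min of programmed holds")
```

The initial 72 ℃ step in the HBD-Tn5 program accounts for most of the difference between the two programs.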
The biological sample to be tested with the fusion protein of the present application may include, but is not limited to, a cultured cell sample, a tissue sample or another biological sample, either untreated or treated by crosslinking, fixation or freezing.
After the DNA is obtained with a commercially available kit or method, PCR library construction may also be performed with other polymerases, including isothermal polymerases, other polymerases with strand-displacement activity, and RNA- or DNA-dependent polymerases.
As shown in FIGS. 7 and 8, the results obtained by applying the fusion proteins HBD-MNase and HBD-Tn5 of the invention to in situ detection of R-loops (R-mapping) were compared with the results of the conventional R-loop detection method DRIP-seq (R-loop formation is a distinctive characteristic of unmethylated human CpG island promoters, Ginno et al., Mol Cell. 2012 Mar 30; 45(6):814-25).
As shown in FIG. 7: with a relatively small number of cells, R-mapping based on HBD-MNase captures the same positive signals as DRIP-seq, but its peak signals at gene loci are more concentrated, whereas the DRIP-seq signals are relatively dispersed; the signal values of the R-mapping method are also higher, so the signal-to-noise ratio is effectively improved compared with the conventional method. The track view also shows that the invention detects signals that DRIP-seq cannot capture, and that these specifically detected signals are eliminated by RNase H treatment, which indicates that they are genuine R-loops and demonstrates that the R-mapping method captures R-loops more accurately than the conventional method.
With the HBD-Tn5 fusion protein, the invention not only requires fewer cells but also, because Tn5 cuts the genome and attaches the adapters directly, shortens the library construction time to within half an hour, after which the library is completed by PCR.
As shown in FIG. 8: the results of R-mapping with the HBD-Tn5 fusion protein for in situ detection of R-loops in native, non-crosslinked cells (FIG. 8, top) were compared with those of conventional R-loop detection by DRIP-seq (FIG. 8, bottom). The R-loop peak signals obtained with the R-mapping method are more concentrated and relatively more accurate, whereas the DRIP-seq signals are relatively dispersed, and the signal values obtained by the invention are higher. That is, compared with conventional DRIP-seq, in situ R-loop detection with the HBD-Tn5 used in the invention effectively improves the signal-to-noise ratio.
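The text does not state how the signal-to-noise ratio was quantified. One commonly used proxy in sequencing assays is the fraction of reads in peaks (FRiP): the share of mapped reads that fall inside called peak regions. The sketch below is an illustrative implementation for a single chromosome under stated assumptions (non-overlapping, half-open peak intervals and reads reduced to single positions); it is not the analysis used in the patent.

```python
from bisect import bisect_right


def fraction_of_reads_in_peaks(read_positions, peaks):
    """Fraction of read positions that fall inside any peak.

    peaks: iterable of (start, end) half-open intervals on one chromosome,
    assumed not to overlap. read_positions: iterable of integer coordinates.
    """
    reads = list(read_positions)
    if not reads:
        return 0.0
    ordered = sorted(peaks)
    starts = [start for start, _ in ordered]
    in_peaks = 0
    for pos in reads:
        # Locate the last peak that starts at or before this position.
        i = bisect_right(starts, pos) - 1
        if i >= 0 and pos < ordered[i][1]:
            in_peaks += 1
    return in_peaks / len(reads)


if __name__ == "__main__":
    toy_peaks = [(10, 30), (100, 110)]    # made-up intervals for illustration
    toy_reads = [5, 15, 25, 105, 500]     # made-up read positions
    print(fraction_of_reads_in_peaks(toy_reads, toy_peaks))
```

A higher FRiP for one method than another on comparable data would be consistent with the more concentrated peaks described above, although real comparisons should also control for sequencing depth and peak-calling parameters.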
The technical features of the embodiments described above may be arbitrarily combined, and for the sake of brevity, all possible combinations of the technical features in the embodiments described above are not described, but should be considered as being within the scope of the present specification as long as there is no contradiction between the combinations of the technical features.
Sequence listing
<110> Guangzhou biomedical and health research institute of Chinese academy of sciences
<120> fusion protein and application thereof
<160> 13
<170> SIPOSequenceListing 1.0
<210> 1
<211> 44
<212> PRT
<213> Artificial Sequence (Artificial Sequence)
<400> 1
Met Phe Tyr Ala Val Arg Arg Gly Arg Arg Thr Gly Val Phe Leu Ser
1 5 10 15
Trp Ser Glu Cys Lys Ala Gln Val Asp Arg Phe Pro Ala Ala Arg Phe
20 25 30
Lys Lys Phe Ala Thr Glu Asp Glu Ala Trp Ala Phe
35 40
<210> 2
<211> 149
<212> PRT
<213> Artificial Sequence (Artificial Sequence)
<400> 2
Ala Thr Ser Thr Lys Lys Leu His Lys Glu Pro Ala Thr Leu Ile Lys
1 5 10 15
Ala Ile Asp Gly Asp Thr Val Lys Leu Met Tyr Lys Gly Gln Pro Met
20 25 30
Thr Phe Arg Leu Leu Leu Val Asp Thr Pro Glu Thr Lys His Pro Lys
35 40 45
Lys Gly Val Glu Lys Tyr Gly Pro Glu Ala Ser Ala Phe Thr Lys Lys
50 55 60
Met Val Glu Asn Ala Lys Lys Ile Glu Val Glu Phe Asp Lys Gly Gln
65 70 75 80
Arg Thr Asp Lys Tyr Gly Arg Gly Leu Ala Tyr Ile Tyr Ala Asp Gly
85 90 95
Lys Met Val Asn Glu Ala Leu Val Arg Gln Gly Leu Ala Lys Val Ala
100 105 110
Tyr Val Tyr Lys Pro Asn Asn Thr His Glu Gln His Leu Arg Lys Ser
115 120 125
Glu Ala Gln Ala Lys Lys Glu Lys Leu Asn Ile Trp Ser Glu Asp Asn
130 135 140
Ala Asp Ser Gly Gln
145
<210> 3
<211> 476
<212> PRT
<213> Artificial Sequence (Artificial Sequence)
<400> 3
Met Ile Thr Ser Ala Leu His Arg Ala Ala Asp Trp Ala Lys Ser Val
1 5 10 15
Phe Ser Ser Ala Ala Leu Gly Asp Pro Arg Arg Thr Ala Arg Leu Val
20 25 30
Asn Val Ala Ala Gln Leu Ala Lys Tyr Ser Gly Lys Ser Ile Thr Ile
35 40 45
Ser Ser Glu Gly Ser Lys Ala Met Gln Glu Gly Ala Tyr Arg Phe Ile
50 55 60
Arg Asn Pro Asn Val Ser Ala Glu Ala Ile Arg Lys Ala Gly Ala Met
65 70 75 80
Gln Thr Val Lys Leu Ala Gln Glu Phe Pro Glu Leu Leu Ala Ile Glu
85 90 95
Asp Thr Thr Ser Leu Ser Tyr Arg His Gln Val Ala Glu Glu Leu Gly
100 105 110
Lys Leu Gly Ser Ile Gln Asp Lys Ser Arg Gly Trp Trp Val His Ser
115 120 125
Val Leu Leu Leu Glu Ala Thr Thr Phe Arg Thr Val Gly Leu Leu His
130 135 140
Gln Glu Trp Trp Met Arg Pro Asp Asp Pro Ala Asp Ala Asp Glu Lys
145 150 155 160
Glu Ser Gly Lys Trp Leu Ala Ala Ala Ala Thr Ser Arg Leu Arg Met
165 170 175
Gly Ser Met Met Ser Asn Val Ile Ala Val Cys Asp Arg Glu Ala Asp
180 185 190
Ile His Ala Tyr Leu Gln Asp Lys Leu Ala His Asn Glu Arg Phe Val
195 200 205
Val Arg Ser Lys His Pro Arg Lys Asp Val Glu Ser Gly Leu Tyr Leu
210 215 220
Tyr Asp His Leu Lys Asn Gln Pro Glu Leu Gly Gly Tyr Gln Ile Ser
225 230 235 240
Ile Pro Gln Lys Gly Val Val Asp Lys Arg Gly Lys Arg Lys Asn Arg
245 250 255
Pro Ala Arg Lys Ala Ser Leu Ser Leu Arg Ser Gly Arg Ile Thr Leu
260 265 270
Lys Gln Gly Asn Ile Thr Leu Asn Ala Val Leu Ala Glu Glu Ile Asn
275 280 285
Pro Pro Lys Gly Glu Thr Pro Leu Lys Trp Leu Leu Leu Thr Ser Glu
290 295 300
Pro Val Glu Ser Leu Ala Gln Ala Leu Arg Val Ile Asp Ile Tyr Thr
305 310 315 320
His Arg Trp Arg Ile Glu Glu Phe His Lys Ala Trp Lys Thr Gly Ala
325 330 335
Gly Ala Glu Arg Gln Arg Met Glu Glu Pro Asp Asn Leu Glu Arg Met
340 345 350
Val Ser Ile Leu Ser Phe Val Ala Val Arg Leu Leu Gln Leu Arg Glu
355 360 365
Ser Phe Thr Pro Pro Gln Ala Leu Arg Ala Gln Gly Leu Leu Lys Glu
370 375 380
Ala Glu His Val Glu Ser Gln Ser Ala Glu Thr Val Leu Thr Pro Asp
385 390 395 400
Glu Cys Gln Leu Leu Gly Tyr Leu Asp Lys Gly Lys Arg Lys Arg Lys
405 410 415
Glu Lys Ala Gly Ser Leu Gln Trp Ala Tyr Met Ala Ile Ala Arg Leu
420 425 430
Gly Gly Phe Met Asp Ser Lys Arg Thr Gly Ile Ala Ser Trp Gly Ala
435 440 445
Leu Trp Glu Gly Trp Glu Ala Leu Gln Ser Lys Leu Asp Gly Phe Leu
450 455 460
Ala Ala Lys Asp Leu Met Ala Gln Gly Ile Lys Ile
465 470 475
<210> 4
<211> 11
<212> PRT
<213> Artificial Sequence (Artificial Sequence)
<400> 4
Asp Asp Asp Lys Glu Phe Gly Gly Gly Gly Ser
1 5 10
<210> 5
<211> 204
<212> PRT
<213> Artificial Sequence (Artificial Sequence)
<400> 5
Met Phe Tyr Ala Val Arg Arg Gly Arg Arg Thr Gly Val Phe Leu Ser
1 5 10 15
Trp Ser Glu Cys Lys Ala Gln Val Asp Arg Phe Pro Ala Ala Arg Phe
20 25 30
Lys Lys Phe Ala Thr Glu Asp Glu Ala Trp Ala Phe Asp Asp Asp Lys
35 40 45
Glu Phe Gly Gly Gly Gly Ser Ala Thr Ser Thr Lys Lys Leu His Lys
50 55 60
Glu Pro Ala Thr Leu Ile Lys Ala Ile Asp Gly Asp Thr Val Lys Leu
65 70 75 80
Met Tyr Lys Gly Gln Pro Met Thr Phe Arg Leu Leu Leu Val Asp Thr
85 90 95
Pro Glu Thr Lys His Pro Lys Lys Gly Val Glu Lys Tyr Gly Pro Glu
100 105 110
Ala Ser Ala Phe Thr Lys Lys Met Val Glu Asn Ala Lys Lys Ile Glu
115 120 125
Val Glu Phe Asp Lys Gly Gln Arg Thr Asp Lys Tyr Gly Arg Gly Leu
130 135 140
Ala Tyr Ile Tyr Ala Asp Gly Lys Met Val Asn Glu Ala Leu Val Arg
145 150 155 160
Gln Gly Leu Ala Lys Val Ala Tyr Val Tyr Lys Pro Asn Asn Thr His
165 170 175
Glu Gln His Leu Arg Lys Ser Glu Ala Gln Ala Lys Lys Glu Lys Leu
180 185 190
Asn Ile Trp Ser Glu Asp Asn Ala Asp Ser Gly Gln
195 200
<210> 6
<211> 531
<212> PRT
<213> Artificial Sequence (Artificial Sequence)
<400> 6
Met Phe Tyr Ala Val Arg Arg Gly Arg Arg Thr Gly Val Phe Leu Ser
1 5 10 15
Trp Ser Glu Cys Lys Ala Gln Val Asp Arg Phe Pro Ala Ala Arg Phe
20 25 30
Lys Lys Phe Ala Thr Glu Asp Glu Ala Trp Ala Phe Asp Asp Asp Lys
35 40 45
Glu Phe Gly Gly Gly Gly Ser Met Ile Thr Ser Ala Leu His Arg Ala
50 55 60
Ala Asp Trp Ala Lys Ser Val Phe Ser Ser Ala Ala Leu Gly Asp Pro
65 70 75 80
Arg Arg Thr Ala Arg Leu Val Asn Val Ala Ala Gln Leu Ala Lys Tyr
85 90 95
Ser Gly Lys Ser Ile Thr Ile Ser Ser Glu Gly Ser Lys Ala Met Gln
100 105 110
Glu Gly Ala Tyr Arg Phe Ile Arg Asn Pro Asn Val Ser Ala Glu Ala
115 120 125
Ile Arg Lys Ala Gly Ala Met Gln Thr Val Lys Leu Ala Gln Glu Phe
130 135 140
Pro Glu Leu Leu Ala Ile Glu Asp Thr Thr Ser Leu Ser Tyr Arg His
145 150 155 160
Gln Val Ala Glu Glu Leu Gly Lys Leu Gly Ser Ile Gln Asp Lys Ser
165 170 175
Arg Gly Trp Trp Val His Ser Val Leu Leu Leu Glu Ala Thr Thr Phe
180 185 190
Arg Thr Val Gly Leu Leu His Gln Glu Trp Trp Met Arg Pro Asp Asp
195 200 205
Pro Ala Asp Ala Asp Glu Lys Glu Ser Gly Lys Trp Leu Ala Ala Ala
210 215 220
Ala Thr Ser Arg Leu Arg Met Gly Ser Met Met Ser Asn Val Ile Ala
225 230 235 240
Val Cys Asp Arg Glu Ala Asp Ile His Ala Tyr Leu Gln Asp Lys Leu
245 250 255
Ala His Asn Glu Arg Phe Val Val Arg Ser Lys His Pro Arg Lys Asp
260 265 270
Val Glu Ser Gly Leu Tyr Leu Tyr Asp His Leu Lys Asn Gln Pro Glu
275 280 285
Leu Gly Gly Tyr Gln Ile Ser Ile Pro Gln Lys Gly Val Val Asp Lys
290 295 300
Arg Gly Lys Arg Lys Asn Arg Pro Ala Arg Lys Ala Ser Leu Ser Leu
305 310 315 320
Arg Ser Gly Arg Ile Thr Leu Lys Gln Gly Asn Ile Thr Leu Asn Ala
325 330 335
Val Leu Ala Glu Glu Ile Asn Pro Pro Lys Gly Glu Thr Pro Leu Lys
340 345 350
Trp Leu Leu Leu Thr Ser Glu Pro Val Glu Ser Leu Ala Gln Ala Leu
355 360 365
Arg Val Ile Asp Ile Tyr Thr His Arg Trp Arg Ile Glu Glu Phe His
370 375 380
Lys Ala Trp Lys Thr Gly Ala Gly Ala Glu Arg Gln Arg Met Glu Glu
385 390 395 400
Pro Asp Asn Leu Glu Arg Met Val Ser Ile Leu Ser Phe Val Ala Val
405 410 415
Arg Leu Leu Gln Leu Arg Glu Ser Phe Thr Pro Pro Gln Ala Leu Arg
420 425 430
Ala Gln Gly Leu Leu Lys Glu Ala Glu His Val Glu Ser Gln Ser Ala
435 440 445
Glu Thr Val Leu Thr Pro Asp Glu Cys Gln Leu Leu Gly Tyr Leu Asp
450 455 460
Lys Gly Lys Arg Lys Arg Lys Glu Lys Ala Gly Ser Leu Gln Trp Ala
465 470 475 480
Tyr Met Ala Ile Ala Arg Leu Gly Gly Phe Met Asp Ser Lys Arg Thr
485 490 495
Gly Ile Ala Ser Trp Gly Ala Leu Trp Glu Gly Trp Glu Ala Leu Gln
500 505 510
Ser Lys Leu Asp Gly Phe Leu Ala Ala Lys Asp Leu Met Ala Gln Gly
515 520 525
Ile Lys Ile
530
<210> 7
<211> 41
<212> DNA
<213> Artificial Sequence (Artificial Sequence)
<400> 7
atgggtcgcg gatccgaatt catgttctat gcggtgagga g 41
<210> 8
<211> 38
<212> DNA
<213> Artificial Sequence (Artificial Sequence)
<400> 8
gaactcctta tcgtcatcaa aggcccaggc ctcatctt 38
<210> 9
<211> 41
<212> DNA
<213> Artificial Sequence (Artificial Sequence)
<400> 9
gatgacgata aggagttcgc aacttcaact aaaaaattac a 41
<210> 10
<211> 46
<212> DNA
<213> Artificial Sequence (Artificial Sequence)
<400> 10
ggtggtggtg gtggtgctcg agttattgac ctgaatcagc gttgtc 46
<210> 11
<211> 50
<212> DNA
<213> Artificial Sequence (Artificial Sequence)
<400> 11
ggctttagcc gctgcctcct ttgcggcagc aaaggcccag gcctcatctt 50
<210> 12
<211> 50
<212> DNA
<213> Artificial Sequence (Artificial Sequence)
<400> 12
gctgccgcaa aggaggcagc ggctaaagcc atgattacca gtgcactgca 50
<210> 13
<211> 47
<212> DNA
<213> Artificial Sequence (Artificial Sequence)
<400> 13
ggtggtggtg gtggtgctcg agttagattt taatgccctg cgccatc 47

Claims (10)

1. A fusion protein, characterized by comprising a dimer formed by a first functional region and a second functional region, wherein the first functional region comprises an R-loop specific binding protein (HBD);
the second functional region comprises MNase nuclease or Tn5 transposase;
and a connecting structure connecting the first functional region and the second functional region.
2. The fusion protein of claim 1, wherein the HBD is an R-loop specific recognition protein, and the amino acid sequence of the R-loop specific recognition protein is shown in SEQ ID No.1, or the HBD is an amino acid sequence which is obtained by substituting, deleting or adding one or more amino acids on the basis of the sequence shown in SEQ ID No.1 and has the same function.
3. The fusion protein of claim 1, wherein the MNase nuclease is a wild-type MNase truncation, preferably having an amino acid sequence as shown in SEQ ID No.2, or an amino acid sequence with one or more amino acids substituted, deleted or added based on the sequence shown in SEQ ID No.2, and having the same function.
4. The fusion protein of claim 1, wherein the Tn5 transposase is a wild type Tn5 transposase mutant, and the amino acid sequence thereof is shown in SEQ ID No.3, or is an amino acid sequence that is obtained by substituting, deleting or adding one or more amino acids based on the sequence shown in SEQ ID No.3, and has the same function.
5. The fusion protein of claim 1, wherein the amino acid sequence of the linking structure is DDDKEF or DDDKEFGGGGS.
6. The fusion protein of claim 1, wherein the fusion protein has a protein purification tag attached thereto, wherein the protein purification tag is a His tag, a GST tag, an MBP tag, or a SUMO tag.
7. The fusion protein of claim 1, wherein the second functional domain of the fusion protein is N-terminal to the functional amino acid domain of the first functional domain.
8. The fusion protein of claim 1, wherein the monomer of the fusion protein is HBD-MNase or HBD-Tn5, and the amino acid sequence of the HBD-MNase is shown in SEQ ID NO.5, or the fusion protein is an amino acid sequence which is obtained by substituting, deleting or adding one or more amino acids on the basis of the sequence shown in SEQ ID NO.5 and has the same function; the amino acid sequence of the fusion protein HBD-Tn5 is shown in SEQ ID NO.6, or is an amino acid sequence which is obtained by substituting, deleting or adding one or more amino acids on the basis of the sequence shown in SEQ ID NO.6 and has the same function.
9. Use of the fusion protein of any one of claims 1-8 for preparing an R-loop high-throughput sequencing library of a biological sample.
10. Use of the fusion protein of any one of claims 1-8 in an in situ active R-loop assay.
CN202011308076.2A 2020-11-19 2020-11-19 Fusion protein and application thereof Active CN113717256B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202011308076.2A CN113717256B (en) 2020-11-19 2020-11-19 Fusion protein and application thereof


Publications (2)

Publication Number Publication Date
CN113717256A true CN113717256A (en) 2021-11-30
CN113717256B CN113717256B (en) 2023-10-03

Family

ID=78672368


Country Status (1)

Country Link
CN (1) CN113717256B (en)



Patent Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109400714A (en) * 2018-10-26 2019-03-01 南京诺唯赞生物科技有限公司 The recombination fusion protein of antibody target and its application in epigenetics
CN110372799A (en) * 2019-08-01 2019-10-25 北京大学 A kind of fusion protein and its application for the preparation of the unicellular library ChIP-seq

Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
CHEN,J.等: "Accession NO: 1EY0_A,Chain A, STAPHYLOCOCCAL NUCLEASE", 《NCBI》 *
NOWOTNY MARCIN等: "Specific recognition of RNA DNA hybrid and enhancement of human RNase H1 activity by HBD", 《THE EMBO JOURNAL》 *
QINGQING YAN等: "Mapping Native R-Loops Genome-wide Using a Targeted Nuclease Approach", 《CELL REPORTS》 *

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115948363A (en) * 2022-08-26 2023-04-11 武汉影子基因科技有限公司 Tn5 transposase mutant and preparation method and application thereof
CN115948363B (en) * 2022-08-26 2024-02-27 武汉影子基因科技有限公司 Tn5 transposase mutant and preparation method and application thereof

Also Published As

Publication number Publication date
CN113717256B (en) 2023-10-03


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant