WO2019184044A1 - Fusion protein of transposase-antibody binding protein and preparation and application thereof - Google Patents

Fusion protein of transposase-antibody binding protein and preparation and application thereof

Publication number
WO2019184044A1
Authority
WO
WIPO (PCT)
Prior art keywords
protein
fusion protein
transposase
domain
seq
Application number
PCT/CN2018/084711
Other languages
French (fr)
Chinese (zh)
Inventor
朱化星
王米
李科
何翼
Original Assignee
上海欣百诺生物科技有限公司
Application filed by 上海欣百诺生物科技有限公司

Classifications

    • C: CHEMISTRY; METALLURGY
    • C12: BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12N: MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N9/00: Enzymes; Proenzymes; Compositions thereof; Processes for preparing, activating, inhibiting, separating or purifying enzymes
    • C12N9/10: Transferases (2.)
    • C12N9/12: Transferases (2.) transferring phosphorus containing groups, e.g. kinases (2.7)
    • C12N9/1241: Nucleotidyltransferases (2.7.7)
    • C: CHEMISTRY; METALLURGY
    • C07: ORGANIC CHEMISTRY
    • C07K: PEPTIDES
    • C07K14/00: Peptides having more than 20 amino acids; Gastrins; Somatostatins; Melanotropins; Derivatives thereof
    • C07K14/195: Peptides having more than 20 amino acids; Gastrins; Somatostatins; Melanotropins; Derivatives thereof from bacteria
    • C07K14/305: Peptides having more than 20 amino acids; Gastrins; Somatostatins; Melanotropins; Derivatives thereof from bacteria from Micrococcaceae (F)
    • C07K14/31: Peptides having more than 20 amino acids; Gastrins; Somatostatins; Melanotropins; Derivatives thereof from bacteria from Micrococcaceae (F) from Staphylococcus (G)
    • C: CHEMISTRY; METALLURGY
    • C07: ORGANIC CHEMISTRY
    • C07K: PEPTIDES
    • C07K14/00: Peptides having more than 20 amino acids; Gastrins; Somatostatins; Melanotropins; Derivatives thereof
    • C07K14/195: Peptides having more than 20 amino acids; Gastrins; Somatostatins; Melanotropins; Derivatives thereof from bacteria
    • C07K14/315: Peptides having more than 20 amino acids; Gastrins; Somatostatins; Melanotropins; Derivatives thereof from bacteria from Streptococcus (G), e.g. Enterococci
    • C: CHEMISTRY; METALLURGY
    • C12: BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12N: MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N9/00: Enzymes; Proenzymes; Compositions thereof; Processes for preparing, activating, inhibiting, separating or purifying enzymes
    • C12N9/14: Hydrolases (3)
    • C12N9/16: Hydrolases (3) acting on ester bonds (3.1)
    • C12N9/22: Ribonucleases RNAses, DNAses
    • C: CHEMISTRY; METALLURGY
    • C12: BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12Q: MEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
    • C12Q1/00: Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions
    • C12Q1/68: Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions involving nucleic acids
    • C12Q1/6804: Nucleic acid analysis using immunogens
    • C: CHEMISTRY; METALLURGY
    • C12: BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12Q: MEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
    • C12Q1/00: Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions
    • C12Q1/68: Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions involving nucleic acids
    • C12Q1/6869: Methods for sequencing
    • C: CHEMISTRY; METALLURGY
    • C07: ORGANIC CHEMISTRY
    • C07K: PEPTIDES
    • C07K2319/00: Fusion polypeptide
    • C07K2319/70: Fusion polypeptide containing domain for protein-protein interaction
    • C: CHEMISTRY; METALLURGY
    • C07: ORGANIC CHEMISTRY
    • C07K: PEPTIDES
    • C07K2319/00: Fusion polypeptide
    • C07K2319/70: Fusion polypeptide containing domain for protein-protein interaction
    • C07K2319/705: Fusion polypeptide containing domain for protein-protein interaction containing a protein-A fusion

Definitions

  • the invention belongs to the fields of molecular biology, genomics and biotechnology, and particularly relates to a fusion protein of a transposase-antibody binding protein, and preparation and application thereof.
  • Under the action of a transposase, a transposon sequence can be inserted and integrated at random locations in the genome. Because of this random DNA insertion, transposases are often used in mutant library construction and sequencing library construction. In library construction for high-throughput sequencing, the transposase randomly fragments the DNA to be sequenced and adds a linker to both ends of each fragment, so that the fragments can be sequenced directly after PCR amplification. Compared with conventional ultrasonic fragmentation, this method has great advantages: fewer steps save money and time.
  • Antibody binding proteins such as ProteinA, ProteinG or ProteinL can bind to mammalian IgG via the Fc region.
  • the binding strength of the antibody binding protein to IgG is highly dependent on the species and subtype of the antibody, and the recombinant antibody binding protein has a stronger binding ability than the natural antibody binding protein.
  • Antibody binding proteins are often used in experiments such as immunoassays, antibody immunoprecipitation, and the like.
  • Chromatin immunoprecipitation (ChIP): the principle of ChIP is to crosslink the DNA and protein in the cell, cut the chromatin into small fragments by sonication, add IgG that specifically binds the antigen protein, and precipitate the DNA fragments bound to the target protein on antibody binding protein beads, thereby enriching the DNA associated with the protein of interest.
  • The applications of ChIP range from studying the relationship between a target protein and a known target sequence to studying its interaction with unknown sequences; from studying the relationship between one target protein and DNA to studying the joint binding of two proteins to DNA; and from studying histone modifications in promoter regions to studying protein complexes bound to DNA sequences.
  • ChIP is a relatively mature technology, but some technical difficulties remain. For example, ChIP experiments involve many steps, the results are poorly reproducible, and a large amount of starting material is required. Nerve cells and stem cells are often difficult to culture, so it is hard to obtain a large number of cells, and it is difficult to distinguish the phenotype of individual cells from that of the whole cell population. In addition, chromatin immunoprecipitation often yields a large amount of DNA that includes many non-specifically bound, false positive sequences.
  • ChIP-Seq, which combines ChIP with next-generation sequencing technology, is capable of detecting DNA segments that interact with proteins across the genome.
  • In ChIP-Seq, the DNA fragments that ChIP specifically enriches by their binding to the protein of interest are purified and built into a library, and these fragments are then sequenced at high throughput. The sequences that interact with the protein of interest are thereby obtained and aligned to the whole genome to map them accurately.
  • ChIP-Seq inherits the technical difficulties of ChIP. In addition, the library construction step requires filling in the ends of the DNA fragments, adding an A to the ends, and then attaching a Y-type connector at the A.
  • This library construction method is cumbersome and has many steps, and each step loses valuable sample and, with it, final sequencing information. Because of these losses, a large amount of sample DNA is required, which is a major limitation for studies in which large samples are hard to obtain, such as studies of nerve cells that are difficult to culture, or studies of tumor cells that examine cell heterogeneity and must be performed on single cells.
  • a fusion protein comprising a first domain having a transposition function and a second domain having a function of binding an Fc fragment of an antibody.
  • The transposition function refers to the function of inserting a gene sequence by transposition.
  • The function of binding the Fc segment of an antibody refers to the function of binding the Fc segment of an IgG molecule.
  • the first domain is a transposase or a protein analog having a transposition function.
  • the protein analog is capable of carrying a DNA sequence and inserting the DNA sequence into another stretch of DNA.
  • Having a transposition function may mean having transposase activity.
  • the protein analog having a transposition function may be a protein analog having a transposase activity.
  • The first domain is a member of the Tc1/Mariner, hobo, MITEs, hAT, PiggyBac (PB) or TnA transposase family having a transposition function, or another protein analog having a transposition function.
  • the protein analog is capable of carrying a DNA sequence and inserting it into another piece of DNA.
  • The first domain is a member of the TnA transposase family.
  • the TnA transposase family is selected from the group consisting of Tn1, Tn2, Tn3, Tn4, Tn5, Tn6, Tn7, Tn8, Tn9 or Tn10.
  • the first domain is a Tn5 transposase or a Tn10 transposase.
  • The Tn5 transposase is selected from the group consisting of a full-length Tn5 transposase, a partial functional domain of a Tn5 transposase, a Tn5 transposase mutant, a tagged full-length Tn5 transposase, a tagged partial functional domain of a Tn5 transposase, and a tagged Tn5 transposase mutant.
  • The Tn10 transposase is selected from the group consisting of a full-length Tn10 transposase, a partial functional domain of a Tn10 transposase, a Tn10 transposase mutant, a tagged full-length Tn10 transposase, a tagged partial functional domain of a Tn10 transposase, and a tagged Tn10 transposase mutant.
  • The tag is selected from the group consisting of HHHHHH, DYKDDDDK, YPYDVPDYA and GGLLISGGAL.
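As a non-limiting illustration, the tag (label) sequences listed above could be recognized at the ends of a construct as sketched below. The function name `find_terminal_tags` and the example sequence are hypothetical and are not part of the disclosure; only the four tag strings are taken from the text.

```python
# Tag sequences exactly as listed in the text.
TAGS = ("HHHHHH", "DYKDDDDK", "YPYDVPDYA", "GGLLISGGAL")


def find_terminal_tags(protein):
    """Return (terminus, tag) pairs for every listed tag found at either end."""
    hits = []
    for tag in TAGS:
        if protein.startswith(tag):
            hits.append(("N-terminus", tag))
        if protein.endswith(tag):
            hits.append(("C-terminus", tag))
    return hits


# Hypothetical placeholder sequence, not a sequence from the disclosure.
example = "HHHHHH" + "MITSALHRAAD" + "DYKDDDDK"
print(find_terminal_tags(example))
```

The check is limited to the two termini because the sketch assumes a tag is appended to one end of the protein; the text itself does not state where the tag is placed.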
  • the Tn5 transposase mutant is selected from the group consisting of: R30Q, K40Q, Y41H, T47P, E54K/V, M56A, R62Q, D97A, E110K, D188A, Y319A, R322A/K/Q, E326A, K330A/R, K333A, R342A, E344A, E345K, N348A, L372P, S438A, K439A, S445A, G462D, A466D.
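The mutation list above can be read mechanically. The following minimal sketch assumes the conventional notation in which the first letter is the original residue, the number is the 1-based position, and the letters after it (separated by "/") are alternative substitutions. The helper names `expand_mutations` and `apply_mutation` are hypothetical, and the demonstration does not use the SEQ ID NO. 1 sequence.

```python
import re

# Mutation list copied from the text; "/" separates alternative substitutions.
TN5_MUTATIONS = (
    "R30Q, K40Q, Y41H, T47P, E54K/V, M56A, R62Q, D97A, E110K, D188A, "
    "Y319A, R322A/K/Q, E326A, K330A/R, K333A, R342A, E344A, E345K, "
    "N348A, L372P, S438A, K439A, S445A, G462D, A466D"
)

PATTERN = re.compile(r"^([A-Z])(\d+)([A-Z](?:/[A-Z])*)$")


def expand_mutations(text):
    """Expand entries such as 'R322A/K/Q' into (original, position, substitute) tuples."""
    result = []
    for entry in text.split(","):
        match = PATTERN.match(entry.strip())
        if match is None:
            raise ValueError(f"unrecognized mutation: {entry!r}")
        original, position, substitutes = match.groups()
        for substitute in substitutes.split("/"):
            result.append((original, int(position), substitute))
    return result


def apply_mutation(sequence, original, position, substitute):
    """Apply one substitution (1-based position) after checking the original residue."""
    if sequence[position - 1] != original:
        raise ValueError(f"expected {original} at position {position}")
    return sequence[: position - 1] + substitute + sequence[position:]


mutations = expand_mutations(TN5_MUTATIONS)
print(len(mutations), mutations[4:6])
```

Checking the original residue before substituting guards against applying a mutation to a sequence that is numbered differently from the one the list refers to.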
  • amino acid sequence of the full-length Tn5 transposase is as shown in SEQ ID NO.
  • the full length Tn10 transposase amino acid sequence is set forth in SEQ ID NO.
  • The second domain is Staphylococcus aureus Protein A (Protein A), Streptococcal Protein G (Protein G), Streptococcal Protein L (Protein L), or another protein analog having the function of binding the Fc segment of an antibody.
  • The protein analog can bind to the Fc segment of the antibody.
  • The second domain is a full-length S. aureus Protein A, a partial functional domain of S. aureus Protein A, an S. aureus Protein A mutant, a full-length Streptococcal Protein G, a partial functional domain of Streptococcal Protein G, a Streptococcal Protein G mutant, a tagged full-length S. aureus Protein A, a tagged partial functional domain of S. aureus Protein A, a tagged S. aureus Protein A mutant, a tagged full-length Streptococcal Protein G, a tagged partial functional domain of Streptococcal Protein G, or a tagged Streptococcal Protein G mutant.
  • amino acid sequence of the full-length S. aureus A protein is as shown in SEQ ID NO.
  • amino acid sequence of the full-length Streptococcus G protein is shown in SEQ ID NO.
  • the first domain and the second domain are joined by a linker.
  • The present invention places no particular requirement on the order of connection, as long as the object of the present invention is not compromised.
  • the C-terminus of the first domain can be joined to the N-terminus of the second domain.
  • the C-terminus of the second domain may be joined to the N-terminus of the first domain.
  • the fusion protein has the general formula: a first domain-linker fragment-second domain or a second domain-linker fragment-first domain.
  • The linker has the structural formula (GS)a(GGS)b(GGGS)c(GGGGS)d, wherein a, b, c and d are each an integer greater than or equal to zero.
  • The amino acid sequence of the linker fragment can be selected from the following:
  • The linker fragment may be Linker1, whose amino acid sequence is shown in SEQ ID NO. 10, specifically: GGGGS.
  • The linker fragment may be Linker2, whose amino acid sequence is shown in SEQ ID NO. 11, specifically: GGGSGGGGS.
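The linker formula and the two allowed domain orders described above can be illustrated with a minimal sketch. The function names and the placeholder domain strings are hypothetical; only the repeat units and the Linker1 and Linker2 sequences come from the text.

```python
def build_linker(a=0, b=0, c=0, d=0):
    """Return the peptide (GS)a(GGS)b(GGGS)c(GGGGS)d for integers a, b, c, d >= 0."""
    for name, value in (("a", a), ("b", b), ("c", c), ("d", d)):
        if not isinstance(value, int) or value < 0:
            raise ValueError(f"{name} must be an integer >= 0, got {value!r}")
    return "GS" * a + "GGS" * b + "GGGS" * c + "GGGGS" * d


def assemble_fusion(first_domain, second_domain, linker, first_domain_at_n_terminus=True):
    """Join the two domains through the linker in either of the two allowed orders."""
    if first_domain_at_n_terminus:
        return first_domain + linker + second_domain
    return second_domain + linker + first_domain


LINKER1 = build_linker(d=1)       # GGGGS, as given for SEQ ID NO. 10
LINKER2 = build_linker(c=1, d=1)  # GGGSGGGGS, as given for SEQ ID NO. 11

# Hypothetical placeholder domains, not sequences from the disclosure.
print(assemble_fusion("AAAA", "CCCC", LINKER1))
print(assemble_fusion("AAAA", "CCCC", LINKER2, first_domain_at_n_terminus=False))
```

Under this reading of the formula, Linker1 corresponds to a = b = c = 0, d = 1, and Linker2 to a = b = 0, c = 1, d = 1.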
  • fusion proteins such as Tn5-ProteinA-1, Tn5-ProteinG-2, Tn10-ProteinA-3, Tn10-ProteinG-4, and Tn5-ProteinA-5 are listed.
  • the amino acid sequence of the Tn5-Protein A-1 is shown in SEQ ID NO.
  • the amino acid sequence of the Tn5-ProteinG-2 is shown in SEQ ID NO.
  • the amino acid sequence of the Tn10-Protein A-3 is shown in SEQ ID NO.
  • the amino acid sequence of the Tn10-ProteinG-4 is shown in SEQ ID NO.
  • the amino acid sequence of the Tn5-Protein A-5 is shown in SEQ ID NO.
  • amino acid sequence of the fusion protein of the invention may be as set forth in any one of SEQ ID NO. 5, SEQ ID NO. 6, SEQ ID NO. 7, SEQ ID NO. 8, or SEQ ID NO. However, it is not limited to the specific forms listed in the preferred cases of the present invention.
  • An isolated polynucleotide (i.e., a DNA molecule) encoding the aforementioned fusion protein is provided.
  • the polynucleotide encoding the fusion protein of the present invention may be in the form of DNA or RNA.
  • DNA forms include cDNA, genomic DNA or synthetic DNA.
  • DNA can be single-stranded or double-stranded.
  • The polynucleotide encoding the fusion protein of the present invention can be prepared by any suitable technique well known to those skilled in the art. Such techniques are described in general references in the field, such as the Guide to Molecular Cloning (J. Sambrook et al., Science Press, 1995). They include, but are not limited to, recombinant DNA techniques, chemical synthesis and the like, for example overlap extension PCR.
  • The optimized nucleotide sequence encoding the Tn5 transposase is shown in SEQ ID NO.
  • The optimized nucleotide sequence encoding the Tn10 transposase is shown in SEQ ID NO.
  • The optimized nucleotide sequence encoding the S. aureus Protein A is shown in SEQ ID NO.
  • The optimized nucleotide sequence encoding the Streptococcus G protein is set forth in SEQ ID NO.
  • The optimized nucleotide sequence encoding the linker Linker1 is shown in SEQ ID NO. 16, specifically: GGTGGTGGTGGTTCT.
  • the nucleotide sequence encoding the linker Linker2 is shown in SEQ ID NO. 17, specifically: GGTGGTGGTTCTGGTGGTGGTGGTTCT.
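As a consistency check, the two linker coding sequences given above can be translated with the standard genetic code and compared with the Linker1 and Linker2 peptides. The sketch below builds the standard codon table itself; the function name `translate` is hypothetical, and the use of the standard table is an assumption, since the text does not name the codon table it follows.

```python
from itertools import product

# Standard genetic code, codons ordered with T, C, A, G at each position.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): amino_acid
    for codon, amino_acid in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna):
    """Translate a DNA coding sequence codon by codon ('*' marks a stop codon)."""
    if len(dna) % 3 != 0:
        raise ValueError("sequence length must be a multiple of 3")
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna), 3))


LINKER1_DNA = "GGTGGTGGTGGTTCT"              # given for SEQ ID NO. 16
LINKER2_DNA = "GGTGGTGGTTCTGGTGGTGGTGGTTCT"  # given for SEQ ID NO. 17

print(translate(LINKER1_DNA))
print(translate(LINKER2_DNA))
```

Both coding sequences use only the codons GGT (glycine) and TCT (serine), so the translations reduce to the G/S repeats of the linker formula.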
  • nucleotide sequence encoding the fusion protein Tn5-ProteinA-1 is optimized as shown in SEQ ID NO.
  • nucleotide sequence encoding the fusion protein Tn5-ProteinG-2 is shown in SEQ ID NO.
  • nucleotide sequence encoding the fusion protein Tn10-Protein A-3 is shown in SEQ ID NO.
  • nucleotide sequence encoding the fusion protein Tn10-ProteinG-4 is shown in SEQ ID NO.
  • nucleotide sequence encoding the fusion protein Tn5-ProteinA-5 is shown in SEQ ID NO.
  • a cloning vector and an expression vector comprising the aforementioned polynucleotide are provided.
  • the expression vector of the present invention contains a polynucleotide encoding the fusion protein.
  • Methods well known to those skilled in the art can be used to construct the expression vector. These methods include recombinant DNA techniques, DNA synthesis techniques, and the like.
  • the DNA encoding the fusion protein can be operably linked to a multiple cloning site in the vector to direct mRNA synthesis to express the protein, or for homologous recombination.
  • The cloning vector can be the Easy Cloning T-vector (Novoprotein, T003-01B), and the expression vector can be pET21a.
  • a host cell which is transformed with the aforementioned expression vector or cloning vector.
  • the host cell can employ BL21 (DE3) or Rosetta pLysS.
  • a method for preparing the aforementioned fusion protein comprises the steps of:
  • Synthesizing or cloning the DNA sequence of interest; constructing a cloning vector containing the DNA sequence of interest; constructing an expression vector containing the DNA sequence of interest; transforming the expression vector into a prokaryotic host cell; and screening for a cell strain that expresses the fusion protein at a high level in the growth medium.
  • The high-expressing cell strain is cultured so that the fusion protein is expressed, and the fusion protein is purified from the expression product.
  • the expression vector may employ pET21a.
  • the host cell may employ BL21 (DE3) or Rosetta pLysS.
  • A reagent combination comprising the aforementioned fusion protein and other components used together with it.
  • The other components include the corresponding buffer and further components used in combination.
  • Other components that are suitable for use may be in the form of a solid, a liquid or a material adsorbed on a special material.
  • An antibody-binding transposome can be obtained by attaching a linker to the first domain of the fusion protein. Further, the antibody-binding transposome is a dimer.
  • the antibody-binding transposome has the function of cleaving a DNA duplex at a random position and inserting a linker at the cleavage site.
  • the antibody binding transposome can also bind to the Fc portion of the antibody and form a complex with the antibody.
  • an antibody-binding transposome comprising the fusion protein and a linker linked to the first domain of the fusion protein is provided.
  • The antibody-binding transposome is a dimer.
  • the antibody-binding transposome has the function of cleaving a DNA duplex and inserting a linker at the cleavage site.
  • the antibody binding transposome can also bind to the Fc portion of the antibody and form a complex with the antibody.
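The cleave-and-insert behaviour described above can be pictured with a deliberately simplified toy model: a DNA string is cut at chosen positions and every resulting fragment is flanked by linkers. This is only a sketch under stated assumptions. It treats DNA as a single string, ignores strand-level chemistry, and the function name `tagment`, the bracketed linker markers and the random sequence are all hypothetical.

```python
import random


def tagment(dna, cut_sites, left_linker, right_linker):
    """Cut `dna` at the given 0-based positions and flank each fragment with linkers."""
    sites = sorted({s for s in cut_sites if 0 < s < len(dna)})
    bounds = [0] + sites + [len(dna)]
    fragments = [dna[start:end] for start, end in zip(bounds, bounds[1:])]
    return [left_linker + fragment + right_linker for fragment in fragments]


rng = random.Random(0)  # fixed seed so the sketch is repeatable
genome = "".join(rng.choice("ACGT") for _ in range(60))  # made-up sequence
cuts = rng.sample(range(1, len(genome)), 3)  # three random cut positions
for tagged in tagment(genome, cuts, "[L]", "[R]"):
    print(tagged)
```

Because every fragment leaves the function already carrying a linker at both ends, the fragments can serve directly as PCR templates in this simplified picture.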
  • the use of the antibody-binding transposome to construct a sequencing library is provided.
  • the use of the antibody-binding transposome for studying protein-chromatin interactions is provided.
  • In a ninth aspect of the invention, there is provided a method of studying protein-chromatin interaction, comprising the steps of:
  • The present invention uniquely found that in step (3) the antibody-binding transposome is linked to the antibody via the second domain, and the antibody specifically binds a transcription factor (TF) or histone on chromatin, so that the antibody-binding transposome, the antibody, the TF and the DNA are joined together into a complex. Constrained by this complex, the DNA-cleaving activity of the antibody-binding transposome is limited to the DNA near the position where the TF is located, and DNA farther away is not cleaved.
  • Linkers are introduced at the cleaved positions, and these linkers serve as part of the primer sequences for PCR amplification, yielding DNA fragments that carry the TF binding position information. Moreover, these fragments are joined to the sequencing linkers during the PCR, so that after magnetic bead selection the sequencing library is complete and can go directly to high-throughput sequencing.
  • The above method for studying protein-chromatin interaction omits the co-immunoprecipitation and elution steps, which reduces the number of operations and greatly reduces sample loss, and it constructs the sequencing library directly, which greatly facilitates the subsequent sequencing work. As a result, less starting sample is required, less information is lost, and repeatability and credibility are greatly improved.
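The restriction of cleavage to the neighbourhood of the bound protein can be sketched as a simple filter on candidate cut positions. The function name `restricted_cut_sites`, the `reach` parameter and all numbers below are hypothetical illustrations; the text does not state how far from the TF position cleavage can occur.

```python
def restricted_cut_sites(candidate_sites, tf_positions, reach):
    """Keep only candidate cut sites within `reach` bases of a TF-bound position."""
    return sorted(
        site
        for site in set(candidate_sites)
        if any(abs(site - position) <= reach for position in tf_positions)
    )


# Hypothetical numbers: candidate sites every 50 bases, TF bound at 200 and 740.
candidates = range(0, 1000, 50)
print(restricted_cut_sites(candidates, tf_positions=[200, 740], reach=60))
```

Sites far from any bound position are discarded, which mirrors the statement that DNA farther from the TF is not cleaved.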
  • the present invention has the following beneficial effects:
  • The present invention provides a novel fusion protein that has both the function of inserting a gene sequence by transposition and the function of binding the Fc segment of an IgG molecule.
  • the fusion protein can be used to prepare an antibody-binding transposome for experiments on mutant library construction, high-throughput sequencing library construction, immunoassay, IgG purification, and the like.
  • Using the antibody-binding transposome to study protein-chromatin interactions is simpler, more efficient and more economical than ChIP-Seq, and has better repeatability. The amount of sample DNA required is greatly reduced, as are the sample and data lost along the way. This is of great significance for the study of protein-DNA interaction.
  • FIG. 1 Electrophoresis pattern of Tn5-ProteinA-1 fusion protein expressed after IPTG induction, wherein Lane A: non-induced crude; Lane B: induced crude; Lanes B1-B4: induced crude; Lane C: supernatant of lysate; Lane D: precipitate of lysate; MK: molecular weight marker.
  • FIG. 2 Electrophoresis pattern of Tn5-ProteinG-2 fusion protein expressed after IPTG induction, wherein Lane A: non-induced crude; Lane B: induced crude; Lanes B1-B2: induced crude; Lane C: supernatant of lysate; Lane D: precipitate of lysate; MK: molecular weight marker.
  • FIG. 3 Electrophoresis pattern of Tn10-ProteinA-3 fusion protein expressed after IPTG induction, wherein Lane A: non-induced crude; Lane B: induced crude; Lanes B1-B4: induced crude; Lane C: supernatant of lysate; Lane D: precipitate of lysate; MK: molecular weight marker.
  • FIG. 4 Electrophoresis pattern of Tn10-ProteinG-4 fusion protein expressed after IPTG induction, wherein Lane A: non-induced crude; Lane B: induced crude; Lanes B1-B4: induced crude; Lane C: supernatant of lysate; Lane D: precipitate of lysate; MK: molecular weight marker.
  • FIG. 5 Electrophoresis pattern of Tn5-ProteinA-5 fusion protein expressed after IPTG induction, wherein Lane A: non-induced crude; Lane B: induced crude; Lanes B1-B4: induced crude; Lane C: supernatant of lysate; Lane D: precipitate of lysate; MK: molecular weight marker.
  • FIG. 6 Electrophoresis pattern of amplified expression of Tn5-ProteinA-1, wherein Lane A: non-induced crude; Lane B: induced crude; Lane C: supernatant of lysate; Lane D: precipitate of lysate; MK: molecular weight marker.
  • FIG. 7 Electrophoresis pattern of amplified expression of Tn5-ProteinG-2, wherein Lane A: non-induced crude; Lane B: induced crude; Lane C: supernatant of lysate; Lane D: precipitate of lysate; M: molecular weight marker.
  • FIG. 8 Electrophoresis pattern of amplified expression of Tn10-ProteinA-3, wherein Lane A: non-induced crude; Lane B: induced crude; Lane C: supernatant of lysate; Lane D: precipitate of lysate; M: molecular weight marker.
  • FIG. 9 Electrophoresis pattern of amplified expression of Tn10-ProteinG-4, wherein Lane A: non-induced crude; Lane B: induced crude; Lane C: supernatant of lysate; Lane D: precipitate of lysate; M: molecular weight marker.
  • FIG. 10 Electrophoresis pattern of amplified expression of Tn5-ProteinA-5, wherein Lane A: non-induced crude; Lane B: induced crude; Lane C: supernatant of lysate; Lane D: precipitate of lysate; MK: molecular weight marker.
  • FIG. 11 Transposome formed by a fusion protein of a transposase-antibody binding protein and an adaptor, which can fragment double-stranded DNA and add a linker at both ends of the fragmented DNA.
  • FIG. 12 Digestion of 50 ng of human genomic DNA by the transposomes of fusion proteins 1-5 and of Tn5, wherein MK: DNA marker; 1: Tn5-ProteinA-1; 2: Tn5-ProteinG-2; 3: Tn10-ProteinA-3; 4: Tn10-ProteinG-4; 5: Tn5-ProteinA-5; 6: Tn5; 7: untreated genomic DNA.
  • Figure 13 DNA fragments sorted by magnetic beads after PCR amplification; lanes 1-7 differ in the amounts of magnetic beads added in the two additions, which sorts the DNA into different fragment sizes.
  • Figure 14 Detection of fragment size using an Agilent 2100 high-sensitivity DNA chip, 1: unsorted DNA fragments; 2-6: DNA fragments of different lengths after sorting.
  • Figure 15 The antibody binding protein portion of the fusion protein can bind to the Fc portion of IgG.
  • Figure 16 Standard curve of ProteinA protein.
  • FIG. 17 ChT-Seq method to study the interaction between protein and genomic DNA.
  • the transposase can only cleave adjacent DNA sequences under the restriction of the Transposase-ProteinA/G-IgG-TF complex and introduce a sequencing linker.
  • FIG. 19 Genomic DNA electrophoresis map. Lanes 1 and 2: genomic DNA extracted from 1×10^6 HeLa cells, used for the ChIP-Seq experiment; lanes 3 and 4: genomic DNA extracted from 2×10^5 HeLa cells, used for the ChT-Seq experiment. More genomic DNA is used for ChIP-Seq than for ChT-Seq.
  • FIG. 20 Electrophoresis map of the libraries. M: DNA marker; lanes 1 and 2: libraries obtained from about 10 µg of genomic DNA by the ChIP-Seq method (sonication, immunoprecipitation, end filling, A addition and linker addition) followed by PCR amplification; lanes 3 and 4: libraries obtained from about 2 µg of genomic DNA by the ChT-Seq method, in which the transposome cleaves the genome and attaches the linker in one step, followed by PCR amplification. As the figure shows, ChT-Seq yields more library from less starting template, with much less loss along the way.
  • FIG. 21 Library quality measured by Qubit and Nanodrop; the ordinate is DNA concentration (ng/µl). ChIP-Seq denotes the sequencing library constructed by the ChIP-Seq method from an initial 10 µg of DNA; ChT-Seq denotes the sequencing library constructed by the ChT-Seq method from an initial 2 µg of DNA; blue is the Qubit result and red is the Nanodrop result. ChT-Seq obtains more sequencing library than ChIP-Seq from a smaller initial amount of DNA.
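The difference in starting material between the two methods can be put in numbers. The short sketch below uses only the input amounts quoted in the figure descriptions above (cell numbers and approximate micrograms of genomic DNA); the dictionary layout and the function name `fold_reduction` are hypothetical, and no library yields are computed because none are stated in the text.

```python
# Input amounts as quoted in the figure descriptions (DNA amounts are approximate).
chip_seq = {"cells": 1_000_000, "genomic_dna_ug": 10}
cht_seq = {"cells": 200_000, "genomic_dna_ug": 2}


def fold_reduction(reference, alternative, key):
    """Return how many times less input the alternative method uses for `key`."""
    return reference[key] / alternative[key]


for key in ("cells", "genomic_dna_ug"):
    print(key, fold_reduction(chip_seq, cht_seq, key))
```

On these figures, ChT-Seq starts from one fifth of the cells and one fifth of the genomic DNA used for ChIP-Seq.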
  • The experimental methods, detection methods and preparation methods disclosed in the present invention employ techniques conventional in the art of molecular biology, biochemistry, chromatin structure and analysis, analytical chemistry, cell culture, recombinant DNA technology and related fields. These techniques are well described in the existing literature; see Sambrook et al., MOLECULAR CLONING: A LABORATORY MANUAL, Second edition, Cold Spring Harbor Laboratory Press, 1989 and Third edition, 2001; Ausubel et al., CURRENT PROTOCOLS IN MOLECULAR BIOLOGY, John Wiley & Sons, New York, 1987 and periodic updates; the series METHODS IN ENZYMOLOGY, Academic Press, San Diego; Wolffe, CHROMATIN STRUCTURE AND FUNCTION, Third edition, Academic Press, San Diego, 1998; METHODS IN ENZYMOLOGY, Vol. 304, Chromatin (P. M. Wassarman and A. P. Wolffe, eds.), Academic Press, San Diego, 1999; and METHODS IN MOLECULAR BIOLOGY, Vol. 119, Chromatin Protocols (P. B. Becker, ed.), Humana Press, Totowa, 1999, and the like.
  • the fusion protein of the present embodiment includes a first domain having a transposition function and a second domain having a function of binding an antibody Fc segment.
  • the transposition function refers to the function of transposition insertion of a gene sequence.
  • the binding antibody Fc segment function refers to binding to the Fc segment function in an IgG molecule.
  • The first domain and the second domain are connected by a linker fragment (Linker). Further, the first domain may be a transposase protein and the second domain may be an antibody binding protein.
  • In the specific construction scheme of the fusion protein, the transposase protein and the antibody binding protein are joined by a linker fragment with the structural formula (GS)a(GGS)b(GGGS)c(GGGGS)d, where a, b, c and d are all integers greater than or equal to 0.
  • In the present invention, the sequences of the transposase protein, the antibody binding protein and the linker fragment (GS)a(GGS)b(GGGS)c(GGGGS)d (where a, b, c and d are all integers greater than or equal to 0) are optimized.
  • This example prepared, by way of illustration, five differently constructed fusion proteins of transposase-antibody binding protein, named Tn5-ProteinA-1, Tn5-ProteinG-2, Tn10-ProteinA-3, Tn10-ProteinG-4 and Tn5-ProteinA-5, respectively.
  • A fusion protein designated Tn5-ProteinA-1, the first domain of which is a Tn5 transposase and the second domain of which is an S. aureus A protein, the first domain and the second domain being joined by the linker fragment Linker1.
  • the amino acid sequence of the Tn5 transposase is as shown in SEQ ID NO. 1, specifically:
  • the nucleotide sequence encoding the Tn5 transposase is as shown in SEQ ID NO. 12, specifically:
  • amino acid sequence of the S. aureus A protein is shown in SEQ ID NO. 3, specifically:
  • The nucleotide sequence encoding the S. aureus A protein is shown in SEQ ID NO. 14, specifically:
  • The amino acid sequence of the linker fragment, Linker1, is set forth in SEQ ID NO. 10, specifically: GGGGS.
  • The coding sequence of the linker fragment, Linker1, is set forth in SEQ ID NO. 16, specifically: GGTGGTGGTGGTTCT.
  • amino acid sequence of the fusion protein Tn5-ProteinA-1 is shown in SEQ ID NO. 5, specifically:
  • the coding nucleotide sequence of the fusion protein Tn5-ProteinA-1 is shown in SEQ ID NO. 18, specifically:
  • A fusion protein designated Tn5-ProteinG-2, the first domain of which is a Tn5 transposase and the second domain of which is a Streptococcus G protein, the first domain and the second domain being joined by the linker fragment Linker1.
  • the amino acid sequence of the Tn5 transposase is set forth in SEQ ID NO.
  • the nucleotide sequence encoding the Tn5 transposase is set forth in SEQ ID NO.
  • the amino acid sequence of the Streptococcus G protein is as shown in SEQ ID NO. 4, specifically:
  • the nucleotide sequence encoding the Streptococcus G protein is as shown in SEQ ID NO. 15, specifically:
  • the amino acid sequence of the linker, Linker1, is set forth in SEQ ID NO.
  • The coding sequence of the linker fragment, Linker1, is set forth in SEQ ID NO.
  • The amino acid sequence of the Tn5-ProteinG-2 fusion protein is shown in SEQ ID NO. 6, specifically:
  • the nucleotide sequence encoding the Tn5-ProteinG-2 fusion protein is shown in SEQ ID NO. 19, specifically:
  • a fusion protein designated Tn10-ProteinA-3 has a first domain of Tn10 transposase and a second domain of S. aureus A protein, the first domain and the second domain being joined by a linker Linker1.
  • the amino acid sequence of the Tn10 transposase is shown in SEQ ID NO. 2, specifically:
  • the nucleotide sequence encoding the Tn10 transposase is shown in SEQ ID NO. 13, specifically:
  • the amino acid sequence of the S. aureus A protein is set forth in SEQ ID NO.
  • the nucleotide sequence encoding the S. aureus A protein is SEQ ID NO.
  • the amino acid sequence of the linker, Linker1, is set forth in SEQ ID NO.
  • the coding sequence of the linker, Linker1, is set forth in SEQ ID NO.
  • the amino acid sequence of the Tn10-ProteinA-3 fusion protein is shown in SEQ ID NO. 7, specifically:
  • the nucleotide sequence encoding the Tn10-ProteinA-3 fusion protein is shown in SEQ ID NO. 20, specifically:
  • a fusion protein designated Tn10-ProteinG-4 the first domain of which is a Tn10 transposase, the second domain of which is a Streptococcus G protein, and the first domain and the second domain are joined by a linker Linker1.
  • the amino acid sequence of the Tn10 transposase is shown in SEQ ID NO.
  • the nucleotide sequence encoding the Tn10 transposase is set forth in SEQ ID NO.
  • the amino acid sequence of the Streptococcus G protein is shown in SEQ ID NO.
  • the nucleotide sequence encoding the Streptococcus G protein is shown in SEQ ID NO.
  • the amino acid sequence of the linker, Linker1, is set forth in SEQ ID NO.
  • the coding sequence of the linker, Linker1, is set forth in SEQ ID NO.
  • the amino acid sequence of the Tn10-ProteinG-4 fusion protein is shown in SEQ ID NO. 8, specifically:
  • the nucleotide sequence encoding the Tn10-ProteinG-4 fusion protein is shown in SEQ ID NO. 21, specifically:
  • a fusion protein designated Tn5-ProteinA-5, the first domain of which is a Tn5 transposase and the second domain of which is an S. aureus A protein, the first domain and the second domain being joined by the linker Linker2.
  • the amino acid sequence of the Tn5 transposase is set forth in SEQ ID NO.
  • the nucleotide sequence encoding the Tn5 transposase is set forth in SEQ ID NO.
  • the amino acid sequence of the S. aureus A protein is shown in SEQ ID NO.
  • the nucleotide sequence encoding the S. aureus A protein is set forth in SEQ ID NO.
  • the amino acid sequence of the linker, Linker2, is shown in SEQ ID NO. 11, specifically: GGGSGGGGS.
  • the coding sequence of the linker, Linker2, is shown in SEQ ID NO. 17, specifically:
  • the amino acid sequence of the fusion protein Tn5-ProteinA-5 is shown in SEQ ID NO. 9, specifically:
  • the coding nucleotide sequence of the fusion protein Tn5-ProteinA-5 is shown in SEQ ID NO. 22, specifically:
  • the optimized nucleotide sequence encoding each of the above transposase-antibody binding protein fusion proteins was cloned into the expression vector pET21a in a 20 µL reaction. The following components were added to a 0.2 mL EP tube:
  • the reaction was carried out at 37 °C for 20 minutes to obtain a recombinant expression vector.
  • the kit used was NR001 from Novoprotein.
  • 1) Add the DNA of interest (i.e., the expression vector of step 1) to the competent cell suspension, gently rotate the tube to mix the contents, and let stand in an ice bath for 30 min.
  • the sample is subjected to SDS-PAGE to detect whether the target protein is expressed.
  • Tn5-ProteinA-1, Tn5-ProteinG-2, Tn10-ProteinA-3, Tn10-ProteinG-4, and Tn5-ProteinA-5 were successfully expressed.
  • sequencing of Tn5-ProteinA-1, Tn5-ProteinG-2, Tn10-ProteinA-3, Tn10-ProteinG-4, and Tn5-ProteinA-5 confirmed that the sequences were correct and consistent with expectations.
  • The function of the transposase-antibody binding protein fusion proteins obtained in Example 1 was examined. First, their ability to randomly fragment a genome, insert tag sequences, and yield a sequencing library after PCR was verified.
  • the principle and schematic diagram of the random insertion of the fusion protein is shown in Figure 11.
  • the transposome formed by the fusion protein and the adapter can fragment double-stranded DNA and add an adapter at both ends of each DNA fragment.
  • fusion proteins 1-5 (i.e., Tn5-ProteinA-1, Tn5-ProteinG-2, Tn10-ProteinA-3, Tn10-ProteinG-4, Tn5-ProteinA-5) and Tn5 transposase (positive control) were each used to treat human genomic DNA; the DNA was expected to be fragmented into short fragments of different lengths, with adapter sequences attached at both ends of each fragment.
  • the fragmented DNA can then be amplified by PCR using primers directed against the adapter sequences.
  • a sequencing library is thereby constructed, which also demonstrates that the fusion proteins have the same ability as the Tn5 transposase to insert and integrate DNA at random.
  • Fusion proteins 1-5 (i.e., Tn5-ProteinA-1, Tn5-ProteinG-2, Tn10-ProteinA-3, Tn10-ProteinG-4, Tn5-ProteinA-5) and Tn5 were each dissolved in stock solution (50 mM HEPES-KOH pH 7.2, 0.1 M NaCl, 0.1 mM EDTA, 1 mM DTT, 0.1% Triton X-100, 10% glycerol) and quantified by the BCA method, and their molar concentrations were calculated.
  • the reaction system is configured as follows:
  • the reaction conditions were 30 °C for 1 hour, followed by storage at -20 °C, to prepare the transposomes of fusion proteins 1-5 (i.e., Tn5-ProteinA-1, Tn5-ProteinG-2, Tn10-ProteinA-3, Tn10-ProteinG-4, Tn5-ProteinA-5) and the Tn5 transposome, respectively.
  • the 5× reaction buffer (50 mM TAPS-NaOH pH 8.5, 25 mM MgCl2) was thawed at room temperature, mixed by inversion and set aside. The following 20 µL reaction system was assembled in a sterile PCR tube; a negative control without transposome was included, and the Tn5 transposome was added to the positive control:
  • the components were thoroughly mixed by gentle pipetting; the PCR tube was placed in a PCR machine and reacted at 55 °C for 10 min. The PCR tube was then taken out, and an aliquot of each fragmentation product was run on a gel alongside the controls to assess fragmentation.
  • the electrophoresis pattern is shown in Fig. 12: fusion proteins 1-5 (i.e., Tn5-ProteinA-1, Tn5-ProteinG-2, Tn10-ProteinA-3, Tn10-ProteinG-4, Tn5-ProteinA-5) successfully fragmented the human genome.
  • the reaction tube is briefly centrifuged and placed on a magnetic stand until the magnetic beads are completely separated from the liquid (about 5 min, until the solution clears); the supernatant is carefully removed while the reaction tube stays on the magnetic stand;
  • the reaction tube is briefly centrifuged and placed on a magnetic stand until the magnetic beads are completely separated from the liquid (about 5 min, until the solution clears), and 14 µL of the supernatant is carefully pipetted into a new sterile PCR tube for the PCR enrichment step;
  • the PCR product obtained was size-selected with magnetic beads.
  • the magnetic beads are equilibrated to room temperature and shaken well, and the PCR system must be made up to 50 µL with sterile distilled water, because evaporation of the sample during PCR would otherwise cause the selected fragments to deviate from the expected length.
  • the sorting results are shown in Figure 13.
  • the fragmented DNA is size-selected with different ratios of magnetic beads, retaining DNA of the desired length. MK: DNA marker; 1-6: DNA fragments after selection with different bead ratios; 7: negative control without selection.
  • fragment sizes of the product were measured on an Agilent 2100 with a high-sensitivity DNA chip (see Figure 14). Because the primers used were designed specifically against Adaptor 1 and Adaptor 2, the result shows that the transposome both fragmented the target DNA and inserted the Adaptor sequences.
  • the library was sequenced on an Illumina HiSeq X™ system with a PE150 strategy; the sequencing data are shown in Table 6, and the comparison of the sequencing data with the reference genome sequence is as follows:
  • more than 95% of the data are effective, more than 90% of bases reach Q20 and Q30, and about 98% coverage is achieved at a sequencing depth of about 13X.
  • the library construction process is fast, easy to operate, and requires a small amount of sample.
  • The IgG-binding function of the fusion proteins obtained in Example 1 was examined.
  • a schematic diagram of the binding of the fusion protein to IgG is shown in FIG.
  • the ProteinA/G portion of the fusion protein can bind to the Fc portion of the IgG.
  • the plate was sealed with a cover plate and incubated at 37 ° C for 30 minutes.
  • Washing the plate: remove the cover film, discard the liquid in the wells, add 260 µL of 1× washing solution to each well, soak for 30 seconds, discard the washing solution, and repeat the wash 4 times.
  • the color reaction time is affected by temperature.
  • the ideal reaction temperature is 20-25 ° C. When the temperature is low, the reaction time should be extended.
  • a standard curve is drawn with the OD value on the ordinate and the ProteinA standard concentration on the abscissa.
  • the content of the fusion protein in the sample is calculated from the standard curve and converted into the binding efficiency of the fusion protein.
  • at a fusion protein concentration of 1.8 ng/ml and a ProteinA standard concentration of 0.2 ng/ml, the molar concentrations of the fusion protein and the standard were the same. According to the standard curve in Fig. 16, the ProteinA standard at 0.2 ng/ml gave an OD450 of 0.318, close to the OD450 of the fusion proteins. The ELISA results show that the fusion proteins Tn5-ProteinA-1, Tn5-ProteinG-2, Tn10-ProteinA-3, Tn10-ProteinG-4 and Tn5-ProteinA-5 bind IgG with a binding capacity similar to that of the ProteinA standard.
  • the principle of the traditional ChIP-Seq method is to first enrich, by chromatin immunoprecipitation (ChIP), the DNA fragments bound to the target protein, then purify these fragments and build a library from them, and finally perform high-throughput sequencing.
  • in step (2), sonication is hard to control, readily over- or under-shears the DNA, and is poorly reproducible;
  • in step (3), the immunoprecipitation procedure is complicated and cumbersome and also loses a large amount of sample;
  • in step (5), library construction involves many steps and also loses sample and DNA information; ChIP-Seq therefore requires more starting sample and is less reproducible.
  • the transposase-antibody binding protein fusion protein of the present invention, owing to its unique functions and characteristics, is applied to the study of protein-DNA interactions, creating a new method, ChT-Seq (Chromatin-Transposase-Sequencing).
  • (2) the formaldehyde is washed out, and an antibody specific for the target protein and the transposase-antibody binding protein transposome are added; the transposome forms a complex with the antibody and the target protein, and the transposase then carries out transposition, cutting the sequences adjacent to the DNA site bound by the target protein, as shown in Figure 18;
  • ChT-Seq omits the co-immunoprecipitation and elution steps, which reduces the number of operations and greatly reduces sample loss, and it attaches the adapters directly, which greatly simplifies library construction and facilitates sequencing.
  • the requirement for starting sample is lower, less information is lost, and reproducibility and reliability are greatly improved.
  • Collect cells: HeLa cells, about 2 × 10^5 (traditional ChIP-Seq needs about 1 × 10^6 to 10^7); add formaldehyde to a final concentration of 1%, mix by gentle shaking, and react at room temperature for 10 min;
  • cell lysis: add pre-cooled cell lysis buffer and 5 µL protease inhibitor according to the number of cells, resuspend the cells, and divide into 300-400 µL aliquots if needed; run 5 µL of the lysate on a gel to check the extracted genome (see Figure 19); surplus sample can be stored at -80 °C;
  • Blank control group: take 100 µL of template sample DNA, add 4 µL of 5 M NaCl, and treat at 65 °C for 2 h to reverse the cross-links; this serves as the blank control. After phenol/chloroform extraction of a portion, the fragmentation achieved in step 6 is checked by electrophoresis.
  • Negative control group: take 100 µL of template sample DNA, add 900 µL dilution buffer (containing 4.5 µL protease inhibitor) and 1 µg of non-specific mouse IgG as the antibody, and incubate at 2 °C for 2 h with gentle shaking;
  • experimental group: take 200 µL of template sample DNA, add 900 µL dilution buffer (containing 4.5 µL protease inhibitor) and the antibody specific for the target protein, and incubate at 4 °C for 2 h with gentle shaking;
  • the Transposase-ProteinA/G transposome carrying Adaptor1 was added to the experimental group and the negative control group (the transposome was assembled as described in the transposase-antibody binding protein function test of Example 2), and the mixture was incubated at 4 °C for 10 min with gentle shaking. MgCl2 was then added to a final Mg2+ concentration of 5 mM, and the reaction was held at 55 °C for 10 min;
  • the experiment was carried out using the traditional ChIP-Seq method.
  • the number of HeLa cells was 1 × 10^6,
  • the DNA extracted after cell lysis was about 10 µg,
  • and the DNA obtained after chromatin immunoprecipitation was about 10 ng.
  • after end filling, A-tailing, ligation of the Y-shaped adapter and PCR amplification, a 20 ng library was obtained for sequencing.
  • the experiment was carried out by the ChT-Seq method.
  • the number of HeLa cells was about 2 × 10^5, and the DNA extracted after cell lysis was about 2 µg. After transposase cutting and adapter attachment, a 100 ng library was obtained for sequencing. The ChT-Seq method thus starts from fewer cells yet yields a larger sequencing library than ChIP-Seq, with less loss along the way, consistent results between repeats, and high reproducibility.
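The figures quoted in this comparison can be put on a per-cell basis with a few lines of arithmetic. The sketch below uses only the numbers stated above (cell count, extracted DNA, final library mass); the function and variable names are illustrative, and the calculation describes these single reported runs only.

```python
# Per-cell yields for the two workflows, using only the reported figures.
# Function and variable names are illustrative.

def per_cell_pg(total_ng: float, cells: float) -> float:
    """Convert a total mass in nanograms to picograms per starting cell."""
    return total_ng * 1000.0 / cells

chip_seq = {"cells": 1e6, "extracted_dna_ng": 10_000.0, "library_ng": 20.0}
cht_seq = {"cells": 2e5, "extracted_dna_ng": 2_000.0, "library_ng": 100.0}

# Extracted DNA per cell is the same in both runs (10 pg per cell),
# so the difference in library mass arises after extraction.
chip_dna = per_cell_pg(chip_seq["extracted_dna_ng"], chip_seq["cells"])
cht_dna = per_cell_pg(cht_seq["extracted_dna_ng"], cht_seq["cells"])

chip_lib = per_cell_pg(chip_seq["library_ng"], chip_seq["cells"])
cht_lib = per_cell_pg(cht_seq["library_ng"], cht_seq["cells"])
fold = cht_lib / chip_lib

print(f"Extracted DNA: {chip_dna:.0f} vs {cht_dna:.0f} pg per cell")
print(f"ChIP-Seq library: {chip_lib:.2f} pg per cell")
print(f"ChT-Seq library:  {cht_lib:.2f} pg per cell")
print(f"ChT-Seq / ChIP-Seq: {fold:.0f}x")
```

On these numbers the ChT-Seq run gives roughly 25 times more library mass per starting cell, from one fifth of the cells.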

Landscapes

  • Chemical & Material Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Organic Chemistry (AREA)
  • Genetics & Genomics (AREA)
  • Zoology (AREA)
  • Molecular Biology (AREA)
  • Engineering & Computer Science (AREA)
  • Wood Science & Technology (AREA)
  • Proteomics, Peptides & Aminoacids (AREA)
  • General Health & Medical Sciences (AREA)
  • Biochemistry (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Medicinal Chemistry (AREA)
  • Biophysics (AREA)
  • Microbiology (AREA)
  • General Engineering & Computer Science (AREA)
  • Biotechnology (AREA)
  • Gastroenterology & Hepatology (AREA)
  • Biomedical Technology (AREA)
  • Analytical Chemistry (AREA)
  • Immunology (AREA)
  • Physics & Mathematics (AREA)
  • Pathology (AREA)
  • Peptides Or Proteins (AREA)

Abstract

The present invention relates to the fields of molecular biology, genomics and biotechnology, and provided thereby are a fusion protein of transposase-antibody binding protein, and a preparation and an application thereof. The fusion protein of the present invention comprises a transposase portion, a linker peptide portion and a portion capable of binding to an antibody Fc segment, and simultaneously has a transposition function and a function of binding the antibody Fc segment. Further provided by the present invention is a new method for studying the interaction of protein-chromatin DNA in vivo: a method of chromatin-transposition-sequencing (ChT-Seq).

Description

Fusion protein of transposase-antibody binding protein and preparation and application thereof
Technical field
The invention belongs to the fields of molecular biology, genomics and biotechnology, and particularly relates to a fusion protein of a transposase-antibody binding protein and to its preparation and application.
Background art
Under the action of a transposase, a transposon sequence can be inserted and integrated at random positions in a genome. Because they insert into DNA at random, transposases are often used to construct mutant libraries and sequencing libraries. In library construction for high-throughput sequencing, a transposase can randomly fragment the sequence to be tested and add adapters to both ends of each fragment, so that after PCR amplification the product can go directly to sequencing. Compared with conventional ultrasonic fragmentation this is a considerable advantage: fewer steps save cost and time.
Antibody-binding proteins such as ProteinA, ProteinG or ProteinL can bind mammalian IgG through the Fc region. The binding strength between an antibody-binding protein and IgG depends largely on the species and subtype of the antibody, and recombinant antibody-binding proteins bind more strongly than the natural ones. Antibody-binding proteins are commonly used in immunoassays, co-immunoprecipitation and similar experiments.
Chromatin immunoprecipitation (ChIP) exploits this antibody-binding function and is mainly used to analyze the relationship between genomic DNA and proteins (transcription factors or histones). The principle of ChIP is to cross-link DNA and protein inside the cell, shear the chromatin into small fragments by sonication, add IgG that binds specifically to the antigen protein, and precipitate the DNA fragments bound to the target protein on antibody-binding-protein beads, thereby enriching the DNA associated with the target protein. Applications of ChIP have grown from studying the relationship between a target protein and a known target sequence to studying its interaction with unknown sequences; from studying one target protein and DNA to studying the joint binding of two proteins to DNA; and from studying histone modifications in promoter regions to studying protein complexes bound to DNA sequences. ChIP is a relatively mature technique, but some technical difficulties remain. For example, a ChIP experiment involves many steps, its results are poorly reproducible, and it needs a large amount of starting material; nerve cells, stem cells and the like are often difficult to culture, so large numbers of cells are hard to obtain, and the phenotype of individual cells is hard to distinguish from that of the whole population; and the DNA recovered by chromatin immunoprecipitation is often abundant and contains many non-specifically bound, false-positive sequences.
ChIP-Seq, which combines ChIP with next-generation (high-throughput) sequencing technology, can detect protein-interacting DNA segments across the whole genome. In ChIP-Seq, the DNA fragments bound to the target protein and specifically collected by ChIP are purified and built into a library, and these fragments are then sequenced at high throughput. This yields the sequences that interact with the target protein, which can be compared against the whole-genome map to locate them precisely on the genome. ChIP-Seq also inherits the technical difficulties of ChIP. In addition, the library construction and sequencing stage first blunts the ends of the DNA fragments, then adds an A at the ends, then ligates a Y-shaped adapter to the A overhang, and finally sequences; this is laborious and involves many steps, and every step loses valuable sample and, ultimately, sequencing information. Because of these losses, a large amount of sample DNA is required, which severely limits studies in which large samples are hard to obtain, for example nerve cells that are difficult to culture, or tumor cells whose heterogeneity must be studied at the level of single cells.
Therefore, developing a new method for studying protein-DNA interactions that is simple, efficient, easy to operate, low in cost and reproducible is very important for scientific research.
Summary of the invention
To overcome the problems in the prior art, the object of the present invention is to provide a fusion protein of a transposase-antibody binding protein, and its preparation and application.
To achieve the above and other related objects, the present invention adopts the following technical solutions:
A first aspect of the present invention provides a fusion protein whose structure comprises a first domain having a transposition function and a second domain having the function of binding the Fc segment of an antibody.
Preferably, the transposition function refers to the function of inserting a gene sequence by transposition.
Preferably, the function of binding the antibody Fc segment refers to binding the Fc segment of an IgG molecule.
Preferably, the first domain is a transposase or a protein analog having a transposition function.
It should be noted that the protein analog is able to carry a DNA sequence and to insert and integrate that DNA sequence into another stretch of DNA.
Further, having a transposition function may mean having transposase activity. Still further, the protein analog having a transposition function may be a protein analog having transposase activity.
Preferably, the first domain is a member of the Tc1/Mariner, hobo, MITEs, hAT, PiggyBac (PB) or TnA transposase family having a transposition function, or another protein analog having a transposition function.
It should be noted that the protein analog is able to carry a DNA sequence and to insert and integrate it into another stretch of DNA.
Further, having a transposition function may mean having transposase activity. Still further, the protein analog having a transposition function may be a protein analog having transposase activity.
Preferably, the first domain is a member of the TnA transposase family. The TnA transposase family member is selected from Tn1, Tn2, Tn3, Tn4, Tn5, Tn6, Tn7, Tn8, Tn9 or Tn10.
Preferably, the first domain is a Tn5 transposase or a Tn10 transposase. The Tn5 transposase is selected from a full-length Tn5 transposase, a partial functional domain of a Tn5 transposase, a Tn5 transposase mutant, a tagged full-length Tn5 transposase, a tagged partial functional domain of a Tn5 transposase, or a tagged Tn5 transposase mutant. The Tn10 transposase is selected from a full-length Tn10 transposase, a partial functional domain of a Tn10 transposase, a Tn10 transposase mutant, a tagged full-length Tn10 transposase, a tagged partial functional domain of a Tn10 transposase, or a tagged Tn10 transposase mutant.
Preferably, the tag is selected from the following: HHHHHH, DYKDDDDK, YPYDVPDYA, GGLLISGGAL.
Preferably, the Tn5 transposase mutant is selected from: R30Q, K40Q, Y41H, T47P, E54K/V, M56A, R62Q, D97A, E110K, D188A, Y319A, R322A/K/Q, E326A, K330A/R, K333A, R342A, E344A, E345K, N348A, L372P, S438A, K439A, S445A, G462D, A466D.
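The substitutions listed above follow the usual shorthand of wild-type residue, position, and replacement residue, with alternatives separated by a slash (so E54K/V stands for E54K or E54V). As a minimal sketch, the hypothetical helpers below parse this notation and apply one substitution to a sequence; the four-residue test sequence is a made-up placeholder, not SEQ ID NO. 1.

```python
import re
from typing import List, Tuple

# Wild-type residue, 1-based position, one or more alternative residues.
MUTATION_RE = re.compile(r"^([A-Z])(\d+)([A-Z](?:/[A-Z])*)$")

def parse_mutation(label: str) -> Tuple[str, int, List[str]]:
    """Split e.g. 'R322A/K/Q' into ('R', 322, ['A', 'K', 'Q'])."""
    match = MUTATION_RE.match(label.strip())
    if match is None:
        raise ValueError(f"unrecognized mutation label: {label!r}")
    wild_type, position, alternatives = match.groups()
    return wild_type, int(position), alternatives.split("/")

def apply_substitution(sequence: str, label: str, choice: int = 0) -> str:
    """Apply one alternative of a substitution after checking the wild-type residue."""
    wild_type, position, alternatives = parse_mutation(label)
    found = sequence[position - 1]
    if found != wild_type:
        raise ValueError(f"position {position} is {found}, not {wild_type}")
    return sequence[: position - 1] + alternatives[choice] + sequence[position:]

print(parse_mutation("E54K/V"))
# Placeholder sequence for illustration only.
print(apply_substitution("MKRQ", "R3Q"))
```

Checking the wild-type residue before substituting guards against applying a label to a sequence with different numbering, for example a tagged or truncated variant.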
Preferably, the amino acid sequence of the full-length Tn5 transposase is as shown in SEQ ID NO. 1.
Preferably, the amino acid sequence of the full-length Tn10 transposase is as shown in SEQ ID NO. 2.
Preferably, the second domain is an S. aureus A protein (ProteinA), a Streptococcus G protein (ProteinG), a Streptococcus L protein (ProteinL) or another protein analog having the function of binding the antibody Fc segment.
It should be noted here that the protein analog can bind the Fc segment of an antibody.
Preferably, the second domain is a full-length S. aureus A protein, a partial functional domain of an S. aureus A protein, an S. aureus A protein mutant, a full-length Streptococcus G protein, a partial functional domain of a Streptococcus G protein, a Streptococcus G protein mutant, a tagged full-length S. aureus A protein, a tagged partial functional domain of an S. aureus A protein, a tagged S. aureus A protein mutant, a tagged full-length Streptococcus G protein, a tagged partial functional domain of a Streptococcus G protein, or a tagged Streptococcus G protein mutant.
Preferably, the amino acid sequence of the full-length S. aureus A protein is as shown in SEQ ID NO. 3.
Preferably, the amino acid sequence of the full-length Streptococcus G protein (ProteinG) is as shown in SEQ ID NO. 4.
Preferably, the first domain and the second domain are joined by a linker.
The present invention places no particular requirement on the order of connection, as long as the object of the invention is not impaired. For example, the C-terminus of the first domain may be joined to the N-terminus of the second domain, or the C-terminus of the second domain may be joined to the N-terminus of the first domain.
That is, the fusion protein has the general formula: first domain-linker-second domain, or second domain-linker-first domain.
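The two general formulas amount to concatenating the parts in N-terminus to C-terminus order. The function name and the short domain strings in this sketch are hypothetical placeholders chosen for illustration; the actual domain sequences are those of the sequence listing and are not reproduced here.

```python
def assemble_fusion(first_domain: str, second_domain: str,
                    linker: str = "GGGGS",
                    first_at_n_terminus: bool = True) -> str:
    """Join two domains through a linker, written N-terminus to C-terminus."""
    if first_at_n_terminus:
        return first_domain + linker + second_domain
    return second_domain + linker + first_domain

# Placeholder fragments, NOT real transposase or antibody-binding sequences.
transposase = "MTRANSP"
fc_binder = "MKVDNKF"

# first domain-linker-second domain
print(assemble_fusion(transposase, fc_binder))
# second domain-linker-first domain, with a longer linker
print(assemble_fusion(transposase, fc_binder, linker="GGGSGGGGS",
                      first_at_n_terminus=False))
```

Passing an empty string as the linker covers the case in which the two domains are joined directly.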
The linker has the general structural formula (GS)a(GGS)b(GGGS)c(GGGGS)d, where the repeat counts a, b, c and d are each an integer greater than or equal to 0.
For example, the amino acid sequence of the linker may be selected from the following:
(1) GGGGS;
(2) GSGGGGS;
(3) GGSGGSGGGS;
(4) GGGSGGGGSGS;
(5) GGSGGSGGSGGS;
(6) GGGGSGGGGS;
(7) no linker.
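As a minimal sketch, the general formula can be read as an ordered concatenation of repeat units, which is easy to generate and to test with a regular expression. The helper names below are illustrative. Under a strict left-to-right reading of the formula, example (4) is not reproduced because its GS unit comes last; it is matched if the four units may appear in any order, which is the looser reading (an assumption) also checked here.

```python
import re

UNITS = ("GS", "GGS", "GGGS", "GGGGS")

def build_linker(a: int = 0, b: int = 0, c: int = 0, d: int = 0) -> str:
    """Concatenate (GS)a(GGS)b(GGGS)c(GGGGS)d in the written order."""
    counts = (a, b, c, d)
    if any(n < 0 for n in counts):
        raise ValueError("repeat counts must be >= 0")
    return "".join(unit * n for unit, n in zip(UNITS, counts))

# Strict reading: units appear in the order written in the formula.
ORDERED = re.compile(r"(?:GS)*(?:GGS)*(?:GGGS)*(?:GGGGS)*")
# Looser reading (an assumption): the same units in any order.
ANY_ORDER = re.compile(r"(?:GS|GGS|GGGS|GGGGS)*")

examples = ["GGGGS", "GSGGGGS", "GGSGGSGGGS", "GGGSGGGGSGS",
            "GGSGGSGGSGGS", "GGGGSGGGGS", ""]

for number, linker in enumerate(examples, start=1):
    ordered = ORDERED.fullmatch(linker) is not None
    any_order = ANY_ORDER.fullmatch(linker) is not None
    print(number, repr(linker), "ordered:", ordered, "any order:", any_order)

print(build_linker(d=1))       # a=b=c=0, d=1
print(build_linker(c=1, d=1))  # a=b=0, c=1, d=1
```

With d = 1 the formula gives GGGGS, and with c = 1 and d = 1 it gives GGGSGGGGS; the empty string, with all counts at 0, corresponds to case (7).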
优选的,所述连接片段可以是Linker1,所述连接片段Linker1的氨基酸序列如SEQ ID NO.10所示,具体为:GGGGS。Preferably, the ligated fragment may be Linker1, and the amino acid sequence of the ligated linker Linker1 is as shown in SEQ ID NO. 10, specifically: GGGGS.
优选的,所述连接片段可以是Linker2,所述连接片段Linker2的氨基酸序列如SEQ ID NO.11所示,具体为:GGGSGGGGS。Preferably, the ligated fragment may be Linker2, and the amino acid sequence of the ligated linker Linker2 is as shown in SEQ ID NO. 11, specifically: GGGSGGGGS.
在本案的较佳案例中,列举了Tn5-ProteinA-1、Tn5-ProteinG-2、Tn10-ProteinA-3、Tn10-ProteinG-4、Tn5-ProteinA-5等融合蛋白。所述Tn5-ProteinA-1的氨基酸序列如SEQ ID NO.5所示。所述Tn5-ProteinG-2的氨基酸序列如SEQ ID NO.6所示。所述Tn10-ProteinA-3的氨基酸序列如SEQ ID NO.7所示。所述Tn10-ProteinG-4的氨基酸序列如SEQ ID NO.8所示。所述Tn5-ProteinA-5的氨基酸序列如SEQ ID NO.9所示。In the preferred case of the present case, fusion proteins such as Tn5-ProteinA-1, Tn5-ProteinG-2, Tn10-ProteinA-3, Tn10-ProteinG-4, and Tn5-ProteinA-5 are listed. The amino acid sequence of the Tn5-Protein A-1 is shown in SEQ ID NO. The amino acid sequence of the Tn5-ProteinG-2 is shown in SEQ ID NO. The amino acid sequence of the Tn10-Protein A-3 is shown in SEQ ID NO. The amino acid sequence of the Tn10-ProteinG-4 is shown in SEQ ID NO. The amino acid sequence of the Tn5-Protein A-5 is shown in SEQ ID NO.
因此,本发明所述融合蛋白的氨基酸序列可以如SEQ ID NO.5、SEQ ID NO.6、SEQ ID NO.7、SEQ ID NO.8或SEQ ID NO.9之任一所示。但不限于本发明较佳案例中所列举的具体形式。Thus, the amino acid sequence of the fusion protein of the invention may be as set forth in any one of SEQ ID NO. 5, SEQ ID NO. 6, SEQ ID NO. 7, SEQ ID NO. 8, or SEQ ID NO. However, it is not limited to the specific forms listed in the preferred cases of the present invention.
本发明的第二方面,提供一种分离的多核苷酸(亦即DNA分子),其编码前述融合蛋白。In a second aspect of the invention, an isolated polynucleotide (i.e., a DNA molecule) encoding the aforementioned fusion protein is provided.
本发明的编码所述融合蛋白的多核苷酸,可以是DNA形式或RNA形式。DNA形式包括cDNA、基因组DNA或人工合成的DNA。DNA可以是单链的或是双链的。The polynucleotide encoding the fusion protein of the present invention may be in the form of DNA or RNA. DNA forms include cDNA, genomic DNA or synthetic DNA. DNA can be single-stranded or double-stranded.
本发明的编码所述融合蛋白的多核苷酸,可以通过本领域技术人员熟知的任何适当的技术制备。所述技术见于本领域的一般描述,如《分子克隆实验指南》(J.萨姆布鲁克等,科学出版社,1995)。包括但不限于重组DNA技术、化学合成等方法;例如采用重叠延伸PCR法。The polynucleotide encoding the fusion protein of the present invention can be prepared by any suitable technique well known to those skilled in the art. Such techniques are described in the general description of the art, such as the Guide to Molecular Cloning (J. Sambrook et al., Science Press, 1995). Methods including, but not limited to, recombinant DNA techniques, chemical synthesis, and the like; for example, overlapping extension PCR.
Preferably, the optimized nucleotide sequence encoding the Tn5 transposase is shown in SEQ ID NO. 12.
Preferably, the optimized nucleotide sequence encoding the Tn10 transposase is shown in SEQ ID NO. 13.
Preferably, the optimized nucleotide sequence encoding the Staphylococcus aureus protein A is shown in SEQ ID NO. 14.
Preferably, the optimized nucleotide sequence encoding the streptococcal protein G is shown in SEQ ID NO. 15.
Preferably, the optimized nucleotide sequence encoding the linker Linker1 is shown in SEQ ID NO. 16, specifically: GGTGGTGGTGGTTCT. The nucleotide sequence encoding the linker Linker2 is shown in SEQ ID NO. 17, specifically: GGTGGTGGTTCTGGTGGTGGTGGTTCT.
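As a quick consistency check, the two linker coding sequences above can be translated with the standard genetic code, in which GGT encodes glycine (G) and TCT encodes serine (S). The following is a minimal sketch; the function name `translate` and the reduced two-codon table are illustrative assumptions, not part of the specification.

```python
# Reduced codon table: only the two codons that occur in the linker sequences.
# GGT -> Gly (G) and TCT -> Ser (S) in the standard genetic code.
CODON_TO_AA = {"GGT": "G", "TCT": "S"}


def translate(dna: str) -> str:
    """Translate a coding sequence codon by codon using CODON_TO_AA."""
    if len(dna) % 3 != 0:
        raise ValueError("coding sequence length must be a multiple of 3")
    # A codon outside the reduced table raises KeyError, which is intended here.
    return "".join(CODON_TO_AA[dna[i:i + 3]] for i in range(0, len(dna), 3))


LINKER1_DNA = "GGTGGTGGTGGTTCT"              # SEQ ID NO. 16
LINKER2_DNA = "GGTGGTGGTTCTGGTGGTGGTGGTTCT"  # SEQ ID NO. 17

linker1_peptide = translate(LINKER1_DNA)
linker2_peptide = translate(LINKER2_DNA)
print(linker1_peptide, linker2_peptide)
```

The translated peptides agree with the Linker1 and Linker2 amino acid sequences given in the specification (SEQ ID NO. 10 and SEQ ID NO. 11).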
Further, preferably, the optimized nucleotide sequence encoding the fusion protein Tn5-ProteinA-1 is shown in SEQ ID NO. 18.
Alternatively, the nucleotide sequence encoding the fusion protein Tn5-ProteinG-2 is shown in SEQ ID NO. 19.
Alternatively, the nucleotide sequence encoding the fusion protein Tn10-ProteinA-3 is shown in SEQ ID NO. 20.
Alternatively, the nucleotide sequence encoding the fusion protein Tn10-ProteinG-4 is shown in SEQ ID NO. 21.
Alternatively, the nucleotide sequence encoding the fusion protein Tn5-ProteinA-5 is shown in SEQ ID NO. 22.
In a third aspect, the invention provides a cloning vector and an expression vector containing the aforementioned polynucleotide.
The expression vector of the present invention contains a polynucleotide encoding the fusion protein. Methods well known to those skilled in the art can be used to construct the expression vector, including recombinant DNA techniques and DNA synthesis techniques. The DNA encoding the fusion protein can be operably linked to a multiple cloning site in the vector to direct mRNA synthesis and thereby express the protein, or it can be used for homologous recombination. In a preferred embodiment of the invention, the cloning vector may be Easy Cloning T-vector (Novoprotein, T003-01B) and the expression vector may be pET21a.
In a fourth aspect, the invention provides a host cell transformed with the aforementioned expression vector or cloning vector.
In a preferred embodiment of the invention, the host cell may be BL21(DE3) or Rosetta pLysS.
In a fifth aspect, the invention provides a method for preparing the aforementioned fusion protein, comprising the following steps:
synthesizing or cloning the target DNA sequence; constructing a cloning vector containing the target DNA sequence; constructing an expression vector containing the target DNA sequence; transforming the expression vector into a prokaryotic host cell; screening for high-yield strains that express at a high level in the growth medium; culturing the selected high-expression strain to express the fusion protein; and purifying the fusion protein from the expression product.
In a preferred embodiment of the invention, the expression vector may be pET21a and the host cell may be BL21(DE3) or Rosetta pLysS.
In a sixth aspect, the invention provides a reagent combination comprising the aforementioned fusion protein and the other components used with it.
Further, the other components include the corresponding buffer and other components used in combination. These components may be solid, liquid, or adsorbed onto a dedicated material.
In a sixth aspect, the invention provides use of the aforementioned fusion protein in preparing an antibody-binding transposome.
An antibody-binding transposome is obtained by attaching an adaptor to the first domain of the fusion protein. Further, the antibody-binding transposome is a dimer. The antibody-binding transposome cleaves double-stranded DNA at random positions and inserts the adaptor at the cleavage site. It can also bind the Fc portion of an antibody and form a complex with the antibody.
In a sixth aspect, the invention provides an antibody-binding transposome comprising the fusion protein and an adaptor attached to the first domain of the fusion protein.
Further, the antibody-binding transposome is a dimer. The antibody-binding transposome cleaves double-stranded DNA and inserts the adaptor at the cleavage site. It can also bind the Fc portion of an antibody and form a complex with the antibody.
In a seventh aspect, the invention provides use of the antibody-binding transposome in constructing a sequencing library.
In an eighth aspect, the invention provides use of the antibody-binding transposome in studying protein-chromatin interactions.
In a ninth aspect, the invention provides a method for studying protein-chromatin interactions, comprising the following steps:
(1) taking a sufficient number of cells, disrupting the cells, isolating the chromatin, and pretreating the chromatin;
(2) adding an antibody against the target transcription factor (or another protein bound to chromatin), and then adding the adaptor-loaded antibody-binding transposome;
(3) the chromosome is fragmented and the adaptors are introduced;
(4) amplifying with the adaptors as primers to obtain a DNA library;
(5) performing high-throughput sequencing on the resulting library.
The present invention is based on the original finding that, in step (3), the antibody-binding transposome attaches to the antibody through its second domain, while the antibody binds specifically to a transcription factor (TF) or histone on the chromatin, so that the antibody-binding transposome, the antibody, the TF, and the DNA are linked together in a complex. Constrained by this complex, cleavage of DNA by the antibody-binding transposome is confined to the vicinity of the DNA position occupied by the TF; DNA farther away is not cleaved. Adaptors are introduced at the cleaved positions, and PCR amplification using these adaptors as part of the primers yields DNA fragments that carry TF binding-position information. Because these fragments acquire sequencing adaptors during PCR, library construction is complete after magnetic-bead selection, and the library can proceed directly to high-throughput sequencing.
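The spatial restriction described above can be illustrated with a toy model: of all positions where a free transposome could cut, only those within some reach of a TF-bound position are cut by the tethered transposome. This is a sketch under stated assumptions; the function name, the coordinates, and the 150 bp reach are hypothetical values chosen for illustration and are not taken from the specification.

```python
def tethered_cut_sites(candidate_sites, tf_sites, reach_bp):
    """Return the candidate cut positions lying within reach_bp of any TF-bound position."""
    return sorted(
        site
        for site in candidate_sites
        if any(abs(site - tf) <= reach_bp for tf in tf_sites)
    )


# Hypothetical coordinates (bp) on a single DNA molecule.
candidate_sites = [120, 480, 950, 1020, 1100, 2500, 4010]  # where an untethered transposome could cut
tf_sites = [1000, 4000]                                    # positions occupied by the antibody-bound TF
cuts = tethered_cut_sites(candidate_sites, tf_sites, reach_bp=150)
print(cuts)
```

In this model the retained cut sites cluster around the TF positions, which is why the amplified fragments carry TF binding-position information.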
The above method for studying protein-chromatin interactions omits the co-immunoprecipitation and elution steps, which reduces the number of operations and greatly reduces sample loss, and it constructs the sequencing library directly, which simplifies subsequent sequencing. As a result, less starting sample is required, less information is lost, and reproducibility and reliability are greatly improved.
Compared with the prior art, the present invention has the following beneficial effects:
The present invention provides a novel fusion protein that both inserts gene sequences by transposition and binds the Fc segment of IgG molecules. The fusion protein can be used to prepare an antibody-binding transposome for experiments such as mutant library construction, high-throughput sequencing library construction, immunoassays, and IgG purification. Compared with ChIP-Seq, studying protein-chromatin interactions with the antibody-binding transposome is simpler, builds libraries more efficiently and economically, is more reproducible, requires less sample and far less sample DNA, and loses far less sample and data along the way, which is of great significance for the study of protein-DNA interactions.
DESCRIPTION OF THE DRAWINGS
Figure 1: Electrophoresis of Tn5-ProteinA-1 fusion protein expression after IPTG induction. Lane A: non-induced crude; Lane B: induced crude; Lanes B1-B4: induced crude; Lane C: supernatant of lysate; Lane D: precipitate of lysate; MK: molecular weight marker.
Figure 2: Electrophoresis of Tn5-ProteinG-2 fusion protein expression after IPTG induction. Lane A: non-induced crude; Lane B: induced crude; Lanes B1-B2: induced crude; Lane C: supernatant of lysate; Lane D: precipitate of lysate; MK: molecular weight marker.
Figure 3: Electrophoresis of Tn10-ProteinA-3 fusion protein expression after IPTG induction. Lane A: non-induced crude; Lane B: induced crude; Lanes B1-B4: induced crude; Lane C: supernatant of lysate; Lane D: precipitate of lysate; MK: molecular weight marker.
Figure 4: Electrophoresis of Tn10-ProteinG-4 fusion protein expression after IPTG induction. Lane A: non-induced crude; Lane B: induced crude; Lanes B1-B4: induced crude; Lane C: supernatant of lysate; Lane D: precipitate of lysate; MK: molecular weight marker.
Figure 5: Electrophoresis of Tn5-ProteinA-5 fusion protein expression after IPTG induction. Lane A: non-induced crude; Lane B: induced crude; Lanes B1-B4: induced crude; Lane C: supernatant of lysate; Lane D: precipitate of lysate; MK: molecular weight marker.
Figure 6: Electrophoresis of scaled-up expression of Tn5-ProteinA-1. Lane A: non-induced crude; Lane B: induced crude; Lane C: supernatant of lysate; Lane D: precipitate of lysate; MK: molecular weight marker.
Figure 7: Electrophoresis of scaled-up expression of Tn5-ProteinG-2. Lane A: non-induced crude; Lane B: induced crude; Lane C: supernatant of lysate; Lane D: precipitate of lysate; M: molecular weight marker.
Figure 8: Electrophoresis of scaled-up expression of Tn10-ProteinA-3. Lane A: non-induced crude; Lane B: induced crude; Lane C: supernatant of lysate; Lane D: precipitate of lysate; M: molecular weight marker.
Figure 9: Electrophoresis of scaled-up expression of Tn10-ProteinG-4. Lane A: non-induced crude; Lane B: induced crude; Lane C: supernatant of lysate; Lane D: precipitate of lysate; M: molecular weight marker.
Figure 10: Electrophoresis of scaled-up expression of Tn5-ProteinA-5. Lane A: non-induced crude; Lane B: induced crude; Lane C: supernatant of lysate; Lane D: precipitate of lysate; MK: molecular weight marker.
Figure 11: The transposome formed by the transposase-antibody-binding-protein fusion protein and the adaptor fragments double-stranded DNA and attaches adaptors to both ends of the fragmented DNA.
Figure 12: Digestion of 50 ng of human genomic DNA with transposomes of fusion proteins 1-5 and of Tn5. MK: DNA marker; 1: Tn5-ProteinA-1; 2: Tn5-ProteinG-2; 3: Tn10-ProteinA-3; 4: Tn10-ProteinG-4; 5: Tn5-ProteinA-5; 6: Tn5; 7: untreated genomic DNA.
Figure 13: DNA fragments selected with magnetic beads after PCR amplification. Lanes 1-7: DNA of different fragment sizes obtained by varying the amounts of magnetic beads used in the two bead additions.
Figure 14: Fragment sizes of the products measured on an Agilent 2100 High Sensitivity DNA chip. 1: DNA fragments without size selection; 2-6: DNA fragments of different lengths after size selection.
Figure 15: The antibody-binding protein portion of the fusion protein can bind the Fc portion of IgG.
Figure 16: Protein A standard curve.
Figure 17: Schematic of the ChT-Seq method for studying interactions between proteins and genomic DNA.
Figure 18: Constrained by the Transposase-ProteinA/G-IgG-TF complex, the transposase can cleave only nearby DNA sequences, into which it introduces sequencing adaptors.
Figure 19: Electrophoresis of genomic DNA. Lanes 1 and 2: genomic DNA extracted from 1×10^6 HeLa cells, used for the ChIP-Seq experiment; lanes 3 and 4: genomic DNA extracted from 2×10^5 HeLa cells, used for the ChT-Seq experiment. As the figure shows, more genomic DNA was used for ChIP-Seq than for ChT-Seq.
Figure 20: Electrophoresis of libraries. M: DNA marker; lanes 1 and 2: libraries obtained from about 10 μg of starting genomic DNA by the ChIP-Seq method (sonication, immunoprecipitation, end repair, A-tailing, and adaptor ligation, followed by PCR amplification); lanes 3 and 4: libraries obtained from about 2 μg of starting genomic DNA by the ChT-Seq method, in which the transposome cleaves the genome and attaches the adaptors in one step, followed by PCR amplification. As the figure shows, ChT-Seq yielded more library from less starting template, with far less loss along the way.
Figure 21: Library quantification by Qubit and NanoDrop; the vertical axis is DNA concentration (ng/μL). ChIP-Seq: sequencing library constructed by the ChIP-Seq method from 10 μg of starting DNA; ChT-Seq: sequencing library constructed by the ChT-Seq method from 2 μg of starting DNA; blue: Qubit results; red: NanoDrop results. As the figure shows, ChT-Seq yielded more sequencing library than ChIP-Seq from less starting DNA.
DETAILED DESCRIPTION
Before the specific embodiments of the present invention are further described, it should be understood that the scope of protection of the invention is not limited to the specific embodiments described below. It should also be understood that the terminology used in the examples is intended to describe particular embodiments and not to limit the scope of protection of the invention. Test methods for which no specific conditions are given in the following examples are generally carried out under conventional conditions or under the conditions recommended by the respective manufacturers.
Where an example gives a numerical range, it should be understood that, unless the invention states otherwise, both endpoints of the range and any value between the two endpoints may be selected. Unless otherwise defined, all technical and scientific terms used herein have the meanings commonly understood by those skilled in the art. In addition to the specific methods, equipment, and materials used in the examples, those skilled in the art may, based on their knowledge of the prior art and on the description of the present invention, practice the invention with any prior-art methods, equipment, and materials that are similar or equivalent to those described in the examples.
Unless otherwise stated, the experimental, detection, and preparation methods disclosed herein employ conventional techniques of molecular biology, biochemistry, chromatin structure and analysis, analytical chemistry, cell culture, recombinant DNA technology, and related fields. These techniques are fully explained in the literature; see, for example, Sambrook et al., MOLECULAR CLONING: A LABORATORY MANUAL, Second edition, Cold Spring Harbor Laboratory Press, 1989, and Third edition, 2001; Ausubel et al., CURRENT PROTOCOLS IN MOLECULAR BIOLOGY, John Wiley & Sons, New York, 1987 and periodic updates; the series METHODS IN ENZYMOLOGY, Academic Press, San Diego; Wolffe, CHROMATIN STRUCTURE AND FUNCTION, Third edition, Academic Press, San Diego, 1998; METHODS IN ENZYMOLOGY, Vol. 304, Chromatin (P. M. Wassarman and A. P. Wolffe, eds.), Academic Press, San Diego, 1999; and METHODS IN MOLECULAR BIOLOGY, Vol. 119, Chromatin Protocols (P. B. Becker, ed.), Humana Press, Totowa, 1999.
Example 1 Preparation of transposase-antibody-binding-protein fusion proteins
I. Structure of the fusion protein
The fusion protein of this example comprises a first domain having a transposition function and a second domain that binds the Fc segment of an antibody. The transposition function refers to the insertion of gene sequences by transposition. The Fc-binding function refers to binding of the Fc segment of IgG molecules. The first domain and the second domain are joined by a linker. Further, the first domain may be a transposase protein and the second domain may be an antibody-binding protein.
The fusion protein is constructed as follows: the transposase protein and the antibody-binding protein are joined by a linker of the general formula (GS)a(GGS)b(GGGS)c(GGGGS)d, where a, b, c, and d are each integers greater than or equal to 0. To allow the fusion protein to function better, the sequences of the transposase protein, the antibody-binding protein, and the linker (GS)a(GGS)b(GGGS)c(GGGGS)d have all been optimized in the present invention.
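The linker formula can be expanded mechanically into a peptide sequence. The sketch below assumes the four repeat units are concatenated in the order written in the formula; the function name `build_linker` is hypothetical. With d = 1 it gives the Linker1 sequence of this example, and with c = 1 and d = 1 it gives the Linker2 sequence.

```python
def build_linker(a: int = 0, b: int = 0, c: int = 0, d: int = 0) -> str:
    """Expand (GS)a(GGS)b(GGGS)c(GGGGS)d into a one-letter amino acid string."""
    for name, count in (("a", a), ("b", b), ("c", c), ("d", d)):
        if not isinstance(count, int) or count < 0:
            raise ValueError(f"{name} must be an integer >= 0")
    return "GS" * a + "GGS" * b + "GGGS" * c + "GGGGS" * d


linker1 = build_linker(d=1)        # Linker1 of this example (SEQ ID NO. 10)
linker2 = build_linker(c=1, d=1)   # Linker2 of this example (SEQ ID NO. 11)
print(linker1, linker2)
```

Setting all four counts to 0 yields an empty string, which the formula permits since each count may be 0.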
In this example, five transposase-antibody-binding-protein fusion proteins of different construction were prepared by way of illustration, designated Tn5-ProteinA-1, Tn5-ProteinG-2, Tn10-ProteinA-3, Tn10-ProteinG-4, and Tn5-ProteinA-5.
The fusion protein designated Tn5-ProteinA-1 has Tn5 transposase as its first domain and Staphylococcus aureus protein A as its second domain; the first domain and the second domain are joined by the linker Linker1.
The amino acid sequence of the Tn5 transposase is shown in SEQ ID NO. 1, specifically:
Figure PCTCN2018084711-appb-000001
The nucleotide sequence encoding the Tn5 transposase is shown in SEQ ID NO. 12, specifically:
Figure PCTCN2018084711-appb-000002
Figure PCTCN2018084711-appb-000003
The amino acid sequence of the Staphylococcus aureus protein A is shown in SEQ ID NO. 3, specifically:
Figure PCTCN2018084711-appb-000004
The nucleotide sequence encoding the Staphylococcus aureus protein A is shown in SEQ ID NO. 14, specifically:
Figure PCTCN2018084711-appb-000005
The amino acid sequence of the linker Linker1 is shown in SEQ ID NO. 10, specifically: GGGGS.
The nucleotide sequence encoding the linker Linker1 is shown in SEQ ID NO. 16, specifically: GGTGGTGGTGGTTCT.
The amino acid sequence of the fusion protein Tn5-ProteinA-1 is shown in SEQ ID NO. 5, specifically:
Figure PCTCN2018084711-appb-000006
Figure PCTCN2018084711-appb-000007
The nucleotide sequence encoding the fusion protein Tn5-ProteinA-1 is shown in SEQ ID NO. 18, specifically:
Figure PCTCN2018084711-appb-000008
Figure PCTCN2018084711-appb-000009
The fusion protein designated Tn5-ProteinG-2 has Tn5 transposase as its first domain and streptococcal protein G as its second domain; the first domain and the second domain are joined by the linker Linker1.
The amino acid sequence of the Tn5 transposase is shown in SEQ ID NO. 1.
The nucleotide sequence encoding the Tn5 transposase is shown in SEQ ID NO. 12.
The amino acid sequence of the streptococcal protein G is shown in SEQ ID NO. 4, specifically:
Figure PCTCN2018084711-appb-000010
The nucleotide sequence encoding the streptococcal protein G is shown in SEQ ID NO. 15, specifically:
Figure PCTCN2018084711-appb-000011
The amino acid sequence of the linker Linker1 is shown in SEQ ID NO. 10.
The nucleotide sequence encoding the linker Linker1 is shown in SEQ ID NO. 16.
The amino acid sequence of the Tn5-ProteinG-2 fusion protein is shown in SEQ ID NO. 6, specifically:
Figure PCTCN2018084711-appb-000012
The nucleotide sequence encoding the Tn5-ProteinG-2 fusion protein is shown in SEQ ID NO. 19, specifically:
Figure PCTCN2018084711-appb-000013
Figure PCTCN2018084711-appb-000014
The fusion protein designated Tn10-ProteinA-3 has Tn10 transposase as its first domain and Staphylococcus aureus protein A as its second domain; the first domain and the second domain are joined by the linker Linker1.
The amino acid sequence of the Tn10 transposase is shown in SEQ ID NO. 2, specifically:
Figure PCTCN2018084711-appb-000015
Figure PCTCN2018084711-appb-000016
The nucleotide sequence encoding the Tn10 transposase is shown in SEQ ID NO. 13, specifically:
Figure PCTCN2018084711-appb-000017
The amino acid sequence of the Staphylococcus aureus protein A is shown in SEQ ID NO. 3.
The nucleotide sequence encoding the Staphylococcus aureus protein A is shown in SEQ ID NO. 14.
The amino acid sequence of the linker Linker1 is shown in SEQ ID NO. 10.
The nucleotide sequence encoding the linker Linker1 is shown in SEQ ID NO. 16.
The amino acid sequence of the Tn10-ProteinA-3 fusion protein is shown in SEQ ID NO. 7, specifically:
Figure PCTCN2018084711-appb-000018
Figure PCTCN2018084711-appb-000019
The nucleotide sequence encoding the Tn10-ProteinA-3 fusion protein is shown in SEQ ID NO. 20, specifically:
Figure PCTCN2018084711-appb-000020
The fusion protein designated Tn10-ProteinG-4 has Tn10 transposase as its first domain and streptococcal protein G as its second domain; the first domain and the second domain are joined by the linker Linker1.
The amino acid sequence of the Tn10 transposase is shown in SEQ ID NO. 2.
The nucleotide sequence encoding the Tn10 transposase is shown in SEQ ID NO. 13.
The amino acid sequence of the streptococcal protein G is shown in SEQ ID NO. 4.
The nucleotide sequence encoding the streptococcal protein G is shown in SEQ ID NO. 15.
The amino acid sequence of the linker Linker1 is shown in SEQ ID NO. 10.
The nucleotide sequence encoding the linker Linker1 is shown in SEQ ID NO. 16.
The amino acid sequence of the Tn10-ProteinG-4 fusion protein is shown in SEQ ID NO. 8, specifically:
Figure PCTCN2018084711-appb-000021
The nucleotide sequence encoding the Tn10-ProteinG-4 fusion protein is shown in SEQ ID NO. 21, specifically:
Figure PCTCN2018084711-appb-000022
Figure PCTCN2018084711-appb-000023
The fusion protein designated Tn5-ProteinA-5 has Tn5 transposase as its first domain and Staphylococcus aureus protein A as its second domain; the first domain and the second domain are joined by the linker Linker2.
The amino acid sequence of the Tn5 transposase is shown in SEQ ID NO. 1.
The nucleotide sequence encoding the Tn5 transposase is shown in SEQ ID NO. 12.
The amino acid sequence of the Staphylococcus aureus protein A is shown in SEQ ID NO. 3.
The nucleotide sequence encoding the Staphylococcus aureus protein A is shown in SEQ ID NO. 14.
The amino acid sequence of the linker Linker2 is shown in SEQ ID NO. 11, specifically: GGGSGGGGS.
The nucleotide sequence encoding the linker Linker2 is shown in SEQ ID NO. 17, specifically:
Figure PCTCN2018084711-appb-000024
The amino acid sequence of the fusion protein Tn5-ProteinA-5 is shown in SEQ ID NO. 9, specifically:
Figure PCTCN2018084711-appb-000025
The nucleotide sequence encoding the fusion protein Tn5-ProteinA-5 is shown in SEQ ID NO. 22, specifically:
Figure PCTCN2018084711-appb-000026
Figure PCTCN2018084711-appb-000027
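The five constructs described in this section differ only in the choice of transposase, antibody-binding protein, and linker. The following sketch collects them in one lookup table together with the SEQ ID numbers stated in the text; the class and field names are hypothetical conveniences, and the sequences themselves are not reproduced.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FusionConstruct:
    name: str
    first_domain: str     # transposase
    linker: str
    second_domain: str    # antibody-binding protein
    protein_seq_id: int   # SEQ ID NO. of the fusion protein amino acid sequence
    dna_seq_id: int       # SEQ ID NO. of the encoding nucleotide sequence


# Amino acid SEQ ID numbers of the individual components, as stated in the text.
COMPONENT_SEQ_ID = {
    "Tn5": 1, "Tn10": 2, "Protein A": 3, "Protein G": 4,
    "Linker1": 10, "Linker2": 11,
}

CONSTRUCTS = [
    FusionConstruct("Tn5-ProteinA-1", "Tn5", "Linker1", "Protein A", 5, 18),
    FusionConstruct("Tn5-ProteinG-2", "Tn5", "Linker1", "Protein G", 6, 19),
    FusionConstruct("Tn10-ProteinA-3", "Tn10", "Linker1", "Protein A", 7, 20),
    FusionConstruct("Tn10-ProteinG-4", "Tn10", "Linker1", "Protein G", 8, 21),
    FusionConstruct("Tn5-ProteinA-5", "Tn5", "Linker2", "Protein A", 9, 22),
]


def describe(c: FusionConstruct) -> str:
    """One-line summary listing each part with its SEQ ID NO."""
    parts = (c.first_domain, c.linker, c.second_domain)
    listed = ", ".join(f"{p} (SEQ ID NO. {COMPONENT_SEQ_ID[p]})" for p in parts)
    return f"{c.name}: {listed}; fusion protein SEQ ID NO. {c.protein_seq_id}, DNA SEQ ID NO. {c.dna_seq_id}"


for construct in CONSTRUCTS:
    print(describe(construct))
```

Tn5-ProteinA-1 and Tn5-ProteinA-5 share both domains and differ only in the linker, which makes that pair a direct comparison of Linker1 against Linker2.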
II. Prokaryotic expression of the transposase-antibody-binding-protein fusion proteins
1. The optimized nucleotide sequences encoding the transposase-antibody-binding-protein fusion proteins were each inserted into the expression vector pET21a. The reaction volume was 20 μL, and the following components were added to a 0.2 mL EP tube:
Table 1
Figure PCTCN2018084711-appb-000028
Figure PCTCN2018084711-appb-000029
The reaction was then incubated at 37 °C for 20 minutes to obtain the recombinant expression vector. The kit used was NR001 from Novoprotein.
2. Transformation of the recombinant expression vector obtained in step 1 into E. coli Rosetta pLysS
1) Remove the competent cells from the -80 °C freezer and place them immediately in an ice bath. If aliquots are needed, dispense the freshly thawed cell suspension into sterile, pre-chilled centrifuge tubes and keep them in the ice bath.
2) Add the target DNA (the expression vector from step 1) to the competent cell suspension, gently swirl the tube to mix the contents, and leave it in the ice bath for 30 min.
3) Place the tube in a 42 °C water bath for 60-90 s, then quickly transfer it to the ice bath and cool the cells for 2-3 min. Do not shake the tube during this step.
4) Add 500 μL of antibiotic-free LB medium to each tube, mix, and incubate on a shaker at 37 °C for 45 min (150 rpm) so that the resistance marker gene on the plasmid is expressed and the cells recover.
5) Centrifuge the tube contents at 3000 rpm for 5 min, discard supernatant until 200 μL of medium remains, resuspend the cells evenly by pipetting, and spread them onto LB agar containing the appropriate antibiotic. Leave the plate at room temperature until the liquid is absorbed, then invert the plate and incubate at 37 °C for 12-16 h.
3. Expression screening of strains
1) Pick 3-5 single colonies from the plate into test tubes (2 mL LB medium each) and culture on a shaker at 37 °C for 3 h.
2) When the OD reaches about 0.5, take 800 μL of culture from each tube and add 200 μL of 80% glycerol to preserve the strain; for each project, also take 200 μL as the pre-induction control.
3) Add IPTG to the remaining culture in each tube to a final concentration of 1 mM.
4) Three hours after induction, take 100 μL from each tube. Pellet these samples and the pre-induction sample by centrifugation, resuspend each pellet in 40 μL H2O by pipetting, and add 5× reducing electrophoresis loading buffer.
5) Analyze the samples by SDS-PAGE to check whether the target protein is expressed.
6) Select the expressing strains and scale up culture and expression.
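The volumes in steps 2) and 3) above imply a simple dilution calculation, sketched below. The 1 M IPTG stock is an assumption made for illustration (the text gives only the 1 mM final concentration), and the 1000 μL of remaining culture follows from the 2 mL starting volume minus the 800 μL and 200 μL that were removed.

```python
def final_concentration(stock_conc: float, stock_vol: float, other_vol: float) -> float:
    """Concentration after mixing stock_vol of stock with other_vol of diluent."""
    return stock_conc * stock_vol / (stock_vol + other_vol)


def stock_volume_for_final(stock_conc: float, final_conc: float, total_vol: float) -> float:
    """Stock volume needed for final_conc in total_vol (C1 * V1 = C2 * V2).

    The small volume of added stock is neglected in total_vol.
    """
    return final_conc * total_vol / stock_conc


# Glycerol stock: 200 uL of 80% glycerol plus 800 uL of culture.
glycerol_percent = final_concentration(80.0, 200.0, 800.0)

# IPTG: 1 mM final in the remaining 1000 uL, from an ASSUMED 1 M (1000 mM) stock.
iptg_ul = stock_volume_for_final(1000.0, 1.0, 1000.0)

print(glycerol_percent)  # 16.0 (% glycerol)
print(iptg_ul)           # 1.0 (uL of 1 M IPTG)
```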
The SDS-PAGE expression results are shown in Figures 1-5, respectively, and the scale-up expression results in Figures 6-10, respectively.
These show that Tn5-ProteinA-1, Tn5-ProteinG-2, Tn10-ProteinA-3, Tn10-ProteinG-4 and Tn5-ProteinA-5 were all successfully expressed.
In addition, sequencing confirmed that the full-length genes of Tn5-ProteinA-1, Tn5-ProteinG-2, Tn10-ProteinA-3, Tn10-ProteinG-4 and Tn5-ProteinA-5 all had the correct sequence, as expected.
N/C-terminal sequence analysis showed that the expressed Tn5-ProteinA-1, Tn5-ProteinG-2, Tn10-ProteinA-3, Tn10-ProteinG-4 and Tn5-ProteinA-5 were all in the correct reading frame and matched the theoretical N- and C-terminal amino acid sequences.
Example 2 Functional testing of the transposase-antibody binding protein fusion proteins
I. The function of the transposase-antibody binding protein fusion proteins obtained in Example 1 was tested. We first verified that the fusion proteins can randomly fragment genomic DNA and insert tag sequences, so that a sequencing library can be constructed after PCR.
The principle of random insertion by the fusion protein is shown schematically in Figure 11: the transposome formed by the fusion protein and the adapters fragments double-stranded DNA and attaches adapters to both ends of each fragment.
We treated human genomic DNA with fusion proteins 1-5 (Tn5-ProteinA-1, Tn5-ProteinG-2, Tn10-ProteinA-3, Tn10-ProteinG-4 and Tn5-ProteinA-5) and with Tn5 transposase (positive control). The expected products are short DNA fragments of varying length with adapter sequences attached to both ends. With primers based on the adapter sequences, the fragmented DNA can be amplified by PCR, which yields a sequencing library and demonstrates that the fusion proteins have the same ability as Tn5 transposase to insert randomly into DNA.
The specific procedure is as follows:
1. Fragmentation of double-stranded DNA:
(1) Transposome preparation:
Two insert DNA adapters were designed, each carrying a transposon end sequence at one end:
Figure PCTCN2018084711-appb-000030
Fusion proteins 1-5 (Tn5-ProteinA-1, Tn5-ProteinG-2, Tn10-ProteinA-3, Tn10-ProteinG-4 and Tn5-ProteinA-5) and Tn5 were each dissolved in storage buffer (50 mM HEPES-KOH pH 7.2, 0.1 M NaCl, 0.1 mM EDTA, 1 mM DTT, 0.1% Triton X-100, 10% glycerol) and quantified by the BCA method, and their molar concentrations were calculated.
The reaction was set up as follows:
Table 2 Transposome preparation system
Figure PCTCN2018084711-appb-000031
Figure PCTCN2018084711-appb-000032
X and Y above indicate volumes that can be adjusted as required.
Reaction conditions: 30 °C for 1 hour, then storage at -20 °C. Transposomes of fusion proteins 1-5 (Tn5-ProteinA-1, Tn5-ProteinG-2, Tn10-ProteinA-3, Tn10-ProteinG-4 and Tn5-ProteinA-5) and the Tn5 transposome were prepared separately.
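The conversion from a BCA mass concentration to the molar concentration used when assembling the transposome can be sketched as follows. The molecular weight and the BCA reading below are hypothetical placeholders; the real values depend on each fusion protein's sequence and on the measurement.

```python
def molar_concentration_um(mass_conc_mg_per_ml: float, mol_weight_da: float) -> float:
    """Convert a mass concentration (mg/mL, i.e. g/L) to micromolar.

    g/L divided by g/mol gives mol/L; multiplying by 1e6 gives umol/L.
    """
    if mol_weight_da <= 0:
        raise ValueError("molecular weight must be positive")
    return mass_conc_mg_per_ml * 1e6 / mol_weight_da


# HYPOTHETICAL values for illustration only:
bca_mg_per_ml = 1.0          # BCA assay result
fusion_mw_da = 100_000.0     # placeholder molecular weight of a fusion protein

print(molar_concentration_um(bca_mg_per_ml, fusion_mw_da))  # 10.0 (uM)
```

Working in molar rather than mass units matters here because the fusion proteins and free Tn5 differ in molecular weight, so equal mass concentrations would not give equal numbers of transposase molecules per adapter.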
(2) Digestion of human genomic DNA with the transposomes
Thaw the 5× reaction buffer (50 mM TAPS-NaOH pH 8.5, 25 mM MgCl2) at room temperature, mix by inversion and set aside. Set up the following 20 μL reaction in a sterile PCR tube, together with a negative control without transposome and a positive control with the Tn5 transposome:
Table 3 DNA fragmentation reaction system
Figure PCTCN2018084711-appb-000033
Pipette gently to mix the components thoroughly, place the PCR tube in a thermal cycler, and incubate at 55 °C for 10 min. Remove the tube from the thermal cycler and run an aliquot of the fragmented product on a gel alongside the controls to assess fragmentation. As the electrophoresis image in Figure 12 shows, fusion proteins 1-5 (Tn5-ProteinA-1, Tn5-ProteinG-2, Tn10-ProteinA-3, Tn10-ProteinG-4 and Tn5-ProteinA-5) successfully fragmented human genomic DNA.
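As a quick check of the buffer arithmetic, the sketch below computes how much 5× buffer a 20 μL reaction needs and the resulting working concentrations. It assumes that the concentrations quoted in parentheses describe the 5× stock and that the buffer is used at 1× in the reaction; the actual volumes are those listed in Table 3.

```python
def dilute_stock(fold: float, total_ul: float, stock_mm: dict) -> tuple:
    """Return the stock volume for a reaction and the 1x component concentrations.

    fold      -- concentration factor of the stock (5 for a 5x buffer)
    total_ul  -- final reaction volume in microlitres
    stock_mm  -- component concentrations of the stock in mM
    """
    stock_ul = total_ul / fold
    final_mm = {name: conc / fold for name, conc in stock_mm.items()}
    return stock_ul, final_mm


buffer_5x = {"TAPS-NaOH": 50.0, "MgCl2": 25.0}  # mM, assumed to be the 5x values
stock_ul, final_mm = dilute_stock(5, 20.0, buffer_5x)

print(stock_ul)   # 4.0 (uL of 5x buffer per 20 uL reaction)
print(final_mm)   # {'TAPS-NaOH': 10.0, 'MgCl2': 5.0}
```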
2. Magnetic bead purification of the fragmented product
(1) Equilibrate the magnetic beads to room temperature before using them to purify the fragmented product;
(2) Mix the beads on a vortex mixer, add 20 μL of beads to the 20 μL of fragmented product, vortex again to mix, and incubate at room temperature for 5 min;
(3) Centrifuge the reaction tube briefly and place it on a magnetic stand until the beads are completely separated from the liquid (about 5 min, until the solution is clear). Carefully remove the supernatant, keeping the tube on the magnetic stand;
(4) Add 200 μL of freshly prepared 80% ethanol to the tube to rinse the beads, incubate at room temperature for 30 s, and carefully remove the supernatant with a pipette tip, keeping the tube on the magnetic stand;
(5) Repeat step (4) once;
(6) Keeping the tube on the magnetic stand, open the lid and air-dry at room temperature for 5 min;
(7) Remove the tube from the magnetic stand, add 16 μL of sterile ultrapure water to elute, mix by pipetting or vortexing, and incubate at room temperature for 5 min;
(8) Centrifuge the tube briefly and place it on the magnetic stand until the beads are completely separated from the liquid (about 5 min, until the solution is clear). Carefully transfer 14 μL of supernatant to a new sterile PCR tube for the PCR enrichment step;
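The bead clean-up above uses equal volumes of beads and sample, that is, a 1.0× bead ratio. The small sketch below makes that ratio explicit and shows how a bead volume would be derived for another ratio; the 0.6× value in the second call is a hypothetical example, not a ratio stated in the text.

```python
def bead_ratio(bead_ul: float, sample_ul: float) -> float:
    """Bead-to-sample volume ratio (for example 1.0 for a '1.0x' clean-up)."""
    return bead_ul / sample_ul


def bead_volume(sample_ul: float, ratio: float) -> float:
    """Bead volume needed to reach a given bead-to-sample ratio."""
    return sample_ul * ratio


# Clean-up described above: 20 uL beads added to 20 uL fragmented product.
print(bead_ratio(20.0, 20.0))  # 1.0

# HYPOTHETICAL ratio for a 50 uL sample, for illustration only.
print(round(bead_volume(50.0, 0.6), 1))

# Fraction of the 16 uL eluate carried forward as 14 uL of supernatant.
print(14.0 / 16.0)  # 0.875
```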
3. PCR amplification and addition of sequencing adapters
(1) Primers were designed that incorporate the adapter sequence (shown in bold):
P5 primer:
Figure PCTCN2018084711-appb-000034
P7 primer:
Figure PCTCN2018084711-appb-000035
(2) PCR amplification of the DNA fragments:
Successful amplification shows that the fusion protein transposome attached adapters to both ends of the fragmented DNA. Place the PCR tube containing 14 μL of purified fragmented product on ice and set up a 50 μL reaction:
Table 4 PCR system for amplifying the fragmented DNA
Figure PCTCN2018084711-appb-000036
After mixing the reaction components thoroughly, run the following program in a thermal cycler:
Table 5 PCR program for amplifying the fragmented DNA
Figure PCTCN2018084711-appb-000037
Figure PCTCN2018084711-appb-000038
(3) Size selection of the PCR products:
The PCR products were size-selected with magnetic beads. Before use, equilibrate the beads to room temperature and mix them thoroughly by vortexing. Be sure to bring the post-PCR reaction back up to 50 μL with sterile distilled water, so that evaporation during PCR does not cause the selected fragments to deviate from the expected length. The results are shown in Figure 13: the fragmented DNA was selected with different bead ratios to retain DNA of the desired length. MK: DNA marker; lanes 1-6: DNA fragments after selection with different bead ratios; lane 7: negative control without size selection.
(4) Product quality control:
Fragment size was checked on an Agilent 2100 with a High Sensitivity DNA chip (see Figure 14). Because the primers used are specific to Adapters 1 and 2, successful amplification shows that the transposome fragmented the target DNA and inserted the adapter sequences at the same time.
4. Verification of library quality on a next-generation sequencer
The library was sequenced on an Illumina HiSeq X™ system with a PE150 (paired-end 150 bp) strategy. The sequencing output is given in Table 6, and the alignment of the reads to the reference genome is given in Table 7:
Table 6: Sequencing output
Figure PCTCN2018084711-appb-000039
Table 7: Alignment of sequencing data to the reference genome
Figure PCTCN2018084711-appb-000040
As Table 7 shows, for human genomic DNA, which is a large genome, a library built with the fusion protein transposome from only 50 ng of starting material yielded more than 95% effective data, with Q20 and Q30 both above 90%; at a sequencing depth of about 13X, coverage reached about 98%. Library construction is fast and simple and requires little sample.
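For orientation, mean sequencing depth relates to read count as depth = (read pairs × 2 × read length) / genome size. The sketch below applies this to the PE150 strategy and the roughly 13X depth mentioned above. The genome size of 3.1 Gb is an assumption, and the resulting read-pair count is an estimate, not a figure taken from Table 6.

```python
import math

HUMAN_GENOME_BP = 3_100_000_000  # ASSUMED approximate human genome size


def mean_depth(read_pairs: int, read_length: int, genome_size: int) -> float:
    """Mean depth for paired-end data: total sequenced bases / genome size."""
    return read_pairs * 2 * read_length / genome_size


def pairs_for_depth(depth: float, read_length: int, genome_size: int) -> int:
    """Smallest number of read pairs that reaches the requested mean depth."""
    return math.ceil(depth * genome_size / (2 * read_length))


pairs = pairs_for_depth(13, 150, HUMAN_GENOME_BP)
print(pairs)                                              # 134333334
print(round(mean_depth(pairs, 150, HUMAN_GENOME_BP), 2))  # 13.0
```

This ignores duplicates, unmapped reads and filtered low-quality reads, all of which lower the effective depth.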
II. Testing the IgG-binding function of the fusion proteins obtained in Example 1
Binding of the fusion protein to IgG is shown schematically in Figure 15. The ProteinA/G portion of the fusion protein can bind the Fc portion of IgG.
We detected the fusion proteins using the principle of the double-antibody sandwich enzyme-linked immunosorbent assay (sandwich ELISA). A microplate is coated with an anti-ProteinA monoclonal antibody to form the solid-phase antibody. ProteinA standard and fusion protein are added to the coated wells, followed by a second, biotin-labeled anti-ProteinA monoclonal antibody, and finally horseradish peroxidase-labeled streptavidin (SA-HRP), forming an antibody + antigen + antibody-biotin + SA-HRP complex. After washing, TMB substrate is added for color development: TMB turns blue under catalysis by HRP and finally turns yellow on addition of acid. Comparing the color intensity of the fusion protein samples with that of the ProteinA standard indicates how strongly the fusion protein binds antibody.
The specific ELISA procedure is as follows:
Dilute the ProteinA standard to the following concentrations for the standard curve:
0 nmol/ml, 0.1 nmol/ml, 0.2 nmol/ml, 0.4 nmol/ml, 0.8 nmol/ml, 1.6 nmol/ml and 3.2 nmol/ml.
Incubation of standards and test samples:
1. Add 100 μL of each prepared Protein A standard or test sample to the corresponding wells (except the blank wells).
2. Seal the plate with a plate sealer and incubate at 37 °C for 30 minutes.
3. Wash the plate: remove the plate sealer, discard the liquid in the wells, add 260 μL of 1× wash buffer to each well, soak for 30 seconds and discard the wash buffer; repeat the wash 4 times.
4. Tap the plate on absorbent paper to remove all residual liquid from the wells.
Incubation with Protein A detection antibody:
5. After mixing briefly, add 100 μL of Protein A detection antibody to each well (except the blank wells).
6. Seal the plate with a plate sealer and incubate at 37 °C for 30 minutes.
7. Wash the plate as in step 3.
8. Tap the plate on absorbent paper to remove all residual liquid from the wells.
Incubation with HRP-labeled streptavidin:
9. Add 100 μL of HRP-labeled streptavidin to each well (except the blank wells).
10. Seal the plate with a plate sealer and incubate at 37 °C for 10 minutes.
11. Wash the plate as in step 3.
12. Tap the plate on absorbent paper to remove all residual liquid from the wells.
Substrate reaction and absorbance measurement:
13. Add 100 μL of chromogenic solution to each well (including the blank wells).
14. Seal the plate with a plate sealer and react at room temperature (20-25 °C) in the dark for 10-15 minutes, timing from the addition of chromogenic solution to the first well. Note: the color reaction time depends on temperature. The ideal reaction temperature is 20-25 °C; at lower temperatures the reaction time should be extended accordingly.
15. Remove the plate sealer and add 50 μL of stop solution to each well (including the blank wells).
16. Immediately after stopping the reaction, measure the absorbance at 450 nm on a microplate reader.
17. Plot a standard curve with the OD value on the vertical axis and the ProteinA standard concentration on the horizontal axis, use it to calculate the amount of fusion protein in each sample, and convert this to the binding efficiency of the fusion protein.
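Step 17 can be sketched as a piecewise-linear interpolation on the standard curve. The standard concentrations below are those listed in the procedure. Apart from the 0.318 reading that the text reports for the 0.2 standard, the OD values are hypothetical placeholders, since the measured curve is only available as Figure 16. Linear interpolation is one common choice; it is not stated to be the fitting method used in the application.

```python
STANDARD_CONCS = [0.0, 0.1, 0.2, 0.4, 0.8, 1.6, 3.2]          # ProteinA standards
ILLUSTRATIVE_ODS = [0.05, 0.18, 0.318, 0.58, 1.05, 1.80, 2.60]  # mostly HYPOTHETICAL OD450


def concentration_from_od(od: float, concs: list, ods: list) -> float:
    """Interpolate a concentration from an OD450 reading.

    concs and ods describe the standard curve and must be in increasing order.
    Readings outside the curve are rejected rather than extrapolated.
    """
    if not ods[0] <= od <= ods[-1]:
        raise ValueError("OD outside the range of the standard curve")
    points = list(zip(concs, ods))
    for (c0, o0), (c1, o1) in zip(points, points[1:]):
        if o0 <= od <= o1:
            return c0 + (c1 - c0) * (od - o0) / (o1 - o0)
    raise ValueError("standard curve is not monotonically increasing")


# A sample reading of 0.449 lies halfway between the 0.2 and 0.4 standards.
print(round(concentration_from_od(0.449, STANDARD_CONCS, ILLUSTRATIVE_ODS), 3))  # 0.3
```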
From the fusion protein data (see Table 8) and the ProteinA standard curve (Figure 16), the IgG-binding efficiency of each fusion protein relative to the ProteinA standard can be calculated.
Table 8. Binding efficiency of the fusion proteins to IgG
Figure PCTCN2018084711-appb-000041
At a fusion protein concentration of 1.8 ng/ml and a ProteinA standard concentration of 0.2 ng/ml, the fusion protein and the standard are at the same molar concentration. According to the standard curve in Figure 16, the ProteinA standard at 0.2 ng/ml gives an OD450 of 0.318, which is close to the OD450 values of the fusion proteins. The ELISA results show that the fusion proteins Tn5-ProteinA-1, Tn5-ProteinG-2, Tn10-ProteinA-3, Tn10-ProteinG-4 and Tn5-ProteinA-5 can bind IgG, with a binding capacity similar to that of the ProteinA standard.
Example 3 ChT-Seq: a new method for studying protein-DNA interactions enabled by the transposase-antibody binding protein fusion proteins
The conventional ChIP-Seq method first uses chromatin immunoprecipitation (ChIP) to specifically enrich the DNA fragments bound by the protein of interest, then purifies these fragments and builds a library from them, and finally subjects the library to high-throughput sequencing. The workflow is as follows:
(1) Crosslink whole cells (or tissue) with formaldehyde, which links the target protein to chromatin;
(2) Isolate the genomic DNA and shear it by sonication into small fragments of a defined length;
(3) Add a specific antibody against the target protein, which forms an immunoprecipitated immune complex with the target protein;
(4) Reverse the crosslinks and purify the DNA to obtain the chromatin immunoprecipitated DNA sample;
(5) Library construction: blunt the DNA ends, add an A overhang, ligate Y-shaped adapters and amplify by PCR; a DNA library is obtained after quality control;
(6) Sequence the prepared DNA library by NGS.
These steps are laborious and difficult to perform. In step (2), sonication is hard to control and easily over-shears or under-shears the DNA, so reproducibility is poor. The co-immunoprecipitation in step (3) is complicated and tedious and loses a large amount of sample. The library construction in step (5) involves many steps and also loses sample and DNA information. ChIP-Seq therefore requires a large amount of starting material and has low reproducibility.
Because of its unique functions and properties, we applied the transposase-antibody binding protein fusion protein of the invention to the study of protein-DNA interactions and created a new method, ChT-Seq (Chromatin-Transposase-Sequencing). Its basic principle is shown schematically in Figure 17.
The new method of the invention for studying protein-DNA interactions comprises the following steps:
(1) Crosslink the cells with formaldehyde, isolate the genomic DNA and pre-fragment it;
(2) Wash out the formaldehyde and add the specific antibody against the target protein and the transposase-antibody binding protein transposome. The transposome binds the antibody and the target protein to form a complex, and the transposase then carries out transposition, cutting the sequences adjacent to the DNA site bound by the target protein, as shown in Figure 18;
(3) Amplify by PCR with adapter-based primers to obtain the sequencing library, then sequence by NGS.
ChT-Seq thus omits the co-immunoprecipitation and elution steps, which reduces the number of operations and greatly reduces sample loss. Because adapters are attached directly, library construction is greatly simplified and sequencing is easier. The starting sample requirement is therefore lower, less information is lost, and reproducibility and reliability are greatly improved.
The specific ChT-Seq procedure:
Reagent preparation: warm the cell lysis buffer (SDS lysis buffer: 1% SDS, 10 mM EDTA and 50 mM Tris, pH 8.1) until it is fully dissolved; thaw the protease inhibitor cocktail at room temperature; pre-chill the PBS and dilute it to 1× PBS.
1. Collect the cells (HeLa cells, about 2×10^5; conventional ChIP-Seq needs about 1×10^6 to 1×10^7), add formaldehyde to a final concentration of 1%, mix by gentle shaking and react at room temperature for 10 min;
2. Stop the crosslinking by adding glycine to a final concentration of 1%; mix and leave at room temperature for 5 min;
3. Cool on ice, aspirate the excess medium, wash the cells twice with pre-chilled PBS, and add 2 mL of pre-chilled PBS and 10 μL of protease inhibitor;
4. Pellet the cells by low-speed centrifugation (700 g, 4 °C, 2-5 min) and remove the supernatant;
5. Cell lysis: add pre-chilled cell lysis buffer in proportion to the number of cells, together with 5 μL of protease inhibitor, and resuspend the cells. The lysate can be divided into 300-400 μL aliquots. Run 5 μL of lysate on a gel to inspect the extracted genomic DNA (see Figure 19); surplus sample can be stored at -80 °C;
6. Sonication: pulse for 15 s at power setting 3, then rest on ice for 45 s; repeat 3 times, keeping the sample on ice throughout (enzymatic digestion of the genome may be used instead at this step);
7. Centrifuge at 10000 g, 4 °C for 10 min, collect the supernatant and discard the insoluble material; measure the DNA concentration and calculate the sample amount;
8. Blank control: take 100 μL of template sample DNA, add 4 μL of 5 M NaCl and treat at 65 °C for 2 h to reverse the crosslinks. After phenol/chloroform extraction of a portion, check the shearing from step 6 by electrophoresis.
9. Negative control: take 100 μL of template sample DNA, add 900 μL of dilution buffer (containing 4.5 μL of protease inhibitor) and 1 μg of non-specific mouse IgG as the antibody, and incubate at 4 °C for 2 h with gentle shaking;
10. Experimental group: take 200 μL of template sample DNA, add 900 μL of dilution buffer (containing 4.5 μL of protease inhibitor) and the specific antibody against the target protein, and incubate at 4 °C for 2 h with gentle shaking;
11. To the experimental group and the negative control, add the Transposase-ProteinA/G transposome carrying Adaptor1 (assembled as described in the functional testing section of Example 2), incubate at 4 °C for 10 min with gentle shaking, add MgCl2 to a final Mg2+ concentration of 5 mM, and incubate at 55 °C for 10 min;
12. Add the Tn5 transposome carrying Adaptor2 and incubate at 55 °C for 5 min;
13. Amplify by PCR with primer P5 carrying the Adaptor1 sequence (SEQ ID NO. 26) and primer P7 carrying the Adaptor2 sequence (SEQ ID NO. 27);
14. Library quality control: analyze a portion of the constructed library by electrophoresis (see Figure 20), Qubit and NanoDrop (see Figure 21) to determine the DNA concentration, purity and total amount of each sample.
15. Size-select the fragments with magnetic beads, then sequence.
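Several steps above specify a final concentration reached by adding a concentrated stock to an existing volume (for example, 5 mM Mg2+ in step 11). When the added volume is not negligible, the volume to add follows from C_stock × V = C_final × (V_start + V), which gives V = C_final × V_start / (C_stock - C_final). The sketch below applies this to step 11. The 1 M MgCl2 stock is an assumption, and the 1000 μL starting volume is the 100 μL of sample plus 900 μL of dilution buffer from step 9, ignoring the antibody and transposome volumes.

```python
def spike_in_volume(stock_conc: float, final_conc: float, start_vol: float) -> float:
    """Volume of stock to add to start_vol so the mixture reaches final_conc.

    Derived from stock_conc * V = final_conc * (start_vol + V).
    Concentrations share one unit; the result has the unit of start_vol.
    """
    if final_conc >= stock_conc:
        raise ValueError("final concentration must be below the stock concentration")
    return final_conc * start_vol / (stock_conc - final_conc)


# Step 11, negative control: 5 mM Mg2+ in about 1000 uL, ASSUMING a 1 M (1000 mM) stock.
mg_ul = spike_in_volume(1000.0, 5.0, 1000.0)
print(round(mg_ul, 2))  # 5.03 (uL of 1 M MgCl2)

# Check: concentration actually reached after adding that volume.
print(round(1000.0 * mg_ul / (1000.0 + mg_ul), 6))  # 5.0 (mM)
```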
For comparison, an experiment was run with the conventional ChIP-Seq method. Starting from 1×10^6 HeLa cells, about 10 μg of DNA was extracted after cell lysis and about 10 ng of DNA was obtained after chromatin immunoprecipitation. After end blunting, A-tailing, ligation of Y-shaped adapters and PCR amplification, 20 ng of library was obtained for sequencing.
With the ChT-Seq method, starting from about 2×10^5 HeLa cells, about 2 μg of DNA was extracted after cell lysis, and transposase cleavage with adapter attachment yielded 100 ng of library for sequencing. ChT-Seq therefore produced more sequencing library than ChIP-Seq from fewer starting cells, with less loss along the way, and repeated experiments gave consistent results, indicating high reproducibility.
The above are merely preferred embodiments of the invention and do not limit the invention in any form or substance. It should be noted that a person of ordinary skill in the art may make improvements and additions without departing from the method of the invention, and such improvements and additions shall also be regarded as falling within the scope of protection of the invention. Equivalent changes, modifications and developments made by a person skilled in the art on the basis of the technical content disclosed above, without departing from the spirit and scope of the invention, are equivalent embodiments of the invention; likewise, any equivalent change, modification or development made to the above embodiments in accordance with the essential technology of the invention still falls within the scope of the technical solution of the invention.
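The yields reported for the two methods can be put on a common footing by normalizing the library mass to the amount of extracted DNA and to the number of starting cells. The sketch below uses only the figures quoted in this comparison; the variable and function names are illustrative.

```python
def library_ng_per_ug_input(library_ng: float, extracted_dna_ug: float) -> float:
    """Library yield in ng per microgram of DNA extracted after lysis."""
    return library_ng / extracted_dna_ug


def library_ng_per_1e5_cells(library_ng: float, cells: float) -> float:
    """Library yield in ng per 100,000 starting cells."""
    return library_ng / (cells / 1e5)


# Figures quoted in the text.
chip = {"cells": 1e6, "dna_ug": 10.0, "library_ng": 20.0}    # conventional ChIP-Seq
cht = {"cells": 2e5, "dna_ug": 2.0, "library_ng": 100.0}     # ChT-Seq

chip_yield = library_ng_per_ug_input(chip["library_ng"], chip["dna_ug"])
cht_yield = library_ng_per_ug_input(cht["library_ng"], cht["dna_ug"])

print(chip_yield)              # 2.0 ng library per ug extracted DNA
print(cht_yield)               # 50.0 ng library per ug extracted DNA
print(cht_yield / chip_yield)  # 25.0-fold higher for ChT-Seq
```

Normalizing per cell gives the same 25-fold ratio, because both experiments report about 1 μg of extracted DNA per 10^5 cells. Library mass after PCR also depends on the number of amplification cycles, which the text does not state, so the ratio is indicative only.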

Claims (30)

  1. 一种融合蛋白,其特征在于,所述融合蛋白的结构中包括具有转座功能的第一结构域和具有结合抗体Fc段功能的第二结构域。A fusion protein characterized in that the structure of the fusion protein comprises a first domain having a transposition function and a second domain having a function of binding an antibody Fc segment.
  2. 如权利要求1所述的融合蛋白,其特征在于,所述第一结构域和所述第二结构域通过连接片段连接。The fusion protein according to claim 1, wherein the first domain and the second domain are joined by a linker.
  3. 如权利要求2所述的融合蛋白,其特征在于,所述连接片段的结构通式为(GS) a(GGS) b(GGGS) c(GGGGS) d,其中a、b、c、d均是大于或等于0的整数。 The fusion protein according to claim 2, wherein the linking fragment has the structural formula of (GS) a (GGS) b (GGGS) c (GGGGS) d , wherein a, b, c, and d are An integer greater than or equal to 0.
  4. 如权利要求3所述的融合蛋白,其特征在于,所述连接片段的氨基酸序列可选自以下之任一情况:The fusion protein according to claim 3, wherein the amino acid sequence of the ligated fragment is selected from any of the following:
    (1)GGGGS;(1) GGGGS;
    (2)GSGGGGS;(2) GSGGGGS;
    (3)GGSGGSGGGS;(3) GGSGGSGGGS;
    (4)GGGSGGGGSGS;(4) GGGSGGGGSGS;
    (5)GGSGGSGGSGGS;(5) GGSGGSGGSGGS;
    (6)GGGGSGGGGS;(6) GGGGSGGGGS;
    (7)无连接片段。(7) No connection fragment.
  5. The fusion protein according to claim 1, characterized in that the first domain is a transposase or a protein analog having a transposition function.
  6. The fusion protein according to claim 5, characterized in that the first domain is a member of the Tc1/Mariner, hobo, MITEs, hAT, PiggyBac (PB), or TnA transposase family having a transposition function, or a protein analog having a transposition function.
  7. The fusion protein according to claim 1, characterized in that the first domain belongs to the TnA transposase family, the TnA transposase family member being selected from Tn1, Tn2, Tn3, Tn4, Tn5, Tn6, Tn7, Tn8, Tn9, or Tn10.
  8. The fusion protein according to claim 1, characterized in that the first domain is a Tn5 transposase or a Tn10 transposase.
  9. The fusion protein according to claim 8, further characterized by any one or more of the following features: (1) the Tn5 transposase is selected from a full-length Tn5 transposase, a partial functional domain of Tn5 transposase, a Tn5 transposase mutant, a tagged full-length Tn5 transposase, a tagged partial functional domain of Tn5 transposase, or a tagged Tn5 transposase mutant; (2) the Tn10 transposase is selected from a full-length Tn10 transposase, a partial functional domain of Tn10 transposase, a Tn10 transposase mutant, a tagged full-length Tn10 transposase, a tagged partial functional domain of Tn10 transposase, or a tagged Tn10 transposase mutant.
  10. The fusion protein according to claim 9, characterized in that the Tn5 transposase mutant is selected from the following: R30Q, K40Q, Y41H, T47P, E54K/V, M56A, R62Q, D97A, E110K, D188A, Y319A, R322A/K/Q, E326A, K330A/R, K333A, R342A, E344A, E345K, N348A, L372P, S438A, K439A, S445A, G462D, A466D.
  11. The fusion protein according to claim 9, characterized in that the tag is selected from the following: HHHHHH, DYKDDDDK, YPYDVPDYA, GGLLISGGAL.
  12. The fusion protein according to claim 9, further characterized by any one or more of the following features: (1) the amino acid sequence of the full-length Tn5 transposase is as shown in SEQ ID NO.1; (2) the amino acid sequence of the full-length Tn10 transposase is as shown in SEQ ID NO.2.
  13. The fusion protein according to claim 1, characterized in that the second domain is Staphylococcus aureus protein A (Protein A), streptococcal protein G (Protein G), streptococcal protein L (Protein L), or another protein analog having the function of binding the Fc fragment of an antibody.
  14. The fusion protein according to claim 13, characterized in that the second domain is a full-length Staphylococcus aureus protein A, a partial functional domain of Staphylococcus aureus protein A, a Staphylococcus aureus protein A mutant, a full-length streptococcal protein G, a partial functional domain of streptococcal protein G, a streptococcal protein G mutant, a tagged full-length Staphylococcus aureus protein A, a tagged partial functional domain of Staphylococcus aureus protein A, a tagged Staphylococcus aureus protein A mutant, a tagged full-length streptococcal protein G, a tagged partial functional domain of streptococcal protein G, or a tagged streptococcal protein G mutant.
  15. The fusion protein according to claim 14, characterized in that the added tag is selected from the following: HHHHHH, DYKDDDDK, YPYDVPDYA, GGLLISGGAL.
  16. The fusion protein according to claim 14, further characterized by any one or more of the following features: (1) the amino acid sequence of the full-length Staphylococcus aureus protein A is as shown in SEQ ID NO.3; (2) the amino acid sequence of the full-length streptococcal protein G is as shown in SEQ ID NO.4.
  17. The fusion protein according to claim 1, characterized in that the fusion protein has the general formula: first domain-linker-second domain, or second domain-linker-first domain.
  18. The fusion protein according to claim 1, characterized in that the amino acid sequence of the fusion protein is as shown in any one of SEQ ID NO.5, SEQ ID NO.6, SEQ ID NO.7, SEQ ID NO.8, or SEQ ID NO.9.
  19. A polynucleotide encoding the fusion protein according to any one of claims 1 to 18.
  20. The polynucleotide according to claim 19, characterized in that it comprises a DNA sequence encoding the amino acid sequence recited in claim 18.
  21. A cloning vector containing the polynucleotide according to claim 19 or 20.
  22. An expression vector containing the polynucleotide according to claim 19 or 20.
  23. A host cell comprising the vector according to claim 21 or 22, or transfected with the vector according to claim 21 or 22.
  24. A method for preparing the fusion protein according to any one of claims 1 to 18, comprising the following steps: synthesizing or cloning the target DNA sequence; constructing a cloning vector containing the target DNA sequence; constructing an expression vector containing the target DNA sequence; transforming the expression vector containing the target DNA sequence into a prokaryotic host cell; screening for high-yield cell lines that express strongly in growth medium; culturing the selected high-expression cell line and expressing the fusion protein; and purifying the fusion protein from the expression product.
  25. The method according to claim 24, characterized in that the prokaryotic host cell is a BL21 or Rosetta cell.
  26. A reagent combination, characterized in that it comprises the fusion protein according to claims 1 to 18 and the other components used with it.
  27. Use of the fusion protein according to any one of claims 1 to 18 in the preparation of an antibody-binding transposome.
  28. An antibody-binding transposome, comprising the fusion protein according to any one of claims 1 to 18 and an adapter linked to the first domain of the fusion protein.
  29. Use of the antibody-binding transposome according to claim 28 in studying protein-chromatin interactions.
  30. A method for studying protein-chromatin interactions, comprising the following steps:
    (1) taking a sufficient number of cells, lysing the cells, isolating the chromatin, and pretreating the chromatin;
    (2) adding an antibody against the target transcription factor or against a protein bound to chromatin, and adding the adapter-carrying antibody-binding transposome of claim 28;
    (3) the chromosomes are fragmented and the adapters are introduced;
    (4) amplifying with the adapters as primers to obtain a DNA library;
    (5) performing high-throughput sequencing on the resulting library.
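
The linker formula of claim 3, the domain orders of claim 17, and the substitution notation of claim 10 can be illustrated with a short sketch. The Python below is illustrative only and is not part of the claims. The names `build_linker`, `assemble_fusion`, and `parse_substitution` are hypothetical, the domain strings are placeholders rather than real sequences, and two readings are assumptions: that the formula is a strict left-to-right concatenation of its four repeat units, and that a slash in a substitution such as E54K/V lists alternative replacement residues.

```python
import re
from typing import List, NamedTuple

# Repeat units of the claim 3 formula (GS)a(GGS)b(GGGS)c(GGGGS)d, in written order.
UNITS = ("GS", "GGS", "GGGS", "GGGGS")


def build_linker(a: int = 0, b: int = 0, c: int = 0, d: int = 0) -> str:
    """Concatenate a, b, c and d copies of the four units, left to right."""
    counts = (a, b, c, d)
    if any(n < 0 for n in counts):
        raise ValueError("a, b, c and d must be integers >= 0")
    return "".join(unit * n for unit, n in zip(UNITS, counts))


def assemble_fusion(first: str, second: str, linker: str = "",
                    first_at_n_terminus: bool = True) -> str:
    """Join the two domains in either order allowed by claim 17."""
    if first_at_n_terminus:
        parts = (first, linker, second)
    else:
        parts = (second, linker, first)
    return "".join(parts)


class Substitution(NamedTuple):
    wild_type: str            # original residue, one-letter code
    position: int             # 1-based residue number
    replacements: List[str]   # alternative residues, e.g. ["A", "K", "Q"]


# One residue letter, a residue number, then one or more slash-separated letters.
_SUBSTITUTION = re.compile(r"([A-Z])(\d+)([A-Z](?:/[A-Z])*)")


def parse_substitution(text: str) -> Substitution:
    """Parse claim 10 notation such as 'R30Q' or 'R322A/K/Q'."""
    match = _SUBSTITUTION.fullmatch(text.strip())
    if match is None:
        raise ValueError(f"not a substitution: {text!r}")
    wild_type, position, alternatives = match.groups()
    return Substitution(wild_type, int(position), alternatives.split("/"))


if __name__ == "__main__":
    # Placeholder strings, NOT real sequences (SEQ ID NO.1-9 are not reproduced here).
    toy_transposase, toy_fc_binder = "MTRANSPOSASE", "MFCBINDER"
    linker = build_linker(d=2)  # two GGGGS units, item (6) of claim 4
    print(assemble_fusion(toy_transposase, toy_fc_binder, linker))
    print(assemble_fusion(toy_transposase, toy_fc_binder, linker,
                          first_at_n_terminus=False))
    print(parse_substitution("R322A/K/Q"))
```

Under this strict ordering, `build_linker(a=1, d=1)` gives item (2) of claim 4 and `build_linker(b=2, c=1)` gives item (3). Item (4) places its GS unit last, so this sketch does not reproduce it; the formula may be meant more loosely than the ordering assumed here.
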
PCT/CN2018/084711 2018-03-27 2018-04-27 Fusion protein of transposase-antibody binding protein and preparation and application thereof WO2019184044A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201810257987 2018-03-27
CN201810257987.3 2018-03-27

Publications (1)

Publication Number Publication Date
WO2019184044A1 (en) 2019-10-03

Family

ID=62657688

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2018/084711 WO2019184044A1 (en) 2018-03-27 2018-04-27 Fusion protein of transposase-antibody binding protein and preparation and application thereof

Country Status (2)

Country Link
CN (1) CN108219006A (en)
WO (1) WO2019184044A1 (en)

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114106196A (en) * 2021-10-29 2022-03-01 陈凯 Antibody-transposase fusion protein and preparation method and application thereof
WO2022056704A1 (en) * 2020-09-16 2022-03-24 深圳华大生命科学研究院 Method for analyzing cell epigenomics from multiple dimensions
WO2022167665A1 (en) * 2021-02-05 2022-08-11 Ospedale San Raffaele S.R.L. Engineered transposase and uses thereof

Families Citing this family (12)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108285494B (en) * 2018-02-11 2021-09-07 北京大学 Fusion protein, kit and CHIP-seq detection method
CN109400714B (en) * 2018-10-26 2019-11-01 南京诺唯赞生物科技有限公司 The recombination fusion protein of antibody target and its application in epigenetics
CN109517048B (en) * 2018-12-04 2021-05-14 厦门大学 Optogenetics tool for regulating long-distance interaction of chromatin under blue light
CN111440843A (en) * 2019-01-16 2020-07-24 中国科学院生物物理研究所 Method for preparing chromatin co-immunoprecipitation library by using trace clinical puncture sample and application thereof
CN110372799B (en) * 2019-08-01 2020-06-09 北京大学 Fusion protein for preparing single-cell ChIP-seq library and application thereof
CN110938610B (en) * 2019-12-16 2020-12-01 广东菲鹏生物有限公司 Transposase mutant, fusion protein, preparation method and application thereof
CN112812192B (en) * 2021-01-22 2022-05-20 湖南大学 ProA/G-dRep fusion protein serving as nucleic acid-antibody conjugate universal carrier and application thereof
CN112795563A (en) * 2021-03-23 2021-05-14 上海欣百诺生物科技有限公司 Use and method of biotinylated transposomes for recovering CUT & Tag or ATAC-seq products
CN113136374B (en) * 2021-04-25 2022-10-21 福建农林大学 Preparation and application of recombinant mutant Tn5 transposase
CN115785283B (en) * 2022-11-02 2024-05-31 武汉影子基因科技有限公司 PAG-Tn5 mutant and application thereof
CN116496365B (en) * 2022-12-08 2024-06-25 宜明(济南)生物科技有限公司 Acidic surface-assisted dissolution short peptide tag for improving recombinant protein expression efficiency
CN116813800B (en) * 2023-07-07 2024-03-12 南京诺唯赞生物科技股份有限公司 Double-stranded DNA binding protein-transposase fusion protein and library construction method

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
EP1262553A1 (en) * 2001-05-22 2002-12-04 ADEREGEM (Association pour le Développement de la Recherche en Génétique Moléculaire Luciferase-selection marker fusion proteins, polynucleotides encoding such proteins, and uses thereof
CN104651409A (en) * 2015-02-17 2015-05-27 浙江大学 Efficient transgenosis method with mediation of transcription activator-like effector protein
CN106507677A (en) * 2014-04-15 2017-03-15 伊鲁米那股份有限公司 For improving insetion sequence bias and increasing the transposase that DNA is input into the modification of tolerance
CN108285494A (en) * 2018-02-11 2018-07-17 北京大学 A kind of fusion protein, kit and CHIP-seq detection methods

Family Cites Families (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105331632A (en) * 2015-11-05 2016-02-17 浙江大学 Method for synthesizing secretion calcium ion binding protein through bombyx mori posterior silkgland
CN105481984B (en) * 2015-12-03 2020-09-22 上海细胞治疗研究院 Transposase for efficiently mediating exogenous gene integration and application thereof

Also Published As

Publication number Publication date
CN108219006A (en) 2018-06-29

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 18912800

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 18912800

Country of ref document: EP

Kind code of ref document: A1