WO2022148311A1 - Research method for multi-target protein-dna interaction, and tool - Google Patents

Publication number
WO2022148311A1
Authority
WO
WIPO (PCT)
Prior art keywords
oligonucleotide
sequence
transposase
optionally
dna
Prior art date
Application number
PCT/CN2021/143623
Other languages
French (fr)
Chinese (zh)
Inventor
曹林
聂俊伟
瞿志鹏
江明扬
韩锦雄
吴恒
唐秋雨
Original Assignee
南京诺唯赞生物科技股份有限公司
Priority date
Filing date
Publication date
Application filed by 南京诺唯赞生物科技股份有限公司
Publication of WO2022148311A1

Classifications

    • C: CHEMISTRY; METALLURGY
    • C12: BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12N: MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N15/00: Mutation or genetic engineering; DNA or RNA concerning genetic engineering, vectors, e.g. plasmids, or their isolation, preparation or purification; Use of hosts therefor
    • C12N15/09: Recombinant DNA-technology
    • C12N15/11: DNA or RNA fragments; Modified forms thereof; Non-coding nucleic acids having a biological activity
    • C: CHEMISTRY; METALLURGY
    • C12: BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12N: MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
    • C12N9/00: Enzymes; Proenzymes; Compositions thereof; Processes for preparing, activating, inhibiting, separating or purifying enzymes
    • C12N9/10: Transferases (2.)
    • C12N9/12: Transferases (2.) transferring phosphorus containing groups, e.g. kinases (2.7)
    • C12N9/1241: Nucleotidyltransferases (2.7.7)
    • C: CHEMISTRY; METALLURGY
    • C12: BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12Q: MEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
    • C12Q1/00: Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions
    • C12Q1/68: Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions involving nucleic acids
    • C12Q1/6806: Preparing nucleic acids for analysis, e.g. for polymerase chain reaction [PCR] assay
    • C: CHEMISTRY; METALLURGY
    • C12: BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12Q: MEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
    • C12Q1/00: Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions
    • C12Q1/68: Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions involving nucleic acids
    • C12Q1/6869: Methods for sequencing
    • C: CHEMISTRY; METALLURGY
    • C40: COMBINATORIAL TECHNOLOGY
    • C40B: COMBINATORIAL CHEMISTRY; LIBRARIES, e.g. CHEMICAL LIBRARIES
    • C40B50/00: Methods of creating libraries, e.g. combinatorial synthesis
    • C40B50/06: Biochemical methods, e.g. using enzymes or whole viable microorganisms

Definitions

  • the present application belongs to the field of biotechnology, and relates to an oligonucleotide-tagged targeting transposome complex and its use for studying multi-target protein-DNA interactions.
  • RNA Ribonucleic acid
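  • epigenetics: the phenomenon in which gene expression levels change and are stably inherited without any change in the DNA nucleotide sequence; it largely determines when, where, and how a gene is expressed.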
  • Common epigenetic controls include DNA methylation, histone modifications, and chromatin conformation changes.
  • Chromatin immunoprecipitation is a widely used method to study protein-DNA interactions, usually for the study of transcription factor binding sites or histone-specific modification sites.
  • the basic process of ChIP is: (1) fix minced tissue, or fix cells directly, with formaldehyde so that DNA and proteins are cross-linked to form target protein-DNA complexes; (2) fragment the chromatin DNA by sonication, then add a ChIP-grade antibody against the target protein to bind the target protein-DNA complex; (3) add protein A/G beads, which bind the antibody, to capture the complex, then release the DNA fragments by reversing the cross-links; (4) purify the enriched DNA fragments and determine their sequences by a downstream detection technology (quantitative PCR, gene chip, sequencing, etc.).
  • ChIP-seq, which combines ChIP with next-generation sequencing, can efficiently detect DNA segments that interact with histones, transcription factors, etc. on a genome-wide scale.
  • CUT&RUN: Cleavage Under Targets & Release Using Nuclease
  • CUT&Tag: Cleavage Under Targets & Tagmentation
  • the sample is cleaved in a targeted manner using protein A-Tn5 transposomes embedded with different barcode adapters, so that the sample DNA carries different adapter sequences after fragmentation by the transposase. It is an easy-to-operate, high-throughput, and high-quality single-cell ChIP-seq technology.
  • ChIP-seq is affected by factors such as cross-linking and sonication conditions and the antibodies used, and it requires a large quantity of cells or tissue for library construction, which makes it difficult to apply to low-input samples and single-cell experiments.
  • because sonication fragments the chromatin unevenly, ChIP-seq is prone to false negatives and false positives and produces high sequencing background noise.
  • This application provides a technology for simultaneous multi-target DNA-protein interaction research, which can detect two or more target proteins and their interacting DNA fragments in the same experimental sample and obtain low-background library information through high-throughput sequencing.
  • the entire experimental process greatly reduces the number of library construction steps, shortens the library construction time, lowers the required sample input, improves library yield and data quality, and helps to obtain more information on how histones, transcription factors, and DNA-binding proteins act in vivo.
  • the basic process of the present application is: (1) anneal oligonucleotides containing different index sequences to form adapters, and embed a pair of adapters with protein A-Tn5 transposase or protein G-Tn5 transposase, so that each adapter-transposase complex produced contains a unique index or combination of indexes; (2) incubate an antibody against the protein of interest with the embedded adapter-transposase complex to form an adapter-transposase-antibody complex, with one antibody corresponding to one index or index combination; (3) collect cells/nuclei and incubate them with the adapter-transposase-antibody complex, so that the antibody directs the transposase to the target protein; (4) activate the transposase, which cuts the DNA near the target protein and attaches the adapters; (5) inactivate the transposase, purify the fragmented and tagged DNA, and amplify it by PCR to obtain a library for sequencing; and (6) sequence the library using a downstream sequencing technology. Finally, the DNA
  • adaptor-transposase complexes (i.e., transposomes) containing different indexes or index combinations are combined with the antibodies in vitro to form a variety of adaptor-transposase-antibody complexes. After mixing, these different adaptor-transposase-antibody complexes enter the sample simultaneously to target different target proteins.
  • the DNA near each target protein is cut by the transposase, so that the two ends of each DNA fragment carry the corresponding indexes or index combinations, and the library is generated by PCR amplification. After sequencing, the reads can be split by index or index combination to identify the interactions between the different target proteins and DNA.
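The index-splitting step described above can be sketched as a simple table lookup. This is a minimal illustration under stated assumptions, not the application's own analysis pipeline: the function name `split_reads_by_index`, the six-base index sequences, and the assignment of indexes to H3K4me2, CTCF, and RNA Pol II are all hypothetical, and reads are assumed to have been parsed already into (first index, second index, insert) tuples.

```python
from collections import defaultdict

# Hypothetical assignment of one index combination per antibody target.
# The index sequences are placeholders, not sequences from the application.
INDEX_TO_TARGET = {
    ("ATCACG", "CGATGT"): "H3K4me2",
    ("TTAGGC", "TGACCA"): "CTCF",
    ("ACAGTG", "GCCAAT"): "RNA Pol II",
}


def split_reads_by_index(reads, index_to_target):
    """Group insert sequences by the target assigned to each index combination.

    `reads` is an iterable of (first_index, second_index, insert) tuples.
    Reads whose index combination is not in the table go to "unassigned".
    """
    inserts_by_target = defaultdict(list)
    for first_index, second_index, insert in reads:
        target = index_to_target.get((first_index, second_index), "unassigned")
        inserts_by_target[target].append(insert)
    return dict(inserts_by_target)


if __name__ == "__main__":
    example_reads = [
        ("ATCACG", "CGATGT", "GGCTAAGT"),
        ("TTAGGC", "TGACCA", "CCGTTAGA"),
        ("ATCACG", "TGACCA", "TTGACCAT"),  # combination not in the table
    ]
    for target, inserts in split_reads_by_index(example_reads, INDEX_TO_TARGET).items():
        print(target, inserts)
```

Keeping unmatched combinations in a separate "unassigned" bin, rather than discarding them, is a design choice of this sketch; it makes it easy to see how many fragments carry an index pair that no single complex was given.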
  • the method of the present application can be performed in a single tube at high throughput and can be combined "seamlessly" with single-cell sequencing platforms.
  • High-throughput sequencing technology: also known as second-generation or next-generation sequencing (NGS). It refers to technology that determines the sequences of hundreds of thousands to millions of DNA molecules in parallel in a single run; the reads obtained are generally short.
  • Transposase: an enzyme that carries out transposition, usually encoded by a transposon. It recognizes specific sequences at both ends of the transposon, detaches the transposon from the adjacent sequence, and inserts it into a new DNA target site, with no homology requirement.
  • Tn5 transposase: a transposase characterized by good insertion randomness, high stability, and insertion sites that are easy to sequence. It is an efficient tool for molecular genetics and gene sequencing.
  • a targeting moiety refers to any moiety capable of binding a molecule of interest, preferably an antibody or antibody fragment.
  • antibody is used herein in the broadest sense and encompasses a variety of antibody structures including, but not limited to, monoclonal antibodies, polyclonal antibodies, multispecific antibodies (eg, bispecific antibodies), and antibody fragments so long as they exhibit the desired antigen-binding activity.
  • antibody fragment refers to a molecule other than an intact antibody that comprises the portion of the intact antibody that binds the antigen to which the intact antibody binds.
  • antibody fragments include, but are not limited to, Fv, Fab, Fab', Fab'-SH, F(ab')2; diabodies; linear antibodies; single-chain antibody molecules (e.g., scFv); and multispecific antibodies formed from antibody fragments.
  • the application relates to an oligonucleotide pair comprising a first oligonucleotide and a second oligonucleotide, wherein: the first oligonucleotide comprises a first transposase recognition sequence; the second oligonucleotide comprises a second transposase recognition sequence; and the first oligonucleotide comprises a first tag sequence and/or the second oligonucleotide comprises a second tag sequence.
  • the first transposase recognition sequence is the same as the second transposase recognition sequence. In one embodiment where the first transposase recognition sequence is the same as the second transposase recognition sequence, the first transposase recognition sequence is in the same direction as the second transposase recognition sequence. In one embodiment where the first transposase recognition sequence is the same as the second transposase recognition sequence, the first transposase recognition sequence is reversed from the second transposase recognition sequence. In one embodiment, the first transposase recognition sequence is different from the second transposase recognition sequence.
  • the first oligonucleotide further comprises a first sequencing solid phase binding sequence and/or a first sequencing primer binding sequence.
  • the second oligonucleotide further comprises a second sequencing solid phase binding sequence and/or a second sequencing primer binding sequence.
  • the first tag sequence and/or the second tag sequence corresponds to the targeting moiety and/or target molecule below.
  • the first oligonucleotide and/or the second oligonucleotide may also comprise one or more additional tag sequences, with additional uses.
  • the first oligonucleotide and/or the second oligonucleotide are single-stranded, double-stranded, or a combination thereof.
  • the first transposase recognition sequence in the first oligonucleotide and/or the second transposase recognition sequence in the second oligonucleotide is double stranded.
  • the portion of the first oligonucleotide other than the first transposase recognition sequence eg, the first sequencing solid phase binding sequence, the first tag sequence, and/or the first sequencing primer binding sequence
  • the portion of the second oligonucleotide other than the second transposase recognition sequence eg, the second sequencing solid phase binding sequence, the second tag sequence, and/or the second sequencing primer binding sequence
  • the first sequencing solid phase binding sequence and/or the second sequencing solid phase binding sequence may be truncated and/or extended relative to the first immobilized probe and/or the second immobilized probe on the sequencing solid phase, respectively, e.g.
  • the first sequencing primer binding sequence and/or the second sequencing primer binding sequence may be truncated and/or extended relative to the first sequencing primer and/or the second sequencing primer, respectively, e.g., truncated and/or extended at the 5' end and/or the 3' end by at least 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20 nucleotides or more.
  • the first oligonucleotide comprises, in a 5' to 3' direction, an optional first sequencing solid phase binding sequence (e.g., identical or reverse complementary to the sequence of the first binding probe of the sequencing solid phase), an optional first tag sequence, an optional first sequencing primer binding sequence (e.g., identical or reverse complementary to the sequence of the first sequencing primer), and a first transposase recognition sequence (plus strand (e.g., AGATGTGTATAAGAGACAG, SEQ ID NO: 9) or minus strand).
  • the second oligonucleotide comprises, in a 5' to 3' direction, an optional second sequencing solid phase binding sequence (e.g., identical or reverse complementary to the sequence of the second binding probe of the sequencing solid phase), an optional second tag sequence, an optional second sequencing primer binding sequence (e.g., identical or reverse complementary to the sequence of the second sequencing primer), and a second transposase recognition sequence (plus strand (e.g., AGATGTGTATAAGAGACAG, SEQ ID NO: 9) or minus strand).
  • in the first oligonucleotide, the positions of the first tag sequence and the first sequencing primer binding sequence can be swapped.
  • in the second oligonucleotide, the positions of the second tag sequence and the second sequencing primer binding sequence can be swapped.
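The 5' to 3' segment order described above can be illustrated by concatenating the segments as strings. This is only a sketch: the function name `build_first_oligo` and the placeholder tag `ATCACG` are hypothetical, and pairing SEQ ID NO: 1 with SEQ ID NO: 5 in a single oligonucleotide is an assumption made for illustration, using example sequences that this document lists for the Illumina platform.

```python
# Example sequences given in this document for the Illumina platform.
SOLID_PHASE_BINDING = "AATGATACGGCGACCACCGAGATCTACAC"  # SEQ ID NO: 1
PRIMER_BINDING = "TCGTCGGCAGCGTC"                      # SEQ ID NO: 5
RECOGNITION_PLUS = "AGATGTGTATAAGAGACAG"               # SEQ ID NO: 9 (plus strand)


def build_first_oligo(tag="", solid_phase=SOLID_PHASE_BINDING,
                      primer_binding=PRIMER_BINDING,
                      recognition=RECOGNITION_PLUS,
                      swap_tag_and_primer=False):
    """Concatenate the segments of a first oligonucleotide, 5' to 3'.

    Optional segments are omitted by passing an empty string. When
    `swap_tag_and_primer` is True, the tag and the sequencing primer
    binding sequence trade places.
    """
    if swap_tag_and_primer:
        middle = [primer_binding, tag]
    else:
        middle = [tag, primer_binding]
    return "".join([solid_phase, *middle, recognition])


if __name__ == "__main__":
    # "ATCACG" is a placeholder tag, not a sequence from the application.
    print(build_first_oligo(tag="ATCACG"))
    print(build_first_oligo(tag="ATCACG", swap_tag_and_primer=True))
```

The transposase recognition sequence always stays at the 3' end in this arrangement; only the tag and primer binding segments move.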
  • the first oligonucleotide comprises, in a 5' to 3' direction, a first transposase recognition sequence (minus strand (e.g., CTGTCTCTTATACACATCT, SEQ ID NO: 10) or plus strand), an optional first sequencing primer binding sequence (e.g., reverse complementary or identical to the sequence of the first sequencing primer), an optional first tag sequence, and an optional first sequencing solid phase binding sequence (e.g., reverse complementary or identical to the sequence of the first binding probe).
  • the second oligonucleotide comprises, in a 5' to 3' direction, a second transposase recognition sequence (minus strand (e.g., CTGTCTCTTATACACATCT, SEQ ID NO: 10) or plus strand), an optional second sequencing primer binding sequence (e.g., reverse complementary or identical to the sequence of the second sequencing primer), an optional second tag sequence, and an optional second sequencing solid phase binding sequence (e.g., reverse complementary or identical to the sequence of the second binding probe).
  • in this arrangement of the first oligonucleotide, the positions of the first tag sequence and the first sequencing primer binding sequence can be swapped.
  • in this arrangement of the second oligonucleotide, the positions of the second tag sequence and the second sequencing primer binding sequence can be swapped.
  • binding probes for sequencing solid phases and their sequences are known in the art.
  • the sequencing solid phase binding sequences of the present application (e.g., the first sequencing solid phase binding sequence and/or the second sequencing solid phase binding sequence) are also known or readily available in the art (see the instructions for use of each sequencing platform, such as the Ion Torrent, Illumina, and BGI platforms).
  • the sequencing solid phase binding sequence of the present application can be AATGATACGGCGACCACCGAGATCTACAC (SEQ ID NO: 1) or its reverse complement GTGTAGATCTCGGTGGTCGCCGTATCATT (SEQ ID NO: 2), or CAAGCAGAAGACGGCATACGAGAT (SEQ ID NO: 3) or its reverse complement ATCTCGTATGCCGTCTTCTGCTTG (SEQ ID NO: 4), including truncated and/or extended sequences thereof.
  • Sequencing primers and their sequences are known in the art.
  • the sequencing primer binding sequences of the present application (e.g., the first sequencing primer binding sequence and/or the second sequencing primer binding sequence) are also known or readily available in the art (see the instructions for use of each sequencing platform, such as the Ion Torrent, Illumina, and BGI platforms).
  • the sequencing primer binding sequence of the present application may be TCGTCGGCAGCGTC (SEQ ID NO: 5) or its reverse complement GACGCTGCCGACGA (SEQ ID NO: 6), or GTCTCGTGGGCTCGG (SEQ ID NO: 7) or its reverse complement CCGAGCCCACGAGAC (SEQ ID NO: 8), including truncated and/or extended sequences thereof.
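The reverse-complement relationships stated for SEQ ID NO: 1 through 10 can be checked mechanically. The sketch below is illustrative only; the helper name `reverse_complement` and the dictionary layout are assumptions of this example, while the sequences themselves are copied from the text.

```python
# Translation table for complementing unambiguous DNA bases.
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


# Sequence pairs quoted in this document: (forward, stated reverse complement).
SEQ_PAIRS = {
    "SEQ ID NO: 1 / 2": ("AATGATACGGCGACCACCGAGATCTACAC",
                         "GTGTAGATCTCGGTGGTCGCCGTATCATT"),
    "SEQ ID NO: 3 / 4": ("CAAGCAGAAGACGGCATACGAGAT",
                         "ATCTCGTATGCCGTCTTCTGCTTG"),
    "SEQ ID NO: 5 / 6": ("TCGTCGGCAGCGTC", "GACGCTGCCGACGA"),
    "SEQ ID NO: 7 / 8": ("GTCTCGTGGGCTCGG", "CCGAGCCCACGAGAC"),
    "SEQ ID NO: 9 / 10": ("AGATGTGTATAAGAGACAG", "CTGTCTCTTATACACATCT"),
}

if __name__ == "__main__":
    for label, (forward, stated) in SEQ_PAIRS.items():
        status = "consistent" if reverse_complement(forward) == stated else "MISMATCH"
        print(label, status)
```

The same helper is handy when adapting the oligonucleotides to another platform, since the text allows either a sequence or its reverse complement in each segment.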
  • the tag sequences of the present application can utilize any short oligonucleotide, eg, at least 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16 or more nucleotides.
  • the tag sequences of the present application can utilize tag sequences utilized by current or future sequencing platforms (including but not limited to ion torrent platforms, illumina platforms, and BGI platforms).
  • the first oligonucleotide is linked to the second oligonucleotide.
  • the first and second oligonucleotides are connected at the ends opposite the transposase recognition sequence (e.g., at the sequencing solid phase binding sequence end).
  • a cleavage site such as a restriction enzyme recognition site, exists between the first oligonucleotide and the second oligonucleotide.
  • the present application relates to an oligonucleotide-tagged targeted transposome complex comprising a transposase, an oligonucleotide pair of the present application, and a targeting moiety.
  • the targeting moiety is an aptamer, an oligonucleotide that specifically binds a target molecule.
  • the targeting moiety is an antibody (including antibody fragments).
  • the targeting module specifically binds a target molecule that interacts with DNA (eg, modulates gene expression).
  • the target molecule is a histone, including isoforms, variants, fragments thereof.
  • the target molecule is a DNA polymerase, including isoforms, variants, fragments thereof.
  • the target molecule is RNA polymerase, including isoforms, variants, fragments thereof.
  • the target molecule is a transcription factor, such as an ARS binding factor, an rDNA enhancer binding protein, a TATA binding protein, or a CCCTC binding factor.
  • the target molecule is one or more of the following: AAF, abl, ADA2, ADA-NF1, AF-1, AFP1, AhR, AIIN3, ALL-1, α-CBF, α-CP1.
  • Nrf2, NRF-2β1, NRF-2γ1, NRL, NRSF form 1, NRSF form 2, NTF, O2, OCA-B, Oct-1, Oct-2, Oct-2.1, Oct-2B, Oct-2C, Oct-4A, Oct4B, Oct-5, Oct-6, Octa-factor, octamer-binding factor, oct-B2, oct-B3, Otx1, Otx2, OZF, p107, p130, p28 regulator, p300, p38erg, p45, p49erg, -p53, p55, p55erg, p65δ, p67, Pax-1, Pax-2, Pax-3, Pax-3A, Pax-3B, Pax-4, Pax-5, Pax-6, Pax-6/Pd-5a, Pax-7, Pax-8, Pax-8a, Pax-8b, Pax-8c, Pax-8d, Pax-8e, Pax-8f, Pax-9
  • the transposase is covalently associated with the targeting moiety, eg, by fusion or chemical coupling. In one embodiment, the fusion is direct or indirect. In one embodiment, the transposase is preferably non-covalently associated with the targeting moiety. In one embodiment, the transposase is fused or coupled to one member of a binding pair and the targeting moiety is fused or coupled to another member of the binding pair. In one embodiment, the binding pair is biotin-avidin, biotin-streptavidin, ligand-receptor, enzyme-substrate or complementary oligonucleotide. In one embodiment where the targeting moiety is an antibody, the transposase is fused to an antibody binding protein. In one embodiment, the antibody binding protein is protein A, protein G, an Fc receptor, or a secondary antibody.
  • the transposase is a transposase known in the art or discovered in the future, such as a Tn5 transposase, Mu transposase, IS5 transposase, or IS91 transposase, including wild-type and mutant forms (see, e.g., CN1367840A, CN109400714A, US6406896B1, US20040235103A1).
  • the transposase is a highly active Tn5 transposase, such as an EK/LP Tn5 transposase.
  • the transposase is a Tn5 transposase mutant, eg, comprising one or more substitutions of E58V, L372Q, E344K, D97E, D188E, E326D.
  • the transposase recognition sequence is a transposase recognition sequence known in the art or discovered in the future, e.g., a Tn5-type transposase recognition sequence such as the inner end (IE) or outer end (OE), including wild-type and mutant forms thereof, as well as mosaic end (ME) forms, such as the 19 bp Tn5 core terminal sequence (AGATGTGTATAAGAGACAG, SEQ ID NO: 9) or its reverse complement (CTGTCTCTTATACACATCT, SEQ ID NO: 10).
  • the transposase recognition sequence is a Mu transposase recognition sequence, an IS5 transposase recognition sequence, or an IS91 transposase recognition sequence, including wild-type and mutant types.
  • the first tag sequence in the first oligonucleotide and/or the second tag sequence in the second oligonucleotide is specific to the targeting moiety.
  • the present application relates to a mixture comprising at least a first complex of the present application and a second complex of the present application, wherein the targeting moiety in the first complex specifically binds a first target molecule, and the targeting moiety in the second complex specifically binds a second target molecule that is different from the first target molecule.
  • the mixture of the present application involves a set of targeting modules.
  • different targeting modules in the set of targeting modules correspond to the same first tag sequence and different second tag sequences.
  • different targeting moieties in the set of targeting moieties correspond to different first tag sequences and the same second tag sequence.
  • different targeting moieties in the set of targeting moieties correspond to different first tag sequences and different second tag sequences.
  • the mixture of the present application involves multiple sets of targeting modules.
  • different sets of targeting modules in the plurality of sets of targeting modules correspond to different first tag sequences, and different targeting modules in the same set of targeting modules correspond to the same first tag sequence and different second tag sequences.
  • different sets of targeting modules in the plurality of sets of targeting modules correspond to different second tag sequences, and different targeting modules in the same set of targeting modules correspond to the same second tag sequence and different first tag sequences.
  • the present application relates to a method of preparing a nucleic acid library for the simultaneous study of the interaction of multiple target molecules with DNA, comprising: obtaining a mixture of the present application comprising multiple complexes against multiple target molecules, that is, comprising a complex for each of the multiple target molecules; obtaining a sample in which the multiple target molecules interact with DNA; reacting the mixture with the sample, so that the targeting module binds the corresponding target molecule and the transposase fragments the DNA and adds the corresponding tag sequences to both ends of the DNA fragments; and recovering the tagged DNA fragments to obtain a nucleic acid library.
  • the method further comprises purifying and/or amplifying the recovered DNA fragments.
  • the present application relates to a method for simultaneously identifying the sites of action of multiple target molecules on DNA, comprising: obtaining a mixture of the present application comprising multiple complexes directed against multiple target molecules, i.e., comprising a complex for each of the multiple target molecules; obtaining a sample in which the multiple target molecules interact with DNA; and reacting the mixture with the sample, so that the targeting module binds the corresponding target molecule and the transposase fragments the DNA.
  • the method further comprises purifying and/or amplifying the recovered DNA fragments.
  • the method further comprises analyzing the sequencing results.
  • analyzing the sequencing results includes aggregating sequencing reads corresponding to (e.g., comprising) the same first tag sequence and/or the same second tag sequence. For example, where the targeting moiety for target molecule A corresponds to tag sequences A1 and A2 and the targeting moiety for target molecule B corresponds to tag sequences B1 and B2, the sequencing reads corresponding to tag sequence A1 or A2 are classified under target molecule A, and the insert sequence in each such read is a DNA site that interacts with target molecule A; the sequencing reads corresponding to tag sequence B1 or B2 are classified under target molecule B, and the insert sequence in each such read is a DNA site that interacts with target molecule B.
  • as a further example, where the targeting modules for the group A target molecules all correspond to the same tag sequence A, and the targeting modules for target molecules AB1, AB2... in group A correspond to unique tag sequences B1, B2..., the sequencing reads corresponding to tag sequence A are classified under the target molecules of group A, the sequencing reads corresponding to tag sequence B1 are classified under target molecule AB1, and the sequencing reads corresponding to tag sequence B2 are classified under target molecule AB2.
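The two-level classification described above (a shared group tag plus a member-specific tag) can be sketched as a nested lookup. This is a hedged illustration, not the application's software: the function name `classify_read`, the table names, and the symbolic tag labels `TAG_A`, `TAG_B1`, and `TAG_B2` are hypothetical stand-ins for real tag sequences.

```python
# Hypothetical tables: the shared tag identifies the group, and the second
# tag identifies a member within that group. Labels stand in for sequences.
GROUP_BY_FIRST_TAG = {"TAG_A": "group A"}
MEMBER_BY_SECOND_TAG = {
    "group A": {"TAG_B1": "AB1", "TAG_B2": "AB2"},
}


def classify_read(first_tag, second_tag,
                  group_by_first_tag=GROUP_BY_FIRST_TAG,
                  member_by_second_tag=MEMBER_BY_SECOND_TAG):
    """Return (group, member) for a read; either may be None if unknown."""
    group = group_by_first_tag.get(first_tag)
    if group is None:
        return None, None
    member = member_by_second_tag.get(group, {}).get(second_tag)
    return group, member


if __name__ == "__main__":
    print(classify_read("TAG_A", "TAG_B1"))
    print(classify_read("TAG_A", "TAG_B2"))
    print(classify_read("TAG_A", "TAG_XX"))  # group known, member unknown
```

Returning the group even when the member tag is unrecognized mirrors the text: a read with tag sequence A still belongs to group A regardless of its second tag.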
  • the transposase in the complex is inactive.
  • the method further comprises the step of activating the transposase, e.g., by adding a divalent cation such as Mg2+.
  • the method further comprises adding a modulator of the target molecule-DNA interaction. In one embodiment, the method further comprises comparing the sequencing results of the sample with the added modulator to the sample without the modulator added.
  • the method further comprises altering the reaction conditions for the target molecule-DNA interaction. In one embodiment, the method further comprises comparing the sequencing results of the samples under different reaction conditions.
  • Sequencing results can be qualitative, semi-quantitative, quantitative, or any combination thereof.
  • the sample is a cell or a derivative thereof.
  • the cell is a prokaryotic cell or a eukaryotic cell.
  • the sample is a nucleus, cytoplasm or organelle or a derivative thereof.
  • the sample is a cell lysate.
  • the method includes the step of permeabilizing the cells, eg, adding digitonin.
  • the DNA is genomic, chromosomal or chromatin, eg, prokaryotic or eukaryotic.
  • Figure 1 shows the library quality assessment of Example 2.
  • Figure 2 shows the TSS enrichment of Example 2.
  • Figure 3 shows the IGV view of Example 2.
  • Figure 4 shows the library quality assessment of Example 3.
  • Figure 5 shows the TSS enrichment of Example 3.
  • Figure 6 shows the IGV view of Example 3.
  • Figure 7 shows a schematic diagram of an exemplary embodiment of a nucleic acid library constructed in the present application.
  • the transposase used in the embodiments is Hyperactive pG-Tn5 Transposase for CUT&Tag (Item No. S602) or Hyperactive pA-Tn5 Transposase for CUT&Tag (Item No. S603) from Nanjing Vazyme Biotech Co., Ltd.
  • the H3K4me2 antibody was from Abcam, catalog number: ab11946; the CTCF antibody was from CST, catalog number: 3418S; the RNA Pol II antibody was from Abcam, catalog number: ab817; the H3K27me3 antibody was from CST, catalog number: #9733S;
  • the method of this application is universal and applicable to various sequencing platforms, such as ion torrent platform, illumina platform and BGI platform.
  • the embodiment takes the illumina platform as an example. If other sequencing platforms are used, it is only necessary to replace the sequences of the immobilized probes and sequencing primers used by the illumina platform in the following oligonucleotides or their reverse complementary sequences with the corresponding sequences of other platforms.
  • Oligonucleotide 1 (SEQ ID NO: 10):
  • Oligonucleotide 2 (SEQ ID NO: 11):
  • Oligonucleotide 3 (SEQ ID NO: 12):
  • oligonucleotide 2 and oligonucleotide 3 may delete several bases of the 5' portion of the italicized segment, leaving at least four bases of the 3' portion of the italicized segment.
  • Amplification primer 1 (SEQ ID NO: 1): 5'-AATGATACGGCGACCACCGAGATCTACAC-3'
  • Amplification primer 2 (SEQ ID NO: 3): 5'-CAAGCAGAAGACGGCATACGAGAT-3'
  • Amplification primer 1 (N5) is the same as the complete italicized segment of oligonucleotide 2
  • amplification primer 2 (N7) is the same as the complete italicized segment of oligonucleotide 3.
  • Oligonucleotide 1' (SEQ ID NO: 9): 5'-phos- AGATGTGTATAAGAGACAG - NH2-3'
  • Oligonucleotide 2' (SEQ ID NO: 13):
  • Oligonucleotide 3' (SEQ ID NO: 14):
  • Oligonucleotide 2, Oligonucleotide 2', Oligonucleotide 3, and Oligonucleotide 3' above constitute yet another alternative embodiment of the present application.
  • the Illumina library structure is as follows:
  • -MMMMMM- represents the insertion sequence (the length of the insertion sequence is 6 nucleotides is exemplary, not limiting), and other segments have the same meanings as above.
  • reaction 1 and reaction 2 were vortexed to mix well, and centrifuged briefly to return the solution to the bottom of the tube. Put it in the PCR machine and carry out the following reaction procedures:
  • the reaction was placed at 30°C for 1 hour.
  • the reaction product is a transposome (adapter-transposase complex), which can be directly used in subsequent experiments or stored at -30 to -15°C.
  • the final concentration of transposomes prepared according to this reaction system was 4 μM.
  • the transposase was embedded with adaptor pairs containing different indices, and the resulting transposomes were labelled transposome 1, transposome 2, and transposome 3 according to the index used.
  • This example is used to simultaneously study the binding of histone modifications, transcription factors and RNA Pol II to genomic DNA in cells.
  • Wash Buffer 1 from Vazyme, #TD901, Wash Buffer,
  • Wash Buffer 2 from Vazyme, #TD901, Dig-Wash Buffer,
  • Reaction buffer from Vazyme, #TD901, Tagmentation Buffer,
  • Termination buffer from Vazyme, #TD901, Termination Buffer.
  • The purified DNA was amplified directly by PCR to complete library construction.
  • The heated lid was set to 105°C, and the number of amplification cycles was adjusted as needed.
  • Amplification products were purified using VAHTS DNA Clean Beads (Vazyme, #N411) according to the manufacturer's instructions.
  • The library concentration was measured using a Qubit 3.0 Fluorometer (Invitrogen), and the library yield was calculated.
  • The library concentration was 34.8 ng/μL (22 μL elution volume).
  • The completed library was sequenced on the Illumina platform (HiSeq X, PE150).
  • The sequencing results are shown in Table 1 and Figures 2 and 3.
  • This embodiment provides a method for simultaneously studying histone methylation and acetylation modification.
  • The specific procedure of this embodiment is as follows:
  • Library yield was measured with Qubit as described in Example 2; the library concentration was 57.4 ng/μL (22 μL elution volume);
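The library yields implied by the two reported Qubit concentrations can be worked out as concentration multiplied by elution volume. The sketch below is only an illustration of that arithmetic; the function name is hypothetical, and it assumes the full 22 μL eluate is recovered.

```python
def library_yield_ng(concentration_ng_per_ul: float, elution_volume_ul: float) -> float:
    """Total DNA mass in the eluate (ng) = concentration (ng/uL) x volume (uL)."""
    return concentration_ng_per_ul * elution_volume_ul


# Concentrations reported above, each with a 22 uL elution volume.
first_yield = library_yield_ng(34.8, 22.0)
second_yield = library_yield_ng(57.4, 22.0)

print(f"{first_yield:.1f} ng, {second_yield:.1f} ng")
```

Under these assumptions the two libraries contain roughly 766 ng and 1263 ng of DNA, respectively.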

Abstract

Provided are a targeted transposome complex bearing an oligonucleotide tag, and its use for studying interactions between multiple target molecules and DNA.

Description

Methods and tools for the study of multi-target protein-DNA interactions
Technical Field
The present application belongs to the field of biotechnology and relates to an oligonucleotide-tagged targeted transposome complex and its use for studying multi-target protein-DNA interactions.
Background
The regulation of gene expression is the basis of all life activities. According to the central dogma, genetic information is transcribed from DNA into RNA and then translated from RNA into protein. Beyond classical genetics, epigenetics describes the phenomenon in which the expression level of a gene changes, and the change can be stably inherited by offspring, without any change in the nucleotide sequence of the DNA. Epigenetics largely determines when, where, and how a gene is expressed. Common forms of epigenetic regulation include DNA methylation, histone modification, and changes in chromatin conformation.
Interactions between biological macromolecules are involved in regulating the selective expression of genes and their post-transcriptional regulation. DNA-protein interactions are ubiquitous; traditional methods for studying them include the electrophoretic mobility shift assay (EMSA), DNase I footprinting, yeast hybrid systems, and the luciferase reporter gene assay (LRGA).
Chromatin immunoprecipitation (ChIP and ChIP-seq) is widely used to study protein-DNA interactions, typically transcription factor binding sites or sites of specific histone modifications. The basic ChIP workflow is: (1) fix minced tissue or cells with formaldehyde so that DNA and protein are cross-linked into target protein-DNA complexes; (2) fragment the chromatin DNA by sonication, then add a ChIP-grade antibody against the protein of interest, which binds the target protein-DNA complexes; (3) add protein A/G beads, which bind the antibody and thereby capture the complexes, then reverse the cross-links to release the DNA fragments; (4) purify the enriched DNA fragments and determine their sequences with a downstream detection technique (quantitative PCR, microarray, sequencing, etc.). ChIP-seq, which combines ChIP with next-generation sequencing, can efficiently detect, genome-wide, the DNA segments that interact with histones, transcription factors, and the like. In recent years, as the technology has been updated and optimized, CUT&RUN (Cleavage Under Targets & Release Using Nuclease) and CUT&Tag (Cleavage Under Targets & Tagmentation) have emerged.
In April 2019, Steven Henikoff's team at the Fred Hutchinson Cancer Research Center (USA) published the CUT&Tag protocol in Nature Communications. Guided by an antibody, a protein A-Tn5 transposase fusion carrying partial P5 and P7 adapter sequences fragments the DNA near the target protein and, while cutting, attaches the partial P5 and P7 adapter sequences to the two ends of each DNA fragment. PCR amplification then adds the index sequences and the remainder of the adapters, yielding a high-resolution, low-background library.
In August 2019, He Aibin's group at the Institute of Molecular Medicine, Peking University, published CoBATCH (combinatorial barcoding and targeted chromatin release) online in Molecular Cell. Building on CUT&Tag, CoBATCH uses protein A-Tn5 transposomes loaded with adapters carrying different barcodes to cut samples in a targeted manner, so that sample DNA fragmented by the transposase carries different adapter sequences. It is an easy-to-operate, high-throughput, high-quality single-cell ChIP-seq technology.
In October 2020, Xie Wei's group at Tsinghua University described Stacc-seq in an article published in Nature. An antibody is first incubated in vitro with a protein A/G-fused Tn5 transposase so that the two bind; the complex then enters the cell and targets the protein of interest; the transposase is then activated and cuts the DNA near the target protein; PCR amplification then yields a library suitable for next-generation sequencing.
Conventional ChIP-seq is affected by factors such as variable cross-linking and sonication conditions and the antibody used, requires a large input of cells or tissue for library construction, and is difficult to apply to low-input samples and single-cell experiments.
ChIP-seq is prone to false negatives and false positives, and uneven sonication leads to high sequencing background noise.
Although CUT&Tag greatly shortens the experimental time relative to ChIP-seq, its procedure is still cumbersome.
Existing techniques are only suitable for targeting a single protein of interest and cannot study multiple targets simultaneously in the same sample.
Summary of the Invention
The present application provides a technique for studying multi-target DNA-protein interactions simultaneously. For a single experimental sample, it can simultaneously detect two or more proteins of interest and the DNA fragments they interact with, and it obtains low-background library information through high-throughput sequencing. The overall workflow greatly reduces the number of library construction steps, shortens library construction time, lowers the required amount of starting sample, and improves library yield and sequencing data quality, helping to reveal how more histones, transcription factors, and DNA-binding proteins act in vivo.
The basic workflow of the present application is: (1) anneal oligonucleotides containing different index sequences to form adapters, and load a pair of adapters onto protein A-Tn5 transposase or protein G-Tn5 transposase, such that each resulting adapter-transposase complex contains a unique index or index combination; (2) incubate an antibody against a protein of interest with the loaded adapter-transposase complex to form an adapter-transposase-antibody complex, with each antibody corresponding to one index or index combination; (3) collect cells or nuclei and incubate them with the adapter-transposase-antibody complexes, so that the antibodies direct the transposase to the proteins of interest; (4) activate the transposase, which cuts the DNA near the protein of interest and attaches the adapters; (5) inactivate the transposase, purify the fragmented and tagged DNA, and obtain a sequencing-ready library by PCR amplification; and (6) sequence the library using a downstream sequencing technology. Finally, demultiplexing by index or index combination yields the DNA sequences bound by each protein of interest.
The key to multi-target detection in the present application is step (1), in which loading produces adapter-transposase complexes (i.e., transposomes) containing different indexes or index combinations. Each transposome is bound in vitro to an antibody against one protein of interest, forming multiple adapter-transposase-antibody complexes. These complexes are mixed and enter the sample together, each targeting a different protein of interest. Activating the transposase cuts the DNA near each protein of interest, so that the ends of the resulting DNA fragments carry different indexes or index combinations. A library is generated by PCR amplification, and after sequencing, demultiplexing by index or index combination identifies the interactions of the different proteins of interest with DNA.
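The demultiplexing step, in which reads are assigned to a protein of interest according to the index or index combination they carry, can be sketched as follows. This is a minimal illustration under stated assumptions, not the applicant's analysis pipeline: the index sequences, the target labels, and the function name `demultiplex` are all hypothetical placeholders, and reads are assumed to arrive as already-parsed (index 1, index 2, insert) tuples.

```python
from collections import defaultdict

# Hypothetical mapping from an index pair to the antibody target it was
# loaded with; these index sequences and labels are placeholders only.
INDEX_TO_TARGET = {
    ("ATCACG", "CGATGT"): "histone_mark",
    ("TTAGGC", "TGACCA"): "RNA_Pol_II",
}


def demultiplex(reads):
    """Group insert sequences by the target whose index pair they carry.

    `reads` is an iterable of (index1, index2, insert_sequence) tuples.
    Reads whose index pair is not in the mapping go to "unassigned".
    """
    by_target = defaultdict(list)
    for index1, index2, insert in reads:
        target = INDEX_TO_TARGET.get((index1, index2), "unassigned")
        by_target[target].append(insert)
    return dict(by_target)


example_reads = [
    ("ATCACG", "CGATGT", "AAAA"),
    ("TTAGGC", "TGACCA", "CCCC"),
    ("GGGGGG", "GGGGGG", "TTTT"),
]
print(demultiplex(example_reads))
```

An exact-match lookup is used here for simplicity; a practical pipeline might additionally tolerate sequencing errors in the index reads.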
The method of the present application can be run as a one-tube, high-throughput workflow and can be combined "seamlessly" with single-cell sequencing platforms.
High-throughput sequencing: also known as second-generation sequencing or next-generation sequencing (NGS). It refers to technology that sequences hundreds of thousands to millions of DNA molecules in parallel in a single run; the reads obtained are generally short.
Transposase: an enzyme that carries out transposition, usually encoded by a transposon. It recognizes specific sequences at both ends of the transposon, excises the transposon from the adjacent sequence, and inserts it into a new DNA target site, with no homology requirement. Tn5 transposase is one such transposase; it offers good randomness, high stability, and insertion sites that are easy to sequence, and is an efficient tool for molecular genetics and gene sequencing.
A targeting moiety refers to any moiety capable of binding a molecule of interest, preferably an antibody or antibody fragment.
The term "antibody" is used herein in the broadest sense and encompasses a variety of antibody structures, including but not limited to monoclonal antibodies, polyclonal antibodies, multispecific antibodies (e.g., bispecific antibodies), and antibody fragments, so long as they exhibit the desired antigen-binding activity.
"Antibody fragment" refers to a molecule other than an intact antibody that comprises a portion of an intact antibody that binds the antigen to which the intact antibody binds. Examples of antibody fragments include, but are not limited to, Fv, Fab, Fab', Fab'-SH, and F(ab')2; diabodies; linear antibodies; single-chain antibody molecules (e.g., scFv); and multispecific antibodies formed from antibody fragments.
In one aspect, the present application relates to an oligonucleotide pair comprising a first oligonucleotide and a second oligonucleotide, wherein the first oligonucleotide comprises a first transposase recognition sequence, the second oligonucleotide comprises a second transposase recognition sequence, and the first oligonucleotide comprises a first tag sequence and/or the second oligonucleotide comprises a second tag sequence.
In one embodiment, the first transposase recognition sequence is the same as the second transposase recognition sequence. In one embodiment in which the two recognition sequences are the same, the first transposase recognition sequence is in the same orientation as the second. In another embodiment in which the two recognition sequences are the same, the first transposase recognition sequence is in the opposite orientation to the second. In one embodiment, the first transposase recognition sequence is different from the second transposase recognition sequence.
In one embodiment, the first oligonucleotide further comprises a first sequencing solid-phase binding sequence and/or a first sequencing primer binding sequence. In one embodiment, the second oligonucleotide further comprises a second sequencing solid-phase binding sequence and/or a second sequencing primer binding sequence. In one embodiment, the first tag sequence and/or the second tag sequence corresponds to the targeting moiety and/or target molecule described below. In one embodiment, the first oligonucleotide and/or the second oligonucleotide may also comprise one or more additional tag sequences serving other purposes.
In one embodiment, the first oligonucleotide and/or the second oligonucleotide is single-stranded, double-stranded, or a combination thereof. In one embodiment, the first transposase recognition sequence in the first oligonucleotide and/or the second transposase recognition sequence in the second oligonucleotide is double-stranded. In one embodiment, the portion of the first oligonucleotide other than the first transposase recognition sequence (e.g., the first sequencing solid-phase binding sequence, the first tag sequence, and/or the first sequencing primer binding sequence) and/or the portion of the second oligonucleotide other than the second transposase recognition sequence (e.g., the second sequencing solid-phase binding sequence, the second tag sequence, and/or the second sequencing primer binding sequence) is single-stranded. In one embodiment, the first sequencing solid-phase binding sequence and/or the second sequencing solid-phase binding sequence may be truncated and/or extended relative to the first immobilized probe and/or the second immobilized probe on the sequencing solid phase, respectively, for example truncated and/or extended at the 5' end and/or the 3' end by at least 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20 or more nucleotides. In one embodiment, the first sequencing primer binding sequence and/or the second sequencing primer binding sequence may be truncated and/or extended relative to the first sequencing primer and/or the second sequencing primer, respectively, for example truncated and/or extended at the 5' end and/or the 3' end by at least 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20 or more nucleotides.
In one embodiment, the first oligonucleotide comprises, in the 5' to 3' direction, an optional first sequencing solid-phase binding sequence (e.g., identical or reverse complementary to the sequence of the first binding probe of the sequencing solid phase), an optional first tag sequence, an optional first sequencing primer binding sequence (e.g., identical or reverse complementary to the sequence of the first sequencing primer), and the first transposase recognition sequence (plus strand (e.g., AGATGTGTATAAGAGACAG, SEQ ID NO: 9) or minus strand). In one embodiment, the second oligonucleotide comprises, in the 5' to 3' direction, an optional second sequencing solid-phase binding sequence (e.g., identical or reverse complementary to the sequence of the second binding probe of the sequencing solid phase), an optional second tag sequence, an optional second sequencing primer binding sequence (e.g., identical or reverse complementary to the sequence of the second sequencing primer), and the second transposase recognition sequence (plus strand (e.g., AGATGTGTATAAGAGACAG, SEQ ID NO: 9) or minus strand). In the first oligonucleotide, the positions of the first tag sequence and the first sequencing primer binding sequence may be exchanged. In the second oligonucleotide, the positions of the second tag sequence and the second sequencing primer binding sequence may be exchanged.
In one embodiment, the first oligonucleotide comprises, in the 5' to 3' direction, the first transposase recognition sequence (minus strand (e.g., CTGTCTCTTATACACATCT, SEQ ID NO: 10) or plus strand), an optional first sequencing primer binding sequence (e.g., reverse complementary or identical to the sequence of the first sequencing primer), an optional first tag sequence, and an optional first sequencing solid-phase binding sequence (e.g., reverse complementary or identical to the sequence of the first binding probe of the sequencing solid phase). In one embodiment, the second oligonucleotide comprises, in the 5' to 3' direction, the second transposase recognition sequence (minus strand (e.g., CTGTCTCTTATACACATCT, SEQ ID NO: 10) or plus strand), an optional second sequencing primer binding sequence (e.g., reverse complementary or identical to the sequence of the second sequencing primer), an optional second tag sequence, and an optional second sequencing solid-phase binding sequence (e.g., reverse complementary or identical to the sequence of the second binding probe of the sequencing solid phase). In the first oligonucleotide, the positions of the first tag sequence and the first sequencing primer binding sequence may be exchanged. In the second oligonucleotide, the positions of the second tag sequence and the second sequencing primer binding sequence may be exchanged.
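The layout in which the solid-phase binding sequence comes first and the transposase recognition sequence comes last can be illustrated by simple concatenation. The sketch below is only one possible instance: it uses SEQ ID NO: 1, SEQ ID NO: 5 and SEQ ID NO: 9 as given in this description as example elements, while the 8-nucleotide tag `ACGTACGT` and the function name `build_oligo` are hypothetical and chosen for illustration.

```python
# Example element sequences as given in this description.
SOLID_PHASE_BINDING = "AATGATACGGCGACCACCGAGATCTACAC"  # SEQ ID NO: 1
PRIMER_BINDING = "TCGTCGGCAGCGTC"                      # SEQ ID NO: 5
RECOGNITION_PLUS = "AGATGTGTATAAGAGACAG"               # SEQ ID NO: 9 (plus strand)


def build_oligo(tag, solid_phase=SOLID_PHASE_BINDING,
                primer_binding=PRIMER_BINDING,
                recognition=RECOGNITION_PLUS,
                swap_tag_and_primer=False):
    """Concatenate the elements 5'->3'.

    Default order: solid-phase binding sequence, tag, sequencing primer
    binding sequence, transposase recognition sequence. Optional elements
    can be omitted by passing an empty string, and the tag and primer
    binding sequence can exchange positions.
    """
    middle = (primer_binding, tag) if swap_tag_and_primer else (tag, primer_binding)
    return solid_phase + middle[0] + middle[1] + recognition


# "ACGTACGT" is a hypothetical tag, not a sequence from the application.
oligo = build_oligo("ACGTACGT")
print(oligo, len(oligo))
```

The reverse layout, with the recognition sequence at the 5' end, would use the minus-strand and reverse-complement forms of the same elements in the opposite order.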
Binding probes of sequencing solid phases and their sequences are known in the art. Accordingly, the sequencing solid-phase binding sequences of the present application (e.g., the first sequencing solid-phase binding sequence and/or the second sequencing solid-phase binding sequence) are also known in the art or readily obtainable (see the user manuals of the respective sequencing platforms), for example those of the Ion Torrent, Illumina, and BGI platforms. For example, a sequencing solid-phase binding sequence of the present application may be AATGATACGGCGACCACCGAGATCTACAC (SEQ ID NO: 1) or its reverse complement GTGTAGATCTCGGTGGTCGCCGTATCATT (SEQ ID NO: 2), or CAAGCAGAAGACGGCATACGAGAT (SEQ ID NO: 3) or its reverse complement ATCTCGTATGCCGTCTTCTGCTTG (SEQ ID NO: 4), including truncated and/or extended versions thereof.
Sequencing primers and their sequences are known in the art. Accordingly, the sequencing primer binding sequences of the present application (e.g., the first sequencing primer binding sequence and/or the second sequencing primer binding sequence) are also known in the art or readily obtainable (see the user manuals of the respective sequencing platforms), for example those of the Ion Torrent, Illumina, and BGI platforms. For example, a sequencing primer binding sequence of the present application may be TCGTCGGCAGCGTC (SEQ ID NO: 5) or its reverse complement GACGCTGCCGACGA (SEQ ID NO: 6), or GTCTCGTGGGCTCGG (SEQ ID NO: 7) or its reverse complement CCGAGCCCACGAGAC (SEQ ID NO: 8), including truncated and/or extended versions thereof.
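The reverse-complement relationships stated in this description (SEQ ID NO: 1/2, 3/4, 5/6, 7/8 and 9/10) can be checked mechanically. The following is a small illustrative sketch; the helper name `reverse_complement` and the `PAIRS` table are introduced here for illustration, and only unambiguous A/C/G/T bases are handled.

```python
# Translation table mapping each base to its complement.
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an uppercase A/C/G/T sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


# Sequence pairs exactly as given in this description.
PAIRS = {
    "SEQ ID NO: 1/2": ("AATGATACGGCGACCACCGAGATCTACAC", "GTGTAGATCTCGGTGGTCGCCGTATCATT"),
    "SEQ ID NO: 3/4": ("CAAGCAGAAGACGGCATACGAGAT", "ATCTCGTATGCCGTCTTCTGCTTG"),
    "SEQ ID NO: 5/6": ("TCGTCGGCAGCGTC", "GACGCTGCCGACGA"),
    "SEQ ID NO: 7/8": ("GTCTCGTGGGCTCGG", "CCGAGCCCACGAGAC"),
    "SEQ ID NO: 9/10": ("AGATGTGTATAAGAGACAG", "CTGTCTCTTATACACATCT"),
}

for label, (forward, reverse) in PAIRS.items():
    # Each second sequence should be the reverse complement of the first.
    assert reverse_complement(forward) == reverse, label
    print(label, "consistent")
```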
The tag sequences of the present application (e.g., the first tag sequence and/or the second tag sequence) may be any short oligonucleotide, for example at least 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16 or more nucleotides in length. The tag sequences of the present application may be those used by current or future sequencing platforms (including but not limited to the Ion Torrent, Illumina, and BGI platforms).
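To give a sense of how many targets tags of a given length could in principle distinguish, the sketch below counts the possible tags of a given length (four bases per position) and enumerates the combinations available when a first and a second tag are used together. It is a counting illustration only; the function names and the two-nucleotide example tags are hypothetical, and practical tag sets are usually much smaller than the theoretical maximum.

```python
from itertools import product


def tag_space(length: int) -> int:
    """Number of distinct tags of the given length over the bases A, C, G, T."""
    return 4 ** length


def index_combinations(first_tags, second_tags):
    """All (first tag, second tag) pairs; each pair could mark one target."""
    return list(product(first_tags, second_tags))


print(tag_space(2), tag_space(6))

# Hypothetical 2-nt tags: two first tags and three second tags give six pairs.
pairs = index_combinations(["AC", "GT"], ["AA", "CC", "GG"])
print(len(pairs), pairs[0])
```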
In one embodiment, the first oligonucleotide is linked to the second oligonucleotide. In one embodiment in which they are linked, the first oligonucleotide and the second oligonucleotide are joined at the ends opposite their transposase recognition sequences (e.g., at the sequencing solid-phase binding sequence ends). Preferably, a cleavage site, such as a restriction endonuclease recognition site, is present between the first oligonucleotide and the second oligonucleotide.
In one aspect, the present application relates to an oligonucleotide-tagged targeted transposome complex comprising a transposase, an oligonucleotide pair of the present application, and a targeting moiety.
In one embodiment, the targeting moiety is an aptamer, i.e., an oligonucleotide that specifically binds a target molecule. In one embodiment, the targeting moiety is an antibody (including an antibody fragment). In one embodiment, the targeting moiety specifically binds a target molecule that interacts with DNA (e.g., regulates gene expression). In one embodiment, the target molecule is a histone, including isoforms, variants, and fragments thereof. In one embodiment, the target molecule is a DNA polymerase, including isoforms, variants, and fragments thereof. In one embodiment, the target molecule is an RNA polymerase, including isoforms, variants, and fragments thereof. In one embodiment, the target molecule is a transcription factor, such as an ARS-binding factor, an rDNA enhancer-binding protein, a TATA-binding protein, or a CCCTC-binding factor.
In one embodiment, the target molecule is one or more of the following: AAF, abl, ADA2, ADA-NF1, AF-1, AFP1, AhR, AIIN3, ALL-1, α-CBF, α-CP 1,
α-CP2a, α-CP2b, αHo, αH2-αFB, Alx-4, aMEF-2, AML1, AMLla, AMLlb, AMLlc, AML1δN, AML2, AML3, AML3a, AML3b, AMY-1L, A-Myb, ANF, AP-1, AP-2αA, AP-2αB, AP-2β, AP-2γ, AP-3(1), AP-3(2), AP-4, AP-5, APC, AR, AREB6, Arnt, Arnt (774M form), ARP-1, ATBF1-A, ATBF1-B, ATF, ATF-1, ATF-2, ATF-3, ATF-3δZIP, ATF-a, ATF-aδ, ATPF1, Barhll, Barhl2, Barxl, Barx2, Bcl-3, BCL-6, BD73, β-catenin, Binl, B-Myb, BP1, BP2, brahma, BRCA1, Brn-3a, Brn-3b, Brn-4, BTEB, BTEB2 , B-TFIID, C/EBPα, C/EBPβ, C/EBPδ, CACC-binding factor, Cart-1, CBF(4), CBF(5), CBP, CCAAT-binding factor, CCMT-binding factor, CCF, CCG1 , CCK-la, CCK-lb, CD28RC, cdk2, cdk9, Cdx-1, CDX2, Cdx-4, CFF, ChxlO, CLIM1, CLIM2, CNBP, CoS, COUP, CP1, CP1A, CP1C, CP2, CPBP, CPE Binding protein, CREB, CREB-2, CRE-BP1, CRE-BPa, CREMα, CRF, Crx, CSBP-1, CTCF, CTF, CTF-1, CTF-2, CTF-3, CTF-5, CTF-7 , CUP, CUTL1, Cx, Cyclin A, Cyclin T1, Cyclin T2, Cyclin T2a, Cyclin T2b, DAP, DAX1, DB1, DBF4, DBP, DbpA, DbpAv, DbpB, DDB, DDB-1, DDB-2, DEF, δCREB, δMax, DF-1, DF-2, DF-3, Dlx-1, Dlx-2, Dlx-3, DIx4 (long isoform), Dlx-4 ( Short isoform, Dlx-5, Dlx-6, DP-1, DP-2, DSIF, DSIF-pl4, DSIF-pl60, DTF, DUX1, D UX2, DUX3, DUX4, E, E12, E2F, E2F+E4, E2F+pl07, E2F-1, E2F-2, E2F-3, E2F-4, E2F-5, E2F-6, E47, E4BP4, E4F, E4F1, E4TF2, EAR2, EBP-80, EC2, EF1, EF-C, EGR1, EGR2, EGR3, EIIaE-A, EIIaE-B, EIIaE-Cα, EIIaE-Cβ, EivF, EIf-1, EIk-1, Emx-1, Emx-2, Emx-2, En-1, En-2, ENH-bind.prot., ENKTF-1, EPAS 1, εF 1, ER, Erg-1, Erg-2, ERR1, ERR2 , ETF, Ets-1, Ets-1δVil, Ets-2, Evx-1, F2F, Factor 2, Factor name, FBP, f-EBP, FKBP59, FKHL18, FKHRL1P2, Fli-1, Fos, FOXB1, FOXC1, FOXC2 , FOXD1, FOXD2, FOXD3, FOXD4, FOXE1, FOXE3, FOXF1, FOXF2, FOXGla, FOXGlb, FOXGlc, FOXH1, FOXI1, FOXJla, FOXJlb, FOXJ2 (long isoform), FOXJ2 (short isoform), FOXJ3, FOXKla , FOXKlb, FOXKlc, FOXL1, FOXMla, FOXMlb, FOXMlc, FOXN1, FOXN2, FOXN3, FOXOla, FOXOlb, FOX02, FOX03a, FOX03b, FOX04, FOXP1, FOXP3, Fra-1, Fra-2, FTF, FTS, G-factor, G6 factor, GABP, GABP-α, GABP-β1, GABP-β2, GADD153, 
GAF, γCMT, γCACl, γCAC2, GATA-1, GATA-2, GATA-3, GATA-4, GATA-5, GATA-6 , Gbx-1, Gbx-2, GCF, GCMa, GCNS, GF1, GLI, GLI3, GRα, GRβ, GRF-1, Gsc, Gscl, GT-IC, GT-IIA, GT-IIBα, GT-IIBβ, HlTF1 , H1TF2, H2RIIBP, H4TF-1, H4TF-2, HAND1, HAND2, HB9, HDAC1, HDAC2, HDAC3, hDaxx, heat-induced factor, HEB, HEB1-p67, HEB1-p94, HEF-1B, HEF-1T , HEF-4C, HEN1, HEN2, Hesxl, Hex , HIF-1, HIF-1α, HIF-1β, HiNF-A, HiNF-B, HINF-C, HINF-D, HiNF-D3, HiNF-E, HiNF-P, HIP1, HIV-EP2, Hlf, HLTF , HLTF(Metl23), HLX, HMBP, HMG I, HMG I(Y), HMG Y, HMGI-C, HNF-1A, HNF-IB, HNF-1C, HNF-3, HNF-3α, HNF-3β, HNF-3γ, HNF4, HNF-4α, HNF4α1, HNF-4α2, HNF-4α3, HNF-4α4, HNF4γ, HNF-6α, hnRNP K, HOX11, HOXA1, HOXA10, HOXA10PL2, HOXA11, HOXA13, HOXA2, HOXA3, HOXA4 , HOXA5, HOXA6, HOXA7, HOXA9A, HOXA9B, HOXB-1, HOXB13, HOXB2, HOXB3, HOXB4, HOXBS, HOXB6, HOXA5, HOXB7, HOXB8, HOXB9, HOXC10, HOXC11, HOXC12, HOXC13, HOXC4, HOXC5, HOXC6, HOXC8 , HOXC9, HOXD10, HOXD11, HOXD12, HOXD13, HOXD3, HOXD4, HOXD8, HOXD9, Hp55, Hp65, HPX42B, HrpF, HSF, HSF1(long), HSF1(short), HSF2, hsp56, Hsp90, IBP-1, ICER -II, ICER-liγ, ICSBP, Idl, IdlH', Id2, Id3, Id3/Heir-1, IF1, IgPE-1, IgPE-2, IgPE-3, IκB, IκB-α, IκB-β, IκBR, II-1RF, IL-6RE-BP, 11-6RF, INSAF, IPF1, IRF-1, IRF-2, B, IRX2a, Irx-3, Irx-4, ISGF-1, ISGF-3, ISGF3α, ISGF- 3γ, lst-1, ITF, ITF-1, ITF-2, JRF, Jun, JunB, JunD, κy factor, KBP-1, KER1, KER-1, Koxl, KRF-1, Ku autoantigen, KUP, LBP -1, LBP-la, LBX1, LCR-F1, LEF-1, LEF-1B, LF-A1, LHX1, LHX2, LHX3a, LHX3b, LHXS, LHX6.1a, LHX6.1b, LIT-1, Lmol, Lmo2 , LMX1A, LMX1B, L-Myl (long form), L-Myl (short form), L-My2, LSF, LXRα, LyF-1, Lyl-1, M-factor, Madl, MASH-1, Maxl, Max2, MAZ, MAZ1, MB67, MBF1, MBF2, MBF3, MBP-1 ( 1), MBP-1(2), MBP-2, MDBP, MEF-2, MEF-2B, MEF-2C (433AA form), MEF-2C (465AA form), MEF-2C (473M form), MEF- 2C/δ32 (441AA form), MEF-2D00, MEF-2D0B, MEF-2DA0, MEF-2DAO, MEF-2DAB, MEF-2DA'B, Meis-1, Meis-2a, Meis-2b, Meis-2c, Meis-2d, Meis-2e, Meis3, Meoxl, Meoxla, Meox2, MHox(K-2), Mi, MIF-1, Miz-1, MM-1, MOP3, MR, 
Msx-1, Msx-2, MTB -Zf, MTF-1, mtTFl, Mxil, Myb, Myc, Myc 1, Myf-3, Myf-4, Myf-5, Myf-6, MyoD, MZF-1, NCI, NC2, NCX, NELF, NER1, Net, NF Ill-a, NF NF-1, NF-1A, NF-1B, NF-1X, NF-4FA, NF-4FB, NF-4FC, NF-A, NF-AB, NFAT-1, NF- AT3, NF-Atc, NF-Atp, NF-Atx, NfβA, NF-CLEOa, NF-CLEOb, NFδE3A, NFδE3B, NFδE3C, NFδE4A, NFδE4B, NFδE4C, Nfe, NF-E, NFE2, NF-E2p45, NF- E3, NFE-6, NF-Gma, NF-GMb, NF-IL-2A, NF-IL-2B, NF-jun, NF-κB, NF-κB (like), NF-κB1, NF-κB1, pre- NF-κB2, NF-κB2(p49), NF-κB2 precursor, NF-κEl, NF-κE2, NF-κE3, NF-MHCIIA, NF-MHCIIB, NF-muEl, NF-muE2, NF-muE3 , NF-S, NF-X, NF-X1, NF-X2, NF-X3, NFXc, NF-YA, NF-Zc, NF-Zz, NHP-1, NHP-2, NHP3, NHP4, NKX2-5 , NKX2B, NKX2C, NKX2G, NKX3A, NKX3A vl, NKX3A v2, NKX3A v3, NKX3A v4, NKX3B, NKX6A, Nmi, N-My c, N-Oct-2α, N-Oct-2β, N-Oct-3, N-Oct-4, N-Oct-5a, N-Oct-5b, NP-TCII, NR2E3, NR4A2, Nrfl, Nrf- 1. Nrf2, NRF-2βl, NRF-2γl, NRL, NRSF form 1, NRSF form 2, NTF, 02, OCA-B, Oct-1, Oct-2, Oct-2.1, Oct-2B, Oct-2C, Oct-4A, Oct4B, Oct-5, Oct-6, Octa-factor, octamer-binding factor, oct-B2, oct-B3, Otxl, Otx2, OZF, pl07, pl30, p28 regulator, p300, p38erg , p45, p49erg, -p53, p55, p55erg, p65δ, p67, Pax-1, Pax-2, Pax-3, Pax-3A, Pax-3B, Pax-4, Pax-5, Pax-6, Pax- 6/Pd-5a, Pax-7, Pax-8, Pax-8a, Pax-8b, Pax-8c, Pax-8d, Pax-8e, Pax-8f, Pax-9, Pbx-la, Pbx-lb, Pbx-2, Pbx-3a, Pbx-3b, PC2, PC4, PC5, PEA3, PEBP2α, PEBP2β, Pit-1, PITX1, PITX2, PITX3, PKNOX1, PLZF, POB, Pontin52, PPARα, PPARβ, PPARγl, PPARγ2, PPUR, PR, PR A, pRb, PRD1-BF1, PRDI-BFc, Prop-1, PSE1, P-TEFb, PTF, PTFα, PTFβ, PTFδ, PTFγ, Pu box binding factor, Pu box binding factor (BJA-B ), PU.1, PuF, Pur factor, R1, R2, RAR-α1, RAR-β, RAR-β2, RAR-γ, RAR-γ1, RBP60, RBP-Jκ, Rel, RelA, RelB, RFX, RFX1 , RFX2, RFX3, RFXS, RF-Y, RORα1, RORα2, RORα3, RORβ, RORγ, Rox, RPF1, RPGα, RREB-1, RSRFC4, RSRFC9, RVF, RXR-α, RXR-β, SAP-la, SAP lb, SF-1, SHOX2a, SHOX2b, SHOXa, SHOXb, SHP, SIII-pl 10, SIII-pl5, SIII-pl8, SIM', Six-1, Six-2, Six-3, Six-4, Six- 5. 
Six-6, SMAD-1, SMAD-2, S MAD-3, SMAD-4, SMAD-5, SOX-11, SOX-12, Sox-4, Sox-5, SOX-9, Spl, Sp2, Sp3, Sp4, Sph factor, Spi-B, SPIN, SRCAP , SREBP-la, SREBP-lb, SREBP-lc, SREBP-2, SRE-ZBP, SRF, SRY, SRP1, Staf-50, STAT1α, STAT1β, STAT2, STAT3, STAT4, STAT6, T3R, T3R-α1, T3R -α2, T3R-β, TAF(I)110, TAF(I)48, TAF(I)63, TAF(II)100, TAF(II)125, TAF(II)135, TAF(II)170, TAF (II)18, TAF(II)20, TAF(II)250, TAF(II)250Δ, TAF(II)28, TAF(II)30, TAF(II)31, TAF(II)55, TAF(II) )70-α, TAF(II)70-β, TAF(II)70-γ, TAF-I, TAF-II, TAF-L, Tal-1, Tal-lβ, Tal-2, TAR factor, TBP, TBXIA, TBXIB, TBX2, TBX4, TBXS (long isoform), TBXS (short isoform), TCF, TCF-1, TCF-1A, TCF-1B, TCF-1C, TCF-1D, TCF-1E, TCF-1F, TCF-1G, TCF-2α, TCF-3, TCF-4, TCF-4(K), TCF-4B, TCF-4E, TCFβl, TEF-1, TEF-2, tel, TFE3, TFEB , TFIIA, TFIIA-αβ precursor, TFIIA-α/β precursor, TFIIA-γ, TFIIB, TFIID, TFIIE, TFIIE-α, TFIIE-β, TFIIF, TFIIF-α, TFIIF-β, TFIIH, TFIIH*, TFIIH-CAK, TFIIH-Cyclin H, TFIIH-ERCC2/CAK, TFIIH-MAT1, TFIIH-M015, TFIIH-p34, TFIIH-p44, TFIIH-p62, TFIIH-p80, TFIIH-p90, TFII-I, Tf -LF1, Tf-LF2, TGIF, TGIF2, TGT3, THRA1, TIF2, TLE1, TLX3, TMF, TR2, TR2-11, TR2-9, TR3, TR4, TRAP, TREB-1, TREB-2, TREB-3 , TREF1, TREF2, TRF(2), TTF-1, TXRE BP, Tx REF, UBF, UBP-1, UEF-1, UEF-2, UEF-3, UEF-4, USF1, USF2, USF2b, Vav, Vax-2, VDR, vHNF-1A, vHNF1B, vHNF-1C, VITF, WSTF, WT1, WT1I, WT1I-KTS, WT1I-del2, WT1-KTS, WT1-del2, X2BP, XBP-1, XW-V, XX, YAF2, YB-1, YEBP, YYl, ZEB, ZF1, ZF2, ZFX, ZHX1, ZIC2, ZID, ZNF174, ASH1L, ASH2, ATF2, ASXL1, BAP1, bcllO, Bmil, BRG1, CARM1, KAT3A/CBP, CDC73, CHD1, CHD2, CTCF, DNMT1, DOTL1, EHMT1, ESET, EZH1, EZH2, FBXL10, FRP(Plu-1), HD AC 1, HDAC2, HMGA1, hnRNPA1, HP1γ, Hsetlb, JaridlA, JaridlC, KIAA1718JHDM1D, KAT5, KMT4, LSD1, NFKB P100, NSD2, MBD2, MBD3, MLL2, MLL4, P300, pRB, RbAP46/48, RBP1, RbBP5, RING IB, RNApolII P S2, RNApolII PS5, ROC1, sap30, setDB 1, Sf3bl, SIRT1, Sirt6, SMYD1, SP1, SUV39H1, SUZ12, TCF4, TET1, TRRAP, TRX2 , WDR5, WDR77 and/or YY1.
In one embodiment, the transposase is covalently associated with the targeting module, e.g., by fusion or chemical coupling. In one embodiment, the fusion is direct or indirect. In one embodiment, the transposase is preferably non-covalently associated with the targeting module. In one embodiment, the transposase is fused or coupled to one member of a binding pair, and the targeting module is fused or coupled to the other member of the binding pair. In one embodiment, the binding pair is biotin-avidin, biotin-streptavidin, ligand-receptor, enzyme-substrate, or complementary oligonucleotides. In one embodiment in which the targeting module is an antibody, the transposase is fused to an antibody-binding protein. In one embodiment, the antibody-binding protein is protein A, protein G, an Fc receptor, or a secondary antibody.
In one embodiment of any of the above aspects, the transposase is a transposase known in the art or discovered in the future, such as Tn5 transposase, Mu transposase, IS5 transposase, or IS91 transposase, including wild-type and mutant forms (see, e.g., CN1367840A, CN109400714A, US6406896B1, US20040235103A1). In one embodiment, the transposase is a hyperactive Tn5 transposase, such as EK/LP Tn5 transposase. In one embodiment, the transposase is a Tn5 transposase mutant, e.g., one comprising one or more of the substitutions E58V, L372Q, E344K, D97E, D188E, and E326D.
In one embodiment, the transposase recognition sequence is a transposase recognition sequence known in the art or discovered in the future, e.g., a Tn5-type transposase recognition sequence such as the inside end (IE) or outside end (OE), including wild-type and mutant forms thereof, as well as the methylated form (ME), e.g., the 19 bp Tn5 core end sequence (AGATGTGTATAAGAGACAG, SEQ ID NO: 9) or its reverse complement (CTGTCTCTTATACACATCT, SEQ ID NO: 10). In one embodiment, the transposase recognition sequence is a Mu transposase recognition sequence, an IS5 transposase recognition sequence, or an IS91 transposase recognition sequence, including wild-type and mutant forms.
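By way of non-limiting illustration, the stated relationship between SEQ ID NO: 9 and SEQ ID NO: 10 can be checked with a short script. This is only a sketch; the helper name `reverse_complement` is introduced here for illustration and is not part of the application.

```python
# Illustrative check that SEQ ID NO: 10 is the reverse complement of SEQ ID NO: 9.
# The helper name below is hypothetical and used only for this sketch.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an uppercase DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


SEQ_ID_NO_9 = "AGATGTGTATAAGAGACAG"   # 19 bp Tn5 core end sequence
SEQ_ID_NO_10 = "CTGTCTCTTATACACATCT"  # stated reverse complement

print(reverse_complement(SEQ_ID_NO_9) == SEQ_ID_NO_10)  # True
```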
In one embodiment, the first tag sequence in the first oligonucleotide and/or the second tag sequence in the second oligonucleotide is specific to the targeting module.
In one aspect, the present application relates to a mixture comprising at least a first complex and a second complex of the present application, wherein the targeting module in the first complex specifically binds a first target molecule, the targeting module in the second complex specifically binds a second target molecule, and the first target molecule is different from the second target molecule.
In one embodiment, the mixture of the present application involves one set of targeting modules. In one embodiment, different targeting modules in the set correspond to the same first tag sequence and different second tag sequences. In one embodiment, different targeting modules in the set correspond to different first tag sequences and the same second tag sequence. In one embodiment, different targeting modules in the set correspond to different first tag sequences and different second tag sequences.
In one embodiment, the mixture of the present application involves multiple sets of targeting modules. In one embodiment, different sets among the multiple sets of targeting modules correspond to different first tag sequences, and different targeting modules within the same set correspond to the same first tag sequence and different second tag sequences. In one embodiment, different sets among the multiple sets of targeting modules correspond to different second tag sequences, and different targeting modules within the same set correspond to the same second tag sequence and different first tag sequences.
In one aspect, the present application relates to a method of preparing a nucleic acid library for simultaneously studying the interaction of multiple target molecules with DNA, comprising: obtaining a mixture of the present application comprising multiple complexes directed against the multiple target molecules, i.e., comprising a complex for each of the multiple target molecules; obtaining a sample in which the multiple target molecules interact with DNA; reacting the mixture with the sample such that each targeting module binds its corresponding target molecule and the transposase fragments the DNA and adds the corresponding tag sequences to both ends of the DNA fragments; and recovering the tagged DNA fragments to obtain the nucleic acid library. In one embodiment, the method further comprises purifying and/or amplifying the recovered DNA fragments.
In one aspect, the present application relates to a method of simultaneously identifying the sites of action of multiple target molecules on DNA, comprising: obtaining a mixture of the present application comprising multiple complexes directed against the multiple target molecules, i.e., comprising a complex for each of the multiple target molecules; obtaining a sample in which the multiple target molecules act on DNA; reacting the mixture with the sample such that each targeting module binds its corresponding target molecule and the transposase fragments the DNA and adds the corresponding tag sequences to both ends of the DNA fragments; recovering the tagged DNA fragments; and sequencing the recovered DNA fragments, wherein a sequenced read corresponding to a tag sequence indicates the site of action on the DNA of the target molecule corresponding to that tag sequence. In one embodiment, the method further comprises purifying and/or amplifying the recovered DNA fragments.
In one embodiment, the method further comprises analyzing the sequencing results. In one embodiment, analyzing the sequencing results comprises pooling the sequencing reads that correspond to (e.g., contain) the same first tag sequence and/or the same second tag sequence. For example, where the targeting module for target molecule A corresponds to tag sequences A1 and A2 and the targeting module for target molecule B corresponds to tag sequences B1 and B2, sequencing reads corresponding to tag sequence A1 or A2 are assigned to target molecule A, and the insert sequences in those reads are DNA sites that interact with target molecule A; sequencing reads corresponding to tag sequence B1 or B2 are assigned to target molecule B, and the insert sequences in those reads are DNA sites that interact with target molecule B. As another example, where the targeting modules for the group A target molecules all correspond to the same tag sequence A, and the targeting modules for target molecules AB1, AB2, ... within group A correspond to unique tag sequences B1, B2, ..., respectively, sequencing reads corresponding to tag sequence A are all assigned to the group A target molecules, sequencing reads corresponding to tag sequence B1 are assigned to target molecule AB1, sequencing reads corresponding to tag sequence B2 are assigned to target molecule AB2, and so on.
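By way of non-limiting illustration, the first assignment rule described above (tags A1/A2 map to target molecule A, tags B1/B2 map to target molecule B) can be sketched as follows. The tag sequences, the read representation as `(first_tag, second_tag, insert)` tuples, the function name `assign_reads`, and the handling of conflicting or unknown tags under `"unassigned"` are all assumptions made for this sketch and are not taken from the application.

```python
from collections import defaultdict

# Hypothetical 8-nt tag sequences; the actual index sequences are not reproduced here.
TAG_TO_TARGET = {
    "AAAAAAAA": "A",  # tag A1
    "CCCCCCCC": "A",  # tag A2
    "GGGGGGGG": "B",  # tag B1
    "TTTTTTTT": "B",  # tag B2
}


def assign_reads(reads):
    """Group insert sequences by the target molecule their tags map to.

    Each read is a (first_tag, second_tag, insert) tuple. A read is assigned
    to a target if its recognized tags all map to that one target. Reads whose
    tags map to two different targets, or to none, go under "unassigned"
    (an assumption of this sketch).
    """
    by_target = defaultdict(list)
    for first_tag, second_tag, insert in reads:
        targets = {TAG_TO_TARGET.get(first_tag), TAG_TO_TARGET.get(second_tag)}
        targets.discard(None)
        key = targets.pop() if len(targets) == 1 else "unassigned"
        by_target[key].append(insert)
    return dict(by_target)


reads = [
    ("AAAAAAAA", "CCCCCCCC", "ACGTAC"),  # A1 + A2 -> target A
    ("GGGGGGGG", "TTTTTTTT", "TTGACA"),  # B1 + B2 -> target B
    ("AAAAAAAA", "TTTTTTTT", "GGGCCC"),  # A1 + B2 -> conflicting tags
]
assigned = assign_reads(reads)
print(assigned)
```

The hierarchical scheme of the second example could be handled in the same way by mapping the group-level tag and the member-level tag through two separate dictionaries.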
In one embodiment, the transposase in the complex is inactive. In one embodiment, the method further comprises a step of activating the transposase, e.g., by adding a divalent cation such as Mg²⁺.
In one embodiment, the method further comprises adding a modulator of the target molecule-DNA interaction. In one embodiment, the method further comprises comparing the sequencing results of a sample to which the modulator has been added with those of a sample to which no modulator has been added.
In one embodiment, the method further comprises altering the reaction conditions of the target molecule-DNA interaction. In one embodiment, the method further comprises comparing the sequencing results of samples under different reaction conditions.
The sequencing results may be qualitative, semi-quantitative, quantitative, or any combination thereof.
In one embodiment, the sample is a cell or a derivative thereof. In one embodiment, the cell is a prokaryotic cell or a eukaryotic cell. In one embodiment, the sample is a nucleus, cytoplasm, or organelle, or a derivative thereof. In one embodiment, the sample is a cell lysate. In one embodiment, the method comprises a step of permeabilizing the cells, e.g., by adding digitonin.
In one embodiment, the DNA is a genome, chromosome, or chromatin, e.g., of a prokaryote or eukaryote.
Description of the Drawings
Figure 1 shows the library quality assessment of Example 2.
Figure 2 shows the TSS enrichment of Example 2.
Figure 3 shows the IGV view of Example 2.
Figure 4 shows the library quality assessment of Example 3.
Figure 5 shows the TSS enrichment of Example 3.
Figure 6 shows the IGV view of Example 3.
Figure 7 shows a schematic diagram of an exemplary embodiment of a nucleic acid library constructed according to the present application.
Detailed Description
Materials
The transposase used in the examples is Hyperactive pG-Tn5 Transposase for CUT&Tag (Cat. No. S602) or Hyperactive pA-Tn5 Transposase for CUT&Tag (Cat. No. S603) from Nanjing Vazyme Biotech Co., Ltd.
The H3K4me2 antibody was from Abcam, Cat. No. ab11946; the CTCF antibody was from CST, Cat. No. 3418S; the RNA Pol II antibody was from Abcam, Cat. No. ab817; the H3K27me3 antibody was from CST, Cat. No. #9733S; the H3K27ac antibody was from Abcam, Cat. No. ab4729.
The method of the present application is general and applicable to various sequencing platforms, such as the Ion Torrent, Illumina, and BGI platforms. The examples use the Illumina platform. If another sequencing platform is used, it is only necessary to replace the sequences of the immobilized probes and sequencing primers used by the Illumina platform (or their reverse complements) in the oligonucleotides below with the corresponding sequences of the other platform.
Oligonucleotide 1 (SEQ ID NO: 10):
5’-phos-CTGTCTCTTATACACATCT-NH2-3’
Oligonucleotide 2 (SEQ ID NO: 11):
Figure PCTCN2021143623-appb-000001
Oligonucleotide 3 (SEQ ID NO: 12):
Figure PCTCN2021143623-appb-000002
where
Figure PCTCN2021143623-appb-000003
represents the index sequence (the index length of 8 nucleotides is exemplary, not limiting); the bold segment is the sequencing primer binding sequence (identical to the sequence of the primer used for sequencing); the underlined segment is the methylated 19 bp core end sequence bound by the transposase; and the italicized segment is the sequencing chip binding sequence (identical to the sequence of the immobilized probe used on the sequencing chip). In alternative embodiments, oligonucleotide 2 and oligonucleotide 3 may omit several bases from the 5’ portion of the italicized segment, retaining at least four bases of the 3’ portion of the italicized segment.
Amplification primer 1 (SEQ ID NO: 1): 5’-AATGATACGGCGACCACCGAGATCTACAC-3’
Amplification primer 2 (SEQ ID NO: 3): 5’-CAAGCAGAAGACGGCATACGAGAT-3’
Amplification primer 1 (N5) is identical to the complete italicized segment of oligonucleotide 2, and amplification primer 2 (N7) is identical to the complete italicized segment of oligonucleotide 3.
An alternative embodiment of the three oligonucleotide sequences above is as follows:
Oligonucleotide 1’ (SEQ ID NO: 9): 5’-phos-AGATGTGTATAAGAGACAG-NH2-3’
Oligonucleotide 2’ (SEQ ID NO: 13):
Figure PCTCN2021143623-appb-000004
Oligonucleotide 3’ (SEQ ID NO: 14):
Figure PCTCN2021143623-appb-000005
Oligonucleotide 2, oligonucleotide 2’, oligonucleotide 3, and oligonucleotide 3’ above constitute yet another alternative embodiment of the present application.
Indexes used in Example 2:
Figure PCTCN2021143623-appb-000006
Indexes used in Example 3:
Figure PCTCN2021143623-appb-000007
The Illumina library structure is as follows:
Figure PCTCN2021143623-appb-000008
where -MMMMMM- represents the insert sequence (the insert length of 6 nucleotides is exemplary, not limiting), and the other segments are as defined above.
Example 1: Preparation of transposomes (adapter-transposase complexes)
Transposome assembly was carried out according to the S602 product manual, as follows:
1. Adapter annealing:
(1) Dissolve oligonucleotide 1, oligonucleotide 2, and oligonucleotide 3 separately in Annealing Buffer (Vazyme, #S602) to 100 μM;
(2) Mix oligonucleotide 1 and oligonucleotide 2 in equimolar amounts to obtain Reaction 1, and mix oligonucleotide 1 and oligonucleotide 3 in equimolar amounts to obtain Reaction 2;
(3) Vortex Reaction 1 and Reaction 2 separately to mix thoroughly, and centrifuge briefly to bring the solution to the bottom of the tube. Place the tubes in a PCR instrument and run the following program:
Heated lid: 105 °C

| Temperature | Time |
| --- | --- |
| 75 °C | 15 min |
| 60 °C | 10 min |
| 50 °C | 10 min |
| 40 °C | 10 min |
| 25 °C | 30 min |

(4) Combine equal volumes of Reaction 1 and Reaction 2 and mix well. The mixture is designated Adapter Mix and is stored at -30 to -15 °C.
2. Assembly of transposomes (adapter-transposase complexes)
(1) Add the reaction components in order to a sterile PCR tube:
Figure PCTCN2021143623-appb-000009
Figure PCTCN2021143623-appb-000010
(2) Mix well.
(3) Incubate at 30 °C for 1 hour. The reaction product is the transposome (adapter-transposase complex), which can be used directly in subsequent experiments or stored at -30 to -15 °C.
The final concentration of transposome prepared with this reaction system is 4 μM.
The transposase was loaded with adapter pairs containing different indexes, and the resulting transposomes were labeled transposome 1, transposome 2, transposome 3, and so on, according to the index used.
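By way of non-limiting illustration, the annealing program of step 1 (3) can be written down as data and checked for its total hold time and its stepwise cooling profile. The variable and function names are introduced only for this sketch; the temperatures and hold times are those listed in the program above.

```python
# Annealing program of Example 1, step 1 (3): (temperature in °C, hold time in minutes).
# Names are hypothetical; values are taken from the program listed in the text.
ANNEALING_PROGRAM = [(75, 15), (60, 10), (50, 10), (40, 10), (25, 30)]
HEATED_LID_C = 105


def total_minutes(program):
    """Sum the hold times of all steps (ramp times are ignored)."""
    return sum(minutes for _, minutes in program)


def is_stepwise_cooling(program):
    """Return True if every step is cooler than the one before it."""
    temps = [temp for temp, _ in program]
    return all(prev > nxt for prev, nxt in zip(temps, temps[1:]))


print(total_minutes(ANNEALING_PROGRAM))        # 75
print(is_stepwise_cooling(ANNEALING_PROGRAM))  # True
```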
Example 2
This example is used to simultaneously study the binding of histone modifications, transcription factors, and RNA Pol II to genomic DNA in cells.
The wash buffers, reaction buffer, and termination buffer used are as follows:
Wash buffer 1: Wash Buffer from Vazyme, #TD901;
Wash buffer 2: Dig-Wash Buffer from Vazyme, #TD901;
Reaction buffer: Tagmentation Buffer from Vazyme, #TD901;
Termination buffer: Termination Buffer from Vazyme, #TD901.
The specific procedure of this example is as follows:
1. Take three 1.5 mL EP tubes, add 10 μL of wash buffer 2, and add 1 μL of assembled transposome 1, transposome 2, and transposome 3, respectively; add 0.5 μg of the corresponding antibody to each of the three EP tubes and incubate at 4 °C for 30 min so that the transposome binds the antibody fully, yielding three transposome-antibody complexes: transposome 1-H3K4me2 antibody, transposome 2-CTCF antibody, and transposome 3-RNA Pol II antibody;
2. Collect about 100,000 293T cells from routine in vitro culture, wash once with PBS, pellet the cells by centrifugation, and wash once with wash buffer 1;
3. Resuspend the cells in wash buffer 2, add the three transposome-antibody complexes obtained in step 1 at the same time, and incubate with rotation at 4 °C or room temperature for 30 min;
4. Wash the cells three times with wash buffer 2 to remove unbound transposome-antibody complexes;
5. Resuspend the cells in reaction buffer and incubate at 37 °C for 30 min;
6. Add termination buffer to stop the reaction, and purify the DNA by phenol-chloroform extraction;
7. Use the purified DNA directly for PCR amplification to complete library construction.
Amplification system:

| Component | Volume |
| --- | --- |
| DNA purification product | 24 μL |
| 5× TAB (Vazyme, #TD501) | 10 μL |
| TAE (Vazyme, #TD501) | 1 μL |
| N5 primer (4 μM) | 5 μL |
| N7 primer (4 μM) | 5 μL |
| ddH₂O | 5 μL |

PCR program: set the heated lid to 105 °C and adjust the number of amplification cycles as needed.
Figure PCTCN2021143623-appb-000011
8. Purification of the amplification product
Purify the amplification product with VAHTS DNA Clean Beads (Vazyme, #N411) according to the manufacturer's instructions.
9. Measurement of library concentration with Qubit
Measure the concentration of the resulting library with a Qubit 3.0 Fluorometer (Invitrogen) and calculate the library yield. The library concentration was 34.8 ng/μL (22 μL elution volume).
10. Assessment of library quality with the Agilent 2100 Bioanalyzer
Take 1 μL of the purified PCR product and analyze it with the Agilent DNA 1000 Kit (Agilent, Cat. No. 5067-1504). The results are shown in Figure 1.
11. Sequencing
The completed library was sequenced on the Illumina platform (HiSeq X, PE150). The sequencing results are shown in Table 1 and in Figures 2 and 3.
Table 1

| Sample | H3K4me2 | CTCF | RNA Pol II |
| --- | --- | --- | --- |
| Clean reads | 29798904 | 19719804 | 14156980 |
| Q20 | 0.97675 | 0.98335 | 0.9757 |
| Q30 | 0.93815 | 0.954 | 0.9362 |
| Mapping rate | 96.41% | 95.18% | 97.64% |
| Duplicate rate | 24.91% | 23.60% | 17.67% |
| Peak number | 17178 | 18132 | 15024 |
As the data in Table 1 show (number of reads, Q20, Q30, mapping rate, duplicate rate, and peak number), the library obtained with this mtChIP-seq workflow is of high quality and has a low duplicate rate. The TSS enrichment in Figure 2 and the IGV view in Figure 3 show that simultaneous detection of histones, transcription factors, and RNA Pol II in the same sample gives good enrichment and a high library signal-to-noise ratio.
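By way of non-limiting illustration, the quantities given in steps 7 and 9 of this example can be cross-checked arithmetically: the total volume of the amplification system, the final concentration of each primer in it, and the library yield from the Qubit concentration and elution volume. The variable and function names are introduced only for this sketch; the input values are those stated in the example.

```python
# Volumes (µL) of the amplification system in step 7 of Example 2.
# Names are hypothetical; values are those listed in the text.
PCR_MIX_UL = {
    "DNA purification product": 24,
    "5x TAB": 10,
    "TAE": 1,
    "N5 primer (4 uM)": 5,
    "N7 primer (4 uM)": 5,
    "ddH2O": 5,
}


def final_concentration(stock, volume_ul, total_ul):
    """Dilution: final concentration = stock concentration * volume / total volume."""
    return stock * volume_ul / total_ul


total_ul = sum(PCR_MIX_UL.values())
primer_final_um = final_concentration(4.0, 5, total_ul)

# Library yield (ng) = concentration (ng/µL) * elution volume (µL), step 9.
library_yield_ng = 34.8 * 22

print(total_ul)                    # 50
print(primer_final_um)             # 0.4
print(round(library_yield_ng, 1))  # 765.6
```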
Example 3
This example provides a method for simultaneously studying histone methylation and acetylation. The specific procedure of this example is as follows:
1. Weigh 100 mg of fresh liver from an adult C57BL/6 mouse and extract tissue nuclei with a nuclear extraction kit (Solarbio, Cat. No. SN0020);
2. Take two 1.5 mL EP tubes, add 10 μL of wash buffer 2, and add 1 μL of assembled transposome 4 and transposome 5, respectively; add 0.5 μg of the corresponding antibody to each of the two EP tubes and incubate at 4 °C for 30 min so that the transposome binds the antibody fully, yielding transposome 4-H3K27me3 antibody and transposome 5-H3K27ac antibody;
3. Take 2 μL of tissue nuclei, resuspend in wash buffer 1, centrifuge, and remove the supernatant;
4. Resuspend the nuclei in wash buffer 2, add the two transposome-antibody complexes obtained in step 2 at the same time, and incubate with rotation at 4 °C or room temperature for 30 min;
5. Wash the nuclei three times with wash buffer 2 to remove unbound transposome-antibody complexes;
6. Resuspend the nuclei in reaction buffer and incubate at 37 °C for 30 min;
7. Add termination buffer to stop the reaction, and purify the DNA by phenol-chloroform extraction;
8. As described in Example 2, use the purified DNA directly for PCR amplification to complete library construction;
9. As described in Example 2, purify the amplification product with magnetic beads;
10. As described in Example 2, measure the library yield with Qubit; the library concentration was 57.4 ng/μL (22 μL elution volume);
11. As described in Example 2, assess library quality with the Agilent 2100 Bioanalyzer. The results are shown in Figure 4;
12. As described in Example 2, use the completed library for next-generation sequencing. The sequencing results are shown in Table 2 and in Figures 5 and 6.
Table 2
Sample          H3K27me3   H3K27ac
Clean reads     18177398   25042324
Q20             0.98165    0.98225
Q30             0.9458     0.94655
Mapping rate    95.45%     97.16%
Duplicate rate  19.52%     16.39%
Peak number     44069      29598
As the data in Table 2 show, judged by read count, Q20, Q30, mapping rate, duplication rate, and peak number, the library obtained with this mtChIP-seq workflow is of high quality and has a low duplication rate. The TSS enrichment in Figure 5 and the IGV view in Figure 6 show that histone methylation and acetylation modifications can be detected simultaneously in the same sample, with good enrichment and a high library signal-to-noise ratio.
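The figures reported for Example 3 can be turned into two derived quantities with simple arithmetic. The sketch below is illustrative only: it assumes that the Qubit concentration applies to the whole 22 μL eluate, and that the duplicate rate is the fraction of clean reads flagged as duplicates, so that the remainder approximates the number of non-duplicate reads. The function names are hypothetical and not part of the described method.

```python
# Values copied from step 10 and Table 2 of Example 3.
LIBRARY_CONC_NG_PER_UL = 57.4
ELUTION_VOLUME_UL = 22.0

TABLE_2 = {
    "H3K27me3": {"clean_reads": 18177398, "duplicate_rate": 0.1952},
    "H3K27ac": {"clean_reads": 25042324, "duplicate_rate": 0.1639},
}


def total_yield_ng(conc_ng_per_ul: float, volume_ul: float) -> float:
    """Total library mass, assuming the concentration holds for the full eluate."""
    return conc_ng_per_ul * volume_ul


def estimated_unique_reads(clean_reads: int, duplicate_rate: float) -> int:
    """Approximate non-duplicate reads (assumption: rate = duplicates / clean reads)."""
    return round(clean_reads * (1.0 - duplicate_rate))


library_yield = total_yield_ng(LIBRARY_CONC_NG_PER_UL, ELUTION_VOLUME_UL)
unique_reads = {
    sample: estimated_unique_reads(row["clean_reads"], row["duplicate_rate"])
    for sample, row in TABLE_2.items()
}

print(f"Total library yield: {library_yield:.1f} ng")
for sample, n in unique_reads.items():
    print(f"{sample}: about {n} non-duplicate reads")
```

Under these assumptions the library contains roughly 1.26 μg of DNA, and each target retains well over ten million non-duplicate reads.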

Claims (13)

  1. An oligonucleotide pair comprising a first oligonucleotide and a second oligonucleotide, wherein:
    the first oligonucleotide comprises a first transposase recognition sequence,
    the second oligonucleotide comprises a second transposase recognition sequence, and
    the first oligonucleotide comprises a first tag sequence and/or the second oligonucleotide comprises a second tag sequence.
  2. The oligonucleotide pair of claim 1, wherein the first oligonucleotide further comprises a first sequencing solid-phase binding sequence and/or a first sequencing primer binding sequence.
  3. The oligonucleotide pair of claim 1, wherein the second oligonucleotide further comprises a second sequencing solid-phase binding sequence and/or a second sequencing primer binding sequence.
  4. The oligonucleotide pair of claim 1, wherein the first sequencing solid-phase binding sequence and/or the second sequencing solid-phase binding sequence is any one of the sequences shown in SEQ ID NOs: 1 to 4,
    optionally, the first sequencing primer binding sequence and/or the second sequencing primer binding sequence is any one of the sequences shown in SEQ ID NOs: 5 to 8,
    optionally, the first oligonucleotide and/or the second oligonucleotide is single-stranded, double-stranded, or a combination thereof,
    optionally, the first oligonucleotide comprises, in the 5' to 3' direction, an optional first sequencing solid-phase binding sequence, an optional first tag sequence, an optional first sequencing primer binding sequence, and the first transposase recognition sequence,
    optionally, the second oligonucleotide comprises, in the 5' to 3' direction, an optional second sequencing solid-phase binding sequence, an optional second tag sequence, an optional second sequencing primer binding sequence, and the second transposase recognition sequence,
    optionally, the first oligonucleotide is linked to the second oligonucleotide, preferably at the end remote from the transposase recognition sequence, and optionally a cleavage site is present between the first oligonucleotide and the second oligonucleotide.
  5. An oligonucleotide-tagged targeting transposome complex comprising a transposase, the oligonucleotide pair of claim 1, and a targeting module,
    optionally, the transposase is a Tn5 transposase, optionally an EK/LP Tn5 transposase,
    optionally, the transposase recognition sequence is a Tn5-type transposase recognition sequence, optionally the 19-bp Tn5 core end sequence shown in SEQ ID NO: 9 or 10,
    optionally, the targeting module is an antibody; optionally, the targeting module specifically binds a target molecule that interacts with DNA and, optionally, regulates gene expression; optionally, the target molecule is a histone or a transcription factor,
    optionally, the transposase is covalently or non-covalently associated with the targeting module,
    the first tag sequence in the first oligonucleotide and/or the second tag sequence in the second oligonucleotide is specific to the targeting module.
  6. The complex of claim 5, wherein the transposase is covalently associated with the targeting module, optionally by fusion or chemical coupling, the fusion optionally being direct or indirect.
  7. The complex of claim 6, wherein the transposase is non-covalently associated with the targeting module; optionally, the transposase is fused to one member of a binding pair and the targeting module is fused to the other member of the binding pair; optionally, the binding pair is biotin-avidin, biotin-streptavidin, ligand-receptor, enzyme-substrate, or complementary oligonucleotides.
  8. The complex of claim 7, wherein the targeting module is an antibody and the transposase is fused to an antibody-binding protein; optionally, the antibody-binding protein is protein A, protein G, an Fc receptor, or a secondary antibody.
  9. A mixture comprising at least a first complex and a second complex, each according to any one of claims 5 to 8,
    wherein the targeting module in the first complex specifically binds a first target molecule, the targeting module in the second complex specifically binds a second target molecule, and the first target molecule differs from the second target molecule.
  10. A method of preparing a nucleic acid library for simultaneously studying multiple target molecule-DNA interactions, comprising:
    obtaining the mixture of claim 9, which comprises multiple complexes directed against multiple target molecules;
    obtaining a sample in which the multiple target molecules interact with DNA;
    reacting the mixture with the sample so that each targeting module binds its corresponding target molecule, and the transposase fragments the DNA and adds the corresponding tag sequences to both ends of the DNA fragments; and
    recovering the tagged DNA fragments, optionally purifying and/or amplifying them, to obtain the nucleic acid library.
  11. A method for simultaneously identifying the sites at which multiple target molecules act on DNA, comprising:
    obtaining the mixture of claim 9, which comprises multiple complexes directed against multiple target molecules;
    obtaining a sample in which the multiple target molecules interact with DNA;
    reacting the mixture with the sample so that each targeting module binds its corresponding target molecule, and the transposase fragments the DNA and adds the corresponding tag sequences to both ends of the DNA fragments;
    recovering the tagged DNA fragments, optionally purifying and/or amplifying them; and
    sequencing the recovered DNA fragments, wherein the sequence read associated with a given tag sequence indicates the site on the DNA at which the target molecule corresponding to that tag sequence acts.
  12. The method of claim 10 or 11, wherein:
    the sample is a cell or a derivative thereof, optionally a nucleus, cytoplasm, or organelle or a derivative thereof, optionally a cell lysate; or
    the target molecule is a histone or a transcription factor; or
    the DNA is a genome, a chromosome, or chromatin.
  13. The method of claim 10 or 11, further comprising:
    optionally, activating the transposase; and/or
    optionally, inactivating the transposase.
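
The tag logic recited in claims 4, 5, and 11 can be illustrated with a short sketch: each oligonucleotide is laid out 5' to 3' as solid-phase binding sequence, tag, sequencing primer binding sequence, and transposase recognition sequence, and after sequencing each fragment is assigned to a target molecule by its tag. Everything below is a hypothetical illustration rather than the applicant's implementation. The solid-phase and primer-site strings are arbitrary placeholders (not SEQ ID NOs: 1 to 8), the two 6-nt tags are invented, the recognition sequence is the commonly cited 19-nt Tn5 mosaic end and has not been checked against SEQ ID NO: 9 or 10, and the one-mismatch tolerance is an assumption.

```python
from collections import defaultdict

# Placeholder building blocks (hypothetical; see the note above).
SOLID_PHASE = "AATGATACGG"
PRIMER_SITE = "TCGTCGGCAG"
RECOGNITION = "AGATGTGTATAAGAGACAG"  # 19 nt

# Invented tags, one per targeting module (antibody).
TAG_TO_TARGET = {"ACGTAC": "H3K27me3", "TGCATG": "H3K27ac"}


def build_oligo(tag: str) -> str:
    """Assemble an oligonucleotide in the 5'->3' order recited in claim 4."""
    return SOLID_PHASE + tag + PRIMER_SITE + RECOGNITION


def hamming(a: str, b: str) -> int:
    """Number of mismatching positions between two equal-length strings."""
    if len(a) != len(b):
        raise ValueError("sequences must have equal length")
    return sum(x != y for x, y in zip(a, b))


def assign_target(tag_read: str, max_mismatches: int = 1):
    """Return the target whose tag matches the observed tag, or None if ambiguous."""
    hits = [
        target
        for tag, target in TAG_TO_TARGET.items()
        if len(tag) == len(tag_read) and hamming(tag, tag_read) <= max_mismatches
    ]
    return hits[0] if len(hits) == 1 else None


def demultiplex(records):
    """Group insert sequences by the target inferred from each observed tag."""
    bins = defaultdict(list)
    for tag_read, insert in records:
        bins[assign_target(tag_read) or "unassigned"].append(insert)
    return dict(bins)


# Toy input: (observed tag, genomic insert) pairs.
records = [
    ("ACGTAC", "GGCTA"),
    ("TGCATG", "TTAGC"),
    ("ACGTAA", "CCGAT"),  # one mismatch to the H3K27me3 tag
    ("GGGGGG", "AAAAA"),  # matches no tag
]
binned = demultiplex(records)
print(build_oligo("ACGTAC"))
print(binned)
```

In a real analysis the binned inserts would be aligned to the genome separately for each target, so that the peaks called from each bin indicate where the corresponding target molecule acts.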
PCT/CN2021/143623 2021-01-06 2021-12-31 Research method for multi-target protein-dna interaction, and tool WO2022148311A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202110016489.1 2021-01-06
CN202110016489.1A CN113322254B (en) 2021-01-06 2021-01-06 Methods and tools for multi-target protein-DNA interaction

Publications (1)

Publication Number Publication Date
WO2022148311A1 true WO2022148311A1 (en) 2022-07-14

Family

ID=77413503

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2021/143623 WO2022148311A1 (en) 2021-01-06 2021-12-31 Research method for multi-target protein-dna interaction, and tool

Country Status (2)

Country Link
CN (1) CN113322254B (en)
WO (1) WO2022148311A1 (en)

Families Citing this family (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113322254B (en) * 2021-01-06 2022-05-20 南京诺唯赞生物科技股份有限公司 Methods and tools for multi-target protein-DNA interaction
CN114106196A (en) * 2021-10-29 2022-03-01 陈凯 Antibody-transposase fusion protein and preparation method and application thereof
CN116891532A (en) * 2022-04-06 2023-10-17 南京诺唯赞生物科技股份有限公司 Biotin-avidin-based secondary antibody transposase complex and application thereof in detecting interaction between protein and DNA
CN115386966B (en) * 2022-10-26 2023-03-21 北京寻因生物科技有限公司 DNA appearance modification library building method, sequencing method and library building kit thereof

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN1833034A (en) * 2003-06-20 2006-09-13 埃克斯魁恩公司 Probes, libraries and kits for analysis of mixtures of nucleic acids and methods for constructing the same
WO2015035108A1 (en) * 2013-09-05 2015-03-12 The Jackson Laboratory Compositions for rna-chromatin interaction analysis and uses thereof
CN107530654A (en) * 2015-02-04 2018-01-02 加利福尼亚大学董事会 Nucleic acid is sequenced by bar coded in discrete entities
CN109791157A (en) * 2016-09-26 2019-05-21 赛卢拉研究公司 Use the reagent measuring protein expression with bar coded oligonucleotide sequence
CN110914426A (en) * 2017-03-23 2020-03-24 哈佛大学的校长及成员们 Nucleobase editors comprising nucleic acid programmable DNA binding proteins
CN113322254A (en) * 2021-01-06 2021-08-31 南京诺唯赞生物科技股份有限公司 Methods and tools for studying multi-target protein-DNA interactions

Family Cites Families (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2014190214A1 (en) * 2013-05-22 2014-11-27 Active Motif, Inc. Targeted transposition for use in epigenetic studies
CN108680757B (en) * 2018-06-29 2021-02-09 上海交通大学 Kit and method for detecting interaction between protein and DNA and application
CN109400714B (en) * 2018-10-26 2019-11-01 南京诺唯赞生物科技有限公司 The recombination fusion protein of antibody target and its application in epigenetics
CN111440843A (en) * 2019-01-16 2020-07-24 中国科学院生物物理研究所 Method for preparing chromatin co-immunoprecipitation library by using trace clinical puncture sample and application thereof
CN109957562B (en) * 2019-03-06 2019-12-06 南京诺唯赞生物科技有限公司 Method and kit for quickly constructing transcriptome sequencing library
CN110372799B (en) * 2019-08-01 2020-06-09 北京大学 Fusion protein for preparing single-cell ChIP-seq library and application thereof

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
AI SHANSHAN, XIONG HAIQING, LI CHEN C., LUO YINGJIE, LIU YAXI, YU XIANHONG, HE AIBIN: "ItChIP-simultaneous indexing and tagmentation-based ChIP-seq", RESEARCH SQUARE, 6 September 2019 (2019-09-06), pages 1 - 16, XP055949665, Retrieved from the Internet <URL:https://assets.researchsquare.com/files/pex-555/v1/11bbfc77-8815-4b8c-ba1d-132a82e30497.pdf?c=1631827055> [retrieved on 20220808], DOI: 10.21203/rs.2.11366/v1 *
WANG LISHAN, ZHU PENGFEI; QI FUJUAN; CAO XINKAI; KONG YAN; ZANG WEIDONG: "Molecular mechanism of p53-mediated tumor suppression in p53-WT breast cancer using ChIP-seq data", SHENGWU XINXIXUE - CHINESE JOURNAL OF BIOINFORMATICS, HARBIN GONGYE DAXUE, CN, vol. 12, no. 4, 1 December 2014 (2014-12-01), CN, pages 257 - 262, XP055949671, ISSN: 1672-5565, DOI: 10.3969/j.issn.1672-5565.201404.04 *

Also Published As

Publication number Publication date
CN113322254B (en) 2022-05-20
CN113322254A (en) 2021-08-31

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application (Ref document number: 21917350; Country of ref document: EP; Kind code of ref document: A1)
NENP Non-entry into the national phase (Ref country code: DE)
122 Ep: pct application non-entry in european phase (Ref document number: 21917350; Country of ref document: EP; Kind code of ref document: A1)