WO2022148311A1

WO2022148311A1 - Research method and tool for multi-target protein-DNA interaction

Info

Publication number: WO2022148311A1
Application number: PCT/CN2021/143623
Authority: WO
Inventors: 曹林; 聂俊伟; 瞿志鹏; 江明扬; 韩锦雄; 吴恒; 唐秋雨
Original assignee: 南京诺唯赞生物科技股份有限公司
Priority date: 2021-01-06
Filing date: 2021-12-31
Publication date: 2022-07-14
Also published as: CN113322254A; CN113322254B

Abstract

Provided are an oligonucleotide-tagged targeted transposome complex and its use for studying the interactions of multiple target molecules with DNA.

Description

Methods and tools for the study of multi-target protein-DNA interactions

Technical Field

The present application belongs to the field of biotechnology, and relates to an oligonucleotide-tagged targeting transposome complex and its use for studying multi-target protein-DNA interactions.

Background Art

The regulation of gene expression is the basis of all life activities of living organisms. According to the central dogma, genetic information is transcribed from DNA into RNA and then translated from RNA into protein. Beyond classical genetics, the phenomenon in which the expression level of a gene changes, and the change can be stably passed on to offspring, without any change to the nucleotide sequence of the DNA is called epigenetics. Epigenetic regulation largely determines when, where, and how a gene is expressed. Common forms of epigenetic control include DNA methylation, histone modifications, and chromatin conformation changes.

Interactions between biological macromolecules are involved in regulating the selective expression of genes and in post-transcriptional regulation. Interactions between DNA and proteins are ubiquitous; traditional methods for studying them include the electrophoretic mobility shift assay (EMSA), DNase I footprinting, yeast hybrid systems, and the luciferase reporter gene assay (LRGA).

Chromatin immunoprecipitation (ChIP and ChIP-seq) is a widely used method for studying protein-DNA interactions, usually to map transcription factor binding sites or sites of specific histone modifications. The basic process of ChIP is: (1) fix minced tissue, or fix cells directly, with formaldehyde, so that DNA and proteins are cross-linked into target protein-DNA complexes; (2) fragment the chromatin DNA by sonication, then add a ChIP-grade antibody against the target protein to bind the target protein-DNA complexes; (3) add protein A/G beads, which bind the antibody, to capture the complexes, then release the DNA fragments by reversing the cross-links; (4) purify the enriched DNA fragments and determine their sequences with a downstream detection technology (quantitative PCR, gene chip, sequencing, etc.). ChIP-seq, which combines ChIP with next-generation sequencing, can efficiently detect DNA segments that interact with histones, transcription factors, and the like on a genome-wide scale. In recent years, with continued refinement of the technology, CUT&RUN (Cleavage Under Targets & Release Using Nuclease) and CUT&Tag (Cleavage Under Targets & Tagmentation) have emerged.

The team of Steven Henikoff at the Fred Hutchinson Cancer Research Center in the United States published the experimental protocol for CUT&Tag in Nature Communications in April 2019. Guided by antibody targeting, a protein A-Tn5 transposase fusion loaded with partial P5-end and P7-end adapter sequences fragments the DNA near the target protein and, while cutting, adds the P5-end and P7-end adapter sequences to the two ends of each DNA fragment. The index sequences and the remainder of the adapter sequences are then added by PCR amplification, generating a high-resolution, low-background library.

In August 2019, the research group of He Aibin at the Institute of Molecular Medicine, Peking University, published the CoBATCH (combinatorial barcoding and targeted chromatin release) technology online in Molecular Cell. Building on the CUT&Tag technology described above, it cuts the sample in a targeted manner using protein A-Tn5 transposomes loaded with different barcode adapters, so that the sample DNA carries different adapter sequences after being fragmented by the transposase. It is an easy-to-operate, high-throughput, and high-quality single-cell ChIP-seq technology.

In October 2020, the Stacc-seq technology was described in an article published in Nature by Wei Xie's group at Tsinghua University. The antibody is first incubated in vitro with Tn5 transposase fused to protein A/G; the resulting complex then enters the sample and targets the protein of interest, after which the transposase is activated and cleaves the DNA near the target protein. A next-generation sequencing library can then be produced by PCR amplification.

Traditional ChIP-seq is affected by factors such as cross-linking and sonication conditions and the antibodies used, and requires a large amount of cells or tissue for library construction, which makes it difficult to apply to trace samples and single-cell experiments.

ChIP-seq is prone to false negatives and false positives, and uneven sonication results in high sequencing background noise.

Although CUT&Tag greatly shortens the experimental time compared with ChIP-seq, its workflow is still cumbersome.

Existing technologies are suitable only for studying a single target protein and cannot study multiple targets simultaneously in the same sample.

SUMMARY OF THE INVENTION

This application provides a technology for the simultaneous study of multi-target DNA-protein interactions. For a single experimental sample, it can simultaneously detect two or more target proteins and the DNA fragments they interact with, and obtain low-background library information through high-throughput sequencing. The overall workflow greatly reduces the number of library construction steps, shortens library construction time, lowers the required amount of starting material, improves library yield and data quality, and helps to reveal how more histones, transcription factors, and DNA-binding proteins act in vivo.

The basic process of the present application is: (1) anneal oligonucleotides containing different index sequences to form adapters, and load a pair of adapters onto protein A-Tn5 transposase or protein G-Tn5 transposase, so that each resulting adapter-transposase complex contains a unique index or combination of indexes; (2) incubate an antibody against a protein of interest with the loaded adapter-transposase complex to form an adapter-transposase-antibody complex, with one antibody corresponding to one index or index combination; (3) collect cells or nuclei and incubate them with the adapter-transposase-antibody complexes, so that the antibody directs the transposase to the target protein; (4) activate the transposase, which cuts the DNA near the target protein and attaches the adapters; (5) inactivate the transposase, purify the fragmented and tagged DNA, and amplify it by PCR to obtain a library for sequencing; and (6) sequence the library with a downstream sequencing technology. Finally, the DNA sequences bound by the different target proteins are obtained by splitting the reads according to the different indexes or index combinations.

The key to multi-target detection in the present application lies in step (1): adapter-transposase complexes (i.e., transposomes) containing different indexes or index combinations are generated by loading, and each transposome is combined in vitro with an antibody against one target protein, forming a variety of adapter-transposase-antibody complexes. After these different complexes are mixed, they enter the sample simultaneously and target the different target proteins. The transposase cuts the DNA near each target protein, so that the two ends of each DNA fragment carry the corresponding index or index combination, and the library is generated by PCR amplification. After sequencing, the reads are split by index or index combination to identify the interactions between the different target proteins and DNA.
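The bookkeeping behind "one antibody corresponds to one index or index combination" can be sketched in Python as follows. This is only an illustrative sketch: the function name `assign_index_combinations`, the antibody labels, and the 8-nt index sequences are hypothetical placeholders and are not taken from the present application.

```python
from itertools import product


def assign_index_combinations(antibodies, first_indexes, second_indexes):
    """Map each antibody to a distinct (first index, second index) pair.

    Assumes the entries within each index list are themselves distinct.
    All names and sequences used here are illustrative placeholders.
    """
    combinations = list(product(first_indexes, second_indexes))
    if len(antibodies) > len(combinations):
        raise ValueError("more antibodies than available index combinations")
    # zip stops at the shorter input, so every antibody gets its own pair.
    return dict(zip(antibodies, combinations))


# Hypothetical example: three antibodies, two first and two second indexes.
assignment = assign_index_combinations(
    ["anti-CTCF", "anti-YY1", "anti-EZH2"],
    ["ATCACGTT", "CGATGTTT"],
    ["TTAGGCAT", "TGACCACT"],
)
```

With two indexes on each side, up to four antibodies can be distinguished; the function refuses to proceed rather than reuse a combination.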

The method of the present application can be performed at high throughput in a single tube and can be combined "seamlessly" with single-cell sequencing platforms.

High-throughput sequencing technology: also known as second-generation sequencing or next-generation sequencing, abbreviated NGS. It refers to technology that determines the sequences of hundreds of thousands to millions of DNA molecules in parallel in a single run, generally with short read lengths.

Transposase: an enzyme that carries out transposition, usually encoded by a transposon. It recognizes specific sequences at both ends of the transposon, excises the transposon from the adjacent sequence, and inserts it into a new DNA target site, with no homology requirement. Tn5 transposase is one such transposase. It offers good insertion randomness, high stability, and insertion sites that are easy to sequence, making it an efficient tool for molecular genetics and gene sequencing.

A targeting moiety refers to any moiety capable of binding a molecule of interest, preferably an antibody or antibody fragment.

The term "antibody" is used herein in the broadest sense and encompasses a variety of antibody structures including, but not limited to, monoclonal antibodies, polyclonal antibodies, multispecific antibodies (eg, bispecific antibodies), and antibody fragments so long as they exhibit the desired antigen-binding activity.

An "antibody fragment" refers to a molecule other than an intact antibody that comprises the portion of the intact antibody that binds the antigen to which the intact antibody binds. Examples of antibody fragments include, but are not limited to, Fv, Fab, Fab', Fab'-SH, F(ab')₂; diabodies; linear antibodies; single-chain antibody molecules (e.g., scFv); and multispecific antibodies formed from antibody fragments.

In one aspect, the application relates to an oligonucleotide pair comprising a first oligonucleotide and a second oligonucleotide, wherein: the first oligonucleotide comprises a first transposase recognition sequence, the second oligonucleotide comprises a second transposase recognition sequence, and the first oligonucleotide comprises a first tag sequence and/or the second oligonucleotide comprises a second tag sequence.

In one embodiment, the first transposase recognition sequence is the same as the second transposase recognition sequence. In one embodiment where the first transposase recognition sequence is the same as the second transposase recognition sequence, the first transposase recognition sequence is in the same direction as the second transposase recognition sequence. In one embodiment where the first transposase recognition sequence is the same as the second transposase recognition sequence, the first transposase recognition sequence is reversed from the second transposase recognition sequence. In one embodiment, the first transposase recognition sequence is different from the second transposase recognition sequence.

In one embodiment, the first oligonucleotide further comprises a first sequencing solid phase binding sequence and/or a first sequencing primer binding sequence. In one embodiment, the second oligonucleotide further comprises a second sequencing solid phase binding sequence and/or a second sequencing primer binding sequence. In one embodiment, the first tag sequence and/or the second tag sequence corresponds to the targeting moiety and/or target molecule below. In one embodiment, the first oligonucleotide and/or the second oligonucleotide may also comprise one or more additional tag sequences, with additional uses.

In one embodiment, the first oligonucleotide and/or the second oligonucleotide are single-stranded, double-stranded, or a combination thereof. In one embodiment, the first transposase recognition sequence in the first oligonucleotide and/or the second transposase recognition sequence in the second oligonucleotide is double-stranded. In one embodiment, the portion of the first oligonucleotide other than the first transposase recognition sequence (e.g., the first sequencing solid phase binding sequence, the first tag sequence, and/or the first sequencing primer binding sequence) and/or the portion of the second oligonucleotide other than the second transposase recognition sequence (e.g., the second sequencing solid phase binding sequence, the second tag sequence, and/or the second sequencing primer binding sequence) is single-stranded. In one embodiment, the first sequencing solid phase binding sequence and/or the second sequencing solid phase binding sequence may be truncated and/or extended relative to the first immobilized probe and/or the second immobilized probe on the sequencing solid phase, respectively, e.g., truncated and/or extended at the 5' end and/or 3' end by at least 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20 nucleotides or more. In one embodiment, the first sequencing primer binding sequence and/or the second sequencing primer binding sequence may be truncated and/or extended relative to the first sequencing primer and/or the second sequencing primer, respectively, e.g., truncated and/or extended at the 5' end and/or 3' end by at least 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20 nucleotides or more.

In one embodiment, the first oligonucleotide comprises, in the 5' to 3' direction, an optional first sequencing solid phase binding sequence (e.g., identical or reverse complementary to the sequence of the first binding probe of the sequencing solid phase), an optional first tag sequence, an optional first sequencing primer binding sequence (e.g., identical or reverse complementary to the sequence of the first sequencing primer), and a first transposase recognition sequence (plus strand (e.g., AGATGTGTATAAGAGACAG, SEQ ID NO: 9) or minus strand). In one embodiment, the second oligonucleotide comprises, in the 5' to 3' direction, an optional second sequencing solid phase binding sequence (e.g., identical or reverse complementary to the sequence of the second binding probe of the sequencing solid phase), an optional second tag sequence, an optional second sequencing primer binding sequence (e.g., identical or reverse complementary to the sequence of the second sequencing primer), and a second transposase recognition sequence (plus strand (e.g., AGATGTGTATAAGAGACAG, SEQ ID NO: 9) or minus strand). In the first oligonucleotide, the positions of the first tag sequence and the first sequencing primer binding sequence can be interchanged. In the second oligonucleotide, the positions of the second tag sequence and the second sequencing primer binding sequence can be interchanged.
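The 5' to 3' element order described above can be illustrated by concatenating the elements in Python. This is a minimal sketch under stated assumptions: the function `build_oligo` and the 8-nt tag `ACGTACGT` are hypothetical, and the default element sequences are simply the example sequences given in this application as SEQ ID NO: 1, 5, and 9; the combination shown is not asserted to be an adapter actually used by the applicant.

```python
# Example element sequences quoted in this application.
SOLID_PHASE_BINDING = "AATGATACGGCGACCACCGAGATCTACAC"  # SEQ ID NO: 1
PRIMER_BINDING = "TCGTCGGCAGCGTC"                      # SEQ ID NO: 5
TRANSPOSASE_RECOGNITION = "AGATGTGTATAAGAGACAG"        # SEQ ID NO: 9


def build_oligo(tag,
                solid_phase=SOLID_PHASE_BINDING,
                primer_binding=PRIMER_BINDING,
                recognition=TRANSPOSASE_RECOGNITION,
                swap_tag_and_primer=False):
    """Concatenate the elements 5'->3'; pass "" to omit an optional element."""
    # The tag and the primer binding sequence may trade places.
    middle = [primer_binding, tag] if swap_tag_and_primer else [tag, primer_binding]
    return "".join([solid_phase, *middle, recognition])


# Hypothetical 8-nt tag, for illustration only.
oligo = build_oligo("ACGTACGT")
swapped = build_oligo("ACGTACGT", swap_tag_and_primer=True)
```

In both variants the transposase recognition sequence stays at the 3' end; only the two middle elements change order.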

In one embodiment, the first oligonucleotide comprises, in the 5' to 3' direction, a first transposase recognition sequence (minus strand (e.g., CTGTCTCTTATACACATCT, SEQ ID NO: 10) or plus strand), an optional first sequencing primer binding sequence (e.g., reverse complementary or identical to the sequence of the first sequencing primer), an optional first tag sequence, and an optional first sequencing solid phase binding sequence (e.g., reverse complementary or identical to the sequence of the first binding probe of the sequencing solid phase). In one embodiment, the second oligonucleotide comprises, in the 5' to 3' direction, a second transposase recognition sequence (minus strand (e.g., CTGTCTCTTATACACATCT, SEQ ID NO: 10) or plus strand), an optional second sequencing primer binding sequence (e.g., reverse complementary or identical to the sequence of the second sequencing primer), an optional second tag sequence, and an optional second sequencing solid phase binding sequence (e.g., reverse complementary or identical to the sequence of the second binding probe of the sequencing solid phase). In the first oligonucleotide, the positions of the first tag sequence and the first sequencing primer binding sequence can be interchanged. In the second oligonucleotide, the positions of the second tag sequence and the second sequencing primer binding sequence can be interchanged.

Binding probes for sequencing solid phases and their sequences are known in the art. Thus, the sequencing solid phase binding sequences of the present application (e.g., the first sequencing solid phase binding sequence and/or the second sequencing solid phase binding sequence) are also known or readily available in the art (see the instructions for use of each sequencing platform, for example the Ion Torrent platform, the Illumina platform, and the BGI platform). For example, the sequencing solid phase binding sequence of the present application (e.g., the first sequencing solid phase binding sequence and/or the second sequencing solid phase binding sequence) can be AATGATACGGCGACCACCGAGATCTACAC (SEQ ID NO: 1) or its reverse complement GTGTAGATCTCGGTGGTCGCCGTATCATT (SEQ ID NO: 2), or CAAGCAGAAGACGGCATACGAGAT (SEQ ID NO: 3) or its reverse complement ATCTCGTATGCCGTCTTCTGCTTG (SEQ ID NO: 4), including truncated and/or extended sequences thereof.

Sequencing primers and their sequences are known in the art. Thus, the sequencing primer binding sequences of the present application (e.g., the first sequencing primer binding sequence and/or the second sequencing primer binding sequence) are also known or readily available in the art (see the instructions for use of each sequencing platform, such as the Ion Torrent platform, the Illumina platform, and the BGI platform). For example, the sequencing primer binding sequence of the present application (e.g., the first sequencing primer binding sequence and/or the second sequencing primer binding sequence) may be TCGTCGGCAGCGTC (SEQ ID NO: 5) or its reverse complement GACGCTGCCGACGA (SEQ ID NO: 6), or GTCTCGTGGGCTCGG (SEQ ID NO: 7) or its reverse complement CCGAGCCCACGAGAC (SEQ ID NO: 8), including truncated and/or extended sequences thereof.
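The reverse-complement relationships stated for the example sequences (SEQ ID NO: 1 to 10) can be checked with a few lines of Python. This is a small illustrative sketch; the helper `reverse_complement` and the table `SEQUENCE_PAIRS` are names introduced here for illustration and handle only uppercase A, C, G, and T.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(sequence):
    """Return the reverse complement of an uppercase DNA sequence."""
    return sequence.translate(_COMPLEMENT)[::-1]


# (sequence, stated reverse complement), keyed by the SEQ ID NO pair.
SEQUENCE_PAIRS = {
    (1, 2): ("AATGATACGGCGACCACCGAGATCTACAC", "GTGTAGATCTCGGTGGTCGCCGTATCATT"),
    (3, 4): ("CAAGCAGAAGACGGCATACGAGAT", "ATCTCGTATGCCGTCTTCTGCTTG"),
    (5, 6): ("TCGTCGGCAGCGTC", "GACGCTGCCGACGA"),
    (7, 8): ("GTCTCGTGGGCTCGG", "CCGAGCCCACGAGAC"),
    (9, 10): ("AGATGTGTATAAGAGACAG", "CTGTCTCTTATACACATCT"),
}

# Collect any pair whose second member is not the reverse complement of the first.
mismatches = [
    ids
    for ids, (forward, stated_rc) in SEQUENCE_PAIRS.items()
    if reverse_complement(forward) != stated_rc
]
```

An empty `mismatches` list means every listed pair is internally consistent.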

The tag sequences of the present application (e.g., the first tag sequence and/or the second tag sequence) can be any short oligonucleotide, e.g., of at least 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16 or more nucleotides. The tag sequences of the present application (e.g., the first tag sequence and/or the second tag sequence) can be tag sequences used by current or future sequencing platforms (including but not limited to the Ion Torrent, Illumina, and BGI platforms).

In one embodiment, the first oligonucleotide is linked to the second oligonucleotide. In one embodiment where the first oligonucleotide and the second oligonucleotide are linked, they are connected at the ends opposite their transposase recognition sequences (e.g., at the sequencing solid phase binding sequence ends). Preferably, a cleavage site, such as a restriction enzyme recognition site, exists between the first oligonucleotide and the second oligonucleotide.

In one aspect, the present application relates to an oligonucleotide-tagged targeted transposome complex comprising a transposase, an oligonucleotide pair of the present application, and a targeting moiety.

In one embodiment, the targeting moiety is an aptamer, i.e., an oligonucleotide that specifically binds a target molecule. In one embodiment, the targeting moiety is an antibody (including antibody fragments). In one embodiment, the targeting moiety specifically binds a target molecule that interacts with DNA (e.g., modulates gene expression). In one embodiment, the target molecule is a histone, including isoforms, variants, and fragments thereof. In one embodiment, the target molecule is a DNA polymerase, including isoforms, variants, and fragments thereof. In one embodiment, the target molecule is an RNA polymerase, including isoforms, variants, and fragments thereof. In one embodiment, the target molecule is a transcription factor, such as an ARS binding factor, an rDNA enhancer binding protein, a TATA binding protein, or a CCCTC binding factor.

In one embodiment, the target molecule is one or more of the following: AAF, abl, ADA2, ADA-NF1, AF-1, AFP1, AhR, AIIN3, ALL-1, α-CBF, α-CP 1. α-CP2a, α-CP2b, αHo, αH2-αFB, Alx-4, aMEF-2, AML1, AMLla, AMLlb, AMLlc, AML1δN, AML2, AML3, AML3a, AML3b, AMY-1L, A-Myb, ANF, AP-1, AP-2αA, AP-2αB, AP-2β, AP-2γ, AP-3(1), AP-3(2), AP-4, AP-5, APC, AR, AREB6, Arnt, Arnt (774M form), ARP-1, ATBF1-A, ATBF1-B, ATF, ATF-1, ATF-2, ATF-3, ATF-3δZIP, ATF-a, ATF-aδ, ATPF1, Barhll, Barhl2, Barxl, Barx2, Bcl-3, BCL-6, BD73, β-catenin, Binl, B-Myb, BP1, BP2, brahma, BRCA1, Brn-3a, Brn-3b, Brn-4, BTEB, BTEB2 , B-TFIID, C/EBPα, C/EBPβ, C/EBPδ, CACC-binding factor, Cart-1, CBF(4), CBF(5), CBP, CCAAT-binding factor, CCMT-binding factor, CCF, CCG1 , CCK-la, CCK-lb, CD28RC, cdk2, cdk9, Cdx-1, CDX2, Cdx-4, CFF, ChxlO, CLIM1, CLIM2, CNBP, CoS, COUP, CP1, CP1A, CP1C, CP2, CPBP, CPE Binding protein, CREB, CREB-2, CRE-BP1, CRE-BPa, CREMα, CRF, Crx, CSBP-1, CTCF, CTF, CTF-1, CTF-2, CTF-3, CTF-5, CTF-7 , CUP, CUTL1, Cx, Cyclin A, Cyclin T1, Cyclin T2, Cyclin T2a, Cyclin T2b, DAP, DAX1, DB1, DBF4, DBP, DbpA, DbpAv, DbpB, DDB, DDB-1, DDB-2, DEF, δCREB, δMax, DF-1, DF-2, DF-3, Dlx-1, Dlx-2, Dlx-3, DIx4 (long isoform), Dlx-4 ( Short isoform, Dlx-5, Dlx-6, DP-1, DP-2, DSIF, DSIF-pl4, DSIF-pl60, DTF, DUX1, D UX2, DUX3, DUX4, E, E12, E2F, E2F+E4, E2F+pl07, E2F-1, E2F-2, E2F-3, E2F-4, E2F-5, E2F-6, E47, E4BP4, E4F, E4F1, E4TF2, EAR2, EBP-80, EC2, EF1, EF-C, EGR1, EGR2, EGR3, EIIaE-A, EIIaE-B, EIIaE-Cα, EIIaE-Cβ, EivF, EIf-1, EIk-1, Emx-1, Emx-2, Emx-2, En-1, En-2, ENH-bind.prot., ENKTF-1, EPAS 1, εF 1, ER, Erg-1, Erg-2, ERR1, ERR2 , ETF, Ets-1, Ets-1δVil, Ets-2, Evx-1, F2F, Factor 2, Factor name, FBP, f-EBP, FKBP59, FKHL18, FKHRL1P2, Fli-1, Fos, FOXB1, FOXC1, FOXC2 , FOXD1, FOXD2, FOXD3, FOXD4, FOXE1, FOXE3, FOXF1, FOXF2, FOXGla, FOXGlb, FOXGlc, FOXH1, FOXI1, FOXJla, FOXJlb, FOXJ2 (long isoform), FOXJ2 (short isoform), FOXJ3, FOXKla , FOXKlb, FOXKlc, FOXL1, FOXMla, FOXMlb, FOXMlc, FOXN1, FOXN2, FOXN3, 
FOXOla, FOXOlb, FOX02, FOX03a, FOX03b, FOX04, FOXP1, FOXP3, Fra-1, Fra-2, FTF, FTS, G-factor, G6 factor, GABP, GABP-α, GABP-β1, GABP-β2, GADD153, GAF, γCMT, γCACl, γCAC2, GATA-1, GATA-2, GATA-3, GATA-4, GATA-5, GATA-6 , Gbx-1, Gbx-2, GCF, GCMa, GCNS, GF1, GLI, GLI3, GRα, GRβ, GRF-1, Gsc, Gscl, GT-IC, GT-IIA, GT-IIBα, GT-IIBβ, HlTF1 , H1TF2, H2RIIBP, H4TF-1, H4TF-2, HAND1, HAND2, HB9, HDAC1, HDAC2, HDAC3, hDaxx, heat-induced factor, HEB, HEB1-p67, HEB1-p94, HEF-1B, HEF-1T , HEF-4C, HEN1, HEN2, Hesxl, Hex , HIF-1, HIF-1α, HIF-1β, HiNF-A, HiNF-B, HINF-C, HINF-D, HiNF-D3, HiNF-E, HiNF-P, HIP1, HIV-EP2, Hlf, HLTF , HLTF(Metl23), HLX, HMBP, HMG I, HMG I(Y), HMG Y, HMGI-C, HNF-1A, HNF-IB, HNF-1C, HNF-3, HNF-3α, HNF-3β, HNF-3γ, HNF4, HNF-4α, HNF4α1, HNF-4α2, HNF-4α3, HNF-4α4, HNF4γ, HNF-6α, hnRNP K, HOX11, HOXA1, HOXA10, HOXA10PL2, HOXA11, HOXA13, HOXA2, HOXA3, HOXA4 , HOXA5, HOXA6, HOXA7, HOXA9A, HOXA9B, HOXB-1, HOXB13, HOXB2, HOXB3, HOXB4, HOXBS, HOXB6, HOXA5, HOXB7, HOXB8, HOXB9, HOXC10, HOXC11, HOXC12, HOXC13, HOXC4, HOXC5, HOXC6, HOXC8 , HOXC9, HOXD10, HOXD11, HOXD12, HOXD13, HOXD3, HOXD4, HOXD8, HOXD9, Hp55, Hp65, HPX42B, HrpF, HSF, HSF1(long), HSF1(short), HSF2, hsp56, Hsp90, IBP-1, ICER -II, ICER-liγ, ICSBP, Idl, IdlH', Id2, Id3, Id3/Heir-1, IF1, IgPE-1, IgPE-2, IgPE-3, IκB, IκB-α, IκB-β, IκBR, II-1RF, IL-6RE-BP, 11-6RF, INSAF, IPF1, IRF-1, IRF-2, B, IRX2a, Irx-3, Irx-4, ISGF-1, ISGF-3, ISGF3α, ISGF- 3γ, lst-1, ITF, ITF-1, ITF-2, JRF, Jun, JunB, JunD, κy factor, KBP-1, KER1, KER-1, Koxl, KRF-1, Ku autoantigen, KUP, LBP -1, LBP-la, LBX1, LCR-F1, LEF-1, LEF-1B, LF-A1, LHX1, LHX2, LHX3a, LHX3b, LHXS, LHX6.1a, LHX6.1b, LIT-1, Lmol, Lmo2 , LMX1A, LMX1B, L-Myl (long form), L-Myl (short form), L-My2, LSF, LXRα, LyF-1, Lyl-1, M-factor, Madl, MASH-1, Maxl, Max2, MAZ, MAZ1, MB67, MBF1, MBF2, MBF3, MBP-1 ( 1), MBP-1(2), MBP-2, MDBP, MEF-2, MEF-2B, MEF-2C (433AA form), MEF-2C (465AA form), MEF-2C (473M form), MEF- 2C/δ32 (441AA form), MEF-2D00, MEF-2D0B, MEF-2DA0, MEF-2DAO, 
MEF-2DAB, MEF-2DA'B, Meis-1, Meis-2a, Meis-2b, Meis-2c, Meis-2d, Meis-2e, Meis3, Meoxl, Meoxla, Meox2, MHox(K-2), Mi, MIF-1, Miz-1, MM-1, MOP3, MR, Msx-1, Msx-2, MTB -Zf, MTF-1, mtTFl, Mxil, Myb, Myc, Myc 1, Myf-3, Myf-4, Myf-5, Myf-6, MyoD, MZF-1, NCI, NC2, NCX, NELF, NER1, Net, NF Ill-a, NF NF-1, NF-1A, NF-1B, NF-1X, NF-4FA, NF-4FB, NF-4FC, NF-A, NF-AB, NFAT-1, NF- AT3, NF-Atc, NF-Atp, NF-Atx, NfβA, NF-CLEOa, NF-CLEOb, NFδE3A, NFδE3B, NFδE3C, NFδE4A, NFδE4B, NFδE4C, Nfe, NF-E, NFE2, NF-E2p45, NF- E3, NFE-6, NF-Gma, NF-GMb, NF-IL-2A, NF-IL-2B, NF-jun, NF-κB, NF-κB (like), NF-κB1, NF-κB1, pre- NF-κB2, NF-κB2(p49), NF-κB2 precursor, NF-κEl, NF-κE2, NF-κE3, NF-MHCIIA, NF-MHCIIB, NF-muEl, NF-muE2, NF-muE3 , NF-S, NF-X, NF-X1, NF-X2, NF-X3, NFXc, NF-YA, NF-Zc, NF-Zz, NHP-1, NHP-2, NHP3, NHP4, NKX2-5 , NKX2B, NKX2C, NKX2G, NKX3A, NKX3A vl, NKX3A v2, NKX3A v3, NKX3A v4, NKX3B, NKX6A, Nmi, N-My c, N-Oct-2α, N-Oct-2β, N-Oct-3, N-Oct-4, N-Oct-5a, N-Oct-5b, NP-TCII, NR2E3, NR4A2, Nrfl, Nrf- 1. Nrf2, NRF-2βl, NRF-2γl, NRL, NRSF form 1, NRSF form 2, NTF, 02, OCA-B, Oct-1, Oct-2, Oct-2.1, Oct-2B, Oct-2C, Oct-4A, Oct4B, Oct-5, Oct-6, Octa-factor, octamer-binding factor, oct-B2, oct-B3, Otxl, Otx2, OZF, pl07, pl30, p28 regulator, p300, p38erg , p45, p49erg, -p53, p55, p55erg, p65δ, p67, Pax-1, Pax-2, Pax-3, Pax-3A, Pax-3B, Pax-4, Pax-5, Pax-6, Pax- 6/Pd-5a, Pax-7, Pax-8, Pax-8a, Pax-8b, Pax-8c, Pax-8d, Pax-8e, Pax-8f, Pax-9, Pbx-la, Pbx-lb, Pbx-2, Pbx-3a, Pbx-3b, PC2, PC4, PC5, PEA3, PEBP2α, PEBP2β, Pit-1, PITX1, PITX2, PITX3, PKNOX1, PLZF, POB, Pontin52, PPARα, PPARβ, PPARγl, PPARγ2, PPUR, PR, PR A, pRb, PRD1-BF1, PRDI-BFc, Prop-1, PSE1, P-TEFb, PTF, PTFα, PTFβ, PTFδ, PTFγ, Pu box binding factor, Pu box binding factor (BJA-B ), PU.1, PuF, Pur factor, R1, R2, RAR-α1, RAR-β, RAR-β2, RAR-γ, RAR-γ1, RBP60, RBP-Jκ, Rel, RelA, RelB, RFX, RFX1 , RFX2, RFX3, RFXS, RF-Y, RORα1, RORα2, RORα3, RORβ, RORγ, Rox, RPF1, RPGα, RREB-1, RSRFC4, RSRFC9, RVF, RXR-α, RXR-β, SAP-la, SAP lb, SF-1, 
SHOX2a, SHOX2b, SHOXa, SHOXb, SHP, SIII-pl 10, SIII-pl5, SIII-pl8, SIM', Six-1, Six-2, Six-3, Six-4, Six- 5. Six-6, SMAD-1, SMAD-2, S MAD-3, SMAD-4, SMAD-5, SOX-11, SOX-12, Sox-4, Sox-5, SOX-9, Spl, Sp2, Sp3, Sp4, Sph factor, Spi-B, SPIN, SRCAP , SREBP-la, SREBP-lb, SREBP-lc, SREBP-2, SRE-ZBP, SRF, SRY, SRP1, Staf-50, STAT1α, STAT1β, STAT2, STAT3, STAT4, STAT6, T3R, T3R-α1, T3R -α2, T3R-β, TAF(I)110, TAF(I)48, TAF(I)63, TAF(II)100, TAF(II)125, TAF(II)135, TAF(II)170, TAF (II)18, TAF(II)20, TAF(II)250, TAF(II)250Δ, TAF(II)28, TAF(II)30, TAF(II)31, TAF(II)55, TAF(II) )70-α, TAF(II)70-β, TAF(II)70-γ, TAF-I, TAF-II, TAF-L, Tal-1, Tal-lβ, Tal-2, TAR factor, TBP, TBXIA, TBXIB, TBX2, TBX4, TBXS (long isoform), TBXS (short isoform), TCF, TCF-1, TCF-1A, TCF-1B, TCF-1C, TCF-1D, TCF-1E, TCF-1F, TCF-1G, TCF-2α, TCF-3, TCF-4, TCF-4(K), TCF-4B, TCF-4E, TCFβl, TEF-1, TEF-2, tel, TFE3, TFEB , TFIIA, TFIIA-αβ precursor, TFIIA-α/β precursor, TFIIA-γ, TFIIB, TFIID, TFIIE, TFIIE-α, TFIIE-β, TFIIF, TFIIF-α, TFIIF-β, TFIIH, TFIIH*, TFIIH-CAK, TFIIH-Cyclin H, TFIIH-ERCC2/CAK, TFIIH-MAT1, TFIIH-M015, TFIIH-p34, TFIIH-p44, TFIIH-p62, TFIIH-p80, TFIIH-p90, TFII-I, Tf -LF1, Tf-LF2, TGIF, TGIF2, TGT3, THRA1, TIF2, TLE1, TLX3, TMF, TR2, TR2-11, TR2-9, TR3, TR4, TRAP, TREB-1, TREB-2, TREB-3 , TREF1, TREF2, TRF(2), TTF-1, TXRE BP, Tx REF, UBF, UBP-1, UEF-1, UEF-2, UEF-3, UEF-4, USF1, USF2, USF2b, Vav, Vax-2, VDR, vHNF-1A, vHNF1B, vHNF-1C, VITF, WSTF, WT1, WT1I, WT1I-KTS, WT1I-del2, WT1-KTS, WT1-del2, X2BP, XBP-1, XW-V, XX, YAF2, YB-1, YEBP, YYl, ZEB, ZF1, ZF2, ZFX, ZHX1, ZIC2, ZID, ZNF174, ASH1L, ASH2, ATF2, ASXL1, BAP1, bcllO, Bmil, BRG1, CARM1, KAT3A/CBP, CDC73, CHD1, CHD2, CTCF, DNMT1, DOTL1, EHMT1, ESET, EZH1, EZH2, FBXL10, FRP(Plu-1), HD AC 1, HDAC2, HMGA1, hnRNPA1, HP1γ, Hsetlb, JaridlA, JaridlC, KIAA1718JHDM1D, KAT5, KMT4, LSD1, NFKB P100, NSD2, MBD2, MBD3, MLL2, MLL4, P300, pRB, RbAP46/48, RBP1, RbBP5, RING IB, RNApolII P S2, RNApolII PS5, ROC1, sap30, setDB 1, Sf3bl, SIRT1, Sirt6, 
SMYD1, SP1, SUV39H1, SUZ12, TCF4, TET1, TRRAP, TRX2, WDR5, WDR77 and/or YY1.

In one embodiment, the transposase is covalently associated with the targeting moiety, eg, by fusion or chemical coupling. In one embodiment, the fusion is direct or indirect. In one embodiment, the transposase is preferably non-covalently associated with the targeting moiety. In one embodiment, the transposase is fused or coupled to one member of a binding pair and the targeting moiety is fused or coupled to another member of the binding pair. In one embodiment, the binding pair is biotin-avidin, biotin-streptavidin, ligand-receptor, enzyme-substrate or complementary oligonucleotide. In one embodiment where the targeting moiety is an antibody, the transposase is fused to an antibody binding protein. In one embodiment, the antibody binding protein is protein A, protein G, an Fc receptor, or a secondary antibody.

In one embodiment of any of the above aspects, the transposase is a transposase known in the art or discovered in the future, such as a Tn5 transposase, Mu transposase, IS5 transposase, or IS91 transposase, including wild-type and mutant forms (see, e.g., CN1367840A, CN109400714A, US6406896B1, US20040235103A1). In one embodiment, the transposase is a hyperactive Tn5 transposase, such as an EK/LP Tn5 transposase. In one embodiment, the transposase is a Tn5 transposase mutant, e.g., comprising one or more of the substitutions E58V, L372Q, E344K, D97E, D188E, and E326D.

In one embodiment, the transposase recognition sequence is a transposase recognition sequence known in the art or discovered in the future, e.g., a Tn5-type transposase recognition sequence such as the inner end (IE) or outer end (OE), including wild-type and mutant forms thereof, as well as the mosaic end (ME), such as the 19 bp Tn5 core terminal sequence (AGATGTGTATAAGAGACAG, SEQ ID NO: 9) or its reverse complement (CTGTCTCTTATACACATCT, SEQ ID NO: 10). In one embodiment, the transposase recognition sequence is a Mu transposase recognition sequence, an IS5 transposase recognition sequence, or an IS91 transposase recognition sequence, including wild-type and mutant forms.

In one embodiment, the first tag sequence in the first oligonucleotide and/or the second tag sequence in the second oligonucleotide is specific to the targeting moiety.

In one aspect, the present application relates to a mixture comprising at least a first complex and a second complex of the present application, wherein the targeting moiety in the first complex specifically binds a first target molecule, and the targeting moiety in the second complex specifically binds a second target molecule that is different from the first target molecule.

In one embodiment, the mixture of the present application involves a set of targeting moieties. In one embodiment, different targeting moieties in the set correspond to the same first tag sequence and different second tag sequences. In one embodiment, different targeting moieties in the set correspond to different first tag sequences and the same second tag sequence. In one embodiment, different targeting moieties in the set correspond to different first tag sequences and different second tag sequences.

In one embodiment, the mixture of the present application involves multiple sets of targeting moieties. In one embodiment, different sets of targeting moieties in the plurality of sets correspond to different first tag sequences, and different targeting moieties within the same set correspond to the same first tag sequence and different second tag sequences. In one embodiment, different sets of targeting moieties in the plurality of sets correspond to different second tag sequences, and different targeting moieties within the same set correspond to the same second tag sequence and different first tag sequences.
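One of the multi-set layouts described above (a shared first tag per set, a distinct second tag per member of the set) can be sketched as follows. The sketch is illustrative only: the function `design_nested_tags`, the set names, and the symbolic tags `F1`, `F2`, `S1`, `S2` are hypothetical, and reusing the same second tags across sets is an assumption made here because the (first tag, second tag) pair remains unique.

```python
def design_nested_tags(target_sets, first_tags, second_tags):
    """Assign (first tag, second tag) pairs: one first tag per set,
    one second tag per target within a set.

    target_sets maps a set name to its list of targets. All names and
    tags are illustrative placeholders.
    """
    if len(target_sets) > len(first_tags):
        raise ValueError("not enough first tags for the number of sets")
    design = {}
    for (set_name, targets), first_tag in zip(target_sets.items(), first_tags):
        if len(targets) > len(second_tags):
            raise ValueError(f"not enough second tags for set {set_name!r}")
        for target, second_tag in zip(targets, second_tags):
            design[target] = (first_tag, second_tag)
    return design


# Hypothetical example with two sets of two targets each.
design = design_nested_tags(
    {"set 1": ["CTCF", "YY1"], "set 2": ["EZH2", "SUZ12"]},
    first_tags=["F1", "F2"],
    second_tags=["S1", "S2"],
)
```

Reads can then be assigned to a set from the first tag alone, and to an individual target from the pair.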

In one aspect, the present application relates to a method of preparing a nucleic acid library for the simultaneous study of the interactions of multiple target molecules with DNA, comprising: obtaining a mixture of the present application comprising multiple complexes directed against the multiple target molecules, i.e., comprising a complex for each of the multiple target molecules; obtaining a sample in which the multiple target molecules interact with DNA; reacting the mixture with the sample, so that each targeting moiety binds its corresponding target molecule and the transposase fragments the DNA and adds the corresponding tag sequences to both ends of the DNA fragments; and recovering the tagged DNA fragments to obtain a nucleic acid library. In one embodiment, the method further comprises purifying and/or amplifying the recovered DNA fragments.

In one aspect, the present application relates to a method for simultaneously identifying the sites of action of multiple target molecules on DNA, comprising: obtaining a mixture of the present application comprising multiple complexes directed against multiple target molecules, i.e., comprising a complex for each of the multiple target molecules; obtaining a sample in which the multiple target molecules interact with DNA; reacting the mixture with the sample, so that the targeting module binds the corresponding target molecule and the transposase fragments the DNA and adds the corresponding tag sequences on both sides of the DNA fragments; recovering the tagged DNA fragments; and sequencing the recovered DNA fragments, wherein a sequence in the sequencing result that corresponds to a tag sequence indicates the site of action on the DNA of the target molecule corresponding to that tag sequence. In one embodiment, the method further comprises purifying and/or amplifying the recovered DNA fragments.

In one embodiment, the method further comprises analyzing the sequencing results. In one embodiment, analyzing the sequencing results includes aggregating sequencing reads corresponding to (e.g., comprising) the same first tag sequence and/or the same second tag sequence. For example, where the targeting module for target molecule A corresponds to tag sequences A1 and A2 and the targeting module for target molecule B corresponds to tag sequences B1 and B2, the sequencing reads corresponding to tag sequence A1 or A2 are classified under target molecule A, and the insert sequence in such a read is a DNA site that interacts with target molecule A; the sequencing reads corresponding to tag sequence B1 or B2 are classified under target molecule B, and the insert sequence in such a read is a DNA site that interacts with target molecule B. As another example, the targeting modules for the group A target molecules all correspond to the same tag sequence A, and the targeting modules for target molecules AB1, AB2, ... in group A correspond to unique tag sequences B1, B2, ..., respectively. In this case, the sequencing reads corresponding to tag sequence A are classified under the group A target molecules, the sequencing reads corresponding to tag sequence B1 are classified under target molecule AB1, the sequencing reads corresponding to tag sequence B2 are classified under target molecule AB2, and so on.
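The read classification described above can be sketched as a simple lookup. This is a minimal illustration, not the analysis pipeline used in the examples: the tag sequences, the read tuples, and the function name are hypothetical, and the decision to set aside reads whose two tags are unknown or point to different targets is an assumption of this sketch rather than something stated in the application.

```python
from collections import defaultdict

# Hypothetical tag table: two tags (A1, A2) for target A, two (B1, B2) for target B.
TAG_TO_TARGET = {
    "ACGTACGT": "A",
    "TTGCAAGC": "A",
    "GGATCCTA": "B",
    "CATGGTAC": "B",
}


def classify_reads(reads, tag_to_target):
    """Group insert sequences by the target molecule their tags correspond to.

    reads: iterable of (first_tag, second_tag, insert) tuples.
    Reads with no known tag, or with tags for different targets,
    go to "unassigned" (an assumption of this sketch).
    """
    by_target = defaultdict(list)
    for first_tag, second_tag, insert in reads:
        targets = {tag_to_target[t] for t in (first_tag, second_tag) if t in tag_to_target}
        if len(targets) == 1:
            by_target[targets.pop()].append(insert)
        else:
            by_target["unassigned"].append(insert)
    return dict(by_target)


reads = [
    ("ACGTACGT", "TTGCAAGC", "insert_1"),  # both tags belong to target A
    ("GGATCCTA", "CATGGTAC", "insert_2"),  # both tags belong to target B
    ("ACGTACGT", "CATGGTAC", "insert_3"),  # tags disagree
    ("NNNNNNNN", "NNNNNNNN", "insert_4"),  # unknown tags
]
result = classify_reads(reads, TAG_TO_TARGET)
```

For the grouped scheme, the same lookup could be applied twice, once on the shared tag to pick the group and once on the unique tag to pick the target within it.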

In one embodiment, the transposase in the complex is inactive. In one embodiment, the method further comprises the step of activating the transposase, e.g., by adding a divalent cation such as Mg2+.

In one embodiment, the method further comprises adding a modulator of the target molecule-DNA interaction. In one embodiment, the method further comprises comparing the sequencing results of the sample with the added modulator to the sample without the modulator added.

In one embodiment, the method further comprises altering the reaction conditions for the target molecule-DNA interaction. In one embodiment, the method further comprises comparing the sequencing results of the samples under different reaction conditions.

Sequencing results can be qualitative, semi-quantitative, quantitative, or any combination thereof.

In one embodiment, the sample is a cell or a derivative thereof. In one embodiment, the cell is a prokaryotic cell or a eukaryotic cell. In one embodiment, the sample is a nucleus, cytoplasm or organelle or a derivative thereof. In one embodiment, the sample is a cell lysate. In one embodiment, the method includes the step of permeabilizing the cells, e.g., by adding digitonin.

In one embodiment, the DNA is genomic, chromosomal or chromatin DNA, e.g., prokaryotic or eukaryotic.

Description of drawings

Figure 1 shows the library quality assessment of Example 2.

Figure 2 shows the TSS enrichment of Example 2.

Figure 3 shows the IGV view of Example 2.

Figure 4 shows the library quality assessment of Example 3.

Figure 5 shows the TSS enrichment of Example 3.

Figure 6 shows the IGV view of Example 3.

Figure 7 shows a schematic diagram of an exemplary embodiment of a nucleic acid library constructed in the present application.

Detailed description

Materials

The transposase used in the examples is Hyperactive pG-Tn5 Transposase for CUT&Tag (Item No. S602) or Hyperactive pA-Tn5 Transposase for CUT&Tag (Item No. S603) from Nanjing Vazyme Biotech Co., Ltd.

The H3K4me2 antibody was from Abcam, catalog number ab11946; the CTCF antibody was from CST, catalog number 3418S; the RNA Pol II antibody was from Abcam, catalog number ab817; the H3K27me3 antibody was from CST, catalog number 9733S.

The method of this application is general and applicable to various sequencing platforms, such as the Ion Torrent, Illumina and BGI platforms. The examples use the Illumina platform. If another sequencing platform is used, it is only necessary to replace the sequences of the immobilized probes and sequencing primers of the Illumina platform in the following oligonucleotides, or their reverse complementary sequences, with the corresponding sequences of the other platform.

Oligonucleotide 1 (SEQ ID NO: 10):

5'-phos-CTGTCTCTTATACACATCT-NH2-3'

Oligonucleotide 2 (SEQ ID NO: 11):

Oligonucleotide 3 (SEQ ID NO: 12):

In these oligonucleotides, the placeholder segment represents the index sequence (an index length of 8 nucleotides is exemplary and not limiting), the bold segment is the sequencing primer binding sequence (identical to the primer used for sequencing), the underlined segment is the methylated 19 bp core end sequence bound by the transposase, and the italicized segment is the sequencing chip binding sequence (identical to the immobilized probe used by the sequencing chip). In an alternative embodiment, oligonucleotide 2 and oligonucleotide 3 may lack several bases of the 5' portion of the italicized segment, retaining at least four bases of the 3' portion of the italicized segment.

Amplification primer 1 (SEQ ID NO: 1): 5'-AATGATACGGCGACCACCGAGATCTACAC-3'

Amplification primer 2 (SEQ ID NO: 3): 5'-CAAGCAGAAGACGGCATACGAGAT-3'

Amplification primer 1 (N5) is the same as the complete italicized segment of oligonucleotide 2, and amplification primer 2 (N7) is the same as the complete italicized segment of oligonucleotide 3.

An alternative embodiment of the above three oligonucleotide sequences is as follows:

Oligonucleotide 1' (SEQ ID NO: 9): 5'-phos-AGATGTGTATAAGAGACAG-NH2-3'

Oligonucleotide 2' (SEQ ID NO: 13):

Oligonucleotide 3' (SEQ ID NO: 14):

Oligonucleotide 2, Oligonucleotide 2', Oligonucleotide 3 and Oligonucleotide 3' above constitute yet another alternative embodiment of the present application.
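The 19-nt sequences given above for oligonucleotide 1 (SEQ ID NO: 10) and oligonucleotide 1' (SEQ ID NO: 9) are reverse complements of each other. The short sketch below checks this; the function and variable names are illustrative only, and the 5' phosphate and 3' amino modifications are not modeled.

```python
# Translation table for the four DNA bases.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an uppercase DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


OLIGO_1 = "CTGTCTCTTATACACATCT"      # oligonucleotide 1, SEQ ID NO: 10
OLIGO_1_ALT = "AGATGTGTATAAGAGACAG"  # oligonucleotide 1', SEQ ID NO: 9

is_reverse_complement = reverse_complement(OLIGO_1) == OLIGO_1_ALT
```

The same helper could be used when swapping in the reverse complementary platform sequences mentioned in the Materials section.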

Index adopted in Example 2:

Index adopted in Example 3:

The Illumina library structure is as follows:

Wherein -MMMMMM- represents the insert sequence (an insert length of 6 nucleotides is exemplary and not limiting), and the other segments have the same meanings as above.

Example 1: Preparation of transposomes (adapter-transposase complexes)

The transposome was assembled according to the S602 product manual, as follows:

1. Adapter annealing:

(1) Use Annealing Buffer (Vazyme, #S602) to dissolve oligonucleotide 1, oligonucleotide 2, and oligonucleotide 3 to 100 μM respectively;

(2) equimolar mixing of oligonucleotide 1 and oligonucleotide 2 to obtain reaction 1, and equimolar mixing of oligonucleotide 1 and oligonucleotide 3 to obtain reaction 2;

(3) Vortex reaction 1 and reaction 2 to mix well, and centrifuge briefly to collect the solution at the bottom of the tube. Place the tubes in a PCR instrument and run the following program:

Heated lid	105°C
75°C	15 min
60°C	10 min
50°C	10 min
40°C	10 min
25°C	30 min

(4) Mix equal volumes of reaction 1 and reaction 2, and mix well. The mixture is named Adapter Mix and is stored at -30 to -15°C.

2. Assemble the transposome (adapter-transposase complex)

(1) Add each reaction component in sequence to the sterilized PCR tube:

(2) Mix well.

(3) The reaction was placed at 30°C for 1 hour. The reaction product is a transposome (adapter-transposase complex), which can be directly used in subsequent experiments or stored at -30 to -15°C.

The final concentration of transposomes prepared according to this reaction system was 4 μM.

The transposase was assembled with adapter pairs containing different indices, and the resulting complexes were labelled transposome 1, transposome 2, transposome 3, and so on, according to the index used.

Example 2:

This example is used to simultaneously study the binding of histone modifications, transcription factors and RNA Pol II to genomic DNA in cells.

The wash buffer, reaction buffer, and stop buffer used were formulated as follows:

Wash Buffer 1: from Vazyme, #TD901, Wash Buffer,

Wash Buffer 2: from Vazyme, #TD901, Dig-Wash Buffer,

Reaction buffer: from Vazyme, #TD901, Tagmentation Buffer,

Termination buffer: from Vazyme, #TD901, Termination Buffer.

The specific process of this embodiment is as follows:

1. Take three 1.5 mL EP tubes and add 10 μL of wash buffer 2 to each. Add 1 μL of assembled transposome 1, transposome 2, and transposome 3 to the respective tubes, then add 0.5 μg of the corresponding antibody and incubate at 4°C for 30 min so that the transposome binds the antibody fully. This yields three transposome-antibody complexes: transposome 1-H3K4me2 antibody, transposome 2-CTCF antibody, and transposome 3-RNA Pol II antibody;

2. Collect about 100,000 routinely cultured 293T cells, wash once with PBS, collect the cells by centrifugation, and wash once with wash buffer 1;

3. Resuspend the cells in wash buffer 2, add the three transposome-antibody complexes obtained in step 1 at the same time, and incubate with rotation at 4°C or room temperature for 30 min;

4. Wash cells 3 times with wash buffer 2 to remove unbound transposome-antibody complexes;

5. Resuspend cells with reaction buffer and incubate at 37°C for 30min;

6. The reaction was terminated by adding stop buffer, and the DNA was purified with phenol-chloroform;

7. The purified DNA is directly amplified by PCR to complete the library construction.

Amplification system:

Component	Volume
DNA purification product	24 μL
5× TAB (Vazyme, #TD501)	10 μL
TAE (Vazyme, #TD501)	1 μL
N5 primer (4 μM)	5 μL
N7 primer (4 μM)	5 μL
ddH2O	5 μL

In the PCR reaction program, the heated lid was set to 105°C, and the number of amplification cycles was adjusted according to the actual situation.

8. Purification of Amplification Products

Amplification products were purified using VAHTS DNA Clean Beads (Vazyme, #N411) according to the manufacturer's instructions.

9. Library concentration determination with Qubit

The concentration of the resulting library was determined using a Qubit 3.0 Fluorometer (Invitrogen), and the library yield was calculated. The library concentration was 34.8 ng/μL (22 μL elution volume).

10. Assess library quality with the Agilent 2100 Bioanalyzer

Take 1 μL of the purified PCR product and analyze it with Agilent DNA 1000 kit (Agilent, Cat. No. 5067-1504). The results are shown in Figure 1.

11. Sequencing

The completed library was sequenced on the Illumina platform (HiSeq X, PE150). The sequencing results are shown in Table 1 and Figures 2 and 3.

Table 1

Sample	H3K4me2	CTCF	RNA Pol II
Clean reads	29798904	19719804	14156980
Q20	0.97675	0.98335	0.9757
Q30	0.93815	0.954	0.9362
Mapping rate	96.41%	95.18%	97.64%
Duplicate rate	24.91%	23.60%	17.67%
Peak number	17178	18132	15024

As the clean read count, Q20, Q30, mapping rate, duplicate rate, and peak number in Table 1 show, the library obtained by this mtChIP-seq workflow is of high quality and has a low duplicate rate. The enrichment results and the IGV view in Figure 3 show that simultaneous detection of a histone modification, a transcription factor, and RNA Pol II in the same sample gives good enrichment and a high library signal-to-noise ratio.
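The bench arithmetic behind Example 2 can be made explicit with a short sketch. It uses the component volumes from the amplification system and the Qubit reading reported above; the function names are illustrative, and treating the yield as concentration times elution volume assumes that the whole 22 μL eluate is recovered.

```python
def final_concentration_um(stock_um, added_volume_ul, total_volume_ul):
    """Final concentration after dilution: C_final = C_stock * V_added / V_total."""
    return stock_um * added_volume_ul / total_volume_ul


def library_yield_ng(concentration_ng_per_ul, elution_volume_ul):
    """Total library mass, assuming the full elution volume is recovered."""
    return concentration_ng_per_ul * elution_volume_ul


# Component volumes (in uL) from the amplification system of Example 2.
pcr_volumes_ul = {
    "DNA purification product": 24,
    "5x TAB": 10,
    "TAE": 1,
    "N5 primer": 5,
    "N7 primer": 5,
    "ddH2O": 5,
}

total_ul = sum(pcr_volumes_ul.values())                    # total reaction volume
primer_final_um = final_concentration_um(4, 5, total_ul)   # each 4 uM primer, 5 uL added
yield_ng = library_yield_ng(34.8, 22)                      # Qubit reading x elution volume
```

Under these assumptions the reaction volume is 50 μL, each primer ends up at about 0.4 μM, and the Example 2 library yield is roughly 766 ng.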

Example 3

This embodiment provides a method for simultaneously studying histone methylation and acetylation modification. The specific process of this embodiment is as follows:

1. Weigh 100 mg of fresh C57BL/6 adult mouse liver, and use the nucleus extraction kit (Solarbio, Cat. No.: SN0020) to extract the nucleus of the tissue;

2. Take two 1.5 mL EP tubes and add 10 μL of wash buffer 2 to each. Add 1 μL of assembled transposome 4 and transposome 5 to the respective tubes, then add 0.5 μg of the corresponding antibody to each tube and incubate at 4°C for 30 min so that the transposome binds the antibody fully, yielding transposome 4-H3K27me3 antibody and transposome 5-H3K27ac antibody;

3. Take 2 μL of tissue nuclei, resuspend in wash buffer 1, centrifuge, and discard the supernatant;

4. Resuspend the nuclei with washing buffer 2, add the two transposome-antibody complexes obtained in step 2 at the same time, rotate and incubate at 4°C or room temperature for 30 minutes;

5. Wash nuclei three times with wash buffer 2 to remove unbound transposome-antibody complexes;

6. Resuspend the nuclei in reaction buffer and incubate at 37°C for 30 min;

7. Add stop buffer to terminate the reaction, and use phenol-chloroform for DNA purification;

8. As described in Example 2, the purified DNA was directly amplified by PCR to complete the library building;

9. Purify the amplification product with magnetic beads as described in Example 2;

10. The library yield was detected with Qubit as described in Example 2; the library concentration was 57.4 ng/μL (22 μL elution volume);

11. As described in Example 2, library quality was assessed with an Agilent 2100 Bioanalyzer. The results are shown in Figure 4;

12. As described in Example 2, the completed library was used for next-generation sequencing. The sequencing results are shown in Table 2 and Figures 5 and 6.

Table 2

Sample	H3K27me3	H3K27ac
Clean reads	18177398	25042324
Q20	0.98165	0.98225
Q30	0.9458	0.94655
Mapping rate	95.45%	97.16%
Duplicate rate	19.52%	16.39%
Peak number	44069	29598

As the clean read count, Q20, Q30, mapping rate, duplicate rate, and peak number in Table 2 show, the library obtained by this mtChIP-seq workflow is of high quality and has a low duplicate rate. The enrichment results and the IGV view in Figure 6 show that histone methylation and acetylation modifications can be detected simultaneously in the same sample, with good enrichment and a high library signal-to-noise ratio.

Claims

An oligonucleotide pair comprising a first oligonucleotide and a second oligonucleotide, wherein:

the first oligonucleotide comprises a first transposase recognition sequence,

The second oligonucleotide comprises a second transposase recognition sequence,

The first oligonucleotide contains a first tag sequence and/or the second oligonucleotide contains a second tag sequence.
The oligonucleotide pair of claim 1, wherein the first oligonucleotide further comprises a first sequencing solid phase binding sequence and/or a first sequencing primer binding sequence.
The oligonucleotide pair of claim 1, wherein the second oligonucleotide further comprises a second sequencing solid phase binding sequence and/or a second sequencing primer binding sequence.
The oligonucleotide pair of claim 1, wherein the first sequencing solid phase binding sequence and/or the second sequencing solid phase binding sequence is any one of the sequences shown in SEQ ID NOs: 1 to 4,

Optionally, the first sequencing primer binding sequence and/or the second sequencing primer binding sequence is any of the sequences shown in SEQ ID NOs: 5 to 8,

Optionally, the first oligonucleotide and/or the second oligonucleotide are single-stranded, double-stranded, or a combination thereof,

Optionally, the first oligonucleotide comprises an optional first sequencing solid phase binding sequence, an optional first tag sequence, an optional first sequencing primer binding sequence, and the first transposase recognition sequence,

Optionally, the second oligonucleotide comprises an optional second sequencing solid phase binding sequence, an optional second tag sequence, an optional second sequencing primer binding sequence, and the second transposase recognition sequence,

Optionally, the first oligonucleotide is linked to the second oligonucleotide, preferably at the end remote from the transposase recognition sequence; optionally, there is a breakpoint between the first oligonucleotide and the second oligonucleotide.
An oligonucleotide-tagged targeting transposome complex comprising a transposase, the oligonucleotide pair of claim 1, and a targeting module,

optionally, the transposase is a Tn5 transposase, optionally the transposase is an EK/LP Tn5 transposase,

Optionally, the transposase recognition sequence is a Tn5-type transposase recognition sequence, optionally, the transposase recognition sequence is a 19bp Tn5 core terminal sequence shown in SEQ ID NO: 9 or 10,

Optionally, the targeting moiety is an antibody; optionally, the targeting moiety specifically binds a target molecule that interacts with DNA and, optionally, regulates gene expression; optionally, the target molecule is a histone or a transcription factor,

Optionally, the transposase is covalently or non-covalently associated with the targeting moiety,

The first tag sequence in the first oligonucleotide and/or the second tag sequence in the second oligonucleotide is specific to the targeting moiety.
The complex of claim 5, wherein the transposase is covalently associated with the targeting moiety, optionally by fusion or chemical coupling, optionally the fusion is direct or indirect.
The complex of claim 6, wherein the transposase is non-covalently associated with the targeting module, optionally the transposase is fused to one member of a binding pair and the targeting module is fused to the other member of the binding pair, optionally, the binding pair is biotin-avidin, biotin-streptavidin, ligand-receptor, enzyme-substrate, or complementary oligonucleotides.
The complex of claim 7, wherein the targeting moiety is an antibody and the transposase is fused to an antibody binding protein, optionally the antibody binding protein is protein A, protein G, an Fc receptor, or a secondary antibody.
A mixture comprising at least a first complex and a second complex as claimed in any one of claims 5-8,

wherein the targeting moiety in the first complex specifically binds a first target molecule, the targeting moiety in the second complex specifically binds a second target molecule, and the first target molecule and the second The target molecules are different.
A method of preparing a nucleic acid library for the simultaneous study of multiple target molecule-DNA interactions, comprising:

obtaining a mixture as claimed in claim 9, comprising a plurality of complexes against a plurality of target molecules;

obtaining a sample in which the plurality of target molecules interact with DNA;

reacting the mixture with the sample such that the targeting module binds the corresponding target molecule, the transposase fragments the DNA and flanks the DNA fragment with the corresponding tag sequence; and

recovering the tagged DNA fragments, optionally purifying and/or amplifying them, to obtain a nucleic acid library.
A method for simultaneously identifying the action sites of multiple target molecules on DNA, comprising:

obtaining a mixture as claimed in claim 9, comprising a plurality of complexes against a plurality of target molecules;

obtaining a sample in which the plurality of target molecules interact with DNA;

reacting the mixture with the sample so that the targeting module binds the corresponding target molecule, and the transposase fragments the DNA and flanks the DNA fragment with corresponding tag sequences;

recovering the tagged DNA fragments, optionally purifying and/or amplifying; and

sequencing the recovered DNA fragments, wherein a sequence in the sequencing result that corresponds to a tag sequence indicates the site of action on the DNA of the target molecule corresponding to that tag sequence.
The method of claim 10 or 11, wherein:

the sample is a cell or a derivative thereof, optionally the sample is a nucleus, cytoplasm or organelle or a derivative thereof, optionally the sample is a cell lysate; or

The target molecule is a histone or a transcription factor; or

The DNA is genome, chromosome or chromatin.
The method of claim 10 or 11, further comprising:

optionally, activating the transposase; and/or

optionally, inactivating the transposase.